Numerical Methods
Hello friends, this is the first lecture of this course, Numerical Methods. In this lecture I will talk about errors in numerical computations, and in the later part of the lecture I will introduce systems of linear equations. So first of all let us talk about error. In numerical analysis we have the concept of significant digits, and it plays a very important role in the accuracy of any numerical computation.
So for example, suppose I want to perform a numerical computation with a real number, let us say x = 1/3. However, computers store only a finite number of digits, so every computer performs its computations with a finite number of digits. This particular number in decimal format is 0.3333..., having an infinite number of digits after the decimal point, so the computer cannot perform the computation with such an infinitely long string of digits.
So we need to cut this number down after some fixed number of decimal places. For example, if I cut this number down after 4 decimal places, the portion after the fourth place is ignored, and hence we introduce an error into our computation. Here this is a very small error, in the fifth decimal place; however, in further computations this small error propagates and can become a large error. Now how is this done? Before that, let us learn how the computer stores a number. In a computer each number is stored with a fixed length, that is, a fixed number of digits, and it depends on the computer how many digits can be stored; different computers can have different abilities to store numbers and to perform computations with different numbers of digits.
But in general we have the floating point representation, which essentially every computer follows, and in this representation each number is written as ±M × 10^k if I am talking about decimal numbers, that is, numbers with base 10. Here M is called the mantissa and k is an integer exponent. The range of M depends on the base β of the representation. For example, in decimal representation the range of M is 0.1 ≤ M < 1. This particular representation is called normalized floating point representation.
So for example, if I have the number 5431, in floating point form it is written as 0.5431 × 10^4. Similarly, if I have the number -1.23, it is written as -0.123 × 10^1. Here you can see the mantissa is 0.5431 in the first case and 0.123 in the second, each lying in the required range, and k is 4 and 1 respectively. Suppose I have the number 0.0056; since 0.0056 is less than 0.1, I write this number as 0.56 × 10^-2. This is the floating point representation of a number in computers, and the digits in the mantissa are called significant digits.
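This normalization can be sketched in a few lines. The helper below is my own illustrative code, not from the lecture, and assumes base 10:

```python
import math

def normalize(x):
    """Return (M, k) with x = M * 10**k and 0.1 <= |M| < 1, for x != 0."""
    if x == 0:
        return 0.0, 0
    # k is one more than the power of 10 of the leading digit
    k = math.floor(math.log10(abs(x))) + 1
    return x / 10**k, k

print(normalize(5431))    # mantissa 0.5431, exponent 4
print(normalize(-1.23))   # mantissa -0.123, exponent 1
print(normalize(0.0056))  # mantissa 0.56,   exponent -2
```

Counting the digits of the mantissa then gives the number of significant digits.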
So if someone gives you a number and asks how many significant digits it has, you write the number in floating point representation, and the number of digits in the mantissa is the number of significant digits in the given number. Now, as I told you, a computer can compute with only a finite number of digits, so if a number has more digits than that, we have to cut those digits down.
As in the example of 1/3, that is 0.3333 and so on, suppose I keep only a finite portion; then we introduce some error into our computation by ignoring the rest of the number. This can be done in 2 ways: either we truncate the number after a finite number of digits, or we round the number. Truncation means we keep this many digits and simply drop whatever follows, whereas rounding follows a different procedure for cutting down a number after some finite number of digits.
So let me tell you how to perform rounding. When a number is rounded to n significant digits, the last retained digit is increased by 1 if the discarded portion is greater than half a unit in the last retained place, and the last retained digit is left unchanged if the discarded portion is less than half a unit. For instance, rounded to 2 significant digits the number 0.1251 becomes 0.13, because the discarded part 0.0051 is greater than 0.005; so rounding this number to 2 significant digits gives 0.13. Similarly, 0.1249 becomes 0.12.
So what is happening: if I have a number x with, say, 4 digits and I want to round it to 2 significant digits, I look at the discarded portion. If it is greater than 0.005, the last retained digit is increased by 1; if it is less than 0.005, the last retained digit stays as it is. For example, for 0.1251 the discarded part 0.0051 is greater than half a unit, so rounding to 2 significant digits gives 0.13; for 0.1249 it gives 0.12.
What happens if the number is exactly 0.125 and I want to round it to 2 significant digits? It is written as 0.12. But suppose I have the number 0.135 and round it to 2 significant digits; it becomes 0.14. Why? In both cases the discarded part is exactly half a unit, yet in the first case I do not increase the last digit, while in the second I do. The rule is: when the discarded part is exactly half, we round so that the last retained digit is an even digit. In 0.125 the digit 2 is already even, so it stays; in 0.135 I increase 3 by 1 to get the even digit 4. This procedure is called rounding a number to a given number of significant digits, and it plays a very important role in numerical computation: by rounding or truncating to a finite number of digits we introduce a very small error in the number, and later in the computation this error propagates and can become a large error.
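The rule above, including the round-half-to-even tie-break, can be sketched with Python's decimal module; the function name is my own:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_sig(x, n):
    """Round x to n significant digits, breaking ties toward an even
    last digit (as in the lecture's 0.125 -> 0.12, 0.135 -> 0.14)."""
    d = Decimal(str(x))
    if d == 0:
        return 0.0
    # d.adjusted() is the power of 10 of the leading digit
    quantum = Decimal(1).scaleb(d.adjusted() - n + 1)
    return float(d.quantize(quantum, rounding=ROUND_HALF_EVEN))

print(round_sig(0.1251, 2))  # 0.13  (discarded part > half)
print(round_sig(0.1249, 2))  # 0.12  (discarded part < half)
print(round_sig(0.125, 2))   # 0.12  (tie: 2 is already even)
print(round_sig(0.135, 2))   # 0.14  (tie: round 3 up to the even digit 4)
```

Decimal arithmetic is used so that ties like 0.125 are seen exactly, which binary floats cannot guarantee.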
Let us take an example of that. Consider a 2 by 2 matrix (from the products used below, its rows are (0.12, 0.21) and (0.14, 0.13)), and suppose I want to find its determinant by rounding each intermediate calculation to 2 significant digits. The determinant is (0.12)(0.13) - (0.21)(0.14); the first product is 0.0156 and the second is 0.0294. Rounding each intermediate result to 2 significant digits, 0.0156 becomes 0.016 because the discarded digit 6 is more than half, while 0.0294 remains 0.029, and the final determinant is 0.016 - 0.029 = -0.013.
On the other hand, the exact value is 0.0156 - 0.0294 = -0.0138, and rounding this final result to 2 significant digits gives -0.014. So the correct answer to 2 significant digits is -0.014, whereas rounding each intermediate calculation to 2 significant digits gives -0.013; we get a significant error in the calculation where every intermediate result is rounded to 2 significant digits. Now, what types of error do we have in numerical computation? The first, called simply the error, is defined as the true value minus the approximate value, where the approximate value is the one we calculate using a numerical method; if the approximate value equals the true value, the error is 0.
The absolute error is the absolute value of the error. The relative error is a measure of the error in relation to the size of the true value, and it is given as the absolute error divided by the absolute value of the true value. The percentage error is 100 times the relative error. The term truncation error denotes the error which results from approximating a smooth function by truncating its Taylor series representation to a finite number of terms.
Returning to the determinant example, the true value is -0.0138 while the approximate value obtained by rounding each intermediate calculation to 2 significant digits is -0.013, so the error is -0.0008 and the absolute error is 0.0008. The relative error is 0.0008/0.0138, which comes out as about 0.058. That looks small, but the percentage error is 5.8, since it is 100 times the relative error, and an error of more than 5 percent is quite significant. Now let us talk about significant digits in the approximation of a number.
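The determinant example and the error measures can be reproduced with exact decimal arithmetic. This is an illustrative sketch (the helper name is mine), with the 2 by 2 matrix entries read off from the products used in the lecture:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_sig(d, n):
    """Round Decimal d to n significant digits, ties to even."""
    return d.quantize(Decimal(1).scaleb(d.adjusted() - n + 1),
                      rounding=ROUND_HALF_EVEN)

# matrix [[0.12, 0.21], [0.14, 0.13]]: det = 0.12*0.13 - 0.21*0.14
p1 = round_sig(Decimal("0.12") * Decimal("0.13"), 2)  # 0.0156 -> 0.016
p2 = round_sig(Decimal("0.21") * Decimal("0.14"), 2)  # 0.0294 -> 0.029
approx = p1 - p2                                      # -0.013
true = Decimal("0.0156") - Decimal("0.0294")          # -0.0138

error = true - approx                 # -0.0008
abs_error = abs(error)                # 0.0008
rel_error = abs_error / abs(true)     # about 0.058
pct_error = 100 * rel_error           # about 5.8 percent
print(approx, round_sig(true, 2), float(pct_error))
```

Rounding only the final result gives -0.014, while rounding every intermediate step gives -0.013, reproducing the 5.8 percent error discussed above.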
So let us say the true value is x = 1/3 and the approximation is xA = 0.333. Now I want to check to how many significant digits this approximation agrees with the exact value. First of all I calculate |x - xA|, which is 0.00033... This is less than or equal to 0.0005, that is, (1/2) × 10^-3, and by the given definition |x - xA| ≤ (1/2) × β^(s - r + 1), where β is the base, here 10. So s - r + 1 = -3; I have one relation. Now I want to find the value of s: as I told you, s is the largest integer such that β^s is less than the absolute value of the true value. Here 10^-1, that is 0.1, is less than 1/3, the true value, so this gives me s = -1. From these 2 equations I can write r = 3, and hence I can say that the approximation 0.333 has 3 significant digits with respect to the exact value 1/3, or matches the exact value to 3 significant digits.
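The two relations used above, |x - xA| ≤ (1/2) × β^(s - r + 1) and β^s < |x|, can be solved for r programmatically. A small sketch (the function name is mine), assuming the approximation is not exact and the true value is not an exact power of the base:

```python
import math

def agreement_digits(x_true, x_approx, beta=10):
    """Significant base-beta digits to which x_approx agrees with
    x_true, following the lecture's definition (x_approx != x_true)."""
    # s: largest integer with beta**s < |x_true|
    s = math.floor(math.log(abs(x_true), beta))
    err = abs(x_true - x_approx)
    # t = s - r + 1 is the smallest integer with err <= 0.5 * beta**t
    t = math.ceil(math.log(2 * err, beta))
    return s - t + 1

print(agreement_digits(1/3, 0.333))        # 3
print(agreement_digits(0.02138, 0.02144))  # 2
```

The second call reproduces the next example in the lecture.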
Let us take one more example of the same thing. Suppose the true value is 0.02138 and the approximation is 0.02144. Following the same process, |x - xA| = 0.00006, which is less than or equal to 0.0005, that is, (1/2) × 10^-3. So again, as before, s - r + 1 = -3. Moreover, the true value 0.02138 is greater than 10^-2, that is 0.01, so this gives me s = -2. Substituting this value of s, -2 - r + 1 = -3, I get r = 2. So this approximation xA has 2 significant digits with respect to the true value 0.02138.

Now let me introduce systems of linear equations. In general we see systems of linear equations in a number of applications in science and engineering; most problems in science and engineering can be formulated as nonlinear equations, and those can be approximated by, or converted to, systems of linear equations in order to solve the overall problem.
Now, a linear system of n equations in n unknowns is given in this form: here x1, x2, ..., xn are the unknown variables, the coefficients a11, a12, ..., or in general aij, appear in the different equations, and b1, b2, ..., bn form the right-hand side vector. In matrix notation we can write this system as Ax = b. Now, how do we solve such a system?
Solving a linear system of n equations in n variables is more difficult when n is greater than or equal to 3: with two equations in two unknowns I can solve directly, but with 3 or more equations in the same number of unknown variables it becomes quite difficult, and we need a systematic way of solving such a system. So here I want to introduce a direct method called Gaussian elimination; it is a systematic method for solving a system of linear equations having 3 or more variables with the same number of equations.
(Refer Slide Time: 24:02)
It involves mainly 2 steps: first, reducing the coefficient matrix, that is, the matrix of coefficients of the unknown variables from the different equations, to row echelon form, and then back substitution. So let us take a 3 by 3 system to explain this particular method, and then we will solve an example.
So suppose I have 3 equations in 3 unknowns. In matrix form I can write the coefficient matrix [a11, a12, a13; a21, a22, a23; a31, a32, a33], the unknown vector [x1, x2, x3], and the right-hand side vector [b1, b2, b3]. In the Gaussian elimination method, first of all we write the augmented matrix, which is the coefficient matrix with the right-hand side vector appended as the last column. So to the coefficient matrix I append [b1, b2, b3]; this is my augmented matrix. Now I perform elementary row operations on this matrix to convert it to row echelon form, which means I want to make the coefficient part an upper triangular matrix.
So first of all I make the two entries below a11 in the first column 0 with the help of the first row; if a11 is 0, I interchange the first row with some other row whose entry in the first column is nonzero. When I perform these row operations the first row does not change; I make the 2 entries below it 0, and the other entries in those rows change. After that I make the entry below the second pivot 0 with the help of the second row, so the last 2 entries of the third row change. Now, looking at the resulting system of linear equations, from the last row I have a33'' x3 = b3'', which gives x3 = b3''/a33''. So from here I calculate the value of x3; substituting this value of x3 in the second equation I get the value of x2, and substituting the values of x3 and x2 back into the first equation I get the value of x1. This particular procedure is called Gaussian elimination.
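The two stages just described, forward elimination and back substitution, can be sketched as follows. This is a minimal illustration (names are mine), assuming nonzero pivots arise naturally, with no pivoting yet:

```python
def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination without pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n - 1):                        # forward elimination
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]                 # multiplier for row i
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n                                 # back substitution
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

A = [[1.0, 1.0, 1.0], [0.0, 2.0, 5.0], [2.0, 5.0, -1.0]]
b = [6.0, -4.0, 27.0]
print(gauss_solve(A, b))  # [5.0, 3.0, -2.0]
```

The 3 by 3 test system here is my own illustrative example with known solution (5, 3, -2), not the one from the lecture slide.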
So let us take an example: we need to solve the following system using Gaussian elimination on a computer using floating point representation with 3 digits in the mantissa, with all operations rounded. I have 3 equations in 3 unknowns, and the augmented matrix for this system is a 3 by 4 matrix whose last column is the right-hand side vector.
Now, first of all I need to make the 2 entries below the first pivot zero with the help of the first row. For that I use the row operations R2 replaced by R2 + (1.31/0.143) R1 and R3 replaced by R3 - (11.2/0.143) R1; we get a matrix in which these 2 entries are 0 and the other entries in those rows have changed. Then I make the remaining subdiagonal entry 0 with the help of the second row, performing the operation R3 replaced by R3 + (32.3/4.19) R2, from which I get the reduced matrix.
So now you can see the coefficient matrix is an upper triangular matrix. From the third row I can say that x3 is 2/(-1), that is, -2. If I substitute this value of x3 in the second equation, I get x2 as -2.82, and finally, substituting the values of x3 and x2 in the first equation, I get x1 as -0.950. However, the exact solution of this system is different, and you can see we have a huge error in the solution obtained using the Gaussian elimination method.
Here we have carried out the procedure correctly, but still there is a big difference in the final solution. So how do we overcome this? This can be done using Gaussian elimination with partial pivoting, which I will introduce in the next lecture. So in this lecture I told you about errors in numerical computation: what types of error we have, the concept of significant digits, some examples in which we lose significant digits in further computations, and finally Gaussian elimination. With this I will stop; thank you very much.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 2
Gaussian Elimination with Partial Pivoting
Hello everyone, this is the second lecture of this course. In this lecture I will introduce the Gaussian elimination method with partial pivoting, and in the later part of the lecture we will discuss ill conditioned systems, which are difficult to solve using Gaussian elimination with or without pivoting. In the last lecture we solved a 3 by 3 system of linear equations using the Gaussian elimination method, and after applying the elimination and the back substitution we got the solution x1 = -0.950, x2 = -2.82 and x3 = -2.00. However, the exact solution is x1 = 1, x2 = 2 and x3 = -3.
So basically there is a big difference between these 2 solutions, the one we obtained using the Gaussian elimination method and the true solution. Was there some mistake when we applied Gaussian elimination, did something go wrong? The procedure was correct; the reason for the error is that the rounding error propagated to such an extent that the final solution became hopelessly inaccurate. So what is the solution to this problem?
The solution to this problem is the Gaussian elimination method with partial pivoting. This is a modified version of the Gaussian elimination procedure: here we search for pivot elements and, based on them, perform the elementary row operations. So again consider a 3 by 3 system of 3 equations in 3 unknowns, the first being a11 x1 + a12 x2 + a13 x3 = b1, along with the 2nd and 3rd equations. In matrix form I can write the coefficient matrix with entries a11, a12, a13, a21, a22, a23, and the coefficients of the 3rd equation; these are the 9 coefficients of x1, x2, x3 in all 3 equations, and if I append the right-hand side vector, this becomes the augmented matrix.
In Gaussian elimination we perform elementary row operations on this matrix in such a way that it reduces to row echelon form. In Gaussian elimination with partial pivoting, first of all we search for the element having the maximum absolute value in the 1st column: I look at the 3 elements of the 1st column and see which one has the maximum absolute value. If it is a11, I do not do any operation at this moment; if it is a21, I interchange the 1st and 2nd equations; if it is a31, I interchange the 3rd equation with the 1st. Let us assume the maximum element is a31; then I interchange my 3rd equation with the 1st.
So my augmented matrix now has a31, a32, a33 and b3 in the first row, while a11, a12, a13 and b1 move to the third row; I have done the operation R1 interchanged with R3. Now a31, the element biggest in absolute value in the 1st column, sits in the pivot position, and I make this element 1 by dividing the 1st row by the scalar a31, that is, I replace R1 with (1/a31) R1. The pivot entry becomes 1, the next entry becomes a32/a31, then a33/a31, and the last entry becomes b3/a31. Now, with the help of this 1st row, I make the two elements below the pivot 0, that is, the element in the 1st column of the 2nd row and the element in the 1st column of the 3rd row. For this I need to perform two more row operations: R2 replaced by R2 - a21 R1, and R3 replaced by R3 - a11 R1. Please note that the element in the 3rd row is now a11, because we interchanged R1 and R3 in our 1st operation.
If I perform these 2 row operations, those 2 elements become 0 and the other entries take new values. This is the 1st pass of the Gaussian elimination method with partial pivoting. In the 2nd pass I leave out the 1st row and 1st column and perform the same operations on the remaining submatrix, in such a way that its pivot entry becomes 1 and the entry below it becomes 0. So in the end we are again reducing the matrix to row echelon form; however, we perform some extra operations by searching for the pivot elements and making the entries below each pivot 0 using the pivot in the diagonal position. This method is called Gaussian elimination with partial pivoting.
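The procedure above can be sketched as follows; this is an illustrative version (names mine) that keeps the column search and row swaps but omits the optional step of scaling each pivot row to 1, which does not change the computed solution:

```python
def gauss_partial_pivot(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n - 1):
        # pivot: the row with the largest |entry| in column k
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]                   # interchange rows
        for i in range(k + 1, n):                 # eliminate below pivot
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n                                 # back substitution
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# a classic illustrative case where the tiny pivot 1e-17 would ruin
# elimination without pivoting; exact solution is close to (1, 1)
print(gauss_partial_pivot([[1e-17, 1.0], [1.0, 1.0]], [1.0, 2.0]))
```

Without the row swap, dividing by the tiny pivot 1e-17 amplifies rounding error catastrophically; with partial pivoting the answer comes out accurately.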
(Refer Slide Time: 9:08)
So let us take an example of this method; I will consider the same example from my previous lecture, for which I got a wrong answer using plain Gaussian elimination. Here is the same augmented matrix, 3 equations in 3 unknowns. In the 1st operation of Gaussian elimination with partial pivoting, I search for the element having the maximum absolute value in the 1st column; here the element 11.2 has the highest absolute value, so I interchange my 1st and 3rd equations, exactly as in the procedure I explained. After interchanging the 1st and 3rd rows, the 3rd equation becomes the 1st equation and the 1st equation becomes the 3rd.
Now I divide the 1st row by 11.2 to make the pivot entry 1, so R1 is replaced by (1/11.2) R1; the pivot becomes 1 and the other entries change accordingly, for example -4.30/11.2 comes out as about -0.384. Next I make the elements below the pivot 0 using the pivot element, which is now 1. For this I replace the 2nd row R2 by R2 + 1.31 R1, and R3 by R3 - 0.143 R1. After these 2 operations I get a matrix in which the 1st row has pivot entry 1 and the entries below the pivot are 0; this completes the 1st pass of the Gaussian elimination method with partial pivoting. Now I do the same on the 2nd and 3rd equations, leaving out the 1st column because those entries are already 0. I go to the 2nd column and choose the element having the maximum absolute value out of the two candidates, 0.408 and 0.412; since 0.412 is bigger than 0.408, I interchange the 3rd equation with the 2nd.
(Refer Slide Time: 12:16)
So the system becomes like this. Now, to make the new pivot entry 1, I divide the 2nd row by 0.412; when I do this I get an augmented matrix with pivot element 1. Again I make the element below it 0 with the help of this pivot: I replace the 3rd row R3 by R3 - 0.408 R2. After performing this operation the matrix is in row echelon form, except that the last diagonal element is still not 1. It is a pivot element and I need to make it 1, so in the 3rd pass I divide that equation by this particular number.
(Refer Slide Time: 13:27)
So, dividing the 3rd row by 0.0800, the last pivot becomes 1, and now you can see the coefficient matrix is in triangular form, that is, it is an upper triangular matrix, or I can say it is in row echelon form, with all diagonal (pivot) entries equal to 1. From here, using back substitution, x3 comes out as -3, then x2 as 2, and finally x1 as 1 after putting in the values of x3 and x2. The exact solution is x1 = 1, x2 = 2, x3 = -3, which is the same as what we get using Gaussian elimination with partial pivoting; hence the exact solution agrees with the numerical solution when rounded to three significant digits. So this is the modification of Gaussian elimination, and this particular method is called the Gaussian elimination method with partial pivoting.
Here the term partial in partial pivoting refers to the fact that in the 1st pass we choose the entry with the biggest absolute value from the 1st column only: we look only at the 1st column when going through the 1st pass. If instead of searching only in the current column we search the whole remaining submatrix (in the 2nd pass, the whole 2 by 2 submatrix), then the pivoting is called complete pivoting or full pivoting. Unfortunately, there are some systems that we cannot solve accurately using either partial pivoting or complete pivoting.
Now let me talk about some of those systems. First of all I will take up the so-called ill conditioned systems: some linear systems, called ill conditioned, are extremely sensitive to rounding error or any other kind of numerical error. For such systems pivoting does not help much; even if you use partial pivoting or full pivoting, you can get an incorrect solution from the numerical scheme. I will illustrate such an ill conditioned system with the help of an example; after that I will define the matrix norm and relate it to the conditions under which a system becomes ill conditioned, that is, how to measure whether a system is ill conditioned or not.
So let us take an example with 2 equations and 2 unknowns, a very small example, and we will see that it is an ill conditioned system. Take the 1st equation as x1 + x2 = 2 and the 2nd as x1 + 1.0001 x2 = 2. In matrix form the coefficient matrix is [1, 1; 1, 1.0001], the unknown vector is [x1, x2], and the right-hand side vector is [2, 2]. Just by looking at this 2 by 2 system we can say its solution is x1 = 2 and x2 = 0; this solution satisfies our system of equations and hence is the exact solution of the system.
Now what I will do is leave the coefficient matrix unchanged but make a little change in the right-hand side entry of the 2nd equation: earlier it was 2, but due to some error, perhaps a numerical error or an error in measurement such as a sensor error, it becomes 2.0001. So I have a very small change in the input data, from 2 to 2.0001, a change of the order of 10^-4.
Now if you look at the solution of this new system, I get x1 = 1 and x2 = 1. Compare the 2 solutions of the 2 systems: in the input data there is a very small change, of the order of 10^-4 (2 became 2.0001), yet the change in the solution is a large deviation. When a large deviation in the solution results from a small change in the input data, the system is called ill conditioned, and this is a perfect example of an ill conditioned system.
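The swing in the solution can be verified directly; a small sketch using Cramer's rule (function name mine):

```python
def solve2(a, b, c, d, e, f):
    """Solve [a b; c d] [x1 x2] = [e f] by Cramer's rule."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det

# original right-hand side (2, 2) vs the perturbed (2, 2.0001)
x = solve2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0)     # about (2, 0)
y = solve2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0001)  # about (1, 1)
print(x, y)
```

A change of 10^-4 in one input moves the solution from (2, 0) all the way to (1, 1).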
This particular example shows that a small change of 0.0001 in the system of equations makes a significant change in the solution of the system. In a system of equations Ax = b, when the solution is highly sensitive to the values of the coefficient matrix A or the right-hand side vector b, the equations are said to be ill conditioned.
(Refer Slide Time: 21:24)
A matrix norm is a real valued function defined on the set of, say, n by n matrices and satisfying the following conditions. Condition 1: the norm of a matrix is always non-negative. Condition 2: it is 0 if and only if A is the null matrix, the zero matrix. Condition 3: if you multiply the matrix A by a scalar α, then ||αA|| = |α| · ||A|| for every α in the set of real numbers. Condition 4 is the triangle inequality: if you have 2 matrices A and B of order n by n, then ||A + B|| is always ≤ ||A|| + ||B||. Any real valued function satisfying these properties on the set of n by n matrices is called a matrix norm.
We have different types of norms. The 1st is the natural or induced matrix norm, defined for a matrix A by ||A|| = max ||Ax||, the maximum taken over all vectors x in R^n having unit norm, that is, vectors of unit length. So you take a vector of unit length, multiply it by A, and ||A|| is the maximum of ||Ax|| over all such vectors.
The 2nd type of norm we define here is the maximum row norm: for an n by n matrix A it is the maximum over all rows of the absolute row sums. Absolute row sum means we take the entries of a row, say the 1st row, take the absolute value of each element and sum them; similarly take the absolute sum of the 2nd row, and so on; the maximum of those sums gives the maximum row norm. The 3rd norm we define on the set of matrices is the Euclidean norm (also called the spectral norm), obtained as the square root of ρ(AᵀA), where ρ(A) denotes the maximum of |λ| over the eigenvalues λ of A, that is, the largest eigenvalue in terms of absolute value.
(Refer Slide Time: 24:39)
Let us take the matrix with rows (3, -2, 4), (1, 2, -3) and (2, 4, 1). The maximum row norm of this matrix A is the maximum of the absolute row sums: for the 1st row 3 + 2 + 4 = 9 (the absolute value of -2 is 2), for the 2nd row 1 + 2 + 3 = 6, and for the 3rd row 2 + 4 + 1 = 7, so the norm is 9. On the other hand, the Euclidean norm of this matrix is the square root of the maximum eigenvalue of AᵀA; the eigenvalues of AᵀA are 3.61, 23.66 and 36.73, so the norm is the square root of 36.73, the maximum among these 3 eigenvalues, and it comes out as 6.06.
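Both norms can be checked without an eigenvalue library; the Euclidean norm below is estimated by power iteration on AᵀA, which is a rough sketch rather than a production eigensolver (names mine):

```python
def max_row_norm(A):
    """Maximum over rows of the absolute row sums."""
    return max(sum(abs(v) for v in row) for row in A)

def euclidean_norm(A, iters=200):
    """sqrt of the largest eigenvalue of A^T A, via power iteration."""
    m, n = len(A), len(A[0])
    x = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # S x where S = A^T A, computed as A^T (A x)
        Sx = [sum(A[i][j] * Ax[i] for i in range(m)) for j in range(n)]
        lam = max(abs(v) for v in Sx)   # dominant eigenvalue estimate
        x = [v / lam for v in Sx]
    return lam ** 0.5

A = [[3.0, -2.0, 4.0], [1.0, 2.0, -3.0], [2.0, 4.0, 1.0]]
print(max_row_norm(A))              # 9
print(round(euclidean_norm(A), 2))  # about 6.06
```

The power iteration converges here because AᵀA has a well-separated dominant eigenvalue (36.73 against 23.66).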
So you can see that the Euclidean norm is an induced matrix norm. Now, to relate this norm to ill conditioned systems, we have an important result. In this theorem, let A be a non-singular matrix; then the solutions x and y of the systems Ax = b1 and Ay = b2 respectively satisfy the inequality

||x - y|| / ||x|| ≤ ||A|| · ||A⁻¹|| · ||b1 - b2|| / ||b1||.

To see this, note that x - y = A⁻¹(b1 - b2), so ||x - y|| ≤ ||A⁻¹|| · ||b1 - b2||. Dividing both sides by ||x||, I can write ||x - y||/||x|| ≤ ||A⁻¹|| · ||b1 - b2|| / ||x||. If I multiply the right-hand side in the numerator as well as the denominator by ||A||, it becomes ||A|| · ||A⁻¹|| · ||b1 - b2|| / (||A|| · ||x||).
We know that ||A|| · ||x|| ≥ ||Ax||, so if I replace the denominator ||A|| · ||x|| by the smaller quantity ||Ax||, the inequality still holds, and since Ax = b1, ||Ax|| = ||b1||; this gives the result of the theorem. Here it is interesting that the multiplying coefficient ||A|| · ||A⁻¹|| depends entirely on the matrix of the problem and not on the right-hand side vector, yet it shows up as an amplifier of the relative change in the right-hand side vector. So if ||A|| · ||A⁻¹|| is large for a matrix, there can be a large change in the solution due to a small change in the right-hand side vector, and the system becomes ill conditioned.
So based on this idea we can define the condition number: for a given non-singular matrix A with real entries and of size n by n, and a given matrix norm, the condition number of A with respect to that norm is defined by ||A|| ||A-1||, that is, this particular term. So if the condition number of a matrix is large, even a small variation in the right-hand side vector, that is in b1 or b2, can lead to a drastic variation in the solution, and such matrices are called ill conditioned matrices.
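As a quick sketch of this definition, the condition number in the infinity norm (maximum absolute row sum) can be computed directly for a 2 × 2 matrix; the helper functions below are illustrative and not from the lecture.

```python
# Condition number kappa(A) = ||A|| * ||A^{-1}||, here in the infinity
# norm (maximum absolute row sum), sketched for a 2x2 matrix.

def inf_norm(M):
    # maximum absolute row sum
    return max(sum(abs(v) for v in row) for row in M)

def inv_2x2(M):
    # closed-form inverse of a 2x2 matrix [[a, b], [c, d]]
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1.0, 1.0], [1.0, 1.0001]]   # the ill conditioned example below
kappa = inf_norm(A) * inf_norm(inv_2x2(A))
print(kappa)                      # a very large value, about 4.0e4
```

For this matrix the condition number comes out around 4 × 10^4, so a relative perturbation of the right-hand side can be amplified by roughly that factor.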
So if you see the earlier example which we took, [1, 1; 1, 1.0001], this example was taken in the beginning when I introduced the ill conditioned system. The condition number of this matrix comes out to be very large, and hence the system was ill conditioned. So I will stop this lecture here. In the summary of this lecture, first I introduced Gaussian elimination with partial pivoting, and in the second phase of the lecture I introduced ill conditioned systems and the idea of the condition number to measure whether a system is ill conditioned or not. In the next lecture I will continue with the direct methods of solving linear systems, and there I will introduce a technique called LU decomposition. Thank you very much.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 3
LU Decomposition
Hello everyone, so this is the 3rd lecture of this course, and today I am going to introduce another direct method for solving linear systems, called LU decomposition. The idea of LU decomposition is the same as in Gaussian elimination, namely to convert the coefficient matrix into triangular form. However, in Gaussian elimination we were reducing our matrix to an upper triangular matrix and then using back substitution; here the idea is to write the coefficient matrix as a product of lower and upper triangular matrices and then solve the linear system of equations using forward substitution followed by back substitution.
So consider the system of equations Ax = b. This is also known as the decomposition or factorization method: the coefficient matrix is decomposed, or factorized, into a product of lower and upper triangular matrices L and U, so A = LU, where L is a lower triangular matrix and U is an upper triangular matrix. As you can see, both are n by n matrices; L is lower triangular because all the entries above the main diagonal are 0, and similarly U is upper triangular because all the entries below the main diagonal are 0. We are using the usual matrix multiplication to multiply the matrices L and U.
So basically we are taking the system Ax = b, but at this moment we will focus on the decomposition of the coefficient matrix as the product of the two matrices L and U. If we have again a 3 by 3 system with entries a11, a12, a13, a21, a22, a23, a31, a32, a33, it can be written as the product of a 3 by 3 lower triangular matrix L, [l11, 0, 0; l21, l22, 0; l31, l32, l33], and an upper triangular matrix U, [u11, u12, u13; 0, u22, u23; 0, 0, u33]. Now if you look at the left-hand side we have a total of 9 entries, and these 9 entries are known to us; however on the right-hand side we have a total of 12 entries, 6 from the matrix L and 6 from the matrix U.
Hence if we multiply these two matrices and try to find out the values of all the lij and uij, we will not be able to do it, because here we have only 9 entries, so 9 equations in 12 unknowns. So what is the solution to this problem? In general, if A is an n by n matrix then I will have a total of n2 + n unknowns, because as you can see, in the 1st row of L I have one unknown, in the 2nd row 2, in the 3rd row 3, and in the nth row n, so it will be 1 + 2 + 3 + … + n, which is n(n + 1)/2.
So n(n + 1)/2 unknowns come from the lower triangular matrix and n(n + 1)/2 from the upper triangular matrix, giving a total of n2 + n unknowns among all the lij and uij, while for an n by n matrix A I have only n2 known entries. So somehow I need to reduce the number of unknowns by n. What I will do is fix the diagonal entries of either L or U as 1, and then the trick will work: either I choose lii = 1 for all i, so that n unknowns are removed and, with n2 unknowns and n2 entries, I get a unique LU factorisation, or instead I take all the diagonal entries of my upper triangular matrix as 1.
(Refer Slide Time: 7:25)
So if I take the diagonal entries of the lower triangular matrix L as 1, the method is known as the Doolittle method. If I take the diagonal entries of the upper triangular matrix U as 1, the method is called Crout's method. Now let us take the diagonal entries of the upper triangular matrix U as 1; then we have 6 unknowns from L, 3 unknowns from U and 9 known entries, and I will find out the values of all the lij and uij.
(Refer Slide Time: 8:22)
So if we take uii = 1 for i = 1 to n, that is, all diagonal entries of the upper triangular matrix as 1, the system of equations (1) can be written as lij = aij - ∑k=1 to j-1 lik ukj whenever i ≥ j, and uij = (aij - ∑k=1 to i-1 lik ukj)/lii whenever i < j, with uii = 1. So when i ≥ j, lij is given by the first equation, which gives all the entries of the lower triangular matrix, and when i < j, uij is given by the second equation.
The second equation gives all the entries of the upper triangular matrix above the main diagonal, while the diagonal entries of the upper triangular matrix are 1. Then what will we do? From the 1st column of the matrix L we can find out the entries l11, l21, l31, …, ln1, because the 1st column of the lower triangular matrix will be identical to the 1st column of the matrix A. After finding li1 for i = 1 to n, we will go to the 1st row: in the 1st row we can calculate all u1j, which will be a1j/l11 for j = 2 to n. Here we take j from 2 because u11 we have already fixed as 1.
So the 1st column gives the entries l11, l21, l31 and so on, and the 1st row gives the entries u11 = 1, u12, u13 up to u1n. Then we go to the 2nd column, which gives the entries li2 = ai2 - li1 u12 for i = 2 to n, and the 2nd row gives the entries of the 2nd row of U by equation (5). Then we continue like this, 3rd column, 3rd row, 4th column, 4th row and so on, and we will be able to get all the entries lij and uij.
So here we will take an example of this method. Take a 3 by 3 matrix A whose 1st row is 1, 2, 4, whose 2nd row is 3, 8, 14 and whose last row is 2, 6, 13. Now in Crout's method we decompose this as a lower triangular matrix times an upper triangular matrix, A = [l11, 0, 0; l21, l22, 0; l31, l32, l33] . [1, u12, u13; 0, 1, u23; 0, 0, 1], where I am taking the main diagonal elements of U as 1. Now if I multiply these two matrices, the 1st element of the product will be l11, the 2nd element will be l11 u12 and the 3rd element will be l11 u13. Similarly, multiplying the 2nd row of L with the columns, the elements will be l21, l21 u12 + l22 and l21 u13 + l22 u23. In the last row of the product matrix the elements will be l31, l31 u12 + l32 and finally l31 u13 + l32 u23 + l33. This product matrix is equal to A. Now we compare these two matrices, and the comparison will be done based on the strategy I explained earlier: initially we compare the elements of the 1st column.
(Refer Slide Time: 14:20)
So when I compare the elements of the 1st column of the product with those of A, I get l11 = 1, l21 = 3 and finally l31 = 2. After comparing the elements of the 1st column, we compare the elements of the 1st row: the next element is l11 u12, which in A is 2, and since l11 is 1, u12 comes out to be 2. Similarly, comparing l11 u13 with 4, since l11 is 1, I get u13 = 4. Now I compare the elements of the 2nd column: for this I take the element l21 u12 + l22 = 8, and since l21 is 3 and u12 is 2, I have 3 . 2 + l22 = 8.
So from here I get l22 = 8 - 6, that is 2. Next I take the element l31 u12 + l32, which equals 6; since l31 is 2 and u12 is 2, I have 2 . 2 + l32 = 6, and from here I get l32 = 2. So out of the 9 elements I have got 7 elements just by comparing 2 columns and one row.
Now I will compare the 2nd row. From the 2nd row two elements have already been used in the comparison, so now I go for the element l21 u13 + l22 u23, which in A is 14; with l21 = 3, u13 = 4 and l22 = 2 I have 3 . 4 + 2 u23 = 14, and from here I get u23 = 1. Finally I compare the last element, l31 u13 + l32 u23 + l33 = 13; l31 is already known to be 2, u13 is 4, l32 is 2 and u23 is 1, so 8 + 2 + l33 = 13, and from here I get l33 = 13 - 10, that is 3.
So in this way I have calculated all 9 entries. If I now write these entries, the LU decomposition of this matrix will be L = [1, 0, 0; 3, 2, 0; 2, 2, 3], the lower triangular matrix, and U = [1, 2, 4; 0, 1, 1; 0, 0, 1], the upper triangular matrix, where u12 is 2, u13 is 4 and finally u23 is 1. Hence this is an example of the LU decomposition of a given matrix A when the diagonal entries of the upper triangular matrix are 1.
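The column-then-row comparison strategy above can be sketched as a small routine; the function below is an illustrative implementation of Crout's scheme (uii = 1), assuming no zero pivot appears.

```python
def crout(A):
    """Crout LU decomposition: A = L U with u_ii = 1.
    Columns of L and rows of U are filled alternately, as in the lecture."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for j in range(n):
        U[j][j] = 1.0
        for i in range(j, n):       # j-th column of L
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for i in range(j + 1, n):   # j-th row of U, above the diagonal
            U[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
    return L, U

A = [[1, 2, 4], [3, 8, 14], [2, 6, 13]]   # the matrix from the example
L, U = crout(A)
# L = [[1, 0, 0], [3, 2, 0], [2, 2, 3]],  U = [[1, 2, 4], [0, 1, 1], [0, 0, 1]]
```

Running it on the example matrix reproduces exactly the L and U found above by hand.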
There is an analogous example for the Doolittle method: there u11 will be 1, u12 will be 2 and u13 will be 4, coming from the 1st row; similarly the 1st column gives l21 = 3 and l31 = 2, and making the other comparisons I will be able to decompose the same matrix A as I took in Crout's method into the product of these two matrices.
(Refer Slide Time:20:39)
Now this is about the factorisation or decomposition; the question now is how to solve a linear system using this factorization. So let me explain it. Basically we have Ax = b, and I have decomposed A as L . U, so LU x = b. Now let us assume that Ux = z, where z is a column vector with components z1, z2, …, zn. If I substitute Ux = z, the original system becomes Lz = b.
Here you can note that L is a lower triangular matrix, so the system will look like [l11, 0, 0; l21, l22, 0; l31, l32, l33] . [z1, z2, z3] = [b1, b2, b3]. Since it is lower triangular, the 1st equation gives z1 = b1/l11 directly. In the 2nd equation I substitute the value of z1 and get the value of z2; similarly, in the 3rd equation I substitute the values of z1 and z2 and get the value of z3. So this is forward substitution: making use of forward substitution I get the values z1, z2, z3, or up to zn if we have an n by n matrix, and thus I can find the unknown vector z.
(Refer Slide Time: 23:13)
So here I am making use of forward substitution to calculate the vector z. Once I know this vector z, I know that Ux = z, where U is an upper triangular matrix and z is now known, so I can find the value of x by making use of back substitution, as we did in Gaussian elimination: from the last equation I get the value of xn, I substitute the value of xn in the penultimate equation to get xn - 1, and so on. Finally, substituting the values of xn, xn - 1, …, x2 in the 1st equation, I get the value of x1. This method is called Crout's method for solving a linear system of equations. So what we are doing is: first we decompose the coefficient matrix as the product of lower and upper triangular matrices, then we use forward substitution and finally back substitution.
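The two substitution sweeps can be sketched as follows; these helper functions are illustrative, assuming the triangular factors are already available. The transcript does not show the right-hand side of the worked example, so b = [4, 12, 11] is assumed here because, with the Doolittle factors of the example matrix, it reproduces the intermediate vector z = (4, 0, 3) and the solution x = (2, -1, 1) quoted in the lecture.

```python
def forward_sub(L, b):
    """Solve L z = b for lower triangular L, top row first."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][j] * z[j] for j in range(i))) / L[i][i]
    return z

def back_sub(U, z):
    """Solve U x = z for upper triangular U, bottom row first."""
    n = len(z)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (z[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# Doolittle factors of [[1, 2, 4], [3, 8, 14], [2, 6, 13]];
# b = [4, 12, 11] is an assumption that matches the lecture's numbers.
L = [[1, 0, 0], [3, 1, 0], [2, 1, 1]]
U = [[1, 2, 4], [0, 2, 2], [0, 0, 3]]
z = forward_sub(L, [4, 12, 11])   # z = [4.0, 0.0, 3.0]
x = back_sub(U, z)                # x = [2.0, -1.0, 1.0]
```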
(Refer Slide Time: 25:02)
This is the example I took earlier: suppose we have the system of equations with coefficient matrix [1, 2, 4; 3, 8, 14; 2, 6, 13], so 3 equations in 3 unknowns, the same coefficient matrix as taken earlier. I can write this as the product L . U x = b. Now assume Ux = [z1, z2, z3]. If I substitute this, the original system can be written as Lz = b, and from here, using forward substitution, I get z1 = 4, z2 = 0 and z3 = 3.
Now I put the values of z1, z2, z3 into Ux = z and get the values x1 = 2, x2 = - 1 and x3 = 1, which is the solution of the system. We can also find the inverse using the LU decomposition: since A = L.U, A-1 = (LU)-1, which is basically U-1.L-1. Now the question is whether this method will always work? No. The method fails if any of the diagonal elements, either of the lower triangular matrix or of the upper triangular matrix, depending on which of the two methods is used, is 0, because the diagonal elements come in the denominator when calculating the other variables, and if one of them is 0 we cannot find a finite value for the other variables. So what is a sufficient condition for this method? LU decomposition is guaranteed to give a solution if the matrix A is a positive definite matrix.
(Refer Slide Time: 27:29)
There is one more method for the case when the coefficient matrix is a symmetric matrix, called the Cholesky method. Since the coefficient matrix is symmetric, we can write it as the product L.LT; if L is a lower triangular matrix, LT will be an upper triangular matrix. In the other way, we can write it as the product U.UT of an upper triangular matrix U with its transpose. Hence in the decomposition we need to find only one matrix instead of both L and U, either L or U, and then the rest of the process is similar to the Crout or Doolittle method.
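A minimal sketch of the Cholesky factorization follows; it is illustrative, and the small symmetric positive definite test matrix below is chosen for the example (it is not the 4 × 4 matrix from the slides, which is not reproduced in the transcript).

```python
import math

def cholesky(A):
    """Return lower triangular L with A = L L^T.
    Assumes A is symmetric positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(A[i][i] - s)   # diagonal entry
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]  # below-diagonal entry
    return L

A = [[4, 2, 0], [2, 5, 2], [0, 2, 5]]   # symmetric positive definite
L = cholesky(A)
# L = [[2, 0, 0], [1, 2, 0], [0, 1, 2]]
```

Note that only the single factor L is computed; U = LT comes for free by transposing, which is exactly the saving the lecture mentions.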
(Refer Slide Time: 29:24)
Like if you write A = L.LT, the system of equations Ax = b can be written as L.LTx = b; substituting LTx = z, this gives Lz = b, so from here you calculate z and then you substitute back the value of z to find x. If L is non-singular you can calculate z as L-1b and then finally x as (LT)-1 z, that is (L-1)Tz. Similarly, you can use the Cholesky method to find the inverse of a matrix: the inverse is given by (L-1)T.L-1. The working equations for finding the entries in the Cholesky method are the same as in Crout's method; the only change is due to the symmetric property of A. Consider this example: it is a 4 by 4 system and the coefficient matrix is a tri-diagonal, symmetric matrix.
So if we solve this system using the Cholesky method, I take L as the lower triangular matrix, and LLT becomes like this, where this is LT. The product can be written in this way, and after comparing the entries of A with the entries of this product, L comes out as shown. From here I get the values of z1, z2, z3, z4, and once I have them, I can solve LTx = z using back substitution to get x1 = 56/209, x2 = 15/209, x3 = 4/209 and x4 = 1/209.
(Refer Slide Time: 31:23)
If I want to find L-1, by the same process L-1 comes out as this particular matrix, and A-1 = (L-1)T.L-1, which is this matrix. So in this lecture I discussed methods based on triangular matrices for solving linear systems of equations. First I discussed Crout's method and the Doolittle method, and finally I gave a short explanation of the Cholesky decomposition, which applies in the case when A is a symmetric matrix. In the next lecture I will go to the other category of methods for solving linear systems of equations, called iterative methods. So far I have discussed direct methods; iterative methods have a few advantages over the direct methods. Thank you very much for listening to this lecture.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 4
Jacobi and Gauss Seidel Methods
Hello everyone. In this lecture I am going to introduce a new way of solving linear systems, called iterative methods. In the past couple of lectures I introduced the direct methods, which include Gaussian elimination, Gaussian elimination with partial pivoting and then LU decomposition. Iterative methods are quite beneficial in terms of numerical computation when you are going to solve linear systems of equations.
However, these techniques can only be applied to a square linear system; a square linear system means you have n equations in n unknown variables. Basically, iterative methods for Ax = b begin with an approximation to the solution, x0, which we call the initial solution, and then seek to provide a series of improved approximations x1, x2, etc. that converge to the exact solution. In engineering problems this approach is appealing because it can be stopped as soon as the approximation xi has converged to an acceptable precision, that is, whenever the difference between the exact solution and the solution in the current iteration is less than a given threshold. Iterative methods are good for problems where the matrix A is large and sparse, and in these cases they are much faster than the direct methods.
(Refer Slide Time: 3:08)
Now, how to write this P and q is what differs among the various methods in the category of iterative methods. As I told you, we start with an initial solution x0; then Px0 + q gives us x1, in the 2nd iteration x2 = Px1 + q, and so on. In each iteration we update our solution with the assumption that it is going to converge to the exact solution; however this is not always true, and the solution may also diverge for a given iterative scheme. Now, how do we write this particular matrix P, the iteration matrix, and the column vector q?
(Refer Slide Time: 5:11)
As we know, you can always write a given n × n matrix A as the sum of 3 matrices, so that (L + D + U)x = b, where L is a strictly lower triangular matrix, D is a diagonal matrix and U is a strictly upper triangular matrix. For example, if you have a 3 × 3 system given by a coefficient matrix A, then I can write A as a lower triangular matrix + a diagonal matrix + an upper triangular matrix; here a11 will not appear in L, as I have already taken a11 in the diagonal matrix D. So this is the matrix L, this is the matrix D and this is the matrix U.
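This splitting is purely mechanical; a minimal sketch (an illustrative helper, not from the lecture) is:

```python
def split_ldu(A):
    """Split A into strictly lower L, diagonal D and strictly upper U,
    so that A = L + D + U entrywise."""
    n = len(A)
    L = [[A[i][j] if i > j else 0 for j in range(n)] for i in range(n)]
    D = [[A[i][j] if i == j else 0 for j in range(n)] for i in range(n)]
    U = [[A[i][j] if i < j else 0 for j in range(n)] for i in range(n)]
    return L, D, U

A = [[4, 2, 3], [3, -5, 2], [-2, 3, 8]]   # an arbitrary 3x3 example
L, D, U = split_ldu(A)
# L keeps the below-diagonal entries, D the diagonal, U the above-diagonal
```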
(Refer Slide Time: 7:29)
Now we have different schemes under the category of iterative methods. First of all I want to derive the simplest scheme, called the Jacobi method. In the Jacobi method we write the system as Dx = -(L + U)x + b; that is, we have taken these 2 terms to the right-hand side. Now my iterative scheme works like this: at the (k + 1)th iteration x is obtained from the estimate of x at the kth iteration, so pre-multiplying both sides by D-1 I can write xk+1 = -D-1(L + U)xk + D-1b. If I compare this with the general formula of an iterative scheme, xk+1 = Pxk + q, the iteration matrix P is given by -D-1(L + U), an n × n matrix, and the column vector q is given by D-1b, an n × 1 column vector. So this particular scheme is called the Jacobi iterative scheme. We start with an initial solution x0, then find x1 using this formula, and so on.
In a simpler setting, to write the iterative equations for this scheme, consider a system of 3 equations in 3 unknowns with coefficient matrix A, unknown vector x = [x1, x2, x3] and right-hand side vector b = [b1, b2, b3], so that Ax = b. The 1st equation can be written as x1k+1 = (1/a11)(b1 - a12 x2k - a13 x3k).
So what have I done? I have kept the variable corresponding to the diagonal element of the coefficient matrix on the left-hand side and taken the other 2 terms to the right-hand side. Similarly, from the 2nd equation we get the iterative equation for the variable x2, given as x2k+1 = (1/a22)(b2 - a21 x1k - a23 x3k). Finally, from the 3rd equation I can write the iterative equation for x3: x3k+1 = (1/a33)(b3 - a31 x1k - a32 x2k). These 3 iterative equations are updated simultaneously to get the iterate of x in each iteration.
Now I will take an example and solve it using this Jacobi iterative scheme. Let us take this system of equations, 3 equations in 3 unknowns x1, x2, x3. The iterative equations can be written in this way: x1k+1 = (1/4)(8 - 2x2k - 3x3k), x2k+1 = -(1/5)(-14 - 3x1k - 2x3k), and similarly from the 3rd equation x3k+1 = (1/8)(27 + 2x1k - 3x2k). Now if I start with the initial solution x10 = 0, x20 = 0, x30 = 0, then in iteration 1 I get x11 = 2 from the 1st equation: putting x20 = 0 and x30 = 0, x11 becomes 8/4, which is 2.
Similarly, from the 2nd equation I get x21 = 2.8 and from the 3rd x31 = 3.375. Now if I substitute these values into the right-hand sides of the 3 iterative equations, I get the next iterate of x: for x1 it will be - 1.931, for x2 it will be 5.350 and for x3 it will be 2.825. If we need a solution correct up to 3 places after the decimal, we need to continue calculating this sequence of values further.
Continuing in the same way, the table gives the values of x1, x2, x3 in the 3rd iteration, then in the 4th iteration, and you can see the values are still changing considerably. Going on in the same manner, in the 22nd iteration x1 comes out as - 1.025, x2 as 2.976 and x3 as 1.973, and after the 43rd iteration, that is in the 44th iteration, I get x1 = - 1.001, x2 = 3 and x3 = 2. In the 45th iteration I get - 1, 3 and 2, and in the 46th iteration again - 1, 3, 2. Since 2 successive iterations give the same values, the solution has converged to - 1, 3 and 2, which is also the exact solution of the given linear system. So we have taken 46 iterations to solve this 3 × 3 system using the Jacobi method, and this example gives an illustration of the method. As I told you, we took 46 iterations just by starting with the initial solution 0, 0, 0; hence I can comment on the convergence of the Jacobi method and say it has slow convergence. So in the next method, called the Gauss Seidel method, I will introduce a modification in the Jacobi scheme, and then the Gauss Seidel iterative equations will give a bit faster convergence compared to the Jacobi scheme.
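The Jacobi sweep for this example can be sketched as below. The coefficient matrix and right-hand side are recovered from the three iterative equations above (4x1 + 2x2 + 3x3 = 8, 3x1 - 5x2 + 2x3 = -14, -2x1 + 3x2 + 8x3 = 27); this is a sketch, not the lecturer's code.

```python
def jacobi(A, b, x0, tol=1e-10, max_iter=200):
    """Jacobi iteration for Ax = b: every component is updated from the
    previous iterate only. Returns (solution, iterations used)."""
    n = len(b)
    x = list(x0)
    for it in range(1, max_iter + 1):
        x_new = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
                 for i in range(n)]
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new, it
        x = x_new
    return x, max_iter

# system recovered from the iterative equations in the example
A = [[4, 2, 3], [3, -5, 2], [-2, 3, 8]]
b = [8, -14, 27]
x, its = jacobi(A, b, [0.0, 0.0, 0.0])
# x comes out close to the exact solution [-1, 3, 2]; convergence is slow
```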
(Refer Slide Time: 18:46)
If we look at the iterative equations of the Jacobi method, you can see that in the 2nd equation I am using the value of x1 from the previous iteration; however, before coming to the 2nd equation I have already found the value of x1 at the current iteration, so the current value of x1 is available to me but I am using the old one. Similarly, in the 3rd equation both the values of x1 and x2 are available from the current, (k + 1)th, iteration via the first 2 equations, but we are still using the values obtained in the previous iteration.
If I replace these with the values from the current iteration, I can have somewhat faster convergence. For example, from the 1st equation I calculate x1 at the (k + 1)th iteration and then use this value; from the 1st and 2nd equations I get the values of x1 and x2 respectively at the (k + 1)th iteration, and in the 3rd equation I use these new values. So basically, in a general setting, I have the system Ax = b, I again write it as (L + D + U)x = b, and then I write (D + L)x = -Ux + b, from which the iterative scheme is (D + L)xk+1 = -Uxk + b. This equation can be written in a more convenient form by pre-multiplying by the inverse of (D + L): xk+1 = -(D + L)-1Uxk + (D + L)-1b. This particular equation gives the Gauss Seidel scheme in a general setting where the coefficient matrix A is n × n. Comparing this scheme with the abstract equation of an iterative scheme, the iteration matrix P is -(D + L)-1U and the column vector q is (D + L)-1b.
So this scheme is called the Gauss Seidel method for solving linear systems of equations. If I use this method on a general 3 × 3 system, a11 x1 + a12 x2 + a13 x3 = b1 and the other 2 equations as we took in the Jacobi method, the final system comes out as 3 iterative equations; the only change, as I told you earlier, is that we use the updated values of x1 and x2 in the 2nd and 3rd iterative equations respectively when finding the values in the current, (k + 1)th, iteration. Hence we hope that this scheme will be faster, because we are using the more up-to-date values of the variables in the scheme.
Let us take an example of this method, the same example we took in the case of the Jacobi method, where we needed 46 iterations for the convergence of the solution; let us see how many iterations we need using the Gauss Seidel method. For that example the 3 iterative equations, for x1, x2 and x3, can be written as shown; note this particular term and these 2 terms, where we are using the updated values as I told you in the derivation of this method. Then I start with the initial solution x10 = x20 = x30 = 0.
So initially I take the zero vector as the initial solution. In the 1st iteration, putting x2 = 0 and x3 = 0, I get x11 = 8/4, which is 2. Now in the 2nd equation I will not use x1 = 0, because that would be exactly like Jacobi; instead I use the updated value of x1, that is 2, so x21 = -(1/5)(-14 - 3 . 2 - 2 . 0) = -(1/5)(-20) = 4. Finally, x31 = (1/8)(27 + 2 . 2 - 3 . 4); please note that again I am using the updated values, whereas in the Jacobi method I had used 0, the value from the previous iteration. So it is (27 + 4 - 12)/8 = 19/8, which comes out to be 2.375. In the 2nd iteration I get x1 = - 1.781, x2 = 2.681 and x3 = 1.924.
These are the values in the 2nd iteration; then we have the values in the 3rd iteration, the 4th iteration and so on. Continuing this calculation, in 9 iterations I get x1 = - 1, x2 = 3 and x3 = 2, and in the 10th iteration again x1 = - 1, x2 = 3 and x3 = 2. Hence my iterative scheme converges in 10 iterations, while the Jacobi method was taking 46 iterations. So the claim we made, that by using these updates the Gauss Seidel method will be faster than the Jacobi method, is verified by this example.
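The only change from a Jacobi sweep is that each component update immediately uses the values already computed in the current sweep; a sketch follows, using the same recovered system as before (not the lecturer's code).

```python
def gauss_seidel(A, b, x0, tol=1e-10, max_iter=100):
    """Gauss Seidel iteration: x[i] is overwritten in place, so later
    components in the same sweep use the already-updated values."""
    n = len(b)
    x = list(x0)
    for it in range(1, max_iter + 1):
        x_old = list(x)
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
        if max(abs(x[i] - x_old[i]) for i in range(n)) < tol:
            return x, it
    return x, max_iter

A = [[4, 2, 3], [3, -5, 2], [-2, 3, 8]]   # same example system as before
b = [8, -14, 27]
x, its = gauss_seidel(A, b, [0.0, 0.0, 0.0])
# converges to [-1, 3, 2] in far fewer sweeps than Jacobi
```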
In this particular lecture we discussed 2 iterative schemes for solving linear systems of equations, and we have seen that the latter is considerably faster than the former. An iterative scheme is better if it converges faster, so the effort should be made to develop iterative schemes while taking care of their convergence and rate of convergence, that is, how fast they converge to the exact solution. In the next lecture I will talk about this, and then I will introduce one more technique, called successive over relaxation. In this technique I will use a relaxation parameter, find the optimal value of that parameter in terms of convergence speed, and based on that optimal value I will write the iterative equations. Further, in the next lecture I will also introduce the conditions for convergence of an iterative scheme. Thank you very much for this lecture.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 5
Iterative Methods II
Hello everyone. Today in this lecture we will continue from the previous lecture, where I introduced 2 particular iterative schemes, one called the Jacobi method and the other the Gauss Seidel method. It has been seen that the Gauss Seidel method converges to the solution in fewer iterations compared to the Jacobi method. However, there are many problems where these 2 schemes do not converge at all; in this lecture we will learn a more generalised scheme called successive over relaxation, or the SOR scheme.
Finally, we will discuss the conditions under which these schemes converge. So consider an iterative scheme for an n × n linear system Ax = b of the form (D + ωL)xk+1 = -[(1 - ω)L + U]xk + b. If I put ω = 0, this particular scheme converts into the Jacobi scheme: it becomes Dxk+1 = -(L + U)xk + b. If I take ω = 1, the scheme becomes the Gauss Seidel method. For ω = 0.5 we have a method that lies somewhere between Jacobi and Gauss Seidel, and if ω > 1 we have a method that goes beyond the Gauss Seidel method. The case ω > 1 takes us into the realm of over relaxation, and for certain problems it turns out to be highly effective in terms of convergence. This scheme, which is Jacobi for ω = 0 and Gauss Seidel for ω = 1, can be rearranged in such a way that the matrices on the left-hand and right-hand sides are lower and upper triangular respectively. If we can do that, we can carry out the over relaxation method so that unknown variables are updated using values from the current iteration itself, as we did in the Gauss Seidel method. This type of iterative method is known as successive over relaxation.
(Refer Slide Time: 3:29)
So let me derive this particular scheme, the SOR method, short for successive over relaxation. As you know, we start with a linear system Ax = b, where A is the n × n coefficient matrix, x is the n × 1 vector of unknown variables and b is the right-hand side column vector of size n. I write A as the sum of 3 matrices L, D and U, where L is a strictly lower triangular matrix, D is a diagonal matrix and U is a strictly upper triangular matrix. I can also write this equation as (L + αD)x = -[(1 - α)D + U]x + b; that is, this particular part remains on the left-hand side and the remaining terms have been taken to the right-hand side.
Now consider a relaxation parameter ω such that ω.α = 1. If I multiply the whole system
by this ω, then on the left ωLx + ωαDx becomes ωLx + Dx, and since ωα = 1 the
right-hand side becomes (1 - ω)Dx - ωUx + ωb; that is, (D + ωL)x = (1 - ω)Dx - ωUx + ωb.
Here ω is a scalar; as I told you, it is a relaxation parameter. Again, if D + ωL is a
non-singular matrix, I can find its inverse and pre-multiply both sides by it. If I do
so, on the left-hand side you will have x, and the equation becomes
x = (D + ωL)^(-1)[(1 - ω)D - ωU]x + (D + ωL)^(-1)ωb. Now you can see it is in the form
of an iterative scheme: I write the value of x in the (k + 1)-th iteration on the left
and the value of x in the k-th iteration on the right, so that
x^(k+1) = (D + ωL)^(-1)[(1 - ω)D - ωU]x^k + (D + ωL)^(-1)ωb. This is called the
successive over relaxation iterative scheme.
Here the iteration matrix P is given by P = (D + ωL)^(-1)[(1 - ω)D - ωU], while the
column vector q is given by q = ω(D + ωL)^(-1)b. So this is the derivation of the scheme
which, as I told you, is called the successive over relaxation scheme.
(Refer Slide Time: 9:36)
(Refer Slide Time: 10:26)
Let us solve this example using the successive over relaxation scheme. It is a 3 × 3
system, and here we are going to perform 3 iterations of the SOR method taking the
initial solution as (0, 0, 0). From this coefficient matrix I can write the matrices L,
D and U in this way; then the iteration matrix P becomes this 3 × 3 matrix and the
vector q is a 3 × 1 column whose first element is 7ω/2, with the 2nd element and the
element in the 3rd row as shown. However, you can see that the iteration matrix P as
well as q has ω in all its terms, so we need to find the optimal value of ω by using the
spectral radius of the Jacobi iteration matrix.
The Jacobi iteration matrix for this system comes out to be this 3 × 3 matrix, and if I
calculate its eigenvalues, they are 0, 1/√2 and -1/√2. It means the spectral radius of
the Jacobi iteration matrix for this example is ρ = 1/√2, and hence the optimal
relaxation parameter for the SOR scheme is 1.171573. If I use this value in these two
matrices, the final iteration matrix is this one and the column vector q is this one,
and the scheme can be written as x^(k+1) = Px^k + q, for k = 0, 1, 2 and so on.
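The value 1.171573 can be reproduced from the Jacobi spectral radius using the classical formula ω_opt = 2/(1 + √(1 - ρ²)), which holds for consistently ordered matrices; with ρ = 1/√2 from the example it gives exactly the number quoted above.

```python
import math

def optimal_omega(rho_jacobi):
    """Optimal SOR relaxation parameter from the Jacobi spectral radius
    (the classical formula for consistently ordered matrices)."""
    return 2.0 / (1.0 + math.sqrt(1.0 - rho_jacobi ** 2))

rho = 1.0 / math.sqrt(2.0)            # spectral radius from the example
print(round(optimal_omega(rho), 6))   # → 1.171573
```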
So if I use (0, 0, 0) as the initial guess, in iteration 1 I get these values of x1, x2
and x3, in iteration 2 these, and in iteration 3 the values come out as shown. For
convergence we can carry the calculations further. So this is all about the successive
over relaxation scheme: it depends on a relaxation parameter ω, and with the optimal
value of ω it converges much faster towards the exact solution than the Jacobi and
Gauss-Seidel schemes. Now we will discuss convergence for iterative methods. What sort
of conditions are responsible for the convergence of an iterative scheme? We will first
derive the convergence condition in an abstract, general setting and then take some
specific examples.
So let us say we have a system Ax = b and we are solving it using an iterative scheme
x^(k+1) = Px^k + q. If s is the exact solution of this system, convergence of the scheme
means that after some iterations we reach s, and if we use s in a subsequent iteration
there is no change in the solution, because it is the exact solution; that is,
s = Ps + q. Call the iterative scheme equation (1) and this fixed-point relation
equation (2). If I subtract equation (2) from equation (1), I can write x^(k+1) - s on
the left-hand side, and on the right-hand side it will be P(x^k - s); the q terms cancel
out. Now let e^i denote the error in the i-th iteration, the difference between the
value of x in that iteration and the exact solution, so e^i = x^i - s. Then I can write
this relation in terms of the error: the error in the (k + 1)-th iteration equals P
times the error in the k-th iteration, e^(k+1) = Pe^k. Taking norms of this equation, I
can write ||e^(k+1)|| = ||Pe^k||.
As we know that ||Av|| ≤ ||A||.||v|| for a compatible matrix and vector norm, I can
write ||Pe^k|| ≤ ||P||.||e^k||, and hence ||e^(k+1)|| ≤ ||P||.||e^k||. This inequality
tells us about the convergence of any iterative scheme, and as you can see it depends
entirely on the iteration matrix P.
This inequality reveals that the scheme converges if ||P|| < 1: in that case we get
||e^(k+1)|| ≤ ||P||^(k+1) ||e^0||, which means the error is reduced in each iteration.
So even if we have a very large error in the initial solution, it will reduce in the 1st
iteration, reduce further in the 2nd iteration, and so on.
So over a large number of iterations this inequality tells us that the error tends to 0
as k tends to infinity, and this is what convergence of an iterative scheme means: if
the error goes to 0, the solution converges to the exact solution as k tends to
infinity. In other words, the condition that the norm of the iteration matrix is less
than 1 guarantees the convergence of an iterative scheme.
(Refer Slide Time: 20:18)
However, I want to put a remark here: if we choose a particular matrix norm, say the
infinity norm, and find that the norm of P in this norm is greater than 1, this does not
necessarily indicate that the iterative scheme will fail to converge, because
||P|| < 1 is a sufficient condition and it does not refer to any particular norm. There
may be some other matrix norm, such as the column sum norm or the Euclidean norm, that
is strictly less than 1, in which case convergence is still guaranteed. So the condition
||P|| < 1 is only a sufficient condition for convergence, not a necessary one.
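This remark can be illustrated with a small numpy check: try several norms of the iteration matrix, and any one of them below 1 is a sufficient certificate. The 2 × 2 matrix here is purely illustrative, not one from the lecture.

```python
import numpy as np

def some_norm_below_one(P):
    """Return (flag, norms): flag is True if any common matrix norm of P
    is below 1, which is a sufficient certificate of convergence."""
    norms = {
        "row-sum (infinity)": np.linalg.norm(P, np.inf),
        "column-sum (1)":     np.linalg.norm(P, 1),
        "Frobenius":          np.linalg.norm(P, "fro"),
    }
    return any(v < 1.0 for v in norms.values()), norms

# Hypothetical iteration matrix: its row-sum norm is 1.1 (> 1), but both
# the column-sum and Frobenius norms are below 1, so the corresponding
# scheme is still guaranteed to converge.
P = np.array([[0.5, 0.6],
              [0.1, 0.2]])
ok, norms = some_norm_below_one(P)
print(ok, norms)
```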
The necessary and sufficient condition is: the iterative scheme x^(k+1) = Px^k + q is
convergent for any initial solution if and only if every eigenvalue of P satisfies
|λ| < 1. Every eigenvalue having absolute value less than 1 means the spectral radius of
the iteration matrix is less than 1. So far we have established conditions on the
iteration matrix, but as you know the original problem is given in the form Ax = b, and
deriving the iteration matrix from it requires some computation. Can we have a condition
on A itself which tells us that the scheme will converge?
Yes, we have such a condition on A: if A has the diagonally dominant property, then the
Jacobi as well as the Gauss-Seidel method are both certain to converge. What do we mean
by diagonally dominant? A matrix is (strictly) diagonally dominant if, in each row, the
absolute value of the entry on the diagonal is greater than the sum of the absolute
values of the other entries.
Let us take the example of the 3 × 3 coefficient matrix A = [5, -1, 2; 2, -8, 1;
-2, 0, 4]. What can you say about the solution of the system Ax = b using the Jacobi or
Gauss-Seidel method: will they converge to the true solution for any right-hand side
vector b and any initial solution? Yes, they will, because the matrix A is strictly
diagonally dominant: in the 1st row, 5 is greater than 1 + 2; in the 2nd row, the
absolute value of -8 is 8, and 8 is greater than 2 + 1; and in the 3rd row, the diagonal
element 4 is greater than 2 + 0, where 2 is the absolute value of -2. Since the matrix
is diagonally dominant, the Jacobi as well as the Gauss-Seidel method will converge for
any initial solution when solving this problem.
Moreover, if you compute the Jacobi iteration matrix for this problem, it comes out like
this, and its row sum norm is 0.6 while its column sum norm is 0.75.
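The Jacobi iteration matrix P = -D^(-1)(L + U) and these norms can be verified with a few lines of numpy:

```python
import numpy as np

A = np.array([[ 5.0, -1.0, 2.0],
              [ 2.0, -8.0, 1.0],
              [-2.0,  0.0, 4.0]])
D = np.diag(np.diag(A))
off = A - D                        # L + U (the off-diagonal part)
P = -np.linalg.inv(D) @ off        # Jacobi iteration matrix

row_sum = np.linalg.norm(P, np.inf)   # maximum absolute row sum
col_sum = np.linalg.norm(P, 1)        # maximum absolute column sum
print(round(row_sum, 2), round(col_sum, 2))   # → 0.6 0.75
```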
If you consider this next example with its coefficient matrix, and someone asks what you
can say about the convergence of the Jacobi iterative method, then in this form the
matrix is not diagonally dominant, since you can see the diagonal entry 2 is less than
4. However, if I interchange the 1st and 3rd equations of the system, it converts into
the previous system, the coefficient matrix becomes diagonally dominant, and hence the
Jacobi scheme will converge for this system.
(Refer Slide Time: 26:03)
Now consider one more example with this coefficient matrix. Here the matrix is not
strictly diagonally dominant, because in the 1st row 4 = 2 + 2, so from the diagonal
dominance property alone we cannot say anything about the convergence of the Jacobi
iterative method for this problem. What do we need to do? We need to find the iteration
matrix. The iteration matrix in the case of the Jacobi method is like this, and here you
can see that its row sum norm is 1 and its column sum norm is also 1. Hence the
conditions that the row sum norm or the column sum norm be less than 1 fail here, just
as the diagonal dominance property told us nothing. But if you calculate the Euclidean
norm, or Frobenius norm, it comes out to be 0.901, and as I told you, if one norm of the
iteration matrix is greater than 1 it does not mean the method will not converge; you
need to find some norm that is less than 1, and here we are able to find the Frobenius
norm with value 0.901, hence convergence is guaranteed for this problem. So in this
lecture we talked about the successive over relaxation method, and in the later part of
the lecture we discussed a few conditions for the convergence of iterative schemes.
There we found that if any norm of the iteration matrix is less than 1, then the scheme
will converge. Moreover, we have seen that if the coefficient matrix of the system
Ax = b is diagonally dominant, then the Jacobi as well as the Gauss-Seidel method will
converge for any initial solution and right-hand side vector b. So this is all about
iterative schemes, and it is the end of this unit on linear systems of equations. Here
we learned two categories of methods: one is direct methods, the other is iterative
methods. In the next class we will discuss methods for solving non-linear equations;
first we will start with a single non-linear equation, and then we will also learn how
to solve systems of non-linear equations. Thank you.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 6
Introduction to Nonlinear Equations and Bisection Method
Hello everyone, so today we are going to start a new unit in which we will learn a few
numerical methods to solve nonlinear equations. First of all, what is a nonlinear
equation? One of the most frequent problems in real life and in engineering is to find
the roots of an equation f(x) = 0, where the equation is nonlinear due to the
non-linearity of the function f. A function is said to be nonlinear if a variable has an
exponent either greater than 1 or less than 1, but never exactly 1, or if, in an
equation with more than one variable, there are cross-product terms of the variables.
For example, x² = 0 is a nonlinear equation because x has the exponent 2. Similarly, if
I have some other function, say x + √x, it is again a nonlinear equation because the
exponent of x in the term √x is 1/2. So if any term of an equation has an exponent less
than 1 or greater than 1, but not exactly 1, then the equation is called a nonlinear
equation.
If you have an equation in more than one variable, say f is a function of x and y where
x and y are independent variables, consider f(x, y) = xy = 0. Here x and y separately
have exponent 1, but the product term xy makes the function nonlinear. In the next few
lectures we will learn how to solve such nonlinear equations. Now, what do we mean by
the solution of a nonlinear equation?
So, given a nonlinear function f, we seek a value of x for which f is 0. Such a value of
x is called a root of the equation and a zero of the function f. As an example of a
nonlinear equation in one variable, take f(x) = x² - 4 sin x = 0. Looking at the graph
of this function, it can be clearly seen that the function is 0 when x is 0 and at
another point when x is near 1.9; hence this equation has two roots, 0 and a value just
near 1.9. So graphically, solving a nonlinear equation means drawing the curve of the
nonlinear function and seeing where the curve touches or intersects the x axis.
Just as we had systems of linear equations in the previous unit, in this unit we can
also have systems of non-linear equations: if we have more than one nonlinear equation
in more than one variable, it is called a system of non-linear equations. For example,
take these two equations: the 1st is x² + y² = 1 and the 2nd is x³ + x² + xy - 1 = 0.
Here we have two equations in two unknowns, and both equations are non-linear in x as
well as in y.
In the last lecture of this unit we will learn how to solve such a system using a
numerical technique. As you see, all the examples I have given you so far are algebraic
functions, meaning they are polynomials in x or in y. Apart from these, there is another
type of nonlinear function, called transcendental functions: a function which is not
algebraic is a transcendental function. They are also referred to as non-algebraic
functions, for example functions having a trigonometric term, an exponential term, a
logarithmic term, or their combinations. For example, in the equation
f(x) = x² + tan x there is a trigonometric term in x, and hence it is a transcendental
function. We will learn how to find the zeros of such transcendental functions using
numerical techniques.
(Refer Slide Time: 6:31)
Now, a given nonlinear equation can have multiple roots. Take a nonlinear equation
f(x) = 0: if it is a linear equation it will have one root, but if it is a quadratic
equation, that is, a second degree polynomial in x, it will have 2 roots. If it is a
cubic polynomial in x, the equation will have 3 roots, and so on. The roots may further
be distinct or repeated. For example, take the simple function x² - 4 = 0: this
quadratic polynomial has 2 roots, x = +2 and x = -2, and both roots are distinct. But if
you take another function, say x² - 4x + 4, which is (x - 2)², then the roots are 2 and
2, so there are two roots but both are the same; the root is repeated twice. In the case
of a transcendental equation, consider an example like 2 sin x - 1 = 0: its solution is
x = sin⁻¹(1/2), which comes out to be π/6 ± 2nπ, where n = 0, 1, 2, and so on. So you
can see that this equation has many roots corresponding to different values of n.
(Refer Slide Time: 8:27)
Using numerical techniques we can find a single root in a given interval, or more than
one root. For this we have two types of techniques: one is analytical, or direct,
methods, as in the case of linear equations, and the other is numerical methods. A
direct method solves the problem in a finite number of steps and gives an exact
solution. However, there are equations which are very time consuming to solve with a
direct method, and there are many equations which you cannot solve with a direct method
at all; for these we need to rely on numerical methods. Numerical methods provide a
technique to find an approximate but accurate solution.
(Refer Slide Time: 9:38)
In the next few lectures we are going to discuss the following numerical techniques: the
bisection method, which we will cover in this lecture itself; then in the next lecture
the Secant method and the Regula Falsi method; in the 3rd lecture of this unit the
Newton-Raphson method; in the 4th lecture the Fixed point method; and in the last
lecture of this unit we will learn how to solve systems of non-linear equations using
these techniques. So what do we mean by the numerical solution of a nonlinear equation?
Given a nonlinear equation f(x) = 0, a numerical solution means an approximate point x*
such that f(x*) is approximately 0. For this we always assume that f(x) is a
continuously differentiable, real valued function, and also that the roots of the
equation f(x) = 0 are isolated. What do we mean by an isolated root? A root of f(x) = 0
which has a neighborhood containing no other root of the equation is said to be an
isolated root. For example, take f(x) = x² - 4 = 0 again: here we have 2 roots, x = -2
and x = 2, and both roots are isolated, because in a neighborhood of 2, as well as in a
neighborhood of -2, there is no other root.
To find such isolated roots, the main idea of a numerical technique consists of the
following steps. First of all we need an initial guess: take a point x0 belonging to a
closed interval [a, b] as an approximation to the root of f(x) = 0. Then we improve this
initial solution by using an iterative equation x_{n+1} = T(x_n), n = 0, 1, 2 and so on,
where T is a real valued function called an iteration function. In the process of
iterating we obtain a sequence of numbers x_n which is expected to converge to the root
of f(x) = 0: we start with x0, then using the iterative equation we find x1, then using
x1 we find x2, and so on.
(Refer Slide Time: 12:34)
In the previous slide I told you that we obtain a sequence of numbers x_n which is
expected to converge to the root of f(x) = 0. Now, what do we mean by convergence of
such an iterative scheme? A sequence of iterates x_n is said to converge with order p
(where p is always greater than or equal to 1) to a point x* if
|x_{n+1} - x*| ≤ c|x_n - x*|^p for some constant c > 0. If p = 1, the sequence is said
to converge linearly; if p = 2, we say the sequence converges quadratically; and so on.
The 1st numerical technique we are going to discuss is the bisection method. The method
is based on the Intermediate Value Theorem, which tells us that if f is continuous on a
closed interval [a, b] and K is any number between f(a) and f(b), then there exists c in
the open interval (a, b) such that f(c) = K. In the graphic here we have x on the
horizontal axis and f(x) on the vertical axis, and the blue curve shows the graph of
f(x) between a and b. If I take a number K between f(a) and f(b), there will be some
number c whose image under f is K.
It means that if we have a function f(x) which is continuous on a given interval [a, b]
and satisfies f(a).f(b) < 0, with f(a) ≠ 0 and f(b) ≠ 0 (so a and b are not themselves
roots of f), then by the Intermediate Value Theorem there exists a root c between a and
b such that f(c) = 0. This procedure also works if there is more than one root in the
interval [a, b]; here we assume the root in [a, b] is unique just for simplicity. The
method calls for repeated bisection of subintervals of [a, b], locating at each step the
half interval containing the root.
So how does this method work geometrically? Let me explain here. Suppose we have a
function like this, with point a here and point b here, so the marked points are
(a, f(a)) and (b, f(b)). You can see that f(a) is a negative value while f(b) is a
positive value, so f(a).f(b) < 0, which means there is a root between a and b. Now what
will I do? I will calculate c1, the midpoint of a and b; the midpoint of the interval
[a, b] will be somewhere here, so let us say this is my c1. Now I check f(c1): if it is
0, I will say that c1 is a root of the equation; if it is not 0, what will I check? I
will check f(a).f(c1). If this product is negative, the root lies between a and c1; if
it is positive, the root lies in the right half interval, that is, from c1 to b. In this
example the root is here, so I rename c1 as a while b remains b. So where the length of
the interval was b - a in the 1st iteration, in the next iteration it is reduced to
(b - a)/2. Again I find the midpoint of these two, call it c2, which will be somewhere
here, and again I check f(a).f(c2), which is coming out positive.
It means the root is again in the right half interval, so I rename c2 as a while keeping
b, find the midpoint of this new interval, which will be somewhere here, and continuing
this process the method converges to the root, let us call it x*. So this is the
geometric explanation of the bisection method: in each iteration we halve our search
interval, and we find the root by repeating this halving again and again.
The algorithm for the bisection method can be described in the following steps. Step 1:
given an initial interval [a0, b0], set n = 0. Step 2: define c_{n+1} = (a_n + b_n)/2,
the midpoint of the current interval. Step 3: if f(c_{n+1}) = 0, then the root is
c_{n+1}; if the product f(a_n).f(c_{n+1}) is negative, then the root lies in the
interval [a_{n+1}, b_{n+1}] where a_{n+1} = a_n and b_{n+1} = c_{n+1}; similarly, if
this product is positive, then take a_{n+1} = c_{n+1} and b_{n+1} = b_n, meaning the
root lies in the second half of the interval.
Step 4: if the root is not found in step 3, then find the length of the new reduced
interval [a_{n+1}, b_{n+1}]. If the length b_{n+1} - a_{n+1} is less than a threshold ε,
then take the midpoint of this interval as the root; otherwise go to step 2 for the next
iteration.
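The steps above can be sketched in Python. This is a minimal implementation; the test equation f(x) = x² - x - 3 on [1, 3] is the worked example used later in this lecture.

```python
def bisection(f, a, b, eps=1e-3, max_iter=100):
    """Bisection: repeatedly halve the bracketing interval [a, b] until
    its length is below eps, keeping the half where the sign changes."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2.0
        if f(c) == 0 or (b - a) < eps:
            return c
        if f(a) * f(c) < 0:
            b = c          # root lies in the left half [a, c]
        else:
            a = c          # root lies in the right half [c, b]
    return (a + b) / 2.0

# Worked example from later in the lecture: x^2 - x - 3 = 0 on [1, 3]
root = bisection(lambda x: x * x - x - 3, 1.0, 3.0, eps=1e-3)
print(round(root, 3))      # root is near 2.303
```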
This is the same procedure I described graphically on the board. Now let us talk about
the convergence of this method. Let (a0, b0) = (a, b) be the initial interval with
f(a).f(b) < 0. Define the approximate root x_n as the midpoint (a_{n-1} + b_{n-1})/2;
then there exists a root x* in [a, b] such that |x_n - x*| ≤ (b - a)/2^n, where [a, b]
is the original interval.
Moreover, suppose the required accuracy is of order ε, where ε is a small positive
number. It is sufficient to take (b - a)/2^n ≤ ε, because we need the error bound to be
at most ε to get the desired accuracy in our numerical solution. Taking logarithms, I
can write n ≥ [log(b - a) - log ε]/log 2. This formula tells us the number of iterations
required to achieve a given accuracy ε.
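The formula above can be evaluated directly. A small helper (the function name is my own):

```python
import math

def bisection_iterations(interval_length, eps):
    """Minimum number of bisection steps so that (b - a) / 2^n <= eps."""
    return math.ceil((math.log(interval_length) - math.log(eps)) / math.log(2))

print(bisection_iterations(1.0, 0.125))   # → 3
```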
For example, consider the function x³ + cos x + 1 = 0, and let the length of the initial
interval be 1, that is, b - a = 1. If the permissible absolute error is 0.125, that is,
|x_n - x*| ≤ 0.125, then the minimum number of iterations required, by the earlier
formula, is n ≥ [log 1 - log 0.125]/log 2 = 3. So you need at least 3 iterations to get
accuracy of the order 0.125.
Now let us take an example and see how to solve it using the bisection method, following
the steps above. The nonlinear equation is f(x) = 0 with f(x) = x² - x - 3, and I need
to find a root of it. If I check f(1), it comes out to be 1 - 1 - 3 = -3; f(2) is also
negative; and f(3) is positive, since 9 - 3 - 3 = 3. The product f(1).f(3) = -9 is a
negative number, which means a root of this equation lies between 1 and 3. So in the 1st
iteration I set a0 = 1, b0 = 3, with f(a0) = -3 and f(b0) = 3. Now I find c1, the
midpoint of a0 and b0, which is (1 + 3)/2 = 2, and check f(2) = 4 - 2 - 3 = -1. The
product f(1).f(2) is positive, hence the root lies between 2 and 3, not between 1 and 2.
So a1 becomes 2, b1 becomes 3, f(a1) = -1 and f(b1) = 3.
(Refer Slide Time: 26:58)
If I need accuracy of the order 10^-3, the iterations go like this: in the 11th
iteration I get a_n = 2.3027, b_n = 2.3037, x_n = 2.3032, and the value of f(x) at this
point is 0.0016. If I check |x_n - x_{n-1}|, the difference between two consecutive
iterates, it comes out to be 0.0005, and hence in the 11th iteration the absolute error
is less than 0.001, the permissible error, so the root is accurate to the 3rd decimal
place. Thus the root of the equation is 2.303. The bisection method has some advantages
as well as some disadvantages.
(Refer Slide Time: 27:53)
Advantages: this method is very easy to understand, and it always converges to a
solution; that is why it is often used as a starter for other, more efficient numerical
techniques. The disadvantage is that the method is relatively slow to converge. Indeed,
if e_n = |x_n - x*| is the error in the n-th iteration, then e_{n+1} ≤ (1/2)e_n, so
comparing with the convergence definition with c = 1/2 and p = 1, this method has
exactly linear convergence and hence is quite slow. Moreover, even a guess close to the
root may still require many iterations to converge. So in this lecture we have learned
about the bisection method. In the next lecture we will learn two other methods in the
same category, which update the iterates in a similar manner but have better convergence
compared to the bisection method. Thank you.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 7
Regula Falsi and Secant Methods
Welcome to the next lecture of the second unit of this course. In this lecture I will
talk about two methods for solving non-linear equations: one is called the Regula Falsi
method, also called the method of false position, and the other, which is similar to it
and differs only a little, is called the Secant method. In the last class we learned
about the bisection method for solving non-linear equations; in the bisection method we
take the next iterate as the midpoint of the two endpoints of the current interval.
Consider a particular scenario: we have a function like this, with point a here and
point b here, so (a, f(a)) and (b, f(b)). Here we can see that the root is close to b
compared to a. If I use the bisection method in this example, or in any other example
where the root is close to one of the endpoints, the bisection method does not work in
an efficient manner, efficient meaning in terms of convergence. The bisection method
tends to take a lot of iterations to converge to the solution, the reason being that we
take the midpoint in each iteration. So if the root is close to one of the endpoints, we
have to take quite a large number of iterations to reach it, because first we take this
interval, then this one, then
this one again, and so on. Can we have other methods where, instead of the midpoint, we
use some other idea for the next iterate, so that when the root is close to one of the
endpoints the next iterate comes much closer to the root and we get better convergence?
The Regula Falsi as well as the Secant method use this idea: instead of the midpoint,
they use a weighted average to find the next iterate.
Let me first introduce the Regula Falsi method, and then I will go to the Secant method.
In the Regula Falsi method, instead of taking the midpoint of the interval, we take the
weighted average given by w = [f(b).a - f(a).b]/[f(b) - f(a)].
Consider the equation f(x) = x³ - x - 1 = 0. If we check f(1) it comes out to be -1,
which is negative, and f(2) comes out to be positive. Hence a root lies between 1 and 2,
and we can take the initial interval for the Regula Falsi method, as in the bisection
method, to be [1, 2]. But observe one more thing: f(1) = -1 while f(2) = 5.
So f(1) is quite close to 0 compared to f(2), meaning the root is likely closer to 1
than to x = 2. If we used the bisection method we would calculate c1 as the midpoint of
1 and 2, that is 1.5, but in the Regula Falsi method we find the weighted average
w = [f(b).a - f(a).b]/[f(b) - f(a)]. Taking a = 1 and b = 2, we have f(a) = -1 and
f(b) = 5, so w comes out to be 1.16666. Now f(w) = -0.578703, a negative number, while
f(2) is positive, so the root lies between 1.16666 and 2. Repeating the process once
again, assigning a = w and keeping b = 2, I get the next iterate w = 1.2531. We can
continue in the same manner to find ever shorter intervals in which the required root
lies. Notice that, as in the bisection method, we reduce the length of the interval at
each iteration while ensuring the root always lies in the chosen shorter interval.
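The two iterates above can be reproduced numerically, as a quick check of the weighted-average formula:

```python
def weighted_point(f, a, b):
    """Regula Falsi iterate: w = (f(b)*a - f(a)*b) / (f(b) - f(a))."""
    return (f(b) * a - f(a) * b) / (f(b) - f(a))

f = lambda x: x ** 3 - x - 1

w1 = weighted_point(f, 1.0, 2.0)
print(round(w1, 5))               # → 1.16667
w2 = weighted_point(f, w1, 2.0)   # f(w1) < 0, so w1 replaces a
print(round(w2, 4))               # → 1.2531
```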
So if we talk about the algorithmic way of this method: given an initial interval [a0, b0],
set n = 0. In step 2, calculate wn+1; so if n is 0 we will have w1 = (f(b0)·a0 -
f(a0)·b0)/(f(b0) - f(a0)), and in the nth iteration it becomes wn+1 = (f(bn)·an - f(an)·bn)/(f(bn) - f(an)).
Once we calculate this w, in step 3 we check the sign of the product f(an)·f(wn+1).
If this product is 0, then the root is wn+1; if it is negative, we assign an to an+1 and
wn+1 to bn+1, and the root will lie in the interval [an+1, bn+1]; otherwise we assign wn+1 to
an+1 and bn to bn+1. Basically, if the product is negative it means the root lies between an and
wn+1, so we are updating the interval in this way. In step 4, if the root is not obtained in
step 3 by that condition, we check whether the absolute value |f(wn+1)| is less than a given
threshold, that is, the permissible error in our method. If it is not, we go back to step 2;
if this particular inequality is satisfied, then wn+1 is taken as the root.
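The steps above can be sketched in Python; this is a minimal sketch, where the function name `regula_falsi` and the default tolerance are illustrative choices, not from the lecture.

```python
def regula_falsi(f, a, b, tol=1e-6, max_iter=100):
    """Regula Falsi: keep a bracket [a, b] with f(a)*f(b) < 0."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    for _ in range(max_iter):
        # Weighted point: x-intercept of the chord joining (a, f(a)) and (b, f(b))
        w = (fb * a - fa * b) / (fb - fa)
        fw = f(w)
        if fw == 0 or abs(fw) < tol:   # steps 3 and 4 of the algorithm
            return w
        if fa * fw < 0:                # root lies in [a, w]
            b, fb = w, fw
        else:                          # root lies in [w, b]
            a, fa = w, fw
    return w

# The lecture's example: f(x) = x^3 - x - 1 on [1, 2]
root = regula_falsi(lambda x: x**3 - x - 1, 1, 2)
```

Running it reproduces the first iterates 1.16666 and 1.2531 from the lecture before settling near the root.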
Consider the equation x² - 3x - 3 = 0; we need to find the root of the equation using the
Regula Falsi method.
So what we will do, or how this method will work, let me explain here. The equation is f(x) = x² -
3x - 3 = 0, and I need to find out a root of this equation. If I check f(1), it comes out to be
-5, which is negative; if I check f(3), so 9 - 9 - 3, it comes out to be -3, still negative, so
let me take f(4): at 4 it is 16 - 12 - 3, so 1, which is positive. It means x* belongs to [1, 4].
Now take a as 1, b as 4, so f(a) will become -5, f(b) will be 1, and calculate w1 = (1·1 - (-5)·4)/(1 + 5) = 3.5.
So if I repeat this process using the Regula Falsi method, the next iteration will give me w as
3.777778; at this particular point f(w) is -0.06173, b remains 4 and f(b) is 1, so the
root will lie between 3.777778 and 4. Repeating once more, I get the next w as
3.790698, where f(wn+1) is -0.002704. In the next iteration, if I use this as my left
point and b = 4 as the right point, the next w comes out to be 3.791262, and the value of f at
this point is again a small negative number.
Hence, since it is a negative number, again I will say that 3.791262 is the left point of the
interval in which the root lies and the right end is 4. If I continue with the next iteration, I
get 3.791287, and comparing with the value of w in the previous iteration I can see that 3.7912
is correct up to 4 decimal places; hence the numerical solution, the root, is 3.7912.
So, like the bisection method, the Regula Falsi method has some advantages as well
as some disadvantages. The advantages are: it is a simple method, expected to converge to the
exact root; it does not need any prior information, nor does it involve any calculation of
derivatives of the function, which is a very beautiful advantage. We will describe some
methods like Newton Raphson and fixed point iteration in the next lectures, where you will
see that we need to find out derivatives of a function. The disadvantage is that this method is
slow; hence it is recommended to begin with a small interval in order to obtain an
approximate root of the desired accuracy with a lesser number of iterations.
(Refer Slide Time: 15:48)
The next method we are going to discuss is the Secant method, which is a sort of Regula Falsi
method only. If we start from the beginning, the intermediate value theorem tells us that for each
root we can find a closed interval [p0, p1] where the root p belongs to the open interval (p0, p1)
and f(p0)·f(p1) is negative. Make sure these intervals do not overlap; but now, instead of taking
the midpoint as our next approximation, we find the secant line joining (p0, f(p0)) to (p1, f(p1))
and take the point where this line intersects the x axis as our next approximation p2. This point
is likely closer to the root p than the midpoint of the interval.
Let me explain this method by taking a graphical example, and then I will come to the
algorithm of this particular method. So we have this particular point a, this is the
point b, and this is the function f(x); as earlier, this point will be (a, f(a)) and this will be
(b, f(b)). Now, in the bisection method what were we doing? We were taking the midpoint of
a and b, which will be somewhere here, as our next iterate; however, in the Secant method what
will we do? We will take the point where the line joining (a, f(a)) and (b, f(b)) intersects the
x axis as our next iterate. So this will be our next point, and then we will continue with this
one. Basically, if I denote p0 as a and p1 as b, how do we find the next iterate? The line joining
the two points (p0, f(p0)) and (p1, f(p1)), treating them as (x1, y1) and (x2, y2), is given by
y - f(p0) = [(f(p1) - f(p0))/(p1 - p0)]·(x - p0), since y2 - y1 is f(p1) - f(p0), x2 - x1 is
p1 - p0, and x - x1 is x - p0. So this is the equation of the line joining these two points.
Now what we will do, we will find the particular point, let us say p2, where this line intersects
the x axis. Since y = 0 at this point, the equation becomes
0 - f(p0) = [(f(p1) - f(p0))/(p1 - p0)]·(p2 - p0). Please note that I have replaced x by p2.
Now, if I want to modify this equation further, I can write
p2 - p0 = -f(p0)·(p1 - p0)/(f(p1) - f(p0)).
(Refer Slide Time: 21:42)
Or, simplifying further, I can write p2 = p0 - f(p0)·(p1 - p0)/(f(p1) - f(p0)). It means
our next iterate p2 can be calculated using this formula, and by continuing this we can get the
iterates of the Secant method. We got this formula, and now the approximations pn+1, for n ≥ 1,
to the root of f(x) = 0 are computed from the approximations pn and pn-1 using this particular
equation, as I have derived on the board. If I explain this method in an algorithmic manner,
the input will be the initial approximations p0, p1, a tolerance TOL and the maximum number of
iterations N0; the output is an approximate solution p or a message of failure. In step 1, set
i = 2, q0 = f(p0), q1 = f(p1). In step 2, while i ≤ N0, do steps 3 to 6. In step 3, p is
calculated as p1 - q1(p1 - p0)/(q1 - q0).
In step 4, if |p - p1| is less than the tolerance, then the output is p and the method stops;
otherwise, in steps 5 and 6, set i = i + 1, p0 = p1, q0 = q1, p1 = p, q1 = f(p), and repeat
steps 3 to 6. If you have reached the maximum number of iterations, that is N0, then step 7
gives a message of failure: the method failed after N0 iterations, N0 = N0, the procedure
was unsuccessful, and stop.
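This algorithm translates almost line for line into code; the following is a sketch, with the function name `secant` and the error message being illustrative.

```python
import math

def secant(f, p0, p1, tol=1e-6, max_iter=100):
    """Secant method following the algorithm above."""
    q0, q1 = f(p0), f(p1)
    for _ in range(2, max_iter + 1):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)   # step 3
        if abs(p - p1) < tol:                  # step 4: stopping test
            return p
        p0, q0 = p1, q1                        # steps 5-6: shift the two iterates
        p1, q1 = p, f(p)
    raise RuntimeError(f"method failed after {max_iter} iterations")

# The lecture's example: e^x + 2^(-x) + 2*cos(x) - 6 = 0 on [1, 2]
f = lambda x: math.exp(x) + 2**(-x) + 2*math.cos(x) - 6
root = secant(f, 1.0, 2.0, tol=1e-6)
```

Note that no sign check is performed; the two most recent iterates are always kept, which is exactly how this differs from Regula Falsi.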
If we consider an example of this method: take f(x) = e^x + 2^(-x) + 2 cos x - 6 = 0
on the interval [1, 2], to within 10⁻⁶. Then, as I told you in the beginning, my p0 will become 1,
p1 will become 2, and from here I will get p2 as 1.67830848477. By continuing this, using p1
and p2 I will get my p3, using p2 and p3 I will get my p4, and continuing in this way, in 8
iterations I will get the desired accuracy of 10⁻⁶; the root correct up to 6 decimal
places will be 1.829383.
Order of convergence: for a continuous function, the Secant method converges more rapidly near a
root. Its order of convergence is the golden ratio, that is 1.618, so that as k → ∞,
|εk+1| ≈ constant·|εk|^1.618. Hence you can notice that in the bisection method the order was 1,
but here it is 1.618, so this method has better convergence when compared to the bisection
method. We said earlier that the bisection method has linear convergence, so here I
will say that this method has superlinear convergence.
As you can see, the way we find the next iterates in the Secant method is similar to the
Regula Falsi method; these two methods differ only in the assignment of the end points. In the
Regula Falsi method, in each iteration we choose the interval in such a way that the root will
lie in that particular interval; however, in the Secant method we update p using the previous two
iterates: if I want to find out p3, I will use p1 and p2, without checking the sign of f to decide
which of p1 and p2 to keep. So these two methods are different only up to an assignment. Thank you
very much.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 8
Newton-Raphson Method
Hello everyone, welcome to the 3rd lecture of module 2. In today's lecture we will
discuss one of the most popular techniques for solving non-linear equations. This technique is
called the Newton Raphson method. Basically, why am I saying this technique is quite popular?
Because this technique is easy to apply; moreover, the convergence rate of this technique is
faster than the earlier discussed techniques like the bisection method, Regula Falsi method or
Secant method. In particular, this method has quadratic order of convergence, which is
higher than the bisection method, which is linear in terms of convergence, and the Secant method,
which has superlinear convergence, that is, an order of convergence of around 1.618, the
golden ratio.
So in this method we use the following formula for solving the non-linear equation: suppose f(x) = 0 is
a nonlinear equation and x0 is the initial solution which is given to us. The next solution can
be obtained by the formula xn+1 = xn - f(xn)/f'(xn). Here f'(xn) is the derivative of
f evaluated at xn, and x0 is an initial guess; we will find the subsequent iterates of the
sequence xn using this formula.
(Refer Slide Time: 2:26)
Geometrically, this formula can be described as follows: we have a function like this, with
the x axis here, the y axis here, and this is the curve f(x). As I told you, we have an initial
solution, let us say this is my initial solution x0, and the root of the equation f(x) = 0 is here,
let us say at the point x*. So I will start with this x0, and using the formula of the Newton Raphson
method I need to converge here. What will I do? If I draw a perpendicular here, this point is
(x0, f(x0)) on the curve y = f(x).
Now what I do, I will draw the tangent at this particular point; the tangent line intersects the
x axis, let us say at x1, and this will become the next iterate of the sequence xn. If I consider
this angle as θ, the slope of the tangent to the curve f(x) at the point x0 can be written as
f'(x0) = tan θ; in this right angled triangle, tan θ equals f(x0), this vertical distance,
divided by this horizontal distance, which is x0 - x1.

Now I can write x0 - x1 = f(x0)/f'(x0), or x1 = x0 - f(x0)/f'(x0). So in this way I get the 1st
iterate of the sequence xn using the initial solution x0, and if I generalise, this can be written
as xn+1 = xn - f(xn)/f'(xn), where n = 0, 1, 2, et cetera. So as you can see, this is the Newton
Raphson formula for solving the non-linear equation. Here we have one drawback of
this method: in the denominator we have f'(xn). If f'(xn) becomes 0 for
some n, then the method fails and we cannot get the next iterates.
Hence this is a drawback of the method, and geometrically we can see it like this. Suppose I
have a curve like this, say this is the curve f(x), this is again the x axis, this is the y axis.
If I choose my initial guess at a point where the tangent is parallel to the x axis, then this
tangent will never intersect the x axis, and in this way I cannot get the next iterate. Or, if
f'(x0) is not exactly 0 but very close to 0, then what will happen? This quantity in the
denominator will be very small; if it is very small, the correction term f(x0)/f'(x0) will become
very large, the next iterate will be thrown far away, and the iteration may diverge. So this is
one of the drawbacks of the Newton Raphson method.
Now, we can derive the Newton Raphson formula using the Taylor series expansion also. For
this we have to assume that f is twice differentiable, so let f belong to C²[a, b] on the closed
interval [a, b]. Second, let x0 be an approximation of the root of f(x) = 0; moreover, this
approximation lies in the closed interval [a, b]. Then, if x0 is very close to the root x, I can
write x - x0 = δ, a small number which is very small when compared to 1.

So x0 is very close to x, and if I write the Taylor series expansion of f(x) around x0, then
f(x) = f(x0) + (x - x0)·f'(x0) + ((x - x0)²/2)·f''(ξ), where ξ is a number between x
and x0. Now, the left-hand side is f(x), and our given equation is f(x) = 0, so
0 = f(x0) + (x - x0)·f'(x0) + ((x - x0)²/2)·f''(ξ). As I told you, (x - x0) is a very small
number, so the square of this number will be smaller still, and hence I can neglect the 2nd order
term here.

So, with this approximation, I can write x ≈ x0 - f(x0)/f'(x0), and this will
be the next iterate of the Newton Raphson method. So here you can see that we can
generalise this formula again as xn+1 = xn - f(xn)/f'(xn). This is again the formula of the
Newton Raphson method for solving non-linear equations. So we start with an initial approximation
x0, then we find the approximation x1, which is the x intercept of the tangent line to the graph
of f at the point (x0, f(x0)).
Then similarly we find x2, which is the x intercept of the tangent line to the graph of f at
(x1, f(x1)), and the method continues in this way, with the sequence xn converging towards the
approximate or numerical solution x*.
Let us take a very simple example. We have the non-linear equation f(x) = x² - 4. We can
see manually that the roots of this non-linear equation, a 2nd order polynomial, are x = +2 and
-2, so let us find a root of this with an initial solution x0 = 6. Graphically this will be a
parabola: it crosses the x axis at 2 and at -2, and f(2) and f(-2) are obviously 0, since these
are the roots. For this particular curve, let us say I start with x = 6, so this point on the
x axis is (6, 0).
So if I draw a perpendicular line, it intersects the curve at the point (6, 32); here x0 is 6 and
f(x0) is 32. If I compute f'(x), it comes out to be 2x, and hence f'(x0) is 12. So using the
Newton Raphson formula I get x1 = 6 - 32/12, which is nearly 10/3, a bit bigger than 3. So the
tangent intersects the x axis here at 10/3, below the curve point (10/3, f(10/3)). Again I will
draw a perpendicular line, again I will find the tangent at the point (x1, f(x1)); this tangent
line intersects the x axis at x2, and in this way, drawing a perpendicular and then a tangent
repeatedly, I will converge towards the exact solution in this particular example using the
Newton Raphson technique.
Numerically, if I solve this problem, the iterates will be like this: x0 is 6, x1 will be 3.33,
x2 will be 2.27 and then x3 will be about 2.02, which shows that the Newton Raphson method
converges to the root rapidly, as within just 3 iterations we have a value very close to
the exact root of f(x) = x² - 4.
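These iterates are easy to reproduce; here is a minimal sketch, where the helper name `newton` is an illustrative choice, not the lecture's notation.

```python
def newton(f, fprime, x0, n_steps):
    """Generate n_steps Newton-Raphson iterates x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

# The lecture's example: f(x) = x^2 - 4 starting from x0 = 6
xs = newton(lambda x: x**2 - 4, lambda x: 2 * x, 6.0, 3)
# xs is approximately [6, 3.3333, 2.2667, 2.0157], approaching the root x = 2
```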
If we take a bit more difficult example, let us say my function is a transcendental function
which involves a tan hyperbolic term: the equation is f(x) = x·tanh(x/2) - 1 = 0, and we will
approximate the zero of f using the Newton Raphson method. We can see that this function at the
point x = 1.5 gives a negative value and at x = 2 it gives a positive value. So since f changes
sign between 1.5 and 2, there will be a zero, or root, of f(x) = 0 in this interval. Calculate
f'(x): that will be tanh(x/2) + x/(2 cosh²(x/2)).
So if we start with the initial solution 1.75, the 1st iteration gives the value of x1: x1 will
become x0 - f(x0)/f'(x0), which is 1.75 minus this number, and it comes out to be 1.547587.
In the 2nd iteration x2 becomes 1.543407; in the same way we get x3 as 1.543405 and
finally x4 as 1.543405. So here you can see these two consecutive iterates are the same, and
hence we have accuracy in the solution up to 6 decimal places. So we have several good
things about the Newton Raphson method.
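As a quick check, the iterates of this transcendental example can be reproduced with a short loop (a minimal Python sketch, not from the lecture):

```python
import math

f = lambda x: x * math.tanh(x / 2) - 1
fprime = lambda x: math.tanh(x / 2) + x / (2 * math.cosh(x / 2) ** 2)

xs = [1.75]                     # initial solution from the lecture
for _ in range(4):
    x = xs[-1]
    xs.append(x - f(x) / fprime(x))
# xs is approximately [1.75, 1.547587, 1.543407, 1.543405, 1.543405]
```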
One of them is that this method is quite simple in formulation, as I have shown in this lecture.
In most of the cases it is rapidly convergent; moreover, in this method it is easy to understand
when the process will behave well. On the other hand, we have a certain number of
drawbacks of this method. One of them I have told you earlier: if at any xn, f'(xn) becomes
0, then we cannot apply this method further. Moreover, if f(x) has no real root, then the method
gives no indication of this, and the iterates may simply oscillate.
Let us take a very simple example to illustrate this particular drawback of the Newton
Raphson method: the function f(x) = x² + 2 = 0. You can see that we do not have any real root
of this equation; its graph is a parabola which never intersects the x axis. Now if I take an
initial solution here, some positive value x0, then I will find the tangent at this point; the
tangent line looks like this, and from x0 I get x1, which is a negative number.

So I start with a positive number and get a negative number in the next iteration. Now at this
particular point I find the tangent to the curve again, and I get x2, which is again a positive
number. Similarly, at that point, finding the tangent once more, I get another negative number
as the next iterate, and in this way the iterates will keep oscillating between positive and
negative values (or sometimes two consecutive positive or two consecutive negative values).
Hence, for this particular problem, the Newton Raphson method will never converge; it will always
oscillate. The problem is that the equation does not have any real root, yet if we apply this
particular method using our formula, we get no indication of this.
Now, another drawback of this particular method: generally we say that if an equation has
several roots and you need to find a specific root, you should take an initial approximation
close to that particular root, so that the method will converge towards that root.
For example, if I take the simple transcendental equation f(x) = sin x, then the curve of this
function, for positive values of x, will be like this: this is x = 0, this is π, this is 2π.
Now suppose I take an initial approximation here, let us say something around x0 = 2.4π; then
what should happen is that this particular initial guess converges to the root 2π.
However, if we draw the tangent here according to the Newton Raphson method, what will I get?
I will get my next iterate over here, and then, drawing the next tangent, the iteration converges
towards 0. So I have taken an initial guess near the root 2π, yet it converges to 0 using the
Newton Raphson formula; sometimes it does not converge to the nearest root.
Now let us talk about the convergence of the Newton Raphson method; as I remarked at the
beginning of this lecture, this method has second order of convergence, so let us
prove it. Suppose we are given the equation f(x) = 0. Let xr be the root of f(x) and xn
an estimate of xr such that the difference between xr and xn is very small, less than 1.
Then, by the Taylor series expansion of f about the point xn, evaluated at xr, we have
0 = f(xr) = f(xn) + (xr - xn)·f'(xn) + ((xr - xn)²/2)·f''(ξ), where ξ is a number between xr
and xn.

Now the Newton Raphson formula tells us that f(xn) = f'(xn)·(xn - xn+1). Substituting this
value of f(xn) in the above equation, I get 0 = f'(xn)·(xn - xn+1) + (xr - xn)·f'(xn) + the
second order term of the Taylor series expansion. Now you can see we can cancel f'(xn)·xn,
because it appears with a positive sign in the first term and a negative sign in the second term.

So what I get is 0 = (xr - xn+1)·f'(xn) + ((xr - xn)²/2)·f''(ξ). Now, if I denote the error in
the (n+1)th iteration as en+1 = xr - xn+1 and the error in the nth iteration as en = xr - xn,
this becomes 0 = en+1·f'(xn) + (en²/2)·f''(ξ), so from here I can write
en+1 = -f''(ξ)/(2f'(xn))·en². You can note that near the root this factor is essentially a
constant, so I can write en+1 ∝ en². It means that the error in the next iteration is
proportional to the square of the error in the current iteration; so if the error is less than 1,
the method converges to the exact solution quadratically, and hence I can say that Newton Raphson
has second order of convergence.
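We can verify this quadratic behaviour numerically; the following is a small sketch, where the example f(x) = x² - 4 and the starting point 3 are chosen purely for illustration.

```python
f = lambda x: x**2 - 4
fprime = lambda x: 2 * x
root = 2.0                                # known exact root

x = 3.0                                   # initial guess
errors = [abs(x - root)]
for _ in range(3):
    x = x - f(x) / fprime(x)
    errors.append(abs(x - root))

# Ratios e_{n+1} / e_n^2 should approach |f''(x_r) / (2 f'(x_r))| = 2/8 = 0.25
ratios = [errors[n + 1] / errors[n] ** 2 for n in range(3)]
```

The errors shrink roughly as 1, 0.17, 0.0064, 1e-5, and the ratios settle near 0.25, exactly the constant predicted by the proof above.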
So in this lecture we have learned about the Newton Raphson method. First of all we derived
it using a geometric illustration, then we derived it using the Taylor series expansion of a
function f(x). Then we solved two examples using this method, one of them quite simple, the
other involving a transcendental equation. Finally, we looked at some drawbacks of this
particular method, explained them graphically, and proved that this method has quadratic
convergence. Thank you very much.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 9
Fixed Point Iteration Method
Hello everyone, welcome to the fourth lecture of module 2, which is the module on
nonlinear equations. In this lecture I am going to introduce another method of solving
non-linear equations, called the fixed point iteration method. Why do we call it the
fixed point iteration method? Because it is based on the concept of a fixed point of a given
function. In the past three lectures we have learned about the bisection method, then Regula Falsi
and the Newton Raphson method.

In all those methods, what were we doing? We were establishing an iterative equation and, based
on that iterative equation, finding a sequence which converges to the root of the equation. In
the Newton Raphson method we were calculating the derivative of the function; however, in the
earlier methods like bisection or Regula Falsi, we did not calculate the derivative. Since in the
Newton Raphson method we calculate derivatives and make some extra effort (we assume the function
is twice differentiable), based on that we get better convergence from the Newton Raphson method
compared to the earlier methods.
Now let us discuss this fixed point iteration method. First of all, what is a fixed
point? A fixed point of a function f(x) is a point x such that f(x) = x. For example, if
we have the function f(x) = x² - x - 3, then the fixed points of f(x) are given by f(x) = x,
that is, x² - x - 3 = x, and when we solve it we can check that this particular equation has two
fixed points: one is 3 and the other is -1, because if we calculate f(3), it comes out to be
9 - 3 - 3 = 3, and if we calculate f(-1), it comes out to be 1 + 1 - 3 = -1.
So in general we can say that the fixed points of a function are given by the intersection
of this function with the line y = x; for example, this function has these 2 fixed points. If
I take another function, f(x) = sin x, then x = 0 is the only fixed point of this function.
Similarly, we can find functions which have 1 fixed point, 2 fixed points or more than
2 fixed points; for example, if you take the identity function, all its points are fixed points.
Now, based on this concept of a fixed point, we will develop our fixed point iteration method.
(Refer Slide Time: 4:20)
So let us say we have a non-linear equation f(x) = 0. Now we need to rewrite this equation
in the form x = g(x) in such a way that a fixed point of this second equation becomes a root of
the first: if f(x) = 0 is equation 1 and x = g(x) is equation 2, then a fixed point of equation 2
becomes a root of equation 1. Now, given a non-linear function f(x), I can write it in this form
in several ways; for example, consider the function f(x) = x² - x - 3 = 0.

Here f(x) is x² - x - 3, so I can write this equation as x = x² - 3, and that right-hand side
is g(x); so g(x) is x² - 3. Another way of writing this: from x² = x + 3 I can take the square
root with + or -, so g(x) may be √(x + 3) or g(x) may be -√(x + 3). So these are two ways of
writing this equation in the form x = g(x), and we can have several other ways. Now, after
rewriting equation 1 as equation 2, what will we do? From equation 2 I will generate the
iterative scheme xn+1 = g(xn).

If x* is a fixed point of this particular g, then, when the sequence generated by this iteration
scheme converges towards x*, in the limit we get x* = g(x*), and hence x* is a fixed point of g,
which gives a root of f(x) = 0. This particular formula is called the iterative scheme, or
formula, of the fixed point iteration method. The only thing you need to take care of is the
choice of this function g: how to write this g from the given equation f(x) = 0.
So the criteria to choose this function g are: first, given any initial x0, the iterates of
this scheme can be calculated easily; second, the sequence xn is convergent using that particular
g; and third, the limit to which the sequence xn converges, let us say η, should be a fixed point
of g(x), that is, g(η) = η. Regarding the first point: given an initial value x0, using this
method we should be able to find all the subsequent approximations xn.
Take a very simple example, let us say f(x) = x² - x = 0, and suppose I need to solve this
equation. Writing x² = x and taking the square root of both sides gives x = ±√x; suppose I
choose the negative square root, so g(x) = -√x, and then my x1 will become -√x0.

Now this g(x) is defined only for non-negative x, so if x0 > 0, what will happen? I
will be able to find x1, which will be a negative number. But when I calculate x2, it
will be minus the square root of a negative number coming from the first iteration, so I will
not be able to find x2, and hence I cannot find any subsequent approximation xn. Hence we
should choose g(x) in such a way that we can find the approximations in all subsequent
iterations.
Moreover, the convergence of the fixed point iteration method depends on the function g, that is,
on how you choose your function g, and on the initial approximation x0.
Now I am going to explain a few of the assumptions for choosing g. The first assumption is
that g should map the domain of x into itself: if x belongs to the closed interval [a, b], then
g(x) should belong to the closed interval [a, b]. Why am I taking this assumption? Because if
x0 lies in [a, b], then for all n, xn should lie in [a, b]; since xn+1 = g(xn), the iterate
xn+1 is defined in [a, b] only when g maps [a, b] into itself.

The second is that g should be a continuous function: if the sequence of approximations xn tends
to x*, then x* = lim n→∞ xn = lim n→∞ g(xn-1), and since g is continuous I can take the limit
inside, so lim n→∞ g(xn-1) = g(lim n→∞ xn-1) = g(x*); hence x* is a fixed point of g.

The next is a very important assumption, a condition on the choice of g which tells us whether
the method is going to converge for this particular choice of g or not: the iterative function g
is differentiable on [a, b], and in addition there exists a constant k in the open interval
(0, 1) such that |g'(x)| ≤ k for all x belonging to [a, b]. Hence I can state this condition as
|g'(x)| < 1 for all x in [a, b]. If you have such a g, then your fixed point iteration scheme
converges for any initial approximation chosen from the interval [a, b].
Moreover, we have a condition for the uniqueness of the fixed point: assume that g is
continuously differentiable on the closed interval [a, b], the range of g lies in [a, b], and
λ, the maximum value of |g'(x)| for x in [a, b], is less than 1. Then x = g(x) has a unique
solution x* in [a, b], that is, the fixed point is unique in this particular interval.
Furthermore, for any choice of x0 in this interval [a, b], with xn+1 = g(xn), n = 0, 1, 2,
et cetera, the sequence xn will converge to this unique fixed point. Further,
|xn - x*| ≤ λⁿ|x0 - x*|, because one factor of λ is added in each iteration:
|xn - x*| ≤ λ|xn-1 - x*| ≤ λ²|xn-2 - x*| and so on; also |xn - x*| ≤ (λⁿ/(1 - λ))|x1 - x0|,
and the limit as n → ∞ of the error in the (n+1)th iteration divided by the error in the nth
iteration equals g'(x*).

It means that when n is quite large, the limiting value of en+1/en is |g'(x*)| < 1, because x*
lies in this interval and λ is the maximum of |g'(x)| over the interval, so this value is always
less than 1; hence, in the limit, en+1 < en, strictly less.
After this, let us take some examples which we will solve using this fixed point iteration
method.
So consider the equation f(x) = x³ - 7x + 2 = 0. This equation has a root in [0, 1], because
when we test it at x = 0, it comes out to be 2, which is a positive number, and when I test it
at x = 1, f(1) comes out to be 1 - 7 + 2, that is -4, which is a negative number. So f(0)·f(1)
is negative, and according to the intermediate value theorem there will be a root in [0, 1]. We
can write this equation as x = (x³ + 2)/7, so we will have g(x) = (x³ + 2)/7, and you
can see that for x in [0, 1] the value g(x) always stays in [0, 1].

So the range lies within the domain, which is our first assumption in this method. Moreover, if
you check g'(x) = 3x²/7, it is always at most 3/7 for all values of x between 0 and 1; thus, by
the convergence condition of the fixed point iteration method, the sequence xn+1 = (xn³ + 2)/7
will converge to a root of x³ - 7x + 2 = 0. That is, the fixed point of this scheme will give you
the root of this equation.
(Refer Slide Time: 18:52)
The algorithm works like this: suppose we have the equation f(x) = 0; first we write the
given equation in the form x = g(x), then we start with an initial approximation, say x0, and
then we find the successive approximations xn+1 = g(xn) in the subsequent iterations.
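Algorithmically this is only a few lines; here is a sketch (the name `fixed_point` and the tolerance are illustrative), applied to the earlier example x³ - 7x + 2 = 0 with g(x) = (x³ + 2)/7.

```python
def fixed_point(g, x0, tol=1e-6, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until two successive iterates agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed point iteration did not converge")

# x^3 - 7x + 2 = 0 rewritten as x = (x^3 + 2) / 7, starting inside [0, 1]
root = fixed_point(lambda x: (x**3 + 2) / 7, 0.5)
```

Since |g'(x)| ≤ 3/7 on [0, 1], the scheme contracts, and only a handful of iterations are needed.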
Now let us take another example of the fixed point iteration method. Suppose we have the
4th order polynomial equation x⁴ - x - 10 = 0, and consider g(x) = 10/(x³ - 1); then, by the
fixed point method, the iterative formula becomes xn+1 = 10/(xn³ - 1), for n = 0, 1, 2, ….
Start with the initial approximation x0 = 2: then x1 becomes 1.429, x2 becomes 5.214, x3
becomes 0.071, x4 becomes -10.004, x5 becomes -9.97×10⁻³, then x6 becomes -10, x7 becomes
-9.99×10⁻³, x8 becomes -10, and from there it oscillates between -9.99×10⁻³ and -10 forever,
which shows that the iterative process will never converge.
So what is the problem here? The problem is our choice of g(x) = 10/(x³ - 1). If I find out
g'(x), it becomes -10/(x³ - 1)²·3x², which I can write as g'(x) = -30x²/(x³ - 1)². Now check it
at x = 2: g'(2) = -30·4/49 = -120/49, and hence |g'(2)| = 120/49, which is greater than 1.
So this choice of g does not fulfil the requirement for convergence, and hence we are
getting this behaviour of the iterative scheme, which is not going to converge.
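This divergent behaviour is easy to see by running the iteration directly (a sketch; the small differences from the values quoted above come from rounding the intermediate iterates by hand in the lecture):

```python
# The divergent choice g(x) = 10 / (x^3 - 1), started from x0 = 2
g = lambda x: 10 / (x**3 - 1)

x = 2.0
history = [x]
for _ in range(10):
    x = g(x)
    history.append(x)
# history begins 2, 1.4286, 5.2207, 0.0708, -10.0035, ... and then bounces
# between roughly -10 and -0.01 instead of converging, because |g'(2)| = 120/49 > 1
```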
If we take another function g for the same equation x^4 - x - 10 = 0, another way of writing g(x) is (x + 10)^(1/4); then the fixed point iteration formula becomes xn+1 = (xn + 10)^(1/4). Again if I start with x0 = 2, I get x1 = 1.861, x2 = 1.8558, x3 = 1.85559, x4 = 1.85558 and then x5 = 1.85558, which is the same up to 5 decimal places in two successive iterations. Hence my fixed point iteration method converges to the root with an accuracy of 5 decimal places in just 5 iterations, and this happens because of this particular choice of g(x).

Here I am taking g(x) = (x + 10)^(1/4), so g'(x) becomes (1/4)(x + 10)^(-3/4). If I calculate g'(2), it becomes (1/4)(12)^(-3/4) ≈ 0.039 < 1, so the convergence condition on the choice of g is fulfilled here, and that is why the method converges quite fast.
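A short sketch contrasting the two choices of g, iterating only the convergent one (the stopping tolerance is my own choice):

```python
# Two rearrangements of x^4 - x - 10 = 0 as x = g(x):
#   g1(x) = 10/(x^3 - 1)    -> |g1'(2)| = 120/49 > 1, the iteration oscillates forever
#   g2(x) = (x + 10)**0.25  -> |g2'(2)| = 1/(4 * 12**0.75) < 1, the iteration converges

def g2(x):
    return (x + 10) ** 0.25

x = 2.0
for i in range(20):
    x_prev, x = x, g2(x)
    if abs(x - x_prev) < 1e-6:       # successive iterates agree
        break

print(x, i)   # converges to about 1.85558 in a handful of iterations
```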
If we take a third choice of g, which can be written as g(x) = (x + 10)^(1/2)/x, then the fixed point iteration formula becomes xn+1 = (xn + 10)^(1/2)/xn. Here let us start with x0 = 1.8, which is quite close to the exact solution; earlier we were taking 2, but now let us take 1.8. So here x0 is 1.8, x1 becomes 1.9084, x2 becomes 1.80825, and continuing in the same manner we see that the solution converges to 1.8555 in 98 iterations. So with the earlier choice we took the initial solution far away from the root and still got convergence in just 5 iterations; here we take the initial solution close to the exact root but obtain the solution only after a large number of iterations.

However, the method is still converging, and this slow behavior is again due to the choice of g: if you evaluate g3'(x) at x = 1.8, it is very close to one, around 0.9 something, and hence we need many more iterations. Hence in this method the choice of g is very important for the convergence: you need to check the condition |g'(x)| < 1 for all x belonging to [a, b] for each candidate g, and among the choices that satisfy this condition we should use the one for which |g'(x)| is smallest, since a smaller value gives faster convergence of the fixed point iteration.
Moreover, this method is simple in formulation, apart from the effort of choosing a suitable g. So thank you very much for this lecture. In the next lecture we will learn about the solution of systems of non-linear equations: just as in module one we solved systems of linear equations, here we will take up systems of non-linear equations. In such a system we have non-linear equations involving more than one variable, and therefore we will modify our Newton-Raphson method as well as the fixed point iteration method for solving such systems. Thank you once again.
Numerical Methods
Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 10
System of Nonlinear Equations
Hello everyone, welcome to the last lecture of the second module of this course. In the past lectures of this module we have learned how to solve a non-linear equation; in particular, for finding the roots of non-linear equations we learned several methods, including the bisection method, the Regula Falsi and Secant methods, then the Newton-Raphson method and finally fixed point iteration methods. In today's lecture we will learn how to solve a system of non-linear equations. In particular, we will focus on two methods, Newton-Raphson and fixed point iteration, for doing this job.

So first of all let me tell you what a system of non-linear equations is. Consider a system of n non-linear equations in n unknowns, written as f1(x1, x2, ..., xn) = 0, f2(x1, x2, ..., xn) = 0, and so on, the nth equation being fn(x1, x2, ..., xn) = 0. So here we have n equations, each denoted by f1, f2, ..., fn, and each involving the n unknown variables x1, x2, ..., xn. Solving this non-linear system means we need to find the values of x1, x2, ..., xn which satisfy all these equations.
Analytically such a system is in general very hard to solve, and hence here I will introduce some numerical methods for solving these equations. If I want to write this system in vector form, just consider a vector f of the functions f1, f2, ..., fn and a vector x of the unknown variables x1, x2, ..., xn. Then this system of non-linear equations can be written as f(x) = 0.
Now using Taylor series expansion, each of the functions fi, where i runs from 1 to n, can be expanded in the neighborhood of the vector x = (x1, x2, ..., xn). Using the Taylor series expansion we can write fi(x + δx) = fi(x) + ∑_{j = 1 to n} (∂fi/∂xj) δxj + higher order terms. Here you can notice that, collecting these summation terms for all i, the partial derivatives form an n×n matrix. This particular matrix containing the partial derivatives of f with respect to the unknown variables x1, x2, ..., xn is called the Jacobian matrix. So the (i, j)th entry, that is, the entry in the ith row and jth column of this Jacobian matrix J, is given by ∂fi/∂xj, the partial derivative of the ith function with respect to the jth variable.
(Refer Slide Time: 4:14)
We can write equation 1 in matrix form as f(x + δx) = f(x) + Jδx + second and higher order terms. Here you can notice that the values f1(x), f2(x), ..., fn(x) form a vector, which we have written as f(x), and the Jacobian matrix comes from the summation term ∑_{j = 1 to n} (∂fi/∂xj) δxj; for the ith equation this is ∂fi/∂x1·δx1 + ∂fi/∂x2·δx2 + ... + ∂fi/∂xn·δxn, so we have written the whole system in matrix form in this way. Now, as we are assuming that δx is a small quantity, we can neglect the second and higher order terms, and moreover we can set the left-hand side f(x + δx) to 0.

So we get a system of equations for the corrections δx that make each fi approximately 0: since the left-hand side is 0, we can write J·δx = -f(x). Here you can notice that in this system the Jacobian matrix is known to us for a given x0, that is, for a given vector (x1, x2, ..., xn), and we also know f(x0); hence this comes out as a system of linear equations. After solving this system of linear equations we add the correction to the old solution, that is, xnew = xold + δx, and in this way we update our iterations. So let us turn this into the Newton-Raphson method for solving systems of non-linear equations.
In the case of a single variable, the Newton-Raphson method was obtained using the linear approximation of the function f around an initial point x0. In the case of a system of non-linear equations we can likewise write the linear approximation of the vector function f around the initial vector x0 as f(x) = f(x0) + J(x0)·(x - x0), where J(x0) is an n×n matrix and, as you know, it is the Jacobian matrix formed using the partial derivatives of f.
(Refer Slide Time: 7:16)
So the first row contains ∂f1/∂x1, ∂f1/∂x2, up to ∂f1/∂xn; in the second row we have the partial derivatives of the function f2 with respect to x1, x2, up to xn; and so on, until in the last row we have the partial derivatives of fn with respect to x1, x2, up to xn.

Now we have to find x such that f(x) becomes 0. So choose x1 such that f(x0) + J(x0)(x1 - x0) = 0. Please note that here f(x0) and x1 - x0 are vectors and J(x0) is an n×n matrix. Since J(x0) is a square matrix, we can write the iteration as x1 = x0 - J(x0)^(-1) f(x0). However, we can write it this way only when the inverse of J exists, and this method is called the Newton-Raphson method for a system of non-linear equations.
In algorithmic form we can use this method provided the inverse of J exists. But suppose you have a hundred equations in a hundred unknowns; then the size of J becomes 100×100, and finding the inverse of a 100×100 matrix is not an easy job in terms of computational complexity.

So we can use an alternative: instead of finding the inverse of J, we write x1 - x0 as δx, where δx is the vector of differences between two iterations, say the first and the second. Then the scheme can be written as J(x0)·δx = -f(x0). We know J for a given x0 as well as f at x0, so the above system becomes a system of n linear equations in n unknowns, and once we have this system we can solve it using any scheme I told you about in module 1, like the Jacobi or Gauss-Seidel kind of approaches. Once you solve this linear system you obtain δx, and from δx the new solution is obtained as x1 = x0 + δx.

So in brief: if we have the vector xi in the ith iteration, we solve the equation J(xi)·δx = -f(xi); by solving this we find δx, and the vector in the next iteration becomes xi+1 = xi + δx. So let us take an example of such a system.
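The scheme above, solve J(xi)·δx = -f(xi) and update xi+1 = xi + δx, can be sketched in Python. As an illustration I use a small two-variable system that also appears later in this lecture (x^2 - 2x - y + 0.5 = 0, x^2 + 4y^2 - 4 = 0), with a starting guess of my own choosing:

```python
import numpy as np

# Newton-Raphson for a system: solve J(x_i) @ dx = -f(x_i), then x_{i+1} = x_i + dx.
# Example system: f1 = x^2 - 2x - y + 0.5 = 0,  f2 = x^2 + 4y^2 - 4 = 0

def f(v):
    x, y = v
    return np.array([x**2 - 2*x - y + 0.5, x**2 + 4*y**2 - 4])

def jacobian(v):
    x, y = v
    return np.array([[2*x - 2, -1.0],
                     [2*x,      8*y]])

v = np.array([0.0, 1.0])                        # initial guess
for _ in range(20):
    dx = np.linalg.solve(jacobian(v), -f(v))    # linear solve, no explicit inverse
    v = v + dx
    if np.linalg.norm(dx) < 1e-10:              # correction is negligible
        break

print(v)   # approx [-0.2222, 0.9938]
```

Note that `np.linalg.solve` carries out exactly the "solve the linear system instead of inverting J" idea from the lecture.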
Here each equation is a transcendental equation: in the first equation we have a cosine term, in the second equation we have a sine term, and in the third equation we have an exponential term. Let us solve this system of transcendental equations with the initial guess (0.1, 0.1, -0.1).

It means that as the initial solution we take x1 = 0.1, x2 = 0.1 and x3 = -0.1. Now let us call the first equation f1, that is, it is a function f1(x1, x2, x3); similarly let the second equation be f2 and the third be f3. We can write the system in this way, and hence first of all we need to find the Jacobian of this system.
For finding the Jacobian we have a system f1(x1, x2, x3) = 0, f2(x1, x2, x3) = 0 and f3(x1, x2, x3) = 0. The Jacobian matrix will be a 3×3 matrix whose first row is ∂f1/∂x1, ∂f1/∂x2, ∂f1/∂x3. In the same way, in the second row we have, instead of f1, the function f2: ∂f2/∂x1, ∂f2/∂x2 and ∂f2/∂x3. In the final row we have the function f3, that is, the partial derivatives of f3 with respect to x1, x2 and x3.

So the Jacobian matrix is a 3×3 matrix, and here ∂f1/∂x1 becomes 3, ∂f1/∂x2 becomes x3·sin(x2x3), and similarly ∂f1/∂x3 becomes x2·sin(x2x3); the second row holds the partial derivatives of f2 with respect to x1, x2 and x3, and the final row the partial derivatives of f3 with respect to x1, x2 and x3.
Now I will put these values into the Jacobian matrix, that is, x1 = 0.1, x2 = 0.1 and x3 = -0.1, so the Jacobian matrix, or the initial Jacobian matrix J(x0), becomes like this. Moreover, for the initial solution (0.1, 0.1, -0.1) the values of f1, f2, f3 are given by these 3 numbers, so I have J(x0) and f(x0). Now I solve the system J(x0)·δx = -f(x0), and after solving it I get δx: δx1 becomes 0.39986, δx2 becomes -0.080533 and δx3 becomes this number. This is the change which we need to make in the initial solution to get the update in the first iteration. Adding it to the initial solution, I get 0.479, 0.019 and -0.521 as my solution in the first iteration, which I denote by x1.

Continuing in this way, I now use this x1 to find x2: I need to solve the system J(x1)·δx = -f(x1), and then x2 = x1 + δx, where δx is obtained by solving this system.
Hence we get this table of iterations for various n. The initial solution was (0.1, 0.1, -0.1), and then this is the solution obtained in the first iteration. The last column of the table shows the norm of the difference of the vectors x in successive iterations, meaning the current iteration and the previous one. For the second iteration I get x1 as 0.5, then this is x2 and this is x3, and here the difference is of order 10^-2. In the third iteration these are the values and the difference is of order 10^-3.

In the fourth iteration these are the values and the difference is of order 10^-5. So here you can notice that the difference is decreasing rapidly, which means we are moving towards the exact solution. In the fifth iteration these are the values for x1, x2 and x3, and the difference is of order 10^-10; hence this vector x is very close to the exact solution, and in this way, using successive iterations, we can get the numerical solution of a system of non-linear equations using the Newton-Raphson method.
The above example shows that the Newton-Raphson method converges fast if we start with a good approximation, that is, if our initial solution is close to the exact solution. But it is not always possible to find a good initial solution for a given problem, and in those cases the method becomes costly to apply. So for this we will take another method, the fixed point iteration method: like we did earlier for a single non-linear equation, we will extend it to a system of non-linear equations.

Here again consider a system of n non-linear equations in n unknowns given by this equation; here I have just changed the variables, so the n variables are x, y, up to z, and the functions are f1, f2, ..., fn. We can write this system again in vector form as f(x) = 0, where f is the vector of functions f1, f2, ..., fn and x is the vector of the n variables x, y, up to z.
Now if I take the initial approximation as (x0, y0, ..., z0), then the fixed point iteration for the given non-linear system is given in this way; here you can notice that we have g1, g2, ..., gn. So, taking the first non-linear equation, you have to write it, f1(x) = 0, in such a way that x = g1(x); similarly for the second equation and the rest. In the case of a single non-linear equation I told you the criterion for writing such a g so that the method converges; we need to write the equations in this way so that the iteration converges to the numerical solution (s, t, ..., u).
Now what is the condition for convergence in this case? That is given by the following result: suppose that the functions g1(x), g2(x), ..., gn(x) and their partial derivatives with respect to the unknown variables x, y, up to z are continuous on a region that contains the fixed point (s, t, ..., u). So here what we are saying is that the gi should be continuous on the region containing your initial solution, their partial derivatives should be continuous on the same region, and your fixed point should also lie in that domain.

Then, if the initial chosen point (x0, y0, ..., z0) is sufficiently close to the fixed point (s, t, ..., u), and if the sum of the absolute values of the partial derivatives of g1 with respect to the first unknown variable, the second unknown variable, and so on up to the nth unknown variable, evaluated at the fixed point, is less than 1, and similarly for g2 up to gn, then the fixed point iteration defined earlier converges to the fixed point (s, t, ..., u).
For this let us take an example, this time with 2 equations in 2 unknowns: the first equation is x^2 - 2x - y + 0.5 = 0 and the second is x^2 + 4y^2 - 4 = 0. Now we write the first equation in the form x = g1(x, y): we take the -2x term to the right-hand side, so x becomes (x^2 - y + 0.5)/2.

Similarly, we need to write the second equation as the second unknown variable equal to g2, a function of all the unknown variables. Here we write it as y = (-x^2 - 4y^2 + 8y + 4)/8. Starting with an initial point (x0, y0), we obtain the sequence (xi+1, yi+1): as you did in the case of a single non-linear equation, do it separately for g1 as well as for g2, and we get this iterative scheme. Now start with initial values of x and y and you get successive approximations of x and y which converge towards the fixed point. So let us check the convergence condition.
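This iterative scheme can be sketched directly (the starting point inside the rectangle is my own choice):

```python
# Fixed point iteration for the system
#   x = g1(x, y) = (x^2 - y + 0.5)/2
#   y = g2(x, y) = (-x^2 - 4y^2 + 8y + 4)/8

def g1(x, y):
    return (x**2 - y + 0.5) / 2

def g2(x, y):
    return (-x**2 - 4*y**2 + 8*y + 4) / 8

x, y = 0.0, 1.0          # initial point inside the rectangle [-0.5, 0.5] x [0.5, 1.5]
for _ in range(200):
    x, y = g1(x, y), g2(x, y)   # update both components from the previous iterate

print(round(x, 2), round(y, 2))   # -> -0.22 0.99
```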
So we have these 2 functions written like this. The partial derivative of g1 with respect to x is 2x/2 = x, and the partial derivative of g1 with respect to y is -1/2. For g2, the partial derivative with respect to x is -2x/8 = -x/4, and the partial derivative with respect to y is (-8y + 8)/8 = -y + 1. Now if you take the rectangular domain where x is between -0.5 and 0.5 and y is between 0.5 and 1.5, you can verify that the partial derivatives satisfy the convergence conditions: |∂g1/∂x| + |∂g1/∂y| will be less than 1, and similarly |∂g2/∂x| + |∂g2/∂y| will be less than 1, for all values of x and y in this domain.
So if you choose any initial solution in this rectangular domain, the iteration will converge to the common fixed point of g1 and g2, and the fixed point of this system, (-0.22, 0.99), also lies in the given rectangular domain. Now the question is how to choose the various g in the system so that the method converges, because, as you know from the single non-linear equation case, when we solved it using the fixed point iteration method we could make various choices of g. As you know from there, the choice of g is very important for the convergence, and the same applies to a system of non-linear equations: here again the choice of g is quite important.
So how do we choose the best g? Let me explain. Suppose we have a system of non-linear equations represented in vector form as f(x) = 0, where f is a vector containing the functions f1, f2, ..., fn and x is a vector containing the variables x1, x2, ..., xn; so we have n non-linear equations in n unknowns. The fixed point iteration method for this can be written as x = g(x).

Now what should be the best choice of g? Let me write g(x) = x + A·f(x), where A is a constant square matrix of order n; since f(x) = 0 at the solution, this added term vanishes there. Now if I differentiate g partially with respect to x1, x2, ..., xn, I get a matrix G(x): the x term gives the identity matrix I, A is a constant matrix, and the matrix of derivatives of f with respect to x1, x2, ..., xn is, as you know, the Jacobian matrix. So G(x) = I + A·J, where J is the Jacobian containing the partial derivatives of f with respect to the different x. Now for convergence, and in particular for rapid convergence, the norm of this G(x) should be less than 1.
This tells me that if I take G(x) = 0, which is the minimum possible value of the norm of G, I will get the most rapid convergence. So if I put G(x) = 0, I get 0 = I + AJ, that is, AJ = -I, and then if I post-multiply by J^(-1) I get A = -J^(-1). Putting this back, g(x) = x + A·f(x) = x - J^(-1)·f(x).

And what is this? My iterative scheme becomes xn+1 = xn - J^(-1)·f(xn), and this is exactly the Newton-Raphson method for a system of non-linear equations. Hence the best choice of g makes the fixed point iteration method the same as the Newton-Raphson method for the system of non-linear equations.
Such systems also arise when we find the extrema, that is, maxima or minima, of a function of more than one variable; let us take the function f(x1, x2) = x1·x2^2 + x2·sin x1. Say I need to find the maxima, minima or extrema of this function; the necessary condition is that ∂f/∂x1 should be 0 and ∂f/∂x2 should be 0, and from this we find the points in the domain of x1 and x2 where we need to check for maxima or minima of f. Basically we find the stationary points. Here ∂f/∂x1 = 0 gives me x2^2 + x2·cos x1 = 0, and the second equation gives me 2x1x2 + sin x1 = 0. Please note what this system is: it is a system of non-linear equations, which we have just discussed. So you will encounter such systems quite frequently, and if we consider the problem of minima of this function and apply the sufficient condition, we will find that the Jacobian matrix of this system (which is the matrix of second partial derivatives of f) should be invertible at the solution points for the function to have a maximum or minimum there.
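As a sketch, the same Newton-Raphson machinery can be applied to this stationary-point system (the starting guess below is my own choice):

```python
import numpy as np

# Stationary points of f(x1, x2) = x1*x2^2 + x2*sin(x1):
# solve grad f = 0, a nonlinear system, with Newton-Raphson.

def grad(v):
    x1, x2 = v
    return np.array([x2**2 + x2*np.cos(x1),        # df/dx1
                     2*x1*x2 + np.sin(x1)])        # df/dx2

def hessian(v):                                    # Jacobian of the gradient system
    x1, x2 = v
    return np.array([[-x2*np.sin(x1),    2*x2 + np.cos(x1)],
                     [ 2*x2 + np.cos(x1), 2*x1]])

v = np.array([1.2, -0.4])                          # initial guess near a stationary point
for _ in range(30):
    v = v + np.linalg.solve(hessian(v), -grad(v))

print(v)   # a stationary point: grad f is essentially 0 here
```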
Numerical Methods
Professor Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 11
Introduction to Eigenvalues and Eigenvectors
Hello everyone, today we are going to start module 3 of this course, and in the first lecture of this module I will give an introduction to eigenvalues and eigenvectors. Why am I giving this introduction? Because in this unit we will focus on finding the eigenvalues and eigenvectors of a given matrix using numerical methods. In real-life problems we quite frequently need to find eigenvalues or eigenvectors of a matrix to get some idea about the system, and hence it is very important to know what these two things are, eigenvalues and eigenvectors of a given matrix, and how to apply a numerical method for finding them.

Let me define eigenvalues and eigenvectors. Let A be a square matrix of order n; that is, the matrix is an element of the vector space C^(n×n). Then a non-zero vector X belonging to R^n, or more generally to C^n, is said to be an eigenvector of A if AX = λX for some scalar λ. Notice that for X to be an eigenvector of A, first of all it should be a non-zero vector, and second it should satisfy this particular condition AX = λX.
Now let us talk about this scalar λ. Here λ is called an eigenvalue of A, and hence we can say that X is an eigenvector corresponding to the eigenvalue λ. For example, take the 2×2 matrix A = [2 1; 2 3] and the vector X = (1, 2). Then AX = (4, 8), which I can write as 4·(1, 2). So this is my matrix A, this is the eigenvector X, and AX = λX with λ = 4. It means that this particular vector (1, 2) is an eigenvector of the matrix A, 4 is an eigenvalue of A, and this eigenvector is the eigenvector corresponding to the eigenvalue 4.

Now what is this equation telling us? X is a vector and A is a matrix, and every matrix is a transformation, basically a linear transformation. So what we are doing is applying a linear transformation to the vector X and getting a scalar-times change in X: the vector will either stretch or shrink, depending on the value of λ. For example, above we initially had the vector (1, 2), and after applying the transformation it becomes 4·(1, 2), that is, it goes to (4, 8); there was a magnification of the vector by 4 times. Hence this is another interpretation of the eigenvalues and eigenvectors of a transformation or matrix A.
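The defining relation can be checked in a couple of lines (a sketch using NumPy):

```python
import numpy as np

# Check the definition A @ X = lambda * X for A = [[2, 1], [2, 3]] and X = (1, 2):
A = np.array([[2, 1],
              [2, 3]])
X = np.array([1, 2])

print(A @ X)   # [4 8], i.e. 4 * X, so X is an eigenvector with eigenvalue 4
```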
(Refer Slide Time:6:38)
Now how do we find the eigenvalues and eigenvectors of a given matrix? Look at the definition of eigenvectors: X is a non-zero vector, and it is an eigenvector of A if AX = λX. Now AX = λX can be written as AX - λX = 0, or (A - λI)X = 0, where I is the identity matrix of the same order as A.

From here you can see that we have a homogeneous system of n equations in n unknowns, and we are saying that X is a non-zero vector. If X is non-zero, the system has a non-zero solution, which means that the null space of the transformation A - λI has dimension more than 0. Now if the system has a non-zero solution, A - λI must have rank less than n, which means the determinant of A - λI should be 0.

Now if I expand the determinant of A - λI, it will be a polynomial of degree n in λ, and the zeros of that polynomial are the eigenvalues of the matrix A. So using this concept we can find the eigenvalues of a matrix: you write the matrix A - λI, you find the polynomial coming from the determinant of A - λI, and by solving that polynomial equation of degree n you find the eigenvalues of A.
As I told you, by solving this equation you get the values of λ for which the determinant of A - λI is 0, and these values of λ are called the eigenvalues of A. This particular polynomial is called the characteristic polynomial of A, and hence eigenvalues are also called the characteristic values of the given matrix. Once you have found the eigenvalues, say λ1, λ2, ..., λn, then what do we need to do? We need to find the eigenvectors corresponding to each eigenvalue. The eigenvector corresponding to the eigenvalue λ = λ1 can be calculated just by solving the homogeneous system of equations (A - λ1·I)X = 0.

Since we have chosen λ1 such that this system has a non-zero solution, we will get a non-zero vector X as a solution, and this non-zero vector X will be the eigenvector corresponding to λ = λ1. Similarly we can find the eigenvectors corresponding to the other eigenvalues. This is the classic way of finding the eigenvalues and eigenvectors of a given matrix.
(Refer Slide Time: 12:29)
So let us take an example. Consider the 2×2 matrix A = [3 2; 7 -2]; we need to find the eigenvalues and eigenvectors of this matrix. The characteristic polynomial of A is the determinant of A - λI, which becomes (3 - λ)(-2 - λ) - 14 = λ^2 - λ - 20, and the zeros of this polynomial are 5 and -4. Hence the eigenvalues of this matrix are 5 and -4.
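A quick check of this example with NumPy; the polynomial coefficients are read off from the trace and determinant of A:

```python
import numpy as np

# Eigenvalues of A = [[3, 2], [7, -2]]: roots of det(A - lambda*I) = lambda^2 - lambda - 20
A = np.array([[3.0, 2.0],
              [7.0, -2.0]])

# characteristic polynomial lambda^2 - trace(A)*lambda + det(A)
coeffs = [1, -(A[0, 0] + A[1, 1]), np.linalg.det(A)]
print(np.roots(coeffs))          # the roots 5 and -4
print(np.linalg.eigvals(A))      # the same values from the numerical routine
```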
(Refer Slide Time: 13:58)
Similarly, we can calculate the eigenvector corresponding to λ = -4, and it comes out as (2, -7); you can verify that both of these eigenvectors, for the eigenvalues 5 and -4, satisfy the relation AX = λX.

Now geometrically: we multiply an n×n matrix A by an n×1 vector X, resulting in another n×1 vector Y = AX. Thus A can be considered a transformation matrix, which transforms a vector into another vector of the same dimension. In general, a matrix acts on a vector by changing both its magnitude and its direction; however, a matrix may change certain vectors by changing only their magnitude, leaving their direction unchanged or possibly reversed. These vectors are the eigenvectors of the matrix.
So if we have a matrix A and we apply it to a set of vectors, for most of the vectors it will change the magnitude as well as the direction; however, for certain vectors, n or fewer linearly independent ones for an n×n matrix, it will change only the magnitude, and those vectors are the eigenvectors of the matrix. In this way we can pick out the eigenvectors of a matrix from a set of vectors.

If λ is a negative number, the direction of the vector is reversed, and the magnitude will generally change as well. Just take this beautiful example: we have a 2×2 matrix A, and I am applying it to a set of points. After applying this transformation to the set of points, I get the red points: the blue points are before the transformation and the red cluster of points is after the transformation. So you can see what is happening: the transformation changes this square of points into this new shape.
Now if I calculate the eigenvalues and eigenvectors of this matrix, the eigenvalues come out as 1 and 2. Corresponding to the eigenvalue 1 the eigenvector is (-0.7071, 0.7071), which is aligned in the direction of the line Y = -X, while the eigenvector corresponding to the eigenvalue 2 is aligned in the direction of Y = X.
And hence you can see that the change is maximum in the direction of Y = X, and the scale of the change is exactly double. So I want to say that in the direction of the eigenvector corresponding to the largest eigenvalue we have the maximum change, and the change is just a multiple by that particular eigenvalue; and there is no change in the direction of Y = -X, because there the eigenvalue is 1, and 1 times that vector remains the same.
So the eigenvector corresponding to the biggest eigenvalue becomes very important for analyzing patterns, for example when you are talking about data analytics or pattern classification. We will learn some numerical methods to find the maximum eigenvalue and the corresponding eigenvector of a given matrix in the coming lectures.
(Refer Slide Time:19:55)
Now as I told you, the eigenvalues are the roots of the characteristic polynomial, and a polynomial can have repeated roots; for example, we can have λ1 = λ2 = ... = λk. If that happens, the eigenvalue is said to be of algebraic multiplicity k. So what is the algebraic multiplicity of an eigenvalue? It is the number of times the eigenvalue is repeated. For example, if I have a matrix A of order 3×3 with eigenvalues 2, 3, 5, then all the eigenvalues have algebraic multiplicity 1. If the eigenvalues are 2, 2, 5, then the eigenvalue 2 has algebraic multiplicity 2 and 5 has algebraic multiplicity 1. If the eigenvalues are 2, 2, 2, so 2 is repeated 3 times, then the algebraic multiplicity of the eigenvalue λ = 2 is 3.
Now for each distinct eigenvalue of the matrix A there will be at least 1 eigenvector corresponding to it, which can be found by solving the appropriate homogeneous system of equations. Let k be the algebraic multiplicity of an eigenvalue λ, and let m be the number of linearly independent eigenvectors corresponding to λ. Then m always lies between 1 and k; that is, the number of linearly independent eigenvectors corresponding to an eigenvalue is always less than or equal to the algebraic multiplicity of that eigenvalue, and this number m of linearly independent eigenvectors is called the geometric multiplicity of the eigenvalue.

So I want to say that geometric multiplicity never exceeds algebraic multiplicity. For example, take this 3×3 upper triangular matrix; it has characteristic equation (λ - 2)^3 = 0, so the root 2 is repeated 3 times and the algebraic multiplicity of the eigenvalue λ = 2 is 3.
If I calculate the eigenvectors corresponding to λ = 2, I get the eigenvectors (1, 0, 0) and (0, 0, 1), so the geometric multiplicity of the eigenvalue 2 is only 2, whereas the algebraic multiplicity is 3.
Let me tell you some properties of eigenvalues and eigenvectors. The sum of the eigenvalues of a matrix equals the trace of the matrix. How do we define the trace of a matrix? The trace is the sum of the diagonal elements of the matrix, so the sum of the eigenvalues equals the sum of the diagonal elements, and you can see that very clearly from the characteristic equation. The product of the eigenvalues of a matrix equals the determinant of the matrix. Hence, if a matrix has a zero eigenvalue, the matrix is singular: the determinant is the product of the eigenvalues, a 0 appears among them, so the product, and hence the determinant, is 0.

The eigenvalues of an upper or lower triangular matrix are the elements of the main diagonal; similarly, the eigenvalues of a diagonal matrix are the diagonal elements. If λ is an eigenvalue of A and A is an invertible matrix, then 1/λ is an eigenvalue of A^(-1), with the same corresponding eigenvector.
So this result tells us that eigenvectors corresponding to distinct eigenvalues are linearly
independent. Moreover, if we do not have distinct eigenvalues for a given matrix and some
eigenvalue λ has algebraic multiplicity k, then the number of linearly independent
eigenvectors of A associated with this eigenvalue λ is given by m = n - rank(A - λI).
Now what is the diagonalization of a matrix? The eigenvalues and eigenvectors of a matrix
have a very important property: if a square nxn matrix A has n linearly independent
eigenvectors then it is diagonalizable, that is, it can be decomposed as A = PDP^-1, where D
is the diagonal matrix containing the eigenvalues of A along the diagonal, so D can be written
as diag(λ1, λ2, ..., λn), and P is the matrix formed by writing the corresponding eigenvectors
in the columns.
For example, take the matrix (2 1; 2 3). It has eigenvalues 1 and 4 with corresponding
eigenvectors (1, -1) and (1, 2) respectively. Then it can be diagonalized as
(2 1; 2 3) = PDP^-1 with P = (1 1; -1 2). Note that I have written the first eigenvector as the
first column of P and the second eigenvector as the second column of P. D is the diagonal
matrix diag(1, 4) having the eigenvalues as the diagonal entries; the only thing you have to
note is that since the first eigenvalue, 1, sits in the first row of D, the eigenvector
corresponding to 1 should be the first column of P.
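This diagonalization can be verified numerically; a small sketch with NumPy:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])

# Eigenvectors (1, -1) and (1, 2) go in the columns of P, in the same
# order as the eigenvalues 1 and 4 on the diagonal of D.
P = np.array([[1.0, 1.0],
              [-1.0, 2.0]])
D = np.diag([1.0, 4.0])

# A = P D P^-1
reconstructed = P @ D @ np.linalg.inv(P)
print(np.allclose(reconstructed, A))  # True
```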
(Refer Slide Time: 28:51)
Moreover, we can use this property in many other ways. Suppose you need to find out A^m.
This means (PDP^-1)(PDP^-1)...(PDP^-1), m times, and the inner factors P^-1 P cancel, so it
comes out to be P.D^m.P^-1, where D is a diagonal matrix. So D^m can be calculated very
easily, just by taking the powers λ1^m, λ2^m, ..., λn^m, and hence we can calculate A^m in a
very easy manner. However, not all matrices have this property; that is, not all matrices can
be written in the form PDP^-1.
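This shortcut can be sketched with the same 2x2 example; D^m is just the element-wise powers of the eigenvalues:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])
P = np.array([[1.0, 1.0],
              [-1.0, 2.0]])
D = np.diag([1.0, 4.0])
m = 5

# D^m for a diagonal matrix is just the element-wise powers of the
# eigenvalues, so A^m = P @ D^m @ P^-1 is cheap to form.
Dm = np.diag(np.diag(D) ** m)
Am = P @ Dm @ np.linalg.inv(P)

print(np.allclose(Am, np.linalg.matrix_power(A, m)))  # True
```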
So in the next class we will take some examples, where some of the matrices I can write as
PDP^-1, but some of them I cannot. I will tell you the condition which is necessary for
writing A = PDP^-1, and if I cannot write A = PDP^-1, whether there is some other
transformation which makes A factorize into a product of different matrices. So now I will
stop, and I will start the next lecture with similarity transformations. Thank you.
Numerical Methods
Professor Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 12
Similarity Transformations and Gershgorin Theorem
Hello everyone, so welcome to the second lecture of this module. This lecture contains some
examples of similarity transformations and a beautiful result for finding bounds on the
eigenvalues of a matrix. In the last lecture, I told you about eigenvalues and eigenvectors,
and at the end I told you about diagonalization of a matrix. In this lecture, I will generalize
that concept, because diagonalization is a sort of transformation and it is a particular
example of a similarity transformation.
In this lecture, I will introduce a few more similarity transformations and then the
Gershgorin theorem for finding bounds on the eigenvalues.
(Refer Slide Time: 1:20)
So we can prove it here very easily. Let us say I have a matrix A with eigenvalue λ and
eigenvector X, so I write AX = λX. Now I pre-multiply this equation by P^-1 and insert
PP^-1, which is the identity, between A and X: P^-1 A (PP^-1) X = λ P^-1 X, that is,
(P^-1AP)(P^-1X) = λ(P^-1X), taking out the scalar λ. So this tells me that if X is an
eigenvector of A corresponding to eigenvalue λ, then the eigenvector corresponding to the
same eigenvalue λ of the similar matrix P^-1AP is P^-1X. Hence A as well as P^-1AP have
the same eigenvalues.
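A quick numerical check of this fact; the matrices below are arbitrary random stand-ins, with P assumed invertible (a generic random matrix almost surely is):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))  # generic random matrix, assumed invertible

B = np.linalg.inv(P) @ A @ P     # the similar matrix P^-1 A P

# Same spectrum: every eigenvalue of A appears among the eigenvalues of B.
eig_A = np.linalg.eigvals(A)
eig_B = np.linalg.eigvals(B)
print(all(np.min(np.abs(eig_B - lam)) < 1e-8 for lam in eig_A))  # True

# If X is an eigenvector of A for lam, then P^-1 X is one of B for lam.
lam, V = np.linalg.eig(A)
y = np.linalg.inv(P) @ V[:, 0]
print(np.allclose(B @ y, lam[0] * y))  # True
```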
Why do we need similarity transformations? Especially when we are talking about
eigenvalues and eigenvectors, the use of similarity transformations aims at reducing the
complexity of the problem of evaluating the eigenvalues of a matrix. For example, suppose a
10x10 matrix, a square matrix of order 10, is given to you and you are asked to find the
eigenvalues of A. Since the order is 10x10, the characteristic polynomial of that matrix will
be of degree 10, and it is quite difficult to calculate the eigenvalues of such a matrix
manually, unless the matrix is either a diagonal matrix or a triangular matrix, upper
triangular or lower triangular.
So what do we need to do? Our aim is to find some similarity transformation such that, for a
given general matrix, we can apply it and convert the matrix into either a diagonal matrix or
a triangular matrix, so that it becomes easy to find the eigenvalues of such a matrix. For
example, let A be any general matrix and let U be a unitary matrix; as you know, a matrix U
is said to be unitary if U^-1 = U^H, its conjugate transpose. Then U^-1AU = U^H A U
becomes a triangular matrix. This particular result, that for a given matrix A there exists a
unitary matrix U such that U^H A U is triangular, is called Schur's lemma.
Now we have several applications of the Schur decomposition lemma. Every Hermitian
matrix is unitarily similar to a real diagonal matrix, and hence we can say Hermitian matrices
have real eigenvalues, because on the diagonal we will have real entries. When A is
Hermitian, the Schur decomposition of A is diagonal. A matrix A from the nxn matrices with
complex entries is normal if and only if it is unitarily similar to a diagonal matrix.
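A small check of this consequence with NumPy: `numpy.linalg.eigh` performs exactly this unitary diagonalization for a Hermitian matrix, returning real eigenvalues and a unitary eigenvector matrix.

```python
import numpy as np

# A small Hermitian matrix: A equals its own conjugate transpose.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
print(np.allclose(A, A.conj().T))  # True: Hermitian

# eigh returns real eigenvalues w and a unitary matrix U of eigenvectors,
# so U^H A U is a real diagonal matrix, as Schur's lemma predicts.
w, U = np.linalg.eigh(A)
print(w)  # [1. 4.]: the eigenvalues are real
print(np.allclose(U.conj().T @ A @ U, np.diag(w)))  # True
```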
Moreover, from this particular lemma we can say: let A and B be two normal and commuting
matrices; then the generic eigenvalue μi of A + B is given by the sum λi + ψi, where λi and
ψi are the eigenvalues of A and B associated with the same eigenvector. Hence, if you have
the eigenvalues of A and B separately, you can find the eigenvalues of A + B.
Now we can have other variants of similarity transformations; one of them is called the
Jordan canonical form. So what is that? Let A be any square matrix. Then there exists a
nonsingular matrix X which transforms A into a block diagonal matrix J such that
X^-1 AX = J, where J is a block diagonal matrix whose blocks are called the Jordan blocks
corresponding to matrix A, and J is called the Jordan canonical form of A.
How do we decide the number and sizes of the blocks? For example, if an eigenvalue is
repeated 5 times for a matrix and it has only 3 linearly independent eigenvectors, then this
particular matrix is similar to a block diagonal matrix having 3 Jordan blocks whose total
size is 5. So if I decompose 5 into 3 terms, it may be 2+2+1, that is, one Jordan block of size
2, another Jordan block of size 2 and the third Jordan block of size 1, or it may be 3+1+1,
and so on.
So if an eigenvalue is repeated, the size of the corresponding Jordan block is greater than or
equal to 1. Therefore, the Jordan form tells us that a matrix can be diagonalized by a
similarity transformation if and only if it is non-defective; for this reason the non-defective
matrices are called diagonalizable. In particular, normal matrices are diagonalizable.
So let us take a beautiful example of the Jordan canonical form. Here I have this matrix as
my matrix A: it is a 4x4 matrix, the first row is (2, 0, 1, -3), (0, 2, 10, 4) is the second row,
the third row is (0, 0, 2, 0) and the fourth row is (0, 0, 0, 3). As you can notice, the matrix is
an upper triangular matrix, and hence the eigenvalues of this matrix are 2, 2, 2 and 3;
however, we do not yet know what the Jordan blocks corresponding to this matrix will be. If
I find the modal matrix P and then calculate P^-1AP, that is the matrix similar to A in Jordan
form. So A is this upper triangular matrix, and P^-1AP comes out as my matrix J, which is
the Jordan canonical form of this matrix A. In this Jordan canonical form, you can see a 1x1
block corresponding to the eigenvalue 2.
Then there is a 2x2 block corresponding to the eigenvalue 2 and a 1x1 block corresponding
to the eigenvalue 3. Since the algebraic multiplicity of 3 is 1, we are sure that there will be
only one Jordan block for 3, because there will be only one linearly independent eigenvector
for it. If you find the eigenvectors of this matrix corresponding to λ = 2, the number of
linearly independent eigenvectors comes out as 2. So there will be a total of 2 Jordan blocks
for λ = 2, one Jordan block of size 1 and another of size 2, because we have to write the
algebraic multiplicity 3 as the sum of 2 numbers, and obviously that will be 1 and 2. So this
is the Jordan canonical form of this matrix A, and hence these two matrices are similar
matrices and this transformation is a similarity transformation.
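If SymPy is available, its `Matrix.jordan_form` computes exactly this decomposition; a sketch for the 4x4 matrix of the example:

```python
from sympy import Matrix

# The 4x4 upper triangular matrix from the example.
A = Matrix([[2, 0, 1, -3],
            [0, 2, 10, 4],
            [0, 0, 2, 0],
            [0, 0, 0, 3]])

# jordan_form returns the transforming matrix P and the Jordan matrix J
# with A = P * J * P^-1.
P, J = A.jordan_form()
print(J)  # diagonal 2, 2, 2, 3 with a single 1 on a superdiagonal
print(A == P * J * P.inv())  # True
```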
(Refer Slide Time: 12:57)
So how can we write the Jordan canonical form of a matrix? Suppose I have a 7x7 matrix A
which has eigenvalues 2, 2, 2, 3, 3, 5, 5. So here the algebraic multiplicity of 2 is 3, the
algebraic multiplicity of 3 is 2 and the algebraic multiplicity of 5 is 2. Suppose we have only
2 linearly independent eigenvectors corresponding to λ = 2, that is, the geometric
multiplicity of 2 is 2, and only 1 linearly independent eigenvector corresponding to 3 and 1
linearly independent eigenvector corresponding to λ = 5. So all the eigenvalues are repeated
here.
Now what will be the Jordan canonical form of this particular matrix? I need to find an
invertible matrix X such that J = X^-1AX. Here the algebraic multiplicity of λ = 2 is 3, so a
total block of size 3x3 is reserved in J for this eigenvalue, out of which I have only 2 linearly
independent eigenvectors. It means I will have only 2 Jordan blocks for it. So how can I
decompose 3 into 2 parts? It will be 2+1 or 1+2; 2+1 means one Jordan block of size 2 and
another Jordan block of size 1.
(Refer Slide Time: 15:39)
Similarly, continuing with the singular value decomposition A = USV^T of an mxn matrix
A, if m is greater than n then the top nxn block of S will be a diagonal matrix and m - n rows
of zeros will be appended at the bottom of this matrix. The diagonal entries σ1, σ2, ... are
called the singular values of the matrix A, and they are the square roots of the eigenvalues of
AA^T or A^T A. The eigenvectors of AA^T will be the columns of U and the eigenvectors
of A^T A will be the columns of V, and in this way we can achieve this decomposition.
(Refer Slide Time: 19:13)
So this is an example of singular value decomposition. Here I have a 2x3 matrix whose first
row is [3, 2, 2] and second row is [2, 3, -2], and this is the singular value decomposition of
matrix A into U, S and V^T. Geometrically I can describe it like this: I have a circle which is
transformed to an ellipse by a transformation M, which is a matrix. In terms of the singular
value decomposition it works like this: first I apply V* on it, that is V^T. What will happen?
It will rotate; the orientation will change, because it is an orthogonal matrix and hence a
rotation matrix. Then I apply S, which is a diagonal matrix; it will change the scale and
deform this particular shape. So the circle will become an ellipse.
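This decomposition for the 2x3 example can be sketched with `numpy.linalg.svd`:

```python
import numpy as np

A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])

U, s, Vt = np.linalg.svd(A)
print(s)  # [5. 3.]: the singular values, largest first

# Singular values are the square roots of the eigenvalues of A A^T.
eig = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
print(np.allclose(s, np.sqrt(eig)))  # True

# U, S (diag(s) padded to 2x3) and Vt multiply back to A.
S = np.zeros_like(A)
S[:2, :2] = np.diag(s)
print(np.allclose(U @ S @ Vt, A))  # True
```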
So geometrically it is a sequence of 3 stages: 2 rotations and one deformation. So here we
have seen some similarity transformations, and with them we can transform a given matrix
and say that this matrix is similar to some diagonal matrix or triangular matrix; hence it is a
similarity transformation. Both the original matrix and the diagonal matrix will have the
same spectrum and hence the same eigenvalues, and it is easy to find the eigenvalues of a
diagonal matrix. We will see some methods based on these similarity transformations later in
this module; however, before that let me introduce a very beautiful result proposed around
1930 by the Soviet mathematician S. A. Gershgorin, called the Gershgorin disc theorem or
Gershgorin circle theorem.
So basically this theorem gives us a bound on the eigenvalues. Suppose I give you a 4x4
matrix and say, okay, tell me the eigenvalues of this matrix. Just by looking at the matrix I
cannot say much about the eigenvalues; one thing I can do is look at the diagonal elements
and find the trace, and say, okay, the trace is 5, so the sum of the eigenvalues is 5. But even
knowing the trace is 5 we cannot say anything about the individual eigenvalues. For a 4x4
matrix it may happen that two of the eigenvalues are quite large, say one is 100 and another
is 105, and the remaining two are -100 and -100, so that the sum is 5 and the trace is 5; or the
eigenvalues may be (0, 0, 1, 4), or (0, 0, 0, 5). So I cannot get any idea or any guess about
the eigenvalues just by having the trace. How to get some idea of the eigenvalues just by
looking at the matrix is what this particular theorem tells us.
So this theorem states that every eigenvalue λ of a square matrix A of order n satisfies the
inequality |λ - aii| ≤ Σ_{j≠i} |aij| for some i; here aii is the diagonal element in the ith row,
and the right-hand side is the sum of the absolute values of all the elements in that ith row
except the diagonal element.
So, how to use it? Suppose I have a 4x4 matrix. From the first row the Gershgorin theorem
gives |λ - a11| ≤ |a12| + |a13| + |a14|, the absolute sum of the rest of the entries of the first
row. Since eigenvalues come from the field of complex numbers, this particular inequality
gives me a disc in the complex plane with center at a11 and radius equal to this sum.
Similarly, the second row gives me another disc, and I get one more from the third row and
the last one from the fourth row. Hence, the Gershgorin theorem tells us that all the
eigenvalues lie in the union of all these Gershgorin discs corresponding to that particular
matrix, okay.
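The row discs described above can be computed directly; a minimal sketch (the helper name `gershgorin_discs` is my own, not from the lecture), demonstrated on the 2x2 matrix used in the next example:

```python
import numpy as np

def gershgorin_discs(A):
    """Return (center, radius) for each row of a square matrix."""
    A = np.asarray(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return list(zip(np.diag(A), radii))

A = np.array([[1.0, 2.0],
              [1.0, -1.0]])
discs = gershgorin_discs(A)
print(discs)  # [(1.0, 2.0), (-1.0, 1.0)]

# Every eigenvalue lies in the union of the discs.
for lam in np.linalg.eigvals(A):
    print(any(abs(lam - c) <= r + 1e-12 for c, r in discs))  # True, True
```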
(Refer Slide Time: 26:04)
So let us take some examples of Gershgorin discs and check whether the eigenvalues indeed
lie inside the union of the discs. Take a 2x2 matrix A given as [1, 2; 1, -1]. If I find the
eigenvalues of this matrix, the characteristic polynomial is (λ - 1)(λ + 1) - 2 = 0, that is,
λ² - 3 = 0, so λ = ±√3. Now let me plot the discs of this matrix according to the Gershgorin
theorem in the complex plane: the real axis is marked 1, 2, 3, 4 and -1, -2, -3, -4, and the
imaginary axis is marked i, 2i, 3i and -i, -2i, -3i.
Now, from the first row the first disc is |λ - 1| ≤ 2; it means the center is at 1 and the radius
is 2, so it is this disc, okay, and this is its center. The other disc is |λ + 1| ≤ 1, so the center
of the second disc is at -1 and the radius is 1, and it will look like this. Hence, this region is
the union of these two discs. Now the eigenvalue √3 comes somewhere here and -√3 comes
somewhere here. So these are the 2 eigenvalues, and we can see that the eigenvalues lie in
the union of the Gershgorin discs. Here the eigenvalues are real.
(Refer Slide Time: 29:22)
Let us take another example, where the eigenvalues are imaginary. Again, for the sake of
simplicity, let me take a simple 2x2 matrix, [1, -1; 2, -1]. The eigenvalues of this matrix, if
we calculate them from the characteristic polynomial, are i and -i. Now the first row gives
me the Gershgorin disc |λ - 1| ≤ 1, and the other one gives |λ + 1| ≤ 2. So the first disc has
center at 1 and radius 1. The second disc has center at -1, that is this point, and radius 2, so it
passes through here and then here. This is the other Gershgorin disc.
Now check the eigenvalues: they are i and -i, and here you can see both eigenvalues are in
the union of the Gershgorin discs. Basically, both eigenvalues are coming in the second disc;
they are not coming in the first one. So the claim that the eigenvalues lie in the union of the
Gershgorin discs is correct; but if someone says that each eigenvalue lies in its own
respective Gershgorin disc, that is not true. Now, how do we prove this theorem?
(Refer Slide Time: 31:30)
The proof is quite easy. Let us say I have a square nxn matrix A with eigenvector x
corresponding to eigenvalue λ, and let xi be the largest component of x in absolute value: the
vector x has n components, and out of those n, the ith is the largest. Now look at the ith row
of the relation Ax = λx, that is, the ith of the n equations it contains:
Σ_{j=1}^{n} aij xj = λ xi.
Taking the j = i term to the other side gives
Σ_{j≠i} aij xj = (λ - aii) xi.
Dividing by xi and taking absolute values,
|λ - aii| ≤ Σ_{j≠i} |aij| |xj/xi|.
Now, as I told you in the beginning, the ith entry is the largest one, so |xj/xi| is always less
than or equal to 1. If I replace it by 1, the bound becomes |λ - aii| ≤ Σ_{j≠i} |aij|, and this is
our Gershgorin disc. This is the proof, a very simple proof, coming just from the definition
of eigenvalues and eigenvectors.
(Refer Slide Time: 35:17)
Let us see some applications of this theorem. Just look at this 4x4 matrix: can you tell me,
just by looking at it, whether it is invertible or not? You could tell me if you knew the
determinant of this matrix or the eigenvalues of this matrix. However, look here: if I apply
the Gershgorin theorem to this particular problem, the first disc is |λ - 2| ≤ 3, the second is
|λ + 3| ≤ 3, the third one is |λ + 5| ≤ 2 and the last one is |λ - 4| ≤ 3. From here we cannot
say anything, but we can also apply the Gershgorin circle theorem to the columns of A,
because the eigenvalues of A and A^T are equal and hence the theorem holds for columns
also. From the columns it is quite clear that in each column the inequality is strict, which
means none of the discs contains the origin; hence the union of all the discs does not contain
the origin, 0 cannot be an eigenvalue of this matrix, and it is an invertible matrix. You can
make the same kind of analysis for these two examples also: again from the columns we can
see that each is invertible.
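This invertibility test can be sketched as follows; the 3x3 matrix here is a hypothetical stand-in (the slide's 4x4 entries are not reproduced in the transcript), chosen so that its column discs exclude the origin while its row discs do not all do so:

```python
import numpy as np

def nonsingular_by_gershgorin(A):
    """True if the matrix is strictly diagonally dominant by rows or by
    columns, so no Gershgorin disc (row or column version) contains 0."""
    A = np.asarray(A)
    d = np.abs(np.diag(A))
    row_radii = np.abs(A).sum(axis=1) - d
    col_radii = np.abs(A).sum(axis=0) - d
    return bool(np.all(d > row_radii) or np.all(d > col_radii))

# Hypothetical example: dominant by columns but not by rows.
A = np.array([[4.0, 5.0, 0.0],
              [1.0, 9.0, 1.0],
              [1.0, 2.0, 3.0]])
print(nonsingular_by_gershgorin(A))   # True: 0 cannot be an eigenvalue
print(abs(np.linalg.det(A)) > 1e-9)   # True: indeed invertible
```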
Here, in contrast, we cannot conclude anything from the columns or from the rows, due to
the second row or second column: the disc corresponding to the second row touches the
origin, and since it touches the origin, it may happen that 0 is an eigenvalue. So what to do in
this case? Here we do not get any result from the Gershgorin theorem; we cannot talk about
the invertibility. But with these 2 examples I have shown you how to apply the Gershgorin
theorem just to check whether a matrix is invertible or not. So in this lecture we learned
about the Gershgorin theorem and we saw some examples of similarity transformations.
Thank you.
Numerical Methods
Professor Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 13
Jacobi’s Method for Computing Eigenvalues
Hello everyone, so welcome to the third lecture of this module. In this lecture we will learn a
method for computing eigenvalues, and here the method will be a numerical method, because
so far you have learned how to calculate eigenvalues as the roots of the characteristic
polynomial. So what will we do here? We will apply a sequence of similarity transformations
to the given matrix such that the matrix converts into a diagonal matrix, and from the
diagonal matrix we can read the eigenvalues directly as the diagonal elements.
Furthermore, the sequence will also contain the information about the eigenvectors of the
matrix. This method is called the Jacobi method, and it gives a guarantee of finding the
eigenvalues as well as the eigenvectors for real symmetric matrices. As I told you, it is based
on a sequence of similarity transformations, and those transformations will be based on
rotation matrices, or Givens rotations: we apply the rotation matrices as similarity
transformations to the given matrix in such a way that all the off-diagonal elements become
zero after a series of transformations. Here, making the off-diagonal elements zero must not
change the eigenvalues of the matrix, and that is why I keep saying similarity
transformations, because two similar matrices have the same spectrum.
So first of all, we should have the idea of a rotation matrix. What do we mean by a rotation
matrix? A 2x2 rotation matrix in a plane is given by [(cos θ, sin θ), (-sin θ, cos θ)]. It is an
orthogonal matrix, as you can check, having determinant 1 and with its transpose equal to its
inverse, and so on. This is the rotation in a plane by an angle θ. If I talk about a rotation in
space, then first of all we should decide whether we want to rotate about the X axis, the Y
axis or the Z axis, and secondly, by which angle. If I want a rotation about the X axis by an
angle φ1, the rotation matrix will be [(1, 0, 0), (0, cos φ1, sin φ1), (0, -sin φ1, cos φ1)]. If I
want to make the rotation about the Y axis by an angle φ2, the rotation matrix will be
[(cos φ2, 0, sin φ2), (0, 1, 0), (-sin φ2, 0, cos φ2)], and if I want to make it about the Z axis
by an angle φ3, the rotation matrix will become [(cos φ3, sin φ3, 0), (-sin φ3, cos φ3, 0),
(0, 0, 1)]: the third row and the third column become (0, 0, 1) and the cos φ3 and sin φ3
terms occupy the remaining 2x2 block.
(Refer Slide Time: 5:31)
So these are some examples of rotation matrices in the 2D plane and in 3D space. But
suppose an nxn matrix is given to us; then we define an nxn rotation matrix, denoted
J(p, q, θ), where we want to put the cos and sin terms in the pth and qth rows and columns. It
has the following form: apart from rows and columns p and q, it looks like the identity
matrix, with rows like (1, 0, 0, ...), (0, 1, 0, ...) and so on, while in those two rows we have
the terms cos θ and sin θ: at the intersection of the pth row with the pth column it is cos θ,
and at the intersection of the pth row with the qth column it is sin θ. Similarly, at the
intersection of the qth row with the pth column it is -sin θ, and at the intersection of the qth
row with the qth column it is cos θ.
So let us denote cos θ by C and sin θ by S. Then the matrix J(p, q, θ) is known as a Jacobi
rotation matrix or Givens rotation matrix. The matrix J(p, q, θ) is applied to a symmetric
matrix A as a similarity transformation; once applied to A, it rotates rows and columns p and
q of A through an angle θ so that the (p, q) and (q, p) entries become zero. These are two
off-diagonal entries, and since I told you this method is applicable only for symmetric
matrices, the (p, q) entry equals the (q, p) entry.
So let us write the similarity transformation like this: we pre-multiply by J^T and
post-multiply by J, and I get my next matrix A` = J^T A J, where J is a rotation matrix for p
and q by an angle θ. Let off(A) and off(A`) be the square roots of the sums of squares of all
off-diagonal elements of A and A` respectively. Then off(A)² = ||A||²_F - Σ_{i=1}^{n} aii²,
where ||A||_F is the Frobenius norm: we take the squares of all the elements and subtract the
squares of the diagonal elements. Since the Frobenius norm is invariant under orthogonal
transformations, and only rows and columns p and q are altered in A`, we can write the sum
of squares of the off-diagonal elements of A` as off(A`)² = ||A`||²_F - Σ a`ii², the Frobenius
norm squared minus the squares of the diagonal elements of A`.
This equals ||A||²_F - Σ_{i≠p,q} aii² - (a`pp² + a`qq²), because there is a change only in the
pth and qth diagonal elements. Since a`pp² + a`qq² = app² + aqq² + 2apq² once the (p, q)
entry is zeroed, it finally comes out that off(A`)² = off(A)² - 2apq², which is less than the
sum of squares of the off-diagonal elements of A. It means the off-diagonal elements are
going towards zero, and this is the basic motivation for the Jacobi method.
So this shows that the size of the off-diagonal part decreases by applying a Jacobi
transformation. The post-multiplication of A by J1 changes columns p and q, and in the same
way the pre-multiplication by J1^T brings changes in rows p and q. Hence, the
transformation A` = J1^T A J1 alters only rows p and q and columns p and q of A, and there
is no change in the rest of the rows and columns.
If we see this transformation in the nxn setup, what is the relation between the matrices A
and A`? The elements a`jk of the matrix A` are given by the following formulas. When
j ≠ p and j ≠ q,
a`jp = C ajp - S ajq (please note that here C is cos θ and S is sin θ),
a`jq = S ajp + C ajq.
The diagonal entry in the pth row is given by
a`pp = C² app + S² aqq - 2CS apq,
and similarly the diagonal entry in the qth row is given by
a`qq = S² app + C² aqq + 2CS apq.
The element at the intersection of the pth row and qth column (or qth row and pth column) is
a`pq = (C² - S²) apq + CS(app - aqq),
and the rest of the elements we can find by symmetry. But please note this last element: it is
the off-diagonal element in the pth row and qth column (or qth row and pth column), and we
want to make it zero.
So if I make it zero, I can write 0 = (C² - S²) apq + CS(app - aqq); call this equation 2. The
goal of every step of the Jacobi iteration is to make the off-diagonal elements a`pq = 0 and
a`qp = 0. Now define φ = cot 2θ = (C² - S²)/(2CS), since cos 2θ = C² - S² and
sin 2θ = 2 cos θ sin θ = 2CS; call this equation 1. So this is φ.
Now from equations 1 and 2, I can write (C² - S²)/(CS) = (aqq - app)/apq, so
φ = cot 2θ = (aqq - app)/(2apq). Hence tan 2θ = 2apq/(aqq - app) and
θ = ½ tan⁻¹[2apq/(aqq - app)]; however, less round-off error is generated if we use tan θ in
the computations. So let us assume t = tan θ, which is sin θ/cos θ. From equation 1, dividing
the numerator and denominator by C², φ = (1 - S²/C²)/(2S/C) = (1 - t²)/(2t), because S/C is
t, and this gives the quadratic equation t² + 2tφ - 1 = 0.
(Refer Slide Time: 14:12)
The roots of this quadratic equation are t = -φ ± (φ² + 1)^(1/2), and the numerically
preferred root can be written as t = sign(φ)/(|φ| + (φ² + 1)^(1/2)). Here sign(φ) is 1 when φ
is non-negative and -1 when φ is negative. Thus, once we get t, we can calculate C and S by
the formulas C = 1/(t² + 1)^(1/2) and S = Ct.
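These formulas can be implemented directly; a minimal sketch (the helper name `jacobi_cs` is my own, not from the lecture):

```python
import numpy as np

def jacobi_cs(app, aqq, apq):
    """cos(theta), sin(theta) of the Jacobi rotation zeroing the (p, q)
    entry of a symmetric matrix, via the stable tan(theta) root."""
    if apq == 0.0:
        return 1.0, 0.0               # entry already zero: no rotation needed
    phi = (aqq - app) / (2.0 * apq)   # phi = cot(2*theta)
    if phi == 0.0:
        t = 1.0                       # theta = pi/4 when app == aqq
    else:
        t = np.sign(phi) / (abs(phi) + np.sqrt(phi * phi + 1.0))
    c = 1.0 / np.sqrt(t * t + 1.0)
    return c, c * t

# Check that the rotation really zeroes the off-diagonal entry a'_pq.
app, aqq, apq = 1.0, 1.0, 2.0         # values from the lecture's example
c, s = jacobi_cs(app, aqq, apq)
new_apq = (c * c - s * s) * apq + c * s * (app - aqq)
print(abs(new_apq) < 1e-12)           # True
```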
Let us take an example of this method. Consider this 3x3 matrix with rows (1, √2, 2),
(√2, 3, √2) and (2, √2, 1), and let us apply the Jacobi method to it.
So let us take this 3x3 matrix and find its eigenvalues as well as its eigenvectors using the
Jacobi method. The matrix is (1, √2, 2), (√2, 3, √2), (2, √2, 1); it is a real symmetric matrix.
First of all, in the Jacobi method we look for the off-diagonal element having the maximum
absolute value, because we will perform a similarity transformation to make it 0. Looking at
the off-diagonal elements, the biggest one is a31 = a13 = 2. It means my p is 1 and q is 3, so
app is 1, aqq is 1 and apq = aqp = 2.
So if I calculate θ, it will be ½ tan⁻¹[2apq/(aqq - app)], and it comes out π/4: the
denominator is 0 here, so tan 2θ = ∞, 2θ = π/2 and hence θ = π/2 · ½ = π/4. This is one way
of calculating θ; we can also calculate it using the quadratic equation, first computing φ, then
t, and from t directly cos θ and sin θ, as I told you. Basically that is more accurate compared
to this one.
Now define the rotation matrix J1, with cos π/4 = sin π/4 = 1/√2 in the p = 1, q = 3
positions:
J1 = [(1/√2, 0, -1/√2), (0, 1, 0), (1/√2, 0, 1/√2)].
Calculate A2 = J1^T A J1, and it comes out as [3, 2, 0; 2, 3, 0; 0, 0, -1]. Please compare A
and A2: these two elements, a13 and a31, have been made 0, okay, just by applying this
Jacobi rotation.
Now, to make this matrix a diagonal matrix, I need to make the remaining two off-diagonal
elements zero as well. So again I do the same: now my p is 1 and q is 2, with
a12 = a21 = apq = aqp = 2 and app = aqq = 3, the two diagonal elements. If I calculate θ
again, θ = ½ tan⁻¹[2apq/(aqq - app)] = ½ tan⁻¹(4/0), so it is again π/4, and my next rotation
matrix is
J2 = [(1/√2, -1/√2, 0), (1/√2, 1/√2, 0), (0, 0, 1)].
If I apply this similarity transformation to the matrix A2 obtained in the previous step, I get
the matrix A3 = [(5, 0, 0), (0, 1, 0), (0, 0, -1)]. Please look at A3: it is a diagonal matrix, and
hence the eigenvalues of A3 are 5, 1 and -1, and these are the eigenvalues of A as well,
because I obtained A3 just by applying the two similarity transformations to A. Moreover,
the eigenvectors of the matrix A are given by the columns of the product J1.J2, and I will tell
you why shortly. So I got the eigenvalues and the eigenvectors, and hence I am able to solve
this particular problem of finding eigenvalues and eigenvectors using the Jacobi method.
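The two rotations of this example can be checked numerically; a sketch in NumPy (with the sign convention that reproduces the intermediate matrix [3, 2, 0; 2, 3, 0; 0, 0, -1] stated above):

```python
import numpy as np

A = np.array([[1.0, np.sqrt(2), 2.0],
              [np.sqrt(2), 3.0, np.sqrt(2)],
              [2.0, np.sqrt(2), 1.0]])

r = 1.0 / np.sqrt(2.0)  # cos(pi/4) = sin(pi/4)

# First rotation acts on rows/columns p = 1, q = 3 (1-based indices).
J1 = np.array([[r, 0.0, -r],
               [0.0, 1.0, 0.0],
               [r, 0.0, r]])
A2 = J1.T @ A @ J1      # [[3, 2, 0], [2, 3, 0], [0, 0, -1]]

# Second rotation acts on rows/columns p = 1, q = 2.
J2 = np.array([[r, -r, 0.0],
               [r, r, 0.0],
               [0.0, 0.0, 1.0]])
A3 = J2.T @ A2 @ J2
print(np.round(np.diag(A3), 10))  # [ 5.  1. -1.]: the eigenvalues

# The columns of J1 @ J2 are the corresponding eigenvectors.
V = J1 @ J2
for k in range(3):
    print(np.allclose(A @ V[:, k], A3[k, k] * V[:, k]))  # True (3 times)
```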
Now let me tell you a few things about what is actually happening in this method. Let me
write A2 here, which I obtained by applying J1^T A J1 to A: it was [3, 2, 0; 2, 3, 0;
0, 0, -1]. If you remember, in the previous lecture we talked about the Gershgorin theorem,
so let us see how the Gershgorin circles change in the different iterations of the Jacobi
method.
For the original matrix A, the first disc has center at 1 and radius 2 + √2, so approximately
3.4. From the second row, the center is at 3 and the radius is 2√2, approximately 2.8. The
third row again gives center 1 and radius 2 + √2, which is just the first disc again. As I told
you, the eigenvalues are 5, 1 and -1: one eigenvalue at 5, one at 1 and another at -1.
So the Gershgorin theorem holds for this matrix; however, here one of the eigenvalues lies in the common area of two discs, so the Gershgorin discs are not disjoint. Now take the matrix A2: the first row gives center at 3 and radius 2; the second row again gives center at 3 and radius 2, which is the same as the first one; and the third row gives center at -1 and radius 0. So in the second matrix, which I got in the first iteration of the Jacobi method, two of the discs overlap each other and one is disjoint.
Now see the third matrix, that is A3. Here, since A3 is diagonal, every disc has radius 0: one eigenvalue is here at 5, another one at 1 and the third one at -1, and these are the Gershgorin discs for the given problem. So basically, by applying the Jacobi rotations to the given matrix, I am shrinking the Gershgorin discs in such a way that they become disjoint or converge to single points, which is the diagonal matrix we are getting here.
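The disc centers and radii used above come straight from the rows of the matrix; a small helper (the name `gershgorin_discs` is a hypothetical choice) makes this concrete for the intermediate matrix A2 stated in the lecture.

```python
import numpy as np

def gershgorin_discs(A):
    """Row i of A gives one disc: center a_ii, radius the sum of the
    absolute values of the off-diagonal entries in that row."""
    A = np.asarray(A, dtype=float)
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return list(zip(centers, radii))

# the intermediate matrix A2 from the first Jacobi iteration
A2 = np.array([[3.0, 2.0, 0.0],
               [2.0, 3.0, 0.0],
               [0.0, 0.0, -1.0]])
discs = gershgorin_discs(A2)
# centers and radii: (3, 2), (3, 2), (-1, 0)
```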
Another thing I want to tell you here is why I am saying that the product J1·J2 and so on will give me the eigenvectors. What is happening: let us say A1 is the diagonal matrix whose diagonal entries are the eigenvalues of A, and this I am getting just by applying the Jacobi rotations to A, that is A1 = Jᵀ A J, where J is the product of the rotation matrices. Now as you know, J is an orthogonal matrix, it is a rotation matrix, so Jᵀ will be equal to J⁻¹.
So what happens if I multiply both sides of A1 = Jᵀ A J by J on the left? Since J Jᵀ is the identity, it becomes A J = J A1. So let us say x1, x2, …, xn are the eigenvectors of A and, as I told you, they will be the columns of the matrix J: x1 is the first column of J, x2 is the second column of J, and xn is the nth column of J.
Now look here: A1 is a diagonal matrix having the eigenvalues of A, so it will be diag(λ1, λ2, …, λn), and J = [x1, x2, …, xn]. So writing A J = J A1 column by column, from the first column I get A x1 = λ1 x1, which gives the first eigenvalue and corresponding eigenvector of A; from the second, A x2 = λ2 x2; and so on up to A xn = λn xn. Hence this matrix J has the eigenvectors of A as its columns, which is what I claimed earlier, as you can see from here.
So in this lecture we have learned a method for calculating eigenvalues and eigenvectors just by
using the similarity transformations or a series of transformations and those transformations are
formed just using the rotation matrices, in such a way that the off diagonal elements become
zero. In the next lecture we will learn another method for finding the largest eigenvalue and its
corresponding eigenvector for a given matrix and that particular method is called power method.
So thank you very much for this lecture, see you in the next lecture.
Numerical Methods
Professor Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 14
Power Method
Hello everyone, so welcome to the 4th lecture of this module and today we are going to
discuss one more method for finding eigenvalues and eigenvectors of a matrix. So in the last
lecture we have discussed Jacobi method for finding eigenvalues for a symmetric matrix and
there we have used the similarity transformation in such a way that a similar matrix is found
corresponding to a given matrix and that similar matrix is just like a diagonal matrix where
the eigenvalues are the diagonal elements. While the Jacobi method was restricted to symmetric matrices only, today we are going to discuss a method which is applicable to any square matrix; however, again we have some conditions for applying the power method, which we are going to discuss today.
So first of all, let A be a square matrix of order n. The eigenvalues are calculated just by solving the characteristic equation, which is given as λⁿ + cₙ₋₁λⁿ⁻¹ + … + c₁λ + c₀ = 0. So it is a degree-n polynomial in λ, and the roots (zeros) of this polynomial give you the eigenvalues. Let us say the eigenvalues are λ1, λ2, …, λn, where some of them may be equal or repeated. We say that λ1 is a dominant eigenvalue of A if its absolute value is strictly greater than the absolute values of the rest of the eigenvalues, that is |λ1| > |λi| for i = 2, 3,
up to n. If this condition holds, λ1 is the dominant eigenvalue and the corresponding eigenvector is called the dominant eigenvector of A.
So for example, if I take a 2x2 matrix whose characteristic polynomial is λ² + 3λ + 2, it has zeros λ1 = -1 and λ2 = -2. So the eigenvalues of this matrix are -1 and -2. Now we see that the absolute value of -2 is greater than the absolute value of -1. Hence, -2 is the dominant eigenvalue of this matrix, and the corresponding eigenvector is (3, 1).
Now what are the conditions to apply the power method? First of all, a dominant eigenvalue should exist for the matrix
and there should not be any repetition of the dominant eigenvalue. For example, if we have a 3x3 matrix, there should be one dominant eigenvalue which is not equal to the others: if A is 3x3 with eigenvalues -5, 3 and 2, then -5 is clearly the dominant eigenvalue. But if we have eigenvalues like -5, 5 and 4, we cannot apply the power method as such and will not be able to find the eigenvalue, because the dominant eigenvalue is -5 as well as 5, that is, it is repeated in absolute value, so λ1 is not strictly greater than the rest of the eigenvalues in terms of absolute value. The second condition, which is again very important, is that the matrix A should have n linearly independent eigenvectors. It means A should be similar to a diagonal matrix, if we talk in terms of similarity transformations. So with these two conditions, let us derive the power method.
So let A be an nxn matrix having an eigenvalue λ1 which is strictly greater in absolute value than the rest of the eigenvalues. Moreover, we have the eigenvector V1 corresponding to λ1, V2 corresponding to λ2, and Vn corresponding to λn, and here we are assuming that V1, V2, …, Vn are linearly independent. If these vectors are linearly independent, then any vector V from the vector space Rⁿ (or Cⁿ, if the matrix is complex) can be written as a linear combination of these eigenvectors: V = C1V1 + C2V2 + … + CnVn, where C1, C2, …, Cn are scalars.
Now if I multiply this equation by the matrix A, I will get A·V on the left hand side and C1A·V1 + C2A·V2 + … + CnA·Vn on the right. As we know that V1 is an eigenvector corresponding to the eigenvalue λ1 of A, I can write the first term as C1λ1V1; similarly, the remaining terms become C2λ2V2 + … + CnλnVn. If I take λ1 common from the right hand side, I can write AV = λ1[C1V1 + C2(λ2/λ1)V2 + … + Cn(λn/λ1)Vn].
If I multiply one more time by the matrix A, I will get A²V on the left hand side, and the right hand side becomes λ1²[C1V1 + C2(λ2/λ1)²V2 + … + Cn(λn/λ1)²Vn]. If I continue multiplying by A again and again, say k times, it will become AᵏV = λ1ᵏ[C1V1 + C2(λ2/λ1)ᵏV2 + … + Cn(λn/λ1)ᵏVn].
Now look at this equation: in this term we have λ2/λ1; similarly, in the next term we will have λ3/λ1 and so on. Our assumption is that λ1 is the dominant eigenvalue. It means that |λ2/λ1| is less than 1; similarly |λ3/λ1| is less than 1, and so on up to |λn/λ1|, which is again less than 1.
So when k tends to infinity, my AᵏV approaches λ1ᵏC1V1. It means I can find λ1 as the limit, as k → ∞, of the ratio (Aᵏ⁺¹V)r/(AᵏV)r, where r indexes the component of the vector having the highest magnitude. So this will give me the dominant eigenvalue, and since we are using successive powers of A, that is why we call it the power method. From this equation it is also clear that if λ1 is the dominant eigenvalue, then the corresponding eigenvector will be proportional to V1. So with this, I can talk about the convergence of this power method.
The rate of convergence is governed by |λ2/λ1|: if this ratio is much smaller than 1, the method will converge fast; if it is close to 1, the method will converge slowly. As we know, the power method is an iterative method: we start with an initial vector V0, then find V1 = A(V0), then V2 = A(V1), V3 = A(V2) and so on, until it converges.
So there should be some stopping criterion, and the stopping condition is that the change in λ between two successive iterations is less than a given threshold, say 10⁻³ or 10⁻⁵, whatever accuracy you want in your method; or that the maximum component of the difference of the two successive vectors Vk+1 and Vk is less than a given threshold. Moreover, to control the round-off error, the vector is normalized before pre-multiplying by A, so that the largest element remains unity. For example, if you start with (1, 1, 1) and after multiplying with A you get, let us say, the vector (2, 3, 4), then I will normalize it in such a way that the biggest component of this vector, that is 4, becomes 1, so the vector will become (2/4, 3/4, 1).
If we talk about the eigenvector, we start with V0 as the initial vector, and the condition is that this V0 should not be orthogonal to the eigenvector V1, and it should obviously be a non-zero vector, because finally this vector is converging to the eigenvector. So yk+1 = A(Vk); then I find Vk+1 = yk+1/mk+1, where, as I told you, mk+1 is the largest element of yk+1 in magnitude. In this case, λ1 will be the limit, as k → ∞, of (yk+1)r/(Vk)r, and finally Vk+1 will be the required eigenvector corresponding to λ1.
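The iteration yk+1 = A(Vk), Vk+1 = yk+1/mk+1 can be sketched as follows; the 2x2 test matrix is an illustrative stand-in, since the slide's 3x3 example matrix is not reproduced in the transcript.

```python
import numpy as np

def power_method(A, v0, tol=1e-10, max_iter=500):
    """Power method with normalization: scale each iterate so that its
    largest-magnitude component is 1; that scaling factor m_k converges
    to the dominant eigenvalue."""
    v = np.asarray(v0, dtype=float)
    m_old = 0.0
    for _ in range(max_iter):
        y = A @ v                       # y_{k+1} = A v_k
        m = y[np.argmax(np.abs(y))]     # largest component in magnitude
        v = y / m                       # normalize so the max entry is 1
        if abs(m - m_old) < tol:
            break
        m_old = m
    return m, v

# illustrative 2x2 matrix; its eigenvalues are 3 and 1, so 3 is dominant
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_method(A, [1.0, 0.0])
# lam ≈ 3, v ≈ (1, 1)
```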
Let us take an example of this method, just consider this 3x3 matrix and let the initial column
vector be (1, 1, 1).
So first of all I will find V1; V1 will be A·V0, where A is this 3x3 matrix and V0 is the column vector (1, 1, 1). After multiplying, I get another column vector, which is V1 = (2, -1, 0). Now what I will do first of all is obtain y1 from V1: I will see which is the biggest component of this vector in terms of absolute value, and here it is 2. So I will divide V1 by 2, and y1 will become (1, -0.5, 0). So here I can say that in this iteration my eigenvalue estimate is 2 and the eigenvector is (1, -0.5, 0).
Then I will calculate V2. V2 will be A·y1, and when I calculate it, it comes out to be (3.5, -4, 0.5). Again I divide this vector by its largest-magnitude component, -4, so that the corresponding entry becomes 1, and y2 becomes (-0.875, 1, -0.125). So in this iteration the approximation of the eigenvalue is -4. Just look: in the first iteration it was 2, in the second iteration it is coming out as -4, and similarly we are getting a deviation in the eigenvector. In the third iteration, the eigenvalue approximation becomes 6.125 and y3 becomes (-0.918, 1, -0.1837).
There are some disadvantages or limitations of this method. The first one is that if the initial column vector V0 is an eigenvector of A other than the dominant eigenvector, then the method will fail, since the iteration will converge to the wrong eigenvalue. Moreover, the speed of convergence depends on the ratio of the magnitude of the dominant eigenvalue λ1 to the magnitude of the next largest eigenvalue: if |λ2/λ1| is close to 1, the method will converge slowly. Also, the power method gives only one eigenvalue, the dominant one, at a time; I will tell you how we can find the other eigenvalues using this method.
As I told you when stating the assumptions for applying this method, there should be a dominant eigenvalue. If this is not the case, what will happen, will the method converge or not? Let us see it with an example. I take a 2x2 matrix, say (1, 1; 0, -1), and try to find its eigenvalue using the power method, starting with the initial vector V0 = (1, 1). So V1 = A·V0 = (1, 1; 0, -1)·(1, 1), that is (1 + 1, -1) = (2, -1).
Now I calculate V2 = A(V1) = (1, 1; 0, -1)·(2, -1): 2 - 1 becomes 1, and it comes back to (1, 1). If I then calculate V3, it comes out (2, -1) again, V4 comes out (1, 1), and so on. So my method gets stuck between these two vectors: for the odd iterations of V I get (2, -1), and for the even iterations like 2, 4, 6 I get (1, 1), and it will never converge. Why is this happening? It is clear if you look at the eigenvalues of this matrix: it is upper triangular, so its eigenvalues are 1 and -1, and the reason for this oscillation is simply that the matrix does not have a dominant eigenvalue. That is why the condition that the matrix should have a dominant eigenvalue is quite important for applying the power method. Up to now we have seen that using the power method we can calculate only the dominant eigenvalue and the corresponding eigenvector; suppose I want to calculate the other eigenvalues also.
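The oscillation can be reproduced directly with the matrix from this example:

```python
import numpy as np

# the 2x2 matrix from this example: eigenvalues 1 and -1, so there is
# no dominant eigenvalue and the power iteration cannot converge
A = np.array([[1.0, 1.0],
              [0.0, -1.0]])
v = np.array([1.0, 1.0])
iterates = []
for _ in range(6):
    v = A @ v
    iterates.append(v.copy())
# odd iterates are (2, -1), even iterates are (1, 1): pure oscillation
```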
So we can modify the power method in such a way that we shift the dominant eigenvalue to 0 in a new matrix, so that the second dominant eigenvalue becomes the dominant one. For example, if we have eigenvalues λ1, λ2, λ3 and λ1 is dominant, we will shift this λ1 to zero in some other matrix such that λ2 becomes dominant, and then we will apply the power method to this new matrix. This method is called the method of deflation, and it is based on the deflation theorem. So how does it work?
So once you have calculated the dominant eigen pair (λ1, V1) of a matrix A, you want to calculate λ2. Here I will take the case of a symmetric matrix, but it can be generalized to other matrices also. If A is symmetric, then it can be proved that with U1 = V1/|V1|, the matrix A1 = A - λ1 U1 U1ᵀ has eigenvalues 0, λ2, λ3, …, λn, and the eigenvectors of A1 are the same as those of A. This is one of the results of the deflation theorem.
Here we are saying: let A be an nxn matrix having eigenvalues λ1, λ2, …, λn and corresponding eigenvectors V1, V2, …, Vn, and assume λ1 is the dominant eigenvalue with dominant eigenvector V1. Now I am saying that if A is a symmetric matrix, I can calculate a new matrix A1, or let us call it B, which is A - λ1·U·Uᵀ, where U is the unit vector in the direction of V1. Then this matrix B will have the eigenvalues 0, λ2, λ3 up to λn, and the eigenvectors V2, V3, …, Vn will remain the same for this new matrix B. So what we can do: suppose that using the power method on the matrix A, we calculate the dominant eigenvalue and the corresponding eigenvector, that is λ1 and V1.
Once I have calculated these two, I apply this transformation to get the new matrix B, and again I apply the power method on B so that I can calculate the next eigenvalue after the dominant one, that is λ2, and the corresponding eigenvector: the dominant eigenvalue of B is the second dominant eigenvalue of A. And how do we get this result? Suppose A is a symmetric matrix and I form a new matrix B = A - λV·Xᵀ for some vector X. If X is chosen so that XᵀV = 1, then BV = AV - λV(XᵀV) = λV - λV = 0, so B has 0 as one of its eigenvalues whenever λ is an eigenvalue of A with eigenvector V.
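The deflation step for a symmetric matrix can be sketched as below. The 2x2 matrix is an illustrative example chosen to have the dominant eigen pair (9, (-0.5, 1)) quoted in the lecture; it is an assumption of this sketch, not necessarily the matrix on the slide.

```python
import numpy as np

def deflate(A, lam1, v1):
    """Deflation for symmetric A: with u1 the unit vector along the
    dominant eigenvector v1, B = A - lam1 * u1 u1^T keeps the other
    eigenvalues of A and replaces lam1 by 0."""
    u1 = v1 / np.linalg.norm(v1)
    return A - lam1 * np.outer(u1, u1)

# illustrative symmetric matrix with eigenvalues 9 and 4 (assumed example)
A = np.array([[5.0, -2.0],
              [-2.0, 8.0]])
B = deflate(A, 9.0, np.array([-0.5, 1.0]))
# B = 4/5 * [[4, 2], [2, 1]]; its eigenvalues are 0 and 4
```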
Let us check this with an example. Again I am taking a 2x2 matrix, and now my task is to find all the eigenvalues of this matrix using the power method together with the method of deflation. Since it is a 2x2 matrix, there will be only 2 eigenvalues: the dominant one we can calculate using the power method, and for the other one we will use the method of deflation together with the power method.
So let us start with (1, 1). Using the power method, after going up to the 10th iteration I find that the dominant eigenvalue is converging to 9 and the corresponding eigenvector to (-1/2, 1). Hence one eigenvalue of this matrix, the bigger one in terms of absolute value, is 9, with corresponding eigenvector (-0.5, 1).
Now I apply the deflation transformation: the new matrix A1 is given by A1 = A - λ1 U Uᵀ. After applying this, my A1 comes out as 4/5·(4, 2; 2, 1). From the deflation theorem, one eigenvalue of this matrix will be 0, with the corresponding eigenvector V1 = (-0.5, 1), the one which corresponded to 9 for A. So by applying the power method again on this new matrix A1, starting with (1, 1), we get the next vector as 4.8·(1, 1/2), and continuing this way, we will see that
the method is converging to 4, the dominant eigenvalue of A1, with corresponding eigenvector (1, 0.5).
Hence λ2 = 4 is the dominant eigenvalue of A1, but it is the other eigenvalue, that is, the second eigenvalue of A, and the corresponding eigenvector is (1, 0.5). So using the method of deflation with the power method, we can calculate all the eigenvalues of a given matrix. In this lecture we learned how to use the power method for finding the dominant eigenvalue and corresponding eigenvector of a given matrix; later on we saw that if we apply the method of deflation together with the power method, we can also calculate the other, non-dominant eigenvalues of the given matrix. Thank you very much.
Numerical Methods
Professor Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 15
Inverse Power Method
Hello everyone, welcome to the last lecture of this module. In this lecture we are going to introduce another method, the inverse power method, along with the shifted inverse power method. These are variants of the power method for finding the eigenvalues which are not dominant for a given matrix. In the last lecture, we have talked about the power method and we
have seen that using the power method, we can find only the dominant eigenvalue. However,
we have also seen that if we use the method of deflation with the power method, we can compute the eigenvalues other than the dominant one for a given matrix. However, in the method of deflation together with the power method, what do you have to do? First you find the dominant
eigenvalue and eigenvector, then generate a new matrix and then for that matrix again apply
power method, which will give you the next dominant eigenvalue. Then make a new matrix
again apply the power method and so on.
So if I have a 10x10 matrix and suppose I want to find the 5th eigenvalue of this matrix in decreasing order, what do I have to do? For doing this, I need to apply the power method 5 times to a 10x10 matrix, and I need to use the deflation transformation 4 times. Hence it will be very expensive in terms of computational cost. The inverse and shifted inverse power methods give us algorithms for computing eigenvalues and eigenvectors that are not dominant by using the power method process just once.
So basically these methods are based on two principles. The first principle is: if (λ, V) is an eigen pair of a square matrix A of order n, then (λ⁻¹, V), where λ⁻¹ is basically 1/λ, is an eigen pair of the matrix A⁻¹. This can be shown very easily: since (λ, V) is an eigen pair of A, I can write AV = λV. If A is invertible, I can multiply both sides by A⁻¹, so A⁻¹AV = A⁻¹λV; since λ is a scalar, I can take it out, giving V = λA⁻¹V, and dividing both sides by λ, (1/λ)V = A⁻¹V. It means the eigenvalue of A⁻¹ is 1/λ, with the corresponding eigenvector V.
The other result shifts the eigenvalue: if (λ, V) is an eigen pair of a matrix A, then λ - α together with the eigenvector V is an eigen pair of the matrix A - αI for a scalar α, with α ≠ λ. This again we can show: take the new matrix B = A - αI; then BV = (A - αI)V = AV - αV, and since λ is an eigenvalue of A, this equals λV - αV = (λ - α)V. So the eigenvalue of B is λ - α, and its eigenvector V is the same as that of A. With these two results, we will develop the inverse and shifted inverse power methods.
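Both principles are easy to check numerically on a small illustrative matrix (the matrix and shift below are arbitrary choices, not from the lecture):

```python
import numpy as np

# illustrative invertible matrix; its eigenvalues are 5 and 2
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam = np.linalg.eigvals(A)

# principle 1: the eigenvalues of A^-1 are 1/lambda
inv_lam = np.linalg.eigvals(np.linalg.inv(A))

# principle 2: the eigenvalues of A - alpha*I are lambda - alpha
alpha = 1.5
shifted_lam = np.linalg.eigvals(A - alpha * np.eye(2))
```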
So suppose that A has distinct eigenvalues λ1, λ2, up to λn, and consider the eigenvalue λj, which I need to compute. A constant α can be chosen so that μ1 = 1/(λj - α) is the dominant eigenvalue of (A - αI)⁻¹. Furthermore, if we choose V0 carefully, then the sequence of vectors Vk and scalars Ck given by yk = (A - αI)⁻¹Vk and Vk+1 = (1/Ck+1)·yk converges. This is just the power method applied to (A - αI)⁻¹, and finally we can calculate the jth eigenvalue as λj = 1/μ1 + α.
Now what should be the choice of α? We cannot choose α exactly equal to λj; but for μ1 to be the dominant eigenvalue, it should be very large, and for this α should be quite close to λj. For example, if I want to find the eigenvalue 4, α should be somewhere around 4.2 or 3.8, or 4.3, 3.7, like that. The proof of this result can be given very easily: suppose the eigenvalues satisfy λ1 < λ2 < … < λn, and let α be a number such that α ≠ λj but α is very close to λj compared to the other eigenvalues. Then I can write |λj - α| < |λi - α| for the rest of the i, from 1 to j - 1 and from j + 1 to n.
Then, using the result which I derived on the board, I can say that 1/(λj - α) will be an eigenvalue of (A - αI)⁻¹, and the corresponding eigenvector will remain V, the eigenvector of the original A corresponding to λj. Moreover, we can say that |1/(λi - α)| < |1/(λj - α)| for i ≠ j, and hence 1/(λj - α), which is my μ1, will be the dominant eigenvalue of the matrix (A - αI)⁻¹. So how will it work? Suppose I want to find some eigenvalue λj of a given matrix. I will choose α close to this eigenvalue and not close to the rest of the eigenvalues.
Now the question arises: without knowing the eigenvalues, how will we choose α? Each time I am saying that α should be close to λj compared to any other λi; how do we do this without knowing the eigenvalues? This comes from the Gershgorin discs: just by looking at the given matrix, I can say in which disc, or in what range, each eigenvalue will lie, and from there I can get an idea.
So the algorithm is like this: first of all, choose an initial V0 which is non-zero. Then for k = 0, 1, 2, …, find yk = (A - αI)⁻¹Vk; from here, Ck+1 will be the largest component of the vector yk in terms of absolute value, and then you can define Vk+1 = (1/Ck+1)·yk. So the shifted inverse power method with a fixed shift α is nothing but the power method where the matrix A is replaced with the new matrix (A - αI)⁻¹.
The convergence of this algorithm is governed by the ratio |λ1 - α|/|λ2 - α|, where λ1 and λ2 are the closest and the second closest eigenvalues to α. For example, if you have a matrix with eigenvalues 5, 8 and 10 and I choose α = 4, the convergence ratio will be |5 - 4|/|8 - 4| = 1/4. The convergence is linear, and this ratio always lies between 0 and 1.
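The fixed-shift algorithm can be sketched as below; the 3x3 matrix is an illustrative symmetric example (its eigenvalues are 3 - √3, 3, 3 + √3), not the one from the slides, and the linear system is solved rather than the inverse being formed.

```python
import numpy as np

def shifted_inverse_power(A, alpha, v0, tol=1e-12, max_iter=500):
    """Fixed-shift inverse power method: power iteration applied to
    (A - alpha*I)^-1, converging to the eigenvalue of A closest to
    the shift alpha."""
    M = A - alpha * np.eye(A.shape[0])
    v = np.asarray(v0, dtype=float)
    mu = 0.0
    for _ in range(max_iter):
        y = np.linalg.solve(M, v)      # y_k = (A - alpha I)^-1 v_k
        c = y[np.argmax(np.abs(y))]    # largest component in magnitude
        v = y / c
        if abs(c - mu) < tol:
            break
        mu = c
    return 1.0 / c + alpha, v          # recover lambda_j = 1/mu_1 + alpha

# illustrative symmetric matrix; eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lam, v = shifted_inverse_power(A, 4.2, np.ones(3))
# converges to the eigenvalue closest to 4.2, i.e. 3 + sqrt(3) ≈ 4.732
```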
Now we can use the shifted inverse power method with a variable shift also. Variable shift means we update α as well: in the earlier algorithm we had a fixed α, chosen in the beginning, but here we update α to improve the convergence of the method. The algorithm is: take a non-zero vector V0 and an initial shift α0; then compute yk = (A - αkI)⁻¹·Vk, and let Ck+1 be the component of yk with the maximum absolute value. For example, for a 3x3 matrix with yk = [1, -2, -4], Ck+1 will be -4. Then set Vk+1 = (1/Ck+1)·yk, and at the same time update your shift as αk+1 = αk + 1/Ck+1. This method is locally quadratically convergent, that is, it has second order convergence locally.
We can apply the shifted inverse power method with the Rayleigh quotient also: choose an initial vector V0 ≠ 0 of unit length, then compute α0 as the Rayleigh quotient of V0, that is V0ᵀA·V0. Now for k = 0, 1, 2, … compute yk = (A - αkI)⁻¹·Vk, set Vk+1 = yk/‖yk‖, and update αk+1 = Vk+1ᵀA·Vk+1. So in each iteration I am updating my V, and here I am updating my α by the definition of the Rayleigh quotient. This method is cubically convergent in the case of symmetric matrices.
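A sketch of the Rayleigh quotient variant, using the symmetric tridiagonal matrix that appears later in this lecture; the stopping rule on the shift change and the starting vector are assumptions of this sketch.

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, tol=1e-12, max_iter=100):
    """Shifted inverse power method whose shift is refreshed each step
    with the Rayleigh quotient v^T A v; for symmetric matrices the
    convergence is cubic."""
    v = np.asarray(v0, dtype=float)
    v = v / np.linalg.norm(v)          # keep unit length
    alpha = v @ A @ v                  # initial shift alpha_0
    n = A.shape[0]
    for _ in range(max_iter):
        try:
            y = np.linalg.solve(A - alpha * np.eye(n), v)
        except np.linalg.LinAlgError:
            break                      # shift hit an eigenvalue exactly
        v = y / np.linalg.norm(y)
        alpha_new = v @ A @ v
        if abs(alpha_new - alpha) < tol:
            alpha = alpha_new
            break
        alpha = alpha_new
    return alpha, v

# symmetric matrix from the inverse power example later in this lecture
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
lam, v = rayleigh_quotient_iteration(A, [1.0, 1.0, 0.0])
# lam lands on one of the eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2)
```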
Let us take an example of the shifted inverse power method for finding one of the eigenvalues of a 3x3 matrix; here we will use the fixed-α version, that is, the method with a fixed shift. The eigenvalues of this matrix are 4, 2 and 1, the dominant eigenvalue being 4. Suppose I take α = 4.2; then my shifted inverse power method will converge to the eigenvalue 4 and the corresponding eigenvector. So for λ1 = 4, I define the shifted matrix A - 4.2 I and then apply the power method to (A - 4.2 I)⁻¹ with the initial vector (1, 1, 1).
Then we continue in this way until the sequences Ck and Vk converge: y0 is this value, then C1 comes out as -23.18181818, and V1 is this particular vector. After 8 iterations we have μ1 = -5, which is the dominant eigenvalue of (A - 4.2 I)⁻¹, and Vk converges to V1 = [2/5, 3/5, 1]. Hence the eigenvalue is given by 1/μ1 + α, that is -1/5 + 4.2 = 4, which verifies our claim that for a given α, the method converges to the closest eigenvalue.
If I take α = 2.1 and apply the same process, it converges to the eigenvalue 2, with corresponding eigenvector [1/4, 1/2, 1].
So far we were talking about the shifted inverse power method, in which we need to calculate an inverse and so on. Let us take another variant, the plain inverse power method, which we can use for finding the smallest eigenvalue of a given matrix and the corresponding eigenvector. Here we use the result that if (λ, V) is an eigen pair of a matrix A, then (1/λ, V) is an eigen pair of A⁻¹.
The inverse power method has the advantage over the power method that it can approximate the smallest eigenvalue rather than only the dominant one. Consider a non-zero vector y0 in Rⁿ, which can be expressed as a linear combination of the eigenvectors of A; then, applying the power method to A⁻¹, we get zk+1 = A⁻¹yk and yk+1 = zk+1/mk+1.
In this way we get an approximation to the eigenvalue of A⁻¹ that is dominant in modulus, which corresponds to the smallest eigenvalue of A in modulus. However, we do not actually need to form A⁻¹ to find the smallest eigenvalue of A: if you have, say, a 10x10 matrix, computing its inverse is computationally expensive, and I would not prefer it. In principle the inverse power method says calculate A⁻¹, and the dominant eigenvalue of A⁻¹ is the smallest eigenvalue of A; but in practice we do not need to compute A⁻¹ at all.
What can we do? We start with V0 and want to find V1 = A⁻¹·V0. If I multiply both sides by A, I get AV1 = V0. So instead of finding V1 by multiplying the inverse matrix with a column vector, I have a linear system of equations, where V0 and A are known; you can find V1 directly from here without using A⁻¹. In the next step, V2 = A⁻¹·V1, so you solve the system AV2 = V1, and from there you find the next iterate V2 of the inverse power method, and so on. So there is no need to calculate A⁻¹ at any stage; however, we need to solve a linear system of equations at each stage.
So let us take an example of this, with the matrix A = [2, -1, 0; -1, 2, -1; 0, -1, 2]. Here we first do it by calculating A⁻¹, but we can also do it without calculating A⁻¹.
If we do it with A⁻¹, starting with the initial vector [1, 1, 1], we get our first approximation y1 = [1.5, 2, 1.5]ᵀ, and if I divide it by 2, it is [0.75, 1, 0.75]ᵀ. So the first approximation of the eigenvalue of A⁻¹ is 2, with this eigenvector. However, if I do it without finding A⁻¹, the system can be solved with less computational effort. My original matrix is A = [2, -1, 0; -1, 2, -1; 0, -1, 2] and V0 = [1, 1, 1]ᵀ, and I am finding V1 = A⁻¹·V0. Doing this directly, I would need to calculate the inverse of the matrix; but if instead I solve the system A·V1 = V0, my augmented matrix becomes [2, -1, 0; -1, 2, -1; 0, -1, 2 : 1, 1, 1].
And we solve this with Gauss elimination: R2 is replaced by R2 + (1/2)R1, and then R3 is replaced by R3 + (2/3)R2. The augmented matrix becomes [2, -1, 0; 0, 3/2, -1; 0, 0, 4/3 : 1, 3/2, 2], and back substitution gives the vector V1 = [1.5, 2, 1.5]ᵀ.
After 4 iterations we observe that the iteration is converging to μ = 1.71, and λ = 1/μ, that is 0.5848. Since det(A - 0.5848 I) is approximately 0, λ = 0.5848 is the required eigenvalue, the smallest one, and the corresponding eigenvector is [0.707, 1, 0.707]. The exact smallest eigenvalue of A is 2 - √2 ≈ 0.5858, which agrees with what we have computed numerically.
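The whole computation can be sketched without ever forming A⁻¹, using one linear solve per step (numpy's solver stands in for the Gauss elimination done by hand above):

```python
import numpy as np

def inverse_power(A, v0, tol=1e-10, max_iter=500):
    """Inverse power method without forming A^-1: each step solves the
    linear system A y = v_k, then normalizes by the largest-magnitude
    component m; 1/m tends to the smallest eigenvalue of A."""
    v = np.asarray(v0, dtype=float)
    m_old = 0.0
    for _ in range(max_iter):
        y = np.linalg.solve(A, v)       # instead of y = A^-1 v
        m = y[np.argmax(np.abs(y))]
        v = y / m
        if abs(m - m_old) < tol:
            break
        m_old = m
    return 1.0 / m, v

# the matrix from this example
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
lam, v = inverse_power(A, np.ones(3))
# lam converges to 2 - sqrt(2) ≈ 0.5858, eigenvector ≈ (0.707, 1, 0.707)
```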
So in this lecture we have learned two variants of the power method, the shifted inverse power method and the inverse power method, for finding eigenvalues other than the dominant one of a given matrix. This ends module 3 of this course. In this module we have learned various methods for calculating eigenvalues and eigenvectors: we started with the Jacobi method, then we learned the power method, the shifted inverse power method with a fixed shift, with a variable shift, and with the Rayleigh quotient, and finally the classical inverse power method for finding the smallest eigenvalue of a given matrix. In the next lecture, we will talk about interpolation. Till then, goodbye, and thank you very much.
Numerical Methods
Professor Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 16
Interpolation Part 1
Welcome to the lecture series on numerical methods. In the present lecture we will discuss interpolation: first, what interpolation means and where we use it. In the second step we will discuss the different finite difference operators, which are used for approximating a function with polynomials, and in the last section of this lecture I will discuss how the different operators are related to each other.
So what does interpolation mean? Interpolation is nothing but this: if we have a set of data points, then finding a curve that passes exactly through this set of data points is called interpolation. Suppose we have the data points (x0, y0), (x1, y1), (x2, y2), (x3, y3); then if you plot a curve passing through these points, this curve is called the interpolating curve.
This means that the process of finding a curve passing through the points (x0, y0), (x1, y1),
(x2, y2), (x3, y3) is called interpolation, and the obtained curve is called the interpolating
curve. And here, if you see, y0 is nothing but the value of f at the point x0; similarly y1 is
nothing but f(x1), y2 is nothing but f(x2), y3 is nothing but f(x3). This means that if we have
a function y = f(x) and the points x0, x1, …, xn are the set of tabulated points or data points,
then at exactly those points we can determine the functional values.
So if we have these functional values, that is y0 = f(x0), y1 = f(x1), y2 = f(x2), …,
yn = f(xn), then we can plot all these points, and there are various ways we can connect
these points through curves; the best way to fit a curve to these points is called
interpolation here. So before going to interpolation, or how we can approximate these tabular
values by a curve or chord, we will discuss a brief history of interpolation.
So interpolation came from ancient Babylon and Greece, where farmers planned the
plantation of their crops based on the predicted positions of the Sun, the Moon and the planets.
In India also, for example, we observe that the monsoon comes in June or July, so farmers
plan the harvesting of their crops accordingly.
So this hypothesis basically came from ancient Greece: by observing the predicted movement of
the Sun and the planets along different lines or curves, at a particular point we can expect
that heavy rain will come, and at another point we can expect that the Sun's rays will be
stronger and we will have summer. Based on that it can be predicted how to get the best output
from the crops. If you look at ancient history again, Hipparchus of Rhodes in Greece used linear
interpolation to construct a chord function, which is similar to a sinusoidal function,
especially your sine or cosine function.
This means that even when some insects are moving about in different places, they are following
some function at those points also; if we relate different creatures to nature, we find that
interpolation exists at each phase of nature. The Greeks especially used this sinusoidal
function to compute the positions of celestial bodies, and Chinese people also used this
interpolation method to formulate their standard calendar; in present-day numerical analysis
it is called Gregory–Newton interpolation.
So Gregory–Newton interpolation can run either from the beginning of the table or from the end
of the table to compute the data throughout the whole table. This means that if we start the
year at the beginning of the table and use this interpolation, then we can compute when this
month will end, when the next month will start, or when the year will end. Sometimes, if we
start the computation at the end of the year and take the backward calculation of all the
tabular values, then we can predict when a given Sunday or Monday will come and in which month
it will fall.
Indian astronomers and mathematicians also introduced a method of second-order interpolation
of the sine function and, later on, a method for interpolation of unequal-interval data.
Brahmagupta especially introduced this, and HH also used the sine and cosine functions for
second-order interpolation, to visualise how the Moon moves and how the structural motion of
the Sun appears in space.
So then we will go for the basic introduction of interpolation. Suppose a function f(x) is
known to us; then we can put all the tabular values in a particular form. This means if we have
the set of data points x0, x1, x2, …, xn and the function y = f(x) is known to us, then we can
determine the values y0 = f(x0), y1 = f(x1), y2 = f(x2), … at those particular points.
Sometimes the function is not known to us in explicit form; then how can we assume that the
curve will move in a particular form? If we have the set of data points (x0, y0), (x1, y1),
(x2, y2), (x3, y3) known to us, then how we can determine the function at a particular point is
exactly what we can work out using interpolation.
(Refer Slide Time: 07:50)
Suppose the set of data points (x0, y0), (x1, y1), (x2, y2), …, (xn, yn) satisfying the
condition y = f(x) is given, where the explicit nature of f(x) is not known to us; then it is
possible to construct a simpler function, say φ(x), such that f(x) and φ(x) agree at the set of
tabular points. This means that if we have the set of data points (x0, y0), (x1, y1), (x2, y2),
(x3, y3) and the function is not known to us, then we can construct a simpler function φ(x)
which passes through these points, and we can say that φ(x) is an interpolating polynomial for
the function f(x) at those points.
235
(Refer Slide Time: 08:47)
And the question now arises: how should the closeness of the approximation be measured, and
what is the criterion to decide the best polynomial approximation to the function? This means
that if φ(x) passes through these points and we want to find the best fit of this polynomial to
the function, there is a difference between the functional values and the polynomial. So how
can we choose the polynomial so that this error is minimised? Even if the curve for the
polynomial φ(x) passes through those points, we can still find differences at each level in
between.
So if we consider points that are very close to each other, then this error can be minimised;
this is the first condition we can assume here. And for that we will go for a theorem on the
approximation of a function by a polynomial, which is basically called Weierstrass's theorem.
This means that for interpolation we have to consider the function to be continuous, otherwise
we cannot say anything. So based on Weierstrass's theorem, we consider any function f(x) which
is continuous on a closed interval: suppose the beginning point is a and the end point is b,
with associated functional values f(a) and f(b); then we say that f(x) is a continuous function
on the closed interval starting from the point a and ending at the point b.
(Refer Slide Time: 10:54)
Then we can approximate this function uniformly by a polynomial p(x) over the whole interval
containing the points (x0, y0), (x1, y1), (x2, y2), …, up to the last point (xn, yn). That is,
for any positive number ε there exists a polynomial p(x) such that the absolute value
|f(x) - p(x)| < ε for all x between a and b.
And to justify this we will go for the existence and uniqueness theorem. The statement of the
existence and uniqueness theorem is that there is a unique polynomial pn(x) of degree less than
or equal to n such that pn(xi) = f(xi) for i = 0, 1, 2, …, n. To prove the uniqueness part,
consider the set of tabulated data points, n + 1 data points of the form (x0, f(x0)),
(x1, f(x1)), …, (xn, f(xn)). These x0, x1, …, xn are called tabulated points, nodal points or
data points; since the starting index is 0 and the ending index is n, we have n + 1 data
points, and the corresponding values are called the functional values. Now suppose two
different polynomials F(x) and G(x), each of degree at most n (since we have n + 1 points),
both match these functional values, that is F(xi) = f(xi) and G(xi) = f(xi) for
i = 0, 1, 2, …, n.
So if we take the difference of these two functions, say H(x) = F(x) - G(x), this will
represent a polynomial of degree less than or equal to n, and it has zeros at the n + 1 points.
But a nonzero polynomial of degree less than or equal to n has at most n roots, so H(x) is
identically 0, and hence F(x) = G(x); this is nothing but saying that the interpolating
polynomial is unique.
And if we go for examples of polynomial interpolation, first we will go for linear
interpolation. Linear interpolation means we have two points, say (x0, f(x0)) as the first
point and (x1, f(x1)) as the second point; then the best way to fit a polynomial is to put a
straight line through these two points, and it represents a polynomial of degree or order less
than or equal to 1. And if we take three points, say (x0, f0), (x1, f1), (x2, f2), and join
them by a curve, it can represent a polynomial of degree less than or equal to 2 that passes
through these 3 points. So we can obviously say that if we have 2 points we get a linear
interpolating polynomial, and if we have 3 points we get a quadratic polynomial.
So first, for linear interpolation, we take two points (x0, f(x0)) and (x1, f(x1)); the line
interpolating these 2 points can be presented in intercept form as
f1(x) = f(x0) + [(f(x1) - f(x0))/(x1 - x0)](x - x0), where the slope is nothing but
(f(x1) - f(x0))/(x1 - x0). We usually write this as y - y0 = m(x - x0), where m is nothing but
the slope, sometimes written as dy/dx; when the functional values are known to us we can
represent it in the form above.
So this is the first representation of the linear interpolation polynomial. For example,
consider: find a polynomial that interpolates the points (1, 2) and (2, 4). If the first point
is written as (1, 2) and the second point as (2, 4), then directly we can put x0 = 1,
y0 = f(x0) = 2, x1 = 2 and f(x1) = 4. Substituting, f1(x) can be written as f(x0), that is 2,
plus [f(x1) - f(x0)]/(x1 - x0), that is (4 - 2)/(2 - 1), times (x - 1); and obviously this
represents the value 2x.
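This two-point formula can be sketched in Python (the function name `linear_interp` is my own choice, not from the lecture):

```python
def linear_interp(x0, f0, x1, f1, x):
    """Line through (x0, f0) and (x1, f1), evaluated at x:
    f1(x) = f(x0) + [(f(x1) - f(x0)) / (x1 - x0)] * (x - x0)."""
    return f0 + (f1 - f0) / (x1 - x0) * (x - x0)

# The worked example: the line through (1, 2) and (2, 4) is 2x.
print(linear_interp(1, 2, 2, 4, 1.5))  # 3.0
```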
Similarly, we can go for quadratic interpolation. The quadratic interpolation can be
constructed by considering three points, that is (x0, f(x0)), (x1, f(x1)), (x2, f(x2)). With
these 3 points we can write the quadratic interpolating polynomial as
f2(x) = b0 + b1(x - x0) + b2(x - x0)(x - x1), where the coefficient b0 is defined as f(x0) and
b1 is nothing but [f(x1) - f(x0)]/(x1 - x0).
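A quick sketch of this three-point Newton form in Python; the lecture stops at b1, so the coefficient b2 below is the standard second divided difference, filled in as an assumption:

```python
def quad_interp(xs, fs, x):
    """Newton-form quadratic f2(x) = b0 + b1*(x - x0) + b2*(x - x0)*(x - x1)."""
    x0, x1, x2 = xs
    b0 = fs[0]                                   # b0 = f(x0)
    b1 = (fs[1] - fs[0]) / (x1 - x0)             # b1 = [f(x1) - f(x0)] / (x1 - x0)
    # b2: second divided difference (not spelled out in the lecture)
    b2 = ((fs[2] - fs[1]) / (x2 - x1) - b1) / (x2 - x0)
    return b0 + b1 * (x - x0) + b2 * (x - x0) * (x - x1)

# Through three points of y = x**2 the quadratic reproduces x**2 exactly:
print(quad_interp([0, 1, 2], [0, 1, 4], 1.5))  # 2.25
```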
So based on this, if we go through examples of linear interpolation and quadratic
interpolation, we find that interpolation is used to provide an estimate of a tabulated
function at values where the functional value is not known to us. In particular, if we have a
curve and we want to calculate the functional value at a particular point, we will use this
interpolation.
Then suppose the question is asked: if sin x is the function, tabulated at the points 0, 0.1,
0.3, 0.4, compute sin(0.15) based on these tabular values. If we use linear interpolation,
represented in the form f1(x) = b0 + b1(x - x0), then we obtain the value 0.1493, whereas the
true value, up to 4 decimal places, is 0.1494.
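As a check, here is the computation in Python; the bracketing nodes x0 = 0.1 and x1 = 0.2 are my assumption (the spoken table is ambiguous about the nodes), and they reproduce the 0.1493 quoted above:

```python
import math

def linear_interp(x0, f0, x1, f1, x):
    # f1(x) = f(x0) + [(f(x1) - f(x0)) / (x1 - x0)] * (x - x0)
    return f0 + (f1 - f0) / (x1 - x0) * (x - x0)

# Assumed bracketing nodes for sin(0.15): x0 = 0.1, x1 = 0.2
approx = linear_interp(0.1, math.sin(0.1), 0.2, math.sin(0.2), 0.15)
print(round(approx, 4))           # 0.1493
print(round(math.sin(0.15), 4))   # 0.1494
```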
So if you locate it, the error appears after the third decimal place. Next we go for the
question of where we can use which method of interpolation, that is, whether the tabular
values are equally spaced or unequally spaced. Also, if some point we are asked to evaluate is
not within the range of the interval on which the function is tabulated, the process of
finding the functional value outside this interval is called extrapolation.
And we can visualise that if the data points are (x0, f(x0)), (x1, f(x1)), (x2, f(x2)), then
sometimes the distance between consecutive points is not the same. So we have two types of
arguments, equally spaced arguments and unequally spaced arguments; some interpolation methods
exist for equally spaced intervals and some methods exist for unequally spaced arguments.
Basically, for equally spaced arguments we use Newton's and Gauss interpolation, and for
unequally spaced intervals we use Lagrange's and Newton's divided difference interpolation.
The basic advantage of the unequal-spacing formulations is that they can handle both equally
spaced and unequally spaced arguments.
And now we go for the finite difference operators, which apply when the arguments are equally
spaced. Equally spaced means we consider x0, x1, x2, …, xn all equally spaced, i.e. the
distance between consecutive points is equal. Then we can write xi - xi-1 = h, where h is the
step size; specifically, if x0 is the starting point then x1 = x0 + h, and x2 can be written as
x0 + 2h, or we can write x2 = x1 + h, which is x0 + 2h again. And the functional values at
these points satisfy f(xi) = f(xi-1 + h) for i = 1, 2, 3, …, n. Then we can use the different
finite difference operators on the points x0, x1, …, xn for finite difference interpolation.
So specifically, the finite difference operators that exist are the forward difference
operator, backward difference operator, central difference operator, average operator, shift
operator and differential operator. First we will go for the forward difference operator and
the backward difference operator, which are based on all these points being equally spaced,
i.e. the distance between all consecutive points being equal.
So for the forward difference operator, if we use the tabular values (x0, y0), (x1, y1), …,
(xn, yn) and the function y = f(x), then Δf(x) can be written as f(x + h) - f(x), where h is
the spacing between two consecutive points such as x0 and x1. In functional form, Δy0 can be
written as y1 - y0. Similarly, Δy1 can be written as y2 - y1, and continuing, Δyn would be
yn+1 - yn; but if you see, yn+1 is not within the tabular values, so we can compute this delta
operator only up to Δyn-1, which can be written as yn - yn-1.
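These first differences are easy to sketch in Python (the helper name `forward_differences` is my own):

```python
def forward_differences(ys):
    """First forward differences: each entry is y[i+1] - y[i] (one fewer entry than ys)."""
    return [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

ys = [1, 4, 9, 16, 25]                                # y = x**2 at x = 1..5
print(forward_differences(ys))                        # [3, 5, 7, 9]
print(forward_differences(forward_differences(ys)))   # [2, 2, 2]
```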
If we use the backward difference operator, the backward difference operator relates the
functional value to the point behind it; ∇ is the operator which is called the backward
operator, and Δ is the operator which is called the forward operator. So if we use the ∇
operator, we can write ∇y1 as y1 - y0, and if we relate the two operators we see that Δy0 is
nothing but ∇y1. Similarly we can define all the other values: ∇y2 can be written as y2 - y1,
and going up to the last point we can write ∇yn as yn - yn-1.
And if we go for the central difference operator: what does the central difference operator
mean? We write the central difference operator as δ, and if it is operated on the function f(x)
it can be written in the form δf(x) = f(x + h/2) - f(x - h/2). In relation to the values yr it
can be written, when n is odd, as δnyr-1/2 = δn-1yr - δn-1yr-1, for r = 1, 2, 3, …; and when n
is even we write δnyr = δn-1yr+1/2 - δn-1yr-1/2, with δ0yr being nothing but yr.
So if we use this operator twice: δ operated on f(x + h/2) adds another h/2 and gives
f(x + h) - f(x), and δ operated on f(x - h/2) gives f(x - h/2 + h/2) - f(x - h/2 - h/2), that
is f(x) - f(x - h). So applying the operator twice gives the complete form
δ2f(x) = [f(x + h) - f(x)] - [f(x) - f(x - h)] = f(x + h) - 2f(x) + f(x - h).
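This identity, δ2f(x) = f(x + h) - 2f(x) + f(x - h), can be verified numerically on any sample function; here f = exp, and the point x and step h are arbitrary choices of mine:

```python
import math

f, x, h = math.exp, 1.0, 0.1
step_plus = f(x + h) - f(x)        # δ applied at x + h/2
step_minus = f(x) - f(x - h)       # δ applied at x - h/2
delta2 = f(x + h) - 2 * f(x) + f(x - h)
# Composing the two single applications matches the one-shot formula:
print(abs((step_plus - step_minus) - delta2) < 1e-12)  # True
```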
Similarly, if we go for the average operator: the average operator μ is written as
μf(x) = 1/2[f(x + h/2) + f(x - h/2)]. And in terms of yr, this equation can be written in the
form μyr = 1/2[yr+1/2 + yr-1/2].
Now the shift operator. The shift operator is denoted by the capital letter E, and whenever it
is operated on the function f(x) it moves the function to the immediate next step; this means
we can write Ef(x) as f(x + h). All the operators we have discussed, the forward difference
operator, backward difference operator, central difference operator and average operator, can
be expressed in terms of the shift operator E.
Finally, there is an operator called the differential operator, usually designated D, with
Df(x) being nothing but d/dx f(x). So with this I am ending this lecture; in the next lecture
I will continue with Newton's forward difference operator and backward difference operator,
and how the operator relations can be established based on the shift operator. Thank you for
listening to this lecture.
Numerical Methods
Professor Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 17
Interpolation Part II
Welcome to the lecture series on numerical methods. In the last class we discussed
interpolation and finite difference operators. Among the finite difference operators we first
talked about Newton's forward difference operator Δ, the backward difference operator ∇, then
the central difference operator and the average operator. So the finite difference operators
we have discussed include ∇, Δ, the central difference operator δ and the average operator μ.
In this presentation I will first give a short review of the operators discussed in the last
class, then we will go for Newton's forward difference formula and Newton's backward
difference formula, and then the error approximation.
So the forward difference operator and backward difference operator covered in the last
lecture are basically expressed as follows: ∇f(x) is represented as f(x) - f(x - h), Δf(x) is
expressed in the form f(x + h) - f(x), and the central difference operator δf(x) is expressed
in the form f(x + h/2) - f(x - h/2). And the average operator is expressed in the form
μf(x) = 1/2[f(x + h/2) + f(x - h/2)].
So whenever we go for the application of these 4 difference operators, we also have to go for
the shift operator and the differential operator. If we apply these operators sequentially,
∇y1 can be written in the form y1 - y0, then ∇y2 can be written as y2 - y1, and ∇yn can be
written as yn - yn-1. Similarly, if we apply the forward difference operator sequentially, Δy0
can be written as y1 - y0, Δy1 can be written as y2 - y1, and finally Δyn-1 can be written as
yn - yn-1. So similarly, for the central difference operator, δyr can be written as
yr+1/2 - yr-1/2.
Similarly, for the average operator, this can be expressed in the form
μyr = 1/2[yr+1/2 + yr-1/2]. And the shift operator is usually denoted by E, and it can be
written in the form Ef(x) = f(x + h).
Similarly, the differential operator is signified as D, and operated on f(x) it can be written
as d/dx f(x). Now, if we want to express the Δ operator, the ∇ operator, the central
difference operator or the average operator in terms of the shift operator, we have to expand
these operators in different forms.
So first, expressing the shift operator in a recursive way, we can write Ef(x) = f(x + h), and
then E applied to f(x + h) gives f(x + 2h). So if we use E twice, EEf(x) can be expressed as
Ef(x + h), which can be written as f(x + 2h). (We will expand f(x + h) in a Taylor series
later, when relating E to the differential operator.) Similarly, if we use Δf(x), it can be
written in the form f(x + h) - f(x), which is Ef(x) - f(x), and which can be written as
(E - 1)f(x).
Similarly, we can express the ∇ operator in terms of the shift operator. Basically we write
∇f(x) = f(x) - f(x - h); since Ef(x) = f(x + h), and for instance EEf(x) = E2f(x) = f(x + 2h),
if we want to write f(x - h) it can be written in the form E-1f(x). So we can express
∇f(x) = f(x) - E-1f(x), which can be written as (1 - E-1)f(x), that is, the operator 1 - E-1
operated on the function f(x). The central difference operator we can also express in powers
of E, as E1/2 - E-1/2, since we express the central difference operator as
δf(x) = f(x + h/2) - f(x - h/2). So δf(x) = f(x + h/2) - f(x - h/2) can be written as
E1/2f(x) - E-1/2f(x), which is (E1/2 - E-1/2)f(x). So obviously we can write the operator as
δ = E1/2 - E-1/2.
Similarly, the average operator can be expressed in the form
μf(x) = 1/2[f(x + h/2) + f(x - h/2)], and this can be expressed as 1/2[E1/2 + E-1/2] operated
on f(x); hence we can express μ = 1/2[E1/2 + E-1/2].
So, relating the operators, we can express Δ + 1 = E = 1 + δ2/2 + δ√(1 + δ2/4). This can be
obtained as follows: since we express δ as E1/2 - E-1/2, if we take the square on both sides we
can express δ2 as E + E-1 - 2, which can be written in the form E + 1/E - 2, i.e.
(E2 + 1 - 2E)/E. Clearing the denominator, this can be expressed as E2 - 2E + 1 - δ2E = 0,
and this implies E2 - E(2 + δ2) + 1 = 0. If we find the root E of this quadratic using the
formula [-B ± √(B2 - 4AC)]/2A, with A = 1, B = -(2 + δ2) and C = 1, we obtain
E = 1 + δ2/2 + δ√(1 + δ2/4). So this is the formulation of E in terms of the central
difference operator. If we want to express E in terms of the forward difference operator, E is
usually written as 1 + Δ, since we write Δf(x) = Ef(x) - f(x) = (E - 1)f(x), which implies
Δ = E - 1, i.e. E = 1 + Δ.
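A numeric sanity check of this root formula: on f(x) = exp(x) every operator acts as a plain number (E multiplies by exp(h), δ by exp(h/2) - exp(-h/2)), so the identity E = 1 + δ2/2 + δ√(1 + δ2/4) can be tested directly; h = 0.3 is an arbitrary choice:

```python
import math

h = 0.3
d = math.exp(h / 2) - math.exp(-h / 2)           # δ acting on exp(x)
E = 1 + d**2 / 2 + d * math.sqrt(1 + d**2 / 4)   # the derived root
print(abs(E - math.exp(h)) < 1e-12)  # True: the root reproduces the shift exp(h)
```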
In the next step we want to express the average operator μ in terms of δ. We can express the
average operator as μ = 1/2[E1/2 + E-1/2], and if we take the square on both sides, it can be
expressed as μ2 = 1/4[E + E-1 + 2]. Since we already know that E + E-1 - 2 can be written as
δ2, we can write μ2 = 1/4[δ2 + 4]. And if we take the square root, μ can be written in the
form μ = ±(1/2)√(δ2 + 4); if we take the 1/2 inside the root, it can be written in the form
μ = √(δ2/4 + 1). So this is the expression of μ in terms of δ.
So next, if we want to express the shift operator in terms of the differential operator:
usually Ef(x) is written as f(x + h), and if we expand it in a Taylor series it can be written
as f(x) + hf'(x) + h2/2! f''(x) + …. Since f'(x) is written as Df(x), we can write this as
f(x) + hDf(x) + h2/2! D2f(x) + …, and we can express this as exp(hD) operated on f(x); so
directly we can write E = exp(hD). So if we express the shift operator in terms of the
differential operator, then the central difference operator δ = E1/2 - E-1/2 can be written as
exp(hD/2) - exp(-hD/2), which is 2 sinh(hD/2). Similarly, the average operator
μ = [E1/2 + E-1/2]/2 can be written as [exp(hD/2) + exp(-hD/2)]/2, and this is cosh(hD/2);
the 1/2 in the definition combines with the two exponentials to give the cosh.
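These hyperbolic forms can also be checked numerically on f(x) = exp(x), for which Df = f, so δf(x) should equal 2 sinh(h/2)·exp(x) and μf(x) should equal cosh(h/2)·exp(x); the values of x and h below are arbitrary choices of mine:

```python
import math

x, h = 1.0, 0.2
delta_f = math.exp(x + h / 2) - math.exp(x - h / 2)         # δ f(x)
mu_f = 0.5 * (math.exp(x + h / 2) + math.exp(x - h / 2))    # μ f(x)
print(abs(delta_f - 2 * math.sinh(h / 2) * math.exp(x)) < 1e-12)  # True
print(abs(mu_f - math.cosh(h / 2) * math.exp(x)) < 1e-12)         # True
```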
So next we go for the interrelations between the different operators: E expressed in terms of
Δ, in terms of ∇, in terms of the central difference operator δ, and in terms of the
differential operator. All of these operators, E, Δ, ∇, the central difference operator δ, the
average operator μ and the differential operator D, we want to express in terms of one
another.
So then we go for polynomial approximation with the difference operators, that is, how the
forward difference operator Δ and the backward difference operator ∇ act on a polynomial.
Suppose f(x) is a polynomial; first we assume f(x) is a constant polynomial, f(x) = a. Then
applying the forward difference operator, Δf(x) can be written as f(x + h) - f(x), which is
a - a = 0. Next, if f(x) = ax, a polynomial of degree 1, then Δf(x) = f(x + h) - f(x), and
f(x + h) can be replaced by a(x + h), so this can be written as ax + ah - ax, which is ah. If
we again apply the Δ operator, that is Δ2f(x), we can express it as Δ(ah), and obviously, as
we have already seen for a constant function, Δ of a constant gives zero, so it is 0.
Similarly, if we consider f(x) to be a 2nd degree polynomial, that is f(x) = x2, then Δf(x)
will give a first-degree polynomial and Δ2f(x) will give a constant polynomial. So if we
consider f(x) = x2, then Δf(x) can be written as (x + h)2 - x2; we can write this as
x2 + 2xh + h2 - x2, and it can be written as 2xh + h2. If we repeatedly apply one more
operator, that is Δ2f(x) = Δ(Δf(x)), we can write Δ(2xh + h2) = 2hΔx + Δh2. We have already
obtained that Δ operated on a constant gives zero, and since Δx can be written as
x + h - x = h, we can write 2hΔx = 2h·h; so the final value comes out as 2h2.
(Refer Slide Time: 20:15)
So we can prove the general theorem, Δnxn = n!hn, by induction. For n = 0 we have already
verified it, for n = 1 we verified it, and for n = 2 we have shown it in the previous case. So
assume the theorem is true up to n - 1, that is, for 0, 1, 2, …, n - 1: if f(x) = xn-1 then
Δn-1f(x) can be written as (n - 1)!hn-1. (For example, for f(x) = x we have Δf(x) = h, and for
f(x) = x2 we have Δ2f(x) = 2h2, so the base cases hold.) Then for f(x) = xn we can write
Δnf(x) = Δnxn, and we can express this as Δn-1(Δxn), which can be written as
Δn-1[(x + h)n - xn]. We can expand (x + h)n in a binomial sense:
Δn-1[xn + nhxn-1 + n(n - 1)/2! h2xn-2 + … + hn - xn].
So the xn terms cancel out, and I can write Δn-1[nhxn-1 + n(n - 1)/2! h2xn-2 + … + hn]. Since
each application of Δ lowers the degree by one, Δn-1 applied to any polynomial of degree less
than n - 1 gives 0; so except for the first term, all the other terms give zero values. The
final term we can write as Δn-1(nhxn-1), with all the other terms being 0. So in the final
form, Δnf(x) = Δnxn can be written as Δn-1(nhxn-1), which by the induction hypothesis is
nh·(n - 1)!hn-1, that is n!hn. Finally, we obtain: if f(x) = xn, a polynomial of degree n,
then Δnxn = n!hn.
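The result Δnxn = n!hn can be confirmed by repeatedly differencing xn on an equally spaced grid; n = 4 and h = 0.5 are arbitrary test values of mine:

```python
import math

def forward_diff(ys):
    return [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

def nth_diff_of_power(n, h):
    """Apply Δ n times to f(x) = x**n sampled with spacing h."""
    ys = [(i * h) ** n for i in range(n + 2)]  # n + 2 samples leave one value after n differences
    for _ in range(n):
        ys = forward_diff(ys)
    return ys[0]

n, h = 4, 0.5
print(abs(nth_diff_of_power(n, h) - math.factorial(n) * h**n) < 1e-9)  # True (both equal 1.5)
```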
So next we discuss how we can arrange the forward difference operator and the backward
difference operator in tabular form. We will write these operators in a forward difference
table and a backward difference table. In the forward difference form the differences lead
back to the value at the beginning of the table, and in the backward difference form they
move toward the value at the end of the table.
So if I am writing this forward difference table, it can be written as: i = 0, 1, 2, 3, 4, the xi values are x0, x1, x2, x3, x4, and the associated functional values are y0, y1, y2, y3, y4. Then using the forward difference operator Δyi: the first difference I can write as Δy0, the second as Δy1, the third as Δy2 and the last as Δy3, since the difference y1 − y0 gives Δy0, y2 − y1 gives Δy1, y3 − y2 gives Δy2 and y4 − y3 gives Δy3. Similarly, for the second-order differences Δ²yi, the difference of the first two first differences gives Δ²y0, the next pair gives Δ²y1, and the last pair gives Δ²y2.
Similarly, we can go to Δ³yi, where the difference of the first two second differences gives Δ³y0 and the next pair gives Δ³y1. Again, taking the final difference: since we have 5 points, we can go up to a polynomial of degree 4, and the difference of the two third differences gives Δ⁴y0. So in the final form, if you see, the highest difference leads back to the point y0; this means that if a value exists at the beginning of the table and is asked to be evaluated, we can use this forward difference form to evaluate it.
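The forward difference table described above can be sketched in code; this is an illustrative helper of my own (the function name is an assumption), where each column is formed from adjacent differences of the previous one and the top entry of column k is Δᵏy0:

```python
def forward_difference_table(y):
    """Columns [y, Δy, Δ²y, ...]; the FIRST entry of column k is Δ^k y0."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

cols = forward_difference_table([1, 8, 27, 64, 125])  # y = x³ at x = 1,...,5 (h = 1)
print([c[0] for c in cols])  # top diagonal: [1, 7, 12, 6, 0]
```

For y = x³ at unit spacing the third differences are constant at 3!·1³ = 6 and the fourth differences vanish, which matches the Δⁿxⁿ = n!·hⁿ result from earlier.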
Then we will go for the discussion of the backward difference table. For the backward difference table we use the same values, but each calculated difference is attached to the later of the two tabular points from which it is computed. So the table has i = 0, 1, 2, 3, 4, the associated tabular values xi, that is x0, x1, x2, x3, x4, and the associated variable values yi, that is y0, y1, y2, y3, y4.
If we express these tabular values with ∇yi: we can write ∇y1 = y1 − y0, then the difference y2 − y1 gives ∇y2, y3 − y2 gives ∇y3, and y4 − y3 gives ∇y4. Similarly, taking differences of the ∇'s, for ∇²yi we have ∇y2 − ∇y1 giving ∇²y2, then ∇²y3, and ∇y4 − ∇y3 giving ∇²y4. Taking differences again, for ∇³yi I can write ∇³y3 and then ∇³y4, and for ∇⁴yi the difference of these two gives ∇⁴y4. In the final form you see that the differences move towards the end of the tabular values; this means that if a function is asked to be evaluated at the end of the table, we can use this backward difference table to evaluate the values.
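The backward difference table holds the same numbers as the forward one, just read along the bottom edge of each column: ∇ᵏy4 is the last entry of column k. A small sketch (hypothetical helper name, my own illustration) makes this concrete:

```python
def difference_columns(y):
    """Shared difference columns; ∇^k y_n is the LAST entry of column k."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

cols = difference_columns([2, 3, 7, 20, 50])
# bottom diagonal: y4, ∇y4, ∇²y4, ∇³y4, ∇⁴y4
print([c[-1] for c in cols])  # [50, 30, 17, 8, 2]
```

So the same computation serves both tables; only which diagonal we read off changes, exactly as the lecture describes.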
So in the next lecture we will continue with the central difference operator and the average difference operator, and after that we will go on to Newton's forward difference formula and backward difference formula. Thank you for listening to this lecture.
Numerical Methods
Professor Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 18
Interpolation Part III
Welcome to the lecture series on numerical methods. Till now we have discussed the finite difference operators. Now we will continue with the central difference table, and we will start with how the central difference approximation can be used for different values.
So in tabular form, the central difference table can be written as: i = 0, 1, 2, 3, 4, the respective tabular points xi are x0, x1, x2, x3, x4, and the associated variable values are y0, y1, y2, y3, y4. Then the central difference gives δyi, since we express δf(x) = f(x + h/2) − f(x − h/2).
I can write δyr = yr+1/2 − yr−1/2. So the first entry of the table is δy1/2, where the half index 1/2 is the average (1 + 0)/2 of the two indices involved: δy1/2 = y1/2+1/2 − y1/2−1/2 = y1 − y0. That is why the difference of the first two values is written as δy1/2. Likewise the difference y2 − y1 gives δy3/2, y3 − y2 gives δy5/2, and y4 − y3 gives δy7/2; for instance δy3/2 = y3/2+1/2 − y3/2−1/2 = y2 − y1.
Next, for δ²y1: since δy1 = y1+1/2 − y1−1/2, we have δ²y1 = δy1+1/2 − δy1−1/2 = δy3/2 − δy1/2 = (y2 − y1) − (y1 − y0). So if you see, δ²y1 is nothing but y2 − 2y1 + y0, and the difference of the first two first differences in the table is exactly this δ²y1.
Similarly, the next pairs give δ²y2 and δ²y3. If we take differences again we get δ³yi, namely δ³y3/2 and δ³y5/2, and one more difference gives δ⁴yi, namely δ⁴y2. Since the values are indexed 0, 1, 2, 3, 4, the central index is 2; that is why, in the final form, the central difference approximates the value at the centre of the table only.
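The identity derived above, δ²y1 = y2 − 2y1 + y0, can be checked directly on sample values (a small sketch of my own, assuming unit spacing):

```python
# central second difference at the middle point: δ²y_r = y_{r+1} - 2·y_r + y_{r-1}
y = [5, 9, 16, 26, 39]
delta2_y1 = y[2] - 2 * y[1] + y[0]

# the same number falls out of two rounds of first differences: (y2 - y1) - (y1 - y0)
assert delta2_y1 == (y[2] - y[1]) - (y[1] - y[0])
print(delta2_y1)  # 3
```

This is why the δ² column of the central table coincides numerically with the Δ² column of the forward table, only centred on a different index.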
So next we will discuss Newton's forward difference formula. Basically we use a set of tabular points x0, x1, x2 up to xn, and all of these tabular points are equally spaced; this means we can write xi = xi−1 + h, that is, x1 = x0 + h, x2 = x1 + h, and likewise xn = xn−1 + h, where h is the spacing between the locations. The associated variable values are y0, y1 up to yn.
Then we can express Newton's forward difference formula for y(xp), that is, the value at any particular point in the middle of these intervals, as y(xp) = y0 + pΔy0 + p(p − 1)/2!·Δ²y0 + … + p(p − 1)(p − 2)…(p − n + 1)/n!·Δⁿy0.
So basically, if we want to evaluate any point within this particular interval, we express that point as xp = x0 + ph. And if we want to express this function in the form of a polynomial, we can replace p in terms of x and evaluate the formula as a polynomial in x. To derive this formula in complete form we use the shift operator, which comes from the Taylor series expansion.
So if we want to express Newton's forward difference formula in terms of p, it can be written as y(xp), the value at any point within the interval x0, x1 up to xn: y(xp) = y0 + pΔy0 + p(p − 1)/2!·Δ²y0 + … + p(p − 1)…(p − n + 1)/n!·Δⁿy0.
Especially, if the tabular points start at x0, x1, x2, x3 up to xn and a value is to be evaluated near the beginning of the table, we can use Newton's forward difference formula. Basically, if we call the point x, it corresponds to a coefficient p, so we express it as xp; xp can be written as x0 + ph, since all of these points satisfy xi = xi−1 + h. To start the computation we choose x0 as the point at the beginning, so that the p value lies between 0 and 1. With this, if we want to find y(xp) in a functional form, we can write it as y(x0 + ph), which can be expressed as Eᵖy0. And E can be expressed as 1 + Δ, since Δf(x) is usually written as f(x + h) − f(x), which is Ef(x) − f(x), that is (E − 1)f(x).
So Δ can be expressed as E − 1, that is, 1 + Δ = E. That is why we can replace Eᵖy0 by (1 + Δ)ᵖy0, and if you expand this term binomially you get [1 + pΔ + p(p − 1)/2!·Δ² + p(p − 1)(p − 2)/3!·Δ³ + …], operating on the value y0. So I can write this as y0 + pΔy0 + p(p − 1)/2!·Δ²y0 + …, and likewise it continues.
And if I want to evaluate this function in polynomial form, in terms of x: xp = x0 + ph is just some point existing within the interval x0, x1, x2 up to xn. Commonly we consider this point as x, which can be expressed as x0 + ph, so I can write p = (x − x0)/h. Similarly, p − 1 can be written as (x − x0)/h − 1 = (x − x0 − h)/h, which is (x − x1)/h, since x1 can be expressed as x0 + h.
Similarly, I can write p − 2 in the form (x − x2)/h. Replacing p in terms of x in each factor of this formulation, we obtain a polynomial in x. So if it is asked to evaluate a polynomial at a point existing near the beginning of the table, we can express that as a polynomial of x in Newton's forward difference formula.
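The formula above can be sketched as a short routine (illustrative names of my own, assuming equispaced points): build the differences Δᵏy0 column by column and accumulate the terms p(p − 1)…(p − k + 1)/k!·Δᵏy0:

```python
def newton_forward(x_vals, y_vals, x):
    """Evaluate the Newton forward interpolating polynomial at x (equispaced x_vals)."""
    h = x_vals[1] - x_vals[0]
    p = (x - x_vals[0]) / h
    col = list(y_vals)              # column of k-th differences, starting with k = 0
    result, coeff, k = 0.0, 1.0, 0  # coeff carries p(p-1)...(p-k+1)/k!
    while col:
        result += coeff * col[0]    # add the term coeff · Δ^k y0
        coeff *= (p - k) / (k + 1)
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        k += 1
    return result

# reproduces y = x² exactly from three tabular points
print(newton_forward([0, 1, 2], [0, 1, 4], 1.5))  # 2.25
```

Because the data come from y = x², a degree-2 polynomial, the interpolant reproduces it exactly at every x, not just at the tabular points.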
So this is Newton's forward difference formula, as discussed. Then we will go for Newton's backward difference formula; in the Newton's backward difference formula we especially use the tabular values at the end of the table.
(Refer Slide Time: 16:00)
So, discussing Newton's backward difference formula: if the set of tabular points is x0, x1, x2 up to xn with corresponding values y0, y1, y2 up to yn, and each of these points is equispaced (the difference x1 − x0 is h, x2 − x1 is h, and so on), then we can write Newton's backward difference formula as y(xp) = y0 + p∇y0 + p(p + 1)/2!·∇²y0 + … + p(p + 1)…(p + n − 1)/n!·∇ⁿy0.
We have to choose y0 in such a fashion that y0 exists at the end of the table; the previous values can then be labelled y−1, y−2 up to y−n. Since y0 exists at the end of the table, we can express xp, or x, as x0 + ph, where p can be written as (x − x0)/h in this setting also. And if we express the function with the shift operator, y(xp) can be written as Eᵖy0: since xp = x0 + ph, y(xp) means y(x0 + ph), which is obtained by operating E, p times, on y0, and that is why it is written as Eᵖy0.
Now, ∇f(x) is usually written as f(x) − f(x − h), which is f(x) − E⁻¹f(x), that is (1 − E⁻¹)f(x). So this can be written as ∇ = 1 − E⁻¹, or E⁻¹ = 1 − ∇, and E can be written as (1 − ∇)⁻¹. So Eᵖy0 can be written as (1 − ∇)⁻ᵖy0, and it can be expanded in the form [1 + p∇ + p(p + 1)/2!·∇² + …]y0. In combined form I can write y0 + p∇y0 + p(p + 1)/2!·∇²y0 + …, and this will continue.
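This expansion can be sketched as a matching routine (again my own illustrative names): p is measured from the end of the table as p = (x − xn)/h, and each term picks the last entry of the difference column, which is ∇ᵏyn:

```python
def newton_backward(x_vals, y_vals, x):
    """Evaluate the Newton backward interpolating polynomial at x (equispaced x_vals)."""
    h = x_vals[1] - x_vals[0]
    p = (x - x_vals[-1]) / h        # p is measured from the END of the table
    col = list(y_vals)
    result, coeff, k = 0.0, 1.0, 0  # coeff carries p(p+1)...(p+k-1)/k!
    while col:
        result += coeff * col[-1]   # ∇^k y_n is the LAST entry of each column
        coeff *= (p + k) / (k + 1)
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        k += 1
    return result

print(newton_backward([0, 1, 2], [0, 1, 4], 1.5))  # 2.25, same polynomial as forward
```

Forward and backward forms build the same interpolating polynomial; they differ only in which end of the difference table they read, so their values agree at every x.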
So this is the representation of Newton's backward difference formula, and if we write it in the form of a polynomial we substitute p = (x − x0)/h. Then p + 1 can be written as (x − x0)/h + 1 = (x − x0 + h)/h = (x − (x0 − h))/h, that is, (x − x−1)/h.
Similarly, I can write p + 2 as (x − x−2)/h, and in this form we can express the formula as a polynomial in x using all of the backward points. Now, whenever we are discussing Newton's forward difference formula or Newton's backward difference formula there always exists an error term: whenever we write this series expansion we end up with the terms up to the n-th term, and after the n-th term there are some extra terms which we are neglecting.
If we include those terms, then in complete form we can write the series expansion as y(x) = y(xp) + Rn, where Rn is the remainder term. So with each of Newton's forward difference formula and backward difference formula there is always an associated error, and to calculate this error we will first discuss a generalised approximation formula.
So, writing this error of approximation: let us suppose a function y = f(x) is known at k + 1 points x0, x1 up to xk, and at each of these points the functional values y0, y1, y2 up to yk are associated. Then y(x) is matched exactly at these points by a polynomial, since we are discussing this in a polynomial sense.
So y(x) is approximated by P(x) at x0, x1, x2 up to xk: since k + 1 points exist, the function y = f(x) is approximated, or interpolated, by a polynomial P(x) exactly at the points x0, x1, x2 up to xk. At each of these points the difference between y = f(x) and P(x) is exactly 0, but at all other points we can find that a difference exists.
So if this difference exists, there will be an error term associated with the function and the polynomial whenever we approximate in a polynomial sense. In complete form, y(x) can be written as P(x) + R(x). Usually y(x) and P(x) are exactly equal at each of the nodal or tabular points, but at all other points there exists a difference between y(x) and P(x), which is R(x).
So we write R(x) in the form K(x)·W(x): since at all the tabular points we are approximating with f(x) = P(x), the remainder should be 0 exactly at these points as well. That is why we can express R(x) as K(x)·W(x), where W(x) is the product (x − x0)(x − x1)(x − x2)…(x − xk), and then y(xi) = P(xi) + R(xi) is satisfied.
Now consider a point x̄ which lies somewhere within the interval but is not one of the tabular points. At x̄ we choose K(x) so that exactly K(x̄) = [f(x̄) − P(x̄)]/W(x̄); this is possible because it is only at the tabular points that y(x) − P(x) and W(x) take the value 0, while at x̄ they do not. Then at that point exactly f(x̄) = P(x̄) + W(x̄)·K(x̄), so we can write the remainder term as R(x) = W(x)·K(x̄).
So if we want to determine the value of K(x̄), let us consider the function Φ(x) = f(x) − P(x) − {[f(x̄) − P(x̄)]/W(x̄)}·W(x). Since f(x) = P(x) and W(x) = 0 at x0, x1 up to xk, Φ(x) vanishes at these k + 1 points, and it also vanishes at the extra point x̄.
So if x̄ is the extra point along with x0, x1 up to xk, then the Φ function vanishes at k + 2 points. By Rolle's theorem, if Φ(x) vanishes at k + 2 points then Φ′(x) must vanish at k + 1 points; similarly, going ahead, we find that Φᵏ⁺¹ vanishes at at least one point, say ζ. Now P(x) is a polynomial of degree at most k, so its (k + 1)-th derivative is 0, while W(x) is a monic polynomial of degree k + 1, so its (k + 1)-th derivative is (k + 1)!. Hence Φᵏ⁺¹(ζ) = fᵏ⁺¹(ζ) − [f(x̄) − P(x̄)]/W(x̄)·(k + 1)! = 0, and we can express [f(x̄) − P(x̄)]/W(x̄) = fᵏ⁺¹(ζ)/(k + 1)!.
So comparing both equations, that is equation 4 and equation 7, we get that K(x̄) can be expressed as fᵏ⁺¹(ζ)/(k + 1)!. In a complete sense, if we want to write this remainder term, R(x) can be expressed in the form W(x)·fᵏ⁺¹(ζ)/(k + 1)!.
So if I am writing R(x), it can be written as W(x)·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk. And the W(x) term is written in the form (x − x0)(x − x1)…(x − xk), so R(x) = (x − x0)(x − x1)…(x − xk)·fᵏ⁺¹(ζ)/(k + 1)!.
So the generalised error function is expressed in the form (x − x0)(x − x1)…(x − xk)·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk. In the next class we will continue with functions expressed at central difference tabular points and approximate the values in central difference form; the central difference approximations include the Gauss forward and backward difference formulas, Bessel's formula and Stirling's formula, which we will continue in the next lecture.
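The remainder formula can be sanity-checked numerically. The sketch below (my own illustration; it uses the Lagrange form of the same interpolating polynomial) fits f(x) = eˣ at three points and verifies that the actual error at a test point stays within |W(x)|·max|fᵏ⁺¹(ζ)|/(k + 1)!; for eˣ every derivative is eˣ again, so the maximum over the interval is e at the right endpoint:

```python
from math import exp, factorial

xs = [0.0, 0.5, 1.0]               # k + 1 = 3 points, so k = 2
ys = [exp(v) for v in xs]

def lagrange(x):
    """Interpolating polynomial through (xs, ys), written in Lagrange form."""
    total = 0.0
    for i in range(len(xs)):
        term = ys[i]
        for j in range(len(xs)):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

x = 0.3
W = 1.0
for xi in xs:                       # W(x) = (x - x0)(x - x1)(x - x2)
    W *= (x - xi)
actual_error = exp(x) - lagrange(x)
bound = abs(W) * exp(xs[-1]) / factorial(len(xs))  # |W(x)| · max f''' / 3!
print(actual_error, bound)
```

The printed actual error indeed sits below the bound, consistent with R(x) = W(x)·fᵏ⁺¹(ζ)/(k + 1)! for some ζ inside the interval.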
Numerical Methods
Professor Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 19
Interpolation Part IV
Welcome to the lecture series on numerical methods. In the last lecture we discussed Newton's forward difference formula and Newton's backward difference formula; in the present lecture we will start with the error computation for Newton's forward difference formula and Newton's backward difference formula.
So before going to the error computation for Newton's forward difference formula, let us first recall its representation. We write y(x), or y(xp) for a particular point, usually as a function y(p) of p: y(p) = y0 + pΔy0 + p(p − 1)/2!·Δ²y0 + … up to the term p(p − 1)…(p − n + 1)/n!·Δⁿy0.
So especially this is in terms of p; if we write it in terms of x then y(x) = y0 + (x − x0)/h·Δy0 + (x − x0)(x − x1)/h²·(1/2!)·Δ²y0 + … up to the last term. Specifically, if you notice, the first formulation applies when a value is asked at a particular point, where we can determine the value of p and apply that formulation to evaluate the function at that point; but if the function is asked in the form of a polynomial, then this second formulation is used.
So if you see the slide, y(x) is expressed in the form y0 + (x − x0)Δy0/h + (x − x0)(x − x1)Δ²y0/(2!·h²) + …, and the last term especially is expressed in the form (x − x0)(x − x1)…(x − xn−1)Δⁿy0/(n!·hⁿ).
And before using this formula we should note a few points about where Newton's forward difference formula, Newton's backward difference formula, or the other finite difference formulas can be used. First, the formula in terms of p can be used to compute the value of y for a given value of x, and the formula in terms of x can be used to represent the function y in the form of a polynomial. Second, we try to retain as many differences as we can without losing accuracy, and in Newton's forward difference formula especially the number of differences available for a particular y decreases as we go down the table. And if we want to find a value at the beginning of the table, this formula is especially the suitable one: it evaluates values at the beginning, the start, the upper end of the table.
So there are other formulas which you can use for computing the values near the middle or the end of the table. And if sometimes we observe that the differences start behaving erratically or increasing in magnitude at any stage, we should leave out those differences and the higher ones; this means that if, say, after the second-order differences you find that the third-order difference value is getting larger and larger, then we can terminate the series at the second-order term and evaluate the value there itself.
And sometimes, if the starting point is not the origin, we can shift the origin to the immediate next point. If your starting points are x0, x1, x2, x3 and so on, and the value is asked to be computed exactly at x0 or within the interval x0 to x1, then you can use Newton's forward difference formula directly.
But sometimes, if the value lies between x1 and x2, you can shift the origin from x0 to x1; then your tabular values are relabelled x−1, x0, x1, x2 up to xn−1. And if the value is asked within the interval x2 to x3, you can again shift the origin to x2, relabelling the points x−2, x−1, x0 up to xn−2. If you shift the origin according to the interval in which the value is asked to be evaluated, then in that interval you can find that p lies between 0 and 1.
Since especially you are expressing xp, or x, as x0 + ph, whichever interval the point lies in, we take x0 to be the tabular point immediately before it and consider p = (x − x0)/h there. That is why we can say p should lie between 0 and 1, and Newton's forward difference formula can be applicable; and sometimes, for extrapolation before the first tabular point, p will take negative values.
Next, to find the error committed by replacing y = f(x) by means of a polynomial p(x): especially here, if the tabular values are known to us but the complete function is not, we evaluate the function by considering all of these tabular points.
So if we approximate this function with the polynomial, then the error will exist, and the difference y(x) − p(x) represents the error term at that position. Especially, this can be written in the form (x − x0)(x − x1)…(x − xk)·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk.
And sometimes we also want to express it in product form: it can be written as the product over i = 0 to k of (x − xi), times fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk.
And to get f(x) in differential form (differential form means expressing f(x) in terms of fᵏ⁺¹), we have to expand f about a point, say x0. If a point is given and f is continuous at all of its neighbourhood points, we can use the Taylor series expansion, which is f(x0 + h) = f(x0) + h·f′(x0) + h²/2!·f″(x0) + all other terms.
If we go for the computation of f(x) in terms of the fᵏ⁺¹ term, we can use the mean value theorem. This means we can write f(x0 + h) − f(x0) = h·f′(x0 + θ1h), where θ1 should lie between 0 and 1. So obviously Δf(x0) = f(x0 + h) − f(x0) = h·f′(x0 + θ1h).
If we apply this repeatedly, then for Δ²f(x0): since the first difference is expressed as h·f′(x0 + θ1h), it can be written as Δ²f(x0) = h²·f″(x0 + θ1h + θ′h), where θ′ also lies between 0 and 1. Since θ1 is lying between 0 and 1 and θ′ is lying between 0 and 1, θ1 + θ′ should lie between 0 and 2; and if we write another parameter θ2, expressed as (θ1 + θ′)/2, then we can say that θ2 should lie between 0 and 1. So obviously we can rewrite this formulation in the form Δ²f(x0) = h²·f″(x0 + 2θ2h).
And successively applying this formulation in a repeated way, after k steps we obtain Δᵏf(x0) = hᵏ·fᵏ(x0 + kθkh), where θk should lie between 0 and 1.
Here fᵏ is the k-th derivative of f(x). It may be noted that, since θk is lying between 0 and 1, the point x0 + kθkh lies between the values it takes at θk = 0 and θk = 1: putting θk = 0 gives the value x0, and putting θk = 1 gives x0 + kh, which is nothing but xk. So x0 + kθkh lies between x0 and xk, and in composite form we can write Δᵏf(x0) = hᵏ·fᵏ(ζ), where ζ should lie between x0 and xk.
And further, for the forward difference formula passing through the points (xi, yi), where xi is the nodal point and yi is the associated functional value, the remainder term R(x) is usually written in the form (x − x0)(x − x1)…(x − xk)·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk.
And in terms of the Newton forward difference operator it can be written as (x − x0)(x − x1)…(x − xk)·Δᵏ⁺¹y0/((k + 1)!·hᵏ⁺¹). So this is the generalised error estimation formula for Newton's forward difference formula. Here we use the transformation we have already shown: p − i can be represented in the form (x − xi)/h, since p can be represented as (x − x0)/h, which we expressed in the beginning from x = x0 + ph.
So that is why we write p = (x − x0)/h, and p − 1 is represented in the form (x − x1)/h; likewise p − i can be expressed as (x − xi)/h. And in terms of p the error formulation is R(p) = p(p − 1)…(p − k)·Δᵏ⁺¹y0/(k + 1)!.
And it may be noted that if all the tabular points x0, x1 up to xk approach x0, that is, if h is very small, then Δᵏf(x0) can be written as hᵏ·fᵏ(x0) for k = 0, 1, 2, and so on. Then in that case this error reduces to Rk+1(x) = (x − x0)ᵏ⁺¹/(k + 1)!·fᵏ⁺¹(ζ), where ζ should lie between x0 and xk.
So if we go for the error estimation for Newton's backward difference formula, the error committed is obtained by replacing the function y = f(x) with the polynomial p(x): y(x) − p(x) = (x − x0)(x − x1)…(x − xk)·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk there also. And in product form, as we expressed in the earlier section, it can be written as ∏(x − xi)·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk.
And here we substitute p = (x − xk)/h: since we start the backward difference formula at the end of the table, the last point is xk, so p can be written in the form (x − xk)/h, where x = xk + ph. And R(x), the remainder term, can be written as R(x) = y(x) − p(x), sometimes written as f(x) − p(x), since y(x) is the function f(x) being approximated.
And if you substitute all these values, the error term for Newton's backward difference formula is R(p) = p(p + 1)(p + 2)…(p + k)·hᵏ⁺¹·fᵏ⁺¹(ζ)/(k + 1)!, where ζ should lie between x0 and xk here also.
And if we want to represent it with the forward difference operator, since this error is usually represented in terms of the Newton forward difference operator, it can be written in the form p(p + 1)(p + 2)…(p + k)·Δᵏ⁺¹f(ζ)/(k + 1)!, where ζ should lie between x0 and xk here also.
Since we have already obtained that hᵏ⁺¹·fᵏ⁺¹(x) can be represented as Δᵏ⁺¹f(x), where x is lying between x0 and xk, consider all the tabular points approaching the point xk for small h: x0, x1 up to xk−1 all tend towards xk. Then Rk+1(x) can be represented as (x − xk)ᵏ⁺¹/(k + 1)!·fᵏ⁺¹(ζ), where ζ should lie between x0 and xk. That is, when all of these points cluster at xk, the formulation for small h reduces to this form. So thank you for listening to this lecture.
Numerical Methods
Professor Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture No 20
Interpolation Part V
Welcome to the lecture series on numerical methods. We are now discussing interpolation with various polynomials. In the last lecture we discussed the errors occurring in Newton's forward difference formula and backward difference formula, and the general error term occurring in polynomial interpolation. So today we will discuss various examples associated with Newton's forward difference formula and backward difference formula.
For these examples we have to consider certain finite difference points which exist at equal spacings to get the solutions. The first example is a population model, with the census given every 10 years; suppose in this first example the population is given for 5 census years.
So, writing this problem with year as x and population as y in thousands: in 1891 the population size is 46,000, in 1901 it is 66,000, in 1911 it is 81,000, in 1921 it is 93,000 and in 1931 it is 101,000. It is asked to find the population size for the year 1895.
Since the population is asked for the year 1895, which lies at the beginning of the table — 1895 lies between 1891 and 1901 — we can use Newton's forward difference formula to evaluate the population at this point. For that we construct the difference table, collect the data from it, and then use the formula to evaluate the population in the year 1895.
Preparing the table in tabular form: the first column is x, then the population y. The years are 1891, 1901, 1911, 1921, 1931 and the population values are 46, 66, 81, 93, 101. The first differences Δy are 66 - 46 = 20, then 81 - 66 = 15, then 93 - 81 = 12, and 101 - 93 = 8.
For the 2nd differences Δ2y we get -5, then -3, then -4. For the 3rd differences: -3 - (-5) = 2 and -4 - (-3) = -1. The final difference Δ4y is -1 - 2 = -3. Since the value we seek lies towards the beginning of the table, we consider the tabular values in this form, reading along the top of each column.
This means that when we use the formulation for Newton's forward difference formula, the reference point is always shifting towards the beginning of the table; that is why we consider the upper part of the tabular values for the computation, and we have marked those values with red symbols in the data in the slides.
So the formula is yp = y0 + pΔy0 + p(p - 1)/2! Δ2y0 + p(p - 1)(p - 2)/3! Δ3y0 + p(p - 1)(p - 2)(p - 3)/4! Δ4y0; since the tabular values exist up to the fourth order difference, we consider terms up to the 4th order.
To use this we first have to compute p, since the value is asked in the year 1895. The spacing here is 10, and the formula is written as xp = x0 + ph, where xp is the point at which the data is asked. Here x0 is taken as 1891 and h is 10, so we can compute p for the year we have to evaluate, that is 1895.
From these data we can compute p = (x - x0)/h, which can be written as (1895 - 1891)/10, so the computed value is 0.4. For the application of Newton's forward difference formula the p value should lie between 0 and 1, and that condition is satisfied here since p = 0.4.
Next we can use the formula: p is known to us, y0 = 46, Δy0 = 20, Δ2y0 = -5, Δ3y0 = 2, Δ4y0 = -3 and p = 0.4, so all the values are known. Now we can compute the population in the year 1895.
Going for the computation with this formula, y(1895) = 46 + 0.4(20) + 0.4(0.4 - 1)/2! (-5) + 0.4(0.4 - 1)(0.4 - 2)/3! (2) + 0.4(0.4 - 1)(0.4 - 2)(0.4 - 3)/4! (-3), and the final answer comes out as 54.85 thousand.
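The table construction and the term-by-term evaluation just described can be sketched in Python. This is a minimal illustration of our own (the function names are not from the lecture), using the census data above:

```python
def forward_differences(y):
    """Build the forward difference table; row k holds the k-th differences."""
    table = [list(y)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

def newton_forward(x, xs, ys):
    """Newton forward interpolation at x for equally spaced xs."""
    h = xs[1] - xs[0]
    p = (x - xs[0]) / h
    table = forward_differences(ys)
    result, coeff = 0.0, 1.0
    for k in range(len(ys)):
        result += coeff * table[k][0]     # coeff = p(p-1)...(p-k+1)/k!
        coeff *= (p - k) / (k + 1)
    return result

years = [1891, 1901, 1911, 1921, 1931]
pop = [46, 66, 81, 93, 101]               # in thousands
print(newton_forward(1895, years, pop))   # 54.8528, i.e. 54.85 thousand
```

Each loop pass multiplies the running coefficient by (p - k)/(k + 1), which builds p(p - 1)…(p - k + 1)/k! incrementally instead of recomputing factorials.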
Notice that the population size obtained lies between 46 and 66, which is consistent with the increment of the values over the surrounding years. If instead the value is asked at the end of the table, then, since all of these values are known to us, we can use the backward difference formula.
In the 2nd question it is asked to estimate the population for the year 1925. Since 1925 lies at the lower end of the table, we can use Newton's backward difference formula for the computation of the population size for the year 1925.
Using these tabular values, the backward difference formula is y(x) = y0 + p∇y0 + p(p + 1)/2! ∇2y0 + p(p + 1)(p + 2)/3! ∇3y0 + p(p + 1)(p + 2)(p + 3)/4! ∇4y0. Now we first compute p according to the backward difference formula, then take the backward difference tabular values and substitute them to obtain the population size for the year 1925.
To find p: the formula is x = x0 + ph, and if we consider x0 = 1931 with the value asked at x = 1925, then p can be computed as (1925 - 1931)/10 = -0.6, which lies between -1 and 0. Since the basic condition is that p lies between -1 and 0, we can use Newton's backward difference formula to compute values at the end of the table. Now we can identify the values: x0 is taken as 1931, x-1 as 1921, x-2 as 1911, x-3 as 1901 and x-4 as 1891, and the corresponding values we use are 101, 8, -4, -1 and -3.
Writing this in complete form, y(1925) = 101 + (-0.6)(8) + (-0.6)(-0.6 + 1)/2! (-4) + (-0.6)(-0.6 + 1)(-0.6 + 2)/3! (-1) + (-0.6)(-0.6 + 1)(-0.6 + 2)(-0.6 + 3)/4! (-3). The final population size is 96.8368, and rounding to 2 decimal places we can write this as 96.84 thousand.
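The backward difference computation can be checked in the same way; again a sketch of our own, building the same difference table but reading the last entry of each row, which is ∇k yn:

```python
def backward_differences(y):
    """Difference table; the last entry of row k is the k-th backward difference."""
    table = [list(y)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

def newton_backward(x, xs, ys):
    """Newton backward interpolation at x for equally spaced xs."""
    h = xs[1] - xs[0]
    p = (x - xs[-1]) / h                  # reference point is the last entry
    table = backward_differences(ys)
    result, coeff = 0.0, 1.0
    for k in range(len(ys)):
        result += coeff * table[k][-1]    # coeff = p(p+1)...(p+k-1)/k!
        coeff *= (p + k) / (k + 1)
    return result

years = [1891, 1901, 1911, 1921, 1931]
pop = [46, 66, 81, 93, 101]
print(newton_backward(1925, years, pop))  # 96.8368, i.e. 96.84 thousand
```

With x = 1925 the routine computes p = (1925 - 1931)/10 = -0.6, matching the hand calculation above.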
Next we will go for extrapolation. Sometimes the value to be found exists outside the table; the process of finding values outside the table, given all the tabular values, is called extrapolation. By considering the overall data existing in the table we can find preceding or succeeding values — future data or past data — and that is what is called extrapolation, while if the value is computed inside the table it is called interpolation.
So, if a value is asked outside the tabular values it is called extrapolation, and we can use an example to find these extrapolated values. Suppose the question is: find the cubic polynomial which takes the values y(0) = 1, y(1) = 0, y(2) = 1 and y(3) = 10, and obtain the value y(4). Since 4 preceding values are known to us, we can use the interpolation method to evaluate the value at the point 4. Writing the difference table with the xi data and the corresponding yi data: the xi values are 0, 1, 2, 3 and the yi values are 1, 0, 1 and 10.
In the forward difference table we compute Δyi, Δ2yi and Δ3yi. The 1st differences are -1, 1 and 9; the 2nd differences are 2 and 8; and the last tabular value, the 3rd difference, is 6. The spacing is h = xi - xi-1, which is obviously 1, and it is asked to compute the value at x = 4.
Writing x = x0 + ph with x0 = 0, since x0 is given as 0, we have x = ph, and with h = 1 this gives x = p.
So with h = 1 and p = x, we can obtain the function y(x) in the form of a polynomial expressed in x.
Taking the first value y0 = 1 and expressing p as x: y(x) = 1 + x(-1) + x(x - 1)/2! (2) + x(x - 1)(x - 2)/3! (6), which can be expressed as x3 - 2x2 + 1, a polynomial. If we want the value of y at 4, we can write 43 - 2·42 + 1, so it can be written as 33.
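The same Newton forward routine extrapolates as well, since nothing in it restricts p to [0, 1]. A short sketch of our own (the function name is not from the lecture) reproduces both the fitted cubic's tabulated values and y(4) = 33:

```python
def newton_forward_poly(x, xs, ys):
    """Newton forward form evaluated at x; works for extrapolation too."""
    h = xs[1] - xs[0]
    p = (x - xs[0]) / h
    # forward difference table
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    result, coeff = 0.0, 1.0
    for k in range(len(ys)):
        result += coeff * table[k][0]
        coeff *= (p - k) / (k + 1)
    return result

xs, ys = [0, 1, 2, 3], [1, 0, 1, 10]
# with h = 1 and x0 = 0 this is exactly the cubic x^3 - 2x^2 + 1
print(newton_forward_poly(4, xs, ys))   # 33.0, the extrapolated value y(4)
```

Since the data determine the cubic uniquely, evaluating the Newton form at x = 4 (p = 4, outside [0, 1]) is the extrapolation described above.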
Next we will go for the central difference table and discuss central difference approximations. If the value is asked at the middle of the table, then we can use a central difference approximation. These central difference approximations include the Gauss forward difference formula, the Gauss backward difference formula, Bessel's formula and Stirling's formula.
In the central difference table, if we use the Gauss backward difference formula the p value should lie between -1/2 and 0, and if we use the Gauss forward difference formula the p value should lie between 0 and 1/2. If we write the central difference table in Gauss backward form, we take the points x-3, x-2, x-1, x0, x1, x2, x3 and we compute the value at the middle of the table.
Since we already know the forward formula, used at the beginning of the table, and the backward formula, used at the end of the table, we now try to develop a formula that can be used at the middle of the table. So we express the tabular values with x0 at the middle of the table, and all the other points, preceding and succeeding, indexed with - and + signs.
The corresponding y values, the functional values, can be expressed as y-3, y-2, y-1, y0, y1, y2, y3. Using the central difference notation we can write δy-1/2 and δy1/2, and similarly δ2y0, then δ3y-1/2, then δ4y0.
Likewise, if we consider the tabular values by going one step back, as in Newton's backward difference, to evaluate the differences, it is called the Gauss backward difference formula; and if we go one step forward, it is the Gauss forward difference formula.
For the forward difference formula the entries are: the first function value y0, then δy1/2, then δ2y0, then δ3y1/2, then δ4y0, and so on.
That was for the Gauss backward formula. For the Gauss forward formula we consider the central difference approximations on x-1, x0, x1 with values y-1, y0, y1, and the differences δy1/2, then δ2y0, then δ3y1/2, then δ4y0. This means we march forward one step to get the forward difference formula. In the first (backward) case p should lie between -1/2 and 0, and in this (forward) case p should lie between 0 and 1/2; that is the condition.
If we take the average of these two, we obtain Stirling's formula, where p should lie between -1/4 and 1/4. And if we take the average of the even differences, then we obtain Bessel's formula, where the p value should lie between 1/4 and 3/4.
Now we will go for a complete derivation of this central difference formula. In the Gauss backward difference formula we use y0 with its even differences — y0, δ2y0, δ4y0 and so on — and the odd differences of y-1/2, that is δy-1/2, δ3y-1/2 and so on. Likewise all the even orders (2nd, 4th, 6th, …) are taken at y0 and the odd orders (1st, 3rd, 5th, …) at y-1/2, and so it continues. Our aim, for the derivation of this formula, is to consider yp in the form yp = a0y0 + a1δy-1/2 + a2δ2y0 + a3δ3y-1/2 + a4δ4y0 + ….
So if we express yp in this form, yp = a0y0 + a1δy-1/2 + a2δ2y0 + a3δ3y-1/2 + …, and also express yp in terms of the operator Δ, then comparing the coefficients on both sides we obtain the values of a0, a1, a2, a3 and all the other coefficients.
Using the shift operator we can write yp = Epy0 = (1 + Δ)py0. The right hand side can also be expressed in central difference form: δy-1/2 can be written as δE-1/2y0, using the central difference operator δ = E1/2 - E-1/2. So we can write (1 + Δ)py0 = a0y0 + a1δE-1/2y0 + a2δ2y0 + a3δ3E-1/2y0 + …, and so on. To find the coefficients, note that δE-1/2y0 can be expressed as (E1/2 - E-1/2)E-1/2y0, and taking this product we obtain (1 - E-1)y0.
Till now we have discussed Newton's forward and backward difference formulas and examples based on them, and in this lecture we have discussed the central difference operators and how they are related in tabular form. In the next lecture we will continue with the Gauss forward and backward difference formulas. Thank you for listening to this lecture.
Numerical Methods
Doctor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 21
Interpolation Part VI (Central Difference Formula)
In the last class we gave some hints about Gauss's backward difference formula. Here we use the even differences of y0 and the odd differences of y-1/2, and the formula for the Gauss backward difference formula is expressed as yp, or y(xp), = a0y0 + a1δy-1/2 + a2δ2y0 + a3δ3y-1/2 + …, and so on.
If we express the left hand side y(xp) as y(x0 + ph), it can be expressed as Epy0, which can be written as (1 + Δ)py0. And writing the right hand side in coefficient form, a1δy-1/2 can be expressed as a1δE-1/2y0; similarly a3δ3y-1/2 can be expressed as a3δ3E-1/2y0.
Factoring y0 out of both sides, each side becomes an operator: the left hand side is (1 + Δ)p and the right hand side is a0 + a1δE-1/2 + a2δ2 + a3δ3E-1/2 + …, in this form. Now we express δE-1/2 and δ3E-1/2 as functions of Δ, and then equating the coefficients on both sides we obtain the values of a0, a1, a2 and a3.
From this we can find the exact Gauss backward formula. Expanding the LHS, (1 + Δ)p = 1 + pΔ + p(p - 1)Δ2/2! + …, and so on. For the right hand side we express δE-1/2 in terms of Δ: since the central difference operator δ = E1/2 - E-1/2, the term δE-1/2 can be written as (E1/2 - E-1/2)E-1/2.
To express this in terms of capital Δ we can write δE-1/2 = (E1/2 - E-1/2)E1/2·E-1. Taking the product gives (E - 1)E-1, that is ΔE-1, which can be expressed as Δ(1 + Δ)-1 = Δ/(1 + Δ). Similarly we can express the rest of the terms, like δ3E-1/2, in terms of Δ. Writing them all out: δ2 can be expressed as Δ2/(1 + Δ), δ3E-1/2 as Δ3/(1 + Δ)2, δ4 as Δ4/(1 + Δ)2, and δ5E-1/2 as Δ5/(1 + Δ)3.
Putting all these expressions into both sides, we obtain a0 = 1. To obtain the remaining coefficients we multiply both sides by (1 + Δ): the left hand side becomes (1 + Δ)p+1 = 1 + (p + 1)Δ + p(p + 1)/2! Δ2 + …, and the right hand side, with a0 = 1, becomes (1 + Δ) + a1Δ + a2Δ2 + a3Δ3/(1 + Δ) + …, and so on.
Equating the coefficients of Δ we get p + 1 = 1 + a1, so a1 = p, and equating the coefficients of Δ2 we get a2 = p(p + 1)/2!, that is the binomial coefficient (p+1C2). Multiplying again by (1 + Δ) and comparing further coefficients we pick up a3 and a4, where a3 = (p+1C3) and a4 = (p+2C4). Proceeding in this manner we can obtain the coefficients a2m-1 and a2m.
Here a2m-1 can be expressed as the difference (p+mC2m-1) - (p+m-1C2m-2), which can be given as (p+m-1C2m-1), and a2m can be represented as (p+mC2m), for m = 1, 2, 3, …, and likewise we consider these values.
Finally, putting in all these coefficients a1, a2, a3, …, we obtain the formula yp = y0 + pδy-1/2 + (p+1C2)δ2y0 + …. This is Gauss's backward difference formula.
If we go for Gauss's forward central difference formula, we choose the coefficients with forward marching steps: yp = a0y0 + a1δy1/2 + a2δ2y0 + a3δ3y1/2 + …, and so on.
Here the coefficients a0, a1, a2, a3 are to be determined. Using the same relationship, the term a1δy1/2 can be replaced by a1δE1/2y0 and a3δ3y1/2 by a3δ3E1/2y0, so both sides can be expressed as forward difference operators acting on y0.
In operator form the left hand side is 1 + pΔ + p(p - 1)/2! Δ2 + p(p - 1)(p - 2)/3! Δ3 + …, and this equals a0 + a1Δ + a2Δ2/(1 + Δ) + a3Δ3/(1 + Δ) + a4Δ4/(1 + Δ)2 + …, up to a finite number of terms. Comparing the coefficients of Δ0 and Δ1 on both sides we obtain a0 = 1 and a1 = p.
Multiplying both sides by (1 + Δ) we obtain a2 = (pC2) and a3 = (p+1C3). Likewise, multiplying one by one, we obtain the rest of the coefficients a2m, a2m+1, and we can write Gauss's forward formula as yp = y0 + pδy1/2 + (pC2)δ2y0 + (p+1C3)δ3y1/2 + …, and so on.
If we take the average of these two formulas, we obtain Stirling's central difference formula. This formula uses y0 and its even differences, plus the average of the odd differences of y-1/2 and y1/2. Hence it can be expressed in the form yp = a0y0 + a1μδy0 + …. Since we take the average of the odd terms, the μ operator appears on the odd terms, while the even terms remain as they are. The coefficients a0, a1, a2, a3 are constants to be determined, and μ is the averaging operator, μ = (E1/2 + E-1/2)/2.
Expressing this in coefficient form, we obtain a0 = 1, the odd coefficients a2m-1 = (p+m-1C2m-1), and, expanding the averaged terms, the even coefficients a2m = p2(p2 - 12)(p2 - 22)…(p2 - (m - 1)2)/(2m)!.
Putting in all these values we obtain Stirling's central difference formula: yp = y0 + pμδy0 + p2/2! δ2y0 + (p+1C3)μδ3y0 + p2(p2 - 12)/4! δ4y0 + …, up to a finite number of terms; wherever we want to truncate the series, we take terms up to that point in the series expansion.
Next we will go for Bessel's central difference formula, which is the opposite of Stirling's: in Stirling's formula we took the average of the odd differences, but here we use the average of the even differences. This means the averaging operator appears in the a0, a2, a4 terms, and the formula can be represented as yp = a0μy1/2 + a1δy1/2 + a2μδ2y1/2 + …, and so on.
Here the coefficients a0, a1, a2, a3, a4, etc. are constants which we determine by expanding both sides: writing y(xp) = (1 + Δ)py0 and equating the coefficients on both sides, we obtain the values of a0, a1, a2, a3, a4, etc.
To expand this we have to keep in mind the averaging operators: the term a0μy1/2 can be written as a0μE1/2y0 and the term a2μδ2y1/2 as a2μδ2E1/2y0. Since the μ operator acts on E1/2, we can write μE1/2 = (E1/2 + E-1/2)/2 · E1/2.
The same applies where the operator acts on δ2E1/2: we first write (E1/2 + E-1/2)/2 · δ2E1/2, and multiplying the first factors gives (E + 1)/2. The remaining terms we equate on both sides using the identities already discussed in the earlier slides: δ2 can be written as Δ2/(1 + Δ) and δ4 as Δ4/(1 + Δ)2.
Putting all these terms in terms of capital Δ and equating both sides, we obtain a0 = 1 and a0/2 + a1 = p; since a0 = 1, a1 can be computed as p - 1/2. And multiplying both sides by (1 + Δ) throughout the equation and comparing the coefficients of Δ2 and Δ3, we obtain p + a2 = p(p + 1)/2!, that is a2 = (p+1C2) - (pC1) = (pC2).
Similarly we can obtain the coefficients a3 and a4. In general, a2m can be written as (p+m-1C2m), and a2m+1 as (p+m-1C2m)·(p - 1/2)/(2m + 1). Putting in all these coefficients and rewriting, the formula takes the form yp = μy1/2 + (p - 1/2)δy1/2 + (pC2)μδ2y1/2 + …, and so on.
With this formula we can try to solve some problems. For example, suppose the values of ex are tabulated at points from 1.00 to 2.00 with increment 0.20, and it is asked to evaluate the value at 1.44. Since we want the value at 1.44, we have to work about the centre of the table, given how the tabular values are laid out.
For a central difference approximation we consider the value along the centre of the table. The value asked, 1.44, lies between 1.40 and 1.60 in the tabular values. First we form the differences, in forward difference form: taking differences we find the first, second, third, fourth and fifth differences.
Since the point is in the middle of the table we use a central difference formula such as Stirling's or Bessel's. Considering x = 1.44 and choosing x0 = 1.40, the formula x = x0 + ph gives p = (1.44 - 1.40)/0.20, and from that we get p = 0.2.
Using this formula we obtain the value 4.22068. In particular we use the averages of δy1/2 and δy-1/2, since in the central difference approximation the averaging operator acts on the difference operator: δy1/2 means y1 - y0, and δy-1/2 means y0 - y-1.
Since the tabular values are associated as y-3, y-2, y-1, y0, y1, y2, y3, we can choose the data values in the same fashion as discussed in the earlier classes, and if we arrange them in a proper form we can use the forward difference table to get the central difference values easily and in a comfortable form.
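As a numerical check on the ex example, Stirling's formula with p = 0.2 can be evaluated directly. This sketch of our own rebuilds the table from math.exp (the same values the lecture tabulates) and forms the central differences about x0 = 1.40:

```python
import math

# tabulate e^x from 1.00 to 2.00 with h = 0.20; x0 = 1.40 sits at index 2
xs = [1.00 + 0.20 * i for i in range(6)]
y = [math.exp(x) for x in xs]
i0, h = 2, 0.20
p = (1.44 - xs[i0]) / h                                      # p = 0.2

# central differences about y0 = y[i0]
mu_d1 = ((y[i0+1] - y[i0]) + (y[i0] - y[i0-1])) / 2          # average of δy(1/2), δy(-1/2)
d2 = y[i0+1] - 2*y[i0] + y[i0-1]                             # second central difference
mu_d3 = ((y[i0+2] - 3*y[i0+1] + 3*y[i0] - y[i0-1])
         + (y[i0+1] - 3*y[i0] + 3*y[i0-1] - y[i0-2])) / 2    # averaged third difference
d4 = y[i0+2] - 4*y[i0+1] + 6*y[i0] - 4*y[i0-1] + y[i0-2]     # fourth central difference

# Stirling: y0 + p·μδy0 + p²/2!·δ²y0 + p(p²-1)/3!·μδ³y0 + p²(p²-1)/4!·δ⁴y0
approx = (y[i0] + p*mu_d1 + p**2/2*d2
          + p*(p**2 - 1)/6*mu_d3 + p**2*(p**2 - 1)/24*d4)
print(approx)   # close to e^1.44 and to the lecture's 4.22068
```

The difference stencils here are simply the expanded forms of δ2y0, δ4y0 and of the averaged odd differences described above.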
Till now we have completed the formulations formed by the central difference approximations and solved examples based on them. In the next lecture we will go on to some other interpolation formulas to compute functional values and polynomial approximations. Thank you for listening to this lecture.
Numerical Methods
Doctor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 22
Interpolation Part VII (Lagrange Interpolation Formula with Examples)
The Lagrange interpolation formula is applicable for both uniformly and non-uniformly spaced points. In this interpolation formula we have the tabular values (x0, y0), (x1, y1), (x2, y2), (x3, y3), and so on. The formulas we discussed and derived in the earlier lectures were all based on finite difference operators, which are used for equi-spaced points.
Sometimes, if we have tabular points (xi, yi) with i varying from 0 to n, the points (x0, y0), (x1, y1), (x2, y2) up to (xn, yn) will be placed unevenly.
If we plot these tabular points — (x0, y0) here, (x1, y1) here, the next point (x2, y2) here, then (x3, y3) here, and so on — we see that the distances between consecutive points are unequal: x1 - x0 ≠ x2 - x1 ≠ x3 - x2.
In this case we cannot use Newton's forward or backward difference formulas or any of the finite difference approximations, so we need a separate formula that can be used for unequally spaced points. For such points we will consider three different interpolation methods to approximate the function with a polynomial.
The first is the Lagrange interpolation method, the second is Newton's divided difference formula, and the third is Hermite's interpolation method. The Lagrange method simply suggests representing the given data (xi, yi), i = 0, 1, 2, …, n, by a polynomial p(x) with yi = p(xi).
Here p(x) can be written in the form a0 + a1x + a2x2 + … + anxn, where the coefficients a0, a1, a2, …, an are to be determined. Since it is a polynomial of degree n with n + 1 coefficients, we need n + 1 equations to evaluate them.
To see how the coefficients are determined, approximate the function with a linear polynomial: to determine a0 and a1 we consider three equations — p(x) itself, and the interpolating conditions p(x0) = f(x0) and p(x1) = f(x1) — from which we can eliminate the coefficients a0 and a1.
Doing that, we write p(x) = a1x + a0, where a0 and a1 are arbitrary constants satisfying the interpolating conditions f(x0) = p(x0) = a1x0 + a0 and, similarly, f(x1) = p(x1) = a1x1 + a0, while the original form of the equation is p(x) = a1x + a0.
To eliminate a0 and a1 from these three equations we can write them as a determinant equation:

p(x)   x   1
f(x0)  x0  1
f(x1)  x1  1   = 0.

Expanding this determinant we obtain p(x)(x0 - x1) - f(x0)(x - x1) + f(x1)(x - x0) = 0. Solving for p(x), the polynomial to be determined, p(x) = (x - x1)/(x0 - x1) f(x0) - (x - x0)/(x0 - x1) f(x1), which can be written as (x - x1)/(x0 - x1) f(x0) + (x - x0)/(x1 - x0) f(x1).
Since a minus sign is there, we absorb the "-" into the denominator, writing the second term as (x - x0)/(x1 - x0) · f(x1). So we can write p(x) = L0(x)f(x0) + L1(x)f(x1), where L0(x) = (x - x1)/(x0 - x1) and L1(x) = (x - x0)/(x1 - x0).
So we see that L0(x) and L1(x) are called Lagrange's fundamental polynomials, and they satisfy: if we add the two terms, L0(x) + L1(x) = (x - x1)/(x0 - x1) + (x - x0)/(x1 - x0), and this total gives the value 1.
If we put x = x0, then L0(x0) = (x0 - x1)/(x0 - x1) = 1; and if we replace x by x1, then L1(x1) = (x1 - x0)/(x1 - x0) = 1. In general, Li(xj) = 1 whenever i = j and Li(xj) = 0 whenever i ≠ j.
Extending this linear polynomial to an nth order polynomial, the complete polynomial is p(x) = L0(x)f(x0) + L1(x)f(x1) + … + Ln(x)f(xn), where the coefficient Li(x) can be written as (x - x0)(x - x1)…(x - xi-1)(x - xi+1)…(x - xn)/[(xi - x0)(xi - x1)…(xi - xi-1)(xi - xi+1)…(xi - xn)].
(Refer Slide Time: 11:32)
And it can be written in a combined form as p(x) = ∑ i = 0 to n, Li(x)f(xi) here. And
obviously this Li(x) will satisfy the property that Li(xj) = 1 when i = j and 0 when i ≠ j here.
If I take the full product φ(x) = (x – x0)(x – x1)…(x – xn) here, then the numerator of Li(x)
is φ(x)/(x – xi), and the denominator is the product (xi –
x0)…(xi – xi–1)(xi – xi+1)…(xi – xn) here.
And sometimes people also denote this denominator product as П'(xi), since if we take the
derivative of φ(x) with respect to x and evaluate it at x = xi, then except the term where the
factor (x – xi) is differentiated, all other terms will be 0, and exactly this product remains.
So in that convenient derivative form we can also write Li(x) = φ(x)/((x – xi).φ'(xi)) here.
So it is very convenient to express this Li(x) term in different senses, since all of these
terms occur in product form.
To see how this polynomial is created in the form of Lagrange interpolation, we
will discuss an example: suppose the data x = 0, 1, 4, 5 and the
corresponding data y = 8, 11, 68, 123 are given; how can we determine the value of y, or the function
f(x), at the point 2? That we will discuss using the Lagrange interpolation polynomial here.
Since the data points are given as x = 0, 1, 4, 5
and the corresponding y data as 8, 11, 68, 123, and we have four
points indexed 0, 1, 2, 3, we can go up to a polynomial of degree 3 here. So
we can write the Lagrange coefficient expression Li(x) as
(x – x0)(x – x1)…(x – xi–1)(x – xi+1)…(x – xn)/(xi – x0)(xi – x1)…(xi – xi–1)(xi – xi+1)…(xi – xn) here. And the complete
polynomial p(x) can be written as L0(x)f(x0) + L1(x)f(x1) + L2(x)f(x2) + L3(x)f(x3).
So if I go for the computation of L0(x) here, L0(x) can be written as (x – x1)(x – x2)
(x – x3)/(x0 – x1)(x0 – x2)(x0 – x3) here. Since we have 4 data points, they can
be signified as x0 = 0 here, x1 = 1 here, x2 = 4 here and x3 = 5 here. So if you
put all these points here, L0(x) can be written in the form (x – 1)(x – 4)(x – 5)/(0 –
1)(0 – 4)(0 – 5) here.
So if we want to evaluate L0(x) at the point 2, since the question asks to evaluate the
functional value at the point 2, we can directly put 2 at the position of x in each of these
expressions here. So if you replace x by 2,
L0(2) can be written as (2 – 1)(2 – 4)(2 – 5)/(0 – 1)(0 – 4)(0 – 5), where the denominator factors are – 1, – 4 and –
5, so L0(2) = 6/(– 20) = – 0.3 here. And similarly we can write L1(x), L2(x) and L3(x) also.
And if you put all these coefficients together with the functional values, we obtain the total
expression: y(x) or p(x) can be expressed as (x – 1)(x – 4)(x – 5)/(0 – 1)(0 – 4)(0 –
5).8 + (x – 0)(x – 4)(x – 5)/(1 – 0)(1 – 4)(1 – 5).11 + (x – 0)(x – 1)(x – 5)/(4 – 0)(4 – 1)(4 –
5).68 + (x – 0)(x – 1)(x – 4)/(5 – 0)(5 – 1)(5 – 4).123.
Since we want to evaluate this y(x) or p(x) value exactly at x = 2 here, we can put x = 2,
and the total value y(2) comes out as 18 here.
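The worked example above can be verified with a short script; a minimal sketch (the function name `lagrange_eval` is ours, not from the lecture):

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, xi in enumerate(xs):
        # L_i(x) = product over j != i of (x - x_j) / (x_i - x_j)
        Li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                Li *= (x - xj) / (xi - xj)
        total += Li * ys[i]
    return total

xs = [0, 1, 4, 5]
ys = [8, 11, 68, 123]
print(lagrange_eval(xs, ys, 2))   # 18, up to floating-point rounding
```

Note that the points need not be equally spaced; the formula only requires them to be distinct.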
So since we are using this Lagrange interpolation polynomial, if you see the data
points here, the first point is 0, the second point is 1, the third point is 4 here. These
differences are unequal: between 0 and 1 the difference is 1, between 1 and 4 it is 3,
and between 4 and 5 it is 1 again. That is why you cannot use any of the finite
difference operators, which need equally spaced points, to find the value at x = 2
here. But the Lagrange method is applicable whenever these data points are given
and it is asked to evaluate the functional value anywhere within them.
And this method can also be used in a reverse form. Always we are expressing y
as a function of x; if instead it is asked to evaluate the x value corresponding to a given y, we
can fit another polynomial, suppose x = q(y), such that x is defined as a function of y there.
That means y becomes the independent variable and x the dependent variable. So if we
consider y = f(x), then we can express it in inverse form as x = g(y) here, to find these values from
the tabular points if a function is prescribed to us.
The following data are prescribed here: y = x3 is given, x at the points 1, 2, 3 and the
y values are given as 1, 8 and 27 here. And the question asks to compute the cube root of 21 from the
above data using Lagrange's method, and also to discuss the error associated in that formulation.
(Refer Slide Time: 23:04)
So for that, what we will do is express this function in an inverse way: x is
given here as 1, 2, 3 and the corresponding y values are written as 1, 8, 27 here. And so,
if the question asks to find the value of x where the y value is 21, we can write this
formulation in a reverse form, with x as a function of y here.
So we can write this formulation, x as a function of y, which can be expressed
as x(y) = (y – y1)(y – y2)/(y0 – y1)(y0 – y2).x0 + (y – y0)(y – y2)/(y1 – y0)(y1 – y2).x1
+ (y – y0)(y – y1)/(y2 – y0)(y2 – y1).x2 here, where in this inverse form the role of the
functional values is played by x0, x1, x2.
So if you represent this Lagrange polynomial in this form, we are
writing x0 as 1, x1 as 2, x2 as 3 here, and the corresponding y values are
expressed as y0 = 1 here, y1 = 8 here, y2 = 27 here.
And if you put these values here, then we can find the functional value
x(21). Since the function is defined in the form y = x3, in inverse
form x can be expressed as y^(1/3) here, so x(21) corresponds to the cube root of 21 there.
So if it is asked to compute x(21) here, then we can write this one
as (21 – 8)(21 – 27)/(1 – 8)(1 – 27).x0, where the x0 value is especially 1 here,
+ (21 – 1)(21 – 27)/(8 – 1)(8 – 27).x1, where the second value x1 is 2 here,
+ the third value, so for the last term we
can write (21 – 1)(21 – 8)/(27 – 1)(27 – 8).3.
If you go for this computation, then we can obtain the value of x(21)
as approximately 2.9549 there itself. And if we go for this error
computation here, we can find the exact cube root of 21 as 2.7589.
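The inverse interpolation above can be reproduced by simply swapping the roles of x and y in the Lagrange formula; a minimal sketch (function name ours):

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, xi in enumerate(xs):
        Li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                Li *= (x - xj) / (xi - xj)
        total += Li * ys[i]
    return total

# Inverse interpolation: feed the y values in as the nodes.
ys = [1, 8, 27]            # y = x**3 at x = 1, 2, 3
xs = [1, 2, 3]
approx = lagrange_eval(ys, xs, 21)   # x interpolated as a function of y
exact = 21 ** (1 / 3)
print(approx, exact)       # roughly 2.9549 versus 2.7589
```

The gap between the two numbers is the interpolation error discussed next; the nodes 1, 8, 27 are widely spaced, so the quadratic in y is a fairly crude model of y^(1/3).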
And if we want to go for the computation of error terms in the case of the Lagrange
interpolation polynomial, the generalized interpolation error is the same as the one
occurring for our finite difference operators. In generalized form we have obtained that
error term as r(x) = (x – x0)(x – x1)…(x – xn).f^(n+1)(ζ)/(n + 1)!,
where ζ should lie between x0 and xn here.
So the same expression we can use over here also.
(Refer Slide Time: 29:00)
So the same expression we can use for the computation of error here also, and in this
case, since we are interpolating in y, I can write r(y) = (y – 1)(y – 8)(y –
27).f'''(ζ)/3!.
Since we have here 3 points, n = 2. We are considering the nodes
1, 8, 27, so it can generate a polynomial of degree 2; that is why 3 points
mean we will have a polynomial of degree 2. In general, if we are considering a polynomial
of degree n there, then we will have exactly n + 1 points.
So here we have 3 points and the degree of the polynomial is 2. And if n = 2, obviously the derivative order
is n + 1, that is 2 + 1 = 3 here. So this represents the third order derivative here, and this is 3 factorial.
So for the computation of this error term, x can be expressed in
the form f(y) = y^(1/3) here. And if you take the third order derivative f'''(y), I
can express this one as (10/27).y^(–8/3). The maximum value of |f'''(y)| on the interval occurs at y = 1, where it
equals 10/27, and hence the error term r(21) can be bounded as |r(21)| ≤ |(21 – 1)(21 – 8)(21 – 27)|.(10/27).(1/6) here.
So this total value comes out as 96.3 here, or 96.29 or so; it is a very loose bound
in this case. Now, the drawback of this
method is that if we want to add any extra point, we have to go for the computation
of all n + 1 terms again.
This means that whenever we write the polynomial p(x), or f(x), or y(x) here, it is
represented in the form L0(x)f(x0) + L1(x)f(x1) + … + Ln(x)f(xn) here.
But if suddenly we want to add one extra point, then we have to consider
that (n + 1)th point in each of these products here, since L0(x) is usually expressed in the
form of (x – x1)…(x – xn)/(x0 – x1)…(x0 – xn) there. So if another extra point is
added, in the numerator we have to include the factor (x – xn+1), and in the denominator we
have to include (x0 – xn+1) also.
So each of the terms gets an extra multiplication, and a large product series is required
to go through the solution process again; if we want to add one more point, we
have to do all of the computations again. That is why we can go for Newton's divided difference
interpolation formula, for better computation, since it requires less computation for an
extra addition of points here. Thank you for listening to this lecture.
Numerical Methods
Doctor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 23
Interpolation Part VIII (Divided Difference Interpolation with Examples)
So before going to Newton's divided difference interpolation, first we will discuss what
divided differences are. Divided difference means: if you have a set of data points (x0,
y0), (x1, y1) up to (xn, yn) here, maybe uniformly spaced or non-uniformly spaced, we
can use these divided differences here. So suppose these corresponding values of y in terms of
x are expressed here as y = f(x) for x = x0 to xn here.
We can express the divided difference for this function y = f(x), for the two
consecutive points x0, x1, as f(x0, x1) = (f(x1) – f(x0))/(x1 – x0). Since the tabular points
I have written start at x0, with next point x1, and the corresponding y values are y0, y1,
we can also write this first divided difference as (y1 – y0)/(x1 – x0) here. And for the next two
points x1 and x2, if I want to express the divided difference, I can write f(x1, x2) = (f(x2) – f(x1))/(x2 –
x1) here, which can be written as (y2 – y1)/(x2 – x1) here.
Obviously the question arises: if we are writing these values x0 and x1 here in argument
form, then how can we express f(x0) here? For a single argument the divided difference is just
the functional value at that point, so f(x0) = f(x0) and f(x1) = f(x1); since all are independent
points, we can write these single arguments as the functional values at those points.
And if we are going up to the last subinterval here, that is xn–1 to xn, we can write this
divided difference as f(xn–1, xn) = (f(xn) – f(xn–1))/(xn – xn–1), and this can also be written as
(yn – yn–1)/(xn – xn–1) here.
So these are all called first order divided differences here, and if you go for second order
divided differences, then for the points (x0, y0), (x1, y1) up to (xn, yn) we consider
three points at a time, suppose x0, x1 and x2, and we can write the
second order divided difference as f(x0, x1, x2) = (f(x1, x2) – f(x0, x1))/(x2 – x0) here.
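The recursive definition just given translates directly into a short function; a minimal sketch (the function name is ours, not from the lecture):

```python
def divided_difference(xs, ys):
    """Divided difference f(x0, ..., xn), computed by the recursive definition."""
    if len(xs) == 1:
        return ys[0]   # a single argument is just the tabulated value f(xi)
    # f(x0..xn) = [f(x1..xn) - f(x0..xn-1)] / (xn - x0)
    return (divided_difference(xs[1:], ys[1:]) -
            divided_difference(xs[:-1], ys[:-1])) / (xs[-1] - xs[0])

# first order: f(x0, x1) = (y1 - y0) / (x1 - x0)
print(divided_difference([1, 2], [1, 8]))          # 7.0
# second order over three points of y = x**3
print(divided_difference([1, 2, 3], [1, 8, 27]))   # 6.0
```

The recursion mirrors the table: each entry of order k is built from two neighbouring entries of order k – 1.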
And for the third order divided difference: since for the second order divided
difference we are considering three points, and for the first order divided difference we are
considering two points there, for the third order divided difference we
have to consider four points x0, x1, x2, x3, and it can be written in the form f(x0, x1, x2, x3) = (f(x1, x2,
x3) – f(x0, x1, x2))/(x3 – x0).
And for the nth order divided difference, if you have n + 1
values x0, x1 up to xn here, we can write it as f(x0, …, xn) = (f(x1, …, xn) – f(x0, …, xn–1))/(xn – x0)
here. We can see that the first order divided difference uses two
points, and it establishes a relationship between the functional values at x1
and x0.
And if we go for the second order divided difference here, it establishes a
relationship between the first order differences; it can be written in the form
((f(x2) – f(x1))/(x2 – x1) – (f(x1) – f(x0))/(x1 – x0))/(x2 – x0) here. This means that it establishes a
relationship between the pairs (x2, x1) and (x1, x0).
And if we go for the third order divided difference here, it establishes a relationship between
x0, x1, x2 and x3 combined here. The advantage of this method is that in the Lagrange
interpolation method, whenever an extra point is added there, we have to do all the
computations newly. This means that all of the products have to be reconsidered in a
modified form so that the extra point is multiplied into the rest of the factors there.
But here, if you see, an extra point can be added in a uniform way; that extra
multiplication is not needed. Now, if the arguments are equal, the divided differences may
still have a meaning here. This means that if the arguments are equal, suppose we want
to write f(x0, x0) there, we can write f(x0, x1) where x1 = x0 + ε and ε is very small.
This means that we can consider ε tending to 0 here. Then we can express f(x0, x1) as (f(x1) –
f(x0))/(x1 – x0); obviously this is the first order divided difference formula represented in this
form here.
But if you replace this x1 here in the form of x0 + ε, where ε is very small, we can write
this statement in the form of limit as ε tends to 0 of (f(x0 + ε) – f(x0))/ε here, and obviously this is
nothing but f'(x0) here. So we can say that here, if the arguments are equal, the divided
difference still produces some value here.
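The limit just described can be seen numerically: as the second argument approaches the first, the first order divided difference approaches the derivative. A minimal sketch (names ours):

```python
def first_divided_difference(f, a, b):
    """f(a, b) = [f(b) - f(a)] / (b - a)."""
    return (f(b) - f(a)) / (b - a)

f = lambda x: x ** 3          # f'(x) = 3 x**2, so f'(2) = 12
for eps in (1e-1, 1e-3, 1e-6):
    print(first_divided_difference(f, 2, 2 + eps))
# the printed values approach 12 as eps shrinks
```

For x**3 the divided difference works out to 12 + 6·eps + eps², so the convergence to f'(2) = 12 is visible directly.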
And in that form, if we are considering f(x0, x0) here, where x1 is represented in the
form of x0 + ε and ε is very small, then it represents the derivative of the function at that
point only. More generally, if f(x) is r times differentiable and we have r + 1 equal arguments,
we can write f(x0, x0, …, x0), with x0 repeated r + 1 times, as f^(r)(x0)/r!. Since if
you see here, with two equal arguments we are writing f(x0, x0) = f'(x0);
similarly, it needs r + 1 equal arguments to get the derivative of
order r here.
So if you go for the divided difference table here, we can find that for the tabular
values i = 0, 1, 2, 3, 4 and the corresponding points x0, x1, x2, x3, x4, we can
evaluate the first order divided differences, second order divided differences and third order
divided differences to get any approximated value within that range.
This means that if, as in our earlier computation, a value is asked to be computed within any
of the intervals, then we have to use these divided differences in such a fashion that the
value can be computed in an easy form. So if you consider i taking the values 0,
1, 2, 3, 4 here, the corresponding xi values are x0, x1, x2, x3, x4 here, and the corresponding
y values we can write as y0, y1, y2, y3, y4 here,
or we can write them as f(x0), f(x1), f(x2), f(x3), f(x4). And if you take the first order
divided differences here (sometimes the divided difference table is also written in this form),
the first entry gives you f(x0, x1) here; this
means the difference of the two functional values divided by the difference of the two points.
And if you take the divided difference of x1, x2 here, that will give you (f(x2) – f(x1))/(x2 –
x1) here. Similarly, if you consider the divided difference of x2, x3 here, then that one is
(f(x3) – f(x2))/(x3 – x2). Similarly, we can find the divided difference of x3, x4 as
(f(x4) – f(x3))/(x4 – x3).
And then the second order divided differences you can find out here: the first in the form f(x0,
x1, x2), then we can write f(x1, x2, x3) here, and the next divided difference we
can write as f(x2, x3, x4) here. Similarly, the third order divided differences: you can
write the first one as f(x0, x1, x2, x3), and the one starting from x1, that is
f(x1, x2, x3, x4) here. And the last entry, if we want
to write it, can be represented as the fourth order divided difference, which can be written as f(x0, x1,
x2, x3, x4) here.
So if the tabular values xi and yi are given, we can find this divided difference table, and based
on this divided difference table data we can evaluate the polynomial at any point within this
interval using Newton's divided difference formula or any specified
formula based on these divided differences. So next we can show that these
divided differences are independent of the order of their arguments.
This means that if we are considering the second order divided difference with
arguments x0, x1, x2 here, it can also be written as f(x0, x2,
x1), and it can also be written in the form of f(x1, x0, x2).
So this can be proved easily, since for the first order divided difference either we
can write f(x0, x1) = (f(x1) – f(x0))/(x1 – x0), or it can be written as (f(x0) –
f(x1))/(x0 – x1) here, and this is nothing but f(x1, x0) here.
So similarly we can show that the order of arguments is immaterial in divided
differences of order 2, order 3 and so on. If you go for the second
order divided difference, we can find that f(x0, x1, x2) can be written in the form (f(x1,
x2) – f(x0, x1))/(x2 – x0) here.
And if I take 1/(x2 – x0) common, then it can be written in the form 1/(x2 – x0).((f(x2) – f(x1))/(x2 – x1) –
(f(x1) – f(x0))/(x1 – x0)) here.
If you collect the coefficient of each functional value in this expression, we can write this
complete statement in the form f(x0)/(x0 –
x1)(x0 – x2) + f(x1)/(x1 – x0)(x1 – x2) + f(x2)/(x2 – x0)(x2 – x1) here.
So again, interchanging the arguments does not affect this expression here.
Similarly, it can be shown for the nth order divided difference: it can be
written in the form f(x0, x1, …, xn), where n + 1 arguments are there,
and this nth order divided difference can be written in the
form f(x0)/(x0 – x1)…(x0 – xn) + f(x1)/(x1 – x0)(x1 – x2)…(x1 – xn) + … + f(xn)/(xn – x0)(xn –
x1)…(xn – xn–1) here.
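The symmetry in the arguments can be checked by evaluating the same divided difference over every ordering of the points; a minimal sketch using the symmetric closed form (names ours):

```python
from itertools import permutations

def divided_difference(xs, ys):
    """f(x0, ..., xn) via the symmetric form: sum of f(xi) / prod_{j != i} (xi - xj)."""
    total = 0.0
    for i, xi in enumerate(xs):
        denom = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                denom *= (xi - xj)
        total += ys[i] / denom
    return total

pts = [(1, 1), (2, 8), (3, 27)]   # y = x**3
values = set()
for perm in permutations(pts):
    xs = [p[0] for p in perm]
    ys = [p[1] for p in perm]
    values.add(round(divided_difference(xs, ys), 9))
print(values)   # a single value survives: the divided difference is symmetric
```

All six orderings of the three points produce the same number, which is exactly what the symmetric sum above guarantees.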
Hence we can say that these divided differences are symmetric in their arguments. Now let
the arguments be equally spaced, suppose x1 – x0 = x2 – x1 = … = h, where h is the space size.
If it is equally spaced, we can express every difference like x1 – x0 as a multiple of h,
so each denominator becomes a product of multiples of h, and the nth order divided difference
reduces to f(x0, x1, …, xn) = Δ^n f(x0)/(n! h^n), where Δ is the forward difference operator.
So if we want to express the interpolating polynomial in divided difference form, then we can use it in
a form such as f(x) = f(x0) + (x – x0)f(x0, x1) + (x – x0)(x – x1)f(x0, x1, x2) + … + (x – x0)(x – x1)…(x
– xn–1)f(x0, x1, …, xn).
So if we want to prove this formula, Newton's divided difference interpolation formula,
the statement is: if x0, x1, …, xn are a given set of observations and y0, y1, y2 up to yn are
the corresponding values of the function y = f(x), then the interpolating
polynomial is f(x) = f(x0) + (x – x0)f(x0, x1) + (x – x0)(x – x1)f(x0, x1, x2) + … + (x – x0)(x – x1)…(x – xn–1)f(x0, …, xn) here.
The immediate next term after these would be the remainder term, and it requires the extra
argument x itself to form that last divided difference. Since if you see here, the
next factor would be (x – xn), and the remainder term is the one that makes the series
truncate exactly there.
So if you see, it can be written in product form also: f(x) = ∑ i = 0 to n, f(x0,
x1, …, xi) П j = 0 to i – 1, (x – xj). And if you go for the proof of this divided
difference formula, especially we can write f(x, x0) in the form (f(x)
– f(x0))/(x – x0); either way you can write it, since the order of arguments is
independent, or symmetrical in nature.
From this, if we separate f(x), we can write f(x) = f(x0) + (x – x0)f(x, x0) here.
Similarly, if we add one extra point here and form f(x, x0, x1), we can write
it as (f(x, x0) – f(x0, x1))/(x – x1), since we are saying that it is
independent of the order of arguments, so we can write it in any of these forms here.
And if we separate f(x, x0) here, this can be written in the form f(x, x0) = f(x0, x1) + (x – x1).f(x,
x0, x1) here. So if you replace this function f(x, x0) in the earlier expansion, then we obtain
the expression f(x) = f(x0) + (x – x0)f(x0, x1) + (x – x0)(x – x1)f(x, x0, x1) here.
So likewise, if you add one more point each time, finally we can obtain this formula, which can be
given as f(x) = f(x0) + (x – x0)f(x0, x1) + (x – x0)(x – x1)f(x0, x1, x2) + … + (x – x0)(x – x1)…(x –
xn–1)f(x0, x1, …, xn) + r(x), where r(x) is the remainder
term here.
If I write this r(x) term from our earlier error term expansion, I can write this error term
as r(x) = (x – x0)(x – x1)…(x – xn).f(x, x0, …, xn), and
obviously in a modified form it can be written as (x – x0)(x – x1)…(x – xn).f^(n+1)(ζ)/(n + 1)!,
where ζ should lie between x0 and xn here.
So Newton's divided difference formula also converts to Newton's forward difference
formula for equidistant tabular points. Based on this interpolation formula we will solve
one problem: the tabular values are given as x = – 1, 0, 3, 6, 7 and f(x) is
given as 3, – 6, 39, 822, 1611 here.
So set up the divided difference table with the tabular values x = – 1, 0, 3, 6, 7
and f(x) = 3, – 6, 39, 822, 1611. If you find the first divided difference here,
this means (– 6 – 3)/(0 – (– 1)) here; since – 6 – 3 = – 9 and
0 – (– 1) = 1, it gives the value – 9 here.
Similarly, I can find the difference (39 – (– 6))/(3 – 0), and that will give the value 15. And
if you take the next difference, (822 – 39)/(6 – 3), I can get that one as 261 here; this means the difference
between these two functional values divided by the difference of these two points.
Similarly, if you take the difference (1611 – 822)/(7 – 6) here, I can obtain this value as
789 here. Next we go for the second order divided differences here: the first is (15 – (– 9))/(3 – (–
1)) here, so for this one we have to consider 3 – (– 1) = 4 in the denominator.
Second, if I consider the divided difference of the next two entries, 261 – 15 here, I have to
divide by the difference 6 – 0 here.
And for the next one, if I consider the difference 789 – 261 here, I will divide by the
difference 7 – 3 here.
So these second order divided differences take the values 6 here, then 41, then 132 here. And again, if I
take the difference 41 – 6 here, I have to divide by the difference 6 – (– 1) here.
And if I take the difference 132 – 41 here, I will divide by the difference 7 – 0 here. So
these third order values are 5 and 13 here. And the fourth order divided
difference is (13 – 5)/(7 – (– 1)) here; that is nothing but the value 1 here.
So if you put these in the formula, we can obtain f(x) = f(x0) + (x –
x0)f(x0, x1) + (x – x0)(x – x1)f(x0, x1, x2) + (x – x0)(x – x1)(x – x2)f(x0, x1, x2, x3), and since
we have five points x0, x1, x2, x3, x4 here, it requires one more term, which I have to add here.
So I have to add one more term here that is (x – x0)(x – x1)(x – x2)(x – x3)f(x0, x1, x2, x3, x4)
here.
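The whole table and the resulting Newton form can be reproduced with a short script; a minimal sketch (the function names are ours, not from the lecture):

```python
def newton_coefficients(xs, ys):
    """Return [f(x0), f(x0,x1), ..., f(x0,...,xn)]: the top edge of the table."""
    n = len(xs)
    coef = list(ys)
    # build each column in place, updating from the bottom up so that the
    # entries still holding the previous column are not overwritten too early
    for order in range(1, n):
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate the Newton form by nested multiplication (Horner-like)."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

xs = [-1, 0, 3, 6, 7]
ys = [3, -6, 39, 822, 1611]
coef = newton_coefficients(xs, ys)
print(coef)   # [3, -9.0, 6.0, 5.0, 1.0], the leading table entries from above
```

Adding a sixth data point would only append one more coefficient; none of the existing ones change, which is exactly the advantage over the Lagrange form.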
So if you put all these values, then I can obtain a polynomial of order 4 here, and the final
polynomial is x4 – 3x3 + 5x2 – 6 here. So using this divided difference I can deal with this
type of interpolation problem and get the solution. In the next class maybe I will continue with some
advanced interpolation formulas; that I will discuss in the next lecture. Thank you for
listening to this lecture.
Numerical Methods
Doctor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 24
Interpolation Part IX (Hermite’s Interpolation with Examples)
Welcome to the lecture series on numerical methods; we are discussing
interpolation here. In interpolation we have already discussed the finite difference operators,
like Newton's forward difference operator, backward difference operator and central difference
operator, and after that we have covered unequally spaced interpolation, like Lagrange
interpolation and Newton's divided difference interpolation.
So today we will discuss interpolation based on Hermite's hypothesis, that is,
Hermite's interpolation. And then we will go through some examples that we will solve
using Hermite's interpolation formula.
So whenever we are going for interpolation, basically we are dealing with a
function f(x) defined at a set of data points x0, x1, …, xn, and this function is
approximated by a polynomial of degree n, suppose. Since we have defined the points
x0, x1, …, xn as the tabular points or the nodal points, corresponding to each of these
tabular points we have the associated functional values f(x0), f(x1) up to f(xn).
Sometimes we also express these tabular points in the form (x0, y0), (x1,
y1) up to (xn, yn). Basically the idea is that we want to approximate this function f(x) with a
polynomial p(x). Since the function is prescribed at n + 1 points, if you see, we can
approximate this function with a polynomial of degree n here. If we use the
Lagrange interpolating polynomial, we can use either equally or unequally spaced
interval points.
Basically, if we are considering x0, x1, x2 up to xn, sometimes they may be equally
spaced, sometimes unequally spaced, but for both these cases we can apply
Lagrange's interpolation formula. But here, in Hermite's interpolation, we can extend the
polynomial degree with the same number of points. This means that in Hermite's interpolation
the degree of the polynomial is increased without increasing the number of tabular points here.
This means it uses the functional values f(x0), f(x1), f(x2), … and the first order
derivatives at those points also; that is, it uses the values f(xi) and f'(xi),
where i varies from 0 to n here.
So definitely, if we have the points i = 0, 1, 2 up to n, then f(xi) prescribed at n + 1 points
and f'(xi) prescribed at n + 1 points give us 2n + 2 conditions here. And if we
formulate a polynomial based on these 2n + 2 conditions, it can determine a polynomial of
degree 2n + 1 here. So let y(x) be a polynomial of degree 2n + 1;
suppose we are considering y(x) as a polynomial of degree 2n + 1,
since the polynomial involved here is fixed by 2n + 2 conditions. So
y(x) is a polynomial of degree 2n + 1 which approximates this function f(x) here;
y(x) approximates f(x) at the nodal points x0, x1, …, xn here.
Then definitely we can write y(xi) = f(xi) and y'(xi) = f'(xi), for i = 0, 1, 2 up to n
here. And if we want to express it in polynomial form, we can write this polynomial as y(x) = ∑ i = 0
to n, ui(x)yi + ∑ i = 0 to n, vi(x)yi', where ui(x) and vi(x) are polynomials of degree 2n + 1 here.
So if we want to express this in a complete polynomial sense, this means that we are using
these n + 1 nodal points to determine a polynomial of degree 2n + 1 here. Since we have
only n + 1 points, we have to justify that the polynomial we are generating takes
n + 1 points but produces a polynomial of degree 2n + 1 here. That is
why we have to consider each ui(x) to be a polynomial of degree 2n + 1 and each vi(x) a polynomial of degree
2n + 1 also.
So we are writing y(xi) = f(xi) and y'(xi) = f'(xi), and the polynomial is expressed in
the form y(x) = ∑ i = 0 to n, ui(x)yi + ∑ i = 0 to n, vi(x)yi', which represents a polynomial of
degree 2n + 1.
Then, since we are associating this polynomial in a Lagrangian sense, we impose
conditions on ui(x) and vi(x); everywhere x is associated here, that is, in ui(x),
vi(x) and y(x).
So we can define ui(xj) = 1 for i = j and ui(xj) = 0 for i ≠ j here. And similarly we can define
vi(xj) = 0 for all i and j here.
And similarly, for the derivatives of these two functions, we can define
ui'(xj) = 0 for all i and j, but vi'(xj) gives 1 for i = j and 0 for i ≠
j here. With these conditions, y(x) reproduces the values and the first derivatives at
all the nodes, and the combined polynomial of degree 2n + 1 that they generate is called
the Hermite interpolating polynomial here.
This means that if we want to build ui(x) and vi(x) from the Lagrangian coefficient Li(x) of the Lagrangian interpolating polynomial: since Li(x) has degree n, the product Li(x)·Li(x) generates a polynomial of degree 2n, and if it is multiplied by a linear factor such as aix + bi it generates a polynomial of degree 2n + 1.
This Li(x) can be expressed in the form
Li(x) = (x – x0)(x – x1)…(x – xi–1)(x – xi+1)…(x – xn) / (xi – x0)(xi – x1)…(xi – xi–1)(xi – xi+1)…(xi – xn).
Using this Lagrangian coefficient we can determine the constants ai, bi, ci, di appearing in the polynomials ui(x) and vi(x).
Obviously, as I have discussed in the earlier classes, we can also express Li(x) in product form as Li(x) = φ(x)/((x – xi)φ`(xi)), where φ(x) = (x – x0)(x – x1)…(x – xn) and φ`(xi) means dφ/dx evaluated at x = xi.
(Refer Slide Time: 13:34)
So Li(x) can be expressed in both these forms, since φ(x) can also be represented as the product П(x – xi) over i = 0 to n; in that notation φ`(xi) reduces to the product of (xi – xj) over all j ≠ i.
Whichever form of Li(x) we use (with n + 1 points it represents a polynomial of degree n), we can now take the derivative ui` needed for the Hermite interpolating polynomial. Since ui(x) is expressed in the form ui(x) = (aix + bi)Li²(x), taking the derivative with respect to x gives
ui`(x) = aiLi²(x) + (aix + bi)·2Li(x)Li`(x).
Expanding the second term,
ui`(x) = aiLi²(x) + aix·2Li(x)Li`(x) + bi·2Li(x)Li`(x).
And taking ai common from its two terms, since we want to separate ai in these equations, we can write
ui`(x) = ai[Li²(x) + 2xLi(x)Li`(x)] + bi·2Li(x)Li`(x).
Likewise we can differentiate vi(x). Since vi(x) is written in the form vi(x) = (cix + di)Li²(x), the derivative with respect to x is
vi`(x) = ciLi²(x) + (cix + di)·2Li(x)Li`(x) = ci[Li²(x) + 2xLi(x)Li`(x)] + di·2Li(x)Li`(x).
So if we now put x = xi in these equations, together with the interpolation conditions, we can determine the coefficients ai, bi, ci, di in terms of Li`(xi) and xi.
(Refer Slide Time: 17:17)
So we now put in the values, that is, we use the conditions already written (equations 3a and 3b): ui(xj) = 1 for i = j and 0 for i ≠ j, vi(xj) = 0 for all i and j, ui`(xj) = 0 for all i and j, and vi`(xj) = 1 for i = j and 0 for i ≠ j.
Using these conditions in the equations for ui` and vi`, we can easily determine the coefficients.
From the first condition we get aixi + bi = 1, and from the second condition cixi + di = 0. This is because in the first term, ui(x) = (aix + bi)Li²(x), putting x = xi gives Li²(xi) = 1, while vi(xi) = 0 at every node.
Since we have already defined that ui matches the Lagrangian coefficient (the i = j case) only at x = xi, while in the derivative conditions it is vi that carries the value 1 there, the first pair of equations comes out in this form. And if we use the derivative conditions in the other two equations (6a and 6b), we obtain
[2xiLi`(xi) + 1]ai + 2Li`(xi)bi = 0.
Similarly we get [2xiLi`(xi) + 1]ci + 2Li`(xi)di = 1.
Since now we have four equations with four unknowns, we can easily obtain these coefficients. First, to eliminate bi from the first pair, multiply aixi + bi = 1 through by 2Li`(xi), giving
2Li`(xi)xi·ai + 2Li`(xi)bi = 2Li`(xi),
and subtract the equation [2xiLi`(xi) + 1]ai + 2Li`(xi)bi = 0. The terms in 2xiLi`(xi)ai and in bi cancel out, leaving –ai = 2Li`(xi), that is,
ai = –2Li`(xi).
Putting this ai back into the first equation we obtain bi = 1 + 2Li`(xi)xi. Similarly, eliminating di from the other pair of equations gives ci = 1 and di = –xi.
So with simple multiplication we find the coefficients ai, bi, ci, di from these four equations. In complete form the Hermite interpolating polynomial can then be written as
y(x) = ∑ [1 – 2Li`(xi)(x – xi)]Li²(x)·yi + ∑ (x – xi)Li²(x)·yi`,
with both sums over i = 0 to n. This is the complete formula.
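As a quick check on this derivation, the complete formula can be sketched in code. This is an illustrative sketch only (the function name and structure are our own, not from the lecture); it uses the identity Li`(xi) = ∑ 1/(xi – xj) over j ≠ i for the derivative of a Lagrange coefficient at its own node.

```python
import math

def hermite_interpolate(xs, ys, dys, x):
    """Evaluate y(x) = sum [1 - 2*Li'(xi)*(x - xi)] * Li(x)^2 * yi
                     + sum (x - xi) * Li(x)^2 * yi'   over i = 0..n."""
    total = 0.0
    for i, xi in enumerate(xs):
        Li = 1.0    # Lagrange coefficient Li(x)
        dLi = 0.0   # Li'(xi) = sum over j != i of 1/(xi - xj)
        for j, xj in enumerate(xs):
            if j != i:
                Li *= (x - xj) / (xi - xj)
                dLi += 1.0 / (xi - xj)
        total += (1 - 2 * dLi * (x - xi)) * Li**2 * ys[i]   # ui(x) * yi
        total += (x - xi) * Li**2 * dys[i]                  # vi(x) * yi'
    return total

# Only n + 1 = 3 nodes of sin, yet the degree-5 Hermite fit is accurate:
nodes = [-1.0, 0.0, 1.0]
print(round(hermite_interpolate(nodes,
                                [math.sin(v) for v in nodes],
                                [math.cos(v) for v in nodes], 0.5), 4))  # 0.4794
```

Note how each summand pairs the value yi with ui(x) and the slope yi` with vi(x), exactly as in the formula above.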
Sometimes these coefficients are also remembered as hi(x) and h̄i(x). Since Li²(x) is easy to remember, note that the first coefficient multiplies yi and the second multiplies yi`, and together they generate the polynomial of degree 2n + 1. The beauty of this method is that only n + 1 points are used to get a polynomial of degree 2n + 1.
As for the error term: since the interpolating polynomial here has degree 2n + 1, the error can be represented as
f^(2n+2)(ζ)/(2n + 2)! · П(x – xi)²,
whereas for the earlier Lagrangian interpolation we used to write the error term as f^(n+1)(ζ)/(n + 1)! · П(x – xi). Since Li²(x) appears here, each factor (x – xi) is squared, and correspondingly the derivative order rises to 2n + 2.
Based on this, let us go through a problem: using Hermite's interpolation formula, find the value of sin(0.5) from the following data. We first write L0(x), L1(x), L2(x), then find their derivatives, and put everything in the formula.
Then we can obtain the interpolated value for that problem. So the sin x data is given and it is asked to find sin(0.5). Suppose the tabular values x, sin x, cos x are given as:
x: –1, 0, 1
sin x: –0.8415, 0, 0.8415
cos x: 0.5403, 1, 0.5403
Here f(x) is sin x and f `(x) is cos x. So in the formula we directly put yi = y(xi) = f(xi) = sin(xi), and yi` = y`(xi) = cos(xi).
So first we find the Lagrangian coefficients. L0(x) can be written in the form (x – x1)(x – x2)/((x0 – x1)(x0 – x2)). Similarly, L1(x) = (x – x0)(x – x2)/((x1 – x0)(x1 – x2)), and L2(x) = (x – x0)(x – x1)/((x2 – x0)(x2 – x1)).
Putting in the nodes x0 = –1, x1 = 0, x2 = 1, we get L0(x) = x(x – 1)/2, with denominator 2. Similarly L1(x) = (x + 1)(x – 1)/(–1) = 1 – x², and L2(x) = (x + 1)x/2, again with denominator 2.
So we can directly put in the value x = 0.5: L0(0.5) = –0.125, L1(0.5) = 0.75, and L2(0.5) = 0.375.
Similarly we can obtain the derivatives directly, since these are simple polynomials: L0`(x) = (2x – 1)/2.
(Refer Slide Time: 31:28)
And L0`(–1) = –3/2. Then, since L1(x) = 1 – x², we have L1`(x) = –2x, so L1`(0) = 0; and L2`(x) = (2x + 1)/2, so L2`(1) = 3/2.
So finally, putting all these values into the Hermite formula, we get y(0.5) = 0.4794, which agrees with the exact value sin(0.5) = 0.4794 to four decimal places. If the Lagrangian formula (degree 2, using only the function values) is used instead, we get sin(0.5) as 0.4207.
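The arithmetic of this worked example can be verified in a few lines (our own quick check, not part of the lecture):

```python
import math

L = lambda x: (x * (x - 1) / 2, 1 - x**2, x * (x + 1) / 2)   # L0, L1, L2
print([round(v, 3) for v in L(0.5)])   # [-0.125, 0.75, 0.375]

dL = [-1.5, 0.0, 1.5]                  # L0'(-1), L1'(0), L2'(1)
xs = [-1.0, 0.0, 1.0]
# Hermite formula: sum of [1 - 2*Li'(xi)*(x - xi)]*Li^2*sin(xi) + (x - xi)*Li^2*cos(xi)
y = sum((1 - 2 * dL[i] * (0.5 - xs[i])) * L(0.5)[i]**2 * math.sin(xs[i])
        + (0.5 - xs[i]) * L(0.5)[i]**2 * math.cos(xs[i]) for i in range(3))
print(round(y, 4))                     # 0.4794
```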
Note, however, that a higher-degree polynomial does not necessarily provide a better result. Thank you for listening to this lecture.
Numerical Methods
Doctor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 25
Numerical Differentiation Part I (Introduction to Numerical Differentiation)
Welcome to the lecture series on numerical methods. In today's lecture we will go for numerical differentiation. In numerical differentiation we will discuss various interpolating polynomials, and by differentiating these polynomials we can evaluate the derivatives of a function. This means that if we are approximating a function with a polynomial, the derivative of the function can be taken as the derivative of the polynomial.
So first we will go through the introduction: how can we approximate a function with a polynomial? Then we will start differentiation by interpolation formulas, first the finite difference formulas, then the method of undetermined coefficients, and in the last phase we will discuss unequal intervals.
In this section we deal with the estimation of the value of a derivative based on certain known functional values, and we usually evaluate the derivative at one of those points. If the exact function is not known to us, we can approximate the function by a polynomial, evaluate the derivatives of the polynomial, and take these as the derivatives of the function.
Several methods are available to find the derivatives of a function from a given set of tabular data. If we have the tabular data (x0, y0), (x1, y1), …, (xn, yn), then even if the function is not known to us we can evaluate its derivative at certain points. Here we will discuss methods based on both equally and unequally spaced data points.
Already in the last lectures we have discussed that some interpolation polynomials deal with equally spaced points, while others work for both equally and unequally spaced points, and we have also discussed the drawbacks and disadvantages of the different interpolation formulas.
The methodology for computing a derivative at a given point is rather simple. Various methods are available to evaluate the first-order derivative dy/dx; of course, if the function is exactly known to you, it is easy to evaluate directly.
(Refer Slide Time: 02:54)
We have already discussed in the last classes how to approximate a function with a polynomial. To find the derivative of the function, we differentiate the polynomial pn(x) which approximates the function f(x). That is, if we have the tabular points (x0, y0), (x1, y1), …, (xn, yn), where only these tabular values are known to us but the function is not, then we can formulate a polynomial through all these tabular points and from that polynomial find the derivative for the function.
If it is required to evaluate the derivative at a certain point, we substitute that point into the derivative of the polynomial. If you see the graph here, y = f(x) is the dotted curve, and this curve is approximated by a polynomial y = p(x).
If you visualise a point A on this graph and plot the tangents AF and AP at that point for the two curves, we find that these tangents can be completely different. So sometimes the derivative of the polynomial differs completely from the derivative of the function it approximates.
But if the tabular points are taken sufficiently close together, the derivative of the function becomes practically equal to the derivative of the polynomial. This means we have to discretize the domain finely enough that the tangents of the two curves at the points of interest agree.
Then we can say that the derivative of the function and the derivative of the polynomial are equal. So first in this discussion we will consider equi-spaced points: whenever the tabular points are equally spaced, how can we use differentiation there? Then we will go for finite difference operators, and then for undetermined coefficients.
Suppose we have a given set of data values of f(x) at x0, x1, …, xn. The general approach to derive a numerical differentiation method is first to obtain the interpolating polynomial, and then differentiate it; if we differentiate the polynomial r times we write it as pn^(r)(x). This means that f(x) is first approximated by a polynomial pn(x) of degree n, and then the rth-order derivative of pn(x) is written as pn^(r)(x).
Then at a particular point we can write the rth-order derivative of the polynomial as pn^(r)(xk), evaluated at the point x = xk. It may be noted that pn(x) and f(x) may have the same values at the nodal points while their derivatives are different; that I have already shown in the graph.
First we will go for equi-spaced points, and the first differentiation with interpolation we will carry out is based on Newton's forward interpolating polynomial. Basically, the interpolating polynomial for Newton's forward difference formula is expressed, for any point xp, as y(xp), or simply yp.
This can be written in the form
yp = y0 + pΔy0 + p(p – 1)/2! Δ²y0 + … + p(p – 1)…(p – n + 1)/n! Δⁿy0.
To differentiate this, first we consider xp = x0 + ph, where xp is the point at which we want to evaluate the interpolation formula. Differentiating, dx = h dp, or dp/dx = 1/h.
Either way you can define this first-order relation. So for the derivative of this polynomial we write dy/dx, the first derivative of y with respect to x, as dy/dp · dp/dx.
And this can be written as (1/h) d/dp of the series, since the whole expression is in the variable p, which makes it easy to differentiate: (1/h) d/dp [y0 + pΔy0 + …]. Differentiating term by term, the first term y0 gives 0, the second gives Δy0, and the third term, which is (p² – p)/2! Δ²y0, gives (2p – 1)/2 Δ²y0. Likewise all the other terms can be differentiated.
So in complete form,
dy/dx = (1/h)[Δy0 + (2p – 1)/2 Δ²y0 + (3p² – 6p + 2)/6 Δ³y0 + …].
Similarly, for the second-order derivative we write d²y/dx².
This is nothing but d/dx[(1/h) dy/dp], and differentiating once more we can write it as (1/h²) d²y/dp².
So applying this second differentiation to the formula, d²y/dx² = (1/h²)[…]: differentiating the series once more with respect to p, the Δy0 term gives 0; the Δ²y0 term gives coefficient 2/2 = 1, so Δ²y0 is the first term; and the next term gives (6p – 6)/6 Δ³y0 = (p – 1)Δ³y0. All other terms are carried out in the same fashion.
These formulas can be used to compute the first and second derivatives near the upper end, that is, the beginning of the table. As already discussed in the previous classes, Newton's forward difference formula is used when the required values lie near the beginning of the table.
Suppose we are asked to compute the derivative exactly at a nodal point, say at x = x0; exactly at that point p = 0, and the formulas reduce. With p = 0 we can write
dy/dx at x = x0 = (1/h)[Δy0 – 1/2 Δ²y0 + 1/3 Δ³y0 – …],
where the coefficient 1/3 comes from putting p = 0 in (3p² – 6p + 2)/6 = 2/6 = 1/3; all the other terms are handled in the same way.
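At p = 0 this series is straightforward to code; here is a minimal sketch (our own illustration, assuming equally spaced values, with a hypothetical function name):

```python
def forward_derivative_at_x0(ys, h, terms=3):
    """y'(x0) ~ (1/h) * [dy0 - d2y0/2 + d3y0/3 - ...], truncated
    after `terms` forward differences of the equally spaced values ys."""
    col = list(ys)
    total = 0.0
    for k in range(1, terms + 1):
        col = [b - a for a, b in zip(col, col[1:])]   # next difference column
        total += (-1) ** (k + 1) * col[0] / k         # leading entry, coefficient +-1/k
    return total / h

# Example: these values coincide with ln x at x = 2.0, 2.2, ..., 3.0,
# for which the exact derivative at 2.0 is 0.5:
print(round(forward_derivative_at_x0(
    [0.6932, 0.7885, 0.8755, 0.9555, 1.0296, 1.0986], 0.2), 4))  # 0.4994
```

The alternating 1/k coefficients are exactly the p = 0 values of the series derived above.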
Similarly, if it is asked to compute d²y/dx² at x = x0, then putting p = 0 the formula becomes
d²y/dx² at x = x0 = (1/h²)[Δ²y0 – Δ³y0 + 11/12 Δ⁴y0 – …].
For this type of differentiation let us consider one example.
In this example all the points are equi-spaced. Take the tabular points x as 2.00, 2.20, 2.40, 2.60, 2.80, 3.00, with functional values y as 0.6932, 0.7885, 0.8755, 0.9555, 1.0296, 1.0986. The differences we can obtain easily, since in the last lecture we have already explained that.
For the first differences we take the differences of consecutive values: 0.7885 – 0.6932 = 0.0953, then 0.0870, 0.0800, 0.0741, 0.0690. The second differences are –0.0083, –0.0070, –0.0059, –0.0051; the third differences are 0.0013, 0.0011, 0.0008; and the fourth differences are –0.0002 and –0.0003. In tabular form:
x       y        Δy       Δ²y       Δ³y      Δ⁴y
2.00    0.6932   0.0953   –0.0083   0.0013   –0.0002
2.20    0.7885   0.0870   –0.0070   0.0011   –0.0003
2.40    0.8755   0.0800   –0.0059   0.0008
2.60    0.9555   0.0741   –0.0051
2.80    1.0296   0.0690
3.00    1.0986
(Refer Slide Time: 18:26)
Now we put these differences into the differentiated Newton's forward formula, keeping terms up to the third differences; the fourth differences are very small, so we can stop at the third. The question asks: using the following data, find y` and y`` at 2.00, that is, at x = x0, using up to third differences. Up to third differences the first formula is
dy/dx = (1/h)[Δy0 – 1/2 Δ²y0 + 1/3 Δ³y0].
Since it is asked to evaluate up to third differences only, we keep the terms to that order. For the second-order derivative we write
d²y/dx² = (1/h²)[Δ²y0 – Δ³y0].
Putting in the tabular values with x0 = 2.00 and h = 0.20:
y`(2.00) = (1/0.2)[0.0953 – 1/2(–0.0083) + 1/3(0.0013)] = 0.4994,
y``(2.00) = (1/0.2)²[–0.0083 – 0.0013] = –0.24.
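These two numbers can be reproduced directly from the difference table in a few lines (a short verification sketch, not from the lecture):

```python
ys = [0.6932, 0.7885, 0.8755, 0.9555, 1.0296, 1.0986]  # y at x = 2.0, 2.2, ..., 3.0
h = 0.2

d1 = [b - a for a, b in zip(ys, ys[1:])]   # first  differences
d2 = [b - a for a, b in zip(d1, d1[1:])]   # second differences
d3 = [b - a for a, b in zip(d2, d2[1:])]   # third  differences

y1 = (d1[0] - d2[0] / 2 + d3[0] / 3) / h   # y'(2.00), up to third differences
y2 = (d2[0] - d3[0]) / h**2                # y''(2.00), up to third differences
print(round(y1, 4), round(y2, 2))          # 0.4994 -0.24
```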
Next we go for Newton's backward interpolating polynomial. The Newton's backward difference interpolation formula is written in the form
y(xp) = y0 + p∇y0 + p(p + 1)/2! ∇²y0 + …,
where p = (x – x0)/h, that is, x = x0 + ph. Differentiating both sides, dp/dx = 1/h; and for the derivative of y(x) we write dy/dx = dy/dp · dp/dx = (1/h) dy/dp, since y is a function of p.
For the second-order derivative, d²y/dx² = d/dx (dy/dx); replacing dy/dx by (1/h) dy/dp, this becomes (1/h²) d²y/dp².
Writing out the first derivative of the full formula,
dy/dx = (1/h) d/dp [y0 + p∇y0 + p(p + 1)/2! ∇²y0 + …].
Differentiating term by term, y0 is a constant and gives 0, the next term gives ∇y0, then (2p + 1)/2 ∇²y0, and so on for the third-order term and beyond.
Similarly, for the second-order derivative we write d²y/dx² = (1/h²)[…]. Differentiating once more, the ∇y0 term is now a constant and drops out, the ∇²y0 term gives coefficient 2/2 = 1, so the first term is ∇²y0, and the immediate next term is (p + 1)∇³y0; all other terms follow in the same way.
As I have explained in the previous lectures, the backward difference formula is used at the end of the table: if a value is asked near the upper end of the table we use Newton's forward difference formula, and near the lower end the backward difference formula. Suppose we use the tabular point x = x0 at the lower end, where p = 0.
This means we are shifting x0 to the lower end of the table, so the points above it are labelled x–1, x–2, …, x–n, and exactly at x = x0 we put p = 0.
So the formula reduces to
dy/dx at x = x0 = (1/h)[∇y0 + 1/2 ∇²y0 + …],
with the rest of the terms following from the formulation itself.
Similarly, putting p = 0 in d²y/dx² we obtain the second derivative exactly at x = x0 at the lower end of the table:
d²y/dx² at x = x0 = (1/h²)[∇²y0 + ∇³y0 + 11/12 ∇⁴y0 + …],
the immediate next term being 11/12 ∇⁴y0.
Now take the set of data points x = 1.00, 1.25, 1.50, 1.75, 2.00, 2.25 with corresponding y values 2.7183, 3.4903, 4.4817, 5.7546, 7.3891, 9.4877. In the backward difference formula the first difference is again the difference of consecutive values, but we place x0 at the end of the table, so the tabular values are read from the bottom of the table upwards.
So using these tabular values, take the differences: 3.4903 – 2.7183 = 0.7720, then 4.4817 – 3.4903 = 0.9914, and so on; likewise all the differences can be calculated.
The question asks for y` and y`` at x = 2.25, which is at the end of the table, using up to third differences. Up to third differences the formulas are
dy/dx = (1/h)[∇y0 + 1/2 ∇²y0 + 1/3 ∇³y0],
d²y/dx² = (1/h²)[∇²y0 + ∇³y0].
Putting h = 0.25 and x0 = 2.25, we can now evaluate the derivative y` at 2.25 with 1/h = 1/0.25.
From the table, ∇y0 = 2.0986, ∇²y0 = 0.4641, and ∇³y0 = 0.1025. So
y`(2.25) = (1/0.25)[2.0986 + 1/2(0.4641) + 1/3(0.1025)] = 9.4593.
Similarly, for the second-order derivative at 2.25,
y``(2.25) = (1/0.25)²[0.4641 + 0.1025] = 9.0656.
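The same style of check works here; the backward differences at x = 2.25 are simply the last entries of each difference column (a quick verification sketch, not from the lecture):

```python
ys = [2.7183, 3.4903, 4.4817, 5.7546, 7.3891, 9.4877]  # e^x at x = 1.00, 1.25, ..., 2.25
h = 0.25

d1 = [b - a for a, b in zip(ys, ys[1:])]
d2 = [b - a for a, b in zip(d1, d1[1:])]
d3 = [b - a for a, b in zip(d2, d2[1:])]

# Backward differences at the last tabular point = last entry of each column
y1 = (d1[-1] + d2[-1] / 2 + d3[-1] / 3) / h   # y'(2.25), up to third differences
y2 = (d2[-1] + d3[-1]) / h**2                 # y''(2.25), up to third differences
print(round(y1, 4), round(y2, 4))             # 9.4593 9.0656
```

Both values are close to the exact e^2.25 = 9.4877, as expected since every derivative of e^x is e^x.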
In the next class we will continue with differentiation using the Lagrange interpolating polynomial, both for equi-spaced and unequally spaced points. Thank you for listening to this lecture.
Numerical Methods
Professor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 26
Numerical differentiation part-II (Differentiation based on Lagrange’s interpolation)
Welcome to the lecture series on numerical methods. We have discussed interpolation, and in the last lecture we discussed numerical differentiation based on Newton's forward difference and Newton's backward difference operations. Today we will discuss numerical differentiation based on Lagrange interpolation and divided difference interpolation, and then, from that differentiation, how we can determine the maxima and minima of a function based on its polynomial approximation.
Basically, whenever we have tabular values (xi, f(xi)) at n + 1 distinct points, i = 0, 1, …, n, the Lagrange polynomial Pn(x) is written as
Pn(x) = ∑ Li(x) f(xi), with the sum over i = 0 to n,
or, for convenience, Pn(x) = ∑ Lk(x) f(xk) over k = 0 to n, where Lk(x) is called the Lagrange coefficient. This means we are constructing a polynomial Pn(x) based on these n + 1 nodal points and the functional values f(xk). The Lagrange coefficient is usually written as
Lk(x) = ω(x)/((x – xk) ω`(xk)), where ω(x) = (x – x0)(x – x1)…(x – xn).
Since the tabular points are given in the form (x0, f(x0)), (x1, f(x1)), …, (xn, f(xn)), we evaluate the derivatives in situations where the function is not known to us explicitly. That is, if the tabular values (x0, y0), (x1, y1), …, (xn, yn) are given but the underlying function is not, we approximate by the polynomial Pn(x), from which we can get the derivative of f(x) at different points: at the tabular points themselves, or at other points lying within the intervals of the tabular data.
If we want to evaluate the derivative of Pn(x), it is written as Pn`(x), and this can be written as
Pn`(x) = ∑ Li`(x) f(xi), summed over i = 0 to n.
Since ω(x), and hence each Li(x), is a function of x, we can differentiate it; the derivative of Pn(x) is obtained simply by differentiating each Li(x) and multiplying by the given functional (tabular) values at the nodal points.
If we can just write this one in a complete form, we can just write this one as L0`(x) f(x0) +
L1`(x) f(x1) + … up to the final, nth point, Ln`(x) f(xn), since these values x0, x1 and x2 up to
xn are known, and these functional values, which are all constants here, are also known to
us. So specifically, if you just see the slides, we can just say that xi are the values which
are expressed at all of these node points, that is in the form x0, x1 to xn, and the functional
values are expressed in the
form f(x0), f(x1) to f(xn) here. So suppose first we will just consider a Lagrange polynomial
which uses 3 points, since at a time, if you will consider n points, it is difficult to
understand, so first we will just consider 3 points, suppose.
So if we assume that the function goes through, suppose, 3 points, and the points are defined in
the form (x0, f(x0)), (x1, f(x1)), (x2, f(x2)); either you can just write the values in the form
f(x0), f(x1), f(x2) or you can just write them as (x0, y0), (x1, y1), (x2, y2), since we are just
expressing y = f(x). And if we will just consider this polynomial here, it can just generate
a polynomial of degree 2, that is in the form P2(x) here, which can be written in the form
L0(x) f(x0) + L1(x) f(x1) + L2(x) f(x2) here. And if we will just write all the terms
independently here, this means that L0(x) can be written in the form (x – x1)(x – x2)/{(x0 –
x1)(x0 – x2)} here.
Similarly, L1(x) can be written in the form (x – x0)(x – x2)/{(x1 – x0)(x1 –
x2)} here. Similarly, L2(x) can also be written in the form (x – x0)(x – x1)/{(x2 –
x0)(x2 – x1)}.
(Refer Slide Time: 7:58)
So if we will just express these 3 Lagrange polynomial coefficients L0(x), L1(x) and L2(x) in
this form here, then we can just write P2(x) in the form (x – x1)(x – x2)/{(x0 – x1)(x0 –
x2)} . f(x0) + (x – x0)(x – x2)/{(x1 – x0)(x1 – x2)} . f(x1) + (x – x0)(x – x1)/{(x2 – x0)(x2 – x1)} . f(x2) here.
So if you will just write this polynomial in this form, then usually we can just evaluate the
derivative of this polynomial P2(x) here. So if you just differentiate this polynomial P2(x)
with respect to x here, then we can just write this differentiation as P2`(x) here.
So for P2`(x), if you just see here: the first numerator product is (x – x1)(x – x2);
differentiating this product gives (x – x1) + (x – x2), that is 2x – (x1 + x2), since the
constant part has derivative 0 here. And the denominator, if you just see, is all constants, so
that is why we can just write the first term as {2x – (x1 + x2)}/{(x0 – x1)(x0 – x2)} . f(x0),
since all are here constants only.

So the next derivative term, if you just consider here, the next derivative term associated with this
P2(x) can be written in the form {2x – (x0 + x2)}/{(x1 – x0)(x1 – x2)} . f(x1) + the last term,
if you just differentiate, it can be written in the form {2x – (x0 + x1)}/{(x2 – x0)(x2 – x1)} . f(x2).
So now if you just differentiate P2`(x) once more here, then we can just obtain P2``(x) also
here; so next we will just go for the second differentiation of this function here.
If you will just differentiate once more here, then we can just write P2``(x), and this can be
written as: only the constant coefficients are involved here, so 2/{(x0 – x1)(x0 – x2)} . f(x0)
+ 2/{(x1 – x0)(x1 – x2)} . f(x1) + 2/{(x2 – x0)(x2 – x1)} . f(x2) here. And if you will just
combine and write these terms over a common denominator, 2 is common for all the terms here,
and the denominator becomes (x0 – x1)(x1 – x2)(x2 – x0) here.
And if you will just write the numerator part of the term here, this can be written in the form
(x2 – x1) . f(x0) + the next term, if you just see here, that is (x0 – x2) f(x1), + the last term,
that is in the form (x1 – x0) f(x2). So this is the complete representation of the second
order differentiation of P2(x); this means that if you will just involve only 3 points, then we
can just represent this second derivative in this form.
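These two 3-point expressions can be checked numerically; the following is a small illustrative Python sketch (the helper names are mine, not from the lecture), valid for unequally spaced points as well:

```python
def p2_prime(x, xs, fs):
    """P2'(x): each term is (2x minus the other two nodes) over the
    constant denominator (x_k - x_i)(x_k - x_j), times f(x_k)."""
    x0, x1, x2 = xs
    f0, f1, f2 = fs
    return ((2*x - x1 - x2) / ((x0 - x1)*(x0 - x2)) * f0
            + (2*x - x0 - x2) / ((x1 - x0)*(x1 - x2)) * f1
            + (2*x - x0 - x1) / ((x2 - x0)*(x2 - x1)) * f2)

def p2_double_prime(xs, fs):
    """P2''(x) is constant:
    2[(x2-x1) f0 + (x0-x2) f1 + (x1-x0) f2] / ((x0-x1)(x1-x2)(x2-x0))."""
    x0, x1, x2 = xs
    f0, f1, f2 = fs
    return 2*((x2 - x1)*f0 + (x0 - x2)*f1 + (x1 - x0)*f2) \
        / ((x0 - x1)*(x1 - x2)*(x2 - x0))

# Data from a quadratic f(x) = x**2 on unequal nodes:
xs, fs = [0.0, 1.0, 3.0], [0.0, 1.0, 9.0]
print(p2_prime(2.0, xs, fs))    # ≈ 4.0, since f'(2) = 2x = 4
print(p2_double_prime(xs, fs))  # ≈ 2.0, since f'' = 2
```

Both results are exact here because the data come from a quadratic, which P2(x) reproduces identically.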
So likewise we can just express the nth differentiation, or if you will just consider a polynomial
of degree n also here, then the first order differentiation, second order differentiation, third
order differentiation we can just get in the same manner here. So based on this we can just
say: sometimes we say that the Lagrange interpolating polynomial can be applicable both for
equi-spaced points and unequi-spaced points. In this case we have considered that all points
may be unequally spaced; now suppose we will just consider that all the points are equally spaced.
Suppose x0 is the starting point here and xn is the endpoint here; if we will just divide this
complete interval into n equal parts, we can just say that it is equally spaced or all the points
are equi-spaced, that is xi = x0 + ih. So if we will just replace all of these points through this
transformation, then we can just obtain the first order differentiation P2`(x0). Suppose we are
just considering this differentiation at the point x0 here; if you will just consider that one, it
can be written in the following form.
So, since this differentiation we want to determine at the point x0 here, we put x = x0: the first
term can be written as {2x0 – (x0 + h) – (x0 + 2h)}/[{x0 – (x0 + h)}{x0 – (x0 + 2h)}] . f(x0)
+ {2x0 – x0 – (x0 + 2h)}/[{(x0 + h) – x0}{(x0 + h) – (x0 + 2h)}] . f(x0 + h) + the last term,
if you will just write, {2x0 – x0 – (x0 + h)}/[{(x0 + 2h) – x0}{(x0 + 2h) – (x0 + h)}] .
f(x0 + 2h) here.
So if you just cancel all of these x0 terms in the numerators and denominators here, then
finally, if you just write all the terms, we can just write P2`(x0): first, you just see here – h,
– 2h, and here also factors of h, so if you will just see, we can just take 1/(2h) common from all
these terms, and it can be written in the form (1/2h){– 3f(x0) + 4f(x1) – f(x2)} here.
This means, all of the terms if you just see: the first term, its numerator is – h – 2h, that is
– 3h, and its denominator is (– h)(– 2h), that is 2h2, so it gives – 3/(2h) f(x0). For the next
one, in the numerator 2x0 – x0 – x0 we will just cancel it out, so – 2h remains in the upper
side here; then in the lower side, if you just see, (x0 + h) – x0 gives h, and (x0 + h) –
(x0 + 2h) gives – h, so h . (– h) will just generate here – h2, and that term becomes
+ 4/(2h) f(x1).
So likewise, if you just solve these 3 terms, we can just obviously get this 3-point formula
here; this is called the first 3-point formula for the first order derivative here. Next we will
go for the second order differentiation here: so if you will just evaluate P2`` at x0, then
we can just obtain P2``(x0) directly from the same formulation, that
is 2/[{x0 – (x0 + h)}{x0 – (x0 + 2h)}] . f(x0) + 2/[{(x0 + h) – x0}{(x0 + h) – (x0 +
2h)}] . f(x0 + h) + 2/[{(x0 + 2h) – x0}{(x0 + 2h) – (x0 + h)}] . f(x0 + 2h). So if you just
see here, all of these x0 terms cancel out.
If you will just write it in a complete form here: the first denominator is (– h)(– 2h), that is
2h2; the second one is (h)(– h), that is – h2; and the last one, if you just see, is (2h)(h), that
is 2h2. And in fact, if you just see these terms, the first one just gives (1/h2) f(0), the
second part is nothing but – (2/h2) f(1), and the last one will just give you (1/h2) f(2); and
finally, since 1/h2 can be taken out common from all the terms here, we can just write this as
{f(0) – 2f(1) + f(2)}/h2.
Here we are just writing f(0) for f(x0), f(1) for f(x1) and f(2) for f(x2); this is called the
second order forward difference formula, and especially, if you just see, the same formula you
can get also from the Taylor series expansion. So based on this differentiation we can just go
for the solution of some problems; first we will discuss these problems based on this Lagrange
interpolation, where it can be exactly applied to the nodal points. We can say that either at x0
or at x1, if we just put that point directly, then in the same way as whatever we have just
discussed here, we can just differentiate and we can obtain the solution.
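For equally spaced points, the two forward formulas just derived can be written as a short Python sketch (the function names are my own):

```python
def d1_forward3(f, x0, h):
    """First derivative at x0 from the 3-point forward formula:
    f'(x0) ≈ (-3 f(x0) + 4 f(x0+h) - f(x0+2h)) / (2h)."""
    return (-3*f(x0) + 4*f(x0 + h) - f(x0 + 2*h)) / (2*h)

def d2_forward3(f, x0, h):
    """Second derivative at x0 from the 3-point forward formula:
    f''(x0) ≈ (f(x0) - 2 f(x0+h) + f(x0+2h)) / h**2."""
    return (f(x0) - 2*f(x0 + h) + f(x0 + 2*h)) / h**2

# Both formulas are exact (up to rounding) for a quadratic such as f(x) = x**2:
f = lambda x: x*x
print(d1_forward3(f, 1.0, 0.1))   # ≈ 2.0, since f'(1) = 2
print(d2_forward3(f, 1.0, 0.1))   # ≈ 2.0, since f'' = 2
```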
So first, if you just consider this example, that is in the form: x is prescribed at 0, 2, 3, 4, 7
suppose, since we have considered unequally spaced points, and the functional values are
expressed as 4, 26, 58, 112, 466 suppose, and the value is asked to be computed at x = 2. Since,
if you just see this tabular value here, x = 2 is a particular tabular value in this problem
here, and it has asked you to evaluate y`(2), so first let us just discuss here.
Find y`(2) using the following data, suppose the question is asked and the data points are
given in the form like x at 0, 2, 3, 4, 7 suppose and these functional values are expressed in
the form 4, 26, 58, 112 and 466, this is suppose given here. And obviously we can say that we
have to find y`(x) at x = 2 here, obviously 2 is nothing but the tabular point here. And if we
will just use this Lagrange interpolation, then, since if you just see here 5 nodal points are
there, it can just generate a polynomial of degree 4 here, so that is why we can just write
Pn(x) = k: 0 to 4 ∑ Lk(x) f(xk)
here.
And directly we can just write Lk(x) as ω(x)/{(x – xk) ω`(xk)} here. And ω(x) can be expressed in
the form of (x – 0)(x – 2)(x – 3)(x – 4)(x – 7) here since all the points should be included in
the numerator product part here. So if we want to find this L0(x) here, so L0(x) can be written
in the form like (x – x1)(x – x2)(x – x3)(x – x4)/(x0 – x1)(x0 – x2)(x0 – x3)(x0 – x4) here. And
directly if you just write these functional values in the upper part here, this can be written as
(x – 2)(x – 3)(x – 4)(x – 7)/(0 – 2)(0 – 3)(0 – 4)(0 – 7) here.
And similarly if you just write L1(x), this can be represented in the form of like (x – x0)(x –
x2)(x – x3)(x – x4)/(x1 – x0)(x1 – x2)(x1 – x3)(x1 – x4) and this can be written in the form of
like (x – 0)(x – 3)(x – 4)(x – 7)/(2 – 0)(2 – 3)(2 – 4)(2 – 7) here. Similarly we can just write
L2(x) as (x – x0)(x – x1)(x – x3)(x – x4)/{(x2 – x0)(x2 – x1)(x2 – x3)(x2 – x4)}, which can be
represented in the data points as (x – 0)(x – 2)(x – 4)(x – 7)/{(3 – 0)(3 – 2)(3 – 4)(3 – 7)} here.
Similarly, we can define L3(x) as (x – 0)(x – 2)(x – 3)(x – 7)/{(4 – 0)(4 – 2)(4 – 3)(4 – 7)}
here. And L4(x) can be defined in the form (x – 0)(x – 2)(x – 3)(x – 4)/{(7 – 0)(7 – 2)(7 –
3)(7 – 4)}, so that is the representation for L4(x) here.
So if you just consider the first order derivative based on Lagrange interpolation, then we can
just express Pn`(x) as Pn`(x) = k: 0 to 4 ∑ Lk`(x) f(xk). And we can just define Lk`(x) directly
as d/dx [ω(x)/{(x – xk) ω`(xk)}], where ω(x), if you just see, is the function of x only here,
that is in the form (x – 0)(x – 2)(x – 3)(x – 4)(x – 7) here, and directly we can just evaluate
this derivative with respect to x for this function, and obviously we want to write it,
suppose, at the point 2 here.
If you will just see the first function L0`(2) here: since at a particular point we are just
evaluating, L0(x) is represented in the form (x – 2)(x – 3)(x – 4)(x – 7)/{(0 – 2)(0 – 3)(0 –
4)(0 – 7)}, this one. So if you just take this differentiation with respect to x by the product
rule and evaluate at the point 2, then except the one product term in which the factor (x – 2)
itself has been differentiated, which will just give you a nonzero value, everywhere else this
(x – 2) is present, and for all of those remaining derivative terms that will just give a 0 term
there. So that is why we can just write this complete product form, that one, as (2 – 3)(2 –
4)(2 – 7)/{(– 2)(– 3)(– 4)(– 7)} here, and barely if you just calculate, that is just giving you
– 5/84.
Similarly, if you just calculate here L1`(2): the next immediate calculation is L1(x), and L1(x)
can be written in the form (x – 0)(x – 3)(x – 4)(x – 7)/{(2 – 0)(2 – 3)(2 – 4)(2 – 7)} here. And
if you just take the derivative with respect to x and put the value 2 there, all of these terms
will exist, since (x – 2) is not a factor here: the first differentiation gives the term
(x – 3)(x – 4)(x – 7); the second one gives + x(x – 4)(x – 7); the next term, if you take the
differentiation with respect to (x – 4), gives + x(x – 3)(x – 7); and the last one gives
+ x(x – 3)(x – 4). And at the point 2 all of these terms will exist, and in that case, if you
just calculate, we can write this one as L1`(2) = – 6/5 here.

Then we will just go for L2`(2); for L2`(2), if you just see, again the same scenario can
happen: this means that if you will just calculate L2(x), again this (x – 2) term will be there,
so that is why, except one term, all other terms will just vanish after this differentiation at
x = 2, and we can just write this differentiation as (2 – 0)(2 – 4)(2 – 7)/{(3 – 0)(3 – 2)(3 –
4)(3 – 7)}, which can be written as 20/12 here, so the value is coming as L2`(2) = 5/3 here.
And in the same manner, if you just compute L3`(2), which we have to treat the same way here
also, that will just give you – 5/12. And if you just compute L4`(2), that will just give
you 1/105.
Once all of these values are known to us, then we can just write the functional derivative;
this means that we can just write y`(2) as L0`(2) f(x0) + L1`(2) f(x1) + L2`(2) f(x2) + L3`(2)
f(x3) + L4`(2) f(x4) there. So if you just write all these terms here, the derivative y`(2), or
P`(2), can be written as – 5/84 . 4, that is the first term into f(x0); – 6/5 . 26; the next
term is 5/3 . 58; the immediate next term is – 5/12 . 112; + 1/105 . 466 here; and the computed
value, that is just giving you here 23.
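This node-point differentiation can be cross-checked with a small general Python sketch (the function name is my own) that differentiates each Lagrange coefficient by the product rule and sums the contributions:

```python
def lagrange_deriv(xs, ys, t):
    """Derivative of the Lagrange interpolating polynomial at t:
    P'(t) = sum over k of L_k'(t) * y_k, with the numerator of each
    L_k differentiated by the product rule."""
    n = len(xs)
    total = 0.0
    for k in range(n):
        # constant denominator: product of (x_k - x_j) for j != k
        denom = 1.0
        for j in range(n):
            if j != k:
                denom *= xs[k] - xs[j]
        # product-rule expansion of the numerator, evaluated at t
        num_deriv = 0.0
        for i in range(n):
            if i == k:
                continue
            term = 1.0
            for j in range(n):
                if j != k and j != i:
                    term *= t - xs[j]
            num_deriv += term
        total += num_deriv / denom * ys[k]
    return total

# Data from the lecture's example:
xs = [0.0, 2.0, 3.0, 4.0, 7.0]
ys = [4.0, 26.0, 58.0, 112.0, 466.0]
print(lagrange_deriv(xs, ys, 2.0))   # ≈ 23.0
```

At the node t = 2 the same cancellations described above happen automatically inside the product-rule sum.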
So completely, if we just write all these terms, then we can evaluate this first order
derivative of the function f(x) by constructing a polynomial Pn(x) here. So obviously, based on
that, we can just evaluate this derivative for all of these unequi-spaced points or equi-spaced
points based on this Lagrange interpolation formula. So in the next lecture we will just
consider unequi-spaced points and some other examples based on this Lagrange interpolating
polynomial; thank you for listening to this lecture.
Numerical Methods
Professor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 27
Numerical differentiation part-III (Differentiation based on Divided difference formula)
Within that interval means: if you will have these tabulated points (x0, f(x0)), (x1, f(x1)), …,
(xn, f(xn)), then if we want to find the derivative at a particular point, it should lie between
x0 and x1 within this interval, or between x1 and x2 within this interval; so somewhere it is
just lying, and at that point we want to evaluate this differentiation.
(Refer Slide Time: 1:40)
And for that if you just consider first example like suppose you will have these data points
like 0, 2, 4, 5 suppose and its functional values are like 5, 9, 85, 180 suppose and
corresponding Lagrange interpolating polynomial if we want to write so it can be represented
in the form Pn(x) = k: 0 to n ∑ Lk(x) f(xk), where Lk(x) can be represented as ω(x)/{(x –
xk) ω`(xk)} here. So basically, for this problem, if you just write here ω(x): ω(x) can be
written as (x – x0)(x – x1)(x – x2)(x – x3) here; since we will have here 4 points, it can
construct a polynomial of degree 3 here.
So we can just write P3(x) that is your L0(x)f(x0) + L1(x)f(x1) + L2(x)f(x2) + L3(x)f(x3) here,
where this ω(x), if you just consider this particular problem here, can be written as (x –
0)(x – 2)(x – 4)(x – 5) here. And if you just write here L0(x): L0(x) can be written in the
form (x – x1)(x – x2)(x – x3) divided by (x0 – x1)(x0 – x2)(x0 – x3) here. And with these
tabular points, since specially we are just defining this one as x0, this one as x1, this as x2
and this as x3 here, if you just write all these points, that is (x – 2)(x – 4)(x – 5)/{(0 –
2)(0 – 4)(0 – 5)} here, since x0 is 0.
Similarly, we can just write L1(x) also: L1(x) can be written as (x – x0)(x – x2)(x – x3)/{(x1
– x0)(x1 – x2)(x1 – x3)}, so directly, if we will put all these values, that is (x – 0)(x –
4)(x – 5)/{(2 – 0)(2 – 4)(2 – 5)} here, since x1 is given as 2. Similarly, L2(x) can be written
as (x – x0)(x – x1)(x – x3)/{(x2 – x0)(x2 – x1)(x2 – x3)}, that is (x – 0)(x – 2)(x – 5)/{(4 –
0)(4 – 2)(4 – 5)} here. Similarly, if you will just write L3 also here, L3(x) can be written as
(x – 0)(x – 2)(x – 4)/{(5 – 0)(5 – 2)(5 – 4)} here.
(Refer Slide Time: 6:25)
And particularly if you have determined this L1(x), L2(x), L3(x) and L0(x) on these tabular
values so then we can just obtain the first order derivative of this Lagrange coefficient as
L0`(x) can be written as: the first product term, if you just consider, is (x – 4)(x – 5),
divided by all of these constant product terms, that is (– 2)(– 4)(– 5); + the second term, if
you just differentiate, (x – 2)(x – 5)/{(– 2)(– 4)(– 5)} here; similarly the last, if you just
differentiate, (x – 2)(x – 4)/{(– 2)(– 4)(– 5)} here. And similarly, if you just differentiate
L1`(x): L1`(x) can be written as, the first term if you just differentiate, (x – 4)(x –
5)/{2 . (– 2) . (– 3)} here + x(x – 5)/{2 . (– 2) . (– 3)} + the last term, if you just
differentiate, x(x – 4)/{2 . (– 2) . (– 3)} here.
Similarly, L2(x), if you just write it down: L2`(x) can be written as, the first term if you just
differentiate, (x – 2)(x – 5)/{4 . 2 . (– 1)} here; + the second term, if you just
differentiate, x(x – 5)/{4 . 2 . (– 1)}; + the last term, if you just differentiate, x(x –
2)/{4 . 2 . (– 1)} here. And similarly we can just obtain this derivative for L3`(x) also here:
L3`(x) can be written as (x – 2)(x – 4)/{5 . 3 . 1} + the second differentiation, if you just
take, x(x – 4)/{5 . 3 . 1} + the last differentiation, x(x – 2)/{5 . 3 . 1} here.
So if you just consider here x as 3, then we can obtain L0`(3) here; then if you just consider
L1`(3), we can just obtain the value of L1`(3); for L2`(3) we can just put x as 3 here and we
can obtain the value; and L3`(3) we can just obtain by putting x as 3 there. So if you will
just put all these values, then we can just obtain L0`(3) as 1/40, L1`(3) as – 7/12,
L2`(3) as 5/8 and L3`(3) as – 1/15 here. And once these values are known to us, we can just
write here y`(3), since it has asked you to find that here.
(Refer Slide Time: 10:32)
So we can just write this one as y`(3) = L0`(3) f(x0) + L1`(3) f(x1) + L2`(3) f(x2) + L3`(3)
f(x3) here. So if you will just evaluate or put all these values here: the first value, L0`(3),
we just found as 1/40 here; L1`(3) is found as – 7/12 here; L2`(3) is found to be 5/8 here; and
L3`(3) is found to be – 1/15 here. And directly, all of these functional values are given,
f(x0) as 5 here and so on, so if you put all of these values we can just write this one as
1/40 . 5 – 7/12 . 9 + 5/8 . 85 – 1/15 . 180 here.
And if you will just do this computation, specially you can just get the value as 36 here. So
then we will just go for the differentiation using Newton's divided difference formula.
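As a side check, the arithmetic of the Lagrange example above can be verified exactly with rational arithmetic (a Python sketch; the variable names are mine):

```python
from fractions import Fraction as F

# The four Lagrange-coefficient derivatives L_k'(3) found above,
# paired with the given functional values:
lk_prime_at_3 = [F(1, 40), F(-7, 12), F(5, 8), F(-1, 15)]
fs = [5, 9, 85, 180]

# y'(3) = sum of L_k'(3) * f(x_k)
y_prime_3 = sum(c * f for c, f in zip(lk_prime_at_3, fs))
print(y_prime_3)   # 36
```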
(Refer Slide Time: 14:26)
And in Newton’s divided difference Interpolation formula already we have known that the
advantage for this Newton’s divided difference formula is that in case of Lagrange
Interpolation formula if a small quantified quantity it should be associated with this
Interpolation formula then always we will just go for a large multiplication of these vectors,
but in divided differences that we can just overrule that factors. So specially if the term is
asked you to associate in that terms, directly we can just write a multiplication and we can
just evaluate that one.
So here in like Newton’s divided difference Interpolation polynomial we will have like the
set of tabular points that is xi and f(xi) defined at n + 1 points like x0, x1 up to xn points. So if
you just write this Newton’s divided difference interpolating polynomial that can be
represented as Pn(x) = f(x0) + (x – x0)f(x0, x1) + (x – x0)(x – x1) f(x0, x1, x2) up to (x – x0)(x –
x1), up to (x – xn – 1)f(x0 to xn) here. So if we may complete form we want to write or in a
product form if you want to express or in other sum form if we want to express then Pn(x) can
be written as i = 0 to n ∑f(x0, x1 to xi) J = 0 to i – 1П(x – xj) here.
So if we want to find the derivative for this polynomial here, this polynomial derivative can
be written as Pn`(x) = i = 0 to n ∑ f(x0, x1 to xi) d/dx [j = 0 to i – 1 П (x – xj)], since all
of the divided difference values f(x0, x1 to xi) are constants here. And
all of these tabular values it is known to us and associated functional values are also known to
us then we can just use these functional values to calculate this derivative with respect to x
here. First, for this, we will consider a linear interpolation here. In case of linear interpolation,
we have to consider only 2 points, where these tabular points may be (x0, f(x0)), (x1, f(x1)).
And if we will just consider this linear interpolator polynomial using Newton’s divided
difference Interpolation here then we can just write this polynomial as P1(x) = f(x0) + (x – x0)
f(x0, x1) here since we are approximating the function f(x) with polynomial of degree 1 here.
So always we can just remember that if we will have a polynomial of degree n always we
have to consider n + 1 terms there. So if we are just approximating this function f(x) with a
polynomial of degree 1 here then we have to consider 2 points that is in the form of like (x0,
f(x0)), (x1, f(x1)) here.
And if you just consider these 2 points then the polynomial approximation can be written as
P1(x) as f(x0) + (x – x0) f(x0, x1) here. And if you just try to evaluate the first order derivative
of this polynomial here, so P1`(x): since the first term, if you just see, is only a constant,
its derivative is 0 here. Hence we can just write P1`(x) as f(x0, x1) here.
And if we just consider like a quadratic interpolation suppose so then we will have like 3
points we have to consider for quadratic interpolation since it represents a polynomial of
degree 2 so 3 terms it is required.
(Refer Slide Time: 18:21)
So if you just consider this quadratic polynomial here, the quadratic polynomial associates the
points (x0, f(x0)), (x1, f(x1)), (x2, f(x2)) suppose; since 3 points are required, that is why we
have just considered 3 points here as x0, x1 and x2. And if you will just approximate this
one with a polynomial of a degree 2 here, this polynomial can be written as P2(x) as f(x0) + (x
– x0) f(x0, x1) + (x – x0)(x – x1) f(x0, x1, x2) here. If we want to find this first order derivative
for this polynomial P2(x) here, it can be written as P2`(x): the first term can just give you
0 value; the second term, if you just consider here, can be represented as f(x0, x1); and the
immediate next term, if you just consider, can be written as {2x – (x0 + x1)} f(x0, x1, x2) here.
So based on this, if you will just go for an example here, then we can write the tabular
values: compute, suppose, y`(3) and y``(3) from the tabular data 1, 2, 4, 8, 10, where the
functional data are given as 0, 1, 5, 21, 27, suppose.
If you just write this tabular data, that is in the form 1, 2, 4, 8, 10 suppose, and the
functional values are 0, 1, 5, 21, 27, it has asked you to compute y`(3) and y``(3)
from this table. So first we will just use the divided difference table formula to get all of
this difference data. So, the first difference, if you just find here, can be represented as
(1 – 0)/(2 – 1); the second one, if you just consider, (5 – 1)/(4 – 2) here; the third one, if
you just consider, (21 – 5)/(8 – 4); and for the last 2 points, if you just consider,
(27 – 21)/(10 – 8) here.
Obviously we can just obtain the data as 1, 2, 4, 3 here, and in the second divided difference
we can just write this data as (2 – 1)/(4 – 1), then (4 – 2)/(8 – 2), and then the last value
(3 – 4)/(10 – 4) here. So obviously you can just obtain these values as 1/3, 1/3 and the last
value as – 1/6 here. And in the next, third difference table, you can just obtain these values
as (1/3 – 1/3)/(8 – 1) and then (– 1/6 – 1/3)/(10 – 2) here. So the first value you can just
get as 0 here and the second value as – 1/16 here, and in the fourth divided difference you
can find (– 1/16 – 0)/(10 – 1) here, and you can just obtain this one as – 1/144. And if you
will just use this Newton's divided difference
Interpolation formula for this calculation of these values here so you just use these values that
is 0, 1, 1/3, 0, – 1/144 as the tabular values here.
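The whole divided-difference table above can be generated mechanically; here is an illustrative Python sketch (the names are mine) using exact rational arithmetic so the fractions match the hand computation:

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return the full divided-difference table; row k holds the
    k-th order divided differences f[x_i, ..., x_{i+k}]."""
    table = [[Fraction(y) for y in ys]]
    for k in range(1, len(xs)):
        prev = table[-1]
        # each entry: (difference of the two entries above) / (x_{i+k} - x_i)
        row = [(prev[i + 1] - prev[i]) / Fraction(xs[i + k] - xs[i])
               for i in range(len(prev) - 1)]
        table.append(row)
    return table

table = divided_differences([1, 2, 4, 8, 10], [0, 1, 5, 21, 27])
print([str(v) for v in table[1]])  # ['1', '2', '4', '3']
print([str(v) for v in table[2]])  # ['1/3', '1/3', '-1/6']
print([str(v) for v in table[3]])  # ['0', '-1/16']
print(str(table[4][0]))            # -1/144
```

The leading entries of the rows, 0, 1, 1/3, 0, – 1/144, are exactly the Newton coefficients used next.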
So maybe in the next lecture we can just continue with how we can just use this Newton's divided
difference formula for finding this derivative, since I have just given you a recount of this
Newton's divided difference formula, and directly we can just use these formulas for the
derivative of the data. I can just consider the same example in the next lecture and I can
explain how we can just put this data in the tabular form and we can obtain the first order
derivative and second order derivative in a complete form; thank you for listening to the lecture.
Numerical Methods
Professor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 28
Numerical differentiation part-IV (Maxima minima of a tabulated function and errors)
Welcome to the lecture series on numerical methods, in the current lecture series we are
discussing here numerical differentiation. In the last lecture we have started this numerical
differentiation using Lagrange interpolation method and divided differences. And in the end
of the last lecture I have just given one example based on this divided difference that how we
can use this differentiation. And there itself I have discussed that how we can just apply the
divided difference on the tabular form first and then we can just go for this derivative. And
after this we will go for the maxima and minima of a tabulated function, and then the error
estimation in numerical differentiation.
We will also see how this numerical differentiation can be applied for error estimation with
interpolation polynomials. In this present table, already I have discussed in the last lecture
how we can find this first order difference, second order difference, third order difference
and fourth order difference. If a tabular value is given to us like x = 1, 2, 4, 8 and 10 and then
the corresponding y values are given as 0, 1, 5, 21 and 27 suppose, how we can compute the
derivative of this function y at point 3 suppose and double derivative also at that same point.
So first we have to go for this divided difference table, already in the last lecture I have
derived this divided difference table there itself and after that once we are using this divided
difference table, we will have these values: the first divided difference value, that we have
obtained here as 1; then, from the top of each successive column, the second divided difference
as 1/3, then 0, then – 1/144.
So if you will use this divided difference formula, usually this Newton's divided difference
formula is written in the form Pn(x) = i: 0 to n ∑ f(x0, x1, …, xi) j = 0 to i – 1 П (x – xj).
So if we are writing this polynomial in this form, then, since the data given to us are like
x0, x1, x2, x3 and x4 here, if you collectively write this data here: x0 = 1 here, x1 = 2 here,
x2 = 4 here, x3 = 8 here and x4 = 10 here.
Then we will have like 5 points here, this can generate a polynomial of degree 4 here so we
can write this function f(x), it can be approximated with polynomial of degree 4 here P4(x)
and that can be written in the form like f(x0) + (x – x0)f(x0, x1) + (x – x0)(x – x1)f(x0, x1, x2) +
(x – x0)(x – x1)(x – x2)f(x0, x1, x2, x3) + (x – x0)(x – x1)(x – x2)(x – x3)f(x0, x1, x2, x3, x4) here.
So if we put all these values here: our initial value, given to us in the tabular form, is
0 here.
(Refer Slide Time: 5:14)
If you see this tabular values here, so first corresponding value of x is given here if you see f0
it is given as 0, f1 as 1 here, f2 as 5 here, f3 as 21 here, f4 as 27 here. So if you put these
values: f(x0) is 0 here, so we can write this one as 0 + (x – x0), where x0 is 1 here, then
f(x0, x1), which from the tabular form already discussed in the last lecture is specially
written as 1, + (x – x0)(x – x1) f(x0, x1, x2).
Collectively, if you see this table, we have already obtained that value as 1/3; + (x –
x0)(x – x1)(x – x2), where x2 is 4 here, into your f(x0, x1, x2, x3), and that value it is giving
you 0; + (x – 1)(x – 2)(x – 4)(x – 8) f(x0, x1, x2, x3, x4) in argument form, and that value
especially is given in the tabular form as – 1/144. So if we can multiply all these terms here,
then we can obtain this polynomial in the form of x only here. So obviously, if this polynomial
is expressed in the form of x, directly we can apply the derivative to get this differential
form of this polynomial P4(x) here.
So if we will take all of these products here, then this product form can be written as: the
first term is (x – 1); the second term, if you will see here, can be written as 1/3(x2 –
3x + 2); the next immediate term is 0 there; and then the last term can be written as –
1/144(x4 – 15x3 + 70x2 – 120x + 64) here. And if you will take the derivative here, that is
f`(x), then f`(x) can be written as: the first derivative is 1; + the derivative of 1/3(x2 –
3x + 2), which is 1/3(2x – 3); – 1/144 times the derivative of the quartic, which is 4x3 –
45x2 + 140x – 120.
So the final form we are obtaining here is f `(x) = 1 + (1/3)(2x – 3) – (1/144)(4x3 – 45x2 +
140x – 120). So if you want to find this derivative at the point 3, we can directly replace x by
3 here and obtain the derivative at the point 3. And if we want to evaluate the second order
derivative, then we can directly differentiate once more: the first term gives 0, the term
(1/3)(2x – 3) gives 2/3, and the term – (1/144)(4x3 – 45x2 + 140x – 120) gives
– (1/144)(12x2 – 90x + 140).
And similarly, if you put x = 3 we can obtain the second order derivative of this fourth order
polynomial at the point 3 there. So these values can be calculated as f `(3) = 1 + (1/3)(2(3) – 3)
– (1/144)(4(3)3 – 45(3)2 + 140(3) – 120). For the second derivative, if you differentiate
4x3 – 45x2 + 140x – 120 term by term, it changes to 12x2 – 90x + 140. So if you directly put
x = 3 into f ``(x), then we can obtain the value f ``(3).
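The computation above can be reproduced programmatically. This is a minimal sketch (my own illustration, not from the lecture) that builds the divided-difference coefficients for the data x = 1, 2, 4, 8, 10 and y = 0, 1, 5, 21, 27, expands the Newton form into ordinary polynomial coefficients, and differentiates:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0, 10.0])
y = np.array([0.0, 1.0, 5.0, 21.0, 27.0])
n = len(x)

# build divided-difference coefficients f[x0], f[x0,x1], ... in place
coef = y.copy()
for j in range(1, n):
    coef[j:] = (coef[j:] - coef[j-1:-1]) / (x[j:] - x[:n-j])
# coef should be [0, 1, 1/3, 0, -1/144], matching the lecture

# expand the nested Newton form into standard coefficients (highest power first)
p = np.array([coef[-1]])
for k in range(n - 2, -1, -1):
    p = np.polymul(p, [1.0, -x[k]])   # multiply by (t - x_k)
    p[-1] += coef[k]

dp = np.polyder(p)                     # coefficients of P4'(x)
print(np.polyval(dp, 3.0))             # f'(3) = 2 - 3/144 ≈ 1.9792
print(np.polyval(np.polyder(dp), 3.0)) # f''(3)
```

Once the polynomial is in standard form, any higher derivative at any point is just repeated `np.polyder` plus `np.polyval`, exactly as the lecture does by hand.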
So then we will go for the computation of maxima and minima using this polynomial
differentiation. The beauty of this method is that even if the function is not known to us, we
can still find the maxima or minima of the function if the prescribed tabular values are given
to us. So we can use the same polynomial in x here and determine the maxima and minima
of the tabulated function by differentiating the interpolating polynomial.
Maxima and minima of y = f(x) can be obtained by equating dy/dx = 0, and first, if you use
Newton’s forward difference formula to compute the maxima and minima from the tabulated
values, the formula can be written as follows; we have already discussed Newton’s forward
difference formula in previous lectures, and I will write it once more here.
Newton’s forward difference formula can be written in the form of y(p), or y(xp) or y(x); this
can be written as y0 + pΔy0 + p(p – 1)/2! Δ2y0 + … + p(p – 1)…(p – n + 1)/n! Δny0
here. And if you differentiate this with respect to p, then we obtain dy/dp as
Δy0 + (2p – 1)/2! Δ2y0 + all the other terms. For a maximum or minimum we put dy/dp = 0.
This means we can truncate the left-hand side series expansion after a certain number of
terms and then obtain the maxima or minima of the function.
Suppose the left-hand side is truncated after the third differences for our convenience; then
we will obtain a quadratic equation in p, which gives 2 values of p since it is a quadratic
equation. Corresponding to these values of p we can have a maximum or minimum at that
point, and once we have these values of p we can obtain the value of x at that point, since x
can be written in the form x = x0 + ph, or for convenience, if x0 = A is a particular value,
we can sometimes write this as x = A + ph.
And once this value of p is known to us, we can determine the value of x; and once x is
known, we have to see from the tabular values where this x is placed inside the table, and
whether the forward difference formula or the backward difference formula is applicable at
that point. That we have to check, since it depends on the value of p at that point. Once we
obtain the value of p there, we can obtain the corresponding value of y at that point. And to
classify the maxima or minima, the usual criterion is that for a maximum we have to show
that d2y/dx2 is negative, and for a minimum that d2y/dx2 is positive.
Suppose we have an example here where the question asks us to find the x for which y is
maximum, taking differences up to second order from the following table, and to find the
maximum value of y. The tabular values are given at x = 1.2, 1.3, 1.4, 1.5 and 1.6 with step
size h = 0.1, and the corresponding values of y are given as 0.9320, 0.9636, 0.9855, 0.9975,
and the last value is given as 0.9996.
And the forward difference table can be computed as follows: the difference of the first 2
values gives 0.0316; the difference of the next 2 gives 0.0219; the next difference gives
0.0120; and the last difference gives 0.0021. Then the second differences of these numbers
can be written as – 0.0097 for the first one, – 0.0099 for the second one, and – 0.0099 for the
third one as well. This is Δ2y; then we will go for Δ3y here.
And if we take the differences once more, the first third difference gives – 0.0002 and the
second gives 0, and the fourth order difference can be written as 0.0002. The points are
equidistant, so the step size h can be written as 0.1 here, and the starting value y0 can be
written as 0.9320 here.
And based on this, if we want to write the interpolating polynomial, it can be written in the
form y(p) = y0, where y0 means 0.9320 here; plus the second term pΔy0 with Δy0 = 0.0316;
plus the third term p(p – 1)/2 times the second difference, – 0.0097; and then the next term
p(p – 1)(p – 2)/3! times the next value, – 0.0002; plus all the other terms. Since the question
asks us to take these terms only up to the second differences, we consider the terms up to that
one only, and then we can write y(p) up to second differences as
0.9320 + 0.0316p – (p2 – p)/2 (0.0097).
And if we want the first derivative here, y`(p) can be written as
0.0316 – (2p – 1)/2 (0.0097), and setting this equal to 0 for the maximum or minimum of
this function, we obtain 0.0316 = (2p – 1)/2 (0.0097).
And finally, solving this, we will have the value p = 3.7577. Then, to decide whether we have
a maximum or a minimum at that point, we have to check d2y/dp2 here; if you evaluate
d2y/dp2, it directly gives – 0.0097, which is less than 0 already. Since the second order
derivative is negative, we will have the maximum value at p = 3.7577. To obtain the value of
x for that p we have to consider again x = x0 + ph, or x = A + ph there.
So if you compute the x value for this corresponding p value, x can be written as x0 + ph, or
this can be written as A + ph; the initial value here is x0 = 1.2, plus p = 3.7577 into h = 0.1,
so finally we obtain the value x = 1.5758. If you see where this value of x is placed in the
tabular values, it lies near the end of the table. This means that we have to use Newton’s
backward difference formula to obtain the value at that point.
So to find the maximum value at that point we have to use the backward difference formula
with x = xn + ph. This simply gives p = (x – xn)/h, which can be written as
(1.5758 – 1.6)/0.1 = – 0.242, since 1.6 is your xn value; and since p lies between – 1 and 0,
we can use Newton’s backward difference formula. And if we evaluate the polynomial at that
point, we take yn at 1.6, which is 0.9996, then + p∇yn with p = – 0.242.
From the backward differences of this table, ∇yn gives 0.0021, and then
+ p(p + 1)/2! ∇2yn, that is, (– 0.242)(– 0.242 + 1)/2 (– 0.0099). So the final value is obtained
as 1.0000 up to 4 decimal places.
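The whole worked example can be checked in a few lines. This is a sketch (not from the lecture) that keeps terms up to the second differences exactly as above:

```python
import numpy as np

x = np.array([1.2, 1.3, 1.4, 1.5, 1.6])
y = np.array([0.9320, 0.9636, 0.9855, 0.9975, 0.9996])
h = 0.1

d1, d2 = np.diff(y), np.diff(y, 2)   # first and second forward differences

# dy/dp = Δy0 + (2p - 1)/2 · Δ²y0 = 0  →  p = 1/2 - Δy0/Δ²y0
p = 0.5 - d1[0] / d2[0]
x_star = x[0] + p * h
print(p, x_star)                      # ≈ 3.7577, 1.5758

# x_star falls near the end of the table, so evaluate y there with the
# backward-difference formula about x_n = 1.6
pb = (x_star - x[-1]) / h             # ≈ -0.242
y_max = y[-1] + pb * d1[-1] + pb * (pb + 1) / 2 * d2[-1]
print(round(y_max, 4))                # ≈ 1.0
```

Since d2[0] is negative, the second derivative with respect to p is negative and the stationary point is indeed a maximum, matching the hand computation.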
So then we will go for the error approximation in the interpolating polynomial case. For that
we have to consider Newton’s forward difference formula, backward difference formula, or
the divided difference formula to evaluate the error at different points.
We have already discussed that this error is written in the form rn+1(x), and this error term is
defined as (x – x0)(x – x1)…(x – xn)f n+1(ζ)/(n + 1)!. This is the error term for the general
approximation by the interpolating polynomial, where the value of ζ lies between x0 and xn
here. And if we go for the differentiation of this error term, we can differentiate the part
depending on x separately and also take the differentiation with respect to ζ, since ζ is a
function of x; we can treat the whole thing as a product of these two factors.
This means that if you denote the first factor as ω(x) here and the second as
f n+1(ζ)/(n + 1)!, then the differentiation of this error term with respect to x can be written as
ω`(x)f n+1(ζ)/(n + 1)! + ω(x)f n+2(ζ)/(n + 1)! ζ`(x). Based on this we can evaluate the error at
the nodal points: if you consider exactly x = xj, then obviously ω(xj) will give a 0 value, and
we can write r`n+1(xj) = ω`(xj)f n+1(ζ)/(n + 1)!, where ζ lies between the minimum and the
maximum of x0, x1, …, xn.
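The differentiation just described can be restated compactly (same symbols as above):

```latex
R_{n+1}(x) = \omega(x)\,\frac{f^{(n+1)}(\zeta)}{(n+1)!},\qquad
\omega(x) = (x - x_0)(x - x_1)\cdots(x - x_n)

R'_{n+1}(x) = \omega'(x)\,\frac{f^{(n+1)}(\zeta)}{(n+1)!}
            + \omega(x)\,\frac{f^{(n+1)}{}'(\zeta)}{(n+1)!}\,\zeta'(x)

\text{At a node } x = x_j:\quad \omega(x_j) = 0
\;\Longrightarrow\; R'_{n+1}(x_j) = \omega'(x_j)\,\frac{f^{(n+1)}(\zeta)}{(n+1)!}
```

The middle line is just the product rule applied to ω(x) and the ζ-dependent factor; the last line is why the error derivative simplifies at the tabulated points.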
Why we consider the maximum and minimum values here is that ζ is an unknown point
inside the interval, so to bound this error term we have to consider where f n+1 attains its
extreme values over the interval; that is why we consider either the minimum or the
maximum to obtain a bound for the error term there. If you consider this in divided difference
form, then from Newton’s divided difference error derivation in our earlier lectures, you can
find that the remainder term rn(x) can be written as ω(x)f(x, x0, x1, …, xn) there.
So obviously r`n+1(x) can be written as ω`(x) times the first factor plus ω(x) times the
derivative of the second factor, that is, ω`(x)f(x, x0, …, xn) + ω(x) times the derivative of
f(x, x0, …, xn); since we are taking the derivative of the divided difference with arguments
x, x0, x1, …, xn, one more x gets involved inside these arguments. Why this happens you can
see from the repeated-argument divided difference: f(x0, x0) is usually written in the form
limit h –> 0 [f(x0 + h) – f(x0)]/h, and obviously this can be written in the form f `(x0)/1!.
Similarly, we can write f(x0, x0, x0) when 3 equal arguments are placed.
For that case we consider the limit h –> 0 of the corresponding divided difference and apply
L’Hôpital’s rule twice to obtain the second derivative of f at the point x0. This means, if you
see here, an h2 factor occurs in the denominator, so a 0/0 form arises, and that is why we have
to apply L’Hôpital’s rule to obtain this limit at that position, and it gives f ``(x0)/2!. So if the
same argument is repeated, with r + 1 copies of x0, we can write that divided difference as
f r(x0)/r!.
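This limiting behaviour is easy to check numerically. A small sketch (my own illustration, with f = sin and x0 = 1 as arbitrary choices): the divided difference f(x0, x0 + h) should tend to f `(x0) as h shrinks.

```python
import numpy as np

f, x0 = np.sin, 1.0
for h in (1e-1, 1e-2, 1e-3):
    dd = (f(x0 + h) - f(x0)) / h        # divided difference f[x0, x0 + h]
    print(h, dd, abs(dd - np.cos(x0)))  # error shrinks roughly like h/2 · |f''|
```

In the limit h → 0 the printed value approaches cos(1) = f `(1), which is exactly the statement f(x0, x0) = f `(x0)/1!.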
Similarly, if 2 arguments are equal and all the other arguments differ, then we can write
f(x, xr, x0, …, xn) as the limit hr –> 0 of the corresponding divided difference, which
represents the same kind of expression as we have seen above; and written in that limiting
form, it transforms directly to the derivative form d/dx f(x, x0, …, xn) evaluated at x = xr.
(Refer Slide Time: 31:05)
This can be written in derivative form with r repeated arguments as
(1/r!) dr/dxr f(x, x0, …, xn). And if you use this expression in the complete derivative of the
remainder term, then we get R`n+1(x) = ω`(x)f n+1(ζ)/(n + 1)! + ω(x)f n+2(ζ)/(n + 2)!, where
the second term comes from differentiating f n+1(ζ), a divided difference with n + 1
arguments. In the next lecture we will continue with how we can obtain this differentiation
for the difference approximations; thank you for listening to the lecture.
Numerical Methods
Professor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 29
Numerical differentiation part-V (Differentiation based on finite difference operators)
Welcome to the lecture series on numerical methods. In this series we are discussing
numerical differentiation based on interpolation; in the previous lecture I discussed
differentiation based on Newton’s divided difference interpolation and the differentiation of
the error term. In this lecture we will continue with differentiation using the central
difference approximations and some finite difference approximations based on finite
difference operators.
So if you go for the error term analysis based on Newton’s forward difference formula, the
error term for Newton’s forward difference approximation is usually written in the form
Rn+1(x) = ω(x)f n+1(ζ)/(n + 1)!, where ω(x) is usually written in the form
(x – x0)(x – x1)…(x – xn) here.
(Refer Slide Time: 1:36)
And in terms of p, it can be written in the form hn+1 p(p – 1)…(p – n)f n+1(ζ)/(n + 1)!, since
usually we express p as (x – x0)/h, so each factor (x – xk) can be replaced by h(p – k), as I
have discussed in the previous lectures. Here ζ lies between the minimum of x0, …, xn and
the maximum of x0, …, xn. So if we differentiate this error term, then R`n+1(x) can be
expressed as hn+1 d/dx[p(p – 1)…(p – n)] f n+1(ζ)/(n + 1)! +
hn+1 p(p – 1)…(p – n) d/dx[f n+1(ζ)/(n + 1)!]; since ζ is a function of x, we have to
differentiate that factor with respect to x also. Now, d/dx can be written in the form
d/dp · dp/dx, and since we are expressing x as x0 + ph, dp/dx gives 1/h here. Obviously the
first differentiation can then be replaced by d/dp of the whole product p(p – 1)…(p – n) times
dp/dx; that is why one power of h cancels, and the final representation of the first term can be
written in the form hn d/dp[p(p – 1)…(p – n)] f n+1(ζ)/(n + 1)! + the remaining terms.
In the earlier lecture we derived the differentiation of f n+1(ζ) expressed completely in
divided difference form; but here, if we want to express this differentiation in terms of
Newton’s forward difference operators, then f n+1(ζ) can be replaced by Δn+1y0/hn+1.
Obviously we can express this in Δ form, since in all of these lectures I have explained that
the tabular values are known to us but the function is not known explicitly; so if the tabular
values have been given, we can approximate this error term directly by considering all of
these tabular values, and even the differentiation can be approximated by using the forward
difference table here. And if we write the term f n+1(ζ)/(n + 1)! in terms of Newton’s forward
difference operator, it can be written in the form Δn+1y0/((n + 1)! hn+1) here.
Once it is expressed in this forward difference form, we can also handle the f n+2(ζ) term,
since taking the differentiation of f n+1(ζ)/(n + 1)! gives f n+2(ζ)/(n + 1)! ζ`(x), and that can
be expressed in the form Δn+2y0/((n + 2)! hn+2) there. And if we write all these terms, the
derivative can be expressed as d/dp[p(p – 1)(p – 2)…(p – n)] Δn+1y0/((n + 1)! h) +
p(p – 1)…(p – n) Δn+2y0/((n + 2)! h), since hn+1 cancels against hn+1 or hn+2 and a 1/h is left
over there. At x = x0 we obviously obtain p = 0, and at that point we can write the error term
differentiation R`n+1(x0) as (– 1)n n! Δn+1y0/((n + 1)! h).
If you evaluate d/dp[p(p – 1)…(p – n)] at p = 0, it is nothing but (– 1)n n!, and that is why
R`n+1(x0) can also sometimes be written as (– 1)n Δn+1y0/((n + 1)h), since (n + 1)! is the
product of n! and n + 1, so only the n + 1 remains. Based on this approximation we will go
for the computation of errors in Newton’s forward interpolation formula.
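The key step above, evaluating the derivative of the product at p = 0, can be written compactly:

```latex
p(p-1)\cdots(p-n) = p\,q(p),\quad q(p) = (p-1)(p-2)\cdots(p-n)

\frac{d}{dp}\Bigl[p\,q(p)\Bigr]_{p=0} = q(0) = (-1)(-2)\cdots(-n) = (-1)^n\, n!

\;\Longrightarrow\; R'_{n+1}(x_0) = \frac{(-1)^n\,\Delta^{n+1}y_0}{(n+1)\,h}
```

Only the term where the factor p itself is differentiated survives at p = 0, because every other term in the product rule still carries the factor p.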
For that, consider a table with x prescribed at 2.00, 2.20, 2.40, 2.60, 2.80 and 3.00, and the
corresponding y values given as 0.6932, 0.7885, 0.8755, 0.9555, 1.0296 and 1.0986. Then the
first differences can be computed easily, since we have already derived a lot of forward
difference tables: we take the difference of each pair of consecutive tabular values. So
0.7885 – 0.6932 gives the value 0.0953, the difference 0.8755 – 0.7885 gives 0.0870, the
difference 0.9555 – 0.8755 gives 0.0800, then 1.0296 – 0.9555 gives 0.0741, and the
difference 1.0986 – 1.0296 gives 0.0690 here. Similarly, we can obtain the second
differences, third differences and fourth differences.
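The difference table and the error estimate that follows can be checked with a few lines. A sketch (not from the lecture; the tabulated y values here appear to be ln x rounded to 4 decimals), using the formula R`n+1(x0) = (– 1)n Δn+1y0/((n + 1)h) derived above with n = 3:

```python
import numpy as np

y = np.array([0.6932, 0.7885, 0.8755, 0.9555, 1.0296, 1.0986])
h = 0.2

for order in range(1, 5):
    print(order, np.around(np.diff(y, order), 4))  # forward difference columns

d4 = np.diff(y, 4)[0]            # Δ⁴y0 ≈ -0.0002
err = (-1) ** 3 * d4 / (4 * h)   # R'4(x0) with n = 3
print(err)                        # ≈ 0.00025
```

`np.diff(y, k)` applies the forward difference operator k times, so each printed row is one column of the hand-built table.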
Based on this, if we are asked to compute the interpolation taking differences up to the third
order and then obtain the error term, then obviously the next term, the fourth one, serves as
the error, since the differences up to third order are absorbed into the polynomial
approximation and the immediate next term can be considered as the error term for that
position. Therefore the error term can be written in the form R4(x), and usually R4(x) can be
written as ω(x)f 4(ζ)/4!.
Written in complete form, R4(x) = ω(x)f 4(ζ)/4!, and in the differentiated form R`4(x0) can be
written as (– 1)n Δ4y0/((n + 1)h) with n = 3 here, so R`4(x0) = (– 1)3 Δ4y0/(4h). And if the
error is asked exactly at x0 = 2.00, since the beginning of the tabular values is 2.00, then
R`4(2.00) can be written as (– 1)3, which gives a ‘–’ sign, and your step size h is 0.20 here, so
it is – (1/0.20)(1/4) times the Δ4 value, which from the table is – 0.0002. So in absolute value
we can write this as |(1/h)(1/4)Δ4y0|, and this is nothing but (1/0.20)(0.0002/4), so it can be
written in this form here.
And obviously this value is giving you 0.00025 here, so if you will define this differentiation
using finite difference operators: a finite difference operator means we can consider the shift
operator here. The shift operator is usually expressed as E, and Ef(x) is usually written as
f(x + h). Now f(x + h) can be expanded in a Taylor series as f(x) + hf `(x) + h2/2! f ``(x) +
the rest of the terms, and usually this can be represented as [1 + hD + (hD)2/2! + …]f(x)
here; obviously it can be written in the form ehD f(x) here.
And finally we express the shift operator E as ehD here. We have seen in earlier lectures that
D = d/dx, which is nothing but the derivative, and the shift operator can be expressed in terms
of the forward difference operator, backward difference operator or centre difference
operator, in any of these forms. Based on that, these operators can also be expressed in
differential operator form; that means d/dx can be expressed in terms of the Δ, ∇, δ operators
and the average operator μ, et cetera.
So if you express the differential operator in terms of the forward difference operator, then
for the first derivative at a point xk we can write hDf(xk) in forward difference form as
Δf(xk) – (1/2)Δ2f(xk) + (1/3)Δ3f(xk) – …, up to a finite number of terms. Similarly, the
differential operator can also be expressed in backward difference form, as
hDfk = ∇fk + (1/2)∇2fk + (1/3)∇3fk + …, and likewise we can express the others.
(Refer Slide Time: 16:13)
And in the general form, hrDr can be expressed as Δr – (r/2)Δr+1 + r(3r + 5)/24 Δr+2 – all
other terms. Similarly, with the backward difference operator it can be expressed as
∇r + (r/2)∇r+1 + r(3r + 5)/24 ∇r+2 + all other terms. And if you use the central difference
operator to express the rth order derivative of a function, it can be expressed as
δr – (r/24)δr+2 + r(5r + 22)/5760 δr+4 – …, up to a finite number of terms.
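These operator series can be verified numerically. A small sketch (my own check, not from the lecture; f = sin, xk = 1 and h = 0.1 are arbitrary choices) applying the forward-difference series for hD, which comes from hD = ln(1 + Δ):

```python
import numpy as np

f, xk, h = np.sin, 1.0, 0.1
vals = f(xk + h * np.arange(8))            # f(xk), f(xk+h), ... for the Δ^j terms

# hD = ln(1 + Δ) = Δ - Δ²/2 + Δ³/3 - ...  (truncated after 5 terms)
hD = sum((-1) ** (j + 1) * np.diff(vals, j)[0] / j for j in range(1, 6))
print(hD / h, np.cos(xk))                  # both ≈ 0.5403
```

Each extra term of the series improves the approximation by roughly a factor of h, which is why the truncated sum already matches cos(1) to several digits.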
Next we will see how we can use differentiation with the central difference approximations,
namely Stirling’s formula and Bessel’s formula. For Stirling’s formula we have already
derived the expression: the central difference approximation for Stirling’s formula is usually
written as yp, or y(x) or y(xp), equal to y0 + pμδy0 + p2/2! δ2y0 + (p+1P3)μδ3y0
+ all other terms.
So if you differentiate with respect to x here, dy/dx can be written as 1/h times the derivative
with respect to p, since the expression is differentiated as dy/dp · dp/dx and dp/dx is nothing
but 1/h. If you differentiate y0 + pμδy0 + p2/2! δ2y0 + … with respect to p, this gives
(1/h)[μδy0 + pδ2y0 + (3p2 – 1)/6 μδ3y0 + all other terms]: the first term gives 0, the second
gives μδy0, the third gives pδ2y0, and the term with the third difference gives
(3p2 – 1)/6 μδ3y0. And if you take one more differentiation of this scheme, d2/dx2, it can be
written as (1/h2)[δ2y0 + pμδ3y0 + the rest of the terms], since the first term of dy/dp is a
constant, so the differentiation starts from the immediate next term, δ2y0, and the 3p2 term
gives 6p/6, that is, p multiplying μδ3y0.
Similarly, we can use this formulation for Bessel’s formula; Bessel’s formula is usually
expressed as yp = y(xp) starting from μy1/2, so while the above was for Stirling’s formula,
now we write yp in Bessel’s form.
For Bessel’s formula, yp is usually written in the form μy1/2 + (p – 1/2)δy1/2 + (pP2)μδ2y1/2 +
(pP2)(p – 1/2)/3 δ3y1/2 + all other terms. If you take the differentiation in the same way,
dy/dx = (1/h)[δy1/2 + (2p – 1)/2 μδ2y1/2 + (6p2 – 6p + 1)/12 δ3y1/2 + all other terms].
Similarly, if you take the second differentiation, it can be written in the form
(1/h2)[μδ2y1/2 + (2p – 1)/2 δ3y1/2 + all other terms], since the first term of dy/dp is constant
here; the 6p2 – 6p + 1 term gives (12p – 6)/12, that is, (2p – 1)/2, multiplying δ3y1/2.
Based on this, let us go for an example; the central difference approximation is usually
applied near the centre of the table, where for both of these formulas we have defined the
appropriate parameter ranges. So consider the table: x is defined as 1.00, 1.50, 2.00, 2.50,
3.00, 3.50, all equally spaced points with step size 0.5, with the corresponding y values
1.0000, 1.1447, 1.2599, 1.3572, 1.4422, 1.5183. Then taking the first differences of all these
values we obtain 0.1447, 0.1152, 0.0973, 0.0850, 0.0761.
For the second differences, take the differences of the immediate next terms: 0.1152 – 0.1447
gives – 0.0295; 0.0973 – 0.1152 gives – 0.0179; 0.0850 – 0.0973 gives – 0.0123; and
0.0761 – 0.0850 gives – 0.0089. Similarly, we can define the third, 4th and 5th differences;
since we have 6 points, we can consider the differences up to 5th order here.
Suppose we are asked to use Stirling’s formula and Bessel’s formula taking terms up to the
third order, with the immediate next term considered as the error. First we compute the
derivative at the value x = 2.40: we put y`(x) = (1/h)[μδy0 + pδ2y0 + (3p2 – 1)/6 μδ3y0],
with h = 0.5, so the factor is 1/0.5; and the average μδy0 can be written as the mean of the
two middle first differences, so it is (0.0973 + 0.0850)/2 there.
(Refer Slide Time: 25:47)
Directly, we can express μδ in operator form: μδ = (E1/2 + E–1/2)/2 · (E1/2 – E–1/2), and taking
the product we obtain (E – E–1)/2, so obviously μδf(x) = [f(x + h) – f(x – h)]/2; that is why
this differentiation uses (0.0973 + 0.0850)/2 as you see. Using all these we obtain the value
y`(2.40) as 0.1866 here. The second order formula, if you use it here, is
y``(x) = (1/h2)[δ2y0 + pμδ3y0 + all other terms]. Similarly, h is given as 0.5 here, so the
factor (1/0.5)2 is considered, and δ2y0 is nothing but the second order difference from the
forward difference table, so it is taken as – 0.0123; plus the p value, given as – 0.2, times the
average of the δ3 values; the rest of the values can be placed in the same form, and finally we
obtain y``(2.40) as – 0.0518 here. And if we want the derivative at the point 2.80, then we can
use Bessel’s formula for that, since the p range lies between 0.25 and 0.75, and within that
range we especially use the Bessel’s formula.
So for that, the derivative of y(x) is written as (1/h)[δy1/2 + (2p – 1)/2 μδ2y1/2 +
(6p2 – 6p + 1)/12 δ3y1/2]. Putting in the values, (1/0.5)[0.0850 + (2(0.6) – 1)/2 × …], with all
the other terms placed accordingly, we obtain the first order derivative at 2.80 as 0.1676 here.
For the second order formula we use (1/h2)[μδ2y1/2 + (2p – 1)/2 δ3y1/2], and we can evaluate
it at the point 2.80, since we have already obtained h = 0.5 here.
That is why this can be written as (1/h2), meaning (1/0.5)2, times μδ2y1/2, where μ takes the
average value, so μδ2y1/2 can be written as (– 0.0089 – 0.0123)/2; plus (2p – 1)/2 with p
given as 0.6 here, that is (2(0.6) – 1)/2, times δ3y1/2, which we can read directly from the
difference table; and the final value is obtained as – 0.0410. Thank you for listening to this
lecture.
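Both worked examples can be reproduced in a few lines. This is a sketch under the same truncation (terms up to the third differences), so small last-digit disagreements with the hand computation are expected:

```python
import numpy as np

y = np.array([1.0000, 1.1447, 1.2599, 1.3572, 1.4422, 1.5183])
h = 0.5
d1, d2, d3 = np.diff(y), np.diff(y, 2), np.diff(y, 3)

# Stirling about x0 = 2.5 (index 3): p = (2.4 - 2.5)/h = -0.2
p = -0.2
dy_240 = ((d1[2] + d1[3]) / 2                         # μδy0
          + p * d2[2]                                  # p δ²y0
          + (3*p**2 - 1) / 6 * (d3[1] + d3[2]) / 2     # (3p²-1)/6 μδ³y0
          ) / h
d2y_240 = (d2[2] + p * (d3[1] + d3[2]) / 2) / h**2

# Bessel about the interval [2.5, 3.0]: p = (2.8 - 2.5)/h = 0.6
p = 0.6
mu_d2 = (d2[2] + d2[3]) / 2                            # μδ²y_{1/2}
dy_280 = (d1[3] + (2*p - 1)/2 * mu_d2
          + (6*p**2 - 6*p + 1)/12 * d3[2]) / h
d2y_280 = (mu_d2 + (2*p - 1)/2 * d3[2]) / h**2

print(dy_240, d2y_240)   # ≈ 0.186, -0.053
print(dy_280, d2y_280)   # ≈ 0.1676, -0.0410
```

The Bessel results match the lecture's values; the Stirling numbers agree with the true derivatives of the tabulated function to about three decimals, which is all this truncation can promise.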
Numerical Methods
Professor Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 30
Numerical differentiation part-VI (Undetermined coefficients unequal intervals)
Welcome to the lecture series on numerical methods; currently we are discussing numerical
differentiation. Basically, we have been dealing with numerical differentiation based on
different interpolation formulas, on finite difference operators, and on unequally spaced
intervals. Today we will discuss numerical differentiation based on undetermined
coefficients, and some approximations on unequal intervals based on general Taylor series
expansion.
First we go for differentiation using the method of undetermined coefficients. Sometimes we
may be interested in devising a differentiation formula using some specific ordinates; this
means that if we are asked to build a differentiation approximation out of certain special
ordinates, we can determine its coefficients as follows.
(Refer Slide Time: 1:33)
Suppose a formula is written in the form f `(x0) = a f(x0) + b f(x0+h) + c f(x0+3h). So how
can we determine the coefficients a, b, c in this derivative formula? Specifically, to determine
a, b, c we have to expand f(x0+h) and f(x0+3h) by Taylor series about the point x = x0. The
basic idea is that if the nodal points are defined and the functional values are given at those
nodal points, we can evaluate the derivative at a point from them. Here the first order
derivative is prescribed at a particular point in the form a f(x0) + b f(x0+h) + c f(x0+3h), and
we are asked to find the coefficients of this derivative formula. So if you expand f(x0+h) in
Taylor series about x0, we can write it as f(x0) + h f `(x0) + h2/2! f ``(x0) + …, and similarly
f(x0+3h) can be written as f(x0) + 3h f `(x0) + 9h2/2! f ``(x0) + all other terms.
(Refer Slide Time: 3:34)
Now we eliminate the terms containing f ``(x0) from these 2 equations by multiplying the
first equation by 9 and subtracting the second from it. If we do that, the f ``(x0) term drops
out, and from there itself we can evaluate f `(x0) in terms of f(x0), f(x0+h) and f(x0+3h), with
the remaining terms of higher powers of h containing f ```(x0) and all the other higher order
derivatives.
So if you multiply equation 1 by 9 and subtract the second equation, then we obtain
f `(x0) = [9 f(x0+h) – 8 f(x0) – f(x0+3h)]/(6h) + (h2/2) f ```(x0) + all other terms. If we collect
the higher-powered terms into an order term, we can write this expression as
f `(x0) = [9 f(x0+h) – 8 f(x0) – f(x0+3h)]/(6h) + O(h2), since the higher powers of h can be
neglected if the step size h is very small here.
If we carry the error term explicitly, it can be written as (h²/2) f'''(ζ), where ζ lies
between x0 and x0 + 3h. In compact notation the formula reads (9 f1 − 8 f0 − f3)/(6h), which
approximates f'(x0). The same formula can also be obtained by expanding the function by Taylor
series and equating coefficients, which we do next.
But to use the form (9 f1 − 8 f0 − f3)/(6h) we must keep the step size h small, so that the
higher powers of h can be neglected. In the other method of determining the coefficients, we
expand by Taylor series: the first term a f(x0) is kept as a f0, while the second and third
terms are written out in their Taylor series expansions; then we equate the coefficients on
both sides, and solving those equations gives the values of a, b, and c.
For that we rewrite the equation as
f'(x0) = a f0 + b[f(x0) + h f'(x0) + h²/2! f''(x0) + h³/3! f'''(x0) + all other terms]
+ c[f(x0) + 3h f'(x0) + 9h²/2! f''(x0) + all other terms].
Equating the f0 coefficients first: the left-hand side has no f0 term, so that coefficient
must be 0. Separating all the coefficients on the right-hand side, the first group is
(a + b + c) f0, the next is (b + 3c)h f'(x0), and the h²/2! group carries (b + 9c) f''(x0),
plus the remaining terms. Equating coefficients therefore gives a + b + c = 0, then
(b + 3c)h = 1, and then b + 9c = 0.
Solving these 3 equations, since we have 3 constants, we obtain c = −1/(6h), b = 3/(2h), and
a = −4/(3h). Substituting these coefficients directly gives
f'(x0) = −4/(3h) f0 + 3/(2h) f1 − 1/(6h) f3, and simplifying yields (9 f1 − 8 f0 − f3)/(6h),
which is exactly the formula we obtained in the earlier formulation.
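As a quick numerical check of this derivation, here is a short Python sketch (the function name deriv_3pt is my own, not from the lecture). The error of [9 f(x0+h) − 8 f(x0) − f(x0+3h)]/(6h) should fall roughly by a factor of 4 each time h is halved, consistent with the (h²/2) f'''(ζ) error term:

```python
import math

def deriv_3pt(f, x0, h):
    """One-sided estimate f'(x0) ~ (9 f(x0+h) - 8 f(x0) - f(x0+3h)) / (6h)."""
    return (9 * f(x0 + h) - 8 * f(x0) - f(x0 + 3 * h)) / (6 * h)

# Test on f(x) = sin(x), whose exact derivative at x0 is cos(x0).
x0 = 1.0
for h in (0.1, 0.05, 0.025):
    err = abs(deriv_3pt(math.sin, x0, h) - math.cos(x0))
    print(f"h={h:<6} error={err:.2e}")
```

Each halving of h should shrink the error roughly fourfold, confirming the O(h²) behaviour derived above.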
Next we consider derivatives over unequal intervals. The approximations of derivatives
considered so far assumed functional values prescribed at equal intervals, as discussed in the
previous lecture. In many practical problems, however, the function values are given at nodal
points that are not necessarily evenly spaced. We now discuss how to find the derivatives
directly from Taylor series expansion, without using the Lagrange interpolation or Newton's
divided difference interpolation formula, when the points are unequally spaced.
We begin with the forward difference formula. Suppose the values of f(x) are given at x = x0,
x = x1 and x = x2.
Let the spacing from the first point x0 to x1 be h1, and the spacing from x1 to x2 be h2. So
the 3 points are x = x0, x1, x2: starting from x0, the point x1 is placed at distance h1, and
x2 is placed at a further distance h2.
In terms of x0 we can write x1 = x0 + h1 and x2 = x0 + h1 + h2. Doing the Taylor series
expansions of f(x1) and f(x2) about x0, we get
f(x1) = f(x0+h1) = f(x0) + h1 f'(x0) + h1²/2! f''(x0) + h1³/3! f'''(x0) + rest of the terms,
and similarly
f(x2) = f(x0+h1+h2) = f(x0) + (h1+h2) f'(x0) + (h1+h2)²/2! f''(x0) + likewise.
If we try to evaluate f'(x0) from the first equation alone, we can write
f'(x0) = [f(x1) − f(x0)]/h1 − (h1/2!) f''(x0) − further terms, after taking h1 out of the
remaining terms.
Writing this with an error term, we replace x0 by ζ in the second derivative, where ζ lies
between x0 and x1. If a higher order formula is required, we eliminate f''(x0) from the first
2 equations defined above for f(x1) and f(x2).
(Refer Slide Time: 15:57)
For that we multiply the first equation by (h1+h2)² and the second by h1²; subtracting the 2
equations eliminates f''(x0), and we can solve for f'(x0) in terms of f(x0), f(x1), f(x2)
together with higher powers of h and higher derivatives. The result is
f'(x0) = [(h1+h2)² f1 − h1² f2 − ((h1+h2)² − h1²) f0] / [h1 h2 (h1+h2)] + [h1(h1+h2)/6] f'''(ζ),
where the last term is the error term.
Notice that the error term contains only f'''(ζ), since f''(ζ) has been eliminated. Here the
usual notation f0 = f(x0), f1 = f(x1), f2 = f(x2) is used. To find the second derivative from
these equations instead, we have to eliminate f'(x0).
To eliminate f'(x0), we multiply the first equation by (h1+h2) and the second by h1; after
this multiplication, subtracting the 2 equations gives the second order derivative in terms of
f0, f1 and f2, where all the coefficients involve only h1 and h2 and the remaining terms
involve third and higher order derivatives. Now consider unequally spaced points in a backward
difference setting: suppose the values of f(x) are given at x0, x−1 and x−2, as in Newton's
backward difference formula.
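The unequal-spacing forward formula can be sketched in Python as follows (the helper name and the test functions are my own choices, not from the lecture). Because f''(x0) has been eliminated, the formula is exact for quadratics, which gives a convenient check:

```python
import math

def deriv_forward_unequal(f, x0, h1, h2):
    """f'(x0) from f(x0), f(x0+h1), f(x0+h1+h2) at unequal spacings h1, h2."""
    f0, f1, f2 = f(x0), f(x0 + h1), f(x0 + h1 + h2)
    H = h1 + h2
    return (H**2 * f1 - h1**2 * f2 - (H**2 - h1**2) * f0) / (h1 * h2 * H)

# Exact for a quadratic: d/dx x^2 at x = 1 is 2.
print(deriv_forward_unequal(lambda x: x * x, 1.0, 0.3, 0.5))

# For exp, the error is roughly h1*(h1+h2)/6 * f'''(zeta).
print(deriv_forward_unequal(math.exp, 0.5, 0.01, 0.03), math.exp(0.5))
```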
(Refer Slide Time: 18:42)
In the backward difference formula the nodal points are x = x0, x−1, x−2: we go backwards from
x0, which is the last point in Newton's backward difference formula. The immediately previous
point x−1 is placed at distance h1 from x0, and x−2 is placed at a distance h2 from x−1, so we
can write x−1 = x0 − h1 and x−2 = x0 − h1 − h2.
Then, using Taylor series expansion,
f(x−1) = f(x0 − h1) = f(x0) − h1 f'(x0) + h1²/2! f''(x0) − h1³/3! f'''(x0) + …, with
alternating signs, and similarly
f(x−2) = f(x0 − h1 − h2) = f(x0) − (h1+h2) f'(x0) + (h1+h2)²/2! f''(x0) + all other terms.
Now we want the derivative of f at the point x0.
From the first equation, the first order derivative at x = x0 is
f'(x0) = [f(x0) − f(x−1)]/h1 + (h1/2) f''(ζ), where ζ lies between x−1 and x0. Similarly, a
higher order formula may be obtained by eliminating f''(x0) from the above 2 equations.
To eliminate f''(x0) from these 2 equations we multiply the first by (h1+h2)² and the second
by h1². Multiplying and subtracting gives a formula for f'(x0) in terms of f0, f−1 and f−2,
with the remainder in higher powers of h multiplying higher derivatives of f; the leading
remainder term is of third order and is written as [h1(h1+h2)/6] f'''(ζ), where ζ lies between
x−2 and x0, since the points used lie in that range.
Similarly, to evaluate the second order derivative we eliminate the first order derivative
term from both equations: multiply the first equation by (h1+h2) and the second by h1 only,
subtract, and we obtain the second order derivative in terms of h1, h2 and f0, f−1, f−2.
Next we go for central difference formulas. In the central difference setting, x0 is the
central point, with x1 on one side and x−1 on the other. Let the spacing from x0 to x1 be h1
and the spacing from x−1 to x0 be h2; then f(x1) = f(x0 + h1) and f(x−1) = f(x0 − h2), and
expanding both in Taylor series about x = x0 gives the expansions of f(x1) and f(x−1).
(Refer Slide Time: 24:22)
To obtain the first derivative, we eliminate f''(x0) from both equations; the first order
derivative of f at x0 then comes out as
f'(x0) = [h2² f1 − h1² f−1 + (h1² − h2²) f0] / [h1 h2 (h1+h2)] − (h1 h2/6) f'''(ζ),
where ζ lies between x−1 and x1, since this is a central difference formula. If we take
h1 = h2 = h, both spacings equal, the formula reduces to
f'(x0) = (f1 − f−1)/(2h) − (h²/6) f'''(ζ), the usual central difference formula.
This means that if you expand f(x+h) and f(x−h) and subtract the two, you obtain the same
formula: f'(x0) = (f1 − f−1)/(2h), with remainder −(h²/6) f'''(ζ), where ζ lies between x−1
and x1. For second order derivatives, we instead have to eliminate the f' terms from both
equations.
To eliminate f'(x0) from both equations we multiply the first equation by h2 and the second by
h1 and add them; then f'(x0) drops out, and we obtain the central difference approximation of
the second order derivative as
f''(x0) = 2[h2 f1 − (h1+h2) f0 + h1 f−1] / [h1 h2 (h1+h2)],
with remainder −[(h1 − h2)/3] f'''(ζ) − [(h1³ + h2³)/(12(h1+h2))] f^(4)(ζ).
Since the third order term is associated with a nonzero constant whenever h1 ≠ h2, it cannot
be neglected; that is why the extra term −(1/3)(h1 − h2) f'''(x0) appears in this 3-point
central difference approximation. If we take h1 = h2 = h, then
f''(x0) = (f1 − 2f0 + f−1)/h², and the error occurs at order h², namely −(h²/12) f^(4)(ζ),
where ζ lies between x−1 and x1. Thank you for listening to the lecture.
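The two equal-spacing central formulas above can be checked with a short Python sketch (the helper names are mine, not the lecturer's): (f1 − f−1)/(2h) for the first derivative with error −(h²/6) f'''(ζ), and (f1 − 2f0 + f−1)/h² for the second with error −(h²/12) f^(4)(ζ):

```python
import math

def central_first(f, x0, h):
    """(f(x0+h) - f(x0-h)) / (2h); error term is -(h^2/6) f'''(zeta)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

def central_second(f, x0, h):
    """(f(x0+h) - 2 f(x0) + f(x0-h)) / h^2; error term is -(h^2/12) f''''(zeta)."""
    return (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / h**2

x0, h = 1.0, 0.01
print(central_first(math.sin, x0, h), math.cos(x0))    # should nearly agree
print(central_second(math.sin, x0, h), -math.sin(x0))  # should nearly agree
```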
Numerical Methods
Professor Dr. Ameeya Kumar Nayak
Department of Industrial and Systems Engineering
Indian Institute of Technology Roorkee
Lecture No 31
Numerical Integration Part 1
Welcome to the lecture series on numerical methods. In the last lecture we discussed numerical
differentiation, and today we will start numerical integration. In numerical integration we
deal with different kinds of interpolation formulas and how we can integrate a polynomial or a
function using these interpolation formulas.
The specific advantage of numerical integration is that if we have specific nodal points with
their tabular values, then even where the function is not known to us explicitly, we can use
numerical integration to evaluate the integral from the values at those points.
First we go through an introduction: where the difficulties of solving integration problems
arise and why we go for numerical integration. Then we cover the methodology of numerical
integration, and for this lecture we take up the rectangular rule and how it can be used to
evaluate an integral.
(Refer Slide Time: 01:44)
If you look at the history of integration, Archimedes developed integration to find the
surface areas and volumes of solids, and the method remains very modern today; originally it
was developed to calculate areas and volumes of solids such as spheres and cones. Gauss first
presented integrals in a graphical sense, developing graphs for integrals, and Leibniz and
Newton discovered calculus, finding that differentiation and integration are converse to each
other.
Integration is applied in many fields of science and engineering. Here are some modern
structures where integration was used, like the Petronas Towers and the Sydney Opera House:
for finding curves, the centre of mass, displacement, velocity and fluid flows. Without
integration and derivatives we cannot solve such types of problems.
If somebody asks what integration is: we put the integral sign over a parametric region where
we want to evaluate the integral, calculating the areas, surfaces or volumes bounded within
that region. Integrals here fall into 2 types: first, definite integrals over a finite range,
where the end points are finite and we can easily position them; and second, integrals where
some range is infinite (properly called improper integrals), where it is difficult to evaluate
the functional values over the whole range.
(Refer Slide Time: 03:46)
In the finite-range case we define a range: say x runs from x = 0 to a, or from x = a to b,
and within those limits we estimate how the curve y varies and what region it bounds together
with the axis. That is, if a function y = f(x) is defined, the area enclosed between the curve
and the x-axis for 2 fixed points can be evaluated by using integration.
So why numerical integration? If the function f(x) is not given, as mentioned, but its values
are given at discrete points, then we can use numerical integration to find the integral over
a given range. Also, for some complicated functions, say ∫e^(−x²) dx over the range 0 to 1,
the integration is very difficult to carry out analytically, and in that case numerical
integration makes the evaluation very easy.
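To illustrate (my own sketch, anticipating the rectangular-type rules introduced later in this lecture), even a plain left-endpoint Riemann sum handles ∫ from 0 to 1 of e^(−x²) dx, which has no elementary antiderivative; the result can be cross-checked through the error function, since this integral equals (√π/2)·erf(1):

```python
import math

# Left-endpoint Riemann sum for I = integral of exp(-x^2) over [0, 1].
n = 100_000
h = 1.0 / n
I = h * sum(math.exp(-(i * h) ** 2) for i in range(n))

# Cross-check against the closed form via the error function.
exact = math.sqrt(math.pi) / 2 * math.erf(1.0)
print(I, exact)
```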
(Refer Slide Time: 05:29)
Let us suppose the functional values are known between x = a and x = b, and [a, b] is divided
into n equal parts, giving n−1 internal points in [a, b], which we name x1, x2, …, xn−1.
Visualizing this graphically, we have points from x = x0 to x = xn, dividing the range into n
intervals.
So we now have the n−1 internal points x1, x2, x3, …, xn−1, and for convenience we write
x0 = a and xn = b. Then the points can be defined as a = x0 < x1 < x2 < … < xn−1 < xn = b. If
we approximate a function by a polynomial within the range x0 to xn, we obtain a polynomial
valid on that range.
(Refer Slide Time: 07:09)
In that range we have n+1 nodal points with the n intervals [xi, xi+1], i varying from 0 to
n−1. We want to fit a polynomial p(x) over k of these intervals, passing through the points
(xi, yi), i = 0, 1, 2, …, k, and evaluate the integral.
(Refer Slide Time: 07:39)
Then we can write I1 = ∫ from x0 to xk of f(x) dx ≈ ∫ from x0 to xk of p(x) dx. This covers
the k intervals [xi, xi+1] for i = 0, 1, …, k−1, and the process is repeated for all the
intervals: after the first k intervals are covered, the next block of k intervals is covered
in the same way, until the entire domain is covered.
(Refer Slide Time: 08:34)
Then we discuss numerical integration with the different interpolation formulas, first
Lagrange's formula. The Lagrange polynomial approximation is usually written in the form
p(x) = Σ (i = 0 to n) li(x) f(xi), where the li(x) are called the Lagrange coefficient
polynomials.
Usually li(x) is expressed as ω(x)/[(x − xi) ω'(xi)]. Written out in elaborated form,
li(x) = [(x−x0)(x−x1)…(x−xi−1)(x−xi+1)…(x−xn)] /
[(xi−x0)(xi−x1)…(xi−xi−1)(xi−xi+1)…(xi−xn)].
Writing the integration, ∫ from x0 to xk of f(x) dx can be written as
∫ from x0 to xk of Σ (i = 0 to n) li(x) f(xi) dx, or equivalently
Σ (i = 0 to n) f(xi) ∫ from x0 to xk of li(x) dx.
(Refer Slide Time: 11:07)
As already noted, li(x), the coefficient of f(xi), can be expressed in the form
ω(x)/[(x − xi) ω'(xi)]. So if we carry out the integration for each of these product terms, we
can evaluate the numerical integral of the polynomial.
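In code, the weights ∫ li(x) dx can be computed exactly by building each li(x) as a coefficient list and integrating term by term. The sketch below is my own construction, using Python's exact rational arithmetic; for the nodes 0, 1, 2 on [0, 2] it recovers the familiar Simpson-type weights 1/3, 4/3, 1/3:

```python
from fractions import Fraction

def lagrange_weights(nodes, a, b):
    """Quadrature weights w_i = integral over [a, b] of the Lagrange basis l_i(x)."""
    nodes = [Fraction(x) for x in nodes]
    a, b = Fraction(a), Fraction(b)
    weights = []
    for i, xi in enumerate(nodes):
        poly = [Fraction(1)]                # coefficients of l_i, lowest degree first
        for j, xj in enumerate(nodes):
            if j == i:
                continue
            denom = xi - xj                 # multiply poly by (x - xj) / (xi - xj)
            new = [Fraction(0)] * (len(poly) + 1)
            for k, c in enumerate(poly):
                new[k + 1] += c / denom
                new[k] -= xj * c / denom
            poly = new
        # Integrate term by term over [a, b].
        weights.append(sum(c * (b**(k + 1) - a**(k + 1)) / (k + 1)
                           for k, c in enumerate(poly)))
    return weights

print(lagrange_weights([0, 1, 2], 0, 2))
```

Exact rational arithmetic avoids the rounding noise a floating-point version would show for these small integrals.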
Next we go for numerical integration using Newton's forward difference formula, used when the
values of the function y = f(x) are provided at equally spaced abscissas. In finite difference
approximation, Newton's forward difference formula in polynomial form is written as
y0 + pΔy0 + [p(p−1)/2!]Δ²y0 + [p(p−1)(p−2)/3!]Δ³y0 + …, with a finite number of terms ending
at [p(p−1)(p−2)…(p−n+1)/n!]Δⁿy0, plus the remainder term.
To replace p in terms of x we use x = x0 + ph, so p = (x − x0)/h. Substituting this value of
p, the polynomial becomes
p(x) = y0 + [(x−x0)/h]Δy0 + [(x−x0)(x−x1)/2!h²]Δ²y0 + all other terms in the same fashion.
Keeping the polynomial in terms of p, we denote
P(p) = y0 + pΔy0 + [p(p−1)/2!]Δ²y0 + … + [p(p−1)…(p−k+1)/k!]Δᵏy0. Using this in the
integration, ∫ from x0 to xk of p(x) dx, that is ∫ from x0 to xk of f(x) dx approximately,
equals h ∫ from 0 to k of P(p) dp.
Here dx is nothing but h dp, which is why the factor h comes out; at the point x = x0 we put
p = 0, and x = xk is nothing but x0 + kh, so there p = k. That is why the interval of
integration changes to this form.
In this setting we can also write the error term. For Newton's forward interpolation formula
the remainder R(x) is written using the product p(p−1)(p−2)…(p−k) up to the k-th factor:
R(x) = [p(p−1)…(p−k)/(k+1)!]Δ^(k+1)y0. In terms of x this can be written as
[(x−x0)(x−x1)…(x−xk)/(k+1)!] f^(k+1)(ζ), the factor h^(k+1) being absorbed since
Δ^(k+1)y0 = h^(k+1) f^(k+1)(ζ).
Integrating this within the range x0 to xk,
∫ from x0 to xk of R(x) dx = ∫ from x0 to xk of [(x−x0)…(x−xk)/(k+1)!] f^(k+1)(ζ) dx, where ζ
lies between x0 and xk. Taking the term f^(k+1)(ζ)/(k+1)! out as a constant, inside the
integration we are left with ∫ from x0 to xk of (x−x0)(x−x1)…(x−xk) dx.
Replacing dx by h dp and each factor (x − xi) by (p − i)h, the total power of h becomes
h^(k+2), so the error can be written as
[h^(k+2)/(k+1)!] f^(k+1)(ζ) ∫ from 0 to k of p(p−1)(p−2)…(p−k) dp.
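The integral ∫ from 0 to k of p(p−1)…(p−k) dp can be evaluated exactly in rational arithmetic (my own sketch). For k = 1 it gives −1/6, the familiar trapezoidal error constant, and for k = 2 it gives 0, which is why Simpson's rule picks up an extra order of accuracy:

```python
from fractions import Fraction

def error_integral(k):
    """Exact value of the integral of p(p-1)...(p-k) over [0, k]."""
    poly = [Fraction(0), Fraction(1)]      # the factor p, lowest degree first
    for r in range(1, k + 1):              # multiply by (p - r)
        new = [Fraction(0)] * (len(poly) + 1)
        for j, c in enumerate(poly):
            new[j + 1] += c
            new[j] -= r * c
        poly = new
    # Integrate term by term over [0, k].
    return sum(c * Fraction(k) ** (j + 1) / (j + 1) for j, c in enumerate(poly))

print(error_integral(1))   # -1/6
print(error_integral(2))   # 0
```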
That is the error of the integration formula over one block. For the total error in computing
the integral I = ∫ from x0 to xn of f(x) dx, we take the sum of the moduli of the errors, one
for each block of intervals, summed until the whole range x0 to xn is covered. Now we shall
discuss some methods based on equally spaced abscissas: suppose the interval [a, b] is
subdivided into n intervals, each of width h, so that we can define h = (b−a)/n.
(Refer Slide Time: 19:17)
In a graphical sense, we have the points x0 = a, …, xn = b; since all the points are equally
spaced, h = (b−a)/n = (xn − x0)/n, and each point satisfies xi = xi−1 + h, with i varying from
1 to n in increments of 1.
So the points are a = x0 < x1 < … < xn = b. In the rectangular rule, the function is replaced
by a constant value on each subinterval. This means that on the integration range x0 to x1, we
take y0, the functional value at x0, as the constant value of f(x) throughout the whole
interval.
In the 2nd interval, in the same fashion, we approximate f(x) by y1 throughout the interval,
and in that way the integral is written as a sum of rectangular cross-sections. So if
y = f(x) is given and we are asked to evaluate the integral over a range a to b, we subdivide
the total region into n intervals and draw a perpendicular at each of these points; since we
are taking sufficiently close intervals, the error of the approximation is kept small in this
sense.
Taking the starting point x0, then x1, then x2, and so on up to the last point xn, with
corresponding y values y0, y1, y2, …, yn, we consider the integral over x0 to x1 of f(x) dx,
with f(x) taken as the constant value y0 on the range x0 to x1. Then this integral becomes
y0(x1 − x0), which is h y0.
Similarly we define the integrals over x1 to x2, x2 to x3, and so on, and add them up:
∫ from x0 to xn of f(x) dx is split as the sum of the integrals over x0 to x1, x1 to x2, …,
xn−1 to xn. The first interval gives h y0, the 2nd gives h y1, the 3rd gives h y2, and summing
all the terms the final integral is h(y0 + y1 + … + yn−1).
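The composite rectangular rule h(y0 + y1 + … + yn−1) translates directly into Python (a sketch; the function name is mine). On ∫ from 0 to π of sin x dx = 2 the approximation improves steadily as n grows:

```python
import math

def rectangular(f, a, b, n):
    """Composite rectangular (left-endpoint) rule: h * (y0 + y1 + ... + y_{n-1})."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

# Integrate sin over [0, pi]; the exact value is 2.
for n in (10, 100, 1000):
    print(n, rectangular(math.sin, 0.0, math.pi, n))
```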
If the function is monotonically increasing, then
∫ from a to b of f(x) dx > h(y0 + y1 + … + yn−1); if it is monotonically decreasing, then
∫ from a to b of f(x) dx < h(y0 + y1 + … + yn−1). Next we derive the rectangular rule from
Lagrange's formula.
In the Lagrange setting, suppose the interval is x0 to x1 and we are asked to fit a polynomial
passing through the points (x0, y0) and (x1, y0); since the functional values are the same at
both points, the polynomial is simply the constant y0 across the interval.
Integrating this interpolating polynomial of Lagrange's method,
∫ from x0 to x1 of p(x) dx = y0(x1 − x0), which is nothing but h y0. This formula is exactly
what is called the rectangular rule. To approximate the error as well, the error term can be
written as E = ∫ from x0 to x1 of R(x) dx.
That is, E = ∫ from x0 to x0+h of (x − x0) f'(ζ) dx = (h²/2) f'(ζ), where ζ lies between x0
and x1. In the same form we obtain the composite formula for this piecewise-constant Lagrange
polynomial: ∫ from a to b of f(x) dx = ∫ from x0 to xn of f(x) dx ≈ h(y0 + y1 + y2 + … + yn−1).
Based on Newton's forward formula, the same calculation goes as follows: representing f(x) on
x0 to x1 by the Newton forward interpolating polynomial, ∫ from x0 to x1 of f(x) dx can be
written as the integral of P(p), where the integration range for P(p) transforms to 0 to 1 and
dx is written as h dp.
Keeping only the leading term, this is h ∫ from 0 to 1 of y0 dp, which is just h y0. In a
similar fashion we can determine the error for this formula:
E = h ∫ from 0 to 1 of pΔy0 dp = (h²/2) f'(ζ), since we know Δy0 = h f'(ζ). So both methods
give the same formula for the rectangular rule. Thank you for listening to this lecture on
numerical integration; in the next lecture we continue with numerical integration based on
Newton's forward difference formula and the other interpolation formulas.
Numerical Methods
Professor Dr Ameeya Kumar Nayak
Department of Industrial and Systems Engineering
Indian Institute of Technology Roorkee
Lecture No 32
Numerical Integration Part II
Welcome to the lecture series on numerical methods. In the last lecture we discussed numerical
integration, starting with the rectangular rule and how to implement it to find the integral
of a particular function: even if the function is not known to us, as long as the tabular
values, i.e. the functional values at different points, are known, we can evaluate the
integral using those nodal or tabular values.
In this class we go for some of the higher-order integration methods. First we discuss the
quadrature formula, then the 2-point formula, that is the trapezoidal rule, then the
geometrical interpretation of the trapezoidal rule and its error estimation, and then the
composite formula of the trapezoidal rule: if the 2-point rule is applied on each of the
subintervals and the contributions are summed, that is called the composite trapezoidal rule,
and how we obtain that formula I will discuss in this lecture.
(Refer Slide Time: 01:38)
First, in quadrature form, the integral is approximated by a linear combination of the values
of f(x) at tabular points. This means ∫ from a to b of f(x) dx is expressed as
Σ (k = 0 to n) λk f(xk). It is assumed that we have the set of tabular values (x0, f(x0)),
(x1, f(x1)), …, (xn, f(xn)); if this set is known to us, we can apply this quadrature formula
to evaluate the integral from the tabular values.
Expanding the formula, the linear combination of the functional values at the different
tabular points reads λ0 f(x0) + λ1 f(x1) + … + λn f(xn). The tabulated points xk are called
abscissas, the f(xk) are called ordinates, the λk are called the weights of the integration
rule, and the complete expression is called a quadrature formula.
For the computation of the error of the quadrature formula, the error Rn(f) is written as
Rn(f) = ∫ from a to b of f(x) dx − Σ (k = 0 to n) λk f(xk). To evaluate the error we always
subtract the two, since the sum is not exactly equal to the integral but only approximates it;
the complete integration formula is then the sum plus Rn(f).
Sometimes we justify a certain order for a method. Suppose the rule is exact for polynomials
up to some degree; then the integration method of the form
∫ from a to b of f(x) dx = Σ (k = 0 to n) λk f(xk) + Rn(f) is said to be of order p if
Rn(f) = 0 for all polynomials of degree less than or equal to p.
This means it produces exact results for all polynomials f(x) = 1, x, x², …, x^p; that is,
Rn(x^m) = ∫ from a to b of x^m dx − Σ (k = 0 to n) λk xk^m = 0 for m = 0, 1, 2, …, p.
Then we say that this integration method is of order p: putting exactly 1, x, x², up to x^p
gives the value 0 for the error. And to find the error term itself, considering the
immediately next power provides the error term.
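The order test Rn(x^m) = 0 for m = 0, …, p is easy to automate (my own sketch, in exact rational arithmetic). Applied to the trapezoidal weights 1/2, 1/2 on [0, 1] it reports order 1, and to the Simpson-type weights 1/3, 4/3, 1/3 on [0, 2] it reports order 3:

```python
from fractions import Fraction

def order_of_rule(nodes, weights, a, b, max_m=10):
    """Largest p such that R_n(x^m) = 0 for m = 0..p (the order of the rule)."""
    p = -1
    for m in range(max_m + 1):
        exact = (Fraction(b) ** (m + 1) - Fraction(a) ** (m + 1)) / (m + 1)
        approx = sum(w * Fraction(x) ** m for x, w in zip(nodes, weights))
        if exact != approx:
            break
        p = m
    return p

half, third = Fraction(1, 2), Fraction(1, 3)
print(order_of_rule([0, 1], [half, half], 0, 1))                 # trapezoidal
print(order_of_rule([0, 1, 2], [third, 4 * third, third], 0, 2)) # Simpson-type
```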
So we can write the error constant from f(x) = x^(p+1):
C = ∫ from a to b of ω(x) x^(p+1) dx − Σ (k = 0 to n) λk xk^(p+1), where C is called the error
constant and ω(x), the weight factor, is usually chosen as 1; so the formula can also be
written directly as C = ∫ from a to b of x^(p+1) dx − Σ (k = 0 to n) λk xk^(p+1).
And if we calculate the total error term, then using this integration formula
directly we can write Rn(f), since we have already calculated the difference,
as C/(p+1)! f(p+1)(ζ), where ζ should lie between a and b; that is,
Rn(f) = a to b ∫f(x)dx - k=0:n ∑λkf(xk) = C/(p+1)! f(p+1)(ζ), where ζ should lie between a and
b.
(Refer Slide Time: 08:34)
And if you want to find a bound for this error term, we can bound the absolute error:
it should be less than or equal to the absolute value of C/(p+1)! times the
maximum of |f(p+1)(x)| for x between a and b. This completely defines the bound
for the error term.
So then we will go for integration methods based on uniform mesh spacing: suppose we
have tabular points x0, x1, up to xn and all points are equi-spaced; then how we can
evaluate the integral using a quadrature formula is what we will discuss.
(Refer Slide Time: 09:40)
So if you go for uniform mesh spacing for a prescribed set of data points with x0 = a and
xn = b, then we can define the step size h = (b-a)/n, with the
tabular values x0 = a < x1 < x2 < … < xn = b, and all these points
x0, x1, x2, up to xn equi-spaced.
Then we can write this integration I = a to b ∫f(x)dx = k=0:n ∑λkf(xk). Usually we can write
this one as λ0f(x0) + λ1f(x0+h) + λ2f(x0+2h) + … + λnf(x0+nh). This expression
is called the Newton-Cotes quadrature formula, and the λk are called Cotes
numbers.
So first we will go for the trapezoidal rule. Suppose we have a curve with ordinates at (a,
f(a)) and (b, f(b)); call these points P and Q. If we want to find the area, that is the
integration over the range a to b of the function bounded by the curve y = f(x), then we can
estimate this area by a trapezium or trapezoid and so evaluate the integration in a
closed form.
Suppose this curve y = f(x) is given to us and we are asked to find the integral over the
range a to b. This can be approximated by joining the chord P to Q on the curve,
and we can use Newton's forward difference formula for a linear interpolating
polynomial passing through the points (a, f(a)) and (b, f(b)). Since we are
approximating this curved region by a straight line, we can approximate f by
a linear interpolating polynomial.
So if we interpolate the curve with this linear interpolating polynomial, we can write
the formula as f(x) = f(x0) + pΔf(x0). Since it is a linear interpolating polynomial, the
higher differences are zero, which is why the series stops here. In terms of x
this expansion can be written as f(x0) + (x-x0)/h Δf(x0), where we take x0
= a and x1 = b, and the step size h is defined as h = b-a.
And then if you integrate this function over the range a to b, the
integration I = a to b ∫f(x)dx can be written as
a to b ∫f(x0) dx + a to b ∫(x-x0)/h Δf(x0) dx, writing p in terms of x.
And replacing the limits in terms of x0 and x1, this can be written as
x0 to x1 ∫f(x0)dx + x0 to x1 ∫(x-x0)/h [f(x1)-f(x0)] dx.
And integrating, this can be written as f(x0)(x1-x0) + [f(x1)-f(x0)]/h times
|(x-x0)²/2| from x0 to x1. And if you put x1-x0 = h, this becomes
h f(x0) + [f(x1)-f(x0)]/h · h²/2. Finally this simplifies to
h/2 [f(x0) + f(x1)].
And it can also be represented in the form h/2 [f(a) + f(b)], and if you replace h
in terms of b and a, this can be written as (b-a)/2 [f(a) + f(b)]. This is basically called the 2-point
quadrature formula: by taking the linear interpolating polynomial which
interpolates the curve y = f(x) over the interval, we can evaluate the integration.
So next if we go for the geometrical interpretation: the area of the trapezium with
width b-a and ordinates f(a) and f(b), which is an approximation to the area under the curve y = f(x) above the x-
axis between the ordinates x = a and x = b, is nothing but the area (b-a)/2 [f(a) +
f(b)].
So if we go for the geometrical calculation of this area, we can split it into
2 regions: a rectangular region and a triangular region.
If we calculate the area of the rectangle and the area of the triangle in that region,
then the total area of the 2 sections again comes out as (b-a)/2 [f(a) + f(b)].
Geometrically, the area bounded by the lines x = a, x = b, y = 0 and the chord
PQ is the area of the rectangle APRB plus the area of
the triangle PQR. The total area of the rectangle is (b-a)f(a), and the triangular area is
½(b-a)[f(b)-f(a)]. So finally we get (b-a)f(a) + ½(b-a)[f(b)-f(a)] = (b-a)/2 [f(a) + f(b)].
So if you go for the error estimation of the trapezoidal rule, we can verify that the
error R(f, x) = 0 for f(x) = 1. As I have already discussed, an integration method is
said to be of order p if p is the largest integer such that f(x) = xm,
for m = 0, 1, 2 up to p, all give zero values for the error term.
So then, for f(x) = 1, the error is a to
b ∫f(x)dx minus our formula (b-a)/2 [f(a) + f(b)]; whatever value this gives is
the error. So R(f, x), written over x0 to x1 or a to b, whichever you prefer,
is a to b ∫1 dx - (b-a)/2 · 2,
since f(a) is 1 and f(b) is 1; this is (b-a) - (b-a), which gives 0.
Similarly, if we go for f(x) = x, then we can express R(f, x) = a to b ∫x dx - (b-a)/2 [a + b],
since f(a) = a and f(b) = b. Evaluating, x²/2 over the
range a to b gives (b²-a²)/2, and (b-a)(a+b)/2 is also (b²-a²)/2,
so R(f, x) is 0 again.
So if we consider f(x) = x², then we can obtain the value of C. Since we have
approximated the function by a polynomial of degree one, the rule is exact, that is,
the error term is zero, for all polynomials of degree up to
one.
So that is why, for the immediate next power f(x) = x², we get the error constant. If you put
f(x) = x², C can be written as
C = a to b ∫x² dx - (b-a)/2 [a² + b²],
using the trapezoidal formula with f(x) = x².
If you solve this, the first term gives x³/3, that is b³/3 - a³/3, and
if you put in all of these values over the
range, you can obtain the C value as C = -1/6 (b-a)³.
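This value of C can be checked numerically; a minimal sketch (the function name is mine), applying the rule to f(x) = x² on a few intervals:

```python
# Verify the trapezoidal error constant C = -(b-a)^3/6 by applying
# the rule to f(x) = x^2 on several different intervals [a, b].
def trap_C(a, b):
    exact = (b ** 3 - a ** 3) / 3             # integral of x^2 from a to b
    approx = (b - a) / 2 * (a ** 2 + b ** 2)  # trapezoidal rule for x^2
    return exact - approx

checks = [(trap_C(a, b), -(b - a) ** 3 / 6)
          for a, b in [(0.0, 1.0), (1.0, 2.0), (-1.0, 3.0)]]
# In every case the computed C matches -(b-a)^3/6.
```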
And if you put this C value in the error term, which is usually written as R(f, x) =
C/(p+1)! f(p+1)(ζ), then p is 1 here, so we can write it as
-1/6 (b-a)³ · 1/2! f ``(ζ), which gives -(b-a)³/12 f ``(ζ), where ζ should lie between a and b.
And the maximum bound of this error in magnitude form
is |R(f, x)| ≤ (b-a)³/12 · max |f ``(x)|. From this
formula we can see that if the interval length is large, the difference b-a
is also large, and in that case this method becomes meaningless; if the interval is
very small then we will have a good result.
So then we will go for an example of the trapezoidal rule. Suppose the question is: find
the value of the integration 0 to 1 ∫x/(1+x) dx. We have considered a simple example taking h = 1,
using the trapezoidal rule; I will discuss how to implement the formula. Also, since
it is a simple function, we can easily find the exact value of the integral,
and then we can obtain the error by comparing the approximated value
calculated using the trapezoidal rule with the exact value calculated
analytically.
Suppose this integration 0 to 1 ∫x/(1+x) dx is asked and the step size is given as
1. So, using the trapezoidal rule to solve this integral: a = 0, b = 1, f(x) = x/(1+x),
and step size h = 1 means n = 1. Usually we define h = (b-a)/n, and since h is given as
1, we have n = (1-0)/1, so that is why we can consider n = 1.
So we can use the trapezoidal rule with all the values given to us. If we
use them, we can write the integration 0 to 1 ∫x/(1+x) dx, using the trapezoidal rule, as (b-
a)/2 [f(a) + f(b)] = (1-0)/2 [f(0) + f(1)]; f(0) gives 0 and f(1) gives 1/2. So this
is 1/2 · 1/2, that is 1/4, or 0.25.
And for the analytical solution I can write this one as 0 to 1 ∫[1 - 1/(1+x)] dx,
which can be written as [x - ln(1+x)] over the range 0 to 1. So it is 1 - ln(2);
since ln(2) ≈ 0.6931, this value is about 0.3069. And if you take the difference, we
can find the error as 0.3069 - 0.25 = 0.0569; since we are taking the step size h
as large, this error is very high.
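The whole example can be reproduced in a few lines (a sketch; the variable names are mine; note that the exact value is 1 - ln 2 ≈ 0.3069):

```python
import math

# Worked example: integral of x/(1+x) on [0, 1] by a single trapezoid
# (h = 1), compared against the exact value 1 - ln 2.
f = lambda x: x / (1 + x)
a, b = 0.0, 1.0

trap = (b - a) / 2 * (f(a) + f(b))   # (1/2)(0 + 1/2) = 0.25
exact = 1 - math.log(2)              # about 0.3069
error = abs(exact - trap)            # about 0.0569, large because h = 1
```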
So then we will go for the composite trapezoidal rule: in each subinterval we form a
trapezoid and then approximate the integration over each of these ranges; that is
basically called the composite trapezoidal rule. Graphically you can say that in each section a
trapezium is formed, the area of the trapezium approximates the integral over each
interval, and then we combine all the intervals to obtain a
composite formula.
So if we consider all the step sizes equal, starting
from x0, x1, up to xn, then we can split the formula a to b ∫f(x)dx as
[x0 to x1 ∫ + x1 to x2 ∫ + … + xn-1 to xn ∫] f(x)dx.
And writing the trapezoidal formula on each interval, we can rewrite the formulation as: for the 1st
interval h/2 [f(x0) + f(x1)], for the 2nd interval h/2 [f(x1) + f(x2)], for the 3rd interval
h/2 [f(x2) + f(x3)], and for the last interval h/2 [f(xn-1) + f(xn)].
So if you add up all these terms, we can write this one as h/2 [f(x0) + f(xn) + 2{f(x1) + … + f(xn-
1)}]; this is basically called the composite trapezoidal formula. With this I am completing this
lecture; in the next lecture I will go for the error computation of the trapezoidal rule. Thank you for
listening to the lecture.
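The composite formula translates directly into code; a minimal sketch (the function name is mine), checked on the earlier example:

```python
import math

def composite_trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals, h = (b-a)/n."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    # h/2 [ f(x0) + f(xn) + 2{ f(x1) + ... + f(x_{n-1}) } ]
    return h / 2 * (f(a) + f(b) + 2 * interior)

f = lambda x: x / (1 + x)
one_step = composite_trapezoid(f, 0.0, 1.0, 1)    # 0.25, as in the example
refined  = composite_trapezoid(f, 0.0, 1.0, 100)  # close to 1 - ln 2
```

With n = 1 this reduces to the single-trapezoid result, while larger n brings the value close to the exact integral.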
Numerical Methods
By Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 33
Numerical Integration Part 3
Welcome to the lecture series on numerical methods. In the last lecture I discussed
numerical integration based on the trapezoidal rule. In this lecture I will continue with the error
approximation for the trapezoidal rule.
So in the trapezoidal rule, if you go for the error calculation, then as I have told you,
if we have only the 2 points a and b and a
single interval, usually we express the error term in the form |R(f, x)|
≤ (b-a)³/12 · max |f ``(x)|, the maximum taken for x between a and b. We have
considered positive values since we have taken the absolute value on the left hand side.
(Refer Slide Time: 1:32)
So now the expression for a composite formula: the above is for a single application, where
for a to b ∫f(x)dx with only one interval we use the formula (b-a)/2 [f(a) + f(b)]. In a composite
form, the interval [a, b] is subdivided into n intervals with equi-spaced points a = x0,
x1, up to xn = b.
So usually we define h = (xn-x0)/n. That is why we will
have the intervals x0 to x1 ∫f(x)dx + x1 to x2 ∫f(x)dx + … + xn-1 to xn ∫f(x)dx, and the
formula can be written directly as h/2 [f(x0) + f(xn) + 2{f(x1) + … + f(xn-1)}]. In each of these intervals
we will have an error term, since whenever we approximate the area bounded by
the curve, each region is approximated by a trapezium.
So that is why in each trapezium we will have an error section, either on the lower side or
on the upper side. So if you consider the total error for this composite formula,
it can be written as R(f, x) = -h³/12 [f ``(ζ1) + f ``(ζ2) + … + f ``(ζn)],
where ζi lies between xi-1 and xi for i = 1, 2 up to n.
So for the error bound we can write |R(f, x)| ≤ h³/12 times the sum of these
terms; if we take the maximized error term, or consider each term of the
error to be of this maximal value, then we can
bound the total sum by N max |f ``(x)|, which takes a positive value since we are taking the
absolute value on the left hand side, where x should lie between a and b and the
maximum of |f ``(x)| is taken within that range.
So in complete form the formula can be written as
|R(f, x)| ≤ Nh³/12 · max{|f ``(x)|}, where x should lie between a and b, and
sometimes, replacing Nh = b-a, you can rewrite the formula as (b-a)h²/12 · max
{|f ``(x)|}.
So, as written here, with Nh = b-a this can also be written as (b-
a)³/(12N²) · max{|f ``(x)|}, and this expression shows that
increasing the number of intervals gives less error: compared with
taking the whole interval in a single step, which gives
a larger error, if we subdivide into more sections we get less error,
since in each of the intervals the error is reduced. That is why we can
use this composite formula to get a better result.
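This O(h²) behavior can be seen empirically: doubling the number of subintervals should cut the error by about a factor of 4. A quick sketch (assuming f(x) = eˣ on [0, 1] as a test function, which is not from the lecture):

```python
import math

# Doubling the number of subintervals should shrink the composite
# trapezoidal error by about 4x, consistent with the (b-a)h^2/12 bound.
def composite_trapezoid(f, a, b, n):
    h = (b - a) / n
    return h / 2 * (f(a) + f(b) + 2 * sum(f(a + k * h) for k in range(1, n)))

f, exact = math.exp, math.e - 1.0     # integral of e^x on [0, 1]
err_8  = abs(composite_trapezoid(f, 0.0, 1.0, 8)  - exact)
err_16 = abs(composite_trapezoid(f, 0.0, 1.0, 16) - exact)
ratio = err_8 / err_16                # close to 4
```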
In the geometrical interpretation, in each section the error is
reduced, while over the total section the error is very large compared to the
lower graph: with N trapeziums with ordinates f(xi-1) and f(xi), if
you consider it in a geometrical way, then the sum of the trapezium areas approximates the area under the curve
y = f(x) above the x-axis between the ordinates x = a and x = b, and the gap between them
is the error under the curve.
Then suppose a question is asked: evaluate the integration I = 1 to 2 ∫1/(5+3x) dx with four
subintervals using the trapezoidal rule, compare with the exact integral, find the absolute error,
and find the bound for the error. Here a is given as 1, b is 2, and four
subintervals means N = 4.
So this integration is given as I = 1 to 2 ∫1/(5+3x) dx with four subintervals, that is
n = 4, and with n = 4 we can define h = (b-a)/n, that is (2-1)/4 = 0.25.
With h = 0.25 the nodal points start from 1: 1, 1.25, then 1.5, then
1.75, and the last point is 2, with the corresponding f values given by f(x) =
1/(5+3x). For x = 1 we get f(x) = 0.125, for 1.25 we get
0.11429, for 1.5 we get 0.10526, for 1.75 we get 0.09756, and for 2 we get 0.09091. Now we will go for the integration: the
trapezoidal integration It can be written as h/2 […] with h/2 = 0.25/2 = 0.125, x0 given as 1
and xn as 2, so It = 0.125[0.125 + 2{0.11429 + 0.10526 + 0.09756} + 0.09091].
So this gives the value 0.10627, and if you go for the computation of the exact value of the
integration, we get (1/3) ln(5+3x), and putting in the range 1 to 2 we
obtain 0.10615. The absolute error is |IExact - It|, that is,
|0.10615 - 0.10627| = 0.00012, which is very small. And if you go for the bound of this error
in absolute value, the error should be less than or equal to (b-
a)h²/12 · max{|f ``(x)|}, where x should lie between 1 and 2.
And if you calculate f ``(x), the bound equals (2-1)(0.25)²/12 ·
max{18/(5+3x)³} = (0.25)²/12 · 18/512, where x should lie between 1 and 2; the
function achieves its maximum at x = 1 itself. This gives the
maximum error bound of the composite trapezoidal rule as 0.00018.
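The whole computation can be checked with a short script (a sketch; the variable names are mine):

```python
import math

# Worked example: integral of 1/(5+3x) on [1, 2] with 4 subintervals.
f = lambda x: 1.0 / (5 + 3 * x)
a, b, n = 1.0, 2.0, 4
h = (b - a) / n                                           # 0.25

xs = [a + k * h for k in range(n + 1)]
It = h / 2 * (f(xs[0]) + f(xs[-1]) + 2 * sum(f(x) for x in xs[1:-1]))
exact = (math.log(5 + 3 * b) - math.log(5 + 3 * a)) / 3   # (1/3) ln(11/8)
abs_error = abs(exact - It)

# Bound: (b-a) h^2 / 12 * max|f''|; f''(x) = 18/(5+3x)^3 peaks at x = 1.
bound = (b - a) * h ** 2 / 12 * 18 / (5 + 3 * a) ** 3
```

The actual error (about 0.00012) indeed stays below the bound (about 0.00018).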
So then we will go for the 1/3 Simpson's rule: whenever we calculate the area of the
curve bounded by the x-axis and this area is approximated by a parabola, we can
obtain the 1/3 Simpson's rule.
Specifically, we divide the interval: in the earlier case we had 2 points and
only a single interval; here we will subdivide the interval into 2 parts,
considering the middle point as (a+b)/2.
So if we divide any interval a to b into 2 subintervals, then we can write
the midpoint as (a+b)/2.
And we will have 3 points, (a, f(a)), ((a+b)/2, f((a+b)/2)), and the third point (b, f(b)),
and we can write the step size h = (b-a)/2, with x0 = a, x1 = (a+b)/2, and x2 = b. So
in the slide you can see that the parabolic section approximates the
area bounded by the x-axis and the curve.
So (x0, f(x0)) is the first point, (x2, f(x2)) is the last point, and
the middle point is denoted as (x1, f(x1)).
(Refer Slide Time: 14:36)
So this means we are approximating the curve y = f(x) through the 3 points by the parabola joining
the points P, Q, R, and the curve is approximated by a polynomial of degree 2. If
it is approximated by a polynomial of degree 2, then we can use Newton's forward difference
formula for a quadratic polynomial through the points (x0, f(x0)), (x1, f(x1)) and (x2, f(x2)).
If you write Newton's forward difference formula for degree 2, then f(x) can be written
as f(x0) + pΔf(x0) + p(p-1)/2! Δ²f(x0). Since the function is approximated by a polynomial of
degree 2, the higher differences give zero values. In terms of x,
with the 3 points x0 = a, x1 = (a+b)/2 and x2 = b, the formula can be rewritten
as a to b ∫f(x)dx = x0 to x2 ∫[f(x0) + (x-x0)/h Δf(x0) + (x-x0)(x-x1)/(2!h²)
Δ²f(x0)] dx.
So integrating the first term gives (x2-x0)f(x0); for the second term, (x-x0)²/(2h)
comes out times f(x1)-f(x0), and x2-x0
gives 2h, since we are taking x0, x1, x2 equi-spaced with
spacing h. That is why we can write x1 = x0+h and x2
= x0+2h.
So integrating them, and putting in the final form all the values we have
derived, that is x2 = x0+2h and x1 = x0+h, we obtain h/3 [f(x0) + 4f(x1) + f(x2)].
(Refer Slide Time: 18:20)
So after a simplification you obtain this formula, and we can also write it
directly in terms of a and b: since h is expressed as (b-a)/2 (n
is 2 here),
we can write it as (b-a)/6 [f(a) + 4f((a+b)/2) + f(b)]; in terms of a and b
the formula can be represented in this form.
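As a quick sketch of this rule (the function name is mine), note that a single application already integrates quadratics and even cubics exactly, which anticipates the error discussion:

```python
# Simpson's 1/3 rule on a single pair of subintervals:
# (b-a)/6 [ f(a) + 4 f((a+b)/2) + f(b) ]
def simpson_13(f, a, b):
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

# Exact for x^2 on [0, 2] (integral 8/3) and x^3 on [0, 2] (integral 4):
quad_err  = simpson_13(lambda x: x ** 2, 0.0, 2.0) - 8.0 / 3
cubic_err = simpson_13(lambda x: x ** 3, 0.0, 2.0) - 4.0
```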
(Refer Slide Time: 19:06)
This is called the 1/3 Simpson's rule, and if you go for the error calculation of this Simpson's
rule, we have to check that the formula is exact for polynomials of degree 0, 1 and 2; this
means it can provide the exact value, that is, the integration minus the formulated value
we have derived is 0 for 1, x and x². In fact it gives the
exact result for all polynomials of degree less than or equal to 3.
So only for degree 4 do we get a nonzero error term for this function. If you put f(x) = 1
in the error approximation, you can see that R(f, x) = 0, as in our earlier calculation.
(Refer Slide Time: 20:04)
For f(x) = 1, the error term R(f, x) can be written as a to b ∫1 dx - (b-a)/6 [f(a) + 4f((a+b)/2) + f(b)];
writing 1 + 4 + 1, this gives the value 0. Similarly, if you
consider f(x) = x, then R(f, x) can be written as a to b ∫x dx - (b-a)/6 [a + 4(a+b)/2 + b], which
obviously also gives the value 0. Similarly, if you consider f(x) = x², this can also
be written as R(f, x) = a to b ∫x² dx - (b-a)/6 [a² + 4((a+b)/2)² + b²], which also gives the value 0.
If you consider f(x) = x³: although one might expect the first nonzero error at degree
3, for degree 3 we also get the value 0. That is why we
have to consider the immediate next term for the error, that is f(x) =
x⁴. Putting f(x) = x⁴, we can compute the error constant C,
that is C = a to b ∫x⁴ dx - (b-a)/6 [a⁴ + 4((a+b)/2)⁴ + b⁴].
And if you do some simplifications we can obtain this value as C = -(b-a)⁵/120. Then the total
error term can be written as R(f, x) = C/(p+1)! f(p+1)(ζ).
So p = 3 is the highest degree for which we get the exact
polynomial result with this formula. That is why we consider
p as 3 here, and C/(p+1)! = C/4!, giving R(f, x) = C/4! f(4)(ζ), where ζ should lie between a and b; so
putting in C, this is -(b-a)⁵/(120 · 4!) f(4)(ζ).
And this total term gives -h⁵/90 f(4)(ζ). The bound for this error, taking
positive values, is |R(f, x)| ≤ (b-a)⁵/2880 ·
max{|f(4)(x)|}, where x should lie between a and b. Since Simpson's 1/3 rule gives the exact
result for polynomials of degree less than or equal to 3, the method is said to be of order 3;
the factor h/3 in the formula is why it is called the 1/3 Simpson's rule.
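The constant -(b-a)⁵/120 can be verified numerically with f(x) = x⁴ (a sketch; names are mine):

```python
# For f(x) = x^4, C = integral - Simpson formula should equal -(b-a)^5/120;
# dividing by 4! then gives the error coefficient -(b-a)^5/2880.
def simpson_13(f, a, b):
    return (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

a, b = 0.0, 1.0
C = (b ** 5 - a ** 5) / 5 - simpson_13(lambda x: x ** 4, a, b)
# With b - a = 1, C should come out as -1/120.
```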
(Refer Slide Time: 24:45)
As in the case of the trapezoidal rule, if the length of the interval [a, b] is large, then
b-a is also large, so the error term in expression 5 is also large. In this case the interval [a, b] is
subdivided into a number of subintervals of equal length and we simply apply Simpson's 1/3 rule to
each of the subintervals to evaluate the integration; this is basically called
the composite Simpson's rule. Basically, in the trapezoidal rule we considered a single interval,
and for the 1/3 Simpson's rule we considered 2 subintervals.
So if you compare these two methods, the trapezoidal rule and the 1/3 Simpson's rule,
then you can visualize that the error will be larger in the trapezoidal rule compared to the 1/3 Simpson's
rule. This means that if the 1/3 Simpson's rule obtains the result in maybe around 6
steps, the trapezoidal rule requires near about 12 steps to get a comparable result.
So the conclusion is that whenever we subdivide the domain into more
intervals, we get more accurate results compared to fewer
divisions of the interval. In the trapezoidal rule we used N = 1, and if we
extend that to N = 2, the error is reduced. That is why it
is always advisable to choose smaller step sizes to get less error. Thank you for
listening to this lecture.
Numerical Methods
By Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 34
Numerical Integration Part 4
Welcome to the lecture series on numerical methods. Last class we discussed numerical
integration based on the trapezoidal rule, the composite trapezoidal rule, and Simpson's rule. Today
we will go for the discussion of the integration method based on the composite Simpson's 1/3
rule and some examples based on the composite Simpson's 1/3 rule, and then Simpson's 3/8
rule, followed by some examples of the 3/8 Simpson's rule.
So last class we discussed the 1/3 Simpson's rule, and there I wrote the
formula of the 1/3 Simpson's rule as a to b ∫f(x) dx = h/3 [f(a) + 4f((a+b)/2) + f(b)], where h is defined
as (b-a)/2, since we divide the total
domain into 2 subparts: if our starting point is a = x0, then the middle point we
assume as (a+b)/2, and the last point b is x2, so the middle point is x1.
Then we can rewrite this formula in the form h/3 [f(x0) + 4f(x1) + f(x2)]. If
you go for the composite formula for the 1/3 Simpson's rule, then since each application uses
2 subintervals, the total interval should be divided so that there are 2n+1 nodal
points, and then we can use Simpson's 1/3 rule on each pair. For the step size of the
composite Simpson's 1/3 rule we have to divide the total interval into 2n subintervals,
so h = (b-a)/2n, since the total number of tabular points is 2n+1.
And the nodal points can be chosen as a = x0, then x1, then x2, then x3, and so on, since all
are equi-spaced; likewise the last point can be written in the form x0+2nh,
that is x2n = b. That is why we consider 2n+1 points
here: a = x0, then x1 = a+h = x0+h, then x2
= x0+2h, and likewise the last point, since there are 2n+1 points,
can be written as x2n = x0+2nh = b.
So if you write the 1/3 Simpson's rule in composite form, then we can write the integration in
the form a to b ∫f(x)dx, starting from x0, as x0 to x2 ∫f(x)dx + x2 to
x4 ∫f(x)dx + … and likewise the last piece x2n-2 to x2n ∫f(x)dx, since two
subintervals are required at a time: if the points run from x0 to
x2n, we subdivide this domain equally into 2n parts.
That is why, since the starting piece is x0 to x2, then x2 to x4, the last piece
goes from x2n-2 to x2n, and using the 1/3 Simpson's rule in each of these pieces
is what is especially called the composite Simpson's rule.
So then, applying the 1/3 Simpson's rule in each of these pieces, the formula can
be written as: in the first piece, x0 to x2 ∫f(x)dx = h/3 [f(x0) + 4f(x1) + f(x2)].
Similarly, in the second piece, x2 to x4 ∫f(x)dx = h/3 [f(x2) +
4f(x3) + f(x4)]. Likewise the last piece can be written in the
form x2n-2 to x2n ∫f(x)dx = h/3 [f(x2n-2) + 4f(x2n-1) + f(x2n)].
So if you add all these terms, then we obtain the composite Simpson's rule in the form
a to b ∫f(x)dx = h/3 [f(x0) + f(x2n) + 4{f(x1) + f(x3) + … + f(x2n-1)} + 2{repeated terms}].
The factor 2 multiplies the repeated terms: f(x2) appears in both the first and second
pieces, then f(x4) appears in the second and third, and so on. So we write
twice the even-indexed interior terms, that is 2{f(x2) + f(x4) + … + f(x2n-2)}.
This is basically called the composite 1/3 Simpson's formula. And for the error
computation of the composite 1/3 Simpson's rule: last class we derived the error for
Simpson's 1/3 rule, and as with the earlier composite trapezoidal rule, in each pair of
subintervals we will have a maximized error of the form -h⁵/90 f(4)(ζ).
So the first piece gives the error -h⁵/90 f(4)(ζ1), the second one -h⁵/90 f(4)(ζ2),
and so on. Likewise, taking the error term in each of these pieces
and adding all these error terms, we get -h⁵/90 [f(4)(ζ1) + f(4)(ζ2) + …
+ f(4)(ζn)], since we divide the 2n+1 points into 2n
subintervals, and each pair of subintervals (2 points within 2 points)
gives one error term, so the total number of error terms is n.
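The composite formula above can be written as a short function (a sketch; the names are mine); here n is the number of pairs, so there are 2n subintervals:

```python
def composite_simpson(f, a, b, n):
    """Composite Simpson's 1/3 rule over 2n equal subintervals."""
    m = 2 * n                                      # total subintervals, 2n
    h = (b - a) / m
    x = [a + k * h for k in range(m + 1)]
    odd  = sum(f(x[k]) for k in range(1, m, 2))    # f(x1), f(x3), ..., f(x_{2n-1})
    even = sum(f(x[k]) for k in range(2, m, 2))    # f(x2), f(x4), ..., f(x_{2n-2})
    return h / 3 * (f(x[0]) + f(x[-1]) + 4 * odd + 2 * even)

# Still exact for cubics, piece by piece (integral of x^3 on [0, 2] is 4):
cubic_check = composite_simpson(lambda x: x ** 3, 0.0, 2.0, 3) - 4.0
```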
That is why the last error term comes out as f(4)(ζn). And in composite
form, for the maximum bound of the error, writing the absolute value |R(f, x)|,
this should be less than or equal to h⁵/90 times the sum of
all the terms f(4)(ζ1) + f(4)(ζ2) + … + f(4)(ζn) in absolute value, where
the number of terms is N = 2n/2 = n.
So, bounding each term by the maximum and taking all these terms at a time, it can be written as |R(f, x)|
≤ Nh⁵/90 times the maximum of |f(4)(x)| occurring between a and b, that is,
max{|f(4)(x)|}, where x should lie between a and b.
(Refer Slide Time: 11:43)
And obviously, sometimes we can write Nh = (b-a)/2, since the step size
is defined as h = (b-a)/2n; so replacing Nh = (b-a)/2 we can write the bound
as h⁴/180 (b-a) max{|f(4)(x)|} for x between a and b. This is the
representation of the error in the composite Simpson's 1/3 rule, and it can be observed from the
expression of the error that the composite 1/3 Simpson's rule is of order 3, since the fourth-order
derivative appears in the error term.
As we have already explained, if the error term is written in the form
a to b ∫f(x)dx - k=0:n ∑λkf(xk) = C/(p+1)! f(p+1)(ζ), then p should be
the degree of the polynomial up to which the rule gives the exact result. Since for degree 3
we exactly get 0 at the integration level, the order of this method
can be considered as 3, and p+1 = 4 here.
So next we will go for an example of this Simpson's 1/3 rule and its composite form. Suppose
the example is given as: find the approximate value of the integral I = ∫_0^1 dx/(1+x) using
Simpson's 1/3 rule with 8 equal subintervals. So if you use these 8 subintervals, then we have
2N = 8, that is, N = 4.
And then we can define h in the form h = (b-a)/(2N); since the integration range has a = 0 and
b = 1, it can be written as (1-0)/8, that is, 1/8. The nodal points can then be written as x0 = 0,
x1 = 1/8 = 0.125, x2 = 0.25, x3 = 0.375, x4 = 0.5, x5 = 0.625, x6 = 0.75, x7 = 0.875, x8 = 1.0.
And then the tabular values for the function f(x) = 1/(1+x) at these nodes are:

x:    0.0   0.125     0.25  0.375     0.5       0.625     0.75      0.875     1.0
f(x): 1.0   0.888889  0.8   0.727273  0.666667  0.615385  0.571429  0.533333  0.5

since at x = 0 we have f(x) = 1, and at x = 1.0 it is obviously 1/(1+1), which gives 1/2, that is, 0.5.
So if you use the Simpson's 1/3 formula, this integration can be written in the form
h/3 [f(x0) + f(x8) + 4{f(x1) + f(x3) + f(x5) + f(x7)} + 2{f(x2) + f(x4) + f(x6)}],
where the odd-indexed nodes take the coefficient 4 and the interior even-indexed nodes take
the coefficient 2. So if you put in the functional values at the corresponding nodal points
defined above, then you obtain the value 0.693155.
And if you see the exact value, I_Exact is nothing but ∫_0^1 dx/(1+x), so it can be represented in
the form ln(1+x)|_0^1, and this is nothing but ln(2), since ln(1) is 0; this value comes out as
0.693147, while the Simpson's value, as computed above, comes out as 0.693155. And if you take
the difference I_Exact - I_S, we obtain 0.693147 - 0.693155.
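The whole computation can be reproduced with a minimal composite Simpson's 1/3 routine (a sketch; the function name `simpson13` is our own):

```python
import math

def simpson13(f, a, b, n):
    """Composite Simpson's 1/3 rule with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    x = [a + i * h for i in range(n + 1)]
    s = f(x[0]) + f(x[n])
    s += 4 * sum(f(x[i]) for i in range(1, n, 2))   # odd-indexed nodes
    s += 2 * sum(f(x[i]) for i in range(2, n, 2))   # interior even-indexed nodes
    return h / 3 * s

approx = simpson13(lambda x: 1 / (1 + x), 0.0, 1.0, 8)
print(approx)                          # about 0.693155
print(abs(math.log(2) - approx))       # about 8e-6
```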
So this gives a magnitude of 0.000008, so the error is very small in this case. Next we will go for
the Simpson's 3/8 rule. In this method f(x) is approximated by a cubic polynomial: in the first
case, the trapezoidal rule, we wrote that f(x) is approximated by a linear polynomial; in the
second case, Simpson's 1/3 rule, we took f(x) approximated by a quadratic polynomial; so for
Simpson's 3/8 rule we will consider that f(x) is approximated by a cubic polynomial.
To construct this cubic polynomial, 4 nodal points are required, say x0, x1, x2 and x3; hence the
interval is subdivided into 3 equal parts to obtain the 4 nodal points.
So for that we have to consider h = (b-a)/3, and then the nodal points can be written as x0 = a,
x1 = x0 + h, x2 = x0 + 2h, x3 = x0 + 3h. So we are defining 3 subintervals, each of width h, with
endpoints a = x0, x1, x2, x3 = b.
Now, if you go for Newton's forward difference formula as in the earlier cases, the cubic
polynomial approximation for f(x) interpolating the points (x0, f(x0)), (x1, f(x1)), (x2, f(x2)) and
(x3, f(x3)) can be written as
f(x) = f(x0) + ((x-x0)/h) Δf(x0) + ((x-x0)/h)((x-x1)/h) (1/2!) Δ²f(x0)
     + ((x-x0)/h)((x-x1)/h)((x-x2)/h) (1/3!) Δ³f(x0).
Then we can obtain the formula for this 3/8 Simpson's rule. We take the integration over the
range a to b, that is, ∫_{x0}^{x3} f(x) dx; if you integrate the polynomial above, then you obtain
the formula 3h/8 [f(x0) + 3 f(x1) + 3 f(x2) + f(x3)]. This expression is basically called the 3/8
Simpson's rule, and all of these nodes can be expressed in terms of x0 and h, since x1 can be
written as x0 + h, x2 as x0 + 2h and x3 as x0 + 3h.
And for the error approximation for this Simpson's 3/8 rule, if you go through the same steps as
our earlier computation, then we obtain R(f, x) = -(b-a)^5/6480 · f^(4)(ζ), where ζ lies between a
and b; in terms of h this can be written in the form -(3h^5/80) f^(4)(ζ), with h = (b-a)/3.
And from this expression it also follows that the order of approximation (degree of precision) is
again 3, since when f(x) is a polynomial of degree less than or equal to 3, R3(f, x) gives the
value 0.
And this shows that the error increases as (b-a) increases. In the composite case the interval
[a, b] is subdivided into subintervals so that the total number of points can be represented as
3k+1; earlier it was 2N+1, since there we considered the subintervals 2 at a time. Here we take 3
subintervals at a time, so the total number of points is n = 3k+1, and then we can simply apply
the 3/8 Simpson's rule repeatedly.
Then, if you go for the composite Simpson's 3/8 rule, the total number of points can be taken as
3k+1, with h = (b-a)/(3k), or we can say 3k = n is the total number of subintervals. All of these
nodal points can be represented as a = x0 and b = x0 + 3kh = x_{3k}, which is usually written as
xn; so if you write x0 = a, then x_{3k} = xn = b, and the last point can be represented as x0 + 3kh.
And in composite form, this 3/8 Simpson's rule can be written as
∫_a^b f(x) dx ≈ 3h/8 [f(x0) + f(xn) + 3{f(x1) + f(x2) + f(x4) + f(x5) + ... + f(x_{n-2}) + f(x_{n-1})}
+ 2{f(x3) + f(x6) + ... + f(x_{n-3})}].
If you see here, in each group of 3 subintervals the 2 interior points, such as x1 and x2, take the
coefficient 3, while the common endpoint f(x3) of two adjacent groups takes the coefficient 2,
since when you apply the rule again the starting point of the next group is x3. That is why the
coefficient-3 points go up to f(x_{n-1}) and the coefficient-2 points go up to f(x_{n-3}). This is
basically called the composite Simpson's 3/8 rule.
Similarly, the error can be computed: since we are considering 3 subintervals at a time, the n
subintervals can be divided into k groups, with one maximized error term per group. So the final
error bound can be written in the form |R(f, x)| ≤ (3kh^5/80) max{|f^(4)(x)|}, where x lies
between a and b.
So, using this 3/8 rule, let us try to solve one problem. The problem statement can be written as:
find the approximate value of I = ∫_1^2 dx/(5+3x) with 3 and 6 subintervals.
So here we can consider n = 3 first and then n = 6; we have a = 1, b = 2, so h can be written as
(b-a)/n, that is, (2-1)/3 = 1/3. Then we can use the 3/8 formula directly, which is easier to
follow: I ≈ 3h/8 [f(x0) + 3 f(x1) + 3 f(x2) + f(x3)], where x0 = 1 and x1 can be considered as
1 + 1/3 = 4/3, so that we can write it as 3h/8 [f(1) + 3 f(4/3) + 3 f(5/3) + f(2)].
From this you can obtain all of the functional values: once you write f(x) = 1/(5+3x), from there
itself you can get f(1), f(4/3), f(5/3) and f(2), and putting these values in, you obtain the result.
And if you go for the calculation with 6 subintervals, then you have to repeat the same pattern,
adding one more group of 3 subintervals to get the final answer. If you go for the exact value,
the exact computation gives 0.10615; with 3 subintervals we get 0.10616, while with 6
subintervals, that is, 2 groups of 3 at a time, we obtain 0.10615.
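Both computations can be reproduced with a short composite 3/8 routine (a sketch; the function name `simpson38` is our own):

```python
import math

def simpson38(f, a, b, n):
    """Composite Simpson's 3/8 rule; n must be a multiple of 3."""
    if n % 3:
        raise ValueError("n must be a multiple of 3")
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        # every third interior node (a group boundary) gets weight 2,
        # the other interior nodes get weight 3
        s += (2 if i % 3 == 0 else 3) * f(a + i * h)
    return 3 * h / 8 * s

f = lambda x: 1 / (5 + 3 * x)
exact = math.log(11 / 8) / 3          # antiderivative is ln(5+3x)/3
print(simpson38(f, 1.0, 2.0, 3))      # about 0.106155
print(simpson38(f, 1.0, 2.0, 6))      # about 0.106151
print(exact)                          # about 0.106151
```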
And if you compute the magnitude of the error, for n = 3 we get 0.00001, while for n = 6 the
value is correct to at least 5 decimal places. So the conclusion is that if we use a larger number
of subintervals, the computed (numerical) solution agrees with the exact solution to more digits,
while if we use fewer subintervals, the error increases. Thank you for listening to this lecture.
Numerical Methods
By Dr. Ameeya Kumar Nayak
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 35
Numerical Integration Part 5
Welcome to the lecture series on numerical methods; currently we are discussing numerical
integration. In the last lectures we discussed numerical integration based on the trapezoidal
rule, Simpson's 1/3 rule and Simpson's 3/8 rule, and today we will go for numerical integration
based on quadrature rules, specifically the Gauss Legendre integration methods. So first we will
go for the Gauss Legendre integration rule in its one-point formula, then the two-point formula,
then the three-point formula, and then we will go for the error analysis of the Gauss Legendre
integration rule.
So if you go for the Gauss Legendre integration methods: in the beginning of these lectures we
discussed that an integration is usually written in the form ∫_a^b f(x) dx, and the strict rule for
the Gauss Legendre integration method is that we have to transform the integration range from
a to b to -1 to 1. So first we have to do that transformation, and once this transformation has
been made, then we can go for the series expansion, that is, a linear combination of the
functional values at different nodal points.
So this means that once we transform the integration range to ∫_{-1}^{1} f(x) dx, we can
represent it in the linear combination form Σ_{k=0}^{n} λk f(xk), or, written out,
λ0 f(x0) + λ1 f(x0+h) + λ2 f(x0+2h) + ... + λn f(x0+nh). The approach of this integration rule is
the method of undetermined coefficients: λ0 is not known to us, λ1 is not known to us, λ2 is not
known to us, and the nodes x0, x0+h, and so on are not known to us either.
Even though these values are not given, we can make the rule exact for polynomial
approximations of the integration and thereby determine both the coefficients and the nodal
points.
(Refer Slide Time: 3:17)
So first we will go for the Gauss Legendre one-point rule, which is usually written in the form
∫_{-1}^{1} f(x) dx ≈ λ0 f(x0); since we are using a single node x0 here, it is called the one-point
formula, and it turns out to be exact for polynomials of degree up to 1.
And if you see here, λ0 ≠ 0, so we have 2 unknowns, λ0 and x0. If we want to determine these
values for λ0 and x0, then we make the rule exact for polynomials up to some order p; the error
constant can then be written in the form c = ∫_{-1}^{1} x^{p+1} dx - Σ_{k=0}^{n} λk xk^{p+1},
since for a polynomial of degree greater than p the rule is no longer exact and an error exists at
that point.
And this error term, Rn(f), is usually represented in the form c/(p+1)! · f^(p+1)(ζ), where ζ lies
between a and b. Now, to calculate the coefficients λ0 and x0, we make this formula exact,
meaning the error term should be 0; as I have already explained, we consider the formula exact
for polynomials of degree 0, 1, 2, up to p, which means we obtain the value 0 for each of them.
So we make this formula exact for f(x) = 1 and x, since in general we consider 1, x, x², ..., x^p,
and we have already considered that p = 1 here, that is, the function is approximated by a
polynomial of degree 1. That is why we consider two test functions: the first is x^0, that is, 1, and
the second is x^1, that is, x.
So if you put these test functions into the integration: for f(x) = 1 we have ∫_{-1}^{1} 1 dx = 2,
and equating this to λ0 · 1 gives λ0 = 2. For the second one, f(x) = x, we have ∫_{-1}^{1} x dx = 0,
and this equals λ0 f(x0) = λ0 x0.
So, since λ0 = 2 here, which is not equal to 0, we must have x0 = 0. The total formulation for this
one-point formula can therefore be represented as ∫_{-1}^{1} f(x) dx ≈ 2 f(0), and this error
constant c can be written, following the form above, from the immediate next power x^{p+1} = x²:
c = ∫_{-1}^{1} x² dx.
So that gives c = 2/3, and the error term Rn(f) = c/(p+1)! · f^(p+1)(ζ) can be written as
(2/3)(1/2!) f″(ζ); the 2s cancel out, so this can be written in the form (1/3) f″(ζ), where ζ lies
between -1 and 1. So this is the error term for the one-point Gauss Legendre integration
method.
So then we will go for the Gauss Legendre 2-point rule. For the 2-point form we can write
∫_{-1}^{1} f(x) dx ≈ λ0 f(x0) + λ1 f(x1); 2-point means we consider 2 nodes, so λ0 is the first
coefficient and λ1 is the second, where we require λ0, λ1 ≠ 0 and x0 ≠ x1.
And these should be determined so as to make the formula exact; we consider polynomials of
degree 0, 1, 2, 3, since we have 4 unknowns here: λ0, λ1, x0, x1. So we can make the formula
exact for polynomials up to degree 3, which is why the test functions can be written in the form
f(x) = x^0, x^1, x^2, x^3, or as 1, x, x², x³, for which the rule must give the exact value.
So if you use these test functions, we obtain: for f(x) = 1, ∫_{-1}^{1} 1 dx = λ0 + λ1, which implies
λ0 + λ1 = 2. If you use f(x) = x, then ∫_{-1}^{1} x dx = λ0 x0 + λ1 x1 = 0. Similarly, if you use
f(x) = x², then ∫_{-1}^{1} x² dx = λ0 x0² + λ1 x1², and this is nothing but 2/3. Similarly, if you use
f(x) = x³, this can be written as ∫_{-1}^{1} x³ dx = λ0 x0³ + λ1 x1³ = 0 again.
So if you solve these 4 equations for the 4 unknowns, then after some calculation we obtain
x1 = -x0 and λ0 = λ1 = 1. The calculation is simple: eliminating λ0 between the second and fourth
equations, we get λ1 x1³ - λ1 x1 x0² = 0, from which x1 = -x0; then from the first two equations
λ0 = λ1 = 1, and the third equation gives x0² = 1/3.
So that is why x0 can be written as ±1/√3, and since x1 = -x0, we can write this formula in the
form ∫_{-1}^{1} f(x) dx ≈ f(-1/√3) + f(1/√3), since λ0 and λ1 are both 1.
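A quick check of this rule (our own sketch): it should reproduce the exact integrals of 1, x, x², x³ over [-1, 1], which are 2, 0, 2/3, 0.

```python
import math

def gl2(f):
    """Two-point Gauss-Legendre rule on [-1, 1]: both weights are 1."""
    s = 1 / math.sqrt(3)
    return f(-s) + f(s)

# exact integrals of x^p over [-1, 1] for p = 0, 1, 2, 3
exact = [2.0, 0.0, 2 / 3, 0.0]
for p, val in enumerate(exact):
    print(p, gl2(lambda x: x**p), val)   # the two columns agree
```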
So if you go for the error constant here, the error constant can be written in the form
C = ∫_{-1}^{1} x⁴ dx - [1/9 + 1/9], and the final value of this calculation gives 8/45. Therefore, if
you calculate the error term Rn(f), following the earlier formulation it can be written as
C/4! · f^(4)(ζ), that is, 8/(45·4!) f^(4)(ζ) = (1/135) f^(4)(ζ), where ζ lies between -1 and 1.
Then we will go for the 3-point formula. Like the Gauss Legendre 2-point formula, for the 3-point
Gauss Legendre integration method we first transform the interval and write ∫_{-1}^{1} f(x) dx =
Σ_{k=0}^{n} λk f(xk). Considering 3 points, this can be written as λ0 f(x0) + λ1 f(x1) + λ2 f(x2),
where the coefficients λ0, λ1, λ2 are all nonzero and the nodes x0, x1, x2 are all distinct.
So that is why we will have 6 unknowns, λ0, λ1, λ2, x0, x1, x2, and for 6 unknowns we have to
impose exactness for polynomials up to degree 5 to get 6 equations; this means the rule provides
the exact result up to degree 5. So we make this formula exact for f(x) = 1, x, x², x³, x⁴, x⁵, since
we need 6 different test functions.
So if you put in all these test functions, we obtain: for f(x) = 1, ∫_{-1}^{1} 1 dx = λ0 + λ1 + λ2 = 2.
For f(x) = x, ∫_{-1}^{1} x dx = λ0 x0 + λ1 x1 + λ2 x2 = 0. For f(x) = x², ∫_{-1}^{1} x² dx = λ0 x0² +
λ1 x1² + λ2 x2² = 2/3. Similarly, for f(x) = x³, ∫_{-1}^{1} x³ dx = λ0 x0³ + λ1 x1³ + λ2 x2³ = 0; for
f(x) = x⁴, ∫_{-1}^{1} x⁴ dx = λ0 x0⁴ + λ1 x1⁴ + λ2 x2⁴ = 2/5; and for f(x) = x⁵, ∫_{-1}^{1} x⁵ dx =
λ0 x0⁵ + λ1 x1⁵ + λ2 x2⁵ = 0 also.
So if you solve these 6 equations for the 6 unknowns, then we can obtain the coefficients λ0, λ1,
λ2 and the nodes x0, x1, x2. In complete form, this formula can be written as
∫_{-1}^{1} f(x) dx ≈ (5/9) f(-√(3/5)) + (8/9) f(0) + (5/9) f(√(3/5)); this is the complete formula for
the 3-point Gauss Legendre integration method.
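As a sketch (our own check), this rule reproduces ∫ x^p dx over [-1, 1] exactly for every p ≤ 5, and for p = 6 it misses by exactly the error constant 8/175 derived just below:

```python
import math

def gl3(f):
    """Three-point Gauss-Legendre rule on [-1, 1]."""
    s = math.sqrt(3 / 5)
    return 5 / 9 * f(-s) + 8 / 9 * f(0.0) + 5 / 9 * f(s)

# exact integrals of x^p over [-1, 1]: 2/(p+1) for even p, 0 for odd p
for p in range(6):
    exact = 2 / (p + 1) if p % 2 == 0 else 0.0
    print(p, gl3(lambda x: x**p), exact)   # agree for all p <= 5

print(2 / 7 - gl3(lambda x: x**6))   # about 8/175 = 0.0457...
```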
So if you go for the error computation: since the formula provides the exact result for
polynomials of degree up to 5, the error constant is obtained from the polynomial of degree 6.
That is why C can be written as C = ∫_{-1}^{1} x⁶ dx - (1/9)[5(-√(3/5))⁶ + 8·0⁶ + 5(√(3/5))⁶], and
the total value comes out to 8/175.
Hence the total error term can be written as Rn(f) = C/6! · f^(6)(ζ), that is, 8/(175·6!) f^(6)(ζ),
where ζ lies between -1 and 1; this is the error term for the 3-point Gauss Legendre integration
method.
(Refer Slide Time: 22:15)
So the general method to transform an interval [a, b] to [-1, 1] can be done in an easy way: if you
write the transformation x = pt + q, then when x = a we have t = -1, so a = -p + q, and when x = b
we have t = 1, so b = p + q.
If you combine both these equations, we obtain p = (b-a)/2 and q = (b+a)/2. So the required
transformation, if we want to map [a, b] to [-1, 1], is x = ((b-a)/2) t + ((a+b)/2), and with it the
range from a to b is automatically transformed to an integral from -1 to 1.
(Refer Slide Time: 23:38)
So obviously, once you obtain this transformation of x in terms of t, you can write dx in terms of
dt, namely dx = ((b-a)/2) dt. Then you can write the transformed integrand as g(t), put
dx = ((b-a)/2) dt in there, and you get the integration over the new range.
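Putting the transformation and the 3-point rule together gives a quadrature routine for a general interval [a, b] (a sketch; the function name `gauss_legendre_3` is our own):

```python
import math

def gauss_legendre_3(f, a, b):
    """Three-point Gauss-Legendre on [a, b] via x = (b-a)/2 * t + (a+b)/2."""
    s = math.sqrt(3 / 5)
    nodes = (-s, 0.0, s)
    weights = (5 / 9, 8 / 9, 5 / 9)
    half, mid = (b - a) / 2, (a + b) / 2
    # dx = (b-a)/2 dt, so the Jacobian factor 'half' multiplies the sum
    return half * sum(w * f(half * t + mid) for w, t in zip(weights, nodes))

print(gauss_legendre_3(lambda x: x**2, 0.0, 3.0))   # exact: 9.0
```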
Suppose the question asked is: evaluate the integral ∫_0^1 dx/(1+x) using the Gauss Legendre
3-point formula and compare with the exact result. Then first we will transform the range [0, 1]
to [-1, 1], and then we can apply the 3-point formula there itself.
The first criterion is that we have to transform the integration range from [0, 1] to [-1, 1]; for
that we put x = ((b-a)/2) t + ((a+b)/2), which here gives x = ((1-0)/2) t + 1/2, and this can be
written as (t+1)/2. And dx can be written as dt/2, so the complete integral ∫_0^1 dx/(1+x) is
transformed to the range -1 to 1: the integrand 1/(1+x) becomes 1/(1+(t+1)/2), so we can write
the integral as ∫_{-1}^{1} [1/(1+(t+1)/2)] dt/2.
So this can be written as ∫_{-1}^{1} dt/(3+t). Then, with f(t) = 1/(3+t), we can now use the
formula ∫_{-1}^{1} f(t) dt ≈ (5/9) f(-√(3/5)) + (8/9) f(0) + (5/9) f(√(3/5)).
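Plugging the nodes into f(t) = 1/(3+t) can be sketched directly:

```python
import math

s = math.sqrt(3 / 5)
f = lambda t: 1 / (3 + t)                 # transformed integrand on [-1, 1]
I = 5 / 9 * f(-s) + 8 / 9 * f(0) + 5 / 9 * f(s)
print(I)                                  # about 0.693122
print(abs(I - math.log(2)))               # about 2.5e-5
```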
Since the function is known to us, we can directly put in these nodes and obtain the functional
values. For the exact solution, I_Exact = ln(2) = 0.693147, while the computed value comes out
as 0.693122, and if you go for the error calculation, the absolute error |I - I_Exact| comes out as
0.000025. Thank you for listening to this lecture on numerical integration in this course on
numerical methods.
Numerical Methods
By Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 36
Introduction to Ordinary Differential Equations
Hello everyone, welcome to the fourth module of this course from my side; it is the last module
of the overall course on numerical methods, and in this module we will learn numerical methods
for solving ordinary differential equations. So in this lecture I will introduce ordinary
differential equations to you: what we mean by the solution of an ordinary differential equation,
what types of ordinary differential equations we have, and what the conditions are for existence
and uniqueness of the solution of a differential equation.
And finally we will talk about why we need numerical solutions of ordinary differential
equations.
like y`, where y` is dy/dx, the first derivative of y with respect to x, then y`` and so on. So here x
is the independent variable, as I told you, and y is the dependent variable.
So the solution of this is a function y of x which satisfies this particular relation; such a relation
is called an ordinary differential equation. Now we have 2 terms for a given ordinary differential
equation: one is degree, the other is order. So first, what is order? The order of a differential
equation is the order of the highest derivative in the relation; for example, if I have a differential
equation whose highest derivative is of second order, it is a differential equation of second
order.
Then if I have another equation whose highest order derivative is of order 3, the order is again 3.
The other notion is degree: the degree of a differential equation is the power of the highest
order derivative in the equation, provided the equation has no radical sign on the dependent
variable or its derivatives. So, for example, the degree of the earlier differential equation is 1,
because the highest order derivative there is y`` with power 1 and there is no square-root type
term involving the dependent variable, and hence the degree is 1.
The degree of that differential equation is again 1. Now, if I have an equation of the form
y`` + √(y`) + y = 0, this differential equation has order 2, but here I cannot say that the degree is
1. Why? Because we have this square root of the first order derivative of the dependent variable
in the equation. First of all we have to remove this square root; so how can we remove it? Let us
move the radical to one side, √(y`) = -(y`` + y), and square both sides, so it becomes
y` = (y`` + y)², that is, y` = (y``)² + 2y y`` + y².
And hence after this simplification we get the equation (y``)² + 2y y`` + y² - y` = 0; here the
highest order derivative is again y`` and its power is 2, hence the degree of this differential
equation is 2. Now
furthermore, we can classify ordinary differential equations into two classes: one is linear
ODEs, the other is nonlinear ODEs.
If, in the differential equation, y and all its derivatives appear with degree 1 and there is no
product of the dependent variable with its derivative terms, then the equation is called a linear
differential equation.
For example, x² d²y/dx² + dy/dx + y = e^x. Here you can notice that the second order derivative
term has degree 1, the first order derivative term has degree 1, and y also has degree 1;
moreover, there is no cross-product term of the dependent variable or its derivatives. Hence it is
an example of a linear differential equation, and similarly for other equations of this form. If a
differential equation is not linear, then it is said to be a nonlinear differential equation.
(Refer Slide Time: 9:15)
For example, consider a differential equation containing a y³ term: here the dependent variable
y has degree 3, so the equation is nonlinear due to this term; moreover, its first term is the
product of y² with the second order derivative of y, which is again a nonlinear term, and hence it
is a nonlinear differential equation. Furthermore, if I take an equation containing (y``)², it is
nonlinear due to that squared term, and a term like 4 log y is again not a linear term in the
dependent variable, hence it is also a nonlinear term.
So due to these 2 terms it is nonlinear. Linear differential equations are easy to solve, and we
have several analytical methods to find the solutions of linear differential equations. However,
in the case of nonlinear differential equations very few analytical methods exist; only if the
given equation has a specific form can we solve a nonlinear differential equation using an
analytical method.
Moreover, even when we can solve a nonlinear differential equation, we often get the solution
only in a closed form involving an integral, or in implicit form, and hence such solutions are not
very useful in real-life application scenarios. So what do we need to do? It is better to find some
alternative to such an analytical solution, and that is where we need numerical solutions.
(Refer Slide Time: 11:28)
However, before going into detail about numerical solutions, let me give some examples of
ordinary differential equations. The first example is very simple; it comes from the motion of a
simple pendulum.
So consider a simple pendulum of mass m hanging from a string of length l, fixed at a pivot
point. When the pendulum is displaced to an initial angle θ and released, it swings back and
forth periodically. So here the tangential force is -mg sin θ, where g is the acceleration due to
gravity, and the tangential acceleration is given by lα, where α is the second order derivative of
the angle with respect to time, that is, d²θ/dt², and l is the length of the string.
So I can write the tangential acceleration as l·d²θ/dt². By Newton's second law for rotational
systems, the tangential force should equal m times the tangential acceleration, so I can write
-mg sin θ = m·l·d²θ/dt²; m cancels, so we have the second order differential equation
d²θ/dt² = -(g/l) sin θ, which for small oscillations (sin θ ≈ θ) becomes the linear equation
d²θ/dt² = -(g/l) θ.
We can solve this linearized differential equation analytically, and the solution is given as
θ at any time t equals θ0 cos(ωt + φ), where θ0 is the initial angle, φ is a phase constant, and
ω = √(g/l) is the natural frequency of the motion of the simple pendulum.
Similarly, we can take another physical example of an ordinary differential equation, coming
from exponential growth and decay.
It is taken from mathematical biology, or biomathematics: although a population of humans,
animals or bacteria consists of individuals, the aggregate behavior can be effectively modeled
by a dynamical system that involves continuously varying variables.
So, for example, if we assume that at any time t the number of people is N(t), it satisfies a first
order differential equation of the form dN/dt = ρN. Here ρ denotes a proportionality factor given
as β - δ, where β, a non-negative number, is the birth rate, while δ, again a non-negative number,
is the death rate. So if β exceeds δ then ρ is positive and the population is increasing; if ρ is
negative then the death rate is greater than the birth rate and the population is decreasing.
The solution of this is given by N(t) = N0 e^{ρt}; again it is an explicit analytical solution that we
can obtain using the variable-separable method, and here N0 is the initial population, that is, the
population at time t = 0. And from this solution we can see that if ρ is positive the population
increases exponentially, while if ρ is negative the population decreases exponentially.
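The model can be sketched in a couple of lines; the parameter values below are our own illustrative choices, not from the lecture:

```python
import math

def population(t, n0, beta, delta):
    """N(t) = N0 * exp(rho * t), with rho = beta - delta (birth minus death rate)."""
    return n0 * math.exp((beta - delta) * t)

print(population(10.0, 1000.0, 0.03, 0.01))   # rho > 0: grows, about 1221.4
print(population(10.0, 1000.0, 0.01, 0.03))   # rho < 0: decays, about 818.7
```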
Now, whenever we solve a differential equation analytically, as in the two examples I have taken:
one is the population model, where the solution comes out as N(t) = N0 e^{ρt}, while the other
example, from the motion of the simple pendulum, had the solution θ = θ0 cos(ωt + φ). In both
cases we have some constants in the solution. How do we find the values of those constants?
These are the general solutions, and for a particular problem we need a particular solution, and
for finding the particular solution we need conditions.
So the general solution of any nth order differential equation will contain arbitrary constants. In
the population model equation we had only a first order ODE and only one constant, N0.
Similarly, for a second order equation we will have two arbitrary constants in the general
solution. So if you have an nth order ODE, then you will have n arbitrary constants in the
general solution, and to eliminate these constants and get a particular solution we need some
conditions.
So there are two types of conditions in the case of ordinary differential equations: one is called
an initial condition, the other a boundary condition. If initial conditions are given to you, the
differential equation together with the initial conditions is called an initial value problem.
However, if boundary conditions are given, then the differential equation with those boundary
conditions is called a boundary value problem.
So let us take some examples of this. In an initial value problem all the conditions are prescribed at a
single point, that is, the values of y, y', y'', and if the differential equation is of nth order then we will be
having initial conditions up to the (n-1)th order derivative. And those initial conditions are given
at some point x = a, and in this case we require the solution in the domain x > a,
because x = a is the initial point, so we have to move to the right side of that particular point.
It means the solution domain is open; after a it may be any point. So this is an example of a
first order initial value problem: here the differential equation is dy/dt = -2ty with y(0) = 1 and t > 0. So the
initial condition is that y at t = 0 is 1, this is the differential equation, and together it is called an
initial value problem. Similarly, here is an example of a second order initial value problem:
d2y/dt2 + 5 dy/dt + 4y = 8t2. The two initial conditions are given as y at t = 0 is 1 and y' at t = 0 is -7,
and t > 0 is the domain of t.
On the other hand, in boundary value problems the conditions are prescribed at two or more
points, usually at two, say x = a and x = b where b is always greater than a. In this case the solution is
required in the domain x belonging to [a, b], which is bounded. So, for example, if I take this second order
differential equation d2y/dx2 + y = 0, here the domain is the closed interval from 0 to π, with y(0)
= 1 and y(π) = 1.
Now we will talk about the existence and uniqueness of the solution for a given differential
equation in a theoretical sense: when does the solution exist, if it exists whether it is unique
or not, and in which interval, that is, what is the interval of the existence or
uniqueness of the solution.
So again we will take a first order initial value problem y' = f(x, y), y(x0) = y0.
Now suppose this f is a continuous function in some region R in the two dimensional
domain, which contains all the points (x, y) with |x - x0| ≤ a and |y - y0| ≤ b, where a, b are positive.
So the rectangle is centered at (x0, y0). Since f is continuous and R is a closed
rectangle, f is bounded in R; bounded means there exists a K such that
|f(x, y)| ≤ K for all (x, y) belonging to R.
If this is true, then we will say that the initial value problem y' = f(x, y) together with the
condition y(x0) = y0 has at least one solution y = y(x) defined in the interval |x - x0| ≤ α, where α
= min{a, b/K}. So please note that here the interval may become smaller for the existence of the
solution.
(Refer Slide Time: 22:55)
So for uniqueness, suppose that f and ∂f/∂y are continuous functions in R; hence both f and ∂f/∂y are
bounded in R. This means |f(x, y)| is bounded by K and |∂f/∂y| is bounded by L in R.
Then the initial value problem has at most one solution. So condition 1 is giving existence of the
solution, the second is giving uniqueness, because it is saying at most one solution y = y(x) in the
interval, again |x - x0| ≤ α where α = min{a, b/K}. So it means, as I told you, if I combine it with the
existence theorem, there exists a unique solution, provided f is continuous and
∂f/∂y is continuous in the closed rectangle.
However, we can replace the condition that ∂f/∂y is continuous and bounded in the given
rectangular domain by a weaker condition.
And the weaker condition is called the Lipschitz condition. So instead of continuity of ∂f/∂y we
can require that f is Lipschitz continuous, that is, |f(x, y1) - f(x, y2)| ≤ L|y1 - y2| for all (x, yi) belonging to R;
then the solution is unique.
Basically, if ∂f/∂y is continuous, this Lipschitz condition
will always hold, for a continuous partial derivative. However, the converse is not true. For example, if you
take f = x2|y| where |x| ≤ 1, |y| ≤ 1 is the domain, this particular function satisfies the Lipschitz
condition in this particular rectangular domain, however the function
does not have a partial derivative with respect to y, that is, ∂f/∂y does not exist at y = 0.
Again take this simple example: here y' = 4 + y2, y(0) = 1 and our rectangle is given by R where
|x| ≤ 1, |y - 1| ≤ 2. Since f and ∂f/∂y = 2y are both continuous, f is bounded, because the
maximum value of f in this R is 13 (y lies in [-1, 3], so y2 is at most 9). Therefore the solution exists in the interval |x| ≤ 2/13,
and why am I saying 2/13? Because here, according to the existence theorem, α = min{1,
2/13}, that is b/K, so 2/13.
And the solution will be unique because ∂f/∂y is also continuous in this rectangular domain.
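The arithmetic above can be checked with a few lines; a minimal sketch in Python, with the rectangle bounds a, b and the bound K taken from the example:

```python
# Rectangle |x| <= 1, |y - 1| <= 2 for y' = 4 + y**2, y(0) = 1
a, b = 1, 2
endpoints = [1 - b, 1 + b]             # y ranges over [-1, 3]
K = max(4 + y**2 for y in endpoints)   # max of f on R: 4 + 3**2 = 13
alpha = min(a, b / K)                  # existence interval is |x| <= alpha
print(K, alpha)                        # 13 and 2/13 = 0.1538...
```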
Similarly, we can say for this; however, for this particular initial value problem one solution is given
by y = ((x + c)/3)^3 and another solution is y = 0, so the solution exists but for this particular
example it is not unique. Now take this particular example: y' = 2y/x with y(x0) = y0. So here f(x,
y) is given by 2y/x and ∂f/∂y is 2/x; both of these functions are defined for all x except on the y axis.
So by the uniqueness theorem, for all x0 excluding the y axis there exists a unique solution defined in
an open interval around x0. Also, the solutions of this equation are y = cx2, where c is
any arbitrary constant. So you can check that all these solutions, when x is 0, give y = 0, so all these
solutions pass through the point (0, 0), but none of them passes through any point on the y axis
except the origin.
So the initial value problem with y(0) = 0 has infinitely many solutions, but if I
replace this 0 by some y0 which is a non-zero number, that is, any point on the y axis other than
the origin, then the initial value problem does not have any solution. Also, for all (x0, y0)
with x0 not equal to 0 the solution is unique.
Now the final thing I want to tell you about ordinary differential equations is how to reduce an
nth order differential equation into a system of n first order differential equations.
So I am having an nth order differential equation and I want to write it as a system of
first order differential equations; that we can do very easily, and then whatever numerical technique we
learn for a single first order equation will be applicable to a system of
first order ODEs also.
So for example, suppose you are having an nth order
differential equation, say d^n y/dx^n = f(x, y, y', ..., y^(n-1)). If I want to reduce it into a system of n first order differential
equations, then what will I do? First of all I will take dy/dx and call it z1. Then what will I
take? I will take dz1/dx, that is basically d2y/dx2, and this I will write as z2, and so on. Similarly,
dz2/dx I will write as z3, so finally I reach dzn-1/dx, and hence I will be having these n first
order differential equations. This dzn-1/dx is nothing but d^n y/dx^n, and that I can write in terms
of x, y, z1, z2, ..., zn-1, and hence I will be having a system of first order ODEs.
For example, if I take a second order linear differential equation, let us say y'' + 2y' + y = 0, and I want
to write it as a system of 2 first order differential equations, what will I write? I will write
y' = z1. So basically what I am having is dy/dx = z1 and
dz1/dx, that is basically d2y/dx2 = y'', and from the equation this becomes -2y' - y, that is, -2z1 - y.
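As a sketch of this reduction, assuming the second order equation is y'' + 2y' + y = 0 (which matches the right-hand side -2z1 - y derived here) and assuming illustrative initial conditions y(0) = 1, y'(0) = 0, the resulting system can be advanced like any first order ODE:

```python
# y'' = -2*y' - y rewritten as the first order system y' = z1, z1' = -2*z1 - y
def rhs(y, z1):
    return z1, -2 * z1 - y            # (dy/dx, dz1/dx)

y, z1, h = 1.0, 0.0, 0.1              # assumed initial conditions y(0)=1, y'(0)=0
for _ in range(3):                    # a few explicit Euler steps, just to advance it
    dy, dz1 = rhs(y, z1)
    y, z1 = y + h * dy, z1 + h * dz1
print(round(y, 4), round(z1, 4))      # 0.972 -0.243
```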
So here I am having a system of two first order ODEs in z1 and y which is equivalent to the second
order differential equation; similarly a third order differential equation can be reduced into a
system of three first order differential equations, and so on. So in this lecture we learnt a few basic
things about differential equations; in the next lecture we will talk about numerical methods
for ordinary differential equations, thank you very much.
Numerical Methods
By Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 37
Numerical Methods for ODE-1
Hello everyone, so as I told you in the previous lecture, most of the differential equations
that are associated with or coming from real life phenomena or real life processes
do not possess an analytical solution. So hence we require numerical methods to solve such
differential equations. Numerical methods give an approximate solution; that is, they will not give
a closed form type of solution where y is a function of x, rather they will give you the value
of y for the given x where you want to obtain your solution.
Suppose I am having an initial value problem, that is,
y' = f(x, y) together with a condition y(x0) = y0. Suppose you want to calculate the solution at
x = x1, which is x0 + h, where h is a small number. So it means by a numerical
method I will calculate the value of y at x = x1, and that will be a number.
Here we are having two types of numerical methods: one is called single step and the other one
is multi step. In single step numerical methods we use just the previous value for
calculating the value at the next step, just the previous step only. However, in case of multi-step
methods we use more than one previous step. So for example, suppose I want to calculate the value
of y at x = x1; here I will use the value of y(x0).
So if I am using only the value at x0 then it is a single step method. However, suppose I want to find
out y at x = xn and I am using y at xn-1 as well as earlier values like xn-2, then it is a multi-step method. In this
and the next couple of lectures we will do some single step methods, and in the
final lecture we will take a multi-step method. In this module we will take a few numerical
methods, for example Euler's method, Runge-Kutta methods and then a multi-step method, to solve
ordinary differential equations.
So in this lecture I will talk about Euler's method, but before going to Euler's method let me
introduce a semi analytical method which is called Picard's method. So what is Picard's method?
So Picard's method of successive approximation can be given like this: consider the initial value
problem dy/dx = f(x, y), y(x0) = y0, and the domain of x is x > x0. So integrating the
differential equation (1) from x0 to a general point x and using the initial condition y(x0) = y0, I
can write y(x) - y(x0) = ∫ from x0 to x of f(x, y) dx.
So the solution of this particular equation is obtained in an iterative manner according to the
scheme that the (n+1)th approximation of y will be y_(n+1)(x) = y0 + ∫ from x0 to x of f(x, y_n(x)) dx, where n
= 0, 1, 2, .... So for example, the first approximation of y will
become y1 = y0 + ∫ from x0 to x of f(x, y0) dx; the second approximation
will become y2 = y0 + ∫ from x0 to x of f(x, y1(x)) dx, and so on.
Since the solution is obtained as a function of x, the difference between two successive
approximations will also be a function of x, because in each successive approximation you will get
a function of x; hence the accuracy of the solution will depend on the value of x and the solution
obtained will be valid for a certain range of x. If you want to obtain an accuracy up to, let us
say, the order of δ, it will be valid for a particular domain of x; it may happen that this
particular solution is not valid for some other domain up to the given accuracy. So this particular
equation is called Picard's iteration formula, and as I told you it is not a numerical method, it is
something between numerical and analytical methods, so I can call it a semi analytic or
approximate analytical method.
So let us consider this particular problem and we will apply Picard's method to this
example.
So I am having the example dy/dx = x + y; it is a quite popular example and you will find this
example in many books. Here y0 = 1, that is, initially x is 0 and the value
of y(0) is 1. Now y1 is given as y0 + ∫ from x0 to x of (x + y0) dx, because my original
problem is y' = f(x, y).
So y1 = y0 + ∫ from 0 to x of f(x, y0) dx; y0 is 1 and the integrand is x + 1, so it is
1 + ∫ from 0 to x of (1 + x) dx = 1 + x + x2/2. Now the next iterate of y is given as y2 = 1 + ∫ from 0
to x of f(x, y1) dx, so the integrand will become x + 1 + x + x2/2, that is 1 + 2x + x2/2, so
when I put in the upper limit x and the lower limit 0, it will become 1 + x + x2 + x3/6, so
we will move in this way.
So if I put the value 0.1 for x, y1 is coming out as 1.105 while y2 is coming out as 1.110167. Then I calculate y3
as y3 = 1 + ∫ from 0 to x of f(x, y2) dx, so after putting all this in I am getting 1 + x + x2 + x3/3 + x4/24.
So when I put x = 0.1, I am getting y3, that is the third approximation of y at x = 0.1, as 1.110337,
while in the second approximation it was 1.110167. So it is correct up to the
third decimal place. If I take one more iteration, or one more approximation of y, then I am getting
a similar expression for y at the fourth iteration, 1 + x + x2 and so on.
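The iterates above can be reproduced mechanically. A sketch in plain Python, representing each iterate as a list of polynomial coefficients with exact fractions (f(x, y) = x + y and y0 = 1, as in the example):

```python
from fractions import Fraction as F

# Polynomials as coefficient lists [a0, a1, ...] meaning a0 + a1*x + a2*x**2 + ...
def integrate(p):
    """Integral from 0 to x of p(t) dt."""
    return [F(0)] + [c / (k + 1) for k, c in enumerate(p)]

def add(p, q):
    n = max(len(p), len(q))
    p = p + [F(0)] * (n - len(p))
    q = q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def evaluate(p, x):
    return sum(float(c) * x**k for k, c in enumerate(p))

y = [F(1)]                                          # y0(x) = 1
for n in range(3):
    # Picard: y_{n+1} = 1 + integral of (x + y_n(x)) from 0 to x
    y = add([F(1)], integrate(add([F(0), F(1)], y)))

print(y)                   # coefficients of y3: 1, 1, 1, 1/3, 1/24
print(evaluate(y, 0.1))    # about 1.1103375, matching y3(0.1) above
```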
(Refer Slide Time: 12:16)
So consider again the same initial value problem y' = f(x, y), y(x0) = y0. Using Euler's method
we will determine approximate numerical values of the unknown function y at points x
larger than x0, that is, after the initial point.
Let h be the step size, the distance between two consecutive points; we are assuming here
that the points are uniform. So my x1 will become x0 + h, x2 will become x1 + h, that is x0 + 2h,
and so on.
So by Euler's method we will find approximations of y at all those xi's: x0 is given, so x1, x2, up to
xn, like that. In that way we can find the value of y in a given interval at some uniform points.
We start with x0, then we will calculate y1, then using y1 we will calculate y2, using y2
we will calculate y3, and so on. The initial value problem y' = f(x, y), y(x0) = y0 tells us
that f(x, y) is basically nothing but the slope of the tangent line to the solution curve at the point (x, y),
because dy/dx = f(x, y).
At the point of tangency the tangent to any curve remains close to the curve. The given initial
value problem is well posed with the initial value of y at x0 known to us, so now we
will find the approximation of y at x = x1.
So basically, suppose this is the solution curve and the value
at x0 = 0 is given; I want to find it at point 0.1, at point 0.2, and so on, so let me
make the curve a bit smoother.
So this is the tangent line at (x0, y0). At this point this will be my numerical solution y at 0.1,
where the exact solution is this one. Now again at this point I will try to approximate the value
of y at x = 0.2, so it will go like this, while the exact values are these values.
So this is the idea of Euler's method. Mathematically we can write that the tangent line at x =
x0 is given as y = y0 + (x - x0)y'(x0); thus the approximation to the value of the solution at x = x1 is the
y coordinate at x = x1 on the tangent line.
So y at x1, that is the approximated value y1, is given by y0 + (x1 - x0)y'(x0), and as I told
you this particular approximation is valid only when x1 is very close to x0, means I am having a
very small h, a very small step size.
If I take a large step size, let us say I am having the value at 0 and I want to find it at, let us say,
0.5, so I will take a tangent here and the tangent line will go there; my original solution is this one
but I will get this one as the approximation, so I will get more error if the step size is large.
So once we calculate y1, that is basically y0 + h f(x0, y0), from the tangent line we will calculate y2.
So y2 will become y1 + h f(x1, y1); similarly y3 will become y2 + h f(x2, y2), so I will get a
sequence of iterations.
In general, y at any point xi+1 is given as yi+1 = yi + h f(xi, yi). Now what is the error in this approximation?
From the figure itself you can see this is the error at the first point, and this
is the error at the second point, that is, when x is 0.2, and so on.
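The recurrence yi+1 = yi + h f(xi, yi) can be written as a small reusable function; here it is tried on the earlier initial value problem y' = -2ty, y(0) = 1, with two steps of h = 0.1, purely as an illustration:

```python
def euler(f, x0, y0, h, n):
    """n explicit Euler steps: y_{i+1} = y_i + h * f(x_i, y_i)."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        y0 = y0 + h * f(x0, y0)
        x0 = x0 + h
        xs.append(x0)
        ys.append(y0)
    return list(zip(xs, ys))

# y' = -2*t*y, y(0) = 1: two steps give y(0.1) = 1.0, y(0.2) = 0.98
print(euler(lambda t, y: -2 * t * y, 0.0, 1.0, 0.1, 2))
```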
So, how to calculate a bound on this error? For this we will use the Taylor series: the
value of y(x0 + h), where x0 + h is basically x1, can be obtained by the Taylor series
expansion of y about x0, that is, y(x0 + h) = y(x0) + h y'(x0) + second and higher order terms.
If you see the first terms of this expression, the left hand side is y(x1) and the first two
right hand side terms give y0 + h f(x0, y0), that is
the Euler formula for approximating y1 from (x0, y0).
So it means the remaining part is the error term R. Here R is the remainder term and it is given by h2/2 y''(ξ),
where ξ is somewhere between x0 and x1.
So the error term is R = h2/2 y''(ξ). Because we know that dy/dx = f, d2y/dx2,
that is y'', will be f ', so I can write R as h2/2 f '(ξ). So, how to calculate f '? Because f is a function of
x and y, and y is again a function of x, by the chain rule
f ' = fx + fy dy/dx, and again dy/dx is f. So f ' = fx + fy f, and the error term, as I told you, will be h2/2 (fx + fy f)
evaluated at ξ, finally this particular thing.
(Refer Slide Time: 20:40)
So let us take an example and solve it using Euler's method. Here we
are having the initial value problem dy/dx = 1 + xy2, so here my function f(x, y) is 1 + xy2.
The given initial condition is y at x = 0 is 1, that is why y0 is 1; now I need to take a step size h =
0.1 and calculate the values of y1 and y2.
The value of y1 is the value of y at x = 0.1 and the value of y2 is the value of y at x = 0.2. If
we compare the given problem with the standard form of an initial value problem, then my f(x, y) is
1 + xy2. I also need to calculate y''(x), that is d2y/dx2 = d/dx f(x, y), and it becomes
∂f/∂x + ∂f/∂y dy/dx = fx + fy f = y2 + 2xy(1 + xy2). Why am I calculating this? Because in the
question they are asking to compute the error terms also, and for that I need y''(x).
Now, let us take the first step and calculate the value of y1, so here x1 is 0.1 and y1 = y0 + h f(x0, y0)
by Euler's formula; y0 is given as 1, so y1 = 1 + 0.1(1 + 0(1)2), which is 1.1.
Now if I compute the error, ε1 is given as h2/2 y''(ξ), where ξ is somewhere between 0 and 0.1. So
y''(ξ) is f '(ξ) and I have already calculated this, so it is 0.01/2 times the expression for
y''. So the maximum truncation error is 0.01/2 [(1.1)2 + 2(0.1)(1.1)(1 + 0.1(1.1)2)].
After simplifying this we are getting the maximum truncation error in y1, given by ε1,
and the numerical value of this is 0.00728.
Now I will calculate y at x = 0.2, that is my y2, so y2 = y1 + h f(x1, y1) = 1.1 + 0.1(1 + 0.1(1.1)2);
please note that for calculating y2 I am taking the values from the
previous step, so my x1 is now 0.1 and y1 is 1.1, which we have calculated from (x0, y0).
So after simplifying this I am getting the value 1.2121. So my y at x = 0.2 is 1.2121; the
truncation error in this step is given by ε2 = ε1(1 + h fy(x1, y1)) + h2/2 y''(ξ), where ξ is now
between 0.1 and 0.2.
So I have substituted all these values here, this expression for the first term plus this for
the second term of ε2, and after simplifying
it I am getting the value 0.0179.
So here my y1 is 1.1, the maximum truncation error in y1 is 0.00728, and then y2 is coming out as
1.2121 with the maximum truncation error at x2 being 0.0179.
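These two steps, including the truncation error bound for the first step, can be checked numerically; the bound is evaluated at (x1, y1), following the expression used above:

```python
h = 0.1
f   = lambda x, y: 1 + x * y**2                       # f(x, y)
ypp = lambda x, y: y**2 + 2 * x * y * (1 + x * y**2)  # y'' = fx + fy*f

y1 = 1.0 + h * f(0.0, 1.0)            # first Euler step: 1.1
e1 = h**2 / 2 * ypp(0.1, y1)          # truncation error bound, about 0.00728
y2 = y1 + h * f(0.1, y1)              # second Euler step: 1.2121
print(y1, round(e1, 5), y2)
```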
(Refer Slide Time: 25:31)
Now let us take another example, with f = x + 2y; the problem is the same form y' = f(x, y) and y0 is 0
here. If I apply Euler's method with step size 0.25, I want to calculate the value at 4
points between 0 and 1, that is at 0.25, at 0.5, at 0.75 and then finally at 1.
At x = 0.5, y2 is coming out using the Euler process as 0.0625. Similarly, when x3 is 0.75, y3 is
coming out as 0.21875, and finally at x4 = 1, y4 is coming out as 0.515625.
The true solution of the initial value problem is given by this particular expression, and if I
calculate the value at 0.25 by Euler's method we are getting 0, but from the exact solution it
is 0.037180.
By the Euler method at 0.5, I am getting y as 0.0625 while the exact value at 0.5 is
0.179570, and similarly we are having big errors at x = 0.75 and x = 1 also. So it means
Euler's method is having a large error when compared to the exact solution, and what is the reason behind
this? The reason is the step size, because here we are taking the step size as 0.25; if you decrease the step
size the error will decrease, and this happens always in Euler's method, however you have to do more
calculations.
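The comparison, and the effect of shrinking the step size, can be reproduced with a short script; the exact solution y = (e^(2x) - 2x - 1)/4 follows from solving the linear equation y' = x + 2y, y(0) = 0:

```python
import math

f = lambda x, y: x + 2 * y
exact = lambda x: (math.exp(2 * x) - 2 * x - 1) / 4   # analytic solution

def euler(h, n):
    x, y = 0.0, 0.0
    for _ in range(n):
        y += h * f(x, y)
        x += h
    return y

print(euler(0.25, 4), exact(1.0))   # 0.515625 vs about 1.0973: a big error
print(euler(0.05, 20), exact(1.0))  # smaller h brings the Euler value much closer
```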
So if we shrink the step size we will get a more accurate solution by using Euler's method. With
this I will stop this lecture; today we have learnt Picard's method and Euler's method.
In the next lectures we will talk about more accurate methods compared to Euler's: one
more lecture on the Euler's method and then the Taylor series method, where, unlike
Euler's method in which we take only terms up to first order in the Taylor series
approximation of y about x0, we will take the second order term also, so
that the error will reduce. So we will talk about those methods in the next lecture, thank you very much.
Numerical Methods
By Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 38
Numerical Methods - 2
Hello friends. So welcome to the third lecture of this module, and in this lecture we will
continue from the last lecture in which we introduced Euler's method. In Euler's
method we have seen that we need to reduce the step size for getting better accuracy, and
reducing the step size means you need to do more calculations. So in this lecture our
aim is to develop a whole family of numerical methods that can attain any order of accuracy,
unlike Euler's method where we are having an accuracy of order h.
So here we will improve on Euler's method in this lecture, and specifically I will talk about the
quadratic Taylor method and then I will tell you how we can generalize the Taylor method up to
any order. But in Taylor's method we need to know how to determine higher order derivatives
of the solution of a differential equation at a point, and this also we will explore in this
lecture: how to calculate higher order derivatives of the solution. So let us start with the
quadratic Taylor method.
(Refer Slide Time: 1:41)
So the quadratic Taylor method is based on a more accurate approximation,
using the second order derivative: in Euler's method we have taken
only up to the first order derivative, but here we are taking up to the second order derivative, so a
function can be approximated about a point x0 by the expression f(x0 + h) ≈ f(x0) + h f
'(x0) + h2/2 f ''(x0).
To describe the algorithm we need to specify how the numerical solution can be advanced
from a point (xk, yk) to a new point (xk+1, yk+1), where xk+1 = xk + h. The basic idea is to use
the above equation and compute yk+1 by the Taylor series expansion: yk+1 =
yk + h yk' + h2/2 yk''.
So in this expression you can see we are having yk' and also yk''. yk' can be given by
the differential equation from our initial value problem; however we need to calculate yk'',
and for calculating yk'' we will use the following lemma.
In this lemma we are having the function y' = f(x, y), that is the given differential
equation with initial condition y(x0) = y0. At the same time, suppose that the derivatives of f of
order p - 1 exist at the point (x0, y0); then the pth order derivative of the solution y(x) at x = x0
can be expressed in terms of Fp(x0, y0),
where Fp is a function defined by f and its derivatives of order less than p. So let us take an
example to get a better understanding of this lemma.
So in this example we are having the differential equation y' = x + y2 together with the initial
condition y(x0) = y0. Now let us write y' = f(x, y), or I will write F1(x, y). Now if I
calculate y'', that is, according to the previous lemma, F2(x, y), this is the
differentiation of this particular term with respect to x.
So it will be 1 for the x term and then 2y y', so that is 1 + 2y F1(x, y). Here you
can note that for calculating the second order derivative of y, that is y'', we need the
value of y as well as the value of F1(x, y), that is, derivatives of y of order
less than 2. And if I substitute the value of F1(x, y), that is x + y2,
it comes out as 1 + 2xy + 2y3. Similarly, we can calculate higher order derivatives; for example, if I
want to calculate y''', that will be F3(x, y), so differentiating 1 + 2xy + 2y3 it will become 2y + 2x y' + 6y2 y'.
Or, if I differentiate y'' = 1 + 2y y' directly,
it will be 2y y'' + 2(y')2. So basically y''' = 2y F2(x,
y) + 2 F1(x, y)2. I am using the value of y, I am using F1, I am using F2 for
calculating F3, according to the previous lemma, and by substituting all these values I can
get y'''. Now how to use the quadratic Taylor method?
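Before moving on, this chain-rule bookkeeping can be checked with symbolic software; a sketch using sympy, where F2 and F3 are just the names from the lemma:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')(x)
F1 = x + y**2                                        # y' = f(x, y)

# Differentiate, then replace every y' by F1 (the differential equation)
F2 = sp.diff(F1, x).subs(sp.Derivative(y, x), F1)    # y''
F3 = sp.diff(F2, x).subs(sp.Derivative(y, x), F1)    # y'''

print(sp.expand(F2))   # equals 1 + 2*x*y + 2*y**3
print(sp.expand(F3))
```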
For that again we will consider an example of an initial value problem given as y' =
f(x, y), that is also my F1(x, y), and it is given as x - 1/(1 + y), with initial condition
y(0) = 1. We want to solve this particular initial value problem on the interval [0, 1] with
step size h = 0.5 using the quadratic Taylor method.
So first of all we need to calculate the second order derivative of y for applying the quadratic
Taylor method, and y''(x) can be given as F2(x, y), that is basically
1 + y'(x)/(1 + y)2. So now, after that, what will I do?
I need to solve the differential equation y' = x - 1/(1 + y), y(0) = 1, on the
interval [0, 1] with h = 0.5. So here x0 is 0, x1 is 0.5, that is 0 + h, and x2
is 1. Initially y at x0, that is my y0, is 1. Now I need to calculate y at 0.5 and y at 1 using the
quadratic Taylor method.
So for doing this I will write y(x1) = y0 + h y'(0) + h2/2 y''(0). Now y0 is given
as 1. If I want to calculate y'(0), from here with x = 0 it is 0 - 1/(1 + 1), so it
is -1/2. y''(0) I can calculate from here: it is basically 7/8. So after putting these values,
y1 = 1 + 0.5·(-1/2) + (0.52/2)·(7/8), and after simplifying this I will get
the value 0.859375. So this is the approximation of y at x = 0.5.
Again, if I want to calculate y at 1, I will use y2 = y1 + h y'(x1) + h2/2 y''(x1). I am
having the value of y1, which is 0.859375. I will calculate the value of y'(x1) from this
formula, after putting in x = 0.5 and y = 0.859375, and similarly I can calculate
y'' with the help of this expression.
And then finally I will get the value, which comes out as approximately 0.9641. So
this is the overall procedure for implementing the quadratic Taylor method for solving an initial
value problem, and here we are getting more accuracy compared to Euler's method
without reducing the step size, that is, while taking large steps.
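The two steps can be verified numerically; a sketch, with F2 written as a function of x, y and y' as derived above:

```python
h = 0.5
f  = lambda x, y: x - 1 / (1 + y)             # y' = f(x, y)
F2 = lambda x, y, yp: 1 + yp / (1 + y) ** 2   # y'' in terms of x, y, y'

x, y = 0.0, 1.0
vals = []
for _ in range(2):                            # two steps: x = 0.5 and x = 1.0
    yp = f(x, y)
    y = y + h * yp + h ** 2 / 2 * F2(x, y, yp)
    x += h
    vals.append((x, round(y, 6)))
print(vals)                                   # [(0.5, 0.859375), (1.0, 0.9641)]
```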
So this is the curve of the approximate solution and the exact solution. Here you can see that
between 0 and 1 the two curves are quite similar, and hence we are getting a good
approximation with a larger step size using the quadratic Taylor method. Now let us talk about
Taylor methods of higher order.
(Refer Slide Time: 13:22)
The quadratic Taylor method is easily generalized to higher degrees by including more terms
in the Taylor polynomial. For example, the Taylor method of degree p uses the formula
yk+1 = yk + h yk' + h2/2 yk'' + ... up to the pth order derivative, that is, the last term will be (h^p/p!) yk^(p).
To advance the solution from the point (xk, yk) to the point (xk+1, yk+1) we will use this formula.
Just like for the quadratic method, the main challenge is the determination of the derivatives.
How to determine higher order derivatives? For that we can use the same lemma,
however we have to do a lot of computations, and hence the complexity increases quickly with
the degree.
It is possible to make use of software for symbolic computation to produce the derivatives of
higher degree. Now, if we talk about the error in the quadratic Taylor method, we can derive it in
this way: y(xn + h) can be given by y(xn) + h y'(xn) + h2/2 y''(xn) + third and higher order
terms.
(Refer Slide Time: 14:51)
In this equation we will substitute y' = f(x, y) and y'' = F1(x, y), that is, the total derivative of f
with respect to x, and so on. The numerical scheme gives yn+1 = yn + h yn' + h2/2 yn'' with initial condition y0
= Y0. Here yn', as you know, is f(xn, yn) and yn'' is F1(xn, yn).
So taking the difference between (1) and (2), on the left hand side we get en+1 = yn+1 -
y(xn + h), that will be the error in the (n+1)th iteration.
(Refer Slide Time: 15:39)
And the first term on the right hand side will be yn - y(xn).
(Refer Slide Time: 15:53)
So this I am writing as en+1 = en + h[f(xn, y(xn)) - f(xn, yn)] + a second order term + third and higher
order terms. If we assume that both f as well as F1 are Lipschitz continuous, then we can
replace these two expressions in round brackets, by the definition of
Lipschitz continuity, in this way: |en+1| ≤ |en| + hK|y(xn) - yn| + (1/2)h2 K1|y(xn) - yn| + ...,
where K is the Lipschitz constant for f and K1 for F1. But |y(xn) - yn| is
basically |en| again. So after simplification I can write |en+1| ≤ (1 +
hK + (1/2)h2 K1)|en| + third and higher order terms. This expression can be written in this
form.
If I use the expression for the error from the first iteration, that is e0, the initial error,
then finally the sum can be written as 1 + α + α2 + ... + α^(n-1), in this particular form,
and finally it comes out in this way. So this particular expression guarantees
that the error will be of order h2, which is an immense improvement over Euler's method,
where we were having an accuracy of order h.
We can also use this Taylor series for finding the approximation of a function about a given
point. For example, if I want to find out the expansion of f(x) = cos x about x = 0, the Taylor
series is given by this formula, and hence I am having different order derivatives in the different terms;
after calculating all these at x = 0, I will get 1, 0, -1, 0, 1.
And then substituting in the expression I will get 1 - x2/2! + x4/4! - ... and so on. So this
was about the Taylor method, and here I told you that we can use the quadratic Taylor method,
which is having an error of order h2, however we can use higher order Taylor methods for
getting better accuracy.
Now I will explain one more method, that is Euler's modified method, which is just an
improvement of the Euler's method we discussed in the earlier lecture. And what is
this method?
Basically, in this method we will use an average slope rather than the slope at the start of the
interval, like we have taken in Euler's method.
So what will I do? I will evaluate the slope at the start of the interval; I will estimate the value
of the dependent variable y at the end of the interval using Euler's method; I will evaluate the
slope at the end of the interval;
find the average slope using the slopes in step 1 and step 3; and compute a revised value of the
dependent variable y at the end of the interval using the average slope of step 4 with Euler's
method.
So if I want to explain this method: basically the problem is that I need to solve y′ = f(x, y), y(x0) = y0, and I want to calculate y1, which is the value of y at x0 + h. What I have done in Euler's method is to use y1 = y0 + h·f(x0, y0), where I am approximating the value of y at x = x1 by the slope taken at the initial point of the interval.
For example, if this is the function, this is the point x0 and this is the point x1, then here I know the value, I take the slope here, and I approximate this value by this one in Euler's method. Here I am calculating this value of y1 and I will use it as the predicted value y1*. Then what will I do? I will correct this predicted value by a new formula, that will be y1 = y0 + h/2·[f(x0, y0) + f(x1, y1*)].
So what I am doing: using Euler's method I find the slope at the end point of the interval, that is at this particular point, and then I take the average of the slope over the whole interval by taking 1/2 of these two. And here, what I am doing? This formula I can use again and again, because this is the corrector formula, and this formula I can use again
and again in an implicit manner.
How? You can see here I am having y1 as well as y1*, so I will calculate y1 from this, then again I will substitute that y1 here. I will get a new y1, that is a better approximation, and repeating this process again and again I will get better and better approximations. So these two formulas jointly are called Euler's modified method. First you need to find a predicted value of y1, and then you correct that predicted value by this formula.
So in a general setting I can write this as yn+1* = yn + h·f(xn, yn), if you know the value yn at x = xn and you want to calculate the value at x = xn + h, that is xn+1. This is the predictor formula, and the corrector formula is yn+1 = yn + h/2·[f(xn, yn) + f(xn+1, yn+1*)].
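This predictor-corrector pair can be sketched in Python (function and variable names are my own, not from the lecture); the corrector may optionally be iterated, as described:

```python
def modified_euler_step(f, x, y, h, n_corr=1):
    """One step of Euler's modified method.

    Predictor: y* = y + h*f(x, y)
    Corrector: y_new = y + (h/2)*(f(x, y) + f(x + h, y*)), applied n_corr times.
    """
    slope0 = f(x, y)
    y_new = y + h * slope0                                # predicted value (Euler step)
    for _ in range(n_corr):                               # corrector, possibly repeated
        y_new = y + 0.5 * h * (slope0 + f(x + h, y_new))
    return y_new

# For y' = y, y(0) = 1, h = 0.1 one step gives 1 + 0.05*(1 + 1.1) = 1.105.
print(modified_euler_step(lambda x, y: y, 0.0, 1.0, 0.1))
```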
(Refer Slide Time: 24:34)
Then, if I use the Taylor series expansion of y about x = xn, I can write it in this way. If I calculate y′(xn + 1/2·h), it will come in this form. And here you can note down, just look at these two terms that are similar, so I can substitute this particular thing here. So y(xn+1) is given as y(xn) + h·y′(xn + 1/2·h) + O(h³).
Now the difference between y′n and y′(xn) is given by these two functions, and we are assuming that f is Lipschitz continuous with Lipschitz constant k. So I can write it in this way, and that is equal to k·en, where en is the error.
So U − y(xn + 1/2·h) can now be given by this particular equation, which is less than or equal to |yn − y(xn)| + 1/2·h·|y′n − y′(xn)| + O(h²) (I have taken this term and this term together, with 1/2·h taken common), and that is en + 1/2·hk·en + O(h²). Similarly, I can get U′ − y′(xn + 1/2·h), and that is given by k·(1 + 1/2·hk)·en + O(h²).

So finally yn+1 − y(xn+1), that is the error in the (n+1)th step, can be calculated as yn + h·U′ − y(xn) − h·y′(xn + 1/2·h) + O(h³), which is less than or equal to |yn − y(xn)| + h·|U′ − y′(xn + 1/2·h)| + O(h³). So en+1 will be less than en + hk·(1 + 1/2·hk)·en plus higher order terms; these values I have substituted from the previous slide, which I calculated earlier. So if I take α = 1 + hk + 1/2·h²k² and so on, then after substituting these values and writing the error in terms of the initial error, that is the error in the initial iteration,
then I will get this particular approximation, and again, like the quadratic Taylor method, here the algorithm has error of order h², which is again an improvement over the simple Euler's method, which gives error of order h.
So after this we will take one example of Euler's modified method. The example is given by this particular initial value problem: y′(x) = 2y/x + x, and I need to find the value of y on the interval [1, 1.4] by taking n = 4. At the same time, the analytical solution is also given for this particular differential equation, which is x² + x²·log x with natural base, and if I calculate the initial value of y at x = 1, it will be 1.
So my initial value problem is given as y′ = 2y/x + x, x ∈ [1, 1.4]. The analytical solution is also given, that is y*(x) = x² + x²·log x. Now I need to solve this problem. Here x0 is 1, so y0 is 1; from this equation I can calculate that when x = 1, y becomes 1. x1 is 1.1, and y1 I need to calculate.
x2 is 1.2, y2 I need to calculate; x3 is 1.3, y3 I need to calculate; and similarly y4 at x4, that is 1.4. So let us first calculate y1 using Euler's modified method. The predictor is y1* = y0 + h·f(x0, y0). So y0 is 1, h is 0.1, and f(x0, y0) can be calculated from this, because it is my f(x, y): 2·1/1 + 1 = 3. So y1* comes out as 1 + 0.1·3 = 1.3, and this is the predicted value.
Now I will correct this value. So y1 will be y0 + h/2·[f(x0, y0) + f(x1, y1*)]. So y0 is 1, h/2 is 0.1/2, and I get y1 = 1 + (0.1/2)·[3 + 2·(1.3/1.1) + 1.1]. After simplifying this particular expression I will get the value of y1, that is the value of y at x = 1.1.
This value comes out as y(1.1) = 1.32405, where the exact value is 1.32533, so we have a very small difference here, after the third place of decimal. Then y(1.2) = 1.69982, while the exact one is 1.70254; y(1.3) = 2.12905, exact 2.13340; and finally these are the approximate and exact values of y at x = 1.4.
So here you can note from these two columns that the approximate solution is quite close to the exact one. And this is the implementation of Euler's modified method for solving an initial value problem. So in this lecture we have seen two methods that have error of order h²: the quadratic Taylor method and then Euler's modified method.
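The worked example above can be reproduced with a short script (a sketch I am adding; it makes one corrector pass per step, and iterating the corrector shifts the last digits slightly, which is why hand-computed values may differ in the final places):

```python
import math

def f(x, y):
    return 2*y/x + x                    # right-hand side of the IVP

def exact(x):
    return x*x + x*x*math.log(x)        # given analytical solution x^2 + x^2 ln x

x, y, h = 1.0, 1.0, 0.1
for _ in range(4):                      # steps to x = 1.1, 1.2, 1.3, 1.4
    y_pred = y + h * f(x, y)                           # predictor
    y = y + 0.5 * h * (f(x, y) + f(x + h, y_pred))     # one corrector pass
    x += h
    print(f"x = {x:.1f}  approx = {y:.5f}  exact = {exact(x):.5f}")
```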
In the next lecture we will learn another class of numerical methods for solving ordinary differential equations, and those methods are called Runge-Kutta methods. So thank you very much for listening to this lecture.
Numerical Methods
By Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology Roorkee
Lecture 39
RK Methods for Solving ODEs
Hello everyone. In this lecture I am going to introduce another class of methods for solving ordinary differential equations numerically. This particular class of methods is called Runge-Kutta methods, or in short RK methods. In the previous couple of lectures we have seen Euler's method: in the simple Euler's method we were having error of order h, while in the modified Euler's method we were having it of order h².
And we have seen in Euler's method that if you want a more accurate solution, or a better approximation of the solution, what do you need to do? You need to reduce the step size, that is, you have to use a smaller value of h. Or, if we talk about the Taylor method, if you want a more accurate solution you have to go for higher order terms, and higher order terms means higher order derivatives of the function.
But whether you are decreasing your step size or you are calculating the higher order
derivatives in both the cases you need to do more calculation, computational complexity will
increase.
So in RK methods we attempt to obtain greater accuracy and at the same time avoid the need of calculating higher derivatives or of using a smaller step size; we don't want to reduce the step size. How can we do it? We will evaluate the function f(x, y) at some selected points on each subinterval. In the simple Euler's method or the Taylor method we calculate at the initial point and the last point; here we will take some selected points within the subinterval. So, first of all I will explain the RK method of order 2, and then I will tell you how we can generalize it to any order.
So consider the initial value problem dy/dx = f(x, y) with initial condition y(x0) = y0. For solving this problem using the Runge-Kutta method of order 2, we first write the modified Euler's method in the form y1 = y0 + k, where k = 1/2·(k1 + k2), k1 = h·f(x0, y0) and k2 = h·f(x0 + h, y0 + k1). This particular method is called the RK method of order 2.
So basically what are we doing? First we calculate k1, then we calculate k2; we take the average of k1 and k2, that will be my k, and y0 can be updated as y1 = y0 + k. So this is the overall algorithm for the Runge-Kutta method of order 2. Now graphically this method can be seen like this.
(Refer Slide Time: 5:25)
I am having an initial point x, at which the value of the function y(x) is given by this particular point, and this pink colour curve is my function y(x). I take the slope at this particular point x, which is given by this green line, and I want to find the value of the function y at x + h. So what will happen? I will take the midpoint of the slope line, that is somewhere at x + 0.5h.

So at the midpoint of the interval I will find the point on the slope line; this particular point will be y + 0.5k1. Now through this point, that is (x + 0.5h, y + 0.5k1), I will be having another solution curve, given by this particular curve y1(x). Now the slope of this y1(x) at this point x + 0.5h will give me the value of the function y at x + h, that is y + k2.

And you can see here the difference from the Euler step: this is the Euler step, and this is y + k2, that is the Runge-Kutta method of order 2.
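In code, one RK2 step as defined above is just a few lines (a sketch I am adding; the function name is my own):

```python
def rk2_step(f, x0, y0, h):
    """One step of the second-order Runge-Kutta method."""
    k1 = h * f(x0, y0)               # slope at the start of the interval
    k2 = h * f(x0 + h, y0 + k1)      # slope at the Euler-predicted end point
    return y0 + 0.5 * (k1 + k2)      # y1 = y0 + k, with k = (k1 + k2)/2
```

For y′ = y, y(0) = 1 and h = 0.01 this reproduces the step computed later in the lecture: k1 = 0.01, k2 = 0.0101, y1 = 1.01005.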
(Refer Slide Time: 6:52)
So in general, the RK method of order m can be written as follows. As in order 2 we have two terms k1 and k2, in order m we need to calculate k1, k2, up to km, where k1 and k2 are given just as in the order-2 method, k3 = h·f(x0 + α2h, y0 + β21k1 + β22k2), similarly k4 can be given by this particular expression, and km finally will be this one.

Here k will be the weighted sum of k1, k2, up to km, that is ω1k1 + ω2k2 + … + ωmkm, where ω1, ω2, …, ωm are weights between 0 and 1. And finally, once we have this k, I can write y1 = y0 + k.
(Refer Slide Time: 7:54)
The parameters α, β and the weights ω are chosen to satisfy certain conditions. They are determined by expanding the various functions in y1 = y0 + k about (x0, y0) and comparing powers of h with the expansion of y(x0 + h). This can be seen in the case of order 2; I will derive how we get the particular α, β and ω for an RK method of order 2.
So y1 = y0 + ω1k1 + ω2k2; the value of k1 I can substitute from here, h·f(x0, y0), and the value of k2 I can substitute from here, h·f(x0 + αh, y0 + βk1). This can be written as y1 = y0 + ω1·h·f(x0, y0) + …; let us expand this second term by the Taylor series expansion about (x0, y0).

So this will be ω2·h·[f(x0, y0) + αh·fx + βk1·fy + …]; those are the first order terms, with fx and fy evaluated at (x0, y0), plus the second order terms, which will be 1/2·[α²h²·fxx + 2αh·βk1·fxy + β²k1²·fyy], plus higher order terms. So this can be expanded like this.

Finally, we can collect the coefficients: y1 = y0 + h·(ω1 + ω2)·f (please note that from now on I am writing f(x0, y0) simply as f) plus this particular h² term. So h²·α·ω2·fx + (as you can note, k1 can be written as h·f from its formula, so the βk1·fy term becomes βh·f·fy, with h² taken out) h²·β·ω2·f·fy + higher order terms. This will be y0 + h·(ω1 + ω2)·y′ (as you know, for the initial value problem y′ = f(x, y), so this f I can replace with y′) plus the h² terms.

Now the simple Taylor series expansion of y can be given as y0 + h·y′ + h²/2·y″ + …. Comparing the various powers of h: from the first power of h, ω1 + ω2 = 1; and from the second, since y″ = fx + f·fy, I get α·ω2 = 1/2 and β·ω2 = 1/2, hence α = β.
Hence, if I choose α = A, I get β = A, because α = β; ω2 becomes 1/(2A) and ω1 becomes 1 − 1/(2A). So by giving different values to A (in general we choose A ≥ 1/2) we can generate the various RK methods of order 2.

For example, the classical method of order 2 can be obtained just by taking α = β = 1, so here A is taken as 1. Then ω1 becomes 1/2, ω2 becomes 1/2, and correspondingly k2 becomes h·f(x0 + h, y0 + k1), which is the standard Runge-Kutta method of order 2. So let us again take an example and solve it using the Runge-Kutta method of order 2.
So let us calculate k1 for this particular example. k1 will be h·f(x0, y0); h is 0.01 and f(x0, y0) will be y0, which is 1, so k1 = 0.01. Similarly I can calculate k2: k2 will be h·f(x0 + h, y0 + k1), so it will be 0.01·(y0 + k1), where y0 is 1 and k1 is 0.01, giving k2 = 0.0101.

Now k will become 1/2·[0.01 + 0.0101] = 1/2·[0.0201] = 0.01005, and hence y at 0.01 can be given as 1.01005.
And so, reading values from this table: at x = 0.02, y comes out as 1.020201; at 0.03, y comes out as 1.030454; and at 0.04 it will be 1.040810.
The exact solution of this particular differential equation is y = eˣ, which can be obtained by separating the variables x and y on different sides and integrating. The exact value of y at x = 0.04 is 1.0408, which is the same as what we are getting using the Runge-Kutta method of order 2 for this particular example.
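Iterating the RK2 step reproduces this table (a sketch I am adding; the IVP here is y′ = y with y(0) = 1, inferred from the numbers quoted above):

```python
import math

x, y, h = 0.0, 1.0, 0.01
for _ in range(4):                   # advance to x = 0.01, 0.02, 0.03, 0.04
    k1 = h * y                       # f(x, y) = y for this problem
    k2 = h * (y + k1)
    y += 0.5 * (k1 + k2)
    x += h
print(round(y, 5), round(math.exp(0.04), 5))   # both 1.04081 to five decimals
```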
Then, if we talk about the error in this RK method of order 2, the truncation error is obtained as R = y(x1) − y1, and that is the term of order h³ which we have left out. Hence the local discretization error of this method is of order h³, whereas Euler's method has local error of order h².
Therefore we can expect to be able to use a larger step size in this method when compared to the Taylor and Euler methods. The price we pay for this is that we must evaluate the function f(x, y) twice, once for k1 and once for k2. This method is also known as the Euler-Cauchy method. A third order RK method can also be constructed by taking k1, k2, k3 and carrying out the same analysis as we have done for order 2. Here, though, we will discuss the RK method of order 4.
So in its simple form a fourth order RK method may be expressed as follows: there we will have four terms k1, k2, k3, k4. k1 is given as h·f(x0, y0); then we make use of this k1 for calculating k2, so k2 becomes h·f(x0 + α1h, y0 + β1k1); k3 becomes h·f(x0 + α2h, y0 + β2k2); and finally k4 becomes h·f(x0 + α3h, y0 + β3k3), where k = ω1k1 + ω2k2 + ω3k3 + ω4k4. And finally y1 can be obtained just like in the method of order 2, that is y0 + k.
Basically we have 10 unknowns, and the procedure for computing their values is the same as we discussed for the order-2 method. However, we will have fewer equations than unknowns, therefore we have the freedom to assign arbitrary values to some of the unknowns, and in this way we get a class of RK methods of order 4.
The most classical choice is something like this: we take α1 = 1/2, β1 = 1/2, then α2 = 1/2, β2 = 1/2, and finally α3 and β3 as 1 and 1. And we take the weights ω1 = 1/6, ω2 = 1/3, ω3 = 1/3 and ω4 = 1/6.
So the scheme will become like this. Let us take an example. As we have seen, in the method of order 2 the truncation error is of order h³; here the order of the error will become h⁵, that is, one more than the order of the method.
So compute y at x = 0.2 and 0.4 by the fourth order Runge-Kutta method for the differential equation dy/dx = y − x, where y at x = 0 is given as 1.5.
So first of all we take h = 0.2, x0 = 0, y0 = 1.5. So k1 will become h·f(x0, y0), that is 0.2·(1.5 − 0) = 0.3. Then k2 comes out as 0.310, k3 comes out as 0.3110 and finally k4 is 0.3222. Taking the weighted average of k1, k2, k3 and k4, I calculate k, and k will be 1/6·[0.3 + 2·(0.310) + 2·(0.3110) + 0.3222].

And it comes out as 0.3107. So finally y1, that is the value of y at x = 0.2, will be y0 + k, and it is 1.5 + 0.3107, that is 1.8107. Similarly, taking the initial values x1 = 0.2 and y1 = 1.8107, I calculate the value of y at x2 = 0.4, that is basically y2; for this I similarly calculate k1, k2, k3, k4.

So this is the value of k1, the value of k2, the value of k3 and the value of k4. Finally, the weighted average gives k, which will be 0.33517. So y2 finally comes out as y1 + k, and it is 2.1459. In this way I can implement the Runge-Kutta method of order 4, which has accuracy of order h⁵. I can construct the method of order 3 also, which will have accuracy of order h⁴.
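The classical fourth-order scheme with these weights, applied to the example dy/dx = y − x, y(0) = 1.5, can be sketched as follows (names are my own):

```python
def f(x, y):
    return y - x                     # right-hand side of the example IVP

def rk4_step(f, x, y, h):
    """One classical RK4 step with weights 1/6, 1/3, 1/3, 1/6."""
    k1 = h * f(x, y)
    k2 = h * f(x + h/2, y + k1/2)
    k3 = h * f(x + h/2, y + k2/2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2*k2 + 2*k3 + k4) / 6

x, y, h = 0.0, 1.5, 0.2
for _ in range(2):                   # advance to x = 0.2 and x = 0.4
    y = rk4_step(f, x, y, h)
    x += h
    print(f"y({x:.1f}) = {y:.4f}")   # 1.8107, then 2.1459
```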
So in this lecture what have we learned? We have learned another class of numerical methods, that is Runge-Kutta methods. We have done the method of order 2 in detail: we have done the derivation and the error analysis. We have also seen the RK method of order 4 in detail, and we have taken an example where we found the value of y at x = 0.2 and 0.4.
The value of y at x = 0 was given. So in all these methods, starting from Euler's, then Taylor, then the RK methods, what are we doing? For calculating the value of y at x = x1 we use only the value of y at x0. We take a single step: for finding the value at the next point we use the value at the current point only.

So all these methods are single step. In the next lecture we will see another method which is different from these, because that particular method uses multiple steps: for finding the value of y at x = x1 it uses the value of y at x0 and at some previous points, fits a polynomial based on those points, and extrapolates the value of the function to the next point. So thank you very much.
Numerical Methods
By Dr. Sanjeev Kumar
Department of Mathematics
Indian Institute of Technology, Roorkee
Lecture 40
Multi-step Methods for Solving ODEs
Hello everyone, welcome to the last lecture of this module, which is the last lecture of this course also. In this lecture I will talk about multistep methods for solving ordinary differential equations numerically. In the past few lectures we have talked about Euler's method and then the Runge-Kutta methods of order 2 and order 4.
So in all those methods, like Euler's method, we have taken our approximation of y at x = xk+1 as yk + h·f(xk, yk). Here we assume that at x = xk the value of y is yk; these two values are known to us, and from them we move to the next value of y at the next point, that is yk+1. In the same way, what have we done in the Runge-Kutta method?

We have calculated the value of y at x = xk+1 as y at x = xk plus k, where k comes from the average of k1 and k2, with k1 = h·f(xk, yk) and k2 = h·f(xk + h, yk + k1). Now please note here that for finding the value of y at x = xk+1, that is yk+1, we are using only xk and yk.
So for finding the value in the next approximation, or in the next iteration, what are we doing? We are using only the value of the current iteration, and we are doing the same thing in the Runge-Kutta method. So here we are moving one step: for finding the value in the next iteration we take the value of the current iteration only. However, in a multistep method we will use not only the value of the current iteration,

but also the values of some previous iterations, like the value at x = xk−1, at x = xk−2 and so on. So for calculating the value of y at x = xk+1 we use the values at the kth, (k−1)th and (k−2)th iterations, and that is why it is called a multistep method.
So the principle behind the multistep method is to utilize the past values of y and/or the derivative of y to construct a polynomial that approximates the derivative function, and then extrapolate this into the next interval. Basically, using four to five points we will fit a polynomial of degree 4 or 5, and for the value at the next point we will extrapolate using this particular polynomial.
So the number of past points that are used sets the degree of the polynomial which we are going to fit, and is therefore responsible for the truncation error. The order of the method is equal to the power of h in the global error term of the formula, which is also equal to one more than the degree of the polynomial.

So if we have a degree-4 polynomial, we will have error of order h⁵. However, when using an implicit linear multistep method there is an additional difficulty, because we cannot simply solve for the newest approximate y value, that is yn+k.
(Refer Slide Time: 5:05)
Because we have an implicit formula, the value of y at x = xn+k appears on the left hand side as well as the right hand side, and hence the problem is how to get the value out explicitly. A general k-step implicit method involves, at the nth time step, this equation:

αk·yn+k + αk−1·yn+k−1 + … + α1·yn+1 + α0·yn = h·(βk·fn+k + … + β1·fn+1 + β0·fn).

Here you can note that in this equation I have the value yn+k on the left hand side and fn+k on the right hand side. Basically this fn+k also involves the term yn+k, because fn+k is f(xn+k, yn+k). So in this way we have yn+k on both sides, and that is why we call it an implicit scheme.
So to get rid of this difficulty, one solution is to use only explicit methods.
(Refer Slide Time: 6:35)
However, an explicit method is generally not as good as an implicit method, though it brings a simplification in terms of calculation; as I told you, it is simpler to compute but not as good in terms of approximation. So what is the solution? The solution is to use a predictor-corrector method: an explicit scheme with a predictor-corrector formulation.
So the predictor-corrector method involves a predictor step in which we use an explicit method to obtain an approximation yn+kP to yn+k (you can also denote it with a *).

If you look at the Euler formula, as I told you, it may be considered in a way as a P-C formula, that is a predictor-corrector formula, since we can express it in this way: the predictor is yn+1P = yn + h·f(xn, yn), while in corrector form yn+1C can be obtained as yn + h/2·[f(xn, yn) + f(xn+1, yn+1P)],

which is exactly Euler's modified method, in which we take the average of the slope over the whole interval. In the above scheme, the formula P is a first order formula having error of order h, while the formula C is a second order formula which has local error of order h³.
(Refer Slide Time: 8:26)
So the above formula may be iterated if required, as yn+1C(k+1) = yn + h/2·[f(xn, yn) + f(xn+1, yn+1C(k))], where k = 1, 2, 3, …; you can iterate it again and again. Now we come to a multistep predictor-corrector formula, and this is Milne's method.
So in Milne's method let us assume that the values of y and y′ are known; they may be given to us, or they may be computed by any self-starting method like Euler's method or any other method, for the points xn−2, xn−1, xn and the initial point xn−3.
So we write Newton's forward formula for y′ in terms of U, with starting node point xn−3:

y′ = y′n−3 + U·Δy′n−3 + U(U−1)/2!·Δ²y′n−3 + U(U−1)(U−2)/3!·Δ³y′n−3 + …,

where U = (x − xn−3)/h and h is the step size. So we can write x = xn−3 + hU, and from here dx = h·dU.
(Refer Slide Time: 10:05)
Now consider the initial value problem y′ = f(x, y), with the initial value given at xn−3, that is, y(xn−3) = yn−3. If we integrate both sides from xn−3 to xn+1, we have ∫dy = ∫y′dx over [xn−3, xn+1]; the left integral becomes y evaluated at the limits.

So it becomes yn+1 − yn−3 = h·∫₀⁴ [y′n−3 + U·Δy′n−3 + U(U−1)/2!·Δ²y′n−3 + …] dU, substituting Newton's forward formula.
(Refer Slide Time: 11:21)
And when I integrate it, the first term gives U (because the integral of 1 with respect to U is U); substituting the upper limit 4, I get 4y′n−3. In the second term it becomes U²/2; substituting the limit, U² becomes 16, so 16/2 = 8, giving 8Δy′n−3. Similarly we get three more terms: the third term becomes 20/3·Δ²y′n−3, then 8/3·Δ³y′n−3 and 14/45·Δ⁴y′n−3.
Now we replace the forward difference operator by the shift operator. We know from interpolation that Δ = E − 1, so I can write 8Δ as 8(E − 1); similarly Δ² becomes (E − 1)², Δ³ becomes (E − 1)³, and the last term I will keep in terms of Δ only.

When I use this, 4y′n−3 remains as it is; from the next term I get 8E·y′n−3 − 8y′n−3, and E·y′n−3 becomes y′n−2.
Because we know from the shift operator that E·yn equals yn+1, it becomes 8y′n−2 − 8y′n−3 + 20/3·(y′n−1 − 2y′n−2 + y′n−3), this last part coming from (E − 1)². Similarly, for the next term I get 8/3·(y′n − 3y′n−1 + 3y′n−2 − y′n−3), and then the term which we have already taken out, that is 14h/45·Δ⁴y′n−3. Now for the simplification: here we have 4y′n−3, here −8y′n−3, similarly here 20/3·y′n−3 and here −8/3·y′n−3.
(Refer Slide Time: 14:31)
So after simplification I get the left hand side yn+1 − yn−3 = 4h/3·(2y′n−2 − y′n−1 + 2y′n) + 14h/45·Δ⁴y′n−3. Taking yn−3 to the right hand side I get this expression, and let us denote the term which we took out as e1, that is the error in the predictor.

It is given by the expression 14h/45·Δ⁴y′n−3, and I can say it is of order h⁵: by the Taylor series it is 14h⁵/45 times the fifth order derivative of y at some point ξ1, where this point ξ1 lies between xn−3 and xn+1.
(Refer Slide Time: 15:45)
Thus the formula yn+1 = yn−3 + 4h/3·(2y′n−2 − y′n−1 + 2y′n) is called Milne's predictor formula, or the extrapolation formula, with error of order h⁵ given by the above expression. Now we need to derive the corrector formula. From the predictor we get the value of yn+1, but that is the predicted value; we have to correct this value, and for this we need the corrector formula.
For the corrector formula we again consider the same initial value problem but with a different initial point: now we consider the initial point xn−1, and we take y at x = xn−1 equal to yn−1. Again we use Newton's forward formula with starting node xn−1, in terms of y′ and U.
So it is given by y′ = y′n−1 + U·Δy′n−1 + U(U−1)/2!·Δ²y′n−1 + …, taken up to the fourth difference, where U = (x − xn−1)/h, or x = xn−1 + hU.

We again have ∫dy = ∫y′dx, now from xn−1 to xn+1. In the same way as for the predictor formula, this becomes yn+1 − yn−1 = h·∫₀² [y′n−1 + U·Δy′n−1 + U(U−1)/2!·Δ²y′n−1 + third and fourth order terms] dU (you can see that earlier it was from 0 to 4 in the predictor).

So finally, after integrating with respect to U and putting the upper limit of U as 2 and the lower limit as 0, it comes out as h·[2y′n−1 + 2Δy′n−1 + 1/3·Δ²y′n−1 − 1/90·Δ⁴y′n−1]; the Δ³ term integrates to zero over this range.
So again we replace the forward difference operator by the shift operator, that is, we substitute Δ = E − 1, and we get this expression. And here again, as we did in the earlier derivation, we take the term −1/90·Δ⁴y′n−1 outside the bracket.
(Refer Slide Time: 19:25)
So we have done it here. After simplification it comes out as h/3·(y′n−1 + 4y′n + y′n+1) + E2, where E2 is the error, and it is of order h⁵, similar to the predictor one.

So the formula yn+1 = yn−1 + h/3·(y′n−1 + 4y′n + y′n+1) is called Milne's corrector formula, with an error of order h⁵.
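Both formulas together give one Milne predictor-corrector step, sketched below (names are my own; xs and ys hold the four known points xn−3 … xn and their values):

```python
def milne_step(f, xs, ys, h, n_corr=2):
    """One Milne P-C step from four known points to x_{n+1}.

    Predictor: y_{n+1} = y_{n-3} + (4h/3)(2f_{n-2} - f_{n-1} + 2f_n)
    Corrector: y_{n+1} = y_{n-1} + (h/3)(f_{n-1} + 4f_n + f_{n+1}), iterated.
    """
    fs = [f(x, y) for x, y in zip(xs, ys)]     # fs[0..3] = f_{n-3} .. f_n
    x1 = xs[3] + h
    y1 = ys[0] + 4*h/3 * (2*fs[1] - fs[2] + 2*fs[3])     # predicted value
    for _ in range(n_corr):                              # corrector passes
        y1 = ys[2] + h/3 * (fs[2] + 4*fs[3] + f(x1, y1))
    return x1, y1
```

For instance, with f(x, y) = y and exact starting values eˣ at x = 0, 0.1, 0.2, 0.3, one step lands very close to e^0.4.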
(Refer Slide Time: 20:08)
From the predictor we get an initial, predicted value of yn+1, which is then corrected by the corrector formula we have just derived. The combination of these two formulas is called Milne's predictor-corrector (P-C) method. So let us take an example.
(Refer Slide Time: 20:41)
So we need to solve the differential equation dy/dx = x² + y² − 2 for x = 0.3 by the Milne predictor-corrector method, computing the starting values at x = −0.1, 0, 0.1 and 0.2 by the Taylor expansion about x = 0, where y0 = 1, taking the first four non-zero terms of the Taylor series expansion.
(Refer Slide Time: 21:23)
So y′ = f(x, y) = x² + y² − 2, and it is also given that y0 = 1. Now y′(0) becomes 0² + 1² − 2, which comes out as −1. Next I calculate y″: y″(x) = 2x + 2y·y′ = 2x + 2y·(x² + y² − 2).

When I calculate the value of y″ at 0, it is 0 + 2·1·(0 + 1 − 2) = −2. Similarly I calculate y‴(0): I differentiate again with respect to x, substitute the value of y′, and it finally comes out as 0. As I told you, we use the first four non-zero terms in the Taylor series expansion, so I also calculate the fourth derivative at 0, and this comes out as 12.
Now, after calculating these values, I use the Taylor series expansion of y about x = 0: y(x) = y(0) + x·y′(0) + x²/2!·y″(0) + …. Substituting all these values I get y(x) ≈ 1 − x − x² + x⁴/2 (the x³ term vanishes since y‴(0) = 0, and the x⁴ coefficient is 12/4! = 1/2).

After getting this particular expression I calculate the value of y at −0.1, 0, 0.1 and 0.2.
So once I calculate these values:
(Refer Slide Time: 24:20)
y at x = 0 again comes out as 1, which is also given to us as the initial condition. Then y at 0.1 is 0.89 and y at 0.2 is 0.7608. Now that I have the values of y at these four points, I calculate the value of y at 0.3 using the Milne predictor formula. So I put the values into this formula, and after simplifying it I get this value as 0.6149. This is the predicted value of y3; now I need to correct this value.
So for this I use the corrector formula, which is given by this particular equation, that is yn+1 = yn−1 + h/3·(y′n−1 + 4y′n + y′n+1).

After using this, the corrected value of y at x = 0.3 is obtained as 0.6149. So in this way we can apply the Milne predictor-corrector method for solving ordinary differential equations. As I told you, here we should know the value of y at more than one point; in this Milne method we should know the value at least at four points,
which are either given to us, or we need to calculate them using the Taylor series method, Euler's method or any other method. So this is a multistep method, and it is more accurate compared to the single step methods, since its accuracy is of order h⁵ in the predictor as well as the corrector formula, and hence we can use a larger step size for computation when compared to Euler's method, where for better accuracy we need to use a smaller step size. So with this I will stop the discussion about this method.
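Putting the whole example together, a short script (a sketch I am adding, reproducing the lecture's numbers; the small difference in the predicted digit comes from rounding):

```python
def f(x, y):
    return x*x + y*y - 2                 # right-hand side of the IVP

def y_taylor(x):
    return 1 - x - x*x + x**4 / 2        # four-term Taylor expansion about x = 0

h = 0.1
xs = [-0.1, 0.0, 0.1, 0.2]
ys = [y_taylor(x) for x in xs]           # starting values
fs = [f(x, y) for x, y in zip(xs, ys)]

# Milne predictor: y_{n+1} = y_{n-3} + (4h/3)(2f_{n-2} - f_{n-1} + 2f_n)
y_pred = ys[0] + 4*h/3 * (2*fs[1] - fs[2] + 2*fs[3])
# Milne corrector: y_{n+1} = y_{n-1} + (h/3)(f_{n-1} + 4f_n + f_{n+1})
y_corr = ys[2] + h/3 * (fs[2] + 4*fs[3] + f(0.3, y_pred))
print(round(y_pred, 4), round(y_corr, 4))   # 0.6148 (predicted), 0.6149 (corrected)
```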
Now, since this is the last lecture, I would like to tell you about a few references which I have used for making all these lectures. The first one is the book Applied Numerical Analysis by Gerald and Wheatley; I have used the sixth edition of this book. The other book is Numerical Methods for Scientific and Engineering Computation by Jain, Iyengar and Jain. Moreover, I have taken some of the notes of Professor S. Baskar from IIT Bombay, from his course Introduction to Numerical Analysis, which is online on the IIT Bombay website. And the last reference I have used is the book Elementary Numerical Analysis: An Algorithmic Approach by Conte and de Boor; I have used the third edition, which is published by McGraw-Hill.
So these are the references which I have followed in this course. Apart from that, I would like to acknowledge a few people.

First of all I would like to acknowledge the Educational Technology Cell, IIT Roorkee, especially Professor B. K. Gandhi, the coordinator of the ET Cell at IIT Roorkee, along with his team, Dr. Nivedita, Sharad, Mohan and the other people who have been with me during the shooting of this course. I am also thankful to the programme implementation committee of NPTEL.

And last but not least, I am very thankful to my teaching assistant, Ms. Savita, who is also a PhD student at the Department of Mathematics, IIT Roorkee; she helped me in preparing all the slides for this course. So thank you very much.