Mathematics Board Sheets

Uploaded by

vortexstock121
Copyright
© © All Rights Reserved

1 Relations and Functions

Introduction
Recall that the notion of relations and functions, domain, co-domain and range have been introduced
in Class XI along with different types of specific real valued functions and their graphs. The concept of
the term ‘relation’ in mathematics has been drawn from the meaning of relation in English language,
according to which two objects or quantities are related if there is a recognisable connection or link
between the two objects or quantities. Let A be the set of students of Class XII of a school and B be
the set of students of Class XI of the same school. Then some of the examples of relations from A to
B are
(i) {(a, b) ∈ A × B: a is brother of b},
(ii) {(a, b) ∈ A × B: a is sister of b},
(iii) {(a, b) ∈ A × B: age of a is greater than age of b},
(iv) {(a, b) ∈ A × B: total marks obtained by a in the final examination is less than
the total marks obtained by b in the final examination},
(v) {(a, b) ∈ A × B: a lives in the same locality as b}.
However, abstracting from this, we define mathematically a relation R from A to B as an arbitrary
subset of A × B.
If (a, b) ∈ R, we say that a is related to b under the relation R and we write it as a R b. In general, for
(a, b) ∈ R, we do not bother whether there is a recognisable connection or link between a and b. As seen
in Class XI, functions are a special kind of relation.
In this chapter, we will study different types of relations and functions, composition of functions,
invertible functions and binary operations.

Types of Relations
In this section, we would like to study different types of relations. We know that a relation in a set A
is a subset of A × A. Thus, the empty set ϕ and A × A are two extreme relations. For illustration, consider
a relation R in the set A = {1, 2, 3, 4} given by R = {(a, b): a - b = 10}. This is the empty set, as no pair (a,
b) satisfies the condition a - b = 10. Similarly, R' = {(a, b) : | a - b | ≥ 0} is the whole set A × A, as all
pairs (a, b) in A × A satisfy | a - b | ≥ 0. These two extreme examples lead us to the following definitions.

Definition 1 A relation R in a set A is called empty relation, if no element of A is related to any element
of A, i.e., R = ϕ ⊂ A × A.

Definition 2 A relation R in a set A is called universal relation, if each element of A is related to every
element of A, i.e., R = A × A.
Both the empty relation and the universal relation are sometimes called trivial relations.

Example 1 Let A be the set of all students of a boys school. Show that the relation R in A given by R =
{(a, b) : a is sister of b} is the empty relation and R' = {(a, b): the difference between heights of a and
b is less than 3 meters} is the universal relation.
Solution Since the school is boys school, no student of the school can be sister of any student of the
school. Hence, R = ϕ, showing that R is the empty relation. It is also obvious that the difference between
heights of any two students of the school has to be less than 3 meters. This shows that R' = A × A is
the universal relation.
Remark In Class XI, we have seen two ways of representing a relation, namely roster method and set
builder method. However, a relation R in the set {1, 2, 3, 4} defined by R = {(a, b) : b = a + 1} is also
expressed as a R b if and only if b = a + 1 by many authors. We may also use this notation, as and when
convenient.
If (a, b) ∈ R, we say that a is related to b and we denote it as a R b.
One of the most important relations, which plays a significant role in Mathematics, is an equivalence
relation. To study equivalence relation, we first consider three types of relations, namely reflexive,
symmetric and transitive.

Definition 3 A relation R in a set A is called
(i) reflexive, if (a, a) ∈ R, for every a ∈ A,
(ii) symmetric, if (a1, a2) ∈ R implies that (a2, a1) ∈ R, for all a1, a2 ∈ A,
(iii) transitive, if (a1, a2) ∈ R and (a2, a3) ∈ R implies that (a1, a3) ∈ R, for all a1, a2, a3 ∈ A.

Definition 4 A relation R in a set A is said to be an equivalence relation if R is reflexive, symmetric and
transitive.

Example 2 Let T be the set of all triangles in a plane with R a relation in T given by R = {(T1, T2) : T1 is
congruent to T2}. Show that R is an equivalence relation.
Solution R is reflexive, since every triangle is congruent to itself. Further, (T1, T2) ∈ R ⇒ T1 is congruent
to T2 ⇒ T2 is congruent to T1 ⇒ (T2, T1) ∈ R. Hence, R is symmetric. Moreover, (T1, T2), (T2, T3) ∈ R ⇒ T1 is
congruent to T2 and T2 is congruent to T3 ⇒ T1 is congruent to T3 ⇒ (T1, T3) ∈ R. Hence, R is transitive.
Therefore, R is an equivalence relation.

Example 3 Let L be the set of all lines in a plane and R be the relation in L defined as R = {(L1, L2) : L1
is perpendicular to L2}. Show that R is symmetric but neither reflexive nor transitive.
Solution R is not reflexive, as a line L1 can not be perpendicular to itself, i.e., (L1, L1) ∉ R. R is symmetric
as (L1, L2) ∈ R
⇒ L1 is perpendicular to L2
⇒ L2 is perpendicular to L1
⇒ (L2, L1) ∈ R.
R is not transitive. Indeed, if L1 is perpendicular to L2 and L2 is perpendicular to L3, then L1 can never be
perpendicular to L3. In fact, L1 is parallel to L3, i.e., (L1, L2) ∈ R, (L2, L3) ∈ R but (L1, L3) ∉ R.

Example 4 Show that the relation R in the set {1, 2, 3} given by R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3)} is
reflexive but neither symmetric nor transitive.
Solution R is reflexive, since (1, 1), (2, 2) and (3, 3) lie in R. Also, R is not symmetric, as (1, 2) ∈ R but (2,
1) ∉ R. Similarly, R is not transitive, as (1, 2) ∈ R and (2, 3) ∈ R but (1, 3) ∉ R.
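
For a relation on a finite set, the three conditions of Definition 3 can be checked mechanically. The following is a minimal sketch, assuming the relation is stored as a Python set of ordered pairs; the function names are our own choice and are used only for illustration. It is applied to the relation R of Example 4.

```python
def is_reflexive(relation, base_set):
    # (a, a) must lie in the relation for every a in the base set.
    return all((a, a) in relation for a in base_set)


def is_symmetric(relation):
    # Whenever (a, b) lies in the relation, (b, a) must lie in it too.
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    # Whenever (a, b) and (b, d) lie in the relation, (a, d) must lie in it too.
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


A = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3)}  # the relation of Example 4

print(is_reflexive(R, A), is_symmetric(R), is_transitive(R))  # True False False
```

The output agrees with the solution above: R is reflexive, but neither symmetric nor transitive.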

Example 5 Show that the relation R in the set Z of integers given by
R = {(a, b) : 2 divides a - b}
is an equivalence relation.
Solution R is reflexive, as 2 divides (a - a) for all a ∈ Z. Further, if (a, b) ∈ R, then 2 divides a - b.
Therefore, 2 divides b - a. Hence, (b, a) ∈ R, which shows that R is symmetric. Similarly, if (a, b) ∈ R
and (b, c) ∈ R, then a - b and b - c are divisible by 2. Now, a - c = (a - b) + (b - c) is even (Why?). So,
(a - c) is divisible by 2. This shows that R is transitive. Thus, R is an equivalence relation in Z.
In Example 5, note that all even integers are related to zero, as (0, ± 2), (0, ± 4) etc., lie in R and no
odd integer is related to 0, as (0, ± 1), (0, ± 3) etc., do not lie in R. Similarly, all odd integers are related
to one and no even integer is related to one. Therefore, the set E of all even integers and the set O of
all odd integers are subsets of Z satisfying following conditions:
(i) All elements of E are related to each other and all elements of O are related to each other.
(ii) No element of E is related to any element of O and vice-versa.
(iii) E and O are disjoint and Z = E ∪ O.
The subset E is called the equivalence class containing zero and is denoted by [0]. Similarly, O is the
equivalence class containing 1 and is denoted by [1]. Note that [0] ≠ [1], [0] = [2r] and [1] = [2r + 1], r ∈
Z. In fact, what we have seen above is true for an arbitrary equivalence relation R in a set X. Given an
arbitrary equivalence relation R in an arbitrary set X, R divides X into mutually disjoint subsets Ai called
partitions or subdivisions of X satisfying:
(i) all elements of Ai are related to each other, for all i.
(ii) no element of Ai is related to any element of Aj, i ≠ j.
(iii) ∪ Aj = X and Ai ∩ Aj = ϕ, i ≠ j.
The subsets Ai are called equivalence classes. The interesting part of the situation is that we can go
reverse also. For example, consider a subdivision of the set Z given by three mutually disjoint subsets
A1, A2 and A3 whose union is Z with
A1 = {x ∈ Z : x is a multiple of 3} = {..., -6, -3, 0, 3, 6, ...}
A2 = {x ∈ Z : x - 1 is a multiple of 3} = {..., -5, -2, 1, 4, 7, ...}
A3 = {x ∈ Z : x - 2 is a multiple of 3} = {..., -4, -1, 2, 5, 8, ...}
Define a relation R in Z given by R = {(a, b) : 3 divides a - b}. Following the arguments similar to those
used in Example 5, we can show that R is an equivalence relation. Also, A1 coincides with the set of all
integers in Z which are related to zero, A2 coincides with the set of all integers which are related to 1
and A3 coincides with the set of all integers in Z which are related to 2. Thus, A1 = [0], A2 = [1] and A3 =
[2]. In fact, A1 = [3r], A2 = [3r + 1] and A3 = [3r + 2], for all r ∈ Z.
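
The passage from an equivalence relation to its equivalence classes can be illustrated on a finite portion of Z. The sketch below is only an illustration under stated assumptions: the window -6, ..., 8 and the function name equivalence_classes are our own choices, and the grouping is valid only because the relation supplied is an equivalence relation.

```python
def equivalence_classes(elements, related):
    # Place each element into the first class whose representative it is
    # related to; otherwise start a new class with it.
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


# Assumption: a finite window of Z, chosen only for illustration.
window = range(-6, 9)
classes = equivalence_classes(window, lambda a, b: (a - b) % 3 == 0)

for cls in classes:
    print(cls)
# [-6, -3, 0, 3, 6]
# [-5, -2, 1, 4, 7]
# [-4, -1, 2, 5, 8]
```

The three lists are the parts of A1, A2 and A3 that fall inside the chosen window.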

Example 6 Let R be the relation defined in the set A = {1, 2, 3, 4, 5, 6, 7} by R = {(a, b) : both a and b
are either odd or even}. Show that R is an equivalence relation. Further, show that all the elements of
the subset {1, 3, 5, 7} are related to each other and all the elements of the subset {2, 4, 6} are related
to each other, but no element of the subset {1, 3, 5, 7} is related to any element of the subset {2, 4, 6}.
Solution Given any element a in A, both a and a must be either odd or even, so that (a, a) ∈ R. Further,
(a, b) ∈ R ⇒ both a and b must be either odd or even ⇒ (b, a) ∈ R. Similarly, (a, b) ∈ R and (b, c) ∈ R ⇒
all elements a, b, c, must be either even or odd simultaneously ⇒ (a, c) ∈ R. Hence, R is an equivalence
relation. Further, all the elements of {1, 3, 5, 7} are related to each other, as all the elements of this
subset are odd. Similarly, all the elements of the subset {2, 4, 6} are related to each other, as all of
them are even. Also, no element of the subset {1, 3, 5, 7} can be related to any element of {2, 4, 6}, as
elements of {1, 3, 5, 7} are odd, while elements of {2, 4, 6} are even.

Types of Functions
The notion of a function along with some special functions like identity function, constant function,
polynomial function, rational function, modulus function, signum function etc. along with their graphs
have been given in Class XI.
Addition, subtraction, multiplication and division of two functions have also been studied. As the
concept of function is of paramount importance in mathematics and among other disciplines as well,
we would like to extend our study about function from where we finished earlier. In this section, we
would like to study different types of functions.
Consider the functions f1, f2, f3 and f4 given by the following diagrams.
We observe that the images of distinct elements of X1 under the function f1 are distinct, but the image
of two distinct elements 1 and 2 of X1 under f2 is same, namely b. Further, there are some elements like
e and f in X2 which are not images of any element of X1 under f1, while all elements of X3 are images of
some elements of X1 under f3. The above observations lead to the following definitions:

Definition 5 A function f : X → Y is defined to be one-one (or injective), if the images of distinct elements
of X under f are distinct, i.e., for every x1, x2 ∈ X, f(x1) = f(x2) implies x1 = x2. Otherwise, f is called
many-one.

Definition 6 A function f : X → Y is said to be onto (or surjective), if every element of Y is the image of
some element of X under f, i.e., for every y ∈ Y, there exists an element x in X such that f(x) = y.
The functions f3 and f4 are onto and the function f1 is not onto as elements e, f in X2 are not the image
of any element in X1 under f1.

Remark f : X → Y is onto if and only if Range of f = Y.

Definition 7 A function f : X → Y is said to be one-one and onto (or bijective), if f is both one-one and
onto.
The function f4 is one-one and onto.
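
For functions between finite sets, Definitions 5 and 6 can be tested directly. The following is a minimal sketch, assuming a function is stored as a Python dictionary from elements of the domain to their images; the two small functions p and q are hypothetical examples of our own and are not the functions f1 to f4 of the diagrams.

```python
def is_one_one(f):
    # One-one: no two elements of the domain share an image.
    images = list(f.values())
    return len(images) == len(set(images))


def is_onto(f, codomain):
    # Onto: every element of the co-domain is the image of some element.
    return set(f.values()) == set(codomain)


# Hypothetical examples, chosen only for illustration.
p = {1: "a", 2: "b", 3: "c"}      # p : {1, 2, 3} -> {a, b, c, d}
q = {1: "a", 2: "a", 3: "b"}      # q : {1, 2, 3} -> {a, b}

print(is_one_one(p), is_onto(p, {"a", "b", "c", "d"}))  # True False
print(is_one_one(q), is_onto(q, {"a", "b"}))            # False True
```

Here p is one-one but not onto, while q is onto but not one-one (many-one).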

Example 7 Let A be the set of all 50 students of Class X in a school. Let f : A → N be the function defined
by f(x) = roll number of the student x. Show that f is one-one but not onto.
Solution No two different students of the class can have the same roll number. Therefore, f must be
one-one. We can assume without any loss of generality that roll numbers of students are from 1 to 50. This
implies that 51 in N is not the roll number of any student of the class, so that 51 can not be the image of any
element of A under f. Hence, f is not onto.

Example 8 Show that the function f : N→ N, given by f(x) = 2x, is one-one but not onto.
Solution The function f is one-one, for f(x1) = f(x2) ⇒ 2x1 = 2x2 ⇒ x1 = x2. Further, f is not onto, as for 1
∈ N, there does not exist any x in N such that f(x) = 2x = 1.

Example 9 Prove that the function f : R → R, given by f(x) = 2x, is one-one and onto.
Solution f is one-one, as f(x1) = f(x2) ⇒ 2x1 = 2x2 ⇒ x1 = x2. Also, given any real number y in R, there exists
y/2 in R such that f(y/2) = 2 · (y/2) = y. Hence, f is onto.

Example 10 Show that the function f : N→ N, given by f (1) = f (2) = 1 and f(x) = x - 1, for every x > 2, is
onto but not one-one.
Solution f is not one-one, as f(1) = f(2) = 1. But f is onto, as given any y ∈ N, y ≠ 1, we can choose x as
y + 1 such that f(y + 1) = y + 1 - 1 = y. Also for 1 ∈ N, we have f(1) = 1.

Example 11 Show that the function f : R → R, defined as f(x) = x², is neither one-one nor onto.
Solution Since f(- 1) = 1 = f(1), f is not one-one. Also, the element - 2 in the co-domain R is not the image
of any element x in the domain R (Why?). Therefore f is not onto.

Example 12 Show that f : N → N, given by
f(x) = x + 1, if x is odd,
f(x) = x − 1, if x is even
is both one-one and onto.
Solution Suppose f(x1) = f(x2). Note that if x1 is odd and x2 is even, then we will have x1 + 1 = x2 - 1, i.e.,
x2 - x1 = 2 which is impossible. Similarly, the possibility of x1 being even and x2 being odd can also be
ruled out, using the similar argument. Therefore, both x1 and x2 must be either odd or even. Suppose
both x1 and x2 are odd. Then f (x1) = f(x2) ⇒ x1 + 1 = x2 + 1 ⇒ x1 = x2. Similarly, if both x1 and x2 are even,
then also f(x1) = f(x2) ⇒ x1 - 1 = x2 - 1 ⇒ x1 = x2. Thus, f is one-one. Also, any odd number 2r + 1 in the
co-domain N is the image of 2r + 2 in the domain N and any even number 2r in the co-domain N is the
image of 2r - 1 in the domain N. Thus, f is onto.

Example 13 Show that an onto function f : {1, 2, 3} → {1, 2, 3} is always one-one.
Solution Suppose f is not one-one. Then there exist two elements, say 1 and 2 in the domain whose
image in the co-domain is same. Also, the image of 3 under f can be only one element. Therefore, the
range set can have at the most two elements of the co-domain {1, 2, 3}, showing that f is not onto, a
contradiction. Hence, f must be one-one.

Example 14 Show that a one-one function f : {1, 2, 3} → {1, 2, 3} must be onto.
Solution Since f is one-one, three elements of {1, 2, 3} must be taken to 3 different elements of the
co-domain {1, 2, 3} under f. Hence, f has to be onto.
Remark The results mentioned in Examples 13 and 14 are also true for an arbitrary finite set X, i.e., a
one-one function f : X → X is necessarily onto and an onto map f : X → X is necessarily one-one, for
every finite set X. In contrast to this, Examples 8 and 10 show that for an infinite set, this may not be
true. In fact, this is a characteristic difference between a finite and an infinite set.
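
For the set {1, 2, 3}, the results of Examples 13 and 14 can also be confirmed by listing all 27 functions from the set to itself. The sketch below is only a brute-force check for this one small set, not a proof of the general statement; the helper names are our own.

```python
from itertools import product

X = (1, 2, 3)

# Every function f : X -> X, stored as a dictionary {x: f(x)}.
all_functions = [dict(zip(X, images)) for images in product(X, repeat=len(X))]


def is_one_one(f):
    return len(set(f.values())) == len(f)


def is_onto(f):
    return set(f.values()) == set(X)


# For each function, being one-one and being onto should agree.
agree = all(is_one_one(f) == is_onto(f) for f in all_functions)
bijections = [f for f in all_functions if is_one_one(f)]

print(len(all_functions), len(bijections), agree)  # 27 6 True
```

Of the 27 functions, exactly 6 are one-one, and these are precisely the onto functions.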

Composition of Functions and Invertible Function


In this section, we will study composition of functions and the inverse of a bijective function. Consider
the set A of all students, who appeared in Class X of a Board Examination in 2006. Each student
appearing in the Board Examination is assigned a roll number by the Board which is written by the
students in the answer script at the time of examination. In order to have confidentiality, the Board
arranges to deface the roll numbers of students in the answer scripts and assigns a fake code number
to each roll number. Let B ⊂ N be the set of all roll numbers and C ⊂ N be the set of all code numbers.
This gives rise to two functions f : A → B and g : B → C given by f(a) = the roll number assigned to the
student a and g(b) = the code number assigned to the roll number b. In this process each student is
assigned a roll number through the function f and each roll number is assigned a code number through
the function g. Thus, by the combination of these two functions, each student is eventually attached
a code number.
This leads to the following definition:

Definition 8 Let f : A → B and g : B → C be two functions. Then the composition of f and g, denoted by
gof, is defined as the function gof : A → C given by
gof(x) = g(f(x)), ∀ x ∈ A.

Example 15 Let f : {2, 3, 4, 5} → {3, 4, 5, 9} and g : {3, 4, 5, 9} → {7, 11, 15} be functions defined as f(2) =
3, f(3) = 4, f(4) = f(5) = 5 and g(3) = g(4) = 7 and g(5) = g(9) = 11. Find gof.
Solution We have gof(2) = g(f(2)) = g(3) = 7, gof(3) = g(f(3)) = g(4) = 7, gof(4) = g(f(4)) = g(5) = 11 and
gof(5) = g(f(5)) = g(5) = 11.

Example 16 Find gof and fog, if f : R → R and g : R → R are given by f(x) = cos x and g(x) = 3x². Show that
gof ≠ fog.
Solution We have gof(x) = g(f(x)) = g(cos x) = 3 (cos x)² = 3 cos² x. Similarly, fog(x) = f(g(x)) = f(3x²) =
cos (3x²). Note that 3 cos² x ≠ cos (3x²), for x = 0. Hence, gof ≠ fog.
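
The order of composition in Example 16 can be checked numerically at x = 0. This is a minimal sketch; the helper compose is our own name for the operation of Definition 8.

```python
import math


def compose(g, f):
    # Returns the function gof, i.e. x -> g(f(x)).
    return lambda x: g(f(x))


def f(x):
    return math.cos(x)


def g(x):
    return 3 * x ** 2


gof = compose(g, f)
fog = compose(f, g)

print(gof(0.0), fog(0.0))  # 3.0 1.0
```

Since gof(0) = 3 while fog(0) = 1, the two composites differ.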

Example 17 Show that if f : R − {7/5} → R − {3/5} is defined by f(x) = (3x + 4)/(5x − 7) and
g : R − {3/5} → R − {7/5} is defined by g(x) = (7x + 4)/(5x − 3), then fog = IA and gof = IB, where
A = R − {3/5}, B = R − {7/5}; IA(x) = x, ∀ x ∈ A, IB(x) = x, ∀ x ∈ B are called identity
functions on sets A and B, respectively.
Solution We have
gof(x) = g((3x + 4)/(5x − 7))
= [7((3x + 4)/(5x − 7)) + 4] / [5((3x + 4)/(5x − 7)) − 3]
= (21x + 28 + 20x − 28)/(15x + 20 − 15x + 21)
= 41x/41 = x.
Similarly, fog(x) = f((7x + 4)/(5x − 3))
= [3((7x + 4)/(5x − 3)) + 4] / [5((7x + 4)/(5x − 3)) − 7]
= (21x + 12 + 20x − 12)/(35x + 20 − 35x + 21)
= 41x/41 = x.

Thus, gof(x) = x, ∀ x ∈ B and fog(x) = x, ∀ x ∈ A, which implies that gof = IB and fog = IA.
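
The identities gof = IB and fog = IA of Example 17 can be spot-checked with exact rational arithmetic. The sketch below tests only a finite sample of rational points (the quarter-integers between −5 and 5, our own choice), so it illustrates the algebra rather than proving it.

```python
from fractions import Fraction


def f(x):
    # f : R - {7/5} -> R - {3/5}
    return (3 * x + 4) / (5 * x - 7)


def g(x):
    # g : R - {3/5} -> R - {7/5}
    return (7 * x + 4) / (5 * x - 3)


# Assumption: a finite sample of rational points; none equals 7/5 or 3/5.
sample = [Fraction(n, 4) for n in range(-20, 21)]

gof_is_identity = all(g(f(x)) == x for x in sample)
fog_is_identity = all(f(g(x)) == x for x in sample)

print(gof_is_identity, fog_is_identity)  # True True
```

Using Fraction avoids rounding errors, so the comparison with x is exact.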

Example 18 Show that if f : A → B and g : B → C are one-one, then gof : A → C is also one-one.
Solution Suppose gof(x1) = gof(x2)
⇒ g(f(x1)) = g(f(x2))
⇒ f(x1) = f(x2), as g is one-one
⇒ x1 = x2, as f is one-one
Hence, gof is one-one.

Example 19 Show that if f : A → B and g : B → C are onto, then gof : A → C is also onto.
Solution Given an arbitrary element z ∈ C, there exists a pre-image y of z under g such that g(y) = z,
since g is onto. Further, for y ∈ B, there exists an element x in A with f(x) = y, since f is onto. Therefore,
gof(x) = g(f(x)) = g(y) = z, showing that gof is onto.

Example 20 Consider functions f and g such that composite gof is defined and is one-one. Are f and g
both necessarily one-one?
Solution Consider f : {1, 2, 3, 4} → {1, 2, 3, 4, 5, 6} defined as f(x) = x, ∀ x and g : {1, 2, 3, 4, 5, 6} → {1, 2, 3,
4, 5, 6} as g(x) = x, for x = 1, 2, 3, 4 and g(5) = g(6) = 5. Then, gof(x) = x, ∀ x, which shows that gof is
one-one. But g is clearly not one-one.

Example 21 Are f and g both necessarily onto, if gof is onto?
Solution Consider f : {1, 2, 3, 4} → {1, 2, 3, 4} and g : {1, 2, 3, 4} → {1, 2, 3} defined as f(1) = 1, f(2) = 2, f(3)
= f(4) = 3, g(1) = 1, g(2) = 2 and g(3) = g(4) = 3. It can be seen that gof is onto but f is not onto.
Remark It can be verified in general that gof is one-one implies that f is one-one. Similarly, gof is onto
implies that g is onto.
Now, we would like to have a close look at the functions f and g described in the beginning of this section
in reference to a Board Examination. Each student appearing in Class X Examination of the Board is
assigned a roll number under the function f and each roll number is assigned a code number under g.
After the answer scripts are examined, examiner enters the mark against each code number in a mark
book and submits to the office of the Board. The Board officials decode by assigning roll number back
to each code number through a process reverse to g and thus mark gets attached to roll number rather
than code number. Further, the process reverse to f assigns a roll number to the student having that
roll number. This helps in assigning mark to the student scoring that mark. We observe that while
composing f and g, to get gof, first f and then g was applied, while in the reverse process of the
composite gof, first the reverse process of g is applied and then the reverse process of f.

Example 22 Let f : {1, 2, 3} → {a, b, c} be one-one and onto function given by f (1) = a, f (2) = b and f (3)
= c. Show that there exists a function g : {a, b, c} → {1, 2, 3} such that gof = IX and fog = IY, where, X =
{1, 2, 3} and Y = {a, b, c}.
Solution Consider g : {a, b, c} → {1, 2, 3} as g (a) = 1, g (b) = 2 and g (c) = 3. It is easy to verify that the
composite gof = IX is the identity function on X and the composite fog = IY is the identity function on Y.

Remark The interesting fact is that the result mentioned in the above example is true for an arbitrary
one-one and onto function f : X → Y. Not only this, even the converse is also true, i.e., if f : X → Y is a
function such that there exists a function g : Y → X such that gof = IX and fog = IY, then f must be one-
one and onto.
The above discussion, Example 22 and Remark lead to the following definition:

Definition 9 A function f : X → Y is defined to be invertible, if there exists a function g : Y → X such that
gof = IX and fog = IY. The function g is called the inverse of f and is denoted by f⁻¹.
Thus, if f is invertible, then f must be one-one and onto and conversely, if f is one-one and onto, then
f must be invertible. This fact significantly helps for proving a function f to be invertible by showing
that f is one-one and onto, specially when the actual inverse of f is not to be determined.

Example 23 Let f : N → Y be a function defined as f(x) = 4x + 3, where
Y = {y ∈ N : y = 4x + 3 for some x ∈ N}. Show that f is invertible. Find the inverse.
Solution Consider an arbitrary element y of Y. By the definition of Y, y = 4x + 3, for some
x in the domain N. This shows that x = (y − 3)/4. Define g : Y → N by g(y) = (y − 3)/4.
Now, gof(x) = g(f(x)) = g(4x + 3) = (4x + 3 − 3)/4 = x and
fog(y) = f(g(y)) = f((y − 3)/4) = 4(y − 3)/4 + 3 = y − 3 + 3 = y. This shows that gof = IN
and fog = IY, which implies that f is invertible and g is the inverse of f.
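
The inverse found in Example 23 can be tried out on the first few natural numbers. This is a minimal sketch, assuming N = {1, 2, 3, ...} so that the smallest element of Y is 7; the check inside g that y belongs to Y is our own addition.

```python
def f(x):
    # f : N -> Y, f(x) = 4x + 3
    return 4 * x + 3


def g(y):
    # g : Y -> N, g(y) = (y - 3)/4; defined only for y in Y.
    if y < 7 or (y - 3) % 4 != 0:
        raise ValueError("y does not belong to Y")
    return (y - 3) // 4


xs = range(1, 101)
gof_ok = all(g(f(x)) == x for x in xs)         # gof = I on the sample of N
fog_ok = all(f(g(f(x))) == f(x) for x in xs)   # fog = I on the sample of Y

print(gof_ok, fog_ok)  # True True
```

Only the first 100 natural numbers are tested, so this supports the computation above without replacing it.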
Example 24 Let Y = {n² : n ∈ N} ⊂ N. Consider f : N → Y as f(n) = n². Show that f is invertible. Find the
inverse of f.
Solution An arbitrary element y in Y is of the form n², for some n ∈ N. This implies that n = √y.
This gives a function g : Y → N, defined by g(y) = √y. Now,
gof(n) = g(n²) = √(n²) = n and fog(y) = f(√y) = (√y)² = y, which shows that gof = IN and fog
= IY. Hence, f is invertible with f⁻¹ = g.
Example 25 Let f : N → R be a function defined as f(x) = 4x² + 12x + 15. Show that
f : N → S, where S is the range of f, is invertible. Find the inverse of f.
Solution Let y be an arbitrary element of range f. Then y = 4x² + 12x + 15, for some x in N, which
implies that y = (2x + 3)² + 6. This gives x = (√(y − 6) − 3)/2, as y ≥ 6.
Let us define g : S → N by g(y) = (√(y − 6) − 3)/2.
Now gof(x) = g(f(x)) = g(4x² + 12x + 15) = g((2x + 3)² + 6)
= (√((2x + 3)² + 6 − 6) − 3)/2 = (2x + 3 − 3)/2 = x
and fog(y) = f((√(y − 6) − 3)/2) = (2(√(y − 6) − 3)/2 + 3)² + 6
= (√(y − 6) − 3 + 3)² + 6 = (√(y − 6))² + 6 = y − 6 + 6 = y.
Hence, gof = IN and fog = IS. This implies that f is invertible with f⁻¹ = g.
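
The inverse obtained in Example 25 can be checked in the same way with integer arithmetic. This is a sketch under the assumption N = {1, 2, 3, ...}; math.isqrt returns the integer square root, and the membership test inside g is our own addition.

```python
import math


def f(x):
    # f : N -> S, f(x) = 4x^2 + 12x + 15 = (2x + 3)^2 + 6
    return 4 * x * x + 12 * x + 15


def g(y):
    # g : S -> N, g(y) = (sqrt(y - 6) - 3)/2; defined only for y in S.
    root = math.isqrt(y - 6)
    # For y in S, the root equals 2x + 3 with x >= 1: odd and at least 5.
    if root * root != y - 6 or root < 5 or (root - 3) % 2 != 0:
        raise ValueError("y is not in the range S")
    return (root - 3) // 2


xs = range(1, 101)
gof_ok = all(g(f(x)) == x for x in xs)
fog_ok = all(f(g(f(x))) == f(x) for x in xs)

print(f(1), g(31), gof_ok, fog_ok)  # 31 1 True True
```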

Example 26 Consider f : N → N, g : N → N and h : N → R defined as f(x) = 2x, g(y) = 3y + 4 and h(z) =
sin z, ∀ x, y and z in N. Show that ho(gof) = (hog)of.
Solution We have
ho(gof) (x) = h(gof (x)) = h(g (f (x))) = h (g (2x))
= h(3(2x) + 4) = h(6x + 4) = sin (6x + 4) ∀ x ∈ N.
Also, ((hog) o f ) (x) = (hog) ( f (x)) = (hog) (2x) = h ( g (2x))
= h(3(2x) + 4) = h(6x + 4) = sin (6x + 4), ∀ x ∈ N.
This shows that ho(gof) = (hog) o f.
This result is true in general situation as well.
Theorem 1 If f : X → Y, g : Y → Z and h : Z → S are functions, then
ho(gof ) = (hog) o f.
Proof We have
ho(gof ) (x) = h(gof (x)) = h(g (f (x))), ∀ x in X
and (hog) of (x) = hog (f (x)) = h(g (f (x))), ∀ x in X.
Hence, ho(gof ) = (hog) o f.
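
Theorem 1 can be illustrated with the functions of Example 26. The following sketch compares the two ways of bracketing on the first 50 natural numbers; the helper compose is our own name, and a finite comparison of this kind is an illustration, not a proof.

```python
import math


def compose(outer, inner):
    # Returns the composite x -> outer(inner(x)).
    return lambda x: outer(inner(x))


def f(x):
    return 2 * x


def g(y):
    return 3 * y + 4


def h(z):
    return math.sin(z)


left = compose(h, compose(g, f))     # ho(gof)
right = compose(compose(h, g), f)    # (hog)of

same = all(left(x) == right(x) for x in range(1, 51))
print(same)  # True
```

Both composites evaluate sin (6x + 4), so the values coincide at every tested point.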

Example 27 Consider f : {1, 2, 3} → {a, b, c} and g : {a, b, c} → {apple, ball, cat} defined as f(1) = a, f(2) =
b, f(3) = c, g(a) = apple, g(b) = ball and g(c) = cat. Show that f, g and gof are invertible. Find out f⁻¹,
g⁻¹ and (gof)⁻¹ and show that (gof)⁻¹ = f⁻¹og⁻¹.
Solution Note that by definition, f and g are bijective functions. Let f⁻¹ : {a, b, c} → {1, 2, 3} and g⁻¹ :
{apple, ball, cat} → {a, b, c} be defined as f⁻¹(a) = 1, f⁻¹(b) = 2, f⁻¹(c) = 3, g⁻¹(apple) = a, g⁻¹(ball) = b and
g⁻¹(cat) = c.
It is easy to verify that f⁻¹of = I{1, 2, 3}, fof⁻¹ = I{a, b, c}, g⁻¹og = I{a, b, c} and gog⁻¹ = ID, where D = {apple,
ball, cat}. Now, gof : {1, 2, 3} → {apple, ball, cat} is given by gof(1) = apple, gof(2) = ball, gof(3) = cat.
We can define
(gof)⁻¹ : {apple, ball, cat} → {1, 2, 3} by (gof)⁻¹(apple) = 1, (gof)⁻¹(ball) = 2 and
(gof)⁻¹(cat) = 3. It is easy to see that (gof)⁻¹o(gof) = I{1, 2, 3} and
(gof)o(gof)⁻¹ = ID. Thus, we have seen that f, g and gof are invertible.
Now, f⁻¹og⁻¹(apple) = f⁻¹(g⁻¹(apple)) = f⁻¹(a) = 1 = (gof)⁻¹(apple),
f⁻¹og⁻¹(ball) = f⁻¹(g⁻¹(ball)) = f⁻¹(b) = 2 = (gof)⁻¹(ball) and
f⁻¹og⁻¹(cat) = f⁻¹(g⁻¹(cat)) = f⁻¹(c) = 3 = (gof)⁻¹(cat).
Hence (gof)⁻¹ = f⁻¹og⁻¹.
The above result is true in general situation also.
Theorem 2 Let f : X → Y and g : Y → Z be two invertible functions. Then gof is also invertible with
(gof)⁻¹ = f⁻¹og⁻¹.
Proof To show that gof is invertible with (gof)⁻¹ = f⁻¹og⁻¹, it is enough to show that (f⁻¹og⁻¹)o(gof) = IX
and (gof)o(f⁻¹og⁻¹) = IZ.
Now, (f⁻¹og⁻¹)o(gof) = ((f⁻¹og⁻¹)og)of, by Theorem 1
= (f⁻¹o(g⁻¹og))of, by Theorem 1
= (f⁻¹oIY)of, by definition of g⁻¹
= f⁻¹of
= IX.
Similarly, it can be shown that (gof)o(f⁻¹og⁻¹) = IZ.
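
Theorem 2 can be illustrated with the finite functions of Example 27. In this minimal sketch each function is stored as a Python dictionary; the helpers compose and inverse are our own names, and inverse assumes that the given function is one-one and onto.

```python
f = {1: "a", 2: "b", 3: "c"}
g = {"a": "apple", "b": "ball", "c": "cat"}


def compose(outer, inner):
    # Returns the composite x -> outer(inner(x)) as a dictionary.
    return {x: outer[inner[x]] for x in inner}


def inverse(func):
    # Swaps each pair (x, y) to (y, x); meaningful only for a bijection.
    return {y: x for x, y in func.items()}


gof = compose(g, f)
lhs = inverse(gof)                       # (gof) inverse
rhs = compose(inverse(f), inverse(g))    # (f inverse) o (g inverse)

print(lhs)          # {'apple': 1, 'ball': 2, 'cat': 3}
print(lhs == rhs)   # True
```

Note the reversal of order: g⁻¹ is applied first and f⁻¹ second.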

Example 28 Let S = {1, 2, 3}. Determine whether the functions f : S → S defined as below have inverses.
Find f⁻¹, if it exists.
(a) f = {(1, 1), (2, 2), (3, 3)}
(b) f = {(1, 2), (2, 1), (3, 1)}
(c) f = {(1, 3), (3, 2), (2, 1)}
Solution
(a) It is easy to see that f is one-one and onto, so that f is invertible with the inverse f⁻¹ of f given by
f⁻¹ = {(1, 1), (2, 2), (3, 3)} = f.
(b) Since f(2) = f(3) = 1, f is not one-one, so that f is not invertible.
(c) It is easy to see that f is one-one and onto, so that f is invertible with f⁻¹ = {(3, 1), (2, 3), (1, 2)}.
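
The three parts of Example 28 can be worked out by a single routine that returns the inverse when the function is one-one and onto, and reports failure otherwise. This is a sketch with names of our own choosing; each function is stored as a dictionary.

```python
def inverse_or_none(f, codomain):
    # Returns the inverse of f as a dictionary if f is one-one and onto,
    # and None otherwise.
    images = list(f.values())
    one_one = len(set(images)) == len(images)
    onto = set(images) == set(codomain)
    if not (one_one and onto):
        return None
    return {y: x for x, y in f.items()}


S = {1, 2, 3}
fa = {1: 1, 2: 2, 3: 3}   # part (a)
fb = {1: 2, 2: 1, 3: 1}   # part (b)
fc = {1: 3, 3: 2, 2: 1}   # part (c)

print(inverse_or_none(fa, S))  # {1: 1, 2: 2, 3: 3}
print(inverse_or_none(fb, S))  # None
print(inverse_or_none(fc, S))  # {3: 1, 2: 3, 1: 2}
```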

Binary Operations
Right from the school days, you must have come across four fundamental operations namely addition,
subtraction, multiplication and division. The main feature of these operations is that given any two
numbers a and b, we associate another number a + b or a − b or ab or a/b, b ≠ 0. It is to be noted that
only two numbers can be added or multiplied at a time. When we need to add three numbers, we first
add two numbers and the result is then added to the third number. Thus, addition, multiplication,
subtraction and division are examples of binary operation, as ‘binary’ means two. If we want to have a
general definition which can cover all these four operations, then the set of numbers is to be replaced
by an arbitrary set X and then general binary operation is nothing but association of any pair of elements
a, b from X to another element of X. This gives rise to a general definition as follows:

Definition 10 A binary operation * on a set A is a function * : A × A → A. We denote * (a, b) by a * b.

Example 29 Show that addition, subtraction and multiplication are binary operations on R, but division
is not a binary operation on R. Further, show that division is a binary operation on the set R∗ of nonzero
real numbers.
Solution + : R × R → R is given by (a, b) → a + b
- : R × R → R is given by (a, b) → a - b
× : R × R → R is given by (a, b) → ab
Since ‘+’, ‘-’ and ‘×’ are functions, they are binary operations on R.
But ÷ : R × R → R, given by (a, b) → a/b, is not a function and hence not a binary operation, as for b = 0,
a/b is not defined.
However, ÷ : R∗ × R∗ → R∗, given by (a, b) → a/b, is a function and hence a binary operation on R∗.

Example 30 Show that subtraction and division are not binary operations on N.
Solution - : N × N → N, given by (a, b) → a - b, is not a binary operation, as the image of (3, 5) under ‘-’ is 3
- 5 = - 2 ∉ N. Similarly, ÷ : N × N → N, given by (a, b) → a ÷ b, is not a binary operation, as the image of
(3, 5) under ÷ is 3 ÷ 5 = 3/5 ∉ N.
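
To show that an operation is not a binary operation on N, one pair whose image falls outside N is enough. The sketch below searches a finite sample of N for such a pair. The sample 1, ..., 10 and the helper names are our own assumptions, and a search that finds nothing on a sample does not by itself prove that the operation is a binary operation on N.

```python
from fractions import Fraction


def is_natural(value):
    # A natural number here means a whole number that is at least 1.
    return value == int(value) and value >= 1


def closure_counterexample(op, sample):
    # Returns the first pair (a, b) whose image is not in N, or None.
    for a in sample:
        for b in sample:
            if not is_natural(op(a, b)):
                return (a, b)
    return None


sample = range(1, 11)

print(closure_counterexample(lambda a, b: a - b, sample))           # (1, 1)
print(closure_counterexample(lambda a, b: Fraction(a, b), sample))  # (1, 2)
print(closure_counterexample(lambda a, b: a + b, sample))           # None
```

The pair (3, 5) used in Example 30 is another such counterexample; the search simply reports the first one it meets.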

Example 31 Show that ∗ : R × R → R given by (a, b) → a + 4b² is a binary operation.
Solution Since ∗ carries each pair (a, b) to a unique element a + 4b² in R, ∗ is a binary operation on R.

Example 32 Let P be the set of all subsets of a given set X. Show that ∪ : P × P → P given by (A, B) → A
∪ B and ∩ : P × P → P given by (A, B) → A ∩ B are binary operations on the set P.
Solution Since union operation ∪ carries each pair (A, B) in P × P to a unique element A ∪ B in P, ∪ is
binary operation on P. Similarly, the intersection operation ∩ carries each pair (A, B) in P × P to a unique
element A ∩ B in P, ∩ is a binary operation on P.

Example 33 Show that ∨ : R × R → R given by (a, b) → max {a, b} and ∧ : R × R → R given by (a, b)
→ min {a, b} are binary operations.
Solution Since ∨ carries each pair (a, b) in R × R to a unique element namely maximum of a and b lying
in R, ∨ is a binary operation. Using the similar argument, one can say that ∧ is also a binary operation.
Remark ∨ (4, 7) = 7, ∨ (4, – 7) = 4, ∧ (4, 7) = 4 and ∧ (4, – 7) = – 7.
When number of elements in a set A is small, we can express a binary operation ∗ on the set A through
a table called the operation table for the operation ∗. For example consider A = {1, 2, 3}. Then, the
operation ∨ on A defined in Example 33 can be expressed by the following operation table. Here, ∨ (1,
3) = 3, ∨ (2, 3) = 3, ∨ (1, 2) = 2.

∨   1   2   3
1   1   2   3
2   2   2   3
3   3   3   3

Here, we are having 3 rows and 3 columns in the operation table with (i, j)th entry of the table being
maximum of ith and jth elements of the set A. This can be generalised for general operation ∗ : A × A →
A. If A = {a1, a2, ..., an}, then the operation table will be having n rows and n columns with (i, j)th entry
being ai ∗ aj. Conversely, given any operation table having n rows and n columns with each entry being
an element of A = {a1, a2, ..., an}, we can define a binary operation ∗ : A × A → A given by ai ∗ aj = the
entry in the ith row and jth column of the operation table.
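
Both directions of this correspondence can be sketched in a few lines: building the operation table from an operation, and recovering an operation from a table. The function names below are our own, and the example uses the operation ∨ (maximum) on A = {1, 2, 3}.

```python
def operation_table(elements, op):
    # The (i, j)th entry is elements[i] * elements[j] under the operation op.
    return [[op(a, b) for b in elements] for a in elements]


def operation_from_table(elements, table):
    # Conversely, reads a * b from the row of a and the column of b.
    index = {element: position for position, element in enumerate(elements)}
    return lambda a, b: table[index[a]][index[b]]


A = [1, 2, 3]
table = operation_table(A, max)
for row in table:
    print(row)
# [1, 2, 3]
# [2, 2, 3]
# [3, 3, 3]

join = operation_from_table(A, table)
print(join(1, 3), join(2, 3), join(1, 2))  # 3 3 2
```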
One may note that 3 and 4 can be added in any order and the result is same, i.e., 3 + 4 = 4 + 3, but
subtraction of 3 and 4 in different order give different results, i.e., 3 – 4 ≠ 4 – 3. Similarly, in case of
multiplication of 3 and 4, order is immaterial, but division of 3 and 4 in different order give different
results. Thus, addition and multiplication of 3 and 4 are meaningful, but subtraction and division of 3
and 4 are meaningless. For subtraction and division we have to write ‘subtract 3 from 4’, ‘subtract 4
from 3’, ‘divide 3 by 4’ or ‘divide 4 by 3’.
This leads to the following definition:

Definition 11 A binary operation ∗ on the set X is called commutative, if a ∗ b = b ∗ a, for every a, b ∈ X.

Example 34 Show that + : R × R → R and × : R × R → R are commutative binary operations, but – : R ×
R → R and ÷ : R∗ × R∗ → R∗ are not commutative.
Solution Since a + b = b + a and a × b = b × a, ∀ a, b ∈ R, ‘+’ and ‘×’ are commutative binary operations.
However, ‘–’ is not commutative, since 3 – 4 ≠ 4 – 3. Similarly, 3 ÷ 4 ≠ 4 ÷ 3 shows that ‘÷’ is not
commutative.

Example 35 Show that ∗ : R × R → R defined by a ∗ b = a + 2b is not commutative.


Solution Since 3 ∗ 4 = 3 + 8 = 11 and 4 ∗ 3 = 4 + 6 = 10, we have 3 ∗ 4 ≠ 4 ∗ 3, showing that the operation ∗ is not commutative.
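The counterexamples used in Examples 34 and 35 can be found mechanically. The sketch below assumes a hypothetical helper `commutativity_counterexample` that searches a finite sample of numbers; finding no counterexample in a sample does not prove commutativity, whereas a single counterexample does disprove it.

```python
import operator


def commutativity_counterexample(op, sample):
    """Return a pair (a, b) with op(a, b) != op(b, a), or None if the sample has none."""
    for a in sample:
        for b in sample:
            if op(a, b) != op(b, a):
                return (a, b)
    return None


def star(a, b):
    # The operation of Example 35: a * b = a + 2b
    return a + 2 * b


sample = [3, 4]
print(commutativity_counterexample(operator.add, sample))  # no counterexample in this sample
print(commutativity_counterexample(operator.sub, sample))
print(commutativity_counterexample(star, sample))
```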
If we want to associate three elements of a set X through a binary operation on X, we encounter a
natural problem. The expression a ∗ b ∗ c may be interpreted as (a ∗ b) ∗ c or a ∗ (b ∗ c) and these two
expressions need not be the same. For example, (8 – 5) – 2 ≠ 8 – (5 – 2). Therefore, association of the three
numbers 8, 5 and 2 through the binary operation ‘subtraction’ is meaningless unless brackets are used.
But in the case of addition, 8 + 5 + 2 has the same value whether we look at it as (8 + 5) + 2 or as 8 + (5
+ 2). Thus, association of 3 or even more than 3 numbers through addition is meaningful without using
brackets. This leads to the following:

Definition 12 A binary operation ∗ : A × A → A is said to be associative if
(a ∗ b) ∗ c = a ∗ (b ∗ c), ∀ a, b, c ∈ A.

Example 36 Show that addition and multiplication are associative binary operations on R. But subtraction
is not associative on R. Division is not associative on R∗.

Solution Addition and multiplication are associative, since (a + b) + c = a + (b + c) and (a × b) × c = a ×
(b × c) ∀ a, b, c ∈ R. However, subtraction and division are not associative, as (8 – 5) – 3 ≠ 8 – (5 –
3) and (8 ÷ 5) ÷ 3 ≠ 8 ÷ (5 ÷ 3).

Example 37 Show that ∗ : R × R → R given by a ∗ b = a + 2b is not associative.


Solution The operation ∗ is not associative, since
(8 ∗ 5) ∗ 3 = (8 + 10) ∗ 3 = (8 + 10) + 6 = 24,
while 8 ∗ (5 ∗ 3) = 8 ∗ (5 + 6) = 8 ∗ 11 = 8 + 22 = 30.
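The two bracketings computed in Example 37 can be reproduced in a few lines. This is a sketch only: the helper `associativity_counterexample` is a hypothetical name, and it searches a small finite sample, so it can refute associativity but cannot prove it.

```python
from itertools import product


def associativity_counterexample(op, sample):
    """Return (a, b, c) with (a op b) op c != a op (b op c), or None if the sample has none."""
    for a, b, c in product(sample, repeat=3):
        if op(op(a, b), c) != op(a, op(b, c)):
            return (a, b, c)
    return None


def star(a, b):
    # The operation of Example 37: a * b = a + 2b
    return a + 2 * b


left = star(star(8, 5), 3)    # (8 * 5) * 3
right = star(8, star(5, 3))   # 8 * (5 * 3)
print(left, right)

sample = [8, 5, 3]
print(associativity_counterexample(lambda a, b: a + b, sample))
print(associativity_counterexample(lambda a, b: a - b, sample))
```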

Remark The associative property of a binary operation is very important in the sense that, with this property,
we can write a1 ∗ a2 ∗ ... ∗ an without ambiguity. But in the absence of this
property, the expression a1 ∗ a2 ∗ ... ∗ an is ambiguous unless brackets are used. Recall that in the earlier
classes brackets were used whenever subtraction or division operations or more than one operation
occurred.
For the binary operation ‘+’ on R, the interesting feature of the number zero is that a + 0 = a = 0 + a,
i.e., any number remains unaltered by adding zero. But in case of multiplication, the number 1 plays
this role, as a × 1 = a = 1 × a, ∀ a in R. This leads to the following definition:

Definition 13 Given a binary operation ∗ : A × A → A, an element e ∈ A, if it exists, is called identity for
the operation ∗, if a ∗ e = a = e ∗ a, ∀ a ∈ A.

Example 38 Show that zero is the identity for addition on R and 1 is the identity for multiplication on
R. But there is no identity element for the operations
– : R × R → R and ÷ : R∗ × R∗ → R∗.
Solution a + 0 = 0 + a = a and a × 1 = a = 1 × a, ∀ a ∈ R implies that 0 and 1 are identity elements for the
operations ‘+’ and ‘×’ respectively. Further, there is no element e in R with a – e = e – a, ∀ a. Similarly,
we cannot find any element e in R∗ such that a ÷ e = e ÷ a, ∀ a in R∗. Hence, ‘–’ and ‘÷’ do not have
an identity element.
Remark Zero is identity for the addition operation on R but it is not identity for the addition operation
on N, as 0 ∉ N. In fact the addition operation on N does not have any identity.
One further notices that for the addition operation + : R × R → R, given any a ∈ R, there exists – a in
R such that a + (– a) = 0 (identity for ‘+’) = (– a) + a.
Similarly, for the multiplication operation on R, given any a ≠ 0 in R, we can choose $\frac{1}{a}$
in R such that $a \times \frac{1}{a} = 1$ (identity for ‘×’) $= \frac{1}{a} \times a$. This leads to the following definition:

Definition 14 Given a binary operation ∗ : A × A → A with the identity element e in A, an element a ∈ A
is said to be invertible with respect to the operation ∗, if there exists an element b in A such that a ∗
b = e = b ∗ a, and b is called the inverse of a and is denoted by $a^{-1}$.

Example 39 Show that – a is the inverse of a for the addition operation ‘+’ on R and $\frac{1}{a}$ is the inverse of
a ≠ 0 for the multiplication operation ‘×’ on R.
Solution As a + (– a) = a – a = 0 and (– a) + a = 0, – a is the inverse of a for addition.
Similarly, for a ≠ 0, $a \times \frac{1}{a} = 1 = \frac{1}{a} \times a$ implies that $\frac{1}{a}$ is the inverse of a for multiplication.

Example 40 Show that – a is not the inverse of a ∈ N for the addition operation + on N and $\frac{1}{a}$ is not the
inverse of a ∈ N for the multiplication operation × on N, for a ≠ 1.
Solution Since – a ∉ N, – a cannot be the inverse of a for the addition operation on N, although – a satisfies a
+ (– a) = 0 = (– a) + a.
Similarly, for a ≠ 1 in N, $\frac{1}{a} \notin$ N, which implies that no element of N other than 1 has an inverse for
the multiplication operation on N.
Examples 34, 36, 38 and 39 show that addition on R is a commutative and associative binary operation
with 0 as the identity element and - a as the inverse of a in R ∀ a.
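On a finite set, Definitions 13 and 14 can be checked by brute force. The following sketch assumes the set A = {1, 2, 3} with the operation ∨ (maximum) from the operation table earlier in this chapter; the helper names `find_identity` and `invertible_elements` are hypothetical, and the approach only works for finite sets, not for R or N.

```python
def find_identity(elements, op):
    """Return e with op(a, e) == a == op(e, a) for every a, or None if no such e exists."""
    for e in elements:
        if all(op(a, e) == a == op(e, a) for a in elements):
            return e
    return None


def invertible_elements(elements, op):
    """Map each invertible a to an inverse b, i.e. op(a, b) == e == op(b, a)."""
    e = find_identity(elements, op)
    if e is None:
        return {}
    inverses = {}
    for a in elements:
        for b in elements:
            if op(a, b) == e == op(b, a):
                inverses[a] = b
                break
    return inverses


A = [1, 2, 3]
print(find_identity(A, max))         # identity of ∨ on A
print(invertible_elements(A, max))   # elements of A that have an inverse under ∨
```

Under these assumptions the identity for ∨ on A is 1, and 1 is the only invertible element, since the maximum of 2 or 3 with any element of A can never equal 1.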



2 Inverse Trigonometric Functions
Introduction
We have studied that the inverse of a function f, denoted by $f^{-1}$, exists if f is one-one and onto. There
are many functions which are not one-one, onto or both and hence we cannot talk of their inverses.
In Class XI, we studied that trigonometric functions are not one-one and onto over their natural
domains and ranges and hence their inverses do not exist. In this chapter, we shall study the
restrictions on domains and ranges of trigonometric functions which ensure the existence of their
inverses and observe their behaviour through graphical representations. Besides, some elementary
properties will also be discussed.
The inverse trigonometric functions play an important role in calculus, for they serve to define many
integrals.
The concepts of inverse trigonometric functions are also used in science and engineering.

Basic Concepts
In Class XI, we have studied trigonometric functions, which are defined as follows:
sine function, i.e., $\sin : \mathbf{R} \to [-1, 1]$
cosine function, i.e., $\cos : \mathbf{R} \to [-1, 1]$
tangent function, i.e., $\tan : \mathbf{R} - \{x : x = (2n + 1)\frac{\pi}{2},\ n \in \mathbf{Z}\} \to \mathbf{R}$
cotangent function, i.e., $\cot : \mathbf{R} - \{x : x = n\pi,\ n \in \mathbf{Z}\} \to \mathbf{R}$
secant function, i.e., $\sec : \mathbf{R} - \{x : x = (2n + 1)\frac{\pi}{2},\ n \in \mathbf{Z}\} \to \mathbf{R} - (-1, 1)$
cosecant function, i.e., $\operatorname{cosec} : \mathbf{R} - \{x : x = n\pi,\ n \in \mathbf{Z}\} \to \mathbf{R} - (-1, 1)$
We have also learnt in Chapter 1 that if f : X → Y such that f(x) = y is one-one and onto, then we can
define a unique function g : Y → X such that g(y) = x, where x ∈ X and y = f(x), y ∈ Y. Here, the domain
of g = range of f and the range of g = domain of f. The function g is called the inverse of f and is
denoted by $f^{-1}$. Further, g is also one-one and onto and the inverse of g is f. Thus, $g^{-1} = (f^{-1})^{-1} = f$. We also
have
$$(f^{-1} \circ f)(x) = f^{-1}(f(x)) = f^{-1}(y) = x$$
and
$$(f \circ f^{-1})(y) = f(f^{-1}(y)) = f(x) = y.$$
The domain of the sine function is the set of all real numbers and its range is the closed interval $[-1, 1]$. If we restrict its domain to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, then it becomes one-one and onto with range $[-1, 1]$. Actually, the sine function restricted to any of the intervals $\left[-\frac{3\pi}{2}, -\frac{\pi}{2}\right]$, $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, $\left[\frac{\pi}{2}, \frac{3\pi}{2}\right]$ etc., is one-one and its range is $[-1, 1]$. We can, therefore, define the inverse of the sine function in each of these intervals. We denote the inverse of the sine function by $\sin^{-1}$ (arc sine function). The branch with range $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ is called the principal value branch, and we write
$$\sin^{-1} : [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$$
From the definition of the inverse functions, it follows that $\sin(\sin^{-1} x) = x$ if $-1 \le x \le 1$ and $\sin^{-1}(\sin x) = x$ if $-\frac{\pi}{2} \le x \le \frac{\pi}{2}$. In other words, if $y = \sin^{-1} x$, then $\sin y = x$.

Remarks
(i) We know from Chapter 1 that if y = f(x) is an invertible function, then $x = f^{-1}(y)$. Thus, the graph of
the $\sin^{-1}$ function can be obtained from the graph of the original function by interchanging the x and y axes, i.e., if
(a, b) is a point on the graph of the sine function, then (b, a) becomes the corresponding point on the graph
of the inverse of the sine function. Thus, the graph of the function $y = \sin^{-1} x$ can be obtained from the graph
of $y = \sin x$ by interchanging the x and y axes. The graphs of $y = \sin x$ and $y = \sin^{-1} x$ are as given in the figure.
The dark portion of the graph of $y = \sin^{-1} x$ represents the principal value branch.
(ii) It can be shown that the graph of an inverse function can be obtained from the corresponding graph
of the original function as a mirror image (i.e., reflection) along the line y = x. This can be visualised by
looking at the graphs of $y = \sin x$ and $y = \sin^{-1} x$ drawn on the same axes.



Like the sine function, the cosine function is a function whose domain is the set of all real numbers and
whose range is the set $[-1, 1]$. If we restrict the domain of the cosine function to $[0, \pi]$, then it becomes one-one
and onto with range $[-1, 1]$. Actually, the cosine function restricted to any of the intervals $[-\pi, 0]$, $[0, \pi]$,
$[\pi, 2\pi]$ etc., is bijective with range $[-1, 1]$. We can, therefore, define the inverse of the cosine function in
each of these intervals. We denote the inverse of the cosine function by $\cos^{-1}$ (arc cosine function). Thus,
$\cos^{-1}$ is a function whose domain is $[-1, 1]$ and range could be any of the intervals $[-\pi, 0]$, $[0, \pi]$, $[\pi, 2\pi]$
etc. Corresponding to each such interval, we get a branch of the function $\cos^{-1}$. The branch with range
$[0, \pi]$ is called the principal value branch of the function $\cos^{-1}$. We write
$$\cos^{-1} : [-1, 1] \to [0, \pi].$$
The graph of the function given by $y = \cos^{-1} x$ can be drawn in the same way as discussed for the
graph of $y = \sin^{-1} x$. The graphs of $y = \cos x$ and $y = \cos^{-1} x$ are given.



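Let us now discuss $\operatorname{cosec}^{-1} x$ and $\sec^{-1} x$ as follows:
Since $\operatorname{cosec} x = \frac{1}{\sin x}$, the domain of the cosec function is the set $\{x : x \in \mathbf{R} \text{ and } x \ne n\pi,\ n \in \mathbf{Z}\}$ and the range is the set $\{y : y \in \mathbf{R},\ y \ge 1 \text{ or } y \le -1\}$, i.e., the set $\mathbf{R} - (-1, 1)$. It means that $y = \operatorname{cosec} x$ assumes all real values except $-1 < y < 1$ and is not defined for integral multiples of $\pi$. If we restrict the domain of the cosec function to $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right] - \{0\}$, then it is one-one and onto with its range as the set $\mathbf{R} - (-1, 1)$. Actually, the cosec function restricted to any of the intervals $\left[-\frac{3\pi}{2}, -\frac{\pi}{2}\right] - \{-\pi\}$, $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right] - \{0\}$, $\left[\frac{\pi}{2}, \frac{3\pi}{2}\right] - \{\pi\}$ etc., is bijective and its range is the set $\mathbf{R} - (-1, 1)$.
Thus $\operatorname{cosec}^{-1}$ can be defined as a function whose domain is $\mathbf{R} - (-1, 1)$ and range could be any of the intervals $\left[-\frac{3\pi}{2}, -\frac{\pi}{2}\right] - \{-\pi\}$, $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right] - \{0\}$, $\left[\frac{\pi}{2}, \frac{3\pi}{2}\right] - \{\pi\}$ etc. The function corresponding to the range $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right] - \{0\}$ is called the principal value branch of $\operatorname{cosec}^{-1}$. We thus have the principal branch as
$$\operatorname{cosec}^{-1} : \mathbf{R} - (-1, 1) \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] - \{0\}$$
The graphs of $y = \operatorname{cosec} x$ and $y = \operatorname{cosec}^{-1} x$ are given.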

Also, since $\sec x = \frac{1}{\cos x}$, the domain of $y = \sec x$ is the set $\mathbf{R} - \{x : x = (2n + 1)\frac{\pi}{2},\ n \in \mathbf{Z}\}$ and the range is the set $\mathbf{R} - (-1, 1)$. It means that sec (secant function) assumes all real values except $-1 < y < 1$ and is not defined for odd multiples of $\frac{\pi}{2}$. If we restrict the domain of the secant function to $[0, \pi] - \left\{\frac{\pi}{2}\right\}$, then it is one-one and onto with its range as the set $\mathbf{R} - (-1, 1)$. Actually, the secant function restricted to any of the intervals $[-\pi, 0] - \left\{-\frac{\pi}{2}\right\}$, $[0, \pi] - \left\{\frac{\pi}{2}\right\}$, $[\pi, 2\pi] - \left\{\frac{3\pi}{2}\right\}$ etc., is bijective and its range is $\mathbf{R} - (-1, 1)$. Thus $\sec^{-1}$ can be defined as a function whose domain is $\mathbf{R} - (-1, 1)$ and range could be any of the intervals $[-\pi, 0] - \left\{-\frac{\pi}{2}\right\}$, $[0, \pi] - \left\{\frac{\pi}{2}\right\}$, $[\pi, 2\pi] - \left\{\frac{3\pi}{2}\right\}$ etc. Corresponding to each of these intervals, we get different branches of the function $\sec^{-1}$. The branch with range $[0, \pi] - \left\{\frac{\pi}{2}\right\}$ is called the principal value branch of the function $\sec^{-1}$. We thus have
$$\sec^{-1} : \mathbf{R} - (-1, 1) \to [0, \pi] - \left\{\frac{\pi}{2}\right\}$$
The graphs of the functions $y = \sec x$ and $y = \sec^{-1} x$ are given.



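Finally, we now discuss $\tan^{-1}$ and $\cot^{-1}$.
We know that the domain of the tan function (tangent function) is the set $\{x : x \in \mathbf{R} \text{ and } x \ne (2n + 1)\frac{\pi}{2},\ n \in \mathbf{Z}\}$ and the range is $\mathbf{R}$. It means that the tan function is not defined for odd multiples of $\frac{\pi}{2}$. If we restrict the domain of the tangent function to $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, then it is one-one and onto with its range as $\mathbf{R}$. Actually, the tangent function restricted to any of the intervals $\left(-\frac{3\pi}{2}, -\frac{\pi}{2}\right)$, $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, $\left(\frac{\pi}{2}, \frac{3\pi}{2}\right)$ etc., is bijective and its range is $\mathbf{R}$. The branch with range $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ is called the principal value branch of the function $\tan^{-1}$.
We thus have
$$\tan^{-1} : \mathbf{R} \to \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$$
The graphs of the functions $y = \tan x$ and $y = \tan^{-1} x$ are given.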



We know that the domain of the cot function (cotangent function) is the set $\{x : x \in \mathbf{R} \text{ and } x \ne n\pi,\ n \in \mathbf{Z}\}$
and the range is $\mathbf{R}$. It means that the cotangent function is not defined for integral multiples of $\pi$. If we restrict
the domain of the cotangent function to $(0, \pi)$, then it is bijective with its range as $\mathbf{R}$. In fact, the cotangent
function restricted to any of the intervals $(-\pi, 0)$, $(0, \pi)$, $(\pi, 2\pi)$ etc., is bijective and its range is $\mathbf{R}$. Thus
$\cot^{-1}$ can be defined as a function whose domain is $\mathbf{R}$ and range is any of the intervals $(-\pi, 0)$, $(0, \pi)$,
$(\pi, 2\pi)$ etc. These intervals give different branches of the function $\cot^{-1}$. The function with range $(0, \pi)$
is called the principal value branch of the function $\cot^{-1}$. We thus have
$$\cot^{-1} : \mathbf{R} \to (0, \pi)$$
The graphs of $y = \cot x$ and $y = \cot^{-1} x$ are given.



The following table gives the inverse trigonometric function (principal value branches) along with their
domains and ranges.

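$\sin^{-1} : [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$

$\cos^{-1} : [-1, 1] \to [0, \pi]$

$\operatorname{cosec}^{-1} : \mathbf{R} - (-1, 1) \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] - \{0\}$

$\sec^{-1} : \mathbf{R} - (-1, 1) \to [0, \pi] - \left\{\frac{\pi}{2}\right\}$

$\tan^{-1} : \mathbf{R} \to \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$

$\cot^{-1} : \mathbf{R} \to (0, \pi)$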

[Note]
1. $\sin^{-1} x$ should not be confused with $(\sin x)^{-1}$. In fact $(\sin x)^{-1} = \frac{1}{\sin x}$, and similarly for other trigonometric
functions.
2. Whenever no branch of an inverse trigonometric function is mentioned, we mean the principal value
branch of that function.
3. The value of an inverse trigonometric function which lies in the range of the principal branch is called
the principal value of that inverse trigonometric function.

We now consider some examples:

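Example 1 Find the principal value of $\sin^{-1}\left(\frac{1}{\sqrt{2}}\right)$.

Solution Let $\sin^{-1}\left(\frac{1}{\sqrt{2}}\right) = y$. Then $\sin y = \frac{1}{\sqrt{2}}$.
We know that the range of the principal value branch of $\sin^{-1}$ is $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ and $\sin\left(\frac{\pi}{4}\right) = \frac{1}{\sqrt{2}}$. Therefore, the principal value of $\sin^{-1}\left(\frac{1}{\sqrt{2}}\right)$ is $\frac{\pi}{4}$.

Example 2 Find the principal value of $\cot^{-1}\left(\frac{-1}{\sqrt{3}}\right)$.

Solution Let $\cot^{-1}\left(\frac{-1}{\sqrt{3}}\right) = y$. Then
$$\cot y = \frac{-1}{\sqrt{3}} = -\cot\left(\frac{\pi}{3}\right) = \cot\left(\pi - \frac{\pi}{3}\right) = \cot\left(\frac{2\pi}{3}\right)$$
We know that the range of the principal value branch of $\cot^{-1}$ is $(0, \pi)$ and $\cot\left(\frac{2\pi}{3}\right) = \frac{-1}{\sqrt{3}}$.
Hence, the principal value of $\cot^{-1}\left(\frac{-1}{\sqrt{3}}\right)$ is $\frac{2\pi}{3}$.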
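The principal values in Examples 1 and 2 can be checked numerically. Python's `math` module provides `asin`, `acos` and `atan`, whose return values lie in the principal value branches; it has no inverse cotangent, secant or cosecant, so the helpers `arccot`, `arcsec` and `arccosec` below are hypothetical names built on the standard identities $\cot^{-1} x = \frac{\pi}{2} - \tan^{-1} x$, $\sec^{-1} x = \cos^{-1}\frac{1}{x}$ and $\operatorname{cosec}^{-1} x = \sin^{-1}\frac{1}{x}$. Treat this as a floating-point sketch, not an exact computation.

```python
import math


def arccot(x):
    """Principal value of the inverse cotangent, with range (0, pi)."""
    return math.pi / 2 - math.atan(x)


def arcsec(x):
    """Principal value of the inverse secant for |x| >= 1, range [0, pi] - {pi/2}."""
    return math.acos(1 / x)


def arccosec(x):
    """Principal value of the inverse cosecant for |x| >= 1, range [-pi/2, pi/2] - {0}."""
    return math.asin(1 / x)


example_1 = math.asin(1 / math.sqrt(2))   # Example 1: expected to be close to pi/4
example_2 = arccot(-1 / math.sqrt(3))     # Example 2: expected to be close to 2*pi/3
print(example_1, math.pi / 4)
print(example_2, 2 * math.pi / 3)
```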

Properties of Inverse Trigonometric Functions


In this section, we shall prove some important properties of inverse trigonometric functions. It may be
mentioned here that these results are valid within the principal value branches of the corresponding
inverse trigonometric functions and wherever they are defined. Some results may not be valid for all
values of the domains of inverse trigonometric functions. In fact, they will be valid only for some values
of x for which inverse trigonometric functions are defined. We will not go into the details of these
values of x in the domain as this discussion goes beyond the scope of this text book.
Let us recall that if $y = \sin^{-1} x$, then $x = \sin y$, and if $x = \sin y$, then $y = \sin^{-1} x$. This is equivalent to
$$\sin(\sin^{-1} x) = x,\ x \in [-1, 1] \quad \text{and} \quad \sin^{-1}(\sin x) = x,\ x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$$
The same is true for the other five inverse trigonometric functions as well. We now prove some properties of
inverse trigonometric functions.

1. (i) $\sin^{-1}\frac{1}{x} = \operatorname{cosec}^{-1} x$, $x \ge 1$ or $x \le -1$
(ii) $\cos^{-1}\frac{1}{x} = \sec^{-1} x$, $x \ge 1$ or $x \le -1$
(iii) $\tan^{-1}\frac{1}{x} = \cot^{-1} x$, $x > 0$
To prove the first result, we put $\operatorname{cosec}^{-1} x = y$, i.e., $x = \operatorname{cosec} y$.
Therefore $\frac{1}{x} = \sin y$.
Hence $\sin^{-1}\frac{1}{x} = y$,
or $\sin^{-1}\frac{1}{x} = \operatorname{cosec}^{-1} x$.
Similarly, we can prove the other parts.

2. (i) $\sin^{-1}(-x) = -\sin^{-1} x$, $x \in [-1, 1]$
(ii) $\tan^{-1}(-x) = -\tan^{-1} x$, $x \in \mathbf{R}$
(iii) $\operatorname{cosec}^{-1}(-x) = -\operatorname{cosec}^{-1} x$, $|x| \ge 1$
Let $\sin^{-1}(-x) = y$, i.e., $-x = \sin y$ so that $x = -\sin y$, i.e., $x = \sin(-y)$.
Hence $\sin^{-1} x = -y = -\sin^{-1}(-x)$.
Therefore $\sin^{-1}(-x) = -\sin^{-1} x$.
Similarly, we can prove the other parts.

3. (i) $\cos^{-1}(-x) = \pi - \cos^{-1} x$, $x \in [-1, 1]$
(ii) $\sec^{-1}(-x) = \pi - \sec^{-1} x$, $|x| \ge 1$
(iii) $\cot^{-1}(-x) = \pi - \cot^{-1} x$, $x \in \mathbf{R}$
Let $\cos^{-1}(-x) = y$, i.e., $-x = \cos y$ so that $x = -\cos y = \cos(\pi - y)$.
Therefore $\cos^{-1} x = \pi - y = \pi - \cos^{-1}(-x)$.
Hence $\cos^{-1}(-x) = \pi - \cos^{-1} x$.
Similarly, we can prove the other parts.

4. (i) $\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$, $x \in [-1, 1]$
(ii) $\tan^{-1} x + \cot^{-1} x = \frac{\pi}{2}$, $x \in \mathbf{R}$
(iii) $\operatorname{cosec}^{-1} x + \sec^{-1} x = \frac{\pi}{2}$, $|x| \ge 1$
Let $\sin^{-1} x = y$. Then $x = \sin y = \cos\left(\frac{\pi}{2} - y\right)$.
Therefore $\cos^{-1} x = \frac{\pi}{2} - y = \frac{\pi}{2} - \sin^{-1} x$.
Hence $\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}$.
Similarly, we can prove the other parts.

5. (i) $\tan^{-1} x + \tan^{-1} y = \tan^{-1}\frac{x + y}{1 - xy}$, $xy < 1$
(ii) $\tan^{-1} x - \tan^{-1} y = \tan^{-1}\frac{x - y}{1 + xy}$, $xy > -1$
(iii) $\tan^{-1} x + \tan^{-1} y = \pi + \tan^{-1}\left(\frac{x + y}{1 - xy}\right)$, $xy > 1$; $x, y > 0$
Let $\tan^{-1} x = \theta$ and $\tan^{-1} y = \phi$. Then $x = \tan\theta$, $y = \tan\phi$.
Now $\tan(\theta + \phi) = \frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi} = \frac{x + y}{1 - xy}$.
This gives $\theta + \phi = \tan^{-1}\frac{x + y}{1 - xy}$.
Hence $\tan^{-1} x + \tan^{-1} y = \tan^{-1}\frac{x + y}{1 - xy}$.
In the above result, if we replace y by – y, we get the second result, and by replacing y by x, we get
result 6 (iii) given below.
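Properties 4 (i), 5 (i) and 5 (iii) can be spot-checked numerically. The sketch below uses only `math.asin`, `math.acos` and `math.atan` from Python's standard library; the sample values of x and y and the helper name `tan_sum_rhs` are arbitrary choices made for illustration. A numerical check at a few points is not a proof, but it does show why the condition on xy matters.

```python
import math

# Property 4 (i): sin^-1 x + cos^-1 x = pi/2 on [-1, 1]
for x in [-1.0, -0.3, 0.0, 0.5, 1.0]:
    assert math.isclose(math.asin(x) + math.acos(x), math.pi / 2)


def tan_sum_rhs(x, y):
    """tan^-1((x + y) / (1 - xy)), the inverse tangent on the right of property 5 (i)."""
    return math.atan((x + y) / (1 - x * y))


# Property 5 (i): valid when xy < 1
x, y = 0.5, 0.25
lhs_small = math.atan(x) + math.atan(y)
rhs_small = tan_sum_rhs(x, y)

# Property 5 (iii): when xy > 1 and x, y > 0, pi must be added
x, y = 2.0, 3.0
lhs_large = math.atan(x) + math.atan(y)
rhs_large = math.pi + tan_sum_rhs(x, y)

print(lhs_small, rhs_small)
print(lhs_large, rhs_large)
```

For x = 2 and y = 3 the uncorrected formula of 5 (i) would give $\tan^{-1}(-1) = -\frac{\pi}{4}$, which cannot equal a sum of two positive angles; adding $\pi$ as in 5 (iii) gives $\frac{3\pi}{4}$.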

6. (i) $2\tan^{-1} x = \sin^{-1}\frac{2x}{1 + x^2}$, $|x| \le 1$
(ii) $2\tan^{-1} x = \cos^{-1}\frac{1 - x^2}{1 + x^2}$, $x \ge 0$
(iii) $2\tan^{-1} x = \tan^{-1}\frac{2x}{1 - x^2}$, $-1 < x < 1$
Let $\tan^{-1} x = y$, then $x = \tan y$. Now
$$\sin^{-1}\frac{2x}{1 + x^2} = \sin^{-1}\frac{2\tan y}{1 + \tan^2 y} = \sin^{-1}(\sin 2y) = 2y = 2\tan^{-1} x$$
Also
$$\cos^{-1}\frac{1 - x^2}{1 + x^2} = \cos^{-1}\frac{1 - \tan^2 y}{1 + \tan^2 y} = \cos^{-1}(\cos 2y) = 2y = 2\tan^{-1} x$$
(iii) can be worked out similarly.
We now consider some examples.

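Example 3 Show that
(i) $\sin^{-1}\left(2x\sqrt{1 - x^2}\right) = 2\sin^{-1} x$, $-\frac{1}{\sqrt{2}} \le x \le \frac{1}{\sqrt{2}}$
(ii) $\sin^{-1}\left(2x\sqrt{1 - x^2}\right) = 2\cos^{-1} x$, $\frac{1}{\sqrt{2}} \le x \le 1$

Solution
(i) Let $x = \sin\theta$. Then $\sin^{-1} x = \theta$. We have
$$\sin^{-1}\left(2x\sqrt{1 - x^2}\right) = \sin^{-1}\left(2\sin\theta\sqrt{1 - \sin^2\theta}\right) = \sin^{-1}(2\sin\theta\cos\theta) = \sin^{-1}(\sin 2\theta) = 2\theta = 2\sin^{-1} x$$
(ii) Take $x = \cos\theta$, then proceeding as above, we get $\sin^{-1}\left(2x\sqrt{1 - x^2}\right) = 2\cos^{-1} x$.

Example 4 Show that $\tan^{-1}\frac{1}{2} + \tan^{-1}\frac{2}{11} = \tan^{-1}\frac{3}{4}$.
Solution By property 5 (i), we have
$$\text{L.H.S.} = \tan^{-1}\frac{1}{2} + \tan^{-1}\frac{2}{11} = \tan^{-1}\frac{\frac{1}{2} + \frac{2}{11}}{1 - \frac{1}{2} \times \frac{2}{11}} = \tan^{-1}\frac{15}{20} = \tan^{-1}\frac{3}{4} = \text{R.H.S.}$$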
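The fraction arithmetic in Example 4 can be verified exactly with Python's `fractions.Fraction`, which avoids floating-point rounding. This is a small sketch; the variable names are illustrative only.

```python
from fractions import Fraction

x, y = Fraction(1, 2), Fraction(2, 11)

# Property 5 (i) requires xy < 1
print(x * y < 1)

# Argument of the inverse tangent on the right-hand side: (x + y) / (1 - xy)
combined = (x + y) / (1 - x * y)
print(combined)
```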

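Example 5 Express $\tan^{-1}\left(\frac{\cos x}{1 - \sin x}\right)$, $-\frac{3\pi}{2} < x < \frac{\pi}{2}$, in the simplest form.
Solution We write
$$\tan^{-1}\left(\frac{\cos x}{1 - \sin x}\right) = \tan^{-1}\left[\frac{\cos^2\frac{x}{2} - \sin^2\frac{x}{2}}{\cos^2\frac{x}{2} + \sin^2\frac{x}{2} - 2\sin\frac{x}{2}\cos\frac{x}{2}}\right]$$
$$= \tan^{-1}\left[\frac{\left(\cos\frac{x}{2} + \sin\frac{x}{2}\right)\left(\cos\frac{x}{2} - \sin\frac{x}{2}\right)}{\left(\cos\frac{x}{2} - \sin\frac{x}{2}\right)^2}\right]$$
$$= \tan^{-1}\left[\frac{\cos\frac{x}{2} + \sin\frac{x}{2}}{\cos\frac{x}{2} - \sin\frac{x}{2}}\right] = \tan^{-1}\left[\frac{1 + \tan\frac{x}{2}}{1 - \tan\frac{x}{2}}\right]$$
$$= \tan^{-1}\left[\tan\left(\frac{\pi}{4} + \frac{x}{2}\right)\right] = \frac{\pi}{4} + \frac{x}{2}$$
Alternatively,
$$\tan^{-1}\left(\frac{\cos x}{1 - \sin x}\right) = \tan^{-1}\left[\frac{\sin\left(\frac{\pi}{2} - x\right)}{1 - \cos\left(\frac{\pi}{2} - x\right)}\right] = \tan^{-1}\left[\frac{\sin\left(\frac{\pi - 2x}{2}\right)}{1 - \cos\left(\frac{\pi - 2x}{2}\right)}\right]$$
$$= \tan^{-1}\left[\frac{2\sin\left(\frac{\pi - 2x}{4}\right)\cos\left(\frac{\pi - 2x}{4}\right)}{2\sin^2\left(\frac{\pi - 2x}{4}\right)}\right]$$
$$= \tan^{-1}\left[\cot\left(\frac{\pi - 2x}{4}\right)\right] = \tan^{-1}\left[\tan\left(\frac{\pi}{2} - \frac{\pi - 2x}{4}\right)\right]$$
$$= \tan^{-1}\left[\tan\left(\frac{\pi}{4} + \frac{x}{2}\right)\right] = \frac{\pi}{4} + \frac{x}{2}$$

Example 6 Write $\cot^{-1}\left(\frac{1}{\sqrt{x^2 - 1}}\right)$, $x > 1$, in the simplest form.
Solution Let $x = \sec\theta$, then $\sqrt{x^2 - 1} = \sqrt{\sec^2\theta - 1} = \tan\theta$.
Therefore, $\cot^{-1}\frac{1}{\sqrt{x^2 - 1}} = \cot^{-1}(\cot\theta) = \theta = \sec^{-1} x$, which is the simplest form.

Example 7 Prove that $\tan^{-1} x + \tan^{-1}\frac{2x}{1 - x^2} = \tan^{-1}\left(\frac{3x - x^3}{1 - 3x^2}\right)$, $|x| < \frac{1}{\sqrt{3}}$.
Solution Let $x = \tan\theta$. Then $\theta = \tan^{-1} x$. We have
$$\text{R.H.S.} = \tan^{-1}\left(\frac{3x - x^3}{1 - 3x^2}\right) = \tan^{-1}\left(\frac{3\tan\theta - \tan^3\theta}{1 - 3\tan^2\theta}\right)$$
$$= \tan^{-1}(\tan 3\theta) = 3\theta = 3\tan^{-1} x = \tan^{-1} x + 2\tan^{-1} x$$
$$= \tan^{-1} x + \tan^{-1}\frac{2x}{1 - x^2} = \text{L.H.S.} \quad \text{(Why?)}$$

Example 8 Find the value of $\cos(\sec^{-1} x + \operatorname{cosec}^{-1} x)$, $|x| \ge 1$.
Solution We have $\cos(\sec^{-1} x + \operatorname{cosec}^{-1} x) = \cos\left(\frac{\pi}{2}\right) = 0$.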



3 Matrices
Introduction

The knowledge of matrices is necessary in various branches of mathematics. Matrices are one of the
most powerful tools in mathematics. This mathematical tool simplifies our work to a great extent when
compared with other straightforward methods. The evolution of the concept of matrices is the result of
an attempt to obtain compact and simple methods of solving systems of linear equations. Matrices are
not only used as a representation of the coefficients in a system of linear equations, but the utility of
matrices far exceeds that use. Matrix notation and operations are used in electronic spreadsheet
programs for personal computers, which in turn are used in different areas of business and science like
budgeting, sales projection, cost estimation, analysing the results of an experiment etc. Also, many
physical operations such as magnification, rotation and reflection through a plane can be represented
mathematically by matrices. Matrices are also used in cryptography. This mathematical tool is used not only
in certain branches of sciences, but also in genetics, economics, sociology, modern psychology
and industrial management.

In this chapter, we shall find it interesting to become acquainted with the fundamentals of matrix and
matrix algebra.

Matrix

Suppose we wish to express the information that Radha has 15 notebooks. We may express it as [15]
with the understanding that the number inside [ ] is the number of notebooks that Radha has. Now, if
we have to express that Radha has 15 notebooks and 6 pens, we may express it as [15 6] with the
understanding that the first number inside [ ] is the number of notebooks while the other one is the number
of pens possessed by Radha. Let us now suppose that we wish to express the information of possession
of notebooks and pens by Radha and her two friends Fauzia and Simran, which is as follows:

Radha has 15 notebooks and 6 pens.

Fauzia has 10 notebooks and 2 pens.

Simran has 13 notebooks and 5 pens.

Now this could be arranged in the tabular form as follows:

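            Notebooks   Pens
Radha          15         6
Fauzia         10         2
Simran         13         5

and this can be expressed as

$$\begin{bmatrix} 15 & 6 \\ 10 & 2 \\ 13 & 5 \end{bmatrix}$$

in which the three rows are the first, second and third rows, and the two columns are the first and second columns,

or

            Radha   Fauzia   Simran
Notebooks     15      10       13
Pens           6       2        5

which can be expressed as:

$$\begin{bmatrix} 15 & 10 & 13 \\ 6 & 2 & 5 \end{bmatrix}$$

in which the two rows are the first and second rows, and the three columns are the first, second and third columns.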

In the first arrangement the entries in the first column represent the number of notebooks possessed
by Radha, Fauzia and Simran, respectively and the entries in the second column represent the number
of pens possessed by Radha, Fauzia and Simran, respectively. Similarly, in the second arrangement,
the entries in the first row represent the number of notebooks possessed by Radha, Fauzia and Simran,
respectively. The entries in the second row represent the number of pens possessed by Radha, Fauzia
and Simran, respectively. An arrangement or display of the above kind is called a matrix. Formally, we
define matrix as:

Definition 1 A matrix is an ordered rectangular array of numbers or functions. The numbers or functions
are called the elements or the entries of the matrix.

We denote matrices by capital letters. The following are some examples of matrices:
$$A = \begin{bmatrix} -2 & 5 \\ 0 & \sqrt{5} \\ 3 & 6 \end{bmatrix}, \quad B = \begin{bmatrix} 2 + i & 3 & -\frac{1}{2} \\ 3.5 & -1 & 2 \\ \sqrt{3} & 5 & \frac{5}{7} \end{bmatrix}, \quad C = \begin{bmatrix} 1 + x & x^3 & 3 \\ \cos x & \sin x + 2 & \tan x \end{bmatrix}$$
In the above examples, the horizontal lines of elements are said to constitute the rows of the matrix and
the vertical lines of elements are said to constitute the columns of the matrix. Thus A has 3 rows and 2
columns, B has 3 rows and 3 columns while C has 2 rows and 3 columns.

Order of a matrix

A matrix having m rows and n columns is called a matrix of order m × n or simply m × n matrix (read
as an m by n matrix). So referring to the above examples of matrices, we have A as 3 × 2 matrix, B as
3 × 3 matrix and C as 2 × 3 matrix. We observe that A has 3 × 2 = 6 elements, B and C have 9 and 6
elements, respectively.

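In general, an m × n matrix has the following rectangular array:

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1j} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots & & \vdots \\ a_{i1} & a_{i2} & a_{i3} & \cdots & a_{ij} & \cdots & a_{in} \\ \vdots & \vdots & \vdots & & \vdots & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mj} & \cdots & a_{mn} \end{bmatrix}_{m \times n}$$

or $A = [a_{ij}]_{m \times n}$, $1 \le i \le m$, $1 \le j \le n$, $i, j \in \mathbf{N}$.

Thus the ith row consists of the elements $a_{i1}, a_{i2}, a_{i3}, \ldots, a_{in}$, while the jth column consists of the elements
$a_{1j}, a_{2j}, a_{3j}, \ldots, a_{mj}$.

In general $a_{ij}$ is an element lying in the ith row and jth column. We can also call it the (i, j)th element
of A. The number of elements in an m × n matrix will be equal to mn.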

[Note] In this chapter

1. We shall follow the notation, namely A = [aij]m × n to indicate that A is a matrix of order m × n.

2. We shall consider only those matrices whose elements are real numbers or functions taking real
values.

We can also represent any point (x, y) in a plane by a matrix (column or row) as
$\begin{bmatrix} x \\ y \end{bmatrix}$ (or $[x, y]$). For example, the point P(0, 1) as a matrix representation may be given as

$P = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ or $[0 \ \ 1]$.

Observe that in this way we can also express the vertices of a closed rectilinear figure in the form of
a matrix. For example, consider a quadrilateral ABCD with vertices A(1, 0), B(3, 2), C(1, 3), D(−1, 2).

Now, the quadrilateral ABCD in the matrix form can be represented as

$$X = \begin{bmatrix} 1 & 3 & 1 & -1 \\ 0 & 2 & 3 & 2 \end{bmatrix}_{2 \times 4} \quad \text{or} \quad Y = \begin{bmatrix} 1 & 0 \\ 3 & 2 \\ 1 & 3 \\ -1 & 2 \end{bmatrix}_{4 \times 2}$$

where the columns of X and the rows of Y correspond to the vertices A, B, C and D, in that order.
Thus, matrices can be used as a representation of vertices of geometrical figures in a plane.

Now, let us consider some examples.

Example 1 Consider the following information regarding the number of men and women workers in
three factories I, II and III

        Men workers   Women workers
I           30             25
II          25             31
III         27             26

Represent the above information in the form of a 3 × 2 matrix. What does the entry in the third row
and second column represent?

Solution The information is represented in the form of a 3 × 2 matrix as follows:

$$A = \begin{bmatrix} 30 & 25 \\ 25 & 31 \\ 27 & 26 \end{bmatrix}$$

The entry in the third row and second column represents the number of women workers in factory III.

Example 2 If a matrix has 8 elements, what are the possible orders it can have?

Solution We know that if a matrix is of order m × n, it has mn elements. Thus, to find all possible
orders of a matrix with 8 elements, we will find all ordered pairs of natural numbers, whose product
is 8.

Thus, all possible ordered pairs are (1, 8), (8, 1), (4, 2), (2, 4)

Hence, possible orders are 1 × 8, 8 × 1, 4 × 2, 2 × 4
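The search for ordered pairs in Example 2 amounts to listing the divisors of the number of elements. A minimal sketch, assuming a hypothetical helper named `possible_orders`:

```python
def possible_orders(element_count):
    """All orders (m, n) of a matrix with the given number of elements, i.e. m * n == element_count."""
    return [(m, element_count // m)
            for m in range(1, element_count + 1)
            if element_count % m == 0]


print(possible_orders(8))
```

The pairs are listed in increasing order of m, so they appear in a different order from the solution above, but they are the same four orders.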


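Example 3 Construct a 3 × 2 matrix whose elements are given by $a_{ij} = \frac{1}{2}|i - 3j|$.

Solution In general a 3 × 2 matrix is given by $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}$.

Now $a_{ij} = \frac{1}{2}|i - 3j|$, i = 1, 2, 3 and j = 1, 2.

Therefore
$a_{11} = \frac{1}{2}|1 - 3 \times 1| = 1$, $a_{12} = \frac{1}{2}|1 - 3 \times 2| = \frac{5}{2}$
$a_{21} = \frac{1}{2}|2 - 3 \times 1| = \frac{1}{2}$, $a_{22} = \frac{1}{2}|2 - 3 \times 2| = 2$
$a_{31} = \frac{1}{2}|3 - 3 \times 1| = 0$, $a_{32} = \frac{1}{2}|3 - 3 \times 2| = \frac{3}{2}$

Hence the required matrix is given by $A = \begin{bmatrix} 1 & \frac{5}{2} \\ \frac{1}{2} & 2 \\ 0 & \frac{3}{2} \end{bmatrix}$.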
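The construction in Example 3 can be sketched in Python by evaluating the rule for each position. The helper `build_matrix` is a hypothetical name; matrices are represented as lists of rows, indices are 1-based to match the notation $a_{ij}$, and `fractions.Fraction` is used so that the halves stay exact.

```python
from fractions import Fraction


def build_matrix(rows, cols, rule):
    """Build a rows x cols matrix (list of rows) whose (i, j)th entry is rule(i, j), 1-based."""
    return [[rule(i, j) for j in range(1, cols + 1)] for i in range(1, rows + 1)]


# a_ij = (1/2) |i - 3j|
A = build_matrix(3, 2, lambda i, j: Fraction(abs(i - 3 * j), 2))
for row in A:
    print([str(entry) for entry in row])
```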

Types of Matrices

In this section, we shall discuss different types of matrices.

(i) Column matrix

A matrix is said to be a column matrix if it has only one column.


For example, $A = \begin{bmatrix} 0 \\ \sqrt{3} \\ -1 \\ 1/2 \end{bmatrix}$ is a column matrix of order 4 × 1.
In general, $A = [a_{ij}]_{m \times 1}$ is a column matrix of order m × 1.

(ii) Row matrix

A matrix is said to be a row matrix if it has only one row.

For example, $B = \begin{bmatrix} -\frac{1}{2} & \sqrt{5} & 2 & 3 \end{bmatrix}_{1 \times 4}$ is a row matrix.

In general, $B = [b_{ij}]_{1 \times n}$ is a row matrix of order 1 × n.

(iii) Square matrix

A matrix in which the number of rows is equal to the number of columns is said to be a square
matrix. Thus an m × n matrix is said to be a square matrix if m = n and is known as a square matrix of
order ‘n’.
For example, $A = \begin{bmatrix} 3 & -1 & 0 \\ \frac{3}{2} & 3\sqrt{2} & 1 \\ 4 & 3 & -1 \end{bmatrix}$ is a square matrix of order 3.
In general, $A = [a_{ij}]_{m \times m}$ is a square matrix of order m.

[Note] If $A = [a_{ij}]$ is a square matrix of order n, then the elements (entries) $a_{11}, a_{22}, \ldots, a_{nn}$ are said to
constitute the diagonal of the matrix A. Thus, if $A = \begin{bmatrix} 1 & -3 & 1 \\ 2 & 4 & -1 \\ 3 & 5 & 6 \end{bmatrix}$,
then the elements of the diagonal of A are 1, 4, 6.

(iv) Diagonal matrix

A square matrix $B = [b_{ij}]_{m \times m}$ is said to be a diagonal matrix if all its non-diagonal elements are zero, that is, a matrix $B = [b_{ij}]_{m \times m}$ is said
to be a diagonal matrix if $b_{ij} = 0$ when i ≠ j.
For example, $A = [4]$, $B = \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix}$, $C = \begin{bmatrix} -1.1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$ are diagonal matrices of order 1, 2, 3, respectively.

(v) Scalar matrix

A diagonal matrix is said to be a scalar matrix if its diagonal elements are equal, that is, a square matrix
B = [bij]n × n is said to be a scalar matrix if

bij = 0, when i ≠ j

bij = k, when i = j, for some constant k.

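For example,

$$A = [3], \quad B = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad C = \begin{bmatrix} \sqrt{3} & 0 & 0 \\ 0 & \sqrt{3} & 0 \\ 0 & 0 & \sqrt{3} \end{bmatrix}$$

are scalar matrices of order 1, 2 and 3, respectively.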

(vi) Identity matrix

A square matrix in which the elements in the diagonal are all 1 and the rest are all zero is called an identity
matrix. In other words, the square matrix $A = [a_{ij}]_{n \times n}$ is an identity matrix if
$$a_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$$
We denote the identity matrix of order n by $I_n$. When the order is clear from the context, we simply write
it as I.
For example, $[1]$, $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ are identity matrices of order 1, 2 and 3, respectively.
Observe that a scalar matrix is an identity matrix when k = 1. But every identity matrix is clearly a scalar
matrix.
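The nesting of these definitions (identity ⊂ scalar ⊂ diagonal ⊂ square) can be expressed directly as predicates. The sketch below assumes non-empty matrices stored as lists of rows; the function names are hypothetical, and exact equality with 0 and 1 is only reliable for integer or otherwise exact entries.

```python
def is_square(M):
    """Number of rows equals number of columns."""
    return all(len(row) == len(M) for row in M)


def is_diagonal(M):
    """Square, with every non-diagonal entry zero."""
    return is_square(M) and all(
        M[i][j] == 0 for i in range(len(M)) for j in range(len(M)) if i != j
    )


def is_scalar(M):
    """Diagonal, with all diagonal entries equal to a common constant k."""
    return is_diagonal(M) and all(M[i][i] == M[0][0] for i in range(len(M)))


def is_identity(M):
    """Scalar, with k = 1."""
    return is_scalar(M) and M[0][0] == 1


C = [[-1.1, 0, 0], [0, 2, 0], [0, 0, 3]]   # diagonal but not scalar
B = [[-1, 0], [0, -1]]                     # scalar but not identity
I2 = [[1, 0], [0, 1]]                      # identity

for M in (C, B, I2):
    print(is_diagonal(M), is_scalar(M), is_identity(M))
```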

(vii) Zero matrix

A matrix is said to be a zero matrix or null matrix if all its elements are zero.
For example, $[0]$, $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$, $[0, 0]$ are all zero matrices. We denote the zero matrix by O. Its order
will be clear from the context.

Equality of matrices

Definition 2 Two matrices A = [aij] and B = [bij] are said to be equal if

(i) they are of the same order

(ii) each element of A is equal to the corresponding element of B, that is a ij = bij for all i and j.

For example,
$\begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix}$ are equal matrices but $\begin{bmatrix} 3 & 2 \\ 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix}$ are not equal matrices. Symbolically, if two matrices A and B are equal, we write A = B.

If $\begin{bmatrix} x & y \\ z & a \\ b & c \end{bmatrix} = \begin{bmatrix} -1.5 & 0 \\ 2 & \sqrt{6} \\ 3 & 2 \end{bmatrix}$, then x = −1.5, y = 0, z = 2, a = √6, b = 3, c = 2.

Example 4 If
$$\begin{bmatrix} x + 3 & z + 4 & 2y - 7 \\ -6 & a - 1 & 0 \\ b - 3 & -21 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 6 & 3y - 2 \\ -6 & -3 & 2c + 2 \\ 2b + 4 & -21 & 0 \end{bmatrix}$$
find the values of a, b, c, x, y and z.

Solution As the given matrices are equal, their corresponding elements must be equal.
Comparing the corresponding elements, we get
x + 3 = 0, z + 4 = 6, 2y − 7 = 3y − 2,
a − 1 = −3, 0 = 2c + 2, b − 3 = 2b + 4.
Simplifying, we get
a = −2, b = −7, c = −1, x = −3, y = −5, z = 2
Example 5 Find the values of a, b, c, and d from the following equation:
$$\begin{bmatrix} 2a + b & a - 2b \\ 5c - d & 4c + 3d \end{bmatrix} = \begin{bmatrix} 4 & -3 \\ 11 & 24 \end{bmatrix}$$
Solution By equality of two matrices, equating the corresponding elements, we get

2a + b = 4          5c − d = 11

a − 2b = −3         4c + 3d = 24

Solving these equations, we get

a = 1, b = 2, c = 3 and d = 4
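The two pairs of simultaneous equations in Example 5 can be solved by elimination. The sketch below assumes a hypothetical helper `solve_2x2` that applies the closed-form result of eliminating one unknown from a pair of linear equations with a unique solution; it is one possible way to carry out the "solving these equations" step, not necessarily the method intended by the text.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by elimination (unique solution assumed)."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


a, b = solve_2x2(2, 1, 4, 1, -2, -3)     # 2a + b = 4,  a - 2b = -3
c, d = solve_2x2(5, -1, 11, 4, 3, 24)    # 5c - d = 11, 4c + 3d = 24
print(a, b, c, d)
```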

Operations on Matrices

In this section, we shall introduce certain operations on matrices, namely, addition of matrices,
multiplication of a matrix by a scalar, difference and multiplication of matrices.

Addition of matrices

Suppose Fatima has two factories at places A and B. Each factory produces sport shoes for boys and
girls in three different price categories labelled 1, 2 and 3. The quantities produced by each factory are
represented as matrices given below:

         Factory at A          Factory at B
         Boys    Girls         Boys    Girls
1         80      60            90      50
2         75      65            70      55
3         90      85            75      75

Suppose Fatima wants to know the total production of sport shoes in each price category. Then the
total production

In category 1 : for boys (80 + 90), for girls (60 + 50)

In category 2 : for boys (75 + 70), for girls (65 + 55)

In category 3 : for boys (90 + 75), for girls (85 + 75)

This can be represented in the matrix form as $\begin{bmatrix} 80 + 90 & 60 + 50 \\ 75 + 70 & 65 + 55 \\ 90 + 75 & 85 + 75 \end{bmatrix}$.
This new matrix is the sum of the above two matrices. We observe that the sum of two matrices is a
matrix obtained by adding the corresponding elements of the given matrices. Furthermore, the two
matrices have to be of the same order.
Thus, if $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}$ is a 2 × 3 matrix and $B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}$ is another
2 × 3 matrix, then we define $A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{bmatrix}$.
In general, if $A = [a_{ij}]$ and $B = [b_{ij}]$ are two matrices of the same order, say m × n,
then the sum of the two matrices A and B is defined as a matrix $C = [c_{ij}]_{m \times n}$, where $c_{ij} = a_{ij} + b_{ij}$, for all possible values of i and j.

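Example 6 Given $A = \begin{bmatrix} \sqrt{3} & 1 & -1 \\ 2 & 3 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & \sqrt{5} & 1 \\ -2 & 3 & \frac{1}{2} \end{bmatrix}$, find A + B.

Solution Since A and B are of the same order 2 × 3, addition of A and B is defined and is given by
$$A + B = \begin{bmatrix} 2+\sqrt{3} & 1+\sqrt{5} & 1-1 \\ 2-2 & 3+3 & 0+\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 2+\sqrt{3} & 1+\sqrt{5} & 0 \\ 0 & 6 & \frac{1}{2} \end{bmatrix}$$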
[Note]

1. We emphasise that if A and B are not of the same order, then A + B is not defined. For example, if $A = \begin{bmatrix} 2 & 3 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 1 \end{bmatrix}$, then A + B is not defined.
2. We may observe that addition of matrices is an example of binary operation on the set of matrices
of the same order.
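
The entrywise definition of the sum can be sketched in a few lines of Python. This is only an illustration: the helper name `mat_add` and the representation of a matrix as a list of rows are assumptions made here, not part of the text. The data are the two factory matrices from the discussion above.

```python
def mat_add(a, b):
    """Return the entrywise sum of two matrices stored as lists of rows."""
    # The sum is defined only when both matrices have the same order.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("A + B is defined only for matrices of the same order")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


factory_a = [[80, 60], [75, 65], [90, 85]]  # rows: categories 1-3; columns: boys, girls
factory_b = [[90, 50], [70, 55], [75, 75]]
total = mat_add(factory_a, factory_b)
print(total)  # [[170, 110], [145, 120], [165, 160]]
```

Calling `mat_add` on a 2 × 2 and a 2 × 3 matrix raises an error, which mirrors the statement that A + B is not defined for matrices of different orders.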

Multiplication of a matrix by a scalar

Now suppose that Fatima has doubled the production at factory A in all categories.

Previously, the quantities (in standard units) produced by factory A were

Price category    Boys   Girls
1                 80     60
2                 75     65
3                 90     85

Revised quantities produced by factory A are as given below:

Price category    Boys     Girls
1                 2 × 80   2 × 60
2                 2 × 75   2 × 65
3                 2 × 90   2 × 85

This can be represented in the matrix form as $\begin{bmatrix} 160 & 120 \\ 150 & 130 \\ 180 & 170 \end{bmatrix}$. We observe that the new matrix is obtained by multiplying each element of the previous matrix by 2.

In general, we may define multiplication of a matrix by a scalar as follows: if A = [aij]m × n is a matrix and k is a scalar, then kA is another matrix which is obtained by multiplying each element of A by the scalar k.

In other words, kA = k[aij]m × n = [k(aij)]m × n, that is, the (i, j)th element of kA is k aij for all possible values of i and j.

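For example, if $A = \begin{bmatrix} 3 & 1 & 1.5 \\ \sqrt{5} & 7 & -3 \\ 2 & 0 & 5 \end{bmatrix}$, then
$$3A = 3\begin{bmatrix} 3 & 1 & 1.5 \\ \sqrt{5} & 7 & -3 \\ 2 & 0 & 5 \end{bmatrix} = \begin{bmatrix} 9 & 3 & 4.5 \\ 3\sqrt{5} & 21 & -9 \\ 6 & 0 & 15 \end{bmatrix}$$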
Negative of a matrix The negative of a matrix is denoted by −A. We define −A = (−1)A.

For example, let $A = \begin{bmatrix} 3 & 1 \\ -5 & x \end{bmatrix}$, then −A is given by
$$-A = (-1)A = (-1)\begin{bmatrix} 3 & 1 \\ -5 & x \end{bmatrix} = \begin{bmatrix} -3 & -1 \\ 5 & -x \end{bmatrix}$$
Difference of matrices If A = [aij], B = [bij] are two matrices of the same order, say m × n, then the difference A − B is defined as a matrix D = [dij], where dij = aij − bij, for all values of i and j. In other words, D = A − B = A + (−1)B, that is, the sum of the matrix A and the matrix −B.
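Example 7 If $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -1 & 3 \\ -1 & 0 & 2 \end{bmatrix}$, then find 2A − B.

Solution We have
$$2A - B = 2\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{bmatrix} - \begin{bmatrix} 3 & -1 & 3 \\ -1 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 6 \\ 4 & 6 & 2 \end{bmatrix} + \begin{bmatrix} -3 & 1 & -3 \\ 1 & 0 & -2 \end{bmatrix}$$
$$= \begin{bmatrix} 2-3 & 4+1 & 6-3 \\ 4+1 & 6+0 & 2-2 \end{bmatrix} = \begin{bmatrix} -1 & 5 & 3 \\ 5 & 6 & 0 \end{bmatrix}$$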
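
The scalar multiple and the difference can be checked numerically. The following is a minimal sketch, assuming matrices are stored as lists of rows; the helper names `scalar_mul` and `mat_sub` are hypothetical. It recomputes 2A − B for the matrices of Example 7 by forming A + (−1)B, as in the definition of the difference.

```python
def scalar_mul(k, a):
    """Multiply every element of the matrix a by the scalar k."""
    return [[k * x for x in row] for row in a]


def mat_sub(a, b):
    """Compute A - B as A + (-1)B; both matrices must have the same order."""
    neg_b = scalar_mul(-1, b)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, neg_b)]


A = [[1, 2, 3], [2, 3, 1]]
B = [[3, -1, 3], [-1, 0, 2]]
result = mat_sub(scalar_mul(2, A), B)
print(result)  # [[-1, 5, 3], [5, 6, 0]]
```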
Properties of matrix addition

The addition of matrices satisfies the following properties:

(i) Commutative Law If A = [aij], B = [bij] are matrices of the same order, say m × n, then A + B = B + A.

Now A + B = [aij] + [bij] = [aij + bij]

= [bij + aij] (addition of numbers is commutative)

= [bij] + [aij] = B + A

(ii) Associative Law For any three matrices A = [aij], B = [bij], C = [cij] of the same order, say m × n, (A + B) + C = A + (B + C).

Now (A + B) + C = ([aij] + [bij]) + [cij]

= [aij + bij] + [cij] = [(aij + bij) + cij]

= [aij + (bij + cij)] (Why?)

= [aij] + [bij + cij] = [aij] + ([bij] + [cij]) = A + (B + C)

(iii) Existence of additive identity Let A = [aij] be an m × n matrix and O be an m × n zero matrix, then
A+O = O+ A = A. In other words, O is the additive identity for matrix addition.

(iv) The existence of additive inverse Let A = [aij]m × n be any matrix, then we have another matrix as -
A = [-aij]m × n such that A + (- A) = (- A) + A = O. So - A is the additive inverse of A or negative of A.

Properties of scalar multiplication of a matrix

If A = [aij] and B = [bij] are two matrices of the same order, say m × n, and k and l are scalars, then

(i) k(A + B) = kA + kB, (ii) (k + l)A = kA + lA

(i) k(A + B) = k([aij] + [bij])

= k[aij + bij] = [k(aij + bij)] = [(k aij) + (k bij)]

= [k aij] + [k bij] = k[aij] + k[bij] = kA + kB

(ii) (k + l)A = (k + l)[aij]

= [(k + l) aij] = [k aij + l aij] = [k aij] + [l aij] = k[aij] + l[aij] = kA + lA


Example 8 If $A = \begin{bmatrix} 8 & 0 \\ 4 & -2 \\ 3 & 6 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -2 \\ 4 & 2 \\ -5 & 1 \end{bmatrix}$, then find the matrix X, such that 2A + 3X = 5B.
Solution We have 2A + 3X = 5B

or 2A + 3X - 2A = 5B - 2A

or 2A - 2A + 3X = 5B - 2A (Matrix addition is commutative)

or O + 3X = 5B - 2A (- 2A is the additive inverse of 2A)

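or 3X = 5B − 2A (O is the additive identity)

or X = (1/3)(5B − 2A)

$$\text{or } X = \frac{1}{3}\left(5\begin{bmatrix} 2 & -2 \\ 4 & 2 \\ -5 & 1 \end{bmatrix} - 2\begin{bmatrix} 8 & 0 \\ 4 & -2 \\ 3 & 6 \end{bmatrix}\right) = \frac{1}{3}\left(\begin{bmatrix} 10 & -10 \\ 20 & 10 \\ -25 & 5 \end{bmatrix} + \begin{bmatrix} -16 & 0 \\ -8 & 4 \\ -6 & -12 \end{bmatrix}\right)$$
$$= \frac{1}{3}\begin{bmatrix} 10-16 & -10+0 \\ 20-8 & 10+4 \\ -25-6 & 5-12 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} -6 & -10 \\ 12 & 14 \\ -31 & -7 \end{bmatrix} = \begin{bmatrix} -2 & -\frac{10}{3} \\ 4 & \frac{14}{3} \\ -\frac{31}{3} & -\frac{7}{3} \end{bmatrix}$$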
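Example 9 Find X and Y, if $X + Y = \begin{bmatrix} 5 & 2 \\ 0 & 9 \end{bmatrix}$ and $X - Y = \begin{bmatrix} 3 & 6 \\ 0 & -1 \end{bmatrix}$.

Solution We have $(X + Y) + (X - Y) = \begin{bmatrix} 5 & 2 \\ 0 & 9 \end{bmatrix} + \begin{bmatrix} 3 & 6 \\ 0 & -1 \end{bmatrix}$

or $(X + X) + (Y - Y) = \begin{bmatrix} 8 & 8 \\ 0 & 8 \end{bmatrix} \Rightarrow 2X = \begin{bmatrix} 8 & 8 \\ 0 & 8 \end{bmatrix}$

or $X = \frac{1}{2}\begin{bmatrix} 8 & 8 \\ 0 & 8 \end{bmatrix} = \begin{bmatrix} 4 & 4 \\ 0 & 4 \end{bmatrix}$

Also $(X + Y) - (X - Y) = \begin{bmatrix} 5 & 2 \\ 0 & 9 \end{bmatrix} - \begin{bmatrix} 3 & 6 \\ 0 & -1 \end{bmatrix}$

or $(X - X) + (Y + Y) = \begin{bmatrix} 5-3 & 2-6 \\ 0 & 9+1 \end{bmatrix} \Rightarrow 2Y = \begin{bmatrix} 2 & -4 \\ 0 & 10 \end{bmatrix}$

or $Y = \frac{1}{2}\begin{bmatrix} 2 & -4 \\ 0 & 10 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ 0 & 5 \end{bmatrix}$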
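Example 10 Find the values of x and y from the following equation:
$$2\begin{bmatrix} x & 5 \\ 7 & y-3 \end{bmatrix} + \begin{bmatrix} 3 & -4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 7 & 6 \\ 15 & 14 \end{bmatrix}$$
Solution We have
$$2\begin{bmatrix} x & 5 \\ 7 & y-3 \end{bmatrix} + \begin{bmatrix} 3 & -4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 7 & 6 \\ 15 & 14 \end{bmatrix} \Rightarrow \begin{bmatrix} 2x & 10 \\ 14 & 2y-6 \end{bmatrix} + \begin{bmatrix} 3 & -4 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 7 & 6 \\ 15 & 14 \end{bmatrix}$$
$$\text{or } \begin{bmatrix} 2x+3 & 10-4 \\ 14+1 & 2y-6+2 \end{bmatrix} = \begin{bmatrix} 7 & 6 \\ 15 & 14 \end{bmatrix} \Rightarrow \begin{bmatrix} 2x+3 & 6 \\ 15 & 2y-4 \end{bmatrix} = \begin{bmatrix} 7 & 6 \\ 15 & 14 \end{bmatrix}$$
or 2x + 3 = 7 and 2y − 4 = 14 (Why?)

or 2x = 7 − 3 and 2y = 18

or x = 4/2 and y = 18/2

i.e. x = 2 and y = 9.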

Example 11 Two farmers Ramkishan and Gurcharan Singh cultivate only three varieties of rice, namely Basmati, Permal and Naura. The sales (in Rupees) of these varieties of rice by both the farmers in the months of September and October are given by the following matrices A and B.

(i) Find the combined sales in September and October for each farmer in each variety.

(ii) Find the decrease in sales from September to October.

(iii) If both farmers receive 2% profit on gross sales, compute the profit for each farmer and for each
variety sold in October.

Solution

(i) Combined sales in September and October for each farmer in each variety is given by

(ii) Change in sales from September to October is given by

(iii) 2% of B = (2/100) × B = 0.02 × B

Thus, in October Ramkishan receives Rs. 100, Rs. 200 and Rs. 120 as profit in the sale of each variety of rice, respectively, and Gurcharan Singh receives profit of Rs. 400, Rs. 200 and Rs. 200 in the sale of each variety of rice, respectively.

Multiplication of matrices

Suppose Meera and Nadeem are two friends. Meera wants to buy 2 pens and 5 story books, while
Nadeem needs 8 pens and 10 story books. They both go to a shop to enquire about the rates which are
quoted as follows:

Pen - Rs. 5 each, story book - Rs. 50 each.

How much money does each need to spend? Clearly, Meera needs Rs. (5 × 2 + 50 × 5), that is Rs. 260, while Nadeem needs Rs. (8 × 5 + 50 × 10), that is Rs. 540. In terms of matrix representation, we can write the above information as follows:

Requirements × Prices per piece (in Rupees) = Money needed (in Rupees)
$$\begin{bmatrix} 2 & 5 \\ 8 & 10 \end{bmatrix} \begin{bmatrix} 5 \\ 50 \end{bmatrix} = \begin{bmatrix} 2 \times 5 + 5 \times 50 \\ 8 \times 5 + 10 \times 50 \end{bmatrix} = \begin{bmatrix} 260 \\ 540 \end{bmatrix}$$

Suppose that they enquire about the rates from another shop, quoted as follows:

pen - Rs. 4 each, story book - Rs. 40 each.

Now, the money required by Meera and Nadeem to make purchases will be, respectively, Rs. (4 × 2 + 40 × 5) = Rs. 208 and Rs. (8 × 4 + 10 × 40) = Rs. 432.

Again, the above information can be represented as follows:

Requirements × Prices per piece (in Rupees) = Money needed (in Rupees)
$$\begin{bmatrix} 2 & 5 \\ 8 & 10 \end{bmatrix} \begin{bmatrix} 4 \\ 40 \end{bmatrix} = \begin{bmatrix} 2 \times 4 + 5 \times 40 \\ 8 \times 4 + 10 \times 40 \end{bmatrix} = \begin{bmatrix} 208 \\ 432 \end{bmatrix}$$

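Now, the information in both the cases can be combined and expressed in terms of matrices as follows:

Requirements × Prices per piece (in Rupees) = Money needed (in Rupees)
$$\begin{bmatrix} 2 & 5 \\ 8 & 10 \end{bmatrix} \begin{bmatrix} 5 & 4 \\ 50 & 40 \end{bmatrix} = \begin{bmatrix} 2 \times 5 + 5 \times 50 & 2 \times 4 + 5 \times 40 \\ 8 \times 5 + 10 \times 50 & 8 \times 4 + 10 \times 40 \end{bmatrix} = \begin{bmatrix} 260 & 208 \\ 540 & 432 \end{bmatrix}$$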
The above is an example of multiplication of matrices. We observe that, for multiplication of two
matrices A and B, the number of columns in A should be equal to the number of rows in B. Furthermore
for getting the elements of the product matrix, we take rows of A and columns of B, multiply them
element-wise and take the sum. Formally, we define multiplication of matrices as follows:

The product of two matrices A and B is defined if the number of columns of A is equal to the number
of rows of B. Let A = [aij] be an m × n matrix and B = [bjk] be an n × p matrix. Then the product of the
matrices A and B is the matrix C of order m × p.

To get the (i, k)th element cik of the matrix C, we take the ith row of A and kth column of B, multiply them elementwise and take the sum of all these products. In other words, if A = [aij]m × n, B = [bjk]n × p, then the ith row of A is [ai1 ai2 ... ain] and the kth column of B is $\begin{bmatrix} b_{1k} \\ b_{2k} \\ \vdots \\ b_{nk} \end{bmatrix}$, so that
$$c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + a_{i3} b_{3k} + \dots + a_{in} b_{nk} = \sum_{j=1}^{n} a_{ij} b_{jk}.$$
The matrix C = [cik]m × p is the product of A and B.
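For example, if $C = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 3 & 4 \end{bmatrix}$ and $D = \begin{bmatrix} 2 & 7 \\ -1 & 1 \\ 5 & -4 \end{bmatrix}$, then the product CD is defined and is given by
$$CD = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 3 & 4 \end{bmatrix} \begin{bmatrix} 2 & 7 \\ -1 & 1 \\ 5 & -4 \end{bmatrix}.$$
This is a 2 × 2 matrix in which each entry is the sum of the products across some row of C with the corresponding entries down some column of D. These four computations are

Entry in first row, first column: (1)(2) + (−1)(−1) + (2)(5) = 13
Entry in first row, second column: (1)(7) + (−1)(1) + (2)(−4) = −2
Entry in second row, first column: (0)(2) + (3)(−1) + (4)(5) = 17
Entry in second row, second column: (0)(7) + (3)(1) + (4)(−4) = −13

Thus $CD = \begin{bmatrix} 13 & -2 \\ 17 & -13 \end{bmatrix}$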
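
The row-by-column rule translates directly into code. The sketch below is illustrative only: the function name `mat_mul` and the list-of-rows representation are assumptions made here. It evaluates the sum of aij bjk over j for every pair (i, k) and rejects operands whose orders do not match.

```python
def mat_mul(a, b):
    """Return the product AB, where a is m x n and b is n x p (lists of rows)."""
    n = len(b)
    # The product is defined only if the number of columns of A equals the number of rows of B.
    if any(len(row) != n for row in a):
        raise ValueError("columns of A must equal rows of B")
    p = len(b[0])
    return [
        [sum(a[i][j] * b[j][k] for j in range(n)) for k in range(p)]
        for i in range(len(a))
    ]


C = [[1, -1, 2], [0, 3, 4]]
D = [[2, 7], [-1, 1], [5, -4]]
print(mat_mul(C, D))  # [[13, -2], [17, -13]]

requirements = [[2, 5], [8, 10]]   # pens and story books for Meera and Nadeem
prices = [[5, 4], [50, 40]]        # one column of prices per shop
print(mat_mul(requirements, prices))  # [[260, 208], [540, 432]]
```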
Example 12 Find AB, if $A = \begin{bmatrix} 6 & 9 \\ 2 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 6 & 0 \\ 7 & 9 & 8 \end{bmatrix}$.

Solution The matrix A has 2 columns, which is equal to the number of rows of B. Hence AB is defined. Now
$$AB = \begin{bmatrix} 6(2)+9(7) & 6(6)+9(9) & 6(0)+9(8) \\ 2(2)+3(7) & 2(6)+3(9) & 2(0)+3(8) \end{bmatrix} = \begin{bmatrix} 12+63 & 36+81 & 0+72 \\ 4+21 & 12+27 & 0+24 \end{bmatrix} = \begin{bmatrix} 75 & 117 & 72 \\ 25 & 39 & 24 \end{bmatrix}$$
Remark If AB is defined, then BA need not be defined. In the above example, AB is defined but BA is not defined because B has 3 columns while A has only 2 (and not 3) rows. If A, B are, respectively, m × n and k × l matrices, then both AB and BA are defined if and only if n = k and l = m. In particular, if both A and B are square matrices of the same order, then both AB and BA are defined.

Non-commutativity of multiplication of matrices

Now, we shall see by an example that even if AB and BA are both defined, it is not necessary that AB
= BA.
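Example 13 If $A = \begin{bmatrix} 1 & -2 & 3 \\ -4 & 2 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \\ 2 & 1 \end{bmatrix}$, then find AB and BA. Show that AB ≠ BA.

Solution Since A is a 2 × 3 matrix and B is a 3 × 2 matrix, AB and BA are both defined and are matrices of order 2 × 2 and 3 × 3, respectively. Note that
$$AB = \begin{bmatrix} 1 & -2 & 3 \\ -4 & 2 & 5 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 4 & 5 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 2-8+6 & 3-10+3 \\ -8+8+10 & -12+10+5 \end{bmatrix} = \begin{bmatrix} 0 & -4 \\ 10 & 3 \end{bmatrix}$$
$$\text{and } BA = \begin{bmatrix} 2 & 3 \\ 4 & 5 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & -2 & 3 \\ -4 & 2 & 5 \end{bmatrix} = \begin{bmatrix} 2-12 & -4+6 & 6+15 \\ 4-20 & -8+10 & 12+25 \\ 2-4 & -4+2 & 6+5 \end{bmatrix} = \begin{bmatrix} -10 & 2 & 21 \\ -16 & 2 & 37 \\ -2 & -2 & 11 \end{bmatrix}$$
Clearly AB ≠ BA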

In the above example AB and BA are of different orders and so AB ≠ BA. One may think that perhaps AB and BA could be the same if they were of the same order. But it is not so; here we give an example to show that even if AB and BA are of the same order they may not be the same.

Example 14 If $A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, then $AB = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ and $BA = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. Clearly AB ≠ BA.

Thus matrix multiplication is not commutative.

[Note] This does not mean that AB ≠ BA for every pair of matrices A, B for which AB and BA are defined. For instance, if $A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}$, then $AB = BA = \begin{bmatrix} 3 & 0 \\ 0 & 8 \end{bmatrix}$.
Observe that multiplication of diagonal matrices of the same order is commutative.
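
Both observations are easy to confirm numerically. The following sketch assumes a list-of-rows representation and a hypothetical helper `mat_mul`; it compares AB with BA for the pair of Example 14 and for the diagonal pair in the note.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows (orders assumed compatible)."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 0], [0, -1]]
B = [[0, 1], [1, 0]]
print(mat_mul(A, B))  # [[0, 1], [-1, 0]]
print(mat_mul(B, A))  # [[0, -1], [1, 0]]

D1 = [[1, 0], [0, 2]]  # diagonal matrices of the same order
D2 = [[3, 0], [0, 4]]
print(mat_mul(D1, D2) == mat_mul(D2, D1))  # True
```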

Zero matrix as the product of two non zero matrices

We know that, for real numbers a, b, if ab = 0, then either a = 0 or b = 0. This need not be true for matrices; we will observe this through an example.

Example 15 Find AB, if $A = \begin{bmatrix} 0 & -1 \\ 0 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 5 \\ 0 & 0 \end{bmatrix}$.

Solution We have $AB = \begin{bmatrix} 0 & -1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 3 & 5 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.

Thus, if the product of two matrices is a zero matrix, it is not necessary that one of the matrices is a zero matrix.

Properties of multiplication of matrices

The multiplication of matrices possesses the following properties, which we state without proof.

1. The associative law For any three matrices A, B and C, we have (AB)C = A(BC), whenever both sides of the equality are defined.

2. The distributive law For three matrices A, B and C,

(i) A(B + C) = AB + AC

(ii) (A + B)C = AC + BC, whenever both sides of the equality are defined.

3. The existence of multiplicative identity For every square matrix A, there exists an identity matrix I of the same order such that IA = AI = A.

Now, we shall verify these properties by examples.


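Example 16 If $A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 0 & 3 \\ 3 & -1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 3 \\ 0 & 2 \\ -1 & 4 \end{bmatrix}$ and $C = \begin{bmatrix} 1 & 2 & 3 & -4 \\ 2 & 0 & -2 & 1 \end{bmatrix}$, find A(BC), (AB)C and show that (AB)C = A(BC).

Solution We have
$$AB = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 0 & 3 \\ 3 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 0 & 2 \\ -1 & 4 \end{bmatrix} = \begin{bmatrix} 1+0+1 & 3+2-4 \\ 2+0-3 & 6+0+12 \\ 3+0-2 & 9-2+8 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ -1 & 18 \\ 1 & 15 \end{bmatrix}$$
$$(AB)C = \begin{bmatrix} 2 & 1 \\ -1 & 18 \\ 1 & 15 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & -4 \\ 2 & 0 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 2+2 & 4+0 & 6-2 & -8+1 \\ -1+36 & -2+0 & -3-36 & 4+18 \\ 1+30 & 2+0 & 3-30 & -4+15 \end{bmatrix} = \begin{bmatrix} 4 & 4 & 4 & -7 \\ 35 & -2 & -39 & 22 \\ 31 & 2 & -27 & 11 \end{bmatrix}$$
$$\text{Now } BC = \begin{bmatrix} 1 & 3 \\ 0 & 2 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & -4 \\ 2 & 0 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 1+6 & 2+0 & 3-6 & -4+3 \\ 0+4 & 0+0 & 0-4 & 0+2 \\ -1+8 & -2+0 & -3-8 & 4+4 \end{bmatrix} = \begin{bmatrix} 7 & 2 & -3 & -1 \\ 4 & 0 & -4 & 2 \\ 7 & -2 & -11 & 8 \end{bmatrix}$$
$$\text{Therefore } A(BC) = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 0 & 3 \\ 3 & -1 & 2 \end{bmatrix} \begin{bmatrix} 7 & 2 & -3 & -1 \\ 4 & 0 & -4 & 2 \\ 7 & -2 & -11 & 8 \end{bmatrix} = \begin{bmatrix} 7+4-7 & 2+0+2 & -3-4+11 & -1+2-8 \\ 14+0+21 & 4+0-6 & -6+0-33 & -2+0+24 \\ 21-4+14 & 6+0-4 & -9+4-22 & -3-2+16 \end{bmatrix}$$
$$= \begin{bmatrix} 4 & 4 & 4 & -7 \\ 35 & -2 & -39 & 22 \\ 31 & 2 & -27 & 11 \end{bmatrix}$$
Clearly, (AB)C = A(BC)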
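Example 17 If $A = \begin{bmatrix} 0 & 6 & 7 \\ -6 & 0 & 8 \\ 7 & -8 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 0 \end{bmatrix}$, $C = \begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix}$, calculate AC, BC and (A + B)C. Also, verify that (A + B)C = AC + BC.

Solution Now, $A + B = \begin{bmatrix} 0 & 7 & 8 \\ -5 & 0 & 10 \\ 8 & -6 & 0 \end{bmatrix}$
$$\text{So } (A + B)C = \begin{bmatrix} 0 & 7 & 8 \\ -5 & 0 & 10 \\ 8 & -6 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix} = \begin{bmatrix} 0-14+24 \\ -10+0+30 \\ 16+12+0 \end{bmatrix} = \begin{bmatrix} 10 \\ 20 \\ 28 \end{bmatrix}$$
$$\text{Further } AC = \begin{bmatrix} 0 & 6 & 7 \\ -6 & 0 & 8 \\ 7 & -8 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix} = \begin{bmatrix} 0-12+21 \\ -12+0+24 \\ 14+16+0 \end{bmatrix} = \begin{bmatrix} 9 \\ 12 \\ 30 \end{bmatrix}$$
$$\text{and } BC = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix} = \begin{bmatrix} 0-2+3 \\ 2+0+6 \\ 2-4+0 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ -2 \end{bmatrix}$$
$$\text{So } AC + BC = \begin{bmatrix} 9 \\ 12 \\ 30 \end{bmatrix} + \begin{bmatrix} 1 \\ 8 \\ -2 \end{bmatrix} = \begin{bmatrix} 10 \\ 20 \\ 28 \end{bmatrix}$$
Clearly, (A + B)C = AC + BC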
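Example 18 If $A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & -2 & 1 \\ 4 & 2 & 1 \end{bmatrix}$, then show that A³ − 23A − 40I = O.

Solution We have
$$A^2 = A \cdot A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & -2 & 1 \\ 4 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 3 & -2 & 1 \\ 4 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 19 & 4 & 8 \\ 1 & 12 & 8 \\ 14 & 6 & 15 \end{bmatrix}$$
$$\text{So } A^3 = A A^2 = \begin{bmatrix} 1 & 2 & 3 \\ 3 & -2 & 1 \\ 4 & 2 & 1 \end{bmatrix} \begin{bmatrix} 19 & 4 & 8 \\ 1 & 12 & 8 \\ 14 & 6 & 15 \end{bmatrix} = \begin{bmatrix} 63 & 46 & 69 \\ 69 & -6 & 23 \\ 92 & 46 & 63 \end{bmatrix}$$
Now
$$A^3 - 23A - 40I = \begin{bmatrix} 63 & 46 & 69 \\ 69 & -6 & 23 \\ 92 & 46 & 63 \end{bmatrix} - 23\begin{bmatrix} 1 & 2 & 3 \\ 3 & -2 & 1 \\ 4 & 2 & 1 \end{bmatrix} - 40\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
$$= \begin{bmatrix} 63 & 46 & 69 \\ 69 & -6 & 23 \\ 92 & 46 & 63 \end{bmatrix} + \begin{bmatrix} -23 & -46 & -69 \\ -69 & 46 & -23 \\ -92 & -46 & -23 \end{bmatrix} + \begin{bmatrix} -40 & 0 & 0 \\ 0 & -40 & 0 \\ 0 & 0 & -40 \end{bmatrix}$$
$$= \begin{bmatrix} 63-23-40 & 46-46+0 & 69-69+0 \\ 69-69+0 & -6+46-40 & 23-23+0 \\ 92-92+0 & 46-46+0 & 63-23-40 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = O$$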
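
A computation of this kind is a natural candidate for a machine check. The sketch below, with a hypothetical helper `mat_mul` and matrices stored as lists of rows, recomputes A² and A³ for the matrix of Example 18 and then forms A³ − 23A − 40I entry by entry.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows (orders assumed compatible)."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2, 3], [3, -2, 1], [4, 2, 1]]
A2 = mat_mul(A, A)
A3 = mat_mul(A, A2)

# The identity matrix contributes 1 on the diagonal (i == j) and 0 elsewhere.
residual = [
    [A3[i][j] - 23 * A[i][j] - 40 * (1 if i == j else 0) for j in range(3)]
    for i in range(3)
]
print(A2)        # [[19, 4, 8], [1, 12, 8], [14, 6, 15]]
print(residual)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```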
Example 19 In a legislative assembly election, a political group hired a public relations firm to promote
its candidate in three ways: telephone, house calls, and letters. The cost per contact (in paise) is given
in matrix A as

The number of contacts of each type made in two cities X and Y is given by

Find the total amount spent by the group in the two cities X and Y.

Solution We have

So the total amount spent by the group in the two cities is 340,000 paise and 720,000 paise, i.e., Rs. 3400 and Rs. 7200, respectively.

Transpose of a Matrix

In this section, we shall learn about transpose of a matrix and special types of matrices such as
symmetric and skew symmetric matrices.

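Definition 3 If A = [aij] is an m × n matrix, then the matrix obtained by interchanging the rows and columns of A is called the transpose of A. The transpose of the matrix A is denoted by A′. In other words, if A = [aij]m × n, then A′ = [aji]n × m. For example,

if $A = \begin{bmatrix} 3 & 5 \\ \sqrt{3} & 1 \\ 0 & -\frac{1}{5} \end{bmatrix}_{3 \times 2}$, then $A' = \begin{bmatrix} 3 & \sqrt{3} & 0 \\ 5 & 1 & -\frac{1}{5} \end{bmatrix}_{2 \times 3}$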

Properties of transpose matrices

We now state the following properties of transpose of matrices without proof. These may be verified
by taking suitable examples.

For any matrices A and B of suitable orders, we have

(i) (A')' = A, (ii) (kA)' = kA' (where k is any constant)

(iii) (A + B)' = A' + B' (iv) (A B)'= B'A'
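
These properties can be spot-checked on small matrices. The sketch below is only an illustration under stated assumptions: matrices are lists of rows, and `transpose` and `mat_mul` are hypothetical helper names. The matrices are arbitrary examples, a 3 × 1 column and a 1 × 3 row, chosen so that AB is 3 × 3.

```python
def transpose(a):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows (orders assumed compatible)."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[-2], [4], [5]]   # 3 x 1
B = [[1, 3, -6]]       # 1 x 3

lhs = transpose(mat_mul(A, B))             # (AB)'
rhs = mat_mul(transpose(B), transpose(A))  # B'A'
print(lhs == rhs)                  # True
print(transpose(transpose(A)) == A)  # True
```

Note the reversed order on the right-hand side: A′B′ would be a 1 × 1 matrix here, so it could not equal the 3 × 3 matrix (AB)′.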

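Example 20 If $A = \begin{bmatrix} 3 & \sqrt{3} & 2 \\ 4 & 2 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -1 & 2 \\ 1 & 2 & 4 \end{bmatrix}$, verify that

(i) (A′)′ = A, (ii) (A + B)′ = A′ + B′,

(iii) (kB)′ = kB′, where k is any constant.

Solution

(i) We have
$$A = \begin{bmatrix} 3 & \sqrt{3} & 2 \\ 4 & 2 & 0 \end{bmatrix} \Rightarrow A' = \begin{bmatrix} 3 & 4 \\ \sqrt{3} & 2 \\ 2 & 0 \end{bmatrix} \Rightarrow (A')' = \begin{bmatrix} 3 & \sqrt{3} & 2 \\ 4 & 2 & 0 \end{bmatrix} = A$$
Thus (A′)′ = A

(ii) We have
$$A = \begin{bmatrix} 3 & \sqrt{3} & 2 \\ 4 & 2 & 0 \end{bmatrix},\ B = \begin{bmatrix} 2 & -1 & 2 \\ 1 & 2 & 4 \end{bmatrix} \Rightarrow A + B = \begin{bmatrix} 5 & \sqrt{3}-1 & 4 \\ 5 & 4 & 4 \end{bmatrix}$$
$$\text{Therefore } (A + B)' = \begin{bmatrix} 5 & 5 \\ \sqrt{3}-1 & 4 \\ 4 & 4 \end{bmatrix}$$
$$\text{Now } A' = \begin{bmatrix} 3 & 4 \\ \sqrt{3} & 2 \\ 2 & 0 \end{bmatrix},\ B' = \begin{bmatrix} 2 & 1 \\ -1 & 2 \\ 2 & 4 \end{bmatrix}$$
$$\text{So } A' + B' = \begin{bmatrix} 5 & 5 \\ \sqrt{3}-1 & 4 \\ 4 & 4 \end{bmatrix}$$
Thus (A + B)′ = A′ + B′

(iii) We have
$$kB = k\begin{bmatrix} 2 & -1 & 2 \\ 1 & 2 & 4 \end{bmatrix} = \begin{bmatrix} 2k & -k & 2k \\ k & 2k & 4k \end{bmatrix}$$
$$\text{Then } (kB)' = \begin{bmatrix} 2k & k \\ -k & 2k \\ 2k & 4k \end{bmatrix} = k\begin{bmatrix} 2 & 1 \\ -1 & 2 \\ 2 & 4 \end{bmatrix} = kB'$$
Thus (kB)′ = kB′

Example 21 If $A = \begin{bmatrix} -2 \\ 4 \\ 5 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 3 & -6 \end{bmatrix}$, verify that (AB)′ = B′A′.

Solution We have
$$A = \begin{bmatrix} -2 \\ 4 \\ 5 \end{bmatrix},\ B = \begin{bmatrix} 1 & 3 & -6 \end{bmatrix}$$
$$\text{then } AB = \begin{bmatrix} -2 \\ 4 \\ 5 \end{bmatrix} \begin{bmatrix} 1 & 3 & -6 \end{bmatrix} = \begin{bmatrix} -2 & -6 & 12 \\ 4 & 12 & -24 \\ 5 & 15 & -30 \end{bmatrix}$$
$$\text{Now } A' = \begin{bmatrix} -2 & 4 & 5 \end{bmatrix},\ B' = \begin{bmatrix} 1 \\ 3 \\ -6 \end{bmatrix}$$
$$B'A' = \begin{bmatrix} 1 \\ 3 \\ -6 \end{bmatrix} \begin{bmatrix} -2 & 4 & 5 \end{bmatrix} = \begin{bmatrix} -2 & 4 & 5 \\ -6 & 12 & 15 \\ 12 & -24 & -30 \end{bmatrix} = (AB)'$$
Clearly (AB)′ = B′A′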

Symmetric and Skew Symmetric Matrices

Definition 4 A square matrix A = [aij] is said to be symmetric if A′ = A, that is, [aij] = [aji] for all possible values of i and j.

For example, $A = \begin{bmatrix} \sqrt{3} & 2 & 3 \\ 2 & -1.5 & -1 \\ 3 & -1 & 1 \end{bmatrix}$ is a symmetric matrix as A′ = A.

Definition 5 A square matrix A = [aij] is said to be a skew symmetric matrix if A′ = −A, that is, aji = −aij for all possible values of i and j. Now, if we put i = j, we have aii = −aii. Therefore 2aii = 0 or aii = 0 for all i.

This means that all the diagonal elements of a skew symmetric matrix are zero.

For example, the matrix $B = \begin{bmatrix} 0 & e & f \\ -e & 0 & g \\ -f & -g & 0 \end{bmatrix}$ is a skew symmetric matrix as B′ = −B.
Now, we are going to prove some results of symmetric and skew-symmetric matrices.

Theorem 1 For any square matrix A with real number entries, A + A' is a symmetric matrix and A - A' is
a skew symmetric matrix.

Proof Let B = A + A', then

B'= (A + A')’

= A' + (A')' (as (A + B)’ = A' + B')

= A' + A (as (A')' = A)

= A + A' (as A + B = B + A)

=B

Therefore B = A + A' is a symmetric matrix

Now let C = A-A'

C'= (A - A')'= A'- (A')’ (Why?)

= A'-A (Why?)

= - (A - A') = - C

Therefore C = A - A' is a skew symmetric matrix.

Theorem 2 Any square matrix can be expressed as the sum of a symmetric and a skew symmetric
matrix.

Proof Let A be a square matrix, then we can write
$$A = \frac{1}{2}(A + A') + \frac{1}{2}(A - A')$$
From Theorem 1, we know that (A + A′) is a symmetric matrix and (A − A′) is a skew symmetric matrix. Since for any matrix A, (kA)′ = kA′, it follows that ½(A + A′) is a symmetric matrix and ½(A − A′) is a skew symmetric matrix. Thus, any square matrix can be expressed as the sum of a symmetric and a skew symmetric matrix.
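
The construction in the proof is directly computable. The following is a minimal sketch, assuming a list-of-rows representation; the names `transpose` and `sym_skew_parts` are hypothetical. Python's `fractions.Fraction` is used so that the halves stay exact. The 3 × 3 matrix is an arbitrary illustration.

```python
from fractions import Fraction


def transpose(a):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def sym_skew_parts(a):
    """Return (P, Q) with P = (A + A')/2 symmetric and Q = (A - A')/2 skew symmetric."""
    at = transpose(a)
    n = len(a)
    half = Fraction(1, 2)
    p = [[half * (a[i][j] + at[i][j]) for j in range(n)] for i in range(n)]
    q = [[half * (a[i][j] - at[i][j]) for j in range(n)] for i in range(n)]
    return p, q


B = [[2, -2, -4], [-1, 3, 4], [1, -2, -3]]
P, Q = sym_skew_parts(B)

print(P == transpose(P))                              # True: P is symmetric
print(Q == [[-x for x in row] for row in transpose(Q)])  # True: Q' = -Q
print([[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)] == B)  # True: P + Q = B
```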
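Example 22 Express the matrix $B = \begin{bmatrix} 2 & -2 & -4 \\ -1 & 3 & 4 \\ 1 & -2 & -3 \end{bmatrix}$ as the sum of a symmetric and a skew symmetric matrix.

Solution Here
$$B' = \begin{bmatrix} 2 & -1 & 1 \\ -2 & 3 & -2 \\ -4 & 4 & -3 \end{bmatrix}$$
$$\text{Let } P = \frac{1}{2}(B + B') = \frac{1}{2}\begin{bmatrix} 4 & -3 & -3 \\ -3 & 6 & 2 \\ -3 & 2 & -6 \end{bmatrix} = \begin{bmatrix} 2 & -\frac{3}{2} & -\frac{3}{2} \\ -\frac{3}{2} & 3 & 1 \\ -\frac{3}{2} & 1 & -3 \end{bmatrix}$$
$$\text{Now } P' = \begin{bmatrix} 2 & -\frac{3}{2} & -\frac{3}{2} \\ -\frac{3}{2} & 3 & 1 \\ -\frac{3}{2} & 1 & -3 \end{bmatrix} = P$$
Thus P = ½(B + B′) is a symmetric matrix.
$$\text{Also, let } Q = \frac{1}{2}(B - B') = \frac{1}{2}\begin{bmatrix} 0 & -1 & -5 \\ 1 & 0 & 6 \\ 5 & -6 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -\frac{1}{2} & -\frac{5}{2} \\ \frac{1}{2} & 0 & 3 \\ \frac{5}{2} & -3 & 0 \end{bmatrix}$$
$$\text{Then } Q' = \begin{bmatrix} 0 & \frac{1}{2} & \frac{5}{2} \\ -\frac{1}{2} & 0 & -3 \\ -\frac{5}{2} & 3 & 0 \end{bmatrix} = -Q$$
Thus Q = ½(B − B′) is a skew symmetric matrix.
$$\text{Now } P + Q = \begin{bmatrix} 2 & -\frac{3}{2} & -\frac{3}{2} \\ -\frac{3}{2} & 3 & 1 \\ -\frac{3}{2} & 1 & -3 \end{bmatrix} + \begin{bmatrix} 0 & -\frac{1}{2} & -\frac{5}{2} \\ \frac{1}{2} & 0 & 3 \\ \frac{5}{2} & -3 & 0 \end{bmatrix} = \begin{bmatrix} 2 & -2 & -4 \\ -1 & 3 & 4 \\ 1 & -2 & -3 \end{bmatrix} = B$$

Thus, B is represented as the sum of a symmetric and a skew symmetric matrix.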

Elementary Operation (Transformation) of a Matrix

There are six operations (transformations) on a matrix, three of which are due to rows and three due
to columns, which are known as elementary operations or transformations.

(i) The interchange of any two rows or two columns. Symbolically, the interchange of ith and jth rows is denoted by Ri ↔ Rj and the interchange of ith and jth columns is denoted by Ci ↔ Cj.

For example, applying R1 ↔ R2 to $A = \begin{bmatrix} 1 & 2 & 1 \\ -1 & \sqrt{3} & 1 \\ 5 & 6 & 7 \end{bmatrix}$, we get $\begin{bmatrix} -1 & \sqrt{3} & 1 \\ 1 & 2 & 1 \\ 5 & 6 & 7 \end{bmatrix}$.

(ii) The multiplication of the elements of any row or column by a non zero number. Symbolically, the multiplication of each element of the ith row by k, where k ≠ 0, is denoted by Ri → kRi.

The corresponding column operation is denoted by Ci → kCi.

For example, applying C3 → (1/7)C3 to $B = \begin{bmatrix} 1 & 2 & 1 \\ -1 & \sqrt{3} & 1 \end{bmatrix}$, we get $\begin{bmatrix} 1 & 2 & \frac{1}{7} \\ -1 & \sqrt{3} & \frac{1}{7} \end{bmatrix}$.

(iii) The addition to the elements of any row or column, the corresponding elements of any other row or column multiplied by any non zero number.

Symbolically, the addition to the elements of the ith row, the corresponding elements of the jth row multiplied by k is denoted by Ri → Ri + kRj.

The corresponding column operation is denoted by Ci → Ci + kCj.

For example, applying R2 → R2 − 2R1 to $C = \begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix}$, we get $\begin{bmatrix} 1 & 2 \\ 0 & -5 \end{bmatrix}$.

Invertible Matrices
Invertible Matrices

Definition 6 If A is a square matrix of order m, and if there exists another square matrix B of the same order m, such that AB = BA = I, then B is called the inverse matrix of A and it is denoted by A⁻¹. In that case A is said to be invertible.

For example, let $A = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix}$ be two matrices.
$$\text{Now } AB = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 4-3 & -6+6 \\ 2-2 & -3+4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$
Also $BA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$. Thus B is the inverse of A, in other words B = A⁻¹, and A is the inverse of B, i.e., A = B⁻¹.
[Note]

1. A rectangular matrix does not possess inverse matrix, since for products BA and AB to be defined
and to be equal, it is necessary that matrices A and B should be square matrices of the same order.

2. If B is the inverse of A, then A is also the inverse of B.

Theorem 3 (Uniqueness of inverse) Inverse of a square matrix, if it exists, is unique.

Proof Let A = [aij] be a square matrix of order m. If possible, let B and C be two inverses of A. We shall
show that B = C.

Since B is the inverse of A

AB = BA = I ...(1)

Since C is also the inverse of A

AC = CA = I ...(2)

Thus B = BI = B (AC) = (BA) C = IC = C

Theorem 4 If A and B are invertible matrices of the same order, then (AB)-1 = B-1 A-1.

Proof From the definition of inverse of a matrix, we have

(AB) (AB)–1 = I

or A–1 (AB) (AB)–1 = A–1 I (Pre multiplying both sides by A–1)

or (A–1A) B (AB)–1 = A– 1 (Since A–1 I = A–1)

or IB (AB)–1 = A– 1

or B (AB)–1 = A– 1

or B–1 B (AB)–1 = B–1 A–1

or I (AB)–1 = B–1 A–1

Hence (AB)–1 = B-1 A–1

Inverse of a matrix by elementary operations

Let X, A and B be matrices of the same order such that X = AB. In order to apply a sequence of elementary row operations on the matrix equation X = AB, we will apply these row operations simultaneously on X and on the first matrix A of the product AB on RHS.

Similarly, in order to apply a sequence of elementary column operations on the matrix equation X =
AB, we will apply, these operations simultaneously on X and on the second matrix B of the product AB
on RHS.

In view of the above discussion, we conclude that if A is a matrix such that A-1 exists, then to find A-1
using elementary row operations, write A = IA and apply a sequence of row operation on A = IA till we
get, I = BA. The matrix B will be the inverse of A. Similarly, if we wish to find A-1 using column operations,
then, write A = AI and apply a sequence of column operations on A = AI till we get, I = AB.

Remark In case, after applying one or more elementary row (column) operations on A = IA (A = AI), if
we obtain all zeros in one or more rows of the matrix A on L.H.S., then A-1 does not exist.
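
The row-operation procedure can be sketched as a short routine. This is an illustration under stated assumptions rather than a prescribed algorithm: the function name `inverse_by_row_ops` is hypothetical, matrices are lists of rows, and the order in which operations are applied (clear one column at a time, swapping rows when a pivot is zero) is a choice made here. The routine keeps the two sides of A = IA in step and returns `None` when the left side cannot be reduced to I.

```python
from fractions import Fraction


def inverse_by_row_ops(a):
    """Reduce A = IA to I = BA with elementary row operations; return B or None."""
    n = len(a)
    left = [[Fraction(x) for x in row] for row in a]                      # starts as A
    right = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # starts as I
    for col in range(n):
        # Find a row at or below the diagonal with a non zero entry in this column.
        pivot = next((r for r in range(col, n) if left[r][col] != 0), None)
        if pivot is None:
            return None  # a row of zeros will appear on the left: no inverse
        if pivot != col:  # Ri <-> Rj on both sides
            left[col], left[pivot] = left[pivot], left[col]
            right[col], right[pivot] = right[pivot], right[col]
        k = left[col][col]  # Ri -> (1/k) Ri on both sides
        left[col] = [x / k for x in left[col]]
        right[col] = [x / k for x in right[col]]
        for r in range(n):  # Rr -> Rr - m Ri on both sides
            if r != col and left[r][col] != 0:
                m = left[r][col]
                left[r] = [x - m * y for x, y in zip(left[r], left[col])]
                right[r] = [x - m * y for x, y in zip(right[r], right[col])]
    return right


print(inverse_by_row_ops([[1, 2], [2, -1]]))
print(inverse_by_row_ops([[10, -2], [-5, 1]]))  # None
```

For the first matrix the result is the matrix with rows (1/5, 2/5) and (2/5, −1/5), printed as `Fraction` objects; for the second, the second row of the left side becomes all zeros, so no inverse is returned.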
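Example 23 By using elementary operations, find the inverse of the matrix $A = \begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix}$.

Solution In order to use elementary row operations we may write A = IA.
$$\text{or } \begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} A, \text{ then } \begin{bmatrix} 1 & 2 \\ 0 & -5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} A \quad (\text{applying } R_2 \to R_2 - 2R_1)$$
$$\text{or } \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix} A \quad (\text{applying } R_2 \to -\tfrac{1}{5} R_2)$$
$$\text{or } \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & \frac{2}{5} \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix} A \quad (\text{applying } R_1 \to R_1 - 2R_2)$$
$$\text{Thus } A^{-1} = \begin{bmatrix} \frac{1}{5} & \frac{2}{5} \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix}$$

Alternatively, in order to use elementary column operations, we write A = AI, i.e.,
$$\begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix} = A \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Applying C2 → C2 − 2C1, we get
$$\begin{bmatrix} 1 & 0 \\ 2 & -5 \end{bmatrix} = A \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}$$
Now applying C2 → −(1/5)C2, we have
$$\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} = A \begin{bmatrix} 1 & \frac{2}{5} \\ 0 & -\frac{1}{5} \end{bmatrix}$$
Finally, applying C1 → C1 − 2C2, we obtain
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = A \begin{bmatrix} \frac{1}{5} & \frac{2}{5} \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix}$$
$$\text{Hence } A^{-1} = \begin{bmatrix} \frac{1}{5} & \frac{2}{5} \\ \frac{2}{5} & -\frac{1}{5} \end{bmatrix}$$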

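Example 24 Obtain the inverse of the following matrix using elementary operations
$$A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 3 & 1 & 1 \end{bmatrix}.$$
Solution Write A = IA, i.e.,
$$\begin{bmatrix} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 3 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} A$$
$$\text{or } \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 3 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} A \quad (\text{applying } R_1 \leftrightarrow R_2)$$
$$\text{or } \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & -5 & -8 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -3 & 1 \end{bmatrix} A \quad (\text{applying } R_3 \to R_3 - 3R_1)$$
$$\text{or } \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & -5 & -8 \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -3 & 1 \end{bmatrix} A \quad (\text{applying } R_1 \to R_1 - 2R_2)$$
$$\text{or } \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 0 \\ 5 & -3 & 1 \end{bmatrix} A \quad (\text{applying } R_3 \to R_3 + 5R_2)$$
$$\text{or } \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 0 \\ \frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix} A \quad (\text{applying } R_3 \to \tfrac{1}{2} R_3)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 1 & 0 & 0 \\ \frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix} A \quad (\text{applying } R_1 \to R_1 + R_3)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ -4 & 3 & -1 \\ \frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix} A \quad (\text{applying } R_2 \to R_2 - 2R_3)$$
$$\text{Hence } A^{-1} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ -4 & 3 & -1 \\ \frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix}$$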

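Alternatively, write A = AI, i.e.,
$$\begin{bmatrix} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 3 & 1 & 1 \end{bmatrix} = A \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
$$\text{or } \begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & 3 \\ 1 & 3 & 1 \end{bmatrix} = A \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad (C_1 \leftrightarrow C_2)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & -1 \\ 1 & 3 & -1 \end{bmatrix} = A \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & -2 \\ 0 & 0 & 1 \end{bmatrix} \quad (C_3 \to C_3 - 2C_1)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 3 & 2 \end{bmatrix} = A \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & -2 \\ 0 & 0 & 1 \end{bmatrix} \quad (C_3 \to C_3 + C_2)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 3 & 1 \end{bmatrix} = A \begin{bmatrix} 0 & 1 & \frac{1}{2} \\ 1 & 0 & -1 \\ 0 & 0 & \frac{1}{2} \end{bmatrix} \quad (C_3 \to \tfrac{1}{2} C_3)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -5 & 3 & 1 \end{bmatrix} = A \begin{bmatrix} -2 & 1 & \frac{1}{2} \\ 1 & 0 & -1 \\ 0 & 0 & \frac{1}{2} \end{bmatrix} \quad (C_1 \to C_1 - 2C_2)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix} = A \begin{bmatrix} \frac{1}{2} & 1 & \frac{1}{2} \\ -4 & 0 & -1 \\ \frac{5}{2} & 0 & \frac{1}{2} \end{bmatrix} \quad (C_1 \to C_1 + 5C_3)$$
$$\text{or } \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = A \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ -4 & 3 & -1 \\ \frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix} \quad (C_2 \to C_2 - 3C_3)$$
$$\text{Hence } A^{-1} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ -4 & 3 & -1 \\ \frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix}$$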

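Example 25 Find P⁻¹, if it exists, given $P = \begin{bmatrix} 10 & -2 \\ -5 & 1 \end{bmatrix}$.

Solution We have P = IP, i.e., $\begin{bmatrix} 10 & -2 \\ -5 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} P$.
$$\text{or } \begin{bmatrix} 1 & -\frac{1}{5} \\ -5 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{10} & 0 \\ 0 & 1 \end{bmatrix} P \quad (\text{applying } R_1 \to \tfrac{1}{10} R_1)$$
$$\text{or } \begin{bmatrix} 1 & -\frac{1}{5} \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{10} & 0 \\ \frac{1}{2} & 1 \end{bmatrix} P \quad (\text{applying } R_2 \to R_2 + 5R_1)$$

We have all zeros in the second row of the left hand side matrix of the above equation. Therefore, P⁻¹ does not exist.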

4 Determinants
Introduction

We have studied matrices and the algebra of matrices. We have also learnt that a system of algebraic equations can be expressed in the form of matrices. This means that a system of linear equations like
a1 x + b1 y = c1
a2 x + b2 y = c2
can be represented as $\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$. Whether this system of equations has a unique solution or not is determined by the number a1 b2 − a2 b1. (Recall that if a1/a2 ≠ b1/b2 or, equivalently, a1 b2 − a2 b1 ≠ 0, then the system of linear equations has a unique solution.) The number a1 b2 − a2 b1, which determines uniqueness of solution, is associated with the matrix $A = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$ and is called the determinant of A or det A. Determinants have wide applications in Engineering, Science, Economics, Social Science, etc.

In this chapter, we shall study determinants up to order three only with real entries. Also, we will study
various properties of determinants, minors, cofactors and applications of determinants in finding the
area of a triangle, adjoint and inverse of a square matrix, consistency and inconsistency of system of
linear equations and solution of linear equations in two or three variables using inverse of a matrix.

Determinant

To every square matrix A = [aij] of order n, we can associate a number (real or complex) called
determinant of the square matrix A, where aij = (i, j)th element of A.

This may be thought of as a function which associates each square matrix with a unique number (real
or complex). If M is the set of square matrices, K is the set of numbers (real or complex) and f: M → K
is defined by f (A) = k. where A ∈ M and k ∈ K, then f(A) is called the determinant of A. It is also denoted
by |A| or det A or ∆.
If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then the determinant of A is written as $|A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = \det(A)$
Remarks

(i) For matrix A, |A| is read as determinant of A and not modulus of A.

(ii) Only square matrices have determinants.

Determinant of a matrix of order one

Let A = [a] be the matrix of order 1, then determinant of A is defined to be equal to a

Determinant of a matrix of order two


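Let $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ be a matrix of order 2 × 2,

then the determinant of A is defined as:
$$\det(A) = |A| = \Delta = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{21} a_{12}$$
Example 1 Evaluate $\begin{vmatrix} 2 & 4 \\ -1 & 2 \end{vmatrix}$.

Solution We have $\begin{vmatrix} 2 & 4 \\ -1 & 2 \end{vmatrix} = 2(2) - 4(-1) = 4 + 4 = 8$.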
Example 2 Evaluate $\begin{vmatrix} x & x+1 \\ x-1 & x \end{vmatrix}$

Solution We have
$$\begin{vmatrix} x & x+1 \\ x-1 & x \end{vmatrix} = x(x) - (x+1)(x-1) = x^2 - (x^2 - 1) = x^2 - x^2 + 1 = 1$$
Determinant of a matrix of order 3 × 3

Determinant of a matrix of order three can be determined by expressing it in terms of second order
determinants. This is known as expansion of a determinant along a row (or a column). There are six
ways of expanding a determinant of order 3 corresponding to each of three rows (R1, R2 and R3) and
three columns (C1, C2 and C3) giving the same value as shown below.

Consider the determinant of square matrix A = [aij]3 × 3, i.e.,
$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$
Expansion along first Row (R1)

Step 1 Multiply first element a11 of R1 by (−1)^(1+1) [(−1)^(sum of suffixes in a11)] and with the second order determinant obtained by deleting the elements of first row (R1) and first column (C1) of |A|, as a11 lies in R1 and C1,
$$\text{i.e., } (-1)^{1+1} a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$$
Step 2 Multiply 2nd element a12 of R1 by (−1)^(1+2) [(−1)^(sum of suffixes in a12)] and the second order determinant obtained by deleting elements of first row (R1) and 2nd column (C2) of |A|, as a12 lies in R1 and C2,
$$\text{i.e., } (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$$
Step 3 Multiply third element a13 of R1 by (−1)^(1+3) [(−1)^(sum of suffixes in a13)] and the second order determinant obtained by deleting elements of first row (R1) and third column (C3) of |A|, as a13 lies in R1 and C3,
$$\text{i.e., } (-1)^{1+3} a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
Step 4 Now the expansion of determinant of A, that is, |A| written as the sum of all three terms obtained in steps 1, 2 and 3 above, is given by
$$\det A = |A| = (-1)^{1+1} a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{1+3} a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
$$\text{or } |A| = a_{11}(a_{22} a_{33} - a_{32} a_{23}) - a_{12}(a_{21} a_{33} - a_{31} a_{23}) + a_{13}(a_{21} a_{32} - a_{31} a_{22})$$
$$= a_{11} a_{22} a_{33} - a_{11} a_{32} a_{23} - a_{12} a_{21} a_{33} + a_{12} a_{31} a_{23} + a_{13} a_{21} a_{32} - a_{13} a_{31} a_{22} \quad \ldots(1)$$

[Note] We shall apply all four steps together.

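Expansion along second row (R2)
$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$
Expanding along R2, we get
$$|A| = (-1)^{2+1} a_{21} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{2+2} a_{22} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{2+3} a_{23} \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}$$
$$|A| = -a_{21} a_{12} a_{33} + a_{21} a_{32} a_{13} + a_{22} a_{11} a_{33} - a_{22} a_{31} a_{13} - a_{23} a_{11} a_{32} + a_{23} a_{31} a_{12}$$
$$= a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{31} a_{22} \quad \ldots(2)$$
Expansion along first Column (C1)
$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$
By expanding along C1, we get
$$|A| = a_{11} (-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + a_{21} (-1)^{2+1} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{31} (-1)^{3+1} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}$$
$$= a_{11}(a_{22} a_{33} - a_{23} a_{32}) - a_{21}(a_{12} a_{33} - a_{13} a_{32}) + a_{31}(a_{12} a_{23} - a_{13} a_{22})$$
$$|A| = a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{21} a_{12} a_{33} + a_{21} a_{13} a_{32} + a_{31} a_{12} a_{23} - a_{31} a_{13} a_{22}$$
$$= a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{31} a_{22} \quad \ldots(3)$$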

Clearly, values of |A| in (1), (2) and (3) are equal. It is left as an exercise to the reader to verify that the
values of |A| by expanding along R3, C2 and C3 are equal to the value of |A| obtained in (1), (2) or (3).

Hence, expanding a determinant along any row or column gives same value.
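
The claim that every row and column gives the same value can be checked by a direct implementation of the expansion. The sketch below is illustrative: the function names `minor`, `det`, `det_along_row` and `det_along_col` are hypothetical, indices are 0-based (so the sign factor (−1)^(i+j) is unchanged), and the 3 × 3 matrix is an arbitrary example.

```python
def minor(a, i, j):
    """Matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))


def det_along_row(a, i):
    """Expand along row i: sum over j of (-1)^(i+j) * a_ij * (minor determinant)."""
    return sum((-1) ** (i + j) * a[i][j] * det(minor(a, i, j)) for j in range(len(a)))


def det_along_col(a, j):
    """Expand along column j: sum over i of (-1)^(i+j) * a_ij * (minor determinant)."""
    return sum((-1) ** (i + j) * a[i][j] * det(minor(a, i, j)) for i in range(len(a)))


M = [[1, 2, 4], [-1, 3, 0], [4, 1, 0]]
values = [det_along_row(M, i) for i in range(3)] + [det_along_col(M, j) for j in range(3)]
print(values)  # [-52, -52, -52, -52, -52, -52]
```

Expanding along the third column of this matrix needs only one 2 × 2 determinant, because two of its entries are zero.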

Remarks

(i) For easier calculations, we shall expand the determinant along that row or column which contains
maximum number of zeros.

(ii) While expanding, instead of multiplying by (−1)^(i + j), we can multiply by +1 or −1 according as (i + j) is even or odd.

(iii) Let $A = \begin{bmatrix} 2 & 2 \\ 4 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix}$. Then, it is easy to verify that A = 2B. Also |A| = 0 − 8 = −8 and |B| = 0 − 2 = −2.

Observe that |A| = 4(−2) = 2²|B| or |A| = 2ⁿ|B|, where n = 2 is the order of square matrices A and B.

In general, if A = kB where A and B are square matrices of order n, then |A| = kⁿ|B|, where n = 1, 2, 3.
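Example 3 Evaluate the determinant $\Delta = \begin{vmatrix} 1 & 2 & 4 \\ -1 & 3 & 0 \\ 4 & 1 & 0 \end{vmatrix}$.

Solution Note that in the third column, two entries are zero. So expanding along third column (C3), we get
$$\Delta = 4\begin{vmatrix} -1 & 3 \\ 4 & 1 \end{vmatrix} - 0\begin{vmatrix} 1 & 2 \\ 4 & 1 \end{vmatrix} + 0\begin{vmatrix} 1 & 2 \\ -1 & 3 \end{vmatrix} = 4(-1 - 12) - 0 + 0 = -52$$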

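Example 4 Evaluate
\[
\Delta = \begin{vmatrix} 0 & \sin\alpha & -\cos\alpha \\ -\sin\alpha & 0 & \sin\beta \\ \cos\alpha & -\sin\beta & 0 \end{vmatrix}
\]
Solution Expanding along R1, we get
\[
\Delta = 0 \begin{vmatrix} 0 & \sin\beta \\ -\sin\beta & 0 \end{vmatrix} - \sin\alpha \begin{vmatrix} -\sin\alpha & \sin\beta \\ \cos\alpha & 0 \end{vmatrix} - \cos\alpha \begin{vmatrix} -\sin\alpha & 0 \\ \cos\alpha & -\sin\beta \end{vmatrix}
\]
= 0 − sin α(0 − sin β cos α) − cos α(sin α sin β − 0)
= sin α sin β cos α − cos α sin α sin β = 0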
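Example 5 Find values of x for which
\[
\begin{vmatrix} 3 & x \\ x & 1 \end{vmatrix} = \begin{vmatrix} 3 & 2 \\ 4 & 1 \end{vmatrix}
\]
Solution We have
\[
\begin{vmatrix} 3 & x \\ x & 1 \end{vmatrix} = \begin{vmatrix} 3 & 2 \\ 4 & 1 \end{vmatrix}
\]
i.e. $3 - x^2 = 3 - 8$

i.e. $x^2 = 8$

Hence $x = \pm 2\sqrt{2}$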

Properties of Determinants

In the previous section, we have learnt how to expand the determinants. In this section, we will study
some properties of determinants which simplify their evaluation by obtaining the maximum number of
zeros in a row or a column. These properties are true for determinants of any order. However, we shall
restrict ourselves to determinants of order up to 3 only.

Property 1 The value of the determinant remains unchanged if its rows and columns are interchanged.
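Verification Let
\[
\Delta = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
\]
Expanding along first row, we get
\[
\Delta = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}
\]
\[
= a_1 (b_2 c_3 - b_3 c_2) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1)
\]
By interchanging the rows and columns of Δ, we get the determinant
\[
\Delta_1 = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
\]
Expanding $\Delta_1$ along first column, we get
\[
\Delta_1 = a_1 (b_2 c_3 - c_2 b_3) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1)
\]
Hence $\Delta = \Delta_1$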

Remark It follows from above property that if A is a square matrix, then det (A) = det (A'), where A' =
transpose of A.
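The remark det (A) = det (A') can be spot-checked in a few lines of Python. The sketch below is illustrative only; `det` and `transpose` are hypothetical helper names and the matrix is a sample with integer entries.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]


A = [[2, -3, 5], [6, 0, 4], [1, 5, -7]]
print(transpose(A))                 # [[2, 6, 1], [-3, 0, 5], [5, 4, -7]]
print(det(A), det(transpose(A)))    # -28 -28
```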

[Note] If Ri = ith row and Ci = ith column, then for interchange of rows and columns, we will symbolically
write Ci ↔ Ri.

Let us verify the above property by an example.

Example 6 Verify Property 1 for
\[
\Delta = \begin{vmatrix} 2 & -3 & 5 \\ 6 & 0 & 4 \\ 1 & 5 & -7 \end{vmatrix}
\]
Solution Expanding the determinant along first row, we have
\[
\Delta = 2 \begin{vmatrix} 0 & 4 \\ 5 & -7 \end{vmatrix} - (-3) \begin{vmatrix} 6 & 4 \\ 1 & -7 \end{vmatrix} + 5 \begin{vmatrix} 6 & 0 \\ 1 & 5 \end{vmatrix}
\]
= 2(0 − 20) + 3(−42 − 4) + 5(30 − 0)
= −40 − 138 + 150 = −28

By interchanging rows and columns, we get
\[
\Delta_1 = \begin{vmatrix} 2 & 6 & 1 \\ -3 & 0 & 5 \\ 5 & 4 & -7 \end{vmatrix} \quad \text{(Expanding along first column)}
\]
\[
= 2 \begin{vmatrix} 0 & 5 \\ 4 & -7 \end{vmatrix} - (-3) \begin{vmatrix} 6 & 1 \\ 4 & -7 \end{vmatrix} + 5 \begin{vmatrix} 6 & 1 \\ 0 & 5 \end{vmatrix}
\]
= 2(0 − 20) + 3(−42 − 4) + 5(30 − 0)

= −40 − 138 + 150 = −28

Clearly Δ = Δ1

Hence, Property 1 is verified.

Property 2 If any two rows (or columns) of a determinant are interchanged, then sign of determinant
changes.
Verification Let
\[
\Delta = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
\]
Expanding along first row, we get
\[
\Delta = a_1 (b_2 c_3 - b_3 c_2) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1)
\]
Interchanging first and third rows, the new determinant obtained is given by
\[
\Delta_1 = \begin{vmatrix} c_1 & c_2 & c_3 \\ b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \end{vmatrix}
\]
Expanding along third row, we get
\[
\Delta_1 = a_1 (c_2 b_3 - b_2 c_3) - a_2 (c_1 b_3 - c_3 b_1) + a_3 (b_2 c_1 - b_1 c_2)
\]
\[
= -[a_1 (b_2 c_3 - b_3 c_2) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1)]
\]
Clearly $\Delta_1 = -\Delta$

Similarly, we can verify the result by interchanging any two columns.

[Note] We can denote the interchange of rows by Ri ↔ Rj and interchange of columns by Ci ↔ Cj.
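Property 2 can also be observed numerically. In this hedged Python sketch, `det`, `swap_rows` and `swap_cols` are hypothetical helpers, indices are 0-based, and the matrix is a sample.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def swap_rows(m, i, j):
    """Return a copy of m after the operation R_i <-> R_j."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out


def swap_cols(m, i, j):
    """Return a copy of m after the operation C_i <-> C_j."""
    out = [row[:] for row in m]
    for row in out:
        row[i], row[j] = row[j], row[i]
    return out


A = [[2, -3, 5], [6, 0, 4], [1, 5, -7]]
print(det(A))                   # -28
print(det(swap_rows(A, 1, 2)))  # 28
print(det(swap_cols(A, 0, 2)))  # 28
```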

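Example 7 Verify Property 2 for
\[
\Delta = \begin{vmatrix} 2 & -3 & 5 \\ 6 & 0 & 4 \\ 1 & 5 & -7 \end{vmatrix}
\]
Solution
\[
\Delta = \begin{vmatrix} 2 & -3 & 5 \\ 6 & 0 & 4 \\ 1 & 5 & -7 \end{vmatrix} = -28 \quad \text{(See Example 6)}
\]
Interchanging rows R2 and R3, i.e., R2 ↔ R3, we have
\[
\Delta_1 = \begin{vmatrix} 2 & -3 & 5 \\ 1 & 5 & -7 \\ 6 & 0 & 4 \end{vmatrix}
\]
Expanding the determinant $\Delta_1$ along first row, we have
\[
\Delta_1 = 2 \begin{vmatrix} 5 & -7 \\ 0 & 4 \end{vmatrix} - (-3) \begin{vmatrix} 1 & -7 \\ 6 & 4 \end{vmatrix} + 5 \begin{vmatrix} 1 & 5 \\ 6 & 0 \end{vmatrix}
\]
= 2(20 − 0) + 3(4 + 42) + 5(0 − 30)

= 40 + 138 − 150 = 28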

Clearly Δ1 = - Δ

Hence, Property 2 is verified.

Property 3 If any two rows (or columns) of a determinant are identical (all corresponding elements are
same), then value of determinant is zero.

Proof If we interchange the identical rows (or columns) of the determinant Δ, then Δ does not change.
However, by Property 2, it follows that Δ has changed its sign.

Therefore Δ = - Δ

or Δ = 0

Let us verify the above property by an example.


Example 8 Evaluate
\[
\Delta = \begin{vmatrix} 3 & 2 & 3 \\ 2 & 2 & 3 \\ 3 & 2 & 3 \end{vmatrix}
\]
Solution Expanding along first row, we get

Δ = 3 (6 - 6) - 2 (6 - 9) + 3 (4 - 6)

= 0 - 2 (-3) + 3 (-2) = 6 - 6 = 0

Here R1 and R3 are identical.

Property 4 If each element of a row (or a column) of a determinant is multiplied by a constant k, then
its value gets multiplied by k.
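Verification Let
\[
\Delta = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
\]
and $\Delta_1$ be the determinant obtained by multiplying the elements of the first row by k.

Then
\[
\Delta_1 = \begin{vmatrix} k a_1 & k b_1 & k c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
\]
Expanding along first row, we get
\[
\Delta_1 = k a_1 (b_2 c_3 - b_3 c_2) - k b_1 (a_2 c_3 - c_2 a_3) + k c_1 (a_2 b_3 - b_2 a_3)
\]
\[
= k [a_1 (b_2 c_3 - b_3 c_2) - b_1 (a_2 c_3 - c_2 a_3) + c_1 (a_2 b_3 - b_2 a_3)]
\]
\[
= k \Delta
\]
Hence
\[
\begin{vmatrix} k a_1 & k b_1 & k c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = k \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
\]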
Remarks

(i) By this property, we can take out any common factor from any one row or any one column of a given
determinant.

(ii) If corresponding elements of any two rows (or columns) of a determinant are proportional (in the
same ratio), then its value is zero. For example
\[
\Delta = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ k a_1 & k a_2 & k a_3 \end{vmatrix} = 0 \quad \text{(rows R1 and R3 are proportional)}
\]
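Example 9 Evaluate
\[
\begin{vmatrix} 102 & 18 & 36 \\ 1 & 3 & 4 \\ 17 & 3 & 6 \end{vmatrix}
\]
Solution Note that
\[
\begin{vmatrix} 102 & 18 & 36 \\ 1 & 3 & 4 \\ 17 & 3 & 6 \end{vmatrix} = \begin{vmatrix} 6(17) & 6(3) & 6(6) \\ 1 & 3 & 4 \\ 17 & 3 & 6 \end{vmatrix} = 6 \begin{vmatrix} 17 & 3 & 6 \\ 1 & 3 & 4 \\ 17 & 3 & 6 \end{vmatrix} = 0
\]
(Using Properties 3 and 4)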

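Property 5 If some or all elements of a row or column of a determinant are expressed as sum of two
(or more) terms, then the determinant can be expressed as sum of two (or more) determinants.

For example,
\[
\begin{vmatrix} a_1 + \lambda_1 & a_2 + \lambda_2 & a_3 + \lambda_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} \lambda_1 & \lambda_2 & \lambda_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
\]
Verification
\[
\text{L.H.S.} = \begin{vmatrix} a_1 + \lambda_1 & a_2 + \lambda_2 & a_3 + \lambda_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
\]
Expanding the determinant along the first row, we get
\[
\Delta = (a_1 + \lambda_1)(b_2 c_3 - c_2 b_3) - (a_2 + \lambda_2)(b_1 c_3 - b_3 c_1) + (a_3 + \lambda_3)(b_1 c_2 - b_2 c_1)
\]
\[
= a_1 (b_2 c_3 - c_2 b_3) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1) + \lambda_1 (b_2 c_3 - c_2 b_3) - \lambda_2 (b_1 c_3 - b_3 c_1) + \lambda_3 (b_1 c_2 - b_2 c_1)
\]
(by rearranging terms)
\[
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} \lambda_1 & \lambda_2 & \lambda_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \text{R.H.S.}
\]
Similarly, we may verify Property 5 for other rows or columns.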
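Example 10 Show that
\[
\begin{vmatrix} a & b & c \\ a + 2x & b + 2y & c + 2z \\ x & y & z \end{vmatrix} = 0
\]
Solution We have
\[
\begin{vmatrix} a & b & c \\ a + 2x & b + 2y & c + 2z \\ x & y & z \end{vmatrix} = \begin{vmatrix} a & b & c \\ a & b & c \\ x & y & z \end{vmatrix} + \begin{vmatrix} a & b & c \\ 2x & 2y & 2z \\ x & y & z \end{vmatrix} \quad \text{(by Property 5)}
\]
= 0 + 0 = 0 (Using Property 3 and Property 4)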

Property 6 If, to each element of any row or column of a determinant, the equimultiples of
corresponding elements of another row (or column) are added, then the value of the determinant remains the
same, i.e., the value of the determinant remains the same if we apply the operation Ri → Ri + kRj or Ci → Ci + kCj.

Verification
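Let
\[
\Delta = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \quad \text{and} \quad \Delta_1 = \begin{vmatrix} a_1 + k c_1 & a_2 + k c_2 & a_3 + k c_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},
\]
where $\Delta_1$ is obtained by the operation R1 → R1 + kR3.

Here, we have multiplied the elements of the third row (R3) by a constant k and added them to the
corresponding elements of the first row (R1).

Symbolically, we write this operation as R1 → R1 + kR3.

Now, again
\[
\Delta_1 = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} k c_1 & k c_2 & k c_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \quad \text{(Using Property 5)}
\]
= Δ + 0 (since R1 and R3 are proportional)

Hence $\Delta = \Delta_1$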

Remarks

(i) If ∆1 is the determinant obtained by applying Ri → kRi or Ci → kCi to the determinant Δ, then Δ1 = k∆.

(ii) If more than one operation like Ri → Ri + kRj is done in one step, care should be taken to see that a
row that is affected in one operation should not be used in another operation. A similar remark applies
to column operations.
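The following Python sketch illustrates Property 6 and one possible reading of remark (ii). It is an illustration under stated assumptions, not a prescribed method: `det` and `add_multiple` are hypothetical helper names, indices are 0-based, and the matrix is a sample.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def add_multiple(m, i, j, k):
    """Return a copy of m after the operation R_i -> R_i + k R_j."""
    out = [row[:] for row in m]
    out[i] = [x + k * y for x, y in zip(m[i], m[j])]
    return out


A = [[2, -3, 5], [6, 0, 4], [1, 5, -7]]
print(det(A), det(add_multiple(A, 0, 2, 4)))   # -28 -28

# Two operations applied one after the other keep the value.
sequential = add_multiple(add_multiple(A, 0, 1, 1), 1, 0, 1)

# Writing R1 -> R1 + R2 and R2 -> R2 + R1 "in one step" from the ORIGINAL rows
# uses a row that another operation has already changed; the value is lost.
simultaneous = [
    [x + y for x, y in zip(A[0], A[1])],
    [x + y for x, y in zip(A[1], A[0])],
    A[2][:],
]
print(det(sequential), det(simultaneous))      # -28 0
```

The second result is zero because the first two rows of `simultaneous` are identical, which is exactly the situation the remark warns against.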
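Example 11 Prove that
\[
\begin{vmatrix} a & a + b & a + b + c \\ 2a & 3a + 2b & 4a + 3b + 2c \\ 3a & 6a + 3b & 10a + 6b + 3c \end{vmatrix} = a^3
\]
Solution Applying operations R2 → R2 − 2R1 and R3 → R3 − 3R1 to the given determinant Δ, we have
\[
\Delta = \begin{vmatrix} a & a + b & a + b + c \\ 0 & a & 2a + b \\ 0 & 3a & 7a + 3b \end{vmatrix}
\]
Now applying R3 → R3 − 3R2, we get
\[
\Delta = \begin{vmatrix} a & a + b & a + b + c \\ 0 & a & 2a + b \\ 0 & 0 & a \end{vmatrix}
\]
Expanding along C1, we obtain
\[
\Delta = a \begin{vmatrix} a & 2a + b \\ 0 & a \end{vmatrix} + 0 + 0
\]
\[
= a (a^2 - 0) = a (a^2) = a^3
\]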

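Example 12 Without expanding, prove that
\[
\Delta = \begin{vmatrix} x + y & y + z & z + x \\ z & x & y \\ 1 & 1 & 1 \end{vmatrix} = 0
\]
Solution Applying R1 → R1 + R2 to Δ, we get
\[
\Delta = \begin{vmatrix} x + y + z & x + y + z & x + y + z \\ z & x & y \\ 1 & 1 & 1 \end{vmatrix}
\]
Since the elements of R1 and R3 are proportional, Δ = 0.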

Example 13 Evaluate
\[
\Delta = \begin{vmatrix} 1 & a & bc \\ 1 & b & ca \\ 1 & c & ab \end{vmatrix}
\]
Solution Applying R2 → R2 − R1 and R3 → R3 − R1, we get
\[
\Delta = \begin{vmatrix} 1 & a & bc \\ 0 & b - a & c(a - b) \\ 0 & c - a & b(a - c) \end{vmatrix}
\]
Taking factors (b − a) and (c − a) common from R2 and R3, respectively, we get
\[
\Delta = (b - a)(c - a) \begin{vmatrix} 1 & a & bc \\ 0 & 1 & -c \\ 0 & 1 & -b \end{vmatrix}
\]
= (b − a)(c − a)[(−b + c)] (Expanding along first column)

= (a − b)(b − c)(c − a)
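Example 14 Prove that
\[
\begin{vmatrix} b + c & a & a \\ b & c + a & b \\ c & c & a + b \end{vmatrix} = 4abc
\]
Solution Let
\[
\Delta = \begin{vmatrix} b + c & a & a \\ b & c + a & b \\ c & c & a + b \end{vmatrix}
\]
Applying R1 → R1 − R2 − R3 to Δ, we get
\[
\Delta = \begin{vmatrix} 0 & -2c & -2b \\ b & c + a & b \\ c & c & a + b \end{vmatrix}
\]
Expanding along R1, we obtain
\[
\Delta = 0 \begin{vmatrix} c + a & b \\ c & a + b \end{vmatrix} - (-2c) \begin{vmatrix} b & b \\ c & a + b \end{vmatrix} + (-2b) \begin{vmatrix} b & c + a \\ c & c \end{vmatrix}
\]
\[
= 2c (ab + b^2 - bc) - 2b (bc - c^2 - ac)
\]
\[
= 2abc + 2cb^2 - 2bc^2 - 2b^2 c + 2bc^2 + 2abc
\]
\[
= 4abc
\]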
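Example 15 If x, y, z are different and
\[
\Delta = \begin{vmatrix} x & x^2 & 1 + x^3 \\ y & y^2 & 1 + y^3 \\ z & z^2 & 1 + z^3 \end{vmatrix} = 0,
\]
then show that 1 + xyz = 0.

Solution We have
\[
\Delta = \begin{vmatrix} x & x^2 & 1 + x^3 \\ y & y^2 & 1 + y^3 \\ z & z^2 & 1 + z^3 \end{vmatrix}
\]
\[
= \begin{vmatrix} x & x^2 & 1 \\ y & y^2 & 1 \\ z & z^2 & 1 \end{vmatrix} + \begin{vmatrix} x & x^2 & x^3 \\ y & y^2 & y^3 \\ z & z^2 & z^3 \end{vmatrix} \quad \text{(Using Property 5)}
\]
\[
= (-1)^2 \begin{vmatrix} 1 & x & x^2 \\ 1 & y & y^2 \\ 1 & z & z^2 \end{vmatrix} + xyz \begin{vmatrix} 1 & x & x^2 \\ 1 & y & y^2 \\ 1 & z & z^2 \end{vmatrix} \quad \text{(Using C3 ↔ C2 and then C1 ↔ C2)}
\]
\[
= \begin{vmatrix} 1 & x & x^2 \\ 1 & y & y^2 \\ 1 & z & z^2 \end{vmatrix} (1 + xyz)
\]
\[
= (1 + xyz) \begin{vmatrix} 1 & x & x^2 \\ 0 & y - x & y^2 - x^2 \\ 0 & z - x & z^2 - x^2 \end{vmatrix} \quad \text{(Using R2 → R2 − R1 and R3 → R3 − R1)}
\]
Taking out common factor (y − x) from R2 and (z − x) from R3, we get
\[
\Delta = (1 + xyz)(y - x)(z - x) \begin{vmatrix} 1 & x & x^2 \\ 0 & 1 & y + x \\ 0 & 1 & z + x \end{vmatrix}
\]
= (1 + xyz)(y − x)(z − x)(z − y) (on expanding along C1)

Since Δ = 0 and x, y, z are all different, i.e., x − y ≠ 0, y − z ≠ 0, z − x ≠ 0, we get

1 + xyz = 0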

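Example 16 Show that
\[
\begin{vmatrix} 1 + a & 1 & 1 \\ 1 & 1 + b & 1 \\ 1 & 1 & 1 + c \end{vmatrix} = abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) = abc + bc + ca + ab
\]
Solution Taking out factors a, b, c common from R1, R2 and R3, we get
\[
\text{L.H.S.} = abc \begin{vmatrix} \frac{1}{a} + 1 & \frac{1}{a} & \frac{1}{a} \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix}
\]
Applying R1 → R1 + R2 + R3, we have
\[
\Delta = abc \begin{vmatrix} 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} & 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} & 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix}
\]
\[
= abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \begin{vmatrix} 1 & 1 & 1 \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix}
\]
Now applying C2 → C2 − C1, C3 → C3 − C1, we get
\[
\Delta = abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \begin{vmatrix} 1 & 0 & 0 \\ \frac{1}{b} & 1 & 0 \\ \frac{1}{c} & 0 & 1 \end{vmatrix}
\]
\[
= abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) [1(1 - 0)]
\]
\[
= abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) = abc + bc + ca + ab = \text{R.H.S.}
\]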
[Note] Alternately try by applying C1 → C1 - C2 and C3 → C3 - C2, then apply C1 → C1 - a C3.

Area of a Triangle

In earlier classes, we have studied that the area of a triangle whose vertices are
$(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ is given by the expression
\[
\frac{1}{2} [x_1 (y_2 - y_3) + x_2 (y_3 - y_1) + x_3 (y_1 - y_2)]
\]
Now this expression can be written in the form of a determinant as
\[
\Delta = \frac{1}{2} \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} \quad \ldots (1)
\]
Remarks

(i) Since area is a positive quantity, we always take the absolute value of the determinant in (1).

(ii) If area is given, use both positive and negative values of the determinant for calculation.

(iii) The area of the triangle formed by three collinear points is zero.
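The determinant formula and the remarks above translate directly into a short function. The Python sketch below is illustrative only; `triangle_area` is a hypothetical name, and `Fraction` is used so that half-integer areas stay exact.

```python
from fractions import Fraction


def triangle_area(p1, p2, p3):
    """Area of the triangle with vertices p1, p2, p3 via the 3 x 3 determinant."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Expansion of | x1 y1 1 ; x2 y2 1 ; x3 y3 1 | along the first row.
    d = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    # Remark (i): area is positive, so take the absolute value.
    return abs(Fraction(d, 2))


print(triangle_area((3, 8), (-4, 2), (5, 1)))   # 61/2
print(triangle_area((0, 0), (1, 3), (2, 6)))    # 0, the points are collinear
```

Listing the vertices in a different order changes only the sign of the determinant, so the returned area is the same.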

Example 17 Find the area of the triangle whose vertices are (3, 8), (- 4, 2) and (5, 1).

Solution The area of triangle is given by

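\[
\Delta = \frac{1}{2} \begin{vmatrix} 3 & 8 & 1 \\ -4 & 2 & 1 \\ 5 & 1 & 1 \end{vmatrix}
\]
\[
= \frac{1}{2} [3(2 - 1) - 8(-4 - 5) + 1(-4 - 10)]
\]
\[
= \frac{1}{2} (3 + 72 - 14) = \frac{61}{2}
\]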
Example 18 Find the equation of the line joining A(1, 3) and B(0, 0) using determinants and find k if
D(k, 0) is a point such that area of triangle ABD is 3 sq. units.

Solution Let P(x, y) be any point on AB. Then, area of triangle ABP is zero (Why?). So
\[
\frac{1}{2} \begin{vmatrix} 0 & 0 & 1 \\ 1 & 3 & 1 \\ x & y & 1 \end{vmatrix} = 0
\]
This gives $\frac{1}{2}(y - 3x) = 0$ or y = 3x,

which is the equation of required line AB.

Also, since the area of the triangle ABD is 3 sq. units, we have
\[
\frac{1}{2} \begin{vmatrix} 1 & 3 & 1 \\ 0 & 0 & 1 \\ k & 0 & 1 \end{vmatrix} = \pm 3
\]
This gives $\frac{3k}{2} = \pm 3$, i.e., k = ±2.

Minors and Cofactors

In this section, we will learn to write the expansion of a determinant in compact form using minors
and cofactors.

Definition 1 Minor of an element $a_{ij}$ of a determinant is the determinant obtained by deleting its ith row
and jth column in which element $a_{ij}$ lies. Minor of an element $a_{ij}$ is denoted by $M_{ij}$.

Remark Minor of an element of a determinant of order n(n ≥ 2) is a determinant of order n - 1.


Example 19 Find the minor of element 6 in the determinant
\[
\Delta = \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{vmatrix}
\]
Solution Since 6 lies in the second row and third column, its minor $M_{23}$ is given by
\[
M_{23} = \begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} = 8 - 14 = -6 \quad \text{(obtained by deleting R2 and C3 in Δ)}
\]
Definition 2 Cofactor of an element $a_{ij}$, denoted by $A_{ij}$, is defined by

$A_{ij} = (-1)^{i+j} M_{ij}$, where $M_{ij}$ is minor of $a_{ij}$.


Example 20 Find minors and cofactors of all the elements of the determinant
\[
\begin{vmatrix} 1 & -2 \\ 4 & 3 \end{vmatrix}
\]
Solution Minor of the element $a_{ij}$ is $M_{ij}$

Here $a_{11} = 1$. So $M_{11}$ = Minor of $a_{11}$ = 3

$M_{12}$ = Minor of the element $a_{12}$ = 4

$M_{21}$ = Minor of the element $a_{21}$ = −2

$M_{22}$ = Minor of the element $a_{22}$ = 1

Now, cofactor of $a_{ij}$ is $A_{ij}$. So

$A_{11} = (-1)^{1+1} M_{11} = (-1)^2 (3) = 3$

$A_{12} = (-1)^{1+2} M_{12} = (-1)^3 (4) = -4$

$A_{21} = (-1)^{2+1} M_{21} = (-1)^3 (-2) = 2$

$A_{22} = (-1)^{2+2} M_{22} = (-1)^4 (1) = 1$

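Example 21 Find minors and cofactors of the elements $a_{11}$, $a_{21}$ in the determinant
\[
\Delta = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
\]
Solution By definition of minors and cofactors, we have
\[
\text{Minor of } a_{11} = M_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = a_{22} a_{33} - a_{23} a_{32}
\]
\[
\text{Cofactor of } a_{11} = A_{11} = (-1)^{1+1} M_{11} = a_{22} a_{33} - a_{23} a_{32}
\]
\[
\text{Minor of } a_{21} = M_{21} = \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} = a_{12} a_{33} - a_{13} a_{32}
\]
\[
\text{Cofactor of } a_{21} = A_{21} = (-1)^{2+1} M_{21} = (-1)(a_{12} a_{33} - a_{13} a_{32}) = -a_{12} a_{33} + a_{13} a_{32}
\]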

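Remark Expanding the determinant Δ, in Example 21, along R1, we have
\[
\Delta = (-1)^{1+1} a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{1+2} a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{1+3} a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\]
\[
= a_{11} A_{11} + a_{12} A_{12} + a_{13} A_{13}, \text{ where } A_{ij} \text{ is cofactor of } a_{ij}
\]
= sum of product of elements of R1 with their corresponding cofactors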

Similarly, Δ can be calculated by other five ways of expansion that is along R2, R3, C1, C2 and C3.

Hence Δ = sum of the product of elements of any row (or column) with their corresponding cofactors.

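[Note] If elements of a row (or column) are multiplied with cofactors of any other row (or column),
then their sum is zero. For example,
\[
\Delta = a_{11} A_{21} + a_{12} A_{22} + a_{13} A_{23}
\]
\[
= a_{11} (-1)^{2+1} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{12} (-1)^{2+2} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} (-1)^{2+3} \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}
\]
\[
= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = 0 \quad \text{(since R1 and R2 are identical)}
\]
Similarly, we can try for other rows and columns.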
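The two facts above (a row paired with its own cofactors gives Δ, a row paired with the cofactors of another row gives 0) can be checked with a short Python sketch. The names `submatrix`, `det`, `minor` and `cofactor` are hypothetical helpers, indices are 0-based, and the matrix is a sample.

```python
def submatrix(m, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def minor(m, i, j):
    """M_ij: determinant left after deleting row i and column j."""
    return det(submatrix(m, i, j))


def cofactor(m, i, j):
    """A_ij = (-1)^(i+j) M_ij."""
    return (-1) ** (i + j) * minor(m, i, j)


A = [[2, -3, 5], [6, 0, 4], [1, 5, -7]]
own = sum(A[0][j] * cofactor(A, 0, j) for j in range(3))    # R1 with cofactors of R1
alien = sum(A[0][j] * cofactor(A, 2, j) for j in range(3))  # R1 with cofactors of R3
print(own, alien)  # -28 0
```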

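Example 22 Find minors and cofactors of the elements of the determinant
\[
\begin{vmatrix} 2 & -3 & 5 \\ 6 & 0 & 4 \\ 1 & 5 & -7 \end{vmatrix}
\]
and verify that $a_{11} A_{31} + a_{12} A_{32} + a_{13} A_{33} = 0$.

Solution We have
\[
M_{11} = \begin{vmatrix} 0 & 4 \\ 5 & -7 \end{vmatrix} = 0 - 20 = -20; \quad A_{11} = (-1)^{1+1} (-20) = -20
\]
\[
M_{12} = \begin{vmatrix} 6 & 4 \\ 1 & -7 \end{vmatrix} = -42 - 4 = -46; \quad A_{12} = (-1)^{1+2} (-46) = 46
\]
\[
M_{13} = \begin{vmatrix} 6 & 0 \\ 1 & 5 \end{vmatrix} = 30 - 0 = 30; \quad A_{13} = (-1)^{1+3} (30) = 30
\]
\[
M_{21} = \begin{vmatrix} -3 & 5 \\ 5 & -7 \end{vmatrix} = 21 - 25 = -4; \quad A_{21} = (-1)^{2+1} (-4) = 4
\]
\[
M_{22} = \begin{vmatrix} 2 & 5 \\ 1 & -7 \end{vmatrix} = -14 - 5 = -19; \quad A_{22} = (-1)^{2+2} (-19) = -19
\]
\[
M_{23} = \begin{vmatrix} 2 & -3 \\ 1 & 5 \end{vmatrix} = 10 + 3 = 13; \quad A_{23} = (-1)^{2+3} (13) = -13
\]
\[
M_{31} = \begin{vmatrix} -3 & 5 \\ 0 & 4 \end{vmatrix} = -12 - 0 = -12; \quad A_{31} = (-1)^{3+1} (-12) = -12
\]
\[
M_{32} = \begin{vmatrix} 2 & 5 \\ 6 & 4 \end{vmatrix} = 8 - 30 = -22; \quad A_{32} = (-1)^{3+2} (-22) = 22
\]
\[
\text{and } M_{33} = \begin{vmatrix} 2 & -3 \\ 6 & 0 \end{vmatrix} = 0 + 18 = 18; \quad A_{33} = (-1)^{3+3} (18) = 18
\]
Now $a_{11} = 2$, $a_{12} = -3$, $a_{13} = 5$; $A_{31} = -12$, $A_{32} = 22$, $A_{33} = 18$

So $a_{11} A_{31} + a_{12} A_{32} + a_{13} A_{33}$

= 2(−12) + (−3)(22) + 5(18) = −24 − 66 + 90 = 0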

Adjoint and Inverse of a Matrix

In the previous chapter, we have studied inverse of a matrix. In this section, we shall discuss the
condition for existence of inverse of a matrix.

To find inverse of a matrix A, i.e., A-1 we shall first define adjoint of a matrix.

Adjoint of a matrix

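Definition 3 The adjoint of a square matrix $A = [a_{ij}]_{n \times n}$ is defined as the transpose of the matrix $[A_{ij}]_{n \times n}$, where $A_{ij}$ is the cofactor of the element $a_{ij}$. Adjoint of the
matrix A is denoted by adj A.

Let
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\]
Then
\[
\text{adj } A = \text{Transpose of } \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33} \end{bmatrix}
\]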
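Example 23 Find adj A for
\[
A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}
\]
Solution We have $A_{11} = 4$, $A_{12} = -1$, $A_{21} = -3$, $A_{22} = 2$

Hence
\[
\text{adj } A = \begin{bmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{bmatrix} = \begin{bmatrix} 4 & -3 \\ -1 & 2 \end{bmatrix}
\]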
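Remark For a square matrix of order 2, given by
\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\]
The adj A can also be obtained by interchanging $a_{11}$ and $a_{22}$ and by changing signs of $a_{12}$ and $a_{21}$, i.e.,
\[
\text{adj } A = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}
\]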

We state the following theorem without proof.

Theorem 1 If A is any given square matrix of order n, then

A (adj A) = (adj A) A = |A| I,

where I is the identity matrix of order n.

Verification
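Let
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \text{ then } \text{adj } A = \begin{bmatrix} A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33} \end{bmatrix}
\]
Since sum of product of elements of a row (or a column) with corresponding cofactors is equal to |A|
and otherwise zero, we have
\[
A (\text{adj } A) = \begin{bmatrix} |A| & 0 & 0 \\ 0 & |A| & 0 \\ 0 & 0 & |A| \end{bmatrix} = |A| \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = |A| I
\]
Similarly, we can show (adj A) A = |A| I

Hence A (adj A) = (adj A) A = |A| I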
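Theorem 1 can be checked on concrete matrices with the following Python sketch. It is a minimal illustration: `det`, `cofactor`, `adjugate` and `matmul` are hypothetical helper names, and the matrices are samples with integer entries.

```python
def submatrix(m, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def cofactor(m, i, j):
    return (-1) ** (i + j) * det(submatrix(m, i, j))


def adjugate(m):
    """adj A: transpose of the matrix of cofactors."""
    n = len(m)
    return [[cofactor(m, j, i) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1, 3, 3], [1, 4, 3], [1, 3, 4]]
adjA = adjugate(A)
print(adjA)                                   # [[7, -3, -3], [-1, 1, 0], [-1, 0, 1]]
print(matmul(A, adjA), matmul(adjA, A))       # both equal |A| I with |A| = 1
print(adjugate([[2, 3], [1, 4]]))             # [[4, -3], [-1, 2]]
```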

Definition 4 A square matrix A is said to be singular if |A| = 0.


For example, the determinant of matrix
\[
A = \begin{bmatrix} 1 & 2 \\ 4 & 8 \end{bmatrix}
\]
is zero. Hence A is a singular matrix.

Definition 5 A square matrix A is said to be non-singular if |A| ≠ 0


Let
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}. \text{ Then } |A| = \begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 4 - 6 = -2 \neq 0.
\]
Hence A is a nonsingular matrix.

We state the following theorems without proof.

Theorem 2 If A and B are nonsingular matrices of the same order, then AB and BA are also nonsingular
matrices of the same order.

Theorem 3 The determinant of the product of matrices is equal to product of their respective
determinants, that is, |AB| = |A| |B|, where A and B are square matrices of the same order
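Remark We know that
\[
(\text{adj } A) A = |A| I = \begin{bmatrix} |A| & 0 & 0 \\ 0 & |A| & 0 \\ 0 & 0 & |A| \end{bmatrix}, \quad |A| \neq 0
\]
Writing determinants of matrices on both sides, we have
\[
|(\text{adj } A) A| = \begin{vmatrix} |A| & 0 & 0 \\ 0 & |A| & 0 \\ 0 & 0 & |A| \end{vmatrix}
\]
\[
\text{i.e. } |\text{adj } A| \, |A| = |A|^3 \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} \quad \text{(Why?)}
\]
i.e. $|\text{adj } A| \, |A| = |A|^3 (1)$

i.e. $|\text{adj } A| = |A|^2$

In general, if A is a square matrix of order n, then $|\text{adj } A| = |A|^{n-1}$.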

Theorem 4 A square matrix A is invertible if and only if A is nonsingular matrix.

Proof Let A be invertible matrix of order n and I be the identity matrix of order n. Then, there exists a
square matrix B of order n such that AB = BA = I

Now AB = I. So |AB| = |I| or |A| |B| = 1 (since |I| = 1, |AB| = |A||B|)

This gives |A| ≠ 0. Hence A is nonsingular.

Conversely, let A be nonsingular. Then |A| ≠ 0

Now A (adj A) = (adj A) A = |A| I (Theorem 1)


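\[
\text{or } A \left( \frac{1}{|A|} \text{adj } A \right) = \left( \frac{1}{|A|} \text{adj } A \right) A = I
\]
\[
\text{or } AB = BA = I, \text{ where } B = \frac{1}{|A|} \text{adj } A
\]
\[
\text{Thus A is invertible and } A^{-1} = \frac{1}{|A|} \text{adj } A
\]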
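The formula for the inverse in terms of adj A and |A| can be turned into a small routine. The Python sketch below is illustrative rather than a recommended numerical method: `det`, `adjugate` and `inverse` are hypothetical helpers, and exact `Fraction` arithmetic is assumed so that entries such as 4/11 are not rounded.

```python
from fractions import Fraction


def submatrix(m, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """adj A: transpose of the matrix of cofactors."""
    n = len(m)
    return [[(-1) ** (i + j) * det(submatrix(m, j, i)) for j in range(n)] for i in range(n)]


def inverse(m):
    """A^{-1} = (1/|A|) adj A, defined only when |A| is nonzero (Theorem 4)."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[Fraction(x, d) for x in row] for row in adjugate(m)]


print(inverse([[1, 2], [3, 4]]))   # entries -2, 1, 3/2, -1/2 as Fractions
try:
    inverse([[1, 2], [4, 8]])      # |A| = 0
except ValueError as err:
    print(err)
```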

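Example 24 If
\[
A = \begin{bmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{bmatrix},
\]
then verify that A adj A = |A| I. Also find $A^{-1}$.

Solution We have |A| = 1(16 − 9) − 3(4 − 3) + 3(3 − 4) = 1 ≠ 0

Now $A_{11} = 7$, $A_{12} = -1$, $A_{13} = -1$, $A_{21} = -3$, $A_{22} = 1$, $A_{23} = 0$, $A_{31} = -3$, $A_{32} = 0$, $A_{33} = 1$

Therefore
\[
\text{adj } A = \begin{bmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\]
Now
\[
A (\text{adj } A) = \begin{bmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{bmatrix} \begin{bmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\]
\[
= \begin{bmatrix} 7 - 3 - 3 & -3 + 3 + 0 & -3 + 0 + 3 \\ 7 - 4 - 3 & -3 + 4 + 0 & -3 + 0 + 3 \\ 7 - 3 - 4 & -3 + 3 + 0 & -3 + 0 + 4 \end{bmatrix}
\]
\[
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = (1) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = |A| I
\]
Also
\[
A^{-1} = \frac{1}{|A|} \text{adj } A = \frac{1}{1} \begin{bmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\]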
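Example 25 If
\[
A = \begin{bmatrix} 2 & 3 \\ 1 & -4 \end{bmatrix} \text{ and } B = \begin{bmatrix} 1 & -2 \\ -1 & 3 \end{bmatrix},
\]
then verify that $(AB)^{-1} = B^{-1} A^{-1}$.

Solution We have
\[
AB = \begin{bmatrix} 2 & 3 \\ 1 & -4 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} -1 & 5 \\ 5 & -14 \end{bmatrix}
\]
Since |AB| = −11 ≠ 0, $(AB)^{-1}$ exists and is given by
\[
(AB)^{-1} = \frac{1}{|AB|} \text{adj}(AB) = -\frac{1}{11} \begin{bmatrix} -14 & -5 \\ -5 & -1 \end{bmatrix} = \frac{1}{11} \begin{bmatrix} 14 & 5 \\ 5 & 1 \end{bmatrix}
\]
Further, |A| = −11 ≠ 0 and |B| = 1 ≠ 0. Therefore, $A^{-1}$ and $B^{-1}$ both exist and are given by
\[
A^{-1} = -\frac{1}{11} \begin{bmatrix} -4 & -3 \\ -1 & 2 \end{bmatrix}, \quad B^{-1} = \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}
\]
Therefore
\[
B^{-1} A^{-1} = -\frac{1}{11} \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -4 & -3 \\ -1 & 2 \end{bmatrix} = -\frac{1}{11} \begin{bmatrix} -14 & -5 \\ -5 & -1 \end{bmatrix} = \frac{1}{11} \begin{bmatrix} 14 & 5 \\ 5 & 1 \end{bmatrix}
\]
Hence $(AB)^{-1} = B^{-1} A^{-1}$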
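Example 26 Show that the matrix
\[
A = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}
\]
satisfies the equation $A^2 - 4A + I = O$, where I is 2 × 2 identity matrix and O is 2 × 2 zero matrix. Using this equation, find $A^{-1}$.

Solution We have
\[
A^2 = AA = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 7 & 12 \\ 4 & 7 \end{bmatrix}
\]
Hence
\[
A^2 - 4A + I = \begin{bmatrix} 7 & 12 \\ 4 & 7 \end{bmatrix} - \begin{bmatrix} 8 & 12 \\ 4 & 8 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = O
\]
Now $A^2 - 4A + I = O$

Therefore $AA - 4A = -I$

or $AA(A^{-1}) - 4AA^{-1} = -IA^{-1}$ (Post multiplying by $A^{-1}$ because |A| ≠ 0)

or $A(AA^{-1}) - 4I = -A^{-1}$

or $AI - 4I = -A^{-1}$

or
\[
A^{-1} = 4I - A = \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} - \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix}
\]
Hence
\[
A^{-1} = \begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix}
\]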
Applications of Determinants and Matrices

In this section, we shall discuss application of determinants and matrices for solving the system of
linear equations in two or three variables and for checking the consistency of the system of linear
equations.

Consistent system A system of equations is said to be consistent if its solution (one or more) exists.

Inconsistent system A system of equations is said to be inconsistent if its solution does not exist.

[Note] In this chapter, we restrict ourselves to the system of linear equations having unique solutions
only.

Solution of system of linear equations using inverse of matrix

Let us express the system of linear equations as matrix equations and solve them using inverse of the
coefficient matrix.

Consider the system of equations


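\[
\begin{aligned}
a_1 x + b_1 y + c_1 z &= d_1 \\
a_2 x + b_2 y + c_2 z &= d_2 \\
a_3 x + b_3 y + c_3 z &= d_3
\end{aligned}
\]
Let
\[
A = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}, \quad X = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\]
Then, the system of equations can be written as AX = B, i.e.,
\[
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\]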
Case I If A is a nonsingular matrix, then its inverse exists. Now

AX = B

or A-1 (AX) = A-1 B (premultiplying by A-1)

or (A-1 A) X = A-1 B (by associative property)

or IX = A-1 B

or X = A-1 B

This matrix equation provides unique solution for the given system of equations as inverse of a matrix
is unique. This method of solving system of equations is known as Matrix Method.
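Case I can be sketched in Python as follows. This is only an illustration of X = A⁻¹B for a nonsingular coefficient matrix: `det`, `adjugate` and `solve` are hypothetical names, exact `Fraction` arithmetic is assumed, and the cofactor-based approach is chosen for clarity, not efficiency.

```python
from fractions import Fraction


def submatrix(m, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(submatrix(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """adj A: transpose of the matrix of cofactors."""
    n = len(m)
    return [[(-1) ** (i + j) * det(submatrix(m, j, i)) for j in range(n)] for i in range(n)]


def solve(a, b):
    """Return X = A^{-1} B = (1/|A|) (adj A) B for a nonsingular matrix A."""
    d = det(a)
    if d == 0:
        raise ValueError("|A| = 0: the matrix method of Case I does not apply")
    return [Fraction(sum(x * y for x, y in zip(row, b)), d) for row in adjugate(a)]


# Sample system: x + y + z = 6, y + 3z = 11, x - 2y + z = 0
solution = solve([[1, 1, 1], [0, 1, 3], [1, -2, 1]], [6, 11, 0])
print([int(v) for v in solution])   # [1, 2, 3]
```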

Case II If A is a singular matrix, then |A| = 0.

In this case, we calculate (adj A) B.

If (adj A) B ≠ O, (O being zero matrix), then solution does not exist and the system of equations is called
inconsistent.

If (adj A) B = O, then the system may be either consistent or inconsistent according as the system has
either infinitely many solutions or no solution.

Example 27 Solve the system of equations

2x + 5y = 1

3x + 2y = 7

Solution The system of equations can be written in the form AX = B, where


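\[
A = \begin{bmatrix} 2 & 5 \\ 3 & 2 \end{bmatrix}, \quad X = \begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 \\ 7 \end{bmatrix}
\]
Now, |A| = −11 ≠ 0. Hence, A is a nonsingular matrix and so the system has a unique solution.

Note that
\[
A^{-1} = -\frac{1}{11} \begin{bmatrix} 2 & -5 \\ -3 & 2 \end{bmatrix}
\]
Therefore
\[
X = A^{-1} B = -\frac{1}{11} \begin{bmatrix} 2 & -5 \\ -3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 7 \end{bmatrix}
\]
\[
\text{i.e. } \begin{bmatrix} x \\ y \end{bmatrix} = -\frac{1}{11} \begin{bmatrix} -33 \\ 11 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \end{bmatrix}
\]
Hence x = 3, y = −1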

Example 28 Solve the following system of equations by matrix method,


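\[
\begin{aligned}
3x - 2y + 3z &= 8 \\
2x + y - z &= 1 \\
4x - 3y + 2z &= 4
\end{aligned}
\]
Solution The system of equations can be written in the form AX = B, where
\[
A = \begin{bmatrix} 3 & -2 & 3 \\ 2 & 1 & -1 \\ 4 & -3 & 2 \end{bmatrix}, \quad X = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 8 \\ 1 \\ 4 \end{bmatrix}
\]
We see that

|A| = 3(2 − 3) + 2(4 + 4) + 3(−6 − 4) = −17 ≠ 0

Hence, A is nonsingular and so its inverse exists. Now

$A_{11} = -1$, $A_{12} = -8$, $A_{13} = -10$

$A_{21} = -5$, $A_{22} = -6$, $A_{23} = 1$

$A_{31} = -1$, $A_{32} = 9$, $A_{33} = 7$

Therefore
\[
A^{-1} = -\frac{1}{17} \begin{bmatrix} -1 & -5 & -1 \\ -8 & -6 & 9 \\ -10 & 1 & 7 \end{bmatrix}
\]
So
\[
X = A^{-1} B = -\frac{1}{17} \begin{bmatrix} -1 & -5 & -1 \\ -8 & -6 & 9 \\ -10 & 1 & 7 \end{bmatrix} \begin{bmatrix} 8 \\ 1 \\ 4 \end{bmatrix}
\]
\[
\text{i.e. } \begin{bmatrix} x \\ y \\ z \end{bmatrix} = -\frac{1}{17} \begin{bmatrix} -17 \\ -34 \\ -51 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\]
Hence x = 1, y = 2 and z = 3.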

Example 29 The sum of three numbers is 6. If we multiply third number by 3 and add second number
to it, we get 11. By adding first and third numbers, we get double of the second number. Represent it
algebraically and find the numbers using matrix method.

Solution Let first, second and third numbers be denoted by x, y and z, respectively. Then, according to
given conditions, we have

x+y+z=6

y + 3z = 11

x + z = 2y or x - 2y + z = 0

This system can be written as A X = B, where


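\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 1 & -2 & 1 \end{bmatrix}, \quad X = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 6 \\ 11 \\ 0 \end{bmatrix}
\]
Here |A| = 1(1 + 6) − (0 − 3) + (0 − 1) = 9 ≠ 0. Now we find adj A

$A_{11} = 1(1 + 6) = 7$, $A_{12} = -(0 - 3) = 3$, $A_{13} = -1$

$A_{21} = -(1 + 2) = -3$, $A_{22} = 0$, $A_{23} = -(-2 - 1) = 3$

$A_{31} = (3 - 1) = 2$, $A_{32} = -(3 - 0) = -3$, $A_{33} = (1 - 0) = 1$

Hence
\[
\text{adj } A = \begin{bmatrix} 7 & -3 & 2 \\ 3 & 0 & -3 \\ -1 & 3 & 1 \end{bmatrix}
\]
Thus
\[
A^{-1} = \frac{1}{|A|} \text{adj } A = \frac{1}{9} \begin{bmatrix} 7 & -3 & 2 \\ 3 & 0 & -3 \\ -1 & 3 & 1 \end{bmatrix}
\]
Since $X = A^{-1} B$
\[
X = \frac{1}{9} \begin{bmatrix} 7 & -3 & 2 \\ 3 & 0 & -3 \\ -1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 6 \\ 11 \\ 0 \end{bmatrix}
\]
\[
\text{or } \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \frac{1}{9} \begin{bmatrix} 42 - 33 + 0 \\ 18 + 0 + 0 \\ -6 + 33 + 0 \end{bmatrix} = \frac{1}{9} \begin{bmatrix} 9 \\ 18 \\ 27 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\]
Thus x = 1, y = 2, z = 3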

5 Continuity and Differentiability
Introduction

This chapter is essentially a continuation of our study of differentiation of functions in Class XI. We
had learnt to differentiate certain functions like polynomial functions and trigonometric functions. In
this chapter, we introduce the very important concepts of continuity, differentiability and relations
between them. We will also learn differentiation of inverse trigonometric functions. Further, we
introduce a new class of functions called exponential and logarithmic functions. These functions lead
to powerful techniques of differentiation. We illustrate certain geometrically obvious conditions
through differential calculus. In the process, we will learn some fundamental theorems in this area.

Continuity

We start the section with two informal examples to get a feel of continuity. Consider the function
\[
f(x) = \begin{cases} 1, & \text{if } x \le 0 \\ 2, & \text{if } x > 0 \end{cases}
\]
This function is of course defined at every point of the real line. Graph of this function is given in the
Figure. One can deduce from the graph that the values of the function at nearby points on the x-axis remain
close to each other except at x = 0. At the points near and to the left of 0, i.e., at points like −0.1,
−0.01, −0.001, the value of the function is 1. At the points near and to the right of 0, i.e., at points like
0.1, 0.01, 0.001, the value of the function is 2. Using the language of left and right hand limits, we may
say that the left (respectively right) hand limit of f at 0 is 1 (respectively 2). In particular the left and
right hand limits do not coincide. We also observe that the value of the function at x = 0 coincides with
the left hand limit. Note that when we try to draw the graph, we cannot draw it in one stroke, i.e.,
without lifting pen from the plane of the paper, we cannot draw the graph of this function. In fact, we
need to lift the pen when we come to 0 from left. This is one instance of a function being not continuous
at x = 0.

Now, consider the function defined as


\[
f(x) = \begin{cases} 1, & \text{if } x \neq 0 \\ 2, & \text{if } x = 0 \end{cases}
\]
This function is also defined at every point. Left and the right hand limits at x = 0 are both equal to 1.
But the value of the function at x = 0 equals 2 which does not coincide with the common value of the
left and right hand limits. Again, we note that we cannot draw the graph of the function without lifting
the pen. This is yet another instance of a function being not continuous at x = 0.

Naively, we may say that a function is continuous at a fixed point if we can draw the graph of the
function around that point without lifting the pen from the plane of the paper.

Mathematically, it may be phrased precisely as follows:

Definition 1 Suppose f is a real function on a subset of the real numbers and let c be a point in the
domain of f. Then f is continuous at c if
\[
\lim_{x \to c} f(x) = f(c)
\]
More elaborately, if the left hand limit, right hand limit and the value of the function at x = c exist and
are equal to each other, then f is said to be continuous at x = c. Recall that if the right hand and left hand
limits at x = c coincide, then we say that the common value is the limit of the function at x = c. Hence
we may also rephrase the definition of continuity as follows: a function is continuous at x = c if the
function is defined at x = c and if the value of the function at x = c equals the limit of the function at
x = c. If f is not continuous at c, we say f is discontinuous at c and c is called a point of discontinuity
of f.
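The definition compares three numbers: the left hand limit, the right hand limit and f(c). The Python sketch below imitates that comparison numerically by sampling points very close to c. It is a heuristic illustration only and does not prove continuity; `looks_continuous`, the step size `h` and the tolerance `tol` are all assumptions made for this sketch.

```python
def looks_continuous(f, c, h=1e-9, tol=1e-6):
    """Heuristic check: do f(c - h) and f(c + h) both stay close to f(c)?"""
    return abs(f(c - h) - f(c)) < tol and abs(f(c + h) - f(c)) < tol


def step(x):
    """First informal example: 1 for x <= 0 and 2 for x > 0."""
    return 1 if x <= 0 else 2


def punctured(x):
    """Second informal example: 1 for x != 0 and 2 at x = 0."""
    return 1 if x != 0 else 2


def linear(x):
    """A linear function, continuous everywhere."""
    return 2 * x + 3


print(looks_continuous(step, 0))       # False: left and right values differ
print(looks_continuous(punctured, 0))  # False: f(0) differs from the nearby values
print(looks_continuous(linear, 1))     # True
print(looks_continuous(step, 0.5))     # True: the only break is at 0
```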

Example 1 Check the continuity of the function f given by f (x) = 2x + 3 at x = 1.

Solution First note that the function is defined at the given point x = 1 and its value is 5. Then find the
limit of the function at x = 1. Clearly
\[
\lim_{x \to 1} f(x) = \lim_{x \to 1} (2x + 3) = 2(1) + 3 = 5
\]
Thus
\[
\lim_{x \to 1} f(x) = 5 = f(1)
\]
Hence, f is continuous at x = 1.

Example 2 Examine whether the function f given by $f(x) = x^2$ is continuous at x = 0.

Solution First note that the function is defined at the given point x = 0 and its value is 0. Then find the
limit of the function at x = 0. Clearly
\[
\lim_{x \to 0} f(x) = \lim_{x \to 0} x^2 = 0^2 = 0
\]
Thus
\[
\lim_{x \to 0} f(x) = 0 = f(0)
\]
Hence, f is continuous at x = 0.

Example 3 Discuss the continuity of the function f given by f(x) = | x | at x = 0.

Solution By definition
\[
f(x) = \begin{cases} -x, & \text{if } x < 0 \\ x, & \text{if } x \ge 0 \end{cases}
\]
Clearly the function is defined at 0 and f(0) = 0. Left hand limit of f at 0 is
\[
\lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} (-x) = 0
\]
Similarly, the right hand limit of f at 0 is
\[
\lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} x = 0
\]
Thus, the left hand limit, right hand limit and the value of the function coincide at x = 0. Hence, f is
continuous at x = 0.

Example 4 Show that the function f given by
\[
f(x) = \begin{cases} x^3 + 3, & \text{if } x \neq 0 \\ 1, & \text{if } x = 0 \end{cases}
\]
is not continuous at x = 0.

Solution The function is defined at x = 0 and its value at x = 0 is 1. When x ≠ 0, the function is given by
a polynomial. Hence,
\[
\lim_{x \to 0} f(x) = \lim_{x \to 0} (x^3 + 3) = 0^3 + 3 = 3
\]
Since the limit of f at x = 0 does not coincide with f(0), the function is not continuous at x = 0.
It may be noted that x = 0 is the only point of discontinuity for this function.

Example 5 Check the points where the constant function f(x) = k is continuous.

Solution The function is defined at all real numbers and by definition, its value at any real number
equals k. Let c be any real number. Then
\[
\lim_{x \to c} f(x) = \lim_{x \to c} k = k
\]
Since $f(c) = k = \lim_{x \to c} f(x)$ for any real number c, the function f is continuous at every real number.

Example 6 Prove that the identity function on real numbers given by f(x) = x is continuous at every real
number.

Solution The function is clearly defined at every point and f(c) = c for every real number c. Also,
\[
\lim_{x \to c} f(x) = \lim_{x \to c} x = c
\]
Thus, $\lim_{x \to c} f(x) = c = f(c)$ and hence the function is continuous at every real number.

Having defined continuity of a function at a given point, now we make a natural extension of this definition to discuss
continuity of a function.

Definition 2 A real function f is said to be continuous if it is continuous at every point in the domain of
f.

This definition requires a bit of elaboration. Suppose f is a function defined on a closed interval [a, b], then for f to be continuous, it needs to be continuous at every point in [a, b] including the end points a and b. Continuity of f at a
means
\[
\lim_{x \to a^+} f(x) = f(a)
\]
and continuity of f at b means
\[
\lim_{x \to b^-} f(x) = f(b)
\]
Observe that $\lim_{x \to a^-} f(x)$ and $\lim_{x \to b^+} f(x)$ do not make sense. As a consequence of this definition, if f is defined only at one point, it is continuous there, i.e., if the domain of f is a singleton, f
is a continuous function.

Example 7 Is the function defined by f(x) = | x |, a continuous function?

Solution We may rewrite f as
\[
f(x) = \begin{cases} -x, & \text{if } x < 0 \\ x, & \text{if } x \ge 0 \end{cases}
\]
By Example 3, we know that f is continuous at x = 0.

Let c be a real number such that c < 0. Then f(c) = −c. Also
\[
\lim_{x \to c} f(x) = \lim_{x \to c} (-x) = -c \quad \text{(Why?)}
\]
Since $\lim_{x \to c} f(x) = f(c)$, f is continuous at all negative real numbers.

Now, let c be a real number such that c > 0. Then f(c) = c. Also
\[
\lim_{x \to c} f(x) = \lim_{x \to c} x = c \quad \text{(Why?)}
\]
Since $\lim_{x \to c} f(x) = f(c)$, f is continuous at all positive real numbers. Hence, f is continuous at all points.

Example 8 Discuss the continuity of the function f given by $f(x) = x^3 + x^2 - 1$.

Solution Clearly f is defined at every real number c and its value at c is $c^3 + c^2 - 1$. We also know that
\[
\lim_{x \to c} f(x) = \lim_{x \to c} (x^3 + x^2 - 1) = c^3 + c^2 - 1
\]
Thus $\lim_{x \to c} f(x) = f(c)$, and hence f is continuous at every real number. This means f is a continuous
function.
Example 9 Discuss the continuity of the function f defined by $f(x) = \frac{1}{x}$, x ≠ 0.

Solution Fix any non-zero real number c. We have

$$\lim_{x \to c} f(x) = \lim_{x \to c} \frac{1}{x} = \frac{1}{c}$$

Also, since for c ≠ 0, $f(c) = \frac{1}{c}$, we have $\lim_{x \to c} f(x) = f(c)$ and hence, f is continuous at every point in the domain of f. Thus f is a continuous function.

We take this opportunity to explain the concept of infinity. This we do by analyzing the function $f(x) = \frac{1}{x}$ near x = 0. To carry out this analysis we follow the usual trick of finding the value of the function at real numbers close to 0. Essentially we are trying to find the right hand limit of f at 0. We tabulate this in the following Table.

Table

x       1    0.3        0.2    0.1 = 10^{-1}    0.01 = 10^{-2}    0.001 = 10^{-3}    10^{-n}

f(x)    1    3.333...   5      10               100 = 10^2        1000 = 10^3        10^n

We observe that as x gets closer to 0 from the right, the value of f(x) shoots up higher. This may be rephrased as: the value of f(x) may be made larger than any given number by choosing a positive real number very close to 0. In symbols, we write

$$\lim_{x \to 0^+} f(x) = +\infty$$

(to be read as: the right hand limit of f(x) at 0 is plus infinity). We wish to emphasise that + ∞ is NOT
a real number and hence the right hand limit of f at 0 does not exist (as a real number).

Similarly, the left hand limit of f at 0 may be found. The following table is self explanatory.

Table

x       -1    -0.3        -0.2    -10^{-1}    -10^{-2}    -10^{-3}    -10^{-n}

f(x)    -1    -3.333...   -5      -10         -10^2       -10^3       -10^n

From the Table, we deduce that the value of f(x) may be made smaller than any given number by choosing a negative real number very close to 0. In symbols, we write

$$\lim_{x \to 0^-} f(x) = -\infty$$

(to be read as: the left hand limit of f(x) at 0 is minus infinity). Again, we wish to emphasise that - ∞
is NOT a real number and hence the left hand limit of f at 0 does not exist (as a real number). The
graph of the reciprocal function given in Figure is a geometric representation of the above mentioned
facts.
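The two tables above can be reproduced numerically. The following is a minimal sketch, not part of the original text; the helper name `reciprocal` and the choice of sample points x = ±10^{-n} for n = 1, ..., 5 are assumptions made for illustration.

```python
def reciprocal(x):
    """f(x) = 1/x, defined for every real x except 0."""
    return 1.0 / x

# Sample points 10^-n approach 0 from the right, -10^-n from the left.
right_values = [reciprocal(10.0 ** -n) for n in range(1, 6)]
left_values = [reciprocal(-(10.0 ** -n)) for n in range(1, 6)]

for n, (r, l) in enumerate(zip(right_values, left_values), start=1):
    print(f"n={n}: f(10^-{n}) = {r:.1f}   f(-10^-{n}) = {l:.1f}")
```

The printed values grow without bound on the right and decrease without bound on the left, which is the behaviour the symbols +∞ and -∞ record.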

Example 10 Discuss the continuity of the function f defined by

$$f(x) = \begin{cases} x + 2, & \text{if } x \le 1 \\ x - 2, & \text{if } x > 1 \end{cases}$$

Solution The function f is defined at all points of the real line.

Case 1 If c < 1, then f(c) = c + 2. Therefore,

$$\lim_{x \to c} f(x) = \lim_{x \to c} (x + 2) = c + 2 = f(c)$$

Thus, f is continuous at all real numbers less than 1.

Case 2 If c > 1, then f(c) = c - 2. Therefore,

$$\lim_{x \to c} f(x) = \lim_{x \to c} (x - 2) = c - 2 = f(c)$$

Thus, f is continuous at all points x > 1.

Case 3 If c = 1, then the left hand limit of f at x = 1 is

$$\lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} (x + 2) = 1 + 2 = 3$$

The right hand limit of f at x = 1 is

$$\lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (x - 2) = 1 - 2 = -1$$

Since the left and right hand limits of f at x = 1 do not coincide, f is not continuous at x = 1. Hence x = 1 is the only point of discontinuity of f. The graph of the function is given in Figure.
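The one-sided limits in Example 10 can be estimated by evaluating f just to the left and just to the right of 1. This is a rough numerical sketch under the assumption that a few small step sizes are enough to see the trend; the helper `one_sided_estimates` and its step sizes are illustrative choices, not something the text prescribes.

```python
def f(x):
    """Piecewise function of Example 10: x + 2 for x <= 1, x - 2 for x > 1."""
    return x + 2 if x <= 1 else x - 2

def one_sided_estimates(func, c, steps=(1e-3, 1e-6, 1e-9)):
    """Return the values of func at c - h and at c + h for each step h."""
    left = [func(c - h) for h in steps]
    right = [func(c + h) for h in steps]
    return left, right

left, right = one_sided_estimates(f, 1.0)
print("approaching 1 from the left: ", left)
print("approaching 1 from the right:", right)
```

The left values settle near 3 and the right values near -1, so no single number can serve as the limit at x = 1.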

Example 11 Find all the points of discontinuity of the function f defined by

$$f(x) = \begin{cases} x + 2, & \text{if } x < 1 \\ 0, & \text{if } x = 1 \\ x - 2, & \text{if } x > 1 \end{cases}$$

Solution As in the previous example we find that f is continuous at all real numbers x ≠ 1. The left hand limit of f at x = 1 is

$$\lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} (x + 2) = 1 + 2 = 3$$

The right hand limit of f at x = 1 is

$$\lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (x - 2) = 1 - 2 = -1$$

Since the left and right hand limits of f at x = 1 do not coincide, f is not continuous at x = 1. Hence x = 1 is the only point of discontinuity of f. The graph of the function is given in the Figure.

Example 12 Discuss the continuity of the function defined by

$$f(x) = \begin{cases} x + 2, & \text{if } x < 0 \\ -x + 2, & \text{if } x > 0 \end{cases}$$

Solution Observe that the function is defined at all real numbers except at 0. Domain of definition of this function is

D1 ∪ D2 where D1 = {x ∈ R : x < 0} and D2 = {x ∈ R : x > 0}

Case 1 If c ∈ D1, then $\lim_{x \to c} f(x) = \lim_{x \to c} (x + 2) = c + 2 = f(c)$ and hence f is continuous in D1.

Case 2 If c ∈ D2, then $\lim_{x \to c} f(x) = \lim_{x \to c} (-x + 2) = -c + 2 = f(c)$ and hence f is continuous in D2.

Since f is continuous at all points in the domain of f, we deduce that f is continuous. Graph of this
function is given in the Figure. Note that to graph this function we need to lift the pen from the plane
of the paper, but we need to do that only for those points where the function is not defined.

Example 13 Discuss the continuity of the function f given by

$$f(x) = \begin{cases} x, & \text{if } x \ge 0 \\ x^2, & \text{if } x < 0 \end{cases}$$

Solution Clearly the function is defined at every real number. Graph of the function is given in Figure. By inspection, it seems prudent to partition the domain of definition of f into three disjoint subsets of the real line.

Let D1 = {x ∈ R : x < 0}, D2 = {0} and D3 = {x ∈ R : x > 0}

Case 1 At any point in D1, we have $f(x) = x^2$ and it is easy to see that it is continuous there (see Example 2).

Case 2 At any point in D3, we have f(x) = x and it is easy to see that it is continuous there (see Example 6).

Case 3 Now we analyse the function at x = 0. The value of the function at 0 is f(0) = 0. The left hand limit of f at 0 is

$$\lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} x^2 = 0^2 = 0$$

The right hand limit of f at 0 is

$$\lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} x = 0$$

Thus $\lim_{x \to 0} f(x) = 0 = f(0)$ and hence f is continuous at 0. This means that f is continuous at every point in its domain and hence, f is a continuous function.

Example 14 Show that every polynomial function is continuous.

Solution Recall that a function p is a polynomial function if it is defined by $p(x) = a_0 + a_1 x + \dots + a_n x^n$ for some natural number n, $a_n \ne 0$ and $a_i \in R$. Clearly this function is defined for every real number. For a fixed real number c, we have

$$\lim_{x \to c} p(x) = p(c)$$

By definition, p is continuous at c. Since c is any real number, p is continuous at every real number and
hence p is a continuous function.

Example 15 Find all the points of discontinuity of the greatest integer function defined by f(x) = [x],
where [x] denotes the greatest integer less than or equal to x.

Solution First observe that f is defined for all real numbers. Graph of the function is given in Figure. From the graph it appears that f is discontinuous at every integral point. Below we explore whether this is true.

Case 1 Let c be a real number which is not equal to any integer. It is evident from the graph that for all real numbers close to c the value of the function is equal to [c]; i.e.,

$$\lim_{x \to c} f(x) = \lim_{x \to c} [x] = [c]$$

Also f(c) = [c] and hence the function is continuous at all real numbers not equal to integers.

Case 2 Let c be an integer. Then we can find a sufficiently small real number r > 0 such that [c - r] = c - 1 whereas [c + r] = c.

This, in terms of limits, means that

$$\lim_{x \to c^-} f(x) = c - 1, \quad \lim_{x \to c^+} f(x) = c$$

Since these limits cannot be equal to each other for any c, the function is discontinuous at every integral point.
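The two cases of Example 15 can be checked with Python's `math.floor`, which returns the greatest integer less than or equal to its argument. The sketch below is illustrative only; the helper names and the offset r = 10^{-6} are assumptions.

```python
import math

def greatest_integer(x):
    """[x]: the greatest integer less than or equal to x."""
    return math.floor(x)

def one_sided_values(c, r=1e-6):
    """Values of [x] slightly to the left and slightly to the right of c."""
    return greatest_integer(c - r), greatest_integer(c + r)

for c in (-2, 0, 3, 2.5):
    left, right = one_sided_values(c)
    print(f"c = {c}: [c - r] = {left}, [c + r] = {right}, [c] = {greatest_integer(c)}")
```

At each integer c the two sides give c - 1 and c, while at the non-integer 2.5 both sides agree with [2.5] = 2.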

Algebra of continuous functions

In the previous class, after having understood the concept of limits, we learnt some algebra of limits.
Analogously, now we will study some algebra of continuous functions. Since continuity of a function
at a point is entirely dictated by the limit of the function at that point, it is reasonable to expect results
analogous to the case of limits.

Theorem 1 Suppose f and g are two real functions continuous at a real number c.

Then

(1) f + g is continuous at x = c.

(2) f - g is continuous at x = c.

(3) f.g is continuous at x = c.

(4) $\left(\frac{f}{g}\right)$ is continuous at x = c (provided g(c) ≠ 0).

Proof We are investigating continuity of (f + g) at x = c. Clearly it is defined at x = c. We have

$$\lim_{x \to c} (f + g)(x) = \lim_{x \to c} [f(x) + g(x)] \quad \text{(by definition of } f + g\text{)}$$

$$= \lim_{x \to c} f(x) + \lim_{x \to c} g(x) \quad \text{(by the theorem on limits)}$$

$$= f(c) + g(c) \quad \text{(as } f \text{ and } g \text{ are continuous)}$$

$$= (f + g)(c) \quad \text{(by definition of } f + g\text{)}$$

Hence, f + g is continuous at x = c.

Proofs for the remaining parts are similar and left as an exercise to the reader.

Remarks

(i) As a special case of (3) above, if f is a constant function, i.e., f(x) = λ for some real number λ, then the function (λ.g) defined by (λ.g)(x) = λ.g(x) is also continuous. In particular if λ = -1, the continuity of f implies continuity of -f.

(ii) As a special case of (4) above, if f is the constant function f(x) = λ, then the function $\frac{\lambda}{g}$ defined by $\frac{\lambda}{g}(x) = \frac{\lambda}{g(x)}$ is also continuous wherever g(x) ≠ 0. In particular, the continuity of g implies continuity of $\frac{1}{g}$.

The above theorem can be exploited to generate many continuous functions. They also aid in deciding
if certain functions are continuous or not. The following examples illustrate this:

Example 16 Prove that every rational function is continuous.

Solution Recall that every rational function f is given by

$$f(x) = \frac{p(x)}{q(x)}, \quad q(x) \ne 0$$

where p and q are polynomial functions. The domain of f is all real numbers except points at which q is zero. Since polynomial functions are continuous (Example 14), f is continuous by (4) of Theorem 1.

Example 17 Discuss the continuity of sine function.

Solution To see this we use the following facts:

$$\lim_{x \to 0} \sin x = 0 \quad \text{and} \quad \lim_{x \to 0} \cos x = 1$$

We have not proved these, but they are intuitively clear from the graphs of sin x and cos x near 0.

Now, observe that f(x) = sin x is defined for every real number. Let c be a real number. Put x = c + h. If x → c we know that h → 0. Therefore

$$\lim_{x \to c} f(x) = \lim_{x \to c} \sin x$$

$$= \lim_{h \to 0} \sin(c + h)$$

$$= \lim_{h \to 0} [\sin c \cos h + \cos c \sin h]$$

$$= \lim_{h \to 0} [\sin c \cos h] + \lim_{h \to 0} [\cos c \sin h]$$

$$= \sin c + 0 = \sin c = f(c)$$

Thus $\lim_{x \to c} f(x) = f(c)$ and hence f is a continuous function.

Remark A similar proof may be given for the continuity of cosine function.

Example 18 Prove that the function defined by f(x) = tan x is a continuous function.
Solution The function $f(x) = \tan x = \frac{\sin x}{\cos x}$. This is defined for all real numbers such that cos x ≠ 0, i.e., $x \ne (2n + 1)\frac{\pi}{2}$. We have just proved that both sine and cosine functions are continuous. Thus tan x, being a quotient of two continuous functions, is continuous wherever it is defined.

An interesting fact is the behaviour of continuous functions with respect to composition of functions.
Recall that if f and g are two real functions, then

(f o g)(x) = f(g(x))

is defined whenever the range of g is a subset of domain of f. The following theorem (stated without
proof) captures the continuity of composite functions.

Theorem 2 Suppose f and g are real valued functions such that (f o g) is defined at c. If g is continuous
at c and if f is continuous at g (c), then (f o g) is continuous at c.

The following examples illustrate this theorem.

Example 19 Show that the function defined by $f(x) = \sin(x^2)$ is a continuous function.

Solution Observe that the function is defined for every real number. The function f may be thought of as a composition g o h of the two functions g and h, where g(x) = sin x and $h(x) = x^2$. Since both g and h are continuous functions, by Theorem 2, it can be deduced that f is a continuous function.

Example 20 Show that the function f defined by

f(x) = |1 - x + |x||,

where x is any real number, is a continuous function.

Solution Define g by g(x) = 1 - x + |x| and h by h(x) = |x| for all real x. Then

(h o g) (x) = h (g (x))

= h(1 - x + |x|)

= |1 - x + |x|| = f(x)

In Example 7, we have seen that h is a continuous function. Hence g being a sum of a polynomial
function and the modulus function is continuous. But then f being a composite of two continuous
functions is continuous.

Differentiability

Recall the following facts from previous class. We had defined the derivative of a real function as follows:

Suppose f is a real function and c is a point in its domain. The derivative of f at c is defined by

$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h}$$

provided this limit exists. Derivative of f at c is denoted by $f'(c)$ or $\left.\frac{d}{dx}(f(x))\right|_c$. The function defined by

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

wherever the limit exists is defined to be the derivative of f. The derivative of f is denoted by $f'(x)$ or $\frac{d}{dx}(f(x))$ or, if y = f(x), by $\frac{dy}{dx}$ or $y'$. The process of finding derivative of a function is called differentiation. We also use the phrase differentiate f(x) with respect to x to mean find $f'(x)$.

The following rules were established as a part of algebra of derivatives:

(1) $(u \pm v)' = u' \pm v'$

(2) $(uv)' = u'v + uv'$ (Leibnitz or product rule)

(3) $\left(\frac{u}{v}\right)' = \frac{u'v - uv'}{v^2}$, wherever v ≠ 0 (Quotient rule).

The following table gives a list of derivatives of certain standard functions:

Table

f(x)     x^n          sin x    cos x     tan x

f'(x)    nx^{n-1}     cos x    -sin x    sec^2 x

Whenever we defined the derivative, we added the caution "provided the limit exists". Now the natural question is: what if it doesn't? The question is quite pertinent and so is its answer. If $\lim_{h \to 0} \frac{f(c+h) - f(c)}{h}$ does not exist, we say that f is not differentiable at c.

In other words, we say that a function f is differentiable at a point c in its domain if both $\lim_{h \to 0^-} \frac{f(c+h) - f(c)}{h}$ and $\lim_{h \to 0^+} \frac{f(c+h) - f(c)}{h}$ are finite and equal. A function is said to be differentiable in an interval [a, b] if it is differentiable at every point of [a, b]. As in case of continuity, at the end points a and b, we take the right hand limit and left hand limit, which are nothing but the right hand derivative and left hand derivative of the function at a and b respectively. Similarly, a function is said to be differentiable in an interval (a, b) if it is differentiable at every point of (a, b).

Theorem 3 If a function f is differentiable at a point c, then it is also continuous at that point.

Proof Since f is differentiable at c, we have

$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = f'(c)$$

But for x ≠ c, we have

$$f(x) - f(c) = \frac{f(x) - f(c)}{x - c} \cdot (x - c)$$

Therefore

$$\lim_{x \to c} [f(x) - f(c)] = \lim_{x \to c} \left[\frac{f(x) - f(c)}{x - c} \cdot (x - c)\right]$$

or

$$\lim_{x \to c} f(x) - \lim_{x \to c} f(c) = \lim_{x \to c} \left[\frac{f(x) - f(c)}{x - c}\right] \cdot \lim_{x \to c} (x - c) = f'(c) \cdot 0 = 0$$

or $\lim_{x \to c} f(x) = f(c)$

Hence f is continuous at x = c.

Corollary 1 Every differentiable function is continuous.

We remark that the converse of the above statement is not true. Indeed we have seen that the function defined by f(x) = |x| is a continuous function. Consider the left hand limit

$$\lim_{h \to 0^-} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^-} \frac{-h}{h} = -1$$

The right hand limit

$$\lim_{h \to 0^+} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0^+} \frac{h}{h} = 1$$

Since the above left and right hand limits at 0 are not equal, $\lim_{h \to 0} \frac{f(0+h) - f(0)}{h}$ does not exist and hence f is not differentiable at 0. Thus f is not a differentiable function.
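The two one-sided difference quotients of |x| at 0 can be evaluated directly. The following is a small sketch, assuming sample steps h = ±10^{-n}; the function name `difference_quotient` is a hypothetical helper introduced for illustration.

```python
def difference_quotient(f, c, h):
    """(f(c + h) - f(c)) / h for a nonzero step h."""
    return (f(c + h) - f(c)) / h

# Negative steps approach 0 from the left, positive steps from the right.
left_quotients = [difference_quotient(abs, 0.0, -(10.0 ** -n)) for n in range(1, 6)]
right_quotients = [difference_quotient(abs, 0.0, 10.0 ** -n) for n in range(1, 6)]

print("left hand quotients: ", left_quotients)
print("right hand quotients:", right_quotients)
```

Every left hand quotient equals -1 and every right hand quotient equals 1, so the two-sided limit of the quotient cannot exist at 0.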

Derivatives of composite functions

To study derivative of composite functions, we start with an illustrative example. Say, we want to find the derivative of f, where

$$f(x) = (2x + 1)^3$$

One way is to expand $(2x + 1)^3$ using binomial theorem and find the derivative as a polynomial function as illustrated below.

$$\frac{d}{dx} f(x) = \frac{d}{dx} [(2x + 1)^3]$$

$$= \frac{d}{dx} (8x^3 + 12x^2 + 6x + 1)$$

$$= 24x^2 + 24x + 6$$

$$= 6(2x + 1)^2$$

Now, observe that f(x) = (h o g)(x), where g(x) = 2x + 1 and $h(x) = x^3$. Put t = g(x) = 2x + 1. Then $f(x) = h(t) = t^3$. Thus

$$\frac{df}{dx} = 6(2x + 1)^2 = 3(2x + 1)^2 \cdot 2 = 3t^2 \cdot 2 = \frac{dh}{dt} \cdot \frac{dt}{dx}$$

The advantage with such observation is that it simplifies the calculation in finding the derivative of, say, $(2x + 1)^{100}$. We may formalise this observation in the following theorem called the chain rule.

Theorem 4 (Chain Rule) Let f be a real valued function which is a composite of two functions u and v; i.e., f = v o u. Suppose t = u(x) and if both $\frac{dt}{dx}$ and $\frac{dv}{dt}$ exist, we have

$$\frac{df}{dx} = \frac{dv}{dt} \cdot \frac{dt}{dx}$$

We skip the proof of this theorem. Chain rule may be extended as follows. Suppose f is a real valued function which is a composite of three functions u, v and w; i.e., f = (w o u) o v. If t = v(x) and s = u(t), then

$$\frac{df}{dx} = \frac{d(w \circ u)}{dt} \cdot \frac{dt}{dx} = \frac{dw}{ds} \cdot \frac{ds}{dt} \cdot \frac{dt}{dx}$$

provided all the derivatives in the statement exist. Reader is invited to formulate chain rule for composite of more functions.

Example 21 Find the derivative of the function given by $f(x) = \sin(x^2)$.

Solution Observe that the given function is a composite of two functions. Indeed, if $t = u(x) = x^2$ and v(t) = sin t, then

$$f(x) = (v \circ u)(x) = v(u(x)) = v(x^2) = \sin x^2$$

Put $t = u(x) = x^2$. Observe that $\frac{dv}{dt} = \cos t$ and $\frac{dt}{dx} = 2x$ exist. Hence, by chain rule

$$\frac{df}{dx} = \frac{dv}{dt} \cdot \frac{dt}{dx} = \cos t \cdot 2x$$

It is normal practice to express the final result only in terms of x. Thus

$$\frac{df}{dx} = \cos t \cdot 2x = 2x \cos x^2$$

Alternatively, we can also directly proceed as follows:

$$y = \sin(x^2) \Rightarrow \frac{dy}{dx} = \frac{d}{dx}(\sin x^2) = \cos x^2 \cdot \frac{d}{dx}(x^2) = 2x \cos x^2$$
Example 22 Find the derivative of tan(2x + 3).

Solution Let f(x) = tan(2x + 3), u(x) = 2x + 3 and v(t) = tan t. Then

$$(v \circ u)(x) = v(u(x)) = v(2x + 3) = \tan(2x + 3) = f(x)$$

Thus f is a composite of two functions. Put t = u(x) = 2x + 3. Then $\frac{dv}{dt} = \sec^2 t$ and $\frac{dt}{dx} = 2$ exist. Hence, by chain rule

$$\frac{df}{dx} = \frac{dv}{dt} \cdot \frac{dt}{dx} = 2\sec^2(2x + 3)$$
Example 23 Differentiate $\sin(\cos(x^2))$ with respect to x.

Solution The function $f(x) = \sin(\cos(x^2))$ is a composition f(x) = (w o v o u)(x) of the three functions u, v and w, where $u(x) = x^2$, v(t) = cos t and w(s) = sin s. Put $t = u(x) = x^2$ and s = v(t) = cos t. Observe that $\frac{dw}{ds} = \cos s$, $\frac{ds}{dt} = -\sin t$ and $\frac{dt}{dx} = 2x$ exist for all real x. Hence by a generalisation of chain rule, we have

$$\frac{df}{dx} = \frac{dw}{ds} \cdot \frac{ds}{dt} \cdot \frac{dt}{dx} = (\cos s)(-\sin t)(2x) = -2x \sin x^2 \cos(\cos x^2)$$

Alternatively, we can proceed as follows:

$$y = \sin(\cos x^2)$$

Therefore

$$\frac{dy}{dx} = \frac{d}{dx} \sin(\cos x^2) = \cos(\cos x^2) \cdot \frac{d}{dx}(\cos x^2)$$

$$= \cos(\cos x^2)(-\sin x^2) \cdot \frac{d}{dx}(x^2)$$

$$= -\sin x^2 \cos(\cos x^2)(2x)$$

$$= -2x \sin x^2 \cos(\cos x^2)$$
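A chain rule result such as the one in Example 23 can be sanity-checked against a numerical derivative. The sketch below assumes a central difference with step h = 10^{-6} is accurate enough for this purpose; the helper `central_difference` and the sample points are illustrative choices, not part of the text.

```python
import math

def f(x):
    """f(x) = sin(cos(x^2)) from Example 23."""
    return math.sin(math.cos(x * x))

def df_chain_rule(x):
    """Derivative obtained by the chain rule: -2x sin(x^2) cos(cos(x^2))."""
    return -2.0 * x * math.sin(x * x) * math.cos(math.cos(x * x))

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

for x in (0.3, 1.0, 2.0):
    print(x, df_chain_rule(x), central_difference(f, x))
```

The two columns agree to several decimal places at each sample point, which is what one expects if the closed form is correct.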

Derivatives of implicit functions

Until now we have been differentiating various functions given in the form y = f(x). But it is not
necessary that functions are always expressed in this form. For example, consider one of the following
relationships between x and y:

x - y - π = 0

x + sin xy - y = 0

In the first case, we can solve for y and rewrite the relationship as y = x - π. In the second case, it
does not seem that there is an easy way to solve for y. Nevertheless, there is no doubt about the
dependence of y on x in either of the cases. When a relationship between x and y is expressed in a way
that it is easy to solve for y and write y = F(x), we say that y is given as an explicit function of x. In the
latter case it is implicit that y is a function of x and we say that the relationship of the second type,
above, gives function implicitly. In this subsection, we learn to differentiate implicit functions.
Example 24 Find $\frac{dy}{dx}$ if x - y = π.

Solution One way is to solve for y and rewrite the above as

y = x - π

But then $\frac{dy}{dx} = 1$.

Alternatively, directly differentiating the relationship w.r.t. x, we have

$$\frac{d}{dx}(x - y) = \frac{d\pi}{dx}$$

Recall that $\frac{d\pi}{dx}$ means to differentiate the constant function taking value π everywhere w.r.t. x. Thus

$$\frac{d}{dx}(x) - \frac{d}{dx}(y) = 0$$

which implies that

$$\frac{dy}{dx} = \frac{dx}{dx} = 1$$
Example 25 Find $\frac{dy}{dx}$, if y + sin y = cos x.

Solution We differentiate the relationship directly with respect to x, i.e.,

$$\frac{dy}{dx} + \frac{d}{dx}(\sin y) = \frac{d}{dx}(\cos x)$$

which implies, using chain rule,

$$\frac{dy}{dx} + \cos y \cdot \frac{dy}{dx} = -\sin x$$

This gives

$$\frac{dy}{dx} = -\frac{\sin x}{1 + \cos y}$$

where y ≠ (2n + 1)π
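Even though y cannot be written explicitly in Example 25, the formula for dy/dx can still be checked numerically. The sketch below is an illustration under stated assumptions: it solves y + sin y = cos x for y by bisection on [-2, 2] (valid because y + sin y is non-decreasing and takes values below -1 at -2 and above 1 at 2), then compares a central difference of that solution with the formula. The names `solve_y`, `dy_dx_formula` and `dy_dx_numeric` are hypothetical helpers.

```python
import math

def solve_y(x, lo=-2.0, hi=2.0, iterations=200):
    """Solve y + sin(y) = cos(x) for y by bisection on [lo, hi]."""
    target = math.cos(x)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if mid + math.sin(mid) < target:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return (lo + hi) / 2.0

def dy_dx_formula(x):
    """dy/dx = -sin x / (1 + cos y) from implicit differentiation."""
    y = solve_y(x)
    return -math.sin(x) / (1.0 + math.cos(y))

def dy_dx_numeric(x, h=1e-5):
    """Central difference of the numerically solved y(x)."""
    return (solve_y(x + h) - solve_y(x - h)) / (2.0 * h)

for x in (0.5, 1.0, 2.0):
    print(x, dy_dx_formula(x), dy_dx_numeric(x))
```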

Derivatives of inverse trigonometric functions

We remark that inverse trigonometric functions are continuous functions, but we will not prove this.
Now we use chain rule to find derivatives of these functions.

Example 26 Find the derivative of f given by $f(x) = \sin^{-1} x$ assuming it exists.

Solution Let $y = \sin^{-1} x$. Then, x = sin y.

Differentiating both sides w.r.t. x, we get

$$1 = \cos y \, \frac{dy}{dx}$$

which implies that

$$\frac{dy}{dx} = \frac{1}{\cos y} = \frac{1}{\cos(\sin^{-1} x)}$$

Observe that this is defined only for cos y ≠ 0, i.e., $\sin^{-1} x \ne -\frac{\pi}{2}, \frac{\pi}{2}$, i.e., x ≠ -1, 1, i.e., x ∈ (-1, 1).

To make this result a bit more attractive, we carry out the following manipulation.

Recall that for x ∈ (-1, 1), $\sin(\sin^{-1} x) = x$ and hence

$$\cos^2 y = 1 - (\sin y)^2 = 1 - (\sin(\sin^{-1} x))^2 = 1 - x^2$$

Also, since $y \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, cos y is positive and hence $\cos y = \sqrt{1 - x^2}$.

Thus, for x ∈ (-1, 1),

$$\frac{dy}{dx} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - x^2}}$$
Example 27 Find the derivative of f given by $f(x) = \tan^{-1} x$ assuming it exists.

Solution Let $y = \tan^{-1} x$. Then, x = tan y.

Differentiating both sides w.r.t. x, we get

$$1 = \sec^2 y \, \frac{dy}{dx}$$

which implies that

$$\frac{dy}{dx} = \frac{1}{\sec^2 y} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + (\tan(\tan^{-1} x))^2} = \frac{1}{1 + x^2}$$

Finding of the derivatives of other inverse trigonometric functions is left as exercise. The following table gives the derivatives of the remaining inverse trigonometric functions.

Table

f(x)            $\cos^{-1} x$               $\cot^{-1} x$          $\sec^{-1} x$                  $\text{cosec}^{-1} x$

f'(x)           $\frac{-1}{\sqrt{1 - x^2}}$    $\frac{-1}{1 + x^2}$    $\frac{1}{|x|\sqrt{x^2 - 1}}$    $\frac{-1}{|x|\sqrt{x^2 - 1}}$

Domain of f'    (-1, 1)                     R                      (-∞, -1) ∪ (1, ∞)              (-∞, -1) ∪ (1, ∞)
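The derivative formulas for the inverse trigonometric functions can be compared with numerical derivatives. This sketch covers only the three functions available in Python's standard `math` module (`asin`, `acos`, `atan`); the sample point x = 0.5 and the helper `central_difference` are assumptions made for illustration.

```python
import math

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

# Each entry pairs a function with the closed form of its derivative.
closed_forms = {
    "asin": (math.asin, lambda x: 1.0 / math.sqrt(1.0 - x * x)),
    "acos": (math.acos, lambda x: -1.0 / math.sqrt(1.0 - x * x)),
    "atan": (math.atan, lambda x: 1.0 / (1.0 + x * x)),
}

x = 0.5
for name, (func, formula) in closed_forms.items():
    print(name, formula(x), central_difference(func, x))
```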

Exponential and Logarithmic Functions

Till now we have learnt some aspects of different classes of functions like polynomial functions,
rational functions and trigonometric functions. In this section, we shall learn about a new class of
(related) functions called exponential functions and logarithmic functions. It needs to be emphasized
that many statements made in this section are motivational and precise proofs of these are well beyond
the scope of this text.

The Figure gives a sketch of $y = f_1(x) = x$, $y = f_2(x) = x^2$, $y = f_3(x) = x^3$ and $y = f_4(x) = x^4$. Observe that the curves get steeper as the power of x increases. Steeper the curve, faster is the rate of growth. What this means is that for a fixed increment in the value of x (> 1), the increment in the value of $y = f_n(x)$ increases as n increases for n = 1, 2, 3, 4. It is conceivable that such a statement is true for all positive values of n, where $f_n(x) = x^n$. Essentially, this means that the graph of $y = f_n(x)$ leans more towards the y-axis as n increases. For example, consider $f_{10}(x) = x^{10}$ and $f_{15}(x) = x^{15}$. If x increases from 1 to 2, $f_{10}$ increases from 1 to $2^{10}$ whereas $f_{15}$ increases from 1 to $2^{15}$. Thus, for the same increment in x, $f_{15}$ grows faster than $f_{10}$.

Upshot of the above discussion is that the growth of polynomial functions is dependent on the degree of the polynomial function: higher the degree, greater is the growth. The next natural question is: Is there a function which grows faster than any polynomial function? The answer is in the affirmative and an example of such a function is

$$y = f(x) = 10^x$$

Our claim is that this function f grows faster than $f_n(x) = x^n$ for any positive integer n. For example, we can prove that $10^x$ grows faster than $f_{100}(x) = x^{100}$. For large values of x like $x = 10^3$, note that $f_{100}(x) = (10^3)^{100} = 10^{300}$ whereas $f(10^3) = 10^{10^3} = 10^{1000}$.

Clearly f(x) is much greater than $f_{100}(x)$. It is not difficult to prove that for all $x > 10^3$, $f(x) > f_{100}(x)$. But we will not attempt to give a proof of this here. Similarly, by choosing large values of x, one can verify that f(x) grows faster than $f_n(x)$ for any positive integer n.
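The comparison at x = 10^3 can be carried out exactly, since Python integers have arbitrary precision. The following sketch simply evaluates both functions at that point; the variable names are illustrative.

```python
x = 10 ** 3

power_value = x ** 100        # f_100(x) = x^100 = 10^300
exponential_value = 10 ** x   # f(x) = 10^x = 10^1000

# A power of ten 10^k has k + 1 decimal digits.
print("digits of x^100:", len(str(power_value)))
print("digits of 10^x: ", len(str(exponential_value)))
print("10^x exceeds x^100:", exponential_value > power_value)
```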

Definition 3 The exponential function with positive base b > 1 is the function

$$y = f(x) = b^x$$

The graph of $y = 10^x$ is given in the Figure.

It is advised that the reader plots this graph for particular values of b like 2, 3 and 4.

Following are some of the salient features of the exponential functions:

(1) Domain of the exponential function is R, the set of all real numbers.

(2) Range of the exponential function is the set of all positive real numbers.

(3) The point (0, 1) is always on the graph of the exponential function (this is a restatement of the fact that $b^0 = 1$ for any real b > 1).

(4) Exponential function is ever increasing; i.e., as we move from left to right, the graph rises above.

(5) For very large negative values of x, the exponential function is very close to 0. In other words, in the
second quadrant, the graph approaches x-axis (but never meets it).

Exponential function with base 10 is called the common exponential function. In the Appendix A.1.4 of Class XI, it was observed that the sum of the series

$$1 + \frac{1}{1!} + \frac{1}{2!} + \dots$$

is a number between 2 and 3 and is denoted by e. Using this e as the base we obtain an extremely important exponential function $y = e^x$.

This is called natural exponential function.

It would be interesting to know if the inverse of the exponential function exists and has nice
interpretation. This search motivates the following definition.

Definition 4 Let b > 1 be a real number. Then we say logarithm of a to base b is x if $b^x = a$.

Logarithm of a to base b is denoted by $\log_b a$. Thus $\log_b a = x$ if $b^x = a$. Let us work with a few explicit examples to get a feel for this. We know $2^3 = 8$. In terms of logarithms, we may rewrite this as $\log_2 8 = 3$. Similarly, $10^4 = 10000$ is equivalent to saying $\log_{10} 10000 = 4$. Also, $625 = 5^4 = 25^2$ is equivalent to saying $\log_5 625 = 4$ or $\log_{25} 625 = 2$.

On a slightly more mature note, fixing a base b > 1, we may look at logarithm as a function from positive real numbers to all real numbers. This function, called the logarithmic function, is defined by

$$\log_b : R^+ \to R$$

$$x \mapsto \log_b x = y \text{ if } b^y = x$$

As before, if the base b = 10, we say it is the common logarithm and if b = e, then we say it is the natural logarithm. Often the natural logarithm is denoted by ln. In this chapter, log x denotes the logarithm function to base e, i.e., ln x will be written as simply log x. The Figure gives the plots of logarithm function to base 2, e and 10.

Some of the important observations about the logarithm function to any base b > 1 are listed below:

(1) We cannot make a meaningful definition of logarithm of non-positive numbers and hence the domain
of log function is R+.

(2) The range of log function is the set of all real numbers.

(3) The point (1, 0) is always on the graph of the log function.

(4) The log function is ever increasing, i.e., as we move from left to right the graph rises above.

(5) For x very near to zero, the value of log x can be made lesser than any given real number. In other
words in the fourth quadrant the graph approaches y-axis (but never meets it).

(6) Figure gives the plot of $y = e^x$ and y = ln x. It is of interest to observe that the two curves are the mirror images of each other reflected in the line y = x.

Two properties of ‘log’ functions are proved below:

(1) There is a standard change of base rule to obtain $\log_a p$ in terms of $\log_b p$. Let $\log_a p = \alpha$, $\log_b p = \beta$ and $\log_b a = \gamma$. This means $a^\alpha = p$, $b^\beta = p$ and $b^\gamma = a$. Substituting the third equation in the first one, we have

$$(b^\gamma)^\alpha = b^{\gamma\alpha} = p$$

Using this in the second equation, we get

$$b^\beta = p = b^{\gamma\alpha}$$

which implies β = αγ or $\alpha = \frac{\beta}{\gamma}$. But then

$$\log_a p = \frac{\log_b p}{\log_b a}$$

(2) Another interesting property of the log function is its effect on products. Let $\log_b pq = \alpha$. Then $b^\alpha = pq$. If $\log_b p = \beta$ and $\log_b q = \gamma$, then $b^\beta = p$ and $b^\gamma = q$. But then $b^\alpha = pq = b^\beta b^\gamma = b^{\beta + \gamma}$

which implies α = β + γ, i.e.,

$$\log_b pq = \log_b p + \log_b q$$

A particularly interesting and important consequence of this is when p = q. In this case the above may be rewritten as

$$\log_b p^2 = \log_b p + \log_b p = 2 \log_b p$$

An easy generalisation of this (left as an exercise!) is

$$\log_b p^n = n \log_b p$$

for any positive integer n. In fact this is true for any real number n, but we will not attempt to prove this. On similar lines the reader is invited to verify

$$\log_b \frac{x}{y} = \log_b x - \log_b y$$

Example 28 Is it true that $x = e^{\log x}$ for all real x?

Solution First, observe that the domain of log function is set of all positive real numbers. So the above equation is not true for non-positive real numbers. Now, let $y = e^{\log x}$. If y > 0, we may take logarithm which gives us $\log y = \log(e^{\log x}) = \log x \cdot \log e = \log x$. Thus y = x. Hence $x = e^{\log x}$ is true only for positive values of x.
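The logarithm properties discussed above, together with the identity of Example 28, can be spot-checked in floating point. This is a sketch with arbitrarily chosen sample values; `log_base` is a hypothetical helper built on the standard two-argument `math.log`, and comparisons use `math.isclose` because floating-point results are only approximate.

```python
import math

def log_base(b, a):
    """Logarithm of a to base b, for b > 1 and a > 0."""
    return math.log(a, b)

p, q, a, b = 7.0, 12.5, 3.0, 10.0  # arbitrary positive sample values

# Definition: b^x = a exactly when log_b a = x.
print(log_base(2, 8), log_base(5, 625), log_base(25, 625))

# Change of base: log_a p = log_b p / log_b a.
print(log_base(a, p), log_base(b, p) / log_base(b, a))

# Product rule: log_b (pq) = log_b p + log_b q.
print(log_base(b, p * q), log_base(b, p) + log_base(b, q))

# Example 28: e^(log x) = x for positive x.
print(math.exp(math.log(p)), p)
```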

One of the striking properties of the natural exponential function in differential calculus is that it
doesn’t change during the process of differentiation. This is captured in the following theorem whose
proof we skip.

Theorem 5*

(1) The derivative of $e^x$ w.r.t. x is $e^x$; i.e., $\frac{d}{dx}(e^x) = e^x$.

(2) The derivative of log x w.r.t. x is $\frac{1}{x}$; i.e., $\frac{d}{dx}(\log x) = \frac{1}{x}$.

* Please see supplementary material on Page 286.

Example 29 Differentiate the following w.r.t. x:

(i) $e^{-x}$ (ii) sin(log x), x > 0 (iii) $\cos^{-1}(e^x)$ (iv) $e^{\cos x}$

Solution

(i) Let $y = e^{-x}$. Using chain rule, we have

$$\frac{dy}{dx} = e^{-x} \cdot \frac{d}{dx}(-x) = -e^{-x}$$

(ii) Let y = sin(log x). Using chain rule, we have

$$\frac{dy}{dx} = \cos(\log x) \cdot \frac{d}{dx}(\log x) = \frac{\cos(\log x)}{x}$$

(iii) Let $y = \cos^{-1}(e^x)$. Using chain rule, we have

$$\frac{dy}{dx} = \frac{-1}{\sqrt{1 - (e^x)^2}} \cdot \frac{d}{dx}(e^x) = \frac{-e^x}{\sqrt{1 - e^{2x}}}$$

(iv) Let $y = e^{\cos x}$. Using chain rule, we have

$$\frac{dy}{dx} = e^{\cos x} \cdot (-\sin x) = -(\sin x) e^{\cos x}$$

Logarithmic Differentiation

In this section, we will learn to differentiate certain special class of functions given in the form

$$y = f(x) = [u(x)]^{v(x)}$$

By taking logarithm (to base e) the above may be rewritten as

$$\log y = v(x) \log[u(x)]$$

Using chain rule we may differentiate this to get

$$\frac{1}{y} \cdot \frac{dy}{dx} = v(x) \cdot \frac{1}{u(x)} \cdot u'(x) + v'(x) \cdot \log[u(x)]$$

which implies that

$$\frac{dy}{dx} = y \left[\frac{v(x)}{u(x)} \cdot u'(x) + v'(x) \cdot \log[u(x)]\right]$$

The main point to be noted in this method is that f(x) and u(x) must always be positive as otherwise their logarithms are not defined. This process of differentiation is known as logarithmic differentiation and is illustrated by the following examples:

Example 30 Differentiate $\sqrt{\frac{(x - 3)(x^2 + 4)}{3x^2 + 4x + 5}}$ w.r.t. x.

Solution Let $y = \sqrt{\frac{(x - 3)(x^2 + 4)}{3x^2 + 4x + 5}}$

Taking logarithm on both sides, we have

$$\log y = \frac{1}{2} [\log(x - 3) + \log(x^2 + 4) - \log(3x^2 + 4x + 5)]$$

Now, differentiating both sides w.r.t. x, we get

$$\frac{1}{y} \cdot \frac{dy}{dx} = \frac{1}{2} \left[\frac{1}{x - 3} + \frac{2x}{x^2 + 4} - \frac{6x + 4}{3x^2 + 4x + 5}\right]$$

or

$$\frac{dy}{dx} = \frac{y}{2} \left[\frac{1}{x - 3} + \frac{2x}{x^2 + 4} - \frac{6x + 4}{3x^2 + 4x + 5}\right]$$

$$= \frac{1}{2} \sqrt{\frac{(x - 3)(x^2 + 4)}{3x^2 + 4x + 5}} \left[\frac{1}{x - 3} + \frac{2x}{x^2 + 4} - \frac{6x + 4}{3x^2 + 4x + 5}\right]$$

Example 31 Differentiate $a^x$ w.r.t. x, where a is a positive constant.

Solution Let $y = a^x$. Then

$$\log y = x \log a$$

Differentiating both sides w.r.t. x, we have

$$\frac{1}{y} \frac{dy}{dx} = \log a$$

or

$$\frac{dy}{dx} = y \log a$$

Thus

$$\frac{d}{dx}(a^x) = a^x \log a$$

Alternatively,

$$\frac{d}{dx}(a^x) = \frac{d}{dx}(e^{x \log a}) = e^{x \log a} \frac{d}{dx}(x \log a) = e^{x \log a} \cdot \log a = a^x \log a$$


Example 32 Differentiate $x^{\sin x}$, x > 0 w.r.t. x.

Solution Let $y = x^{\sin x}$. Taking logarithm on both sides, we have

$$\log y = \sin x \log x$$

Therefore

$$\frac{1}{y} \frac{dy}{dx} = \sin x \frac{d}{dx}(\log x) + \log x \frac{d}{dx}(\sin x)$$

or

$$\frac{1}{y} \frac{dy}{dx} = (\sin x) \frac{1}{x} + \log x \cos x$$

or

$$\frac{dy}{dx} = y \left[\frac{\sin x}{x} + \cos x \log x\right]$$

$$= x^{\sin x} \left[\frac{\sin x}{x} + \cos x \log x\right]$$

$$= x^{\sin x - 1} \cdot \sin x + x^{\sin x} \cdot \cos x \log x$$
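The result of Example 32 can be compared with a numerical derivative at a few positive values of x. The sketch below is illustrative; the helper `central_difference`, the step size and the sample points are assumptions, and the function is only evaluated for x > 0, where the logarithm is defined.

```python
import math

def y(x):
    """y = x^(sin x), defined here for x > 0."""
    return x ** math.sin(x)

def dy_dx(x):
    """Result of logarithmic differentiation: y (sin x / x + cos x log x)."""
    return y(x) * (math.sin(x) / x + math.cos(x) * math.log(x))

def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

for x in (0.5, 2.0, 5.0):
    print(x, dy_dx(x), central_difference(y, x))
```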


Example 33 Find $\frac{dy}{dx}$, if $y^x + x^y + x^x = a^b$.

Solution Given that $y^x + x^y + x^x = a^b$

Putting $u = y^x$, $v = x^y$ and $w = x^x$, we get $u + v + w = a^b$

Therefore

$$\frac{du}{dx} + \frac{dv}{dx} + \frac{dw}{dx} = 0 \quad \ldots (1)$$

Now, $u = y^x$. Taking logarithm on both sides, we have

$$\log u = x \log y$$

Differentiating both sides w.r.t. x, we have

$$\frac{1}{u} \cdot \frac{du}{dx} = x \frac{d}{dx}(\log y) + \log y \frac{d}{dx}(x) = x \cdot \frac{1}{y} \cdot \frac{dy}{dx} + \log y \cdot 1$$

So

$$\frac{du}{dx} = u \left(\frac{x}{y} \frac{dy}{dx} + \log y\right) = y^x \left[\frac{x}{y} \frac{dy}{dx} + \log y\right] \quad \ldots (2)$$

Also $v = x^y$

Taking logarithm on both sides, we have

$$\log v = y \log x$$

Differentiating both sides w.r.t. x, we have

$$\frac{1}{v} \cdot \frac{dv}{dx} = y \frac{d}{dx}(\log x) + \log x \frac{dy}{dx} = y \cdot \frac{1}{x} + \log x \cdot \frac{dy}{dx}$$

So

$$\frac{dv}{dx} = v \left[\frac{y}{x} + \log x \frac{dy}{dx}\right] = x^y \left[\frac{y}{x} + \log x \frac{dy}{dx}\right] \quad \ldots (3)$$

Again $w = x^x$

Taking logarithm on both sides, we have

$$\log w = x \log x$$

Differentiating both sides w.r.t. x, we have

$$\frac{1}{w} \cdot \frac{dw}{dx} = x \frac{d}{dx}(\log x) + \log x \cdot \frac{d}{dx}(x) = x \cdot \frac{1}{x} + \log x \cdot 1$$

i.e.

$$\frac{dw}{dx} = w(1 + \log x) = x^x (1 + \log x) \quad \ldots (4)$$

From (1), (2), (3), (4), we have

$$y^x \left(\frac{x}{y} \frac{dy}{dx} + \log y\right) + x^y \left(\frac{y}{x} + \log x \frac{dy}{dx}\right) + x^x (1 + \log x) = 0$$

or

$$(x \cdot y^{x-1} + x^y \cdot \log x) \frac{dy}{dx} = -x^x (1 + \log x) - y \cdot x^{y-1} - y^x \log y$$

Therefore

$$\frac{dy}{dx} = \frac{-[y^x \log y + y \cdot x^{y-1} + x^x (1 + \log x)]}{x \cdot y^{x-1} + x^y \log x}$$

Derivatives of Functions in Parametric Forms

Sometimes the relation between two variables is neither explicit nor implicit, but some link of a third
variable with each of the two variables, separately, establishes a relation between the first two
variables. In such a situation, we say that the relation between them is expressed via a third variable.
The third variable is called the parameter. More precisely, a relation expressed between two variables
x and y in the form x = f(t), y = g(t) is said to be in parametric form with t as a parameter.

In order to find the derivative of a function in such form, we have, by the chain rule,

$\frac{dy}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt}$

or $\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$ (whenever $\frac{dx}{dt} \neq 0$)

Thus $\frac{dy}{dx} = \frac{g'(t)}{f'(t)}$ (as $\frac{dy}{dt} = g'(t)$ and $\frac{dx}{dt} = f'(t)$) [provided $f'(t) \neq 0$]
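The parametric rule dy/dx = (dy/dt)/(dx/dt) can be tried out numerically. The sketch below is an illustration under stated assumptions, not the text's own method: `parametric_slope` is a hypothetical helper that estimates both derivatives by central differences, and the circle x = a cos t, y = a sin t (with an arbitrary radius `a`) is used as a test curve, for which the slope is known to be -cot t.

```python
import math

def parametric_slope(x_of_t, y_of_t, t, h=1e-6):
    """Estimate dy/dx at parameter t as (dy/dt) / (dx/dt)."""
    dx_dt = (x_of_t(t + h) - x_of_t(t - h)) / (2 * h)
    dy_dt = (y_of_t(t + h) - y_of_t(t - h)) / (2 * h)
    if abs(dx_dt) < 1e-12:
        # The rule only applies where dx/dt is non-zero.
        raise ZeroDivisionError("dx/dt is numerically zero at this t")
    return dy_dt / dx_dt

a = 2.0                # hypothetical radius
theta = math.pi / 3    # arbitrary parameter value
slope = parametric_slope(lambda t: a * math.cos(t),
                         lambda t: a * math.sin(t),
                         theta)
print(slope, -1 / math.tan(theta))
```

The guard on `dx_dt` mirrors the condition dx/dt ≠ 0 stated with the formula.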

Example 34 Find $\frac{dy}{dx}$, if x = a cos θ, y = a sin θ.

Solution Given that

x = a cos θ, y = a sin θ

Therefore $\frac{dx}{d\theta} = -a \sin\theta$, $\frac{dy}{d\theta} = a \cos\theta$

Hence $\frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{a\cos\theta}{-a\sin\theta} = -\cot\theta$

Example 35 Find $\frac{dy}{dx}$, if $x = at^2$, $y = 2at$.

Solution Given that $x = at^2$, $y = 2at$

So $\frac{dx}{dt} = 2at$ and $\frac{dy}{dt} = 2a$

Therefore $\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{2a}{2at} = \frac{1}{t}$

Example 36 Find $\frac{dy}{dx}$, if x = a(θ + sin θ), y = a(1 − cos θ).

Solution We have $\frac{dx}{d\theta} = a(1 + \cos\theta)$, $\frac{dy}{d\theta} = a \sin\theta$

Therefore $\frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{a\sin\theta}{a(1 + \cos\theta)} = \tan\frac{\theta}{2}$

[Note] It may be noted here that $\frac{dy}{dx}$ is expressed in terms of the parameter only, without directly involving
the main variables x and y.
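Example 37 Find $\frac{dy}{dx}$, if $x^{2/3} + y^{2/3} = a^{2/3}$.

Solution Let $x = a\cos^3\theta$, $y = a\sin^3\theta$. Then

$x^{2/3} + y^{2/3} = (a\cos^3\theta)^{2/3} + (a\sin^3\theta)^{2/3} = a^{2/3}(\cos^2\theta + \sin^2\theta) = a^{2/3}$

Hence, $x = a\cos^3\theta$, $y = a\sin^3\theta$ is a parametric equation of $x^{2/3} + y^{2/3} = a^{2/3}$.

Now $\frac{dx}{d\theta} = -3a\cos^2\theta \sin\theta$ and $\frac{dy}{d\theta} = 3a\sin^2\theta \cos\theta$

Therefore $\frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{3a\sin^2\theta\cos\theta}{-3a\cos^2\theta\sin\theta} = -\tan\theta = -\sqrt[3]{\frac{y}{x}}$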

If x and y are connected parametrically by the equations given in Exercises 1 to 10, find $\frac{dy}{dx}$ without
eliminating the parameter.

Second Order Derivative

Let y = f(x). Then

$\frac{dy}{dx} = f'(x)$ ... (1)

If f′(x) is differentiable, we may differentiate (1) again w.r.t. x. Then, the left hand side becomes
$\frac{d}{dx}\left(\frac{dy}{dx}\right)$, which is called the second order derivative of y w.r.t. x and is denoted by $\frac{d^2y}{dx^2}$.
It is also denoted by $D^2 y$ or y″ or $y_2$ if y = f(x). We remark that higher order derivatives may be defined
similarly.
Example 38 Find $\frac{d^2y}{dx^2}$, if $y = x^3 + \tan x$.

Solution Given that $y = x^3 + \tan x$. Then

$\frac{dy}{dx} = 3x^2 + \sec^2 x$

Therefore $\frac{d^2y}{dx^2} = \frac{d}{dx}(3x^2 + \sec^2 x)$

$= 6x + 2\sec x \cdot \sec x \tan x = 6x + 2\sec^2 x \tan x$


Example 39 If y = A sin x + B cos x, then prove that $\frac{d^2y}{dx^2} + y = 0$.

Solution We have

$\frac{dy}{dx} = A\cos x - B\sin x$

and $\frac{d^2y}{dx^2} = \frac{d}{dx}(A\cos x - B\sin x)$

$= -A\sin x - B\cos x = -y$

Hence $\frac{d^2y}{dx^2} + y = 0$
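A relation of this kind between a function and its second derivative can be spot-checked numerically. The sketch below is illustrative only: `second_derivative` is a hypothetical helper using the standard three-point central difference, and the constants `A` and `B` and the sample points are arbitrary assumptions.

```python
import math

def second_derivative(f, x, h=1e-4):
    # Three-point central difference estimate of f''(x).
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

A, B = 3.0, -2.0  # arbitrary constants (assumption)

def y(x):
    return A * math.sin(x) + B * math.cos(x)

# y'' + y should vanish (up to rounding) at every sample point.
residuals = [second_derivative(y, x) + y(x) for x in (0.0, 0.7, 1.5, 2.9)]
print(residuals)
```

The residuals are not exactly zero because of truncation and rounding error in the finite difference, but they should be very small.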
Example 40 If $y = 3e^{2x} + 2e^{3x}$, prove that $\frac{d^2y}{dx^2} - 5\frac{dy}{dx} + 6y = 0$.

Solution Given that $y = 3e^{2x} + 2e^{3x}$. Then

$\frac{dy}{dx} = 6e^{2x} + 6e^{3x} = 6(e^{2x} + e^{3x})$

Therefore $\frac{d^2y}{dx^2} = 12e^{2x} + 18e^{3x} = 6(2e^{2x} + 3e^{3x})$

Hence $\frac{d^2y}{dx^2} - 5\frac{dy}{dx} + 6y = 6(2e^{2x} + 3e^{3x}) - 30(e^{2x} + e^{3x}) + 6(3e^{2x} + 2e^{3x}) = 0$
Example 41 If $y = \sin^{-1} x$, show that $(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} = 0$.

Solution We have $y = \sin^{-1} x$. Then

$\frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}$

or $\sqrt{1 - x^2}\,\frac{dy}{dx} = 1$

So $\frac{d}{dx}\left(\sqrt{1 - x^2} \cdot \frac{dy}{dx}\right) = 0$

or $\sqrt{1 - x^2} \cdot \frac{d^2y}{dx^2} + \frac{dy}{dx} \cdot \frac{d}{dx}\left(\sqrt{1 - x^2}\right) = 0$

or $\sqrt{1 - x^2} \cdot \frac{d^2y}{dx^2} - \frac{dy}{dx} \cdot \frac{2x}{2\sqrt{1 - x^2}} = 0$

Hence $(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} = 0$

Alternatively, given that $y = \sin^{-1} x$, we have

$y_1 = \frac{1}{\sqrt{1 - x^2}}$, i.e., $(1 - x^2)\,y_1^2 = 1$

So $(1 - x^2) \cdot 2y_1 y_2 + y_1^2 (0 - 2x) = 0$

Hence $(1 - x^2)\,y_2 - x\,y_1 = 0$

Mean Value Theorem

In this section, we will state two fundamental results in Calculus without proof. We shall also learn
the geometric interpretation of these theorems.

Theorem 6 (Rolle’s Theorem) Let f: [a, b] → R be continuous on [a, b] and differentiable on (a, b), such
that f(a) = f(b), where a and b are some real numbers. Then there exists some c in (a, b) such that
f′(c) = 0.

In Figure, graphs of a few typical differentiable functions satisfying the hypothesis of Rolle’s theorem
are given.

Observe what happens to the slope of the tangent to the curve at various points between a and b. In
each of the graphs, the slope becomes zero at least at one point. That is precisely the claim of the
Rolle’s theorem as the slope of the tangent at any point on the graph of y = f(x) is nothing but the
derivative of f(x) at that point.

Theorem 7 (Mean Value Theorem) Let f : [a, b] → R be a continuous function on [a, b] and differentiable
on (a, b). Then there exists some c in (a, b) such that

$f'(c) = \frac{f(b) - f(a)}{b - a}$

Observe that the Mean Value Theorem (MVT) is an extension of Rolle’s theorem. Let us now understand
a geometric interpretation of the MVT. The graph of a function y = f(x) is given in the Figure. We have
already interpreted f′(c) as the slope of the tangent to the curve y = f(x) at (c, f(c)). From the Figure it
is clear that $\frac{f(b) - f(a)}{b - a}$ is the slope of the secant drawn between (a, f(a)) and (b, f(b)). The MVT states that
there is a point c in (a, b) such that the slope of the tangent at (c, f(c)) is same as the slope of the
secant between (a, f(a)) and (b, f(b)). In other words, there is a point c in (a, b) such that the tangent
at (c, f(c)) is parallel to the secant between (a, f(a)) and (b, f(b)).

Example 42 Verify Rolle’s theorem for the function $y = x^2 + 2$, a = −2 and b = 2.

Solution The function $y = x^2 + 2$ is continuous in [−2, 2] and differentiable in (−2, 2). Also
f(−2) = f(2) = 6 and hence the values of f(x) at −2 and 2 coincide. Rolle’s theorem states that there is a
point c ∈ (−2, 2), where f′(c) = 0. Since f′(x) = 2x, we get c = 0. Thus at c = 0, we have f′(c) = 0 and
c = 0 ∈ (−2, 2).

Example 43 Verify Mean Value Theorem for the function f (x) = x2 in the interval [2, 4].

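Solution The function $f(x) = x^2$ is a polynomial, so it is continuous in [2, 4] and differentiable in (2, 4).

Now, f(2) = 4 and f(4) = 16. Hence

$\frac{f(b) - f(a)}{b - a} = \frac{16 - 4}{4 - 2} = 6$

MVT states that there is a point c ∈ (2, 4) such that f′(c) = 6. But f′(x) = 2x, which implies c = 3.
Thus at c = 3 ∈ (2, 4), we have f′(c) = 6.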
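The point c promised by the Mean Value Theorem can also be located numerically. The following sketch is an illustration, not the text's method: `mvt_point` is a hypothetical helper that applies bisection to f′(x) minus the secant slope, and it assumes that this difference changes sign between a and b (true for f(x) = x² on [2, 4]).

```python
def mvt_point(f, df, a, b, tol=1e-10):
    """Find c in [a, b] with df(c) equal to the secant slope, by bisection.

    Assumes df(x) - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return df(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f'(x) - slope")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # the sign change lies in [lo, mid]
        else:
            lo = mid   # the sign change lies in [mid, hi]
    return (lo + hi) / 2

c = mvt_point(lambda x: x * x, lambda x: 2 * x, 2.0, 4.0)
print(c)
```

For functions whose derivative does not change sign relative to the secant slope at the endpoints, a different root-finding strategy would be needed.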



6 Application of Derivatives
Introduction
We have learnt how to find derivative of composite functions, inverse trigonometric functions, implicit
functions, exponential functions and logarithmic functions. In this chapter, we will study applications
of the derivative in various disciplines, e.g., in engineering, science, social science, and many other
fields. For instance, we will learn how the derivative can be used (i) to determine rate of change of
quantities, (ii) to find the equations of tangent and normal to a curve at a point, (iii) to find turning
points on the graph of a function which in turn will help us to locate points at which largest or smallest
value (locally) of a function occurs. We will also use derivative to find intervals on which a function is
increasing or decreasing. Finally, we use the derivative to find approximate value of certain quantities.
Rate of Change of Quantities
Recall that by the derivative $\frac{ds}{dt}$, we mean the rate of change of distance s with respect to the time t. In
a similar fashion, whenever one quantity y varies with another quantity x, satisfying some rule y = f(x),
then $\frac{dy}{dx}$ (or f′(x)) represents the rate of change of y with respect to x and $\left.\frac{dy}{dx}\right]_{x = x_0}$ (or $f'(x_0)$)
represents the rate of change of y with respect to x at $x = x_0$.

Further, if two variables x and y are varying with respect to another variable t, i.e., if x = f(t) and
y = g(t), then by Chain Rule

$\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$, if $\frac{dx}{dt} \neq 0$
Thus, the rate of change of y with respect to x can be calculated using the rate of change of y and that
of x both with respect to t.
Let us consider some examples.
Example 1 Find the rate of change of the area of a circle with respect to its radius r when
r = 5 cm.
Solution The area A of a circle with radius r is given by $A = \pi r^2$. Therefore, the rate of change of the
area A with respect to its radius r is given by $\frac{dA}{dr} = \frac{d}{dr}(\pi r^2) = 2\pi r$.
When r = 5 cm, $\frac{dA}{dr} = 10\pi$. Thus, the area of the circle is changing at the rate of 10π cm²/cm.
Example 2 The volume of a cube is increasing at a rate of 9 cubic centimetres per second. How fast is
the surface area increasing when the length of an edge is 10 centimetres ?
Solution Let x be the length of a side, V be the volume and S be the surface area of the cube. Then,
$V = x^3$ and $S = 6x^2$, where x is a function of time t.

Now $\frac{dV}{dt} = 9$ cm³/s (Given)

Therefore $9 = \frac{dV}{dt} = \frac{d}{dt}(x^3) = \frac{d}{dx}(x^3) \cdot \frac{dx}{dt}$ (By Chain Rule)

$= 3x^2 \cdot \frac{dx}{dt}$

or $\frac{dx}{dt} = \frac{3}{x^2}$ ... (1)

Now $\frac{dS}{dt} = \frac{d}{dt}(6x^2) = \frac{d}{dx}(6x^2) \cdot \frac{dx}{dt}$ (By Chain Rule)

$= 12x \cdot \left(\frac{3}{x^2}\right) = \frac{36}{x}$ (Using (1))

Hence, when x = 10 cm, $\frac{dS}{dt} = 3.6$ cm²/s
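The chain-rule steps of this related-rates problem translate directly into a short computation. This is a minimal sketch under the example's assumptions (V = x³, S = 6x²); the function name `cube_surface_rate` and its parameters are hypothetical.

```python
def cube_surface_rate(edge, volume_rate):
    """Rate of change of a cube's surface area, given dV/dt and the edge length.

    Uses dV/dt = 3 x^2 dx/dt to recover dx/dt, then dS/dt = 12 x dx/dt.
    """
    dx_dt = volume_rate / (3 * edge ** 2)
    return 12 * edge * dx_dt

# Edge 10 cm, volume growing at 9 cm^3/s, as in the example.
rate = cube_surface_rate(10.0, 9.0)
print(rate)
```

Keeping dx/dt as an explicit intermediate value mirrors equation (1) in the solution.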
dt
Example 3 A stone is dropped into a quiet lake and waves move in circles at a speed of 4cm per second.
At the instant, when the radius of the circular wave is 10 cm, how fast is the enclosed area increasing?
Solution The area A of a circle with radius r is given by $A = \pi r^2$. Therefore, the rate of change of area A
with respect to time t is

$\frac{dA}{dt} = \frac{d}{dt}(\pi r^2) = \frac{d}{dr}(\pi r^2) \cdot \frac{dr}{dt} = 2\pi r\,\frac{dr}{dt}$ (By Chain Rule)

It is given that $\frac{dr}{dt} = 4$ cm/s

Therefore, when r = 10 cm, $\frac{dA}{dt} = 2\pi(10)(4) = 80\pi$

Thus, the enclosed area is increasing at the rate of 80π cm²/s, when r = 10 cm.
[Note] $\frac{dy}{dx}$ is positive if y increases as x increases and is negative if y decreases as x increases.
Example 4 The length x of a rectangle is decreasing at the rate of 3 cm/minute and the width y is
increasing at the rate of 2cm/minute. When x =10cm and y = 6cm, find the rates of change of (a) the
perimeter and (b) the area of the rectangle.
Solution Since the length x is decreasing and the width y is increasing with respect to time, we have

$\frac{dx}{dt} = -3$ cm/min and $\frac{dy}{dt} = 2$ cm/min

(a) The perimeter P of a rectangle is given by
P = 2(x + y)

Therefore $\frac{dP}{dt} = 2\left(\frac{dx}{dt} + \frac{dy}{dt}\right) = 2(-3 + 2) = -2$ cm/min

(b) The area A of the rectangle is given by
A = x · y

Therefore $\frac{dA}{dt} = \frac{dx}{dt} \cdot y + x \cdot \frac{dy}{dt}$
= −3(6) + 10(2) (as x = 10 cm and y = 6 cm)
= 2 cm²/min
Example 5 The total cost C(x) in Rupees, associated with the production of x units of an item is given
by
$C(x) = 0.005x^3 - 0.02x^2 + 30x + 5000$
Find the marginal cost when 3 units are produced, where by marginal cost we mean the instantaneous
rate of change of total cost at any level of output.
Solution Since marginal cost is the rate of change of total cost with respect to the output, we have
Marginal cost (MC) $= \frac{dC}{dx} = 0.005(3x^2) - 0.02(2x) + 30$
When x = 3, MC $= 0.015(3^2) - 0.04(3) + 30$
= 0.135 − 0.12 + 30 = 30.015
Hence, the required marginal cost is Rs. 30.02 (nearly).
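The marginal cost computation can be reproduced in a few lines. The sketch below is illustrative: `total_cost` and `marginal_cost` are hypothetical names for the cost function of the example and its derivative, and the finite-difference comparison is an added sanity check rather than part of the solution.

```python
def total_cost(x):
    # C(x) from the example, in Rupees.
    return 0.005 * x ** 3 - 0.02 * x ** 2 + 30 * x + 5000

def marginal_cost(x):
    # dC/dx obtained by differentiating term by term.
    return 0.015 * x ** 2 - 0.04 * x + 30

mc = marginal_cost(3)

# Cross-check with a central difference of the total cost.
h = 1e-4
numeric_mc = (total_cost(3 + h) - total_cost(3 - h)) / (2 * h)
print(mc, numeric_mc)
```

Both values should be close to 30.015, which rounds to Rs. 30.02.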
Example 6 The total revenue in Rupees received from the sale of x units of a product is given by
$R(x) = 3x^2 + 36x + 5$. Find the marginal revenue, when x = 5, where by marginal revenue we mean the rate
of change of total revenue with respect to the number of items sold at an instant.
Solution Since marginal revenue is the rate of change of total revenue with respect to the number of
units sold, we have
Marginal Revenue (MR) $= \frac{dR}{dx} = 6x + 36$
When x = 5, MR = 6(5) + 36 = 66
Hence, the required marginal revenue is Rs. 66.
Increasing and Decreasing Functions
In this section, we will use differentiation to find out whether a function is increasing or decreasing or
none.
Consider the function f given by f(x) = x2, x ∈ R. The graph of this function is a parabola as given in
Figure.

First consider the graph (Figure) to the right of the origin. Observe that as we move from left to right
along the graph, the height of the graph continuously increases. For this reason, the function is said to
be increasing for the real numbers x > 0.
Now consider the graph to the left of the origin and observe here that as we move from left to right
along the graph, the height of the graph continuously decreases. Consequently, the function is said to
be decreasing for the real numbers x < 0.
We shall now give the following analytical definitions for a function which is increasing or decreasing
on an interval.
Definition 1 Let I be an interval contained in the domain of a real valued function f. Then f is said to be
(i) increasing on I if x1 < x2 in I ⇒ f(x1) ≤ f(x2) for all x1, x2 ∈ I.
(ii) strictly increasing on I if x1 < x2 in I ⇒ f(x1) < f(x2) for all x1, x2 ∈ I.
(iii) constant on I, if f(x) = c for all x ∈ I, where c is a constant.
(iv) decreasing on I if x1 < x2 in I ⇒ f(x1) ≥ f(x2) for all x1, x2 ∈ I.
(v) strictly decreasing on I if x1 < x2 in I ⇒ f(x1) > f(x2) for all x1, x2 ∈ I.
For graphical representation of such functions see Figure.

[Figure: (i) Strictly increasing function, (ii) Strictly decreasing function, (iii) Neither increasing nor decreasing function]
We shall now define when a function is increasing or decreasing at a point.
Definition 2 Let x0 be a point in the domain of definition of a real valued function f. Then f is said to be
increasing, decreasing at x0 if there exists an open interval I containing x0 such that f is increasing,
decreasing, respectively, in I.
Let us clarify this definition for the case of increasing function.
Example 7 Show that the function given by f (x) = 7x – 3 is increasing on R.
Solution Let x1 and x2 be any two numbers in R. Then
x1 < x2 ⇒ 7x1 < 7x2 ⇒ 7x1 – 3 < 7x2 – 3 ⇒ f (x1) < f (x2)
Thus, by Definition 1, it follows that f is strictly increasing on R.
We shall now give the first derivative test for increasing and decreasing functions. The proof of this
test requires the Mean Value Theorem studied in Chapter 5.
Theorem 1 Let f be continuous on [a, b] and differentiable on the open interval (a,b). Then
(a) f is increasing in [a,b] if f ′(x) > 0 for each x ∈ (a, b)
(b) f is decreasing in [a,b] if f ′(x) < 0 for each x ∈ (a, b)
(c) f is a constant function in [a,b] if f ′(x) = 0 for each x ∈ (a, b)
Proof (a) Let x1, x2 ∈ [a, b] be such that x1 < x2.
Then, by Mean Value Theorem (Theorem 7 in Chapter 5), there exists a point c between x1 and x2 such
that
f(x2) − f(x1) = f′(c)(x2 − x1)

i.e. f(x2) − f(x1) > 0 (as f′(c) > 0 (given))

i.e. f(x2) > f(x1)
Thus, we have
x1 < x2 ⇒ f(x1) < f(x2), for all x1, x2 ∈ [a, b]
Hence, f is an increasing function in [a, b].
The proofs of parts (b) and (c) are similar and are left as an exercise to the reader.
Remarks
There is a more generalised theorem, which states that if f′(x) > 0 for x in an interval excluding the
end points and f is continuous in the interval, then f is increasing. Similarly, if f′(x) < 0 for x in an
interval excluding the end points and f is continuous in the interval, then f is decreasing.
Example 8 Show that the function f given by
$f(x) = x^3 - 3x^2 + 4x$, x ∈ R
is increasing on R.
Solution Note that
$f'(x) = 3x^2 - 6x + 4$
$= 3(x^2 - 2x + 1) + 1$
$= 3(x - 1)^2 + 1 > 0$, in every interval of R
Therefore, the function f is increasing on R.
Example 9 Prove that the function given by f (x) = cos x is
(a) decreasing in (0, π)
(b) increasing in (π, 2π), and
(c) neither increasing nor decreasing in (0, 2π).
Solution Note that f ′(x) = – sin x
(a) Since for each x ∈ (0, π), sin x > 0, we have f ′(x) < 0 and so f is decreasing in (0, π).
(b) Since for each x ∈ (π, 2π), sin x < 0, we have f ′(x) > 0 and so f is increasing in (π, 2π).
(c) Clearly by (a) and (b) above, f is neither increasing nor decreasing in (0, 2π).
Example 10 Find the intervals in which the function f given by f (x) = x2 – 4x + 6 is
(a) increasing (b) decreasing
Solution We have
$f(x) = x^2 - 4x + 6$
or f′(x) = 2x − 4

Therefore, f′(x) = 0 gives x = 2. Now the point x = 2 divides the real line into two disjoint intervals
namely, (−∞, 2) and (2, ∞). In the interval (−∞, 2), f′(x) = 2x − 4 < 0.
Therefore, f is decreasing in this interval. Also, in the interval (2, ∞), f′(x) > 0 and so the function f
is increasing in this interval.
Example 11 Find the intervals in which the function f given by f (x) = 4x3 – 6x2 – 72x + 30 is
(a) increasing (b) decreasing.
Solution We have
$f(x) = 4x^3 - 6x^2 - 72x + 30$
or $f'(x) = 12x^2 - 12x - 72$
$= 12(x^2 - x - 6)$
= 12(x − 3)(x + 2)
Therefore, f′(x) = 0 gives x = −2, 3. The points x = −2 and x = 3 divide the real line into three disjoint
intervals, namely, (−∞, −2), (−2, 3) and (3, ∞).
In the intervals (−∞, −2) and (3, ∞), f′(x) is positive while in the interval (−2, 3), f′(x) is negative.
Consequently, the function f is increasing in the intervals (−∞, −2) and (3, ∞) while the function is
decreasing in the interval (−2, 3). However, f is neither increasing nor decreasing in R.

Interval        Sign of f′(x)       Nature of function f
(−∞, −2)        (−)(−) > 0          f is increasing
(−2, 3)         (−)(+) < 0          f is decreasing
(3, ∞)          (+)(+) > 0          f is increasing
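The sign table above can be generated mechanically: once the zeros of f′ are known, one test point per interval decides the sign. The sketch below is an illustration under that assumption; `classify_intervals` and its `probe` parameter are hypothetical, and the approach relies on f′ being continuous with all of its zeros listed.

```python
def classify_intervals(df, critical_points, probe=1.0):
    """Label each interval between consecutive zeros of df as increasing or decreasing.

    Assumes df is continuous and critical_points lists all of its zeros,
    so the sign of df is constant on each open interval.
    """
    pts = sorted(critical_points)
    edges = [float("-inf")] + pts + [float("inf")]
    chart = []
    for left, right in zip(edges, edges[1:]):
        if left == float("-inf"):
            sample = right - probe        # a point to the left of the first zero
        elif right == float("inf"):
            sample = left + probe         # a point to the right of the last zero
        else:
            sample = (left + right) / 2   # midpoint of a bounded interval
        nature = "increasing" if df(sample) > 0 else "decreasing"
        chart.append(((left, right), nature))
    return chart

# f'(x) = 12(x - 3)(x + 2) from Example 11.
chart = classify_intervals(lambda x: 12 * (x - 3) * (x + 2), [-2, 3])
for interval, nature in chart:
    print(interval, nature)
```

Testing a single point per interval is enough here only because the sign of f′ cannot change without passing through one of the listed zeros.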


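Example 12 Find intervals in which the function given by f(x) = sin 3x, $x \in \left[0, \frac{\pi}{2}\right]$ is

(a) increasing (b) decreasing.

Solution We have
f(x) = sin 3x
or f′(x) = 3 cos 3x
Therefore, f′(x) = 0 gives cos 3x = 0 which in turn gives $3x = \frac{\pi}{2}, \frac{3\pi}{2}$ (as $x \in \left[0, \frac{\pi}{2}\right]$ implies
$3x \in \left[0, \frac{3\pi}{2}\right]$). So $x = \frac{\pi}{6}$ and $\frac{\pi}{2}$. The point $x = \frac{\pi}{6}$ divides the interval $\left[0, \frac{\pi}{2}\right]$ into two disjoint
intervals $\left[0, \frac{\pi}{6}\right)$ and $\left(\frac{\pi}{6}, \frac{\pi}{2}\right]$.

Now, f′(x) > 0 for all $x \in \left[0, \frac{\pi}{6}\right)$ as $0 \le x < \frac{\pi}{6} \Rightarrow 0 \le 3x < \frac{\pi}{2}$, and f′(x) < 0 for all
$x \in \left(\frac{\pi}{6}, \frac{\pi}{2}\right)$ as $\frac{\pi}{6} < x < \frac{\pi}{2} \Rightarrow \frac{\pi}{2} < 3x < \frac{3\pi}{2}$.

Therefore, f is increasing in $\left[0, \frac{\pi}{6}\right)$ and decreasing in $\left(\frac{\pi}{6}, \frac{\pi}{2}\right)$.
Also, the given function is continuous at x = 0 and $x = \frac{\pi}{6}$. Therefore, by Theorem 1, f is increasing on
$\left[0, \frac{\pi}{6}\right]$ and decreasing on $\left[\frac{\pi}{6}, \frac{\pi}{2}\right]$.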

Example 13 Find the intervals in which the function f given by


f(x) = sin x + cos x, 0 ≤ x ≤ 2π
is increasing or decreasing.
Solution We have
f(x) = sin x + cos x,
or f′(x) = cos x − sin x
Now f′(x) = 0 gives sin x = cos x which gives that $x = \frac{\pi}{4}, \frac{5\pi}{4}$ as 0 ≤ x ≤ 2π
The points $x = \frac{\pi}{4}$ and $x = \frac{5\pi}{4}$ divide the interval [0, 2π] into three disjoint intervals, namely
$\left[0, \frac{\pi}{4}\right)$, $\left(\frac{\pi}{4}, \frac{5\pi}{4}\right)$ and $\left(\frac{5\pi}{4}, 2\pi\right]$.

Note that f′(x) > 0 if $x \in \left[0, \frac{\pi}{4}\right) \cup \left(\frac{5\pi}{4}, 2\pi\right]$
or f is increasing in the intervals $\left[0, \frac{\pi}{4}\right)$ and $\left(\frac{5\pi}{4}, 2\pi\right]$
Also f′(x) < 0 if $x \in \left(\frac{\pi}{4}, \frac{5\pi}{4}\right)$
or f is decreasing in $\left(\frac{\pi}{4}, \frac{5\pi}{4}\right)$

Interval            Sign of f′(x)       Nature of function
[0, π/4)            > 0                 f is increasing
(π/4, 5π/4)         < 0                 f is decreasing
(5π/4, 2π]          > 0                 f is increasing

Tangents and Normals


In this section, we shall use differentiation to find the equation of the tangent line and the normal line
to a curve at a given point.
Recall that the equation of a straight line passing through a given point (x0, y0) having finite slope m is
given by
y - y0 = m (x – x0)
Note that the slope of the tangent to the curve y = f(x) at the point $(x_0, y_0)$ is given by
$\left.\frac{dy}{dx}\right]_{(x_0, y_0)}$ $(= f'(x_0))$.

So the equation of the tangent at $(x_0, y_0)$ to the curve y = f(x) is given by
$y - y_0 = f'(x_0)(x - x_0)$
Also, since the normal is perpendicular to the tangent, the slope of the normal to the curve y = f(x) at
$(x_0, y_0)$ is $\frac{-1}{f'(x_0)}$, if $f'(x_0) \neq 0$. Therefore, the equation of the normal to the curve y = f(x) at
$(x_0, y_0)$ is given by

$y - y_0 = \frac{-1}{f'(x_0)}(x - x_0)$

i.e. $(y - y_0)\,f'(x_0) + (x - x_0) = 0$
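The tangent and normal equations can be packaged as a small helper. The sketch below is illustrative only: `tangent_and_normal` is a hypothetical function that returns each line as a (slope, intercept) pair, using slope f′(x0) for the tangent and −1/f′(x0) for the normal, and the cubic y = x³ − x at x0 = 2 is an arbitrary test curve.

```python
def tangent_and_normal(f, df, x0):
    """Return (slope, intercept) pairs for the tangent and normal at x0.

    The normal is returned as None when the tangent is horizontal,
    because the normal is then the vertical line x = x0.
    """
    y0 = f(x0)
    m = df(x0)
    tangent = (m, y0 - m * x0)           # y = m x + (y0 - m x0)
    if m == 0:
        normal = None
    else:
        normal = (-1 / m, y0 + x0 / m)   # y = (-1/m) x + (y0 + x0/m)
    return tangent, normal

tangent, normal = tangent_and_normal(lambda x: x ** 3 - x,
                                     lambda x: 3 * x ** 2 - 1,
                                     2.0)
print(tangent, normal)
```

Returning `None` for a vertical normal is one possible design choice; a representation of lines as ax + by + c = 0 would avoid the special case.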

[Note] If a tangent line to the curve y = f(x) makes an angle θ with x-axis in the positive direction, then
$\frac{dy}{dx}$ = slope of the tangent = tan θ.
Particular cases
(i) If slope of the tangent line is zero, then tan θ = 0 and so θ = 0 which means the tangent line is
parallel to the x-axis. In this case, the equation of the tangent at the point $(x_0, y_0)$ is given by $y = y_0$.
(ii) If $\theta \to \frac{\pi}{2}$, then tan θ → ∞, which means the tangent line is perpendicular to the x-axis, i.e., parallel to
the y-axis. In this case, the equation of the tangent at $(x_0, y_0)$ is given by $x = x_0$ (Why?).
Example 14 Find the slope of the tangent to the curve $y = x^3 - x$ at x = 2.
Solution The slope of the tangent at x = 2 is given by
$\left.\frac{dy}{dx}\right]_{x=2} = \left.3x^2 - 1\right]_{x=2} = 11$.
Example 15 Find the point at which the tangent to the curve $y = \sqrt{4x - 3} - 1$ has its slope $\frac{2}{3}$.
Solution Slope of tangent to the given curve at (x, y) is
$\frac{dy}{dx} = \frac{1}{2}(4x - 3)^{-1/2} \cdot 4 = \frac{2}{\sqrt{4x - 3}}$
The slope is given to be $\frac{2}{3}$.
So $\frac{2}{\sqrt{4x - 3}} = \frac{2}{3}$

or 4x − 3 = 9
or x = 3
Now $y = \sqrt{4x - 3} - 1$. So when x = 3, $y = \sqrt{4(3) - 3} - 1 = 2$
Therefore, the required point is (3, 2).
Example 16 Find the equation of all lines having slope 2 and being tangent to the curve $y + \frac{2}{x - 3} = 0$.
Solution Slope of the tangent to the given curve at any point (x, y) is given by
$\frac{dy}{dx} = \frac{2}{(x - 3)^2}$
But the slope is given to be 2. Therefore
$\frac{2}{(x - 3)^2} = 2$
or $(x - 3)^2 = 1$
or x − 3 = ±1
or x = 2, 4
Now x = 2 gives y = 2 and x = 4 gives y = −2. Thus, there are two tangents to the given curve with
slope 2 and passing through the points (2, 2) and (4, −2).
The equation of tangent through (2, 2) is given by
y − 2 = 2(x − 2)
or y − 2x + 2 = 0
and the equation of the tangent through (4, −2) is given by
y − (−2) = 2(x − 4)
or y − 2x + 10 = 0
Example 17 Find points on the curve $\frac{x^2}{4} + \frac{y^2}{25} = 1$ at which the tangents are (i) parallel to x-axis
(ii) parallel to y-axis.
Solution Differentiating $\frac{x^2}{4} + \frac{y^2}{25} = 1$ with respect to x, we get
$\frac{x}{2} + \frac{2y}{25}\frac{dy}{dx} = 0$
or $\frac{dy}{dx} = \frac{-25}{4}\,\frac{x}{y}$
(i) Now, the tangent is parallel to the x-axis if the slope of the tangent is zero which gives
$\frac{-25x}{4y} = 0$. This is possible if x = 0. Then $\frac{x^2}{4} + \frac{y^2}{25} = 1$ for x = 0 gives $y^2 = 25$, i.e., y = ±5.
Thus, the points at which the tangents are parallel to the x-axis are (0, 5) and (0, −5).
(ii) The tangent line is parallel to y-axis if the slope of the normal is 0 which gives $\frac{4y}{25x} = 0$, i.e.,
y = 0. Therefore, $\frac{x^2}{4} + \frac{y^2}{25} = 1$ for y = 0 gives x = ±2. Hence, the points at which the tangents are
parallel to the y-axis are (2, 0) and (−2, 0).
Example 18 Find the equation of the tangent to the curve $y = \frac{x - 7}{(x - 2)(x - 3)}$ at the point where it cuts
the x-axis.

Solution Note that on x-axis, y = 0. So the equation of the curve, when y = 0, gives x = 7. Thus, the
curve cuts the x-axis at (7, 0). Now differentiating the equation of the curve with respect to x,
we obtain
$\frac{dy}{dx} = \frac{1 - y(2x - 5)}{(x - 2)(x - 3)}$ (Why?)
or $\left.\frac{dy}{dx}\right]_{(7, 0)} = \frac{1 - 0}{(5)(4)} = \frac{1}{20}$
Therefore, the slope of the tangent at (7, 0) is $\frac{1}{20}$. Hence, the equation of the tangent at (7, 0) is
$y - 0 = \frac{1}{20}(x - 7)$ or 20y − x + 7 = 0
Example 19 Find the equations of the tangent and normal to the curve $x^{2/3} + y^{2/3} = 2$ at (1, 1).
Solution Differentiating $x^{2/3} + y^{2/3} = 2$ with respect to x, we get
$\frac{2}{3}x^{-1/3} + \frac{2}{3}y^{-1/3}\frac{dy}{dx} = 0$
or $\frac{dy}{dx} = -\left(\frac{y}{x}\right)^{1/3}$
Therefore, the slope of the tangent at (1, 1) is $\left.\frac{dy}{dx}\right]_{(1, 1)} = -1$

So the equation of the tangent at (1, 1) is

y − 1 = −1(x − 1) or y + x − 2 = 0
Also, the slope of the normal at (1, 1) is given by
$\frac{-1}{\text{slope of the tangent at } (1, 1)} = 1$
Therefore, the equation of the normal at (1, 1) is
y − 1 = 1(x − 1) or y − x = 0
Example 20 Find the equation of tangent to the curve given by
$x = a\sin^3 t$, $y = b\cos^3 t$ ... (1)
at a point where $t = \frac{\pi}{2}$.
Solution Differentiating (1) with respect to t, we get
$\frac{dx}{dt} = 3a\sin^2 t \cos t$ and $\frac{dy}{dt} = -3b\cos^2 t \sin t$
or $\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{-3b\cos^2 t \sin t}{3a\sin^2 t \cos t} = \frac{-b\cos t}{a\sin t}$
Therefore, slope of the tangent at $t = \frac{\pi}{2}$ is
$\left.\frac{dy}{dx}\right]_{t = \pi/2} = \frac{-b\cos\frac{\pi}{2}}{a\sin\frac{\pi}{2}} = 0$
Also, when $t = \frac{\pi}{2}$, x = a and y = 0. Hence, the equation of tangent to the given curve at $t = \frac{\pi}{2}$, i.e.,
at (a, 0) is
y − 0 = 0(x − a), i.e., y = 0.
Approximations
In this section, we will use differentials to approximate values of certain quantities.
Let f : D → R, D ⊂ R, be a given function and let y = f(x). Let ∆x denote a small increment in x. Recall
that the increment in y corresponding to the increment in x, denoted by ∆y, is given by
∆y = f(x + ∆x) − f(x). We define the following
(i) The differential of x, denoted by dx, is defined by dx = ∆x.
(ii) The differential of y, denoted by dy, is defined by dy = f′(x) dx or $dy = \left(\frac{dy}{dx}\right)\Delta x$.
* Two curves intersect at right angle if the tangents to the curves at the point of intersection are
perpendicular to each other.
In case dx = ∆x is relatively small when compared with x, dy is a good approximation of ∆y and we
denote it by dy ≈ ∆y.
For geometrical meaning of ∆x, ∆y, dx and dy, one may refer to Figure.
[Note] In view of the above discussion and Figure, we may note that the differential of the dependent
variable is not equal to the increment of the variable, whereas the differential of the independent variable
is equal to the increment of the variable.
Example 21 Use differential to approximate $\sqrt{36.6}$.
Solution Take $y = \sqrt{x}$. Let x = 36 and let ∆x = 0.6. Then
$\Delta y = \sqrt{x + \Delta x} - \sqrt{x} = \sqrt{36.6} - \sqrt{36} = \sqrt{36.6} - 6$
or $\sqrt{36.6} = 6 + \Delta y$
Now dy is approximately equal to ∆y and is given by
$dy = \left(\frac{dy}{dx}\right)\Delta x = \frac{1}{2\sqrt{x}}(0.6) = \frac{1}{2\sqrt{36}}(0.6) = 0.05$ (as $y = \sqrt{x}$)

Thus, the approximate value of $\sqrt{36.6}$ is 6 + 0.05 = 6.05.
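The differential approximation f(x + ∆x) ≈ f(x) + f′(x)∆x is easy to compare against the true value. The following is a minimal sketch; `linear_approx` is a hypothetical helper, and the comparison with `math.sqrt` is an added check, not part of the worked solution.

```python
import math

def linear_approx(f, df, x, dx):
    # First-order (differential) approximation of f(x + dx).
    return f(x) + df(x) * dx

approx = linear_approx(math.sqrt, lambda x: 1 / (2 * math.sqrt(x)), 36.0, 0.6)
exact = math.sqrt(36.6)
print(approx, exact, abs(approx - exact))
```

The gap between the two values shows how good the approximation is when ∆x is small compared with x.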


Example 22 Use differential to approximate $(25)^{1/3}$.
Solution Let $y = x^{1/3}$. Let x = 27 and let ∆x = −2. Then
$\Delta y = (x + \Delta x)^{1/3} - x^{1/3} = (25)^{1/3} - (27)^{1/3} = (25)^{1/3} - 3$
or $(25)^{1/3} = 3 + \Delta y$
Now dy is approximately equal to ∆y and is given by
$dy = \left(\frac{dy}{dx}\right)\Delta x = \frac{1}{3x^{2/3}}(-2)$ (as $y = x^{1/3}$)
$= \frac{1}{3\left((27)^{1/3}\right)^2}(-2) = \frac{-2}{27} = -0.074$
Thus, the approximate value of $(25)^{1/3}$ is given by
3 + (−0.074) = 2.926
Example 23 Find the approximate value of f(3.02), where $f(x) = 3x^2 + 5x + 3$.
Solution Let x = 3 and ∆x = 0.02. Then
$f(3.02) = f(x + \Delta x) = 3(x + \Delta x)^2 + 5(x + \Delta x) + 3$
Note that ∆y = f(x + ∆x) − f(x). Therefore
f(x + ∆x) = f(x) + ∆y
≈ f(x) + f′(x) ∆x (as dx = ∆x)
or $f(3.02) \approx (3x^2 + 5x + 3) + (6x + 5)\,\Delta x$
$= (3(3)^2 + 5(3) + 3) + (6(3) + 5)(0.02)$ (as x = 3, ∆x = 0.02)
= (27 + 15 + 3) + (18 + 5)(0.02)
= 45 + 0.46 = 45.46
Hence, approximate value of f(3.02) is 45.46.
Example 24 Find the approximate change in the volume V of a cube of side x meters caused by
increasing the side by 2%.
Solution Note that
$V = x^3$
or $dV = \left(\frac{dV}{dx}\right)\Delta x = (3x^2)\,\Delta x$

$= (3x^2)(0.02x) = 0.06x^3$ m³ (as 2% of x is 0.02x)

Thus, the approximate change in volume is $0.06x^3$ m³.
Example 25 If the radius of a sphere is measured as 9 cm with an error of 0.03 cm, then find the
approximate error in calculating its volume.
Solution Let r be the radius of the sphere and ∆r be the error in measuring the radius. Then r = 9 cm
and ∆r = 0.03 cm. Now, the volume V of the sphere is given by
$V = \frac{4}{3}\pi r^3$
or $\frac{dV}{dr} = 4\pi r^2$
Therefore $dV = \left(\frac{dV}{dr}\right)\Delta r = (4\pi r^2)\,\Delta r$

$= 4\pi(9)^2(0.03) = 9.72\pi$ cm³

Thus, the approximate error in calculating the volume is 9.72π cm³.
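The error estimate obtained from the differential can be compared with the actual change in volume. The sketch below is illustrative; `sphere_volume` is a hypothetical helper, and the "actual" change is simply V(r + ∆r) − V(r) for the values used in the example.

```python
import math

def sphere_volume(r):
    return 4 / 3 * math.pi * r ** 3

r, dr = 9.0, 0.03                 # measured radius and its error, in cm

dV = 4 * math.pi * r ** 2 * dr    # differential estimate (4 pi r^2) * dr
actual = sphere_volume(r + dr) - sphere_volume(r)
print(dV, actual, actual - dV)
```

The differential slightly underestimates the actual change here because it ignores the higher-order terms in ∆r.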

Maxima and Minima


In this section, we will use the concept of derivatives to calculate the maximum or minimum values of
various functions. In fact, we will find the ‘turning points’ of the graph of a function and thus find
points at which the graph reaches its highest (or lowest) value locally. The knowledge of such points is very
useful in sketching the graph of a given function. Further, we will also find the absolute maximum and
absolute minimum of a function that are necessary for the solution of many applied problems.
Let us consider the following problems that arise in day to day life.
(i) The profit from a grove of orange trees is given by $P(x) = ax + bx^2$, where a, b are constants and x is
the number of orange trees per acre. How many trees per acre will maximise the profit?
(ii) A ball, thrown into the air from a building 60 metres high, travels along a path given by
$h(x) = 60 + x - \frac{x^2}{60}$, where x is the horizontal distance from the building and h(x) is the height of the
ball. What is the maximum height the ball will reach?
(iii) An Apache helicopter of enemy is flying along the path given by the curve f(x) = x2 + 7. A soldier,
placed at the point (1, 2), wants to shoot the helicopter when it is nearest to him. What is the nearest
distance?
In each of the above problem, there is something common, i.e., we wish to find out the maximum or
minimum values of the given functions. In order to tackle such problems, we first formally define
maximum or minimum values of a function, points of local maxima and minima and test for determining
such points.
Definition 3 Let f be a function defined on an interval I. Then
(a) f is said to have a maximum value in I, if there exists a point c in I such that f(c) ≥ f(x), for all x ∈ I.
The number f(c) is called the maximum value of f in I and the point c is called a point of maximum
value of f in I.
(b) f is said to have a minimum value in I, if there exists a point c in I such that f(c) ≤ f(x), for all x ∈ I.
The number f(c), in this case, is called the minimum value of f in I and the point c, in this case, is called
a point of minimum value of f in I.
(c) f is said to have an extreme value in I if there exists a point c in I such that f(c) is either a maximum
value or a minimum value of f in I.
The number f(c), in this case, is called an extreme value of f in I and the point c is called an extreme
point.
Remark In Figure (a), (b) and (c), we have exhibited that graphs of certain particular functions help us
to find maximum value and minimum value at a point. In fact, through graphs, we can even find the
maximum/minimum value of a function at a point at which it is not even differentiable.

Example 26 Find the maximum and the minimum values, if any, of the function f given by
f(x) = x2, x ∈ R.

Solution From the graph of the given function , we have f(x) = 0 if x = 0. Also
f(x) ≥ 0, for all x ∈ R.
Therefore, the minimum value of f is 0 and the point of minimum value of f is x = 0. Further, it may be
observed from the graph of the function that f has no maximum value and hence no point of maximum
value of f in R.
[Note] If we restrict the domain of f to [−2, 1] only, then f will have maximum value $(-2)^2 = 4$ at x = −2.

Example 27 Find the maximum and minimum values of f, if any, of the function given by f(x) = | x |, x ∈
R.
Solution From the graph of the given function, note that
f(x) ≥ 0, for all x ∈ R and f(x) = 0 if x = 0.
Therefore, the function f has a minimum value 0 and the point of minimum value of f is x = 0. Also,
the graph clearly shows that f has no maximum value in R and hence no point of maximum value in R.

[Note]
(i) If we restrict the domain of f to [-2, 1] only, then f will have maximum value |-2| = 2.
(ii) One may note that the function f in Example 27 is not differentiable at x = 0.
Example 28 Find the maximum and the minimum values, if any, of the function given by
f(x) = x, x ∈ (0, 1).
Solution The given function is an increasing (strictly) function in the given interval (0, 1). From the graph
of the function f, it seems that it should have the minimum value at a point closest to 0 on its right
and the maximum value at a point closest to 1 on its left. Are such points available? Of course, not. It
is not possible to locate such points. In fact, if a point $x_0$ is closest to 0, then we find $\frac{x_0}{2} < x_0$ for all
$x_0 \in (0, 1)$. Also, if $x_1$ is closest to 1, then $\frac{x_1 + 1}{2} > x_1$ for all $x_1 \in (0, 1)$.
Therefore, the given function has neither the maximum value nor the minimum value in the interval (0,
1).

Remark The reader may observe that in Example 28, if we include the points 0 and 1 in the domain of
f, i.e., if we extend the domain of f to [0, 1], then the function f has minimum value 0 at x = 0 and
maximum value 1 at x = 1. In fact, we have the following results (the proofs of these results are beyond
the scope of the present text).
Every monotonic function assumes its maximum/minimum value at the end points of the domain of
definition of the function.
A more general result is
Every continuous function on a closed interval has a maximum and a minimum value.
[Note] By a monotonic function f in an interval I, we mean that f is either increasing in I or decreasing
in I.
Maximum and minimum values of a function defined on a closed interval will be discussed later in this
section.
Let us now examine the graph of a function as shown in Figure. Observe that at points A, B, C and D
on the graph, the function changes its nature from decreasing to increasing or vice-versa. These points
may be called turning points of the given function. Further, observe that at turning points, the graph
has either a little hill or a little valley. Roughly speaking, the function has minimum value in some
neighbourhood (interval) of each of the points A and C which are at the bottom of their respective

valleys. Similarly, the function has maximum value in some neighbourhood of points B and D which are
at the top of their respective hills. For this reason, the points A and C may be regarded as points of
local minimum value (or relative minimum value) and points B and D may be regarded as points of

local maximum value (or relative maximum value) for the function. The local maximum value and local
minimum value of the function are referred to as local maxima and local minima, respectively, of the
function.
We now formally give the following definition
Definition 4 Let f be a real valued function and let c be an interior point in the domain of f. Then
(a) c is called a point of local maxima if there is an h > 0 such that
f(c) ≥ f(x), for all x in (c - h, c + h), x ≠ c.
The value f(c) is called the local maximum value of f.
(b) c is called a point of local minima if there is an h > 0 such that
f(c) ≤ f(x), for all x in (c - h, c + h).
The value f(c) is called the local minimum value of f.
Geometrically, the above definition states that if x = c is a point of local maxima of f, then the graph
of f around c will be as shown in Figure (a). Note that the function f is increasing (i.e., f'(x) > 0) in the
interval (c – h, c) and decreasing (i.e., f'(x) < 0) in the interval (c, c + h).
This suggests that f’(c) must be zero.

Similarly, if c is a point of local minima of f , then the graph of f around c will be as shown in Figure(b).
Here f is decreasing (i.e., f ′(x) < 0) in the interval (c – h, c) and increasing (i.e., f ′(x) > 0) in the
interval (c, c + h). This again suggests that f ′(c) must be zero.
The above discussion leads us to the following theorem (without proof).
Theorem 2 Let f be a function defined on an open interval I. Suppose c ∈ I is any point. If f has a local
maxima or a local minima at x = c, then either f ′(c) = 0 or f is not differentiable at c.
Remark The converse of above theorem need not be true, that is, a point at which the derivative
vanishes need not be a point of local maxima or local minima. For example, if f (x) = x³, then f ′(x) =
3x² and so f ′(0) = 0. But 0 is neither a point of local maxima nor a point of local minima.
[Note] A point c in the domain of a function f at which either f ′(c) = 0 or f is not differentiable is
called a critical point of f. Note that if f is continuous at c and f ′(c) = 0, then there exists an h > 0
such that f is differentiable in the interval (c – h, c + h).
We shall now give a working rule for finding points of local maxima or points of local minima using only
the first order derivatives.

Theorem 3 (First Derivative Test) Let f be a function defined on an open interval I. Let f be continuous
at a critical point c in I. Then
(i) If f'(x) changes sign from positive to negative as x increases through c, i.e., if f'(x) > 0 at every point
sufficiently close to and to the left of c, and f'(x) < 0 at every point sufficiently close to and to the right
of c, then c is a point of local maxima.
(ii) If f'(x) changes sign from negative to positive as x increases through c, i.e., if f'(x) < 0 at every point
sufficiently close to and to the left of c, and f'(x) > 0 at every point sufficiently close to and to the right
of c, then c is a point of local minima.
(iii) If f'(x) does not change sign as x increases through c, then c is neither a point of local maxima nor
a point of local minima. In fact, such a point is called a point of inflection.
[Note] If c is a point of local maxima of f, then f(c) is a local maximum value of f. Similarly, if c is a
point of local minima of f, then f(c) is a local minimum value of f.
The figures geometrically explain Theorem 3.

Example 29 Find all points of local maxima and local minima of the function f given by
f(x) = x³ - 3x + 3.
Solution We have
f(x) = x³ - 3x + 3
or f'(x) = 3x² - 3 = 3(x - 1)(x + 1)
or f'(x) = 0 at x = 1 and x = -1
Thus, x = ± 1 are the only critical points which could possibly be the points of local maxima and/or
local minima of f. Let us first examine the point x = 1.
Note that for values close to 1 and to the right of 1, f'(x) > 0 and for values close to 1 and to the left of
1, f'(x) < 0. Therefore, by first derivative test, x = 1 is a point of local minima and local minimum value
is f(1) = 1. In the case of x = -1, note that f’(x) > 0, for values close to and to the left of -1 and f'(x) < 0,
for values close to and to the right of - 1. Therefore, by first derivative test, x = - 1 is a point of local
maxima and local maximum value is f(-1) = 5.
Values of x | Sign of f'(x) = 3(x - 1)(x + 1)
Close to 1, to the right (say 1.1 etc.) | > 0
Close to 1, to the left (say 0.9 etc.) | < 0
Close to -1, to the right (say -0.9 etc.) | < 0
Close to -1, to the left (say -1.1 etc.) | > 0
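The sign check used in the first derivative test can be sketched in a few lines of Python. This is only an illustration, not part of the text: the helper name `classify` and the step h = 0.1 (matching the sample values 0.9, 1.1 used above) are assumptions, and a fixed step works only when no other critical point lies within h of c.

```python
def f_prime(x):
    # Derivative of f(x) = x^3 - 3x + 3 from Example 29.
    return 3 * (x - 1) * (x + 1)


def classify(df, c, h=0.1):
    """Classify a critical point c from the sign of df just left and right of c.

    h is a hypothetical step; it must be small enough that no other
    critical point lies in (c - h, c + h).
    """
    left, right = df(c - h), df(c + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "neither"


for c in (-1, 1):
    print(c, classify(f_prime, c))
```

Passing the derivative 6(x - 1)² of Example 30 to the same helper reports "neither" at c = 1, since the sign does not change there.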
Example 30 Find all the points of local maxima and local minima of the function f given by
f(x) = 2x³ - 6x² + 6x + 5.
Solution We have
f(x) = 2x³ - 6x² + 6x + 5
or f'(x) = 6x² - 12x + 6 = 6(x - 1)²
or f'(x) = 0 at x = 1
Thus, x = 1 is the only critical point of f. We shall now examine this point for local maxima and/or local
minima of f. Observe that f'(x) ≥ 0, for all x ∈ R and in particular f'(x) > 0, for values close to 1 and to
the left and to the right of 1. Therefore, by first derivative test, the point x = 1 is neither a point of local
maxima nor a point of local minima. Hence x = 1 is a point of inflexion.
Remark One may note that since f'(x), in Example 30, never changes its sign on R, graph of f has no
turning points and hence no point of local maxima or local minima.
We shall now give another test to examine local maxima and local minima of a given function. This test
is often easier to apply than the first derivative test.
Theorem 4 (Second Derivative Test) Let f be a function defined on an interval I and c ∈ I. Let f be twice
differentiable at c. Then
(i) x = c is a point of local maxima if f'(c) = 0 and f"(c) < 0
The value f(c) is local maximum value of f.
(ii) x = c is a point of local minima if f'(c) = 0 and f"(c) > 0
In this case, f(c) is local minimum value of f.
(iii) The test fails if f'(c) = 0 and f"(c) = 0.
In this case, we go back to the first derivative test and find whether c is a point of local maxima, local
minima or a point of inflexion.
[Note] By saying that f is twice differentiable at c, we mean that the second order derivative of f exists at c.
Example 31 Find local minimum value of the function f given by f(x) = 3 + |x|, x ∈ R.
Solution Note that the given function is not differentiable at x = 0. So, second derivative test fails. Let
us try first derivative test. Note that 0 is a critical point of f. Now to the left of 0, f(x) = 3 - x and so
f'(x) = -1 < 0. Also, to the right of 0, f(x) = 3 + x and so f'(x) = 1 > 0. Therefore, by first derivative test, x = 0 is a point of
local minima of f and local minimum value of f is f(0) = 3.
Example 32 Find local maximum and local minimum values of the function f given by
f(x) = 3x⁴ + 4x³ - 12x² + 12
Solution We have
f(x) = 3x⁴ + 4x³ - 12x² + 12
or f'(x) = 12x³ + 12x² - 24x = 12x(x - 1)(x + 2)
or f'(x) = 0 at x = 0, x = 1 and x = -2.
Now f''(x) = 36x² + 24x - 24 = 12(3x² + 2x - 2)
or f''(0) = -24 < 0, f''(1) = 36 > 0 and f''(-2) = 72 > 0.
Therefore, by second derivative test, x = 0 is a point of local maxima and local maximum value of f at
x = 0 is f(0) = 12, while x = 1 and x = -2 are the points of local minima and local minimum values of f
at x = 1 and -2 are f(1) = 7 and f(-2) = -20, respectively.
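The second derivative test of Example 32 can be checked mechanically. The following is a minimal sketch, assuming the derivatives are entered by hand exactly as computed above; the function names are hypothetical and chosen only for this illustration.

```python
def f(x):
    return 3 * x**4 + 4 * x**3 - 12 * x**2 + 12


def f1(x):
    # First derivative, entered by hand from the solution above.
    return 12 * x**3 + 12 * x**2 - 24 * x


def f2(x):
    # Second derivative, entered by hand from the solution above.
    return 36 * x**2 + 24 * x - 24


def second_derivative_test(c):
    """Apply Theorem 4 at a point c where f1(c) = 0."""
    if f1(c) != 0:
        raise ValueError("c is not a point where the first derivative vanishes")
    if f2(c) < 0:
        return "local maximum"
    if f2(c) > 0:
        return "local minimum"
    return "test fails"


for c in (0, 1, -2):
    print(c, second_derivative_test(c), f(c))
```

When the helper returns "test fails" (as it would for Example 33 at x = 1), one has to fall back on the first derivative test.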
Example 33 Find all the points of local maxima and local minima of the function f given by
f(x) = 2x³ - 6x² + 6x + 5.
Solution We have
f(x) = 2x³ - 6x² + 6x + 5
or f'(x) = 6x² - 12x + 6 = 6(x - 1)² and f''(x) = 12(x - 1).
Now f'(x) = 0 gives x = 1. Also f''(1) = 0. Therefore, the second derivative test fails in this case. So, we
shall go back to the first derivative test.
We have already seen (Example 30) that, using first derivative test, x = 1 is neither a point of local
maxima nor a point of local minima and so it is a point of inflexion.
Example 34 Find two positive numbers whose sum is 15 and the sum of whose squares is minimum.
Solution Let one of the numbers be x. Then the other number is (15 - x). Let S(x) denote the sum of
the squares of these numbers. Then
$S(x) = x^2 + (15 - x)^2 = 2x^2 - 30x + 225$
or $S'(x) = 4x - 30$ and $S''(x) = 4$.
Now $S'(x) = 0$ gives $x = \frac{15}{2}$. Also $S''\left(\frac{15}{2}\right) = 4 > 0$. Therefore, by second derivative test, $x = \frac{15}{2}$
is the point of local minima of S. Hence the sum of squares of the numbers is minimum when the numbers are $\frac{15}{2}$ and $15 - \frac{15}{2} = \frac{15}{2}$.
Remark Proceeding as in Example 34, one may prove that the two positive numbers whose sum is k
and the sum of whose squares is minimum are $\frac{k}{2}$ and $\frac{k}{2}$.
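The claim in the Remark follows by repeating the steps of Example 34 with 15 replaced by k. A short worked derivation, assuming k > 0 and 0 < x < k, might read:

```latex
\begin{align*}
S(x)  &= x^2 + (k - x)^2 = 2x^2 - 2kx + k^2, \\
S'(x) &= 4x - 2k, \qquad S''(x) = 4 > 0, \\
S'(x) = 0 &\implies x = \frac{k}{2}, \qquad k - x = \frac{k}{2}.
\end{align*}
```

Since S''(x) > 0 for all x, the second derivative test gives a local minimum at x = k/2, as in the case k = 15.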
Example 35 Find the shortest distance of the point (0, c) from the parabola $y = x^2$, where $\frac{1}{2} \le c \le 5$.
Solution Let (h, k) be any point on the parabola $y = x^2$. Let D be the required distance between (h, k) and (0, c).
Then
$D = \sqrt{(h - 0)^2 + (k - c)^2} = \sqrt{h^2 + (k - c)^2}$ … (1)
Since (h, k) lies on the parabola $y = x^2$, we have $k = h^2$. So (1) gives
$D \equiv D(k) = \sqrt{k + (k - c)^2}$
or $D'(k) = \frac{1 + 2(k - c)}{2\sqrt{k + (k - c)^2}}$
Now $D'(k) = 0$ gives $k = \frac{2c - 1}{2}$.
Observe that when $k < \frac{2c - 1}{2}$, then $2(k - c) + 1 < 0$, i.e., $D'(k) < 0$. Also when $k > \frac{2c - 1}{2}$, then $D'(k) > 0$.
So, by first derivative test, D(k) is minimum at $k = \frac{2c - 1}{2}$.
Hence, the required shortest distance is given by
$D\left(\frac{2c - 1}{2}\right) = \sqrt{\frac{2c - 1}{2} + \left(\frac{2c - 1}{2} - c\right)^2} = \frac{\sqrt{4c - 1}}{2}$
[Note] The reader may note that in Example 35, we have used first derivative test instead of the second
derivative test as the former is easy and short.
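The closed form √(4c - 1)/2 obtained in Example 35 can be compared with a brute-force search over k. This is only a numerical sanity check, not a proof; the grid size and the upper bound k_max = 10 are assumptions chosen so that the minimiser (2c - 1)/2 lies inside the grid for every c between 1/2 and 5.

```python
import math


def distance(k, c):
    # D(k) = sqrt(k + (k - c)^2), the distance from (0, c) to the point with y = k.
    return math.sqrt(k + (k - c) ** 2)


def closed_form(c):
    # Shortest distance derived in Example 35.
    return math.sqrt(4 * c - 1) / 2


def grid_minimum(c, steps=100_000, k_max=10.0):
    # Hypothetical brute-force search over k in [0, k_max] on a uniform grid.
    return min(distance(i * k_max / steps, c) for i in range(steps + 1))


for c in (0.5, 1.0, 2.5, 5.0):
    print(c, closed_form(c), grid_minimum(c))
```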
Example 36 Let AP and BQ be two vertical poles at points A and B, respectively. If AP = 16 m, BQ = 22
m and AB = 20 m, then find the distance of a point R on AB from the point A such that RP² + RQ² is
minimum.
Solution Let R be a point on AB such that AR = x m.
Then RB = (20 - x) m (as AB = 20 m). From Figure, we have
RP² = AR² + AP²
and RQ² = RB² + BQ²
Therefore RP² + RQ² = AR² + AP² + RB² + BQ²
= x² + (16)² + (20 - x)² + (22)²
= 2x² - 40x + 1140
Let S ≡ S(x) = RP² + RQ² = 2x² - 40x + 1140.
Therefore S'(x) = 4x - 40.
Now S'(x) = 0 gives x = 10. Also S''(x) = 4 > 0, for all x and so S''(10) > 0.
Therefore, by second derivative test, x = 10 is the point of local minima of S. Thus, the distance of R
from A on AB is AR = x = 10 m.

Example 37 If the lengths of the three sides of a trapezium other than the base are each equal to 10 cm, then find the
area of the trapezium when it is maximum.
Solution The required trapezium is as given in Figure. Draw perpendiculars DP and

CQ on AB. Let AP = x cm. Note that ΔAPD ~ ΔBQC. Therefore, QB = x cm. Also, by Pythagoras theorem,
$DP = QC = \sqrt{100 - x^2}$. Let A be the area of the trapezium. Then
$A \equiv A(x) = \frac{1}{2}\,(\text{sum of parallel sides})\,(\text{height})$
$= \frac{1}{2}(2x + 10 + 10)\sqrt{100 - x^2}$
$= (x + 10)\sqrt{100 - x^2}$
or $A'(x) = (x + 10)\frac{(-2x)}{2\sqrt{100 - x^2}} + \sqrt{100 - x^2}$
$= \frac{-2x^2 - 10x + 100}{\sqrt{100 - x^2}}$
Now $A'(x) = 0$ gives $2x^2 + 10x - 100 = 0$, i.e., x = 5 and x = -10.
Since x represents distance, it cannot be negative.
So, x = 5. Now
$A''(x) = \frac{\sqrt{100 - x^2}\,(-4x - 10) - (-2x^2 - 10x + 100)\frac{(-2x)}{2\sqrt{100 - x^2}}}{100 - x^2}$
$= \frac{2x^3 - 300x - 1000}{(100 - x^2)^{3/2}}$ (on simplification)
or $A''(5) = \frac{2(5)^3 - 300(5) - 1000}{(100 - (5)^2)^{3/2}} = \frac{-2250}{75\sqrt{75}} = \frac{-30}{\sqrt{75}} < 0$
Thus, area of trapezium is maximum at x = 5 and the area is given by
$A(5) = (5 + 10)\sqrt{100 - (5)^2} = 15\sqrt{75} = 75\sqrt{3}$ cm²
Example 38 Prove that the radius of the right circular cylinder of greatest curved surface area which
can be inscribed in a given cone is half of that of the cone.
Solution Let OC = r be the radius of the cone and OA = h be its height. Let a cylinder with radius OE =
x be inscribed in the given cone. The height QE of the cylinder is given by
$\frac{QE}{OA} = \frac{EC}{OC}$ (since ΔQEC ∼ ΔAOC)
or $\frac{QE}{h} = \frac{r - x}{r}$
or $QE = \frac{h(r - x)}{r}$
Let S be the curved surface area of the given cylinder. Then
$S \equiv S(x) = \frac{2\pi x h(r - x)}{r} = \frac{2\pi h}{r}(rx - x^2)$
or $S'(x) = \frac{2\pi h}{r}(r - 2x)$ and $S''(x) = \frac{-4\pi h}{r}$
Now $S'(x) = 0$ gives $x = \frac{r}{2}$. Since $S''(x) < 0$ for all x, $S''\left(\frac{r}{2}\right) < 0$. So $x = \frac{r}{2}$ is a point of maxima of S. Hence, the
radius of the cylinder of greatest curved surface area which can be inscribed in a given cone is half of
that of the cone.
Maximum and Minimum Values of a Function in a Closed Interval
Let us consider a function f given by
f(x) = x + 2, x ∈ (0, 1)
Observe that the function is continuous on (0, 1) and has neither a maximum value nor a minimum
value. Further, we may note that the function even has neither a local maximum value nor a local
minimum value.
However, if we extend the domain of f to the closed interval [0, 1], then f still may not have a local
maximum (minimum) value but it certainly does have maximum value 3 = f(1) and minimum value 2 =
f(0). The maximum value 3 of f at x = 1 is called absolute maximum value (global maximum or greatest
value) of f on the interval [0, 1]. Similarly, the minimum value 2 of f at x = 0 is called the absolute
minimum value (global minimum or least value) of f on [0, 1].
Consider the graph given in Figure of a continuous function defined on a closed interval [a, d]. Observe
that the function f has a local minima at x = b and local

minimum value is f(b). The function also has a local maxima at x = c and local maximum value is f(c).
Also from the graph, it is evident that f has absolute maximum value f(a) and absolute minimum value
f(d). Further note that the absolute maximum (minimum) value of f is different from local maximum
(minimum) value of f.
We will now state two results (without proof) regarding absolute maximum and absolute minimum
values of a function on a closed interval I.
Theorem 5 Let f be a continuous function on an interval I = [a, b]. Then f has the absolute maximum
value and f attains it at least once in I. Also, f has the absolute minimum value and attains it at least
once in I.
Theorem 6 Let f be a differentiable function on a closed interval I and let c be any interior point of I.
Then
(i) f'(c) = 0 if f attains its absolute maximum value at c.
(ii) f'(c) = 0 if f attains its absolute minimum value at c.
In view of the above results, we have the following working rule for finding absolute maximum and/or
absolute minimum values of a function in a given closed interval [a, b].
Working Rule
Step 1: Find all critical points of f in the interval, i.e., find points x where either f'(x) = 0 or f is not
differentiable.
Step 2: Take the end points of the interval.
Step 3: At all these points (listed in Step 1 and 2), calculate the values of f.
Step 4: Identify the maximum and minimum values of f out of the values calculated in Step 3. This
maximum value will be the absolute maximum (greatest) value of f and the minimum value will be the
absolute minimum (least) value of f.
Example 39 Find the absolute maximum and minimum values of a function f given by
f(x) = 2x³ - 15x² + 36x + 1 on the interval [1, 5].
Solution We have
f(x) = 2x³ - 15x² + 36x + 1
or f'(x) = 6x² - 30x + 36 = 6(x - 3)(x - 2)
Note that f'(x) = 0 gives x = 2 and x = 3.
We shall now evaluate the value of f at these points and at the end points of the interval [1, 5], i.e., at
x = 1, x = 2, x = 3 and at x = 5. So
f(1) = 2(1³) - 15(1²) + 36(1) + 1 = 24
f(2) = 2(2³) - 15(2²) + 36(2) + 1 = 29
f(3) = 2(3³) - 15(3²) + 36(3) + 1 = 28
f(5) = 2(5³) - 15(5²) + 36(5) + 1 = 56
Thus, we conclude that absolute maximum value of f on [1, 5] is 56, occurring at x = 5, and absolute
minimum value of f on [1, 5] is 24 which occurs at x = 1.
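The working rule lends itself to a direct sketch in Python, shown here for the function of Example 39. The helper name `absolute_extrema` is hypothetical, and the critical points are assumed to have been found by hand (Step 1); the code only carries out Steps 2 to 4.

```python
def f(x):
    return 2 * x**3 - 15 * x**2 + 36 * x + 1


def absolute_extrema(func, critical_points, a, b):
    """Compare func at the end points a, b and at the critical points inside (a, b)."""
    # Step 2: end points, plus the critical points from Step 1 that lie in the interval.
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    # Step 3: evaluate func at every candidate.
    values = {x: func(x) for x in candidates}
    # Step 4: pick the largest and the smallest value.
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])


print(absolute_extrema(f, [2, 3], 1, 5))
```

The rule relies on f being continuous on the closed interval; on an open interval, as in Example 28, the extreme values need not be attained and the comparison would be meaningless.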
Example 40 Find absolute maximum and minimum values of a function f given by
$f(x) = 12x^{4/3} - 6x^{1/3}$, $x \in [-1, 1]$
Solution We have
$f(x) = 12x^{4/3} - 6x^{1/3}$
or $f'(x) = 16x^{1/3} - \frac{2}{x^{2/3}} = \frac{2(8x - 1)}{x^{2/3}}$
Thus, $f'(x) = 0$ gives $x = \frac{1}{8}$. Further note that $f'(x)$ is not defined at x = 0. So the critical points are x = 0 and $x = \frac{1}{8}$.
Now evaluating the value of f at critical points x = 0, $\frac{1}{8}$ and at end points of the interval x = -1 and x = 1, we have
$f(-1) = 12(-1)^{4/3} - 6(-1)^{1/3} = 18$
$f(0) = 12(0) - 6(0) = 0$
$f\left(\frac{1}{8}\right) = 12\left(\frac{1}{8}\right)^{4/3} - 6\left(\frac{1}{8}\right)^{1/3} = \frac{-9}{4}$
$f(1) = 12(1)^{4/3} - 6(1)^{1/3} = 6$
Hence, we conclude that absolute maximum value of f is 18 that occurs at x = -1 and absolute minimum
value of f is $\frac{-9}{4}$ that occurs at $x = \frac{1}{8}$.
Example 41 An Apache helicopter of enemy is flying along the curve given by y = x² + 7. A soldier, placed
at (3, 7), wants to shoot down the helicopter when it is nearest to him. Find the nearest distance.
Solution For each value of x, the helicopter's position is at point (x, x² + 7). Therefore, the distance
between the helicopter and the soldier placed at (3, 7) is
$\sqrt{(x - 3)^2 + (x^2 + 7 - 7)^2}$, i.e., $\sqrt{(x - 3)^2 + x^4}$.
Let f(x) = (x - 3)² + x⁴
or f'(x) = 2(x - 3) + 4x³ = 2(x - 1)(2x² + 2x + 3)
Thus, f'(x) = 0 gives x = 1 or 2x² + 2x + 3 = 0 for which there are no real roots. Also, there are no end
points of the interval to be added to the set for which f' is zero, i.e., there is only one point, namely,
x = 1. The value of f at this point is given by f(1) = (1 – 3)² + (1)⁴ = 5. Thus, the distance between the
soldier and the helicopter is $\sqrt{f(1)} = \sqrt{5}$.
Note that $\sqrt{5}$ is either a maximum value or a minimum value. Since
$\sqrt{f(0)} = \sqrt{(0 - 3)^2 + (0)^4} = 3 > \sqrt{5}$,
it follows that $\sqrt{5}$ is the minimum value of $\sqrt{f(x)}$. Hence, $\sqrt{5}$ is the minimum distance between the soldier and
the helicopter.

7 Integrals
If a function f is differentiable in an interval I, i.e., its derivative f ′ exists at each point of I, then a
natural question arises that given f ′ at each point of I, can we determine the function? The functions
that could possibly have given function as a derivative are called anti derivatives (or primitives) of the
function. Further, the formula that gives all these anti derivatives is called the indefinite integral of
the function and such process of finding anti derivatives is called integration. Such type of problems
arise in many practical situations. For instance, if we know the instantaneous velocity of an object at
any instant, then there arises a natural question, i.e., can we determine the position of the object at
any instant? There are several such practical and theoretical situations where the process of integration
is involved. The development of integral calculus arises out of the efforts of solving the problems of
the following types:
(a) the problem of finding a function whenever its derivative is given,
(b) the problem of finding the area bounded by the graph of a function under certain conditions.
These two problems lead to the two forms of the integrals, namely, indefinite and definite integrals, which
together constitute the Integral Calculus.
There is a connection, known as the Fundamental Theorem of Calculus, between indefinite integral
and definite integral which makes the definite integral a practical tool for science and engineering.
The definite integral is also used to solve many interesting problems from various disciplines like
economics, finance and probability.

In this Chapter, we shall confine ourselves to the study of indefinite and definite integrals and their
elementary properties including some techniques of integration.
Integration as an Inverse Process of Differentiation
Integration is the inverse process of differentiation. Instead of differentiating a function, we are given
the derivative of a function and asked to find its primitive, i.e., the original function. Such a process is
called integration or anti differentiation.
Let us consider the following examples:
We know that $\frac{d}{dx}(\sin x) = \cos x$ … (1)
$\frac{d}{dx}\left(\frac{x^3}{3}\right) = x^2$ … (2)
and $\frac{d}{dx}(e^x) = e^x$ … (3)
We observe that in (1), the function cos x is the derived function of sin x. We say that sin x is an anti
derivative (or an integral) of cos x. Similarly, in (2) and (3), $\frac{x^3}{3}$ and $e^x$ are the anti derivatives (or integrals)
of $x^2$ and $e^x$, respectively. Again, we note that for any real number C, treated as constant function, its
derivative is zero and hence, we can write (1), (2) and (3) as follows:
$\frac{d}{dx}(\sin x + C) = \cos x$, $\frac{d}{dx}\left(\frac{x^3}{3} + C\right) = x^2$ and $\frac{d}{dx}(e^x + C) = e^x$

Thus, anti derivatives (or integrals) of the above cited functions are not unique. Actually, there exist
infinitely many anti derivatives of each of these functions which can be obtained by choosing C
arbitrarily from the set of real numbers. For this reason C is customarily referred to as arbitrary
constant. In fact, C is the parameter by varying which one gets different anti derivatives (or integrals)
of the given function.
More generally, if there is a function F such that $\frac{d}{dx}F(x) = f(x)$, ∀ x ∈ I (interval),
then for any arbitrary real number C (also called constant of integration),
$\frac{d}{dx}[F(x) + C] = f(x)$, x ∈ I
Thus, {F + C, C ∈ R} denotes a family of anti derivatives of f.
Remark Functions with same derivatives differ by a constant. To show this, let g and h be two functions
having the same derivatives on an interval I.
Consider the function f = g – h defined by f (x) = g (x) – h (x), ∀ x ∈ I
Then $\frac{df}{dx} = f' = g' - h'$ giving f′ (x) = g′ (x) – h′ (x) ∀ x ∈ I
or f′ (x) = 0, ∀ x ∈ I by hypothesis,
i.e., the rate of change of f with respect to x is zero on I and hence f is constant.
In view of the above remark, it is justified to infer that the family {F + C, C ∈ R} provides all possible
anti derivatives of f.

We introduce a new symbol, namely, ∫ f (x) dx which will represent the entire class of anti derivatives,
read as the indefinite integral of f with respect to x.
Symbolically, we write ∫ f (x) dx = F (x) + C.
Notation Given that $\frac{dy}{dx} = f(x)$, we write y = ∫ f (x) dx.

For the sake of convenience, we mention below the following symbols/terms/phrases with their
meanings as given in the Table

Symbols/Terms/Phrases | Meaning
∫ f (x) dx | Integral of f with respect to x
f (x) in ∫ f (x) dx | Integrand
x in ∫ f (x) dx | Variable of integration
Integrate | Find the integral
An integral of f | A function F such that F′(x) = f (x)
Integration | The process of finding the integral
Constant of Integration | Any real number C, considered as constant function

We already know the formulae for the derivatives of many important functions. From these formulae,
we can write down immediately the corresponding formulae (referred to as standard formulae) for the
integrals of these functions, as listed below which will be used to find integrals of other functions.
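Derivatives | Integrals (Anti derivatives)
(i) $\frac{d}{dx}\left(\frac{x^{n+1}}{n+1}\right) = x^n$; | $\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$, $n \neq -1$
Particularly, we note that $\frac{d}{dx}(x) = 1$; | $\int dx = x + C$
(ii) $\frac{d}{dx}(\sin x) = \cos x$; | $\int \cos x\,dx = \sin x + C$
(iii) $\frac{d}{dx}(-\cos x) = \sin x$; | $\int \sin x\,dx = -\cos x + C$
(iv) $\frac{d}{dx}(\tan x) = \sec^2 x$; | $\int \sec^2 x\,dx = \tan x + C$
(v) $\frac{d}{dx}(-\cot x) = \operatorname{cosec}^2 x$; | $\int \operatorname{cosec}^2 x\,dx = -\cot x + C$
(vi) $\frac{d}{dx}(\sec x) = \sec x \tan x$; | $\int \sec x \tan x\,dx = \sec x + C$
(vii) $\frac{d}{dx}(-\operatorname{cosec} x) = \operatorname{cosec} x \cot x$; | $\int \operatorname{cosec} x \cot x\,dx = -\operatorname{cosec} x + C$
(viii) $\frac{d}{dx}(\sin^{-1} x) = \frac{1}{\sqrt{1 - x^2}}$; | $\int \frac{dx}{\sqrt{1 - x^2}} = \sin^{-1} x + C$
(ix) $\frac{d}{dx}(-\cos^{-1} x) = \frac{1}{\sqrt{1 - x^2}}$; | $\int \frac{dx}{\sqrt{1 - x^2}} = -\cos^{-1} x + C$
(x) $\frac{d}{dx}(\tan^{-1} x) = \frac{1}{1 + x^2}$; | $\int \frac{dx}{1 + x^2} = \tan^{-1} x + C$
(xi) $\frac{d}{dx}(-\cot^{-1} x) = \frac{1}{1 + x^2}$; | $\int \frac{dx}{1 + x^2} = -\cot^{-1} x + C$
(xii) $\frac{d}{dx}(\sec^{-1} x) = \frac{1}{x\sqrt{x^2 - 1}}$; | $\int \frac{dx}{x\sqrt{x^2 - 1}} = \sec^{-1} x + C$
(xiii) $\frac{d}{dx}(-\operatorname{cosec}^{-1} x) = \frac{1}{x\sqrt{x^2 - 1}}$; | $\int \frac{dx}{x\sqrt{x^2 - 1}} = -\operatorname{cosec}^{-1} x + C$
(xiv) $\frac{d}{dx}(e^x) = e^x$; | $\int e^x\,dx = e^x + C$
(xv) $\frac{d}{dx}(\log|x|) = \frac{1}{x}$; | $\int \frac{1}{x}\,dx = \log|x| + C$
(xvi) $\frac{d}{dx}\left(\frac{a^x}{\log a}\right) = a^x$; | $\int a^x\,dx = \frac{a^x}{\log a} + C$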
Note In practice, we normally do not mention the interval over which the various functions are defined.
However, in any specific problem one has to keep it in mind.
Geometrical interpretation of indefinite integral
Let f (x) = 2x. Then ∫ f (x) dx = x² + C. For different values of C, we get different integrals. But these
integrals are very similar geometrically.
Thus, y = x² + C, where C is arbitrary constant, represents a family of integrals. By assigning different
values to C, we get different members of the family. These together constitute the indefinite integral.
In this case, each integral represents a parabola with its axis along y-axis.

Clearly, for C = 0, we obtain y = x², a parabola with its vertex on the origin. The curve y = x² + 1 for C =
1 is obtained by shifting the parabola y = x² one unit along y-axis in positive direction. For C = – 1, y =
x² – 1 is obtained by shifting the parabola y = x² one unit along y-axis in the negative direction. Thus,
for each positive value of C, each parabola of the family has its vertex on the positive side of the y-axis
and for negative values of C, each has its vertex along the negative side of the y-axis.
Let us consider the intersection of all these parabolas by a line x = a. We have taken a > 0. The same
is true when a < 0. If the line x = a intersects the parabolas y = x², y = x² + 1, y = x² + 2, y = x² – 1, y =
x² – 2 at P0, P1, P2, P–1, P–2 etc., respectively, then $\frac{dy}{dx}$ at these points equals 2a. This indicates that the
tangents to the curves at these points are parallel. Thus, ∫ 2x dx = x² + C = F_C (x) (say), implies that
the tangents to all the curves y = F_C (x), C ∈ R, at the points of intersection of the curves by the line x
= a, (a ∈ R), are parallel.

Further, the following equation (statement) ∫ f (x) dx = F (x) + C = y (say), represents a family of curves.
The different values of C will correspond to different members of this family and these members can
be obtained by shifting any one of the curves parallel to itself. This is the geometrical interpretation of
indefinite integral.
Some properties of indefinite integral

In this sub section, we shall derive some properties of indefinite integrals.
(I) The processes of differentiation and integration are inverses of each other in the sense of the following
results:
$\frac{d}{dx}\int f(x)\,dx = f(x)$
and $\int f'(x)\,dx = f(x) + C$, where C is any arbitrary constant.
Proof Let F be any anti derivative of f, i.e.,
$\frac{d}{dx}F(x) = f(x)$
Then $\int f(x)\,dx = F(x) + C$
Therefore $\frac{d}{dx}\int f(x)\,dx = \frac{d}{dx}(F(x) + C)$
$= \frac{d}{dx}F(x) = f(x)$
Similarly, we note that
$f'(x) = \frac{d}{dx}f(x)$
and hence $\int f'(x)\,dx = f(x) + C$
where C is arbitrary constant called constant of integration.
(II) Two indefinite integrals with the same derivative lead to the same family of curves and so they are
equivalent.
Proof Let f and g be two functions such that
$\frac{d}{dx}\int f(x)\,dx = \frac{d}{dx}\int g(x)\,dx$
or $\frac{d}{dx}\left[\int f(x)\,dx - \int g(x)\,dx\right] = 0$
Hence ∫ f (x) dx – ∫ g (x) dx = C, where C is any real number (Why?)
or ∫ f (x) dx = ∫ g (x) dx + C
So the families of curves {∫ f (x) dx + C1, C1 ∈ R}
and {∫ g (x) dx + C2, C2 ∈ R} are identical.
Hence, in this sense, ∫ f (x) dx and ∫ g(x) dx are equivalent.
Note The equivalence of the families {∫ f (x) dx + C1, C1 ∈ R} and {∫ g (x) dx + C2, C2 ∈ R} is customarily
expressed by writing ∫ f (x) dx = ∫ g (x) dx, without mentioning the parameter.
(III) ∫ [f (x) + g (x)] dx = ∫ f (x) dx + ∫ g (x) dx
Proof By Property (I), we have
$\frac{d}{dx}\left[\int [f(x) + g(x)]\,dx\right] = f(x) + g(x)$ … (1)
On the other hand, we find that
$\frac{d}{dx}\left[\int f(x)\,dx + \int g(x)\,dx\right] = \frac{d}{dx}\int f(x)\,dx + \frac{d}{dx}\int g(x)\,dx$
$= f(x) + g(x)$ … (2)
Thus, in view of Property (II), it follows by (1) and (2) that
$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$

(IV) For any real number k, $\int k f(x)\,dx = k\int f(x)\,dx$
Proof By the Property (I), $\frac{d}{dx}\int k f(x)\,dx = k f(x)$
Also $\frac{d}{dx}\left[k\int f(x)\,dx\right] = k\frac{d}{dx}\int f(x)\,dx = k f(x)$
Therefore, using the Property (II), we have ∫ k f (x) dx = k ∫ f (x) dx.
(V) Properties (III) and (IV) can be generalised to a finite number of functions f1, f2,..., fn and the real
numbers, k1, k2,..., kn giving
$\int [k_1 f_1(x) + k_2 f_2(x) + \dots + k_n f_n(x)]\,dx = k_1\int f_1(x)\,dx + k_2\int f_2(x)\,dx + \dots + k_n\int f_n(x)\,dx$.


To find an anti derivative of a given function, we search intuitively for a function whose derivative is
the given function. The search for the requisite function for finding an anti derivative is known as
integration by the method of inspection. We illustrate it through some examples.
Example 1 Write an anti derivative for each of the following functions using the method of inspection:
(i) cos 2x (ii) $3x^2 + 4x^3$ (iii) $\frac{1}{x}$, x ≠ 0
Solution
(i) We look for a function whose derivative is cos 2x. Recall that
$\frac{d}{dx}\sin 2x = 2\cos 2x$
or $\cos 2x = \frac{1}{2}\frac{d}{dx}(\sin 2x) = \frac{d}{dx}\left(\frac{1}{2}\sin 2x\right)$
Therefore, an anti derivative of cos 2x is $\frac{1}{2}\sin 2x$.
(ii) We look for a function whose derivative is $3x^2 + 4x^3$. Note that
$\frac{d}{dx}(x^3 + x^4) = 3x^2 + 4x^3$
Therefore, an anti derivative of $3x^2 + 4x^3$ is $x^3 + x^4$.
(iii) We know that
$\frac{d}{dx}(\log x) = \frac{1}{x}$, x > 0 and $\frac{d}{dx}[\log(-x)] = \frac{1}{-x}(-1) = \frac{1}{x}$, x < 0
Combining above, we get $\frac{d}{dx}(\log|x|) = \frac{1}{x}$, x ≠ 0
Therefore, $\int \frac{1}{x}\,dx = \log|x|$ is one of the anti derivatives of $\frac{1}{x}$.
Example 2 Find the following integrals:
(i) $\int \frac{x^3 - 1}{x^2}\,dx$ (ii) $\int (x^{2/3} + 1)\,dx$ (iii) $\int \left(x^{3/2} + 2e^x - \frac{1}{x}\right)dx$
Solution
(i) We have
$\int \frac{x^3 - 1}{x^2}\,dx = \int x\,dx - \int x^{-2}\,dx$ (by Property V)
$= \left(\frac{x^{1+1}}{1+1} + C_1\right) - \left(\frac{x^{-2+1}}{-2+1} + C_2\right)$; $C_1$, $C_2$ are constants of integration
$= \frac{x^2}{2} + C_1 - \frac{x^{-1}}{-1} - C_2 = \frac{x^2}{2} + \frac{1}{x} + C_1 - C_2$
$= \frac{x^2}{2} + \frac{1}{x} + C$, where $C = C_1 - C_2$ is another constant of integration.
Note From now onwards, we shall write only one constant of integration in the final answer.
(ii) We have
$\int (x^{2/3} + 1)\,dx = \int x^{2/3}\,dx + \int dx$
$= \frac{x^{\frac{2}{3}+1}}{\frac{2}{3}+1} + x + C = \frac{3}{5}x^{5/3} + x + C$
(iii) We have $\int \left(x^{3/2} + 2e^x - \frac{1}{x}\right)dx = \int x^{3/2}\,dx + \int 2e^x\,dx - \int \frac{1}{x}\,dx$
$= \frac{x^{\frac{3}{2}+1}}{\frac{3}{2}+1} + 2e^x - \log|x| + C$
$= \frac{2}{5}x^{5/2} + 2e^x - \log|x| + C$
Example 3 Find the following integrals:
(i) ∫ (sin x + cos x) dx (ii) ∫ cosec x (cosec x + cot x) dx
(iii) $\int \frac{1 - \sin x}{\cos^2 x}\,dx$
Solution
(i) We have
∫ (sin x + cos x) dx = ∫ sin x dx + ∫ cos x dx
= – cos x + sin x + C
(ii) We have
∫ cosec x (cosec x + cot x) dx = ∫ cosec² x dx + ∫ cosec x cot x dx
= – cot x – cosec x + C
(iii) We have
$\int \frac{1 - \sin x}{\cos^2 x}\,dx = \int \frac{1}{\cos^2 x}\,dx - \int \frac{\sin x}{\cos^2 x}\,dx$
= ∫ sec² x dx – ∫ tan x sec x dx
= tan x – sec x + C
Example 4 Find the anti derivative F of f defined by f (x) = 4x³ – 6, where F (0) = 3
Solution One anti derivative of f (x) is x⁴ – 6x since
$\frac{d}{dx}(x^4 - 6x) = 4x^3 - 6$
Therefore, the anti derivative F is given by
F(x) = x⁴ – 6x + C, where C is constant.
Given that F(0) = 3, which gives,
3 = 0 – 6 × 0 + C or C = 3
Hence, the required anti derivative is the unique function F defined by
F(x) = x⁴ – 6x + 3.
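The answer of Example 4 can be checked numerically: F should satisfy F(0) = 3 and its derivative should agree with f. The sketch below approximates F′ by a central difference; the helper name and the step h = 1e-5 are assumptions made for illustration, and agreement up to a small tolerance is evidence, not a proof.

```python
def f(x):
    # The given function of Example 4.
    return 4 * x**3 - 6


def F(x):
    # The anti derivative found in Example 4, with C = 3.
    return x**4 - 6 * x + 3


def central_difference(g, x, h=1e-5):
    # Hypothetical helper: approximates g'(x) by (g(x + h) - g(x - h)) / (2h).
    return (g(x + h) - g(x - h)) / (2 * h)


print("F(0) =", F(0))
for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(x, central_difference(F, x), f(x))
```

Replacing the constant 3 in F by any other number leaves the derivative check unchanged but breaks the condition F(0) = 3, which is how the additional condition singles out one member of the family.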
Remarks
(i) We see that if F is an anti derivative of f, then so is F + C, where C is any constant. Thus, if we know
one anti derivative F of a function f, we can write down an infinite number of anti derivatives of f by
adding any constant to F expressed by F(x) + C, C ∈ R. In applications, it is often necessary to satisfy
an additional condition which then determines a specific value of C giving unique anti derivative of the
given function.

(ii) Sometimes, F is not expressible in terms of elementary functions viz., polynomial, logarithmic,
exponential, trigonometric functions and their inverses etc. We are therefore blocked for finding ∫ f
(x) dx. For example, it is not possible to find
$\int e^{-x^2}\,dx$ by inspection since we cannot find a function whose derivative is $e^{-x^2}$.
(iii) When the variable of integration is denoted by a variable other than x, the integral formulae are
modified accordingly. For instance
$\int y^4\,dy = \frac{y^{4+1}}{4+1} + C = \frac{1}{5}y^5 + C$

Methods of Integration
In previous section, we discussed integrals of those functions which were readily obtainable from
derivatives of some functions. It was based on inspection, i.e., on the search of a function F whose
derivative is f which led us to the integral of f. However, this method, which depends on inspection, is
not very suitable for many functions. Hence, we need to develop additional techniques or methods for
finding the integrals by reducing them into standard forms. Prominent among them are methods based
on:
1. Integration by Substitution
2. Integration using Partial Fractions
3. Integration by Parts
Integration by substitution
In this section, we consider the method of integration by substitution.
The given integral ∫ f (x) dx can be transformed into another form by changing the independent variable
x to t by substituting x = g (t).
Consider I = ∫ f (x) dx
Put x = g(t) so that $\frac{dx}{dt} = g'(t)$.

We write dx = g′(t) dt

Thus I = ∫ f (x) dx = ∫ f (g (t)) g ′(t) dt


This change of variable formula is one of the important tools available to us in the name of integration
by substitution. It is often important to guess what will be the useful substitution. Usually, we make a
substitution for a function whose derivative also occurs in the integrand as illustrated in the following
examples.
Example 5 Integrate the following functions w.r.t. x:
(i) sin mx (ii) 2x sin (x² + 1)
(iii) $\frac{\tan^4\sqrt{x}\,\sec^2\sqrt{x}}{\sqrt{x}}$ (iv) $\frac{\sin(\tan^{-1}x)}{1 + x^2}$

Solution
(i) We know that the derivative of mx is m. Thus, we make the substitution mx = t so that m dx = dt.
Therefore, $\int \sin mx\,dx = \frac{1}{m}\int \sin t\,dt = -\frac{1}{m}\cos t + C = -\frac{1}{m}\cos mx + C$
(ii) Derivative of x2 + 1 is 2x. Thus, we use the substitution x2 + 1 = t so that 2x dx = dt.

Therefore, $\int 2x\sin(x^2 + 1)\,dx = \int \sin t\,dt = -\cos t + C = -\cos(x^2 + 1) + C$
(iii) The derivative of $\sqrt{x}$ is $\frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}}$. Thus, we use the substitution $\sqrt{x} = t$ so that $\frac{1}{2\sqrt{x}}\,dx = dt$, giving dx = 2t dt.
Thus, $\int \frac{\tan^4\sqrt{x}\,\sec^2\sqrt{x}}{\sqrt{x}}\,dx = \int \frac{2t\tan^4 t\,\sec^2 t}{t}\,dt = 2\int \tan^4 t\,\sec^2 t\,dt$
Again, we make another substitution tan t = u so that sec² t dt = du.
Therefore, $2\int \tan^4 t\,\sec^2 t\,dt = 2\int u^4\,du = 2\,\frac{u^5}{5} + C$
$= \frac{2}{5}\tan^5 t + C$ (since u = tan t)
$= \frac{2}{5}\tan^5\sqrt{x} + C$ (since t = √x)
Hence, $\int \frac{\tan^4\sqrt{x}\,\sec^2\sqrt{x}}{\sqrt{x}}\,dx = \frac{2}{5}\tan^5\sqrt{x} + C$

Alternatively, make the substitution tan √x = t


(iv) The derivative of $\tan^{-1}x$ is $\frac{1}{1+x^2}$. Thus, we use the substitution $\tan^{-1}x = t$ so that $\frac{dx}{1+x^2} = dt$.
Therefore, $\int \frac{\sin(\tan^{-1}x)}{1+x^2}\,dx = \int \sin t\,dt = -\cos t + C = -\cos(\tan^{-1}x) + C$
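An answer obtained by substitution can be sanity-checked numerically: for an anti derivative G of g, a numerical estimate of the integral of g over [a, b] should agree with G(b) − G(a). The sketch below does this for parts (ii) and (iv) of Example 5. The `simpson` helper and the interval [0, 1.5] are assumptions made for illustration only.

```python
import math

def simpson(func, a, b, n=1000):
    # Composite Simpson's rule with an even number n of subintervals
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# (integrand, claimed anti derivative) pairs from Example 5 (ii) and (iv)
cases = {
    "2x sin(x^2 + 1)": (
        lambda x: 2 * x * math.sin(x**2 + 1),
        lambda x: -math.cos(x**2 + 1),
    ),
    "sin(arctan x) / (1 + x^2)": (
        lambda x: math.sin(math.atan(x)) / (1 + x**2),
        lambda x: -math.cos(math.atan(x)),
    ),
}

a, b = 0.0, 1.5
errors = {
    name: abs(simpson(g, a, b) - (G(b) - G(a)))
    for name, (g, G) in cases.items()
}
for name, err in errors.items():
    print(name, "difference:", err)
```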
Now, we discuss some important integrals involving trigonometric functions and derive their standard
forms using the substitution technique. These will be used later without reference.
(i) ∫ tan xdx = log|sec x| + C
We have $\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx$
Put cos x = t so that sin x dx = −dt.
Then $\int \tan x\,dx = -\int \frac{dt}{t} = -\log|t| + C = -\log|\cos x| + C$
or $\int \tan x\,dx = \log|\sec x| + C$
(ii) ∫ cot xdx = log|sin x| + C
We have $\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx$
Put sin x = t so that cos x dx = dt.
Then $\int \cot x\,dx = \int \frac{dt}{t} = \log|t| + C = \log|\sin x| + C$
(iii) ∫ sec xdx = log|sec x + tan x| + C
We have $\int \sec x\,dx = \int \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x}\,dx$
Put sec x + tan x = t so that sec x (tan x + sec x) dx = dt.
Therefore, $\int \sec x\,dx = \int \frac{dt}{t} = \log|t| + C = \log|\sec x + \tan x| + C$
(iv) ∫ cosec xdx = log| cosec x − cot x| + C
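We have $\int \operatorname{cosec} x\,dx = \int \frac{\operatorname{cosec} x(\operatorname{cosec} x + \cot x)}{\operatorname{cosec} x + \cot x}\,dx$
Put cosec x + cot x = t so that −cosec x (cosec x + cot x) dx = dt.
So $\int \operatorname{cosec} x\,dx = -\int \frac{dt}{t} = -\log|t| + C = -\log|\operatorname{cosec} x + \cot x| + C$
$= -\log\left|\frac{\operatorname{cosec}^2 x - \cot^2 x}{\operatorname{cosec} x - \cot x}\right| + C$
$= \log|\operatorname{cosec} x - \cot x| + C$ (since cosec² x − cot² x = 1)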
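Each of the four standard results above can be checked by differentiating the right-hand side. The following sketch does this numerically with a central difference at a few points where all four functions are defined. The sample points and helper name are assumptions for illustration.

```python
import math

def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient approximating func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

# name: (claimed anti derivative, integrand)
pairs = {
    "tan x": (lambda x: math.log(abs(1 / math.cos(x))), math.tan),
    "cot x": (lambda x: math.log(abs(math.sin(x))), lambda x: 1 / math.tan(x)),
    "sec x": (
        lambda x: math.log(abs(1 / math.cos(x) + math.tan(x))),
        lambda x: 1 / math.cos(x),
    ),
    "cosec x": (
        lambda x: math.log(abs(1 / math.sin(x) - 1 / math.tan(x))),
        lambda x: 1 / math.sin(x),
    ),
}

# Points chosen away from multiples of pi/2, where some functions are undefined
points = [0.3, 0.7, 1.1, 2.0, 2.8]
max_error = max(
    abs(central_difference(G, x) - g(x))
    for G, g in pairs.values()
    for x in points
)
print("largest derivative mismatch:", max_error)
```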
Example 6 Find the following integrals:
(i) $\int \sin^3 x\cos^2 x\,dx$ (ii) $\int \frac{\sin x}{\sin(x + a)}\,dx$ (iii) $\int \frac{1}{1 + \tan x}\,dx$

Solution
(i) We have
$\int \sin^3 x\cos^2 x\,dx = \int \sin^2 x\cos^2 x\,(\sin x)\,dx$
$= \int (1 - \cos^2 x)\cos^2 x\,(\sin x)\,dx$
Put t = cos x so that dt = −sin x dx.
Therefore, $\int \sin^2 x\cos^2 x\,(\sin x)\,dx = -\int (1 - t^2)\,t^2\,dt$
$= -\int (t^2 - t^4)\,dt = -\left(\frac{t^3}{3} - \frac{t^5}{5}\right) + C$
$= -\frac{1}{3}\cos^3 x + \frac{1}{5}\cos^5 x + C$
(ii) Put x + a = t. Then dx = dt. Therefore
$\int \frac{\sin x}{\sin(x + a)}\,dx = \int \frac{\sin(t - a)}{\sin t}\,dt$
$= \int \frac{\sin t\cos a - \cos t\sin a}{\sin t}\,dt$
$= \cos a\int dt - \sin a\int \cot t\,dt$
$= (\cos a)\,t - (\sin a)\left[\log|\sin t| + C_1\right]$
$= (\cos a)(x + a) - (\sin a)\left[\log|\sin(x + a)| + C_1\right]$
$= x\cos a + a\cos a - (\sin a)\log|\sin(x + a)| - C_1\sin a$
Hence, $\int \frac{\sin x}{\sin(x + a)}\,dx = x\cos a - \sin a\,\log|\sin(x + a)| + C$,
where C = −C₁ sin a + a cos a is another arbitrary constant.


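(iii) $\int \frac{dx}{1 + \tan x} = \int \frac{\cos x\,dx}{\cos x + \sin x}$
$= \frac{1}{2}\int \frac{(\cos x + \sin x + \cos x - \sin x)\,dx}{\cos x + \sin x}$
$= \frac{1}{2}\int dx + \frac{1}{2}\int \frac{\cos x - \sin x}{\cos x + \sin x}\,dx$
$= \frac{x}{2} + \frac{C_1}{2} + \frac{1}{2}\int \frac{\cos x - \sin x}{\cos x + \sin x}\,dx$ … (1)
Now, consider $I = \int \frac{\cos x - \sin x}{\cos x + \sin x}\,dx$
Put cos x + sin x = t so that (cos x − sin x) dx = dt.
Therefore, $I = \int \frac{dt}{t} = \log|t| + C_2 = \log|\cos x + \sin x| + C_2$
Putting it in (1), we get
$\int \frac{dx}{1 + \tan x} = \frac{x}{2} + \frac{C_1}{2} + \frac{1}{2}\log|\cos x + \sin x| + \frac{C_2}{2}$
$= \frac{x}{2} + \frac{1}{2}\log|\cos x + \sin x| + \frac{C_1}{2} + \frac{C_2}{2}$
$= \frac{x}{2} + \frac{1}{2}\log|\cos x + \sin x| + C$, where $C = \frac{C_1}{2} + \frac{C_2}{2}$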
Integration using trigonometric identities
When the integrand involves some trigonometric functions, we use some known identities to find the
integral as illustrated through the following example.
Example 7 Find (i) ∫ cos² x dx (ii) ∫ sin 2x cos 3x dx (iii) ∫ sin³ x dx
Solution
(i) Recall the identity cos 2x = 2cos² x − 1, which gives
$\cos^2 x = \frac{1 + \cos 2x}{2}$
Therefore, $\int \cos^2 x\,dx = \frac{1}{2}\int (1 + \cos 2x)\,dx = \frac{1}{2}\int dx + \frac{1}{2}\int \cos 2x\,dx$
$= \frac{x}{2} + \frac{1}{4}\sin 2x + C$
(ii) Recall the identity $\sin x\cos y = \frac{1}{2}[\sin(x + y) + \sin(x - y)]$ (Why?)
Then $\int \sin 2x\cos 3x\,dx = \frac{1}{2}\left[\int \sin 5x\,dx - \int \sin x\,dx\right]$
$= \frac{1}{2}\left[-\frac{1}{5}\cos 5x + \cos x\right] + C$
$= -\frac{1}{10}\cos 5x + \frac{1}{2}\cos x + C$
(iii) From the identity sin 3x = 3 sin x − 4 sin³ x, we find that
$\sin^3 x = \frac{3\sin x - \sin 3x}{4}$
Therefore, $\int \sin^3 x\,dx = \frac{3}{4}\int \sin x\,dx - \frac{1}{4}\int \sin 3x\,dx$
$= -\frac{3}{4}\cos x + \frac{1}{12}\cos 3x + C$
Alternatively, $\int \sin^3 x\,dx = \int \sin^2 x\sin x\,dx = \int (1 - \cos^2 x)\sin x\,dx$
Put cos x = t so that −sin x dx = dt.
Therefore, $\int \sin^3 x\,dx = -\int (1 - t^2)\,dt = -\int dt + \int t^2\,dt = -t + \frac{t^3}{3} + C$
$= -\cos x + \frac{1}{3}\cos^3 x + C$
Remark It can be shown using trigonometric identities that both answers are equivalent.
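The equivalence claimed in the Remark follows from cos 3x = 4cos³ x − 3 cos x, and it can also be observed numerically. The sketch below evaluates both answers of Example 7 (iii), with the constant C omitted, on a grid of points and reports the largest gap. The function names and the grid are assumptions for illustration.

```python
import math

def answer_by_identity(x):
    # -(3/4) cos x + (1/12) cos 3x
    return -0.75 * math.cos(x) + math.cos(3 * x) / 12

def answer_by_substitution(x):
    # -cos x + (1/3) cos^3 x
    return -math.cos(x) + math.cos(x) ** 3 / 3

grid = [i * 0.1 for i in range(-50, 51)]
max_gap = max(abs(answer_by_identity(x) - answer_by_substitution(x)) for x in grid)
print("largest gap between the two answers:", max_gap)
```

In general two valid anti derivatives may differ by a nonzero constant; here the constant happens to be zero.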

Integrals of Some Particular Functions


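In this section, we mention below some important formulae of integrals and apply them for integrating
many other related standard integrals:
(1) $\int \frac{dx}{x^2 - a^2} = \frac{1}{2a}\log\left|\frac{x - a}{x + a}\right| + C$
(2) $\int \frac{dx}{a^2 - x^2} = \frac{1}{2a}\log\left|\frac{a + x}{a - x}\right| + C$
(3) $\int \frac{dx}{x^2 + a^2} = \frac{1}{a}\tan^{-1}\frac{x}{a} + C$
(4) $\int \frac{dx}{\sqrt{x^2 - a^2}} = \log\left|x + \sqrt{x^2 - a^2}\right| + C$
(5) $\int \frac{dx}{\sqrt{a^2 - x^2}} = \sin^{-1}\frac{x}{a} + C$
(6) $\int \frac{dx}{\sqrt{x^2 + a^2}} = \log\left|x + \sqrt{x^2 + a^2}\right| + C$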

We now prove the above results:
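(1) We have $\frac{1}{x^2 - a^2} = \frac{1}{(x - a)(x + a)} = \frac{1}{2a}\left[\frac{(x + a) - (x - a)}{(x - a)(x + a)}\right] = \frac{1}{2a}\left[\frac{1}{x - a} - \frac{1}{x + a}\right]$
Therefore, $\int \frac{dx}{x^2 - a^2} = \frac{1}{2a}\left[\int \frac{dx}{x - a} - \int \frac{dx}{x + a}\right]$
$= \frac{1}{2a}\left[\log|x - a| - \log|x + a|\right] + C$
$= \frac{1}{2a}\log\left|\frac{x - a}{x + a}\right| + C$
(2) In view of (1) above, we have
$\frac{1}{a^2 - x^2} = \frac{1}{2a}\left[\frac{(a + x) + (a - x)}{(a + x)(a - x)}\right] = \frac{1}{2a}\left[\frac{1}{a - x} + \frac{1}{a + x}\right]$
Therefore, $\int \frac{dx}{a^2 - x^2} = \frac{1}{2a}\left[\int \frac{dx}{a - x} + \int \frac{dx}{a + x}\right]$
$= \frac{1}{2a}\left[-\log|a - x| + \log|a + x|\right] + C$
$= \frac{1}{2a}\log\left|\frac{a + x}{a - x}\right| + C$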
Note The technique used in (1) will be explained in Section 7.5.
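(3) Put x = a tan θ. Then dx = a sec² θ dθ.
Therefore, $\int \frac{dx}{x^2 + a^2} = \int \frac{a\sec^2\theta\,d\theta}{a^2\tan^2\theta + a^2}$
$= \frac{1}{a}\int d\theta = \frac{1}{a}\theta + C = \frac{1}{a}\tan^{-1}\frac{x}{a} + C$
(4) Let x = a sec θ. Then dx = a sec θ tan θ dθ.
Therefore, $\int \frac{dx}{\sqrt{x^2 - a^2}} = \int \frac{a\sec\theta\tan\theta\,d\theta}{\sqrt{a^2\sec^2\theta - a^2}}$
$= \int \sec\theta\,d\theta = \log|\sec\theta + \tan\theta| + C_1$
$= \log\left|\frac{x}{a} + \sqrt{\frac{x^2}{a^2} - 1}\right| + C_1$
$= \log\left|x + \sqrt{x^2 - a^2}\right| - \log|a| + C_1$
$= \log\left|x + \sqrt{x^2 - a^2}\right| + C$, where C = C₁ − log |a|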


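(5) Let x = a sin θ. Then dx = a cos θ dθ.
Therefore, $\int \frac{dx}{\sqrt{a^2 - x^2}} = \int \frac{a\cos\theta\,d\theta}{\sqrt{a^2 - a^2\sin^2\theta}}$
$= \int d\theta = \theta + C = \sin^{-1}\frac{x}{a} + C$
(6) Let x = a tan θ. Then dx = a sec² θ dθ.
Therefore, $\int \frac{dx}{\sqrt{x^2 + a^2}} = \int \frac{a\sec^2\theta\,d\theta}{\sqrt{a^2\tan^2\theta + a^2}}$
$= \int \sec\theta\,d\theta = \log|\sec\theta + \tan\theta| + C_1$
$= \log\left|\frac{x}{a} + \sqrt{\frac{x^2}{a^2} + 1}\right| + C_1$
$= \log\left|x + \sqrt{x^2 + a^2}\right| - \log|a| + C_1$
$= \log\left|x + \sqrt{x^2 + a^2}\right| + C$, where C = C₁ − log |a|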
Applying these standard formulae, we now obtain some more formulae which are useful from the
applications point of view and can be applied directly to evaluate other integrals.
(7) To find the integral $\int \frac{dx}{ax^2 + bx + c}$, we write
$ax^2 + bx + c = a\left[x^2 + \frac{b}{a}x + \frac{c}{a}\right] = a\left[\left(x + \frac{b}{2a}\right)^2 + \left(\frac{c}{a} - \frac{b^2}{4a^2}\right)\right]$
Now, put $x + \frac{b}{2a} = t$ so that dx = dt, and write $\frac{c}{a} - \frac{b^2}{4a^2} = \pm k^2$. We find the integral reduced to the form $\frac{1}{a}\int \frac{dt}{t^2 \pm k^2}$, depending upon the sign of $\left(\frac{c}{a} - \frac{b^2}{4a^2}\right)$, and hence it can be evaluated.
(8) To find the integral of the type $\int \frac{dx}{\sqrt{ax^2 + bx + c}}$, proceeding as in (7), we obtain the integral using the standard formulae.
(9) To find the integral of the type $\int \frac{px + q}{ax^2 + bx + c}\,dx$, where p, q, a, b, c are constants, we are to find real numbers A, B such that
$px + q = A\,\frac{d}{dx}(ax^2 + bx + c) + B = A(2ax + b) + B$
To determine A and B, we equate from both sides the coefficients of x and the constant terms. A and B are thus obtained and hence the integral is reduced to one of the known forms.
(10) For the evaluation of the integral of the type $\int \frac{(px + q)\,dx}{\sqrt{ax^2 + bx + c}}$, we proceed as in (9) and transform the integral into known standard forms.
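The step in (9) and (10) of finding A and B amounts to solving 2aA = p and bA + B = q. A minimal sketch with exact rational arithmetic is given below; the function name `split_linear_numerator` is hypothetical and integer inputs are assumed.

```python
from fractions import Fraction

def split_linear_numerator(p, q, a, b):
    # Hypothetical helper: returns (A, B) with p*x + q = A*(2*a*x + b) + B.
    # Comparing coefficients of x gives 2aA = p; constants give bA + B = q.
    A = Fraction(p, 2 * a)
    B = Fraction(q) - A * b
    return A, B

# x + 2 over 2x^2 + 6x + 5 (a = 2, b = 6)
first = split_linear_numerator(1, 2, 2, 6)
# x + 3 over 5 - 4x - x^2 (a = -1, b = -4)
second = split_linear_numerator(1, 3, -1, -4)
print(first, second)
```

The two calls use the numerators and quadratics that appear in Example 10 of this section.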


Let us illustrate the above methods by some examples.
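Example 8 Find the following integrals:
(i) $\int \frac{dx}{x^2 - 16}$ (ii) $\int \frac{dx}{\sqrt{2x - x^2}}$
Solution
(i) We have $\int \frac{dx}{x^2 - 16} = \int \frac{dx}{x^2 - 4^2} = \frac{1}{8}\log\left|\frac{x - 4}{x + 4}\right| + C$ [by 7.4 (1)]
(ii) $\int \frac{dx}{\sqrt{2x - x^2}} = \int \frac{dx}{\sqrt{1 - (x - 1)^2}}$
Put x − 1 = t. Then dx = dt.
Therefore, $\int \frac{dx}{\sqrt{2x - x^2}} = \int \frac{dt}{\sqrt{1 - t^2}} = \sin^{-1}(t) + C$ [by 7.4 (5)]
$= \sin^{-1}(x - 1) + C$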
Example 9 Find the following integrals:
(i) $\int \frac{dx}{x^2 - 6x + 13}$ (ii) $\int \frac{dx}{3x^2 + 13x - 10}$ (iii) $\int \frac{dx}{\sqrt{5x^2 - 2x}}$
Solution
(i) We have x² − 6x + 13 = x² − 6x + 3² − 3² + 13 = (x − 3)² + 4
So, $\int \frac{dx}{x^2 - 6x + 13} = \int \frac{1}{(x - 3)^2 + 2^2}\,dx$
Let x − 3 = t. Then dx = dt.
Therefore, $\int \frac{dx}{x^2 - 6x + 13} = \int \frac{dt}{t^2 + 2^2} = \frac{1}{2}\tan^{-1}\frac{t}{2} + C$ [by 7.4 (3)]
$= \frac{1}{2}\tan^{-1}\frac{x - 3}{2} + C$

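(ii) The given integral is of the form 7.4 (7). We write the denominator of the integrand as
$3x^2 + 13x - 10 = 3\left(x^2 + \frac{13x}{3} - \frac{10}{3}\right)$
$= 3\left[\left(x + \frac{13}{6}\right)^2 - \left(\frac{17}{6}\right)^2\right]$ (completing the square)
Thus $\int \frac{dx}{3x^2 + 13x - 10} = \frac{1}{3}\int \frac{dx}{\left(x + \frac{13}{6}\right)^2 - \left(\frac{17}{6}\right)^2}$
Put $x + \frac{13}{6} = t$. Then dx = dt.
Therefore, $\int \frac{dx}{3x^2 + 13x - 10} = \frac{1}{3}\int \frac{dt}{t^2 - \left(\frac{17}{6}\right)^2}$
$= \frac{1}{3 \times 2 \times \frac{17}{6}}\log\left|\frac{t - \frac{17}{6}}{t + \frac{17}{6}}\right| + C_1$ [by 7.4 (1)]
$= \frac{1}{17}\log\left|\frac{x + \frac{13}{6} - \frac{17}{6}}{x + \frac{13}{6} + \frac{17}{6}}\right| + C_1$
$= \frac{1}{17}\log\left|\frac{6x - 4}{6x + 30}\right| + C_1$
$= \frac{1}{17}\log\left|\frac{3x - 2}{x + 5}\right| + C_1 + \frac{1}{17}\log\frac{1}{3}$
$= \frac{1}{17}\log\left|\frac{3x - 2}{x + 5}\right| + C$, where $C = C_1 + \frac{1}{17}\log\frac{1}{3}$
(iii) We have $\int \frac{dx}{\sqrt{5x^2 - 2x}} = \int \frac{dx}{\sqrt{5\left(x^2 - \frac{2x}{5}\right)}}$
$= \frac{1}{\sqrt{5}}\int \frac{dx}{\sqrt{\left(x - \frac{1}{5}\right)^2 - \left(\frac{1}{5}\right)^2}}$ (completing the square)
Put $x - \frac{1}{5} = t$. Then dx = dt.
Therefore, $\int \frac{dx}{\sqrt{5x^2 - 2x}} = \frac{1}{\sqrt{5}}\int \frac{dt}{\sqrt{t^2 - \left(\frac{1}{5}\right)^2}}$
$= \frac{1}{\sqrt{5}}\log\left|t + \sqrt{t^2 - \left(\frac{1}{5}\right)^2}\right| + C$ [by 7.4 (4)]
$= \frac{1}{\sqrt{5}}\log\left|x - \frac{1}{5} + \sqrt{x^2 - \frac{2x}{5}}\right| + C$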

Example 10 Find the following integrals:
(i) $\int \frac{x + 2}{2x^2 + 6x + 5}\,dx$ (ii) $\int \frac{x + 3}{\sqrt{5 - 4x - x^2}}\,dx$
Solution
(i) Using the formula 7.4 (9), we express
$x + 2 = A\,\frac{d}{dx}(2x^2 + 6x + 5) + B = A(4x + 6) + B$
Equating the coefficients of x and the constant terms from both sides, we get
4A = 1 and 6A + B = 2, or $A = \frac{1}{4}$ and $B = \frac{1}{2}$
Therefore, $\int \frac{x + 2}{2x^2 + 6x + 5}\,dx = \frac{1}{4}\int \frac{4x + 6}{2x^2 + 6x + 5}\,dx + \frac{1}{2}\int \frac{dx}{2x^2 + 6x + 5}$
$= \frac{1}{4}I_1 + \frac{1}{2}I_2$ (say) … (1)
In I₁, put 2x² + 6x + 5 = t, so that (4x + 6) dx = dt.

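Therefore, $I_1 = \int \frac{dt}{t} = \log|t| + C_1$
$= \log|2x^2 + 6x + 5| + C_1$ … (2)
and $I_2 = \int \frac{dx}{2x^2 + 6x + 5} = \frac{1}{2}\int \frac{dx}{x^2 + 3x + \frac{5}{2}}$
$= \frac{1}{2}\int \frac{dx}{\left(x + \frac{3}{2}\right)^2 + \left(\frac{1}{2}\right)^2}$
Put $x + \frac{3}{2} = t$, so that dx = dt. We get
$I_2 = \frac{1}{2}\int \frac{dt}{t^2 + \left(\frac{1}{2}\right)^2} = \frac{1}{2 \times \frac{1}{2}}\tan^{-1}2t + C_2$ [by 7.4 (3)]
$= \tan^{-1}2\left(x + \frac{3}{2}\right) + C_2 = \tan^{-1}(2x + 3) + C_2$ … (3)
Using (2) and (3) in (1), we get
$\int \frac{x + 2}{2x^2 + 6x + 5}\,dx = \frac{1}{4}\log|2x^2 + 6x + 5| + \frac{1}{2}\tan^{-1}(2x + 3) + C$
where $C = \frac{C_1}{4} + \frac{C_2}{2}$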
(ii) This integral is of the form given in 7.4 (10). Let us express
$x + 3 = A\,\frac{d}{dx}(5 - 4x - x^2) + B = A(-4 - 2x) + B$
Equating the coefficients of x and the constant terms from both sides, we get
−2A = 1 and −4A + B = 3, i.e., $A = -\frac{1}{2}$ and B = 1
Therefore, $\int \frac{x + 3}{\sqrt{5 - 4x - x^2}}\,dx = -\frac{1}{2}\int \frac{(-4 - 2x)\,dx}{\sqrt{5 - 4x - x^2}} + \int \frac{dx}{\sqrt{5 - 4x - x^2}}$
$= -\frac{1}{2}I_1 + I_2$ … (1)
In I₁, put 5 − 4x − x² = t, so that (−4 − 2x) dx = dt.
Therefore, $I_1 = \int \frac{(-4 - 2x)\,dx}{\sqrt{5 - 4x - x^2}} = \int \frac{dt}{\sqrt{t}} = 2\sqrt{t} + C_1$
$= 2\sqrt{5 - 4x - x^2} + C_1$ … (2)
Now consider $I_2 = \int \frac{dx}{\sqrt{5 - 4x - x^2}} = \int \frac{dx}{\sqrt{9 - (x + 2)^2}}$
Put x + 2 = t, so that dx = dt.
Therefore, $I_2 = \int \frac{dt}{\sqrt{3^2 - t^2}} = \sin^{-1}\frac{t}{3} + C_2$ [by 7.4 (5)]
$= \sin^{-1}\frac{x + 2}{3} + C_2$ … (3)
Substituting (2) and (3) in (1), we obtain
$\int \frac{x + 3}{\sqrt{5 - 4x - x^2}}\,dx = -\sqrt{5 - 4x - x^2} + \sin^{-1}\frac{x + 2}{3} + C$, where $C = C_2 - \frac{C_1}{2}$

Integration by Partial Fractions


Recall that a rational function is defined as the ratio of two polynomials in the form $\frac{P(x)}{Q(x)}$, where P(x) and Q(x) are polynomials in x and Q(x) ≠ 0. If the degree of P(x) is less than the degree of Q(x), then the rational function is called proper; otherwise, it is called improper. An improper rational function can be reduced by long division to the form $\frac{P(x)}{Q(x)} = T(x) + \frac{P_1(x)}{Q(x)}$, where T(x) is a polynomial in x and $\frac{P_1(x)}{Q(x)}$ is a proper rational function. As we know how to integrate polynomials, the integration of any rational function reduces to the integration of a proper rational function.
Suppose, then, that the integrand is a proper rational function. It is always possible to write the integrand as a sum of simpler rational
functions by a method called partial fraction decomposition. After this, the integration can be carried
out easily using the already known methods. The following Table indicates the types of simpler partial
fractions that are to be associated with various kind of rational functions.

S.No.   Form of the rational function                Form of the partial fraction
1.      (px + q) / ((x − a)(x − b)), a ≠ b           A/(x − a) + B/(x − b)
2.      (px + q) / (x − a)²                          A/(x − a) + B/(x − a)²
3.      (px² + qx + r) / ((x − a)(x − b)(x − c))     A/(x − a) + B/(x − b) + C/(x − c)
4.      (px² + qx + r) / ((x − a)²(x − b))           A/(x − a) + B/(x − a)² + C/(x − b)
5.      (px² + qx + r) / ((x − a)(x² + bx + c)),     A/(x − a) + (Bx + C)/(x² + bx + c)
        where x² + bx + c cannot be factorised further


In the above table, A, B and C are real numbers to be determined suitably.
Example 11 Find $\int \frac{dx}{(x + 1)(x + 2)}$
Solution The integrand is a proper rational function. Therefore, by using the form of partial fraction,
we write
$\frac{1}{(x + 1)(x + 2)} = \frac{A}{x + 1} + \frac{B}{x + 2}$ … (1)
where real numbers A and B are to be determined suitably. This gives
1 = A(x + 2) + B(x + 1).
Equating the coefficients of x and the constant term, we get
A + B = 0
and 2A + B = 1
Solving these equations, we get A = 1 and B = −1.
Thus, the integrand is given by
$\frac{1}{(x + 1)(x + 2)} = \frac{1}{x + 1} + \frac{-1}{x + 2}$
Therefore, $\int \frac{dx}{(x + 1)(x + 2)} = \int \frac{dx}{x + 1} - \int \frac{dx}{x + 2}$
$= \log|x + 1| - \log|x + 2| + C$
$= \log\left|\frac{x + 1}{x + 2}\right| + C$
Remark The equation (1) above is an identity, i.e. a statement true for all (permissible) values of x.
Some authors use the symbol ‘≡’ to indicate that the statement is an identity and use the symbol ‘=’
to indicate that the statement is an equation, i.e., to indicate that the statement is true only for certain
values of x.
Example 12 Find $\int \frac{x^2 + 1}{x^2 - 5x + 6}\,dx$
Solution Here the integrand $\frac{x^2 + 1}{x^2 - 5x + 6}$ is not a proper rational function, so we divide x² + 1 by x² − 5x + 6 and find that
$\frac{x^2 + 1}{x^2 - 5x + 6} = 1 + \frac{5x - 5}{x^2 - 5x + 6} = 1 + \frac{5x - 5}{(x - 2)(x - 3)}$
Let $\frac{5x - 5}{(x - 2)(x - 3)} = \frac{A}{x - 2} + \frac{B}{x - 3}$
So that 5x − 5 = A(x − 3) + B(x − 2)
Equating the coefficients of x and constant terms on both sides, we get A + B = 5 and 3A + 2B = 5. Solving these equations, we get A = −5 and B = 10.
Thus, $\frac{x^2 + 1}{x^2 - 5x + 6} = 1 - \frac{5}{x - 2} + \frac{10}{x - 3}$
Therefore, $\int \frac{x^2 + 1}{x^2 - 5x + 6}\,dx = \int dx - 5\int \frac{1}{x - 2}\,dx + 10\int \frac{dx}{x - 3}$
$= x - 5\log|x - 2| + 10\log|x - 3| + C$.
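For the first row of the table (two distinct linear factors), the constants can be read off directly: setting x = a in px + q = A(x − b) + B(x − a) gives A, and setting x = b gives B. The sketch below applies this to the decompositions in Examples 11 and 12. The function name `two_linear_factors` is hypothetical and integer inputs are assumed.

```python
from fractions import Fraction

def two_linear_factors(p, q, a, b):
    # Hypothetical helper: returns (A, B) with
    # (p*x + q) / ((x - a)(x - b)) = A/(x - a) + B/(x - b), for a != b.
    if a == b:
        raise ValueError("the two linear factors must be distinct")
    A = Fraction(p * a + q, a - b)   # from x = a
    B = Fraction(p * b + q, b - a)   # from x = b
    return A, B

example_11 = two_linear_factors(0, 1, -1, -2)   # 1 / ((x + 1)(x + 2))
example_12 = two_linear_factors(5, -5, 2, 3)    # (5x - 5) / ((x - 2)(x - 3))
print(example_11, example_12)
```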
Example 13 Find $\int \frac{3x - 2}{(x + 1)^2(x + 3)}\,dx$
Solution We write
$\frac{3x - 2}{(x + 1)^2(x + 3)} = \frac{A}{x + 1} + \frac{B}{(x + 1)^2} + \frac{C}{x + 3}$
So that 3x − 2 = A(x + 1)(x + 3) + B(x + 3) + C(x + 1)²
= A(x² + 4x + 3) + B(x + 3) + C(x² + 2x + 1)
Comparing coefficients of x², x and the constant term on both sides, we get A + C = 0, 4A + B + 2C = 3 and 3A + 3B + C = −2. Solving these equations, we get $A = \frac{11}{4}$, $B = \frac{-5}{2}$ and $C = \frac{-11}{4}$. Thus the integrand is given by
$\frac{3x - 2}{(x + 1)^2(x + 3)} = \frac{11}{4(x + 1)} - \frac{5}{2(x + 1)^2} - \frac{11}{4(x + 3)}$
Therefore, $\int \frac{3x - 2}{(x + 1)^2(x + 3)}\,dx = \frac{11}{4}\int \frac{dx}{x + 1} - \frac{5}{2}\int \frac{dx}{(x + 1)^2} - \frac{11}{4}\int \frac{dx}{x + 3}$
$= \frac{11}{4}\log|x + 1| + \frac{5}{2(x + 1)} - \frac{11}{4}\log|x + 3| + C$
$= \frac{11}{4}\log\left|\frac{x + 1}{x + 3}\right| + \frac{5}{2(x + 1)} + C$
Example 14 Find $\int \frac{x^2}{(x^2 + 1)(x^2 + 4)}\,dx$
Solution Consider $\frac{x^2}{(x^2 + 1)(x^2 + 4)}$ and put x² = y.
Then $\frac{x^2}{(x^2 + 1)(x^2 + 4)} = \frac{y}{(y + 1)(y + 4)}$
Write $\frac{y}{(y + 1)(y + 4)} = \frac{A}{y + 1} + \frac{B}{y + 4}$
So that y = A(y + 4) + B(y + 1)
Comparing coefficients of y and constant terms on both sides, we get A + B = 1 and 4A + B = 0, which give
$A = -\frac{1}{3}$ and $B = \frac{4}{3}$
Thus, $\frac{x^2}{(x^2 + 1)(x^2 + 4)} = -\frac{1}{3(x^2 + 1)} + \frac{4}{3(x^2 + 4)}$
Therefore, $\int \frac{x^2\,dx}{(x^2 + 1)(x^2 + 4)} = -\frac{1}{3}\int \frac{dx}{x^2 + 1} + \frac{4}{3}\int \frac{dx}{x^2 + 4}$
$= -\frac{1}{3}\tan^{-1}x + \frac{4}{3} \times \frac{1}{2}\tan^{-1}\frac{x}{2} + C$
$= -\frac{1}{3}\tan^{-1}x + \frac{2}{3}\tan^{-1}\frac{x}{2} + C$
In the above example, the substitution was made only for the partial fraction part and not for the
integration part. Now, we consider an example where the integration involves a combination of the
substitution method and the partial fraction method.
Example 15 Find $\int \frac{(3\sin\phi - 2)\cos\phi}{5 - \cos^2\phi - 4\sin\phi}\,d\phi$
Solution Let y = sin ϕ
Then dy = cos ϕ dϕ
Therefore, $\int \frac{(3\sin\phi - 2)\cos\phi}{5 - \cos^2\phi - 4\sin\phi}\,d\phi = \int \frac{(3y - 2)\,dy}{5 - (1 - y^2) - 4y}$
$= \int \frac{3y - 2}{y^2 - 4y + 4}\,dy$
$= \int \frac{3y - 2}{(y - 2)^2}\,dy = I$ (say)
Now, we write $\frac{3y - 2}{(y - 2)^2} = \frac{A}{y - 2} + \frac{B}{(y - 2)^2}$
Therefore, 3y − 2 = A(y − 2) + B
Comparing the coefficients of y and the constant term, we get A = 3 and B − 2A = −2, which gives A = 3 and B = 4.
Therefore, the required integral is given by
$I = \int \left[\frac{3}{y - 2} + \frac{4}{(y - 2)^2}\right]dy = 3\int \frac{dy}{y - 2} + 4\int \frac{dy}{(y - 2)^2}$
$= 3\log|y - 2| + 4\left(-\frac{1}{y - 2}\right) + C$
$= 3\log|\sin\phi - 2| + \frac{4}{2 - \sin\phi} + C$
$= 3\log(2 - \sin\phi) + \frac{4}{2 - \sin\phi} + C$ (since 2 − sin ϕ is always positive)
Example 16 Find $\int \frac{x^2 + x + 1}{(x + 2)(x^2 + 1)}\,dx$
Solution The integrand is a proper rational function. Decompose the rational function into partial
fractions. Write
$\frac{x^2 + x + 1}{(x^2 + 1)(x + 2)} = \frac{A}{x + 2} + \frac{Bx + C}{x^2 + 1}$
Therefore, x² + x + 1 = A(x² + 1) + (Bx + C)(x + 2)
Equating the coefficients of x², x and the constant term on both sides, we get A + B = 1, 2B + C = 1 and A + 2C = 1. Solving these equations, we get
$A = \frac{3}{5}$, $B = \frac{2}{5}$ and $C = \frac{1}{5}$
Thus, the integrand is given by
$\frac{x^2 + x + 1}{(x^2 + 1)(x + 2)} = \frac{3}{5(x + 2)} + \frac{\frac{2}{5}x + \frac{1}{5}}{x^2 + 1} = \frac{3}{5(x + 2)} + \frac{1}{5}\left(\frac{2x + 1}{x^2 + 1}\right)$
Therefore, $\int \frac{x^2 + x + 1}{(x^2 + 1)(x + 2)}\,dx = \frac{3}{5}\int \frac{dx}{x + 2} + \frac{1}{5}\int \frac{2x}{x^2 + 1}\,dx + \frac{1}{5}\int \frac{1}{x^2 + 1}\,dx$
$= \frac{3}{5}\log|x + 2| + \frac{1}{5}\log|x^2 + 1| + \frac{1}{5}\tan^{-1}x + C$

Integration by Parts
In this section, we describe one more method of integration, that is found quite useful in integrating
products of functions.
If u and v are any two differentiable functions of a single variable x (say), then, by the product rule of
differentiation, we have
$\frac{d}{dx}(uv) = u\frac{dv}{dx} + v\frac{du}{dx}$
Integrating both sides, we get
$uv = \int u\frac{dv}{dx}\,dx + \int v\frac{du}{dx}\,dx$
or $\int u\frac{dv}{dx}\,dx = uv - \int v\frac{du}{dx}\,dx$ … (1)
Let u = f(x) and $\frac{dv}{dx} = g(x)$. Then
$\frac{du}{dx} = f'(x)$ and $v = \int g(x)\,dx$
Therefore, expression (1) can be rewritten as
$\int f(x)\,g(x)\,dx = f(x)\int g(x)\,dx - \int \left[\int g(x)\,dx\right]f'(x)\,dx$
i.e. $\int f(x)\,g(x)\,dx = f(x)\int g(x)\,dx - \int \left[f'(x)\int g(x)\,dx\right]dx$


If we take f as the first function and g as the second function, then this formula may be stated as
follows:
“The integral of the product of two functions = (first function) × (integral of the second function) –
Integral of [(differential coefficient of the first function) × (integral of the second function)]”
Example 17 Find ∫ x cos x dx
Solution Put f (x) = x (first function) and g (x) = cos x (second function). Then, integration by parts gives
$\int x\cos x\,dx = x\int \cos x\,dx - \int \left[\frac{d}{dx}(x)\int \cos x\,dx\right]dx$
$= x\sin x - \int \sin x\,dx = x\sin x + \cos x + C$
Suppose, we take f (x) = cos x and g (x) = x. Then
$\int x\cos x\,dx = \cos x\int x\,dx - \int \left[\frac{d}{dx}(\cos x)\int x\,dx\right]dx$
$= (\cos x)\frac{x^2}{2} + \int \sin x\,\frac{x^2}{2}\,dx$
Thus, the integral ∫ x cos x dx is reduced to a comparatively more complicated integral
having a higher power of x. Therefore, the proper choice of the first function and the second function is
significant.
Remarks
(i) It is worth mentioning that integration by parts is not applicable to product of functions in all
cases. For instance, the method does not work for ∫ √x sin x dx. The reason is that no function
expressible in terms of elementary functions has derivative √x sin x.
(ii) Observe that while finding the integral of the second function, we did not add any constant of
integration. If we write the integral of the second function cos x
as sin x + k, where k is any constant, then
$\int x\cos x\,dx = x(\sin x + k) - \int (\sin x + k)\,dx$
$= x(\sin x + k) - \int \sin x\,dx - \int k\,dx$
$= x(\sin x + k) + \cos x - kx + C = x\sin x + \cos x + C$
This shows that adding a constant to the integral of the second function is superfluous so far as the
final result is concerned while applying the method of integration by parts.
(iii) Usually, if any function is a power of x or a polynomial in x, then we take it as the first function.
However, in cases where other function is inverse trigonometric function or logarithmic function, then
we take them as first function.
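Remark (ii) can be illustrated numerically in the definite-integral form of integration by parts: the value of [uv] from a to b minus the integral of v u′ does not change when a constant k is added to v. The sketch below uses u = x and v = sin x + k on [0, 2]. The `simpson` and `by_parts` helpers, the interval and the values of k are assumptions for illustration.

```python
import math

def simpson(func, a, b, n=2000):
    # Composite Simpson's rule with an even number n of subintervals
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def by_parts(u, du, v, a, b):
    # Integral of u * v' over [a, b], computed as [u v] - integral of v * u'
    return u(b) * v(b) - u(a) * v(a) - simpson(lambda x: v(x) * du(x), a, b)

a, b = 0.0, 2.0
# Value predicted by the anti derivative x sin x + cos x of x cos x
exact = (b * math.sin(b) + math.cos(b)) - (a * math.sin(a) + math.cos(a))

results = [
    by_parts(lambda x: x, lambda x: 1.0, lambda x, k=k: math.sin(x) + k, a, b)
    for k in (0.0, 1.0, -3.5)
]
for value in results:
    print(value, "vs", exact)
```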
Example 18 Find ∫ log x dx
Solution To start with, we are unable to guess a function whose derivative is log x. We take log x as
the first function and the constant function 1 as the second function. Then, the integral of the second
function is x.
Hence, $\int (\log x \cdot 1)\,dx = \log x\int 1\,dx - \int \left[\frac{d}{dx}(\log x)\int 1\,dx\right]dx$
$= (\log x)\cdot x - \int \frac{1}{x}\,x\,dx = x\log x - x + C$.
x

Example 19 Find $\int x e^x\,dx$
Solution Take the first function as x and the second function as $e^x$. The integral of the second function is $e^x$.
Therefore, $\int x e^x\,dx = x e^x - \int 1 \cdot e^x\,dx = x e^x - e^x + C$.
Example 20 Find $\int \frac{x\sin^{-1}x}{\sqrt{1 - x^2}}\,dx$
Solution Let the first function be $\sin^{-1}x$ and the second function be $\frac{x}{\sqrt{1 - x^2}}$.
First we find the integral of the second function, i.e., $\int \frac{x\,dx}{\sqrt{1 - x^2}}$.
Put t = 1 − x². Then dt = −2x dx.
Therefore, $\int \frac{x\,dx}{\sqrt{1 - x^2}} = -\frac{1}{2}\int \frac{dt}{\sqrt{t}} = -\sqrt{t} = -\sqrt{1 - x^2}$
Hence, $\int \frac{x\sin^{-1}x}{\sqrt{1 - x^2}}\,dx = (\sin^{-1}x)\left(-\sqrt{1 - x^2}\right) - \int \frac{1}{\sqrt{1 - x^2}}\left(-\sqrt{1 - x^2}\right)dx$
$= -\sqrt{1 - x^2}\,\sin^{-1}x + x + C = x - \sqrt{1 - x^2}\,\sin^{-1}x + C$
Alternatively, this integral can also be worked out by making the substitution $\sin^{-1}x = \theta$ and then integrating
by parts.
Example 21 Find $\int e^x\sin x\,dx$
Solution Take $e^x$ as the first function and sin x as the second function. Then, integrating by parts, we have
$I = \int e^x\sin x\,dx = e^x(-\cos x) + \int e^x\cos x\,dx$
$= -e^x\cos x + I_1$ (say) … (1)
Taking $e^x$ and cos x as the first and second functions, respectively, in I₁, we get
$I_1 = e^x\sin x - \int e^x\sin x\,dx$
Substituting the value of I₁ in (1), we get
$I = -e^x\cos x + e^x\sin x - I$ or $2I = e^x(\sin x - \cos x)$
Hence, $I = \int e^x\sin x\,dx = \frac{e^x}{2}(\sin x - \cos x) + C$
Alternatively, the above integral can also be determined by taking sin x as the first function and $e^x$ as the
second function.
Integral of the type $\int e^x[f(x) + f'(x)]\,dx$
We have $I = \int e^x[f(x) + f'(x)]\,dx = \int e^x f(x)\,dx + \int e^x f'(x)\,dx$
$= I_1 + \int e^x f'(x)\,dx$, where $I_1 = \int e^x f(x)\,dx$ … (1)
Taking f (x) and $e^x$ as the first function and second function, respectively, in I₁ and integrating it by
parts, we have $I_1 = f(x)\,e^x - \int f'(x)\,e^x\,dx + C$
Substituting I₁ in (1), we get
$I = e^x f(x) - \int f'(x)\,e^x\,dx + \int e^x f'(x)\,dx + C = e^x f(x) + C$.
Thus, $\int e^x[f(x) + f'(x)]\,dx = e^x f(x) + C$


Example 22 Find (i) $\int e^x\left(\tan^{-1}x + \frac{1}{1 + x^2}\right)dx$ (ii) $\int \frac{(x^2 + 1)\,e^x}{(x + 1)^2}\,dx$
Solution
(i) We have $I = \int e^x\left(\tan^{-1}x + \frac{1}{1 + x^2}\right)dx$
Consider $f(x) = \tan^{-1}x$, then $f'(x) = \frac{1}{1 + x^2}$
Thus, the given integrand is of the form $e^x[f(x) + f'(x)]$.
Therefore, $I = \int e^x\left(\tan^{-1}x + \frac{1}{1 + x^2}\right)dx = e^x\tan^{-1}x + C$
(ii) We have $I = \int \frac{(x^2 + 1)\,e^x}{(x + 1)^2}\,dx = \int e^x\left[\frac{x^2 - 1 + 1 + 1}{(x + 1)^2}\right]dx$
$= \int e^x\left[\frac{x^2 - 1}{(x + 1)^2} + \frac{2}{(x + 1)^2}\right]dx = \int e^x\left[\frac{x - 1}{x + 1} + \frac{2}{(x + 1)^2}\right]dx$
Consider $f(x) = \frac{x - 1}{x + 1}$, then $f'(x) = \frac{2}{(x + 1)^2}$
Thus, the given integrand is of the form $e^x[f(x) + f'(x)]$.
Therefore, $\int \frac{x^2 + 1}{(x + 1)^2}\,e^x\,dx = \frac{x - 1}{x + 1}\,e^x + C$
x+1

Integrals of some more types


Here, we discuss some special types of standard integrals based on the technique of integration by
parts:
(i) $\int \sqrt{x^2 - a^2}\,dx$ (ii) $\int \sqrt{x^2 + a^2}\,dx$ (iii) $\int \sqrt{a^2 - x^2}\,dx$
(i) Let $I = \int \sqrt{x^2 - a^2}\,dx$
Taking constant function 1 as the second function and integrating by parts, we have
$I = x\sqrt{x^2 - a^2} - \int \frac{1}{2}\,\frac{2x}{\sqrt{x^2 - a^2}}\,x\,dx$
$= x\sqrt{x^2 - a^2} - \int \frac{x^2}{\sqrt{x^2 - a^2}}\,dx = x\sqrt{x^2 - a^2} - \int \frac{x^2 - a^2 + a^2}{\sqrt{x^2 - a^2}}\,dx$
$= x\sqrt{x^2 - a^2} - \int \sqrt{x^2 - a^2}\,dx - a^2\int \frac{dx}{\sqrt{x^2 - a^2}}$
$= x\sqrt{x^2 - a^2} - I - a^2\int \frac{dx}{\sqrt{x^2 - a^2}}$
or $2I = x\sqrt{x^2 - a^2} - a^2\int \frac{dx}{\sqrt{x^2 - a^2}}$
or $I = \int \sqrt{x^2 - a^2}\,dx = \frac{x}{2}\sqrt{x^2 - a^2} - \frac{a^2}{2}\log\left|x + \sqrt{x^2 - a^2}\right| + C$
Similarly, integrating the other two integrals by parts, taking constant function 1 as the second function,
we get
(ii) $\int \sqrt{x^2 + a^2}\,dx = \frac{1}{2}x\sqrt{x^2 + a^2} + \frac{a^2}{2}\log\left|x + \sqrt{x^2 + a^2}\right| + C$
(iii) $\int \sqrt{a^2 - x^2}\,dx = \frac{1}{2}x\sqrt{a^2 - x^2} + \frac{a^2}{2}\sin^{-1}\frac{x}{a} + C$
Alternatively, integrals (i), (ii) and (iii) can also be found by making the trigonometric substitution x =
a sec θ in (i), x = a tan θ in (ii) and x = a sin θ in (iii) respectively.
Example 23 Find $\int \sqrt{x^2 + 2x + 5}\,dx$
Solution Note that
$\int \sqrt{x^2 + 2x + 5}\,dx = \int \sqrt{(x + 1)^2 + 4}\,dx$
Put x + 1 = y, so that dx = dy. Then
$\int \sqrt{x^2 + 2x + 5}\,dx = \int \sqrt{y^2 + 2^2}\,dy$
$= \frac{1}{2}y\sqrt{y^2 + 4} + \frac{4}{2}\log\left|y + \sqrt{y^2 + 4}\right| + C$ [using 7.6.2 (ii)]
$= \frac{1}{2}(x + 1)\sqrt{x^2 + 2x + 5} + 2\log\left|x + 1 + \sqrt{x^2 + 2x + 5}\right| + C$
Example 24 Find $\int \sqrt{3 - 2x - x^2}\,dx$
Solution Note that $\int \sqrt{3 - 2x - x^2}\,dx = \int \sqrt{4 - (x + 1)^2}\,dx$
Put x + 1 = y so that dx = dy.
Thus $\int \sqrt{3 - 2x - x^2}\,dx = \int \sqrt{4 - y^2}\,dy$
$= \frac{1}{2}y\sqrt{4 - y^2} + \frac{4}{2}\sin^{-1}\frac{y}{2} + C$ [using 7.6.2 (iii)]
$= \frac{1}{2}(x + 1)\sqrt{3 - 2x - x^2} + 2\sin^{-1}\left(\frac{x + 1}{2}\right) + C$
Definite Integral
In the previous sections, we have studied indefinite integrals and discussed a few methods of
finding them, including integrals of some special functions. In this section, we shall study what is called
the definite integral of a function. The definite integral has a unique value. A definite integral is denoted
by $\int_a^b f(x)\,dx$, where a is called the lower limit of the integral and b is called the upper limit of the integral. The definite
integral is introduced either as the limit of a sum or, if it has an anti derivative F in the interval [a, b],
as the difference between the values of F at the end points, i.e., F(b) – F(a). Here, we
shall consider these two cases separately as discussed below:
Definite integral as the limit of a sum
Let f be a continuous function defined on the closed interval [a, b]. Assume that all the values taken by the
function are non negative, so the graph of the function is a curve above the x-axis.
The definite integral $\int_a^b f(x)\,dx$ is the area bounded by the curve y = f (x), the ordinates x = a, x = b and
the x-axis. To evaluate this area, consider the region PRSQP between this curve, the x-axis and the
ordinates x = a and x = b.

Divide the interval [a, b] into n equal subintervals denoted by $[x_0, x_1], [x_1, x_2], \dots, [x_{r-1}, x_r], \dots, [x_{n-1}, x_n]$,
where $x_0 = a$, $x_1 = a + h$, $x_2 = a + 2h, \dots, x_r = a + rh$ and $x_n = b = a + nh$, or $n = \frac{b - a}{h}$. We note that as n → ∞,
h → 0.
The region PRSQP under consideration is the sum of n subregions, where each subregion is defined on
subintervals [xr –1 , xr], r = 1, 2, 3, …, n.
We have area of the rectangle (ABLC) < area of the region (ABDCA) < area of the rectangle (ABDM)... (1)
Evidently as xr – xr–1 → 0, i.e., h → 0 all the three areas shown in (1) become nearly equal to each other.
Now we form the following sums.
$s_n = h[f(x_0) + \dots + f(x_{n-1})] = h\sum_{r=0}^{n-1} f(x_r)$ … (2)
and $S_n = h[f(x_1) + f(x_2) + \dots + f(x_n)] = h\sum_{r=1}^{n} f(x_r)$ … (3)
Here, sn and Sn denote the sum of areas of all lower rectangles and upper rectangles raised over
subintervals [xr–1, xr] for r = 1, 2, 3, …, n, respectively.
In view of the inequality (1) for an arbitrary subinterval [xr–1, xr], we have
sn < area of the region PRSQP < Sn ... (4)
As n → ∞ strips become narrower and narrower, it is assumed that the limiting values of (2) and (3) are
the same in both cases and the common limiting value is the required area under the curve.
Symbolically, we write
$\lim_{n\to\infty} S_n = \lim_{n\to\infty} s_n = \text{area of the region PRSQP} = \int_a^b f(x)\,dx$ … (5)
It follows that this area is also the limiting value of any area which is between that of the rectangles
below the curve and that of the rectangles above the curve. For the sake of convenience, we shall take
rectangles with height equal to that of the curve at the left hand edge of each subinterval. Thus, we
rewrite (5) as
$\int_a^b f(x)\,dx = \lim_{h\to 0} h\,[f(a) + f(a + h) + \dots + f(a + (n - 1)h)]$
or $\int_a^b f(x)\,dx = (b - a)\lim_{n\to\infty}\frac{1}{n}[f(a) + f(a + h) + \dots + f(a + (n - 1)h)]$ … (6)
where $h = \frac{b - a}{n} \to 0$ as n → ∞
The above expression (6) is known as the definition of definite integral as the limit of sum.
Remark The value of the definite integral of a function over any particular interval depends on the
function and the interval, but not on the variable of integration that we choose to represent the
independent variable. If the independent variable is denoted by
t or u instead of x, we simply write the integral as $\int_a^b f(t)\,dt$ or $\int_a^b f(u)\,du$ instead of $\int_a^b f(x)\,dx$. Hence, the variable of
integration is called a dummy variable.
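Example 25 Find $\int_0^2 (x^2 + 1)\,dx$ as the limit of a sum.
Solution By definition
$\int_a^b f(x)\,dx = (b - a)\lim_{n\to\infty}\frac{1}{n}[f(a) + f(a + h) + \dots + f(a + (n - 1)h)]$,
where $h = \frac{b - a}{n}$
In this example, a = 0, b = 2, f(x) = x² + 1, $h = \frac{2 - 0}{n} = \frac{2}{n}$
Therefore,
$\int_0^2 (x^2 + 1)\,dx = 2\lim_{n\to\infty}\frac{1}{n}\left[f(0) + f\left(\frac{2}{n}\right) + f\left(\frac{4}{n}\right) + \dots + f\left(\frac{2(n - 1)}{n}\right)\right]$
$= 2\lim_{n\to\infty}\frac{1}{n}\left[1 + \left(\frac{2^2}{n^2} + 1\right) + \left(\frac{4^2}{n^2} + 1\right) + \dots + \left(\frac{(2n - 2)^2}{n^2} + 1\right)\right]$
$= 2\lim_{n\to\infty}\frac{1}{n}\left[\underbrace{(1 + 1 + \dots + 1)}_{n\text{ terms}} + \frac{1}{n^2}\left(2^2 + 4^2 + \dots + (2n - 2)^2\right)\right]$
$= 2\lim_{n\to\infty}\frac{1}{n}\left[n + \frac{2^2}{n^2}\left(1^2 + 2^2 + \dots + (n - 1)^2\right)\right]$
$= 2\lim_{n\to\infty}\frac{1}{n}\left[n + \frac{4}{n^2}\cdot\frac{(n - 1)\,n\,(2n - 1)}{6}\right]$
$= 2\lim_{n\to\infty}\frac{1}{n}\left[n + \frac{2}{3}\cdot\frac{(n - 1)(2n - 1)}{n}\right]$
$= 2\lim_{n\to\infty}\left[1 + \frac{2}{3}\left(1 - \frac{1}{n}\right)\left(2 - \frac{1}{n}\right)\right] = 2\left[1 + \frac{4}{3}\right] = \frac{14}{3}$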

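Example 26 Evaluate $\int_0^2 e^x\,dx$ as the limit of a sum.
Solution By definition
$\int_0^2 e^x\,dx = (2 - 0)\lim_{n\to\infty}\frac{1}{n}\left[e^0 + e^{2/n} + e^{4/n} + \dots + e^{(2n - 2)/n}\right]$
Using the sum to n terms of a G.P., where a = 1, $r = e^{2/n}$, we have
$\int_0^2 e^x\,dx = 2\lim_{n\to\infty}\frac{1}{n}\left[\frac{e^{2n/n} - 1}{e^{2/n} - 1}\right] = 2\lim_{n\to\infty}\frac{1}{n}\left[\frac{e^2 - 1}{e^{2/n} - 1}\right]$
$= \frac{2(e^2 - 1)}{\lim_{n\to\infty}\left[\frac{e^{2/n} - 1}{2/n}\right]\cdot 2} = e^2 - 1$ [using $\lim_{h\to 0}\frac{e^h - 1}{h} = 1$]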
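The limit-of-sum definition (6) used in Examples 25 and 26 can be observed numerically: the left-endpoint sums approach 14/3 and e² − 1 as n grows. The sketch below is an illustration under that definition; the function name `left_riemann_sum` and the chosen values of n are assumptions.

```python
import math

def left_riemann_sum(f, a, b, n):
    # h * [f(a) + f(a + h) + ... + f(a + (n - 1)h)], with h = (b - a)/n
    h = (b - a) / n
    return h * sum(f(a + r * h) for r in range(n))

def quadratic(x):
    # Integrand of Example 25
    return x * x + 1

for n in (10, 100, 1000, 10000):
    print(n, left_riemann_sum(quadratic, 0, 2, n), left_riemann_sum(math.exp, 0, 2, n))

print("limits:", 14 / 3, math.e**2 - 1)
```

For increasing functions such as these two, the left-endpoint sums approach the limit from below, in line with inequality (4).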
Fundamental Theorem of Calculus
Area function
We have defined $\int_a^b f(x)\,dx$ as the area of the region bounded by the curve y = f(x), the ordinates x = a and x = b, and the x-axis. Let x be a given point in [a, b]. Then $\int_a^x f(x)\,dx$ represents the area of the light shaded region
[Here it is assumed that f (x) > 0 for x ∈ [a, b]; the assertion made below is equally true for other
functions as well]. The area of this shaded region depends upon the value of x.
In other words, the area of this shaded region is a function of x. We denote this function of x by A(x).
We call the function A(x) the Area function; it is given by
$A(x) = \int_a^x f(x)\,dx$ … (1)
Based on this definition, the two basic fundamental theorems have been given. However, we only state
them as their proofs are beyond the scope of this text book.
First fundamental theorem of integral calculus
Theorem 1 Let f be a continuous function on the closed interval [a, b] and let A (x) be the area function.
Then A′(x) = f (x), for all x ∈ [a, b].
Second fundamental theorem of integral calculus
We state below an important theorem which enables us to evaluate definite integrals by making use
of anti derivative.
Theorem 2 Let f be a continuous function defined on the closed interval [a, b] and F be an anti derivative
of f. Then $\int_a^b f(x)\,dx = [F(x)]_a^b = F(b) - F(a)$.
Remarks
(i) In words, Theorem 2 tells us that $\int_a^b f(x)\,dx$ = (value of the anti derivative F of f at the upper
limit b – value of the same anti derivative at the lower limit a).
(ii) This theorem is very useful, because it gives us a method of calculating the definite integral more
easily, without calculating the limit of a sum.
(iii) The crucial operation in evaluating a definite integral is that of finding a function whose derivative
is equal to the integrand. This strengthens the relationship between differentiation and integration.
(iv) In $\int_a^b f(x)\,dx$, the function f needs to be well defined and continuous in [a, b].
For instance, the consideration of the definite integral
$\int_{-2}^{3} x(x^2 - 1)^{1/2}\,dx$ is erroneous since the function f expressed by $f(x) = x(x^2 - 1)^{1/2}$ is not defined in the portion −1 < x < 1 of the closed interval [– 2, 3].


Steps for calculating $\int_a^b f(x)\,dx$.
(i) Find the indefinite integral ∫ f (x) dx. Let this be F(x). There is no need to keep the integration constant C
because if we consider F(x) + C instead of F(x), we get
$\int_a^b f(x)\,dx = [F(x) + C]_a^b = [F(b) + C] - [F(a) + C] = F(b) - F(a)$.
Thus, the arbitrary constant disappears in evaluating the value of the definite integral.
(ii) Evaluate $F(b) - F(a) = [F(x)]_a^b$, which is the value of $\int_a^b f(x)\,dx$.
We now consider some examples.
Example 27 Evaluate the following integrals:
(i) $\int_2^3 x^2\,dx$
(ii) $\int_4^9 \frac{\sqrt{x}}{(30 - x^{3/2})^2}\,dx$
(iii) $\int_1^2 \frac{x\,dx}{(x+1)(x+2)}$
(iv) $\int_0^{\pi/4} \sin^3 2t \cos 2t\,dt$

Solution
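(i) Let $I = \int_2^3 x^2\,dx$. Since $\int x^2\,dx = \frac{x^3}{3} = F(x)$, therefore, by the second fundamental theorem, we get
$I = F(3) - F(2) = \frac{27}{3} - \frac{8}{3} = \frac{19}{3}$
(ii) Let $I = \int_4^9 \frac{\sqrt{x}}{(30 - x^{3/2})^2}\,dx$. We first find the anti derivative of the integrand.
Put $30 - x^{3/2} = t$. Then $-\frac{3}{2}\sqrt{x}\,dx = dt$ or $\sqrt{x}\,dx = -\frac{2}{3}\,dt$.
Thus, $\int \frac{\sqrt{x}}{(30 - x^{3/2})^2}\,dx = -\frac{2}{3}\int \frac{dt}{t^2} = \frac{2}{3}\left[\frac{1}{t}\right] = \frac{2}{3}\left[\frac{1}{30 - x^{3/2}}\right] = F(x)$
Therefore, by the second fundamental theorem of calculus, we have
$I = F(9) - F(4) = \frac{2}{3}\left[\frac{1}{30 - x^{3/2}}\right]_4^9$
$= \frac{2}{3}\left[\frac{1}{30 - 27} - \frac{1}{30 - 8}\right] = \frac{2}{3}\left[\frac{1}{3} - \frac{1}{22}\right] = \frac{19}{99}$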
(iii) Let $I = \int_1^2 \frac{x\,dx}{(x+1)(x+2)}$
Using partial fractions, we get $\frac{x}{(x+1)(x+2)} = \frac{-1}{x+1} + \frac{2}{x+2}$
So $\int \frac{x\,dx}{(x+1)(x+2)} = -\log|x + 1| + 2\log|x + 2| = F(x)$
Therefore, by the second fundamental theorem of calculus, we have
I = F(2) − F(1) = [− log 3 + 2 log 4] − [− log 2 + 2 log 3]
$= -3\log 3 + \log 2 + 2\log 4 = \log\left(\frac{32}{27}\right)$
(iv) Let $I = \int_0^{\pi/4} \sin^3 2t \cos 2t\,dt$. Consider $\int \sin^3 2t \cos 2t\,dt$
Put sin 2t = u so that 2 cos 2t dt = du or $\cos 2t\,dt = \frac{1}{2}\,du$
So $\int \sin^3 2t \cos 2t\,dt = \frac{1}{2}\int u^3\,du$
$= \frac{1}{8}[u^4] = \frac{1}{8}\sin^4 2t = F(t)$ say
Therefore, by the second fundamental theorem of integral calculus
$I = F\left(\frac{\pi}{4}\right) - F(0) = \frac{1}{8}\left[\sin^4\frac{\pi}{2} - \sin^4 0\right] = \frac{1}{8}$
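The four answers of Example 27 can be cross-checked numerically. The following is a minimal sketch, not part of the text: the helper `simpson` (composite Simpson rule) and the dictionary `checks` are names introduced here only for illustration. Each numerical value is compared with the closed form obtained from the second fundamental theorem.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Each entry: (numerical value, closed form F(b) - F(a) from the worked solution)
checks = {
    "(i)": (simpson(lambda x: x**2, 2, 3), 19 / 3),
    "(ii)": (simpson(lambda x: math.sqrt(x) / (30 - x**1.5) ** 2, 4, 9), 19 / 99),
    "(iii)": (simpson(lambda x: x / ((x + 1) * (x + 2)), 1, 2), math.log(32 / 27)),
    "(iv)": (simpson(lambda t: math.sin(2 * t) ** 3 * math.cos(2 * t), 0, math.pi / 4), 1 / 8),
}

for label, (numeric, exact) in checks.items():
    print(label, numeric, exact)
```

The two numbers printed on each line should agree to many decimal places.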
Evaluation of Definite Integrals by Substitution
In the previous sections, we have discussed several methods for finding the indefinite integral. One of
the important methods for finding the indefinite integral is the method of substitution.
To evaluate $\int_a^b f(x)\,dx$ by substitution, the steps could be as follows:
1. Consider the integral without limits and substitute, y = f (x) or x = g(y) to reduce the given integral
to a known form.
2. Integrate the new integrand with respect to the new variable without mentioning the constant of
integration.
3. Resubstitute for the new variable and write the answer in terms of the original variable.
4. Find the values of answers obtained in (3) at the given limits of integral and find the difference of
the values at the upper and lower limits.
Note In order to quicken this method, we can proceed as follows: After performing steps 1 and 2, there
is no need of step 3. Here, the integral will be kept in the new variable itself, and the limits of the
integral will accordingly be changed, so that we can perform the last step.
Let us illustrate this by examples.
Example 28 Evaluate $\int_{-1}^{1} 5x^4\sqrt{x^5 + 1}\,dx$.
Solution Put $t = x^5 + 1$, then $dt = 5x^4\,dx$.
Therefore, $\int 5x^4\sqrt{x^5 + 1}\,dx = \int \sqrt{t}\,dt = \frac{2}{3}t^{3/2} = \frac{2}{3}(x^5 + 1)^{3/2}$
Hence, $\int_{-1}^{1} 5x^4\sqrt{x^5 + 1}\,dx = \frac{2}{3}\left[(x^5 + 1)^{3/2}\right]_{-1}^{1}$
$= \frac{2}{3}\left[(1^5 + 1)^{3/2} - ((-1)^5 + 1)^{3/2}\right]$
$= \frac{2}{3}\left[2^{3/2} - 0^{3/2}\right] = \frac{2}{3}\left(2\sqrt{2}\right) = \frac{4\sqrt{2}}{3}$
Alternatively, first we transform the integral and then evaluate the transformed integral with new limits.
Let $t = x^5 + 1$. Then $dt = 5x^4\,dx$.
Note that, when x = −1, t = 0 and when x = 1, t = 2.
Thus, as x varies from −1 to 1, t varies from 0 to 2.
Therefore $\int_{-1}^{1} 5x^4\sqrt{x^5 + 1}\,dx = \int_0^2 \sqrt{t}\,dt$
$= \frac{2}{3}\left[t^{3/2}\right]_0^2 = \frac{2}{3}\left[2^{3/2} - 0^{3/2}\right] = \frac{2}{3}\left(2\sqrt{2}\right) = \frac{4\sqrt{2}}{3}$
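The effect of changing the limits together with the variable can be seen numerically. The sketch below is an illustration only: the helper `midpoint` (composite midpoint rule) and the variable names are assumptions, not part of the text. It evaluates the integral of Example 28 once in the original variable x over [−1, 1] and once in the new variable t = x^5 + 1 over [0, 2].

```python
import math

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule; it never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Original integral in x over [-1, 1]
in_x = midpoint(lambda x: 5 * x**4 * math.sqrt(x**5 + 1), -1.0, 1.0)

# Transformed integral in t = x^5 + 1 over the new limits [0, 2]
in_t = midpoint(lambda t: math.sqrt(t), 0.0, 2.0)

exact = 4 * math.sqrt(2) / 3  # value obtained in the worked solution

print(in_x, in_t, exact)
```

Both numerical values should be close to 4√2/3, which is what the substitution with new limits predicts.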
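Example 29 Evaluate $\int_0^1 \frac{\tan^{-1} x}{1 + x^2}\,dx$
Solution Let $t = \tan^{-1} x$, then $dt = \frac{1}{1 + x^2}\,dx$. The new limits are, when x = 0, t = 0 and when x = 1, $t = \frac{\pi}{4}$. Thus, as x varies from 0 to 1, t varies from 0 to $\frac{\pi}{4}$.
Therefore $\int_0^1 \frac{\tan^{-1} x}{1 + x^2}\,dx = \int_0^{\pi/4} t\,dt = \left[\frac{t^2}{2}\right]_0^{\pi/4} = \frac{1}{2}\left[\frac{\pi^2}{16} - 0\right] = \frac{\pi^2}{32}$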

Some Properties of Definite Integrals


We list below some important properties of definite integrals. These will be useful in evaluating the definite integrals more easily.
P0 : $\int_a^b f(x)\,dx = \int_a^b f(t)\,dt$
P1 : $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$. In particular, $\int_a^a f(x)\,dx = 0$
P2 : $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$
P3 : $\int_a^b f(x)\,dx = \int_a^b f(a + b - x)\,dx$
P4 : $\int_0^a f(x)\,dx = \int_0^a f(a - x)\,dx$
(Note that P4 is a particular case of P3)
P5 : $\int_0^{2a} f(x)\,dx = \int_0^a f(x)\,dx + \int_0^a f(2a - x)\,dx$
P6 : $\int_0^{2a} f(x)\,dx = 2\int_0^a f(x)\,dx$, if f(2a − x) = f(x), and
$\int_0^{2a} f(x)\,dx = 0$, if f(2a − x) = −f(x)
P7 : (i) $\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$, if f is an even function, i.e., if f(−x) = f(x).
(ii) $\int_{-a}^{a} f(x)\,dx = 0$, if f is an odd function, i.e., if f(−x) = −f(x).
We give the proofs of these properties one by one.
Proof of P0 It follows directly by making the substitution x = t.
Proof of P1 Let F be an anti derivative of f. Then, by the second fundamental theorem of calculus, we have
$\int_a^b f(x)\,dx = F(b) - F(a) = -[F(a) - F(b)] = -\int_b^a f(x)\,dx$
Here, we observe that, if a = b, then $\int_a^a f(x)\,dx = 0$.
Proof of P2 Let F be an anti derivative of f. Then
$\int_a^b f(x)\,dx = F(b) - F(a)$ ... (1)
$\int_a^c f(x)\,dx = F(c) - F(a)$ ... (2)
and $\int_c^b f(x)\,dx = F(b) - F(c)$ ... (3)
Adding (2) and (3), we get $\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = F(b) - F(a) = \int_a^b f(x)\,dx$
This proves the property P2.
Proof of P3 Let t = a + b − x. Then dt = −dx. When x = a, t = b and when x = b, t = a.
Therefore
$\int_a^b f(x)\,dx = -\int_b^a f(a + b - t)\,dt$
$= \int_a^b f(a + b - t)\,dt$ (by P1)
$= \int_a^b f(a + b - x)\,dx$ (by P0)
Proof of P4 Put t = a − x. Then dt = −dx. When x = 0, t = a and when x = a, t = 0. Now proceed as in P3 .
Proof of P5 Using P2, we have $\int_0^{2a} f(x)\,dx = \int_0^a f(x)\,dx + \int_a^{2a} f(x)\,dx$.
Let t = 2a − x in the second integral on the right hand side. Then dt = −dx. When x = a, t = a and when x = 2a, t = 0. Also x = 2a − t.
Therefore, the second integral becomes
$\int_a^{2a} f(x)\,dx = -\int_a^0 f(2a - t)\,dt = \int_0^a f(2a - t)\,dt = \int_0^a f(2a - x)\,dx$
Hence $\int_0^{2a} f(x)\,dx = \int_0^a f(x)\,dx + \int_0^a f(2a - x)\,dx$
Proof of P6 Using P5, we have $\int_0^{2a} f(x)\,dx = \int_0^a f(x)\,dx + \int_0^a f(2a - x)\,dx$ ... (1)
Now, if f(2a − x) = f(x), then (1) becomes
$\int_0^{2a} f(x)\,dx = \int_0^a f(x)\,dx + \int_0^a f(x)\,dx = 2\int_0^a f(x)\,dx$,
and if f(2a − x) = −f(x), then (1) becomes
$\int_0^{2a} f(x)\,dx = \int_0^a f(x)\,dx - \int_0^a f(x)\,dx = 0$
Proof of P7 Using P2, we have
$\int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx$.
Let t = −x in the first integral on the right hand side. Then dt = −dx. When x = −a, t = a and when x = 0, t = 0. Also x = −t.
Therefore $\int_{-a}^{a} f(x)\,dx = -\int_a^0 f(-t)\,dt + \int_0^a f(x)\,dx$
$= \int_0^a f(-x)\,dx + \int_0^a f(x)\,dx$ (by P1 and P0) ... (1)
(i) Now, if f is an even function, then f(−x) = f(x) and so (1) becomes
$\int_{-a}^{a} f(x)\,dx = \int_0^a f(x)\,dx + \int_0^a f(x)\,dx = 2\int_0^a f(x)\,dx$
(ii) If f is an odd function, then f(−x) = −f(x) and so (1) becomes
$\int_{-a}^{a} f(x)\,dx = -\int_0^a f(x)\,dx + \int_0^a f(x)\,dx = 0$
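Properties P3, P6 and P7 proved above can be spot-checked on concrete functions. The sketch below is only an illustration: the test functions, the interval end points and the helper `simpson` are assumptions chosen here, not taken from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

f = lambda x: math.exp(x) * math.cos(x)  # arbitrary continuous test function
a, b = 0.5, 2.0

# P3: replacing x by a + b - x leaves the integral unchanged
p3_left = simpson(f, a, b)
p3_right = simpson(lambda x: f(a + b - x), a, b)

# P6 with 2a = pi: sin(pi - x) = sin x, while cos(pi - x) = -cos x
p6_sym = (simpson(math.sin, 0, math.pi), 2 * simpson(math.sin, 0, math.pi / 2))
p6_anti = simpson(math.cos, 0, math.pi)

# P7: even and odd integrands on a symmetric interval [-c, c]
c = 1.5
even = lambda x: x**2 * math.cos(x)
odd = lambda x: x**3 * math.cos(x)
p7_even = (simpson(even, -c, c), 2 * simpson(even, 0, c))
p7_odd = simpson(odd, -c, c)

print(p3_left, p3_right)
print(p6_sym, p6_anti)
print(p7_even, p7_odd)
```

In each pair the two values should agree up to rounding, and the antisymmetric and odd cases should be numerically zero.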
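Example 30 Evaluate $\int_{-1}^{2} |x^3 - x|\,dx$
Solution We note that $x^3 - x \ge 0$ on [−1, 0], $x^3 - x \le 0$ on [0, 1] and $x^3 - x \ge 0$ on [1, 2]. So by P2 we write
$\int_{-1}^{2} |x^3 - x|\,dx = \int_{-1}^{0} (x^3 - x)\,dx + \int_0^1 -(x^3 - x)\,dx + \int_1^2 (x^3 - x)\,dx$
$= \int_{-1}^{0} (x^3 - x)\,dx + \int_0^1 (x - x^3)\,dx + \int_1^2 (x^3 - x)\,dx$
$= \left[\frac{x^4}{4} - \frac{x^2}{2}\right]_{-1}^{0} + \left[\frac{x^2}{2} - \frac{x^4}{4}\right]_0^1 + \left[\frac{x^4}{4} - \frac{x^2}{2}\right]_1^2$
$= -\left(\frac{1}{4} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{4}\right) + (4 - 2) - \left(\frac{1}{4} - \frac{1}{2}\right)$
$= -\frac{1}{4} + \frac{1}{2} + \frac{1}{2} - \frac{1}{4} + 2 - \frac{1}{4} + \frac{1}{2} = \frac{3}{2} - \frac{3}{4} + 2 = \frac{11}{4}$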
Example 31 Evaluate $\int_{-\pi/4}^{\pi/4} \sin^2 x\,dx$
Solution We observe that $\sin^2 x$ is an even function. Therefore, by P7 (i), we get
$\int_{-\pi/4}^{\pi/4} \sin^2 x\,dx = 2\int_0^{\pi/4} \sin^2 x\,dx$
$= 2\int_0^{\pi/4} \frac{(1 - \cos 2x)}{2}\,dx = \int_0^{\pi/4} (1 - \cos 2x)\,dx$
$= \left[x - \frac{1}{2}\sin 2x\right]_0^{\pi/4} = \left(\frac{\pi}{4} - \frac{1}{2}\sin\frac{\pi}{2}\right) - 0 = \frac{\pi}{4} - \frac{1}{2}$
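Example 32 Evaluate $\int_0^{\pi} \frac{x\sin x}{1 + \cos^2 x}\,dx$
Solution Let $I = \int_0^{\pi} \frac{x\sin x}{1 + \cos^2 x}\,dx$. Then, by P4, we have
$I = \int_0^{\pi} \frac{(\pi - x)\sin(\pi - x)}{1 + \cos^2(\pi - x)}\,dx$
$= \int_0^{\pi} \frac{(\pi - x)\sin x}{1 + \cos^2 x}\,dx = \pi\int_0^{\pi} \frac{\sin x\,dx}{1 + \cos^2 x} - I$
or $2I = \pi\int_0^{\pi} \frac{\sin x\,dx}{1 + \cos^2 x}$
or $I = \frac{\pi}{2}\int_0^{\pi} \frac{\sin x\,dx}{1 + \cos^2 x}$
Put cos x = t so that −sin x dx = dt. When x = 0, t = 1 and when x = π, t = −1.
Therefore, (by P1) we get
$I = \frac{-\pi}{2}\int_1^{-1} \frac{dt}{1 + t^2} = \frac{\pi}{2}\int_{-1}^{1} \frac{dt}{1 + t^2}$
$= \pi\int_0^1 \frac{dt}{1 + t^2}$ (by P7, since $\frac{1}{1 + t^2}$ is an even function)
$= \pi\left[\tan^{-1} t\right]_0^1 = \pi\left[\tan^{-1} 1 - \tan^{-1} 0\right] = \pi\left[\frac{\pi}{4} - 0\right] = \frac{\pi^2}{4}$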
Example 33 Evaluate $\int_{-1}^{1} \sin^5 x \cos^4 x\,dx$
Solution Let $I = \int_{-1}^{1} \sin^5 x \cos^4 x\,dx$. Let $f(x) = \sin^5 x \cos^4 x$. Then
$f(-x) = \sin^5(-x)\cos^4(-x) = -\sin^5 x \cos^4 x = -f(x)$, i.e., f is an odd function.
Therefore, by P7 (ii), I = 0
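Example 34 Evaluate $\int_0^{\pi/2} \frac{\sin^4 x}{\sin^4 x + \cos^4 x}\,dx$
Solution Let $I = \int_0^{\pi/2} \frac{\sin^4 x}{\sin^4 x + \cos^4 x}\,dx$ ... (1)
Then, by P4
$I = \int_0^{\pi/2} \frac{\sin^4\left(\frac{\pi}{2} - x\right)}{\sin^4\left(\frac{\pi}{2} - x\right) + \cos^4\left(\frac{\pi}{2} - x\right)}\,dx = \int_0^{\pi/2} \frac{\cos^4 x}{\cos^4 x + \sin^4 x}\,dx$ ... (2)
Adding (1) and (2), we get
$2I = \int_0^{\pi/2} \frac{\sin^4 x + \cos^4 x}{\sin^4 x + \cos^4 x}\,dx = \int_0^{\pi/2} dx = [x]_0^{\pi/2} = \frac{\pi}{2}$
Hence $I = \frac{\pi}{4}$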
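Example 35 Evaluate $\int_{\pi/6}^{\pi/3} \frac{dx}{1 + \sqrt{\tan x}}$
Solution Let $I = \int_{\pi/6}^{\pi/3} \frac{dx}{1 + \sqrt{\tan x}} = \int_{\pi/6}^{\pi/3} \frac{\sqrt{\cos x}\,dx}{\sqrt{\cos x} + \sqrt{\sin x}}$ ... (1)
Then, by P3
$I = \int_{\pi/6}^{\pi/3} \frac{\sqrt{\cos\left(\frac{\pi}{3} + \frac{\pi}{6} - x\right)}\,dx}{\sqrt{\cos\left(\frac{\pi}{3} + \frac{\pi}{6} - x\right)} + \sqrt{\sin\left(\frac{\pi}{3} + \frac{\pi}{6} - x\right)}}$
$= \int_{\pi/6}^{\pi/3} \frac{\sqrt{\sin x}}{\sqrt{\sin x} + \sqrt{\cos x}}\,dx$ ... (2)
Adding (1) and (2), we get
$2I = \int_{\pi/6}^{\pi/3} dx = [x]_{\pi/6}^{\pi/3} = \frac{\pi}{3} - \frac{\pi}{6} = \frac{\pi}{6}$. Hence $I = \frac{\pi}{12}$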
Example 36 Evaluate $\int_0^{\pi/2} \log \sin x\,dx$
Solution Let $I = \int_0^{\pi/2} \log \sin x\,dx$
Then, by P4
$I = \int_0^{\pi/2} \log \sin\left(\frac{\pi}{2} - x\right)dx = \int_0^{\pi/2} \log \cos x\,dx$
Adding the two values of I, we get
$2I = \int_0^{\pi/2} (\log \sin x + \log \cos x)\,dx$
$= \int_0^{\pi/2} (\log \sin x \cos x + \log 2 - \log 2)\,dx$ (by adding and subtracting log 2)
$= \int_0^{\pi/2} \log \sin 2x\,dx - \int_0^{\pi/2} \log 2\,dx$ (Why?)
Put 2x = t in the first integral. Then 2 dx = dt, when x = 0, t = 0 and when $x = \frac{\pi}{2}$, t = π.
Therefore $2I = \frac{1}{2}\int_0^{\pi} \log \sin t\,dt - \frac{\pi}{2}\log 2$
$= \frac{2}{2}\int_0^{\pi/2} \log \sin t\,dt - \frac{\pi}{2}\log 2$ [by P6 as sin(π − t) = sin t]
$= \int_0^{\pi/2} \log \sin x\,dx - \frac{\pi}{2}\log 2$ (by changing variable t to x)
$= I - \frac{\pi}{2}\log 2$
Hence $\int_0^{\pi/2} \log \sin x\,dx = -\frac{\pi}{2}\log 2$.
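The result of Example 36 is easy to test numerically, provided the rule used never evaluates log sin x at x = 0, where the integrand is unbounded. The sketch below uses a composite midpoint rule for that reason; the helper name `midpoint` and the number of subintervals are assumptions made here for illustration.

```python
import math

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule; it avoids the endpoint x = 0 where log sin x is unbounded."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

numeric = midpoint(lambda x: math.log(math.sin(x)), 0.0, math.pi / 2)
exact = -(math.pi / 2) * math.log(2)  # value obtained in the worked solution

print(numeric, exact)
```

The two printed values should agree to several decimal places; convergence is slower than for a smooth integrand because of the logarithmic singularity at 0.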

8 Application of Integrals
Introduction

In geometry, we have learnt formulae to calculate areas of various geometrical figures including
triangles, rectangles, trapeziums and circles. Such formulae are fundamental in the applications of
mathematics to many real life problems. The formulae of elementary geometry allow us to calculate
areas of many simple figures. However, they are inadequate for calculating the areas enclosed by
curves. For that we shall need some concepts of Integral Calculus.
In the previous chapter, we have studied to find the area bounded by the curve y = f (x), the ordinates
x = a, x = b and x-axis, while calculating definite integral as the limit of a sum. Here, in this chapter,
we shall study a specific application of integrals to find the area under simple curves, area between
lines and arcs of circles, parabolas and ellipses (standard forms only). We shall also deal with finding
the area bounded by the above said curves.

Area under Simple Curves

In the previous chapter, we have studied definite integral as the limit of a sum and how to evaluate
definite integral using Fundamental Theorem of Calculus. Now, we consider the easy and intuitive way
of finding the area bounded by the curve y = f (x), x-axis and the ordinates x = a and x = b. We can think
of area under the curve as composed of large number of very thin vertical strips. Consider an arbitrary
strip of height y and width dx, then dA (area of the elementary strip)= ydx, where, y = f (x).
This area is called the elementary area which is located at an arbitrary position within the region which
is specified by some value of x between a and b. We can think of the total area A of the region between
x-axis, ordinates x = a, x = b and the curve y = f (x) as the result of adding up the elementary areas of
thin strips across the region PQRSP. Symbolically, we express
$A = \int_a^b dA = \int_a^b y\,dx = \int_a^b f(x)\,dx$
The area A of the region bounded by the curve x = g(y), y-axis and the lines y = c, y = d is given by
$A = \int_c^d x\,dy = \int_c^d g(y)\,dy$

Remark If the position of the curve under consideration is below the x-axis, then since f (x) < 0 from x = a to x = b, the area bounded by the curve, x-axis and the ordinates x = a, x = b comes out to be negative. But it is only the numerical value of the area which is taken into consideration. Thus, if the area is negative, we take its absolute value, i.e., $\left|\int_a^b f(x)\,dx\right|$.

Generally, it may happen that some portion of the curve is above x-axis and some is below the x-axis.
Here, A1 < 0 and A2 > 0. Therefore, the area A bounded by the curve y = f (x), x-axis and the ordinates
x = a and x = b is given by A = |A1| + A2.

Example 1 Find the area enclosed by the circle $x^2 + y^2 = a^2$.
Solution The whole area enclosed by the given circle
= 4 (area of the region AOBA bounded by the curve, x-axis and the ordinates x = 0 and x = a) [as the circle is symmetrical about both x-axis and y-axis]
$= 4\int_0^a y\,dx$ (taking vertical strips)
$= 4\int_0^a \sqrt{a^2 - x^2}\,dx$
since $x^2 + y^2 = a^2$ gives $y = \pm\sqrt{a^2 - x^2}$.
As the region AOBA lies in the first quadrant, y is taken as positive. Integrating, we get the whole area enclosed by the given circle
$= 4\left[\frac{x}{2}\sqrt{a^2 - x^2} + \frac{a^2}{2}\sin^{-1}\frac{x}{a}\right]_0^a$
$= 4\left[\left(\frac{a}{2}\times 0 + \frac{a^2}{2}\sin^{-1} 1\right) - 0\right] = 4\left(\frac{a^2}{2}\right)\left(\frac{\pi}{2}\right) = \pi a^2$
Alternatively, considering horizontal strips, the whole area of the region enclosed by the circle
$= 4\int_0^a x\,dy = 4\int_0^a \sqrt{a^2 - y^2}\,dy$ (Why?)
$= 4\left[\frac{y}{2}\sqrt{a^2 - y^2} + \frac{a^2}{2}\sin^{-1}\frac{y}{a}\right]_0^a$
$= 4\left[\left(\frac{a}{2}\times 0 + \frac{a^2}{2}\sin^{-1} 1\right) - 0\right]$
$= 4\,\frac{a^2}{2}\,\frac{\pi}{2} = \pi a^2$
Example 2 Find the area enclosed by the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$
Solution The area of the region ABA′B′A bounded by the ellipse
= 4 (area of the region AOBA in the first quadrant bounded by the curve, x-axis and the ordinates x = 0, x = a)
(as the ellipse is symmetrical about both x-axis and y-axis)
$= 4\int_0^a y\,dx$ (taking vertical strips)
Now $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ gives $y = \pm\frac{b}{a}\sqrt{a^2 - x^2}$, but as the region AOBA lies in the first quadrant, y is taken as positive. So, the required area is
$= 4\int_0^a \frac{b}{a}\sqrt{a^2 - x^2}\,dx$
$= \frac{4b}{a}\left[\frac{x}{2}\sqrt{a^2 - x^2} + \frac{a^2}{2}\sin^{-1}\frac{x}{a}\right]_0^a$ (Why?)
$= \frac{4b}{a}\left[\left(\frac{a}{2}\times 0 + \frac{a^2}{2}\sin^{-1} 1\right) - 0\right]$
$= \frac{4b}{a}\,\frac{a^2}{2}\,\frac{\pi}{2} = \pi ab$
Alternatively, considering horizontal strips, the area of the ellipse is
$= 4\int_0^b x\,dy = 4\,\frac{a}{b}\int_0^b \sqrt{b^2 - y^2}\,dy$ (Why?)
$= \frac{4a}{b}\left[\frac{y}{2}\sqrt{b^2 - y^2} + \frac{b^2}{2}\sin^{-1}\frac{y}{b}\right]_0^b$
$= \frac{4a}{b}\left[\left(\frac{b}{2}\times 0 + \frac{b^2}{2}\sin^{-1} 1\right) - 0\right]$
$= \frac{4a}{b}\,\frac{b^2}{2}\,\frac{\pi}{2} = \pi ab$
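The picture of an area as a sum of thin vertical strips of area y dx can be turned into a direct computation. In the sketch below, which is an illustration and not part of the text, the radius 3 and the semi-axes 5 and 2 are sample values, and `strip_area` is a hypothetical helper that adds up n strips using the height at the middle of each strip.

```python
import math

def strip_area(height, a, b, n=100_000):
    """Sum of n thin vertical strips of width dx = (b - a)/n, each of area y * dx."""
    dx = (b - a) / n
    return sum(height(a + (i + 0.5) * dx) * dx for i in range(n))

# Circle x^2 + y^2 = r^2: four times the first-quadrant area (sample radius)
r = 3.0
circle = 4 * strip_area(lambda x: math.sqrt(r * r - x * x), 0.0, r)

# Ellipse x^2/a^2 + y^2/b^2 = 1: four times the first-quadrant area (sample semi-axes)
a_axis, b_axis = 5.0, 2.0
ellipse = 4 * strip_area(
    lambda x: (b_axis / a_axis) * math.sqrt(a_axis**2 - x * x), 0.0, a_axis
)

print(circle, math.pi * r * r)
print(ellipse, math.pi * a_axis * b_axis)
```

As the strips get thinner (larger n), the sums approach πr² and πab, the values obtained above by integration.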
The area of the region bounded by a curve and a line
In this subsection, we will find the area of the region bounded by a line and a circle, a line and a
parabola, a line and an ellipse. Equations of above mentioned curves will be in their standard forms
only as the cases in other forms go beyond the scope of this textbook.
Example 3 Find the area of the region bounded by the curve $y = x^2$ and the line y = 4.
Solution Since the given curve represented by the equation $y = x^2$ is a parabola symmetrical about y-axis only, therefore, the required area of the region AOBA is given by
$2\int_0^4 x\,dy$ = 2 (area of the region BONB bounded by the curve, y-axis and the lines y = 0 and y = 4)
$= 2\int_0^4 \sqrt{y}\,dy = 2\times\frac{2}{3}\left[y^{3/2}\right]_0^4 = \frac{4}{3}\times 8 = \frac{32}{3}$ (Why?)
Here, we have taken horizontal strips.


Alternatively, we may consider the vertical strips like PQ to obtain the area of the region AOBA. To this end, we solve the equations $x^2 = y$ and y = 4, which gives x = –2 and x = 2.
Thus, the region AOBA may be stated as the region bounded by the curve $y = x^2$, y = 4 and the ordinates x = –2 and x = 2.
Therefore, the area of the region AOBA
$= \int_{-2}^{2} y\,dx$ [y = (y-coordinate of Q) − (y-coordinate of P) = $4 - x^2$]
$= 2\int_0^2 (4 - x^2)\,dx$ (Why?)
$= 2\left[4x - \frac{x^3}{3}\right]_0^2 = 2\left[4\times 2 - \frac{8}{3}\right] = \frac{32}{3}$
Remark From the above examples, it is inferred that we can consider either vertical strips or horizontal
strips for calculating the area of the region. Henceforth, we shall consider either of these two, most
preferably vertical strips.
Example 4 Find the area of the region in the first quadrant enclosed by the x-axis, the line y = x, and the circle $x^2 + y^2 = 32$.
Solution The given equations are
y = x ... (1)
and $x^2 + y^2 = 32$ ... (2)
Solving (1) and (2), we find that the line and the circle meet at B(4, 4) in the first quadrant. Draw perpendicular BM to the x-axis.
Therefore, the required area = area of the region OBMO + area of the region BMAB.
Now, the area of the region OBMO
$= \int_0^4 y\,dx = \int_0^4 x\,dx = \frac{1}{2}\left[x^2\right]_0^4 = 8$ ... (3)
Again, the area of the region BMAB
$= \int_4^{4\sqrt{2}} y\,dx = \int_4^{4\sqrt{2}} \sqrt{32 - x^2}\,dx$
$= \left[\frac{1}{2}x\sqrt{32 - x^2} + \frac{1}{2}\times 32\times\sin^{-1}\frac{x}{4\sqrt{2}}\right]_4^{4\sqrt{2}}$
$= \left(\frac{1}{2}\,4\sqrt{2}\times 0 + \frac{1}{2}\times 32\times\sin^{-1} 1\right) - \left(\frac{4}{2}\sqrt{32 - 16} + \frac{1}{2}\times 32\times\sin^{-1}\frac{1}{\sqrt{2}}\right)$
$= 8\pi - (8 + 4\pi) = 4\pi - 8$ ... (4)
Adding (3) and (4), we get the required area = 8 + (4π − 8) = 4π.
Example 5 Find the area bounded by the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ and the ordinates x = 0 and x = ae, where $b^2 = a^2(1 - e^2)$ and e < 1.
Solution The required area of the region BOB′RFSB is enclosed by the ellipse and the lines x = 0 and x = ae.
Note that the area of the region BOB′RFSB
$= 2\int_0^{ae} y\,dx = 2\,\frac{b}{a}\int_0^{ae} \sqrt{a^2 - x^2}\,dx$
$= \frac{2b}{a}\left[\frac{x}{2}\sqrt{a^2 - x^2} + \frac{a^2}{2}\sin^{-1}\frac{x}{a}\right]_0^{ae}$
$= \frac{2b}{2a}\left[ae\sqrt{a^2 - a^2e^2} + a^2\sin^{-1} e\right]$
$= ab\left[e\sqrt{1 - e^2} + \sin^{-1} e\right]$
Area between Two Curves
Intuitively, true in the sense of Leibnitz, integration is the act of calculating the area by cutting the
region into a large number of small strips of elementary area and then adding up these elementary
areas. Suppose we are given two curves represented by y = f (x), y = g (x), where f (x) ≥ g(x) in [a, b].
Here the points of intersection of these two curves are given by x = a and x = b obtained by taking
common values of y from the given equation of two curves.
For setting up a formula for the integral, it is convenient to take elementary area in the form of vertical
strips. Elementary strip has height f (x) – g (x) and width dx so that the elementary area

dA = [f (x) – g(x)] dx, and the total area A can be taken as
$A = \int_a^b [f(x) - g(x)]\,dx$
Alternatively,
A = [area bounded by y = f(x), x-axis and the lines x = a, x = b] − [area bounded by y = g(x), x-axis and the lines x = a, x = b]
$= \int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b [f(x) - g(x)]\,dx$, where f(x) ≥ g(x) in [a, b]
If f (x) ≥ g (x) in [a, c] and f (x) ≤ g (x) in [c, b], where a < c < b, then the area of the regions bounded
by curves can be written as
Total Area = Area of the region ACBDA + Area of the region BPRQB
$= \int_a^c [f(x) - g(x)]\,dx + \int_c^b [g(x) - f(x)]\,dx$

Example 6 Find the area of the region bounded by the two parabolas $y = x^2$ and $y^2 = x$.
Solution The points of intersection of these two parabolas are O(0, 0) and A(1, 1).
Here, we can set $y^2 = x$ or $y = \sqrt{x} = f(x)$ and $y = x^2 = g(x)$, where f (x) ≥ g (x) in [0, 1].
Therefore, the required area of the shaded region
$= \int_0^1 [f(x) - g(x)]\,dx$
$= \int_0^1 \left[\sqrt{x} - x^2\right]dx = \left[\frac{2}{3}x^{3/2} - \frac{x^3}{3}\right]_0^1$
$= \frac{2}{3} - \frac{1}{3} = \frac{1}{3}$
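The area between two curves, including the case where the curves cross inside the interval, can be estimated by integrating |f(x) − g(x)|. The sketch below is illustrative: `area_between` is a hypothetical helper based on the midpoint rule, and the sine and cosine pair is an extra example chosen here (it is not in the text) to show a crossing at x = π/4.

```python
import math

def area_between(f, g, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of |f(x) - g(x)| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += abs(f(x) - g(x))  # strip height is the gap between the curves
    return total * h

# Example 6: y = sqrt(x) and y = x^2 on [0, 1]
parabolas = area_between(math.sqrt, lambda x: x * x, 0.0, 1.0)

# Curves that cross inside the interval: cos x >= sin x on [0, pi/4], sin x >= cos x after
crossing = area_between(math.cos, math.sin, 0.0, math.pi / 2)

print(parabolas, 1 / 3)
print(crossing, 2 * math.sqrt(2) - 2)
```

Taking the absolute value has the same effect as splitting the integral at the crossing point c and integrating f − g on [a, c] and g − f on [c, b].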
Example 7 Find the area lying above x-axis and included between the circle $x^2 + y^2 = 8x$ and inside of the parabola $y^2 = 4x$.
Solution The given equation of the circle $x^2 + y^2 = 8x$ can be expressed as $(x - 4)^2 + y^2 = 16$. Thus, the centre of the circle is (4, 0) and radius is 4. Its intersection with the parabola $y^2 = 4x$ gives
$x^2 + 4x = 8x$
or $x^2 - 4x = 0$
or x (x – 4) = 0
or x = 0, x = 4
Thus, the points of intersection of these two curves are O(0, 0) and P(4, 4) above the x-axis.
The required area of the region OPQCO included between these two curves above x-axis is
= (area of the region OCPO) + (area of the region PCQP)
$= \int_0^4 y\,dx + \int_4^8 y\,dx$
$= 2\int_0^4 \sqrt{x}\,dx + \int_4^8 \sqrt{4^2 - (x - 4)^2}\,dx$ (Why?)
$= 2\times\frac{2}{3}\left[x^{3/2}\right]_0^4 + \int_0^4 \sqrt{4^2 - t^2}\,dt$, where x − 4 = t (Why?)
$= \frac{32}{3} + \left[\frac{t}{2}\sqrt{4^2 - t^2} + \frac{1}{2}\times 4^2\times\sin^{-1}\frac{t}{4}\right]_0^4$
$= \frac{32}{3} + \left[\frac{4}{2}\times 0 + \frac{1}{2}\times 4^2\times\sin^{-1} 1\right] = \frac{32}{3} + \left[0 + 8\times\frac{\pi}{2}\right] = \frac{32}{3} + 4\pi = \frac{4}{3}(8 + 3\pi)$
Example 8 Using integration find the area of region bounded by the triangle whose vertices are (1, 0), (2, 2) and (3, 1).
Solution Let A(1, 0), B(2, 2) and C(3, 1) be the vertices of a triangle ABC.
Area of ∆ABC = Area of ∆ABD + Area of trapezium BDEC – Area of ∆AEC
Now equations of the sides AB, BC and CA are given by
$y = 2(x - 1)$, $y = 4 - x$, $y = \frac{1}{2}(x - 1)$, respectively.
Hence, area of ∆ABC $= \int_1^2 2(x - 1)\,dx + \int_2^3 (4 - x)\,dx - \int_1^3 \frac{x - 1}{2}\,dx$
$= 2\left[\frac{x^2}{2} - x\right]_1^2 + \left[4x - \frac{x^2}{2}\right]_2^3 - \frac{1}{2}\left[\frac{x^2}{2} - x\right]_1^3$
$= 2\left[\left(\frac{2^2}{2} - 2\right) - \left(\frac{1}{2} - 1\right)\right] + \left[\left(4\times 3 - \frac{3^2}{2}\right) - \left(4\times 2 - \frac{2^2}{2}\right)\right] - \frac{1}{2}\left[\left(\frac{3^2}{2} - 3\right) - \left(\frac{1}{2} - 1\right)\right]$
$= \frac{3}{2}$
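The value 3/2 can be confirmed independently of integration. The sketch below uses the shoelace formula for the area of a polygon from its vertices, which is a different technique from the one used in the solution; the function name `shoelace_area` is introduced here only for illustration.

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

# Vertices A(1, 0), B(2, 2), C(3, 1) of Example 8
print(shoelace_area([(1, 0), (2, 2), (3, 1)]))  # 1.5, i.e. 3/2
```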
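Example 9 Find the area of the region enclosed between the two circles: $x^2 + y^2 = 4$ and $(x - 2)^2 + y^2 = 4$.
Solution Equations of the given circles are
$x^2 + y^2 = 4$ ... (1)
and $(x - 2)^2 + y^2 = 4$ ... (2)
From equations (1) and (2), we have
$(x - 2)^2 + y^2 = x^2 + y^2$
or $x^2 - 4x + 4 + y^2 = x^2 + y^2$
or x = 1 which gives $y = \pm\sqrt{3}$
Thus, the points of intersection of the given circles are $A(1, \sqrt{3})$ and $A'(1, -\sqrt{3})$.
Required area of the enclosed region OACA′O between circles
= 2 [area of the region ODCAO] (Why?)
= 2 [area of the region ODAO + area of the region DCAD]
$= 2\left[\int_0^1 y\,dx + \int_1^2 y\,dx\right]$
$= 2\left[\int_0^1 \sqrt{4 - (x - 2)^2}\,dx + \int_1^2 \sqrt{4 - x^2}\,dx\right]$ (Why?)
$= 2\left[\frac{1}{2}(x - 2)\sqrt{4 - (x - 2)^2} + \frac{1}{2}\times 4\sin^{-1}\left(\frac{x - 2}{2}\right)\right]_0^1 + 2\left[\frac{1}{2}x\sqrt{4 - x^2} + \frac{1}{2}\times 4\sin^{-1}\frac{x}{2}\right]_1^2$
$= \left[(x - 2)\sqrt{4 - (x - 2)^2} + 4\sin^{-1}\left(\frac{x - 2}{2}\right)\right]_0^1 + \left[x\sqrt{4 - x^2} + 4\sin^{-1}\frac{x}{2}\right]_1^2$
$= \left[\left(-\sqrt{3} + 4\sin^{-1}\left(\frac{-1}{2}\right)\right) - 4\sin^{-1}(-1)\right] + \left[4\sin^{-1} 1 - \sqrt{3} - 4\sin^{-1}\frac{1}{2}\right]$
$= \left[\left(-\sqrt{3} - 4\times\frac{\pi}{6}\right) + 4\times\frac{\pi}{2}\right] + \left[4\times\frac{\pi}{2} - \sqrt{3} - 4\times\frac{\pi}{6}\right]$
$= \left(-\sqrt{3} - \frac{2\pi}{3} + 2\pi\right) + \left(2\pi - \sqrt{3} - \frac{2\pi}{3}\right)$
$= \frac{8\pi}{3} - 2\sqrt{3}$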
9 Differential Equations
Introduction
For a given function g, find a function f such that
$\frac{dy}{dx} = g(x)$, where y = f (x) ... (1)
An equation of the form (1) is known as a differential equation. A formal definition will be given later.
These equations arise in a variety of applications, be it in Physics, Chemistry, Biology, Anthropology,
Geology, Economics etc. Hence, an in-depth study of differential equations has assumed prime
importance in all modern scientific investigations.
In this chapter, we will study some basic concepts related to differential equation, general and
particular solutions of a differential equation, formation of differential equations, some methods to
solve a first order - first degree differential equation and some applications of differential equations
in different areas.
Basic Concepts
We are already familiar with the equations of the type:
$x^2 - 3x + 3 = 0$ ... (1)
sin x + cos x = 0 ... (2)
x + y = 7 ... (3)
Let us consider the equation:
$x\frac{dy}{dx} + y = 0$ ... (4)
We see that equations (1), (2) and (3) involve independent and/or dependent variable (variables) only
but equation (4) involves variables as well as derivative of the dependent variable y with respect to the
independent variable x. Such an equation is called a differential equation.
In general, an equation involving derivative (derivatives) of the dependent variable with respect to
independent variable (variables) is called a differential equation.
A differential equation involving derivatives of the dependent variable with respect to only one
independent variable is called an ordinary differential equation, e.g.,
$2\frac{d^2y}{dx^2} + \left(\frac{dy}{dx}\right)^3 = 0$ is an ordinary differential equation ... (5)

Of course, there are differential equations involving derivatives with respect to more than one
independent variables, called partial differential equations but at this stage we shall confine ourselves
to the study of ordinary differential equations only. Now onward, we will use the term ‘differential
equation’ for ‘ordinary differential equation’.
Note
1. We shall prefer to use the following notations for derivatives:
$\frac{dy}{dx} = y'$, $\frac{d^2y}{dx^2} = y''$, $\frac{d^3y}{dx^3} = y'''$
2. For derivatives of higher order, it will be inconvenient to use so many dashes as supersuffix; therefore, we use the notation $y_n$ for the nth order derivative $\frac{d^ny}{dx^n}$.
Order of a differential equation

Order of a differential equation is defined as the order of the highest order derivative of the dependent variable with respect to the independent variable involved in the given differential equation.
Consider the following differential equations:
$\frac{dy}{dx} = e^x$ ... (6)
$\frac{d^2y}{dx^2} + y = 0$ ... (7)
$\left(\frac{d^3y}{dx^3}\right) + x^2\left(\frac{d^2y}{dx^2}\right)^3 = 0$ ... (8)

The equations (6), (7) and (8) involve the highest derivative of first, second and third order respectively,
Therefore, the order of these equations are 1, 2 and 3 respectively.
Degree of a differential equation
To study the degree of a differential equation, the key point is that the differential equation must be a polynomial equation in derivatives, i.e., y′, y″, y‴ etc. Consider the following differential equations:
$\frac{d^3y}{dx^3} + 2\left(\frac{d^2y}{dx^2}\right)^2 - \frac{dy}{dx} + y = 0$ ... (9)
$\left(\frac{dy}{dx}\right)^2 + \left(\frac{dy}{dx}\right) - \sin^2 y = 0$ ... (10)
$\frac{dy}{dx} + \sin\left(\frac{dy}{dx}\right) = 0$ ... (11)

We observe that equation (9) is a polynomial equation in y‴, y″ and y′, and equation (10) is a polynomial equation in y′ (not a polynomial in y though). Degree of such differential equations can be defined. But equation (11) is not a polynomial equation in y′ and the degree of such a differential equation cannot be defined.
By the degree of a differential equation, when it is a polynomial equation in derivatives, we mean the
highest power (positive integral index) of the highest order derivative involved in the given differential
equation.
In view of the above definition, one may observe that differential equations (6), (7), (8) and (9) each are
of degree one, equation (10) is of degree two while the degree of differential equation (11) is not defined.
Note Order and degree (if defined) of a differential equation are always positive integers.
Example 1 Find the order and degree, if defined, of each of the following differential equations:
(i) $\frac{dy}{dx} - \cos x = 0$
(ii) $xy\frac{d^2y}{dx^2} + x\left(\frac{dy}{dx}\right)^2 - y\frac{dy}{dx} = 0$
(iii) $y''' + y^2 + e^{y'} = 0$
Solution
(i) The highest order derivative present in the differential equation is $\frac{dy}{dx}$, so its order is one. It is a polynomial equation in y′ and the highest power raised to $\frac{dy}{dx}$ is one, so its degree is one.
(ii) The highest order derivative present in the given differential equation is $\frac{d^2y}{dx^2}$, so its order is two. It is a polynomial equation in $\frac{d^2y}{dx^2}$ and $\frac{dy}{dx}$ and the highest power raised to $\frac{d^2y}{dx^2}$ is one, so its degree is one.
(iii) The highest order derivative present in the differential equation is y‴, so its order is three. The given differential equation is not a polynomial equation in its derivatives and so its degree is not defined.

General and Particular Solutions of a Differential Equation


In earlier Classes, we have solved the equations of the type:
$x^2 + 1 = 0$ ... (1)
$\sin^2 x - \cos x = 0$ ... (2)
Solution of equations (1) and (2) are numbers, real or complex, that will satisfy the given equation, i.e., when that number is substituted for the unknown x in the given equation, L.H.S. becomes equal to the R.H.S.
Now consider the differential equation $\frac{d^2y}{dx^2} + y = 0$ ... (3)
In contrast to the first two equations, the solution of this differential equation is a function ϕ that will satisfy it, i.e., when the function ϕ is substituted for the unknown y (dependent variable) in the given differential equation, L.H.S. becomes equal to R.H.S.
The curve y = ϕ(x) is called the solution curve (integral curve) of the given differential equation. Consider the function given by
y = ϕ(x) = a sin(x + b), ... (4)
where a, b ∈ R. When this function and its derivative are substituted in equation (3), L.H.S. = R.H.S. So it is a solution of the differential equation (3).
Let a and b be given some particular values, say a = 2 and $b = \frac{\pi}{4}$; then we get a function
$y = \phi_1(x) = 2\sin\left(x + \frac{\pi}{4}\right)$ ... (5)
When this function and its derivative are substituted in equation (3), again L.H.S. = R.H.S. Therefore ϕ1 is also a solution of equation (3).
Function ϕ consists of two arbitrary constants (parameters) a, b and it is called the general solution of the given differential equation, whereas function ϕ1 contains no arbitrary constants but only the particular values of the parameters a and b and hence is called a particular solution of the given differential equation.
The solution which contains arbitrary constants is called the general solution (primitive) of the differential equation.
The solution free from arbitrary constants, i.e., the solution obtained from the general solution by giving particular values to the arbitrary constants, is called a particular solution of the differential equation.
Example 2 Verify that the function $y = e^{-3x}$ is a solution of the differential equation
$\frac{d^2y}{dx^2} + \frac{dy}{dx} - 6y = 0$
Solution Given function is $y = e^{-3x}$. Differentiating both sides of the equation with respect to x, we get
$\frac{dy}{dx} = -3e^{-3x}$ ... (1)
Now, differentiating (1) with respect to x, we have
$\frac{d^2y}{dx^2} = 9e^{-3x}$
Substituting the values of $\frac{d^2y}{dx^2}$, $\frac{dy}{dx}$ and y in the given differential equation, we get
L.H.S. $= 9e^{-3x} + (-3e^{-3x}) - 6e^{-3x} = 9e^{-3x} - 9e^{-3x} = 0$ = R.H.S.
Therefore, the given function is a solution of the given differential equation.
Example 3 Verify that the function y = a cos x + b sin x, where a, b ∈ R, is a solution of the differential equation $\frac{d^2y}{dx^2} + y = 0$
Solution The given function is
y = a cos x + b sin x ... (1)
Differentiating both sides of equation (1) with respect to x, successively, we get
$\frac{dy}{dx} = -a\sin x + b\cos x$
$\frac{d^2y}{dx^2} = -a\cos x - b\sin x$
Substituting the values of $\frac{d^2y}{dx^2}$ and y in the given differential equation, we get
L.H.S. = (−a cos x − b sin x) + (a cos x + b sin x) = 0 = R.H.S.
Therefore, the given function is a solution of the given differential equation.
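The verifications in Examples 2 and 3 can also be done numerically by approximating the derivatives with central differences and checking that the left hand side is close to zero. The sketch below is an illustration only: the helpers `d1` and `d2`, the step size, the sample points and the sample constants a = 2, b = −1.5 are assumptions made here.

```python
import math

def d1(y, x, h=1e-4):
    """Central-difference approximation of the first derivative y'(x)."""
    return (y(x + h) - y(x - h)) / (2 * h)

def d2(y, x, h=1e-4):
    """Central-difference approximation of the second derivative y''(x)."""
    return (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)

points = (0.0, 0.5, 1.0)  # sample points, chosen only for illustration

# Example 2: y = e^(-3x) substituted in y'' + y' - 6y
y2 = lambda x: math.exp(-3 * x)
res2 = [d2(y2, x) + d1(y2, x) - 6 * y2(x) for x in points]

# Example 3: y = a cos x + b sin x substituted in y'' + y (sample constants)
a, b = 2.0, -1.5
y3 = lambda x: a * math.cos(x) + b * math.sin(x)
res3 = [d2(y3, x) + y3(x) for x in points]

print(res2)  # each residual should be close to 0
print(res3)  # each residual should be close to 0
```

The residuals are not exactly zero because finite differences only approximate the derivatives; the exact verification is the substitution carried out in the examples.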
Formation of a Differential Equation whose General Solution is given
We know that the equation
$x^2 + y^2 + 2x - 4y + 4 = 0$ ... (1)
represents a circle having centre at (−1, 2) and radius 1 unit.
Differentiating equation (1) with respect to x, we get
$\frac{dy}{dx} = \frac{x + 1}{2 - y}$ (y ≠ 2) ... (2)

which is a differential equation. You will find later on [See (example 9 section 9.5.1.)] that this equation
represents the family of circles and one member of the family is the circle given in equation (1).
Let us consider the equation
$x^2 + y^2 = r^2$ ... (3)
By giving different values to r, we get different members of the family, e.g., $x^2 + y^2 = 1$, $x^2 + y^2 = 4$, $x^2 + y^2 = 9$ etc.
Thus, equation (3) represents a family of concentric circles centered at the origin and having different
radii.

We are interested in finding a differential equation that is satisfied by each member of the family. The differential equation must be free from r because r is different for different members of the family. This equation is obtained by differentiating equation (3) with respect to x, i.e.,
$2x + 2y\frac{dy}{dx} = 0$ or $x + y\frac{dy}{dx} = 0$ ... (4)
which represents the family of concentric circles given by equation (3).
Again, let us consider the equation
y = mx + c... (5)
By giving different values to the parameters m and c, we get different members of the family, e.g.,
y = x (m = 1, c = 0)
y = √3 x (m = √3, c = 0)
y = x + 1 (m = 1, c = 1)
y = – x (m = – 1, c = 0)
y = – x – 1 (m = – 1, c = – 1) etc.
Thus, equation (5) represents the family of straight lines, where m, c are parameters. We are now
interested in finding a differential equation that is satisfied by each member of the family. Further, the
equation must be free from m and c because m and c are different for different members of the family.
This is obtained by differentiating equation (5) with respect to x successively; we get
$\frac{dy}{dx} = m$ and $\frac{d^2y}{dx^2} = 0$ ... (6)
The equation (6) represents the family of straight lines given by equation (5).
Note that equations (3) and (5) are the general solutions of equations (4) and (6) respectively.
Procedure to form a differential equation that will represent a given family of curves
(a) If the given family F1 of curves depends on only one parameter then it is represented by an equation of the form
F1 (x, y, a) = 0 ... (1)
For example, the family of parabolas $y^2 = ax$ can be represented by an equation of the form F1 (x, y, a) : $y^2 = ax$.
Differentiating equation (1) with respect to x, we get an equation involving y′, y, x, and a, i.e.,
g (x, y, y′, a) = 0 ... (2)
The required differential equation is then obtained by eliminating a from equations (1) and (2) as
F (x, y, y′) = 0 ... (3)
(b) If the given family F2 of curves depends on the parameters a, b (say) then it is represented by an
equation of the form
F2 (x, y, a, b) = 0... (4)
Differentiating equation (4) with respect to x, we get an equation involving y′, x, y, a, b, i.e.,

g (x, y, y′, a, b) = 0... (5)


But it is not possible to eliminate two parameters a and b from the two equations and so, we need a
third equation. This equation is obtained by differentiating equation (5), with respect to x, to obtain a
relation of the form
h (x, y, y′, y″, a, b) = 0... (6)
The required differential equation is then obtained by eliminating a and b from equations (4), (5) and
(6) as
F (x, y, y′, y″) = 0... (7)
Note The order of a differential equation representing a family of curves is the same as the number of
arbitrary constants present in the equation corresponding to the family of curves.
Example 4 Form the differential equation representing the family of curves y = mx, where, m is arbitrary
constant.
Solution We have
y = mx... (1)
Differentiating both sides of equation (1) with respect to x, we get
$\frac{dy}{dx} = m$
Substituting the value of m in equation (1), we get $y = \frac{dy}{dx} \cdot x$
or $x\frac{dy}{dx} - y = 0$
which is free from the parameter m and hence this is the required differential equation.
Example 5 Form the differential equation representing the family of curves y = a sin(x + b), where a, b
are arbitrary constants.
Solution We have
y = a sin (x + b) … (1)
Differentiating both sides of equation (1) with respect to x successively, we get
$\frac{dy}{dx} = a\cos(x + b)$ … (2)
$\frac{d^2y}{dx^2} = -a\sin(x + b)$ … (3)
Eliminating a and b from equations (1), (2) and (3), we get
$\frac{d^2y}{dx^2} + y = 0$ … (4)
which is free from the arbitrary constants a and b and hence this is the required differential equation.
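The result of Example 5 can be spot-checked numerically. The sketch below is only an illustration, not part of the textbook method: the helper names (`family`, `second_derivative`, `residual`) and the step size are assumptions chosen here. It estimates y″ by a central difference and evaluates y″ + y for a few members of the family; the value should be close to zero for every choice of a and b.

```python
import math


def family(a, b):
    """Return y(x) = a*sin(x + b), one member of the two-parameter family."""
    return lambda x: a * math.sin(x + b)


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x) (approximate, not exact)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def residual(a, b, x):
    """Value of y'' + y at x for the member with parameters a and b."""
    y = family(a, b)
    return second_derivative(y, x) + y(x)


if __name__ == "__main__":
    # Different (a, b) pairs give different curves but the same equation.
    for a, b in [(1.0, 0.0), (2.5, 0.7), (-3.0, 1.9)]:
        print(a, b, residual(a, b, x=0.4))
```

The residual is small but not exactly zero, because the second derivative is only approximated.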
Example 6 Form the differential equation representing the family of ellipses having foci on x-axis and
centre at the origin.
Solution We know that the equation of said family of ellipses is

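$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ … (1)
Differentiating equation (1) with respect to x, we get $\frac{2x}{a^2} + \frac{2y}{b^2}\frac{dy}{dx} = 0$
or $\frac{y}{x}\left(\frac{dy}{dx}\right) = \frac{-b^2}{a^2}$ … (2)
Differentiating both sides of equation (2) with respect to x, we get
$\left(\frac{y}{x}\right)\left(\frac{d^2y}{dx^2}\right) + \left(\frac{x\frac{dy}{dx} - y}{x^2}\right)\frac{dy}{dx} = 0$
or $xy\frac{d^2y}{dx^2} + x\left(\frac{dy}{dx}\right)^2 - y\frac{dy}{dx} = 0$ … (3)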

which is the required differential equation.


Example 7 Form the differential equation of the family of circles touching the x-axis at origin.
Solution Let C denote the family of circles touching x-axis at origin. Let (0, a) be the coordinates of the
centre of any member of the family. Therefore, equation of family C is

$x^2 + (y - a)^2 = a^2$ or $x^2 + y^2 = 2ay$ … (1)
where a is an arbitrary constant. Differentiating both sides of equation (1) with respect to x, we get
$2x + 2y\frac{dy}{dx} = 2a\frac{dy}{dx}$
or $x + y\frac{dy}{dx} = a\frac{dy}{dx}$ or $a = \frac{x + y\frac{dy}{dx}}{\frac{dy}{dx}}$ … (2)
Substituting the value of a from equation (2) in equation (1), we get
$x^2 + y^2 = 2y\,\frac{x + y\frac{dy}{dx}}{\frac{dy}{dx}}$
or $\frac{dy}{dx}(x^2 + y^2) = 2xy + 2y^2\frac{dy}{dx}$
or $\frac{dy}{dx} = \frac{2xy}{x^2 - y^2}$

This is the required differential equation of the given family of circles.


Example 8 Form the differential equation representing the family of parabolas having vertex at origin
and axis along positive direction of x-axis.
Solution Let P denote the family of above said parabolas and let (a, 0) be the focus of a member of
the given family, where a is an arbitrary constant. Therefore, equation of family P is

$y^2 = 4ax$ … (1)
Differentiating both sides of equation (1) with respect to x, we get
$2y\frac{dy}{dx} = 4a$ … (2)
Substituting the value of 4a from equation (2) in equation (1), we get
$y^2 = \left(2y\frac{dy}{dx}\right)(x)$
or $y^2 - 2xy\frac{dy}{dx} = 0$
which is the differential equation of the given family of parabolas.
Methods of Solving First Order, First Degree Differential Equations
In this section we shall discuss three methods of solving first order first degree differential equations.
Differential equations with variables separable
A first order-first degree differential equation is of the form
$\frac{dy}{dx} = F(x, y)$ … (1)
If F(x, y) can be expressed as a product g(x) h(y), where g(x) is a function of x and h(y) is a function
of y, then the differential equation (1) is said to be of variable separable type. The differential equation
(1) then has the form
$\frac{dy}{dx} = h(y)\,g(x)$ … (2)
If h(y) ≠ 0, separating the variables, (2) can be rewritten as
$\frac{1}{h(y)}\,dy = g(x)\,dx$ … (3)
Integrating both sides of (3), we get
$\int \frac{1}{h(y)}\,dy = \int g(x)\,dx$ … (4)
Thus, (4) provides the solutions of the given differential equation in the form
H(y) = G(x) + C
Here, H(y) and G(x) are the antiderivatives of $\frac{1}{h(y)}$ and g(x) respectively and C is the arbitrary constant.
Example 9 Find the general solution of the differential equation $\frac{dy}{dx} = \frac{x+1}{2-y}$, (y ≠ 2)
Solution We have
$\frac{dy}{dx} = \frac{x+1}{2-y}$ … (1)
Separating the variables in equation (1), we get
(2 – y) dy = (x + 1) dx … (2)
Integrating both sides of equation (2), we get
$\int (2 - y)\,dy = \int (x + 1)\,dx$
or $2y - \frac{y^2}{2} = \frac{x^2}{2} + x + C_1$
or $x^2 + y^2 + 2x - 4y + 2C_1 = 0$
or $x^2 + y^2 + 2x - 4y + C = 0$, where $C = 2C_1$
which is the general solution of equation (1).
Example 10 Find the general solution of the differential equation $\frac{dy}{dx} = \frac{1+y^2}{1+x^2}$.
Solution Since $1 + y^2 \neq 0$, therefore separating the variables, the given differential equation can be
written as
$\frac{dy}{1+y^2} = \frac{dx}{1+x^2}$ … (1)
Integrating both sides of equation (1), we get
$\int \frac{dy}{1+y^2} = \int \frac{dx}{1+x^2}$
or $\tan^{-1} y = \tan^{-1} x + C$
which is the general solution of equation (1).
Example 11 Find the particular solution of the differential equation $\frac{dy}{dx} = -4xy^2$ given that y = 1 when x = 0.
Solution If y ≠ 0, the given differential equation can be written as
$\frac{dy}{y^2} = -4x\,dx$ … (1)
Integrating both sides of equation (1), we get
$\int \frac{dy}{y^2} = -4\int x\,dx$
or $-\frac{1}{y} = -2x^2 + C$
or $y = \frac{1}{2x^2 - C}$ … (2)
Substituting y = 1 and x = 0 in equation (2), we get C = −1.
Now substituting the value of C in equation (2), we get the particular solution of the given differential
equation as $y = \frac{1}{2x^2 + 1}$.
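As a cross-check on Example 11, the initial value problem can also be integrated numerically and compared with the closed form y = 1/(2x² + 1). The sketch below uses the classical fourth-order Runge-Kutta scheme, which is a substitute technique for verification only and is not the variable separable method of the text. The function names and the step count are assumptions made for this illustration.

```python
def rk4(f, x0, y0, x_end, steps=1000):
    """Integrate dy/dx = f(x, y) from (x0, y0) to x_end with classical RK4."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def slope(x, y):
    """Right-hand side of Example 11: dy/dx = -4*x*y^2."""
    return -4.0 * x * y * y


def exact(x):
    """Particular solution found in Example 11."""
    return 1.0 / (2.0 * x * x + 1.0)


if __name__ == "__main__":
    for x_end in (0.5, 1.0, 2.0):
        print(x_end, rk4(slope, 0.0, 1.0, x_end), exact(x_end))
```

The two printed columns should agree to many decimal places, which supports the value C = −1 obtained from the initial condition.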
Example 12 Find the equation of the curve passing through the point (1, 1) whose differential equation
is $x\,dy = (2x^2 + 1)\,dx$ (x ≠ 0).
Solution The given differential equation can be expressed as
$dy^{*} = \left(\frac{2x^2 + 1}{x}\right)dx^{*}$
or $dy = \left(2x + \frac{1}{x}\right)dx$ … (1)
Integrating both sides of equation (1), we get
$\int dy = \int \left(2x + \frac{1}{x}\right)dx$
or $y = x^2 + \log|x| + C$ … (2)
Equation (2) represents the family of solution curves of the given differential equation but we are
interested in finding the equation of a particular member of the family which passes through the point
(1, 1). Therefore, substituting x = 1, y = 1 in equation (2), we get C = 0.
Now substituting the value of C in equation (2), we get the equation of the required curve as $y = x^2 + \log|x|$.
Example 13 Find the equation of a curve passing through the point (−2, 3), given that the slope of the
tangent to the curve at any point (x, y) is $\frac{2x}{y^2}$.
Solution We know that the slope of the tangent to a curve is given by $\frac{dy}{dx}$.
So, $\frac{dy}{dx} = \frac{2x}{y^2}$ … (1)
Separating the variables, equation (1) can be written as
$y^2\,dy = 2x\,dx$ … (2)
Integrating both sides of equation (2), we get
$\int y^2\,dy = \int 2x\,dx$
or $\frac{y^3}{3} = x^2 + C$ … (3)
Substituting x = –2, y = 3 in equation (3), we get C = 5.
Substituting the value of C in equation (3), we get the equation of the required curve as
$\frac{y^3}{3} = x^2 + 5$ or $y = (3x^2 + 15)^{1/3}$
* The notation $\frac{dy}{dx}$ due to Leibnitz is extremely flexible and useful in many calculations and formal
transformations, where we can deal with symbols dy and dx exactly as if they were ordinary numbers.
By treating dx and dy like separate entities, we can give neater expressions to many calculations.
Refer: Introduction to Calculus and Analysis, Volume I, page 172, by Richard Courant and Fritz John,
Springer-Verlag, New York.
Example 14 In a bank, principal increases continuously at the rate of 5% per year. In how many years
will Rs 1000 double itself?
Solution Let P be the principal at any time t. According to the given problem,
$\frac{dP}{dt} = \left(\frac{5}{100}\right) \times P$
or $\frac{dP}{dt} = \frac{P}{20}$ … (1)
Separating the variables in equation (1), we get
$\frac{dP}{P} = \frac{dt}{20}$ … (2)
Integrating both sides of equation (2), we get
$\log P = \frac{t}{20} + C_1$
or $P = e^{t/20} \cdot e^{C_1}$
or $P = Ce^{t/20}$ (where $e^{C_1} = C$) … (3)
Now P = 1000 when t = 0.
Substituting the values of P and t in (3), we get C = 1000. Therefore, equation (3) gives
$P = 1000\,e^{t/20}$
Let t years be the time required to double the principal. Then
$2000 = 1000\,e^{t/20} \Rightarrow t = 20\log_e 2$
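The closed form t = 20 logₑ2 can be evaluated directly. The short sketch below is an illustration only; the function names `doubling_time` and `principal` are assumed here and do not come from the text. It computes the doubling time for a continuous growth rate given in percent per year and confirms that the principal has doubled at that time.

```python
import math


def doubling_time(rate_percent):
    """Years needed to double under dP/dt = (rate_percent/100) * P."""
    return math.log(2.0) * 100.0 / rate_percent


def principal(p0, rate_percent, t):
    """Solution P = p0 * e^(rate * t / 100) of the growth equation."""
    return p0 * math.exp(rate_percent * t / 100.0)


if __name__ == "__main__":
    t = doubling_time(5.0)           # equals 20 * ln 2, roughly 13.86 years
    print(t, principal(1000.0, 5.0, t))
```

Note that the doubling time does not depend on the starting principal, only on the rate.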

Homogeneous differential equations


Consider the following functions in x and y
$F_1(x, y) = y^2 + 2xy$, $F_2(x, y) = 2x - 3y$,
$F_3(x, y) = \cos\left(\frac{y}{x}\right)$, $F_4(x, y) = \sin x + \cos y$
Replacing x by λx and y by λy in these functions, for any nonzero constant λ, we get
$F_1(\lambda x, \lambda y) = \lambda^2 (y^2 + 2xy) = \lambda^2 F_1(x, y)$
$F_2(\lambda x, \lambda y) = \lambda (2x - 3y) = \lambda F_2(x, y)$
$F_3(\lambda x, \lambda y) = \cos\left(\frac{\lambda y}{\lambda x}\right) = \cos\left(\frac{y}{x}\right) = \lambda^0 F_3(x, y)$
$F_4(\lambda x, \lambda y) = \sin \lambda x + \cos \lambda y \neq \lambda^n F_4(x, y)$, for any n ∈ N
Here, we observe that the functions F1, F2, F3 can be written in the form $F(\lambda x, \lambda y) = \lambda^n F(x, y)$ but F4 cannot
be written in this form. This leads to the following definition:
A function F(x, y) is said to be a homogeneous function of degree n if $F(\lambda x, \lambda y) = \lambda^n F(x, y)$ for any nonzero
constant λ.
We note that in the above examples, F1, F2, F3 are homogeneous functions of degree 2, 1, 0 respectively
but F4 is not a homogeneous function.
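The definition can be tried out numerically on the four functions above. The sketch below is a spot check at a few sample points, not a proof of homogeneity. The helper name `homogeneity_degree`, the candidate degrees, and the sample points are all assumptions made for this illustration.

```python
import math


def homogeneity_degree(F, candidates=(0, 1, 2, 3),
                       samples=((1.3, 0.7), (2.0, -1.5), (0.4, 2.2)),
                       lams=(0.5, 2.0, 3.0)):
    """Return the first n with F(l*x, l*y) == l**n * F(x, y) at every
    sample point and every test value of l, or None if no candidate fits."""
    for n in candidates:
        if all(math.isclose(F(l * x, l * y), l ** n * F(x, y),
                            rel_tol=1e-9, abs_tol=1e-9)
               for (x, y) in samples for l in lams):
            return n
    return None


def F1(x, y):
    return y * y + 2 * x * y


def F2(x, y):
    return 2 * x - 3 * y


def F3(x, y):
    return math.cos(y / x)


def F4(x, y):
    return math.sin(x) + math.cos(y)


if __name__ == "__main__":
    for name, F in [("F1", F1), ("F2", F2), ("F3", F3), ("F4", F4)]:
        print(name, homogeneity_degree(F))
```

For F4 no candidate degree passes, which matches the observation that F4 is not homogeneous.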
We also observe that
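$F_1(x, y) = x^2\left(\frac{y^2}{x^2} + \frac{2y}{x}\right) = x^2 h_1\left(\frac{y}{x}\right)$
or $F_1(x, y) = y^2\left(1 + \frac{2x}{y}\right) = y^2 h_2\left(\frac{x}{y}\right)$
$F_2(x, y) = x^1\left(2 - \frac{3y}{x}\right) = x^1 h_3\left(\frac{y}{x}\right)$
or $F_2(x, y) = y^1\left(2\frac{x}{y} - 3\right) = y^1 h_4\left(\frac{x}{y}\right)$
$F_3(x, y) = x^0 \cos\left(\frac{y}{x}\right) = x^0 h_5\left(\frac{y}{x}\right)$
$F_4(x, y) \neq x^n h_6\left(\frac{y}{x}\right)$, for any n ∈ N
or $F_4(x, y) \neq y^n h_7\left(\frac{x}{y}\right)$, for any n ∈ N
Therefore, a function F(x, y) is a homogeneous function of degree n if
$F(x, y) = x^n g\left(\frac{y}{x}\right)$ or $y^n h\left(\frac{x}{y}\right)$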
A differential equation of the form $\frac{dy}{dx} = F(x, y)$ is said to be homogeneous if F(x, y) is a homogeneous function
of degree zero.
To solve a homogeneous differential equation of the type
$\frac{dy}{dx} = F(x, y) = g\left(\frac{y}{x}\right)$ … (1)
we make the substitution y = v·x … (2)
Differentiating equation (2) with respect to x, we get
$\frac{dy}{dx} = v + x\frac{dv}{dx}$ … (3)
Substituting the value of $\frac{dy}{dx}$ from equation (3) in equation (1), we get
$v + x\frac{dv}{dx} = g(v)$
or $x\frac{dv}{dx} = g(v) - v$ … (4)
Separating the variables in equation (4), we get
$\frac{dv}{g(v) - v} = \frac{dx}{x}$ … (5)
Integrating both sides of equation (5), we get
$\int \frac{dv}{g(v) - v} = \int \frac{1}{x}\,dx + C$ … (6)
Equation (6) gives the general solution (primitive) of the differential equation (1) when we replace v by $\frac{y}{x}$.
Note If the homogeneous differential equation is in the form $\frac{dx}{dy} = F(x, y)$ where F(x, y) is a homogeneous
function of degree zero, then we make the substitution $\frac{x}{y} = v$, i.e., x = vy, and we proceed further to find the
general solution as discussed above by writing $\frac{dx}{dy} = F(x, y) = h\left(\frac{x}{y}\right)$.
Example 15 Show that the differential equation $(x - y)\frac{dy}{dx} = x + 2y$ is homogeneous and solve it.
Solution The given differential equation can be expressed as
$\frac{dy}{dx} = \frac{x + 2y}{x - y}$ … (1)
Let $F(x, y) = \frac{x + 2y}{x - y}$
Now $F(\lambda x, \lambda y) = \frac{\lambda(x + 2y)}{\lambda(x - y)} = \lambda^0 \cdot F(x, y)$
Therefore, F(x, y) is a homogeneous function of degree zero. So, the given differential equation is a
homogeneous differential equation.
Alternatively,
$\frac{dy}{dx} = \frac{1 + \frac{2y}{x}}{1 - \frac{y}{x}} = g\left(\frac{y}{x}\right)$ … (2)
R.H.S. of differential equation (2) is of the form $g\left(\frac{y}{x}\right)$ and so it is a homogeneous function of degree
zero. Therefore, equation (1) is a homogeneous differential equation.

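To solve it we make the substitution
y = vx … (3)
Differentiating equation (3) with respect to x, we get
$\frac{dy}{dx} = v + x\frac{dv}{dx}$ … (4)
Substituting the value of y and $\frac{dy}{dx}$ in equation (1), we get
$v + x\frac{dv}{dx} = \frac{1 + 2v}{1 - v}$
or $x\frac{dv}{dx} = \frac{1 + 2v}{1 - v} - v$
or $x\frac{dv}{dx} = \frac{v^2 + v + 1}{1 - v}$
or $\frac{v - 1}{v^2 + v + 1}\,dv = \frac{-dx}{x}$ … (5)
Integrating both sides of equation (5), we get
$\int \frac{v - 1}{v^2 + v + 1}\,dv = -\int \frac{dx}{x}$
or $\frac{1}{2}\int \frac{2v + 1 - 3}{v^2 + v + 1}\,dv = -\log|x| + C_1$
or $\frac{1}{2}\int \frac{2v + 1}{v^2 + v + 1}\,dv - \frac{3}{2}\int \frac{1}{v^2 + v + 1}\,dv = -\log|x| + C_1$
or $\frac{1}{2}\log|v^2 + v + 1| - \frac{3}{2}\int \frac{1}{\left(v + \frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2}\,dv = -\log|x| + C_1$
or $\frac{1}{2}\log|v^2 + v + 1| - \frac{3}{2} \cdot \frac{2}{\sqrt{3}}\tan^{-1}\left(\frac{2v + 1}{\sqrt{3}}\right) = -\log|x| + C_1$
or $\frac{1}{2}\log|v^2 + v + 1| + \frac{1}{2}\log x^2 = \sqrt{3}\tan^{-1}\left(\frac{2v + 1}{\sqrt{3}}\right) + C_1$ (Why?)
Replacing v by $\frac{y}{x}$, we get
$\frac{1}{2}\log\left|\frac{y^2}{x^2} + \frac{y}{x} + 1\right| + \frac{1}{2}\log x^2 = \sqrt{3}\tan^{-1}\left(\frac{2y + x}{\sqrt{3}\,x}\right) + C_1$
or $\frac{1}{2}\log\left|\left(\frac{y^2}{x^2} + \frac{y}{x} + 1\right)x^2\right| = \sqrt{3}\tan^{-1}\left(\frac{2y + x}{\sqrt{3}\,x}\right) + C_1$
or $\log|y^2 + xy + x^2| = 2\sqrt{3}\tan^{-1}\left(\frac{2y + x}{\sqrt{3}\,x}\right) + 2C_1$
or $\log|x^2 + xy + y^2| = 2\sqrt{3}\tan^{-1}\left(\frac{x + 2y}{\sqrt{3}\,x}\right) + C$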

which is the general solution of the differential equation (1)
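Because the solution of Example 15 is implicit, one way to test it is to follow a solution curve numerically and watch whether log(x² + xy + y²) − 2√3 tan⁻¹((x + 2y)/(√3 x)) stays constant along it. The sketch below does this with a Runge-Kutta integrator. It is an independent check under assumptions chosen here (starting point (1, 0), interval up to x = 1.2, where x − y stays positive), not the substitution method of the text, and all function names are hypothetical.

```python
import math


def slope(x, y):
    """Right-hand side of equation (1): dy/dx = (x + 2y) / (x - y)."""
    return (x + 2.0 * y) / (x - y)


def invariant(x, y):
    """Left side minus the arctan term of the general solution; equals C."""
    root3 = math.sqrt(3.0)
    return (math.log(x * x + x * y + y * y)
            - 2.0 * root3 * math.atan((x + 2.0 * y) / (root3 * x)))


def max_invariant_drift(x0=1.0, y0=0.0, x_end=1.2, steps=2000):
    """Integrate with classical RK4 and return the largest change in C."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    start = invariant(x, y)
    worst = 0.0
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + h / 2, y + h * k1 / 2)
        k3 = slope(x + h / 2, y + h * k2 / 2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        worst = max(worst, abs(invariant(x, y) - start))
    return worst


if __name__ == "__main__":
    print(max_invariant_drift())
```

A drift close to zero indicates that the computed curve lies on a single member of the family given by the general solution.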


Example 16 Show that the differential equation $x\cos\left(\frac{y}{x}\right)\frac{dy}{dx} = y\cos\left(\frac{y}{x}\right) + x$ is homogeneous and solve it.
Solution The given differential equation can be written as
$\frac{dy}{dx} = \frac{y\cos\left(\frac{y}{x}\right) + x}{x\cos\left(\frac{y}{x}\right)}$ … (1)
It is a differential equation of the form $\frac{dy}{dx} = F(x, y)$.
Here $F(x, y) = \frac{y\cos\left(\frac{y}{x}\right) + x}{x\cos\left(\frac{y}{x}\right)}$
Replacing x by λx and y by λy, we get
$F(\lambda x, \lambda y) = \frac{\lambda\left[y\cos\left(\frac{y}{x}\right) + x\right]}{\lambda\left(x\cos\left(\frac{y}{x}\right)\right)} = \lambda^0 [F(x, y)]$
Thus, F(x, y) is a homogeneous function of degree zero.
Therefore, the given differential equation is a homogeneous differential equation.
To solve it we make the substitution
y = vx … (2)
Differentiating equation (2) with respect to x, we get
$\frac{dy}{dx} = v + x\frac{dv}{dx}$ … (3)
Substituting the value of y and $\frac{dy}{dx}$ in equation (1), we get
$v + x\frac{dv}{dx} = \frac{v\cos v + 1}{\cos v}$
or $x\frac{dv}{dx} = \frac{v\cos v + 1}{\cos v} - v$
or $x\frac{dv}{dx} = \frac{1}{\cos v}$
or $\cos v\,dv = \frac{dx}{x}$
Therefore $\int \cos v\,dv = \int \frac{1}{x}\,dx$
or sin v = log|x| + log|C|
or sin v = log|Cx|
Replacing v by $\frac{y}{x}$, we get
$\sin\left(\frac{y}{x}\right) = \log|Cx|$
which is the general solution of the differential equation (1).
Example 17 Show that the differential equation $2ye^{x/y}\,dx + \left(y - 2xe^{x/y}\right)dy = 0$ is homogeneous and find its
particular solution, given that x = 0 when y = 1.
Solution The given differential equation can be written as
$\frac{dx}{dy} = \frac{2xe^{x/y} - y}{2ye^{x/y}}$ … (1)
Let $F(x, y) = \frac{2xe^{x/y} - y}{2ye^{x/y}}$
Then $F(\lambda x, \lambda y) = \frac{\lambda\left(2xe^{x/y} - y\right)}{\lambda\left(2ye^{x/y}\right)} = \lambda^0 [F(x, y)]$
Thus, F(x, y) is a homogeneous function of degree zero. Therefore, the given differential equation is a
homogeneous differential equation.
To solve it, we make the substitution
x = vy … (2)
Differentiating equation (2) with respect to y, we get
$\frac{dx}{dy} = v + y\frac{dv}{dy}$
Substituting the value of x and $\frac{dx}{dy}$ in equation (1), we get
$v + y\frac{dv}{dy} = \frac{2ve^v - 1}{2e^v}$
or $y\frac{dv}{dy} = \frac{2ve^v - 1}{2e^v} - v$
or $y\frac{dv}{dy} = -\frac{1}{2e^v}$
or $2e^v\,dv = \frac{-dy}{y}$
or $\int 2e^v\,dv = -\int \frac{dy}{y}$
or $2e^v = -\log|y| + C$
and replacing v by $\frac{x}{y}$, we get
$2e^{x/y} + \log|y| = C$ … (3)
Substituting x = 0 and y = 1 in equation (3), we get
$2e^0 + \log|1| = C \Rightarrow C = 2$
Substituting the value of C in equation (3), we get
$2e^{x/y} + \log|y| = 2$
which is the particular solution of the given differential equation.
Example 18 Show that the family of curves for which the slope of the tangent at any point
(x, y) on it is $\frac{x^2 + y^2}{2xy}$, is given by $x^2 - y^2 = cx$.
Solution We know that the slope of the tangent at any point on a curve is $\frac{dy}{dx}$.
Therefore, $\frac{dy}{dx} = \frac{x^2 + y^2}{2xy}$
or $\frac{dy}{dx} = \frac{1 + \frac{y^2}{x^2}}{\frac{2y}{x}}$ … (1)
Clearly, (1) is a homogeneous differential equation. To solve it we make the substitution
y = vx
Differentiating y = vx with respect to x, we get
$\frac{dy}{dx} = v + x\frac{dv}{dx}$
or $v + x\frac{dv}{dx} = \frac{1 + v^2}{2v}$
or $x\frac{dv}{dx} = \frac{1 - v^2}{2v}$
or $\frac{2v}{1 - v^2}\,dv = \frac{dx}{x}$
or $\frac{2v}{v^2 - 1}\,dv = -\frac{dx}{x}$
Therefore $\int \frac{2v}{v^2 - 1}\,dv = -\int \frac{1}{x}\,dx$
or $\log|v^2 - 1| = -\log|x| + \log|C_1|$
or $\log|(v^2 - 1)(x)| = \log|C_1|$
or $(v^2 - 1)x = \pm C_1$
Replacing v by $\frac{y}{x}$, we get
$\left(\frac{y^2}{x^2} - 1\right)x = \pm C_1$
or $(y^2 - x^2) = \pm C_1 x$ or $x^2 - y^2 = Cx$
Linear differential equations
A differential equation of the form
$\frac{dy}{dx} + Py = Q$
where P and Q are constants or functions of x only, is known as a first order linear differential equation.
Some examples of the first order linear differential equation are
$\frac{dy}{dx} + y = \sin x$
$\frac{dy}{dx} + \left(\frac{1}{x}\right)y = e^x$
$\frac{dy}{dx} + \left(\frac{y}{x\log x}\right) = \frac{1}{x}$
Another form of first order linear differential equation is
$\frac{dx}{dy} + P_1 x = Q_1$
where P1 and Q1 are constants or functions of y only. Some examples of this type of differential
equation are
$\frac{dx}{dy} + x = \cos y$
$\frac{dx}{dy} + \frac{-2x}{y} = y^2 e^{-y}$
To solve the first order linear differential equation of the type
$\frac{dy}{dx} + Py = Q$ … (1)
Multiply both sides of the equation by a function of x, say g(x), to get
$g(x)\frac{dy}{dx} + P \cdot (g(x))\,y = Q \cdot g(x)$ … (2)
Choose g(x) in such a way that the L.H.S. becomes a derivative of y·g(x),
i.e. $g(x)\frac{dy}{dx} + P \cdot g(x)\,y = \frac{d}{dx}[y \cdot g(x)]$
or $g(x)\frac{dy}{dx} + P \cdot g(x)\,y = g(x)\frac{dy}{dx} + y\,g'(x)$
⇒ $P \cdot g(x) = g'(x)$
or $P = \frac{g'(x)}{g(x)}$
Integrating both sides with respect to x, we get
$\int P\,dx = \int \frac{g'(x)}{g(x)}\,dx$
or $\int P\,dx = \log(g(x))$
or $g(x) = e^{\int P\,dx}$
On multiplying the equation (1) by $g(x) = e^{\int P\,dx}$, the L.H.S. becomes the derivative of some function of x and y.
This function $g(x) = e^{\int P\,dx}$ is called the Integrating Factor (I.F.) of the given differential equation.
Substituting the value of g(x) in equation (2), we get
$e^{\int P\,dx}\frac{dy}{dx} + Pe^{\int P\,dx}\,y = Q \cdot e^{\int P\,dx}$
or $\frac{d}{dx}\left(ye^{\int P\,dx}\right) = Qe^{\int P\,dx}$
Integrating both sides with respect to x, we get
$y \cdot e^{\int P\,dx} = \int \left(Qe^{\int P\,dx}\right)dx + C$
or $y = e^{-\int P\,dx}\left[\int \left(Q \cdot e^{\int P\,dx}\right)dx + C\right]$
which is the general solution of the differential equation.
Steps involved to solve first order linear differential equation:

(i) Write the given differential equation in the form $\frac{dy}{dx} + Py = Q$ where P, Q are constants or functions of
x only.
(ii) Find the Integrating Factor (I.F.) = $e^{\int P\,dx}$.
(iii) Write the solution of the given differential equation as
$y\,(\text{I.F.}) = \int (Q \times \text{I.F.})\,dx + C$
In case the first order linear differential equation is in the form $\frac{dx}{dy} + P_1 x = Q_1$, where P1 and Q1 are
constants or functions of y only, then I.F. = $e^{\int P_1\,dy}$ and the solution of the differential
equation is given by
$x\,(\text{I.F.}) = \int (Q_1 \times \text{I.F.})\,dy + C$
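Steps (i) to (iii) can be mirrored numerically. The sketch below is an illustration under assumptions made here, not the textbook procedure itself: both integrals are approximated with the trapezoidal rule, the integrating factor is normalised so that I.F. = 1 at the starting point (which makes C equal to the initial value y0), and the name `solve_linear` is hypothetical. The test equation dy/dx − xy = x with y(0) = 1 has the closed form y = −1 + 2e^{x²/2}, which can be confirmed by substitution.

```python
import math


def solve_linear(P, Q, x0, y0, x_end, steps=2000):
    """Approximate y(x_end) for dy/dx + P(x)*y = Q(x), y(x0) = y0,
    using y * IF = integral(Q * IF) + C with IF = exp(integral of P)."""
    h = (x_end - x0) / steps
    int_p = 0.0                 # running integral of P from x0 to x
    acc = 0.0                   # running integral of Q * IF from x0 to x
    x = x0
    prev_p = P(x)
    prev_qif = Q(x)             # IF(x0) = e^0 = 1
    for _ in range(steps):
        x_next = x + h
        p_next = P(x_next)
        int_p += 0.5 * h * (prev_p + p_next)       # trapezoid for step (ii)
        qif_next = Q(x_next) * math.exp(int_p)
        acc += 0.5 * h * (prev_qif + qif_next)     # trapezoid for step (iii)
        x, prev_p, prev_qif = x_next, p_next, qif_next
    # C = y0 because the integrating factor equals 1 at x0.
    return (acc + y0) / math.exp(int_p)


if __name__ == "__main__":
    approx = solve_linear(lambda x: -x, lambda x: x, 0.0, 1.0, 1.0)
    exact = -1.0 + 2.0 * math.exp(0.5)
    print(approx, exact)
```

The two printed values should agree to several decimal places; the remaining gap comes from the trapezoidal approximation.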


Example 19 Find the general solution of the differential equation $\frac{dy}{dx} - y = \cos x$
Solution Given differential equation is of the form
$\frac{dy}{dx} + Py = Q$, where P = −1 and Q = cos x
Therefore I.F. = $e^{\int -1\,dx} = e^{-x}$
Multiplying both sides of equation by I.F., we get
$e^{-x}\frac{dy}{dx} - e^{-x}y = e^{-x}\cos x$
or $\frac{d}{dx}\left(ye^{-x}\right) = e^{-x}\cos x$
On integrating both sides with respect to x, we get
$ye^{-x} = \int e^{-x}\cos x\,dx + C$ … (1)
Let $I = \int e^{-x}\cos x\,dx$
$= \cos x\left(\frac{e^{-x}}{-1}\right) - \int (-\sin x)(-e^{-x})\,dx$
$= -\cos x\,e^{-x} - \int \sin x\,e^{-x}\,dx$
$= -\cos x\,e^{-x} - \left[\sin x\,(-e^{-x}) - \int \cos x\,(-e^{-x})\,dx\right]$
$= -\cos x\,e^{-x} + \sin x\,e^{-x} - \int \cos x\,e^{-x}\,dx$
or $I = -e^{-x}\cos x + \sin x\,e^{-x} - I$
or $2I = (\sin x - \cos x)\,e^{-x}$
or $I = \frac{(\sin x - \cos x)\,e^{-x}}{2}$
Substituting the value of I in equation (1), we get
$ye^{-x} = \left(\frac{\sin x - \cos x}{2}\right)e^{-x} + C$
or $y = \left(\frac{\sin x - \cos x}{2}\right) + Ce^{x}$

which is the general solution of the given differential equation.


Example 20 Find the general solution of the differential equation $x\frac{dy}{dx} + 2y = x^2$ (x ≠ 0).
Solution The given differential equation is
$x\frac{dy}{dx} + 2y = x^2$ … (1)
Dividing both sides of equation (1) by x, we get
$\frac{dy}{dx} + \frac{2}{x}\,y = x$
which is a linear differential equation of the type $\frac{dy}{dx} + Py = Q$, where $P = \frac{2}{x}$ and Q = x.
So I.F. = $e^{\int \frac{2}{x}\,dx} = e^{2\log x} = e^{\log x^2} = x^2$ [as $e^{\log f(x)} = f(x)$]
Therefore, solution of the given equation is given by
$y \cdot x^2 = \int (x)(x^2)\,dx + C = \int x^3\,dx + C$
or $y = \frac{x^2}{4} + Cx^{-2}$
which is the general solution of the given differential equation.
Example 21 Find the general solution of the differential equation $y\,dx - (x + 2y^2)\,dy = 0$.
Solution The given differential equation can be written as
$\frac{dx}{dy} - \frac{x}{y} = 2y$
This is a linear differential equation of the type $\frac{dx}{dy} + P_1 x = Q_1$, where $P_1 = -\frac{1}{y}$ and
$Q_1 = 2y$. Therefore I.F. = $e^{\int -\frac{1}{y}\,dy} = e^{-\log y} = e^{\log (y)^{-1}} = \frac{1}{y}$
Hence, the solution of the given differential equation is
$x \cdot \frac{1}{y} = \int (2y)\left(\frac{1}{y}\right)dy + C$
or $\frac{x}{y} = \int (2\,dy) + C$
or $\frac{x}{y} = 2y + C$
or $x = 2y^2 + Cy$
which is a general solution of the given differential equation.
Example 22 Find the particular solution of the differential equation
$\frac{dy}{dx} + y\cot x = 2x + x^2\cot x$ (x ≠ 0)
given that y = 0 when $x = \frac{\pi}{2}$.
Solution The given equation is a linear differential equation of the type $\frac{dy}{dx} + Py = Q$,
where P = cot x and $Q = 2x + x^2\cot x$. Therefore
I.F. = $e^{\int \cot x\,dx} = e^{\log \sin x} = \sin x$
Hence, the solution of the differential equation is given by
$y \cdot \sin x = \int (2x + x^2\cot x)\sin x\,dx + C$
or $y\sin x = \int 2x\sin x\,dx + \int x^2\cos x\,dx + C$
or $y\sin x = \sin x\left(\frac{2x^2}{2}\right) - \int \cos x\left(\frac{2x^2}{2}\right)dx + \int x^2\cos x\,dx + C$
or $y\sin x = x^2\sin x - \int x^2\cos x\,dx + \int x^2\cos x\,dx + C$
or $y\sin x = x^2\sin x + C$ … (1)
Substituting y = 0 and $x = \frac{\pi}{2}$ in equation (1), we get
$0 = \left(\frac{\pi}{2}\right)^2 \sin\left(\frac{\pi}{2}\right) + C$
or $C = \frac{-\pi^2}{4}$
Substituting the value of C in equation (1), we get
$y\sin x = x^2\sin x - \frac{\pi^2}{4}$
or $y = x^2 - \frac{\pi^2}{4\sin x}$ (sin x ≠ 0)
which is the particular solution of the given differential equation.
Example 23 Find the equation of a curve passing through the point (0, 1), if the slope of the tangent to
the curve at any point (x, y) is equal to the sum of the x coordinate (abscissa) and the product of the
x coordinate and y coordinate (ordinate) of that point.
Solution We know that the slope of the tangent to the curve is $\frac{dy}{dx}$.
Therefore, $\frac{dy}{dx} = x + xy$
or $\frac{dy}{dx} - xy = x$ … (1)
This is a linear differential equation of the type $\frac{dy}{dx} + Py = Q$, where P = −x and Q = x.
Therefore, I.F. = $e^{\int -x\,dx} = e^{-x^2/2}$
Hence, the solution of equation is given by
$y \cdot e^{-x^2/2} = \int (x)\left(e^{-x^2/2}\right)dx + C$ … (2)
Let $I = \int (x)\,e^{-x^2/2}\,dx$
Let $\frac{-x^2}{2} = t$, then −x dx = dt or x dx = −dt.
Therefore, $I = -\int e^t\,dt = -e^t = -e^{-x^2/2}$
Substituting the value of I in equation (2), we get
$y\,e^{-x^2/2} = -e^{-x^2/2} + C$
or $y = -1 + Ce^{x^2/2}$ … (3)
Now (3) represents the equation of a family of curves. But we are interested in finding a particular
member of the family passing through (0, 1). Substituting x = 0 and y = 1 in equation (3), we get
$1 = -1 + C \cdot e^0$ or C = 2
Substituting the value of C in equation (3), we get
$y = -1 + 2e^{x^2/2}$
which is the equation of the required curve.

10 Vector Algebra
Introduction
In this chapter, we will study some of the basic concepts about vectors, various operations on vectors,
and their algebraic and geometric properties. These two types of properties, when considered together,
give a full realisation to the concept of vectors, and lead to their vital applicability in various areas as
mentioned above.
Some Basic Concepts
Let ‘l’ be any straight line in plane or three dimensional space. This line can be given two directions by
means of arrowheads. A line with one of these directions prescribed is called a directed line.

Now observe that if we restrict the line l to the line segment AB, then a magnitude is prescribed on
the line l with one of the two directions, so that we obtain a directed line segment. Thus, a directed
line segment has magnitude as well as direction.
Definition 1 A quantity that has magnitude as well as direction is called a vector.
Notice that a directed line segment is a vector, denoted as $\overrightarrow{AB}$ or simply as $\vec{a}$, and read as ‘vector $\overrightarrow{AB}$’ or ‘vector $\vec{a}$’.
The point A from where the vector $\overrightarrow{AB}$ starts is called its initial point, and the point B where it ends is
called its terminal point. The distance between initial and terminal points of a vector is called the
magnitude (or length) of the vector, denoted as $|\overrightarrow{AB}|$, or $|\vec{a}|$, or a. The arrow indicates the direction of the vector.
Note Since the length is never negative, the notation $|\vec{a}| < 0$ has no meaning.
Position Vector
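From Class XI, recall the three dimensional right handed rectangular coordinate system. Consider a
point P in space, having coordinates (x, y, z) with respect to the origin O(0, 0, 0). Then, the vector $\overrightarrow{OP}$
having O and P as its initial and terminal points, respectively, is called the position vector of the point P
with respect to O. Using the distance formula, the magnitude of $\overrightarrow{OP}$ (or $\vec{r}$) is given by
$|\overrightarrow{OP}| = \sqrt{x^2 + y^2 + z^2}$
In practice, the position vectors of points A, B, C, etc., with respect to the origin O are denoted by $\vec{a}$, $\vec{b}$, $\vec{c}$,
etc., respectively.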

Direction Cosines
Consider the position vector $\overrightarrow{OP}$ (or $\vec{r}$) of a point P(x, y, z). The angles α, β, γ made by the vector $\vec{r}$ with the
positive directions of x, y and z-axes respectively, are called its direction angles. The cosine values of these
angles, i.e., cos α, cos β and cos γ are called direction cosines of the vector $\vec{r}$, and usually denoted by l, m and n,
respectively.

One may note that the triangle OAP is right angled, and in it, we have $\cos\alpha = \frac{x}{r}$ (r stands for $|\vec{r}|$). Similarly,
from the right angled triangles OBP and OCP, we may write $\cos\beta = \frac{y}{r}$ and $\cos\gamma = \frac{z}{r}$. Thus, the coordinates
of the point P may also be expressed as (lr, mr, nr). The numbers lr, mr and nr, proportional to the direction
cosines, are called direction ratios of the vector $\vec{r}$, and denoted as a, b and c, respectively.
Note One may note that $l^2 + m^2 + n^2 = 1$ but $a^2 + b^2 + c^2 \neq 1$, in general.
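The relations cos α = x/r, cos β = y/r, cos γ = z/r translate directly into a few lines of code. The sketch below is an illustration only; the function name `direction_cosines` and the sample point are assumptions made here. It returns (l, m, n) for a nonzero position vector and lets one confirm that l² + m² + n² = 1 while the direction ratios (lr, mr, nr) recover the coordinates.

```python
import math


def direction_cosines(x, y, z):
    """Return (l, m, n) = (x/r, y/r, z/r) for the position vector of P(x, y, z)."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # The zero vector has no definite direction, so no direction cosines.
        raise ValueError("direction cosines are undefined for the zero vector")
    return (x / r, y / r, z / r)


if __name__ == "__main__":
    l, m, n = direction_cosines(1.0, 2.0, 2.0)   # here r = 3
    print(l, m, n, l * l + m * m + n * n)
```

For the sample point the direction ratios 1, 2, 2 satisfy a² + b² + c² = 9, which illustrates the note that this sum is not 1 in general.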
Types of Vectors
Zero Vector A vector whose initial and terminal points coincide is called a zero vector (or null vector),
and denoted as $\vec{0}$. A zero vector cannot be assigned a definite direction as it has zero magnitude. Or,
alternatively, it may be regarded as having any direction. The vectors $\overrightarrow{AA}$, $\overrightarrow{BB}$ represent the zero vector.
Unit Vector A vector whose magnitude is unity (i.e., 1 unit) is called a unit vector. The unit vector in the
direction of a given vector $\vec{a}$ is denoted by $\hat{a}$.
Coinitial Vectors Two or more vectors having the same initial point are called coinitial vectors.
Collinear Vectors Two or more vectors are said to be collinear if they are parallel to the same line,
irrespective of their magnitudes and directions.
Equal Vectors Two vectors $\vec{a}$ and $\vec{b}$ are said to be equal if they have the same magnitude and direction
regardless of the positions of their initial points, and written as $\vec{a} = \vec{b}$.
Negative of a Vector A vector whose magnitude is the same as that of a given vector (say, $\overrightarrow{AB}$), but
direction is opposite to that of it, is called negative of the given vector.
For example, vector $\overrightarrow{BA}$ is negative of the vector $\overrightarrow{AB}$, and written as $\overrightarrow{BA} = -\overrightarrow{AB}$.
Remark The vectors defined above are such that any of them may be subject to its parallel
displacement without changing its magnitude and direction. Such vectors are called free vectors.
Throughout this chapter, we will be dealing with free vectors only.
Example 1 Represent graphically a displacement of 40 km, 30° west of south.
Solution The vector $\overrightarrow{OP}$ represents the required displacement.

Example 2 Classify the following measures as scalars and vectors.
(i) 5 seconds (ii) 1000 cm³ (iii) 10 Newton
(iv) 30 km/hr (v) 10 g/cm³ (vi) 20 m/s towards north
Solution
(i) Time-scalar (ii) Volume-scalar (iii) Force-vector
(iv) Speed-scalar (v) Density-scalar (vi) Velocity-vector
Example 3 In the given figure, which of the vectors are:
(i) Collinear (ii) Equal (iii) Coinitial
Solution
(i) Collinear vectors: $\vec{a}$, $\vec{c}$ and $\vec{d}$.
(ii) Equal vectors: $\vec{a}$ and $\vec{c}$.
(iii) Coinitial vectors: $\vec{b}$, $\vec{c}$ and $\vec{d}$.
Addition of Vectors
A vector $\overrightarrow{AB}$ simply means the displacement from a point A to the point B. Now consider a situation that a
girl moves from A to B and then from B to C. The net displacement made by the girl from point A to the
point C is given by the vector $\overrightarrow{AC}$ and expressed as
$\overrightarrow{AC} = \overrightarrow{AB} + \overrightarrow{BC}$
This is known as the triangle law of vector addition.
In general, if we have two vectors $\vec{a}$ and $\vec{b}$, then to add them, they are positioned so that the initial point
of one coincides with the terminal point of the other.

For example, we have shifted vector $\vec{b}$ without changing its magnitude and direction, so that its initial point
coincides with the terminal point of $\vec{a}$. Then, the vector $\vec{a} + \vec{b}$, represented by the third side AC of the
triangle ABC, gives us the sum (or resultant) of the vectors $\vec{a}$ and $\vec{b}$, i.e., in triangle ABC, we have
$\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$
Now again, since $\overrightarrow{AC} = -\overrightarrow{CA}$, from the above equation, we have
$\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CA} = \overrightarrow{AA} = \vec{0}$
This means that when the sides of a triangle are taken in order, it leads to zero resultant as the initial
and terminal points get coincided.
Now, construct a vector $\overrightarrow{BC'}$ so that its magnitude is the same as the vector $\overrightarrow{BC}$, but the direction opposite
to that of it, i.e.,
$\overrightarrow{BC'} = -\overrightarrow{BC}$
Then, on applying the triangle law, we have
$\overrightarrow{AC'} = \overrightarrow{AB} + \overrightarrow{BC'} = \overrightarrow{AB} + (-\overrightarrow{BC}) = \vec{a} - \vec{b}$
The vector $\overrightarrow{AC'}$ is said to represent the difference of $\vec{a}$ and $\vec{b}$.
Now, consider a boat in a river going from one bank of the river to the other in a direction perpendicular
to the flow of the river. Then, it is acted upon by two velocity vectors: one is the velocity imparted to
the boat by its engine and the other is the velocity of the flow of river water. Under the simultaneous
influence of these two velocities, the boat actually starts travelling with a different velocity. To have
a precise idea about the effective speed and direction (i.e., the resultant velocity) of the boat, we have
the following law of vector addition.

If we have two vectors $\vec{a}$ and $\vec{b}$ represented by the two adjacent sides of a parallelogram in magnitude and
direction, then their sum $\vec{a} + \vec{b}$ is represented in magnitude and direction by the diagonal of the
parallelogram through their common point. This is known as the parallelogram law of vector addition.
Note From the figure above, using the triangle law, one may note that
$\overrightarrow{OA} + \overrightarrow{AC} = \overrightarrow{OC}$
or $\overrightarrow{OA} + \overrightarrow{OB} = \overrightarrow{OC}$ (since $\overrightarrow{AC} = \overrightarrow{OB}$)
which is the parallelogram law. Thus, we may say that the two laws of vector addition are equivalent to
each other.
Properties of vector addition
Property 1 For any two vectors $\vec{a}$ and $\vec{b}$,
$\vec{a} + \vec{b} = \vec{b} + \vec{a}$ (Commutative property)
Proof Consider the parallelogram ABCD. Let $\overrightarrow{AB} = \vec{a}$ and $\overrightarrow{BC} = \vec{b}$, then using the triangle law, from triangle
ABC, we have
$\overrightarrow{AC} = \vec{a} + \vec{b}$
Now, since the opposite sides of a parallelogram are equal and parallel, we have $\overrightarrow{AD} = \overrightarrow{BC} = \vec{b}$ and
$\overrightarrow{DC} = \overrightarrow{AB} = \vec{a}$. Again using the triangle law, from triangle ADC, we have
$\overrightarrow{AC} = \overrightarrow{AD} + \overrightarrow{DC} = \vec{b} + \vec{a}$
Hence $\vec{a} + \vec{b} = \vec{b} + \vec{a}$
Property 2 For any three vectors $\vec{a}$, $\vec{b}$ and $\vec{c}$
$(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c})$ (Associative property)
Proof Let the vectors $\vec{a}$, $\vec{b}$ and $\vec{c}$ be represented by $\overrightarrow{PQ}$, $\overrightarrow{QR}$ and $\overrightarrow{RS}$, respectively.

Then $\vec{a} + \vec{b} = \overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}$
and $\vec{b} + \vec{c} = \overrightarrow{QR} + \overrightarrow{RS} = \overrightarrow{QS}$
So $(\vec{a} + \vec{b}) + \vec{c} = \overrightarrow{PR} + \overrightarrow{RS} = \overrightarrow{PS}$
and $\vec{a} + (\vec{b} + \vec{c}) = \overrightarrow{PQ} + \overrightarrow{QS} = \overrightarrow{PS}$
Hence $(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c})$
Remark The associative property of vector addition enables us to write the sum of three vectors
$\vec{a}$, $\vec{b}$, $\vec{c}$ as $\vec{a} + \vec{b} + \vec{c}$ without using brackets.
Note that for any vector $\vec{a}$, we have
$\vec{a} + \vec{0} = \vec{0} + \vec{a} = \vec{a}$
Here, the zero vector $\vec{0}$ is called the additive identity for the vector addition.
Multiplication of a Vector by a Scalar
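Let $\vec{a}$ be a given vector and λ a scalar. Then the product of the vector $\vec{a}$ by the scalar λ, denoted as $\lambda\vec{a}$, is
called the multiplication of vector $\vec{a}$ by the scalar λ. Note that $\lambda\vec{a}$ is also a vector, collinear to the vector $\vec{a}$.
The vector $\lambda\vec{a}$ has the direction same (or opposite) to that of vector $\vec{a}$ according as the value of λ is positive
(or negative). Also, the magnitude of vector $\lambda\vec{a}$ is |λ| times the magnitude of the vector $\vec{a}$, i.e.,
$|\lambda\vec{a}| = |\lambda||\vec{a}|$
A geometric visualisation of multiplication of a vector by a scalar is given in the figure below.

When λ = −1, then $\lambda\vec{a} = -\vec{a}$, which is a vector having magnitude equal to the magnitude of $\vec{a}$ and direction
opposite to that of the direction of $\vec{a}$. The vector $-\vec{a}$ is called the negative (or additive inverse) of vector $\vec{a}$
and we always have
$\vec{a} + (-\vec{a}) = (-\vec{a}) + \vec{a} = \vec{0}$
Also, if $\lambda = \frac{1}{|\vec{a}|}$, provided $\vec{a} \neq \vec{0}$, i.e., $\vec{a}$ is not a null vector, then
$|\lambda\vec{a}| = |\lambda||\vec{a}| = \frac{1}{|\vec{a}|}|\vec{a}| = 1$
So, $\lambda\vec{a}$ represents the unit vector in the direction of $\vec{a}$. We write it as
$\hat{a} = \frac{1}{|\vec{a}|}\vec{a}$
Note For any scalar k, $k\vec{0} = \vec{0}$.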
Components of a vector
Let us take the points A(1, 0, 0), B(0, 1, 0) and C(0, 0, 1) on the x-axis, y-axis and z-axis, respectively. Then, clearly
$|\overrightarrow{OA}| = 1$, $|\overrightarrow{OB}| = 1$ and $|\overrightarrow{OC}| = 1$
The vectors $\overrightarrow{OA}$, $\overrightarrow{OB}$ and $\overrightarrow{OC}$, each having magnitude 1, are called unit vectors along the axes OX, OY and OZ, respectively, and denoted by $\hat{i}$, $\hat{j}$ and $\hat{k}$, respectively.
Now, consider the position vector $\overrightarrow{OP}$ of a point P(x, y, z). Let $P_1$ be the foot of the perpendicular from P on the plane XOY.

We, thus, see that $P_1P$ is parallel to the z-axis. As $\hat{i}$, $\hat{j}$ and $\hat{k}$ are the unit vectors along the x, y and z-axes, respectively, and by the definition of the coordinates of P, we have $\overrightarrow{P_1P} = \overrightarrow{OR} = z\hat{k}$. Similarly, $\overrightarrow{QP_1} = \overrightarrow{OS} = y\hat{j}$ and $\overrightarrow{OQ} = x\hat{i}$.
Therefore, it follows that $\overrightarrow{OP_1} = \overrightarrow{OQ} + \overrightarrow{QP_1} = x\hat{i} + y\hat{j}$
and $\overrightarrow{OP} = \overrightarrow{OP_1} + \overrightarrow{P_1P} = x\hat{i} + y\hat{j} + z\hat{k}$
Hence, the position vector of P with reference to O is given by
$\overrightarrow{OP}$ (or $\vec{r}$) $= x\hat{i} + y\hat{j} + z\hat{k}$
This form of any vector is called its component form. Here, x, y and z are called the scalar components of $\vec{r}$, and $x\hat{i}$, $y\hat{j}$ and $z\hat{k}$ are called the vector components of $\vec{r}$ along the respective axes. Sometimes x, y and z are also termed as rectangular components.
The length of any vector $\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$ is readily determined by applying the Pythagoras theorem twice. We note that in the right angled triangle $OQP_1$
$|\overrightarrow{OP_1}| = \sqrt{|\overrightarrow{OQ}|^2 + |\overrightarrow{QP_1}|^2} = \sqrt{x^2 + y^2}$,
and in the right angled triangle $OP_1P$, we have
$|\overrightarrow{OP}| = \sqrt{|\overrightarrow{OP_1}|^2 + |\overrightarrow{P_1P}|^2} = \sqrt{(x^2 + y^2) + z^2}$
Hence, the length of any vector $\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$ is given by
$|\vec{r}| = |x\hat{i} + y\hat{j} + z\hat{k}| = \sqrt{x^2 + y^2 + z^2}$
If $\vec{a}$ and $\vec{b}$ are any two vectors given in the component form $a_1\hat{i} + a_2\hat{j} + a_3\hat{k}$ and $b_1\hat{i} + b_2\hat{j} + b_3\hat{k}$, respectively, then
(i) the sum (or resultant) of the vectors $\vec{a}$ and $\vec{b}$ is given by
$\vec{a} + \vec{b} = (a_1 + b_1)\hat{i} + (a_2 + b_2)\hat{j} + (a_3 + b_3)\hat{k}$
(ii) the difference of the vectors $\vec{a}$ and $\vec{b}$ is given by
$\vec{a} - \vec{b} = (a_1 - b_1)\hat{i} + (a_2 - b_2)\hat{j} + (a_3 - b_3)\hat{k}$
(iii) the vectors $\vec{a}$ and $\vec{b}$ are equal if and only if
$a_1 = b_1$, $a_2 = b_2$ and $a_3 = b_3$
(iv) the multiplication of vector $\vec{a}$ by any scalar λ is given by
$\lambda\vec{a} = (\lambda a_1)\hat{i} + (\lambda a_2)\hat{j} + (\lambda a_3)\hat{k}$
The addition of vectors and the multiplication of a vector by a scalar together give the following distributive laws:
Let $\vec{a}$ and $\vec{b}$ be any two vectors, and k and m be any scalars. Then
(i) $k\vec{a} + m\vec{a} = (k + m)\vec{a}$
(ii) $k(m\vec{a}) = (km)\vec{a}$
(iii) $k(\vec{a} + \vec{b}) = k\vec{a} + k\vec{b}$
Remarks
(i) One may observe that whatever be the value of λ, the vector $\lambda\vec{a}$ is always collinear to the vector $\vec{a}$. In fact, two vectors $\vec{a}$ and $\vec{b}$ are collinear if and only if there exists a nonzero scalar λ such that $\vec{b} = \lambda\vec{a}$. If the vectors $\vec{a}$ and $\vec{b}$ are given in the component form, i.e., $\vec{a} = a_1\hat{i} + a_2\hat{j} + a_3\hat{k}$ and $\vec{b} = b_1\hat{i} + b_2\hat{j} + b_3\hat{k}$, then the two vectors are collinear if and only if
$b_1\hat{i} + b_2\hat{j} + b_3\hat{k} = \lambda(a_1\hat{i} + a_2\hat{j} + a_3\hat{k})$
$\Leftrightarrow b_1\hat{i} + b_2\hat{j} + b_3\hat{k} = (\lambda a_1)\hat{i} + (\lambda a_2)\hat{j} + (\lambda a_3)\hat{k}$
$\Leftrightarrow b_1 = \lambda a_1,\; b_2 = \lambda a_2,\; b_3 = \lambda a_3$
$\Leftrightarrow \dfrac{b_1}{a_1} = \dfrac{b_2}{a_2} = \dfrac{b_3}{a_3} = \lambda$
(ii) If $\vec{a} = a_1\hat{i} + a_2\hat{j} + a_3\hat{k}$, then $a_1, a_2, a_3$ are also called direction ratios of $\vec{a}$.
(iii) In case it is given that l, m, n are direction cosines of a vector, then $l\hat{i} + m\hat{j} + n\hat{k} = (\cos\alpha)\hat{i} + (\cos\beta)\hat{j} + (\cos\gamma)\hat{k}$ is the unit vector in the direction of that vector, where α, β and γ are the angles which the vector makes with the x, y and z axes respectively.
Example 4 Find the values of x, y and z so that the vectors $\vec{a} = x\hat{i} + 2\hat{j} + z\hat{k}$ and $\vec{b} = 2\hat{i} + y\hat{j} + \hat{k}$ are equal.
Solution Note that two vectors are equal if and only if their corresponding components are equal. Thus, the given vectors $\vec{a}$ and $\vec{b}$ will be equal if and only if
x = 2, y = 2, z = 1
Example 5 Let $\vec{a} = \hat{i} + 2\hat{j}$ and $\vec{b} = 2\hat{i} + \hat{j}$. Is $|\vec{a}| = |\vec{b}|$? Are the vectors $\vec{a}$ and $\vec{b}$ equal?
Solution We have $|\vec{a}| = \sqrt{1^2 + 2^2} = \sqrt{5}$ and $|\vec{b}| = \sqrt{2^2 + 1^2} = \sqrt{5}$
So, $|\vec{a}| = |\vec{b}|$. But, the two vectors are not equal since their corresponding components are distinct.
Example 6 Find the unit vector in the direction of vector $\vec{a} = 2\hat{i} + 3\hat{j} + \hat{k}$
Solution The unit vector in the direction of a vector $\vec{a}$ is given by $\hat{a} = \dfrac{1}{|\vec{a}|}\vec{a}$.
Now $|\vec{a}| = \sqrt{2^2 + 3^2 + 1^2} = \sqrt{14}$
Therefore $\hat{a} = \dfrac{1}{\sqrt{14}}(2\hat{i} + 3\hat{j} + \hat{k}) = \dfrac{2}{\sqrt{14}}\hat{i} + \dfrac{3}{\sqrt{14}}\hat{j} + \dfrac{1}{\sqrt{14}}\hat{k}$
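The computation in Example 6 can be sketched in Python. This is only an illustration, assuming a vector is stored as a plain tuple of components; the helper names `magnitude` and `unit_vector` are hypothetical and not part of the text.

```python
import math

def magnitude(v):
    """Length of a vector given as a tuple of components."""
    return math.sqrt(sum(c * c for c in v))

def unit_vector(v):
    """Return v / |v|; the zero vector has no direction, so it is rejected."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no unit vector")
    return tuple(c / m for c in v)

a = (2, 3, 1)            # a = 2i + 3j + k, as in Example 6
a_hat = unit_vector(a)   # components 2/sqrt(14), 3/sqrt(14), 1/sqrt(14)
```

Multiplying each component of `a_hat` by a positive scalar gives a vector of that magnitude in the same direction.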

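Example 7 Find a vector in the direction of vector $\vec{a} = \hat{i} - 2\hat{j}$ that has magnitude 7 units.
Solution The unit vector in the direction of the given vector $\vec{a}$ is
$\hat{a} = \dfrac{1}{|\vec{a}|}\vec{a} = \dfrac{1}{\sqrt{5}}(\hat{i} - 2\hat{j}) = \dfrac{1}{\sqrt{5}}\hat{i} - \dfrac{2}{\sqrt{5}}\hat{j}$
Therefore, the vector having magnitude equal to 7 and in the direction of $\vec{a}$ is
$7\hat{a} = 7\left(\dfrac{1}{\sqrt{5}}\hat{i} - \dfrac{2}{\sqrt{5}}\hat{j}\right) = \dfrac{7}{\sqrt{5}}\hat{i} - \dfrac{14}{\sqrt{5}}\hat{j}$
Example 8 Find the unit vector in the direction of the sum of the vectors $\vec{a} = 2\hat{i} + 2\hat{j} - 5\hat{k}$ and $\vec{b} = 2\hat{i} + \hat{j} + 3\hat{k}$.
Solution The sum of the given vectors is
$\vec{a} + \vec{b}$ ($= \vec{c}$, say) $= 4\hat{i} + 3\hat{j} - 2\hat{k}$
and $|\vec{c}| = \sqrt{4^2 + 3^2 + (-2)^2} = \sqrt{29}$
Thus, the required unit vector is
$\hat{c} = \dfrac{1}{|\vec{c}|}\vec{c} = \dfrac{1}{\sqrt{29}}(4\hat{i} + 3\hat{j} - 2\hat{k}) = \dfrac{4}{\sqrt{29}}\hat{i} + \dfrac{3}{\sqrt{29}}\hat{j} - \dfrac{2}{\sqrt{29}}\hat{k}$
Example 9 Write the direction ratios of the vector $\vec{a} = \hat{i} + \hat{j} - 2\hat{k}$ and hence calculate its direction cosines.
Solution Note that the direction ratios a, b, c of a vector $\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$ are just the respective components x, y and z of the vector. So, for the given vector, we have a = 1, b = 1 and c = −2. Further, if l, m and n are the direction cosines of the given vector, then
$l = \dfrac{a}{|\vec{r}|} = \dfrac{1}{\sqrt{6}},\; m = \dfrac{b}{|\vec{r}|} = \dfrac{1}{\sqrt{6}},\; n = \dfrac{c}{|\vec{r}|} = \dfrac{-2}{\sqrt{6}}$, as $|\vec{r}| = \sqrt{6}$
Thus, the direction cosines are $\left(\dfrac{1}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}, -\dfrac{2}{\sqrt{6}}\right)$.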
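As a quick check of Example 9, the following sketch divides each direction ratio by the magnitude of the vector. The function name `direction_cosines` and the tuple representation are assumptions made for illustration.

```python
import math

def direction_cosines(v):
    """Direction cosines (l, m, n) of a nonzero vector: each component over |v|."""
    r = math.sqrt(sum(c * c for c in v))
    if r == 0:
        raise ValueError("direction cosines are undefined for the zero vector")
    return tuple(c / r for c in v)

# Direction ratios 1, 1, -2 of the vector in Example 9.
l, m, n = direction_cosines((1, 1, -2))
```

The returned values always satisfy l² + m² + n² = 1 up to rounding, which is a convenient sanity check.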

Vector joining two points


If $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$ are any two points, then the vector joining $P_1$ and $P_2$ is the vector $\overrightarrow{P_1P_2}$.
Joining the points $P_1$ and $P_2$ with the origin O, and applying the triangle law, from the triangle $OP_1P_2$, we have
$\overrightarrow{OP_1} + \overrightarrow{P_1P_2} = \overrightarrow{OP_2}$
Using the properties of vector addition, the above equation becomes
$\overrightarrow{P_1P_2} = \overrightarrow{OP_2} - \overrightarrow{OP_1}$
i.e. $\overrightarrow{P_1P_2} = (x_2\hat{i} + y_2\hat{j} + z_2\hat{k}) - (x_1\hat{i} + y_1\hat{j} + z_1\hat{k})$
$= (x_2 - x_1)\hat{i} + (y_2 - y_1)\hat{j} + (z_2 - z_1)\hat{k}$
The magnitude of vector $\overrightarrow{P_1P_2}$ is given by
$|\overrightarrow{P_1P_2}| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$
Example 10 Find the vector joining the points P(2, 3, 0) and Q(−1, −2, −4) directed from P to Q.
Solution Since the vector is to be directed from P to Q, clearly P is the initial point and Q is the terminal point. So, the required vector joining P and Q is the vector $\overrightarrow{PQ}$, given by
$\overrightarrow{PQ} = (-1 - 2)\hat{i} + (-2 - 3)\hat{j} + (-4 - 0)\hat{k}$
i.e. $\overrightarrow{PQ} = -3\hat{i} - 5\hat{j} - 4\hat{k}$.
Section formula

Let P and Q be two points represented by the position vectors $\overrightarrow{OP}$ and $\overrightarrow{OQ}$, respectively, with respect to the origin O. Then the line segment joining P and Q may be divided by a third point, say R, in two ways: internally and externally. Here, we intend to find the position vector $\overrightarrow{OR}$ for the point R with respect to the origin O. Throughout, we write $\vec{a} = \overrightarrow{OP}$, $\vec{b} = \overrightarrow{OQ}$ and $\vec{r} = \overrightarrow{OR}$. We take the two cases one by one.
Case I When R divides PQ internally.
If R divides $\overrightarrow{PQ}$ such that $m\overrightarrow{RQ} = n\overrightarrow{PR}$, where m and n are positive scalars, we say that the point R divides $\overrightarrow{PQ}$ internally in the ratio of m : n. Now from triangles ORQ and OPR, we have
$\overrightarrow{RQ} = \overrightarrow{OQ} - \overrightarrow{OR} = \vec{b} - \vec{r}$
and $\overrightarrow{PR} = \overrightarrow{OR} - \overrightarrow{OP} = \vec{r} - \vec{a}$,
Therefore, we have $m(\vec{b} - \vec{r}) = n(\vec{r} - \vec{a})$ (Why?)
or $\vec{r} = \dfrac{m\vec{b} + n\vec{a}}{m + n}$ (on simplification)
Hence, the position vector of the point R which divides P and Q internally in the ratio of m : n is given by
$\overrightarrow{OR} = \dfrac{m\vec{b} + n\vec{a}}{m + n}$
Case II When R divides PQ externally.
We leave it to the reader as an exercise to verify that the position vector of the point R which divides the line segment PQ externally in the ratio m : n, i.e., $\dfrac{PR}{QR} = \dfrac{m}{n}$, is given by
$\overrightarrow{OR} = \dfrac{m\vec{b} - n\vec{a}}{m - n}$
Remark If R is the midpoint of PQ, then m = n. And therefore, from Case I, the midpoint R of $\overrightarrow{PQ}$ will have its position vector as
$\overrightarrow{OR} = \dfrac{\vec{a} + \vec{b}}{2}$
Example 11 Consider two points P and Q with position vectors $\overrightarrow{OP} = 3\vec{a} - 2\vec{b}$ and $\overrightarrow{OQ} = \vec{a} + \vec{b}$. Find the position vector of a point R which divides the line joining P and Q in the ratio 2 : 1, (i) internally, and (ii) externally.
Solution
(i) The position vector of the point R dividing the join of P and Q internally in the ratio 2 : 1 is
$\overrightarrow{OR} = \dfrac{2(\vec{a} + \vec{b}) + (3\vec{a} - 2\vec{b})}{2 + 1} = \dfrac{5\vec{a}}{3}$
(ii) The position vector of the point R dividing the join of P and Q externally in the ratio 2 : 1 is
$\overrightarrow{OR} = \dfrac{2(\vec{a} + \vec{b}) - (3\vec{a} - 2\vec{b})}{2 - 1} = 4\vec{b} - \vec{a}$
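The two cases of the section formula can be sketched as one Python helper. The function name `section_point` and the concrete choice a = (1, 0, 0), b = (0, 1, 0) are assumptions made only so that Example 11 can be checked numerically; the text itself keeps a and b abstract.

```python
def section_point(p, q, m, n, internal=True):
    """Position vector of R dividing PQ in the ratio m:n.

    Internal division: (m*q + n*p) / (m + n)
    External division: (m*q - n*p) / (m - n)
    """
    sign = 1 if internal else -1
    denom = m + sign * n
    if denom == 0:
        raise ValueError("external division is undefined when m equals n")
    return tuple((m * qc + sign * n * pc) / denom for pc, qc in zip(p, q))

# Hypothetical concrete vectors: a = (1, 0, 0), b = (0, 1, 0).
P = (3, -2, 0)   # 3a - 2b
Q = (1, 1, 0)    # a + b
R_internal = section_point(P, Q, 2, 1)                  # expected 5a/3
R_external = section_point(P, Q, 2, 1, internal=False)  # expected 4b - a
```

With m = n the internal case reduces to the midpoint formula stated in the Remark.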
Example 12 Show that the points $A(2\hat{i} - \hat{j} + \hat{k})$, $B(\hat{i} - 3\hat{j} - 5\hat{k})$, $C(3\hat{i} - 4\hat{j} - 4\hat{k})$ are the vertices of a right angled triangle.
Solution We have
$\overrightarrow{AB} = (1 - 2)\hat{i} + (-3 + 1)\hat{j} + (-5 - 1)\hat{k} = -\hat{i} - 2\hat{j} - 6\hat{k}$
$\overrightarrow{BC} = (3 - 1)\hat{i} + (-4 + 3)\hat{j} + (-4 + 5)\hat{k} = 2\hat{i} - \hat{j} + \hat{k}$
and $\overrightarrow{CA} = (2 - 3)\hat{i} + (-1 + 4)\hat{j} + (1 + 4)\hat{k} = -\hat{i} + 3\hat{j} + 5\hat{k}$
Further, note that
$|\overrightarrow{AB}|^2 = 41 = 6 + 35 = |\overrightarrow{BC}|^2 + |\overrightarrow{CA}|^2$
Hence, the triangle is a right angled triangle.

Product of Two Vectors
So far we have studied addition and subtraction of vectors. Another algebraic operation which we intend to discuss regarding vectors is their product. We may recall that the product of two numbers is a number, and the product of two matrices is again a matrix. But in the case of functions, we may multiply them in two ways, namely, multiplication of two functions pointwise and composition of two functions. Similarly, multiplication of two vectors is also defined in two ways, namely, scalar (or dot) product where the result is a scalar, and vector (or cross) product where the result is a vector. These two types of products have found various applications in geometry, mechanics and engineering. In this section, we will discuss these two types of products.
Scalar (or dot) product of two vectors
Definition 2 The scalar product of two nonzero vectors $\vec{a}$ and $\vec{b}$, denoted by $\vec{a} \cdot \vec{b}$, is defined as
$\vec{a} \cdot \vec{b} = |\vec{a}||\vec{b}|\cos\theta$,
where θ is the angle between $\vec{a}$ and $\vec{b}$, 0 ≤ θ ≤ π.
If either $\vec{a} = \vec{0}$ or $\vec{b} = \vec{0}$, then θ is not defined, and in this case, we define $\vec{a} \cdot \vec{b} = 0$.
Observations
1. $\vec{a} \cdot \vec{b}$ is a real number.
2. Let $\vec{a}$ and $\vec{b}$ be two nonzero vectors, then $\vec{a} \cdot \vec{b} = 0$ if and only if $\vec{a}$ and $\vec{b}$ are perpendicular to each other, i.e.
$\vec{a} \cdot \vec{b} = 0 \Leftrightarrow \vec{a} \perp \vec{b}$
3. If θ = 0, then $\vec{a} \cdot \vec{b} = |\vec{a}||\vec{b}|$
In particular, $\vec{a} \cdot \vec{a} = |\vec{a}|^2$, as θ in this case is 0.
4. If θ = π, then $\vec{a} \cdot \vec{b} = -|\vec{a}||\vec{b}|$
In particular, $\vec{a} \cdot (-\vec{a}) = -|\vec{a}|^2$, as θ in this case is π.
5. In view of the Observations 2 and 3, for mutually perpendicular unit vectors $\hat{i}$, $\hat{j}$ and $\hat{k}$, we have
$\hat{i} \cdot \hat{i} = \hat{j} \cdot \hat{j} = \hat{k} \cdot \hat{k} = 1$,
$\hat{i} \cdot \hat{j} = \hat{j} \cdot \hat{k} = \hat{k} \cdot \hat{i} = 0$
6. The angle between two nonzero vectors $\vec{a}$ and $\vec{b}$ is given by
$\cos\theta = \dfrac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|}$, or $\theta = \cos^{-1}\left(\dfrac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|}\right)$
7. The scalar product is commutative, i.e.
$\vec{a} \cdot \vec{b} = \vec{b} \cdot \vec{a}$ (Why?)
Two important properties of scalar product
Property 1 (Distributivity of scalar product over addition) Let $\vec{a}$, $\vec{b}$ and $\vec{c}$ be any three vectors, then
$\vec{a} \cdot (\vec{b} + \vec{c}) = \vec{a} \cdot \vec{b} + \vec{a} \cdot \vec{c}$
Property 2 Let $\vec{a}$ and $\vec{b}$ be any two vectors, and λ be any scalar. Then
$(\lambda\vec{a}) \cdot \vec{b} = \lambda(\vec{a} \cdot \vec{b}) = \vec{a} \cdot (\lambda\vec{b})$
If two vectors $\vec{a}$ and $\vec{b}$ are given in component form as $a_1\hat{i} + a_2\hat{j} + a_3\hat{k}$ and $b_1\hat{i} + b_2\hat{j} + b_3\hat{k}$, then their scalar product is given as
$\vec{a} \cdot \vec{b} = (a_1\hat{i} + a_2\hat{j} + a_3\hat{k}) \cdot (b_1\hat{i} + b_2\hat{j} + b_3\hat{k})$
$= a_1\hat{i} \cdot (b_1\hat{i} + b_2\hat{j} + b_3\hat{k}) + a_2\hat{j} \cdot (b_1\hat{i} + b_2\hat{j} + b_3\hat{k}) + a_3\hat{k} \cdot (b_1\hat{i} + b_2\hat{j} + b_3\hat{k})$
$= a_1b_1(\hat{i} \cdot \hat{i}) + a_1b_2(\hat{i} \cdot \hat{j}) + a_1b_3(\hat{i} \cdot \hat{k}) + a_2b_1(\hat{j} \cdot \hat{i}) + a_2b_2(\hat{j} \cdot \hat{j}) + a_2b_3(\hat{j} \cdot \hat{k}) + a_3b_1(\hat{k} \cdot \hat{i}) + a_3b_2(\hat{k} \cdot \hat{j}) + a_3b_3(\hat{k} \cdot \hat{k})$ (Using the above Properties 1 and 2)
$= a_1b_1 + a_2b_2 + a_3b_3$ (Using Observation 5)
Thus $\vec{a} \cdot \vec{b} = a_1b_1 + a_2b_2 + a_3b_3$
Projection of a vector on a line
Suppose a vector $\overrightarrow{AB}$ makes an angle θ with a given directed line l (say), in the anticlockwise direction. Then the projection of $\overrightarrow{AB}$ on l is a vector $\vec{p}$ (say) with magnitude $|\overrightarrow{AB}||\cos\theta|$, whose direction is the same as (or opposite to) that of l according as cos θ is positive (or negative). The vector $\vec{p}$ is called the projection vector, and its magnitude $|\vec{p}|$ is simply called the projection of the vector $\overrightarrow{AB}$ on the directed line l.
For example, in each of the following figures, the projection vector of $\overrightarrow{AB}$ along the line l is the vector $\overrightarrow{AC}$.
Observations
1. If $\hat{p}$ is the unit vector along a line l, then the projection of a vector $\vec{a}$ on the line l is given by $\vec{a} \cdot \hat{p}$.
2. Projection of a vector $\vec{a}$ on another vector $\vec{b}$ is given by
$\vec{a} \cdot \hat{b}$, or $\vec{a} \cdot \left(\dfrac{\vec{b}}{|\vec{b}|}\right)$, or $\dfrac{1}{|\vec{b}|}(\vec{a} \cdot \vec{b})$
3. If θ = 0, then the projection vector of $\overrightarrow{AB}$ will be $\overrightarrow{AB}$ itself and if θ = π, then the projection vector of $\overrightarrow{AB}$ will be $\overrightarrow{BA}$.
4. If $\theta = \dfrac{\pi}{2}$ or $\theta = \dfrac{3\pi}{2}$, then the projection vector of $\overrightarrow{AB}$ will be the zero vector.
Remark If α, β and γ are the direction angles of vector $\vec{a} = a_1\hat{i} + a_2\hat{j} + a_3\hat{k}$, then its direction cosines may be given as
$\cos\alpha = \dfrac{\vec{a} \cdot \hat{i}}{|\vec{a}||\hat{i}|} = \dfrac{a_1}{|\vec{a}|}$, $\cos\beta = \dfrac{a_2}{|\vec{a}|}$, and $\cos\gamma = \dfrac{a_3}{|\vec{a}|}$
Also, note that $|\vec{a}|\cos\alpha$, $|\vec{a}|\cos\beta$ and $|\vec{a}|\cos\gamma$ are respectively the projections of $\vec{a}$ along OX, OY and OZ, i.e., the scalar components $a_1$, $a_2$ and $a_3$ of the vector $\vec{a}$ are precisely the projections of $\vec{a}$ along the x-axis, y-axis and z-axis, respectively. Further, if $\vec{a}$ is a unit vector, then it may be expressed in terms of its direction cosines as
$\vec{a} = \cos\alpha\,\hat{i} + \cos\beta\,\hat{j} + \cos\gamma\,\hat{k}$
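Example 13 Find the angle between two vectors $\vec{a}$ and $\vec{b}$ with magnitudes 1 and 2 respectively and when $\vec{a} \cdot \vec{b} = 1$.
Solution Given $\vec{a} \cdot \vec{b} = 1$, $|\vec{a}| = 1$ and $|\vec{b}| = 2$. We have
$\theta = \cos^{-1}\left(\dfrac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|}\right) = \cos^{-1}\left(\dfrac{1}{2}\right) = \dfrac{\pi}{3}$
Example 14 Find the angle θ between the vectors $\vec{a} = \hat{i} + \hat{j} - \hat{k}$ and $\vec{b} = \hat{i} - \hat{j} + \hat{k}$.
Solution The angle θ between two vectors $\vec{a}$ and $\vec{b}$ is given by
$\cos\theta = \dfrac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|}$
Now $\vec{a} \cdot \vec{b} = (\hat{i} + \hat{j} - \hat{k}) \cdot (\hat{i} - \hat{j} + \hat{k}) = 1 - 1 - 1 = -1$ and $|\vec{a}| = |\vec{b}| = \sqrt{3}$.
Therefore, we have $\cos\theta = \dfrac{-1}{3}$
hence the required angle is $\theta = \cos^{-1}\left(-\dfrac{1}{3}\right)$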
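The angle formula used in Examples 13 and 14 can be sketched as follows. The helper names `dot` and `angle_between` are hypothetical, and the clamping step is an added safeguard against floating-point rounding rather than something the text requires.

```python
import math

def dot(a, b):
    """Scalar product a1*b1 + a2*b2 + a3*b3 of two component tuples."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in radians, in [0, pi], between two nonzero vectors."""
    na = math.sqrt(dot(a, a))
    nb = math.sqrt(dot(b, b))
    if na == 0 or nb == 0:
        raise ValueError("the angle is undefined for the zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    c = max(-1.0, min(1.0, dot(a, b) / (na * nb)))
    return math.acos(c)

# Vectors of Example 14: a = i + j - k, b = i - j + k.
theta = angle_between((1, 1, -1), (1, -1, 1))
```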

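Example 15 If $\vec{a} = 5\hat{i} - \hat{j} - 3\hat{k}$ and $\vec{b} = \hat{i} + 3\hat{j} - 5\hat{k}$, then show that the vectors $\vec{a} + \vec{b}$ and $\vec{a} - \vec{b}$ are perpendicular.
Solution We know that two nonzero vectors are perpendicular if their scalar product is zero.
Here $\vec{a} + \vec{b} = (5\hat{i} - \hat{j} - 3\hat{k}) + (\hat{i} + 3\hat{j} - 5\hat{k}) = 6\hat{i} + 2\hat{j} - 8\hat{k}$
and $\vec{a} - \vec{b} = (5\hat{i} - \hat{j} - 3\hat{k}) - (\hat{i} + 3\hat{j} - 5\hat{k}) = 4\hat{i} - 4\hat{j} + 2\hat{k}$
So $(\vec{a} + \vec{b}) \cdot (\vec{a} - \vec{b}) = (6\hat{i} + 2\hat{j} - 8\hat{k}) \cdot (4\hat{i} - 4\hat{j} + 2\hat{k}) = 24 - 8 - 16 = 0$.
Hence $\vec{a} + \vec{b}$ and $\vec{a} - \vec{b}$ are perpendicular vectors.
Example 16 Find the projection of the vector $\vec{a} = 2\hat{i} + 3\hat{j} + 2\hat{k}$ on the vector $\vec{b} = \hat{i} + 2\hat{j} + \hat{k}$.
Solution The projection of vector $\vec{a}$ on the vector $\vec{b}$ is given by
$\dfrac{1}{|\vec{b}|}(\vec{a} \cdot \vec{b}) = \dfrac{2 \times 1 + 3 \times 2 + 2 \times 1}{\sqrt{(1)^2 + (2)^2 + (1)^2}} = \dfrac{10}{\sqrt{6}} = \dfrac{5}{3}\sqrt{6}$
Example 17 Find $|\vec{a} - \vec{b}|$, if two vectors $\vec{a}$ and $\vec{b}$ are such that $|\vec{a}| = 2$, $|\vec{b}| = 3$ and $\vec{a} \cdot \vec{b} = 4$.
Solution We have
$|\vec{a} - \vec{b}|^2 = (\vec{a} - \vec{b}) \cdot (\vec{a} - \vec{b})$
$= \vec{a} \cdot \vec{a} - \vec{a} \cdot \vec{b} - \vec{b} \cdot \vec{a} + \vec{b} \cdot \vec{b}$
$= |\vec{a}|^2 - 2(\vec{a} \cdot \vec{b}) + |\vec{b}|^2$
$= (2)^2 - 2(4) + (3)^2 = 5$
Therefore $|\vec{a} - \vec{b}| = \sqrt{5}$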
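The computations of Examples 16 and 17 can be sketched together. The names `scalar_projection` and `diff_magnitude` are hypothetical helpers introduced for illustration only.

```python
import math

def dot(a, b):
    """Scalar product of two component tuples."""
    return sum(x * y for x, y in zip(a, b))

def scalar_projection(a, b):
    """Projection of a on a nonzero vector b: (a . b) / |b|."""
    nb = math.sqrt(dot(b, b))
    if nb == 0:
        raise ValueError("cannot project onto the zero vector")
    return dot(a, b) / nb

def diff_magnitude(norm_a, norm_b, a_dot_b):
    """|a - b| from |a|, |b| and a . b, via |a - b|^2 = |a|^2 - 2 a.b + |b|^2."""
    return math.sqrt(norm_a ** 2 - 2 * a_dot_b + norm_b ** 2)

proj = scalar_projection((2, 3, 2), (1, 2, 1))   # Example 16
dist = diff_magnitude(2, 3, 4)                    # Example 17
```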
Example 18 If $\vec{a}$ is a unit vector and $(\vec{x} - \vec{a}) \cdot (\vec{x} + \vec{a}) = 8$, then find $|\vec{x}|$.
Solution Since $\vec{a}$ is a unit vector, $|\vec{a}| = 1$. Also,
$(\vec{x} - \vec{a}) \cdot (\vec{x} + \vec{a}) = 8$
or $\vec{x} \cdot \vec{x} + \vec{x} \cdot \vec{a} - \vec{a} \cdot \vec{x} - \vec{a} \cdot \vec{a} = 8$
or $|\vec{x}|^2 - 1 = 8$, i.e., $|\vec{x}|^2 = 9$
Therefore $|\vec{x}| = 3$ (as the magnitude of a vector is non-negative).
Example 19 For any two vectors $\vec{a}$ and $\vec{b}$, we always have $|\vec{a} \cdot \vec{b}| \le |\vec{a}||\vec{b}|$ (Cauchy-Schwarz inequality).
Solution The inequality holds trivially when either $\vec{a} = \vec{0}$ or $\vec{b} = \vec{0}$. Actually, in such a situation we have $|\vec{a} \cdot \vec{b}| = 0 = |\vec{a}||\vec{b}|$. So, let us assume that $|\vec{a}| \ne 0 \ne |\vec{b}|$.
Then, we have
$\dfrac{|\vec{a} \cdot \vec{b}|}{|\vec{a}||\vec{b}|} = |\cos\theta| \le 1$
Therefore $|\vec{a} \cdot \vec{b}| \le |\vec{a}||\vec{b}|$
Example 20 For any two vectors $\vec{a}$ and $\vec{b}$, we always have $|\vec{a} + \vec{b}| \le |\vec{a}| + |\vec{b}|$ (triangle inequality).
Solution The inequality holds trivially in case either $\vec{a} = \vec{0}$ or $\vec{b} = \vec{0}$ (How?). So, let $|\vec{a}| \ne 0 \ne |\vec{b}|$. Then,
$|\vec{a} + \vec{b}|^2 = (\vec{a} + \vec{b})^2 = (\vec{a} + \vec{b}) \cdot (\vec{a} + \vec{b})$
$= \vec{a} \cdot \vec{a} + \vec{a} \cdot \vec{b} + \vec{b} \cdot \vec{a} + \vec{b} \cdot \vec{b}$
$= |\vec{a}|^2 + 2\vec{a} \cdot \vec{b} + |\vec{b}|^2$ (scalar product is commutative)
$\le |\vec{a}|^2 + 2|\vec{a} \cdot \vec{b}| + |\vec{b}|^2$ (since x ≤ |x| ∀ x ∈ R)
$\le |\vec{a}|^2 + 2|\vec{a}||\vec{b}| + |\vec{b}|^2$ (from Example 19)
$= (|\vec{a}| + |\vec{b}|)^2$
Hence $|\vec{a} + \vec{b}| \le |\vec{a}| + |\vec{b}|$
Remark If the equality holds in the triangle inequality (in the above Example 20), i.e.
$|\vec{a} + \vec{b}| = |\vec{a}| + |\vec{b}|$,
then, taking $\vec{a} = \overrightarrow{AB}$ and $\vec{b} = \overrightarrow{BC}$, we get $|\overrightarrow{AC}| = |\overrightarrow{AB}| + |\overrightarrow{BC}|$
showing that the points A, B and C are collinear.
Example 21 Show that the points $A(-2\hat{i} + 3\hat{j} + 5\hat{k})$, $B(\hat{i} + 2\hat{j} + 3\hat{k})$ and $C(7\hat{i} - \hat{k})$ are collinear.
Solution We have
$\overrightarrow{AB} = (1 + 2)\hat{i} + (2 - 3)\hat{j} + (3 - 5)\hat{k} = 3\hat{i} - \hat{j} - 2\hat{k}$,
$\overrightarrow{BC} = (7 - 1)\hat{i} + (0 - 2)\hat{j} + (-1 - 3)\hat{k} = 6\hat{i} - 2\hat{j} - 4\hat{k}$,
$\overrightarrow{AC} = (7 + 2)\hat{i} + (0 - 3)\hat{j} + (-1 - 5)\hat{k} = 9\hat{i} - 3\hat{j} - 6\hat{k}$
$|\overrightarrow{AB}| = \sqrt{14}$, $|\overrightarrow{BC}| = 2\sqrt{14}$ and $|\overrightarrow{AC}| = 3\sqrt{14}$
Therefore $|\overrightarrow{AC}| = |\overrightarrow{AB}| + |\overrightarrow{BC}|$
Hence the points A, B and C are collinear.
Note In Example 21, one may note that although $\overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CA} = \vec{0}$, the points A, B and C do not form the vertices of a triangle.
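The distance test of Example 21 can be sketched in Python. The function names are hypothetical, and `math.isclose` is used because the lengths are floating-point square roots, so exact equality should not be expected.

```python
import math

def distance(p, q):
    """Length of the vector joining points p and q."""
    return math.sqrt(sum((qc - pc) ** 2 for pc, qc in zip(p, q)))

def b_lies_on_segment_ac(a, b, c):
    """True when |AC| = |AB| + |BC| up to floating-point tolerance."""
    return math.isclose(distance(a, c), distance(a, b) + distance(b, c))

# Points of Example 21.
A, B, C = (-2, 3, 5), (1, 2, 3), (7, 0, -1)
collinear = b_lies_on_segment_ac(A, B, C)
```

Note that this test also requires B to lie between A and C; three collinear points listed in a different order need the roles of the points exchanged.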
Vector (or cross) product of two vectors
We have discussed the three dimensional right handed rectangular coordinate system. In this system, when the positive x-axis is rotated counterclockwise into the positive y-axis, a right handed (standard) screw would advance in the direction of the positive z-axis.
In a right handed coordinate system, the thumb of the right hand points in the direction of the positive z-axis when the fingers are curled in the direction away from the positive x-axis toward the positive y-axis.

Definition 3 The vector product of two nonzero vectors $\vec{a}$ and $\vec{b}$ is denoted by $\vec{a} \times \vec{b}$ and defined as
$\vec{a} \times \vec{b} = |\vec{a}||\vec{b}|\sin\theta\,\hat{n}$,
where θ is the angle between $\vec{a}$ and $\vec{b}$, 0 ≤ θ ≤ π, and $\hat{n}$ is a unit vector perpendicular to both $\vec{a}$ and $\vec{b}$, such that $\vec{a}$, $\vec{b}$ and $\hat{n}$ form a right handed system, i.e., the right handed system rotated from $\vec{a}$ to $\vec{b}$ moves in the direction of $\hat{n}$.
If either $\vec{a} = \vec{0}$ or $\vec{b} = \vec{0}$, then θ is not defined and in this case, we define $\vec{a} \times \vec{b} = \vec{0}$.
Observations
1. $\vec{a} \times \vec{b}$ is a vector.
2. Let $\vec{a}$ and $\vec{b}$ be two nonzero vectors. Then $\vec{a} \times \vec{b} = \vec{0}$ if and only if $\vec{a}$ and $\vec{b}$ are parallel (or collinear) to each other, i.e.,
$\vec{a} \times \vec{b} = \vec{0} \Leftrightarrow \vec{a} \parallel \vec{b}$
In particular, $\vec{a} \times \vec{a} = \vec{0}$ and $\vec{a} \times (-\vec{a}) = \vec{0}$, since in the first situation θ = 0 and in the second one θ = π, making the value of sin θ to be 0.
3. If $\theta = \dfrac{\pi}{2}$, then $\vec{a} \times \vec{b} = |\vec{a}||\vec{b}|\,\hat{n}$, so that $|\vec{a} \times \vec{b}| = |\vec{a}||\vec{b}|$.
4. In view of the Observations 2 and 3, for mutually perpendicular unit vectors $\hat{i}$, $\hat{j}$ and $\hat{k}$, we have
$\hat{i} \times \hat{i} = \hat{j} \times \hat{j} = \hat{k} \times \hat{k} = \vec{0}$
$\hat{i} \times \hat{j} = \hat{k}$, $\hat{j} \times \hat{k} = \hat{i}$, $\hat{k} \times \hat{i} = \hat{j}$
5. In terms of vector product, the angle between two vectors $\vec{a}$ and $\vec{b}$ may be given as
$\sin\theta = \dfrac{|\vec{a} \times \vec{b}|}{|\vec{a}||\vec{b}|}$
6. The vector product is not commutative; in fact, $\vec{a} \times \vec{b} = -\vec{b} \times \vec{a}$. Indeed, $\vec{a} \times \vec{b} = |\vec{a}||\vec{b}|\sin\theta\,\hat{n}$, where $\vec{a}$, $\vec{b}$ and $\hat{n}$ form a right handed system, i.e., θ is traversed from $\vec{a}$ to $\vec{b}$, while $\vec{b} \times \vec{a} = |\vec{a}||\vec{b}|\sin\theta\,\hat{n}_1$, where $\vec{b}$, $\vec{a}$ and $\hat{n}_1$ form a right handed system, i.e., θ is traversed from $\vec{b}$ to $\vec{a}$.
Thus, if we assume $\vec{a}$ and $\vec{b}$ to lie in the plane of the paper, then $\hat{n}$ and $\hat{n}_1$ both will be perpendicular to the plane of the paper. But, $\hat{n}$ being directed above the paper while $\hat{n}_1$ is directed below the paper, we have $\hat{n}_1 = -\hat{n}$.
Hence $\vec{a} \times \vec{b} = |\vec{a}||\vec{b}|\sin\theta\,\hat{n}$
$= -|\vec{a}||\vec{b}|\sin\theta\,\hat{n}_1 = -\vec{b} \times \vec{a}$
7. In view of the Observations 4 and 6, we have
$\hat{j} \times \hat{i} = -\hat{k}$, $\hat{k} \times \hat{j} = -\hat{i}$ and $\hat{i} \times \hat{k} = -\hat{j}$.
8. If $\vec{a}$ and $\vec{b}$ represent the adjacent sides of a triangle then its area is given as
$\dfrac{1}{2}|\vec{a} \times \vec{b}|$
By definition of the area of a triangle,
Area of triangle ABC $= \dfrac{1}{2}\,AB \cdot CD$.
But $AB = |\vec{b}|$ (as given), and $CD = |\vec{a}|\sin\theta$.
Thus, Area of triangle ABC $= \dfrac{1}{2}|\vec{b}||\vec{a}|\sin\theta = \dfrac{1}{2}|\vec{a} \times \vec{b}|$.
9. If $\vec{a}$ and $\vec{b}$ represent the adjacent sides of a parallelogram, then its area is given by $|\vec{a} \times \vec{b}|$.
We have
Area of parallelogram ABCD $= AB \cdot DE$.
But $AB = |\vec{b}|$ (as given), and $DE = |\vec{a}|\sin\theta$.
Thus, Area of parallelogram ABCD $= |\vec{b}||\vec{a}|\sin\theta = |\vec{a} \times \vec{b}|$.
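We now state two important properties of vector product.
Property 3 (Distributivity of vector product over addition): If $\vec{a}$, $\vec{b}$ and $\vec{c}$ are any three vectors and λ be a scalar, then
(i) $\vec{a} \times (\vec{b} + \vec{c}) = \vec{a} \times \vec{b} + \vec{a} \times \vec{c}$
(ii) $\lambda(\vec{a} \times \vec{b}) = (\lambda\vec{a}) \times \vec{b} = \vec{a} \times (\lambda\vec{b})$
Let $\vec{a}$ and $\vec{b}$ be two vectors given in component form as $a_1\hat{i} + a_2\hat{j} + a_3\hat{k}$ and $b_1\hat{i} + b_2\hat{j} + b_3\hat{k}$, respectively. Then their cross product may be given by
$\vec{a} \times \vec{b} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$
Explanation We have
$\vec{a} \times \vec{b} = (a_1\hat{i} + a_2\hat{j} + a_3\hat{k}) \times (b_1\hat{i} + b_2\hat{j} + b_3\hat{k})$
$= a_1b_1(\hat{i} \times \hat{i}) + a_1b_2(\hat{i} \times \hat{j}) + a_1b_3(\hat{i} \times \hat{k}) + a_2b_1(\hat{j} \times \hat{i}) + a_2b_2(\hat{j} \times \hat{j}) + a_2b_3(\hat{j} \times \hat{k}) + a_3b_1(\hat{k} \times \hat{i}) + a_3b_2(\hat{k} \times \hat{j}) + a_3b_3(\hat{k} \times \hat{k})$ (by Property 3)
$= a_1b_2(\hat{i} \times \hat{j}) - a_1b_3(\hat{k} \times \hat{i}) - a_2b_1(\hat{i} \times \hat{j}) + a_2b_3(\hat{j} \times \hat{k}) + a_3b_1(\hat{k} \times \hat{i}) - a_3b_2(\hat{j} \times \hat{k})$
(as $\hat{i} \times \hat{i} = \hat{j} \times \hat{j} = \hat{k} \times \hat{k} = \vec{0}$ and $\hat{i} \times \hat{k} = -\hat{k} \times \hat{i}$, $\hat{j} \times \hat{i} = -\hat{i} \times \hat{j}$ and $\hat{k} \times \hat{j} = -\hat{j} \times \hat{k}$)
$= a_1b_2\hat{k} - a_1b_3\hat{j} - a_2b_1\hat{k} + a_2b_3\hat{i} + a_3b_1\hat{j} - a_3b_2\hat{i}$
(as $\hat{i} \times \hat{j} = \hat{k}$, $\hat{j} \times \hat{k} = \hat{i}$ and $\hat{k} \times \hat{i} = \hat{j}$)
$= (a_2b_3 - a_3b_2)\hat{i} - (a_1b_3 - a_3b_1)\hat{j} + (a_1b_2 - a_2b_1)\hat{k}$
$= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$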
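The expanded determinant translates directly into code. The sketch below assumes vectors are 3-tuples; the function name `cross` is a hypothetical helper, and the sample vectors are chosen only to illustrate the formula and the rule a × b = −(b × a).

```python
def cross(a, b):
    """Vector product (a2*b3 - a3*b2, -(a1*b3 - a3*b1), a1*b2 - a2*b1)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

i_hat, j_hat, k_hat = (1, 0, 0), (0, 1, 0), (0, 0, 1)
ij = cross(i_hat, j_hat)                  # i x j, which should equal k
sample = cross((2, 1, 3), (3, 5, -2))     # illustrative integer vectors
swapped = cross((3, 5, -2), (2, 1, 3))    # operands exchanged
```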
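Example 22 Find $|\vec{a} \times \vec{b}|$, if $\vec{a} = 2\hat{i} + \hat{j} + 3\hat{k}$ and $\vec{b} = 3\hat{i} + 5\hat{j} - 2\hat{k}$
Solution We have
$\vec{a} \times \vec{b} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 2 & 1 & 3 \\ 3 & 5 & -2 \end{vmatrix}$
$= \hat{i}(-2 - 15) - (-4 - 9)\hat{j} + (10 - 3)\hat{k} = -17\hat{i} + 13\hat{j} + 7\hat{k}$
Hence $|\vec{a} \times \vec{b}| = \sqrt{(-17)^2 + (13)^2 + (7)^2} = \sqrt{507}$
Example 23 Find a unit vector perpendicular to each of the vectors $(\vec{a} + \vec{b})$ and $(\vec{a} - \vec{b})$, where $\vec{a} = \hat{i} + \hat{j} + \hat{k}$, $\vec{b} = \hat{i} + 2\hat{j} + 3\hat{k}$.
Solution We have $\vec{a} + \vec{b} = 2\hat{i} + 3\hat{j} + 4\hat{k}$ and $\vec{a} - \vec{b} = -\hat{j} - 2\hat{k}$
A vector which is perpendicular to both $\vec{a} + \vec{b}$ and $\vec{a} - \vec{b}$ is given by
$(\vec{a} + \vec{b}) \times (\vec{a} - \vec{b}) = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 2 & 3 & 4 \\ 0 & -1 & -2 \end{vmatrix} = -2\hat{i} + 4\hat{j} - 2\hat{k}$ ($= \vec{c}$, say)
Now $|\vec{c}| = \sqrt{4 + 16 + 4} = \sqrt{24} = 2\sqrt{6}$
Therefore, the required unit vector is
$\dfrac{\vec{c}}{|\vec{c}|} = \dfrac{-1}{\sqrt{6}}\hat{i} + \dfrac{2}{\sqrt{6}}\hat{j} - \dfrac{1}{\sqrt{6}}\hat{k}$
Note There are two perpendicular directions to any plane. Thus, another unit vector perpendicular to $\vec{a} + \vec{b}$ and $\vec{a} - \vec{b}$ will be $\dfrac{1}{\sqrt{6}}\hat{i} - \dfrac{2}{\sqrt{6}}\hat{j} + \dfrac{1}{\sqrt{6}}\hat{k}$. But that will be a consequence of $(\vec{a} - \vec{b}) \times (\vec{a} + \vec{b})$.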
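Example 24 Find the area of a triangle having the points A(1, 1, 1), B(1, 2, 3) and C(2, 3, 1) as its vertices.
Solution We have $\overrightarrow{AB} = \hat{j} + 2\hat{k}$ and $\overrightarrow{AC} = \hat{i} + 2\hat{j}$. The area of the given triangle is $\dfrac{1}{2}|\overrightarrow{AB} \times \overrightarrow{AC}|$.
Now, $\overrightarrow{AB} \times \overrightarrow{AC} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 0 & 1 & 2 \\ 1 & 2 & 0 \end{vmatrix} = -4\hat{i} + 2\hat{j} - \hat{k}$
Therefore $|\overrightarrow{AB} \times \overrightarrow{AC}| = \sqrt{16 + 4 + 1} = \sqrt{21}$
Thus, the required area is $\dfrac{1}{2}\sqrt{21}$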
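The area computation of Example 24 can be sketched as follows, assuming vertices are given as coordinate tuples. The names `cross` and `triangle_area` are hypothetical helpers.

```python
import math

def cross(a, b):
    """Vector product of two 3-tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triangle_area(p, q, r):
    """Half the magnitude of PQ x PR for a triangle with vertices p, q, r."""
    pq = tuple(qc - pc for pc, qc in zip(p, q))
    pr = tuple(rc - pc for pc, rc in zip(p, r))
    n = cross(pq, pr)
    return 0.5 * math.sqrt(sum(c * c for c in n))

area = triangle_area((1, 1, 1), (1, 2, 3), (2, 3, 1))   # Example 24
```

Dropping the factor 0.5 gives the area of the parallelogram with the same two adjacent sides.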
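Example 25 Find the area of a parallelogram whose adjacent sides are given by the vectors $\vec{a} = 3\hat{i} + \hat{j} + 4\hat{k}$ and $\vec{b} = \hat{i} - \hat{j} + \hat{k}$
Solution The area of a parallelogram with $\vec{a}$ and $\vec{b}$ as its adjacent sides is given by $|\vec{a} \times \vec{b}|$.
Now $\vec{a} \times \vec{b} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 3 & 1 & 4 \\ 1 & -1 & 1 \end{vmatrix} = 5\hat{i} + \hat{j} - 4\hat{k}$
Therefore $|\vec{a} \times \vec{b}| = \sqrt{25 + 1 + 16} = \sqrt{42}$
and hence, the required area is $\sqrt{42}$.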

11 Three Dimensional Geometry
Direction Cosines and Direction Ratios of a Line
From Chapter 10, recall that if a directed line L passing through the origin makes angles α, β and γ with the x, y and z-axes, respectively, called direction angles, then the cosines of these angles, namely, cos α, cos β and cos γ, are called direction cosines of the directed line L.
If we reverse the direction of L, then the direction angles are replaced by their supplements, i.e., π − α, π − β and π − γ. Thus, the signs of the direction cosines are reversed.
* For various activities in three dimensional geometry, one may refer to the Book
“A Hand Book for designing Mathematics Laboratory in Schools”, NCERT, 2005

Note that a given line in space can be extended in two opposite directions and so it has two sets of direction cosines.
In order to have a unique set of direction cosines for a given line in space, we must take the given line as a directed
line. These unique direction cosines are denoted by l, m and n.
Remark If the given line in space does not pass through the origin, then, in order to find its direction cosines, we draw a line through the origin and parallel to the given line. Now take one of the directed lines from the origin and find its direction cosines, as two parallel lines have the same set of direction cosines.
Any three numbers which are proportional to the direction cosines of a line are called the direction ratios of the line. If l, m, n are direction cosines and a, b, c are direction ratios of a line, then a = λl, b = λm and c = λn, for any nonzero λ ∈ R.
Note Some authors also call direction ratios as direction numbers.
Let a, b, c be direction ratios of a line and let l, m and n be the direction cosines (d.c.'s) of the line. Then
$\dfrac{l}{a} = \dfrac{m}{b} = \dfrac{n}{c} = k$ (say), k being a constant.
Therefore l = ak, m = bk, n = ck … (1)
But $l^2 + m^2 + n^2 = 1$
Therefore $k^2(a^2 + b^2 + c^2) = 1$
or $k = \pm\dfrac{1}{\sqrt{a^2 + b^2 + c^2}}$
Hence, from (1), the d.c.'s of the line are
$l = \pm\dfrac{a}{\sqrt{a^2 + b^2 + c^2}},\; m = \pm\dfrac{b}{\sqrt{a^2 + b^2 + c^2}},\; n = \pm\dfrac{c}{\sqrt{a^2 + b^2 + c^2}}$
For any line, if a, b, c are direction ratios of a line, then ka, kb, kc; k ≠ 0 is also a set of direction ratios. So, any two sets of direction ratios of a line are also proportional. Also, for any line there are infinitely many sets of direction ratios.
Relation between the direction cosines of a line
Consider a line RS with direction cosines l, m, n. Through the origin draw a line parallel to the given line and take a point P(x, y, z) on this line. From P draw a perpendicular PA on the x-axis.

Let OP = r. Then $\cos\alpha = \dfrac{OA}{OP} = \dfrac{x}{r}$. This gives x = lr.
Similarly, y = mr and z = nr
Thus $x^2 + y^2 + z^2 = r^2(l^2 + m^2 + n^2)$
But $x^2 + y^2 + z^2 = r^2$
Hence $l^2 + m^2 + n^2 = 1$
Direction cosines of a line passing through two points
Since one and only one line passes through two given points, we can determine the direction cosines of a line passing through the given points $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$.

Let l, m, n be the direction cosines of the line PQ and let it make angles α, β and γ with the x, y and z-axis, respectively.
Draw perpendiculars from P and Q to the XY-plane to meet at R and S. Draw a perpendicular from P to QS to meet at N. Now, in the right angled triangle PNQ, ∠PQN = γ.
Therefore, $\cos\gamma = \dfrac{NQ}{PQ} = \dfrac{z_2 - z_1}{PQ}$
Similarly $\cos\alpha = \dfrac{x_2 - x_1}{PQ}$ and $\cos\beta = \dfrac{y_2 - y_1}{PQ}$
Hence, the direction cosines of the line segment joining the points $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$ are
$\dfrac{x_2 - x_1}{PQ},\; \dfrac{y_2 - y_1}{PQ},\; \dfrac{z_2 - z_1}{PQ}$
where $PQ = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$
Note The direction ratios of the line segment joining $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$ may be taken as
$x_2 - x_1,\; y_2 - y_1,\; z_2 - z_1$ or $x_1 - x_2,\; y_1 - y_2,\; z_1 - z_2$
Example 1 If a line makes angles 90°, 60° and 30° with the positive direction of x, y and z-axis respectively, find its direction cosines.
Solution Let the d.c.'s of the line be l, m, n. Then $l = \cos 90^\circ = 0$, $m = \cos 60^\circ = \dfrac{1}{2}$, $n = \cos 30^\circ = \dfrac{\sqrt{3}}{2}$.
Example 2 If a line has direction ratios 2, −1, −2, determine its direction cosines.
Solution Direction cosines are
$\dfrac{2}{\sqrt{2^2 + (-1)^2 + (-2)^2}},\; \dfrac{-1}{\sqrt{2^2 + (-1)^2 + (-2)^2}},\; \dfrac{-2}{\sqrt{2^2 + (-1)^2 + (-2)^2}}$
or $\dfrac{2}{3},\; \dfrac{-1}{3},\; \dfrac{-2}{3}$
Example 3 Find the direction cosines of the line passing through the two points (−2, 4, −5) and (1, 2, 3).
Solution We know the direction cosines of the line passing through two points $P(x_1, y_1, z_1)$ and $Q(x_2, y_2, z_2)$ are given by
$\dfrac{x_2 - x_1}{PQ},\; \dfrac{y_2 - y_1}{PQ},\; \dfrac{z_2 - z_1}{PQ}$
where $PQ = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$
Here P is (−2, 4, −5) and Q is (1, 2, 3).
So $PQ = \sqrt{(1 - (-2))^2 + (2 - 4)^2 + (3 - (-5))^2} = \sqrt{77}$
Thus, the direction cosines of the line joining the two points are
$\dfrac{3}{\sqrt{77}},\; \dfrac{-2}{\sqrt{77}},\; \dfrac{8}{\sqrt{77}}$
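Example 3 can be reproduced with a short sketch. The function name `line_direction_cosines` is hypothetical, and the direction is taken from the first point to the second, which fixes the overall sign.

```python
import math

def line_direction_cosines(p, q):
    """Direction cosines of the line through p and q, directed from p to q."""
    d = tuple(qc - pc for pc, qc in zip(p, q))   # direction ratios
    length = math.sqrt(sum(c * c for c in d))    # the distance PQ
    if length == 0:
        raise ValueError("the two points must be distinct")
    return tuple(c / length for c in d)

l, m, n = line_direction_cosines((-2, 4, -5), (1, 2, 3))   # Example 3
```

Swapping the two points reverses the signs of all three direction cosines, matching the earlier remark about reversing the direction of a line.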
Example 4 Find the direction cosines of x, y and z-axis.
Solution The x-axis makes angles 0°, 90° and 90° respectively with x, y and z-axis. Therefore, the direction cosines
of x-axis are cos 0°, cos 90°, cos 90° i.e., 1,0,0. Similarly, direction cosines of y-axis and z-axis are 0, 1, 0 and 0,
0, 1 respectively.
Example 5 Show that the points A (2, 3, – 4), B (1, – 2, 3) and C (3, 8, – 11) are collinear.

Solution Direction ratios of line joining A and B are 1 – 2, – 2 – 3, 3 + 4 i.e., – 1, – 5, 7.


The direction ratios of line joining B and C are 3 –1, 8 + 2, – 11 – 3, i.e., 2, 10, – 14.
It is clear that direction ratios of AB and BC are proportional, hence, AB is parallel to BC. But point B is common to
both AB and BC. Therefore, A, B, C are collinear points.
Equation of a Line in Space
We have studied equations of lines in two dimensions in Class XI. We shall now study the vector and Cartesian equations of a line in space.
A line is uniquely determined if
(i) it passes through a given point and has given direction, or
(ii) it passes through two given points.
Equation of a line through a given point and parallel to a given vector b

Let $\vec{a}$ be the position vector of the given point A with respect to the origin O of the rectangular coordinate system. Let l be the line which passes through the point A and is parallel to a given vector $\vec{b}$. Let $\vec{r}$ be the position vector of an arbitrary point P on the line.
Then $\overrightarrow{AP}$ is parallel to the vector $\vec{b}$, i.e., $\overrightarrow{AP} = \lambda\vec{b}$, where λ is some real number.
But $\overrightarrow{AP} = \overrightarrow{OP} - \overrightarrow{OA}$
i.e. $\lambda\vec{b} = \vec{r} - \vec{a}$
Hence $\vec{r} = \vec{a} + \lambda\vec{b}$ … (1)
Remark If $\vec{b} = a\hat{i} + b\hat{j} + c\hat{k}$, then a, b, c are direction ratios of the line and conversely, if a, b, c are direction ratios of a line, then $\vec{b} = a\hat{i} + b\hat{j} + c\hat{k}$ will be parallel to the line. Here, b should not be confused with $|\vec{b}|$.
Derivation of cartesian form from vector form
Let the coordinates of the given point A be $(x_1, y_1, z_1)$ and the direction ratios of the line be a, b, c. Consider the coordinates of any point P to be (x, y, z). Then
$\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$; $\vec{a} = x_1\hat{i} + y_1\hat{j} + z_1\hat{k}$
and $\vec{b} = a\hat{i} + b\hat{j} + c\hat{k}$
Substituting these values in (1) and equating the coefficients of $\hat{i}$, $\hat{j}$ and $\hat{k}$, we get
$x = x_1 + \lambda a$; $y = y_1 + \lambda b$; $z = z_1 + \lambda c$ … (2)
These are parametric equations of the line. Eliminating the parameter λ from (2), we get
$\dfrac{x - x_1}{a} = \dfrac{y - y_1}{b} = \dfrac{z - z_1}{c}$ … (3)
This is the Cartesian equation of the line.
Note If l, m, n are the direction cosines of the line, the equation of the line is
$\dfrac{x - x_1}{l} = \dfrac{y - y_1}{m} = \dfrac{z - z_1}{n}$
Example 6 Find the vector and the Cartesian equations of the line through the point (5, 2, −4) and which is parallel to the vector $3\hat{i} + 2\hat{j} - 8\hat{k}$.
Solution We have
$\vec{a} = 5\hat{i} + 2\hat{j} - 4\hat{k}$ and $\vec{b} = 3\hat{i} + 2\hat{j} - 8\hat{k}$
Therefore, the vector equation of the line is
$\vec{r} = 5\hat{i} + 2\hat{j} - 4\hat{k} + \lambda(3\hat{i} + 2\hat{j} - 8\hat{k})$
Now, $\vec{r}$ is the position vector of any point P(x, y, z) on the line.
Therefore, $x\hat{i} + y\hat{j} + z\hat{k} = 5\hat{i} + 2\hat{j} - 4\hat{k} + \lambda(3\hat{i} + 2\hat{j} - 8\hat{k})$
$= (5 + 3\lambda)\hat{i} + (2 + 2\lambda)\hat{j} + (-4 - 8\lambda)\hat{k}$
Eliminating λ, we get
$\dfrac{x - 5}{3} = \dfrac{y - 2}{2} = \dfrac{z + 4}{-8}$
which is the equation of the line in Cartesian form.
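The link between the vector form and the Cartesian form in Example 6 can be checked numerically. In the sketch below, `point_on_line` is a hypothetical helper and the parameter value λ = 2 is an arbitrary choice for illustration.

```python
def point_on_line(a, b, lam):
    """Point with position vector r = a + lam * b."""
    return tuple(ac + lam * bc for ac, bc in zip(a, b))

a = (5, 2, -4)    # point on the line in Example 6
b = (3, 2, -8)    # vector parallel to the line

x, y, z = point_on_line(a, b, 2)   # arbitrary parameter value 2

# Each Cartesian ratio recovers the same parameter for a point on the line.
ratios = ((x - 5) / 3, (y - 2) / 2, (z + 4) / -8)
```

The three entries of `ratios` agree exactly when the point lies on the line, which is what the Cartesian equation expresses.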
Equation of a line passing through two given points
Let $\vec{a}$ and $\vec{b}$ be the position vectors of two points $A(x_1, y_1, z_1)$ and $B(x_2, y_2, z_2)$, respectively, that are lying on a line.
Let $\vec{r}$ be the position vector of an arbitrary point P(x, y, z). Then P is a point on the line if and only if $\overrightarrow{AP} = \vec{r} - \vec{a}$ and $\overrightarrow{AB} = \vec{b} - \vec{a}$ are collinear vectors. Therefore, P is on the line if and only if
$\vec{r} - \vec{a} = \lambda(\vec{b} - \vec{a})$
or $\vec{r} = \vec{a} + \lambda(\vec{b} - \vec{a})$, λ ∈ R. … (1)
This is the vector equation of the line.
Derivation of cartesian form from vector form
We have
$\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$, $\vec{a} = x_1\hat{i} + y_1\hat{j} + z_1\hat{k}$ and $\vec{b} = x_2\hat{i} + y_2\hat{j} + z_2\hat{k}$.
Substituting these values in (1), we get
$x\hat{i} + y\hat{j} + z\hat{k} = x_1\hat{i} + y_1\hat{j} + z_1\hat{k} + \lambda[(x_2 - x_1)\hat{i} + (y_2 - y_1)\hat{j} + (z_2 - z_1)\hat{k}]$
Equating the like coefficients of $\hat{i}$, $\hat{j}$, $\hat{k}$, we get
$x = x_1 + \lambda(x_2 - x_1)$; $y = y_1 + \lambda(y_2 - y_1)$; $z = z_1 + \lambda(z_2 - z_1)$
On eliminating λ, we obtain
$\dfrac{x - x_1}{x_2 - x_1} = \dfrac{y - y_1}{y_2 - y_1} = \dfrac{z - z_1}{z_2 - z_1}$
which is the equation of the line in Cartesian form.
Example 7 Find the vector equation for the line passing through the points (−1, 0, 2) and (3, 4, 6).
Solution Let $\vec{a}$ and $\vec{b}$ be the position vectors of the points A(−1, 0, 2) and B(3, 4, 6).
Then $\vec{a} = -\hat{i} + 2\hat{k}$
and $\vec{b} = 3\hat{i} + 4\hat{j} + 6\hat{k}$
Therefore $\vec{b} - \vec{a} = 4\hat{i} + 4\hat{j} + 4\hat{k}$
Let $\vec{r}$ be the position vector of any point on the line. Then the vector equation of the line is
$\vec{r} = -\hat{i} + 2\hat{k} + \lambda(4\hat{i} + 4\hat{j} + 4\hat{k})$
Example 8 The Cartesian equation of a line is
$\dfrac{x + 3}{2} = \dfrac{y - 5}{4} = \dfrac{z + 6}{2}$
Find the vector equation for the line.
Solution Comparing the given equation with the standard form
$\dfrac{x - x_1}{a} = \dfrac{y - y_1}{b} = \dfrac{z - z_1}{c}$
we observe that $x_1 = -3$, $y_1 = 5$, $z_1 = -6$; a = 2, b = 4, c = 2.
Thus, the required line passes through the point (−3, 5, −6) and is parallel to the vector $2\hat{i} + 4\hat{j} + 2\hat{k}$. Let $\vec{r}$ be the position vector of any point on the line, then the vector equation of the line is given by
$\vec{r} = (-3\hat{i} + 5\hat{j} - 6\hat{k}) + \lambda(2\hat{i} + 4\hat{j} + 2\hat{k})$
Angle between Two Lines
Let L1 and L2 be two lines passing through the origin and with direction ratios a1, b1, c1 and a2, b2, c2, respectively. Let
P be a point on L1 and Q be a point on L2. Consider the directed lines OP and OQ. Let θ be the acute angle between
OP and OQ. Now recall that the directed line segments OP and OQ are vectors with components a1, b1, c1 and a2,
b2, c2, respectively. Therefore, the angle θ between them is given by

$$\cos\theta = \left|\frac{a_1a_2 + b_1b_2 + c_1c_2}{\sqrt{a_1^2 + b_1^2 + c_1^2}\,\sqrt{a_2^2 + b_2^2 + c_2^2}}\right| \quad \ldots (1)$$
The angle between the lines in terms of sin θ is given by
$$\sin\theta = \sqrt{1 - \cos^2\theta}$$
$$= \sqrt{1 - \frac{(a_1a_2 + b_1b_2 + c_1c_2)^2}{(a_1^2 + b_1^2 + c_1^2)(a_2^2 + b_2^2 + c_2^2)}}$$
$$= \frac{\sqrt{(a_1^2 + b_1^2 + c_1^2)(a_2^2 + b_2^2 + c_2^2) - (a_1a_2 + b_1b_2 + c_1c_2)^2}}{\sqrt{a_1^2 + b_1^2 + c_1^2}\,\sqrt{a_2^2 + b_2^2 + c_2^2}}$$
$$= \frac{\sqrt{(a_1b_2 - a_2b_1)^2 + (b_1c_2 - b_2c_1)^2 + (c_1a_2 - c_2a_1)^2}}{\sqrt{a_1^2 + b_1^2 + c_1^2}\,\sqrt{a_2^2 + b_2^2 + c_2^2}} \quad \ldots (2)$$

Note In case the lines L1 and L2 do not pass through the origin, we may take lines L′1 and L′2 which are parallel to
L1 and L2 respectively and pass through the origin.
If, instead of direction ratios for the lines L1 and L2, direction cosines, namely l1, m1, n1 for L1 and l2, m2, n2 for L2, are given, then (1) and (2) take the following form:
$$\cos\theta = |l_1l_2 + m_1m_2 + n_1n_2| \quad (\text{as } l_1^2 + m_1^2 + n_1^2 = 1 = l_2^2 + m_2^2 + n_2^2) \quad \ldots (3)$$
and
$$\sin\theta = \sqrt{(l_1m_2 - l_2m_1)^2 + (m_1n_2 - m_2n_1)^2 + (n_1l_2 - n_2l_1)^2} \quad \ldots (4)$$
Two lines with direction ratios a1, b1, c1 and a2, b2, c2 are
(i) perpendicular, i.e. θ = 90°, if by (1)
$$a_1a_2 + b_1b_2 + c_1c_2 = 0$$
(ii) parallel, i.e. θ = 0, if by (2)
$$\frac{a_1}{a_2} = \frac{b_1}{b_2} = \frac{c_1}{c_2}$$
Now, we find the angle between two lines when their equations are given. If θ is the acute angle between the lines
$\vec{r} = \vec{a}_1 + \lambda\vec{b}_1$ and $\vec{r} = \vec{a}_2 + \mu\vec{b}_2$
then
$$\cos\theta = \left|\frac{\vec{b}_1 \cdot \vec{b}_2}{|\vec{b}_1|\,|\vec{b}_2|}\right|$$

In Cartesian form, if θ is the angle between the lines
$$\frac{x - x_1}{a_1} = \frac{y - y_1}{b_1} = \frac{z - z_1}{c_1} \quad \ldots (1)$$
and
$$\frac{x - x_2}{a_2} = \frac{y - y_2}{b_2} = \frac{z - z_2}{c_2} \quad \ldots (2)$$
where a1, b1, c1 and a2, b2, c2 are the direction ratios of the lines (1) and (2), respectively, then
$$\cos\theta = \left|\frac{a_1a_2 + b_1b_2 + c_1c_2}{\sqrt{a_1^2 + b_1^2 + c_1^2}\,\sqrt{a_2^2 + b_2^2 + c_2^2}}\right|$$


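Example 9 Find the angle between the pair of lines given by
$\vec{r} = 3\hat{i} + 2\hat{j} - 4\hat{k} + \lambda(\hat{i} + 2\hat{j} + 2\hat{k})$
and $\vec{r} = 5\hat{i} - 2\hat{j} + \mu(3\hat{i} + 2\hat{j} + 6\hat{k})$
Solution Here $\vec{b}_1 = \hat{i} + 2\hat{j} + 2\hat{k}$ and $\vec{b}_2 = 3\hat{i} + 2\hat{j} + 6\hat{k}$
The angle θ between the two lines is given by
$$\cos\theta = \left|\frac{\vec{b}_1 \cdot \vec{b}_2}{|\vec{b}_1|\,|\vec{b}_2|}\right| = \left|\frac{(\hat{i} + 2\hat{j} + 2\hat{k}) \cdot (3\hat{i} + 2\hat{j} + 6\hat{k})}{\sqrt{1 + 4 + 4}\,\sqrt{9 + 4 + 36}}\right| = \left|\frac{3 + 4 + 12}{3 \times 7}\right| = \frac{19}{21}$$
Hence $\theta = \cos^{-1}\left(\frac{19}{21}\right)$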
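Example 10 Find the angle between the pair of lines
$$\frac{x+3}{3} = \frac{y-1}{5} = \frac{z+3}{4}$$
and
$$\frac{x+1}{1} = \frac{y-4}{1} = \frac{z-5}{2}$$
Solution The direction ratios of the first line are 3, 5, 4 and the direction ratios of the second line are 1, 1, 2. If θ is the angle between them, then
$$\cos\theta = \left|\frac{3 \cdot 1 + 5 \cdot 1 + 4 \cdot 2}{\sqrt{3^2 + 5^2 + 4^2}\,\sqrt{1^2 + 1^2 + 2^2}}\right| = \frac{16}{\sqrt{50}\,\sqrt{6}} = \frac{16}{5\sqrt{2}\,\sqrt{6}} = \frac{8\sqrt{3}}{15}$$
Hence, the required angle is $\cos^{-1}\left(\frac{8\sqrt{3}}{15}\right)$.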
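The cosine formula used in Examples 9 and 10 can be checked numerically. The following is a minimal sketch, not part of the original text; the function name cos_angle_between_lines is an illustrative assumption.

```python
import math

def cos_angle_between_lines(d1, d2):
    """Cosine of the acute angle between two lines with direction ratios d1 and d2."""
    dot = sum(p * q for p, q in zip(d1, d2))
    norm1 = math.sqrt(sum(p * p for p in d1))
    norm2 = math.sqrt(sum(q * q for q in d2))
    # The absolute value selects the acute angle, as in formula (1).
    return abs(dot) / (norm1 * norm2)

cos_ex9 = cos_angle_between_lines((1, 2, 2), (3, 2, 6))   # direction vectors of Example 9
cos_ex10 = cos_angle_between_lines((3, 5, 4), (1, 1, 2))  # direction ratios of Example 10
theta_ex9_degrees = math.degrees(math.acos(cos_ex9))
```

The two cosines agree with the hand-computed values 19/21 and 8√3/15 up to floating-point rounding.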
Shortest Distance between Two Lines
If two lines in space intersect at a point, then the shortest distance between them is zero. Also, if two lines in space
are parallel, then the shortest distance between them will be the perpendicular distance, i.e. the length of the
perpendicular drawn from a point on one line onto the other line.

Further, in space, there are lines which are neither intersecting nor parallel. In fact, such pairs of lines are non coplanar and are called skew lines. For example, let us consider a room of size 1, 3, 2 units along the x, y and z-axes respectively.
The line GE goes diagonally across the ceiling, and the line DB passes through one corner of the ceiling directly above A and goes diagonally down the wall. These lines are skew because they are not parallel and also never meet.
By the shortest distance between two lines we mean the join of a point on one line with a point on the other line so that the length of the segment so obtained is the smallest.



For skew lines, the line of the shortest distance will be perpendicular to both the lines.
Distance between two skew lines
We now determine the shortest distance between two skew lines in the following way: Let l1 and l2 be two skew lines with equations
$\vec{r} = \vec{a}_1 + \lambda\vec{b}_1$ ... (1)
and $\vec{r} = \vec{a}_2 + \mu\vec{b}_2$ ... (2)
Take any point S on l1 with position vector $\vec{a}_1$ and T on l2 with position vector $\vec{a}_2$. Then the magnitude of the shortest distance vector will be equal to that of the projection of ST along the direction of the line of shortest distance (See 10.6.2).
If $\overrightarrow{PQ}$ is the shortest distance vector between l1 and l2, then it being perpendicular to both $\vec{b}_1$ and $\vec{b}_2$, the unit vector $\hat{n}$ along $\overrightarrow{PQ}$ would therefore be
$$\hat{n} = \frac{\vec{b}_1 \times \vec{b}_2}{|\vec{b}_1 \times \vec{b}_2|} \quad \ldots (3)$$
Then $\overrightarrow{PQ} = d\,\hat{n}$
where d is the magnitude of the shortest distance vector. Let θ be the angle between $\overrightarrow{ST}$ and $\overrightarrow{PQ}$. Then
PQ = ST |cos θ|
But
$$\cos\theta = \left|\frac{\overrightarrow{PQ} \cdot \overrightarrow{ST}}{|\overrightarrow{PQ}|\,|\overrightarrow{ST}|}\right|$$
$$= \left|\frac{d\,\hat{n} \cdot (\vec{a}_2 - \vec{a}_1)}{d\,ST}\right| \quad (\text{since } \overrightarrow{ST} = \vec{a}_2 - \vec{a}_1)$$
$$= \left|\frac{(\vec{b}_1 \times \vec{b}_2) \cdot (\vec{a}_2 - \vec{a}_1)}{ST\,|\vec{b}_1 \times \vec{b}_2|}\right| \quad [\text{From (3)}]$$
Hence, the required shortest distance is
d = PQ = ST |cos θ|
or
$$d = \left|\frac{(\vec{b}_1 \times \vec{b}_2) \cdot (\vec{a}_2 - \vec{a}_1)}{|\vec{b}_1 \times \vec{b}_2|}\right|$$
Cartesian form
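The shortest distance between the lines
$$l_1: \frac{x - x_1}{a_1} = \frac{y - y_1}{b_1} = \frac{z - z_1}{c_1}$$
and
$$l_2: \frac{x - x_2}{a_2} = \frac{y - y_2}{b_2} = \frac{z - z_2}{c_2}$$
is
$$\left|\frac{\begin{vmatrix} x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix}}{\sqrt{(b_1c_2 - b_2c_1)^2 + (c_1a_2 - c_2a_1)^2 + (a_1b_2 - a_2b_1)^2}}\right|$$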

Distance between parallel lines


If two lines l1 and l2 are parallel, then they are coplanar. Let the lines be given by
$\vec{r} = \vec{a}_1 + \lambda\vec{b}$ … (1)
and $\vec{r} = \vec{a}_2 + \mu\vec{b}$ … (2)
where $\vec{a}_1$ is the position vector of a point S on l1 and $\vec{a}_2$ is the position vector of a point T on l2.
As l1, l2 are coplanar, if the foot of the perpendicular from T on the line l1 is P, then the distance between the lines l1 and l2 = |TP|.
Let θ be the angle between the vectors $\overrightarrow{ST}$ and $\vec{b}$.
Then $\vec{b} \times \overrightarrow{ST} = (|\vec{b}|\,|\overrightarrow{ST}| \sin\theta)\,\hat{n}$ ... (3)
where $\hat{n}$ is the unit vector perpendicular to the plane of the lines l1 and l2.
But $\overrightarrow{ST} = \vec{a}_2 - \vec{a}_1$
Therefore, from (3), we get
$\vec{b} \times (\vec{a}_2 - \vec{a}_1) = |\vec{b}|\,PT\,\hat{n}$ (since PT = ST sin θ)
i.e., $|\vec{b} \times (\vec{a}_2 - \vec{a}_1)| = |\vec{b}|\,PT \cdot 1$ (as $|\hat{n}| = 1$)
Hence, the distance between the given parallel lines is
$$d = |\overrightarrow{PT}| = \left|\frac{\vec{b} \times (\vec{a}_2 - \vec{a}_1)}{|\vec{b}|}\right|$$
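Example 11 Find the shortest distance between the lines l1 and l2 whose vector equations are
$\vec{r} = \hat{i} + \hat{j} + \lambda(2\hat{i} - \hat{j} + \hat{k})$ … (1)
and $\vec{r} = 2\hat{i} + \hat{j} - \hat{k} + \mu(3\hat{i} - 5\hat{j} + 2\hat{k})$ … (2)
Solution Comparing (1) and (2) with $\vec{r} = \vec{a}_1 + \lambda\vec{b}_1$ and $\vec{r} = \vec{a}_2 + \mu\vec{b}_2$ respectively, we get
$\vec{a}_1 = \hat{i} + \hat{j}$, $\vec{b}_1 = 2\hat{i} - \hat{j} + \hat{k}$
$\vec{a}_2 = 2\hat{i} + \hat{j} - \hat{k}$ and $\vec{b}_2 = 3\hat{i} - 5\hat{j} + 2\hat{k}$
Therefore $\vec{a}_2 - \vec{a}_1 = \hat{i} - \hat{k}$
and
$$\vec{b}_1 \times \vec{b}_2 = (2\hat{i} - \hat{j} + \hat{k}) \times (3\hat{i} - 5\hat{j} + 2\hat{k}) = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 2 & -1 & 1 \\ 3 & -5 & 2 \end{vmatrix} = 3\hat{i} - \hat{j} - 7\hat{k}$$
So $|\vec{b}_1 \times \vec{b}_2| = \sqrt{9 + 1 + 49} = \sqrt{59}$
Hence, the shortest distance between the given lines is given by
$$d = \left|\frac{(\vec{b}_1 \times \vec{b}_2) \cdot (\vec{a}_2 - \vec{a}_1)}{|\vec{b}_1 \times \vec{b}_2|}\right| = \frac{|3 - 0 + 7|}{\sqrt{59}} = \frac{10}{\sqrt{59}}$$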
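As a numerical check of the skew-line formula, the sketch below recomputes Example 11. It is an illustration only; the helper names cross, dot and skew_distance are assumptions introduced here.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def skew_distance(a1, b1, a2, b2):
    """Shortest distance between r = a1 + lambda*b1 and r = a2 + mu*b2."""
    n = cross(b1, b2)
    n_len = math.sqrt(dot(n, n))
    if n_len == 0:
        # b1 x b2 = 0 means the lines are parallel; a different formula applies.
        raise ValueError("lines are parallel; use the parallel-line formula")
    diff = tuple(q - p for p, q in zip(a1, a2))  # a2 - a1
    return abs(dot(n, diff)) / n_len

d_ex11 = skew_distance((1, 1, 0), (2, -1, 1), (2, 1, -1), (3, -5, 2))
```

The value d_ex11 matches 10/√59 from the worked solution up to rounding.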
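Example 12 Find the distance between the lines l1 and l2 given by
$\vec{r} = \hat{i} + 2\hat{j} - 4\hat{k} + \lambda(2\hat{i} + 3\hat{j} + 6\hat{k})$
and $\vec{r} = 3\hat{i} + 3\hat{j} - 5\hat{k} + \mu(2\hat{i} + 3\hat{j} + 6\hat{k})$
Solution The two lines are parallel (Why?). We have
$\vec{a}_1 = \hat{i} + 2\hat{j} - 4\hat{k}$, $\vec{a}_2 = 3\hat{i} + 3\hat{j} - 5\hat{k}$ and $\vec{b} = 2\hat{i} + 3\hat{j} + 6\hat{k}$
Therefore, the distance between the lines is given by
$$d = \left|\frac{\vec{b} \times (\vec{a}_2 - \vec{a}_1)}{|\vec{b}|}\right| = \frac{\left|\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 2 & 3 & 6 \\ 2 & 1 & -1 \end{vmatrix}\right|}{\sqrt{4 + 9 + 36}}$$
or
$$d = \frac{|-9\hat{i} + 14\hat{j} - 4\hat{k}|}{\sqrt{49}} = \frac{\sqrt{293}}{\sqrt{49}} = \frac{\sqrt{293}}{7}$$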
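The parallel-line formula can be verified in the same way. This is a hedged sketch using the data of Example 12; the names cross and parallel_distance are illustrative and not from the text.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def parallel_distance(a1, a2, b):
    """Distance between r = a1 + lambda*b and r = a2 + mu*b, i.e. |b x (a2 - a1)| / |b|."""
    diff = tuple(q - p for p, q in zip(a1, a2))
    c = cross(b, diff)
    return math.sqrt(sum(t * t for t in c)) / math.sqrt(sum(t * t for t in b))

d_ex12 = parallel_distance((1, 2, -4), (3, 3, -5), (2, 3, 6))
```

The result agrees with √293/7 up to floating-point rounding.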
Plane
A plane is determined uniquely if any one of the following is known:
(i) the normal to the plane and its distance from the origin is given, i.e., equation of a plane in normal form.
(ii) it passes through a point and is perpendicular to a given direction.
(iii) it passes through three given non collinear points.
Now we shall find vector and Cartesian equations of the planes.
Equation of a plane in normal form
Consider a plane whose perpendicular distance from the origin is d (d ≠ 0).
If $\overrightarrow{ON}$ is the normal from the origin to the plane, and $\hat{n}$ is the unit normal vector along $\overrightarrow{ON}$, then $\overrightarrow{ON} = d\,\hat{n}$. Let P be any point on the plane. Therefore, $\overrightarrow{NP}$ is perpendicular to $\overrightarrow{ON}$.
Therefore, $\overrightarrow{NP} \cdot \overrightarrow{ON} = 0$ ... (1)
Let $\vec{r}$ be the position vector of the point P, then $\overrightarrow{NP} = \vec{r} - d\,\hat{n}$ (as $\overrightarrow{ON} + \overrightarrow{NP} = \overrightarrow{OP}$)
Therefore, (1) becomes
$(\vec{r} - d\,\hat{n}) \cdot d\,\hat{n} = 0$
or $(\vec{r} - d\,\hat{n}) \cdot \hat{n} = 0$ (d ≠ 0)
or $\vec{r} \cdot \hat{n} - d\,\hat{n} \cdot \hat{n} = 0$
i.e., $\vec{r} \cdot \hat{n} = d$ (as $\hat{n} \cdot \hat{n} = 1$) … (2)
This is the vector form of the equation of the plane.
Cartesian form

Equation (2) gives the vector equation of a plane, where $\hat{n}$ is the unit vector normal to the plane. Let P(x, y, z) be any point on the plane. Then
$\overrightarrow{OP} = \vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$
Let l, m, n be the direction cosines of $\hat{n}$. Then
$\hat{n} = l\hat{i} + m\hat{j} + n\hat{k}$
Therefore, (2) gives
$(x\hat{i} + y\hat{j} + z\hat{k}) \cdot (l\hat{i} + m\hat{j} + n\hat{k}) = d$
i.e., lx + my + nz = d ... (3)
This is the Cartesian equation of the plane in the normal form.
Note Equation (3) shows that if $\vec{r} \cdot (a\hat{i} + b\hat{j} + c\hat{k}) = d$ is the vector equation of a plane, then ax + by + cz = d is the Cartesian equation of the plane, where a, b and c are the direction ratios of the normal to the plane.
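Example 13 Find the vector equation of the plane which is at a distance of $\frac{6}{\sqrt{29}}$ from the origin and whose normal vector from the origin is $2\hat{i} - 3\hat{j} + 4\hat{k}$. Also find its Cartesian form.
Solution Let $\vec{n} = 2\hat{i} - 3\hat{j} + 4\hat{k}$. Then
$$\hat{n} = \frac{\vec{n}}{|\vec{n}|} = \frac{2\hat{i} - 3\hat{j} + 4\hat{k}}{\sqrt{4 + 9 + 16}} = \frac{2\hat{i} - 3\hat{j} + 4\hat{k}}{\sqrt{29}}$$
Hence, the required equation of the plane is
$$\vec{r} \cdot \left(\frac{2}{\sqrt{29}}\hat{i} + \frac{-3}{\sqrt{29}}\hat{j} + \frac{4}{\sqrt{29}}\hat{k}\right) = \frac{6}{\sqrt{29}}$$
Taking $\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$ and multiplying throughout by $\sqrt{29}$, the Cartesian form is 2x − 3y + 4z = 6.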
Example 14 Find the direction cosines of the unit vector perpendicular to the plane $\vec{r} \cdot (6\hat{i} - 3\hat{j} - 2\hat{k}) + 1 = 0$ passing through the origin.
Solution The given equation can be written as
$\vec{r} \cdot (-6\hat{i} + 3\hat{j} + 2\hat{k}) = 1$ … (1)
Now $|-6\hat{i} + 3\hat{j} + 2\hat{k}| = \sqrt{36 + 9 + 4} = 7$
Therefore, dividing both sides of (1) by 7, we get
$$\vec{r} \cdot \left(-\frac{6}{7}\hat{i} + \frac{3}{7}\hat{j} + \frac{2}{7}\hat{k}\right) = \frac{1}{7}$$
which is the equation of the plane in the form $\vec{r} \cdot \hat{n} = d$.
This shows that $\hat{n} = -\frac{6}{7}\hat{i} + \frac{3}{7}\hat{j} + \frac{2}{7}\hat{k}$ is a unit vector perpendicular to the plane through the origin. Hence, the direction cosines of $\hat{n}$ are $\frac{-6}{7}, \frac{3}{7}, \frac{2}{7}$.
Example 15 Find the distance of the plane 2x − 3y + 4z − 6 = 0 from the origin.
Solution Since the direction ratios of the normal to the plane are 2, −3, 4, the direction cosines of it are
$$\frac{2}{\sqrt{2^2 + (-3)^2 + 4^2}}, \frac{-3}{\sqrt{2^2 + (-3)^2 + 4^2}}, \frac{4}{\sqrt{2^2 + (-3)^2 + 4^2}}, \text{ i.e., } \frac{2}{\sqrt{29}}, \frac{-3}{\sqrt{29}}, \frac{4}{\sqrt{29}}$$
Hence, dividing the equation 2x − 3y + 4z − 6 = 0, i.e., 2x − 3y + 4z = 6, throughout by $\sqrt{29}$, we get
$$\frac{2}{\sqrt{29}}x + \frac{-3}{\sqrt{29}}y + \frac{4}{\sqrt{29}}z = \frac{6}{\sqrt{29}}$$
This is of the form lx + my + nz = d, where d is the distance of the plane from the origin. So, the distance of the plane from the origin is $\frac{6}{\sqrt{29}}$.
Example 16 Find the coordinates of the foot of the perpendicular drawn from the origin to the plane 2x − 3y + 4z − 6 = 0.
Solution Let the coordinates of the foot of the perpendicular P from the origin to the plane be (x1, y1, z1).
Then, the direction ratios of the line OP are x1, y1, z1.
Writing the equation of the plane in the normal form, we have
$$\frac{2}{\sqrt{29}}x - \frac{3}{\sqrt{29}}y + \frac{4}{\sqrt{29}}z = \frac{6}{\sqrt{29}}$$
where $\frac{2}{\sqrt{29}}, \frac{-3}{\sqrt{29}}, \frac{4}{\sqrt{29}}$ are the direction cosines of OP.
Since the d.c.'s and direction ratios of a line are proportional, we have
$$\frac{x_1}{2/\sqrt{29}} = \frac{y_1}{-3/\sqrt{29}} = \frac{z_1}{4/\sqrt{29}} = k$$
i.e., $x_1 = \frac{2k}{\sqrt{29}}$, $y_1 = \frac{-3k}{\sqrt{29}}$, $z_1 = \frac{4k}{\sqrt{29}}$
Substituting these in the equation of the plane, we get $k = \frac{6}{\sqrt{29}}$.
Hence, the foot of the perpendicular is $\left(\frac{12}{29}, \frac{-18}{29}, \frac{24}{29}\right)$.
Note If d is the distance from the origin and l, m, n are the direction cosines of the normal to the plane through the origin, then the foot of the perpendicular is (ld, md, nd).
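The rule (ld, md, nd) in the Note can be evaluated exactly without square roots, because l·d = a·D/(a² + b² + c²) for a plane ax + by + cz = D. The sketch below assumes integer coefficients and uses a hypothetical helper name, foot_from_origin.

```python
from fractions import Fraction

def foot_from_origin(a, b, c, D):
    """Foot of the perpendicular from the origin to the plane a*x + b*y + c*z = D.

    With l = a/s, m = b/s, n = c/s and d = D/s, where s = sqrt(a^2 + b^2 + c^2),
    the point (l*d, m*d, n*d) simplifies to (a, b, c) * D / (a^2 + b^2 + c^2).
    """
    scale = Fraction(D, a * a + b * b + c * c)
    return (a * scale, b * scale, c * scale)

foot_ex16 = foot_from_origin(2, -3, 4, 6)  # plane of Example 16
```

For Example 16 this gives the same point (12/29, −18/29, 24/29), and substituting it back into 2x − 3y + 4z returns 6.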
Equation of a plane perpendicular to a given vector and passing through a given point

In space, there can be many planes that are perpendicular to a given vector, but through a given point P(x1, y1, z1), only one such plane exists.
Let a plane pass through a point A with position vector $\vec{a}$ and be perpendicular to the vector $\vec{N}$.
Let $\vec{r}$ be the position vector of any point P(x, y, z) in the plane.
Then the point P lies in the plane if and only if $\overrightarrow{AP}$ is perpendicular to $\vec{N}$, i.e., $\overrightarrow{AP} \cdot \vec{N} = 0$. But $\overrightarrow{AP} = \vec{r} - \vec{a}$. Therefore,
$(\vec{r} - \vec{a}) \cdot \vec{N} = 0$ … (1)
This is the vector equation of the plane.
Cartesian form
Let the given point A be (x1, y1, z1), P be (x, y, z) and the direction ratios of $\vec{N}$ be A, B and C. Then,
$\vec{a} = x_1\hat{i} + y_1\hat{j} + z_1\hat{k}$, $\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$ and $\vec{N} = A\hat{i} + B\hat{j} + C\hat{k}$
Now $(\vec{r} - \vec{a}) \cdot \vec{N} = 0$
So $[(x - x_1)\hat{i} + (y - y_1)\hat{j} + (z - z_1)\hat{k}] \cdot (A\hat{i} + B\hat{j} + C\hat{k}) = 0$
i.e. A(x − x1) + B(y − y1) + C(z − z1) = 0
Example 17 Find the vector and Cartesian equations of the plane which passes through the point (5, 2, −4) and is perpendicular to the line with direction ratios 2, 3, −1.
Solution We have the position vector of the point (5, 2, −4) as $\vec{a} = 5\hat{i} + 2\hat{j} - 4\hat{k}$ and the normal vector $\vec{N}$ perpendicular to the plane as $\vec{N} = 2\hat{i} + 3\hat{j} - \hat{k}$
Therefore, the vector equation of the plane is given by $(\vec{r} - \vec{a}) \cdot \vec{N} = 0$
or $[\vec{r} - (5\hat{i} + 2\hat{j} - 4\hat{k})] \cdot (2\hat{i} + 3\hat{j} - \hat{k}) = 0$ … (1)
Transforming (1) into Cartesian form, we have
$[(x - 5)\hat{i} + (y - 2)\hat{j} + (z + 4)\hat{k}] \cdot (2\hat{i} + 3\hat{j} - \hat{k}) = 0$
or 2(x − 5) + 3(y − 2) − 1(z + 4) = 0
i.e. 2x + 3y − z = 20
which is the Cartesian equation of the plane.
Equation of a plane passing through three non collinear points
Let R, S and T be three non collinear points on the plane with position vectors $\vec{a}$, $\vec{b}$ and $\vec{c}$ respectively.
The vectors $\overrightarrow{RS}$ and $\overrightarrow{RT}$ are in the given plane. Therefore, the vector $\overrightarrow{RS} \times \overrightarrow{RT}$ is perpendicular to the plane containing the points R, S and T. Let $\vec{r}$ be the position vector of any point P in the plane. Therefore, the equation of the plane passing through R and perpendicular to the vector $\overrightarrow{RS} \times \overrightarrow{RT}$ is
$(\vec{r} - \vec{a}) \cdot (\overrightarrow{RS} \times \overrightarrow{RT}) = 0$
or $(\vec{r} - \vec{a}) \cdot [(\vec{b} - \vec{a}) \times (\vec{c} - \vec{a})] = 0$ … (1)
This is the equation of the plane in vector form passing through three non collinear points.
Note Why was it necessary to say that the three points had to be non collinear? If the three points were on the same line, then there would be many planes containing them.
These planes would resemble the pages of a book, with the line containing the points R, S and T lying along the binding of the book.
Cartesian form
Let (x1, y1, z1), (x2, y2, z2) and (x3, y3, z3) be the coordinates of the points R, S and T respectively. Let (x, y, z) be the coordinates of any point P on the plane with position vector $\vec{r}$. Then
$\overrightarrow{RP} = (x - x_1)\hat{i} + (y - y_1)\hat{j} + (z - z_1)\hat{k}$
$\overrightarrow{RS} = (x_2 - x_1)\hat{i} + (y_2 - y_1)\hat{j} + (z_2 - z_1)\hat{k}$
$\overrightarrow{RT} = (x_3 - x_1)\hat{i} + (y_3 - y_1)\hat{j} + (z_3 - z_1)\hat{k}$
Substituting these values in equation (1) of the vector form and expressing it in the form of a determinant, we have
$$\begin{vmatrix} x - x_1 & y - y_1 & z - z_1 \\ x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \end{vmatrix} = 0$$
which is the equation of the plane in Cartesian form passing through three non collinear points (x1, y1, z1), (x2, y2, z2) and (x3, y3, z3).

Example 18 Find the vector equation of the plane passing through the points R(2, 5, −3), S(−2, −3, 5) and T(5, 3, −3).
Solution Let $\vec{a} = 2\hat{i} + 5\hat{j} - 3\hat{k}$, $\vec{b} = -2\hat{i} - 3\hat{j} + 5\hat{k}$, $\vec{c} = 5\hat{i} + 3\hat{j} - 3\hat{k}$
Then the vector equation of the plane passing through $\vec{a}$, $\vec{b}$ and $\vec{c}$ is given by
$(\vec{r} - \vec{a}) \cdot (\overrightarrow{RS} \times \overrightarrow{RT}) = 0$ (Why?)
or $(\vec{r} - \vec{a}) \cdot [(\vec{b} - \vec{a}) \times (\vec{c} - \vec{a})] = 0$
i.e. $[\vec{r} - (2\hat{i} + 5\hat{j} - 3\hat{k})] \cdot [(-4\hat{i} - 8\hat{j} + 8\hat{k}) \times (3\hat{i} - 2\hat{j})] = 0$
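The cross product left unevaluated in Example 18 can be expanded to reach a Cartesian equation. The following is a minimal sketch under the assumption of integer coordinates; the helper names cross and plane_through_points are illustrative.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through_points(r, s, t):
    """Return (normal, constant) so that normal . (x, y, z) = constant on the plane."""
    rs = tuple(q - p for p, q in zip(r, s))  # vector RS
    rt = tuple(q - p for p, q in zip(r, t))  # vector RT
    normal = cross(rs, rt)
    if normal == (0, 0, 0):
        raise ValueError("points are collinear; the plane is not unique")
    constant = sum(n * p for n, p in zip(normal, r))
    return normal, constant

normal_ex18, constant_ex18 = plane_through_points((2, 5, -3), (-2, -3, 5), (5, 3, -3))
```

For the points of Example 18 the normal is (16, 24, 32) and the constant is 56, which reduces to 2x + 3y + 4z = 7; all three given points satisfy this equation.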
Intercept form of the equation of a plane
In this section, we shall deduce the equation of a plane in terms of the intercepts made by the plane on the
coordinate axes. Let the equation of the plane be
Ax + By + Cz + D = 0 (D ≠ 0) ... (1)
Let the plane make intercepts a, b, c on the x, y and z axes, respectively.
Hence, the plane meets the x, y and z-axes at (a, 0, 0), (0, b, 0), (0, 0, c), respectively.
Therefore Aa + D = 0 or $A = \frac{-D}{a}$
Bb + D = 0 or $B = \frac{-D}{b}$
Cc + D = 0 or $C = \frac{-D}{c}$
Substituting these values in the equation (1) of the plane and simplifying, we get
$$\frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1 \quad \ldots (2)$$
which is the required equation of the plane in the intercept form.
Example 19 Find the equation of the plane with intercepts 2, 3 and 4 on the x, y and z-axis respectively.
Solution Let the equation of the plane be
$$\frac{x}{a} + \frac{y}{b} + \frac{z}{c} = 1 \quad \ldots (1)$$
Here a = 2, b = 3, c = 4.
Substituting the values of a, b and c in (1), we get the required equation of the plane as $\frac{x}{2} + \frac{y}{3} + \frac{z}{4} = 1$ or 6x + 4y + 3z = 12.
Plane passing through the intersection of two given planes

Three Dimensional Geometry 15


Let π1 and π2 be two planes with equations r⃗ ⋅ n� 1 = d1 and r⃗ ⋅ n� 2 = d2 respectively. The position vector of any point
on the line of intersection must satisfy both the equations.
If t⃗ is the position vector of a point on the line, then
⃗t. n� 1 = d1 and ⃗t. n� 2 = d2
Therefore, for all real values of λ, we have
t⃗. (n� 1 + λn� 2) = d1 + λd2
Since ⃗t is arbitrary, it satisfies for any point on the line.
Hence, the equation r⃗. (n �⃗1 + λn
�⃗2) = d1 + λd2 represents a plane π3 which is such that if any vector r⃗ satisfies both the
equations π1 and π2, it also satisfies the equation π3 i.e., any plane passing through the intersection of the planes
r⃗. �n⃗1 = d1 and r⃗. n
�⃗2 = d2
has the equation r⃗.(n �⃗1 + λn
�⃗2) = d1 + λ d2... (1)
Cartesian form
In Cartesian system, let
�⃗1 = A1ı̂ + B2ȷ̂ + C1k�
n
�⃗2 = A2ı̂ + B2ȷ̂ + C2k�
n
and r⃗ = xı̂ + yȷ̂ + zk�
Then (1) becomes
x (A1 + λA2) + y (B1 + λB2) + z (C1 + λC2) = d1 + λd2
or (A1x + B1y + C1z - d1) + λ(A2x + B2y + C2z - d2) = 0... (2)
which is the required Cartesian form of the equation of the plane passing through the intersection of the given
planes for each value of λ.
Example 20 Find the vector equation of the plane passing through the intersection of the planes $\vec{r} \cdot (\hat{i} + \hat{j} + \hat{k}) = 6$ and $\vec{r} \cdot (2\hat{i} + 3\hat{j} + 4\hat{k}) = -5$, and the point (1, 1, 1).
Solution Here, $\vec{n}_1 = \hat{i} + \hat{j} + \hat{k}$ and $\vec{n}_2 = 2\hat{i} + 3\hat{j} + 4\hat{k}$;
and d1 = 6 and d2 = −5
Hence, using the relation $\vec{r} \cdot (\vec{n}_1 + \lambda\vec{n}_2) = d_1 + \lambda d_2$, we get
$\vec{r} \cdot [\hat{i} + \hat{j} + \hat{k} + \lambda(2\hat{i} + 3\hat{j} + 4\hat{k})] = 6 - 5\lambda$
or $\vec{r} \cdot [(1 + 2\lambda)\hat{i} + (1 + 3\lambda)\hat{j} + (1 + 4\lambda)\hat{k}] = 6 - 5\lambda$ ... (1)
where λ is some real number.
Taking $\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}$, we get
$(x\hat{i} + y\hat{j} + z\hat{k}) \cdot [(1 + 2\lambda)\hat{i} + (1 + 3\lambda)\hat{j} + (1 + 4\lambda)\hat{k}] = 6 - 5\lambda$
or (1 + 2λ)x + (1 + 3λ)y + (1 + 4λ)z = 6 − 5λ
or (x + y + z − 6) + λ(2x + 3y + 4z + 5) = 0 ... (2)
Given that the plane passes through the point (1, 1, 1), it must satisfy (2), i.e.
(1 + 1 + 1 − 6) + λ(2 + 3 + 4 + 5) = 0
or $\lambda = \frac{3}{14}$
Putting the value of λ in (1), we get
$$\vec{r} \cdot \left[\left(1 + \frac{3}{7}\right)\hat{i} + \left(1 + \frac{9}{14}\right)\hat{j} + \left(1 + \frac{6}{7}\right)\hat{k}\right] = 6 - \frac{15}{14}$$
or
$$\vec{r} \cdot \left(\frac{10}{7}\hat{i} + \frac{23}{14}\hat{j} + \frac{13}{7}\hat{k}\right) = \frac{69}{14}$$
or $\vec{r} \cdot (20\hat{i} + 23\hat{j} + 26\hat{k}) = 69$
which is the required vector equation of the plane.
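The arithmetic of Example 20 can be reproduced with exact fractions. This is a sketch only; the function name plane_through_intersection and its argument order are assumptions made for illustration, and integer inputs are assumed.

```python
from fractions import Fraction

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def plane_through_intersection(n1, d1, n2, d2, point):
    """Find lambda so that r.(n1 + lambda*n2) = d1 + lambda*d2 passes through point."""
    residual1 = dot(n1, point) - d1  # value of (n1 . p - d1)
    residual2 = dot(n2, point) - d2  # value of (n2 . p - d2)
    if residual2 == 0:
        raise ValueError("point lies on the second plane; no finite lambda exists")
    lam = Fraction(-residual1, residual2)
    normal = tuple(p + lam * q for p, q in zip(n1, n2))
    return lam, normal, d1 + lam * d2

lam, normal, rhs = plane_through_intersection((1, 1, 1), 6, (2, 3, 4), -5, (1, 1, 1))
```

Here lam is 3/14, and multiplying the normal and the right-hand side by 14 recovers the coefficients 20, 23, 26 and the constant 69 of the worked solution.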
Coplanarity of Two Lines
Let the given lines be
$\vec{r} = \vec{a}_1 + \lambda\vec{b}_1$ … (1)
and $\vec{r} = \vec{a}_2 + \mu\vec{b}_2$ … (2)
The line (1) passes through the point, say A, with position vector $\vec{a}_1$ and is parallel to $\vec{b}_1$. The line (2) passes through the point, say B, with position vector $\vec{a}_2$ and is parallel to $\vec{b}_2$.
Thus, $\overrightarrow{AB} = \vec{a}_2 - \vec{a}_1$
The given lines are coplanar if and only if $\overrightarrow{AB}$ is perpendicular to $\vec{b}_1 \times \vec{b}_2$,
i.e. $\overrightarrow{AB} \cdot (\vec{b}_1 \times \vec{b}_2) = 0$ or $(\vec{a}_2 - \vec{a}_1) \cdot (\vec{b}_1 \times \vec{b}_2) = 0$
Cartesian form
Let (x1, y1, z1) and (x2, y2, z2) be the coordinates of the points A and B respectively.
Let a1, b1, c1 and a2, b2, c2 be the direction ratios of $\vec{b}_1$ and $\vec{b}_2$, respectively. Then
$\overrightarrow{AB} = (x_2 - x_1)\hat{i} + (y_2 - y_1)\hat{j} + (z_2 - z_1)\hat{k}$
$\vec{b}_1 = a_1\hat{i} + b_1\hat{j} + c_1\hat{k}$ and $\vec{b}_2 = a_2\hat{i} + b_2\hat{j} + c_2\hat{k}$
The given lines are coplanar if and only if $\overrightarrow{AB} \cdot (\vec{b}_1 \times \vec{b}_2) = 0$. In the Cartesian form, it can be expressed as
$$\begin{vmatrix} x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix} = 0 \quad \ldots (4)$$
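Example 21 Show that the lines
$$\frac{x+3}{-3} = \frac{y-1}{1} = \frac{z-5}{5} \quad \text{and} \quad \frac{x+1}{-1} = \frac{y-2}{2} = \frac{z-5}{5}$$
are coplanar.
Solution Here, x1 = −3, y1 = 1, z1 = 5, a1 = −3, b1 = 1, c1 = 5
x2 = −1, y2 = 2, z2 = 5, a2 = −1, b2 = 2, c2 = 5
Now, consider the determinant
$$\begin{vmatrix} x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix} = \begin{vmatrix} 2 & 1 & 0 \\ -3 & 1 & 5 \\ -1 & 2 & 5 \end{vmatrix} = 0$$
Therefore, the lines are coplanar.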
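The determinant test (4) is easy to automate. The sketch below assumes integer coordinates, so that comparing the determinant with zero is exact; the names det3 and are_coplanar are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three row tuples."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def are_coplanar(p1, d1, p2, d2):
    """Lines through p1 and p2 with direction ratios d1 and d2 are coplanar iff the determinant is 0."""
    diff = tuple(q - p for p, q in zip(p1, p2))  # (x2 - x1, y2 - y1, z2 - z1)
    return det3((diff, d1, d2)) == 0

coplanar_ex21 = are_coplanar((-3, 1, 5), (-3, 1, 5), (-1, 2, 5), (-1, 2, 5))
# The skew lines of Example 11 should fail the same test.
coplanar_ex11 = are_coplanar((1, 1, 0), (2, -1, 1), (2, 1, -1), (3, -5, 2))
```

With floating-point inputs, the comparison with zero would need a tolerance instead of an exact equality.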
Angle between Two Planes
Definition 2 The angle between two planes is defined as the angle between their normals. Observe that if θ is an angle between the two planes, then so is 180° − θ. We shall take the acute angle as the angle between two planes.
If $\vec{n}_1$ and $\vec{n}_2$ are normals to the planes and θ be the angle between the planes
$\vec{r} \cdot \vec{n}_1 = d_1$ and $\vec{r} \cdot \vec{n}_2 = d_2$,
then θ is the angle between the normals to the planes drawn from some common point.
We have
$$\cos\theta = \left|\frac{\vec{n}_1 \cdot \vec{n}_2}{|\vec{n}_1|\,|\vec{n}_2|}\right|$$
Note The planes are perpendicular to each other if $\vec{n}_1 \cdot \vec{n}_2 = 0$ and parallel if $\vec{n}_1$ is parallel to $\vec{n}_2$.
Cartesian form Let θ be the angle between the planes
A1x + B1y + C1z + D1 = 0 and A2x + B2y + C2z + D2 = 0
The direction ratios of the normals to the planes are A1, B1, C1 and A2, B2, C2 respectively.
Therefore
$$\cos\theta = \left|\frac{A_1A_2 + B_1B_2 + C_1C_2}{\sqrt{A_1^2 + B_1^2 + C_1^2}\,\sqrt{A_2^2 + B_2^2 + C_2^2}}\right|$$
Note
1. If the planes are at right angles, then θ = 90° and so cos θ = 0. Hence, A1A2 + B1B2 + C1C2 = 0.
2. If the planes are parallel, then $\frac{A_1}{A_2} = \frac{B_1}{B_2} = \frac{C_1}{C_2}$.
Example 22 Find the angle between the two planes 2x + y − 2z = 5 and 3x − 6y − 2z = 7 using vector method.
Solution The angle between two planes is the angle between their normals. From the equation of the planes, the normal vectors are
$\vec{N}_1 = 2\hat{i} + \hat{j} - 2\hat{k}$ and $\vec{N}_2 = 3\hat{i} - 6\hat{j} - 2\hat{k}$
$$\cos\theta = \left|\frac{\vec{N}_1 \cdot \vec{N}_2}{|\vec{N}_1|\,|\vec{N}_2|}\right| = \left|\frac{(2\hat{i} + \hat{j} - 2\hat{k}) \cdot (3\hat{i} - 6\hat{j} - 2\hat{k})}{\sqrt{4 + 1 + 4}\,\sqrt{9 + 36 + 4}}\right| = \frac{4}{21}$$
Hence $\theta = \cos^{-1}\left(\frac{4}{21}\right)$
Example 23 Find the angle between the two planes 3x − 6y + 2z = 7 and 2x + 2y − 2z = 5.
Solution Comparing the given equations of the planes with the equations
A1x + B1y + C1z + D1 = 0 and A2x + B2y + C2z + D2 = 0
we get A1 = 3, B1 = −6, C1 = 2
A2 = 2, B2 = 2, C2 = −2
$$\cos\theta = \left|\frac{3 \times 2 + (-6)(2) + (2)(-2)}{\sqrt{3^2 + (-6)^2 + 2^2}\,\sqrt{2^2 + 2^2 + (-2)^2}}\right| = \left|\frac{-10}{7 \times 2\sqrt{3}}\right| = \frac{5}{7\sqrt{3}} = \frac{5\sqrt{3}}{21}$$
Therefore, $\theta = \cos^{-1}\left(\frac{5\sqrt{3}}{21}\right)$
Distance of a Point from a Plane
Vector form
Consider a point P with position vector $\vec{a}$ and a plane π1 whose equation is $\vec{r} \cdot \hat{n} = d$.
Consider a plane π2 through P parallel to the plane π1. The unit vector normal to π2 is $\hat{n}$. Hence, its equation is $(\vec{r} - \vec{a}) \cdot \hat{n} = 0$
i.e., $\vec{r} \cdot \hat{n} = \vec{a} \cdot \hat{n}$
Thus, the distance ON′ of this plane from the origin is $|\vec{a} \cdot \hat{n}|$. Therefore, the distance PQ from the plane π1 is
ON − ON′ = $|d - \vec{a} \cdot \hat{n}|$
which is the length of the perpendicular from a point to the given plane.
Note
1. If the equation of the plane π2 is in the form $\vec{r} \cdot \vec{N} = d$, where $\vec{N}$ is normal to the plane, then the perpendicular distance is $\frac{|\vec{a} \cdot \vec{N} - d|}{|\vec{N}|}$.
2. The length of the perpendicular from origin O to the plane $\vec{r} \cdot \vec{N} = d$ is $\frac{|d|}{|\vec{N}|}$ (since $\vec{a} = 0$).

Cartesian form
Let P(x1, y1, z1) be the given point with position vector $\vec{a}$ and
Ax + By + Cz = D
be the Cartesian equation of the given plane. Then
$\vec{a} = x_1\hat{i} + y_1\hat{j} + z_1\hat{k}$
$\vec{N} = A\hat{i} + B\hat{j} + C\hat{k}$
Hence, from Note 1, the perpendicular from P to the plane is
$$\left|\frac{(x_1\hat{i} + y_1\hat{j} + z_1\hat{k}) \cdot (A\hat{i} + B\hat{j} + C\hat{k}) - D}{\sqrt{A^2 + B^2 + C^2}}\right| = \left|\frac{Ax_1 + By_1 + Cz_1 - D}{\sqrt{A^2 + B^2 + C^2}}\right|$$
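Example 24 Find the distance of a point (2, 5, −3) from the plane
$\vec{r} \cdot (6\hat{i} - 3\hat{j} + 2\hat{k}) = 4$
Solution Here, $\vec{a} = 2\hat{i} + 5\hat{j} - 3\hat{k}$, $\vec{N} = 6\hat{i} - 3\hat{j} + 2\hat{k}$ and d = 4.
Therefore, the distance of the point (2, 5, −3) from the given plane is
$$\frac{|(2\hat{i} + 5\hat{j} - 3\hat{k}) \cdot (6\hat{i} - 3\hat{j} + 2\hat{k}) - 4|}{|6\hat{i} - 3\hat{j} + 2\hat{k}|} = \frac{|12 - 15 - 6 - 4|}{\sqrt{36 + 9 + 4}} = \frac{13}{7}$$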
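The point-to-plane distance of Note 1 can be evaluated directly. This is a minimal sketch with an assumed helper name, point_plane_distance, applied to the data of Example 24.

```python
import math

def point_plane_distance(point, normal, d):
    """Perpendicular distance from point to the plane r . normal = d."""
    numerator = abs(sum(p * n for p, n in zip(point, normal)) - d)
    return numerator / math.sqrt(sum(n * n for n in normal))

dist_ex24 = point_plane_distance((2, 5, -3), (6, -3, 2), 4)
# Note 2: the distance of the origin from the same plane is |d| / |N|.
dist_origin = point_plane_distance((0, 0, 0), (6, -3, 2), 4)
```

The first value agrees with 13/7 and the second with 4/7, up to floating-point rounding.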
Angle between a Line and a Plane

Definition 3 The angle between a line and a plane is the complement of the angle between the line and the normal to the plane.
Vector form If the equation of the line is $\vec{r} = \vec{a} + \lambda\vec{b}$ and the equation of the plane is $\vec{r} \cdot \vec{n} = d$, then the angle θ between the line and the normal to the plane is given by
$$\cos\theta = \left|\frac{\vec{b} \cdot \vec{n}}{|\vec{b}|\,|\vec{n}|}\right|$$
and so the angle ϕ between the line and the plane is given by 90° − θ, i.e.,
sin(90° − θ) = cos θ
i.e.
$$\sin\phi = \left|\frac{\vec{b} \cdot \vec{n}}{|\vec{b}|\,|\vec{n}|}\right| \quad \text{or} \quad \phi = \sin^{-1}\left|\frac{\vec{b} \cdot \vec{n}}{|\vec{b}|\,|\vec{n}|}\right|$$

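Example 25 Find the angle between the line
$$\frac{x+1}{2} = \frac{y}{3} = \frac{z-3}{6}$$
and the plane 10x + 2y − 11z = 3.
Solution Let θ be the angle between the line and the normal to the plane. Converting the given equations into vector form, we have
$\vec{r} = (-\hat{i} + 3\hat{k}) + \lambda(2\hat{i} + 3\hat{j} + 6\hat{k})$
and $\vec{r} \cdot (10\hat{i} + 2\hat{j} - 11\hat{k}) = 3$
Here $\vec{b} = 2\hat{i} + 3\hat{j} + 6\hat{k}$ and $\vec{n} = 10\hat{i} + 2\hat{j} - 11\hat{k}$
$$\sin\phi = \left|\frac{(2\hat{i} + 3\hat{j} + 6\hat{k}) \cdot (10\hat{i} + 2\hat{j} - 11\hat{k})}{\sqrt{2^2 + 3^2 + 6^2}\,\sqrt{10^2 + 2^2 + 11^2}}\right| = \left|\frac{-40}{7 \times 15}\right| = \left|\frac{-8}{21}\right| = \frac{8}{21}$$
or $\phi = \sin^{-1}\left(\frac{8}{21}\right)$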
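For a quick numerical check of Example 25, the sketch below evaluates sin ϕ from the direction vector of the line and the normal of the plane. The function name sin_angle_line_plane is an assumption made for illustration.

```python
import math

def sin_angle_line_plane(b, n):
    """Sine of the angle between a line with direction b and a plane with normal n."""
    dot = sum(p * q for p, q in zip(b, n))
    norm_b = math.sqrt(sum(p * p for p in b))
    norm_n = math.sqrt(sum(q * q for q in n))
    # |cos(theta)| for the line and the normal equals sin(phi) for the line and the plane.
    return abs(dot) / (norm_b * norm_n)

sin_phi = sin_angle_line_plane((2, 3, 6), (10, 2, -11))
phi_degrees = math.degrees(math.asin(sin_phi))
```

The value of sin_phi agrees with 8/21 up to rounding.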



12 Linear Programming
Introduction
In earlier classes, we have discussed systems of linear equations and their applications in day to day
problems. In Class XI, we have studied linear inequalities and systems of linear inequalities in two
variables and their solutions by graphical method. Many applications in mathematics involve systems
of inequalities/equations. In this chapter, we shall apply the systems of linear inequalities/equations
to solve some real life problems of the type as given below:
A furniture dealer deals in only two items–tables and chairs. He has Rs 50,000 to invest and has storage
space of at most 60 pieces. A table costs Rs 2500 and a chair Rs 500. He estimates that from the sale
of one table, he can make a profit of Rs 250 and that from the sale of one chair a profit of Rs 75. He
wants to know how many tables and chairs he should buy from the available money so as to maximise
his total profit, assuming that he can sell all the items which he buys.
Problems of this type, which seek to maximise (or minimise) profit (or cost), form a general class of
problems called optimisation problems. Thus, an optimisation problem may involve finding maximum
profit, minimum cost, or minimum use of resources etc.
A special but very important class of optimisation problems is the linear programming problem. The
optimisation problem stated above is an example of a linear programming problem. Linear programming
problems are of much interest because of their wide applicability in industry, commerce, management
science etc.
In this chapter, we shall study some linear programming problems and their solutions by graphical
method only, though there are many other methods also to solve such problems.
Linear Programming Problem and its Mathematical Formulation
We begin our discussion with the above example of furniture dealer which will further lead to a
mathematical formulation of the problem in two variables. In this example, we observe
(i) The dealer can invest his money in buying tables or chairs or combination thereof. Further he would
earn different profits by following different investment strategies.
(ii) There are certain overriding conditions or constraints viz., his investment is limited to a maximum
of Rs 50,000 and so is his storage space which is for a maximum of 60 pieces.
Suppose he decides to buy tables only and no chairs, so he can buy 50000 ÷ 2500, i.e., 20 tables. His
profit in this case will be Rs (250 × 20), i.e., Rs 5000.
Suppose he chooses to buy chairs only and no tables. With his capital of Rs 50,000, he can buy 50000
÷ 500, i.e. 100 chairs. But he can store only 60 pieces. Therefore, he is forced to buy only 60 chairs
which will give him a total profit of Rs (60 × 75), i.e., Rs 4500.
There are many other possibilities, for instance, he may choose to buy 10 tables and 50 chairs, as he
can store only 60 pieces. Total profit in this case would be Rs (10 × 250 + 50 × 75), i.e., Rs 6250 and so
on.
We, thus, find that the dealer can invest his money in different ways and he would earn different profits
by following different investment strategies.

Now the problem is : How should he invest his money in order to get maximum profit? To answer this
question, let us try to formulate the problem mathematically.
Mathematical formulation of the problem
Let x be the number of tables and y be the number of chairs that the dealer buys. Obviously, x and y
must be non-negative, i.e.,
x ≥ 0 ... (1)
y ≥ 0 ... (2)
(Non-negative constraints)
The dealer is constrained by the maximum amount he can invest (Here it is Rs 50,000) and by the
maximum number of items he can store (Here it is 60).
Stated mathematically,
2500x + 500y ≤ 50000 (investment constraint)

or 5x + y ≤ 100 ... (3)

and x + y ≤ 60 (storage constraint) ... (4)


The dealer wants to invest in such a way so as to maximise his profit, say, Z which stated as a function
of x and y is given by
Z = 250x + 75y (called objective function) ... (5)
Mathematically, the given problem now reduces to:
Maximise Z = 250x + 75y
subject to the constraints:
5x + y ≤ 100

x + y ≤ 60

x ≥ 0, y ≥ 0
So, we have to maximise the linear function Z subject to certain conditions determined by a set of
linear inequalities with variables as non-negative. There are also some other problems where we have
to minimise a linear function subject to certain conditions determined by a set of linear inequalities
with variables as non-negative. Such problems are called Linear Programming Problems.
Thus, a Linear Programming Problem is one that is concerned with finding the optimal value (maximum
or minimum value) of a linear function (called objective function) of several variables (say x and y),
subject to the conditions that the variables are non-negative and satisfy a set of linear inequalities
(called linear constraints). The term linear implies that all the mathematical relations used in the
problem are linear relations while the term programming refers to the method of determining a
particular programme or plan of action.
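Because the dealer can only buy whole tables and chairs, the formulated problem is small enough to check by exhaustive search. The sketch below is a brute-force enumeration, not the graphical method studied in this chapter; the function name best_integer_plan and the loop bounds (at most 20 tables, at most 60 chairs, taken from the constraints) are assumptions made for illustration.

```python
def best_integer_plan():
    """Try every whole-number purchase plan and keep the most profitable feasible one."""
    best = None
    for x in range(0, 21):        # tables: 5x <= 100 forces x <= 20
        for y in range(0, 61):    # chairs: x + y <= 60 forces y <= 60
            if 5 * x + y <= 100 and x + y <= 60:
                z = 250 * x + 75 * y  # objective function Z
                if best is None or z > best[0]:
                    best = (z, x, y)
    return best

best = best_integer_plan()
```

The search returns a profit of Rs 6250 for 10 tables and 50 chairs, the same plan mentioned among the possibilities above.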
Before we proceed further, we now formally define some terms (which have been used above) which
we shall be using in the linear programming problems:
Objective function Linear function Z = ax + by, where a, b are constants, which has to be maximised or
minimized is called a linear objective function.
In the above example, Z = 250x + 75y is a linear objective function. Variables x and y are called decision
variables.
Constraints The linear inequalities or equations or restrictions on the variables of a linear programming
problem are called constraints. The conditions x ≥ 0, y ≥ 0 are called non-negative restrictions. In the
above example, the set of inequalities (1) to (4) are constraints.
Optimisation problem A problem which seeks to maximise or minimise a linear function (say of two
variables x and y) subject to certain constraints as determined by a set of linear inequalities is called
an optimisation problem. Linear programming problems are special type of optimisation problems. The
above problem of investing a given sum by the dealer in purchasing chairs and tables is an example of
an optimisation problem as well as of a linear programming problem.
We will now discuss how to find solutions to a linear programming problem. In this chapter, we will be
concerned only with the graphical method.
Graphical method of solving linear programming problems
In Class XI, we have learnt how to graph a system of linear inequalities involving two variables x and y
and to find its solutions graphically. Let us refer to the problem of investment in tables and chairs
discussed earlier. We will now solve this problem graphically. Let us graph the constraints stated as
linear inequalities:
5x + y ≤ 100 ... (1)

x + y ≤ 60 ... (2)

x ≥ 0 ... (3)

y ≥ 0 ... (4)
The graph of this system (shaded region) consists of the points common to all half planes determined
by the inequalities (1) to (4). Each point in this region represents a feasible choice open to the dealer
for investing in tables and chairs. The region, therefore, is called the feasible region for the problem.
Every point of this region is called a feasible solution to the problem. Thus, we have,
Feasible region The common region determined by all the constraints including non-negative
constraints x, y ≥ 0 of a linear programming problem is called the feasible region (or solution region)
for the problem. The region OABC (shaded) is the feasible region for the problem. The region other than
feasible region is called an infeasible region.
Feasible solutions Points within and on the boundary of the feasible region represent feasible solutions
of the constraints. Every point within and on the boundary of the feasible region OABC represents
feasible solution to the problem. For example, the point (10, 50) is a feasible solution of the problem
and so are the points (0, 60), (20, 0) etc.
Any point outside the feasible region is called an infeasible solution. For example, the point (25, 40) is
an infeasible solution of the problem.
Optimal (feasible) solution: Any point in the feasible region that gives the optimal value (maximum or
minimum) of the objective function is called an optimal solution.
Now, we see that every point in the feasible region OABC satisfies all the constraints as given in (1) to
(4), and since there are infinitely many points, it is not evident how we should go about finding a point
that gives a maximum value of the objective function Z = 250x + 75y. To handle this situation, we use
the following theorems which are fundamental in solving linear programming problems. The proofs of
these theorems are beyond the scope of the book.
Theorem 1 Let R be the feasible region (convex polygon) for a linear programming problem and let Z =
ax + by be the objective function. When Z has an optimal value (maximum or minimum), where the
variables x and y are subject to constraints described by linear inequalities, this optimal value must
occur at a corner point* (vertex) of the feasible region.
Theorem 2 Let R be the feasible region for a linear programming problem, and let Z = ax + by be the
objective function. If R is bounded**, then the objective function Z has both a maximum and a
minimum value on R and each of these occurs at a corner point (vertex) of R.
Remark If R is unbounded, then a maximum or a minimum value of the objective function may not
exist. However, if it exists, it must occur at a corner point of R. (By Theorem 1).
In the above example, the corner points (vertices) of the bounded (feasible) region are: O, A, B and C
and it is easy to find their coordinates as (0, 0), (20, 0), (10, 50) and (0, 60) respectively. Let us now
compute the values of Z at these points.
We have
Vertex of the Feasible Region    Corresponding value of Z (in Rs)
O (0, 0)                         0
C (0, 60)                        4500
B (10, 50)                       6250 ← Maximum
A (20, 0)                        5000

* A corner point of a feasible region is a point in the region which is the intersection of two boundary
lines.
** A feasible region of a system of linear inequalities is said to be bounded if it can be enclosed within
a circle. Otherwise, it is called unbounded. Unbounded means that the feasible region extends
indefinitely in some direction.
We observe that the maximum profit to the dealer results from the investment strategy (10, 50), i.e.
buying 10 tables and 50 chairs.
This method of solving a linear programming problem is referred to as the Corner Point Method. The method
comprises the following steps:
1. Find the feasible region of the linear programming problem and determine its corner points (vertices)
either by inspection or by solving the two equations of the lines intersecting at that point.
2. Evaluate the objective function Z = ax + by at each corner point. Let M and m respectively denote
the largest and smallest values of Z at these points.
3. (i) When the feasible region is bounded, M and m are the maximum and minimum values of Z.
(ii) In case the feasible region is unbounded, we have:
(a) M is the maximum value of Z, if the open half plane determined by ax + by > M has no point in
common with the feasible region. Otherwise, Z has no maximum value.
(b) Similarly, m is the minimum value of Z, if the open half plane determined by ax + by < m has no
point in common with the feasible region. Otherwise, Z has no minimum value.
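Steps 1 to 3(i) can be sketched in a few lines of Python for a bounded region. This is only an illustrative sketch, not part of the graphical method itself: the helper name corner_points and the encoding of every constraint as a triple (a, b, c) meaning ax + by ≤ c with integer coefficients are assumptions made here. It is applied to the dealer's problem (constraints (1) to (4) above).

```python
from fractions import Fraction
from itertools import combinations

def corner_points(constraints):
    """Vertices of the region {(x, y): a*x + b*y <= c for every (a, b, c)}."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines do not intersect
        # Cramer's rule for the intersection of the two boundary lines
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        # keep the intersection only if it satisfies every constraint
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return sorted(points)

# Dealer's problem: 5x + y <= 100, x + y <= 60, x >= 0, y >= 0
# (x >= 0 is written as -x <= 0, and similarly for y)
dealer = [(5, 1, 100), (1, 1, 60), (-1, 0, 0), (0, -1, 0)]

corners = corner_points(dealer)
values = {p: 250 * p[0] + 75 * p[1] for p in corners}  # Z = 250x + 75y
best = max(values, key=values.get)

for (x, y), z in values.items():
    print(f"({x}, {y})  Z = {z}")
print("maximum at", tuple(int(v) for v in best), "with Z =", values[best])
```

The sketch relies on Theorem 2, so it should only be trusted when the feasible region is bounded; for an unbounded region the half plane test of Step 3(ii) is still needed.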
We will now illustrate these steps of Corner Point Method by considering some examples:
Example 1 Solve the following linear programming problem graphically:
Maximise Z = 4x + y ... (1)
subject to the constraints:
x + y ≤ 50 ... (2)

3x + y ≤ 90 ... (3)

x ≥ 0, y ≥ 0 ... (4)
Solution The shaded region in the figure is the feasible region determined by the system of constraints (2)
to (4). We observe that the feasible region OABC is bounded. So, we now use Corner Point Method to
determine the maximum value of Z.
The coordinates of the corner points O, A, B and C are (0, 0), (30, 0), (20, 30) and (0, 50) respectively.
Now we evaluate Z at each corner point.

Corner Point    Corresponding value of Z
(0, 0)          0
(30, 0)         120 ← Maximum
(20, 30)        110
(0, 50)         50
Hence, maximum value of Z is 120 at the point (30, 0).
Example 2 Solve the following linear programming problem graphically:
Minimise Z = 200 x + 500 y ... (1)
subject to the constraints:
x + 2y ≥ 10 ... (2)

3x + 4y ≤ 24 ... (3)

x ≥ 0, y ≥ 0 ... (4)
Solution The shaded region is the feasible region ABC determined by the system of constraints (2) to
(4), which is bounded. The coordinates of the corner points A, B and C are (0, 5), (4, 3) and (0, 6)
respectively. Now we evaluate Z = 200x + 500y at these points.

Corner Point    Corresponding value of Z
(0, 5)          2500
(4, 3)          2300 ← Minimum
(0, 6)          3000

Hence, minimum value of Z is 2300 attained at the point (4, 3).
Example 3 Solve the following problem graphically:
Minimise and Maximise Z = 3x + 9y ... (1)
subject to the constraints: x + 3y ≤ 60 ... (2)

x + y ≥ 10 ... (3)

x ≤ y ... (4)

x ≥ 0, y ≥ 0 ... (5)
Solution First of all, let us graph the feasible region of the system of linear inequalities (2) to (5). The
feasible region ABCD is shown in the figure. Note that the region is bounded. The coordinates of the corner
points A, B, C and D are (0, 10), (5, 5), (15, 15) and (0, 20) respectively.

Corner Point    Corresponding value of Z = 3x + 9y
A (0, 10)       90
B (5, 5)        60 ← Minimum
C (15, 15)      180 ← Maximum (multiple optimal solutions)
D (0, 20)       180 ← Maximum (multiple optimal solutions)
We now find the minimum and maximum value of Z. From the table, we find that the minimum value
of Z is 60 at the point B (5, 5) of the feasible region.
The maximum value of Z on the feasible region occurs at the two corner points C (15, 15) and D (0, 20)
and it is 180 in each case.
Remark Observe that in the above example, the problem has multiple optimal solutions at the corner
points C and D, i.e. both points produce the same maximum value 180. In such cases, you can see that
every point on the line segment CD joining the two corner points C and D also gives the same maximum
value. The same is true if two corner points produce the same minimum value.
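The claim about the segment CD can be checked numerically. The short Python sketch below (the parameter t and the choice of five sample points are assumptions made for illustration) walks from C (15, 15) to D (0, 20) and evaluates Z = 3x + 9y at each point.

```python
from fractions import Fraction

C, D = (15, 15), (0, 20)

def Z(x, y):
    return 3 * x + 9 * y

values = []
for k in range(5):
    t = Fraction(k, 4)               # t runs over 0, 1/4, 1/2, 3/4, 1
    x = C[0] + t * (D[0] - C[0])     # point on segment CD
    y = C[1] + t * (D[1] - C[1])
    values.append(Z(x, y))
    print(f"t = {t}: ({x}, {y}) gives Z = {Z(x, y)}")
```

Every sampled point gives Z = 180, because along CD we have x = 15 – 15t and y = 15 + 5t, so 3x + 9y = 45 – 45t + 135 + 45t = 180.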
Example 4 Determine graphically the minimum value of the objective function
Z = – 50x + 20y ... (1)
subject to the constraints:
2x – y ≥ – 5 ... (2)

3x + y ≥ 3 ... (3)

2x – 3y ≤ 12 ... (4)

x ≥ 0, y ≥ 0 ... (5)
Solution First of all, let us graph the feasible region of the system of inequalities (2) to (5). The feasible
region (shaded) is shown. Observe that the feasible region is unbounded.
We now evaluate Z at the corner points.

Corner Point    Z = – 50x + 20y
(0, 5)          100
(0, 3)          60
(1, 0)          – 50
(6, 0)          – 300 ← smallest

From this table, we find that – 300 is the smallest value of Z at the corner point (6, 0). Can we say
that minimum value of Z is – 300? Note that if the region had been bounded, this smallest
value of Z would have been the minimum value of Z (Theorem 2). But here we see that the feasible region is
unbounded. Therefore, – 300 may or may not be the minimum value of Z. To decide this issue, we
graph the inequality
– 50x + 20y < – 300 (see Step 3(ii) of Corner Point Method), i.e., – 5x + 2y < – 30
and check whether the resulting open half plane has points in common with the feasible region or not. If
it has common points, then – 300 will not be the minimum value of Z. Otherwise, – 300 will be the
minimum value of Z.
The open half plane does have points in common with the feasible region: for example, the point (8, 2)
satisfies all the constraints (2) to (5) and gives Z = – 360 < – 300.
Therefore, Z = – 50x + 20y has no minimum value subject to the given constraints.
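To see why no minimum exists in Example 4, one can follow the boundary line 2x – 3y = 12 away from the corner (6, 0). The Python sketch below is only an illustration: the points (6 + 3t, 2t) and the sample values of t are chosen here and are not part of the original solution.

```python
def feasible(x, y):
    """Constraints (2) to (5) of Example 4."""
    return (2 * x - y >= -5 and 3 * x + y >= 3
            and 2 * x - 3 * y <= 12 and x >= 0 and y >= 0)

def Z(x, y):
    return -50 * x + 20 * y

samples = []
for t in (0, 1, 10, 100):
    x, y = 6 + 3 * t, 2 * t          # stays on the line 2x - 3y = 12
    samples.append((x, y, feasible(x, y), Z(x, y)))
    print((x, y), "feasible:", feasible(x, y), "Z =", Z(x, y))
```

Along these points Z = – 300 – 110t, which decreases without bound while every point remains feasible, so the open half plane – 50x + 20y < – 300 meets the feasible region.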
In the above example, can you say whether Z = – 50x + 20y has the maximum value 100 at (0, 5)? For
this, check whether the graph of – 50x + 20y > 100 has points in common with the feasible region.
(Why?)
Example 5 Minimise Z = 3x + 2y
subject to the constraints:
x + y ≥ 8 ... (1)

3x + 5y ≤ 15 ... (2)

x ≥ 0, y ≥ 0 ... (3)
Solution Let us graph the inequalities (1) to (3). Is there any feasible region?
Why is this so?
From the figure, you can see that there is no point satisfying all the constraints simultaneously. Thus, the
problem has no feasible region and hence no feasible solution.
Remarks From the examples which we have discussed so far, we notice some general features of linear
programming problems:
(i) The feasible region is always a convex region.
(ii) The maximum (or minimum) solution of the objective function occurs at the vertex (corner) of the
feasible region. If two corner points produce the same maximum (or minimum) value of the objective
function, then every point on the line segment joining these points will also give the same maximum
(or minimum) value.
Different Types of Linear Programming Problems
A few important linear programming problems are listed below:
1. Manufacturing problems In these problems, we determine the number of units of different products
which should be produced and sold by a firm when each product requires a fixed manpower, machine
hours, labour hour per unit of product, warehouse space per unit of the output etc., in order to make
maximum profit.
2. Diet problems In these problems, we determine the amount of different kinds of
constituents/nutrients which should be included in a diet so as to minimise the cost of the desired
diet such that it contains a certain minimum amount of each constituent/nutrients.
3. Transportation problems In these problems, we determine a transportation schedule in order to find
the cheapest way of transporting a product from plants/factories situated at different locations to
different markets.
Let us now solve some of these types of linear programming problems:
Example 6 (Diet problem): A dietician wishes to mix two types of foods in such a way that vitamin
contents of the mixture contain at least 8 units of vitamin A and 10 units of vitamin C. Food ‘I’ contains
2 units/kg of vitamin A and 1 unit/kg of vitamin C. Food ‘II’ contains 1 unit/kg of vitamin A and 2 units/kg
of vitamin C. It costs Rs 50 per kg to purchase Food ‘I’ and Rs 70 per kg to purchase Food ‘II’. Formulate
this problem as a linear programming problem to minimise the cost of such a mixture.
Solution Let the mixture contain x kg of Food ‘I’ and y kg of Food ‘II’. Clearly, x ≥ 0, y ≥ 0. We make the
following table from the given data:

Resources               Food I (x)    Food II (y)    Requirement
Vitamin A (units/kg)    2             1              8
Vitamin C (units/kg)    1             2              10
Cost (Rs/kg)            50            70
Since the mixture must contain at least 8 units of vitamin A and 10 units of vitamin C, we have the
constraints:
2x + y ≥ 8
x + 2y ≥ 10
Total cost Z of purchasing x kg of Food ‘I’ and y kg of Food ‘II’ is Z = 50x + 70y
Hence, the mathematical formulation of the problem is:
Minimise Z = 50x + 70y ... (1)
subject to the constraints:
2x + y ≥ 8 ... (2)

x + 2y ≥ 10 ... (3)

x, y ≥ 0 ... (4)
Let us graph the inequalities (2) to (4). The feasible region determined by the system is shown in the
figure. Here again, observe that the feasible region is unbounded.
Let us evaluate Z at the corner points A(0, 8), B(2, 4) and C(10, 0).

Corner Point    Z = 50x + 70y
A (0, 8)        560
B (2, 4)        380 ← smallest
C (10, 0)       500

In the table, we find that smallest value of Z is 380 at the point (2,4). Can we say that the minimum
value of Z is 380? Remember that the feasible region is unbounded. Therefore, we have to draw the
graph of the inequality
50x + 70y < 380 i.e., 5x + 7y < 38
to check whether the resulting open half plane has any point common with the feasible region. We see
that it has no points in common.
Thus, the minimum value of Z is 380 attained at the point (2, 4). Hence, the optimal mixing strategy
for the dietician would be to mix 2 kg of Food ‘I’ and 4 kg of Food ‘II’, and with this strategy, the
minimum cost of the mixture will be Rs 380.
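The conclusion that no feasible point costs less than Rs 380 can be cross-checked in two ways. Algebraically, 50x + 70y = 10(2x + y) + 30(x + 2y) ≥ 10 × 8 + 30 × 10 = 380 for every feasible (x, y). Numerically, the Python sketch below scans a grid of candidate mixtures; the grid step of 0.25 kg and the upper limit of 20 kg are arbitrary assumptions made for illustration, so the scan is a sanity check rather than a proof.

```python
def feasible(x, y):
    """Constraints (2) to (4) of the diet problem."""
    return 2 * x + y >= 8 and x + 2 * y >= 10 and x >= 0 and y >= 0

def cost(x, y):
    return 50 * x + 70 * y

# candidate amounts 0, 0.25, 0.5, ..., 20 kg of each food
grid = [i / 4 for i in range(81)]

cheapest = min((cost(x, y), x, y)
               for x in grid for y in grid if feasible(x, y))
print(cheapest)  # (cost, kg of Food I, kg of Food II)
```

On this grid the cheapest feasible mixture is the corner point (2, 4), in agreement with the half plane test above.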
Example 7 (Allocation problem) A cooperative society of farmers has 50 hectares of land to grow two
crops X and Y. The profits from crops X and Y per hectare are estimated as Rs 10,500 and Rs 9,000
respectively. To control weeds, a liquid herbicide has to be used for crops X and Y at rates of 20 litres
and 10 litres per hectare respectively. Further, no more than 800 litres of herbicide should be used in order to
protect fish and wild life using a pond which collects drainage from this land. How much land should
be allocated to each crop so as to maximise the total profit of the society?
Solution Let x hectares of land be allocated to crop X and y hectares to crop Y. Obviously, x ≥ 0, y ≥ 0.
Profit per hectare on crop X = Rs 10500
Profit per hectare on crop Y = Rs 9000
Therefore, total profit = Rs (10500x + 9000y)
The mathematical formulation of the problem is as follows:
Maximise Z = 10500 x + 9000 y
subject to the constraints:
x + y ≤ 50 (constraint related to land) ... (1)

20x + 10y ≤ 800 (constraint related to use of herbicide)

i.e. 2x + y ≤ 80 ... (2)

x ≥ 0, y ≥ 0 (non negative constraint) ... (3)


Let us draw the graph of the system of inequalities (1) to (3). The feasible region OABC is shown
(shaded). Observe that the feasible region is bounded.
The coordinates of the corner points O, A, B and C are (0, 0), (40, 0), (30, 20) and (0, 50) respectively.
Let us evaluate the objective function Z = 10500 x + 9000y at these vertices to find which one gives
the maximum profit.

Corner Point    Z = 10500x + 9000y
O (0, 0)        0
A (40, 0)       420000
B (30, 20)      495000 ← Maximum
C (0, 50)       450000

Hence, the society will get the maximum profit of Rs 4,95,000 by allocating 30 hectares for crop X and
20 hectares for crop Y.
Example 8 (Manufacturing problem) A manufacturing company makes two models A and B of a product.
Each piece of Model A requires 9 labour hours for fabricating and 1 labour hour for finishing. Each piece
of Model B requires 12 labour hours for fabricating and 3 labour hours for finishing. For fabricating and
finishing, the maximum labour hours available are 180 and 30 respectively. The company makes a profit
of Rs 8000 on each piece of model A and Rs 12000 on each piece of Model B. How many pieces of
Model A and Model B should be manufactured per week to realise a maximum profit? What is the
maximum profit per week?
Solution Suppose x is the number of pieces of Model A and y is the number of pieces of Model B. Then
Total profit (in Rs) = 8000 x + 12000 y
Let Z = 8000 x + 12000 y
We now have the following mathematical model for the given problem.
Maximise Z = 8000 x + 12000 y ... (1)
subject to the constraints:
9x + 12y ≤ 180 (Fabricating constraint)

i.e. 3x + 4y ≤ 60 ... (2)

x + 3y ≤ 30 (Finishing constraint) ... (3)

x ≥ 0, y ≥ 0 (non-negative constraint) ... (4)


The feasible region (shaded) OABC determined by the linear inequalities (2) to (4) is shown. Note that
the feasible region is bounded.

Let us evaluate the objective function Z at each corner point as shown below:
Corner Point    Z = 8000x + 12000y
O (0, 0)        0
A (20, 0)       160000
B (12, 6)       168000 ← Maximum
C (0, 10)       120000


We find that maximum value of Z is 1,68,000 at B (12, 6). Hence, the company should produce 12 pieces
of Model A and 6 pieces of Model B to realise maximum profit and maximum profit then will be Rs
1,68,000.

13 Probability
Introduction

In earlier Classes, we have studied the probability as a measure of uncertainty of events in a random
experiment. We have also established equivalence between the axiomatic theory and the classical
theory of probability in case of equally likely outcomes. On the basis of this relationship, we obtained
probabilities of events associated with discrete sample spaces. We have also studied the addition rule
of probability. In this chapter, we shall discuss the important concept of conditional probability of an
event given that another event has occurred, which will be helpful in understanding the Bayes' theorem,
multiplication rule of probability and independence of events. We shall also learn an important concept
of random variable and its probability distribution and also the mean and variance of a probability
distribution. In the last section of the chapter, we shall study an important discrete probability
distribution called Binomial distribution. Throughout this chapter, we shall take up the experiments
having equally likely outcomes, unless stated otherwise.
Conditional Probability
Up till now in probability, we have discussed the methods of finding the probability of events. If we
have two events from the same sample space, does the information about the occurrence of one of
the events affect the probability of the other event? Let us try to answer this question by taking up a
random experiment in which the outcomes are equally likely to occur.
Consider the experiment of tossing three fair coins. The sample space of the experiment is
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Since the coins are fair, we can assign the probability 1/8 to each sample point. Let E be the event ‘at least
two heads appear’ and F be the event ‘first coin shows tail’.
Then
E = {HHH, HHT, HTH, THH}
and F = {THH, THT, TTH, TTT}
Therefore P(E) = P({HHH}) + P({HHT}) + P({HTH}) + P({THH})
= 1/8 + 1/8 + 1/8 + 1/8 = 1/2 (Why?)
and P(F) = P({THH}) + P({THT}) + P({TTH}) + P({TTT})
= 1/8 + 1/8 + 1/8 + 1/8 = 1/2
Also E ∩ F = {THH}
with P(E ∩ F) = P({THH}) = 1/8
Now, suppose we are given that the first coin shows tail, i.e. F occurs, then what is the probability of
occurrence of E? With the information of occurrence of F, we are sure that the cases in which first
coin does not result into a tail should not be considered while finding the probability of E. This
information reduces our sample space from the set S to its subset F for the event E. In other words,
the additional information really amounts to telling us that the situation may be considered as being
that of a new random experiment for which the sample space consists of all those outcomes only
which are favourable to the occurrence of the event F.
Now, the sample point of F which is favourable to event E is THH.
Thus, Probability of E considering F as the sample space = 1/4,
or Probability of E given that the event F has occurred = 1/4
This probability of the event E is called the conditional probability of E given that F has already
occurred, and is denoted by P(E|F).
Thus P(E|F) = 1/4
Note that the elements of F which favour the event E are the common elements of E and F, i.e. the
sample points of E ∩ F.
Thus, we can also write the conditional probability of E given that F has occurred as
P(E|F) = (Number of elementary events favourable to E ∩ F) / (Number of elementary events which are favourable to F)
= n(E ∩ F) / n(F)
Dividing the numerator and the denominator by total number of elementary events of the sample
space, we see that P(E|F) can also be written as
P(E|F) = [n(E ∩ F) / n(S)] / [n(F) / n(S)] = P(E ∩ F) / P(F) … (1)

Note that (1) is valid only when P(F) ≠ 0, i.e., F ≠ ϕ (Why?)


Thus, we can define the conditional probability as follows :
Definition 1 If E and F are two events associated with the same sample space of a random experiment,
the conditional probability of the event E given that F has occurred, i.e. P(E|F), is given by
P(E|F) = P(E ∩ F) / P(F), provided P(F) ≠ 0
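Definition 1 can be tried out on the three-coin experiment discussed above. The Python sketch below assumes equally likely outcomes and represents events as sets of outcome strings; the helper names prob and cond_prob are introduced here only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space of tossing three fair coins: 'HHH', 'HHT', ..., 'TTT'
S = {"".join(toss) for toss in product("HT", repeat=3)}

def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

def cond_prob(E, F):
    """P(E|F) = P(E ∩ F) / P(F), defined only when P(F) != 0."""
    if prob(F) == 0:
        raise ValueError("P(F) must be non-zero")
    return prob(E & F) / prob(F)

E = {s for s in S if s.count("H") >= 2}   # at least two heads appear
F = {s for s in S if s[0] == "T"}         # first coin shows tail

print(prob(E), prob(F), prob(E & F))      # 1/2 1/2 1/8
print(cond_prob(E, F))                    # 1/4
```

Using Fraction keeps the arithmetic exact, so the printed values can be compared directly with the fractions in the text.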

Properties of conditional probability


Let E and F be events of a sample space S of an experiment, then we have
Property 1 P(S|F) = P(F|F) = 1
We know that
P(S|F) = P(S ∩ F) / P(F) = P(F) / P(F) = 1
Also P(F|F) = P(F ∩ F) / P(F) = P(F) / P(F) = 1

Thus P(S|F) = P(F|F) = 1


Property 2 If A and B are any two events of a sample space S and F is an event of S such that P(F) ≠ 0,
then
P((A ∪ B)|F) = P(A|F) + P(B|F) − P((A ∩ B)|F)
In particular, if A and B are disjoint events, then
P((A ∪ B)|F) = P(A|F) + P(B|F)
We have
P((A ∪ B)|F) = P[(A ∪ B) ∩ F] / P(F)
= P[(A ∩ F) ∪ (B ∩ F)] / P(F)
(by the distributive law of intersection over union of sets)
= [P(A ∩ F) + P(B ∩ F) – P(A ∩ B ∩ F)] / P(F)
= P(A ∩ F) / P(F) + P(B ∩ F) / P(F) – P[(A ∩ B) ∩ F] / P(F)
= P(A|F) + P(B|F) – P((A ∩ B)|F)


When A and B are disjoint events, then
P((A ∩ B)|F) = 0

⇒ P((A ∪ B)|F) = P(A|F) + P(B|F)

Property 3 P(E′|F) = 1 − P(E|F)


From Property 1, we know that P(S|F) = 1
⇒ P(E ∪ E′|F) = 1 since S = E ∪ E′

⇒ P(E|F) + P (E′|F) = 1 since E and E′ are disjoint events

Thus, P(E′|F) = 1 − P(E|F)


Let us now take up some examples.
Example 1 If P(A) = 7/13, P(B) = 9/13 and P(A ∩ B) = 4/13, evaluate P(A|B).
Solution We have P(A|B) = P(A ∩ B) / P(B) = (4/13) / (9/13) = 4/9

Example 2 A family has two children. What is the probability that both the children are boys given that
at least one of them is a boy ?
Solution Let b stand for boy and g for girl. The sample space of the experiment is
S = {(b, b), (g, b), (b, g), (g, g)}
Let E and F denote the following events :
E : ‘both the children are boys’
F : ‘at least one of the children is a boy’
Then E = {(b,b)} and F = {(b,b), (g,b), (b,g)}
Now E ∩ F = {(b,b)}
Thus P(F) = 3/4 and P(E ∩ F) = 1/4
Therefore P(E|F) = P(E ∩ F) / P(F) = (1/4) / (3/4) = 1/3

Example 3 Ten cards numbered 1 to 10 are placed in a box, mixed up thoroughly and then one card is
drawn randomly. If it is known that the number on the drawn card is more than 3, what is the
probability that it is an even number?
Solution Let A be the event ‘the number on the card drawn is even’ and B be the event ‘the number on
the card drawn is greater than 3’. We have to find P(A|B).
Now, the sample space of the experiment is S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Then A = {2, 4, 6, 8, 10}, B = {4, 5, 6, 7, 8, 9, 10}
and A ∩ B = {4, 6, 8, 10}
Also P(A) = 5/10, P(B) = 7/10 and P(A ∩ B) = 4/10
Then P(A|B) = P(A ∩ B) / P(B) = (4/10) / (7/10) = 4/7

Example 4 In a school, there are 1000 students, out of which 430 are girls. It is known that out of 430,
10% of the girls study in class XII. What is the probability that a student chosen randomly studies in
Class XII given that the chosen student is a girl?
Solution Let E denote the event that a student chosen randomly studies in Class XII and F be the event
that the randomly chosen student is a girl. We have to find P (E|F).
Now P(F) = 430/1000 = 0.43 and P(E ∩ F) = 43/1000 = 0.043 (Why?)
Then P(E|F) = P(E ∩ F) / P(F) = 0.043 / 0.43 = 0.1

Example 5 A die is thrown three times. Events A and B are defined as below:
A : 4 on the third throw
B : 6 on the first and 5 on the second throw
Find the probability of A given that B has already occurred.
Solution The sample space has 216 outcomes.
Now A = {(1,1,4), (1,2,4), ..., (1,6,4), (2,1,4), (2,2,4), ..., (2,6,4),
(3,1,4), (3,2,4), ..., (3,6,4), (4,1,4), (4,2,4), ..., (4,6,4),
(5,1,4), (5,2,4), ..., (5,6,4), (6,1,4), (6,2,4), ..., (6,6,4)}
B = {(6,5,1), (6,5,2), (6,5,3), (6,5,4), (6,5,5), (6,5,6)}
and A ∩ B = {(6,5,4)}.
Now P(B) = 6/216 and P(A ∩ B) = 1/216
Then P(A|B) = P(A ∩ B) / P(B) = (1/216) / (6/216) = 1/6

Example 6 A die is thrown twice and the sum of the numbers appearing is observed to be 6. What is
the conditional probability that the number 4 has appeared at least once?

Solution Let E be the event that ‘number 4 appears at least once’ and F be the event that ‘the sum of
the numbers appearing is 6’.
Then, E = {(4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (1,4), (2,4), (3,4), (5,4), (6,4)}
and F = {(1,5), (2,4), (3,3), (4,2), (5,1)}
We have P(E) = 11/36 and P(F) = 5/36
Also E ∩ F = {(2,4), (4,2)}
Therefore P(E ∩ F) = 2/36
Hence, the required probability
P(E|F) = P(E ∩ F) / P(F) = (2/36) / (5/36) = 2/5
For the conditional probability discussed above, we have considered the elementary events of the
experiment to be equally likely and the corresponding definition of the probability of an event was
used. However, the same definition can also be used in the general case where the elementary events
of the sample space are not equally likely, the probabilities P(E ∩ F) and P(F) being calculated
accordingly. Let us take up the following example.
Example 7 Consider the experiment of tossing a coin. If the coin shows head, toss it again but if it
shows tail, then throw a die. Find the conditional probability of the event that ‘the die shows a number
greater than 4’ given that ‘there is at least one tail’.
Solution The outcomes of the experiment can be represented in following diagrammatic manner called
the ‘tree diagram’.
The sample space of the experiment may be described as

S = {(H,H), (H,T), (T,1), (T,2), (T,3), (T,4), (T,5), (T,6)}

where (H, H) denotes that both the tosses result in heads and (T, i) denotes that the first toss results in a
tail and the number i appears on the die, for i = 1, 2, 3, 4, 5, 6.
Thus, the probabilities assigned to the 8 elementary events
(H, H), (H, T), (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)
are 1/4, 1/4, 1/12, 1/12, 1/12, 1/12, 1/12, 1/12 respectively.
Let F be the event that ‘there is at least one tail’ and E be the event ‘the die shows a number greater
than 4’. Then
F = {(H,T), (T,1), (T,2), (T,3), (T,4), (T,5), (T,6)}
E = {(T,5), (T,6)} and E ∩ F = {(T,5), (T,6)}
Now P(F) = P({(H,T)}) + P({(T,1)}) + P({(T,2)}) + P({(T,3)}) + P({(T,4)}) + P({(T,5)}) + P({(T,6)})
= 1/4 + 1/12 + 1/12 + 1/12 + 1/12 + 1/12 + 1/12 = 3/4
and P(E ∩ F) = P({(T,5)}) + P({(T,6)}) = 1/12 + 1/12 = 1/6
Hence P(E|F) = P(E ∩ F) / P(F) = (1/6) / (3/4) = 2/9
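When the elementary events are not equally likely, counting outcomes is no longer enough and each outcome must carry its own probability. The Python sketch below redoes Example 7 this way; storing the probabilities in a dictionary P and the helper name prob are assumptions made for illustration.

```python
from fractions import Fraction

# Probabilities of the 8 elementary events of Example 7
P = {("H", "H"): Fraction(1, 4), ("H", "T"): Fraction(1, 4)}
P.update({("T", i): Fraction(1, 12) for i in range(1, 7)})

def prob(event):
    """Add up the probabilities of the outcomes that make up the event."""
    return sum((P[outcome] for outcome in event), Fraction(0))

F = {o for o in P if "T" in o}                       # at least one tail
E = {o for o in P if o[0] == "T" and o[1] in (5, 6)} # die shows more than 4

print(prob(F))                  # 3/4
print(prob(E & F))              # 1/6
print(prob(E & F) / prob(F))    # 2/9
```

The same formula P(E|F) = P(E ∩ F) / P(F) is used; only the way P(E ∩ F) and P(F) are computed has changed.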

Multiplication Theorem on Probability


Let E and F be two events associated with a sample space S. Clearly, the set E ∩ F denotes the event
that both E and F have occurred. In other words, E ∩ F denotes the simultaneous occurrence of the
events E and F. The event E ∩ F is also written as EF.


Very often we need to find the probability of the event EF. For example, in the experiment of drawing
two cards one after the other, we may be interested in finding the probability of the event ‘a king and
a queen’. The probability of event EF is obtained by using the conditional probability as obtained below
:
We know that the conditional probability of event E given that F has occurred is denoted by P(E|F) and
is given by
P(E|F) = P(E ∩ F) / P(F), P(F) ≠ 0
From this result, we can write
P(E ∩ F) = P(F). P(E|F) … (1)
Also, we know that
P(F|E) = P(F ∩ E) / P(E), P(E) ≠ 0
or P(F|E) = P(E ∩ F) / P(E) (since E ∩ F = F ∩ E)
Thus, P(E ∩ F) = P(E). P(F|E) … (2)
Combining (1) and (2), we find that
P(E ∩ F) = P(E) P(F|E)
= P(F) P(E|F) provided P(E) ≠ 0 and P(F) ≠ 0.

The above result is known as the multiplication rule of probability.


Let us now take up an example.
Example 8 An urn contains 10 black and 5 white balls. Two balls are drawn from the urn one after the
other without replacement. What is the probability that both drawn balls are black?
Solution Let E and F denote respectively the events that first and second ball drawn are black. We
have to find P(E ∩ F) or P (EF).
Now P(E) = P(black ball in first draw) = 10/15
Also given that the first ball drawn is black, i.e., event E has occurred, now there are 9 black balls and
five white balls left in the urn. Therefore, the probability that the second ball drawn is black, given that
the ball in the first draw is black, is nothing but the conditional probability of F given that E has
occurred.
i.e. P(F|E) = 9/14
By multiplication rule of probability, we have
P(E ∩ F) = P(E) P(F|E)
= 10/15 × 9/14 = 3/7
Multiplication rule of probability for more than two events If E, F and G are
three events of sample space, we have
P(E ∩ F ∩ G) = P(E) P(F|E) P(G|(E ∩ F)) = P(E) P(F|E) P(G|EF)
Similarly, the multiplication rule of probability can be extended for four or more events.
The following example illustrates the extension of multiplication rule of probability for three events.
Example 9 Three cards are drawn successively, without replacement from a pack of 52 well shuffled
cards. What is the probability that first two cards are kings and the third card drawn is an ace?
Solution Let K denote the event that the card drawn is king and A be the event that the card drawn is
an ace. Clearly, we have to find P (KKA)
Now P(K) = 4/52
Also, P(K|K) is the probability of second king with the condition that one king has already been drawn.
Now there are three kings in (52 − 1) = 51 cards.
Therefore P(K|K) = 3/51
Lastly, P(A|KK) is the probability of third drawn card to be an ace, with the condition that two kings
have already been drawn. Now there are four aces in the remaining 50 cards.
Therefore P(A|KK) = 4/50
By multiplication law of probability, we have
P(KKA) = P(K) P(K|K) P(A|KK)
= 4/52 × 3/51 × 4/50 = 2/5525
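The multiplication rule in Examples 8 and 9 can be verified with exact fractions. In the Python sketch below, the urn of Example 8 is also checked by listing every ordered pair of distinct balls; labelling the balls 0 to 14 and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Example 8 by the multiplication rule: P(E ∩ F) = P(E) P(F|E)
p_rule = Fraction(10, 15) * Fraction(9, 14)

# Example 8 by direct enumeration of ordered draws without replacement
urn = ["black"] * 10 + ["white"] * 5
draws = list(permutations(range(len(urn)), 2))   # (first ball, second ball)
both_black = sum(1 for i, j in draws
                 if urn[i] == "black" and urn[j] == "black")
p_enum = Fraction(both_black, len(draws))

# Example 9 by the rule for three events: P(K) P(K|K) P(A|KK)
p_kka = Fraction(4, 52) * Fraction(3, 51) * Fraction(4, 50)

print(p_rule, p_enum)   # 3/7 3/7
print(p_kka)            # 2/5525
```

Both routes give 3/7 for the urn, which illustrates that the conditional probability P(F|E) = 9/14 already accounts for the ball removed in the first draw.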
Independent Events
Consider the experiment of drawing a card from a deck of 52 playing cards, in which the elementary
events are assumed to be equally likely. If E and F denote the events 'the card drawn is a spade' and
'the card drawn is an ace' respectively, then
P(E) = 13/52 = 1/4 and P(F) = 4/52 = 1/13
Also E ∩ F is the event ‘the card drawn is the ace of spades’ so that
P(E ∩ F) = 1/52
Hence P(E|F) = P(E ∩ F) / P(F) = (1/52) / (1/13) = 1/4
Since P(E) = 1/4 = P(E|F), we can say that the occurrence of event F has not affected the probability of
occurrence of the event E. We also have
P(F|E) = P(E ∩ F) / P(E) = (1/52) / (1/4) = 1/13 = P(F)
Again, P(F) = 1/13 = P(F|E) shows that occurrence of event E has not affected the probability of occurrence
of the event F.
Thus, E and F are two events such that the probability of occurrence of one of them is not affected by
occurrence of the other.
Such events are called independent events.
Definition 2 Two events E and F are said to be independent, if
P (F|E) = P (F) provided P (E) ≠ 0
and P(E|F) = P(E) provided P(F) ≠ 0
Thus, in this definition we need to have P (E) ≠ 0 and P(F) ≠ 0
Now, by the multiplication rule of probability, we have
P(E ∩ F) = P(E). P (F|E) ... (1)
If E and F are independent, then (1) becomes
P(E ∩ F) = P(E). P(F) ...(2)
Thus, using (2), the independence of two events is also defined as follows:
Definition 3 Let E and F be two events associated with the same random experiment, then E and F are
said to be independent if
P(E ∩ F) = P(E). P (F)
Remarks
(i) Two events E and F are said to be dependent if they are not independent, i.e. if
P(E ∩ F ) ≠ P(E). P (F)
(ii) Sometimes there is a confusion between independent events and mutually exclusive events. The term
‘independent’ is defined in terms of ‘probability of events’ whereas mutually exclusive is defined in
terms of events (subsets of the sample space). Moreover, mutually exclusive events never have an outcome
in common, but independent events may have common outcomes. Clearly, ‘independent’ and ‘mutually
exclusive’ do not have the same meaning.
In other words, two independent events having nonzero probabilities of occurrence cannot be mutually
exclusive, and conversely, i.e. two mutually exclusive events having nonzero probabilities of occurrence
cannot be independent.
(iii) Two experiments are said to be independent if for every pair of events E and F, where E is
associated with the first experiment and F with the second experiment, the probability of the
simultaneous occurrence of the events E and F when the two experiments are performed is the product
of P(E) and P(F) calculated separately on the basis of two experiments, i.e., P (E ∩ F) = P (E). P(F)
(iv) Three events A, B and C are said to be mutually independent, if
P(A ∩ B) = P(A) P(B)
P(A ∩ C) = P(A) P(C)
P(B ∩ C) = P(B) P(C)
and P(A ∩ B ∩ C) = P(A) P(B) P(C)
If at least one of the above is not true for three given events, we say that the events are not
independent.

Example 10 A die is thrown. If E is the event ‘the number appearing is a multiple of 3’ and F be the
event ‘the number appearing is even’ then find whether E and F are independent ?
Solution We know that the sample space is S = {1, 2, 3, 4, 5, 6}
Now E = {3, 6}, F = {2, 4, 6} and E∩F = {6}
Then P(E) = 2/6 = 1/3, P(F) = 3/6 = 1/2 and P(E ∩ F) = 1/6
Clearly P(E ∩ F) = 1/6 = 1/3 × 1/2 = P(E). P(F)
Hence E and F are independent events.
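The check in Example 10 can also be carried out by listing the sample space. The following is a minimal sketch, assuming all outcomes are equally likely; the helper names prob and is_independent are illustrative and do not come from the text.

```python
from fractions import Fraction

def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def is_independent(e, f, sample_space):
    """Definition 3: E and F are independent if P(E ∩ F) = P(E) P(F)."""
    return prob(e & f, sample_space) == prob(e, sample_space) * prob(f, sample_space)

S = set(range(1, 7))                  # one throw of a die
E = {n for n in S if n % 3 == 0}      # multiple of 3, i.e. {3, 6}
F = {n for n in S if n % 2 == 0}      # even number, i.e. {2, 4, 6}

print(prob(E, S), prob(F, S), prob(E & F, S))   # 1/3 1/2 1/6
print(is_independent(E, F, S))                  # True
```

Exact fractions are used instead of decimals so that the equality test in Definition 3 is not disturbed by rounding.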
Example 11 An unbiased die is thrown twice. Let the event A be ‘odd number on the first throw’ and B
the event ‘odd number on the second throw’. Check the independence of the events A and B.
Solution If all the 36 elementary events of the experiment are considered to be equally likely, we have
P(A) = 18/36 = 1/2 and P(B) = 18/36 = 1/2
Also P(A ∩ B) = P(odd number on both throws) = 9/36 = 1/4
Now P(A) P(B) = 1/2 × 1/2 = 1/4
Clearly P(A ∩ B) = P(A) × P(B)
Thus, A and B are independent events.
Example 12 Three coins are tossed simultaneously. Consider the event E ‘three heads or three tails’, F
‘at least two heads’ and G ‘at most two heads’. Of the pairs (E,F), (E,G) and (F,G), which are
independent? which are dependent?
Solution The sample space of the experiment is given by
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Clearly E = {HHH, TTT}, F = {HHH, HHT, HTH, THH}
and G = {HHT, HTH, THH, HTT, THT, TTH, TTT}
Also E ∩ F = {HHH}, E ∩ G = {TTT}, F ∩ G = {HHT, HTH, THH}
Therefore P(E) = 2/8 = 1/4, P(F) = 4/8 = 1/2, P(G) = 7/8
and P(E ∩ F) = 1/8, P(E ∩ G) = 1/8, P(F ∩ G) = 3/8
Also P(E). P(F) = 1/4 × 1/2 = 1/8, P(E). P(G) = 1/4 × 7/8 = 7/32
and P(F). P(G) = 1/2 × 7/8 = 7/16
Thus P(E ∩ F) = P(E). P(F)
P(E ∩ G) ≠ P(E). P(G)
and P(F ∩ G) ≠ P(F). P(G)
Hence, the events (E and F) are independent, and the events (E and G) and (F and G) are dependent.
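The three comparisons of Example 12 can be repeated mechanically by generating the sample space of three coin tosses. This is only a sketch under the assumption that the 8 outcomes are equally likely; the dictionary names events and results are illustrative.

```python
from fractions import Fraction
from itertools import product

S = {"".join(t) for t in product("HT", repeat=3)}   # 8 equally likely outcomes

def prob(event):
    return Fraction(len(event), len(S))

E = {"HHH", "TTT"}                          # three heads or three tails
F = {s for s in S if s.count("H") >= 2}     # at least two heads
G = {s for s in S if s.count("H") <= 2}     # at most two heads

events = {"E": E, "F": F, "G": G}
results = {}
for a, b in [("E", "F"), ("E", "G"), ("F", "G")]:
    # Independent exactly when P(a ∩ b) equals P(a) P(b).
    results[(a, b)] = prob(events[a] & events[b]) == prob(events[a]) * prob(events[b])
    print(a, b, "independent" if results[(a, b)] else "dependent")
```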
Example 13 Prove that if E and F are independent events, then so are the events E and F'.
Solution Since E and F are independent, we have
P(E ∩ F) = P(E). P(F) ... (1)
From the Venn diagram it is clear that E ∩ F and E ∩ F' are mutually exclusive events and also E = (E ∩
F) ∪ (E ∩ F').

Therefore P (E) = P (E ∩ F) + P(E ∩ F')
or P(E∩F') = P(E) - P(E ∩ F)
= P(E) - P(E). P(F)
(by (1))
= P(E) (1 - P(F))
= P(E). P(F')
Hence, E and F' are independent
Note In a similar manner, it can be shown that if the events E and F are independent, then
(a) E' and F are independent,
(b) E' and F' are independent
Example 14 If A and B are two independent events, then the probability of occurrence of at least one
of A and B is given by 1 - P(A') P(B')
Solution We have
P(at least one of A and B) = P(A ∪ B)
= P(A) + P(B) - P(A ∩ B)
= P(A) + P(B) - P(A) P(B)
= P(A) + P(B) [1 - P(A)]
= P(A) + P(B) P(A')
= 1 - P(A') + P(B) P(A')
= 1 - P(A') [1 - P(B)]
= 1 - P(A') P (B')
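The identity of Example 14 can be checked numerically on the independent events of Example 11 (odd number on the first throw, odd number on the second throw). The sketch below assumes 36 equally likely outcomes; the names lhs and rhs are illustrative.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))     # two throws of a die, 36 outcomes

def prob(event):
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] % 2 == 1}         # odd number on the first throw
B = {s for s in S if s[1] % 2 == 1}         # odd number on the second throw

lhs = prob(A | B)                           # P(at least one of A and B)
rhs = 1 - prob(S - A) * prob(S - B)         # 1 - P(A') P(B')
print(lhs, rhs)                             # 3/4 3/4
```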
Bayes' Theorem
Consider two bags, I and II. Bag I contains 2 white and 3 red balls and Bag II contains 4
white and 5 red balls. One ball is drawn at random from one of the bags. We can find the probability
of selecting any of the bags (i.e. 1/2) or the probability of drawing a ball of a particular colour (say white)
from a particular bag (say Bag I). In other words, we can find the probability that the ball drawn is of
a particular colour, if we are given the bag from which the ball is drawn. But can we find the probability
that the ball drawn is from a particular bag (say Bag II), if the colour of the ball drawn is given? Here,
we have to find the reverse probability that Bag II was selected, given that an event which occurred after
the selection is known.
Partition of a sample space
A set of events E1, E2, ..., En is said to represent a partition of the sample space S if
(a) Ei ∩ Ej = ϕ, i ≠ j, i, j = 1, 2, 3,..., n

(b) E1 ∪ E2 ∪... ∪ En = S and
(c) P(Ei)>0 for all i = 1, 2,..., n.
In other words, the events E1, E2, …, En represent a partition of the sample space S if they are pairwise
disjoint, exhaustive and have nonzero probabilities.
As an example, we see that any nonempty event E and its complement E' form a partition of the sample
space S since they satisfy E ∩ E' = ϕ and E ∪ E' = S.
From the Venn diagram, one can easily observe that if E and F are any two events associated with a
sample space S, then the set {E ∩ F’, E ∩ F, E' ∩ F, E' ∩ F’} is a partition of the sample space S. It may
be mentioned that the partition of a sample space is not unique. There can be several partitions of the
same sample space.
We shall now prove a theorem known as Theorem of total probability.
Theorem of total probability
Let {E1, E2,..., En} be a partition of the sample space S, and suppose that each of the events E1, E2,..., En
has nonzero probability of occurrence. Let A be any event associated with S, then
P(A) = P(E1) P(A|E1) + P(E2) P(A|E2) + ... + P(En) P(A|En)
= ∑_{j=1}^{n} P(Ej) P(A|Ej)

Proof Given that E1, E2, ..., En is a partition of the sample space S. Therefore,
S = E1 ∪ E2 ∪ ... ∪ En ... (1)
and Ei ∩ Ej = ϕ, i ≠ j, i, j = 1, 2, ..., n
Now, we know that for any event A,
A = A ∩ S
= A ∩ (E1 ∪ E2 ∪ ... ∪ En)
= (A ∩ E1) ∪ (A ∩ E2) ∪ ... ∪ (A ∩ En)
Also A ∩ Ei and A ∩ Ej are respectively the subsets of Ei and Ej. We know that Ei and Ej are disjoint for
i ≠ j; therefore, A ∩ Ei and A ∩ Ej are also disjoint for all i ≠ j, i, j = 1, 2, ..., n.
Thus, P(A) = P[(A ∩ E1) ∪ (A ∩ E2) ∪ ... ∪ (A ∩ En)]
= P(A ∩ E1) + P(A ∩ E2) + ... + P(A ∩ En)
Now, by the multiplication rule of probability, we have
P(A ∩ Ei) = P(Ei) P(A|Ei), as P(Ei) ≠ 0 ∀ i = 1, 2, ..., n
Therefore, P(A) = P(E1) P(A|E1) + P(E2) P(A|E2) + ... + P(En) P(A|En)
or P(A) = ∑_{j=1}^{n} P(Ej) P(A|Ej)
Example 15 A person has undertaken a construction job. The probabilities are 0.65 that there will be a
strike, 0.80 that the construction job will be completed on time if there is no strike, and 0.32 that the
construction job will be completed on time if there is a strike. Determine the probability that the
construction job will be completed on time.
Solution Let A be the event that the construction job will be completed on time, and B be the event
that there will be a strike. We have to find P(A).
We have
P(B) = 0.65, P(no strike) = P(B′) = 1 − P(B) = 1 − 0.65 = 0.35
P(A|B) = 0.32, P(A|B′) = 0.80

Since events B and B′ form a partition of the sample space S, therefore, by theorem on total probability,
we have
P(A) = P(B) P(A|B) + P(B′) P(A|B′)
= 0.65 × 0.32 + 0.35 × 0.8
= 0.208 + 0.28 = 0.488
Thus, the probability that the construction job will be completed on time is 0.488.
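The theorem of total probability translates directly into a short function. The following is a sketch only: the function name total_probability and its argument names are illustrative, and the check that the priors add up to 1 is an added safeguard rather than part of the theorem's statement.

```python
def total_probability(priors, likelihoods):
    """P(A) = sum of P(Ej) * P(A|Ej) over a partition E1, ..., En."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("the priors of a partition must add up to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

p_strike = 0.65
p_on_time = total_probability(
    priors=[p_strike, 1 - p_strike],      # P(B), P(B')
    likelihoods=[0.32, 0.80],             # P(A|B), P(A|B')
)
print(round(p_on_time, 3))                # 0.488
```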
We shall now state and prove the Bayes' theorem.
Bayes’ Theorem If E1, E2, ..., En are n non-empty events which constitute a partition of sample space S,
i.e. E1, E2, ..., En are pairwise disjoint and E1 ∪ E2 ∪ ... ∪ En = S, and A is any event of nonzero probability,
then
P(Ei|A) = P(Ei) P(A|Ei) / ∑_{j=1}^{n} P(Ej) P(A|Ej)
for any i = 1, 2, 3, ..., n

Proof By the formula of conditional probability, we know that
P(Ei|A) = P(A ∩ Ei)/P(A)
= P(Ei) P(A|Ei)/P(A) (by the multiplication rule of probability)
= P(Ei) P(A|Ei) / ∑_{j=1}^{n} P(Ej) P(A|Ej) (by the theorem of total probability)

Remark The following terminology is generally used when Bayes' theorem is applied.
The events E1, E2,..., En are called hypotheses.
The probability P(Ei) is called the a priori probability of the hypothesis Ei.
The conditional probability P(Ei|A) is called the a posteriori probability of the hypothesis Ei.
Bayes' theorem is also called the formula for the probability of "causes". Since the Ei's are a partition
of the sample space S, one and only one of the events Ei occurs (i.e. one of the events Ei must occur
and only one can occur). Hence, the above formula gives us the probability of a particular Ei (i.e. a
"Cause"), given that the event A has occurred.
Bayes' theorem has applications in a variety of situations, a few of which are illustrated in the following
examples.
Example 16 Bag I contains 3 red and 4 black balls while another Bag II contains 5 red and 6 black balls.
One ball is drawn at random from one of the bags and it is found to be red. Find the probability that
it was drawn from Bag II.

Solution Let E1 be the event of choosing the bag I, E2 the event of choosing the bag II and A be the
event of drawing a red ball.
Then P(E1) = P(E2) = 1/2
Also P(A|E1) = P(drawing a red ball from Bag I) = 3/7
and P(A|E2) = P(drawing a red ball from Bag II) = 5/11
Now, the probability of drawing a ball from Bag II, being given that it is red, is P(E2|A).
By using Bayes’ theorem, we have
P(E2|A) = P(E2) P(A|E2) / [P(E1) P(A|E1) + P(E2) P(A|E2)]
= (1/2 × 5/11) / (1/2 × 3/7 + 1/2 × 5/11) = 35/68
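Bayes' theorem can be sketched as a small function that returns the a posteriori probability of every hypothesis at once. The name bayes_posteriors is illustrative, and exact fractions are an implementation choice made here so that the result can be compared with 35/68 directly.

```python
from fractions import Fraction

def bayes_posteriors(priors, likelihoods):
    """Return [P(E1|A), ..., P(En|A)] for a partition E1, ..., En."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Ei) P(A|Ei)
    p_a = sum(joint)                                       # total probability P(A)
    if p_a == 0:
        raise ValueError("A must have nonzero probability")
    return [j / p_a for j in joint]

priors = [Fraction(1, 2), Fraction(1, 2)]          # P(E1), P(E2): Bag I, Bag II
likelihoods = [Fraction(3, 7), Fraction(5, 11)]    # P(red | Bag I), P(red | Bag II)
posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors[1])                               # 35/68
```

The same call with three or four hypotheses reproduces the other worked examples, since only the lists of priors and likelihoods change.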
Example 17 Given three identical boxes I, II and III, each containing two coins. In box I, both coins are
gold coins, in box II, both are silver coins and in the box III, there is one gold and one silver coin. A
person chooses a box at random and takes out a coin. If the coin is of gold, what is the probability that
the other coin in the box is also of gold?
Solution Let E1, E2 and E3 be the events that boxes I, II and III are chosen, respectively.
Then P(E1) = P(E2) = P(E3) = 1/3
Also, let A be the event that ‘the coin drawn is of gold’
Then P(A|E1) = P(a gold coin from box I) = 2/2 = 1
P(A|E2) = P(a gold coin from box II) = 0
P(A|E3) = P(a gold coin from box III) = 1/2
Now, the probability that the other coin in the box is of gold
= the probability that the gold coin is drawn from box I
= P(E1|A)
By Bayes’ theorem, we know that
P(E1|A) = P(E1) P(A|E1) / [P(E1) P(A|E1) + P(E2) P(A|E2) + P(E3) P(A|E3)]
= (1/3 × 1) / (1/3 × 1 + 1/3 × 0 + 1/3 × 1/2) = 2/3
Example 18 Suppose that the reliability of an HIV test is specified as follows:
Of people having HIV, 90% of the tests detect the disease but 10% go undetected. Of people free of HIV,
99% of the tests are judged HIV–ive but 1% are diagnosed as showing HIV+ive. From a large population
of which only 0.1% have HIV, one person is selected at random, given the HIV test, and the pathologist
reports him/her as HIV+ive. What is the probability that the person actually has HIV?
Solution Let E denote the event that the person selected is actually having HIV and A the event that
the person's HIV test is diagnosed as +ive. We need to find P(E|A).
Also E′ denotes the event that the person selected is actually not having HIV.

Clearly, {E, E′} is a partition of the sample space of all people in the population. We are given that

P(E) = 0.1% = 0.1/100 = 0.001
P(E′) = 1 – P(E) = 0.999
P(A|E) = P(Person tested as HIV+ive given that he/she is actually having HIV)
= 90% = 90/100 = 0.9
and P(A|E′) = P(Person tested as HIV+ive given that he/she is actually not having HIV)
= 1% = 1/100 = 0.01
Now, by Bayes’ theorem
P(E|A) = P(E) P(A|E) / [P(E) P(A|E) + P(E′) P(A|E′)]
= (0.001 × 0.9) / (0.001 × 0.9 + 0.999 × 0.01) = 90/1089
= 0.083 approx.
Thus, the probability that a person selected at random is actually having HIV given that he/she is tested
HIV+ive is 0.083.
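The smallness of this answer is easier to see with whole numbers of people. The sketch below assumes a hypothetical population of 100,000 (a figure chosen only for illustration, not given in the example) and counts how many positive reports come from people who actually have HIV.

```python
from fractions import Fraction

population = 100_000                      # hypothetical population size (assumption)
infected = population * 1 // 1000         # 0.1% have HIV, i.e. 100 people
healthy = population - infected           # 99900 people

true_positives = infected * 90 // 100     # 90% of the infected test +ive, i.e. 90
false_positives = healthy * 1 // 100      # 1% of the healthy test +ive, i.e. 999

p_infected_given_positive = Fraction(true_positives, true_positives + false_positives)
print(p_infected_given_positive)                    # 10/121, the reduced form of 90/1089
print(round(float(p_infected_given_positive), 3))   # 0.083
```

Most positive reports come from the much larger group of people without HIV, which is why the probability stays well below the 90% detection rate.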
Example 19 In a factory which manufactures bolts, machines A, B and C manufacture respectively 25%,
35% and 40% of the bolts. Of their outputs, 5, 4 and 2 percent are respectively defective bolts. A bolt
is drawn at random from the product and is found to be defective. What is the probability that it is
manufactured by the machine B?
Solution Let events B1, B2, B3 be the following :
B1 : the bolt is manufactured by machine A
B2 : the bolt is manufactured by machine B
B3 : the bolt is manufactured by machine C
Clearly, B1, B2, B3 are mutually exclusive and exhaustive events and hence, they represent a partition of
the sample space.
Let the event E be ‘the bolt is defective’.
The event E occurs with B1 or with B2 or with B3. Given that,
P(B1) = 25% = 0.25, P (B2) = 0.35 and P(B3) = 0.40
Again P(E|B1) = Probability that the bolt drawn is defective given that it is manufactured by machine A
= 5% = 0.05
Similarly, P(E|B2) = 0.04, P(E|B3) = 0.02.
Hence, by Bayes’ Theorem, we have
P(B2|E) = P(B2) P(E|B2) / [P(B1) P(E|B1) + P(B2) P(E|B2) + P(B3) P(E|B3)]
= (0.35 × 0.04) / (0.25 × 0.05 + 0.35 × 0.04 + 0.40 × 0.02)
= 0.0140/0.0345 = 28/69
Example 20 A doctor is to visit a patient. From past experience, it is known that the probabilities
that he will come by train, bus, scooter or by other means of transport are respectively
3/10, 1/5, 1/10 and 2/5. The probabilities that he will be late are 1/4, 1/3 and 1/12, if he comes by train, bus and scooter
respectively, but if he comes by other means of transport, then he will not be late. When he arrives,
he is late. What is the probability that he comes by train?
Solution Let E be the event that the doctor visits the patient late and let T1, T2, T3, T4 be the events that
the doctor comes by train, bus, scooter, and other means of transport respectively.
Then P(T1) = 3/10, P(T2) = 1/5, P(T3) = 1/10 and P(T4) = 2/5 (given)
P(E|T1) = Probability that the doctor arrives late, given that he comes by train = 1/4
Similarly, P(E|T2) = 1/3, P(E|T3) = 1/12 and P(E|T4) = 0, since he is not late if he comes by other means of
transport.
Therefore, by Bayes’ Theorem, we have
P(T1|E) = Probability that the doctor arriving late comes by train
= P(T1) P(E|T1) / [P(T1) P(E|T1) + P(T2) P(E|T2) + P(T3) P(E|T3) + P(T4) P(E|T4)]
= (3/10 × 1/4) / (3/10 × 1/4 + 1/5 × 1/3 + 1/10 × 1/12 + 2/5 × 0) = 3/40 × 120/18 = 1/2
Hence, the required probability is 1/2.
Example 21 A man is known to speak truth 3 out of 4 times. He throws a die and reports that it is a
six. Find the probability that it is actually a six.
Solution Let E be the event that the man reports that six occurs in the throwing of the die and let S1
be the event that six occurs and S2 be the event that six does not occur.
Then P(S1) = Probability that six occurs = 1/6
P(S2) = Probability that six does not occur = 5/6
P(E|S1) = Probability that the man reports that six occurs when six has actually occurred on the die
= Probability that the man speaks the truth = 3/4
P(E|S2) = Probability that the man reports that six occurs when six has not actually occurred on the die
= Probability that the man does not speak the truth = 1 − 3/4 = 1/4
Thus, by Bayes’ theorem, we get
P(S1|E) = Probability that the report of the man that six has occurred is actually a six
= P(S1) P(E|S1) / [P(S1) P(E|S1) + P(S2) P(E|S2)]
= (1/6 × 3/4) / (1/6 × 3/4 + 5/6 × 1/4) = 1/8 × 24/8 = 3/8
Hence, the required probability is 3/8.
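The answer 3/8 of Example 21 can be cross-checked by simulation. The sketch below is a rough experiment rather than a proof: the number of trials and the seed are arbitrary choices, and it follows the solution's model in which an untruthful report about a non-six is always "six".

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate P(actually a six | the man reports a six) by repeated trials."""
    rng = random.Random(seed)
    reported_six = actual_six = 0
    for _ in range(trials):
        is_six = rng.randint(1, 6) == 6
        truthful = rng.random() < 0.75          # he speaks the truth 3 times out of 4
        # A truthful report matches the die; an untruthful one says the opposite.
        says_six = is_six if truthful else not is_six
        if says_six:
            reported_six += 1
            actual_six += is_six
    return actual_six / reported_six

estimate = simulate()
print(estimate)        # should lie close to 3/8 = 0.375
```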
Random Variables and their Probability Distributions
We have already learnt about random experiments and formation of sample spaces. In most of these
experiments, we were interested not only in the particular outcome that occurs but rather in some
number associated with that outcome, as shown in the following examples/experiments.
(i) In tossing two dice, we may be interested in the sum of the numbers on the two dice.
(ii) In tossing a coin 50 times, we may want the number of heads obtained.

(iii) In the experiment of taking out four articles (one after the other) at random from a lot of 20 articles
in which 6 are defective, we want to know the number of defectives in the sample of four and not in
the particular sequence of defective and nondefective articles.
In all of the above experiments, we have a rule which assigns to each outcome of the experiment a
single real number. This single real number may vary with different outcomes of the experiment. Hence,
it is a variable. Also its value depends upon the outcome of a random experiment and, hence, is called
random variable. A random variable is usually denoted by X.
If you recall the definition of a function, you will realise that the random variable X is, strictly speaking, a
function whose domain is the set of outcomes (or sample space) of a random experiment. A random
variable can take any real value, therefore, its co-domain is the set of real numbers. Hence, a random
variable can be defined as follows :
Definition 4 A random variable is a real valued function whose domain is the sample space of a random
experiment.
For example, let us consider the experiment of tossing a coin two times in succession.
The sample space of the experiment is S = {HH, HT, TH, TT}.
If X denotes the number of heads obtained, then X is a random variable and for each outcome, its value
is as given below :
X (HH) = 2, X (HT) = 1, X (TH) = 1, X (TT) = 0.
More than one random variable can be defined on the same sample space. For example, let Y denote
the number of heads minus the number of tails for each outcome of the above sample space S.
Then Y(HH) = 2, Y (HT) = 0, Y (TH) = 0, Y (TT) = – 2.
Thus, X and Y are two different random variables defined on the same sample space S.
Example 22 A person plays a game of tossing a coin thrice. For each head, he is given Rs 2 by the
organiser of the game and for each tail, he has to give Rs 1.50 to the organiser. Let X denote the amount
gained or lost by the person. Show that X is a random variable and exhibit it as a function on the
sample space of the experiment.
Solution X is a number whose values are defined on the outcomes of a random experiment. Therefore,
X is a random variable.
Now, sample space of the experiment is
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Then X(HHH) = Rs (2 × 3) = Rs 6
X(HHT) = X(HTH) = X(THH) = Rs (2 × 2 − 1 × 1.50) = Rs 2.50
X(HTT) = X(THT) = X(TTH) = Rs (1 × 2 − 2 × 1.50) = − Re 1
and X(TTT) = − Rs (3 × 1.50) = − Rs 4.50
where the minus sign shows the loss to the player. Thus, for each element of the sample space, X takes
a unique value; hence, X is a function on the sample space whose range is
{– 1, 2.50, – 4.50, 6}
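Since a random variable is a function on the sample space, the X of Example 22 can be written as an ordinary function of an outcome. This is a minimal sketch; representing outcomes as strings such as "HHT" is a choice made here for illustration.

```python
from itertools import product

S = ["".join(t) for t in product("HT", repeat=3)]   # sample space, 8 outcomes

def X(outcome):
    """Net gain in rupees: +2 for each head, -1.50 for each tail."""
    return 2 * outcome.count("H") - 1.5 * outcome.count("T")

values = {outcome: X(outcome) for outcome in S}
print(values["HHT"], values["TTT"])      # 2.5 -4.5
print(sorted(set(values.values())))      # [-4.5, -1.0, 2.5, 6.0]
```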
Example 23 A bag contains 2 white balls and 1 red ball. One ball is drawn at random and then put back in
the bag after noting its colour. The process is repeated again. If X denotes the number of red balls
recorded in the two draws, describe X.

Solution Let the balls in the bag be denoted by w1, w2, r. Then the sample space is
S = {w1 w1, w1 w2, w2 w2, w2 w1, w1 r, w2 r, r w1, r w2, r r}
Now, for ω ∈ S
X(ω) = number of red balls
Therefore
X({w1 w1}) = X({w1 w2}) = X({w2 w2}) = X({w2 w1}) = 0
X({w1 r}) = X({w2 r}) = X({r w1}) = X({r w2}) = 1 and X({r r}) = 2
Thus, X is a random variable which can
take values 0, 1 or 2.
Probability distribution of a random variable
Let us look at the experiment of selecting one family out of ten families f1, f2,..., f10 in such a manner
that each family is equally likely to be selected. Let the families f1, f2,..., f10 have 3, 4, 3, 2, 5, 4, 3, 6, 4,
5 members, respectively.
Let us select a family and note down the number of members in the family, denoted by X. Clearly, X is a
random variable defined as below :
X (f1) = 3, X (f2) = 4, X (f3) = 3, X (f4) = 2, X (f5) = 5, X (f6) = 4, X(f7) = 3, X(f8) = 6, X(f9) = 4, X(f10) = 5
Thus, X can take any value 2,3,4,5 or 6 depending upon which family is selected.
Now, X will take the value 2 when the family f4 is selected. X can take the value 3 when any one of the
families f1, f3, f7 is selected.
Similarly, X = 4, when family f2, f6 or f9 is selected,
X = 5, when family f5 or f10 is selected
and X = 6, when family f8 is selected.
Since we had assumed that each family is equally likely to be selected, the probability that family f4 is
selected is 1/10.
Thus, the probability that X can take the value 2 is 1/10. We write P(X = 2) = 1/10
Also, the probability that any one of the families f1, f3 or f7 is selected is
P({f1, f3, f7}) = 3/10
Thus, the probability that X can take the value 3 = 3/10
We write P(X = 3) = 3/10
Similarly, we obtain
P(X = 4) = P({f2, f6, f9}) = 3/10
P(X = 5) = P({f5, f10}) = 2/10
and P(X = 6) = P({f8}) = 1/10
Such a description giving the values of the random variable along with the corresponding probabilities
is called the probability distribution of the random variable X.
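The distribution obtained above for the ten families can be tabulated by counting how often each value of X occurs. The following sketch assumes, as in the text, that each family is equally likely to be selected; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

members = [3, 4, 3, 2, 5, 4, 3, 6, 4, 5]       # X(f1), ..., X(f10)

counts = Counter(members)                       # number of families for each value
distribution = {x: Fraction(c, len(members)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)          # 2/10 is printed in its reduced form 1/5
```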
In general, the probability distribution of a random variable X is defined as follows:
Definition 5 The probability distribution of a random variable X is the system of numbers
X : x1 x2 ... xn
P(X) : p1 p2 ... pn

where pi > 0, ∑_{i=1}^{n} pi = 1, i = 1, 2, ..., n
The real numbers x1, x2,..., xn are the possible values of the random variable X and pi (i = 1,2,..., n) is the
probability of the random variable X taking the value xi i.e., P(X = xi) = pi
Note If xi is one of the possible values of a random variable X, the statement X = xi is true only at some
point (s) of the sample space. Hence, the probability that X takes value xi is always nonzero, i.e. P(X =
xi) ≠ 0.
Also for all possible values of the random variable X, all elements of the sample space are covered.
Hence, the sum of all the probabilities in a probability distribution must be one.
Example 24 Two cards are drawn successively with replacement from a well-shuffled deck of 52 cards.
Find the probability distribution of the number of aces.
Solution The number of aces is a random variable. Let it be denoted by X. Clearly, X can take the values
0, 1, or 2.
Now, since the draws are done with replacement, therefore, the two draws form independent
experiments.
Therefore, P(X = 0) = P(non-ace and non-ace)
= P(non-ace) × P(non-ace)
= 48/52 × 48/52 = 144/169
P(X = 1) = P(ace and non-ace or non-ace and ace)
= P(ace and non-ace) + P(non-ace and ace)
= P(ace). P(non-ace) + P(non-ace). P(ace)
= 4/52 × 48/52 + 48/52 × 4/52 = 24/169
and P(X = 2) = P(ace and ace)
= 4/52 × 4/52 = 1/169
Thus, the required probability distribution is

X     0        1       2
P(X)  144/169  24/169  1/169

Example 25 Find the probability distribution of number of doublets in three throws of a pair of dice.
Solution Let X denote the number of doublets. Possible doublets are (1,1), (2,2), (3,3), (4,4), (5,5), (6,6)
Clearly, X can take the value 0, 1, 2, or 3.
Probability of getting a doublet = 6/36 = 1/6
Probability of not getting a doublet = 1 − 1/6 = 5/6
Now P(X = 0) = P(no doublet) = 5/6 × 5/6 × 5/6 = 125/216
P(X = 1) = P(one doublet and two non-doublets)
= 1/6 × 5/6 × 5/6 + 5/6 × 1/6 × 5/6 + 5/6 × 5/6 × 1/6
= 3 (1/6 × 5²/6²) = 75/216
P(X = 2) = P(two doublets and one non-doublet)
= 1/6 × 1/6 × 5/6 + 1/6 × 5/6 × 1/6 + 5/6 × 1/6 × 1/6 = 3 (1/6² × 5/6) = 15/216
and P(X = 3) = P(three doublets)
= 1/6 × 1/6 × 1/6 = 1/216
Thus, the required probability distribution is
X     0        1       2       3
P(X)  125/216  75/216  15/216  1/216

Verification Sum of the probabilities
∑_{i=1}^{n} pi = 125/216 + 75/216 + 15/216 + 1/216
= (125 + 75 + 15 + 1)/216 = 216/216 = 1
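The distribution of Example 25 can be rebuilt by listing every pattern of doublet and non-doublet over the three throws and adding up the probabilities of the patterns with the same number of doublets. This is a sketch under the example's assumption that the throws are independent; the names used are illustrative.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 6)                      # P(doublet) on one throw of a pair of dice
distribution = {k: Fraction(0) for k in range(4)}

# Each throw is a doublet (True) or not (False); the throws are independent.
for throws in product([True, False], repeat=3):
    weight = Fraction(1)
    for is_doublet in throws:
        weight *= p if is_doublet else 1 - p
    distribution[sum(throws)] += weight     # sum(throws) counts the doublets

for k, prob_k in distribution.items():
    print(k, prob_k)     # fractions appear reduced, e.g. 75/216 as 25/72
```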
Example 26 Let X denote the number of hours you study during a randomly selected school day. The
probability that X can take the value x has the following form, where k is some unknown constant.
P(X = x) = 0.1, if x = 0
         = kx, if x = 1 or 2
         = k(5 − x), if x = 3 or 4
         = 0, otherwise
(a) Find the value of k.
(b) What is the probability that you study at least two hours ? Exactly two hours? At most two hours?
Solution The probability distribution of X is
X 0 1 2 3 4
P(X) 0.1 k 2k 2k k
(a) We know that ∑_{i=1}^{n} pi = 1
Therefore 0.1 + k + 2k + 2k + k = 1
i.e. k = 0.15
(b) P(you study at least two hours) = P(X ≥ 2)
= P(X = 2) + P (X = 3) + P (X = 4)
= 2k + 2k + k = 5k = 5 × 0.15 = 0.75
P(you study exactly two hours) = P(X = 2)
= 2k = 2 × 0.15 = 0.3
P(you study at most two hours) = P(X ≤ 2)
= P (X = 0) + P(X = 1) + P(X = 2)
= 0.1 + k + 2k = 0.1 + 3k = 0.1 + 3 × 0.15
= 0.55
Mean of a random variable
In many problems, it is desirable to describe some feature of the random variable by means of a single

number that can be computed from its probability distribution. A few such numbers are the mean, median
and mode. In this section, we shall discuss mean only. Mean is a measure of location or central
tendency in the sense that it roughly locates a middle or average value of the random variable.
Definition 6 Let X be a random variable whose possible values x1, x2, x3, ..., xn occur with probabilities p1,
p2, p3, ..., pn, respectively. The mean of X, denoted by µ, is the number ∑_{i=1}^{n} xi pi, i.e. the mean of X is the
weighted average of the possible values of X, each value being weighted by the probability with which
it occurs.
The mean of a random variable X is also called the expectation of X, denoted by E(X).
Thus E(X) = μ = ∑_{i=1}^{n} xi pi = x1 p1 + x2 p2 + ... + xn pn.
In other words, the mean or expectation of a random variable X is the sum of the products of all
possible values of X by their respective probabilities.
Example 27 Let a pair of dice be thrown and the random variable X be the sum of the numbers that
appear on the two dice. Find the mean or expectation of X.
Solution The sample space of the experiment consists of 36 elementary events in the form of ordered
pairs (xi, yi), where xi = 1, 2, 3, 4, 5, 6 and yi = 1, 2, 3, 4, 5, 6.
The random variable X i.e. the sum of the numbers on the two dice takes the values 2, 3, 4, 5, 6, 7, 8,
9, 10, 11 or 12.
Now P(X = 2) = P({(1,1)}) = 1/36
P(X = 3) = P({(1,2), (2,1)}) = 2/36
P(X = 4) = P({(1,3), (2,2), (3,1)}) = 3/36
P(X = 5) = P({(1,4), (2,3), (3,2), (4,1)}) = 4/36
P(X = 6) = P({(1,5), (2,4), (3,3), (4,2), (5,1)}) = 5/36
P(X = 7) = P({(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}) = 6/36
P(X = 8) = P({(2,6), (3,5), (4,4), (5,3), (6,2)}) = 5/36
P(X = 9) = P({(3,6), (4,5), (5,4), (6,3)}) = 4/36
P(X = 10) = P({(4,6), (5,5), (6,4)}) = 3/36
P(X = 11) = P({(5,6), (6,5)}) = 2/36
P(X = 12) = P({(6,6)}) = 1/36
The probability distribution of X is
X or xi     2     3     4     5     6     7     8     9     10    11    12
P(X) or pi  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Therefore,
μ = E(X) = ∑_{i=1}^{n} xi pi = 2 × 1/36 + 3 × 2/36 + 4 × 3/36 + 5 × 4/36
+ 6 × 5/36 + 7 × 6/36 + 8 × 5/36 + 9 × 4/36 + 10 × 3/36 + 11 × 2/36 + 12 × 1/36
= (2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12)/36 = 7
Thus, the mean of the sum of the numbers that appear on throwing two fair dice is 7.
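The table and the mean in Example 27 can be reproduced by enumerating the 36 ordered pairs. The sketch below assumes two fair dice, as in the example; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))          # 36 ordered pairs
counts = Counter(a + b for a, b in outcomes)             # how often each sum occurs
distribution = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}

mean = sum(x * p for x, p in distribution.items())       # E(X) = sum of xi * pi
print(mean)                                              # 7
```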
Variance of a random variable
The mean of a random variable does not give us information about the variability in the values of the
random variable. In fact, if the variance is small, then the values of the random variable are close to
the mean. Also random variables with different probability distributions can have equal means, as
shown in the following distributions of X and Y.
X     1    2    3    4
P(X)  1/8  2/8  3/8  2/8

Y     −1   0    4    5    6
P(Y)  1/8  2/8  3/8  1/8  1/8
Clearly E(X) = 1 × 1/8 + 2 × 2/8 + 3 × 3/8 + 4 × 2/8 = 22/8 = 2.75
and E(Y) = −1 × 1/8 + 0 × 2/8 + 4 × 3/8 + 5 × 1/8 + 6 × 1/8 = 22/8 = 2.75
The variables X and Y are different; however, their means are the same. This is also easily observable from the
diagrammatic representation of these distributions.

To distinguish X from Y, we require a measure of the extent to which the values of the random variables
spread out. In Statistics, we have studied that the variance is a measure of the spread or scatter in
data. Likewise, the variability or spread in the values of a random variable may be measured by variance.
Definition 7 Let X be a random variable whose possible values x1, x2,...,xn occur with probabilities p(x1),
p(x2),..., p(xn) respectively.
Let µ = E(X) be the mean of X. The variance of X, denoted by Var(X) or σx², is defined as
σx² = Var(X) = ∑_{i=1}^{n} (xi − μ)² p(xi)
or equivalently σx² = E(X − μ)²
The non-negative number
σx = √Var(X) = √[∑_{i=1}^{n} (xi − μ)² p(xi)]
is called the standard deviation of the random variable X.


Another formula to find the variance of a random variable. We know that,
Var(X) = ∑_{i=1}^{n} (xi − μ)² p(xi)
= ∑_{i=1}^{n} (xi² + μ² − 2μ xi) p(xi)
= ∑_{i=1}^{n} xi² p(xi) + ∑_{i=1}^{n} μ² p(xi) − ∑_{i=1}^{n} 2μ xi p(xi)
= ∑_{i=1}^{n} xi² p(xi) + μ² ∑_{i=1}^{n} p(xi) − 2μ ∑_{i=1}^{n} xi p(xi)
= ∑_{i=1}^{n} xi² p(xi) + μ² − 2μ² [since ∑_{i=1}^{n} p(xi) = 1 and μ = ∑_{i=1}^{n} xi p(xi)]
= ∑_{i=1}^{n} xi² p(xi) − μ²
or Var(X) = ∑_{i=1}^{n} xi² p(xi) − [∑_{i=1}^{n} xi p(xi)]²
or Var(X) = E(X²) − [E(X)]², where E(X²) = ∑_{i=1}^{n} xi² p(xi)
Example 28 Find the variance of the number obtained on a throw of an unbiased die.
Solution The sample space of the experiment is S = {1, 2, 3, 4, 5, 6}.
Let X denote the number obtained on the throw. Then X is a random variable which can take values 1,
2, 3, 4, 5, or 6.
Also P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
Therefore, the probability distribution of X is
X     1    2    3    4    5    6
P(X)  1/6  1/6  1/6  1/6  1/6  1/6

Now E(X) = ∑_{i=1}^{n} xi p(xi)
= 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 21/6
Also E(X²) = 1² × 1/6 + 2² × 1/6 + 3² × 1/6 + 4² × 1/6 + 5² × 1/6 + 6² × 1/6 = 91/6
Thus Var(X) = E(X²) − [E(X)]²
= 91/6 − (21/6)² = 91/6 − 441/36 = 35/12
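Both variance formulas can be evaluated side by side for the die of Example 28 to confirm that they agree. This is a minimal sketch; the names var_definition and var_shortcut are illustrative labels for Definition 7 and for the formula E(X²) − [E(X)]².

```python
from fractions import Fraction

distribution = {x: Fraction(1, 6) for x in range(1, 7)}    # unbiased die

mean = sum(x * p for x, p in distribution.items())                          # E(X)
var_definition = sum((x - mean) ** 2 * p for x, p in distribution.items())  # Definition 7
var_shortcut = sum(x * x * p for x, p in distribution.items()) - mean ** 2  # E(X^2) - [E(X)]^2

print(mean, var_definition, var_shortcut)      # 7/2 35/12 35/12
```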
Example 29 Two cards are drawn simultaneously (or successively without replacement) from a well
shuffled pack of 52 cards. Find the mean, variance and standard deviation of the number of kings.
Solution Let X denote the number of kings in a draw of two cards. X is a random variable which can
assume the values 0, 1 or 2.

Now P(X = 0) = P(no king) = 48C2/52C2 = [48!/(2!(48 − 2)!)] / [52!/(2!(52 − 2)!)] = (48 × 47)/(52 × 51) = 188/221
P(X = 1) = P(one king and one non-king) = (4C1 × 48C1)/52C2
= (4 × 48 × 2)/(52 × 51) = 32/221
and P(X = 2) = P(two kings) = 4C2/52C2 = (4 × 3)/(52 × 51) = 1/221
Thus, the probability distribution of X is

X     0        1       2
P(X)  188/221  32/221  1/221

Now Mean of X = E(X) = ∑_{i=1}^{n} xi p(xi)
= 0 × 188/221 + 1 × 32/221 + 2 × 1/221 = 34/221
Also E(X²) = ∑_{i=1}^{n} xi² p(xi)
= 0² × 188/221 + 1² × 32/221 + 2² × 1/221 = 36/221
Now Var(X) = E(X²) − [E(X)]²
= 36/221 − (34/221)² = 6800/(221)²
Therefore σx = √Var(X) = √6800/221 = 0.37
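The counting in Example 29 can be checked with the standard-library function math.comb, which computes nCr (available in Python 3.8 and later). The helper name p_kings is illustrative, and the sketch assumes that every pair of cards is equally likely to be drawn.

```python
from fractions import Fraction
from math import comb, sqrt

def p_kings(k):
    """P(X = k): k kings among 2 cards drawn without replacement from 52."""
    return Fraction(comb(4, k) * comb(48, 2 - k), comb(52, 2))

distribution = {k: p_kings(k) for k in range(3)}
mean = sum(k * p for k, p in distribution.items())
variance = sum(k * k * p for k, p in distribution.items()) - mean ** 2

for k, p in distribution.items():
    print(k, p)                     # 0 188/221, 1 32/221, 2 1/221
print(mean, variance)               # 2/13 400/2873, i.e. 34/221 and 6800/221² reduced
print(round(sqrt(variance), 2))     # 0.37
```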
Bernoulli Trials and Binomial Distribution
Bernoulli trials
Many experiments are dichotomous in nature. For example, a tossed coin shows a ‘head’ or ‘tail’, a
manufactured item can be ‘defective’ or ‘non-defective’, the response to a question might be ‘yes’ or
‘no’, an egg has ‘hatched’ or ‘not hatched’, the decision is ‘yes’ or ‘no’ etc. In such cases, it is customary
to call one of the outcomes a ‘success’ and the other ‘not success’ or ‘failure’. For example, in tossing
a coin, if the occurrence of the head is considered a success, then occurrence of tail is a failure.
Each time we toss a coin or roll a die or perform any other experiment, we call it a trial. If a coin is
tossed, say, 4 times, the number of trials is 4, each having exactly two outcomes, namely, success or
failure. The outcome of any trial is independent of the outcome of any other trial. In each of such trials,
the probability of success or failure remains constant. Such independent trials which have only two
outcomes, usually referred to as ‘success’ or ‘failure’, are called Bernoulli trials.
Definition 8 Trials of a random experiment are called Bernoulli trials, if they satisfy the following
conditions :
(i) There should be a finite number of trials.
(ii) The trials should be independent.
(iii) Each trial has exactly two outcomes : success or failure.
(iv) The probability of success remains the same in each trial.
For example, throwing a die 50 times is a case of 50 Bernoulli trials, in which each trial results in
success (say an even number) or failure (an odd number) and the probability of success (p) is same for

all 50 throws. Obviously, the successive throws of the die are independent experiments. If the die is
fair and has six numbers 1 to 6 written on six faces, then p = 1/2 and q = 1 – p = 1/2 = probability of failure.
Example 30 Six balls are drawn successively from an urn containing 7 red and 9 black balls. Tell whether
or not the trials of drawing balls are Bernoulli trials when after each draw the ball drawn is
(i) replaced (ii) not replaced in the urn.
Solution
(i) The number of trials is finite. When the drawing is done with replacement, the probability of success
(say, red ball) is $p = \frac{7}{16}$, which is the same for all six trials (draws). Hence, the draws with
replacement are Bernoulli trials.
(ii) When the drawing is done without replacement, the probability of success (i.e., red ball) in the first
trial is $\frac{7}{16}$, in the 2nd trial it is $\frac{6}{15}$ if the first ball drawn is red or
$\frac{7}{15}$ if the first ball drawn is black, and so on. Clearly, the probability of success is not the
same for all trials, hence the trials are not Bernoulli trials.
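The contrast between the two parts of Example 30 can be checked with exact fractions. The following is a minimal sketch; the function names are illustrative and not part of the text.

```python
from fractions import Fraction

RED, BLACK = 7, 9  # urn contents from Example 30


def p_red_with_replacement() -> Fraction:
    """Probability of red on any draw when each ball is put back."""
    return Fraction(RED, RED + BLACK)


def p_red_without_replacement(reds_drawn: int, blacks_drawn: int) -> Fraction:
    """Probability of red on the next draw, given the balls already removed."""
    remaining = RED + BLACK - reds_drawn - blacks_drawn
    return Fraction(RED - reds_drawn, remaining)


# With replacement: the same value on every draw.
print(p_red_with_replacement())
# Without replacement: the value depends on the earlier draws.
print(p_red_without_replacement(0, 0))  # first draw
print(p_red_without_replacement(1, 0))  # second draw, first ball was red (6/15 in lowest terms)
print(p_red_without_replacement(0, 1))  # second draw, first ball was black
```

Because the second-draw probability changes with the first outcome, condition (iv) of Definition 8 fails when the ball is not replaced.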
Binomial distribution
Consider the experiment of tossing a coin in which each trial results in success (say, heads) or failure
(tails). Let S and F denote respectively success and failure in each trial. Suppose we are interested in
finding the ways in which we have one success in six trials.
Clearly, six different cases are there as listed below:
SFFFFF, FSFFFF, FFSFFF, FFFSFF, FFFFSF, FFFFFS.
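The six cases above can also be generated mechanically, which makes it easier to see why listing becomes impractical as the number of successes grows. This is only an illustrative sketch; the helper name `arrangements` is an assumption, not notation from the text.

```python
from itertools import combinations
from math import comb


def arrangements(n: int, successes: int) -> list[str]:
    """List every S/F string of length n with exactly `successes` S's."""
    result = []
    for positions in combinations(range(n), successes):
        outcome = ["F"] * n
        for i in positions:
            outcome[i] = "S"  # place a success at each chosen position
        result.append("".join(outcome))
    return result


one_success = arrangements(6, 1)
print(one_success)               # the six cases listed above
print(len(arrangements(6, 2)))   # number of cases with two successes
print(comb(6, 2))                # the same count from the formula 6!/(4! x 2!)
```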
Similarly, two successes and four failures can have $\frac{6!}{4! \times 2!} = 15$ combinations. It will be a lengthy job to list all
of these ways. Therefore, calculation of probabilities of 0, 1, 2,..., n number of successes may be lengthy
and time consuming. To avoid the lengthy calculations and listing of all the possible cases, for the
probabilities of number of successes in n-Bernoulli trials, a formula is derived. For this purpose, let us
take the experiment made up of three Bernoulli trials with probabilities p and q = 1 – p for success
and failure respectively in each trial. The sample space of the experiment is the set
S = {SSS, SSF, SFS, FSS, SFF, FSF, FFS, FFF}
The number of successes is a random variable X and can take values 0, 1, 2, or 3. The probability
distribution of the number of successes is as below :
P(X = 0) = P(no success)
= P({FFF}) = P(F) P(F) P(F)
= q · q · q = $q^3$, since the trials are independent
P(X = 1) = P(one success)
= P({SFF, FSF, FFS})
= P({SFF}) + P({FSF}) + P({FFS})
= P(S) P(F) P(F) + P(F) P(S) P(F) + P(F) P(F) P(S)
= p·q·q + q·p·q + q·q·p = $3pq^2$
P(X = 2) = P(two successes)
= P({SSF, SFS, FSS})
= P({SSF}) + P({SFS}) + P({FSS})
= P(S) P(S) P(F) + P(S) P(F) P(S) + P(F) P(S) P(S)
= p·p·q + p·q·p + q·p·p = $3p^2q$
and P(X = 3) = P(three successes) = P({SSS})
= P(S) · P(S) · P(S) = $p^3$
Thus, the probability distribution of X is
X       0        1          2          3
P(X)    $q^3$    $3q^2p$    $3qp^2$    $p^3$
Also, the binomial expansion of $(q + p)^3$ is
$q^3 + 3q^2p + 3qp^2 + p^3$
Note that the probabilities of 0, 1, 2 or 3 successes are respectively the 1st, 2nd, 3rd and 4th terms in
the expansion of $(q + p)^3$.
Also, since q + p = 1, it follows that the sum of these probabilities, as expected, is 1.
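The three-trial distribution can be reproduced by brute-force enumeration of the sample space. The sketch below assumes an illustrative value p = 1/3 (any value between 0 and 1 would do) and uses a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product


def distribution_by_enumeration(n: int, p: Fraction) -> dict[int, Fraction]:
    """Add up the probability of every S/F outcome, grouped by number of successes."""
    q = 1 - p
    dist = {k: Fraction(0) for k in range(n + 1)}
    for outcome in product("SF", repeat=n):
        k = outcome.count("S")
        # Independence: an outcome's probability is the product over its trials.
        dist[k] += p**k * q ** (n - k)
    return dist


p = Fraction(1, 3)  # assumed value, for illustration only
q = 1 - p
dist = distribution_by_enumeration(3, p)

# Compare with the terms of the expansion of (q + p)^3.
print(dist[0] == q**3, dist[1] == 3 * p * q**2)
print(dist[2] == 3 * p**2 * q, dist[3] == p**3)
print(sum(dist.values()))  # total probability
```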
Thus, we may conclude that in an experiment of n-Bernoulli trials, the probabilities of 0, 1, 2,..., n
successes can be obtained as 1st, 2nd,...,(n + 1)th terms in the expansion of $(q + p)^n$. To prove this
assertion (result), let us find the probability of x-successes in an experiment of n-Bernoulli trials.
Clearly, in case of x successes (S), there will be (n – x) failures (F).
Now, x successes (S) and (n – x) failures (F) can be obtained in $\frac{n!}{x!\,(n-x)!}$ ways.
In each of these ways, the probability of x successes and (n − x) failures is
P(x successes) · P((n − x) failures)
$= \underbrace{P(S)\,P(S) \cdots P(S)}_{x \text{ times}} \cdot \underbrace{P(F)\,P(F) \cdots P(F)}_{(n-x) \text{ times}} = p^x q^{n-x}$
Thus, the probability of x successes in n-Bernoulli trials is $\frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$
or ${}^{n}C_{x}\, p^x q^{n-x}$.
Thus P(x successes) = ${}^{n}C_{x}\, p^x q^{n-x}$, x = 0, 1, 2,...,n. (q = 1 – p)

Clearly, P(x successes), i.e. ${}^{n}C_{x}\, p^x q^{n-x}$, is the (x + 1)th term in the binomial expansion of $(q + p)^n$.
Thus, the probability distribution of number of successes in an experiment consisting of n Bernoulli
trials may be obtained by the binomial expansion of $(q + p)^n$. Hence, this distribution of number of
successes X can be written as
X       0                    1                           2                             ...   x                             ...   n
P(X)    ${}^{n}C_{0}\,q^n$   ${}^{n}C_{1}\,q^{n-1}p^1$   ${}^{n}C_{2}\,q^{n-2}p^2$     ...   ${}^{n}C_{x}\,q^{n-x}p^x$     ...   ${}^{n}C_{n}\,p^n$
The above probability distribution is known as binomial distribution with parameters n and p, because
for given values of n and p, we can find the complete probability distribution.
The probability of x successes P(X = x) is also denoted by P(x) and is given by
P(x) = ${}^{n}C_{x}\, q^{n-x} p^x$, x = 0, 1,..., n. (q = 1 – p)
This P(x) is called the probability function of the binomial distribution.
A binomial distribution with n-Bernoulli trials and probability of success in each trial as p, is denoted
by B(n, p).
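The probability function P(x) translates directly into a few lines of code. This is a minimal sketch using exact fractions; the names `binomial_pmf` and `binomial_distribution` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n: int, p: Fraction, x: int) -> Fraction:
    """P(X = x) for X distributed as B(n, p): nCx * p^x * q^(n-x)."""
    if not 0 <= x <= n:
        raise ValueError("x must lie between 0 and n")
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


def binomial_distribution(n: int, p: Fraction) -> list[Fraction]:
    """The complete distribution P(0), P(1), ..., P(n)."""
    return [binomial_pmf(n, p, x) for x in range(n + 1)]


# Example: B(3, 1/2), i.e. three tosses of a fair coin.
table = binomial_distribution(3, Fraction(1, 2))
print(table)
print(sum(table))  # the probabilities add up to 1
```

Passing `Fraction` values for p keeps the results exact; a float such as 0.5 also works, but is then subject to rounding.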

Let us now take up some examples.
Example 31 If a fair coin is tossed 10 times, find the probability of
(i) exactly six heads
(ii) at least six heads
(iii) at most six heads
Solution The repeated tosses of a coin are Bernoulli trials. Let X denote the number of heads in an
experiment of 10 trials.
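Clearly, X has the binomial distribution with n = 10 and $p = \frac{1}{2}$.
Therefore P(X = x) = ${}^{n}C_{x}\, q^{n-x} p^x$, x = 0, 1, 2,…, n
Here n = 10, $p = \frac{1}{2}$, $q = 1 - p = \frac{1}{2}$
Therefore P(X = x) = ${}^{10}C_{x} \left(\frac{1}{2}\right)^{10-x} \left(\frac{1}{2}\right)^{x} = {}^{10}C_{x} \left(\frac{1}{2}\right)^{10}$
Now (i) P(X = 6) = ${}^{10}C_{6} \left(\frac{1}{2}\right)^{10} = \frac{10!}{6! \times 4!} \cdot \frac{1}{2^{10}} = \frac{105}{512}$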

(ii) P(at least six heads) = P(X ≥ 6)


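= P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10)
$= {}^{10}C_{6} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{7} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{8} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{9} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{10} \left(\frac{1}{2}\right)^{10}$
$= \left[ \frac{10!}{6! \times 4!} + \frac{10!}{7! \times 3!} + \frac{10!}{8! \times 2!} + \frac{10!}{9! \times 1!} + \frac{10!}{10!} \right] \frac{1}{2^{10}} = \frac{193}{512}$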
(iii) P(at most six heads) = P(X ≤ 6)
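= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)
$= \left(\frac{1}{2}\right)^{10} + {}^{10}C_{1} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{2} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{3} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{4} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{5} \left(\frac{1}{2}\right)^{10} + {}^{10}C_{6} \left(\frac{1}{2}\right)^{10}$
$= \frac{848}{1024} = \frac{53}{64}$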
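The three answers of Example 31 can be checked numerically. The sketch below is only a verification aid; the variable and function names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 10  # number of tosses of the fair coin


def prob_heads(x: int) -> Fraction:
    """P(X = x) = 10Cx * (1/2)^10 for a fair coin tossed 10 times."""
    return Fraction(comb(n, x), 2**n)


exactly_six = prob_heads(6)
at_least_six = sum(prob_heads(x) for x in range(6, n + 1))
at_most_six = sum(prob_heads(x) for x in range(0, 7))

print(exactly_six)   # part (i)
print(at_least_six)  # part (ii)
print(at_most_six)   # part (iii)
```

Parts (ii) and (iii) both include P(X = 6), so the two answers add up to 1 + P(X = 6) rather than to 1.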
Example 32 Ten eggs are drawn successively with replacement from a lot containing 10% defective
eggs. Find the probability that there is at least one defective egg.
Solution Let X denote the number of defective eggs in the 10 eggs drawn. Since the drawing is done
with replacement, the trials are Bernoulli trials. Clearly, X has the binomial distribution with n = 10
and $p = \frac{10}{100} = \frac{1}{10}$.
Therefore $q = 1 - p = \frac{9}{10}$
Now P(at least one defective egg) = P(X ≥ 1) = 1 − P(X = 0)
$= 1 - {}^{10}C_{0} \left(\frac{9}{10}\right)^{10} = 1 - \frac{9^{10}}{10^{10}}$
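For a numerical value of the answer to Example 32, the complement rule can be evaluated directly. This is a small sketch with illustrative variable names.

```python
from fractions import Fraction

n = 10                 # eggs drawn with replacement
p = Fraction(10, 100)  # probability that a drawn egg is defective
q = 1 - p

# P(X >= 1) = 1 - P(X = 0) = 1 - q^n
p_at_least_one = 1 - q**n

print(p_at_least_one)         # exact value as a fraction
print(float(p_at_least_one))  # decimal approximation, roughly 0.65
```

So there is roughly a 65% chance that at least one of the ten eggs drawn is defective.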
