
Business Statistics & Analytics KMBN104 UNIT-1

1) Statistics is defined as the collection, organization, analysis, interpretation and presentation of data. It has various applications across disciplines. 2) There are three main measures of central tendency - the mean, median, and mode. The mean is the average value found by dividing the sum of all values by the total number of values. The median is the middle value when values are arranged in order. 3) Calculating the mean involves summing all values and dividing by the total number of data points for individual data sets. For frequency distributions, it involves multiplying each value by its frequency, summing those products, and dividing by the total frequency. The median can be found by ordering values and taking the middle value.

BUSINESS STATISTICS & ANALYTICS

KMBN104

UNIT-1
The word statistics is derived from the Latin word Status, the Italian word
Statistica or the German word Statistik. In each case it means 'an organized political
state'.

In common usage, the word statistics refers to numerical data, for example statistics
of national income, market statistics, production statistics, import and export
statistics, etc.

DEFINITION OF STATISTICS
According to A.L. Bowley, statistics is the science of counting, and statistics may be
called the science of averages.

According to Croxton and Cowden, statistics may be defined as the collection,
presentation, analysis and interpretation of numerical data.

SCOPE OF STATISTICS
In ancient days the scope of statistics was limited, but nowadays it is so vast
and ever-expanding that it is difficult to define. The scope of
statistics may be discussed under three main parts-

1. Division of statistics
2. Importance of statistics
3. Applications of statistics in various disciplines.

The science of statistics may be classified into the following main divisions-
1. Theoretical statistics
2. Statistical methods
3. Applied statistics

MEASURES OF CENTRAL TENDENCY

The statistical measures which tell us the location or position of the central value or
central point, and so describe the central tendency of the entire mass of data, are known
as measures of central tendency.
"A measure of central tendency is a single value within the range of the entire mass
of the data that is used to represent the whole data."
KINDS OF STATISTICAL AVERAGES
The various averages (measures of central tendency) are as given below-
1. Arithmetic average or mean or arithmetic mean
2. Median
3. Mode

ARITHMETIC AVERAGE OR ARITHMETIC MEAN

Definition-
Arithmetic mean (A.M.) of a group of observations is the quotient obtained
by dividing the sum of all observations by the total number of observations. A.M. is
denoted by 𝑥̅.
Arithmetic mean = Total value of observations divided by total number of
observations.

𝑥̅ = ∑x / N

Where- N = number of observations.
∑x = total of all observations; it is simply read as "add up all the
values of x".

For individual series-
Mean = sum of all observations / no. of observations.
For discrete series-
Mean = ∑fx / ∑f, where ∑f = sum of frequencies.
For continuous series-
Mean = ∑fx / ∑f, where ∑f = sum of frequencies and
x = mid value of the class interval.

Example-1 The monthly incomes of 5 persons are given below-

1132, 1140, 1144, 1136 and 1148. Find the arithmetic mean.

Solution- The monthly incomes of the 5 persons are-

1132, 1140, 1144, 1136 and 1148.
This is an individual series, so-

Mean = sum of all observations / no. of observations.

𝑥̅ = (1132 + 1140 + 1144 + 1136 + 1148) / 5
= 5700/5 = 1140
Mean = 1140.
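The individual-series calculation above can be sketched in Python as follows. This is only an illustration and not part of the original notes; the function name `arithmetic_mean` is an assumption chosen for this example.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


# Monthly incomes from Example 1.
incomes = [1132, 1140, 1144, 1136, 1148]
print(arithmetic_mean(incomes))  # 1140.0
```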
Example-2 Calculate the mean from the following series-
Size:       6   7   8   9   10  11  12
Frequency:  5   8   9   12  6   6   4

Solution-
Size (x)    Frequency (f)    fx
6           5                30
7           8                56
8           9                72
9           12               108
10          6                60
11          6                66
12          4                48
            N or ∑f = 50     ∑fx = 440

Mean (𝑥̅) = ∑fx / ∑f = 440/50 = 8.8

Example-3 Calculate the mean from the following data-

Marks obtained:   0-10  10-20  20-30  30-40  40-50
No. of students:  10    12     20     18     10

Solution- This is an example of a continuous series-

Class interval (marks)   No. of students (f)   Mid value (x)   fx
0-10                     10                    5               50
10-20                    12                    15              180
20-30                    20                    25              500
30-40                    18                    35              630
40-50                    10                    45              450
                         ∑f = 70                               ∑fx = 1810

Mean (𝑥̅) = ∑fx / ∑f = 1810/70 = 25.86
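The discrete and continuous cases use the same rule, mean = ∑fx / ∑f, with x taken as the class mid value for a continuous series. A minimal Python sketch under that reading follows; the helper names `grouped_mean` and `class_midpoints` are assumptions made for this illustration.

```python
def class_midpoints(intervals):
    """Mid value of each (lower, upper) class interval."""
    return [(lower + upper) / 2 for lower, upper in intervals]


def grouped_mean(values, frequencies):
    """Frequency-weighted mean: sum(f * x) / sum(f)."""
    total_f = sum(frequencies)
    if total_f == 0:
        raise ValueError("total frequency must be positive")
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / total_f


# Example 2 (discrete series).
print(grouped_mean([6, 7, 8, 9, 10, 11, 12], [5, 8, 9, 12, 6, 6, 4]))  # 8.8

# Example 3 (continuous series): use class mid values as x.
marks = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
students = [10, 12, 20, 18, 10]
print(round(grouped_mean(class_midpoints(marks), students), 2))  # 25.86
```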

MEDIAN
Definition- If a group of N observations is arranged in ascending or descending order
of magnitude, then the middle value is called the median of these observations and it is
denoted by M. That is, M = ((N+1)/2)th observation.

COMPUTATION OF MEDIAN
INDIVIDUAL SERIES-
In case of individual series, the following steps are taken for calculating the
median-
1)- Arrange the data in ascending or descending order.
2)- Locate the middle value-
• If the no. of observations N is odd, then there will be a single value in the middle,
which is taken as the median. Mathematically,
Median, M = ((N+1)/2)th observation.
• If the no. of observations N is even, then there will be two middle values and the median
is the average of these two values. Mathematically,
Median, M = [(N/2)th observation + (N/2 + 1)th observation] / 2

Example 4:

Calculate the median for the following data:

310, 290, 280, 275, 225, 195, 200, 200, 190, 185, 175.
Solution-
Arranging the data in descending order, we have
310, 290, 280, 275, 225, 200, 200, 195, 190, 185, 175.
N = 11 (odd)
Then median M = ((N+1)/2)th observation
M = (11+1)/2 = 12/2 = 6th observation
The 6th observation in the data is 200.
Hence the median is 200.
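The two cases for an individual series (odd N: the single middle value; even N: the average of the two middle values) can be sketched in Python as below. The function name `median` is an assumption for this illustration; the data are those of Example 4.

```python
def median(values):
    """Median of an individual series."""
    if not values:
        raise ValueError("at least one observation is required")
    ordered = sorted(values)           # step 1: arrange the data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                     # odd N: the ((N+1)/2)th observation
        return ordered[mid]
    # even N: average of the (N/2)th and (N/2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [310, 290, 280, 275, 225, 195, 200, 200, 190, 185, 175]
print(median(data))  # 200
```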

DISCRETE FREQUENCY DISTRIBUTION-


In case of discrete frequency distribution, the following steps are taken for
calculating the median:
1. Arrange the data in ascending or descending order.
2. Obtain the sum of frequencies, ∑f = N.
3. Obtain the cumulative frequency (c.f.).
4. Find the median number by using the following formula-
Median number = ((N+1)/2)th observation.
5. The value whose cumulative frequency first includes the ((N+1)/2)th value is
taken as the median.

Example 5-
Calculate median for the following table-
Value : 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Frequency: 2 3 8 10 12 16 10 8 6 5 6 4 3 1

Solution-
Value(x) Frequency(f) Cumulative frequency(cf)
2 2 2
3 3 5
4 8 13
5 10 23
6 12 35
7 16 51
8 10 61
9 8 69
10 6 75
11 5 80
12 6 86
13 4 90
14 3 93
15 1 94

Median number, M = ((N+1)/2)th value = (94+1)/2 = 95/2 = 47.5th value.

Since the median number lies within the cumulative frequency 51, the corresponding
observation is 7. Hence, the median is 7.
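The cumulative-frequency procedure can be sketched in Python as follows. The function name `discrete_median` is an assumption for this illustration. It follows the rule used in this section: take the first value whose cumulative frequency reaches the ((N+1)/2)th position, without averaging two neighbouring values.

```python
from itertools import accumulate


def discrete_median(values, frequencies):
    """Median of a discrete frequency distribution via cumulative frequency."""
    pairs = sorted(zip(values, frequencies))      # step 1: order by value
    n = sum(f for _, f in pairs)                  # step 2: N = sum of f
    if n == 0:
        raise ValueError("total frequency must be positive")
    position = (n + 1) / 2                        # step 4: median number
    cumulative = accumulate(f for _, f in pairs)  # step 3: running c.f.
    for (value, _), cf in zip(pairs, cumulative):
        if cf >= position:                        # step 5: first c.f. covering it
            return value


values = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
freqs = [2, 3, 8, 10, 12, 16, 10, 8, 6, 5, 6, 4, 3, 1]
print(discrete_median(values, freqs))  # 7
```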

CONTINUOUS FREQUENCY DISTRIBUTION-


In case of continuous frequency distribution, the following steps are taken for the
calculation-
1- Make the classes in exclusive form.
2- Find the total frequency, N = ∑f and median number N/2.
3- Calculate cumulative frequency c.f.
4- With the help of the median number find the median class. The class in which
the median number falls is called the median class.
5- Apply the following formula to locate the median.

Median, M = L1 + ((N/2 - C) / f) × (L2 - L1)

Where, M = median
L1 = lower limit of the median class.
L2 = upper limit of the median class.
f = frequency of the median class.
C = cumulative frequency of the class preceding the median class.
N = sum of frequencies.

Example 6 -

Find the median from the following distribution table-

Marks obtained:   0-5  5-10  10-15  15-20  20-25  25-30  30-35
No. of students:  4    6     10     16     12     8      4

Solution-

Marks (c.i.)   No. of students (f)   Cumulative frequency (c.f.)
0-5            4                     4
5-10           6                     10
10-15          10                    20
15-20          16                    36
20-25          12                    48
25-30          8                     56
30-35          4                     60
               ∑f = 60

Median number, m = N/2 = 60/2 = 30.

The median number lies within the cumulative frequency 36; the corresponding class interval,
15-20, is the median class.

L1 = 15, L2 = 20, f = 16, C = 20, N = 60
Now, median M
M = L1 + ((N/2 - C) / f) × (L2 - L1)
= 15 + ((60/2 - 20) / 16) × (20 - 15)
= 15 + ((30 - 20) / 16) × 5
= 15 + (10 × 5)/16
= 15 + 3.125
= 18.125
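The interpolation used in Example 6, M = L1 + ((N/2 - C) / f) × (L2 - L1), can be sketched in Python as follows. The function name `continuous_median` is an assumption for this illustration, and the classes are assumed to be in exclusive form and listed in ascending order.

```python
def continuous_median(intervals, frequencies):
    """Median of a continuous series by interpolation within the median class."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2                      # median number N/2
    cum_before = 0                    # C: c.f. of the classes before this one
    for (lower, upper), f in zip(intervals, frequencies):
        if f > 0 and cum_before + f >= half:
            # this is the median class: interpolate inside it
            return lower + (half - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("median class not found")


marks = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
students = [4, 6, 10, 16, 12, 8, 4]
print(continuous_median(marks, students))  # 18.125
```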

MODE
The mode is the number which appears more often than any other number in a given
set. It is quoted as a typical value of the variable.
The value of the variable for which the frequency is maximum is called the mode or
modal value and it is denoted by Z.

Example-

For individual series-

3, 4, 2, 1, 7, 6, 6, 7, 5, 6, 8, 9, 5

Solution- In this data 6 is repeated the maximum number of times, hence the mode is 6.

Example-

For discrete frequency distribution-

Find the mode of the given data-

Height (in cm):   150  160  170  180  190  200  210
No. of persons:   2    4    8    10   6    5    3

Solution- Here the maximum frequency is 10, hence the corresponding observation,
180, is the mode.
Z = 180.
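For continuous series-
Mode-
Z = L1 + ((f1 - f0) / (2f1 - f0 - f2)) × i
L1 = lower boundary of the modal class interval
f1 = highest frequency of the given distribution (frequency of the modal class)
f0 = frequency of the class preceding the modal class
f2 = frequency of the class succeeding the modal class
i = width of the class interval (L2 - L1)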

Example-

Calculate the mode of the given data-

Class interval:  0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency:       6     9      10     16     12     8      7

Solution- The maximum frequency is 16, hence the modal class is 30-40.

Then- L1 = 30
L2 = 40
f1 = 16
f2 = 12
f0 = 10

Z = L1 + ((f1 - f0) / (2f1 - f0 - f2)) × i
Z = 30 + [(16 - 10) / (2×16 - 10 - 12)] × 10
= 30 + (6/10) × 10
= 36
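The modal-class calculation can be sketched in Python as below, using Z = L1 + ((f1 - f0) / (2f1 - f0 - f2)) × i, where f0 and f2 are the frequencies of the classes just before and just after the modal class. The function name `continuous_mode` is an assumption for this illustration. Two choices here are also assumptions that the notes do not state: a missing neighbouring class is treated as having frequency 0, and a tie for the highest frequency is resolved in favour of the first such class.

```python
def continuous_mode(intervals, frequencies):
    """Mode of a continuous series by the grouped-data interpolation formula."""
    # index of the modal class (first class with the highest frequency)
    k = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f1 = frequencies[k]
    f0 = frequencies[k - 1] if k > 0 else 0                     # preceding class
    f2 = frequencies[k + 1] if k < len(frequencies) - 1 else 0  # succeeding class
    lower, upper = intervals[k]
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring frequencies equal f1")
    return lower + (f1 - f0) * (upper - lower) / denominator


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [6, 9, 10, 16, 12, 8, 7]
print(continuous_mode(classes, freqs))  # 36.0
```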

MEASURE OF DISPERSION
Having obtained a measure of location or position of a distribution, we need to know how
the data is spread about that point. Information about the spread can be given by
one or more measures of dispersion.

THE RANGE
This is the simplest measure of dispersion available in statistical analysis. It uses
only the two extreme values. The range is defined as the difference between the
maximum and minimum values of a given data set. Its advantage lies in its simplicity
and its independence of the measure of position. However, it is distorted by
extreme values and tells us nothing about how the values between the maximum and
minimum are distributed.

THE QUARTILE DEVIATION

The median divides the area under the frequency curve into two equal parts. The quartiles
divide the area into four equal parts.

Example:
Calculate the first and third quartiles for the following data set:
44, 76, 49, 52, 52, 48, 51.
We first arrange the data set in ascending order.
44, 48, 49, 51, 52, 52, 76.
Q1 is the value of the (7+1)/4 = 8/4 = 2nd item.
This is 48.
Q3 is the value of the 3(7+1)/4 = 24/4 = 6th item.
This is 52.

For a continuous series-
First Quartile (Q1) = L1 + ((q1 - cf0) / f) × i, where q1 = (N×1)/4 th item
Third Quartile (Q3) = L1 + ((q3 - cf0) / f) × i, where q3 = (N×3)/4 th item
Here L1 = lower limit of the quartile class, cf0 = cumulative frequency of the class
preceding it, f = frequency of the quartile class and i = width of the class interval.

Quartile Deviation (Q.D.) = (Q3 - Q1) / 2
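For an individual series, the quartile positions (N+1)/4 and 3(N+1)/4 and the quartile deviation (Q3 - Q1)/2 can be sketched in Python as follows. The function names are assumptions for this illustration. When a position is not a whole number, this sketch interpolates linearly between the two neighbouring items; the notes do not cover that case, so treat it as one possible convention.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position, interpolating between neighbours (assumption)."""
    n = len(ordered)
    position = min(max(position, 1), n)   # keep the position inside the data
    lower_index = int(position)           # whole part of the position
    fraction = position - lower_index     # fractional part, 0 for exact items
    upper_index = min(lower_index + 1, n)
    low = ordered[lower_index - 1]
    high = ordered[upper_index - 1]
    return low + fraction * (high - low)


def quartiles(values):
    """Return (Q1, Q3) of an individual series using (N+1)/4 positions."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


def quartile_deviation(values):
    q1, q3 = quartiles(values)
    return (q3 - q1) / 2


data = [44, 76, 49, 52, 52, 48, 51]
print(quartiles(data))           # (48.0, 52.0)
print(quartile_deviation(data))  # 2.0
```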

MEAN DEVIATION

Mean Deviation is also known as 'Average Deviation' or 'First Moment of Dispersion'.

This measure is an average of the deviations of all items from the arithmetic mean. When
taking the deviation of an item from the mean, the sign is not taken into account.
Definition- According to Mills- "The Mean Deviation of a series of magnitudes is
the arithmetic mean of their deviations from an average value, either Mean or
Median."

Mean deviation from mean [For individual series] -- M.D. = Σ|dx| / N
Mean deviation from mean [For grouped data] -- M.D. = Σf|dx| / N
where dx = x - 𝑥̅ is the deviation of an item from the mean.
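Both forms of the mean deviation from the mean can be sketched in Python as below: signs are ignored by taking absolute deviations, and for grouped data each absolute deviation is weighted by its frequency. The function names are assumptions for this illustration, and the data reuse Examples 1 and 2 above.

```python
def mean_deviation(values):
    """Mean of absolute deviations from the arithmetic mean (individual series)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one observation is required")
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def grouped_mean_deviation(values, frequencies):
    """Frequency-weighted mean of absolute deviations from the mean."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, frequencies)) / n


print(mean_deviation([1132, 1140, 1144, 1136, 1148]))  # 4.8
print(round(grouped_mean_deviation([6, 7, 8, 9, 10, 11, 12],
                                   [5, 8, 9, 12, 6, 6, 4]), 3))
```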
THE STANDARD DEVIATION [S.D.]
The standard deviation is the most widely used measure of dispersion, since it is
directly related to the mean. If you chose the mean as the most appropriate measure
of central location, then the standard deviation would be the natural choice for a
measure of dispersion.

The standard deviation measures the differences from the mean; a larger value
indicates larger variation. Standard deviation is denoted by small sigma, σ. The
standard deviation is in the same units as the actual observations.
To calculate the standard deviation for ungrouped (individual series) data, we
follow these steps.

1) Choose an assumed mean (A) [select the minimum value in the given data as A]
2) Take the deviation (dx) of each X from A (dx = x - A), then total these as Σdx
3) Square the deviations (dx²), then total these as Σdx²
4) Formula is-
S.D. = σ = √[ Σdx²/N - (Σdx/N)² ]

Standard Deviation (For grouped data) - Steps are--

1) Choose an assumed mean (A) [select any value from the given data as A]
2) Take the deviation (dx) of each X from A (dx = x - A) and multiply it by the respective
frequency, then sum up as Σfdx
3) These products (fdx) are again multiplied by the deviations (dx) to obtain fdx².
These products are totaled to get Σfdx²
4) Formula is-
S.D. = σ = √[ Σfdx²/N - (Σfdx/N)² ]
Mean, 𝑥̅ = A + Σfdx/N
Example-

Find the standard deviation of the given data:

Age:            0-10  10-20  20-30  30-40  40-50  50-60  60-70  70-80
No. of persons: 15    15     23     22     25     10     5      10

Solution-

Age     Mid value (x)   Frequency (f)   dx = x - 35   fdx    dx²    fdx²
0-10    5               15              -30           -450   900    13500
10-20   15              15              -20           -300   400    6000
20-30   25              23              -10           -230   100    2300
30-40   35              22              0             0      0      0
40-50   45              25              10            250    100    2500
50-60   55              10              20            200    400    4000
60-70   65              5               30            150    900    4500
70-80   75              10              40            400    1600   16000
                        ∑f = 125                      ∑fdx = 20     ∑fdx² = 48800

Mean 𝑥̅ = A + ∑fdx/N = 35 + 20/125 = 35.16

S.D. = √[ ∑fdx²/N - (∑fdx/N)² ] = √[ 48800/125 - (20/125)² ] = √(390.4 - 0.0256) = 19.76
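The assumed-mean method for grouped data can be sketched in Python as follows, using mean = A + Σfdx/N and S.D. = √[Σfdx²/N - (Σfdx/N)²] with dx = x - A. The function name `assumed_mean_sd` is an assumption for this illustration; the data are the mid values and frequencies of the example above with A = 35.

```python
from math import sqrt


def assumed_mean_sd(midpoints, frequencies, assumed_mean):
    """Return (mean, standard deviation) by the assumed-mean method."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    deviations = [x - assumed_mean for x in midpoints]               # dx = x - A
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))      # sum of f*dx
    sum_fd2 = sum(f * d * d for f, d in zip(frequencies, deviations)) # sum of f*dx^2
    mean = assumed_mean + sum_fd / n
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd


mid_values = [5, 15, 25, 35, 45, 55, 65, 75]
persons = [15, 15, 23, 22, 25, 10, 5, 10]
mean, sd = assumed_mean_sd(mid_values, persons, assumed_mean=35)
print(round(mean, 2), round(sd, 2))  # 35.16 19.76
```

The result does not depend on which value is chosen as A; a different assumed mean changes the intermediate sums but not the final mean or standard deviation.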

COEFFICIENT OF VARIATION
The coefficient of variation expresses the standard deviation of a set of observations
as a percentage of the arithmetic mean.
C.V. = S.D. × 100 / mean
C.V. = σ × 100 / 𝑥̅
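As a small illustration, the coefficient of variation can be computed in Python as below. The function name `coefficient_of_variation` is an assumption for this example; the inputs are the rounded mean (35.16) and standard deviation (19.76) from the standard deviation example above.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return sd * 100 / mean


print(round(coefficient_of_variation(35.16, 19.76), 2))  # 56.2
```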

SKEWNESS
Skewness in a set of data relates to the shape of the histogram which could be
drawn from the data. The literal meaning of the word skewness is 'lack of
symmetry'. If the frequency distribution on either side of the central value is not
symmetrical, the distribution is said to be skewed.
Definition-
1) “Skewness is the tendency of a distribution to depart from normal in the
balance of its two sides.” – Blair
2) “A distribution is said to be skewed when the mean and median fall at different
points in the distribution, and balance is shifted to one side or the other.”-
Garrett

Types of Skewness

1) Symmetrical distribution
2) Asymmetrical distribution

A) Symmetrical distribution- In this distribution frequencies increase and
decrease in a regular order, and the spread of frequencies is the same on
both sides of the centre point. In this distribution the values of Mean, Median
and Mode are equal.

B) Asymmetrical distribution- In this distribution there is no uniformity or
regularity in the order of increase and decrease of frequencies. This
distribution may be of two types-
1) Positively skewed distribution- A distribution in which more than half of
the area under the frequency curve is to the right side of the mode is a
positively skewed distribution. In this distribution the mean is greater than the
median and the median is greater than the mode (𝑥̅ > M > Z).
2) Negatively skewed distribution- A distribution in which more than half
of the area under the frequency curve is to the left side of the mode is a
negatively skewed distribution. In this distribution the mean is less than the
median and the median is less than the mode (𝑥̅ < M < Z).

Measures of Skewness
1) Karl Pearson's coefficient of skewness (Jk) = (Mean - Mode) / S.D.
2) Bowley's coefficient of skewness (JQ) = (Q3 + Q1 - 2M) / (Q3 - Q1), where M is the median.
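The two coefficients can be sketched in Python as below. Karl Pearson's coefficient is taken as (Mean - Mode)/S.D. as given above; Bowley's coefficient is taken in its standard quartile form, (Q3 + Q1 - 2 × Median)/(Q3 - Q1). The function names are assumptions for this illustration. The Pearson inputs are hypothetical numbers chosen only to show the call; the Bowley inputs are the quartiles and median of the small data set 44, 48, 49, 51, 52, 52, 76.

```python
def pearson_skewness(mean, mode, sd):
    """Karl Pearson's coefficient of skewness: (mean - mode) / S.D."""
    if sd == 0:
        raise ValueError("standard deviation must be non-zero")
    return (mean - mode) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient of skewness: (Q3 + Q1 - 2*median) / (Q3 - Q1)."""
    if q3 == q1:
        raise ValueError("Q3 and Q1 must differ")
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Hypothetical summary values, for illustration only.
print(pearson_skewness(mean=50, mode=44, sd=12))   # 0.5

# Q1 = 48, median = 51, Q3 = 52 for the data 44, 48, 49, 51, 52, 52, 76.
print(bowley_skewness(q1=48, median=51, q3=52))    # -0.5
```

A positive coefficient indicates positive skewness and a negative coefficient indicates negative skewness; a value of zero corresponds to a symmetrical distribution.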
SOLVE THESE QUESTIONS
1- Calculate the mean, median and mode of the given data-

Class:      11-15  16-20  21-25  26-30  31-35  36-40  41-45  46-50
Frequency:  7      10     13     26     35     20     11     5

2- Calculate the mean, median and mode of the given data-

X:  5  10  15  20  25  30  35  40  45
y:  3  51  25  5   9   7   12  16  8

3- Calculate the mean, median, mode, standard deviation and coefficient of variation of the
given series-

Class:      30-40  40-50  50-60  60-70  70-80  80-90  90-100
Frequency:  18     26     30     12     10     4      3

4- Calculate Karl Pearson's coefficient of skewness from the following data:

Marks (above):    0    10   20   30  40  50  60
No. of students:  150  140  100  80  70  30  14

5- Calculate Bowley's coefficient of skewness-

Age under:       10  20  30  40  50  60   70   80
No. of persons:  5   15  30  55  81  100  120  125
