Statistics Lec 3
BOX PLOT
In data analysis, box plots, also known as
box-and-whisker plots, are essential tools for displaying
a dataset's distribution, central tendency, and
variability. They offer a compact summary of the
main statistical characteristics of the data.
Box plots show the five-number summary of a
set of data: the minimum, first (lower) quartile,
median, third (upper) quartile, and maximum.
Example
A sample of 10 bags of rice has the following
weights (in kg):
25, 34, 35, 28, 30, 35, 37, 38, 29, 29
Make a box plot of the data.
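As a hedged sketch, the five-number summary needed for this box plot can be computed in Python. The function name `five_number_summary` is an illustrative assumption, and so is the quartile convention: the quartiles are taken here as the medians of the lower and upper halves of the sorted data. Other conventions can give slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumed convention: Q1 and Q3 are the medians of the lower and
    upper halves of the sorted data; for an odd number of values the
    middle value is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + (n % 2):]
    return (data[0], median(lower), median(data), median(upper), data[-1])


weights = [25, 34, 35, 28, 30, 35, 37, 38, 29, 29]
summary = five_number_summary(weights)
print(summary)  # (25, 29, 32.0, 35, 38)
```

Under this convention the box runs from 29 to 35 with the median line at 32, and the whiskers extend to 25 and 38.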
Example
Draw a box-and-whisker diagram for the number of books
issued by first-year students from the library and compare
it with the box-and-whisker diagram for the number of
books issued by third-year students.
The number of books taken out of the library per month
by a sample of 15 first-year students
is as follows:
3, 0, 12, 0, 2, 0, 26, 0, 7, 5, 5, 2, 1, 1, 2.
The number of books taken out of the library per month
by a sample of 15 third-year students
is as follows:
12, 0, 9, 4, 15, 2, 6, 10, 27, 15, 5, 9, 1, 14, 2.
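To compare the two groups before drawing, the summaries can be sketched in Python as below. The helper `five_number_summary` and its median-of-halves quartile convention are assumptions made for illustration, not a method prescribed by these notes. The interquartile range (Q3 minus Q1) is printed as well, since it is the length of the box.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + (n % 2):]  # skip the middle value when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])


first_year = [3, 0, 12, 0, 2, 0, 26, 0, 7, 5, 5, 2, 1, 1, 2]
third_year = [12, 0, 9, 4, 15, 2, 6, 10, 27, 15, 5, 9, 1, 14, 2]

for label, books in (("First year", first_year), ("Third year", third_year)):
    low, q1, med, q3, high = five_number_summary(books)
    print(f"{label}: min={low}, Q1={q1}, median={med}, "
          f"Q3={q3}, max={high}, IQR={q3 - q1}")
```

With this convention the first-year summary is (0, 0, 2, 5, 26) and the third-year summary is (0, 2, 9, 14, 27), so the third-year box sits higher and is wider (IQR 12 against 5).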
Measure of Variability
Measures of variability are the statistical tools
used to describe the dispersion or spread of a
set of data. They show the degree to which the
data values deviate from the center value (such
as the mean or median) or from one another.
Typical measures of variability include:
• range
• interquartile range
• variance
• standard deviation
Range: The most basic measure of variation is
the range, which is the distance from the
smallest to the largest value in a distribution.
Range = Largest value − Smallest value
However, the range uses only two values in the
data set, and one of these values may be an
unusually large or small value.
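A small Python sketch can illustrate how strongly one extreme value affects the range. It uses the first-year library data from the earlier example; the function name `value_range` is a hypothetical helper chosen for illustration.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


# Books borrowed per month by the 15 first-year students
first_year = [3, 0, 12, 0, 2, 0, 26, 0, 7, 5, 5, 2, 1, 1, 2]

full_range = value_range(first_year)

# Drop the single largest observation (26) and recompute
without_largest = sorted(first_year)[:-1]
reduced_range = value_range(without_largest)

print(full_range)     # 26
print(reduced_range)  # 12
```

Removing one observation more than halves the range here, which is why the range alone is a fragile description of spread.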
Variance for ungrouped data
Variance measures the average squared
deviation of each data point from the mean.
The general expression of variance is given as:
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$, for population data
$S^2 = \frac{\sum (x_i - \bar{x})^2}{n}$, for sample data
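The variance formula can be sketched in Python using the rice weights from the box plot example. The function name `population_variance` is an illustrative assumption. It divides by the number of values, as both formulas in these notes do; note that Python's built-in `statistics.variance` divides by n − 1 instead, so it would give a different result.

```python
def population_variance(values):
    """Average squared deviation from the mean (divides by the count)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


weights = [25, 34, 35, 28, 30, 35, 37, 38, 29, 29]

mean_weight = sum(weights) / len(weights)
variance = population_variance(weights)

print(mean_weight)  # 32.0
print(variance)     # 17.0
```

The squared deviations from the mean of 32 sum to 170, and 170 / 10 = 17.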
Variance for grouped data
When the data are grouped into a frequency
distribution with k classes having midpoints $x_i$
and corresponding frequencies $f_i$, the
variance is given by:
$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N}$, for population data
$S^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n}$, for sample data
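A minimal Python sketch of the grouped formula follows. The midpoints and frequencies are made-up illustration values, not data from these notes, and the function name `grouped_variance` is hypothetical. The divisor is the total frequency, which plays the role of N (or n) in the formula.

```python
def grouped_variance(midpoints, frequencies):
    """Variance of grouped data: sum of f_i * (x_i - mean)^2 over total frequency."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total
    squared = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return squared / total


# Hypothetical frequency distribution (assumed values for illustration only)
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]

variance = grouped_variance(midpoints, frequencies)
print(variance)  # 49.0
```

Here the mean is 210 / 10 = 21 and the weighted squared deviations are 242 + 5 + 243 = 490, giving 490 / 10 = 49.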
Standard deviation for ungrouped data
The positive square root of the variance is called
standard deviation.
The general expression of S.D. is given as:
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$, for population data
$S = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$, for sample data
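As a quick check of the formula, the sketch below computes the standard deviation of the rice weights from the box plot example and compares it with `statistics.pstdev` from Python's standard library, which also divides by the number of values. The variable names are illustrative.

```python
import math
from statistics import pstdev

weights = [25, 34, 35, 28, 30, 35, 37, 38, 29, 29]

mean_weight = sum(weights) / len(weights)
variance = sum((x - mean_weight) ** 2 for x in weights) / len(weights)

# Standard deviation is the positive square root of the variance
sd = math.sqrt(variance)

print(round(sd, 3))               # 4.123
print(round(pstdev(weights), 3))  # 4.123
```

Because the standard deviation is in the same unit as the data, the result reads as about 4.12 kg.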
Standard deviation for grouped data
The positive square root of the variance is called
standard deviation.
The general expression of S.D. is given as:
$\sigma = \sqrt{\frac{\sum f_i (x_i - \mu)^2}{N}}$, for population data
$S = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{n}}$, for sample data
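The grouped formula can be sketched in Python as follows. The frequency table is a made-up illustration, not the apple-weight data of the next example, and `grouped_std_dev` is a hypothetical helper name. The sketch also expands the table into a raw list (each midpoint repeated by its frequency) to show that the grouped formula matches the ungrouped one applied to the midpoints.

```python
import math
from statistics import pstdev


def grouped_std_dev(midpoints, frequencies):
    """Square root of sum of f_i * (x_i - mean)^2 over total frequency."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total
    squared = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return math.sqrt(squared / total)


# Hypothetical frequency distribution (assumed values for illustration only)
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]

sd = grouped_std_dev(midpoints, frequencies)

# Repeat each midpoint f_i times and use the ungrouped formula
expanded = [x for x, f in zip(midpoints, frequencies) for _ in range(f)]
sd_expanded = pstdev(expanded)

print(sd)           # 7.0
print(sd_expanded)  # 7.0
```

With real grouped data the midpoints stand in for the unknown individual values, so the result is an approximation of the standard deviation of the raw data.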
Example
Calculate the variance and standard deviation of
the following frequency distribution showing the
weight of apples