Lesson 4

The document is a lesson on data description in statistics, focusing on measures of variation such as range, variance, and standard deviation. It provides definitions, formulas, and examples for calculating these measures, as well as the importance of understanding data variability. Additionally, it covers sample variance and standard deviation, including methods for grouped data.

Uploaded by

renad.na00
Copyright
© All Rights Reserved

Kingdom of Saudi Arabia
Ministry of Education
University of Tabuk
Faculty of Science
Statistics Department

Introduction to Statistics
STAT 1101

Statistics Department - Faculty of Science, 1445 AH

Lesson 4

Data Description (2)


Contents
Introduction
Measures of Variation
The Range
The Variance
The Standard Deviation
Coefficient of Variation
Measures of Position
The Five-Number Summary and Boxplots
Statistical Computations using Microsoft Excel
Introduction

To describe a data set accurately, statisticians must know more than the measures of central tendency.

Example 3-15
A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test. Since different chemical agents are added to each group and only six cans are involved, these two groups constitute two small populations. The results (in months) are shown below. Find the mean of each group.

Brand A: 10, 60, 50, 30, 40, 20
Brand B: 35, 45, 30, 35, 40, 25

Solution
The mean for brand A is
$$\mu = \frac{\sum X}{N} = \frac{210}{6} = 35 \text{ months}$$
The mean for brand B is
$$\mu = \frac{\sum X}{N} = \frac{210}{6} = 35 \text{ months}$$
Chapter 4: Data Description 4
Even though the means are the same for both brands, the spread, or variation, is quite different. The data show that brand B performs more consistently; it is less variable.

Three measures are commonly used to describe the spread or variability of a data set: the range, the variance, and the standard deviation.


The Range

Definition
The range is the highest value minus the lowest value. The symbol R is used for the range.
R = highest value − lowest value

Example 3-16
Find the range for the paints in Example 3-15.

Solution
The range for brand A is
R = 60 − 10 = 50 months
The range for brand B is
R = 45 − 25 = 20 months
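The mean and range calculations for the two paint brands can be checked with a few lines of Python. This is only an illustrative sketch: the function names `mean` and `value_range` are made up for this example and are not part of the lesson.

```python
# Paint durability in months, from Example 3-15.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def value_range(values):
    """Range R = highest value - lowest value."""
    return max(values) - min(values)


for name, data in (("A", brand_a), ("B", brand_b)):
    print(f"Brand {name}: mean = {mean(data)}, range = {value_range(data)}")
```

Both brands have the same mean, but the ranges differ, which is the point the example makes about spread.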


Population Variance and Standard Deviation

Data variation

❑ Variation is measured from the difference, or distance, between each data value and the mean. This difference is called a deviation.

❑ The sum of the deviations of all data values about the mean (computed without rounding) is always zero. That is, Σ(X − μ) = 0.
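A short derivation may help show why the deviations always sum to zero. It uses only the definition of the population mean, μ = ΣX / N; the numeric check uses the brand A paint values 10, 60, 50, 30, 40, 20, whose mean is 35.

```latex
\sum (X - \mu) = \sum X - N\mu = \sum X - N \cdot \frac{\sum X}{N} = \sum X - \sum X = 0

% Check with brand A (mean 35):
(-25) + 25 + 15 + (-5) + 5 + (-15) = 0
```

Because the positive and negative deviations cancel, the raw deviations cannot measure spread on their own, which is why the variance squares them first.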


Definition
The population variance is the average of the squares of the distances of each value from the mean. The symbol for the population variance is σ² (σ is the Greek lowercase letter sigma).
The formula for the population variance is
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where
X = individual value
μ = population mean
N = population size
Definition
The population standard deviation is the square root of the variance. The symbol for the population standard deviation is σ.
The corresponding formula for the population standard deviation is
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

Example 3-18
Find the variance and standard deviation for the brand A paint in Example 3-15.
10, 60, 50, 30, 40, 20


Solution
Step 1 Find the mean for the data.
$$\mu = \frac{\sum X}{N} = \frac{10 + 60 + 50 + 30 + 40 + 20}{6} = \frac{210}{6} = 35$$
Step 2 Subtract the mean from each value (X − μ).
10 − 35 = −25    60 − 35 = 25    50 − 35 = 15
30 − 35 = −5     40 − 35 = 5     20 − 35 = −15
Step 3 Square each result, (X − μ)².
(−25)² = 625    25² = 625    15² = 225
(−5)² = 25      5² = 25      (−15)² = 225
Step 4 Find the sum of the squares.
$$\sum (X - \mu)^2 = 625 + 625 + 225 + 25 + 25 + 225 = 1750$$
Step 5 Divide the sum by N to get the variance.
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} \approx 291.7$$
Step 6 Take the square root of the variance to get the standard deviation.
$$\sigma = \sqrt{291.7} \approx 17.1$$
It is helpful to make a table.

✓ Column A contains the raw data X.
✓ Column B contains the differences X − μ obtained in step 2.
✓ Column C contains the squares of the differences obtained in step 3.

A: X     B: X − μ     C: (X − μ)²
10       −25          625
60        25          625
50        15          225
30        −5           25
40         5           25
20       −15          225
                      Sum = 1750


Example 3-19
Find the variance and standard deviation for the brand B paint in Example 3-15.
35, 45, 30, 35, 40, 25

Solution
Step 1 Find the mean for the data.
$$\mu = \frac{\sum X}{N} = \frac{35 + 45 + 30 + 35 + 40 + 25}{6} = \frac{210}{6} = 35$$
Step 2 Subtract the mean from each value (X − μ).
35 − 35 = 0    45 − 35 = 10    30 − 35 = −5
35 − 35 = 0    40 − 35 = 5     25 − 35 = −10
Step 3 Square each result and place the squares in column C of the table.
Step 4 Find the sum of the squares in column C.
$$\sum (X - \mu)^2 = 0 + 100 + 25 + 0 + 25 + 100 = 250$$
Step 5 Divide the sum by N to get the variance.
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{250}{6} \approx 41.7$$
Step 6 Take the square root of the variance to get the standard deviation.
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{41.7} \approx 6.5$$
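The six steps used in Examples 3-18 and 3-19 can be sketched in Python as follows. The helper name `population_variance` is an assumption made for this illustration; the cross-check uses `statistics.pvariance` from the Python standard library, which also divides by N.

```python
import math
import statistics

# Paint durability in months, from Example 3-15.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def population_variance(values):
    """sigma^2 = sum((X - mu)^2) / N, following steps 1 to 5."""
    mu = sum(values) / len(values)                          # step 1
    squared_deviations = [(x - mu) ** 2 for x in values]    # steps 2 and 3
    return sum(squared_deviations) / len(values)            # steps 4 and 5


for name, data in (("A", brand_a), ("B", brand_b)):
    variance = population_variance(data)
    sigma = math.sqrt(variance)                             # step 6
    # Cross-check against the standard-library population variance.
    assert math.isclose(variance, statistics.pvariance(data))
    print(f"Brand {name}: variance = {variance:.1f}, standard deviation = {sigma:.1f}")
```

Brand A's standard deviation is well over twice brand B's, which matches the earlier observation that brand B is the more consistent paint.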
Sample Variance and Standard Deviation

Definition
The formula for the sample variance (denoted by s²) is
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
The formula for the sample standard deviation (denoted by s) is
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
where
X = individual value
X̄ = sample mean
n = sample size
Example 3-20
The number of public school teacher strikes in Pennsylvania for a random sample of school years is shown. Find the sample variance and the sample standard deviation.
9, 10, 14, 7, 8, 3

Solution
Step 1 Find the mean for the data.
$$\bar{X} = \frac{\sum X}{n} = \frac{9 + 10 + 14 + 7 + 8 + 3}{6} = \frac{51}{6} = 8.5$$
Step 2 Subtract the mean from each value (X − X̄).
9 − 8.5 = 0.5     10 − 8.5 = 1.5    14 − 8.5 = 5.5
7 − 8.5 = −1.5    8 − 8.5 = −0.5    3 − 8.5 = −5.5
Step 3 Square each result, (X − X̄)².
0.5² = 0.25, 1.5² = 2.25, 5.5² = 30.25, (−1.5)² = 2.25, (−0.5)² = 0.25, (−5.5)² = 30.25
Step 4 Find the sum of the squares.
$$\sum (X - \bar{X})^2 = 0.25 + 2.25 + 30.25 + 2.25 + 0.25 + 30.25 = 65.5$$
Step 5 Divide the sum by n − 1 to get the sample variance.
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{65.5}{5} = 13.1$$
Step 6 Take the square root of the variance to get the sample standard deviation.
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{13.1} \approx 3.6$$
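As a rough check of Example 3-20, the sample formulas can be written out in Python. The function name `sample_variance` is chosen for this sketch only; `statistics.variance` is the standard-library function that divides by n − 1 and is used here as a cross-check.

```python
import math
import statistics

# Teacher strikes per school year, from Example 3-20.
strikes = [9, 10, 14, 7, 8, 3]


def sample_variance(values):
    """s^2 = sum((X - X_bar)^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


s_squared = sample_variance(strikes)
s = math.sqrt(s_squared)

# The standard library uses the same n - 1 divisor for sample variance.
assert math.isclose(s_squared, statistics.variance(strikes))
print(f"s^2 = {s_squared:.1f}, s = {s:.1f}")
```

The only difference from the population version is the divisor: n − 1 instead of N.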
Example 3-21
The number of public school teacher strikes in Pennsylvania for a random sample of school years is shown. Find the sample variance and the sample standard deviation.
9, 10, 14, 7, 8, 3

Solution
Step 1 Find the sum of the values.
$$\sum X = 9 + 10 + 14 + 7 + 8 + 3 = 51$$
Step 2 Square each value and find the sum.
$$\sum X^2 = 9^2 + 10^2 + 14^2 + 7^2 + 8^2 + 3^2 = 499$$
Step 3 Substitute in the formula and solve.
$$s^2 = \frac{n \sum X^2 - (\sum X)^2}{n(n - 1)} = \frac{6(499) - 51^2}{6(6 - 1)} = \frac{393}{30} = 13.1$$
$$s = \sqrt{13.1} \approx 3.6$$
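The formula used in Example 3-21 gives the same result as the definition in Example 3-20. A brief derivation, using only X̄ = ΣX / n, sketches why:

```latex
\sum (X - \bar{X})^2
  = \sum X^2 - 2\bar{X}\sum X + n\bar{X}^2
  = \sum X^2 - \frac{2(\sum X)^2}{n} + \frac{(\sum X)^2}{n}
  = \sum X^2 - \frac{(\sum X)^2}{n}

s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}
    = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}
    = \frac{n \sum X^2 - (\sum X)^2}{n(n - 1)}
```

With the strike data, ΣX² − (ΣX)²/n = 499 − 2601/6 = 65.5, the same sum of squares found step by step in Example 3-20.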


Variance and Standard Deviation for Grouped Data

Step 1 Make a table with columns A (class), B (frequency f), C (midpoint X_m), D (f·X_m), and E (f·X_m²). Find the midpoint of each class and place it in column C.
Step 2 Multiply the frequency by the midpoint for each class, and place the product in column D.
Step 3 Multiply the frequency by the square of the midpoint, and place the product in column E.
Step 4 Find the sums of columns B, D, and E. (The sum of column B is n. The sum of column D is Σf·X_m. The sum of column E is Σf·X_m².)
Step 5 Substitute in the formula and solve to get the variance.
$$s^2 = \frac{n \left( \sum f \cdot X_m^2 \right) - \left( \sum f \cdot X_m \right)^2}{n(n - 1)}$$
Step 6 Take the square root to get the standard deviation.


Example 3-22
Find the sample variance and the sample standard deviation for the frequency distribution of the data shown. The data represent the number of miles that 20 runners ran during one week.

Solution
Step 1 Make a table as shown, and find the midpoint of each class.
Step 2 Multiply the frequency by the midpoint for each class, and place the product in column D.
Step 3 Multiply the frequency by the square of the midpoint, and place the product in column E.
Step 4 Find the sums of columns B, D, and E. Here n = 20, Σf·X_m = 490, and Σf·X_m² = 13,310.
Step 5 Substitute in the formula and solve to get the variance.
$$s^2 = \frac{n \left( \sum f \cdot X_m^2 \right) - \left( \sum f \cdot X_m \right)^2}{n(n - 1)} = \frac{20(13310) - 490^2}{20(20 - 1)} = \frac{26100}{380} \approx 68.7$$
Step 6 Take the square root to get the standard deviation.
$$s = \sqrt{68.7} \approx 8.3$$
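The grouped-data procedure can be sketched in Python. Note the assumption: the frequency table for Example 3-22 is not reproduced in this text, so the (midpoint, frequency) pairs below are hypothetical values chosen only so that their totals match the sums quoted in the solution (n = 20, Σf·X_m = 490, Σf·X_m² = 13,310). They should not be read as the lesson's actual table.

```python
import math

# ASSUMPTION: hypothetical (midpoint, frequency) pairs standing in for the
# table that is not reproduced in this text. They were picked so that
# n = 20, sum(f * Xm) = 490 and sum(f * Xm^2) = 13310, as in the solution.
classes = [(8, 1), (13, 2), (18, 3), (23, 5), (28, 4), (33, 3), (38, 2)]

n = sum(f for _, f in classes)                        # column B total
sum_f_xm = sum(f * xm for xm, f in classes)           # column D total
sum_f_xm2 = sum(f * xm ** 2 for xm, f in classes)     # column E total

# Step 5: grouped-data variance formula; step 6: square root.
s_squared = (n * sum_f_xm2 - sum_f_xm ** 2) / (n * (n - 1))
s = math.sqrt(s_squared)

print(n, sum_f_xm, sum_f_xm2)
print(f"s^2 = {s_squared:.1f}, s = {s:.1f}")
```

Only the three column totals enter the formula, so any table with the same totals gives the same variance.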


Uses of Variance and Standard Deviation

✓The variances and standard deviations can be used to determine the spread of
the data. If the variance or standard deviation is large, the data are more
dispersed. This information is useful in comparing two (or more) data sets to
determine which is more (most) variable.
✓The measures of variance and standard deviation are used to determine the
consistency of a variable. For example, in the manufacture of fittings, such as
nuts and bolts, the variation in the diameters must be small, or else the parts will
not fit together.
✓The variance and standard deviation are used to determine the number of data
values that fall within a specified interval in a distribution.



Coefficient of Variation

❑ Whenever two samples have the same units of measure, the variance and standard deviation for each can be compared directly.
❑ A statistic that allows you to compare standard deviations when the units are different is called the coefficient of variation.

Definition
The coefficient of variation, denoted by CVar, is the standard deviation divided by the mean. The result is expressed as a percentage.
For samples,
$$\text{CVar} = \frac{s}{\bar{X}} \times 100$$
For populations,
$$\text{CVar} = \frac{\sigma}{\mu} \times 100$$
Example 3-23
The mean of the number of sales of cars over a 3-month period is 87, and the standard deviation is 5. The mean of the commissions is $5225, and the standard deviation is $773. Compare the variations of the two.

Solution
The coefficients of variation are
$$\text{CVar} = \frac{s}{\bar{X}} \times 100 = \frac{5}{87} \times 100 \approx 5.7\% \quad \text{(sales)}$$
$$\text{CVar} = \frac{s}{\bar{X}} \times 100 = \frac{773}{5225} \times 100 \approx 14.8\% \quad \text{(commissions)}$$
Since the coefficient of variation is larger for commissions, the commissions are more variable than the sales.
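The comparison in Example 3-23 can be reproduced with a small helper. The function name `coefficient_of_variation` is an assumption made for this sketch.

```python
def coefficient_of_variation(std_dev, mean):
    """CVar = (standard deviation / mean) * 100, expressed as a percentage."""
    return std_dev / mean * 100


# Values from Example 3-23: car sales (count) and commissions (dollars).
cvar_sales = coefficient_of_variation(5, 87)
cvar_commissions = coefficient_of_variation(773, 5225)

print(f"sales: {cvar_sales:.1f}%, commissions: {cvar_commissions:.1f}%")
```

Because the units cancel in the ratio, a count of cars and a dollar amount can be compared on the same percentage scale.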
Measures of Position

❑ In addition to measures of central tendency and measures of variation, there are measures of position or location. These include standard scores, percentiles, and quartiles.


Percentiles

Percentiles are position measures used in educational and health-related fields to indicate the position of an individual in a group.

Definition
Percentiles divide the data set into 100 equal groups.

Percentiles are symbolized by P1, P2, P3, ..., P99 and divide the distribution into 100 groups.


Finding a Data Value Corresponding to a Given Percentile

Step 1 Arrange the data in order from lowest to highest.
Step 2 Substitute into the formula
$$c = \frac{n \cdot p}{100}$$
where n = total number of values and p = percentile.
Step 3A If c is not a whole number, round up to the next whole number. Starting at the lowest value, count over to the number that corresponds to the rounded-up value.
Step 3B If c is a whole number, use the value halfway between the cth and (c + 1)st values when counting up from the lowest value.


Example 3-32
The number of traffic violations recorded by a police department for a 10-day period is shown. Find the data value corresponding to the 65th percentile.
22 19 25 24 18 15 9 12 16 20

Solution
Step 1 Arrange the data in order from lowest to highest.
9 12 15 16 18 19 20 22 24 25
Step 2 Substitute in the formula.
$$c = \frac{n \cdot p}{100} = \frac{10 \times 65}{100} = 6.5$$
Step 3 Since c is not a whole number, round it up to the next whole number; in this case, c = 7. Start at the lowest value and count over to the 7th value, which is 20.

Hence, the value of 20 corresponds to the 65th percentile.
Example 3-33
The number of traffic violations recorded by a police department for a 10-day period is shown. Find the data value corresponding to the 30th percentile.
22 19 25 24 18 15 9 12 16 20

Solution
Step 1 Arrange the data in order from lowest to highest.
9 12 15 16 18 19 20 22 24 25
Step 2 Substitute in the formula.
$$c = \frac{n \cdot p}{100} = \frac{10 \times 30}{100} = 3$$
Step 3 Since c is a whole number, use the value halfway between the cth and (c + 1)st values when counting up from the lowest. In this case, these are the third and fourth values, 15 and 16.

The value halfway between 15 and 16 is 15.5. Hence, 15.5 corresponds to the 30th percentile.
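Steps 1 to 3B of the percentile procedure can be sketched as a Python function. The name `value_at_percentile` is made up for this example, and the sketch assumes that p is a whole number from 1 to 99, as in the definition of P1 through P99.

```python
import math

# Traffic violations over 10 days, from Examples 3-32 and 3-33.
violations = [22, 19, 25, 24, 18, 15, 9, 12, 16, 20]


def value_at_percentile(data, p):
    """Data value for the p-th percentile (p assumed to be an integer 1..99)."""
    ordered = sorted(data)                 # step 1
    product = len(ordered) * p             # numerator of c = n * p / 100
    if product % 100 != 0:
        # Step 3A: c is not a whole number, so round up and count over.
        c = math.ceil(product / 100)
        return ordered[c - 1]
    # Step 3B: c is a whole number, so average the c-th and (c + 1)-st values.
    c = product // 100
    return (ordered[c - 1] + ordered[c]) / 2


print(value_at_percentile(violations, 65))
print(value_at_percentile(violations, 30))
```

Integer arithmetic is used to test whether c is a whole number, which avoids floating-point surprises in the check.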


Quartiles

Definition
Quartiles divide the distribution into four equal groups, denoted by Q1, Q2, Q3.

✓ Note that Q1 is the same as the 25th percentile; Q2 is the same as the 50th percentile, or the median; Q3 corresponds to the 75th percentile.
✓ Quartiles can be computed by using the formula given for computing percentiles. For Q1 use p = 25. For Q2 use p = 50. For Q3 use p = 75.


Finding Data Values Corresponding to Q1, Q2, and Q3

Step 1 Arrange the data in order from lowest to highest.
Step 2 Find the median of the data values. This is the value for Q2.
Step 3 Find the median of the data values that fall below Q2. This is the value for Q1.
Step 4 Find the median of the data values that fall above Q2. This is the value for Q3.
Example 3-34
The number of traffic violations recorded by a police department for a 10-day period is shown. Find Q1, Q2, and Q3.
22 19 25 24 18 15 9 12 16 20

Solution
Step 1 Arrange the data in order from lowest to highest.
9 12 15 16 18 19 20 22 24 25
Step 2 Find the median (Q2).
$$\text{MD} = \frac{18 + 19}{2} = 18.5$$
Step 3 Find the median of the data values below 18.5.
9 12 15 16 18
Q1 = 15
Step 4 Find the median of the data values greater than 18.5.
19 20 22 24 25
Q3 = 22

Hence, Q1 = 15, Q2 = 18.5, and Q3 = 22.
Interquartile Range

Definition
The interquartile range (IQR) is the difference between the third and first quartiles.
IQR = Q3 − Q1

Example 3-35
Find the interquartile range of the data set in Example 3-34.

Solution
IQR = Q3 − Q1 = 22 − 15 = 7
The interquartile range is equal to 7.
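The median-of-halves procedure for quartiles, together with the interquartile range, might be sketched like this. The function name `quartiles` is an assumption for this example, and the sketch takes "below Q2" and "above Q2" literally as strict comparisons, so data values equal to the median are left out of both halves.

```python
import statistics

# Traffic violations over 10 days, from Example 3-34.
violations = [22, 19, 25, 24, 18, 15, 9, 12, 16, 20]


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves steps."""
    ordered = sorted(data)                            # step 1
    q2 = statistics.median(ordered)                   # step 2
    lower_half = [x for x in ordered if x < q2]       # values below Q2
    upper_half = [x for x in ordered if x > q2]       # values above Q2
    return statistics.median(lower_half), q2, statistics.median(upper_half)


q1, q2, q3 = quartiles(violations)
iqr = q3 - q1
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Other software may use a different quartile convention and return slightly different values for the same data.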
The Five-Number Summary and Boxplots

✓ A boxplot can be used to graphically represent the data set.
✓ These plots involve five specific values: the lowest data value (the minimum), Q1, the median, Q3, and the highest data value (the maximum).
✓ These values are called a five-number summary of the data set.
The Boxplots

Definition
A boxplot is a graph of a data set obtained by
drawing a horizontal line from the minimum data
value to Q1, drawing a horizontal line from Q3 to the
maximum data value, and drawing a box whose
vertical sides pass through Q1 and Q3 with a vertical
line inside the box passing through the median or Q2.



Constructing a Boxplot

Step 1 Find the five-number summary for the data.
Step 2 Draw a horizontal axis and place the scale on the axis. The scale should start on or below the minimum data value and end on or above the maximum data value.
Step 3 Locate the lowest data value, Q1, the median, Q3, and the highest data value; then draw a box whose vertical sides go through Q1 and Q3. Draw a vertical line through the median. Finally, draw a line from the minimum data value to the left side of the box, and draw a line from the maximum data value to the right side of the box.
Example 3-37
The number of meteorites found in 10 states of the United States is:
89, 47, 164, 296, 30, 215, 138, 78, 48, 39.
Construct a boxplot for the data.

Solution
Step 1 Find the five-number summary for the data.
Arrange the data in order:
30, 39, 47, 48, 78, 89, 138, 164, 215, 296
Find the median:
median = 83.5
Find Q1, the median of 30, 39, 47, 48, 78:
Q1 = 47
Find Q3, the median of 89, 138, 164, 215, 296:
Q3 = 164
The minimum data value is 30, and the maximum data value is 296.
Step 2 Draw a horizontal axis and the scale.
Step 3 Draw the box above the scale using Q1 and Q3. Draw a vertical line through the median, and draw lines from the lowest data value to the box and from the highest data value to the box.


Information Obtained from a Boxplot

1. a. If the median is near the center of the box, the distribution is approximately symmetric.
   b. If the median falls to the left of the center of the box, the distribution is positively skewed.
   c. If the median falls to the right of the center, the distribution is negatively skewed.
2. a. If the lines are about the same length, the distribution is approximately symmetric.
   b. If the right line is longer than the left line, the distribution is positively skewed.
   c. If the left line is longer than the right line, the distribution is negatively skewed.
The boxplot in the figure above indicates that the distribution is slightly positively skewed.
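The two skewness rules can be checked numerically. The sketch below assumes that the boxplot referred to is the one for the meteorite data of Example 3-37; the variable names are made up for this illustration, and the quartiles use the median-of-halves method from the lesson.

```python
import statistics

# Meteorites found in 10 states, from Example 3-37.
meteorites = [89, 47, 164, 296, 30, 215, 138, 78, 48, 39]

ordered = sorted(meteorites)
median = statistics.median(ordered)
q1 = statistics.median([x for x in ordered if x < median])
q3 = statistics.median([x for x in ordered if x > median])
minimum, maximum = ordered[0], ordered[-1]
print("five-number summary:", minimum, q1, median, q3, maximum)

# Rule 1: where the median sits inside the box.
print("Q1 to median:", median - q1, "| median to Q3:", q3 - median)

# Rule 2: lengths of the two lines drawn from the box.
left_line = q1 - minimum
right_line = maximum - q3
print("left line:", left_line, "| right line:", right_line)
```

Here the median lies left of the center of the box and the right line is longer than the left one, so both rules point toward positive skew.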


Statistical Computations using Microsoft Excel

Sample Variance and Standard Deviation

Example XL3-2
The number of public school teacher strikes in Pennsylvania for a random sample of school years is shown. Find the sample variance and the sample standard deviation using Excel.
9, 10, 14, 7, 8, 3

Solution
1. On an Excel worksheet, enter the data in cells A2–A7. Enter a label for the variable in cell A1.
2. In a blank cell, enter =VAR.S(A2:A7) for the sample variance.
3. In a blank cell, enter =STDEV.S(A2:A7) for the sample standard deviation.
4. For the range, compute the difference between the maximum and the minimum values by entering =MAX(A2:A7)-MIN(A2:A7).
Percentiles

Excel has two built-in functions to find the Percentile Rank corresponding to a value
in a set of data.
1. PERCENTRANK.INC calculates the Percentile Rank corresponding to a data value in the range
0 to 1 inclusively.
2. PERCENTRANK.EXC calculates the Percentile Rank corresponding to a data value in the range
0 to 1 exclusively.

Example XL3-4

Given the following dataset. Find the data value corresponding to the 30th percentile. (Page 164)
5 6 12 13 15 18 22 50
Solution
1. On an Excel worksheet enter the data in cells A2–A9. Enter a label for the variable in cell A1.
2. Label cell B1 as Percent Rank INC and cell C1 as Percent Rank EXC.
3. Select cell B2.
4. Select the Formulas tab from the toolbar and Insert Function
5. Select the Statistical category for statistical functions and scroll in the function list to PERCENTRANK.INC
(PERCENTRANK.EXC) and click [OK].
In the PERCENTRANK.INC (PERCENTRANK.EXC) dialog boxes:
6. Type A2:A9 for the Array.
7. Type A2 for X, then click [OK].
8. Repeat the procedure above for each data value in the set.
The function results for both PERCENTRANK.INC and PERCENTRANK.EXC are shown
below.
Note: Both functions return the Percentile Ranks as a number between 0 and 1.
You may convert these to numbers between 0 and 100 by multiplying each function
value by 100.



The Boxplots

Example XL3-6
Given the following data set:
33, 38, 43, 30, 29, 40, 51, 27, 42, 23, 31.
Construct a boxplot for the data using Excel.

Solution
✓ Older versions of Excel do not have a built-in procedure to produce boxplots (Excel 2016 and later include a Box and Whisker chart type).
✓ However, you may construct these plots by using the MegaStat Add-in available from the Online Learning Center.

1. On an Excel worksheet, enter the data in cells A1–A11.
2. Select the Add-Ins tab, then MegaStat from the toolbar.
3. Select Descriptive Statistics from the MegaStat menu.
4. Enter the cell range A1:A11 in the Input range.
5. Check Boxplot. Click [OK].

The boxplot is shown below.