Chapter 4

The document discusses various measures used to describe and analyze data. It covers measures of central tendency like mean, median and mode. It also discusses measures of dispersion such as range, interquartile range, variance and standard deviation. Finally, it discusses measures of relative position including quartiles, percentiles and z-scores. The document provides examples and formulas for calculating each of these statistical measures.

Uploaded by

meia quider
Copyright
© All Rights Reserved

ENGINEERING DATA ANALYSIS
Quick Review

What was our topic last meeting?


Group Activity

• Materials: Fifty coins


• Directions: You will see how many coins you
can stack on top of each other before they
fall. Each member of the team will take
turns to stack the coins. You will record the
result of each attempt.
• Rules: You may not support the stack with
your hand. Once you place the coin you
may not readjust. You are given five minutes
to do the activity.
Group Activity

• After the activity, each group will determine
the following:

(Total) ∕ (Number of Attempts) = ______

Sort the numbers per attempt in order from
least to greatest

(Greatest number) – (Least number) = ______


Data Management
CHAPTER 4
Objectives

At the end of the lesson, the students will be
able to:
• Discuss how data can be described using
the measures of averages
• Determine how spread out or close together
the data values in the data set are
• Determine the location or position of a
value relative to the other values in the data
set
Measures of Central Tendency
Central Tendency

It is a central or typical value for a probability
distribution.
It refers to the “middle” value or perhaps a
typical value of the data, and is measured using
the mean, median, or mode.
Measures of Central Tendency

1. Mean
• It is the average of the numbers.
• It is easy to calculate:
❑ Add up all the numbers
❑ Then divide by how many numbers there are
Measures of Central Tendency

• Denoted $\bar{x}$
Computed using the formula:
$\bar{x} = \dfrac{\sum x}{n}$
where:
x is the value of an observation
n is the total number of observations
Measures of Central Tendency

Example:
1. Find the mean of the set of values:
3, 4, 6, 8, 9, and 5.
Solution:
$\bar{x} = \dfrac{3 + 4 + 6 + 8 + 9 + 5}{6}$
$\bar{x} = \dfrac{35}{6}$
$\bar{x} = 5.83$
Measures of Central Tendency

Example:
2. Find the mean of the set of values:
1.2, 3.5, 4.0, and 1.3.
Solution:
$\bar{x} = \dfrac{1.2 + 3.5 + 4.0 + 1.3}{4}$
$\bar{x} = \dfrac{10}{4}$
$\bar{x} = 2.5$
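The two mean examples can be checked with a minimal Python sketch. The function name `mean` is chosen here for illustration and is not part of the slides.

```python
def mean(values):
    """Arithmetic mean: add up all the numbers, then divide by how many there are."""
    return sum(values) / len(values)


# The two worked examples from the slides
print(round(mean([3, 4, 6, 8, 9, 5]), 2))    # 5.83
print(round(mean([1.2, 3.5, 4.0, 1.3]), 2))  # 2.5
```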
Measures of Central Tendency

2. Median
• It is the value separating the higher half of a
data sample, a population, or a probability
distribution, from the lower half.
• For a data set, it may be thought of as the
“middle” value.
• It is denoted as md.
Measures of Central Tendency

Example:
1. In the data set: {2, 4, 7, 8, 9, 12}

the median is
$\text{md} = \dfrac{7 + 8}{2}$
$\text{md} = 7.5$
Measures of Central Tendency

Example:
2. In the data set: {1, 3, 4, 5, 7, 8, 10}
the median is
md = 5
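Both median examples follow the same rule: sort the data, then take the middle value, or the average of the two middle values when the count is even. A short Python sketch, with the function name `median` chosen for illustration:

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 7, 8, 9, 12]))      # 7.5
print(median([1, 3, 4, 5, 7, 8, 10]))   # 5
```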
Measures of Central Tendency

3. Mode
• It is the value that appears most often.
• It is the most commonly occurring value in a
distribution.
• It may or may not exist. If it exists,
a data set may have one mode or
more than one mode.
• It is denoted as mo.
Measures of Central Tendency

Example:
1. Given the data set: {3, 2, 7, 3, 4, 5}

Mode is 3 since 3 is repeated the most.


Measures of Central Tendency

Example:
2. Given the data set: {4, 5, 3, 4, 7, 5, 6, 3, 8}

The modes are 3, 4, and 5 since each appears
twice, more often than any other item.
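A rough Python sketch of finding the mode by counting occurrences. The function name `modes` is illustrative, and returning an empty list when no value repeats is an assumption made here to reflect "no mode"; other conventions exist.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often (sorted)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Assumption: if nothing repeats, treat the data set as having no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([3, 2, 7, 3, 4, 5]))           # [3]
print(modes([4, 5, 3, 4, 7, 5, 6, 3, 8]))  # [3, 4, 5]
```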
Measures of Central Tendency

Type of Data                   Best measure of central tendency
Nominal                        Mode
Ordinal                        Median
Interval/Ratio (not skewed)    Mean
Interval/Ratio (skewed)        Median


Measures of Dispersion
Measures of Dispersion

It is also known as the measure of spread or
measure of variability.
It measures how the various elements behave
with regard to central tendency.
It includes range, interquartile range, absolute
deviation, variance and standard deviation.
Measures of Dispersion

1. Range
• It is the difference between the highest value
and the lowest value.
• It tells how far the lowest value is from the
highest value.
• It is denoted R.
Measures of Dispersion

Example:
1. Given the data set {5, 8, 7, 12, 12, 13, 18}

R = 18 – 5
R = 13
Measures of Dispersion

Example:
2. Given the data set {8, 11, 13, 7, 2, 15, 17}

R = 17 – 2
R = 15
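The two range examples in Python, as a minimal sketch. The name `value_range` is chosen here to avoid shadowing Python's built-in `range`.

```python
def value_range(values):
    """Range R: highest value minus lowest value."""
    return max(values) - min(values)


print(value_range([5, 8, 7, 12, 12, 13, 18]))  # 13
print(value_range([8, 11, 13, 7, 2, 15, 17]))  # 15
```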
Measures of Dispersion

2. Interquartile Range
• It is also called the midspread.
• It is the difference between the 75th and the
25th percentile or between upper and lower
quartile.
• It is denoted by IQR and computed as
IQR = Q3 – Q1
Measures of Dispersion

Example:
1. When Q3 = 17.5 and Q1 = 10.5

IQR = 17.5 – 10.5


IQR = 7.0
Measures of Dispersion

Example:
2. Given: {4, 5, 8, 9, 10, 11, 15}
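$Q_1$: $\left[\tfrac{1}{4}(7 + 1)\right]^{\text{th}}$ item = 2nd item, so Q1 = 5
$Q_3$: $\left[\tfrac{3}{4}(7 + 1)\right]^{\text{th}}$ item = 6th item, so Q3 = 11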
Therefore,
IQR = 11 – 5
IQR = 6
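The position rule used in this example, the k(n + 1)/4-th item of the sorted data, matches the `"exclusive"` method of Python's `statistics.quantiles`, so the result can be checked with a short sketch. Other software may use a different quartile convention and give slightly different values.

```python
import statistics

data = [4, 5, 8, 9, 10, 11, 15]

# method="exclusive" places the k-th quartile at position k(n + 1)/4
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 5.0 11.0 6.0
```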
Measures of Dispersion

3. Variance
• This is the expectation of the squared
deviation of a random variable from its
mean.
• It measures how far a set of numbers is
spread out from its average value.
Measures of Dispersion

• To determine the variance of ungrouped data:
a. Arrange the data in order (either direction
works; the result is the same)
b. Calculate the mean and the deviation of
each item from the mean
c. Use the formula:
$s^2 = \dfrac{\sum (x - \text{mean})^2}{n - 1}$
Measures of Dispersion

Example:
Consider the following scores of students in an
achievement exam (15, 19, 11, 13, 17, 10, 20)
Measures of Dispersion

Solution:
x      x – mean    (x – mean)²
10     -5          25
11     -4          16
13     -2          4
15     0           0
17     2           4
19     4           16
20     5           25
                   Σ = 90
n = 7, mean = 15
$s^2 = \dfrac{90}{7 - 1} = 15$
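The table can be reproduced step by step in Python. This is a sketch of the sample variance (divisor n − 1) used in the slides; the function name `sample_variance` is illustrative, and the standard-library `statistics.variance` is included only as a cross-check.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


scores = [15, 19, 11, 13, 17, 10, 20]
print(sample_variance(scores))       # 15.0
print(statistics.variance(scores))   # standard-library cross-check
```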
Measures of Dispersion

4. Standard Deviation
• This is the square root of the variance.
• A low standard deviation indicates that the
data points tend to be close to the mean.
• A high standard deviation indicates that the
data points are spread over a wider range.
Measures of Dispersion

Example:
Consider the example in variance, where:
$s^2 = 15$
$s = \sqrt{15}$
$s = 3.87$
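A brief Python check of this result, taking the square root of the sample variance of the same scores. `statistics.stdev` computes the same quantity directly.

```python
import math
import statistics

scores = [15, 19, 11, 13, 17, 10, 20]

# Standard deviation is the square root of the (sample) variance
s = math.sqrt(statistics.variance(scores))
print(round(s, 2))  # 3.87
```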
Measures of Dispersion

5. Absolute Deviation
• This is the average of the absolute deviations
from a central point, that is, the average
distance between each data value
and the mean.
Measures of Dispersion

• To find the absolute deviation:
a. Arrange the values in order (either direction
works; the result is the same)
b. Compute the mean and the absolute
deviation of each value from the mean
c. Compute using
$AD = \dfrac{\sum |x - \text{mean}|}{n}$
Measures of Dispersion

Example:
Consider the number of blender units sold by a
store for one week.
(5, 7, 2, 8, 9, 6, 12)
Measures of Dispersion

Solution:
Units sold    |x – mean|
2             5
5             2
6             1
7             0
8             1
9             2
12            5
              Σ = 16
n = 7, mean = 7
$AD = \dfrac{16}{7} = 2.29$
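The same calculation as a minimal Python sketch. The function name `mean_absolute_deviation` is chosen here for illustration.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations of each value from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


units_sold = [5, 7, 2, 8, 9, 6, 12]
print(round(mean_absolute_deviation(units_sold), 2))  # 2.29
```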
Measures of Relative Position
Measures of Position

It is sometimes referred to as a measure of
location.
It is considered an extension of the median.
It describes the position/location of a value
relative to the other values in the data set.
Measures of Relative Position

1. Quartile
• This measure divides the observations into four
equal parts.
• Q1 is the middle value between the
smallest value and the median.
• Q2 is also called the median.
• Q3 is the middle value between the
median and the highest value of the data set.
Measures of Relative Position

Example:
Given: {4, 8, 7, 10, 15, 18, 20}
Solution:
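One way to work through this example, assuming the same position rule as the interquartile range example (the k-th quartile sits at position k(n + 1)/4 of the sorted data), is sketched below in Python. The `"exclusive"` method of `statistics.quantiles` follows that rule; a different quartile convention could give different values.

```python
import statistics

data = [4, 8, 7, 10, 15, 18, 20]
print(sorted(data))  # [4, 7, 8, 10, 15, 18, 20]

# Positions: Q1 -> 2nd item, Q2 -> 4th item, Q3 -> 6th item (n = 7)
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
print(q1, q2, q3)  # 7.0 10.0 18.0
```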
Measures of Relative Position

2. Percentile
• This divides the observations into 100 equal
parts.
• It is used to indicate what percentage of the
observations fall below a given value.
Measures of Relative Position

• To find the k-th percentile:
a. Arrange the observations from lowest to
highest.
b. Compute the position:
L = (k/100)*N
where N is the number of observations.
c. If L is a whole number, the k-th percentile is
midway between the values at positions L and L + 1.
d. If L is not a whole number, round it up to
the next integer; the value at that
position is the k-th percentile.
Measures of Relative Position

Example:
Consider the sugar levels of 20 students and
find the 40th and 28th percentiles.
80 90 91 100 120 122 123 125 201 90

88 98 130 124 111 109 140 102 85 91


Measures of Relative Position

Solution:
Arrange:
80 85 88 90 90 91 91 98 100 102 109 111
120 122 123 124 125 130 140 201
L = (40/100)*20 = 8
Thus, the 40th percentile is midway between the
8th and 9th observations (98 and 100), which is 99.
L = (28/100)*20 = 5.6
Thus, the 28th percentile is the 6th item in the
sorted observations, which is 91.
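The steps of this percentile rule can be sketched in Python as follows. The function name `percentile` is illustrative, and the sketch assumes 0 < k < 100 so that position L + 1 always exists. Integer arithmetic is used to test whether L is a whole number, which avoids floating-point rounding surprises.

```python
def percentile(values, k):
    """k-th percentile using L = (k/100)*N, as described in the slides.

    Assumes 0 < k < 100.
    """
    ordered = sorted(values)
    n = len(ordered)
    # L = whole + remainder/100, computed without floating point
    whole, remainder = divmod(k * n, 100)
    if remainder == 0:
        # L is a whole number: midway between positions L and L + 1 (1-based)
        return (ordered[whole - 1] + ordered[whole]) / 2
    # L is not whole: round up and take the value at that position
    return ordered[whole]  # 1-based position whole + 1


sugar = [80, 90, 91, 100, 120, 122, 123, 125, 201, 90,
         88, 98, 130, 124, 111, 109, 140, 102, 85, 91]
print(percentile(sugar, 40))  # 99.0
print(percentile(sugar, 28))  # 91
```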
Measures of Relative Position

3. Z-Score
• This indicates how many standard deviations
an element is from the mean.
• The positive and negative signs indicate the
direction of the point away from the mean.
Measures of Relative Position

• It also allows us to calculate the probability
of a score occurring within a normal
distribution, and it helps in comparing
scores that come from different normal
distributions.
• Computed using the formula:
$z = \dfrac{x - \bar{x}}{s}$
Measures of Relative Position

Example:
The report card of Chris shows that his grade
in Math is 98 and in Science is 90. The mean
grade in Math is 90 with a standard deviation of
10. In Science, the mean grade is 80 with a
standard deviation of 5. In which subject does
Chris perform better?
Measures of Relative Position

Solution:
For Math: $z = \dfrac{98 - 90}{10} = 0.8$
For Science: $z = \dfrac{90 - 80}{5} = 2$
The values show that the Math grade is 0.8
standard deviations above the mean while the
Science grade is 2 standard deviations above
the mean; thus Chris performs better in
Science.
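The comparison can be reproduced with a small Python sketch. The function name `z_score` and the variable names are chosen here for illustration.

```python
def z_score(x, mean, s):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / s


math_z = z_score(98, 90, 10)
science_z = z_score(90, 80, 5)

print(math_z, science_z)  # 0.8 2.0
print("Science" if science_z > math_z else "Math")  # Science
```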
Evaluation
The following are Ms. Cathy’s exam scores. Find the
mean, median, mode, range, IQR, variance,
standard deviation, absolute deviation, Q1, Q3, 10th
and 85th percentiles.
97 90 86 83 84 78 73 73 69
65 98 90 88 83 81 79 78 72
69 60 93 98 85 82 80 78 77
71 68 59 91 89 84 82 80 77
75 70 62 55 91 89 84 82 78
77 72 70 63 54 65 89 90 81
Evaluation

Answer:
Mean = 78.54 AD = 8.52
Median = 79.5 Q1 = 70.5
Mode = 77 & 78 Q3 = 85.5
Range = 44 10th = 63
IQR = 15 85th = 90
s2 = 115.23
s = 10.73
Using Measures of Center and Spread: The Box Plot
The Five-Number Summary:
Min Q1 Median Q3 Max
•Divides the data into 4 sets containing an
equal number of measurements.
•A quick summary of the data distribution.
•Used to form a box plot to describe the
shape of the distribution and to detect
outliers.
Constructing a Box Plot

✓Calculate Q1, the median, Q3 and IQR.


✓Draw a horizontal line to represent the
scale of measurement.
✓Draw a box using Q1, the median, Q3.

[Figure: box drawn from Q1 to Q3 with a line at the median m]
Constructing a Box Plot

✓Isolate outliers by calculating
✓Lower fence: Q1-1.5 IQR
✓Upper fence: Q3+1.5 IQR
✓Measurements beyond the upper or lower
fence are outliers and are marked (*).

[Figure: box from Q1 to Q3 with median m and an outlier marked (*)]
Constructing a Box Plot

✓Draw “whiskers” connecting the largest
and smallest measurements that are NOT
outliers to the box.

[Figure: box from Q1 to Q3 with median m, whiskers, and an outlier marked (*)]
Example
Amount of sodium in 8 brands of cheese:
260 290 300 320 330 340 340 520
Q1 = 292.5 m = 325 Q3 = 340

[Figure: box drawn from Q1 to Q3]
Example
IQR = 340-292.5 = 47.5
Lower fence = 292.5-1.5(47.5) = 221.25
Upper fence = 340 + 1.5(47.5) = 411.25
Outlier: x = 520

[Figure: box plot from Q1 to Q3 with median m, whiskers, and the outlier 520 marked (*)]
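The fence calculation for the sodium data can be sketched in Python. This assumes the quartiles are taken at position k(n + 1)/4 with linear interpolation, which is what `statistics.quantiles` does with `method="exclusive"` and which reproduces the Q1, m, and Q3 values above. The variable names are illustrative.

```python
import statistics

sodium = [260, 290, 300, 320, 330, 340, 340, 520]

q1, median, q3 = statistics.quantiles(sodium, n=4, method="exclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Outliers lie beyond the fences; whiskers reach the extreme non-outliers
outliers = [x for x in sodium if x < lower_fence or x > upper_fence]
inside = [x for x in sodium if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))

print(q1, median, q3)            # 292.5 325.0 340.0
print(lower_fence, upper_fence)  # 221.25 411.25
print(outliers)                  # [520]
print(whiskers)                  # (260, 340)
```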
Interpreting Box Plots

✓ Median line in center of box and whiskers
of equal length: symmetric distribution
✓ Median line left of center and long right
whisker: skewed right
✓ Median line right of center and long left
whisker: skewed left
