Presentation - Week 4 1

This document provides information about various statistical measures including: 1. Measures of location (mean, median, quartiles) which describe the central tendency of a data set. The mean is the average, the median is the middle value, and quartiles divide the data into four equal parts. 2. Measures of dispersion (range, variance, standard deviation, interquartile range) which describe how spread out the values are. The range is the difference between highest and lowest values. Variance and standard deviation measure average distance from the mean. 3. Other concepts like the mode, weighted mean, geometric mean, and trimmed mean. The mode is the most frequent value. Trimmed mean removes

Statistical Characteristics

Measures of location (position):
1) Mean
2) Median
3) Quartile

Measures of dispersion:
1) Range
2) Variance
3) Standard Deviation
4) Interquartile Range (IQR)

Mean
The arithmetic mean (also called the arithmetic average, or simply the
mean or the average) is the sum of a collection of numbers divided by
the count of numbers in the collection:

$$A = \frac{1}{n}\sum_{i=1}^{n} a_i$$

A = mean
n = number of values
a_i = data set values
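A minimal Python sketch of this definition. The function name and the sample values are illustrative assumptions, not taken from the slides.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean of an empty collection is undefined")
    return sum(values) / len(values)


# Illustrative data (an assumption, not from the slides).
sample = [4, 8, 6, 2]
print(arithmetic_mean(sample))  # 5.0
```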
https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/mean-median-basics/a/mean-median-and-mode-review
Weighted Mean
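As a minimal sketch of the usual definition, the weighted mean multiplies each value by its weight, sums the products, and divides by the sum of the weights. The function name and the grades and credit weights below are illustrative assumptions, not taken from the slides.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical course grades weighted by credits.
grades = [80, 90, 70]
credits = [3, 2, 1]
print(weighted_mean(grades, credits))  # 490 / 6, about 81.67
```

With equal weights the result reduces to the ordinary arithmetic mean.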

Geometric Mean
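As a minimal sketch of the usual definition, the geometric mean of n positive values is the n-th root of their product. The hand-written function below computes it through logarithms and is compared with `statistics.geometric_mean` from the Python standard library (available from Python 3.8); the growth factors are illustrative assumptions.

```python
import math
import statistics


def geometric_mean(values):
    """n-th root of the product of the values, computed via logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical growth factors: sqrt(2 * 8) = 4.
factors = [2.0, 8.0]
print(geometric_mean(factors))             # approximately 4.0
print(statistics.geometric_mean(factors))  # approximately 4.0
```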

Harmonic Mean
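As a minimal sketch of the usual definition, the harmonic mean of n positive values is n divided by the sum of their reciprocals. The function name and the speeds below are illustrative assumptions; `statistics.harmonic_mean` is the standard-library equivalent.

```python
import statistics


def harmonic_mean(values):
    """Return n / sum(1 / x_i) for positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs at least one value, all positive")
    return len(values) / sum(1 / v for v in values)


# Hypothetical speeds (km/h) over two legs of equal distance.
speeds = [40, 60]
print(harmonic_mean(speeds))             # approximately 48
print(statistics.harmonic_mean(speeds))  # approximately 48
```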

Trimmed mean
A trimmed mean is a method of averaging that
removes a small percentage of the largest and
smallest values before calculating the mean.
Let's say, as an example, a figure skating
competition produces the following scores: 6.0,
8.1, 8.3, 9.1, and 9.9.
To trim the mean by a total of 40%, we remove the
lowest 20% and the highest 20% of values,
eliminating the scores of 6.0 and 9.9.
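The trimming step above can be sketched in Python as follows. The function name `trimmed_mean` and its `proportion_per_side` parameter are illustrative choices; the number of values dropped from each end is rounded down, which is one possible convention.

```python
def trimmed_mean(values, proportion_per_side):
    """Mean after dropping the given fraction of values from each end."""
    if not 0 <= proportion_per_side < 0.5:
        raise ValueError("proportion_per_side must be in [0, 0.5)")
    ordered = sorted(values)
    # Number of values to drop from each end, rounded down.
    k = int(len(ordered) * proportion_per_side)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


scores = [6.0, 8.1, 8.3, 9.1, 9.9]
print(trimmed_mean(scores, 0.20))  # mean of 8.1, 8.3, 9.1: about 8.5
print(trimmed_mean(scores, 0.0))   # untrimmed mean: about 8.28
```

Removing 6.0 and 9.9 moves the average from about 8.28 to about 8.5, which shows how the trimmed mean reduces the influence of extreme scores.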
Median
• Median: The middle number, found by ordering all data points and
picking out the one in the middle (or, if there are two middle
numbers, taking the mean of those two numbers).
• Odd number of values n: the median is the value at position (n + 1)/2.
• Even number of values n: the median is the mean of the values at
positions n/2 and n/2 + 1.
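A minimal Python sketch of the odd and even cases. The function name and sample values are illustrative assumptions.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty collection is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average the values at 1-based positions n / 2 and n / 2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```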
https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/mean-median-basics/a/mean-median-and-mode-review
Quartile
Quartiles divide your data into four parts, as equal as
possible. To calculate quartiles, the data must be
sorted from the smallest to the largest value.
• First quartile (Q1): The middle value between the smallest
value (minimum) and the median.
• Second quartile (Q2): The median of the data, i.e. 50% of the
values are smaller and 50% of the values are larger.
• Third quartile (Q3): The middle value between the median
and the largest value (maximum).

Interquartile Range
Interquartile Range (IQR): The interquartile range is
defined as the range between the 75th percentile
(Q3) and the 25th percentile (Q1).

IQR = Q3 – Q1
Example

To avoid calculating the quartiles incorrectly:

• First calculate the median (50%).
• Then calculate the median of the upper half for Q3.
• For Q1, calculate the median of the lower half.
Exercise
Find the quartiles of:
{13,6,11,8,7,8,23,2,14,15,22,12}
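The median-of-halves procedure described earlier can be sketched in Python and applied to the exercise data. One assumption to note: when the number of values is odd, this sketch leaves the overall median out of both halves. Other conventions exist and can give slightly different quartiles.

```python
def median_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]          # values below the median position
    upper = ordered[half + n % 2:]  # values above it (median skipped when n is odd)
    return median_sorted(lower), median_sorted(ordered), median_sorted(upper)


data = [13, 6, 11, 8, 7, 8, 23, 2, 14, 15, 22, 12]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 7.5 11.5 14.5
print(q3 - q1)     # IQR: 7.0
```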

The range is the easiest measure of dispersion. It is
simply calculated by subtracting the lowest value
from the highest value.
Standard Deviation & Variance

Variance: Defined as the average of the squared differences from the mean. It measures
how far the data points in a data set are from the mean.
Standard deviation: The square root of the variance. It indicates the spread of a variable around its mean value.
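A minimal sketch of both measures, assuming the population form (dividing by N), which matches "the average of the squared differences"; the sample form divides by N - 1 instead. The function names and data values are illustrative assumptions.

```python
import math
import statistics


def population_variance(values):
    """Average of the squared differences from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data with mean 5
print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
# The standard library provides the same population measures.
print(statistics.pvariance(data), statistics.pstdev(data))
```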
Another way to calculate the variance

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} x_i\right)^2$$

Example
We have N = 5 elements with sum of (xi) = 25 and
sum of (xi^2) = 750. Find the variance.
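The example can be worked directly from the sums: divide the sum of squares by N and subtract the square of the mean. The sketch below assumes the population variance (dividing by N); the function name is an illustrative choice.

```python
def variance_from_sums(n, sum_x, sum_x_squared):
    """Population variance from N, the sum of x_i, and the sum of x_i squared."""
    if n <= 0:
        raise ValueError("n must be positive")
    mean = sum_x / n
    return sum_x_squared / n - mean ** 2


# 750 / 5 - (25 / 5)^2 = 150 - 25
print(variance_from_sums(5, 25, 750))  # 125.0
```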

Mode
The mode is the value that appears most frequently in a data
set. A set of data may have one mode, more than one mode, or no
mode at all.

Example of the Mode
• In the following list of numbers, 16 is the mode since it appears more times in
the set than any other number: 3, 3, 6, 9, 16, 16, 16, 27, 27, 37, 48
• A set of numbers can have more than one mode (this is known as bimodal if there
are two modes) if multiple numbers occur with equal frequency, and
more times than the others in the set: 3, 3, 3, 9, 16, 16, 16, 27, 37, 48
• In the above example, both the number 3 and the number 16 are modes as they
each occur three times and no other number occurs more often.
• If no number in a set of numbers occurs more than once, that set has no mode: 3,
6, 9, 16, 27, 37, 48
• A set of numbers with two modes is bimodal, a set of numbers with three modes
is trimodal, and any set of numbers with more than one mode is multimodal.

How Do I Calculate the Mode?
• Calculating the mode is fairly straightforward. Place all numbers
in a given set in order; this can be from lowest to highest or
highest to lowest, and then count how many times each number
appears in the set.
• The one that appears the most is the mode.

Example: Find the mode of the given observation data: 15, 21, 11, 18, 16,
9, 8, 8, 33, 11
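The counting procedure can be sketched in Python with `collections.Counter`. The function name `modes` is an illustrative choice; it returns an empty list when no value repeats, following the "no mode" convention used in these slides.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value occurs more than once, so the set has no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


observations = [15, 21, 11, 18, 16, 9, 8, 8, 33, 11]
print(modes(observations))               # [8, 11]
print(modes([3, 6, 9, 16, 27, 37, 48]))  # []
```

For the example data, 8 and 11 each appear twice, so the data set is bimodal.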

Dot Plot
A survey of "How long does it take you to eat breakfast?" has these results:

Minutes:  0  1  2  3  4  5  6  7  8  9 10 11 12
People:   6  2  3  5  2  5  0  0  2  3  7  4  1

Which means that 6 people take 0 minutes to eat breakfast (they probably had no
breakfast!), 2 people say they only spend 1 minute having breakfast, etc. And here is the
dot plot:
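A text-only rendering of the dot plot can be produced from the survey table with a short Python sketch, where each `*` stands for one person. The function name `text_dot_plot` is an illustrative choice.

```python
def text_dot_plot(values, counts):
    """Return a dot plot as text: one column per value, one '*' per observation."""
    width = max(len(str(v)) for v in values) + 1
    rows = []
    # Build rows from the tallest stack down to the axis.
    for level in range(max(counts), 0, -1):
        row = "".join(("*" if c >= level else " ").rjust(width) for c in counts)
        rows.append(row.rstrip())
    # Final row: the axis labels.
    rows.append("".join(str(v).rjust(width) for v in values))
    return "\n".join(rows)


minutes = list(range(13))
people = [6, 2, 3, 5, 2, 5, 0, 0, 2, 3, 7, 4, 1]
print(text_dot_plot(minutes, people))
```

The tallest column appears at 10 minutes (7 people), and the columns at 6 and 7 minutes stay empty.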

https://www.cuemath.com/data/dot-plot/
Exercises
The following measurements were recorded for the drying time, in
hours, of a certain brand of latex paint

Assume that the measurements are a simple random sample.
(a) What is the sample size for the above sample?
(b) Calculate the sample mean for these data.
(c) Calculate the sample median.
(d) Plot the data by way of a dot plot.
(e) Compute the trimmed mean for the above data set, trimming 20% in
total (10% from each end).
Exercises
According to the journal Chemical Engineering, an important property of a
fiber is its water absorbency. A random sample of 20 pieces of cotton
fiber was taken and the absorbency on each piece was measured. The
following are the absorbency values:

(a) Calculate the sample mean and median for the above sample values.
(b) Compute the trimmed mean with 10% trimmed from each side.
(c) Do a dot plot of the absorbency data.
(d) Using only the values of the mean, median, and trimmed mean, do you
have evidence of outliers in the data?

(An outlier is an observation that lies an abnormal distance from other
values in a random sample from a population.)
Interpolation
• Interpolation is generally used in engineering and similar
sciences based on experiments/measurements to fit the
collected data to a function curve.
• In cases when collected data is scattered (dispersed) and
extremely heterogeneous (different), it becomes important
to find the values in the empty fields by interpolation.
• Extrapolation is also used to make predictions in an area
outside the known points.
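As one concrete case, linear interpolation estimates a value between two known points by assuming a straight line between them. The sketch below is a minimal illustration; the function name and the sample measurements are assumptions, not taken from the slides.

```python
def linear_interpolate(x, x0, y0, x1, y1):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    if x0 == x1:
        raise ValueError("x0 and x1 must differ")
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Hypothetical measurements: y = 10 at x = 2 and y = 20 at x = 4.
print(linear_interpolate(3, 2, 10, 4, 20))  # 15.0 (interpolation, inside the known range)
print(linear_interpolate(5, 2, 10, 4, 20))  # 25.0 (extrapolation, outside the known range)
```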

Calculation of the median in the case
of a continuous variable
Here are the amounts paid by customers during a sales period
in a store X:

xi (amount paid)     ni (customers)
[0 - 100 TL[         43
[100 - 200 TL[       50
[200 - 300 TL[       56
[300 - 400 TL[       34
[400 - 500 TL[       23
500 TL and more      13
Total                219

1) Specify the character and its nature.
2) Calculate the median.
Solution
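• The character: amounts paid by customers;
nature: quantitative, continuous.
• Median location = 219/2 = 109.5

xi (amount paid)     ni     Cumulative ni
[0 - 100 TL[         43     43
[100 - 200 TL[       50     93
[200 - 300 TL[       56     149
[300 - 400 TL[       34     183
[400 - 500 TL[       23     206
500 TL and more      13     219
Total                219

• The median lies in the class [200 - 300 TL[, where the cumulative
count goes from 93 to 149. Linear interpolation:

Amount (TL):        200    Median    300
Cumulative ni:       93    109.5     149

Median = 200 + (109.5 - 93)/(149 - 93) × 100 ≈ 229 TL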
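The solution locates the median class from the cumulative frequencies and then interpolates linearly inside it. Below is a minimal Python sketch of that calculation, assuming values are spread evenly within each class. The function name is illustrative, and the open-ended last class is given a placeholder upper bound of 600, which does not affect the result here.

```python
def grouped_median(classes):
    """Median of grouped data; classes is a list of (lower, upper, frequency)."""
    total = sum(freq for _, _, freq in classes)
    position = total / 2
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            # Linear interpolation inside the median class.
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("no observations")


# Amounts in TL; the upper bound of the last class is an assumed placeholder.
amounts = [
    (0, 100, 43),
    (100, 200, 50),
    (200, 300, 56),
    (300, 400, 34),
    (400, 500, 23),
    (500, 600, 13),
]
print(round(grouped_median(amounts), 1))  # 229.5
```

The unrounded value is about 229.46 TL, in line with the 229 TL reported in the solution.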
