Week 5 - Result and Analysis 1 (UP)
INTRODUCTION
STATISTICAL ANALYSIS
c. Of 1000 households polled nationwide, 40% said they owned at least one
cordless phone, 9% had two or more.
• performing hypothesis testing,
• determining relationships among variables, and
• making predictions.
A population consists of all subjects (human or otherwise) that are being studied.
GRAPHICAL PRESENTATION
a. Pareto Charts
b. Time Series
c. Pie
DATA DESCRIPTION
Three aspects
1. Measures of Central Tendency
Measure     Definition                                          Symbol
Mean        Sum of the values divided by the number of values   µ (population), X̄ (sample)
Median      Middle point of the ordered data set                MD
Mode        Most frequent data value                            None
Midrange    (Lowest value + highest value) / 2                  MR
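
The four measures above can be computed directly. The following is a minimal sketch using Python's standard `statistics` module; the data values and the helper name `midrange` are assumptions chosen only for illustration and do not come from these notes.

```python
from statistics import mean, median, multimode


def midrange(values):
    """Midrange: (lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2


# Hypothetical data values, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10]

data_mean = mean(data)          # sum of values divided by number of values
data_median = median(data)      # middle point of the ordered data
data_modes = multimode(data)    # most frequent value(s), returned as a list
data_midrange = midrange(data)  # (2 + 10) / 2

print("Mean:", data_mean)
print("Median:", data_median)
print("Mode(s):", data_modes)
print("Midrange:", data_midrange)
```

`multimode` is used rather than `mode` because a data set can have more than one most frequent value.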
2. Measures of Variation.
Sometimes the mean alone is not enough to describe a data set, as the following
example shows.
Example: A testing lab wishes to test two experimental brands of outdoor paint to
see how long each will last before fading. Different chemical agents are added to
each group, and each group contains only six cans. These two groups constitute two
small populations. The results (in months) follow.
Brand A      Brand B
10           35
60           45
50           30
30           35
40           40
20           25
Mean = 35    Mean = 35
Note that Brand A and Brand B have the same mean, 35. One might therefore conclude
that both brands of paint last equally well. A different conclusion might be drawn,
however, when the data sets are examined graphically.
The range
Measure     Definition                                                           Symbol
Range       Distance between the highest and lowest values                       R
Variance    Average of the squares of the distance each value is from the mean   σ², s²
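
As a quick check on the paint example, the range of each brand can be computed from the six values listed earlier. This is only a sketch; the function name `value_range` is an assumption made for illustration.

```python
# Paint durability data (in months) from the example above.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def value_range(values):
    """Range: distance between the highest and lowest values."""
    return max(values) - min(values)


range_a = value_range(brand_a)  # 60 - 10 = 50
range_b = value_range(brand_b)  # 45 - 25 = 20

print("Range of Brand A:", range_a)
print("Range of Brand B:", range_b)
```

Although both brands have a mean of 35, Brand A's values are spread over 50 months while Brand B's are spread over only 20.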
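Population variance (σ²)

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]

Where
X = individual value
µ = population mean
N = population size

Population standard deviation (σ)

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]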
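
To make the population formulas concrete, the sketch below applies them to the two paint brands, which the example treats as small populations. The function names are assumptions for illustration; the results are cross-checked against `statistics.pvariance` and `statistics.pstdev`.

```python
import math
from statistics import pstdev, pvariance

# Paint durability data (in months) from the example above.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def population_variance(values):
    """sigma^2 = sum((X - mu)^2) / N"""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_std_dev(values):
    """sigma = sqrt(sigma^2)"""
    return math.sqrt(population_variance(values))


var_a = population_variance(brand_a)  # 1750 / 6, about 291.7
var_b = population_variance(brand_b)  # 250 / 6, about 41.7
sd_a = population_std_dev(brand_a)    # about 17.1 months
sd_b = population_std_dev(brand_b)    # about 6.5 months

# Cross-check against the standard library.
assert math.isclose(var_a, pvariance(brand_a))
assert math.isclose(sd_b, pstdev(brand_b))

print(f"Brand A: variance = {var_a:.1f}, standard deviation = {sd_a:.1f}")
print(f"Brand B: variance = {var_b:.1f}, standard deviation = {sd_b:.1f}")
```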
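Sample variance (s²)

\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]

Sample standard deviation (s)

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]

Where:
X = individual value
X̄ = sample mean
n = sample size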
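
The sample formulas differ from the population formulas only in the divisor, n − 1 instead of N. The sketch below illustrates this by treating six values as a sample; that treatment is an assumption made purely for illustration (the paint example itself treats its groups as populations), and the function names are hypothetical.

```python
import math
from statistics import stdev, variance

# Six values treated as a sample, purely for illustration.
sample = [10, 60, 50, 30, 40, 20]


def sample_variance(values):
    """s^2 = sum((X - X_bar)^2) / (n - 1)"""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """s = sqrt(s^2)"""
    return math.sqrt(sample_variance(values))


s_squared = sample_variance(sample)  # 1750 / 5 = 350
s = sample_std_dev(sample)           # sqrt(350), about 18.7

# Cross-check against the standard library, which also divides by n - 1.
assert math.isclose(s_squared, variance(sample))
assert math.isclose(s, stdev(sample))

print(f"s^2 = {s_squared:.1f}, s = {s:.2f}")
```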
Coefficient of variation is the standard deviation divided by the mean. The result is
expressed as a percentage.
For samples

\[ \text{CVar} = \frac{s}{\bar{X}} \cdot 100\% \]

For populations

\[ \text{CVar} = \frac{\sigma}{\mu} \cdot 100\% \]
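
As an illustration, the population form of the coefficient of variation can be applied to the two paint brands from the earlier example. This is a sketch only; the function name `cvar_population` is an assumption.

```python
from statistics import mean, pstdev

# Paint durability data (in months), treated as two small populations.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def cvar_population(values):
    """CVar = (sigma / mu) * 100, expressed as a percentage."""
    return pstdev(values) / mean(values) * 100


cvar_a = cvar_population(brand_a)  # about 48.8 %
cvar_b = cvar_population(brand_b)  # about 18.4 %

print(f"CVar of Brand A: {cvar_a:.1f}%")
print(f"CVar of Brand B: {cvar_b:.1f}%")
```

Because both brands share the same mean, the larger coefficient for Brand A reflects only its larger standard deviation.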
3. Measures of Position
Measure                     Definition                                                               Symbol
Standard score or z score   Number of standard deviations a data value is above or below the mean    z
Percentile                  Position in hundredths of a data value in the distribution               Pn
Decile                      Position in tenths of a data value in the distribution                   Dn
Quartile                    Position in fourths of a data value in the distribution                  Qn
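
A short sketch of these measures follows, using the Brand A paint data from the earlier example as a population. The function name `z_score` is an assumption. Note that several conventions exist for locating quartiles and deciles in small data sets; `statistics.quantiles` with its default method is one of them and may differ from a hand calculation taught in class.

```python
from statistics import mean, pstdev, quantiles

# Brand A paint durability data (in months), treated as a population.
brand_a = [10, 60, 50, 30, 40, 20]


def z_score(value, values):
    """Number of standard deviations a value lies above or below the mean."""
    return (value - mean(values)) / pstdev(values)


z_of_60 = z_score(60, brand_a)  # (60 - 35) / 17.08, about 1.46
z_of_35 = z_score(35, brand_a)  # a value equal to the mean has z = 0

# Cut points dividing the data into fourths (Q1, Q2, Q3) and tenths (D1..D9).
quartile_cuts = quantiles(brand_a, n=4)
decile_cuts = quantiles(brand_a, n=10)

print(f"z score of 60: {z_of_60:.2f}")
print("Quartile cut points:", quartile_cuts)
print("Decile cut points:", decile_cuts)
```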
The lower hinge (LH) is the median of all values less than or equal to the median
when the data set has an odd number of values, or the median of all values less than
the median when the data set has an even number of values.
The upper hinge (UH) is the median of all values greater than or equal to the median
when the data set has an odd number of values, or the median of all values
greater than the median when the data set has an even number of values.
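
The two hinge definitions can be sketched as follows. The function name `hinges` is an assumption, and the split is done by position in the sorted data (the middle value is shared by both halves when the count is odd), which is one common reading of the definitions above when values repeat.

```python
from statistics import median


def hinges(values):
    """Return (lower hinge, upper hinge) of a data set."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        # Odd count: the median value belongs to both halves.
        lower_half = ordered[: half + 1]
        upper_half = ordered[half:]
    else:
        # Even count: the halves lie strictly below and above the median.
        lower_half = ordered[:half]
        upper_half = ordered[half:]
    return median(lower_half), median(upper_half)


# Even number of values (Brand A paint data from the earlier example).
lh_even, uh_even = hinges([10, 60, 50, 30, 40, 20])  # halves: 10 20 30 | 40 50 60

# Odd number of values (hypothetical data for illustration).
lh_odd, uh_odd = hinges([1, 2, 3, 4, 5])             # halves: 1 2 3 | 3 4 5

print("Even case:", lh_even, uh_even)
print("Odd case:", lh_odd, uh_odd)
```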
Confidence Intervals
• The parameter is specified as falling between two values. Example: the average
age of all students.
• A probability of being correct can be assigned. For example, a 95% confidence
interval means we are 95% confident that the population mean is contained within
the interval.
I am ----% confident that the interval ---- to ----- includes the population
mean, µ.
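
These notes do not give a formula for the interval, so the following is only a sketch under stated assumptions: it uses the common z interval, X̄ ± z·σ/√n, which assumes the population standard deviation σ is known. The sample of student ages, the value of σ, and the function name are all hypothetical.

```python
import math
from statistics import NormalDist, mean


def z_confidence_interval(sample, sigma, confidence=0.95):
    """Interval X_bar +/- z * sigma / sqrt(n), assuming sigma is known."""
    n = len(sample)
    x_bar = mean(sample)
    # Critical z value leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of student ages and an assumed known sigma of 2 years.
ages = [19, 21, 20, 22, 23, 20, 21, 22, 19, 23]
low, high = z_confidence_interval(ages, sigma=2, confidence=0.95)

print(f"I am 95% confident that the interval {low:.2f} to {high:.2f} "
      "includes the population mean.")
```

When σ is unknown and the sample is small, a t interval is normally used instead; that case is not covered by this sketch.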