Topic 5 Concept
Probability distributions
Density curves
image: NASA
We use mathematical functions called density curves to model continuous probability distributions.
Events are defined over intervals of values. The total area under a density curve represents the whole population.
Ex: Uniform distributions
Uniform distributions are flat density curves defined over a bounded interval.
What is the probability of getting a value x between 0.1 and 0.5?
[Figure: uniform density curve over x from 0 to 1, with total width = 1 and total height = 1; the event spans x = 0.1 to x = 0.5]
P(0.1 ≤ x ≤ 0.5) = event width * total height
= (0.5 – 0.1) * 1 = 0.4
The probability P(0.1 < x < 0.5) is also 0.4, because a single value has zero area under a density curve.
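The width-times-height calculation can be sketched in a few lines of Python. The function name `uniform_prob` and its parameters are illustrative assumptions, not part of the course material.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """Probability that x falls in [a, b] under a uniform density on [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    height = 1.0 / (high - low)      # flat density: total area (width * height) is 1
    left = max(a, low)               # clip the event to the interval where the density is nonzero
    right = min(b, high)
    width = max(0.0, right - left)   # no overlap means zero width, so zero probability
    return width * height


# Event from the example: x between 0.1 and 0.5 on the interval [0, 1]
print(round(uniform_prob(0.1, 0.5), 4))
```

For the interval [0, 1] the height is 1, so the probability is the event width itself.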
Math properties of Normal curves
• defined from –∞ to +∞
• scaled by their standard deviation σ (sigma)
• inflection points at μ ± σ
• notation: N(μ, σ)
[Figure: Normal curves with σ = 2, 4, and 6 on an x-axis from 0 to 30, with an inflection point marked]
[Figure: histogram of Height (inches), in 1-inch classes from "under 56" to "72 or more", with Percent on the vertical axis]
Scaling
N(µ,σ): $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
Middle area:
For any Normal curve N(µ,σ):
About 68% of all observations under N(µ,σ) are within µ ± σ.
Almost all (99.7%) observations under N(µ,σ) are within µ ± 3σ.
[Figure: Normal curve with horizontal axis "Number of times σ from µ"]
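The middle-area percentages can be checked numerically. The sketch below uses the standard identity between the Normal cumulative area and the error function `math.erf`; the helper names `normal_cdf` and `middle_area` are assumptions chosen for illustration.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) density curve to the left of x."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def middle_area(k, mu=0.0, sigma=1.0):
    """Area under N(mu, sigma) between mu - k*sigma and mu + k*sigma."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)


for k in (1, 3):
    print(k, round(middle_area(k), 4))
```

The two areas come out near 0.6827 and 0.9973, and they do not change with the choice of µ and σ, which is why the rule holds for any Normal curve.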
The probability that a randomly selected US adult woman has a height less than 69.5 inches is approximately
A) 99.7%
B) 97.5%
C) 95%
D) 84%
E) 68%
[Figure: Normal curve over x with an upper tail area of ~0.025]
The standard Normal distribution is N(0,1).
We can standardize the 𝑥 variable by computing $z = \frac{x - \mu}{\sigma}$.
If 𝑥 has the N(𝜇,𝜎) distribution, then 𝑧 has the N(0,1) distribution.
Precise probability calculations with Normal distributions require technology.
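As a rough illustration of standardizing, the sketch below converts a height of 69.5 inches to a z-score under the N(64.5, 2.5) model that this handout uses for US adult women, then looks up the area to its left under N(0,1). The function names are assumptions made for the example.

```python
import math


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def standard_normal_cdf(z):
    """Area under the N(0,1) density curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = z_score(69.5, 64.5, 2.5)   # 69.5 inches is 2 standard deviations above the mean
p = standard_normal_cdf(z)
print(z, round(p, 4))          # prints: 2.0 0.9772
```

A z-score of 2 leaves roughly 0.025 in the upper tail, so the area below 69.5 inches is close to 97.5%.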
Normal probability computations:
The distribution of heights (in inches) of adult women in the US can be modeled with the N(64.5, 2.5) distribution.

Computing a probability
[Figure: N(64.5, 2.5) density curve over x and the corresponding N(0,1) curve over z]
P(x ≤ a) is the blue area, the area under the density curve for values less than or equal to a.
a is the threshold value splitting the density curve into the two complementary areas, P(x ≤ a) in blue and P(x > a).
TI 83/84: 2nd DISTR => normalcdf(lower, upper, μ, σ)
CrunchIt: http://crunchit3.bfwpub.com/psls4e
P(x ≤ 69.5 inches) = 0.9772

What is the 90th percentile of heights among US adult women?
[Figure: Normal density curve with cumulative area p to the left of x = 67.7]
TI 83/84: 2nd DISTR => x = invNorm(p, μ, σ)
90th percentile = 67.70 inches
90% of US adult women are shorter than 67.7 inches.
10% of US adult women are taller than 67.7 inches.
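For readers without a TI calculator or CrunchIt, the same two computations can be sketched with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). Using Python here is an assumption of this example, not a tool the course prescribes; `cdf` plays the role of normalcdf with no lower bound and `inv_cdf` the role of invNorm.

```python
from statistics import NormalDist

# Model from the handout: heights of US adult women, N(64.5, 2.5), in inches
heights = NormalDist(mu=64.5, sigma=2.5)

# Cumulative area: P(x <= 69.5)
p_below = heights.cdf(69.5)

# Inverse lookup: the height with 90% of the area to its left
p90 = heights.inv_cdf(0.90)

print(round(p_below, 4), round(p90, 2))
```

The results should agree with the values above, 0.9772 and about 67.7 inches.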
Why it is important
CORE IDEA OF STATISTICS: When the sample is random and representative, the value of the sample statistic should be pretty close to the value of the population parameter.
2. How close is “pretty close”? Can we quantify that?
The sampling distribution of the statistic addresses these questions.
[Figure label: Target population]

Critical distinctions
Population distribution of the quantitative variable 𝒙: refers to the probability distribution of a random variable (the distribution of the value of the variable for all the individuals in the population). MATH MODEL
Sampling distribution of the sample mean 𝒙̅: refers to the theoretical probability distribution of the mean 𝑥̅ of all possible samples of size n from a given population. MATH MODEL
[There are other statistics than 𝒙̅ and they have their own sampling distributions, which we won’t study.]
Sampling distribution of the mean
Consider a population (the variable 𝑥) with mean 𝜇 and standard deviation 𝜎.

Unbiased estimate
The sampling distribution of the sample mean (the variable 𝑥̅) has mean $\mu_{\bar{x}} = \mu$.
The statistic 𝑥̅ is an unbiased estimate of the population mean 𝜇 (𝑥̅ correctly estimates 𝜇, on average).

Sample means vary much less than individual observations
The sampling distribution of 𝑥̅ has standard deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
Larger samples tend to give closer estimates of 𝜇.

Shape of the sampling distribution of 𝒙̅: central limit theorem
A sample size of 10+: good enough for most symmetric data distributions
A sample size of 25+: good enough for many situations (no extreme skew)
A sample size of 30-40+: usually good enough even for extreme skews
[Figure: density curves of 3 different populations and the corresponding sampling distributions for various sample sizes]
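The behavior of the sample mean can be illustrated with a small simulation. Everything in the sketch below is an assumption made for illustration: the population is a hypothetical right-skewed exponential distribution with 𝜇 = 1 and 𝜎 = 1, and the function name, the number of simulated samples, and the seed are arbitrary choices.

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw num_samples random samples of size n and return their means.

    Hypothetical population: exponential with mean 1 and standard deviation 1
    (strongly right-skewed), chosen only for illustration.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


mu, sigma = 1.0, 1.0
for n in (2, 10, 40):
    means = simulate_sample_means(n)
    print(
        n,
        round(statistics.fmean(means), 3),   # should land near mu
        round(statistics.stdev(means), 3),   # should land near sigma / sqrt(n)
        round(sigma / math.sqrt(n), 3),      # theoretical standard deviation of the sample mean
    )
```

In runs like this, the average of the sample means stays near 𝜇 for every n, while their spread shrinks roughly as 𝜎/√n. A histogram of `means` would also look more Normal as n grows, even though the population itself is skewed.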