Lesson 7.1 Introduction To The Normal Distribution

Uploaded by

dhirendra lamsal

Lesson 7.1 Introduction to the Normal Distribution
7.1 Identify basic properties of the normal distribution.
Last module we focussed on discrete random variables. We will now shift focus to
continuous random variables.

Continuous random variable:
outcomes can take any value in an interval.
When analyzing discrete random variables, all possible values can be listed in a
table. We call that table the probability mass function (pmf). However, continuous
random variables can take any value in an interval, which means there are infinitely
many possible values! For this reason the values cannot all be listed in a table.
Instead we represent the distribution as a probability density function.

Probability density function (pdf):
describes the relative likelihood of all values for a continuous random variable.
Usually denoted f(x).
For example, the time it takes a student to travel to school is a continuous random
variable: the outcomes are measured values that can take any value in a
range (32 minutes, 5.02 minutes, 15.88888 minutes, etc.).
The pdf can be written as a function, f(x), but often a graphical representation is
the most descriptive. The area under the curve over an interval gives the
probability that the variable takes a value in that interval. Some additional
properties of pdfs are:

1. values must be non-negative

2. total area under the curve = 1
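
The two properties can be checked numerically. The sketch below is only an illustration: the function name `is_valid_pdf` and the three candidate curves are hypothetical (they are not the curves from the example that follows), and the area is approximated with a simple midpoint sum rather than calculus.

```python
def is_valid_pdf(f, lower, upper, steps=10_000, tol=1e-3):
    """Numerically check both pdf properties on [lower, upper].

    Assumes f is zero outside [lower, upper]. The area under the curve is
    approximated by adding up thin rectangles (a midpoint Riemann sum).
    """
    width = (upper - lower) / steps
    total_area = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        height = f(x)
        if height < 0:
            return False                      # property 1 fails: negative value
        total_area += height * width
    return abs(total_area - 1.0) < tol        # property 2: total area must be 1


# Hypothetical candidate curves, each defined on the interval [0, 2].
def flat(x):
    return 0.5          # rectangle with area 0.5 * 2 = 1


def too_tall(x):
    return 1.0          # rectangle with area 1.0 * 2 = 2


def dips_negative(x):
    return x - 0.5      # negative whenever x < 0.5


print(is_valid_pdf(flat, 0, 2))
print(is_valid_pdf(too_tall, 0, 2))
print(is_valid_pdf(dips_negative, 0, 2))
```

Only the first candidate passes: the second has total area 2, and the third takes negative values.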


Example:
Which of the following is not a valid pdf? Explain.

For continuous random variables, we can also define the cumulative distribution
function.

Cumulative distribution function (cdf):
for a continuous random variable X and any number x, the probability that the
observed value of X will be at most x, that is, P(X ≤ x). The notation F(x) is
typically used for the cdf of X.
Just like in the discrete case, the cdf starts at 0 and ends at 1! F(x) also never
decreases as the value of x increases.
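
As a small illustration of these cdf properties, the sketch below uses a hypothetical random variable that is equally likely to fall anywhere between 0 and 2 (this example is an assumption, not one of the curves from these notes). Its cdf is F(x) = x / 2 on that interval.

```python
def uniform_cdf(x, lower=0.0, upper=2.0):
    """cdf of a variable that is equally likely anywhere in [lower, upper]."""
    if x <= lower:
        return 0.0                      # nothing lies below the interval
    if x >= upper:
        return 1.0                      # everything lies below this point
    return (x - lower) / (upper - lower)


# Evaluate F(x) on a grid running from -1 to 3 in steps of 0.25.
grid = [-1 + 0.25 * i for i in range(17)]
values = [uniform_cdf(x) for x in grid]

starts_at_zero = values[0] == 0.0
ends_at_one = values[-1] == 1.0
never_decreases = all(a <= b for a, b in zip(values, values[1:]))

print(starts_at_zero, ends_at_one, never_decreases)
```

A curve that dips downward anywhere, or that levels off below 1, fails one of these checks and so cannot be a cdf.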
Example:
Which of the following is not a valid cdf? Explain.

For continuous random variables, X, we can also define summary statistics:

1. Mean: μ or E(X) is a measure of the center of the distribution. The mean is a
weighted average of the possible values of the random variable, with the pdf
providing the weights. Graphically, the mean is where a pivot is placed so that the
pdf balances.

2. Variance: σ² is a measure of the spread of a distribution. The variance, like the mean,
is a weighted average. The variance averages the squared distance of each possible
value of X from the mean, with weights provided by the pdf.

3. Standard Deviation: σ is another measure of spread of a distribution. It is the square
root of the variance.
Calculus is required to calculate these quantities for continuous random
variables, except for well-known distributions whose properties have already
been worked out.
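
Without calculus, the weighted averages can still be approximated by slicing the interval into thin strips. The sketch below assumes a hypothetical pdf, f(x) = 2x on the interval [0, 1], chosen only because its exact answers are known (mean 2/3, variance 1/18); the helper name `summarize` is also an assumption.

```python
import math


def summarize(pdf, lower, upper, steps=10_000):
    """Approximate the mean, variance, and standard deviation of a pdf.

    Each strip contributes (value) * (weight), where the weight is the
    area of the strip under the pdf: pdf(x) * width.
    """
    width = (upper - lower) / steps
    xs = [lower + (i + 0.5) * width for i in range(steps)]
    mean = sum(x * pdf(x) * width for x in xs)
    variance = sum((x - mean) ** 2 * pdf(x) * width for x in xs)
    return mean, variance, math.sqrt(variance)


def triangle_pdf(x):
    return 2 * x        # hypothetical pdf on [0, 1]; total area is 1


mean, variance, sd = summarize(triangle_pdf, 0, 1)
print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {sd:.4f}")
```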
There is a pdf shape for continuous random variables that occurs very frequently in
nature when we study sums (totals) or averages.
Here are some examples of distributions for height, blood pressure, and test scores.
What do you notice about the shapes of the pdfs?

This is called the normal distribution.

Normal distribution:
a continuous probability distribution characterized by a bell-shaped (unimodal)
probability density function that is symmetric around the mean.
To summarize the normal distribution:

1. Models: Averages, totals, many natural phenomena

2. Notation: X ∼ N(μ, σ²); the distribution N(0, 1) is known as the standard normal
distribution.

3. Parameters

• μ: mean of the distribution
• σ²: variance of the distribution
• σ: standard deviation of the distribution

4. Possible values: All real numbers

5. Shape: Symmetric and unimodal = bell shaped curve


The shape of the normal distribution will change as the mean and standard deviation
change.
Let’s try it!
Go to Desmos: https://www.desmos.com/calculator/igf0ak6hkf. This interactive chart
shows how the shape of the distribution changes depending on the mean and
standard deviation.
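
The same experiment can be sketched in code. The example below uses Python's standard-library `statistics.NormalDist` (an assumption; these notes themselves use Desmos and, later, R) and compares three parameter choices picked only for illustration.

```python
from statistics import NormalDist


def peak_height(mu, sigma):
    """Height of the normal pdf at its mean, which is where the curve peaks."""
    return NormalDist(mu, sigma).pdf(mu)


# (mean, standard deviation) pairs chosen only for illustration.
for mu, sigma in [(0, 1), (0, 2), (3, 1)]:
    print(f"mu = {mu}, sigma = {sigma}: peak at x = {mu}, "
          f"height = {peak_height(mu, sigma):.4f}")
```

Changing the mean slides the curve left or right without changing its shape. Doubling the standard deviation halves the peak height, so the curve becomes wider and flatter while the total area stays at 1.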

7.2 Define and calculate percentiles.


For unimodal and symmetric distributions, such as the normal distribution, we know
how values should be distributed around the mean: they follow a pattern. In such
distributions, the mean, median, and mode are equal.

Empirical Rule:
for a normal distribution (and, approximately, for other bell-shaped
distributions), the pattern of observations is
1. About 68% of data is located within one standard deviation of the mean, in the
interval [μ − σ, μ + σ]
2. About 95% of data is located within two standard deviations of the mean, in the
interval [μ − 2σ, μ + 2σ]
3. About 99.7% of data is located within three standard deviations of the mean, in
the interval [μ − 3σ, μ + 3σ]
In Module 3 we identified outliers as values that were more than 3 standard
deviations away from the mean. That is, any value outside of the middle 99.7% of
data is unusual.
Example:
Assume that the scores on a test can be modelled with a normal distribution. This
distribution has a mean of 169.13 and standard deviation of 11.96.
1. What scores do 68% of students fall between?

2. What scores do 95% of students fall between?

3. What scores do 99.7% of students fall between? (Outliers are values beyond
these points)
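
The three intervals in this example come straight from the Empirical Rule: mean ± k standard deviations for k = 1, 2, 3. A minimal sketch follows; the helper names are hypothetical, and the coverage check relies on Python's `statistics.NormalDist`, which is an assumption about tooling rather than part of the course.

```python
from statistics import NormalDist

MEAN = 169.13   # mean test score from the example
SD = 11.96      # standard deviation from the example


def empirical_interval(k, mean=MEAN, sd=SD):
    """Interval within k standard deviations of the mean."""
    return mean - k * sd, mean + k * sd


def coverage(k, mean=MEAN, sd=SD):
    """Exact share of a normal distribution inside that interval."""
    low, high = empirical_interval(k, mean, sd)
    scores = NormalDist(mean, sd)
    return scores.cdf(high) - scores.cdf(low)


for k in (1, 2, 3):
    low, high = empirical_interval(k)
    print(f"k = {k}: {low:.2f} to {high:.2f} "
          f"(covers {coverage(k):.1%} of scores)")
```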
7.3 Calculate probabilities for normally distributed values using various methods.

z-scores
What if we want to know a percentage of observations that fall a distance other
than 1, 2, or 3 standard deviations from the mean? We need a probability table.

There is a different normal distribution for every combination of μ and σ, so
there are infinitely many of them and we cannot make a probability table for each one.

Instead, a single probability table is provided for the standard normal
distribution, which has a mean of zero and a standard deviation of 1. Any normal
distribution can be converted to a standard normal distribution by finding the
z-scores.

z-score:
A signed value that indicates the number of standard deviations a quantity is from
the mean. A positive z-score indicates that the quantity is above the mean and a
negative z-score indicates that the quantity is below the mean. A z-score with high
absolute value implies that the quantity is farther from the mean, and thus more
unusual.

$$ z = \frac{x - \mu}{\sigma} $$
Example:
Assume that the scores on a test can be modelled with a normal distribution. This
distribution has a mean of 169.13 and standard deviation of 11.96.
1. What is the z-score for the student that scores 180?

2. What is the z-score for the student that scores 160?
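
Both parts apply the z-score formula directly. A minimal sketch, with a hypothetical helper name `z_score`; results are rounded to two decimals because that is the precision a z-table uses.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


MEAN = 169.13
SD = 11.96

z_180 = z_score(180, MEAN, SD)   # positive: 180 is above the mean
z_160 = z_score(160, MEAN, SD)   # negative: 160 is below the mean

print(round(z_180, 2), round(z_160, 2))
```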

Normal Probabilities
Once you have computed the z-scores, the next step is to find the probability a
variable will have a value within any given interval using the standard normal
distribution table.
To read the normal distribution table, follow these steps:

1. Find the z-score: Determine the z-score you want to look up.

2. Locate the Tenths: For positive z-scores, find the tenths value in the left-most
column. For negative z-scores, use the right-most column.

3. Locate the Hundredths: Find the hundredths value in the top row.
By combining these two values, you can find the corresponding probability in the
body of the table.
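
A table lookup can be mimicked in code. The sketch below assumes the table reports the cumulative area to the left of z, that is P(Z ≤ z), printed to four decimals; some published tables are laid out differently, so check the table used in class. The function name `table_lookup` is hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0, 1)   # mean 0, standard deviation 1


def table_lookup(z):
    """Mimic a z-table: round z to two decimals, return the area to its left."""
    z_rounded = round(z, 2)          # tenths pick the row, hundredths the column
    return round(STANDARD_NORMAL.cdf(z_rounded), 4)


print(table_lookup(0.91))
print(table_lookup(-0.76))
```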
It is important to remember that the curve represents the probability density
function. Any area under the normal curve corresponds to a probability!! This
means the total area under the normal curve is 1.00 (all probabilities sum to 1!).
We will follow these steps to compute probabilities for the normal distribution:

1. Sketch a labelled normal curve.

2. Mark the value of interest (make sure it is correct relative to the mean) and shade in
the desired region of probability

3. Calculate the z-score(s)


4. Find the probabilities from the table (this class) or in R (next class)

5. Calculate desired probability


Example:
Assume that the scores on a test can be modelled with a normal distribution. This
distribution has a mean of 169.13 and standard deviation of 11.96.
1. Compute the probability of scoring less than 180.

2. Compute the probability of scoring less than 160.

3. Compute the probability of scoring between 160 and 180.
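
The steps above can be sketched with Python's `statistics.NormalDist`, used here as a stand-in for the table (the course itself uses the table now and R later, so this tooling is an assumption). Because the code does not round z to two decimals, its answers can differ from table answers in the third or fourth decimal place.

```python
from statistics import NormalDist

scores = NormalDist(mu=169.13, sigma=11.96)   # test-score model from the example

# 1. Area to the left of 180.
p_less_180 = scores.cdf(180)

# 2. Area to the left of 160.
p_less_160 = scores.cdf(160)

# 3. Area between 160 and 180: subtract the smaller left-area from the larger.
p_between = p_less_180 - p_less_160

print(f"P(X < 180)       = {p_less_180:.4f}")
print(f"P(X < 160)       = {p_less_160:.4f}")
print(f"P(160 < X < 180) = {p_between:.4f}")
```

The subtraction in part 3 works because the area left of 180 already includes the area left of 160; removing it leaves only the shaded region between the two scores.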

Working Backwards
There are two types of problems when working with the normal distribution:

1. Given a score find a probability

2. Given a probability find a score


We will now work on the second type of question. Probabilities can also be given as
percentiles. A percentile is a value below which a given percentage of data lies.
To solve these types of problems, use the formula

$$ x = \sigma z + \mu $$
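
This formula is the z-score formula solved for x. Each step is shown below, using the same symbols as the z-score definition:

```latex
\begin{align*}
z &= \frac{x - \mu}{\sigma} \\
\sigma z &= x - \mu && \text{multiply both sides by } \sigma \\
x &= \sigma z + \mu && \text{add } \mu \text{ to both sides}
\end{align*}
```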
We will follow these steps to compute scores from the normal distribution:

1. Sketch a labelled normal curve.


2. Determine the area of interest from the question (given as a percentile, percentage,
or probability) and shade it in on the graph.

3. Find the closest area in the body of the chart and determine the z-score. Watch the
sign of the z-score!

4. Calculate the score using the formula


Example:
Assume that the scores on a test can be modelled with a normal distribution. This
distribution has a mean of 169.13 and standard deviation of 11.96.
1. What value corresponds to the 80th percentile?

2. What value marks the cutoff for the top 10%?
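
Working backwards can be sketched with `inv_cdf` from Python's `statistics.NormalDist`, which plays the role of searching the body of the table for the closest area (this tooling is an assumption; the class uses the table). The helper name `score_at_percentile` is hypothetical. Note that the top 10% begins at the 90th percentile, since 90% of scores lie below that cutoff.

```python
from statistics import NormalDist

MEAN = 169.13
SD = 11.96


def score_at_percentile(p, mean=MEAN, sd=SD):
    """Score below which a proportion p of the distribution lies."""
    z = NormalDist().inv_cdf(p)   # z-score whose left-area equals p
    return sd * z + mean          # convert back to a score: x = sigma * z + mu


score_80th = score_at_percentile(0.80)      # 80th percentile
top_10_cutoff = score_at_percentile(0.90)   # top 10% starts at the 90th percentile

print(f"80th percentile: {score_80th:.2f}")
print(f"top 10% cutoff:  {top_10_cutoff:.2f}")
```

Table-based answers may differ slightly, because the table forces z to be rounded to two decimals before the score is calculated.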
