
Stat Notes-Week 1-9

The document defines random variables and discusses discrete and continuous probability distributions. It provides examples of discrete and continuous random variables. It then discusses the properties of probability distributions, including that the probabilities of all outcomes must be between 0 and 1 and sum to 1. The document gives an example problem of tossing two coins and finding the mean, variance, and standard deviation of the random variable for the number of heads. It solves this example problem in multiple steps, constructing the probability distribution and calculating the statistics.

Uploaded by

Chrystell Jane
Copyright
© © All Rights Reserved

Mean and Variance of Random Variables
RANDOM VARIABLE
–a function that associates a real number to each element in the sample space.
– a variable whose values are determined by chance.
DISCRETE VS. CONTINUOUS
Discrete
• Possible outcomes are countable
• Presented by countable data
• There is a limit
Examples:
1. The number of voters favoring a candidate
2. The number of deaths per year attributed to lung cancer
Continuous
• Possible outcomes are on a continuous scale (height, weight, or temperature)
• There is no limit
Examples:
1. The average amount of electricity consumed per household per month
2. The weight of newborns each year in a hospital
3. The speed of a bus

Discrete Probability Distribution


– or a probability mass function, consists of the values a random variable can assume and the
corresponding probabilities of the values.
– the graphical method for displaying the shape of a distribution is called a histogram. It is
particularly useful when there are a large number of observations.
PROPERTIES OF A PROBABILITY DISTRIBUTION
1. The probability of each value of the random variable must be between or equal to 0 and 1. In symbols, we write it as $0 \le P(X) \le 1$.
2. The sum of the probabilities of all values of the random variable must be equal to 1. In symbols, we write it as $\sum P(X) = 1$.
EXAMPLE:
Determine whether the distribution represents a probability distribution.
1. X       1      2       3      4      5
   P(X)    1/19   10/19   5/19   2/19   1/19
2. P(0) = 1/6, P(2) = 1/6, P(4) = 4/6, P(6) = 1/6, P(8) = 1/6
3. P(1) = 0, P(2) = 0.71, P(3) = 0.39

4. X       1     5     7     8     9
   P(X)    1/5   1/5   1/5   1/5   1/5
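
The two properties can be checked mechanically. The sketch below is only an illustration: the function name `is_probability_distribution` is made up for this example, and exact fractions are used so that the sum is not affected by rounding.

```python
from fractions import Fraction

def is_probability_distribution(probs):
    """Return True if every probability is in [0, 1] and they sum to exactly 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    return in_range and sum(probs) == 1

# The four distributions from the example, written as exact fractions.
dist1 = [Fraction(1, 19), Fraction(10, 19), Fraction(5, 19), Fraction(2, 19), Fraction(1, 19)]
dist2 = [Fraction(1, 6), Fraction(1, 6), Fraction(4, 6), Fraction(1, 6), Fraction(1, 6)]
dist3 = [Fraction("0"), Fraction("0.71"), Fraction("0.39")]
dist4 = [Fraction(1, 5)] * 5

for name, dist in [("1", dist1), ("2", dist2), ("3", dist3), ("4", dist4)]:
    print(name, is_probability_distribution(dist), "sum =", sum(dist))
```

Under these checks, distributions 1 and 4 pass, while 2 and 3 fail because their probabilities add up to more than 1.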

example problem
Suppose that two coins are tossed at the same time. Let Y be the random variable representing
the number of heads that occur. Find the values of the random variable Y. Construct the
probability distribution of the random variable Y and its histogram. Solve the mean, variance,
and standard deviation.
STEP 1:
Determine the sample space. Let H represent head and T represent tail.
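SAMPLE SPACE: S = {TT, TH, HT, HH}
(Tree diagram: the first toss branches to T or H, and each branch splits again to T or H on the second toss, giving TT, TH, HT, and HH.)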
STEP 2:
Count the number of heads in each outcome in the sample space and assign this number to this
outcome.

Possible Outcomes    Value of the Random Variable Y (number of heads)
TT                   0
TH                   1
HT                   1
HH                   2
STEP 3:
Identify the probability of each value of the variable Y and create a Histogram to represent the
data.
Number of Heads Y    Probability P(Y)
0                    1/4
1                    1/2
2                    1/4
(Histogram: Probability P(Y) on the vertical axis against Number of Heads Y on the horizontal axis.)

Mean of a Discrete Probability Distribution


The mean of a random variable with a discrete probability distribution is given by:
$\mu = X_1 \cdot P(X_1) + X_2 \cdot P(X_2) + X_3 \cdot P(X_3) + \dots + X_n \cdot P(X_n) = \sum X \cdot P(X)$
Where $X_1, X_2, X_3, \dots, X_n$ are the values of the random variable X; and $P(X_1), P(X_2), P(X_3), \dots, P(X_n)$ are the corresponding probabilities.
STEP 4: (from the first problem)
Find the mean.
$\mu = \sum X \cdot P(X)$, which here is $\mu = \sum Y \cdot P(Y)$
Number of Heads Y    Probability P(Y)    Y · P(Y)
0                    1/4                 (0)(1/4) = 0
1                    1/2                 (1)(1/2) = 1/2
2                    1/4                 (2)(1/4) = 2/4

$\mu = \sum Y \cdot P(Y) = 0 + \frac{1}{2} + \frac{2}{4} = 1$
VARIANCE of a Discrete Probability Distribution
The variance of a random variable with a discrete probability distribution is given by:
$\sigma^2 = \sum (x - \mu)^2 \cdot P(x)$
Where
x = value of the random variable
P(x) = probability of the random variable X
$\mu$ = mean of the probability distribution
STEP 5:
Find the variance.
Number of Heads Y    Probability P(Y)    Y · P(Y)          $Y - \mu$     $(Y - \mu)^2$    $(Y - \mu)^2 \cdot P(Y)$
0                    1/4                 (0)(1/4) = 0      0 - 1 = -1    $(-1)^2 = 1$     (1)(1/4) = 0.25
1                    1/2                 (1)(1/2) = 1/2    1 - 1 = 0     $(0)^2 = 0$      (0)(1/2) = 0
2                    1/4                 (2)(1/4) = 2/4    2 - 1 = 1     $(1)^2 = 1$      (1)(1/4) = 0.25
$\sigma^2 = \sum (Y - \mu)^2 \cdot P(Y) = 0.25 + 0 + 0.25 = 0.5$
$\sigma = \sqrt{0.5} \approx 0.71$
The probability distribution of the number of heads after tossing two coins has a mean, variance, and standard deviation of 1, 0.5, and 0.71, respectively. The results show that the expected average outcome over many tosses is 1 head. In addition, the small variance and standard deviation indicate that the number of heads stays close to the mean.
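
Steps 4 and 5 can be reproduced in a few lines. This is a minimal sketch rather than part of the original notes; the helper name `describe` and the dictionary layout (value mapped to probability) are assumptions made for illustration.

```python
from math import sqrt

def describe(dist):
    """dist maps each value of the random variable to its probability."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance, sqrt(variance)

# Number of heads when two coins are tossed.
two_coins = {0: 0.25, 1: 0.5, 2: 0.25}
mean, variance, sd = describe(two_coins)
print(mean, variance, round(sd, 2))  # 1.0 0.5 0.71
```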

example problem
Suppose that three coins are tossed at the same time. Let X be the random variable
representing the number of tails that occur. Find the values of the random variable X.
STEP 1:
Determine the sample space. Let H represent head and T represent tail.
SAMPLE SPACE: S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
(Tree diagram: each of the three tosses branches to H or T, giving the eight outcomes listed in the sample space.)
STEP 2:
Count the number of tails in each outcome in the sample space and assign this number to this
outcome.

Possible Outcomes    Value of the Random Variable X (number of tails)
TTT                  3
TTH                  2
THT                  2
THH                  1
HTT                  2
HTH                  1
HHT                  1
HHH                  0

STEP 3:
Identify the probability of each value of the variable X and create a Histogram to represent the
data.
Number of Tails X    Probability P(X)
0                    1/8
1                    3/8
2                    3/8
3                    1/8
(Histogram: Probability P(X) on the vertical axis against Number of Tails X on the horizontal axis.)
STEP 4:
Find the mean.
$\mu = \sum X \cdot P(X)$
Number of Tails X    Probability P(X)    X · P(X)
0                    1/8                 (0)(1/8) = 0
1                    3/8                 (1)(3/8) = 3/8
2                    3/8                 (2)(3/8) = 6/8
3                    1/8                 (3)(1/8) = 3/8

$\mu = \sum X \cdot P(X) = 0 + \frac{3}{8} + \frac{6}{8} + \frac{3}{8} = \frac{12}{8} = \frac{3}{2}$ or 1.5

STEP 5:
Find the variance.
Number of Tails X    Probability P(X)    X · P(X)          $X - \mu$          $(X - \mu)^2$        $(X - \mu)^2 \cdot P(X)$
0                    1/8                 (0)(1/8) = 0      0 - 1.5 = -1.5     $(-1.5)^2 = 2.25$    (2.25)(1/8) = 0.28125
1                    3/8                 (1)(3/8) = 3/8    1 - 1.5 = -0.5     $(-0.5)^2 = 0.25$    (0.25)(3/8) = 0.09375
2                    3/8                 (2)(3/8) = 6/8    2 - 1.5 = 0.5      $(0.5)^2 = 0.25$     (0.25)(3/8) = 0.09375
3                    1/8                 (3)(1/8) = 3/8    3 - 1.5 = 1.5      $(1.5)^2 = 2.25$     (2.25)(1/8) = 0.28125
$\sigma^2 = \sum (X - \mu)^2 \cdot P(X) = 0.28125 + 0.09375 + 0.09375 + 0.28125 = 0.75$
$\sigma = \sqrt{0.75} \approx 0.87$

The probability distribution of the number of tails after tossing three coins has a mean, variance, and standard deviation of 1.5, 0.75, and 0.87, respectively. The results show that the expected average outcome over many tosses is 1.5 tails. In addition, the small variance and standard deviation indicate that the number of tails stays close to the mean.
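
The whole three-coin example, from the sample space to the variance, can also be generated by enumeration. The sketch below assumes fair coins; the function name `tails_distribution` is hypothetical and not taken from the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def tails_distribution(n_coins):
    """Enumerate all outcomes of n fair coins and count the tails in each."""
    outcomes = list(product("HT", repeat=n_coins))
    counts = Counter(outcome.count("T") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

dist = tails_distribution(3)
mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
print(dist)            # probabilities 1/8, 3/8, 3/8, 1/8 for 0..3 tails
print(mean, variance)  # 3/2 3/4
```

Calling `tails_distribution(2)` gives the same 1/4, 1/2, 1/4 shape as the two-coin example, only counted in tails instead of heads.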
Normal Distribution
– also called the normal curve, is the distribution of data where the mean, median, and mode are
equal
– the distribution is clustered at the center
– the graph is a bell-shaped curve, and symmetrical

PROPERTIES OF THE NORMAL PROBABILITY DISTRIBUTION


1. The distribution curve is bell-shaped
2. The curve is symmetrical about its center
3. The mean, median, and the mode coincide at the center
4. The width of the curve is determined by the standard deviation of the distribution
5. The tails of the curve flatten out indefinitely along the horizontal axis, always approaching the
axis but never touching it. That is, the curve is asymptotic to the baseline.
6. The area under the curve is 1. Thus, it represents the probability or proportion or the
percentage associated with specific sets of measurement values

THE STANDARD NORMAL CURVE


– a probability distribution that has a mean $\mu = 0$ and a standard deviation $\sigma = 1$.
(Figure: the standard normal curve, with probability on the vertical axis and standard deviations from -4 to 4 on the horizontal axis.)
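THE Z-SCORE
– also called the z-value, tells how many standard deviations a given measurement lies above or below the mean; it is used to read areas under the normal curve
$z = \frac{X - \mu}{\sigma}$
Where:
X = given measurement
$\mu$ = population mean
$\sigma$ = standard deviation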
example problems
Given the mean $\mu = 50$ and the standard deviation $\sigma = 4$ of a population of Reading scores, find the z-value that corresponds to a score X = 58.
$z = \frac{X - \mu}{\sigma} = \frac{58 - 50}{4} = 2$
(Figure: normal curve with z-scores -3 to 3 matched to the scores 38, 42, 46, 50, 54, 58, 62.)
Find the z-value of the following set of data. Tell whether the score is above or below the mean.
1. $\mu = 45$, $\sigma = 6$, X = 39: z = -1 (below the mean)
2. $\mu = 40$, $\sigma = 8$, X = 52: z = 1.5 (above the mean)
3. $\mu = 75$, $\sigma = 15$, X = 82: z = 0.47 (above the mean)
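
The z-score formula is a one-liner in code. The sketch below is illustrative only; the function name `z_score` is an assumption for this example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# (mu, sigma, X) for the three practice items above.
for mu, sigma, x in [(45, 6, 39), (40, 8, 52), (75, 15, 82)]:
    z = z_score(x, mu, sigma)
    position = "above" if z > 0 else "below" if z < 0 else "at"
    print(round(z, 2), position, "the mean")
```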
TABLE OF AREAS UNDER THE NORMAL CURVE
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
For values of z above 3.09, use 0.4999 for the area.
Find the area under the standard normal curve between z = 0 and the following z-scores.
a. z = 0.96 or P(0 < z < 0.96)
Step 1: Express the z-value as three digits: z = 0.96
Step 2: In the table, find the first two digits in the row headings: z = 0.9
Step 3: Match the third digit with the appropriate column heading: z = 0.06
Step 4: Read the area (or probability) at the intersection of the row and the column: A = 0.3315

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

b. z = 1.45 or P(0 < z < 1.45)

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
A = 0.4265 (row 1.4, column 0.05)

c. z = 2.38 or P(0 < z < 2.38)
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
A = 0.4913 (row 2.3, column 0.08)
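
The table entries can be cross-checked without the table. The sketch below relies on the standard identity that the area between 0 and z under the standard normal curve equals half of the error function evaluated at z divided by the square root of 2; the function name `area_zero_to` is made up for this illustration.

```python
from math import erf, sqrt

def area_zero_to(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * erf(z / sqrt(2))

for z in (0.96, 1.45, 2.38):
    print(z, round(area_zero_to(z), 4))
```

The printed areas should agree with the table to about four decimal places; tiny differences can appear because the table itself is rounded.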
Find the area under the standard normal curve bounded by the following pairs of z-scores.
a. z = 1 and z = 2 or P(1 < z < 2)
z = 1 -> A = 0.3413 (row 1.0, column 0.00)
z = 2 -> A = 0.4772 (row 2.0, column 0.00)
The area of the region in between:
A = 0.4772 - 0.3413 = 0.1359

b. z = 2.16 and z = 3.43 or P(2.16 < z < 3.43)
z = 2.16 -> A = 0.4846 (row 2.1, column 0.06)
z = 3.43 -> A = 0.4999 (z is above 3.09)
The area of the region in between:
A = 0.4999 - 0.4846 = 0.0153

c. z = -1.91 and z = 3 or P(-1.91 < z < 3)
z = -1.91 -> A = 0.4719 (row 1.9, column 0.01)
z = 3 -> A = 0.4987 (row 3.0, column 0.00)
The area of the region in between (the two z-scores lie on opposite sides of the mean, so the areas are added):
A = 0.4719 + 0.4987 = 0.9706
Find the area or proportion (probability) indicated by each item.
a. Above z = -1 or P(z > -1)
z = 1 -> A = 0.3413 (row 1.0, column 0.00)
Half of the Normal Curve -> A = 0.5
A = 0.3413 + 0.5 = 0.8413

b. To the left of z = -1.5 or P(z < -1.5)
z = 1.5 -> A = 0.4332 (row 1.5, column 0.00)
Half of the Normal Curve -> A = 0.5
A = 0.5 - 0.4332 = 0.0668
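
These add-or-subtract rules all reduce to differences of the cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library follows; the helper name `prob_between` is an assumption for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_between(a, b):
    """P(a < z < b) for the standard normal distribution."""
    return Z.cdf(b) - Z.cdf(a)

print(round(prob_between(1, 2), 4))      # P(1 < z < 2)
print(round(prob_between(-1.91, 3), 4))  # P(-1.91 < z < 3)
print(round(1 - Z.cdf(-1), 4))           # P(z > -1)
print(round(Z.cdf(-1.5), 4))             # P(z < -1.5)
```

Because the code does not round intermediate table values or cap the area at 0.4999, an item such as P(2.16 < z < 3.43) may differ from the table-based answer in the last decimal places.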

Percentile
– a measure of relative standing.
– a descriptive measure of the relationship of a measurement to the rest of the data
– divides a set of data into 100 equal parts
– always refers to the quantities below or less than the percentile rank

example problems
a. What percentile does the z-score 2.34 represent?
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916

Add 0.5000 and round off to the nearest hundredths:
0.4904 + 0.5000 = 0.9904 = 0.99
The z-score represents the 99th percentile or P99.

b. What percentile does the z-score -1.82 represent?
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706

Subtract from 0.5000 and round off to the nearest hundredths:
0.5000 - 0.4656 = 0.0344 = 0.03
The z-score represents the 3rd percentile or P3.

c. What z-score corresponds to P96 or 96th percentile?

P96 or 96th percentile (96%) represents the 0.9600 area under the normal curve.
A = 0.9600 - 0.5000 = 0.4600
Nearest value = 0.4599, so z = 1.75

d. What z-score corresponds to P34 or 34th percentile?

P34 or 34th percentile (34%) represents the 0.3400 area under the normal curve.
A = 0.5000 - 0.3400 = 0.1600
Nearest value = 0.1591, so z = -0.41 (negative because the percentile is below the 50th)
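
Going from a z-score to a percentile uses the cumulative distribution function, and going back uses its inverse. The sketch below uses `statistics.NormalDist`; the two helper names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def percentile_of(z):
    """Percentage of the distribution that lies below z."""
    return Z.cdf(z) * 100

def z_for_percentile(p):
    """z-score below which p percent of the distribution lies."""
    return Z.inv_cdf(p / 100)

print(round(percentile_of(2.34)), round(percentile_of(-1.82)))
print(round(z_for_percentile(96), 2), round(z_for_percentile(34), 2))
```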

example word problems


1. DG company has 100 branches nationwide. The annual profit of DG company is normally distributed with a mean of Php 73 million a year with a standard deviation of Php 3.25 million. How many branches have a profit of Php 73 million to Php 80 million?
Given:
n = 100
$\mu$ = Php 73M = 73
X = Php 80M = 80
$\sigma$ = Php 3.25M = 3.25
$z = \frac{X - \mu}{\sigma} = \frac{80 - 73}{3.25} = 2.15$
Number of branches
= [Area of P(0 < z < 2.15)](n)
= (0.4842)(100)
= 48.42, or 48 branches
Therefore, 48 branches have a profit of Php 73 M to Php 80 M.

2. Fifty job applicants took an IQ test and their scores are normally distributed with a mean of 100.
a. How many applicants obtained a score between 74 and 126 if the standard deviation is 20?
Given:
n = 50
$\mu = 100$
$X_1 = 74$
$X_2 = 126$
$\sigma = 20$
$z_1 = \frac{X_1 - \mu}{\sigma} = \frac{74 - 100}{20} = -1.3$
$z_2 = \frac{X_2 - \mu}{\sigma} = \frac{126 - 100}{20} = 1.3$
Number of applicants
= [Area of P(-1.3 < z < 1.3)](n)
= (0.4032 + 0.4032)(50)
= 40.32, or 40 applicants
Therefore, 40 applicants obtained a score between 74 and 126 if the standard deviation is 20.
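b. The management decided not to hire the lowest 20% of the applicants. What score must an applicant obtain to get hired if the standard deviation is 20?
Given:
n = 50
$\mu = 100$
$\sigma = 20$
A = 30% below the mean (the lowest 20% leaves 50% - 20% = 30% between the cutoff and the mean)
Nearest table value to 0.3000: A = 0.2995, so z = -0.84 (negative because the cutoff is below the mean)
$z = \frac{X - \mu}{\sigma}$, so $X = z\sigma + \mu$
X = (-0.84)(20) + 100 = 83.2, or about 83
An applicant must score at least about 83 to be hired.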
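
The word problems above can be checked directly against a normal distribution with the stated mean and standard deviation. This is a sketch under the same normality assumption as the problems; the variable names are chosen only for illustration.

```python
from statistics import NormalDist

# Problem 1: annual profit in millions of pesos, 100 branches.
profit = NormalDist(mu=73, sigma=3.25)
branches = 100 * (profit.cdf(80) - profit.cdf(73))

# Problem 2: IQ scores of 50 applicants.
iq = NormalDist(mu=100, sigma=20)
applicants = 50 * (iq.cdf(126) - iq.cdf(74))
cutoff = iq.inv_cdf(0.20)  # score separating the lowest 20%

print(round(branches), round(applicants), round(cutoff))
```

The unrounded values differ slightly from the table-based ones because the z-scores are not rounded to two decimals first, but they round to the same whole numbers.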
Random Sampling
– a part of the sampling technique in which each sample has an equal probability of being
chosen. A sample chosen randomly is meant to be an unbiased representation of the total
population.
PARAMETERS
– are descriptive measures computed from a population
STATISTICS
– are descriptive measures computed from a sample
SAMPLING DISTRIBUTION OF SAMPLE MEANS
– is a frequency distribution using the means computed from all possible random samples of a
specific size
FINITE POPULATION
– is one that consists of a finite or fixed number of elements, measurements, or observations
INFINITE POPULATION
– contains, hypothetically, at least, an infinite number of elements
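example problems
All possible samples of size 3 drawn from the population 3, 4, 6, 7, 9, with the mean of each sample:
Data     Average
3,4,6    4.33
3,4,7    4.67
3,4,9    5.33
3,6,7    5.33
3,6,9    6.00
3,7,9    6.33
4,6,7    5.67
4,6,9    6.33
4,7,9    6.67
6,7,9    7.33

Sampling distribution of the sample means:
Sample Mean $\bar{X}$    Frequency    Probability $P(\bar{X})$
4.33                     1            0.10
4.67                     1            0.10
5.33                     2            0.20
5.67                     1            0.10
6.00                     1            0.10
6.33                     2            0.20
6.67                     1            0.10
7.33                     1            0.10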

FINDING THE SAMPLE MEAN


The solution of finding the sample mean is just the same as finding the mean. The only
difference is the symbol used.
$\mu_{\bar{X}} = \sum \bar{X} \cdot P(\bar{X})$
$\bar{X}$    Frequency    $P(\bar{X})$ or Probability    $\bar{X} \cdot P(\bar{X})$
4.33         1            0.10                           0.433
4.67         1            0.10                           0.467
5.33         2            0.20                           1.066
5.67         1            0.10                           0.567
6.00         1            0.10                           0.600
6.33         2            0.20                           1.266
6.67         1            0.10                           0.667
7.33         1            0.10                           0.733
Mean $\mu_{\bar{X}}$ = 5.80

FINDING THE VARIANCE


$\bar{X}$    Frequency    $P(\bar{X})$    $\bar{X} \cdot P(\bar{X})$    $\bar{X} - \mu_{\bar{X}}$    $(\bar{X} - \mu_{\bar{X}})^2$    $P(\bar{X}) \cdot (\bar{X} - \mu_{\bar{X}})^2$
4.33         1            0.10            0.433                         -1.47                        2.1609                           0.2161
4.67         1            0.10            0.467                         -1.13                        1.2769                           0.1277
5.33         2            0.20            1.066                         -0.47                        0.2209                           0.0442
5.67         1            0.10            0.567                         -0.13                        0.0169                           0.0017
6.00         1            0.10            0.600                         0.20                         0.0400                           0.0040
6.33         2            0.20            1.266                         0.53                         0.2809                           0.0562
6.67         1            0.10            0.667                         0.87                         0.7569                           0.0757
7.33         1            0.10            0.733                         1.53                         2.3409                           0.2341
Mean $\mu_{\bar{X}}$ = 5.80    Variance $\sigma^2_{\bar{X}}$ = 0.76
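
The sampling distribution can be rebuilt by listing every sample of size 3 from the population 3, 4, 6, 7, 9 (the population used later in these notes). The sketch below assumes sampling without replacement, which matches the ten samples in the table; the variable names are illustrative.

```python
from itertools import combinations

population = [3, 4, 6, 7, 9]

# Mean of every possible sample of size 3 (10 samples in total).
sample_means = [sum(s) / len(s) for s in combinations(population, 3)]

# Each sample is equally likely, so the mean and variance of the
# sampling distribution are plain averages over the sample means.
mu_xbar = sum(sample_means) / len(sample_means)
var_xbar = sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means)

print(len(sample_means), round(mu_xbar, 2), round(var_xbar, 2))  # 10 5.8 0.76
```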

PROPERTIES OF THE SAMPLING DISTRIBUTION OF SAMPLE MEAN

If all possible samples of size n are drawn from a population of size N with mean $\mu$ and variance $\sigma^2$, then the sampling distribution of the sample means has the following properties:
• The mean of the sampling distribution of the sample means is equal to the population mean $\mu$. That is, $\mu_{\bar{X}} = \mu$.
• The variance of the sampling distribution of the sample means, $\sigma^2_{\bar{X}}$, is given by:
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$ (for Finite Population)
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$ (for Infinite Population)
• The standard deviation $\sigma_{\bar{X}}$ of the sampling distribution of the sample means is just the square root of its variance. For an infinite population, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
• It is also known as the standard error of the mean
• It measures the degree of accuracy of the sample mean as an estimate of the population mean
example problems
a. Compute the Population Mean, Population Variance, and Variance of the Sampling Distribution.
Population X    $X - \mu$    $(X - \mu)^2$
3               -2.8         7.84
4               -1.8         3.24
6               0.2          0.04
7               1.2          1.44
9               3.2          10.24

POPULATION MEAN
$\mu = \frac{\sum X}{N} = \frac{3 + 4 + 6 + 7 + 9}{5} = 5.8$
POPULATION VARIANCE
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{7.84 + 3.24 + 0.04 + 1.44 + 10.24}{5} = \frac{22.8}{5} = 4.56$

VARIANCE OF THE SAMPLING DISTRIBUTION

Since it is a finite population, then
$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} = \frac{4.56}{3} \cdot \frac{5 - 3}{5 - 1} = 0.76$
Therefore, the mean of the sampling distribution is 5.8 and its variance is 0.76.
b. A population has a mean of 60 and a standard deviation of 5. A random sample of 16 measurements is drawn from this population. Describe the sampling distribution of the sample means by computing its mean and standard deviation.
Given:
$\sigma = 5$
n = 16
$\mu = 60$
The problem does not indicate the size of the population, so it is assumed that the population is infinite. Therefore,
$\mu_{\bar{X}} = \mu = 60$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{16}} = 1.25$
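
Both variance formulas fit in one small function. In this sketch, the name `variance_of_sample_means` and the convention that `N=None` stands for an infinite population are assumptions made for illustration.

```python
from math import sqrt

def variance_of_sample_means(pop_variance, n, N=None):
    """Variance of the sampling distribution of the sample means.

    N is the population size; N=None is treated as an infinite population.
    """
    variance = pop_variance / n
    if N is not None:
        variance *= (N - n) / (N - 1)  # finite population correction
    return variance

print(round(variance_of_sample_means(4.56, 3, N=5), 2))  # example a
print(sqrt(variance_of_sample_means(5 ** 2, 16)))        # example b, standard error
```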

Central Limit Theorem


The Central Limit Theorem states that if random samples of size n are drawn from a population, then as n becomes larger, the sampling distribution of the mean approaches the normal distribution, regardless of the shape of the population distribution.
THE FORMULA: $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ is used when computing the probability that $\bar{X}$ will take on a value within a given range in the sampling distribution of $\bar{X}$
where:
$\bar{X}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size

example problems
a. The average time it takes a group of college students to complete a certain examination is 46.2 minutes. The standard deviation is 8 minutes. Assume that the variable is normally distributed. If 50 randomly selected college students take the examination, what is the probability that the mean time it takes the group to complete the test will be less than 43 minutes?
Given:
$\mu = 46.2$
$\sigma = 8$
$\bar{X} = 43$
n = 50
The problem is dealing with data about the sample mean. Thus, the formula above will be used.
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{43 - 46.2}{8 / \sqrt{50}} = -2.83$

The problem is asking for $P(\bar{X} < 43)$. Therefore,
$P(\bar{X} < 43) = P(z < -2.83)$
= 0.5000 - 0.4977
= 0.0023
The probability that 50 randomly selected college students will complete the test in less than 43 minutes on average is 0.0023 or 0.23%.

b. For a TV advertisement aired 30 times, the sales of a product have a mean of Php 100,000 a week with a standard deviation of Php 75,000.
i. What is the probability that the sample mean belongs to the interval of Php 90,000 to Php 110,000?
ii. What interval of the mean sales covers the middle 95% of the distribution of the sample mean?
Given:
$\mu = 100{,}000$
$\sigma = 75{,}000$
n = 30
Problem i is asking for:
$P\left(\frac{\bar{X}_1 - \mu}{\sigma / \sqrt{n}} < z < \frac{\bar{X}_2 - \mu}{\sigma / \sqrt{n}}\right)$
$= P\left(\frac{90{,}000 - 100{,}000}{75{,}000 / \sqrt{30}} < z < \frac{110{,}000 - 100{,}000}{75{,}000 / \sqrt{30}}\right)$
= P(-0.73 < z < 0.73) = 0.2673 + 0.2673
= 0.5346
Therefore, the probability that the sample mean belongs to the interval of Php 90,000 to Php 110,000 is 53.46%.
For problem ii, since the interval covers the middle 95% of the distribution, 47.5% of the distribution is required on each side of the mean. The z-scores that represent 47.5% of the distribution are $\pm 1.96$.
$\bar{X} = z \frac{\sigma}{\sqrt{n}} + \mu$
$\bar{X} = -1.96 \cdot \frac{75{,}000}{\sqrt{30}} + 100{,}000$ = Php 73,161.59
$\bar{X} = 1.96 \cdot \frac{75{,}000}{\sqrt{30}} + 100{,}000$ = Php 126,838.41
Therefore, the middle 95% of the distribution of the sample mean is represented by the interval Php 73,161.59 < $\bar{X}$ < Php 126,838.41
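
Both Central Limit Theorem examples can be checked by treating the sample mean as normally distributed with standard deviation equal to the population standard deviation divided by the square root of n. The helper name `sampling_distribution` below is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def sampling_distribution(mu, sigma, n):
    """Approximate distribution of the sample mean under the Central Limit Theorem."""
    return NormalDist(mu, sigma / sqrt(n))

# Example a: exam completion times.
exam = sampling_distribution(46.2, 8, 50)
p_less_43 = exam.cdf(43)

# Example b: weekly sales.
sales = sampling_distribution(100_000, 75_000, 30)
p_interval = sales.cdf(110_000) - sales.cdf(90_000)
low, high = sales.inv_cdf(0.025), sales.inv_cdf(0.975)

print(round(p_less_43, 4), round(p_interval, 4))
print(round(low, 2), round(high, 2))
```

The results agree with the worked answers up to the rounding of z to two decimals (and of 1.96) in the table-based solution.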

Point Estimate
ESTIMATION
– the process of finding an approximate value of some parameter—such as the mean—of a
population from random samples of the population
– population parameters are usually unknown fixed values but there are 2 ways to determine
and report them:
1. Report a number that describes the average. This number is called the point estimate
• A point estimate is a specific numerical value of a population parameter. The sample mean $\bar{X}$ is the best point estimate of the population mean
• The mean is the best estimator because any change in value affects the result
2. Report a range of values that contains the number that truly describes the data. This range is called the interval estimate
• An interval estimate is a range of values that may contain the parameter of a population

PROPERTIES OF A GOOD ESTIMATOR


A good estimator has the following properties:
1. When the mean of a sample statistic from a large number of different random samples equals the true population parameter, then the sample statistic is an unbiased estimate of the population parameter
2. Across the many repeated samples, the estimates are not very far from the true parameter value

(Figure: three graphs labeled A, B, and C, each marking the population mean $\mu$.)
Which of the 3 graphs shows a good estimator?
Out of the three graphs, graph B shows a good estimator.
To feel confident of our estimations, we take as many random samples as possible, compute the sample statistics, carefully compare results, and formulate conclusions.
GRAPH A
• Represents negative bias or underestimate since most of the samples are taken below the mean
GRAPH B
• Is unbiased or on-target estimate for the reason that equal numbers of samples are taken from both sides of the mean, where population mean is equal to the sample mean
GRAPH C
• Represents positive bias or overestimate because the majority of the samples are above the mean

example problems
1. Mr. Santiago’s company sells bottles of coconut juice. He claims that a bottle contains 500ml of
such juice. A consumer group wanted to know if his claim is true. They took six random samples of
10 such bottles and obtained the capacity, in ml, of each bottle. The result is shown as follows.

Sample 1 500 498 497 503 499 497 497 497 497 495
Sample 2 500 500 495 494 498 500 500 500 500 497
Sample 3 497 497 502 496 497 497 497 497 497 495
Sample 4 501 495 500 497 497 500 500 495 497 497
Sample 5 502 497 497 499 496 497 497 499 500 500
Sample 6 496 497 496 495 497 497 500 500 496 497

Assuming that the measurements were carefully obtained and that the only kind of error present is the sampling error, what is the point estimate of the population mean?
The point estimate of the population mean $\mu$ is also known as the mean of the means or the overall mean. To find its value, simply find the sum of the mean values and divide this sum by the total number of sample means.
SOLUTION 1: Sample Row Mean

Sample Row      Sum of Scores    Mean
1               4980             498.0
2               4984             498.4
3               4972             497.2
4               4979             497.9
5               4984             498.4
6               4971             497.1
Overall Mean                     497.83
The point estimate of the population parameter is 497.83 ml. The computed mean based on the sample is slightly less than the claimed value of 500 ml.
SOLUTION 2: Sample Column Mean

Sample Columns    Sum of Scores    Mean
1                 2996             499.33
2                 2984             497.33
3                 2987             497.83
4                 2984             497.33
5                 2984             497.33
6                 2988             498.00
7                 2991             498.50
8                 2988             498.00
9                 2987             497.83
10                2981             496.83
Overall Mean                       497.83
The population mean is 497.83 ml. Regardless of how samples are randomly selected, forming a sampling distribution of means, the mean of the means is equal to the population mean.
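
Both solutions can be reproduced from the raw bottle measurements. The sketch below simply averages the row means and the column means of the data table; the variable names are chosen for illustration.

```python
samples = [
    [500, 498, 497, 503, 499, 497, 497, 497, 497, 495],
    [500, 500, 495, 494, 498, 500, 500, 500, 500, 497],
    [497, 497, 502, 496, 497, 497, 497, 497, 497, 495],
    [501, 495, 500, 497, 497, 500, 500, 495, 497, 497],
    [502, 497, 497, 499, 496, 497, 497, 499, 500, 500],
    [496, 497, 496, 495, 497, 497, 500, 500, 496, 497],
]

# Solution 1: mean of each sample (row), then the mean of those means.
row_means = [sum(row) / len(row) for row in samples]
overall_from_rows = sum(row_means) / len(row_means)

# Solution 2: mean of each column, then the mean of those means.
col_means = [sum(col) / len(col) for col in zip(*samples)]
overall_from_cols = sum(col_means) / len(col_means)

print(round(overall_from_rows, 2), round(overall_from_cols, 2))  # 497.83 497.83
```

The two routes agree here because every row has the same number of bottles and every column has the same number of samples.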

2. The US Census Bureau publishes annual price figures for new mobile homes in Manufactured
Housing Statistics. The figures are obtained from sampling, not from a census. A simple random
sample of 36 new mobile homes yielded the prices, in thousands of dollars, shown in the Table
below. Use the data to estimate the population mean price, $\mu$, of all new mobile homes.
67.8 68.4 59.2 56.9 63.9 62.2 55.6 72.9 62.6
67.1 73.4 63.7 57.7 66.7 61.7 55.5 49.3 72.9
49.9 56.5 71.2 59.1 64.3 64.0 55.9 51.3 53.7
56.0 76.7 76.8 60.6 74.5 57.9 70.4 63.8 77.9

SOLUTION: Sample Row Mean

Sample Rows     Sum of Scores    Mean
1               569.5            63.28
2               568              63.11
3               525.9            58.43
4               614.6            68.29
Overall Mean                     63.28
The point estimate of the population parameter is 63.28 thousand dollars.
3. Estimate the mean consumption of 6 families in one month if their expenses are Php 14,200, Php
15,500, Php 16,800, Php 17,500, Php 20,000, and Php 27,000.

Family    Expenses
1         Php 14,200
2         Php 15,500
3         Php 16,800
4         Php 17,500
5         Php 20,000
6         Php 27,000
Mean      Php 18,500
The mean consumption of the 6 families in one month is Php 18,500.

Confidence Interval Estimators


DEFINITIONS
ESTIMATE
– a value or range of values that approximate a parameter
INTERVAL ESTIMATE
– a range of values that may contain the parameter of a population
CONFIDENCE INTERVAL
– a type of interval estimate used to estimate a parameter; the interval may or may not contain the true parameter value
CONFIDENCE LEVEL
– the percentage of all possible samples that can be expected to include the true population
parameter
– describes the uncertainty of the sampling method
– usually takes on the values 90%, 95%, and 99%.
ALPHA ERROR ($\alpha$)
– the probability that the confidence interval will not contain the true parameter value
– $1 - \alpha$ = confidence level
$\alpha$    Probability    z-score
5%          95%            $\pm 1.96$
10%         90%            $\pm 1.645$
1%          99%            $\pm 2.576$
Formula: divide the probability by 2, $\frac{P(X)}{2}$, then find the corresponding z-score
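FORMULAS
The $100(1 - \alpha)\%$ confidence interval derived from the Central Limit Theorem is stated below:
SAMPLE MEAN
$\mu - z\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z\frac{\sigma}{\sqrt{n}}$, or $\mu \pm z\frac{\sigma}{\sqrt{n}}$
POPULATION MEAN
$\bar{X} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z\frac{\sigma}{\sqrt{n}}$, or $\bar{X} \pm z\frac{\sigma}{\sqrt{n}}$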

example problems
1. Calculate the 95% confidence interval for 150 employees receiving a monthly salary of Php 15,000 with a standard deviation of Php 2,500.
Given:
$\mu = 15{,}000$
$\sigma = 2{,}500$
n = 150
z = 1.96 (95% confidence interval)
$\left(\mu - z\frac{\sigma}{\sqrt{n}},\ \mu + z\frac{\sigma}{\sqrt{n}}\right)$
$= \left(15{,}000 - 1.96 \cdot \frac{2{,}500}{\sqrt{150}},\ 15{,}000 + 1.96 \cdot \frac{2{,}500}{\sqrt{150}}\right)$
= (14,599.92, 15,400.08) or (14,600, 15,400)

This implies, with 95% confidence, that the mean monthly salary is between Php 14,600 and Php 15,400.

2. A marketing officer wishes to select female receptionists from 300 employees with an average height of 170 cm and a sample standard deviation of 25 cm. What is the 99% confidence interval of their height?
Given:
$\bar{X} = 170$
$\sigma = 25$
n = 300
z = 2.576 (99% confidence interval)
$\left(\bar{X} - z\frac{\sigma}{\sqrt{n}},\ \bar{X} + z\frac{\sigma}{\sqrt{n}}\right)$
$= \left(170 - 2.576 \cdot \frac{25}{\sqrt{300}},\ 170 + 2.576 \cdot \frac{25}{\sqrt{300}}\right)$
= (166.28, 173.72) or (166, 174)
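
The two interval calculations can be sketched as a single function. Here the z-score is computed from the confidence level with `statistics.NormalDist` instead of being read from the table, so it carries more decimals than 1.96 or 2.576; the function name `confidence_interval` and its parameters are assumptions for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(center, sigma, n, confidence):
    """Interval center ± z * sigma / sqrt(n) for the given confidence level."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    margin = z * sigma / sqrt(n)
    return center - margin, center + margin

salary_low, salary_high = confidence_interval(15_000, 2_500, 150, 0.95)
height_low, height_high = confidence_interval(170, 25, 300, 0.99)
print(round(salary_low, 2), round(salary_high, 2))
print(round(height_low, 2), round(height_high, 2))
```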
