Chapter Two Fundamentals of Marketing Estimation and Hypothesis Testing

- Estimation involves using sample data to make inferences about population parameters such as the mean and variance. The sample mean and sample standard deviation are used as point estimators of the population mean and standard deviation.
- When the population is normal, the sampling distribution of the sample mean is also normal, with mean equal to the population mean. The standard deviation of the sampling distribution is smaller than the population standard deviation and decreases as the sample size increases.
- Even if the population is not normally distributed, the Central Limit Theorem states that the sampling distribution of the sample mean is approximately normal as long as the sample size is large enough.

Uploaded by shimelis adugna

Quantitative Methods for Marketers
Chapter Two
Fundamentals of Marketing Estimation and Hypothesis Testing
Sampling Distribution
• Estimation: collect data from a sample and process it in a way that yields a good inference (conclusion) about the population.
• Often the purpose of sampling is to estimate parameters of a population.
• A parameter is a numerical characteristic of a population (e.g., the mean or the variance).
• In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
• We refer to $\bar{x}$ as the point estimator of the population mean $\mu$, and to $s$ as the point estimator of the population standard deviation $\sigma$.
Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean
is the probability distribution of the population
of the sample means obtainable from all
possible samples of size n from a population of
size N
• Example (Quantitative Methods scores): Take a random sample of 8 students from a population of 40 students. You might get a mean of 87 for that sample. Then you do it again with a new sample of 8 students and might get a mean of 95 this time. Then you do it again, and again. Once you have a sufficient number of sample means, you can build the distribution of the population of sample means.
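The repeated-sampling idea in the example can be sketched in a few lines of Python. This is only an illustration: the 40 scores are made up (the slides give no data), and the seed and the number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 40 made-up scores between 60 and 100.
population = [random.randint(60, 100) for _ in range(40)]
population_mean = statistics.mean(population)
population_spread = statistics.pstdev(population)

# Draw many samples of 8 students (without replacement) and keep each mean.
sample_means = [
    statistics.mean(random.sample(population, 8)) for _ in range(10_000)
]

mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.pstdev(sample_means)

print(f"population mean      = {population_mean:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"population std dev   = {population_spread:.2f}")
print(f"std dev of the means = {spread_of_means:.2f}")
```

The mean of the sample means should land close to the population mean, and the sample means should be noticeably less spread out than the individual scores.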
General Conclusions
a. If the population of individual items is normal, then the population of all sample means is also normal.
b. Even if the population of individual items is not normal, there are circumstances in which the population of all sample means is approximately normal (Central Limit Theorem).
General Conclusions Continued
c. The mean of all possible sample means equals the population mean. That is, $\mu_{\bar{x}} = \mu$.
d. The standard deviation $\sigma_{\bar{x}}$ of all sample means is less than the standard deviation of the population. That is, $\sigma_{\bar{x}} < \sigma$.
Each sample mean averages out the high and the low measurements, and so is closer to $\mu$ than many of the individual population measurements.
And the Empirical Rule
The empirical rule holds for the sampling distribution of the sample mean:
– 68.26% of all possible sample means are within (plus or minus) one standard deviation $\sigma_{\bar{x}}$ of $\mu$
– 95.44% of all possible observed values of $\bar{x}$ are within (plus or minus) two $\sigma_{\bar{x}}$ of $\mu$
– 99.73% of all possible observed values of $\bar{x}$ are within (plus or minus) three $\sigma_{\bar{x}}$ of $\mu$
Central Limit Theorem
Now consider sampling a non-normal population
Still have: $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
– Exactly correct if the population is infinite
– Approximately correct if the population size N is finite but much larger than the sample size n
• Especially if N ≥ 20n
But if population is non-normal, what is the shape of the
sampling distribution of the sample mean?
– Is it normal, like it is if the population is normal?
– Yes, the sampling distribution is approximately normal
if the sample is large enough, even if the population is
non-normal by the “Central Limit Theorem”
The Central Limit Theorem #2
No matter what probability distribution describes the population, if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
Further, the larger the sample size n, the closer the sampling distribution of the sample mean is to being normal.
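A small simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1 and standard deviation 1 as the non-normal population. That choice, the sample sizes, and the number of repetitions are illustrative and do not come from the slides.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Assumed non-normal population: exponential with mean 1 and std dev 1.
POP_MEAN = 1.0
POP_STD = 1.0

def sample_mean(n):
    """Mean of one random sample of size n from the exponential population."""
    return statistics.mean(random.expovariate(1 / POP_MEAN) for _ in range(n))

results = {}
for n in (5, 30, 100):
    means = [sample_mean(n) for _ in range(5_000)]
    results[n] = (statistics.mean(means), statistics.pstdev(means))
    print(f"n={n:3d}  mean of means={results[n][0]:.3f}  "
          f"std of means={results[n][1]:.3f}  sigma/sqrt(n)={POP_STD / n ** 0.5:.3f}")
```

For each n, the mean of the sample means should stay near 1 and their standard deviation should track $\sigma/\sqrt{n}$.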
Example
• A population has mean 557 and standard deviation 35.
• A. Find the mean and standard deviation of $\bar{x}$ for samples of size 50.
• B. Find the probability that the mean of a sample of size 50 will be more than 570.
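One way to work this example is sketched below, using Python's standard-library `statistics.NormalDist`. It treats $\bar{x}$ as normal, which the Central Limit Theorem supports for n = 50. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 557, 35, 50

# Part A: mean and standard deviation of the sample mean.
mean_xbar = mu
sd_xbar = sigma / sqrt(n)

# Part B: P(x-bar > 570), treating x-bar as normal (n = 50 is large).
prob_above_570 = 1 - NormalDist(mean_xbar, sd_xbar).cdf(570)

print(f"mean of x-bar  = {mean_xbar}")
print(f"sd of x-bar    = {sd_xbar:.3f}")
print(f"P(x-bar > 570) = {prob_above_570:.4f}")
```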
Finite Population Multiplier
• If we randomly select a sample of size n without replacement from a finite population of size N, then it can be shown that $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$, where the quantity $\sqrt{\frac{N-n}{N-1}}$ is called the finite population multiplier. If the size of the sampled population is at least 20 times the size of the sample (that is, if N ≥ 20n), then the finite population multiplier is approximately equal to one and $\sigma_{\bar{x}} \approx \sigma/\sqrt{n}$. However, if the finite population is less than 20 times the size of the sample, then the multiplier is substantially less than one and must be used in the calculation of $\sigma_{\bar{x}}$.
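The effect of the multiplier can be checked numerically. In this sketch the function names and the two population sizes are hypothetical, chosen only to contrast N ≥ 20n with N < 20n.

```python
from math import sqrt

def finite_population_multiplier(N, n):
    """sqrt((N - n) / (N - 1)) for a sample of n drawn without replacement from N."""
    return sqrt((N - n) / (N - 1))

def sd_of_sample_mean(sigma, N, n):
    """Standard deviation of x-bar, including the finite population multiplier."""
    return sigma / sqrt(n) * finite_population_multiplier(N, n)

# Hypothetical sizes: one population with N >= 20n and one with N < 20n.
large = finite_population_multiplier(N=10_000, n=50)
small = finite_population_multiplier(N=200, n=50)
print(f"N=10000, n=50: multiplier = {large:.4f}")
print(f"N=200,   n=50: multiplier = {small:.4f}")
```

With N = 10,000 the multiplier is almost one and can be ignored. With N = 200 it shrinks $\sigma_{\bar{x}}$ by roughly 13%.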
The Sampling Distribution of the
Sample proportions
• The population proportion is denoted by p and the sample proportion is denoted by $\hat{p}$.
The population of all possible sample proportions
1. Has approximately a normal distribution, if the sample size n is large.
2. Has mean $\mu_{\hat{p}} = p$
3. Has variance $\sigma^2_{\hat{p}} = p(1-p)/n$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$
n should be considered large if both np and n(1 – p) are at least 5.
Example
• 1. The proportion of a population with a characteristic of interest is p = 0.76. Find the mean and standard deviation of the sample proportion $\hat{p}$ obtained from random samples of size 1200.
• 2. Random samples of size 225 are drawn from a population in which the proportion with the characteristic of interest is 0.25. Decide whether or not the sample size is large enough to assume that the sample proportion is normally distributed.
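Both exercises follow directly from the formulas for the sample proportion. A minimal sketch, with illustrative function names:

```python
from math import sqrt

def proportion_mean_sd(p, n):
    """Mean and standard deviation of the sample proportion."""
    return p, sqrt(p * (1 - p) / n)

def is_large_sample(p, n):
    """Rule from the slides: n is large if np and n(1 - p) are both at least 5."""
    return n * p >= 5 and n * (1 - p) >= 5

# Example 1: p = 0.76, n = 1200.
mean_1, sd_1 = proportion_mean_sd(0.76, 1200)
print(f"mean = {mean_1}, sd = {sd_1:.4f}")

# Example 2: p = 0.25, n = 225, so np = 56.25 and n(1 - p) = 168.75.
large_enough = is_large_sample(0.25, 225)
print(f"large enough for the normal approximation: {large_enough}")
```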

Confidence Intervals
z-Based Confidence Intervals for a Mean: $\sigma$ Known
A confidence interval (CI) for the population mean is an interval constructed around the sample mean so that we are reasonably sure, or confident, that this interval contains the population mean.
The starting point is the sampling distribution of the sample mean
– Recall that if a population is normally distributed with mean $\mu$ and standard deviation $\sigma$, then the sampling distribution of $\bar{x}$ is normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
– To obtain a confidence interval for $\mu$, consider using $\bar{x}$ and $\sigma_{\bar{x}}$ to calculate the interval $[\bar{x} \pm 2\sigma_{\bar{x}}] = [\bar{x} \pm 2\sigma/\sqrt{n}]$
– The probability is 0.9544 that this interval contains $\mu$
– Use a normal curve as a model of the sampling
distribution of the sample mean
• Exactly, because the population is normal
• Approximately, by the Central Limit Theorem for large
samples
The Empirical Rule
Recall the empirical rule, so…
– 68.26% of all possible sample means are within one standard deviation of the population mean
– 95.44% of all possible observed values of $\bar{x}$ are within two standard deviations of the population mean
– 99.73% of all possible observed values of $\bar{x}$ are within three standard deviations of the population mean
Generalizing Continued
The probability that the confidence interval will contain the population mean $\mu$ is denoted by $1 - \alpha$
– $1 - \alpha$ is referred to as the confidence coefficient
– $(1 - \alpha) \times 100\%$ is called the confidence level
It is usual to use two-decimal-place probabilities for $1 - \alpha$
– Here, focus on $1 - \alpha$ = 0.95 or 0.99
General Confidence Interval
In general, the probability is $1 - \alpha$ that the population mean $\mu$ is contained in the interval
$$\left[\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$$
– The normal point $z_{\alpha/2}$ gives a right-hand tail area under the standard normal curve equal to $\alpha/2$
– The normal point $-z_{\alpha/2}$ gives a left-hand tail area under the standard normal curve equal to $\alpha/2$
– The area under the standard normal curve between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is $1 - \alpha$
Example
• A researcher wants to know the average daily spending of Kebele A dwellers. For this purpose, the researcher randomly selected 100 persons in the kebele and found that the average expenditure of the sample is 20 birr. From past research, it is known that the standard deviation of their expenditure is 5 birr. Calculate the interval estimate.
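The slide does not state a confidence level, so the worked computation below assumes 95% confidence ($z_{0.025} = 1.96$) purely for illustration. A different level would only change the $z$ point.

```latex
\[
\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
= 20 \pm 1.96 \cdot \frac{5}{\sqrt{100}}
= 20 \pm 0.98
= [19.02,\ 20.98]
\]
```

Under that assumption, the average daily spending is estimated to lie between 19.02 and 20.98 birr.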
Example 2
• 1. A random sample of size 144 is drawn from a population whose distribution, mean, and standard deviation are all unknown. The summary statistics are $\bar{x}$ = 58.2 and s = 2.6.
• A. Construct an 80% CI for the population mean.
• B. Construct a 90% CI for the population mean.
• 2. A sample of 250 workers aged 16 and older produced an average length of time with the current employer of 4.4 years with standard deviation 3.8 years. Construct a 99.9% CI for the mean job tenure of all workers aged 16 or older.
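Because both samples are large, one reasonable approach is a z-based interval with s standing in for $\sigma$. That is an assumption of this sketch: a t-based interval would give nearly the same limits at these sample sizes. The helper name `z_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sd, n, confidence):
    """Large-sample z interval: xbar +/- z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / sqrt(n)
    return xbar - margin, xbar + margin

ci_80 = z_interval(58.2, 2.6, 144, 0.80)     # Example 1A
ci_90 = z_interval(58.2, 2.6, 144, 0.90)     # Example 1B
ci_999 = z_interval(4.4, 3.8, 250, 0.999)    # Example 2
for label, (low, high) in [("80%", ci_80), ("90%", ci_90), ("99.9%", ci_999)]:
    print(f"{label} CI: [{low:.3f}, {high:.3f}]")
```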
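Sampling Distribution of All Possible Sample Means
[Figure: sampling distribution of all possible sample means]
The Effect of $\alpha$ on Confidence Interval Width
[Figure: intervals for $z_{\alpha/2} = z_{0.025} = 1.96$ (95% confidence) and $z_{\alpha/2} = z_{0.005} = 2.575$ (99% confidence)]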
t-Based Confidence Intervals for a Mean: $\sigma$ Unknown
• If $\sigma$ is unknown (which is usually the case), we can construct a confidence interval for $\mu$ based on the sampling distribution of
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
• If the population is normal, then for any sample size n, this sampling distribution is called the t distribution
The t Distribution
The curve of the t distribution is similar to that of the
standard normal curve
– Symmetrical and bell-shaped
– The t distribution is more spread out than the
standard normal distribution
– The spread of the t is given by the number of degrees
of freedom
• Denoted by df
• For a sample of size n, there are one fewer degrees of
freedom, that is,
df = n – 1
The t Distribution and Degrees of
Freedom
For a t distribution with n – 1 degrees of
freedom,
– As the sample size n increases, the degrees of freedom also increase
– As the degrees of freedom increase, the spread of the t curve decreases
– As the degrees of freedom increase indefinitely, the t curve approaches the standard normal curve
• If n ≥ 30, so df = n – 1 ≥ 29, the t curve is very
similar to the standard normal curve
t-Based Confidence Intervals for a Mean: $\sigma$ Unknown
If the sampled population is normally distributed with mean $\mu$, then a $100(1 - \alpha)\%$ confidence interval for $\mu$ is
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
$t_{\alpha/2}$ is the t point giving a right-hand tail area of $\alpha/2$ under the t curve having n – 1 degrees of freedom
Example
• A random sample of 12 students from a large
university yields mean GPA 2.71 with sample
standard deviation 0.51. Construct a 90% CI
for the mean GPA of all students at the
university. Assume that the numerical
population of GPAs from which the sample is
taken has a normal distribution.
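A sketch of the computation for this example follows. Python's standard library has no t-distribution quantile function, so the t point is taken from a standard t table and hard-coded. That value (1.796 for a right-tail area of 0.05 with 11 degrees of freedom) is an input supplied here, not something stated on the slide.

```python
from math import sqrt

xbar, s, n = 2.71, 0.51, 12

# Assumption: t point read from a standard t table rather than computed.
# For 90% confidence, alpha/2 = 0.05 and df = n - 1 = 11 give t = 1.796.
t_point = 1.796

margin = t_point * s / sqrt(n)
ci_low, ci_high = xbar - margin, xbar + margin
print(f"90% CI for mean GPA: [{ci_low:.3f}, {ci_high:.3f}]")
```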
Sample Size Determination (z)
If $\sigma$ is known, then a sample of size
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
is needed, where E denotes the desired margin of error, so that $\bar{x}$ is within E units of $\mu$, with $100(1 - \alpha)\%$ confidence
Example
• A consumer group would like to estimate the mean monthly electricity charge for a single-family house in July to within Br 5 using a 99% level of confidence. Based on similar studies, the standard deviation is estimated to be Br 20. How large a sample is required?
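The sample-size formula can be applied to this example as sketched below. The function name is illustrative, and the result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

n_required = sample_size_for_mean(sigma=20, margin=5, confidence=0.99)
print(n_required)
```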
Sample Size Determination (t)
If $\sigma$ is unknown and is estimated by s, then a sample of size
$$n = \left(\frac{t_{\alpha/2}\,s}{E}\right)^2$$
is needed so that $\bar{x}$ is within E units of $\mu$, with $100(1 - \alpha)\%$ confidence. The number of degrees of freedom for the $t_{\alpha/2}$ point is the size of the preliminary sample minus 1
Confidence Intervals for a
Population Proportion
If the sample size n is large*, then a $100(1 - \alpha)\%$ confidence interval for p is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
* Here n should be considered large if both $n\hat{p} \ge 5$ and $n(1 - \hat{p}) \ge 5$
Example
• A sample of 500 executives who own their
own home revealed 175 planned to sell their
homes and retire to Addis Ababa. Develop a
98% CI for the proportion of executives that
plan to sell and move to Addis Ababa.
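A minimal sketch of this interval, with an illustrative function name, checks the large-sample condition before applying the formula:

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n
    # Large-sample condition from the slides.
    assert n * p_hat >= 5 and n * (1 - p_hat) >= 5, "sample too small"
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

p_low, p_high = proportion_interval(175, 500, 0.98)
print(f"98% CI for p: [{p_low:.3f}, {p_high:.3f}]")
```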
Determining Sample Size for
Confidence Interval for p
A sample of size
$$n = p(1 - p)\left(\frac{z_{\alpha/2}}{E}\right)^2$$
will yield an estimate $\hat{p}$ that is within E units of p, with $100(1 - \alpha)\%$ confidence
Note that the formula requires a preliminary estimate of p. The conservative value of p = 0.5 is generally used when there is no prior information on p
Example
• The American Kennel Club wanted to estimate
the proportion of children that have a dog as a
pet. If the club wanted the estimate to be
within 3% of the population proportion, how
many children would they need to contact?
Assume a 95% level of confidence and the
club estimated that 30% of the children have a
dog as a pet.
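The sketch below applies the formula to this example. For comparison it also computes the conservative sample size with p = 0.5. That second call is an addition for illustration and is not part of the exercise.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p, margin, confidence):
    """n = p(1 - p) * (z_{alpha/2} / margin)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p * (1 - p) * (z / margin) ** 2)

n_with_estimate = sample_size_for_proportion(p=0.30, margin=0.03, confidence=0.95)
n_conservative = sample_size_for_proportion(p=0.50, margin=0.03, confidence=0.95)
print(n_with_estimate, n_conservative)
```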
Confidence Intervals for Population
Mean and Total for a Finite Population

For a large (n ≥ 30) random sample of measurements selected without replacement from a population of size N, a $100(1 - \alpha)\%$ confidence interval for $\mu$ is
$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N}}$$
A $100(1 - \alpha)\%$ confidence interval for the population total $\tau$ is found by multiplying the lower and upper limits of the corresponding interval for $\mu$ by N
Example
• The Dean of the College of Business and Economics wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours.
• A. What is the point estimate of the population mean?
• B. Find the 95% CI for the population mean.
• C. Construct a 95% CI for the mean if there are only 500 students on campus.
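The three parts can be sketched as follows. Part C uses the correction factor $\sqrt{(N-n)/N}$ from the finite-population interval formula. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n, N = 24, 4, 49, 500
z = NormalDist().inv_cdf(0.975)  # z_{0.025}, about 1.96

# Part A: the sample mean is the point estimate of the population mean.
point_estimate = xbar

# Part B: ordinary large-sample interval.
margin_b = z * s / sqrt(n)
ci_b = (xbar - margin_b, xbar + margin_b)

# Part C: same interval shrunk by sqrt((N - n) / N), the correction used in
# the finite-population interval formula.
margin_c = margin_b * sqrt((N - n) / N)
ci_c = (xbar - margin_c, xbar + margin_c)

print(f"B: [{ci_b[0]:.2f}, {ci_b[1]:.2f}]")
print(f"C: [{ci_c[0]:.2f}, {ci_c[1]:.2f}]")
```

The interval in part C is slightly narrower because the sample covers almost 10% of the 500 students.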
Hypotheses Testing
Null and Alternative Hypotheses

The null hypothesis, denoted H0, is a statement of the basic proposition being tested. The statement generally represents the status quo and is not rejected unless there is convincing sample evidence that it is false.
The alternative or research hypothesis, denoted Ha, is an alternative (to the null hypothesis) statement that will be accepted only if there is convincing sample evidence that it is true.
Summary of Types of Hypotheses
One-Sided, “Greater Than” Alternative
H0: $\mu \le \mu_0$ vs. Ha: $\mu > \mu_0$
One-Sided, “Less Than” Alternative
H0: $\mu \ge \mu_0$ vs. Ha: $\mu < \mu_0$
Two-Sided, “Not Equal To” Alternative
H0: $\mu = \mu_0$ vs. Ha: $\mu \ne \mu_0$
where $\mu_0$ is a given constant value (with the appropriate units) that is a comparative value
Types of Decisions
As a result of testing H0 vs. Ha, we have to make one of the following decisions about the null hypothesis H0:
Do not reject H0
– A weaker statement than “accepting H0”
– But you are rejecting the alternative Ha
OR
Reject H0
– A weaker statement than “accepting Ha”
Test Statistic
In order to “test” H0 vs. Ha, use the “test statistic”
$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
where $\mu_0$ is the given value (often the value claimed to be true) and $\bar{x}$ is the mean of a sample
z measures the distance between $\mu_0$ and $\bar{x}$ on the sampling distribution of the sample mean
If the population is normal or the sample size is large*, then the test statistic z follows a normal distribution
* n ≥ 30, by the Central Limit Theorem
Type I and Type II Errors
Type I Error: Rejecting H0 when it is true
Type II Error: Failing to reject H0 when it is false

                      State of Nature
Conclusion            H0 True             H0 False
Reject H0             Type I Error        Correct Decision
Do not Reject H0      Correct Decision    Type II Error
Error Probabilities
Type I Error: Rejecting H0 when it is true
– $\alpha$ is the probability of making a Type I error
– $1 - \alpha$ is the probability of not making a Type I error
Type II Error: Failing to reject H0 when it is false
– $\beta$ is the probability of making a Type II error
– $1 - \beta$ is the probability of not making a Type II error

                      State of Nature
Conclusion            H0 True             H0 False
Reject H0             $\alpha$            $1 - \beta$
Do not Reject H0      $1 - \alpha$        $\beta$
Typical Values
Usually set $\alpha$ to a low value
– So that there is only a small chance of rejecting a true H0
– Typically, $\alpha$ = 0.05
• For $\alpha$ = 0.05, strong evidence is required to reject H0
• Usually choose $\alpha$ between 0.01 and 0.05
– $\alpha$ = 0.01 requires very strong evidence to reject H0
• Sometimes choose $\alpha$ as high as 0.10
– Tradeoff between $\alpha$ and $\beta$
• For a fixed sample size, the lower we set $\alpha$, the higher is $\beta$
– And the higher $\alpha$, the lower $\beta$
Steps in Testing a “Greater Than”
Alternative

The steps are as follows:
1. State the null and alternative hypotheses
2. Specify the significance level $\alpha$
3. Select the test statistic
4. Determine the rejection rule for deciding whether or not to reject H0
5. Collect the sample data and calculate the value of the test statistic
6. Decide whether to reject H0 by using the test statistic and the rejection rule
7. Interpret the statistical results in managerial terms and assess their practical importance
Example
ABC company claims that viewers stay longer on its new TV channel than on the existing channel in the country.
Viewers on average stay for about 60 minutes on the current channel.
Assume $\sigma$ is known and is 4 minutes, and 64 viewers are selected for the test.
1. State the null and alternative hypotheses
– H0: $\mu \le 60$
– Ha: $\mu > 60$
where $\mu$ is the average time people spend on the new channel
2. Specify the significance level $\alpha$
– $\alpha$ = 0.05
3. Select the test statistic
– Use the test statistic
$$z = \frac{\bar{x} - 60}{\sigma/\sqrt{n}}$$
– A positive value of this test statistic results from a sample mean that is greater than 60
• Which provides evidence against H0 in favor of Ha
4. Determine the rejection rule for deciding whether or not to reject H0
– To decide how large the test statistic must be to reject H0 while setting the probability of a Type I error to $\alpha$, do the following:
– The probability $\alpha$ is the area in the right-hand tail of the standard normal curve
– Use the normal table to find the point $z_\alpha$ (called the rejection or critical point)
• $z_\alpha$ is the point under the standard normal curve that gives a right-hand tail area equal to $\alpha$
• Since $\alpha$ = 0.05 in the TV case, the rejection point is $z_\alpha = z_{0.05}$ = 1.645
4. Continued
– Reject H0 in favor of Ha if the test statistic z is greater than the rejection point $z_\alpha$
• This is the rejection rule
– In the TV case, the rejection rule is to reject H0 if the calculated test statistic z is > 1.645
5. Collect the sample data and calculate the value of the test statistic
– In the TV case, assume that $\sigma$ is known and $\sigma$ = 4
– For a sample of n = 64, $\bar{x}$ = 62. Then
$$z = \frac{\bar{x} - 60}{\sigma/\sqrt{n}} = \frac{62 - 60}{4/\sqrt{64}} = 4.0$$
6. Decide whether to reject H0 by using the test statistic and the rejection rule
– Compare the value of the test statistic to the rejection point according to the rejection rule
– In the TV case, z = 4 is > $z_{0.05}$ = 1.645
– Therefore reject H0: $\mu \le 60$ in favor of Ha: $\mu > 60$ at the 0.05 significance level
• Have rejected H0 by using a test that allows only a 5% chance of wrongly rejecting H0
• This result is “statistically significant” at the 0.05 level
7. Interpret the statistical results in managerial terms and assess their practical importance
– Can conclude that viewers stay longer on the new TV channel.
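The TV-channel test can be reproduced with a short sketch. The variable names are illustrative, and the rejection point is computed with `statistics.NormalDist` instead of being read from a normal table.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, xbar, alpha = 60, 4, 64, 62, 0.05

# Test statistic and right-tail rejection point z_alpha.
z_stat = (xbar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)

reject_h0 = z_stat > z_crit
print(f"z = {z_stat:.2f}, rejection point = {z_crit:.3f}, reject H0: {reject_h0}")
```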
Effect of $\alpha$
The smaller we set $\alpha$, the larger is the rejection point, and the stronger is the statistical evidence that is required to reject the null hypothesis H0
The p-Value
• The p-value, or the observed level of significance, is the probability, computed assuming the null hypothesis H0 is true, of obtaining sample results at least as contradictory to H0 as those actually observed
• The p-value is used to measure the weight of the evidence against the null hypothesis
• Sample results that are not likely if H0 is true have a low p-value and are evidence that H0 is not true
• The p-value is the smallest value of $\alpha$ for which we can reject H0
• Use the p-value as an alternative to testing with a z test statistic
Steps Using a p-value to Test a
“Greater Than” Alternative

4. Collect the sample data and compute the value of the test statistic
– For example, assume the test statistic is z = 2.20
5. Calculate the p-value corresponding to the test statistic value
– In this case, the p-value is the area under the standard normal curve in the right-hand tail to the right of the test statistic value z = 2.20
– The area is 0.5 – 0.4861 = 0.0139
– The p-value is 0.0139
Steps Using a p-value to Test a
“Greater Than” Alternative Continued
5. Continued
– If H0 is true, the probability is 0.0139 of obtaining a sample whose test statistic is 2.20 or higher
– This is so low as to be evidence that H0 is false and should be rejected
6. Reject H0 if the p-value is less than $\alpha$
– In the case considered, $\alpha$ was set to 0.05
– The calculated p-value of 0.0139 is < $\alpha$ = 0.05
• This implies that the test statistic z = 2.20 is > the rejection point $z_{0.05}$ = 1.645
– Therefore reject H0 at the $\alpha$ = 0.05 significance level
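The p-value for the assumed test statistic z = 2.20 can be computed directly, without a normal table, as in this sketch:

```python
from statistics import NormalDist

z_stat, alpha = 2.20, 0.05

# Right-tail area beyond the observed test statistic.
p_value = 1 - NormalDist().cdf(z_stat)
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```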
Weight of Evidence Against the Null

Calculate the test statistic and the corresponding p-value
Rate the strength of the conclusion about the null hypothesis H0 according to these rules:
– If p < 0.10, then there is some evidence to reject H0
– If p < 0.05, then there is strong evidence to reject H0
– If p < 0.01, then there is very strong evidence to reject H0
– If p < 0.001, then there is extremely strong evidence to reject H0
Confidence Intervals vs.
Hypothesis Testing

The null hypothesis H0: $\mu = \mu_0$ can be rejected in favor of the two-sided alternative Ha: $\mu \ne \mu_0$ by setting the probability of a Type I error equal to $\alpha$ if and only if the $100(1 - \alpha)\%$ confidence interval for $\mu$ does not contain $\mu_0$
– where $\mu_0$ is the claimed value for the population mean
In other words, if the confidence interval does not contain the claimed value $\mu_0$, then it can be rejected as being false
Confidence Intervals vs.
Hypothesis Testing Continued

The $\alpha$ used here is the same $\alpha$ used in estimating the interval
– The confidence level is $1 - \alpha$
– The significance level is $\alpha$
– $(1 - \alpha) + \alpha = 1$, so the confidence level and significance level are complementary
t Tests about a Population Mean ($\sigma$ Unknown)
Suppose the population being sampled is normally distributed
The population standard deviation $\sigma$ is unknown, as is the usual situation
– If the population standard deviation $\sigma$ is unknown, then it will have to be estimated by the sample standard deviation s
Under these two conditions, we have to use the t distribution to test hypotheses
Defining the t Random Variable ($\sigma$ Unknown)
Define a new random variable t:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
– with the definition of symbols as before
The sampling distribution of this random variable is a t distribution with n – 1 degrees of freedom
Defining the t Statistic ($\sigma$ Unknown)
Let $\bar{x}$ be the mean of a sample of size n with standard deviation s
Also, $\mu_0$ is the claimed value of the population mean
Define a new test statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
If the population being sampled is normal, and if s is used to estimate $\sigma$, then the sampling distribution of the t statistic is a t distribution with n – 1 degrees of freedom
t Tests about a Population Mean ($\sigma$ Unknown)
Reject H0: $\mu = \mu_0$ in favor of a particular alternative hypothesis Ha at the $\alpha$ level of significance if and only if the appropriate rejection point rule holds or, equivalently, the corresponding p-value is less than $\alpha$
We have the following rules …
t Tests about a Population Mean ($\sigma$ Unknown) Continued

Alternative            Reject H0 if:             p-value
Ha: $\mu > \mu_0$      $t > t_\alpha$            Area under t distribution to right of t
Ha: $\mu < \mu_0$      $t < -t_\alpha$           Area under t distribution to left of t
Ha: $\mu \ne \mu_0$    $|t| > t_{\alpha/2}$ *    Twice area under t distribution to right of |t|

• $t_\alpha$, $t_{\alpha/2}$, and p-values are based on n – 1 degrees of freedom (for a sample of size n)
* either $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
Hypothesis Tests about a
Population Proportion

If the sample size n is large, we can reject H0: $p = p_0$ at the $\alpha$ level of significance (probability of Type I error equal to $\alpha$) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than $\alpha$
We have the following rules …
Hypothesis Tests about a Population Proportion Continued

Alternative        Reject H0 if:             p-value
Ha: $p > p_0$      $z > z_\alpha$            Area under standard normal to the right of z
Ha: $p < p_0$      $z < -z_\alpha$           Area under standard normal to the left of z
Ha: $p \ne p_0$    $|z| > z_{\alpha/2}$ *    Twice the area under standard normal to the right of |z|

where the test statistic is:
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
* either $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
Type II Error Probabilities
• We want the probability $\beta$ of not rejecting a false null hypothesis
– That is, we want the probability $\beta$ of committing a Type II error
– $1 - \beta$ is called the power of the test
Calculating $\beta$
Assume that the sampled population is normally distributed, or that a large sample is taken
Test H0: $\mu = \mu_0$ vs. Ha: $\mu < \mu_0$ or Ha: $\mu > \mu_0$ or Ha: $\mu \ne \mu_0$
We want to make the probability of a Type I error equal to $\alpha$ and randomly select a sample of size n
The probability $\beta$ of a Type II error corresponding to the alternative value $\mu_a$ for $\mu$ is equal to the area under the standard normal curve to the left of
$$z^* - \frac{|\mu_0 - \mu_a|}{\sigma/\sqrt{n}}$$
Here $z^*$ equals $z_\alpha$ if the alternative hypothesis is one-sided ($\mu < \mu_0$ or $\mu > \mu_0$)
Also $z^* = z_{\alpha/2}$ if the alternative hypothesis is two-sided ($\mu \ne \mu_0$)
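As a sketch of this calculation, the code below reuses the TV example values ($\mu_0$ = 60, $\sigma$ = 4, n = 64, $\alpha$ = 0.05) together with a hypothetical alternative mean of 61 minutes. That alternative value and the function name are assumptions made here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu_0, mu_a, sigma, n, alpha, two_sided=False):
    """Area to the left of z* - |mu_0 - mu_a| / (sigma / sqrt(n))."""
    tail = alpha / 2 if two_sided else alpha
    z_star = NormalDist().inv_cdf(1 - tail)
    return NormalDist().cdf(z_star - abs(mu_0 - mu_a) / (sigma / sqrt(n)))

# TV example values, plus a hypothetical alternative mean of 61 minutes.
beta = type_ii_error(mu_0=60, mu_a=61, sigma=4, n=64, alpha=0.05)
power = 1 - beta
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Moving the alternative value further from 60, or increasing n, lowers $\beta$ and raises the power.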
Sample Size
Assume that the sampled population is normally distributed, or that a large sample is taken
Test H0: $\mu = \mu_0$ vs. Ha: $\mu < \mu_0$ or Ha: $\mu > \mu_0$ or Ha: $\mu \ne \mu_0$
Want to make the probability of a Type I error equal to $\alpha$ and the probability of a Type II error corresponding to the alternative value $\mu_a$ for $\mu$ equal to $\beta$
Then take a sample of size:
$$n = \frac{(z^* + z_\beta)^2\,\sigma^2}{(\mu_0 - \mu_a)^2}$$
Here $z^*$ equals $z_\alpha$ if the alternative hypothesis is one-sided ($\mu < \mu_0$ or $\mu > \mu_0$) and $z^*$ equals $z_{\alpha/2}$ if the alternative hypothesis is two-sided ($\mu \ne \mu_0$)
Also $z_\beta$ is the point on the scale of the standard normal curve that gives a right-hand tail area equal to $\beta$
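The formula can be tried on the TV example values with hypothetical targets: an alternative mean of 61 minutes and $\beta$ = 0.10. Both targets and the function name are assumptions made for illustration. The result is rounded up to a whole sample size.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_errors(mu_0, mu_a, sigma, alpha, beta, two_sided=False):
    """n = (z* + z_beta)^2 * sigma^2 / (mu_0 - mu_a)^2, rounded up."""
    tail = alpha / 2 if two_sided else alpha
    z_star = NormalDist().inv_cdf(1 - tail)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return ceil((z_star + z_beta) ** 2 * sigma ** 2 / (mu_0 - mu_a) ** 2)

# TV example values with hypothetical targets: mu_a = 61 and beta = 0.10.
n_needed = sample_size_for_errors(mu_0=60, mu_a=61, sigma=4, alpha=0.05, beta=0.10)
print(n_needed)
```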
• End of Chapter 2
