UNIT 6

Introduction to Probability Distributions


• Random Variable
• Represents a possible numerical value from a random event
• Takes on different values based on chance

• Random variables are of two types: discrete random variables and continuous random variables
Discrete Random Variable
• A discrete random variable is a variable that can assume only a countable
number of values
Many possible outcomes:
• number of complaints per day
• number of TV’s in a household
• number of rings before the phone is answered
Only two possible outcomes:
• gender: male or female
• defective: yes or no
• spreads peanut butter first vs. spreads jelly first
Continuous Random Variable

• A continuous random variable is a variable that can assume any value on a continuum (can assume an uncountable number of values)
• thickness of an item
• time required to complete a task
• temperature of a solution
• height, in inches

• These can potentially take on any value, depending only on the ability
to measure accurately.
Discrete Random Variables

• Can only assume a countable number of values

Examples:

• Roll a die twice. Let x be the number of times 4 comes up (then x could be 0, 1, or 2 times)

• Toss a coin 5 times. Let x be the number of heads (then x = 0, 1, 2, 3, 4, or 5)
Discrete Probability Distribution
Experiment: Toss 2 Coins. Let x = # heads.

Probability Distribution

x Value    Probability
0          1/4 = .25
1          2/4 = .50
2          1/4 = .25

[Figure: bar chart of probability against x, with bars of height .25, .50 and .25 at x = 0, 1 and 2]
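
The table above can be reproduced by listing every equally likely outcome and counting heads. The following is a minimal sketch, assuming a fair coin; the function name `heads_distribution` is illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {x: P(x)}, where x is the number of heads in n fair coin tosses."""
    # Every sequence of H/T is equally likely for a fair coin.
    outcomes = list(product("HT", repeat=n_tosses))
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


dist = heads_distribution(2)
for x, p in dist.items():
    print(x, p, float(p))
```

The probabilities of a discrete distribution must sum to 1, which is a quick sanity check on any table built this way.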
Probability Distributions
• Discrete probability distributions: Binomial, Poisson
• Continuous probability distributions: Normal
Continuous Probability Distributions

• A continuous random variable is a variable that can assume any value on a continuum (can assume an uncountable number of values)
• thickness of an item
• time required to complete a task
• temperature of a solution
• height, in inches

• These can potentially take on any value, depending only on the ability to
measure accurately.
The Normal Distribution
• Bell Shaped
• Symmetrical
• Mean, Median and Mode are Equal
• Location is determined by the mean, μ
• Spread is determined by the standard deviation, σ
• The random variable has an infinite theoretical range: −∞ to +∞
Many Normal Distributions

By varying the parameters μ and σ, we obtain different normal distributions
The Normal Distribution Shape

• Changing μ shifts the distribution left or right.
• Changing σ increases or decreases the spread.
Finding Normal Probabilities

Probability is measured by the area under the curve

• P(a ≤ x ≤ b) is the area under f(x) between a and b
Probability as Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below

• P(−∞ < x < μ) = 0.5
• P(μ < x < ∞) = 0.5
• P(−∞ < x < ∞) = 1.0


Empirical Rules

What can we say about the distribution of values around the mean? There are some general rules:

• μ ± 1σ encloses about 68% of x’s (68.26%)
• μ ± 2σ covers about 95% of x’s (95.44%)
• μ ± 3σ covers about 99.7% of x’s (99.72%)
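
The coverage figures above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `coverage` is illustrative. Computed values (about 0.6827, 0.9545 and 0.9973) can differ from the slide's percentages in the last digit, because the slide's figures come from a rounded printed table.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def coverage(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage(k):.4f}")
```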
Importance of the Rule

• If a value is about 2 or more standard deviations away from the mean in a normal distribution, then it is far from the mean

• A value that far or farther from the mean is highly unlikely, given that particular mean and standard deviation
The Standard Normal Distribution
• Also known as the “z” distribution
• Mean is defined to be 0
• Standard Deviation is 1

• Values above the mean have positive z-values; values below the mean have negative z-values
The Standard Normal
• Any normal distribution (with any mean and standard deviation
combination) can be transformed into the standard normal
distribution (z)

• Need to transform x units into z units


Transformation to the Standard Normal Distribution

• Transform from x to the standard normal (the “z” distribution) by subtracting the mean of x and dividing by its standard deviation:

$$z = \frac{x - \mu}{\sigma}$$
Example

• If x is distributed normally with mean of 100 and standard deviation of 50, the z value for x = 250 is

$$z = \frac{x - \mu}{\sigma} = \frac{250 - 100}{50} = 3.0$$

• This says that x = 250 is three standard deviations (3 increments of 50 units) above the mean of 100.
Comparing x and z units

• μ = 100, σ = 50
• x = 100 corresponds to z = 0; x = 250 corresponds to z = 3.0

Note that the distribution is the same, only the scale has changed. We can
express the problem in original units (x) or in standardized units (z)
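
The change of scale between x units and z units can be sketched in a few lines. The function names `z_score` and `from_z` are illustrative, not from the slides.

```python
def z_score(x, mu, sigma):
    """Convert a value x to standardized units: z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def from_z(z, mu, sigma):
    """Convert a z-value back to original units: x = mu + z * sigma."""
    return mu + z * sigma


# The slide example: mean 100, standard deviation 50, x = 250.
z = z_score(250, mu=100, sigma=50)
print(z)                     # three standard deviations above the mean
print(from_z(z, 100, 50))    # back to the original value
```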
General Procedure for Finding Probabilities

To find P(a < x < b) when x is distributed normally:

• Draw the normal curve for the problem in terms of x

• Translate x-values to z-values

• Use the Standard Normal Table


Z Table example
• Suppose x is normal with mean 8.0 and standard deviation 5.0. Find P(8 < x < 8.6)

Calculate z-values:

$$z = \frac{x - \mu}{\sigma} = \frac{8 - 8}{5} = 0$$

$$z = \frac{x - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$$

P(8 < x < 8.6) = P(0 < z < 0.12)
Z Table example
(continued)
• In x units the distribution has µ = 8 and σ = 5; in z units it has µ = 0 and σ = 1
• x = 8 corresponds to z = 0, and x = 8.6 corresponds to z = 0.12
• The area is the same on either scale: P(8 < x < 8.6) = P(0 < z < 0.12)
Finding Normal Probabilities

• Suppose x is normal with mean 8.0 and standard deviation 5.0.
• Now find P(x < 8.6)
Finding Normal Probabilities
(continued)

• From the standard normal table, P(z < 0) = .5000 and P(0 < z < 0.12) = .0478
P(x < 8.6)
= P(z < 0.12)
= P(z < 0) + P(0 < z < 0.12)
= .5 + .0478 = .5478
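
As a cross-check on the table lookup, the same areas can be computed directly. This is a minimal sketch using the standard-library `statistics.NormalDist`, which evaluates the normal cumulative probability without a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

# x is normal with mean 8.0 and standard deviation 5.0 (the slide example).
x_dist = NormalDist(mu=8.0, sigma=5.0)

# P(x < 8.6): area under the curve to the left of 8.6.
p_below = x_dist.cdf(8.6)

# P(8 < x < 8.6): area between the mean and 8.6.
p_between = x_dist.cdf(8.6) - x_dist.cdf(8.0)

print(round(p_below, 4), round(p_between, 4))
```

Working in x units or in z units gives the same area, so `NormalDist().cdf(0.12)` returns the same probability as `x_dist.cdf(8.6)`.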
What is a Hypothesis?

• A hypothesis is a claim (assumption) about a population parameter.
The Null Hypothesis, H0
(continued)

• Begin with the assumption that the null hypothesis is true
• Similar to the notion of innocent until proven
guilty
• Refers to the status quo
• Always contains “=” , “≤” or “≥” sign
• May or may not be rejected
The Alternative Hypothesis, HA

• Is the opposite of the null hypothesis
• Challenges the status quo
• Never contains the “=” , “≤” or “≥” sign
• May or may not be accepted
• Is generally the hypothesis that is believed (or needs to be
supported) by the researcher – a research hypothesis
Errors in Making Decisions

• Type I Error
• Reject a true null hypothesis
• Considered a serious type of error

• The probability of Type I Error is α
• Called level of significance of the test
• Set by researcher in advance
Errors in Making Decisions
(continued)

• Type II Error
• Fail to reject a false null hypothesis

• The probability of Type II Error is β
Outcomes and Probabilities

Possible Hypothesis Test Outcomes
Key: Outcome (Probability)

                      State of Nature
Decision              H0 True                H0 False
Do Not Reject H0      No Error (1 − α)       Type II Error (β)
Reject H0             Type I Error (α)       No Error (1 − β)
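
The probabilities in the outcome table can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original slides: it uses a two-tailed z test with a known σ, and the values μ0 = 100, σ = 15, n = 30 and the alternative mean of 108 are all hypothetical. When H0 is true, the share of rejections estimates α; when H0 is false, it estimates 1 − β.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_mu, mu0=100.0, sigma=15.0, n=30,
                   alpha=0.05, trials=4000, seed=1):
    """Share of simulated samples in which a two-tailed z test rejects H0: mu = mu0.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


type_i_rate = rejection_rate(true_mu=100.0)  # H0 true: estimates alpha
power = rejection_rate(true_mu=108.0)        # H0 false: estimates 1 - beta
type_ii_rate = 1 - power                     # estimates beta

print(type_i_rate, power, type_ii_rate)
```

Each simulated sample falls in exactly one column of the table, which is why a single test can commit a Type I error or a Type II error but never both.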
Type I & II Error Relationship

• Type I and Type II errors cannot happen at the same time
• Type I error can only occur if H0 is true
• Type II error can only occur if H0 is false
Hypothesis Testing Procedure
1. Formulate the null (H0) and the alternative (HA) hypotheses.
2. Select the appropriate formula for the t/z statistic.
3. Select a significance level, α, for testing H0. Typically, the 0.05 level is selected.
4. Take one or two samples and compute the mean and standard deviation for each sample.
5. Calculate the t/z statistic assuming H0 (the status quo) is true.
6. Locate the critical value of the t/z statistic in the table.
7. Compare tcalculated with tcritical (or zcalculated with zcritical).
8. If zcalculated falls beyond zcritical (that is, in the rejection region), reject H0. Otherwise, fail to reject the null hypothesis (in non-technical language, accept the null hypothesis).
9. Express the conclusion reached by the test in terms of the marketing research problem.

Critical values of z by level of significance

Z test        0.10     0.05     0.02     0.01
Two Tailed    1.645    1.96     2.33     2.575
One Tailed    1.28     1.645    2.054    2.33

For two-tailed tests the critical values are ±; for one-tailed tests they are negative for a left-tailed test and positive for a right-tailed test.
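
Critical z values can also be computed instead of read from a table. This is a minimal sketch using the standard-library `NormalDist.inv_cdf`; the helper name `z_critical` is illustrative. Computed values may differ from printed-table values in the third decimal place because printed tables are rounded.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed):
    """Positive critical z value for a given level of significance.

    A two-tailed test splits alpha equally between the two tails.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.02, 0.01):
    two = z_critical(alpha, two_tailed=True)
    one = z_critical(alpha, two_tailed=False)
    print(f"alpha={alpha:.2f}  two-tailed=±{two:.3f}  one-tailed={one:.3f}")
```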
Formulating Hypotheses
• Example 1
Ford Motor Company has worked to reduce road noise inside the cab of the redesigned F150 pickup truck. It would like to report in its advertising that the truck is quieter. The average of the prior design was 68 decibels at 60 mph.

• What is the appropriate test?

H0: µ ≥ 68 (the truck is not quieter): status quo
HA: µ < 68 (the truck is quieter): wants to support
• If the null hypothesis is rejected, Ford has sufficient evidence to support that the truck is now quieter.
Example 2:
The average annual income of buyers of Ford F150 pickup trucks is
claimed to be $65,000 per year. An industry analyst would like to
test this claim.

• What is the appropriate test?

H0: µ = 65,000 (income is as claimed): status quo
HA: µ ≠ 65,000 (income is different than claimed)

• The analyst will believe the claim unless sufficient evidence is found to discredit it.
Example 3
• The mean lifetime of a sample of 400 light bulbs produced by a company is found to be 1600 hours, with a standard deviation of 150 hours. Test the hypothesis that the mean lifetime of the bulbs produced in general is higher than 1570 hours at the α = 0.01 level of significance.

• Ho: µ ≤ 1570
• Ha: µ > 1570 (right-tailed test)

• An ambulance service claims that it takes on average 8.9 minutes to reach its destination on emergency calls. To check this claim, the agency that licenses ambulance services timed 50 emergency calls, getting a mean of 9.3 minutes with a standard deviation of 1.8 minutes. Does this constitute evidence, at the 1 percent significance level, that the claimed figure is too low?

• Ho: µ = 8.9
• Ha: µ ≠ 8.9
• Two-tailed test
Hypotheses Testing
• The mean lifetime of a sample of 400 light bulbs produced by a company is found to be 1600 hours, with a standard deviation of 150 hours. Test the hypothesis that the mean lifetime of the bulbs produced in general is higher than 1570 hours at the α = 0.01 level of significance.

Ho: µ ≤ 1570
Ha: µ > 1570 (right-tailed test)

x̄ = 1600 hrs, µ = 1570, s = 150, N = 400

$$Z_{cal} = \frac{\bar{x} - \mu}{s / \sqrt{N}} = \frac{1600 - 1570}{150 / \sqrt{400}} = \frac{30}{7.5} = 4$$

At α = 0.01, Zcritical = 2.326 (one-tailed critical value).
Zcal = 4 > Zcritical = 2.326, so Zcal falls in the rejection region.
Therefore, the null hypothesis is rejected.

Critical values of z by level of significance

Z test        0.10     0.05     0.02     0.01
Two Tailed    1.645    1.96     2.33     2.575
One Tailed    1.28     1.645    2.054    2.33
• An ambulance service claims that it takes on average 8.9 minutes to reach its destination on emergency calls. To check this claim, the agency that licenses ambulance services timed 50 emergency calls, getting a mean of 9.3 minutes with a standard deviation of 1.8 minutes. Does this constitute evidence, at the 1 percent significance level, that the claimed figure is too low?
• Ho: µ = 8.9
• Ha: µ ≠ 8.9
• Two-tailed test

$$Z_{cal} = \frac{\bar{x} - \mu}{s / \sqrt{N}} = \frac{9.3 - 8.9}{1.8 / \sqrt{50}} \approx 1.57$$

• Zcalculated = 1.57
• Zcritical at α = 0.01 is ±2.58
• Since −2.58 < 1.57 < 2.58, Zcalculated falls in the acceptance region
• We fail to reject the null hypothesis
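
The two worked examples above (light bulbs and ambulance response times) follow the same steps, which can be sketched as one function. This is an illustrative sketch, not the slides' own method: the function name `z_test` and its `tail` parameter are assumptions, and the critical values are computed with the standard-library `NormalDist` rather than read from the table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, s, n, alpha, tail):
    """One-sample z test. tail is 'right', 'left' or 'two'.

    Returns (z_calculated, z_critical, reject_h0).
    """
    z = (sample_mean - mu0) / (s / sqrt(n))
    if tail == "two":
        z_crit = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_crit
    elif tail == "right":
        z_crit = NormalDist().inv_cdf(1 - alpha)
        reject = z > z_crit
    elif tail == "left":
        z_crit = NormalDist().inv_cdf(1 - alpha)
        reject = z < -z_crit
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    return z, z_crit, reject


# Light bulbs: Ha: mu > 1570, right-tailed, alpha = 0.01.
bulbs = z_test(1600, 1570, 150, 400, alpha=0.01, tail="right")

# Ambulance: Ha: mu != 8.9, two-tailed, alpha = 0.01.
ambulance = z_test(9.3, 8.9, 1.8, 50, alpha=0.01, tail="two")

print(bulbs)
print(ambulance)
```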
• A factory has a machine that dispenses 80 ml of fluid into each bottle. An employee doubts the measurement. Using a sample of 40 bottles, he finds that the average amount of liquid dispensed by the machine is 78 ml, with a standard deviation of 2.5 ml.
• Formulate the hypotheses and test the claim at the 95% confidence level.
Chi-Square Test
• The chi-square is a very versatile statistic and is used extensively in statistical work.
• The variable given below closely follows the chi-square distribution:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

• O and E are, respectively, the observed and expected frequencies.
• The chi-square test is used for:
  • Testing whether the observed frequencies for certain groups or classes are in line with some hypothesized pattern.
  • Testing whether a standard probability distribution fits a given frequency distribution well.
  • Testing whether the attributes involved in cross-classified data are independent.
  • Testing whether two or more proportions are equal or not.
Example 1

A die is thrown 204 times and the results obtained are as follows:

Number       1     2     3     4     5     6
Frequency    22    24    38    30    46    44

Check at α = 0.01 whether the die is fair (or unbiased).

Solution -
Ho: Die is fair (or unbiased)
Ha: Die is unfair (or biased)

Observed frequencies
Number       1     2     3     4     5     6
Frequency    22    24    38    30    46    44

Expected frequencies (204 / 6 = 34 for each face)
Number       1     2     3     4     5     6
Frequency    34    34    34    34    34    34

O     E     (O − E)    (O − E)²    (O − E)²/E
22    34    −12        144         4.24
24    34    −10        100         2.94
38    34    4          16          0.47
30    34    −4         16          0.47
46    34    12         144         4.24
44    34    10         100         2.94
Total                              15.3

Level of significance: α = 0.01
Degrees of freedom = n − 1 = 6 − 1 = 5
Critical value = 15.086

The calculated chi-square value is 15.3, and the critical (table) value is 15.086.
Since 15.3 > 15.086, the calculated value falls in the rejection region.
Therefore, the null hypothesis is rejected.
Thus, the die is unfair (or biased).
Example 2
A coin is flipped 200 times and the results obtained are as follows:

             Head    Tail
Frequency    92      108

Check at α = 0.01 whether the coin is fair (or unbiased).

Solution -
Ho: Coin is fair (or unbiased)
Ha: Coin is unfair (or biased)

Observed frequencies
             Head    Tail
Frequency    92      108

Expected frequencies (200 / 2 = 100 for each side)
             Head    Tail
Frequency    100     100

O      E      (O − E)    (O − E)²    (O − E)²/E
92     100    −8         64          0.64
108    100    8          64          0.64
Total                                1.28

Level of significance: α = 0.01
Degrees of freedom = n − 1 = 2 − 1 = 1
Critical value = 6.635

The calculated chi-square value is 1.28, and the critical (table) value is 6.635.
Since 1.28 < 6.635, the calculated value falls in the acceptance region.
Therefore, we fail to reject the null hypothesis.
Thus, it is concluded that the coin is fair (or unbiased).
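
Both goodness-of-fit examples above use the same calculation, which can be sketched as follows. The function name `chi_square_statistic` is illustrative, and the sketch assumes equal expected frequencies across categories (a fair die or a fair coin). The critical values are the table values quoted in the examples, not computed here.

```python
def chi_square_statistic(observed):
    """Chi-square statistic, sum of (O - E)^2 / E, assuming equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Table values at alpha = 0.01, keyed by degrees of freedom (from the examples above).
CRITICAL_AT_01 = {5: 15.086, 1: 6.635}

die = chi_square_statistic([22, 24, 38, 30, 46, 44])
coin = chi_square_statistic([92, 108])

for name, stat, df in (("die", die, 5), ("coin", coin, 1)):
    decision = "reject H0" if stat > CRITICAL_AT_01[df] else "fail to reject H0"
    print(f"{name}: chi-square = {stat:.2f}, critical = {CRITICAL_AT_01[df]}, {decision}")
```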
Report and Presentation
Importance of the Report and Presentation
For the following reasons, the report and its presentation
are important parts of the marketing research project:

1. They are the tangible products of the research effort.


2. Management decisions are guided by the report and the
presentation.
3. The involvement of many marketing managers in the project is
limited to the written report and the oral presentation.
4. Management's decision to undertake marketing research in the
future or to use the particular research supplier again will be
influenced by the perceived usefulness of the report and the
presentation.
The Report Preparation and Presentation Process
1. Problem Definition, Approach, Research Design, and Fieldwork
2. Data Analysis
3. Interpretations, Conclusions, and Recommendations
4. Report Preparation
5. Oral Presentation
6. Reading of the Report by the Client
7. Research Follow-Up
Report Format
I. Title page
II. Letter of transmittal
III. Letter of authorization
IV. Table of contents (see next page)
V. List of tables
VI. List of graphs
VII. List of appendices
VIII. List of exhibits
IX. Executive summary
a. Major findings
b. Conclusions
c. Recommendations
1. Problem definition
a. Background to the problem
b. Statement of the problem
2. Approach to the problem
3. Research design
a. Type of research design
b. Information needs
c. Data collection from secondary sources
d. Data collection from primary sources
e. Scaling techniques
f. Questionnaire development and pretesting
g. Sampling techniques
h. Fieldwork
4. Data analysis
a. Methodology
b. Plan of data analysis
5. Results
6. Limitations and caveats
7. Conclusions and recommendations
8. Exhibits
a. Questionnaires and forms
b. Statistical output
c. Lists
9. Reference/Bibliography
Report Writing
• Readers. A report should be written for a specific reader or readers: the
marketing managers who will use the results.
• Easy to follow. The report should be easy to follow. It should be structured
logically and written clearly.
• Presentable and professional appearance. The look of a report is
important.
• Objective. Objectivity is a virtue that should guide report writing. The rule
is, "Tell it like it is."
• Reinforce text with tables and graphs. It is important to reinforce key
information in the text with tables, graphs, pictures, maps, and other
visual devices.
• Terse. A report should be terse and concise. Yet, brevity should not be
achieved at the expense of completeness.
Guidelines for Tables
• Title and number. Every table should have a number (1a) and title (1b).
• Arrangement of data items. The arrangement of data items in a table should emphasize
the most significant aspect of the data.
• Basis of measurement. The basis or unit of measurement should be clearly stated (3a).
• Leaders, rulings, spaces. Leaders, dots or hyphens used to lead the eye horizontally,
impart uniformity and improve readability (4a). Instead of ruling the table horizontally or
vertically, white spaces (4b) are used to set off data items. Skipping lines after different
sections of the data can also assist the eye. Horizontal rules (4c) are often used after the
headings.
• Explanations and comments: Headings, stubs, and footnotes. Designations placed over
the vertical columns are called headings (5a). Designations placed in the left-hand column
are called stubs (5b). Information that cannot be incorporated in the table should be
explained by footnotes (5c).
• Sources of the data. If the data contained in the table are secondary, the source of data
should be cited (6a).
U.S. Auto Sales 2003 - 2007
Guidelines for Graphs: Round or Pie Charts

• In a pie chart, the area of each section, as a percentage of the total area of the circle, reflects the percentage associated with the value of a specific variable.
• A pie chart is not useful for displaying relationships over
time or relationships among several variables.
• As a general guideline, a pie chart should not require more
than seven sections.
Pie Chart of 2007 U.S. Auto Sales

[Figure: pie chart with one segment each for GM, Ford, Chrysler, Toyota, Honda, Nissan, and Other. Segment labels shown: 24%, 18%, 18%, 13%, 11%, 9%, 7%]
Guidelines for Graphs: Line Charts

• A line chart connects a series of data points using continuous lines.

• This is an attractive way of illustrating trends and changes over time.

• Several series can be compared on the same chart, and forecasts, interpolations, and extrapolations can be shown.
Line Chart of Total U.S. Auto Sales

[Figure: line chart of unit sales (0 to 5,000,000) by year (2003 to 2007), with one line each for GM, Ford, Chrysler, Toyota, Honda, Nissan, and Other]
Guidelines for Graphs: Line Charts

• A stratum chart is a set of line charts in which the data are successively aggregated over the series.

• Areas between the line charts display the magnitudes of the relevant variables.
Stratum Chart of Total U.S. Auto Sales

[Figure: stratum chart of unit sales (0 to 20,000,000) by year (2003 to 2007), with stacked layers for GM, Ford, Chrysler, Toyota, Honda, Nissan, and Other]
Guidelines for Graphs: Pictographs

• A pictograph uses small pictures or symbols to display the data.

• Pictographs do not depict results precisely. Hence, caution should be exercised when using them.
Guidelines for Graphs: Histograms and Bar Charts

• A bar chart displays data in various bars that may be positioned horizontally or vertically.

• The histogram is a vertical bar chart in which the height of the bars
represents the relative or cumulative frequency of occurrence of a specific
variable.
Histogram of 2007 U.S. Auto Sales

[Figure: vertical bar chart of 2007 unit sales (0 to 4,500,000), with one bar each for GM, Ford, Chrysler, Toyota, Honda, Nissan, and Other]
Oral Presentation

• The key to an effective presentation is preparation.
• A written script or detailed outline should be prepared following the
format of the written report.
• The presentation must be geared to the audience.
• The presentation should be rehearsed several times before it is made
to the management.
• Visual aids, such as tables and graphs, should be displayed with a
variety of media.
• It is important to maintain eye contact and interact with the audience
during the presentation.
Oral Presentation
• Filler words like "uh," "y'know," and "all right," should not be used.
• The "Tell 'Em" principle is effective for structuring a presentation.
• Another useful guideline is the "KISS 'Em" principle, which states:
Keep It Simple and Straightforward (hence the acronym KISS).
• Body language should be employed.
• The speaker should vary the volume, pitch, voice quality, articulation, and rate while speaking.
• The presentation should terminate with a strong closing.
Reading the Research Report

• Addresses the Problem – The problem being addressed should be clearly identified and the relevant background information provided.
• The research design should be clearly described in non-
technical terms.
• Execution of the Research Procedures – The reader
should pay special attention to the manner in which the
research procedures were executed.
• Numbers and statistics reported in tables and graphs
should be examined carefully by the reader.
Reading the Research Report

• Interpretation and Conclusions – The interpretation of the basic results should be differentiated from the results per se. Any conclusions or recommendations made
without a specification of the underlying assumptions or limitations should be
treated cautiously by the reader.
• Generalizability – It is the responsibility of the researcher to provide evidence
regarding the reliability, validity, and generalizability of the findings.
• Disclosure – The reader should carefully examine whether the spirit in which the
report was written indicates an honest and complete disclosure of the research
procedures and results.
Research Follow-up
• Assisting the Client – The researcher should answer
questions that may arise and help the client to
implement the findings.
• Evaluation of the Research Project – Every marketing
research project provides an opportunity for learning
and the researcher should critically evaluate the entire
project to obtain new insights and knowledge.
• SPSS OLAP cubes are interactive tables that enable you to slice your
data in different ways for data exploration and presentation.

• Smart Viewer enables the researcher to distribute reports, graphs, tables, even pivotal report cubes, over the web. Company
managers can be empowered to interact with the results by putting
a report cube on the Web, intranet, or extranet. Thus, they can
answer their own questions by drilling down for more detail and
creating new views of the data.
