Sampling
Population, sample, and sampling
Population (Universe): the collection of elements or objects that
possess the information sought by the researcher and
about which inferences are to be made
Must be clearly defined
Sample: a subset of the population
Sampling: the process to generate the sample from the population
in order to estimate characteristics of the whole population
Sampling vs. Census
Parameter and Statistics
Parameter (true value/ population value): a measure of a population
Income
Awareness level of the Dos Coyotes restaurant
Attitude score toward Comcast cable service
Statistic (sample value): a measure of a sample
Income
Awareness level of the Dos Coyotes restaurant
Attitude score toward Comcast cable service
Sampling error: the difference between:
results obtained from a sample (sample value) and
results obtained from a population (true value)
Quota Sampling
Develop quotas of population elements
Sample elements are selected based on convenience or judgment
Snowball Sample
Simple Random Sample
Each element in the population has a known and equal probability of selection
List of CBA majors from the Undergraduate Student Office
Randomly select 100 students from the list
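A minimal Python sketch of the draw described above, assuming the list of majors is available as a Python list. The student IDs and the population size of 1,200 are made up for illustration and do not come from the course.

```python
import random

# Hypothetical sampling frame: 1,200 made-up student IDs standing in for
# the list of CBA majors.
population = [f"student_{i:04d}" for i in range(1, 1201)]

rng = random.Random(42)  # fixed seed so the draw can be repeated

# random.sample draws without replacement, so every student on the list
# has the same chance (100 out of 1,200) of ending up in the sample.
sample = rng.sample(population, k=100)

print(len(sample), len(set(sample)))
```

Drawing without replacement means no student can appear twice in the sample.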
Systematic Sampling
• Uses a fixed skip interval to draw elements from a numbered
population.
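The skip-interval idea can be sketched in Python as follows. The function name `systematic_sample`, the random starting point, and the population of 1,000 numbered elements are assumptions made for illustration.

```python
import random

def systematic_sample(elements, n, rng):
    """Take every k-th element after a random start, where k = N // n."""
    k = len(elements) // n        # skip interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = rng.randrange(k)      # random starting position in [0, k)
    return elements[start::k][:n]

# Hypothetical numbered population of 1,000 elements.
population = list(range(1, 1001))
sample = systematic_sample(population, 50, random.Random(0))

print(sample[:5])
```

With 1,000 elements and a sample of 50, the skip interval is 20, so consecutive selections are always 20 positions apart.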
Stratified Sampling
• A few mutually exclusive and exhaustive subsets
• Each subset has distinct characteristics
• Simple random samples of elements
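A short Python sketch of stratified sampling, assuming a proportionate design in which the same fraction is drawn from every stratum (the slides do not say which allocation is used). The strata names and sizes are hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    picked = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))
        picked[name] = rng.sample(members, k)
    return picked

# Hypothetical strata (class standing); names and sizes are illustrative.
strata = {
    "freshman": [f"F{i}" for i in range(400)],
    "sophomore": [f"S{i}" for i in range(300)],
    "junior": [f"J{i}" for i in range(200)],
    "senior": [f"R{i}" for i in range(100)],
}

picked = stratified_sample(strata, 0.10, random.Random(1))
print({name: len(members) for name, members in picked.items()})
```

Every stratum is represented in the sample, which is the point of stratifying.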
Cluster Sampling
• Many mutually exclusive and exhaustive subsets.
• A random sample of the subsets is selected.
• Main assumption: elements within each cluster are heterogeneous
• Clusters are similar to one another (homogeneity between clusters)
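A minimal Python sketch of one-stage cluster sampling, assuming every element of a selected cluster is kept. The clusters here (50 city blocks of 20 households each) are hypothetical.

```python
import random

# Hypothetical clusters: 50 city blocks with 20 households each.
clusters = {
    f"block_{b}": [f"b{b}_h{h}" for h in range(20)]
    for b in range(50)
}

rng = random.Random(7)

# Step 1: draw a random sample of clusters, not of individual elements.
chosen = rng.sample(sorted(clusters), k=5)

# Step 2: keep every element of each chosen cluster.
sample = [household for name in chosen for household in clusters[name]]

print(chosen, len(sample))
```

Only the selection of clusters is random here, which contrasts with stratified sampling, where elements are drawn from within every subgroup.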
Stratified and Cluster Sampling
Stratified
• Population divided into a few subgroups
• Homogeneity within subgroups
• Heterogeneity between subgroups
• Choice of elements from within each subgroup
Cluster
• Population divided into many subgroups
• Heterogeneity within subgroups
• Homogeneity between subgroups
• Random choice of subgroups
What is probability?
• A measure of the likeliness that something will happen
• Any event has a probability
• Intrinsic ranking attached: justification of using numbers to describe
probability
• Quantify the uncertainty
• 0<=P(A)<=1
• all possible outcomes together must have probability 1
• P (A does not occur) = 1- P(A)
Distribution of what???
• Probability
• For each value (of X), there is some associated probability...
• In other words, the probability of observing the value (of X) is distributed
in a certain pattern...
• Example: selection of one person from our class:
• Probability = 1/54
• Evenly distributed
• “Uniform distribution”
Probability Function
• The probability of observing the value x depends
on x
• Probability of x is a function of x: probability function “f(x)”
• keep tossing a coin until you see a Head
• the outcomes are H, TH, TTH, ...
• X : number of the tosses before H
• Possible value: 0, 1, 2, ....
• The probability function for X is $f(x) = (1/2)^{x+1}$ for $x = 0, 1, 2, \ldots$ (assuming a fair coin)
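For a coin with P(Head) = p, seeing x tails and then a head has probability (1 - p)^x · p, which is (1/2)^(x+1) for a fair coin. The Python sketch below compares that function with a simulation. The fair-coin default, the function names, and the number of simulated runs are assumptions for illustration.

```python
import random

def f(x, p=0.5):
    """P(X = x): x tails followed by one head, with P(head) = p."""
    return (1 - p) ** x * p

def tosses_before_head(rng, p=0.5):
    """Simulate tossing until the first head; return the number of tails."""
    count = 0
    while rng.random() >= p:   # tail, which happens with probability 1 - p
        count += 1
    return count

rng = random.Random(0)
draws = [tosses_before_head(rng) for _ in range(100_000)]

# Compare the probability function with the simulated relative frequency.
for x in range(4):
    print(x, f(x), draws.count(x) / len(draws))
```

The probabilities 1/2, 1/4, 1/8, ... add up to 1, as required of all possible outcomes together.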
All types of distributions...
• Normal
• Uniform
• Exponential
• Chi-squared
• Bernoulli
• Poisson
• Beta
• Gamma
• Wishart
Normal Distribution
• Symmetric, single-peaked, bell-shaped
• examples: SAT scores; IQ scores; MPG ratings for 2014
(or any model year) vehicles; unemployment rates in 50
states
• 68-95-99.7 rule
• 68% of the observations fall within 1 s.d. of the mean
• ...
$$Z = \frac{X - \mu}{\sigma}$$
where
$X$ = value of the variable
$\mu$ = mean of the variable
$\sigma$ = standard deviation of the variable
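The Z value and the 68-95-99.7 rule can be checked with Python's standard library. This is a sketch only: the helper names `z_score` and `share_within` are invented here, and the example score (130 on a scale with mean 100 and s.d. 15) is hypothetical.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def share_within(k):
    """Share of a normal distribution within k s.d. of its mean."""
    std_normal = NormalDist()  # mean 0, s.d. 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))

# Hypothetical example: a score of 130 when the mean is 100 and the s.d. is 15.
z = z_score(130, 100, 15)
print(z)
```

Converting to Z puts any normal variable on the same scale, so one table of standard normal probabilities serves all of them.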
• The sample mean follows a normal
distribution
• The mean of ALL these sample means would
be equal to the population mean, $\mu$
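A small simulation can illustrate both points. The Python sketch below assumes a made-up, skewed population of 10,000 values, draws 2,000 samples of size 40, and compares the average of the sample means with the population mean. It also compares their spread with the population s.d. divided by the square root of n, the usual standard-error result.

```python
import random
from statistics import mean, pstdev

rng = random.Random(3)

# Hypothetical skewed population: 10,000 exponential values with mean near 50.
population = [rng.expovariate(1 / 50) for _ in range(10_000)]
mu = mean(population)
sigma = pstdev(population)

n = 40  # size of each sample
sample_means = [mean(rng.sample(population, n)) for _ in range(2_000)]

# The mean of the sample means should sit close to the population mean.
print(round(mu, 2), round(mean(sample_means), 2))

# Their spread should be close to sigma / sqrt(n).
print(round(sigma / n ** 0.5, 2), round(pstdev(sample_means), 2))
```

Even though the population is far from bell-shaped, a histogram of `sample_means` would look roughly normal.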
Sample Size Determination (means)
The sample size required is given by:
$$n = \frac{Z^2 \sigma^2}{E^2}$$
$Z$ = level of confidence expressed in standard errors
1.65 for 90% confidence
1.96 for 95% confidence
2.58 for 99% confidence
$\sigma$ = population standard deviation
$E$ = acceptable amount of sampling error
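The sample size formula for means, n = Z²σ²/E², can be wrapped in a small Python helper. The function name, the choice to round up to the next whole respondent, and the example inputs (σ = 10, E = 2) are assumptions for illustration.

```python
import math

# Z values for the three confidence levels listed on the slide.
Z_VALUES = {90: 1.65, 95: 1.96, 99: 2.58}

def sample_size_for_mean(confidence, sigma, error):
    """n = Z^2 * sigma^2 / E^2, rounded up to a whole respondent."""
    z = Z_VALUES[confidence]
    return math.ceil((z * sigma / error) ** 2)

# Hypothetical inputs: population s.d. of 10, acceptable error of 2.
for level in (90, 95, 99):
    print(level, sample_size_for_mean(level, sigma=10, error=2))
```

Higher confidence or a smaller acceptable error both push the required sample size up.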
Sampling distribution of proportion
• Mean of the sample proportions: the population proportion, $P$
• Standard error (standard deviation): $\sqrt{P(1 - P)/n}$
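The standard error of a sample proportion is usually written as the square root of P(1 - P)/n. A brief Python sketch follows; the function name and the example values (P = 0.5, n = 100) are hypothetical.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical example: population proportion 0.5, sample size 100.
se = proportion_standard_error(0.5, 100)
print(round(se, 4))
```

The standard error is largest at P = 0.5 and shrinks as the sample size grows.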
Sampling Distribution
$n = 1.65^2 \times 27 / 0.9^2 = 90.75$, rounded up to 91
1. Ho: Mean importance (males) = Mean importance (females)
Ha: Mean importance (males) ≠ Mean importance (females)
2. T-test
3. Alpha = 0.05
4. t = 1.228, d.f. = 445, p value = .220
5. Since p-value (0.220) > alpha (0.05), we fail to reject Ho; at the 0.05 significance level,
the data do not show that males and females differ in the
importance they place on the internet as an information source on movie theaters.
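The decision in step 5 can be sketched in Python. With 445 degrees of freedom the t distribution is very close to the standard normal, so this sketch uses a normal approximation for the two-sided p-value rather than the exact t distribution. That substitution is an assumption made here to stay within the standard library, which is why the result only approximately matches the reported 0.220.

```python
from statistics import NormalDist

t_stat = 1.228   # test statistic reported in step 4
df = 445         # degrees of freedom (large, so t is close to normal)
alpha = 0.05     # significance level from step 3

# Two-sided p-value under the normal approximation to the t distribution.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

# Decision rule: reject Ho only when the p-value is below alpha.
reject_h0 = p_approx < alpha
print(round(p_approx, 3), reject_h0)
```

Failing to reject Ho means the data do not show a difference. It does not prove that the two group means are equal.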