Samplesize

This document discusses methods for determining the appropriate random sample size for research based on whether the purpose is to generalize results to the population or compare subgroups. It outlines using confidence intervals to calculate sample sizes for categorical or continuous data when generalizing to a population. When comparing subgroups, it recommends using a power analysis to determine sample size based on desired power, variance, effect size, and subgroup balance.
Methods for Determining Random Sample Size
This document discusses how to determine your random sample size based on
the overall purpose of your research project. Methods for determining the
random sample size are outlined.

Prepared by:
UW-Stout Office of Planning, Assessment, Research and Quality

Contact:
Susan Greene

Revised: 8/13/2012
3/20/2017

RANDOM SAMPLE DECISION TREE

Random Sample

PURPOSE: Sample Generalized to Population
Method: Confidence Intervals
Need to Know:
· Population
· Alpha
· Type of Data
· Margin of Error
· Estimate of Variance
Computations:
· Based on Data Type
· N Size
· Specific Estimates
· Possible To Do By Hand

PURPOSE: Sample Compared to Population
Method: Power Analysis
Need to Know:
· Statistical Testing needed
· Alpha
· Power (1-beta)
· Estimate for Variance, Absolute Effect Size, Balanced or Unbalanced Sub-Group
Computations:
· Based on Data Type
· Need to Use Application Developed by Statistician

Type of Data

Categorical
· Nominal Data
· 2 or more categories not ordered
· Can assign numbers but the value is meaningless
· EX. (yes/no) (male/female)

Continuous
· Evenly spaced categories or is a continuous number
· Distance between categories is the same
DEFINITIONS

Population vs. sample: The project population is the group of individuals you want to generalize your results to. These are the people you are interested in describing, comparing, or predicting. The project sample is the part of the population you select to produce the results. Typically, the population is everyone of interest, and the sample is a subset of the population.

Confidence interval: The range in a sample distribution within which the true population value is expected to lie, given a particular degree of confidence (typically 95% or 99%).

Null hypothesis: The project research question stated as a hypothesis that assumes there is no effect or no difference between comparison groups. Statistical analysis tests whether the null hypothesis can be rejected or not. Often symbolized as H0.

Alpha: Probability that you reject the null hypothesis when it is true; this is a false positive. Typically, alpha is set by the researcher prior to any statistical testing; common settings are 0.05 and 0.01. Often symbolized as α.

Beta: Probability that you fail to reject (accept) the null hypothesis when it is false; this is a false negative. Often symbolized as β.

Power: Probability that you reject the null hypothesis when it is false, that is, that you are able to detect a true effect. Often symbolized as 1 - β.

Categorical data: Also called nominal data. Data that have 2 or more categories that are not ordered. Numbers can be assigned, but their absolute values have no practical meaning. For example, yes/no responses or male/female.

Continuous data: May have evenly spaced categories or be a continuous number. The absolute distance between categories is the same, so the data can answer the question of how much difference there is between categories.

Margin of error: Tells us about the error due to sampling, that is, how well our sample represents the population.

Variance: Spread of scores/responses around the average.

Effect size: Difference between the average observed and expected effects; the observed average difference between 2 groups.

COMPUTATIONS
Notes:
1. For surveys or other archival data with more than one type of data, Cochran¹ suggests that
the researcher decide which type of data contains the most critical information for the
success of the project, and base the sample size on that data type. The researcher could also
calculate sample sizes for each type of data and then use the most reasonable number based
on available resources.
2. The results of the chosen estimation method will be for minimum random sample sizes only.
For surveys and longitudinal studies, the researcher will need to increase the sample
size due to non-response and drop-outs. The exact amount of adjustment will depend on
the particular circumstances of the study. It is best to consult with resident experts to
determine the adjustment factor for a specific project.
3. Sample size selection is also dependent on the precision of the measurement tool.
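
As a rough illustration of Note 2, the sketch below inflates a minimum sample size by an expected response rate. Dividing by the response rate is only one common rule of thumb and is an assumption here, not a method given in this document; the function name and the example figures (a minimum of 385 and a 40% response rate) are hypothetical. The appropriate adjustment for a real project should still be worked out with resident experts.

```python
import math


def inflate_for_nonresponse(min_sample_size, expected_response_rate):
    """Return how many people to contact so that, at the expected
    response rate, roughly `min_sample_size` responses come back.

    Assumption: a simple divide-by-response-rate adjustment.
    """
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    # Round up: you cannot contact a fraction of a person.
    return math.ceil(min_sample_size / expected_response_rate)


if __name__ == "__main__":
    # Hypothetical figures: 385 completed responses needed, 40% expected to respond.
    print(inflate_for_nonresponse(385, 0.40))
```

The same idea applies to drop-outs in longitudinal studies, with an expected retention rate taking the place of the response rate.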

Purpose: Sample Generalized to Population


When your project results are meant to generalize from the sample to the broader population, use
the methods outlined in the next section to select your sample size.

Examples of sample generalized to population:

· You are sending a survey to a random sample of UW-Stout students that contains a series
of yes/no questions. You want to collect enough responses to reasonably generalize the
results of your random sample to the entire UW-Stout student body. Follow the
“Confidence Interval Method -- Categorical Data” methodology.

· Your survey contains rating scale questions, for example a Likert-type scale where
1=strongly disagree, 2=disagree, 3=neutral, 4=agree, 5=strongly agree. You want to
collect enough responses from your random sample of UW-Stout students to be confident
in saying that the average ratings represent the opinion of all current Stout students.
Follow the “Confidence Interval Method -- Continuous Data” methodology. Note: if you don’t
agree that these types of survey questions yield continuous data, please use the
“Confidence Interval Method -- Categorical Data” methodology.

¹ Cochran, W. G. (1977). Sampling Techniques (3rd edition). New York: John Wiley & Sons.

Confidence Interval Method

Categorical Data:
Data needed prior to calculations:
· Specify population size
· Specify alpha and margin of error, typically set at 0.05 and 5% respectively.
· Specify variance estimate. For a dichotomous variable use ½ or 0.50 as the estimate of the
population proportion unless you have evidence otherwise.

There are two options for calculating sample size for categorical data – using an online tool, or
doing this by hand.
1. Online tool at http://www.raosoft.com/samplesize.html

2. Hand calculations using the Cochran method outlined in Bartlett, Kotrlik, and Higgins
(2001)2:
$$n_0 = \frac{t^2 \times p \times (1 - p)}{d^2} \qquad \text{(Equation 1)}$$
Where
· $n_0$ is the minimum estimated sample size
· $t$ is the value of the t-distribution corresponding to the chosen alpha level; for a two-tailed alpha of .05 this is 1.96 (the large-sample value)
· $p$ is the estimate of the population proportion*
· $d$ is the margin of error; Bartlett et al. recommend using 5% (0.05)

*When p is unknown, it is generally best to set it at .5

3. If the estimate $n_0$ is greater than 5% of the overall population, make the following correction:
$$n_1 = \frac{n_0}{1 + n_0 / \text{Population}} \qquad \text{(Equation 2)}$$
Where
· $n_1$ is the adjusted minimum estimated sample size
· Population is the total population size
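
The hand calculation for categorical data can be sketched in a few lines of Python. This is only an illustration of Equations 1 and 2 under the default values named above (t = 1.96, p = 0.5, d = 0.05); the function names and the population figures are hypothetical, and rounding the result up to a whole person is an added assumption rather than something the source specifies.

```python
import math


def cochran_categorical(p=0.5, d=0.05, t=1.96):
    """Equation 1: minimum sample size n0 for a proportion."""
    return (t ** 2) * p * (1 - p) / (d ** 2)


def finite_population_correction(n0, population):
    """Equation 2: adjusted sample size n1 for a small population."""
    return n0 / (1 + n0 / population)


def min_sample_size_categorical(population, p=0.5, d=0.05, t=1.96):
    """Apply Equation 1, then Equation 2 only when n0 exceeds 5% of the population."""
    n = cochran_categorical(p, d, t)
    if n > 0.05 * population:
        n = finite_population_correction(n, population)
    # Assumption: round up so the minimum is never undershot.
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical population sizes, for illustration only.
    for population in (4000, 100000):
        print(population, min_sample_size_categorical(population))
```

With the defaults, Equation 1 gives about 384.16 before any correction. For the hypothetical population of 4,000 that exceeds the 5% threshold (200), so Equation 2 applies; for 100,000 it does not.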

² Bartlett, Kotrlik, and Higgins (2001), “Organizational Research: Determining Appropriate Sample Size in Survey Research,” accessible online at
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.486.8295&rep=rep1&type=pdf

Continuous Data:
Hand computation using the method developed by Cochran and outlined in Bartlett et al.

The steps are:


1. Specify population size
2. Specify alpha and margin of error, typically set at 0.05 and 3% respectively.
   a. For rating scale questions, the margin of error would be 0.03 × the number of scale points, so
      for a 5-point scale the margin of error would be 0.15
3. Specify variance estimate. Both Bartlett et al. (2001) and Lenth³ (2001) suggest 3 methods
   for doing this:
   a. Do a pilot study to estimate variance
   b. Find variance estimates from published literature of similar studies
   c. Use the researcher’s experience
      i. For surveys, Bartlett et al. suggest using the following estimate for the
         standard deviation:
         $$S = \frac{\text{number of points on scale}}{\text{number of standard deviations}} \qquad \text{(Equation 3)}$$
         Where
         · $S$ is the estimate of the standard deviation
         · The typical number of standard deviations used for a distribution is 6 (three on each
           side of the mean), which covers more than 99% of the data in a normal distribution
         · For a 5-point scale, this would be 5/6 or 0.83
      ii. Lenth suggests constructing a histogram or other diagram of how you think the data
          will be distributed and estimating the variance from it.

4. Calculate minimum sample size:

$$n_0 = \frac{t^2 \times S^2}{d^2} \qquad \text{(Equation 4)}$$

Where
· $n_0$ is the minimum estimated sample size
· $t$ is the value of the t-distribution corresponding to the chosen alpha level; for a two-tailed alpha of .05 this is 1.96 (the large-sample value)
· $S$ is the estimate of the standard deviation
· $d$ is the margin of error

5. If the estimate $n_0$ is greater than 5% of the overall population, make the following correction:
$$n_1 = \frac{n_0}{1 + n_0 / \text{Population}} \qquad \text{(Equation 5)}$$
Where
· $n_1$ is the adjusted minimum estimated sample size
· Population is the total population size
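
The continuous-data steps can be sketched the same way. The code below follows Equations 3 through 5 for a rating scale, using the 3% margin of error and the 6-standard-deviation rule described above; the function names and the example populations are hypothetical, and rounding up to a whole person is an added assumption.

```python
import math


def estimate_sd(scale_points, num_sd=6):
    """Equation 3: rough standard deviation estimate for a rating scale."""
    return scale_points / num_sd


def scale_margin_of_error(scale_points, pct=0.03):
    """Step 2a: margin of error expressed in scale units."""
    return pct * scale_points


def cochran_continuous(s, d, t=1.96):
    """Equation 4: minimum sample size n0 for a mean."""
    return (t ** 2) * (s ** 2) / (d ** 2)


def min_sample_size_continuous(population, scale_points, pct=0.03, t=1.96):
    """Apply Equations 3 and 4, then Equation 5 when n0 exceeds 5% of the population."""
    s = estimate_sd(scale_points)
    d = scale_margin_of_error(scale_points, pct)
    n = cochran_continuous(s, d, t)
    if n > 0.05 * population:
        n = n / (1 + n / population)  # Equation 5
    # Assumption: round up so the minimum is never undershot.
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical populations, 5-point scale.
    for population in (1500, 100000):
        print(population, min_sample_size_continuous(population, scale_points=5))
```

For a 5-point scale, S is about 0.83 and d is 0.15, so Equation 4 gives roughly 118.6 before any correction. Because both S and d scale with the number of scale points, the uncorrected result is the same for any scale length under these defaults.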

³ Lenth, R. V. (2001), “Some Practical Guidelines for Effective Sample Size Determination,” The American
Statistician, 55, 187-193.

Purpose: Sample Compared to Population
When your project results are meant to compare the sample to the broader population, use the
methods outlined in the next section to select your sample size.

Examples of sample compared to population:


· You are surveying a random sample of UW-Stout students to determine their satisfaction
with different aspects of campus life.
   o Your survey contains rating scale questions, for example a Likert-type scale
     where 1=strongly disagree, 2=disagree, 3=neutral, 4=agree, 5=strongly agree.
   o You have collected demographic data such as gender and year in school.
   o You want to collect enough responses from your random sample of UW-Stout
     students to be confident in saying that the differences in the average ratings by
     demographic group represent the differences in the opinions by demographic
     group of all current Stout students. For example:
      ▪ Are there differences in the average ratings between males and females?
      ▪ Are there differences in the average ratings amongst the year in school
        groups?

Power Method
Option 1: Free online tool developed by Russell Lenth located at
http://www.cs.uiowa.edu/~rlenth/Power/#Advice

Option 2: Free software to download and run on your PC, information located at
http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3. G*Power offers more options
for selecting test type than the Lenth tool.

You will need to have the following information prior to obtaining your sample size results
· Statistical test you are interested in running

· Alpha – typically set at 0.05

· Power (1-beta) – typically set at 0.80

· Variance estimate – see discussion above

· Absolute effect size estimate – Lenth (2001) advises 2 alternatives:


1. Based on the Principal Investigator’s knowledge of the project, determine the effect
that you hope to see. This would establish an upper bound on the absolute effect size
and a lower bound on the sample size. Then ask if an effect half that size would be
important, noting that in most cases this would quadruple the sample size. This would
help to establish a lower bound on the absolute effect size and an upper bound on the
sample size. Keeping the power constant, you can use the different effect sizes to find
a range of sample sizes, review these keeping in mind your purpose and resources,
and then select your final sample size. Or conversely, you can use different effect
sizes and a given sample size and estimate the power, review these keeping in mind
your purpose and resources, and then select your final sample size.

2. Examine published literature related to the study and see what the typical effect sizes
are. Could you reasonably expect the same effect size? If so, use this as your base
absolute effect size.

· Determine if you will have balanced or unbalanced sub-groups. For example, if you are
making comparisons between men and women, will you have equal numbers in your
response sample?
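
To make the effect-size trade-off concrete, the sketch below uses a textbook normal approximation for comparing the means of two equally sized (balanced) groups with a two-sided test. It is not the algorithm used by the Lenth tool or G*Power, which work with exact distributions and offer many more test types, so their results will differ slightly. The function names and the example effect sizes and standard deviation are hypothetical.

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect an absolute difference
    `effect` between two group means, given a common standard deviation `sd`.

    Assumptions: two-sided test, balanced groups, normal approximation.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_power = _Z.inv_cdf(power)
    return 2 * ((z_alpha + z_power) * sd / effect) ** 2


def approx_power(n, effect, sd, alpha=0.05):
    """Approximate power for `n` per group (same assumptions as above;
    the negligible contribution of the opposite tail is ignored)."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    standard_error = sd * math.sqrt(2 / n)
    return _Z.cdf(effect / standard_error - z_alpha)


if __name__ == "__main__":
    sd = 5 / 6  # Equation 3 estimate for a 5-point scale
    # Hoped-for effect and half of it: halving the effect quadruples n.
    for effect in (0.5, 0.25):
        print(effect, math.ceil(n_per_group(effect, sd)))
    # Conversely: fix n per group and see what power each effect size gives.
    for effect in (0.5, 0.25):
        print(effect, round(approx_power(50, effect, sd), 2))
```

The first loop mirrors the advice to bracket the sample size between a hoped-for effect and one half that size; the second mirrors the converse approach of fixing the sample size and reviewing the resulting power. Unbalanced sub-groups need a different formula and are better handled with one of the tools listed above.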
