LESSON 5 - RESEARCH DESIGN

Uploaded by

Ihra Castillo

TOPIC THREE RESEARCH DESIGN

Meaning of research design


Research design refers to the way a study is planned and conducted, the procedures, and
techniques employed to address the research problem or question. It shows the tools required, the
resources needed, the cost involved, and the time schedule of anticipated progress. The main
objective of a research design is to enhance validity of research findings by controlling potential
sources of bias that may distort findings. It addresses the following questions:
- What is the study about?
- Why is the study being made?
- Where will the study be carried out?
- What type of data is required?
- Where can the required data be found?
- What periods of time will the study include?
- What will be the sample design?
- What techniques of data collection will be used?
- How will data be analyzed?
- In what style will the report be prepared?

The research design can be split into the following:


1. Sampling design: deals with the method of selecting items to be observed in the study.
2. Observational design: relates to conditions under which the observations are made.
3. Statistical design: concerns the question of how many items are to be observed and
how the information and data gathered are to be analyzed.
4. Operational design: deals with the techniques by which the procedures specified in the
three categories above can be carried out.

Features of a good research design


A good research design specifies
1. The means of obtaining information
2. The availability and skills of the researcher and staff
3. The objectives of the problem to be studied
4. The nature of the problem
5. The availability of time and money for the research

Concepts in research design


▪ Variable: a concept that can take on different quantitative values, such as income,
height or weight. Qualitative attributes are also quantified on the basis of the presence
or absence of the attribute.
▪ Continuous variable: one that can take on any value within a range, including fractional
values. For example, age.
▪ Discrete variable: one that can take on integer values only. For example, number of
children.
▪ Dependent/explained/regressand/endogenous variable: one that is a consequence of the other
variable(s).
▪ Independent/explanatory/regressor/exogenous variable: one that causes
changes in the dependent variable. For example, savings depend on income. Hence
savings is the dependent variable and income is the independent variable.
▪ Extraneous variable: an independent variable that is not related to the purpose of the
study but may affect the dependent variable. For example, when testing the relationship
between students' performance in economics and their self-concepts, intelligence may also
affect performance. But since it is not related to the purpose of the study, it is
extraneous.
▪ Control: restrained experimental conditions meant to minimize the effects of
extraneous variables.
▪ Confounded relationship: the relationship between the dependent variable and the independent
variable when the dependent variable is not free from the influence of extraneous
variables.
▪ Experimental group: one exposed to special or novel conditions.
▪ Control group: one exposed to the usual conditions.

Different Types of Research Designs

There are different research designs depending on the type of research:
A. Exploratory / formulative research studies: emphasize discovery of ideas and insights.
Hence they use a survey of relevant literature to build upon the work of others; an experience
survey of people who have had practical experience with the problem; and analysis of
insight-stimulating examples, usually where there is little experience to serve as a
guide. They use existing records and unstructured interviewing, among other methods, and are
flexible in design.

B. Descriptive research studies: are concerned with describing the characteristics of a
particular individual or group. They include studies concerned with specific predictions and with
narration of facts and characteristics concerning an individual, group or situation.

C. Diagnostic research studies: determine the frequency with which something occurs or its
association with something else. For example studies concerning whether certain
variables are associated.

In both descriptive and diagnostic research studies the researcher defines clearly what she wants
to measure, and finds adequate methods of measuring it along with a clear-cut definition of the
population she wants to study.

The research design must make enough provision for protection against bias and must maximize
reliability with due concern for the economical completion of the study.

The design should be rigid and focus on:


1. Formulating the objectives
2. Designing the methods of data collection
3. Selecting the sample
4. Collecting data
5. Processing and analyzing data
6. Reporting the findings

Differences in research design between exploratory and descriptive/diagnostic studies

| Research design | Exploratory/formulative study | Descriptive/diagnostic study |
|---|---|---|
| Overall design | Flexible design (provides opportunity for considering different aspects of the problem) | Rigid design (design must make enough provision for protection against bias and must maximize reliability) |
| Sampling design | Non-probability sampling design (purposive or judgment sampling) | Probability sampling design (random sampling) |
| Statistical design | No pre-planned design for analysis | Pre-planned design for analysis |
| Observational design | Unstructured instruments for collection of data | Structured or well thought out instruments for collection of data |
| Operational design | No fixed decisions about the operational procedures | Advanced decisions about operational procedures |

Research design in case of hypothesis testing research studies

- These studies are known as experimental studies and here the researcher tests the
hypothesis of causal relationships between variables.
- They require procedures that will not only reduce bias and increase reliability, but will
permit drawing of inferences about causality.

SAMPLING DESIGN

Census and Sample Survey

(i) Census
All items in any field of inquiry constitute a ‘Universe’ or ‘Population’. A complete enumeration
of all items in the ‘population’ is known as a census inquiry. It can be presumed that in such an
inquiry, when all items are covered, no element of chance is left and highest accuracy is obtained.
Demerits of census:
1. There is no way of checking the element of bias or its extent except through a resurvey or
use of sample checks.
2. Involves a great deal of time, money and energy.
3. At times, this method is practically beyond the reach of ordinary researchers.
4. Sometimes it is not possible to examine every item in the population, and sometimes
sufficiently accurate results can be obtained by studying only a part of the total population.
In such cases a census survey has no utility.
5. A census is not feasible where examining an item destroys it, since a complete enumeration
would destroy the whole population.

However, it needs to be emphasized that when the universe is a small one, it is no use resorting to
a sample survey.
(ii) Sample survey
When field studies are undertaken in practical life, considerations of time and cost almost
invariably lead to a selection of respondents i.e. selection of only a few items. The informants
selected should be as representative of the total population as possible in order to produce a
miniature cross-section. The selected respondents constitute what is called a sample and the
selection process is known as a sampling technique. The survey so conducted is known as
sample survey.

Algebraically, let the population size be N. If a part of size n (where n < N) of this
population is selected according to some rule, for studying some characteristic of the population,
then the group consisting of these n units is known as a sample.

Reasons for sampling:


1. Sampling reduces the time of a study: Data is collected and analyzed faster with sampling
than with an entire population.
2. Sampling reduces research costs: the more items included in a study, the greater the cost.
Every unit studied adds some cost to the research, whether in data collection or data
analysis, or both. This is why a census is usually a very costly exercise.
3. Sampling allows for better supervision, record keeping and training of researchers: with
fewer items to be studied, there is more time to train and supervise researchers. It will
also be easier to keep accurate data with a sample than with a large population.
4. Sampling produces more accurate results: There are more chances of making errors
when dealing with an entire population as opposed to a sample. Such errors include those
of omission, handling and calculation. The convenience of dealing with elements of a
sample makes it possible to produce more accurate results.
5. An infinitely large population needs sampling, since all the items cannot be examined.
6. If the study involves destruction of the elementary units, then studying the entire
population would mean destroying every unit.
7. The entire population may not be accessible.

(iii) A Sample Design


A sample design is a definite plan for obtaining a sample from a given population. It refers to the
technique or the procedure the researcher would adopt in selecting items for the sample. The sample
design may also lay down the number of items to be included in the sample, i.e. the size of the
sample. The sample design is determined before data are collected. The researcher must select or
prepare a sample design that is reliable and appropriate for her research study.

STEPS IN SAMPLE DESIGN

The researcher must pay attention to the following steps:
(i) Type of universe: Define the set of objects, i.e. the universe, to be studied. The universe
can be finite or infinite. In a finite universe, the number of items is certain, for example
the number of workers in a business organization. In an infinite universe, the
number of items is infinite, i.e. we have no idea of the total number of items, for example the
number of listeners of a specific radio programme.
(ii) Sampling unit: A decision has to be taken concerning the sampling unit before selecting the
sample. The sampling unit/unit of analysis may be a geographical one such as a state, district or
village; a construction unit such as a house or flat; a social unit such as a
family, club or school; or an individual. The researcher will have to decide on one
or more of such units to select for her study.
(iii) Source list: This is the sampling frame from which the sample is to be drawn. It contains the
names of all items of a universe (in the case of a finite universe). If a source list is not
available, the researcher has to prepare it. Such a list should be comprehensive, correct, reliable
and appropriate. It is extremely important for the source list to be as representative of the
population as possible.
(iv) Size of sample: Refers to the number of items to be selected from the universe to
constitute a sample. The size of sample should neither be excessively large, nor too small.
It should be optimum (one which fulfills the requirements of efficiency,
representativeness, reliability and flexibility). While deciding the size of a sample, the
researcher should determine the desired precision at an acceptable confidence level for
the estimate. The size of population variance needs to be considered as in case of larger
variance usually a bigger sample is needed. The size of population should be kept in
view for this also limits the sample size. The parameters of interest in a research study
should be kept in view, while deciding the size of the sample. Costs too dictate the size of
sample that can be drawn. As such, budgetary constraint must be taken into consideration
when the sample size is decided.
(v) Parameters of Interest: In determining the sample design, one must consider the question
of the specific population parameters, which are of interest. For instance, you may be
interested in knowing some average or the other measure concerning the population.
There may also be important sub-groups in the population about whom you would like to
make estimates.
(vi) Budgetary constraint: Cost considerations have a major impact upon decisions relating to
not only the size of the sample but also to the type of sample. This fact can even lead to
the use of a non-probability sample.
(vii) Sampling procedure: Finally, the researcher should decide the type of sample she will
use i.e., she must decide about the technique to be used in selecting the items for the
sample. In fact, this technique or procedure stands for the sample design itself. She
should select that sampling design which, for a given sample size and for a given cost,
has a smaller sampling error.
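
Step (iv) ties the sample size to the desired precision, the confidence level, the population variance and the population size. As a rough sketch only, this relationship can be illustrated with the standard formula for estimating a mean under simple random sampling, n0 = (z x sigma / e)^2, followed by a finite population correction. The formula, the function name `sample_size` and all figures below are assumptions made for illustration; they are not taken from these notes.

```python
import math

def sample_size(sigma, margin, z=1.96, population=None):
    """Approximate sample size for estimating a mean (illustrative sketch).

    sigma      -- assumed population standard deviation
    margin     -- desired precision (half-width of the confidence interval)
    z          -- normal deviate for the confidence level (1.96 for about 95%)
    population -- finite population size, if known
    """
    n0 = (z * sigma / margin) ** 2
    if population is not None:
        # Finite population correction: a small universe limits the sample size.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(sample_size(sigma=10, margin=2))                  # baseline
print(sample_size(sigma=20, margin=2))                  # larger variance
print(sample_size(sigma=10, margin=2, population=500))  # small finite universe
```

Under these assumed figures, doubling the standard deviation roughly quadruples the required sample, while a small finite population reduces it.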

CRITERIA OF SELECTING A SAMPLING PROCEDURE

The researcher must consider the costs involved in a sampling analysis. The costs include:
- The cost of collecting the data (it must be possible to obtain information from sample
selected given the available resources)
- The cost of an incorrect inference resulting from the data. There are two causes of
incorrect inferences.
o Systematic bias and
o Sampling error

1. Systematic bias
This results from errors in the sampling procedures. It cannot be reduced or eliminated by
increasing the sample size. At best the causes responsible for these errors can be detected and
corrected.

Causes of systematic bias in research


Usually a systematic bias is the result of one or more of the following factors:
- Inappropriate sampling frame: i.e. a biased representation of the universe, such as a
deliberate selection of a supposedly representative sample where items are picked at will.
- Selection using improper random methods: this may allow the researcher's desire to obtain a
certain result to influence the selection.
- Substitution of items in a sample when difficulties are encountered in obtaining information.
For example, in a house-to-house survey, the next house may be taken when there is no
reply from the targeted house. This will necessarily lead to a preponderance of houses of the
type that are occupied all day, for example houses of people with families.
- Failure to cover all the items in a sample: investigators give up on following
individuals. The items which are left out may contain important information.
- Defective measuring device: In survey work, systematic bias can result if the questionnaire
or the interview is biased. Similarly, if the physical measuring device is defective, there
will be systematic bias in the data collected through such a measuring device.
- Non-respondents: Bias results if we are unable to sample all the individuals initially included
in the sample. The reason is that in such a situation, the likelihood of establishing contact or
receiving a response from an individual is often correlated with the measure of what is to be
estimated.
- Indeterminacy principle: Individuals act differently when kept under observation than what
they do when kept in non-observed situations. For instance, if workers are aware that
somebody is observing them in course of a work study on the basis of which the average
length of time to complete a task will be determined and accordingly the quota will be set
for piece work, they generally tend to work slowly in comparison to the speed with which
they work if kept unobserved.
- Natural bias in the reporting of data: For example, people in general understate their
incomes if asked about them for tax purposes, but they overstate them if asked for reasons of
social status or affluence. Generally in psychological surveys, people tend to give what they
think is the 'correct' answer rather than revealing their true feelings.

Example of Biased Selection


An actual example of a case in which an unsatisfactory method of selection introduced serious bias
into the results follows.

A sample of households was taken with the object of making a study of morbidity. It was also
intended to use this sample for the study of birth rates. Before beginning this latter study, which
was subsidiary to the morbidity study, a comparison was made of the sizes of households of the
sample with those of the corresponding census tracts. This comparison is shown in table 5.1
(households of one were not included in the survey).

Table 5.1: Number of households in the original sample and in the census tracts (rows in the order given, beginning with households of two)

| | Original sample: number | Original sample: percent | Census tracts: number | Census tracts: percent |
|---|---|---|---|---|
| | 254 | 19.4 | 1,762 | 26.8 |
| | 338 | 25.9 | 1,745 | 26.5 |
| | 307 | 23.5 | 1,438 | 21.9 |
| | 201 | 15.4 | 853 | 13.0 |
| | 106 | 8.1 | 388 | 5.9 |
| | 46 | 3.5 | 208 | 3.2 |
| | 25 | 1.9 | 96 | 1.5 |
| | 29 | 2.2 | 86 | 1.3 |
| Total | 1,306 | 99.9 | 6,576 | 100.1 |

It is immediately apparent from the table that the sample contains a greater proportion of large
households than exists in the whole population. Households of two are under-represented in
the sample to the extent of 7.4 per cent of all households. This deficiency is attributed to the
failure of enumerators to include missed households, in which childless married women working
away from home are likely to predominate. In order to provide a more satisfactory sample, it
was necessary to make a further survey of those families that were missed altogether at the time of
the morbidity survey.
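
The comparison in Table 5.1 can be reproduced as a short sketch. The counts are those of the table, in table order (the first row being households of two); the variable and function names are illustrative only.

```python
# Counts from Table 5.1, in table order (first row: households of two).
sample_counts = [254, 338, 307, 201, 106, 46, 25, 29]
census_counts = [1762, 1745, 1438, 853, 388, 208, 96, 86]

def percentages(counts):
    """Share of each row in the column total, rounded to one decimal place."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]

sample_pct = percentages(sample_counts)
census_pct = percentages(census_counts)

# Negative gap: the row is under-represented in the sample.
# Positive gap: the row is over-represented in the sample.
gaps = [round(s - c, 1) for s, c in zip(sample_pct, census_pct)]

for row, (s, c, gap) in enumerate(zip(sample_pct, census_pct, gaps), start=1):
    print(f"row {row}: sample {s}%  census {c}%  gap {gap:+.1f}")
```

The first row shows the 7.4 point shortfall for households of two, and the rows from the third onward all have a positive gap, which is the over-representation of larger households noted above.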

It is interesting to note that the sample was apparently considered satisfactory for the morbidity
study because the workers had been primarily concerned with securing a sample representative
of the area in regard to prevalence of sickness rather than size of household. Actually, such a
biased sample can scarcely be regarded as satisfactory even for a morbidity study, since sickness
rates are likely to vary with the size and composition of the family (Adapted from Ngao and
Kumssa, 2004).

2. Sampling errors
These are the random variations in the sample estimates around the true population parameters.
Since they occur randomly and are equally likely to be in either direction, their nature happens to
be of compensatory type and the expected value of such errors happens to be equal to zero.

Sampling error decreases with the increase in the size of the sample, and it is of a smaller
magnitude in the case of homogeneous population.

Sampling error can be measured for a given sample design and size. The measurement of sampling
error is usually called the precision of the sampling plan. If the sample size is increased, the
precision is improved.
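
As a hedged illustration of how precision improves with sample size, the sketch below uses the standard error of a sample mean under simple random sampling, sigma / sqrt(n). This formula and the figures used are assumptions for illustration; the notes themselves do not state a formula.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean for a simple random sample (sketch)."""
    return sigma / math.sqrt(n)

# Assumed population standard deviation of 10 units.
for n in (25, 100, 400):
    print(n, standard_error(sigma=10, n=n))
```

Under this assumption the sample has to be quadrupled to halve the sampling error, which is one reason a better design is often cheaper than a bigger sample.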

But increasing the size of the sample has its own limitations: it increases the cost of collecting
data and enhances the systematic bias.

Thus the effective way to increase precision is usually to select a better sampling design which
has a smaller sampling error for a given sample size at a given cost.

In practice, however, people prefer a less precise design because:
- It is easier to adopt.
- Systematic bias can be controlled in a better way in such a design.

The characteristics of a good sample design are that it should:
(a) Result in a truly representative sample.
(b) Result in a small sampling error.
(c) Be viable in the context of funds available for the research study.
(d) Control systematic bias in a better way.
(e) Be such that the results of the sample study can be applied in general for the universe with a
reasonable level of confidence.

TYPES OF SAMPLE DESIGNS/SAMPLING PLANS/METHODS

There are different types of sample designs based on two factors:
(i) the representation basis and
(ii) the element selection technique.

- On the representation basis, the sample may be probability sampling (based on the concept
of random selection) or it may be non-probability sampling (non-random sampling).
- On element selection basis, the sample may be either unrestricted (each sample element is
drawn individually from the population at large) or restricted (all other forms of sampling).

Thus, sample designs are basically of two types:


(i) Non-probability sampling
(ii) Probability sampling.

Diagram showing basic sampling designs

| Element selection technique | Representation basis: probability sampling | Representation basis: non-probability sampling |
|---|---|---|
| Unrestricted sampling | Simple random sampling | Haphazard sampling or convenience sampling |
| Restricted sampling | Complex random sampling (such as cluster sampling, systematic sampling, stratified sampling, etc.) | Purposive sampling (such as quota sampling, judgment sampling) |

(Adapted from Kothari, 2004)

A: Non-probability sampling:
- Refers to the sampling procedure which does not afford any basis for estimating the
probability that each item in the population has of being included in the sample. In such a
design, personal element has a great chance of entering into the selection of the sample.
- The probability of selecting an element into the sample may not be the same for each
element. It is not quite possible to introduce randomization into this type of sampling.

Non-probability sampling is used where:
(a) It satisfactorily meets the sampling objectives. For example, if the sample is not required
to represent a cross-section of the population, non-probability sampling is suitable.
(b) It cuts cost and time requirements as compared to probability sampling.
(c) Probability sampling breaks down in its application, which may happen
due to the carelessness of the people applying it.

Types of non-probability sampling


1. Convenience Sampling: the researcher selects those respondents who are close at hand. This
saves time, money and effort. What is lost in accuracy is gained in efficiency. Volunteer
subjects such as those used by archaeologists or historians are an example of convenience
or accidental samples.
2. Purposive or Judgmental Samples: Sometimes a researcher selects a sub-group, which can be
judged to be representative of the population. Choosing the first three days of the month
as typical days for auditing, or picking a typical village to represent a national rural
population are examples of a purposive sample.
3. Snowball Samples: where a researcher picks an initial small sample of respondents, which
grows bigger and bigger as the information flow to the researcher increases. This technique
is common in observational research and in community studies. Snowballing is used to study
hidden or hard-to-reach populations, for example in studies of prostitution, homosexuality
and abortion, among others.
4. Quota Sampling: the interviewers are simply given quotas to be filled from different strata,
with some restrictions on how they are to be filled. In other words, the actual selection of
the items for the sample is left to the interviewer’s discretion. This method is very
convenient and is relatively inexpensive but introduces researcher bias.

B: Probability sampling / random sampling /chance sampling.


- Under this sampling design, every item of the universe has an equal chance of inclusion in
the sample. The results obtained from probability or random sampling can be assured in
terms of probability i.e. you can measure the errors of estimation or the significance of
results obtained from a random sample.
- Random sampling ensures the law of Statistical Regularity, which states that if on an
average the sample chosen is a random one, the sample will have the same composition and
characteristics as the universe. This is the reason why random sampling is considered as
the best technique of selecting a representative sample.
- Random sampling from a finite population refers to that method of sample selection, which
gives each possible sample combination an equal probability of being picked up and each
item in the entire population to have an equal chance of being included in the sample. This
implies sampling without replacement i.e. once an item is selected for the sample, it cannot
appear in the sample again. On the other hand, in sampling with replacement, the element
selected for the sample is returned to the population before the next element is selected. In
such a situation the same element could appear more than once in the same sample. This method
is used less frequently.
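
The difference between sampling without and with replacement can be sketched with Python's standard `random` module. The six-element population and the fixed seed are illustrative assumptions.

```python
import random

population = ["a", "b", "c", "d", "e", "f"]
rng = random.Random(0)  # fixed seed only so that the sketch is repeatable

# Without replacement: an element selected once cannot appear again.
without_replacement = rng.sample(population, k=3)

# With replacement: each draw is made from the full population,
# so the same element may appear more than once.
with_replacement = rng.choices(population, k=3)

print(without_replacement)
print(with_replacement)
```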

The implications of random sampling (or simple random sampling)


(i) It gives each element in the population an equal probability of getting into the sample and
all choices are independent of one another.
(ii) It gives each possible sample combination an equal probability of being chosen.

We can therefore define a simple random sample from a finite population as a sample which is
chosen in such a way that each of the ${}^{N}C_{n}$ possible samples has the same probability,
$1/{}^{N}C_{n}$, of being selected.

Example
Consider a certain finite population consisting of six elements (a, b, c, d, e, f), i.e. N = 6. Suppose
that you want to take a sample of size n = 3 from it. Then there are ${}^{6}C_{3} = 20$ possible distinct
samples of the required size, and they consist of the elements:
{abc}; {abd}; {abe}; {abf}; {acd}; {ace}; {acf}; {ade}; {adf}; {aef};
{bcd}; {bce}; {bcf}; {bde}; {bdf}; {bef}; {cde}; {cdf}; {cef}; and {def}.

If you choose one of these samples in such a way that each has the probability 1/20 of being
chosen, you will then call this a random sample.
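
The example can be checked with a short sketch that enumerates the possible samples and confirms the probability of 1/20. It also checks that drawing three elements one at a time without replacement gives any particular sample the same probability. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
import random

population = ["a", "b", "c", "d", "e", "f"]

# All distinct samples of size 3 from the population of 6.
samples = ["".join(combo) for combo in combinations(population, 3)]
print(len(samples))
print(samples)

# Picking one of the possible samples uniformly gives each the same probability.
prob_each = Fraction(1, len(samples))

# Drawing one element at a time without replacement: the chance that the three
# draws make up one particular sample, e.g. {abc}, is 3/6 x 2/5 x 1/4.
sequential = Fraction(3, 6) * Fraction(2, 5) * Fraction(1, 4)
print(prob_each, sequential)

# One simple random sample of size 3.
chosen = random.choice(samples)
print(chosen)
```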

How to select a random sample


(i) In simple cases, write each of the possible samples on a slip of paper, mix these slips
thoroughly in a container and then draw as a lottery either blindfolded or by rotating a
drum or by any other similar device. Such a procedure is obviously impractical, if not
impossible in complex problems of sampling.

(ii) You can write the name of each element of a finite population on a slip of paper, put the
slips of paper so prepared into a box or bag and mix them thoroughly and then draw the
required number of slips for the sample one after the other without replacement. In doing
so you must make sure that in successive drawing each of the remaining elements of the
population has the same chance of being selected. This procedure will also result in the
same probability for each possible sample.

In the earlier example, since you have a finite population of 6 elements and you want to
select a sample of size 3, the probability of drawing any one of the three elements of a given
sample in the first draw is 3/6, the probability of drawing one more of them in the second draw is
2/5 (the first element drawn is not replaced), and similarly the probability of drawing the
remaining one in the third draw is 1/4. Each probability is conditional on the earlier draws, so
the joint probability of the three elements which constitute the sample is the product of these
probabilities, and this works out to 3/6 x 2/5 x 1/4 = 1/20.

(iii) Use random number tables to select a random sample. Tippett gave 10,400 four-figure
numbers. He selected 41,600 digits from the census reports and combined them into fours
to give his random numbers, which may be used to obtain a random sample.

Illustration: The first thirty sets of Tippett's numbers are:

2952 6641 3992 9792 7979 5911
3170 5624 4167 9525 1545 1396
7203 5356 1300 2693 2370 7483
3408 2769 3563 6107 6913 7691
0560 5246 1112 9025 6008 8126

Suppose you are interested in taking a sample of 10 units from a population of 5000 units,
bearing numbers from 3001 to 8000. You will select 10 such figures from the above random
numbers which are not less than 3001 and not greater than 8000. If you randomly decide to read
the table numbers from left to right, starting from the first row itself, you obtain the following
numbers: 6641, 3992, 7979, 5911, 3170, 5624, 4167, 7203, 5356 and 7483. The units bearing the
above serial numbers would then constitute your required random sample.
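
The selection just described can be sketched in a few lines. The table values are the thirty numbers listed above; the function name `select_units` and the rule of skipping a number that has already been selected are assumptions made for illustration.

```python
# The thirty four-figure numbers listed above, row by row.
# 0560 is written as 560 because Python integers cannot carry a leading zero.
table = [
    [2952, 6641, 3992, 9792, 7979, 5911],
    [3170, 5624, 4167, 9525, 1545, 1396],
    [7203, 5356, 1300, 2693, 2370, 7483],
    [3408, 2769, 3563, 6107, 6913, 7691],
    [560, 5246, 1112, 9025, 6008, 8126],
]

def select_units(table, low, high, size):
    """Read the table left to right, row by row, keeping numbers in [low, high]."""
    selected = []
    for row in table:
        for number in row:
            # Assumption: a serial number already selected is skipped.
            if low <= number <= high and number not in selected:
                selected.append(number)
                if len(selected) == size:
                    return selected
    return selected

sample = select_units(table, low=3001, high=8000, size=10)
print(sample)
```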

Note that it is easy to draw random samples from finite populations with the aid of random
number tables only when lists are available and items are numbered. But in some situations, it is
often impossible to proceed in this way. For example, if you want to estimate the mean height of
trees in a forest, it would not be possible to number the trees, and choose random numbers to
select a random sample. In such a situation what you should do is to select some trees for the
sample haphazardly without aim or purpose, and should treat the sample as a random sample for
study purposes.

Random sample from an infinite universe


In a random sample from an infinite population, the selection of each item is controlled by the
same probabilities and successive selections are independent of one another.

For example, suppose you consider 20 throws of a fair die as a sample from the
hypothetically infinite population which consists of the results of all possible throws of the die.
If the probability of getting a particular number, say 1, is the same for each throw and the 20
throws are all independent, then the sample is random. Also, if you sample with replacement from
a finite population, the sample would be considered a random sample if in each draw all
elements of the population have the same probability of being selected and successive draws
happen to be independent.

Probability sampling methods/sampling designs

1. Simple Random Sampling


Simple random sampling is the basic probability sampling design. A simple random sample
is one in which every member of the population has an equal and independent chance of being
selected. Randomness as a sample selection process can be accomplished with either a lottery or a
table of random numbers. Both methods require a listing of the population units, i.e. the sampling
frame.

2. Systematic Sampling
You begin with a listing of all elements in the designated population. Then determine the desired
sample size n and divide it into the population size N to give an increment value, k = N/n. The
sample selected is composed of every kth element of the sampling frame. The first element is
selected by a random process in order to avoid bias.

For example, if a 4 per cent sample is desired, the first item would be selected randomly from the
first twenty-five and thereafter every 25th item would automatically be included in the sample.
Thus, in systematic sampling only the first unit is selected randomly and the remaining units of
the sample are selected at fixed intervals.
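
A minimal sketch of this procedure follows, assuming a hypothetical frame of 1,000 numbered units and a 4 per cent sample (an interval of 25). The function name `systematic_sample` is illustrative.

```python
import random

def systematic_sample(frame, interval, rng=random):
    """Random start within the first `interval` units, then every `interval`-th unit."""
    start = rng.randrange(interval)  # position 0 .. interval - 1
    return frame[start::interval]

frame = list(range(1, 1001))  # hypothetical frame: units numbered 1 to 1000
sample = systematic_sample(frame, interval=25)

print(len(sample))
print(sample[:3])
```

Only the starting unit depends on chance; every later unit is fixed once the start is known.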

Merits:
(i) It can be taken as an improvement over a simple random sample in as much as the
systematic sample is spread more evenly over the entire population.
(ii) It is an easier and less costly method of sampling and can be conveniently used even
in case of large populations.
Demerits:
(i) If there is a hidden periodicity in the population, systematic sampling will prove to be
an inefficient method of sampling.
For instance, suppose every 25th item produced by a certain production process is defective. If
you were to select a 4% sample of the items of this process in a systematic manner,
you would get either all defective items or all good items in the sample, depending
upon the random starting position.
(ii) If the population list is not in random order, the results of such sampling may, at
times, not be very reliable.
In practice, systematic sampling is used when lists of population are available and they are of
considerable length.
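
The hidden periodicity problem described under the demerits can be checked with a short sketch. The production run of 1000 items is an assumption made for illustration; the rule that every 25th item is defective comes from the example above.

```python
def systematic_sample(frame, interval, start):
    """Take every interval-th unit, beginning at position `start` (0-based)."""
    return frame[start::interval]

# Hypothetical production run: items 1..1000, every 25th item is defective.
items = [(number, number % 25 == 0) for number in range(1, 1001)]

def defective_share(start):
    """Share of defective items in the systematic sample with the given start."""
    sample = systematic_sample(items, 25, start)
    return sum(defective for _, defective in sample) / len(sample)

# Try every possible starting position within the first 25 items.
shares = {start: defective_share(start) for start in range(25)}
print(shares)
```

Only one starting position produces a sample made up entirely of defective items; every other start produces none, although 4 per cent of the population is defective.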

3. Stratified Sampling
The population is divided into layers or strata. Stratification is especially useful when a
population is characterized as heterogeneous but consists of a number of homogeneous sub-
populations or strata. When a population is homogeneous, little or no benefit is obtained from
stratification.

The population is divided into several sub-populations that are individually more homogeneous
than the total population, and then you select items from each stratum to constitute a sample.
Since each stratum is more homogeneous than the total population, you are able to get more
precise estimates for each stratum, and by estimating each of the component parts more
accurately, you get a better estimate of the whole. Stratification therefore results in more
reliable and detailed information.

Three questions are relevant in this context:


(a) How to form strata?
The strata should be formed on the basis of common characteristic(s) of the items to be put in
each stratum. The strata should be formed in such a way as to ensure that elements are most
homogeneous within each stratum and most heterogeneous between different strata. Strata
are purposively formed and are based on the past experience and personal judgment of the
researcher. Careful consideration of the relationship between the characteristics of the
population and the characteristics to be estimated is used to define the strata. At times, a pilot
study may be conducted to determine a more appropriate and efficient stratification plan.
You can do so by taking small samples of equal size from each of the proposed strata and
then examining the variances within and among the possible stratifications.

(b) How should items be selected from each stratum?
The usual method for selection of items for the sample from each stratum is simple random
sampling. Systematic sampling can be used if it is considered more appropriate in certain
situations.

(c) How many items should be selected from each stratum, or how should the sample size be
allocated to each stratum?
Usually the method of proportional allocation is followed, under which the sizes of the samples
from the different strata are kept proportional to the sizes of the strata. That is, if Pi
represents the proportion of the population included in stratum i, and n represents the total
sample size, the number of elements selected from stratum i is n · Pi.

Example
Suppose we want a sample of size n = 30 to be drawn from a population of
size N = 8000 which is divided into three strata of size N1 = 4000, N2 = 2400
and N3 = 1600.

Adopting proportional allocation, the sample sizes from each stratum are obtained as follows:

For stratum 1 with N1 = 4000: P1 = 4000/8000 = 0.5 and n1 = n · P1 = 30(0.5) = 15

For stratum 2 with N2 = 2400: P2 = 2400/8000 = 0.3 and n2 = n · P2 = 30(0.3) = 9

For stratum 3 with N3 = 1600: P3 = 1600/8000 = 0.2 and n3 = n · P3 = 30(0.2) = 6

Thus, using proportional allocation, the sample sizes for the different strata are 15, 9 and 6
respectively, which is in proportion to the sizes of the strata, viz., 4000 : 2400 : 1600.
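
The same arithmetic can be written as a short function. This is a sketch only; the function name is hypothetical and the figures are those of the example above.

```python
def proportional_allocation(n, stratum_sizes):
    """Allocate a total sample of n in proportion to the stratum sizes."""
    total = sum(stratum_sizes)
    return [n * size / total for size in stratum_sizes]

# Figures from the example: n = 30, N1 = 4000, N2 = 2400, N3 = 1600.
allocation = proportional_allocation(30, [4000, 2400, 1600])
print(allocation)  # [15.0, 9.0, 6.0]
```

With other figures the results will usually not be whole numbers and have to be rounded so that they still add up to n.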

Proportional allocation is considered the most efficient and an optimal design when the cost of
selecting an item is equal for each stratum, there is no difference in within-stratum variances, and
the purpose of sampling is to estimate the population value of some characteristic.

But if the purpose is to compare the differences among the strata, then equal sample
selection from each stratum would be more efficient even if the strata differ in size.
In cases where strata differ not only in size but also in variability and it is considered reasonable
to take larger samples from the more variable strata and smaller samples from the less variable
strata, a researcher can then account for both (differences in stratum size and differences in
stratum variability) by using disproportionate sampling design by requiring that:

n1/(N1σ1) = n2/(N2σ2) = … = nk/(Nkσk)

where σ1, σ2, …, σk denote the standard deviations of the k strata, N1, N2, …, Nk denote the sizes
of the k strata and n1, n2, …, nk denote the sample sizes of the k strata.

This is called ‘optimum allocation’ in the context of disproportionate sampling. The allocation in
such a situation results in the following formula for determining the sample sizes for the
different strata:

ni = n · Niσi / (N1σ1 + N2σ2 + … + Nkσk)

for i = 1, 2, …, k.

Example
A population is divided into three strata so that N1 = 5000, N2 = 2000 and N3 = 3000. Respective
standard deviations are:

σ1 = 15, σ2 = 18 and σ3 = 5.

How should a sample of size n = 84 be allocated to the three strata, if you want optimum
allocation using disproportionate sampling design?

Solution:
Using the disproportionate sampling design for optimum allocation, the sample sizes for different
strata will be determined as under:

Sample size for the stratum with N1 = 5000:

n1 = 84(5000)(15) / [(5000)(15) + (2000)(18) + (3000)(5)]

= 6300000/126000 = 50

Sample size for the stratum with N2 = 2000:

n2 = 84(2000)(18) / [(5000)(15) + (2000)(18) + (3000)(5)]

= 3024000/126000 = 24

Sample size for the stratum with N3 = 3000:

n3 = 84(3000)(5) / [(5000)(15) + (2000)(18) + (3000)(5)]

= 1260000/126000 = 10

The three sample sizes add up to the required total: 50 + 24 + 10 = 84.
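
As a check on the working, the optimum allocation can be computed with a few lines of code. The function name is hypothetical; the stratum sizes, standard deviations and total sample size are those of the example.

```python
def optimum_allocation(n, sizes, std_devs):
    """Allocate n so that each stratum's share is proportional to N_i * sigma_i."""
    weights = [size * sd for size, sd in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * weight / total for weight in weights]

# Figures from the example: n = 84, N = (5000, 2000, 3000), sigma = (15, 18, 5).
allocation = optimum_allocation(84, [5000, 2000, 3000], [15, 18, 5])
print(allocation)  # [50.0, 24.0, 10.0]
```

Stratum 2 is smaller than stratum 3 but receives more than twice as many items because its standard deviation is much larger.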

In addition to differences in stratum size and differences in stratum variability, you may have
differences in stratum sampling cost. You can then have a cost-optimal disproportionate
sampling design by requiring

n1/(N1σ1/√C1) = n2/(N2σ2/√C2) = … = nk/(Nkσk/√Ck)

where
C1 = cost of sampling in stratum 1
C2 = cost of sampling in stratum 2
Ck = cost of sampling in stratum k

and all other terms remain the same as explained earlier. The allocation in such a situation results
in the following formula for determining the sample sizes for different strata:

ni = n · (Niσi/√Ci) / (N1σ1/√C1 + N2σ2/√C2 + … + Nkσk/√Ck)   for i = 1, 2, …, k
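
A minimal sketch of cost-optimal allocation, assuming the usual form in which each stratum's share is proportional to Niσi/√Ci. The stratum sizes and standard deviations are reused from the earlier example; the unit costs are hypothetical, since the text gives no cost figures.

```python
import math

def cost_optimal_allocation(n, sizes, std_devs, costs):
    """Allocate n in proportion to N_i * sigma_i / sqrt(C_i) (assumed form)."""
    weights = [
        size * sd / math.sqrt(cost)
        for size, sd, cost in zip(sizes, std_devs, costs)
    ]
    total = sum(weights)
    return [n * weight / total for weight in weights]

sizes = [5000, 2000, 3000]
std_devs = [15, 18, 5]
costs = [1, 4, 1]  # hypothetical cost per sampled item in each stratum

allocation = cost_optimal_allocation(84, sizes, std_devs, costs)
equal_costs = cost_optimal_allocation(84, sizes, std_devs, [1, 1, 1])
print([round(x, 2) for x in allocation])
print(equal_costs)
```

When all costs are equal the result reduces to the optimum allocation of 50, 24 and 10; making stratum 2 four times as costly halves its weight and shifts sample to the other strata.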
NB:
It is not necessary that stratification be done keeping in view a single characteristic. Populations
are often stratified according to several characteristics. For example, in a system-wide survey
designed to determine the attitude of students toward a new teaching plan, a state college system
with 20 colleges might stratify the students with respect to class, sex and college. Stratification of
this type is known as cross-stratification, and up to a point such stratification increases the
reliability of estimates and is much used in opinion surveys.

The sample so constituted is the result of successive application of purposive (involved in
stratification of items) and random sampling methods. As such it is an example of mixed
sampling. The procedure where you first have stratification and then simple random sampling is
known as stratified random sampling.
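
Stratified random sampling, that is, stratification followed by simple random sampling within each stratum, can be sketched as below. The stratum names and unit labels are hypothetical; the stratum sizes and the allocation of 15, 9 and 6 follow the proportional allocation example.

```python
import random

def stratified_random_sample(strata, allocation, seed=None):
    """Draw a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, allocation[name])
        for name, units in strata.items()
    }

# Hypothetical frames for three strata of sizes 4000, 2400 and 1600.
strata = {
    "stratum 1": [f"s1-{i}" for i in range(4000)],
    "stratum 2": [f"s2-{i}" for i in range(2400)],
    "stratum 3": [f"s3-{i}" for i in range(1600)],
}
allocation = {"stratum 1": 15, "stratum 2": 9, "stratum 3": 6}

sample = stratified_random_sample(strata, allocation, seed=3)
print({name: len(units) for name, units in sample.items()})
```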

4. Cluster sampling
If the total area of interest is big, a convenient way in which a sample can be taken is to divide the
area into a number of smaller non-overlapping areas and then to randomly select a number of
these smaller areas (clusters), with the ultimate sample consisting of all (or samples of) units in
these small areas or clusters. Thus in cluster sampling the total population is divided into a
number of relatively small subdivisions, which are themselves clusters of still smaller units, and
then some of these clusters are randomly selected for inclusion in the overall sample.

Suppose you want to estimate the proportion of machine parts in an inventory that are
defective. Also assume that there are 20000 machine parts in the inventory at a given point of
time, stored in 400 cases of 50 each. Using cluster sampling, you would consider the 400
cases as clusters, randomly select ‘n’ cases and examine all the machine parts in each
randomly selected case.
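
The machine-parts example might be sketched as follows. The inventory is simulated, so the 5 per cent defect rate, the choice of n = 20 cases and the seed are all assumptions made for illustration.

```python
import random

rng = random.Random(0)

# Hypothetical inventory: 400 cases of 50 parts each; True marks a defective part.
cases = [[rng.random() < 0.05 for _ in range(50)] for _ in range(400)]

def cluster_sample_estimate(cases, n_cases, rng):
    """Randomly select whole cases and inspect every part in each selected case."""
    chosen = rng.sample(cases, n_cases)
    inspected = [part for case in chosen for part in case]
    return sum(inspected) / len(inspected), len(inspected)

estimate, parts_inspected = cluster_sample_estimate(cases, 20, rng)
print(parts_inspected, estimate)
```

Selecting 20 cases means opening only 20 of the 400 cases while still inspecting 1000 parts, which is where the cost saving of cluster sampling comes from.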

Cluster sampling requires grouping of the population: the units of the population are grouped by
cluster rather than by strata, for example, workers in the quality control division. Cluster sampling
is used only because it reduces cost by concentrating surveys in selected clusters. Hence estimates
based on cluster samples are usually more reliable per unit cost.

Demerits:
(i) Cluster sampling can lead to large sampling errors if it is not properly done, and is hence less
precise than simple random sampling.
(ii) There is not as much information in ‘n’ observations within a cluster as there happens
to be in ‘n’ randomly drawn observations.
5. Multi-stage Sampling
This is a form of random sampling which takes place in a series of stages. For example:
Stage 1: Random selection of regions
Stage 2: Random selection of neighbourhoods within regions and
Stage 3: Random selection of households within neighbourhoods

Any of the other methods of sampling may be used in each of these stages. If you select
randomly at all stages, you will have what is known as a multi-stage random sampling design. This
method of sampling is applied in big inquiries extending to a considerably large geographical
area, such as the entire country.

Suppose you want to investigate the working efficiency of nationalized banks in Kenya and you
want to take a sample of a few banks for this purpose. The first stage is to select large primary
sampling units such as provinces.
▪ If you then select certain districts and interview all banks in the chosen districts, this would
represent a two-stage sampling design, with the ultimate sampling units being clusters of districts.
▪ If instead of taking a census of all banks within the selected districts, you select certain towns
and interview all banks in the chosen towns, this would represent a three-stage sampling
design.
▪ If instead of taking a census of all banks within the selected towns, you randomly sample
banks from each selected town, then it is a case of using a four-stage sampling plan.
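
The four-stage plan can be sketched with a nested sampling frame. Everything about the frame below is hypothetical (4 provinces, 5 districts per province, 6 towns per district, 8 banks per town), as are the numbers selected at each stage; the sketch selects randomly at every stage, so it illustrates a multi-stage random sampling design.

```python
import random

def multistage_sample(frame, picks, rng):
    """Select picks[0] units at this stage, then recurse into each selected unit."""
    if isinstance(frame, dict):
        selected = []
        for key in rng.sample(sorted(frame), picks[0]):
            selected.extend(multistage_sample(frame[key], picks[1:], rng))
        return selected
    return rng.sample(frame, picks[0])  # final stage: sample the ultimate units

# Hypothetical frame: province -> district -> town -> list of banks.
frame = {
    f"P{p}": {
        f"D{d}": {
            f"T{t}": [f"P{p}-D{d}-T{t}-B{b}" for b in range(8)]
            for t in range(6)
        }
        for d in range(5)
    }
    for p in range(4)
}

# 2 provinces, 2 districts in each, 3 towns in each, 4 banks in each town.
banks = multistage_sample(frame, picks=(2, 2, 3, 4), rng=random.Random(42))
print(len(banks))  # 48
```

A listing of banks is needed only for the towns actually selected, which is why the sampling frame can be developed in partial units.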

Merits:
(i) It is easier to administer than most single stage designs mainly because of the fact
that sampling frame is developed in partial units.
(ii) A large number of units can be sampled for a given cost because of sequential
clustering, whereas this is not possible in most of the simple designs.
(iii) It is most useful in sampling a large number of units, especially when cost saving is
an important consideration.

Demerits:
Sampling errors are likely to be larger than those of other probability samples.

6. Area sampling
If the clusters happen to be geographic subdivisions, cluster sampling is known as
area sampling. Hence the considerations that apply to cluster designs, in which the primary
sampling unit is a cluster of elements, are also applicable to area sampling.

7. Sampling with probability proportional to size
In case the cluster sampling units do not have the same or approximately the same
number of elements, it is appropriate to use a random selection process where the probability of
each cluster being included in the sample is proportional to the size of the cluster. For this
purpose, you have to list the number of elements in each cluster, irrespective of the method of
ordering the clusters. Then you should sample systematically the appropriate number of elements
from the cumulative totals. The actual numbers selected in this way do not refer to individual
elements, but indicate which clusters and how many from each cluster are to be selected by simple
random sampling or by systematic sampling.

Merits:
(i) The results of this type of sampling are equivalent to those of a simple random
sample, i.e. they are not so biased.
(ii) The method is less cumbersome.
(iii) It is relatively less expensive.
Example
The following are the numbers of departmental stores in 15 towns: 35, 17, 10, 32, 70, 28, 26, 19,
26, 66, 37, 44, 33, 29 and 28. If you want to select a sample of 10 stores, using towns as clusters
and selecting within clusters proportional to size, how many stores from each town should be
chosen?

Solution: The information can be put in the following table:

Town no.   No. of departmental stores   Cumulative total   Sample
 1         35                            35                10
 2         17                            52
 3         10                            62                60
 4         32                            94
 5         70                           164                110, 160
 6         28                           192
 7         26                           218                210
 8         19                           237
 9         26                           263                260
10         66                           329                310
11         37                           366                360
12         44                           410                410
13         33                           443
14         29                           472                460
15         28                           500

Since there are 500 departmental stores from which you have to select a sample of 10 stores, the
appropriate sampling interval is 500/10 = 50. The starting point is 10 and then you successively add
increments of 50 till 10 numbers have been selected. The numbers thus obtained are: 10, 60,
110, 160, 210, 260, 310, 360, 410 and 460. Each number is assigned to the first town whose
cumulative total reaches it. From this, two stores should be selected randomly from
town number 5 and one each from town numbers 1, 3, 7, 9, 10, 11, 12 and 14. This sample of
10 stores is the sample with probability proportional to size.
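
The selection can be reproduced with a short sketch. The store counts, the sample size and the starting point of 10 are taken from the example; the function name is hypothetical, and in practice the starting point would be drawn at random from 1 to the sampling interval.

```python
from bisect import bisect_left
from itertools import accumulate

stores = [35, 17, 10, 32, 70, 28, 26, 19, 26, 66, 37, 44, 33, 29, 28]
cumulative = list(accumulate(stores))  # running totals: 35, 52, 62, ..., 500

def pps_systematic(cumulative, sample_size, start):
    """Systematic selection on the cumulative totals, starting at `start`."""
    interval = cumulative[-1] // sample_size
    points = [start + j * interval for j in range(sample_size)]
    # A point belongs to the first town whose cumulative total is >= the point.
    towns = [bisect_left(cumulative, point) + 1 for point in points]
    return points, towns

points, towns = pps_systematic(cumulative, sample_size=10, start=10)
print(points)  # [10, 60, 110, 160, 210, 260, 310, 360, 410, 460]
print(towns)   # [1, 3, 5, 5, 7, 9, 10, 11, 12, 14]
```

Town 5 appears twice because, with 70 stores, it spans more than one sampling interval.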

8. Sequential sampling
The ultimate size of the sample is determined according to mathematical decision rules on the
basis of information yielded as the survey progresses. This is usually adopted in the case of an
acceptance sampling plan in the context of statistical quality control. In sequential sampling, one
can go on taking samples one after another as long as one desires to do so.

When a particular lot is to be accepted or rejected on the basis of a single sample, it is known as
single sampling; when the decision is to be taken on the basis of two samples, it is known as
double sampling; and in case the decision rests on the basis of more than two samples but the
number of samples is certain and decided in advance, the sampling is known as multiple
sampling. But when the number of samples is more than two and is neither certain nor decided
in advance, this type of system is often referred to as sequential sampling.

Conclusion
▪ One should resort to simple random sampling because under it bias is generally eliminated
and the sampling error can be estimated.
▪ Purposive sampling is considered more appropriate when the universe happens to be small
and a known characteristic of it is to be studied intensively.
▪ At times, several methods of sampling may well be used in the same study.
