Sampling
Census
All the items in any field of inquiry constitute a Universe or Population. A complete enumeration of all items in the population is known as a Census Inquiry. Most of the time a census inquiry is not practically possible. A census is appropriate if the population is small, so that every individual can be included. A census inquiry requires enormous time, money and energy, and the slightest bias may be magnified as the number of observations increases.
SAMPLING
Sample: contacting a portion of the population (e.g., 10% or 25%). Sampling works best with a very large population (N) and is easiest with a homogeneous population.
Sample Survey: selection of a few items of the population.
Population Parameter
Sample Statistic
We measure the sample using statistics in order to draw inferences about the population and its parameters.
Definition of Sampling:
Measuring a small portion of something and then making a general statement about the whole thing.
Process of selecting a number of units for a study in such a way that the units represent the larger group from which they are selected.
(Ex. A company is thinking of lowering the price of its soap bar. After surveying sales of the product in one well-known mall, it concluded that it would not cut the price, since sales had increased compared with last year. Bias is present in this study because the company based its decision on sales at a well-known mall whose customers can afford high-priced products. It did not consider sales of the product in other areas, where its consumers are middle-class or lower-class.)
Precision: the sample represents the population. (Ex. Customers who visit a particular dress shop are asked to leave their phone numbers so that they can receive information about discounts and new arrivals. Management wishes to study customer satisfaction with the shop and collects comments and reactions from clients through phone interviews. The sample is not an exact representation of the population, since it is limited to customers who left their phone numbers and excludes those who did not.)
this is (bad)
Population
Sample
Population
1. Sampling makes possible the study of a large population. - The universe or population to be studied may be so large or unlimited that it is almost impossible to reach all of its members. Sampling makes this kind of study possible because only a small portion of the population needs to be involved, enabling the researcher to reach the whole through this small portion.
2. Sampling is for economy. - Research without sampling may be too costly. Sampling reduces the study population to a reasonable size so that expenses are greatly reduced.
3. Sampling is for speed. - Research without sampling might be too time-consuming.
4. Sampling is for accuracy. - If it takes too long to cover the whole study population, the results may be inaccurate. The research must be finished within a reasonable period of time so that the data are still true, valid and reasonable.
5. Sampling saves the sources of data from being all consumed. - The act of gathering data may consume all the sources of information without sampling. In such a case, there is no more data to apply the conclusion to.
3. If the population is very large and there are many sections and subsections, the sampling procedure becomes very complicated.
4. If the researcher does not possess the necessary skill and technical knowhow in sampling procedure.
Bias in the selection of a sample can arise if:
- The sampling is done by a non-random method, which generally means the selection is influenced by human choice.
- The sampling frame (census, phone book) that serves as the basis for selection does not cover the population adequately, completely or accurately.
- Groups of the population are not represented because you can't find them or they refuse to participate.
Terminology
Population: The entire group of people of interest from whom the researcher needs to obtain information.
Element (sampling unit): One unit from a population.
Sampling: The selection of a subset of the population.
Sampling Frame: Listing of the population from which a sample is chosen.
Census: A polling of the entire population.
Survey: A polling of the sample.
Parameter: The variable of interest.
Statistic: The information obtained from the sample about the parameter.
Goal: To be able to make inferences about the population parameter from knowledge of the relevant statistic, and so draw general conclusions about the entire body of units.
Critical Assumption: The sample chosen is representative of the population.
Types of universe (set of objects):
Finite universe: population of a city, students in a class, workers in a factory.
Infinite universe: stars in the sky, listeners of a specific programme.
Sampling Unit: may be a geographical unit such as a state, district or village; a construction unit such as a house or flat; a social unit such as a family or school; or an individual.
Source List: the Sampling Frame from which the sample is to be drawn. It should be correct, reliable and appropriate.
Size of Sample: refers to the number of items to be selected from the universe to constitute the sample. It should be neither too large nor too small. It should be optimum.
The proportion sampled varies inversely with the size of the population: a larger proportion is required of a smaller population, and a smaller proportion may do for a bigger population. For greater accuracy and reliability of results, a larger sample is desirable. In biological and chemical experiments that determine the reactions of humans, the use of a few persons is more desirable. When subjects are likely to be destroyed during the experiment, it is more feasible to use nonhumans.
1. Determine the size of the target population.
2. Decide on the margin of error. As far as possible the margin of error should not be higher than 5%; 3% is probably ideal.
3. Use the formula
   $n = \frac{N}{1 + N e^{2}}$
   where n = sample size, N = the size of the population, e = the margin of error.
4. Compute the sample proportion by dividing the result in step 3 by the population.

Example
1. Population is 5,346.
2. Margin of error is 3%.
3. Using the formula: $n = \frac{5346}{1 + 5346(0.03)^{2}} \approx 920$
4. Sample proportion (%) = 920 / 5346 ≈ 17%
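The sample size steps can be sketched in a few lines of Python. This is only an illustration of the formula n = N / (1 + N e^2); the function name `sample_size` and the choice to round up to a whole unit are assumptions, not part of the original slides.

```python
import math


def sample_size(population: int, margin_of_error: float) -> int:
    """Sample size from n = N / (1 + N * e^2), rounded up to a whole unit."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


population = 5346
n = sample_size(population, 0.03)   # margin of error of 3%
proportion = n / population         # step 4: sample proportion

print(n, f"{proportion:.0%}")
```

With the figures from the example, this gives a sample of 920, about 17% of the population.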
Random Convenience Sampling Purposive Sampling (such as quota sampling, snow ball and judgmental sampling)
Comparison factor
- List of the population elements
- Information about the sampling unit
- Sampling skill required
- Time requirement
- Cost per unit sampled
- Estimate of population parameter
- Sample representativeness
- Accuracy and reliability
- Measurement of sampling error
Non-probability sampling: unequal chance of being included in the sample (non-random)
- convenience sampling
- judgment sampling
- snowball sampling
- quota sampling
Probability Sampling
An objective procedure in which the probability of selection is nonzero and is known in advance for each population unit. It is also called random sampling. It ensures that information is obtained from a representative sample of the population. E.g. the lottery method, in which individual units are picked from the whole group.
Simple random sampling gives each element in the population an equal probability of getting into the sample, and all choices are independent of each other.
Examples include drawing names from a hat and selecting the winning raffle ticket from a large drum.
Drawing names or numbers out of a fishbowl, using a spinner, or rolling dice may be an appropriate way to draw a sample from a small population. When populations consist of large numbers of elements, sample selection is based on tables of random numbers or computer-generated random numbers.
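As a minimal sketch of selection with computer-generated random numbers, Python's standard `random` module can draw a sample without replacement from a sampling frame. The frame of 200 labelled units and the function name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)      # seed only makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement


# Hypothetical sampling frame: 200 numbered units.
frame = [f"unit-{i}" for i in range(1, 201)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```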
Systematic sampling
Suppose a researcher wants to take a sample of 1,000 from a list of 200,000 names. With systematic sampling, every 200th name from the list would be drawn. The procedure is extremely simple. A starting point is selected by a random process; then every nth name on the list is selected. The problem of periodicity occurs if a list has a systematic pattern, that is, if it is not random in character.
Advantages
- The systematic sample is spread more evenly over the entire population.
- It is an easier and less costly method.
Disadvantage
- If there is a hidden periodicity in the population, it will prove an inefficient method.
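The procedure can be sketched as follows, using the 1,000 out of 200,000 example. The function name `systematic_sample` and the use of integers in place of names are assumptions made for illustration.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every k-th element of the list."""
    k = len(frame) // n                        # sampling interval
    start = random.Random(seed).randrange(k)   # random start in the first interval
    return frame[start::k][:n]


# Stand-in for a list of 200,000 names.
names = list(range(200_000))
sample = systematic_sample(names, 1_000, seed=1)
print(len(sample), sample[1] - sample[0])
```

Here the interval works out to 200, so consecutive selections are always 200 positions apart. If the list repeats a pattern with that same period, the sample inherits the pattern, which is the periodicity problem described above.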
Stratified Sampling
A probability sampling procedure in which simple random subsamples that are more or less equal on some characteristic are drawn from within each stratum of the population.
Population is divided into mutually exclusive and exhaustive strata based on an appropriate population characteristic. (e.g. race, age, gender etc.) Simple random samples are then drawn from each stratum. Random sampling error will be reduced with the use of stratified sampling, because each group is internally homogeneous but there are comparative differences between groups.
Proportional stratified sample (difference in stratum size only): a stratified sample in which the number of sampling units drawn from each stratum is in proportion to the population size of that stratum.
E.g. Suppose we want a sample of size n = 30 to be drawn from a population of size N = 8000, which is divided into three strata of size N1 = 4000, N2 = 2400 and N3 = 1600. What will be the size of the sample from each stratum? Here Pi is the proportion of the population included in stratum i.
For the stratum with N1 = 4000, P1 = 4000/8000, hence n1 = n x P1 = 30 x (4000/8000) = 15
For the stratum with N2 = 2400, P2 = 2400/8000, hence n2 = n x P2 = 30 x (2400/8000) = 9
For the stratum with N3 = 1600, P3 = 1600/8000, hence n3 = n x P3 = 30 x (1600/8000) = 6
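The allocation above is simple enough to check in code. The sketch below assumes a helper named `proportional_allocation` (not from the slides) and rounds each share to the nearest whole unit; with less convenient numbers the rounded shares may not add up exactly to n and would need a manual adjustment.

```python
def proportional_allocation(n, strata_sizes):
    """Allocate a sample of size n in proportion to each stratum's size."""
    total = sum(strata_sizes)
    return [round(n * size / total) for size in strata_sizes]


allocation = proportional_allocation(30, [4000, 2400, 1600])
print(allocation)  # [15, 9, 6]
```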
disproportional stratified sample (difference in stratum size + difference in stratum variability) A stratified sample in which the sample size for each stratum is allocated according to analytical considerations.
E.g. A population is divided into 3 strata so that N1 = 5000, N2 = 2000 and N3 = 3000. The respective standard deviations are σ1 = 15, σ2 = 18 and σ3 = 5. How should a sample of size n = 84 be allocated to the three strata if we want optimum allocation using a disproportionate sampling design?
$n_i = n \cdot \frac{N_i \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + \dots + N_k \sigma_k}$
$n_1 = 84 \times \frac{5000 \times 15}{5000 \times 15 + 2000 \times 18 + 3000 \times 5} = 50$
$n_2 = 84 \times \frac{2000 \times 18}{5000 \times 15 + 2000 \times 18 + 3000 \times 5} = 24$
$n_3 = 84 \times \frac{3000 \times 5}{5000 \times 15 + 2000 \times 18 + 3000 \times 5} = 10$
Used if 1) some strata are too small 2) some strata are more important than others 3) some strata are more diversified than others
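A short sketch of optimum allocation, in which each stratum's share is proportional to its size times its standard deviation (commonly called Neyman allocation). The function name `optimum_allocation` is an assumption, and the inputs are the figures from the three-stratum example.

```python
def optimum_allocation(n, sizes, std_devs):
    """Allocate n so that each stratum's share is proportional to N_i * sigma_i."""
    weights = [size * sd for size, sd in zip(sizes, std_devs)]
    total = sum(weights)
    return [round(n * w / total) for w in weights]


allocation = optimum_allocation(84, [5000, 2000, 3000], [15, 18, 5])
print(allocation)  # [50, 24, 10]
```

The second stratum is the smallest but the most variable, so it receives more units (24) than the larger but more uniform third stratum (10).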
Cluster Sampling
An economically efficient sampling technique in which the primary sampling unit is not the individual element in the population but a large cluster of elements; clusters are selected randomly. Cluster samples frequently are used when lists of the sample population are not available. Ideally a cluster should be as heterogeneous as the population itself, a mirror image of the population. A problem may arise with cluster sampling if the characteristics and attitudes of the elements within the cluster are too similar.
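A minimal sketch of one-stage cluster sampling: whole clusters are drawn at random and every element in a selected cluster is observed. The village names, household labels and the function name `cluster_sample` are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # the cluster is the sampling unit
    return {name: clusters[name] for name in chosen}


# Hypothetical clusters: households grouped by village.
clusters = {
    "village-A": ["A1", "A2", "A3"],
    "village-B": ["B1", "B2"],
    "village-C": ["C1", "C2", "C3", "C4"],
    "village-D": ["D1", "D2"],
}
selected = cluster_sample(clusters, 2, seed=7)
print(selected)
```

Only the list of clusters is needed up front; individual elements have to be listed only inside the clusters that are drawn.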
Non-Probability Sampling
A subjective procedure in which the probability of selection for some population units is zero or unknown before drawing the sample.
- Information is obtained from a non-representative sample of the population.
- Sampling error cannot be computed.
- Survey results cannot be projected to the population.
Advantages
Cheaper and faster than probability sampling. Reasonably representative if collected in a thorough manner.
Convenience sampling refers to sampling by obtaining people or units that are conveniently available. A research team may determine that the most convenient and economical method is to set up an interviewing booth from which to intercept consumers at a shopping center. The college professor who uses his or her students has a captive sample: convenient, but perhaps not so representative. Convenience samples are best used for exploratory research when additional research will subsequently be conducted with a probability sample.
Judgment (purposive) sampling is a nonprobability sampling technique in which an experienced individual selects the sample based on his or her judgment about some appropriate characteristics required of the sample member. Researchers select samples that satisfy their specific purposes, even if they are not fully representative. Judgment sampling often is used in attempts to forecast election results. People frequently wonder how a television network can predict the results of an election with only 2 percent of the votes reported.
The purpose of quota sampling is to ensure that the various subgroups in a population are represented on pertinent sample characteristics to the exact extent that the investigators desire.
In quota sampling, the interviewer has a quota to achieve. For example, an interviewer in a particular city may be assigned 100 interviews, 35 with owners of Sony TVs, 30 with owners of Samsung TVs, 18 with owners of Panasonic TVs, and the rest with owners of other brands. The interviewer is responsible for finding enough people to meet the quota.
The major advantages of quota sampling over probability sampling are speed of data collection, lower costs, and convenience.
Quota sampling may be appropriate when the researcher knows that a certain demographic group is more likely to refuse to cooperate with a survey. For instance, if older men are more likely to refuse, a higher quota can be set for this group so that the proportion of each demographic category will be similar to the proportions in the population.
Snowball sampling involves using probability methods for an initial selection of respondents and then obtaining additional respondents through information provided by the initial respondents. This technique is used to locate members of rare populations by referrals.
Reduced sample sizes and costs are clear-cut advantages of snowball sampling. However, bias is likely to enter into the study because a person suggested by someone also in the sample has a higher probability of being similar to the first person.
1. Convenience
   Description: The researcher uses the most convenient sample or economical sample units.
   Cost and degree of use: Very low cost, extensively used.
   Advantages: No need for list of population.
   Disadvantages: Unrepresentative samples likely; random sampling error estimates cannot be made; projecting data beyond sample is relatively risky.
2. Judgment
   Description: An expert or experienced researcher selects the sample to fulfill a purpose, such as ensuring that all members have a certain characteristic.
   Cost and degree of use: Moderate cost, average use.
   Advantages: Useful for certain types of forecasting; sample guaranteed to meet a specific objective.
   Disadvantages: Bias due to expert's beliefs may make sample unrepresentative; projecting data beyond sample is risky.
3. Quota
   Description: The researcher classifies the population by pertinent properties, determines the desired proportion to sample from each class, and fixes quotas for each interviewer.
   Cost and degree of use: Moderate cost, very extensively used.
   Advantages: Introduces some stratification of population; requires no list of population.
   Disadvantages: Introduces bias in researcher's classification of subjects; nonrandom selection within classes means error from population cannot be estimated; projecting data beyond sample is risky.
4. Snowball
   Description: Initial respondents are selected by probability samples; additional respondents are obtained by referral from initial respondents.
   Cost and degree of use: Low cost, used in special situations.
   Advantages: Useful in locating members of rare populations.
   Disadvantages: High bias because sample units are not independent; projecting data beyond sample is risky.
1. Simple random
   Description: The researcher assigns each member of the sampling frame a number, then selects sample units by random method.
   Disadvantages: Requires sampling frame to work from; does not use knowledge of population that researcher may have; larger errors for same sampling size than in stratified sampling; respondents may be widely dispersed, hence cost may be higher.
2. Systematic
   Description: The researcher uses natural ordering or the order of the sampling frame, selects an arbitrary starting point, then selects items at a preselected interval.
   Cost and degree of use: Moderate cost, moderately used.
   Advantages: Simple to draw sample; easy to check.
   Disadvantages: If sampling interval is related to periodic ordering of the population, may introduce increased variability.
3. Stratified
   Description: The researcher divides the population into groups and randomly selects subsamples from each group. Variations include proportional, disproportional, and optimal allocation of subsample sizes.
   Cost and degree of use: High cost, moderately used.
   Disadvantages: Requires accurate information on proportion in each stratum; if stratified lists are not already available, they can be costly to prepare.
4. Cluster
   Description: The researcher selects sampling units at random, then does a complete observation of all units or draws a probability sample in the group.
   Cost and degree of use: Low cost, frequently used.
   Advantages: If clusters geographically defined, yields lowest field cost; requires listing of all clusters, but of individuals only within clusters; can estimate characteristics of
   Disadvantages: Larger error for comparable size than with other probability samples; researcher must be able to assign population members to unique cluster or else duplication or omission of individuals will result.
Meaning of Scaling
Scaling has been defined as a procedure for the assignment of numbers (or other symbols) to a property of objects in order to impart some of the characteristics of numbers to the properties in question.
Scaling describes the procedures for assigning numbers or symbols. This can be done in two ways:
1. Making a judgment about some characteristic of an individual and then placing him directly on a scale that has been defined in terms of that characteristic.
2. Constructing questionnaires in such a way that the score of an individual's responses assigns him a place on a scale.
Scaling involves creating a continuum upon which measured objects are located.
Scale Properties
Nominal Scales Ordinal Scales
Interval Scales Ratio Scales
Scale Properties:
In nominal scale the numbers serve only as labels or tags for identifying and classifying objects. The ordinal scale is a ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic. In interval scale numerically equal distances on the scale represent equal values in the characteristic being measured. The ratio scale possesses all the properties of the nominal, ordinal, and interval scales. It has an absolute zero point.
Nominal Scale
A nominal scale is the simplest of the four scale types, in which the numbers or letters assigned to objects serve as labels for identification or classification. Examples:
Males = 1, Females = 2
Sales Zone A = Islamabad, Sales Zone B = Rawalpindi
Drink A = Pepsi Cola, Drink B = 7-Up, Drink C = Mirinda
Ordinal Scale:
Ordinal measurements describe order, but not the relative size or degree of difference between the items measured. In this scale type, the numbers assigned to objects or events represent the rank order (1st, 2nd, 3rd, etc.) of the entities assessed. A Likert scale is a type of ordinal scale and may also use names with an order, such as:
- bad, medium and good
- very satisfied, satisfied, neutral, unsatisfied, very unsatisfied
Examples of Ordinal:
Career Opportunities = Moderate, Good, Excellent
Investment Climate = Bad, Inadequate, Fair, Good, Very Good
Merit = A grade, B grade, C grade, D grade
A problem with ordinal scales is that the difference between categories on the scale is hard to quantify, i.e., excellent is better than good, but how much better?
Interval Scale
All quantitative attributes can be measured in interval scales. Measurements belonging to this category can be counted, ranked, added, or subtracted to take the difference. The zero point in the interval scale is arbitrary, and negative values are also defined. The variables measured on an interval scale are known as interval variables or scaled variables.
A good example of this category is the measurements made in the Celsius scale. The temperatures inside an air-conditioned room and in the surroundings can be 16 °C and 32 °C. It is reasonable to say the temperature outside is 16 °C higher than inside, but it is not valid to say that outside is twice as hot as inside, which is obviously incorrect thermodynamically.
Ratio Scale
An interval scale with a true zero point can be considered as a ratio scale. The measurements in this category can be counted, ranked, added, or subtracted to take the difference. Also, these values can be multiplied or divided, and the ratio between two measurements makes sense. Most measurements in the physical sciences and engineering are done on ratio scales.
A good example is the Kelvin scale. It has an absolute zero point, and multiples of measurements make perfect sense. Taking the statement from the previous paragraph, if the two readings were 16 K and 32 K, it would be reasonable to say it is twice as hot outside (this is only for comparison; in reality it is difficult to make this statement unless you are in space).
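The difference between interval and ratio scales can be checked numerically with the room temperature example. The sketch below converts the two Celsius readings to kelvins; the variable names are illustrative only.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift the arbitrary Celsius zero to absolute zero."""
    return celsius + 273.15


inside_c, outside_c = 16.0, 32.0

# Differences are meaningful on an interval scale.
difference = outside_c - inside_c

# The Celsius ratio depends on where zero happens to sit, so it says nothing physical.
ratio_celsius = outside_c / inside_c

# On the Kelvin (ratio) scale the ratio is meaningful.
ratio_kelvin = celsius_to_kelvin(outside_c) / celsius_to_kelvin(inside_c)

print(difference, ratio_celsius, round(ratio_kelvin, 3))
```

The Celsius ratio is 2, but in kelvins the outside temperature is only about 5.5% higher than the inside temperature.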
Ordinal
Interval Ratio
H Dangi, FMS
Comparative Scales: Paired Comparison, Rank Order, Constant Sum
Noncomparative Scales: Likert, Semantic Differential, Stapel
[Table: paired comparison matrix; row "Bharti Wal-Mart" and the final row "Number of Times Preferred (b): 3, 2, 0, 4, 1" survive]
(a) A 1 in a particular box means that the need in that column was preferred over the need in the corresponding row. A 0 means that the row need was preferred over the column need.
(b) The number of times a need was preferred is obtained by summing the 1s in each column.
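The column-summing rule for paired comparisons can be sketched as below. The three items and the matrix values are hypothetical and are not the data from the slide; the function name `times_preferred` is also an assumption.

```python
# Hypothetical items and paired comparison matrix (not the slide's data).
# matrix[row][col] == 1 means the column item was preferred over the row item.
items = ["A", "B", "C"]
matrix = [
    [0, 1, 0],   # row A: B preferred over A; A preferred over C
    [0, 0, 0],   # row B: B preferred over A and over C
    [1, 1, 0],   # row C: A and B both preferred over C
]


def times_preferred(matrix):
    """Sum the 1s in each column to count how often each item was preferred."""
    return [sum(column) for column in zip(*matrix)]


counts = dict(zip(items, times_preferred(matrix)))
print(counts)  # {'A': 1, 'B': 2, 'C': 0}
```

With three items there are three pairs, so the counts add up to 3, and the item with the highest count is the most preferred.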
3. Banking _________
4. FMCG _________
5. Automobile _________
On the next slide, there are eight attributes of a security force. Please allocate 100 points among the attributes so that your allocation reflects the relative importance you attach to each attribute. The more points an attribute receives, the more important the attribute is. If an attribute is not at all important, assign it zero points. If an attribute is twice as important as some other attribute, it should receive twice as many points.
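A respondent's constant sum answer can be checked mechanically: no attribute may receive negative points and the points must total 100. The sketch below uses eight placeholder attribute names and made-up point values, since the actual attributes are not listed here; `validate_constant_sum` is a hypothetical helper.

```python
def validate_constant_sum(allocation, total=100):
    """Return the allocation if it is a valid constant sum response."""
    if any(points < 0 for points in allocation.values()):
        raise ValueError("points cannot be negative")
    if sum(allocation.values()) != total:
        raise ValueError(f"points must add up to {total}")
    return allocation


# Hypothetical response over eight placeholder attributes.
response = validate_constant_sum({
    "attribute_1": 30, "attribute_2": 15, "attribute_3": 15, "attribute_4": 10,
    "attribute_5": 10, "attribute_6": 10, "attribute_7": 10, "attribute_8": 0,
})

# attribute_1 has twice the points of attribute_2, so it is twice as important.
print(response["attribute_1"] / response["attribute_2"])
```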
[Table: constant sum allocations by segment, including Segment III]
Likert Scale
The Likert scale requires the respondents to indicate a degree of agreement or disagreement with each of a series of statements about the stimulus objects.
Scale: 1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree, 5 = Strongly agree (X marks the response)

1. Subiksha sells high quality merchandise.   1   2X  3   4   5
2. Subiksha has poor in-store service.        1   2X  3   4   5
3. I like to shop at Subiksha.                1   2   3X  4   5
The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated. When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
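Reverse scoring and the summated score can be sketched as follows, using the three responses marked in the example (2, 2 and 3) and treating statement 2 ("poor in-store service") as the negative statement. The function name `summated_score` is an assumption.

```python
SCALE_MAX = 5   # five-point scale: 1 = strongly disagree ... 5 = strongly agree


def summated_score(responses, negative_items):
    """Total score, reversing the scale for negatively worded statements."""
    total = 0
    for item, score in responses.items():
        if item in negative_items:
            score = SCALE_MAX + 1 - score   # 1<->5, 2<->4, 3 stays 3
        total += score
    return total


# Responses marked with X in the example, keyed by statement number.
responses = {1: 2, 2: 2, 3: 3}
total = summated_score(responses, negative_items={2})
print(total)  # 2 + 4 + 3 = 9
```

Disagreeing with a negative statement is a favourable response, which is why the 2 on statement 2 is counted as a 4.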
Stapel Scale
The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
Reliance Fresh (X marks the response)
     +5                +5
     +4                +4
     +3                +3
     +2                +2X
     +1                +1
HIGH QUALITY      POOR SERVICE
     -1                -1
     -2                -2
     -3                -3
     -4X               -4
     -5                -5
The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data.