Statistics Population Sample Randomly Probability Probability
This document discusses different sampling methods used in statistics including simple random sampling, systematic sampling, and stratified sampling. It provides details and examples of how each method works. For simple random sampling, it explains how to use a random number table to select a simple random sample without replacement from a population. It also gives an example of drawing a simple random sample from a dataset of 100 pieces of yarn tested.
Simple random sample (SRS)
In statistics, a simple random sample from a population is a sample chosen randomly, so that each possible sample has the same probability of being chosen. One consequence is that each member of the population has the same probability of being chosen as any other. In small populations such sampling is typically done "without replacement", i.e., one deliberately avoids choosing any member of the population more than once. Although simple random sampling can be conducted with replacement instead, this is less common and would normally be described more fully as simple random sampling with replacement.

Conceptually, simple random sampling is the simplest of the probability sampling techniques. It requires a complete sampling frame, which may not be available or feasible to construct for large populations. Even if a complete frame is available, more efficient approaches may be possible if other useful information is available about the units in the population.

Advantages are that it is free of classification error, and it requires minimum advance knowledge of the population. It best suits situations where the population is fairly homogeneous and not much information is available about the population. If these conditions are not true, stratified sampling may be a better choice.

Drawing Simple Random Samples using a Table of Random Numbers

An easy way to select an SRS is to use a random number table, which is a table of digits 0, 1, ..., 9, each digit having equal chance of being selected at each draw. To use this table in drawing a random sample of size n from a population of size N, we do the following:

1. Label the units in the population from 0 to N - 1.
2. Find r, the number of digits in N - 1. For example, if N = 100, then r = 2.
3. Read r digits at a time across the columns or rows of a random number table.
4. If the number in (3) corresponds to a number in (1), the corresponding unit of the population is included in the sample; otherwise the number is discarded and the next one is read.
5. Continue until n units have been selected.
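The five steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `srs_from_digits` is hypothetical, and a seeded pseudo-random generator stands in for a printed random number table.

```python
import random

def srs_from_digits(digits, population_size, sample_size, replace=False):
    """Select unit labels 0..N-1 by reading r digits at a time from `digits`."""
    r = len(str(population_size - 1))           # step 2: number of digits in N - 1
    chosen = []
    for start in range(0, len(digits) - r + 1, r):
        label = int(digits[start:start + r])    # step 3: read r digits at a time
        if label >= population_size:            # step 4: not a valid label, discard
            continue
        if not replace and label in chosen:     # repeat draw, discard when sampling without replacement
            continue
        chosen.append(label)
        if len(chosen) == sample_size:          # step 5: stop once n units are selected
            return chosen
    raise ValueError("ran out of random digits before n units were selected")

# Assumption: a seeded generator simulates the digits of a random number table.
rng = random.Random(2024)
table = "".join(str(rng.randrange(10)) for _ in range(400))

sample = srs_from_digits(table, population_size=100, sample_size=10)
print(sample)
```

Passing `replace=True` keeps repeated labels, which gives a simple random sample with replacement.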
If the same unit in the population is selected more than once in the above process of selection, then the resulting sample is called an SRS with replacement; otherwise it is called an SRS without replacement. The observations in the sample are the enumeration or readings of the units selected.

Example 1 (cf. Devore, J. L. and Peck, R., 1997, 56). To draw an SRS, consider the data below as our population. In a study of warp breakage during the weaving of fabric, one hundred pieces of yarn were tested. The number of cycles of strain to breakage was recorded for each yarn and the resulting data are given in the following table.

 86  146  251  653   98  249  400  292  131  169
175  176   76  264   15  364  195  262   88  264
157  220   42  321  180  198   38   20   61  121
282  224  149  180  325  250  196   90  229  166
 38  337   66  151  341   40   40  135  597  246
211  180   93  315  353  571  124  279   81  186
497  182  423  185  229  400  338  290  398   71
246  185  188  568   55   55   61  244   20  284
393  396  203  829  239  236  286  194  277  143
198  264  105  203  124  137  135  350  193  188

Here we have a population of size N = 100. To draw a simple random sample of size n = 10 without replacement, we proceed as follows:

1. Label the units in the population from 00 to 99.
2. Find r, the number of digits in N - 1. Here N = 100, so N - 1 = 99 and r = 2.
3. Read 2 digits at a time across the columns or rows of a random number table.
Part of a Random Number Table

8571  7683  5118  7669  6126  3663  3059  7807  9219  4383
9021  7013  0233  3348  4077  0864  5055  8631  5770  0505
0386  9792  1690  4874  3084  0228  8539  9375  5046  8635
4753  1992  8182  2658  2914  4005  1577  1714  7862  7009
0252  3070  1563  3008  3716  1267  1063  4415  8496  6779
1563  7833  5351  2278  0674  1252  6813  4016  3961  6890
9497  0105  5626  0529  0602  4573  1499  7772  7759  9405
9502  3408  6931  7946  4655  6823  7365  6140  0357  7069
7715  9083  6180  1131  7059  9808  9803  7883  5943  6649
6532  4048  3044  8035  1045  8349  5422  0315  7470  7679
1726  1390  4997  5632  ????  8184  8336  5684  5846  7056
2847  4715  2869  2576  5373  8175  0384  5348  8232  8186
5605  0939  9380  1647  7307  5893  7569  7092  4437  2722
7807  5908  5425  9679  2348  4926  1561  7299  2195  5374
3664  8269  5241  4436  5265  7571  8299  6006  2142  2273
0933  6131  2406  0715  5069  1663  8015  9120  0667  4884
8601  3370  3449  7158  8950  7413  9526  9670  3075  8321
8295  6327  5475  5650  ????  7678  3849  2207  6910  4166

Suppose we read two digits at a time, starting from the beginning of the above random number table, to get the following numbers:

85 71 76 83 51 18 76 69 61 26 36

4. Since the random number 85 corresponds to a unit in (1), we select unit 85 of the population in the sample. If any random number in (3) exceeds 99, it is discarded and the next one is read (with N = 100 every two-digit number is a valid label, so nothing is discarded for this reason here). After selecting 6 random numbers of two digits, we find the random number 76, which is discarded for an SRS without replacement because it appeared before. Continue until n = 10 units have been selected.

Thus we have the sample units:

85 71 76 83 51 18 69 61 26 36

so that the sample observations are:

81 262 290 229 368 396 135 195 234 185

An SRS with replacement in the above example would be:

81 262 290 229 368 396 290 135 195 234
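The selection of units in Example 1 can be retraced with a short Python sketch. It assumes the opening entries of the random number table are 8571, 7683, 5118, 7669, 6126 and 3663, which are the entries that yield the two-digit numbers 85 71 76 83 51 18 76 69 61 26 36 listed in the example. The helper names `read_units` and `take_sample` are hypothetical.

```python
def read_units(table_entries, width=2):
    """Join the table entries and split the digit stream into `width`-digit labels."""
    stream = "".join(table_entries)
    return [int(stream[i:i + width]) for i in range(0, len(stream) - width + 1, width)]

def take_sample(labels, n, replace=False):
    """Keep the first n labels, skipping repeats unless sampling with replacement."""
    sample = []
    for label in labels:
        if not replace and label in sample:
            continue                      # e.g. the second 76 is discarded
        sample.append(label)
        if len(sample) == n:
            break
    return sample

# Assumed opening entries of the random number table in Example 1.
first_entries = ["8571", "7683", "5118", "7669", "6126", "3663"]
labels = read_units(first_entries)

without_replacement = take_sample(labels, 10)
with_replacement = take_sample(labels, 10, replace=True)

print(without_replacement)  # [85, 71, 76, 83, 51, 18, 69, 61, 26, 36]
print(with_replacement)     # [85, 71, 76, 83, 51, 18, 76, 69, 61, 26]
```

The sketch stops at the unit labels; looking up the corresponding yarn observations depends on how the 100 values are numbered in the data table.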
Systematic sampling

Systematic sampling is the selection of every n-th element from a sampling frame, where n, the sampling interval, is calculated as:

n = Number in population / Number in sample

Using this procedure each element in the population has a known and equal probability of selection. This makes systematic sampling functionally similar to simple random sampling. It is, however, much more efficient and much less expensive to do. The researcher must ensure that the chosen sampling interval does not hide a pattern. Any pattern would threaten randomness. A random starting point must also be selected.

Stratified sample

When sub-populations vary considerably, it is advantageous to sample each subpopulation (stratum) independently. Stratification is the process of grouping members of the population into relatively homogeneous subgroups before sampling. The strata should be mutually exclusive: every element in the population must be assigned to only one stratum. The strata should also be collectively exhaustive: no population element can be excluded. Then random or systematic sampling is applied within each stratum. This often improves the representativeness of the sample by reducing sampling error. It can produce a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population. There are several possible strategies:

1. Proportionate allocation uses a sampling fraction in each of the strata that is proportional to that of the total population. If the population consists of 60% in the male stratum and 40% in the female stratum, then the relative size of the two samples (one of males, one of females) should reflect this proportion.
2. Optimum allocation (or disproportionate allocation): the sampling fraction in each stratum is proportionate to the standard deviation of the distribution of the variable. Larger samples are taken in the strata with the greatest variability to generate the least possible sampling variance.
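Systematic sampling and its use within strata can be sketched together in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sampling interval is taken as the integer part of population size divided by sample size (so trailing elements of a frame whose size is not a multiple of the sample size are never selected), and the two strata are made-up labels in the 60%/40% proportion mentioned above.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Take every k-th element of `frame` after a random start within the first interval."""
    interval = len(frame) // sample_size       # sampling interval = population / sample
    start = rng.randrange(interval)            # random starting point
    return [frame[start + i * interval] for i in range(sample_size)]

def stratified_sample(strata, sizes, rng):
    """Apply systematic sampling independently within each stratum."""
    return {name: systematic_sample(units, sizes[name], rng)
            for name, units in strata.items()}

rng = random.Random(7)

# A frame of 100 units sampled with interval 100 / 10 = 10.
frame = list(range(100))
print(systematic_sample(frame, 10, rng))

# Hypothetical strata of 60 and 40 units, sampled in the same 60:40 proportion.
strata = {
    "male": [f"M{i}" for i in range(60)],
    "female": [f"F{i}" for i in range(40)],
}
picked = stratified_sample(strata, {"male": 6, "female": 4}, rng)
print(picked)
```

Because the start is drawn once per frame, the whole sample is fixed by that single random choice, which is why a periodic pattern in the frame that matches the interval would distort the result.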
A real-world example of using stratified sampling would be for a US political survey. If we wanted the respondents to reflect the diversity of the population of the United States, the researcher would specifically seek to include participants of various minority groups, defined for example by race or religion, based on their proportionality to the total population as mentioned above. A stratified survey could thus claim to be more representative of the US population than a survey of simple random sampling or systematic sampling.

Similarly, if population density varies greatly within a region, stratified sampling will ensure that estimates can be made with equal accuracy in different parts of the region, and that comparisons of sub-regions can be made with equal statistical power. For example, in Ontario a survey taken throughout the province might use a larger sampling fraction in the less populated north, since the disparity in population between north and south is so great that a sampling fraction based on the provincial sample as a whole might result in the collection of only a handful of data from the north.

Advantages

- focuses on important subpopulations but ignores irrelevant ones
- improves the accuracy of estimation
- efficient
- sampling equal numbers from strata varying widely in size may be used to equate the statistical power of tests of differences between strata.

Disadvantages

- can be difficult to select relevant stratification variables
- not useful when there are no homogeneous subgroups
- can be expensive
- requires accurate information about the population, or introduces bias
- looks randomly within specific sub-headings.

Choice of sample size for each stratum

In general the size of the sample in each stratum is taken in proportion to the size of the stratum. This is called proportional allocation.
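Proportional allocation amounts to giving each stratum the share (stratum size / population size) of the total sample. A minimal Python sketch follows; the function name and the strata A, B and C are hypothetical, and the rule for rounding fractional shares (largest remainders first) is an assumption added here so that the allocated sizes always sum to the requested sample size.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Split `sample_size` across strata in proportion to their sizes."""
    total = sum(stratum_sizes.values())
    # Exact share of each stratum as (whole units, remainder).
    quotas = {name: divmod(sample_size * size, total)
              for name, size in stratum_sizes.items()}
    allocation = {name: whole for name, (whole, _) in quotas.items()}
    leftover = sample_size - sum(allocation.values())
    # Assumed rounding rule: leftover units go to the largest remainders.
    by_remainder = sorted(quotas, key=lambda name: quotas[name][1], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical population of 1000 units in three strata, sample of 50.
print(proportional_allocation({"A": 500, "B": 300, "C": 200}, 50))
# {'A': 25, 'B': 15, 'C': 10}
```

When every share is a whole number, as in this example, the rounding step does nothing.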
Suppose that in a company there are the following staff:

male, full time: 90
male, part time: 18
female, full time: 9
female, part time: 63
Total: 180

and we are asked to take a sample of 40 staff, stratified according to the above categories. The first step is to find the total number of staff (180) and calculate the percentage in each group.

% male, full time = (90 / 180) x 100 = 0.5 x 100 = 50
% male, part time = (18 / 180) x 100 = 0.1 x 100 = 10
% female, full time = (9 / 180) x 100 = 0.05 x 100 = 5
% female, part time = (63 / 180) x 100 = 0.35 x 100 = 35

This tells us that of our sample of 40,

50% should be male, full time.
10% should be male, part time.
5% should be female, full time.
35% should be female, part time.

50% of 40 is 20.
10% of 40 is 4.
5% of 40 is 2.
35% of 40 is 14.

Sometimes there is greater variability in some strata compared with others. In this case, a larger sample should be drawn from those strata with greater variability.

Cluster sample

Cluster sampling is a sampling technique used when "natural" groupings are evident in the population. The total population is divided into these groups (or clusters), and a sample of the groups is selected. Then the required information is collected from the elements within each selected group. This may be done for every element in these groups, or a subsample of elements may be selected within each of these groups.

Elements within a cluster should ideally be as heterogeneous as possible, while the clusters themselves should be similar to one another: each cluster should be a small-scale version of the total population. The clusters should be mutually exclusive and collectively exhaustive. A random sampling technique is then used to choose which clusters to include in the study. In single-stage cluster sampling, all the elements from each of the selected clusters are used. In two-stage cluster sampling, a random sampling technique is applied to the elements from each of the selected clusters.
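Single-stage and two-stage cluster sampling differ only in what happens inside the selected clusters, which a short Python sketch can make concrete. The function name, the village labels and the household identifiers are all hypothetical, and simple random sampling is assumed at both stages.

```python
import random

def cluster_sample(clusters, n_clusters, rng, per_cluster=None):
    """Select whole clusters at random, optionally subsampling within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: sample the clusters
    result = {}
    for name in chosen:
        elements = clusters[name]
        if per_cluster is None:
            result[name] = list(elements)               # single-stage: keep every element
        else:
            k = min(per_cluster, len(elements))
            result[name] = rng.sample(elements, k)      # stage 2: sample within the cluster
    return result

# Hypothetical population: 5 villages (clusters) of 20 households each.
clusters = {f"village_{v}": [f"v{v}_h{h}" for h in range(20)] for v in range(5)}
rng = random.Random(11)

single_stage = cluster_sample(clusters, 2, rng)
two_stage = cluster_sample(clusters, 2, rng, per_cluster=3)

print({name: len(units) for name, units in single_stage.items()})
print(two_stage)
```

In both cases only the selected clusters contribute data, which is what distinguishes this design from stratified sampling, where every stratum is sampled.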
The main difference between cluster sampling and stratified sampling is that in cluster sampling the cluster is treated as the sampling unit, so analysis is done on a population of clusters (at least in the first stage). In stratified sampling, the analysis is done on elements within strata. In stratified sampling, a random sample is drawn from each of the strata, whereas in cluster sampling only the selected clusters are studied. The main objective of cluster sampling is to reduce costs by increasing sampling efficiency. (This contrasts with stratified sampling, where the main objective is to increase precision.)

One version of cluster sampling is area sampling or geographical cluster sampling. Clusters consist of geographical areas. A geographically dispersed population can be expensive to survey. Greater economy than simple random sampling can be achieved by treating several respondents within a local area as a cluster. It is usually necessary to increase the total sample size to achieve equivalent precision in the estimators, but the savings in cost may make that feasible.

In some situations, cluster sampling is only appropriate when the clusters are approximately the same size. This can be achieved by combining clusters. If this is not possible, probability proportionate to size sampling is used. In this method, the probability of selecting any cluster varies with the size of the cluster, giving larger clusters a greater probability of selection and smaller clusters a lower probability. However, if clusters are selected with probability proportionate to size, the same number of interviews should be carried out in each sampled cluster so that each unit sampled has the same probability of selection.
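The reason for holding the number of interviews fixed can be checked with a small Python sketch. It is a simplified illustration: the cluster names and sizes are hypothetical, clusters are drawn with replacement through `random.choices` (practical designs often draw without replacement), and the probability shown is per draw. A unit's chance of selection is (cluster size / population size) x (interviews / cluster size), so the cluster size cancels.

```python
import random
from fractions import Fraction

def pps_draw(cluster_sizes, n_draws, rng):
    """Draw clusters with probability proportionate to size (with replacement)."""
    names = list(cluster_sizes)
    weights = [cluster_sizes[name] for name in names]
    return rng.choices(names, weights=weights, k=n_draws)

def unit_probability_per_draw(cluster_sizes, interviews_per_cluster):
    """Chance that a given unit is interviewed on one draw, for each cluster."""
    total = sum(cluster_sizes.values())
    return {name: Fraction(size, total) * Fraction(interviews_per_cluster, size)
            for name, size in cluster_sizes.items()}

# Hypothetical clusters of very different sizes (1000 units in total).
cluster_sizes = {"A": 600, "B": 300, "C": 100}

print(pps_draw(cluster_sizes, 4, random.Random(3)))
probabilities = unit_probability_per_draw(cluster_sizes, interviews_per_cluster=10)
print(probabilities)  # every cluster gives Fraction(1, 100)
```

If instead a fixed fraction of each selected cluster were interviewed, units in large clusters would be over-represented, because both factors would then favour them.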