
Block Designs

Randomized Complete Block Designs

Uploaded by

Muhammad Ali
CHAPTER V
Randomized Complete Block Design

V-1 Applications of the Randomized Complete Block Design

V-1.1 INTRODUCTION

If the whole of the experimental material, area, or time is not homogeneous, it may be possible to stratify or group the material into homogeneous subgroups. As explained in Chapter I, this is one of the methods for controlling the variability of experimental material. If the treatments are applied to the relatively homogeneous material within each stratum or group and replicated on the other strata, the design is a randomized complete block.

For the completely randomized design, no stratification of the experimental site (space, material, or time) is made. The treatments are randomly allotted to the experimental units. In the randomized complete block design the treatments are randomly allotted within each stratum, i.e., the randomization is restricted. Also, the variation among strata (replicates or blocks) is removed from the variation among replicates within treatments. Therefore, if it is desired to control one source of variation by stratification, the experimenter should select the randomized complete block rather than the completely randomized design.

Since the development of the randomized complete block design about 1925 [125, 127], the design has become extremely popular in a large number of fields. Its flexibility and ease of adaptation and analysis have made it the most popular of all designs, with the latin square being its closest rival. The blocks or replicates used may be days, observers, batches of material, animals, pens, patients, schools, classes, laboratories, ovens, etc., provided that these categories do not interact with the treatments (see section IX-3.5). In other words, the design may be used to control a source of variation in the experimental material and not solely the variation among blocks in a field.
V-1.2 ADVANTAGES AND DISADVANTAGES

The chief advantages of the randomized complete block design are:

(i) Accuracy. This design has been shown to be more accurate than the completely randomized design for most types of experimental work. The elimination of the block sum of squares from the error sum of squares usually results in a decrease in the error mean square.

(ii) Flexibility. No restrictions are placed on the number of treatments or on the number of replicates in the experiment. In general, at least two replicates are required to obtain tests of significance (see chapters on factorial experiments for exceptions). In addition, the check or other treatments may be included more than once with little complication to the analysis.

(iii) Ease of analysis. The statistical analysis is simple and rapid. Moreover, the error of any treatment comparison may be isolated and any number of treatments may be omitted from the analysis without complicating it. These facilities may be useful when certain treatment differences turn out to be very large, when some treatments produce failures, or when the experimental errors for the various comparisons are heterogeneous.

The chief disadvantage of the randomized complete block design is that it is not suitable for large numbers of treatments or for cases in which the complete block contains considerable variability.

Cochran [45, 47] found for experiments carried out at the Rothamsted Experimental Station and associated centers between 1927 and 1934 that the error mean square for a randomized complete block design was 60 per cent of the error for completely randomized designs. Assuming equal costs of conducting the experiments, it would require ten replicates of a completely randomized design to obtain an amount of information equal to that from a randomized complete block design with six replicates.
V-1.3 LAYOUT AND ANALYSIS

Using the same example as for the completely randomized design, the five treatments A, B, C, D, and E may be included in each of the four blocks. The following diagram illustrates an experimental layout for the field, laboratory, or greenhouse:

[Layout diagram: Blocks I, II, III, and IV, each containing the treatments A, B, C, D, and E in a separate random order]

The breakdown of the total degrees of freedom is

Source of variation      df    Mean square
Among 5 treatments        4    T
Among 4 blocks            3    R
Remainder or error       12    E
Total                    19

As is apparent from the analysis, 3 of the degrees of freedom are segregated from the error degrees of freedom for a completely randomized design. These 3 degrees of freedom are associated with the sum of squares attributable to the differences among the means of the four blocks.

For the general case with v treatments and r replicates the breakdown of the total degrees of freedom for a randomized complete block design is

Source of variation                 df            Sum of squares                                                                          ms
Replicates                          r - 1         $\sum_j X_{\cdot j}^2/v - X_{\cdot\cdot}^2/rv$                                          R
Treatments                          v - 1         $\sum_i X_{i\cdot}^2/r - X_{\cdot\cdot}^2/rv$                                           T
Residual                            (r - 1)(v - 1)  $\sum_i\sum_j X_{ij}^2 - \sum_i X_{i\cdot}^2/r - \sum_j X_{\cdot j}^2/v + X_{\cdot\cdot}^2/rv$   E
Total (corrected for the mean)      rv - 1        $\sum_i\sum_j X_{ij}^2 - X_{\cdot\cdot}^2/rv$
Correction for the mean             1             $X_{\cdot\cdot}^2/rv$

where $X_{i\cdot}$ = ith treatment total, $X_{\cdot j}$ = jth replicate total, $X_{\cdot\cdot}$ = total of the rv experimental units, and $X_{ij}$ = yield of the experimental unit from the ith treatment in the jth replicate.

If an experiment has been conducted as a randomized complete block design, it is possible to determine the efficiency for the same experiment conducted as a completely randomized design [319]. The calculated variance for the latter design is obtained from the sum of squares for replicates plus the sum of squares obtained by multiplying the error mean square by the treatment plus error degrees of freedom and dividing by the total degrees of freedom.
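The sums of squares in the general table can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `rcbd_anova` and the 3 x 4 table of yields are hypothetical and do not come from the chapter; rows are taken to be treatments and columns replicates.

```python
def rcbd_anova(x):
    """Analysis of variance for a randomized complete block design.

    x[i][j] is the yield of treatment i in replicate (block) j, so x has
    v rows (treatments) and r columns (replicates).
    """
    v, r = len(x), len(x[0])
    grand = sum(sum(row) for row in x)
    cf = grand ** 2 / (r * v)                       # correction for the mean
    total_ss = sum(val ** 2 for row in x for val in row) - cf
    trt_ss = sum(sum(row) ** 2 for row in x) / r - cf
    rep_totals = [sum(x[i][j] for i in range(v)) for j in range(r)]
    rep_ss = sum(t ** 2 for t in rep_totals) / v - cf
    err_ss = total_ss - trt_ss - rep_ss             # residual by subtraction
    df = {"replicates": r - 1, "treatments": v - 1,
          "error": (r - 1) * (v - 1), "total": r * v - 1}
    ss = {"replicates": rep_ss, "treatments": trt_ss,
          "error": err_ss, "total": total_ss}
    ms = {k: ss[k] / df[k] for k in ("replicates", "treatments", "error")}
    return {"df": df, "ss": ss, "ms": ms, "F": ms["treatments"] / ms["error"]}


# Hypothetical yields: 3 treatments (rows) in 4 replicates (columns).
yields = [[10.0, 12.0, 11.0, 13.0],
          [14.0, 15.0, 13.0, 18.0],
          [20.0, 19.0, 22.0, 23.0]]
table = rcbd_anova(yields)
for source in ("replicates", "treatments", "error"):
    print(source, table["df"][source],
          round(table["ss"][source], 3), round(table["ms"][source], 3))
print("F for treatments:", round(table["F"], 2))
```

The residual is obtained by subtraction, as in the table, so the three component sums of squares always add to the corrected total.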
Symbolically, this is

$$E' = \frac{(\text{treatment plus error degrees of freedom})\,E + (\text{block degrees of freedom})\,R}{\text{treatment} + \text{error} + \text{block degrees of freedom}} = \frac{[(v-1) + (r-1)(v-1)]E + (r-1)R}{rv-1} = \frac{r(v-1)E + (r-1)R}{rv-1}. \tag{V-1}$$

The efficiency of the randomized complete block design relative to the completely randomized design is the ratio of the relative amounts of information on the two designs. The amount of information [126, 273] is defined as the reciprocal of the error variance. The efficiency of the randomized complete block design relative to what it would be had a completely randomized design been used is

$$\frac{(r-1)(v-1)+1}{[(r-1)(v-1)+3]\,E} \Bigg/ \frac{v(r-1)+1}{[v(r-1)+3]\,E'} = \frac{(rv-r-v+2)(rv-v+3)\,E'}{(rv-r-v+4)(rv-v+1)\,E}, \tag{V-2}$$

where the coefficients represent the corrections for the difference in the degrees of freedom associated with E' and E [126, and formula (I-1)]. The increase in efficiency due to the use of the randomized complete block design rather than the completely randomized design is obtained from equation (V-2) minus one expressed in per cent.

With reference to the suggested replicate shape and the suggested size and shape of experimental units within the replicate the reader is referred to Chapter III. The amount of replication required will depend upon the precision with which the experimenter wishes to measure the treatment means. He usually has some idea regarding the coefficient of variation in the material under observation. Also, he has some idea of the size of the difference between two treatments which is of practical significance. From this information the approximate number of replicates to use may be obtained by one of the methods presented in Chapter III.

The layout, computational procedure, and efficiency of randomized complete block designs are illustrated in the following examples.
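Formulas (V-1) and (V-2) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the figures quoted in the chapter's Example V-1 (replicate sum of squares 466.44, error mean square 118.986, r = 6 replicates, v = 5 treatments).

```python
def crd_error_mean_square(rep_ss, error_ms, r, v):
    """Formula (V-1): estimated error mean square had a completely
    randomized design been used. rep_ss is the replicate sum of
    squares, i.e. (r - 1) * R."""
    return (rep_ss + r * (v - 1) * error_ms) / (r * v - 1)


def relative_efficiency(rep_ss, error_ms, r, v):
    """Formula (V-2), including the degrees-of-freedom correction."""
    e_prime = crd_error_mean_square(rep_ss, error_ms, r, v)
    f_rcb = (r - 1) * (v - 1)        # error df, randomized complete block
    f_crd = v * (r - 1)              # error df, completely randomized
    correction = (f_rcb + 1) * (f_crd + 3) / ((f_rcb + 3) * (f_crd + 1))
    return correction * e_prime / error_ms


e_prime = crd_error_mean_square(466.44, 118.986, r=6, v=5)
efficiency = relative_efficiency(466.44, 118.986, r=6, v=5)
print(round(e_prime, 3), round(100 * efficiency, 1))
```

A value below 100 per cent means that blocking removed less variation than it cost in error degrees of freedom.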
V-1.4 ANALYSIS FOR ONE OBSERVATION PER EXPERIMENTAL UNIT

The general analysis for complete "harvesting" of the experimental unit is presented above. In this case the total for an experimental unit is used; this results in a single observation per experimental unit. The computational procedure for a randomized complete block design with one observation per experimental unit is illustrated below for a particular example.

Example V-1. Lathwell and Evans [191] present yield data from soybean plants for five treatments grown in six replicates of a randomized complete block design. The experiment was conducted in the greenhouse, using sand cultures in pots. The five treatments are LLL, LLH, LHH, HLL, and HHL, where L refers to a light application of nitrogen (20 parts per million), H refers to a heavy application of nitrogen (100 ppm), and the position of the letters refers to the time of application, i.e., first, second, and third dates. The additional treatments, LHL, HLH, and HHH, would have resulted in a factorial arrangement of treatments, but they were not included in this experiment.

The yield data and analysis of variance for the soybean experiment are presented in tables V-1 and V-2. The total sum of squares corrected for the mean is obtained as

$$8.8^2 + 12.9^2 + \cdots + 26.6^2 - \frac{924.9^2}{5(6)} = 35675.03 - 28514.67 = 7160.36.$$

The treatment sum of squares is

$$\frac{96.5^2 + 134.7^2 + 170.0^2 + 220.1^2 + 303.6^2}{6} - \frac{924.9^2}{30} = 4314.21.$$

The replicate sum of squares is

$$\frac{169.3^2 + \cdots + 136.0^2}{5} - \frac{924.9^2}{30} = 466.44.$$

The error sum of squares is obtained by subtracting the treatment and replicate sums of squares from the total sum of squares; thus, 7160.36 - 4314.21 - 466.44 = 2379.71. The error sum of squares may also be obtained as the sum of squares of the deviations of observed values, $X_{ij}$, from expected values, $\hat{X}_{ij} = \bar{x}_{i\cdot} + \bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot}$; i.e., the error sum of squares equals

$$\sum_{i=1}^{5}\sum_{j=1}^{6}(X_{ij} - \hat{X}_{ij})^2.$$

Snedecor's F is used to test the hypothesis of no difference among the treatment means; F = 1078.55/118.986 = 9.06 > $F_{.01}$(4, 20 df) = 4.43. Therefore, we would reject the hypothesis of no difference among the five treatment means. If it is not possible to partition the degrees of freedom for treatments, we should have applied one of the tests in Chapter II. For example, Duncan's multiple comparisons test indicates that mean number 5, HHL, is significantly higher than any of the other means. Also, the mean for treatment number 4, LHH, is larger than $\bar{x}_1$, treatment LLL. The means for treatments 2, 3, and 4 and for treatments 1, 2, and 3 form sets declared not to be heterogeneous. It should be noted that means $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$ yield a sum of squares and range almost large enough to exceed the required value at the 5 per cent level.

TABLE V-1. Yields of soybeans in grams per pot
[Individual pot yields are not legible in this copy. Treatment totals: treatment 1 (LLL) 96.5, treatment 2 134.7, treatment 3 170.0, treatment 4 (LHH) 220.1, treatment 5 (HHL) 303.6; grand total 924.9.]

TABLE V-2. Analysis of variance for the data of table V-1

Source of variation                   df    Sum of squares    Mean square
Replicates                             5        466.44            93.29
Treatments                             4       4314.21          1078.55
  Linear regression                    1       3492.29          3492.29
  Deviations from linear regression    3        821.92           273.97
Error                                 20       2379.71           118.986
Total (corrected for the mean)        29       7160.36
Correction for the mean                1      28514.67
Total                                 30      35675.03

It may be of interest to partition the treatment sum of squares into individual components, each based upon a single degree of freedom. The particular set of components used depends entirely upon the objectives of the experiment and upon the nature of the treatments. In this particular experiment, it may be of interest to determine the relationship between amount of nitrogen applied and the yield obtained. If L equals 20 units and H = 100 units, then treatment LLL receives 60 units, LLH and HLL each receive 140 units, and LHH and HHL each receive 220 units of nitrogen. The linear component of the treatment sum of squares is

$$\frac{[-6(96.5) - 134.7 - 170.0 + 4(220.1) + 4(303.6)]^2}{6[(-6)^2 + (-1)^2 + (-1)^2 + 4^2 + 4^2]} = \frac{(1211.1)^2}{420} = 3492.29,$$

which represents a sizeable portion of the treatment sum of squares. The coefficients $c(N_i - \bar{N})$ of the treatment totals, $X_{i\cdot}$, are the deviations of the amounts of nitrogen, $N_i$, per treatment from the mean nitrogen application, where (60 + 140 + 140 + 220 + 220)/5 = 156 = $\bar{N}$; the deviations are coded by dividing by 16. Since the sum of the cross products is squared, and since the code number is squared in the denominator, no decoding is necessary; thus:

$$\frac{\left[\sum_i c(N_i - \bar{N})X_{i\cdot}\right]^2}{r\sum_i c^2(N_i - \bar{N})^2} = \frac{\left[\sum_i (N_i - \bar{N})X_{i\cdot}\right]^2}{r\sum_i (N_i - \bar{N})^2}, \tag{V-3}$$

where c is the constant used to code each $N_i$. Furthermore, the sum of squares for the other components, quadratic, cubic, and quartic, could be computed if so desired [273]. Since only three levels of nitrogen, 60, 140, and 220, were used, it was deemed inadvisable to compute any of the curvilinear regressions for the particular range of levels explored.

The sum of squares attributable to linear regression represents the largest portion of the total treatment sum of squares (table V-2). The test of the hypothesis of zero relationship between nitrogen and yield is F = 3492.29/118.986 = 29.4 > $F_{.01}$(1, 20 df) = 8.10. The mean square for residuals, i.e., the deviations from linear regression, is not significantly larger than the error mean square, since F = 273.97/118.986 = 2.30 < $F_{.10}$(3, 20 df) = 2.38 (table II-8).

For the mean difference,

$$\frac{\sum_i k_i X_{i\cdot}}{r\sum_i k_i^2} = \frac{-6(96.5) - 134.7 - 170.0 + 4(220.1) + 4(303.6)}{420} = \frac{1211.1}{420} = 2.8836,$$

the standard error is obtained from the formula

$$\sqrt{\frac{s^2}{r\sum_i k_i^2}},$$

where the $k_i$ are the coefficients of the treatment totals, $s^2$ = error mean square, and r = number of replicates. The standard error of the mean difference 2.8836 is

$$\sqrt{\frac{118.986}{6[(-6)^2 + (-1)^2 + (-1)^2 + 4^2 + 4^2]}} = 0.53226.$$

Therefore, t = 2.8836/0.53226 = 5.418, and $t^2$ = F = 29.4. The number of significant figures carried is larger than warranted by the accuracy of the data but is required to obtain agreement between $t^2$ and F.
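The single-degree-of-freedom linear component and its t test can be retraced numerically. The sketch below uses the treatment totals, nitrogen amounts, replicate number, and error mean square quoted in Example V-1; the variable names are illustrative, and the totals are assumed to be listed in the same order as the nitrogen amounts 60, 140, 140, 220, 220.

```python
import math

totals = [96.5, 134.7, 170.0, 220.1, 303.6]   # treatment totals X_i.
nitrogen = [60, 140, 140, 220, 220]           # units of nitrogen per treatment
r = 6                                         # replicates
error_ms = 118.986                            # error mean square, 20 df

mean_n = sum(nitrogen) / len(nitrogen)        # 156
k = [(n - mean_n) / 16 for n in nitrogen]     # coded deviations: -6, -1, -1, 4, 4

numerator = sum(ki * ti for ki, ti in zip(k, totals))
denominator = r * sum(ki ** 2 for ki in k)    # 6 * 70 = 420

linear_ss = numerator ** 2 / denominator      # linear regression sum of squares
slope = numerator / denominator               # the "mean difference" in coded units
std_error = math.sqrt(error_ms / denominator)
t = slope / std_error
f_ratio = linear_ss / error_ms

print(round(linear_ss, 2), round(slope, 4), round(std_error, 5))
print(round(t, 3), round(t ** 2, 2), round(f_ratio, 2))
```

Because both the slope and its standard error share the divisor r Σk², the identity t² = F holds exactly, whatever coding constant is chosen.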
The standard error of the mean difference between two treatments is $s_{\bar{d}} = \sqrt{2(118.986)/6} = 6.30$. The lsd = $s_{\bar{d}}\,t_{.05}$(20 df) = 13.14. The coefficient of variation is $s/\bar{x} = 30\sqrt{118.986}/924.9 = 10.91/30.83 = 35$ per cent, which appears to be rather large. Since the replicate mean square is less than the error mean square, the present design is less efficient than a completely randomized design would have been. If it is desired to compute the efficiency of the randomized complete block design relative to the completely randomized design, formula (V-2) is used. For the data of table V-2,

$$E' = \frac{466.44 + 24(118.986)}{29} = 114.555.$$

Therefore, the relative efficiency is estimated to be

$$\frac{(30 - 6 - 5 + 2)(30 - 5 + 3)(114.555)}{(30 - 6 - 5 + 4)(30 - 5 + 1)(118.986)} \times 100 = 95 \text{ per cent}.$$

V-1.5 ANALYSIS FOR MORE THAN ONE OBSERVATION PER EXPERIMENTAL UNIT

For k observations, readings, or determinations per experimental unit the following analysis of variance table is appropriate:

Source of variation      df                Sum of squares
Replicates               r - 1             $\sum_j X_{\cdot j\cdot}^2/vk - X_{\cdot\cdot\cdot}^2/rvk$
Treatments               v - 1             $\sum_i X_{i\cdot\cdot}^2/rk - X_{\cdot\cdot\cdot}^2/rvk$
Experimental error       (r - 1)(v - 1)    $\sum_i\sum_j X_{ij\cdot}^2/k - \sum_i X_{i\cdot\cdot}^2/rk - \sum_j X_{\cdot j\cdot}^2/vk + X_{\cdot\cdot\cdot}^2/rvk$
Sampling error           rv(k - 1)         $\sum_i\sum_j\sum_h X_{ijh}^2 - \sum_i\sum_j X_{ij\cdot}^2/k$
Total                    rvk - 1           $\sum_i\sum_j\sum_h X_{ijh}^2 - X_{\cdot\cdot\cdot}^2/rvk$

where the $X_{i\cdot\cdot}$, $X_{\cdot j\cdot}$, $X_{\cdot\cdot\cdot}$, and $X_{ij\cdot}$ equal totals for the treatments, replicates, the grand total, and the experimental unit, respectively. The detailed computations are presented in example V-2.

Example V-2. The data presented in table V-3 represent the grams of rubber obtained from two randomly selected plants in a plot for each of the seven varieties of guayule planted in the five replicates. The allocation of the varieties to the seven plots in each replicate was random. The plot size was twenty-eight plants long by twelve rows wide, with 20 inches between plants within a row and 24 inches between rows, resulting in a plot of (1/12)[(28 × 20) × (12 × 24)] = 46⅔′ × 24′. The replicate size was 7 × 24′ by 46⅔′ = 168′ × 46⅔′. The shape of the replicate might not have been the most desirable.
Plots six rows wide by fifty-six plants long might have resulted in a better replicate shape in this experiment.

The sums, means, and sums of squares for the data in table V-3 are presented in table V-4. The results are summarized in table V-5. The mean squares are obtained by dividing the sums of squares by the appropriate degrees of freedom.

TABLE V-3. Field arrangement of seven varieties of guayule in five randomized complete blocks and weight (grams) of rubber for two randomly selected plants per plot
[Table body not legible in this copy.]

TABLE V-4. Totals of plot yields and sums of squares
[Plot, replicate, and variety totals not fully legible in this copy; grand total 371.52.]

Total sum of squares with 69 df:
$$2.06^2 + 6.12^2 + 2.53^2 + \cdots + 5.82^2 + 5.08^2 - \frac{371.52^2}{70} = 2287.489 - 1971.816 = 315.673.$$

Sum of squares for replicates with 4 df:
$$\frac{\sum_j X_{\cdot j\cdot}^2}{14} - \frac{371.52^2}{70} = 1993.812 - 1971.816 = 21.995.$$

Sum of squares for varieties with 6 df:
$$\frac{51.12^2 + \cdots}{10} - \frac{371.52^2}{70} = 2042.747 - 1971.816 = 70.931.$$

Sum of squares of plot totals with 34 df:
$$\frac{7.85^2 + 11.80^2 + \cdots + 10.90^2}{2} - \frac{371.52^2}{70} = 2148.102 - 1971.816 = 176.286.$$

Sum of squares for interaction of replicates and varieties (by subtraction): 176.286 - 70.931 - 21.995 = 83.360 with 24 df.

Within plot sum of squares:
$$\frac{(6.12 - 2.06)^2}{2} + \cdots + \frac{(5.82 - 5.08)^2}{2} = 6.12^2 + 2.06^2 - \frac{(6.12 + 2.06)^2}{2} + \cdots + 5.82^2 + 5.08^2 - \frac{(10.90)^2}{2} = 2287.489 - 2148.102 = 139.387 \text{ with 35 df}.$$

TABLE V-5. Analysis of variance for the data of table V-3

Source of variation                      df    Sum of squares    Mean square
Replicates                                4        21.995
Varieties                                 6        70.931          11.8218
  109 vs. others                          1         0.4456
  130, 406 vs. 593                        1         2.2349
  130 vs. 406                             1         2.7677
  130, 406, 593 vs. 405, 407, 416         1        34.2015
  Among 405, 407, 416                     2        31.2813
Experimental error                       24        83.360           3.4733
Sampling error (between plants)          35       139.387
Total                                    69       315.673

In table V-5, two errors are listed, experimental and sampling. The experimenter may often be in a quandary as to which one to use [126, sec. 65; 47, pp. 28-35]. The answer depends upon the hypothesis to be tested and upon the assumptions made about the data. If the worker wishes to confine his remarks to the particular five replicates used above, the sampling error is used for testing the variation among variety means. If, on the other hand, the experimenter wishes to make an inference about the true differences among the seven varieties from the random sample of five replicates, the experimental error is used. The last cited instance is the one of practical importance in most cases.

The sampling error is larger than the experimental error, but not significantly so. If the variation of plot means from plot to plot after removing replicate and variety effect is zero in the population, it would be expected that the experimental error would be smaller in about 50 per cent and larger in 50 per cent of the samples. If the latter error is significantly smaller than the sampling error, it would be concluded that a significant negative intraclass correlation [127, 273] exists. The explanation would depend upon the particular type of biological material involved.
Even though the experimental error is the smaller of the two variances, it is the best estimate of the error term for testing the significance of the difference among treatment means. The experimenter may wish to be more "conservative" and to use the sampling error and the degrees of freedom associated with the experimental error. Other schemes could be followed, but the most logical one is to use the experimental error as the estimate of error variation in making various tests of hypotheses, since this is not a result-guided procedure.

The F test of the differences among the seven treatment means is

$$F = \frac{11.8218}{3.4733} = 3.40.$$

For 6 and 24 degrees of freedom the F values at the 2.5 and 1 per cent points are 2.99 and 3.67, respectively. The question of importance would be to determine which, if any, of the seven varieties are significantly different with respect to yield of rubber at the end of one growing season. To make these comparisons, several tests are suggested in Chapter II. Tukey's test (sec. II-1.3) indicates that variety 416 is significantly lower than the others in yield of rubber and that the variation in yield among the remaining varieties might logically be ascribed to chance.

The methods in Chapter II are applicable to a group of unrelated varieties or treatments. In this case, considerable information concerning the relationships of the seven varieties was available from past experiments. Variety 109 is the only 54+ chromosome variety in the group; the remaining are in the 72+ category. A logical comparison is the mean of the 72's versus the mean of the 54-chromosome variety:

$$\frac{[6(51.12) - 59.16 - 50.38 - \cdots - 57.09]^2}{10(36 + 1 + 1 + 1 + 1 + 1 + 1)} = \frac{51.12^2}{10} + \frac{(59.16 + \cdots + 57.09)^2}{60} - \frac{371.52^2}{70} = 0.4456.$$

Also, it is known that varieties 406 and 130 are selections from 593. The 2 degrees of freedom among these three means could logically be partitioned into two single degrees of freedom representing the comparison of the two selections with the parent variety and the comparison between the selections:

Among 130, 406, and 593:
$$\frac{59.16^2 + 66.60^2 + 57.09^2}{10} - \frac{182.85^2}{30} = 5.0026.$$

130 + 406 vs. 593:
$$\frac{[59.16 + 66.60 - 2(57.09)]^2}{10(1 + 1 + 4)} = \frac{(59.16 + 66.60)^2}{20} + \frac{57.09^2}{10} - \frac{182.85^2}{30} = 2.2349.$$

130 vs. 406:
$$\frac{(59.16 - 66.60)^2}{10(1 + 1)} = 2.7677.$$

Furthermore, varieties 130, 406, and 593 are phenotypically different from the remaining three varieties, 405, 407, and 416. The former have round greenish leaves and short branching habit, while the latter group have long serrated grayish-green leaves and longer branches. A logical comparison would be between the means of the two groups,

$$\frac{(59.16 + 66.60 + 57.09 - 50.38 - 55.46 - 31.71)^2}{10(1 + 1 + 1 + 1 + 1 + 1)} = 34.2015.$$

The remaining 2 degrees of freedom make up the comparisons among the three varieties 405, 407, and 416, with the following sum of squares:

$$\frac{50.38^2 + 55.46^2 + 31.71^2}{10} - \frac{137.55^2}{30} = 31.2813.$$

It was not known what relationship existed among the three varieties, and without this information the partitioning of the variety sum of squares is finished. Tukey's test (section II-1.3) may be applied to these three means, resulting in two subgroups, 405 and 407 in one group and 416 in the other.

The sums of squares are summarized in table V-5, and as a partial check they should add up to the total 70.931. For the comparison between the two groups of three varieties, F = 34.2015/3.4733 = 9.85 exceeds the tabulated F at the one per cent point, and for the comparison among varieties 405, 407, and 416, F = 4.50 exceeds the F value at the 5 per cent point. The means of the three varieties, 130, 406, and 593, and of the three varieties, 405, 407, and 416, cannot be considered as coming from the same general population. Upon examination of the latter three varieties, it is found that they do not represent a homogeneous group and that the very low yield of variety 416 accounts for the large F values in both instances.
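The partition of the variety sum of squares can be checked with a short sketch. The variety totals are those appearing in the computations of Example V-2 (each a total of 10 plants); the helper name `contrast_ss` is hypothetical, and the final split of the 2 degrees of freedom among varieties 405, 407, and 416 into two single contrasts is an added illustration, not a comparison proposed in the text.

```python
# Variety totals (10 plants each), in the order 109, 130, 406, 593, 405, 407, 416.
totals = [51.12, 59.16, 66.60, 57.09, 50.38, 55.46, 31.71]
n = 10                       # observations per variety total


def contrast_ss(coeffs, totals, n):
    """Sum of squares for one single-degree-of-freedom contrast of totals."""
    value = sum(c * t for c, t in zip(coeffs, totals))
    return value ** 2 / (n * sum(c ** 2 for c in coeffs))


contrasts = {
    "109 vs others":                [6, -1, -1, -1, -1, -1, -1],
    "130 + 406 vs 593":             [0, 1, 1, -2, 0, 0, 0],
    "130 vs 406":                   [0, 1, -1, 0, 0, 0, 0],
    "130,406,593 vs 405,407,416":   [0, 1, 1, 1, -1, -1, -1],
    # Illustrative split of the remaining 2 df (not made in the text):
    "405 + 407 vs 416":             [0, 0, 0, 0, 1, 1, -2],
    "405 vs 407":                   [0, 0, 0, 0, 1, -1, 0],
}
components = {name: contrast_ss(c, totals, n) for name, c in contrasts.items()}

grand = sum(totals)
variety_ss = sum(t ** 2 for t in totals) / n - grand ** 2 / (n * len(totals))

for name, ss in components.items():
    print(f"{name:30s} {ss:8.4f}")
print("sum of components:", round(sum(components.values()), 3))
print("variety sum of squares:", round(variety_ss, 3))
```

Because the six contrasts are mutually orthogonal and exhaust the 6 degrees of freedom, their sums of squares add to the variety sum of squares, which is the partial check mentioned in the text.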
The amount of variability relative to the mean in this experiment was much higher than desired. The coefficient of variation is

$$\frac{\sqrt{3.4733/2}}{5.307} = \frac{1.318}{5.307} = 25 \text{ per cent}.$$

The standard deviation per plant mean yield is $\sqrt{3.4733/2}$, which equals the standard deviation resulting from an analysis of the plot means. The verification that division of the error mean square by two (equals number of items from each plot) results in the same value as that obtained from using the plot means in the analysis is left as an exercise for the student.

The efficiency of this design relative to a completely randomized design is estimated to be

$$\frac{21.995 + 3.4733(6 + 24)}{4 + 6 + 24} \Big/ 3.4733 = \frac{3.7116}{3.4733} = 107 \text{ per cent},$$

or a gain in efficiency of 7 per cent.

V-1.6 UNEQUAL NUMBERS OF OBSERVATIONS PER EXPERIMENTAL UNIT AND UNEQUAL REPLICATION PER TREATMENT

Missing experimental units or unequal numbers of units within the experimental unit often create analytical difficulties. Regardless of precautions taken, disproportionate results are occasionally inevitable. The experimenter may start with an equal number of replicates or with experimental units of equal size, but some of the animals may become sick and die; the technician may accidentally omit or mix up some of the results; a part or all of a field plot may not germinate or may be cultivated out; or any of several other things may happen to a part or all of the experimental unit. When faced with such results, the experimenter would still like to obtain all the information possible from the remaining observations. In response to this need, statisticians have developed several analytical procedures for handling disproportionate results from experiments [1, 17, 18, 33, 111, 139, 164, 189, 240, 276, 316, 318a].
Before discussing the various methods and situations, it should be pointed out that the method of calculation does not contribute any more information than is present in the data themselves. In other words, when a "missing plot" value is computed, no new datum is added. The procedure is merely a calculational dodge for circumventing a more complex procedure.

Also, some computational procedures are exact, while others are approximate. The investigator must know the assumptions underlying a particular procedure [54, 100, 175, 276, 290] before he is assured of the correctness of the procedure.

V-1.6.1 Missing experimental units. Allan and Wishart [1] were the first to present a formula for computing the value for one missing or extremely divergent value for a randomized complete block experiment. Yates [316] showed that their [1] formula resulted in minimizing the error sum of squares. He presented an iterative procedure for calculating the values for several missing experimental units. The validity of the analysis of variance procedure was investigated, and it was shown that there is little disturbance provided that the proportion of missing values is not large and that the number of degrees of freedom for the ordinary randomized block experiment is reduced by the number of missing plot values computed. The expectation of the treatment mean square is too large, in that the coefficient of the error variance component is larger than one [111, 316]. This means that the F ratio is too large relative to the correct ratio and too many significant results are obtained. It has been shown that the correct mean squares may be obtained without too much difficulty [111, 316].*

The value of a missing experimental unit from the ith treatment in the jth replicate is computed from the formula

$$\hat{X}_{ij} = \frac{vX_{i\cdot} + rX_{\cdot j} - X_{\cdot\cdot}}{(v-1)(r-1)}, \tag{V-5}$$

where $X_{i\cdot}$ equals the total of treatment i in the (r - 1) replicates in which it is present, $X_{\cdot j}$ equals the total of the (v - 1) treatments in the jth replicate, and $X_{\cdot\cdot}$ equals the total of the rv - 1 observations. The application of the above formula is illustrated in several places in the literature [e.g., 1, 60, 273, 316].

*The treatment mean with the missing plot is equal to $\bar{x}_i = (X_{i\cdot} + \hat{X}_{ij})/r$.

Formulae have been developed for various numbers and arrangements of missing plot values [14, 17, 18, 111, 189] and are useful in special cases. Use of these special formulae often results in a considerable saving of computational labor, but if only one procedure is to be followed, the iterative procedure proposed by Yates [316] is recommended. If more than one value is missing, we guess-estimate the values for all but one of the missing units; this one is computed from formula (V-5). Then, we use formula (V-5) to compute the value for one of the other missing values. This procedure is continued until the computed values become stabilized for each of the missing units. Usually, three cycles will suffice, but the number of cycles depends upon the closeness of the guess-estimates to the computed values. The procedure is illustrated by Yates [316] and Snedecor [273].

V-1.6.2 Disproportionate numbers per experimental unit. Snedecor and Cox [276] list the references on analyses for disproportionate numbers per experimental unit. The computational analysis is illustrated by them [276] and by Snedecor [273]. The particular type of analysis used will depend upon the assumptions about the data and the amount of disproportionateness. If the interaction or the experimental error variance component is negligible or nonexistent, the "method of fitting constants" analysis [273, 276, 318] is appropriate.
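Formula (V-5) and the iterative procedure for several missing units can be sketched as follows. This is a minimal illustration, not the published computing scheme: the function names, the use of the mean of the observed plots as the initial guess-estimate, the stopping tolerance, and the small 3 x 3 table are all assumptions made for the example.

```python
def missing_value(x, i, j):
    """Formula (V-5) for cell (i, j); every other cell of x must hold a number.

    x[i][j] itself is ignored. Rows are treatments, columns are replicates.
    """
    v, r = len(x), len(x[0])
    trt_total = sum(x[i][b] for b in range(r) if b != j)
    rep_total = sum(x[a][j] for a in range(v) if a != i)
    grand = sum(x[a][b] for a in range(v) for b in range(r) if (a, b) != (i, j))
    return (v * trt_total + r * rep_total - grand) / ((v - 1) * (r - 1))


def fill_missing(x, max_cycles=50, tol=1e-9):
    """Iterate formula (V-5) over all cells marked None until values stabilize."""
    v, r = len(x), len(x[0])
    missing = [(a, b) for a in range(v) for b in range(r) if x[a][b] is None]
    present = [val for row in x for val in row if val is not None]
    guess = sum(present) / len(present)          # assumed starting guess
    filled = [[guess if val is None else float(val) for val in row] for row in x]
    for _ in range(max_cycles):
        largest_change = 0.0
        for a, b in missing:
            new = missing_value(filled, a, b)
            largest_change = max(largest_change, abs(new - filled[a][b]))
            filled[a][b] = new
        if largest_change < tol:
            break
    return filled


# Hypothetical, perfectly additive table (treatment effect + replicate effect),
# with two plots removed; the true values are 11 and 22.
plots = [[None, 21, 31],
         [12, None, 32],
         [13, 23, 33]]
completed = fill_missing(plots)
print(round(completed[0][0], 4), round(completed[1][1], 4))
```

With additive data the error sum of squares can be made zero, so the procedure recovers the removed values; with real data it returns the values that minimize the error sum of squares, and the error degrees of freedom must still be reduced by the number of values computed.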
If the experimental error or the interaction mean square is considered to be different from the mean square for individuals within the experimental unit and if there are no missing subclasses, the "weighted squares of means" analysis is appropriate [273, 276, 318a]. The expectations of the mean squares have been obtained by Federer [111] and Henderson [155].

Two approximate procedures are available [79, 276]. In the first procedure the means per individual or unit are obtained for each experimental unit, and the analysis of variance is performed on the means. The second procedure involves obtaining expected subclass numbers and completing the analysis utilizing the expected values. Crump [79] has obtained the expectations of the mean squares for these two approximate methods.

Another approximate procedure is to observe the experimental unit with the smallest number of units, say $k_1$. Then discard at random units from all other experimental units until $k_1$ units remain. Equal numbers are obtained, and the analysis proceeds as illustrated in section V-1.5. This procedure is not efficient because all material has not been utilized in the analysis, but it may be used for preliminary analyses or for cases in which the numbers of individuals per experimental unit are nearly equal or for cases in which the numbers per experimental unit are large.

V-1.6.3 Other situations. It may happen that the yields of two or more experimental units are available in toto but not individually. The experimenter may unwittingly bulk the yields from two or more treatments before weighing, but is able to obtain the total weight of the bulked items. Bose and Mahalanobis [33] provide formulae for computing the experimental unit yields making use of the combined weight of the mixed-up experimental units. An application is made in their paper.
In other situations the experimenter may be short of material for one or more treatments but has an excess of material for other treatments. The experiment may be designed with "missing plots" for some of the treatments [164], and the "missing plots" filled in with the excess material of the other treatments. The missing experimental units are designed into the experiment at random. One analytical procedure is to run an analysis on only the replicates for which all treatments are present, and a second analysis on only the treatments which are present in all replicates. The procedure does not make use of the substituted plots. Pearce [240] presents a procedure for utilizing all experimental units and illustrates the method with an example.

In the same paper, Pearce [240] presents a procedure for analyzing an experiment in which one treatment appears twice in one replicate but does not appear in the second replicate, with the reverse situation being true for a second treatment. An interchanging of the treatments in this manner occasionally occurs in laboratory work. The analytical procedure is not difficult and is recommended for experiments in which the treatments have been interchanged.

In certain experiments a treatment is applied in sequence to an experimental unit. The experimenter may inadvertently apply the wrong treatment at some stage in the sequence. Grundy [139] developed the statistical analysis appropriate for analyzing data from an experiment of this sort.

V-2 Least Squares Estimates and Expectation of Mean Squares

V-2.1 ONE UNIT PER EXPERIMENTAL UNIT

If a single observation is made on each experimental unit of a randomized complete block design and if the effects are additive, the linear model,

$$X_{ij} = \mu + \tau_i + \rho_j + \epsilon_{ij}, \tag{V-6}$$

may be assumed to represent the yield of any plot, where $\mu$ represents the population mean value, $\tau_i$ = effect common to the ith treatment, $\rho_j$ = effect common to the jth replicate, and $\epsilon_{ij}$ is a random error component; i = 1, 2, ..., v; j = 1, 2, ..., n.
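The above linear equation may be put in the form of a multiple regression equation if written in the form,

\[ X_{ij} = \mu + \tau_i X'_{\tau i} + \rho_j X'_{\rho j} + \epsilon_{ij} \tag{V-7} \]

where $X'_{\tau i}$ and $X'_{\rho j}$ have the values of zero or 1. $X'_{\tau i}$ takes on the value 1 in all cells of the two-way classification where $\tau_i$ is present and zero elsewhere; likewise, $X'_{\rho j}$ has the value 1 in replicate $j$ and zero elsewhere.

The least squares estimates of $\mu, \tau_1, \tau_2, \cdots, \tau_v, \rho_1, \rho_2, \cdots, \rho_n$ ($n$ = number of replicates) are obtained as before; i.e., by differentiating the residual sum of squares with respect to the variables and equating the results to zero. The sum of squares to be minimized is

\[ R = \sum_i \sum_j (X_{ij} - \hat\mu - t_i - r_j)^2. \tag{V-8} \]

Partial differentiation of equation (V-8) with respect to $\hat\mu$, $t_i$, and $r_j$ results in the following system of equations:

\[ \frac{\partial R}{\partial \hat\mu} = -2\sum_i \sum_j (X_{ij} - \hat\mu - t_i - r_j) = 0; \tag{V-9} \]

\[ \frac{\partial R}{\partial t_i} = -2\sum_j (X_{ij} - \hat\mu - t_i - r_j) = 0; \tag{V-10} \]

\[ \frac{\partial R}{\partial r_j} = -2\sum_i (X_{ij} - \hat\mu - t_i - r_j) = 0. \tag{V-11} \]

$\hat\mu$, $t_i$, and $r_j$, the least squares solutions of the equations, make the residual sum of squares a minimum. The set of differential equations leads to the following normal equations:

Equation for $\hat\mu$:

\[ \sum_i \sum_j X_{ij} = X_{\cdot\cdot} = n\sum_i t_i + v\sum_j r_j + nv\hat\mu. \tag{V-12} \]

Equations for treatment effects $t_1, \cdots, t_v$:

\[ \sum_j X_{1j} = X_{1\cdot} = n t_1 + \sum_j r_j + n\hat\mu, \tag{V-13} \]

\[ \sum_j X_{2j} = X_{2\cdot} = n t_2 + \sum_j r_j + n\hat\mu, \tag{V-14} \]

\[ \cdots \]

\[ \sum_j X_{vj} = X_{v\cdot} = n t_v + \sum_j r_j + n\hat\mu. \tag{V-15} \]

Equations for replicate effects $r_1, r_2, \cdots, r_n$:

\[ X_{\cdot 1} = \sum_i t_i + v r_1 + v\hat\mu, \tag{V-16} \]

\[ X_{\cdot 2} = \sum_i t_i + v r_2 + v\hat\mu, \tag{V-17} \]

\[ \cdots \]

\[ X_{\cdot n} = \sum_i t_i + v r_n + v\hat\mu. \tag{V-18} \]

In order to obtain unique solutions for the $v + n + 1$ partial regression coefficients, the following restrictions are imposed:

\[ \sum_i t_i = 0 = \sum_j r_j. \tag{V-19} \]

Since $\sum\tau_i = 0$ and $\sum\rho_j = 0$ in the population, this is a reasonable restriction.

Now, the least squares estimate of the experimental mean is

\[ \hat\mu = \frac{1}{nv}\Big(X_{\cdot\cdot} - n\sum_i t_i - v\sum_j r_j\Big) = \frac{X_{\cdot\cdot}}{nv} = \bar x, \tag{V-20} \]

of $\tau_i$ is

\[ t_i = \frac{X_{i\cdot}}{n} - \hat\mu = \bar x_{i\cdot} - \bar x, \tag{V-21} \]

and of $\rho_j$ is

\[ r_j = \frac{X_{\cdot j}}{v} - \hat\mu = \bar x_{\cdot j} - \bar x. \tag{V-22} \]

The variance of $\hat\mu$ is

\[ E[\hat\mu - \mu]^2 = E[\bar x - \mu]^2 = E\left[\frac{\epsilon_{11} + \epsilon_{12} + \cdots + \epsilon_{vn}}{nv}\right]^2 = \frac{\sigma_\epsilon^2}{nv} \tag{V-23} \]

if it is assumed that $\sum\tau_i = \sum\rho_j = 0$. The variance of any $t_i$ is

\[ E[t_i - \tau_i]^2 = E[\bar x_{i\cdot} - \bar x - \tau_i]^2 = E\left[\frac{\sum_j \epsilon_{ij}}{n} - \frac{\sum_i\sum_j \epsilon_{ij}}{nv}\right]^2 = \frac{\sigma_\epsilon^2}{n} + \frac{\sigma_\epsilon^2}{nv} - \frac{2\sigma_\epsilon^2}{nv} = \frac{(v-1)\sigma_\epsilon^2}{nv}. \tag{V-24} \]

In a like manner the variance of any $r_j$ is $(n-1)\sigma_\epsilon^2/nv$, the covariance of any $t_i t_{i'}$ ($i \neq i'$) or of any $r_j r_{j'}$ ($j \neq j'$) is $-\sigma_\epsilon^2/nv$, and the covariance of $t_i r_j$, $t_i\hat\mu$, or of $r_j\hat\mu$ is equal to zero.

Instead of the variances or covariances of the least squares estimates the expectation, or average value, of a sum of squares after fitting certain constants may be desired. For this case, it is assumed that the $\tau_i$, $\rho_j$, and $\epsilon_{ij}$ are random variates from populations with mean zero and variances equal to $E(\tau^2) = \sigma_\tau^2$, $E(\rho^2) = \sigma_\rho^2$, and $E(\epsilon^2) = \sigma_\epsilon^2$, respectively [79, 111, 155, 169, 175, 290]. In obtaining the expectation or average value of various mean squares, the restriction that $\sum\tau_i = \sum\rho_j = 0$ is not imposed, but the $\sum t_i = 0 = \sum r_j$ still holds.

The expectation of the total sum of squares after fitting the estimated constants $\hat\mu$, $t_i$, and $r_j$ is

\[
\begin{aligned}
E\Big[\sum_i\sum_j X_{ij}^2 - \hat\mu X_{\cdot\cdot} - \sum_i t_i X_{i\cdot} - \sum_j r_j X_{\cdot j}\Big]
&= E\Big[\sum_i\sum_j X_{ij}^2 - \frac{\sum_i X_{i\cdot}^2}{n} - \frac{\sum_j X_{\cdot j}^2}{v} + \frac{X_{\cdot\cdot}^2}{nv}\Big] \\
&= nv\mu^2 + nv\sigma_\tau^2 + nv\sigma_\rho^2 + nv\sigma_\epsilon^2 \\
&\quad - nv\mu^2 - nv\sigma_\tau^2 - v\sigma_\rho^2 - v\sigma_\epsilon^2 \\
&\quad - nv\mu^2 - n\sigma_\tau^2 - nv\sigma_\rho^2 - n\sigma_\epsilon^2 \\
&\quad + nv\mu^2 + n\sigma_\tau^2 + v\sigma_\rho^2 + \sigma_\epsilon^2 \\
&= (nv - n - v + 1)\sigma_\epsilon^2 = (n-1)(v-1)\sigma_\epsilon^2.
\end{aligned}
\tag{V-25}
\]

The expectation of the sum of squares due to fitting the $t_i$ only is obtained as the difference of the expectations for the residual sum of squares after fitting the constants $\hat\mu'$ and $r_j'$ and for the residual sum of squares after fitting $\hat\mu$, $t_i$, and $r_j$; thus:¹

\[
\begin{aligned}
E\Big[\Big(\sum_i\sum_j X_{ij}^2 - \hat\mu' X_{\cdot\cdot} - \sum_j r_j' X_{\cdot j}\Big) &- \Big(\sum_i\sum_j X_{ij}^2 - \hat\mu X_{\cdot\cdot} - \sum_i t_i X_{i\cdot} - \sum_j r_j X_{\cdot j}\Big)\Big] \\
&= E\Big[\sum_i t_i X_{i\cdot}\Big] = E\Big[\frac{\sum_i X_{i\cdot}^2}{n} - \frac{X_{\cdot\cdot}^2}{nv}\Big] \\
&= nv\mu^2 + nv\sigma_\tau^2 + v\sigma_\rho^2 + v\sigma_\epsilon^2 - nv\mu^2 - n\sigma_\tau^2 - v\sigma_\rho^2 - \sigma_\epsilon^2 \\
&= (v-1)(\sigma_\epsilon^2 + n\sigma_\tau^2),
\end{aligned}
\tag{V-26}
\]

which is the expectation of the treatment sum of squares. Likewise, the expectation of the replicate sum of squares is

\[ E\Big[\frac{\sum_j X_{\cdot j}^2}{v} - \frac{X_{\cdot\cdot}^2}{nv}\Big] = (n-1)(\sigma_\epsilon^2 + v\sigma_\rho^2) \tag{V-27} \]

and of the total sum of squares is

\[ E\Big[\sum_i\sum_j X_{ij}^2 - \frac{X_{\cdot\cdot}^2}{nv}\Big] = n(v-1)\sigma_\tau^2 + v(n-1)\sigma_\rho^2 + (nv-1)\sigma_\epsilon^2. \tag{V-28} \]

It is not necessary to assume that the $\tau_i$ and (or) the $\rho_j$ are random variates.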
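If the $\tau_i$ are assumed to be fixed effects, then $E\tau_i^2 = \tau_i^2 \neq \sigma_\tau^2$, and $\sum\tau_i = 0$. The expectation of the treatment sum of squares is $(v-1)\sigma_\epsilon^2 + n\sum\tau_i^2$. The latter may be a more appropriate form of the expectation for the treatment sum of squares in many experiments than is equation (V-26). Likewise, the expectation of the replicate mean square for fixed $\rho_j$ is obtained as $\sigma_\epsilon^2 + v\sum\rho_j^2/(n-1)$.

¹See footnote on page 103.

V-2.2 k UNITS PER EXPERIMENTAL UNIT

If $k$ observations are made on each cell of a two-way classification, the linear model,

\[ X_{ijh} = \mu + \tau_i + \rho_j + \tau\rho_{ij} + \epsilon_{ijh} \tag{V-29} \]

may be assumed to represent the yield of any observation, where $\mu$ represents the population mean value, $\tau_i$ = effect common to the $i$th treatment, $\rho_j$ = effect common to the $j$th replicate, $\tau\rho_{ij}$ = an effect common to the $i$th treatment in the $j$th replicate, $\epsilon_{ijh}$ = a random error associated with the $ijh$th observation, $i = 1, 2, \cdots, v$; $j = 1, 2, \cdots, n$; and $h = 1, 2, \cdots, k$.

The least squares solutions of $\mu$, $\tau_i$, $\rho_j$, and $\tau\rho_{ij}$ are obtained as before; thus:

\[ \frac{\partial R}{\partial \hat\mu} = -2\sum_i\sum_j\sum_h (X_{ijh} - \hat\mu - t_i - r_j - rt_{ij}) = 0, \tag{V-30} \]

\[ \frac{\partial R}{\partial t_i} = -2\sum_j\sum_h (X_{ijh} - \hat\mu - t_i - r_j - rt_{ij}) = 0, \tag{V-31} \]

\[ \frac{\partial R}{\partial r_j} = -2\sum_i\sum_h (X_{ijh} - \hat\mu - t_i - r_j - rt_{ij}) = 0, \tag{V-32} \]

and

\[ \frac{\partial R}{\partial rt_{ij}} = -2\sum_h (X_{ijh} - \hat\mu - t_i - r_j - rt_{ij}) = 0, \tag{V-33} \]

where

\[ R = \sum_i\sum_j\sum_h (X_{ijh} - \hat\mu - t_i - r_j - rt_{ij})^2. \tag{V-34} \]

$\hat\mu$, $t_i$, $r_j$, and $rt_{ij}$ are the estimates which make equation (V-34) a minimum. The above set of differential equations leads to the following set of normal equations:

Equation for $\hat\mu$:

\[ X_{\cdot\cdot\cdot} = nk\sum_i t_i + vk\sum_j r_j + k\sum_i\sum_j rt_{ij} + nvk\hat\mu. \tag{V-35} \]

Equations for $t_i$:

\[ X_{i\cdot\cdot} = nk\,t_i + k\sum_j (r_j + rt_{ij}) + nk\hat\mu. \tag{V-36} \]

Equations for $r_j$:

\[ X_{\cdot j\cdot} = k\sum_i (t_i + rt_{ij}) + vk\,r_j + vk\hat\mu. \tag{V-37} \]

Equations for $rt_{ij}$:

\[ X_{ij\cdot} = k(t_i + r_j + rt_{ij} + \hat\mu). \tag{V-38} \]

In order to obtain unique solutions for the $n + v + nv + 1$ partial regression coefficients, the following restrictions are imposed:

\[ \sum_i t_i = 0, \tag{V-39} \]

\[ \sum_j r_j = 0, \tag{V-40} \]

and

\[ \sum_i rt_{ij} = 0 = \sum_j rt_{ij}. \tag{V-41} \]

With these restrictions the least squares estimates are

\[ \hat\mu = \frac{X_{\cdot\cdot\cdot}}{nvk} = \bar x, \tag{V-42} \]

\[ t_i = \frac{X_{i\cdot\cdot}}{nk} - \hat\mu = \bar x_{i\cdot\cdot} - \bar x, \tag{V-43} \]

\[ r_j = \frac{X_{\cdot j\cdot}}{vk} - \hat\mu = \bar x_{\cdot j\cdot} - \bar x, \tag{V-44} \]

and

\[ rt_{ij} = \frac{X_{ij\cdot}}{k} - \hat\mu - t_i - r_j = \bar x_{ij\cdot} - \bar x_{i\cdot\cdot} - \bar x_{\cdot j\cdot} + \bar x. \tag{V-45} \]

The variances and covariances of the above least squares estimates may be obtained as before and are left as an exercise for the reader.

If it is assumed that the $\tau_i$, $\rho_j$, $\tau\rho_{ij}$, and $\epsilon_{ijh}$ are random independent variates with zero means and variances $\sigma_\tau^2$, $\sigma_\rho^2$, $\sigma_{\tau\rho}^2$, and $\sigma_\epsilon^2$, respectively, we may proceed to obtain the expectations of the various mean squares obtained in section V-1.5. The residual (sampling error) sum of squares obtained after fitting the constants $\hat\mu$, $t_i$, $r_j$, and $rt_{ij}$ has the expectation:

\[
\begin{aligned}
E\Big[\sum_i\sum_j\sum_h X_{ijh}^2 - \hat\mu X_{\cdot\cdot\cdot} - \sum_i t_i X_{i\cdot\cdot} - \sum_j r_j X_{\cdot j\cdot} - \sum_i\sum_j rt_{ij} X_{ij\cdot}\Big]
&= E\Big[\sum_i\sum_j\Big(\sum_h X_{ijh}^2 - \frac{X_{ij\cdot}^2}{k}\Big)\Big] \\
&= nvk(\mu^2 + \sigma_\tau^2 + \sigma_\rho^2 + \sigma_{\tau\rho}^2 + \sigma_\epsilon^2) \\
&\quad - nv(k\mu^2 + k\sigma_\tau^2 + k\sigma_\rho^2 + k\sigma_{\tau\rho}^2 + \sigma_\epsilon^2) \\
&= nv(k-1)\sigma_\epsilon^2.
\end{aligned}
\tag{V-46}
\]

In a like manner the expectations of the other sums of squares are

\[ E\Big[\frac{\sum_i X_{i\cdot\cdot}^2}{nk} - \frac{X_{\cdot\cdot\cdot}^2}{nvk}\Big] = (v-1)(\sigma_\epsilon^2 + k\sigma_{\tau\rho}^2 + nk\sigma_\tau^2), \tag{V-47} \]

\[ E\Big[\frac{\sum_j X_{\cdot j\cdot}^2}{vk} - \frac{X_{\cdot\cdot\cdot}^2}{nvk}\Big] = (n-1)(\sigma_\epsilon^2 + k\sigma_{\tau\rho}^2 + vk\sigma_\rho^2), \tag{V-48} \]

and

\[ E\Big[\frac{\sum_i\sum_j X_{ij\cdot}^2}{k} - \frac{\sum_i X_{i\cdot\cdot}^2}{nk} - \frac{\sum_j X_{\cdot j\cdot}^2}{vk} + \frac{X_{\cdot\cdot\cdot}^2}{nvk}\Big] = (v-1)(n-1)(\sigma_\epsilon^2 + k\sigma_{\tau\rho}^2). \tag{V-49} \]

If it is assumed that the $\tau_i$, $\rho_j$, and $\tau\rho_{ij}$ are fixed effects which sum to zero in the experiment, then the sampling error mean square has expectation $\sigma_\epsilon^2$; the interaction = "experimental error" mean square has expectation $\sigma_\epsilon^2 + k\sum\sum(\tau\rho_{ij})^2/(n-1)(v-1)$; the treatment mean square has expectation $\sigma_\epsilon^2 + nk\sum\tau_i^2/(v-1)$; and the replicate mean square has expectation $\sigma_\epsilon^2 + vk\sum\rho_j^2/(n-1)$. For an experiment in which all effects are assumed to be fixed, the sampling error is the appropriate error mean square for testing the null hypotheses of the effects $\rho_j$, $\tau_i$, and $\tau\rho_{ij}$. This was explained in example V-2. The nature of the material determines the linear model. The assumptions are made by the experimenter.

The above expectations are difficult to obtain if the number of individuals, $k_{ij}$, per experimental unit varies. A discussion of the various expectations may be found in the literature [79, 111, 155, 169, 175, 290].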
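The estimates of section V-2.1 can be checked numerically. The following is a minimal sketch in Python, assuming a small made-up table of yields (the data and the function name `rcb_least_squares` are hypothetical and serve only as illustration). It computes $\hat\mu$, $t_i$, and $r_j$ from equations (V-20) to (V-22) and evaluates the residual sum of squares in two ways: directly from the fitted values, and from the totals as in the first line of equation (V-25).

```python
def rcb_least_squares(x):
    """Least squares fit of X_ij = mu + tau_i + rho_j + e_ij.

    x[i][j] is the yield of treatment i in replicate j
    (v treatments, n replicates, one observation per unit).
    """
    v, n = len(x), len(x[0])
    treat_totals = [sum(row) for row in x]                               # X_i.
    rep_totals = [sum(x[i][j] for i in range(v)) for j in range(n)]      # X_.j
    grand = sum(treat_totals)                                            # X_..

    mu_hat = grand / (n * v)                        # equation (V-20)
    t = [tot / n - mu_hat for tot in treat_totals]  # equation (V-21)
    r = [tot / v - mu_hat for tot in rep_totals]    # equation (V-22)

    # Residual sum of squares from the fitted values.
    direct = sum((x[i][j] - mu_hat - t[i] - r[j]) ** 2
                 for i in range(v) for j in range(n))
    # The same quantity written in terms of totals.
    from_totals = (sum(val ** 2 for row in x for val in row)
                   - sum(tot ** 2 for tot in treat_totals) / n
                   - sum(tot ** 2 for tot in rep_totals) / v
                   + grand ** 2 / (n * v))
    return mu_hat, t, r, direct, from_totals


# Hypothetical yields: 3 treatments (rows) in 4 replicates (columns).
yields = [[12.0, 14.0, 13.0, 15.0],
          [10.0, 11.0, 12.0, 11.0],
          [15.0, 18.0, 16.0, 19.0]]
mu_hat, t, r, direct, from_totals = rcb_least_squares(yields)
print(mu_hat, t, r, direct, from_totals)
```

The estimated effects satisfy the restrictions of equation (V-19), and the two forms of the residual sum of squares agree up to rounding error.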
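V-3 Development of Formulae for Missing or Deleted Values

If an experimental unit is missing, the yields for an experiment designed as a randomized complete block may be represented as

\[
\begin{array}{c|cccc|c}
\text{Treatment } i & \multicolumn{4}{c|}{\text{Replicate } j} & \text{Totals} \\
 & 1 & 2 & \cdots & n & \\
\hline
1 & \hat X_{11} & X_{12} & \cdots & X_{1n} & X_{1\cdot} + \hat X_{11} \\
2 & X_{21} & X_{22} & \cdots & X_{2n} & X_{2\cdot} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
v & X_{v1} & X_{v2} & \cdots & X_{vn} & X_{v\cdot} \\
\hline
\text{Totals} & X_{\cdot 1} + \hat X_{11} & X_{\cdot 2} & \cdots & X_{\cdot n} & X_{\cdot\cdot} + \hat X_{11}
\end{array}
\]

where $\hat X_{11}$ is the missing value, $n$ = number of replicates, $v$ = number of treatments, $X_{ij}$ = yields, and the various totals are listed in the last row and the last column. There is no loss in generality in placing $\hat X_{11}$ in the first row and column. If the analysis of variance on the values in the above table is computed, the error sum of squares is

\[ R = \hat X_{11}^2 + \sum\sum X_{ij}^2 - \frac{(X_{1\cdot} + \hat X_{11})^2 + \sum_{i=2}^{v} X_{i\cdot}^2}{n} - \frac{(X_{\cdot 1} + \hat X_{11})^2 + \sum_{j=2}^{n} X_{\cdot j}^2}{v} + \frac{(X_{\cdot\cdot} + \hat X_{11})^2}{nv}. \tag{V-50} \]

The value of $\hat X_{11}$ which makes the error sum of squares a minimum is obtained by differentiating equation (V-50) with respect to $\hat X_{11}$ and equating the result to zero:

\[ \frac{\partial R}{\partial \hat X_{11}} = 2\hat X_{11} - \frac{2(X_{1\cdot} + \hat X_{11})}{n} - \frac{2(X_{\cdot 1} + \hat X_{11})}{v} + \frac{2(X_{\cdot\cdot} + \hat X_{11})}{nv} = 0. \tag{V-51} \]

Solution of equation (V-51) for $\hat X_{11}$ results in the following:

\[ \hat X_{11} = \frac{vX_{1\cdot} + nX_{\cdot 1} - X_{\cdot\cdot}}{(n-1)(v-1)}, \tag{V-52} \]

which is equation (V-5).

If more values are to be estimated, say $X_{11}$ and $X_{21}$, we follow the same procedure and obtain two equations for estimating the missing values,

\[ \hat X_{11} = \frac{nX_{\cdot 1} + (v-1)X_{1\cdot} + X_{2\cdot} - X_{\cdot\cdot}}{(n-1)(v-2)} \tag{V-53} \]

and

\[ \hat X_{21} = \frac{nX_{\cdot 1} + (v-1)X_{2\cdot} + X_{1\cdot} - X_{\cdot\cdot}}{(n-1)(v-2)}. \tag{V-54} \]

If $X_{11}$ and $X_{22}$ are missing, the two equations for estimating the missing values are

\[ \hat X_{11} = \frac{(n-1)(v-1)(nX_{\cdot 1} + vX_{1\cdot}) - vX_{2\cdot} - nX_{\cdot 2} - (nv - v - n)X_{\cdot\cdot}}{(nv - v - n)(nv - v - n + 2)} \tag{V-55} \]

and

\[ \hat X_{22} = \frac{(n-1)(v-1)(nX_{\cdot 2} + vX_{2\cdot}) - vX_{1\cdot} - nX_{\cdot 1} - (nv - v - n)X_{\cdot\cdot}}{(nv - v - n)(nv - v - n + 2)}. \tag{V-56} \]

The procedure may be continued to obtain the equations for various combinations of missing values. The general formula for missing values becomes too complex for easy manipulation [111]. An experimenter may either use the iterative method suggested by Yates [316], or he could develop formulae for his particular needs by the method outlined above.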
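As an illustration of the single-missing-value formula (V-52), the sketch below (Python; the table of yields and the function names are hypothetical) estimates the missing yield from the totals of the existing values and then confirms that the estimate gives a smaller error sum of squares than nearby trial values.

```python
def missing_value_estimate(x, i0, j0):
    """Estimate the missing yield x[i0][j0] (stored as None) by (V-52).

    x is a v-by-n table: rows are treatments, columns are replicates.
    All totals are taken over the existing values only.
    """
    v, n = len(x), len(x[0])
    treat_total = sum(val for val in x[i0] if val is not None)           # X_i.
    rep_total = sum(x[i][j0] for i in range(v) if x[i][j0] is not None)  # X_.j
    grand = sum(val for row in x for val in row if val is not None)      # X_..
    return (v * treat_total + n * rep_total - grand) / ((n - 1) * (v - 1))


def error_ss(x):
    """Error sum of squares of a complete v-by-n table."""
    v, n = len(x), len(x[0])
    treat_totals = [sum(row) for row in x]
    rep_totals = [sum(x[i][j] for i in range(v)) for j in range(n)]
    grand = sum(treat_totals)
    return (sum(val ** 2 for row in x for val in row)
            - sum(tot ** 2 for tot in treat_totals) / n
            - sum(tot ** 2 for tot in rep_totals) / v
            + grand ** 2 / (n * v))


def filled(x, i0, j0, value):
    """Copy of x with the missing cell replaced by value."""
    return [[value if (i, j) == (i0, j0) else x[i][j]
             for j in range(len(x[0]))] for i in range(len(x))]


# Hypothetical yields with treatment 1, replicate 1 missing.
table = [[None, 14.0, 13.0, 15.0],
         [10.0, 11.0, 12.0, 11.0],
         [15.0, 18.0, 16.0, 19.0]]
estimate = missing_value_estimate(table, 0, 0)
print(estimate)  # (3*42 + 4*25 - 154) / (3*2) = 12.0
print(error_ss(filled(table, 0, 0, estimate)))
```

For more than one missing value, the same `error_ss` function could be minimized one cell at a time in turn, which is the idea behind the iterative approach mentioned in the text; that extension is not shown here.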
CHAPTER VI

The Latin Square Design

VI-1 Applications of the Latin Square Design

VI-1.1 INTRODUCTION

In randomized complete block designs the restriction is imposed that all treatments or varieties must appear together in a block an equal or proportional number of times rather than being allotted at random over the whole experimental area as in the completely randomized design. For the latin square design, two restrictions are imposed; namely, that for an experimental area divided into rows and columns, each treatment must appear once in a row and once in a column. Thus for latin squares, the treatments are grouped into replicates in two ways, once in rows and once in columns. Through the elimination of row and column effects from the within-treatment variation the residual or error variance may be considerably reduced. The effect of the removal of the row and column variances on the residual variance is illustrated after the discussion on the construction and design of latin squares.

Latin square designs have a wide variety of applications in experimental work. They are used in industrial, laboratory, field, greenhouse, educational, medical, marketing, and sociological experimentation; from comparing a group of varieties or fertilizer treatments to testing biological assays, from comparing worker differences in the laboratory to comparing weaving processes, and from tasting tea to comparing patients in a hospital. Reference to literature citations of numerical examples of the latin square design indicates the wide variety of experiments for which the latin square design has been used. Many more uses may be found in other literature citations in which the original data are not reproduced. One has but to consult researchers to determine the popularity of the latin square design.

Despite its popularity the latin square design is practical only for five to twelve treatments; if two or more squares are used, it is suitable for fewer treatments.
For the 2 × 2, 3 × 3, and 4 × 4 latin squares, there are zero, 2, and 6 degrees of freedom associated with the residual sum of squares, and with such few degrees of freedom in the error term, it is recommended that the latin square be repeated or that another design be used. Since the latin square design requires as many replicates as treatments, the design is seldom used for more than ten to twelve treatments. With regard to the use of the latin square design and with regard to the high precision (standard error less than 2 per cent of the mean) frequently obtained, Fisher [126, sec. 33] says, "If experimentation were only concerned with the comparison of four to eight treatments or varieties, it (the latin square design) would therefore be not merely the principal but almost the universal design employed."

VI-1.2 ADVANTAGES AND DISADVANTAGES

The advantages of the latin square design over other designs are

(i) With a two-way stratification or grouping the latin square controls more of the variation than the completely randomized design or the randomized complete block design. The two-way elimination of variation often results in a reduced error variance.

(ii) The analysis is only slightly more complicated than that for the randomized complete block design.

(iii) The analysis remains relatively simple even with missing data [1, 14, 33, 85, 216, 316]. Analytical procedures are available for omitting one or more treatments, rows, or columns [321, 333].

The disadvantages of the latin square design are

(i) The number of treatments is limited to the number of rows and columns, except as noted above [321, 333]. For more than ten treatments the latin square is seldom used.

(ii) For fewer than five treatments the allocation of degrees of freedom for controlling heterogeneity is disproportionately large. Even with repetition of squares a disproportionate number of degrees of freedom is associated with rows and columns for two, three, and four treatments.
When corrections are made for degrees of freedom (formula (I-1)) the latin square may not be as efficient as the randomized complete block or completely randomized designs for two, three, and four treatments.

Cochran [45, 47] reported that the efficiency of the latin square design relative to the completely randomized design is 222 per cent, and, relative to the randomized complete block design, is 137 per cent for the experiments grown at Rothamsted and associated centers during the years 1927 to 1934. This means that ten replicates of a completely randomized design or six replicates of a randomized complete block design are roughly equivalent to four or five replicates of a latin square. Similar results were obtained at the University of Saskatchewan by Ma and Harrington [201], who found that the randomized complete block was only 79 per cent as efficient as the latin square; i.e., four replicates of a latin square are approximately equivalent to five replicates of a randomized complete block design.

VI-1.3 CONSTRUCTION AND ARRANGEMENTS

For the discussion on the construction of latin squares, it is advantageous to define or explain some of the terminology used in connection with these designs. Fisher and Yates [129] give the following definitions:

(i) standard square. A square is said to be standard if the first row and first column are ordered alphabetically or numerically. There are as many standard squares for a k × k latin square as there are types which cannot be converted into another square by a reshuffling of rows and columns.

(ii) conjugate square. Two standard squares are conjugate if the rows of one are the columns of the other.

(iii) self-conjugate square. A square is self-conjugate if its arrangement in rows and columns is the same.

(iv) adjugate set. By permuting with each other the three categories, rows, columns, and letters, six sets (not necessarily all different) are formed.
The resulting sets are said to be adjugate.

(v) self-adjugate set. A set is self-adjugate if a permutation of the three categories, columns, rows, and letters results in the same set.

For the 2 × 2 latin square, there is only the one standard square,

    A  B
    B  A

The two conjugate squares for the above standard square result in the same arrangement as given above. This means that the 2 × 2 latin square is also self-conjugate, since the letters in a row are in the same order as those in the corresponding column. By interchanging rows with columns, columns with letters, and letters with rows, three latin square arrangements are obtained. The conjugate of each of the above three sets may be obtained, resulting in the six adjugate sets. These sets give the single square for the 2 × 2 latin square; hence, the 2 × 2 latin square is self-adjugate.

Likewise, for the 3 × 3 latin square, there is only one standard square,

    A  B  C
    B  C  A
    C  A  B

The square is self-conjugate, since the arrangement of the letters in rows and columns is the same. To illustrate the construction of the adjugate set, interchange the column order with letter order in the square,

    11A  12B  13C          111  122  133
    21B  22C  23A    or    212  223  231
    31C  32A  33B          313  321  332

to obtain the square,

    A  B  C
    C  A  B
    B  C  A

Likewise, interchange the row order and letter order to obtain the square,

    A  C  B
    B  A  C
    C  B  A

Upon interchanging rows and columns of the above three squares, we obtain a total of six squares, or the adjugate set; these squares are the same as the standard square when reordered by interchanging rows and by interchanging columns. However, the six squares of the adjugate set are not all the same.
There are twelve possible arrangements for the 3 × 3 latin square:

    A B C    A B C    A C B    A C B
    B C A    C A B    B A C    C B A
    C A B    B C A    C B A    B A C

    B A C    B A C    B C A    B C A
    A C B    C B A    A B C    C A B
    C B A    A C B    C A B    A B C

    C A B    C A B    C B A    C B A
    A B C    B C A    A C B    B A C
    B C A    A B C    B A C    A C B

There are 3!(3 − 1)! = 12 arrangements for the 3 × 3 latin square, of which 11 are nonstandard squares.

Four standard squares are possible for the 4 × 4 latin square:

    A B C D    A B C D    A B C D    A B C D
    B A D C    B C D A    B D A C    B A D C
    C D B A    C D A B    C A D B    C D A B
    D C A B    D A B C    D C B A    D C B A

All four standard 4 × 4 latin squares are self-conjugate. For each standard square, there are 4!(4 − 1)! = 144 possible arrangements, resulting in a total of 576 possible arrangements; these have been tabulated by Kitagawa and Mitome [187]. Of the 576 arrangements, 572 are nonstandard squares, and the remaining are the four standard squares.

For the 5 × 5 latin square, there are twenty-five standard squares and their conjugates, plus six self-conjugate squares, resulting in fifty-six standard squares. Also, there are 56(5!)(4!) = 56(2880) = 161,280 possible arrangements [129].

The number of possible arrangements increases rapidly as the size of the latin square increases. It is obvious, then, why the possible arrangements for all k × k latin squares have not been tabulated. Fisher and Yates [129] give the standard squares for the 4 × 4 and 5 × 5 latin squares; they [128, 129] give the five conjugate pairs of transformation sets and the twelve sets containing conjugates for the 6 × 6 latin squares. Norton [233] has tabulated the 562 sets from which it is possible to generate the 16,927,968 standard squares for the 7 × 7 latin square.¹ To date, all the standard squares for higher ordered latin squares have not been tabulated.
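The counts quoted above for the smaller squares can be verified by direct enumeration. The following Python sketch is a brute-force check, not the tabulation method of the cited authors; the function names are hypothetical. It builds every k × k latin square row by row and counts how many are standard and, among those, how many are self-conjugate.

```python
from itertools import permutations


def latin_squares(k):
    """Yield every k x k latin square as a tuple of rows on letters 0..k-1."""
    rows = list(permutations(range(k)))

    def extend(square):
        if len(square) == k:
            yield tuple(square)
            return
        for row in rows:
            # A new row is admissible if it repeats no letter in any column.
            if all(row[c] != prev[c] for prev in square for c in range(k)):
                yield from extend(square + [row])

    yield from extend([])


def is_standard(square):
    """First row and first column are in natural order."""
    k = len(square)
    return (square[0] == tuple(range(k))
            and all(square[i][0] == i for i in range(k)))


def is_self_conjugate(square):
    """Rows and columns give the same arrangement (the square is symmetric)."""
    k = len(square)
    return all(square[i][j] == square[j][i] for i in range(k) for j in range(k))


for k in (3, 4):
    squares = list(latin_squares(k))
    standard = [s for s in squares if is_standard(s)]
    print(k, len(squares), len(standard),
          sum(is_self_conjugate(s) for s in standard))
# prints:
# 3 12 1 1
# 4 576 4 4
```

The same functions can be run for k = 5, where the enumeration takes noticeably longer; larger squares are out of reach for this naive approach.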
Further discussion on the formation of latin squares appears in the literature citations [126, 128, 129, 175, 207, 308, 317] at the end of the book. A number of literature citations relating to the construction of latin squares appears in references [113] and [129]. Some of the topics related to orthogonal latin squares [129] and graeco-latin squares [126, 129] are discussed in later chapters. Fisher and Yates [129] list sample squares for the 7 × 7, 8 × 8, 9 × 9, 10 × 10, 11 × 11, and 12 × 12 latin squares. The experimenter may also make up his own sample squares for the larger squares.

¹Sade (Ann. Math. Stat. 22:306, 1951) has found the correct number of 7 × 7 latin squares to be 16,942,080.

VI-1.4 RANDOMIZATION

In designing an experiment as a latin square, one of the possible arrangements is selected at random. The procedure is quite simple for the 2 × 2 latin square; of the two arrangements,

    A  B          B  A
    B  A   and    A  B

one is chosen by the toss of a coin or from a table of random numbers. The letters A and B represent two treatments under consideration. Likewise, for the 3 × 3 latin square with the treatments A, B, and C, one of the twelve arrangements listed above is chosen at random.

All possible arrangements of the 2 × 2 and 3 × 3 latin squares may be found in several references. The remainder have not been enumerated to date. Therefore, another method for selecting a random arrangement must be provided. In accordance with the rule for obtaining latin square arrangements as set forward by Fisher and Yates [129] the following procedure may be utilized for obtaining latin square designs for experiments:

(i) 2 × 2 latin square. Randomize the arrangement of the columns of the standard square, or, alternatively, select one of the two arrangements at random.

(ii) 3 × 3 latin square. Randomize the arrangement of the three columns and of the last two rows of the standard square or, alternatively, select one of the twelve arrangements at random.
(iii) 4 × 4 latin square. Select one of the four standard squares at random and then randomize the arrangement of the columns and the last three rows. Also, the procedure of selecting one of the 576 arrangements at random may be used [187].

(iv) 5 × 5 latin square. Select one of the fifty-six standard squares at random and then randomize the arrangement of the five columns and the last four rows; this results in the selection of one of the 161,280 arrangements.

(v) 6 × 6 latin square. Select one of the 9408 standard squares at random and randomize the arrangement of the columns and the last five rows; alternatively, select at random one of the sets enumerated by Fisher and Yates [129] in proportion to the number of standard squares possible in the set and then randomize the allotment of the letters to the treatments, the arrangement of the columns, and the arrangement of the rows.

(vi) 7 × 7 latin square. Select one of the 16,942,080 standard squares at random and then arrange all columns and the last six rows at random; alternatively, follow the second plan for the 6 × 6 latin square.

(vii) 8 × 8 and higher latin squares. Select one of the tabulated squares or construct one and then arrange the columns and the rows at random and assign the letters to the treatments at random or follow the alternative procedure outlined below.

The procedure given in (vii) may be used to construct the 5 × 5 and larger latin square arrangements. However, it must be remembered that this method does not result in all possible arrangements, since certain configurations are excluded. Little harm from this procedure is likely to result unless the latin squares are used extensively or unless the results of several experiments are summarized.

Yates [317] has discussed the theoretical basis for randomization of latin squares. The student is referred to the above reference for a further discussion of this topic.
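Procedure (iii) for the 4 × 4 latin square can be sketched in a few lines of Python. The sketch assumes a pseudo-random number generator in place of the coin or table of random numbers mentioned in the text; the names `STANDARD_4X4`, `randomize_4x4`, and `is_latin` are hypothetical. The four squares listed are the four standard 4 × 4 squares (first row and first column in alphabetical order).

```python
import random

# The four standard 4 x 4 latin squares, one string per row.
STANDARD_4X4 = [
    ["ABCD", "BADC", "CDBA", "DCAB"],
    ["ABCD", "BCDA", "CDAB", "DABC"],
    ["ABCD", "BDAC", "CADB", "DCBA"],
    ["ABCD", "BADC", "CDAB", "DCBA"],
]


def randomize_4x4(rng):
    """Pick a standard square, then permute all columns and the last three rows."""
    square = rng.choice(STANDARD_4X4)
    column_order = list(range(4))
    rng.shuffle(column_order)
    rows = [[row[c] for c in column_order] for row in square]
    tail = rows[1:]          # the first row keeps its position
    rng.shuffle(tail)
    return [rows[0]] + tail


def is_latin(square):
    """Each letter appears once in every row and once in every column."""
    k = len(square)
    letters = set(square[0])
    return (all(set(row) == letters and len(row) == k for row in square)
            and all({square[i][j] for i in range(k)} == letters
                    for j in range(k)))


rng = random.Random(1955)    # the seed is arbitrary
layout = randomize_4x4(rng)
for row in layout:
    print(" ".join(row))
print(is_latin(layout))
```

Assigning the letters to the actual treatments at random, as in procedure (vii), would be one further call to `rng.shuffle` on the list of treatment names.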
An alternative procedure for obtaining a random arrangement may be used for the larger latin squares. It is observed (i) that the method is biased in that all arrangements do not have an equal probability of being drawn and (ii) that this method gives all possible arrangements. An outline of the procedure follows:¹

(i) Assign letters to the first row at random; this results in k! permutations of the letters in the first row.

(ii) Assign the remaining (k − 1) letters at random in the first column, resulting in (k − 1)! permutations of these letters.

(iii) Continue the process until all rows and columns are filled, excluding letters which have already appeared in a row or in a column.

As a further suggestion, one might take the resulting square from the above procedure and randomize the rows, columns, and letters. It is not known what effect the second randomization has on the bias. The bias may be illustrated fairly simply with the 4 × 4 latin square. If the above procedure is followed, it will be observed that the probability of selecting two of the four standard squares is 1/6 each, and the probability of selecting the other two standard squares is 1/3 each.

¹Suggested in a discussion with F. Yates and P. J. McCarthy.

VI-1.5 EXPERIMENTAL LAYOUT

The general concept is that the latin square design should occupy a square or nearly square experimental area. In practice, this is generally true for field experiments, but it is not necessary; usually the rows are perpendicular to the columns; thus:

                  Column number
    Row number    1    2    3
        1         A    C    B
        2         C    B    A
        3         B    A    C

Now, the purpose of the latin square design in field and laboratory experiments is to control variation in two directions, such as down the field and across the field, across the greenhouse bench and along the bench, or from two sources.
In some instances, it may be desirable to keep the treatments in a row in a compact block in such a way that the blocks are the rows and the order within the blocks represents the columns. Such an experimental design might be illustrated by the following:

    Row 1 or Block 1    Row 2 or Block 2    Row 3 or Block 3
       A   C   B           C   B   A           B   A   C

or

    [Diagram: alternative arrangement of Rows 1, 2, and 3]

The above design might be used on a single row or set of rows in a grape vineyard, where treatment A represents no spray or check, treatment B represents spray 1, and treatment C represents spray 2. The sprayer could be equipped with two tanks and the various treatments applied as indicated in the design.

In some instances, more replication is desired. The procedure here is to select, at random, the desired number of arrangements of the latin square. Suppose that nine replicates for the three treatments A, B, and C are desired. The field design could be of the following form for three rows (or three sets of rows) in a grape vineyard:

    [Diagram: nine-replicate layout for three rows]

or for the three locations, farms, or positions:

    Square I     Square II    Square III
    A  C  B      C  A  B      A  B  C
    C  B  A      B  C  A      B  C  A
    B  A  C      A  B  C      C  A  B

Modifications of latin square designs result in other configurations, depending upon the nature of the experiment and the experimental material. The first of the two designs listed above may be useful in baking or cooking experiments which are conducted over a period of days. If it is possible to bake three cakes per day and if the worker tires as the day progresses, it may be desirable to have each kind of cake baked in all three orders of baking. The 3 × 3 latin square design satisfies these requirements. The experiment may be repeated on a second set of three days, using a different randomization.
VI-1.6 STATISTICAL ANALYSIS FOR ONE OBSERVATION PER EXPERIMENTAL UNIT

The breakdown of the total degrees of freedom in the analysis of a k × k latin square design is

    Source of variation    Degrees of freedom    Mean square
    Row                    k − 1                 R
    Column                 k − 1                 C
    Treatment              k − 1                 T
    Error or residual      (k − 1)(k − 2)        E
    Total                  k² − 1

The row sum of squares is obtained by summing the squares of the row totals, $X_{i\cdot\cdot}$, dividing by $k$, and then subtracting the correction term (equal to the grand total, $X_{\cdot\cdot\cdot}$, squared and divided by $k^2$);

\[ \frac{X_{1\cdot\cdot}^2 + X_{2\cdot\cdot}^2 + \cdots + X_{k\cdot\cdot}^2}{k} - \frac{X_{\cdot\cdot\cdot}^2}{k^2} = \bar x_{1\cdot\cdot}X_{1\cdot\cdot} + \bar x_{2\cdot\cdot}X_{2\cdot\cdot} + \cdots + \bar x_{k\cdot\cdot}X_{k\cdot\cdot} - \bar x X_{\cdot\cdot\cdot} = k\sum_i (\bar x_{i\cdot\cdot} - \bar x)^2, \tag{VI-1} \]

where $\bar x_{i\cdot\cdot}$ = row mean and $\bar x$ = experiment mean. In a similar manner the treatment and column sums of squares are obtained as

\[ \frac{X_{\cdot\cdot 1}^2 + \cdots + X_{\cdot\cdot k}^2}{k} - \frac{X_{\cdot\cdot\cdot}^2}{k^2} \tag{VI-2} \]

and

\[ \frac{X_{\cdot 1\cdot}^2 + \cdots + X_{\cdot k\cdot}^2}{k} - \frac{X_{\cdot\cdot\cdot}^2}{k^2}, \tag{VI-3} \]

respectively, where $X_{\cdot\cdot h}$ represents the treatment total and $X_{\cdot j\cdot}$ represents the column total. The total sum of squares with $k^2 - 1$ degrees of freedom is obtained by squaring the $k^2$ determinations, $X_{ijh}$, and subtracting the correction term,

\[ \sum_i\sum_j X_{ijh}^2 - \frac{X_{\cdot\cdot\cdot}^2}{k^2}. \tag{VI-4} \]

The subscript $i = 1, 2, \cdots, k$ and the subscript $j = 1, 2, \cdots, k$ give the $k^2$ observations. Within each row and column the treatments are arranged to appear once in each row and once in each column. Thus, for $h = 1, 2, \cdots, k$, there are two restrictions imposed. The error or residual sum of squares is obtained by subtracting the row, column, and treatment sums of squares from the total,

\[ \sum_i\sum_j X_{ijh}^2 - \frac{\sum_i X_{i\cdot\cdot}^2}{k} - \frac{\sum_j X_{\cdot j\cdot}^2}{k} - \frac{\sum_h X_{\cdot\cdot h}^2}{k} + \frac{2X_{\cdot\cdot\cdot}^2}{k^2}. \tag{VI-5} \]

The estimated standard error of a difference between two means is obtained from the formula,

\[ \sqrt{\frac{2(\text{error mean square})}{k}}. \tag{VI-6} \]

In the event that one of the treatments yields more variable results than the other k − 1 treatments, Yates [316, 321] gives a method for determining the error of the more variable treatment and the error of the remaining k − 1 treatments which have approximately the same amount of variation. In addition, he describes a procedure for calculating missing values and for making comparisons among the means.
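The computations in equations (VI-1) to (VI-6) can be sketched as follows in Python. The 3 × 3 observations below are made up for illustration, and the function name `latin_square_anova` is hypothetical; the layout of the letters is the 3 × 3 arrangement shown in section VI-1.5.

```python
def latin_square_anova(obs, treat):
    """Sums of squares for a k x k latin square.

    obs[i][j]   = observation in row i, column j
    treat[i][j] = label of the treatment applied in that cell
    """
    k = len(obs)
    grand = sum(sum(row) for row in obs)
    correction = grand ** 2 / k ** 2

    row_totals = [sum(row) for row in obs]
    col_totals = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    trt_totals = {}
    for i in range(k):
        for j in range(k):
            trt_totals[treat[i][j]] = trt_totals.get(treat[i][j], 0.0) + obs[i][j]

    row_ss = sum(t ** 2 for t in row_totals) / k - correction            # (VI-1)
    trt_ss = sum(t ** 2 for t in trt_totals.values()) / k - correction   # (VI-2)
    col_ss = sum(t ** 2 for t in col_totals) / k - correction            # (VI-3)
    total_ss = sum(x ** 2 for row in obs for x in row) - correction      # (VI-4)
    error_ss = total_ss - row_ss - col_ss - trt_ss                       # (VI-5)

    error_ms = error_ss / ((k - 1) * (k - 2))
    se_diff = (2 * error_ms / k) ** 0.5                                  # (VI-6)
    return {"rows": row_ss, "columns": col_ss, "treatments": trt_ss,
            "error": error_ss, "total": total_ss,
            "error_ms": error_ms, "se_diff": se_diff}


# Hypothetical data laid out as in section VI-1.5.
layout = ["ACB", "CBA", "BAC"]
data = [[10.0, 14.0, 12.0],
        [13.0, 11.0, 9.0],
        [12.0, 10.0, 15.0]]
result = latin_square_anova(data, layout)
for name, value in result.items():
    print(name, round(value, 4))
```

With only 2 degrees of freedom for error in a single 3 × 3 square, the standard error computed here is poorly determined, which is the reason the text recommends repeating small squares.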
(VI-6) Example V1-1. One of the studies of the Regional Cooperative Project S-5 reported by the Southern Cooperative Group is concerned with the variation in moisture con- tent. from plant to plant and from leaf to leaf of turnip greens. Knowledge of the magnitude of these sources of variation is of importance in sampling studies for chem- ical determinations. In order to study the sources of variation in sampling turnip greens, Peterson ef al. (247] set up a 5 X 5 latin square with five different plants as the rows and five different leaf sizes, ranked from smallest to largest and designated as A. B,C, D, and Eaa the columns. In addition, it was desired to measure the relative variation due to time of sampling. The treatments in this experiment are the five times of sampling, The data on moisture content of deribbed turnip leaves are presented in table VI-1. The moisture contents given in the table are in percentages and are coded values obtained by subtracting 80 from each of the original moisture contents. The ‘sums of squares are obtained from the data and totals in table VI-1. Application of formula (VI-1) resulta in the following sum of squares for rows or plants: $0.68? + or + 52:68 - inst = 1320.2443 ~ 1290.8212 = 29.4231. ‘Application of equations (VI-2), (VI-3), and (VI-4) results in the sums of squares for 146 The Latin Square Design (§ VE-L.6 times, leaf sizes, and the total corrected for the mean. ‘The error or residual sum of ‘squares, equation (V1-5), is 1353.5604 — 1320.2443 — 1313.8162 — 1291.3635 4 2(1290.8212) = 62.7392 ~ 29.4231 ~ 22.9950 — 0.5423 = 9.7788. ‘The ubove sums of squares and the corresponding mean squares are presented at the bottom of table VI-1 TABLE VI-1. Moisture content of turnip greens (minus 80) Leaf cize (A to E, smallest to largest) Plants a B c D z a v. 6.67} dv. 7-15] 4. 6.29 | 111. 8.95 | 44. 9.62 2 di, SMO] v. 8.77 fav. 5.40 | 4. 7.54 | 114. 6.93 3 444. 7.32 | 4a. 6.53] v. 8.50 | dv. 9.99] 4. 9.68 a 4. b.g2 | a4. 
5.00] 44. 7.29 |v. 7.85 | tv. 7.08 5 av. 4.86] 4. 6.26 | 111. 7.83 | 41. 5.38] v. 8.51 \“sotaa zone] al oral oem] neliee| - | L Uncosed | 85.84 86.32 87.46 87.9% 88.36] + | 8T-19 | Time totals and means (uncoded) w.22 | 36.05 87.28 | 87.22 Leaf eizes (column) ‘Times (treatment) af 2.7392 Correction for mean a 1290.82.12 - [_towt_weorrectes 25 | 1983. 5608 = An F test indicates that differences exist among plants and among leaf sizes, but that time differences are negligible, thus: § VI-1.6) One Observation Per Experimental Unit 147 and F = 91856 _ 9.47, where the F, values are obtained from table II-8. {t may be of interest. to determine if the time means are more alike than expected relative to the error variance; thus F = 8149/.1356 = 6.0, which is greater than the tabulated value for Fos (12 and 4d) = 5.91; this is the 10 instead of the 5 per cent level, since the F test was made after observing which mean square was the smaller. Hence, the mean square for time of sampling is amall but probably not unusually small. If it were significantly smaller than the error mean square, then competition, in the general sense, would be suspected, and we should determine the element in our experimental technique which gives either error mean square that is Loo large or a times mean square that is too small. ‘Asa further subdivision of the sum of squares, it may be of interest to obtain the linear, quadratic, cubic, and residual components for leaf size. Without additional knowledge on leaf size, we could assign the values ~2, ~1, 0, 1, and 2, or 1, 2, 3, 4, and 5, and obtain the components by the method described by Snedecor (273, Ch. 14 and 15]. In this particular experiment, such comparisons are unrealistic, since the actual weights aré available and the authors [247] used a covariance analysis on moisture contents for leaf size. The standard error of a time, plant, or leaf-size mean is n= (B= ao ‘The standard error of a mean difference is (2(.8149) t= FS) » on. 
The coefficient of variation is $\sqrt{E}/\bar{x}$.

Since the heterogeneity was controlled by the two groupings, plants and leaf sizes, the experimental design was satisfactory. If either plant variation or leaf-size variation is ignored, the residual variance increases considerably. This means that the latin square design is more efficient than either a randomized complete block or a completely randomized design for comparing treatment (time) means. If the plants were the replicates, the efficiency of this latin square relative to a randomized complete block design is

$$\frac{[(k-1)(k-2)+1]\,[(k-1)^2+3]\,[C+(k-1)E]}{[(k-1)(k-2)+3]\,[(k-1)^2+1]\,[kE]}, \tag{VI-7}$$

where the symbols are defined in the analysis of variance table at the beginning of this section; for this example, the efficiency is equal to

$$\frac{(12+1)(16+3)\,[5.7488+4(.8149)]}{(12+3)(16+1)\,[5(.8149)]} = 214 \text{ per cent}.$$

Approximately eleven replicates of a randomized complete block design with plants as replicates would have yielded a standard error of a mean equal to that obtained with five in the latin square. It is probably unrealistic to consider leaf sizes as replicates and to ignore plant differences, but if the columns were the replicates, the efficiency of the latin square relative to the randomized complete block is obtained from the formula

$$\frac{[(k-1)(k-2)+1]\,[(k-1)^2+3]\,[R+(k-1)E]}{[(k-1)(k-2)+3]\,[(k-1)^2+1]\,[kE]}, \tag{VI-8}$$

which for this latin square is

$$\frac{(13)(19)\,[7.3558+4(.8149)]}{(15)(17)\,[5(.8149)]} = 252 \text{ per cent}.$$

Approximately thirteen replicates of a randomized block design of this type would have been equal to five replicates of the latin square. Similarly, the efficiency of the two-way grouping in the latin square relative to no stratification, as in the completely randomized design, is

$$\frac{[(k-1)(k-2)+1]\,[k(k-1)+3]\,[R+C+(k-1)E]}{[(k-1)(k-2)+3]\,[k(k-1)+1]\,[(k+1)E]}. \tag{VI-9}$$

The efficiency of the latin square relative to the completely randomized design is

$$\frac{(13)(23)\,[7.3558+5.7488+4(.8149)]}{(15)(21)\,[6(.8149)]} = 318 \text{ per cent},$$

or, approximately sixteen replicates of a completely randomized design would have been equivalent to the latin square with five.

VI-1.7 STATISTICAL ANALYSIS FOR A GROUP OF LATIN SQUARES WITH A SINGLE DETERMINATION PER PLOT

In some cases, it is desirable to have more than a single latin square at a single location or to have a single latin square at several locations. For the 2 × 2, 3 × 3, and sometimes the 4 × 4 latin squares, it is often desirable to have two or more squares at a location in order to have sufficient degrees of freedom in the error sum of squares. The procedure of designing an experiment in more than one latin square has already been discussed. The breakdown of the total degrees of freedom in the analysis of variance follows:

    Source of variation          Degrees of freedom     Mean square
    Squares (or locations)       s − 1                  S
    Rows within squares          s(k − 1)               R
    Columns within squares       s(k − 1)               C
    Treatments                   k − 1                  T
    Treatments × squares         (s − 1)(k − 1)         TS
    Residual within squares      s(k − 1)(k − 2)        E
    Total                        sk² − 1

The "treatment × squares" sum of squares may be pooled with the "residual within-squares" sum of squares if there is no treatment-square interaction. If the squares are at different locations, then, for some hypotheses, it is appropriate to use the "treatment × square" mean square to test the treatment differences. Fisher [126, sec. 65] discusses the analysis for a group of latin squares and the appropriate error mean square for testing the differences among treatment means for various hypotheses. The analysis for two or more latin squares has been discussed by several authors.

The formulae for obtaining the sums of squares in the above analysis of variance table are given below.
For squares, (VI-10) for rows within squares, (VI-U) (VI-12) (VI-13) (VI-14) Cin? we}, (VL-15) and for the total sum of squares, + x, 2% RX ‘The subscripts j, h, and g correspond to subscripts h, i, and j for the single latin square, as described previously, and i = 1, 2, «++, s. The treatments squares sum of squares in formula (VI-14) may also be obtained by subtract- ing the treatment aum of squares from the treatment-within-squares sum of squares. An application of the above formulae is illustrated below. The efficiency of the latin square design relative to what it would have been had a completely randomized design been used is! (s — 1)S + s(k wae =e (VI-17) 'The correction for the difference in degrees of freedom associated with the two mean ‘squares is not included. (VI-16) 150 The Latin Square Design (§ VI-1.7 where £” is the mean square from the pooled sums of squares for error within squares and treatments X squares. The other symbols are defined above. The efficiency of the latin square design relative to what it would have been had the rows been used as replicates within each square is (for Z’, as defined in formula (VI-17)! a(k ~ 1)C + 8(k ~ 1)” ak(k = 1) (VI-18) If Cis replaced by R in formula (VI-18), the formula for efficiency is obtained for using the columns as replicates. Also, the efficiency of the latin square relative to what it would have been had a completely randomized design been used in each square ist a(k — I(R +0) + 8(k — IE" a DE ore) Ezample VI-2. Dominick (92] has used sets of latin squares in marketing research. The data, pounds of apples sold per hundred customers, presented in table VI-2, represent a part of one such set. Data for the remainder of the set are given in problem VI-3. Four treatments on McIntosh apples were compared in four stores; treatment A = regular apples, B = apples of 2.25-inch diameter at. 
a lower price, C = carefully handled uniform apples of 2.5-inch diameter, and D = highly colored uniform apples of 2.5-inch diameter. It was suspected that the days of the week as well as the parts of the week might contribute to the variability. The first part of the week is Monday, Tuesday, Wednesday, and Thursday, and the second part of the week is composed of Friday morning, Friday afternoon, Saturday morning, and Saturday afternoon. From previous work, it was found that the two parts of the week contained about the same number of customers. The randomization scheme followed was a random allocation of the four standard squares to the two parts of each of the two weeks. Then, the randomization scheme of section VI-1.4 was followed for each square.

The analysis of variance is obtained for each square separately, and the results are combined by the method of the preceding section. The separate analyses present no additional work, unless the individual mean squares are obtained, and may indicate the source of large variations. As a general rule, it is wise to study the individual analyses in connection with the combined analysis. The separate analyses and the combined analysis of variance are presented at the bottom of table VI-2.

The sum of squares for weeks (squares), formula (VI-10), is obtained by adding the correction terms from the individual weeks and subtracting the new correction term, thus:

$$10050.06 + 11130.25 - 823^2/32 = 13.78 = (401 - 422)^2/32,$$

with 2 − 1 = 1 degree of freedom. The stores-within-weeks, days-within-weeks, and treatments-within-weeks sums of squares are obtained by adding the sums of squares from each latin square, thus:

    stores-within-weeks     = 848.19 + 408.75  = 1256.94,
    days-within-weeks       = 237.69 + 1080.75 = 1318.44, and
    treatments-within-weeks = 707.19 + 1146.75 = 1853.94,

TABLE VI-2.
Pounds of McIntosh apples per 100 customers purchased in four stores for four treatments in the first parts of two weeks Store number Day of vex} 22S | Total Monday zie p 8 ch pw | n10 Tuesday B20 az Dw C25 | 115 Weanestay | Deb ciz 812 a27 | 75 ‘Toursday C52 p16 A B22 | 102 ‘Total 6 8 ee | hon Store total) Ie 25D Dey totais | "on Tes. Wed. ‘Ture, 200 205 155 265 ‘Treatuents a B c D Week 1 B e 108 16 hon weex2 | 72 % z ie wee Totals | 167 358 200 29) 3 Means 209 19.8 25.0 37.2 25.7 Analyses of variance for each week ‘Source of variation ‘Stores (column) BB.19 408-75 Days (row) 3 231.69 19:23 1080.75 | 360.25 ‘Treataent 3 tori | 235.75 116.75 | 382.25 Error 6 452.87 ie €13-50_| 102.25 Total 5 22bh.ob : 3249.75 : Correction for meen | 1 | 1050.06 : 1130.25, 2 ‘Total uncorrected 16 | 12295.00 : 1836000 ° ‘Combined analysis of variance ‘Source Of variation ar 3 = ‘Squares (weeks) 2 13.78 13.78 Days within veeks 6 3338.44 219.76 Stores within weeks 6 1256.9 209.49 Treatoents vitbin weeks 6 1853.94 . ‘Treatnent 3 1540.59 525.53 ‘reatnent x veek 3 313.35 108-85 gy gy Error vithin veeks 2 1065.37 £8.78 Total Bh 5908.47 - Correction for mean a 21266.55 : 151 1582 The Latin Square Design (§ VIAL? each with 3 +3 = 6 degrees of frecdom. The latter sum of squares is partitioned into the two components, treatment sum of squares, formula (VI-13), 167" + 1587+ 200% + 298" _ 823 / Ray OO with 3 degrees of freedom, and the treatment x week (square) sum of squares, 1853.94 — 1540.59 = 313.35, with 6 — 3 = 3 degrees of freedom. The other two sums of squares may be partitioned in the same manner as treatments-within-weeks for this particular example. If the two latin squares had been conducted during the two parts of one week, it would not be correct to partition the days-within-weeks sum Of squares into the components days and days X weeks. Also. the above statement holds if eight stores instead of four had been used for the two 4 X 4 latin squares. 
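The way the separate analyses are combined can be checked numerically. The following sketch uses only totals and sums of squares quoted in the text and in table VI-2; the dictionary layout and variable names are illustrative assumptions, not part of the original analysis.

```python
k = 4   # order of each latin square
s = 2   # number of squares (weeks)

# Totals and separate-analysis sums of squares quoted in the text.
week_totals = [401, 422]
treatment_totals = {"A": 167, "B": 158, "C": 200, "D": 298}
separate_ss = {
    "stores within weeks": [848.19, 408.75],
    "days within weeks": [237.69, 1080.75],
    "treatments within weeks": [707.19, 1146.75],
}

grand_total = sum(week_totals)
correction = grand_total ** 2 / (s * k * k)      # overall correction term

# Squares (weeks): individual correction terms minus the overall one.
ss_squares = sum(t ** 2 for t in week_totals) / (k * k) - correction

# "Within squares" sums of squares are simple sums over the squares.
within = {name: sum(values) for name, values in separate_ss.items()}

# Treatments over both squares, and treatments x squares by subtraction.
ss_treatments = sum(t ** 2 for t in treatment_totals.values()) / (s * k) - correction
ss_treat_x_squares = within["treatments within weeks"] - ss_treatments

print("squares (weeks):", round(ss_squares, 2), "df", s - 1)
for name, value in within.items():
    print(f"{name}:", round(value, 2), "df", s * (k - 1))
print("treatments:", round(ss_treatments, 2), "df", k - 1)
print("treatments x squares:", round(ss_treat_x_squares, 2), "df", (s - 1) * (k - 1))
```

Obtaining the interaction by subtraction mirrors the partition of the treatments-within-weeks sum of squares described above.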
The error-within-squares sum of squares is obtained by summing the individual error sums of squares, 451.87 + 613.50 = 1065.37, with 6 + 6 = 12 degrees of freedom. The total sum of squares for the two squares may be obtained by formula (VI-16) or by adding the uncorrected total sums of squares from the individual squares and subtracting the overall correction term, $12295 + 14380 - 823^2/32 = 5508.47$, with 16 + 16 − 1 = 31 degrees of freedom.

The treatment mean square is significantly larger than ordinarily expected in sampling from a homogeneous population; F = 513.53/88.78 = 5.78 is larger than $F_{.025}$(3, 12 df) = 4.47 and almost equal to $F_{.01}$(3, 12 df) = 5.95 (see table II-8). Therefore, we reject the null hypothesis of no difference among the four treatments on the purchase of apples. If it is desired to compare the other mean squares with the error mean square, the F test may be used. However, for this particular experimental setup the stores-within-weeks and days-within-weeks sums of squares should be partitioned into the component parts prior to making any tests.

The standard error of a treatment mean is

$$s_{\bar{x}} = \sqrt{E/sk} = \sqrt{88.78/8} = 3.33.$$

The standard error of a difference of two treatment means is

$$s_{\bar{d}} = \sqrt{E\left(\tfrac{1}{8} + \tfrac{1}{8}\right)} = \sqrt{88.78/4} = 4.71.$$

The coefficient of variation is $\sqrt{E}/\bar{x} = 32\sqrt{88.78}/823 \approx 37$ per cent. Application of Duncan's multiple comparisons test indicates that treatment D is different from treatments A, B, and C and that A, B, and C do not differ among themselves.

The efficiency of the two latin squares used relative to what would have been obtained from a completely randomized design is equal to (formula (VI-17))

$$\frac{13.78 + 6(219.74 + 209.49) + 18(91.91)}{31(91.91)} \times 100 = 149 \text{ per cent},$$

where 91.91 = (313.35 + 1065.37)/15. The use of E′ = 91.91 assumes a zero treatment × week interaction. If this is not true, then formulae (VI-17) to (VI-19) should be modified accordingly. Approximately twelve replicates of a completely randomized design would have been required to attain the same precision as the present design with eight replicates. Likewise, if the stores and weeks were the replicates of a randomized complete block design, the efficiency of the latin square is (formula (VI-18))

$$\frac{6(219.74) + 18(91.91)}{24(91.91)} \times 100 = 135 \text{ per cent}.$$

VI-1.8 ANALYSIS FOR MORE THAN ONE OBSERVATION PER EXPERIMENTAL UNIT

Several variations of latin squares are possible, and some of these will be discussed in later chapters. The example of the present section is a latin square with more than one unit per plot. The particular breakdown of the degrees of freedom in the analysis of variance depends upon the nature of the material and the design of the experiment.

Example VI-3. A 4 × 4 latin square design [288] was set up to compare the effects of four light intensities (D = dark or zero, L = 500, M = 900, and H = 1200 foot-candles of light) on the difference in bioelectric potential (in millivolts) between a point on the stem of the bean plant and the point at which the nutrient solution made contact with the stem. Since the difference in bioelectric potential required a period of 1 to 1¼ hours to become stabilized after a change in light intensities, it was necessary to wait two hours after changing light intensities in order to obtain a reliable measure of the difference in potential between the two points measured on the stem of the bean plant. The first treatment was applied at 10:00 a.m., and the reading was recorded at noon. The second treatment was started at noon, and the corresponding reading was taken at 2:00 p.m. Likewise, the third and fourth treatment readings were recorded at 4:00 and 6:00 p.m., respectively. It was believed that time of day might have an effect on differences in bioelectric potential.
Therefore, it was necessary to have the treatments (light intensities) applied once at each of the four times. A period of four days was required to run the experiment. The time of day was considered to be the row effect and the day of the week the column effect. The light intensities were the treatments.

The original data with three plants per day exposed to each of four light intensities and two readings on each plant under each of the light intensities are recorded in table VI-3. A different set of three plants was used on each of the four days. The order of applying light-intensity treatments was at random with the restriction that each of the treatments must occur in each of the orders over a four-day period. The arrangement of the treatments (light intensities) and the various totals used in obtaining the analysis of variance in table VI-4 are given in table VI-3.

The sums of squares for the analysis of variance in table VI-4 are obtained in much the same manner as for previous examples. The total sum of squares is obtained by squaring the 96 determinations and subtracting the correction term,

$$64^2 + 65^2 + \cdots + 53^2 + 50^2 - \frac{4440^2}{96} = 15{,}654.00.$$

The row, column, and treatment sums of squares are, respectively,

$$\frac{1112^2 + 1296^2 + 1068^2 + 964^2}{3 \times 2 \times 4} - \frac{4440^2}{96} = 2403.33,$$

$$\frac{1226^2 + 1059^2 + 1135^2 + 1020^2}{24} - \frac{4440^2}{96} = 1032.58,$$

$$\frac{1174^2 + 1113^2 + 1113^2 + 1040^2}{24} - \frac{4440^2}{96} = 375.58.$$

TABLE VI-3. Difference in bioelectric potential (millivolts) between stem and nutrient solution for bean plants under four light treatments (D = 0, L = 500, M = 900, and H = 1200 foot-candles of light) arranged in a 4 × 4 latin square. [Table body not recoverable.]

TABLE VI-4. Analysis of variance of the data in table VI-3

    Source of variation                  Degrees of    Sum of       Mean
                                         freedom       squares      square
    Time of day (row)                     3             2403.33      801.11
    Day of week (column)                  3             1032.58      344.19
    Light intensities (treatment)         3              375.58      125.19
    Experimental error                    6             3337.18      556.20
    Among plants within cells            32             8437.33      263.67
      Among plants in columns             8             3034.17      379.27
      Remainder                          24             5403.16      225.13
    Between readings on the same plant   48               68.00        1.42
    Total                                95           15,654.00

The experimental error sum of squares is obtained by subtracting the row, column, and treatment sums of squares from the sum of squares of the k² = 16 cell totals; i.e.,

$$\frac{320^2 + 308^2 + \cdots + 261^2}{6} - \frac{4440^2}{96} - (2403.33 + 1032.58 + 375.58) = 212{,}498.67 - 205{,}350.00 - 3811.49 = 7148.67 - 3811.49 = 3337.18,$$

with 6 degrees of freedom. The sum of squares associated with the variation among plant totals in each of the 16 cells is

$$\frac{124^2 + 129^2 + 67^2}{2} - \frac{320^2}{6} + \cdots + \frac{94^2 + 107^2 + 60^2}{2} - \frac{261^2}{6} = 220{,}936.00 - 212{,}498.67 = 8437.33,$$

with 32 degrees of freedom. The sum of squares attributable to the variation among plants within days is

$$\frac{329^2 + 485^2 + 412^2}{8} - \frac{1226^2}{24} + \cdots + \frac{297^2 + 398^2 + 325^2}{8} - \frac{1020^2}{24} = 209{,}416.75 - 206{,}382.58 = 3034.17,$$

with 8 degrees of freedom.
Subtracting the above sum of squares from that for variation among plant totals in each of the sixteen cells results in the remainder sum of squares, which is a composite sum of squares of plants × treatments (ignoring rows) within columns, thus: 8437.33 − 3034.17 = 5403.16, with 24 degrees of freedom.

The sum of squares of the differences among readings on the same plant is obtained as follows:

$$\left(64^2 + 60^2 - \frac{124^2}{2}\right) + \left(65^2 + 64^2 - \frac{129^2}{2}\right) + \cdots + \left(30^2 + 30^2 - \frac{60^2}{2}\right) = 221{,}004.00 - 220{,}936.00 = 68.00 = \frac{(64 - 60)^2 + \cdots + (30 - 30)^2}{2},$$

with 48 degrees of freedom.

In an experiment designed in this manner, it is possible to test several hypotheses. The experimental error is used to test the variation among treatment means. In this particular case the treatment mean square is less than the experimental error mean square, which indicates somewhat more uniformity among the treatment means than might be expected in a population with error variances of the magnitude found in this experiment. The F ratio of the experimental-error mean square and the mean square associated with the differences among plants within the cells of the 4 × 4 latin square is F = 556.20/263.67 = 2.11, which is slightly lower than the tabulated F value, 2.40, at the 5 per cent level of probability for 6 and 32 degrees of freedom. One could test the hypothesis of no differences among the plant means within days by the F test,

$$F = \frac{379.27}{225.13} = 1.68.$$

The corresponding F value at the 5 per cent point for 8 and 24 degrees of freedom is equal to 2.36. The variation among plant means for each day could be tested in a like manner to determine if any group of three plants may be considered as unusually variable. The variance attributable to the differences between readings on the same plant, 1.42, is extremely small in comparison with the remaining mean squares. The conclusion is that one reading per plant would be sufficient and that more homogeneous groups of plants are required.
If this is impossible, then more plants per cell and more replicates of the treatments are required to obtain standard errors of a mean that are relatively small. The treatment mean square is smaller, but not significantly so, than any of the others (table VI-4). Neither of the mean squares for time of day or day of the week exhibits any unusual variability. If the treatment mean square were larger than the error mean square, the next step in examining the experimental results might be to obtain the linear regression of light intensity and bioelectric potential.

The standard error of a light-intensity mean is

$$s_{\bar{x}} = \sqrt{556.20/(2 \times 3 \times 4)} = 4.81,$$

and the standard error of a difference of two treatment means is

$$s_{\bar{d}} = \sqrt{2(556.20)/(2 \times 3 \times 4)} = 6.81.$$

The efficiency of a latin square of this type relative to a randomized complete block design or to a completely randomized design may be computed from formulae (VI-7), (VI-8), or (VI-9). The coefficient of variation is equal to

$$\frac{\sqrt{556.20/6}}{46.25} = 21 \text{ per cent},$$

which appears to be rather high for experimental work. There might be differences among the treatment, row, or column means, but the experimental material or methods were too variable to allow differences of this size to be detected.

Following the results from the initial analysis of variance, Taylor [288] studied the experimental material and the procedure. He found that plants with high potential readings tended to remain high and vice versa. With this information, the plants were divided into homogeneous groups with regard to magnitude of initial bioelectric potential readings, and the differences among the groups were confounded with day-to-day differences by applying the treatments to a different group each day. By refining his techniques further, it was possible to reduce the magnitude of the experimental error mean square and to detect differences among the treatments.
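The efficiency formulae for a single k × k latin square given earlier in this section can be sketched in a few lines. The function names below are hypothetical, and the degrees-of-freedom correction factor is written as it is read from the formulae in this copy; the mean squares in the usage lines are those of example VI-1 (R = 7.3558, C = 5.7488, E = .8149).

```python
def df_correction(f_latin, f_other):
    """Correction for the difference in error degrees of freedom
    between the latin square (f_latin) and the comparison design (f_other)."""
    return ((f_latin + 1) * (f_other + 3)) / ((f_latin + 3) * (f_other + 1))


def efficiency_vs_rcb(k, ignored_ms, error_ms):
    """Latin square relative to a randomized complete block design.

    Pass the column mean square C when rows are the replicates,
    or the row mean square R when columns are the replicates.
    """
    f_latin = (k - 1) * (k - 2)   # error df of the latin square
    f_rcb = (k - 1) ** 2          # error df of the randomized complete block
    ratio = (ignored_ms + (k - 1) * error_ms) / (k * error_ms)
    return df_correction(f_latin, f_rcb) * ratio


def efficiency_vs_crd(k, row_ms, col_ms, error_ms):
    """Latin square relative to a completely randomized design."""
    f_latin = (k - 1) * (k - 2)
    f_crd = k * (k - 1)           # error df of the completely randomized design
    ratio = (row_ms + col_ms + (k - 1) * error_ms) / ((k + 1) * error_ms)
    return df_correction(f_latin, f_crd) * ratio


# Mean squares of example VI-1 (plants = rows, leaf sizes = columns).
R, C, E = 7.3558, 5.7488, 0.8149
rows_as_reps = efficiency_vs_rcb(5, C, E)
cols_as_reps = efficiency_vs_rcb(5, R, E)
no_blocking = efficiency_vs_crd(5, R, C, E)
print(f"rows as replicates:    {100 * rows_as_reps:.0f} per cent")
print(f"columns as replicates: {100 * cols_as_reps:.0f} per cent")
print(f"completely randomized: {100 * no_blocking:.0f} per cent")
```

Multiplying an efficiency by the number of replicates in the latin square gives the approximate number of replicates the comparison design would need for the same standard error of a mean.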
Another arrangement of the above experiment would be to use three different plants in each of sixteen cells of the 4 × 4 latin square, resulting in a total of forty-eight plants rather than the twelve used. This procedure was considered impractical owing to the amount of time required for setting up the apparatus to obtain readings on differences in bioelectric potentials between the stem of a bean plant and the nutrient solution in which its roots were submerged, but if it had been used, the breakdown of the total degrees of freedom would be

    Source of variation                 Degrees of freedom
    Row (time of day)                    3
    Column (day of week)                 3
    Treatment (light intensities)        3
    Error (experimental)                 6
    Among plants within cells            2 × 16 = 32
    Between readings on same plants      1 × 48 = 48
    Total                               95

VI-1.9 MISSING DATA

VI-1.9.1 Missing experimental units. Allan and Wishart [1] and Yates [316] present the following formula for estimating a missing yield for the ith row, jth column, and hth treatment in a k × k latin square:

$$\hat{X}_{ijh} = \frac{k(X_{i\cdot\cdot} + X_{\cdot j\cdot} + X_{\cdot\cdot h}) - 2X_{\cdot\cdot\cdot}}{(k-1)(k-2)}, \tag{VI-20}$$

where the totals are as defined previously. Yates [316] has also given an iterative method for estimating the yields for several missing values in a k × k latin square. For each missing datum computed, 1 degree of freedom is subtracted from the error degrees of freedom. Bartlett [14] suggests the procedure of inserting a one for the missing value and zeros otherwise and performing a covariance analysis with the zeros and the one as the independent variate (see Chapter XVI). If more than one experimental unit is missing, the same procedure is followed except that a multiple covariance is performed. Nair [216] used Bartlett's [14] method for analyzing the results from a k × k latin square design with several missing values. A paper by DeLury [85] in 1946 summarizes most of the results for handling missing experimental units in latin squares or sets of latin squares.
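The single-missing-value estimate can be illustrated as follows. This is a minimal sketch: the 4 × 4 layout and yields are hypothetical numbers chosen for illustration, the function names are not from the text, and the totals entering the formula are taken over the observed values only.

```python
def estimate_missing(yields, layout):
    """Estimate one missing yield (marked None) in a k x k latin square as
    [k(row + column + treatment totals) - 2(grand total)] / [(k-1)(k-2)]."""
    k = len(yields)
    missing = [(i, j) for i in range(k) for j in range(k) if yields[i][j] is None]
    mi, mj = missing[0]
    trt = layout[mi][mj]

    def known(i, j):
        return 0.0 if yields[i][j] is None else yields[i][j]

    row_total = sum(known(mi, j) for j in range(k))
    col_total = sum(known(i, mj) for i in range(k))
    trt_total = sum(known(i, j) for i in range(k) for j in range(k)
                    if layout[i][j] == trt)
    grand = sum(known(i, j) for i in range(k) for j in range(k))
    est = (k * (row_total + col_total + trt_total) - 2 * grand) / ((k - 1) * (k - 2))
    return (mi, mj), est


def filled(yields, value):
    """Copy of the square with the missing cell replaced by value."""
    return [[value if x is None else x for x in row] for row in yields]


def error_ss(square, layout):
    """Residual sum of squares of a complete latin square."""
    k = len(square)
    correction = sum(map(sum, square)) ** 2 / k ** 2
    total = sum(x * x for row in square for x in row) - correction
    rows = sum(sum(row) ** 2 for row in square) / k - correction
    cols = sum(sum(square[i][j] for i in range(k)) ** 2 for j in range(k)) / k - correction
    trt_totals = {}
    for i in range(k):
        for j in range(k):
            trt_totals[layout[i][j]] = trt_totals.get(layout[i][j], 0.0) + square[i][j]
    trts = sum(t * t for t in trt_totals.values()) / k - correction
    return total - rows - cols - trts


# Hypothetical 4 x 4 latin square with one missing yield.
layout = ["ABCD", "BCDA", "CDAB", "DABC"]
yields = [[12.0, 15.0, 11.0, 14.0],
          [16.0, 10.0, None, 13.0],
          [9.0, 14.0, 12.0, 15.0],
          [13.0, 11.0, 16.0, 10.0]]

position, estimate = estimate_missing(yields, layout)
print("missing cell:", position, "estimate:", round(estimate, 3))
print("error SS at estimate:", round(error_ss(filled(yields, estimate), layout), 3))
```

The estimate is the value that makes the error sum of squares of the completed square a minimum, which is why one degree of freedom is deducted from error for each value estimated in this way.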
The row, column, and treatment mean squares have expectations which are slightly too large. The correct mean squares may be obtained with little additional work [316].

VI-1.9.2 Disproportionate numbers in the experimental unit. If disproportionate numbers of observations per experimental unit are available, one of the approximate methods of analysis described in section V-1.6.2 may suffice. If a more precise analysis is desired, an extension of the methods described in Chapter 11 of Snedecor's Statistical Methods [273] may be used. It will be necessary to estimate treatment regression coefficients as well as those for rows and columns.

VI-1.9.3 Other situations. In some cases the latin square may be designed with k − 1 treatments in the k rows and columns, or one of the treatments may have failed in the experiment. Yates [321] presents the analysis for this situation. Also, if one row or column has been lost or is omitted in the latin square, Yates [321] gives the method of analysis and illustrates it with a numerical example. Yates and Hale [333] give a method of analysis for two or more missing rows, columns, or treatments in a latin square. Their results are illustrated with an example. It should be pointed out that one or more rows, columns, or treatments may be missing either by design or by accident. DeLury [85] generalizes their results to several latin squares, and Smith [269] presents the analysis for missing observations in an incomplete latin square.

If two experimental unit yields are obtained as a single total and not individually, the method described by Bose and Mahalanobis [33] and by Nair [216] may be used to estimate the missing yields.

VI-2 Least Squares Estimates and Expectation of Mean Squares

In the following discussion, it is assumed that treatments, columns, rows, and squares represent random samples from their respective populations.
If any or all of these items are considered to be from a finite population, adjustments in the expectations may be made in the manner described in Chapters IV and V.

VI-2.1 ONE UNIT PER EXPERIMENTAL UNIT

If a single observation is made on each plot of a latin square design, the yield of the ijhth observation may be expressed as

$$X_{ijh} = \mu + \rho_i + \gamma_j + \tau_h + \epsilon_{ijh}, \tag{VI-21}$$

where i, j = 1, 2, ⋯, k; h = 1, 2, ⋯, k appears once in a row and once in a column; $\mu$ represents the population mean, $\rho_i$ an effect common to the ith row, $\gamma_j$ an effect common to the jth column, $\tau_h$ an effect common to the hth treatment, and $\epsilon_{ijh}$ an effect common to the ijhth observation; it is assumed that the yield of any observation is expressible as the sum of several independent linear effects.

VI-2.1.1 Least squares estimates. The least squares estimates of the 3k + 1 parameters, $\mu, \rho_1, \cdots, \rho_k, \gamma_1, \cdots, \gamma_k, \tau_1, \cdots, \tau_k$, are obtained by partial differentiation of the residual sum of squares with respect to the 3k + 1 parameters, by setting the resulting equations equal to zero, and by solving for the set of estimates. The residual sum of squares after fitting the 3k + 1 constants is

$$\sum (X_{ijh} - \hat{\mu} - r_i - c_j - t_h)^2, \tag{VI-22}$$

and the normal equations after differentiation are

$$X_{\cdot\cdot\cdot} = k\sum r_i + k\sum c_j + k\sum t_h + k^2\hat{\mu}, \tag{VI-23}$$

$$X_{i\cdot\cdot} = k r_i + \sum c_j + \sum t_h + k\hat{\mu}, \tag{VI-24}$$

$$X_{\cdot j\cdot} = \sum r_i + k c_j + \sum t_h + k\hat{\mu}, \tag{VI-25}$$

and

$$X_{\cdot\cdot h} = \sum r_i + \sum c_j + k t_h + k\hat{\mu}. \tag{VI-26}$$

The $r_i$, $c_j$, $t_h$, and $\hat{\mu}$ are the estimates which make the residual sum of squares a minimum. Now in order to obtain a unique solution the following restrictions are necessary:

$$\sum r_i = \sum c_j = \sum t_h = 0. \tag{VI-27}$$

With the above conditions, then,

$$\hat{\mu} = X_{\cdot\cdot\cdot}/k^2 = \bar{x}, \tag{VI-28}$$

$$r_i = X_{i\cdot\cdot}/k - \hat{\mu} = \bar{x}_{i\cdot\cdot} - \bar{x}, \tag{VI-29}$$

$$c_j = X_{\cdot j\cdot}/k - \hat{\mu} = \bar{x}_{\cdot j\cdot} - \bar{x}, \tag{VI-30}$$

and

$$t_h = X_{\cdot\cdot h}/k - \hat{\mu} = \bar{x}_{\cdot\cdot h} - \bar{x}. \tag{VI-31}$$

The variances of the least squares estimates may be obtained as before (Chapters IV and V).

VI-2.1.2 Expectation of mean squares (Model II).
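The sum of squares due to $\hat{\mu}$ is the correction term and has the expectation:

$$E\left[\frac{X_{\cdot\cdot\cdot}^2}{k^2}\right] = E\left[\frac{\left(\sum (\mu + \rho_i + \gamma_j + \tau_h + \epsilon_{ijh})\right)^2}{k^2}\right] = k^2\mu^2 + k\sigma_\rho^2 + k\sigma_\gamma^2 + k\sigma_\tau^2 + \sigma_\epsilon^2. \tag{VI-32}$$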
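The intermediate steps behind this expectation can be written out as below. This is a sketch under the Model II assumptions stated for this section: all effects are independent, with zero means and variances $\sigma_\rho^2$, $\sigma_\gamma^2$, $\sigma_\tau^2$, and $\sigma_\epsilon^2$. The symbol $\gamma_j$ for the column effect is an assumed reading of the notation.

```latex
% Grand total in terms of the model: each row, column, and treatment
% effect appears k times among the k^2 observations.
X_{\cdot\cdot\cdot} = k^2\mu + k\sum_i \rho_i + k\sum_j \gamma_j
                      + k\sum_h \tau_h + \sum \epsilon_{ijh}

% Squaring and taking expectations; cross products vanish because the
% effects are independent with zero means.
E\left[X_{\cdot\cdot\cdot}^2\right]
  = k^4\mu^2 + k^2(k\sigma_\rho^2) + k^2(k\sigma_\gamma^2)
    + k^2(k\sigma_\tau^2) + k^2\sigma_\epsilon^2

% Dividing by k^2 gives the expectation of the correction term.
E\left[\frac{X_{\cdot\cdot\cdot}^2}{k^2}\right]
  = k^2\mu^2 + k\sigma_\rho^2 + k\sigma_\gamma^2
    + k\sigma_\tau^2 + \sigma_\epsilon^2
```

Each variance component enters with coefficient k because a sum of k independent effects, each multiplied by k, has variance $k^3\sigma^2$ before division by $k^2$.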
