Institute of Mathematical Statistics

This document reviews various criteria that have been proposed for identifying outliers in a sample from a normal population that may be contaminated. The criteria can be grouped into those that assume knowledge of the population variance versus those that do not. Criteria in the first group include $\chi^2$ tests and measures of extreme deviation. Criteria in the second group that rely only on sample information include modified F tests and measures based on the sample range. The document aims to evaluate the performance of these criteria at discovering different types of contamination and potential biases they may introduce.

Analysis of Extreme Values

Author(s): W. J. Dixon
Reviewed work(s):
Source: The Annals of Mathematical Statistics, Vol. 21, No. 4 (Dec., 1950), pp. 488-506
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/2236602 .
Accessed: 24/12/2012 01:12

ANALYSIS OF EXTREME VALUES
BY W. J. DIXON¹
University of Oregon
1. Introduction. It is well recognized by those who collect or analyze data
that values occur in a sample of n observations which are so far removed from
the remaining values that the analyst is not willing to believe that these values
have come from the same population. Many times values occur which are
"dubious" in the eyes of the analyst and he feels that he should make a decision as
to whether to accept or reject these values as part of his sample. On the other
hand he may not be looking for an error, but may wish to recognize a situation
when an occasional observation occurs which is from a different population.
He may wish to discover whether a significant analysis of variance indicates an
extreme value significantly different from the remainder. Also, of course, the
extreme value may differ significantly without causing a significant analysis
of variance and he may wish to discover this. It is reasonable to suppose that a
criterion for rejecting observations would be useful here also. The choice of a
suitable criterion for rejecting observations introduces a number of questions.
1. Should any observations be removed if we wish a representative sample in-
cluding whatever contamination arises naturally? In other words, it may be
desirable to describe the population including all observations, for only in that
way do we describe what is actually happening.
2. If the analyst wishes to sample the population unaffected by contamination
he must either remove the contaminating items or employ statistical procedures
which reduce to a minimum the effect of the contamination on the estimates of
the population. That is, he may wish to describe only 95% of his population
if the description is altered radically by the remaining 5% of the observations.
He may have external reasons which are good and sufficient for wishing to de-
scribe only 95% of his observations. Suppose he wishes to use the sample for a
statistical inference; the inclusion of all the data may sufficiently violate the
assumptions underlying the inference to exclude the possibility of making a valid
inference.
This paper will concern itself only with those problems which arise from
Question 2.
If we wish to follow some procedure which attempts to remove contamination
we must consider the performance of any proposed criterion with respect to the
proportion of contamination the criterion will discover and, of course, the
proportion of the "good" observations which are removed by the use of the criterion.
But, perhaps more important, we must consider what sort of bias will result
when the standard statistical procedures are applied to samples of observations
which have been processed in this manner.

¹This paper was prepared under a contract with the Office of Naval Research.

If we wish to follow a procedure which will not search for particular values to
be excluded but will minimize their effect if present, we must investigate the
sampling distributions of these modified statistics and estimate the loss in in-
formation resulting from their use when all observations are "good." We must
also investigate the expected bias which will result when "bad" items are present
even though essentially excluded. Perhaps most disturbing about the avoidance
of "bad" items is the fact that a decision must still be made as to whether a
"bad" item was present or not in order to know in which way our estimates may
be biased. For example, a sample mean computed by avoiding the two end ob-
servations will not be a biased estimate of the mean of a symmetric population
if both end items should actually be included or if both end items should not be
included. However, if only one of the two should not be included this estimate of
the mean will be biased.

2. Models of contamination. The performance of the various criteria for
discovery of one or more contaminators will be measured with reference to
contaminations of the following two types entering into samples of observations
from a normal population with mean $\mu$ and variance $\sigma^2$, $N(\mu, \sigma^2)$:
A. One or more observations from $N(\mu + \lambda\sigma, \sigma^2)$.
B. One or more observations from $N(\mu, \lambda^2\sigma^2)$.

A represents the occurrence of an "error" in mean value such as will occur in
dial readings when errors are made in reading incorrectly digits other than the
last one or two digits. Errors of this sort may result from momentary shifts in
line voltage or from the inclusion among a group of objects of one or two items
of completely different origin. This type of contamination will be referred to as
"location error." B represents the occurrence of an "error" from a population
with the same mean but with a greater variance than the remainder of the sample.
This type of error will be referred to as a "scalar error." It is likely that many
errors could be better described as a combination of A and B, but a study of these
two errors separately should throw considerable light on the question of "gross
errors" or "blunders."
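The two contamination models can be sketched in a few lines of Python. This is only an illustration and not part of the original study: the function name `contaminated_sample`, its parameters, and the choice to flag each value as a contaminator or not are assumptions made for the example.

```python
import random


def contaminated_sample(n, k, lam, kind, mu=0.0, sigma=1.0, rng=None):
    """Return n (value, is_contaminator) pairs, k of which are contaminators.

    kind="location": contaminators come from N(mu + lam*sigma, sigma^2)  (model A)
    kind="scalar":   contaminators come from N(mu, lam^2 * sigma^2)      (model B)
    All names and defaults here are hypothetical choices for illustration.
    """
    if kind not in ("location", "scalar"):
        raise ValueError("kind must be 'location' or 'scalar'")
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if rng is None:
        rng = random.Random()

    # The n - k "good" observations come from N(mu, sigma^2).
    sample = [(rng.gauss(mu, sigma), False) for _ in range(n - k)]
    for _ in range(k):
        if kind == "location":
            value = rng.gauss(mu + lam * sigma, sigma)  # shifted mean, same spread
        else:
            value = rng.gauss(mu, lam * sigma)          # same mean, inflated spread
        sample.append((value, True))
    return sample
```

Keeping the contaminator flag next to each value makes it possible to check later whether a rejected extreme really came from the contaminating population.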
Many authors have written on the subject of the rejection of outlying observa-
tions. Apparently none have been successful in obtaining a general solution to
the problem. Nor has there been success in the development of a criterion for
discovery of outliers by means of a general statistical theory; e.g., maximum
likelihood. A large number of criteria have been advanced on more or less intui-
tive grounds as appropriate criteria for this purpose. In no case was investigation
made of the performance of these criteria except for a few illustrative examples.
References for the criteria discussed in the next section are given at the end
of this paper. Indications are given as to the significance values available in
those papers.


3. Criteria to be considered. The performance of two types of criteria has
been investigated for samples contaminated with location or scalar errors:
a) $\sigma$ known or estimated independently,
b) $\sigma$ unknown.
The n observations are ordered $x_1 < x_2 < \cdots < x_n$. The criteria involving
external knowledge of $\sigma$ are:

A. $\chi^2$ test,
$\chi^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / \sigma^2$.

B. Extreme deviation,
$B_1 = (x_n - \bar{x})/\sigma$ (or $(\bar{x} - x_1)/\sigma$),
$B_2 = (x_n - x_{n-1})/\sigma$ (or $(x_2 - x_1)/\sigma$).

C. Range,
$C_1 = w/\sigma = (x_n - x_1)/\sigma$,
$C_2 = w/s = (x_n - x_1)/s$ (s independently estimated).

The criteria involving only the information of a single sample of n observations
are:

D. Modified F test.
1. For single outlier $x_1$,
$D_1 = S_1^2 / S^2$, where $S_1^2 = \sum_{i=2}^{n} (x_i - \bar{x}_1)^2$, $\bar{x}_1 = \sum_{i=2}^{n} x_i/(n-1)$,
$S^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$, $\bar{x} = \sum_{i=1}^{n} x_i/n$
(or for $x_n$, $D_1 = S_n^2 / S^2$).
2. For double outliers $x_1, x_2$,
$D_2 = S_{1,2}^2 / S^2$, where $S_{1,2}^2 = \sum_{i=3}^{n} (x_i - \bar{x}_{1,2})^2$, $\bar{x}_{1,2} = \sum_{i=3}^{n} x_i/(n-2)$
(or for $x_n, x_{n-1}$, $D_2 = S_{n-1,n}^2 / S^2$).

E. Ratios of ranges and subranges.
1. For single outlier $x_1$,
$r_{10} = (x_2 - x_1)/(x_n - x_1)$
(or for $x_n$, $r_{10} = (x_n - x_{n-1})/(x_n - x_1)$).
2. For single outlier $x_1$ avoiding $x_n$,
$r_{11} = (x_2 - x_1)/(x_{n-1} - x_1)$
(or for $x_n$ avoiding $x_1$, $r_{11} = (x_n - x_{n-1})/(x_n - x_2)$).
3. For single outlier $x_1$ avoiding $x_n, x_{n-1}$,
$r_{12} = (x_2 - x_1)/(x_{n-2} - x_1)$
(or for $x_n$ avoiding $x_1, x_2$, $r_{12} = (x_n - x_{n-1})/(x_n - x_3)$).
4. For outlier $x_1$ avoiding $x_2$,
$r_{20} = (x_3 - x_1)/(x_n - x_1)$
(or for $x_n$ avoiding $x_{n-1}$, $r_{20} = (x_n - x_{n-2})/(x_n - x_1)$).
5. For outlier $x_1$ avoiding $x_2$ and $x_n$,
$r_{21} = (x_3 - x_1)/(x_{n-1} - x_1)$
(or for $x_n$ avoiding $x_{n-1}, x_1$, $r_{21} = (x_n - x_{n-2})/(x_n - x_2)$).
6. For outlier $x_1$ avoiding $x_2$ and $x_n, x_{n-1}$,
$r_{22} = (x_3 - x_1)/(x_{n-2} - x_1)$
(or for $x_n$ avoiding $x_{n-1}, x_1, x_2$, $r_{22} = (x_n - x_{n-2})/(x_n - x_3)$).

F. Extreme deviation and standard deviation.
For single outlier $x_n$,
$F = (x_n - \bar{x})/s$ (or for $x_1$, $F = (\bar{x} - x_1)/s$).

The performance of the large number of criteria listed here will be assessed
with respect to discovery of contamination of the type given in Section 2.
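As an illustration of the criteria listed in this section, the r ratios of group E and the criteria A, B1, B2 and C1 (upper-tail forms) can be computed as below. This is a sketch under stated assumptions: the function names `dixon_r` and `sigma_known_criteria` are hypothetical, the formulas follow the usual form of Dixon's ratios, and the requirement n >= i + j + 2 is an assumption chosen so that numerator and denominator are not the same difference.

```python
def dixon_r(xs, i, j, upper=True):
    """Ratio r_ij for the largest (upper=True) or smallest observation.

    i = 1 uses the gap to the nearest neighbour, i = 2 skips that neighbour;
    j = 0, 1, 2 is the number of observations avoided at the opposite end.
    """
    if i not in (1, 2) or j not in (0, 1, 2):
        raise ValueError("i must be 1 or 2 and j must be 0, 1 or 2")
    x = sorted(xs)
    n = len(x)
    if n < i + j + 2:
        raise ValueError("sample too small for this ratio")
    if upper:
        # Mirror the sample so the suspected largest value becomes the smallest.
        x = [-v for v in reversed(x)]
    numerator = x[i] - x[0]            # x_{i+1} - x_1 in the paper's indexing
    denominator = x[n - 1 - j] - x[0]  # x_{n-j} - x_1
    return numerator / denominator


def sigma_known_criteria(xs, sigma):
    """Criteria A, B1, B2, C1 for the largest observation, sigma known."""
    x = sorted(xs)
    mean = sum(x) / len(x)
    return {
        "chi2": sum((v - mean) ** 2 for v in x) / sigma ** 2,
        "B1": (x[-1] - mean) / sigma,
        "B2": (x[-1] - x[-2]) / sigma,
        "C1": (x[-1] - x[0]) / sigma,
    }
```

For the sample 1, 2, 3, 4, 10 the upper-tail ratio `dixon_r(xs, 1, 0)` is (10 - 4)/(10 - 1) = 2/3. Each statistic would then be compared with a tabulated percentage point, which this sketch does not supply.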


4. Performance of criteria (estimate of $\sigma$ available). The $\chi^2$ test will of course
give an indication of a large dispersion and since the extreme values are chief
contributors to the sum of squares, it is possible to use this test as a criterion for
rejecting a value or values which are at the greatest distance from the mean.
It might be supposed that B1 and B2 would give better results since particular
attention is paid to the end item. The same argument would influence one in
favor of C1 or C2. The performance of C2 can, of course, be expected to vary with
the degrees of freedom in the independent estimate of $\sigma$. For this study the
degrees of freedom for this estimate were held to the single value 9 d.f.
$\chi^2$ may be used since if the value of $\chi^2$ is too large (greater than some upper
percentage point for $\chi^2$) we might reject the value most distant from the mean.
$\chi^2$ tables may be used for percentage points. Percentage points for the other
statistics considered here are given in the references at the end of this paper.
The criteria A, B1, B2, C1, C2 were investigated for $\alpha$ = 1%, 5% and 10%
for $\lambda$ = 2, 3, 5, 7, where one or more items are selected from a population
$N(\mu + \lambda\sigma, \sigma^2)$ and the remainder from $N(\mu, \sigma^2)$. Investigations were also
made for one item from $N(\mu, \lambda^2\sigma^2)$ for $\lambda$ = 2, 4, 8, 12. The investigation was
carried out by sampling methods. The performances of different criteria were
assessed for the same group of samples in order to obtain more precision in the
comparison of the different tests. All of the points appearing on the graphs in the
subsequent sections of this paper were based on from 66 to 200 determinations.
The performance of the above criteria is measured by computing the propor-
tion of the time the contaminating distribution provides an extreme value and
the test discovers this value. Of course, performance could be measured by the
proportion of the time the test gives a significant value when a member of the
contaminating population is present in the sample, even though not at an ex-
treme. However, since it is assumed that discovery of an outlier will frequently
be followed by the rejection of an extreme, we shall consider discovery a success
only when the extreme value is from the contaminating distribution.
The performance was judged by applying the criteria to each sample, always
suspecting an outlier in the direction of the shifted mean for location error.
Since the location errors were inserted by adding a fixed value to one or more
of the observations, the largest value was tested as an outlier. The measure of
performance was the percentage of location errors identified. When the location
error was not an outlier, no test was performed and a failure for the test recorded.
In the case of the model of contamination involving the scalar error, the value
was suspected which was farthest from the mean. This of course, alters somewhat
the level of significance, but this procedure was followed alike for all criteria
investigated. The performance was measured in the same fashion as for location
errors.
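The performance measure described above, for a single location error and criterion B1, might be estimated along the following lines. This is a rough sketch rather than the procedure of the paper: the critical value is itself estimated by simulation from uncontaminated samples (the paper relies on tabulated percentage points), and the function names, trial counts and seed are assumptions made for the example.

```python
import random


def b1_upper(values, sigma=1.0):
    """B1 = (largest value - sample mean) / sigma, with sigma known."""
    return (max(values) - sum(values) / len(values)) / sigma


def null_critical_value(n, alpha, trials, rng):
    """Simulated upper 100*alpha% point of B1 for clean N(0, 1) samples."""
    stats = sorted(
        b1_upper([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(trials)
    )
    return stats[min(trials - 1, int((1 - alpha) * trials))]


def discovery_rate(n, lam, alpha=0.05, trials=2000, seed=0):
    """Fraction of samples in which the single location error is discovered.

    A success is counted only when the contaminator is the largest value AND
    B1 exceeds the critical value; if the contaminator is not the extreme,
    the sample counts as a failure, as in the text.
    """
    rng = random.Random(seed)
    critical = null_critical_value(n, alpha, trials, rng)
    hits = 0
    for _ in range(trials):
        good = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
        bad = rng.gauss(lam, 1.0)  # one observation from N(mu + lam*sigma, sigma^2)
        values = good + [bad]
        if max(values) == bad and b1_upper(values) > critical:
            hits += 1
    return hits / trials
```

Calling `discovery_rate(5, lam)` for several values of `lam` traces out one simulated performance curve of the kind plotted in the figures.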
Considering first location errors, a study of the performance curves showing
the per cent discovery of contaminators plotted against $\lambda$ (the number of standard
deviation units the population of contaminators is removed from the remainder)
shows that the level of performance for $\sigma$ known is considerably above the level
of performance when $\sigma$ is not known. The difference is greater for n = 5 than
for n = 15 and, of course, the difference will diminish as the sample size increases.
Figure 1 shows the performance curves for $\alpha$ = 5% (5% significance level for
the test for an outlier) of $B_1 = (x_n - \bar{x})/\sigma$ for n = 5 and n = 15 and of
$r_{10} = (x_n - x_{n-1})/(x_n - x_1)$ for n = 5 and n = 15.
The graphs for $\alpha$ = 1% and 10% would be similar in appearance. Figure 2
indicates the change in performance for $\alpha$ = 1%, 5%, and 10%. The curves
plotted are for $B_1 = (x_n - \bar{x})/\sigma$. The curves for A, B2, C1, C2 show very similar
results.
The curve for test B1 was used in Figures 1 and 2 since it gives the best
performance of all criteria which are considered here if a single location error is
present. The curves showing the comparative performance of these criteria as
well as one to be considered later (r10) are given in Figure 3 for $\alpha$ = 5% and for
n = 5 and n = 15.

FIG. 1. Improvement in performance obtained with knowledge of $\sigma$, $\alpha$ = 5%, n = 5, 15.
FIG. 2. The effect of the level of significance on the performance of B1; $\alpha$ = 1%, 5%, 10%; n = 5, 15.

The following statements can be made from inspection of Figure 3:
a) The differences among A, B1, B2, and C1 are not great.
b) The knowledge of $\sigma$ is less important in larger samples.
c) The curve for C2 lies above that of r10 for n = 5 and below that of r10 for
n = 15. This is consistent with the use of 9 d.f. in the independent estimate
of $\sigma$.
If the question of ease in computation or application is important, it may be
desirable to use B2 or C1 in place of B1 for they are slightly easier to compute
and it is not necessary to measure all observations to obtain the value of these
statistics. From Figure 3 it will be noted that the performances of these criteria
are nearly as good as for B1. If two outliers may be expected in a single sample,
the performance of B2 will be lowered and the performance of B1 and C1 will be
improved. Any difference between the performance of B1 and the performance
of C1 when two outliers are present was not discernible for n = 5 or 15. Figure 4
illustrates the improvement in performance for B1 for $\alpha$ = 5% and n = 15.

FIG. 3. Comparison of the performance of criteria using $\sigma$ known (or using external
estimates of $\sigma$) and r10 for samples of size 5 and 15, $\alpha$ = 5%.

The performance curves of these criteria if a scalar error is present are very
similar to those above except that:
1. A high level of performance is approached very slowly. For example, see
Figure 5 showing the performance of B1 and r10 for n = 5 and n = 15 and $\alpha$ = 5%.
2. There is a smaller difference in the performance between the criteria with
$\sigma$ known and $\sigma$ unknown (see Figure 5).
The performance of B1 and C1 is noticeably increased by the introduction
of more contaminators while that of B2 decreases. No difference in the
performance of B1 and C1 was noted for either n = 5 or n = 15. Figure 6 shows the
increase in performance with two contaminators for B1 for n = 15, $\alpha$ = 5%.

FIG. 4. Comparison of the performance of B1 for one and two location errors in samples
of size 15, $\alpha$ = 5%.

The general recommendations for possibilities of either type of contamination,
location or scalar errors, would lead one to the use of B1 or C1 if $\sigma$ is known.
Criterion C1 is recommended since:
1. Its performance is almost as good as the performance of B1 for a single
outlier. Their performances are about equal for two outliers and C1 affords
protection for outliers either above or below the mean.
2. It is simple to compute.
If ease of computation is not essential and maximum performance is desired,
the criterion B1 should be used. The performance of C2 will approach that of
B1 as the number of degrees of freedom in the denominator increases.

FIG. 5. Comparison of the performance of B1 and r10 for one scalar error for samples of
size 5 and 15, $\alpha$ = 5%.
FIG. 6. Comparison of the performance of B1 for one and two scalar errors in samples of
size 15, $\alpha$ = 5%.

5. Performance of criteria (no external estimate of $\sigma$). Criteria D1 and D2
have strong intuitive reasons for their use since the dispersion is estimated by
$s^2$. The r ratios are attractive because of their simplicity and their preoccupation
with the extreme values. Test F is the "studentized" ratio corresponding to B1,
and is equivalent to D1 since $D_1 = 1 - F^2/(n-1)$. There is no apparent
difference in the performance of D1 and r10 when one outlier is present and no
apparent difference in D2 and r20 when two outliers are present. This is true for
both models of contamination and for the three levels of significance investigated.
However the comparison of D2 and r20 was made only for n = 5 since critical
values are not available² for D2 for n = 15. (Critical values are available for
n < 12.)
The performance of D1 and r10 under the two models of contamination can
be obtained by reference to the curve for r10 in Figure 1 and Figure 5. The curve
for D1 is practically identical with the curve for r10.
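The stated equivalence between F and D1 can be checked numerically. The sketch below assumes that s in F is computed with divisor n, i.e. $s^2 = \sum (x_i - \bar{x})^2 / n$; the text does not state the divisor, and the identity $D_1 = 1 - F^2/(n-1)$ holds exactly only under that convention. The function names are hypothetical.

```python
import math


def d_upper(xs, k=1):
    """Modified F statistic for the k largest observations.

    k = 1 gives D1 = S_n^2 / S^2, k = 2 gives D2 = S_{n-1,n}^2 / S^2.
    """
    x = sorted(xs)
    mean_all = sum(x) / len(x)
    ss_all = sum((v - mean_all) ** 2 for v in x)
    rest = x[: len(x) - k]  # drop the k suspected values
    mean_rest = sum(rest) / len(rest)
    ss_rest = sum((v - mean_rest) ** 2 for v in rest)
    return ss_rest / ss_all


def f_upper(xs):
    """F = (x_n - mean) / s, with s^2 using divisor n (an assumption)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in xs) / n)
    return (max(xs) - mean) / s


xs = [1.0, 2.0, 3.0, 4.0, 10.0]
n = len(xs)
identity_gap = d_upper(xs, 1) - (1 - f_upper(xs) ** 2 / (n - 1))
```

For this sample D1 = 5/50 = 0.1 and F^2 = 3.6, so 1 - 3.6/4 = 0.1 and `identity_gap` is zero up to rounding. Because D1 is a decreasing function of F, small values of D1 correspond to large values of F.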

²After this paper was submitted, the critical values of D2 have been extended to n < 20
(see references).


There is no question that r10 is simpler to use, so that if this condition of
contamination (scalar errors) exists, r10 would probably be chosen. However, as
before, we should investigate what happens when more than one error is present.
D2 is designed for this case as is r20. Since the performance of these two criteria
is approximately the same, r20 would probably be chosen because of its simplicity.
Critical values for this statistic are available for n < 30.
r11, r12, r20, r21, r22 were designed for use in situations where additional
outliers may occur and we wish to minimize the effect of these outliers on the
investigation of the particular value being tested.
It has been suggested that D1 could be used repeatedly to remove more than
one outlier from a sample. This procedure cannot be recommended since the
presence of additional outliers handicaps the performance of both D1 and r10
for small sample sizes and therefore the process of rejection might never get
started. For larger sample sizes the performance of D1 is affected much less by
the presence of two errors than is the performance of r10. The repetitive use of
D1 is not recommended in this case either since r20 performs in a superior
manner to D1 in such situations. This difference in performance of D1 and r10
depends markedly on the level of significance used as well as the sample size.
For small samples there is little difference in performance for any of the levels
of significance one might use. For the larger sample sizes there is no appreciable
difference for very high levels of significance. The difference is however very
great for lower levels of significance. In fact as $\lambda$ increases for two errors of the
location type, the level of significance which divides the region of approach to
zero performance from the region of approach to perfect performance of D1 is
given by the level of significance corresponding to a significance value of
$\frac{1}{2}\left(\frac{n}{n-1}\right)$ for D1. Thus, for example, in samples of size 15,
$\frac{1}{2}\left(\frac{15}{14}\right) = .536$.
This value lies between the values for the 2.5% and 5% level of significance.
These values are .503 and .556 respectively. Therefore the use of the 1% or
2.5% levels will give poorer and poorer performance as $\lambda$ increases, and the
use of the 5% or 10% levels will give better and better performance as $\lambda$ increases
when two errors are present. The dividing point is such that for samples of
size 11 or less the use of any of the given levels of significance will cause the
performance to decrease as $\lambda$ increases. For samples of size n < 14 the 1%,
2.5% and 5% levels have the same effect, and for samples of size n < 16 the 1%
and 2.5%, for samples of size n < 19 just the 1% level. For three such errors
the limit approached by D1 as $\lambda$ increases is $\frac{2}{3}\left(\frac{n}{n-1}\right)$. Therefore, the
performance of D1 will approach zero for all levels of significance and for all sample
sizes for which critical values are known except the 10% level of significance
for sample sizes larger than 21. An indication of these limiting values
$\frac{k-1}{k}\left(\frac{n}{n-1}\right)$
for k contaminations present can be obtained by considering these k values to
be at a distance $\lambda\sigma$ from the population mean, computing D1 and allowing $\lambda$ to
increase indefinitely.

FIG. 7. Comparison of the performance of the r criteria for one location error in
samples of size 5, $\alpha$ = 5%.
FIG. 8. Comparison of the performance of the r criteria for one scalar error in samples
of size 5, $\alpha$ = 5%.
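The limiting value of D1 with k location errors can be checked with a short computation. The closed form used below, (k - 1)/k times n/(n - 1), is a reconstruction that agrees with the value .536 quoted for n = 15 and two errors; the function names are hypothetical. Placing the n - k good values exactly at the mean and the k contaminators at a common distance reproduces the limit, since only the ratio of the two sums of squares matters.

```python
def d1_limit(n, k):
    """Limiting value of D1 as lambda grows, with k location errors present."""
    return (k - 1) / k * n / (n - 1)


def d1_upper(xs):
    """D1 = S_n^2 / S^2 for the largest observation."""
    x = sorted(xs)
    mean_all = sum(x) / len(x)
    ss_all = sum((v - mean_all) ** 2 for v in x)
    rest = x[:-1]
    mean_rest = sum(rest) / len(rest)
    ss_rest = sum((v - mean_rest) ** 2 for v in rest)
    return ss_rest / ss_all


# Idealized sample: 13 good values at the mean, 2 contaminators far away.
idealized = [0.0] * 13 + [1.0] * 2
two_error_limit = d1_limit(15, 2)     # 15/28, about .536
three_error_limit = d1_limit(15, 3)   # 30/42, about .714

# Critical values for n = 15 quoted in the text (small D1 is significant).
crit_2_5, crit_5 = 0.503, 0.556
significant_at_5 = two_error_limit < crit_5       # limit falls in the 5% region
significant_at_2_5 = two_error_limit < crit_2_5   # limit misses the 2.5% region
```

Since small values of D1 are significant, a limit of .536 is significant at the 5% level (critical value .556) but not at the 2.5% level (critical value .503), which is the dividing behaviour described in the text.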
The comparative performance of the r criteria, a = 5%, in samples of size 5
for the two models of contamination (one contaminator present) are given in
Figures 7 and 8. For samples of size 15 the curves are given in Figures 9 and 10.
A single curve suffices here since there is no discernable difference in the curves
for the different r criteria. There is considerable difference in the performance
curves if more than one outlier is present. However, the performances of r10,
r11, r12 are essentially the same when two location outliers are present as are
the performances of r20, r21, r22. Figures 11 and 12 show the comparative
performance of r10, r11, r12 for one and two contaminators for $\alpha$ = 5% and n = 5.
Figures 13 and 14 are for n = 15. Figures 15 and 16 show the comparative
performance for r20, r21 (r22 is not a test for n = 5) for one and two contaminators
for $\alpha$ = 5% and n = 5. Figures 17 and 18 are for r20, r21, r22 for n = 15. The
six curves represented by the single curve of Figure 17 lie within 5% of the
curve shown. The same is true of the three curves represented by each of the
two curves of Figure 18.

FIG. 9. Performance of the r criteria for one location error in samples of size 15, $\alpha$ = 5%.
FIG. 10. Performance of the r criteria for one scalar error in samples of size 15, $\alpha$ = 5%.
FIG. 11. Comparison of the performance of the $r_{1\cdot}$ criteria for one and two location
errors in samples of size 5, $\alpha$ = 5%.
FIG. 12. Comparison of the performance of the $r_{1\cdot}$ criteria for one and two scalar
errors in samples of size 5, $\alpha$ = 5%.

Since no loss in performance results for larger samples from the use of r20,
r21, r22 in place of r10, r11, r12, and further, these criteria are not appreciably
affected by the presence of another outlier, it would seem unwise to recommend
the use of r10, r11, r12. However, note that for small samples (see Figures 11 and
12) the performances of r10 and r11 and r12 are considerably better when a single
outlier is present. Therefore in larger (n > 10) samples r20 or r21 would appear
to be the best criteria. In samples of size 10 or less, r10 or r20 should be used;
r21 if the extreme value at the opposite end should be avoided.

FIG. 13. Comparison of the performance of the $r_{1\cdot}$ criteria for one and two location
errors in samples of size 15, $\alpha$ = 5%.
FIG. 14. Comparison of the performance of the $r_{1\cdot}$ criteria for one and two scalar
errors in samples of size 15, $\alpha$ = 5%.
FIG. 15. Comparison of the performance of the $r_{2\cdot}$ criteria for one and two location
errors in samples of size 5, $\alpha$ = 5%.
FIG. 16. Comparison of the performance of the $r_{2\cdot}$ criteria for one and two scalar
errors in samples of size 5, $\alpha$ = 5%.

It should be noted in the comparisons that no model of contamination was
investigated which would cause one or more errors at both extremes in the
sample. It is obvious that the performance of D1 and D2 would be considerably
decreased while the performance of r11, r12, and r21, r22 would not be materially
affected since these criteria avoid values at the opposite extreme. Their repeated
use might discover most of such outliers, while D1 or D2 might fail on the first
trial.

FIG. 17. Comparison of the performance of the $r_{2\cdot}$ criteria for one and two location
errors in samples of size 15, $\alpha$ = 5%.
FIG. 18. Comparison of the performance of the $r_{2\cdot}$ criteria for one and two scalar
errors in samples of size 15, $\alpha$ = 5%.
FIG. 19. Performance of B1 for various levels of significance when the population is 10%
contaminated with location errors (panels: n = 5, n = 15).

6. Sampling from a contaminated population. In the previous sections the
performance of the various criteria was assessed for samples where a certain
number of contaminators were present. One might well ask why a test is needed
if it is known that contaminators are present. It would seem more realistic to
state that a certain per cent of contamination will occur in the long run and
that one will not know in any particular case whether 0, 1, 2, . . . contaminators
will be present. One would then wish a criterion to indicate the presence of
contamination in a particular sample.
The performances of these criteria will be investigated for the same two
models of contamination and their performances will be reported as per cent of
total contamination discovered. The tests will be applied only once to each
sample. Repeated use of the criterion would in many cases increase the per cent
of total contamination discovered. It is not known what effect such a procedure
would have on the level of significance.

FIG. 20. Performance of B1 for various levels of significance when the population is 10%
contaminated with scalar errors (panels: n = 5, n = 15).
FIG. 21. Performance of B1 for various levels of contamination for location errors and
using the 5% level of significance (panels: n = 5, n = 15).

Investigation has been made for 5, 10, and 20% contamination. For example,
in samples of size 5 which have 10% contamination, on the average, 59.0% of
the samples will contain no "errors", 32.8% will contain one, 7.3% two, 0.8%
three, 0.1% four, and 0.0% five. Thus in 100 samples of 5 which are 10%
contaminated with location errors having mean $\mu + 5\sigma$, about 59 contain no errors.
If the r10 criterion is used with a 5% level of significance one value will be
"discovered" in 3.0 of the samples containing no errors. Of the 33 samples containing
one "error" the "error" would be discovered in 18 of these samples. This criterion
would discover none of the "errors" in samples containing more than one
"error". We would have obtained 18 of the 50 contaminating values and 3 which
were members of the original population.

FIG. 22. Performance of B1 for various levels of contamination for scalar errors and
using the 5% level of significance (panels: n = 5, n = 15).
FIG. 23. Performance of r10, D1, r20, D2 in samples of size 5 using the 5% level of
significance and sampling from a population which is 10% contaminated (panels: Location, Scalar).
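The bookkeeping in this example follows from the binomial distribution of the number of contaminators per sample. A minimal sketch, in which the function name `errors_per_sample` is hypothetical and each observation is assumed to be contaminated independently with probability 0.10:

```python
from math import comb


def errors_per_sample(n, p):
    """Probability of 0, 1, ..., n contaminators in a sample of size n."""
    return [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]


probs = errors_per_sample(5, 0.10)
percent = [round(100 * q, 1) for q in probs]

samples = 100
clean_samples = samples * probs[0]           # about 59 samples with no error
one_error_samples = samples * probs[1]       # about 33 samples with one error
total_contaminators = samples * 5 * 0.10     # 50 contaminating values in all
false_discoveries = 0.05 * clean_samples     # about 3 at the 5% level
```

The first four percentages come out as 59.0, 32.8, 7.3 and 0.8, matching the figures quoted in the text, and 5% of the roughly 59 clean samples gives the 3 false discoveries.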
When $\sigma$ is known the performance will increase when more contaminators
are present. Performance however has been measured in terms of finding a
single contaminator; i.e., the test has been used only once. Therefore even with
increasing percent contamination the level of performance will decrease with
increasing contamination. Repeated use of the test criteria has not been in-
vestigated.

FIG. 24. Performance of r10 (D1) and r22 (D1, r20, r21) for various levels of significance
when the population is 10% contaminated with location errors (panels: r10 (D1), n = 5; r22 (D1, r20, r21), n = 15).
FIG. 25. Performance of r10 (D1) and r22 (D1, r20, r21) for various levels of significance
when the population is 10% contaminated with scalar errors (panels: r10 (D1), n = 5; r22 (D1, r20, r21), n = 15).

Criterion B1 gives the best performance for both location and scalar errors for
the levels of contamination and levels of significance considered. A and C1 are
only slightly inferior. B2 is handicapped when more than one error is present;
thus its performance is poorer for heavier contamination. Figure 19 shows the
performance of B1 for the different levels of significance, 10% contamination,
and the two sample sizes 5 and 15 for location errors. Figure 20 shows the results
for scalar errors. Figures 21 and 22 show the performance of B1 for the 5%
level of significance for the different levels of contamination.
When ar is not known the performance of various criteria will eventually
decrease as more and more contaminators are present in the sample even though

r10 (D1)    r22 (D1, r20, r21)
n = 5    n = 15
FIG. 26. Performance of r10 (D1) and r22 (D1, r20, r21) for various levels of contamination
for location errors and using the 5% level of significance.


r10 (D1)    r22 (D1, r20, r21)
n = 5    n = 15
FIG. 27. Performance of r10 (D1) and r22 (D1, r20, r21) for various levels of contamination
for scalar errors and the 5% level of significance, α = 5%.

several of the criteria show improvement in discovering a single error if two
are present. The performance of these criteria is greatly affected by the size
of the sample. For samples of size 5, r10 and D1 perform alike, r10 being superior
to the other r's (r20 second best) for the levels of contamination considered,
and D2 is inferior to r20. Figure 23 compares the performance of r10, D1, r20
and D2 for the 5% level of significance and 10% contamination. The results
for other levels of significance and contamination are comparable.
For samples of size 15, r20, r21 and r22 perform alike, as do r10, r11 and r12. D1
and r20, r21, r22 perform approximately the same and are superior to r10, r11,

FIG. 28. A comparison of the performance of r22 and D1 for two scalar contaminators
when tests are made at one extreme only, α = 5%, n = 15.


and r12. Critical values are not available for D2 for n > 12. The performances
of D1, r20, r21 and r22 are indicated by a single line in Figures 24, 25, 26, and 27,
which show the effect of level of significance and level of contamination on the
performance of D1, r20, r21 and r22 for samples of size 15 and of r10 (D1) for
samples of size 5.
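For reference, the six ratios compared above can be computed from the ordered sample as in the sketch below. It assumes the form r_jk = (x_n - x_{n-j}) / (x_n - x_{1+k}) for a suspected largest value, with the mirror image for a suspected smallest value; the definitions themselves are given in Section 3, outside this passage. The function name `dixon_ratio` and the example data are hypothetical, and critical values must still be taken from published tables.

```python
def dixon_ratio(sample, j, k, upper=True):
    """Ratio r_jk for the largest (upper=True) or smallest (upper=False) value.

    j is 1 or 2: the gap is measured to the nearest or next-nearest neighbour.
    k is 0, 1 or 2: how many values at the opposite extreme are avoided.
    """
    x = sorted(sample)
    if not upper:
        # Mirror the sample so that the suspected value is again the largest.
        x = [-v for v in reversed(x)]
    n = len(x)
    if n < j + k + 2:
        raise ValueError(f"r{j}{k} needs at least {j + k + 2} observations")
    denominator = x[-1] - x[k]
    if denominator == 0:
        raise ValueError("range is zero; the ratio is undefined")
    return (x[-1] - x[-1 - j]) / denominator


data = [1, 2, 3, 4, 5, 10]  # hypothetical sample, suspect value 10
for j in (1, 2):
    for k in (0, 1, 2):
        print(f"r{j}{k} = {dixon_ratio(data, j, k):.4f}")
```

Passing `upper=False` tests the smallest observation with the same ratios, which matches the "(or for x_n ...)" alternatives given with each definition.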

7. Remarks and conclusions. Throughout the investigation of performance,
location errors were placed only at one extreme and scalar errors at either
extreme. The test for an error was made using as a suspected value the extreme
value in the direction of the location error or, in the case of the scalar error, the
value most distant from the mean. It can be expected, then, that if performance
were assessed when location errors could occur in either direction, different
results would be obtained. Also, in the case of scalar errors, if errors were always
sought at one particular extreme or at both extremes, different results would be
obtained. If these changes were made in the models of contamination, those
criteria designed to avoid errors at the other extreme would have an advantage
over those which were not so designed for σ unknown. If σ is known the criteria
which do not avoid the other extreme would have an advantage over those
which do avoid the other extreme. These points just mentioned will be used to
discriminate between those criteria which were judged to be equal in performance
under the models used in the sampling study. For example, Figure 28
compares the performance of r22 and D1 for two scalar contaminators when
tests are made only at one extreme, α = 5%, n = 15.
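The choice of suspected value described above (a fixed extreme for location errors, the value most distant from the mean for scalar errors) can be made explicit in a short sketch. It assumes that D1 is the ratio of the sum of squared deviations with the suspected value omitted to the sum of squared deviations of the whole sample, following the Grubbs reference cited for D1, so that small values point to an outlier. The function names and the `suspect` argument are hypothetical.

```python
def sum_of_squares(values):
    """Sum of squared deviations from the mean of the given values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def d1(sample, suspect="farthest"):
    """Ratio of sums of squares with the suspected extreme value omitted.

    suspect="upper" or "lower" tests one fixed extreme (location-error case);
    suspect="farthest" tests the value most distant from the mean (scalar case).
    """
    x = sorted(sample)
    if len(x) < 3:
        raise ValueError("at least three observations are needed")
    if suspect == "farthest":
        mean = sum(x) / len(x)
        suspect = "upper" if x[-1] - mean >= mean - x[0] else "lower"
    if suspect == "upper":
        rest = x[:-1]
    elif suspect == "lower":
        rest = x[1:]
    else:
        raise ValueError("suspect must be 'upper', 'lower' or 'farthest'")
    total = sum_of_squares(x)
    if total == 0:
        raise ValueError("all observations are equal; the ratio is undefined")
    return sum_of_squares(rest) / total


data = [1, 2, 3, 4, 5, 10]  # hypothetical sample
print(f"D1, upper extreme:  {d1(data, 'upper'):.4f}")
print(f"D1, lower extreme:  {d1(data, 'lower'):.4f}")
print(f"D1, farthest value: {d1(data):.4f}")
```

Testing a fixed extreme and testing the farthest value are different procedures with different significance levels, so the same critical value should not be assumed to apply to both.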
1. For σ known:
B1 or C1 should be used, or in small samples A, B1 or C1 should be used.
2. For σ unknown:
r10 should be used for very small samples. r22 should be used for sample sizes
over 15. Probably r21 would be best for sample sizes from about 8 to 13. If simplicity
in computation is not important and "errors" are not expected at both
extremes D1 would do equally well. When critical values are available for larger
n, D2 should prove useful in the larger sample sizes.
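The recommendations for σ unknown amount to a lookup on sample size, sketched below. The text leaves two points open, and the sketch fills them with labeled assumptions: "very small" is read as n below 8, and sizes 14 and 15, which fall between the stated ranges, return both candidates. The function name `recommended_ratio` is hypothetical.

```python
def recommended_ratio(n):
    """Map a sample size to the r criterion suggested in the conclusions."""
    if n < 3:
        raise ValueError("at least three observations are needed")
    if n <= 7:
        return "r10"  # assumption: "very small" read as n below 8
    if n <= 13:
        return "r21"  # stated: "from about 8 to 13"
    if n <= 15:
        return "r21 or r22"  # assumption: 14 and 15 are not covered by the text
    return "r22"  # stated: "sample sizes over 15"


for size in (5, 10, 15, 20):
    print(size, recommended_ratio(size))
```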
LITERATURE REFERRING TO CRITERIA LISTED IN SECTION 3
(B1) A. T. MCKAY, "The distribution of the difference between the extreme observation and
the sample mean in samples of n from a normal universe," Biometrika, Vol. 27
(1935), pp. 466-471. Procedures for obtaining percentage values given.
(B2) J. O. IRWIN, "On a criterion for the rejection of outlying observations," Biometrika,
Vol. 17 (1925), pp. 238-250. Pr(B2 > λ), λ = .1(.1)5.0; n = 2, 3, 10(10)100(100)1,000.
Tables concerning the second and third ordered observations are also given.
(C1) E. S. PEARSON AND H. O. HARTLEY, "The probability integral of the range in samples
of n observations from the normal population," Biometrika, Vol. 32 (1942), pp.
301-310. 0.1%, 0.5%, 1.0%, 2.5%, 5%, 10%, n = 2(1)12, values to 20 available by
interpolation.
(C2) D. NEWMAN, "The distribution of ranges in samples from a normal population,
expressed in terms of an independent estimate of the standard deviation," Biometrika,
Vol. 31 (1940), pp. 20-30. 1% and 5% points for C2; for w, n = 2(1)12, 20; for s, d.f. =
5(1)20, 24, 30, 40, 60, ∞.


(C2) E. S. PEARSON AND H. O. HARTLEY, "Tables of the probability integral of the studentized
range," Biometrika, Vol. 33 (1942), pp. 89-99. Upper and lower 5% and 1%
points for C2; for w, n = 2(1)20; for s, d.f. = 10(1)20, 24, 30, 40, 60, 120, ∞.
(C2, B1) K. R. NAIR, "The distribution of the extreme deviate from the sample mean and
its studentized forms," Biometrika, Vol. 35 (1948), pp. 118-144. B1 upper and lower
.1%, .5%, 1%, 2.5%, 5%, 10% points for n = 3(1)9.
(D1, D2, F, B1) F. E. GRUBBS, "Sample criterion for testing outlying observations,"
Annals of Math. Stat., Vol. 21 (1950), pp. 27-58. F, D1: 1%, 2.5%, 5%, 10%, n < 25;
D2: 1%, 2.5%, 5%, 10%, n < 20; B1: 1%, 2.5%, 5%, 10%, n < 25.
(F) W. R. THOMPSON, "On a criterion for the rejection of observations and the distribution
of the ratio of deviation to sample standard deviation," Annals of Math. Stat.,
Vol. 6 (1935), pp. 214-219. 20%, 10%, 5%, n = 3(1)22(10)42, 102, 202, 502, 1002.
(F) E. S. PEARSON AND CHANDRA SEKAR give a further discussion of F in "The efficiency of
statistical tools and a criterion for the rejection of outlying observations,"
Biometrika, Vol. 28 (1936), pp. 308-320. 10%, 5%, 2.5%, 1%, n = 3(1)19.
(r's) W. J. DIXON, "Ratios involving extreme values," Annals of Math. Stat., to be
published. r10, r11, r12, r20, r21, r22; .5%, 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%,
60%, 70%, 80%, 90%, 95%, n < 30.

