Sponsored by
The Army Mathematics Steering Committee
on Behalf of
U.S. Army Research Office
The original format for the Design of Experiments Conferences, which are under
the auspices of the Army Mathematics Steering Committee (AMSC), was outlined by
the eminent statistician, Professor Samuel S. Wilks, who served as conference
chairman until his death. Through these symposia the AMSC hopes to introduce
and encourage the use of the latest statistical and design techniques into the
research, development and testing conducted by the Army's scientific and
engineering personnel. It is believed that this purpose can be best pursued by
holding these meetings at various government installations throughout the
country.
Members of the program committee were pleased to obtain the services of the
following distinguished scientists to speak on topics of interest to Army
personnel:
Speaker and Affiliation Title of Address
Four days before the start of the planned two-day tutorial on "Topics in Modern
Regression Analysis", its speaker advised Mr. Agee he could not give his
planned lectures. Fortunately, Professor Ali Hadi of Cornell University was
able, so to speak, to save the day. The attendees were very pleased with Dr.
Hadi's interesting and informative tutorial on "Sensitivity Analysis in Linear
Regression".
TABLE OF CONTENTS*

Title                                                                Page
Foreword ............................................................ iii
Table of Contents .................................................... v
Program ............................................................. vii
*This Table of Contents contains only the papers that are published
in this technical manual. For a list of all papers presented at the
Thirty-Fourth Conference on the Design of Experiments, see the Program of
this meeting.
TABLE OF CONTENTS (continued)
Title Page
NUMERICAL ESTIMATION OF GUMBEL DISTRIBUTION PARAMETERS
    Charles E. Hall, Jr. ............................................ 185
AGENDA
WELCOMING REMARKS:
Chairman:
1200 - 1330 LUNCH
***** Thursday, 20 October *****
0800 REGISTRATION
0815 - 1000 TECHNICAL SESSION 1
A SMALL SAMPLE POWER STUDY OF THE ANDERSON-DARLING STATISTIC
EXPERIMENTAL DESIGN AND OPTIMIZATION OF BLACK CHROME SOLAR
SELECTIVE COATINGS
I. J. Hall and R. B. Pettit, Sandia National Laboratories
***** Friday, 21 October *****
0800 REGISTRATION
ADJOURN
PROGRAM COMMITTEE
SOME APPLICATIONS OF ORDER STATISTICS*
H. A. David
Department of Statistics
102D Snedecor Hall
Iowa State University
Ames, IA 50011-1210
The subject of order statistics deals with the properties and applications
of these ordered random variables and of functions involving them.
Examples are the extremes $X_{1:n}$ and $X_{n:n}$, the range $W_n = X_{n:n} - X_{1:n}$, the
extreme deviate (from the sample mean) $X_{n:n} - \bar{X}$, and the maximum absolute
deviation from the median (MAD) $\max_i |X_i - M|$, where the median $M$
or

$b_n(\lambda) = \sum_{i=1}^{n} a_i\,\alpha_{i:n}(\lambda).$  (2.3)

$\bar{X}_{10}$ is the sample mean $\frac{1}{10}\sum_{i=1}^{10} X_{i:10}$,

$T_{10}(1)$ is the trimmed mean $\frac{1}{8}\sum_{i=2}^{9} X_{i:10}$,

$W_{10}(2)$ is the Winsorized mean $\frac{1}{10}\left(2X_{3:10} + \sum_{i=3}^{8} X_{i:10} + 2X_{8:10}\right)$.
The figures are confined to $\lambda \ge 0$ since results for $\lambda < 0$ follow by skew-
symmetry in Fig. 1 and by symmetry in Fig. 2.
BIAS $b_n(\lambda)$. Since $a_i \ge 0$, we see from (2.3) that the bias is a strictly
increasing function of $\lambda$ for each of the estimators, and from (2.4) that

$b_n(\infty) = \sum_i a_i\,\alpha_{i:n-1}.$  (2.5)

This gives the numerical values placed on the right of Fig. 1. The jagged
graphs are the corresponding "stylized sensitivity curves" (Tukey, 1970;
Andrews et al., 1972) obtained by plotting $L_n(\alpha_{1:n-1}, \ldots, \alpha_{n-1:n-1}, \lambda)$.
[Figures 1 and 2: bias and mean squared error of the estimators as functions of $\lambda$.]
For the median, for example,

$\mathrm{med}(\alpha_{1:9}, \ldots, \alpha_{9:9}, \lambda) = \tfrac{1}{2}(\alpha_{5:9} + \lambda), \qquad 0 < \lambda < \alpha_{6:9},$

$\mathrm{med}(\alpha_{1:9}, \ldots, \alpha_{9:9}, \lambda) = \tfrac{1}{2}(\alpha_{5:9} + \alpha_{6:9}) = \tfrac{1}{2}(0 + 0.2745) = 0.1372, \qquad \lambda \ge \alpha_{6:9}.$
The last result is the same as given by (2.5). In fact, each of the horizon-
tal lines serves as an asymptote to the corresponding bias function. It is
seen that the median performs uniformly best.
MEAN SQUARED ERROR MSE$(\lambda)$. No clear-cut results emerge. The sample mean
does best for $\lambda < 1.5$ but is quickly outclassed for larger $\lambda$. Overall, $T_{10}(1)$
performs best, although the more highly trimmed $T_{10}(2)$ is slightly superior
for very large $\lambda$.
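The relative behavior of these estimators under the single-outlier model (2.1) is easy to check by simulation. The sketch below is a minimal Monte Carlo, assuming (as in (2.1)) $n-1$ observations from $N(0,1)$ and one observation shifted by $\lambda$; the estimator definitions for $n = 10$ follow the text, and the function name and replication count are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def bias_mse(lam, reps=20000, n=10):
    """Monte Carlo bias and MSE of four location estimators under model (2.1):
    n-1 observations N(0,1) plus one observation N(lam, 1); true location is 0."""
    x = rng.standard_normal((reps, n))
    x[:, -1] += lam                                   # the single outlier
    s = np.sort(x, axis=1)
    mean = s.mean(axis=1)                             # sample mean
    trim = s[:, 1:9].mean(axis=1)                     # T_10(1): trim one from each end
    wins = (2*s[:, 2] + s[:, 2:8].sum(axis=1) + 2*s[:, 7]) / 10.0  # W_10(2)
    med  = 0.5 * (s[:, 4] + s[:, 5])                  # median of n = 10
    est = np.stack([mean, trim, wins, med], axis=1)
    return est.mean(axis=0), (est**2).mean(axis=0)

for lam in (0.0, 1.5, 4.0, 8.0):
    bias, mse = bias_mse(lam)
    print(f"lambda={lam:4}: bias={np.round(bias, 3)}, MSE={np.round(mse, 3)}")
```

For large $\lambda$ the estimated bias of each estimator should level off near the value given by (2.5), with the median showing the smallest limiting bias.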
EXTENSIONS
I. The intuitively appealing result that for symmetric unimodal distributions
the median is the least biased among the Ln-estimators can be formally
established under (2.1) and also for a class of symmetric non-normal
distributions (David and Ghosh, 1985).
2. For $n \le 20$ appropriately trimmed means still do well in the MSE sense when
compared with much more complex estimators, but for $\lambda$ sufficiently large
and $n$ not too small are inferior to the best of the adaptive estimators
such as Tukey's biweight (Mosteller and Tukey, 1977, p. 205).
3. An often used alternative outlier model replaces the second line of (2.1) by

$X_n \sim N(\mu, \tau^2\sigma^2), \qquad \tau^2 > 1.$
For this model location estimators remain unbiased but their variance is
increased. Since bias has been sidestepped, only the variance of the
estimator needs to be studied (David and Shu, 1978; Rosenberger and Gasko,
1983).
CASE OF SEVERAL EXTREME OUTLIERS. For $q$ $(1 \le q < n)$ outliers Rocke (1986)
defines as a measure of outlier resistance of an estimator of location $T$ the
"expected maximum bias" $D_T(n,q)$ by
where the supremum is taken over all possible choices of the constants
$\lambda_1, \ldots, \lambda_q$ and the $Z$'s are the normal OS. When $T = L_n$, the supremum will
evidently occur when the $\lambda$'s are all $+\infty$ or all $-\infty$. As Rocke points out, by
focusing on the worst case of bias one need not specify the usually unknown
distribution(s) of the outliers. It suffices to model the good observations,
which more generally could be from any standardized distribution.
It appears that unwittingly Rocke does not use (2.6) but in fact works
with the standardized bias

$D_T^*(n,q) = E\{T(Z_{1:n}, \ldots, Z_{n-q:n}, \infty, \ldots, \infty)\}.$  (2.7)
size $n$, and $X_{r:n}^{(i)}$ the $r$-th OS of $S_n^{(i)}$, the moving $r$-th OS. Moving maxima are
$X_{n:n}^{(i)}$ and moving ranges $W_n^{(i)} = X_{n:n}^{(i)} - X_{1:n}^{(i)}$
$(i = 1, 2, \ldots)$. The latter have a longer history
(Grant, 1946), being natural companions to moving averages on quality control
charts. Such charts are particularly appropriate when it takes some time to
produce a single observation.
Moving medians are robust current measures of location and, like moving
averages, smooth the data; see, e.g., Tukey (1977, p. 210). Cleveland and
Kleiner (1975) have used the moving midmean, the mean of the central half of
the ordered observations in each $S_n^{(i)}$, together with the mean of the top half
and the mean of the bottom half, as three moving descriptive statistics
indicating both location and dispersion changes in a time series.
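A moving order statistic is straightforward to compute. A minimal sketch (the function name is ours; NumPy 1.20+ is assumed for sliding_window_view), applied to the trend sequence tabulated later in the text:

```python
import numpy as np

def moving_os(x, n, r):
    """r-th order statistic (1-based) of each moving sample S_n^(i) of length n."""
    w = np.lib.stride_tricks.sliding_window_view(np.asarray(x, dtype=float), n)
    return np.sort(w, axis=1)[:, r - 1]

x = [5, 2, 1, 3, 4, 6, 9, 12, 14, 11, 7]           # trend sequence used below
print(moving_os(x, 3, 2))                           # moving medians of 3
print(np.convolve(x, np.ones(3)/3, mode="valid"))   # moving means, for comparison
```

The moving medians reproduce the values tabulated in the trend example below.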
$\pi_{gh}(d) = \Pr\{\mathrm{rank}(X_{r:n}^{(1)}) = g,\ \mathrm{rank}(X_{s:n}^{(1+d)}) = h\},$  (3.1)

where rank$(Y)$ denotes the rank of $Y$ in the combined sample $X_1, \ldots, X_{n+d}$. It
follows that

$E(X_{r:n}^{(1)} X_{s:n}^{(1+d)}) = \sum_{g,h} \pi_{gh}(d)\, E(X_{g:n+d} X_{h:n+d})$  (3.2)

in terms of the first two moments of the OS for sample sizes up to $2n-1$ from a
distribution with cdf $F(x)$.
x        1     1    10     2     4     3     9    10     2     1
x(2:3)         1     2     4     3     4     9     9     2
x̄              4   4 1/3 5 1/3   3   5 1/3 7 1/3   7   4 1/3
a fixed sample becomes complicated although a fairly elegant expression for
the pdf can be written down in terms of permanents (Vaughan and Venables,
1972) if the X's are not identically distributed but still independent. It is
easily seen that the moving median and other order statistics will reflect
trends except for a smoothing at the ends. Thus for the following sequence,
where the upward trend is underlined, we have
x        5     2     1     3     4     6     9    12    14    11     7
x(2:3)         2     2     3     4     6     9    12    12    11
x̄            2 2/3   2   2 2/3 4 1/3 6 1/3   9  11 2/3 12 1/3 10 2/3
For a linear trend given by

$X_i = i\tau + Z_i, \qquad i = 1, 2, \ldots,$  (3.3)

where the $Z_i$ are i.i.d., we evidently have

$X_{r:n}^{(i+1)} \stackrel{d}{=} X_{r:n}^{(i)} + \tau,$

with covariances $\mathrm{cov}(X_{r:n}^{(i)}, X_{s:n}^{(i)})$ $(r, s = 1, \ldots, n)$ independent of $i$.
Thus $X_{r:n}^{(i)}$ will tend to lead the trend, reflect the current state, or lag the
trend according as $r$ is greater than, equal to, or less than $\frac{1}{2}(n+1)$, and will
do so increasingly as $\tau$ increases; for $\tau < 0$, the results are reversed. However,
in contrast to the sample mean,
whose variance remains unchanged under a linear trend, the variance of the
sample median increases with $\tau$. (I am indebted to Dr. W. J. Kennedy for some
computations verifying the latter intuitively obvious result.) Thus the use
of the median, under locally linear trend, is appropriate primarily as
protection against outliers. In this situation, but under nonlinear trend,
Bovik and Naaman (1986) consider the optimal estimation of $EX_i$ by linear
functions of order statistics.
Let $X_i$ $(i = 1, \ldots, n)$ be the lifetime of the $i$-th component and
$R_i(x) = \Pr\{X_i > x\}$ its reliability at time $x$ (the probability that it will
function at time $x$). Then the reliability of the system $S$ at time $x$ is

$R_S(x) = \Pr\{X_{n-k+1:n} > x\}.$
If the Xi are independent (but not necessarily identically distributed) one
may write (Sen, 1970; Pledger and Proschan, 1971)

$R_S(x) = \sum_{A} \prod_{i=1}^{n} R_i(x)^{\delta_i}\,[1 - R_i(x)]^{1-\delta_i},$

where $\delta_i = 0$ or $1$, and $A$ is the region $\sum \delta_i \ge k$. It can be shown that a
series (parallel) system is at most (least) as reliable as the corresponding
system of components each having reliability $\bar{R}(x) = \frac{1}{n}\sum_{i=1}^{n} R_i(x)$. An excellent
general account, covering also important situations when the $X_i$ are not
independent, is given in Barlow and Proschan (1975).
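The sum over indicator vectors in the display above can be evaluated without enumerating all $2^n$ terms. A sketch (the function name and example reliabilities are ours) uses the standard dynamic-programming recursion on the number of working components:

```python
import numpy as np

def k_out_of_n_reliability(R, k):
    """P(at least k of n independent components function), component i
    functioning with probability R[i]; equivalent to the sum over all
    indicator vectors delta with sum(delta) >= k in the display above."""
    p = np.zeros(len(R) + 1)          # p[j] = P(j components work so far)
    p[0] = 1.0
    for r in R:
        p[1:] = p[1:] * (1 - r) + p[:-1] * r
        p[0] *= (1 - r)
    return p[k:].sum()

R = [0.95, 0.90, 0.99, 0.85]
print(k_out_of_n_reliability(R, 3))                       # 3-out-of-4 system
print(k_out_of_n_reliability(R, 4), np.prod(R))           # series system check
print(k_out_of_n_reliability(R, 1),
      1 - np.prod([1 - r for r in R]))                    # parallel system check
```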
Now, for a random sample from any continuous distribution with cdf F(x),
P(n;n,t) is just
the cdf of the sample range (Hartley, 1942). Let A1 ' and A2 be the events
The event $A_1'$ occurs if $n-1$ or $n-2$ of $X_1, \ldots, X_n$ fall in the interval
$(X_{1:n}, X_{1:n} + t]$ and $A_2$ if $n-2$ of the $X_i$ are in $(X_{2:n}, X_{2:n} + t]$. Since $A_2$
includes the event that $n-1$ of the $X_i$ are in $(X_{1:n}, X_{1:n} + t]$, we can avoid
unnecessary duplication by replacing $A_1'$ in (4.1) by $A_1$, the event that
exactly $n-2$ of the $X_i$ are in $(X_{1:n}, X_{1:n} + t]$.
We have immediately, writing $n^{(j)} = n(n-1)\cdots(n-j+1)$, that

and

$\Pr\{A_1 A_2\} = n^{(3)} \int_{-\infty}^{\infty}\int_{x}^{x+t} [F(x+t) - F(y)]^{n-3}\,[F(y+t) - F(x+t)]\, dF(y)\, dF(x).$
From these results $P(n-1; n,t)$ has been tabulated in David and Kinyon
(1983) when $F(x) = \Phi(x)$. Note that $P(n-1; n,t)$ may be interpreted as the
probability that at least $n-1$ out of $n$ independent normal $N(\mu, \sigma^2)$ variates are
within an interval of length $t\sigma$.
EXAMPLE. As in Baker and Taylor (1981) suppose that $X_1, \ldots, X_7$ are independent
normal variates with $\sigma = 10^{-5}$. The entry $P(6; 7, 3) = 0.9587$ tells us that the
probability of at least six detonations out of a possible seven within time
span $3\sigma$ is 0.9587. By comparison, the probability of seven detonations is
only 0.6601, as found from tables of the cdf of the range (Pearson and
Hartley, 1970).
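The interpretation of $P(n-1; n, t)$ just given is easy to check by simulation. In the sketch below (parameters taken from the example; the function name is ours), at least $n-1$ of the $n$ points lie in some interval of length $t$ exactly when the range of the full sample, or of the sample with its minimum or its maximum removed, is at most $t$:

```python
import numpy as np

rng = np.random.default_rng(1)

def P_near_range(n, t, reps=200000):
    """Monte Carlo estimate of P(n-1; n, t) for standard normal variates."""
    z = np.sort(rng.standard_normal((reps, n)), axis=1)
    ok = ((z[:, -1] - z[:, 0] <= t)        # all n within an interval of length t
          | (z[:, -1] - z[:, 1] <= t)      # all but the minimum
          | (z[:, -2] - z[:, 0] <= t))     # all but the maximum
    return ok.mean()

print(P_near_range(7, 3.0))   # should be close to the tabulated P(6; 7, 3) = 0.9587
```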
APPENDIX
H. A. David and C. C. Yang
The author does not stay with his own definition of $D_T(n,q)$ but in fact uses

$D_T(n,q) = E\{T(Z_{1:n}, \ldots, Z_{n-q:n}, \infty, \ldots, \infty)\}.$
Even with this change the proof of the theorem on p. 176 is in error since the
combinatorial term associated with $\delta_{n-r}$ should be $\binom{n-1}{n-r}$, not $\binom{n}{n-r}$. However,
since $\delta_{n-r} = \delta_{r-q}$, the theorem follows directly from Case 2 of David and
Groeneveld (Biometrika (1982), 69, 227-32) and has essentially been proved in
P. K. Sen (Ed.), Biostatistics (1985), North-Holland, pp. 309-11.
REFERENCES
Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H., and
Tukey, J. W. (1972). Robust Estimates of Location. Princeton University
Press.
David, H. A. (1981). Order Statistics. 2nd edn. Wiley, New York.
Grübel, R. (1988). The length of the shorth. Ann. Statist. 16, 619-28.
Rosenberger, J. L. and Gasko, M. (1983). Comparing location estimators:
Trimmed means, medians, and trimean. In: Hoaglin, D. C., Mosteller, F.,
and Tukey, J. W. (Eds.) Understanding Robust and Exploratory Data Analysis,
pp. 297-338, Wiley, New York.
MULTI-SAMPLE FUNCTIONAL STATISTICAL DATA ANALYSIS
Emanuel Parzen
Department of Statistics
Texas A&M University
College Station, Texas 77843-3143
4. RANKS, MID-RANKS, AND MID-DISTRIBUTION FUNCTION. Nonparamet-
ric statistics use ranks of the observations in the pooled sample; let

defining

$p_{\cdot k} = n_k / N$

to be the fraction of the pooled sample in the kth sample; we interpret it as the empirical
probability that an observation comes from the kth sample.
The unconditional and conditional variances are denoted
Note that our divisor is the sample size $N$ or $n_k$ rather than $N - c$ or $n_k - 1$. The latter
then arise as factors used to define F statistics.
We define the pooled variance to be the mean conditional variance:
6. TWO SAMPLE NORMAL T TEST. In the two sample case the statistic to test
$H_0$ is usually stated in a form equivalent to

$T = (\bar{Y}_1 - \bar{Y}_2)\big/\hat{\sigma}\{(N/(N-2))((1/n_1) + (1/n_2))\}^{.5}.$

We believe that one obtains maximum insight (and analogies and extensions) by expressing
$T$ in the form which compares $\bar{Y}_1$ with $\bar{Y}$:

$T = ((N-2)\,p_{\cdot 1}/(1 - p_{\cdot 1}))^{.5}\,(\bar{Y}_1 - \bar{Y})/\hat{\sigma}.$

The exact distribution of $T$ is $t(N-2)$, the t-distribution with $N-2$ degrees of freedom.
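That the two displayed forms of $T$ are algebraically identical follows from $\bar{Y}_1 - \bar{Y} = (1 - p_{\cdot 1})(\bar{Y}_1 - \bar{Y}_2)$. A quick numerical check (simulated data of our own; $\hat{\sigma}$ uses divisor $N$, as stated in Section 4):

```python
import numpy as np

rng = np.random.default_rng(2)
y1, y2 = rng.normal(0.0, 1.0, 8), rng.normal(0.5, 1.0, 13)
n1, n2 = len(y1), len(y2)
N = n1 + n2
y = np.concatenate([y1, y2])
p1 = n1 / N                         # p.1
sigma = y.std()                     # divisor N, as in the text

# form comparing the two sample means
T_a = (y1.mean() - y2.mean()) / (sigma * np.sqrt((N / (N - 2)) * (1/n1 + 1/n2)))
# form comparing the sample-1 mean with the pooled mean
T_b = np.sqrt((N - 2) * p1 / (1 - p1)) * (y1.mean() - y.mean()) / sigma

print(T_a, T_b)                     # the two values agree
```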
7. TWO-SAMPLE NONPARAMETRIC WILCOXON TEST. To define the popular
Wilcoxon non-parametric statistic to test $H_0$ we define $W_k$ to be the sum of the $n_k$ ranks
of the $Y_k$ values; its mean and variance are given by

$E[W_k] = n_k(N+1)/2, \qquad VAR[W_k] = n_1 n_2 (N+1)/12.$

The usual definition of the Wilcoxon test statistic is

$T_1 = \{W_k - E[W_k]\}\big/\{VAR[W_k]\}^{.5}.$

The approach we describe in this paper yields as the definition of the nonparametric
Wilcoxon test statistic (which can be verified to approximately equal the above definition
of $T_1$, up to a factor $\{1 - (1/N)^2\}^{.5}$)

One should notice the analogy between our expressions for the parametric test statistic
$T$ and the nonparametric test statistic $T_1$; the former has an exact $t(N-2)$ distribution
and the latter has asymptotic distribution Normal$(0,1)$.
$T^2 = \sum_{k=1}^{c} (1 - p_{\cdot k})\,|TF_k|^2,$
defining

$TF_k = \{(N - c)\,p_{\cdot k}/(1 - p_{\cdot k})\}^{.5}\,(\bar{Y}_k - \bar{Y})/\hat{\sigma}.$

The asymptotic distributions of $T^2/(c-1)$ and $TF_k^2$ are $F(c-1, N-c)$ and $F(1, N-c)$,
respectively.
9. TEST OF EQUALITY OF c SAMPLES: NONPARAMETRIC KRUSKAL-
WALLIS TEST. The Kruskal-Wallis nonparametric test of homogeneity of c samples
can be shown to be
18. COMPARISON OF DISCRETE DISTRIBUTIONS. To compare two discrete
distributions we define first $d(u)$ and then $D(u)$ as follows:

$d(u) = d(u; H, F) = p_F(H^{-1}(u))/p_H(H^{-1}(u)),$

$D(u) = \int_0^u d(t)\,dt.$

We apply this definition to the discrete sample distributions $\hat{F}$ and $\hat{F}_k$ to obtain

$\hat{d}_k(u) = d(u; \hat{F}, \hat{F}_k)$

and its integral $\hat{D}_k(u)$.
We obtain the following definition of $\hat{d}_k(u)$ for the c sample testing problem with all
values distinct:

$\hat{d}_k(u) = N/n_k$ if $(R_k(j) - 1)/N < u \le R_k(j)/N$, $j = 1, \ldots, n_k$;
$\quad = 0$, otherwise.
A component, with score function $J(u)$, is a linear functional

$\hat{T}_k(J) = \int_0^1 J(u)\,\hat{d}_k(u)\,du.$

It equals

$\int_0^1 J(u)\,d(\hat{D}_k(u) - u).$
The components of the Kruskal-Wallis nonparametric test statistic $T_{KW}^2$ for testing
the equality of c means have score function $J(u) = u - .5$ satisfying

$E[J(U)] = 0, \qquad VAR[J(U)] = 1/12.$

The components of the F test statistic $T^2$ have score function

$J(u) = \{\hat{Q}(u) - \bar{Y}\}/\hat{\sigma},$

where $\hat{Q}(u)$ is the sample quantile function of the pooled sample Y.
15. GENERAL DISTANCE MEASURES. General measures of the distance of $D(u)$
from $u$ and of $d(u)$ from 1 are provided by the integrals from 0 to 1 of

$\{\hat{d}(u) - 1\}^2, \quad \{\hat{D}(u) - u\}^2, \quad \{\hat{D}(u) - u\}^2/u(1-u), \quad \{\tilde{d}(u) - 1\}^2,$

where $\tilde{d}(u)$ is a smooth version of $\hat{d}(u)$. We will see that these measures can be decom-
posed into components which may provide more insight; recall basic components are linear
functionals defined by (1)

$\hat{T}(J) = \int_0^1 J(u)\,\hat{d}(u)\,du.$
If $\phi_i(u)$, $i = 0, 1, 2, \ldots$, are complete orthonormal functions with $\phi_0 = 1$, then $H_0$ can
be tested by diagnosing the rate of increase (as a function of $m = 1, 2, \ldots$) of

$\int_0^1 \{\hat{d}_m(u) - 1\}^2\,du = \sum_{i=1}^{m} |\hat{T}(\phi_i)|^2.$
Define cosine score functions by

$\phi_i(u) = 2^{.5}\cos(i\pi u).$

One can show that a Cramer-von Mises type statistic, denoted CM(D), can be repre-
sented

$CM(D) = \sum_{i=1}^{\infty} |\hat{T}(\phi_i)|^2/(i\pi)^2.$
In addition to Legendre polynomial and cosine components we consider Hermite poly-
nomial components corresponding to Hermite polynomial score functions

$\phi_{Hi}(u) = (i!)^{-.5}\,H_i(\Phi^{-1}(u)),$

where $H_i(x)$ are the Hermite polynomials:

$H_1(x) = x,$
$H_2(x) = x^2 - 1,$
$H_3(x) = x^3 - 3x,$
$H_4(x) = x^4 - 6x^2 + 3.$
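A sketch of these score functions (a minimal implementation of ours, using the probabilists' Hermite recurrence $H_{i+1}(x) = x H_i(x) - i H_{i-1}(x)$, which reproduces the polynomials listed above, with $\Phi^{-1}$ taken from scipy):

```python
import numpy as np
from math import factorial
from scipy.stats import norm

def hermite(i, x):
    """Probabilists' Hermite polynomial H_i via the standard recurrence."""
    h0, h1 = np.ones_like(x), x
    if i == 0:
        return h0
    for j in range(1, i):
        h0, h1 = h1, x * h1 - j * h0
    return h1

def phi_H(i, u):
    """Hermite score function phi_Hi(u) = (i!)^(-1/2) H_i(Phi^{-1}(u))."""
    return hermite(i, norm.ppf(u)) / np.sqrt(factorial(i))

u = np.linspace(0.05, 0.95, 7)
for i in range(1, 5):
    print(i, np.round(phi_H(i, u), 3))
```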
A pooled portmanteau chi-squared statistic is

$CQ = \sum_{k=1}^{c} (1 - p_{\cdot k})\,CQ_k.$

$AD_k = \sum_{i=1}^{\infty} |\hat{T}_{Lk}(i)|^2/i(i+1),$

$AD = \sum_{k=1}^{c} (1 - p_{\cdot k})\,AD_k,$

$CM = \sum_{k=1}^{c} (1 - p_{\cdot k})\,CM_k.$
7) Hermite components and chi-squares up to order 4 are defined:

$\hat{T}_{Hk}(i) = \hat{T}_k(\phi_{Hi}),$

$CH_k(m) = \sum_{i=1}^{m} |\hat{T}_{Hk}(i)|^2,$

$CH(m) = \sum_{k=1}^{c} (1 - p_{\cdot k})\,CH_k(m).$
However one can look at k sample Anderson-Darling statistics as a single number
formed from combining many test statistics called components. The importance of com-
ponents is also advocated by Boos (1986), Eubank, La Riccia, and Rosenstein (1987) and
Alexander (1989). Insight is greatly increased if instead of basing one's conclusions on
the values of single test statistics, one looks at the components and also at graphs of the
densities of which the components are linear functionals corresponding to various score
functions. The question of which score functions to use can be answered by considering
the tail behavior of the distributions that seem to fit the data.
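As an illustration of the component machinery, the sketch below computes cosine components $\hat{T}_k(\phi_i)$ for the step-function density $\hat{d}_k(u)$ defined earlier, integrating $\phi_i(u) = \sqrt{2}\cos(i\pi u)$ exactly over each rank interval. The sample data and the normalization are ours; a test statistic would further scale the components:

```python
import numpy as np
from scipy.stats import rankdata

def cosine_components(samples, m=4):
    """T_k(phi_i) = integral of phi_i(u) d_k(u) du, with d_k(u) = N/n_k on the
    interval ((R_k(j)-1)/N, R_k(j)/N] and phi_i(u) = sqrt(2) cos(i pi u)."""
    pooled = np.concatenate(samples)
    N = len(pooled)
    ranks = rankdata(pooled)                      # mid-ranks under ties
    out, start = [], 0
    for x in samples:
        nk = len(x)
        Rk = ranks[start:start + nk]
        start += nk
        row = []
        for i in range(1, m + 1):
            F = lambda u: np.sqrt(2) * np.sin(i * np.pi * u) / (i * np.pi)
            row.append((N / nk) * np.sum(F(Rk / N) - F((Rk - 1) / N)))
        out.append(row)
    return np.array(out)

rng = np.random.default_rng(3)
a, b = rng.normal(0, 1, 30), rng.normal(0.8, 1, 25)
print(np.round(cosine_components([a, b]), 3))   # first component reflects the shift
```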
REFERENCES
Alexander, William (1989) "Boundary kernel estimation of the two-sample comparison
density function" Texas A&M Department of Statistics Ph. D. thesis.
Aly, E.A.A., M. Csorgo, and L. Horvath (1987) "P-P plots, rank processes, and Chernoff-
Savage theorems" In New Perspectives in Theoretical and Applied Statistics (ed. M.L.
Puri, J.P. Vilaplana, W. Wertz) New York: Wiley 135-156.
Boos, Dennis D. (1986) "Comparing k populations with linear rank statistics" Journal of
the American Statistical Association, 81, 1018-1025.
Eubank, R.L., V.N. La Riccia, R.B. Rosenstein (1987) "Test statistics derived as compo-
nents of Pearson's Phi-squared distance measure" Journal of the American Statistical
Association, 82, 816-825.
Parzen, E. (1979) "Nonparametric statistical data modeling" Journal of the American
Statistical Association, 74, 105-131.
Parzen, E. (1983) "FunStat quantile approach to two-sample statistical data analysis"
Texas A&M Institute of Statistics Technical Report A-21 April 1983.
Pettitt, A.N. (1976) "A two-sample Anderson-Darling statistic" Biometrika, 63, 161-168.
Scholz, F.W. and M.A. Stephens (1987) "k-sample Anderson-Darling tests" Journal of the
American Statistical Association, 82, 918-924.
Shorack, Galen and Jon Wellner (1986) Empirical Processes With Applications to Statistics
New York: Wiley.
[Figure: For samples I and IV, sample comparison distribution function $\hat{D}(u)$.
Panels: Housing Value/Price, FOUR FAMILY ASSESS; Housing Value/Price, SINGLE FAMILY ASSESS.]
[Figure: For samples I and IV, sample comparison density $\hat{d}(u)$, sample quartile density
$dQ(u)$ (square wave), nonparametric density estimator $\tilde{d}(u)$.]
[Figure: For samples I and IV, Legendre, cosine, and Hermite orthogonal polynomial estimators
of order 4 of the comparison density, denoted $\hat{d}_4(u)$, compared to sample quartile density $dQ(u)$.]
Reliability of the M256 Chemical Detection Kit
Abstract
The U.S. Army uses the M256 Chemical Detection Kit (CDK) to indicate the presence
or absence of certain agents in the battlefield, which is indicated by a color change on the kit.
Strength of response is also influenced by the quantity of agent. Lots must meet reliability
specifications to be considered "battle-ready". How do we go about collecting and analyzing
our data so as to evaluate its reliability? Other problems of interest include quantifying how
the agent quantity affects the response and if there are differences between the two manufac-
turers of the M256 CDK. Consultants at the Ballistic Research Laboratory have employed a
dose-response framework to study the reliability problem. We use a binary response
(present/not present) and assume a lognormal distribution in arriving at a response curve for
each lot. Assessments of our approach and suggestions for alternative approaches are asked
of the panel.
Description of Kit
The M256 Chemical Detection Kit (CDK) is used to detect the presence/absence of
dangerous concentrations of toxic agents by color-changing chemical reactions. Each CDK
contains twelve samplers, which are the actual testing devices. Four types of agents can be
detected with the CDK. The tests indicate
a) if it is permissible to remove the protective mask following an agent attack,
b) if agent is present in the air or on surfaces suspected of contamination,
c) if any agent is present after decontamination operations.
The U.S. Army requires that the samplers exhibit at least a 92.5% reliability (with 90%
confidence) in responding to agent concentrations at the specification levels. However, the
kit should not be so sensitive that soldiers wear their mask at safe levels of concentration,
thereby interrupting other battlefield duties.
On the back of each sampler are complete instructions for testing and colored examples
of safe and danger responses. After performing the test, a paper test spot is checked for any
change of color. The color change will not usually be an exact match with the colors shown
on the back of the sampler. This is because the response depends upon the agent quantity.
To make matters more complex, when the agent is present the observed response may be
nonuniform with a few shades of the danger response showing.
The test equipment that controls the concentration of agent in the test chamber is very
accurate and precise, but it is slow. It may take about an hour to change to a higher concen-
tration. When going from a high to a low concentration, the waiting period may be several
hours since the high concentration tends to leave a residual amount of agent in the test
chamber.
We have decided to evaluate each agent and the chosen lots separately. From each
manufacturer, we have selected one lot from the available age groups. Also, we have tried to
choose lots of similar age from the manufacturers so that they can be paired and we can look
for general trends. In all, we have chosen fifteen lots ranging in age from 1 to 8 years.
Although the sites are in varying climatic areas, most of the warehouses are humidity and
temperature controlled; therefore the locations are treated as homogenous. Differences
existing between manufacturers are not considered in our initial design, but will be addressed
later.
We have taken the route of estimating the reliability of each lot at the specification level
of each agent. We have also chosen a dose-response type experiment, where our dose is the
agent concentration and the response is safe/danger. For the purpose of determining
response, U.S. Army manuals specify a set of nine color chips that progressively range from
the "safe" color to the "danger" color. The manual also states a cutoff color for the Bernoulli
response. (In most cases, color chips 1-3 correspond to a safe response, while chips 4-9 are
considered danger responses.)
We have made the assumption that the response curves follow that of the lognormal
cumulative distribution function with unknown mean and standard deviation. The lognormal
was selected based on historical precedent, although we note that the log-logistic would have
also been a reasonable choice.
To choose the concentration levels at which to run the tests, we have considered several
candidate sequential designs. In light of some of our restrictions, however, none of these
would be very practical (e.g., Robbins-Monro would have required too much laboratory
time).
Instead, we have chosen a two-stage "semi"-fixed design. In the first stage, 11 samplers
are tested at seven different levels; one concentration level set at an estimated mean, three
concentrations above this estimated mean, and three concentration levels below the
estimated mean, each being a multiple of the standard deviation away from the mean. Mean
and standard deviation estimates are based on the results of a pretest (which for the purpose
of brevity is deleted from this presentation). The multiple of the standard deviation is chosen
so that the specification level will be covered by the seven test concentrations.
Stage I

Concentration                       Number of Samplers
$\hat{\mu}_1 - 3k\hat{\sigma}_1$             1
$\hat{\mu}_1 - 2k\hat{\sigma}_1$             1
$\hat{\mu}_1 - k\hat{\sigma}_1$              2
$\hat{\mu}_1$                                3
$\hat{\mu}_1 + k\hat{\sigma}_1$              2
$\hat{\mu}_1 + 2k\hat{\sigma}_1$             1
$\hat{\mu}_1 + 3k\hat{\sigma}_1$             1
                                            11

Note: k is chosen so that the seven test concentrations cover the specification level. $\hat{\mu}_1$ and $\hat{\sigma}_1$
come from the pretest.
At the conclusion of Stage I, the data are analyzed using the DiDonato-Jarnagin max-
imum likelihood estimation algorithm to produce new estimates of the parameters, $\hat{\mu}_2$ and
$\hat{\sigma}_2$. In Stage II, nine more units are tested at five concentration levels; one level set at the
new estimated mean, and at two levels above and below this, each now being a multiple of the
new standard deviation from the mean.
Stage II

Concentration                       Number of Samplers
$\hat{\mu}_2 - 2\hat{\sigma}_2$              1
$\hat{\mu}_2 - \hat{\sigma}_2$               2
$\hat{\mu}_2$                                3
$\hat{\mu}_2 + \hat{\sigma}_2$               2
$\hat{\mu}_2 + 2\hat{\sigma}_2$              1
                                             9
At the conclusion of Stage II, the parameter estimates for the lot are re-evaluated using
all 20 data points, giving us a final $\hat{\mu}$ and $\hat{\sigma}$. With these final estimates, the .925 quantile is
estimated by $\hat{\mu} + z_{(.925)}\hat{\sigma}$.
By taking the variance of the above equation, we get an estimate of the variance of the
.925 quantile,

$Var(\hat{\mu}) + (z_{(.925)})^2\,Var(\hat{\sigma}) + 2\,z_{(.925)}\,Cov(\hat{\mu}, \hat{\sigma}).$

(The DiDonato-Jarnagin algorithm gives the values of the variances and covariance term.) If
the one-sided 90% upper confidence limit of the .925 quantile is less than the specification
concentration, then we can conclude that the lot meets the requirement for that particular
agent.
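Numerically the acceptance rule reduces to a few lines. In the sketch below, the MLEs, their variances, and the specification level are illustrative placeholders of our own; in practice the DiDonato-Jarnagin output would supply them:

```python
import numpy as np
from scipy.stats import norm

# illustrative values standing in for the DiDonato-Jarnagin output (log scale)
mu_hat, sigma_hat = 2.10, 0.35
var_mu, var_sigma, cov_ms = 0.010, 0.006, 0.002
spec_level = 3.0                              # hypothetical specification level

z = norm.ppf(0.925)
q_hat = mu_hat + z * sigma_hat                # estimated .925 quantile
var_q = var_mu + z**2 * var_sigma + 2 * z * cov_ms
ucl_90 = q_hat + norm.ppf(0.90) * np.sqrt(var_q)   # one-sided 90% upper limit

print("q_hat = %.3f, 90%% UCL = %.3f, meets requirement: %s"
      % (q_hat, ucl_90, ucl_90 < spec_level))
```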
We do not have a statistical technique per se for detecting significant differences
between manufacturers or sites. Our "approach" would be to simply look for any obvious
trends or differences. To study the age issue, a separate accelerated life test will be con-
ducted at a later date.
Questions
1. Is our approach appropriate for determining an extreme quantile?
2. Can one estimate a quantile when considering more than two possible responses (e.g.,
the nine color chips)?
3. How might we statistically compare the reliability of the manufacturers (or sites)?
Concluding Remarks
Following our presentation, we heard comments and suggestions from the clinical ses-
sion panelists and audience. Two major concerns were expressed by several persons. First
was uneasiness towards our assumption of a lognormal distribution. Some respondents felt
this to be a potentially dangerous assumption, especially since we are estimating the tail of
our distribution. Secondly, some persons questioned our method of estimating the mean of
the distribution, and then extrapolating to the .925 quantile. These two problems could lead
to some very erroneous conclusions.
In general, the comments we heard confirmed our beliefs that this is a very difficult
problem to analyze, in light of the small sample sizes and other laboratory constraints to
which the test is subjected. Although no definitive alternative approaches arose from our dis-
cussions, some possible attacks that were suggested to us included:
2. Isotonic regression.
3. Testing at the specification level and employing a general linear model approach
with the color chip number corresponding to the color change as the response and
age, manufacturer, and storage site as variables.
We would like to thank the panelists and audience for their many suggestions and
remarks.
COMPARISON OF RELIABILITY CONFIDENCE INTERVALS
Paul H. Thrasher
Engineering and Analysis Branch
Reliability, Availability, and Maintainability Division
Army Materiel Test and Evaluation Directorate
White Sands Missile Range, New Mexico 88002-5175
ABSTRACT
INTRODUCTION
The one-sided confidence interval considered is the upper confidence
interval. This is based on the premise that having a reliability too low is
much more serious than the reliability being too high.
FISHERIAN APPROACH
binomial $b(i; n, R)$ consists of a single spike of unit height at $i = 0$. As $R$ and $n$
increase, $b(i; n, R)$ takes a shape illustrated in Figure 4 and described by

The extent of the shifting from the single spike is determined by the data
$x$ and $n$. The value of $\underline{R}$ is determined in two steps. First, $\underline{R}$ is increased
until the sum $b(x; n, \underline{R}) + b(x+1; n, \underline{R}) + \cdots + b(n; n, \underline{R})$ equals the probability $\alpha$ of making
the confidence relation $P[R < \underline{R}] = \alpha$ or $P[R \ge \underline{R}] = 1 - \alpha$ untrue. Second, the continuous
variable $\underline{R}$ is decreased infinitesimally, making the confidence relation
$P[R \ge \underline{R}] = 1 - \alpha$ just barely valid. Thus $\underline{R}$ is neither too large nor too small to be
a $(1-\alpha)100\%$ lower confidence limit on $R$ when

$\alpha = \sum_{i=x}^{n} \binom{n}{i}\,\underline{R}^i\,(1 - \underline{R})^{n-i},$

where $a$ and $b$ are parameters. Using the equality of the gamma function $\Gamma(j)$
and the factorial $(j-1)!$ when $j$ is an integer yields
Postulating that the reliability is described by $f(R)$ and setting the area to
the left of $\underline{R}$ at $\alpha$ yields

$\alpha = \int_0^{\underline{R}} f(R)\,dR.$

Comparison of this summation and the summation for $\alpha$ in the previous paragraph
yields $a = x$ and $a + b - 1 = n$. Thus the parameters in the Beta function for the
lower limit $\underline{R}$ are $a = x$ and $b = n + 1 - x$.
$\alpha = \sum_{i=0}^{x} \binom{n}{i}\,\bar{R}^i\,(1 - \bar{R})^{n-i},$

(2) $\bar{R}$ is the lower limit of integration over the second Beta function

$\alpha = \int_{\bar{R}}^{1} f(R)\,dR,$

(4) the second Beta function parameters are identified by $x = a' - 1$ and
$n = a' + b' - 1$ to be $a' = x + 1$ and $b' = n - x$.
This Beta function does not describe $\bar{R}$ when $x = n$ because $\Gamma(b') = \Gamma(0) = (0-1)!$
is meaningless. For this special case, $\bar{R} = 1$ for all $\alpha$. This may be seen from
a binomial distribution symmetric to Figure 4. Using an $R$ near 1 and an $\alpha$
containing binomial terms from $i = 0$ to $i = x = n$, it is easily seen that $\alpha$ is 1
even when $R$ is 1. Since $R$ is continuous, $\bar{R} = 1$ for any value of $1 - \alpha$.
BAYESIAN APPROACH
The Bayesian approach (Martz and Waller) uses the data x and n to update
a prior distribution g(R) describing R to a posterior distribution g(Rlx)
describing R after x is given. The algebraic relation between these two is
based on the equality of the joint density h(x,R) to both the product
g(R|x)f(x) and the product f(x|R)g(R). Thus the posterior is found from
(1) the conditional density

$f(x|R) = b(x; n, R) = \binom{n}{x} R^x (1-R)^{n-x}$

and (2) the marginal density $f(x)$ from the integral of $h(x,R) = f(x|R)g(R)$ is
Ignorant Prior:
One prior that can be used is the uniform distribution $g(R) = 1$ for $0 \le R \le 1$
and $g(R) = 0$ elsewhere. This is sometimes called the ignorant prior because all
values of $R$ between 0 and 1 are equally likely. That is, there is no evidence
to favor the selection of any value of $R$ over any other $R$ between 0 and 1.
$g(R|x) = \frac{R^x (1-R)^{n-x}}{\int_0^1 R^x (1-R)^{n-x}\,dR}
= \frac{\Gamma(n+2)}{\Gamma(x+1)\,\Gamma(n-x+1)}\,R^x (1-R)^{n-x}.$
Noninformed Prior:
$\theta = \arcsin(R^{1/2}).$
Figures 5 and 6 show that for $0 < x < n$ these similar curves become nearly equally
spaced along the $\theta$ axis as $n$ is increased. The noninformed argument assumes
that all $n+1$ curves are essentially equal and equally spaced for all $n$. This
makes being noninformed about $x$ equivalent to being ignorant about $\theta$. The
prior assumption that (1) $x$ is unknown but (2) the situation is described by
one of these curves thus leads to a prior distribution of $\theta$ that is
uniform between $\theta = 0°$ and $\theta = 90°$. The corresponding prior of $R$ may be found
from the transformation of variable technique (Freund and Walpole) by applying
from the transformation of variable technique (Freund and Walpole) by applying
$g(R) = h(\theta)\left|\frac{d\theta}{dR}\right|.$

$g(R|x) = \frac{R^{x-1/2}\,(1-R)^{n-x-1/2}}{\int_0^1 R^{(x+1/2)-1}\,(1-R)^{(n-x+1/2)-1}\,dR}.$
The three methods reviewed in the previous sections have been applied to
confidence intervals on reliability. Both two-sided and one-sided intervals
have been investigated.
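For a lower limit on R, all three methods reduce to quantiles of Beta distributions with the parameters derived above, so they can be compared directly. A sketch (the function name and the (x, n) examples are ours; scipy's beta.ppf supplies the Beta quantiles):

```python
from scipy.stats import beta

def lower_limits(x, n, conf=0.90):
    """(1-alpha)100% lower limits on reliability from x successes in n trials:
    Fisherian Beta(x, n-x+1), ignorant-prior Beta(x+1, n-x+1),
    noninformed-prior Beta(x+1/2, n-x+1/2)."""
    a = 1.0 - conf
    fisherian = beta.ppf(a, x, n - x + 1) if x > 0 else 0.0
    ignorant = beta.ppf(a, x + 1, n - x + 1)
    noninformed = beta.ppf(a, x + 0.5, n - x + 0.5)
    return fisherian, ignorant, noninformed

for x, n in [(9, 10), (18, 20), (45, 50)]:
    print((x, n), ["%.4f" % v for v in lower_limits(x, n)])
```

Consistent with the discussion below, the Fisherian limit is the lowest of the three, with the differences shrinking as n grows.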
The effect of changing x for fixed n is seen to be a change in the order
of the Fisherian, ignorant, and noninformed intervals. The Fisherian interval
seems to be the widest. For x near n/2, the ignorant interval seems to be
narrower than the noninformed interval. For x near n, however, the noninformed
interval seems to be the narrowest of the three.
The symmetry of the Beta functions makes the lower confidence limits for
x near 0 such that the Fisherian is lowest, the noninformed Bayesian is next
lowest, and the ignorant Bayesian is the highest of the three. This is shown
in Figures 25 through 28. These figures and Figures 29 through 31 also show
that large n leads to fairly close agreement between the three methods.
CONCLUSION
The three methods are all on sound theoretical ground but give different
results. No single method provides the most logical confidence intervals. The
choice between methods has to be based on goals and philosophy. Since the
Fisherian method leads to the widest confidence intervals, it is the most
conservative approach. Since proponents of the Bayesian method prefer priors
which contain more information than the ignorant or noninformed prior, the
Bayesian method (without a prior based on previous tests/calculations) does
not meet all the goals of analysts with a Bayesian philosophy. Thus the
Fisherian method seems to be a good, conservative method for the initial
analysis. This initial analysis can provide a prior for a future Bayesian
analysis of additional data from a future test.
REFERENCES
Mann, Nancy R., Schafer, Ray E., and Singpurwalla, Nozer D., Methods for
Statistical Analysis of Reliability and Life Data, John Wiley & Sons, New
York, 1974
Martz, Harry F., and Waller, Ray A., Bayesian Reliability Analysis, John Wiley
& Sons, New York, 1982
Box, George E.P., and Tiao, George C., Bayesian Inference in Statistical
Analysis, Addison-Wesley, Reading, Massachusetts, 1973
Freund, John E., and Walpole, Ronald E., Mathematical Statistics, Prentice-
Hall, Inc., Englewood Cliffs, New Jersey, 1980
[Figures 1 through 31: plots of binomial distributions $b(i; n, R)$, the $\theta = \arcsin(R^{1/2})$
transformation curves, and comparisons of two-sided and one-sided Fisherian, ignorant-
Bayesian, and noninformed-Bayesian confidence intervals for reliability for various x and n.]
Environmental Sampling: A Case Study
Dennis L. Brandon
US Army Engineer
Waterways Experiment Station
Vicksburg, Mississippi 39180
The case study site is approximately 200 acres with both upland and
wetland areas (see Figure 1). This site was known to have very high
concentrations of metals in surface soils. Major pathways for
contaminant mobility are the meandering stream which flows north and the
drainage ditches. Also, tidal inundation affects a substantial portion
of this site.
The objectives of the study were to: (1) define the extent of the
hazardous substance contamination on the site; (2) identify the sources
of the hazardous substances detected on the property; (3) evaluate the
extent of migration of the hazardous substances on the property; (4)
assess the bioavailability, mobility, and toxicity of the hazardous
substances detected on the property; (5) evaluate the condition of the
wetland and upland habitats on the property. This paper focuses on the
use of soil samples to achieve objectives 1 through 4.
[Figure 1. Map of Study Area]
SAMPLING PLAN. The sampling plan was formulated based on previous
soil and water data, historical information, and the potential pathways
for contaminant mobility. The sampling locations are shown in Figure 1.
Three samples were collected at some locations and one sample was
collected at the other locations. The triplicate samples were used in
statistical comparisons. This sampling plan reduced the cost of the
investigation by allowing a selected number of sample locations to be
tested extensively while other sample locations received one-third the
cost and effort. A total of 178 samples were collected and analyzed for
As, Cd, Cu, Pb, Ni, Se, and Zn.
There is an analogy between the strategy used here and the disposal
philosophy of many Corps elements. Most dredging and disposal decisions
are made at the local level on a case by case basis. Often, the
environmental objective is to prevent further degradation of the disposal
area. Therefore, samples are collected at the dredge site and disposal
site. A statistical evaluation performed on the chemical analysis of the
samples becomes the basis for determining whether degradation will occur.
In this study, samples were collected at the remote reference area and an
area of degradation (i.e. contamination). Ten triplicate samples were
collected in the remote reference area. Twenty-eight triplicate samples
were collected in the area of contamination. Locations having a mean
concentration of metals in soil, plants, or animals statistically greater
than similar data from all remote reference locations were declared
contaminated. These concentrations provide a judgemental basis for
classifying the 64 single sample locations.
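The paper does not fix a particular test for these comparisons; a minimal sketch of one reasonable choice, a one-sided two-sample t test on hypothetical lognormal concentration data (all numbers invented for illustration), is:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(4)

# hypothetical Pb concentrations (mg/kg): 10 reference locations in triplicate,
# and triplicate samples from one candidate location
reference = rng.lognormal(3.0, 0.4, (10, 3))
candidate = rng.lognormal(4.0, 0.4, 3)

# one-sided test: is the candidate-location mean greater than the reference?
t, p = ttest_ind(candidate, reference.ravel(), alternative="greater")
print("t = %.2f, one-sided p = %.4f -> %s"
      % (t, p, "declare contaminated" if p < 0.05 else "not distinguishable"))
```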
A Generalized Gumbel Distribution
Siegfried H. Lehnigk
Research Directorate
Research, Development, and Engineering Center
U.S. Army Missile Command
Redstone Arsenal, AL 35898-5248
A Generalization of the Eulerian Numbers with a
Probabilistic Application
Bernard Harris
University of Wisconsin, Madison
C. J. Park
San Diego State University
$A_{n,j} = \sum_{v=0}^{j} (-1)^v \binom{n+1}{v} (j - v)^n.$  (3)

Also,

$A_{n,j} = (n - j + 1) A_{n-1,j-1} + j A_{n-1,j},$  (4)

$\sum_{j=1}^{n} A_{n,j} = n!.$  (5)
As seen from the tabulation, $P_4(1) = 1$, $P_4(2) = 11$, $P_4(3) = 11$, $P_4(4) = 1$, which coincides
with $A_{4,j}$, $j = 1, 2, 3, 4$.
Let

$A_n(t) = \sum_{j=1}^{n} A_{n,j}\,t^j.$  (6)

Then

$\sum_{n=0}^{\infty} A_n(t)\,\frac{x^n}{n!} = \frac{1 - t}{1 - t\,e^{x(1-t)}}, \qquad t \ne 1.$  (7)
The above relations and some of their properties can be found in [8]; the polynomials (6) are
also discussed in L. Carlitz [4]. These results may also be found in the expository paper of L.
Carlitz [3]. The formulas (1) and (2) are also given in L. v. Schrutka [21].
Désiré André [1] established that $A_{n,j}$ is the number of permutations of $\{X_n\}$ with $j$ "elementary
inversions". He also established that $A_{n,j}$ is the number of circular permutations of $\{X_{n+1}\}$ with $j$
"elementary inversions".
The equivalence of these two results with the enumeration of the number
of increases in permutations of {X.} can be trivially established.
G. Frobenius [15] studied the polynomials
introduced by Euler, and established many of their properties. In particular, relations with the
Bernoulli numbers are given in [15].
In D. P. Roselle [20], the enumeration of permutations by the number of rises, $A_{n,j}$, is related
to enumeration by the number of successions; that is, a permutation $\pi$ of $\{X_n\}$ has a succession if
$\pi(i) = i + 1$, $i = 1, 2, \ldots, n$.
Some number theoretic properties of $A_{n,j}$ are given in L. Carlitz and J. Riordan [7] and in L.
Carlitz [5].
In this paper, we study a generalization of the Eulerian numbers. A generalization in a dif-
ferent direction was given by E. B. Shanks [22], who apparently did not note a connection of his
coefficients with the Eulerian numbers. L. Carlitz [2] noted the relationship of Shanks's results to
the Eulerian numbers and obtained representations for these generalized Eulerian numbers using
results due to N. Nielsen [17].
F. Poussin [18] considered the enumeration of the number of inversions of permutations of
$\{X_n\}$ which end in $j$, $1 \le j \le n$. This produces a decomposition of the Eulerian numbers. She
also introduced a polynomial generating function for these numbers. The sums of these polynomials
are the Euler-Frobenius polynomials.
Another decomposition of the Eulerian numbers with a combinatorial interpretation is given by
J.F. Dillon and D.P. Roselle [12].
J. Riordan [19] lists many properties of the Eulerian numbers in Exercise 2, pages 38-39, and de-
scribes the combinatorial interpretation of the Eulerian numbers in terms of triangular permutations
(which is equivalent to the elementary inversions described by André [1]). He also gives a brief
table of the Eulerian numbers on page 215. See also L. Comtet [10], where generating functions for
the Eulerian numbers are given and the Eulerian numbers are obtained by enumerating the number
of permutations with a specified number of increases. Many properties of the Eulerian numbers are
given as well as their historical origins in terms of sums of powers.
F.N. David and D.E. Barton [11] suggest the use of the Eulerian numbers as a statistical test for
the randomness of a sequence of observations in time, employing the probability distribution given
by
$P_j = A_{n,j}/n!, \qquad j = 1, 2, \ldots, n.$  (9)
The generating function (7) is derived and employed to obtain the moments and cumulants
of the distribution (9). In particular, David and Barton show that the factorial moments are the
generalized Bernoulli numbers. However, David and Barton do not make any identification of
these distributions with the Eulerian numbers.
Using probabilistic arguments, Carlitz, Kurtz, Scoville and Stackelberg [6] showed that the Eu-
lerian numbers, when suitably normalized, have an asymptotically standard normal distribution.
This was accomplished by representing the distribution $P_j$ as the distribution of a sum of indepen-
dent Bernoulli random variables. S. Tanny [24] demonstrated the asymptotic normality by utilizing
the relationship of the Eulerian numbers to the distribution of the sum of independent uniform ran-
dom variables and applying the central limit theorem.
L. Takács [23] obtained a generalization of the Eulerian numbers which provides the solution
to a specific occupancy problem. Namely, let a sequence of labelled boxes be given, the first box
labelled 1, the second box 2, and so on. At trial number $n$ distribute $l$ balls randomly in the first
$n$ boxes so that the probability that each ball selects a specific box is $1/n$ and the selections are
stochastically independent. For $l = 1$, the probability that $j - 1$ boxes are empty after trial number
$n$ is $A_{n,j}/n!$, $j = 1, 2, \ldots, n$. Takács' paper contains many references and describes additional
combinatorial problems whose solution is related to the Eulerian numbers.
Finally, L. Toscano [25] obtained formulas expressing the Eulerian numbers in terms of Stirling
numbers of the second kind.
These polynomials are mentioned in L. Carlitz, D.P. Roselle and R.A. Scoville [8]. As noted
there, $A_{n,j}(0)$ are the Eulerian numbers. These polynomials are also used by P.S. Dwyer [13] to
calculate sample factorial moments. Dwyer does not relate these to the Eulerian numbers.
We begin our analysis with the following theorem:
Theorem 1. Let $n$ and $k$ be non-negative integers and let $\delta$ be any real number. Then

$\mu_{[k]}^{(n)}(\delta) = \frac{1}{n!}\sum_{j=0}^{n} (\delta + j)^{[k]}\,A_{n,j}(\delta), \qquad
A_{n,j}(\delta) = \sum_{v=0}^{j} (-1)^v \binom{n+1}{v} (\delta + j - v)^n,$  (11)

where $x^{[k]} = x(x-1)\cdots(x-k+1)$, is independent of $\delta$ for $k = 0, 1, \ldots, n$.
Proof. The following identity (see N. Nielsen, [17], page 28) will be utilized in the proof:

$z^n = \sum_{j=0}^{n} \binom{z - \delta + n - j}{n} A_{n,j}(\delta),$  (12)
E(f( )) = f (Z+ 1),
Then, it can be shown that (C. Jordan, [16])
N0
r
the last equality follows from elementary properties of the function (S + j)k' j (C. Jordan, [16], p.
51). Thus, for r - 0,1,..., n, from (12) we have
Hence,

$\cdots$  (16)

Thus, it follows that

$\cdots$  (17)

Setting $z = \delta + n - k - 1 + 1$ in (12) we get

$(\delta + n - k)^n = \sum_{j=0}^{n} \binom{2n - k - j}{n} A_{n,j}(\delta),$

and hence

$\cdots$  (18)

which is independent of $\delta$.
In particular, $\mu_{[1]}^{(n)}(\delta) = (2^n - 1)/n$, $\mu_{[2]}^{(n)}(\delta) = (3^n - 2^{n+1} + 1)/n(n-1)$, and
$\mu_{[3]}^{(n)}(\delta) = (4^n - 3\cdot 3^n + 3\cdot 2^n - 1)/n(n-1)(n-2)$.
A brief table of $\mu_{[k]}^{(n)}(\delta)$ for $k = 0, 1, 2, 3$ and $n = 0, 1, 2, 3$ is given in the Appendix to this
paper.
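Theorem 1 and the particular values above are easy to verify numerically. The sketch below (our own, with x^[k] the falling factorial) evaluates (11) directly and checks that the result does not change with delta:

```python
import numpy as np
from math import comb, factorial

def A(n, j, d):
    """Generalized Eulerian number A_{n,j}(delta) from (11)."""
    return sum((-1)**v * comb(n + 1, v) * (d + j - v)**n for v in range(j + 1))

def falling(x, k):
    """Falling factorial x^[k] = x(x-1)...(x-k+1)."""
    out = 1.0
    for i in range(k):
        out *= x - i
    return out

def mu(n, k, d):
    """mu_[k]^(n)(delta) = (1/n!) sum_j (delta+j)^[k] A_{n,j}(delta)."""
    return sum(falling(d + j, k) * A(n, j, d) for j in range(n + 1)) / factorial(n)

for n in range(1, 5):
    for k in range(n + 1):
        vals = [mu(n, k, d) for d in (0.0, 0.25, 0.7)]
        assert np.allclose(vals, vals[0]), (n, k, vals)   # independent of delta

print([round(mu(n, 1, 0.3), 4) for n in range(1, 5)])     # mu_[1]^(n)
print([round((2**n - 1) / n, 4) for n in range(1, 5)])    # (2^n - 1)/n
```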
The Nielsen identity (12) seems to have been discovered in a somewhat less general context by
Paul S. Dwyer [13], who employed it to calculate factorial moments by means of cumulative sums;
see also Ch. A. Charalambides [9], who in addition to discussing Dwyer's work also showed that
these generalized Eulerian numbers are related to enumeration of compositions of integers.
The following corollary will be subsequently employed.

Corollary. Let $n$ and $k$ be non-negative integers with $k \le n$. Then

$\frac{1}{n!}\sum_{j=0}^{n} (\delta + j)^k A_{n,j}(\delta)$  (19)

is independent of $\delta$.

Proof. We can write

$(\delta + j)^k = \sum_{h=0}^{k} \beta_{kh}\,(\delta + j)^{[h]},$  (20)

where the $\beta_{kh}$ are the Stirling numbers of the second kind. Since the coefficients $\beta_{kh}$ do not depend on
$\delta$, substituting (20) into (19) and interchanging the order of summation, we get

$\sum_{h=0}^{k} \beta_{kh}\,\mu_{[h]}^{(n)}(\delta),$

which is independent of $\delta$.
Prior to demonstrating that the independence of $\delta$ noted in Theorem 1 and its corollary can not
be extended to $k = n + 1$, we will need to calculate the derivative of $\mu_{[k]}^{(n)}(\delta)$.
Thus, we have:

Theorem 2. Let $n$ and $k$ be non-negative integers and let $\delta$ be any real number. Then

$\frac{d}{d\delta}\,\mu_{[k]}^{(n)}(\delta) = \cdots$  (22)
Proof. Comparing the first term with (19) and employing the Pascal triangle identity on the
second term we get

$\cdots$  (23)

Further,

$\cdots$  (24)

From (13),

$\sum_{v=0}^{n} (-1)^v \binom{n}{v} (\delta + n - v)^{n-1} = 0,$  (25)

since it is the $n$th difference of a polynomial of degree $n - 1$.

In addition,

$\cdots$  (26)

or

$\cdots$  (27)

$\cdots$  (28)
3 Applications to Probability Theory
Let $U_1, U_2, \ldots, U_{n+1}$ be independent random variables uniformly distributed on (0,1). Let
$S_{n+1} = \sum_{i=1}^{n+1} U_i$. The distribution of $S_{n+1}$ is well-known and is given by the probability density
function

$f_{S_{n+1}}(x) = \frac{1}{n!}\sum_{v=0}^{n+1} (-1)^v \binom{n+1}{v} (x - v)_+^n, \qquad 0 < x < n+1,$  (31)

where

$(x - a)_+^n = (x - a)^n$ if $x - a > 0$, and $= 0$ otherwise.  (32)
Write

$S_{n+1} = [S_{n+1}] + \delta,$

where $[S_{n+1}]$ denotes the integer part of $S_{n+1}$. Clearly $\delta$ is a continuous random variable and
$0 \le \delta < 1$; $[S_{n+1}]$ is a discrete random variable with carrier set $\{0, 1, 2, \ldots, n\}$.
The conditional distribution of $S_{n+1}$ given that the fractional part of $S_{n+1}$ is $\delta$ is given by

$P\{S_{n+1} = j + \delta \mid S_{n+1} - [S_{n+1}] = \delta\} = f_{S_{n+1}}(j + \delta)\Big/\sum_{i=0}^{n} f_{S_{n+1}}(i + \delta),$  (33)

and thus (34) is a discrete probability distribution with carrier set $\{\delta, 1 + \delta, \ldots, n + \delta\}$.
Let W,, . be the random variable whose distribution is given by (34). We then have the fol-
lowing theorem.
Theorem 4. The moments of order $k = 0, 1, \ldots, n$ of $W_{n+1,\delta}$ coincide with the corresponding
moments of $S_{n+1}$, that is,

$E(W_{n+1,\delta}^k) = E(S_{n+1}^k), \qquad k = 0, 1, \ldots, n.$

Finally, we note that $W_{n+1,\delta}$ is asymptotically normally distributed. This is stated in the following
theorem.
Theorem 5. As $n \to \infty$, for $0 \le \delta < 1$, the distribution of

$\frac{W_{n+1,\delta} - \frac{1}{2}(n+1)}{\{(n+1)/12\}^{1/2}}$  (36)

converges to the standard normal distribution.
Proof. Both (36) and (37) are immediate consequences of the representation of $W_{n+1,\delta}$ as the con-
ditional distribution of the sum of $n+1$ independent uniform random variables on (0,1) given the
fractional part of the sum and the central limit theorem.
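Theorem 4 can be checked directly from the density (31). The sketch below (our own) builds the conditional distribution (34) on its carrier set and compares its first moments with Monte Carlo moments of S_{n+1}:

```python
import numpy as np
from math import comb, factorial

def irwin_hall_pdf(x, m):
    """Density (31) of S_m, the sum of m independent U(0,1) variables."""
    n = m - 1
    return sum((-1)**v * comb(m, v) * max(x - v, 0.0)**n
               for v in range(m + 1)) / factorial(n)

def moments_W(n, delta, kmax):
    """Moments of W_{n+1,delta}: S_{n+1} conditioned on fractional part delta."""
    support = np.array([j + delta for j in range(n + 1)])
    probs = np.array([irwin_hall_pdf(s, n + 1) for s in support])
    probs /= probs.sum()
    return [float(np.sum(probs * support**k)) for k in range(kmax + 1)]

n = 5
for delta in (0.1, 0.5, 0.9):
    print(delta, np.round(moments_W(n, delta, 5), 3))   # identical for k <= 5

s = np.random.default_rng(5).random((500000, n + 1)).sum(axis=1)
print("S_6:", np.round([np.mean(s**k) for k in range(6)], 3))  # 1, 3, 9.5, 31.5, ...
```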
Appendix
This Appendix is devoted to some tables illustrative of some of the quantities introduced in the
body of the paper.
Table A.1. $\mu_{[k]}^{(n)}(\delta)$

 n\k    0      1            2                        3
 0      1   $\delta$   $\delta(\delta-1)$   $\delta(\delta-1)(\delta-2)$
 1      1      1     $\delta - \delta^2$    $-2\delta^3 + 3\delta^2 - \delta$
 2      1    3/2            1              $(2\delta^3 - 3\delta^2 + \delta)/2$
 3      1    7/3            2                        1
The Distribution of $W_{n+1,\delta}$, $n = 5$, $\delta = .1, .4, .5, .9$
Note the symmetry for $\delta = .5$ and that $\delta = .9$ and $\delta = .1$ are identical when the column for
$\delta = .9$ is read going up and $\delta = .1$ is read going down (the entries $8 \times 10^{-8}$ and $5 \times 10^{-8}$ differ
as a consequence of rounding errors).
$E(W_{n+1,\delta}^k)$, $n = 5$, $k = 0, 1, \ldots, 10$; $\delta = 0, .1, .3, .5, .7, .9$

      delta = 0        .1         .3         .5         .7         .9
 k  0         1         1          1          1          1          1
    1         3         3          3          3          3          3
    2       9.5       9.5        9.5        9.5        9.5        9.5
    3      31.5      31.5       31.5       31.5       31.5       31.5
    4     108.7     108.7      108.7      108.7      108.7      108.7
    5     388.5     388.5      388.5      388.5      388.5      388.5
    6   1432.50   1432.50    1432.53    1432.55    1432.53    1432.50
    7   5431.50   5431.51    5432.01    5432.48    5432.31    5431.69
    8   21118.7  21117.60   21122.56   21129.77   21129.66   21122.07
    9   84010.5  83989.19   84020.48   84096.88   84116.67   84049.80
   10  341270.5 341018.48  341121.81  341763.40  342089.16  341628.77
References
[1] André, D., Mémoire sur les inversions élémentaires des permutations, Mem. della Pontificia
Accad. Romana dei Nuovo Lincei, 24 (1906), 189-223.
[2] Carlitz, L., Note on a paper of Shanks, Am. Math. Monthly, 59 (1952), 239-241.
[3] Carlitz, L., Eulerian numbers and polynomials, Math. Mag. 32 (1959), 247-260.
[4] Carlitz, L., Eulerian numbers and polynomials of higher order, Duke Math. J. 27 (1960),
401-424.
[5] Carlitz, L., A note on the Eulerian numbers, Arch. Math., 14 (1963), 383-390.
[6] Carlitz, L., Kurtz, D.C., Scoville, R., and Stackelberg, O.P., Asymptotic Properties of Eulerian
Numbers, Z. Wahrscheinlichkeitstheorie verw. Geb., 23 (1972), 47-54.
[7] Carlitz, L. and Riordan, J., Congruences for Eulerian numbers, Duke Math. J. 20 (1953),
339-344.
[8] Carlitz, L., Roselle, D.P., and Scoville, R.A., Permutations and sequences with repetitions by
number of increases, J. Combin. Theory, 1 (1966), 350-374.
[9] Charalambides, Ch.A., On the enumeration of certain compositions and related sequences of
numbers, Fibonacci Quarterly, 20 (1982), 132-146.
[10] Comtet, L., Analyse Combinatoire, Presses Universitaires de France, Paris, 1970.
[11] David, F.N. and Barton, D.E., Combinatorial Chance, Charles Griffin and Company Ltd.,
London, 1962.
[12] Dillon, J.F. and Roselle, D.P., Eulerian numbers of higher order, Duke Math. J., 35 (1968),
247-256.
[13] Dwyer, P.S., The calculation of moments with the use of cumulative totals, Ann. Math. Statist.,
9 (1938), 288-304.
[14] Feller, W., An Introduction to Probability Theory and its Applications, 2nd Ed., Vol. II, John
Wiley & Sons, Inc., New York, 1971.
[15] Frobenius, G., Über die Bernoullischen Zahlen und die Eulerschen Polynome, Sitz. Ber.
Preuss. Akad. Wiss. (1910), 809-847.
[16] Jordan, K., Calculus of Finite Differences, 2nd Ed., Chelsea Publishing Co., New York (1960).
[17] Nielsen, N., Traité élémentaire des nombres de Bernoulli, Gauthier-Villars et Cie., Paris
(1923).
[18] Poussin, F., Sur une propriété arithmétique de certains polynômes associés aux nombres
d'Euler, Comptes Rendus Acad. Sci. Paris, 266 (1968), 392-393.
[19] Riordan, J., An Introduction to Combinatorial Analysis, John Wiley & Sons, Inc., New York,
1958.
[20] Roselle, D.P., Permutations by number of rises and successions, Proc. Amer. Math. Soc., 19
(1968), 8-16.
[21] v. Schrutka, Lothar, Eine neue Einteilung der Permutationen, Math. Annalen, 118 (1941),
246-250.
[22] Shanks, E.B., Iterated sums of powers of the binomial coefficients, Am. Math. Monthly, 58
(1951), 404-407.
[23] Takács, L., A generalization of the Eulerian numbers, Publicationes Mathematicae Debrecen,
26 (1979), 173-181.
[24] Tanny, S., A probabilistic interpretation of Eulerian numbers, Duke Math. J., 40 (1973),
717-722.
[25] Toscano, L., Sulla somma di alcune serie numeriche, Tohoku Math. J., 38 (1933), 332-342.
[26] Worpitzky, J., Studien über die Bernoullischen und Eulerschen Zahlen, Journal für die reine
und angewandte Mathematik, 94 (1883), 203-232.
The Analysis of Multivariate Qualitative Data
Using an Ordered Categorical Approach
ABSTRACT
When the experimental units being classified are sub-sampling units in the study, an
ordered categorical procedure cannot be applied directly. Further, the count data ob-
tained, which are routinely analyzed by univariate statistical methods, ignore the depen-
dence among the responses. A modification of the method developed by Nair (1986, 1987)
is used to derive the scores and indices, which are analyzed by nonparametric AOV. An
example from teratogenicity studies is used to illustrate the technique.
This problem arises from the consideration of studies where a reproduction safety test
must be performed prior to the use of a drug, chemical or food additive. The standard pro-
tocol in such studies requires that pregnant female subjects (usually rodents) are randomly
assigned to one of four treatment groups. The appropriate dosage is administered shortly
after the beginning of gestation. When the animals are near term, they are sacrificed
and the number of potential offspring are counted. Other data collected are the number
of implantations, early and late fetal deaths, number of live offspring and the number of
fetuses according to various degrees of increasing severity of malformation. Also data on
continuous variables such as fetal weight are collected. It is unclear from the literature
which statistical methods are appropriate for the analysis of this type of data.
For continuous measurements one may quickly turn to the analysis of variance. For
count data describing the number of fetuses with or without some qualitative outcome,
other methods have evolved. A per-fetus analysis using total of early deaths and total
number of implantations in a Fisher exact test or a chi-squared test of independence may
be performed, but this appears to inflate sample sizes and ignores the dependence of
observations within litters. A review of per-fetus analysis is given by Haseman and Hogan
(1975) who conclude the per-litter analysis is more appropriate.
All but one of the proposed methods for per-litter analysis consider a single outcome.
The need to include within and among-litter variation negates the use of simple binomial
or Poisson models for count data. In the methods which consider several single responses,
a problem of family error rate arises. Since the tests are not independent, the nominal
family error rate cannot be exactly determined. The multivariate method developed by
Ryttman (1976) relies on the assumption of normality which is violated in the case of
fetal deaths. This lack of success, however, does not preclude a multivariate approach. In
situations where ranking the categories from mild to severe is possible, ordered categorical
models may be applied and the family error problem may be eliminated.
In this paper we obtain a scoring system for various outcomes which produces a severity
index for each litter. This index is sensitive to location shifts. The modeling which follows
will be based on this index.
The study design prohibits the straight-forward application of ordered categorical pro-
cedures because the items (fetuses) are not independent. Thus a scoring procedure allows
consideration of the effect of litter size on severity of the response, as a whole, in the litter.
Here the sampling unit is the fetus or individual. Three observations should be made: i)
results are different per litter than per fetus, ii) per litter evaluates the proportion of
fetuses affected rather than the numbers of affected litters, and iii) observed treatment-
control differences are less significant per litter than per-fetus analysis indicates (via simulation).
Univariate Analysis.
The simple analysis is based on litter as the experimental unit. This analysis is carried
out using binomial and Poisson models. The binomial assumption states that conditional
on litter size the number affected is binomial. The analysis is based on transformed data,
usually the arc-sine of the observed proportion. The Poisson model does not account for
litter size as it assumes the mean number affected is the same for all dose groups. The
analysis again uses a transformation, usually the square root of the observed number.
Neither fits the data very well. This may be due to extra-binomial or extra-Poisson
variability, as the case may be.
More sophisticated models reviewed by Haseman and Kupper (1980) include weighted
least squares based on proportions and unequal sample sizes. This approach due to Cochran
(1943) requires sample sizes which are too large for this application. Others include the
normal-binomial (Luning et al. 1966), beta-binomial (Williams 1975), negative binomial
(McCaughran and Arnold 1976), correlated-binomial (Altham 1978), and jackknife (Gladen
1979). Several nonparametric procedures have been tried, namely the Mann-Whitney U,
the Kruskal-Wallis and the Jonckheere/Terpstra. Some attempts at multivariate analy-
sis have been tried by Ryttman (1976), log-linear models by Haberman and others (1974)
and generalized linear models by McCullagh (1980). All of the latter techniques have
distributional assumptions.
Since some of the ordered categorical procedures develop or accept scores for the cat-
egories, this approach was pursued. Scores induce relative spacing among the categories.
Thus, a mean score may be obtained for each litter. This implies analysis by litter as a
sampling unit. We note that CATMOD in SAS allows for scoring, but the scores must be
user specified.
Ipsen (1955) suggested a scoring for bioassay. Instead of estimating an LD50 or ED50
based on number of survivors after x days, he ordered the data into categories with the
continuum represented by time (days). The scores proposed are such that the variance of
the linear regression of mean scores on the log dose is maximized with respect to the total
variance. An adjustment is made if the scores do not reflect the ordering of the categories.
Bradley et al. (1962) score by maximizing the treatment sum of squares after scaling
relative to the error sum of squares. This is an iterative procedure which does not require
the assumption of linearity.
Using no distributional assumption, Nair (1986, 1987) suggested some techniques for
analyzing ordered categorical data in the field of quality control. He showed the Taguchi
statistic for 2 x K tables can be orthogonally decomposed into K - 1 components, where
K is the number of categories. In the two-sample case he showed the first component
is equivalent to Wilcoxon's rank test on grouped data. Thus, this component would be
sensitive to location shifts in the multinomial model. Further, the second component corresponds
to Mood's rank test for grouped data, and thus is sensitive to scale changes in the 2 x K model.
In the non-equiprobable case the correspondence does not apply, though the interpre-
tation still holds. This result has been verified using a comparison density approach for
the two-sample problem by Eubank, LaRiccia and Rosenstein (1987).
When applied to 2 x K tables, the first set of Nair's scores is sensitive to shifts in location
of the underlying random variable. It is reasonable to suggest that, when applied to litters,
these scores yield a continuous index useful for detecting shift. In teratogenicity studies
the location shifts of interest would be those that indicate a significant dose-response.
Nair's Method
As already mentioned, the first and second components of the orthogonal decomposition
correspond to the Wilcoxon and Mood rank tests, respectively.
The Wilcoxon test considers
    H0: G(x) = F(x)
    H1: G(x) = F(x - θ),
where F, G are two distribution functions and θ is a location shift.
The Mood test considers
    H0: G(x) = F(x)
    H1: G(x) = F(x/θ),
where θ is a constant.
For more than two treatment groups the first component corresponds to the Kruskal-
Wallis statistic for grouped data, and the second to the generalized Mood statistic. In
the general (except equiprobable) case the equalities are no longer exact, but the first
two components have good power for detecting location and scale shifts, respectively. The
focus of this work is on location shifts.
Let p_ik, i = 1, 2, k = 1, 2, ..., K, denote the category probabilities for the two samples,
and let
    F_ik = Σ_{j ≤ k} p_ij
be the cumulative probabilities. The hypotheses are
    H0: F_1k = F_2k,  k = 1, 2, ..., K
    H1: F_1k - F_2k ≤ 0 for all k
(strict inequality for at least one k).
Let X²_kP be the Pearson χ² statistic from the 2 x 2 table in which column 1 contains the
cumulative frequencies of categories 1 through k and column 2 contains the cumulative
frequencies of categories (k+1) through K. Then
    T_E = Σ_{k=1}^{K-1} X²_kP.
T_E assigns weight w_k = [d_k(1 - d_k)]^{-1} to the kth term in the sum, where d_k is the
cumulative column proportion; this weight is equal for each k under H0. Nair's class of
statistics is
    T = Σ_{k=1}^{K-1} w_k { Σ_{i=1}^{2} R_i (Z_ik/R_i - d_k)² },
where Z_ik is the cumulative count in row i through category k and R_i is the ith row total.
The statistics in the class are obtained by the choice of the set {w_k}, where w_k > 0
for k = 1, 2, ..., K - 1. The decomposition is carried out conditionally on the marginal
proportions. Writing W = diag(w_1, ..., w_{K-1}) and using the d_k, we form the
(K - 1) x K matrix A whose (k, j) element is 1 - d_k for j ≤ k and -d_k for j > k.
Thus T is given by
    T = y_1'A'WAy_1 / (N r_1 r_2),
where y_1 is the vector of first-row cell counts, N is the total sample size, and r_1, r_2 are
the row proportions. Writing the spectral decomposition
    A'WA = Q Γ Q'
and setting
    U = Q'y_1 / (N r_1 r_2)^{1/2}
yields
    T = Σ_{j=1}^{K-1} γ_j U_j²,
where the γ_j's are elements of the vector of eigenvalues, Γ, and the U_j's are elements of U.
Under H0 the distribution of y_1 conditional on the row and column totals is multiple
hypergeometric, with
    E(y_1) = N r_1 c
and
    cov(y_1) = N (1 - 1/N)^{-1} r_1 r_2 [D_c - c c'],
where c is the vector of column proportions and D_c = diag(c). Under H0 it can be shown
that the limiting distribution of y_1 is multivariate normal as N goes to infinity. Thus
    T → Σ_{j=1}^{K-1} γ_j χ²_j(1)
in distribution, where the χ²_j(1) are independent chi-squared variables, each with one
degree of freedom.
As an approximate solution Nair proposed two sets of scores which have the same
properties as those obtained for the equiprobable case (i.e., c_k = 1/K). That is, the first
component of T_E, U_{E,1}, is equivalent to the Wilcoxon test on the 2 x K table, and U_{E,2},
the second component, is equivalent to Mood's
    M = Σ_{k=1}^{K} [k - (K + 1)/2]² y_1k.
They do not require the solution of the eigenvalue problem, as the orthogonal decomposi-
tion is not necessary. For the first component, all observations in a category are assigned a
score proportional to the midpoint of the category. For the second component the scores are
quadratic in the midrank. Additionally, each set of scores is adjusted to satisfy orthogo-
nality.
To calculate the scores, let c be of length K with elements the column proportions.
Form the K x K matrix B with .5 on the diagonal, 1 everywhere below the diagonal, and
0 above it. Let r = Bc and r* = r - .5(1). Note the r's are Bross's ridits. The first set of
scores is
    l = r* / (c'r*²)^{1/2},
where c'r*² = Σ_k c_k r*_k². The second set is
    e = l² - (c'l³) l - 1,   s = e / (c'e²)^{1/2},
where l² and l³ are taken elementwise; the correction terms make the two score sets
orthogonal.
The corresponding statistics are
    V_1² = L_1²/R_1 + L_2²/R_2,   where L_i = l'y_i, i = 1, 2,
and
    V_2² = S_1²/R_1 + S_2²/R_2,   where S_i = s'y_i, i = 1, 2,
which are comparable in magnitude, and consequently in power, to U_1 and U_2 respectively.
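As a concrete check, the construction can be carried out numerically. The sketch below
(Python) uses the reconstructed formulas for l, e and s exactly as given above, with the
column proportions taken from the pooled nitrofen counts of Table II; it reproduces the
ridits quoted later in the paper.

    import numpy as np

    # column proportions from the pooled counts (Normal, Gr. Retarded,
    # Malformed, Dead) of Table II: 719, 8, 402, 44 out of 1173
    c = np.array([719.0, 8.0, 402.0, 44.0]) / 1173.0
    K = len(c)

    # B: .5 on the diagonal, 1 below, 0 above, so r = Bc are Bross's ridits
    B = np.tril(np.ones((K, K)), -1) + 0.5 * np.eye(K)
    r = B @ c
    r_star = r - 0.5                       # centered ridits: c'r* = 0

    l = r_star / np.sqrt(c @ r_star**2)    # first (location) scores, c'l^2 = 1
    e = l**2 - (c @ l**3) * l - 1.0        # quadratic, orthogonal to l and 1
    s = e / np.sqrt(c @ e**2)              # second (scale) scores, c's^2 = 1

    print(np.round(r, 7))   # approx .3064791 .6163683 .7911338 .9812447
    print(np.round(l, 5))   # first scores; shifting so l[0] = 0 gives the
                            # category scores used for the severity index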
PROTOCOL
Live fetuses are weighed, sexed and examined for external malformations. They are
then sacrificed in order to perform the skeletal and visceral examinations. Recorded are the
number of corpora lutea on each ovary, the number of implantations, the number of fetuses,
and the number of resorptions in each uterine horn. Table I displays the data for each rodent
and dose level.
The following definitions are employed to categorize the fetuses: dead - early or late
resorption, or dead at c-section; malformed - gross, visceral or skeletal variation; growth
retarded - body weight more than two standard deviations from the mean for the given sex,
or by a range test; normal - absence of any of the previous outcomes. Tables II and
III summarize the results by number and percent for each dose by category. It should be
noted that the differing number of litters is due to nonpregnant females, not toxicity.
The final column of Table I is the calculated severity index. This index is calculated by
multiplying the score for the category by the number of fetuses in the category, summing
and dividing by the number of implantations, i.e.,
    SI = n'v / n'1,
where n is the vector of category counts for the litter and v the vector of category scores.
Details of the calculation of a severity index are given in the following example.
Table I
Nitrofen Data - Sprague-Dawley Rats
                 Number of              Growth                         Severity
Id             Implantations   Normal   Retarded   Malformed   Dead    Index
Dose Group = Control (0.0 mg/kg/day b.w.)
19 1 1 0 0 0 0.00000
8 4 3 0 0 1 0.68423
11 5 4 1 0 0 0.25139
7 8 7 0 0 1 0.34212
1 12 9 0 3 0 0.49145
16 14 14 0 0 0 0.00000
24 14 11 0 3 0 0.42124
6 15 15 0 0 0 0.00000
9 15 13 0 1 1 0.31352
20 15 12 0 3 0 0.39316
22 15 12 1 2 0 0.34590
2 16 14 1 1 0 0.20142
4 16 16 0 0 0 0.00000
10 16 16 0 0 0 0.00000
12 16 14 2 0 0 0.15712
17 16 16 0 0 0 0.00000
23 16 15 0 0 1 0.12286
3 17 11 0 6 0 0.69382
5 17 10 0 6 1 0.85481
13 17 13 0 3 1 0.50790
15 17 13 0 2 2 0.55326
21 18 13 0 4 1 0.58890
Dose Group = Low (6.25 mg/kg/day b.w.)
32 1 0 0 0 1 2.73692
28 12 10 0 2 0 0.32763
43 12 9 0 3 0 0.49145
26 14 10 0 4 0 0.56166
31 14 14 0 0 0 0.00000
39 14 11 0 3 0 0.42124
41 14 9 0 5 0 0.70207
47 14 10 0 4 0 0.56166
48 14 14 0 0 0 0.00000
33 15 10 0 5 0 0.65527
38 15 10 0 5 0 0.65527
40 15 13 0 1 1 0.31352
45 15 12 0 3 0 0.39316
25 16 11 0 5 0 0.61432
27 16 12 0 4 0 0.49145
34 16 10 0 6 0 0.73718
35 16 12 0 4 0 0.49145
37 16 9 0 7 0 0.86004
44 16 12 0 4 0 0.49145
46 16 11 0 5 0 0.61432
36 17 7 0 8 2 1.24708
Table I (Cont'd.)
Nitrofen Data - Sprague-Dawley Rats
                 Number of              Growth                         Severity
Id             Implantations   Normal   Retarded   Malformed   Dead    Index
Dose Group = Mid
54 2 0 0 0 2 2.73692
70 3 1 0 2 0 1.31054
59 4 2 0 2 0 0.98290
64 8 5 0 3 0 0.73718
53 11 4 1 5 1 1.25663
55 13 7 0 5 1 0.96661
58 14 7 0 6 1 1.03798
60 14 7 0 5 2 1.09306
65 14 8 0 5 1 0.89757
68 14 10 0 3 1 0.61674
62 15 6 0 6 3 1.33371
67 15 13 0 1 1 0.31352
71 15 8 0 4 3 1.07160
49 16 6 0 10 0 1.22863
69 16 11 0 4 1 0.66251
56 18 15 0 2 1 0.37047
57 18 13 0 5 0 0.54606
72 18 7 0 11 0 1.20133
Dose Group = High (25.0 mg/kg/day b.w.)
91 2 0 0 1 1 2.35136
80 7 3 0 3 1 1.23348
86 8 6 1 1 0 0.40284
73 10 1 0 9 0 1.76923
77 14 3 0 11 0 1.54456
78 14 3 0 11 0 1.54456
79 14 2 0 12 0 1.68408
83 14 9 0 5 0 0.70207
93 14 1 0 12 1 1.88047
76 15 0 0 15 0 1.96581
84 15 6 0 9 0 1.17049
92 15 7 0 8 0 1.04843
74 16 4 0 11 1 1.52255
87 16 6 0 10 0 1.22863
94 16 8 0 8 0 0.98290
95 16 6 0 10 0 1.22863
96 16 4 0 10 2 1.57075
89 17 6 0 11 0 1.27199
90 17 0 0 11 6 2.23797
75 18 6 0 12 0 1.31054
81 18 6 0 12 0 1.31054
88 19 11 0 7 1 0.86829
Table II
Number of Implantations
Group     Normal   Gr. Retarded   Malformed   Dead   Total
Control     252         5             35         8     300
Low         239         1             89         5     334
Mid         130         1             79        18     228
High         98         1            199        13     311
Table III
Percent of Implantations
Group     Normal   Gr. Retarded   Malformed   Dead
Control    84.0        1.7           11.7       2.7
Low        71.6        0.3           26.6       1.5
Mid        57.0        0.4           34.6       7.9
High       31.5        0.3           64.0       4.2
Number of Implantations (Fetuses)
               Normal   Gr. Retarded   Malformed   Dead   Totals
Control          252         5             35         8     300
Low Dose         239         1             89         5     334
Middle Dose      130         1             79        18     228
High Dose         98         1            199        13     311
Totals           719         8            402        44    1173
Calculate Bross's ridits (1958) by the formula r_k = (c_0 + c_1 + ... + c_{k-1}) + .5 c_k,
where c_0 = 0 and the c_k are the overall column proportions:
    r_k = .30647912   .61636829   .79113385   .98124468
Then a litter with 11 implantations, of which 4 are classified as normal, 1 as growth re-
tarded, 5 as malformed and 1 dead, would have a severity index of
    SI = [4(0.00000) + 1(1.25695) + 5(1.96581) + 1(2.73692)] / 11 = 1.25663.
This can be interpreted in light of the above scores, i.e., an index near zero would be
indicative of a litter with nearly all normal fetuses at cesarean-section and a score near
2.7369 would be indicative of a litter with nearly all fetuses dead at cesarean-section.
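A minimal sketch of the same calculation, assuming (as the all-normal and all-dead
litters of Table I imply) that the category scores are the first-set scores shifted so that
Normal scores zero:

    import numpy as np

    # category scores for (Normal, Growth Retarded, Malformed, Dead),
    # the shifted first-set scores implied by Table I
    v = np.array([0.0, 1.25695, 1.96581, 2.73692])

    counts = np.array([4, 1, 5, 1])      # the example litter, 11 implantations
    SI = counts @ v / counts.sum()
    print(round(SI, 5))                  # 1.25663 (litter 53 of Table I)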
Five designs which assume normality were evaluated: a one-way classification, a
one-way classification using litter size as a covariate, a generalized randomized block design
using litter size as a blocking variable, and weighted analyses using in one case litter size as
a weight and in the other the square root of litter size. The results are summarized in Table
IV in terms of the calculated F, associated P values and R².
Table IV
                                 df      F        P        R²
One-way analysis                3,81   22.97   < .0001    .46
Covariance                      3,80   25.99   < .0001    .53
Generalized RBD                 3,65   20.52   < .0001    .58
Weighted AOV (litter size)      3,81   15.62   < .0001    .37
Weighted AOV (√litter size)     3,81   33.11   < .0001    .55
As was expected, the covariance and blocking designs provided an improvement over the one-
way classification as measured by R². However, the magnitude of the improvement does
not seem to warrant the chance of violating the more restrictive assumptions placed on the
experiment by those designs. A better alternative, in the parametric case, may be using
the square root of litter size as a weight, which provides nearly the same value of R² as
does the blocking design. However, we would prefer the one-way analysis for its simplicity
and robustness in application.
The normality assumption on the severity index is quite suspect in many situations.
As an alternative, the nonparametric Kruskal-Wallis procedure was carried out. In view of
the overwhelming significance of the parametric procedures, this result was not surprising:
χ² = 47.75, df = 3, p < .0001. Figure 1 compares the linearity of the mean severity index
and the median severity index.
[Figure 1: mean and median severity index plotted by dose group; vertical axis 0.50 to 2.00.]
Statistical Procedure
The consideration of litter size is not necessary for analysis of the SI's. It is important
to note that the SI's are probably not normally distributed, particularly in the control
group and at the higher dose levels. The following is suggested for toxicity-teratogenicity
studies.
1. If the SI's are reasonably normal, calculate the AOV F-statistic for a one-way layout.
SAS code is available which reads litter data, calculates scores, computes SI's and cal-
culates the statistics. The results above have been "tested" by simulation analysis of
additional nitrofen studies and two other biological examples. Also, the method detects
different dose patterns with equal ability. The K-W test showed consistently higher power
than the F-statistic in the simulation studies.
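A minimal sketch of the suggested comparison, using scipy and two illustrative dose
groups (the SI values are the first five control and high-dose entries of Table I; a real
analysis would use all four groups):

    from scipy import stats

    control = [0.00000, 0.68423, 0.25139, 0.34212, 0.49145]
    high = [2.35136, 1.23348, 0.40284, 1.76923, 1.54456]

    F, p_f = stats.f_oneway(control, high)   # one-way AOV on the SI's
    H, p_h = stats.kruskal(control, high)    # nonparametric alternative
    print(F, p_f)
    print(H, p_h)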
REFERENCES
[1] Altham, P.M.E. (1978). Two generalizations of the binomial distribution. Applied
Statistics 27, 162-167.
[2] Bradley, R. A., S. K. Katti and I. J. Coons. (1962). Optimal scaling for ordered
categories. Psychometrika 27, 355-374.
[3] Bross, I.D.J. (1958). How to use ridit analysis. Biometrics 14, 18-38.
[4] Cochran, W. G. (1943). Analysis of variance for percentages based on unequal num-
bers. J. American Statistical Association 38, 287-301.
[5] Eubank, R.L., V.N. LaRiccia and R.B. Rosenstein. (1987). Test statistics derived as
components of Pearson's phi-squared distance measure. J. American Statistical
Association 82, 816-825.
[6] Gladen, B. (1979). The use of the jackknife to estimate proportions from toxicological
data in the presence of litter effects. J. American Statistical Association 74, 278-283.
[7] Haberman, S.J. (1974). Log-linear models for frequency tables with ordered classifica-
tions. Biometrics 30, 589-600.
[8] Haseman, J.K. and M.D. Hogan. (1975). Selection of the experimental unit in tera-
tology studies. Teratology 12, 165-171.
[9] Haseman, J.K. and L.L. Kupper. (1979). Analysis of dichotomous response data from
certain toxicological experiments. Biometrics 35, 281-293.
[10] Ipsen, J. (1955). Appropriate scores in bio-assay using death times and survivor
symptoms. Biometrics 11, 465-480.
[11] Luning, K. G., W. Sheridan, K. H. Ytterborn and U. Gullberg. (1966). The rela-
tionship between the number of implantations and the rate of intra-uterine death in
mice. Mutation Research 3, 444-451.
[12] McCaughran, D.A. and D.W. Arnold. (1976). Statistical models for numbers of im-
plantation sites and embryonic deaths in mice. Toxicology and Applied Pharmacology
38, 325-333.
[13] McCullagh, P. (1980). Regression models with ordinal data (with discussion).
J. Royal Statistical Society, B 42, 109-142.
[14] Nair, V.J. (1986). Testing in industrial experiments with ordered categorical data
(with discussion). Technometrics 28, 283-311.
[15] Nair, V.J. (1987). Chi-squared type tests for ordered alternatives in contingency
tables. J. American Statistical Association 82, 283-291.
[16] Williams, D.A. (1975). The analysis of binary responses from toxicological experi-
ments involving reproduction and teratogenicity. Biometrics 31, 949-952.
A SMALL SAMPLE POWER STUDY OF THE ANDERSON-DARLING STATISTIC
Abstract
1. INTRODUCTION
    W_n² = n ∫ [F_n(x) - F(x)]² ψ(F(x)) dF(x)   (1.1)
and
    K_n = sup_{-∞ < x < ∞} √n |F_n(x) - F(x)| [ψ(F(x))]^{1/2}.   (1.2)
Samples producing large values of W_n² (or K_n) lead to rejection of the null hypothesis.
By a suitable choice for the weight function ψ in (1.1) and (1.2), specific ranges
of values of the random variable X, corresponding to different regions of the
distribution, can be emphasized. With ψ ≡ 1, (1.1) is the
Cramér-von Mises statistic [Cramér, 1928 and von Mises, 1931] and (1.2) is K_n.
With the Anderson-Darling choice for the weighting function, metric (1.1) becomes the basis for
the present study. In Section 3, the most accurate tabulation to date of the test statistic is provided.
In Section 4, the description and the results of a power study are given in
which the Anderson-Darling, the Cramér-von Mises, and the Kolmogorov tests are compared.
2. THE ANDERSON-DARLING STATISTIC
Proof:
    P(nF_n(x) = k) = P(exactly k values x_i ≤ x),  for k = 0, 1, ..., n.
From Lemma 2.1, the mean and variance of F_n(x) follow directly. To construct a weighting,
that is, a function that will weight more heavily values in the tails of the
distribution F(x) at the expense of values closer to the median, consider the
expectation E[F_n(x) - F(x)]²; keep in mind that the value x is fixed, so F(x) is a constant.
This expectation, after algebraic manipulation (Appendix A), yields the variance and bias.
Under the null hypothesis H0: H(x) = F(x) for all x, (2.2) becomes F(x)[1 - F(x)]/n.
Anderson and Darling chose as a weighting function ψ[F(x)] = {F(x)[1 - F(x)]}^{-1},
which standardizes the variance of the statistic F_n(x) and also maintains the objective
of accentuating values in the tails. The resulting statistic is
    A² = n ∫ [F_n(x) - F(x)]² / {F(x)[1 - F(x)]} dF(x).
3. DISTRIBUTION OF THE ANDERSON-DARLING STATISTIC
The asymptotic distribution of the statistic was derived by Anderson and Darling [1952].
Lewis [1961] undertook the tabulation of F(z; n) = P(W² ≤ z) for z in [0.025, 8.000].
Lewis' table entries were computed using a Monte Carlo procedure based on the
observation that the U_i are distributed U[0,1] [Feller, 1966]. The table appearing
here extends those values.
With n fixed, the 95% confidence band around the Monte Carlo estimates used in the
construction of F_m(x) can be made arbitrarily small by a suitable choice for m, the number of
samples generated. For the width not to exceed 0.001, the value for m must be at least 7,375,881.
Table 1 gives F(z; n) for z from 0.025 to 8.000 and for n = 1, 2, ..., 10. The column labeled
"∞" contains the asymptotic values.
[Table 1: tabulated values of F(z; n); the entries are not legible in this copy.]
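The tabulation strategy itself is easy to sketch: estimate F(z; n) = P(A² ≤ z) by
simulating uniform samples and applying the computing formula for the statistic (derived
in Appendix B). The m used below is far smaller than the 7,375,881 required above, and
the cutoff 2.492 is the commonly quoted asymptotic 5% point, assumed here for
illustration rather than read from Table 1.

    import numpy as np

    def a_squared(u):
        """Anderson-Darling statistic for a sample of U(0,1) values."""
        u = np.sort(u)
        n = len(u)
        i = np.arange(1, n + 1)
        return -n - np.mean((2*i - 1) * (np.log(u) + np.log(1 - u[::-1])))

    rng = np.random.default_rng(1)
    n, m, z = 5, 100_000, 2.492
    sim = np.array([a_squared(rng.uniform(size=n)) for _ in range(m)])
    print((sim <= z).mean())    # Monte Carlo estimate of F(z; n)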
4. POWER STUDY
The power of the Anderson-Darling test was compared with two other goodness-of-fit
tests, the Kolmogorov and the Cramér-von Mises. In terms of the ordered values U_i,
the Kolmogorov statistic becomes
    D+ = max_i [i/n - U_i],   D- = max_i [U_i - (i-1)/n],
with D = max(D+, D-), and the Cramér-von Mises statistic
    W² = Σ_{i=1}^{n} [U_i - (2i-1)/(2n)]² + 1/(12n).   (4.2)
In the second case the parameters are not specified and must be estimated from the sample
data. Several alternative distributions were chosen, each with location parameter the same
as the null distribution, against which the power of the three goodness-of-fit tests is
compared. To obtain each point on the power curve, a large number of samples of size n
was generated from an alternative distribution, and the number of times the null hypothesis
was rejected at a specific level of significance was counted.
The ratio of this count, Y, to the number of samples generated, N, provides an estimate,
p = Y/N, of the probability of rejecting the null hypothesis when it should be rejected (power).
Since the counter Y is distributed binomial(N, p), where the parameter p is the true but
unknown power, an approximate confidence interval for p is available, and samples were
generated until the confidence interval for p given in (4.3) was sufficiently small. The study
required the confidence interval width not to exceed 0.025; the confidence limits (4.3) were
successively evaluated until the interval width was satisfied.
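The sampling loop just described can be sketched as follows; the batch size, the
asymptotic critical value 2.492, and the Cauchy alternative are illustrative assumptions,
not the study's actual settings.

    import numpy as np
    from scipy import stats

    def estimate_power(n=10, crit=2.492, batch=1000, width=0.025, seed=2):
        rng = np.random.default_rng(seed)
        Y = N = 0
        while True:
            for _ in range(batch):
                x = rng.standard_cauchy(n)          # alternative distribution
                u = stats.norm.cdf(np.sort(x))      # null: standard normal
                u = np.clip(u, 1e-12, 1 - 1e-12)
                i = np.arange(1, n + 1)
                a2 = -n - np.mean((2*i - 1) * (np.log(u) + np.log(1 - u[::-1])))
                Y += a2 > crit                      # count rejections
            N += batch
            p = Y / N                               # estimated power
            half = 1.96 * np.sqrt(p * (1 - p) / N)
            if N >= 2000 and 2 * half <= width:     # stop when the CI is narrow
                return p, N

    print(estimate_power())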
4.1. Case 1: Distribution Parameters Specified.
Samples of size n = 5, 10, 15, 20 were chosen for study and, as
previously mentioned, the location parameters of both the null and alternative
hypotheses coincided. The scale parameters for the alternative hypotheses were varied.
The level of significance for the study was 0.05. The critical value for
each test was determined from tables in Conover [1980] for the Kolmogorov
test, Stephens and Maag [1968] for the Cramér-von Mises test, and Table 1 in
Section 3 for the Anderson-Darling test. The power curves are monotone increasing and
are similar for the sample sizes and hypotheses chosen for this study. This is perhaps to be
expected; an explanation is suggested by comparing the densities when
N(0,1) and Cauchy(0, C) are compared. There it is seen (Figure 14) that an
increase (decrease) in the scale parameter C causes the tails of the distributions
to become heavier (lighter).
4.2. Case 2: Distribution Parameters Estimated. The tabulation assumes the
parameters are specified, and so precludes its use in the more likely situation
where they must be estimated; the tests are sometimes used anyway with the caveat that the
results are likely to be conservative. Here the null hypothesis is H0: H(x) = F(x),
where F(x) is N(μ, σ²) and the population parameters are estimated from the sample.
The results appear in Figures 16 - 27. As in case 1, the sample sizes are n = 5, 10, 15, and 20, and
the level of significance is 0.05. Both location and scale parameters coincide;
the scale parameters take values from 0.025 to 3.000 in increments of 0.025.
The power plots are horizontal, demonstrating that power does not
change with the scale parameter, and provide empirical support for Stephens' results.
This is as expected: when both location and scale parameters agree, all three tests are
competitive for the sample sizes and alternative distributions chosen for this
study.
[Figures 2 - 27: power curves for the Anderson-Darling, Cramér-von Mises, and Kolmogorov
tests under the null and alternative distributions described above; the plots are not legible
in this copy.]
REFERENCES
Kolmogorov, A. N. "Sulla determinazione empirica di una legge di distribuzione,"
Giornale dell'Istituto Italiano degli Attuari, 1933, Vol. 4, pp. 83-91.
Rubinstein, Reuven Y. Simulation and the Monte Carlo Method. New York:
John Wiley & Sons, 1981.
Stephens, M. A., and Urs R. Maag. "Further percentage points for W²," Biometrika,
1968, Vol. 55, pp. 428-430.
APPENDIX A: EXPECTATION OF SQUARED DISCREPANCY
BETWEEN AN EMPIRICAL DISTRIBUTION FUNCTION
AND A SPECIFIED DISTRIBUTION FUNCTION
Consider E[F_n(x) - F(x)]². Writing F_n(x) - F(x) = [F_n(x) - H(x)] + [H(x) - F(x)],
where H(x) is the true distribution,
    E[F_n(x) - F(x)]² = E[F_n(x)² - 2 F_n(x)H(x) + H²(x)] + [H(x) - F(x)]²
                      = H(x)[1 - H(x)]/n + [H(x) - F(x)]²,
the variance of F_n(x) plus the squared bias.
APPENDIX B: EXPANSION AND INTEGRATION OF
THE ANDERSON-DARLING STATISTIC
Starting from
    A² = n ∫ [F_n(x) - F(x)]² / {F(x)[1 - F(x)]} dF(x),
let u = F(x), du = dF(x), and u_i = F(x_i) for the ordered sample values. On the interval
(u_k, u_{k+1}) the empirical distribution function equals k/n, so the integral splits into
the sum
    A² = n Σ_{k=0}^{n} ∫_{u_k}^{u_{k+1}} (k/n - u)² / [u(1 - u)] du,
with u_0 = 0 and u_{n+1} = 1. Evaluating each integral and collecting terms yields the
computing formula
    A² = -n - (1/n) Σ_{i=1}^{n} (2i - 1)[ln u_i + ln(1 - u_{n+1-i})].
APPENDIX C: DERIVATION OF THE
CRAMÉR-VON MISES STATISTIC
Starting from
    W² = n ∫ [F_n(x) - F(x)]² dF(x),
split the integral at the ordered sample values x_1 < x_2 < ... < x_n and let u = F(x),
du = dF(x), u_i = F(x_i):
    W² = n Σ_{k=0}^{n} ∫_{u_k}^{u_{k+1}} (k/n - u)² du,
with u_0 = 0 and u_{n+1} = 1. Expanding each term and completing the square gives
    W² = Σ_{k=1}^{n} [u_k - (2k - 1)/(2n)]² + 1/(12n).
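The reduction can be verified numerically; the sketch below compares the closed form
with a direct discretization of the defining integral for a small uniform sample.

    import numpy as np

    rng = np.random.default_rng(3)
    u = np.sort(rng.uniform(size=8))
    n = len(u)
    i = np.arange(1, n + 1)

    w2_formula = np.sum((u - (2*i - 1) / (2*n))**2) + 1 / (12*n)

    # discretize n * integral over [0,1] of (F_n(t) - t)^2 dt
    grid = np.linspace(0.0, 1.0, 200_001)
    Fn = np.searchsorted(u, grid, side='right') / n
    w2_numeric = n * np.mean((Fn - grid)**2)

    print(w2_formula, w2_numeric)   # agree to several decimal places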
Nonpare, a Consultation System for Analysis of Data
1. Introduction
Statistical software packages, to a large extent, accept any properly configured
data set and proceed to process it. Few if any checks are made to ensure the adequacy
of the data and the suitability of the analysis, and little is done to provide an
explanation or interpretation of the results. This requires a great deal from the user.
Declining computation costs, together with the increased availability of computers and
the proliferation of statistical software, have further enhanced the opportunity for faulty
data analysis. Application of expert system techniques from artificial intelligence to
produce more cognizant software is one approach to reversing this unfortunate trend.
In 1985, a workshop sponsored by AT&T Bell Laboratories brought together
many of the active investigators in artificial intelligence and statistics and was the
genesis of a book by the same title edited by Gale [1]. This reference is in essence the
proceedings of the workshop; the papers given there, some with extensive
bibliographies, provide the most complete centrally-located account of research in this
topic to date.
This report details an effort underway at the US Army Ballistic Research
Laboratory (BRL) to develop a consultation system for analysis of data using
nonparametric statistical procedures. The system, called Nonpare, is intended to
serve as an intelligent interface that will act as a guide, an instructor, and an
interpreter to a body of statistical software. The system is currently a prototype, with
a first release planned for 1989 for field testing.
2. Nonpare
Nonparametric statistics is too large an area to hope to encompass at once,
especially if the entire field of mathematical statistics is partitioned into parametric
and nonparametric procedures. The common-sense approach to construction of
consultation systems suggests limiting the domain of application, but nonparametric
statistics has qualities that make it strongly appealing.
Nonparametric data analysis is characterized chiefly by the absence of
restrictive distribution assumptions, notably freedom from dependence on the normal
(Gaussian) distribution. Many nonparametric statistical procedures are exact rather
than approximate for small data sets, and they are the only confirmatory procedures
which can be used to analyze data collected on a nominal or an ordinal scale of
measurement. For these and other compelling reasons advanced, for example, by
Conover [2], Hollander and Wolfe [3], and Lehmann [4], nonparametric procedures
find use in a wide variety of disciplines.
2.1 The System Structure
Nonpare uses Genie, an expert system shell developed at the BRL [5], to
provide a frame-based production system with forward and backward inferencing as
well as an explanation facility that allows the user to interrogate the system: what
hypotheses are being entertained, what rules are being verified, what facts are in
evidence. Genie was chosen over commercial expert system shells for the research
and development of Nonpare because of its accessibility for modification.
Nonpare, shown schematically in figure 1, consists of three subsystems in
addition to Genie.
[Figure 1: schematic of Nonpare. Genie comprises the knowledge base, the inference
engine with forward and backward reasoning, and the explanation facility, communicating
with the user; the remaining subsystems are the nonparametric data analysis system and
the system dictionary.]
Is the Pr{perforation} > .80 ?
(A diversion here. Searching for a statistical procedure with a set of data
already collected is precisely how not to proceed. The purpose for collecting the data
should first be established, and then the statistical tools available to support this
purpose determined. Then the collection and analysis of data can proceed in an
informed manner. Lamentably, the methodology-search scenario is enacted over and
over again; so this example is not too contrived.)
It should be apparent from the outset that the question regarding
Pr{perforation} > .80 can never be answered unequivocally yes or no, but only with
some degree of qualification.
Nonpare presently has nineteen distinct data analysis procedures at its
disposal; the number continues to increase. No assumptions have been made about
their frequency of use; one procedure has not been declared most likely to be
exercised, a second procedure next most likely, and so on, since the base of potential
users is so broad. For the user, this means that any procedure is a likely starting point,
as in this session, the dialog of which begins in figure 2. In the remainder of this
section, the conventions that boldface denotes system prompts and brackets contain
user input will be adopted. An occasional system response may be italicized but
should not be confusing within the context of its appearance.
The session begins with a question about the configuration of the data.
Do you have a sample X1, ..., Xn ? The data look like X1, ..., Xn; respond
[y]es.
Are you interested in whether the data conform to a specified distribution ?
Nonpare is investigating a possible goodness-of-fit situation. A statistician,
anticipating an approach to this problem, might find a [y]es response appropriate
here. A nonstatistician, for whom this portion of the system is designed, and who is
interested in whether Pr{perforation} > .80, should respond [n]o, as indicated.
Are you interested in the probability of occurrence of a particular category or
event ? [y]es. The user is interested in the probability of occurrence of a perforation.
Enter the name of the category of interest. [perforation]. Domain-dependent
terminology is being introduced.
Are the n trials producing the values X1, ..., Xn independent ? Suppose the user is
unsure of the technical implications of the term "independent." An acceptable
response is [What is independent], as shown in figure 3.
Independence relates to freedom from external influence or control; here, the reference
is to measurements (data) being free to assume values without regard to other
measurements that may be made.
This illustrates a dilemma for the subject area specialist. It may be impossible to
rigorously define a term without reliance upon other terms that are equally obscure to
a user with only a modest statistical background. This is the case here, where
independence is bound to basic concepts of probability theory. Nonpare's response
conveys the notion, but regrettably not the substance, of independence. More work is
needed here. For now, assume the experimenter has collected a set of independent
data.
Are the n trials producing the values X1, ..., Xn independent ? [y]es.
Does each trial have the same probability p of producing the perforation ? [y]es.
Notice that Nonpare is now using language the user provided, when it talks about
probability of perforation.
Are you interested in considering whether the probability of occurrence of the
perforation equals or is bounded by some specified value p* ? [y]es. The user is
interested in the inequality Pr{perforation} > .80. After a [y]es response, the system
suggests a possible approach, shown in figure 4.
The binomial test is an appropriate procedure. To execute the
binomial test, use the menu to complete this statement:
I am interested in testing the null hypothesis that: The probability
of occurrence of the perforation ...
The menu allows the user to select either a two-sided or one-sided test of hypothesis
and is a potential source of error. Beginning statistics students, not realizing that a
null (or empty) hypothesis is chosen to be rejected, might mistakenly choose is at least
p* at this juncture. Here again, some level of statistical competence is required.
Selecting the hypothesis does not exceed p* from the menu using a mouse, the user
obtains for confirmation (figure 5) the statement:
I am interested in testing the null hypothesis that: The probability of occurrence of the
perforation does not exceed p*.
Specify the number of datum values assigned to the perforation. [11]
The first two "Specify ..." commands determine the appropriate binomial distribution;
the third determines the size of the critical region for the statistical procedure, which
is explained in figure 7, following the system-generated histogram shown in figure 6.
[Figure 6: system-generated histogram of Pr(X = x) for the binomial distribution with
n = 14 and p = .80; the bars for the region of rejection, x ≥ 11, are shaded light gray,
and the vertical axis runs to about .25.]
The histogram displays the probability of observing exactly x (of 14) perforations if
the true (but unknown) Pr{perforation} is .80. A statistician will readily assimilate
this graph. If the user merely looks at it as a plot in which the light gray region, the
points corresponding to counts ≥ 11, holds some special significance regarding the test
computations, it will have served its purpose here. Figure 7, which appears on the
terminal simultaneously, explains:
The critical level of this test, corresponding to the light gray region,
is .69.
This means that if you reject the hypothesis (The probability of
occurrence of the perforation does not exceed .8) you also tolerate a .69
probability of being in error.
Since the investigation began with the assumption (null hypothesis) that the
Pr{perforation} ≤ .80, the evidence collected (eleven perforations, three
nonperforations) is not sufficient to support abandonment of that assumption. A
probability of being in error of .69 is more than a reasonable person would be willing
to assume. And so, the response to the original question, Is the Pr{perforation} >
.80 ?, is a qualified no, the qualification being expressed through invocation of the
critical level.
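The arithmetic behind the reported .69 is easy to reproduce, assuming the critical
level is the binomial tail probability P(X ≥ 11) computed at the boundary value p = .80.
The interval construction shown afterward is Clopper-Pearson, one common exact recipe;
Nonpare's (.48, .94), reported below, evidently uses a slightly different one.

    from scipy import stats

    n, x, p0 = 14, 11, 0.80
    critical_level = stats.binom.sf(x - 1, n, p0)   # P(X >= 11 | p = .80)
    print(critical_level)                           # 0.698..., the .69 reported

    # Clopper-Pearson 95% interval for p from 11 successes in 14 trials
    lo = stats.beta.ppf(0.025, x, n - x + 1)        # about .49
    hi = stats.beta.ppf(0.975, x + 1, n - x)        # about .95
    print(lo, hi)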
Would you like to run the binomial procedure again ?
At this juncture, an experimenter might well be asking a number of "What if"
questions. "What if I had been able to afford three more firings?" or, "What if I had
observed one more perforation?" and so on. A response of [y]es here allows the user
to exercise the binomial procedure directly, without having to respond again to all the
preliminary questions. A [n]o response is given, but this is an excellent place to use
Nonpare's tutorial capabilities to study the sensitivity of the binomial procedure to
modification of parameter values or slight changes in the data.
Are you interested in determining an interval in which the probability p of
occurrence of the perforation lies?
The foregoing analysis suggests that an assertion that the probability of perforation
lies in the interval (.80, 1] cannot be made. What interval might be expected to
capture this unknown parameter? A response of [y]es causes this question to be
answered, first graphically, as in figure 8, and then verbally, as in figure 9.
[Figure 8: graphical display of the confidence interval; the point estimate p = .78 is
shown with the 95.0% confidence interval (.48, .94).]
Would you like a confidence level other than .95 ? [n]o. The 95% confidence level
was prechosen. A [y]es response allows the user to control the confidence level. The
session is terminated with a [n]o response, shown in figure 9.
was prechosen. A [y]es response allows the user to control the confidence level. The
session is terminated with a [n]o response, shown in figure 9.
At the conclusion of the session the inference engine displays a fact solution
tree for all the intermediate decisions leading to the final conclusion. Buttoning with a
mouse any node of the fact tree produces the logic leading to that location. In figure
10, fact 11 was buttoned, and the corresponding trace is displayed beneath the fact
tree. These are features of the inference engine rather than Nonpare, but they are
valuable as diagnostics to the developer and provide some measure of reassurance to
the user.
[Figure 10: the fact solution tree for the session, with the trace for the buttoned node
displayed beneath the tree.]
4. Conclusions
Nonpare, a consultation system for analysis of data using nonparametric
statistical procedures, has been described, and most of its operational features have
been illustrated. The essence of the system is the rule-based interface with
accompanying software for data analysis and the interpretation of the ensuing
computations. Nonpare is under active development, but its feasibility as an
operational system has been established. Enlargement of the rule base and the
addition of more statistical procedures are clearly indicated before it can approach its
potential. Not surprisingly, tangential problems in basic research have been spawned
by this effort. A first release is planned for 1989 for field testing.
References
[1] W.A. Gale, Ed., Artificial Intelligence and Statistics (Addison-Wesley, 1986).
[2] W.J. Conover, Practical Nonparametric Statistics (John Wiley, 1980).
[3] M. Hollander and D.A. Wolfe, Nonparametric Statistical Methods (John Wiley,
1973).
[4] E.L. Lehmann, Nonparametrics (Holden-Day, 1975).
[5] F.S. Brundick, et al., Genie: An inference engine with applications to vulnerability
analysis, Technical Report BRL-TR-2739, US Army Ballistic Research Laboratory,
Aberdeen Proving Ground, MD (1986).
Numerical Estimation of Gumbel Distribution Parameters
EXPERIMENTAL DESIGN AND OPTIMIZATION OF BLACK CHROME
SOLAR SELECTIVE COATINGS
I. J. Hall and R. B. Pettit
Sandia National Laboratories
Albuquerque, NM 87185
ABSTRACT. Some years ago Sandia Laboratories was given the
task of investigating selective coatings for solar applications.
Early experimental results, which were based on one-variable-at-
a-time experiments, produced acceptable coatings in the
laboratory. However, when full-scale parts were coated by
commercial electroplaters, the coatings quickly degraded when
heated in air. At this point a systematic approach using a
fractional factorial design was used to determine both the
effects and interactions of several variables, including the
bath composition (four variables), current density, plating time,
substrate, and bath temperature. Response surfaces for the
optical properties of the coatings were constructed for both the
as-plated and the thermally aged samples. These response
surfaces were then used to specify ranges for the bath
compositions, and other plating parameters, that provided
coatings with optimum thermal stability. With proper control of
the plating variables, selective coatings were obtained that
should maintain high solar absorptance values during years of
operation at 300°C in air.
1. INTRODUCTION. Two variables are of interest to
selective coating investigators, namely, absorptance (α) and
emittance (ε). Good selective coatings have high α's and low
ε's. In our investigations we concentrated on making α as large
as possible and settling for the corresponding ε if it was not
"too big". The independent variables that affected α and ε
divided themselves into two groups (bath variables and plating
variables) in such a way that a split-plot experimental design
would have been appropriate. The bath variables would have been
associated with the whole plots and the plating variables with
the sub-plots. The bath variables were chromic acid, trivalent
chromium, addition agent and iron, and the plating variables were
plating time, current density, bath temperature, bath agitation
and substrate. For a specified combination of bath variables an
entire set of experiments was possible for the plating variables,
as in a split-plot design. Because of many constraints we did not
run the experiment as a split-plot design. The dependent variable
readings (α's) were obtained by coating a substrate and then
measuring the absorptance with a Beckman Model DK 2A
spectroreflectometer. Readings were obtained for the substrate
both as-plated and as-aged. The as-aged readings were obtained
after the specimens were heated in a 450°C furnace for 40 hours,
while the as-plated readings were taken before the specimens were
subjected to any extreme environments. The aged readings were
the most important because we were concerned about the thermal
stability of the coatings, i.e., whether the coatings would degrade
at high temperature over extended time periods. The experimentation
was done in three phases that are briefly described below.
2. Experimentation. Based on previous experience, we
decided that the bath variables were most important and thus we
concentrated most of our efforts on investigating these
variables. The plating variables were set at nominal values. We
used standard response surface methodology to guide us in the
experimentation. (See Box, Hunter, and Hunter, "Statistics for
Experimenters", Chapter 15, 1978.) The first phase consisted of
running a 1/2 replicate of a 2^4 factorial experiment on the four
bath variables. The experimentation was done in a rather limited
range of the factor space. The results of this experiment were
used to determine a path of steepest ascent (phase two). Three
more experiments were done along this line of steepest ascent.
These experiments would normally indicate a region in the bath
variable space that would produce larger α values. In our case,
however, all the coatings turned gray after a short time in the
furnace - a highly undesirable result. The most valuable
information from these three bath experiments was that a "cliff"
existed in the response surface. Because of time limitations we
did not repeat the experiments along the steepest ascent line.
Based on a combination of engineering judgement and factorial
design methodology, several more baths were mixed and the α's
measured on the coated substrates (phase three). A total of
eighteen baths were mixed, and the results from these baths were
used to estimate a quadratic surface - i.e., α was written as a
function of a second degree polynomial in the four bath variables
and the coefficients were estimated using a backward
stepwise statistical package. The final regression equation had
11 terms including the constant term, with R² = 0.96. Several
graphs were drawn based on this equation that allowed us to map
out an acceptable region in the bath variable space. This space
was very near the "cliff" in the response surface. A limited
number of experiments also were done involving the plating
variables for a fixed bath. Based on these experiments we were
able to specify ranges for the plating variables as well.
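The phase-three fit is easy to sketch. The code below builds the full 15-term
quadratic design matrix in four bath variables and fits it by least squares; the data
are hypothetical stand-ins for the eighteen measured baths, and the real analysis
additionally applied backward stepwise elimination to arrive at the 11-term equation.

    import numpy as np
    from itertools import combinations_with_replacement

    rng = np.random.default_rng(4)
    X = rng.uniform(size=(18, 4))             # 18 baths x 4 bath variables
    a = 0.9 + 0.05*X[:, 0] - 0.04*X[:, 1]**2 + rng.normal(0, 0.005, 18)

    cols = [np.ones(18)]                      # constant term
    cols.extend(X.T)                          # linear terms
    for (i, j) in combinations_with_replacement(range(4), 2):
        cols.append(X[:, i] * X[:, j])        # squares and cross products
    D = np.column_stack(cols)                 # 18 x 15 design matrix

    b, *_ = np.linalg.lstsq(D, a, rcond=None)
    resid = a - D @ b
    R2 = 1 - resid @ resid / np.sum((a - a.mean())**2)
    print(len(b), round(R2, 4))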
3. Summary. Using response surface methodology we were
able to determine the variables, and the ranges of those variables, that
produced stable selective coatings. The procedures developed in
the laboratory were subsequently implemented in a production
environment with excellent results. The close interaction
between the statistician and the experimenter led to a
satisfactory solution with a rather limited number of
experiments.
DETERMINATION OF DETECTION RANGE OF
MONOTONE AND CAMOUFLAGE PATTERNED FIVE-SOLDIER
CREW TENTS BY GROUND OBSERVERS
Christopher J. Neubert
U. S. Army Materiel Command
Alexandria, Virginia 22333-0001
ABSTRACT
Field evaluations have determined that camouflage patterns reduce detectability ranges for
uniforms and vehicles in woodland environments. This study identified the effects of three pat-
terned and two monotone Five-Soldier Crew Tents using detection ranges and numbers of false
detections as determined by ground observers. The distances of correct detections were recorded
along with the number of false detections. An analysis of variance for the detection ranges and
number of false detections was performed. Duncan's Multiple-Range Test was used to
determine significant differences (α = 0.05) in groups of tents. From these data, it was deter-
mined that the three patterned Five-Soldier Crew Tents were more difficult to detect than the
two monotone tents.
Several years ago, the U.S. Army decided that camouflage patterns have a definite ad-
vantage when used on uniforms and vehicles in woodland environments. This has led to a similar
consideration for tents, since the current U.S. Army tents are solid (i.e., monotone) colors. Tents
present a large, relatively smooth form, making them conspicuous targets. The use of patterns
to break up this signature could increase camouflage effectiveness. However, before such a
judgement could be made, a field test was planned to determine the relative merits of various
patterns versus monotones in a woodland background. The Natick RD&E Center fabricated
three patterned tents and two monotone tents for evaluation. In consultation with Belvoir, the
patterned tents were fabricated in the standard four-color uniform pattern, one in the standard
pattern size and the other two in progressively larger expanded patterns. The two monotone
tents were in colors Forest Green and Green 483 (483 being the textile equivalent of paint color
Green 383). A test plan 1/ was developed by Belvoir at the request and funding of Natick, and
the field test was conducted by Belvoir at Ft. Devens, Massachusetts, in the summer of 1987.
This report describes the test and its results.
2.0 SECTION II - EXPERIMENTAL DESIGN
Table 1
Distances of Markers to Tents for Site #1
S   1,182.64      S'   464.78
Y   1,128.57      Y'   446.74
Q   1,094.00      Q'   428.17
L   1,049.93      L'   413.48
F   1,008.07      F'   398.48
P     978.31      P'   383.34
E     947.02      E'   364.04
K     902.75      K'   348.27
A     858.10      A'   334.46
T     817.81      T'   322.69
V     778.91      V'   308.59
B     750.15      B'   289.59
M     709.76      M'   281.60
U     674.87      U'   269.08
H     702.65      H'   253.16
Z     677.99      Z'   235.50
R     648.46      R'   217.81
N     613.35      N'   199.60
X     602.56      X'   178.93
I     594.57      I'   156.76
D     578.05      D'   141.15
C     561.16      C'   120.05
O     541.70      O'   102.34
J     525.33      J'    85.37
G     505.62      G'    62.81
W     483.64      W'    41.84
Table 2
Distances of Markers to Tents for Site #2
F   1,205.36      A    653.34
W   1,168.63      Z    613.20
U   1,130.58      E    574.09
O   1,086.03      P    540.30
C   1,048.10      H    513.10
R   1,006.15      K    496.46
V     982.00      S    475.57
Q     974.13      F'   459.10
M     942.37      W'   417.71
I     901.58      U'   379.40
B     889.75      O'   338.25
J     858.01      C'   296.90
L     851.84      R'   278.53
X     841.28      V'   258.20
G     803.95      Q'   220.73
D     764.09      I'   180.87
Y     723.48      B'   143.94
T     695.32      J'   111.00
N     673.60      L'    89.78
A total of 153 enlisted soldiers from Ft. Devens served as ground observers. All person-
nel had at least 20/30 corrected vision and normal color perception. A minimum of 30 observers
were used for each test tent, about evenly split between test sites. Each observer was used only
once.
The test procedure was to determine the detection distances of the five tents involved by
searching for them while traveling along the predetermined measured paths. Each ground ob-
server started at the beginning of the observation path, i.e., marker S for Site #1 and marker
F for Site #2. The observer rode in the back of an open 5/4-ton truck accompanied by a data
collector. The truck traveled down the observation path at a very slow speed, about 3-5 mph.
The observer was instructed to look for military targets in all directions except directly to his
rear. When a possible target was detected, the observer informed the data collector and pointed
to the target. The truck was immediately stopped, and the data collector sighted the apparent
target. If the sighting was correct, i.e., the Five-Soldier Crew Tent, the data collector recorded
the alphabetical marker nearest the truck. If the detection was not correct, the false detection
was recorded, and the data collector informed the observer to continue looking. The truck
proceeded down the observation path. This search process was repeated until the correct tar-
get (tent) was located.
The tents were rotated between the two test sites on a daily basis, until all tents had been
observed by at least 15 observers at each site. (This number of observers allows the use of
parametric statistics, which have a good opportunity to yield absolute conclusions.) Their orien-
tations with respect to the sun were kept constant at both test sites. The Five-Soldier Crew
Tent was positioned so that a full side was facing the direction of observer approach.
Table 3
Mean Detection Ranges (Meters) and 95 Percent Confidence Intervals
                                         95 PERCENT CONFIDENCE INTERVAL
TENT    N    MEAN    STANDARD ERROR    LOWER LIMIT    UPPER LIMIT
[The table entries are not legible in this copy.]
Table 4
Analysis of Variance for Tent Detection Across Five Levels of Color Variation
SOURCE    DEGREES OF FREEDOM    SUM OF SQUARES    MEAN SQUARE    F-TEST    SIG LEVEL
[The table entries are not legible in this copy.]
Table 4 indicates that there are significant differences in the ability of the ground observers
to detect the Five-Soldier Crew Tents in different four-color patterns and solid colors.
[Figure 1: mean detection ranges, with confidence bounds, plotted for Five-Soldier Crew
Tents A through E.]
Table 5
Duncan's Multiple-Range Test (Detection Ranges)
The harmonic mean group size is 30.58. The subsets are significant at α = 0.05.
[The subset entries are not legible in this copy.]
Duncan's Multiple-Range Test separates a set of significantly different means into sub-
sets of homogeneous means. One of the assumptions is that each random sample is of equal
size. Since this was not true, the harmonic mean of the group sizes was used as the group size. As
seen above, the range of detection was shortest for tents A, C, and D, and these tents do
not differ significantly from each other (α = 0.05). Tent E had the longest mean range of detec-
tion and is significantly (α = 0.05) different from the other 4 tents in this respect.
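The harmonic mean step is simply the number of groups divided by the sum of reciprocal
group sizes; the observer counts below are hypothetical (summing to the 153 observers
reported), since the report gives only the result, 30.58.

    k = 5
    sizes = [33, 31, 30, 30, 29]            # illustrative counts per tent
    harmonic_mean = k / sum(1.0 / n for n in sizes)
    print(round(harmonic_mean, 2))          # common group size for Duncan's test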
3.2 False Detections
The number of false detections is defined as the number of times a target other than the
test target is detected by an observer. In this study such detections were of rocks, trees, shadows,
etc. These detections, as a rule, are a function of how hard it is to detect the test target. The
more difficult the detection task, the greater the number of false detections. Tables 6, 7, and
8 show the false detection data. Table 6 gives the mean false detection value, and its associated
95 percent confidence interval, for each of the Five-Soldier Crew Tents. Table 7 contains the
analysis of variance performed upon the data of Table 6 to determine if there were significant
differences in the rates of false detection. Table 8 indicates which tent patterns and colors had
significantly different rates of false detection.
Table 6
                                      95 Percent Confidence Interval
Tent    N    Mean    Standard Error    Lower Limit    Upper Limit
[The table entries are not legible in this copy.]
Table 7
Analysis of Variance for Rates of False Detection Across Five Levels of Color Variance
SOURCE    DEGREES OF FREEDOM    SUM OF SQUARES    MEAN SQUARE    F-TEST    SIG LEVEL
[The table entries are not legible in this copy.]
Table 7 indicates that there are significant differences in the rates of false detection for
the Five-Soldier Crew Tents.
[Figure 2: mean rates of false detection plotted for Five-Soldier Crew Tents A through E.]
Table 8
Duncan's Multiple-Range Test
(Rates of False Detection)
   SUBSET 1           SUBSET 2
GROUP   MEAN       GROUP   MEAN
  E     2.50         B     3.53
  C     3.38         D     3.87
  B     3.53         A     4.87
  D     3.87
The rates of false detection for tent groups E, C, B, and D, and for groups B, D, and A, were
not significantly different (α = 0.05). However, subset 1 is significantly different from subset 2.
Duncan's Multiple-Range Test (Table 5) shows that the group of Five-Soldier Crew
Tents A, C, and D had the shortest detection ranges. Tent A is the standard size woodland
uniform four-color pattern, while Tents C and D are expansions of this pattern. The pattern of
Tent A is repeated every 27.25 inches, the pattern for Tent C is repeated every 36 inches, and
the pattern for Tent D is repeated every 50 inches. Tents C, D, and B are significantly different
from each other. Tent B is the solid color Forest Green. Tent E, the solid color Green
483, had the longest mean detection range (674.89 meters), and this is significantly (α = 0.05)
longer than any of the other means for the Five-Soldier Crew Tents. Thus, it can be concluded
that the patterned tents are harder to detect from ground observation, but that the pattern
should not be expanded beyond the repeat of every 36 inches. The human eye is probably resolv-
ing the larger pattern repeated every 50 inches as being different from the tree and bush back-
ground (the color brown, in particular, becomes distinguishable from the woodland background
when overexpanded).
When working with detection ranges, the question of field data stability is always paramount
to the amount of weight that can be given to the test conclusions. One of the best methods to
determine data stability is through a test-retest procedure. Field studies are very expensive and
time consuming, so such data are very rare. We do have such an opportunity to examine this type
of data for the Turner Drop Zone. A ground evaluation of camouflage nets was conducted in
the summers of 1985 3/ and 1987 4/. The net sites and test procedures were identical to the sites
and test procedures in which the Five-Soldier Crew Tents were evaluated. In both net studies,
the standard camouflage net was evaluated. In 1985 this net had a mean detection range of
411.75 meters, while in 1987 the mean detection range was 414.41 meters. This difference in
mean detection range is only 2.66 meters. From these results, it is inferred that the mean detec-
tion ranges for the Five-Soldier Crew Tents are stable, and solid conclusions about their
camouflage effectiveness can be made.
The analysis of false detections seen in Table 8 and Figure 2 also lends credence to the
belief that Five-Soldier Crew Tent A had the best performance as to camouflage effective-
ness, and Tent E the worst. From the discussion of false detections in Section 3.2, it would
be expected that Tent A, being the hardest to find, would have the most false
detections, and Tent E the least number of false detections. This is exactly what occurred, with
Tent A having a mean false detection rate of 4.87, and Tent E a mean false detection rate of
2.50. Duncan's Multiple-Range Test (Table 8) shows that the two rates of false detection dif-
fer significantly (α = 0.05) from each other. The false detection rates of tents B, C, and D are
not in the expected ordinal position. The expected order, based upon mean range of detection,
would be B, D, and C, while the true order of rates of false detection is C, B, and D. However,
a check of Tables 5 and 8 shows that these tents are not significantly different from each other
either for range of detection or for rate of false detection. Thus, from a statistical view, these
three tents are considered to have the same ordinal position.
Five Five-Soldier Crew Tents were evaluated by ground observers to determine their
camouflage effectiveness as measured by the mean detection range and the mean rate of false
detection. These tents were in the following four-color camouflage patterns and solid colors:
* Tent A - Standard size four-color uniform pattern repeated every 27.25 inches
* Tent B - Forest Green
* Tent C - Expanded four-color uniform pattern repeated every 36 inches
* Tent D - Expanded four-color uniform pattern repeated every 50 inches
* Tent E - Green 483
A minimum of 30 ground observers per Five-Soldier Crew Tent were driven toward each of two
sites on marked observation trails in the back of an open 5/4-ton truck. The observers were
looking for military targets, and they informed the data collector when they thought they saw
one. If the detection was correct, the closest alphabetic ground marker to the truck was recorded.
From this letter, the distance to the tent from the truck was determined. If the detection was
not correct, i.e., a false detection, it was noted on the data sheet. The ground observer then con-
tinued the search, with the truck traveling down the observation path until the test target was
seen. An analysis of the resulting data provided the following conclusions:
A. Five-Soldier Crew Tent A was the most camouflage effective, with the lowest mean
range of detection and highest rate of false detections.
B. Four-color pattern Five-Soldier Crew Tents are more camouflage effective than solid
colors.
C. The expanded four-color pattern, repeated every 50 inches, is too large to be effective
in denying detection. (The color brown becomes distinguishable from the woodland background
when overexpanded.)
D. The solid colors Green 483 and standard Forest Green should not be used.
E. The mean range of detection data appears to be very stable. A test-retest field study
using identical sites and test procedures in the summers of 1985 and 1987 involving the stand-
ard camouflage net yielded mean detection ranges of 411.75 and 414.41 meters, respectively.
REFERENCES
1. Anitole, George, and Johnson, Ronald, Unpublished Outline Test Plan, Evaluation of
Camouflage Tents, U.S. Army Belvoir Research, Development and Engineering Center, Fort Bel-
voir, VA, 1987.
2. Natrella, Mary G., Experimental Statistics, National Bureau of Standards Handbook 91, U.S.
Department of Commerce, Washington, D.C., 1966.
3. Anitole, George, and Johnson, Ronald, Statistical Evaluation of Woodland Camouflage Nets by
Ground Observers, U.S. Army Belvoir Research, Development and Engineering Center, Fort Bel-
voir, VA, August 1986.
4. Anitole, George, and Johnson, Ronald, Evaluation of Woodland Camouflage Nets by Ground
Observers, U.S. Army Belvoir Research, Development and Engineering Center, Fort Belvoir, VA,
1988.
AN EXAMPLE OF CHAIN SAMPLING AS USED IN ACCEPTANCE TESTING
JERRY THOMAS
ROBERT L UMHOLTZ
WILLIAM E. BAKER
ABSTRACT
The Probability and Statistics Branch of the Ballistic Research Laboratory was asked to
develop a procedure of acceptance testing for armor packages. Because the available sample
sizes were extremely small, we were unable to identify a sampling plan directly applicable to
this problem. Accordingly, we have devised a new procedure by adapting an existing tech-
nique, known as chain sampling, to both the attribute portion (structural integrity) and the
variable portion (penetration depth) of the acceptance testing process. Operating charac-
teristic curves and power curves are presented for this procedure, and suggestions are made
I. INTRODUCTION
In most cases a consumer's decision concerning whether or not to accept a manufac-
tured product is based on an examination of a sample from that product. When General
Mills introduces a new pre-sweetened breakfast cereal, they spend millions of dollars in
advertisement costs with the hope that the consumer will sample it. Here, the consumer con-
siders the entire supply of this new cereal as a single manufactured lot, to be accepted or
rejected. Product acceptance, in this case, corresponds to the consumer purchasing more
boxes of the new cereal.
This is merely an everyday example of what is known as acceptance sampling, that is,
various techniques which allow for discrimination between an acceptable product and an
unacceptable one. Sampling may be based on an attribute criterion, a variable criterion, or
some combination of these. In our example the consumer may judge the sweetness of the
cereal as satisfactory or excessive (attribute), or he may measure the time in milk before the
cereal becomes soggy (variable). Sampling by attributes is a dichotomous situation in that,
based on a particular attribute, each item is either defective or non-defective; rejection occurs
if there is a high percentage of defectives in the sample. Sampling by variables establishes an
acceptable level of a particular variable, and rejection occurs if its sample value crosses the
acceptable threshold. Of course, in our example of a box of cereal, the sample size was one.
Generally, this will not be the case; but occasionally, for one reason or another, the consumer
is forced to make a decision based upon a very small sample size.
Because decisions are made from samples, there is some risk of error, either the error of
accepting a bad product or the error of rejecting a good product. The amount of protection
desired against such risks can be specified. The Acceptable Process Level (APL) is a high-
quality level that should be accepted 100(1-α)% of the time; α is thus defined to be the
producer's risk. The Rejectable Process Level (RPL) is a low-quality level that should be
accepted only 100β% of the time; β is thus defined to be the consumer's risk. Unfor-
tunately, these error factors vary inversely; that is, as the consumer's risk grows, the
producer's risk diminishes and vice versa. The Operating Characteristic (OC) curve is an
important part of any acceptance sampling plan, since it provides a graphical display of the
probability of accepting a product versus the value of the particular parameter being
inspected. The OC curve is a function of APL, RPL, α, and β, as well as sample size. Given a
particular acceptance sampling plan, the OC curve depicts the associated error risks and
demonstrates the relationship among all of the variables.
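To make the trade-off between α and β concrete, the following sketch (ours, not part of the original paper) computes both risks for a generic single-sample attribute plan; the sample size, acceptance number, and quality levels are hypothetical illustrations only. Python is used here and in the sketches that follow.

    # Hypothetical single-sample attribute plan: accept the lot when a
    # sample of n items contains c or fewer defectives (binomial model).
    from math import comb

    def accept_prob(p, n, c):
        # P(accept) when the true defective rate is p
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

    n, c = 20, 1            # illustrative sample size and acceptance number
    apl, rpl = 0.01, 0.10   # illustrative quality levels
    alpha = 1 - accept_prob(apl, n, c)   # producer's risk at the APL
    beta = accept_prob(rpl, n, c)        # consumer's risk at the RPL
    print(alpha, beta)      # plotting accept_prob(p) over p traces the OC curve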
The US Army Ballistic Research Laboratory (BRL) has developed acceptance sampling
plans for armor packages. These plans were briefed to the Project Manager, M1A1, on 14
April 1988 at Aberdeen Proving Ground, Maryland. Their general structures were accepted
with the guidance that the processes would be officially adopted pending some refinements.
II. CHAIN SAMPLING
Numerous sampling techniques exist, each with special properties that make it applica-
ble to particular situations. Sampling plans reviewed in the literature required sample sizes
much larger than those feasible for armor testing. In our case extremely small sample sizes
were warranted due to the expense of both the armor and the testing procedure, augmented
by the destructive nature of the test itself. Accordingly, we have devised a new procedure by
adapting an existing technique, chain sampling, for use in this project.
Chain sampling is particularly appropriate for small samples because it uses information
over the past history of production lots. Even with small samples, it is possible to accept a
marginal lot provided that a given number of lots immediately preceding (i.e., the chain) were
acceptable. When a consumer uses an expendable product such as the breakfast cereal in our
previous example, he utilizes chain sampling in his decision of whether or not to subsequently
purchase the same product. If the first or second box he buys is unacceptable, he will prob-
ably discard the product forever. However, if the tenth box is unacceptable, he might con-
tinue with one more purchase of the same cereal taking into consideration its past history of
nine boxes of acceptable quality.
An advantage of chain sampling is its automatic incorporation of reduced or tightened
inspection procedures when applicable. That is, as quality remains acceptable over a period
of time and our confidence grows, the sample size is reduced (or, more accurately, samples
are taken less frequently). If quality becomes marginal, inspection is tightened by taking sam-
ples more frequently. When quality diminishes to the point where a production lot must be
rejected, the production process is stopped and necessary adjustments and corrections are
made. At that point a new chain begins and continues as before.
Certain assumptions must be made before chain sampling is considered as a sampling
technique. In particular, production should be a steady, continuous process in which lots are
tested in the order of their production. Also, there should be confidence in the supplier to
the extent that lots are expected to be of essentially the same quality. Generally, a fixed sam-
ple size will be maintained with the investigator taking more or fewer samples as tightened or
reduced inspection is dictated.
A combined chain sampling plan was proposed. The maximum length of the chain was
fixed at eight, meaning that after the chain has been established, we will consider the current
set along with the seven immediately preceding. While the chain is growing, there is an area
between the criteria for acceptance and rejection in which we can make no decision. At least
one set will be tested each month; but if no decision can be made, tightened inspection will
dictate the examination of additional sets, possibly up to a maximum of eight. Table 1 shows
the relationships among months, sets, and shots for this particular procedure. Note that the
maximum number of sets and, hence, the maximum number of shots decrease over time as
the chain is being formed. Following the third month, the minimum number of shots drops;
once the chain is at its full length (by the eighth month at the latest), one set and at most
four shots are all that is required in order to make a decision for each subsequent production
lot.
A rejection in either the structural integrity or the penetration depth will result in overall
rejection of the production lot. In that case production is stopped, adjustments and correc-
tions are made, and testing resumes with the construction of a new chain. If neither measure
results in a rejection but at least one falls within the no-decision region, another set should be
examined and both categories re-evaluated using the addi.-ona data.
A. Acceptance Sampling by Attributes
Projectiles are fired at these packages, which are then inspected for structural integrity.
With attribute sampling, only two outcomes are possible. The structural integrity is assessed
to be either defective or non-defective, regardless of the number of shots. Any decision to
either accept or reject a lot is based on the number of defective plates in the sample being
considered.
Chain sampling is employed in this attribute sampling plan. Results from the most
recent eight sets influence decisions regarding a lot. A lot can be either accepted or rejected
at any time (except for one case discussed in the next paragraph). In the early stages of sam-
pling there is also an area in between acceptance and rejection where no decision is rendered
immediately but sampling is continued. After a chain reaches its full length of eight sets, a
decision to accept or reject is made immediately.
In the sampling plan, a safeguard is built in to prevent rejection of a good lot after only
one set. If there are no defectives in the first set, the lot is accepted. Otherwise, no decision
is made. Subsequently, rejection would occur only when there were three or more failures in
the most recent eight sets.
Table 2 shows the decision rules for a chain building to a maximum length of eight. The
OC curves for this plan are depicted in Figure 1. It shows that for a chain at full length, the
probability of accepting a lot whose true defective rate is 5% is equal to 0.96, while the proba-
bility of accepting a lot whose true defective rate is 10% is equal to 0.79. Power curves for
the plan are depicted in Figure 2. For a chain at full length, the probability of rejecting a lot
from a process whose true defective rate is 5% is equal to 0.04, while the probability of reject-
ing a lot whose true defective rate is 10% is equal to 0.21. (If these probabilities are deemed
to be unsatisfactory, a different plan providing more satisfactory levels could be developed by
varying the maximum chain length or modifying the decision rules.)
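The full-length-chain values quoted above can be checked with a short calculation, under our reading that a full chain holds eight sets of two plates each (16 plates) and that the lot is accepted when at most two of the 16 are defective; this is a sketch, not the authors' code.

    from math import comb

    def full_chain_accept(p, plates=16, limit=2):
        # P(at most `limit` defectives among `plates` independent plates)
        return sum(comb(plates, k) * p**k * (1 - p)**(plates - k)
                   for k in range(limit + 1))

    print(round(full_chain_accept(0.05), 2))   # 0.96, as read from Figure 1
    print(round(full_chain_accept(0.10), 2))   # 0.79, as read from Figure 1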
TABLE 1. Relationships Among Variables in Chain Sampling Procedure

Month   Minimum Sets   Maximum Sets   Minimum Shots   Maximum Shots
1       1              8              4               32
2       1              7              4               28
3       1              6              4               24
4       1              5              2               20
5       1              4              2               16
6       1              3              2               12
7       1              2              2               8
8       1              1              2               4
9       1              1              2               4
k       1              1              2               4
TABLE 2. Decision Rules for Acceptance Sampling by Attributes.
(fi denotes the number of failures observed in set i.)

SET NUMBER   ACCEPT                  REJECT                  NO DECISION
1            f1 = 0                  never                   f1 >= 1
2            Σ(i=1..2) fi = 0        Σ(i=1..2) fi >= 3       1 <= Σ(i=1..2) fi <= 2
  ...
6            Σ(i=1..6) fi <= 1       Σ(i=1..6) fi >= 3       Σ(i=1..6) fi = 2
  ...
8            Σ(i=1..8) fi <= 2       Σ(i=1..8) fi >= 3       --
9            Σ(i=2..9) fi <= 2       Σ(i=2..9) fi >= 3       --
  ...
k            Σ(i=k-7..k) fi <= 2     Σ(i=k-7..k) fi >= 3     --
[Figure 1. Operating characteristic curves for the attribute sampling plan: probability of acceptance versus percent defective]
[Figure 2. Power curves for the attribute sampling plan: probability of rejection versus percent defective]
B. Acceptance Sampling by Variables
When primary interest is in a process level rather than a percent defective, sampling by
variables is the proper procedure. For the armor packages, depth of penetration for a partic-
ular munition was the process level of interest. When variable sampling plans are established,
two major assumptions must be satisfied: first, the distribution of the variable of interest
must be known; and second, a good estimate of its standard deviation must be available.
In our particular problem there were 22 baseline shots from which we were to determine
a distribution and estimate its standard deviation, as well as establish acceptable and reject-
able process levels (APL & RPL). The 22 shots had a mean (Xb) of 5mm with a standard
deviation (Sb) of 30mm. The data had been transformed, allowing for both positive and nega-
tive penetration values. When plotted, the data appeared normal; and, indeed, the hypothesis
of normality could not be rejected using statistical goodness-of-fit tests. The APL was esta-
blished at 20mm (1/2 baseline standard deviation from the baseline mean) and the RPL was
set at 80mm (2 1/2 baseline standard deviations from the baseline mean); that is, APL =
5 + (0.5)(30) = 20mm and RPL = 5 + (2.5)(30) = 80mm. α, the probability
of rejecting at the APL, was set at 0.05; and β, the probability of accepting at the RPL, was
allowed to vary with the sample size -- for a sample of size four, β would equal 0.10.
As in the attribute case, a set consists of a right side and a left side. For each set an
attempt will be made to fire a second round into each side. Because this might not always be
possible, due primarily to discrepancies between the aim point and the hit location, each set
can result in either two, three, or four data points, depending on whether or not both shots on
each side are considered to be good hits. It is important that during the first three months,
while the chain is being formed, at least four shots be available upon which to make a deci-
sion. Table 3 outlines the decision rules for the variable sampling plan. Like the attribute
sampling plan, it incorporates chain sampling with a maximum length of eight sets. The plan
will not reject based on the first sample, and it has a region of no decision until the chain
reaches its full length. In this table, X represents the mean penetration depth for all shots
currently considered, s represents the standard deviation of this sample, n is the total number
of shots used in computing X, and t.95 represents the 95th percentile of the t-distribution for
the appropriate degrees of freedom (n-1). Thus, n can vary from 2 to 32 depending upon the
length of the chain and the number of shots available on each side of the armor package.
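A sketch of this decision computation follows (our rendering; the cutoffs and the no-acceptance rule at n = 2 or 3 are inferred from Table 3 and the surrounding text, not quoted, and scipy supplies the t-percentiles of Table 4).

    import math
    from scipy import stats

    APL = 20.0   # mm, from the text

    def decide(shots, chain_full):
        n = len(shots)
        xbar = sum(shots) / n
        s = math.sqrt(sum((v - xbar) ** 2 for v in shots) / (n - 1))
        t_stat = (xbar - APL) / (s / math.sqrt(n))
        t95 = stats.t.ppf(0.95, n - 1)   # Table 4, first column
        t99 = stats.t.ppf(0.99, n - 1)   # Table 4, second column
        if chain_full:                   # chain at eight sets: always decide
            return "accept" if t_stat <= t95 else "reject"
        if n >= 4 and t_stat <= t95:     # no acceptance at n = 2 or 3
            return "accept"
        if t_stat > t99:                 # early rejection at the 0.01 level
            return "reject"
        return "no decision"

    print(decide([12.0, 28.0, 15.0, 22.0], chain_full=False))   # accept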
Because n varies so widely, any one of many OC curves may be applicable. Figure 3
shows these curves for sample sizes 2, 32, and many integers in between. The abscissa value,
D, represents a multiple of sb from Xb; thus, the numbers in parentheses are the penetration
depths in millimeters. Note that for all n, the probability of accepting at the APL is 0.95
(1-α). Because the probability of accepting at the RPL is too high for n = 2 and n = 3, the pro-
cedure will not allow lot acceptance at these small sample sizes (see Table 3). Table 4 pro-
vides the values for the t-statistic for (1-α)-levels of 0.99 and 0.95 and degrees of freedom
from 3 to 31.
Power curves show the probability of rejecting a particular lot. Generally, they are noth-
ing more than the complement of OC curves. However, for our procedure this is not the
case, since there is a region of no decision. Figure 4 shows the power curves for this variable
sampling procedure. Basically, there are two sets of curves -- the first two pertaining to
α = 0.05 and the next three pertaining to α = 0.01. Note from Table 3 that in order to reject
TABLE 3. Decision Rules for Acceptance Sampling by Variables.
(X is the mean penetration depth of the n shots currently considered and s their standard deviation.)

SET NUMBER                        ACCEPT                      REJECT                      NO DECISION
1                                 (X - APL)/(s/√n) <= t.95    never                       otherwise
2 (combine with 1)                (X - APL)/(s/√n) <= t.95    (X - APL)/(s/√n) > t.99     otherwise
  ...
9 (combine with 2-8)              (X - APL)/(s/√n) <= t.95    (X - APL)/(s/√n) > t.95     --
k (combine with (k-7) - (k-1))    (X - APL)/(s/√n) <= t.95    (X - APL)/(s/√n) > t.95     --

At least four shots are required in each of the first three months; no lot acceptance is
permitted for n = 2 or 3.
[Figure 3. Operating characteristic curves for the variable sampling plan: probability of acceptance versus D]
TABLE 4. Values of the Cumulative t-Statistic

df    t.95    t.99
3     2.35    4.54
4     2.13    3.75
5     2.02    3.37
6     1.94    3.14
7     1.90    3.00
8     1.86    2.90
9     1.83    2.82
10    1.81    2.76
11    1.80    2.72
12    1.78    2.68
13    1.77    2.65
14    1.76    2.62
15    1.75    2.60
16    1.75    2.58
17    1.74    2.57
18    1.73    2.55
19    1.73    2.54
20    1.73    2.53
21    1.72    2.52
22    1.72    2.51
23    1.71    2.50
24    1.71    2.49
25    1.71    2.49
26    1.71    2.48
27    1.70    2.47
28    1.70    2.47
29    1.70    2.46
30    1.70    2.46
31    1.70    2.45
[Figure 4. Power curves for the variable sampling plan: probability of rejection versus D]
before the chain is at its maximum length, we use the smaller α-level, and Figure 4 shows
some possible sample sizes for α = 0.01. If we reject at an α-level of 0.05, our sample size
must be somewhere between 16 and 32; and these curves are also shown in Figure 4. Gen-
erally, the power curves are of more interest to the producer than the OC curves, since they
highlight the producer's risk.
III. QUALITY CONTROL CHARTS
It is important that some type of quality control charts be represented in the acceptance
sampling plan. They are relatively easy to maintain and might provide early warning signs
which could be beneficial to both the producer and the consumer.
IV. CONCLUSIONS
Generally, it is not feasible for a consumer to inspect every item from a production lot
that he might want to purchase. A judicious choice of a lot acceptance plan will allow him to
sample the production lot and determine with a pre-established level of confidence whether
or not it meets his specifications. Chain sampling is a particular method of lot acceptance
sampling used when sample sizes are small. It utilizes the most recent lot information to pro-
vide more confidence in the decision.
In testing armor packages for acceptance by the US Army, chain sampling provides a
logical method, since destructive testing dictates small sample sizes. A technique involving
both structural integrity (attribute sampling) and penetration depth (variable sampling) has
been proposed. One set of armor packages is represented by both a left side and a right side.
The procedure allows for accepting the production lot (one month's production) after exa-
mining just one set. It allows for rejecting the production lot only after testing at least two
sets. There is a region of no decision; but after the chain has reached its maximum length of
eight sets, a decision must be rendered.
Operating characteristic curves and power curves provide the probability of accepting
and rejecting lots given a percent structurally defective (attributes) and given a mean penetra-
tion depth (variables).
In addition to the acceptance sampling plans, control charts should be used for both the
attribute and variable parameters. These charts display sample results for particular parame-
ters such as percent defective, mean penetration depth, and variability of penetration depth.
The data might be presented as individual sample points or as sums over a preceding number
of samples. By continually examining the control charts, we can see when one of the parame-
ters is drifting toward the rejection region, enabling the producer to make adjustments and,
possibly, preventing rejection of an entire lot of armor plate.
The proposed lot acceptance plan was briefed to the Project Manager, M1A1, on 14 April
1988 at Aberdeen Proving Ground, Maryland. It was approved and will be adopted subject to
any refinements agreed upon by both the US Army Ballistic Research Laboratory and the
Project Manager.
BIBLIOGRAPHY
Crow, Edwin L., et al., Statistics Manual with Examples Taken from Ordnance
Development, Dover Publications, Inc., 1960.
Duncan, Acheson J., Quality Control and Industrial Statistics, Richard D. Irwin, Inc., 1974.
Grant, Eugene L. & Leavenworth, Richard S., Statistical Quality Control, McGraw-Hill Book
Company, 1980.
Juran, J.M., Editor, Quality Control Handbook, McGraw-Hill Book Company, 1974.
Mioduski, R.E., Tables of the Probability Integral of the Central t-Distribution, BRL
Technical Note #1570, August 1965.
Schilling, Edward G., Acceptance Sampling in Quality Control, Marcel Dekker, Inc., 1982.
Thomas, D.W., Chairman, Statistical Quality Control Handbook, Western Electric Company,
Inc., 1956.
SOME NOTES ON VARIABLE SELECTION
CRITERIA FOR REGRESSION MODELS
(AN OVERVIEW)
Eugene F. Dutoit
U.S. Army Infantry School
Fort Benning, Georgia
1. The Problem. Given observations on a dependent variable Y and a set of candidate
independent variables X1, X2, ..., Xq, the problem is to select the subset of k variables
that best predicts Y. The fitted subset model is

    Y' = a + Σ(i=1..k) bi Xi                                        (1)
2. The Multiple Correlation Coefficient (R²). [The defining equations (2)-(4) are garbled
in the source.] In the notation used below,

    p = k + 1,
    σ̂² = estimated variance of the complete model (i.e., all independent variables included),
    S²Y|X = estimated variance of the candidate (subset) model,
    N = number of cases.
Figure 1
[Cp plotted against p for candidate subset models A, B, C, D, together with the line Cp = p; point C falls below the line]

    Cp = RSS_p / σ̂² - (N - 2p)                                      (5)

where p = k + 1 (as before)
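A small numerical sketch of equation (5) follows; the variable names and the toy data are ours, with sigma2 taken as the residual mean square of the full model fitted by ordinary least squares.

    import numpy as np

    def mallows_cp(rss_p, sigma2, n, p):
        # Equation (5): Cp = RSS_p / sigma^2 - (N - 2p), with p = k + 1
        return rss_p / sigma2 - (n - 2 * p)

    rng = np.random.default_rng(0)
    n = 50
    x = rng.normal(size=(n, 3))
    y = 2.0 * x[:, 0] + rng.normal(size=n)   # only X1 matters here

    def rss(cols):
        X = np.column_stack([np.ones(n)] + [x[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid

    sigma2 = rss([0, 1, 2]) / (n - 4)            # full-model residual mean square
    print(mallows_cp(rss([0]), sigma2, n, p=2))  # subset {X1}: Cp near p = 2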
Figure 2
Regression ANOVA - Full Model
[table garbled in source]
The model with k independent variables, where k < q, is:

    Y' = a + b1 X1 + b2 X2 + ... + bk Xk                            (7)

The regression ANOVA table is given in Figure 3.
Figure 3
Regression ANOVA - Subset Model

Source      DF        SS                        MS
Explained   k         (N-1)(S²Y)(R²k)
Residual    N-k-1     (N-k-1)(S²Y|Xk)           S²Y|Xk
Total       N-1

where

    S²Y|Xk = [(N-1)/(N-k-1)] (S²Y)(1 - R²k)
Note that equation (4) allows for cases where some subset model has less variance (S²Y|Xk)
than the variance for the complete model (σ̂²). In this case the Cp plot falls below the line
Cp = p (i.e., point C in Figure 1). This can be expressed in Figure 4:
Figure 4
[S²Y|Xk plotted against the number of variables k, showing the minimum value of S²Y|Xk]
The minimum value in Figure 4 corresponds to point C in Figure 1. The subset regression
that has minimum variance would be the best predictor of the dependent variable Y. The
ratio described by equation (11) can be rewritten to gain some insight into the process. From
equations (8) the following expressions can be inferred:

    S²Y|Xk = [(N-1)/(N-k-1)] (S²Y)(1 - R²k)                         (12)
X1 = Number of promotional accounts.
X2 = Number of active accounts.
X3 = Number of competing brands.
X4 = District potential.
Y = Sales in thousands of dollars.
Figure 5
[Stepwise regression results - table garbled in source]
6. Summary. This paper has examined several methods of determining when to enter
additional independent variables into linear multiple regression in order to form an optimum
subset from all the candidate variables.
TWO-STAGE TESTING OF COMBAT VEHICLE TIRE SYSTEMS
Barry A. Bodt
ABSTRACT
An effort is underway to enhance the battlefield survivability of combat vehicle tire sys-
tems. The Army is currently investigating several new tire technologies with regard to their
ability to function after battlefield degradation. The tires, in a run-flat condition, must sup-
port some vehicle mobility according to that vehicle's mission profile. The immediate objec-
tive of this program is choosing, for further research, the most promising among the new tire
technologies. The presenter has been tasked to develop an appropriate test plan.
Sound experimental strategy, for this or any study, must be accompanied by a clear
understanding of the problem(s) to be resolved. A list of question areas worth exploring to
help gain this understanding is suggested by Hahn (Technometrics, 1984) as part of more
general guidelines. The presenter demonstrates their usefulness to that end in the above
mentioned tire program. The test plan and the process by which it evolved are discussed.
TWO-STAGE TESTING OF COMBAT VEHICLE TIRE SYSTEMS
1. INTRODUCTION
An effort is underway to enhance the battlefield survivability of combat vehicle tire sys-
tems. The impetus for current investigations dates back to a 1979 issue paper, submitted to
DA by the US Training and Doctrine Command (TRADOC). In 1985 the Tank Automotive
Command (TACOM) established a tire task force, the need for which was supported by the
results of a 1984 independent evaluation of one tire system performed by the Operational
Test and Evaluation Agency (OTEA). OTEA observed that when the run-flat tires for the
High Mobility Multi-Purpose Wheeled Vehicle (HMMWV) were run flat for 30 miles, the
tires became unserviceable and had to be replaced. The objective of the TACOM Tire Task
Force is to identify a survivable tire system (STS) technological replacement. A two-phase
testing program has been adopted to screen available STS technologies in search of candidates for more
intense research and development. The operational phase, considering the standard and
seven new STS technologies, was completed by the Combat Developments Experimentation
Center (CDEC) in 1987. The technical phase, the focus of this paper, is being conducted by
the Vulnerability Lethality Division (VLD) of the Ballistic Research Laboratory (BRL)
according to the test plan developed by the Probability and Statistics Branch (PSB) of the BRL.
This paper is intended to accomplish two tasks. The first is to discuss the test plan that
has been adopted for the technical testing phase -- not in great detail but sufficiently to
demonstrate the degree to which experimental objectives are satisfied. As part of the discus-
sion it is shown how, for example, tire performance specifications, factors thought to
influence performance, and physical and budgetary constraints are incorporated in the test
plan. The second task is to illustrate an approach for extracting the necessary information
from experimenters. Any sound experimental strategy must rest on a clear understanding of
the problem, but the information essential to that understanding is often difficult to obtain.
The fragmented manner in
which information is passed from client to consultant inhibits a cogent assimilation of facts
needed for efficient problem solving. Hahn (1984) suggests imposing the structure of ques-
tion area guidelines (see Figure 1) both to help sort the information coming in and to direct
further inquiry.
The remainder of the paper is organized as follows. In Section 2 the problem and test
plan are developed, punctuated by Hahn's guidelines. It is hoped that this presentation will
both give fair treatment to the Army concern as well as illustrate a reasonable approach to
consultation. In Section 3 a brief critique of the test plan's strengths and weaknesses is given.
Problem information is divulged in this section according to Hahn's guidelines, and that
constitutes our presentation of his technique. We seek only to show how encompassing those
question areas are by developing in full the Army's problem through their use. In the text,
italicized words and phrases refer back to guidelines in Figure 1. The guidelines have been
juggled to allow for a logical presentation and the order shown in Figure 1 corresponds, with
few exceptions, to that of this section. This is simply a matter of convenience and not a claim
1. The objectives of the experiment.
2. The variables to be held constant and how this will
be accomplished (as well as those that are to be
varied).
3. The uncontrolled variables - what they are and
which ones are measurable.
4. The response variables and how they will be meas-
ured.
5. Special considerations which indirectly impose
experimental constraints.
for an ideal sequence. In fact, each consulting session is likely not only to naturally gravitate
toward different orders but also to move around from area to area, possibly returning several
times to some.
Let us begin by considering objectives. We consider two types: military and experimen-
tal. The military objective is that HMMWV tires remain serviceable when degraded through
battlefield exposure to small caliber munitions and shell fragments. Serviceable means that
the tire exhibits performance consistent with the standards specified in the NATO Finabel
20 A 5 1956 NATO Test Procedure. Summarized expectations set forth therein say that the
combat tire must possess (as nearly as possible) the same over the road performance as the
classic radial tire in terms of maximum vehicle speed and lateral and longitudinal traction and
stability. After degradation, normal military performance of the vehicle is still required when
no more than two tires (one drive and one steering) are damaged. The experimental objective
is to screen six, including the standard, tire systems with the purpose of selecting a subset for
further research, development, and the eventual upgrading of combat tires. The selection
criteria are discussed below.
Question areas 2-4 in Figure 1 each concern variables. It is in the identification and
classification of these variables that the experimental strategy begins to take form. In Table 1
the most important ones are given. Care is taken to initially classify them as candidates for
response, design or nuisance variables and to subclassify them for each of the last two
categories. The scale of measurement is also noted. A short definition of each of these vari-
ables is given in the appendix. Because the variables listed in Table 1 represent only those
[Table 1. Candidate response, design, and nuisance variables with their classifications and scales of measurement - table garbled in source]
considered essential, all must be incorporated in the experimental strategy. We briefly dis-
cuss several of them here so that the reader may gain a sense of the complexity of the prob-
lem.
The logical starting point for discussion is with tire technology, for it is the selection
from among these prototypes that is the objective of this experiment. Six manufacturer
offerings, including the standard, are to be considered, but there are basically only four tech-
nologies. When combat tires are exposed to small caliber munitions and shell fragments they
will surely tear, puncture, or in some other way be damaged so as to induce partial or com-
plete deflation. Then in order for military objectives to be satisfied, the survivable tire will
either successfully negate this damage or be structurally capable of supporting vehicle mobil-
ity without benefit of full tire pressure. Taking the first tack, the sealant tire systems contain
chemical compounds which are intended to flow to a source of air loss, solidify, and thereby
negate the threat damage. Run-flats take the second tack and are able to support the vehicle
with a metal or plastic insert which acts in the tire's stead when the tire is deflated. Self-
supporting tires are so named because molded into the tread is a rigid fiberglass band,
designed to carry the tire's full load in the absence of tire pressure. Solid urethane tires cir-
cumvent the problem by containing no air to be lost, but they do so at the cost of additional
weight.
A limitation of the CDEC exercise is that tire degradation from shell fragments is not
represented. In order to make inferences about tire performance after fragment damage, the
consensus is that either live shells should be detonated near standing tires or the expected
resulting fragments should
be simulated. A special consideration in long range plans is that an acceptance test consistent
with current testing be developed. Due to the repeatability requirements inherent in accep-
tance testing, the shell-detonation approach was dropped in favor of fragment simulation.
This decision led to variables involving fragment shape, size, and velocity. Due to budget
and time constraints it appears unreasonable to select many values for each and then proceed
in a factorial manner when incorporating them in the design. Rather we option to create two
fragment profiles, each representative of a distinct threat. Avoiding great detail, a standard
BRL fragment design is specified for shape. Velocity and mass are determined as follows.
Each are a function of the distance between the shell at detonation and the tire. The distance
selected corresponds to a 50% mobility kill for the vehicle according to models accessed by
the Army Material Systems Analysis Activity (AMSAA). Avoiding the experimental region
where the expected outcome is known, we do not consider distances so close that the personnel
are not likely to survive. The median velocity and mass among computer simulated frag-
ments possessing an appropriate trajectory then serve as representative values for these
characteristics. Trial firings suggest some deviations in these choices so that the resulting
damage is realistic.
Other factors of keen interest include the terrain traveled and tire position, motion, and
pressure. The mission profile for the HMMWV dictates that it shall be able to travel primary
roads, secondary roads, and cross country. Further it suggests that in a characteristic mission
those three terrains might comprise 30%, 30%, and 40%, respectively, of the total mileage
covered. Tire position refers to its designation as a drive tire or a drive and steering tire; the
HMMWV is 4-wheel drive. In addition to this one-at-a-time damage, recall that the NATO
Finabel standards require acceptable performance when two tires on the vehicle are
damaged. When attacked, the HMMWV may be moving or at rest. Proponents of the
sealant technology claim that if the tire is in motion when punctured, then the sealant
mechanism will be more effective in finding and repairing the damage. Past test data indi-
cates that tire pressure may influence the type of puncture, that is, clean or ragged. Manufac-
turer recommended high and low pressures for each tire will be considered.
The special consideration that this experiment complement the CDEC exercise fixed two
important test definitions. TACOM decided that the response would remain defined as miles
until failure. Failure occurs when either the tire begins to come apart when in use or the
operator must slow to less than 50% of normal operating speed in order to maintain control.
Under a rigid value for normal speed, failure could depend on the size and strength of the
operator. We propose to account for that by establishing a profile on operators (actually driv-
ing teams) in their normal operation of the vehicle. The 50% rule is then based on normal
team performance. Driving teams are established to avoid failure due to fatigue. Past test
data reveals that some degraded tires remain serviceable after 100 continuous miles of opera-
tion. In order to avoid truncated data, the test course is extended to 125 continuous miles, but
at the additional cost of trial time. It is felt that if two operators are allowed to rotate after
each 25 mile lap, then fatigue will not enter into the failure determination.
The test plan will be implemented in stages. A fairly large number of experimental con-
ditions define the experiment outlined in Section 2.1. To examine each of these conditions in
a factorial manner will require more resources than the experimental budget will allow; for all
but the standard tire no more than 30 prototypes will be made available. Moreover, recall
that the principal objective of this study is to facilitate comparison among tires. Placing too
much emphasis (sample allocation) on ancillary issues may partially obscure (weaken conclu-
sions regarding) the main experimental focus. For these reasons, resource limitations and
experimental emphasis, testing will proceed in two stages.
The division of testing meets the above concerns. In stage 1 all the experimental condi-
tions are incorporated in the design as factors or fixed test procedures. Only the standard
HMMWV tire is considered in stage 1. The purpose of this stage is two-fold. First the vari-
ous test conditions may be examined. It is hoped that some will prove unnecessary for inclu-
sion in stage 2, thereby increasing the experimental information per sampled test condition.
Second, test procedures may be smoothed. Field test exercises nearly always present unex-
pected problems, often resulting in samples which must be invalidated for the analysis. Here
we run only the risk of wasting some more plentiful, standard tires instead of the scarce new
prototypes. In stage 2 the prototypes will be examined by an experienced testing group under
the conditions remaining after stage 1. Since the complete details will not be available until
stage 1 is concluded, we defer further discussion of stage 2 to future papers. In the remainder
of this section we describe the stage 1 design.
Stage 1 will be run as a 1/2 replication of a 4x2^4 factorial design, requiring 32 observa-
tions. The design factors, each discussed in Section 2.1, are listed in Table 2. The 4 levels for
threat include a 7.62mm round fired at 45 and 90 degree obliquity on the sidewall, a small fragment
simulator, and a large fragment simulator. Note that only 2 tire position levels, drive or steer-
ing, are considered. The case in which two tires are damaged, requiring twice as many sam-
ples, is handled only in a limited sense. Imbedded in the stage 1 factorial design are four
treatment combinations having two damaged tires which arise from a 1/2 replicate of a 2^4
design.
[Table 2. Stage 1 design factors and levels - table garbled in source]
The remaining 4 observations are already included in the principal stage 1 design.
The other three factors are handled as previously noted. This design allows hypothesis tests
on all main effects and on most first-order interactions. Due to the anticipated complexity of
the relationship among variables, some first-order interactions can be sacrificed to the experi-
mental error formed by the remaining second- and higher-order interactions.
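One plausible construction of such a design (ours, for illustration; the garbled Table 2 prevents naming the four two-level factors with certainty) keeps the 2^4 half fraction defined by ABCD = +1 at every level of the four-level threat factor.

    from itertools import product

    threats = ["7.62mm 45 deg", "7.62mm 90 deg", "small frag", "large frag"]
    # Four two-level factors (labels hypothetical), coded -1 / +1
    runs = [(t, a, b, c, d)
            for t in threats
            for a, b, c, d in product((-1, 1), repeat=4)
            if a * b * c * d == 1]   # defining relation I = ABCD
    assert len(runs) == 32           # the 32 observations cited above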
Randomization for the stage 1 design is complete except in the case of driving team,
vehicle, and terrain. In consideration of the procedures for running a test, the ease with which
variables can be changed, and the details of the physical set up, some compromises were made.
The complete randomization of the driving teams is not possible because both teams are to
be used simultaneously. The first driver in the rotation for each team was randomized. As
indicated in Figure 2, four vehicles are used but are not included as test factors. To mitigate
their effect on the outcome, they have been selected according to age and state of repair and
have been partially randomized over the design. Also noted in Figure 2, the three terrains
mentioned in Section 2.1 comprise the test track. The course layout attempts to mix or ran-
domize the terrains so that not all the mileage for one type will be traveled before the next is
encountered.
[Figure 2. Test course layout - figure garbled in source]
3. CRITIQUE OF APPROACH
In this section we address the primary advantages and disadvantages of the test plan
interpreted in terms of the stated military and experimental objectives and follow with some
comments pertaining to the consulting technique employed. Beginning with the military
objectives, all of the variables considered important by TACOM or the NATO Finabel Stan-
dards are included in the test plan in a manner suitable to TACOM. Sometimes this requires
compromise, such as in the use of terrain. Terrain is considered only through its inclusion in
the test course in proportions consistent with the HMMWV mission profile. For some other
variables the military interests are clarified in the test plan. For example, normal operating
speeds in the failure definition are more sensibly tied to the normal performance of individual
driving teams. Also, efforts to handle the fragment threat result in a reasonable fragment
profile.
With regard to experimental objectives, the selection of STS prototypes for further
research and development follows directly from analysis of the second testing stage. Further,
the stage 1 plan imposes an analyzable design structure on a complex problem providing for
the testing of all important hypotheses. In addition, running the experiment in stages has the
emphasis and resource advantages mentioned in Section 2.2. However, the test plan has
several weaknesses. By examining the standard tire only in stage 1, comparisons between it
and other STS prototypes are hindered. Experimental error is an issue since complete ran-
domization is not possible and since some pooling of low-order interactions into the error
term may be necessary. Choice of an error term for the imbedded test of the two-tire effect
is far from straightforward, particularly since four of the eight observations must be used in
the analysis twice. Finally, we had to take some liberties in the combination of variables to
form factors so that a design would be possible with the available samples.
As to consulting, we cannot prove the usefulness of Hahn's guidelines, but we hope that
the illustration is convincing. Surely, the information can be obtained through other methods,
but the imposed structure of this approach facilitates a very comprehensive investigation. In
the end, all methods must be judged by the experimental strategies which they help to
develop, but their performance is hopelessly confounded with the skills of the consultant using
them. Of course the purposes of those strategies are to meet both application objectives and
satisfy statistical theory. Whether this strategy satisfies those purposes, and if not, whether
fault lies with the consultant, the approach, or the problem are questions left for the reader to
decide.
REFERENCES
Drelling, J.S., Pietzyk, S., Schrag, H. (1987), "Survivable Tire Test," CDEC-TR-87-014.
Hahn, G. J. (1984), "Experimental Design in the Complex World," Technometrics, 26, 19-31.
Kempthorne, O. (1952), Design and Analysis of Experiments, New York: John Wiley & Sons.
APPENDIX
terrain - the three driving surfaces listed as primary road, secondary road, and cross
country. Each will induce different tire stress and all are included in the
vehicle's mission profile.
threat obliquity
# shots on tire - the number of punctures to be made in each tire to obtain its
degraded state.
subjective assessments - comments solicited from drivers regarding handling of the
vehicle when tires are in normal or degraded mode.
Parallel Coordinate Densities*
Edward J. Wegman
Center for Computational Statistics
242 Science-Technology Building
George Mason University
Fairfax, VA 22030
1. Introduction. The classic scatter diagram is a fundamental tool in the construction of a
model for data. It allows the eye to detect such structures in data as linear or nonlinear features,
clustering, outliers and the like. Unfortunately, scatter diagrams do not generalize readily beyond three
dimensions. For this reason, the problem of visually representing multivariate data is a difficult,
largely unsolved one. The principal difficulty, of course, is the fact that while a data vector may be
arbitrarily high dimensional, say n, Cartesian scatter plots may only easily be done in two dimensions
and, with computer graphics and more effort, in three dimensions. Alternative multidimensional
representations have been proposed by several authors including Chernoff (1973), Fienberg (1979),
Cleveland and McGill (1984a) and Carr et al. (1988).
An important technique based on the use of motion is the computer-based kinematic display
yielding the illusion of three dimensional scatter diagrams. This technique was pioneered by Friedman
and Tukey (1973) and is now available in commercial software packages (Donoho's MacSpin and
Velleman's Data Desk). Coupled with easy data manipulation, the kinematic display techniques have
spawned the exploitation of such methods as projection pursuit (Friedman and Tukey, 1974) and the
grand tour (Asimov, 1985). Clearly, projection-based techniques lead to important insights concerning
data. Nonetheless, one must be cautious in making inferences about high dimensional data structures
based on projection methods alone. It would be highly desirable to have a simultaneous
representation of all coordinates of a data vector especially if the representation treated all components
in a similar manner. The cause of the failure of the standard Cartesian coordinate representation is the
requirement for orthogonal coordinate axes. In a 3-dimensional world, it is difficult to represent more
than three orthogonal coordinate axes. We propose to give up the orthogonality requirement and
replace the standard Cartesian axes with a set of n parallel axes.
*This research was sponsored by the Army Research Office, Contract DAAL03-87-K-0087
Inselberg (1985) developed the parallel coordinate representation as a device for computational
geometry. His 1985 paper is the culmination of a series of technical reports dating from 1981. Finally
we note that Diaconis and Friedman (1983) discuss the so-called M and N plots. Their special case of
a 1 and 1 plot is a parallel coordinate plot in two dimensions. Indeed, the 1 and 1 plot is sometimes
called a before-and-after plot and has a much older
history. The fundamental theme of this paper is that the transformation from Cartesian coordinates to
parallel coordinates is a highly structured mathematical transformation, hence, maps mathematical
objects into mathematical objects. Certain of these can be given highly useful statistical
interpretations so that this representation becomes a highly useful data analysis tool.
3. Parallel Coordinate Geometry. The parallel coordinate representation enjoys some elegant
duality properties with the usual Cartesian orthogonal coordinate representation. Consider a line l in
the Cartesian coordinate plane given by l: y = mx + b and consider two points lying on that line, say
(a, ma+b) and (c, mc+b). For simplicity of computation we consider the xy Cartesian axes mapped
into the xy parallel axes as described in Figure 3.1. We superimpose a Cartesian coordinate axes t,u on
the xy parallel axes so that the y parallel axis has the equation u = 1. The point (a, ma+b) in the xy
Cartesian system maps into the line joining (a, 0) to (ma+b, 1) in the tu coordinate axes. Similarly,
(c, mc+b) maps into the line joining (c, 0) to (mc+b, 1). It is a straightforward computation to show
that these two lines intersect at a point (in the tu plane) given by (b(1-m)^-1, (1-m)^-1). Notice
that this point in the parallel coordinate plot depends only on m and b, the parameters of the original
line in the Cartesian plot. Thus it is the dual of l, and we have the interesting duality result that
points in Cartesian coordinates map into lines in parallel coordinates while lines in Cartesian
coordinates map into points in parallel coordinates.
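For completeness, the computation runs as follows (our rendering; it assumes m ≠ 1, the case treated separately below):

    \[
      \text{dual lines: } u=\frac{t-a}{(m-1)a+b},\qquad u=\frac{t-c}{(m-1)c+b}.
    \]
    Equating and expanding, \((t-a)[(m-1)c+b]=(t-c)[(m-1)a+b]\) reduces to
    \(t(m-1)(c-a)=b(a-c)\), so \(t=b(1-m)^{-1}\); substituting back gives
    \(u=\bigl(b(1-m)^{-1}-a\bigr)/\bigl[(m-1)a+b\bigr]=(1-m)^{-1}\).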
For 0 < (1-m)^-1 < 1, m is negative and the intersection occurs between the parallel
coordinate axes. For m = -1, the intersection is exactly midway. A ready statistical interpretation
can be given. For highly negatively correlated pairs, the dual line segments in parallel coordinates will
tend to cross near a single point between the two parallel coordinate axes. The scale of one of the
variables may be transformed in such a way that the intersection occurs midway between the two
parallel coordinate axes, in which case the slope of the linear relationship is negative one.
In the case that (1-m)^-1 < 0 or (1-m)^-1 > 1, m is positive and the intersection occurs external
to the region between the two parallel axes. In the special case m = 1, this formulation breaks down.
However, it is clear that the point pairs are (a, a+b) and (c, c+b). The dual lines to these points are
the lines in parallel coordinate space with slope b^-1 and intercepts -ab^-1 and -cb^-1 respectively. Thus
the duals of these lines in parallel coordinate space are parallel lines with slope b^-1. We thus append
the ideal points to the parallel coordinate plane to obtain a projective plane. These parallel lines
intersect at the ideal point in direction b^-1. In the statistical setting, we have the following
interpretation. For highly positively correlated data, we will tend to have lines not intersecting
between the parallel coordinate axes. By suitable linear rescaling of one of the variables, the lines may
be made approximately parallel in direction with slope b^-1. In this case the slope of the linear
relationship between the rescaled variables is one. See Figure 3.2 for an illustration of large positive
relationship between the rescaled variables Is one. See Figures 3.2 for an illustration of large positive
and large negative correlations. Of course, nonlinear relationships will not respond to simple linear
rescaling. However, by suitable nonlinear transformations, It should be possible to transform to
linearity. The point-line, line-point duality seen In the transformation from Cartesian to parallel
coordinates extends to conic sections. An Instructive computation Involves computing in the parallel
coordinate space the image of an ellipse which turns out to be a general hyperbolic form. For purposes
of conserving space we do not provide the details here.
It should be noted, however, that the solution to this computation is not a locus of points, but
a locus of lines, a line conic. The envelope of this line conic is a point conic. In the case of this
computation, the point conic in the original Cartesian coordinate plane is an ellipse; the image in the
parallel coordinate plane is, as we have just seen, a line hyperbola with a point hyperbola as envelope.
Indeed, it is true that a conic will always map into a conic and, in particular, an ellipse will always
map into a hyperbola. The converse is not true. Depending on the details, a hyperbola may map into
an ellipse, a parabola or another hyperbola. A fuller discussion of projective transformations of conics
is given by Dimsdale (1984). Inselberg (1985) generalizes this notion into parallel coordinates resulting
in what he calls hstars.
We mentioned the duality between points and lines and conics and conics. It is worthwhile to
point out two other nice dualities. Rotations in Cartesian coordinates become translations in parallel
coordinates and vice versa. Perhaps more interesting from a statistical point of view is that points of
inflection in Cartesian space become cusps in parallel coordinate space and vice versa. Thus the
relatively hard-to-detect inflection point property of a function becomes the notably more easy to
detect cusp in the parallel coordinate representation. Inselberg (1985) discusses these properties in
detail.
4. Further Statistical Interpretations. Since ellipses map into hyperbolas, we can have an easy
template for diagnosing uncorrelated data pairs. Consider Figure 3.2. With a completely uncorrelated
data set, we would expect the 2-dimensional scatter diagram to fill substantially a circumscribing
circle. As illustrated in Figure 3.2, the parallel coordinate plot would approximate a figure with a
hyperbolic envelope. As the correlation approaches negative one, the hyperbolic envelope would deepen
so that in the limit we would have a pencil of lines, what we like to call the cross-over effect. As the
correlation approaches positive one, the hyperbolic envelope would widen with fewer and fewer cross-
overs so that in the limit we would have parallel lines. Thus correlation structure can be diagnosed
from the parallel coordinate plot. As noted earlier, Griffen (1958) used this as a graphical device for
computing the Kendall tau.
Griffen, in fact, attributes the graphical device to Holmes (1928), which predates Kendall's
discussion. The computational formula is

    tau = 1 - 4X / (n(n-1))

where X is the number of intersections resulting by connecting the two rankings of each member by
lines, one ranking having been put in natural order. While the original formulation was framed in
terms of ranks for both x and y axes, it is clear that the number of crossings is invariant to any
monotone increasing transformation of either x or y, the ranks being one such transformation. Because
of this scale invariance, one would expect rank-based statistics to have an intimate relationship to
parallel coordinates.

It is clear that if there is a perfect positive linear relationship with no crossings, then X = 0
and tau = 1. Similarly, if there is a perfect negative linear relationship, Figure 3.2 is again appropriate
and we have a pencil of lines. Since every line meets every other line, the number of intersections is
(n choose 2) = n(n-1)/2, so that

    tau = 1 - [4/(n(n-1))] [n(n-1)/2] = 1 - 2 = -1.
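A direct implementation of this crossing count (our sketch; ties are ignored):

    def tau_from_crossings(x, y):
        # Connect the x-ranking to the y-ranking; segment crossings are
        # exactly the inversions of the y-ranks taken in x order
        n = len(x)
        x_order = sorted(range(n), key=lambda k: x[k])
        y_order = sorted(range(n), key=lambda k: y[k])
        y_rank = {obs: r for r, obs in enumerate(y_order)}
        ranks = [y_rank[obs] for obs in x_order]
        crossings = sum(1 for i in range(n) for j in range(i + 1, n)
                        if ranks[i] > ranks[j])
        return 1 - 4 * crossings / (n * (n - 1))

    print(tau_from_crossings([1, 2, 3, 4], [10, 20, 30, 40]))   #  1.0
    print(tau_from_crossings([1, 2, 3, 4], [40, 30, 20, 10]))   # -1.0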
It should be further noted that clustering is easily diagnosed using the parallel coordinate
representation.
So far we have focused primarily on pairwise parallel coordinate relationships. The idea
however is that we can, so to speak, stack these diagrams and represent all n dimensions
simultaneously. Figure 4.1 illustrates 6-dimensional Gaussian uncorrelated data plotted in
parallel coordinates. A 6-dimensional ellipsoid would have a similar general shape but with hyperbolas
of different depths. This data is deep ocean acoustic noise and is illustrative of what might be
expected.
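A minimal plotting routine (ours, not the authors' software) makes the stacking concrete: each observation becomes a broken line across n parallel axes after each coordinate is scaled to [0, 1].

    import numpy as np
    import matplotlib.pyplot as plt

    def parallel_coordinates(data):
        data = np.asarray(data, dtype=float)
        lo, hi = data.min(axis=0), data.max(axis=0)
        scaled = (data - lo) / np.where(hi > lo, hi - lo, 1.0)
        for row in scaled:                      # one broken line per observation
            plt.plot(range(data.shape[1]), row, color="black",
                     alpha=0.3, linewidth=0.5)
        for k in range(data.shape[1]):          # the n parallel axes
            plt.axvline(k, color="gray")
        plt.show()

    parallel_coordinates(np.random.default_rng(1).normal(size=(200, 6)))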
Figure 4.2 is illustrative of some data structures one might see in a five-dimensional data set.
First it should be noted that the plots along any given axis represent dot diagrams (a refinement of the
histograms of Hartigan), hence convey graphically the one-dimensional marginal distributions. In this
illustration, the first axis is meant to have an approximately normal distribution shape while axis two
has the shape of the negative of a chi-squared. As discussed above, the pairwise comparisons can be made. Figure
4.2 illustrates a number of instances of linear (both negative and positive), nonlinear and clustering
situations. Indeed, it is clear that there is a 3-dimensional cluster along coordinates 3, 4 and 5.
Consider also the appearance of a mode in parallel coordinates. The mode is, intuitively
speaking, the location of the most intense concentration of probability. Hence, in a sampling situation
it will be the location of the most intense concentration of observations. Since observations are
represented by broken line segments, the mode in parallel coordinates will be represented by the most
intense bundle of broken line paths in the parallel coordinate diagram. Roughly speaking, we should
look for the most intense flow through the diagram. In Figure 4.2, such a flow begins near the center
of coordinate axis one and finishes on the left-hand side of axis five.
Figure 4.2 thus illustrates some data analysis features of the parallel coordinate representation
including the ability to diagnose one-dimensional features (marginal densities), two-dimensional
features (correlations and nonlinear structures), three-dimensional features (clustering) and a five-
dimensional feature (the mode). In the next section of this paper we consider a real data set which will
be illustrative of some additional capabilities.
5. An Auto Data Example. We illustrate parallel coordinates as an exploratory analysis tool
on data about 88 1980 model year automobiles. They consist of price, miles per gallon, gear ratio,
weight and cubic inch displacement. For n = 5, 3 presentations are needed to present all pairwise
permutations. Figures 5.1, 5.2 and 5.3 are these three presentations. In Figure 5.1, perhaps the most
striking feature is the cross-over effect evident in the relationship between gear ratio and weight. This
suggests a negative correlation. Indeed, this is reasonable since a heavy car would tend to have a large
engine providing considerable torque thus requiring a lower gear ratio. Conversely, a light car would
tend to have a small engine providing small amounts of torque thus requiring a higher gear ratio.
Consider as well the relationship between weight and cubic inch displacement. In this diagram
we have a considerable amount of approximate parallelism (relatively few crossings) suggesting positive
correlation. This is a graphic representation of the fact that big cars tend to have big engines, a fact
most are prepared to believe. Quite striking however is the negative slope going from low weight to
moderate cubic inch displacement. This is clearly an outlier which is unusual in neither variable but in
their joint relationship. The same observation is highlighted in Figure 5.2.
The relationship between miles per gallon and price is also perhaps worthy of comment. The
left-hand side shows an approximate hyperbolic boundary while the right-hand side clearly illustrates
the cross-over effect. This suggests that for inexpensive cars or poor mileage cars there is relatively little
correlation. However, costly cars almost always get relatively poor mileage while good gas mileage cars
are almost always relatively inexpensive.
Turning to Figure 5.2, the relationship between gear ratio and miles per gallon is instructive.
This diagram is suggestive of two classes. Notice that there are a number of observations represented
by line segments tilted slightly to the right of vertical (high positive slope) and a somewhat larger
number with a negative slope of about -1. Within each of these two classes we have approximate
parallelism. This suggests that the relationship between gear ratios and miles per gallon is
approximately linear, a believable conjecture since low gears = big engines = poor mileage while high
gears = small engines = good mileage. What is intriguing, however, is that there seems to be really
two distinct classes of automobiles each exhibiting a linear relationship, but with different linear
relationships within each class.
Indeed in Figure 5.3, the third permutation, we are able to highlight this separation into two
classes in a truly 5-dimensional sense. The shaded region in Figure 5.3 describes a class of vehicles with
relatively poor gas mileage, relatively heavy, relatively inexpensive, relatively large engines and
relatively low gear ratios. Figure 5.4 is a repeat of this graphic but with different shading highlighting
a class of vehicles with relatively good gas mileage, relatively light weight, relatively inexpensive,
relatively small engines and relatively high gear ratios. In 1980, these two characterizations describe
respectively domestic automobiles and imported automobiles.
6. Graphical Extensions of Parallel Coordinate Plots. The basic parallel coordinate idea
suggests some additional plotting devices. We call these respectively the Parallel Coordinate Density
Plots, Relative Slope Plots and Color Histograms. These are extensions of the basic idea of parallel
coordinates, but structured to exploit additional features or to convey certain information more easily.
6.1 Parallel Coordinate Density Plots. While the basic parallel coordinate plot is a useful
device itself, like the conventional scatter diagram, it suffers from heavy overplotting with large data
sets. In order to get around this problem, we use a parallel coordinate density plot which is computed
as follows. Our algorithm is based on the Scott (1985) notion of the average shifted histogram (ASH), but
adapted to the parallel coordinate context. As with an ordinary two-dimensional histogram, we decide
on appropriate rectangular bins. A potential difficulty arises because a line segment representing a
point may appear in two or more bins in the same horizontal slice. Obviously, if we have k n-
dimensional observations, we would like to form a histogram based on k entries. However, since a
line segment could appear in two or more bins in a horizontal slice, the count for any given horizontal
slice is at least k and may be bigger. Moreover, every horizontal slice may not have the same count.
To get around this, we convert line segments to points by intersecting each line segment with a
horizontal line passing through the middle of the bin. This gives us an exact count of k for each
horizontal slice. We construct an ASH for each horizontal slice (typically averaging 5 histograms to
form our ASH). We have used contours to represent the two-dimensional density, although gray-scale
shading could be used in a display with sufficient bit-plane memory. An example of a parallel
coordinate density plot is given in Figure 6.1. Parallel coordinate density plots have the advantage of
being graphical representations of data sets which are simultaneously high dimensional and very large.
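The slice-and-count step is easy to sketch. The following Python fragment is our illustration, not the original implementation; the function name, the rescaling of each axis to [0, 1], and the boxcar smoother standing in for the ASH average are all assumptions:

    import numpy as np

    def pc_density(x_lo, x_hi, n_slices=50, n_bins=50, n_shifts=5):
        # Bin the segments joining axis values x_lo (lower axis) to x_hi
        # (upper axis).  Rescale so both axes share a common [0, 1] scale.
        u_lo = (x_lo - x_lo.min()) / (np.ptp(x_lo) or 1.0)
        u_hi = (x_hi - x_hi.min()) / (np.ptp(x_hi) or 1.0)
        fine = n_bins * n_shifts              # fine grid for the ASH average
        density = np.zeros((n_slices, n_bins))
        for s in range(n_slices):
            t = (s + 0.5) / n_slices          # mid-height of this slice
            # Each segment crosses the mid-slice line exactly once, so all
            # k observations contribute exactly one count per slice.
            pos = u_lo + t * (u_hi - u_lo)
            h, _ = np.histogram(pos, bins=fine, range=(0.0, 1.0))
            smooth = np.convolve(h, np.ones(n_shifts) / n_shifts, mode="same")
            density[s] = smooth.reshape(n_bins, n_shifts).sum(axis=1)
        return density                        # feed to a contour plotter

The rows of the returned array are the horizontal slices; contouring or gray-scale coding this array gives the density plot described above.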
6.2 Relative Slope Plots. We have already seen that parallel line segments in a parallel
coordinate plot correspond to high positive correlation (a linear relationship). As in our automobile
example, it is possible for two or more sets of linear relationships to exist simultaneously. In an
ordinary parallel coordinate plot, we see these as sets of parallel lines with distinct slopes. The work of
Cleveland and McGill (1984b) suggests that comparison of slopes (angles) is a relatively inaccurate
judgement task and that it is much easier to compare magnitudes on the same scale. The relative
slope plot is motivated by this. In an n-dimensional relative slope plot there are n-1 parallel axes,
each corresponding to a pair of axes, say x_i and x_j, with x_j regarded as the lower of the two coordinate
axes. For each observation, the slope of the line segment between the pair of axes is plotted as a
magnitude between -1 and +1. The maximum positive slope is coded as +1, the minimum negative
slope as -1 and an infinite slope as 0. The magnitude is calculated as cot q, where q is the angle between
the x_j axis and the line segment corresponding to the observation. Each individual observation in the
relative slope plot corresponds to a vertical section through the axis system. An example of a relative
slope plot is given in Figure 6.2. Notice that since slopes are coded as heights, simply laying a
straightedge will allow us to discover sets of linear relationships within the pair of variables x_i and x_j.
6.3 Color Histograms. The basic set-up for the color histogram is similar to the relative slope
plots. For an n-dimensional data set, there are n parallel axes. A vertical section through the diagram
corresponds to an observation. The idea is to code the magnitude of an observation along a given axis
by a color bin, the colors being chosen to form a color gradient. We typically choose 8 to 15 colors.
The diagram is drawn by choosing an axis, say x_k, and sorting the observations in ascending order.
Along this axis, we see blocks of color arranged according to the color gradient with the width of each
block being proportional to the number of observations falling into the color bin. The observations on
the other axes are arranged in the order corresponding to the x_k axis and color coded according to their
magnitude. Of course, if the same color gradient shows up, say, on the x_m axis as on the x_k axis, then we
know x_k is positively "correlated" with x_m. If the color gradient is reversed, we know the "correlation"
is negative. We use the phrase "correlation" advisedly since, in fact, if the color gradient is the same
but the color block sizes are different, the relationship is nonlinear. Of course, if the x_m axis shows
color speckle, there is no "correlation" and x_k is unrelated to x_m. An example of a color histogram is
given in Figure 6.3 (for purposes of reproduction here it is really a gray-scale histogram).
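A minimal sketch of the construction follows (our notation, not the original implementation: data is a k x n array, ref picks the sorting axis, and equal-width color bins are an assumption; any monotone binning would do):

    import numpy as np

    def color_histogram(data, ref=0, n_colors=10):
        # Order the k observations by the reference axis, then map each
        # axis to color-bin indices 0 .. n_colors-1 (the color gradient).
        order = np.argsort(data[:, ref])
        codes = np.empty((data.shape[1], data.shape[0]), dtype=int)
        for j in range(data.shape[1]):
            col = data[order, j]
            edges = np.linspace(col.min(), col.max(), n_colors + 1)
            codes[j] = np.clip(
                np.searchsorted(edges, col, side="right") - 1, 0, n_colors - 1)
        return codes

A row of codes that increases smoothly along with the reference row signals a positive "correlation"; a reversed gradient signals negative association, and speckle signals none.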
7. Implementations and Experiences. Our parallel coordinates data analysis software has been
implemented in two forms, one a PASCAL program operating on the IBM RT under the AIX
operating system. This code allows for up to four simultaneous windows and offers simultaneous
display of parallel coordinate and scatter diagram displays. It offers highlighting, zooming and other
similar features and also allows the possibility of nonlinear rescaling of each axis. It incorporates axes
permutations and also includes Parallel Coordinate Density Plots, Relative Slope Plots and Color
Histograms.
Our second implementation is under development in PASCAL for MS-DOS machines and
includes similar features. In addition, it has a mouse-driven painting capability and can do real-time
rotation of 3-dimensional scatterplots. Both programs use EGA graphics standards, with the second
also using VGA or Hercules monochrome standards.
We regard the parallel coordinate representation as a device complementary to scatterplots. A
major advantage of the parallel coordinate representation over the scatterplot matrix is the linkage
provided by connecting points on the axes. This linkage is difficult to duplicate in the scatterplot
matrix. Because of the projective line-point duality, the structures seen in a scatterplot can also be
seen in a parallel coordinate plot. Moreover, the work of Cleveland and McGill (1984b) suggests that
it is easier and more accurate to compare observations on a common scale. The parallel coordinate
plot and its derivatives de facto have a common scale, and so, for example, a sense of variability
and central tendency among the variables is easier to grasp visually in parallel coordinates than in
the scatterplot matrix. On the other hand, one might interpret all the ink generated by
the lines as a significant disadvantage of the parallel coordinate plot. Our experience on this is mixed.
Certainly for large data sets on hard copy this is a problem. When viewed on an interactive graphics
screen, particularly a high resolution screen, we have often found that individual points in a scatterplot
can get lost because they are simply not bright enough. That does not happen in a parallel coordinate
plot. However, if many points are plotted in monochrome, it is hard to distinguish between points.
We have gotten around this problem by plotting distinct points in different colors. In an EGA
implementation, this means 16 colors. This is surprisingly effective in separating points. In one
experiment, we plotted 5000 5-dimensional random vectors using 16 colors, and in spite of total
overplotting, we were still able to see some structure. In data sets of somewhat smaller scale, we have
implemented a scintillation technique. With this technique, when there is overplotting we cause the
screen view to scintillate between the colors representing the overplotted points. The speed of
scintillation is proportional to the number of points overplotted, and by carefully tracing colors, one
can follow an individual point through the entire diagram.
We have found painting to be an extraordinarily effective technique in parallel coordinates.
We have a painting scheme that not only paints all lines within a given rectangular area, but also all
lines lying between two slope constraints. This is very effective in separating clusters. We also use
invisible paint to eliminate observation points from the data set temporarily. This is a natural way of
doing a subset selection.
References
Asimov, Daniel (1985), "The grand tour: a tool for viewing multidimensional data," SIAM J.
Scient. Statist. Comput., 6, 128-143.
Carr, D. B., Nicholson, W. L., Littlefield, R., Hall, D. L. (1986), "Interactive color display
methods for multivariate data," in Statistical Image Processing and Graphics (Wegman, E. and
DePriest, D., eds.), New York: Marcel Dekker, Inc.
Chernoff, H. (1973), "Using faces to represent points in k-dimensional space," J. Am. Statist.
Assoc., 68, 361-368.
Cleveland, W. S. and McGill, R. (1984a), "The many faces of the scatterplot," J. Am. Statist.
Assoc., 79, 807-822.
Cleveland, W. S. and McGill, R. (1984b), "Graphical perception: theory, experimentation, and
application to the development of graphical methods," J. Am. Statist. Assoc., 79, 531-554.
Diaconis, P. and Friedman, J. (1983), "M and N plots," in Recent Advances in Statistics, 425-
447, New York: Academic Press, Inc.
Dimsdale, B. (1984), "Conic transformations and projectivities," IBM Los Angeles Scientific
Center Report #6320-2753.
Fienberg, S. (1979), "Graphical methods in statistics," Am. Statistician, 33, 165-178.
Friedman, J. and Tukey, J. W. (1973), "PRIM-9," a film produced by Stanford Linear
Accelerator Center, Stanford, CA: Bin 88 Productions, April, 1973.
Friedman, J. and Tukey, J. W. (1974), "A projection pursuit algorithm for exploratory data
analysis," IEEE Trans. Comput., C-23, 881-890.
Griffin, H. D. (1958), "Graphic computation of tau as a coefficient of disarray," J. Am.
Statist. Assoc., 53, 441-447.
Hartigan, John A. (1975), Clustering Algorithms, New York: John Wiley and Sons, Inc.
Holmes, S. D. (1928), "Appendix B: a graphical method for estimating R for small groups,"
391-394 in Educational Psychology (Peter Sandiford, auth.), New York: Longmans, Green and Co.
Inselberg, A. (1985), "The plane with parallel coordinates," The Visual Computer, 1, 69-91.
Scott, D. W. (1985), "Average shifted histograms: effective nonparametric density estimators
in several dimensions," Ann. Statist., 13, 1024-1040.
253
I
I,
00
it
0
0
Ii
00
0 0
0 0
0 0
S...
0
0~ o0
I
0~ oS
00 0
hi,
N S..
254
0,
4C
.5
g
- am slo
Wolu t plot
256 ainsO
Fipi. 4.1. Parallel cOOKRdiat plot of a circle.
258
Figure 5.2. The second permutation of the five dimensional presentation of the automobile
data. Notice the two classes of linear relationships in gear ratio and miles per gallon.
Figure 5.3. The third permutation of the five dimensional automobile data. Note the
highlighting of the domestic automobile group.
Figure 6.1. Parallel coordinate density plot of 5000 uniform random variables. This plot has
five contour levels: 5%, 25%, 50%, 75% and 95%.
263
till 1IIIIIIsIIIIL"
- II1111111 " " II II I II
Hhilimps in
Il"1Li
il11"'",1"
ll ll111111It I" ','
II,l
IIIIIIS~
'Prie I
Displacen
Gear Rati
Figure 6.2 Relative slope plot of five dimensional automobile
data. Data prusented In the same order as in Figure 7.4
264
COMPUTATIONAL AND STATISTICAL ISSUES
IN DISCRETE-EVENT SIMULATION

Peter W. Glynn
and
Donald L. Iglehart

Abstract

Discrete-event simulation is one of the most important techniques available for studying
complex stochastic systems. In this paper we review the principal methods available
for analyzing both the transient and steady-state simulation problems in sequential and
parallel computing environments. Next we discuss several of the variance reduction methods
designed to make simulations run more efficiently. Finally, a short discussion is given
of the methods available to study system optimization using simulation.
1. Introduction.

Computer simulation of complex stochastic systems is an important technique for
evaluating system performance. The starting point for this method is to formulate the
time-varying behavior of the system as a basic stochastic process Y = {Y(t) : t >= 0},
where Y(.) may be vector-valued. [Discrete-time processes can also be handled.] Next
a computer program is written to generate sample realizations of Y. Simulation output
is then obtained by running this program. Our discussion in this paper is centered on
the analysis of this simulation output; the goal is to develop sound probabilistic and
statistical methods for estimating system performance.
Two principal problems arise: the transient simulation problem and the steady-state
simulation problem. Let T denote a stopping time and X = h({Y(t) : 0 <= t <= T}), where h
is a given real-valued function. The transient problem is to estimate a = E{X}. Examples
of a include the following:

a = E{f(Y(t0))},

a = E{ integral from 0 to t0 of f(Y(s)) ds },

and

a = P{Y does not enter A before t0}.

Here t0 is a fixed time (> 0), f is a given real-valued function, and A is a given subset of
the state-space of Y. The transient problem is relevant for systems running for a limited
(but possibly random) length of time that cannot be expected to reach a steady-state. Our
goal here is to provide both point and interval estimates for a.
For the steady-state problem we assume the Y process is asymptotically stationary
in the sense that

(1/t) integral from 0 to t of f(Y(s)) ds => a

as t -> infinity. Here => denotes weak convergence and f is a given real-valued function
defined on the state-space of Y. The easiest example to think about here is an irreducible,
positive recurrent, continuous time Markov chain. In this case Y(t) => Y as t -> infinity and
a = E{f(Y)}. Examples of a in this case include the following:
a = P{Y in A}

and

a = E{c(Y)},

where c is a given cost function. Again, as in the transient case, we wish to construct both
point and interval estimates for a.
2. Transient Problem.

Assume we have a computational budget of t time units with which to simulate the
process Y and estimate a = E{X}, as defined in Section 1. In a sequential computing
environment we would generate independent, identically distributed (iid) copies

(X1, tau1), (X2, tau2), ...,

where the Xj's are copies of X and tauj is the computer time required to generate Xj. Let
N(t) denote the number of copies of X generated in time t; this is just the renewal process
associated with the iid tauj's. A natural point estimator for a is

Xbar_N(t) = (1/N(t)) sum from i=1 to N(t) of Xi, if N(t) > 0; Xbar_N(t) = 0, if N(t) = 0.

The standard asymptotic results for Xbar_N(t) are the strong law of large numbers (SLLN)
and the central limit theorem (CLT).

STRONG LAW OF LARGE NUMBERS. If E{tau1} < infinity and E{|X1|} < infinity, then
Xbar_N(t) -> a a.s. as t -> infinity.

CENTRAL LIMIT THEOREM. If E{tau1} < infinity and var{X1} < infinity, then

t^(1/2) [Xbar_N(t) - a] => (E{tau1} var{X1})^(1/2) N(0, 1)

as t -> infinity, where N(0, 1) is a mean zero, variance one normal random variable. The SLLN follows
from the SLLN for iid summands and the SLLN for renewal processes. The CLT result
can be found in BILLINGSLEY (1968), Section 17.
From the SLLN we see that Xbar_N(t) is a strongly consistent point estimator for a. Thus
for large t we would use Xbar_N(t) as our point estimate. On the other hand, the CLT can be
used in the standard manner to construct a confidence interval for a. Here the constant
E{tau1} var{X1} appearing in the CLT would have to be estimated.
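A small sketch may make the sequential procedure concrete. In the following Python fragment (ours, not the authors'; the toy sampler and the 95% normal quantile are assumptions) replications are generated until the budget is exhausted and the CLT constant E{tau1} var{X1} is estimated from the run itself:

    import numpy as np
    rng = np.random.default_rng(1)

    def sequential_estimate(budget, sample):
        # sample() returns (X_j, tau_j); keep replications finishing by t.
        total, xs = 0.0, []
        while True:
            x, tau = sample()
            if total + tau > budget:
                break                      # N(t) replications completed
            total += tau
            xs.append(x)
        n = len(xs)
        xbar = np.mean(xs)
        # CLT half-width: estimate E{tau1} by total/n and var{X1} from data.
        hw = 1.96 * np.sqrt((total / n) * np.var(xs, ddof=1) / budget)
        return xbar, (xbar - hw, xbar + hw)

    # e.g. X = min(Z, 3) for Z exponential, with tau proportional to the work:
    est, ci = sequential_estimate(
        1000.0, lambda: ((z := min(rng.exponential(), 3.0)), 1.0 + z))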
Suppose now that we are in a parallel computing environment with p independent
processors. Now we wish to estimate a for a fixed t as p -> infinity. On the p processors we
generate iid copies of (X, tau):

(X_j1, tau_j1), (X_j2, tau_j2), ..., for processors j = 1, 2, ..., p.

A first estimator averages the sequential estimators across processors:

a1(p, t) = (1/p) sum from j=1 to p of Xbar_Nj(t).

Here the processing ends on all processors at time t. If E{tau1} < infinity and E{|X1|} < infinity,
then for all t > 0

a1(p, t) -> E{Xbar_N(t) 1{N(t) >= 1}} a.s.

as p -> infinity, which in general differs from a for fixed t, so a1 carries a bias due to the
budget constraint. A second estimator requires each processor to complete the replication
in progress at time t:

a2(p, t) = [ sum over j of sum from i=1 to Nj(t)+1 of X_ji ] / [ sum over j of (Nj(t) + 1) ].

Here all processors complete by the time the replications in progress at t are finished, and

a2(p, t) -> E{ sum from i=1 to N(t)+1 of Xi } / E{N(t) + 1} = E{N(t) + 1} E{X1} / E{N(t) + 1} = a a.s.

as p -> infinity. The equality above is simply Wald's equation. Finally, since a2(p, t) is a ratio estimator,
a CLT is also available from which a confidence interval can be constructed.
The last estimator we consider was proposed by HEIDELBERGER and GLYNN
(1987). Here we set

a3(p, t) = (1/p) sum from j=1 to p of Xtilde_Nj(t),

where

Xtilde_Nj(t) = Xbar_Nj(t) 1{Nj(t) >= 1} + X_j1 1{tau_j1 > t}.

Given N(t) >= 1, Heidelberger and Glynn show that the pairs of random variables (X1, tau1),
..., (X_N(t), tau_N(t)) are exchangeable. Using this fact, they prove that E{Xtilde_N(t)} = E{X1}.
Since the Xtilde_Nj(t)'s are iid, we see that a3(p, t) is strongly consistent for a = E{X}. Since
the summands in a3(p, t) are iid, the standard CLT holds (under appropriate variance
assumptions) and can be used to develop a confidence interval for a. Note that the
definition of Xtilde_Nj(t) requires the jth processor to complete the replication in process at
time t only if no observations have been completed by time t; i.e., if tau_j1 > t. Thus the completion
time for all p processors is given by

T_p = max{ t, tau_j1 : tau_j1 > t, 1 <= j <= p }.

While T_p -> infinity a.s. as p -> infinity (if P{tau1 > t} > 0), T_p goes to infinity at a much slower
rate than is the case for a2(p, t). They also show that the following CLT holds:

t^(1/2) [Xtilde_N(t) - a] => sigma E^(1/2){tau1} N(0, 1)

as t -> infinity, where we assume 0 < sigma^2 = var{X1} < infinity and 0 < E{tau1} < infinity. Thus Xtilde_N(t)
can also be used in a sequential environment to estimate a.
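The distinction between the parallel estimators is easy to see in simulation. The sketch below (our construction; the per-processor loop and the sampler interface are assumptions) computes a2, which completes the replication in progress at t on every processor, and a3, which does so only on processors with no completions by t:

    import numpy as np
    rng = np.random.default_rng(2)

    def parallel_estimates(p, t, sample):
        # sample() returns (X, tau).  Returns (alpha2, alpha3).
        num2 = den2 = 0.0
        tilde = []
        for _ in range(p):
            clock, xs = 0.0, []
            while clock <= t:             # replication crossing t is finished
                x, tau = sample()
                clock += tau
                xs.append(x)
            n_t = len(xs) - 1             # N_j(t): replications done by t
            num2 += sum(xs)               # all N_j(t) + 1 completed copies
            den2 += n_t + 1
            tilde.append(np.mean(xs[:n_t]) if n_t >= 1 else xs[0])
        return num2 / den2, float(np.mean(tilde))

By Wald's equation the first ratio converges to a as p grows, while each summand of the second is unbiased by the exchangeability argument above.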
3. Steady-State Problem.

The steady-state estimation problem is considerably more difficult than the transient
estimation problem. This difficulty stems from the following considerations: (i) the need to
estimate long-run system behavior from a finite length simulation run; (ii) an initial bias
(or transient) is usually present since the process being simulated is non-stationary; and
(iii) strong autocorrelations are usually present in the process being simulated. While
classical statistical methods can often be used for the transient estimation problem, these
methods generally fail for the steady-state estimation problem for the reasons mentioned
above.
Assume our simulation output process is Y = {Y(t) : t >= 0} and that, for a given real-
valued function f,

a(t) = (1/t) integral from 0 to t of f(Y(s)) ds -> a a.s. as t -> infinity.    (1)

As stated above, we wish to construct point and interval estimators for a. In addition to
(1), many methods also assume that a positive constant sigma exists such that the following
CLT holds:

t^(1/2) [a(t) - a] => sigma N(0, 1)    (2)

as t -> infinity. From (1) and (2) we can construct a point estimate and confidence interval for
a provided we can estimate sigma. Estimating sigma is generally the hardest problem.
A variety of methods have been developed to address the steady-state estimation
problem. In Figure 1 we have given a break-down of these methods. Most of the methods
are single replicate methods, since multiple replicate methods tend to be inefficient because
of the initial bias problem.

[Figure 1. Break-down of methods for the steady-state estimation problem: multiple
replication versus single replication; time series, spectral, and regenerative approaches;
consistent estimation and cancellation methods.]

Here we only consider single replicate methods. These methods are of two types:
those that consistently estimate sigma and those in which sigma is cancelled out.
For consistent estimation of sigma, we need a process {s(t) : t >= 0} such that s(t) -> sigma
a.s. as t -> infinity,
in which case (2) leads to a 100(1 - delta)% confidence interval for a given by

[ a(t) - z(1 - delta/2) s(t)/t^(1/2), a(t) + z(1 - delta/2) s(t)/t^(1/2) ],

where PHI(z(1 - delta/2)) = 1 - delta/2 and PHI is the standard normal distribution function.
On the other hand, the canceling out methods require a non-vanishing process {Z(t) :
t >= 0} such that

[ t^(1/2)(a(t) - a), Z(t) ] => [ sigma N(0, 1), sigma Z ]

as t -> infinity. Then, using the continuous mapping theorem (cf. BILLINGSLEY (1968), p.
30), we have

t^(1/2) (a(t) - a) / Z(t) => N(0, 1)/Z    (3)

as t -> infinity. Note from (3) that sigma has been cancelled out in a manner reminiscent of the
t-statistic.
The consistent-estimation method we describe is the regenerative method. On the
original time scale the point estimator is

a(t, f) = (1/t) integral from 0 to t of f(Y(s)) ds,

where t is the length of time the simulation is run. On the random time scale our point estimator is
given by

a_n(f) = Ybar_n(f) / taubar_n,

where Ybar_n(f) (respectively, taubar_n) is the sample mean of Y1(f), ..., Yn(f) (tau1, ..., taun), with
Yk(f) the integral of f(Y(.)) over the kth regenerative cycle and tauk the length of that cycle. Here the
Y process is simulated to the completion of n regenerative cycles. Using the basic facts
(i) and (ii) above, it can be shown that both a(t, f) and a_n(f) are strongly consistent for
a(f) as t and n respectively tend to infinity. Next we define Zk = Yk(f) - a(f) tauk and
assume that var{Zk} = sigma^2 < infinity. Then it can be shown that the following two CLT's hold
as t -> infinity and n -> infinity:

t^(1/2) [a(t, f) - a(f)] => (sigma / E^(1/2){tau1}) N(0, 1)

and

n^(1/2) [a_n(f) - a(f)] => (sigma / E{tau1}) N(0, 1).

These two CLT's can then be used to construct confidence intervals for a(f) provided both
sigma^2 and E{tau1} can be estimated. The mean E{tau1} is easily estimated by taubar_n, and sigma^2 can be
estimated from its definition in terms of Yk(f) and tauk.
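As a concrete sketch (ours; the cycle generator interface is an assumption), the regenerative point estimate and interval can be computed from the n pairs (Yk(f), tauk) as follows:

    import numpy as np

    def regenerative_ci(cycle, n, z=1.96):
        # cycle() returns (Y_k, tau_k): the integral of f over cycle k
        # and the cycle length.  Uses Z_k = Y_k - alpha * tau_k in the CLT.
        Y, tau = np.array([cycle() for _ in range(n)]).T
        alpha = Y.mean() / tau.mean()
        sigma = np.std(Y - alpha * tau, ddof=1)
        hw = z * sigma / (tau.mean() * np.sqrt(n))
        return alpha, (alpha - hw, alpha + hw)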
Next we turn to a discussion of the principal method available for canceling out sigma.
This is the method of standardized time series developed by SCHRUBEN (1983). Our
discussion is based on the paper GLYNN and IGLEHART (1989) and uses some results
from weak convergence theory; see BILLINGSLEY (1968) for background on this theory.
From our output process Y we form the random elements of C[0, 1], the space of real-valued
continuous functions on the interval [0, 1], given by

Ybar_n(t) = (1/n) integral from 0 to nt of Y(s) ds

and

X_n(t) = n^(1/2) [Ybar_n(t) - a t],

where 0 <= t <= 1 and n >= 1. Now we make the basic assumption that a finite, positive
constant sigma exists such that

X_n => sigma B as n -> infinity,    (4)
where B is standard Brownian motion. This assumption holds for a wide class of output
processes. To find the scaling process {Z(t) : t >= 0}, consider the class M of functions
g : C[0, 1] -> R such that

(i) g(ax) = a g(x) for all a > 0 and x in C[0, 1];
(ii) g(B) > 0 with probability one;
(iii) g(x + bk) = g(x) for all real b and x in C[0, 1], where k(t) = t;
(iv) P{B in D(g)} = 0, where D(g) is the set of discontinuities of g.

The process

S_n(t) = [Ybar_n(t) - a t] / g(Ybar_n), 0 <= t <= 1,

is called a standardized time series. Using weak convergence arguments it is easy to show
from (4) that

S_n(1) => B(1)/g(B)    (5)

as n -> infinity. Unfolding this CLT we have the following 100(1 - delta)% confidence interval for
a:

[ Ybar_n(1) - z(1 - delta/2) g(Ybar_n), Ybar_n(1) - z(delta/2) g(Ybar_n) ],

where P{B(1)/g(B) <= z(a)} = a for 0 < a < 1. Thus each g in M gives rise to a
confidence interval for a provided we can find the distribution of B(1)/g(B). Fortunately,
this can be done for a number of interesting g functions.
One of the g functions leads to the batch means method, perhaps the most popular
method for steady-state simulation. We conclude our discussion of the method of stan-
dardized time series by displaying this special g function. To this end we first define the
Brownian bridge mapping GAMMA : C[0, 1] -> C[0, 1] as

(GAMMA x)(t) = x(t) - t x(1).

Now think of partitioning our original output process Y into m >= 2 intervals of equal
length and define the mapping b_m : C[0, 1] -> R by

b_m(x) = [ (m/(m - 1)) sum from i=1 to m of (x(i/m) - x((i - 1)/m))^2 ]^(1/2)

for x in C[0, 1]. Finally, the g function of interest is g_m = b_m o GAMMA. To see that g_m corresponds
to the batch means method we observe that

g_m(Ybar_n) = m^(-1/2) [ (1/(m - 1)) sum from i=1 to m of (Zbar_i(n) - Zbar(n))^2 ]^(1/2),

where

Zbar_i(n) = [ integral from (i-1)n/m to in/m of Y(s) ds ] / (n/m)

is the ith batch mean of the process {Y(t) : 0 <= t <= n} and Zbar(n) is the grand mean.
Specializing (5) to the function g_m, we see that

[Ybar_n(1) - a] / g_m(Ybar_n) => B(1)/g_m(B),

and B(1)/g_m(B) has Student's t distribution with m - 1 degrees of freedom, so that (5)
reduces to the classical batch means confidence interval.
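In practice the batch means interval is computed directly from the batch averages; the following sketch (ours; equal batch sizes and a fixed batch count m are assumptions) is exactly the classical procedure that the g_m derivation recovers:

    import numpy as np
    from scipy import stats

    def batch_means_ci(y, m=10, delta=0.05):
        # Split one long run into m batches; t quantile has m - 1 df.
        n = len(y) - len(y) % m
        batches = y[:n].reshape(m, n // m).mean(axis=1)
        grand, s = batches.mean(), batches.std(ddof=1)
        hw = stats.t.ppf(1.0 - delta / 2.0, m - 1) * s / np.sqrt(m)
        return grand, (grand - hw, grand + hw)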
VRT's are based on some analytic knowledge or structural properties of the process being
simulated.
The first VRT we discuss is known as importance sampling. This idea was first
developed in conjunction with the estimation of a = E{h(X)}, where h is a known real-
valued function and X a random variable with density, say, f. Instead of sampling X from
f, we sample X from a density g which has been selected to be large in the regions that
are "most important," namely, where |h| f is largest. Then we estimate a by the sample
mean of h(X) f(X)/g(X); see HAMMERSLEY and HANDSCOMB (1964).
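A toy example (ours, not from the paper) shows the mechanics: to estimate a = P{X > 4} for X standard normal, sample from a normal density centered where the rare event lives and reweight by the likelihood ratio f/g:

    import numpy as np
    from scipy import stats
    rng = np.random.default_rng(4)

    def tail_prob_is(n=100_000, c=4.0):
        x = rng.normal(loc=c, size=n)                     # draws from g
        w = stats.norm.pdf(x) / stats.norm.pdf(x, loc=c)  # ratio f(x)/g(x)
        return float(np.mean((x > c) * w))                # approx 3.17e-5

Crude sampling from f would need on the order of 1/a draws to see even one success; the weighted estimator attains useful accuracy with far fewer.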
This same basic idea can be carried forward to the estimation of parameters associated
with stochastic processes. We generate the process with a new probabilistic structure and
estimate a modified parameter to produce an estimate of the original quantity of interest.
The example we consider here is the M/M/1 queue with arrival rate lambda, service rate mu,
and traffic intensity rho = lambda/mu < 1. Let V denote the stationary virtual waiting time and
consider estimating the quantity a = P{V > u} for large u. When rho is less than one, the
virtual waiting time process has a negative drift and an impenetrable barrier at zero. Thus
the chance of the process getting above a large u is small, and a long simulation would be
required to estimate a accurately. The idea used here in importance sampling is to generate
a so-called conjugate process obtained by reversing the roles of lambda and mu. For the conjugate
process the traffic intensity is greater than one, and the estimation problem becomes much
easier. ASMUSSEN (1985) reports efficiency increases on the order of a factor of 3 to a
factor of 400 over straight regenerative simulation, depending on the values of rho and u. In
general, importance sampling can yield very significant variance reductions. Further work
along these lines can be found in SIEGMUND (1976), GLYNN and IGLEHART (1989),
SHAHABUDDIN et al. (1988), and WALRAND (1987).
The second VRT we discuss is known as indirect estimation. Assume we are interested
in estimating a = E{X}, but happen to know that E{Y} = aE{X} + b, where a and b are
known. Sometimes it happens that a CLT associated with the estimation of E{Y} will have
a smaller variance constant associated with it than does the CLT for estimating E{X}. In
this case we would prefer to estimate E{Y}, and we use the affine transformation above to
yield an estimate for E{X}. This idea has proved to be useful in queueing simulations where
the affine transformation is a result of Little's Law. In general, the variance reductions realized
using this method are not dramatic, being usually less than a factor of 2. For further
results along these lines, see LAW (1975) and GLYNN and WHITT (1989). While the
affine transformation works in queueing theory, it is conceivable that other transformations
might arise in different contexts.
The third and final VRT we discuss here is known as discrete time conversion. Suppose
that X = {X(t) : t >= 0} is an irreducible, positive recurrent, continuous time Markov
chain (CTMC). Then X(t) => X as t -> infinity, and we may be interested in estimating
a = E{f(X)}, where f is a given real-valued function. As we have discussed above, the
regenerative method can be used to estimate a. A CTMC has two sources of randomness:
the embedded discrete time jump chain and the exponential holding times in the successive
states visited. The discrete time conversion method eliminates the randomness due to the
holding times by replacing them by their expected values. It has been shown that this
leads to a variance reduction when estimating a. Also, as an added side benefit, computer
time is saved since the exponential holding times no longer need to be generated. Gains in
efficiency for this method can be substantial. Further discussion of this idea can be found
in HORDIJK, IGLEHART, and SCHASSBERGER (1976), and FOX and GLYNN (1986).
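A sketch of the conversion for a CTMC with generator Q (our illustration; the ratio-of-time-averages form is one standard way to organize the estimate):

    import numpy as np
    rng = np.random.default_rng(5)

    def dtc_estimate(Q, f, n_jumps, i0=0):
        # Simulate only the embedded jump chain; replace each exponential
        # holding time by its mean 1/q_i.
        q = -np.diag(Q)                   # exit rates q_i
        P = Q / q[:, None]                # embedded transition probabilities
        np.fill_diagonal(P, 0.0)
        i, num, den = i0, 0.0, 0.0
        for _ in range(n_jumps):
            w = 1.0 / q[i]                # expected holding time in state i
            num += f(i) * w
            den += w
            i = rng.choice(len(q), p=P[i])
        return num / den                  # estimates E{f(X)}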
Suppose X = {X_n : n >= 0} is a discrete time Markov chain (DTMC) and that the cost of
running system theta for r + 1 steps is g(theta, X0, ..., Xr). The expected cost of running system
theta is then given by

a(theta) = E_theta{ g(theta, X0, ..., Xr) },

where E_theta is expectation relative to the probability measure P(theta) associated with system theta.
If E_theta{.} were independent of theta, we would simply simulate iid replicates of
grad g(theta, X0, ..., Xr). By introducing the likelihood function L(theta, X0, ..., Xr) it is possi-
ble to write a(theta) as

a(theta) = E_theta0{ g(theta, X0, ..., Xr) L(theta, X0, ..., Xr) },

so that

grad a(theta) = E_theta0{ grad [ g(theta, X0, ..., Xr) L(theta, X0, ..., Xr) ] },

where the interchange of grad and E_theta0 must be justified. A similar approach can be developed
to estimate the gradient of a performance criterion for a steady-state simulation. For an
overview of this approach see GLYNN (1987), and REIMAN and WEISS (1986).
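A minimal sketch of the score-function form of this estimator (our toy example, not the paper's): for a two-state chain whose probability of remaining in state 0 is theta, the gradient of an expected path cost is estimated by averaging the cost times the accumulated derivative of the log path probability:

    import numpy as np
    rng = np.random.default_rng(6)

    def lr_gradient(theta, n_steps=20, n_rep=5000):
        grads = []
        for _ in range(n_rep):
            x, g, score = 0, 0.0, 0.0
            for _ in range(n_steps):
                p_stay = theta if x == 0 else 0.5
                stay = rng.random() < p_stay
                if x == 0:                # only these steps depend on theta
                    score += 1.0 / theta if stay else -1.0 / (1.0 - theta)
                x = x if stay else 1 - x
                g += x                    # path cost g = sum of X_n
            grads.append(g * score)
        return float(np.mean(grads))      # estimates d/dtheta E_theta{g}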
The second method which has been proposed for estimating gradients is called the
infinitesimal perturbation analysis (IPA) method. In this method a derivative, with respect
to an input parameter, of a simulation sample path is computed. For example, we might
be interested in estimating the mean stationary waiting time for a queueing system as well
as its derivative with respect to the mean service time. Since we are taking a derivative
of the sample path inside an expectation operator, the interchange of expectation and
differentiation must be justified in order to produce an estimate for the gradient, grad a(theta),
say. The IPA method assumes that if the change in the input parameter theta is small
enough, then the times at which events occur get shifted slightly, but their order does
not change. It has been shown that the IPA method yields strongly consistent estimates
for the performance gradient in a variety of queueing contexts; see HEIDELBERGER,
CAO, ZAZANIS, and SURI (1988) for details on the IPA method and a listing of queueing
problems for which the technique works.
REFERENCES
ASMUSSEN, S. (1985). Conjugate processes and the simulation of ruin problems. Stoch.
Proc. Appl. 20, 213-229.
BILLINGSLEY, P. (1968). Convergence of Probability Measures. John Wiley and Sons,
New York.
BRATLEY, P., FOX, B., and SCHRAGE, L. (1987). A Guide to Simulation. 2nd Ed.
Springer-Verlag, New York.
FOX, B. and GLYNN, P. (1986). Discrete-time conversion for simulating semi-Markov
processes. Operations Research Letters 5, 191-196.
GLYNN, P. and WHITT, W. (1989). Indirect estimation via L = lambda W. Operations
Research 37, 82-103.
GLYNN, P. (1987). Likelihood ratio gradient estimation: an overview. Proceedings of
the 1987 Winter Simulation Conference, 368-375.
GLYNN, P. and HEIDELBERGER, P. (1987). Bias properties of budget constrained
Monte Carlo simulations, I: estimating a mean. Technical Report, Department of
Operations Research, Stanford University.
GLYNN, P. and IGLEHART, D. (1989). Simulation output analysis using standardized
time series. To appear in Math. of Operations Res.
GLYNN, P. and IGLEHART, D. (1989). Importance sampling for stochastic simulation.
To appear in Management Sci.
HAMMERSLEY, J. and HANDSCOMB, D. (1964). Monte Carlo Methods. Methuen,
London.
HEIDELBERGER, P. (1987). Discrete event simulations and parallel processing: statisti-
cal properties. IBM Research Report RC 12733, Yorktown Heights, New York.
HEIDELBERGER, P., CAO, X-R., ZAZANIS, M. and SURI, R. (1988). Convergence
properties of infinitesimal perturbation analysis estimates. Management Sci. 34,
1281-1302.
HORDIJK, A., IGLEHART, D. and SCHASSBERGER, R. (1976). Discrete-time methods
for simulating continuous time Markov chains. Adv. Appl. Prob. 8, 772-788.
IGLEHART, D. (1978). The regenerative method for simulation analysis. In Current
Trends in Programming Methodology - Software Modeling (K. M. Chandy and R. T.
Yeh, editors). Prentice-Hall, Englewood Cliffs, NJ, 52-71.
LAW, A. (1975). Efficient estimators for simulated queueing systems. Management Sci.
22, 30-41.
REIMAN, M. and WEISS, A. (1986). Sensitivity analysis via likelihood ratios. Proceed-
ings of the 1986 Winter Simulation Conference, 285-289.
SCHRUBEN, L. (1983). Confidence interval estimation using standardized time series.
Operations Research 31, 1090-1108.
SHAHABUDDIN, P., NICOLA, V., HEIDELBERGER, P., GOYAL, A., and GLYNN, P.
(1988). Variance reduction in mean time to failure simulations. Proceedings of the
1988 Winter Simulation Conference, 491-499.
SIEGMUND, D. (1976). Importance sampling in the Monte Carlo study of sequential
tests. Ann. Statist. 4, 673-684.
WALRAND, J. (1987). Quick simulation of rare events in queueing networks. Proceedings
of the Second International Workshop on Applied Mathematics and Per-
formance/Reliability Models of Computer/Communication Systems (G.
Iazeolla, P. J. Courtois, and O. J. Boxma, eds.). North Holland Publishing Co.,
Amsterdam, 275-286.
WILSON, J. (1984). Variance reduction techniques for digital simulation. Amer. J.
Math. Management Sci. 4, 277-312.
Bayesian Inference for
Weibull Quantiles
Mark G. Vangel
The Weibull model is widely used to represent failure data in engineering
applications. One reason is that the Weibull distribution is the
limiting distribution of the suitably normalized minimum of a sample of
positive iid random variables under quite general conditions (Barlow and
Proschan, 1975, ch. 6). The model is therefore appropriate for the strength
of a system composed of a string of many links where the strengths of the
links are iid and the system fails when the weakest link fails (Bury, 1975,
ch. 16). An example of a physical system which can be modeled in this way
is the strength of a brittle fiber in tension. Another reason why the
Weibull model is used is that the distribution is very flexible, and con-
sequently it often fits data well.
Finally we reach the third approach, which is the focus of this paper.
For any location-scale family (e.g., the extreme value family) and any equiv-
ariant estimators of the parameters (e.g., MLE's), the distribution of certain
pivotals can be obtained exactly if one conditions on the ancillary sta-
tistics. From these pivotals one can get exact conditional confidence
bounds and tolerance limits for any sample size. The method is applicable
to both complete and Type II censored samples (i.e., samples for which only
the r smallest order statistics are observed) and requires no tables.
Since the intervals have exact conditional confidence, it follows that they
are also exact unconditionally. In addition, this method has the advantage
of making use of all of the information with respect to the parameters which
is in the data (the parameter estimates are in general not sufficient
statistics), though for the Weibull model this does not appear to be a
practical concern (Lawless, 1973). This conditional approach is apparently
due to Lawless, who introduced it in (Lawless, 1972). An exposition of the
procedure appears in (Lawless, 1982), which is also useful as a guide to the
literature.
each dataset as one iteratively approximates the confidence limit.
One goal of this project has been to implement the Lawless procedure
for the extreme value distribution in a 'robust' FORTRAN program which can
be used with little user interaction. Another goal has been to investigate
a recent approximation to the conditional procedure (DiCiccio, 1987) which
is accurate to O_p(n^(-3/2)). This approximation makes the calculation of
posterior distributions feasible. A FORTRAN program to calculate and plot
the posterior distribution of Weibull quantiles which makes use of the
DiCiccio result is discussed. The results of a small simulation to assess
the accuracy of the approximation are presented, though little effort was
spent on the simulation since the order of convergence in probability has
been determined.
The shape parameter MLE a-hat satisfies the likelihood equation

[ SUM(i=1 to r) x_i^a log x_i + (n - r) x_r^a log x_r ] / [ SUM(i=1 to r) x_i^a + (n - r) x_r^a ]
    - 1/a = (1/r) SUM(i=1 to r) log x_i.
2. The Extreme Value Distribution

If X has a Weibull distribution, then

Y = log(X)

has an extreme value distribution. Estimators u-hat and b-hat of the location and scale
parameters are equivariant if

u-hat(c + d y_1, ..., c + d y_n) = c + d u-hat(y_1, ..., y_n) and
b-hat(c + d y_1, ..., c + d y_n) = d b-hat(y_1, ..., y_n)

for any c and any d > 0. The maximum likelihood estimates are readily seen
to be equivariant.
Let the sample size be n and, to allow for Type II censoring, let r <= n
be the number of observed data values. Denote the density of G(.) by g(.).
First we demonstrate that the following random variables are pivotal; that
is, they have probability distributions which do not depend on the
parameters:

Z1 = (u-hat - u)/b-hat,  Z2 = b-hat/b,  Z3 = (u-hat - u)/b.

Let y_i, 1 <= i <= r, be the order statistics of a random sample from H(y). Consider
the random variables

w_i = (y_i - u)/b, 1 <= i <= r.

The w_i are the order statistics of a random sample from G(.) and hence are
obviously pivotal. Since the estimator u-hat is assumed to be equivariant, we
have that

(1/b) [u-hat(y_1, ..., y_r) - u] = u-hat(w_1, ..., w_r) = (u-hat - u)/b = Z3.

Similarly,

b-hat(w_1, ..., w_r) = b-hat((y_1 - u)/b, ..., (y_r - u)/b) = b-hat(y_1, ..., y_r)/b = Z2.

Finally,

Z1 = (u-hat - u)/b-hat = [(u-hat - u)/b] (b/b-hat) = Z3/Z2.
The ancillary statistics are

a_i = (y_i - u-hat)/b-hat, 1 <= i <= r;

by equivariance they satisfy

u-hat(a_1, ..., a_r) = [u-hat(y_1, ..., y_r) - u-hat]/b-hat = 0

and

b-hat(a_1, ..., a_r) = b-hat(y_1, ..., y_r)/b-hat = 1.

The joint density of (Z1, Z2) conditional on the ancillaries is

h(z1, z2 | a) = k(a, r, n) z2^(r-1) [ PROD(i=1 to r) g(a_i z2 + z1 z2) ] [1 - G(a_r z2 + z1 z2)]^(n-r),

where k(a, r, n) is a normalizing constant. Note also the relations

y_i = b-hat a_i + u-hat,  u-hat = Z1 b-hat + u = b Z1 Z2 + u,  b-hat = Z2 b.
Confidence Intervals for Extreme Value Quantiles

For the extreme value distribution the marginal density of Z2, conditional on the
ancillaries, is

h(z2 | a) = k(a) z2^(r-2) exp(z2 SUM(i=1 to r) a_i) / [SUM* exp(a_i z2)]^r,

where SUM* denotes the sum over the r observed values plus (n - r) copies of the a_r term.
A pivotal for the pth quantile y_p = u + w_p b is

Z_p = (u-hat - y_p)/b-hat = Z1 - w_p/Z2,

where

w_p = ln(-ln(1 - p)).
The double integral can now be written as a single integral by recognizing
that the inner integral is the incomplete gamma function:

P{Z_p <= zp-hat | a} = integral from 0 to infinity of
    I( exp(w_p + zp-hat z) SUM* exp(a_i z), r ) h(z | a) dz,

where

zp-hat = (u-hat - x_p)/b-hat.
Let {y_i} be the order statistics of a Type II censored sample of size r <= n
from an extreme value distribution. The usual joint noninformative prior
for the location parameter (u) and scale parameter (b) of a location-scale
family is

pi(u, b) proportional to 1/b.

Using the expression for the extreme value pdf given in a previous
section, the corresponding posterior distribution is seen to be

pi(u, b | y) proportional to b^-(r+1) exp( SUM(i=1 to r) (y_i - u)/b - SUM* exp((y_i - u)/b) ),

and, integrating out u, the marginal posterior of the scale is

pi(b | y) proportional to (1/b)^r exp( SUM(i=1 to r) y_i/b ) / [SUM* exp(y_i/b)]^r.
Let psi(u, b) be any scalar function of the parameters about which in-
ference is to be made. Assume that psi(u, b) is monotonically increasing in u
for fixed b. If a function can be found which satisfies this condition
piecewise, then the following results may still be applied to each monotonic
section of the function. Some useful choices for psi are

psi(u, b) = exp(-exp((t - u)/b))    (reliability at time t)

and the quantiles discussed below. For fixed b, let eta(s, b) solve

psi[eta(s, b), b] = s.

The posterior probability that psi(u, b) is at most s then reduces to a single
integral over b, the inner integral over u again being the incomplete gamma function

I(s, r) = (1/GAMMA(r)) integral from 0 to s of x^(r-1) exp(-x) dx.
For a confidence bound on the pth quantile,

psi(u, b) = u + w_p b,

the pth quantile of the extreme value distribution (equivalently, x_p = exp(u + w_p b)
on the Weibull scale). The fact that, for inference about quantiles and about the
shape (or, in terms of the extreme value distribution, scale) parameter, the Bayesian
approach is equivalent to the Lawless conditional approach will be demonstrated next.
A A
z - b/b
a a
292
;Ib I
r exp (Zaiz)/[Z*oxp (aLz)I z ze'2 d
b/bl
- J h(z Ia) d.
t - (U - s)/b
A A A A
exp (wP + tb/b - u/b (!*exp (aLb/b + u/b))) -
A A
*Xp (wp + b/b (Z*oxp (a b/b))).
A
The change of variable Z - b/b gives the desired result.
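For computation, the posterior of a quantile can be evaluated by one-dimensional quadrature. The Python sketch below (ours, standing in for the FORTRAN of the appendix; the integration range for b is an arbitrary assumption and the evaluation is not guarded against overflow) uses the incomplete gamma identity directly:

    import numpy as np
    from scipy import integrate, special

    def quantile_posterior_cdf(y, n, p, s):
        # Posterior P{y_p <= s | data} under the prior 1/b; y holds the
        # r = len(y) observed log lifetimes out of n (Type II censoring).
        y = np.asarray(y, float)
        r = len(y)
        wp = np.log(-np.log(1.0 - p))
        S = lambda b: np.exp(y / b).sum() + (n - r) * np.exp(y.max() / b)
        post_b = lambda b: b ** (-r) * np.exp(y.sum() / b) / S(b) ** r
        Z, _ = integrate.quad(post_b, 0.01, 50.0)   # normalizing constant
        # Given b, integrating u yields a Gamma(r, 1) tail probability:
        f = lambda b: post_b(b) * special.gammaincc(
            r, np.exp(wp - s / b) * S(b))
        num, _ = integrate.quad(f, 0.01, 50.0)
        return num / Z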
Details on inference for this family may be found in (Farewell and Prentice,
1977) and (Lawless, 1980). Note that the case k = 1 corresponds to the
Weibull distribution.
If T has a generalized gamma distribution, then Y = log(T) can be
written in the form Y = mu + sigma W, where

mu = log(b) + log(k)/alpha,   sigma = 1/(alpha k^(1/2)),

and W has density

f(w; k) = [k^(k-1/2)/GAMMA(k)] exp(k^(1/2) w - k exp(w/k^(1/2))).

The pth quantile of Y is

y_p = mu + w_p sigma.
A A
that is, the KLE of the pth quantile is required to equal y.' If p and a
denote the unconstrained MlX's, and if L(p, a) denotes the log of the log
generalized gamma likelihood, then the asymptotic distribution of the sta-
tistic
A A
V(yPO) - -2[L(p, ) L(, a)]
is x with one degree of freedom. Law:lsss (1984, sec. 4.2) suggests that
inference based on V(y, 0 ) is acceptable for moderate to large samples but
that the approximation may be inadequate for small samples.
294
DiCiccio (1987) has applied general techniques of Barndorff-Nielsen
(1986) in order to develop a computationally inexpensive modification to the
signed square root of V(y_p0) which yields a likelihood ratio based approxim-
ation suitable even for quite small sample sizes. The numerical integration
required by the exact methods is only troublesome for moderate to large
samples, so the approximation is actually of questionable use over the range
of sample sizes for which it is inaccurate.
I will not reproduce the details of the DiCiccio approximation here for
two reasons. The most important of these is that only the results are
presented in (DiCiccio, 1987), and to repeat these results without having
studied their derivation would serve no purpose. A second reason is that
although the approximation is inexpensive to compute, the formulas are
messy, and to reproduce them here is to invite typographical errors. Inter-
ested readers should refer to (DiCiccio, 1987) and to the FORTRAN implementa-
tion as subroutine LAWAPX in the appendix.
than the overall sample size, which is not surprising. Overall, the
DiCiccio result appears to be satisfactory for samples of 10 or more un-
censored values, and remarkably good for samples of 30 or more observed
values. This conclusion is based partly on the small simulation presented
here and partly on experimenting with various cases of real and simulated
data.
The work presented here has been motivated by a need for improved
methodology for calculating basis values and for communicating lower tail
quantile information to the engineer. Typically, the engineer who routinely
calculates and interprets these numbers has little appreciation for the
rather convoluted frequency arguments behind tolerance limits. The long run
proportion of times a statistic calculated from successive samples of size n
from a hypothetical population is greater than a certain quantile of that
population is of little help to the statistically naive. The simple state-
ment that the tenth (first) percentile is greater than the B-basis (A-basis)
value with 95 percent probability is much more direct and intuitive. Also,
the Bayesian approach presents all of the information in the data about the
lower tail quantile of interest, which is what should be the ultimate
concern of the engineer anyway. The fact that the tolerance limit is only a
convenient summary statistic of this distribution becomes clear when the
user is presented with the entire posterior and shown how to determine
arbitrary tolerance limits graphically.
former is motivated by frequency considerations, while the latter is derived
from a Bayesian point of view. The recent work of DiCiccio (1987) greatly
reduces the computational burden of both methods with little loss of
accuracy.
References

Bury, K. (1975). Statistical Models in Applied Science, New York: John Wiley
and Sons.

Lawless, J. F. (1972). "Confidence interval estimation for the parameters of
the Weibull distribution," Utilitas Mathematica, 2, 71-87.

Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data, New
York: John Wiley and Sons.
Table

A simulation of 95% lower confidence bounds on the 10th percentile using the
Weibull distribution with 100 replicates per case was performed. The
results are summarized below:

    10   10   10   1   1.09    .029
    20   20   10   1    .380   .0086
    10   10    5   1   2.20    .059
    20   20    5   1    .761   .017
    30   20    5   1    .752   .016
Carbon fiber/epoxy specimen tensile strength data
95% LCB on 10th percentile:

    A   48   244.1    244.3
    B   36   271.4    271.6
    C   33   228.2    228.5
    D   25   269.5    269.8

    A   23    77.8     77.64
    B   18    76.36    76.50
    C   30    77.18    77.23
    D   10    78.45    78.64
Figure 1

Figure 2

Figure 3

Figure 4
Appendix: FORTRAN Listings

The following programs were developed on an Alliant FX/8 and should run with
little modification on any 32-bit machine. However, the software has not been
tested to the point where it can be considered error free. The programs are
provided as a guide to an individual wishing to implement the algorithms discussed
in this paper.
      program lawpgm
c
c     Mark Vangel
c     Program to implement Lawless' procedure for
c     conditional confidence intervals on quantiles
c     for a location/scale family.  The family chosen
c     here is extreme value.  Data may be Type II censored.
c     Note that conditioning on the ancillaries gives the
c     equivalent of an HPD region for a noninformative
c     prior.
c
c     The driver prompts for an output unit and filename,
c     reads data from a file (the first record of the input
c     file holds the sample size and the number of uncensored
c     values; the first field on each remaining record is a
c     batch indicator not used by this program) or generates
c     a Weibull pseudorandom sample of requested seed, shape
c     and scale, calls the Lawless routine, and plots the
c     resulting conditional CDF or density with Tektronix
c     Plot 10 calls (initt, binitt, dsplay, cplot, ...),
c     with user-chosen ranges for abscissa and ordinate.
      program oprtpil
c
c     Mark Vangel
c     Program to calculate and plot the posterior of a
c     percentile for the Weibull model.  This program calls
c     subroutines from the Tektronix Plot 10 library.
c
c     The driver reads the data, computes the posterior on a
c     grid of quantile values, and plots either the CDF or
c     the density, with user-chosen ranges for abscissa and
c     ordinate.
      program tavl
c
c     Mark Vangel
c     Program to test by simulation an approximation of
c     DiCiccio (1987) to the Lawless conditional procedure.
c
c     For each replicate the driver generates a Weibull
c     pseudorandom sample of given shape and scale, computes
c     the 'exact' conditional limit and the DiCiccio
c     approximation, and accumulates a summary of the
c     differences.
      subroutine lawlss (x, n, k, p, gamma, xtl)
c
c     Mark Vangel
c     Subroutine lawlss calculates one-sided lower tolerance limits
c     for the Weibull model using the Lawless conditional procedure.
c     This routine performs 'exact' calculations by numerical
c     integration.  For even moderately large samples, the DiCiccio
c     approximation used in subroutine lawapx is very accurate
c     and computationally less troublesome than the method used here.
c
c     x     -- Data                                       (Input)
c     n     -- Total sample size                          (Input)
c     k     -- Number of (uncensored) observations        (Input)
c     p     -- Probability associated with quantile xp    (Input)
c     gamma -- Confidence level for LCB on xp             (Input)
c     xtl   -- 'Exact' lower tolerance limit              (Output)
c
c     -- We need the ancillaries and their sum.
c     -- Next we obtain the constant of integration.  We use an
c        adaptive quadrature routine (IMSL dqdagi) to integrate
c        xntgnd on (0, infinity); the normalizing constant is
c        the reciprocal of this integral.
c     -- The pth quantile of the standard extreme value
c        distribution is  wp = log (-log (one - p)).
c     -- Brent's algorithm (IMSL dzbren) is used to find the
c        tolerance limit factor such that the integral of yntgnd
c        over (0, infinity) equals gamma.  The first pass has a
c        large error tolerance to save time; the second pass,
c        on a bracketing interval about the first root, uses
c        the final tolerance.
      subroutine lawapx (x, nsamp, nobs, p, gamma, atol)
c
c     Mark Vangel
c     An excellent approximation to the one-sided Weibull
c     conditional tolerance limits of Lawless (1975).
c
c     x     -- Data                                       (Input)
c     nsamp -- Total sample size                          (Input)
c     nobs  -- Number of (uncensored) observations        (Input)
c     p     -- Probability associated with quantile xp    (Input)
c     gamma -- Confidence level for LCB on xp             (Input)
c     atol  -- Approximate lower tolerance limit          (Output)
c
c     The limit is found by bracketing and solving with Brent's
c     method (IMSL dzbren), first with a coarse tolerance and
c     then, on a refined bracket, with a fine one.
      subroutine iconf (x, nsamp, nobs, p, g, tollmt)
c
c     Mark Vangel
c     A non-iterative first approximation to the Lawless conditional
c     procedure (or, alternatively, to the posterior of a quantile under
c     a flat prior).  The routine is written for two parameter Weibull
c     analysis, but extension to the generalized log gamma family is
c     straightforward.  This routine returns the estimated confidence
c     limit for a provided probability level and confidence (hence
c     the 'i' - for inverse - in the routine name).  This routine is
c     approximately inverse to 'acdf'.  It provides the same result
c     as 'lawapx' but with a less accurate approximation.
c
c     Ref: DiCiccio, T.J., Technometrics 1987
c
c     x      -- Data                                      (Input)
c     nsamp  -- Total sample size                         (Input)
c     nobs   -- Number of (uncensored) observations       (Input)
c     g      -- Confidence level for limit on xp          (Input)
c     p      -- Probability associated with quantile xp   (Input)
c     tollmt -- Upper tolerance limit                     (Output)
c
c     The routine obtains the Weibull MLE's (wblmle), forms the
c     derivatives d(i,j) of the log likelihood, computes the
c     approximate mean and variance of the signed square root of
c     -2 times the likelihood ratio in terms of the d(i,j), applies
c     DiCiccio's correction, and transforms back to the Weibull
c     scale:  tollmt = exp (tollmt).
      double precision function xntgnd (z)
c
c     Proportional to the pdf of a certain pivotal quantity.
c     The normalizing constant 'cnorm' depends on the
c     data and must be obtained by a preliminary numerical
c     integration.  Once 'cnorm' is known, xntgnd becomes a
c     pdf and is used by yntgnd during the primary numerical
c     integration to get the tolerance limit.
c
c     Note: 'cnorm' must be initialized (to 1) before the
c     preliminary integration.  Following that, 'cnorm'
c     can be assigned the value which makes xntgnd a pdf.
c
c     The body evaluates the conditional density of Z2 given
c     the ancillaries (censored terms included in the sums)
c     and divides by cnorm.
      double precision function aconf (t)
c
c     Function to determine the probability of the pth
c     quantile being less than the limit corresponding to
c     the tolerance limit factor t.
c
c     The integral of yntgnd over (0, infinity) is evaluated
c     with adaptive quadrature (IMSL dqdagi) and gamma is
c     subtracted, so that a zero of aconf corresponds to
c     confidence level gamma.  Progress is written to the
c     terminal to reassure the user during long runs.
      double precision function yntgnd (z)
c
c     Function to calculate the integrand for
c     determining the confidence: the incomplete gamma
c     factor times the conditional density xntgnd.

      double precision function aconfi (t)
c
c     Zero-finding objective for the approximate routines,
c     companion to aconf.
      subroutine acdf (x, nsamp, nobs, q, p, conf)
c
c     Mark Vangel
c     An approximation to the Lawless conditional procedure (or,
c     alternatively, to the posterior of a quantile under a flat prior).
c     The routine is written for two parameter Weibull analysis, but
c     extension to the generalized log gamma family is straightforward.
c
c     Ref: DiCiccio, T.J., Technometrics 1987
c
c     x     -- Data                                       (Input)
c     nsamp -- Total sample size                          (Input)
c     nobs  -- Number of (uncensored) observations        (Input)
c     q     -- Value for which F (q | x) is desired       (Input)
c     p     -- Probability associated with quantile xp    (Input)
c     conf  -- F (q | x)                                  (Output)
c
c     The routine obtains the constrained MLE's (cwbmle), with q
c     constrained to be the pth quantile, evaluates the signed
c     square root of the likelihood ratio statistic, corrects its
c     mean and variance using the derivatives d(i,j) of the log
c     likelihood, and returns the corresponding normal probability.
      SUBROUTINE WBLMLE (A, B, NSAMP, NOBS, X, AEPS, ITER, MAXIT)
C
C     MARK VANGEL
C
C     ESTIMATE WEIBULL PARAMETERS BY MAXIMUM LIKELIHOOD.
C
C     A     -- WEIBULL SHAPE PARAMETER                    (RETURNED)
C     B     -- WEIBULL SCALE PARAMETER                    (RETURNED)
C     NSAMP -- SAMPLE SIZE                                (INPUT)
C     NOBS  -- NUMBER OF OBSERVATIONS                     (INPUT)
C              (NSAMP-NOBS) VALUES ARE TYPE II CENSORED
C     X     -- DATA VALUES                                (INPUT)
C     AEPS  -- ABSOLUTE CONVERGENCE TOLERANCE             (INPUT)
C     ITER  -- NUMBER OF ITERATIONS REQUIRED              (RETURNED)
C     MAXIT -- MAXIMUM NUMBER OF ITERATIONS               (INPUT)
C
C     A METHOD-OF-MOMENTS STARTING VALUE FOR THE SHAPE IS
C     INTERPOLATED FROM A TABLE INDEXED BY THE COEFFICIENT OF
C     VARIATION; THE DATA ARE SCALED, THE LIKELIHOOD EQUATION IS
C     SOLVED BY NEWTON ITERATION ON THE SHAPE, THE SCALE IS THEN
C     COMPUTED FROM THE SHAPE, AND THE DATA ARE UNSCALED.
      SUBROUTINE CWBMLE (P, XP, A, B, NSAMP, NOBS, X, AEPS, ITER, MAXIT)
C
C     MARK VANGEL
C
C     ESTIMATE WEIBULL PARAMETERS BY MAXIMUM LIKELIHOOD, SUBJECT TO THE
C     CONSTRAINT THAT THE ESTIMATED PTH QUANTILE AT THE MLE
C     IS EQUAL TO XP.
C
C     P     -- PROBABILITY LEVEL OF QUANTILE              (INPUT)
C     XP    -- PTH QUANTILE                               (INPUT)
C     A     -- WEIBULL SHAPE PARAMETER                    (RETURNED)
C     B     -- WEIBULL SCALE PARAMETER                    (RETURNED)
C     NSAMP -- SAMPLE SIZE                                (INPUT)
C     NOBS  -- NUMBER OF OBSERVATIONS                     (INPUT)
C              (NSAMP-NOBS) VALUES ARE TYPE II CENSORED
C     X     -- DATA VALUES                                (INPUT)
C     AEPS  -- ABSOLUTE CONVERGENCE TOLERANCE             (INPUT)
C     ITER  -- NUMBER OF ITERATIONS REQUIRED              (RETURNED)
C     MAXIT -- MAXIMUM NUMBER OF ITERATIONS               (INPUT)
C
C     -- USE THE COEFFICIENT OF VARIATION TO GET A METHOD OF
C        MOMENTS ESTIMATE OF THE SHAPE (TABLE INTERPOLATION).
C     -- LOOP UNTIL CONVERGENCE OR MAXIMUM ITERATION COUNT:
C        CALCULATE THE SUMS FIRST, THEN THE NEWTON SHAPE UPDATE,
C        AND CHECK FOR CONVERGENCE.
C     -- CALCULATE THE SCALE PARAMETER FROM THE SHAPE PARAMETER,
C        THEN UNSCALE THE DATA.
MAKING FISHER'S EXACT TEST RELEVANT
Paul H. Thrasher
Engineering Branch
Reliability, Availability, and Maintainability Division
Army Materiel Test and Evaluation Directorate
White Sands Missile Range, New Mexico 88002-5175
ABSTRACT
Q-values are normally calculated using the same algorithm used to find
pre-test Type II errors. The q-value calculation inputs are normally (1) the
p-value instead of the pre-test Type I risk, (2) the sample size actually used
instead of the sample size planned, and (3) the same relevant values of the
parameter considered in the pre-test Type II risk calculation. Since the
Fisher-Irwin Exact Method doesn't historically have a design stage, there is
no pre-test algorithm available for modification. This paper develops the
necessary algorithm by extending the p-value calculation based on the binomial
rather than the hypergeometric distribution.
The q-value equations are developed and their mathematical properties are
examined. Computer programming methods are discussed. Examples are provided
for sample sizes both (1) small enough that a hand-held calculator can be used
and (2) large enough to require a digital computer. Numerical results are
interpreted from the viewpoint of a manager who must balance non-zero Type I
and Type II risks.
method has not reported information about the Type-II error. This error may
be called the consumer's error, the government's error, or the error of concern
for the advocates of change.
The Fisher-Irwin Exact Method can provide relevant information about the
Type-II error. This additional information results from calculating and
reporting q-values. Q-values are the probabilities of being wrong in
marginally failing to reject the null hypothesis when the two samples are from
different populations. Since the two populations may differ in different
ways, there is a q-value for each pair of separate populations. Managers can
use a q-value, for a relevant pair of unequal populations, as evidence for
concluding that the two samples are from those different populations. They
reach this conclusion if they believe a relevant q-value is sufficiently high.
This paper provides equations for and examples of calculating (1) the
p-value and (2) q-values for the Fisher-Irwin Exact Method using a one-sided
analysis. This one-tailed analysis is used to reject a single population in
favor of two populations that differ in the direction indicated by the data.
This paper also discusses a digital computer program. This program has
been written to (1) handle the necessary voluminous calculations for large
sample sizes, (2) retain the analyst's identification of the two measurement
samples and the two mutually exclusive and exhaustive categories, and (3)
provide a report from which a manager can decide if future actions should be
based on one or two populations.
P-VALUE CALCULATION
The data for the Fisher-Irwin Exact Method, often called Fisher's Exact
Test, consists of four numbers. They and their sums are normally arranged in
a square array. The following array has double entries to illustrate both the
general situation and a specific example:
Since the choices of Samples I and II and of Categories 1 and 2 are both
arbitrary, there are four possible ways the data can be arranged. The
ambiguity has been removed from the above table by naming the samples and
categories to make (1) n > (N-n) and (2) r/n > (R-r)/(N-n).
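This standard ordering is mechanical and can be sketched in a few lines. The fragment below is illustrative only (Python is used here as a stand-in; the paper's own program appears later, and the function name is hypothetical):

    def standardize(a, b, c, d):
        # Relabel a 2x2 table with rows (Sample I, Sample II) and
        # columns (Category 1, Category 2), given as counts a, b, c, d,
        # so that n > N-n and r/n > (R-r)/(N-n), as described above.
        # Assumes both samples are nonempty.
        if a + b < c + d:                    # make Sample I the larger sample
            (a, b), (c, d) = (c, d), (a, b)
        if a / (a + b) < c / (c + d):        # make Category 1 the one Sample I is richer in
            a, b, c, d = b, a, d, c
        return a, b, c, d

    # The specific example is already standard:
    # standardize(19, 2, 12, 3) -> (19, 2, 12, 3)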
There are two methods of calculating the p-value. The best known uses
the hypergeometric distribution. The second uses the binomial distribution.
Both are described and illustrated in pages 195-203 of Bradley, James V.,
Distribution-Free Statistical Tests, Prentice-Hall, Inc., 1968.
332
Hypergeometric Approach:

    P[Obtaining the Data] = nCr (N-n)C(R-r) / NCR

where iCj is the number of ways of choosing j items from i items. Both iCj
and iC(i-j) are found from the following relation of factorials:

    iCj = iC(i-j) = i! / [j! (i-j)!]

For the specific example,

    p-value = 21C2 15C3 / 36C5 + 21C1 15C4 / 36C5 + 21C0 15C5 / 36C5 = .34
333
if (1) n-r is less than expected because n-r < n[(N-R)/N] and (2) mI2 is
the minimum possible number of items in Sample I from Category 2,

    p-value = Σ(i = mI2 to n-r) nCi (N-n)C(N-R-i) / NC(N-R) ;

if (1) n-r is more than expected because n-r > n[(N-R)/N] and (2) MI2 is the
maximum possible number of items in Sample I from Category 2,

    p-value = Σ(i = n-r to MI2) nCi (N-n)C(N-R-i) / NC(N-R) ;

if (1) r is less than expected because r < n[R/N] and (2) mI1 is the minimum
possible number of items in Sample I from Category 1,

    p-value = Σ(i = mI1 to r) nCi (N-n)C(R-i) / NCR ;

if (1) r is more than expected because r > n[R/N] and (2) MI1 is the maximum
possible number of items in Sample I from Category 1,

    p-value = Σ(i = r to MI1) nCi (N-n)C(R-i) / NCR .
Binomial Approach:
334
With the restriction that these two samples are independent, the probability
of obtaining both samples is the product of the two above equations; this
reduces to

    P[Both Samples] = nCr (N-n)C(R-r) p^R (1-p)^(N-R) .
Finally, the conditional probability of obtaining the two samples given that
the combined sample has been obtained is
    P[(r in n) and (R-r in N-n) | (R in N)] = P[Both Samples] / P[Combined Sample] .
This is the same equation as was obtained using the hypergeometric approach
and the rest of the calculation of the p-value proceeds identically.
    P[Both Samples] = (.236)(.203) = .0479 ,
335
    P[31;36,.861 in Combined Sample] = 36C31 (.861)^31 (1-.861)^(36-31) = .189 ,

so the conditional probability of obtaining the two samples given the combined
sample is .0479/.189 = .253. The value of .253 is obviously the same
intermediate result as was obtained in the hypergeometric approach. In fact,
any value of p yields .253.
The last column contains the probabilities of the different ways that the 31
items can be distributed. The sum of this column is the probability of having
31 items from Category 1 in the two samples. The value of .189 obviously
agrees with the shorter calculation in the preceding paragraph. A modified
version of the longer calculation of this paragraph will be needed in the
calculation of q-values.
Before calculating q-values, it is illustrative and useful to obtain the
p-value from the table of the preceding paragraph. Note first that the .0479
in the last column of the starred row corresponding to the data agrees with
P[Obtaining both Samples] from the short binomial calculation of two
paragraphs ago. Note second that the entries above .0479 in the table
correspond to probabilities of obtaining more unlikely partitions than the
data. Using these two facts yields

    p-value = .34 .
336
This result of .34 does not depend on the number used for p. This may be
seen by (1) calculating another table using any p other than 31/36 = .861 and
(2) summing the probabilities of obtaining partitions as extreme as the data.
This last method of calculating the p-value emphasizes that the data are
viewed marginally. The data are viewed as unbalanced just enough to warrant
rejection of the single-population hypothesis.
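The independence from p is easy to demonstrate numerically. The sketch below (Python as a stand-in; the function names are hypothetical) builds the binomial table for any p and sums the partitions as extreme as the data; it assumes a standardly ordered table, where r is at least as large as expected:

    from math import comb

    def binom(k, m, p):
        return comb(m, k) * p**k * (1 - p)**(m - k)

    def p_value(N, n, R, r, p):
        lo, hi = max(0, n + R - N), min(n, R)
        terms = {t: binom(t, n, p) * binom(R - t, N - n, p)
                 for t in range(lo, hi + 1)}
        # data and more extreme partitions: t >= r in a standardly ordered table
        return sum(v for t, v in terms.items() if t >= r) / sum(terms.values())

    print(round(p_value(36, 21, 31, 19, 31/36), 3))   # 0.337
    print(round(p_value(36, 21, 31, 19, 0.5), 3))     # 0.337 -- p cancels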
Q-VALUE CALCULATION
337
Category 1 Approach:
For the specific example in this discussion, one table used to calculate
a q-value for pI = 19/21 and pII = 12/15 is
The sum of .191 on the lower right represents the probability of obtaining a
total of 31 items from Category 1 from the two samples. This probability can
be divided into entries in the right hand column to find the conditional
probabilities of obtaining possible numbers of items from Category 1 in
Samples I and II. Taking the data and more extreme divisions of the R items
from Category 1 as evidence of rejection, the complement of the q-value is
found from

    1 - q-value = .70 .

Using a more conventional approach, the q-value can be found from the less
extreme divisions of the R items from Category 1 to be

    q-value = .30 .
338
This procedure may be stated formally with the following equation:

    q-value = { Σ(r' = min(r') to r-1) P[r';n,pI] P[R-r';N-n,pII] } /
              { Σ(r' = min(r') to max(r')) P[r';n,pI] P[R-r';N-n,pII] }

where min(r') and max(r') are the minimum and maximum values of r' allowed by
the constraints imposed by fixed values of N, n, and R. Increase of r' is
limited by the total sizes of both Sample I and Category 1. That is, r' must
simultaneously satisfy r' ≤ n and r' ≤ R. Thus the upper limit on the above
sums is

    max(r') = min(n,R) .

Decrease of r' is limited by r' ≥ 0 and by Sample II holding at most N-n
items, so that R-r' ≤ N-n; these imply

    min(r') = max(0,n+R-N) .

With these limits, the q-value is

    q-value = { Σ(r' = max(0,n+R-N) to r-1) P[r';n,pI] P[R-r';N-n,pII] } /
              { Σ(r' = max(0,n+R-N) to min(n,R)) P[r';n,pI] P[R-r';N-n,pII] } .
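This equation translates directly into a few lines of code. The sketch below (again Python as a stand-in, with hypothetical names) reproduces the .191 denominator and the q-value of the example:

    from math import comb

    def binom(k, m, p):
        return comb(m, k) * p**k * (1 - p)**(m - k)

    def q_value(N, n, R, r, pI, pII):
        lo, hi = max(0, n + R - N), min(n, R)
        term = lambda t: binom(t, n, pI) * binom(R - t, N - n, pII)
        den = sum(term(t) for t in range(lo, hi + 1))   # .191 in the example
        num = sum(term(t) for t in range(lo, r))        # less extreme divisions
        return num / den

    print(round(q_value(36, 21, 31, 19, 19/21, 12/15), 3))   # 0.297, the .30 above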
339
Category 2 Approach:

Using Category 2 instead of Category 1 leads to the following table: The
numbers in the right three columns are the same as in the previous table
used in the Category 1 approach. The calculation proceeds as before, but the
indices in the summation appear differently. For the specific example,

    1 - q-value = { Σ(n-r' = 0 to n-r) P[n-r';21,.0952] P[(N-n)-(R-r');15,.200] } /
                  { Σ(n-r' = 0 to max(n-r')) P[n-r';21,.0952] P[(N-n)-(R-r');15,.200] } ,

and in general

    q-value = { Σ(n-r' = n-r+1 to max(n-r')) P[n-r';n,1-pI] P[(N-n)-(R-r');N-n,1-pII] } /
              { Σ(n-r' = min(n-r') to max(n-r')) P[n-r';n,1-pI] P[(N-n)-(R-r');N-n,1-pII] }
340
where min(n-r') and max(n-r') are the minimum and maximum possible numbers
of Category 2 items in Sample I. By arguments similar to those used in the
Category 1 approach, n-r' ≤ n and n-r' ≤ N-R imply that

    max(n-r') = min(n,N-R)

and n-r' ≥ 0 and R-r' ≥ 0 → n-r' ≥ n-R imply that

    min(n-r') = max(0,n-R) .

With these limits, and with the summation index changed back to r', the
q-value becomes

    q-value = { Σ(r' = max(0,n+R-N) to r-1) P[r';n,pI] P[R-r';N-n,pII] } /
              { Σ(r' = max(0,n+R-N) to min(n,R)) P[r';n,pI] P[R-r';N-n,pII] } ,

which is identical to the Category 1 equation.
341
Range, Sum with P-Value, and Symmetry:

The q-value, like any other probability, is bounded by zero and one.
This is verified for the Category 1 equation by splitting the sum in the
denominator into two sums, the first running from r' = min(r') = max(0,n+R-N)
to r' = r-1 and the second from r' = r to r' = max(r') = min(n,R). Dividing
both numerator and denominator by the sum in the numerator then yields

    q-value = 1 / ( 1 + { Σ(r' = r to min(n,R)) P[r';n,pI] P[R-r';N-n,pII] } /
                        { Σ(r' = max(0,n+R-N) to r-1) P[r';n,pI] P[R-r';N-n,pII] } ) .

Since this equation's ratio of the two sums is never negative but will be
infinite if the data yield r = max(0,n+R-N), q-value ≥ 0. Since this ratio
can be made 0 by choosing pI and pII equal to 0 or 1, q-value ≤ 1.
When pI equals pII, the q-value is one minus the p-value. This occurs
because the possible values of r' are divided into two mutually exclusive and
exhaustive sets. One set contains possible measurements as unlikely or more
unlikely than r. The other contains values of r' more likely than r. These
two sets identify conditional probabilities that are summed to find the
p-value and q-value. The p-value summation uses the unlikely set with both pI
and pII equated to any common probability. The q-value summation uses the
likely set with any values of pI and pII. The mutual exclusiveness and
exhaustiveness of the two sets require that p-value + q-value = 1 when pI = pII.
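This complement identity gives a quick consistency check on the two sketches above (it assumes the p_value and q_value functions already defined there):

    for p in (0.5, 31/36, 0.9):
        total = p_value(36, 21, 31, 19, p) + q_value(36, 21, 31, 19, p, p)
        assert abs(total - 1) < 1e-12   # p-value + q-value = 1 when pI = pII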
For the Fisher-Irwin Exact Method, the q-value is symmetric about the off
diagonal in a plot of pI versus pII. That is, symmetry is expressed by

    q-value(N,n,R,r,pI,pII) = q-value(N,n,R,r,1-pII,1-pI) .

This may be seen by applying the binomial equation P[i;m,p] = mCi p^i (1-p)^(m-i)
to the Category 1 equation in a chain of equations:

    q-value = { Σ(r' = max(0,n+R-N) to r-1) nCr' pI^r' (1-pI)^(n-r') (N-n)C(R-r') pII^(R-r') (1-pII)^(N-n-R+r') } /
              { Σ(r' = max(0,n+R-N) to min(n,R)) nCr' pI^r' (1-pI)^(n-r') (N-n)C(R-r') pII^(R-r') (1-pII)^(N-n-R+r') }

            = { (No r' Factor) Σ(r' = max(0,n+R-N) to r-1) nCr' (N-n)C(R-r') [pI(1-pII)/((1-pI)pII)]^r' } /
              { (No r' Factor) Σ(r' = max(0,n+R-N) to min(n,R)) nCr' (N-n)C(R-r') [pI(1-pII)/((1-pI)pII)]^r' }

where the (No r' Factor) is (1-pI)^n pII^R (1-pII)^(N-n-R). This constant has
been factored from each term of the sum over r'. After canceling this factor
from the equation, the symmetry is evident because substituting 1-pII for pI
and 1-pI for pII yields the same equation.
The q-value for the Fisher-Irwin Exact Method also is symmetric in n and
R. Applying iCj = i!/[j!(i-j)!] to the above equation and canceling n! and
(N-n)! yields

    q-value = { Σ(r' = max(0,n+R-N) to r-1) [pI(1-pII)/((1-pI)pII)]^r' / [r'! (n-r')! (R-r')! (N-n-R+r')!] } /
              { Σ(r' = max(0,n+R-N) to min(n,R)) [pI(1-pII)/((1-pI)pII)]^r' / [r'! (n-r')! (R-r')! (N-n-R+r')!] } ,

which is unchanged when n and R are interchanged. That is,

    q-value(N,n,R,r,pI,pII) = q-value(N,R,n,r,pI,pII) .
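Both symmetries can likewise be spot-checked numerically with the q_value sketch above (a numerical check under this paper's equations, not a proof):

    base = q_value(36, 21, 31, 19, 0.9, 0.8)
    # off-diagonal symmetry: (pI, pII) -> (1-pII, 1-pI)
    assert abs(base - q_value(36, 21, 31, 19, 1 - 0.8, 1 - 0.9)) < 1e-9
    # n and R may be interchanged
    assert abs(base - q_value(36, 31, 21, 19, 0.9, 0.8)) < 1e-9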
343
RECAPITULATION AND INTERPRETATION
The p-value and a relevant q-value can provide influencing factors for
management. If the p-value is lower than the risk allowed for the proponent
of a single population, management is inclined toward the decision that two
populations exist. If a relevant q-value is higher than the risk that the
proponent of two populations is willing to take, management is also inclined
toward the decision that two relevant populations exist. On the other hand, a
high p-value or low relevant q-value inclines management toward the decision
that there is one population.
Management will quite often be influenced by factors other than the
p-value and a relevant q-value. A subjective decision-making process will
naturally be used to consider all factors. The extremity of the lowness or
highness of the p-value and a q-value provides the subjective weight for these
two factors.
If management cannot determine threshold risks to indicate two
populations when the p-value is below and a q-value is above these thresholds,
an alternate approach is to compare the p-value and a q-value. Management can
set a threshold ratio of Type II to Type I risks and compare a ratio of
q-value/p-value to this threshold. Two populations are then indicated if a
ratio of q-value/p-value is too high. In a subjective decision-making process
considering many factors, the extremity of a q-value/p-value ratio provides
the subjective weight of the Fisher's Exact Test factor.
Management should determine which two populations are relevant. Factors
other than the data may suggest specific populations. Management should
consider a q-value for each and every pair of relevant populations. If the
analyst is not provided with the pI and pII for any relevantly different
populations, the report to management should include a table of q-values for a
wide range of pI and pII.
For the primary example in this discussion, .34 is the p-value and .30 is
a q-value for the two populations suggested by the data. If these two
populations with pI = .9 and pII = .8 are relevantly different, the two risks of
.34 and .30 provide the basis for action. If the existence of these two
populations is considered as positive, .34 is the probability of making a
false positive decision. Similarly, considering the existence of only one
population as negative implies that .30 is the probability of making a false
negative decision.
If .34 and .30 are believed sufficiently low and high for probabilities
of false positives and negatives respectively, future action is based on the
existence of two populations with pI and pII estimated by .9 and .8. If
.34 and .30 are believed sufficiently high and low, future action is based on
a single population.
344
For this example, .30/.34 = 1/1.1 ≈ .9 is a ratio of q-value/p-value.
Subject to the relevancy of pI = .9 and pII = .8, 1/1.1 ≈ .9 is the ratio of
risks of making false negative and false positive decisions. Future action is
based on two populations if 1/1.1 ≈ .9 is believed sufficiently high.
Similarly, future action is based on a single population if 1/1.1 ≈ .9 is
believed sufficiently low.
If the p-value and a q-value provide conflicting or indeterminate
indications that are unresolvable, the immediate future action is to do
additional testing. Additional testing should provide more definitive
information by yielding either a low p-value and a high q-value or a high
p-value and a low q-value. Naturally, increasing the sample sizes may not
yield a proportional increase in all the data; but if additional testing
actually doubled all the data in this paper's example, the results would be
.18 for the p-value, .35 for a q-value, and .35/.18 ≈ 2 for a q-value/p-value
ratio corresponding to pI = .9 and pII = .8. This possible decrease in the
p-value, increase in a q-value, and increase in a ratio of q-value/p-value
would increase the tendency to base future actions on two populations.
COMPUTING METHODS AND RESULTS
Two related manipulations are useful in extending the range of data which
yields q-values without computer overflows or underflows. The equation for
the q-value can be rewritten as
    q-value = { Σ(r' = max(0,n+R-N) to r-1) C [pI(1-pII)/((1-pI)pII)]^r' / [r'! (n-r')! (R-r')! (N-n-R+r')!] } /
              { Σ(r' = max(0,n+R-N) to min(n,R)) C [pI(1-pII)/((1-pI)pII)]^r' / [r'! (n-r')! (R-r')! (N-n-R+r')!] }
where C is any constant. The computer program can assign C a value which
keeps the summed terms from exceeding the computer's working range. To make
this assignment without overflowing or underflowing the computer, each term
must be considered as
    C [pI(1-pII)/((1-pI)pII)]^r' / [r'! (n-r')! (R-r')! (N-n-R+r')!] = exp[ ln(TERMS) ]
345
where the expression ln(TERMS) in the exponential is

    ln(TERMS) = ln[C] + r'[ln(pI) + ln(1-pII) - ln(1-pI) - ln(pII)]
                - ln(r'!) - ln((n-r')!) - ln((R-r')!) - ln((N-n-R+r')!) .
The constant C can be selected to keep ln(TERMS) within the computer's range
for x in exp(x) (e.g., -176 to 176). For any value of r', this selection can
then be used to force the term exp[ln(TERMS)] into the computer's operating
range (e.g., 8.6x10^-78 to 1.15x10^77). Naturally this programming technique is
successful only if r' doesn't change too much in the summation between
max(0,n+R-N) and min(n,R).
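In effect, C shifts the whole computation into the logarithmic domain. A compact way to express the same technique (Python as a stand-in; lgamma supplies the ln factorials, and the names are hypothetical) is:

    from math import lgamma, log, exp

    def q_value_scaled(N, n, R, r, pI, pII):
        # ln(TERMS) without ln(C); requires 0 < pI, pII < 1
        lnf = lambda k: lgamma(k + 1)
        lnx = log(pI) + log(1 - pII) - log(1 - pI) - log(pII)
        lo, hi = max(0, n + R - N), min(n, R)
        logs = {t: t*lnx - lnf(t) - lnf(n - t) - lnf(R - t) - lnf(N - n - R + t)
                for t in range(lo, hi + 1)}
        lnC = -max(logs.values())      # the role of ln(C): keep exp() in range
        num = sum(exp(lnC + v) for t, v in logs.items() if t < r)
        den = sum(exp(lnC + v) for v in logs.values())
        return num / den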
The range of computer calculations for the p-value can be extended by
using logarithms. One useful form of the p-value equation is

    p-value = Σ(i = v to w) exp[ ln(nCx) + ln((N-n)Cy) - ln(NCz) ]

where the factors v, w, x, y, and z depend on the data as follows:

                If R < (N-R)                If R ≥ (N-R)
    Factor    If R < n    If R ≥ n    If (N-R) < (N-n)    If (N-R) ≥ (N-n)
    v         r           r           (N-n)-(R-r)         (N-n)-(R-r)
    w         R           n           N-R                 N-n
    x         i           i           N-R-i               N-R-i
    y         R-i         R-i         i                   i
    z         R           R           N-R                 N-R
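The same tabulated form can be sketched directly (Python as a stand-in; the branch structure follows one reading of the table above, which presumes a standardly ordered data table):

    from math import lgamma, exp

    def ln_comb(m, k):
        return lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)

    def p_value_log(N, n, R, r):
        if R < N - R:                          # sum over Category 1 counts
            v, w = r, min(n, R)
            x, y, z = (lambda i: i), (lambda i: R - i), R
        else:                                  # sum over Category 2 counts
            v, w = (N - n) - (R - r), min(N - R, N - n)
            x, y, z = (lambda i: N - R - i), (lambda i: i), N - R
        return sum(exp(ln_comb(n, x(i)) + ln_comb(N - n, y(i)) - ln_comb(N, z))
                   for i in range(v, w + 1))

    print(round(p_value_log(36, 21, 31, 19), 3))   # 0.337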
346
Some results from the outputs in these figures (and similar computer
executions) are compiled in the following table. All possible measurements of
r for N=36, n=21, and R=31 are included. The tabulated q-values are
referenced to θ and φ instead of pI and pII. This is necessary because the
computer places the data in a standardly ordered table, sometimes making pI=θ
and pII=φ and sometimes resulting in pI=φ and pII=θ.
347
The dependence of both the p-value and q-value on the number of
measurements is illustrated by the following example. Four measurements are
assumed to yield values of N, n, R, and r given by {N1,N2,N3,N4} =
{20,40,80,160}, {n1,n2,n3,n4} = {10,20,40,80}, {R1,R2,R3,R4} = {17,34,68,136},
and {r1,r2,r3,r4} = {9,18,36,72}. The second, third, and fourth measurements
are just multiples of the first. All four of these hypothetical measurements
provide point estimates of θ and φ of 9/10 = .9 and (17-9)/10 = .8 respectively.
The interior of the following table contains sets of q-values from four
executions of the computer program. Each set has the q-value for the smallest
sample size first and the largest last.
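A short loop over the four hypothetical data sets shows the effect (it assumes the p_value_log and q_value_scaled sketches above):

    # (N, n, R, r) = k * (20, 10, 17, 9); point estimates stay .9 and .8
    for k in (1, 2, 4, 8):
        N, n, R, r = 20*k, 10*k, 17*k, 9*k
        print(N, round(p_value_log(N, n, R, r), 3),
                 round(q_value_scaled(N, n, R, r, 0.9, 0.8), 3))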
348
The q-value for θ = 11/14 = .786 and φ = 107/121 = .884 is .375. That is
slightly higher than the p-value, but it might not be large enough to justify
not using the short fins. If management sets the desired requirement at
θ = .900 and decrees that φ = .850 is an unacceptable accuracy rate, the last
table in Figure 7b provides a basis for decision. The q-value for θ = .900 and
φ = .850 is .525. Since this is twice the p-value, management has a fairly
strong basis for not using the short fins. If management leaves the desired
requirement at .900 and raises the unacceptable level to .890, the q-value
increases to .706. The argument for rejecting the short fins is thus quite
strong if .890 is really an unacceptable accuracy rate.
SUMMARY
The p-value and q-value analysis of the Fisher-Irwin Exact Method has
been developed. The p-value equation has been derived using two techniques:
hypergeometric and binomial. The binomial technique has been extended to
yield a q-value equation. This equation has been derived from two sources:
possible Category 1 measurements and possible Category 2 measurements.
This q-value equation has been shown to possess mathematical symmetry. The
q-value for pI = pII has been shown to equal one minus the p-value; this was
predestined for the Fisher-Irwin Exact Method because it is a general property
of the p-value and q-values. A computer program has been written. This makes
the analysis practical. Analysts can perform voluminous calculations without
approximations. Managers can consider the relative sizes and importance of
the p-value and relevant q-values. Managers can decide if the two samples are
from one population or from two populations differing either (1) from the
combined point estimate of the population or (2) according to (A) a desired
population or standard and (B) an unacceptable population. Computer generated
reports have been provided for communication between analysts and managers.
The development of the p-value and q-value analysis of the Fisher-Irwin Exact
Method has reached the stage of implementation.
CONCLUSION
The analyst has a responsibility to report all information influencing
the decision. This information should be in a form that can be understood and
used by the decision-maker. Reporting the p-value and relevant q-values
satisfies both of these conditions. The p-value and q-values provide the
decision-maker with estimates of the risks of making wrong decisions. This
makes Fisher's Exact Test relevant.
349
Greetings! Welcome to a computerized Fisher-Irwin Exact Test Analysis.
Two independent samples are initially assumed to be from a single population.
This assumption is rejected and the two samples are considered to represent
two statistically different populations if management reaches two conclusions:
1) The p-value is deemed sufficiently low and
2) A q-value for relevantly different populations is deemed sufficiently high.
The p-value is the probability of falsely deciding that two populations exist.
A q-value for two relevantly different populations is the probability of
falsely deciding that those two populations are one population.
This computerized analysis does a one-sided test in the direction indicated by
the data. It requires four numerical inputs determining nine numbers:
350
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" s
ENTER SIZE OF POPULATION "N" 36
ENTER SIZE OF SAMPLE OF PROMINENCE "n" 21
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN POPULATION "R" 31
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN SAMPLE OF PROMINENCE "r" 19
351
ANALYSIS OF FISHER'S EXACT TEST
In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample I and Category 1.
                  Category 1:               Category 2:
    Sample I:     r = 19                    n - r = 2                   n = 21
    Sample II:    R - r = 12                (N - n) - (R - r) = 3       N - n = 15
                  R = 31                    N - R = 5                   N = 36
For this data, the post-test risk of a Type I error is p-value = 0.337.
For this data's two suggested binomial parameters of the category of interest
(i.e. θ = 19 / 21 = 0.905 and φ = 12 / 15 = 0.800), the post-test risk
of a Type II error is q-value = 0.297.
For other binomial parameters of the category of interest, q-values may be
estimated from the following table:
θ\φ    0.050 0.150 0.250 0.350 0.450 0.550 0.650 0.750 0.850 0.950
0.050 0.663 0.956 0.990 0.997 0.999 1.000 1.000 1.000 1.000 1.000
0.150 0.182 0.663 0.866 0.946 0.978 0.992 0.997 0.999 1.000 1.000
0.250 0.058 0.389 0.663 0.827 0.917 0.963 0.986 0.996 0.999 1.000
0.350 0.020 0.210 0.456 0.663 0.809 0.903 0.958 0.986 0.997 1.000
0.450 0.007 0.105 0.285 0.483 0.663 0.804 0.903 0.963 0.992 1.000
0.550 0.003 0.048 0.158 0.314 0.491 0.663 0.809 0.917 0.978 0.999
0.650 0.001 0.019 0.075 0.174 0.314 0.483 0.663 0.827 0.946 0.997
0.750 0.000 0.006 0.027 0.075 0.158 0.285 0.456 0.663 0.866 0.990
0.850 0.000 0.001 0.006 0.019 0.048 0.106 0.210 0.389 0.663 0.956
0.950 0.000 0.000 0.000 0.001 0.003 0.007 0.020 0.058 0.182 0.663
For binomial parameters of the category of interest near those indicated by
the data, q-values may be estimated from the following table:
θ\φ    0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.805 0.418 0.460 0.504 0.551 0.600 0.650 0.702 0.753 0.803 0.852 0.896
0.825 0.362 0.402 0.446 0.493 0.543 0.595 0.649 0.705 0.761 0.816 0.869
0.845 0.304 0.342 0.384 0.429 0.479 0.532 0.589 0.648 0.710 0.772 0.833
0.865 0.245 0.280 0.318 0.361 0.409 0.461 0.518 0.580 0.646 0.716 0.786
0.885 0.187 0.217 0.251 0.289 0.333 0.382 0.438 0.500 0.569 0.644 0.724
0.905 0.132 0.156 0.183 0.215 0.253 0.297 0.348 0.407 0.476 0.553 0.641
0.925 0.083 0.100 0.120 0.144 0.173 0.208 0.251 0.302 0.365 0.440 0.530
0.945 0.043 0.052 0.064 0.079 0.098 0.122 0.152 0.190 0.240 0.304 0.386
0.965 0.015 0.018 0.023 0.030 0.038 0.049 0.064 0.085 0.113 0.153 0.211
0.985 0.002 0.002 0.003 0.004 0.005 0.007 0.009 0.013 0.019 0.028 0.044
352
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:
θ\φ    0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.800 0.431 0.473 0.518 0.564 0.613 0.663 0.713 0.763 0.812 0.859 0.902
0.820 0.375 0.416 0.460 0.507 0.557 0.609 0.663 0.717 0.772 0.826 0.876
0.840 0.318 0.357 0.399 0.445 0.495 0.548 0.604 0.663 0.723 0.784 0.842
0.860 0.259 0.295 0.334 0.378 0.426 0.479 0.536 0.597 0.663 0.730 0.798
0.880 0.201 0.232 0.267 0.306 0.351 0.402 0.458 0.520 0.589 0.663 0.740
0.900 0.145 0.170 0.199 0.233 0.272 0.318 0.370 0.431 0.499 0.577 0.663
0.920 0.094 0.112 0.134 0.160 0.192 0.229 0.274 0.328 0.393 0.469 0.559
0.940 0.051 0.062 0.076 0.093 0.115 0.141 0.175 0.217 0.271 0.338 0.424
0.960 0.020 0.025 0.031 0.040 0.050 0.064 0.083 0.108 0.142 0.189 0.255
0.980 0.003 0.004 0.006 0.007 0.010 0.013 0.018 0.025 0.035 0.052 0.078
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:
θ\φ    0.750 0.770 0.790 0.810 0.830 0.850 0.870 0.890 0.910 0.930 0.950
0.770 0.617 0.663 0.708 0.753 0.797 0.839 0.879 0.914 0.945 0.969 0.987
0.790 0.568 0.615 0.663 0.711 0.759 0.807 0.852 0.893 0.930 0.960 0.982
0.810 0.513 0.561 0.611 0.663 0.715 0.767 0.819 0.867 0.911 0.948 0.976
0.830 0.453 0.501 0.553 0.606 0.663 0.720 0.777 0.834 0.886 0.932 0.968
0.850 0.389 0.436 0.487 0.542 0.601 0.663 0.726 0.790 0.853 0.909 0.956
0.870 0.321 0.365 0.415 0.469 0.529 0.593 0.663 0.735 0.808 0.877 0.937
0.890 0.251 0.290 0.336 0.387 0.445 0.511 0.583 0.663 0.746 0.831 0.909
0.910 0.181 0.213 0.252 0.297 0.351 0.413 0.486 0.569 0.663 0.764 0.865
0.930 0.114 0.138 0.168 0.204 0.248 0.302 0.368 0.449 0.547 0.663 0.791
0.950 0.058 0.072 0.090 0.113 0.143 0.182 0.233 0.301 0.391 0.510 0.663
0.970 0.018 0.023 0.030 0.040 0.053 0.071 0.098 0.137 0.196 0.288 0.434
θ\φ    0.600 0.620 0.640 0.660 0.680 0.700 0.720 0.740 0.760 0.780 0.800
0.900 0.064 0.076 0.090 0.105 0.124 0.145 0.170 0.199 0.233 0.272 0.318
0.920 0.039 0.046 0.055 0.066 0.079 0.094 0.112 0.134 0.160 0.192 0.229
0.940 0.019 0.023 0.028 0.035 0.042 0.051 0.062 0.076 0.093 0.115 0.141
0.960 0.007 0.008 0.010 0.013 0.016 0.020 0.025 0.031 0.040 0.050 0.064
0.980 0.001 0.001 0.002 0.002 0.003 0.003 0.004 0.006 0.007 0.010 0.013
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
353
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" r
ENTER "TEST ONE NUMBER OF SUCCESSES" 20
ENTER "TEST ONE NUMBER OF FAILURES" 1
ENTER "TEST TWO NUMBER OF SUCCESSES" 11
ENTER "TEST TWO NUMBER OF FAILURES" 4
ENTER "T" FOR TABLE OF Q-VALUES, "ANYTHING ELSE" TO SKIP TABLE skip
ENTER "01 FOR CLOSE LOOK AT Q.-VALUE TABLE IN DATA SUGGESTED REGION,
"ANYTHING ELSE" TO SKIP skip
ENTER "N" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE e m 20 / 21 w 0.952) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE 11 / 15 - 0.733),
"ANYTHING ELSE" TO SKIP m
ENTER "e" .9
ENTER 11
" .8
ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE e a 20 / 21 - 0.952) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE * =11 / 15 a 0.733),
"ANYTHING ELSE" TO SKIP m
ENTER "le" 1
ENTER "" .7
ENTER "I" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE e 0 20 / 21 m 0.952) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE * = 11 / 15 - 0.733),
"ANYTHING ELSE TO SKIP skip
END OF PROGRAM
354
ANALYSIS OF FISHER'S EXACT TEST
Although Test One, Test Two, Successes, and Failures may be interchanged
several ways mathematically, they have physical identities. To utilize these
identities, [A] Test One (i.e. the test with 20 Successes and 1 Failure) is
taken as the sample of prominence (i.e. it is considered physically more
important than Test Two) and [B] Successes define the category of interest
(i.e. the most natural description of a test result is considered to be
Success instead of Failure).
In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample I and Category 1.
                  Category 1:               Category 2:
    Sample I:     r = 20                    n - r = 1                   n = 21
    Sample II:    R - r = 11                (N - n) - (R - r) = 4       N - n = 15
                  R = 31                    N - R = 5                   N = 36
For this data, the post-test risk of a Type I error is p-value = 0.084.
For this data's two suggested binomial parameters of the category of interest
(i.e. θ = 20 / 21 = 0.952 and φ = 11 / 15 = 0.733), the post-test risk of
a Type II error is q-value = 0.241.
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:
θ\φ    0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.800 0.785 0.815 0.843 0.869 0.894 0.916 0.936 0.953 0.968 0.979 0.988
0.820 0.741 0.774 0.806 0.837 0.865 0.892 0.916 0.937 0.956 0.971 0.983
0.840 0.690 0.726 0.761 0.795 0.829 0.860 0.889 0.916 0.939 0.959 0.975
0.860 0.628 0.667 0.705 0.744 0.782 0.818 0.853 0.886 0.916 0.942 0.964
0.880 0.556 0.596 0.637 0.679 0.721 0.763 0.804 0.844 0.882 0.916 0.945
0.900 0.471 0.511 0.553 0.597 0.643 0.690 0.737 0.785 0.832 0.876 0.916
0.920 0.374 0.411 0.452 0.496 0.543 0.592 0.645 0.700 0.756 0.812 0.867
0.940 0.265 0.297 0.333 0.372 0.416 0.465 0.518 0.577 0.641 0.709 0.780
0.960 0.152 0.174 0.199 0.229 0.263 0.302 0.349 0.403 0.466 0.539 0.623
0.980 0.050 0.059 0.070 0.083 0.099 0.118 0.143 0.174 0.214 0.267 0.337
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:
θ\φ    0.600 0.620 0.640 0.660 0.680 0.700 0.720 0.740 0.760 0.780 0.800
0.900 0.302 0.332 0.363 0.397 0.433 0.471 0.511 0.553 0.597 0.643 0.690
0.920 0.225 0.250 0.277 0.307 0.339 0.374 0.411 0.452 0.496 0.543 0.592
0.940 0.149 0.168 0.188 0.211 0.237 0.265 0.297 0.333 0.372 0.416 0.465
0.960 0.079 0.090 0.102 0.117 0.133 0.152 0.174 0.199 0.229 0.263 0.302
0.980 0.024 0.027 0.032 0.037 0.043 0.050 0.059 0.070 0.083 0.099 0.118
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
355
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" S
ENTER SIZE OF POPULATION "N" 36
ENTER SIZE OF SAMPLE OF PROMINENCE "n" 21
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN POPULATION "R" 31
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN SAMPLE OF PROMINENCE "r" 18
ANALYSIS OF FISHER'S EXACT TEST
In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample I and Category 2.
                  Category 1:               Category 2:
    Sample I:     r = 3                     n - r = 18                  n = 21
    Sample II:    R - r = 2                 (N - n) - (R - r) = 13      N - n = 15
                  R = 5                     N - R = 31                  N = 36
For this data, the post-test risk of a Type I error is p-value = 0.663.
For this data's two suggested binomial parameters of the category of interest
(i.e. θ = 18 / 21 = 0.857 and φ = 13 / 15 = 0.867), the post-test risk
of a Type II error is q-value = 0.306.
ENTER "T" FOR TABLE OF Q-VALUES, "ANYTHING ELSE" TO SKIP TABLE skip
ENTER "C" FOR CLOSE LOOK AT Q-VALUE TABLE IN DATA SUGGESTED REGION9
"ANYTHING ELSE" TO SKIP skip
ENTER "If'FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE o a 18 / 21 a 0.857) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE * = 13 / 15 = 0.867),
"ANYTHING ELSE" TO SKIP m
ENTER "e" .8
ENTER "" .9
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:
θ\φ    0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.800 0.158 0.184 0.215 0.250 0.291 0.337 0.391 0.452 0.521 0.598 0.682
0.820 0.127 0.149 0.176 0.207 0.244 0.287 0.337 0.396 0.464 0.542 0.630
0.840 0.098 0.117 0.139 0.166 0.198 0.237 0.283 0.337 0.403 0.480 0.569
0.860 0.073 0.088 0.106 0.128 0.155 0.188 0.228 0.277 0.337 0.411 0.501
0.880 0.051 0.062 0.076 0.093 0.114 0.141 0.174 0.216 0.270 0.337 0.423
0.900 0.033 0.041 0.050 0.063 0.078 0.098 0.124 0.158 0.202 0.260 0.337
0.920 0.019 0.024 0.030 0.037 0.048 0.061 0.079 0.103 0.136 0.182 0.246
0.940 0.009 0.011 0.014 0.019 0.024 0.032 0.042 0.057 0.077 0.108 0.154
0.960 0.003 0.004 0.005 0.007 0.009 0.012 0.016 0.022 0.032 0.047 0.071
0.980 0.000 0.001 0.001 0.001 0.001 0.002 0.003 0.004 0.006 0.009 0.015
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Figure 6. Input and screen output in a case that alters the order in the table.
356
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" r
Although Test One, Test Two, Successes, and Failures may be interchanged
several ways mathematically, they have physical identities. To utilize these
identities, [A] Test One (i.e. the test with 11 Successes and 3 Failures)
is taken as the sample of prominence (i.e. it is considered physically more
important than Test Two) and [B] Successes define the category of interest
(i.e. the most natural description of a test result is considered to be
Success instead of Failure).
357
ENTER "C" FOR CLOSE LOOK AT Q-VALUETABLE IN DATA SUGGESTED REGION,
"ANYTHING ELSE" TO SKIP c
For binomial parameters of the category of interest near those indicated by
the data, q-values may be estimated from the following table:
θ\φ    0.686 0.706 0.726 0.746 0.766 0.786 0.806 0.826 0.846 0.866 0.886
0.784 0.499 0.550 0.603 0.655 0.706 0.755 0.801 0.844 0.882 0.916 0.943
0.804 0.430 0.483 0.537 0.593 0.648 0.702 0.755 0.805 0.850 0.891 0.926
0.824 0.358 0.409 0.464 0.521 0.580 0.639 0.698 0.755 0.809 0.859 0.902
0.844 0.283 0.331 0.384 0.441 0.501 0.564 0.628 0.693 0.756 0.815 0.869
0.864 0.208 0.251 0.299 0.353 0.412 0.476 0.544 0.614 0.686 0.756 0.823
0.884 0.138 0.173 0.213 0.260 0.314 0.375 0.443 0.517 0.595 0.676 0.757
0.904 0.079 0.103 0.132 0.168 0.212 0.265 0.327 0.399 0.480 0.569 0.663
0.924 0.035 0.048 0.065 0.088 0.118 0.156 0.205 0.265 0.339 0.427 0.530
0.944 0.010 0.014 0.021 0.031 0.045 0.065 0.093 0.131 0.184 0.255 0.349
0.964 0.001 0.002 0.003 0.005 0.007 0.012 0.020 0.033 0.053 0.087 0.140
0.984 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.002 0.003 0.008
ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE - 11 / 14 a 0.786) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE e , 107 / 121 - 0.884),
"ANYTHING ELSE" TO SKIP m
θ\φ    0.750 0.770 0.790 0.810 0.830 0.850 0.870 0.890 0.910 0.930 0.950
0.800 0.618 0.673 0.726 0.776 0.824 0.867 0.905 0.937 0.962 0.980 0.992
0.820 0.550 0.608 0.666 0.724 0.778 0.830 0.876 0.916 0.948 0.973 0.989
0.840 0.471 0.532 0.595 0.658 0.721 0.781 0.837 0.887 0.929 0.962 0.984
0.860 0.384 0.445 0.510 0.578 0.648 0.717 0.785 0.847 0.901 0.945 0.976
0.880 0.291 0.348 0.411 0.481 0.556 0.634 0.712 0.789 0.860 0.919 0.964
0.900 0.197 0.245 0.302 0.368 0.442 0.525 0.614 0.706 0.796 0.877 0.942
0.920 0.110 0.145 0.189 0.243 0.310 0.389 0.482 0.585 0.696 0.805 0.902
0.940 0.044 0.063 0.088 0.123 0.170 0.232 0.313 0.415 0.539 0.679 0.822
0.960 0.009 0.014 0.022 0.034 0.054 0.084 0.131 0.201 0.306 0.454 0.646
0.980 0.000 0.000 0.001 0.002 0.003 0.006 0.013 0.026 0.056 0.122 0.265
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
END OF PROGRAM
358
ATTENDANCE LIST
FOR THE
TUTORIAL ON SENSITIVITY ANALYSIS IN LINEAR REGRESSION
17-18 OCTOBER 1988
AND THE
THIRTY-FOURTH CONFERENCE ON THE DESIGN OF EXPERIMENTS IN ARMY
RESEARCH, DEVELOPMENT AND TESTING
19-21 OCTOBER 1988

TUTORIAL (T)
CONFERENCE (C)
C  LAUNER, Robert L.
Department of Statistics
University of South Carolina
Columbia, SC 29208 (803) 777-7800
C  THRASHER, Paul H.
Reliability Division
TE-RE
White Sands Missile Range
New Mexico 88002 (505) 678-6177
T,C N tL,
IZvi J.
Div. 6415
Sandia National Labs
PO Box 5800
Albuquerque, New Mexico 87185 (505) 844-4208
C, TE, DeorIis
STEYP-MT-TA-A
Yuma Proving Ground (AV) 899-3251
C  CONOVER, W.J.
College of Bus. Admin
Texas Tech University
Lubbock, TX 79409 (806) 742-1546
l2";59
C  DAVID, H.A.
Iowa State University
Department of Statistics
ISU, Ames, IA 50011 (515) 294-7749
C  HOCKING, Ron
Texas A&M
Department of Statistics
College Station, TX 77843 (409) 845-3151
C  ZACKS, S.
Binghamton Center
Department of Math Sciences
SUNY, Binghamton NY 13901 (607) 777-2619
C  CHANDRA, Jagdish
U.S. Army Research Office
Mathematical Sciences Division
P.O. Box 12211
Research Triangle Park
NC 27709 (919) 549-0641
C  ESSENWANGER, Dr. Oskar M.
Research Directorate
U.S. Army Missile Command
ATTN: AMSMI-RD-RE-AP
Redstone Arsenal, AL 35898-5248 (AV) 876-4872
360
T,C ABRAHIM, YMhd
Mathematical Department
New Mexico State University
T HVICK, Chris
Mathematical Department
New Mexico State University
T DALE, Richard H.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002
T GADNEY, George
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002
T,C  McLAUGHLIN, Dale R.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002
T ZEBR, Ronald
AMXCM-CW-WS
White Sands Missile Range, NM 88002
361
T,C  BURGE, J. Robert
Walter Reed Army Institute of Research
Washington, DC 20307 (202) 576-3151
T,C  TANG, Douglas
Walter Reed Army Institute of Research
Washington, DC 20307-5100 (202) 576-7212
TC GPWM0P Gavin
UTEP
Math Department
UTEP, El Paso, TX (915) 747-5761
C  WOODS, Anthony K.
USATROSCOM, Systems Analysis Office
Modeling & Techniques Div.
4300 Goodfellow Blvd.
St. Louis, MO 63120-1798 (314) 263-2926
TC BODT, Barry A.
362
T,C WEBB, David W.
Ballistic Research Laboratory
ATTN: SLCBR-SE-P
Aberdeen Proving Ground
MD 21005-5066 (301) 298-6646
T QUINZI, Tony
TRAC
White Sands Missile Range
NM 88002 (505) 678-4356
C  LEHNIGK, Siegfried H.
U.S. Army Missile Command
ATTN: AMSMI-RD-RE-OP/S. H. Lehnigk
Redstone Arsenal, AL 35898-5248 (205) 876-3526
363
T,C Sw= I Donald X.
New Mexico Research Institute
Box 160
Las Cruces, NM 88004 (505) 522-5197
C  BATES, Carl B.
U.S. Army Concepts Analysis Agency
8120 Woodmont Ave
Bethesda, MD 20814-2797 (AV) 295-0163
C  IGLEHART, Donald L.
Stanford University
Department of Operations Research
Stanford, CA 94305-4022 (415) 723-0850
C SOWOKY, P.
Counsellor, Defence R&D,
Canadian Embassy
2450 Massachusetts Ave NW
Washington, DC 20008 (202) 483-5505
T,C  BISSINGER, Barney
Penn State University
Hershey Foods
281 W. Main St
Middletown, PA 17057 (717) 944-0649
364
C  GONZALEZ, Ramiro
TRAC
White Sands Missile Range, NM 88001
C  SHUSTER, Eugene
Mathematical Department
University of Texas - El Paso
El Paso, Texas
C ROJO, Javier
Mathematical Department
University of Texas - El Paso
El Paso, Texas
C  CNG, Cheng
Mathematical Department
University of Texas - El Paso
El Paso, Texas
C LIU, Yan
Mathematical Department
University of Texas - El Paso
El Paso, Texas
C  POLLA, Charles
RAM Division
White Sands Missile Range, New Mexico 88002
C  KAIGH, Bill
Mathematical Department
University of Texas - El Paso
El Paso, Texas
C PARZEN, Emanuel
Department of Statistics
Texas A&M University
College Station, Texas 77843
365
C HAMM, Joe
U.S. Army Airborne Board
Fort Bragg, NC 28307 (AV)236-5115
TC CASTILLO, Cesar
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002
T CATRSON, Janet
White Sands Missile Range, NM 88002
TC COHEN, Herb
AMSAA
Aberdeen Proving Ground, MD 21005
WANG, Phillip
TRAC
White Sands Missile Range, NM 88001
T ACKER, Clay D.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002
C  COX, Paul
2930 Huntington Drive
Las Cruces, New Mexico 88001
T,C  CULPEPPER, Gideon
Las Cruces, NM
C  ANDERSEN, Gerald
366
C  VANGEL, Mark
U.S. Army Materials Technology Lab
Watertown Arsenal
C  WEST, Larry
TECOM Headquarters
Aberdeen Proving Ground, MD 21005-5066
C  WEGMAN, Edward
George Mason University
367
UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE (excerpt)
7b. ADDRESS (City, State, and ZIP Code): P.O. Box 12211, Research Triangle Park, NC 27709
8a. NAME OF FUNDING/SPONSORING ORGANIZATION: AMSC on behalf of ASO (ADA)
8b. OFFICE SYMBOL   9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER
8c. ADDRESS (City, State, and ZIP Code)   10. SOURCE OF FUNDING NUMBERS (PROGRAM NO., PROJECT NO., TASK NO., WORK UNIT ACCESSION NO.)
11. TITLE (Include Security Classification): Proceedings of the Thirty-Fourth Conference on the Design of Experiments in Army Research, Development and Testing
12. PERSONAL AUTHOR(S)
13a. TYPE OF REPORT: Technical   13b. TIME COVERED   14. DATE OF REPORT (Year, Month, Day): 1989 July   15. PAGE COUNT: 367
16. SUPPLEMENTARY NOTATION
17. COSATI CODES (FIELD, GROUP, SUB-GROUP)   18. SUBJECT TERMS (Continue on reverse if necessary and identify by block number)