
ARO Report 89-2

PROCEEDINGS OF THE THIRTY-FOURTH


CONFERENCE ON THE DESIGN OF
CN EXPERIMENTS IN ARMY RESEARCH
DEVELOPMENT AND TESTING

This document contains blank pages that were not filmed.

Approved for public release; distribution unlimited.


The findings in this report are not to be construed as an
official Department of the Army position, unless so
designated by other authorized documents.

Sponsored by
The Army Mathematics Steering Committee

on Behalf of

THE CHIEF OF RESEARCH, DEVELOPMENT AND ACQUISITION
U.S. Army Research Office

Report No. 89-2


July 1989

PROCEEDINGS OF THE THIRTY-FOURTH CONFERENCE

ON THE DESIGN OF EXPERIMENTS

Sponsored by the Army Mathematics Steering Committee

HOST

U.S. Army White Sands Missile Range

White Sands Missile Range, New Mexico

HELD AT

New Mexico State University


Las Cruces, New Mexico
19-21 October 1988

Approved for public release; distribution unlimited.


The findings in this report are not to be construed as
an official Department of the Army position, unless so
designated by other authorized documents.

U.S. Army Research Office


P.O. Box 12211
Research Triangle Park, North Carolina
FOREWORD

The Thirty-Fourth Conference on the Design of Experiments in Army Research,


Development and Testing was held on 19-21 October 1988 in the auditorium of the
Physical Sciences Laboratory on the campus of New Mexico State University, Las
Cruces, New Mexico. Mr. John Lockert, Director of the White Sands Missile
Range, stated his installation would serve as the host for this meeting. He
selected Mr. William Agee to act as the chairperson for local arrangements.
The attendees appreciated the quiet and efficient manner in which this
gentleman handled the many tasks associated with this event. He is also to be
commended for his planning arrangements for a tutorial which was scheduled to
be held two days before the start of this conference.

The original format for the Design of Experiments Conferences, which are under
the auspices of the Army Mathematics Steering Committee (AMSC), was outlined by
the eminent statistician, Professor Samuel S. Wilks, who served as conference
chairman until his death. Through these symposia the AMSC hopes to introduce
and encourage the use of the latest statistical and design techniques into the
research, development and testing conducted by the Army's scientific and
engineering personnel. It is believed that this purpose can be best pursued by
holding these meetings at various government installations throughout the
country.

Members of the program committee were pleased to obtain the services of the
following distinguished scientists to speak on topics of interest to Army
personnel:
Speaker and Affiliation Title of Address

Professor Herbert A. David          Some Applications of Order
Iowa State University               Statistics

Professor Ronald R. Hocking         Diagnostic Methods - Variance
Texas A&M University                Component Estimation

Professors Donald L. Iglehart       Computational and Statistical
and Peter W. Glynn                  Issues in Discrete-Event
Stanford University                 Simulation

Professor Emanuel Parzen            Two Sample Functional
Texas A&M University                Statistical Analysis

Professor Edward J. Wegman          Parallel Coordinate
George Mason University             Density Plots
Four days before the start of the planned two-day tutorial on "Topics in Modern
Regression Analysis", its speaker advised Mr. Agee he could not give his
planned lectures. Fortunately, Professor Ali Hadi of Cornell University was
able, so to speak, to save the day. The attendees were very pleased with Dr.
Hadi's interesting and informative tutorial on "Sensitivity Analysis in Linear
Regressi on".

Dr. Marion R. Bryson, Director of the U.S. Army Combat Development
Experimentation Center, was the recipient of the eighth Wilks Award for
Contributions to Statistical Methodologies in Army Research, Development and
Testing. This honor was bestowed on Dr. Bryson for his many significant
contributions to the field of statistics. These started by providing
statistical consulting while he was on the faculty of Duke University. This
era was followed by full-time work devoted to directing analytical studies for
the Army. Since then, he has provided overall technical direction to the
Army's most modern field test facility. His published works include papers on
a wide range of topics of importance to the Army, including methods for scoring
casualties, designing field experiments, and inventory control problems.

The AMSC has asked that these proceedings be distributed Army-wide to enable
those who could not attend this conference, as well as those who were present,
to profit from some of the scientific ideas presented by the speakers. The
members of the AMSC are taking this opportunity to thank all the speakers for
their interesting presentations and also members of the program committee for
their many contributions to this scientific event.
PROGRAM COMMITTEE

Carl Bates Robert Burge Francis Dressel


Eugene Dutoit Hugh McCoy Carl Russell
Doug Tang Malcolm Taylor Jerry Thomas
Henry Tingey

TABLE OF CONTENTS*

Title                                                                Page

Foreword ............................................................ iii
Table of Contents ..................................................... v
Program ............................................................. vii

SOME APPLICATIONS OF ORDER STATISTICS
     H. A. David ...................................................... 1

MULTI-SAMPLE FUNCTIONAL STATISTICAL DATA ANALYSIS
     Emanuel Parzen .................................................. 15

RELIABILITY OF THE M256 CHEMICAL DETECTION KIT
     David W. Webb and Linda L.C. Moss ............................... 27

COMPARISON OF RELIABILITY CONFIDENCE INTERVALS
     Paul H. Thrasher ................................................ 33

ENVIRONMENTAL SAMPLING: A CASE STUDY
     Dennis L. Brandon ............................................... 73

A GENERALIZED GUMBEL DISTRIBUTION
     Siegfried H. Lehnigk ............................................ 77

A GENERALIZATION OF THE EULERIAN NUMBERS WITH A PROBABILISTIC
APPLICATION
     Bernard Harris .................................................. 79

THE ANALYSIS OF MULTIVARIATE QUALITATIVE DATA USING AN ORDERED
CATEGORICAL APPROACH
     H. B. Tingey, E. A. Morgenthein, and S. M. Free ................. 99

A SMALL SAMPLE POWER STUDY OF THE ANDERSON-DARLING STATISTIC AND A
COMPARISON WITH THE KOLMOGOROV AND THE CRAMER-VON MISES STATISTICS
     Linda L.C. Moss, Malcolm S. Taylor, and Henry B. Tingey ........ 111

NONPARE, A CONSULTATION SYSTEM FOR ANALYSIS OF DATA
     J. C. Dumer, III, T. P. Hanratty, and M. S. Taylor ............. 173

*This Table of Contents contains only the papers that are published
in this technical manual. For a list of all papers presented at the
Thirty-Fourth Conference on the Design of Experiments, see the Program of
this meeting.
TABLE OF CONTENTS (continued)

Title                                                                Page

NUMERICAL ESTIMATION OF GUMBEL DISTRIBUTION PARAMETERS
     Charles E. Hall, Jr. ........................................... 185

EXPERIMENTAL DESIGN AND OPTIMIZATION OF BLACK CHROME SOLAR
SELECTIVE COATINGS
     I. J. Hall and R. B. Pettit

DETERMINATION OF DETECTION RANGE OF MONOTONE AND CAMOUFLAGE
PATTERNED FIVE-SOLDIER CREW TENTS BY GROUND OBSERVERS
     George Anitole and Ronald L. Johnson ........................... 189

AN EXAMPLE OF CHAIN SAMPLING AS USED IN ACCEPTANCE TESTING
     Jerry Thomas, Robert L. Umholtz, and William E. Baker .......... 201

SOME NOTES ON VARIABLE SELECTION CRITERIA FOR REGRESSION MODELS
(AN OVERVIEW)
     Eugene F. Dutoit ............................................... 219

TWO-STAGE TESTING OF COMBAT VEHICLE TIRE SYSTEMS
     Barry A. Bodt

PARALLEL COORDINATE DENSITIES
     Edward J. Wegman ............................................... 247

COMPUTATIONAL AND STATISTICAL ISSUES IN DISCRETE-EVENT SIMULATION
     Peter W. Glynn and Donald L. Iglehart .......................... 265

BAYESIAN INFERENCE FOR WEIBULL QUANTILES
     Mark G. Vangel ................................................. 281

MAKING FISHER'S EXACT TEST RELEVANT
     Paul H. Thrasher ............................................... 331

LIST OF REGISTERED ATTENDEES ........................................ 359
AGENDA

THE THIRTY-FOURTH CONFERENCE ON THE DESIGN OF EXPERIMENTS

IN ARMY RESEARCH, DEVELOPMENT AND TESTING

19-21 October 1988

Host: White Sands Missile Range

Location: Physical Sciences Laboratory
          New Mexico State University
          Las Cruces, New Mexico

xxxxx Wednesday, 19 October xxxxx

0815 - 0915 REGISTRATION

0915 - 0930 CALLING THE CONFERENCE TO ORDER

WELCOMING REMARKS:

0930 - 1200 GENERAL SESSION I (Auditorium)

Chairman:

0930 - 1030 KEYNOTE ADDRESS:

SOME APPLICATIONS OF ORDER STATISTICS

H. A. David, Iowa State University

1030 - 1100 BREAK

1100 - 1200 TWO SAMPLE FUNCTIONAL STATISTICAL ANALYSIS


Emanuel Parzen, Texas A&M University

1200 - 1330 LUNCH

1330 - 1500 - CLINICAL SESSION A

Chairperson: Carl Bates, U.S. Army Concepts Analysis Agency

Panelists: Bernard Harris, University of Wisconsin-Madison


Robert Launer, University of South Carolina
Emanuel Parzen, Texas A&M University

RELIABILITY OF THE M256 CHEMICAL DETECTION KIT

David W. Webb and Linda L.C. Moss, U.S. Army


Ballistic Research Laboratory

COMPARISON OF RELIABILITY CONFIDENCE INTERVALS

Paul H. Thrasher, White Sands Missile Range

1500 - 1530 BREAK


1530 - 1710 COMBINED CLINICAL AND TECHNICAL SESSION

Chairperson: Carl Russell, U.S. Army Operational Test and


Evaluation Agency

Panelists: Bernard Harris, University of Wisconsin-Madison


Ronald Hocking, Texas A&M University
Henry Tingey, University of Delaware

ENVIRONMENTAL SAMPLING: A CASE STUDY

Dennis L. Brandon, U.S. Army Engineer Waterways


Experiment Station

THE GENERALIZED GUMBEL DISTRIBUTION


Siegfried H. Lehnigk, U.S. Army Missile Command

EULER NUMBERS, EULER-FROBENIUS POLYNOMIALS AND PROBABILITY

Bernard Harris, University of Wisconsin-Madison

xxxxx Thursday, 20 October xxxxx

0800 REGISTRATION
0815 - 1000 TECHNICAL SESSION 1

Chairperson: Oskar M. Essenwanger, U.S. Army Missile


Command
REAP - A RADAR ERROR ANALYSIS PROGRAM
William S. Agee and Andrew C. Ellingson, White Sands
Missile Range
A METHOD FOR ANALYZING MULTIVARIATE QUALITATIVE DATA USING
AN ORDERED CATEGORICAL APPROACH
H.B. Tingey, E. A. Morgenthien, and S. M. Free,
University of Delaware

A SMALL SAMPLE POWER STUDY OF THE ANDERSON-DARLING STATISTIC

Linda L.C. Moss, Malcolm S. Taylor, U.S. Army Ballistic


Research Laboratory, and Henry B. Tingey, University of
Delaware

NONPARE, A CONSULTATION SYSTEM FOR NONPARAMETRIC ANALYSIS


OF DATA
Malcolm S. Taylor, John C. Dumer, III and Timothy P.
Hanratty, U.S. Army Ballistic Research Laboratory

1000 - 1030 BREAK

1030 - 1200 TECHNICAL SESSION 2

Chairperson: Linda L.C. Moss, U.S. Army Ballistic Research


Laboratory

NUMERICAL ESTIMATION OF DISTRIBUTION PARAMETERS

Charles E. Hall, Jr., U.S. Army Missile Command

EXPERIMENTAL DESIGN AND OPTIMIZATION OF BLACK CHROME SOLAR
SELECTIVE COATINGS
I. J. Hall and R. B. Pettit, Sandia National Laboratory

MULTI-OBSERVER MULTI-TARGET VISIBILITY PROBABILITIES FOR


POISSON SHADOWING PROCESSES IN THE PLANE

M. Yadin and S. Zacks, State University of New York at


Binghamton

1200 - 1330 LUNCH

1330 - 1500 APPLICATION SESSION

Chairperson: John Robert Burge, Walter Reed Army Institute


of Research

AN EXAMPLE OF CHAIN SAMPLING AS USED IN ACCEPTANCE TESTING


Robert L. Umholtz, Jerry Thomas and William E. Baker,
U.S. Army Ballistic Research Laboratory

SOME NOTES ON MODEL SELECTION CRITERIA


Eugene Dutoit, U.S. Army Infantry School

TWO-STAGE TESTING OF COMBAT VEHICLE TIRE SYSTEMS


Barry A. Bodt, U.S. Army Ballistic Research Laboratory

1500 - 1530 BREAK


1530 - 1730 GENERAL SESSION II

Chairperson: Gerald Andersen, U.S. Army Research,


Development and Standardization Group (UK)

PARALLEL COORDINATE DENSITY PLOTS

Edward J. Wegman, George Mason University

COMPUTATIONAL AND STATISTICAL ISSUES IN DISCRETE-EVENT


SIMULATION
Donald L. Iglehart and Peter W. Glynn, Stanford
University

1830 - 1930 CASH BAR

1930 - 2130 BANQUET AND PRESENTATION OF WILKS AWARD

xxxxx Friday, 21 October xxxxx

0800 REGISTRATION

0815 - 0930 TECHNICAL SESSION 3

Chairperson: Barry A. Bodt, U.S. Army Ballistic Research


Laboratory

BAYESIAN INFERENCE FOR WEIBULL QUANTILES

Mark Vangel, U.S. Army Materials Technology Laboratory

A QUALITY ASSURANCE TEST BASED ON P(y<x) CRITERIA

Donald Neal, Trevor Rudalevige and Mark Vangel,


U.S. Army Materials Technology Laboratory

MAKING FISHER'S EXACT TEST RELEVANT

Paul H. Thrasher, White Sands Missile Range

0930 - 1000 BREAK

1000 - 1130 GENERAL SESSION III


Chairperson: Douglas B. Tang, Walter Reed Army Institute
of Research; Chairman of the AMSC Subcommittee
on Probability and Statistics

OPEN MEETING OF THE STATISTICS AND PROBABILITY SUBCOMMITTEE


OF THE ARMY MATHEMATICS STEERING COMMITTEE

DIAGNOSTIC METHODS - VARIANCE COMPONENT ESTIMATION

Ronald R. Hocking, Texas A&M University

ADJOURN

PROGRAM COMMITTEE

Carl Bates Robert Burge Francis Dressel


Eugene Dutoit Hugh McCoy Carl Russell
Doug Tang Malcolm Taylor Jerry Thomas
Henry Tingey

SOME APPLICATIONS OF ORDER STATISTICS*

H. A. David
Department of Statistics
102D Snedecor Hall
Iowa State University
Ames, IA 50011-1210

ABSTRACT. Suppose that the random variables $X_1,\ldots,X_n$ are arranged in
ascending order as $X_{1:n} \le \cdots \le X_{n:n}$. Then $X_{r:n}$ is called the r-th order
statistic $(r = 1,\ldots,n)$. Order statistics, and functions thereof, have been
used extensively in such diverse areas as quality control, the estimation of
parameters, life testing, data compression, selection procedures, and the
study of extreme meteorological phenomena. In this paper we focus on
applications of order statistics to (a) estimators that are resistant to
outliers, (b) current measures of location and dispersion such as the moving
median and the moving range, and (c) some problems in reliability.

1. INTRODUCTION. If the random variables $X_1,\ldots,X_n$ are arranged in
ascending order of magnitude and then written as
$$X_{1:n} \le \cdots \le X_{r:n} \le \cdots \le X_{n:n},$$
we call $X_{r:n}$ the r-th order statistic (OS) $(r = 1,\ldots,n)$. Usually $X_1,\ldots,X_n$
are assumed to be a random sample from some underlying population.

The subject of order statistics deals with the properties and applications
of these ordered random variables and of functions involving them.
Examples are the extremes $X_{1:n}$ and $X_{n:n}$, the range $W_n = X_{n:n} - X_{1:n}$, the
extreme deviate (from the sample mean) $X_{n:n} - \bar{X}$, and the maximum absolute
deviation from the median (MAD) $\max_i |X_i - M|$, where the median $M$ equals
$X_{\frac{1}{2}(n+1):n}$ ($n$ odd) and $\frac{1}{2}(X_{\frac{1}{2}n:n} + X_{(\frac{1}{2}n+1):n})$ ($n$ even).

All these statistics have important applications. The extremes arise in
the statistical study of droughts and floods, as well as in problems of
breaking strength and fatigue failure. The range is well known to provide a
quick estimator of the population standard deviation $\sigma$, whereas MAD is a more
recent estimator of $\sigma$ valuable because of its high resistance to wild
observations (outliers). The extreme deviate is a basic tool in the detection
of such outliers, large values of $(X_{n:n} - \bar{X})/\sigma$ indicating the presence of an

*Keynote Address, 34th Conference on the Design of Experiments in Army


Research, Development and Testing, New Mexico State University, Las Cruces,
October 19, 1988. Prepared with support from the U. S. Army Research Office.
standardized bias and mean squared error (MSE) of $L_n(a)$ under (2.1) can be
obtained with the help of tables of the first two moments of normal order
statistics in the presence of an outlier (David, Kennedy, and Knight, 1977).

For example, under (2.1) the standardized bias $b_n(\lambda)$ of $L_n(a)$ is given by
$$b_n(\lambda) = E\{L_n(a) - \mu\}/\sigma$$
or
$$b_n(\lambda) = \sum_{i=1}^{n} a_i\, \alpha_{i:n}(\lambda), \qquad (2.3)$$
where $\alpha_{i:n}(\lambda)$ is the expected value of $X_{i:n}$ for $\mu = 0$, $\sigma = 1$. Note
that $\alpha_{i:n}(0)$ is just the widely tabulated expected value $\alpha_{i:n}$ of the i-th OS, $Z_{i:n}$,
in random samples of $n$ from a standard normal population. Clearly,
$\alpha_{i:n}(\lambda)$ is a strictly increasing function of $\lambda$. Also, since for $\lambda = \infty$, (2.1)
leaves us with a random normal sample of size $n-1$ plus an observation at $\infty$,
we have
$$\alpha_{i:n}(\infty) = \alpha_{i:n-1}, \quad i = 1,\ldots,n-1; \qquad \alpha_{n:n}(\infty) = \infty \qquad (2.4)$$
(and likewise $\alpha_{i:n}(-\infty) = \alpha_{i-1:n-1}$, $i = 2,\ldots,n$; $\alpha_{1:n}(-\infty) = -\infty$).

Some results for samples of 10 are shown in Figures 1 and 2, where

$\bar{X}_{10}$ is the sample mean $\frac{1}{10}\sum_{i=1}^{10} X_{i:10}$,

$T_{10}(1)$ is the trimmed mean $\frac{1}{8}\sum_{i=2}^{9} X_{i:10}$,

$W_{10}(2)$ is the Winsorized mean $\frac{1}{10}\left(2X_{3:10} + \sum_{i=3}^{8} X_{i:10} + 2X_{8:10}\right)$,

$T_{10}(4)$ is the median $\frac{1}{2}(X_{5:10} + X_{6:10})$, etc.

The figures are confined to $\lambda \ge 0$ since results for $\lambda < 0$ follow by skew-
symmetry in Fig. 1 and by symmetry in Fig. 2.
BIAS $b_n(\lambda)$. Since $a_i \ge 0$ we see from (2.3) that the bias is a strictly
increasing function of $\lambda$ for each of the estimators, and from (2.4) that
$$b_n(\infty) = \sum_{i=1}^{n-1} a_i\, \alpha_{i:n-1}. \qquad (2.5)$$
This gives the numerical values placed on the right of Fig. 1. The jagged
graphs are the corresponding "stylized sensitivity curves" (Tukey, 1970;
Andrews et al., 1972) obtained by plotting $L_n(\alpha_{1:n-1},\ldots,\alpha_{n-1:n-1},\lambda)$
against $\lambda$. In particular, for the median we have
[Figures 1 and 2: standardized bias and MSE of the estimators for samples of 10; not legible in this scan.]
$$\mathrm{med}(\alpha_{1:9},\ldots,\alpha_{9:9},\lambda) = \tfrac{1}{2}(\alpha_{5:9} + \lambda) = \tfrac{1}{2}\lambda, \qquad 0 \le \lambda < \alpha_{6:9},$$
$$\phantom{\mathrm{med}(\alpha_{1:9},\ldots,\alpha_{9:9},\lambda)} = \tfrac{1}{2}(\alpha_{5:9} + \alpha_{6:9}) = \tfrac{1}{2}\,\alpha_{6:9} = 0.1372, \qquad \lambda \ge \alpha_{6:9}.$$
The last result is the same as given by (2.5). In fact, each of the horizon-
tal lines serves as an asymptote to the corresponding bias function. It is
seen that the median performs uniformly best.
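
The numbers above are easy to check numerically. The following sketch (our own illustration, not part of the original paper) computes the normal order-statistic means $\alpha_{i:9}$ by quadrature and evaluates the stylized sensitivity curve of the median; it reproduces the asymptote $\frac{1}{2}\alpha_{6:9} \approx 0.1372$.

```python
import numpy as np
from math import comb
from scipy.stats import norm
from scipy.integrate import quad

def normal_os_mean(i, n):
    # E[X_{i:n}] for a standard normal sample:
    # n * C(n-1, i-1) * int x Phi(x)^(i-1) (1 - Phi(x))^(n-i) phi(x) dx
    f = lambda x: x * norm.cdf(x)**(i - 1) * norm.sf(x)**(n - i) * norm.pdf(x)
    return n * comb(n - 1, i - 1) * quad(f, -np.inf, np.inf)[0]

alpha = [normal_os_mean(i, 9) for i in range(1, 10)]   # alpha_{1:9}, ..., alpha_{9:9}

def median_sensitivity(lam):
    # stylized sensitivity curve: median of the nine alpha's plus the outlier lam
    return np.median(np.append(alpha, lam))

print(round(alpha[5], 4))                      # alpha_{6:9} ~ 0.2745
print(round(median_sensitivity(np.inf), 4))    # asymptote: 0.5*alpha_{6:9} ~ 0.1372
print(round(median_sensitivity(0.1), 4))       # 0.5*lam = 0.05 for 0 <= lam < alpha_{6:9}
```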
MEAN SQUARED ERROR MSE($\lambda$). No clear-cut results emerge. The sample mean
does best for $\lambda < 1.5$ but is quickly outclassed for larger $\lambda$. Overall, $T_{10}(1)$
performs best, although the more highly trimmed $T_{10}(2)$ is slightly superior
for very large $\lambda$.

EXTENSIONS
1. The intuitively appealing result that for symmetric unimodal distributions
   the median is the least biased among the $L_n$-estimators can be formally
   established under (2.1) and also for a class of symmetric non-normal
   distributions (David and Ghosh, 1985).
2. For $n \le 20$ appropriately trimmed means still do well in the MSE sense when
   compared with much more complex estimators, but for $\lambda$ sufficiently large
   and $n$ not too small are inferior to the best of the adaptive estimators
   such as Tukey's biweight (Mosteller and Tukey, 1977, p. 205).

3. An often used alternative outlier model replaces the second line of (2.1) by
$$X_n \sim N(\mu, \tau^2 \sigma^2), \qquad \tau^2 > 1.$$
   For this model location estimators remain unbiased but their variance is
   increased. Since bias has been sidestepped, only the variance of the
   estimator needs to be studied (David and Shu, 1978; Rosenberger and Gasko,
   1983).
CASE OF SEVERAL EXTREME OUTLIERS. For $q$ $(1 \le q < \tfrac{1}{2}n)$ outliers Rocke (1986)
defines as a measure of outlier resistance of an estimator of location $T$ the
"expected maximum bias" $D_T(n,q)$ by
$$D_T(n,q) = E\{\sup |T(Z_{1:n-q},\ldots,Z_{n-q:n-q},\lambda_1,\ldots,\lambda_q)|\}, \qquad (2.6)$$
where the supremum is taken over all possible choices of the constants
$\lambda_1,\ldots,\lambda_q$ and the $Z$'s are the normal OS. When $T = L_n$, the supremum will
evidently occur when the $\lambda$'s are all $+\infty$ or all $-\infty$. As Rocke points out, by
focusing on the worst case of bias one need not specify the usually unknown
distribution(s) of the outliers. It suffices to model the good observations,
which more generally could be from any standardized distribution.

It appears that unwittingly Rocke does not use (2.6) but in fact works
with the standardized bias
$$D_T(n,q) = E\{T(Z_{1:n-q},\ldots,Z_{n-q:n-q},\infty,\ldots,\infty)\}. \qquad (2.7)$$

If the good observations were independently generated from a unimodal
symmetric distribution (mode = maximum), then again the median can be shown to
have the least bias $D_T(n,q)$ among $L_n$-statistics (Rocke's proof is incorrect;
see the appendix).

3. CURRENT MEASURES OF LOCATION AND DISPERSION

Let $\{X_i\}$ be a sequence of independent random variables with cdf $F_i(x)$
$(i = 1,2,\ldots)$. Then $S_n^{(i)} = (X_i,\ldots,X_{i+n-1})$ may be called a moving sample of
size $n$, and $X_{r:n}^{(i)}$, the r-th OS of $S_n^{(i)}$, the moving r-th OS. Moving maxima
$(r = n)$ and minima $(r = 1)$ were studied by David (1955) under homogeneity
$(F_i(x) = F(x),\ i = 1,2,\ldots)$ in the course of an investigation of moving
ranges $W_n^{(i)} = X_{n:n}^{(i)} - X_{1:n}^{(i)}$ $(i = 1,2,\ldots)$. The latter have a longer history
(Grant, 1946), being natural companions to moving averages on quality control
charts. Such charts are particularly appropriate when it takes some time to
produce a single observation.

Moving medians are robust current measures of location and, like moving
averages, smooth the data; see, e.g., Tukey (1977, p. 210). Cleveland and
Kleiner (1975) have used the moving midmean, the mean of the central half of
the ordered observations in each $S_n^{(i)}$, together with the mean of the top half
and the mean of the bottom half, as three moving descriptive statistics
indicating both location and dispersion changes in a time series.

Since $S_n^{(i)}$ and $S_n^{(j)}$ involve common random variables iff
$|j - i| = d < n$, we see that $X_{r:n}^{(i)}$ and $X_{s:n}^{(j)}$ are independent for $d \ge n$ and
dependent otherwise, with $n - d$ rv's in common. To begin with, we assume
homogeneity. Then the joint distribution of $S_n^{(i)}$ and $S_n^{(j)}$ will be
stationary and will depend only on $F(x)$, $n$, and $d$. We therefore consider
$S_n^{(1)}$ and $S_n^{(1+d)}$, and more specifically $X_{r:n}^{(1)}$ and $X_{s:n}^{(1+d)}$
$(r,s = 1,\ldots,n)$. Let
$$\pi_{gh}(d) = \Pr\{\mathrm{rank}(X_{r:n}^{(1)}) = g,\ \mathrm{rank}(X_{s:n}^{(1+d)}) = h\}, \qquad (3.1)$$
where $\mathrm{rank}(Y)$ denotes the rank of $Y$ in the combined sample $X_1,\ldots,X_{n+d}$. It
follows that
$$E(X_{r:n}^{(1)}\, X_{s:n}^{(1+d)}) = \sum_g \sum_h \pi_{gh}(d)\, E(X_{g:n+d}\, X_{h:n+d}). \qquad (3.2)$$

This permits calculation of $\mathrm{cov}(X_{r:n}^{(1)}, X_{s:n}^{(1+d)})$ in terms of the first two
moments of order statistics in samples of $n+d$ from a distribution with cdf
$F(x)$, since the $\pi_{gh}$ can be obtained by combinatorial arguments (David and
Rogers, 1983). The joint distribution of $X_{r:n}^{(i)}$ and $X_{s:n}^{(j)}$ has been investigated
by Inagaki (1980).

With the help of (3.2) it is possible to evaluate the auto-covariance
structure under homogeneity of the moving median and, in fact, of any linear
function of the order statistics $\mathbf{a}'\mathbf{X}^{(i)} = a_1 X_{1:n}^{(i)} + \cdots + a_n X_{n:n}^{(i)}$. That is, we
can find
$$\mathrm{cov}(\mathbf{a}'\mathbf{X}^{(i)},\ \mathbf{a}'\mathbf{X}^{(i+d)})$$
in terms of the first two moments of the OS for sample sizes up to $2n-1$ from a
distribution with cdf $F(x)$.

Electrical engineers have made extensive use of moving order statistics
in digital filters. They view a moving sample as a window on a sequence of
signals $x_1, x_2, \ldots$, and speak of median filters when using the moving median to
represent the current value of the signal, thereby "filtering out" occasional
impulsive disturbances (outliers) (e.g., Arce, Gallagher, and Nodes, 1986).
More generally, the median may be replaced by $\mathbf{a}'\mathbf{X}^{(i)}$ to give order statistic
filters (e.g., Bovik and Restrepo, 1987).

For example, suppose that in the automatic smoothing of a basically
stationary time series one is prepared to ignore single outliers but wishes to
be alerted to a succession of two or more high (or low) values. This calls
for use of moving medians in samples of three, since clearly a single outlier
will be smoothed out but two successive large values will result in two large
medians. The following small example illustrates the situation, where for
purposes of comparison we have added the much less informative moving mean $\bar{x}_i$:

  x_i        3    1    1    10    2     4     3     9     10    2     1
  x_{2:3}         1    1     2    4     3     4     9      9    2
  x-bar_i       1 2/3  4   4 1/3 5 1/3  3   5 1/3 7 1/3    7  4 1/3
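
The moving statistics in this table can be reproduced with a few lines of code; a minimal Python sketch (our own illustration):

```python
x = [3, 1, 1, 10, 2, 4, 3, 9, 10, 2, 1]

# moving samples of size n = 3: S^(i) = (x_i, x_{i+1}, x_{i+2})
windows = [x[i:i + 3] for i in range(len(x) - 2)]

moving_median = [sorted(w)[1] for w in windows]        # x_{2:3}
moving_mean = [round(sum(w) / 3, 2) for w in windows]  # x-bar_i

print(moving_median)   # [1, 1, 2, 4, 3, 4, 9, 9, 2]
print(moving_mean)     # [1.67, 4.0, 4.33, 5.33, 3.0, 5.33, 7.33, 7.0, 4.33]
```

Note how the single outlier 10 in the fourth position is smoothed out of the medians, while the pair 9, 10 produces two successive large medians.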

When $X_1, X_2, \ldots$ are not iid, even the distribution of order statistics in
a fixed sample becomes complicated, although a fairly elegant expression for
the pdf can be written down in terms of permanents (Vaughan and Venables,
1972) if the $X$'s are not identically distributed but still independent. It is
easily seen that the moving median and other order statistics will reflect
trends except for a smoothing at the ends. Thus for the following sequence,
where the upward trend is underlined, we have

  x_i        5    2    1    3     4     6    9    12     14    11    7
  x_{2:3}         2    2    3     4     6    9    12     12    11
  x-bar_i       2 2/3  2  2 2/3 4 1/3 6 1/3  9  11 2/3 12 1/3 10 2/3
For a linear trend given by
$$X_i = i\tau + Z_i, \qquad i = 1,2,\ldots \qquad (3.3)$$
where the $Z_i$ are i.i.d., we evidently have
$$X_{r:n}^{(i+1)} \stackrel{d}{=} \tau + X_{r:n}^{(i)},$$
with covariances $\mathrm{cov}(X_{r:n}^{(i)}, X_{s:n}^{(i)})$ $(r,s = 1,\ldots,n)$ independent of $i$.

Consider now a particular sample $X_1, X_2, \ldots, X_{2m-1}$ $(m = 2,3,\ldots)$ with
symmetric unimodal distributions. Then under (3.3), which need hold only for
the sample in question, we see that for $\tau > 0$
$$\Pr\{\mathrm{rank}\ X_i = i\} \ \text{increases with}\ \tau.$$
Thus $X_{r:n}$ will tend to lead the trend, reflect the current state, or lag the
trend according as $r >$, $=$, or $< m$, and will do so increasingly as $\tau$ increases; for
$\tau < 0$, the results are reversed. However, in contrast to the sample mean,
whose variance remains unchanged under a linear trend, the variance of the
sample median increases with $\tau$. (I am indebted to Dr. W. J. Kennedy for some
computations verifying the latter intuitively obvious result.) Thus the use
of the median, under locally linear trend, is appropriate primarily as
protection against outliers. In this situation, but under nonlinear trend,
Bovik and Naaman (1986) consider the optimal estimation of $EX_i$ by linear
functions of order statistics.

4. SOME PROBLEMS IN RELIABILITY

There is a well-known immediate connection between order statistics and
the reliability of k-out-of-n systems.

Definition. A k-out-of-n system is a system of $n$ components that functions if
and only if at least $k$ $(k = 1,\ldots,n)$ of its components function. Series and
parallel systems correspond to $k = n$ and $k = 1$.

Let $X_i$ $(i = 1,\ldots,n)$ be the lifetime of the i-th component and
$R_i(x) = \Pr\{X_i > x\}$ its reliability at time $x$ (the probability that it will
function at time $x$). Then the reliability of the system $S$ at time $x$ is
$$R_S(x) = \Pr\{X_{n-k+1:n} > x\}.$$
If the $X_i$ are independent (but not necessarily identically distributed) one
may write (Sen, 1970; Pledger and Proschan, 1971)
$$R_S(x) = \sum_A \prod_{i=1}^{n} R_i(x)^{\delta_i}\,[1 - R_i(x)]^{1-\delta_i},$$
where $\delta_i = 0$ or $1$, and $A$ is the region $\sum_{i=1}^{n} \delta_i \ge k$. It can be shown that a
series (parallel) system is at most (least) as reliable as the corresponding
system of components each having reliability $\bar{R}(x) = \frac{1}{n}\sum_{i=1}^{n} R_i(x)$. An excellent
general account, covering also important situations when the $X_i$ are not
independent, is given in Barlow and Proschan (1975).
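
The sum over the region A can be evaluated directly by enumerating component states; the sketch below (our own illustration, with arbitrary reliabilities R_i(x)) computes $R_S(x)$ for independent, not necessarily identical, components and recovers the series and parallel cases:

```python
from itertools import product

def k_out_of_n_reliability(R, k):
    """R: component reliabilities R_i(x) at a fixed time x."""
    # sum over all state vectors (delta_1, ..., delta_n) with sum(delta) >= k
    total = 0.0
    for delta in product([0, 1], repeat=len(R)):
        if sum(delta) >= k:
            p = 1.0
            for Ri, di in zip(R, delta):
                p *= Ri if di else (1.0 - Ri)
            total += p
    return total

R = [0.95, 0.90, 0.85]                  # hypothetical component reliabilities
print(k_out_of_n_reliability(R, 3))     # series: 0.95 * 0.90 * 0.85
print(k_out_of_n_reliability(R, 1))     # parallel: 1 - 0.05 * 0.10 * 0.15
print(k_out_of_n_reliability(R, 2))     # 2-out-of-3 system
```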

I will conclude with a problem in reliability, quite different from the
above, that was suggested by an enquiry from Malcolm Taylor (see Baker and
Taylor, 1981). A fuze contains $n$ detonators, $r$ of which must function within
time span $t$. The ideal requirement $r = n$ may be too demanding in practice and
$r = n-1$ suffices. The $n$ times to detonation, $X_1,\ldots,X_n$, may reasonably be
regarded, I was told, as a random sample from a normal population. Let
$P(r;\,n,t)$ be the probability that at least $r$ detonations have occurred in time $t$.

Now, for a random sample from any continuous distribution with cdf $F(x)$,
$P(n;\,n,t)$ is just
$$\Pr\{X_{n:n} - X_{1:n} \le t\} = n\int_{-\infty}^{\infty} [F(x+t) - F(x)]^{n-1}\, dF(x),$$
the cdf of the sample range (Hartley, 1942). Let $A_1'$ and $A_2$ be the events
$X_{n-1:n} - X_{1:n} \le t$ and $X_{n:n} - X_{2:n} \le t$, respectively. Then
$$P(n-1;\,n,t) = \Pr\{A_1' \cup A_2\} = \Pr\{A_1'\} + \Pr\{A_2\} - \Pr\{A_1' A_2\}. \qquad (4.1)$$

The event $A_1'$ occurs if $n-1$ or $n-2$ of $X_1,\ldots,X_n$ fall in the interval
$(X_{1:n}, X_{1:n}+t]$ and $A_2$ if $n-2$ of the $X_i$ are in $(X_{2:n}, X_{2:n}+t]$. Since $A_2$
includes the event that $n-1$ of the $X_i$ are in $(X_{1:n}, X_{1:n}+t]$, we can avoid
unnecessary duplication by replacing $A_1'$ in (4.1) by $A_1$, the event that
exactly $n-2$ of the $X_i$ are in $(X_{1:n}, X_{1:n}+t]$.
We have immediately, writing $n^{(j)} = n(n-1)\cdots(n-j+1)$, that
$$\Pr\{A_1\} = n^{(2)} \int_{-\infty}^{\infty} [F(x+t) - F(x)]^{n-2}\,[1 - F(x+t)]\, dF(x)$$
and
$$\Pr\{A_2\} = n^{(2)} \int_{-\infty}^{\infty} [F(x+t) - F(x)]^{n-2}\, F(x)\, dF(x).$$

The joint occurrence of $A_1$ and $A_2$ is illustrated below for $n = 6$.

  [diagram: $x = X_{1:n}$, $y = X_{2:n}$, with the intervals $(x, x+t]$ and $(y, y+t]$; not legible in this scan]

We have
$$\Pr\{A_1 A_2\} = n^{(3)} \int_{-\infty}^{\infty}\int_{x}^{x+t} [F(x+t) - F(y)]^{n-3}\,[F(y+t) - F(x+t)]\, dF(y)\, dF(x).$$

From these results $P(n-1;\,n,t)$ has been tabulated in David and Kinyon
(1983) when $F(x) = \Phi(x)$. Note that $P(n-1;\,n,t)$ may be interpreted as the
probability that at least $n-1$ out of $n$ independent normal $N(\mu,\sigma^2)$ variates are
within an interval of length $t\sigma$.

EXAMPLE. As in Baker and Taylor (1981) suppose that $X_1,\ldots,X_7$ are independent
normal variates with $\sigma = 10^{-5}$. The entry $P(6;7,3) = 0.9587$ tells us that the
probability of at least six detonations out of a possible seven within time
span $3\sigma$ is 0.9587. By comparison, the probability of seven detonations is
only 0.6601, as found from tables of the cdf of the range (Pearson and
Hartley, 1970).
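
For F = Φ these integrals are easy to evaluate numerically; the following sketch (a check of ours, not David and Kinyon's tabulation) computes P(7;7,3) and P(6;7,3) by quadrature and should reproduce the 0.6601 and 0.9587 figures to the accuracy of the integration:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad, dblquad

n, t = 7, 3.0
F, f = norm.cdf, norm.pdf

# P(n; n,t) = n * int [F(x+t) - F(x)]^(n-1) dF(x): cdf of the sample range
P_all = n * quad(lambda x: (F(x + t) - F(x))**(n - 1) * f(x), -10, 10)[0]

perm2 = n * (n - 1)             # n^(2)
perm3 = n * (n - 1) * (n - 2)   # n^(3)

pA1 = perm2 * quad(lambda x: (F(x + t) - F(x))**(n - 2) * (1 - F(x + t)) * f(x), -10, 10)[0]
pA2 = perm2 * quad(lambda x: (F(x + t) - F(x))**(n - 2) * F(x) * f(x), -10, 10)[0]
# Pr{A1 A2}: dblquad integrates over y in (x, x+t), then over x
pA1A2 = perm3 * dblquad(
    lambda y, x: (F(x + t) - F(y))**(n - 3) * (F(y + t) - F(x + t)) * f(y) * f(x),
    -10, 10, lambda x: x, lambda x: x + t)[0]

print(round(P_all, 4))                 # ~0.6601: all seven within 3 sigma
print(round(pA1 + pA2 - pA1A2, 4))     # ~0.9587: at least six within 3 sigma
```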

David and Kinyon (1983) also give an expression, involving a triple
integral, for $P(n-2;\,n,t)$. It should be noted that $P(r;\,n,t)$ has received
much attention by quite different techniques in the special case when the $X_i$
are independent uniform variates (e.g., Neff and Naus, 1980). From a
different viewpoint again, writing
$$P(r;\,n,t) = \Pr\{H_n(r/n) \le t\},$$
where
$$H_n(r/n) = \min_{i=1,\ldots,n-r+1}\,(X_{i+r-1:n} - X_{i:n}),$$
we may regard $H_n(r/n)$ as a measure of dispersion. In fact, $H_n(\alpha)$ is the length
of the shorth, the shortest $\alpha$-fraction of the ordered sample (Andrews et al.,
1972). It has recently been shown (Grübel, 1988) that $H_n(\alpha)$ is asymptotically
normal (for fixed $\alpha$).
APPENDIX
H. A. David and C. C. Yang

Correction to 'Outlier resistance in small samples'

By DAVID M. ROCKE

Biometrika (1986), 73, 175-81

The author does not stay with his own definition of $D_T(n,q)$ but in fact uses
$$D_T(n,q) = E\{T(Z_{1:n-q},\ldots,Z_{n-q:n-q},\infty,\ldots,\infty)\}.$$
Even with this change the proof of the theorem on p. 176 is in error since the
combinatorial term associated with $\delta_{n-r}$ should be $\binom{n-1}{n-r}$, not $\binom{n}{n-r}$. However,
since $\delta_{n-r} = \delta_{r-q}$, the theorem follows directly from Case 2 of David and
Groeneveld (Biometrika (1982), 69, 227-32) and has essentially been proved in
P. K. Sen (Ed.), Biostatistics (1985), North-Holland, pp. 309-11.

REFERENCES

Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H., and
Tukey, J. W. (1972). Robust Estimates of Location. Princeton University
Press.

Arce, G. R., Gallagher, N. C., and Nodes, T. A. (1986). Median filters:
Theory for one- and two-dimensional filters. In: Advances in Computer
Vision and Image Processing, Vol. 2, pp. 89-166.

Baker, W. E. and Taylor, M. S. (1981). An order statistic approach to fuze
design. Tech. Rept. ARBRL-TR-02313, U.S. Army Research and Development
Command, Ballistic Research Lab, Aberdeen Proving Ground, Md.

Barlow, R. E. and Proschan, F. (1975). Statistical Theory of Reliability and
Life Testing: Probability Models. Holt, Rinehart, and Winston, New York.

Barnett, V. and Lewis, T. (1984). Outliers in Statistical Data. 2nd edn.
Wiley, New York.

Bovik, A. C. and Naaman, L. (1986). Least-squares signal estimation using
order statistic filters. Proc. 20th Ann. Conf. Info. Sci. Syst., pp. 735-39.

Bovik, A. C. and Restrepo, A. (1987). Spectral properties of moving L-
estimates of independent data. J. Franklin Inst. 324, 125-37.

Cleveland, W. S. and Kleiner, B. (1975). A graphical technique for enhancing
scatterplots with moving statistics. Technometrics 17, 447-54.

David, H. A. (1955). A note on moving ranges. Biometrika 42, 512-15.

David, H. A. (1981). Order Statistics. 2nd edn. Wiley, New York.

David, H. A., Kennedy, W. J., and Knight, R. D. (1977). Means, variances,
and covariances of normal order statistics in the presence of an outlier.
Selected Tables in Mathematical Statistics 5, 75-204.

David, H. A. and Kinyon, L. C. (1983). The probability that out of n events
at least r (>= n-2) occur within time span t. In: Sen, P. K. (Ed.),
Contributions to Statistics, pp. 107-13, North-Holland, Amsterdam.

David, H. A. and Rogers, M. P. (1983). Order statistics in overlapping
samples, moving order statistics and U-statistics. Biometrika 70, 245-9.

David, H. A. and Shu, V. S. (1978). Robustness of location estimators in the
presence of an outlier. In: David, H. A. (Ed.), Contributions to Survey
Sampling and Applied Statistics: Papers in Honor of H. O. Hartley, pp. 235-
50, Academic Press, New York.

Galambos, J. (1987). The Asymptotic Theory of Extreme Order Statistics.
Krieger, Malabar, Florida.

Grant, E. L. (1946). Statistical Quality Control. McGraw-Hill, New York.

Grubbs, F. E. (1950). Sample criteria for testing outlying observations.
Ann. Math. Statist. 21, 27-58.

Grübel, R. (1988). The length of the shorth. Ann. Statist. 16, 619-28.

Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A.
(1986). Robust Statistics. Wiley, New York.

Inagaki, N. (1980). The distributions of moving order statistics. In:
Matusita, K. (Ed.), Recent Developments in Statistical Inference and Data
Analysis, pp. 137-42. North-Holland, Amsterdam.

Mosteller, F. and Tukey, J. W. (1977). Data Analysis and Regression.
Addison-Wesley, Reading, Massachusetts.

Neff, N. D. and Naus, J. I. (1980). The distribution of the size of the
maximum cluster of points on a line. Selected Tables in Mathematical
Statistics 6, 1-207.

Pearson, E. S. and Hartley, H. O. (1970). Biometrika Tables for Statisticians,
Vol. I, 3rd Ed. (with additions). Cambridge University Press.

Pledger, G. and Proschan, F. (1971). Comparisons of order statistics from
heterogeneous distributions. In: Rustagi, J. S. (Ed.), Optimizing Methods
in Statistics, pp. 89-113. Academic Press, New York.

Rocke, D. M. (1986). Outlier resistance in small samples. Biometrika 73,
175-82.

Rosenberger, J. L. and Gasko, M. (1983). Comparing location estimators:
Trimmed means, medians, and trimean. In: Hoaglin, D. C., Mosteller, F.,
and Tukey, J. W. (Eds.), Understanding Robust and Exploratory Data Analysis,
pp. 297-338, Wiley, New York.

Sen, P. K. (1970). A note on order statistics for heterogeneous
distributions. Ann. Math. Statist. 41, 2137-9.

Tukey, J. W. (1970). Exploratory Data Analysis. (Limited Preliminary
Edition) Addison-Wesley, Reading, Massachusetts.

Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley, Reading,
Massachusetts.
MULTI-SAMPLE FUNCTIONAL STATISTICAL DATA ANALYSIS

Emanuel Parzen
Department of Statistics
Texas A&M University
College Station, Texas 77843-3143

ABSTRACT. This paper discusses a functional approach to the problem of compari-
son of multi-samples (two samples or c samples, where c > 2). The data consists of c
random samples whose probability distributions are to be tested for equality. A diversity
of statistics to test equality of c samples are presented in a unified framework with the
aim of helping the researcher choose the optimal procedures which provide greatest insight
about how the samples differ in their distributions. Concepts discussed are: sample distri-
bution functions; ranks; mid-distribution function; two-sample t test and nonparametric
Wilcoxon test; multi-sample analysis of variance and Kruskal-Wallis test; Anderson-Darling
and Cramer-von Mises tests; components and linear rank statistics; comparison distribu-
tion and comparison density functions, especially for discrete distributions; components
with orthogonal polynomial score functions; chi-square tests and their components.
1. INTRODUCTION. We assume that we are observing a variable Y in c cases or sam-
ples (corresponding to c treatments or c populations). The samples can be regarded as the
values of c variables $Y_1, \ldots, Y_c$ with respective true distribution functions $F_1(y), \ldots, F_c(y)$
and quantile functions $Q_1(u), \ldots, Q_c(u)$. We call $Y_1, \ldots, Y_c$ the conditioned variables (the
value of Y in different populations).

The general problem of comparison of conditioned random variables is to model how
their distribution functions vary with the value of the conditioning variable $k = 1, \ldots, c$,
and in particular to test the hypothesis of homogeneity of distributions:
$$H_0 : F_1 = \cdots = F_c = F.$$
The distribution F to which all the others are equal is considered to be the unconditional
distribution of Y (which is estimated by the sample distribution of Y in the pooled sample).
2. DATA. The data consists of c random samples
$$Y_k(j),\ j = 1, \ldots, n_k,$$
for $k = 1, \ldots, c$. The pooled sample, of size $N = n_1 + \cdots + n_c$, represents observations of
the pooled (or unconditional) variable Y. The c samples are assumed to be independent
of each other.
3. SAMPLE DISTRIBUTION FUNCTIONS. The sample distribution functions of
the samples are defined (for $-\infty < y < \infty$) by
$$F_k^\sim(y) = \text{fraction} \le y\ \text{among}\ Y_k(\cdot).$$
The unconditional or pooled sample distribution of Y is denoted
$$F^\sim(y) = \text{fraction} \le y\ \text{among}\ Y_k(\cdot),\ k = 1, \ldots, c.$$
We use ^ to denote a smoother distribution to which we are comparing a more raw
distribution which is denoted by ~. An expectation (mean) computed from a sample is
denoted $E^\sim$.

Research Supported by the U.S. Army Research Office
4. RANKS, MID-RANKS, AND MID-DISTRIBUTION FUNCTION. Nonparamet-
ric statistics use ranks of the observations in the pooled sample; let $R_k(t)$ denote the rank
in the pooled sample of $Y_k(t)$. One can define $R_k(t) = N F^\sim(Y_k(t))$.

In defining linear rank statistics one transforms the rank to a number in the open unit
interval, usually $R_k(t)/(N+1)$. We recommend $(R_k(t) - .5)/N$. These concepts assume
all observations are distinct, and treat ties by using average ranks. We recommend an
approach which we call the "mid-rank transform," which transforms $Y_k(t)$ to $P^\sim(Y_k(t))$,
defining the mid-distribution function of the pooled sample Y by
$$P^\sim(y) = F^\sim(y) - .5\, p^\sim(y).$$
We call
$$p^\sim(y) = \text{fraction equal to}\ y\ \text{among pooled sample}$$
the pooled sample probability mass function.


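A minimal sketch of the mid-distribution function (our own illustration of the definitions above; for distinct observations $N P^\sim(Y)$ reduces to the mid-rank, the rank minus .5):

```python
import numpy as np

def mid_distribution(y, pooled):
    # P~(y) = F~(y) - .5 p~(y), computed from the pooled sample
    pooled = np.asarray(pooled)
    F = np.mean(pooled <= y)     # F~(y): fraction <= y
    p = np.mean(pooled == y)     # p~(y): fraction equal to y
    return F - 0.5 * p

pooled = [1, 2, 2, 3, 5, 5, 5, 8]      # made-up pooled sample with ties
for y in sorted(set(pooled)):
    print(y, mid_distribution(y, pooled))
```
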
5. SAMPLE MEANS AND VARIANCES. When the random variables are assumed
to be normal the test statistics are based on the sample means (for $k = 1, \ldots, c$)
$$Y_k^- = E^\sim[Y_k] = (1/n_k) \sum_{t=1}^{n_k} Y_k(t).$$
We interpret $Y_k^-$ as the sample conditional mean of Y given that it comes from the kth
population. The unconditional sample mean of Y is
$$Y^- = E^\sim[Y] = p_{.1} Y_1^- + \cdots + p_{.c} Y_c^-,$$
defining
$$p_{.k} = n_k/N$$
to be the fraction of the pooled sample in the kth sample; we interpret it as the empirical
probability that an observation comes from the kth sample.

The unconditional and conditional variances are denoted
$$\mathrm{VAR}^\sim[Y] = (1/N) \sum_{k=1}^{c} \sum_{j=1}^{n_k} \{Y_k(j) - Y^-\}^2,$$
$$\mathrm{VAR}^\sim[Y_k] = (1/n_k) \sum_{j=1}^{n_k} \{Y_k(j) - Y_k^-\}^2.$$
Note that our divisor is the sample size $N$ or $n_k$ rather than $N - c$ or $n_k - 1$. The latter
then arise as factors used to define F statistics.

We define the pooled variance to be the mean conditional variance:
$$\sigma^{\wedge 2} = \sum_{k=1}^{c} p_{.k}\, \mathrm{VAR}^\sim[Y_k].$$
6. TWO-SAMPLE NORMAL T TEST. In the two-sample case the statistic to test
$H_0$ is usually stated in a form equivalent to
$$T = \{Y_1^- - Y_2^-\}\big/\sigma^\wedge \{(N/(N-2))((1/n_1) + (1/n_2))\}^{.5}.$$
We believe that one obtains maximum insight (and analogies and extensions) by expressing
T in the form which compares $Y_1^-$ with $Y^-$:
$$T = \{(N-2)\, p_{.1}/(1 - p_{.1})\}^{.5}\, \{Y_1^- - Y^-\}\big/\sigma^\wedge.$$
The exact distribution of T is $t(N-2)$, the t-distribution with $N-2$ degrees of freedom.
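
The equivalence of the two forms of T is easily verified numerically; a sketch with made-up data (our own check):

```python
import numpy as np

y1 = np.array([5.1, 4.8, 5.6, 5.0, 4.7])
y2 = np.array([5.9, 6.1, 5.4, 6.0])
n1, n2 = len(y1), len(y2)
N = n1 + n2
p1 = n1 / N
pooled_mean = (n1 * y1.mean() + n2 * y2.mean()) / N
# pooled variance = mean conditional variance; np.var uses divisor n by default
sigma = np.sqrt((n1 * y1.var() + n2 * y2.var()) / N)

T_usual = (y1.mean() - y2.mean()) / (sigma * np.sqrt((N / (N - 2)) * (1/n1 + 1/n2)))
T_component = np.sqrt((N - 2) * p1 / (1 - p1)) * (y1.mean() - pooled_mean) / sigma

print(T_usual, T_component)   # identical up to floating-point rounding
```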
7. TWO-SAMPLE NONPARAMETRIC WILCOXON TEST. To define the popular
Wilcoxon non-parametric statistic to test $H_0$ we define $W_k$ to be the sum of the $n_k$ ranks
of the $Y_k$ values; its mean and variance are given by
$$E[W_k] = n_k(N+1)/2, \qquad \mathrm{VAR}[W_k] = n_1 n_2 (N+1)/12.$$
The usual definition of the Wilcoxon test statistic is
$$T_1 = \{W_k - E[W_k]\}\big/\{\mathrm{VAR}[W_k]\}^{.5}.$$
The approach we describe in this paper yields as the definition of the nonparametric
Wilcoxon test statistic (which can be verified to approximately equal the above definition
of $T_1$, up to a factor $\{1 - (1/N)^2\}^{.5}$)
$$T_1 = \{12(N-1)\, p_{.1}/(1 - p_{.1})\}^{.5}\, \{R_1^- - .5\},$$
defining
$$R_1^- = (1/n_1) \sum_{t=1}^{n_1} (R_1(t) - .5)/N = (W_1/n_1 N) - (1/2N).$$
One reason we prefer this form of expressing non-parametric statistics is because of its
relation to mid-ranks.

One should notice the analogy between our expressions for the parametric test statistic
T and the nonparametric test statistic $T_1$; the former has an exact $t(N-2)$ distribution
and the latter has asymptotic distribution Normal(0,1).

8. TEST OF EQUALITY OF c SAMPLES: NORMAL CASE. The homogeneity of
c samples is tested in the parametric normal case by the analysis of variance, which starts
with a fundamental identity which in our notation is written
$$\mathrm{VAR}^\sim[Y] = \sum_{k=1}^{c} p_{.k}\{Y_k^- - Y^-\}^2 + \sigma^{\wedge 2}.$$
The F test of the one-way analysis of variance can be expressed as the statistic
$$T^2 = \sum_{k=1}^{c} p_{.k}\, |T_k|^2 = \sum_{k=1}^{c} (1 - p_{.k})\, |TF_k|^2,$$
defining
$$T_k = (N - c)^{.5}\, \{Y_k^- - Y^-\}/\sigma^\wedge, \qquad
TF_k = \{(N - c)\, p_{.k}/(1 - p_{.k})\}^{.5}\, \{Y_k^- - Y^-\}/\sigma^\wedge.$$

The asymptotic distributions of $T^2/(c-1)$ and $TF_k^2$ are $F(c-1, N-c)$ and $F(1, N-c)$,
respectively.
9. TEST OF EQUALITY OF c SAMPLES: NONPARAMETRIC KRUSKAL-
WALLIS TEST. The Kruskal-Wallis nonparametric test of homogeneity of c samples
can be shown to be
$$TKW^2 = \sum_{k=1}^{c} (1 - p_{.k})\, |TKW_k|^2,$$
$$TKW_k = \{12(N-1)\, p_{.k}/(1 - p_{.k})\}^{.5}\, \{R_k^- - .5\}.$$

The asymptotic distributions of $TKW^2$ and $TKW_k^2$ are chi-squared with $c-1$ and 1
degrees of freedom, respectively.
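
The component representation can be checked against a standard Kruskal-Wallis routine. The sketch below (our own check, on made-up tie-free data so that no tie correction enters) recovers $TKW^2 = H\,\{1 - (1/N)^2\}$, where H is the usual Kruskal-Wallis statistic, in line with the factor noted in Section 7:

```python
import numpy as np
from scipy.stats import kruskal, rankdata

samples = [np.array([1.2, 3.4, 2.2]),
           np.array([4.5, 5.1, 3.9, 4.0]),
           np.array([2.8, 6.0, 5.5])]
pooled = np.concatenate(samples)
N = len(pooled)
ranks = rankdata(pooled)        # mid-ranks; all values are distinct here

TKW2, start = 0.0, 0
for s in samples:
    nk = len(s)
    pk = nk / N
    Rk = np.mean((ranks[start:start + nk] - 0.5) / N)   # R_k^-
    TKWk = np.sqrt(12 * (N - 1) * pk / (1 - pk)) * (Rk - 0.5)
    TKW2 += (1 - pk) * TKWk**2
    start += nk

H = kruskal(*samples).statistic
print(TKW2, H * (1 - 1 / N**2))   # the two agree
```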
10. COMPONENTS. We have represented the analysis of variance test statistic $T^2$
and the Kruskal-Wallis test statistic $TKW^2$ as weighted sums of squares of statistics $TF_k$
and $TKW_k$ respectively, which we call components, since their values should be explicitly
calculated to indicate the source of the significance (if any) of the overall statistics. Other
test statistics that can be defined can be shown to correspond to other definitions of
components.
11. ANDERSON-DARLING AND CRAMER-VON MISES TEST STATISTICS. Im-
portant among the many test statistics which have been defined to test the equality of
distributions are the Anderson-Darling and Cramer-von Mises test statistics. They will
be introduced below in terms of representations as weighted sums of squares of suitable
components.
12. COMPARISON DISTRIBUTION FUNCTIONS AND COMPARISON DEN-
SITY FUNCTIONS. We now introduce the key concepts which enable us to unify and
choose between the diverse statistics available for comparing several samples. To compare
two continuous distributions F(.) and H(.), where H is a true or smooth and F is a model
or raw distribution, we define the comparison distribution function
$$D(u) = D(u; H, F) = F(H^{-1}(u))$$
with comparison density
$$d(u) = d(u; H, F) = D'(u) = f(H^{-1}(u))/h(H^{-1}(u)).$$
Under $H_0 : H = F$, $D(u) = u$ and $d(u) = 1$. Thus testing $H_0$ is equivalent to testing
$D(u)$ for uniformity.

Sample distribution functions are discrete. The most novel part of this paper is that
we propose to form an estimator $D^\sim(u)$ from estimators $H^\sim(\cdot)$ and $F^\sim(\cdot)$ by using a general
definition of $D(\cdot)$ for two discrete distributions $H(\cdot)$ and $F(\cdot)$ with respective probability
mass functions $p_H$ and $p_F$ satisfying the condition that the values at which $p_H$ is positive
include all the values at which $p_F$ is positive.
13. COMPARISON OF DISCRETE DISTRIBUTIONS. To compare two discrete
distributions we define first d(u) and then D(u) as follows:
$$d(u) = d(u; H, F) = p_F(H^{-1}(u))/p_H(H^{-1}(u)),$$
$$D(u) = \int_0^u d(t)\, dt.$$
We apply this definition to the discrete sample distributions $F^\sim$ and $F_k^\sim$ to obtain
$$d_k^\sim(u) = d(u; F^\sim, F_k^\sim)$$
and its integral $D_k^\sim(u)$.

We obtain the following definition of $d_k^\sim(u)$ for the c-sample testing problem with all
values distinct:
$$d_k^\sim(u) = N/n_k \quad \text{if}\ (R_k(j) - 1)/N < u < R_k(j)/N,\ j = 1, \ldots, n_k; \qquad = 0\ \text{otherwise}.$$
A component, with score function J(u), is a linear functional
$$T_k^\sim(J) = \int_0^1 J(u)\, d_k^\sim(u)\, du.$$
It equals
$$(1/n_k) \sum_{j=1}^{n_k} N \int_{(R_k(j)-1)/N}^{R_k(j)/N} J(u)\, du,$$
which can be approximated by $E^\sim[J(P^\sim(Y_k))]$.

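Under the distinct-values assumption the component integral is a finite sum; a sketch (our own illustration) evaluating $\int_0^1 J(u)\, d_k^\sim(u)\, du$ for the Wilcoxon score $J(u) = u - .5$:

```python
import numpy as np
from scipy.stats import rankdata
from scipy.integrate import quad

def raw_component(sample_ranks, nk, N, J):
    # integral of J(u) d_k~(u) du, where d_k~(u) = N/nk on each rank
    # interval ((R-1)/N, R/N] of the k-th sample and 0 elsewhere
    return sum((N / nk) * quad(J, (R - 1) / N, R / N)[0] for R in sample_ranks)

y1 = [1.2, 3.4, 2.2]
y2 = [4.5, 5.1, 3.9, 4.0]
pooled = np.array(y1 + y2)
N = len(pooled)
r1 = rankdata(pooled)[:len(y1)]          # ranks of sample 1 in the pooled sample

J = lambda u: u - 0.5                    # Wilcoxon / Kruskal-Wallis score
print(raw_component(r1, len(y1), N, J))  # equals R_1^- - .5 exactly:
print(np.mean((r1 - 0.5) / N) - 0.5)
```
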

14. LINEAR RANK STATISTICS. The concept of a linear rank statistic to compare
the equality of c samples does not have a universally accepted definition. One possible
definition is
$$T_k^-(J) = (1/n_k) \sum_{j=1}^{n_k} J((R_k(j) - .5)/N).$$
However we choose the definition of a linear rank statistic as a linear functional of $d_k^\sim(u)$,
which we call a component; it is approximately equal to the above formula.

We define
$$T_k^\sim(J) = \left\{ \frac{(N-1)\, p_{.k}}{(1 - p_{.k})\, \mathrm{VAR}[J(U)]} \right\}^{.5} \int_0^1 J(u)\{d_k^\sim(u) - 1\}\, du \qquad (1)$$
where U is Uniform(0,1), $E[J(U)] = \int_0^1 J(u)\, du$,
$$\mathrm{VAR}[J(U)] = \int_0^1 \{J(u) - E[J(U)]\}^2\, du.$$
Note that the integral in the definition of $T_k^\sim(J)$ equals
$$\int_0^1 J(u)\, d\{D_k^\sim(u) - u\}.$$
The components of the Kruskal-Wallis nonparametric test statistic $TKW^2$ for testing
the equality of c means have score function $J(u) = u - .5$, satisfying
$$E[J(U)] = 0, \qquad \mathrm{VAR}[J(U)] = 1/12.$$
The components of the F test statistic $T^2$ have score function
$$J(u) = \{Q^\sim(u) - Y^-\}/\sigma^\wedge,$$
where $Q^\sim(u)$ is the sample quantile function of the pooled sample Y.
15. GENERAL DISTANCE MEASURES. General measures of the distance of D(u)
from u and of d(u) from 1 are provided by the integrals from 0 to 1 of
$$\{d^\sim(u) - 1\}^2, \quad \{D^\sim(u) - u\}^2, \quad \{D^\sim(u) - u\}^2/u(1-u), \quad \{d^\wedge(u) - 1\}^2,$$
where $d^\wedge(u)$ is a smooth version of $d^\sim(u)$. We will see that these measures can be decom-
posed into components which may provide more insight; recall basic components are linear
functionals defined by (1),
$$T^\sim(J) = \int_0^1 J(u)\, d^\sim(u)\, du.$$
If $\phi_i(u)$, $i = 0, 1, 2, \ldots$, are complete orthonormal functions with $\phi_0 = 1$, then $H_0$ can
be tested by diagnosing the rate of increase (as a function of $m = 1, 2, \ldots$) of
$$\int_0^1 \{d_m^\wedge(u) - 1\}^2\, du = \sum_{i=1}^{m} |T^\sim(\phi_i)|^2,$$
which measures the distance from 1 of the approximating smooth densities
$$d_m^\wedge(u) = \sum_{i=0}^{m} T^\sim(\phi_i)\, \phi_i(u).$$

16. ORTHOGONAL POLYNOMIAL COMPONENTS. Let $p_i(x)$ be the Legendre poly-
nomials on (-1,1):
$$p_1(x) = x, \quad p_2(x) = (3x^2 - 1)/2, \quad p_3(x) = (5x^3 - 3x)/2, \quad
p_4(x) = (35x^4 - 30x^2 + 3)/8.$$

Define Legendre polynomial score functions
$$\phi_{Li}(u) = (2i + 1)^{.5}\, p_i(2u - 1).$$
One can show that an Anderson-Darling type statistic, denoted $AD(D^\sim)$, can be repre-
sented
$$AD(D^\sim) = \int_0^1 \{D^\sim(u) - u\}^2 \big/ u(1-u)\; du = \sum_{i=1}^{\infty} |T^\sim(\phi_{Li})|^2 \big/ i(i+1).$$

Define cosine score functions by
$$\phi_{Ci}(u) = 2^{.5} \cos(i\pi u).$$
One can show that a Cramer-von Mises type statistic, denoted $CM(D^\sim)$, can be repre-
sented
$$CM(D^\sim) = \int_0^1 \{D^\sim(u) - u\}^2\, du = \sum_{i=1}^{\infty} |T^\sim(\phi_{Ci})|^2 \big/ (i\pi)^2.$$

In addition to Legendre polynomial and cosine components we consider Hermite poly-
nomial components corresponding to Hermite polynomial score functions
$$\phi_{Hi}(u) = (i!)^{-.5}\, H_i(\Phi^{-1}(u)),$$
where $H_i(x)$ are the Hermite polynomials:
$$H_1(x) = x, \quad H_2(x) = x^2 - 1, \quad H_3(x) = x^3 - 3x, \quad H_4(x) = x^4 - 6x^2 + 3.$$
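
All three families of score functions are readily generated with standard special-function routines, and their orthonormality on (0,1) can be confirmed numerically; a sketch (our own check, using scipy's Legendre and probabilists' Hermite evaluators):

```python
from math import cos, factorial, pi, sqrt
from scipy.integrate import quad
from scipy.special import eval_hermitenorm, eval_legendre
from scipy.stats import norm

def phi_L(i, u):   # Legendre score: (2i+1)^.5 p_i(2u - 1)
    return sqrt(2 * i + 1) * eval_legendre(i, 2 * u - 1)

def phi_C(i, u):   # cosine score: 2^.5 cos(i pi u)
    return sqrt(2) * cos(i * pi * u)

def phi_H(i, u):   # Hermite score: (i!)^-.5 H_i(Phi^-1(u))
    return eval_hermitenorm(i, norm.ppf(u)) / sqrt(factorial(i))

for phi in (phi_L, phi_C, phi_H):
    # each score function should have mean ~0 and mean square ~1 on (0,1)
    m1 = quad(lambda u: phi(2, u), 0, 1)[0]
    m2 = quad(lambda u: phi(2, u)**2, 0, 1)[0]
    print(round(m1, 6), round(m2, 6))
```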

17. QUARTILE COMPONENTS AND CHI-SQUARE. Quartile diagnostics of the
null hypothesis $H_0$ are provided by components with quartile "square wave" score functions
$$SQ_1(u) = -2^{.5},\ 0 < u < .25; \quad = 0,\ .25 < u < .75; \quad = 2^{.5},\ .75 < u < 1;$$
$$SQ_2(u) = 1,\ 0 < u < .25; \quad = -1,\ .25 < u < .75; \quad = 1,\ .75 < u < 1;$$
$$SQ_3(u) = 0\ \text{if}\ 0 < u < .25\ \text{or}\ .75 < u < 1; \quad = -2^{.5},\ .25 < u < .5; \quad = 2^{.5},\ .5 < u < .75.$$

A chi-squared portmanteau statistic, which is chi-squared(3), is
$$CQ_k = \{(N-1)\, p_{.k}/(1 - p_{.k})\} \sum_{i=1}^{3} |T_k(SQ_i)|^2
= \{(N-1)\, p_{.k}/(1 - p_{.k})\} \int_0^1 \{dQ_k(u) - 1\}^2\, du,$$
defining the quartile density (for $i = 1, 2, 3, 4$)
$$dQ_k(u) = 4\{D_k^\sim(i(.25)) - D_k^\sim((i-1)(.25))\}, \qquad (i-1)(.25) < u < i(.25).$$

A pooled portmanteau chi-squared statistic is
$$CQ = \sum_{k=1}^{c} (1 - p_{.k})\, CQ_k.$$

18. DIVERSE STATISTICS AVAILABLE TO TEST EQUALITY OF c SAMPLES.
The problem of statistical inference is not that we don't have answers to a given question;
usually we have too many answers and we don't know which one to choose. A unified
framework may help determine optimum choices. To compare c samples we can compute
the following functions and statistics:

1) comparison densities $d_k^\sim(u)$;

2) comparison distributions $D_k^\sim(u)$;

3) quartile comparison density $dQ_k(u)$, quartile density chi-square
$$CQ_k = \{(N-1)\, p_{.k}/(1 - p_{.k})\} \int_0^1 \{dQ_k(u) - 1\}^2\, du;$$

4) non-parametric regression smoothing of $d_k^\sim(u)$ using a boundary Epanechnikov kernel,
denoted $d_k^\wedge(u)$;

5) Legendre components and chi-squares up to order 4, defined using definition (1) of $T_k^\sim$:
$$TL_k(i) = T_k^\sim(\phi_{Li}),$$
$$CL_k(m) = \sum_{i=1}^{m} |TL_k(i)|^2,$$
$$CL(m) = \sum_{k=1}^{c} (1 - p_{.k})\, CL_k(m),$$
$$AD_k = \sum_{i=1}^{\infty} |TL_k(i)|^2 \big/ i(i+1),$$
$$AD = \sum_{k=1}^{c} (1 - p_{.k})\, AD_k;$$

6) cosine components and chi-squares up to order 4:
$$TC_k(i) = T_k^\sim(\phi_{Ci}),$$
$$CC_k(m) = \sum_{i=1}^{m} |TC_k(i)|^2,$$
$$CC(m) = \sum_{k=1}^{c} (1 - p_{.k})\, CC_k(m),$$
$$CM_k = \sum_{i=1}^{\infty} |TC_k(i)|^2 \big/ (i\pi)^2,$$
$$CM = \sum_{k=1}^{c} (1 - p_{.k})\, CM_k;$$

7) Hermite components and chi-squares up to order 4:
$$TH_k(i) = T_k^\sim(\phi_{Hi}),$$
$$CH_k(m) = \sum_{i=1}^{m} |TH_k(i)|^2,$$
$$CH(m) = \sum_{k=1}^{c} (1 - p_{.k})\, CH_k(m);$$

8) density estimators $d_k^\wedge(u)$ computed from components up to order 4;

9) entropy measures with penalty terms which can be used to determine how many
components to use in the above test statistics.

19. EXAMPLES OF DATA ANALYSIS. The interpretation of the diversity of statis-
tics available is best illustrated by examples.

In order to compare our methods with others available we consider data analyzed by
Boos (1986) on ratio of assessed value to sale price of residential property in Fitchburg,
Mass., 1970. The samples (denoted I, II, III, IV) represent dwellings in the categories
single-family, two-family, three-family, four or more families. The sample sizes (54, 43,
31, 28) are proportions .346, .276, .199, .179 of the size 156 of the pooled sample. We
compute Legendre, cosine, Hermite components up to order 4 of the 4 samples; they are
asymptotically standard normal. We consider components greater than 2 (3) in absolute
value to be significant (very significant).

Legendre, cosine, and Hermite components are very significant only for sample I,
order 1 (-4.06, -4.22, -3.56 respectively). Legendre components are significant for sample
IV, orders 1 and 2 (2.19, 2.31). Cosine components are significant for sample IV, orders 1
and 2 (2.36, 2.23) and sample III, order 1 (2.05). Hermite components are significant for
sample IV, orders 2 and 3 (2.7 and -2.07).
Conclusions are that the four samples are not homogeneous (do not have the same distribu-
tions). Samples I and IV are significantly different from the pooled sample. Estimators
of the comparison density show that sample I is more likely to have lower values than the
pooled sample, and sample IV is more likely to have higher values. While all the statistical
measures described above have been computed, the insights are provided by the linear rank
statistics of orthogonal polynomials rather than by portmanteau statistics of Cramer-von
Mises or Anderson-Darling type.
20. CONCLUSIONS. The goal of our recent research (see Parzen (1979), (1983))
on unifying statistical methods (especially using quantile function concepts) has been to
help the development of both the theory and practice of statistical data analysis. Our
ultimate aim is to make it easier to apply statistical methods by unifying them in ways
that increase understanding, and thus enable researchers to more easily choose methods
that provide greatest insight for their problem. We believe that if one can think of several
ways of looking at a data analysis one should do so. However to relate and compare the
answers, and thus arrive at a confident conclusion, a general framework seems to us to be
required.
One of the motivations for this paper was to understand two-sample tests of the
Anderson-Darling type; they are discussed by Pettitt (1976) and Scholz and Stephens
(1987). This paper provides new formulas for these test statistics based on our new def-
inition of sample comparison density functions. Asymptotic distribution theory for rank
processes defined by Parzen (1983) is given by Aly, Csorgo, and Horvath (1987); an excel-
lent review of theory for rank processes is given by Shorack and Wellner (1986).

However one can look at k sample Anderson-Darling statistics as a single number
formed from combining many test statistics called components. The importance of com-
ponents is also advocated by Boos (1986), Eubank, La Riccia, and Rosenstein (1987) and
Alexander (1989). Insight is greatly increased if instead of basing one's conclusions on
the values of single test statistics, one looks at the components and also at graphs of the
densities of which the components are linear functionals corresponding to various score
functions. The question of which score functions to use can be answered by considering
the tail behavior of the distributions that seem to fit the data.

REFERENCES

Alexander, William (1989) "Boundary kernel estimation of the two-sample comparison
density function" Texas A&M Department of Statistics Ph.D. thesis.

Aly, E.A.A., M. Csorgo, and L. Horvath (1987) "P-P plots, rank processes, and Chernoff-
Savage theorems" In New Perspectives in Theoretical and Applied Statistics (ed. M.L.
Puri, J.P. Vilaplana, W. Wertz) New York: Wiley 135-156.

Boos, Dennis D. (1986) "Comparing k populations with linear rank statistics" Journal of
the American Statistical Association, 81, 1018-1025.

Eubank, R.L., V.N. La Riccia, R.B. Rosenstein (1987) "Test statistics derived as compo-
nents of Pearson's Phi-squared distance measure" Journal of the American Statistical
Association, 82, 816-825.

Parzen, E. (1979) "Nonparametric statistical data modeling" Journal of the American
Statistical Association, 74, 105-131.

Parzen, E. (1983) "FunStat quantile approach to two-sample statistical data analysis"
Texas A&M Institute of Statistics Technical Report A-21 April 1983.

Pettitt, A.N. (1976) "A two-sample Anderson-Darling statistic" Biometrika, 63, 161-168.

Scholz, F.W. and M.A. Stephens (1987) "k-sample Anderson-Darling tests" Journal of the
American Statistical Association, 82, 918-924.

Shorack, Galen and Jon Wellner (1986) Empirical Processes With Applications to Statistics
New York: Wiley.

For samples I and IV, sample comparison distribution function $D^\sim(u)$:

  [plots: "Housing Value/Price FOUR FAMILY ASSESS" and "Housing Value/Price
  SINGLE FAMILY ASSESS"; not legible in this scan]

For samples I and IV, sample comparison density $d^\sim(u)$, sample quartile density $dQ^\sim(u)$
(square wave), nonparametric density estimator $d^\wedge(u)$:

  [plots: "Housing Value/Price FOUR FAMILY ASSESS" and "Housing Value/Price
  SINGLE FAMILY ASSESS"; not legible in this scan]

For samples I and IV, Legendre, cosine (x's), and Hermite (o's) orthogonal polynomial
estimators of order 4 of the comparison density, denoted $d_4^\wedge(u)$, compared to sample
quartile density $dQ^\sim(u)$:

  [plots not legible in this scan]
Reliability of the M256 Chemical Detection Kit

David W. Webb & Linda L.C. Moss

U.S. Army Ballistic Research Laboratory

Abstract

The U.S. Army uses the M256 Chemical Detection Kit (CDK) to indicate the presence
or absence of certain agents in the battlefield, which is indicated by a color change on the kit.
Strength of response is also influenced by the quantity of agent. Lots must meet reliability
specifications to be considered "battle-ready". How do we go about collecting and analyzing
our data so as to evaluate its reliability? Other problems of interest include quantifying how
the agent quantity affects the response and if there are differences between the two manufac-
turers of the M256 CDK. Consultants at the Ballistic Research Laboratory have employed a
dose-response framework to study the reliability problem. We use a binary response
(present/not present) and assume a lognormal distribution in arriving at a response curve for
each lot. Assessments of our approach and suggestions for alternative approaches are asked
of the panel.
Description of Kit

The M256 Chemical Detection Kit (CDK) is used to detect the presence/absence of
dangerous concentrations of toxic agents by color-changing chemical reactions. Each CDK
contains twelve samplers, which are the actual testing devices. Four types of agents can be
detected with the CDK. The tests indicate

a) if it is permissible to remove the protective mask following an agent attack,
b) if agent is present in the air or on surfaces suspected of contamination,
c) if any agent is present after decontamination operations.

The U.S. Army requires that the samplers exhibit at least a 92.5% reliability (with 90%
confidence) in responding to agent concentrations at the specification levels. However, the
kit should not be so sensitive that soldiers wear their mask at safe levels of concentration,
thereby interrupting other battlefield duties.

On the back of each sampler are complete instructions for testing and colored examples
of safe and danger responses. After performing the test, a paper test spot is checked for any
change of color. The color change will not usually be an exact match with the colors shown
on the back of the sampler. This is because the response depends upon the agent quantity.
To make matters more complex, when the agent is present the observed response may be
nonuniform with a few shades of the danger response showing.

Test Conditions & Restrictions


The lots of kits differ in manufacturer (A or B), age (1-8 years), and storage site (8 sites
in the United States and Europe). Not all combinations of these three factors are
represented in the design matrix; in fact, the design matrix is very sparse. For example, there
was only one lot that was eight years old.
Most lots contain ten or more kits (therefore, 120 or more individual samplers). Some
lots contained as many as 1000 kits, while others had as few as one kit.
We are restricted in the number of samplers that may be tested at any time, since the
test chamber is large enough to hold only six samplers. Another restriction lies in the fact that
testing laboratories are only available for the length of time designated in the work contract.
This usually is no more than two months.

The test equipment that controls the concentration of agent in the test chamber is very
accurate and precise, but it is slow. It may take about an hour to change to a higher concentration.
When going from a high to a low concentration, the waiting period may be several
hours since the high concentration tends to leave a residual amount of agent in the test
chamber.

We have decided to evaluate each agent and the chosen lots separately. From each
manufacturer, we have selected one lot from the available age groups. Also, we have tried to
choose lots of similar age from the manufacturers so that they can be paired and we can look
for general trends. In all, we have chosen fifteen lots ranging in age from 1 to 8 years.
Although the sites are in varying climatic areas, most of the warehouses are humidity and
temperature controlled; therefore the locations are treated as homogeneous. Differences
existing between manufacturers are not considered in our initial design, but will be addressed
later.
We have taken the route of estimating the reliability of each lot at the specification level
of each agent. We have also chosen a dose-response type experiment, where our dose is the
agent concentration and the response is safe/danger. For the purpose of determining
response, U.S. Army manuals specify a set of nine color chips that progressively range from
the "safe" color to the "danger" color. The manual also states a cutoff color for the Bernoulli
response. (In most cases, color chips 1-3 correspond to a safe response, while chips 4-9 are
considered danger responses.)
We have made the assumption that the response curves follow that of the lognormal
cumulative distribution function with unknown mean and standard deviation. The lognormal
was selected based on historical precedent, although we note that the log-logistic would have
also been a reasonable choice.
To choose the concentration levels at which to run the tests, we have considered several
candidate sequential designs. In light of some of our restrictions, however, none of these
would be very practical (e.g., Robbins-Monro would have required too much laboratory
time).
Instead, we have chosen a two-stage "semi"-fixed design. In the first stage, 11 samplers
are tested at seven different levels: one concentration level set at an estimated mean, three
concentration levels above this estimated mean, and three concentration levels below the
estimated mean, each being a multiple of the standard deviation away from the mean. Mean
and standard deviation estimates are based on the results of a pretest (which for the purpose
of brevity is deleted from this presentation). The multiple of the standard deviation is chosen
so that the specification level will be covered by the seven test concentrations.

Stage I

Concentration        Number of Samplers
μ̂₁ − 3kσ̂₁                   1
μ̂₁ − 2kσ̂₁                   1
μ̂₁ − kσ̂₁                    2
μ̂₁                          3
μ̂₁ + kσ̂₁                    2
μ̂₁ + 2kσ̂₁                   1
μ̂₁ + 3kσ̂₁                   1
Total                       11

Note: k is chosen so that the seven test concentrations cover the specification level. μ̂₁ and σ̂₁
come from the pretest.
At the conclusion of Stage I, the data are analyzed using the DiDonato-Jarnagin maximum
likelihood estimation algorithm to produce new estimates of the parameters, μ̂₂ and
σ̂₂. In Stage II, nine more units are tested at five concentration levels: one level set at the
new estimated mean, and two levels above and two below this, each now being a multiple of the
new standard deviation from the mean.

Stage II

Concentration        Number of Samplers
μ̂₂ − 2σ̂₂                    1
μ̂₂ − σ̂₂                     2
μ̂₂                          3
μ̂₂ + σ̂₂                     2
μ̂₂ + 2σ̂₂                    1
Total                        9

At the conclusion of Stage II, the parameter estimates for the lot are re-evaluated using
all 20 data points, giving us a final μ̂ and σ̂. With these final estimates, the .925 quantile is
estimated by μ̂ + z(.925)σ̂.

By taking the variance of the above equation, we get an estimate of the variance of the
.925 quantile,

Var(μ̂) + (z(.925))² Var(σ̂) + 2 z(.925) Cov(μ̂, σ̂).

(The DiDonato-Jarnagin algorithm gives the values of the variances and covariance term.) If
the one-sided 90% upper confidence limit of the .925 quantile is less than the specification
concentration, then we can conclude that the lot meets the requirement for that particular
agent.
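As a concreteness check, the final computation can be sketched in a few lines of code. This is an illustration only, not the production analysis: the estimates, variances, covariance, and specification level below are hypothetical numbers standing in for the DiDonato-Jarnagin output, and the log-scale bookkeeping follows the lognormal assumption stated above (μ̂ and σ̂ being the mean and standard deviation of log concentration).

```python
from math import sqrt, exp
from scipy.stats import norm

# Hypothetical MLE output for one lot on the log-concentration scale
# (stand-ins for the DiDonato-Jarnagin algorithm's output).
mu_hat, sigma_hat = -1.20, 0.35
var_mu, var_sigma, cov_mu_sigma = 0.010, 0.006, 0.002

z = norm.ppf(0.925)                      # standard normal .925 quantile

q_hat = mu_hat + z * sigma_hat           # estimated .925 quantile (log scale)
var_q = var_mu + z**2 * var_sigma + 2 * z * cov_mu_sigma

# One-sided 90% upper confidence limit; exponentiate to return to
# concentration units under the lognormal assumption.
ucl = exp(q_hat + norm.ppf(0.90) * sqrt(var_q))
spec = 0.50                              # hypothetical specification level
print("lot meets requirement" if ucl < spec else "lot fails requirement")
```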
We do not have a statistical technique per se for detecting significant differences
between manufacturers or sites. Our "approach" would be to simply look for any obvious
trends or differences. To study the age issue, a separate accelerated life test will be con-
ducted at a later date.

Questions
1. Is our approach appropriate for determining an extreme quantile?
2. Can one estimate a quantile when considering more than two possible responses (e.g.,
the nine color chips)?
3. How might we statistically compare the reliability of the manufacturers (or sites)?

Concluding Remarks
Following our presentation, we heard comments and suggestions from the clinical session
panelists and audience. Two major concerns were expressed by several persons. First
was uneasiness towards our assumption of a lognormal distribution. Some respondents felt
this to be a potentially dangerous assumption, especially since we are estimating the tail of
our distribution. Secondly, some persons questioned our method of estimating the mean of
the distribution, and then extrapolating to the .925 quantile. These two problems could lead
to some very erroneous conclusions.
In general, the comments we heard confirmed our beliefs that this is a very difficult
problem to analyze, in light of the small sample sizes and other laboratory constraints to
which the test is subjected. Although no definitive alternative approaches arose from our discussions,
some possible attacks that were suggested to us included:

1. Sampling more towards the tails of the distribution.

2. Isotonic regression.

3. Testing at the specification level and employing a general linear model approach
with the color chip number corresponding to the color change as the response and
age, manufacturer, and storage site as variables.
We would like to thank the panelists and audience for their many suggestions and
remarks.

COMPARISON OF RELIABILITY CONFIDENCE INTERVALS

Paul H. Thrasher
Engineering and Analysis Branch
Reliability, Availability, and Maintainability Division
Army Materiel Test and Evaluation Directorate
White Sands Missile Range, New Mexico 88002-5175

ABSTRACT

Some confidence intervals on reliabilities are investigated. Only


binomial events are considered. Only the narrowest two-sided and the upper
one-sided confidence intervals are calculated. Three methods of estimating
the distribution of reliabilities are reviewed and compared. These are the
Fisherian approach, the Bayesian approach with the ignorant prior, and the
Bayesian approach with the noninformed or noninformative binomial prior. Both
the width and location of the confidence intervals differ for these three
methods.

INTRODUCTION

Reliability estimates are not as straightforward as might be expected.


Measurement of a number of successes x in a sample size n quickly leads to a
point estimate of the reliability R equal to x/n. Estimates of confidence
intervals are more difficult to obtain, however. Two things in addition to the
data are needed for confidence interval estimation. First, some function must
be used to describe the reliabilities. Second, a method must be selected to
locate the confidence interval within the function.

The purpose of this paper is to compare various functions describing


reliabilities. For simplicity, all tested items will be assumed dichotomous
and independent. That is, the binomial b(x;n,R) is assumed to describe the
random variable x if n and R are known. The problem is to select a function
for R when x and n are known. The three functions considered here are based
on (1) the Fisherian approach and (2) the Bayesian technique using prior
distributions of R when (A) R is equally likely to be any value between zero
and one and (B) R is unknown numerically but it is known to be a binomial
parameter.

To focus attention on the comparison of the confidence intervals from


these three functions, the methods used to locate the confidence intervals are
restricted in this paper. Only two methods are used in calculations; one is
one-sided and the other is two-sided.

The one-sided confidence interval considered is the upper confidence
interval. This is based on the premise that having a reliability too low is
much more serious than the reliability being too high.

The two-sided confidence interval considered is the narrowest possible


(Rankin). This is illustrated in Figure 1. It is located by adjusting the
confidence limits until (1) the sum of the areas under the tails is α and (2)
the functions of these two limits are equal. This correspondence of narrowest
interval with equal heights is a geometrical property. It is not based on the
choice of the function describing R. It may be demonstrated by (1) starting
with the confidence limits at points of equal height, (2) moving the left
confidence limit to the right, and (3) noting that the right limit has to be
moved further to the right in order to keep the sum of the areas under the
tails constant. This is shown in Figure 1 by the dashed lines. A similar
argument starts by moving the left confidence limit to the left.
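To make the geometry concrete, here is a small numerical sketch of the narrowest-interval construction for a Beta density (the form all three methods below lead to). It is an illustration under assumed Beta parameters, not code from the paper: it slides the lower tail area, solves for the upper limit that keeps the total tail area at α, and picks the pair with minimum width (equivalently, approximately equal density heights).

```python
import numpy as np
from scipy.stats import beta

def narrowest_interval(a, b, alpha=0.20, grid=2000):
    """Narrowest 100(1-alpha)% interval for a Beta(a, b) density.

    Scans lower tail areas p in (0, alpha); for each, the interval
    [ppf(p), ppf(p + 1 - alpha)] has total tail area alpha.  The
    narrowest such interval has (approximately) equal density heights.
    """
    p = np.linspace(1e-6, alpha - 1e-6, grid)
    lo = beta.ppf(p, a, b)
    hi = beta.ppf(p + 1 - alpha, a, b)
    i = np.argmin(hi - lo)
    return lo[i], hi[i]

# Example: x = 9 successes in n = 10 trials with the uniform (ignorant)
# prior discussed later gives a Beta(x+1, n-x+1) = Beta(10, 2) density.
lo, hi = narrowest_interval(10, 2)
print(f"narrowest 80% interval: ({lo:.3f}, {hi:.3f})")
print("density heights:", beta.pdf(lo, 10, 2), beta.pdf(hi, 10, 2))
```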

Other possible two-sided confidence limits, not calculated in this paper,


are illustrated in Figures 2 and 3. These are the traditional equal-division-
of-area-under-the-tails interval and the maximum-likelihood-estimator-in-the-
center interval. The first is the easiest to calculate. The second has a
symmetric appeal but it is non-existent when the peak of the curve is not at
R = 0.5 and 1 − α is sufficiently large.

FISHERIAN APPROACH

The traditional Fisherian approach (Mann, Schafer, and Singpurwalla)


considers sums of binomial probabilities. This approach yields two Beta
functions. The lower confidence limit is obtained from one Beta function; a
second function is needed for the upper limit.

Lower Confidence Limit:

The lower (1−α)100% confidence limit R̲ is defined by P[R > R̲] = 1 − α. An
alternate expression is P[R ≤ R̲] = α. The limit R̲ is the largest value of R that
makes the data x and n plausible. Plausibility is defined as satisfaction of
the degree of confidence 1 − α of correctly selecting the right R. The lower
100% confidence limit of R is R̲ = 0 because all values of R satisfy R ≥ 0.
Increasing R̲ requires a decrease in 1 − α or an increase in α. This increase in
R̲ shifts the binomial distribution of the possible measurements i which
resulted in the single measurement x. For the limiting case of R̲ = 0, the
binomial b(i; n, R̲) consists of a single spike of unit height at i = 0. As R̲ and
α increase, b(i; n, R̲) takes the shape illustrated in Figure 4 and described by

$$b(i; n, \underline{R}) = \binom{n}{i} \underline{R}^{\,i} (1-\underline{R})^{\,n-i},$$

where the number of ways of obtaining i successes in n trials is found from

$$\binom{n}{i} = \frac{n!}{i!\,(n-i)!} = \frac{n(n-1)\cdots(n-i+1)}{i(i-1)\cdots 1}.$$

The extent of the shifting from the single spike is determined by the data
x and n. The value of R̲ is determined in two steps. First, R̲ is increased
until the sum b(x;n,R̲) + b(x+1;n,R̲) + ... + b(n;n,R̲) equals the probability α, making
the confidence relation P[R ≤ R̲] = α or P[R > R̲] = 1 − α untrue. Second, the continuous
variable R̲ is decreased infinitesimally, making the confidence relation
P[R > R̲] = 1 − α just barely valid. Thus R̲ is neither too large nor too small to be
a (1−α)100% lower confidence limit on R when

$$\alpha = \sum_{i=x}^{n} \binom{n}{i} \underline{R}^{\,i} (1-\underline{R})^{\,n-i}.$$

The extraction of R̲ from this equation can be facilitated by using a Beta
function as described in the following paragraph. Before doing that, however,
it is expedient to note that a measurement of x = 0 implies that R̲ = 0 for all
values of α. This special case isn't algebraically included in the following
Beta function. It is adroitly described by an argument based on Figure 4:
when x = 0, R̲ has to be 0 to make b(0;n,R̲) = 1 and b(i;n,R̲) = 0 for all i ≠ 0.

The Beta function of R is

$$f(R) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, R^{a-1} (1-R)^{b-1},$$

where a and b are parameters. Using the equality of the gamma function Γ(j)
and the factorial (j−1)! when j is an integer yields

$$f(R) = \frac{(a+b-1)!}{(a-1)!\,(b-1)!}\, R^{a-1} (1-R)^{b-1}.$$

Postulating that the reliability is described by f(R) and setting the area to
the left of R̲ at α yields

$$\alpha = \int_0^{\underline{R}} f(R)\, dR.$$

Repeated integrations by parts yield

$$\alpha = \sum_{i=a}^{a+b-1} \binom{a+b-1}{i} \underline{R}^{\,i} (1-\underline{R})^{\,a+b-1-i}.$$

Comparison of this summation and the summation for α in the previous paragraph
yields a = x and a + b − 1 = n. Thus the parameters in the Beta function for the
lower limit R̲ are a = x and b = n + 1 − x.
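The equivalence just derived is what makes the limits easy to compute in practice: the lower limit is a quantile of a Beta(x, n+1−x) distribution, and (as shown in the next section) the upper limit is a quantile of a Beta(x+1, n−x) distribution. A minimal sketch, assuming SciPy is available; the special cases x = 0 and x = n are handled as in the text:

```python
from scipy.stats import beta

def fisherian_limits(x, n, alpha=0.10):
    """Fisherian one-sided limits for reliability R from x successes in n trials."""
    # Lower (1-alpha) limit: alpha-quantile of Beta(x, n+1-x); R = 0 when x = 0.
    lower = 0.0 if x == 0 else beta.ppf(alpha, x, n + 1 - x)
    # Upper (1-alpha) limit: (1-alpha)-quantile of Beta(x+1, n-x); R = 1 when x = n.
    upper = 1.0 if x == n else beta.ppf(1 - alpha, x + 1, n - x)
    return lower, upper

print(fisherian_limits(18, 20))   # e.g., 18 successes in 20 trials
```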

Upper Confidence Limit:

The upper confidence limit R̄ defined by P[R < R̄] = 1 − α is obtained from
another Beta function. Arguments similar to those in the preceding section
yield the upper Beta function in four steps:

(1) R̄ is in the binomial sum

$$\alpha = \sum_{i=0}^{x} \binom{n}{i} \bar{R}^{\,i} (1-\bar{R})^{\,n-i},$$

(2) R̄ is the lower limit of integration over the second Beta function

$$\alpha = \int_{\bar{R}}^{1} f(R)\, dR,$$

(3) repeated integrations by parts transform this integral to the summation

$$\alpha = \sum_{i=0}^{a'-1} \binom{a'+b'-1}{i} \bar{R}^{\,i} (1-\bar{R})^{\,a'+b'-1-i},$$

and

(4) the second Beta function parameters are identified by x = a' − 1 and
n = a' + b' − 1 to be a' = x + 1 and b' = n − x.

This Beta function does not describe R when x = n because Γ(b') = Γ(0) = (0−1)!
is meaningless. For this special case, R̄ = 1 for all α. This may be seen from
a binomial distribution symmetric to Figure 4. Using an R̄ near 1 and an α
containing binomial terms from i = 0 to i = x = n, it is easily seen that α is 1
even when R̄ is 1. Since R is continuous, R̄ = 1 for any value of 1 − α.

BAYESIAN APPROACH

The Bayesian approach (Martz and Waller) uses the data x and n to update
a prior distribution g(R) describing R to a posterior distribution g(R|x)
describing R after x is given. The algebraic relation between these two is
based on the equality of the joint density h(x,R) to both the product
g(R|x)f(x) and the product f(x|R)g(R). Thus the posterior is found from

g(R|x) = f(x|R) g(R) / f(x).

This expression is simplified by noting that (1) the conditional density of x
given R is

$$f(x|R) = b(x; n, R) = \binom{n}{x} R^{x} (1-R)^{n-x}$$

and (2) the marginal density f(x) from the integral of h(x,R) = f(x|R)g(R) is

$$f(x) = \int f(x|R)\, g(R)\, dR = \int \binom{n}{x} R^{x} (1-R)^{n-x} g(R)\, dR.$$

Thus the general posterior is

$$g(R|x) = \frac{R^{x} (1-R)^{n-x}\, g(R)}{\int_0^1 R^{x} (1-R)^{n-x}\, g(R)\, dR}.$$

Ignorant Prior:

One prior that can be used is the uniform distribution g(R) = 1 for 0 ≤ R ≤ 1
and g(R) = 0 elsewhere. This is sometimes called the ignorant prior because all
values of R between 0 and 1 are equally likely. That is, there is no evidence
to favor the selection of any value of R over any other R between 0 and 1.

Use of this prior in the general posterior yields

$$g(R|x) = \frac{R^{x} (1-R)^{n-x}}{\int_0^1 R^{x} (1-R)^{n-x}\, dR}.$$

Integration by parts evaluates the denominator. The posterior is thus

$$g(R|x) = \frac{\Gamma(n+2)}{\Gamma(x+1)\,\Gamma(n-x+1)}\, R^{(x+1)-1} (1-R)^{(n-x+1)-1}.$$

This is a Beta function with parameters a = x + 1 and b = n − x + 1.

Noninformed Prior:

A second prior that can be used recognizes that the reliability is a
binomial parameter but has no information about its value. This is sometimes
called the noninformed or noninformative binomial prior.

Every noninformed prior is based on a transformation making the
probability density insensitive to the data. For the binomial parameter R in
b(x;n,R), it has been empirically found (Box and Tiao) that plots of
K(x,n)b(x;n,φ) versus φ yield very similar curves for fixed n and different
x's when (1) K(x,n) is determined by numerical integration to make the area
under K(x,n)b(x;n,φ) equal to one and (2) φ is given by

$$\phi = \operatorname{Arcsin}(R^{1/2}).$$

Figures 5 and 6 show that for 0 < x < n these similar curves become nearly equally
spaced along the φ axis as n is increased. The noninformed argument assumes
that all n + 1 curves are essentially equal and equally spaced for all n. This
makes being noninformed about x equivalent to being ignorant about φ. The
prior assumption that (1) x is unknown but (2) the situation is described by
one of these curves thus leads to a prior distribution of φ that is
uniform between φ = 0° and φ = 90°. The corresponding prior of R may be found
from the transformation of variable technique (Freund and Walpole) by applying

$$g(R) = h(\phi) \left| \frac{d\phi}{dR} \right|.$$

Using h(φ) = 1 and sin(φ) = R^{1/2} in this equation yields g(R) = 1/{2[R(1−R)]^{1/2}}.

Use of this binomial noninformed prior in the general posterior yields

$$g(R|x) = \frac{R^{x-1/2} (1-R)^{n-x-1/2}}{\int_0^1 R^{(x+1/2)-1} (1-R)^{(n-x+1/2)-1}\, dR}.$$

The denominator is recognized as an integral over a Beta function. It is
evaluated to be Γ(x+1/2)Γ(n−x+1/2)/Γ(n+1). The posterior is thus found to be
a Beta function with a = x + 1/2 and b = n − x + 1/2.
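For comparison with the Fisherian limits sketched earlier, both Bayesian intervals come straight from Beta posterior quantiles. A minimal sketch, again assuming SciPy; the prior choice simply changes the Beta parameters:

```python
from scipy.stats import beta

def bayes_lower_limit(x, n, alpha=0.10, prior="ignorant"):
    """Lower limit of the upper one-sided (1-alpha) interval from the Beta posterior."""
    if prior == "ignorant":      # uniform prior: Beta(x+1, n-x+1) posterior
        a, b = x + 1, n - x + 1
    else:                        # noninformed binomial prior: Beta(x+1/2, n-x+1/2)
        a, b = x + 0.5, n - x + 0.5
    return beta.ppf(alpha, a, b)     # the upper limit of the interval is 1

for prior in ("ignorant", "noninformed"):
    print(prior, bayes_lower_limit(18, 20, prior=prior))
```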

COMPARISON OF CONFIDENCE INTERVALS

The three methods reviewed in the previous sections have been applied to
confidence intervals on reliability. Both two-sided and one-sided intervals
have been investigated.

Narrowest Two-Sided Intervals:

Figures 7 through 15 show distributions and narrowest two-sided 80%


confidence intervals. Figure 7 illustrates the symmetry about x = n/2. Thus
graphs for x < n/2 are not needed to investigate trends. Figure 8 is one
example of the destruction of symmetry by making x > n/2. Figure 9 shows that
when x = n the symmetry is so completely destroyed that the narrowest two-sided
intervals are actually upper one-sided intervals. Figures 10, 11, and 12 and
Figures 13, 14, and 15 show the effect of increasing n: for fixed x, the
confidence intervals all become narrower but the relationship of the
Fisherian, ignorant, and noninformed intervals retains an order.

The effect of changing x for fixed n is seen to be a change in the order
of the Fisherian, ignorant, and noninformed intervals. The Fisherian interval
seems to be the widest. For x near n/2, the ignorant interval seems to be
narrower than the noninformed interval. For x near n, however, the noninformed
interval seems to be the narrowest of the three.

Upper One-Sided Intervals:

Figures 16 through 31 show distributions and upper one-sided 90%
confidence intervals on reliability. The lower confidence limit appears lower
for the Fisherian analysis than for the Bayesian analyses. The Bayesian
ignorant and noninformed priors seem to lead to two sets of results. The
lower confidence limit appears lower for the noninformed when x is near n/2
but higher for the noninformed when x is near n.

The symmetry of the Beta functions makes the lower confidence limits for
x near 0 such that the Fisherian is lowest, the noninformed Bayesian is next
lowest, and the ignorant Bayesian is the highest of the three. This is shown
in Figures 25 through 28. These figures and Figures 29 through 31 also show
that large n leads to fairly close agreement between the three methods.

CONCLUSION

The three methods are all on sound theoretical ground but give different
results. No single method provides the most logical confidence intervals. The
choice between methods has to be based on goals and philosophy. Since the
Fisherian method leads to the widest confidence intervals, it is the most
conservative approach. Since proponents of the Bayesian method prefer priors
which contain more information than the ignorant or noninformed prior, the
Bayesian method (without a prior based on previous tests/calculations) does
not meet all the goals of analysts with a Bayesian philosophy. Thus the
Fisherian method seems to be a good, conservative method for the initial
analysis. This initial analysis can provide a prior for a future Bayesian
analysis of additional data from a future test.

REFERENCES

Rankin, Donald W., "Estimating Reliability from Small Samples," Proceedings of
the Twenty-Second Conference on the Design of Experiments in Army Research,
Development, and Testing, Army Research Office.

Mann, Nancy R., Schafer, Ray E., and Singpurwalla, Nozer D., Methods for
Statistical Analysis of Reliability and Life Data, John Wiley & Sons, New
York, 1974.

Martz, Harry F., and Waller, Ray A., Bayesian Reliability Analysis, John Wiley
& Sons, New York, 1982.

Box, George E.P., and Tiao, George C., Bayesian Inference in Statistical
Analysis, Addison-Wesley, Reading, Massachusetts, 1973.

Freund, John E., and Walpole, Ronald E., Mathematical Statistics, Prentice-
Hall, Inc., Englewood Cliffs, New Jersey, 1980.

[Figures 1-31 are not reproduced; only unrecoverable plot residue remained at this point in the source. Per the text: Figures 1-3 illustrate the narrowest, equal-tail-area, and MLE-centered two-sided intervals; Figure 4 the binomial b(i; n, R̲); Figures 5-6 the Box-Tiao curves K(x,n)b(x;n,φ); Figures 7-15 the narrowest two-sided 80% confidence intervals; and Figures 16-31 the upper one-sided 90% confidence intervals.]
Environmental Sampling: A Case Study

Dennis L. Brandon
US Army Engineer
Waterways Experiment Station
Vicksburg, Mississippi 39180

Abstract. Sampling strategies have been developed to accomplish
various environmental objectives. The objectives may be: (1) to
estimate the average of characteristics in a population; (2) to estimate
the variability of characteristics of interest in a population; (3) to
decide if characteristics of interest in a population meet certain
standards or criteria; (4) to identify the source(s) which caused
characteristics in a population to exceed standards. A study designed to
achieve objectives 3 and 4 will be presented. Modifications and
alternate approaches will also be discussed.

Background. Navigable waterways of the United States have played and will
continue to play a vital role in the nation's development. The Corps, in
fulfilling its mission to maintain, improve, and extend these waterways,
is responsible for the dredging and disposal of large volumes of sediment
each year. Nationwide, the Corps dredges about 230 million cubic yards
in maintenance work and about 70 million cubic yards in new dredging
annually at a cost of about $450 million. In accomplishing its national
dredging and regulatory missions, the Corps has conducted extensive
research and development in the field of dredged material management.
Federal expenditures on dredged material research, monitoring, and
management activities have cumulatively exceeded $100 million.
Techniques developed to evaluate contaminant mobility in dredged material
can be applied to other contaminated areas. Accordingly, the plant and
animal bioassays are two techniques developed to assess the environmental
impact of dredged material in wetland and upland disposal environments.
These bioassays, surface soil samples, groundwater samples, and
additional plant tissues were used to evaluate a contaminated site in
western California.

The case study site is approximately 200 acres with both upland and
wetland areas (see Figure 1). This site was known to have very high
concentrations of metals in surface soils. Major pathways for
contaminant mobility are the meandering stream which flows north and the
drainage ditches. Also, tidal inundation affects a substantial portion
of this site.

The objectives of the study were to: (1) define the extent of the
hazardous substance contamination on the site; (2) identify the sources
of the hazardous substances detected on the property; (3) evaluate the
extent of migration of the hazardous substances on the property; (4)
assess the bioavailability, mobility, and toxicity of the hazardous
substances detected on the property; (5) evaluate the condition of the
wetland and upland habitats on the property. This paper focuses on the
use of soil samples to achieve objectives 1 through 4.

Figure 1. Map of Study Area
SAMPLING PLAN. The sampling plan was formulated based on previous
soil and water data, historical information, and the potential pathways
for contaminant mobility. The sampling locations are shown in Figure 1.
Three samples were collected at some locations and one sample was
collected at the other locations. The triplicate samples were used in
statistical comparisons. This sampling plan reduced the cost of the
investigation by allowing a selected number of sample locations to be
tested extensively while other sample locations received one-third the
cost and effort. A total of 178 samples were collected and analyzed for
As, Cd, Cu, Pb, Ni, Se, and Zn.
There is an analogy between the strategy used here and the disposal
philosophy of many Corps elements. Most dredging and disposal decisions
are made at the local level on a case by case basis. Often, the
environmental objective is to prevent further degradation of the disposal
area. Therefore, samples are collected at the dredge site and disposal
site. A statistical evaluation performed on the chemical analysis of the
samples becomes the basis for determining whether degradation will occur.
In this study, samples were collected at the remote reference area and an
area of degradation (i.e., contamination). Ten triplicate samples were
collected in the remote reference area. Twenty-eight triplicate samples
were collected in the area of contamination. Locations having a mean
concentration of metals in soil, plants, or animals statistically greater
than similar data from all remote reference locations were declared
contaminated. These concentrations provide a judgemental basis for
classifying the 64 single sample locations.

Three sources of contamination were identified from historical
information. One additional source was indicated by the soil analysis
and later verified with historical information. Sources were thought to
be areas with several high metal concentrations in a vicinity and a
gradual decrease in metal concentrations as one moves away from this
area. The sources found in this study appeared to have released metal in
two different forms. One method was to bury or discharge contaminants
associated with a solid material in an area. The other source discharged
highly contaminated liquids into a stream. Identifying sources was
further complicated by the fact that some of the discharges were
intermittent and possibly hadn't occurred in several years. This study
was successful in identifying sources which discharged contaminants
associated with solids. Identifying the source of liquid discharges was
more difficult due to seasonal fluctuation of the stream.
The soil analysis was partially successful in achieving objectives 1
through 4. The extent of contamination from known sources was established
and locations requiring further investigation were identified. This plan
has been augmented with additional sampling. These samples further
delineated the extent of contamination horizontally across the site and
vertically down the soil profile. As a result of this study, 26.5 acres
were declared contaminated.

A Generalized Gumbel Distribution

Siegfried H. Lehnlgk
Research Directorate
Research, Development, and Engineering Center
U.S. Army Missile Command
Redstone Arsenal, AL 35898-5248

A generalized Gumbel (extreme value type I) distribution class is
introduced. In addition to the usual shift and scale parameters this new
distribution contains an arbitrary positive shape parameter. The classical
Gumbel distribution results as a special case for shape equal to unity.
Microcomputer-based algorithms for estimation of the parameters are presented.
They are based on the moment equations and on the logarithmic likelihood
function associated with the distribution density. A program diskette for
microcomputer use will be made available upon request. A combined paper by
this author and Charles E. Hall, Jr., will be published elsewhere.

A Generalization of the Eulerian Numbers with a
Probabilistic Application

Bernard Harris
University of Wisconsin, Madison
C. J. Park
San Diego State University

1 Introduction and Historical Summary

In this paper we study a generalization of the Eulerian numbers and a class of polynomials related
to them. An interesting application to probability theory is given in Section 3. There, we use these
extended Eulerian numbers to construct an uncountably infinite family of lattice random variables
whose first n moments coincide with the first n moments of the sum of n + 1 uniform random
variables. A number of combinatorial identities are also deduced.

The Eulerian numbers are defined by

$$A_{nj} = \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (j-\nu)^{n}, \quad j = 0, 1, 2, \ldots, n; \; n = 0, 1, 2, \ldots \tag{1}$$

They satisfy the recursion

$$A_{nj} = j A_{n-1,j} + (n-j+1) A_{n-1,j-1} \tag{2}$$

and the Worpitzky [26] relation

$$x^{n} = \sum_{j} \binom{x+j-1}{n} A_{nj}. \tag{3}$$

Also,

$$A_{nj} = A_{n,n+1-j}; \tag{4}$$

$$\sum_{j} A_{nj} = n!. \tag{5}$$

In addition, they possess a number of combinatorial interpretations which are described below.

Let X_n = {1, 2, ..., n} and let P_n(k) be the number of permutations of the elements of X_n
having exactly k increases between adjacent elements, the first element always being counted as
an increase.

For n = 4, the 24 permutations and the number of increases are given in Table 1.1.
Permutation    Number of increases
1 1234 4
2 1243 3
3 1324 3
4 1342 3
5 1423 3
6 1432 2
7 2134 3
8 2143 2
9 2314 3
10 2341 3
11 2413 3
12 2431 2
13 3124 3
14 3142 2
15 3214 2
16 3241 2
17 3412 3
18 3421 2
19 4123 3
20 4132 2
21 4213 2
22 4231 2
23 4312 2
24 4321 1

As seen from the tabulation, P₄(1) = 1, P₄(2) = 11, P₄(3) = 11, P₄(4) = 1, which coincides
with A_{4j}, j = 1, 2, 3, 4.
Let

$$A_n(t) = \sum_{j=1}^{n} A_{nj}\, t^{j}. \tag{6}$$

Then

$$\sum_{n=0}^{\infty} A_n(t)\, \frac{x^{n}}{n!} = \frac{1-t}{1-t\,e^{x(1-t)}}. \tag{7}$$

The above relations and some of their properties can be found in [8]; the polynomials (6) are
also discussed in L. Carlitz [4]. These results may also be found in the expository paper of L.
Carlitz [3]. The formulas (1) and (2) are also given in L. v. Schrutka [21].
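Definitions (1)-(5) are easy to sanity-check mechanically. The following short sketch (illustrative only, not part of the original paper) computes A_nj from (1) and verifies the recursion (2), the symmetry (4), and the row sum (5):

```python
from math import comb, factorial

def A(n, j):
    """Eulerian number A_nj from definition (1)."""
    return sum((-1)**v * comb(n + 1, v) * (j - v)**n for v in range(j + 1))

for n in range(1, 8):
    row = [A(n, j) for j in range(1, n + 1)]
    assert sum(row) == factorial(n)          # relation (5)
    assert row == row[::-1]                  # symmetry (4)
    for j in range(2, n + 1):                # recursion (2)
        assert A(n, j) == j * A(n - 1, j) + (n - j + 1) * A(n - 1, j - 1)

print([A(4, j) for j in range(1, 5)])        # [1, 11, 11, 1], matching Table 1.1
```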
Désiré André [1] established that A_{nj} is the number of permutations of {X_n} with j "elementary
inversions". He also established that A_{nj} is the number of circular permutations of {X_{n+1}} with j
"elementary inversions". The equivalence of these two results with the enumeration of the number
of increases in permutations of {X_n} can be trivially established.
G. Frobenius [15] studied the polynomials

$$A_n(x) = \sum_{j} A_{nj}\, x^{j} \tag{8}$$

introduced by Euler, and established many of their properties. In particular, relations with the
Bernoulli numbers are given in [15].
In D. P. Roselle [20], the enumeration of permutations by the number of rises, A_{nj}, is related
to enumeration by the number of successions; that is, a permutation π of {X_n} has a succession if
π(i) = i + 1, i = 1, 2, ..., n.

Some number theoretic properties of A_{nj} are given in L. Carlitz and J. Riordan [7] and in L.
Carlitz [5].
In this paper, we study a generalization of the Eulerian numbers. A generalization in a different
direction was given by E. B. Shanks [22], who apparently did not note a connection of his
coefficients with the Eulerian numbers. L. Carlitz [2] noted the relationship of Shanks's results to
the Eulerian numbers and obtained representations for these generalized Eulerian numbers using
results due to N. Nielsen [17].
F. Poussin [18] considered the enumeration of the number of inversions of permutations of
{X_n} which end in j, 1 ≤ j ≤ n. This produces a decomposition of the Eulerian numbers. She
also introduced a polynomial generating function for these numbers. The sums of these polynomials
are the Euler-Frobenius polynomials.
Another decomposition of the Eulerian numbers with a combinatorial interpretation is given by
J.F. Dillon and D.P. Roselle [12].
J. Riordan [19] lists many properties of the Eulerian numbers in Exercise 2, pages 38-39, and
describes the combinatorial interpretation of the Eulerian numbers in terms of triangular permutations
(which is equivalent to the elementary inversions described by André [1]). He also gives a brief
table of the Eulerian numbers on page 215. See also L. Comtet [10], where generating functions for
the Eulerian numbers are given and the Eulerian numbers are obtained by enumerating the number
of permutations with a specified number of increases. Many properties of the Eulerian numbers are
given as well as their historical origins in terms of sums of powers.
F.N. David and D.E. Barton [11] suggest the use of the Eulerian numbers as a statistical test for
the randomness of a sequence of observations in time, employing the probability distribution given
by

$$p_j = A_{nj}/n!, \quad j = 1, 2, \ldots, n. \tag{9}$$

The generating function (7) is derived and employed to obtain the moments and cumulants
of the distribution (9). In particular, David and Barton show that the factorial moments are the
generalized Bernoulli numbers. However, David and Barton do not make any identification of
these distributions with the Eulerian numbers.
Using probabilistic arguments, Carlitz, Kurtz, Scoville and Stackelberg [6] showed that the
Eulerian numbers, when suitably normalized, have an asymptotically standard normal distribution.
This was accomplished by representing the distribution p_j as the distribution of a sum of independent
Bernoulli random variables. S. Tanny [24] demonstrated the asymptotic normality by utilizing
the relationship of the Eulerian numbers to the distribution of the sum of independent uniform random
variables and applying the central limit theorem.
L. Takács [23] obtained a generalization of the Eulerian numbers which provides the solution
to a specific occupancy problem. Namely, let a sequence of labelled boxes be given, the first box
labelled 1, the second box 2, and so on. At trial number n distribute l balls randomly in the first
n boxes so that the probability that each ball selects a specific box is 1/n and the selections are
stochastically independent. For l = 1, the probability that j − 1 boxes are empty after trial number
n is A_{nj}/n!, j = 1, 2, ..., n. Takács' paper contains many references and describes additional
combinatorial problems whose solution is related to the Eulerian numbers.
Finally, L. Toscano [25] obtained formulas expressing the Eulerian numbers in terms of Stirling
numbers of the second kind.

2 Generalized Eulerian Numbers

We now introduce a generalization of the Eulerian numbers and investigate its properties.
Let δ be an arbitrary real number and let

$$A_{nj}(\delta) = \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (\delta+j-\nu)^{n}, \quad j = 0, 1, \ldots, n; \; n = 0, 1, 2, \ldots \tag{10}$$

These polynomials are mentioned in L. Carlitz, D.P. Roselle and R.A. Scoville [8]. As noted
there, the A_{nj}(0) are the Eulerian numbers. These polynomials are also used by P.S. Dwyer [13] to
calculate sample factorial moments. Dwyer does not relate these to the Eulerian numbers.
We begin our analysis with the following theorem:
Theorem 1. Let n and k be non-negative integers and let δ be any real number. Then

$$\mu_n^{[k]}(\delta) = \frac{1}{n!} \sum_{j=0}^{n} (\delta+j)^{(k)} \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (\delta+j-\nu)^{n} = \frac{1}{n!} \sum_{j=0}^{n} (\delta+j)^{(k)} A_{nj}(\delta) \tag{11}$$

is independent of δ for k = 0, 1, ..., n, where $(z)^{(k)} = z(z-1)\cdots(z-k+1)$ denotes the factorial (falling) power.

Proof. The following identity (see N. Nielsen, [17], page 28) will be utilized in the proof:

$$x^{n} = \sum_{j=0}^{n} \binom{\delta+x-1+j}{n} A_{nj}(\delta). \tag{12}$$

Let Δ and E be the operators defined by

Δ(f(z)) = f(z + 1) − f(z)

and

E(f(z)) = f(z + 1).

Then, it can be shown that (C. Jordan, [16])

$$\Delta^{r} = \sum_{\nu=0}^{r} (-1)^{\nu} \binom{r}{\nu} E^{r-\nu}. \tag{13}$$

In particular, for r = 0, 1, ..., n,

$$\Delta^{r} (\delta+j)^{(n)} = \sum_{\nu=0}^{r} (-1)^{\nu} \binom{r}{\nu} (\delta+j+r-\nu)^{(n)} = n^{(r)} (\delta+j)^{(n-r)}; \tag{14}$$

the last equality follows from elementary properties of the function $(\delta+j)^{(k)}$ (C. Jordan, [16], p. 51). Thus, for r = 0, 1, ..., n,

$$(\delta+j)^{(n-r)} = \frac{1}{n^{(r)}} \sum_{\nu=0}^{r} (-1)^{\nu} \binom{r}{\nu} (\delta+j+r-\nu)^{(n)}. \tag{15}$$

Hence, taking r = n − k in (15),

$$\mu_n^{[k]}(\delta) = \frac{1}{n!\, n^{(n-k)}} \sum_{j=0}^{n} \sum_{\nu=0}^{n-k} (-1)^{\nu} \binom{n-k}{\nu} (\delta+j+n-k-\nu)^{(n)} A_{nj}(\delta). \tag{16}$$

Thus, since $(z)^{(n)} = n!\binom{z}{n}$, it follows that

$$\mu_n^{[k]}(\delta) = \frac{1}{n^{(n-k)}} \sum_{\nu=0}^{n-k} (-1)^{\nu} \binom{n-k}{\nu} \sum_{j=0}^{n} \binom{\delta+n-k-\nu+j}{n} A_{nj}(\delta). \tag{17}$$

Setting x = n − k − ν + 1 in (12) we get

$$\sum_{j=0}^{n} \binom{\delta+n-k-\nu+j}{n} A_{nj}(\delta) = (n-k-\nu+1)^{n},$$

and hence

$$\mu_n^{[k]}(\delta) = \frac{1}{n^{(n-k)}} \sum_{\nu=0}^{n-k} (-1)^{\nu} \binom{n-k}{\nu} (n-k+1-\nu)^{n} \tag{18}$$

and is independent of δ.

In particular, $\mu_n^{[n-1]}(\delta) = (2^{n}-1)/n$, $\mu_n^{[n-2]}(\delta) = (3^{n}-2^{n+1}+1)/n(n-1)$, and $\mu_n^{[n-3]}(\delta) = (4^{n}-3^{n+1}+3\cdot 2^{n}-1)/n(n-1)(n-2)$.

A brief table of $\mu_n^{[k]}(\delta)$ for k = 0, 1, 2, 3 and n = 0, 1, 2, 3 is given in the Appendix to this paper.

The Nielsen identity (12) seems to have been discovered in a somewhat less general context by
Paul S. Dwyer [13], who employed it to calculate factorial moments by means of cumulative sums;
see also Ch. A. Charalambides [9], who in addition to discussing Dwyer's work also showed that
these generalized Eulerian numbers are related to enumeration of compositions of integers.
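Since (11) and (18) are easy to mistranscribe, a quick numerical check is worthwhile. The sketch below (an illustration, not part of the original paper) evaluates μ_n^[k](δ) directly from the definition for several δ and compares it with the closed form (18):

```python
from math import comb, factorial

def A(n, j, d):
    """Generalized Eulerian polynomial A_nj(delta), equation (10)."""
    return sum((-1)**v * comb(n + 1, v) * (d + j - v)**n for v in range(j + 1))

def falling(z, k):
    """Factorial power z(z-1)...(z-k+1)."""
    out = 1.0
    for i in range(k):
        out *= z - i
    return out

def mu(n, k, d):
    """mu_n^[k](delta) from equation (11)."""
    return sum(falling(d + j, k) * A(n, j, d) for j in range(n + 1)) / factorial(n)

def mu_closed(n, k):
    """Closed form (18), valid for 0 <= k <= n."""
    r = n - k
    num = sum((-1)**v * comb(r, v) * (r + 1 - v)**n for v in range(r + 1))
    den = 1
    for i in range(r):
        den *= n - i              # n^(r) = n(n-1)...(n-r+1)
    return num / den

n, k = 5, 3
for d in (0.0, 0.25, 0.7):        # the value should not depend on delta
    print(round(mu(n, k, d), 10), round(mu_closed(n, k), 10))
```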
The following corollary will be subsequently employed.

Corollary. Let n and k be non-negative integers with k ≤ n. Then

$$\mu_n^{(k)}(\delta) = \frac{1}{n!} \sum_{j=0}^{n} (\delta+j)^{k} \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (\delta+j-\nu)^{n} \tag{19}$$

is independent of δ.

Proof. We can write

$$(\delta+j)^{k} = \sum_{h=0}^{k} \beta_{kh}\, (\delta+j)^{(h)}, \tag{20}$$

where the β_{kh} are the Stirling numbers of the second kind. Since the coefficients β_{kh} do not depend on
δ, substituting (20) into (19) and interchanging the order of summation, we get

$$\mu_n^{(k)}(\delta) = \sum_{h=0}^{k} \beta_{kh}\, \mu_n^{[h]}(\delta), \tag{21}$$

which is independent of δ.
Prior to demonstrating that the independence of δ, noted in Theorem 1 and its corollary, cannot
be extended to k = n + 1, we will need to calculate the derivative of μ_n^{(k)}(δ). Thus, we have:

Theorem 2. Let n and k be non-negative integers and let δ be any real number. Then

$$\frac{d}{d\delta}\, \mu_n^{(k)}(\delta) = k\, \mu_n^{(k-1)}(\delta) - \sum_{r=0}^{k-1} \binom{k}{r} \mu_{n-1}^{(r)}(\delta). \tag{22}$$

Proof. Differentiating (19),

$$\frac{d}{d\delta}\, \mu_n^{(k)}(\delta) = \frac{k}{n!} \sum_{j=0}^{n} (\delta+j)^{k-1} A_{nj}(\delta) + \frac{n}{n!} \sum_{j=0}^{n} (\delta+j)^{k} \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (\delta+j-\nu)^{n-1}. \tag{23}$$

Comparing the first term with (19) and employing the Pascal triangle identity $\binom{n+1}{\nu} = \binom{n}{\nu} + \binom{n}{\nu-1}$ in the second term, we get

$$\sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (\delta+j-\nu)^{n-1} = A_{n-1,j}(\delta) - A_{n-1,j-1}(\delta). \tag{24}$$

From (13),

$$A_{n-1,n}(\delta) = \sum_{\nu=0}^{n} (-1)^{\nu} \binom{n}{\nu} (\delta+n-\nu)^{n-1} = 0, \tag{25}$$

since it is the nth difference of a polynomial of degree n − 1. Hence, substituting (24) into the second
term of (23) and shifting the index of summation,

$$\frac{1}{(n-1)!} \sum_{j=0}^{n} (\delta+j)^{k} \left[ A_{n-1,j}(\delta) - A_{n-1,j-1}(\delta) \right] = -\frac{1}{(n-1)!} \sum_{j=0}^{n-1} \left[ (\delta+j+1)^{k} - (\delta+j)^{k} \right] A_{n-1,j}(\delta) = -\sum_{r=0}^{k-1} \binom{k}{r} \mu_{n-1}^{(r)}(\delta). \tag{26}$$

Thus, by (23), (25) and (26) we have shown that

$$\frac{d}{d\delta}\, \mu_n^{(k)}(\delta) = k\, \mu_n^{(k-1)}(\delta) - \sum_{r=0}^{k-1} \binom{k}{r} \mu_{n-1}^{(r)}(\delta), \tag{27}$$

establishing the theorem.

Corollary 1. For 1 ≤ k ≤ n,

$$k\, \mu_n^{(k-1)}(\delta) = \sum_{r=0}^{k-1} \binom{k}{r} \mu_{n-1}^{(r)}(\delta). \tag{28}$$

Proof. By the Corollary to Theorem 1, μ_n^{(k)}(δ) is independent of δ for 0 ≤ k ≤ n and hence
dμ_n^{(k)}(δ)/dδ = 0 for such values of k.

Corollary 2. If k = n + 1, then

$$\frac{d}{d\delta}\, \mu_n^{(n+1)}(\delta) = c_{n+1,n} - (n+1)\, \mu_{n-1}^{(n)}(\delta), \tag{29}$$

where c_{n+1,n} is a constant (depending on n, but independent of δ).

Proof. From the Corollary to Theorem 1 and Theorem 2, (n+1)μ_n^{(n)}(δ) is independent of δ, and
all terms in $\sum_{r=0}^{n} \binom{n+1}{r} \mu_{n-1}^{(r)}(\delta)$ with the exception of the term with r = n are independent
of δ. This last term is (n+1)μ_{n-1}^{(n)}(δ).

Corollary 2 can be extended to k = n + 2 and so forth, but the expressions obtained become more
complicated and do not appear to be particularly useful. However, we do make use of Corollary 2
in the next theorem.

Theorem 3. For every n, μ_n^{(n+1)}(δ) is a polynomial of degree n + 1 in δ with leading coefficient
(−1)ⁿ.

Proof. We proceed by induction, using Corollary 2 to Theorem 2.

For n = 0, μ_0^{(1)}(δ) = δ. Then dμ_1^{(2)}(δ)/dδ = c_{2,1} − 2δ.

Performing the indicated integration, we have

μ_1^{(2)}(δ) = c_{2,1}δ − δ² + d,

where d is an unspecified constant.

Assume that the conclusion holds for n = m. Then

$$\frac{d}{d\delta}\, \mu_{m+1}^{(m+2)}(\delta) = c_{m+2,m+1} - (m+2)\, \mu_m^{(m+1)}(\delta) = c_{m+2,m+1} - (m+2) \left( a_0 \delta^{m+1} + \sum_{j=1}^{m+1} a_j \delta^{m+1-j} \right), \tag{30}$$

where a₀ is +1 or −1. Integrating, we get μ_{m+1}^{(m+2)}(δ) = −a₀δ^{m+2} + P_{m+1}(δ), where P_{m+1}(δ) is a
polynomial of degree m + 1.

A table of μ_5^{(k)}(δ) appears in the Appendix for n = 5, k = 0, 1, ..., 10 and selected values of δ.
3 Applications to Probability Theory

Let U₁, U₂, ..., U_{n+1} be independent random variables uniformly distributed on (0,1). Let
$S_{n+1} = \sum_{i=1}^{n+1} U_i$. The distribution of S_{n+1} is well-known and is given by the probability density
function

$$f_{S_{n+1}}(x) = \frac{1}{n!} \sum_{\nu=0}^{n+1} (-1)^{\nu} \binom{n+1}{\nu} (x-\nu)_+^{n}, \quad 0 < x < n+1 \tag{31}$$

(for example, see W. Feller [14]), where

$$(x-a)_+ = \begin{cases} 0, & x-a \le 0, \\ x-a, & x-a > 0. \end{cases} \tag{32}$$

Write

S_{n+1} = [S_{n+1}] + δ,

where [S_{n+1}] denotes the integer part of S_{n+1}. Clearly δ is a continuous random variable and
0 < δ < 1; [S_{n+1}] is a discrete random variable with carrier set {0, 1, 2, ..., n}.
The conditional distribution of S_{n+1} given that the fractional part of S_{n+1} is δ is given by

$$P\{S_{n+1} = x \mid S_{n+1} - [S_{n+1}] = \delta\} = f_{S_{n+1}}(j+\delta) \Big/ \sum_{i=0}^{n} f_{S_{n+1}}(i+\delta), \tag{33}$$

where j + δ = x, j = 0, 1, ..., n; i.e., j = [x].

From (31),

$$f_{S_{n+1}}(j+\delta) = \frac{1}{n!} \sum_{\nu=0}^{n+1} (-1)^{\nu} \binom{n+1}{\nu} (j+\delta-\nu)_+^{n}.$$

But j + δ − ν > 0 is equivalent to ν ≤ j, thus we get

$$f_{S_{n+1}}(j+\delta) = \frac{1}{n!} \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (j+\delta-\nu)^{n}, \tag{34}$$

which is A_{nj}(δ)/n!. Also,

$$\sum_{j=0}^{n} f_{S_{n+1}}(j+\delta) = 1,$$

and thus (34) is a discrete probability distribution with carrier set {δ, 1 + δ, ..., n + δ}.
Let W_{n+1,δ} be the random variable whose distribution is given by (34). We then have the following
theorem.
Theorem 4. The moments of order k = 0, 1, ..., n of W_{n+1,δ} coincide with the corresponding
moments of S_{n+1}, that is,

$$E\{S_{n+1}^{k}\} = E\{W_{n+1,\delta}^{k}\}, \quad k = 0, 1, \ldots, n, \; 0 \le \delta < 1. \tag{35}$$

Proof. $E\{S_{n+1}^{k}\} = E_{\delta}\big(E\{S_{n+1}^{k} \mid \delta\}\big) = E_{\delta}\big(E\{W_{n+1,\delta}^{k}\}\big)$. However,

$$E\{W_{n+1,\delta}^{k}\} = \frac{1}{n!} \sum_{j=0}^{n} (j+\delta)^{k} \sum_{\nu=0}^{j} (-1)^{\nu} \binom{n+1}{\nu} (j+\delta-\nu)^{n},$$

which is independent of δ, by the Corollary to Theorem 1.

A brief table of W_{n+1,δ} for n = 5 is given in the Appendix.
Remark. It is easy to see that the marginal distribution of δ, the fractional part of S_{n+1}, is uniform
on (0,1). An elementary proof follows.

$$P\{\delta \le \delta^{*}\} = \sum_{j=0}^{n} \int_0^{\delta^{*}} f_{S_{n+1}}(j+u)\, du = \int_0^{\delta^{*}} \left( \sum_{j=0}^{n} f_{S_{n+1}}(j+u) \right) du;$$

but $\sum_{j} f_{S_{n+1}}(j+u) = 1$ for every 0 < u < 1. Hence

$$P\{\delta \le \delta^{*}\} = \int_0^{\delta^{*}} du = \delta^{*}.$$

Finally, we note that W_{n+1,δ} is asymptotically normally distributed. This is stated in the following
theorem.

Theorem 5. As n → ∞, for 0 ≤ δ < 1, the distribution of

$$\left( W_{n+1,\delta} - \frac{n+1}{2} \right) \Big/ \sqrt{(n+1)/12} \tag{36}$$

converges weakly to the standard normal distribution. Further,

$$\frac{A_{nj}(\delta)}{n!} = \sqrt{\frac{12}{n+1}}\; \phi\!\left( \frac{j+\delta-(n+1)/2}{\sqrt{(n+1)/12}} \right) + o(n^{-1/2}), \tag{37}$$

where φ denotes the standard normal density.

Proof. Both (36) and (37) are immediate consequences of the representation of W_{n+1,δ} as the conditional
distribution of the sum of n + 1 independent uniform random variables on (0,1) given the
fractional part of the sum, and the central limit theorem.
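Theorem 4 is easy to check numerically. The sketch below (an illustration under the paper's definitions, not code from the paper) builds the distribution of W_{n+1,δ} from A_nj(δ)/n! and compares its first n moments with those of S_{n+1} estimated by simulation:

```python
import random
from math import comb, factorial

def A(n, j, d):
    """Generalized Eulerian polynomial A_nj(delta), equation (10)."""
    return sum((-1)**v * comb(n + 1, v) * (d + j - v)**n for v in range(j + 1))

n, d = 5, 0.3
support = [j + d for j in range(n + 1)]
probs = [A(n, j, d) / factorial(n) for j in range(n + 1)]    # equation (34)
assert abs(sum(probs) - 1.0) < 1e-9                          # probabilities sum to 1

# Exact moments of W_{n+1,delta} vs. simulated moments of S_{n+1}.
random.seed(1)
sims = [sum(random.random() for _ in range(n + 1)) for _ in range(200_000)]
for k in range(1, n + 1):
    w_moment = sum(p * x**k for p, x in zip(probs, support))
    s_moment = sum(s**k for s in sims) / len(sims)
    print(k, round(w_moment, 4), round(s_moment, 4))
```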
Appendix

This Appendix is devoted to some tables illustrative of some of the quantities introduced in the
body of the paper.

Table A.1

Table of μ_n^[k](δ), k = 0, 1, 2, 3; n = 0, 1, 2, 3

 n \ k    0      1       2               3
 0        1      δ       δ(δ−1)          δ(δ−1)(δ−2)
 1        1      1       δ(1−δ)          −2δ³ + 3δ² − δ
 2        1      3/2     1               (2δ³ − 3δ² + δ)/2
 3        1      2       7/3             1
The Distribution of W_{n+1,δ}, n = 5; δ = .1, .4, .5, .9

           δ = .1       δ = .4       δ = .5       δ = .9
 δ         8 × 10⁻⁸     9 × 10⁻⁵     3 × 10⁻⁴     .005
 1 + δ     .013         .044         .062         .177
 2 + δ     .260         .396         .438         .545
 3 + δ     .545         .476         .438         .260
 4 + δ     .177         .083         .062         .013
 5 + δ     .005         6 × 10⁻⁴     3 × 10⁻⁴     5 × 10⁻⁹

Note the symmetry for δ = .5, and that δ = .1 and δ = .9 are identical when the column for
δ = .9 is read going up and δ = .1 is read going down (the entries 8 × 10⁻⁸ and 5 × 10⁻⁹ differ
as a consequence of rounding errors).
Table of μ_5^(k)(δ), n = 5, k = 0, 1, ..., 10; δ = 0, .1, .3, .5, .7, .9

  k    δ = 0       .1          .3          .5          .7          .9
  0    1           1           1           1           1           1
  1    3           3           3           3           3           3
  2    9.5         9.5         9.5         9.5         9.5         9.5
  3    31.5        31.5        31.5        31.5        31.5        31.5
  4    108.7       108.7       108.7       108.7       108.7       108.7
  5    388.5       388.5       388.5       388.5       388.5       388.5
  6    1432.50     1432.50     1432.53     1432.55     1432.53     1432.50
  7    5431.50     5431.51     5432.01     5432.48     5432.31     5431.69
  8    21118.7     21117.60    21122.56    21129.77    21129.66    21122.07
  9    84010.5     83989.19    84020.48    84096.88    84116.67    84049.80
 10    341270.5    341018.48   341121.81   341763.40   342089.16   341628.77

References
[1] André, D., Mémoire sur les inversions élémentaires des permutations, Mem. della Pontificia
Accad. Romana dei Nuovi Lincei, 24 (1906), 189-223.

[2] Carlitz, L., Note on a paper of Shanks, Am. Math. Monthly, 59 (1952), 239-241.

[3] Carlitz, L., Eulerian numbers and polynomials, Math. Mag., 32 (1959), 247-260.

[4] Carlitz, L., Eulerian numbers and polynomials of higher order, Duke Math. J., 27 (1960), 401-424.

[5] Carlitz, L., A note on the Eulerian numbers, Arch. Math., 14 (1963), 383-390.

[6] Carlitz, L., Kurtz, D.C., Scoville, R., and Stackelberg, O.P., Asymptotic properties of Eulerian
numbers, Z. Wahrscheinlichkeitstheorie verw. Geb., 23 (1972), 47-54.

[7] Carlitz, L. and Riordan, J., Congruences for Eulerian numbers, Duke Math. J., 20 (1953), 339-344.

[8] Carlitz, L., Roselle, D.P., and Scoville, R.A., Permutations and sequences with repetitions by
number of increases, J. Combin. Theory, 1 (1966), 350-374.

[9] Charalambides, Ch.A., On the enumeration of certain compositions and related sequences of
numbers, Fibonacci Quarterly, 20 (1982), 132-146.

[10] Comtet, L., Analyse Combinatoire, Presses Universitaires de France, Paris, 1970.

[11] David, F.N. and Barton, D.E., Combinatorial Chance, Charles Griffin and Company Ltd.,
London, 1962.

[12] Dillon, J.F. and Roselle, D.P., Eulerian numbers of higher order, Duke Math. J., 35 (1968),
247-256.

[13] Dwyer, P.S., The calculation of moments with the use of cumulative totals, Ann. Math. Statist.,
9 (1938), 288-304.

[14] Feller, W., An Introduction to Probability Theory and its Applications, 2nd Ed., Vol. II, John
Wiley & Sons, Inc., New York, 1971.

[15] Frobenius, G., Über die Bernoullischen Zahlen und die Eulerschen Polynome, Sitz. Ber.
Preuss. Akad. Wiss. (1910), 809-847.

[16] Jordan, C., Calculus of Finite Differences, 2nd Ed., Chelsea Publishing Co., New York (1960).

[17] Nielsen, N., Traité élémentaire des nombres de Bernoulli, Gauthier-Villars et Cie., Paris
(1923).

[18] Poussin, F., Sur une propriété arithmétique de certains polynômes associés aux nombres
d'Euler, Comptes Rendus Acad. Sci. Paris, 266 (1968), 392-393.

[19] Riordan, J., An Introduction to Combinatorial Analysis, John Wiley & Sons, Inc., New York,
1958.

[20] Roselle, D.P., Permutations by number of rises and successions, Proc. Amer. Math. Soc., 19
(1968), 8-16.

[21] v. Schrutka, Lothar, Eine neue Einteilung der Permutationen, Math. Annalen, 118 (1941),
246-250.

[22] Shanks, E.B., Iterated sums of powers of the binomial coefficients, Am. Math. Monthly, 58
(1951), 404-407.

[23] Takács, L., A generalization of the Eulerian numbers, Publicationes Mathematicae Debrecen,
26 (1979), 173-181.

[24] Tanny, S., A probabilistic interpretation of Eulerian numbers, Duke Math. J., 40 (1973), 717-722.

[25] Toscano, L., Sulla somma di alcune serie numeriche, Tohoku Math. J., 38 (1933), 332-342.

[26] Worpitzky, J., Studien über die Bernoullischen und Eulerschen Zahlen, Journal für die reine
und angewandte Mathematik, 94 (1883), 203-232.
The Analysis of Multivariate Qualitative Data
Using an Ordered Categorical Approach

H. B. Tingey    E. A. Morgenthein    S. M. Free
University of Delaware    Bristol-Myers

ABSTRACT

When the experimental units being classified are sub-sampling units in the study, an
ordered categorical procedure cannot be applied directly. Further, the count data obtained,
which is routinely analyzed by univariate statistical methods, ignores the dependence
among the responses. A modification of the method developed by Nair (1986, 1987)
is used to derive the scores and indices, which are analyzed by nonparametric AOV. An
example from teratogenicity studies is used to illustrate the technique.

This problem arises from the consideration of studies where a reproduction safety test
must be performed prior to the use of a drug, chemical or food additive. The standard protocol
in such studies requires that pregnant female subjects (usually rodents) are randomly
assigned to one of four treatment groups. The appropriate dosage is administered shortly
after the beginning of gestation. When the animals are near term, they are sacrificed
and the number of potential offspring are counted. Other data collected are the number
of implantations, early and late fetal deaths, number of live offspring and the number of
fetuses according to various degrees of increasing severity of malformation. Also data on
continuous variables such as fetal weight are collected. It is unclear from the literature
which statistical methods are appropriate for the analysis of this type of data.

For continuous measurements one may quickly turn to the analysis of variance. For
count data describing the number of fetuses with or without some qualitative outcome,
other methods have evolved. A per-fetus analysis using the total of early deaths and total
number of implantations in a Fisher exact test or a chi-squared test of independence may
be performed, but this appears to inflate sample sizes and ignores the dependence of
observations within litters. A review of per-fetus analysis is given by Haseman and Hogan
(1975), who conclude the per-litter analysis is more appropriate.
All but one of the proposed methods for per-litter analysis consider a single outcome.
The need to include within- and among-litter variation negates the use of simple binomial
or Poisson models for count data. In the methods which consider several single responses,
a problem of family error rate arises. Since the tests are not independent, the nominal
family error rate cannot be exactly determined. The multivariate method developed by
Ryttman (1976) relies on the assumption of normality, which is violated in the case of
fetal deaths. This lack of success, however, does not preclude a multivariate approach. In
situations where ranking the categories from mild to severe is possible, ordered categorical
models may be applied and the family error problem may be eliminated.

In this paper we obtain a scoring system for various outcomes which produces a severity
index for each litter. This index is sensitive to location shifts. The modeling which follows
will be based on this index.

The study design prohibits the straight-forward application of ordered categorical procedures
because the items (fetuses) are not independent. Thus a scoring procedure allows
consideration of the effect of litter size on severity of the response, as a whole, in the litter.
Here the sampling unit is the fetus or individual. Three observations should be made: i)
results are different per litter than per fetus, ii) per litter evaluates the proportion of
fetuses affected rather than the number of affected litters, and iii) observed treatment-control
differences are less significant than per-fetus analysis indicates (via simulation).

Univariate Analysis.

The simple analysis is based on litter as the experimental unit. This analysis is carried
out using binomial and Poisson models. The binomial assumption states that, conditional
on litter size, the number affected is binomial. The analysis is based on transformed data,
usually the arc-sine of the observed proportion. The Poisson model does not account for
litter size as it assumes the mean number affected is the same for all dose groups. The
analysis again uses a transformation, usually the square root of the observed number.
Neither fits the data very well. This may be due to extra-binomial or extra-Poisson
variability, as the case may be.

More sophisticated models, reviewed by Haseman and Kupper (1979), include weighted
least squares based on proportions and unequal sample sizes. This approach, due to Cochran
(1943), requires sample sizes which are too large for this application. Others include the
normal-binomial (Luning et al. 1966), beta-binomial (Williams 1975), negative binomial
(McCaughran and Arnold 1976), correlated-binomial (Altham 1978), and jackknife (Gladen
1979). Several nonparametric procedures have been tried, namely the Mann-Whitney U,
the Kruskal-Wallis and the Jonckheere/Terpstra. Some attempts at multivariate analysis
have been made by Ryttman (1976), log-linear models by Haberman and others (1974)
and generalized linear models by McCullagh (1980). All of the latter techniques have
distributional assumptions.

Since some of the ordered categorical procedures develop or accept scores for the categories,
this approach was pursued. Scores induce relative spacing among the categories.
Thus, a mean score may be obtained for each litter. This implies analysis by litter as a
sampling unit. We note that CATMOD in SAS allows for scoring, but the scores must be
user specified.

Ipsen (1955) suggested a scoring for bioassay. Instead of estimating an LD50 or ED50
based on the number of survivors after x days, he ordered the data into categories with the
continuum represented by time (days). The scores proposed are such that the variance of
the linear regression of mean scores on the log dose is maximized with respect to the total
variance. An adjustment is made if the scores do not reflect the ordering of the categories.

Bradley et al. (1962) score by maximizing the treatment sum of squares after scaling
relative to the error sum of squares. This is an iterative procedure which does not require
the assumption of linearity.

Using no distributional assumption, Nair (1986, 1987) suggested some techniques for
analyzing ordered categorical data in the field of quality control. He showed the Taguchi
statistic for 2 × K tables can be orthogonally decomposed into K − 1 components, where
K is the number of categories. In the two-sample case he showed the first component
is equivalent to Wilcoxon's rank test on grouped data. Thus, this component would be
sensitive to shifts in the multinomial model. Further, the second component corresponds
to Mood's rank test for grouped data, thus is sensitive to scale changes in the 2 × K model.

In the non-equiprobable case the correspondence does not apply, though the interpretation
still holds. This result has been verified using a comparison density approach for
the two-sample problem by Eubank, LaRiccia and Rosenstein (1987).

The decomposition of Taguchi's accumulation chi-squared (1966, 1974) requires the
solution of an eigenvector problem. Nair (1986, 1987) provides the method for deriving two
sets of scores. These yield statistics that are approximately equal to those obtained from
the orthogonal decomposition, but do not require a rigorous solution. The approximate
and exact statistics have comparable power.

When applied to 2 × K tables, the first set of Nair's scores is sensitive to shifts in location
of the underlying random variable. It is reasonable to suggest, when applied to litters,
these scores yield a continuous index useful for detecting shift. In teratogenicity studies
the location shifts of interest would be those that indicate a significant dose-response.

Nair's Method

As already mentioned, the first and second components of the orthogonal decomposition
correspond to the Wilcoxon and Mood rank tests, respectively.

Wilcoxon tests

H₀: G(x) = F(x)
H₁: G(x) = F(x − θ)
where F, G are two distribution functions.

Mood tests

H₀: G(x) = F(x)
H₁: G(x) = F(x/θ)

where θ is a constant.

For more than two treatment groups the first component corresponds to the Kruskal-Wallis
statistic for grouped data, and the second to the generalized Mood statistic. In
the general case (excepting the equiprobable case) the equalities are no longer exact, but the first
two components have good power for detecting location and scale shifts respectively. The
focus of this work is on location shifts.

Observed frequency for the (i,k)th cell: Y_{ik}, i = 1, 2; k = 1, 2, ..., K.
Column totals: C_k = Y_{1k} + Y_{2k}.
Row totals: R_i = Σ_{k=1}^{K} Y_{ik}, with N = R₁ + R₂.
Cumulative row frequencies: Z_{ik} = Σ_{j=1}^{k} Y_{ij}.
Cumulative column totals: D_k = Σ_{j=1}^{k} C_j.
Row proportions: r_i = R_i/N.
Column proportions: c_k = C_k/N.
Cumulative column proportion up to and including column k: d_k = D_k/N.

Vector conventions used:

a bold lower case letter denotes a vector;
a bold upper case letter denotes a matrix;
transpose is denoted by a superscript t;
a vector raised to a power implies each element is raised to that power
(this is non-standard).

Multinomial model, 2 x K case

Two random samples of size R_i, i = 1, 2, are drawn from two multinomial populations. For each population, the probabilities of the K outcomes are given by p_ik, k = 1, 2, ..., K.

The cumulative probabilities for population i are given by

π_ik = Σ_{j=1}^{k} p_ij.

If the K categories are assumed to be ordered, the hypotheses are

H0: π_1k = π_2k, k = 1, 2, ..., K
H1: (π_1k - π_2k) ≤ 0 for all k
(strict inequality for at least one k).

Alternative statistics to Pearson's X²

Taguchi's statistic:

T_E = Σ_{k=1}^{K-1} [d_k(1 - d_k)]^{-1} { Σ_{i=1}^{2} R_i (Z_ik/R_i - d_k)² }

If X²_kP is the Pearson X² statistic from a 2 x 2 table in which column 1 contains the cumulative frequencies of categories 1 through k and column 2 contains the cumulative frequencies of categories (k+1) through K, then

T_E = Σ_{k=1}^{K-1} X²_kP,

where T_E is a "ccs type statistic". T_E assigns weight w_k = [d_k(1 - d_k)]^{-1} to the kth term in the sum, which is equal for each k under H0.

Nair's statistic:

T = Σ_{k=1}^{K-1} w_k { Σ_{i=1}^{2} R_i (Z_ik/R_i - d_k)² }

The statistics in the class are obtained by the choice of the set {w_k}, where w_k > 0 for k = 1, 2, ..., K - 1. The decomposition is carried out conditionally on the marginal proportions. For the weights w_k, k = 1, 2, ..., K - 1, W is the diagonal matrix diag(w_1, ..., w_{K-1}). Using the d_k we form the (K - 1) x K matrix A, whose kth row has 1 - d_k in its first k columns and -d_k in the remaining columns:

A = [ 1-d_1      -d_1    ...    -d_1
      1-d_2     1-d_2    ...    -d_2
      .................................
      1-d_{K-1} ... 1-d_{K-1}   -d_{K-1} ]
Thus T is given by

T = y_1' A' W A y_1 / (N r_1 r_2).

To express T as a sum of squares in y_1 we need to express A'WA as the product of a matrix and its transpose. Let Λ be the diagonal matrix of order K formed by the column proportions {c_k}, and let Γ be the diagonal matrix of order (K - 1) containing the nonzero eigenvalues of A'WAΛ. Then the decomposition yields

A'WA = Q Γ Q'

where Q contains the corresponding eigenvectors of A'WAΛ, normalized so that the augmented matrix [1, Q] satisfies [1, Q]' Λ [1, Q] = I. Substituting QΓQ' into T above, with

u = Q' y_1 / (N r_1 r_2)^{1/2},

yields

T = Σ_{j=1}^{K-1} Γ_j U_j²

where the Γ_j's are the elements of the vector of eigenvalues Γ and the U_j's are the elements of u. Under H0 the distribution of y_1 conditional on the row and column proportions is multiple hypergeometric with

E(Y_1k) = N r_1 c_k
Cov(Y_1k, Y_1l) = N (1 - 1/N)^{-1} r_1 r_2 c_k (1 - c_k),  k = l,
Cov(Y_1k, Y_1l) = -N (1 - 1/N)^{-1} r_1 r_2 c_k c_l,  k ≠ l,

or

E(y_1) = N r_1 Λ 1
Cov(y_1) = N (1 - 1/N)^{-1} r_1 r_2 Λ (I - 1 1' Λ),

where 1 is a K x 1 vector of ones. It follows that

E(u) = N r_1 (Q' Λ 1) / (N r_1 r_2)^{1/2} = 0
Cov(u) = (1 - 1/N)^{-1} Q' Λ (I - 1 1' Λ) Q = (1 - 1/N)^{-1} I,

which implies the U_j's are uncorrelated with zero means.

Under H0 it can be shown that the distribution of y_1 converges to the multivariate normal distribution as N goes to infinity. Thus

T = Σ_{j=1}^{K-1} Γ_j U_j² → Σ_{j=1}^{K-1} Γ_j χ_j²(1),

a weighted sum of independent χ² random variables, each with 1 df.
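The Γ_j and U_j above come from a small eigenproblem. The following is a minimal sketch of that computation (Python is my choice of language, not the paper's), assuming hypothetical column proportions c; the zero eigenvalue of A'WAΛ belongs to the eigenvector 1 and is discarded:

    import numpy as np

    c = np.array([0.6, 0.1, 0.25, 0.05])    # hypothetical column proportions c_k
    K = len(c)
    d = np.cumsum(c)[:-1]                    # d_1, ..., d_{K-1}
    W = np.diag(1.0 / (d * (1.0 - d)))       # Taguchi weights w_k = [d_k(1-d_k)]^(-1)

    # A: (K-1) x K, row k has 1-d_k in its first k columns, -d_k elsewhere
    A = np.zeros((K - 1, K))
    for k in range(K - 1):
        A[k, :k + 1] = 1.0 - d[k]
        A[k, k + 1:] = -d[k]

    Lam = np.diag(c)
    M = A.T @ W @ A @ Lam                    # K x K, rank K-1
    eig = np.sort(np.linalg.eigvals(M).real)[::-1]
    Gamma = eig[:-1]                         # nonzero eigenvalues Gamma_j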

For the approximate solution Nair proposed two sets of statistics which have the same properties as those obtained for the equiprobable case (i.e., c_k = 1/K). That is, the first component of T_E, U_1, is equivalent to the Wilcoxon test on the 2 x K table, and U_2, the second component, is equivalent to Mood's

M = Σ_{k=1}^{K} [k - (K + 1)/2]² Y_1k.

They do not require the solution of the eigenvalue problem, as orthogonal decomposition is not necessary. For the first component, all observations in a category are assigned a score proportional to the midpoint of the category; for the second component the scores are quadratic in the midrank. Additionally, each set of scores is adjusted to satisfy orthogonality.

To calculate the scores, let c be of length K with elements the column proportions. Form the K x K lower triangular matrix

B = [ .5   0   ...   0
       1  .5   ...   0
      ................
       1   1   ...  .5 ]

Let r = Bc and r* = r - .5(1). Note the r's are Bross's ridits. The first set of scores is obtained from

l = r* / (c' r*²)^{1/2}

where r*² is the vector of squares of the elements of r*. The second set of scores is obtained in two steps. First let

e = l² - (c'l³)l - 1.

Then

s = e / (c'e²)^{1/2}.

The approximate statistics for the 2 x K table are

V_1² = L_1²/R_1 + L_2²/R_2

where

L_i = l'y_i, i = 1, 2,

and

V_2² = S_1²/R_1 + S_2²/R_2

where

S_i = s'y_i, i = 1, 2,

which are comparable in magnitude, and consequently in power, to U_1 and U_2 respectively.
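For concreteness, here is a minimal sketch of V_1² on a 2 x K table, using the control and high-dose rows of Table II below; the first-set scores are recomputed from this subtable's own margins, as the construction above prescribes:

    import numpy as np

    y1 = np.array([252, 5, 35, 8])          # control row of Table II
    y2 = np.array([98, 1, 199, 13])         # high-dose row of Table II
    N = y1.sum() + y2.sum()

    c = (y1 + y2) / N                       # column proportions c_k
    r = np.cumsum(c) - 0.5 * c              # Bross's ridits, r = Bc
    r_star = r - 0.5
    l = r_star / np.sqrt(c @ r_star**2)     # first set of scores

    L1, L2 = l @ y1, l @ y2                 # L_i = l'y_i
    V1_sq = L1**2 / y1.sum() + L2**2 / y2.sum()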

We now apply the method:

Conduct of the study and data:

PROTOCOL

1. Sprague-Dawley rat study
2. Herbicide: nitrofen (2,4-dichloro-4'-nitrodiphenyl ether)
3. Test compound administered during organogenesis
4. Sacrifice prior to parturition and cesarean-sectioned
5. Record litter and fetal data
6. Administration of compound follows daily dose regimen
7. Treatment groups: control and three dose groups
8. Inseminated females randomly assigned to 4 groups of 24 rats each
9. Dose levels: 6.25, 12.5, 25 mg/kg/day body weight on days 6-15 of gestation.
Controls: gavage solution w/o test compound

Live fetuses are weighed, sexed and examined for external malformations. They are then sacrificed in order to perform the skeletal and visceral examinations. Recorded are the number of corpora lutea on each ovary, the number of implantations, the number of fetuses, and the number of resorptions in each uterine horn. Table I displays the data for each rodent and dose level.

The following definitions are employed to categorize the fetuses: dead - early or late resorption or dead at c-section; malformed - gross visceral or skeletal variation; growth retarded - body weight more than two standard deviations from the mean for the given sex or by a range test; normal - absence of any of the previous outcomes. Tables II and III summarize the results by number and percent for each dose by category. It should be noted that the differing number of litters is due to nonpregnant females, not toxicity.

The final column of Table I is the calculated severity index. This index is calculated by multiplying the score for each category by the number of fetuses in the category, summing, and dividing by the number of implantations, i.e.,

SI = (Σ_k l_k n_k) / n,

where n_k is the number of fetuses in category k, l_k is the category score, and n is the number of implantations.

Details of the calculation of a severity index are given in the following example.

Consider the following sample data:

Table I
Nitrofen Data - Sprague-Dawley Rats
     Number of                 Growth                      Severity
Id   Implantations   Normal   Retarded   Malformed   Dead   Index
Dose Group = Control (0.0 mg/kg/day b.w.)
19    1    1   0   0   0   0.00000
 8    4    3   0   0   1   0.68423
11    5    4   1   0   0   0.25139
 7    8    7   0   0   1   0.34212
 1   12    9   0   3   0   0.49145
16   14   14   0   0   0   0.00000
24   14   11   0   3   0   0.42124
 6   15   15   0   0   0   0.00000
 9   15   13   0   1   1   0.31352
20   15   12   0   3   0   0.39316
22   15   12   1   2   0   0.34590
 2   16   14   1   1   0   0.20142
 4   16   16   0   0   0   0.00000
10   16   16   0   0   0   0.00000
12   16   14   2   0   0   0.15712
17   16   16   0   0   0   0.00000
23   16   15   0   0   1   0.12286
 3   17   11   0   6   0   0.69382
 5   17   10   0   6   1   0.85481
13   17   13   0   3   1   0.50790
15   17   13   0   2   2   0.55326
21   18   13   0   4   1   0.58890
Dose Group = Low (6.25 mg/kg/day b.w.)
32    1    0   0   0   1   2.73692
28   12   10   0   2   0   0.32763
43   12    9   0   3   0   0.49145
26   14   10   0   4   0   0.56166
31   14   14   0   0   0   0.00000
39   14   11   0   3   0   0.42124
41   14    9   0   5   0   0.70207
47   14   10   0   4   0   0.56166
48   14   14   0   0   0   0.00000
33   15   10   0   5   0   0.65527
38   15   10   0   5   0   0.65527
40   15   13   0   1   1   0.31352
45   15   12   0   3   0   0.39316
25   16   11   0   5   0   0.61432
27   16   12   0   4   0   0.49145
34   16   10   0   6   0   0.73718
35   16   12   0   4   0   0.49145
37   16    9   0   7   0   0.86004
44   16   12   0   4   0   0.49145
46   16   11   0   5   0   0.61432
36   17    7   0   8   2   1.24708
Table I (Cont'd.)
Nitrofen Data - Sprague-Dawley Rats
     Number of                 Growth                      Severity
Id   Implantations   Normal   Retarded   Malformed   Dead   Index
Dose Group = Mid (12.5 mg/kg/day b.w.)
54    2    0   0    0   2   2.73692
70    3    1   0    2   0   1.31054
59    4    2   0    2   0   0.98290
64    8    5   0    3   0   0.73718
53   11    4   1    5   1   1.25663
55   13    7   0    5   1   0.96661
58   14    7   0    6   1   1.03798
60   14    7   0    5   2   1.09306
65   14    8   0    5   1   0.89757
68   14   10   0    3   1   0.61674
62   15    6   0    6   3   1.33371
67   15   13   0    1   1   0.31352
71   15    8   0    4   3   1.07160
49   16    6   0   10   0   1.22863
69   16   11   0    4   1   0.66251
56   18   15   0    2   1   0.37047
57   18   13   0    5   0   0.54606
72   18    7   0   11   0   1.20133
Dose Group = High (25.0 mg/kg/day b.w.)
91    2    0   0    1   1   2.35136
80    7    3   0    3   1   1.23348
86    8    6   1    1   0   0.40284
73   10    1   0    9   0   1.76923
77   14    3   0   11   0   1.54456
78   14    3   0   11   0   1.54456
79   14    2   0   12   0   1.68498
83   14    9   0    5   0   0.70207
93   14    1   0   12   1   1.88047
76   15    0   0   15   0   1.96581
84   15    6   0    9   0   1.17949
92   15    7   0    8   0   1.04843
74   16    4   0   11   1   1.52255
87   16    6   0   10   0   1.22863
94   16    8   0    8   0   0.98290
95   16    6   0   10   0   1.22863
96   16    4   0   10   2   1.57075
89   17    6   0   11   0   1.27199
90   17    0   0   11   6   2.23797
75   18    6   0   12   0   1.31054
81   18    6   0   12   0   1.31054
88   19   11   0    7   1   0.86829

Table II
Number of Implantations
Group     Normal   Gr. Retarded   Malformed   Dead   Total
Control   252      5              35          8      300
Low       239      1              89          5      334
Mid       130      1              79          18     228
High      98       1              199         13     311

Table III
Percent of Implantations
Group     Normal   Gr. Retarded   Malformed   Dead
Control   84.0     1.7            11.7        2.7
Low       71.6     0.3            26.6        1.5
Mid       57.0     0.4            34.6        7.9
High      31.5     0.3            64.0        4.2

Number of Implantations (Fetuses)
              Normal   Gr. Retarded   Malformed   Dead   Totals
Control       252      5              35          8      300
Low Dose      239      1              89          5      334
Middle Dose   130      1              79          18     228
High Dose     98       1              199         13     311
Totals        719      8              402         44     1173

Calculate the column proportions:

c_k:  .61295823   .00682012   .34271100   .03751066

Calculate Bross's ridits (1958) by the formula r_k = (c_0 + c_1 + ... + c_{k-1}) + .5c_k, where c_0 = 0:

r_k:  .30647912   .61636829   .79113385   .98124468

Now, let r*_k = r_k - .5:

r*_k:  -.19352088   .11636829   .29113385   .48124468

Calculate the constant d = [c_1 r*_1² + c_2 r*_2² + c_3 r*_3² + c_4 r*_4²]^{1/2}:

d = [.61295823(-.19352088)² + .00682012(.11636829)² +
     .34271100(.29113385)² + .03751066(.48124468)²]^{1/2}
  = .24654207

The vector of scores (Nair, 1986, 1987) is then obtained by l_k = r*_k/d:

l_k:  -0.7849   0.4720   1.1809   1.9520

Shifting the scores so that the score for a normal implantation (fetus) is zero, the final scores are:

l_k:   0.0000   1.2569   1.9658   2.7369

Then a litter with 11 implantations, of which 4 are classified as normal, 1 as growth retarded, 5 as malformed and 1 dead, would have a severity index of:

SI = [0.0000(4) + 1.2569(1) + 1.9658(5) + 2.7369(1)]/11 = 1.2566

This can be interpreted in light of the above scores, i.e., an index near zero would be indicative of a litter with nearly all normal fetuses at cesarean-section, and a score near 2.7369 would be indicative of a litter with nearly all fetuses dead at cesarean-section.
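The entire calculation, from column proportions through the severity index, fits in a few lines. A minimal sketch (the authors' own implementation is in SAS, so this Python rendering is mine) that reproduces the numbers above:

    import numpy as np

    counts = np.array([719, 8, 402, 44])    # pooled totals from Table II
    c = counts / counts.sum()               # column proportions c_k

    r = np.cumsum(c) - 0.5 * c              # Bross's ridits
    r_star = r - 0.5
    d = np.sqrt(np.sum(c * r_star**2))      # d = (c'r*^2)^(1/2) = .24654207
    scores = r_star / d - r_star[0] / d     # shifted so "normal" scores zero
    # scores -> [0.0000, 1.2569, 1.9658, 2.7369]

    litter = np.array([4, 1, 5, 1])         # normal, retarded, malformed, dead
    si = scores @ litter / litter.sum()     # severity index = 1.2566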

Designs for the Analysis of the Severity Index

Five designs which assume normality were evaluated: the one-way classification, a one-way classification using litter size as a covariate, a generalized randomized block design using litter size as a blocking variable, and weighted analyses using litter size as a weight in one case and the square root of litter size in the other. The results are summarized in Table IV in terms of the calculated F, associated P values, and R².

Table IV
                                df     F      P         R²
One-way analysis                3,81   22.97  < .0001   .46
Covariance                      3,80   25.99  < .0001   .53
Generalized RBD                 3,65   20.52  < .0001   .58
Weighted AOV (litter size)      3,81   15.62  < .0001   .37
Weighted AOV (√litter size)     3,81   33.11  < .0001   .55

As was expected, the covariance and blocking designs provided an improvement over the one-way classification as measured by R². However, the magnitude of the improvement does not seem to warrant the chance of violating the more restrictive assumptions placed on the experiment by those designs. A better alternative, in the parametric case, may be using the square root of litter size as a weight, which provides nearly the same value of R² as does the blocking design. However, we would prefer the one-way analysis for its simplicity and robustness in application.

The normality assumption on the severity index is quite suspect in many situations. As an alternative, the nonparametric Kruskal-Wallis procedure was carried out. In view of the overwhelming significance of the parametric procedures, the result was not surprising: X² = 47.75, df = 3, p < .0001. Figure 1 compares the linearity of the mean severity index and the median severity index.

[Figure 1. Mean and median severity index versus dose (0.00, 6.25, 12.50, 25.0 mg/kg/day), showing the approximate linearity of both statistics; the plotted curves are not recoverable from the scan.]
Statistical Procedure

The consideration of litter size is not necessary for analysis of the SI's. It is important to note that the SI's are probably not normally distributed, particularly in the control group and at the higher dose levels. The following is suggested for toxicity-teratogenicity studies (a sketch of steps 1 and 4 follows the list):

1. If the SI's are reasonably normal, calculate the AOV F-statistic for a one-way layout. Use this statistic to test for differences in location.
2. If F is significant, follow with linear contrasts to test for increasing trend.
3. If significant, use Dunnett's procedure to compare the control mean with each of the treatment means to establish a no-effect level.
4. In the presence of non-normality use a similar sequence of nonparametric tests, e.g., K-W, Jonckheere/Terpstra, and Dunn's procedure.
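A minimal sketch of steps 1 and 4, with hypothetical per-litter severity indices grouped by dose (Dunnett's and Dunn's procedures, steps 2-3, are omitted here):

    import numpy as np
    from scipy import stats

    si = {
        "control": np.array([0.00, 0.68, 0.25, 0.34, 0.49]),   # hypothetical SI's
        "low":     np.array([0.33, 0.49, 0.56, 0.00, 0.70]),
        "mid":     np.array([0.98, 0.74, 1.26, 0.97, 1.04]),
        "high":    np.array([1.23, 1.77, 1.54, 1.68, 1.97]),
    }

    F, p_f = stats.f_oneway(*si.values())    # step 1: one-way AOV F test
    H, p_kw = stats.kruskal(*si.values())    # step 4: Kruskal-Wallis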

SAS code is available which reads litter data, calculates scores, computes SI's and calculates the statistics. The results above have been "tested" by simulation analysis of additional nitrofen studies and two other biological examples. Also, the method detects different dose patterns with equal ability. The K-W test showed consistently higher power than the F-statistic in the simulation studies.

REFERENCES

[1] Altham, P.M.E. (1978). Two generalizations of the binomial distribution. Applied Statistics 27, 162-167.

[2] Bradley, R.A., S.K. Katti and I.J. Coons (1962). Optimal scaling for ordered categories. Psychometrika 27, 355-374.

[3] Bross, I.D.J. (1958). How to use ridit analysis. Biometrics 14, 18-38.

[4] Cochran, W.G. (1943). Analysis of variance for percentages based on unequal numbers. J. American Statistical Association 38, 287-301.

[5] Eubank, R.L., V.N. LaRiccia and R.B. Rosenstein (1987). Test statistics derived as components of Pearson's phi-squared distance measure. J. American Statistical Association 82, 816-825.

[6] Gladen, B. (1979). The use of the jackknife to estimate proportions from toxicological data in the presence of litter effects. J. American Statistical Association 74, 278-283.

[7] Haberman, S.J. (1974). Log-linear models for frequency tables with ordered classifications. Biometrics 30, 589-600.

[8] Haseman, J.K. and M.D. Hogan (1975). Selection of the experimental unit in teratology studies. Teratology 12, 165-171.

[9] Haseman, J.K. and L.L. Kupper (1979). Analysis of dichotomous response data from certain toxicological experiments. Biometrics 35, 281-293.

[10] Ipsen, J. (1955). Appropriate scores in bio-assay using death times and survivor symptoms. Biometrics 11, 465-480.

[11] Luning, K.G., W. Sheridan, K.H. Ytterborn and U. Gullberg (1966). The relationship between the number of implantations and the rate of intra-uterine death in mice. Mutation Research 3, 444-451.

[12] McCaughran, D.A. and D.W. Arnold (1976). Statistical models for numbers of implantation sites and embryonic deaths in mice. Toxicology and Applied Pharmacology 38, 325-333.

[13] McCullagh, P. (1980). Regression models with ordinal data (with discussion). J. Royal Statistical Society, B 42, 109-142.

[14] Morgenthien, E.A. (1988). An Ordered Categorical Approach to the Analysis of Qualitative Data From Developmental Toxicity/Teratogenicity Studies. Unpublished Ph.D. Dissertation, University of Delaware.

[15] Nair, V.J. (1986). Testing in industrial experiments with ordered categorical data (with discussion). Technometrics 28, 283-311.

[16] Nair, V.J. (1987). Chi-squared type tests for ordered alternatives in contingency tables. J. American Statistical Association 82, 283-291.

[17] Ryttman, H. (1976). A new statistical evaluation of the dominant-lethal mutation test. Mutation Research 38, 228-238.

[18] Williams, D.A. (1975). The analysis of binary responses from toxicological experiments involving reproduction and teratogenicity. Biometrics 31, 949-952.
A SMALL SAMPLE POWER STUDY OF THE

ANDERSON-DARLING STATISTIC AND A COMPARISON WITH

THE KOLMOGOROV AND THE CRAMER-VON MISES STATISTICS

Linda L. Crawford Moss, US Army Ballistic Research Laboratory
Malcolm S. Taylor, US Army Ballistic Research Laboratory
Henry B. Tingey, University of Delaware

Abstract

The Anderson-Darling goodness-of-fit procedure emphasizes agreement between the data and the hypothesized distribution in the extremes or tails. An improved table of the quantiles of the Anderson-Darling statistic, useful for small sample sizes, was constructed using the Cray-2 supercomputer. The power of the Anderson-Darling test is compared to the Kolmogorov and the Cramér-von Mises tests when the null hypothesis is the normal distribution and the alternative distributions are the Cauchy, the double exponential, and the extreme value distributions.

1. INTRODUCTION

Consider a random sample X1, X2, ..., Xn from a population with a continuous distribution function. One method of testing the hypothesis that the n observations come from a population with a specified distribution function F(x) is by a chi-square test. This test requires a subjective partitioning of the real line R and a comparison of the empirical histogram with the hypothetical histogram. A more objective method is to compare the empirical distribution function Fn(x) with the hypothetical distribution function F(x). The empirical distribution function based on n observations is defined as Fn(x) = k/n if exactly k observations are less than or equal to x, for k = 0, 1, ..., n.
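As a minimal sketch, this definition transcribes directly (the language choice is mine, not the authors'):

    import numpy as np

    def edf(sample, x):
        """Empirical distribution function: Fn(x) = (# observations <= x)/n."""
        sample = np.asarray(sample)
        return np.count_nonzero(sample <= x) / sample.size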

To compare the empirical and hypothetical distribution functions, a measure of their difference is required. Addressing this, Anderson and Darling [1952] considered the following metrics in function space:

Wn² = n ∫_{-∞}^{∞} [Fn(x) - F(x)]² ψ[F(x)] dF(x)     (1.1)

and

Kn = sup_{-∞ < x < ∞} √n |Fn(x) - F(x)| √ψ[F(x)].     (1.2)

Samples producing large values of Wn² (or Kn) lead to rejection of the null hypothesis that the population distribution function is F(x). One of the contributions of Anderson and Darling was the incorporation of a non-negative weight function ψ in (1.1) and (1.2). By a suitable choice for ψ, specific ranges of values of the random variable X, corresponding to different regions of the distribution F(x), may be emphasized. For ψ[F(x)] ≡ 1, Wn² becomes the Cramér-von Mises statistic [Cramér, 1928 and von Mises, 1931] and Kn becomes the Kolmogorov statistic [Kolmogorov, 1933].

The tails of the distribution function will be accentuated in the investigation detailed in this paper; Anderson and Darling suggest using

ψ[F(x)] = 1/{F(x)[1 - F(x)]}.

With this choice for the weighting function, metric (1.1) becomes the basis for the Anderson-Darling statistic.

In Section 2, the Anderson-Darling test statistic is developed; in Section 3, the most accurate tabulation to date of the test statistic is provided. In Section 4, the description and the results of a power study are given in which the Anderson-Darling, the Cramér-von Mises, and the Kolmogorov statistics are compared.
2. THE ANDERSON-DARLING STATISTIC

For a fixed value of the random variable X, say X = x, the empirical distribution function Fn(x) is a statistic, since it is a function of the sample values x1, x2, ..., xn. The distribution of this statistic is established as a lemma.

Lemma (2.1): If Fn(x) is the empirical distribution function corresponding to a random sample X1, X2, ..., Xn of size n from a distribution H(·), then for a fixed x, nFn(x) is distributed binomial (H(x), n).

Proof:
P(nFn(x) = k) = P(exactly k values Xi ≤ x), for k = 0, 1, ..., n.

Let Zi = I_(-∞,x](Xi), where the indicator function I is defined as

I_(-∞,x](Xi) = 1 if -∞ < Xi ≤ x; 0 otherwise.

Then Σ Zi counts the number of sample values xi ≤ x. Here each Zi ~ Bernoulli(H(x)), so Σ Zi ~ binomial(H(x), n). Therefore,

P(nFn(x) = k) = P(exactly k values xi ≤ x)
             = P(Σ Zi = k)
             = C(n,k) H(x)^k [1 - H(x)]^(n-k).  ∎

From Lemma 2.1,

E[Fn(x)] = (1/n) E[nFn(x)] = H(x)

and

Var[Fn(x)] = (1/n²) Var[nFn(x)] = (1/n) H(x)[1 - H(x)].     (2.1)

To assist in the determination of a suitable weighting function ψ(·), that is, a function that will weight more heavily values in the tails of the distribution F(x) at the expense of values closer to the median, consider the expectation of the squared discrepancy n E[Fn(x) - F(x)]². It is important to keep in mind that the value x is fixed, so F(x) is a constant, and the expectation is with respect to the random variable Fn(x), whose distribution was established in Lemma 2.1. Then

n E[Fn(x) - F(x)]² = n E[Fn(x) - H(x) + H(x) - F(x)]²

which, after algebraic manipulation (Appendix A), yields the variance and bias:

n E[Fn(x) - F(x)]² = H(x){1 - H(x)} + n{F(x) - H(x)}².     (2.2)

Under the null hypothesis H0: H(x) = F(x) for all x, (2.2) becomes

n E[Fn(x) - F(x)]² = F(x)[1 - F(x)].     (2.3)

Anderson and Darling chose as a weighting function ψ[F(x)] = 1/{F(x)[1 - F(x)]}. Weighting by the reciprocal of (2.3) takes into consideration the variance of the statistic Fn(x) and also maintains the objective of accentuating values in the tails of F(x).

With this choice of weighting function, and without loss of generality assuming x1 ≤ x2 ≤ ... ≤ xn, let F(x) = u, dF(x) = du, and F(xj) = uj. Then the Anderson-Darling test statistic (2.4) can be rewritten as expression (2.5) by expansion and integration (Appendix B):

Wn² = n ∫_{-∞}^{∞} [Fn(x) - F(x)]² / {F(x)[1 - F(x)]} dF(x)     (2.4)

Wn² = -n - (1/n) Σ_{j=1}^{n} [(2j-1) ln uj + (2(n-j)+1) ln(1-uj)]     (2.5)
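Expression (2.5) is immediate to program. A minimal sketch, assuming a completely specified null distribution F (the standard normal here; the library choice is mine):

    import numpy as np
    from scipy import stats

    def anderson_darling(x, cdf=stats.norm.cdf):
        """Wn^2 from expression (2.5), for a completely specified F."""
        u = np.sort(cdf(np.asarray(x)))      # u_j = F(x_(j)) for the ordered sample
        n = u.size
        j = np.arange(1, n + 1)
        return -n - np.mean((2*j - 1) * np.log(u) + (2*(n - j) + 1) * np.log(1 - u))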

3. DISTRIBUTION OF THE ANDERSON-DARLING STATISTIC

The asymptotic distribution of Wn² was derived by Anderson and Darling [1952]. Lewis [1961] undertook the tabulation of F(z;n) = P(Wn² ≤ z) for n = 1, 2, ..., 8 and for incremental values of z over the interval [0.025, 8.000]. Lewis' table entries were computed using a Monte Carlo procedure to generate an empirical approximation Fm(z;n) to the distribution function F(z;n) based on m samples of size n. At that time, computational restrictions essentially limited the accuracy of the table entries to within 0.00326 of the true value.

Following an analogous procedure based on expression (2.5) and the observation that the uj are distributed U[0,1] [Feller, 1966], the table appearing in Lewis' paper was recalculated using a Cray-2 supercomputer. A Kolmogorov-type bound [Conover, 1980] was used to construct a 95% confidence band for the distribution function F(z;n).

In general, the width of a (1-α)100% confidence band is equal to twice the value of the (1-α)100% quantile of the Kolmogorov statistic

Km = sup_{-∞ < x < ∞} √m |Fm(x) - F(x)|,

where m is the number of sample values used in the construction of Fm(x). With n fixed, the 95% confidence band can be made arbitrarily small by a suitable choice for m, the number of Monte Carlo samples. The commonly tabled [Miller, 1956] asymptotic approximation for the 95th quantile is 1.358/√m. However, Harter [1980]

suggests using

1.358 r/(m + 3.51), where r = (m + 4)^{1/2},     (3.1)

for an improved approximation.

Using approximation (3.1) to construct a 95% confidence band with width not exceeding 0.001, the value for m must be at least 7,375,881. In this simulation, m was chosen to be 7.4 million. Table 1 lists the reconstruction of Lewis' table, now accurate to within 0.0005. Again, z ranges from 0.025 to 8.000, and n = 1, 2, ..., 10. The column labeled "∞" contains the asymptotic values, rounded to four decimal places.

[Table 1. F(z;n) = P(Wn² ≤ z) for z from 0.025 to 8.000 and n = 1, 2, ..., 10, with a column of asymptotic values; the tabled entries are not recoverable from the scan.]
4. POWER STUDY

The power of the Anderson-Darling test was compared with two other goodness-of-fit procedures based on the empirical distribution function: the Kolmogorov and the Cramér-von Mises tests. The Kolmogorov statistic, introduced in Section 1 as metric (1.2) with weighting function ψ[F(x)] ≡ 1, becomes

Kn = sup_{-∞ < x < ∞} √n |Fn(x) - F(x)|.     (4.1)

For an ordered sample x1 ≤ x2 ≤ ... ≤ xn and F(xi) = ui, (4.1) may be evaluated using D = max(D⁺, D⁻), where

D⁺ = max_{1≤i≤n} [i/n - ui]  and
D⁻ = max_{1≤i≤n} [ui - (i-1)/n].

The Cramér-von Mises statistic, defined as

Wn² = n ∫_{-∞}^{∞} [Fn(x) - F(x)]² dF(x),

can be reduced to (4.2) for ease of computation (Appendix C):

Wn² = Σ_{i=1}^{n} [ui - (2i-1)/(2n)]² + 1/(12n).     (4.2)
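Both computing forms, (4.1) via D = max(D⁺, D⁻) and (4.2), are equally direct; a minimal sketch under the same assumptions as before:

    import numpy as np
    from scipy import stats

    def kolmogorov_D(x, cdf=stats.norm.cdf):
        """D = max(D+, D-) for an ordered sample, as in (4.1)."""
        u = np.sort(cdf(np.asarray(x)))
        n = u.size
        i = np.arange(1, n + 1)
        return max(np.max(i/n - u), np.max(u - (i - 1)/n))

    def cramer_von_mises(x, cdf=stats.norm.cdf):
        """Wn^2 via the computing form (4.2)."""
        u = np.sort(cdf(np.asarray(x)))
        n = u.size
        i = np.arange(1, n + 1)
        return np.sum((u - (2*i - 1)/(2*n))**2) + 1/(12*n)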

In the power study, two cases were considered. Case 1 corresponds to the situation in which the parameters of the hypothesized distribution are completely specified. Case 2 corresponds to the situation in which the parameters are not specified and must be estimated from the sample data. For both cases 1 and 2, the null hypothesis is

H0: A random sample X1, X2, ..., Xn comes from a normal population

or

H0: H(x) = F(x), where F(x) = N(μ,σ²).

As alternative hypotheses, the Cauchy, double exponential, and extreme value distributions were chosen, each with location parameter the same as the null hypothesis. This provided a heavy-tailed, light-tailed, and skewed distribution, respectively, against which the power of the three goodness-of-fit tests are compared.

The power functions do not exist in closed form; they are approximated empirically via a Monte Carlo simulation. To determine a point on the power curve, a large number of samples of size n was generated from a specific distribution serving as the alternative hypothesis. The number of times the null hypothesis was rejected at a specific level of significance was recorded. The ratio of the number of rejections, Y, to the total number of samples generated, N, provides an estimate, p̂ = Y/N, of the probability of rejecting the null hypothesis when it should be rejected (power). The value p̂ determines a point on the power curve corresponding to a specific sample size n, a specific significance level α, and a specific alternative hypothesis.

To determine the number of samples of size n required for a sufficiently accurate estimate of p, a nonparametric technique was employed. Since the counter Y is distributed binomial(p, N), where the parameter p is the true but unknown power, and since an approximate confidence interval for p can be constructed [Conover, 1980] using

P( Y/N - z_{1-α/2} √(p̂(1-p̂)/N) < p < Y/N + z_{1-α/2} √(p̂(1-p̂)/N) ) ≈ 1 - α,     (4.3)

samples of size n continued to be generated from the alternative distribution until the confidence interval for p given in (4.3) was sufficiently small.

The confidence interval coefficient 1 - α was chosen to be 0.975 and the confidence interval width not to exceed 0.025. The confidence limits (4.3) were successively evaluated until the interval width was satisfied. Considering a "worst-case" scenario in which p = 1/2, where the variance of its estimate p̂ is greatest, equating Y/N = 1/2 in (4.3) suggests that samples of magnitude 8037 might be required. A minimum value for N of 100 was imposed to prevent premature termination of the procedure.
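The adaptive sampling rule described above might be coded as in the following sketch, with the Kolmogorov test and its asymptotic 5% point standing in for any of the three tests (z = 2.24 approximates the .9875 normal quantile implied by the .975 interval):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng()

    def estimate_power(draw_sample, test_rejects, width=0.025, z=2.24, n_min=100):
        """Monte Carlo power with the stopping rule built on interval (4.3)."""
        rejections = total = 0
        while True:
            rejections += bool(test_rejects(draw_sample()))
            total += 1
            p_hat = rejections / total
            half = z * np.sqrt(p_hat * (1.0 - p_hat) / total)
            if total >= n_min and 2.0 * half <= width:
                return p_hat

    # e.g., Kolmogorov test against Cauchy alternatives, sample size n = 10
    power = estimate_power(
        lambda: rng.standard_cauchy(10),
        lambda x: np.sqrt(x.size) * stats.kstest(x, "norm").statistic > 1.358,
    )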

4.1. Case 1: Distribution Parameters Specified.

The power study for case 1 specified the parameters of the hypothesized distribution as N(0,1). The results of the study are summarized in Figures 1-12. For each of the three distributions serving as an alternative hypothesis, samples of size n = 5, 10, 15, 20 were chosen for study and, as previously mentioned, the location parameters of both the null and alternative hypotheses coincided. The scale parameters for the alternative hypothesis were values from 0.025 to 3.000 in increments of 0.025.

The level of significance for the study was 0.05. The critical value for each test was determined from tables in Conover [1980] for the Kolmogorov test, Stephens and Maag [1968] for the Cramér-von Mises test, and Table 1 in Section 3 of this paper for the Anderson-Darling test.

The Anderson-Darling test demonstrated overall superiority for the sample sizes and hypotheses chosen for this study. This is perhaps to be anticipated in view of the emphasis on agreement in the tails by the Anderson-Darling procedure, but the magnitude of difference over the Kolmogorov and Cramér-von Mises tests is impressive.

The power curves corresponding to n = 10, 15, 20 are distinguished by their characteristic of decreasing to a global minimum before becoming monotone increasing. An explanation of this feature is suggested by consideration of Figures 13-15, in which the distribution functions of the N(0,1) and Cauchy(0,C) are compared. There it is seen (Figure 14) that corresponding to C = 0.50 the two distribution functions are similar; an increase (decrease) in the scale parameter C causes the tails of the distributions to become more distinct. Values in a neighborhood of C = 0.50 marked the global minimum throughout the study.

4.2. Case 2: Distribution Parameters Estimated.

The Anderson-Darling, Kolmogorov, and Cramér-von Mises goodness-of-fit tests were developed for use in case 1, where distribution parameters are specified, and this precludes their use in the more likely situation where parameters must be estimated. In practice, these procedures are sometimes used anyway, with the caveat that the tests are likely to be conservative. Stephens [1974] provides adjustments to the test statistics that enable the tests to be used to test the assumption

H0: H(x) = F(x), where F(x) = N(μ,σ²) and the population parameters are estimated from the data.

The results of the power study for case 2 are summarized in Figures 16-27. As in case 1, the sample sizes are n = 5, 10, 15, and 20, and the level of significance is 0.05. Both location and scale parameters coincide; the scale parameters were values from 0.025 to 3.000 in increments of 0.025.

The power plots are horizontal, demonstrating that power does not change with the scale parameter, which provides empirical support for Stephens' transformations. Power increases with increasing sample size, as would be expected. When both location and scale parameters agree, all three tests are competitive for the sample sizes and alternative distributions chosen for this study.

[Figures 1-12. Empirical power curves for the Anderson-Darling, Kolmogorov, and Cramér-von Mises tests, case 1, for n = 5, 10, 15, 20 against the Cauchy, double exponential, and extreme value alternatives.]

[Figures 13-15. Comparison of the N(0,1) and Cauchy(0,C) distribution functions for selected values of the scale parameter C.]

[Figures 16-27. Empirical power curves for the three tests, case 2, with estimated parameters. The plotted curves are not recoverable from the scan.]
REFERENCES

Anderson, T. W. and D. A. Darling. "Asymptotic Theory of Certain 'Goodness of Fit' Criteria Based on Stochastic Processes," Annals of Mathematical Statistics, 1952, Vol. 23, pp. 193-212.

Anderson, T. W. and D. A. Darling. "A Test of Goodness of Fit," Journal of the American Statistical Association, 1954, Vol. 49, pp. 765-769.

Cogan, Edward J. and Robert Z. Norman. Handbook of Calculus, Difference and Differential Equations. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1958.

Conover, W. J. Practical Nonparametric Statistics, 2nd ed. New York: John Wiley & Sons, 1980.

Cramér, Harald. "On the composition of elementary errors," Skandinavisk Aktuarietidskrift, 1928, Vol. 11, pp. 13-74, 141-180.

Feller, William. An Introduction to Probability Theory and Its Applications, Volume II. New York: John Wiley and Sons, Inc., 1966.

Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Edited by Milton Abramowitz and Irene A. Stegun, National Bureau of Standards, Applied Mathematics Series 55, June 1964.

Harter, H. Leon. "Modified Asymptotic Formulas for Critical Values of the Kolmogorov Test Statistic," The American Statistician, May 1980, Vol. 34, No. 2, pp. 110-111.

Johnson, Norman L. and Samuel Kotz. Continuous Univariate Distributions 1, Distributions in Statistics. New York: John Wiley & Sons, 1970.

Jolley, L. B. W. Summation of Series, 2nd Revised Edition. New York: Dover Publications, Inc., 1961.

Kernighan, Brian W. and Dennis M. Ritchie. The C Programming Language. New Jersey: Prentice-Hall, Inc., 1978.

Kolmogorov, A. N. "Sulla determinazione empirica delle leggi di probabilita," Giornale dell'Istituto Italiano degli Attuari, 1933, Vol. 4, pp. 83-91.

Lewis, Peter A. W. "Distribution of the Anderson-Darling Statistic," The Annals of Mathematical Statistics, 1961, Vol. 32, pp. 1118-1124.

Miller, R. G., Jr. "Table of percentage points of Kolmogorov statistics," Journal of the American Statistical Association, 1956, Vol. 51, pp. 111-112.

Rubinstein, Reuven Y. Simulation and the Monte Carlo Method. New York: John Wiley & Sons, 1981.

Stephens, M. A. "EDF Statistics for Goodness of Fit and Some Comparisons," Journal of the American Statistical Association, September 1974, Vol. 69, No. 347, pp. 730-737.

Stephens, M. A. "Use of the Kolmogorov-Smirnov, Cramér-von Mises and Related Statistics without Extensive Tables," Journal of the Royal Statistical Society, Series B, 1970, Vol. 32, No. 1, pp. 115-122.

Stephens, M. A. and Urs R. Maag. "Further percentage points for W²," Biometrika, 1968, Vol. 55, pp. 428-430.

von Mises, R. Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und theoretischen Physik. Leipzig: F. Deuticke, 1931.
APPENDIX A: EXPECTATION OF SQUARED DISCREPANCY
BETWEEN AN EMPIRICAL DISTRIBUTION FUNCTION
AND A SPECIFIED DISTRIBUTION FUNCTION

To help find a suitable weighting function ψ[F(x)], consider n E[Fn(x) - F(x)]²:

n E[Fn(x) - F(x)]² = n E[Fn(x) - H(x) + H(x) - F(x)]²

= n E[Fn(x) - H(x)]² - 2n{F(x) - H(x)} E[Fn(x) - H(x)] + n{F(x) - H(x)}².

Since E[Fn(x)] = H(x), the middle term vanishes, and n E[Fn(x) - H(x)]² = n Var[Fn(x)] = H(x){1 - H(x)}. Hence

n E[Fn(x) - F(x)]² = H(x){1 - H(x)} + n{F(x) - H(x)}².

Under the null hypothesis, i.e., H0: H(x) = F(x),

n E[Fn(x) - F(x)]² = F(x)[1 - F(x)].
APPENDIX B: EXPANSION AND INTEGRATION OF
THE ANDERSON-DARLING STATISTIC

Wn² = n ∫_{-∞}^{∞} [Fn(x) - F(x)]² / {F(x)[1 - F(x)]} dF(x)

Let F(x) = u, dF(x) = du, and F(xk) = uk, with u0 = 0 and u_{n+1} = 1. On [uk, u_{k+1}), Fn(x) = k/n, so

Wn² = n Σ_{k=0}^{n} ∫_{uk}^{u_{k+1}} (k/n - u)² / [u(1 - u)] du.

Since 1/[u(1 - u)] = 1/u + 1/(1 - u), the integrand expands as

(k/n - u)² / [u(1 - u)] = (k/n)²/u + (1 - k/n)²/(1 - u) - 1,

so that

∫_{uk}^{u_{k+1}} (k/n - u)²/[u(1 - u)] du
  = (k/n)²[ln u_{k+1} - ln uk] - (1 - k/n)²[ln(1 - u_{k+1}) - ln(1 - uk)] - (u_{k+1} - uk).

Summing over k (the logarithms at u0 = 0 and u_{n+1} = 1 carry zero coefficients), the terms -(u_{k+1} - uk) telescope to -1, and collecting the coefficients of ln uj and ln(1 - uj), j = 1, ..., n, by partial summation gives

Wn² = n{ -1 - (1/n²) Σ_{j=1}^{n} [(2j - 1) ln uj + (2(n - j) + 1) ln(1 - uj)] }
    = -n - (1/n) Σ_{j=1}^{n} [(2j - 1) ln uj + (2(n - j) + 1) ln(1 - uj)].
APPENDIX C: DERIVATION OF THE
CRAMÉR-VON MISES STATISTIC

For an ordered sample x1 ≤ x2 ≤ ... ≤ xn the empirical distribution function is defined as

Fn(x) = 0 for x < x1;  Fn(x) = k/n for xk ≤ x < x_{k+1};  Fn(x) = 1 for xn ≤ x.

The Cramér-von Mises statistic may be written

Wn² = n ∫_{-∞}^{∞} [Fn(x) - F(x)]² dF(x).

Let F(x) = u, dF(x) = du, and F(xk) = uk, with u0 = 0 and u_{n+1} = 1. Then

Wn² = n Σ_{k=0}^{n} ∫_{uk}^{u_{k+1}} (k/n - u)² du
    = (n/3) Σ_{k=0}^{n} [(k/n - uk)³ - (k/n - u_{k+1})³].

Regrouping the sum by uj (the boundary terms vanish since u0 = 0 and u_{n+1} = 1) gives

Wn² = (n/3) Σ_{j=1}^{n} [(j/n - uj)³ - ((j - 1)/n - uj)³].

Expanding the difference of cubes and completing the square,

(j/n - uj)³ - ((j - 1)/n - uj)³ = (3/n){[uj - (2j - 1)/(2n)]² + 1/(12n²)},

so that

Wn² = Σ_{j=1}^{n} [uj - (2j - 1)/(2n)]² + 1/(12n).
Nonpare, a Consultation System for Analysis of Data

J.C. Dumer III
T.P. Hanratty
M.S. Taylor

US Army Ballistic Research Laboratory
ATTN: SLCBR-SE-P
Aberdeen Proving Ground, MD 21005-5066

Abstract. Nonpare, a consultation system for analysis of data using nonparametric statistical procedures, is under active development. It is intended to serve as an intelligent interface that will act as a guide, an instructor, and an interpreter to a body of statistical software. Nonpare exists as a prototype, with a limited release planned in 1989 for field testing.

1. Introduction
Statistical software packages, to a large extent, accept any properly configured data set and proceed to process it. Few if any checks are made to ensure the adequacy of the data and the suitability of the analysis, and little is done to provide an explanation or interpretation of the results. This requires a great deal from the user. Declining computation costs, together with increased availability of computers and proliferation of statistical software, have further enhanced the opportunity for faulty data analysis. Application of expert system techniques from artificial intelligence to produce more cognizant software is one approach to reversing this unfortunate trend.
In 1985, a workshop sponsored by AT&T Bell Laboratories brought together many of the active investigators in artificial intelligence and statistics and was the genesis of a book by the same title edited by Gale [1]. This reference is in essence the proceedings of the workshop; but the papers given there, some with extensive bibliographies, provide the most complete centrally-located account of research in this topic to date.
This report details an effort underway at the US Army Ballistic Research Laboratory (BRL) to develop a consultation system for analysis of data using nonparametric statistical procedures. The system, called Nonpare, is intended to serve as an intelligent interface that will act as a guide, an instructor, and an interpreter to a body of statistical software. The system is currently a prototype, with a first release planned for 1989 for field testing.
2. Nonpare
Nonparametric statistics is too large an area to hope to encompass at once, especially if the entire field of mathematical statistics is partitioned into parametric and nonparametric procedures. The common-sense approach to construction of consultation systems suggests limiting the domain of application, but nonparametric statistics has qualities that make it strongly appealing.
Nonparametric data analysis is characterized chiefly by the absence of restrictive distribution assumptions, notably freedom from dependence on the normal (Gaussian) distribution. Many nonparametric statistical procedures are exact rather than approximate for small data sets, and they are the only confirmatory procedures which can be used to analyze data collected on a nominal or an ordinal scale of measurement. For these and other compelling reasons advanced, for example, by Conover [2], Hollander and Wolfe [3], and Lehmann [4], nonparametric procedures find use in a wide variety of disciplines.
2.1 The System Structure
Nonpare uses Genie, an expert system shell developed at the BRL [5], to provide a frame-based production system with forward and backward inferencing as well as an explanation facility that allows the user to interrogate the system: what hypotheses are being entertained, what rules are being verified, what facts are in evidence. Genie was chosen over commercial expert system shells for the research and development of Nonpare because of its accessibility for modification.
Nonpare, shown schematically in figure 1, consists of three subsystems in addition to Genie.

[Figure 1. Nonpare system overview: the Genie inference engine (forward and backward reasoning), with its knowledge base and explanation facility, connects the user to the nonparametric data analysis component and the system dictionary.]

The system dictionary is a facility whose purpose is to provide on-line explanation of statistical jargon that may appear during the interactive dialog between Nonpare and the user. Expert domain knowledge, codified in English-like rules, resides in the knowledge base. Once an appropriate procedure(s) has been identified, the data are analyzed and the results explained by the nonparametric data analysis component. Graphics is used to summarize the data and enhance the explanation. In total, the user is led, within system limitations, to an appropriate statistical procedure through an interactive process in which the user is questioned and can in turn question the consultation system. Nonpare is written in Interlisp-D and currently runs on Xerox 1100 Series Lisp machines.
3. An Illustrative Session
Following the dictum of American educator John Dewey (1859-1952) that "We learn by doing," a detailed session with Nonpare follows, in which the main system features are illustrated.
Example 3.1
Suppose that a ballistician needs to assess the effectiveness of a newly designed kinetic energy penetrator against a specific armor plate. In particular, the experimenter would like to establish whether the probability of perforation exceeds .80, a level already attained with existing technology. Fourteen rounds are fired, and [p]erforation and [n]onperforation recorded to obtain: n, p, p, p, n, p, p, p, p, n, p, p, p, p. Is the Pr{perforation} > .80?
(A diversion here. Searching for a statistical procedure with a set of data already collected is precisely how not to proceed. The purpose for collecting the data should first be established, and then the statistical tools available to support this purpose determined. Then the collection and analysis of data can proceed in an informed manner. Lamentably, the methodology-search scenario is enacted over and over again; so this example is not too contrived.)
It should be apparent from the onset that the question regarding Pr{perforation} > .80 can never be answered unequivocally yes or no, but only with some degree of qualification.
Nonpare presently has nineteen distinct data analysis procedures at its disposal; the number continues to increase. No assumptions have been made about their frequency of use; one procedure has not been declared most likely to be exercised, a second procedure next most likely, and so on, since the base of potential users is so broad. For the user, this means that any procedure is a likely starting point, as in this session, the dialog of which begins in figure 2. In the remainder of this section, the conventions that boldface denotes system prompts and brackets contain user input will be adopted. An occasional system response may be italicized but should not be confusing within the context of its appearance.

Do you have a sample X1, ..., Xn? y
Are you interested in whether the data conform to a specified distribution? n
Are you interested in the probability of occurrence of a particular category or event? y
Enter the name of the category of interest -> perforation
Are the n trials producing the values X1, ..., Xn independent?

Figure 2. Beginning dialog with Nonpare.

The session begins with a question about the configuration of the data.
Do you have a sample X1, ..., Xn? The data, n, ..., p, look like X1, ..., Xn; respond [y]es.
Are you interested in whether the data conform to a specified distribution? Nonpare is investigating a possible goodness-of-fit situation. A statistician, anticipating an approach to this problem, might find a [y]es response appropriate here. A nonstatistician, for whom this portion of the system is designed, and who is interested in whether Pr{perforation} > .80, should respond [n]o, as indicated.
Are you interested in the probability of occurrence of a particular category or event? [y]es. The user is interested in the probability of occurrence of a perforation.
Enter the name of the category of interest. [perforation]. Domain-dependent terminology is being introduced.
Are the n trials producing the values X1, ..., Xn independent? Suppose the user is unsure of the technical implications of the term "independent." An acceptable response is [What is independent], as shown in figure 3.

Are the n trials producing the values X1, ..., Xn independent? what is independent
Independence relates to freedom from external influence or control; here, the reference is to measurements (data) being free to assume values without regard to other measurements that may be made.
Are the n trials producing the values X1, ..., Xn independent? y
Does each trial have the same probability p of producing the perforation? y
Are you interested in considering whether the probability of occurrence of the perforation equals or is bounded by some specified value p*? y

Figure 3. A call to the system dictionary.

Independence relates to freedom from external influence or control; here, the reference is to measurements (data) being free to assume values without regard to other measurements that may be made.
This illustrates a dilemma for the subject area specialist. It may be impossible to rigorously define a term without reliance upon other terms that are equally obscure to a user with only a modest statistical background. This is the case here, where independence is bound to basic concepts of probability theory. Nonpare's response conveys the notion, but regrettably not the substance, of independence. More work is needed here. For now, assume the experimenter has collected a set of independent data.
Are the n trials producing the values X1, ..., Xn independent? [y]es.
Does each trial have the same probability p of producing the perforation? [y]es. Notice that Nonpare is now using language the user provided, when it talks about probability of perforation.
Are you interested in considering whether the probability of occurrence of the perforation equals or is bounded by some specified value p*? [y]es. The user is interested in the inequality Pr{perforation} > .80. After a [y]es response, the system suggests a possible approach, shown in figure 4.

The binomial test is an appropriate procedure. To execute the binomial test, use the menu to complete this statement:
I am interested in testing the null hypothesis that: The probability of occurrence of the perforation ...
  [equals some value p*]
  [does not exceed p*]
  [is at least p*]

Figure 4. A call to the nonparametric data analysis subsystem.

The menu allows the user to select either a two-sided or one-sided test of hypothesis and is a potential source of error. Beginning statistics students, not realizing that a null (or empty) hypothesis is chosen to be rejected, might mistakenly choose is at least p* at this juncture. Here again, some level of statistical competence is required. Selecting the hypothesis does not exceed p* from the menu using a mouse, the user obtains for confirmation (figure 5) the statement:
I am interested in testing the null hypothesis that: The probability of occurrence of the perforation does not exceed p*.

I am interested in testing the null hypothesis that: The probability of occurrence of the perforation does not exceed p*.

Specify the sample size n -> 14
Specify a value for p* -> .80
Specify the number of datum values assigned to the perforation -> 11

Figure 5. Hypothesis confirmation and input parameter declaration.

Specify the sample size n. [14]
Specify a value for p*. [.80]
Specify the number of datum values assigned to the perforation. [11]
The first two "Specify ..." commands determine the appropriate binomial distribution; the third determines the size of the critical region for the statistical procedure, which is explained in figure 7, following the system-generated histogram shown in figure 6.

[Figure 6. Region of rejection: the binomial probability histogram Pr(X = x), x = 0, 1, ..., 14, under Pr{perforation} = .80, with the region of rejection shaded light gray.]

The histogram displays the probability of observing exactly n (0 ≤ n ≤ 14) armor perforations if the true (but unknown) Pr{perforation} = .80. A statistician will readily assimilate this graph. If the user merely sees it as a plot in which the light gray region, corresponding to counts of 11 or more, holds some special significance, and it provides some measure of reassurance regarding the computations, it will have served its purpose here. Figure 7, which appears on the terminal simultaneously with the histogram, explains the result.
This means that if you reject the hypothesis (The probability of occurrence of the perforation does not exceed .8) you do so with a .69 probability of being in error.

The critical level of this test, corresponding to the light gray region, is .69.
This means that if you reject the hypothesis (The probability of occurrence of the perforation does not exceed .8) you do so with a .69 probability of being in error.

Would you like to run the binomial procedure again? n

Figure 7. Explanation and interpretation of results.

Since the investigation began with the assumption (null hypothesis) that the Pr{perforation} ≤ .80, the evidence collected (eleven perforations, three nonperforations) is not sufficient to support abandonment of that assumption. A probability of being in error of .69 is more than a reasonable person would be willing to assume. And so, the response to the original question, Is the Pr{perforation} > .80?, is a qualified no, the qualification being expressed through invocation of the critical level.
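The critical level Nonpare reports is simply an upper-tail binomial probability and is easy to verify. A one-line check (SciPy is my choice here; Nonpare itself is written in Interlisp-D):

    from scipy import stats

    # P(X >= 11) for X ~ binomial(n = 14, p = .80): the chance of evidence at
    # least this favorable to rejection when H0 holds with p exactly .80.
    critical_level = stats.binom.sf(10, 14, 0.80)   # ≈ 0.698, the .69 reported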
Would you like to run the binomial procedure again?
At this juncture, an experimenter might well be asking a number of "what if" questions. "What if I had been able to afford three more firings?" or, "What if I had observed one more perforation?" and so on. A response of [y]es here allows the user to exercise the binomial procedure directly, without having to respond again to all the preliminary questions. A [n]o response is given, but this is an excellent place to use Nonpare's tutorial capabilities to study the sensitivity of the binomial procedure to modification of parameter values or slight changes in the data.
Are you interested in determining an interval in which the probability p of occurrence of the perforation lies?
The foregoing analysis suggests that an assertion that the probability of perforation lies in the interval (.80, 1] cannot be made. What interval might be expected to capture this unknown parameter? A response of [y]es causes this question to be answered, first graphically, as in figure 8, and then verbally, as in figure 9.
CONFIDENCE INTERVAL
with 95.0% Confidence Level
p̂ = .78
[.48, .94]

Figure 8. Display for a 95% confidence interval.

Figure 8 shows that the Pr{perforation}, whose estimate based on the fourteen firings is p̂ = .78, lies within the interval [.48, .94] with a high level of confidence. This interval is so broad one can see why the assertion that Pr{perforation} > .80 is ill-advised. The formal interpretation of the confidence interval is given as
The probability of occurrence of the perforation is contained in the interval [.48, .94] with an a priori probability .95.

Are you interested in determining an interval in which the probability p of occurrence of the perforation lies? y

The probability of occurrence of the perforation is contained in the interval [.48, .94] with an a priori probability .95.

Would you like a confidence level other than .95? n

Figure 9. Explanation and interpretation of the confidence interval.

Would you like a confidence level other than .95? [n]o. The 95% confidence level was prechosen. A [y]es response allows the user to control the confidence level. The session is terminated with a [n]o response, shown in figure 9.
At the conclusion of the session the inference engine displays a fact solution tree for all the intermediate decisions leading to the final conclusion. Buttoning any node of the fact tree with a mouse produces the logic leading to that location. In figure 10, fact 11 was buttoned, and the corresponding trace is displayed beneath the fact tree. These are features of the inference engine rather than Nonpare, but they are valuable as diagnostics to the developer and provide some measure of reassurance to the user.
[Figure 10. The fact solution tree displayed at the conclusion of the session, with the trace produced by buttoning fact 11 shown beneath it; the screen image is not recoverable from the scan.]
4. Conclusions
Nonpare, a consultation system for analysis of data using nonparametric statistical procedures, has been described, and most of its operational features have been illustrated. The essence of the system is the rule-based interface with accompanying software for data analysis and the interpretation of the ensuing computations. Nonpare is under active development, but its feasibility as an operational system has been established. Enlargement of the rule base and the addition of more statistical procedures are clearly indicated before it can approach its potential. Not surprisingly, tangential problems in basic research have been spawned by this effort. A first release is planned for 1989 for field testing.

References
[1] W.A. Gale, Ed., Artificial Intelligence and Statistics (Addison-Wesley, 1986).
[2] W.J. Conover, Practical Nonparametric Statistics (John Wiley, 1980).
[3] M. Hollander and D.A. Wolfe, Nonparametric Statistical Methods (John Wiley, 1973).
[4] E.L. Lehmann, Nonparametrics (Holden-Day, 1975).
[5] F.S. Brundick, et al., Genie: An inference engine with applications to vulnerability analysis, Technical Report BRL-TR-2739, US Army Ballistic Research Laboratory, Aberdeen Proving Ground, MD (1986).

Numerical Estimation of Gumbel Distribution Parameters

Charles E. Hall, Jr.

Research Directorate
Research, Development, and Engineering Center
U.S. Army Missile Command
Redstone Arsenal, AL 35894-5248

ABSTRACT. The parameters which maximize the log-likelihood function of the Gumbel distribution were estimated by two different methods. A derivative approach was used, which calculated the intersection of the zeros of the implicit functions obtained from the derivatives of the log-likelihood function. A direct maximization was also performed. Both methods yielded positive results.
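A minimal sketch of the two approaches on simulated data (all function names, the optimizer choices, and the starting values are illustrative assumptions, not the author's code):

    import numpy as np
    from scipy import optimize, stats

    rng = np.random.default_rng(1)
    x = stats.gumbel_r.rvs(loc=2.0, scale=1.5, size=500, random_state=rng)

    def neg_loglik(theta):
        """Negative Gumbel log-likelihood: n ln b + sum(z + exp(-z)), z=(x-mu)/b."""
        mu, beta = theta
        if beta <= 0:
            return np.inf
        z = (x - mu) / beta
        return x.size * np.log(beta) + np.sum(z + np.exp(-z))

    # Direct maximization of the log-likelihood:
    direct = optimize.minimize(neg_loglik, x0=[x.mean(), x.std()], method="Nelder-Mead")

    # Derivative approach: locate the common zeros of the score equations:
    def score(theta):
        mu, beta = theta
        z = (x - mu) / beta
        e = np.exp(-z)
        return [np.sum(1 - e) / beta,              # d(loglik)/d(mu)
                np.sum(z * (1 - e) - 1) / beta]    # d(loglik)/d(beta)

    roots = optimize.fsolve(score, x0=[x.mean(), x.std()])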

EXPERIMENTAL DESIGN AND OPTIMIZATION OF BLACK CHROME
SOLAR SELECTIVE COATINGS
I. J. Hall and R. B. Pettit
Sandia National Laboratories
Albuquerque, NM 87185
ABSTRACT. Some years ago Sandia Laboratories was given the task of investigating selective coatings for solar applications. Early experimental results, which were based on one-variable-at-a-time experiments, produced acceptable coatings in the laboratory. However, when full-scale parts were coated by commercial electroplaters, the coatings quickly degraded when heated in air. At this point a systematic approach using a fractional factorial design was used to determine both the effects of and interactions between several variables, including the bath composition (four variables), current density, plating time, substrate, and bath temperature. Response surfaces for the optical properties of the coatings were constructed for both the as-plated and the thermally aged samples. These response surfaces were then used to specify ranges for the bath compositions, and other plating parameters, that provided coatings with optimum thermal stability. With proper control of the plating variables, selective coatings were obtained that should maintain high solar absorptance values during years of operation at 300°C in air.
1. INTRODUCTION. Two variables are of interest to
selective coating investigators, namely, absorptance (α) and
emittance (ε). Good selective coatings have high α's and low
ε's. In our investigations we concentrated on making α as large
as possible and settling for the corresponding ε if it was not
"too big". The independent variables that affected α and ε
divided themselves into two groups (bath variables and plating
variables) in such a way that a split-plot experimental design
would have been appropriate. The bath variables would have been
associated with the whole plots and the plating variables with
the sub-plots. The bath variables were chromic acid, trivalent
chromium, addition agent, and iron, and the plating variables were
plating time, current density, bath temperature, bath agitation,
and substrate. For a specified combination of bath variables an
entire set of experiments was possible for the plating variables,
as in a split-plot design. Because of many constraints we did not run
the experiment as a split-plot design. The dependent variable
readings ( α's ) were obtained by coating a substrate and then
measuring the absorptance with a Beckman Model DK 2A
spectroreflectometer. Readings were obtained for the substrate
both as-plated and as-aged. The as-aged readings were obtained
after the specimens were heated in a 450 °C furnace for 40 hours,
while the as-plated readings were taken before they were
subjected to any extreme environments. The aged readings were
the most important because we were concerned about the thermal
stability of the coatings, i.e., whether the coatings would resist
degradation at high temperature for extended time periods. The experimentation
was done in three phases that are briefly described below.

187
2. Experimentation. Based on previous experience, we
decided that the bath variables were most important and thus we
concentrated most of our efforts on investigating these
variables. The plating variables were set at nominal values. We
used standard response surface methodology to guide us in the
experimentation. (See Box, Hunter, and Hunter, "Statistics for
Experimenters", Chapter 15, 1978.) The first phase consisted of
running a 1/2 replicate of a 2^4 factorial experiment on the four
bath variables. The experimentation was done in a rather limited
range of the factor space. The results of this experiment were
used to determine a path of steepest ascent (Phase two). Three
more experiments were done along this line of steepest ascent.
These experiments would normally indicate a region in the bath
variable space that would produce larger α values. In our case,
however, all the coatings turned gray after a short time in the
furnace - a highly undesirable result. The most valuable
information from these three bath experiments was that a "cliff"
existed in the response surface. Because of time limitations we
did not repeat the experiments along the steepest ascent line.
Based on a combination of engineering judgement and factorial
design methodology, several more baths were mixed and the α's
measured on the coated substrates (Phase three). A total of
eighteen baths were mixed and the results from these baths were
used to estimate a quadratic surface - i.e., α was written as a
function of a second degree polynomial in the four bath variables
and the variable coefficients were estimated using a backward
stepwise statistical package. The final regression equation had
11 terms including the constant term, with an R² = 0.96. Several
graphs were drawn based on this equation that allowed us to map
out an acceptable region in the bath variable space. This space
was very near the "cliff" in the response surface. A limited
number of experiments also were done involving the plating
variables for a fixed bath. Based on these experiments we were
able to specify ranges for the plating variables as well.
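To make the Phase-three style of analysis concrete, here is a minimal sketch of
fitting a quadratic response surface and pruning it by backward elimination. It is
our own illustration, not the authors' software; the variable names and data are
invented, and a real analysis would use the eighteen measured baths.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from sklearn.preprocessing import PolynomialFeatures

    # Invented data standing in for the 18 baths (4 bath-composition variables).
    rng = np.random.default_rng(0)
    X = pd.DataFrame(rng.uniform(0.0, 1.0, (18, 4)),
                     columns=["chromic_acid", "trivalent_Cr", "agent", "iron"])
    alpha = 0.90 + 0.05 * X["chromic_acid"] - 0.08 * X["iron"] ** 2 \
            + rng.normal(0.0, 0.005, 18)

    # Full second-degree polynomial in the four bath variables.
    poly = PolynomialFeatures(degree=2, include_bias=False)
    Z = pd.DataFrame(poly.fit_transform(X),
                     columns=poly.get_feature_names_out(X.columns))

    # Backward elimination: drop the least significant term until all remain
    # significant (or only one term is left).
    while True:
        fit = sm.OLS(alpha, sm.add_constant(Z)).fit()
        pvals = fit.pvalues.drop("const")
        if pvals.max() < 0.05 or Z.shape[1] == 1:
            break
        Z = Z.drop(columns=[pvals.idxmax()])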
3. Summary. Using response surface methodology we were
able to determine the variables and the range of variables that
produced stable selective coatings. The procedures developed in
the laboratory were subsequently implemented in a production
environment with excellent results. The close interaction
between the statistician and the experimenter led to a
satisfactory solution with a rather limited number of
experiments.

188
DETERMINATION OF DETECTION RANGE OF
MONOTONE AND CAMOUFLAGE PATTERNED FIVE-SOLDIER
CREW TENTS BY GROUND OBSERVERS

George Anitole and Ronald L. Johnson


U. S. Army Belvoir Research, Development
And Engineering Center
Fort Belvoir, Virginia 22060-5606

Christopher J. Neubert
U. S. Army Materiel Command
Alexandria, Virginia 22333-0001

ABSTRACT

Field evaluations have determined that camouflage patterns reduce detectability ranges for
uniforms and vehicles in woodland environments. This study identified the effects of three
patterned and two monotone Five-Soldier Crew Tents using detection ranges and numbers of false
detections as determined by ground observers. The distances of correct detections were recorded
along with the number of false detections. An analysis of variance for the detection ranges and
number of false detections was performed. Duncan's Multiple-Range Test was used to
determine significant differences (α = 0.05) in groups of tents. From these data, it was deter-
mined that the three patterned Five-Soldier Crew Tents were more difficult to detect than the
two monotone tents.

1.0 SECTION I - INTRODUCTION

Several years ago, the U.S. Army decided that camouflage patterns have a definite ad-
vantage when used on uniforms and vehicles in woodland environments. This led to a similar
consideration for tents, since the current U.S. Army tents are solid (i.e., monotone) colors. Tents
present a large, relatively smooth form, making them conspicuous targets. The use of patterns
to break up this signature could increase camouflage effectiveness. However, before such a
judgement could be made, a field test was planned to determine the relative merits of various
patterns versus monotones in a woodland background. The Natick RD&E Center fabricated
three patterned tents and two monotone tents for evaluation. In consultation with Belvoir, the
patterned tents were fabricated in the standard four-color uniform pattern, one in the standard
pattern size and the other two in progressively larger expanded patterns. The two monotone
tents were in colors Forest Green and Green 483 (483 being the textile equivalent of paint color
Green 383). A test plan 1/ was developed by Belvoir at the request and funding of Natick, and
the field test was conducted by Belvoir at Ft. Devens, Massachusetts, in the summer of 1987.
This report describes the test and its results.

189
2.0 SECTION II - EXPERIMENTAL DESIGN

2.1 Test Targets

Five Five-Soldier Crew Tents were supplied by Natick for this study in the following pat-
terns and colors:
o Tent A - Standard size four-color uniform pattern repeated every 27.25 inches
o Tent B - Forest Green
o Tent C - Expanded four-color uniform pattern repeated every 36 inches
o Tent D - Expanded four-color uniform pattern repeated every 50 inches
o Tent E - Green 483
2.2 Test Sites
The study was conducted at the Turner Drop Zone, Ft. Devens, Massachusetts, a large
cleared tract of land surrounded by a mix of coniferous and deciduous trees resembling a central
European forest background. Two test sites were selected. Site #1 was located on the western
end of the drop zone, so that the morning sun shone directly upon the test tent. Site #2 was
located on the eastern edge of the drop zone, so that the afternoon sun shone directly upon the
test tent. An observation path, starting at the opposite end of the drop zone from the test tent
location, was laid out for each site. Each path followed zig-zag, random length directions toward
its test site, and afforded a continuous line of sight to its respective test tent location. The
paths were within a 30° to 40° cone from the target tents, and were surveyed and marked at ap-
proximately 50-meter intervals using random letter markers. For Site #2, the distance between
markers after the first 15 markers was about 25 meters along the path. A night evaluation in-
volving other camouflage targets led to this procedural change. The markers and distances from
the tents are shown in Tables 1 and 2.

190
Table 1
Distances of Markers to Tents for Site #1

ALPHABET   DISTANCE IN          ALPHABET   DISTANCE IN
MARKER     METERS ALONG         MARKER     METERS ALONG
           PATH FROM                       PATH FROM
           STARTING POINT                  STARTING POINT
           TO TENT                         TO TENT

S          1,182.64             S'         464.78
Y          1,128.57             Y'         446.74
Q          1,094.00             Q'         428.17
L          1,049.93             L'         413.48
F          1,008.07             F'         398.48
P            978.31             P'         383.34
E            947.02             E'         364.04
K            902.75             K'         348.27
A            858.10             A'         334.46
T            817.81             T'         322.69
V            778.91             V'         308.59
B            750.15             B'         289.59
M            709.76             M'         281.60
U            674.87             U'         269.08
H            702.65             H'         253.16
Z            677.99             Z'         235.50
R            648.46             R'         217.81
N            613.35             N'         199.60
X            602.56             X'         178.93
I            594.57             I'         156.76
D            578.05             D'         141.15
C            561.16             C'         120.05
O            541.70             O'         102.34
J            525.33             J'          85.37
G            505.62             G'          62.81
W            483.64             W'          41.84
Table 2

Distances of Markers to Tents for Site #2

ALPHABET   DISTANCE IN          ALPHABET   DISTANCE IN
MARKER     METERS ALONG         MARKER     METERS ALONG
           PATH FROM                       PATH FROM
           STARTING POINT                  STARTING POINT
           TO TENT                         TO TENT

F          1,205.36             A            653.34
W          1,168.63             Z            813.20
U          1,130.58             E            574.09
O          1,086.03             P            540.30
C          1,048.10             H            513.10
R          1,006.15             K            496.46
V            982.00             S            475.57
Q            974.13             F'           459.10
M            942.37             W'           417.71
I            901.58             U'           379.40
B            889.75             O'           338.25
J            858.01             C'           296.90
L            851.84             R'           278.53
X            841.28             V'           258.20
G            803.95             Q'           220.73
D            764.09             I'           180.87
Y            723.48             B'           143.94
T            695.32             J'           111.00
N            673.60             L'            89.78

2.3 Test Subjects

A total of 153 enlisted soldiers from Ft. Devens served as ground observers. All person-
nel had at least 20/30 corrected vision and normal color perception. A minimum of 30 observers
were used for each test tent, about evenly split between test sites. Each observer was used only
once.

2.4 Data Generation

The test procedure was to determine the detection distances of the five tents involved by
searching for them while traveling along the predetermined measured paths. Each ground ob-
server started at the beginning of the observation path, i.e., marker S for Site #1 and marker
F for Site #2. The observer rode in the back of an open 5/4-ton truck accompanied by a data
collector. The truck traveled down the observation path at a very slow speed, about 3-5 mph.
The observer was instructed to look for military targets in all directions except directly to his
rear. When a possible target was detected, the observer informed the data collector and pointed
to the target. The truck was immediately stopped, and the data collector sighted the apparent
target. If the sighting was correct, i.e., the Five-Soldier Crew Tent, the data collector recorded
the alphabetical marker nearest the truck. If the detection was not correct, the false detection
was recorded, and the data collector informed the observer to continue looking. The truck
proceeded down the observation path. This search process was repeated until the correct tar-
get (tent) was located.

The tents were rotated between the two test sites on a daily basis, until all tents had been
observed by at least 15 observers at each site. (This number of observers allows the use of
parametric statistics, which have a good opportunity to yield absolute conclusions.) Their orien-
tations with respect to the sun were kept constant at both test sites. The Five-Soldier Crew
Tent was positioned so that a full side was facing the direction of observer approach.

3.0 SECTION III - RESULTS

3.1 Range of Detection

Tables 3, 4, and 5 show the detection data for the Five-Soldier Crew Tents. Table 3 gives
the mean detection range in meters for each tent, and its associated 95 percent confidence in-
terval. Table 4 shows the analysis of variance 2/ performed upon the data of Table 3 to deter-
mine if there were significant differences in the detection ranges, i.e., if pattern and color had
an effect upon detection range. Table 5 indicates which tent patterns and solid colors differed
significantly from each other in this respect. Figure 1 is a graphic display of the detection ran-
ges of Table 3.

Table 3
Mean Detection Ranges (Meters) and 95 Percent
Confidence Intervals

                                          95 PERCENT CONFIDENCE
                           STANDARD       INTERVAL
TENT   N      MEAN         ERROR          LOWER LIMIT   UPPER LIMIT

A      31     327.54       127.75         280.68        374.40
B      30     427.71       173.74         362.83        492.58
C      32     351.17       129.42         304.51        397.83
D      30     387.12       161.79         326.76        447.59
E      30     674.88       214.94         594.62        755.14

193
Table 4
Analysis of Variance for Tent Detection
Across Five Levels of Color Variation

             DEGREES
             OF
SOURCE       FREEDOM    SUM OF SQUARES    MEAN SQUARE    F-TEST    SIG LEVEL

TENT COLOR   4          2,377,907.968     594,476.992    22.09     0.00*
ERROR        148        3,983,214.260      26,913.610
TOTAL        152        6,361,122.228

*Significant at α less than 0.001 level.

Table 4 indicates that there are significant differences in the ability of the ground observers
to detect the Five-Soldier Crew Tents in different four-color patterns and solid colors.

[Figure 1 (graphic not recoverable from scan). Mean Ranges of Detection and 95 Percent
Confidence Intervals for Five-Soldier Crew Tents; vertical axis in meters, horizontal axis
the five tents A through E]

194
Table 5

Duncan's Multiple-Range Test (Range of Detection)

SUBSET 1              SUBSET 2              SUBSET 3

GROUP    MEAN         GROUP    MEAN         GROUP    MEAN

A        327.54       C        351.17       E        674.88
C        351.17       D        387.12
D        387.12       B        427.71

The harmonic mean group size is 30.58. The subsets are significant at α = 0.05.

Duncan's Multiple-Range Test separates a set of significantly different means into sub-
sets of homogeneous means. One of the assumptions is that each random sample is of equal
size. Since this was not true, the harmonic mean of the group sizes was used as the group size. As
seen above, the range of detection was shortest for tents A, C, and D, and these tents do
not differ significantly from each other (α = 0.05). Tent E had the longest mean range of detec-
tion and is significantly (α = 0.05) different from the other 4 tents in this respect.
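The analysis in Tables 4 and 5 can be approximated with standard statistical software.
The sketch below is our own illustration, not the authors' computation: it runs the one-way
ANOVA with scipy and then groups tents with a Tukey HSD multiple-comparison test, a
widely available stand-in for Duncan's Multiple-Range Test (Duncan's test is rarely
implemented in modern libraries). The data files are hypothetical placeholders for the
per-observer detection distances, which are not reproduced in this report.

    import numpy as np
    from scipy.stats import f_oneway
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    # Hypothetical files, one array of detection ranges (meters) per tent.
    ranges = {t: np.loadtxt(f"tent_{t}.txt") for t in "ABCDE"}

    F, p = f_oneway(*ranges.values())                 # one-way ANOVA, as in Table 4
    groups = np.concatenate([[t] * len(v) for t, v in ranges.items()])
    values = np.concatenate(list(ranges.values()))
    print(pairwise_tukeyhsd(values, groups, alpha=0.05))  # homogeneous subsets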
3.2 False Detections

The number of false detections is defined as the number of times a target other than the
test target is detected by an observer. In this study such detections were rocks, trees, shadows,
etc. These detections, as a rule, are a function of how hard it is to detect the test target. The
more difficult the detection task, the greater the number of false detections. Tables 6, 7, and
8 show the false detection data. Table 6 gives the mean false detection value, and its associated
95 percent confidence interval, for each of the Five-Soldier Crew Tents. Table 7 contains the
analysis of variance performed upon the data of Table 6 to determine if there were significant
differences in the rate of false detections. Table 8 indicates which tent patterns and colors had
significant rates of false detection.

Table 6

Mean False Detection Rates and 95 Percent Confidence Intervals

                                        95 Percent Confidence
                         Standard       Interval
Tent   N      Mean       Error          Lower Limit   Upper Limit

A      31     4.87       3.27           3.67          6.07
B      30     3.53       2.53           2.59          4.48
C      32     3.38       1.96           2.67          4.08
D      30     3.87       2.76           2.83          4.90
E      30     2.50       1.91           1.79          3.21

195
Table 7
Analysis of Variance for Rates of False
Detection Across Five Levels of Color Variance

             DEGREES
             OF
SOURCE       FREEDOM    SUM OF SQUARES    MEAN SQUARE    F-TEST    SIG LEVEL

TENT COLOR   4            90.088          22.521         3.50      0.009*
ERROR        148         953.417           6.442
TOTAL        152        1043.503

*Significant at less than 0.01 level.

Table 7 indicates that there are significant differences in the rates of false detection for
the Five-Soldier Crew Tents.

[Figure 2 (graphic not recoverable from scan). Mean Rates of False Detection and 95 Percent
Confidence Intervals for Five-Soldier Crew Tents]

196
Table 8
Duncan's Multiple-Range Test
(Rates of False Detection)

SUBSET 1              SUBSET 2

GROUP    MEAN         GROUP    MEAN

E        2.50         B        3.53
C        3.38         D        3.87
B        3.53         A        4.87
D        3.87

Harmonic mean group size is 30.58.

The rates of false detection for tent groups E, C, B, and D, and B, D, and A were not sig-
nificantly different (α = 0.05). However, subset 1 is significantly different from subset 2.

4.0 SECTION IV - DISCUSSION

Duncan's Multiple-Range Test (Table 5) shows that the group of Five-Soldier Crew
Tents A, C, and D had the shortest detection ranges. Tent A is the standard size woodland
uniform four-color pattern, while Tents C and D are expansions of this pattern. The pattern of
Tent A is repeated every 27.25 inches, the pattern for Tent C is repeated every 36 inches, and
the pattern for Tent D is repeated every 50 inches. Tents C, D, and B are not significantly different
from each other. Tent B is the solid color Forest Green. Tent E, which is the solid color Green
483, had the longest mean detection range (674.88 meters), and this is significantly (α = 0.05)
longer than any of the other means for the Five-Soldier Crew Tents. Thus, it can be concluded
that the patterned tents are harder to detect from ground observation, but that the pattern
should not be expanded beyond the repeat of every 36 inches. The human eye is probably resolv-
ing the larger pattern repeated every 50 inches as being different from the tree and bush back-
ground (the color brown, in particular, becomes distinguishable from the woodland background
when overexpanded).
When working with detection ranges, the question of field data stability is always paramount
to the amount of weight that can be given to the test conclusions. One of the best methods to
determine data stability is through a test-retest procedure. Field studies are very expensive and
time consuming, so such data are very rare. We do have such an opportunity to examine this type
of data for the Turner Drop Zone. A ground evaluation of camouflage nets was conducted in
the summers of 1985 3/ and 1987 4/. The net sites and test procedures were identical to the sites
and test procedures in which the Five-Soldier Crew Tents were evaluated. In both net studies,
the standard camouflage net was evaluated. In 1985 this net had a mean detection range of
411.75 meters, while in 1987 the mean detection range was 414.41 meters. This difference in
mean detection range is only 2.66 meters. From these results, it is inferred that the mean detec-
tion ranges for the Five-Soldier Crew Tents are stable, and solid conclusions about their
camouflage effectiveness can be made.
The analysis of false detections seen in Table 8 and Figure 2 also lends credence to the
belief that Five-Soldier Crew Tent A had the best performance as to camouflage effective-
ness, and Tent E the worst. Following the discussion of false detections in Section 3.2, it would
be expected that Tent A, being the hardest to find, would have the most false detections, and
Tent E the least number of false detections. This is exactly what occurred, with Tent A having
a mean false detection rate of 4.87, and Tent E a mean false detection rate of 2.50. Duncan's
Multiple-Range Test (Table 8) shows that the two rates of false detection differ significantly
(α = 0.05) from each other. The false detection rates of tents B, C, and D are not in the expected
ordinal position. The expected order, based upon mean range of detection, would be B, D, and
C, while the true order of rates of false detection is C, B, and D. However, a check of Tables 5
and 8 shows that these tents are not significantly different from each other either for range of
detection or for rate of false detection. Thus, from a statistical view, these three tents are
considered to have the same ordinal position.

5.0 SECTION V - SUMMARY AND CONCLUSIONS

Five Five-Soldier Crew Tents were evaluated by ground observers to determine their
camouflage effectiveness as measured by the mean detection range and the mean rate of false
detection. These tents were in the following four-color camouflage patterns and solid colors:
* Tent A - Standard size four-color uniform pattern repeated every 27.25 inches
* Tent B - Forest Green
* Tent C - Expanded four-color uniform pattern repeated every 36 inches
* Tent D - Expanded four-color uniform pattern repeated every 50 inches
* Tent E - Green 483

A minimum of 30 ground observers per Five-Soldier Crew Tent were driven toward each of two
sites on marked observation trails in the back of an open 5/4-ton truck. The observers were
looking for military targets, and they informed the data collector when they thought they saw
one. If the detection was correct, the closest alphabetic ground marker to the truck was recorded.
From this letter, the distance to the tent from the truck was determined. If the detection was
not correct, i.e., a false detection, it was noted on the data sheet. The ground observer then con-
tinued the search, with the truck traveling down the observation path until the test target was
seen. An analysis of the resulting data provided the following conclusions:

A. Five-Soldier Crew Tent A was the most camouflage effective, with the lowest mean
range of detection and highest rate of false detections.

B. Four-color pattern Five-Soldier Crew Tents are more camouflage effective than solid
colors.
198
C. The expanded four-color pattern, repeated every 50 inches, is too large to be effective
in denying detection. (The color brown becomes distinguishable from the woodland background
when overexpanded.)

D. The solid colors Green 483 and standard Forest Green should not be used.

E. The mean range of detection data appears to be very stable. A test-retest field study
using identical sites and test procedures in the summers of 1985 and 1987 involving the stand-
ard camouflage net yielded mean detection ranges of 411.75 and 414.41 meters respectively.

REFERENCES

1. Anitole, George and Johnson, Ronald, Unpublished Outline Test Plan, Evaluation of
Camouflage Tents, U.S. Army Belvoir Research, Development and Engineering Center, Fort Bel-
voir, VA, 1987.
2. Natrella, Mary G., Experimental Statistics, National Bureau of Standards Handbook 91, U.S.
Department of Commerce, Washington, D.C., 1966.
3. Anitole, George, and Johnson, Ronald, Statistical Evaluation of Woodland Camouflage Net by
Ground Observers, U.S. Army Belvoir Research, Development and Engineering Center, Fort Bel-
voir, VA, August 1986.
4. Anitole, George, and Johnson, Ronald, Evaluation of Woodland Camouflage Nets by Ground
Observers, U.S. Army Belvoir Research, Development and Engineering Center, Fort Belvoir, VA,
1988.

199
AN EXAMPLE OF CHAIN SAMPLING AS USED IN ACCEPTANCE TESTING

JERRY THOMAS
ROBERT L UMHOLTZ
WILLIAM E. BAKER

PROBABILITY & STATISTICS BRANCH


SYSTEM ENGINEERING & CONCEPTS ANALYSIS DIVISION
US ARMY BALLISTIC RESEARCH LABORATORY
ABERDEEN PROVING GROUND, MD 21005-5066

ABSTRACT

The Probability and Statistics Branch of the Ballistic Research Laboratory was asked to
develop a procedure of acceptance testing for armor packages. Because the available sample
sizes were extremely small, we were unable to identify a sampling plan directly applicable to
this problem. Accordingly, we have devised a new procedure by adapting an existing tech-
nique, known as chain sampling, to both the attribute portion (structural integrity) and the
variable portion (penetration depth) of the acceptance testing process. Operating charac-
teristic curves and power curves are presented for this procedure, and suggestions are made
concerning the simultaneous use of quality control charts.

201
I. INTRODUCTION
In most cases a consumer's decision concerning whether or not to accept a manufac-
tured product is based on an examination of a sample from that product. When General
Mills introduces a new pre-sweetened breakfast cereal, they spend millions of dollars in
advertisement costs with the hope that the consumer will sample it. Here, the consumer con-
siders the entire supply of this new cereal as a single manufactured lot, to be accepted or
rejected. Product acceptance, in this case, corresponds to the consumer purchasing more
boxes of the new cereal.
This is merely an everyday example of what is known as acceptance sampling, that is,
various techniques which allow for discrimination between an acceptable product and an
unacceptable one. Sampling may be based on an attribute criterion, a variable criterion, or
some combination of these. In our example the consumer may judge the sweetness of the
cereal as satisfactory or excessive (attribute), or he may measure the time in milk before the
cereal becomes soggy (variable). Sampling by attributes is a dichotomous situation in that,
based on a particular attribute, each item is either defective or non-defective; rejection occurs
if there is a high percentage of defectives in the sample. Sampling by variables establishes an
acceptable level of a particular variable, and rejection occurs if its sample value crosses the
acceptable threshold. Of course, in our example of a box of cereal, the sample size was one.
Generally, this will not be the case; but occasionally, for one reason or another, the consumer
is forced to make a decision based upon a very small sample size.
Because decisions are made from samples, there is some risk of error, either the error of
accepting a bad product or the error of rejecting a good product. The amount of protection
desired against such risks can be specified. The Acceptable Process Level (APL) is a high-
quality level that should be accepted 100(1-α)% of the time; α is thus defined to be the
producer's risk. The Rejectable Process Level (RPL) is a low-quality level that should be
accepted only 100(β)% of the time; β is thus defined to be the consumer's risk. Unfor-
tunately, these error factors vary inversely; that is, as the consumer's risk grows, the
producer's risk diminishes and vice versa. The Operating Characteristic (OC) curve is an
important part of any acceptance sampling plan, since it provides a graphical display of the
probability of accepting a product versus the value of the particular parameter being
inspected. The OC curve is a function of APL, RPL, α, and β, as well as sample size. Given a
particular acceptance sampling plan, the OC curve depicts the associated error risks and
demonstrates the relationship among all of the variables.
The US Army Ballistic Research Laboratory (BRL) has developed acceptance sampling
plans for armor packages. These plans were briefed to the Project Manager M1A1 on 14
April 1988 at Aberdeen Proving Ground, Maryland. Their general structures were accepted
with the guidance that the processes would be officially adopted pending some refinements.

202
II. CHAIN SAMPLING
Numerous sampling techniques exist, each with special properties that make it applica-
ble to particular situations. Sampling plans reviewed in the literature required sample sizes
much larger than those feasible for armor testing. In our case extremely small sample sizes
were warranted due to the expense of both the armor and the testing procedure, augmented
by the destructive nature of the test itself. Accordingly, we have devised a new procedure by
adapting an existing technique, chain sampling, for use in this project.
Chain sampling is particularly appropriate for small samples because it uses information
over the past history of production lots. Even with small samples, it is possible to accept a
marginal lot provided that a given number of lots immediately preceding (i.e., the chain) were
acceptable. When a consumer uses an expendable product such as the breakfast cereal in our
previous example, he utilizes chain sampling in his decision of whether or not to subsequently
purchase the same product. If the first or second box he buys is unacceptable, he will prob-
ably discard the product forever. However, if the tenth box is unacceptable, he might con-
tinue with one more purchase of the same cereal, taking into consideration its past history of
nine boxes of acceptable quality.
An advantage of chain sampling is its automatic incorporation of reduced or tightened
inspection procedures when applicable. That is, as quality remains acceptable over a period
of time and our confidence grows, the sample size is reduced (or, more accurately, samples
are taken less frequently). If quality becomes marginal, inspection is tightened by taking sam-
ples more frequently. When quality diminishes to the point where a production lot must be
rejected, the production process is stopped and necessary adjustments and corrections are
made. At that point a new chain begins and continues as before.
Certain assumptions must be made before chain sampling is considered as a sampling
technique. In particular, production should be a steady, continuous process in which lots are
tested in the order of their production. Also, there should be confidence in the supplier to
the extent that lots are expected to be of essentially the same quality. Generally, a fixed sam-
ple size will be maintained, with the investigator taking more or fewer samples as tightened or
reduced inspection is dictated.

III. ACCEPTANCE SAMPLING PLAN

The armor packages tested at the BRL consist of a right side and a left side, which are
designated as one set. One month's production is considered to be a production lot. Every
month we continue testing one set at a time until a decision can be made about that produc-
tion lot. For a given set, one shot is fired into each side; and, if spacing on the target permits,
a second shot follows. In each of the first three months, a total of at least four shots is
required in order to make a decision concerning that month's production. This provides addi-
tional confidence during the early stages of the plan. There are two portions of the accep-
tance sampling plan. The first is structural integrity, handled using attribute methods; the
second is depth of penetration of a particular round fired into the armor, handled using vari-
able techniques. For both portions, decisions concerning a production lot should be based
upon the data from all available shots on that lot.

203
A combined chain sampling plan was proposed. The maximum length of the chain was
fixed at eight, meaning that after the chain has been established, we will consider the current
set along with the seven immediately preceding. While the chain is growing, there is an area
between the criteria for acceptance and rejection in which we can make no decision. At least
one set will be tested each month; but if no decision can be made, tightened inspection will
dictate the examination of additional sets, possibly up to a maximum of eight. Table 1 shows
the relationships among months, sets, and shots for this particular procedure. Note that the
maximum number of sets and, hence, the maximum number of shots decrease over time as
the chain is being formed. After the third month, with its concurrent drop in the minimum
number of shots, and once the chain is at its full length (definitely by the eighth month), one
set and at most four shots are all that is required in order to make a decision for each
subsequent production lot.

A rejection in either the structural integrity or the penetration depth will result in overall
rejection of the production lot. In that case production is stopped, adjustments and correc-
tions are made, and testing resumes with the construction of a new chain. If neither measure
results in a rejection but at least one falls within the no-decision region, another set should be
examined and both categories re-evaluated using the additional data.

A. Acceptance Sampling by Attributes

Projectiles are fired at these packages, which are then inspected for structural integrity.
With attribute sampling, only two outcomes are possible. The structural integrity is assessed
to be either defective or non-defective, regardless of the number of shots. Any decision to
either accept or reject a lot is based on the number of defective plates in the sample being
considered.

Chain sampling is employed in this attribute sampling plan. Results from the most
recent eight sets influence decisions regarding a lot. A lot can be either accepted or rejected
at any time (except for one case discussed in the next paragraph). In the early stages of sam-
pling there is also an area in between acceptance and rejection where no decision is rendered
immediately but sampling is continued. After a chain reaches its full length of eight sets, a
decision to accept or reject is made immediately.

In the sampling plan, a safeguard is built in to prevent rejection of a good lot after only
one set. If there are no defectives in the first set, the lot is accepted. Otherwise, no decision
is made. Subsequently, rejection would occur only when there were three or more failures in
the most recent eight sets.

Table 2 shows the decision rules for a chain building to a maximum length of eight. The
OC curves for this plan are depicted in Figure 1. It shows that for a chain at full length, the
probability of accepting a lot whose true defective rate is 5% is equal to 0.96, while the proba-
bility of accepting a lot whose true defective rate is 10% is equal to 0.79. Power curves for
the plan are depicted in Figure 2. For a chain at full length, the probability of rejecting a lot
from a process whose true defective rate is 5% is equal to 0.04, while the probability of reject-
ing a lot whose true defective rate is 10% is equal to 0.21. (Note, if these probabilities are
deemed to be unsatisfactory, a different plan providing more satisfactory levels could be
developed by varying the maximum chain length or modifying the decision rules.)
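The quoted acceptance probabilities can be checked with a short calculation. The sketch
below is our own illustration, not part of the report: it models a full-length chain as sixteen
plates (eight sets of two sides), each independently defective with probability p, with
acceptance when at most two defectives appear, and it reproduces the 0.96 and 0.79 figures.

    from scipy.stats import binom

    def full_chain_accept_prob(p, sets=8, plates_per_set=2, max_defectives=2):
        # P(accept) = P(at most max_defectives among all plates in the chain)
        n = sets * plates_per_set
        return binom.cdf(max_defectives, n, p)

    for p in (0.05, 0.10):
        print(p, round(full_chain_accept_prob(p), 2))   # -> 0.96 and 0.79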

204
TABLE 1. Relationships Among Variables in Chain Sampling Procedure

Required Sets Required Shots


Month Minimum Maximum Minimum Maximum

1 1 8 4 32

2 1 7 4 28
3 1 6 4 24
4 1 5 2 20

5 1 4 2 16

6 1 3 2 12

7 1 2 2 8

8 1 1 2 4
9 1 1 2 4

k 1 1 2 4

205
TABLE 2. Decision Rules for Acceptance Sampling by Attributes.

DECISION RULES

SET
NUMBER   ACCEPT                    REJECT                    NO DECISION

1        f1 = 0                    --                        f1 >= 1
2        Σ(i=1..2) fi = 0          Σ(i=1..2) fi >= 3         1 <= Σ(i=1..2) fi <= 2
3        Σ(i=1..3) fi = 0          Σ(i=1..3) fi >= 3         1 <= Σ(i=1..3) fi <= 2
4        Σ(i=1..4) fi = 0          Σ(i=1..4) fi >= 3         1 <= Σ(i=1..4) fi <= 2
5        Σ(i=1..5) fi = 0          Σ(i=1..5) fi >= 3         1 <= Σ(i=1..5) fi <= 2
6        Σ(i=1..6) fi <= 1         Σ(i=1..6) fi >= 3         Σ(i=1..6) fi = 2
7        Σ(i=1..7) fi <= 1         Σ(i=1..7) fi >= 3         Σ(i=1..7) fi = 2
8        Σ(i=1..8) fi <= 2         Σ(i=1..8) fi >= 3         --
9        Σ(i=2..9) fi <= 2         Σ(i=2..9) fi >= 3         --
k        Σ(i=k-7..k) fi <= 2       Σ(i=k-7..k) fi >= 3       --

fi = number of failures in set i
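A compact way to see how these rules operate is to code them. The function below is our
reading of Table 2, not software from the report; it assumes the per-set failure counts are
supplied in production order.

    def attribute_decision(failures):
        # failures: defective-plate counts per set, in production order (newest last)
        k = len(failures)
        total = sum(failures[-8:])          # most recent eight sets at most
        if k == 1:
            return "accept" if total == 0 else "no decision"
        if total >= 3:
            return "reject"
        if total == 0 or k >= 8 or (k >= 6 and total <= 1):
            return "accept"
        return "no decision"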

206
[Figures 1 and 2 (graphics not recoverable from scan): operating characteristic curves and
power curves for the attribute sampling plan, plotting probability of acceptance (Figure 1)
and probability of rejection (Figure 2) against the true percent defective]

208
B. Acceptance Sampling by Variables
When primary interest is in a process level rather than a percent defective, sampling by
variables is the proper procedure. For the armor packages, depth of penetration for a partic-
ular munition was the process level of interest. When variable sampling plans are established,
two major assumptions must be satisfied: first, the distribution of the variable of interest
must be known; and second, a good estimate of its standard deviation must be available.
In our particular problem there were 22 baseline shots from which we were to determine
a distribution and estimate its standard deviation, as well as establish acceptable and reject-
able process levels (APL & RPL). The 22 shots had a mean (X̄b) of 5mm with a standard
deviation (sb) of 30mm. The data had been transformed, allowing for both positive and nega-
tive penetration values. When plotted, the data appeared normal; and, indeed, the hypothesis
of normality could not be rejected using statistical goodness-of-fit tests. The APL was esta-
blished at 20mm (1/2 baseline standard deviation from the baseline mean) and the RPL was
set at 80mm (2 1/2 baseline standard deviations from the baseline mean). α, the probability
of rejecting at the APL, was set at 0.05; and β, the probability of accepting at the RPL, was
allowed to vary with the sample size -- for a sample of size four, β would equal 0.10.
As in the attribute case, a set consists of a right side and a left side. For each set an
attempt will be made to fire a second round into each side. Because this might not always be
possible, due primarily to discrepancies between the aim point and the hit location, each set
can result in either two, three, or four data points, depending on whether or not both shots on
each side are considered to be good hits. It is important that during the first three months,
while the chain is being formed, at least four shots are available upon which to make a deci-
sion. Table 3 outlines the decision rules for the variable sampling plan. Like the attribute
sampling plan, it incorporates chain sampling with a maximum length of eight sets. The plan
will not reject based on the first sample, and it has a region of no decision until the chain
reaches its full length. In this table, X̄ represents the mean penetration depth for all shots
currently considered, s represents the standard deviation of this sample, n is the total number
of shots used in computing X̄, and t.95 represents the 95th percentile of the t-distribution for
the appropriate degrees of freedom (n-1). Thus, n can vary from 2 to 32 depending upon the
length of the chain and the number of shots available on each side of the armor package.
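The following sketch condenses our reading of the rule in Table 3 into code; it is
illustrative only, not the report's software. The t thresholds follow Table 4, and the special
cases (no rejection on the first set, at least four shots required in the early months) are
handled explicitly.

    import numpy as np
    from scipy.stats import t

    def depth_decision(shots, apl=20.0, first_set=False, chain_full=False):
        # shots: all penetration depths (mm) currently in the chain
        n = len(shots)
        if n < 4 and not chain_full:
            return "no decision"            # early months require >= 4 shots
        xbar = np.mean(shots)
        s = np.std(shots, ddof=1)
        T = (xbar - apl) / (s / np.sqrt(n))
        if T <= t.ppf(0.95, n - 1):
            return "accept"
        if first_set:
            return "no decision"            # never reject on the first sample
        if chain_full or T > t.ppf(0.99, n - 1):
            return "reject"
        return "no decision"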
Because n varies so widely, any one of many OC curves may be applicable. Figure 3
shows these curves for sample sizes 2, 32, and many integers in between. The abscissa value,
D, represents a multiple of sb from X̄b; thus, the numbers in parentheses are the penetration
depths in millimeters. Note that for all n, the probability of accepting at the APL is 0.95
(1-α). Because the probability of accepting at the RPL is too high for n = 2 and n = 3, the pro-
cedure will not allow lot acceptance at these small sample sizes (see Table 3). Table 4 pro-
vides the values for the t-statistic for (1-α)-levels of 0.99 and 0.95 and degrees of freedom
from 3 to 31.
Power curves show the probability of rejecting a particular lot. Generally, they are noth-
ing more than the complement of OC curves. However, for our procedure this is not the
case, since there is a region of no decision. Figure 4 shows the power curves for this variable
sampling procedure. Basically, there are two sets of curves -- the first two pertaining to α =
0.05 and the next three pertaining to α = 0.01. Note from Table 3 that in order to reject
209
TABLE 3. Decision Rules for Acceptance Sampling by Variables.

DECISION RULES

Let T = (X̄ - APL) / (s/√n).

SET
NUMBER                              ACCEPT        REJECT        NO DECISION

1 (n < 4)                           --            --            ALL
1 (n = 4)                           T <= t.95     --            T > t.95
2 (combine with 1)                  T <= t.95     T > t.99      t.95 < T <= t.99
3 (combine with 1, 2)               T <= t.95     T > t.99      t.95 < T <= t.99
  ...                               ...           ...           ...
7 (combine with 1-6)                T <= t.95     T > t.99      t.95 < T <= t.99
8 (combine with 1-7)                T <= t.95     T > t.95      --
9 (combine with 2-8)                T <= t.95     T > t.95      --
k (combine with (k-7) - (k-1))      T <= t.95     T > t.95      --

At least four shots are required in each of the first three months;
otherwise, regard as "No Decision".

210
[Figure 3 (graphic not recoverable from scan): OC curves for the variable sampling plan for
sample sizes n = 2 through 32, plotting probability of acceptance against D, the penetration
depth expressed in multiples of sb from X̄b]
211
TABLE 4. Values of the Cumulative t-Statistic

Degrees of Freedom         (1-α)-level

(n-1)                      0.95      0.99

3 2.35 4.54
4 2.13 3.75
5 2.02 3.37
6 1.94 3.14
7 1.90 3.00
8 1.86 2.90
9 1.83 2.82
10 1.81 2.76
11 1.80 2.72
12 1.78 2.68
13 1.77 2.65
14 1.76 2.62
15 1.75 2.60
16 1.75 2.58
17 1.74 2.57
18 1.73 2.55
19 1.73 2.54
20 1.73 2.53
21 1.72 2.52
22 1.72 2.51
23 1.71 2.50
24 1.71 2.49
25 1.71 2.49
26 1.71 2.48
27 1.70 2.47
28 1.70 2.47
29 1.70 2.46
30 1.70 2.46
31 1.70 2.45

*This table is abridged from Tables of the Probability Integral of the Central
t-Distribution, by R.E. Mioduski, BRL Technical Note # 1570, August 1965.

212
[Figure 4 (graphic not recoverable from scan): power curves for the variable sampling plan,
plotting probability of rejection against mean penetration depth, for α = 0.01 and α = 0.05]
213
before the chain is at its maximum length, we use the smaller α-level, and Figure 4 shows
some possible sample sizes for α = 0.01. If we reject at an α-level of 0.05, our sample size
must be somewhere between 16 and 32; and these curves are also shown in Figure 4. Gen-
erally, the power curves are of more interest to the producer than the OC curves, since they
highlight the producer's risk.

C. Quality Control Charts

Variations in the manufacturing process are either random or assignable. A process is
"in control" when only random variations are present. Assignable variations, if uncorrected,
may eventually result in rejection of a manufactured lot. However, they can often be
identified through the use of quality control charts.
A quality control chart is a graphical comparison of test data with some previously com-
puted control limits. The most common quality control chart is the Shewhart chart, named
for its originator, Dr. Walter A. Shewhart. Figure 5 is a Shewhart control chart for mean
penetration depth, the variable of interest in our armor package acceptance sampling plan.
The APL is the central line, with an upper control limit equal to the RPL, two baseline stan-
dard deviations away from the APL. If we were concerned about extremely low penetration
depths, we would incorporate a lower control limit as well. Assuming a normal distribution
with parameters equal to those of the baseline data implies that if only random variations are
present, 99.38% of the time the mean penetration depth of the sample will fall below the
upper control limit. This leaves a false-alarm frequency of less than 1% (0.62%) - so low that
this control limit seems to be a reasonable threshold to distinguish between random varia-
tions and assignable variations.
The mean penetration depth is plotted for consecutive sets of armor plate. If, over a
period of time, we see a drifting toward the control limit, the process can be examined and
adjusted. This might possibly eliminate some future rejection of an entire lot.
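As a toy illustration of such a chart (ours, not the report's), the few lines below flag
consecutive set means against the upper control limit; the center line and limit follow the
APL and RPL quoted above, and the example depths are invented.

    APL, UCL = 20.0, 80.0    # center line and upper control limit (mm)

    def shewhart_flags(set_means):
        # Flag any set whose mean penetration depth exceeds the control limit.
        return [(i + 1, m, m > UCL) for i, m in enumerate(set_means)]

    for set_no, mean_mm, out in shewhart_flags([24.3, 35.1, 18.7, 82.4]):
        print(set_no, mean_mm, "OUT OF CONTROL" if out else "in control")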
A similar chart should be constructed for the range of penetration depth within the sam-
ple, to insure that the variability of the armor packages is not increasing. A third chart for
structural integrity, the attribute of interest in our acceptance sampling plan, would also be
useful. In each case appropriate upper control limits must be established.
Over the years alternative quality control charts have emerged, each with their own set
of advantages and disadvantages. One of the most popular has been the cumulative sum con.
trol chart (cusum chart). Here, decisions are made based on all the data rather than just the
last sample. An advantage of the cusuin chart is that it often displays sudden and persistent
changes in the process mean more readily (that is, with fewer samples and less expense) than
a comparable Shewhart chart. However, control limits are somewhat less intuitive and,
therefore, more difficult to establish. Somewhere in between the Shewhart chart and the
cusum chart are quality control charts that use some, but not all, of the past data. Many of
these techniques incorporate a weighting factor, providing more weight to the most recent
data.
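For comparison, here is an equally minimal one-sided cusum sketch. It is again our own
illustration; the slack and decision-interval values are invented for the example (a common
choice is roughly half a standard deviation of slack).

    def cusum_flags(depths, target=20.0, slack=15.0, h=60.0):
        # One-sided upper cusum: S_i = max(0, S_{i-1} + (x_i - target - slack));
        # signal a persistent upward shift in the mean when S_i exceeds h.
        s, flags = 0.0, []
        for x in depths:
            s = max(0.0, s + (x - target - slack))
            flags.append(s > h)
        return flags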

214
[Figure 5 (graphic not recoverable from scan): Shewhart control chart for mean penetration
depth, with the APL as the center line and the RPL as the upper control limit]

215
It is important that some type of quality control charts be represented in the acceptance
sampling plan. They are relatively easy to maintain and might provide early warning signs
which could be beneficial to both the producer and the consumer.

IV. CONCLUSIONS
Generally, it is not feasible for a consumer to inspect every item from a production lot
that he might want to purchase. A judicious choice of a lot acceptance plan will allow him to
sample the production lot and determine with a pre-established level of confidence whether
or not it meets his specifications. Chain sampling is a particular method of lot acceptance
sampling used when sample sizes are small. It utilizes the most recent lot information to pro-
vide more confidence in the decision.
In testing armor packages for acceptance by the US Army, chain sampling provides a
logical method, since destructive testing dictates small sample sizes. A technique involving
both structural integrity (attribute sampling) and penetration depth (variable sampling) has
been proposed. One set of armor packages is represented by both a left side and a right side.
The procedure allows for accepting the production lot (one month's production) after exa-
mining just one set. It allows for rejecting the production lot only after testing at least two
sets. There is a region of no decision; but after the chain has reached its maximum length of
eight sets, a decision must be rendered.
Operating characteristic curves and power curves provide the probability of accepting
and rejecting lots given a percent structurally defective (attributes) and given a mean penetra-
tion depth (variables).
In addition to the acceptance sampling plans, control charts should be used for both the
attribute and variable parameters. These charts display sample results for particular parame-
ters such as percent defective, mean penetration depth, and variability of penetration depth.
The data might be presented as individual sample points or as sums over a preceding number
of samples. By continually examining the control charts, we can see when one of the parame-
ters is drifting toward the rejection region, enabling the producer to make adjustments and,
possibly, preventing rejection of an entire lot of armor plate.
The proposed lot acceptance plan was briefed to the Project Manager M1A1 on 14 April
1988 at Aberdeen Proving Ground, Maryland. It was approved and will be adopted subject to
any refinements agreed upon by both the US Army Ballistic Research Laboratory and the
Project Manager.

216
BIBLIOGRAPHY

Crow, Edwin L., et al., Statistics Manual with Examples Taken from Ordnance
Development, Dover Publications, Inc., 1960.

Duncan, Acheson J., Quality Control and Industrial Statistics, Richard D. Irwin, Inc., 1974.

Grant, Eugene L. & Leavenworth, Richard S., Statistical Quality Control, McGraw Hill Book
Company, 1980.

Juran, J.M., Editor, Quality Control Handbook, McGraw Hill Book Company, 1974.

Mioduski, R.E., Tables of the Probability Integral of the Central t-Distribution, BRL
Technical Note # 1570, August 1965.

Schilling, Edward G., Acceptance Sampling in Quality Control, Marcel Dekker, Inc., 1982.

Thomas, D.W., Chairman, Statistical Quality Control Handbook, Western Electric Company,
Inc., 1956.

217
SOME NOTES ON VARIABLE SELECTION
CRITERIA FOR REGRESSION MODELS
(AN OVERVIEW)

Eugene F. Dutoit
U.S. Army Infantry School
Fort Benning, Georgia

Abstract. There are several decision rules for determining when
to enter additional independent variables into linear multiple
regression. Three of these are: (1) examining the incremental
significance in the multiple correlation coefficients, (2)
Mallows' Cp statistic to determine the best combination of
independent variables, and (3) considering the changes in
magnitude of the standard error of estimate. This paper will
examine some of the interrelationships between the three methods
cited above. These relationships will be applied to a data set
and the results presented.

Acknowledgement. The author wishes to thank Professors Ron
Hocking and Emanuel Parzen for their comments and suggestions.
It is this spirit of freely shared ideas that makes these
"Design of Experiments Conferences" valuable to Army
statisticians.

1. The Problem.

Given experimental data in the form:

     Y     X1    X2    X3   . . .   Xq

     Y1    X11   X12   X13  . . .   X1q
     Y2    X21   X22   X23  . . .   X2q
     Y3    X31   X32   X33  . . .   X3q
     .     .     .     .            .
     YN    XN1   XN2   XN3  . . .   XNq

where (X1, X2, X3, . . ., Xq) are candidate
independent variables (that make sense according to some
theoretical basis) and Y is the dependent variable. The
researcher wants to form some model

     Y' = a + Σ(i=1..k) bi Xi                              (1)

where k <= q. This would indicate the model (equation 1)
consisted of the best set of candidate independent variables.
This paper will provide an overview of the following measures
and criteria in order to shed some light on this problem.
219
2. The Multiple Correlation Coefficient (R²).

a. Incremental Significance. The well known test for the
incremental significance in R² from adding additional
independent variables into equation (1) is:

     F = [(R²(k1) - R²(k2)) / (k1 - k2)] / [(1 - R²(k1)) / (N - k1 - 1)]      (2)

where k1 = number of independent variables for the larger R²
      k2 = number of independent variables for the smaller R²
      R²(k1) = larger R²
      N = number of cases

The test follows an F distribution with degrees of freedom equal
to (k1 - k2), (N - k1 - 1).
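A small helper makes the test in equation (2) concrete. This is our own sketch, not part
of the paper; the spot-check values come from steps 8 and 11 of Figure 6 below (adding X1
to the subset {X2, X3} with N = 15).

    from scipy.stats import f

    def incremental_F(r2_big, r2_small, k1, k2, N):
        # Equation (2): F for the increase in R^2 from k2 to k1 predictors.
        F = ((r2_big - r2_small) / (k1 - k2)) / ((1.0 - r2_big) / (N - k1 - 1))
        p = f.sf(F, k1 - k2, N - k1 - 1)   # upper-tail p-value
        return F, p

    print(incremental_F(0.99695, 0.99403, k1=3, k2=2, N=15))  # F is about 10.5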

b. Adjusted R². As independent variables are added to
equation (1), the value of R² will also increase. This
increase may be small (i.e., statistically not significant). In
order to account for this mathematical increase in R², the
so-called shrinkage formula is used to calculate an adjusted
R² as:

     R²adj = R² - [k / (N - k - 1)] (1 - R²)               (3)

where k = number of independent variables in regression,
      N = number of cases.
3. Mallows' Cp Statistic. Myers (reference 1) presents the Cp
statistic in the following form:

     Cp = p + [(s²p - σ̂²) (N - p)] / σ̂²                    (4)

where p = k + 1
      σ̂² = estimated variance of the complete model (i.e.,
           all independent variables included)
      s²p = estimated variance of the candidate (subset) model
      N = number of cases.

σ̂² and s²p are obtained as the residual mean squares from
the regression ANOVA.
The following interpretation is based on the discussion from the
Myers (reference 1) text:

220
Figure 1

[Schematic, not recoverable from scan: Cp plotted against p, with the line Cp = p;
candidate subsets A, B, and D plot on or above the line, while point C plots below it]

Reference to equation (4) shows that if s²p < σ̂², the plot
of Cp will fall below the line Cp = p. The above inequality is
desirable, for it states that the variation about some subset
regression model is less than the variation about the full
model. Only point C in the above diagram meets this condition.
This concept will be discussed in the following paragraph
concerning the standard error of estimate about regression. It
should be noted that if s²p = σ̂², then equation (4) becomes
Cp = p. This is always the case for the full model. An
alternative format for Cp is given by Daniel and Wood (reference
2) as:

     Cp = RSSp / σ̂² - (N - 2p)                             (5)

where p = k + 1 (as before)

      RSSp = residual sum of squares with k independent variables
             (p parameters)

      σ̂² = residual mean square of the complete model (as
           before).

It can be shown (not here) that equation (5) is equivalent to
equation (4).

3A. Another Alternative Form for Cp. Given q independent
variables, the total regression model is:

     Ŷ = a + b1 X1 + b2 X2 + . . . + bq Xq                  (6)

The regression ANOVA table is presented below as Figure 2.

Figure 2
Regression ANOVA - Full Model

Source      DF        SS                        MS

Explained   q         (N-1) (S²y) (R²q)
Residual    N-q-1     (N-1) (S²y) (1 - R²q)     S²y|x,q
Total       N-1

221
i 221
The model with k independent variables, where k < q, is:

     Ŷ = a + b1 X1 + b2 X2 + . . . + bk Xk                  (7)

The regression ANOVA table is given in Figure 3.

Figure 3
Regression ANOVA - Subset Model

Source      DF        SS                        MS

Explained   k         (N-1) (S²y) (R²k)
Residual    N-k-1     (N-1) (S²y) (1 - R²k)     S²y|x,k
Total       N-1

where

     S²y|x,k = [(N-1) / (N-k-1)] (S²y) (1 - R²k)
                                                            (8)
     S²y|x,q = [(N-1) / (N-q-1)] (S²y) (1 - R²q)
Referring to the Myers format of Cp (equation 4) and
substituting equations (8) for s²p and σ̂²:

     Cp = (k+1) + [(S²y|x,k - S²y|x,q) / S²y|x,q] (N - k - 1)

Further substitution [reference equations (8)] and simple
algebra yield another format for the Cp test:

     Cp = (N - q - 1) (1 - R²k) / (1 - R²q) - (N - 2k - 2)  (9)

Note that in various forms the Cp test can be expressed as a
function of R², S²y|x, N, q, k. This leads to another
independent decision method, namely the standard error of
regression (Sy|x).
4. Sy|x. When performing stepwise regression, the value of
S²y|x usually gets smaller as independent variables are added
to the regression. In other words,

     S²y|x,k > S²y|x,k+1   (usually).                       (10)

However, this is not always the case. The ratio may fall on
either side of one:

     S²y|x,k / S²y|x,k+1  ≷  1.                             (11)

Note that equation (4) allows for cases where some subset model
has less variance (s²p) than the variance for the complete
model (σ̂²). In this case the Cp plot is below the line Cp = p
(i.e., point C in Figure 1). This can be expressed in Figure 4:

Figure 4

[Schematic, not recoverable from scan: S²y|x,k plotted against k (k = 1, 2, 3, ...),
showing a minimum value of S²y|x at the best subset size]
The minimum value in Figure 4 corresponds to point C in Figure
1. The subset regression that has minimum variance would be
the best predictor of the dependent variable Y. The ratio
described by equation (11) can be rewritten to gain some insight
into the process. From equations (8) the following expressions
can be inferred:

     S²y|x,k   = [(N-1) / (N-k-1)] (S²y) (1 - R²k)
                                                            (12)
     S²y|x,k+1 = [(N-1) / (N-k-2)] (S²y) (1 - R²k+1)

The ratio in inequality (11) now becomes:

     S²y|x,k / S²y|x,k+1 = [(N-k-2) / (N-k-1)] · [(1 - R²k) / (1 - R²k+1)]   (13)

where (N-k-2) / (N-k-1) < 1

and (1 - R²k) / (1 - R²k+1) > 1.

Therefore, the value of the ratio cited as inequality (11)
will depend on the magnitudes of the ratios shown above. Note
that equation (3) [R²adj] contains a shrinkage factor with
terms (1 - R²) and (N - k - 1). These terms are also contained
in equation (13). Intuitively it appears that the adjusted
correlation coefficient (R²adj) should be a maximum for the
subset regression where S²y|x is a minimum. The value of
Cp should also be minimum for the same subset of independent
variables.

5. Example. The following example was taken from Myers
(reference 1). It is found on page 110, Table 4-1. The example
uses sales data for asphalt shingles obtained from (N = 15)
districts. The variables considered in this example are:

X1 = Number of promotional accounts.
X2 = Number of active accounts.
X3 = Number of competing brands.
X4 = District potential.
Y  = Sales in thousands of dollars.

The results of a stepwise regression are given below:

Figure 5
Stepwise Results

STEP   VARIABLES   Sy|x    ADJ R²    p    Cp

1      3           49.99   .60936    2    1227.1
2      2,3          6.67   .99303    3      11.4
3      1,2,3        4.98   .99612    4       3.4
4      1,2,3,4      5.12   .99590    5       5

Notice that the last step (number 4) has a Cp value equal to p
(5). This is always true for the full model. Also note that
step 3 is the best subset regression. It is this step
(variables X1, X2, X3) where the values of Sy|x and Cp
are minimum and adjusted R² is maximum.

The results of the all-variable cases are presented in Figure 6.
The combination shown in step 11 is the best subset regression.
It is the same combination of optimum values of R²adj, Sy|x
and Cp (variables X1, X2, X3).
Figure 6
All Cases

STEP   VARIABLES   Sy|x    R²        ADJ R²    p    Cp

1      1           82.49   .01200    -.06400   2    3361
2      2           58.63   .50101     .46263   2    1692
3      3           49.99   .63725     .60936   2    1227.1
4      4           79.05   .09284     .02306   2    3085.1
5      1,2         60.53   .50900     .42716   3    1666.8
6      1,3         51.71   .64160     .58187   3    1213.9
7      1,4         81.26   .11609    -.03240   3    3011.2
8      2,3          6.67   .99403     .99303   3      11.4
9      2,4         60.82   .50422     .42159   3    1683.1
10     3,4         48.83   .68051     .62726   3    1081.4
11     1,2,3        4.98   .99695     .99612   4       3.4
12     1,2,4       62.90   .51395     .38139   4    1651.9
13     1,3,4       50.27   .68959     .60493   4    1052.4
14     2,3,4        6.97   .99404     .99241   4      13.3
15     1,2,3,4      5.12   .99707     .99590   5       5
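The ADJ R² and Cp columns of Figures 5 and 6 can be regenerated directly from the R²
values using equations (3) and (9). The sketch below is our own spot-check, not part of
the original paper.

    def adj_r2(r2, N, k):
        # Equation (3): shrinkage-adjusted R^2.
        return r2 - (k / (N - k - 1)) * (1 - r2)

    def mallows_cp(r2_k, r2_q, N, k, q):
        # Equation (9): Cp for a k-variable subset of a q-variable full model.
        return (N - q - 1) * (1 - r2_k) / (1 - r2_q) - (N - 2 * k - 2)

    # Step 11 of Figure 6 (variables 1,2,3): expect about .99612 and 3.4.
    print(adj_r2(0.99695, N=15, k=3),
          mallows_cp(0.99695, 0.99707, N=15, k=3, q=4))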

224
6. Summary. This paper has examined several methods of
determining when to enter additional independent variables into
linear multiple regression in order to form an optimum subset
from all the candidate variables.

The interrelationships between Cp, Sy|x and adjusted R²
were studied. These three indicators appear to provide the same
information in the model selection decision process. Although
they all lead to the same decision regarding the subset
regression selection, each measure provides a different
perception on the subject.

References

Myers, R., Classical and Modern Regression with Applications,
Duxbury Press, Boston, MA, 1986.

Daniel, C., and Wood, F., Fitting Equations to Data, Wiley, New York,
1971.

225
TWO-STAGE TESTING OF COMBAT VEHICLE TIRE SYSTEMS

Barry A. Bodt

US Army Ballistic Research Laboratory


Aberdeen Proving Ground, Maryland 21005-5066

ABSTRACT
An effort is underway to enhance the battlefield survivability of combat vehicle tire sys-
tems. The Army is currently investigating several new tire technologies with regard to their
ability to function after battlefield degradation. The tires, in a run-flat condition, must sup-
port some vehicle mobility according to that vehicle's mission profile. The immediate objec-
tive of this program is choosing, for further research, the most promising among the new tire
technologies. The presenter has been tasked to develop an appropriate test plan.
Sound experimental strategy, for this or any study, must be accompanied by a clear
understanding of the problem(s) to be resolved. A list of question areas worth exploring to
help gain this understanding is suggested by Hahn (Technometrics, 1984) as part of more
general guidelines. The presenter demonstrates their usefulness to that end in the above
mentioned tire program. The test plan and the process by which it evolved is discussed.

227
TWO-STAGE TESTING OF COMBAT VEHICLE TIRE SYSTEMS

1. INTRODUCTION

An effort is underway to enhance the battlefield survivability of combat vehicle tire sys-

tems. The impetus for current investigations dates back to a 1979 issue paper, submitted to

DA by the US Training and Doctrine Command (TRADOC). In 1985 the Tank Automotive

Command (TACOM) established a tire task force, the need for which was supported by the

results of a 1984 independent evaluation of one tire system performed by the Operational

Test and Evaluation Agency (OTEA). OTEA observed that when the run-flat tires for the

High Mobility Multi-Purpose Wheeled Vehicle (HMMWV) were run flat for 30 miles, the

tires became unserviceable and had to be replaced. The objective of the TACOM Tire Task

Force is to identify a survivable tire system (STS) technological replacement which demon-

strates acceptable battlefield survivability. A two-phase approach (operational and technical)

has been adopted to screen available STS technologies in search of candidates for more

intense research and development. The operational phase, considering the standard and

seven new STS technologies, was completed by the Combat Developments Experimentation

Center (CDEC) in 1987. The technical phase, the focus of this paper, is being conducted by

the Vulnerability Lethality Division (VLD) of the Ballistic Research Laboratory (BRL)

according to the test plan developed by the Probability and Statistics Branch (PSB) of the

Systems Engineering and Concepts Analysis Division (SECAD) of the BRL.

This paper is intended to accomplish two tasks. The first is to discuss the test plan that

has been adopted for the technical testing phase -- not in great detail but sufficiently to

demonstrate the degree to which experimental objectives are satisfied. As part of the discus-

sion it is shown how, for example, tire performance specifications, factors thought to

influence performance, and physical and budgetary constraints are incorporated in the test

strategy. The second is to illustrate the usefulness of well-defined consulting guidelines in

extracting the necessary information from experimenters. Any sound experimental strategy

must be accompanied by a clear understanding of the problem to be resolved, but informa-

tion essential to that understanding is often difficult to obtain. The fragmented manner in

which information is passed from client to consultant inhibits a cogent assimilation of facts

needed for efficient problem solving. Hahn (1984) suggests imposing the structure of ques-

tion area guidelines (see Figure 1) both to help sort the information coming in and to direct

consultation sessions down new promising paths.

The remainder of the paper is organized as follows. In Section 2 the problem and test

plan are developed, punctuated by Hahn's guidelines. It is hoped that this presentation will

both give fair treatment to the Army concern as well as illustrate a reasonable approach to

consultation. In Section 3 a brief critique of the test plan's strengths and weaknesses is given,

followed by some closing comments.

2. EVALUATION OF THE TEST PLAN

Problem information is divulged in this section according to Hahn's guidelines, and that

constitutes our presentation of his technique. We seek only to show how encompassing those

question areas are by developing in full the Army's problem through their use. In the text,

italicized words and phrases refer back to guidelines in Figure 1. The guidelines have been

juggled to allow for a logical presentation and the order shown in Figure 1 corresponds, with

few exceptions, to that of this section. This is simply a matter of convenience and not a claim

1. The objectives of the experiment.
2. The variables to be held constant and how this will be accomplished (as well as those that are to be varied).
3. The uncontrolled variables - what they are and which ones are measurable.
4. The response variables and how they will be measured.
5. Special considerations which indirectly impose experimental constraints.
6. The budget size of the experiment and the deadlines that must be met.
7. Conditions within the experimental region where the expected outcome is known; the anticipated performance is expected to be inferior, especially for programs where an optimum is sought; and experimentation is impossible or unsafe.
8. Past test data and, especially, any information about different types of repeatability.
9. The desirability and opportunities for running the experiment in stages.
10. The anticipated complexity of the relationship between the experimental variables and the response variables and any anticipated interactions.
11. The procedures for running a test, including the ease with which each of the variables can be changed from one run to the next.
12. The details of the physical set-up.

Figure 1. Important Question Areas for Statisticians to Address.

for an ideal sequence. In fact, each consulting session is likely not only to naturally gravitate

toward different orders but also to move around from area to area, possibly returning several

times to some.

2.1. Understanding the Problem

Let us begin by considering objectives. We consider two types: military and experimen-

tal. The military objective is that HMMWV tires remain serviceable when degraded through

battlefield exposure to small caliber munitions and shell fragments. Serviceable means that

the tire exhibits performance consistent with the standards specified in the NATO Finabel 20

A 5 1956 NATO Test Procedure. Summarized expectations set forth therein say that the

combat tire must possess (as nearly as possible) the same over the road performance as the

classic radial tire in terms of maximum vehicle speed and lateral and longitudinal traction and
stability. After degradation, normal military performance of the vehicle is still required when
no more than two tires (one drive and one steering) are damaged. The experimental objective
is to screen six tire systems, including the standard, with the purpose of selecting a subset for

further research, development, and the eventual upgrading of combat tires. The selection cri-

teria must be driven by the military objectives summarized above.

Question areas 2-4 in Figure 1 each concern variables. It is in the identification and

classification of these variables that the experimental strategy begins to take form. In Table I

the most important ones are given. Care is taken to initially classify them as candidates for

response, design or nuisance variables and to subclassify them for each of the last two

categories. The scale of measurement is also noted. A short definition of each of these vari-

ables is given in the appendix. Because the variables listed in Table I represent only those

[Table I. Candidate response, design and nuisance variables with their classifications and scales of measurement. Table not recoverable from source.]
considered essential, all must be incorporated in the experimental strategy. We briefly dis-

cuss several of them here so that the reader may gain a sense of the complexity of the prob-

lem.

The logical starting point for discussion is with tire technology, for it is the selection

from among these prototypes that is the objective of this experiment. Six manufacturer

offerings, including the standard, are to be considered, but there are basically only four tech-

nologies. When combat tires are exposed to small caliber munitions and shell fragments they

will surely tear, puncture, or in some other way be damaged so as to induce partial or com-

plete deflation. Then in order for military objectives to be satisfied, the survivable tire will

either successfully negate this damage or be structurally capable of supporting vehicle mobil-

ity without benefit of full tire pressure. Taking the first tack, the sealant tire systems contain

chemical compounds which are intended to flow to a source of air loss, solidify, and thereby

negate the threat damage. Run-flats take the second tack and are able to support the vehicle

with a metal or plastic insert which acts in the tire's stead when the tire is deflated. Self-

supporting tires are so named because molded into the tread is a rigid fiber glass band,

designed to carry the tire's full load in the absence of tire pressure. Solid urethane tires cir-
cumvent the problem by containing no air to be lost, but they do so at the cost of additional

weight, inhibiting vehicle mobility (Drelling et al., 1987).

A limitation of the CDEC exercise is that tire degradation from shell fragments is not

considered. Interest in the more irregular punctures and tears caused by the shell fragment
s:and tears caused by the shell fragment

threat is a consideration in involving the BRL in the technical phase of experimentation. To

make inferences about tire performance after fragment damage the consensus is that either

live shells should be detonated near standing tires or the expected resulting fragments should

be simulated. A special consideration in long range plans is that an acceptance test consistent

with current testing be developed. Due to the repeatability requirements inherent in accep-

tance testing, the shell-detonation approach was dropped in favor of fragment simulation.

This decision led to variables involving fragment shape, size, and velocity. Due to budget

and time constraints it appears unreasonable to select many values for each and then proceed

in a factorial manner when incorporating them in the design. Rather, we opt to create two

fragment profiles, each representative of a distinct threat. Avoiding great detail, a standard
BRL fragment design is specified for shape. Velocity and mass are determined as follows.

Each is a function of the distance between the shell at detonation and the tire. The distance

selected corresponds to a 50% mobility kill for the vehicle according to models accessed by

the Army Material Systems Analysis Activity (AMSAA). Avoiding the experimental region
where the expected outcome is known, we do not consider distances so close that the personnel

are not likely to survive. The median velocity and mass among computer simulated frag-

ments possessing an appropriate trajectory then serve as representative values for these

characteristics. Trial firings suggested some deviations from these choices so that the resulting dam-

age seemed similar to actual fragment damage previously observed.

Other factors of keen interest include the terrain traveled and tire position, motion, and

pressure. The mission profile for the HMMWV dictates that it shall be able to travel primary

roads, secondary roads, and cross country. Further it suggests that in a characteristic mission

those three terrains might comprise 30%, 30%, and 40%, respectively, of the total mileage

covered. Tire position refers to its designation as a drive tire or a drive and steering tire; the

HMMWV is 4-wheel drive. In addition to this one-at-a-time damage, recall that the NATO

Finabel standards require acceptable performance when two tires on the vehicle are

damaged. When attacked, the HMMWV may be moving or at rest. Proponents of the

sealant technology claim that if the tire is in motion when punctured, then the sealant

mechanism will be more effective in finding and repairing the damage. Past test data indi-

cates that tire pressure may influence the type of puncture, that is, clean or ragged. Manufac-
turer recommended high and low pressures for each tire will be considered.

The special consideration that this experiment complement the CDEC exercise fixed two

important test definitions. TACOM decided that the response would remain defined as miles

until failure. Failure occurs when either the tire begins to come apart when in use or the

operator must slow to less than 50% of normal operating speed in order to maintain control.

Under a rigid value for normal speed, failure could depend on the size and strength of the

operator. We propose to account for that by establishing a profile on operators (actually driv-

ing teams) in their normal operation of the vehicle. The 50% rule is then based on normal

team performance. Driving teams are established to avoid failure due to fatigue. Past test

data reveals that some degraded tires remain serviceable after 100 continuous miles of opera-

tion. In order to avoid truncated data, the test course is extended to 125 continuous miles, but

at the additional cost of trial time. It is felt that if two operators are allowed to rotate after

each 25 mile lap, then fatigue will not enter into the failure determination.

2.2 Test Plan

The test plan will be implemented in stages. A fairly large number of experimental con-

ditions define the experiment outlined in Section 2.1. To examine each of these conditions in

a factorial manner will require more resources than the experimental budget will allow; for all

but the standard tire no more than 30 prototypes will be made available. Moreover, recall

that the principal objective of this study is to facilitate comparison among tires. Placing too

much emphasis (sample allocation) on ancillary issues may partially obscure (weaken conclu-

sions regarding) the main experimental focus. For these reasons, resource limitations and

emphasis, we choose to run the experiment in two primary stages.

The division of testing meets the above concerns. In stage 1 all the experimental condi-

tions are incorporated in the design as factors or fixed test procedures. Only the standard

HMMWV tire is considered in stage 1. The purpose of this stage is two-fold. First the vari-

ous test conditions may be examined. It is hoped that some will prove unnecessary for inclu-

sion in stage 2, thereby increasing the experimental information per sampled test condition.

Second, test procedures may be smoothed. Field test exercises nearly always present unex-

pected problems, often resulting in samples which must be invalidated for the analysis. Here

we run only the risk of wasting some more plentiful, standard tires instead of the scarce new

prototypes. In stage 2 the prototypes will be examined by an experienced testing group under

the conditions remaining after stage 1. Since the complete details will not be available until

stage 1 is concluded, we defer further discussion of stage 2 to future papers. In the remainder

of this paper stage 1 testing serves as the main focus.

Stage 1 will be run as a 1/2 replication of a 4×2⁴ factorial design, requiring 32 observa-

tions. The design factors, each discussed in Section 2.1, are listed in Table 2. The 4 levels for

threat include a 7.62mm round fired at 45° and 90° obliquity on the sidewall, a small fragment

simulator, and a large fragment simulator. Note that only 2 tire position levels, drive or steer-

ing, are considered. The case in which two tires are damaged, requiring twice as many sam-

ples, is handled only in a limited sense. Imbedded in the stage 1 factorial design are four

treatment combinations having two damaged tires which arise from a 1/2 replicate of a 2⁴

[Table 2. Stage 1 design factors and levels. Table not recoverable from source.]
design. The remaining 4 observations are already included in the principal stage 1 design.

The other three factors are handled as previously noted. This design allows hypothesis tests

on all main effects and on most first-order interactions. Due to the anticipated complexity of

the relationship among variables, some first-order interactions can be sacrificed to the experi-

mental error formed by the remaining second, and above, order interactions. The remaining

variables are addressed in stage 1 as suggested in Figure 2.
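A minimal sketch (ours, not the authors' layout) of one conventional way to build such a 32-run half replicate: the four-level threat factor is coded by two two-level pseudo-factors A and B, giving a full 2⁶ = 64-run factorial over (A, B, position, motion, pressure, two-tires), of which the half replicate keeps the runs satisfying a defining relation. The factor names and the choice of defining contrast I = ABCDEF are assumptions for illustration; the paper does not give the alias structure actually used.

    import math
    from itertools import product

    THREAT = {(-1, -1): "7.62mm at 45 deg", (-1, 1): "7.62mm at 90 deg",
              (1, -1): "small fragment", (1, 1): "large fragment"}

    def half_replicate():
        runs = []
        for levels in product((-1, 1), repeat=6):
            if math.prod(levels) == 1:       # assumed defining relation I = ABCDEF
                a, b, c, d, e, f = levels
                runs.append({"threat": THREAT[(a, b)], "position": c,
                             "motion": d, "pressure": e, "two_tires": f})
        return runs

    assert len(half_replicate()) == 32       # 1/2 of the 64-run full factorial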

Randomization for the stage 1 design is complete except in the case of driving team,

vehicle, and terrain. In consideration of the procedures for running a test and the ease in which

variables can be changed and the details of the physical set-up, some compromises were made.

The complete randomization of the driving teams is not possible because both teams are to

be used simultaneously. The first driver in the rotation for each team was randomized. As
indicated in Figure 2, four vehicles are used but are not included as test factors. To mitigate

their effect on the outcome, they have been selected according to age and state of repair and

have been partially randomized over the design. Also noted in Figure 2, the three terrains

mentioned in Section 2.1 comprise the test track. The course layout attempts to mix or ran-

domize the terrains so that not all the mileage for one type will be traveled before the next is

encountered.

3. CRITIQUE OF APPROACH

In this section we address the primary advantages and disadvantages of the test plan

interpreted in terms of the stated military and experimental objectives and follow with some

comments pertaining to the consulting technique employed.

[Figure 2. Treatment of the remaining experimental variables in stage 1. Figure not recoverable from source.]

Beginning with the military objectives, all of the variables considered important by TACOM or the NATO Finabel Standards are included in the test plan in a manner suitable to TACOM. Sometimes this requires

compromise, such as in the use of terrain. Terrain is considered only through its inclusion in

the test course in proportions consistent with the HMMWV mission profile. For some other

variables the military interests are clarified in the test plan. For example, normal operating

speeds in the failure definition are more sensibly tied to the normal performance of individual

driving teams. Also, efforts to handle the fragment threat result in a reasonable fragment

simulation procedure which may be used in follow-on acceptance testing.

With regard to experimental objectives, the selection of STS prototypes for further

research and development follows directly from analysis of the second testing stage. Further,

the stage 1 plan imposes an analyzable design structure on a complex problem providing for

the testing of all important hypotheses. In addition, running the experiment in stages has the

emphasis and resource advantages mentioned in Section 2.2. However, the test plan has

several weaknesses. By examining the standard tire only in stage 1, comparisons between it

and other STS prototypes are hindered. Experimental error is an issue since complete ran-

domization is not possible and since some pooling of low-order interactions into the error

term may be necessary. Choice of an error term for the imbedded test of the two-tire effect

is far from straightforward, particularly since four of the eight observations must be used in

the analysis twice. Finally, we had to take some liberties in the combination of variables to

form factors so that a design would be possible with the available samples.

As to consulting, we cannot prove the usefulness of Hahn's guidelines, but we hope that

the illustration is convincing. Surely, the information can be obtained through other methods,

but the imposed structure of this approach facilitates a very comprehensive investigation. In

the end, all methods must be judged by the experimental strategies which they help to

develop, but their performance is hopelessly confounded with the skills of the consultant using

them. Of course the purposes of those strategies are to meet both application objectives and

satisfy statistical theory. Whether this strategy satisfies those purposes, and if not, whether

fault lies with the consultant, the approach, or the problem are questions left for the reader to

decide.

REFERENCES
Drelling, I. S., Pietzyk, S., and Schrag, H. (1987), "Survivable Tire Test," CDEC-TR-87-014.
Hahn, G. J. (1984), "Experimental Design in the Complex World," Technometrics, 26, 19-31.
Kempthorne, O. (1952), Design and Analysis of Experiments, New York: John Wiley & Sons.

APPENDIX
terrain the three driving surfaces listed as primary road, secon-
dary road, and cross country. Each will induce different
tire stress and all are included in the vehicle's mission
profile.

tire position placement of the damaged tires. In testing, the right


front, right rear, or both may be degraded.

vehicle High Mobility Multi-purpose Wheeled Vehicle


(HMMWV). This vehicle's tire system is the program's
focus. The individual HMMWV effect is an issue.
driver operator of the vehicle. The influence of different drivers
should be accounted for.
tire motion the state of the tire, either static or rolling.

threat obliquity angle at which the round strikes the tire sidewall.

tire technology six Survivable Tire Systems to be compared.


fragment shape partially determines the nature of the tear or puncture.

fragment size partially determines the nature of the tear or puncture.

fragment velocity partially determines the nature of the tear or puncture.

road temperature affects the tire vulnerability.

tire pressure previous testing revealed reduced susceptibility to
puncture with low tire pressures.

vehicle load influences tire stress.

driving speed influences tire stress.

distance to tire refers to distance traveled by fragment or small caliber


munition.
round threat munition(s) to be used.

miles until failure response.

delivery method fragment propelling methods considered.

# shots on tire the number of punctures to be made in each tire to
obtain its degraded state.
subjective assessments comments solicited from drivers regarding handling of the
vehicle when tires are in normal or degraded mode.

Parallel Coordinate Densities*

Edward J. Wegman
Center for Computational Statistics
242 Science-Technology Building
George Mason University
Fairfax, VA 22030

1. Introduction. The classic scatter diagram is a fundamental tool in the construction of a
model for data. It allows the eye to detect such structures in data as linear or nonlinear features,
clustering, outliers and the like. Unfortunately, scatter diagrams do not generalize readily beyond three
dimensions. For this reason, the problem of visually representing multivariate data is a difficult,
largely unsolved one. The principal difficulty, of course, is the fact that while a data vector may be
arbitrarily high dimensional, say n, Cartesian scatter plots may only easily be done in two dimensions
and, with computer graphics and more effort, in three dimensions. Alternative multidimensional
representations have been proposed by several authors including Chernoff (1973), Fienberg (1979),
Cleveland and McGill (1984a) and Carr et al. (1986).
An important technique based on the use of motion is the computer-based kinematic display
yielding the illusion of three dimensional scatter diagrams. This technique was pioneered by Friedman
and Tukey (1973) and is now available in commercial software packages (Donohoe's MacSpin and
Velleman's Data Desk). Coupled with easy data manipulation, the kinematic display techniques have
spawned the exploitation of such methods as projection pursuit (Friedman and Tukey, 1974) and the
grand tour (Asimov, 1985). Clearly, projection-based techniques lead to important insights concerning
data. Nonetheless, one must be cautious in making inferences about high dimensional data structures
based on projection methods alone. It would be highly desirable to have a simultaneous
representation of all coordinates of a data vector, especially if the representation treated all components
in a similar manner. The cause of the failure of the standard Cartesian coordinate representation is the
requirement for orthogonal coordinate axes. In a 3-dimensional world, it is difficult to represent more
than three orthogonal coordinate axes. We propose to give up the orthogonality requirement and
replace the standard Cartesian axes with a set of n parallel axes.

2. Parallel Coordinates. We propose as a multivariate data analysis tool the following
representation. In place of a scheme trying to preserve orthogonality of the n-dimensional coordinate
axes, draw them as parallel. A vector (x1, x2, ..., xn) is plotted by plotting x1 on axis 1, x2 on axis 2
and so on through xn on axis n. The points plotted in this manner are joined by a broken line. Figure
2.1 illustrates two points (one solid, one dashed) plotted in parallel coordinate representation. In this
illustration, the two points agree in the fourth coordinate. The principal advantage of this plotting
device is clear. Each vector (x1, x2, ..., xn) is represented in a planar diagram so that each vector
component has essentially the same representation.
The parallel coordinates proposal has its roots in a number of sources. Griffen (1958) considers
a 2-dimensional parallel coordinate type device as a method for graphically computing the Kendall tau
correlation coefficient. Hartigan (1975) describes the "profiles algorithm" which he describes as
"histograms on each variable connected between variables by identifying cases." Although he does not
recommend drawing all profiles, a profile diagram with all profiles plotted is a parallel coordinate plot.
There is however far more mathematical structure, particularly high dimensional structure, to the
parallel coordinate diagram than Hartigan exploits. Inselberg (1985) originated the parallel coordinate

*This research was sponsored by the Army Research Office, Contract DAAL03-87-K-0087

representation as a device for computational geometry. His 1985 paper is the culmination of a series of
technical reports dating from 1981. Finally we note that Diaconis and Friedman (1983) discuss the so-
called M and N plots. Their special case of a 1 and 1 plot is a parallel coordinate plot in two
dimensions. Indeed, the 1 and 1 plot is sometimes called a before-and-after plot and has a much older
history. The fundamental theme of this paper is that the transformation from Cartesian coordinates to
parallel coordinates is a highly structured mathematical transformation, hence, maps mathematical
objects into mathematical objects. Certain of these can be given highly useful statistical
interpretations so that this representation becomes a highly useful data analysis tool.
3. Parallel Coordinate Geometry. The parallel coordinate representation enjoys some elegant
duality properties with the usual Cartesian orthogonal coordinate representation. Consider a line ℓ in
the Cartesian coordinate plane given by ℓ: y = mx + b and consider two points lying on that line, say
(a, ma+b) and (c, mc+b). For simplicity of computation we consider the xy Cartesian axes mapped
into the xy parallel axes as described in Figure 3.1. We superimpose Cartesian coordinate axes t, u on
the xy parallel axes so that the y parallel axis has the equation u = 1. The point (a, ma+b) in the xy
Cartesian system maps into the line joining (a, 0) to (ma+b, 1) in the tu coordinate axes. Similarly,
(c, mc+b) maps into the line joining (c, 0) to (mc+b, 1). It is a straightforward computation to show
that these two lines intersect at a point (in the tu plane) given by ℓ̄: (b(1−m)^{-1}, (1−m)^{-1}). Notice
that this point in the parallel coordinate plot depends only on m and b, the parameters of the original
line in the Cartesian plot. Thus ℓ̄ is the dual of ℓ and we have the interesting duality result that
points in Cartesian coordinates map into lines in parallel coordinates while lines in Cartesian
coordinates map into points in parallel coordinates.
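As a quick numerical check of this duality (our own illustration, not from the paper), the dual of a point (x, y) is parameterized below as t(u) = x + (y − x)u, and the duals of any two points on y = mx + b are verified to meet at (b(1−m)^{-1}, (1−m)^{-1}):

    def dual(x, y):
        """Parallel-coordinate image of the Cartesian point (x, y): the line
        through (x, 0) and (y, 1) in the (t, u) plane, written t(u) = x + (y - x) u
        and returned as (slope in u, intercept)."""
        return y - x, x

    m, b = -2.0, 3.0                        # any line with slope m != 1
    for a, c in [(0.0, 1.0), (-1.0, 5.0)]:  # two pairs of points on the line
        s1, i1 = dual(a, m * a + b)
        s2, i2 = dual(c, m * c + b)
        u = (i2 - i1) / (s1 - s2)           # solve i1 + s1*u = i2 + s2*u
        t = i1 + s1 * u
        assert abs(u - 1.0 / (1.0 - m)) < 1e-12
        assert abs(t - b / (1.0 - m)) < 1e-12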
For 0 < (1−m)^{-1} < 1, m is negative and the intersection occurs between the parallel
coordinate axes. For m = −1, the intersection is exactly midway. A ready statistical interpretation
can be given. For highly negatively correlated pairs, the dual line segments in parallel coordinates will
tend to cross near a single point between the two parallel coordinate axes. The scale of one of the
variables may be transformed in such a way that the intersection occurs midway between the two
parallel coordinate axes, in which case the slope of the linear relationship is negative one.
In the case that (1−m)^{-1} < 0 or (1−m)^{-1} > 1, m is positive and the intersection occurs external
to the region between the two parallel axes. In the special case m = 1, this formulation breaks down.
However, it is clear that the point pairs are (a, a+b) and (c, c+b). The dual lines to these points are
the lines in parallel coordinate space with slope b^{-1} and intercepts −ab^{-1} and −cb^{-1} respectively. Thus
the duals of these points in parallel coordinate space are parallel lines with slope b^{-1}. We thus append
the ideal points to the parallel coordinate plane to obtain a projective plane. These parallel lines
intersect at the ideal point in direction b^{-1}. In the statistical setting, we have the following
interpretation. For highly positively correlated data, we will tend to have lines not intersecting
between the parallel coordinate axes. By suitable linear rescaling of one of the variables, the lines may
be made approximately parallel in direction with slope b^{-1}. In this case the slope of the linear
relationship between the rescaled variables is one. See Figure 3.2 for an illustration of large positive
and large negative correlations. Of course, nonlinear relationships will not respond to simple linear
rescaling. However, by suitable nonlinear transformations, it should be possible to transform to
linearity. The point-line, line-point duality seen in the transformation from Cartesian to parallel
coordinates extends to conic sections. An instructive computation involves computing in the parallel
coordinate space the image of an ellipse, which turns out to be a general hyperbolic form. For purposes
of conserving space we do not provide the details here.
It should be noted, however, that the solution to this computation is not a locus of points, but
a locus of lines, a line conic. The envelope of this line conic is a point conic. In the case of this
computation, the point conic in the original Cartesian coordinate plane is an ellipse, and the image in the
parallel coordinate plane is, as we have just seen, a line hyperbola with a point hyperbola as envelope.
Indeed, it is true that a conic will always map into a conic and, in particular, an ellipse will always
map into a hyperbola. The converse is not true. Depending on the details, a hyperbola may map into
an ellipse, a parabola or another hyperbola. A fuller discussion of projective transformations of conics
is given by Dimsdale (1984). Inselberg (1985) generalizes this notion into parallel coordinates resulting
in what he calls hstars.
We mentioned the duality between points and lines and conics and conics. It is worthwhile to
point out two other nice dualities. Rotations in Cartesian coordinates become translations in parallel
coordinates and vice versa. Perhaps more interesting from a statistical point of view is that points of
inflection in Cartesian space become cusps in parallel coordinate space and vice versa. Thus the
relatively hard-to-detect inflection point property of a function becomes the notably more easy to
detect cusp in the parallel coordinate representation. Inselberg (1985) discusses these properties in
detail.
4. Further Statistical Interpretations. Since ellipses map into hyperbolas, we can have an easy
template for diagnosing uncorrelated data pairs. Consider Figure 3.2. With a completely uncorrelated
data set, we would expect the 2-dimensional scatter diagram to fill substantially a circumscribing
circle. As illustrated in Figure 3.2, the parallel coordinate plot would approximate a figure with a
hyperbolic envelope. As the correlation approaches negative one, the hyperbolic envelope would deepen
so that in the limit we would have a pencil of lines, what we like to call the cross-over effect. As the
correlation approaches positive one, the hyperbolic envelope would widen with fewer and fewer cross-
overs so that in the limit we would have parallel lines. Thus correlation structure can be diagnosed
from the parallel coordinate plot. As noted earlier, Griffen (1958) used this as a graphical device for
computing the Kendall tau.
Griffen, in fact, attributes the graphical device to Holmes (1928), which predates Kendall's
discussion. The computational formula is

τ = 1 − 4X / [n(n−1)],

where X is the number of intersections resulting by connecting the two rankings of each member by
lines, one ranking having been put in natural order. While the original formulation was framed in
terms of ranks for both x and y axes, it is clear that the number of crossings is invariant to any
monotone increasing transformation of either x or y, the ranks being one such transformation. Because
of this scale invariance, one would expect rank-based statistics to have an intimate relationship to
parallel coordinates.
It is clear that if there is a perfect positive linear relationship with no crossings, then X = 0
and τ = 1. Similarly, if there is a perfect negative linear relationship, Figure 3.2 is again appropriate
and we have a pencil of lines. Since every line meets every other line, the number of intersections is
n(n−1)/2, so that

τ = 1 − 4 · [n(n−1)/2] / [n(n−1)] = 1 − 2 = −1.
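A small sketch (ours, assuming no ties in either variable) of the crossing-count computation: the segments for observations i and j cross between the two axes exactly when the pair is discordant, so counting crossings yields Kendall's tau.

    def tau_by_crossings(x, y):
        n = len(x)
        crossings = 0
        for i in range(n):
            for j in range(i + 1, n):
                if (x[i] - x[j]) * (y[i] - y[j]) < 0:   # discordant pair = crossing
                    crossings += 1
        return 1 - 4 * crossings / (n * (n - 1))

    assert tau_by_crossings([1, 2, 3], [3, 2, 1]) == -1.0   # pencil of lines
    assert tau_by_crossings([1, 2, 3], [1, 2, 3]) == 1.0    # parallel segments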

It should be further noted that clustering is easily diagnosed using the parallel coordinate
representation.
So far we have focused primarily on pairwise parallel coordinate relationships. The idea
however is that we can, so to speak, stack these diagrams and represent all n dimensions
simultaneously. Figure 4.1 thus illustrates 6-dimensional Gaussian uncorrelated data plotted in
parallel coordinates. A 6-dimensional ellipsoid would have a similar general shape but with hyperbolas
of different depths. This data is deep ocean acoustic noise and is illustrative of what might be
expected.
Figure 4.2 is illustrative of some data structures one might see in a five-dimensional data set.
First it should be noted that the plots along any given axis represent dot diagrams (a refinement of the
histograms of Hartigan), hence convey graphically the one-dimensional marginal distributions. In this
illustration, the first axis is meant to have an approximately normal distribution shape while axis two
has the shape of the negative of a χ². As discussed above, the pairwise comparisons can be made. Figure
4.2 illustrates a number of instances of linear (both negative and positive), nonlinear and clustering
situations. Indeed, it is clear that there is a 3-dimensional cluster along coordinates 3, 4 and 5.
Consider also the appearance of a mode in parallel coordinates. The mode is, intuitively
speaking, the location of the most intense concentration of probability. Hence, in a sampling situation
it will be the location of the most intense concentration of observations. Since observations are
represented by broken line segments, the mode in parallel coordinates will be represented by the most
intense bundle of broken line paths in the parallel coordinate diagram. Roughly speaking, we should
look for the most intense flow through the diagram. In Figure 4.2, such a flow begins near the center
of coordinate axis one and finishes on the left-hand side of axis five.
Figure 4.2 thus illustrates some data analysis features of the parallel coordinate representation
including the ability to diagnose one-dimensional features (marginal densities), two-dimensional
features (correlations and nonlinear structures), three-dimensional features (clustering) and a five-
dimensional feature (the mode). In the next section of this paper we consider a real data set which will
be illustrative of some additional capabilities.
5. An Auto Data Example. We illustrate parallel coordinates as an exploratory analysis tool
on data about 88 1980 model year automobiles. They consist of price, miles per gallon, gear ratio,
weight and cubic inch displacement. For n = 5, 3 presentations are needed to present all pairwise
permutations. Figures 5.1, 5.2 and 5.3 are these three presentations. In Figure 5.1, perhaps the most
striking feature is the cross-over effect evident in the relationship between gear ratio and weight. This
suggests a negative correlation. Indeed, this is reasonable since a heavy car would tend to have a large
engine providing considerable torque thus requiring a lower gear ratio. Conversely, a light car would
tend to have a small engine providing small amounts of torque thus requiring a higher gear ratio.
Consider as well the relationship between weight and cubic inch displacement. In this diagram
we have a considerable amount of approximate parallelism (relatively few crossings) suggesting positive
correlation. This is a graphic representation of the fact that big cars tend to have big engines, a fact
most are prepared to believe. Quite striking however is the negative slope going from low weight to
moderate cubic inch displacement. This is clearly an outlier which is unusual in neither variable alone but in
their joint relationship. The same observation is highlighted in Figure 5.2.
The relationship between miles per gallon and price is also perhaps worthy of comment. The
left-hand side shows an approximate hyperbolic boundary while the right-hand side clearly illustrates
the cross-over effect. This suggests for inexpensive cars or poor mileage cars there is relatively little
correlation. However, costly cars almost always get relatively poor mileage while good gas mileage cars
are almost always relatively inexpensive.
Turning to Figure 5.2, the relationship between gear ratio and miles per gallon is instructive.
This diagram is suggestive of two classes. Notice that there are a number of observations represented
by line segments tilted slightly to the right of vertical (high positive slope) and a somewhat larger
number with a negative slope of about −1. Within each of these two classes we have approximate
parallelism. This suggests that the relationship between gear ratios and miles per gallon is
approximately linear, a believable conjecture since low gears = big engines = poor mileage while high
gears = small engines = good mileage. What is intriguing, however, is that there seem to be really
two distinct classes of automobiles each exhibiting a linear relationship, but with different linear
relationships within each class.
Indeed in Figure 5.3, the third permutation, we are able to highlight this separation into two
classes in a truly 5-dimensional sense. The shaded region in Figure 5.3 describes a class of vehicles with
relatively poor gas mileage, relatively heavy, relatively inexpensive, relatively large engines and
relatively low gear ratios. Figure 5.4 is a repeat of this graphic but with different shading highlighting
a class of vehicles with relatively good gas mileage, relatively light weight, relatively inexpensive,
relatively small engines and relatively high gear ratios. In 1980, these two characterizations describe
respectively domestic automobiles and imported automobiles.
6. Graphical Extensions of Parallel Coordinate Plots. The basic parallel coordinate idea
suggests some additional plotting devices. We call these respectively the Parallel Coordinate Density
Plots, Relative Slope Plots and Color Histograms. These are extensions of the basic idea of parallel
coordinates, but structured to exploit additional features or to convey certain information more easily.
6.1 Parallel Coordinate Density Plots. While the basic parallel coordinate plot is a useful
device itself, like the conventional scatter diagram, it suffers from heavy overplotting with large data
sets. In order to get around this problem, we use a parallel coordinate density plot which is computed
as follows. Our algorithm is based on the Scott (1985) notion of average shifted histogram (ASH) but
adapted to the parallel coordinate context. As with an ordinary two dimensional histogram, we decide
on appropriate rectangular bins. A potential difficulty arises because a line segment representing a
point may appear in two or more bins in the same horizontal slice. Obviously if we have k n-
dimensional observations, we would like to form a histogram based on k entries. However, since the
line segment could appear in two or more bins in a horizontal slice, the count for any given horizontal
slice is at least k and may be bigger. Moreover, every horizontal slice may not have the same count.
To get around this, we convert line segments to points by intersecting each line segment with a
horizontal line passing through the middle of the bin. This gives us an exact count of k for each
horizontal slice. We construct an ASH for each horizontal slice (typically averaging 5 histograms to
form our ASH). We have used contours to represent the two-dimensional density although gray scale
shading could be used in a display with sufficient bit-plane memory. An example of a parallel
coordinate density plot is given in Figure 6.1. Parallel coordinate density plots have the advantage of
being graphical representations of data sets which are simultaneously high dimensional and very large.
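The sketch below is our reading of the algorithm for a single panel, under the assumptions that the two axes sit at u = 0 and u = 1 and the data have been rescaled to [0, 1]; the ASH with m shifts is computed, following Scott (1985), as a triangular-weight smooth of counts on a grid m times finer than the histogram bins.

    import numpy as np

    def ash_1d(t, n_bins=40, m=5, lo=0.0, hi=1.0):
        """Average shifted histogram of the points t on a fine grid of n_bins*m cells."""
        fine = n_bins * m
        delta = (hi - lo) / fine
        counts, _ = np.histogram(t, bins=fine, range=(lo, hi))
        weights = np.array([(m - abs(i)) / m for i in range(-m + 1, m)])
        return np.convolve(counts, weights, mode="same") / (len(t) * m * delta)

    def panel_density(xj, xk, n_slices=20, **ash_args):
        """Density matrix for the panel between adjacent axes xj (at u = 0) and xk (at u = 1)."""
        rows = []
        for r in range(n_slices):
            u = (r + 0.5) / n_slices          # midline of horizontal slice r
            t = xj + (xk - xj) * u            # each segment reduced to one point
            rows.append(ash_1d(t, **ash_args))
        return np.array(rows)                 # contour or gray-scale this matrix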
6.2 Relative Slope Plots. We have already seen that parallel line segments in a parallel
coordinate plot correspond to high positive correlation (linear relationship). As in our automobile
example, it is possible for two or more sets of linear relationships to exist simultaneously. In an
ordinary parallel coordinate plot, we see these as sets of parallel lines with distinct slopes. The work of
Cleveland and McGill (1984b) suggests that comparison of slopes (angles) is a relatively inaccurate
judgement task and that it is much easier to compare magnitudes on the same scale. The relative
slope plot is motivated by this. In an n-dimensional relative slope plot there are n−1 parallel axes,
each corresponding to a pair of adjacent coordinate axes, say x_j and x_{j+1}, with x_j regarded as the lower of the two coordinate
axes. For each observation, the slope of the line segment between the pair of axes is plotted as a
magnitude between −1 and +1. The maximum positive slope is coded as +1, the minimum negative
slope as −1 and a slope of ∞ as 0. The magnitude is calculated as cot θ, where θ is the angle between
the x_j axis and the line segment corresponding to the observation. Each individual observation in the
relative slope plot corresponds to a vertical section through the axis system. An example of a relative
slope plot is given in Figure 6.2. Notice that since slopes are coded as heights, simply laying a
straightedge will allow us to discover sets of linear relationships within the pair of variables x_j and x_{j+1}.
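A rough sketch of the coding (our own; the paper does not spell out the normalization, so scaling by the maximum absolute cotangent is an assumption, as is the convention of horizontal axes one unit apart with data rescaled to [0, 1]):

    import numpy as np

    def relative_slopes(x_lower, x_upper):
        """Signed magnitudes in [-1, 1] for segments between two adjacent axes."""
        run = np.asarray(x_upper, float) - np.asarray(x_lower, float)
        cot = run            # cot(theta) with rise 1; vertical segment -> 0
        m = np.max(np.abs(cot))
        return cot if m == 0 else cot / m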
6.3 Color Histogram. The basic set-up for the color histogram is similar to the relative slope
plots. For an n-dimensional data set, there are n parallel axes. A vertical section through the diagram
corresponds to an observation. The idea is to code the magnitude of an observation along a given axis
by a color bin, the colors being chosen to form a color gradient. We typically choose 8 to 15 colors.
The diagram is drawn by choosing an axis, say x_k, and sorting the observations in ascending order.
Along this axis, we see blocks of color arranged according to the color gradient with the width of the
block being proportional to the number of observations falling into the color bin. The observations on
the other axes are arranged in the order corresponding to the x_k axis and color coded according to their
magnitude. Of course, if the same color gradient shows up say on the x_m axis as on the x_k, then we
know x_k is positively "correlated" with x_m. If the color gradient is reversed, we know the "correlation"
is negative. We used the phrase "correlation" advisedly since in fact if the color gradient is the same
but the color block sizes are different, the relationship is nonlinear. Of course if the x_m axis shows
color speckle, there is no "correlation" and x_k is unrelated to x_m. An example of a color histogram is
given in Figure 6.3 (for purposes of reproduction here it is really a gray-scale histogram).
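A compact sketch of the construction (ours; equal-width color bins per axis are an assumption, since the paper does not specify the binning):

    import numpy as np

    def color_histogram_codes(X, k, g=12):
        """Color-bin indices (0..g-1) for each axis, rows ordered by axis k."""
        order = np.argsort(X[:, k])
        codes = np.empty(X.shape, dtype=int)
        for j in range(X.shape[1]):
            col = X[order, j]
            edges = np.linspace(col.min(), col.max(), g + 1)
            codes[:, j] = np.clip(np.digitize(col, edges) - 1, 0, g - 1)
        return codes    # same gradient down two columns suggests positive "correlation"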
7. Implementations and Experiences. Our parallel coordinates data analysis software has been
implemented in two forms, one a PASCAL program operating on the IBM RT under the AIX
operating system. This code allows for up to four simultaneous windows and offers simultaneous
display of parallel coordinates and scatter diagram displays. It offers highlighting, zooming and other
similar features and also allows the possibility of nonlinear rescaling of each axis. It incorporates axes
permutations and also includes Parallel Coordinate Density Plots, Relative Slope Plots and Color
Histograms.
Our second implementation is under development in PASCAL for MS-DOS machines and
includes similar features. In addition, it has a mouse-driven painting capability and can do real-time
rotation of 3-dimensional scatterplots. Both programs use EGA graphics standards, with the second
also using VGA or Hercules monochrome standards.
We regard the parallel coordinate representation as a device complementary to scatterplots. A
major advantage of the parallel coordinate representation over the scatterplot matrix is the linkage
provided by connecting points on the axes. This linkage is difficult to duplicate in the scatterplot
matrix. Because of the projective line-point duality, the structures seen in a scatterplot can also be
seen in a parallel coordinate plot. Moreover, the work of Cleveland and McGill (1984b) suggests that
it is easier and more accurate to compare observations on a common scale. The parallel coordinate
plot and the derivatives of it de facto have a common scale and so, for example, a sense of variability
and central tendency among the variables is easier to grasp visually in parallel coordinates when
compared with the scatterplot matrix. On the other hand, one might interpret all the ink generated by
the lines as a significant disadvantage of the parallel coordinate plot. Our experience on this is mixed.
Certainly for large data sets on hard copy this is a problem. When viewed on an interactive graphics
screen, particularly a high resolution screen, we have often found that individual points in a scatterplot
can get lost because they are simply not bright enough. That does not happen in a parallel coordinate
plot. However, if many points are plotted in monochrome, it is hard to distinguish between points.
We have gotten around this problem by plotting distinct points in different colors. In an EGA
implementation, this means 16 colors. This is surprisingly effective in separating points. In one
experiment, we plotted 5000 5-dimensional random vectors using 16 colors, and in spite of total
overplotting, we were still able to see some structure. In data sets of somewhat smaller scale, we have
implemented a scintillation technique. With this technique, when there is overplotting we cause the
screen view to scintillate between the colors representing the overplotted points. The speed of
scintillation is proportional to the number of points overplotted and by carefully tracing colors, one
can follow an individual point through the entire diagram.
We have found painting to be an extraordinarily effective technique in parallel coordinates.
We have a painting scheme that not only paints all lines within a given rectangular area, but also all
lines lying between two slope constraints. This is very effective in separating clusters. We also use
invisible paint to eliminate observation points from the data set temporarily. This is a natural way of
doing a subset selection.
References

Asimov, Daniel (1985), "The grand tour: a tool for viewing multidimensional data," SIAM J. Scient. Statist. Comput., 6, 128-143.

Carr, D. B., Nicholson, W. L., Littlefield, R. J., and Hall, D. L. (1986), "Interactive color display methods for multivariate data," in Statistical Image Processing and Graphics, (Wegman, E. and DePriest, D., eds.), New York: Marcel Dekker, Inc.

Chernoff, H. (1973), "Using faces to represent points in k-dimensional space," J. Am. Statist. Assoc., 68, 361-368.

Cleveland, W. S. and McGill, R. (1984a), "The many faces of the scatterplot," J. Am. Statist. Assoc., 79, 807-822.

Cleveland, W. S. and McGill, R. (1984b), "Graphical perception: theory, experimentation, and application to the development of graphical methods," J. Am. Statist. Assoc., 79, 531-554.

Diaconis, P. and Friedman, J. (1983), "M and N plots," in Recent Advances in Statistics, 425-447, New York: Academic Press, Inc.

Dimsdale, B. (1984), "Conic transformations and projectivities," IBM Los Angeles Scientific Center Report #6320-2753.

Fienberg, S. (1979), "Graphical methods in statistics," Am. Statistician, 33, 165-178.

Friedman, J. and Tukey, J. W. (1973), "PRIM-9," a film produced by Stanford Linear Accelerator Center, Stanford, CA: Bin 88 Productions, April, 1973.

Friedman, J. and Tukey, J. W. (1974), "A projection pursuit algorithm for exploratory data analysis," IEEE Trans. Comput., C-23, 881-889.

Griffen, H. D. (1958), "Graphic computation of tau as a coefficient of disarray," J. Am. Statist. Assoc., 53, 441-447.

Hartigan, John A. (1975), Clustering Algorithms, New York: John Wiley and Sons, Inc.

Holmes, S. D. (1928), "Appendix B: a graphical method for estimating R for small groups," 391-394 in Educational Psychology (Peter Sandiford, auth.), New York: Longmans, Green and Co.

Inselberg, A. (1985), "The plane with parallel coordinates," The Visual Computer, 1, 69-91.

Scott, D. W. (1985), "Average shifted histograms: effective nonparametric density estimators in several dimensions," Ann. Statist., 13, 1024-1040.

[Figures 2.1, 3.1 and 3.2 appeared here: the parallel coordinate plot of two points, the mapping of the xy Cartesian axes into parallel axes, and the appearance of large positive and large negative correlations. Figure content not recoverable from source.]
Figure 4.1a Parallel coordinate plot of a circle.

Figure 4.1b Parallel coordinate plot of 6 channel sonar data.
The data is uncorrelated Gaussian noise. The second
coordinate represents a relatively remote hydrophone and has a
somewhat different mean. Notice the approximate hyperbolic
shape.
Figure 4.2 A five dimensional scatter diagram in parallel
coordinates illustrating marginal densities, correlations, three-
dimensional clustering and a five dimensional mode.

[Figure 5.1: the first permutation of the five dimensional automobile data; the surviving caption fragment mentions gear ratio and weight.]
Figure 5.2 The second permutation of the five dimensional
presentation of the automobile data. Notice the two classes of
linear relations between gear ratio and miles per gallon.
Figure 5.3 The third permutation of the five dimensional
automobile data. Note the highlighting of the domestic
automobile group.
Figure 5.4 The third permutation showing highlighting of the
imported automobile group.
Figure 6.1 Parallel coordinate density plot of 5000 uniform
random variables. This plot has five contour
levels: 5%, 25%, 50%, 75% and 95%.

Figure 6.3 Color histogram of 13 dimensional automobile data.
This plot is shown in grey scale for purposes of reproduction.
[Figure 6.2 axis labels: Price, Displacement, Gear Ratio.]

Figure 6.2 Relative slope plot of five dimensional automobile
data. Data presented in the same order as in Figure 5.4.
COMPUTATIONAL AND STATISTICAL ISSUES
IN DISCRETE-EVENT SIMULATION

Peter W. Glynn
and
Donald L. Iglehart

Department of Operations Research
Stanford University
Stanford, CA 94305

Abstract

Discrete-event simulation is one of the most important techniques available for study-
ing complex stochastic systems. In this paper we review the principal methods available
for analyzing both the transient and steady-state simulation problems in sequential and
parallel computing environments. Next we discuss several of the variance reduction meth-
ods designed to make simulations run more efficiently. Finally, a short discussion is given
of the methods available to study system optimization using simulation.

Keywords: stochastic simulation, output analysis, variance reduction, parallel computa-
tion, and system optimization.

1. Introduction.
Computer simulation of complex stochastic systems is an important technique for
evaluating system performance. The starting point for this method is to formulate the
time-varying behavior of the system as a basic stochastic process Y ≡ {Y(t) : t ≥ 0},
where Y(·) may be vector-valued. [Discrete time processes can also be handled.] Next
a computer program is written to generate sample realizations of Y. Simulation output
is then obtained by running this program. Our discussion in this paper is centered on
the analysis of this simulation output, the goal being to develop sound probabilistic and
statistical methods for estimating system performance.
Two principal problems arise: the transient simulation problem and the steady-state
simulation problem. Let T denote a stopping time and X ≡ h{Y(t) : 0 ≤ t ≤ T}, where h
is a given real-valued function. The transient problem is to estimate α ≡ E{X}. Examples
of α include the following:

α = E{f(Y(t_0))},

α = E{ ∫_0^{t_0} f(Y(s)) ds },

and

α = P{Y does not enter A before t_0}.

Here t_0 is a fixed time (> 0), f is a given real-valued function, and A is a given subset of
the state-space of Y. The transient problem is relevant for systems running for a limited
(but possibly random) length of time that cannot be expected to reach a steady-state. Our
goal here is to provide both point and interval estimates for α.
For the steady-state problem we assume the Y process is asymptotically stationary
in the sense that

t^{-1} ∫_0^t f(Y(s)) ds ⇒ α

as t → ∞. Here ⇒ denotes weak convergence and f is a given real-valued function
defined on the state-space of Y. The easiest example to think about here is an irreducible,
positive recurrent, continuous time Markov chain. In this case Y(t) ⇒ Y as t → ∞ and
α = E{f(Y)}. Examples of α in this case include the following:

α = E{Y} (when Y is real-valued),

α = P{Y ∈ A},

and

α = E{c(Y)},

where c is a given cost function. Again as in the transient case, we wish to construct both
point and interval estimates for α.

2. Transient Problem.
Assume we have a computational budget of t time units with which to simulate the
process Y and estimate α ≡ E{X}, as defined in Section 1. In a sequential computing
environment we would generate independent, identically distributed (iid) copies

(X_1, τ_1), (X_2, τ_2), ...,

where the X_j's are copies of X and τ_j is the computer time required to generate X_j. Let
N(t) denote the number of copies of X generated in time t; this is just the renewal process
associated with the iid τ_j's. A natural point estimator for α is

X̄_{N(t)} = N(t)^{-1} Σ_{j=1}^{N(t)} X_j  if N(t) > 0,  and  X̄_{N(t)} = 0  if N(t) = 0.

The standard asymptotic results for X̄_{N(t)} are the strong law of large numbers (SLLN)
and the central limit theorem (CLT).

STRONG LAW OF LARGE NUMBERS. If E{τ_1} < ∞ and E{|X_1|} < ∞, then
X̄_{N(t)} → α a.s. as t → ∞.

CENTRAL LIMIT THEOREM. If E{τ_1} < ∞ and var{X_1} < ∞, then

t^{1/2} [X̄_{N(t)} − α] ⇒ (E{τ_1} · var{X_1})^{1/2} · N(0, 1),

where N(0, 1) is a mean zero, variance one normal random variable. The SLLN follows
from the SLLN for iid summands and the SLLN for renewal processes. The CLT result
can be found in BILLINGSLEY (1968), Section 17.

From the SLLN we see that X̄_{N(t)} is a strongly consistent point estimator for α. Thus
for large t we would use X̄_{N(t)} as our point estimate. On the other hand, the CLT can be
used in the standard manner to construct a confidence interval for α. Here the constant
E{τ_1} · var{X_1} appearing in the CLT would have to be estimated.
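An illustrative Monte Carlo sketch (ours, not from the paper) of this sequential fixed-budget procedure; the replication generator and the 90% confidence level are hypothetical choices.

    import math, random

    def fixed_budget_estimate(gen_pair, budget, z=1.645):
        """gen_pair() returns one replication (X, tau); budget is the time t."""
        xs, clock = [], 0.0
        while True:
            x, tau = gen_pair()
            if clock + tau > budget:
                break                          # replication in process at time t
            clock += tau
            xs.append(x)
        n = len(xs)
        if n == 0:
            return None
        mean = sum(xs) / n
        var = sum((x - mean) ** 2 for x in xs) / max(n - 1, 1)
        tau_bar = clock / n                    # estimates E{tau_1}
        half = z * math.sqrt(tau_bar * var / budget)   # from the CLT above
        return mean, (mean - half, mean + half)

    # hypothetical transient quantity: X = 1 if an Exp(1) sojourn exceeds 1, else 0
    result = fixed_budget_estimate(
        lambda: (1.0 if random.expovariate(1.0) > 1.0 else 0.0,
                 random.uniform(0.5, 1.5)), budget=500.0)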
Suppose now that we are in a parallel computing environment with p independent
processors. Now we wish to estimate α for a fixed t as p → ∞. On the p processors we
generate iid copies of (X, τ):

(X_{11}, τ_{11}), (X_{12}, τ_{12}), ..., (X_{1,N_1(t)}, τ_{1,N_1(t)})
(X_{21}, τ_{21}), (X_{22}, τ_{22}), ..., (X_{2,N_2(t)}, τ_{2,N_2(t)})
. . .
(X_{p1}, τ_{p1}), (X_{p2}, τ_{p2}), ..., (X_{p,N_p(t)}, τ_{p,N_p(t)}).


A number of estimators have been proposed for estimating α = E{X}. The most natural
estimator to consider first is that obtained by averaging the realizations of X across each
processor and then averaging these sample means. This leads to

α_1(p, t) = p^{-1} Σ_{i=1}^{p} X̄_i(t),

where

X̄_i(t) = N_i(t)^{-1} Σ_{j=1}^{N_i(t)} X_{ij}  if N_i(t) > 0,  and  X̄_i(t) = 0  if N_i(t) = 0.

Here the processing ends on all processors at time T = t. If E{τ_{11}} < ∞ and E{|X_{11}|} < ∞,
then for all t > 0

α_1(p, t) → E{X̄_{N(t)}} = E{X̄_{N(t)} · 1_{N(t)>0}}  a.s.

as p → ∞. Here 1_A is the indicator function of the set A. Unfortunately, E{X} ≠
E{X̄_{N(t)} · 1_{N(t)>0}} in general, and so α_1(p, t) is not strongly consistent for α as p → ∞.
The next estimator for α was proposed by HEIDELBERGER (1987). For this esti-
mator we let all processors complete the replication in process at time t. The estimator
is

α_2(p, t) = [ Σ_{i=1}^{p} Σ_{j=1}^{N_i(t)+1} X_{ij} ] / [ Σ_{i=1}^{p} (N_i(t) + 1) ].
Here all processors complete by time

T_p = max_{1≤i≤p} [τ_{i1} + τ_{i2} + ... + τ_{i,N_i(t)+1}].

Unfortunately, T_p → +∞ a.s. as p → ∞. However, α_2(p, t) is strongly consistent for α.
To see this, note that if E{|X|} < ∞ and P{τ > 0} > 0, then as p → ∞

α_2(p, t) → E{ Σ_{j=1}^{N(t)+1} X_j } / E{N(t) + 1} = E{X}  a.s.

The equality above is simply Wald's equation. Finally, since α_2(p, t) is a ratio estimator,
a CLT is also available from which a confidence interval can be constructed.
a CLT is also available from which a confidence interval can be constructed.
The last estimator we consider was proposed by HEIDELBERGER and GLYNN
(1987). Here we set

α_3(p, t) = p^{-1} Σ_{i=1}^{p} X̃_i(t),

where

X̃_i(t) = X̄_i(t) · 1_{N_i(t)≥1} + X_{i1} · 1_{τ_{i1}>t}.

Given N(t) ≥ 1, Heidelberger and Glynn show that the pairs of random variables (X_1, τ_1),
..., (X_{N(t)}, τ_{N(t)}) are exchangeable. Using this fact, they prove that E{X̃_1(t)} = E{X_1}.
Since the X̃_i(t)'s are iid, we see that α_3(p, t) is strongly consistent for α = E{X}. Since
the summands in α_3(p, t) are iid, the standard CLT holds (under appropriate variance
assumptions) and can be used to develop a confidence interval for α. Note that the
definition of X̃_i(t) requires the ith processor to complete the replication in process at
time t if no observations have been completed by time t; i.e., if τ_{i1} > t. Thus the completion
time for all p processors is given by

T_p = max{ t, max_{1≤i≤p} τ_{i1} }.

While T_p → ∞ a.s. as p → ∞ (if P{τ_{i1} > t} > 0), T_p goes to infinity at a much slower
rate than is the case for α_2(p, t). They also show that the following CLT holds:

t^{1/2} [X̃_1(t) − α] ⇒ σ · E^{1/2}{τ_1} · N(0, 1)

as t → ∞, where we assume 0 < σ² = var{X} < ∞ and 0 < E{τ_1} < ∞. Thus X̃_1(t)
can also be used in a sequential environment to estimate α.

3. Steady-State Problem.
The steady-state estimation problem is considerably more difficult than the transient
estimation problem. This difficulty stems from the following considerations: (1) need to
estimate long-run system behavior from a finite length simulation run; (1i) an initial bias
(or transient) usually is present since the process being simulated is non-stationary; and
(ii) strong autocorrelations are usually present in the process being simulated. While
classical statistical methods can often be used for the transient estimation problem, these
methods generally fail for the steady-state estimation problem for the reasons mentioned
above.
Assume our simulation output process is Y a {Y(t) : t 2_0} and for a given real-
valued function f
(t) 0 [Y(j)]d a. (1)

As stated above, we wish to construct point and interval estimators for a. In addition to
(1), many methods also assume that a positive constant ty exists such that the following
CLT holds:
It,((t)- ] . o.. NO, 1) (2)

as t -.* . From (1) and (2) we can construct a point estimate and confidence interval for
a provided we can estimate a. Estimating a is generally the hardest problem.
A variety of methods have been developed to address the steady-state estimation
problem. In Figure 1 we have given a breakdown of these methods. Most of the methods
are single replicate methods, since multiple replicate methods tend to be inefficient because
of the initial bias problem.
Here we only consider single replicate methods. These methods are of two types:
those that consistently estimate σ and those in which σ is cancelled out.
For consistent estimation of σ, we need a process {s(t) : t ≥ 0} such that s(t) ⇒ σ,
in which case (2) leads to a 100(1 − δ)% confidence interval for α given by

[α(t) − z(1 − δ/2)·s(t)/t^{1/2}, α(t) + z(1 − δ/2)·s(t)/t^{1/2}],

where Φ(z(1 − δ/2)) = 1 − δ/2 and Φ is the standard normal distribution function.

[Figure 1: a breakdown of steady-state estimation methods: multiple replication versus single replication; within single replication, methods that consistently estimate σ (e.g., regenerative, spectral, and time series methods) and methods that cancel σ out.]
On the other hand, the canceling out methods require a non-vanishing process {Z(t) :
t ≥ 0} such that

[t^{1/2}·(α(t) − α), Z(t)] ⇒ [σ·N(0, 1), σ·Z]

as t → ∞. Then using the continuous mapping theorem (cf. BILLINGSLEY (1968), p.
30) we have

t^{1/2}·(α(t) − α)/Z(t) ⇒ N(0, 1)/Z    (3)

as t → ∞. Note from (3) that σ has been cancelled out in a manner reminiscent of the
t-statistic.

First we discuss one of the methods in which σ is consistently estimated, namely,
the regenerative method; see IGLEHART (1978) for a discussion of this method plus
other background material. Here we assume that the simulation output process Y is a
regenerative process. We are given a real-valued function f and wish to estimate α(f) ≡
E{f(Y)}, where Y(t) ⇒ Y as t → ∞. Again it is convenient to think of Y as an
irreducible, positive recurrent, continuous time Markov chain. Let T(0) = 0, T_1, T_2, ... be
the regeneration times for Y and set τ_i = T_i − T_{i−1}, i ≥ 1. The τ_i's are the lengths of
the regenerative cycles. Next define the areas under the Y process in the kth regenerative
cycle by

Y_k(f) = ∫_{T_{k−1}}^{T_k} f(Y(s)) ds.

The following basic facts provide the foundation for the regenerative method:
(i) the pairs {(Y_k(f), τ_k) : k ≥ 1} are iid;
(ii) if E{|f(Y)|} < ∞, then α(f) = E{Y_1(f)}/E{τ_1}.
The regenerative method can be developed on either the intrinsic time scale (t) or on the
random time scale (n) corresponding to the number of regenerative cycles simulated. On
the intrinsic time scale our point estimate for α is given by

α(t, f) = (1/t)·∫_0^t f(Y(s)) ds,
where t is the length of time the simulation is run. On the random time scale our point
estimate is given by

α_n(f) = Ȳ_n(f)/τ̄_n,

where Ȳ_n(f) (respectively, τ̄_n) is the sample mean of Y_1(f), ..., Y_n(f) (τ_1, ..., τ_n). Here the
Y process is simulated to the completion of n regenerative cycles. Using the basic facts
(i) and (ii) above, it can be shown that both α(t, f) and α_n(f) are strongly consistent for
α(f) as t and n respectively tend to infinity. Next we define Z_k ≡ Y_k(f) − α(f)·τ_k and
assume that var{Z_k} ≡ σ² < ∞. Then it can be shown that the following two CLT's hold
as t → ∞ and n → ∞:

t^{1/2}·[α(t, f) − α(f)] ⇒ (σ/E^{1/2}{τ_1})·N(0, 1)

and

n^{1/2}·[α_n(f) − α(f)] ⇒ (σ/E{τ_1})·N(0, 1).

These two CLT's can then be used to construct confidence intervals for α(f) provided both
σ² and E{τ_1} can be estimated. The mean E{τ_1} is easily estimated by τ̄_n, and σ² can be
estimated from its definition in terms of Y_1(f) and τ_1.
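As an illustration of the regenerative method (ours, for a deliberately simple hypothetical model), take Y to be a two-state CTMC on {0, 1} with rates a (0 → 1) and b (1 → 0), let f be the indicator of state 1, and let entries to state 0 be the regeneration points; then α(f) = a/(a + b), and the random-time-scale CLT gives the interval computed in this free-form Fortran sketch:

! Regenerative estimation of alpha = P{Y = 1} = a/(a+b).
! Each cycle: Exp(a) sojourn in 0, then Exp(b) sojourn in 1.
program regenerative
  implicit none
  integer, parameter :: n = 100000        ! regenerative cycles
  real, parameter :: a = 1.0, b = 3.0     ! rates 0->1 and 1->0
  integer :: k
  real :: u, t0, t1, sy, st, syy, stt, syt
  real :: ybar, taubar, alpha, vy, vt, cyt, s2, half
  sy = 0.0; st = 0.0; syy = 0.0; stt = 0.0; syt = 0.0
  do k = 1, n
     call random_number(u); t0 = -log(1.0 - u)/a   ! sojourn in state 0
     call random_number(u); t1 = -log(1.0 - u)/b   ! sojourn in state 1
     sy = sy + t1                   ! Y_k(f): time in state 1 this cycle
     st = st + t0 + t1              ! tau_k: cycle length
     syy = syy + t1*t1; stt = stt + (t0 + t1)**2; syt = syt + t1*(t0 + t1)
  end do
  ybar = sy/n; taubar = st/n
  alpha = ybar/taubar               ! alpha_n(f); exact value is a/(a+b)
  vy  = syy/n - ybar*ybar
  vt  = stt/n - taubar*taubar
  cyt = syt/n - ybar*taubar
  s2  = vy - 2.0*alpha*cyt + alpha*alpha*vt   ! estimates var{Z_k}
  half = 1.96*sqrt(s2/n)/taubar     ! from the random-time-scale CLT
  print *, 'alpha =', alpha, '  exact =', a/(a + b), '  +/-', half
end program regenerative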
Next we turn to a discussion of the principal method available for canceling out σ.
This is the method of standardized time series developed by SCHRUBEN (1983). Our
discussion is based on the paper GLYNN and IGLEHART (1989) and uses some results
from weak convergence theory; see BILLINGSLEY (1968) for background on this theory.
From our output process Y we form the random elements of C[0, 1], the space of real-valued
continuous functions on the interval [0, 1], given by

Ȳ_n(t) = (1/n)·∫_0^{nt} Y(s) ds

and

X_n(t) = n^{1/2}·(Ȳ_n(t) − α·t),

where 0 ≤ t ≤ 1 and n ≥ 1. Now we make the basic assumption that a finite, positive
constant σ exists such that

X_n ⇒ σ·B as n → ∞,    (4)
where B is standard Brownian motion. This assumption holds for a wide class of output
processes. To find the scaling process {Z(t) : t ≥ 0} consider the class M of functions
g : C[0, 1] → R such that
(i) g(ax) = a·g(x) for all a > 0 and x ∈ C[0, 1];
(ii) g(B) > 0 with probability one;
(iii) g(x + βk) = g(x) for all real β and x ∈ C[0, 1], where k(t) ≡ t;
(iv) P{B ∈ D(g)} = 0, where D(g) is the set of discontinuities of g.
The process

S_n(t) ≡ Ȳ_n(t) − α·t,  0 ≤ t ≤ 1,

is called a standardized time series. Using weak convergence arguments it is easy to show
from (4) that

S_n(1)/g(Ȳ_n) = n^{1/2}·S_n(1)/g(X_n) ⇒ B(1)/g(B)    (5)

as n → ∞. Unfolding this CLT we have the following 100(1 − δ)% confidence interval for
α:

[Ȳ_n(1) − z(1 − δ/2)·g(Ȳ_n), Ȳ_n(1) − z(δ/2)·g(Ȳ_n)],

where P{B(1)/g(B) ≤ z(a)} = a for 0 < a < 1. Thus each g ∈ M gives rise to a
confidence interval for α provided we can find the distribution of B(1)/g(B). Fortunately,
this can be done for a number of interesting g functions.
One of the g functions leads to the batch means method, perhaps the most popular
method for steady-state simulation. We conclude our discussion of the method of stan-
dardized time series by displaying this special g function. To this end we first define the
Brownian bridge mapping Γ : C[0, 1] → C[0, 1] as

(Γx)(t) = x(t) − t·x(1),  x ∈ C[0, 1], 0 ≤ t ≤ 1.

Now think of partitioning our original output process Y into m ≥ 2 intervals of equal
length and define the mapping b_m : C[0, 1] → R by

b_m(x) = [(m/(m − 1))·Σ_{i=1}^{m} (x(i/m) − x((i − 1)/m))²]^{1/2}
for x ∈ C[0, 1]. Finally, the g function of interest is g_m = b_m ∘ Γ. To see that g_m corresponds
to the batch means method we observe that

g_m(Ȳ_n) = m^{−1/2}·[(1/(m − 1))·Σ_{i=1}^{m} (Z̄_i(n) − (1/m)·Σ_{j=1}^{m} Z̄_j(n))²]^{1/2},

where

Z̄_i(n) = [∫_{(i−1)n/m}^{in/m} Y(s) ds]/(n/m)

is the ith batch mean of the process {Y(t) : 0 ≤ t ≤ n}. Specializing (5) to the function
g_m we see that

[(1/m)·Σ_{i=1}^{m} Z̄_i(n) − α]/g_m(Ȳ_n) ⇒ t_{m−1}

as n → ∞, where t_{m−1} is a Student's-t random variable with m − 1 degrees of freedom.
This follows from the fact that B(1)/g_m(B) is distributed as t_{m−1} since B has independent
normal increments. For other examples of functions g ∈ M for which the distribution of
B(1)/g(B) is known see GLYNN and IGLEHART (1989).
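A sketch of the resulting batch means interval (ours; the discrete-time AR(1)-type sequence below is a hypothetical stand-in for the continuous-time output process, and its steady-state mean is α = 0):

! Batch means confidence interval with m batches: the batch-mean
! t statistic has approximately a t distribution with m-1 df.
program batch_means
  implicit none
  integer, parameter :: m = 10, blen = 10000   ! m batches of length blen
  real, parameter :: t9 = 2.262                ! 97.5% point of t, 9 df
  integer :: i, j
  real :: u, y, z(m), grand, s2, half
  y = 0.0
  do i = 1, m
     z(i) = 0.0
     do j = 1, blen
        call random_number(u)
        y = 0.9*y + (u - 0.5)     ! hypothetical autocorrelated output
        z(i) = z(i) + y
     end do
     z(i) = z(i)/blen             ! i-th batch mean Zbar_i(n)
  end do
  grand = sum(z)/m                          ! grand mean Ybar_n(1)
  s2 = sum((z - grand)**2)/(m - 1)          ! variance of batch means
  half = t9*sqrt(s2/m)    ! (grand - alpha)*sqrt(m)/s is approx t_{m-1}
  print *, 'estimate =', grand, '  95% CI half-width =', half
end program batch_means

Note that only the m batch means enter the interval; the autocorrelation within batches is absorbed by batching.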

4. Variance Reduction Techniques.

Once a basic method is developed to produce point estimates and confidence inter-
vals for a parameter of interest, we turn our attention to making these methods more
efficient. Over the years a dozen or more techniques have been proposed to improve sim-
ulation efficiency. Good references for many of these techniques are BRATLEY, FOX,
and SCHRAGE (1987) and WILSON (1984). Here we have elected to outline three of these
techniques.

As we have seen in Sections 2 and 3, confidence intervals for parameters being es-
timated are generally constructed from an associated CLT. Each CLT has an intrinsic
variance constant, say, σ². The idea for many variance reduction techniques (VRT's) is to
modify the original estimate in such a way as to yield a new CLT with a variance constant
σ̃² < σ². This will, of course, lead to confidence intervals of shorter length, or alterna-
tively, confidence intervals of the same length from a shorter simulation run. Frequently
VRT's are based on some analytic knowledge or structural properties of the process being
simulated.
The first VRT we discuss is known as importance sampling. This idea was first
developed in conjunction with the estimation of α ≡ E{h(X)}, where h is a known real-
valued function and X a random variable with density, say, f. Instead of sampling X from
f, we sample X from a density g which has been selected to be large in the regions that
are "most important", namely, where |h·f| is largest. Then we estimate α by the sample
mean of h(X)f(X)/g(X); see HAMMERSLEY and HANDSCOMB (1964).
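For a concrete instance (ours, not an example taken from HAMMERSLEY and HANDSCOMB), take h(x) = 1{x > c} and let f be the exponential(1) density, so that α = P{X > c} = e^{−c}; sampling instead from the exponential(θ) density g with θ < 1 places most of the sampled mass beyond c:

! Importance sampling for alpha = P{X > c}, X ~ exponential(1).
! Sample from g = exponential(theta), weight by h(x) f(x)/g(x).
program importance
  implicit none
  integer, parameter :: n = 100000
  real, parameter :: c = 10.0, theta = 0.1   ! tail level and tilt
  integer :: i
  real :: u, x, w, s, s2, est, half
  s = 0.0; s2 = 0.0
  do i = 1, n
     call random_number(u)
     x = -log(1.0 - u)/theta                      ! draw X from g
     w = 0.0
     if (x > c) w = exp(-(1.0 - theta)*x)/theta   ! h(x)*f(x)/g(x)
     s = s + w; s2 = s2 + w*w
  end do
  est = s/n                                       ! estimates P{X > c}
  half = 1.96*sqrt((s2/n - est*est)/n)
  print *, 'estimate =', est, '  exact =', exp(-c), '  +/-', half
end program importance

With c = 10, crude Monte Carlo would need on the order of e^{10} ≈ 22,000 replications per observed success; the weighted estimator needs far fewer.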
This same basic idea can be carried forward to the estimation of parameters associated
with stochastic processes. We generate the process with a new probabilistic structure and
estimate a modified parameter to produce an estimate of the original quantity of interest.
The example we consider here is the M/M/1 queue with arrival rate λ, service rate μ,
and traffic intensity ρ = λ/μ < 1. Let V denote the stationary virtual waiting time and
consider estimating the quantity α ≡ P{V > u} for large u. When ρ is less than one, the
virtual waiting time process has a negative drift and an impenetrable barrier at zero. Thus
the chance of the process getting above a large u is small, and a long simulation would be
required to accurately estimate α. The idea used here in importance sampling is to generate
a so-called conjugate process obtained by reversing the roles of λ and μ. For the conjugate
process the traffic intensity is greater than one, and the estimation problem becomes much
easier. ASMUSSEN (1985) reports efficiency increases on the order of a factor of 3 to a
factor of 400 over straight regenerative simulation, depending on the values of ρ and u. In
general, importance sampling can yield very significant variance reductions. Further work
along these lines can be found in SIEGMUND (1976), GLYNN and IGLEHART (1989),
SHAHABUDDIN et al. (1988), and WALRAND (1987).
The second VRT we discuss is known as indirect estimation. Assume we are interested
in estimating α ≡ E{X}, but happen to know that E{Y} = a·E{X} + b, where a and b are
known. Sometimes it happens that a CLT associated with the estimation of E{Y} will have
a smaller variance constant associated with it than does the CLT for estimating E{X}. In
this case we would prefer to estimate E{Y} and then use the affine transformation above to
yield an estimate for E{X}. This idea has proved to be useful in queueing simulations where
the affine transformation is a result of Little's Law. In general, variance reductions realized
using this method are not dramatic, being usually less than a factor of 2. For further
results along these lines, see LAW (1975) and GLYNN and WHITT (1989). While the
affine transformation works in queueing theory, it is conceivable that other transformations
might arise in different contexts.
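A standard instance (recalled here for concreteness, in our notation) is the estimation of the mean stationary waiting time E{W} in a queue with arrival rate λ. Little's Law gives E{L} = λ·E{W} for the number-in-system process L, so one may form the time average

L̂(t) = (1/t)·∫_0^t L(s) ds

and report L̂(t)/λ as the indirect estimate of E{W}, choosing whichever of the two associated CLT's has the smaller variance constant.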
The third and final VRT we discuss here is known as discrete time conversion. Suppose
that X = {X(t) : t ≥ 0} is an irreducible, positive recurrent, continuous time Markov
chain (CTMC). Then X(t) ⇒ X as t → ∞, and we may be interested in estimating
α = E{f(X)}, where f is a given real-valued function. As we have discussed above, the
regenerative method can be used to estimate α. A CTMC has two sources of randomness:
the embedded discrete time jump chain and the exponential holding times in the successive
states visited. The discrete time conversion method eliminates the randomness due to the
holding times by replacing them by their expected values. It has been shown that this
leads to a variance reduction when estimating α. Also, as an added side benefit, computer
time is saved since the exponential holding times no longer need to be generated. Gains in
efficiency for this method can be substantial. Further discussion of this idea can be found
in HORDIJK, IGLEHART, and SCHASSBERGER (1976), and FOX and GLYNN (1986).
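In formulas (our rendering, using standard CTMC notation): if q(x) denotes the total jump rate out of state x, so that the mean holding time in x is 1/q(x), then the discrete time conversion version of the regenerative ratio estimator weights each state visited by the embedded jump chain with its expected holding time,

α̂ = [Σ_k f(X_k)/q(X_k)] / [Σ_k 1/q(X_k)],

the sums running over the jumps of the embedded chain during the simulated regenerative cycles; no exponential variates need be generated.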

5. System Optimization Using Simulation.

Consider a family of stochastic systems indexed by a parameter θ (perhaps vector-
valued). Suppose α(θ) is our performance criterion for system θ. Our concern here is to find
that system, say θ₀, which optimizes the value of α. For a complex system it is frequently
impossible to evaluate α analytically. Simulation may be the most attractive alternative.
We could naively simulate the systems at a sequence of parameter settings θ_1, θ_2, ..., θ_k
and select the setting that optimizes α(θ_i). In general this would not be very efficient, since
k would have to be quite large. A better way would be to estimate the gradient of α and
use this estimate to establish a search direction. Then stochastic approximation and ideas
from non-linear programming could be used to optimize α.

Two general methods have been proposed to estimate gradients: the likelihood ratio
method and the infinitesimal perturbation method. We will discuss both methods briefly.

Suppose X = {X_n : n ≥ 0} is a discrete time Markov chain (DTMC) and that the cost of
running system θ for r + 1 steps is g(θ, X_0, ..., X_r). The expected cost of running system
θ is then given by

α(θ) = E_θ{g(θ, X_0, ..., X_r)},

where E_θ is expectation relative to the probability measure P(θ) associated with system θ.
If E_θ{·} were independent of θ, we would simply simulate iid replicates of
g(θ, X_0, ..., X_r). By introducing the likelihood function L(θ, X_0, ..., X_r) it is possi-
ble to write α(θ) as

α(θ) = E_{θ₀}{g(θ, X_0, ..., X_r)·L(θ, X_0, ..., X_r)}

for a fixed value of θ₀. Then we can write

∇α(θ) = E_{θ₀}{∇[g(θ, X_0, ..., X_r)·L(θ, X_0, ..., X_r)]},

where the interchange of ∇ and E_{θ₀} must be justified. A similar approach can be developed
to estimate the gradient of a performance criterion for a steady-state simulation. For an
overview of this approach see GLYNN (1987), and REIMAN and WEISS (1986).
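For a DTMC with a one-step transition density p_θ(x, y), this takes the familiar score-function form (a standard expansion, written in our notation rather than quoted from the references): the likelihood ratio is

L(θ, X_0, ..., X_r) = Π_{k=1}^{r} p_θ(X_{k−1}, X_k)/p_{θ₀}(X_{k−1}, X_k),

and differentiating at θ = θ₀, where L ≡ 1, gives the simulatable estimator

∇α(θ)|_{θ₀} = E_{θ₀}{∇_θ g(θ, X_0, ..., X_r)|_{θ₀} + g(θ₀, X_0, ..., X_r)·Σ_{k=1}^{r} ∇_θ log p_θ(X_{k−1}, X_k)|_{θ₀}}.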
The second method which has been proposed for estimating gradients is called the
infinitesimal perturbation analysis (IPA) method. In this method a derivative, with respect
to an input parameter, of a simulation sample path is computed. For example, we might
be interested in estimating the mean stationary waiting time for a queueing system as well
as its derivative with respect to the mean service time. Since we are taking a derivative
of the sample path inside an expectation operator, the interchange of expectation and
differentiation must be justified in order to produce an estimate for the gradient ∇α(θ),
say. The IPA method assumes that if the change in the input parameter θ is small
enough, then the times at which events occur get shifted slightly, but their order does
not change. It has been shown that the IPA method yields strongly consistent estimates
for the performance gradient in a variety of queueing contexts; see HEIDELBERGER,
CAO, ZAZANIS, and SURI (1988) for details on the IPA method and a listing of queueing
problems for which the technique works.
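One classical instance (recalled in our notation, not reproduced from the references): for a FIFO single-server queue the waiting times obey Lindley's recursion W_{n+1} = max(W_n + S_n − A_n, 0), and if the service times are scaled by the parameter, S_n = θ·S̃_n, differentiating along the sample path gives the IPA recursion

dW_{n+1}/dθ = dW_n/dθ + S̃_n if W_n + S_n − A_n > 0, and dW_{n+1}/dθ = 0 otherwise,

whose time average estimates the derivative of the mean waiting time with respect to θ.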

REFERENCES

ASMUSSEN, S. (1985). Conjugate processes and the simulation of ruin problems. Stoch.
Proc. Appl. 20, 213-229.

BILLINGSLEY, P. (1968). Convergence of Probability Measures. John Wiley and Sons,
New York.

BRATLEY, P., FOX, B., and SCHRAGE, L. (1987). A Guide to Simulation. 2nd Ed.
Springer-Verlag, New York.

FOX, B. and GLYNN, P. (1986). Discrete-time conversion for simulating semi-Markov
processes. Operations Research Letters 5, 191-196.

GLYNN, P. and WHITT, W. (1989). Indirect estimation via L = λW. Operations
Research 37, 82-103.

GLYNN, P. (1987). Likelihood ratio gradient estimation: an overview. Proceedings of
the 1987 Winter Simulation Conference, 368-375.

GLYNN, P. and HEIDELBERGER, P. (1987). Bias properties of budget constrained
Monte Carlo simulations, I: estimating a mean. Technical Report, Department of
Operations Research, Stanford University.

GLYNN, P. and IGLEHART, D. (1989). Simulation output analysis using standardized
time series. To appear in Math. of Operations Res.

GLYNN, P. and IGLEHART, D. (1989). Importance sampling for stochastic simulation.
To appear in Management Sci.

HAMMERSLEY, J. and HANDSCOMB, D. (1964). Monte Carlo Methods. Methuen,
London.

HEIDELBERGER, P. (1987). Discrete event simulations and parallel processing: statisti-
cal properties. IBM Research Report RC 12733, Yorktown Heights, New York.

HEIDELBERGER, P., CAO, X-R., ZAZANIS, M. and SURI, R. (1988). Convergence
properties of infinitesimal perturbation analysis estimates. Management Sci. 34,
1281-1302.

HORDIJK, A., IGLEHART, D. and SCHASSBERGER, R. (1976). Discrete-time methods
for simulating continuous time Markov chains. Adv. Appl. Prob. 8, 772-788.

IGLEHART, D. (1978). The regenerative method for simulation analysis. In Current
Trends in Programming Methodology - Software Modeling (K. M. Chandy and R. T.
Yeh, editors). Prentice-Hall, Englewood Cliffs, NJ, 52-71.

LAW, A. (1975). Efficient estimators for simulated queueing systems. Management Sci.
22, 30-41.

REIMAN, M. and WEISS, A. (1986). Sensitivity analysis via likelihood ratios. Proceed-
ings of the 1986 Winter Simulation Conference, 285-289.

SHAHABUDDIN, P., NICOLA, V., HEIDELBERGER, P., GOYAL, A., and GLYNN, P.
(1988). Variance reduction in mean time to failure simulations. Proceedings of the
1988 Winter Simulation Conference, 491-499.

SIEGMUND, D. (1976). Importance sampling in the Monte Carlo study of sequential
tests. Ann. Statist. 4, 673-684.

WALRAND, J. (1987). Quick simulation of rare events in queueing networks. Proceedings
of the Second International Workshop on Applied Mathematics and Per-
formance/Reliability Models of Computer/Communication Systems (G.
Iazeolla, P. J. Courtois, and O. J. Boxma, eds.). North Holland Publishing Co.,
Amsterdam, 275-286.

WILSON, J. (1984). Variance reduction techniques for digital simulation. Amer. J.
Math. Management Sci. 4, 277-312.

Bayesian Inference for
Weibull Quantiles

Mark G. Vangel

U.S. Army Materials Technology Laboratory


Watertown MA 02172-0001

The posterior distribution of a two-parameter Weibull
quantile for a noninformative prior may be obtained exactly
(Bogdanoff and Pierce, 1973), although the necessary
numerical integration detracts from the usefulness of this
result. Credible intervals for this posterior have an
alternative frequentist interpretation in terms of
conditional tolerance limits (Lawless, 1975).

An approximation to the Lawless procedure was proposed by
DiCiccio (1987). This approximation does not involve
numerical integration and is of order O(n^{-3/2}); apparently
it is adequate even for samples as small as ten.

The focus of this paper is on the use of DiCiccio's
result for the routine calculation of Weibull quantile
posteriors. Even a non-Bayesian may find the posterior cdf's
useful since they provide an easy graphical means for
obtaining accurate tolerance limits.

Examples from strength data for composite materials are
presented and a specific application of importance to
aircraft design is discussed.

References

1. Bogdanoff, D.A. and Pierce, D.A. (1973). JASA, 68, 659.

2. DiCiccio, T.J. (1987). Technometrics, 29, 33.

3. Lawless, J.F. (1975). Technometrics, 17, 255.
The Weibull model is widely used to represent failure data in engin-
eering applications. One reason is because the Weibull distribution is the
limiting distribution of the suitably normalized minimum of a sample of
positive iid random variables under quite general conditions (Barlow and
Proschan, 1975, ch. 6). The model is therefore appropriate for the strength
of a system composed of a string of many links, where the strengths of the
links are iid and the system fails when the weakest link fails (Bury, 1975,
ch. 16). An example of a physical system which can be modeled in this way
is the strength of a brittle fiber in tension. Another reason why the
Weibull model is used is that the distribution is very flexible and con-
sequently it often fits data well.

Inference for the Weibull distribution (or, equivalently, for the
extreme value distribution, which is the distribution of the logarithm of a
Weibull random variable) is complicated by the fact that the Weibull is not
in the exponential family, and consequently the minimal sufficient sta-
tistics are the entire sample. Also, although MLE's are easily obtained
iteratively, the distributions of the MLE's and pivotals based on the MLE's
cannot be obtained in closed form. The same is true of linear estimators
of the Weibull parameters.

At least three approaches to Weibull inference have been taken. The
first is to tabulate approximate quantiles of the pivotals obtained by Monte
Carlo. From these tables one can obtain confidence intervals on parameters
as well as confidence intervals on quantiles (tolerance limits) for complete
samples (Thoman, Bain and Antle, 1969, 1970). A problem arises for incom-
plete samples, since tables must be prepared by simulation for each censor-
ing configuration. The tables which have been prepared (Billman, Antle and
Bain, 1972) are inadequate. A second approach is to approximate the dis-
tribution of the pivotals (e.g. Lawless and Mann, 1976). These approxima-
tions are empirical and consequently they are not very satisfactory from a
theoretical point of view.
Finally we reach the third approach, which is the focus of this paper.
For any location-scale family (e.g. the extreme value family) and any equiv-
ariant estimators of the parameters (e.g. MLE's) the distribution of certain
pivotals can be obtained exactly if one conditions on the ancillary sta-
tistics. From these pivotals one can get exact conditional confidence
bounds and tolerance limits for any sample size. The method is applicable
to both complete and Type II censored samples (i.e., samples for which only
the r smallest order statistics are observed) and requires no tables.
Since the intervals have exact conditional confidence, it follows that they
are also exact unconditionally. In addition, this method has the advantage
of making use of all of the information with respect to the parameters which
is in the data (the parameter estimates are in general not sufficient
statistics), though for the Weibull model this does not appear to be a
practical concern (Lawless, 1973). This conditional approach is apparently
due to Lawless, who introduced it in (Lawless, 1972). An exposition of the
procedure appears in (Lawless, 1982), which is also useful as a guide to the
literature.

If one chooses an appropriate noninformative prior distribution for the
parameters of a location-scale family, then the posterior distributions
either of the parameters or of a quantile conditional on the ancillaries are
formally identical to frequentist confidence and one-sided tolerance limits
respectively.

Bayesian and frequentist terminology may thus be interchanged freely
and I will do so in this paper. This is particularly valuable when
discussing tolerance limits, which have a frequentist interpretation which
is difficult for nonstatisticians to understand. A posterior cdf of a
quantile, however, is immediately understood intuitively. Such a cdf can be
used to obtain graphically arbitrary one-sided and approximate two-sided
conditional tolerance limits, since for the cases discussed herein these
intervals coincide with noninformative prior Bayesian credible intervals.

The main disadvantage of this conditional approach is that it is com-
putationally intensive. Many numerical integrations must be performed for
each dataset as one iteratively approximates the confidence limit.
One goal of this project has been to implement the Lawless procedure
for the extreme value distribution in a 'robust' FORTRAN program which can
be used with little user interaction. Another goal has been to investigate
a recent approximation to the conditional procedure (DiCiccio, 1987) which
is accurate to O_p(n^{-3/2}). This approximation makes the calculation of
posterior distributions feasible. A FORTRAN program to calculate and plot
the posterior distribution of Weibull quantiles which makes use of the
DiCiccio result is discussed. The results of a small simulation to assess
the accuracy of the approximation are presented, though little effort was
spent on the simulation since the order of convergence in probability has
been determined.

The cdf of the Weibull distribution is

F(x; α, β) = 1 − exp(−(x/β)^α),

where β is a scale parameter and α a shape parameter. Maximum likelihood
estimation is straightforward. The following equation is solved by Newton-
Raphson for α̂:

(Σ* x_i^{α̂} log x_i)/(Σ* x_i^{α̂}) − 1/α̂ = (1/r)·Σ_{i=1}^{r} log x_i,

where x_1 ≤ x_2 ≤ ... ≤ x_r are the order statistics, n ≥ r is the sample
size, and

Σ* w_i ≡ Σ_{i=1}^{r} w_i + (n − r)·w_r.

A FORTRAN subroutine 'WEIMLE' for performing these calculations is given in
the appendix.
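The iteration is simple enough to sketch (a minimal free-form rendering of the computation just described, ours rather than the paper's WEIMLE listing, which appears OCR-damaged in the appendix; the closing scale formula β̂ = (Σ* x_i^{α̂}/r)^{1/α̂} is the standard companion maximum likelihood estimate):

! Newton-Raphson for the Weibull shape MLE with Type II censoring.
! x(1) <= ... <= x(r) are the observed order statistics out of n.
subroutine weimle(x, n, r, shape, scale)
  implicit none
  integer, intent(in) :: n, r
  real, intent(in) :: x(r)
  real, intent(out) :: shape, scale
  integer :: i, iter
  real :: a, s0, s1, s2, slog, g, dg, w
  slog = 0.0
  do i = 1, r
     slog = slog + log(x(i))
  end do
  slog = slog/r                   ! (1/r) sum of log x_i
  a = 1.0                         ! starting value for the shape
  do iter = 1, 50
     s0 = 0.0; s1 = 0.0; s2 = 0.0
     do i = 1, r
        w = x(i)**a
        if (i == r) w = w*(n - r + 1)   ! censored mass at x(r): Sigma*
        s0 = s0 + w
        s1 = s1 + w*log(x(i))
        s2 = s2 + w*log(x(i))**2
     end do
     g  = s1/s0 - 1.0/a - slog             ! equation to be zeroed
     dg = (s2*s0 - s1*s1)/(s0*s0) + 1.0/(a*a)
     a  = a - g/dg                         ! Newton-Raphson step
     if (abs(g) < 1.0e-6) exit
  end do
  shape = a
  scale = (s0/r)**(1.0/a)                  ! standard MLE for the scale
end subroutine weimle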

2. The Extreme Value Distribution

Let X be distributed Weibull with shape α and scale β. The distribu-
tion of

Y = log(X)

is

H(y; u, b) = G((y − u)/b) = 1 − exp(−exp((y − u)/b)),

where

b = 1/α and u = log β

are scale and location parameters respectively. The location-scale family
H(y; u, b) is called the extreme value distribution. Results for the
extreme value distribution are easily interpreted in terms of the Weibull
distribution, and vice versa.

3. Conditional Inference for Location-Scale Families

The presentation below follows Lawless (1982). The distribution H(y;
u, b) is taken to be the extreme value distribution as in the previous
section. The parameter estimates û and b̂ may be taken to be MLE's, but the
results hold for any equivariant estimators -- that is, any statistics û and
b̂ which satisfy

û(dy_1+c, ..., dy_r+c) = d·û(y_1, ..., y_r) + c

b̂(dy_1+c, ..., dy_r+c) = d·b̂(y_1, ..., y_r)

for any c and any d > 0. The maximum likelihood estimates are readily seen
to be equivariant.
Let the sample size be n and, to allow for Type II censoring, let r ≤ n
be the number of data values. Denote the density of G(·) by g(·); the
density of H(y; u, b) is then

(1/b)·g((y − u)/b).

First we demonstrate that the following random variables are pivotal; that
is, they have probability distributions which do not depend on the
parameters:

Z_1 = (û − u)/b̂,  Z_2 = b̂/b,  Z_3 = (û − u)/b.

Let y_1 ≤ ... ≤ y_r be the order statistics of a random sample from H(y). Consider
the random variables

w_i = (y_i − u)/b.

The w_i are the order statistics of a random sample from G(·) and hence are
obviously pivotal. Since the estimator û is assumed to be equivariant we
have that

û(w_1, ..., w_r) = û((y_1−u)/b, ..., (y_r−u)/b)
               = (1/b)·(û(y_1, ..., y_r) − u) = (û − u)/b = Z_3.

Hence Z_3 is pivotal. Similarly, Z_2 is a pivot since

b̂(w_1, ..., w_r) = b̂((y_1−u)/b, ..., (y_r−u)/b) = b̂(y_1, ..., y_r)/b.

Finally,

Z_1 = (û − u)/b̂ = ((û − u)/b)·(b/b̂) = Z_3/Z_2.

The quantities a_i = (y_i − û)/b̂ are immediately seen to be ancillary since
the a_i are functions of a random sample from G(·), where G(·) is a completely known
distribution. Only r − 2 of these ancillary statistics are independent since

û(a_1, ..., a_r) = (û(y_1, ..., y_r) − û)/b̂ ≡ 0

and

b̂(a_1, ..., a_r) = b̂(y_1, ..., y_r)/b̂ ≡ 1.

The fundamental result upon which conditional inference from a frequen-
tist perspective is based is that the joint pdf of (Z_1, Z_2, a_1, ..., a_{r−2})
is of the form

h(z_1, z_2, a) = k(a, r, n)·z_2^{r−1}·[Π_{i=1}^{r} g(a_i z_2 + z_1 z_2)]·[1 − G(a_r z_2 + z_1 z_2)]^{n−r},

where k(a, r, n) is a function of a_1, a_2, ..., a_{r−2} only. The pdf of (Z_1,
Z_2) given a is of the same form as h above except that the normalizing con-
stant is different.

The proof is straightforward. Begin with the joint pdf of (y_1, ...,
y_r) and make the change of variables

y_i = b̂·a_i + û.

The Jacobian of this transformation is a constant given a. A second change
of variables

û = Z_1·b̂ + u = b·Z_1·Z_2 + u,

b̂ = Z_2·b

gives the desired result.

4. Confidence Intervals for Extreme Value Quantiles

Using the pivotal density derived in the previous section, it is not
difficult to obtain exact confidence intervals on quantiles of the extreme
value (or equivalently, the Weibull) distribution. Toward this end, we
determine the distribution of the scale parameter pivotal Z_2. This result
is of interest in its own right since it leads to confidence intervals on
the extreme value scale (or Weibull shape) parameter. To get the density of
Z_2, merely integrate out Z_1 from the joint pdf given in the previous
section, giving

h(z_2 | a) = k(a)·exp(z_2·Σ a_i)·z_2^{r−2} / [Σ* exp(a_i z_2)]^r.

Next, make the change of variables

z_p = z_1 − w_p/z_2 and z_2 = z_2,

where

w_p = ln(−ln(1 − p)).

The joint density of Z_p and Z_2 is

f(z_p, z_2 | a) = k(a)·z_2^{r−1}·exp(Σ_{i=1}^{r} (a_i z_2 + z_p z_2 + w_p))·exp(−Σ* exp(a_i z_2 + z_p z_2 + w_p)).

The cdf of Z_p can be expressed in terms of the density of Z_p and Z_2 as

P(Z_p ≤ t) = ∫_0^∞ dz_2 ∫_{−∞}^{t} dz_p f(z_p, z_2 | a).

Change variables again, this time letting

y = exp(z_p z_2 + w_p)·Σ* exp(a_i z_2) and z = z_2.

The double integral can now be written as a single integral by recognizing
that the integral over y is the incomplete gamma function:

P(Z_p ≤ t) = ∫_0^∞ dz_2 h(z_2 | a)·I(exp(t z_2 + w_p)·Σ* exp(a_i z_2), r),

where

I(θ, r) = (1/Γ(r))·∫_0^θ x^{r−1} exp(−x) dx.

Since the pth quantile of an extreme value random variable is

x_p = u + w_p·b,

the pivotal Z_p can be expressed as

Z_p = (û − x_p)/b̂.

The probability distribution of Z_p can therefore be used to obtain
exact conditional confidence intervals on extreme value quantiles. One
first obtains the constant of integration k(a) numerically. Next,
P(Z_p ≤ t) is evaluated numerically for several choices of t until the
quantile of the distribution of Z_p is determined to the desired accuracy.
Finally, the confidence bound on x_p is trivially obtained by pivoting.
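Since r is a positive integer here, the incomplete gamma function needed above can also be evaluated in closed form (a standard identity, noted here for implementers; it is not part of the original derivation):

I(θ, r) = 1 − e^{−θ}·Σ_{j=0}^{r−1} θ^j/j!,

which avoids quadrature in the innermost loop of the computation.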

5. Bayesian Interpretation

Independently of Lawless, Bogdanoff and Pierce (1973) arrived at
results identical to those outlined above from a Bayesian point of view.
Bayesian results are much easier to explain to nonstatisticians. This is
particularly true for the problem that I am primarily interested in, con-
fidence intervals on a quantile, and the advantages of the Bayesian motivation
for a particular application will be discussed in a later section.
Let (y_i) be the order statistics of a Type II censored sample of size r ≤ n
from an extreme value distribution. The usual joint noninformative prior
for the location parameter (u) and scale parameter (b) of a location-scale
family is:

π(u) ∝ constant,  π(log(b)) ∝ constant,

so that

π(u, b) ∝ 1/b.

Using the expression for the extreme value pdf given in a previous
section, the corresponding posterior distribution is seen to be

π(u, b | y) ∝ b^{−(r+1)}·exp(Σ (y_i − u)/b)·exp(−Σ* exp((y_i − u)/b)).

The location parameter is readily integrated out, giving

π(b | y) ∝ (1/b)^r·exp(Σ y_i/b) / [Σ* exp(y_i/b)]^r.

The normalizing constant is determined by numerical integration. Inference
based on this result will be shown in the next section to be formally equiv-
alent to Weibull inference using the pivotal for the shape parameter.

Let ψ(u, b) be any scalar function of the parameters about which in-
ference is to be made. Assume that ψ(u, b) is monotonically increasing in u
for fixed b. If a function can be found which satisfies this condition
piecewise, then the following results may still be applied to each monotonic
section of the function. Some useful choices for ψ are

ψ(u, b) = u (location parameter)

ψ(u, b) = u + log(−log(1 − p))·b (pth quantile)
ψ(u, b) = exp(−exp((t − u)/b)) (reliability at time t)

ψ(u, b) = u − γ·b (mean; γ is Euler's constant)

Define the inverse function η(s, b) by means of the relation

ψ[η(s, b), b] = s.

The posterior cdf of ψ can be expressed as

P(ψ(u, b) ≤ s | y) = ∫_0^∞ P(u ≤ η(s, b) | b, y)·π(b | y) db.

It is easy to show that the conditional distribution of λ = exp(−u/b) given
b is the following gamma distribution:

π(λ | b, y) = (1/Γ(r))·[Σ* exp(y_i/b)]^r·λ^{r−1}·exp(−λ·Σ* exp(y_i/b)).

Simple algebra also shows that

P(u ≤ η(s, b) | b, y) = P(λ ≥ exp(−η(s, b)/b) | b, y).

Combining these results, we have finally that

P(ψ(u, b) ≤ s | y) = ∫_0^∞ [1 − I(exp(−η(s, b)/b)·Σ* exp(y_i/b), r)]·π(b | y) db,

where I(θ, r) denotes the incomplete gamma function

I(θ, r) = (1/Γ(r))·∫_0^θ x^{r−1} exp(−x) dx.

For a confidence bound on the pth quantile x_p,

η(s, b) = s − w_p·b,

and

P(x_p ≤ s | y) = ∫_0^∞ [1 − I(exp(w_p − s/b)·Σ* exp(y_i/b), r)]·π(b | y) db.

The fact that for inference about quantiles and about the shape (or, in
terms of the extreme value distribution, scale) parameter the Bayesian
approach is equivalent to the Lawless conditional approach will be demonstr-
ated next.

6. Formal Equivalence of Bayesian and Frequentist Results

First we demonstrate that posterior intervals for the scale parameter
(b) have an exact frequency interpretation. Let b_1 be such that the post-
erior probability that b is greater than b_1 is γ. Since

P(b ≥ b_1) = P(b̂/b ≤ b̂/b_1),

we make the change of variable

z = b̂/b

and substitute for the y_i in terms of the a_i to get

γ = ∫_{b_1}^∞ π(b | y) db ∝ ∫_{b_1}^∞ exp(Σ y_i/b)·b^{−r} / [Σ* exp(y_i/b)]^r db

  ∝ ∫_0^{b̂/b_1} exp(z·Σ a_i) / [Σ* exp(a_i z)]^r · z^{r−2} dz

  = ∫_0^{b̂/b_1} h(z | a) dz.

To see the equivalence of the results for a quantile, we make the sub-
stitution

t = (û − s)/b̂

and note that

exp(w_p − s/b)·Σ* exp(y_i/b) = exp(w_p + t·b̂/b − û/b)·Σ* exp(a_i·b̂/b + û/b)

                             = exp(w_p + t·b̂/b)·Σ* exp(a_i·b̂/b).

The change of variable z = b̂/b gives the desired result.

7. The Log Generalized Gamma Distribution

The probability density function of a generalized gamma random variable T
is

f_T(x; α, β, k) = (α/Γ(k))·(x^{αk−1}/β^{αk})·exp(−(x/β)^α).

Details on inference for this family may be found in (Farewell and Prentice,
1977) and (Lawless, 1980). Note that the case k = 1 corresponds to the
Weibull distribution.

If T has a generalized gamma distribution, then Y = log(T) can be
written in the form Y = μ + σW, where

μ = log(β) + log(k)/α,

σ = 1/(α·k^{1/2}),

and W has the probability density

f_W(w; k) = (k^{k−1/2}/Γ(k))·exp(k^{1/2}·w − k·exp(w/k^{1/2})).

Y is said to have a log generalized gamma distribution.

By varying k, one obtains a family of location-scale distributions
ranging from the normal (k = ∞) to the extreme value (k = 1). Although we
will restrict attention to the case k = 1, it is straightforward to adapt
both the frequentist and the Bayesian approaches to arbitrary fixed k and
even to certain regression situations (Jones et al., 1985).

8. Approximate Inference for the Log Generalized Gamma Distribution

Let μ̃ and σ̃ denote maximum likelihood estimates of μ and σ subject to
the constraint

y_p⁰ = μ̃ + w_p·σ̃,

that is, the MLE of the pth quantile is required to equal y_p⁰. If μ̂ and σ̂
denote the unconstrained MLE's, and if L(μ, σ) denotes the log of the log
generalized gamma likelihood, then the asymptotic distribution of the sta-
tistic

V(y_p⁰) = −2[L(μ̃, σ̃) − L(μ̂, σ̂)]

is χ² with one degree of freedom. Lawless (1982, sec. 4.2) suggests that
inference based on V(y_p⁰) is acceptable for moderate to large samples but
that the approximation may be inadequate for small samples.

DiCiccio (1987) has applied general techniques of Barndorff-Nielsen
(1986) in order to develop a computationally inexpensive modification to the
signed square root of V(y_p⁰) which yields a likelihood ratio based approxim-
ation suitable even for quite small sample sizes. The numerical integration
required by the exact methods is only troublesome for moderate to large
samples; so the approximation is actually of questionable use over the range
of sample sizes for which it is inaccurate.

I will not reproduce the details of the DiCiccio approximation here, for
two reasons. The most important of these is that only the results are
presented in (DiCiccio, 1987), and to repeat these results without having
studied their derivation would serve no purpose. A second reason is that
although the approximation is inexpensive to compute, the formulas are
messy, and to reproduce them here is to invite typographical errors. Inter-
ested readers should refer to (DiCiccio, 1987) and to the FORTRAN implementa-
tion as subroutine LAWAPX in the appendix.

9. The Accuracy of the Approximation

The DiCiccio approximation can be shown to be accurate to O_p(n^{-3/2})
(DiCiccio, 1987, p. 37), so an extensive simulation study of accuracy is un-
necessary. The results of a very small such study are presented in Table 1.
Samples of sizes ranging from 10 to 30 were taken from Weibull populations
with different shape parameters. Both the Lawless and the DiCiccio methods
were used to calculate 95 percent lower confidence limits on the tenth per-
centiles of the Weibull populations, and the mean and standard deviation of
the percent difference between the Lawless result and the DiCiccio approxima-
tion were calculated for 100 replicates for each case. One would expect
that the approximation error should be a rapidly decreasing function of n,
and this is observed to be the case. The quality of the approximation is
also seen to be a function of the shape parameter of the population.
Halving the shape parameter (from 10 to 5) approximately doubles the mean
percentage error uniformly over sample sizes. Also, the approximation error
actually appears to be a function of the number of uncensored values rather
than the overall sample size, which is not surprising. Overall, the
DiCiccio result appears to be satisfactory for samples of 10 or more un-
censored values, and remarkably good for samples of 30 or more observed
values. This conclusion is based partly on the small simulation presented
here and partly on experimenting with various cases of real and simulated
data.

10. An Application: Composite Material Basis Values

A criterion used both by aircraft designers when choosing a material
for a specific application and by the Federal Aviation Administration when
certifying a new material for a structural aircraft application is the
material basis value. A 'B-basis value' is defined to be a lower 95 percent
confidence limit on the tenth percentile of the strength distribution of a
material, and an 'A-basis value' is a 95 percent lower confidence limit on
the first percentile. The reason for these tolerance limits, which have
been used in the industry for decades, is that a designer is primarily
interested in the lower tail of the strength distribution. In order to
design a reliable structure, he would like to estimate the stress level at
which a material is 90 percent or 99 percent reliable. A tolerance limit is an
attempt to estimate these quantiles in a conservative way. Such conserva-
tism is particularly necessary for advanced composite materials, which typi-
cally have relatively high strength variability. Also, advanced materials
are generally expensive to manufacture and test, resulting in small sample
sizes.

The work presented here has been motivated by a need for improved
methodology for calculating basis values and for communicating lower tail
quantile information to the engineer. Typically, the engineer who routinely
calculates and interprets these numbers has little appreciation for the
rather convoluted frequency arguments behind tolerance limits. The long run
proportion of times a statistic calculated from successive samples of size n
from a hypothetical population is greater than a certain quantile of that
population is of little help to the statistically naive. The simple state-
ment that the tenth (first) percentile is greater than the B-basis (A-basis)
value with 95 percent probability is much more direct and intuitive. Also,
the Bayesian approach presents all of the information in the data about the
lower tail quantile of interest, which is what should be the ultimate
concern of the engineer anyway. The fact that the tolerance limit is only a
convenient summary statistic of this distribution becomes clear when the
user is presented with the entire posterior and shown how to determine
arbitrary tolerance limits graphically.
Table 2 presents B-basis value calculations for a graphite fiber/epoxy
material made by four fabricators. Note the agreement between the DiCiccio
and the Lawless calculations. Figure 1 consists of the four tenth per-
centile posteriors. Not only do two of the fabricators have nearly the same
B-basis value, they also have virtually identical quantile posteriors.
Several questions immediately come to mind: Why did the other two manufac-
turers produce substantially weaker material? Are other lower tail quantile
posteriors for the two 'similar' fabricators as close together? Etc. Ex-
amining the posterior rather than a summary statistic of the posterior leads
to insight into the data that might not otherwise be apparent. Figure 2
demonstrates that the B-basis value can be retrieved graphically.

Table 3, Figure 3 and Figure 4 present corresponding results for
another material: woven Kevlar fibers in an epoxy matrix. These data show
much less fabricator-to-fabricator variability than do the graphite/epoxy
data. This can readily be seen from the tolerance limit calculations. The
fact that there is essentially no evidence in the data to suggest that the
fabricators differ with respect to the tenth percentiles of their strength
distributions is made particularly clear by the overlapping posteriors of
this quantile.

This paper reviews two results related to conditional inference in lo-
cation-scale families, emphasizing inference on Weibull quantiles. These
methods are due to Lawless (1972) and Bogdanoff and Pierce (1973). For the
case of inference on quantiles both procedures are equivalent, though the
former is motivated by frequency considerations, while the latter is derived
from a Bayesian point of view. The recent work of DiCiccio (1987) greatly
reduces the computational burden of both methods with little loss of
accuracy.

The advantages of the Bayesian interpretation, at least for inference
on quantiles, have been demonstrated by means of an example from an engin-
eering application.

References

Barndorff-Nielsen, O. E. (1986). "Inference on full or partial parameters,
based on standardized signed log likelihood ratio", Biometrika, 73, 307-322.

Barlow, R. E. and F. Proschan (1975). Statistical Theory of Reliability and
Life Testing, New York: Holt, Rinehart and Winston.

Billmann, B., C. Antle and L. J. Bain (1972). "Statistical inference from
censored Weibull samples", Technometrics, 14, 831-840.

Bogdanoff, D. and D. A. Pierce (1973). "Bayes-fiducial inference for the
Weibull distribution", J. Am. Stat. Assoc., 68, 659-664.

Bury, K. (1975). Statistical Models in Applied Science, New York: John Wiley
and Sons.

DiCiccio, T. J. (1987). "Approximate inference for the generalized gamma
distribution", Technometrics, 29, 33-40.

Farewell, V. T. and Prentice, R. L. (1977). "A study of distributional shape
in life testing", Technometrics, 19, 69-75.

Jones, R., F. Scholz, M. Ossiander and G. Shorack (1985). "Tolerance bounds
for log gamma regression models", Technometrics, 27, 109-118.

Lawless, J. F. (1972). "Confidence interval estimation for the parameters of
the Weibull distribution", Utilitas Mathematicae, 2, 71-87.

Lawless, J. F. (1973). "Conditional inference for the parameters of the
Weibull distribution", J. Am. Stat. Assoc., 68, 665-668.

Lawless, J. F. (1980). "Inference in the generalized gamma and log gamma
distributions", Technometrics, 22, 409-419.

Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data, New
York: John Wiley and Sons.

Lawless, J. F. and N. R. Mann (1976). "Tests for homogeneity for extreme
value scale parameters", Commun. Stat., A5, 389-405.

Thoman, D. R., L. J. Bain and C. E. Antle (1969). "Inference on the
parameters of the Weibull distribution", Technometrics, 11, 805-816.

Thoman, D. R., L. J. Bain and C. E. Antle (1970). "Reliability and tolerance
limits in the Weibull distribution", Technometrics, 12, 363-371.

Table 1

Accuracy of the DiCiccio approximation

A simulation of 95% lower confidence bounds on the 10th percentile using the
Weibull distribution with 100 replicates per case was performed. The
results are summarised below:

 n    r   Shape   Scale   Mean % diff.   Std. error of mean

10   10    10      1        1.09            .029
20   20    10      1         .380           .0086
10   10     5      1        2.20            .059
20   20     5      1         .761           .017
30   20     5      1         .752           .016

Table 2

Carbon fiber/epoxy specimen tensile strength data:
95% LCB on 10th percentile

Fabricator    n     Estimates (KSI)
                    Lawless    DiCiccio

A            48     244.1      244.3
B            36     271.4      271.6
C            33     228.2      228.5
D            25     269.5      269.8

Table 3

Kevlar fabric/epoxy specimen tensile strength data:
95% LCB on 10th percentile

Fabricator    n     Estimates (KSI)
                    Lawless    DiCiccio

A            23     77.8       77.64
B            18     76.36      76.50
C            30     77.18      77.23
D            10     78.45      78.64

Figure 1

[Tenth-percentile posterior distributions for the four carbon fiber/epoxy fabricators.]
Figure 2

[Graphical determination of the B-basis value from the tenth-percentile posterior.]
Figure 3

[Tenth-percentile posterior distributions for the four Kevlar/epoxy fabricators.]
Figure 4

[Graphical determination of the B-basis value for the Kevlar/epoxy data.]
Appendix: FORTRAN Listings

The following programs were developed on an Alliant FX/8 and should run with
little modification on any 32-bit machine. However, the software has not been
tested to the point where it can be considered error free. The programs are
provided as a guide to an individual wishing to implement the algorithms discussed
in this paper.

Program lawpgm
c
c     Mark Vangel
c     Program to implement Lawless' procedure for
c     conditional confidence intervals on quantiles
c     for a location/scale family.  The family chosen
c     here is extreme value.  Data may be Type II censored.
c     Note that conditioning on the ancillaries gives the
c     equivalent of an HPD region for a noninformative
c     prior.
c
      implicit double precision (a-h, o-z)
      parameter (idim = 500)
      character*12 flname
      dimension x(idim)
c
      common /dat/ x
      common /ca/ cnorm, suma, n, k
      common /cb/ p, amu, wp, t
      common /cd/ tol
      data one /1.d0/
C
date coarse, los t.
finer e.o-7, -?, 1.0-5/
C
c -- Citout unit nfumoer amo ftlenffe
writ@ (6, ) 'ootPut unit number ?I
Peso (5, ) lout
it (out ,ne. 6) then
write (6, *) 'Filenamo ?I
rea9 (5,'Ca12)') llen' e
ocem (unituiOUt, fileutlenm., *~*tulnr, . * )
end it
c
c -This gprocrai, is tostou 00t ramn, jotA.
c It car. aI,1 be wsol f*or riata i ro, a Ifi(e.
c Tro first recora of ima inout lilf. has 10,e
$ &t qe sit
and t0e M gir of uncemsortc VAL. s.
write (6, *) 'Enter I for date frre ite
write (6o a) I fo ranoeo oats.'
read (5, ) ICat
if (1Oat .eq. 1) then
write (6, *) O~i eriavar ?I
real (S, '(dc)') flenee
Ocer (ur tilout$* 4 tmlien ,t lu
t selOW
' )
read (iOut1, a) M, K
c
c -- Notet the first fieLd on eoO roemain~o reccr. iis
c botch indicator not use for this nresra-.
do 10 is1. k
read (iout*1, *) aummy, w (1)
10 continue
ca I gL vrgm (k, a, K)
else
write (6, 6) 'seed ?'
read (5, * ) isVed
wr ite (6
to) 1wei but k ViflC drc scalp ?'
read (So e) shp, act
write (6, 4) 0SampLe site ?'
read (5, 0) m
wr it (6o 5) 'wumber uncensorco ?'
,ea'1 (5, C) k

Al

307
c
c -- Got the pseudoofand*m sample
CALL mnsot (41600).
slt drhWib (he Shpo 11)
sail davr~n (no so A)
00 20 isle k
a MI * set Ox MI
10 continue
end if

..- Got the Weljutt "LEI$


cat( vnrmte (Pshoo @scI. no ko no epso Item, 100)
s-- Extreme value location (Amu) amd seat# (lsg) *Stietatts
amu adiog (*eti)
as; one /eshp
* - Write out what we have so for.
write (lout, *)
writ* (iout, s) #The Lawless gonm4ltiomaL procedure'
Uritt flout, 0)
if (toot mne, 1) them
write Clout, s) 'Seed t ',hee
end it
vril# Clout, 0) 'Sample silt : t n
write (lout, *) 'Numtor uncensorea to A.
if (idol an@. 1) them
Urite tiouto *) '600.uLt shaoto scalp I ', Soto* sct
end if
write Clout, s) '~eibulL MLE% I ', esnr, tscL
Write ClOUt, 10)'LutRc Valu# location: 'p
0FAft
WrFit1e Clout# ) 'Ltrepo Value scale I A,
's
write Clout, C
Writ# (lout, K 0) 'ibut 04ta 1'
write (lOut, lflw) (AMP)iulo k)
1C formatl'~04
c N.lows catcul4to the tober~mct limnit wsin'rk ho
c exact mothoo AMC O~cici's aoolalr
c
write (6o 0 'Lower comliaonce oeund caLcwtjltinr'
write (6P50'$uar'tiLe ?I
reso Ci5 C) P
write (6o 0 'Coniluenct, coefficient ?'
reso (So 0) qa"
c -- Lawless comdional rrocedurp (TocIhnoepetricto i;?'.t
call Lowles (in,ne ko,
ri ea-
tot)
c -- OlCiecic's moppoximatiom (Technometricso lc'!?)
catil lawbom (no MO ko Do gamo &tot )
c rile out tht toterenco limit
la reculci
write (lout, *)
4
write Clout, s) 'Lower con irlenco toum,' on d ousptilo'
Write@touto s) ' (rxt roe,p Value am, 6eitut I)I
write (louto 0) 'Probatitity' t ::
Wri to(louto 0) 'Confidence. I# Yom~
write Clout, 0) 'fitrte Yntut qua'tito I ', t1,
write Hloule 0) 'Lawless tolerance liits t' #tot# to;(e.il
wt~g
i oCicut, )'Aopromimations I' atol, toj(aq~I

e write (Oo*) 11sed$, 2.pdl ?I
C read (*#*) itypt
it yot a I
write (*,*) 9144 one) mat for abscissa ?I
read (*o*) amino amia
write (* I)9miand mam for ordirtdtt COPA lop ',ilwit ?I
read amino ,*x.
oem
Gatt imitt (980)
CaltL b i mit I
Cott comset (1ae(Iamin)
Cat( comaot (10aSeaC12) OMam)
if (aa *no. sera) then
Catt coaset (ibasty(11, orim)
Catt comset (ibaSey(l2), omab,)
and if
C
41 (itype *#q. 1) then
Cott mots (notot *na)
Catt ChePck (Qu#Mt, cdi)
Catt mot$ (ma)
Catl dBotay (Quanto cdf)
Go 30 41# npott2
Cott cptot (Quart (l*nQ.l)* crlf (IOnQ41))
Continue
*ls# 41 (ityp. .@a. 2) them
Ca l mot s (nnLot *n;q)
gait 0100c~ (Quanto dens)
Catl nmpta (rrli)
Call d5ptay (duent, oens)
co 40 1.1, nvlot-I
Coll c.olot (Quart (1*M*n.')P GPM% (i*9MQoI))
40 comnne
Ono i I
Cott raovabs Coo loonl)
Cal L Aneodt
reed (a,*)
go to2
eel i1f
C

write (*#*) *(uil ?I


read ($00W)') on&s
if (an$ *no. 'y') go to
Cott finilt (of 760)
C
stop
end

Program prtpil
c
c     Mark Vangel, July 1989.
c     Program to calculate and plot the posterior of a
c     percentile for the Weibull model.  This program calls
c     subroutines from the Tektronix Plot 10 library.
c
charatter*20 a..
charactor*l @mt
'dimension m(1000), cdI (SO()o)dems(500O)o quarit ("~
* ipoint (100)
data for** on.t IO.d0p 1.dO/
ipoInt (1) S 0
mQ 0 100
"~plot a 0

c -* Loop over all fites.


write 00*I'mbriel outputo 1'comr'tote owicut ?I
read ** ibriel
continlue
write (*** *FI(#namo ?I
rear' ($09(420M' ftomme
if (Itenwec .*a. 1 ') go tn 2
mptot 0 nplot +1
Dom Cumitm1l,. litfuftmnmo)
read (1000) miampo 1chs
do 10 1.1, moot
read (lOic) idumok Civoint (mPtot) -0i)
10 coritinue
close (10)
ipoint (notot+1) a ipait CM000t Onobs
C
C - Cacl tm
tn osterior 4or a specilli' )wamlite

read (Ooxl) r,
if (o ,q%. one) 9 N
a1"
write Civot) 'k4ange of value% for vosterior F

read (*0*) U0mn OTAX


write (*,*) 'Posterior ofl~CtL
write roff, IoQmir.
I54 I to to opa%
write (,~
Ci c2C 1# r4
ioal § (nplot -1)*rnc 41
quint 0 (i-1) 4(13 *offin
(ifix)
call tawapx (a tipaimt(nnLot) 41), miamoo ro ,so
S quaint Cio) o po cdI fido))
if (ibriol V,3. 1) then9
end if
20 continue
gto
N-fow plot the results
2 continue
write (e,s*) 'Plot& ?
reao (OA'(Afl di
0i (anc teo. ')I') tImer

~- tnsity calcu~ations not yet iitOIV'nent..

4 write (S*) °lmcd. 2mpd4 ?f
C reed (0os) 4type
itype a I
write (**S) '4in and oma for abscissa ?I
read (s*) qe4in qae
write (o*) 'Win and msa lor ordinate DO,0 for default) ?9
read (*,) eu4no osin
geLL initt (960)
Cell binitt
Cell couset (ibeem(I1)o q4in)
call comset ,(basoemt2)o 4ue)
4f 494a4 one. tore) then
ceLL ceoset (cibeys 110 sn)
Cell $onset (ibasey(ll), *Oak)
end 41

4 (etepe .eq. I) then


aill notS (nplot *nq)
ctll check (quant. cdf)
call note (nq)
call dsoLey equint cd4)
do 30 imlo npLst-I
ceLL cotl (quant (i*nq*L). ed4 (isnQ61))
continue
else 44 (itype seq 2) then
call npts (nolot *no)
sell check (aant* dens)
Call not$ Crtq)
cell dspLay (quant, dens)
do 40 410 nPlot-1
ceLL cplot fquant (4o*nQ4). dens (i*nq*))
continue
end 4
call movebs (0r 1000)
call encode
read (00)
go to 2
end it
C
write (Ce0) 'Guit It
read (W,'(sl)') ens
i (an$ one. Sy') go to 2
caLl finitt 40o 700)

      program tavl
c
c     Mark Vangel, 10/14/88
c     Program to test by simulation an approximation of
c     DiCiccio (1987) to the Lawless conditional procedure.
c
      implicit double precision (a-h, o-z)
      parameter (idimax = 500)
      character*12 flname
      dimension x(idimax)
      common /dat/ x
      common /ca/ cnorm, suma, n, k
      common /cb/ p, amu, wp, t
      common /cd/ tol

      data coarse, fine /1.d-4, 1.d-7/


9 -- Output unit number sno 1iLonawe
write (6,0) 'Output unit nuwi.0or 7'
read (so 0) out
if (lout eme. 6) thtem
write (6o *) tPitoname 1'
read (50"(41W)) Iten-.
open (unitsieut, littullening, slatusue.)
end if
c uGet limulation parameters
write (6o *) ISeed ?I
read (So * ) 1sevo
write (to 0) 'Weitutt shape an, Icale
read (S, *) sho scl
write ((i, 4) 'e16pte silo 7'
read C~50 ) m
write (bo 0)'Ic~r uncensorte, '0
reap (. 0)o k
write (6o.0) 'Wumt-er of reolicates ?'
read (So ml) nli
wr ItI (60i I)Nwa,fl 7'1
reap (S, m)o
write too 0) Comfiatnct cov 4 licipmt ?I
read (So s) qatr
c
c --Write out what we have sm far.
write (lout, *)
write (lout. 0) 'The Lawtoss condilir'naI ;rccewrP
write (lout. VA)
wrPi te (lout, 0) *ed'
write (lout, 0) liwirr of r.aL icitt I me'.
wri te (lout, 0) 'Samnler size , M
write@ (lout, 0) '?.UR~fr uneporf: 0
write (lout, 4) '4uariti 10e
wri4tef(lo ut,*) 'Comfiiencp cool4iclermt I Ia
wtite (lt out, 0*) 'heinult shar*e. scato ' ~ s
write (lout. s)
C
C -- Loop over tIe~& * o' rarlicats
caL rmid (110PO)
5 a C.dO

at) 9 1810 nsie

colti drvwio (MO shwo 1)
cott d~vPrfm (m, X,11
(to 20 4-010 k~
a Ci) a cI*

20ciiaa
cO99tiIUf

(KfOifC mo YAM
CJOe*~
Ks.!9lP1iI
C~~Ct L.oI9BB

CT1ch'pop~trcso 1
c - ricitcio' BatoifillIion

c -- essufU
tfpf
infOia~O ~.
% of atr' ml ffo
1.
dCIPC * CaOt -ftot)/.O

ow~itthe to~tC!IfIG Cifit FOSutts


c
(i§ulp 0~ ILAWIps% # Priccino. ~t~~
write

W
?Os Sion. orv.
s!mara o~fI.offer
C
%flI *b. /Ctt' IC~A
$I Ia t (L.i
.
*f l (iouto 1:1 *ier1S ,,I
cp I i b
A- a~ iviAtiC
W F1 1 l,

      subroutine lawlss (x, na, ka, p, g, tol)
c     Mark Vangel, May 1988
c     Subroutine 'lawlss' calculates one-sided lower tolerance limits
c     for the Weibull model using the Lawless conditional procedure.
c     This routine performs 'exact' calculations by numerical
c     integration.  For even moderately large samples, the DiCiccio
c     approximation used in subroutine 'lawapx' is very accurate
c     and computationally less troublesome than the method used here.
c
c     x    -- Data                                       (Input)
c     na   -- Total sample size                          (Input)
c     ka   -- Number of (uncensored) observations        (Input)
c     p    -- Probability associated with quantile xp    (Input)
c     g    -- Confidence level for LCB on xp             (Input)
c     tol  -- 'Exact' lower tolerance limit              (Output)

c     Ref: Lawless, Technometrics 1975


imoticit double orecition (&-ho o-&)
dimensiont K 1
common /girl eshoo tocto mpup kso
Common lI4/ cnorm, suine, no If
common ltl p. PAR# woo I
common /cd/l tot
C
external nte)MI, Acorf
data coarse. logo lime /.-,1d~ .i7
dat& aReo ontot cL, 1d~i, et
I 1.70'/
-- Put stul 4 in ce'off"o
n *no
k Uka

C -- (et the Wtiolt I'LL" I


call wnpmtt (tshr,, OICto no ko go 00%, iiero IMO.)

C -- Toteramce lim'it lacimr renot. TIis reap it c .A


C efloucaf or viflutyjL sm practical coRfidemep coOfIic9S''
C 'iCQon1 proviats a first a OictOR th t^tcrimca
C Li R,4 1. I. ot o thm% iconl Irrturr,& Lower it'.ra'ce timits.

Cott icenf (Ko no ko p. Como IO(mt)


cam SotMI -qdm
I oL t too I (es t /t e MI ) **t I 0,)
atI cL st tt in
IIN ch *toL kiMI
write (,)'Lamlei IFirst guess to
', o- (tottml
C
C *Now go to tog scatft.
do 1(0is1, k
A (il a tog (M Mi)
1J co ftIinue,

IC -* EareIA WAkhWq anJ $Cale


tocatiDP (NfrTj) 110%0) 05iFAIII'
amuM t og (ict)
'sq a one /eiho

c -. We need the ancitterits and their %up
ou"a IN sero
do 20 161gm k
sum& 0 suma *(a 'il -amu) /ag
20 continue
c
c *Next we obtain the constant of imttgration.
e use an acaptive qua'drsture routine to iitqrate
c XNTGNL an ('s, inliiity).
charm~ one
eoserr a ero
rttrr to%
call dadagi (anton1o stro* I# *sesrr# rettrro inko err)
c
* - The normalizing constant
cnorR 0 on# /xk
wri te (60 0)
write (6i 0) 'The mormatitina constant it of cropw
a
4 *- tt ouamtie tofr stanaarri eatremp value istri'out~on
wo. tog (-to; (one -p)

-. Fitifoolto~ 0th qwantitt lop dota


8P xMW Oasg *%fl

c
c - frts A~lorithn to fim, 1olerarce Iiwit facie.r S~c
9 that th* inltgraL of YIT~, irom ( .P ir irlt ) tnu4it 144,1.
c 'The first no%% has a Lmr',e error toLtrarct to save 14-,(
c The %#Comoi ;as% Ufa% the linal toterance.
C

asier stze
P#Lerr acoarte
lt coarse
caLL cibren (acomt, &oserrp rtier, ri, tr., Pakit)
c
C troPve in lop the in1ll.
man i I 10 C
it .95ov, *RN
th 1.050C *8h
rekerr atint

caL I dibevn (eAont, ALsopr ret err, tit, " sit)


aotI Ct a ,h

c -'Calculate tho! tolerance tit-i


Itoi. a Inc Canu -Jh Cins )

_. R-Vestore the doll


do 30 ia), k
a Mi a arj 0. MI)
IV continue

end

A9
315
      subroutine lawapx (x, nsampa, nobsa, pa, gama, xtla)
c
c     Mark Vangel, July 1988
c
c     An excellent approximation to the one-sided Weibull
c     conditional tolerance limits of Lawless (1975).
c
c     x     -- Data (Input)
c     nsamp -- Total sample size (Input)
c     nobs  -- Number of (uncensored) observations (Input)
c     p     -- Probability associated with quantile xp (Input)
c     gam   -- Confidence level for lcb on xp (Input)
c     xtl   -- Approximate lower tolerance limit (Output)
c
c     Ref : Lawless, Technometrics 1975
c           DiCiccio, Technometrics 1987
c
      implicit double precision (a-h, o-z)
      dimension x (1)
      common /par/ ushp, uscl, xmu, xsg
      common /conf/ p, gam, nsamp, nobs
      external aconf1
      data maxit /100/, eps /1.d-5/
      data coarse, fine /1.d-4, 1.d-7/
      data zero, one /0.d0, 1.d0/
      data cl, ch, dl, dh /.9d0, 1.2d0, .95d0, 1.05d0/
c
c     -- Put stuff in common
      p = pa
      gam = one - gama
      nobs = nobsa
      nsamp = nsampa
c
c     -- Get the MLE's of the Weibull parameters
      call wblmle (ushp, uscl, nsamp, nobs, x, eps, iter, maxit)
c
c     -- Tolerance limit factor range.  This range is broad
c     enough for virtually any practical confidence coefficient.
c     'iconf' provides a first approximation to the tolerance
c     limit.
      call iconf (x, nsamp, nobs, p, gam, tollmt)
      al = cl * tollmt
      ah = ch * tollmt
      write (6, *) ' Lawapx : First guess : ', tollmt
c
c     -- Brent's algorithm to find tolerance limit factor at
c     the desired confidence level.  The first pass has a
c     large error tolerance to save time.  The second pass
c     uses the final tolerance.
c
      maxit = 100
      abserr = zero
      relerr = coarse
      tol = coarse
      call dzbren (aconf1, abserr, relerr, al, ah, maxit)
c
c     -- Move in for the kill.
      maxit = 100
      al = dl * ah
      ah = dh * ah
      relerr = fine
      tol = fine
      call dzbren (aconf1, abserr, relerr, al, ah, maxit)
c
      xtla = ah
      return
      end
      subroutine iconf (x, nsamp, nobs, p, gam, tollmt)
c     Mark Vangel, October 1988
c
c     A non-iterative first approximation to the Lawless conditional
c     procedure (or, alternatively, to the posterior of a quantile under
c     a flat prior).  The routine is written for two parameter Weibull
c     analysis, but extension to the generalized log gamma family is
c     straightforward.  This routine returns the estimated confidence
c     limit for a prescribed probability level and confidence (hence
c     the 'i' -- for inverse -- in the routine name).  This routine is
c     approximately inverse to 'aconf'.  It provides the same result
c     as 'lawapx' but with a less accurate approximation.
c
c     Ref : DiCiccio, T.J., Technometrics 1987, p. 33
c
c     x      -- Data (Input)
c     nsamp  -- Total sample size (Input)
c     nobs   -- Number of (uncensored) observations (Input)
c     gam    -- Confidence level for lcb on xp (Input)
c     p      -- Probability associated with quantile xp (Input)
c     tollmt -- Lower tolerance limit (Output)
c
      implicit double precision (a-h, o-z)
      dimension x (1), s (4)
      data eps, maxit /1.d-5, 100/
      data zero, one, half, thhalf /0.d0, 1.d0, .5d0, 1.5d0/
c
c     -- Get the Weibull MLE's
      call wblmle (ushp, uscl, nsamp, nobs, x, eps, iter, maxit)
c
c     -- Transform to extreme value distribution.
      uloc = log (uscl)
      uscl = one / ushp
c
c     -- Calculate the derivatives of the log likelihood at
c     the MLE.
      wp = log (-log (one - p))
      do 30 j = 1, 4
         s (j) = zero
         do 40 i = 1, nobs
            z = (log (x (i)) - uloc) / uscl
            t = ((z - wp) ** j) * exp (z)
            s (j) = s (j) + t
   40    continue
         s (j) = s (j) + (nsamp - nobs) * t
   30 continue
c
      d20 = -nobs
      d02 = -(nobs + s (2))
      d30 = nobs
      d21 = 2 * nobs + s (1)
      d40 = -nobs
      d31 = -(3 * nobs + s (1))
c     [the expressions for d11, d12, d03, d22, d13, and d04 are
c      illegible in the source listing]
c
c     -- The approximate mean and variance of r, the signed
c     square root of -2 times the likelihood ratio, are calculated
c     in terms of the dij.
      v11 = -d02 / (d20 * d02 - d11 * d11)
c     [the series expansions for a, b, c, and d, and the resulting
c      approximate mean umu and standard deviation usg of r, are
c      illegible in the source listing]
c
c     -- DiCiccio's equation 5.
      zg = dnorin (gam)
      rg = umu + usg * zg
      yp = uloc + uscl * wp
c     [the higher order correction terms in tollmt are illegible in
c      the source listing]
      tollmt = yp + uscl * sqrt (v11) * rg
c
c     -- Go to Weibull scale.
      tollmt = exp (tollmt)
c
      return
      end
      double precision function xntgnd (z)
c     Proportional to the pdf of a certain pivotal quantity.
c     The normalizing constant, 'cnorm', depends on the
c     data and must be obtained by a preliminary numerical
c     integration.  Once 'cnorm' is known, XNTGND becomes a
c     pdf and is used by YNTGND during the primary numerical
c     integration to get the tolerance limit.
c
c     Note : 'cnorm' must be initialized (to 1) before the
c     preliminary integration.  Following that, 'cnorm'
c     can be assigned the value which makes XNTGND a pdf.
c
      implicit double precision (a-h, o-z)
      dimension x (1)
c
c     -- /dat/ is used by YNTGND, hence the common block.
      common /dat/ x
      common /cc/ z1
      common /ab/ cnorm, suma, n, k
      common /par/ ushp, uscl, xmu, xsg
      data zero, one /0.d0, 1.d0/
c
      pwr = one / dfloat (k - 1)
      t = one
      do 10 i = 1, k
         if (i .eq. k) t = t * dfloat (n - k + 1)
         a = (x (i) - xmu) / xsg
         t = t * exp (z * a)
   10 continue
c     [the closing lines are partly illegible in the source listing;
c      they combine t, suma, and z into the pivotal density and
c      divide by the normalizing constant]
      xntgnd = t / cnorm
      return
      end
      double precision function xconf (t0)
c
c     Subroutine to determine the probability of
c     the pth quantile being less than t0, where t0 is the
c     tolerance limit factor.
c
      implicit double precision (a-h, o-z)
      common /cab/ p, gam, wp, t
      common /cd/ tol
      external yntgnd
      data zero, one /0.d0, 1.d0/
c
c     -- t is to be found such that the integral of YNTGND on
c     (0, infinity) equals 'gam'.
      t = t0
      abserr = tol
      relerr = zero
      call dqdagi (yntgnd, zero, 1, abserr, relerr, xint, err)
      xconf = xint - gam
c
c     -- This is just some terminal output to appease the user while
c     the program grinds out the results.
      write (6, *) ' Lawlss : Confidence at ', t0, ' is ', xint
      return
      end
      double precision function yntgnd (z)
c     Function to calculate the integrand for
c     determining the confidence.
      implicit double precision (a-h, o-z)
      common /cab/ p, gam, wp, t
      common /ab/ cnorm, suma, n, k
      common /cc/ z1
      data zero, one /0.d0, 1.d0/
c
      pwr = one / dfloat (k)
c
c     -- The first factor is the function which was integrated
c     to get 'cnorm'.
      yntgnd = xntgnd (z)
c
c     -- The second factor is the gamma cdf evaluated at a
c     quantity which depends on (wp, t) and on z1, a quantity
c     calculated in XNTGND.
c     [the expression for the argument u is partly illegible in
c      the source listing]
      u = exp (wp + t * z) * z1 * pwr
      q = one - dgamdf (u, dfloat (k))
      yntgnd = yntgnd * q
      return
      end
      double precision function aconf1 (t)
c
      implicit double precision (a-h, o-z)
      dimension x (1)
      common /dat/ x
      common /conf/ p, gam, nsamp, nobs
c
      call aconf (x, nsamp, nobs, t, p, prob)
      aconf1 = prob - gam
      return
      end
      subroutine aconf (x, nsamp, nobs, q, p, conf)
c     Mark Vangel, July 1988
c
c     An approximation to the Lawless conditional procedure (or,
c     alternatively, to the posterior of a quantile under a flat prior).
c     The routine is written for two parameter Weibull analysis, but
c     extension to the generalized log gamma family is straightforward.
c
c     Ref : DiCiccio, T.J., Technometrics 1987, p. 33
c
c     x     -- Data (Input)
c     nsamp -- Total sample size (Input)
c     nobs  -- Number of (uncensored) observations (Input)
c     q     -- Value for which P (xp .le. q) is desired (Input)
c     p     -- Probability associated with quantile xp (Input)
c     conf  -- P (xp .le. q) (Output)
c
      implicit double precision (a-h, o-z)
      dimension x (1), s (4)
      common /par/ ushpw, usclw, xmuw, xsgw
      data eps, maxit /1.d-5, 100/
      data zero, one, half, thhalf /0.d0, 1.d0, .5d0, 1.5d0/
c
c     -- Get the constrained MLE's, where q is constrained to
c     be the pth quantile.
      call cwbmle (p, q, cshp, cscl, nsamp, nobs, x, eps, iter,
     &             maxit)
c
c     -- Get the unconstrained MLE's.
      call wblmle (ushp, uscl, nsamp, nobs, x, eps, iter, maxit)
c
c     -- Transform to extreme value distribution.
      cloc = log (cscl)
      uloc = log (uscl)
      cscl1 = one / cshp
      uscl1 = one / ushp
c
c     -- Calculate the signed square root of -2 times the log
c     of the likelihood ratio.
      cl = zero
      ul = zero
      do 20 i = 1, nobs
         t = log (x (i))
         cl = cl + (t - cloc) / cscl1 - exp ((t - cloc) / cscl1)
         ul = ul + (t - uloc) / uscl1 - exp ((t - uloc) / uscl1)
   20 continue
      cl = cl - (nsamp - nobs) * exp ((t - cloc) / cscl1)
     &        - nobs * log (cscl1)
      ul = ul - (nsamp - nobs) * exp ((t - uloc) / uscl1)
     &        - nobs * log (uscl1)
      wp = log (-log (one - p))
      sgn = log (q) - uloc - wp * uscl1
      sgn = sgn / abs (sgn)
      r = sgn * sqrt (abs (2 * (ul - cl)))
c
c     -- Calculate the derivatives of the log likelihood at
c     the MLE.
      do 30 j = 1, 4
         s (j) = zero
         do 40 i = 1, nobs
            z = (log (x (i)) - uloc) / uscl1
            t = ((z - wp) ** j) * exp (z)
            s (j) = s (j) + t
   40    continue
         s (j) = s (j) + (nsamp - nobs) * t
   30 continue
c
      d20 = -nobs
      d02 = -(nobs + s (2))
      d30 = nobs
      d21 = 2 * nobs + s (1)
      d40 = -nobs
      d31 = -(3 * nobs + s (1))
c     [as in iconf, the expressions for d11, d12, d03, d22, d13,
c      and d04 are illegible in the source listing]
c
c     -- The approximate mean and variance of r are calculated in
c     terms of the dij.
      v11 = -d02 / (d20 * d02 - d11 * d11)
c     [the series expansions for the approximate mean umu and
c      standard deviation usg of r are illegible in the source
c      listing]
      rr = (r - umu) / usg
c
      conf = dnordf (rr)
      return
      end
      SUBROUTINE WBLMLE (A, B, NSAMP, NOBS, X, AEPS, ITER, MAXIT)
C
C     MARK VANGEL, MAY 1988
C
C     ESTIMATE WEIBULL PARAMETERS BY MAXIMUM LIKELIHOOD
C     A     -- WEIBULL SHAPE PARAMETER (RETURNED)
C     B     -- WEIBULL SCALE PARAMETER (RETURNED)
C     NSAMP -- SAMPLE SIZE (INPUT)
C     NOBS  -- NUMBER OF OBSERVATIONS (INPUT)
C              (NSAMP-NOBS) VALUES ARE TYPE II CENSORED
C     X     -- DATA VALUES
C     AEPS  -- ABSOLUTE CONVERGENCE TOLERANCE (INPUT)
C     ITER  -- NUMBER OF ITERATIONS REQUIRED (RETURNED)
C     MAXIT -- MAXIMUM NUMBER OF ITERATIONS (INPUT)
C
      IMPLICIT DOUBLE PRECISION (A-H, O-Z)
      DIMENSION X (1), CVTBL (25), SHPTBL (25)
C
C     -- COEFFICIENTS OF VARIATION FOR MOMENT ESTIMATE OF
C     SHAPE PARAMETER
      DATA NTBL /25/
      DATA ZERO, ONE /0.D0, 1.D0/
C     [the 25 entries of CVTBL, the Weibull coefficient of variation
C      at the shape values tabulated in SHPTBL, and the entries of
C      SHPTBL itself are only partly legible in the source listing;
C      the legible leading CVTBL values are .52271, .36345, .28054,
C      .22905, .19377, .16802, .14837, .13286, .12021, .10944,
C      .10122, ...]
C
C     -- SCALE THE DATA
      SCALE = X (1)
      DO 10 I = 1, NOBS
         IF (X (I) .GT. SCALE) SCALE = X (I)
   10 CONTINUE
      DO 20 I = 1, NOBS
         X (I) = X (I) / SCALE
   20 CONTINUE
C
      XNSAMP = DBLE (NSAMP)
      XNOBS = DBLE (NOBS)
      EPS = AEPS
      IF (EPS .LE. ZERO) EPS = 1.D-5
C
C     -- CALCULATE THE SUMS OF LOGS, OF VALUES, AND OF SQUARES
      S1 = ZERO
      S = ZERO
      SS = ZERO
      DO 30 I = 1, NOBS
         S1 = S1 + LOG (X (I))
         S = S + X (I)
         SS = SS + X (I) * X (I)
   30 CONTINUE
      S1 = S1 / XNOBS
C
C     -- USE THE COEFFICIENT OF VARIATION TO GET A METHOD OF
C     MOMENTS ESTIMATE OF THE SHAPE PARAMETER
      CV = SQRT (XNOBS * SS - S * S) / S
      IF (CV .GE. CVTBL (1)) THEN
         A = SHPTBL (1)
      ELSE IF (CV .LE. CVTBL (NTBL)) THEN
         A = SHPTBL (NTBL)
      ELSE
         DO 40 I = 2, NTBL
            IF (CVTBL (I) .LE. CV) THEN
               A = SHPTBL (I-1) + (SHPTBL (I) - SHPTBL (I-1)) *
     &             (CVTBL (I-1) - CV) / (CVTBL (I-1) - CVTBL (I))
               GO TO 50
            END IF
   40    CONTINUE
      END IF
   50 CONTINUE
C
C     -- LOOP UNTIL CONVERGENCE OR ITERATION LIMIT EXCEEDED
      ITER = 0
   55 CONTINUE
C
C     -- CALCULATE SUMS FIRST
      S2 = ZERO
      S3 = ZERO
      S4 = ZERO
      DO 60 I = 1, NOBS
         T1 = X (I) ** A
         T2 = T1 * LOG (X (I))
         T3 = T2 * LOG (X (I))
         S2 = S2 + T1
         S3 = S3 + T2
         S4 = S4 + T3
   60 CONTINUE
      S2 = S2 + T1 * (XNSAMP - XNOBS)
      S3 = S3 + T2 * (XNSAMP - XNOBS)
      S4 = S4 + T3 * (XNSAMP - XNOBS)
C
C     -- NEWTON STEP FOR THE SHAPE PARAMETER
      T4 = S3 / S2
      T5 = S4 / S2
      A1 = A + (ONE / A + S1 - T4) /
     &         (ONE / (A * A) + T5 - T4 * T4)
C
C     -- CHECK FOR CONVERGENCE
      IF (ABS (A1 - A) .LE. EPS .OR. ITER .GE. MAXIT) GO TO 70
      ITER = ITER + 1
      A = A1
      GO TO 55
C
C     -- CALCULATE SCALE PARAMETER FROM SHAPE PARAMETER
   70 CONTINUE
      A = A1
      B = (S2 / XNOBS) ** (ONE / A)
C
C     -- UNSCALE DATA
      DO 80 I = 1, NOBS
         X (I) = X (I) * SCALE
   80 CONTINUE
      B = B * SCALE
      RETURN
      END
      SUBROUTINE CWBMLE (P, XP, A, B, NSAMP, NOBS, X, AEPS, ITER,
     &                   MAXIT)
C
C     MARK VANGEL, JULY 1988
C
C     ESTIMATE WEIBULL PARAMETERS BY MAXIMUM LIKELIHOOD, SUBJECT
C     TO THE CONSTRAINT THAT THE ESTIMATED PTH QUANTILE AT THE MLE
C     IS EQUAL TO XP.
C
C     P     -- PROBABILITY LEVEL OF QUANTILE
C     XP    -- PTH QUANTILE
C     A     -- WEIBULL SHAPE PARAMETER (RETURNED)
C     B     -- WEIBULL SCALE PARAMETER (RETURNED)
C     NSAMP -- SAMPLE SIZE (INPUT)
C     NOBS  -- NUMBER OF OBSERVATIONS (INPUT)
C              (NSAMP-NOBS) VALUES ARE TYPE II CENSORED
C     X     -- DATA VALUES
C     AEPS  -- ABSOLUTE CONVERGENCE TOLERANCE (INPUT)
C     ITER  -- NUMBER OF ITERATIONS REQUIRED (RETURNED)
C     MAXIT -- MAXIMUM NUMBER OF ITERATIONS (INPUT)
C
      IMPLICIT DOUBLE PRECISION (A-H, O-Z)
      DIMENSION X (1), CVTBL (25), SHPTBL (25)
C
C     -- COEFFICIENTS OF VARIATION FOR MOMENT ESTIMATE OF
C     SHAPE PARAMETER
      DATA NTBL /25/
      DATA ZERO, ONE /0.D0, 1.D0/
C     [as in WBLMLE, the entries of CVTBL and SHPTBL are only
C      partly legible in the source listing]
C
C     -- SCALE THE DATA
      SCALE = X (1)
      DO 10 I = 1, NOBS
         IF (X (I) .GT. SCALE) SCALE = X (I)
   10 CONTINUE
      DO 20 I = 1, NOBS
         X (I) = X (I) / SCALE
   20 CONTINUE
      XPS = XP / SCALE
C
      XNSAMP = DBLE (NSAMP)
      XNOBS = DBLE (NOBS)
      EPS = AEPS
      IF (EPS .LE. ZERO) EPS = 1.D-5
      WP = -LOG (ONE - P)
C
C     -- USE THE COEFFICIENT OF VARIATION TO GET A METHOD OF
C     MOMENTS ESTIMATE OF THE SHAPE PARAMETER
      S1 = ZERO
      S = ZERO
      SS = ZERO
      DO 30 I = 1, NOBS
         S1 = S1 + LOG (X (I))
         S = S + X (I)
         SS = SS + X (I) * X (I)
   30 CONTINUE
      S1 = S1 / XNOBS
      CV = SQRT (XNOBS * SS - S * S) / S
      IF (CV .GE. CVTBL (1)) THEN
         A = SHPTBL (1)
      ELSE IF (CV .LE. CVTBL (NTBL)) THEN
         A = SHPTBL (NTBL)
      ELSE
         DO 40 I = 2, NTBL
            IF (CVTBL (I) .LE. CV) THEN
               A = SHPTBL (I-1) + (SHPTBL (I) - SHPTBL (I-1)) *
     &             (CVTBL (I-1) - CV) / (CVTBL (I-1) - CVTBL (I))
               GO TO 50
            END IF
   40    CONTINUE
      END IF
   50 CONTINUE
C
C     -- LOOP UNTIL CONVERGENCE OR ITERATION LIMIT EXCEEDED
      ITER = 0
   55 CONTINUE
C
C     -- CALCULATE SUMS FIRST
      S2 = ZERO
      S3 = ZERO
      DO 60 I = 1, NOBS
         T1 = (X (I) / XPS) ** A
         T2 = T1 * LOG (X (I) / XPS)
         S2 = S2 + T1
         S3 = S3 + T2
   60 CONTINUE
      S2 = S2 + T1 * (XNSAMP - XNOBS)
      S3 = S3 + T2 * (XNSAMP - XNOBS)
C
C     -- UPDATE THE CONSTRAINED SHAPE ESTIMATE
C     [the update expression for A1 is illegible in the source
C      listing]
C
C     -- CHECK FOR CONVERGENCE
      IF (ABS (A1 - A) .LE. EPS .OR. ITER .GE. MAXIT) GO TO 70
      ITER = ITER + 1
      A = A1
      GO TO 55
C
C     -- CALCULATE SCALE PARAMETER FROM SHAPE PARAMETER
   70 CONTINUE
      A = A1
      B = XPS / WP ** (ONE / A)
C
C     -- UNSCALE DATA
      DO 80 I = 1, NOBS
         X (I) = X (I) * SCALE
   80 CONTINUE
      B = B * SCALE
      RETURN
      END
MAKING FISHER'S EXACT TEST RELEVANT

Paul H. Thrasher
Engineering Branch
Reliability, Availability, and Maintainability Division
Army Materiel Test and Evaluation Directorate
White Sands Missile Range, New Mexico 88002-5175

ABSTRACT

The Fisher-Irwin Exact Method is made relevant by including q-values in
the analysis. Q-values are post-test Type II risks. They provide information
which complements the Type I risk provided by the p-value. Reporting both the
p-value and relevant q-values enables managers to base decisions on both types
of risks. For references on q-values, see four papers by Thrasher in the
Proceedings of the Thirtieth through Thirty-Third Conferences on the Design of
Experiments in Army Research, Development and Testing, U.S. Army Research
Office, 4300 South Miami Boulevard, Research Triangle Park, North Carolina
27709-2214.

Q-values are normally calculated using the same algorithm used to find
pre-test Type II errors. The q-value calculation inputs are normally (1) the
p-value instead of the pre-test Type I risk, (2) the sample size actually used
instead of the sample size planned, and (3) the same relevant values of the
parameter considered in the pre-test Type II risk calculation. Since the
Fisher-Irwin Exact Method doesn't historically have a design stage, there is
no pre-test algorithm available for modification. This paper develops the
necessary algorithm by extending the p-value calculation based on the binomial
rather than the hypergeometric distribution.

The q-value equations are developed and their mathematical properties are
examined. Computer programming methods are discussed. Examples are provided
for sample sizes both (1) small enough that a hand-held calculator can be used
and (2) large enough to require a digital computer. Numerical results are
interpreted from the viewpoint of a manager that must balance non-zero Type I
and Type II risks.

INTRODUCTION AND OBJECTIVE

The Fisher-Irwin Exact Method is a quick and straightforward technique of
comparing two samples of dichotomous items. The normally reported statistic
from this test is the p-value. The p-value is the probability of being wrong
in marginally rejecting a null hypothesis that the two samples are from one
population. In practice, managers conclude that the two samples are from
different populations if they believe the p-value is sufficiently low.

This method of analysis has not gained universal acceptance. The
reluctance to use this method may well be due to unbalanced reporting of
information. The p-value is used to report the Type-I error. This error is
sometimes called the producer's error, the contractor's error, or the error of
concern for the advocates of maintaining the status quo. Traditionally the

method has not reported information about the Type-II error. This error may
be called consumer's error, the government's error, or the error of concern
for the advocates of change.

The Fisher-Irwin Exact Method can provide relevant information about the
Type-II error. This additional information results from calculating and
reporting q-values. Q-values are the probabilities of being wrong in
marginally failing to reject the null hypothesis when the two samples are from
different populations. Since the two populations may differ in different
ways, there is a q-value for each pair of separate populations. Managers can
use a q-value, for a relevant pair of unequal populations, as evidence for
concluding that the two samples are from those different populations. They
reach this conclusion if they believe a relevant q-value is sufficiently high.

This paper provides equations for and examples of calculating (1) the
p-value and (2) q-values for the Fisher-Irwin Exact Method using a one-sided
analysis. This one-tailed analysis is used to reject a single population in
favor of two populations that differ in the direction indicated by the data.

This paper also discusses a digital computer program. This program has
been written to (1) handle the necessary voluminous calculations for large
sample sizes, (2) retain the analyst's identification of the two measurement
samples and the two mutually exclusive and exhaustive categories, and (3)
provide a report from which a manager can decide if future actions should be
based on one or two populations.

The Fisher-Irwin Exact Method may be implemented in different ways. At
the cost of redundancy, this paper uses more than one approach to illustrate
different viewpoints.

P-VALUE CALCULATION

The data for the Fisher-Irwin Exact Method, often called Fisher's Exact
Test, consists of four numbers. They and their sums are normally arranged in
a square array. The following array has double entries to illustrate both the
general situation and a specific example:

                            Category 1 = Success:  Category 2 = Failure:    Sum:

    Sample I  = Development:       r = 19               n - r = 2          n = 21
    Sample II = Production:      R - r = 12       (N-n) - (R-r) = 3      N - n = 15
    Total:                         R = 31              N - R = 5           N = 36

Since the choices of Samples I and II and of Categories 1 and 2 are both
arbitrary, there are four possible ways the data can be arranged. The
ambiguity has been removed from the above table by naming the samples and
categories to make (1) n >= (N-n) and (2) r/n >= (R-r)/(N-n).

There are two methods of calculating the p-value. The best known uses
the hypergeometric distribution. The second uses the binomial distribution.
Both are described and illustrated on pages 195-203 of Bradley, James V.,
Distribution-Free Statistical Tests, Prentice-Hall, Inc., 1968.

Hypergeometric Approach:

The hypergeometric approach is based on a population of N items which is
split into two samples of sizes n and N-n. The null hypothesis is that the
difference between the R and N-R items of the two categories did not influence
the sample selection. The probability of obtaining the data is the ratio of
(1) the number of ways items from one category can be chosen for the two
samples to (2) the number of ways items of this category can be chosen from
the total population. Thus

    P[Obtaining the Data] = (nCr)(N-nC(R-r)) / NCR

where iCj is the number of ways of choosing j items from i items. Both iCj
and iC(i-j) are found from the following relation of factorials:

    iCj = iC(i-j) = i! / [j! (i-j)!]

The p-value is the probability of obtaining the data or more extreme
partitions of the N-R items of Category 2. (More explicitly, the p-value for
the one-sided or one-tailed test is the probability that the partition of the
N-R items will be as unbalanced as the data in the direction that the data
suggests.) For the specific example, the p-value is

    p-value = (21C2)(15C3)/36C5 + (21C1)(15C4)/36C5 + (21C0)(15C5)/36C5

            = [21!/(2! 19!)] [15!/(3! 12!)] / [36!/(5! 31!)]
            + [21!/(1! 20!)] [15!/(4! 11!)] / [36!/(5! 31!)]
            + [21!/(0! 21!)] [15!/(5! 10!)] / [36!/(5! 31!)]

            = .253 + .076 + .008 = .34

A formal expression for the hypergeometric approach may be written in two
ways. Considering the possible distribution of the N-R items of Category 2
yields one of the two following equations:

if (1) n-r is less than expected because n-r < n[(N-R)/N] and (2) mI2 is
the minimum possible number of items in Sample I from Category 2,

    p-value = [ SUM from i = mI2 to n-r of (nCi)(N-nC(N-R-i)) ] / NC(N-R);

if (1) n-r is more than expected because n-r > n[(N-R)/N] and (2) MI2 is the
maximum possible number of items in Sample I from Category 2,

    p-value = [ SUM from i = n-r to MI2 of (nCi)(N-nC(N-R-i)) ] / NC(N-R).

By considering the distribution of the R items of Category 1 instead of the
N-R items of Category 2, this formal expression is written using two other
equations:

if (1) r is less than expected because r < n[R/N] and (2) mI1 is the minimum
possible number of items in Sample I from Category 1,

    p-value = [ SUM from i = mI1 to r of (nCi)(N-nC(R-i)) ] / NCR;

if (1) r is more than expected because r > n[R/N] and (2) MI1 is the maximum
possible number of items in Sample I from Category 1,

    p-value = [ SUM from i = r to MI1 of (nCi)(N-nC(R-i)) ] / NCR.

Binomial Approach:

The binomial approach is based on an infinite population from which two
independent samples are taken. Although the binomial parameter of the
distribution of Category 1 items may be estimated by R/N, it is really
unknown. Fortunately for the p-value calculation, this parameter will cancel
from the equations regardless of its value. Denoting this binomial parameter
by p allows the probabilities of obtaining the two samples to be written as

    P[r;n,p in Sample I] = nCr p^r (1-p)^(n-r)   and

    P[R-r;N-n,p in Sample II] = N-nC(R-r) p^(R-r) (1-p)^((N-n)-(R-r)).

With the restriction that these two samples are independent, the probability
of obtaining both samples is the product of the two above equations; this
reduces to

    P[Obtaining both Samples] = nCr N-nC(R-r) p^R (1-p)^(N-R).

The probability of obtaining one big sample of size N with R p-type
observations is

    P[R;N,p in Combined Sample] = NCR p^R (1-p)^(N-R).

Finally, the conditional probability of obtaining the two samples given that
the combined sample has been obtained is

    P[(r in n) and (R-r in N-n) | (R in N)] = P[Both Samples] / P[Combined Sample].

This equation is expressed in terms of the data by division of the two
previous equations. The result is

    P[Obtaining the Data] = nCr N-nC(R-r) p^R (1-p)^(N-R) / [NCR p^R (1-p)^(N-R)]

                          = nCr N-nC(R-r) / NCR.

This is the same equation as was obtained using the hypergeometric approach
and the rest of the calculation of the p-value proceeds identically.

The binomial equations in the above paragraph may be illustrated and
clarified by using (1) the data from this discussion's specific example and
(2) the point estimate of p given by R/N = 31/36 = .861. The result is

    P[19;21,.861 in Sample I] = 21C19 (.861)^19 (1-.861)^(21-19) = .236,

    P[12;15,.861 in Sample II] = 15C12 (.861)^12 (1-.861)^(15-12) = .203,

    P[Obtaining both Samples] = 21C19 15C12 (.861)^31 (1-.861)^(36-31)
                              = .0479 = (.236)(.203),

    P[31;36,.861 in Combined Sample] = 36C31 (.861)^31 (1-.861)^(36-31) = .189,

and by dividing equations

    P[(19 in 21) and (12 in 15) | (31 in 36)] = .0479/.189 = .253.

The value of .253 is obviously the same intermediate result as was obtained in
the hypergeometric approach. In fact, any value of p yields .253.

It is illustrative and useful to obtain P[31;36,.861 in Combined Sample]
without the assumption that all 36 items were selected from one population.
This is done by using the facts that (1) Sample I and Sample II were obtained
independently and (2) the 31 items in Category 1 could have been distributed
between the two samples in different ways. The calculation is summarized in
the following table. The starred row of this table corresponds to the data
and three intermediate results from the preceding paragraph.

    R   r'  R-r'  P[r';21,.861]  P[R-r';15,.861]  P[r';21,.861]P[R-r';15,.861]

    31  21  10        .0433          .0348                 .00151
    31  20  11        .147           .0981                 .0144
    31  19  12        .236           .203                  .0479  *
    31  18  13        .242           .290                  .0700
    31  17  14        .175           .257                  .0450
    31  16  15        .0961          .106                  .0102

The last column contains the probabilities of the different ways that the 31
items can be distributed. The sum of this column is the probability of having
31 items from Category 1 in the two samples. The value of .189 obviously
agrees with the shorter calculation in the preceding paragraph. A modified
version of the longer calculation of this paragraph will be needed in the
calculation of q-values.

Before calculating q-values, it is illustrative and useful to obtain the
p-value from the table of the preceding paragraph. Note first that the .0479
in the last column of the starred row corresponding to the data agrees with
P[Obtaining both Samples] from the short binomial calculation of two
paragraphs ago. Note second that the entries above .0479 in the table
correspond to probabilities of obtaining more unlikely partitions than the
data. Using these two facts yields

    p-value = P[Rejecting | Rejection Should Not Occur]

            = .00151/.189 + .0144/.189 + .0479/.189

            = .34.

This result of .34 does not depend on the number used for p. This may be
seen by (1) calculating another table using any p other than 31/36 = .861 and
(2) summing the probabilities of obtaining partitions as extreme as the data.
This last method of calculating the p-value emphasizes that the data are
viewed marginally. The data are viewed as unbalanced just enough to warrant
rejection of the single-population hypothesis.
Q-VALUE CALCULATION

Q-values, like the p-value, consider the data to be just sufficiently
unbalanced to warrant rejection of the single-population hypothesis. While
the p-value is the probability of getting results at least as unbalanced as
the data, q-values are the probabilities of more balanced results.

For Fisher's Exact Test, q-values cannot be calculated by using the
hypergeometric approach. All q-values are conditional probabilities with the
condition being that two different populations provided the two samples. Thus
q-values must be calculated by using the binomial approach with different
binomial parameters, pI and pII, for the populations of the two samples. If
desired, these two parameters may be replaced with pI and k where k = pI/pII.

Most q-value calculations are functions of only one parameter and lend
themselves to a two dimensional power curve representation. For the Fisher-
Irwin Exact Method, there are two parameters so the representation must take
the form of a three dimensional power surface. Any specific point on this
surface does not exhibit as much information as the entire surface. To be
specific in the following calculation however, pI and pII will be taken as the
point estimates from the data. That is, the following q-value calculation
will address the error of concluding that pI = pII when actually
pI = r/n = 19/21 = .905, pII = (R-r)/(N-n) = 12/15 = .800, and
k = pI/pII = .905/.800 = 1.13. This addresses the intuitive concern of
"making the mistake of ignoring what the data's trying to tell us."

Since a q-value for any specific pI and pII is the probability of falsely
retaining the assumption that one p describes all items, a q-value is one
minus the probability of rejecting the assumption of a single population when
there are two populations described by pI and pII. This can be calculated
from entries in a table of probabilities for all possible values of r'
consistent with N, n, and R.

The remainder of this section considers two approaches to the q-value
equation, shows that the two results are equivalent, and discusses some
general mathematical properties.
Category 1 Approach:

For the specific example in this discussion, one table used to calculate
a q-value for pI = 19/21 and pII = 12/15 is

    R   r'  R-r'  P[r';21,.905]  P[R-r';15,.800]  P[r';21,.905]P[R-r';15,.800]

    31  21  10        .122            .103                 .0126
    31  20  11        .270            .188                 .0507
    31  19  12        .284            .250                 .0711
    31  18  13        .190            .231                 .0438
    31  17  14        .0898           .132                 .0119
    31  16  15        .0321           .0352                .00113

The sum of .191 on the lower right represents the probability of obtaining a
total of 31 items from Category 1 in the two samples. This probability can
be divided into entries in the right hand column to find the conditional
probabilities of obtaining possible numbers of items from Category 1 in
Samples I and II. Taking the data and more extreme divisions of the R items
from Category 1 as evidence of rejection, the q-value is found from

    1 - q-value = P[Rejecting | Rejection Should Occur]

                = .0711/.191 + .0507/.191 + .0126/.191

                = .70.

Using a more conventional approach, the q-value can be found from less extreme
divisions of the R items from Category 1 to be

    q-value = P[Failing to Reject | Rejection Should Occur]

            = .0438/.191 + .0119/.191 + .00113/.191

            = .30.
This procedure may be stated formally with the following equation:

    q-value = [ SUM from r' = min(r') to r-1 of P[r';n,pI] P[R-r';N-n,pII] ]
              / [ SUM from r' = min(r') to max(r') of P[r';n,pI] P[R-r';N-n,pII] ]

where min(r') and max(r') are the minimum and maximum values of r' allowed by
the constraints imposed by fixed values of N, n, and R. Increasing of r' is
limited by the total size of both Sample I and Category 1. That is, r' must
simultaneously satisfy r' <= n and r' <= R. Thus the upper limit on the above
sum is

    max(r') = min(n,R).

Decreasing of r' is limited by the requirement that two measurements must be
non-negative. Possible measurements of Category 1 items in Sample I and
Category 2 items in Sample II lead to r' >= 0 and (N-n)-(R-r') >= 0, i.e.,
r' >= n+R-N. Considering the other two possible measurements leads to four
other conditions: R-r' <= R, R-r' <= N-n, n-r' <= n, and n-r' <= N-R. These
four conditions are equivalent to the first two. Thus the lower limit of the
above sum is

    min(r') = max(0,n+R-N).

The final form of the Category 1 equation is thus

    q-value = [ SUM from r' = max(0,n+R-N) to r-1 of P[r';n,pI] P[R-r';N-n,pII] ]
              / [ SUM from r' = max(0,n+R-N) to min(n,R) of P[r';n,pI] P[R-r';N-n,pII] ]
Category 2 Approach:

Using Category 2 instead of Category 1 leads to the following table:

    N-R  n-r'  (N-n)-(R-r')  P[n-r';21,2/21]  P[(N-n)-(R-r');15,3/15]  P[Both]

     5    0         5             .122                .103             .0126
     5    1         4             .270                .188             .0507
     5    2         3             .284                .250             .0711
     5    3         2             .190                .231             .0438
     5    4         1             .0898               .132             .0119
     5    5         0             .0321               .0352            .00113

The numbers in the right three columns are the same as in the previous table
used in the Category 1 approach. The calculation proceeds as before but the
indices in the summation appear differently:

    1 - q-value = [ SUM from n-r' = 0 to n-r of P[n-r';21,.0952] P[(N-n)-(R-r');15,.200] ]
                  / [ SUM over all n-r' of P[n-r';21,.0952] P[(N-n)-(R-r');15,.200] ]

                = .0126/.191 + .0507/.191 + .0711/.191 = .70   or

    q-value = [ SUM from n-r' = n-r+1 on up of P[n-r';21,.0952] P[(N-n)-(R-r');15,.200] ]
              / [ SUM over all n-r' of P[n-r';21,.0952] P[(N-n)-(R-r');15,.200] ]

            = .0438/.191 + .0119/.191 + .00113/.191 = .30.
This procedure may be stated formally with the following equation:

    q-value = [ SUM from n-r' = n-r+1 to max(n-r') of P[n-r';n,1-pI] P[(N-n)-(R-r');N-n,1-pII] ]
              / [ SUM from n-r' = min(n-r') to max(n-r') of P[n-r';n,1-pI] P[(N-n)-(R-r');N-n,1-pII] ]

where min(n-r') and max(n-r') are the minimum and maximum possible
measurements of Category 2 items in Sample I. By arguments similar to those
used in the Category 1 approach, n-r' <= n and n-r' <= N-R imply that

    max(n-r') = min(n,N-R)

and n-r' >= 0 and R-r' >= 0, i.e., n-r' >= n-R, imply that

    min(n-r') = max(0,n-R).

The use and interpretation of this result must of course be done in
conjunction with n >= (N-n) and r/n >= (R-r)/(N-n). Interpretation must
recognize that the arrangement of the data in a standard format may or may not
select Category 1 as the category of primary interest and/or Sample I as the
first sample drawn and/or tested.
Equivalency of Methods:

Although the equations from the Category 1 and Category 2 approaches
appear quite different, they are equivalent. By using the binomial
probability relation P[i;m,p] = P[m-i;m,1-p], the Category 2 equation may be
rewritten as

    q-value = [ SUM from n-r' = n-r+1 to min(n,N-R) of P[r';n,pI] P[R-r';N-n,pII] ]
              / [ SUM from n-r' = max(0,n-R) to min(n,N-R) of P[r';n,pI] P[R-r';N-n,pII] ]

The limits on the summations may be rewritten by using min(m,M) = -max(-m,-M),
k+max(m,M) = max(k+m,k+M), and k+min(m,M) = min(k+m,k+M) to yield

    q-value = [ SUM from r' = r-1 down to max(0,n+R-N) of P[r';n,pI] P[R-r';N-n,pII] ]
              / [ SUM from r' = min(n,R) down to max(0,n+R-N) of P[r';n,pI] P[R-r';N-n,pII] ]

By reversing the summation limits to correspond to the normal practice of
summing from low to high indices, this equation becomes the Category 1 result.

Range, Sum with P-Value, and Symmetry:

The q-value, like any other probability, is bounded by zero and one.
This is verified for the Category 1 equation by splitting the sum in the
denominator into two sums which first sum from r' = min(r') = max(0,n+R-N) to
r' = r-1 and then sum from r' = r to r' = max(r') = min(n,R). Dividing both
numerator and denominator by the sum in the numerator then yields

    q-value = 1 / ( 1 + [ SUM from r' = r to min(n,R) of P[r';n,pI] P[R-r';N-n,pII] ]
                        / [ SUM from r' = max(0,n+R-N) to r-1 of P[r';n,pI] P[R-r';N-n,pII] ] )

Since this equation's ratio of the two sums is never negative but it will be
infinite if the data yields r = max(0,n+R-N), q-value >= 0. Since this ratio
can be 0 by choosing pI and pII equal to 0 or 1, q-value <= 1.

When pI equals pII, the q-value is one minus the p-value. This occurs
because the possible values of r' are divided into two mutually exclusive and
exhaustive sets. One set contains possible measurements as unlikely or more
unlikely than r. The other contains values of r' more likely than r. These
two sets identify conditional probabilities that are summed to find the
p-value and q-value. The p-value summation uses the unlikely set with both pI
and pII equated to any common probability. The q-value summation uses the
likely set with any values of pI and pII. The mutual exclusiveness and
exhaustiveness of the two sets require that p-value + q-value = 1 when pI = pII.
For the Fisher-Irwin Exact Method, the q-value is symmetric about the off
diagonal in a plot of pI versus pII. That is, symmetry is expressed by

    q-value(N,n,R,r,pI,pII) = q-value(N,n,R,r,1-pII,1-pI).

This may be seen by applying the binomial equation P[i;m,p] = mCi p^i (1-p)^(m-i)
to the Category 1 equation in a series of three equations:

    q-value = [ SUM from r' = max(0,n+R-N) to r-1 of P[r';n,pI] P[R-r';N-n,pII] ]
              / [ SUM from r' = max(0,n+R-N) to min(n,R) of P[r';n,pI] P[R-r';N-n,pII] ]

            = [ SUM from r' = max(0,n+R-N) to r-1 of
                  nCr' pI^r' (1-pI)^(n-r') N-nC(R-r') pII^(R-r') (1-pII)^(N-n-R+r') ]
              / [ SUM from r' = max(0,n+R-N) to min(n,R) of the same terms ]

            = [ SUM from r' = max(0,n+R-N) to r-1 of
                  nCr' N-nC(R-r') [pI(1-pII)/((1-pI)pII)]^r' ]
              / [ SUM from r' = max(0,n+R-N) to min(n,R) of the same terms ]

where the factor with no r' dependence, (1-pI)^n pII^R (1-pII)^(N-n-R), has
been factored from each term of the sum over r' and canceled from the
equation. After canceling this factor, the symmetry is evident because
substituting 1-pII for pI and 1-pI for pII yields the same equation.
The q-value for the Fisher-Irwin Exact Method also is symmetric in n and
R. Applying iCj = i!/[j!(i-j)!] to the above equation and canceling n! and
(N-n)! yields

    q-value = [ SUM from r' = max(0,n+R-N) to r-1 of
                  [pI(1-pII)/((1-pI)pII)]^r' / (r'! (n-r')! (R-r')! (N-n-R+r')!) ]
              / [ SUM from r' = max(0,n+R-N) to min(n,R) of the same terms ]

Substitution of n for R and R for n yields the same equation. Thus symmetry
is expressed by

    q-value(N,n,R,r,pI,pII) = q-value(N,R,n,r,pI,pII).

This equation reflects the mathematical arbitrariness in identifying samples
and categories. The samples and categories normally are distinguished
physically; but they are interchangeable mathematically.

RECAPITULATION AND INTERPRETATION

The p-value and a relevant q-value can provide influencing factors for
management. If the p-value is lower than the risk allowed for the proponent
of a single population, management is inclined toward the decision that two
populations exist. If a relevant q-value is higher than the risk that the
proponent of two populations is willing to take, management is also inclined
toward the decision that two relevant populations exist. On the other hand, a
high p-value or low relevant q-value inclines management toward the decision
that there is one population.

Management will quite often be influenced by factors other than the
p-value and a relevant q-value. A subjective decision-making process will
naturally be used to consider all factors. The extremity of the lowness or
highness of the p-value and a q-value provides the subjective weight for these
two factors.

If management cannot determine threshold risks to indicate two
populations when the p-value is below and a q-value is above these thresholds,
an alternate approach is to compare the p-value and a q-value. Management can
set a threshold ratio of Type II to Type I risks and compare a ratio of
q-value/p-value to this threshold. Two populations are then indicated if a
ratio of q-value/p-value is too high. In a subjective decision-making process
considering many factors, the extremity of a q-value/p-value ratio provides
the subjective weight of the Fisher's Exact Test factor.

Management should determine which two populations are relevant. Factors
other than the data may suggest specific populations. Management should
consider a q-value for each and every pair of relevant populations. If the
analyst is not provided with the pI and pII for any relevantly different
populations, the report to management should include a table of q-values for a
wide range of pI and pII.

For the primary example in this discussion, .34 is the p-value and .30 is
a q-value for the two populations suggested by the data. If these two
populations with pI = .9 and pII = .8 are relevantly different, the two risks of
.34 and .30 provide the basis for action. If the existence of these two
populations is considered as positive, .34 is the probability of making a
false positive decision. Similarly, considering the existence of only one
population as negative implies that .30 is the probability of making a false
negative decision.

If .34 and .30 are believed sufficiently low and high for probabilities
of false positives and negatives respectively, future action is based on the
existence of two populations with pI and pII estimated by .9 and .8. If
.34 and .30 are believed sufficiently high and low, future action is based on
a single population.

For this example, .30/.34 = 1/1.1 = .9 is a ratio of q-value/p-value.
Subject to the relevancy of pI = .9 and pII = .8, 1/1.1 = .9 is the ratio of
risks of making false negative and false positive decisions. Future action is
based on two populations if 1/1.1 = .9 is believed sufficiently high.
Similarly, future action is based on a single population if 1/1.1 = .9 is
believed sufficiently low.

If the p-value and a q-value provide conflicting or indeterminate
indications that are unresolvable, the immediate future action is to do
additional testing. Additional testing should provide more definitive
information by yielding either a low p-value and a high q-value or a high
p-value and a low q-value. Naturally, increasing the sample sizes may not
yield a proportional increase in all the data; but if additional testing
actually doubled all the data in this paper's example, the results would be
.18 for the p-value, .35 for a q-value, and .35/.18 = 2 for a q-value/p-value
ratio corresponding to pI = .9 and pII = .8. This possible decrease in the
p-value, increase in a q-value, and increase in a ratio of q-value/p-value
would increase the tendency to base future actions on two populations.
COMPUTING METHODS AND RESULTS

A digital computer program has been written in Pascal/3000 to facilitate
the p-value and q-value analysis of the Fisher-Irwin Exact Method.

Two related manipulations are useful in extending the range of data which
yields q-values without computer overflows or underflows. The equation for
the q-value can be rewritten as

    q-value = [ SUM from r' = max(0,n+R-N) to r-1 of
                  C [pI(1-pII)/((1-pI)pII)]^r' / (r'! (n-r')! (R-r')! (N-n-R+r')!) ]
              / [ SUM from r' = max(0,n+R-N) to min(n,R) of the same terms ]

where C is any constant. The computer program can assign C a value which
hinders the summed terms from exceeding the computer's working range. To make
this assignment without overflowing or underflowing the computer, each term
must be considered as

    C [pI(1-pII)/((1-pI)pII)]^r' / (r'! (n-r')! (R-r')! (N-n-R+r')!) = exp[ln(TERMS)]

where the expression ln(TERMS) in the exponential is

    ln(TERMS) = ln[C]
              + r'[ln(pI) + ln(1-pII) - ln(1-pI) - ln(pII)]
              - ln[r'!] - ln[(n-r')!] - ln[(R-r')!] - ln[(N-n-R+r')!].

The constant C can be selected to keep ln(TERMS) within the computer's range
for x in exp(x) (e.g., -176 to 176). For any value of r', this selection can
then be used to force exp[ln(TERMS)] into the computer's operating range
(e.g., 8.6x10^-78 to 1.15x10^77). Naturally this programming technique is
successful only if r' doesn't change too much in the summation between
max(0,n+R-N) and min(n,R).
The range of computer calculations for the p-value can be extended by
using logarithms. One useful form of the p-value equation is

    p-value = SUM from i = v to w of exp[ln(nCx) + ln(N-nCy) - ln(NCz)]

where the factors v, w, x, y, and z are dependent on r, R, and N according to
the following table:

                      If R<(N-R)                     If R>(N-R)
    Factor     If R<n        If R>n        If (N-R)<(N-n)  If (N-R)>(N-n)

      v           r             r           (N-n)-(R-r)     (N-n)-(R-r)
      w           R             n              N-R             N-n
      x           i             i              N-R-i           N-R-i
      y          R-i           R-i              i               i
      z           R             R              N-R             N-R

The computer program operates from a terminal. At the start of the
program, the user selects either the terminal screen or a printer for the
program output. The information in Figure 1 then appears on the screen. This
provides the user with a brief summary of the analysis and asks the user which
four independent variables will be entered. Figure 2 provides an example of
the terminal screen after the output has been directed to a printer and the
user has chosen to enter N, n, R, and r. Figures 3a and 3b contain the output
for this input. Correspondingly, Figures 4 and 5 show input and output when
the user has entered r, n-r, R-r, and (N-n)-(R-r). Finally, Figure 6 shows an
example with both input and output on the terminal screen. For this example,
the input produces a standard table in a different order than the data.
Some results from the outputs in these figures (and similar computer
executions) are compiled in the following table. All possible measurements of
r for N=36, n=21, and R=31 are included. The tabulated q-values are
referenced to θ and φ instead of pI and pII. This is necessary because the
computer places the data in a standardly ordered table, sometimes making
pI = θ and pII = φ and sometimes resulting in pI = φ and pII = θ.

          point-estimates                    q-value for
          to replace .861
                                  point    θ=.80  θ=.85  θ=.87  θ=.90
     r   p-value    θ      φ     estimates φ=.90  φ=.87  φ=.85  φ=.80

    16    .054    .762   1.000     .000     .791   .925   .962   .993
    17    .292    .810    .933     .249     .381   .644   .766   .924
    18    .663    .857    .867     .306     .098   .274   .407   .682
    19    .337    .905    .800     .297     .902   .726   .593   .318
    20    .084    .952    .733     .241     .988   .941   .884   .690
    21    .008   1.000    .667     .000     .999   .995   .987   .941

This table emphasizes that management needs to determine relevantly different
populations instead of just considering the point-estimates suggested by the
data. As expected, the extreme r measurements of 16 and 21 lead to low
p-values indicating two populations. The two populations indicated, however,
are not those suggested by the point-estimates of θ and φ. (θ and φ are the
binomial probabilities that describe the two populations; they replace the
common-population point-estimate of 31/36 = .861.) The q-values for those
point-estimates are identically zero. Although every manager is free to
determine how high a q-value needs to be for a two population decision, these
are low by any standard. Management must realize that no amount of testing
can prove that anything is either completely perfect or worthless. Instead,
more reasonable values of θ and φ must be considered. If θ=.8 and φ=.9 are
considered for r=16 (or θ=.9 and φ=.8 for r=21), the q-value of .791 (or .941)
is quite high. Even higher q-values are obtained when θ=.85 and φ=.87 are
considered for r=16 (or θ=.87 and φ=.85 for r=21). This reflects the fact
that it's easier to say two things are different if they don't have to be very
different. Considering possible populations for which θ and φ are on opposite
sides of the common-population point-estimate of 31/36 = .861 from the point-
estimates (e.g., considering θ=.9 and φ=.8 for r=16 or θ=.8 and φ=.9 for r=21)
leads to very high q-values. This reflects the compound fact that (1)
obtaining data biased in the opposite direction from two existing populations
is extremely unlikely, so (2) the existence of such data strongly implies more
than one population. Less extreme but similar results are obtained for r
measurements of 17 and 20. An r measurement of 18 indicates one population
unless management is concerned about very extreme alternate populations.
Finally, an r measurement of 19 indicates two populations if management is
careful about what those two populations are.

The dependence of both the p-value and q-value on the number of
measurements is illustrated by the following example. Four measurements are
assumed to yield values of N, n, R, and r given by {N1,N2,N3,N4} =
{20,40,80,160}, {n1,n2,n3,n4} = {10,20,40,80}, {R1,R2,R3,R4} = {17,34,68,136},
and {r1,r2,r3,r4} = {9,18,36,72}. The second, third and fourth measurements
are just multiples of the first. All four of these hypothetical measurements
provide point estimates of θ and φ of 9/10 = .9 and (17-9)/10 = .8 respectively.
The interior of the following table contains sets of q-values from four
executions of the computer program. Each set has the q-value for the smallest
sample size first and the largest last.

    θ \ φ          .76                   .80                   .84

     .86    .279,.365,.447,.526   .352,.472,.595,.721   .444,.599,.752, ...
     .90    .180,.217,.232,.221   .237,.303,.356,.397   .316,.421,.525,.633
     .94    .084,.082,.059,.027   .118,.127,.111,.074   .169,.201,.209,.192

The corresponding set of p-values is {.500,.331,.174,.060}. Note (A) that
increasing the sample size decreases the p-value and increases the q-value for
θ=.9 and φ=.8. Thus increasing the sample size, if the data remains
proportionate, increases the justification for deciding that two populations
yielded the two samples. Note also (B) that the p-value has a more pronounced
change than the q-values. Thus the p-value is more sensitive than the
q-values. An unusually high q-value thus has at least as much significance as
an unusually low p-value. Note finally (C) that increasing the sample size
when θ=.90 & φ=.76, θ=.94 & φ=.76, θ=.94 & φ=.80, and θ=.94 & φ=.84 eventually
leads to a decrease in the q-value. This corresponds to universal measurement
implying exact results. (This large-measurement effect does not occur for
θ=.86 & φ=.76, θ=.86 & φ=.80, θ=.86 & φ=.84, and θ=.90 & φ=.84 because they
are on the opposite side of the point estimate from the extreme identified by
the alternate hypothesis.)

The program is designed so the user can keep track of a sample of
prominence and a category of interest. This enables the user to enter and
analyze management's relevantly different populations. For example, consider
the hypothetical case analyzed in Figures 7a and 7b. Suppose that a field-
fired missile is being developed. Enough tests have been made on the initial
design to obtain 107 hits and 14 misses. A set of shorter missile fins is
proposed to make the field-assembly faster. A short series of tests on the
short-fin version yields 11 hits and 3 misses. The short-fin test is
prominent in the mind of the missile designer; the short fins should not be
used if they significantly degrade the missile's performance. Figures 7a and
7b contain the input and terminal-screen output of an analysis using the
Fisher-Irwin Exact Method. The first entry into the computer, 11, identifies
the short-fin test as the sample of prominence and hits as the category of
interest. The p-value, .249, is somewhat low but the advocates of fast
assembly with short fins might claim that .249 is not close enough to zero to
warrant the conclusion that short fins have degraded the missile's accuracy.
The q-value for θ = 11/14 = .786 and φ = 107/121 = .884 is .375. That is
slightly higher than the p-value but it might not be large enough to justify
not using the short fins. If management sets the desired requirement at
θ = .900 and decrees that φ = .850 is an unacceptable accuracy rate, the last
table in Figure 7b provides a basis for decision. The q-value for θ = .900
and φ = .850 is .525. Since this is twice the p-value, management has a
fairly strong basis for not using the short fins. If management leaves the
desired requirement at .900 and raises the unacceptable level to .890, the
q-value increases to .706. The argument for rejecting the short fins is thus
quite strong if .890 is really an unacceptable accuracy rate.
SUMMARY

The p-value and q-value analysis of the Fisher-Irwin Exact Method has
been developed. The p-value equation has been derived using two techniques:
hypergeometric and binomial. The binomial technique has been extended to
yield a q-value equation. This equation has been derived from two sources:
possible Category 1 measurements and possible Category 2 measurements.
This q-value equation has been shown to possess mathematical symmetry. The
q-value for pI = pII has been shown to equal one minus the p-value; this was
predestined for the Fisher-Irwin Exact Method because it is a general property
of the p-value and q-values. A computer program has been written. This makes
the analysis practical. Analysts can perform voluminous calculations without
approximations. Managers can consider the relative sizes and importance of
the p-value and relevant q-values. Managers can decide if the two samples are
from one population or from two populations differing either (1) from the
combined point estimate of the population or (2) according to (A) a desired
population or standard and (B) an unacceptable population. Computer generated
reports have been provided for communication between analysts and managers.
The development of the p-value and q-value analysis of the Fisher-Irwin Exact
Method has reached the stage of implementation.
CONCLUSION
The analyst has a responsibility to report all information influencing
the decision. This information should be in a form that can be understood and
used by the decision-maker. Reporting the p-value and relevant q-values
satisfies both of these conditions. The p-value and q-values provide the
decision-maker with estimates of the risks of making wrong decisions. This
makes Fisher's Exact Test relevant.

Greetings! Welcome to a computerized Fisher-Irwin Exact Test Analysis.
Two independent samples are initially assumed to be from a single population.
This assumption is rejected and the two samples are considered to represent
two statistically different populations if management reaches two conclusions:
1) The p-value is deemed sufficiently low and
2) A q-value for relevantly different populations is deemed sufficiently high.
The q-value is the probability of falsely deciding that two populations exist.
A q-value for two relevantly different populations is the probability of
falsely deciding that those two populations are one population.
This computerized analysis does a one sided test in the direction indicated by
the data. It requires four numerical inputs determining nine numbers:

             Category One (e.g. Success):  Category Two (e.g. Failure):  Sum:

Sample One:             r                           n-r                    n
Sample Two:            R-r                     (N-n)-(R-r)               N-n
Total:                  R                          N-R                    N

Data may be entered in two ways. The theoretical-statistician approach uses
"N, n, R, & r". The reliability-engineer uses "r, n-r, R-r, & (N-n)-(R-r)".
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S"

Figure 1. Terminal screen at program initiation.

ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" s
ENTER SIZE OF POPULATION "N" 36
ENTER SIZE OF SAMPLE OF PROMINENCE "n" 21
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN POPULATION "R" 31
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN SAMPLE OF PROMINENCE "r" 19

ENTER "T" FOR TABLE OF Q-VALUES, "ANYTHING ELSE" TO SKIP TABLE t

ENTER "C" FOR CLOSE LOOK AT Q-VALUE TABLE IN DATA SUGGESTED REGION,
"ANYTHING ELSE" TO SKIP c

ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE θ = 19 / 21 = 0.905) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE φ = 12 / 15 = 0.800),
"ANYTHING ELSE" TO SKIP m
ENTER "θ" .9
ENTER "φ" .8

ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE θ = 19 / 21 = 0.905) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE φ = 12 / 15 = 0.800),
"ANYTHING ELSE" TO SKIP m
ENTER "θ" .87
ENTER "φ" .85

ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE θ = 19 / 21 = 0.905) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE φ = 12 / 15 = 0.800),
"ANYTHING ELSE" TO SKIP m
ENTER "θ" 1
ENTER "φ" .7

ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE θ = 19 / 21 = 0.905) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE φ = 12 / 15 = 0.800),
"ANYTHING ELSE" TO SKIP skip
END OF PROGRAM

Figure 2. Sample of theoretical-statistician input.

ANALYSIS OF FISHER'S EXACT TEST

In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample I and Category 1.

                  Category 1:           Category 2:
    Sample I:      r = 19               n - r = 2             n = 21
    Sample II:     R - r = 12           (N-n) - (R-r) = 3     N - n = 15
                   R = 31               N - R = 5             N = 36

For this data, the post-test risk of a Type I error is p-value = 0.337.
For this data's two suggested binomial parameters of the category of interest
(i.e. θ = 19 / 21 = 0.905 and φ = 12 / 15 = 0.800), the post-test risk
of a Type II error is q-value = 0.297.
For other binomial parameters of the category of interest, q-values may be
estimated from the following table:

    θ\φ   0.050 0.150 0.250 0.350 0.450 0.550 0.650 0.750 0.850 0.950
    0.050 0.663 0.956 0.990 0.997 0.999 1.000 1.000 1.000 1.000 1.000
    0.150 0.182 0.663 0.866 0.946 0.978 0.992 0.997 0.999 1.000 1.000
    0.250 0.058 0.389 0.663 0.827 0.917 0.963 0.986 0.996 0.999 1.000
    0.350 0.020 0.210 0.456 0.663 0.809 0.903 0.958 0.986 0.997 1.000
    0.450 0.007 0.105 0.285 0.483 0.663 0.804 0.903 0.963 0.992 1.000
    0.550 0.003 0.048 0.158 0.314 0.491 0.663 0.809 0.917 0.978 0.999
    0.650 0.001 0.019 0.075 0.174 0.314 0.483 0.663 0.827 0.946 0.997
    0.750 0.000 0.006 0.027 0.075 0.158 0.285 0.456 0.663 0.866 0.990
    0.850 0.000 0.001 0.006 0.019 0.048 0.106 0.210 0.389 0.663 0.956
    0.950 0.000 0.000 0.000 0.001 0.003 0.007 0.020 0.058 0.182 0.663

For binomial parameters of the category of interest near those indicated by
the data, q-values may be estimated from the following table:

    θ\φ   0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
    0.805 0.418 0.460 0.504 0.551 0.600 0.650 0.702 0.753 0.803 0.852 0.896
    0.825 0.362 0.402 0.446 0.493 0.543 0.595 0.649 0.705 0.761 0.816 0.869
    0.845 0.304 0.342 0.384 0.429 0.479 0.532 0.589 0.648 0.710 0.772 0.833
    0.865 0.245 0.280 0.318 0.361 0.409 0.461 0.518 0.580 0.646 0.716 0.786
    0.885 0.187 0.217 0.251 0.289 0.333 0.382 0.438 0.500 0.569 0.644 0.724
    0.905 0.132 0.156 0.183 0.215 0.253 0.297 0.348 0.407 0.476 0.553 0.641
    0.925 0.083 0.100 0.120 0.144 0.173 0.208 0.251 0.302 0.365 0.440 0.530
    0.945 0.043 0.052 0.064 0.079 0.098 0.122 0.152 0.190 0.240 0.304 0.386
    0.965 0.015 0.018 0.023 0.030 0.038 0.049 0.064 0.085 0.113 0.153 0.211
    0.985 0.002 0.002 0.003 0.004 0.005 0.007 0.009 0.013 0.019 0.028 0.044

Figure 3a. First half of printer output from Figure 2 input.

For binomial parameters of the category of interest near those indicated by

management, q-values may be estimated from the following table:

e\ 0 0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.800 0.431 0.473 0.518 0.564 0.613 0.663 0.713 0.763 0.812 0.859 0.902
0.820 0.375 0.416 0.460 0.507 0.557 0.609 0.663 0.717 0.772 0.826 0.876
0.840 0.318 0.357 0.399 0.445 0.495 0.548 0.604 0.663 0.723 0.784 0.842
0.860 0.259 0.295 0.334 0.378 0.426 0.479 0.536 0.597 0.663 0.730 0.798
0.880 0.201 0.2.32 0.267 0.306 0.351 0.402 0.458 0.520 0.589 0.663 0.740
0.900 0.145 0.170 0.199 0.233 0.272 0.318 0.370 0.431 0.499 0.577 0.663
0.920 0.094 0.112 0.134 0.160 0.192 0.229 0.274 0.328 0.393 0.469 0.559
0.940 0.051 0.062 0.0760.093 0.115 0.141 0.175 0.217 0.271 0.338 0.424
0.960 0.020 0,025 0.031 0.040 0.060 0.064 0.083 0.108 0.142 0,189 0.255
0.980 0.003 0.004 0.006 0.007 0.010 0.013 0.018 0.025 0.035 0.052 0.078
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:
θ\φ   0.750 0.770 0.790 0.810 0.830 0.850 0.870 0.890 0.910 0.930 0.950
0.770 0.617 0.663 0.708 0.753 0.797 0.839 0.879 0.914 0.945 0.969 0.987
0.790 0.568 0.615 0.663 0.711 0.759 0.807 0.852 0.893 0.930 0.960 0.982
0.810 0.513 0.561 0.611 0.663 0.715 0.767 0.819 0.867 0.911 0.948 0.976
0.830 0.453 0.501 0.553 0.606 0.663 0.720 0.777 0.834 0.886 0.932 0.968
0.850 0.389 0.436 0.487 0.542 0.601 0.663 0.726 0.790 0.853 0.909 0.956
0.870 0.321 0.365 0.415 0.469 0.529 0.593 0.663 0.735 0.808 0.877 0.937
0.890 0.251 0.290 0.336 0.387 0.445 0.511 0.583 0.663 0.746 0.831 0.909
0.910 0.181 0.213 0.252 0.297 0.351 0.413 0.486 0.569 0.663 0.764 0.865
0.930 0.114 0.138 0.168 0.204 0.248 0.302 0.368 0.449 0.547 0.663 0.791
0.950 0.058 0.072 0.090 0.113 0.143 0.182 0.233 0.301 0.391 0.510 0.663
0.970 0.018 0.023 0.030 0.040 0.053 0.071 0.098 0.137 0.196 0.288 0.434

For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:

θ\φ   0.600 0.620 0.640 0.660 0.680 0.700 0.720 0.740 0.760 0.780 0.800
0.900 0.064 0.076 0.090 0.105 0.124 0.145 0.170 0.199 0.233 0.272 0.318
0.920 0.039 0.046 0.055 0.066 0.079 0.094 0.112 0.134 0.160 0.192 0.229
0.940 0.019 0.023 0.028 0.035 0.042 0.051 0.062 0.076 0.093 0.115 0.141
0.960 0.007 0.008 0.010 0.013 0.016 0.020 0.025 0.031 0.040 0.050 0.064
0.980 0.001 0.001 0.002 0.002 0.003 0.003 0.004 0.006 0.007 0.010 0.013
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

Figure 3b. Second half of printer output from Figure 2 input.

353
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" r
ENTER "TEST ONE NUMBER OF SUCCESSES" 20
ENTER "TEST ONE NUMBER OF FAILURES" 1
ENTER "TEST TWO NUMBER OF SUCCESSES" 11
ENTER "TEST TWO NUMBER OF FAILURES" 4
ENTER "T" FOR TABLE OF Q-VALUES, "ANYTHING ELSE" TO SKIP TABLE skip
ENTER "01 FOR CLOSE LOOK AT Q.-VALUE TABLE IN DATA SUGGESTED REGION,
"ANYTHING ELSE" TO SKIP skip
ENTER "N" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE e m 20 / 21 w 0.952) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE 11 / 15 - 0.733),
"ANYTHING ELSE" TO SKIP m

ENTER "e" .9
ENTER 11
" .8
ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE θ = 20 / 21 = 0.952) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE φ = 11 / 15 = 0.733),
"ANYTHING ELSE" TO SKIP m
ENTER "θ" 1
ENTER "φ" .7
ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE θ = 20 / 21 = 0.952) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE φ = 11 / 15 = 0.733),
"ANYTHING ELSE" TO SKIP skip

END OF PROGRAM

Figure 4. Sample of reliability-engineer input.

354
ANALYSIS OF FISHER'S EXACT TEST
Although Test One, Test Two, Successes, and Failures may be interchanged
several ways mathematically, they have physical identities. To utilize these
identities, [A] Test One (i.e. the test with 20 Successes and 1 Failure) is
taken as the sample of prominence (i.e. it is considered physically more
important than Test Two) and [B] Successes define the category of interest
(i.e. the most natural description of a test result is considered to be
Success instead of Failure).
In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample I and Category 1.
              Category 1:            Category 2:
Sample I:     r = 20                 n - r = 1                 n = 21
Sample II:    R - r = 11             (N - n) - (R - r) = 4     N - n = 15
              R = 31                 N - R = 5                 N = 36
For this data, the post-test risk of a Type I error is p-value = 0.084.
For this data's two suggested binomial parameters of the category of interest
(i.e. θ = 20 / 21 = 0.952 and φ = 11 / 15 = 0.733), the post-test risk of
a Type II error is q-value = 0.241.
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:

θ\φ   0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.800 0.785 0.815 0.843 0.869 0.894 0.916 0.936 0.953 0.968 0.979 0.988
0.820 0.741 0.774 0.806 0.837 0.865 0.892 0.916 0.937 0.956 0.971 0.983
0.840 0.690 0.726 0.761 0.795 0.829 0.860 0.889 0.916 0.939 0.959 0.975
0.860 0.628 0.667 0.705 0.744 0.782 0.818 0.853 0.886 0.916 0.942 0.964
0.880 0.556 0.596 0.637 0.679 0.721 0.763 0.804 0.844 0.882 0.916 0.945
0.900 0.471 0.511 0.553 0.597 0.643 0.690 0.737 0.785 0.832 0.876 0.916
0.920 0.374 0.411 0.452 0.496 0.543 0.592 0.645 0.700 0.756 0.812 0.867
0.940 0.265 0.297 0.333 0.372 0.416 0.465 0.518 0.577 0.641 0.709 0.780
0.960 0.152 0.174 0.199 0.229 0.263 0.302 0.349 0.403 0.466 0.539 0.623
0.980 0.050 0.059 0.070 0.083 0.099 0.118 0.143 0.174 0.214 0.267 0.337
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:

θ\φ   0.600 0.620 0.640 0.660 0.680 0.700 0.720 0.740 0.760 0.780 0.800
0.900 0.302 0.332 0.363 0.397 0.433 0.471 0.511 0.553 0.597 0.643 0.690
0.920 0.225 0.250 0.277 0.307 0.339 0.374 0.411 0.452 0.496 0.543 0.592
0.940 0.149 0.168 0.188 0.211 0.237 0.265 0.297 0.333 0.372 0.416 0.465
0.960 0.079 0.090 0.102 0.117 0.133 0.152 0.174 0.199 0.229 0.263 0.302
0.980 0.024 0.027 0.032 0.037 0.043 0.050 0.059 0.070 0.083 0.099 0.118
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

Figure 5. Printer output from Figure 4 input.

355
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" S
ENTER SIZE OF POPULATION "N" 36
ENTER SIZE OF SAMPLE OF PROMINENCE "n" 21
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN POPULATION "R" 31
ENTER # OF ITEMS FROM CATEGORY OF INTEREST IN SAMPLE OF PROMINENCE "r" 18
ANALYSIS OF FISHER'S EXACT TEST
In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample I and Category 2.

              Category 1:            Category 2:
Sample I:     r = 3                  n - r = 18                n = 21
Sample II:    R - r = 2              (N - n) - (R - r) = 13    N - n = 15
              R = 5                  N - R = 31                N = 36
For this data, the post-test risk of a Type I error is p-value = 0.663.
For this data's two suggested binomial parameters of the category of interest
(i.e. θ = 18 / 21 = 0.857 and φ = 13 / 15 = 0.867), the post-test risk
of a Type II error is q-value = 0.306.
ENTER "T" FOR TABLE OF Q-VALUES, "ANYTHING ELSE" TO SKIP TABLE skip
ENTER "C" FOR CLOSE LOOK AT Q-VALUE TABLE IN DATA SUGGESTED REGION9
"ANYTHING ELSE" TO SKIP skip
ENTER "If'FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE o a 18 / 21 a 0.857) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE * = 13 / 15 = 0.867),
"ANYTHING ELSE" TO SKIP m

ENTER "e" .8
ENTER "" .9
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:

φ\θ   0.700 0.720 0.740 0.760 0.780 0.800 0.820 0.840 0.860 0.880 0.900
0.800 0.158 0.184 0.215 0.250 0.291 0.337 0.391 0.452 0.521 0.598 0.682
0.820 0.127 0.149 0.176 0.207 0.244 0.287 0.337 0.396 0.464 0.542 0.630
0.840 0.098 0.117 0.139 0.166 0.198 0.237 0.283 0.337 0.403 0.480 0.569
0.860 0.073 0.088 0.106 0.128 0.155 0.188 0.228 0.277 0.337 0.411 0.501
0.880 0.051 0.062 0.076 0.093 0.114 0.141 0.174 0.216 0.270 0.337 0.423
0.900 0.033 0.041 0.050 0.063 0.078 0.098 0.124 0.158 0.202 0.260 0.337
0.920 0.019 0.024 0.030 0.037 0.048 0.061 0.079 0.103 0.136 0.182 0.246
0.940 0.009 0.011 0.014 0.019 0.024 0.032 0.042 0.057 0.077 0.108 0.154
0.960 0.003 0.004 0.005 0.007 0.009 0.012 0.016 0.022 0.032 0.047 0.071
0.980 0.000 0.001 0.001 0.001 0.001 0.002 0.003 0.004 0.006 0.009 0.015
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Figure 6. Input and screen output in case that alters order in table.

356
ENTER RELIABILITY-ENGINEER/THEORETICAL-STATISTICIAN APPROACH "R/S" r

ENTER "TEST ONE NUMBER OF SUCCESSES" 11


ENTER "TEST ONE NUMBER OF FAILURES" 3
ENTER "TEST TWO NUMBER OF SUCCESSES" 107
ENTER "TEST TWO NUMBER OF FAILURES" 14
ANALYSIS OF FISHER'S EXACT TEST

Although Test One, Test Two, Successes, and Failures may be interchanged
several ways mathematically, they have physical identities. To utilize these
identities, [A] Test One (i.e. the test with 11 Successes and 3 Failures)
is taken as the sample of prominence (i.e. it is considered physically more
important than Test Two) and [B] Successes define the category of interest
(i.e. the most natural description of a test result is considered to be
Success instead of Failure).

In the following standardly ordered table, the sample of prominence and
category of interest are identified by the user as Sample II and Category 1.
              Category 1:            Category 2:
Sample I:     r = 107                n - r = 14                n = 121
Sample II:    R - r = 11             (N - n) - (R - r) = 3     N - n = 14
              R = 118                N - R = 17                N = 135
For this data, the post-test risk of a Type I error is p-value = 0.249.
For this data's two suggested binomial parameters of the category of interest
(i.e. φ = 11 / 14 = 0.786 and θ = 107 / 121 = 0.884), the post-test risk of
a Type II error is q-value = 0.375.
ENTER "T" FOR TABLE OF Q-VALUES, "ANYTHING ELSE" TO SKIP TABLE t

For other binomial parameters of the category of interest, q-values may be
estimated from the following table:
θ\φ   0.050 0.150 0.250 0.350 0.450 0.550 0.650 0.750 0.850 0.950
0.050 0.751 0.981 0.996 0.999 1.000 1.000 1.000 1.000 1.000 1.000
0.150 0.154 0.751 0.926 0.976 0.992 0.997 0.999 1.000 1.000 1.000
0.250 0.022 0.429 0.751 0.897 0.959 0.985 0.995 0.999 1.000 1.000
0.350 0.003 0.190 0.516 0.751 0.884 0.951 0.982 0.995 0.999 1.000
0.450 0.000 0.064 0.289 0.549 0.751 0.880 0.951 0.985 0.997 1.000
0.550 0.000 0.016 0.124 0.329 0.559 0.751 0.884 0.959 0.992 1.000
0.650 0.000 0.002 0.035 0.144 0.329 0.549 0.751 0.897 0.976 0.999
0.750 0.000 0.000 0.005 0.035 0.124 0.289 0.516 0.751 0.926 0.996
0.850 0.000 0.000 0.000 0.002 0.016 0.064 0.190 0.429 0.751 0.981
0.950 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.022 0.154 0.751
Figure 7a. First half of reliability-engineer input and screen output
for hypothetical missile modification analysis.

357
ENTER "C" FOR CLOSE LOOK AT Q-VALUETABLE IN DATA SUGGESTED REGION,
"ANYTHING ELSE" TO SKIP c
For binomial parameters of the category of interest near those indicated by
the data, q-values may be estimated from the following table:

θ\φ   0.686 0.706 0.726 0.746 0.766 0.786 0.806 0.826 0.846 0.866 0.886
0.784 0.499 0.550 0.603 0.655 0.706 0.755 0.801 0.844 0.882 0.916 0.943
0.804 0.430 0.483 0.537 0.593 0.648 0.702 0.755 0.805 0.850 0.891 0.926
0.824 0.358 0.409 0.464 0.521 0.580 0.639 0.698 0.755 0.809 0.859 0.902
0.844 0.283 0.331 0.384 0.441 0.501 0.564 0.628 0.693 0.756 0.815 0.869
0.864 0.208 0.251 0.299 0.353 0.412 0.476 0.544 0.614 0.686 0.756 0.823
0.884 0.138 0.173 0.213 0.260 0.314 0.375 0.443 0.517 0.595 0.676 0.757
0.904 0.079 0.103 0.132 0.168 0.212 0.265 0.327 0.399 0.480 0.569 0.663
0.924 0.035 0.048 0.065 0.088 0.118 0.156 0.205 0.265 0.339 0.427 0.530
0.944 0.010 0.014 0.021 0.031 0.045 0.065 0.093 0.131 0.184 0.255 0.349
0.964 0.001 0.002 0.003 0.005 0.007 0.012 0.020 0.033 0.053 0.087 0.140
0.984 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.002 0.003 0.008
ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE
THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE φ = 11 / 14 = 0.786) AND THE
SAMPLE OF NON-PROMINENCE (I.E. REPLACE θ = 107 / 121 = 0.884),
"ANYTHING ELSE" TO SKIP m

ENTER "φ" .85
ENTER "θ" .9
For binomial parameters of the category of interest near those indicated by
management, q-values may be estimated from the following table:

θ\φ   0.750 0.770 0.790 0.810 0.830 0.850 0.870 0.890 0.910 0.930 0.950
0.800 0.618 0.673 0.726 0.776 0.824 0.867 0.905 0.937 0.962 0.980 0.992
0.820 0.550 0.608 0.666 0.724 0.778 0.830 0.876 0.916 0.948 0.973 0.989
0.840 0.471 0.532 0.595 0.658 0.721 0.781 0.837 0.887 0.929 0.962 0.984
0.860 0.384 0.445 0.510 0.578 0.648 0.717 0.785 0.847 0.901 0.945 0.976
0.880 0.291 0.348 0.411 0.481 0.556 0.634 0.712 0.789 0.860 0.919 0.964
0.900 0.197 0.245 0.302 0.368 0.442 0.525 0.614 0.706 0.796 0.877 0.942
0.920 0.110 0.145 0.189 0.243 0.310 0.389 0.482 0.585 0.696 0.805 0.902
0.940 0.044 0.063 0.088 0.123 0.170 0.232 0.313 0.415 0.539 0.679 0.822
0.960 0.009 0.014 0.022 0.034 0.054 0.084 0.131 0.201 0.306 0.454 0.646
0.980 0.000 0.000 0.001 0.002 0.003 0.006 0.013 0.026 0.056 0.122 0.265
1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

ENTER "M" FOR MANAGEMENT INDICATED BINOMIAL PARAMETERS TO REPLACE


THE DATA INDICATED PARAMETERS OF THE CATEGORY OF INTEREST IN THE
SAMPLE OF PROMINENCE (I.E. REPLACE • 11 / 14 - 0.786) AND THE
a
SAMPLE OF NON-PROMENENCE (I.E. REPLACE e - 107 / 121 - 0.884),
"ANYTHING ELSE" TO SKIP skip

END OF PROGRAM

Figure 7b. Second half of hypothetical missile modification analysis.

358
ATTENDANCE LIST
FOR THE
TUTORIAL
17-18 OCTOBER 1988
AND THE
THIRTY-FOURTH CONFERENCE ON THE DESIGN OF EXPERIMENTS IN ARMY
RESEARCH, DEVELOPMENT AND TESTING
19-21 OCTOBER 1988

TUTORIAL (T)
CONFERENCE (C)

C LANUR, Rate L.
Department of Statistics
University of South Carolina
Columbia, SC 29208 (803) 777-7800

C WALHZR, Paul H.
Reliability Division
TE-RE
White Sands Missile Range
New Mexico 88002 (505) 678-6177

T,C WMF'P, Weston C.
Flight Safety
NR-CF
White Sands Missile Range
New Mexico 88002 (505) 678-2205

T,C BLANCO, Abel J.
Atmospheric Sciences Lab
SLCAS-AE-E
White Sands Missile Range
New Mexico 88002 (505) 678-3924

T,C N tL, IZvi J.
Div. 6415
Sandia National Labs
PO Box 5800
Albuquerque, New Mexico 87185 (505) 844-4208

C TE, DeorIis
STP-Mr-TA-A
Yuma Proving Ground (AV) 899-3251

C CONOVER, W.J.
College of Bus. Admin
Texas Tech University
Lubbock, TX 79409 (806) 742-1546

359
C DAVID, H.A.
Iowa State University
Department of Statistics
ISU, Ames, IA 50011 (515) 294-7749

T,C DOYLE, Mary
USA TROSCOM
Product Assurance
4300 Goodfellow
St. Louis, MO 63120-1798 (314) 263-9468

C HOCKING, Ron
Texas A&M
Department of Statistics
College Station, TX 77843 (409) 845-3151

C ZACKS, S.
Binghamton Center
Department of Math Sciences
SUNY, Binghamton, NY 13901 (607) 777-2619

T,C BRANDON, Dennis L.
U.S. Army Corps of Engineers
Waterways Experiment Station
P.O. Box 631
Vicksburg, MS 39180-0631 (601) 634-2807

T,C STRATTON, Willard F.
USA Materiel Readiness
Support Activity
Lexington, KY 40511-5101 (606) 293-4174

C CHANDRA, Jagdish
U.S. Army Research Office
Mathematical Sciences Division
P.O. Box 12211
Research Triangle Park
NC 27709 (919) 549-0641

C ESSENWANGER, Dr. Oskar M.
Research Directorate
U.S. Army Missile Command
ATTN: AMSMI-RD-RE-AP
Redstone Arsenal, AL 35898-5248 (AV) 876-4872

360
T,C ABRAHIM, YMhd
Mathematical Department
New Mexico State University

T,C BANDAS, Sadin
Mathematical Department
New Mexico State University

T HVICK, Chris
Mathematical Department
New Mexico State University

T DALE, Richard H.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T GADNEY, George
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T,C McLAUGHLIN, Dale R.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T,C COUNCIL, Konrad K.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T ZEBR, Ronald
AMXCM-CW-WS
White Sands Missile Range, NM 88002

T,C CHAMBERS, Charles E.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

361
T,C SUTGE, J. Robert
Walter Reed Army Institute of Research
Washington, DC 20307 (202) 576-3151

T,C TANG, Douglas
Walter Reed Army Institute of Research
Washington, DC 20307-5100 (202) 576-7212

T,C GPWM0P, Gavin
Math Department
UTEP, El Paso, TX (915) 747-5761

T,C AVARA, Elton P.
U.S. Army Atmospheric
Sciences Laboratory
ATTN: SLCAS-AE-E (AVARA)
White Sands Missile Range
NM 88002-5501 (505) 678-1570

C WOOS, Anthony K.
USATROSCOM, Systems Analysis Office
Modeling & Techniques Div.
4300 Goodfellow Blvd.
St. Louis, MO 63120-1798 (314) 263-2926

T,C MOSS, Linda L.C.
U.S. Army Ballistic Research Laboratory
Aberdeen Proving Ground
MD 21005-5066 (301) 278-6832

T,C THOMAS, Jerry
Ballistics Research Laboratory
Aberdeen Proving Ground
MD 21005-5066 (301) 278-6728

T,C BODT, Barry A.
Ballistic Research Laboratory
Aberdeen Proving Ground
MD 21005-5066 (301) 278-6646

T,C UMHOLTZ, Robert L.
U.S. Army Ballistic
Research Lab
SLCBR-SE-P
Aberdeen Proving Ground
MD 21005-5066 (301) 278-6832

362
T,C WEBB, David W.
Ballistic Research Laboratory
ATTN: SLCBR-SE-P
Aberdeen Proving Ground
MD 21005-5066 (301) 298-6646

T,C TAYLOR, Malcolm S.
Ballistic Research Laboratory
ATTN: SLCBR-SE-P
Aberdeen Proving Ground
MD 21005-5066 (301) 298-6646

T QUINZI, Tony
TRAC
White Sands Missile Range
NM 88002 (505) 678-4356

T,C RUSSELL, Carl T.
OTEA
ATTN: CSTE-
5600 Columbia Pike
Falls Church, VA 22041-5115 (AV) 289-2305

C HALL, Charles E., Jr.
U.S. Army Missile Command
ATTN: AMSMI-RD-RE-OP/C.E. Hall
Redstone Arsenal, AL 35898-5248 (205) 876-3934

C LEHNIGK, Siegfried H.
U.S. Army Missile Command
ATTN: AMSMI-RD-RE-OP/S. H. Lehnigk
Redstone Arsenal, AL 35898-5248 (205) 876-3526

T,C GM, Robert E.
STEWS-ID-P
White Sands Missile Range,
New Mexico 88002 (505) 678-2291

T,C WTvOIT, Gene
U.S. Army Infantry School (404) 545-3165

T,C GRIMES, Dr. Fred M.
TEXCOM Combined
Arms Test Directorate
Ft. Hood, TX 76544-5065 (AV) 738-9614

363
T,C Sw=, Donald X.
New Mexico Research Institute
Box 160
Las Cruces, NM 88004 (505) 522-5197

C BATES, Carl B.
U.S. Army Concepts Analysis Agency
8120 Woodmont Ave
Bethesda, MD 20814-2797 (AV) 295-0163

C IGLEHART, Donald L.
Stanford University
Department of Operations Research
Stanford, CA 94305-4022 (415) 723-0850

T,C VISMA, Vernon V.
US Army Combat Systems Test Activity
Aberdeen Proving Ground, MD
21005-5059 (301) 278-7503

C SOWOKY, P.
Counsellor, Defence R&D,
Canadian Embassy
2450 Massachusetts Ave NW
Washington, DC 20008 (202) 483-5505

T,C DRESSEL, Francis
U.S. Army Research Office
P.O. Box 12211
Research Triangle Park
North Carolina 27709-2211 (919) 549-0641

C BRYSON, Dr. Marion R.
TEXCOM Experimentation Center
Fort Ord, CA 93941-7000 (408) 242-4414

T,C TINGEY, Henry B.
University of Delaware
Dept Math Sciences
Newark, DE 19716 (302) 451-8034

T,C BISSINGER, Barney
Penn State University
Hershey Foods
281 W. Main St
Middletown, PA 17057 (717) 944-0649

364
C GONZALEZ, Ramiro
TRAC
White Sands Missile Range, NM 88001

C SHUSTER, Eugene
Mathematical Department
University of Texas - El Paso
El Paso, Texas

C ROJO, Javier
Mathematical Department
University of Texas - El Paso
El Paso, Texas

C CNG, Cherq
Mathematical Department
University of Texas - El Paso
El Paso, Texas

C LIU, Yan
Mathematical Department
University of Texas - El Paso
El Paso, Texas

C POLLA, Charles
RAM Division
White Sands Missile Range, New Mexico 88002

C KAIGH, Bill
Mathematical Department
University of Texas - El Paso
El Paso, Texas

C PARZEN, Emanuel
Department of Statistics
Texas A&M University
College Station, Texas 77843

T,C COHEN, Herb
U.S. Army Materiel Systems Analysis Activity
Aberdeen Proving Ground, MD 21005-5066

365
C HAMM, Joe
U.S. Army Airborne Board
Fort Bragg, NC 28307 (AV) 236-5115

T,C CASTILLO, Cesar
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T,C DALE, Oren N.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T CATERSON, Janet
White Sands Missile Range, NM 88002

T,C COHEN, Herb
AMSAA
Aberdeen Proving Ground, MD 21005

WANG, Phillip
TRAC
White Sands Missile Range, NM 88001

T ACKER, Clay D.
Data Sciences Division
STEWS-NR-AM
White Sands Missile Range, NM 88002

T,C ROGERS, Gerald


Math Department
New Mexico State University

C COX, Paul
2930 Huntington Drive
Las Cruces, New Mexico 88001

TC CULPEPM , Gideon
Las Cruces, NM

T,C PAGE, Woodrow


1125 Larry Drive
Las Cruces, New Mexico 88001

C ANDERSEN, Gerald
366
C VANE, Mark
U.S. Army Materials Technology Lab
Watertown Arsenal

T,C HARRIS, Bernard
University of Wisconsin, Madison

C WEST, Larry
TECOM Headquarters
Aberdeen Proving Ground, MD 21005-5066

C WEGMAN, Edward
George Mason University

367
UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE

                        REPORT DOCUMENTATION PAGE                Form Approved
                                                                 OMB No. 0704-0188

1a. REPORT SECURITY CLASSIFICATION: UNCLASSIFIED
1b. RESTRICTIVE MARKINGS:
2a. SECURITY CLASSIFICATION AUTHORITY:
2b. DECLASSIFICATION/DOWNGRADING SCHEDULE:
3.  DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public release;
    distribution unlimited.
4.  PERFORMING ORGANIZATION REPORT NUMBER(S): ARO Report 89-2
5.  MONITORING ORGANIZATION REPORT NUMBER(S):
6a. NAME OF PERFORMING ORGANIZATION: Army Research Office
6b. OFFICE SYMBOL (if applicable): SLCRO-MA
6c. ADDRESS (City, State, and ZIP Code): P.O. Box 12211
    Research Triangle Park, NC 27709
7a. NAME OF MONITORING ORGANIZATION:
7b. ADDRESS (City, State, and ZIP Code):
8a. NAME OF FUNDING/SPONSORING ORGANIZATION: AMSC on behalf of the
    Chief of Research, Development and Acquisition
8b. OFFICE SYMBOL (if applicable):
8c. ADDRESS (City, State, and ZIP Code):
9.  PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER:
10. SOURCE OF FUNDING NUMBERS: PROGRAM ELEMENT NO. | PROJECT NO. |
    TASK NO. | WORK UNIT ACCESSION NO.
11. TITLE (Include Security Classification): Proceedings of the Thirty-Fourth
    Conference on the Design of Experiments in Army Research, Development
    and Testing
12. PERSONAL AUTHOR(S):
13a. TYPE OF REPORT: Technical
13b. TIME COVERED: FROM          TO
14. DATE OF REPORT (Year, Month, Day): 1989 July
15. PAGE COUNT: 367
16. SUPPLEMENTARY NOTATION:
17. COSATI CODES: FIELD | GROUP | SUB-GROUP
18. SUBJECT TERMS (Continue on reverse if necessary and identify by block number):
19. ABSTRACT (Continue on reverse if necessary and identify by block number):
    This is a technical report of the Thirty-Fourth Conference on the Design
    of Experiments in Army Research, Development and Testing. It contains
    most of the papers presented at this meeting. These articles treat
    various Army statistical and design problems.
20. DISTRIBUTION/AVAILABILITY OF ABSTRACT:
    [ ] UNCLASSIFIED/UNLIMITED  [ ] SAME AS RPT.  [ ] DTIC USERS
21. ABSTRACT SECURITY CLASSIFICATION:
22a. NAME OF RESPONSIBLE INDIVIDUAL: Dr. Francis G. Dressel
22b. TELEPHONE (Include Area Code): (919) 549-0641, Ext. 124
22c. OFFICE SYMBOL: SLCRO-MA

DD FORM 1473, 84 MAR    83 APR edition may be used until exhausted.
                        All other editions are obsolete.
                                      SECURITY CLASSIFICATION OF THIS PAGE
                                      UNCLASSIFIED
