Simulation of Correlated Continuous and Categorical Variables Using A Single Multivariate Distribution
Clinical trial simulations make use of input/output models with covariate effects; the virtual
patient population generated for the simulation should therefore display physiologically rea-
sonable covariate distributions. Covariate distribution modeling is one method used to create
sets of covariate values (vectors) that characterize individual virtual patients, which should
be representative of real subjects participating in clinical trials. Covariates can be continuous
(e.g., body weight, age) or categorical (e.g., sex, race). A modeling method commonly used
for incorporating both continuous and categorical covariates, the Discrete method, requires the
patient population to be divided into subgroups for each unique combination of categorical
covariates, with separate multivariate functions for the continuous covariates in each subset.
However, when there are multiple categorical covariates this approach can result in subgroups
with very few representative patients, and thus, insufficient data to build a model that charac-
terizes these patient groups. To resolve this limitation, an application of a statistical method-
ology (Continuous method) was conceived to enable sampling of complete covariate vectors,
including both continuous and categorical covariates, from a single multivariate function. The
Discrete and Continuous methods were compared using both simulated and real data with
respect to their ability to generate virtual patient distributions that match a target popu-
lation. The simulated data sets consisted of one categorical and two correlated continuous
covariates. The proportion of patients in each subgroup, correlation between the continuous
covariates, and ratio of the means of the continuous covariates in the subgroups were varied.
1 Novartis Pharmaceuticals Corp., One Health Plaza 435/1125, East Hanover, NJ 07936,
USA.
2 Center for Drug Development Science, UC Washington Center, School of Pharmacy, Uni-
versity of California San Francisco, 1608 Rhode Islands Road, NW, Washington, DC
20036, USA.
3 Department of Pharmacology and Clinical Pharmacology, University of Auckland, 85
Park Rd, Private Bag 92019, Auckland, New Zealand.
4 Projections Research, Inc., 535 Springview Lane, Phoenixville, PA 19460, USA.
5 To whom correspondence should be addressed. E-mail: [email protected]
During evaluation, both methods accurately generated the summary statistics and proper pro-
portions of the target population. In general, the Continuous method performed as well as
the Discrete method, except when the subgroups, defined by categorical value, had markedly
different continuous covariate means, for which, in the authors’ experience, there are few clini-
cally relevant examples. The Continuous method allows analysis of the full population instead
of multiple subgroups, reducing the number of analyses that must be performed, and thereby
increasing efficiency. More importantly, analyzing a larger pool of data increases the precision
of the covariance estimates of the covariates, thus improving the accuracy of the description
of the covariate distribution in the simulated population.
INTRODUCTION
Clinical trial simulation (CTS) can be a valuable tool to improve
drug development (1–3). By synthesizing the available knowledge about the
drug, patients, and clinical program (e.g., pharmacokinetics and pharma-
codynamics, disease progress, demographics) into a stochastic model, the
user can investigate, in silico, aspects of the clinical study plan (dosing reg-
imens, study designs, patient populations, formulations), allowing the clin-
ical team to make rational, informed decisions with regards to optimizing
the development plan of a new compound (4–8).
A clinical trial simulation model consists of three main components
(1): a clinical trial execution model, an input–output (IO) model, and a
covariate distribution model. The execution model describes aspects of the
study conduct such as compliance with dosing schedules, and subject drop-
outs. The IO model is a collection of models describing the disease progress
during the study period, and the pharmacokinetics and pharmacodynam-
ics of the drugs being tested. The covariate distribution model incorporates
patient-specific factors that may account for inter-individual differences in
observed pharmacokinetics and pharmacodynamics and contribute to vari-
ability in individual parameter values. Based on the established or hypoth-
esized impact of the covariates on the IO model, the simulated covariate
information is then used to predict IO model parameters for a virtual patient
with a particular combination of demographics and characteristics.
Covariate distribution modeling can be used to generate virtual patients
for clinical trial simulation (3, 9). Each patient is represented by a set of intrin-
sic or extrinsic factors (called a covariate vector) which collectively describe
the characteristics of the patient. Useful covariates typically include demo-
graphics (age, weight, sex, race), concomitant drug use (which may also include
abused drugs, tobacco and alcohol), and disease risk or health status biomar-
kers (e.g., blood pressure, cholesterol concentrations, creatinine clearance, liver
enzymes, disease severity). Note that patient covariates may be continuous
(such as age and weight) or categorical (such as sex, race, or smoking status).
These covariates are frequently correlated between individuals (e.g., women
are more likely to weigh less and have lower creatinine clearances than men).
Since the covariates are used to predict elements of the IO model that influence
a patient’s trial outcome, it is critical that the covariates associated with each
virtual patient be realistic and consistent with the projected patient popula-
tion. Therefore, some care should be given to the development of the covariate
distribution model.
There are a number of techniques that use the covariates of an exist-
ing patient population (e.g., a patient population with the same indica-
tion, or patient information from a previous study of the same drug)
to create new virtual patient populations for clinical trial simulation (9).
The simplest method is to sample complete patient covariate vectors from
observed values in the existing database (also called the empirical distribu-
tion), with or without replacement of that vector in subsequent sampling.
The benefit of sampling from an empirical distribution is that covariate
combinations are guaranteed to be realistic, as they are extracted directly
from real patient data. However, no new patient covariate vectors can be
created using this approach.
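As a minimal sketch of sampling from the empirical distribution, the Python fragment below draws complete covariate vectors with or without replacement. The patient records and the function name `sample_empirical` are invented for illustration and are not taken from the study data.

```python
import random

# Hypothetical empirical database: one complete covariate vector per patient.
patients = [
    {"age": 64, "weight": 81.0, "sex": "M"},
    {"age": 71, "weight": 62.5, "sex": "F"},
    {"age": 58, "weight": 90.2, "sex": "M"},
    {"age": 69, "weight": 70.1, "sex": "F"},
]

def sample_empirical(records, n, replace=True, seed=None):
    """Draw n complete covariate vectors from the observed records."""
    rng = random.Random(seed)
    if replace:
        return [rng.choice(records) for _ in range(n)]
    # Without replacement, n cannot exceed the number of observed patients.
    return rng.sample(records, n)

virtual = sample_empirical(patients, 10, replace=True, seed=1)
```

Every element of `virtual` is one of the observed vectors, which is why this approach guarantees realistic combinations but cannot produce new ones.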
Rather than sampling complete vectors from single subjects, the
individual empirical distributions of each covariate can be used to create
vectors that do not exist in the empirical database. In this method, covari-
ates are sequentially sampled from their individual empirical distributions,
with each subsequent covariate chosen from a constrained set of values
based upon previously selected covariates (e.g., after randomly sampling
age from the observed age distribution, one would then randomly sam-
ple creatinine clearance from its observed distribution; however, the choice
of values would be limited to creatinine clearance measurements obtained
from patients with the age chosen in the first step). Such “conditional dis-
tributions” will preserve the correlation between the covariates. It should
be noted, however, that as each additional covariate is selected, subsequent
distributions become increasingly constrained, potentially limiting values
to a highly restricted subset; shuffling the selection order may partly alle-
viate this problem (9). In addition, the process of sequentially selecting
covariates can be computationally inefficient.
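The sequential scheme can be sketched as follows for two covariates. This is an illustration under stated assumptions: the data are invented, and the conditional set is taken to be patients within a 5-year age window of the sampled age (the `window` parameter is an assumption introduced here; the text describes conditioning on the chosen age itself).

```python
import random

# Hypothetical observed (age in years, creatinine clearance in mL/min) pairs.
observed = [(58, 98.0), (61, 91.0), (64, 85.0), (69, 74.0),
            (72, 70.0), (77, 58.0), (81, 49.0)]

def sample_conditional(pairs, rng, window=5):
    """Sample age first, then creatinine clearance from patients of similar age."""
    age = rng.choice([a for a, _ in pairs])
    # Conditional distribution: clearances observed within `window` years of the sampled age.
    candidates = [c for a, c in pairs if abs(a - age) <= window]
    return age, rng.choice(candidates)

rng = random.Random(0)
vectors = [sample_conditional(observed, rng) for _ in range(20)]
```

Because the candidate list shrinks with every additional conditioning covariate, the sketch also makes the constraint problem noted above easy to see: with many covariates, `candidates` may contain only one or two values.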
Random sampling of covariate vectors from a multivariate normal
distribution (MVND, Fig. 1) preserves the benefits of the previously
described covariate profile generation methods (generation of unique sub-
jects with realistic covariate vectors), while reducing their limitations (e.g.,
computational inefficiency and sampling from overly constrained distributions).
An MVND is represented by two parameters: a vector of means
of the individual covariates, and a matrix consisting of the variances of the
776 Tannenbaum et al.
covariates along the main diagonal, and the covariances between each pair
of covariates in the other matrix positions (10).
Fig. 1. Illustration of multivariate normal distribution for two covariates (cov1 and cov2).
Covariate combinations that occur naturally in the target population have a high probability
of being selected by the covariate distribution model, whereas unrealistic or physiologically
impossible combinations are selected with lower frequency.
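To make the two MVND parameters concrete, the sketch below samples covariate vectors from a mean vector and a variance–covariance matrix using a Cholesky factorization. It is a plain-Python illustration, not the implementation used by any of the software packages named in this paper; the numerical values and the function names `cholesky` and `sample_mvn` are assumptions chosen for the example.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_mvn(mean, low, rng):
    """One draw x = mean + L z, where z holds independent standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

mean = [70.0, 170.0]                    # illustrative means (e.g., weight, height)
cov = [[144.0, 60.0], [60.0, 100.0]]    # variances on the diagonal, covariance off it
low = cholesky(cov)
rng = random.Random(42)
draws = [sample_mvn(mean, low, rng) for _ in range(5000)]
```

In practice a numerical library would normally be used instead (for example, NumPy's `Generator.multivariate_normal`); the hand-written version is shown only to expose how the mean vector and covariance matrix enter the calculation.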
Two important assumptions fixed in the definition of an MVND must
be considered. First, all covariates in the MVND are assumed to follow
the same known distribution (e.g., normal or log-normal). Second, while
covariance defines the basic association between two covariates, it is not
sufficient to fully define the shape of the relationship; sampling from a
MVND will always result in linearly related simulated covariates. Thus,
regardless of the observed distributions of the covariates, and the shapes
of the relationships between them in the empirical distribution, covari-
ates sampled from an MVND based upon this data will be normally (or
log-normally) distributed and linearly related; the simulated results may
therefore only approximate the original target population. It should be
noted, however, that most common covariates, such as age and weight, are
generally normally or log-normally distributed; in addition, within normal
ranges of covariate values, it is usual to see a linear relationship between
common covariates. Therefore, because the MVND defines the individ-
ual covariate distributions as well as maintains the systematic relationship
between the covariates, the generated covariate vectors should be physio-
logically realistic.
Software packages such as NONMEM (11) and Pharsight Trial
Simulator (Version 2.1.2, Pharsight Inc, Mountain View, CA) allow sam-
ples to be obtained from multivariate normal distributions (MVND), but
this can only be accomplished when all covariates are continuous. Because
categorical covariates are not continuous they have not previously been
considered for inclusion in MVNDs. However, the method for covari-
ate distribution modeling that will be introduced creates a single MVND
METHODS
Discrete Method
A commonly used method for dealing with both continuous and cat-
egorical covariates is to use a separate MVND for each unique combina-
tion of categorical covariate values. This will henceforth be designated the
Discrete method. For example, if sex and smoking are the two categori-
cal covariates, the population is divided into four groups (female smokers,
female nonsmokers, male smokers, and male nonsmokers). Subgroup spe-
cific MVNDs are then created from which continuous covariates (e.g. age,
weight) are sampled. Although the Discrete method is frequently used,
there are significant limitations, which arise from subdividing the patient
population.
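The subdivision step of the Discrete method can be sketched as below. The records are hypothetical, and only the subgroup mean vectors are computed to keep the example short; a full implementation would also estimate a variance–covariance matrix per subgroup.

```python
from collections import defaultdict

# Hypothetical records: (sex, smoker, age in years, weight in kg).
records = [
    ("F", "no", 61, 60.0), ("F", "no", 70, 66.0), ("F", "yes", 58, 55.0),
    ("M", "no", 65, 82.0), ("M", "no", 72, 78.0), ("M", "yes", 59, 88.0),
    ("M", "yes", 63, 91.0),
]

def split_by_category(rows):
    """Group the continuous covariates by each unique categorical combination."""
    groups = defaultdict(list)
    for sex, smoker, age, weight in rows:
        groups[(sex, smoker)].append((age, weight))
    return dict(groups)

def mean_vector(values):
    """Column means of a list of equal-length tuples."""
    n = len(values)
    return [sum(col) / n for col in zip(*values)]

subgroups = split_by_category(records)
subgroup_means = {key: mean_vector(vals) for key, vals in subgroups.items()}
```

Note that the female-smoker subgroup in this toy data set contains a single patient, which already hints at the data-sparsity problem of subdividing the population.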
First, the Discrete method may be impractical to implement when
there are multiple categorical covariates (e.g. sex ×2, smoking ×3, race
×4, disease status ×5 would lead to 120 separate MVNDs). Even if it
were feasible to simulate this many MVNDs, estimation of their param-
eters may be impossible because of limited empirical observations of the
continuous covariates in each of the categorical subgroups. When there are
too few patients in a subgroup, there may be insufficient data to create
a reliable MVND; specifically, if there are fewer than N + 1 subjects in a
subgroup (where N is the number of covariates in the MVND), the variance–covariance
matrix that is generated will be singular. A worst-case scenario
is a subgroup in which there are no patients in the empirical distribution,
yet patients with this combination of categorical covariates could poten-
tially be enrolled in a future clinical trial. Because the relationship between
the continuous covariates in this subgroup is unknown, it is impossible to
determine if the simulated patient covariate vectors are appropriate for this
patient group.
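Both limitations can be checked numerically. The sketch below counts the subgroups for the four categorical covariates in the example above and shows, for N = 2 continuous covariates, that two subjects give a singular variance–covariance matrix while three subjects need not. The subject values and helper names are hypothetical.

```python
from itertools import product

# Levels per categorical covariate, taken from the example in the text.
levels = {"sex": 2, "smoking": 3, "race": 4, "disease_status": 5}
n_subgroups = len(list(product(*(range(k) for k in levels.values()))))

def sample_cov_2d(points):
    """Unbiased 2x2 sample variance-covariance matrix of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def det2(m):
    """Determinant of a 2x2 matrix; zero means the matrix is singular."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# N = 2 covariates but only 2 subjects (fewer than N + 1): singular matrix.
too_few = sample_cov_2d([(60.0, 170.0), (80.0, 180.0)])
# With a third subject (N + 1 = 3) a non-singular estimate becomes possible.
enough = sample_cov_2d([(60.0, 170.0), (80.0, 180.0), (70.0, 160.0)])
```

A singular matrix cannot be factorized for sampling, so a subgroup such as `too_few` cannot support its own MVND.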
Although there may be no data about the association between co-
variates (continuous or categorical) for a specific subgroup, it seems rea-
sonable to assume that the variance structure may be similar to those
Continuous Method
In the Continuous method the parameters of a single MVND are esti-
mated by treating all categorical covariates as if they are continuous val-
ues, a procedure seen commonly in statistical simulation (12–15). In order
to constrain all biological covariates to be positive, we typically assume
a log-normal multivariate distribution. Thus, the MVND variance–covari-
ance matrix is defined in terms of the logarithms of the covariate values.
Likewise, categorical values must all be coded to possess positive values.
Complete patient covariate vectors (both continuous and categorical cova-
riates) are then sampled from a single MVND; because the sampled val-
ues are logarithmic, each component of the vector is then exponentiated to
obtain the true covariate values. Note that because categorical covariates
are sampled from a continuous MVND, sampled values for the virtual
patients for these covariates will therefore be nondiscrete (e.g., if nonsmok-
ers = 1 and smokers = 2 in the empirical distribution, a value such as 1.3
would be a possible value for smoking status in vectors sampled from a
MVND). These continuous values must then be mapped to discrete cate-
gorical values, based on a continuous critical value (CrV).
The CrV is determined from the inverse of the lognormal cumulative
distribution with a given mean, standard deviation, and cumulative prob-
ability (16), according to the following equation:
[Figure 2 labels: Nonsmokers (18.6%), Former smokers (49.3%), Current smokers (32.1%); x-axis: continuous value for smoking status, 0 to 6.]
Fig. 2. Illustration of the calculation of discrete values for smoking status with three possible
values (nonsmoker, former smoker, current smoker). The mean and standard deviation define
the log-normal probability distribution of smoking status in the empirical distribution, as if it
was a continuous covariate. The histogram is derived from an empirical distribution used to
estimate the continuous distribution parameters. The cumulative probability (P) corresponds
to the area under the probability distribution curve (solid line). The CrVs are indicated by
arrows on the x-axis.
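A hedged sketch of the CrV calculation follows. It assumes that the mean and standard deviation are those of the log-transformed category codes, that the three smoking categories are coded 1, 2, and 3, and that the inverse log-normal cumulative distribution can be written as exp(mu + sigma * z_P), where z_P is the standard normal quantile at cumulative probability P. The proportions are those shown in Fig. 2; the function names are invented for illustration.

```python
import math
from statistics import NormalDist

def critical_values(codes, proportions):
    """CrVs on the original scale for a categorical covariate with positive codes."""
    logs = [math.log(c) for c in codes]
    mu = sum(p * x for p, x in zip(proportions, logs))
    sigma = math.sqrt(sum(p * (x - mu) ** 2 for p, x in zip(proportions, logs)))
    crvs, cumulative = [], 0.0
    for p in proportions[:-1]:
        cumulative += p
        # Inverse log-normal CDF at the cumulative probability.
        crvs.append(math.exp(mu + sigma * NormalDist().inv_cdf(cumulative)))
    return crvs

def discretize(value, codes, crvs):
    """Map a sampled continuous value to the first category whose CrV it does not exceed."""
    for code, crv in zip(codes, crvs):
        if value <= crv:
            return code
    return codes[-1]

codes = [1, 2, 3]                      # assumed coding of the three categories
proportions = [0.186, 0.493, 0.321]    # proportions shown in Fig. 2
crvs = critical_values(codes, proportions)
status = discretize(1.3, codes, crvs)  # a non-discrete sampled value, as in the text
```

Because each CrV sits at the quantile matching the cumulative category proportion, values sampled from the fitted log-normal distribution fall into the categories in the observed proportions.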
Qualification of Methods
The performances of the Discrete and Continuous methods were eval-
uated based upon their abilities to reproduce the summary statistics of
target population covariate distributions, using both real and simulated
populations. All simulations were performed using Trial Simulator, and
statistics were determined using S-PLUS (Version 6.0, Insightful Corp.,
Seattle, WA).
For the Continuous method, the parameters of a single MVND were
estimated using all covariates (continuous and categorical) from the real
or simulated data. The following steps were carried out in S-PLUS. First,
summary statistics (geometric mean, minimum, and maximum) were com-
puted for each covariate. All covariates were log-transformed (for categor-
ical covariates with a value of zero in the empirical distribution, 1 was
added to all values prior to log-transformation). The variance–covariance
matrix of the transformed values was then determined. The summary sta-
tistics and variance–covariance matrix were entered into Trial Simulator
to define the MVND. All covariates were classified as continuous, regard-
less of their original type (categorical, continuous) in the empirical distri-
bution. After the covariate vectors were sampled from the MVND, each
element was exponentiated. The “continuous” categorical covariate values
were then discretized based upon the appropriate CrV.
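The steps above can be sketched end to end for one continuous and one two-level categorical covariate. This is an illustrative Python analogue, not the S-PLUS and Trial Simulator workflow used in the study; the data, the function names, and the use of a bivariate normal constructed from a correlation coefficient are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

# Hypothetical data: (weight in kg, sex coded 1 = female, 2 = male).
data = [(58.0, 1), (63.0, 1), (66.0, 1), (70.0, 1), (74.0, 2),
        (79.0, 2), (82.0, 2), (85.0, 2), (90.0, 2), (95.0, 2)]

def log_moments(rows):
    """Means, SDs, and correlation of the log-transformed columns."""
    logs = [(math.log(w), math.log(s)) for w, s in rows]
    n = len(logs)
    m = [sum(col) / n for col in zip(*logs)]
    var = [sum((r[i] - m[i]) ** 2 for r in logs) / (n - 1) for i in range(2)]
    cov = sum((r[0] - m[0]) * (r[1] - m[1]) for r in logs) / (n - 1)
    sd = [math.sqrt(v) for v in var]
    return m, sd, cov / (sd[0] * sd[1])

def simulate(rows, n_sim, seed=0):
    """Sample (weight, sex) vectors from a single bivariate log-normal distribution."""
    m, sd, rho = log_moments(rows)
    p_cat1 = sum(1 for _, s in rows if s == 1) / len(rows)
    # Critical value on the original scale (inverse log-normal CDF at p_cat1).
    crv = math.exp(m[1] + sd[1] * NormalDist().inv_cdf(p_cat1))
    rng = random.Random(seed)
    out = []
    for _ in range(n_sim):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        log_w = m[0] + sd[0] * z1
        log_s = m[1] + sd[1] * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        # Exponentiate, then discretize the "continuous" categorical value.
        out.append((math.exp(log_w), 1 if math.exp(log_s) <= crv else 2))
    return out

virtual = simulate(data, 5000)
```

Categorical codes must be positive before the log-transformation; as described above, a coding that includes zero would first be shifted by adding 1 to all values.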
For the Discrete Method, the values from the real or simulated data
were subset into groups corresponding to each unique combination of cat-
egorical covariates. In Trial Simulator, each covariate was classified as cat-
egorical or continuous according to its type in the empirical distribution.
The method outlined in the previous paragraph was then applied for the
continuous covariates in each subgroup.
1000 subjects were simulated using both the Discrete and Continuous
methods. The population summary statistics and distributions of contin-
uous covariates and proportions of categorical covariate values generated
by both methods were compared to those of the corresponding “observed”
(real or simulated) data. In addition, each method's ability to
preserve the correlation coefficients between the covariates was examined.
A variance–covariance matrix was created from the simulated population
data, and compared to the variance–covariance matrix obtained from the
original data; the percent difference between the values in the same posi-
tions in the two matrices was calculated.
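The element-wise comparison of the two matrices can be sketched as follows. The sign convention (simulated minus original, relative to original) and the matrix values are assumptions made for illustration; the text does not state the exact formula.

```python
def percent_difference(simulated, original):
    """Element-wise 100 * (simulated - original) / original for equal-sized matrices."""
    return [
        [100.0 * (s - o) / o for s, o in zip(sim_row, orig_row)]
        for sim_row, orig_row in zip(simulated, original)
    ]

original_cov = [[0.040, 0.010], [0.010, 0.025]]     # illustrative values only
simulated_cov = [[0.042, 0.009], [0.009, 0.025]]
diff = percent_difference(simulated_cov, original_cov)
```

The function divides by the original value, so an original covariance of exactly zero would need separate handling.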
For each original data set (real, simulated), the Discrete and Contin-
uous methods were replicated 10 times. With 1000 subjects simulated per
replicate, good marginal statistics for the covariates, including 95% confi-
dence intervals, could be obtained. Since the results were fairly invariant
between the replicates (based on a small standard error of the mean), 10
replicates were judged to be sufficient to obtain precise estimates of the
covariate summaries.
The Continuous and Discrete methods were applied to both real data
and 27 simulated covariate data sets, as follows:
Table II. Variables defining the simulation scenarios (n = 27) for the
simulated target population covariate data sets

% (CAT = 1)    Corr    MR
10             0       0.1
25             0.45    0.5
50             0.9     0.9
The MVND parameters for each of the possible scenarios were used
to simulate 27 covariate data sets with Trial Simulator.
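The 27 scenarios are the full factorial combination of the three variables in Table II, each at three levels. A short sketch of the enumeration (the dictionary keys are names chosen here for illustration):

```python
from itertools import product

pct_cat1 = [10, 25, 50]      # % of patients with CAT = 1
corr = [0, 0.45, 0.9]        # correlation between the continuous covariates
mr = [0.1, 0.5, 0.9]         # MR, as listed in Table II

# One scenario per combination of the three variables.
scenarios = [
    {"pct_cat1": p, "corr": c, "mr": m}
    for p, c, m in product(pct_cat1, corr, mr)
]
```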
RESULTS
Empirical Distribution of Covariates (Real Data Example)
Table III and Figs. 3 and 4 display the covariate summary statistics of
the real data set and those generated using the Continuous Method. The
mean, standard deviation, and range of the continuous covariates in the
original population are maintained in the simulated population. In addi-
tion, the proportion of each value of the categorical covariates in the orig-
inal population is maintained in the simulated population, showing that
the calculation used to map continuous values to discrete values is appropriate.
Table III. Real Data Example: summary statistics of the categorical covariates of target pop-
ulation (n = 467) and the covariate data generated using the Discrete and Continuous meth-
ods (n = 1000 × 10 replicates)
[Figure 3 plot area: each panel shows Proportion (0.0 to 1.0) against Replicate (0 to 10); the only panel title recovered is DIAGNOSIS.]
Fig. 3. Real Data Example: Proportion of patients in each level for the three categorical
covariates. The leftmost bar in each plot represents the target population data (n = 467).
Each additional bar represents one replicate of 1000 patients, generated using the Continuous
method. The varying bar colors represent the proportion of subjects in each category.
[Figure 4 panels: Diastolic BP, Systolic BP, Cholesterol, Glucose.]
Fig. 4. Real Data Example: Box and whisker plot showing the distribution of values for the
continuous covariates. The leftmost box in each plot represents the target population data
(n = 467). Each additional box represents one replicate of 1000 patients, generated using the
Continuous method. The boxes contain the 25th to 75th percentiles, the horizontal bar repre-
sents the median value, the whiskers represent the 10th to 90th percentiles, and the lines out-
side the whiskers represent outliers.
Table IV. Real Data Example: summary statistics of the continuous covariates in the tar-
get population (n = 467) and the covariate data generated using the Discrete and Continuous
methods (n = 1000 × 10 replicates)
Age 68.7 ± 8.2 69.7 (0.23) ± 7.20 (0.15) 69.9 (0.21) ± 4.98 (0.14)
Weight 72.6 ± 12.8 74.0 (0.27) ± 10.8 (0.19) 70.5 (0.44) ± 7.19 (0.34)
BMI 25.8 ± 3.5 26.1 (0.04) ± 2.92 (0.07) 25.8 (0.08) ± 2.41 (0.09)
Cholesterol 205.5 ± 44.0 214.0 (1.50) ± 43.7 (1.32) 212.2 (2.29) ± 30.95 (1.19)
Diastolic BP* 77.9 ± 11.0 79.4 (0.36) ± 10.2 (0.16) 79.4 (0.36) ± 10.2 (0.16)
Systolic BP* 146.1 ± 18.4 148.1 (0.36) ± 17.1 (0.42) 147.0 (1.06) ± 12.5 (0.45)
Glucose 5.97 ± 1.91 6.70 (0.07) ± 2.36 (0.09) 6.48 (0.11) ± 1.39 (0.07)
Fig. 5. Simulated Data: Bar chart showing the percentage of patients in the subgroup
(CAT = 1) for the target population covariate data and the covariate data generated by
the Continuous and Discrete methods. There should be 10, 25, and 50% in the (CAT = 1)
subgroup, respectively, for each set of 9 scenarios.
simulated by the Continuous method. For the whole population, the Con-
tinuous method reliably simulates covariates with mean and coefficient of
variation close to the true values. The %PE is negligible, and is relatively
independent of MR and number of patients in the subgroups, with only
a slight negative %PE for both the mean and CV at a MR value of
0.1. However, for the individual subgroup summary statistics, the %PE is
highly dependent upon both MR and the percentage of patients in that
subgroup. For MR = 0.1, the Continuous method results overestimate the
mean and CV of CAT = 1 (as shown by a large positive %PE); as the per-
centage of patients in this subgroup increases, however, the error decreases.
As MR increases, the errors approach zero for both mean and CV in the
subpopulations. For the Discrete method, there are negligible errors in the
mean and SD for the subgroups and for the whole population, which are
independent of the values of MR or the percentage of patients in each
subgroup (results not shown).
Figure 8 shows the correlation between CONT1 and CONT2 for
the true simulated covariate data, and for the Continuous and Discrete
method results. Comparing the plots of the simulated covariate data and
the Continuous method results indicates that for a MR value of 0.1, the
Continuous method fails to capture the relationship between CONT1 and
CONT2, but adequately captures the correlation for larger ratios. For all
Fig. 6. (a) Simulated Data: Population distribution of CONT1 for MR = 0.1. The target pop-
ulation covariate data (gray bars) is overlaid with the Continuous method results (top) and
Discrete method results (bottom), shown as transparent bars. Only the scenarios for correla-
tion = 0 between CONT1 and CONT2 are shown, but the plots for correlations of 0.45 and
0.9 look similar. (b) Simulated Data: Population distribution of CONT1 for MR = 0.5. (c)
Simulated Data: Population distribution of CONT1 for MR = 0.9.
[Figure 7 plot area: y-axis % PE; x-axis values 0.1, 0.5, 0.9 repeated for three panels.]
Fig. 7. Simulated Data: Percent error in mean (top row) and CV (bottom row) of CONT1
for the values generated by the Continuous method, as a function of mode ratio and %
(CAT = 1).
[Plot legend: CAT = 1 at 10%, 25%, and 50%; x-axis values 0.1, 0.5, 0.9 repeated for three panels.]
Fig. 8. Simulated Data: Correlation between CONT1 and CONT2 for CAT = 1 (light gray)
and CAT = 2 (dark gray). Only the scenarios with 50% of patients in each subgroup are
shown, but the plots for 10% and 25% in the CAT = 1 subgroup look similar.
Fig. 9. Simulated Data: Percent error in correlation between CONT1 and CONT2 for the
values generated by the Continuous method, as a function of mode ratio and % (CAT = 1).
covariate coding had no effect upon the simulated results. The summary
statistics of the continuous covariates, and the proportion of each value of
the categorical covariates in the original population, including smoking status,
were nearly identical to those of the original observed population.
While the values of the codes (1/2/3 vs. 1/2/50) did affect the simulated
results when the third value became large, this analysis was not strictly
necessary; should codes such as the latter appear in a data set, the third
value could simply be recoded (i.e., all values of 50 changed to 3) for the
creation of the MVND, and then transformed back (to 50) in the simulated
data set.
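The recoding described above amounts to a pair of lookup tables. A minimal sketch, with the helper name `make_recoder` invented for illustration:

```python
def make_recoder(codes):
    """Map arbitrary category codes to consecutive integers 1..k and back."""
    ordered = sorted(set(codes))
    forward = {code: i for i, code in enumerate(ordered, start=1)}
    backward = {i: code for code, i in forward.items()}
    return forward, backward

forward, backward = make_recoder([1, 2, 50])
recoded = [forward[c] for c in [1, 50, 2, 50]]    # codes used to build the MVND
restored = [backward[c] for c in recoded]         # codes restored in the simulated data
```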
DISCUSSION
As demonstrated by the real data example, both the Continuous
and Discrete methods generate accurate summary statistics for
the covariates of the target population. The mean, standard deviation,
and range of the continuous covariates in the target population, and the
[Figure 10 panel labels: AGE, BMI, CHOL, DBP, SBP, DIAB, WT.]
Fig. 10. Scatterplot matrix for observed continuous covariates in real data set. Lines repre-
sent Loess smooths of the individual plots, and indicate that the relationships between the
covariates are relatively linear; thus, it is appropriate to enter all covariates into the MVND.
REFERENCES
1. N. H. G. Holford, M. Hale, H. C. Ko, J.-L. Steimer, and C. C. Peck (eds.). P.
Bonate, W. R. Gillespie, T. Ludden, D. B. Rubin, L. B. Sheiner, and D. Stanski
(contributors). Simulation in Drug Development: Good Practices.
http://cdds.georgetown.edu/research/sddgp723.html
2. P. L. Bonate. Clinical trial simulation in drug development. Pharm. Res. 17:252–256
(2000).
3. N. H. G. Holford, J. Monteleone, H. Kimko, and C. Peck. Simulation of clinical trials.
Annu. Rev. Pharmacol. Toxicol. 40:209–234 (2000).
4. S. Chabaud, P. Girard, P. Nony, and J. P. Boissel. Clinical trial simulation using ther-
apeutic effect modeling: Application to ivabradine efficacy in patients with angina pec-
toris. J. Pharmacokinet. Pharmacodyn. 29(4):339–363 (2002).
5. H. J. M. Lemmens, D. R. Wada, C. Munera, A. Eltahtawy, and D. R. Stanski.
Enriched analgesic efficacy studies: An assessment by clinical trial simulation. Contemp.
Clin. Trials 27(2):165–173 (2006).
6. C. Veyrat-Follet, R. Bruno, and R. Olivares. Clinical trial simulation of docetaxel in
patients with cancer as a tool for dosage optimization. Clin. Pharmacol. Ther. 68:677–
678 (2000).
7. H. Kastrissios, S. Rohatagi, J. Moberly, K. Truitt, Y. Gao, D. R. Wada, M. Takahashi,
K. Kawabata, and D. Salazar. Development of a predictive pharmacokinetic model for
a novel cyclooxygenase-2 inhibitor. J. Clin. Pharmacol. 46(5):537–548 (2006).
8. K. G. Kowalski and M. M. Hutmacher. Design evaluation for a population pharmaco-
kinetic study using clinical trial simulation: A case study. Stat. Med. 20:75–91 (2001).
9. D. R. Mould. Defining covariate distribution models for clinical trial simulation. In H.
C. Kimko and S. B. Duffull (eds.), Simulation for Designing Clinical Trials: A
Pharmacokinetic–Pharmacodynamic Modeling Perspective, Marcel Dekker, New York, 2003,
pp. 31–53.
10. M. Evans, N. Hastings, and B. Peacock. Statistical Distributions, John Wiley & Sons,
Inc., New York, 1993, pp. 102–105.
11. S. L. Beal and L. B. Sheiner (eds.). NONMEM Users Guides, Icon Development Solu-
tions, Ellicott City, MD (1989–98).
12. B. Schmeiser. Advanced input modeling for simulation experimentation. In P. A. Far-
rington, H. B. Nembhard, D. T. Sturrock, and G. W. Evans (eds.), Proceedings of the
1999 Winter Simulation Conference, 1999, pp. 110–115.
13. M. Kaut and S. W. Wallace. Evaluation of scenario-generation methods for stochastic
programming (2003). http://citeseer.ist.psu.edu/cache/papers/cs/29430/http:zSzzSzwww.iot.
ntnu.nozSz mkautzSzCV and studyzSzSG evaluation.pdf/kaut03evaluation.pdf
14. S. Ghosh and S. G. Henderson. Chessboard distributions and random vectors with
specified marginals and covariance matrix. Working paper, Department of Industrial
and Operations Engineering, University of Michigan, Ann Arbor (2000).
15. M. C. Cario and B. L. Nelson. Modeling and generating random vectors with arbi-
trary marginal distributions and correlation matrix. Technical Report, Department of
Industrial Engineering and Management Sciences, Northwestern University,
Evanston, Illinois (1997).
16. L. Lapin. Probability and Statistics for Modern Engineering (2nd edn), PWS-KENT
Publishing Company, Boston, 1983, pp. 215–217.
17. R. H. Williams and P. R. Larson (eds.), Williams Textbook of Endocrinology (10th
edn), W.B. Saunders Co., Illinois, 2002, pp. 622, 726.
18. P. McDonough and R. Moffatt. Smoking-induced elevations in blood carboxyhaemo-
globin levels. Effect on maximal oxygen uptake. Sports Med. 27:275–283 (1999).