Comparison of Count Modeling Techniques For Estimating Environmental Monitoring Limits in Clean Rooms

This document discusses statistical techniques for setting alert and action limits for environmental monitoring in clean rooms used by pharmaceutical companies. It compares traditional percentile limits, parametric bootstrap limits, nonparametric bootstrap limits, and Bayesian limits using simulated count data. The key distributions considered are Poisson, negative binomial, and zero-inflated versions to account for overdispersion and excess zeros often seen in environmental monitoring data. The goal is to better understand the strengths and limitations of these statistical modeling techniques for setting appropriate environmental monitoring limits.

Statistics in Biopharmaceutical Research

ISSN: (Print) 1946-6315 (Online) Journal homepage: https://www.tandfonline.com/loi/usbr20

“Comparison of Count Modeling Techniques for Estimating Environmental Monitoring Limits in Clean Rooms”

Plinio A. De los Santos, Ji Young Kim, Pieta C. IJzerman-Boon, George G. Kariuki & Brandye Smith-Goettler

To cite this article: Plinio A. De los Santos, Ji Young Kim, Pieta C. IJzerman-Boon, George G. Kariuki & Brandye Smith-Goettler (2020): “Comparison of Count Modeling Techniques for Estimating Environmental Monitoring Limits in Clean Rooms”, Statistics in Biopharmaceutical Research, DOI: 10.1080/19466315.2020.1799854

To link to this article: https://doi.org/10.1080/19466315.2020.1799854

Accepted author version posted online: 23 Jul 2020.

“Comparison of Count Modeling Techniques for Estimating Environmental Monitoring Limits in Clean Rooms”

by Plinio A. De los Santos¹, Ji Young Kim¹, Pieta C. IJzerman-Boon¹, George G. Kariuki¹,² and Brandye Smith-Goettler¹

Abstract:

Pharmaceutical and biotechnology industries manufacture their products in clean rooms, which are designed to minimize levels of particulates (like microorganisms recovered from the air or from the clean room surfaces). Alert and action limits are employed to monitor and control the state of the room, keeping the level of particulates at appropriate levels. Particulate monitoring systems can generate count data with repeated counts, an excess of zero or low counts, overdispersion, and distributions with long, thin right tails. In this paper, we compare four statistical modeling techniques for setting alert and action limits (i.e., traditional percentile, parametric bootstrap, nonparametric bootstrap, and Bayesian with informative priors) using simulated environmental monitoring data under controlled experimental conditions, to better understand the strengths and limitations of these techniques.

KEYWORDS: Bayesian Percentiles, Bootstrap Percentiles, Particulate Count Estimation, USP<1116>, Zero Inflation.

¹ Center for Mathematical Sciences, MMD, Merck & Co., Inc., Kenilworth, NJ USA
² Currently at Regulatory Compliance and External Engagement, Global Quality / Global Product Development and Supply, Bristol-Myers Squibb, New Brunswick, NJ USA

Introduction

For assurance of drug substance/product quality, pharmaceutical manufacturers use controlled environments (e.g. clean rooms) to mitigate microbial contamination. United States Pharmacopeia Chapter 1116 (USP<1116>) [1], Microbiological Evaluation of Clean Rooms and Other Controlled Environments, indicates that an environmental monitoring program: (i) describes in detail the procedures and methods used for monitoring particulates as well as microorganisms in controlled environments, and (ii) includes sampling sites, frequency of sampling, and investigative actions that should be followed if alert or action levels are exceeded. Furthermore, this USP chapter specifies that while an alert level focuses on limits to ensure that the process is within control, an action level is a limit that, if exceeded, should trigger an investigation and a corrective action. In this paper we focus on comparing statistical methods used to establish data-driven alert and action levels in support of an environmental monitoring program.
Wilson [2] stated that, in practice, environmental monitoring data are usually not normally distributed. He also pointed out that their histograms generally resemble a Poisson distribution or a Negative Exponential distribution (a.k.a. “Exponential” distribution), two interrelated distributions that could be employed to describe count data (i.e., while the Poisson distribution focuses on describing the actual counts, the Exponential distribution focuses on describing the time between counts). The Poisson distribution requires that the mean and the variance of the counts have the same value, so it can be described simply by the mean count (μ). Since we do not consider real-time monitoring data in this paper, but only data from samples collected at discrete points in time, we will only be using distributions describing the count data, not the times between them.

Hoffman [3] indicated that counts for many processes cannot be adequately modelled by the Poisson distribution, especially when the data are over-dispersed (i.e., when the variance of the data is considerably larger than its mean). For that situation, Hoffman [3] recommends the use of the Negative Binomial distribution, since this is a natural and flexible extension of the Poisson distribution, as depicted in Figure 1. The figure shows that when the Negative Binomial dispersion parameter “k” increases, the distribution converges to a Poisson. When a random variable, X, follows a Negative Binomial distribution with location parameter “μ” and dispersion parameter “k”, the mean and variance are given by:

$$E[X] = \mu \tag{1}$$

$$\mathrm{Var}[X] = \mu + \frac{\mu^{2}}{k} \tag{2}$$

Notice that equation (2) also shows that as the dispersion parameter goes to infinity, the variance of the counts approaches the mean and the Negative Binomial converges to the Poisson distribution.
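As a numerical check on equations (1) and (2), the following Python sketch sums a Negative Binomial probability mass function in the (μ, k) parameterization and compares the resulting variance with μ + μ²/k. The function names, the truncation of the sums at 2,000 counts, and the use of k = 10⁶ as a stand-in for "infinity" are illustrative assumptions; this is not the R code used in the paper.

```python
import math

def poisson_pmf(x, mu):
    """Poisson probability of observing count x when the mean is mu."""
    return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))

def negbin_pmf(x, mu, k):
    """Negative Binomial probability of count x with mean mu and dispersion k."""
    log_p = (math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
             + k * math.log(k / (k + mu)) + x * math.log(mu / (k + mu)))
    return math.exp(log_p)

def moments(pmf, upper=2000):
    """Mean and variance obtained by summing the pmf over counts 0..upper-1."""
    probs = [pmf(x) for x in range(upper)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var

mu = 2.9
for k in (1, 10, 1000):
    mean, var = moments(lambda x: negbin_pmf(x, mu, k))
    print(f"k={k}: mean={mean:.3f}, variance={var:.3f}, "
          f"mu + mu^2/k={mu + mu ** 2 / k:.3f}")

# Largest pointwise gap between a very large-k Negative Binomial and the Poisson
max_gap = max(abs(negbin_pmf(x, mu, 1e6) - poisson_pmf(x, mu)) for x in range(30))
print(f"max |NB - Poisson| at k=1e6: {max_gap:.2e}")
```

For μ = 2.9 and k = 1 the summed variance agrees with the value of 11.31 listed for that corner point in Table 3, and the gap to the Poisson probabilities shrinks as k grows.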

In certain situations, however, the environmental data contain an excessive number of zeros, beyond what would be structurally expected in either a Poisson or a Negative Binomial distribution. This phenomenon is known as “zero inflation.” In this situation, a zero-inflated probability distribution, which allows for excess zeros, may be employed. The density function “$f_{ZI}(x)$” for a zero-inflated probability distribution can be described with a specified probability of excess zeros (“Pr.(zero)”) and a probability distribution function “$f(x)$”:

$$f_{ZI}(x) = \begin{cases} \Pr(\text{zero}) + \bigl(1 - \Pr(\text{zero})\bigr)\, f(0), & x = 0 \\ \bigl(1 - \Pr(\text{zero})\bigr)\, f(x), & x > 0 \end{cases} \tag{3}$$

When the probability distribution function “$f(x)$” is Poisson, the zero-inflated distribution is known as a zero-inflated Poisson (or ZIP) distribution [4]. Similarly, when the probability distribution function “$f(x)$” is Negative Binomial, the zero-inflated distribution is known as a zero-inflated Negative Binomial (or ZINB) distribution [5]. For comparison purposes, the above four distributions are considered in this paper with input parameter ranges consistent with historically observed environmental monitoring surface data.
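The mixture in equation (3) can be sketched in a few lines of Python. The helper names below are hypothetical, and the parameter values (μ = 2.9, Pr.(zero) = 0.6) are simply the ZIP corner point used in this study; the sketch checks that the zero-inflated probabilities still sum to one and that the mean shrinks to (1 − Pr.(zero))·μ.

```python
import math

def poisson_pmf(x, mu):
    """Poisson probability of observing count x when the mean is mu."""
    return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))

def zero_inflated_pmf(x, pr_zero, base_pmf):
    """Equation (3): a point mass at zero mixed with a base count distribution."""
    if x == 0:
        return pr_zero + (1.0 - pr_zero) * base_pmf(0)
    return (1.0 - pr_zero) * base_pmf(x)

mu, pr_zero = 2.9, 0.6

# Zero-inflated Poisson (ZIP) probabilities for counts 0..199
zip_probs = [zero_inflated_pmf(x, pr_zero, lambda c: poisson_pmf(c, mu))
             for x in range(200)]

total = sum(zip_probs)
zip_mean = sum(x * p for x, p in enumerate(zip_probs))
print(f"sum of probabilities = {total:.6f}")
print(f"P(X = 0) = {zip_probs[0]:.4f}")
print(f"mean = {zip_mean:.2f}  vs (1 - Pr.(zero)) * mu = {(1 - pr_zero) * mu:.2f}")
```

Passing a Negative Binomial function as `base_pmf` instead would give the ZINB case with no other change.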

Figure 1: Examples of Poisson and Negative Binomial distributions

Method Description and Study Design

Traditionally, alert and action limits are calculated by obtaining one-sided percentiles of a suitable parametric distribution. In this paper, the comparison limits are also calculated using Bootstrap and Bayesian based procedures, as outlined in Table 1. Some additional information about these techniques follows:

• Nonparametric bootstrapping [6, 7, 8] is a computationally intensive technique for making inferences about a population characteristic using samples from the population. The central idea of bootstrapping is that it may sometimes be better to draw conclusions about the characteristics of the population strictly from the sample at hand. Bootstrapping involves “resampling” the data with replacement many times, in order to generate an empirical estimate of the entire sampling distribution of the statistic.

Table 1: Description of Environmental Monitoring Upper Limits Estimation Methods for Comparison

Method: Traditional
Process: Fit the assumed distribution to the data using maximum likelihood estimation and get an upper percentile limit (e.g. 95% or 99%, to be used as alert or action limit) from the fitted distribution.

Method: Bootstrap
Process: If the bootstrap is parametric, fit the assumed distribution to the data and estimate the population parameters from an observed sample of size “n” using maximum likelihood estimation. Otherwise, employ the observed raw data frequencies as the fitted distribution.
Resample the fitted distribution “B” times, obtaining a sample of size “n” on each occasion. Then, for each re-sample, estimate an empirical distribution-based percentile limit.
In this context, “re-sampling” implies the generation of “n” random counts from the fitted distribution at each of the “B” iterations. For the comparison, “B” was set to 1000.
Set the environmental monitoring limit equal to the median of the “B” percentile limits. The median of the bootstrap samples was employed instead of the average of the bootstrap samples because it was considered a more robust distribution parameter.

Method: Bayesian
Process: For each assumed distribution, obtain 20,000 samples from the posterior distributions of the parameters of the assumed distribution. Sampling for the assumed distribution was performed in RStan using 5,500 iterations including a warm up period of 500 on 2 MCMC chains. To reduce autocorrelation, thinning was set at 5, meaning that every 5th sample was saved.
For each of the 20,000 sampled parameter combinations, obtain a percentile limit of the corresponding distribution.
Set the environmental monitoring limit equal to the median of the distribution of the percentile limits.


• Parametric bootstrapping initially assumes a distribution for the population and employs an observed sample to estimate the distributional parameters. A large number of samples are then drawn from the estimated parametric distribution to calculate the statistic of interest.

• A Bayesian approach [9] is another computationally intensive technique, based on the idea that the distribution parameters are random variables. It allows the use of prior information on the parameters when available. With advances in computational power, estimating a posterior distribution has become simple and flexible, in the sense that the posterior can be estimated for any given prior distribution. Another advantage of this technique is that it is possible to obtain the posterior predictive distribution of the statistics of interest directly from the posterior distributions of the parameters based on Markov Chain Monte Carlo (MCMC) [10]. The Bayesian analysis presented in this paper was performed using Stan [11], a powerful computational platform for MCMC sampling with an R interface. The sampling was done using the ‘No-U-Turn Sampler (NUTS)’ algorithm, considering its advantages over other algorithms (robustness against tuning parameters and flexibility in the choice of models) [12], and the predictive distributions of the microbial count upper limits were obtained.
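The bootstrap procedure outlined in Table 1 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the paper's R implementation: the parametric branch assumes a Poisson distribution (whose maximum likelihood estimate is the sample mean), the percentile is taken as the smallest order statistic whose empirical cumulative frequency reaches the requested level, the Poisson sampler is Knuth's multiplication method, and the 60 counts are made-up example data.

```python
import math
import random
import statistics

def empirical_percentile(counts, level):
    """Smallest observed value whose empirical cumulative frequency reaches level."""
    ordered = sorted(counts)
    rank = math.ceil(level * len(ordered) - 1e-9)  # 1-based order statistic
    return ordered[max(rank, 1) - 1]

def poisson_draw(rng, mu):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mu)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def bootstrap_limit(counts, level, parametric=False, n_boot=1000, seed=0):
    """Median of the n_boot percentile limits computed on resamples of size n."""
    rng = random.Random(seed)
    n = len(counts)
    mu_hat = sum(counts) / n  # Poisson maximum likelihood estimate
    limits = []
    for _ in range(n_boot):
        if parametric:
            resample = [poisson_draw(rng, mu_hat) for _ in range(n)]
        else:
            resample = rng.choices(counts, k=n)  # resampling with replacement
        limits.append(empirical_percentile(resample, level))
    return statistics.median(limits)

# Hypothetical sample of n = 60 surface counts (illustration only)
counts = [0] * 40 + [1] * 12 + [2] * 5 + [3] * 2 + [5]
for level in (0.95, 0.99):
    print(level,
          bootstrap_limit(counts, level),
          bootstrap_limit(counts, level, parametric=True))
```

Taking the median over the B = 1000 resample limits mirrors the choice described in Table 1; other assumed distributions would only change how `resample` is generated in the parametric branch.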

For every method, limits were calculated at both the 95th and the 99th percentile levels.

To evaluate the above calculation methods, simulated data were generated while considering the extreme parameter conditions listed in Table 2. Their corresponding densities are plotted in Figure 2. Then, 30 sphere packing space filling design points, represented in an experimental cube in Figure 3, were employed to survey points within the extreme parameter space [13]. As listed in Table 3, when combining the sphere packing design points and the not yet included extreme corner points, 35 parameter point locations were considered. Their mean, variance and “true” percentiles are also provided in Table 3. The selected parameters represent conditions which are consistent with historically observed environmental monitoring surface data. However, these should not be considered universal across all possible situations that could be observed in the field. The selected base conditions enable us to illustrate and characterize the performance of the environmental monitoring estimation methods outlined in the previous section within the selected parameter context. Table 4 lists the assumed informative prior distributions employed with the Bayesian method, which were also chosen based on the range of data historically observed.
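A “true” percentile of a source distribution is the smallest count whose cumulative probability reaches the requested level. The Python sketch below computes it for the four high-mean corner points; the function names are hypothetical, and the Poisson and ZIP cases use an exact Poisson probability function (signalled here by `k=None`) rather than the large-k Negative Binomial approximation, which is an assumption of this sketch.

```python
import math

def base_pmf(x, mu, k=None):
    """Poisson pmf when k is None, otherwise Negative Binomial pmf with dispersion k."""
    if k is None:
        return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))
    log_p = (math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
             + k * math.log(k / (k + mu)) + x * math.log(mu / (k + mu)))
    return math.exp(log_p)

def true_percentile(level, mu, k=None, pr_zero=0.0):
    """Smallest count x with P(X <= x) >= level under the (zero-inflated) distribution."""
    cumulative, x = 0.0, 0
    while True:
        p = (1.0 - pr_zero) * base_pmf(x, mu, k)
        if x == 0:
            p += pr_zero  # structural zeros add mass at zero only
        cumulative += p
        if cumulative >= level:
            return x
        x += 1

corner_points = {
    "Poisson (mu=2.9)": dict(mu=2.9),
    "Negative Binomial (mu=2.9, k=1)": dict(mu=2.9, k=1),
    "ZIP (mu=2.9, Pr.(zero)=0.6)": dict(mu=2.9, pr_zero=0.6),
    "ZINB (mu=2.9, k=1, Pr.(zero)=0.6)": dict(mu=2.9, k=1, pr_zero=0.6),
}
for name, params in corner_points.items():
    p95 = true_percentile(0.95, **params)
    p99 = true_percentile(0.99, **params)
    print(f"{name}: 95% = {p95}, 99% = {p99}")
```

The printed limits (6 and 7, 10 and 15, 5 and 7, 7 and 12) agree with the corresponding rows of Table 3.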

Sample sizes of n=60 and n=300 were used for each combination, representing typical small and large sample sizes for the counts in a specific room, with 50 experimental replicates for each combination. Hence, a total of 3,500 simulated datasets (35 parameter points × 2 sample sizes × 50 replicates) were created using the R script provided in Appendix A.
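The data generation step can be illustrated with a short Python sketch. It is not the R script of Appendix A: the generator below is one common construction (a structural zero with probability Pr.(zero), otherwise a Poisson count whose rate is drawn from a gamma distribution with shape k and scale μ/k), the function names and the seed are hypothetical, and only the eight corner points of Table 2 are simulated rather than all 35 parameter points.

```python
import math
import random

def poisson_draw(rng, mu):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mu)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def zinb_draw(rng, mu, k, pr_zero):
    """One zero-inflated Negative Binomial count via a gamma-Poisson mixture."""
    if rng.random() < pr_zero:
        return 0  # structural zero
    rate = rng.gammavariate(k, mu / k)  # shape k, scale mu/k: mean mu, variance mu^2/k
    return poisson_draw(rng, rate)

def simulate_dataset(rng, n, mu, k, pr_zero):
    """A simulated sample of n counts for one parameter point."""
    return [zinb_draw(rng, mu, k, pr_zero) for _ in range(n)]

# (mu, k, Pr.(zero)) for the eight corner points of Table 2; k = 1000 stands in
# for the setting without dispersion, as in the paper.
corner_points = [
    (0.1, 1000, 0.0), (2.9, 1000, 0.0), (0.1, 1, 0.0), (2.9, 1, 0.0),
    (0.1, 1000, 0.6), (2.9, 1000, 0.6), (0.1, 1, 0.6), (2.9, 1, 0.6),
]

rng = random.Random(2020)
datasets = {
    (mu, k, pr_zero, n, rep): simulate_dataset(rng, n, mu, k, pr_zero)
    for (mu, k, pr_zero) in corner_points
    for n in (60, 300)
    for rep in range(50)
}
print(len(datasets))  # 8 corner points x 2 sample sizes x 50 replicates = 800
```

Extending `corner_points` to the 35 parameter points of Table 3 would reproduce the 3,500-dataset layout described above.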

Table 2: Extreme Parameter Combinations and Resulting Distributions

Inflation (probability of structural zeros,   Dispersion                   Mean Count     Source Distribution
irrespective of distribution driven zeros)    Parameter k                  Parameter μ
No [Pr.(zero)=0.0]                            No [k=1000 (large number)]   Low (0.1)      Poisson (μ=0.1)
No [Pr.(zero)=0.0]                            No [k=1000 (large number)]   High (2.9)     Poisson (μ=2.9)
No [Pr.(zero)=0.0]                            Yes [k=1]                    Low (0.1)      Negative Binomial (μ=0.1, k=1)
No [Pr.(zero)=0.0]                            Yes [k=1]                    High (2.9)     Negative Binomial (μ=2.9, k=1)
Yes [Pr.(zero)=0.6]                           No [k=1000 (large number)]   Low (0.1)      ZIP (μ=0.1, Pr.(zero)=0.6)
Yes [Pr.(zero)=0.6]                           No [k=1000 (large number)]   High (2.9)     ZIP (μ=2.9, Pr.(zero)=0.6)
Yes [Pr.(zero)=0.6]                           Yes [k=1]                    Low (0.1)      ZINB (μ=0.1, k=1, Pr.(zero)=0.6)
Yes [Pr.(zero)=0.6]                           Yes [k=1]                    High (2.9)     ZINB (μ=2.9, k=1, Pr.(zero)=0.6)

Note: It was assumed that k=1000 is large enough to approximate the setting without dispersion.

Table 3: Parameter Values for Simulated Experiments and “True” Percentiles from Source Distributions

Inflation   Dispersion   Mean Count    Source          Mean   Variance   “True” Percentiles   Sphere Packing   Corner
Pr.(zero)   k            Parameter μ   Distribution                      95%       99%        Design Point     Point
0.00        1            0.10          Negative Bin.   0.10   0.11       1         1          -                X
0.00        1            2.90          Negative Bin.   2.90   11.31      10        15         X                X
0.00        222          0.10          Negative Bin.   0.10   0.10       1         1          X                -
0.00        545          1.68          Negative Bin.   1.68   1.69       4         5          X                -
0.00        1,000        0.10          Poisson         0.10   0.10       1         1          -                X
0.00        1,000        1.20          Poisson         1.20   1.20       3         4          X                -
0.00        1,000        2.90          Poisson         2.90   2.91       6         7          X                X
0.01        463          2.90          ZINB            2.87   3.14       6         8          X                -
0.02        640          0.48          ZINB            0.47   0.48       2         3          X                -
0.07        1            1.13          ZINB            1.05   2.47       4         7          X                -
0.15        1,000        0.10          ZIP             0.09   0.09       1         1          X                -
0.16        222          2.11          ZINB            1.77   3.31       5         6          X                -
0.18        846          1.99          ZINB            1.63   3.09       4         6          X                -
0.24        1            0.10          ZINB            0.08   0.09       1         1          X                -
0.25        512          1.24          ZINB            0.93   1.60       3         4          X                -
0.29        531          2.90          ZINB            2.06   5.90       6         7          X                -
0.31        647          0.10          ZINB            0.07   0.07       1         1          X                -
0.32        1            1.39          ZINB            0.95   3.16       4         7          X                -
0.34        1,000        1.08          ZIP             0.71   1.26       3         4          X                -
0.34        1,000        2.90          ZIP             1.91   5.87       5         7          X                -
0.34        73           2.90          ZINB            1.91   5.94       6         7          X                -
0.43        287          0.57          ZINB            0.32   0.48       2         3          X                -
0.44        746          1.99          ZINB            1.11   2.95       4         5          X                -
0.50        996          0.10          ZINB            0.05   0.05       0         1          X                -
0.57        466          2.90          ZINB            1.25   4.58       5         7          X                -
0.59        34           1.42          ZINB            0.58   1.38       3         4          X                -
0.60        1            0.10          ZINB            0.04   0.05       0         1          X                X
0.60        1            2.66          ZINB            1.06   6.54       6         11         X                -
0.60        1            2.90          ZINB            1.16   7.67       7         12         -                X
0.60        478          1.34          ZINB            0.54   1.21       3         4          X                -
0.60        581          0.10          ZINB            0.04   0.04       0         1          X                -
0.60        938          2.82          ZINB            1.13   4.11       5         7          X                -
0.60        983          1.25          ZINB            0.50   1.09       3         4          X                -
0.60        1,000        0.10          ZIP             0.04   0.04       0         1          -                X
0.60        1,000        2.90          ZIP             1.16   4.31       5         7          -                X

Figure 2: Density Comparisons at the Extreme Parameter Combinations (Corner Points)

Figure 3: Experimental Cube with Data Generation Parameters

Table 4: Bayesian Informative Prior per Assumed Distribution

                         Assumed Informative   Assumed Distribution that Applies to
Parameter                Prior Distribution    Poisson   Negative Binomial   ZIP   ZINB
Mean Count Parameter μ   uniform(0.1, 2.9)     X         X                   X     X
Dispersion k             uniform(1, 1000)      -         X                   -     X
Inflation Pr.(zero)      uniform(0, 0.6)       -         -                   X     X

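To make the Bayesian row of Table 1 concrete for the simplest case, the sketch below handles an assumed Poisson distribution with the uniform(0.1, 2.9) prior on μ from Table 4. It deliberately swaps the paper's Stan/NUTS sampler for a plain grid approximation of the posterior, which is practical only because there is a single parameter; the grid size, seed, function names, and example counts are all assumptions made for illustration.

```python
import math
import random
import statistics

def poisson_percentile(level, mu):
    """Smallest count x with P(X <= x) >= level for a Poisson with mean mu."""
    cumulative, x = 0.0, 0
    while True:
        cumulative += math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))
        if cumulative >= level:
            return x
        x += 1

def bayesian_poisson_limit(counts, level, lo=0.1, hi=2.9,
                           grid_size=281, n_draws=20000, seed=0):
    """Median percentile limit over posterior draws of mu (grid approximation)."""
    rng = random.Random(seed)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    total, n = sum(counts), len(counts)
    # Poisson log-likelihood up to a constant; the flat prior adds only a constant
    log_post = [total * math.log(mu) - n * mu for mu in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    # Percentile limit implied by each candidate value of mu
    limit_for = {mu: poisson_percentile(level, mu) for mu in grid}
    draws = rng.choices(grid, weights=weights, k=n_draws)
    return statistics.median(limit_for[mu] for mu in draws)

# Hypothetical sample of n = 60 surface counts (illustration only)
counts = [0] * 40 + [1] * 12 + [2] * 5 + [3] * 2 + [5]
limit_95 = bayesian_poisson_limit(counts, 0.95)
limit_99 = bayesian_poisson_limit(counts, 0.99)
print(limit_95, limit_99)
```

For the two- and three-parameter distributions (Negative Binomial, ZIP, ZINB) a grid quickly becomes impractical, which is one reason an MCMC sampler such as the one used in the paper is the usual choice.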
Assessing Akaike Information Criteria Goodness of Fit Patterns

For each of the simulated datasets, four candidate distributions (i.e., Poisson, Negative Binomial, ZIP, and ZINB) were fitted, even when the assumed distribution was not the true source distribution. However, as listed in Table 5, in 41 of the 14,000 possible instances (3,500 datasets × 4 assumed distributions) it was not feasible to fit an assumed distribution due to convergence issues. This situation may occasionally be observed when the sample size is small, the mean counts are low, and zeros predominate [14, 15]. As a result, only 13,959 feasible cases were part of the assessment.

Table 5: List of Assumed Distribution Fit with Convergence Issues

                            Source Distribution Parameters
Assumed                     Inflation   Dispersion   Mean Count    Number of
Distribution        n       Pr.(zero)   k            Parameter μ   Convergence Issues
Negative Binomial   60      0.00        222          0.1           1
Negative Binomial   60      0.31        647          0.1           1
Negative Binomial   60      0.50        996          0.1           2
Negative Binomial   60      0.60        1            0.1           3
Negative Binomial   60      0.60        581          0.1           7
Negative Binomial   60      0.60        1,000        0.1           3
Negative Binomial   300     0.00        1,000        2.9           1
ZINB                60      0.00        222          0.1           1
ZINB                60      0.00        1,000        2.9           3
ZINB                60      0.31        647          0.1           1
ZINB                60      0.50        996          0.1           2
ZINB                60      0.60        1            0.1           3
ZINB                60      0.60        581          0.1           7
ZINB                60      0.60        1,000        0.1           3
ZIP                 60      0.00        1,000        2.9           3

The goodness-of-fit of each fitted distribution to the data was assessed using the Akaike Information

Criterion (AIC) [16]. This criterion helps define the best model in a way that balances the number of

parameters with the objective of avoiding either over- or under-fitting the data:

$$\mathrm{AIC} = 2k - 2\ln(\hat{L}) \tag{4}$$

where “k”, in this case, is the number of parameters, and “$\hat{L}$” is the maximum value of the likelihood function.
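Equation (4) is straightforward to evaluate once a distribution has been fitted. The Python sketch below does so for an assumed Poisson distribution only, where the maximum likelihood estimate is the sample mean and there is one parameter; the example counts and function names are hypothetical. It also replicates the same data five times to show how the likelihood term grows with the sample size.

```python
import math

def poisson_log_likelihood(counts, mu):
    """Log-likelihood of independent Poisson counts with mean mu."""
    return sum(x * math.log(mu) - mu - math.lgamma(x + 1) for x in counts)

def aic(log_likelihood, n_params):
    """Equation (4): AIC = 2k - 2 ln(L-hat), with k the number of parameters."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical sample of n = 60 surface counts (illustration only)
counts = [0] * 40 + [1] * 12 + [2] * 5 + [3] * 2 + [5]
mu_hat = sum(counts) / len(counts)  # Poisson maximum likelihood estimate

aic_small = aic(poisson_log_likelihood(counts, mu_hat), n_params=1)
# Same data replicated five times (n = 300): the estimate is unchanged, but the
# log-likelihood is a sum over observations and so is five times larger.
aic_large = aic(poisson_log_likelihood(counts * 5, mu_hat), n_params=1)

print(f"n=60:  AIC = {aic_small:.2f}")
print(f"n=300: AIC = {aic_large:.2f}")
```

Because of this scaling, AIC values are only comparable between distributions fitted to the same dataset, not across sample sizes.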

It is possible that some of the AIC estimates are too close to one another to tell whether the difference between fitted options is meaningful. A goodness of fit assessment using AIC is relative: the smaller the AIC value in comparison to the others, the better the fit. It should therefore be expected that the AIC tends to be smaller when the assumed distribution is consistent with the source distribution input parameters. However, there are instances where the AIC for assumed distributions that differ from the source distribution is very close to the AIC obtained when the assumed and the source distribution input parameters are consistent with each other. This happened more often for the lower mean count levels. For illustration purposes, Figure 4 provides box-plots of the AIC per sample size level for the various assumed distributions and the source distributions at the experiment corner points, split by the mean count levels. The split by mean count level was motivated by the clear differences in scale between AIC at the low and high mean count levels. When drilling down into these plots some visual patterns emerged:

• Irrespective of the mean count level, the higher sample size level tended to exhibit a higher AIC. This could be expected since the AIC is a likelihood-based criterion and the log-likelihood, being a sum over observations, scales with the sample size.

• When the mean count level was at the low level, the AIC values from different assumed distributions tended to be close to each other at a given sample size and source distribution level. Similar patterns were also observed at the high mean count level when the source distribution was Poisson.

• But when the mean count level was at the high level and the true source distribution was other than Poisson, there appeared to be differences in AIC values, indicating that Poisson, followed by ZIP, did not fit very well when the source distribution was different.

• Although the visual patterns in Figure 4 suggest that more than one distribution may fit a dataset comparably well, there is no clear guidance on when AIC values are close enough to each other to suggest that there is no significant difference between the distribution fits. The AIC, as well as other commonly employed metrics (like the Bayesian Information Criterion or BIC), can only provide a ranking. In most cases evaluated, however, it is possible to use a p-value based assessment like the chi-square test. Also, Vuong [17] proposed the use of a pairwise test for the types of distributions that are being considered in this paper. Although Wilson indicated that Vuong’s test has been misused for zero-inflated models under non-nested conditions [18], Merkle et al. [19] evaluated the adequacy of Vuong’s test under those conditions. In this context, two models are “nested” when one reduces to the other when certain parameters are fixed [18]. In the end, practitioners in the field should exercise cautious judgement.

Figure 4: Comparison of AIC Results by Sample Size and Assumed Distribution Levels at the Corner Point Source Distributions and at the Extreme Mean Count Levels

Assessing Limit Patterns


After fitting the various assumed distributions to each of the simulated datasets, the main interest was the comparison of 95th and 99th percentile limits obtained with all the methods. The 13,959 fitted distribution cases yield 111,672 comparisons across the 4 estimation methods (i.e., Bayesian with informative priors, parametric bootstrap, nonparametric bootstrap, and the traditional methods) and the 2 sets of limits.

Figures 5 and 6 show the average difference between the estimated percentile limits and the “true” limits from the source distributions (the “true” percentile limits are provided in Table 3). These show that when the mean count level was low (i.e., less than 1.5), or when there was a combination of high mean count level (i.e., over 1.5) without extreme inflation levels (i.e., between 0.15 and 0.45), the average limit differences were relatively low, and in some cases very close to zero. But when there was a combination of high mean count level (i.e., over 1.5) with extreme inflation levels (i.e., below 0.15 or above 0.45), there was more variability in the limit differences (especially at the 99th percentile limits). That suggests an interaction between the mean count level and the inflation parameter in their effect on the limit difference. These plots also revealed that at the high mean count parameter levels, the differences tended to show a decreasing slope as a function of the dispersion parameter “k”, varying with the calculation method and the assumed distribution. The average differences appeared to be consistent between the two sample size levels evaluated (i.e., 60 and 300). Figure 6 shows that for the 99th percentile limits, the Bayesian method provides lower percentile estimates than the other methods, usually closer to the true limits, except sometimes at higher mean count levels when the assumed distribution is Negative Binomial. However, this pattern is not present for the 95th percentile limits in Figure 5.

Figure 5: Plotting the 95% Limit Difference (True minus Estimated) for Cases that Converged

Note: a 95% confidence interval is included around each method by dispersion slope.

Figure 6: Plotting the 99% Limit Difference (True minus Estimated) for Cases that Converged

Note: a 95% confidence interval is included around each method by dispersion slope.

Figures 7 and 8 provide violin plots of the differences between the “true” and the estimated limits by estimation method, sample size, and percentile level for the lowest and highest mean count parameter levels. Violin plots show the probability density of the data for different groups, usually smoothed by a kernel. In these figures, mean differences are denoted as red and blue circles, and the black dashed lines are the reference lines where the differences are zero. In addition, Figures 9 and 10 directly compare the estimated and the “true” limits for the lowest and highest mean count parameter levels. These plots show that:

• When the mean count was the lowest evaluated (0.1) and the limit was estimated at the 95th percentile level, the bootstrap and traditional methods tended to show limit differences around zero. But in that case, when the source distribution was ZIP or ZINB (i.e., the two distributions with high inflation), the Bayesian method with informative priors tended to yield limits higher than the “true” limit. For all source distributions, higher sample sizes led to limits closer to the true one.

• When the mean count was the lowest evaluated (0.1) and the limit was estimated at the 99th percentile level, two patterns were observed:

o When the source distribution was ZIP or ZINB, the Bayesian method tended to show limit differences around zero. But the bootstrap and traditional methods tended to yield limits lower than the “true” limit.

o When the source distribution was Poisson or Negative Binomial (no inflation), all the estimation methods tended to show limit differences around zero.

Higher sample sizes led to limits closer to the true one, especially when the source distribution was Poisson or Negative Binomial.

• When the mean count was the highest evaluated (2.9) and the limit was estimated at the 95th percentile level, the non-parametric bootstrap method tended to consistently show limit differences around zero irrespective of the source distribution. Especially when the assumed distribution was Poisson or ZIP and differed from the source distribution, the non-parametric bootstrap provided limits closer to the true limits than the other methods.

• When the mean count was the highest evaluated (2.9) and the limit was estimated at the 99th percentile level, two patterns were observed:

o When the source distribution was Negative Binomial or ZINB, all the estimation methods tended to yield limits lower than the “true” limit. But in some of the assumed distribution cases (when the assumed distribution was consistent with the source distribution), the Bayesian method with informative priors tended to show limit differences closer to zero.

o When the source distribution was Poisson or ZIP, the Bayesian method with informative priors tended to show limit differences around zero, while the other methods tended to result in limits lower than the “true” limit.


Patterns in Figures 5 to 8 may suggest some implementation strategies for the estimation methods:

• Because the average differences between the true and estimated limits were low when mean count levels were low (i.e., less than 1.5), or when there was a combination of high mean count level (i.e., over 1.5) without extreme inflation levels (i.e., between 0.15 and 0.45), it may be appropriate in that situation to employ the simplest estimation method (i.e., the traditional percentile).

• But when there was a combination of high mean count level (i.e., over 1.5) with extreme inflation levels (i.e., below 0.15 or above 0.45), no single method was optimal across all related conditions. For instance:

o When the calculated percentile was around 95%, the non-parametric bootstrap method appeared to yield the smallest average differences in most of the cases.

o When the calculated percentile was around 99%, the Bayesian method with informative priors appeared to yield the smallest differences in most of the cases.

Figure 7: Differences between True and Estimated Percentile Limits for the Lowest Mean Count Parameter Level (0.1) and the Dispersion and Inflation Corner Points

Note: Violin plots of the difference in limits (True-Estimated) at each percentile level (95th and 99th) by the assumed distribution (A), the source distribution (S) and sample size (red: n=60, blue: n=300). The mean differences are shown as dots in each color. Black dashed lines are drawn at the difference of 0 (a perfect match between true and estimated limits). Notation employed to label methods: BY_I=Bayesian with informative priors, BT_NP=nonparametric bootstrap, BT_P=parametric bootstrap, and Trad=traditional.

Figure 8: Differences between True and Estimated Percentile Limits for the Highest Mean Count Parameter Level (2.9) and the Dispersion and Inflation Corner Points

Note: Violin plots of the difference in limits (True-Estimated) at each percentile level (95th and 99th) by the assumed distribution (A), the source distribution (S) and sample size (red: n=60, blue: n=300). The mean differences are shown as dots in each color. Black dashed lines are drawn at the difference of 0 (a perfect match between true and estimated limits). Notation employed to label methods: BY_I=Bayesian with informative priors, BT_NP=nonparametric bootstrap, BT_P=parametric bootstrap, and Trad=traditional.

Figure 9: Estimated vs True Percentile Limits for Lowest Mean Count Parameter Level (0.1) per Method
Panels: Bayesian (with Informative Priors), Nonparametric Bootstrap, Parametric Bootstrap, Traditional

Figure 10: Estimated vs True Percentile Limits for Highest Mean Count Level (2.9) per Method
Panels: Bayesian (with Informative Priors), Nonparametric Bootstrap, Parametric Bootstrap, Traditional

Concluding Remarks

In this paper, we explored statistical methods to estimate the limits of the microbial counts through a

carefully designed simulation study. The following was observed through the study:

1. When mean counts are close to zero, fitting limits under any of the assumed distributions would

yield similar results. In that case it may be reasonable to fit the simplest distribution possible (e.g.,

Poisson).

t
2. When mean count levels were high and source distributions were over-dispersed (like Negative

ip
Binomial and ZINB), estimation methods may underestimate the actual limits.

3. On average, the limit estimation methods evaluated tended to yield similar results when the mean count was low, or when a high mean count level was not combined with extreme inflation levels. In that situation, it may be appropriate to estimate the limits using the simplest methodology (e.g., the traditional method).

4. When a high mean count level was combined with extreme inflation levels, the limit differences were more variable. In that situation, when estimating a 95th percentile limit, the nonparametric bootstrap method tended on average to yield limits closer to the source distribution limits. Similarly, when estimating a 99th percentile limit, the Bayesian method with informative priors tended on average to yield limits closer to the source distribution limits.

5. The Akaike Information Criterion (AIC) can provide a relative goodness-of-fit ranking among several assumed distributions. Other p-value-based goodness-of-fit alternatives (such as the chi-square test or Vuong’s test) are available but are not always informative in detecting differences between multiple groups with similar ranking. In the end, practitioners in the field should exercise careful judgement.

6. Sometimes an assumed distribution cannot be fit to the data, especially when the sample size is limited and the assumed distribution is skewed (as in the case of the Negative Binomial distribution). For that reason, it is important to assess multiple distributions when trying to obtain data-driven environmental monitoring limits.
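Remarks 5 and 6 can be illustrated with a short R sketch. It is not the code used in the study: the simulated counts, the seed, the helper name `fit_or_null`, and the use of `MASS::glm.nb` for the Negative Binomial fit are all assumptions made for illustration. The sketch fits intercept-only Poisson and Negative Binomial models, records a failed fit as missing instead of stopping, ranks the successful fits by AIC, and reports the 95th percentile limit implied by each fit.

```r
library(MASS)  # glm.nb() for the Negative Binomial fit (assumed choice)

set.seed(1)    # arbitrary seed, used only to make the sketch reproducible
# Hypothetical over-dispersed counts: mean 2.9, Negative Binomial size 0.5
counts <- rnbinom(300, mu = 2.9, size = 0.5)

# Return the fitted model, or NULL when the fit stops with an error
fit_or_null <- function(expr) tryCatch(expr, error = function(e) NULL)

fits <- list(
  Poisson = fit_or_null(glm(counts ~ 1, family = poisson)),
  NegBin  = fit_or_null(glm.nb(counts ~ 1))
)

# AIC per assumed distribution; NA marks a distribution that could not be fit
aic <- sapply(fits, function(f) if (is.null(f)) NA_real_ else AIC(f))
print(sort(aic))  # smaller AIC indicates the better relative fit

# 95th percentile limit implied by each fitted distribution
limit_95 <- c(
  Poisson = if (is.null(fits$Poisson)) NA else
    qpois(0.95, lambda = exp(coef(fits$Poisson)[[1]])),
  NegBin  = if (is.null(fits$NegBin)) NA else
    qnbinom(0.95, mu = exp(coef(fits$NegBin)[[1]]), size = fits$NegBin$theta)
)
print(limit_95)
```

With over-dispersed data such as these, the Poisson fit gives a lower limit than the Negative Binomial fit, which is one way the underestimation noted in remark 2 can arise when too simple a distribution is assumed.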

As indicated earlier, the limit estimates in this paper are based on univariate distributions for the particle counts (e.g., intercept-only count generalized linear models) and are suited to estimating limits for one area at a time (e.g., a single clean room). However, the estimation methods described can be extended to count-based regression models with multiple predictors when appropriate covariate information is available (for example, room temperature or moisture level could be included as covariates), allowing appropriate limits to be established for the situation at hand.
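A minimal R sketch of such an extension is given below. It was not part of the study: the data are simulated, and the covariate names (`temperature`, `moisture`), their ranges, the coefficients, and the choice of a Poisson regression are assumptions made for illustration only. The sketch fits a count regression and derives 95th and 99th percentile limits for two hypothetical room conditions from the fitted means.

```r
set.seed(2)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical monitoring data; covariate ranges and effects are made up
n <- 300
dat <- data.frame(
  temperature = runif(n, 18, 24),  # room temperature
  moisture    = runif(n, 30, 60)   # moisture level
)
dat$count <- rpois(n, lambda = exp(-6 + 0.2 * dat$temperature + 0.03 * dat$moisture))

# Poisson regression with both covariates as predictors
fit <- glm(count ~ temperature + moisture, family = poisson, data = dat)

# Conditions for which limits are wanted (hypothetical values)
conditions <- data.frame(temperature = c(19, 23), moisture = c(35, 55))
mu_hat <- predict(fit, newdata = conditions, type = "response")

# Percentile limits implied by the fitted Poisson mean at each condition
limits <- data.frame(
  conditions,
  limit_95 = qpois(0.95, lambda = mu_hat),
  limit_99 = qpois(0.99, lambda = mu_hat)
)
print(limits)
```

The same pattern would apply with an over-dispersed or zero-inflated model in place of the Poisson regression, using that model's quantile function to derive the limits.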

Acknowledgement

The authors thank Perceval Sondag of Merck & Co., Inc., Kenilworth, NJ, USA, for his support with questions on Bayesian inference.

References

[1] USP <1116>, “Microbiological Evaluation of Clean Rooms and Other Controlled Environments”.

[2] J.D. Wilson, “Setting alert/action limits for environmental monitoring programs”, PDA Journal of Pharmaceutical Science and Technology, 1997, V.51(4), pp. 161–162.

[3] D. Hoffman, “Negative binomial control limits for count data with extra-Poisson variation”, Pharmaceutical Statistics, 2004, V.2(2), pp. 127–132.

[4] D. Lambert, “Zero-inflated Poisson Regression Models with an Application to Defects in Manufacturing”, Technometrics, 1992, V.34(1), pp. 1–14.

[5] H. Yang, W. Zhao, T. O'day, and W. Fleming, “Environmental Monitoring: Setting Alert and Action Limits Based on a Zero-Inflated Model”, PDA Journal of Pharmaceutical Science and Technology, 2013, V.67(1), pp. 2–8.

[6] B. Efron and G. Gong, “A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation”, The American Statistician, 1983, V.37(1), pp. 36–48.

[7] B. Efron, “The Bootstrap and Modern Statistics”, Journal of the American Statistical Association, 2000, V.95(452), pp. 1293–1296.

[8] C.Z. Mooney and R.D. Duval, “Bootstrapping: a nonparametric approach to statistical inference”, Sage Publications, 1993.

[9] A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin, “Bayesian Data Analysis”, Chapman and Hall/CRC, 2004.

[10] D. Gamerman and H.F. Lopes, “Markov chain Monte Carlo: Stochastic simulation for Bayesian inference”, Boca Raton: Chapman and Hall/CRC, 2006.

[11] Stan Development Team, “RStan: the R interface to Stan”, R package version 2.19.2, 2019.

[12] M.D. Hoffman and A. Gelman, “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo”, Journal of Machine Learning Research, 2014, V.15, pp. 1593–1623.

[13] L. Pronzato and W.G. Müller, “Design of computer experiments: space filling and beyond”, Statistics and Computing, 2012, V.22, pp. 681–701.

[14] L. Xu, A.D. Paterson, W. Turpin, and W. Xu, “Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data”, PLOS ONE, DOI:10.1371 (July 6, 2015), pp. 1–30.

[15] C.D. Desjardins, “Evaluating the Performance of Two Competing Models of School Suspension Under Simulation - The Zero-Inflated Negative Binomial and the Negative Binomial Hurdle”, University of Minnesota doctoral dissertation (May 2013), p. 48.

[16] H. Akaike, “A new look at the statistical model identification”, IEEE Transactions on Automatic Control, 1974, V.19(6), pp. 716–723.

[17] Q.H. Vuong, “Likelihood ratio tests for model selection and non-nested hypotheses”, Econometrica, 1989, V.57, pp. 307–333.

[18] P. Wilson, “The misuse of the Vuong test for non-nested models to test for zero-inflation”, Economics Letters, 2015, V.127, pp. 51–53.

[19] E.C. Merkle, D. You, and K.J. Preacher, “Testing non-nested structural equation models”, Psychological Methods, 2016, V.21(2), pp. 151–163.

Appendix A: R Program to Generate Simulated Datasets

# Specify working folder for files
path="C:/ "
# Load "input.param.file.csv" with "Inflation", "Dispersion", and "Mean.Count"
# input parameters provided in first three columns of Table 3.
inputfile <- read.csv("input.param.file.csv", header = TRUE)
n0 <- nrow(inputfile)   # number of parameter combinations
n1 <- 60                # smaller sample size
n2 <- 300               # larger sample size
replicates <- 50        # simulated datasets per combination and sample size
n <- c(rep(n1, n0), rep(n2, n0))
# Stack the parameter table twice, once per sample size; n becomes column 1
inputfile <- cbind(n, rbind(inputfile, inputfile))
library(VGAM)           # provides rzinegbin()
set.seed(3622)
for (i in 1:(n0 * 2))
{
  for (replicate in 1:replicates)
  {
    # Draw zero-inflated negative binomial counts for parameter row i
    temp <- rzinegbin(n = inputfile$n[i], mu = inputfile$Mean.Count[i],
                      size = inputfile$Dispersion[i], pstr0 = inputfile$Inflation[i])
    # Attach the scenario labels to every simulated count
    temp <- data.frame(n = rep(inputfile$n[i], inputfile$n[i]),
                       Inflation = rep(inputfile$Inflation[i], inputfile$n[i]),
                       Dispersion = rep(inputfile$Dispersion[i], inputfile$n[i]),
                       Mean.Count = rep(inputfile$Mean.Count[i], inputfile$n[i]),
                       replicate = rep(replicate, inputfile$n[i]),
                       Count = temp)
    # Start the output on the first pass, then append
    if (i == 1 && replicate == 1)
    {
      simulated_data <- temp
    }
    else
    {
      simulated_data <- rbind(simulated_data, temp)
    }
  }
}
write.csv(simulated_data, "simulated_data.csv", row.names = FALSE)
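To make the data-generating step more concrete, the sketch below mimics a single zero-inflated Negative Binomial draw in base R and computes the corresponding source-distribution percentile. It is not part of the original program: the parameter values, the seed, and the helper name `zinb_quantile` are assumptions for illustration, and the empirical percentile at the end is shown only as a simple comparison, not as the paper's definition of any of its methods.

```r
set.seed(3)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical parameter values for one simulation scenario
pstr0 <- 0.3   # inflation: probability of a structural zero
size  <- 0.5   # Negative Binomial size parameter
mu    <- 2.9   # mean of the Negative Binomial component
n     <- 300   # sample size

# A structural zero with probability pstr0, otherwise a Negative Binomial count
structural_zero <- rbinom(n, size = 1, prob = pstr0)
counts <- ifelse(structural_zero == 1, 0, rnbinom(n, mu = mu, size = size))

# Smallest count k whose zero-inflated cumulative probability reaches p
zinb_quantile <- function(p, mu, size, pstr0, k_max = 1000) {
  k <- 0:k_max
  cdf <- pstr0 + (1 - pstr0) * pnbinom(k, mu = mu, size = size)
  k[which(cdf >= p)[1]]
}

true_95 <- zinb_quantile(0.95, mu, size, pstr0)        # source-distribution limit
empirical_95 <- quantile(counts, 0.95, names = FALSE)  # sample percentile
print(c(true = true_95, empirical = empirical_95))
```

Setting `pstr0 = 0` reduces `zinb_quantile` to the ordinary Negative Binomial quantile, which gives a quick check of the helper.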
