Multivariate Statistical Process Control: Process Monitoring Methods and Applications, 1st Edition, Zhiqiang Ge
Advances in Industrial Control
Multivariate Statistical
Process Control
Process Monitoring Methods
and Applications
Zhiqiang Ge
Department of Control Science and Engineering
Institute of Industrial Process Control
Zhejiang University
Hangzhou, Zhejiang, People’s Republic of China

Zhihuan Song
Department of Control Science and Engineering
Institute of Industrial Process Control
Zhejiang University
Hangzhou, Zhejiang, People’s Republic of China
Series Editors’ Foreword
The series Advances in Industrial Control aims to report and encourage technology
transfer in control engineering. The rapid development of control technology
has an impact on all areas of the control discipline: new theory, new controllers,
actuators, sensors, new industrial processes, computer methods, new applications,
new philosophies, ... and new challenges. Much of this development work resides in
industrial reports, feasibility study papers, and the reports of advanced collabora-
tive projects. The series offers an opportunity for researchers to present an extended
exposition of such new work in all aspects of industrial control for wider and rapid
dissemination.
Statistical process control (SPC) has now evolved into a group of statistical tech-
niques for monitoring process performance and product quality. An important feature
of these methods is that they are used to monitor performance and are not a control
method per se since they contain no automatic feedback mechanism that defines a
control action to be taken once a fault condition has been detected.
The classical statistical process control tools are the Shewhart control charts that
monitor operational process means and use the assumption of the Gaussian dis-
tribution to design upper and lower control limits. The interval of process output
variation between these limits defines the normal operating region for the process. If
the process output strays outside these limits, this is taken as an indication that the
process is operating abnormally and that a process fault or disturbance has occurred.
Diagnosis and interpretation of what is causing the process upset is a much more
complicated issue and tools like the “cause and effect” chart or a root-cause analysis
were developed for this aspect of the performance monitoring problem.
However, the complexity of large-scale industrial operations and the ease with
which supervisory control and data acquisition systems and distributed computer
control systems can accumulate large quantities of online process data provided
an impetus to develop new statistical concepts and tools in the field of SPC. One
influential monograph that created interest in extending the techniques of SPC to
exploit the information in these industrial datasets was the Advances in Industrial
Control series monograph Data-Driven Techniques for Fault Detection and Diagno-
sis in Chemical Processes by E. L. Russell, L. H. Chiang, and R. D. Braatz (ISBN
978-1-85233-258-7, 2000).
Preface
Process safety and product quality are two important issues for modern industrial
processes. As one of the key technologies in the process system engineering and
control area, process monitoring methods can be used to improve product quality
and enhance process safety. If a process fault can be anticipated at an early stage
and corrected in time, product loss can be greatly reduced. Timely identification
of faults can also be used to initiate the removal of out of spec products thereby
preserving high standards of product quality. On the other hand, the decisions and
expert advice obtained from process monitoring procedures can also be used for
process improvement.
In general, process monitoring methods can be divided into three categories:
model-based methods, knowledge-based methods, and data-based methods.
Model-based methods can be traced to the 1970s, at which time they were mainly used in
aerospace, engine and power systems. Since model-based methods are based on
exact process models, they tend to give more accurate monitoring decisions than
the other two method categories, and this is the main advantage of the technique.
However, due to the complexity of modern industrial processes, it is very costly to
obtain an accurate process model, and in some situations it is not actually possible
to develop a process model, per se.
In contrast, knowledge-based methods are often more satisfactory because they
are based on the available real-time knowledge of the process behavior and the
experience of expert plant operators. Consequently, the monitoring results provided
by these methods tend to be more intuitive. However, the creation of the process
knowledge base is always a time-consuming and difficult operation requiring the
long-term accumulation of expert knowledge and experiences. Although there are
limitations to the model-based and knowledge-based methods, they are still popular
in particular areas, especially those in which process models can be easily obtained
or the process knowledge readily accumulated.
Compared to the model-based and knowledge-based methods, data-driven process
monitoring methods have no restrictions on the process model and the associ-
ated knowledge and consequently have become more and more popular in recent
years. The application areas for the data-driven methods include the important com-
plex industrial processes of the chemical, petrochemical, and biological process
industries.
The widespread use of distributed control systems and SCADA technology in
the process industries means that large amounts of real-time process data are read-
ily available and this has greatly accelerated the development of data-based process
monitoring methods. In addition, the progress made in developing data-mining pro-
cedures also provides new technologies for use in process monitoring systems. The
data-driven method of Multivariate Statistical Process Control (MSPC) has been the
subject of considerable interest from both industry and the academic community
as an important tool in the process monitoring area. Three books using Multivari-
ate Statistical Process Control for process monitoring have been published in 1999
(Wang), 2000 (Russell, et al.) and 2001 (Chiang, et al.). Whilst Wang introduced
some data-mining technology into MSPC for process monitoring and control, Russell
et al. (2000) are mainly concerned with the traditional MSPC methods for process
monitoring. Later, Chiang et al. (2001) extended data-based process monitoring to
incorporate model-based and knowledge-based methods. It should be noted that the
traditional MSPC method is limited to Gaussian, linear, stationary, and single-mode
processes. During the last decade, the MSPC-based process monitoring approach
has been intensively researched, and many papers have been published. However, to
the best of our knowledge, no book on MSPC-based process monitoring has been
published since 2001.
The purpose of this book is to report recent developments of the MSPC method
for process monitoring. The book first reviews recent research on MSPC and then
exposes and demonstrates the shortcomings of the existing approaches. This provides
the motivation for our research and applications work.
In our opinion, this book can be assimilated by advanced undergraduates and grad-
uate (Ph.D.) students, as well as industrial and process engineering researchers and
practitioners. However, the knowledge of basic multivariate statistical analysis is
required, and some familiarity with pattern recognition and machine learning would
be helpful.
Acknowledgement
The material presented in this book is the outcome of several years of research
by the authors. Many chapters of the book have gone through several rounds
of revision following discussions with colleagues, collaborators, and graduate and
Ph.D. students. We would specifically like to thank our colleagues and collaborators:
Prof. Youxian Sun for his great support of the work; Prof. Lei Xie, Prof. Uwe Kruger,
Prof. Furong Gao, Prof. Haiqing Wang, Prof. Chunjie Yang, Prof. Jun Liang, Prof.
Shuqing Wang, and Prof. Tao Chen, who have inspired many discussions on this and
related topics over the past years; former and current graduate and Ph.D. students,
Dr. Muguang Zhang, Dr. Yingjian Ye, Dr. Kun Chen, Ms. Ruowei Fu, Mr. Zhibo
Zhu, Mr. Hong Wang, Mr. Qiaojun Wen, Ms. Aimin Miao, Mr. Le Zhou, and Mr.
Hongwei Zhang for their participation in the discussions; and other supporting staff
of the Department of Control Science and Engineering at Zhejiang University. The
financial support from the National Natural Science Foundation of China (NSFC)
(60774067, 60974056, 61004134) and the National Project 973 (2012CB720500)
is gratefully acknowledged. Last but not least, we would also like to acknowledge
Mr. Oliver Jackson (Springer) and other supporting staff for their editorial comments
and detailed examination of the book.
Contents
1 Introduction
1.1 An Overview of This Book
1.2 Main Features of This Book
1.3 Organization of This Book
References
Index
Chapter 1
Introduction
With the wide use of the distributed control systems in modern industrial processes,
a large amount of data has been recorded and collected. How to efficiently use
these datasets for process modelling, monitoring and control is of particular interest,
as the traditional first-principle model-based method is difficult to use in modern
complex processes, which is mainly due to the high human and resource costs or
special environments. Different from the first-principle model-based method, the
data-based method rarely needs any prior knowledge of the process. By extracting
the useful information from the recorded process data, data-based models are also
able to model the relationship between different process variables. For process
monitoring in particular, the multivariate statistical process control (MSPC)-based
method has received much attention since the 1990s. The main idea of the
MSPC-based monitoring approach is to extract the useful information from the
original dataset and to construct statistics for monitoring. Most MSPC-based
methods can successfully handle the high-dimensional and correlated variables in
the process because they are able to reduce the dimension of the process variables
and decompose the correlations between them. Therefore, MSPC has become very
popular in industrial processes, especially when used for process monitoring.
So far, the most widely used MSPC methods for process monitoring are probably
principal component analysis (PCA) and partial least squares (PLS). By
extracting the principal components from the process data and then constructing $T^2$
and squared prediction error (SPE) statistics for process monitoring, PCA and PLS
can both handle high-dimensional and correlated process variables and provide
detailed monitoring results for each data sample in the process. Over the past
several decades, various improvements and modifications have been made to the
traditional PCA and PLS methods. Additionally, some new techniques have
been introduced into the process monitoring area, such as probabilistic PCA, factor
analysis, independent component analysis (ICA), kernel PCA, and support vector data
description (SVDD), most of which were originally proposed in other areas.
On the basis of these newly developed and introduced methods, the monitoring
performance has been improved for processes under specific conditions. For example,
when the process data are not strictly Gaussian distributed, the traditional PCA
method cannot provide good monitoring performance. With the
introduction of the ICA algorithm, the non-Gaussian data information can be efficiently
extracted, on the basis of which new monitoring statistics have been constructed for
process monitoring. Nonlinearity is a common data behaviour in many industrial
processes; although the conventional MSPC methods fail to describe nonlinear rela-
tionship among process variables, some new nonlinear modelling approaches have
recently been developed for process monitoring purposes, such as neural network-based
MSPC, kernel PCA, and the linear subspace method. Another common behaviour
of modern industrial processes is that their operating conditions may be time-varying or
that they may have multiple operation modes. In these cases, the conventional MSPC
methods may not provide satisfactory results. Fortunately, a lot of improvements
and new approaches have been developed for monitoring those processes which are
time-varying or have multiple operation modes. Some representatives of these meth-
ods are adaptive PCA, recursive PLS, moving-window PCA, multimodel method,
local model approach, Bayesian inference method, etc. While most of the exist-
ing MSPC methods focused on static industrial processes, only a few works have
explored the topic of dynamic process monitoring, especially for dynamic non-
Gaussian processes. In order to handle the noisy data information in the process,
several probabilistic modelling methods have been proposed, e.g. probabilistic PCA
and factor analysis. As each process variable is inherently a random variable, the
probabilistic model may be more appropriate to describe the relationships among
them. Actually, the monitoring performance of most processes has been improved
with the introduction of the probabilistic models. Recently, research attention has
also focused on plant-wide process monitoring, which is also known as large-scale
process monitoring problem in related references. For those processes, the tradi-
tional MSPC methods have been extended to the multiblock counterparts, such as
multiblock PCA and multiblock PLS. In addition, several new approaches have also
been developed on the basis of these multiblock monitoring methods.
The aim of this book is to provide an up-to-date overview of the MSPC method and
to introduce some new techniques for process monitoring.
Specifically, this book gives an overview of recently developed methods in different
aspects, namely non-Gaussian process monitoring, nonlinear process monitoring,
time-varying process monitoring, multimode process monitoring, dynamic process
monitoring, probabilistic process monitoring, and plant-wide process monitoring.
However, due to the limited space, only some methods have been selected for detailed
description in this book.
method was subsequently improved by using SVDD and factor analysis. For fault
diagnosis, the SVDD reconstruction-based non-Gaussian fault diagnosis method
is introduced, which can be considered as a complement of reconstruction-based
methods for fault diagnosis in the non-Gaussian case. Besides, a similarity-based
method is introduced for fault identification.
2. For nonlinear process monitoring, the statistical local approach (LA) is introduced
to the traditional kernel PCA modelling structure. This effectively eliminates the
restriction of the process data to the Gaussian distribution. Due to the offline
modelling and online implementation difficulties of existing methods, a new
viewpoint for nonlinear process monitoring, which is based on linear subspace
integration and Bayesian inference, is illustrated. Compared to existing nonlinear
methods, the new method can both improve the monitoring performance and
reduce the algorithm complexity.
3. In order to improve the monitoring performance for time-varying and multimode
processes, three new methods are given in detail. First, a local least squares
support vector regression-based method is introduced to remedy the deficiency
of the traditional recursive method for time-varying processes. This greatly
enhances the real-time monitoring performance. Second, a Bayesian
inference method is introduced for multimode process monitoring. Third, a
two-dimensional Bayesian-based method, which greatly reduces the reliance of the
monitoring method on process knowledge and experience, is also introduced for
monitoring nonlinear multimode processes.
4. Very few works have been reported on process monitoring for non-Gaussian
dynamic processes. A new monitoring method is introduced in this book for
these special processes, which is based on subspace model identification and LA.
In contrast to other methods, the new method is more efficient in monitoring
non-Gaussian dynamic processes.
5. While the probabilistic PCA method has recently been introduced for process
monitoring, it has an inherent limitation that it cannot determine the effective
dimensionality of latent variables. A Bayesian treatment of the PCA method
is developed and introduced. For multimode process monitoring, this Bayesian
regularization method is extended to its mixture form.
6. Based on the traditional multiblock method that has been proposed for plant-wide
process monitoring, a two-level MultiBlock ICA-PCA method is introduced.
Through this method, the process monitoring task can be reduced and the inter-
pretation of the process can be made more efficiently. Detailed descriptions of
fault detection, fault reconstruction and fault diagnosis tasks are provided.
Chapter 2 introduces some conventional MSPC methods that have been
widely used and developed in past years.
Chapter 3 gives introductions of several non-Gaussian process monitoring meth-
ods, including the two-step ICA-PCA strategy, ICA-factor analysis strategy, and the
SVDD algorithm.
Chapter 4 introduces the SVDD-based non-Gaussian fault reconstruction and
diagnosis algorithm, and the similarity-based fault identification method.
Chapters 5 and 6 introduce the nonlinear process monitoring topic through two
different viewpoints.
Chapter 7 gives a detailed description of the local model-based approach for time-
varying process monitoring.
Chapters 8 and 9 are both dedicated to the topic of multimode
process monitoring.
Chapter 10 focuses on dynamic process monitoring, which specifically introduces
two methods for monitoring non-Gaussian dynamic processes.
Chapter 11 introduces a Bayesian regularization of the probabilistic PCA method
and its mixture form, on the basis of which a probabilistic multimode process
monitoring approach is also demonstrated.
Chapter 12 gives an introduction of a two-level multiblock method for plant-wide
process monitoring.
Some of the materials presented in this book have been published in academic
journals by the authors, and are included after necessary modification and updates
to ensure accuracy and coherence.
Chapter 2
An Overview of Conventional MSPC Methods
$$X = \sum_{i=1}^{k} t_i p_i^T + E \tag{2.2}$$
For a new observation xnew , the prediction of the PCA model is given by
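For monitoring purposes, two statistics, $T^2$ and SPE, can be calculated based
on $t_{new}$ and $e_{new}$, respectively,
$$T^2_{new} = t_{new} \Lambda^{-1} t_{new}^T \tag{2.5}$$
$$SPE_{new} = e_{new} e_{new}^T \tag{2.6}$$
where $\Lambda$ is the eigenvalue matrix of the PCA model. The confidence limits of
the $T^2$ and SPE monitoring statistics are determined as follows (Chiang et al.
2001)
$$T^2 \le T^2_{lim} = \frac{k(n-1)}{n-k} F_{k,(n-k),\alpha} \tag{2.7}$$
$$SPE \le SPE_{lim} = g \chi^2_{h,\alpha}, \quad g = v/(2m), \quad h = 2m^2/v \tag{2.8}$$
Here $m$ and $v$ denote the mean and variance of the SPE values computed from the
normal operating data.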
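The monitoring statistics and limits in Eqs. (2.5), (2.7) and (2.8) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `fit_pca_monitor` and `monitor` are hypothetical, the data are autoscaled before the decomposition, $\alpha$ is read as the upper-tail probability of the F and chi-square distributions, and the SPE of a new sample is taken as the squared norm of its residual.

```python
import numpy as np
from scipy import stats


def fit_pca_monitor(X, k, alpha=0.01):
    """Fit a PCA monitoring model on normal operating data X (n samples x m variables)."""
    n = X.shape[0]
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xs = (X - mean) / std                      # autoscale each variable (assumption)
    eigval, eigvec = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigval)[::-1][:k]       # indices of the k largest eigenvalues
    P, lam = eigvec[:, order], eigval[order]
    # T^2 limit, Eq. (2.7); alpha is taken as the upper-tail probability
    t2_lim = k * (n - 1) / (n - k) * stats.f.ppf(1 - alpha, k, n - k)
    # SPE limit, Eq. (2.8), from the mean and variance of the training SPE values
    resid = Xs - Xs @ P @ P.T
    spe = np.sum(resid ** 2, axis=1)
    spe_mean, spe_var = spe.mean(), spe.var(ddof=1)
    g, h = spe_var / (2 * spe_mean), 2 * spe_mean ** 2 / spe_var
    spe_lim = g * stats.chi2.ppf(1 - alpha, h)
    return {"mean": mean, "std": std, "P": P, "lam": lam,
            "t2_lim": t2_lim, "spe_lim": spe_lim}


def monitor(model, x_new):
    """Return (T^2, SPE) for one new observation x_new (length m)."""
    xs = (x_new - model["mean"]) / model["std"]
    t = xs @ model["P"]                        # scores of the new sample
    e = xs - t @ model["P"].T                  # residual of the new sample
    return float(t @ (t / model["lam"])), float(e @ e)   # Eq. (2.5) and e e^T
```

A sample is flagged when either statistic exceeds its limit; a sensor bias that breaks the correlation structure typically shows up in the SPE first.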
The principle of the PLS method is similar to that of PCA, except that the PLS
method incorporates the information of the quality variables in the process. Given
a pair of process and quality datasets {X, Y}, PLS decomposes X and
Y into a combination of a score matrix T, loading matrices P and Q, and a weight
matrix W. The relationship between X and Y can be described by the following
equations
$$X = T P^T + E \tag{2.9}$$
$$Y = T Q^T + F \tag{2.10}$$
The regression matrix of the PLS model between the process and quality variables
can be determined as follows
$$R = W (P^T W)^{-1} Q^T \tag{2.11}$$
Given a new process data sample $(x_{new}, y_{new})$, the principal components are
calculated in the first step, which is given as
$$t_{new} = x_{new} W (P^T W)^{-1} \tag{2.12}$$
In the next step, the T 2 and SPE monitoring statistics can be constructed for pro-
cess monitoring. Different from the PCA approach, a total of four monitoring
statistics can be used for process monitoring in the PLS model, two of which
correspond to the data information of the quality variables. Detailed description
of the PLS model and its monitoring strategy can be found in Chiang et al.
(2001).
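To make Eqs. (2.9) to (2.12) concrete, the following sketch estimates W, P, Q and T and then forms the regression matrix R. The text above does not say how these matrices are computed; the sketch assumes the common NIPALS iteration on mean-centred data, and the function name `pls_nipals` is hypothetical.

```python
import numpy as np


def pls_nipals(X, Y, k, tol=1e-10, max_iter=500):
    """NIPALS PLS on mean-centred X (n x m) and Y (n x q); returns W, P, Q, T, R."""
    X = np.asarray(X, dtype=float).copy()
    Y = np.asarray(Y, dtype=float).copy()
    n, m = X.shape
    q = Y.shape[1]
    W, P = np.zeros((m, k)), np.zeros((m, k))
    Q, T = np.zeros((q, k)), np.zeros((n, k))
    for a in range(k):
        u = Y[:, [0]]                          # start from the first quality variable
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)             # unit-length weight vector
            t = X @ w                          # score vector
            c = Y.T @ t / (t.T @ t)            # Y loading
            u_new = Y @ c / (c.T @ c)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        p = X.T @ t / (t.T @ t)                # X loading
        X -= t @ p.T                           # deflate X, cf. Eq. (2.9)
        Y -= t @ c.T                           # deflate Y, cf. Eq. (2.10)
        W[:, [a]], P[:, [a]], Q[:, [a]], T[:, [a]] = w, p, c, t
    R = W @ np.linalg.inv(P.T @ W) @ Q.T       # regression matrix, Eq. (2.11)
    return W, P, Q, T, R
```

With these matrices, the scores of a new centred sample follow from Eq. (2.12) as `x_new @ W @ np.linalg.inv(P.T @ W)`, and the predicted quality variables as `x_new @ R`.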
2.3 Factor Analysis
Similar to the probabilistic PCA method, factor analysis concentrates on the latent
variables t, whose distribution is Gaussian, and the original measurement variables
x are treated as a linear combination of t plus small additive white noise e. The aim
of FA is to find the most probable parameter set $\Theta = \{P, \Sigma_e\}$ in the model structure,
which is described as follows (Bishop 2006)
$$x = Pt + e \tag{2.13}$$
where $P = [p_1, p_2, \ldots, p_l] \in R^{m \times l}$ is the loading matrix, just as that in PCA. The
covariance matrix of the measurement noise e is represented by $\Sigma_e = \mathrm{diag}\{\lambda_i\}_{i=1,2,\ldots,m}$,
in which different noise levels of measurement variables are assumed. If all the
$\lambda_i$ are assumed to be of the same value, then FA is equivalent to PPCA. If all $\lambda_i$,
$i = 1, 2, \ldots, m$, are assumed to be zero, then FA becomes PCA. Therefore, FA is the
general description of the Gaussian latent model structure, and PPCA and PCA are just two
special cases of FA.
In the FA model, the latent variable t is assumed to follow a Gaussian distribution with
zero mean and unit variance, that is, $t \sim N(0, I)$, where I is the identity matrix with
appropriate dimension. Precisely, the distribution of the latent variable t is
$$p(t) = (2\pi)^{-k/2} \exp\left\{-\frac{1}{2} t^T t\right\} \tag{2.14}$$
Then, the conditional probability of the measured process variable x is given as
$$p(x|t) = (2\pi)^{-m/2} |\Sigma_e|^{-1/2} \exp\left\{-\frac{1}{2}(x - Pt)^T \Sigma_e^{-1} (x - Pt)\right\} \tag{2.15}$$
Based on Eqs. (2.14) and (2.15), the probability of x can be calculated as
$$p(x) = \int p(x|t)p(t)\,dt = (2\pi)^{-m/2} |C|^{-1/2} \exp\left\{-\frac{1}{2} x^T C^{-1} x\right\} \tag{2.16}$$
where $C = \Sigma_e + PP^T$ is the covariance matrix of the process data, so the distribution
of x can be represented as $x \sim N(0, \Sigma_e + PP^T)$, and |C| denotes the determinant
of C.
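As a short check of Eq. (2.16), the covariance C can be obtained directly from Eq. (2.13). The derivation below assumes, as is standard for factor analysis though not stated explicitly above, that t and e are independent and that e has zero mean, so the cross terms vanish.

```latex
\begin{aligned}
E[x] &= P\,E[t] + E[e] = 0, \\
\mathrm{Cov}(x) &= E\left[(Pt + e)(Pt + e)^T\right]
  = P\,E[tt^T]\,P^T + E[ee^T]
  = PP^T + \Sigma_e = C .
\end{aligned}
```

With equal noise variances, $\Sigma_e = \sigma^2 I$, this reduces to $C = PP^T + \sigma^2 I$, which is the PPCA special case mentioned earlier.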
The parameter set $\Theta = \{P, \Sigma_e\}$ can be estimated by the expectation-maximization
(EM) algorithm, which is an iterative likelihood maximization algorithm.
EM is widely used for parameter estimation due to its simplicity and
efficiency. Besides, it can also handle incomplete datasets, which is very attractive
for applications in the process industry. The EM algorithm can be partitioned into two
steps: the E-step and the M-step. The log-likelihood function of the process
data can be given as
$$L = \sum_{i=1}^{n} \ln\{p(x_i, t_i)\} = \sum_{i=1}^{n} \ln\{p(t_i|x_i)p(x_i)\} \tag{2.17}$$
where E(·) and tr(·) are the expectation and trace operators, cont represents a
constant value, and the first- and second-order statistics of the latent variables are given as
follows
By calculating the E-step and M-step iteratively, the final optimal parameters for the
FA model can be obtained.
For process monitoring, the $T^2$ and SPE monitoring statistics based on the FA
model can be constructed as follows
$$T_i^2 = \|\hat{t}_i\|^2 = x_i^T Q^T Q x_i \le \chi^2_{\alpha,k} \tag{2.24}$$
$$SPE_i = \|\Sigma_e^{-1/2} e_i\|^2 = x_i^T (I - PQ)^T \Sigma_e^{-1} (I - PQ) x_i \le \chi^2_{\alpha,m} \tag{2.26}$$
2.4 Independent Component Analysis
$$X^T = AS + E \tag{2.27}$$
$$\hat{S} = W X^T \tag{2.28}$$
Before applying the ICA algorithm, the data matrix X should be whitened, in order
to eliminate the cross-correlations among random variables. One popular method
for whitening is to use the eigenvalue decomposition. Considering x(k) with its
covariance $R_x = E\{x(k)x(k)^T\}$, the eigenvalue decomposition of $R_x$ is given by
$$R_x = U \Lambda U^T \tag{2.29}$$
The whitening transformation is expressed as
$$z(k) = Q x(k) \tag{2.30}$$
where $Q = \Lambda^{-1/2} U^T$. One can easily verify that $R_z = E\{z(k)z(k)^T\}$ is the identity
matrix under this transformation. After the whitening transformation, we have
From Eqs. (2.28) and (2.33), we can get the relation between W and B
$$W = B^T Q \tag{2.34}$$
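The whitening step can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it estimates $R_x$ from mean-centred samples, forms the whitening matrix $\Lambda^{-1/2} U^T$ from the eigenvalue decomposition, and returns the whitened data. The function name `whiten` is hypothetical, and $R_x$ is assumed to be full rank.

```python
import numpy as np


def whiten(X):
    """Whiten data X (n samples x m variables); returns Z and Q with z = Q x."""
    Xc = X - X.mean(axis=0)                    # remove the sample mean
    Rx = Xc.T @ Xc / Xc.shape[0]               # sample estimate of E{x x^T}
    lam, U = np.linalg.eigh(Rx)                # Rx = U diag(lam) U^T
    Q = np.diag(lam ** -0.5) @ U.T             # Q = Lambda^{-1/2} U^T
    Z = Xc @ Q.T                               # each row is z(k)^T = (Q x(k))^T
    return Z, Q
```

After this transformation the sample covariance of the rows of `Z` is the identity matrix, so the remaining ICA problem reduces to finding an orthogonal matrix B.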
There are two classic measures of non-Gaussianity: kurtosis and negentropy (Hyvarinen
and Oja 2000). Kurtosis is the fourth-order cumulant of a random variable;
unfortunately, it is sensitive to outliers. Negentropy, on the other hand, is based on the
information-theoretic quantity of differential entropy. Based on
approximate forms of negentropy, Hyvarinen (1997) introduced a very simple and
efficient fixed-point algorithm for ICA. This algorithm calculates the independent
components one by one through iterative steps. After all vectors $b_i$ ($i = 1, \ldots, m$)
have been calculated and put together to form the orthogonal mixing matrix B,
we can obtain ŝ(k) and the demixing matrix W from Eqs. (2.33) and (2.34), respectively.
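The two non-Gaussianity measures can be illustrated on simulated signals. The sketch below is an assumption-laden example rather than the algorithm of Hyvarinen (1997): it computes the sample excess kurtosis and a widely used one-term negentropy approximation, proportional to $(E\{G(y)\} - E\{G(\nu)\})^2$ with $G(u) = \log\cosh(u)$ and $\nu$ a standard Gaussian variable. The function names are hypothetical.

```python
import numpy as np


def excess_kurtosis(y):
    """Sample kurtosis E{y^4} - 3 of the standardised signal (zero for a Gaussian)."""
    y = (y - y.mean()) / y.std()
    return float(np.mean(y ** 4) - 3.0)


def negentropy_logcosh(y):
    """Negentropy approximation, up to a positive constant:
    (E{G(y)} - E{G(v)})^2 with G(u) = log cosh(u) and v standard Gaussian."""
    y = (y - y.mean()) / y.std()
    # E{G(v)} by Gauss-Hermite quadrature with weight function exp(-x^2 / 2)
    nodes, weights = np.polynomial.hermite_e.hermegauss(64)
    gauss_ref = np.sum(weights * np.log(np.cosh(nodes))) / np.sqrt(2 * np.pi)
    return float((np.mean(np.log(np.cosh(y))) - gauss_ref) ** 2)
```

Both measures are close to zero for Gaussian data and move away from zero for non-Gaussian data such as a uniform signal; the fourth power in the kurtosis is what makes it sensitive to outliers.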
Dimension reduction of ICA is based on the idea that these measured variables
are the mixture of some independent component variables. The performance and
interpretation of ICA monitoring depend on the correct choice of the ordering and
dimension of the ICA model. Because negentropy is used as the measure of
non-Gaussianity, we can select a suitable number of ICs by checking their negentropy
values. If the negentropy value of the current IC is zero or approximately zero, it indicates
that the non-Gaussian information of the process has already been extracted, and
the rest of the process is Gaussian, which can be analyzed by conventional MSPC
methods such as PCA.
Based on the identified ICA model, two monitoring statistics can be constructed
for monitoring, which are defined as I 2 and SPE, given as (Lee et al. 2004; Ge and
Song 2007)
$$I^2 = s^T s \tag{2.35}$$
$$SPE = e^T e \tag{2.36}$$
where s and e are independent components and residuals, respectively.
where c is the sample mean in the feature space. Conveniently, denote
$\bar{\Phi}(x_i) = \Phi(x_i) - c$ as the centered feature space sample. The kernel principal component can
be obtained by the eigenvalue problem below
$$\lambda v = \Sigma^F v = \frac{1}{n} \sum_{i=1}^{n} [\bar{\Phi}(x_i)^T v] \bar{\Phi}(x_i) \tag{2.38}$$
$$v = \sum_{i=1}^{n} \alpha_i \bar{\Phi}(x_i) \tag{2.39}$$
The problem can be transformed to Eq. (2.41) by introducing a kernel matrix
$\bar{K}_{ij} = \bar{\Phi}(x_i) \cdot \bar{\Phi}^T(x_j)$, where the kernel matrix is centered. The vector $\alpha$ should be scaled as
$\|\alpha\|^2 = 1/(n\lambda)$ to ensure the normality of v (Choi et al. 2005)
$$\lambda \alpha = \frac{1}{n} \bar{K} \alpha \tag{2.41}$$
The score vector of the qth observation is calculated as
$$t_{q,k} = [v_k \cdot \bar{\Phi}(x_q)] = \sum_{i=1}^{n} \alpha_i^k [\bar{\Phi}(x_q) \bar{\Phi}^T(x_i)] = \sum_{i=1}^{n} \alpha_i^k \bar{K}_{qi} \tag{2.42}$$
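Equations (2.41) and (2.42) can be sketched as follows. The example is illustrative only: it assumes a Gaussian (RBF) kernel with a user-chosen width `sigma`, which the text above does not prescribe, and the function names are hypothetical. The kernel matrix is centered in the usual way, the eigenvectors are scaled so that $\|\alpha\|^2 = 1/(n\lambda)$, and the scores are obtained from the centered kernel values.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and B (kernel choice is an assumption)."""
    d2 = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma ** 2))


def fit_kpca(X, k, sigma):
    """Solve lambda * alpha = (1/n) * Kc * alpha for the k leading components."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one     # centered kernel matrix
    mu, A = np.linalg.eigh(Kc)                     # Kc a = mu a, with mu = n * lambda
    idx = np.argsort(mu)[::-1][:k]
    mu, A = mu[idx], A[:, idx]
    alpha = A / np.sqrt(mu)                        # ||alpha||^2 = 1 / (n * lambda)
    return {"X": X, "K": K, "alpha": alpha, "lam": mu / n, "sigma": sigma}


def kpca_scores(model, Xq):
    """Scores t_{q,k} of Eq. (2.42) for the rows of Xq."""
    X, K = model["X"], model["K"]
    n = X.shape[0]
    Kq = rbf_kernel(Xq, X, model["sigma"])
    one = np.full((n, n), 1.0 / n)
    one_q = np.full((Xq.shape[0], n), 1.0 / n)
    Kq_c = Kq - one_q @ K - Kq @ one + one_q @ K @ one   # center with training statistics
    return Kq_c @ model["alpha"]
```

With this scaling, the sample variance of the kth training score equals the eigenvalue $\lambda_k$, mirroring the linear PCA case.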
2.6 Conclusion
In this chapter, several typical multivariate statistical process control methods have
been introduced, including PCA, PLS, ICA, FA, and KPCA. Based on these basic
MSPC methods, more specific and complex process monitoring methods can be
further developed, some of which will be introduced in the following chapters.
Chapter 3
Non-Gaussian Process Monitoring
3.1 Introduction
It is also important to note that all the reported ICA-based monitoring methods
are based on the assumption that the essential variables which drive the process are
non-Gaussian and discarded Gaussian parts are utilized to set up a single squared
prediction error (SPE) monitoring chart. Actually, complex multivariate processes
are always driven by non-Gaussian and Gaussian essential variables simultaneously.
Separating the Gaussian information from the un-modelled Gaussian uncertainty
will facilitate the diagnosis of faults which occur in different sources. Therefore, a
two-step ICA-PCA information extraction strategy has been proposed and used for
process monitoring.
Because confidence limit determination in the traditional ICA-based
monitoring method relies on kernel density estimation (KDE) and is complicated, support
vector data description (SVDD) has been introduced to improve the determination
of confidence limits for the non-Gaussian components (Tax and Duin 1999, 2004).
Compared to KDE (Lee et al. 2004), the SVDD-based method, which relies on a quadratic
programming cost function, is more computationally efficient. To monitor
discarded Gaussian parts of the process information, PCA might be the straight-
forward but not the optimal approach in which the projected directions have the
maximum variations irrespective of the underlying information generating structure.
More precisely, process recordings inevitably contain measurement
noise, while PCA is not capable of distinguishing the influence of
common Gaussian driving factors of the process from random variations due to measurement
equipment. A probabilistic generative model (probabilistic PCA, PPCA) was
employed to address this issue via the expectation-maximization (EM) algorithm
(Kim and Lee 2003). However, PPCA assumes the same noise level for all measurement
variables, which cannot be easily satisfied in practice. To this end, factor analysis (FA), a general
extension of the PPCA model, can model different noise levels of
measurement variables. To determine the number of essential variables, many useful
approaches have been reported, including Akaike Information Criterion (AIC),
Bayesian Information Criterion (BIC), SCREE test, PRESS cross validation test
(Chiang et al. 2001) and more recent fault detection criteria (Valle et al. 1999; Wang
et al. 2002, 2004).
3.2 Two-step ICA-PCA Information Extraction Strategy for Process Monitoring
Given the data matrix X, assume r independent components (ICs) are extracted,
$S = [s_1, s_2, \ldots, s_r] \in R^{r \times n}$. In order to monitor the non-Gaussian part of the process,
a new statistic was defined (Lee et al. 2004; Ge and Song 2007)
$$I^2 = s^T s \tag{3.1}$$
After the non-Gaussian information has been extracted, the residual matrix E is
obtained. Then PCA is used to analyse it, expanding E as follows
$$E = \sum_{i=1}^{k} t_i p_i^T + F \tag{3.2}$$
where F is the residual resulting from the PCA model. Here, we define the limits of
the $T^2$ and SPE statistics to monitor the remaining Gaussian part of the process
$$T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i} \le \frac{k(n-1)}{n-k} F_{k,(n-k),\alpha} \tag{3.3}$$
In the PCA monitoring method, the confidence limits are based on a specified
distribution, under the assumption that the latent variables follow a Gaussian
distribution. In ICA monitoring, however, the independent components do not
conform to a specific distribution; hence, the confidence limit of the $I^2$ statistic
cannot be determined directly from a particular approximate distribution. An
alternative approach to define the nominal operating region of the $I^2$ statistic is
to use KDE (Chen et al. 2000, 2004). Here we only need a univariate kernel estimator,
which is defined by
$$\hat{f}(I^2, H) = \frac{1}{n} \sum_{i=1}^{n} K\left[H^{-1/2}\left(I^2 - I_i^2\right)\right] \tag{3.5}$$
where H is the bandwidth matrix and K is a kernel function, which satisfies the
following condition
$$K(I^2) \ge 0, \quad \int_{R^p} K(I^2)\, dI^2 = 1 \tag{3.6}$$
There are a number of kernel functions, among which the Gaussian kernel function
is the most commonly used.
Many methods have been proposed for the estimation of the window width or
the smoothing parameter, which is of crucial importance in density estimation. One
efficient method is called mean integrated squared error (MISE) (Hyvarinen and Oja
2000). There are three choices for H
1. For a process with p variables, a full symmetrical bandwidth matrix
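$$H = \begin{bmatrix} h_{11}^2 & h_{12}^2 & \cdots & h_{1p}^2 \\ h_{21}^2 & h_{22}^2 & \cdots & h_{2p}^2 \\ \vdots & \vdots & \ddots & \vdots \\ h_{p1}^2 & h_{p2}^2 & \cdots & h_{pp}^2 \end{bmatrix}$$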
16 3 Non-Gaussian Process Monitoring
which is a positive-definite matrix with p(p + 1)/2 parameters, in which $h_{ik}^2 = h_{ki}^2$;
2. A diagonal matrix with only p parameters, $H = \mathrm{diag}(h_1^2, h_2^2, \ldots, h_p^2)$;
3. A diagonal matrix with one parameter, $H = h^2 I$, where I is an identity matrix.
Approach one is the most accurate but is unrealistic in terms of computational load
and time. Approach three is the simplest; it is the method used for univariate
data and can be adopted with slight modification. However, it can cause problems
in some situations because forcing the bandwidths in all dimensions to be the same
loses accuracy, which can introduce significant error in the shape of the density
function. Hence, approach two appears to be the appropriate compromise,
although its computational load is still unacceptable if the dimensionality of the
problem is high and the sample size is large.
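For the univariate $I^2$ statistic, the KDE-based limit can be sketched as follows. This is a minimal illustration rather than the procedure of the cited works: the function name `kde_control_limit`, the Gaussian kernel, the rule-of-thumb bandwidth and the grid-based integration are all assumptions made for this example.

```python
import numpy as np

def kde_control_limit(stat_values, confidence=0.99, bandwidth=None, grid_size=1000):
    """Return the `confidence` quantile of a univariate Gaussian KDE."""
    x = np.asarray(stat_values, dtype=float)
    n = x.size
    if bandwidth is None:
        # Assumption: Silverman-type rule of thumb, used only for illustration.
        bandwidth = 1.06 * x.std(ddof=1) * n ** (-0.2)
    # Evaluate the density on a grid covering the data plus the kernel tails.
    grid = np.linspace(x.min() - 5 * bandwidth, x.max() + 5 * bandwidth, grid_size)
    u = (grid[:, None] - x[None, :]) / bandwidth
    density = np.exp(-0.5 * u ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))
    # Integrate the density numerically and normalise so that the CDF ends at 1.
    cdf = np.cumsum(density)
    cdf /= cdf[-1]
    # First grid point at which the estimated CDF reaches the confidence level.
    return float(grid[np.searchsorted(cdf, confidence)])
```

With the $I^2$ values of the training samples as input, the returned value plays the role of the 99 or 95 % confidence limit; a single scalar bandwidth corresponds to approach three above.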
The implementation of the monitoring method, as mentioned in the previous
section, consists of two procedures: off-line modelling and on-line monitoring. In
the off-line modelling procedure, an ICA-PCA monitoring model is developed in
the normal operating condition. Then, the fault detection is executed by using this
monitoring model in the on-line monitoring procedure. The detailed algorithm flow
of the monitoring procedure is summarized as follows
(1) Off-line modelling procedure
Step 1 Acquire an operating dataset X during normal process;
Step 2 Normalize and whiten the data matrix X;
Step 3 Carry out the ICA algorithm to obtain the matrices W, B and Ŝ, so that Ŝ
has great non-Gaussianity. We can also obtain $A = (Q^T Q)^{-1} Q^T B$ and
$W = B^T Q$;
Step 4 For each sample, calculate its $I^2$ value, $I^2(k) = \hat{s}(k)^T \hat{s}(k)$, (k = 1, 2, . . . , n),
using KDE to decide its 99 or 95 % confidence limit $I^2_{lim}$;
Step 5 Carry out the PCA algorithm to obtain the score matrix T and the loading matrix
P, and decide the confidence limits $T^2_{lim}$ and $SPE_{lim}$ for $T^2$ and SPE.
(2) On-line monitoring procedure
Step 1 For a new sample xnew, apply the same scaling method used in the
modelling steps;
Step 2 Calculate the ICs of the new sample, $\hat{s}_{new} = W x_{new}$;
Step 3 Calculate the $I^2$ statistic value for this new sample, $I^2_{new} = \hat{s}_{new}^T \hat{s}_{new}$;
Step 4 The remaining part is given by $e_{new} = x_{new} - A\hat{s}_{new}$; calculate the score
vector, $t_{new} = e_{new} P$, and also obtain the residual, $f_{new} = e_{new} - \hat{e}_{new}$;
Step 5 Calculate $T^2_{new}$ and $SPE_{new}$ using Eqs. (3.3) and (3.4);
Step 6 If any statistic exceeds its corresponding limit, some kind of fault is
detected; otherwise, go to step 1 and continue monitoring.
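Steps 2 to 6 of the on-line procedure can be sketched as follows. The sketch assumes that the model matrices W, A and P, the PCA eigenvalues and the three limits are already available from the off-line stage, and that samples are stored as column vectors (so the score is computed as $P^T e$ instead of the row form $eP$ used in Step 4). The function names `monitor_sample` and `is_faulty` are hypothetical.

```python
import numpy as np

def monitor_sample(x_new, W, A, P, eigvals):
    """Return (I2, T2, SPE) for one scaled sample x_new of shape (m,)."""
    s = W @ x_new                          # Step 2: independent components
    i2 = float(s @ s)                      # Step 3: I^2 statistic
    e = x_new - A @ s                      # Step 4: part left after removing the ICs
    t = P.T @ e                            # score vector (column-vector convention)
    f = e - P @ t                          # residual of the PCA model
    t2 = float(np.sum(t ** 2 / eigvals))   # Step 5: T^2 as in Eq. (3.3)
    spe = float(f @ f)                     # squared prediction error
    return i2, t2, spe

def is_faulty(stats, limits):
    """Step 6: flag a fault if any statistic exceeds its limit."""
    return any(value > limit for value, limit in zip(stats, limits))
```

Calling `is_faulty(monitor_sample(x, W, A, P, eigvals), (i2_lim, t2_lim, spe_lim))` for each incoming sample reproduces the decision in Step 6.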
The on-line process fault detection scheme of the ICA-PCA based method is shown
in Fig. 3.1.
[Fig. 3.1 On-line fault detection scheme of the ICA-PCA based method. Flowchart blocks: ICA-PCA model;
extract the ICs and calculate the I2 statistic value; calculate the remaining matrix E for PCA analysis;
calculate the T2 and SPE statistic values for the current sample data; monitor the three statistics
I2, T2 and SPE]
As a benchmark simulation, the Tennessee Eastman (TE) process has been widely
used to test the performance of various monitoring approaches (Downs and Vogel
1993; Chiang et al. 2001; Raich and Cinar 1995; Singhal and Seborg 2006). This
process consists of five major unit operations: a reactor, a condenser, a compressor,
a separator and a stripper. The control structure is shown schematically in Fig. 3.2,
which is the second structure listed in Lyman and Georgakis (1995). The TE process
has 41 measured variables (22 continuous process measurements and 19 composition
measurements) and 12 manipulated variables, and a set of 21 programmed faults can
be introduced to the process. The process is described in detail
in the book by Chiang et al. (2001).
In this process, the variables which are selected for monitoring are listed in
Table 3.1. There are 33 variables in the table, 22 continuous process measurements
and 11 manipulated variables. The agitation speed is not included because it is not
manipulated. Besides, we also exclude 19 composition measurements as they are
difficult to measure on-line in real processes. The simulation data which we have
collected were separated into two parts: the training datasets and the testing datasets.
They consisted of 960 observations for each mode (1 normal and 21 fault),
and their sampling interval was 3 min. All faults in the testing datasets were
introduced in the process at sample 160; they are tabulated in Table 3.2. Faults 1–7
are step changes of process variables; faults 8–12 are random changes of variables;
fault 13 is a slow shift of the reaction kinetics; faults 14, 15 and 21 are related to valve
sticking and faults 16–20 are types of unknown faults. Some of these
faults are easy to detect as they greatly affect the process and change the relations
between process variables. However, there are also faults that are difficult to detect
(faults 3, 9 and 15) because they are very small and have little influence on the process.
[Fig. 3.2 Control structure of the TE process, showing the reactor, condenser, compressor,
separator and stripper together with the purge and product streams]
So far, 22 datasets have been generated, corresponding to the 22 different operation
modes (1 normal and 21 fault) in the TE process. Before the application of PCA, ICA
and ICA-PCA, all the datasets were auto-scaled. A total of 9 ICs and 15 principal
components (PCs) were selected for ICA and PCA by cross-validation, respectively.
In the study of the ICA-PCA method, we select the same number of ICs (9) and
PCs (15) to compare the monitoring performance with ICA and PCA. The 99 %
confidence limits of all the statistic variables were determined by KDE. For each
statistic, the detection rates for all the 21 fault modes were calculated and tabulated
in Table 3.3. The minimum missing detection rate achieved for each mode is marked
with a bold number, except for the modes of faults 3 and 9; these faults are quite
small and have almost no effect on the overall process, so they were excluded from
our research. As shown in Table 3.3, ICA outperforms PCA for most fault modes,
and in particular the missing detection rates of ICA for the modes of faults 5, 10
and 16 are much lower than those of PCA, which indicates that ICA can detect small
events that are difficult to detect by PCA. However, the missing detection rates of
ICA-PCA for most modes are even lower than those of ICA, as ICA-PCA monitors not only
the non-Gaussian information of the process by ICA, but also the Gaussian
information by PCA. Overall, the best monitoring performance in this simulation
is found in the case of ICA-PCA: in most modes, the missing detection
rate of ICA-PCA is the lowest.
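As a small illustration of how such a missing detection rate can be computed for one statistic, consider the sketch below. The function name `missed_detection_rate` and the zero-based index `fault_start` are assumptions of this example; the rate is taken as the fraction of samples after the fault onset whose statistic stays at or below its limit.

```python
import numpy as np

def missed_detection_rate(stat_values, limit, fault_start):
    """Fraction of faulty samples (index >= fault_start) that raise no alarm."""
    faulty_part = np.asarray(stat_values, dtype=float)[fault_start:]
    return float(np.mean(faulty_part <= limit))

# Toy sequence: the fault starts at index 2 and one faulty sample is missed.
example_rate = missed_detection_rate([0.0, 0.0, 5.0, 5.0, 0.0, 5.0], limit=1.0, fault_start=2)
```

For the testing datasets described above, `fault_start` would mark the sample at which the fault is introduced.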
The monitoring results of fault 5 are shown in Fig. 3.3. The condenser cooling
water inlet temperature is step changed in the mode of fault 5. In that mode, the flow
rate of the outlet stream from the condenser to the separator also increases, which
causes an increase in temperature in the separator, and thus affects the separator
cooling water outlet temperature. As we have established a control structure for this
process, the control loops will act to compensate for the change and the temperature
in the separator will return to its set-point. It takes about 10 h to reach the steady state
again. As shown in Fig. 3.3a, PCA has detected the fault at sample 160 approximately.
However, the fault cannot be detected after approximately sample 350 as most of the
variables returned to their set-points. In this case, if a process operator judges the
status of the process based on PCA, they would probably conclude that a fault entered
the process and then corrected itself in about 10 h. However, as a matter of fact, the
condenser cooling water inlet temperature is still higher than under normal conditions
after sample 350, and the condenser cooling water flow rate is continuously manipulated
while most variables return to their set-points. This indicates that a fault still remains
in the process. As PCA is based only on second-order statistics, the
effect of the condenser cooling water flow may be neglected compared to the effect
of other variables, and thus the problem occurs. In contrast, as shown in Fig. 3.3b
and 3.3c, all the ICA and ICA-PCA statistic variables stay above their confidence
limits, which indicates that a fault remains in the process. What distinguishes ICA
from PCA is that ICA involves higher-order statistics, which capture the
non-Gaussian information and, thus, can correctly reflect the effect of the condenser
cooling water flow rate. Such a persistent fault detection statistic will continue to inform
the operator that an abnormality remains in the process, although all the process variables
will appear to have returned to their normal values through control loops. In the
method of ICA-PCA, as ICA has extracted essential components that underlie the
process, the following procedure of PCA can reflect the effect of condenser cooling
water flow rate. So, ICA and ICA-PCA may provide the process operator with more
reliable information. Additionally, comparing the result of ICA-PCA with that of
ICA, we can conclude that the effect of the fault is magnified more in the statistic
variables of the former.
Fig. 3.3 Monitoring results for mode of fault 5 of TE process (statistic value versus sample number):
a result based on PCA (T2 and SPE), b result based on ICA (I2 and SPE), c result based on ICA-PCA
(I2, T2 and SPE)
3.3 Process Monitoring Based on ICA-FA and SVDD 21
The main idea of SVDD is to map the input vectors to a feature space and to find
the hypersphere with the minimum volume which separates the transferred data from the
rest of the feature space. Applications have shown a high generalization performance
of SVDD if a large reference dataset with few abnormal samples is available (Tax and
Duin 1999).
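A sketch of the underlying optimization problem, in the form usually given in the SVDD literature (Tax and Duin 2004), is shown here for orientation. The notation, with R the radius of the hypersphere, Φ the feature map and $\hat{s}_i$ the training samples, is assumed to match the dual problem and the distance formula of this section:

$$\min_{R,\,a,\,\xi} \; R^2 + C\sum_{i=1}^{n} \xi_i \quad \text{s.t. } \|\Phi(\hat{s}_i) - a\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, n$$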
where a is the center of the hypersphere in the feature space. The variable C gives
the trade-off between the volume of the sphere and the number of errors (number of
target objects rejected). ξi represents the slack variable, which allows for the possibility
that some of the training samples are wrongly classified. The dual form of the
optimization problem can be obtained as follows
$$\max_{\alpha_i} \; \sum_{i=1}^{n} \alpha_i K(\hat{s}_i, \hat{s}_i) - \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j K(\hat{s}_i, \hat{s}_j) \quad \text{s.t. } 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i = 1 \tag{3.8}$$
where αi are the Lagrange multipliers. After the hypersphere in the feature space has
been constructed, the hypothesis that a new sample ŝnew is normal is accepted if the
distance $d[\Phi(\hat{s}_{new})] = \|\Phi(\hat{s}_{new}) - a\| \le R$, which is given by
$$d[\Phi(\hat{s}_{new})] = \sqrt{K(\hat{s}_{new}, \hat{s}_{new}) - 2\sum_{i=1}^{n} \alpha_i K(\hat{s}_i, \hat{s}_{new}) + \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j K(\hat{s}_i, \hat{s}_j)} \tag{3.9}$$
As demonstrated, conventional statistical methods assume that processes
are driven by a small number of essential variables, which follow either a
non-Gaussian or a Gaussian distribution. However, under more general circumstances,
processes are driven by non-Gaussian and Gaussian essential variables simultaneously,
which is the key assumption of the mixed essential component
analysis (ICA-FA) method. Figure 3.4 shows the difference between ICA-FA and
the conventional methods (ICA and FA). The relationship between the measurement variables
and the mixed essential variables is given as follows (Ge et al. 2009a)
$$x = x_1 + x_2 = As + Pt + e \tag{3.10}$$
where x1 and x2 represent the non-Gaussian and Gaussian parts of the process, respectively,
A is the mixing matrix of the non-Gaussian essential variables s, P is the loading
matrix of the Gaussian essential variables t and e is the Gaussian un-modelled
uncertainty, relating to measurement noises, etc., with zero means and different variances
$\Sigma = \mathrm{diag}\{\lambda_i\}_{i=1,2,\ldots,m}$, which permits measurement variables to have different noise levels.
[Fig. 3.4 Difference between ICA-FA and the conventional methods: panels ICA-FA, ICA and FA,
each relating the essential variables (s, t) and the noise e to the process records]
As a general form of PPCA, FA performs better than PCA and PPCA. Since process
measurements are always corrupted by noise, ignoring this information-generating
structure causes the PCA method to leave some essential information in the
discarded space and to trigger monitoring errors. In contrast, the noise structure is
considered in PPCA and FA; thus, the useful information can be correctly extracted and
monitored separately.
To estimate the new parameter set $\Theta = \{A, P, \Sigma\}$ and extract the essential variables
EA = {s, t} from the measurement variables x, a two-step estimation and extraction
strategy can be developed. First, the non-Gaussian essential variables can be extracted
by the efficient FastICA or PSO-based method (Xie and Wu 2006), thus the mixing
matrix A is obtained. Second, the parameters P and Σ can be estimated by the EM
algorithm, which is described as follows. First, the mean and variance of the Gaussian
part of measurement variables x2 can be calculated as
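The parameters P and Σ are then updated as
$$P = \left[\sum_{i=1}^{n} x_{2i}\hat{t}_i^T\right]\left[\sum_{i=1}^{n} E[t_i t_i^T \mid x_{2i}]\right]^{-1}, \quad \Sigma = \frac{1}{n}\sum_{i=1}^{n} \mathrm{diag}\left(x_{2i}x_{2i}^T - P\hat{t}_i x_{2i}^T\right) \tag{3.13}$$
where n is the number of samples and diag(Z) denotes the diagonal matrix formed from the
diagonal elements of matrix Z. Equations (3.12) and (3.13) are calculated iteratively
until both parameters (P and Σ) have converged.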
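The iteration can be sketched as below for zero-mean data. The M-step follows Eq. (3.13); the E-step uses the standard factor analysis posterior moments, $\hat{t}_i = P^T(PP^T + \Sigma)^{-1}x_{2i}$ and $E[t_i t_i^T \mid x_{2i}] = I - P^T(PP^T + \Sigma)^{-1}P + \hat{t}_i\hat{t}_i^T$, which are assumed here to be what Eq. (3.12) provides. The function name `fit_factor_analysis_em`, the eigenvector-based initialisation and the stopping rule are illustrative choices, not taken from the source.

```python
import numpy as np

def fit_factor_analysis_em(X2, n_factors, n_iter=500, tol=1e-9):
    """EM for x2 = P t + e on zero-mean data X2 of shape (n, m).

    Returns the loading matrix P (m, l) and the noise variances (m,).
    """
    X2 = np.asarray(X2, dtype=float)
    n, m = X2.shape
    S = X2.T @ X2 / n                            # sample covariance
    # Assumed initialisation: leading eigenvectors scaled by sqrt(eigenvalue).
    evals, evecs = np.linalg.eigh(S)
    lead = np.argsort(evals)[::-1][:n_factors]
    P = evecs[:, lead] * np.sqrt(evals[lead])
    noise_var = np.maximum(np.diag(S - P @ P.T), 1e-6)
    for _ in range(n_iter):
        # E-step: G = P^T (P P^T + Sigma)^-1, rows of T_hat are E[t_i | x_2i].
        G = np.linalg.solve(P @ P.T + np.diag(noise_var), P).T
        T_hat = X2 @ G.T
        sum_Ett = n * (np.eye(n_factors) - G @ P) + T_hat.T @ T_hat
        # M-step: parameter updates of Eq. (3.13).
        P_new = (X2.T @ T_hat) @ np.linalg.inv(sum_Ett)
        noise_new = np.maximum(np.diag(S - P_new @ (T_hat.T @ X2) / n), 1e-8)
        change = max(np.abs(P_new - P).max(), np.abs(noise_new - noise_var).max())
        P, noise_var = P_new, noise_new
        if change < tol:                         # both parameters have converged
            break
    return P, noise_var
```

The fitted model covariance `P @ P.T + np.diag(noise_var)` should then be close to the sample covariance of the Gaussian part of the data.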
In summary, when the reference dataset $X = \{x_i\}_{i=1,2,\ldots,n}$ is available, with the
ICA-FA model defined by Eq. (3.10), process information can be separated into three
different parts. The systematic information includes non-Gaussian and Gaussian
parts which are driven by their corresponding essential variables, whereas the noise
information and other unmeasured disturbances are represented by e.
As described in the previous section, the entire information of the process is par-
titioned into three parts: non-Gaussian systematic information, Gaussian systematic
information and noise information. Therefore, three different monitoring statistics
can be developed for fault detection. First, for monitoring the non-Gaussian information,
the SVDD-based method is used. In contrast to the KDE method of determining the
confidence limit, SVDD reduces the complexity to a quadratic programming problem
that produces a unique analytical solution. Given the training dataset S (the
extracted independent components), the hypersphere is constructed. The center a
and the radius R can be determined by (Tax and Duin 2004)
$$a = \sum_{i=1}^{n} \alpha_i \Phi(\hat{s}_i), \quad R = \sqrt{1 - 2\sum_{i=1}^{n} \alpha_i K(\hat{s}_z, \hat{s}_i) + \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j K(\hat{s}_i, \hat{s}_j)} \tag{3.14}$$
The non-Gaussian statistic (NGS) can be developed as the distance between the data
sample and the center of the hypersphere, thus
$$NGS_i = d^2[\Phi(\hat{s}_i)] = \|\Phi(\hat{s}_i) - a\|^2 \le NGS_{lim} = R^2 \tag{3.15}$$
where ŝi is the extracted independent component of the new sample and $d[\Phi(\hat{s}_i)]$ is given
in Eq. (3.9). If $NGS_i \le R^2$, the non-Gaussian information is regarded as normal;
otherwise, the sample is regarded as an outlier, or some fault or disturbance has happened.
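A minimal sketch of the NGS computation is given below. It assumes that the multipliers α have already been obtained from a quadratic programming solver, that the Gaussian kernel is parametrised as $\exp(-\|a - b\|^2/\text{width})$, and that the index of a support vector on the boundary (the role of $\hat{s}_z$ in Eq. (3.14)) is known. The names `gaussian_kernel`, `ngs`, `radius_squared` and `width` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, width):
    """Kernel matrix with entries exp(-||A[i] - B[j]||^2 / width)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / width)

def ngs(s_new, S_train, alpha, width):
    """Squared feature-space distance to the centre, Eqs. (3.9) and (3.15)."""
    s_new = np.atleast_2d(s_new)
    k_new = gaussian_kernel(s_new, S_train, width)     # shape (q, n)
    K = gaussian_kernel(S_train, S_train, width)       # shape (n, n)
    # K(s, s) = 1 for the Gaussian kernel, which gives the leading 1.0.
    return 1.0 - 2.0 * k_new @ alpha + alpha @ K @ alpha

def radius_squared(S_train, alpha, width, boundary_index):
    """R^2 of Eq. (3.14), evaluated at an assumed boundary support vector."""
    return float(ngs(S_train[boundary_index], S_train, alpha, width)[0])
```

A new sample is then regarded as normal when `ngs(s_new, S_train, alpha, width) <= radius_squared(...)`.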
Since the distribution of Gaussian essential variables t is assumed to be N{0, I},
the Mahalanobis norm of t is identical to the Euclidean norm. Different from the
projection method, the Gaussian essential variables cannot be calculated directly.
However, they can be replaced by their estimates, which are given in Eq. (3.12).
where $\chi^2_{\alpha,l}$ represents the confidence limit with confidence level α and l is the
corresponding number of degrees of freedom. If the value of the $T^2$ statistic goes beyond
the confidence limit, the Gaussian systematic process is regarded as out-of-control.
Similarly, the noise part of the process information can also be monitored by the
squared Mahalanobis distance. Hence, the Gaussian information matrix x2 can be
represented as
x2 = Pt + e (3.17)
From Eqs. (3.12) and (3.17), the noise information e can be estimated as follows
In this section, the performance of the method is tested through the TE process. The
variables selected for monitoring are the same as those used for control structure
development in Lyman and Georgakis (1995). The simulation data acquired
under normal operation consist of 960 samples, with a sampling interval of 3 min.
In order to evaluate the fault detection and identification performance of the method,
simulation data under the 21 fault conditions are also obtained. During the simulation,
each fault is introduced at the 161st sample.
[Fig. 3.5 Scatter plot of the first independent component (s1) versus the second independent
component (s2) with the confidence limits of NGS and I2]
For monitoring of the TE process, the numbers of ICs and principal factors (PFs)
should be chosen appropriately. In this study, 3 ICs and 6 PFs are selected. Then the
ICA-FA model can be developed. The kernel parameter for the Gaussian function
is chosen as σ = 2.0, and the variable C is selected as C = 0.0625. For comparison, the
ICA, PCA and ICA-PCA algorithms are also carried out. The number of ICs for ICA
is selected as 3, whereas the number of PCs for PCA is chosen as 6. All confidence
limits are set as 99 %. The scatter plot of the first independent component versus the
second independent component and the two confidence limits of NGS and $I^2$ is given in
Fig. 3.5. Under normal operation, both confidence limits indicate that 99 % of the
samples are inside their corresponding regions. However, as discussed previously,
the confidence limit of NGS is tighter than that of $I^2$, which is shown correctly in
the figure. Therefore, when some fault happens, NGS is more sensitive to it.
When samples lie in the gap between the NGS confidence limit and the $I^2$
confidence limit, only NGS can detect them; the $I^2$ statistic regards them as normal
operation because its confidence limit has not been exceeded.
For each statistic, the monitoring results (type II error) are tabulated in Table 3.4.
The minimum value achieved for each mode is marked with a bold number. As
shown in the table, ICA-FA outperforms ICA, PCA and ICA-PCA for most of the
fault modes. In particular, the missing detection rates of ICA-FA for faults 10, 15, 16
and 19–21 are much lower than those of the other three methods. As indicated in
columns 2–4 of the table, the best monitoring performance for some faults is achieved
by NGS, for some by $T^2$ and for others by the SPE statistic,
which shows the fault isolation ability of the ICA-FA method.
3.4 Conclusions
In this chapter, different non-Gaussian process monitoring methods have been re-
viewed, especially the ICA-based methods. For processes which are simultaneously
driven by non-Gaussian and Gaussian components, a two-step ICA-PCA informa-
tion extraction strategy has been demonstrated, based on which a corresponding
monitoring scheme has been developed. Compared to the traditional ICA and
PCA monitoring methods, the monitoring performance has been improved by the
ICA-PCA based approach. Furthermore, the SVDD method has been introduced
for modelling the distribution of the independent components, based on which a
non-Gaussian monitoring statistic has been developed. Compared to the conven-
tional I 2 monitoring statistic, the monitoring performance has been improved by
this non-Gaussian monitoring statistic. Besides, for probabilistic monitoring of the
Gaussian information of the process, the FA method has been incorporated. Two
additional Gaussian monitoring statistics have been constructed for process monitoring.
Through the simulation study of the TE benchmark process, the efficiency of
both the ICA-PCA and the ICA-FA with SVDD methods has been demonstrated.
Chapter 4
Fault Reconstruction and Identification
4.1 Introduction
Fault detection and diagnosis are of ever growing importance for guaranteeing a safe
and efficient operation of complex industrial processes. Over the past few decades,
statistical-based techniques have been intensely researched, as they directly address
the challenges of large recorded variable sets resulting from the increasing complexity
of distributed control systems. Notable algorithmic developments have been
made based on principal component analysis (PCA) and partial least squares (PLS;
Qin 2003; Venkatasubramanian et al. 2003).
Whilst statistically based fault detection schemes are well-established, several
fault diagnosis methods have also been proposed, such as contribution plots and the
backward elimination sensor identification (BESI) algorithm (Westerhuis et al. 2000;
Stork et al. 1997). However, these methods are difficult to implement in practice given
that the contribution method may produce misleading results (Lieftucht et al. 2006)
and the BESI algorithm requires a new PCA model to be calculated for each step
(Stork et al. 1997). For enhanced fault isolation, Gertler et al. (1999) introduced a
structured residual-based approach, which was later improved upon by Qin and Li
(2001).
Dunia and Qin (1998a, b) proposed a uniform geometric method for both
unidimensional and multidimensional fault identification and reconstruction. This
projection-based variable reconstruction work relied on the conventional SPE statis-
tic. More recent work by Wang et al. (2002) discussed the detectability of faults
for the PCA method using the T 2 statistic and Yue and Qin (2001) suggested the
use of a combined index for reconstruction-based fault identification and isolation
method. Other extensions of these methods include Lieftucht et al. (2006) and Li
and Rong (2006). However, these methods rely on PCA, which is applicable under
the assumption that the recorded process variables follow a multivariate Gaussian
distribution.
For monitoring processes that exhibit non-Gaussian behavior, recent work on
independent component analysis (ICA) has shown its potential (Lee et al. 2004a, b;
Ge and Song 2007; Liu et al. 2008). Although ICA has been successfully employed
to extract non-Gaussian source signals that can be used to construct univariate
monitoring statistics for fault detection, the issue of fault identification and isolation has
only been sporadically touched upon. A notable exception is discussed in Lee et al.
(2004a), where contribution plots for fault identification were developed for ICA
models. However, as mentioned above, contribution plots may produce misleading
and erroneous fault diagnosis results.
This chapter introduces a technique that relies on PCA-based fault reconstruction
to address the important issue of fault diagnosis for multivariate non-Gaussian pro-
cesses (Ge and Song 2009b). This method was developed for a description of the
independent components in the feature space determined by the support vector data
description (SVDD). In a similar fashion to PCA-based reconstruction, this scheme
relies on predefined fault directions, where the effect of a fault upon these directions
is evaluated using a developed fault diagnosis index.
For fault identification, Zhang et al. (1996) proposed the characteristic direction
method, which takes the first loading vector as the characteristic
direction. Although the first principal component extracts most of the fault
information, much other information is omitted. Krzanowski (1979) developed a
method for measuring the similarity of two datasets using a PCA similarity factor
$S_{PCA}$. Johannesmeyer et al. (2002) used that PCA similarity factor to identify the
similarity between different operation modes. Within that method, all the information
that the PCA model covers is used, so the power of fault identification
is improved. However, these similarity factors are all based on the assumption
that the data follow a Gaussian distribution. For non-Gaussian process data, an
ICA similarity factor has been defined (Ge and Song 2007). Based on the ICA-FA
method in Chap. 3 a further similarity factor for the noise subspace of the process data
has also been defined. These similarity indices are used to classify various different
process faults.
For the introduction of the fault reconstruction method, we assume that the fault
subspace is known a priori. This is not a restriction of generality, as the application
of a singular value decomposition can estimate this subspace (Qin 2003). Assuming
there are a total of J possible fault conditions, denoted here by $F_j$, j = 1, 2, . . . , J,
each fault condition is described by an individual fault subspace $\Xi_j$. These subspaces
are of dimension $\dim(\Xi_j) \ge 1$, j = 1, 2, . . . , J, which implies that unidirectional,
$\dim(\Xi_j) = 1$, and multidimensional fault conditions, $\dim(\Xi_j) > 1$, can be
reconstructed.
After defining the fault subspace, z∗ can be reconstructed from the corrupted data
vector z using the ICA model and the SVDD method. Suppose a fault Fj has occurred,
a reconstructed zj represents an adjustment of the corrupted value z moving along
4.2 Fault Reconstruction 31
where $\hat{s}_{vi}$ is the ith support vector obtained by the SVDD method and αi is
the corresponding coefficient. As this work utilizes the Gaussian kernel function
$K(\hat{s}_i, \hat{s}_j) = \exp(-\|\hat{s}_i - \hat{s}_j\|^2/\delta)$, $K(\hat{s}, \hat{s}) = 1$ and
$\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j K(\hat{s}_{vi}, \hat{s}_{vj})$ is a
constant. By incorporating Eq. (4.3), the optimization problem in Eq. (4.2) finally
becomes
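$$f_j = \arg\max_f \sum_{i=1}^{N} \alpha_i K(s_{vi}, s_j^*) = \arg\max_f \sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f) \tag{4.4}$$
Denoting $\Psi = \sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f)$ and taking the first partial derivative with respect to f yields
$$\nabla_f = \frac{\partial \Psi}{\partial f} = 0 \tag{4.5}$$
which for Gaussian kernel functions is equal to
$$\nabla_f = [W\Xi_j]^T \sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f)(\hat{s} - W\Xi_j f - s_{vi}) = 0 \tag{4.6}$$
Expanding the product and collecting the terms in f gives
$$[W\Xi_j]^T W\Xi_j f \sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f) + [W\Xi_j]^T \sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f)(s_{vi} - \hat{s}) = 0. \tag{4.7}$$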
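By rewriting Eq. (4.7), the unknown fault vector f can be estimated as follows
$$f = [\Xi_j^T W^T W \Xi_j]^{-1} \Xi_j^T W^T \frac{\sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f)(\hat{s} - s_{vi})}{\sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f)} \tag{4.8}$$
The Appendix shows that the estimated fault vector $f_j$ can be found iteratively
$$f(t+1) = [\Xi_j^T W^T W \Xi_j]^{-1} \Xi_j^T W^T \frac{\sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f(t))(\hat{s} - s_{vi})}{\sum_{i=1}^{N} \alpha_i K(s_{vi}, \hat{s} - W\Xi_j f(t))} \tag{4.9}$$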
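The fixed-point iteration of Eq. (4.9) can be sketched as follows. The sketch assumes that the demixing matrix W, the fault direction matrix (written `Xi` here), the support vectors and their coefficients are given, and that the Gaussian kernel has the form $\exp(-\|a - b\|^2/\delta)$. The function names, the zero starting value and the stopping rule are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(S_sv, s, delta):
    """Kernel values exp(-||S_sv[i] - s||^2 / delta) for all support vectors."""
    return np.exp(-np.sum((S_sv - s) ** 2, axis=1) / delta)

def estimate_fault(s_hat, W, Xi, S_sv, alpha, delta, n_iter=200, tol=1e-10):
    """Iterate Eq. (4.9) from f = 0 and return the estimated fault vector."""
    M = W @ Xi                                   # fault direction seen by the ICs
    G = np.linalg.solve(M.T @ M, M.T)            # [Xi^T W^T W Xi]^-1 Xi^T W^T
    f = np.zeros(Xi.shape[1])
    for _ in range(n_iter):
        s_star = s_hat - M @ f                   # reconstructed ICs for current f
        w = alpha * gaussian_kernel(S_sv, s_star, delta)
        # Kernel-weighted average of (s_hat - s_vi); fails if all weights underflow.
        weighted = (w[:, None] * (s_hat - S_sv)).sum(axis=0) / w.sum()
        f_new = G @ weighted
        if np.linalg.norm(f_new - f) < tol:
            return f_new
        f = f_new
    return f
```

The reconstructed independent components are then `s_hat - W @ Xi @ f`, from which the reconstructed NGS value can be evaluated.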
such that
$$NGS_j = \|\Phi(\hat{s}_j^*) - a\|_2^2 \tag{4.13}$$
the impact of the reconstruction procedure upon the jth fault direction therefore relies
on $\xi_j$
$$\xi_j = \frac{NGS_j}{NGS_{lim}} \tag{4.14}$$
which is referred to here as the fault diagnosis index and $NGS_{lim}$ is the control limit of
the NGS statistic. If the reconstruction procedure has been applied using the correct
fault direction, Eq. (4.14) highlights that the fault diagnosis index produces a value that
is smaller than or equal to 1. This, in turn, implies that the faulty sample has been
shifted along the correct fault direction to fall within the normal region. However, if
an incorrect fault direction is applied, this value still exceeds 1.
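The decision rule of Eq. (4.14) can be illustrated with a short sketch. The reconstructed NGS values and the limit used in the example are made-up numbers, and the function names `fault_diagnosis_indices` and `candidate_faults` are hypothetical.

```python
def fault_diagnosis_indices(ngs_reconstructed, ngs_limit):
    """Eq. (4.14): ratio of each reconstructed NGS value to the control limit."""
    return {j: value / ngs_limit for j, value in ngs_reconstructed.items()}

def candidate_faults(indices):
    """Fault conditions whose index is at most 1, i.e. plausible root causes."""
    return [j for j, xi in indices.items() if xi <= 1.0]

# Hypothetical reconstructed NGS values for three assumed fault directions.
example_indices = fault_diagnosis_indices({1: 0.5, 2: 0.05, 3: 0.4}, ngs_limit=0.1)
example_candidates = candidate_faults(example_indices)
```

In this toy case only the second fault direction brings the sample back inside the normal region.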
z = As + e. (4.15)
The linear combinations are governed by the mixing matrix A, which was randomly
selected to be
$$A = \begin{bmatrix} 0.815 & 0.906 & 0.127 & 0.913 & 0.632 & 0.098 \\ 0.279 & 0.547 & 0.958 & 0.965 & 0.158 & 0.971 \\ 0.957 & 0.485 & 0.800 & 0.142 & 0.422 & 0.916 \end{bmatrix}^T. \tag{4.16}$$
Here $s \in R^3$ is a vector of source signals, each of which follows a uniform distribution on
[0, 1], and e is a vector of i.i.d. Gaussian sequences with zero mean and variance 0.01,
e ∼ N{0, 0.01I}. To test the performance of the fault reconstruction and diagnosis
method, a total of 1,000 samples were simulated. The first 500 samples served as
reference data, whilst the remaining 500 samples were used for model validation
and to inject fault conditions for studying the performance of the fault reconstruction
scheme (test dataset). Selecting the parameters of the Gaussian kernel functions of
the SVDD as C = 0.08 and σ = 6.5 produced the control limit of the NGS statistic
for a significance level of 5 %.
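The data generation described above can be sketched as follows. The mixing matrix is taken from Eq. (4.16); the random seed, the function name `simulate` and the row-wise storage of samples are assumptions of this example, so the generated numbers will not match those of the original study.

```python
import numpy as np

# Mixing matrix of Eq. (4.16); the transpose makes A a 6 x 3 matrix.
A = np.array([[0.815, 0.906, 0.127, 0.913, 0.632, 0.098],
              [0.279, 0.547, 0.958, 0.965, 0.158, 0.971],
              [0.957, 0.485, 0.800, 0.142, 0.422, 0.916]]).T

def simulate(n_samples=1000, seed=0):
    """Generate z = A s + e with uniform sources and Gaussian noise."""
    rng = np.random.default_rng(seed)
    S = rng.uniform(0.0, 1.0, size=(n_samples, 3))             # sources on [0, 1]
    E = rng.normal(0.0, np.sqrt(0.01), size=(n_samples, 6))    # noise variance 0.01
    return S @ A.T + E, S

Z, S = simulate()
Z_ref, Z_test = Z[:500], Z[500:]    # reference data and test dataset
```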
The first fault condition was a bias of magnitude 2 injected into the second sensor
in the test dataset after the 250th sample. The conventional ICA-SVDD (Liu et al. 2008)
method could successfully detect this fault. Next, applying the fault identification and
isolation method and the ICA contribution plots produced the plots shown in Fig. 4.1.
As discussed in Sect. 4.2.2, values below or equal to 1 imply that this fault condition
is responsible for the detected abnormal event. Values above 1 indicate that this fault
condition cannot be considered as a potential root cause. Given the way in which the
fault was injected, the fault identification index for reconstructing along the direction
of variable 2 should be identified as significant, which plot a confirms. However, the
identification results of the ICA contribution plots in Fig. 4.1, plots b and c,
show that most variables respond significantly to this event for $I^2_{cont,j}$, as do
variables 4 and 5 for $SPE_{cont,j}$.
Fig. 4.1 Average results for 250 samples describing bias in sensor 2. a SVDD reconstruction
(index versus assumed fault condition). b ICA contribution plot (I2cont versus variables).
c SPE contribution plot (SPEcont versus variables) and individual plots for first ten samples
(sample number as additional axis). d SVDD reconstruction. e ICA contribution plot.
f SPE contribution plot
To analyze this misleading result in more detail, the way in which the variable
contributions are generated needs to be reexamined
$$I^2_{cont,j} \propto e_j^T Q^{\dagger} B \hat{s} = e_j^T Q^{\dagger} B B^T Q z = e_j^T \Omega z. \tag{4.17}$$
This separation shows that the fault contributions to the ICA-$I^2$ and -SPE statistics
depend on $e_j^T \Omega z$ and $e_j^T \Gamma z$, respectively. Assuming that the ith sensor is faulty,
the contribution of this sensor bias upon the residual variables $e_j$ is as follows
m
m
2
Icont,j = ωTj z = ωij zi SPEcont,j = γ Tj z = γ ij zi (4.20)
i=1 i=1
where zi is the magnitude of the sensor bias, ω ωTj and γ Tj is the jth row vector of
and whose ith elements are ij and γ ij , respectively. As no assumptions can
generally be imposed on and with respect to its symmetry etc., there is no
guarantee that the residual associated with the faulty sensor is the largest one. The
above equation also suggests the possibility that variable contributions which, by
default, should be insignificant may, in fact, be significant and indicate that these
variables are affected by the fault, which this simulation example confirms. For
d = m = m*, Eqs. (4.19) and (4.20) suggest that I 2 cont, j = zj , which produces a
correct variable contribution to the fault condition.
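The smearing effect described above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the 3-by-3 matrix `Omega` (standing for the matrix whose rows are the vectors ω_j^T) holds made-up values chosen only for illustration, and the function name `i2_contributions` is hypothetical.

```python
import numpy as np

def i2_contributions(Omega, z):
    """Return the vector of omega_j^T z, one entry per variable j."""
    return Omega @ z

# Hypothetical, non-symmetric matrix (illustrative values only, not from the text).
Omega = np.array([[0.9, 0.8, 0.1],
                  [0.2, 0.3, 0.1],
                  [0.1, 0.6, 0.7]])

# Bias of magnitude 2 on the second sensor only (index 1).
z = np.array([0.0, 2.0, 0.0])

contrib = i2_contributions(Omega, z)
largest = int(np.argmax(np.abs(contrib)))  # index of the largest contribution

# Special case: with an identity matrix the contributions equal the bias vector.
contrib_identity = i2_contributions(np.eye(3), z)
```

With these assumed values the largest contribution falls on the first variable rather than on the biased second sensor, whereas the identity case reproduces the bias vector exactly, mirroring the d = m = m* remark.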
With the application of the reconstruction technique, a different picture emerges.
A total of six fault directions, one for each sensor, were considered. Figure 4.1a
shows the fault reconstruction index for each of the examined fault conditions. As
expected, a bias for the second sensor yielded the smallest value, which is below 1;
hence, according to Eq. (4.14), the reconstruction approach has its most significant
impact for this sensor. In contrast, reconstructing the other fault directions
(sensor faults) had an insignificant impact, leaving the fault reconstruction indices
above 1. It should be noted that plots a–c represent the average values for the
250 faulty samples. For illustrative purposes, Fig. 4.1 also shows the reconstruction
indices and the variable contributions for the first ten samples in plots d–f.
Krzanowski (1979) developed a method for measuring the similarity of two datasets
using a PCA similarity factor, $S_{PCA}$. Consider two datasets, $D_1$ and $D_2$, that have
the same variables but not necessarily the same number of measurements, and suppose that
$k$ principal components have been selected for each of the two PCA models. The PCA similarity
factor measures the similarity between datasets $D_1$ and $D_2$ or, to be exact,
between the two reduced subspaces. Denoting the two subspaces by $L$ and $M$, the PCA
similarity factor can be calculated from the angles between principal components:

$$S_{PCA} = \frac{1}{k} \sum_{i=1}^{k} \sum_{j=1}^{k} \cos^2 \Theta_{ij} \qquad (4.22)$$

where $\Theta_{ij}$ is the angle between the $i$th principal component of dataset $D_1$ and the $j$th
principal component of dataset $D_2$, defined by

$$\cos \Theta_{ij} = \frac{L_i^T M_j}{\|L_i\|_2 \, \|M_j\|_2},$$

with $\|L_i\|_2 = \|M_j\|_2 = 1$ and $0 \le \Theta_{ij} \le \pi/2$. $S_{PCA}$ can also be expressed in terms of the
subspaces $L$ and $M$ as

$$S_{PCA} = \frac{\operatorname{trace}(L^T M M^T L)}{k} \qquad (4.23)$$

If the value of $S_{PCA}$ exceeds a specified cutoff, the two datasets $D_1$ and $D_2$ are
considered to be similar. Singhal and Seborg (2002) modified this similarity factor by
introducing a different weight for each principal component. Zhao
et al. (2004) also modified the similarity factor by introducing the principal angle,
in order to mine the most similar information between two datasets.
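The trace form of the similarity factor in Eq. (4.23) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the function names `pca_loadings` and `pca_similarity` are hypothetical, the loadings are taken from an SVD of the mean-centered data, and the random datasets exist only to exercise the code.

```python
import numpy as np

def pca_loadings(X, k):
    """Return the first k principal directions of X as columns (n_variables x k)."""
    Xc = X - X.mean(axis=0)  # mean-center each variable
    # Rows of Vt are orthonormal principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def pca_similarity(X1, X2, k):
    """PCA similarity factor: trace(L^T M M^T L) / k."""
    L = pca_loadings(X1, k)
    M = pca_loadings(X2, k)
    return np.trace(L.T @ M @ M.T @ L) / k

# Two made-up datasets with the same 4 variables but different sample counts.
rng = np.random.default_rng(0)
D1 = rng.normal(size=(40, 4))
D2 = rng.normal(size=(25, 4))

s_12 = pca_similarity(D1, D2, k=2)  # lies between 0 and 1
s_11 = pca_similarity(D1, D1, k=2)  # a dataset compared with itself
```

Because the loading vectors have unit norm, the entries of `L.T @ M` are the cosines of the angles between components, so the trace form agrees with the double sum of squared cosines in Eq. (4.22).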
convenience and economy with which it serves these ends. If any
other property institution can, in a given situation, serve a given end
more easily and more cheaply than the institution of interest, then,
in that situation, the institution of interest—other things being equal
—is immoral and should be abolished. If, in the given situation, no
other property institution can serve the given end more easily and
more cheaply than the institution of interest, then that institution is
moral and should be retained. That is, from the modern sociological
point of view, the institution of interest is inconceivable except as a
means to some end outside itself. As a means it is to be judged in a
purely objective and pragmatic manner by the ordinary standards of
cost price, economic, social, and other.
The method of the ancients is entirely otherwise. Assuming still the
correctness of the modern viewpoint (a viewpoint which, be it said, is
not unassailable, and indeed is assailed by divers radicals, socialists
and others, though for the most part persons lacking in pecuniary
reputability), the mistake that the Early Church fathers make is
that of taking the means for an end. They have many arguments
against interest but all these arguments can be criticised for this one
error. The fathers elevate interest to the dignity of an end in itself.
Interest, qua interest, is condemned. It is taking advantage of a
brother's necessity. It is grinding the face of the poor. It is producing
pride, luxury, and vice. As soon as moral value is attached to
anything, it is, of course, viewed as an end in itself. If it be true that
interest is an end in itself, then the fiercest diatribes of the fathers
are none too severe. Assuming their premises, their conclusions
follow inevitably. The modern man—he is not unknown—who talks
about the "sacred rights" of private property is guilty of the same
error as the ancient Christians, the error of mistaking means for
ends. The early Christians could not see that the property institution
of interest is neither good nor bad except as it is good or bad for
something. The something determines the judgment. As a matter of
historical fact the condemnation of interest developed in certain
early stages of human civilization and at those stages interest was
socially detrimental. At those stages, however, it was exceedingly
rare and correspondingly infamous. In any country where there is
abundance of good, free land the phenomenon of interest on money
will disappear, provided labor is free. So it disappeared in the
northern states of this Union in the latter part of the 18th century.
These phenomena caused the southerners to adopt slavery though
all their English traditions had declared it immoral for more than
three centuries. The relation of interest to slavery under a condition
of free land is the relation of cause and effect, i.e., the requirement
of interest will produce slavery and the abolition of interest will
abolish slavery.[26] These social phenomena are of importance in our
consideration of the early Christian doctrine of interest. That doctrine
was largely evaded and disobeyed but it still had great effect and
that effect was toward the abolition of slavery. We do not mean that
this economic doctrine alone resulted in the abolition of slavery, or
even that it was a chief cause in the abolition of slavery, it was not
obeyed well enough to be such a chief cause; but so far as it was
obeyed, it tended in that direction.
The net result of all Christian teaching together was to prolong the
existence of the institution of slavery for two centuries, perhaps for
three. The doctrine of the sinfulness of interest however, worked
toward emancipation and forced slavery in its later end to become
almost wholly agricultural, i.e., to yield income as rent. Slaves cannot
be employed in commerce or industry in sufficient numbers to be
profitable where the institution of interest is banned as it was in the
'dark ages.' The Christian concept of interest undermined ancient
civilization by abrogating, slowly but surely, the institution of
property by which such gangs of 'manufacturing slaves' as made the
fortune of Crassus, could alone be made profitable. It is an historical
curiosity that it accomplished this result without any attack on the
institution of slavery itself.
As soon as Christian doctrines became widespread enough to
produce important social results we find Christian slave owners
manumitting their slaves in considerable numbers. It is no
derogation to the influence of the doctrine of human brotherhood or
to the humanity of the Christian slave owners to mention the fact
that the doctrine of the sinfulness of interest, by tending to make
slavery unprofitable, aided in the process of bringing to light the real
content of the doctrine of human brotherhood, and of making the
humane practice of manumission easier by the removal of certain
economic impediments.
In order to understand properly the working of the prohibition of
interest and its relation to manumission, it is necessary to carry the
analysis one step farther to its ultimate physical basis, which was the
conditioning factor of actual practice and eventually of theory also.
The exhaustion of the soil of western Europe which was the result of
ancient methods of agriculture, together with the rising standard of
living and the competition of other more fertile agricultural regions
like Egypt and North Africa resulted in the substitution of the
latifundia for small landholdings.[27] As the pressure continued the
latifundia in turn became economically unprofitable under forced labor
(slavery) and large tracts of land were abandoned. In order to put
this land under agriculture again the charge upon it had to be
reduced by the substitution of (relatively) free associated labor,
villeinage or serfdom. But this change cut off the economic margin
upon which the structure of ancient civilization was built and is the
ultimate economic reason assignable for the fall of Rome. Of course
the collapse of the empire could, theoretically, have been avoided
had the Romans of the first three centuries A.D. been content to live
the toilsome and frugal life of the Romans of the early republic. But
this was an utter impossibility in practice. This slowly working and
hardly understood decline in the relative and actual ability of ancient
agriculture to sustain the weight imposed upon it, enables us to see
why the sinfulness of interest could be steadily inculcated even
though steadily evaded, by Christians from the beginning, while
manumission was not taught at all in the beginning and only worked
up to the dignity of a pious action relatively late.[28] It also explains
why manumission of household and personal slaves preceded that of
agricultural slaves. Of course there is nothing peculiarly Christian
about this later phenomenon and the operation of other causes is
discernible, but it is important for our purpose to observe that
Christian practice, and Christian theory in property matters in the
long run, followed the broad lines of the underlying economic
evolution.[29] The application of this to the origin of Christian
monasticism and to the revival of communistic theories by the later
Church fathers lies at the very outside limit of our study but will be
briefly touched on after we have considered the final overthrow of
the communistic property concepts as they appear in the earlier
fathers up to and including Tertullian.
Clement of Alexandria (153-217 A.D.) has the distinction of being the
first Christian theological writer who clearly expounds the concept of
private property which has held sway without substantial change in
the Church until the present time. This statement does not apply to
the doctrine of receiving interest on money. In respect to this
doctrine Clement is in perfect accord with all other early Christians
both before and after himself. Indeed he specifically states that the
Mosaic prohibition against taking interest from one's brother extends
in the case of a Christian to all mankind. But in regard to all other
property institutions Clement's attitude is essentially that of any
modern Christian of generous disposition.
In all that Clement has to say about property, and the 'bulk' of his
'property passages' is as great as that of all previous Christian
writers together, he speaks like a man on the defensive. Indeed
there has come down to us no other Christian writing earlier than his
time which presents his view, with the dubious exception of some
passages in Hermas. The fact seems to be that while Clement is
undoubtedly presenting an apologetic for the existing practice in the
Church of his day, that practice was felt to be more or less open to
attack in the light of certain scripture passages. Communism as an
existential reality was gone by the time of Clement—whatever may
have been the extent—probably a limited one—to which it had
existed in the earlier ages. But while communism as a fact was
dead, communism as an idea or ideal of Christian economy was not
dead. Indeed Clement's views about the morality of wealth were so
different from those of previous writers that a great modern
economist[30] in treating of this subject ventures the opinion, though
doubtfully, that the reason why Clement, alone among the great
early theologians, was never canonized by the Church was that he
ran counter to popular belief on this subject. This opinion is probably
erroneous. Clement's theological opinions have a semi-Gnostic tinge
quite sufficient to explain the absence of his name from the calendar
of saints.
Clement justifies the institution of private property. He justifies, on
the highest ethical and philosophical principles, the possession by
Christians of even the most enormous wealth. His apologetic is not
an original one. He borrows it bodily from Plato. Indeed he quotes
Plato verbatim, invocation to Pan and the other heathen gods
included.[31] The originality lies in applying this Platonic doctrine to
the exposition of Christian scripture. Clement's method is strictly that
of Biblical exegesis. In the well-known sermon or essay "Who is
the Rich Man that shall be saved" he takes up practically all of the
scriptural passages which seem opposed to the institutions of private
property and explains them in so modern a spirit that the whole
sermon might be delivered today in any ordinary Church and would
be readily accepted as sound and reliable doctrine. His thesis is that
wealth or poverty are matters in themselves indifferent. That riches
are not to be bodily gotten rid of, but are to be wisely conserved and
treated as a stewardship intrusted to the owner by God. That charity
to the poor should be in proportion to one's wealth and that a right
use of wealth will secure salvation to the upright Christian even
though he possesses great riches all his life and leaves them to his
heirs. The wealth that is dangerous to the soul is not physical
possessions, but spiritual qualities of greed and avarice.
His views can be best expressed by himself. We give two
characteristic passages from the sermon above referred to.[32] "Rich
men that shall with difficulty enter into the kingdom, is to be
apprehended in a scholarly way, not awkwardly, or rustically, or
carnally. For if the expression is used thus, salvation does not
depend upon external things, whether they be many or few, small or
great, or illustrious or obscure or esteemed or disesteemed; but on
the virtue of the soul, on faith and hope and love and brotherliness,
and knowledge, and meekness and humility and truth the reward of
which is salvation." "Sell thy possessions. What is this? He does not,
as some off hand conceive, bid him throw away the substance he
possesses and abandon his property; but he bids him banish from
his soul his notions about wealth, his excitement and morbid feeling
about it, the anxieties, which are the thorns of existence which
choke the seed of life. And what peculiar thing is it that the new
creature, the Son of God intimates and teaches? It is not the
outward act which others have done, but something else indicated
by it, greater, more godlike, more perfect, the stripping off of the
passions from the soul itself and from the disposition, and the
cutting up by the roots and casting out of what is alien to the mind."
"One, after ridding himself of the burden of wealth, may none the
less have still the lust and desire for money innate and living; and
may have abandoned the use of it, but being at once destitute of
and desiring what he spent may doubly grieve both on account of
the absence of attendance and the presence of regret."[33]
We have now come to the beginning of what is in many respects the
most interesting period in the history of property concepts. It is a
period in which everything is upside down and wrong end to. In that
strange age we find a famous archbishop, one of the world's noblest
orators, a man of the most spotless integrity and the most saintly
life, publicly preaching in the foremost pulpit of Christendom
doctrines of property, the implications of which, the most hardened
criminal would scarcely venture to breathe to a gang of thieves.[34]
We find the most learned scholar of the century, in the weightiest
expositions of Christian Scripture, penning the most powerful
apologetic of anarchy that is to be found in the literature of the
world.[35] We find one of the greatest of the popes, a man whose
genius as a statesman will go down to the latest ages of history,
setting forth in a manual for the instruction of Christian bishops,
property concepts more radical than those of the fiercest Jacobins in
the bloodiest period of the Terror.[36]
Stranger still, these incredible performances are the strongest proofs
of the wisdom and piety of the men responsible for them. These
men are today honored as the saviors of civilized religion and their
images in bronze and marble and painted glass adorn the proudest
temples of the most conservative denominations of Christians. The
strange history of these famous men: Athanasius, the two Gregories,
Basil and Chrysostom in the East; Augustine, Ambrose, Jerome and
Gregory in the West, lies outside the limits of our study. But the
explanation of their desperate and uncompromising communism can
be given in a word. It was the communism of crisis: the communism
of shipwrecked sailors forced to trust their lives to a frail lifeboat
with an insufficient supply of provisions. These great Christian
scholars, enriched by all the accumulated culture of their civilization,
saw that culture falling into ruin all around them; they felt the
foundations of that civilization trembling beneath their feet. To vary
the figure, they beheld the rising tide of ignorance and barbarism
rapidly engulfing the world and with desperate haste they set to
work rebuilding and strengthening the ark of the Church that in it,
religion, and so much of civilization as possible, might be saved till
the flood subsided. Their task, perhaps the most important and most
urgent, that men have ever had to perform, was of such a nature
that they cared not what they wrecked in order to accomplish it.
They ripped up the floor of the bridal chamber for timber and took
the doors of the bank-safe for iron.
These rhetorical figures are violent; but they are less violent than
the reality they are intended to express. Monasticism was the last
desperate hope of civilized Christianity and these men knew it. To
establish monasticism they degraded the sanctity of marriage and
denounced the sacredness of property. They conferred the most
sacred honors upon the lowliest drudgery;[37] they turned princes
into plowmen and nobles into breakers of the soil. Some historians,
judging them by the different standards of a later age, have
pronounced them fanatics led astray by vulgar superstition. But
judged by the needs of their own age, judged by the inestimable
services rendered to the world by the monastic system they
instituted, they are entitled to a place far up in the list of the wisest
and the ablest of the human kind.
Sketchy and imperfect as the above study necessarily is, it
nevertheless gives the primary facts which are essential to an
understanding of the important part played by property concepts
and property institutions in the transformation of early Christianity
from a predominantly eschatological to a practically socialized
movement.
FOOTNOTES:
[1] Cf. Plato, Laws, V, 742. Aristotle, Politics, 1:X, XI. Cicero, De
Officiis, II, XXV. Seneca, De Beneficiis, VII, X.
[2] Acts IV.
[3] I. Cor. vii 30.
[4] Rom. xiii 3.
[5] Jas. Chap. V.
[6] Chaps. 21-22.
[7] Chap. xxxviii.
[8] Did. IV. 8.
[9] Barn. XIV. 16.
[10] Schaff, Vol. 1.
[11] Past. V. vi. 6.
[12] Past. S. IX. XXX. 5.
[13] Past III. 2.
[14] Apol. I. IV.
[15] Apol. I. xiv.
[16] De Mort. Per. XIV.
[17] Apol. XXXIX.
[18] Eus., E. H., V. 18.
[19] De Lapsis, VI.
[20] See Pronouncement of the Sacred Penitentiary, 11 Feb.,
1832.
[21] Sir James Macintosh.
[22] Civil Disabilities of the Jews.
[23] Lowrie, Monuments of the Early Church, Chap. II.
[24] Lowrie, ibid.
[25] Cf. Hippolytus.
[26] Cf. A. Loria, Economic Foundations of Society. (Int.)
[27] Cf. A. Loria, Economic Foundations of Society. (Int.)
[28] Circa 200(?).
[29] Cf. K. Marx, Das Kapital, Vol. 1.
[30] F. Nitti in Catholic Socialism.
[31] Phaedrus, The Laws, in Strom. II, 6.
[32] Chap. XIV.
[33] Chap. XXXI.
[34] Chrysostom, Sermons Rich Man and Lazarus, etc.
[35] Jerome, Commentaries.
[36] Gregory, Pastoralis Cura.
[37] Laborare est orare.
[38] Chap. I.
[39] E.g., Clement and Hermas.
CHAPTER III
THE EARLY CHURCH AND THE POPULACE
The transformation of early Christianity from an eschatological to a
socialized movement was the result of the interaction of three social
groups—three 'publics'—the Jewish, the Pagan, and the Christian. It
was a single movement, working itself out through these three
'crowds'. Christianity, like all other great religions, was in its first
beginnings essentially a mob phenomenon—that is to say it was a
very slow movement which had a long history back of it.
Perhaps no current opinion is more unfounded than the notion that
mob movements are sudden and unpredictable. They are almost
incredibly slow of development. The range of action found in the
mob is more narrowly and rigidly circumscribed than in almost any
other social group. A crowd is open to suggestions that are in line
with its previous experience, and to no others.
The initial success of Christ with the Jewish crowds was only possible
because for generations the whole Jewish public had been looking
forward to a Messiah and a Messianic kingdom. In so far as Christ
appeared to fulfill this preconceived expectation he gained popular
support. When he disappointed it, he lost his popularity and his life.
The early and enormous success of the apostles on the day of
Pentecost and immediately afterwards was due primarily to the fact
that the Chiliastic expectation preached to the Jerusalem crowds was
very closely in line with their inherited beliefs. As soon as Christianity
began to develop doctrines and practices even slightly at variance
with those traditional to Judaism it lost the support of the Jewish
public. Beginning as a strictly Jewish sect, it alienated practically the
whole Jewish race within little more than a generation. This
alienation was the inevitable effect of an idea of universalism
opposed to the hereditary Jewish nationalism. This idea of
universalism was not a new thing. It was to be found in the ancient
Jewish scriptures. But it had never become popularized. It formed no
part of the content of contemporary public opinion among the Jews.
Christianity met with success in the great cosmopolitan centers, like
Antioch and Alexandria, where universalism was a tradition and had
become a part of the crowd sentiment. It succeeded best of all in
Rome where universalism reached its highest development. Yet even
here a limitation is to be noted. Christianity was universal in its
willingness to receive people of all races and nations. It was not
universal in its willingness to acknowledge the validity of other
religions. This variation from the traditional Greek and Roman
universalism had momentous results. It made the propagation of the
Christian Gospel much more difficult and involved the church, at
least temporarily, in the current syncretism which was a popular
movement. So e.g., we find Justin calling Socrates a Christian and
asserting that the stories of Noah and Deucalion are merely versions
of the same event.
The main characteristics of crowd psychology are familiar enough.
Crowds do not reason. They accept or reject ideas as a whole. They
are governed by phrases, symbols, and shibboleths. They tolerate
neither discussion nor contradiction. The suggestions brought to
bear on them invade the whole of their understanding and tend to
transform themselves into acts. Crowds entertain only violent and
extreme sentiments and they unconsciously accord a mysterious
power to the formula or leader that for the moment arouses their
enthusiasm.
Any movement in order to become popular, in order to 'get over' to
the general public, has to operate within the limits set by this
psychology. The amount of change, adaptation, and development
necessary before a movement can fit into these limitations and
express itself powerfully within them is so considerable that no
historical example can probably be found where the required
accommodation has been accomplished in less than three
generations. It is the purpose of this chapter to trace, so far as the
surviving source material permits, the steps of this accommodation
in the case of early Christianity.
For some time before Christ the Jewish people had been restless.
Their desires and aspirations for national and religious greatness had
been repressed and inhibited. The unrest thus generated took
various forms: patriotic uprisings, religious revivals, etc. Christ was
at first considered merely as another Theudas or Judas of Galilee or
John the Baptist. In the pagan world the pax Romana produced a
somewhat similar restlessness. Travel increased; wandering, much of
it aimless, characterized whole classes of people;[1] there was a
marked increase in crime, vice, insanity, and suicide which alarmed
all the moralists. This condition of affairs was eminently suitable for
the first beginnings of a crowd movement; indeed no great crowd
movement can begin except under such circumstances. The
wanderings of St. Paul and the other Christian apostles, called
missionary journeys, were really only particular cases of a general
condition. The same organic demand for new stimulation, the same
sense of shattered religious and philosophic ideals prevailed in the
pagan as in the Jewish world. It would be hard to find a greater
contrast of character than Christ and Lucian. Yet the fiery
earnestness with which Christ denounces contemporary Jewish
religiosity and the cool cynicism with which Lucian mocks at the
pagan piety of the same age have a like cause. Economic pressure
on the lower strata of society contributed to the unrest. The slave,
the small shopkeeper, and the free artisan had a hard time of it in
the Roman world. Economically oppressed classes are material ready
to the hand of the agitator, religious or other. In the crowd
movements recorded in the Acts we can trace the first beginnings of
the Christian populace.[2] "In Iconium a great multitude both of
Jews and of Greeks believed but the Jews that were disobedient
stirred up the souls of the Gentiles and made them evil affected
against the brethren. But the multitude of the city was divided and
part held with the Jews and part with the apostles." At Lystra there
was a typical case of mob action where the apostles were first
worshipped and then stoned. In the cases of the mobs at Philippi
and Ephesus we see the economic motive, the threatened loss of
livelihood, entering along with anger at an attack on the received
religion. In the case of the Jerusalem and Athenian crowds we see
acceptance, or at least acquiescence, on the part of the crowd up to
the point where Christianity breaks with their tradition. In general
we see anger on the part of the crowds only after agitation
deliberately stirred up by interested parties: priests, sorcerers,
craftsmen or the like. Generally speaking the antipathy is no part of
the crowd psychology, and on occasion the crowd may be on the
side of the missionaries of the new religion. In general also the
Christians were not sufficiently numerous to make a counter crowd
demonstration of their own.
In Pliny's letter to Trajan, although it is a generation later than the
Acts and refers to a region where Christianity had been preached for
a considerable period of time, we find a marked instability in the
attitude of the public: "Many of every age, every rank and even of
both sexes are brought into danger and will be in the future. The
contagion of that superstition has penetrated not only the cities but
also the villages and country places and yet it seems possible to stop
it and set it right. At any rate it is certain enough that the temples
deserted until quite recently begin to be frequented, that the
ceremonies of religion, long disused, are restored and that fodder for
the victims comes to market, whereas buyers for it were until now
very few. From this it may easily be supposed that a multitude of
men can be reclaimed if there be a place of repentance."[3]
There seems no reasonable ground for doubting that Pliny's
judgment was correct. While the blood of the martyrs is doubtless
the seed of the church, a continuous, general, and relentless
persecution can extirpate a religion in a given nation; as the history
of the Inquisition abundantly proves. Still more easily can
propaganda for the older religion win back its former adherents of
the first and second generations. It is not, in general, till a
generation has grown up entirely inside a new religion that such a
religion is well established. The generation which at maturity makes
the rupture with the older faith can be brought back to it by less
expenditure of energy than was expended by them in breaking away
in the first place. The success of the Jesuits e.g., is quite inexplicable
on any other hypothesis. The generation who are children at the
time their parents make the break with the old religion are
notoriously undependable in religious matters. It was in all
probability these people that Pliny had to deal with. It is at least
permissible to hazard the guess that the Laodiceans who aroused
the wrath of the author of the Revelation were of this generation. It
is certain that many of the 'Lapsi' who caused so much trouble to
Christian apologists and church councils belonged in this
chronological class.
In Justin Martyr we have a hint of a further development in the
crowd attitude toward the Christians. Justin says: "When you (Jews)
knew that He had risen from the dead and ascended to heaven as
the prophets foretold He would, you not only did not repent of the
wickedness you had committed, but at that time you selected and
sent out from Jerusalem chosen men through all the land to tell that
the godless heresy of the Christians had sprung up and to publish
those things which all they, who knew us not, speak against us. So
that you are the cause not only of your own unrighteousness but
that of all other men."[4]
Irrespective of the exact historical accuracy of this statement, it is
indicative of the process, technically known as 'circular interaction,'
which is so essential a step in the development of popular opinion
and the building up of crowd sentiment. Before any group of people
can become either popular or unpopular there must be a focusing
and fixation of public attention upon them. Even in the New
Testament we find the Jews sending emissaries from city to city to
call attention to the Christian propaganda. Prejudice against the
Christians was thus aroused in persons who had never either seen or
heard them. The basis of 'circular interaction' is unconscious or
subconscious emotional reaction. A's frown brings a frown to the
face of B. B's frown in turn intensifies A's. This simple process is the
source of all expressions of crowd emotion. By multiplication of
numbers and increase in the stimuli employed it is capable of
provoking a vicious circle of feeling which eventually causes
individuals in a crowd to do things and feel things which no
individual in the crowd would do or feel when outside the circle. It is
to the credit or discredit of the Jews that they first set this 'vicious
circle' in operation against the Christians. Of course the same
psychological principle operated to produce zeal and enthusiasm and
contempt of pain and death in the Christian 'crowd'. By this process
of 'circular interaction' the name, 'Christian,' had already in the time
of Justin become a mob shibboleth. It seems to have operated
precisely as the shibboleth 'traitor' operates on a patriotic crowd in
war time, or 'scab' on a labor group. It became a shibboleth of
exactly opposite significance in the Christian 'crowd'. The way was
thus prepared for the next step in the process of developing the
ultimate crisis. This step—the disparate 'universe of discourse'—is
exhibited in process of formation in the account of the martyrdom of
Polycarp. The account, as we have it, undoubtedly contains later
additions, but these additions, even of miraculous elements, do not
necessarily invalidate those portions of the story with which we are
alone concerned. The martyrologist certainly had no intention of
writing his story for the purpose of illustrating the principles of group
psychology and the undesigned and incidental statements of crowd
reactions are precisely the ones of value for our purpose. A few brief
excerpts are sufficient to illustrate the stage reached in the growth
of the disparate 'universe of discourse.' "The whole multitude,
marvelling at the nobility of mind displayed by the devout and godly
race of Christians, cried out: 'Away with the Atheists: let Polycarp be
sought out.'"[5] "He went eagerly forward with all haste and was
conducted to the Stadium where the tumult was so great that there
was no possibility of being heard."[6]
"Polycarp has confessed that he is Christian. This proclamation
having been made by the herald, the whole multitude both of the
heathen and Jews who dwelt in Smyrna cried out with uncontrollable
fury and in a loud voice: 'This is the teacher of Asia, the father of
the Christians and the overthrower of our gods, he who has been
teaching many not to sacrifice or to worship the gods.' Speaking
thus they cried out and besought Philip, the Asiarch, to let loose a
lion upon Polycarp. But Philip answered that it was not lawful for him
to do so seeing the shows of beasts were already finished. Then it
seemed good to them to cry out with one voice that Polycarp should
be burned alive."[7]
"This then was carried into effect with greater speed than it was
spoken, the multitude immediately gathering together wood and
fagots out of the shops and baths, the Jews especially, according to
custom eagerly assisting them in it."[8]
"We afterwards took up his bones, as being more precious than the
most exquisite jewels and more purified than gold and deposited
them in a fitting place, whither, being gathered together as
opportunity is allowed us, with joy and rejoicing the Lord shall grant
us to celebrate the anniversary of his martyrdom both in memory of
those who have already finished their course and for the exercising
and preparation of those yet to walk in their steps."[9]
In the disparate universe of discourse in its complete form common
shibboleths produce entirely different mental reactions—usually
antagonistic ones. There is also complete accord as to the
shibboleths. The cry here is at one time against the Atheists, then
against the Christians. But the Christians could and did deny the
charge of Atheism. They were as antagonistic to Atheism as the
Pagans. An incomplete development of crowd feeling is evident on
the part of the pagans. The Jews are still the inciters and leading
spirits of the mob. The very statement that the Jews acted
'according to custom' shows that mobbing Christians was still looked
upon as a peculiarly Jewish trait. It was not yet entirely spontaneous
on the part of the pagan public. Most noticeable of all is the
indifference of the mob toward the Christians' adoration of relics of
the martyrs. No effort was made to prevent the Christians from
obtaining the bones of Polycarp. Either the cult of relics was not
known to the pagans and Jews—though it seems to be firmly
established among the Christians—or else, the effect of the cult in
perpetuating Christianity had not yet had time to make itself
manifest to the pagan public—or to the Jewish. In any case we have
here the plain evidence of the imperfectly developed condition of the
crowd mind, owing perhaps to a too short tradition.
Our next evidence is the martyrdoms of Lyons and Vienne preserved
in a letter quoted by Eusebius. "They (the Christians) endured nobly
the injuries inflicted upon them by the populace, clamor and blows
and draggings and robberies and stonings and imprisonments and all
things which an infuriated mob delight in inflicting on enemies and
adversaries."[10]
"When these accusations were reported all the people raged like wild
beasts against us, so that even if any had before been moderate on
account of friendship, they were now exceedingly furious and
gnashed their teeth against us.
"When he (Bishop Pothinus) was brought to the tribunal
accompanied by a multitude who shouted against him in every
manner as if he were Christ himself, he bore noble witness. Then he
was dragged away harshly and received blows of every kind. Those
men near him struck him with their hands and feet, regardless of his
age, and those at a distance hurled at him whatever they could
seize, all of them thinking that they would be guilty of great
wickedness and impiety if any possible abuse were omitted. For thus
they thought to avenge their own deities."[11]
"But not even thus was their madness and cruelty toward the saints
satisfied. Wild and barbarous tribes were not easily appeased and
their violence found another peculiar opportunity in the dead bodies.
For they cast to the dogs those who had died of suffocation in the
prison and they exposed the remains left by the wild beasts and by
fire mangled and charred. And some gnashed their teeth against
them, but others mocked at them. The bodies of the martyrs having
thus in every manner been exposed for six days were afterwards
burned and reduced to ashes and swept into the Rhone so that no
trace of them might appear on the earth. And this they did as if able
to conquer God and prevent their new birth; 'that', as they said,
'they may have no hope of a resurrection through trust in which they
bring to us this foreign and new religion.' "[12]
We have in this account a marked advance, as regards the
development of the mob mind, over what is found in the martyrdom
of Polycarp. Many of the 'crowd' phenomena are indeed the same
but the differences are even more striking than the similarities. We
find in Lyons no body of Jews or other especially interested persons
leading the mob on by manifestations of peculiar zeal and
forwardness. When the accounts are compared in their entirety it
becomes at once manifest that there is a consistency of attitude, a
whole heartedness in the actions of the Lyons mob that is lacking in
the case of the Smyrnaeans. There is a degree of familiarity with
Christian doctrine—especially the doctrine of the resurrection—which
denotes a much more thorough permeation of the public mind by
Christianity. There may be no difference in the hatred of the two
mobs for the new faith, but it had more content in the mind of the
Gallic crowd. The degree of thought and pains taken by the Lyonese
persecutors—the guards placed to prevent the Christians from
stealing the relics of the martyrs, the elaborate efforts to nullify the
possibility of a resurrection—the very extent and thoroughness and
duration of the persecution are different from anything to be found
in the other martyrdom.
The difficulty to be explained—if it is a difficulty—from the point of
view of crowd psychology is that there is a difference of only eleven
years—taking the ordinary chronology—between the two
persecutions. It is true that the Lyons persecution is the later, but
the difference in the mob behavior is such as might well demand the
lapse of a generation had the phenomena been exhibited by the
public of the same city. There must unquestionably have been a
great difference in the demotic composition of the populations of
Lyons and Smyrna; the reference to barbarians in Lyons shows as
much, but the behavior of mobs as controlled by the time needed for
the focusing and fixation of attention and the development of a
disparate universe of discourse is very little affected by difference of
demotic composition. It has indeed been suggested by one critic,[13]
that the persecution at Lyons belongs in the reign of Septimius
Severus instead of that of Marcus Aurelius. This would explain away
the difficulty, but there seems no necessary reason for adopting this
opinion. It would rather appear that there existed peculiar conditions
in Lyons and vicinity which account for the fact that the persecution,
so far as we know, was confined to that locality and also for the fact
that the mob mind was in a maturer state of antagonism to
Christianity. Just what these peculiar conditions were, it is impossible
to say with entire certainty. However there is at least a very
suggestive hint in a paragraph by the greatest modern authority on
Roman Gaul[14] contained in his well known volume on Ancient
France.[15] The paragraph is also worth quoting as giving a valuable
insight into the psychology of the peoples of the ancient Roman
World. "The Roman Empire was in no wise maintained by force but
by the religious admiration it inspired. It would be without a parallel
in the history of the world that a form of government held in popular
detestation should have lasted for five centuries. It would be
inexplicable that the thirty legions of the Empire should have
constrained a hundred million men to obedience. The reason of their
obedience was that the Emperor, who personified the greatness of
Rome was worshipped like a divinity by unanimous consent. There
were altars in honor of the Emperor in the smallest townships of his
realm. From one end of the Empire to the other a new religion was
seen to arise in those days which had for its divinities the Emperors
themselves. Some years before the Christian era the whole of Gaul,
represented by sixty cities, built in common a temple near the city of
Lyons in honor of Augustus. Its priests, elected by the united Gallic
cities, were the principal personages in their country. It is impossible
to attribute all this to fear and servility. Whole nations are not servile
and especially for three centuries. It was not the courtiers who
worshipped the prince, it was Rome, and it was not Rome merely
but it was Gaul, it was Spain. It was Greece and Asia."
While no dogmatic assertion is justified, it does not, perhaps, exceed
the limits of reasonable inference to suppose that the existence of
this noted center of Emperor worship in the immediate
neighborhood of Lyons may account, in part at least, for the especial
hatred of the populace of that city for persons who refused to
sacrifice to the Emperor and also for the maturity of their feeling
against the Christians, who were as far as we are aware, probably
the only persons who refused thus to sacrifice. This stray bit of
evidence is admittedly not conclusive. It is offered merely for what it
may be worth. There is evidence that by the middle of the second
Century popular opinion was sufficiently inflamed against the
Christians to render the administration of justice precarious because
of mob violence. Edicts of Hadrian and Antoninus Pius specifically
declared that the clamor of the multitude should not be received as
legal evidence to convict or to punish them, as such tumultuous
accusations were repugnant both to the firmness and the equity of
the law.[16]
This attitude seems to have persisted with relatively little change for
about a century. During this period the official 'persecutions' were
neither numerous nor severe. From the very few scattered and
incidental references which have alone survived regarding the mob
feeling of the time, we can assert no more than that it was an
exasperated one, likely to break out upon provocation but under
ordinary circumstances more or less in abeyance. On the whole it
was undoubtedly more violent at the end of the period than at the
beginning.
Fortunately from the middle of the third Century onwards we have a
fairly continuous history of a single 'public' (Alexandria) which is
lacking before this time. The Alexandrian populace were noted for
their tumultuous disposition, but we have no reliable account of their
behavior towards the Christians until the time of Severus, 202 A.D. In
the account given by Eusebius of the martyrdom of the beautiful
virgin, Potamiaena, it is stated that: "the people attempted to annoy
and insult her with abusive words." As however the intervention of a
single officer sufficed to protect her from the people on this
occasion, the public sentiment cannot have been inflamed to any
alarming extent. If we may trust Palladius, her martyrdom was the
result of a plot of a would-be ravisher and in any case it was not the
product of any spontaneous popular movement.
In the period between 202 A.D. and 249 A.D. a well developed
tradition of hatred and violence grew up in the popular mind. We
have no record of the steps in the process but the extant accounts
of the Decian and Valerian persecutions in Alexandria leave no doubt
of the fact. These persecutions can only be called 'legal' by a violent
stretch of verbal usage. They were mob lynchings, sometimes
sanctioned by the forms of law, but quite as often without even the
barest pretense of judicial execution. They were quite as frequent
and as savage in the later part of the reign of Philip, as in the time
of Decius. They were not called forth by any imperial edict—they
preceded the edict by at least a year and were of a character such
as no merely governmental, legal process would ever, or could ever,
take on. Mobbing Christians had become a form of popular sport, a
generally shared sort of public amusement—exciting and not
dangerous. The letter of Bishop Dionysius makes this very clear. To
quote: "The persecution among us did not begin with the royal
decree but preceded it an entire year. The prophet and author of
evils to this city moved and aroused against us the masses of the
heathen rekindling among them the superstition of their country and
finding full opportunity for any wickedness. They considered this the
only pious service of their demons that they should slay us." Then
follows a long list of mob lynchings of which we take a single
specimen: "They seized Serapion in his own house and tortured him
and having broken all his limbs, they threw him headlong from an
upper story."[17] "And there was no street, nor public road, nor lane
open to us night or day but always and everywhere all of them cried
out that if anyone would not repeat their impious words, he should
be immediately dragged away and burned. And matters continued
thus for a considerable time. But a sedition and civil war came upon
the wretched people and turned their cruelty toward us against one
another. So we breathed for a while as they ceased from their rage
against us."[18]
The mob broke loose against the Christians again the following year,
but there is no object in cataloguing the gruesome exhibitions of
crowd brutality. It is evident that what we have in this account is no
exhibition of political oppression by a tyrannical government, but a
genuine outbreak of group animosity which had been long
incubating in the popular mind. All the phenomena which are
characteristic of fully matured public feeling are found complete;
circular interaction, shibboleths, sect isolation devices and the rest.
When public feeling has developed to such a degree of intensity as
this, the accumulated sentiment and social unrest must of necessity
discharge themselves in some form of direct group action. This
direct action however may take the form either of physical violence
or, under certain conditions, of some sort of mystical experience;
conversion, dancing, rolling on the ground, etc. In exceptional cases
the two forms are combined. An illustration of this latter
phenomenon is given by Bishop Dionysius in this same letter; "In
Cephro, a large assembly gathered with us and God opened for us a
door for the word. At first we were persecuted and stoned but
afterward not a few of the heathen forsook their idols and turned to
God."[19] It is necessary to mention perhaps the largest, and
certainly the most dignified and respectable crowd that is to be met
with in connection with this persecution—that of Carthage on the
occasion of the martyrdom of Bishop Cyprian. We find here neither
rage on one side nor unseemly exaltation on the other. Pagans and
Christians alike behaved with decent seriousness at the death of that
famous man who was equally respected by all classes of the
population. But martyrs of the social eminence of Cyprian were very
rare, and orderly behavior in such a vast multitude as witnessed his
end was still rarer.
To return to the populace of Alexandria. The long peace of the
Church which intervened between the persecution of Valerian and
that of Diocletian witnessed in Alexandria, as elsewhere, a great
growth of Christianity in numbers, influence, and wealth. It would
perhaps be going beyond the evidence to say that in this interval,
the majority of the population of the city were won over to the new
faith, but it is certain that the number of Christians became so great
as to intimidate the pagan portion of the people. The Alexandrian
mob was still very much in evidence but it gradually ceased to
harass the Christians except under the most exceptional
circumstances. The dangers of such action became so considerable
and the chances of success so problematical that we find a period
when a practice of mutual forbearance governed the behavior of the
hostile groups.
The study of crowd psychology presents no more impressive
contrast than that exhibited by the people of Alexandria during the
Diocletian persecution compared with their behavior during that of
Decius. In the last and greatest of the persecutions, in the most
tumultuous city of the empire, the mob took no part. Like the
famous image of Brutus, it is more conspicuous by its absence than
it would be by its presence. The persecution was a purely
governmental measure officially carried out by judges and
executioners in accordance with orders. In one obscure and doubtful
instance we are told that the bystanders beat certain martyrs when
legal permission was given to the people to treat them so. In
another case we are told that the cruelty of the punishments filled
the spectators with fear. These are the only references to the public
that occur in the long and minute account of an eye witness of
famous events extending over a considerable number of years. Both
before and after this period the mob of the Egyptian metropolis
exhibits the utmost extreme of religious fanaticism. During this
period that mob had to be most carefully considered by the
government in other than religious matters. But as a religious power
it did not exist. Had the persecution of Diocletian happened a
generation earlier it could have counted on a very considerable
degree of popular support, had it happened a generation later it
would have caused a revolt that could only have been put down by a
large army. Happening at the precise time it did, it provoked no
popular reaction at all.
This strange apathy is not peculiar to Alexandria. Practically without
exception the authentic acts of the martyrs of this persecution are
court records taken down by the official stenographers in the
ordinary course of the day's work. They are dry, mechanical, and
repetitious to a degree. They exhibit, in general, harassed and
exasperated judges driven to the infliction of extreme penalties in
the face of a cold and skeptical public. One imperial decree ordered
that all men, women, and children, even infants at the breast,
should sacrifice and offer oblations, that guards should be placed in
the markets and at the baths in order to enforce sacrifices there.
The popular reaction in Caesarea is thus recorded: "The heathen
blamed the severity and exceeding absurdity of what was done for
these things appeared to them extreme and burdensome."[20] "He
(the Judge) ordered the dead to be exposed in the open air as food
for wild beasts; and beasts and birds of prey scattered the human
limbs here and there, so that nothing appeared more horrible even
to those who formerly hated us, though they bewailed not so much
the calamity of those against whom these things were done as the
outrage against themselves and the common nature of man."[21]
The one thing to be said of this type of mob mind is manifestly that
it is transitional. The pendulum has swung through exactly half its
arc and for the brief instant presents the fallacious appearance of
quiescence. How transitory this quiet was on the part of the
Alexandrian mob is evidenced by the history of Athanasius. That
great statesman conciliated and consolidated public opinion in Egypt.
Backed by this opinion he practically cancelled the power of the civil
authorities of the country and negotiated as an equal with the
emperors. For the first time in more than three centuries the will of
the common people again became a power able to limit the military
despotism which dominated the civilized world.
The re-birth of popular government in the Fourth century through
the agency of Christian mobs is the most important preliminary step
in the growth of the political power of the Catholic Church. A study
of the mobs of Alexandria, Rome, Constantinople and other great
cities shows beyond question that the political power of the Church
had its origin in no alliance with imperial authority, but was
independent of and generally antagonistic to that authority. The
history of these Christian mobs lies outside the limits of our study
but it is worth while in the case of the Alexandrian populace to give
two or three brief extracts illustrating the final steps of the process
which changed a fanatically pagan mob into an equally fanatical
Christian one. What we have to consider is only the last stage of an
evolution already more than half complete at the time of the Nicene
Council. Under extreme provocation and certain of imperial
complacency at their excesses, the pagan mob during the reign of
Julian indulged in one last outburst against the exceedingly
unpopular George of Cappadocia who had been forcibly intruded into
the seat of Athanasius. To quote the Historian Socrates: "The
Christians on discovering these abominations went forth eagerly to
expose them to the view and execration of all and therefore carried
the skulls throughout the city in a kind of triumphal procession for
the inspection of the people. When the pagans of Alexandria beheld
this, unable to bear the insulting character of the act, they became
so exasperated that they assailed the Christians with whatever
weapons chanced to come to hand, in their fury destroying numbers
of them in a variety of ways and, as it generally happens in such a
case, neither friends nor relations were spared but friends, brothers,
parents, and children imbrued their hands in each other's blood. The
pagans having dragged George out of the church, fastened him to a
camel and when they had torn him to pieces they burned him
together with the camel."[22] In this account we see the last expiring
efforts of the pagan mob movement. Any mob movement collapses
rapidly when it turns in upon itself, and the evil results of its violence
react immediately upon the members of the mob. By this time it is
evident that the number of Christians in Alexandria was so large that
any public persecution of them brought serious and unendurable
consequences upon the populace generally. Then the movement
ended.
But in the two centuries or more that the pagan movement lasted, a
contrary Christian mob movement had been developing along the
same general lines as the other. This movement, being later in its
inception, came to a head correspondingly later and reached its
crisis under the patriarch Cyril. Its violence was first directed against
the Jews whom the Christians appear to have hated even more than
they hated the pagans. The Jews were the weaker and less
numerous faction opposed to the Christians and as the Pagans seem
to have liked them too little to support them against the Christians, it
is not surprising that the Christian mob, which had pretty well
reduced the political authorities to impotence, should vent its rage
against the Jews and their synagogues. "Cyril accompanied by an
immense crowd of people, going to their synagogues, took them
away from them and drove the Jews out of the city, permitting the
multitude to plunder their goods. Thus the Jews who had inhabited
the city from the time of Alexander were expelled from it."[23]
Sometime after the expulsion of the Jews, the Christian mob, now
directing its spite against the rapidly disappearing paganism,
perpetrated perhaps the most atrocious crime that stains the history
of Alexandria—the murder of Hypatia. This beautiful, learned, and
virtuous woman, 'the fairest flower of paganism' is one of the very
few members of her sex who has attained high eminence in the
realm of philosophical speculation. She enjoyed the deserved esteem of
all the intellectual leaders of her age—Christian as well as pagan—
and to the latest ages her name will be mentioned with respect by
all those speculative thinkers whose respect can confer honor.
Socrates describes her murder as follows: "It was calumniously