Verschoren A. (Ed), Lowen R. (Ed) - Foundations of Generic Optimization, Volume 2 - Applications of Fuzzy Control, Genetic Algorithms and Neural Networks (2008)
MATHEMATICAL MODELLING:
Theory and Applications
VOLUME 24
This series is aimed at publishing work dealing with the definition, development
and application of fundamental theory and methodology, computational and
algorithmic implementations and comprehensive empirical studies in mathe-
matical modelling. Work on new mathematics inspired by the construction of
mathematical models, combining theory and experiment and furthering the
understanding of the systems being modelled are particularly welcomed.
Managing Editor:
R. Lowen (Antwerp, Belgium)
Series Editors:
R. Laubenbacher (Virginia Bioinformatics Institute, Virginia Tech, USA)
A. Stevens (Max Planck Institute for Mathematics in the Sciences, Leipzig,
Germany)
The titles published in this series are listed at the end of this volume.
Foundations of Generic
Optimization
Volume 2: Applications of Fuzzy Control, Genetic
Algorithms and Neural Networks
Edited by
R. Lowen
University of Antwerp, Belgium
and
A. Verschoren
University of Antwerp, Belgium
Robert Lowen Alain Verschoren
University of Antwerp University of Antwerp
Belgium Belgium
springer.com
W. Peeters
Abstract This chapter may serve as an introductory article, and is meant to give an
overview of the mathematical methods applied in fuzzy control techniques, such as
fuzzification, aggregation and defuzzification. We will also discuss the advantages
and disadvantages of the several techniques, with respect to the achievability of
their goals, and we will give a brief overview of “hybrid techniques”, techniques
that involve fuzzy control as well as other artificial intelligence computing methods,
such as neural networks and genetic algorithms.
1 Introduction
1.1 History
Fuzzy control ([21, 168]) is a tool to model the control of complex systems derived
from knowledge obtained by human experience. Unlike ordinary expert systems,
fuzzy control systems do not require the time-consuming process of designing
appropriate algorithms for modelling human behavior. Owing to its relative heuristic
simplicity, fuzzy control is an excellent means to control engineering-oriented
applications without a thorough understanding of the underlying mechanism; often it is
sufficient to develop a control strategy from a few simple “rules of thumb”, which
constitute a mere sufficient collection of conditions to keep the system stable, i.e. such that the
W. Peeters
University of Antwerp
Dept. of Mathematics and Computer Science
Middelheimlaan 1
B-2020 Antwerp, Belgium, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization, 1
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 1–138.
c 2008 Springer.
error can be kept within reasonable bounds. In the example of a fuzzy controlled car
([142]) for instance, the designers would want to make sure their vehicle does not
bump into other objects, without prior knowledge of the precise location of those
objects, so that the car would still function in a different environment. Fuzzy
control is used in a wide range of applied sciences, including physics, electronics and
economics. It is a powerful tool for steering complex processes without the need to
design difficult tuning functions. By their nature, fuzzy control systems are
convenient tools for modelling human experience, and even for adaptive learning
of control behavior. In particular, the areas where these techniques are crossbred
with other successful self–tuning algorithms, such as neural networks and genetic
algorithms (see Section 10), have produced very interesting results, although the
design of a fuzzy controller is inherently heuristic.
While the first application of fuzzy sets to control theory, in this case on a steam
engine, occurred in 1975, performed by E.H. Mamdani and S. Assilian ([94]), the
first practical industrial application can be traced back to 1982, by L.P. Holmblad
and J.J. Østergaard, who applied fuzzy control to a cement kiln. Methods to control
an automated car ([142]) were extended to automated steering systems for trains
([163]); these examples show that the first applications of fuzzy control invariably
occurred in big industrial processes. Only in the late 1980s, after some successful
implementations by Japanese manufacturers of fuzzy controllers in household ap-
pliances, such as vacuum cleaners and cameras ([60, 155]), the interest in the study
of fuzzy controllers has grown to worldwide proportions. Credit is due to M. Sugeno
([141]), whose work was an important source of inspiration for implementing fuzzy
control systems as a contemporary innovation to popular appliances, thus making
fuzzy control a widely accepted, economically profitable and quite popular topic in
engineering sciences. Fuzzy control is an approach for control systems that aim to
model human experience, alternative to expert control systems ([16]). However, its
origins trace back to control engineering rather than to techniques of artificial intelli-
gence. Fuzzy control is mostly a rule-based system, where the designer heuristically
formulates a set of control rules, which makes the scope of fuzzy control narrower
than general expert control systems. The main advantage is the relative simplicity
with which fuzzy rule bases can be defined, refined or tuned. Further studies,
however, have shown that the design, the robustness and the capability of outperforming
conventional control systems, such as PID controllers, depend largely on the
circumstances in which one wants to perform fuzzy control.
We have to admit at the same time that fuzzy control theory suffers from some
serious drawbacks, that have been repeatedly targeted by its adversaries, and which
have been the subject of some heated debates over the last few decades, where
the question openly arises what the advantage of fuzzy control is as opposed to
classical control theory ( [1]). Without going into detail, we feel it necessary to
summarize these counterarguments, so that the reader can bear these in mind, al-
though we do not think any of the arguments make fuzzy control theory absolutely
superfluous.
• Fuzzy control theory is a largely empirical and heuristic theory, which lacks a
unifying design theory.
An Overview of Fuzzy Control Theory 3
• For much too long, “fuzzy” has been a buzz–word in commercial applications,
making the notion devoid of its content.
• While fuzzy controllers have proved their usefulness in relatively simple control
schemes, multivariable control systems are much harder to develop, while crisp
control methods do not suffer from this drawback. For larger, more complex
systems, the time consumed by the design of a fuzzy controller is almost equal
to, or even exceeds, the time needed to construct a classical controller, derived
from knowledge about dynamical systems.
• A generalized fuzzy control-specific stability analysis method (see Section 9)
does not exist yet, and many of the existing methods are simply generalizations
of crisp control stability methods.
• The mathematics behind crisp control theory involves much more difficult math-
ematical methods, which makes fuzzy control “the easy way out”. We believe
however, that the relative simplicity of fuzzy control can be an advantage as well,
because some of the nonlinear differential equations that describe accurately the
physical model of a control system are not analytically solvable anyhow, very
unstable for perturbations, and then fuzzy control might as well be as good as
any other approximation theory.
This first section will focus on the basic definitions and notations, while in
Section 2, we will establish a working definition for fuzzy rule bases, thus creat-
ing an environment in which fuzzified data can be considered as input, so that we
can complete the design of a fuzzy controller in Section 3. Aggregation and impli-
cation operators, and the process of defuzzification will respectively be studied in
Sections 4 and 5. All this theory will be illustrated with an extended example of an
automated heating system in Section 6. Simplifications of the fuzzy control theory,
such as table-based controllers and Sugeno controllers will be studied in Section 7,
and the design of adaptive fuzzy controllers, such as self-tuning and self-organizing
controllers, in Section 8. Section 9 will contain a brief summary of stability
control techniques for fuzzy controllers, while in the final Section 10 we will briefly
describe which other artificial computing techniques can successfully be combined
with fuzzy controllers.
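[Figure: block diagram of the fuzzy controller, with blocks for preprocessing, implication, aggregation and postprocessing]
[Figure 2: nonlinear scaling of measurements between −10 and 10, with break–points labelled small, medium and large]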
processes that may be carried out in the preprocessing block comprise, but are not
limited to:
• Quantization of the measurements. When sampling, typical errors occur that
are caused by rounding off to integers, depending on the coarseness of the
quantization steps or the precision scale of the measuring equipment.
Quantization is a means to reduce the data input, but if it is too coarse, the
controller may oscillate around the reference or even become unstable. The number
of quantization steps is therefore always a trade-off between the available
computing resources and the desired precision. If the allowed measurement
values are for instance only −4, −3, −2, −1, 0, 1, 2, 3, 4, a measurement of x = 2.5
is rounded off to 3, causing an error of 0.5, being 6.25% of the total width of the
range space.
One possible solution to overcome problems with quantization is nonlinear
scaling ([62]) — see Figure 2. Typically, the end user is asked to enter three typ-
ical numbers for a small, medium and a large measurement, respectively. These
numbers then are considered as the break–points on a piecewise linear curve that
scale the incoming measurements. Although similar techniques are used in the
fuzzy controller itself, this technique is strictly speaking completely independent
of it.
• Normalization or scaling of the measurements onto a particular, standard range
• Removing noise by filtering
• Averaging out the results over a number of measurements, in order to obtain the
tendencies over a longer term
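The quantization example and the piecewise linear scaling mentioned in the list above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the convention of rounding halves upward, and the target values 1/3, 2/3 and 1 assigned to the small, medium and large break–points are hypothetical choices, not taken from [62].

```python
import math

def quantize(x, lo=-4, hi=4):
    """Round x to the nearest integer level in [lo, hi]; halves are rounded up."""
    level = math.floor(x + 0.5)
    return max(lo, min(hi, level))

def relative_error(x, lo=-4, hi=4):
    """Quantization error as a fraction of the total width of the range."""
    return abs(quantize(x, lo, hi) - x) / (hi - lo)

def nonlinear_scale(x, small, medium, large):
    """Piecewise linear scaling through the break-points
    (0, 0), (small, 1/3), (medium, 2/3), (large, 1), extended oddly to x < 0.
    Assumes 0 < small < medium < large; the target values are hypothetical."""
    xs = [0.0, small, medium, large]
    ys = [0.0, 1 / 3, 2 / 3, 1.0]
    sign = -1.0 if x < 0 else 1.0
    ax = min(abs(x), large)  # saturate beyond the largest break-point
    for i in range(3):
        if ax <= xs[i + 1]:
            slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            return sign * (ys[i] + slope * (ax - xs[i]))
    return sign * 1.0

print(quantize(2.5), relative_error(2.5))
```

The built-in `round` is deliberately avoided here, since it rounds halves to the nearest even integer and would map 2.5 to 2 rather than 3.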
Ideally, fuzzy control (and by extension, any kind of automated control) would
be a process in which a relatively small number of control parameters are given as
input, and a desired output state is required. Of course this will almost never succeed
immediately, and an error between the desired output and the actual output will be
generated. Consequently, the controller designer will try to adjust the input values
in such a way that the output varies as a (preferably continuous) function of
the input values. In conventional (nonfuzzy) controllers, depending on
the chosen model, the errors over a certain span of time, say k measurements,
$e_t, e_{t-1}, \ldots, e_{t-k+1}$, as well as the output control values $u_{t-1}, \ldots, u_{t-k+1}$, will
be stored in memory, and possibly a model $u_t := f(e, u)$ will be designed that
determines the action to be taken.
In fuzzy control, it is not necessary to explicitly define the control action as an
input of the previous control and error variables, and instead, a set of control rules
will be defined by means of linguistic variables. The various rules generate a number
of rule consequences, which are then combined in one fuzzy set that describes the
possible control actions that can be taken, a process which will be called aggregation
(see Section 4) . Finally, a suitable method will have to be designed to generate
from this rule consequence one crisp control value; this latter process will be called
defuzzification (see Section 5). Apart from this, one also has to consider the number
of input signals, the shape of the fuzzy membership functions that make up the
linguistic variables, the number of fuzzy rules and much more.
Since, however, the rules in this knowledge base are the only means by which the system
designer can express expert knowledge, the behavior of the system will be
fundamentally influenced by this design. Therefore, the necessary time should be reserved
to obtain and derive these rules. It will be mandatory to have a suitable set of rules
to obtain a closed–loop behavior of the system and to finally reach some kind of
equilibrium. Sugeno and Nishida recommend in [142] the following ways to find
control rules:
• The operator’s experience and the control engineer’s knowledge. In [62], an op-
erator’s handbook for a cement kiln, such a collection of rules of thumb is estab-
lished by organizing an extensive questioning of experts on the subject. This is a
very time-consuming process.
• Fuzzy modelling of the operator’s control actions. Fuzzy IF–THEN rules can be
deduced from observation of an operator’s control actions or a log book. The
rules express input–output relationship.
• Fuzzy modelling of the process. Considering the linguistic rule base as the inverse
model of the control process, this inverse model may be used to obtain the fuzzy
control rules. This model can only be used with relatively low order systems,
but it provides an explicit solution to the inverse problem, assuming that fuzzy
models of the open- and closed-loop systems are available. For more information,
we refer to [84] and [116].
H.J. Zimmermann adds in [168] that the following sources may also be useful:
– Crisp modelling of the process
– Heuristic design rules
– On–line adaptation of the rules
• Self–learning controllers. Other interesting and more recent approaches are those
in which the controller determines the rules itself. The theory of fuzzy control
is crossbred with theory involving genetic algorithms and neural networks (see
Section 10), and this has recently produced some encouraging results.
This list is, however, neither complete nor universally necessary. Just as in
conventional control, an increase in the knowledge of the system design will lead to
better control results. There is, however, no fixed design procedure in fuzzy control;
the various freeware and commercial software tools all use different strategies to
establish the rule base.
The reason for the vast success of fuzzy controllers is their fairly simple
computational behavior; their obvious weakness, however, is the inherently
heuristic nature of the design of a fuzzy controller. The wide choice of
shapes and parameters for the control variables shows the need for a solid
mathematical foundation, next to some obvious heuristic constraints which the controlled
system has to satisfy. Mathematically speaking, fuzzy control is based on the
concept of fuzzy sets as introduced by L.A. Zadeh ([164] and [165]), extending the
notion of membership from a two-valued logic to one in which the
membership values vary continuously within I = [0, 1].
The most obvious kind of fuzzy control is the so-called direct control. The out-
put of the process is directly compared to a desired reference value, and if there
is a deviation, the controller will take action depending on the numerical value of
the error, the change in error and/or the cumulative error. This kind of controller is
an immediate substitute for the so-called PID–controllers (P–roportional I–ntegral
D–erivative). Another possible strategy is the so-called feedforward controller, that
As everybody who is familiar with the basic concept of fuzzy control knows, three
key issues in the design of a fuzzy control system are:
• The choice of a suitable set of fuzzy variables, being functions from the space
in which control measurements are performed. Mostly this will be functions α
from R (or commonly, a closed interval thereof) to I.
• The choice of an implication function, or, equivalently, a set of linguistic rules,
each of the type
where the denoted variables Xi are linguistic, and linked to the fuzzy membership
sets αi , and coupled with an aggregation function to combine the consequences
of these assertions, and an implication function
• The choice of a suitable defuzzification method, assigning one crisp value to
the aggregated consequence function
Any combination of the three above will be referred to as a fuzzy controller block.
1.5 Notations
A fuzzy set will be denoted as µ : X −→ I, where X is the universe and I the unit
interval [0, 1]. The collection of all fuzzy sets on X shall be denoted as F(X).
The following properties of fuzzy sets will be used throughout this text:
For any fuzzy set µ ∈ F(X) and any number α ∈ [0, 1], the α –cut of µ will be
denoted as the following crisp subset of X:
Γα (µ ) = {x ∈ X : µ (x) ≥ α }
and the strong α –cut of µ will be denoted as the following crisp subset of X:
For any fuzzy set µ ∈ F(X), the following crisp subsets of X will be of utmost
importance:
• The support of µ equals
with the usual closure operator for a topology on X, in the worst case being the
discrete structure.
• The core of µ equals
The core of a normal fuzzy set is called the kernel of the fuzzy set.
1.5.6 Example
Note that the core of a fuzzy set may be empty. For instance, consider the following
fuzzy set on X = [0, 1]:
\[
\mu(x) = \begin{cases} x & \text{if } x \in [0, 1[ \\ 0 & \text{if } x = 1 \end{cases}
\]
If (X, d) is a metric space, then the width of a fuzzy set µ ∈ F(X) will be
defined as
\[
\operatorname{width}(\mu) = \sup_{x, y \in \operatorname{supp} \mu} d(x, y)
\]
([91]).
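The notions introduced in this subsection can be tried out on a finite sample of the universe. The sketch below is only illustrative and rests on assumptions: it takes the strong α–cut to be the set of points with membership strictly above α, the support to be the set of points with positive membership, and the core to be the set of points with membership exactly 1; on a finite grid the closure operator has no effect. All function names are hypothetical. Exact fractions are used so that the comparisons with α are not disturbed by rounding.

```python
from fractions import Fraction

def alpha_cut(mu, universe, alpha, strong=False):
    """Points whose membership is >= alpha (or > alpha for the strong cut)."""
    if strong:
        return [x for x in universe if mu(x) > alpha]
    return [x for x in universe if mu(x) >= alpha]

def support(mu, universe):
    """Points with positive membership (closure is trivial on a finite grid)."""
    return [x for x in universe if mu(x) > 0]

def core(mu, universe):
    """Points with full membership (assumed definition)."""
    return [x for x in universe if mu(x) == 1]

def width(mu, universe):
    """Largest distance between two points of the support, with d(x, y) = |x - y|."""
    s = support(mu, universe)
    return max(s) - min(s) if s else 0

# The fuzzy set of the example: mu(x) = x on [0, 1[ and mu(1) = 0.
def mu(x):
    return x if x < 1 else Fraction(0)

universe = [Fraction(i, 10) for i in range(11)]  # 0, 1/10, ..., 1
print(core(mu, universe))                        # empty: no point reaches 1
print(alpha_cut(mu, universe, Fraction(1, 2)))
```

On the sampled grid the width is smaller than on the full interval, because the supremum in the definition is only approximated by the grid points.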
The term linguistic variable was used for the first time by L.A. Zadeh in [166].
Although not quite mathematically elaborate, as we will refine the definition further on, Zadeh
defined a linguistic variable as a quintuple (x, T(x), U, G, M).
Here x would be the name of the variable and T(x) the list of linguistic values the
variable would be able to assume. To a mathematician, the idea of T(x) being a
multiple-valued function is of course uncomfortable. Each of these values is taken
as a representative for a fuzzy variable µ_name : U −→ [0, 1] denoting the degree
to which an object u satisfies the linguistic variable X. U would then be the set of
2.1.2 Example
Let x for instance be “body weight”, then T (x) might for instance be
and if the universe U denotes the weight of a person in kilograms, for instance
\[
\mu_{\text{overweight}}(u) = \begin{cases} 1 - \left| 1 - \dfrac{u - 80}{20} \right| & \text{if } u \in [80, 120] \\ 0 & \text{otherwise} \end{cases}
\]
In that case, quite some confusion arises between the function as an object and its
outcome values, and also with the terms that are generated by G. We would find for
instance
although in the literature, these terms are often denoted as T(x) too. In order to avoid all
this confusion, we will take the liberty of denoting linguistic variables as well as
their possible outcome values by fuzzy sets µ : X −→ [0, 1] where X is the universe
of discourse, either equalling R or a subset thereof.
The only instance where the use of the “grammar” G may be useful, is when we try
to define new linguistic variables, starting from other existing linguistic variables.
2.2.1 Definition
\[
G : (F(X))^S \to F(X), \qquad (\mu_s)_{s \in S} \mapsto G((\mu_s)_{s \in S})
\]
in such a way that for all x, y ∈ X and for all $(\mu_s)_{s \in S}$, $(\nu_s)_{s \in S}$ we have that
Summarized, the value of (G(µ))(x) depends only on the value of µ(x) and not
on the values µ(y) with y ≠ x ∈ X.
2.2.2 Examples
2. ([7]) Let µold (x) be defined as above, then we define two new linguistic variables
(Figure 4)
[Figure 4: graphs of µ_old, µ_not old, µ_very old and µ_fairly old on X = [0, 100]]
In fact, if we define $\mu_{p\text{–old}}(x)$ to be $(\mu_{\text{old}}(x))^p$, we can even consider the
linguistic variable
\[
\mu_{\text{absolutely old}}(x) := \begin{cases} 0 & \text{if } x < 100 \\ 1 & \text{if } x \geq 100 \end{cases} \;=\; \lim_{p \to +\infty} (\mu_{\text{old}}(x))^p
\]
On the other hand, taking $\lim_{p \to 0} (\mu_{\text{old}}(x))^p$ results in the linguistic variable
\[
\mu_0(x) := \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}
\]
which represents undecidedness, since all ages, except for a negligible set, are
considered to be equally “old”.
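The effect of the exponent p can be checked numerically. The sketch below is an illustration under assumptions: the definition of µ_old is not repeated in this passage, so a linear membership function on [0, 100] is assumed, and the exponents 2 and 1/2 for “very old” and “fairly old” are the customary choices rather than values confirmed by the text. The function names are hypothetical.

```python
def mu_old(x):
    """Assumed membership function for "old": linear from 0 at age 0 to 1 at age 100."""
    return min(max(x / 100.0, 0.0), 1.0)

def mu_p_old(x, p):
    """The modified linguistic variable (mu_old(x))**p, for p > 0."""
    return mu_old(x) ** p

def mu_very_old(x):
    return mu_p_old(x, 2)      # assumed exponent for "very"

def mu_fairly_old(x):
    return mu_p_old(x, 0.5)    # assumed exponent for "fairly"

# Large p approaches "absolutely old"; small p approaches the undecided variable.
for age in (0, 1, 50, 99, 100):
    print(age, mu_p_old(age, 10_000), mu_p_old(age, 1e-6))
```

For every age strictly between 0 and 100 the first column of values tends to 0 and the second to 1, which is the behaviour of the two limits described above.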
When designing any fuzzy controller, one starts with taking a finite collection of
rule antecedents, consisting of fuzzy variables, which we will denote by
\[
A = \{\alpha_i : X \to I\}_{i=1}^{n}
\]
If two rule antecedents $\alpha_i$ and $\alpha_j$ are not disjoint, they will be called overlapping.
• An antecedent rule base will be called a cover if and only if
\[
\forall x \in X : \sum_{i=1}^{n} \alpha_i(x) = 1.
\]
The set of all such collections A of rule antecedents shall be denoted as P ∗ (F(X)),
being the collection of all finite subsets of F(X), the fuzzy sets on X. The conse-
quence functions can be considered as members of the same set.
Sooner or later, the designer will have to face the question of how to build the
terms of the fuzzy rule base. Two important questions should therefore be answered:
(1) how are the shapes of the fuzzy sets determined, and (2) how many sets are
necessary and sufficient? As for the first question, we will give an overview of the
most commonly used fuzzy sets in fuzzy control.
2.3.2 Example
[Fig. 5: Triangular fuzzy set]
[Fig. 7: Trapezoidal fuzzy set]
The core still is a singleton {b}, while its support equals [a, c]. For example
µ (x) = “x is a good temperature for swimming” may be represented by (a, b, c) =
(15, 40, 50).
3. Trapezoidal rules (Figure 7)
\[
\mu(x) = \begin{cases}
\dfrac{x - a}{b - a} & \text{if } x \in [a, b] \\
1 & \text{if } x \in [b, c] \\
\dfrac{x - d}{c - d} & \text{if } x \in [c, d] \\
0 & \text{otherwise}
\end{cases}
\]
This time, the core of the trapezoidal rule equals the crisp interval [b, c], while its
support equals [a, d]. For example µ(x) = “x is a good temperature for gardening”
may be represented by (a, b, c, d) = (10, 15, 20, 25).
These three examples have the disadvantage that they are not differentiable. In
some cases, for example when a smooth change in the controller function is
desired, we would prefer the use of C^∞ functions. Therefore, some continuous
modifications of the basic piecewise linear rules mentioned above exist in the
literature. We will give a few examples:
[Fig. 8: Gaussian fuzzy set]
\[
\mu(x) = e^{-\frac{(x - x_0)^2}{2\sigma^2}}
\]
where x0 is called the mean and σ the standard deviation, a parameter that
determines the width of the fuzzy set. Normally, the support of µ equals the
(unbounded) set R; however, any restriction to a closed interval [a, b] ⊆ R may
also be considered. Note however that in this context, the Gaussian curve does
not have its traditional probabilistic meaning.
A variation on this definition that does not make use of the exponential function
is given by (Figure 9)
\[
\mu(x) = \frac{1}{1 + \left( \dfrac{x - x_0}{\sigma} \right)^2}
\]
where again, σ is a parameter that determines the width. The same remarks
regarding the support of this fuzzy set are valid.
[Figure 10: FL Smidth fuzzy sets for a = 1, a = 2, a = 3 and a = 4]
5. FL Smidth controllers
A parameter family of fuzzy sets that is often used in fuzzy control is the so-
called FL Smidth controllers collection (Figure 10). It is given by
\[
\mu(x) = 1 - e^{-\left| \frac{\sigma}{x - x_0} \right|^{a}}
\]
in which the extra parameter a controls the gradient of the sloping sides. The
following figure shows examples of FL Smidth controllers for a ∈ {1, 2, 3, 4}:
Note however that these fuzzy sets are only differentiable in certain particular
cases (a = 2, a = 4, ...).
6. Cosine functions
Another way to generate a variety of membership functions is by using a compo-
sition of a linear function and a cosine function. We define an s–curve as
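\[
\mu_{s(a,b)}(x) = \begin{cases}
0 & \text{if } x < a \\
\dfrac{1}{2} + \dfrac{1}{2} \cos\left( \dfrac{x - b}{b - a}\, \pi \right) & \text{if } x \in [a, b] \\
1 & \text{if } x > b
\end{cases}
\]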
where a, b ∈ X will be called the left breakpoint and right breakpoint respectively
(Figure 11).
A z–curve then will be defined as a reflection of an s–curve: for the breakpoints
c, d ∈ X, we define (Figure 12)
\[
\mu_{z(c,d)}(x) = \begin{cases}
1 & \text{if } x < c \\
\dfrac{1}{2} + \dfrac{1}{2} \cos\left( \dfrac{x - c}{d - c}\, \pi \right) & \text{if } x \in [c, d] \\
0 & \text{if } x > d
\end{cases}
\]
[Fig. 11: s-curve µ_s(a,b)]
[Fig. 12: z-curve µ_z(c,d)]
[Fig. 13: π–curve µ_π(a,b,c,d)]
Finally, a π –curve can be implemented as a combination of an s–curve and a
z–curve. For any a < b < c < d ∈ X we will define (Figure 13)
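\[
\mu_{\pi(a,b,c,d)}(x) = \min\{\mu_{s(a,b)}(x), \mu_{z(c,d)}(x)\} = \begin{cases}
0 & \text{if } x < a \\
\dfrac{1}{2} + \dfrac{1}{2} \cos\left( \dfrac{x - b}{b - a}\, \pi \right) & \text{if } x \in [a, b] \\
1 & \text{if } x \in [b, c] \\
\dfrac{1}{2} + \dfrac{1}{2} \cos\left( \dfrac{x - c}{d - c}\, \pi \right) & \text{if } x \in [c, d] \\
0 & \text{if } x > d
\end{cases}
\]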
7. LR–rules (Dubois/Prade [25])
The following family of fuzzy rules are suitable for differentiable as well as non-
differentiable functions. Let S : R+ −→ [0, 1] be decreasing functions that satisfy
the following three conditions:
a. S(0) = 1
b. ∀x > 0 : S(x) ∈]0, 1[
c. $\lim_{x \to +\infty} S(x) = 0$
[Fig. 14: LR–fuzzy real number defined by a shape function]
Both LR–fuzzy real numbers and fuzzy real intervals make particularly good
choices as fuzzy rules. The advantage is that, in the literature, many interesting
descriptions of the algebraic operations on such LR–fuzzy sets exist, such as

x      10   20   30   40   50   60   70   80
µ(x)   0.4  0.6  0.8  1    0.9  0.6  0.4  0.1

In this case, X need not even be a set of numbers, but this approach falls outside
the scope of this article.
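Several of the membership functions listed above translate directly into code. The following sketch is a minimal illustration, not a reference implementation: the function names are hypothetical, and the triangular set is written in its customary form with support [a, c] and core {b}, which is an assumption since only its core and support are stated above. The trapezoidal, Gaussian and cosine-based curves follow the formulas of this example.

```python
import math

def triangular(x, a, b, c):
    """Customary triangular set: rises on [a, b], falls on [b, c] (assumed form)."""
    if a < x <= b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0

def trapezoidal(x, a, b, c, d):
    """Trapezoidal set with core [b, c] and support [a, d]."""
    if a <= x < b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return 1.0
    if c < x <= d:
        return (x - d) / (c - d)
    return 0.0

def gaussian(x, x0, sigma):
    """Gaussian set with mean x0 and width parameter sigma."""
    return math.exp(-((x - x0) ** 2) / (2 * sigma ** 2))

def s_curve(x, a, b):
    """Smooth rise from 0 at the left breakpoint a to 1 at the right breakpoint b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return 0.5 + 0.5 * math.cos((x - b) / (b - a) * math.pi)

def z_curve(x, c, d):
    """Reflection of the s-curve: smooth fall from 1 at c to 0 at d."""
    if x < c:
        return 1.0
    if x > d:
        return 0.0
    return 0.5 + 0.5 * math.cos((x - c) / (d - c) * math.pi)

def pi_curve(x, a, b, c, d):
    """Minimum of an s-curve and a z-curve, for a < b < c < d."""
    return min(s_curve(x, a, b), z_curve(x, c, d))

# "x is a good temperature for gardening", with (a, b, c, d) = (10, 15, 20, 25)
print([trapezoidal(t, 10, 15, 20, 25) for t in (8, 12.5, 17, 22.5, 30)])
```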
The question whether a fuzzy rule base contains the necessary and sufficient amount
of fuzzy sets is not so straightforward. Several considerations should be taken into
account ([69]):
• A term set should be sufficiently wide to allow for noise in the measurement.
• If there is a gap between two fuzzy sets in the fuzzy rule base, no rule will fire
for values in this gap. Hence a certain amount of overlap is desirable; otherwise
the controller may run into poorly defined states, where it does not return a
well–defined output.
• On the other hand, a good rule of thumb is that the overlap should be at least
50%. The widths of the fuzzy sets should initially be chosen so that each value
of the universe yields a nonzero value for at least two fuzzy sets in the fuzzy rule
base, except maybe for the elements at both extreme ends of the universe.
Hence the number of fuzzy sets required is invariably dependent on the width of
the fuzzy sets, and vice versa. This does not solve the question of which particular
shapes of curves should be used, though.
2.4.2 Example
[Figure: antecedent rule base on the universe X, with axis marks at 0, 20, 40, 60 and 80]
Let x ∈ X represent the age of a person, then we define five linguistic variables on
the space X, which denote the degree to which a person is “very young”, “young”,
“middle-aged”, “old” or “very old”. We could for instance take the following an-
tecedent rule base:
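\[
\mu_{\text{very young}}(x) = \begin{cases} 1 - \dfrac{x}{20} & \text{if } x \in [0, 20] \\ 0 & \text{otherwise} \end{cases}
\]
\[
\mu_{\text{young}}(x) = \left( 1 - \left| 1 - \frac{x}{20} \right| \right) \vee 0
\]
\[
\mu_{\text{middle-aged}}(x) = \left( 1 - \left| 1 - \frac{x - 20}{20} \right| \right) \vee 0
\]
\[
\mu_{\text{old}}(x) = \left( 1 - \left| 1 - \frac{x - 40}{20} \right| \right) \vee 0
\]
\[
\mu_{\text{very old}}(x) = \begin{cases} 0 & \text{if } x \leq 60 \\ \dfrac{x - 60}{20} & \text{if } x \in [60, 80] \\ 1 & \text{otherwise} \end{cases}
\]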
For instance, if a person is 28 years old, then µ_young(28) = 0.6 and µ_middle-aged(28) =
0.4, while the other three linguistic variables are zero. In case of a partition of unity,
we always have that $\sum_{\mu \in A} \mu(x) = 1$, as is the case here.
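The antecedent rule base of this example is easy to check numerically. The sketch below is a minimal illustration with hypothetical function names; it implements the five membership functions, reading the three middle ones as triangles of the form (1 − |1 − (x − c)/20|) ∨ 0, and verifies that the memberships add up to 1 for non-negative ages.

```python
def tri(x, left):
    """Triangle (1 - |1 - (x - left)/20|) v 0: support [left, left + 40], peak at left + 20."""
    return max(1 - abs(1 - (x - left) / 20), 0.0)

def very_young(x):
    return 1 - x / 20 if 0 <= x <= 20 else 0.0

def young(x):
    return tri(x, 0)

def middle_aged(x):
    return tri(x, 20)

def old(x):
    return tri(x, 40)

def very_old(x):
    if x <= 60:
        return 0.0
    if x <= 80:
        return (x - 60) / 20
    return 1.0

RULE_BASE = [very_young, young, middle_aged, old, very_old]

def memberships(x):
    """Degrees of membership of age x in each of the five linguistic variables."""
    return [mu(x) for mu in RULE_BASE]

print(memberships(28))       # young about 0.6, middle-aged about 0.4, others 0
print(sum(memberships(28)))  # about 1, up to floating-point rounding
```

Every age in [0, 80] has nonzero membership in at most two of the five sets, which matches the overlap guideline discussed before the example.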
k : IF ($X_1 = A_{k1}$) and ($X_2 = A_{k2}$) and ... and ($X_n = A_{kn}$) THEN ($Y = B_k$)
with k ∈ {1, ..., K} and where $\{A_{ki} : k \in \{1, ..., K\}, i \in \{1, ..., n\}\}$ and
$\{B_k : k \in \{1, ..., K\}\}$ are sets of linguistic values for the linguistic variables $X_1, X_2, ..., X_n$,
which we will call the antecedents, and Y, which we will call the consequence.
Basically, we would not want these rules to contradict each other, so any set of inputs
$(A_1, A_2, ..., A_n)$ should only yield one output B. Furthermore, we will call the fuzzy controller
complete if all possible combinations of antecedents occur once and just once in the
rule base. In such a case, it is easy to see that K, the number of rules in the base,
equals the product of the cardinalities of the sets of possible linguistic values of the
antecedents.
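Completeness and the absence of contradictions can be checked mechanically by enumerating every combination of antecedent values. The sketch below is only an illustration: the variable names, the three-valued term sets and the placeholder consequences are hypothetical, and rules are represented simply as pairs (tuple of antecedent values, consequence value).

```python
from itertools import product
from math import prod

def all_antecedents(term_sets):
    """Every combination of linguistic values, one value per antecedent variable."""
    return list(product(*term_sets.values()))

def is_consistent(rules):
    """No two rules with the same antecedents may yield different consequences."""
    seen = {}
    for antecedent, consequence in rules:
        if seen.setdefault(antecedent, consequence) != consequence:
            return False
    return True

def is_complete(rules, term_sets):
    """Each combination of antecedent values occurs once and just once."""
    used = sorted(antecedent for antecedent, _ in rules)
    return used == sorted(all_antecedents(term_sets))

# Hypothetical term sets for two antecedents (error and change of error).
TERM_SETS = {"E": ["N", "ZE", "P"], "dE": ["N", "ZE", "P"]}
# Placeholder consequences; a real design would fill these in heuristically.
RULES = [(antecedent, "ZE") for antecedent in all_antecedents(TERM_SETS)]

K = prod(len(values) for values in TERM_SETS.values())
print(K, len(RULES), is_complete(RULES, TERM_SETS), is_consistent(RULES))
```

In this enumeration the number of rules is the product of the sizes of the antecedent term sets, here 3 · 3 = 9.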
The key to the design of a fuzzy controller is a suitable choice of rules. When de-
signing an automatic steering system for their model car, M. Sugeno and M. Nishida
suggested in [142] that the main elements are the translation of the operator’s expe-
rience and knowledge about the control actions into a fuzzy model. The design of
a fuzzy controller and the speed of development may be greatly improved though
by applying pure heuristic design rules as well as the possibility to fine-tune the
model by on–line adaptation of the rules (see Section 8). Other fruitful approaches
have turned out to be combinations with other techniques, both crisp, such as PID
controllers, and self–learning, such as neural networks and genetic algorithms
(see Section 10).
3.1.3 Example
Suppose that we want to control a variable that has as desired value $\bar{x}_t \in \mathbb{R}$, and
suppose that we are able to measure the outcome $x_t$ at certain discrete time steps
$t \in \mathbb{N}$. Then the error is given by $e_t := x_t - \bar{x}_t$, and usually the change of error
$\Delta e_t := e_t - e_{t-1}$ is also taken into account ([16]). Given that neither $e_t$ nor $\Delta e_t$ exceeds
a certain interval, which through scaling can always be considered to be [−1, 1], a very
commonly used set of rules is given by Figure 17, where NB means
“negative big”, NM means “negative medium”, NS means “negative small”, ZE
means “almost zero”, PS means “positive small”, PM means “positive medium”
and PB means “positive big”. Each of the control variables, as well as the output
variable, is then modelled in a similar way, up to different scaling factors.
[Figure 17: membership functions µ_NB, µ_NM, µ_NS, µ_ZE, µ_PS, µ_PM, µ_PB]
Some improvements one could apply to refine the controller include, but are not
limited to:
Note the arbitrariness with which the values in the table are created; the rea-
son why table lookups are preferred over the calculation of antecedent rule base
values, is that it speeds up the process relatively well;
• If one wants to achieve a greater precision around the stable zero situation, one
could consider taking fuzzy linguistic variables with a different width (Figure 18).
The same goal is achieved by applying a logarithmic transformation to the dis-
cretized input values. Instead of considering the values
(−1, −0.8, −0.6, −0.4, −0.2, 0, +0.2, +0.4, +0.6, +0.8, +1)
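[Figure 18: membership functions µ_NB, µ_NM, µ_NS, µ_ZE, µ_PS, µ_PM, µ_PB with different widths]
The reverse transformation, which can be used on the output value, is then given
by
\[
g(y) = \begin{cases}
\dfrac{(\alpha + 1)^{y} - 1}{\alpha} & \text{if } y \geq 0 \\
-\dfrac{(\alpha + 1)^{-y} - 1}{\alpha} & \text{if } y \leq 0
\end{cases}
\]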
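The reverse transformation can be tested together with a matching forward transformation. The sketch below is hedged accordingly: `g` follows the formula for the reverse transformation, read with exponent −y in the negative branch, whereas the forward map `f` is an assumption, obtained simply by inverting g on [−1, 1]; the parameter value α = 9 is an arbitrary illustrative choice.

```python
import math

def g(y, alpha):
    """Reverse transformation: maps the transformed value y back to the original scale."""
    if y >= 0:
        return ((alpha + 1) ** y - 1) / alpha
    return -((alpha + 1) ** (-y) - 1) / alpha

def f(x, alpha):
    """Assumed forward (logarithmic) transformation, defined as the inverse of g."""
    magnitude = math.log1p(alpha * abs(x)) / math.log1p(alpha)
    return math.copysign(magnitude, x)

ALPHA = 9.0  # hypothetical value; larger alpha gives more resolution around zero
values = [-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1]
for x in values:
    print(x, round(f(x, ALPHA), 3), round(g(f(x, ALPHA), ALPHA), 3))
```

Under this assumed forward map, equally spaced transformed values correspond to original values that cluster around zero, which is the extra precision around the stable situation that the bullet above aims for.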
The rule base can then be written as statements using the linguistic variables,
which makes them easy to read and interpret. For instance, feasible heuristic rules
would then be
where $E = e_t$ and $\Delta E = e_t - e_{t-1}$, and U is the control output. Any time one of these
rules is used, we say that the rule fires. For instance, if the error is positive medium,
but the change in error is negative medium, this means that the positive error rate
tends to decrease, and therefore it is reasonable to believe that taking no action at
all will stabilize the controller. If this is not the case, another applicable rule will
fire. The main work on design of a fuzzy controller is adjusting the parameters, the
number of rules and the fuzzy rule base in such a manner that the system converges
as quickly as possible to a stable situation (see Section 9). Just as in expert control
systems, this may trigger phenomena such as overshoot and some related problems.
For further information, we refer to works as [63, 84, 145] and [165].
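How a rule fires can be made concrete with a small computation. The sketch below rests on several assumptions that are not fixed by the text: the seven linguistic values are modelled as equally spaced triangular sets on [−1, 1] with half-width 1/3, the connective “and” is evaluated with the minimum, and a rule is said to fire when its degree of fulfilment is positive. The two rules are of the form IF (E is ...) and (∆E is ...) THEN (U is ...); all identifiers are hypothetical.

```python
LABELS = ["NB", "NM", "NS", "ZE", "PS", "PM", "PB"]
# Assumed peaks at -1, -2/3, -1/3, 0, 1/3, 2/3, 1.
CENTERS = {label: (i - 3) / 3 for i, label in enumerate(LABELS)}

def membership(label, x, half_width=1 / 3):
    """Triangular membership of x in the linguistic value `label`."""
    return max(0.0, 1.0 - abs(x - CENTERS[label]) / half_width)

# ((value of E, value of dE), value of U)
RULES = [
    (("PB", "PM"), "NB"),
    (("PM", "NM"), "ZE"),
]

def firing_strengths(e, de, rules=RULES):
    """Degree to which each rule fires, using min for the connective 'and'."""
    return [
        (u, min(membership(a_e, e), membership(a_de, de)))
        for (a_e, a_de), u in rules
    ]

# Error positive medium, change of error negative medium:
print(firing_strengths(2 / 3, -2 / 3))
```

With these inputs only the second rule fires, with consequence ZE, in line with the remark that taking no action is reasonable when a positive error is already decreasing.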
3.1.4 Remark
One final remark about the design of a fuzzy rule base is that instead of a required
correction action U to be taken, it is also possible to define a performance measure P
that indicates how well the controller behaves, e.g. by comparing the output results
to a given desired output. For this issue, see also subsection 8.1. For instance, rules
like
...
n : IF (E is PB) and (∆E is PM) THEN (U is NB)
n + 1 : IF (E is PM) and (∆E is NM) THEN (U is ZE)
...

...
n : IF (E is PB) and (∆E is PM) THEN (P is small)
n + 1 : IF (E is PM) and (∆E is NM) THEN (P is large)
...
The reverse, where the rules that yield a performance measure are translated into
a set of possible correction actions, is also possible, of course, although the (poor)
quality of performance does not indicate in which direction action should be taken.
For now, this however remains a heuristic approach to the design of the fuzzy rule
base.
When designing a fuzzy controller, there are numerous adjustable parameters,
such as the number of controllers, nominal (in the output) and ordinal (in the
input) scaling parameters, different inference methods, which will be discussed
in Section 4, and a suitable choice of defuzzification parameters, as will be
discussed in Section 5. For now, however, we will focus on some important
parameters that are considered in the design of the fuzzy rule base.
Note first of all that if the universe X is, or can be embedded in, a bounded and
closed subset of R, it is always possible to consider a fuzzy controller on the same
base space, by using ordinal scaling parameters. Let X ⊆ [a, b], then consider the
following affine transformation:
it is possible to map any X ⊆ [a, b] into a subset of any other interval [c, d].
Therefore, any set of linguistic variables can be rescaled to the same domain.
This makes it possible, for instance, to use the same fuzzy rule bases on the
domain of possible errors and on that of possible error gains. It may therefore
be sufficient to study the behaviour of fuzzy rule bases on, e.g., [−1, 1] only,
except, of course, in the case where the domain is unbounded.
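As an illustration of such an ordinal rescaling, the following Python sketch
maps [a, b] onto [c, d]. The explicit form used here (the usual linear
interpolation between the end points) and the function name `rescale` are
assumptions made for the example.

```python
def rescale(x, a, b, c, d):
    """Affine map sending the interval [a, b] onto the interval [c, d]."""
    if a == b:
        raise ValueError("the source interval [a, b] must not be degenerate")
    return c + (x - a) * (d - c) / (b - a)


# Example: bring a temperature in [0, 40] to the common domain [-1, 1].
low = rescale(0.0, 0.0, 40.0, -1.0, 1.0)
mid = rescale(20.0, 0.0, 40.0, -1.0, 1.0)
high = rescale(40.0, 0.0, 40.0, -1.0, 1.0)
sample = rescale(27.0, 0.0, 40.0, -1.0, 1.0)
```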
Let us now assume that we only consider triangular membership functions, which
are computationally the simplest objects one can consider. Furthermore, let us
assume that all membership functions are normalized; if this is not the case, one can
also apply a scaling function on the ordinal scale.
For any triangular membership function µ with peak in a ∈ X, we define the left
width as |a − b| where
A fuzzy membership function µ will then be called symmetric if and only if left
width (µ ) and right width (µ ) are equal. Symmetry is necessary to obtain the follow-
ing property: suppose a fuzzy controller consists of only a single rule and a single
input, with a one-term triangular linguistic variable as consequence. Using Mamdani
inference (see Section 4) and Center-of-Gravity defuzzification (see Section 5), one
would expect that if the input equals the peak of the antecedent rule, the defuzzifi-
cation value would also be the peak of the rule consequence. This is however not
true if the latter is not symmetric, see for instance Figure 19.
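The effect can be reproduced numerically. In the Python sketch below, the rule
is assumed to fire with strength 1 (the input equals the antecedent peak), so
that the Mamdani output is the consequence triangle itself; the two triangles
and the midpoint-rule integration are hypothetical choices made for the example.

```python
def triangle(peak, left_width, right_width):
    """Triangular membership function with the given peak and widths."""
    def mu(x):
        if peak - left_width < x <= peak:
            return (x - (peak - left_width)) / left_width
        if peak < x < peak + right_width:
            return (peak + right_width - x) / right_width
        return 0.0
    return mu


def center_of_gravity(mu, lo, hi, steps=20000):
    """Midpoint-rule approximation of (integral of x*mu) / (integral of mu)."""
    h = (hi - lo) / steps
    xs = [lo + (i + 0.5) * h for i in range(steps)]
    ms = [mu(x) for x in xs]
    return sum(x * m for x, m in zip(xs, ms)) / sum(ms)


symmetric = triangle(peak=0.5, left_width=0.25, right_width=0.25)
skewed = triangle(peak=0.5, left_width=0.125, right_width=0.375)

cog_symmetric = center_of_gravity(symmetric, 0.0, 1.0)  # coincides with the peak
cog_skewed = center_of_gravity(skewed, 0.0, 1.0)        # pulled towards the wide side
```

For a triangle, the center of gravity is the mean of the two feet and the peak,
which only equals the peak when both widths agree.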
[Figure: input with peak a on X; defuzzified output DCOG (µ) on Y]
[Fig. 20 Condition width (membership functions µ1, µ2, µ3, µ4 on X)]
of the left membership function and the distance between the two peaks are equal,
we say that the condition width is fulfilled.
D. Driankov, H. Hellendoorn and M. Reinfrank showed in [24] that the condition
width on an antecedent rule base is sufficient for a smooth change
of the control values with respect to a change in the value x ∈ X. Note that the
condition width does not necessarily imply symmetry, as can be seen in Figure 20.
Of course, a combination of symmetry and condition width yields the best results.
Another parametrical concept introduced by D. Driankov, H. Hellendoorn and
M. Reinfrank in [24] is the following.
For any two overlapping triangular membership functions µa and µb with peaks in
a, b ∈ X respectively, we will define the cross point ratio as the number of elements
in the set
{x ∈ X : µa (x) = µb (x)}
The value µa (x) = µb (x) will be called the cross point level.
It is obvious that this set may contain more than one element. If this set is a
singleton however, we define the cross point level as the value µa (x) = µb (x).
Often, it is assumed that the cross point ratio is equal to one and that the
cross point level is 0.5.
Combining the width and crosspoint conditions of course yields the best results
in terms of smoothness. This explains at once why partitions of unity are often used
as fuzzy antecedent rule bases.
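The cross point level of two neighbouring triangles can be computed as in the
following Python sketch. It assumes symmetric triangles, and the peaks, widths
and the bisection routine are hypothetical choices for the example. When the
width equals the distance between the peaks, as in a partition of unity, the
level is 0.5.

```python
def triangle(peak, width):
    """Symmetric triangular membership function; width = peak-to-foot distance."""
    return lambda x: max(0.0, 1.0 - abs(x - peak) / width)


def cross_point(mu_a, mu_b, a, b, iterations=60):
    """Bisection on [a, b] for the point where the falling mu_a meets the rising mu_b."""
    lo, hi = a, b
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mu_a(mid) > mu_b(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


a, b = 0.2, 0.6

# Width equal to the distance between the peaks (condition width fulfilled).
wide_a, wide_b = triangle(a, b - a), triangle(b, b - a)
x_wide = cross_point(wide_a, wide_b, a, b)
level_wide = wide_a(x_wide)

# Narrower triangles: the cross point level drops below 0.5.
narrow_a, narrow_b = triangle(a, 0.3), triangle(b, 0.3)
x_narrow = cross_point(narrow_a, narrow_b, a, b)
level_narrow = narrow_a(x_narrow)
```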
In order to be able to combine several fuzzy sets into statements that can be regarded
as the rules of the fuzzy controller, one needs unary and binary operations
similar to those used in classical logic, in order to produce new statements by
combining one or more “atomic” statements. The five “classical” operations in logic
are: negation, conjunction, disjunction, implication and equivalence. The binary
operations conjunction (“AND”) and disjunction (“OR”) also have to be extendable to
an arbitrary yet finite number of arguments in an associative and commutative way,
i.e. such that the order of the statements and the order in which they are parsed
do not matter. To this end, the following binary operators play an important role
in fuzzy control theory:
4.1.2 Example
4.1.4 Example
Following E.H. Mamdani et al. in [94], given each rule is of the type
\[ k_r(x) := \bigwedge_{i=1}^{n} \alpha_i(x_i) \]
A list of such fuzzy variables and their operators will be called a fuzzy rule base.
We will now illustrate the need to consider various conjunction and disjunction
operators. Although the conjunction and disjunction are denoted as ∧ and ∨
respectively, and although the minimum and maximum operator could fulfill the
necessary conditions, there are a lot more choices possible, and equally so for
the implication. A major reason, for instance, to choose the product as a
conjunction over the minimum is that, given an adaptive control situation in
which we will try to improve the worst performing rule, the minimum will only
select the worst condition for each of the rules separately as a criterion for
selection, while the product is a conjunction of all the conditions in the same
rule. If for instance three conditions µ1 , µ2 , µ3 have the values
(0.8, 0.9, 0.1) for a first input value and (0.3, 0.4, 0.2) for a second input
value, the minimum will regard the first one as the worst (0.1 against 0.2),
while the product will regard the second one as three times worse (0.024
against 0.072).
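The comparison can be reproduced with a few lines of Python; the variable names
are chosen for the example only.

```python
from math import prod

first = (0.8, 0.9, 0.1)    # condition values for the first input value
second = (0.3, 0.4, 0.2)   # condition values for the second input value

by_min = {"first": min(first), "second": min(second)}
by_product = {"first": prod(first), "second": prod(second)}

# The "worst" input is the one with the smallest conjunction value.
worst_by_min = min(by_min, key=by_min.get)
worst_by_product = min(by_product, key=by_product.get)

# Factor by which the product ranks the second input below the first one.
ratio = by_product["first"] / by_product["second"]
```

The two conjunctions thus disagree on which input performs worst, which is the
reason for considering more than one operator.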
In this section, we are going to give an overview of the different properties that
conjunction, disjunction and implication should fulfill under ideal circumstances,
as well as a list of commonly used operators. The quality of the choice of logical
operators can then be derived from the number of properties that are fulfilled.
For our purposes, let us call the operators ∧, ∨ and ⇒ respectively a conjunction,
a disjunction and an implication. All three operators then should be considered as
pointwise extensions of the similar maps on I:
\[ * : F(X) \times F(X) \longrightarrow F(X), \qquad (\mu, \nu) \mapsto \mu * \nu, \]
\[ \mu * \nu : X \longrightarrow I, \qquad x \mapsto \mu(x) * \nu(x) \]
with ∗ ∈ {∧, ∨, ⇒}. It therefore is sufficient to study the behavior of the operators
∧, ∨ and ⇒ on I only. Therefore, it is also natural to assume that a (pointwise)
pseudocomplementation
\[ \sim : I \longrightarrow I, \qquad x \mapsto 1 - x \]
exists, which can be used to formulate the different logical axioms (see also [21]).
The first property that these three operators should fulfill (although they do not
always do!) is that they should be an extension of the classical two-valued logical
operators. We formulate this property as follows:
4.2.3 Claim I
Let us now consider the properties that any conjunction ∧ and any disjunction ∨
should fulfill. There is general agreement that the four most important conditions
they should satisfy are the following:
4.3.1 Claim II
∀a, b, c, d ∈ I : a ≤ b and c ≤ d ⇒ a ∧ c ≤ b ∧ d
and
∀a, b, c, d ∈ I : a ≤ b and c ≤ d ⇒ a ∨ c ≤ b ∨ d
∀a, b ∈ I : a ∧ b = b ∧ a
and
∀a, b ∈ I : a ∨ b = b ∨ a
4. 1 should be the neutral element for ∧ and 0 should be the neutral element for ∨:
∀a ∈ I : a ∧ 1 = 1 ∧ a = a
and
∀a ∈ I : a ∨ 0 = 0 ∨ a = a
Considering 4.1, one immediately finds that all the suitable conjunctions and
disjunctions respectively to consider are, by definition, the t–norms and t–conorms.
These are each other’s logical dual, as shown in [2], in the following sense:
4.3.2 Proposition
In case f is strictly decreasing, we change all min to max and vice versa in the
definition above.
The additive generators generate a lot of commonly used t–norms and t–conorms.
For more information, we refer to [87].
We will now give an overview of the most commonly used conjunction and dis-
junction operators ([10] and [101]). Some parametric families of such operators can
be found in detail in [91, 101, 157] and [168]. In what follows, let a, b ∈ I.
4.3.6 Examples
and
\[
s_W(a, b) =
\begin{cases}
\max(a, b) & \text{if } \min(a, b) = 0 \\
1 & \text{otherwise}
\end{cases}
\]
4.3.7 Proposition
\[ t_W \le T \le t_\infty \]
\[ s_\infty \le S \le s_W \]
holds.
2. $t_1 \le t_{1\frac{1}{2}} \le t_2 \le t_{2\frac{1}{2}}$ and $s_{2\frac{1}{2}} \le s_2 \le s_{1\frac{1}{2}} \le s_1$
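Such orderings can be spot-checked on a grid. The Python sketch below is
illustrative only: it assumes that t∞ denotes the minimum, that duality is
taken with respect to the complement x ↦ 1 − x, and it uses three other
well-known t–norms (drastic, Łukasiewicz and product) as examples. A grid check
can refute an inequality, but does not prove it.

```python
from itertools import product


def t_drastic(a, b):       # t_W, the dual of s_W
    return min(a, b) if max(a, b) == 1.0 else 0.0


def t_lukasiewicz(a, b):   # bounded difference
    return max(0.0, a + b - 1.0)


def t_product(a, b):
    return a * b


def t_min(a, b):           # assumed to be the t-norm written t_infinity
    return min(a, b)


def dual(t):
    """t-conorm obtained from a t-norm through the complement x -> 1 - x."""
    return lambda a, b: 1.0 - t(1.0 - a, 1.0 - b)


def s_drastic(a, b):       # s_W: max(a, b) if min(a, b) = 0, else 1
    return max(a, b) if min(a, b) == 0.0 else 1.0


GRID = [k / 10 for k in range(11)]
PAIRS = list(product(GRID, repeat=2))
EPS = 1e-12  # tolerance for floating-point rounding

chain = [t_drastic, t_lukasiewicz, t_product, t_min]
t_ordered = all(lo(a, b) <= hi(a, b) + EPS
                for lo, hi in zip(chain, chain[1:]) for a, b in PAIRS)
s_ordered = all(dual(hi)(a, b) <= dual(lo)(a, b) + EPS
                for lo, hi in zip(chain, chain[1:]) for a, b in PAIRS)
dual_matches = all(abs(dual(t_drastic)(a, b) - s_drastic(a, b)) <= EPS
                   for a, b in PAIRS)
```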
The choice of a suitable implication operator is not as well described as was the
case for conjunction and disjunction. In fact, so many different implication
operators can be considered that it is virtually impossible to list them all. Some
important classes however are described by D. Dubois et al. in [33] and [34] and by
D. Ruan et al. in [127]. Unlike for the conjunction and disjunction properties,
there is no agreement on which properties of the two–valued implication should
be extended to the fuzzy case. An example of such an axiom system
is that of Smets and Magrez in [139], which fundamentally assumes that the truth
value of an implication of two statements depends only on the truth values of
the separate statements, which is a reasonable assumption.
The following properties may or may not be desirable when constructing
an implication operator:
1. Contrapositive symmetry:
2. Exchange principle:
3. Monotony:
[Fig. 22: µ in and the outputs µ out under the Gödel, Mamdani and product implications]
4. Boundary condition:
∀a, b ∈ I : if a ≤ b, then (a ⇒ b) = 1
5. Neutrality principle:
∀b ∈ I : (1 ⇒ b) = b
6. Continuity:
x ⇒ y is continuous in its arguments
4.4.2 Examples
This operator is derived from the fact that in two-valued logic, a ⇒ b is equiva-
lent to (a ∧ b) ∨ (∼ a), using the minimum as conjunction and the maximum as
disjunction.
2. Łukasiewicz’ implication operator is defined as
\[ a \Rightarrow_{LUC} b = \min(1, 1 - a + b) \]
This operator is derived from the fact that in two-valued logic, a ⇒ b is equivalent
to (∼ a) ∨ b, using the bounded sum as disjunction.
3. Mamdani’s implication operator is defined as
\[ a \Rightarrow_{MAM} b = \min(a, b) \]
This operator is to be interpreted more as a fuzzy (and in this case, symmetric)
relation than as a fuzzy logical implication.
4. Gödel’s implication operator is defined as
\[
a \Rightarrow_{GOD} b =
\begin{cases}
1 & \text{if } b \ge a \\
b & \text{if } b < a
\end{cases}
\]
This operator is derived from the fact that in two-valued logic, a ⇒ b is equivalent
to (∼ a) ∨ b, using the maximum as disjunction.
6. Gaines’ implication operator is defined as
\[
a \Rightarrow_{GAI} b =
\begin{cases}
1 & \text{if } a \le b \\
\dfrac{b}{a} & \text{if } a > b
\end{cases}
\]
\[ a \Rightarrow_{PRD} b = a * b \]
The difference between some of the implications can be seen in Figure 22. We
show the Mamdani implication next to the Gödel implication, being the pointwise
“largest” implication possible, and the product implication.
4.4.3 Properties
                         ⇒ZAD  ⇒LUC  ⇒MAM  ⇒GOD  ⇒KLE  ⇒GAI  ⇒YAG  ⇒PRD
contrapositive symmetry  No    Yes   No    No    Yes   No    No    No
exchange principle       No    Yes   Yes   Yes   Yes   No    Yes   Yes
monotony                 No    Yes   No    Yes   Yes   Yes   Yes   No
boundary condition       No    Yes   No    Yes   No    Yes   No    No
neutrality principle     Yes   Yes   Yes   Yes   Yes   Yes   Yes   Yes
continuity               Yes   Yes   Yes   No    Yes   No    No    Yes
One could then state that, for instance, the Łukasiewicz implication is better
than the Mamdani implication, because it satisfies more of the axioms. This is,
however, a very heuristic approach to the choice of a suitable implication
operator.
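Some of these properties can be spot-checked by brute force. The Python sketch
below is a hedged illustration: it assumes that the operator built from
(a ∧ b) ∨ (∼ a) with min and max is the one abbreviated ZAD, that the one built
from (∼ a) ∨ b with max is the one abbreviated KLE, and that contrapositive
symmetry means (a ⇒ b) = (∼ b ⇒ ∼ a). Exact rational arithmetic avoids rounding
issues; a finite grid can refute a property, but a positive result is only an
indication.

```python
from fractions import Fraction
from itertools import product

ONE = Fraction(1)

IMPLICATIONS = {
    "ZAD": lambda a, b: max(min(a, b), 1 - a),   # assumed label
    "LUC": lambda a, b: min(ONE, 1 - a + b),
    "MAM": lambda a, b: min(a, b),
    "GOD": lambda a, b: ONE if b >= a else b,
    "KLE": lambda a, b: max(1 - a, b),           # assumed label
    "GAI": lambda a, b: ONE if a <= b else b / a,
    "PRD": lambda a, b: a * b,
}

GRID = [Fraction(k, 10) for k in range(11)]
PAIRS = list(product(GRID, repeat=2))


def contrapositive_symmetry(imp):
    return all(imp(a, b) == imp(1 - b, 1 - a) for a, b in PAIRS)


def boundary_condition(imp):
    return all(imp(a, b) == 1 for a, b in PAIRS if a <= b)


def neutrality_principle(imp):
    return all(imp(ONE, b) == b for b in GRID)


report = {
    name: {
        "contrapositive symmetry": contrapositive_symmetry(imp),
        "boundary condition": boundary_condition(imp),
        "neutrality principle": neutrality_principle(imp),
    }
    for name, imp in IMPLICATIONS.items()
}
```

On this grid, the three checked rows agree with the corresponding columns of the
table above.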
4.4.4 Remarks
The result of the application of the implications to the different rules in the
antecedent rule base then yields a set of fuzzy consequence rules, which still
requires the application of yet another aggregation operator, which combines the
results of the individually fired rules into one resulting fuzzy set again. If we
used the Mamdani implication, this aggregation is the t–conorm max; if we used
the Gödel implication, it is more logical to use the t–norm min. When we look at
the graphs of Figure 22, in the Mamdani case, we take the union of the fuzzy
graphs, while in the Gödel case, we take the intersection of the graphs.
Generally though, it is possible to take as an intersection operator any t–norm
and as a union operator any t–conorm.
Another difference that is important when applying the rules of inference, is the
following: we could first combine all the rules, and then fire them through a com-
position operation. We call this procedure a composition-based inference. On the
other hand, it is also possible to fire each rule individually, and then combine all of
the resulting fuzzy outputs into one fuzzy set. This procedure is called individual
rule-based inference. It is easy to see that in the case of a Mamdani implication,
these two concepts are equivalent, while in the case of a Gödel implication, they are
not (see [24]). A detailed study of the difference between the Mamdani and Gödel
approaches can be found in [35].
Often, to diminish the computational work, the aggregation and implication
are performed together in one operator, that we call a generalized aggregation
operator. One should heed though that this may give rise to unexpected problems,
of which the most important is that these operators do not necessarily
commute.
5 Defuzzification Operators
Defuzzification is a necessary tool to make a fuzzy control system interact with
real-world models. In its strictest sense, this is contradictory to the idea of
fuzzification, which extends the notion of crisp sets with a degree of
uncertainty. Nevertheless, defuzzification is unavoidable when a crisp output is
desired, as is the case in many practical applications. A defuzzification can be
seen as an operator
D : F (X) −→ X
assigning to each fuzzy set µ ∈ F(X) a crisp value D(µ ) ∈ X. In most cases, µ will
be the result of an aggregation process on some fuzzy rule base, with the resulting
fuzzy output looking like the one in Figure 23.
The goal is then to make D(µ ) act as an element of X which approaches the
semantic essence of the fuzzy set µ as well as possible.
5.1 Criteria
For an arbitrary universe X, the defuzzification value should be unique, and therefore
no longer dependent on any stochastic process. Stated differently, the output of
the defuzzification process should be unique for every choice of the fuzzy set µ ∈
F(X).
∀µ ∈ F(X), ∃!x ∈ X : D(µ ) = x
A defuzzifier D that satisfies this property will be called unique.
For an arbitrary universe X, the defuzzification value should be among those
elements of X at which µ ∈ F(X) attains its maximal membership. Stated
differently, the defuzzification value should be in the core of the fuzzy set ([129]).
\[ a\mu + b : X \longrightarrow I, \qquad x \mapsto a\mu(x) + b \]
(of course on condition that these operations are well defined). Then the defuzzifi-
cation value should not be changed, or, in other words,
For an ordered universe (X, ≤), the defuzzification should respect the order of
(X, ≤). For all µ ∈ F(X), if ν ∈ F(X) is such that µ (D(µ )) = ν (D(µ )) and
furthermore
then D(ν ) ≥ D(µ ) and vice versa. A defuzzifier D that satisfies this property will
be called monotonous. ([128], Figure 24).
This means that the defuzzification value operator D on F(X) will increase in
value when evaluated on a fuzzy set ν for which the membership values with respect
to a given fuzzy set µ are higher on one side of the defuzzification value and lower
on the other side.
For an ordered universe (X, ≤), given a conorm S : I × I −→ I, for all µ , ν ∈ F(X)
such that D(µ ) ≤ D(ν ), define
\[ \mu \vee_S \nu : X \longrightarrow I, \qquad x \mapsto S(\mu(x), \nu(x)) \]
[Fig. 24 Monotony (fuzzy sets µ and ν with D(µ) and D(ν) marked on X)]
section. However, in doing so, one has to assume that X carries some topology to
describe the distance between two fuzzy sets. So let us now assume that (X, T ) is
a topological space, then it will be possible to state criteria that make use of the
topological structure of X. We have to make the following distinction:
The strong continuity criterion obviously implies the weak one. Yet, just as was
the case with the monotony criterion, one could extend this criterion to hold for other
topological structures on F(X), perhaps even all topological structures satisfying a
certain set of properties at once, but again, this condition may be just too strict.
Suppose that there exist an addition and a scalar multiplication on the
topological space (X, T ) which are continuous with respect to T . In that case,
we can endow X with the structure of a topological vector space, and another
criterion can be stated:
For a topological vector space universe (X, T , +, ·), any positive affine transforma-
tion on the universe X should induce the inverse affine transformation on the de-
fuzzification value. Stated differently, for all µ ∈ F(X), for all a ∈ R0 and b ∈ R,
define
\[ \mu^{a,b} : X \longrightarrow I, \qquad x \mapsto \mu\!\left(\frac{x - b}{a}\right) \]
(of course again on condition that this is well defined). Then the defuzzification
value should be
D(µ a,b ) = aD(µ ) + b.
A defuzzifier D that satisfies this property will be called universe scale-invariant.
([128])
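Universe scale-invariance can be checked numerically for a given defuzzifier.
The Python sketch below does so for the Center-of-Gravity defuzzification (see
Section 5) on X = R; the skewed triangle, the values of a and b and the
midpoint-rule integration are hypothetical choices, and the transformed set is
taken to be x ↦ µ((x − b)/a) with a > 0.

```python
def cog(mu, lo, hi, steps=20000):
    """Midpoint-rule approximation of (integral of x*mu) / (integral of mu)."""
    h = (hi - lo) / steps
    xs = [lo + (i + 0.5) * h for i in range(steps)]
    ms = [mu(x) for x in xs]
    return sum(x * m for x, m in zip(xs, ms)) / sum(ms)


def mu(x):
    """Hypothetical skewed triangle with feet 0.375 and 0.875 and peak 0.5."""
    if 0.375 < x <= 0.5:
        return (x - 0.375) / 0.125
    if 0.5 < x < 0.875:
        return (0.875 - x) / 0.375
    return 0.0


a, b = 4.0, -2.0


def mu_ab(x):
    """The fuzzy set mu transported along the affine map x -> a*x + b."""
    return mu((x - b) / a)


lhs = cog(mu_ab, a * 0.0 + b, a * 1.0 + b)  # D(mu^{a,b}) on the image of [0, 1]
rhs = a * cog(mu, 0.0, 1.0) + b             # a * D(mu) + b
```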
In the particular case where X = R, being a topological vector space as well as
an ordered lattice, of course all of these criteria apply at once. Also, in that case
we are considering fuzzy real numbers, which means that all theory developed for
the treatment of the fuzzy real line can be used. We have included in the bibliogra-
phy a number of references dealing with the implementation of a structure on the
fuzzy real line; especially the work of D. Dubois and H. Prade ([25, 32]), S. Gähler
and W. Gähler ([41]), R. Goetschel and W. Voxman ([43]), R. Lowen ([88–90]),
M. Mizumoto and J. Tanaka ([104]) is interesting in this context.
In [85] and [86] we have been working on two new criteria, only applicable on
compact subsets X ⊆ R or subsets thereof, which nevertheless seem to be important.
Suppose a controller is given by a rule base consisting of a finite number of fuzzy
variables A = {α1 , ..., αn } ⊆ F(X).
A function f : X −→ X will be called the control function. For every i ∈ {1, ..., n}, de-
fine βi := f(αi ), being the image of the fuzzy set as defined in Definition 1.5.8. Given
that the collection A = {α1 , ..., αn } covers X, then so does f(A) = {β1 , ..., βn }.
The reason why the cartesian product is often taken as implication inference is the
following: when the fuzzy variables in the antecedent rule base overlap, this yields
a certain degree of uncertainty, which increases with the length of the overlap,
as can be seen in Figure 25. In the product space, this means that the graph of the
control function f is to be found within a certain region of uncertainty.
Now one logical criterion that should hold is that, if f is the identity function, and
for any x ∈ X, given that µ (x) is the aggregation of the antecedent rule base with
x as input value, this µ (x) again defuzzifies to its original value x. In other words,
D ◦ µ = idX . However, it turns out that even for the simplest control functions,
this is not necessarily true. While this is understandable in the case
of a discontinuous defuzzifier such as Mean Of Maxima, it is surprising to see that
a continuous defuzzifier such as Center Of Gravity does not satisfy this property
either. Therefore, in [85] and [86], we stated two new criteria that a defuzzifier may
or may not satisfy.
[Figure: antecedent sets a1, a2, a3 on X against their images b1, b2, b3 on X]
In the following criteria, put f = id and ∀i ∈ {1, ..., n} : βi := αi :
For a universe X ⊆ R compact, let A = {α1 , ..., αn } be an antecedent rule base that
covers X. Furthermore, let µ ∈ F(X) be the fuzzy set resulting from aggregation and
implication. A defuzzifier D will be called consistent if and only if for all x ∈ X,
D(µ (x)) = x(= id(x)).
One will rarely encounter a defuzzification operator that is consistent. Mostly,
our goal is to find an upper bound for the supremum distance ‖D ◦ µ − f‖∞ ≤ l(n),
where n is the number of defuzzifiers.
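The lack of consistency of the Center-of-Gravity defuzzification can be observed
on a small example. The Python sketch below is an illustration under stated
assumptions: the antecedent rule base is a triangular partition of unity on
[0, 1] with three sets, f is the identity (so βi = αi), each rule is clipped by
the minimum and the results are aggregated by the maximum, following Mamdani.
The function names are chosen for the example.

```python
def partition(n):
    """Triangular partition of unity on [0, 1] with peaks k/n, k = 0..n."""
    return [lambda x, k=k: max(0.0, 1.0 - abs(n * x - k)) for k in range(n + 1)]


def mamdani_cog(x, sets, steps=4000):
    """Clip every set at its firing level, aggregate by max, defuzzify by COG."""
    levels = [alpha(x) for alpha in sets]
    h = 1.0 / steps
    ys = [(i + 0.5) * h for i in range(steps)]
    out = [max(min(alpha(y), lvl) for alpha, lvl in zip(sets, levels)) for y in ys]
    return sum(y * m for y, m in zip(ys, out)) / sum(out)


sets = partition(2)                  # peaks at 0, 0.5 and 1
at_peak = mamdani_cog(0.5, sets)     # input on an interior peak
off_peak = mamdani_cog(0.1, sets)    # input between two peaks
```

Only the input on the interior peak is reproduced; for the other input, the
defuzzified value lies well away from the input.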
The more one increases the number of controllers and restricts the area of
overlap, the more confident one can become that the defuzzified function is
indeed the identity, but this is far from certain. Therefore, we will weaken the
criterion as follows:
antecedent rule base that covers X, with µn ∈ F(X) the fuzzy set resulting from
aggregation and implication. Furthermore, we demand that
\[ \lim_{n \to \infty} \max_{i=1}^{N(n)} \mathrm{width}(\alpha_i^n) = 0 \]
These two criteria mean that given an antecedent rule base A = {α1 , ..., αn }, the
difference between the output and the image through f tends to zero when increasing
the number of rules in the base. One can extend these criteria to hold for larger
collections of antecedent rule bases, all those which cover X, all partitions of unity,
or perhaps even all of them, and for other larger collections of test functions, but
one may expect that these criteria will become too strict again.
5.1.12 Corollary
Due to the scaling arguments 5.1.3 and 5.1.8, one may assume that the universe
X = [0, 1]. An often used standard rule base is the following collection of partitions
of unity:
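\[
\alpha_1(x) =
\begin{cases}
-xn + 1 & \text{if } x \in \left[0, \frac{1}{n}\right] \\
0 & \text{otherwise}
\end{cases}
\]
\[
\alpha_k(x) =
\begin{cases}
xn + 2 - k & \text{if } x \in \left[\frac{k-2}{n}, \frac{k-1}{n}\right] \\
-xn + k & \text{if } x \in \left[\frac{k-1}{n}, \frac{k}{n}\right] \\
0 & \text{otherwise}
\end{cases}
\]
\[
\alpha_{n+1}(x) =
\begin{cases}
xn - n + 1 & \text{if } x \in \left[\frac{n-1}{n}, 1\right] \\
0 & \text{otherwise}
\end{cases}
\]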
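A direct transcription of this rule base into Python may look as follows. The
sketch treats α1 and αn+1 as the boundary cases of the general formula; the
function names are chosen for the example.

```python
def alpha(k, n, x):
    """Membership of x in [0, 1] in the k-th set (k = 1..n+1) of the standard rule base."""
    if k >= 2 and (k - 2) / n <= x <= (k - 1) / n:   # rising edge (absent for k = 1)
        return x * n + 2 - k
    if k <= n and (k - 1) / n <= x <= k / n:         # falling edge (absent for k = n + 1)
        return -x * n + k
    return 0.0


def total_membership(n, x):
    """Sum of all n + 1 memberships; equals 1 for a partition of unity."""
    return sum(alpha(k, n, x) for k in range(1, n + 2))


peak_value = alpha(3, 4, 0.5)     # the third set of five peaks at (3 - 1)/4
half_value = alpha(3, 4, 0.375)   # halfway up its rising edge
```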
The most crucial step in the construction of a fuzzy controller, however, is the de-
fuzzification method. In physical applications, at one stage in the adaptive process,
a decision has to be taken as how to adjust the system, thereby needing one out-
put variable. Several defuzzification techniques have been studied extensively, and
for a good overview we refer to the articles of T.A. Runkler et al. [129] and
W. Van Leekwijck et al. [148]. We will now give an overview of the different possi-
ble defuzzification operators
D· : F(X) −→ X,
together with a list of criteria they either do or do not fulfill. This list is by no
means meant to be exhaustive, but rather meant as an overview of the most important
possibilities. A defuzzifier that satisfies all criteria does not exist. We assume that
any element µ ∈ F(X) is the result of an aggregation and implication of a certain
fuzzy rule base A = {α1 , ..., αn } with a given input value x ∈ X.
\[ P(x) = \frac{\lambda(\{x\})}{\lambda(\{\mathrm{core}(\mu)\})}, \]
1. The first of maxima defuzzification DFOM (Figure 26) is a function that maps
µ ∈ F(X) to
\[ D_{FOM}(\mu) = \inf \left\{ y \in X : \mu(y) = \sup_{z \in X} \mu(z) \right\} \]
2. The last of maxima defuzzification DLOM (Figure 26) is a function that maps
µ ∈ F(X) to
\[ D_{LOM}(\mu) = \sup \left\{ y \in X : \mu(y) = \sup_{z \in X} \mu(z) \right\} \]
3. The middle of maxima defuzzification DMOM (Figure 26) is a function that maps
µ ∈ F(X) to
\[ D_{MOM}(\mu) = \frac{D_{LOM}(\mu) + D_{FOM}(\mu)}{2} \]
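On a discretized universe, these three defuzzifiers reduce to a search for the
sample points with maximal membership. The following Python sketch assumes a
finite grid and a small tolerance for comparing membership values; both, as well
as the clipped triangle used as input, are hypothetical choices.

```python
def maxima_defuzzifiers(xs, mus, tol=1e-9):
    """Return (FOM, LOM, MOM) for memberships mus sampled at the points xs."""
    top = max(mus)
    core = [x for x, m in zip(xs, mus) if top - m <= tol]
    first, last = min(core), max(core)
    return first, last, (first + last) / 2


# Hypothetical output set: a triangle with peak 0.5 clipped at level 0.6,
# so that the maximum is attained on the plateau [0.3, 0.7].
xs = [i / 100 for i in range(101)]
mus = [min(max(0.0, 1.0 - abs(x - 0.5) / 0.5), 0.6) for x in xs]

fom, lom, mom = maxima_defuzzifiers(xs, mus)
```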
[Fig. 27 MOS-defuzzification]
[Fig. 28 Counterexample D]
where anyone would agree that the main mass is located on the right side of the
defuzzification value. While core defuzzification criteria are computationally much
simpler, the supplementary cost of a calculation that takes into account the
whole fuzzy set is generally acceptable. The defuzzifications that make use of
such a total consideration will be called centroid defuzzifications.
The following criterion is only useful in the case the rules $\{\alpha_i\}_{i=1}^{n}$ are functions
X ⊆ R compact −→ [0, 1].
[Figure: DCOG (µ) and DCOS (µ) marked on X]
5.2.6 Proposition
The parameter Γ is hence a measure of confidence: the higher Γ, the more one is
convinced that the mean of the core is a good defuzzification value, meaning that as
a distribution at least the core of µ is more or less symmetric.
Another centroid defuzzification method was stated by R. Jager in [65], by omit-
ting all values of µ that lie below a certain threshold value α ∈ [0, 1], and subse-
quently taking the Center-of-Gravity defuzzification.
For a universe X ⊆ R compact, for any α ∈ [0, 1], the Indexed Center-of-Gravity
defuzzification DICOG (Figure 30) is a function that maps µ ∈ F(X) to
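\[
D_{ICOG}(\mu, \alpha) =
\frac{\int_{\Gamma_\alpha(\mu)} x \mu(x)\,dx}{\int_{\Gamma_\alpha(\mu)} \mu(x)\,dx}
= D_{COG}(\mu_\alpha^*)
\]
where
\[
\mu_\alpha^*(x) =
\begin{cases}
\mu(x) & \text{if } \mu(x) \ge \alpha \\
0 & \text{if } \mu(x) < \alpha
\end{cases}
\]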
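A discretized version of this defuzzifier is sketched below in Python. The set
Γα(µ) is taken to be {x : µ(x) ≥ α}, in agreement with the definition of µ*α;
the sample fuzzy set (a tall triangle with an added low plateau) and the grid
are hypothetical.

```python
def icog(xs, mus, alpha):
    """Indexed Center-of-Gravity on sample points xs with memberships mus."""
    kept = [(x, m) for x, m in zip(xs, mus) if m >= alpha]
    return sum(x * m for x, m in kept) / sum(m for _, m in kept)


def mu(x):
    """Hypothetical fuzzy output: a tall triangle around 0.3 plus a low plateau."""
    tall = max(0.0, 1.0 - abs(x - 0.3) / 0.1)
    plateau = 0.2 if 0.6 <= x <= 1.0 else 0.0
    return max(tall, plateau)


steps = 1000
xs = [(i + 0.5) / steps for i in range(steps)]   # midpoints of a uniform grid on [0, 1]
mus = [mu(x) for x in xs]

plain = icog(xs, mus, 0.0)     # alpha = 0 keeps everything: plain Center of Gravity
indexed = icog(xs, mus, 0.5)   # alpha = 0.5 discards the plateau
```

With the threshold, the low plateau no longer pulls the defuzzification value
away from the dominant triangle.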
[Fig. 30 ICOG-defuzzification]
[Fig. 31 SLIDE-defuzzification]
5.2.8 Proposition
For a universe X ⊆ R compact, for any α , β ∈ [0, 1], the SemiLinear Defuzzification
DSLIDE (Figure 31) (see [160]) is a function that maps µ ∈ F(X) to
\[
D_{SLIDE}(\mu, \alpha, \beta) =
\frac{(1 - \beta) \int_{(\Gamma_\alpha(\mu))^C} x \mu(x)\,dx + \int_{\Gamma_\alpha(\mu)} x \mu(x)\,dx}
     {(1 - \beta) \int_{(\Gamma_\alpha(\mu))^C} \mu(x)\,dx + \int_{\Gamma_\alpha(\mu)} \mu(x)\,dx}
\]
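The role of β can be seen from a discretized version, sketched below in Python.
As before, Γα(µ) is taken to be {x : µ(x) ≥ α}, and the sample fuzzy set and the
grid are hypothetical. For β = 0 all points receive the same weight, and for
β = 1 the points outside Γα(µ) are discarded altogether.

```python
def slide(xs, mus, alpha, beta):
    """Discretized SLIDE defuzzification: points below alpha get weight 1 - beta."""
    num = den = 0.0
    for x, m in zip(xs, mus):
        weight = 1.0 if m >= alpha else 1.0 - beta
        num += weight * x * m
        den += weight * m
    return num / den


def mu(x):
    """Hypothetical fuzzy output: a tall triangle around 0.3 plus a low plateau."""
    tall = max(0.0, 1.0 - abs(x - 0.3) / 0.1)
    plateau = 0.2 if 0.6 <= x <= 1.0 else 0.0
    return max(tall, plateau)


steps = 1000
xs = [(i + 0.5) / steps for i in range(steps)]   # midpoints of a uniform grid on [0, 1]
mus = [mu(x) for x in xs]

as_cog = slide(xs, mus, alpha=0.5, beta=0.0)     # same as Center of Gravity
halfway = slide(xs, mus, alpha=0.5, beta=0.5)
as_icog = slide(xs, mus, alpha=0.5, beta=1.0)    # same as Indexed Center of Gravity
```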
5.2.10 Proposition
6 An Extended Example
Following the outline proposed in [168], Chapter 11, we will now give an extended
example of a fuzzy controller that is used to steer an automated heating system.
Let t ∈ T = [0, 40] represent the current temperature in a room, then we define five
linguistic variables on the space T , which denote the degree to which this is
“freezing”, “cold”, “average”, “warm” or “hot”. We could for instance take the
following antecedent rule base, which is a partition of unity (Figure 32):
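\[ \mu_{freezing}(t) = \left(1 - \frac{t}{10}\right) \vee 0 \]
\[ \mu_{cold}(t) = \left(1 - \left|1 - \frac{t}{10}\right|\right) \vee 0 \]
\[ \mu_{average}(t) = \left(1 - \left|1 - \frac{t - 10}{10}\right|\right) \vee 0 \]
\[ \mu_{warm}(t) = \left(1 - \left|1 - \frac{t - 20}{10}\right|\right) \vee 0 \]
\[ \mu_{hot}(t) = \frac{t - 30}{10} \vee 0 \]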
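These five membership functions translate directly into Python. The sketch
below also checks that they form a partition of unity on [0, 40]; the function
names are chosen for the example.

```python
def mu_freezing(t):
    return max(1 - t / 10, 0.0)


def mu_cold(t):
    return max(1 - abs(1 - t / 10), 0.0)


def mu_average(t):
    return max(1 - abs(1 - (t - 10) / 10), 0.0)


def mu_warm(t):
    return max(1 - abs(1 - (t - 20) / 10), 0.0)


def mu_hot(t):
    return max((t - 30) / 10, 0.0)


TEMPERATURE_SETS = (mu_freezing, mu_cold, mu_average, mu_warm, mu_hot)


def total_membership(t):
    """Sum of the five memberships for a temperature t in [0, 40]."""
    return sum(mu(t) for mu in TEMPERATURE_SETS)


grades_at_27 = [mu(27.0) for mu in TEMPERATURE_SETS]
```

Note that the formulas are only meant for t ∈ [0, 40]; outside this interval the
outer two functions exceed 1, which is one reason for clipping the input.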
Apart from that, we must make sure that the temperature never exceeds the
boundary values of [0, 40]. This can be done by applying a simple clipping of the
value t to (0 ∨ t) ∧ 40. Suppose now that we also know the value ∆t ∈ [−1, 1] denoting
the recent change of temperature, which can be “cooling fast”, “cooling”, “staying
the same”, “warming” or “warming fast”. Such a value ∆t can be obtained for
example by evaluating the temperature at two consecutive measurement points in time
and clipping the difference to a certain minimum and maximum. In our example,
∆t(n) := (−1 ∨ (t(n) − t(n − 1))) ∧ 1,
which is reasonable on condition that the change in temperature between two
subsequent measurement points in time only exceptionally exceeds the threshold.
If this is not the case, a higher sampling frequency may be required. Therefore it
is feasible to propose the following antecedent rule base, which also is a partition of
unity (and, in fact, the same as for t up to a scaling factor) (Figure 33):
Finally, a third rule base will serve as the consequence. The action to be taken
will either be to “decrease” the power of the heating system, to “take no action” or
to “increase” its power. For simplicity, we will take the power p ∈ [−1, 1]
as well, which can simply be adjusted by any desired scale factor. Let us consider
the following rule base (Figure 34):
Secondly, we will establish a (heuristic) rule base. A suitable rule for instance
would be
However, instead of writing out all the rules, it is much easier to consider the fol-
lowing table:
t/∆t   cf   c    sts  w    wf
f      i    i    i    i    na
c      i    i    i    na   na
a      i    na   na   na   d
w      na   na   d    d    d
h      na   d    d    d    d
This table is complete, in the sense that any input values (t, ∆t) in the given
intervals trigger at least one consequence rule. As a rule of inference, we will use the
minimum operator, which we will also use as a rule of consequence, following
the approach of Mamdani. Suppose a measurement is performed, and we find that
t = 27 and ∆t = −0.4. Then µaverage (t) and µwarm (t) are nonzero with respect to
t, and µcooling (∆t) and µstaying the same (∆t) are nonzero with respect to ∆t. From the
table, the rules marked with an asterisk therefore fire:
t/∆t   cf   c     sts   w    wf
f      i    i     i     i    na
c      i    i     i     na   na
a      i    *na*  *na*  na   d
w      na   *na*  *d*   d    d
h      na   d     d     d    d
The grades of membership are respectively µaverage (27) = 0.3, µwarm (27) = 0.7,
µcooling (−0.4) = 0.8 and µstaying the same (−0.4) = 0.2. The four antecedents
therefore are aggregated by means of the minimum operator:
t/∆t      c (0.8)   sts (0.2)
a (0.3)   0.3       0.2
w (0.7)   0.7       0.2
The consequence rules that fire are
min{νno action , 0.3}, min{νno action , 0.2}, min{νno action , 0.7} and min{νdecrease , 0.2}
respectively. Considering the maximum over these four clipped fuzzy sets, we obtain
the consequence function shown in Figure 35.
[Fig. 35 Consequence (νdecrease , νno action , νincrease on P = [−1, 1])]
Since the temperature is too warm but the temperature has a negative gradient, the
fuzzy control system will advise the heating system to diminish its power, but only
slightly, in order to prevent overshoot.
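The whole computation can be retraced with the following Python sketch. Several
ingredients are assumptions made for the example: the sets for ∆t are the
temperature sets rescaled to [−1, 1], the consequence sets are triangles with
peaks −1, 0 and +1 and width 1, the rule table is the complete table given
first, and the final defuzzification uses the Center of Gravity (see Section 5).

```python
def tri(peak, width):
    """Symmetric triangular membership function."""
    return lambda x: max(0.0, 1.0 - abs(x - peak) / width)


TEMP = {"f": tri(0, 10), "c": tri(10, 10), "a": tri(20, 10),
        "w": tri(30, 10), "h": tri(40, 10)}
DELTA = {"cf": tri(-1.0, 0.5), "c": tri(-0.5, 0.5), "sts": tri(0.0, 0.5),
         "w": tri(0.5, 0.5), "wf": tri(1.0, 0.5)}
POWER = {"d": tri(-1.0, 1.0), "na": tri(0.0, 1.0), "i": tri(1.0, 1.0)}  # assumed shapes

RULES = {  # rows: temperature, columns: cf, c, sts, w, wf
    "f": ["i", "i", "i", "i", "na"],
    "c": ["i", "i", "i", "na", "na"],
    "a": ["i", "na", "na", "na", "d"],
    "w": ["na", "na", "d", "d", "d"],
    "h": ["na", "d", "d", "d", "d"],
}


def control(t, dt, steps=2000):
    """Return the firing strength per action and the defuzzified power."""
    t = min(max(t, 0.0), 40.0)      # clipping of the inputs
    dt = min(max(dt, -1.0), 1.0)
    strength = {"d": 0.0, "na": 0.0, "i": 0.0}
    for row, mu_t in TEMP.items():
        for col, action in zip(DELTA, RULES[row]):
            fired = min(mu_t(t), DELTA[col](dt))          # minimum as conjunction
            strength[action] = max(strength[action], fired)
    h = 2.0 / steps
    ps = [-1.0 + (i + 0.5) * h for i in range(steps)]
    # Clip each consequence set (Mamdani) and aggregate by the maximum.
    out = [max(min(POWER[k](p), s) for k, s in strength.items()) for p in ps]
    power = sum(p * m for p, m in zip(ps, out)) / sum(out)
    return strength, power


strength, power = control(27.0, -0.4)
```

Under these assumptions the controller returns a small negative power, in line
with the advice to diminish the power only slightly.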
It is fairly easy to calculate the outcome of the controller for other inputs; the
difficulty will be to adjust the antecedent rule bases and, more importantly, to
decide which fuzzy rules are to fire on what conditions. In the given example, by
clipping the input values and by ensuring that any input (t, ∆t) makes at least
one rule fire, the fuzzy controller is turned into a closed system. If the system
had not been closed, in the sense that some cells in the table had been void, it
would have been necessary to complete the table with a “default” consequence
rule, implying no action whatsoever. The clipping also has as a side effect that
no state outside [0, 40] × [−1, 1] can be reached, because we forced it to be so.
It would be an advantage if the system could be naturally closed, in the sense
that no clipping (at least not in the temperature values t) would be necessary.
Another important factor is whether the given control system eventually reaches
an equilibrium state, after which the temperature hardly needs to be adjusted
anymore. There is an important difference between a stable state, which means that
small perturbations in the input values will eventually lead to the same equilibrium
point, and an unstable state, for which a small disruption can lead either to a
different stable state or to no stability at all. A notorious example is the
so-called inverted pendulum, for which the problem was already stated by
H. Kwakernaak and R. Sivan in [81]. More about this stability issue will be
explained in Section 9.
7 Simplified Controllers
In this section, we will give an overview of various techniques that may simplify a
part of the control process. The most obvious reason for doing this is gaining pre-
cious computation time. We should ask ourselves two questions when determining
whether or not to use these techniques: (1) Do the calculations give the same or
at least a similar precision without affecting the control process, and (2) Are they
really time-saving?
When the universes of discourse are discrete, or at least can be discretized to a
finite number of states, it is always possible to calculate all conceivable
combinations of inputs before putting the controller into operation. Because all
possible defuzzifications only have to be calculated once, this drastically
reduces the computation time. Consequently, the relation between all input
combinations and their corresponding outputs is arranged in a table. Let us
assume that there are only two inputs and one output, then this results in a
two-dimensional lookup table, which we can easily visualize. For a higher
dimension, the principle stays the same and will not lead to a drastic increase
in calculation time, but in practice a computer will be needed.
7.1.1 Example
for the change of temperature. Up to a scaling factor, the outputs denote the appro-
priate action that should be taken to adjust the heating system. Therefore, for the
output, we consider five possibilities: positive big (PB), positive small (PS), zero
(ZE), negative small (NS) and negative big (NB). The corresponding fuzzy sets will
be given by
and subsequently
p = −0.7 = 0.4 ∗ (−1) + 0.6 ∗ (−0.5) ⇒ 0.4 ∗ 0.548 + 0.6 ∗ 0.298 = +0.398
On the other hand, a direct computation of the inference and defuzzification yields
p = +0.362.
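The interpolation between two neighbouring table entries can be written as a
short Python function. In this sketch, the value −0.7 is read as lying between
the grid points −1 and −0.5, with tabulated outputs 0.548 and 0.298; the
function name is chosen for the example.

```python
def interpolate(x, x0, x1, y0, y1):
    """Linear interpolation between the table entries (x0, y0) and (x1, y1)."""
    w0 = (x1 - x) / (x1 - x0)   # weight of the left grid point
    return w0 * y0 + (1.0 - w0) * y1


weight_left = (-0.5 - (-0.7)) / (-0.5 - (-1.0))
approximation = interpolate(-0.7, -1.0, -0.5, 0.548, 0.298)
error = abs(approximation - 0.362)   # deviation from the direct computation
```

The lookup table thus trades a small deviation from the directly computed value
for a much cheaper evaluation.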
7.2.1 Example
IF (t is cold) and (∆t is cooling) THEN $p = \dfrac{(t - 20) \cdot \Delta t}{20}$
associated with the consequence of the r-th rule. Therefore, the main advantage of
this kind of controller is that the defuzzification need not be performed in every
step; instead, one can consider a finite set $\{D_{MOM}(\beta_k)\}_{k \in K}$ of predetermined or
precalculated values.
7.2.2 Example
7.2.3 Example
Consider a single-input single-output rule base “error” on a space X = [0, 100] with
the following rules:
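\[
\forall e \in [0, 100]: \quad
\begin{cases}
\mu_{small}(e) = \left(1 - \dfrac{e}{60}\right) \vee 0 \\[4pt]
\mu_{large}(e) = \left(\dfrac{5}{3} - \dfrac{e}{60}\right) \vee 0
\end{cases}
\]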
Then consider the following two rules (Figure 36)
Using
\[
D_{COG}(e) := \frac{\mu_{small}(e)\, o_2(e) + \mu_{large}(e)\, o_1(e)}{\mu_{small}(e) + \mu_{large}(e)},
\]
outside the overlap region we obtain a linear function of the error, and inside the
region we obtain a linear interpolation of the two, which is also a linear function
(Figure 37).
[Fig. 36 Overlapping Sugeno controller]
Most processes that require automatic control are nonlinear, in the sense that certain
parameters will change as a function of time, of the state the process is in,
or more likely, both. Therefore, linear controllers can only function in a limited
neighborhood of the operating point and for a limited period of time. Due to external
circumstances, it may be necessary to retune the controller at various moments
in time. It would therefore be particularly handy if controllers were
able to periodically retune themselves. Any fuzzy controller for which the fuzzy
knowledge base is changed throughout the control process will be called an adaptive
fuzzy controller. The adaptive component of such a controller consists of two
parts: the process monitor, which looks for changes in the process characteristics, and
the adaptation mechanism, which alters the controller parameters on the basis of
any detected changes. Note that the first component is equally present in nonfuzzy
adaptive controllers (see [5]).
We will first of all make the distinction between self-tuning controllers and self-
organizing controllers (see [162]). Both are fuzzy controllers that are able to adapt
following the outcome of some performance measure. However, we will speak about
self-tuning controllers if only the fuzzy set definitions are changed, and about self-
organizing controllers if the rules themselves, and particularly, their activations, are
changed, or if new rules are added or old ones omitted. Self-tuning controllers essen-
tially can only fine-tune a controller that is already designed, while self-organizing
controllers can be built from scratch.
Another common distinction that is used throughout the literature is the one between
performance-adaptive controllers and parameter-adaptive controllers; see for instance
[135]. They differ in the method that is used as a process
monitor to update the controller parameters. In the first case, some performance
measure is used that assesses how well the controller is controlling; in the second
case, a parameter estimator is used that instantly updates a model of the process. We
need to remark however that a unifying theory about the performance evaluation of
adaptive fuzzy controllers is still lacking, and that most methods are just a heuristic
adaptation of the performance criteria used in conventional control theory.
rather than the action that has to be taken, which would look like this:
Various techniques exist to obtain from such a base of rules a measure of perfor-
mance for the original controller. See for instance, the work of W. Pedrycz [21],
[113], [114] and [115]. Other techniques involve for instance the use of time series
([13]).
Performance measures, on the other hand, include, but are not limited to, choosing
one or more appropriate values among the following: overshoot, rise time, settling
time, decay ratio, frequency of oscillations, integral of the square error, integral of
the absolute value of the error, integral of the time-weighted absolute error, gain and
phase margins. Either the values of these performance measures are used directly
(e.g. [8]), or several of them are combined into a performance index (e.g. [119]).
Typically, a controller performance is measured as a trade-off between the different
goals and the constraints.
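Three of the integral criteria named above can be approximated from sampled error values as in the following sketch. The rectangle rule, the sampling interval `dt` and the function name `integral_criteria` are assumptions made for illustration only.

```python
def integral_criteria(errors, dt):
    """Rectangle-rule approximations of ISE, IAE and ITAE for samples e(k * dt)."""
    ise = sum(e * e for e in errors) * dt                        # integral of e^2
    iae = sum(abs(e) for e in errors) * dt                       # integral of |e|
    itae = sum(k * dt * abs(e) for k, e in enumerate(errors)) * dt  # integral of t * |e|
    return ise, iae, itae

errors = [1.0, 0.5, -0.25, 0.0]      # made-up error samples
ise, iae, itae = integral_criteria(errors, dt=0.1)
print(ise, iae, itae)
```

The time weighting in the third criterion penalizes errors that persist late in the response more heavily than the initial transient.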
In the following subsections, we will give an overview of the basics of the most
commonly used adaptation techniques.
8.2 Scaling
In many cases, the fuzzy set definitions are defined on a normalized universe, for
instance the closed interval [−1, +1]. Any real-valued input can be mapped onto this
interval by multiplying it by an appropriate scaling factor. If we have for
instance a universe of discourse equalling [−20, +20], then we need to multiply the
input value by a scaling factor λ = 0.05. An input value x = +10 will then classify
as “positive medium (PM)”. Using a scaling factor λ = 0.025 will yield a universe
of discourse [−40, +40], in which the same input value x = 10 will be classified as
“positive small (PS)” (Figure 38).
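The two scalings in this paragraph amount to the following arithmetic. The clipping to [−1, +1] is an added assumption for inputs that fall outside the universe of discourse, and `normalize` is a hypothetical name.

```python
def normalize(x, scale):
    """Map a raw input onto the normalized universe [-1, +1], clipping at the ends."""
    return max(-1.0, min(1.0, scale * x))

half = normalize(10, 0.05)      # universe [-20, +20]
quarter = normalize(10, 0.025)  # universe [-40, +40]
print(half, quarter)
```

The same raw input thus lands at 0.5 or at 0.25 on the normalized axis, which is why it is classified differently under the two scaling factors.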
For some applications, it may therefore be suitable not to consider the scaling fac-
tors as constants. Altering the scaling factors during a control process is the equiv-
alent of what is called gain tuning in the context of nonfuzzy PID controllers. The
most obvious way to incorporate this principle in fuzzy controllers is to change the
rule base. For instance, the rule
IF (temperature is cold) THEN (power gain = POSITIVE SMALL)
would be changed to
8.2.1 Example
Another commonly used technique is increasing the precision around the origin by
a logarithmic transformation, as we described in Section 3.1. It is possible however
to simply alter the scale factors following the result of certain performance criteria.
A well-known example is given by Y. Yamashita et al. in [161]. The article describes a
chemical process in which the temperature needs to be increased slowly at first, but
in which the increase has to be subdued after a present flow of hydrogen gas starts
to combust. The idea that is applied there is to have a variable scaling factor that is
controlled by a fuzzy controller. Using a performance measure Pt at sampling time
t, being the average of the squared error over the previous three sampling times,
the scaling factor Ct is controlled according to the following set of linguistic rules,
which depend only on the magnitude of the performance measure:
The scaling factors for the error (E) and the change of error (∆E) are then updated
as follows:
Et = Ct · E0
∆Et = Ct · ∆E0
where E0 and ∆E0 are fixed initial values. These scaling factors may be implemented
as a fuzzy controller as well as a crisp controller. Various other schemes for altering
the scaling factors are of course possible, although the design is, once more, mostly
heuristic in nature.
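A crisp version of this scheme could look like the sketch below. Only the definition of P_t (mean squared error over the last three samples) and the update E_t = C_t · E_0, ∆E_t = C_t · ∆E_0 are taken from the text; the mapping `scale_from_performance`, its direction and its breakpoints are purely hypothetical placeholders for the linguistic rules and are not taken from [161].

```python
def performance(errors):
    """P_t: mean of the squared error over the three most recent samples."""
    recent = errors[-3:]
    return sum(e * e for e in recent) / len(recent)

def scale_from_performance(p, small=0.01, large=1.0):
    # Hypothetical crisp stand-in for the linguistic rules on P_t:
    # C_t = 0.5 for small P_t, C_t = 1.0 for large P_t, linear in between.
    if p <= small:
        return 0.5
    if p >= large:
        return 1.0
    return 0.5 + 0.5 * (p - small) / (large - small)

E0, dE0 = 2.0, 0.4                   # fixed initial scaling factors (made-up values)
errors = [0.9, 0.4, 0.1, 0.05]       # made-up error history
P_t = performance(errors)
C_t = scale_from_performance(P_t)
E_t, dE_t = C_t * E0, C_t * dE0      # updated scaling factors
print(P_t, C_t, E_t, dE_t)
```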
One of the earliest examples of a performance adaptive fuzzy controller was given
by G. Bartolini et al. in [8], and consists of a controller that adapts the member-
ship functions in the rule base online, according to the outcome of a series of
performance criteria. The controller is of the PD-like fuzzy type, with as inputs
the error and the change in error, and its output being the required change in the
control variable. Six performance criteria are used to assess the quality of the controller
on-line. Over a fixed observation period, the length of which is itself
a tuning parameter for the controller, the following indices are calculated:
• ē², the average square error
• ē, the average error
• |ē|, the average absolute error
• |e|max , the maximum absolute error
• n1 , the number of consecutive variations in control output
• n2 , the number of variations in control output during the given time interval
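The six indices can be computed over one observation window roughly as follows. This is a sketch under assumptions: the window is given as two lists of samples, and n1 is interpreted here as the longest run of back-to-back changes in the control output, which is only one possible reading of "consecutive variations".

```python
def performance_indices(errors, outputs):
    """Indices over one observation window of errors and control outputs."""
    n = len(errors)
    mean_sq = sum(e * e for e in errors) / n       # average square error
    mean = sum(errors) / n                         # average error
    mean_abs = sum(abs(e) for e in errors) / n     # average absolute error
    max_abs = max(abs(e) for e in errors)          # maximum absolute error
    # Flags marking where the control output changed between two samples.
    changed = [b != a for a, b in zip(outputs, outputs[1:])]
    n2 = sum(changed)                              # variations in the window
    # n1 read as the longest run of back-to-back changes (an assumption).
    n1 = run = 0
    for c in changed:
        run = run + 1 if c else 0
        n1 = max(n1, run)
    return mean_sq, mean, mean_abs, max_abs, n1, n2

indices = performance_indices([1.0, -1.0, 0.5, -0.5], [0, 1, 2, 2, 3])
print(indices)
```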
While the first four indices are meant to keep the controller at set-point, the latter two
serve the secondary objective of reducing the number of (unnecessary) command
variations. Let us assume for instance that the error function E can assume one of
the following three linguistic values: Negative, Zero and Positive. The shape of
the fuzzy sets can be any of those described in Section 2.3. The adaptation
is then done by modifying the shapes of the membership functions in proportion to
the undesired effects that are being corrected. Depending on the outcome of the first
four performance indices, one of the actions in Figure 39 is taken.
Whether one of the four actions, if any, has to be taken, depends on the outcome
of the algorithm shown in Figure 40.
If the average error is too large, then adaptation action (a) or (b) is carried out,
depending on the sign of the difference between the error and the set-point. Adapta-
tion action (a) for instance improves the controller performance when the process is
constantly below set-point. If either the squared error is too large, which indicates
imprecise control, or the error function produces an outlier, even one, adaptation
action (c) increases the sensitivity of the controller.
Applying only these rules would cause the controller to keep increasing its
precision, which would eventually make it unworkable, as it would have to make
too many adaptations. Therefore, we apply a second flow chart, aimed at reducing
the number of command variations (Figure 41).
Adaptation action (d) is the reverse of adaptation action (c), and decreases the
sensitivity of the controller. The thresholds ε1 and ε2, against which n1 and n2 are
compared, specify the level of command variation that is considered to be intolerable.
It is obvious that the performance of this adaptive fuzzy controller relies signifi-
cantly on the appropriate choice of the parameters α , β , Γ, ε1 and ε2 , for which there
are no standard rules but heuristics. The following observations are helpful though.
Essentially, the idea behind this adaptation process is to provide a quick controller
adaptation (which need not be carried out at every sampling time)
without causing instability or oscillations, and with only small adaptations to the
fuzzy set definitions. It is also feasible to monitor the excessive command variations
at every sample, but to monitor the set-point control only relatively sparsely, because too
many control adaptations could lead to unstable situations as well.
[Figs. 39–41: adaptation actions (a)–(d) and the two adaptation flow charts, with decision nodes such as "n1 > ε1" and outcomes "No Adaptation Action" and "Adaptation Action (d)"]
The parameters α and β specify what levels of the set-point error
criteria are still acceptable before an adaptation of the fuzzy membership definition
is required. If these parameters are chosen too small, the system will constantly try
to adapt itself, thereby defeating its goal of producing a desired performance level.
On the other hand, if these values are chosen too high, the controller will not make nearly
enough adaptations and will therefore perform badly. In [8], G. Bartolini et
al. showed that an increase in β causes an increase in the average square error ē², as
well as a decrease in the number of command variations. An appropriate choice
for β therefore is a trade-off between performance and precision, and the same can
be said about α. The thresholds ε1 and ε2 are determined by energy conservation requirements,
and Γ specifies that such an action is only to be carried out when the set-point
error is less than a certain upper bound, to prevent excessive control output. Choosing Γ
too high may lead to too much adaptation and therefore insensitivity; if Γ is chosen
too low, no adaptation to minimize control output may ever take place. Another
parameter that is crucial in this approach is the length of the time window. A longer
interval makes the error values more significant and helps to filter out
unwanted noise in the signals, but also causes the adaptation mechanism to react more
sluggishly.
The method developed by H. Nomura et al. in [106] tunes a controller
by means of a set of input–output training data, which is used to tune
the membership functions of a fuzzy controller using numerical techniques that
closely resemble those used in physics to minimize an energy function. Let a
fuzzy controller consist of n fuzzy rules of the form
Rule i: IF x1 is INPUT (i,1) and ... and xm is INPUT (i,m) THEN u is OUTPUT (i)
where x1 , ..., xm are the controller inputs and the fuzzy sets of the membership func-
tions are given by
\[
\forall i \in \{1, \ldots, n\},\ \forall j \in \{1, \ldots, m\}: \quad
\mu_{(i,j)}(x) := 1 - \frac{2\,|x - a_{ij}|}{b_{ij}}
\]
The membership functions are thus triangular rules as defined in Section 2.3, but
with center ai j and base length bi j which may be subject to adaptation. The output
states are defined by the fuzzy singletons
Now, let us assume that there exists a set of data R that describes for any r ∈ R a
series of input vectors, $x = (x_1^r, \ldots, x_m^r)$, together with the desired output values $u^r$.
The idea H. Nomura et al. in [106] proposed, is to minimize the objective function
\[
E = \frac{1}{2}\,(u^* - u^r)^2
\]
with respect to the parameters ai j , bi j and ui . One of the simplest methods to achieve
this goal is to use the gradient or steepest descent algorithm (which is described in,
e.g., D.R. Sadler [131]), which is an iterative algorithm that decreases the value of
the objective function, relying on the fact that from any point, the objective function
decreases most rapidly in the direction of the negative gradient of its parameters.
Let $E(z_1, \ldots, z_p)$ be such an objective function, then this vector is
\[
-\nabla(E) = \left( -\frac{\partial E}{\partial z_1},\ -\frac{\partial E}{\partial z_2},\ \cdots,\ -\frac{\partial E}{\partial z_p} \right)
\]
and if $z_i(t)$ is the value of the i-th parameter after t iterations, then the next estimate
for the same parameter is given by
\[
\forall i \in \{1, \ldots, p\}: \quad z_i(t+1) = z_i(t) - K \cdot \frac{\partial E}{\partial z_i}
\]
with K a constant that controls the maximal speed by which the parameters are
altered at each iteration.
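The update rule above can be illustrated on a toy objective. The quadratic function, the step size K = 0.1 and the fixed number of iterations are assumptions chosen only to make the sketch self-contained; `steepest_descent` is a hypothetical name.

```python
def steepest_descent(grad, z, K=0.1, steps=200):
    """Iterate z_i(t+1) = z_i(t) - K * dE/dz_i for a fixed number of steps."""
    for _ in range(steps):
        g = grad(z)
        z = [zi - K * gi for zi, gi in zip(z, g)]
    return z

# Toy objective E(z) = (z1 - 3)^2 + (z2 + 1)^2 with its analytic gradient.
def grad_E(z):
    return [2 * (z[0] - 3), 2 * (z[1] + 1)]

z_min = steepest_descent(grad_E, [0.0, 0.0])
print(z_min)
```

For this objective each step shrinks the distance to the minimum (3, −1) by a factor 1 − 2K, so a larger K converges faster as long as it stays below 1, beyond which the iteration diverges.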
In this case, the parameters we wish to alter are ai j , bi j and ui . Considering the
objective function in terms of the membership functions,
\[
E = \frac{1}{2} \cdot \left( \frac{\displaystyle\sum_{i=1}^{n} \prod_{j=1}^{m} \mu_{IS(i,j)}(x_j^r) \cdot u_i}{\displaystyle\sum_{i=1}^{n} \prod_{j=1}^{m} \mu_{IS(i,j)}(x_j^r)} - u^r \right)^2
\]
and using the steepest descent algorithm, we obtain the following adaptation equa-
tions:
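\[
a_{ij}(t+1) = a_{ij}(t) - K_a\,(u^* - u^r) \cdot \frac{\mu_i}{\mu_{ij}(x_j^r)} \cdot \frac{\sum_{k=1}^{n} \mu_k\,\bigl(u_i(t) - u_k(t)\bigr)}{\left(\sum_{k=1}^{n} \mu_k\right)^2} \cdot \frac{2 \cdot \mathrm{sgn}\bigl(x_j^r - a_{ij}(t)\bigr)}{b_{ij}(t)}
\]
\[
b_{ij}(t+1) = b_{ij}(t) - K_b\,(u^* - u^r) \cdot \frac{\mu_i}{\mu_{ij}(x_j^r)} \cdot \frac{\sum_{k=1}^{n} \mu_k\,\bigl(u_i(t) - u_k(t)\bigr)}{\left(\sum_{k=1}^{n} \mu_k\right)^2} \cdot \frac{1 - \mu_{ij}(x_j^r)}{b_{ij}(t)}
\]
\[
u_i(t+1) = u_i(t) - K_u\,(u^* - u^r) \cdot \frac{\mu_i}{\sum_{k=1}^{n} \mu_k}
\]
with $\mu_i := \prod_{l=1}^{m} \mu_{il}(x_l^r)$. Starting from a reliable controller input/output data set, one
can optimize the parameters as follows: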
• Insert the data $x = (x_1^r, \ldots, x_m^r)$ and calculate for each rule $\mu_i$ as well as the control
output $u^*$.
• Update the values of ui (t).
• Repeat the rule firing using the new values of ui .
• Update the values of ai j (t) and bi j (t).
• Calculate the inference error E(t)
and repeat these steps for other r ∈ R until |E(t) − E(t − 1)| is sufficiently small.
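The tuning loop listed above can be sketched as follows for triangular sets and singleton outputs. Several points are assumptions of this sketch rather than details of [106]: memberships are clipped at zero, the gradients are obtained by differentiating E = ½(u* − u^r)² directly, the learning rates and the toy data (samples of y = x²) are made up, a fixed number of passes replaces the stopping criterion, and every input is assumed to fire at least one rule so that the normalization never divides by zero.

```python
def fire(x, a, b, u):
    """Return set memberships mu_ij, rule activations mu_i, their sum and u*."""
    mu_ij = [[max(0.0, 1 - 2 * abs(xj - aij) / bij)
              for xj, aij, bij in zip(x, ai, bi)] for ai, bi in zip(a, b)]
    mu = []
    for row in mu_ij:
        prod = 1.0
        for v in row:
            prod *= v
        mu.append(prod)
    total = sum(mu)
    return mu_ij, mu, total, sum(m * ui for m, ui in zip(mu, u)) / total

def train_step(x, target, a, b, u, Ka=0.01, Kb=0.01, Ku=0.5):
    # 1) update the singleton consequents u_i
    _, mu, total, out = fire(x, a, b, u)
    for i in range(len(u)):
        u[i] -= Ku * (out - target) * mu[i] / total
    # 2) fire again with the new u_i, then update centres a_ij and widths b_ij
    mu_ij, mu, total, out = fire(x, a, b, u)
    for i in range(len(u)):
        for j in range(len(x)):
            if mu_ij[i][j] <= 0.0:
                continue  # outside the support the gradient is zero
            common = (out - target) * (mu[i] / mu_ij[i][j]) * (u[i] - out) / total
            sgn = (x[j] > a[i][j]) - (x[j] < a[i][j])
            da = common * 2 * sgn / b[i][j]
            db = common * (1 - mu_ij[i][j]) / b[i][j]
            a[i][j] -= Ka * da
            b[i][j] -= Kb * db

# Toy data: samples of y = x^2 on [0, 1]; three rules on a single input.
data = [(0.0, 0.0), (0.25, 0.0625), (0.5, 0.25), (0.75, 0.5625), (1.0, 1.0)]
a = [[0.0], [0.5], [1.0]]
b = [[2.0], [2.0], [2.0]]
u = [0.0, 0.0, 0.0]

def total_error():
    return sum(0.5 * (fire([x], a, b, u)[3] - y) ** 2 for x, y in data)

before = total_error()
for _ in range(100):
    for x, y in data:
        train_step([x], y, a, b, u)
after = total_error()
print(before, after)
```

Updating the consequents before the antecedent parameters mirrors the order of the steps in the list; both updates could also be computed from a single rule firing at the cost of a slightly different trajectory.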
The first practical application of this adaptive fuzzy method was made by
H. Nomura et al. in [106], where a mobile robot was trained to avoid a moving
obstacle. Starting with 625 rules and a set of manual operations to obtain train-
ing data, 66 input/output sets provided enough information to tune the membership
functions. Many variations on this method exist, that mainly differ in the optimiza-
tion procedure, although one could also heuristically adapt the fuzzy set definitions
and the defuzzification method. For instance A. Maeda et al. claim in [93] to obtain
a learning speed 40 times faster than the algorithm described above.
Just as for the membership function tuning controller described in Section 8.3,
the adaptation of the self-organizing controllers constructed by E.H. Mamdani et al.
in [95] and [96], of which a detailed description can be found in T.J. Procyk et al. [119],
is carried out by calculating performance measures. The
difference, however, is that neither the fuzzy set definitions nor the scaling factors
are adapted, but the rules themselves are. To be more accurate, each of the possible
rules is fired, and afterward, it is determined which of the rules yields the best
performance measure.
For the sake of simplicity, let us assume a two-input,
one-output fuzzy rule base, with as input values the error and the change of error.
As is shown in D. Driankov et al. [24], the case with one input is trivially simple
and the case with two inputs allows for an easy matrix representation. Suppose
the linguistic arguments of the error function are E1 , ..., E p , those of the change of
error are ∆E1 , ..., ∆Eq and those of the output function are U1 , ...,Ur . Let the fuzzy
controller consist of rules like
for all i ∈ N = {1, ..., n}, a ∈ P = {1, ..., p}, b ∈ Q = {1, ..., q}, c ∈ R = {1, ..., r}. It
is only sensible that the number n of rules should never exceed the number of possible
combinations, so n ≤ p × q × r. Optimally, every pair of antecedents occurs (which we
called the completeness of the antecedent rule base) and yields
one and only one consequence, so n = p × q. The following algorithm can then be
used to determine the optimal linguistic value for Uc.
Suppose that there also exists a performance measure P. For any (a, b) ∈ P × Q
fixed, this may for instance be the difference between the defuzzification output
and the given output of a control data set. Therefore, let any rule with antecedents
µe = Ea and µ∆e = ∆Eb fire, and calculate
8.5.1 Example
Let us for instance consider the example given in Section 6 again. Suppose the
antecedent rule base contains the following rule:
The linguistic value “increasing” was introduced in a heuristic way. We might won-
der what output will give the best result, when we have to choose for instance be-
tween “increasing” or “no action”. In other words, we wonder which of the two
tables will yield the best result: the original one
t/∆t   cf   c    sts  w    wf
f      i    i    i    i    na
c      i    i    i    na   na
a      i    na   na   na   d
w      na   na   d    d    d
h      na   d    d    d    d
8.5.2 Remark
For the sake of completeness, it is common that all possible linguistic values
U1 , ...,Ur are used in the rule base, to come up with the best performing value.
To measure the influence of one rule at a time, it is recommended not to use this
algorithm for the substitution of two rules in the antecedent rule base at once, since
it can then no longer be determined which of them is individually responsible for the
better or worse performance. It would take too long to have all rules undergo this
testing procedure in every step; it is therefore advisable to check
the rules only sporadically, in such a way that all rules are eventually revised an equal number of
times, with enough unmonitored runs of the control process in between. Following
D. Driankov et al. in [24], when making use of a matrix notation, this algorithm can
be simplified to a finite number of lattice theoretic and algebraic operations, yielding
a high performance. This method can not only be used to enhance the performance
of an existing controller, it is also possible to build up a controller from scratch, if
enough supervised test runs are performed.
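The one-rule-at-a-time substitution recommended in this remark can be sketched as below. The rule table is a dictionary from antecedent pairs to output labels, and `evaluate` is a hypothetical performance measure for which smaller values are assumed to be better; here it simply counts disagreements with a made-up reference table.

```python
def best_consequent(rule_table, cell, candidates, evaluate):
    """Try every candidate output label in one cell and keep the best performer."""
    scores = {}
    for label in candidates:
        rule_table[cell] = label           # substitute a single rule only
        scores[label] = evaluate(rule_table)
    best = min(scores, key=scores.get)     # smaller measure = better (assumed)
    rule_table[cell] = best
    return best, scores

# Made-up reference behaviour and a rule table that disagrees in one cell.
reference = {("f", "cf"): "i", ("a", "sts"): "na"}
table = {("f", "cf"): "i", ("a", "sts"): "i"}

def mismatch(t):
    return sum(t[k] != reference[k] for k in t)

best, scores = best_consequent(table, ("a", "sts"), ["i", "na", "d"], mismatch)
print(best, scores)
```

Because only one cell changes per call, any difference in the measure can be attributed to that rule alone.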
which makes the process a little bit more difficult, especially since m, the number of
steps that have to be traced back, is unknown. Some work on this “delay in reward”
parameter has been done by T. Yamazaki et al. in [162].
9 Stability Analysis
Owing to the heuristic nature of fuzzy controllers, and the positive correlation between
the heuristic experience of the controller’s designer and the performance of
the controller, relatively little effort has been made to develop a solid theoretical
background for the general analysis of the dynamic behavior of control loops. Nevertheless,
the stability of fuzzy controllers ([143]) is an extremely important factor in
their design. To quote A. Kandel et al. in [71], “Stability is (...) the first and last concept
for any system design and a fundamental issue in every control system”. In fact,
the lack of a solid stability analysis has been considered as the major drawback for
fuzzy controllers, and is one of the main arguments against the use of fuzzy control
over conventional control ([1]). Because fuzzy controllers are essentially nonlinear
systems (see [24]), it will be hard to obtain general results, and one must be satisfied
with results that are only interpretable on a very local scale. There are two
main approaches to study the behavior of a fuzzy controller: one relies on classical
nonlinear dynamic systems theory, where we assume the fuzzy controller to be a
particular class of nonlinear controllers, and the other on the theory of fuzzy dynamical systems,
associated with Zadeh’s Extension principle (see [165]). The basic theory of fuzzy
dynamical systems can be found in P.E. Kloeden [76], and research concerning the
stability of a fuzzy system makes use of the concept of energy or controllability of
the fuzzy system. See for instance [42] and [75]. The first articles specifically on
stability of fuzzy controllers are due to C.V. Negoita ([105]) and W.M. Kickert and
E.H. Mamdani ([74]).
Whether a fuzzy control design will be stable, i.e. whether it will reach an equilibrium
(or at least stay close enough to it, within a preset boundary), is still
an open question, unlike for linear controllers. In linear control, we call a system
stable if it converges to the equilibrium, no matter where the system state variables
start. It is a known result that a necessary and sufficient condition for a linear system
to be stable is that all eigenvalues lie in the open left half of the complex plane.
For nonlinear systems, such as fuzzy control systems, the concept of stability is
much more intricate. Fuzzy controllers are nonlinear in several respects; mainly,
there are three sources of nonlinearity in fuzzy control:
• The rule base. The position, shape and number of fuzzy sets introduce nonlinearity, and
this nonlinearity may be reinforced by nonlinear scaling of the input values. The
rules themselves also often express a nonlinear control strategy.
• The inference engine. Connectives like (and,or):=(∧, ∨) are nonlinear.
• The defuzzification. Several defuzzification methods are nonlinear.
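For contrast with these nonlinear effects, the eigenvalue condition for the linear case mentioned before the list is easy to check numerically. The sketch below assumes a continuous-time system ẋ = Ax with a 2 × 2 matrix, so that the eigenvalues follow from the characteristic polynomial; `is_stable_2x2` is a hypothetical helper.

```python
import cmath

def is_stable_2x2(a11, a12, a21, a22):
    """True if both eigenvalues of [[a11, a12], [a21, a22]] have negative real part."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # Roots of lambda^2 - trace * lambda + det = 0.
    root = cmath.sqrt(trace * trace - 4 * det)
    eigenvalues = ((trace + root) / 2, (trace - root) / 2)
    return all(ev.real < 0 for ev in eigenvalues)

stable = is_stable_2x2(0, 1, -2, -3)    # eigenvalues -1 and -2
unstable = is_stable_2x2(0, 1, 2, -1)   # eigenvalues 1 and -2
print(stable, unstable)
```

No comparably simple test is available once the controller contributes the nonlinearities listed above.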
Checking conditions for stability on nonlinear systems is much more difficult
than on linear systems; some theoretical background can be found in [24] and [111].
Four methods are mentioned there, Lyapunov stability, Popov stability, circle sta-
bility and conicity. However, the results of these methods are rather strict, and
But the cases in which the nonlinear fuzzy controller can be approximated by a lin-
ear controller, are, to say the least, scarce. As mentioned in [69], when we consider
a single-input, single-output controller, we can control the shape of the surface to
a certain extent by manipulating the membership functions. In [85] and [86], we
studied the behavior of the identity mapping between two fuzzy antecedent rule
bases, and whether or not it yields an identity mapping between the input before
fuzzification and the output after defuzzification. We found that this was not the
case. This mapping is called the input-output mapping, and can be used as a design aid
when one has to choose the membership functions and construct the rules. Ideally,
we will want to make the distance between this mapping and the diagonal as small
as possible, for instance with respect to the L1 –metric.
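The distance between an input-output mapping and the diagonal can be estimated numerically, as in the following sketch. It assumes a single input on [−1, 1], triangular input sets, singleton outputs placed at the set centres and a weighted-average (discrete centre-of-gravity) defuzzification; the two half-widths are illustrative choices, and all function names are hypothetical.

```python
def triangle(center, half_width):
    """Triangular set with peak 1 at `center` and support of the given half-width."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / half_width)

def io_mapping(sets, singletons):
    """Input-output map of a one-input controller with singleton outputs."""
    def f(x):
        weights = [mu(x) for mu in sets]
        return sum(w * c for w, c in zip(weights, singletons)) / sum(weights)
    return f

def l1_distance_to_diagonal(f, lo=-1.0, hi=1.0, n=2000):
    """Midpoint-rule approximation of the integral of |f(x) - x| over [lo, hi]."""
    h = (hi - lo) / n
    mids = (lo + (k + 0.5) * h for k in range(n))
    return sum(abs(f(x) - x) for x in mids) * h

centers = [-1.0, 0.0, 1.0]
wide = io_mapping([triangle(c, 1.0) for c in centers], centers)
narrow = io_mapping([triangle(c, 2.0 / 3.0) for c in centers], centers)
d_wide = l1_distance_to_diagonal(wide)
d_narrow = l1_distance_to_diagonal(narrow)
print(d_wide, d_narrow)
```

Under these assumptions the fully overlapping sets reproduce the diagonal, while the narrower sets produce flat pieces joined by steeper slopes and hence a nonzero distance.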
9.2.1 Example
In each of the following examples, a set of IF–THEN rules on the universe of dis-
course X = [−1, 1] is given, for which we will determine the input-output mapping,
which we will plot against the linear controller. These results depend of course on
the choice of aggregation and defuzzification operators. We will choose the product
implication for its computational simplicity and for its continuity, and the Center-
of-Gravity defuzzification, because it is continuous, satisfies the uniqueness crite-
rion, and in case of a singleton output, it degenerates to a quotient of finite sums.
If there had been more than one input, we would have chosen the product
as the aggregation operator too, because it takes into account all poorly performing
antecedents.
Consider for instance the following linguistic rule base:
1. Let us take as condition and as consequence the same antecedent rule base consisting
of the triangular fuzzy sets
[Fig. 42: IF and THEN fuzzy sets Neg, Zero and Pos on X = [−1, 1], with the resulting input–output curve (Y against X)]
Then the output curve has the shape given in Figure 42.
As we have seen in [85], the Consistency Criterion 5.1.10 is not fulfilled; the
third example will show that this can never be the case when we consider the
identity mapping between the same rule bases. Notice also that it is impossible
to drive the output to its full potential of 100% output range.
2. The less overlap there is in the antecedent rule base, the steeper the slopes be-
tween the various regions will get. As an example consider the following rule
base:
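\[
\mu_{Neg}(x) = \left(1 - \left|-\tfrac{3}{2} - \tfrac{3}{2}x\right|\right) \vee 0
\]
\[
\mu_{Zero}(x) = \left(1 - \left|\tfrac{3}{2}x\right|\right) \vee 0
\]
\[
\mu_{Pos}(x) = \left(1 - \left|-\tfrac{3}{2} + \tfrac{3}{2}x\right|\right) \vee 0
\]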
[Figures: IF and THEN fuzzy sets Neg, Zero and Pos, with the corresponding input–output curves (Y against X)]
• The rule base must be a complete combination (cartesian product) of all input
families.
• The outputs must be singletons at the points in which the input sets reach their
core value.
• The defuzzification operator must be DCOG (or its discrete counterpart).
4. Let us now change the triangular fuzzy sets in the antecedent rule base into trapezoidal
sets
combined with the same Dirac outputs as in the third example. The resulting
curve is seen in Figure 45.
The “flat” input sets produce horizontal pieces in the input–output curve, which
inevitably cause large gains away from the reference value. This is the equivalent
of a deadzone with saturation. Increasing the width of the middle term results in
a wider stable area around the reference.
[Figures: IF and THEN fuzzy sets Neg, Zero and Pos, with the corresponding input–output curves (Y against X)]
5. The sharp corners may cause a problem when, e.g., considering differentiability.
To avoid this, one can introduce nonlinear input sets, so that the input–output
mapping becomes smooth as well. For instance, the antecedent rule base
\[
\mu_{Neg}(x) = \begin{cases}
1 & \text{if } x \in [-1, -0.75] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(2x - 0.5)) & \text{if } x \in [-0.75, -0.25] \\
0 & \text{if } x \in [-0.25, 1]
\end{cases}
\]
\[
\mu_{Zero}(x) = \begin{cases}
0 & \text{if } x \in [-1, -0.75] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(-2x + 1.5)) & \text{if } x \in [-0.75, -0.25] \\
1 & \text{if } x \in [-0.25, 0.25] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(2x + 1.5)) & \text{if } x \in [0.25, 0.75] \\
0 & \text{if } x \in [0.75, 1]
\end{cases}
\]
\[
\mu_{Pos}(x) = \begin{cases}
0 & \text{if } x \in [-1, 0.25] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(-2x - 0.5)) & \text{if } x \in [0.25, 0.75] \\
1 & \text{if } x \in [0.75, 1]
\end{cases}
\]
combined with the same Dirac outputs as in the third example results in the
input–output curve of Figure 46.
6. Adding more sets only makes the mapping more bumpy. Consider for instance
the following antecedent rule base:
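\[
\mu_{NB}(x) = \begin{cases}
\frac{1}{2} + \frac{1}{2}\cos(\pi(2x + 2)) & \text{if } x \in [-1, -0.5] \\
0 & \text{if } x \in [-0.5, 1]
\end{cases}
\]
\[
\mu_{NS}(x) = \begin{cases}
\frac{1}{2} + \frac{1}{2}\cos(\pi(2x + 1)) & \text{if } x \in [-1, -0.5] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(-2x + 1)) & \text{if } x \in [-0.5, 0] \\
0 & \text{if } x \in [0, 1]
\end{cases}
\]
\[
\mu_{ZE}(x) = \begin{cases}
0 & \text{if } x \in [-1, -0.5] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(2x)) & \text{if } x \in [-0.5, 0.5] \\
0 & \text{if } x \in [0.5, 1]
\end{cases}
\]
\[
\mu_{PS}(x) = \begin{cases}
0 & \text{if } x \in [-1, 0] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(2x - 1)) & \text{if } x \in [0, 0.5] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(-2x - 1)) & \text{if } x \in [0.5, 1]
\end{cases}
\]
\[
\mu_{PB}(x) = \begin{cases}
0 & \text{if } x \in [-1, 0.5] \\
\frac{1}{2} + \frac{1}{2}\cos(\pi(-2x - 2)) & \text{if } x \in [0.5, 1]
\end{cases}
\]
The output functions will be the Dirac fuzzy sets $\mu_{NB}(x) = \delta_{-1}(x)$, $\mu_{NS}(x) =
\delta_{-0.5}(x)$, $\mu_{ZE}(x) = \delta_{0}(x)$, $\mu_{PS}(x) = \delta_{0.5}(x)$ and $\mu_{PB}(x) = \delta_{1}(x)$. The input-output
mapping is shown in Figure 47.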
7. More sets make it easier, however, to manipulate the position of the reference
plateau by moving the singletons around. Replacing the output functions
in the previous example by $\mu_{NB}(x) = \delta_{-1}(x)$, $\mu_{NS}(x) = \delta_{-0.25}(x)$, $\mu_{ZE}(x) =
\delta_{0}(x)$, $\mu_{PS}(x) = \delta_{0.25}(x)$ and $\mu_{PB}(x) = \delta_{1}(x)$ for instance yields the input–output
curve of Figure 48.
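The effect of moving the singletons in examples 6 and 7 can be reproduced with a few lines. The sketch assumes that the five raised-cosine sets are centred at −1, −0.5, 0, 0.5 and 1 with half-width 0.5, which matches the piecewise definitions of example 6 on [−1, 1], and that the output is the membership-weighted average of the singletons; `cos_bump` and `mapping` are hypothetical names.

```python
import math

def cos_bump(center, half_width=0.5):
    """Raised-cosine set: 1 at `center`, falling smoothly to 0 at distance half_width."""
    def mu(x):
        d = abs(x - center)
        if d > half_width:
            return 0.0
        return 0.5 + 0.5 * math.cos(math.pi * d / half_width)
    return mu

centers = [-1.0, -0.5, 0.0, 0.5, 1.0]
sets = [cos_bump(c) for c in centers]

def mapping(singletons):
    """Weighted average of the singleton outputs for the five input sets."""
    def f(x):
        w = [mu(x) for mu in sets]
        return sum(wi * si for wi, si in zip(w, singletons)) / sum(w)
    return f

f6 = mapping([-1.0, -0.5, 0.0, 0.5, 1.0])    # singletons of example 6
f7 = mapping([-1.0, -0.25, 0.0, 0.25, 1.0])  # singletons of example 7
print(f6(0.5), f7(0.5), f6(0.25), f7(0.25))
```

Pulling the inner singletons towards zero lowers the output around the reference, which flattens the mapping there at the price of steeper slopes near the ends of the interval.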
[Figs. 47 and 48: IF and THEN sets NB, NS, ZE, PS and PB, with the resulting input–output curves (Y against X)]
The only other case that can be visualized, is the one that involves two inputs and
one output. The graph of the input–output mapping is then a surface. Let us consider
the example given in Section 7.1 again. We plot the output power as a function of
temperature and change in temperature, yielding the result in Figure 49.
We see that the graph contains peaks as well as more horizontal plateaus. It is
desirable to have the latter in regions near the desired equilibrium point, because
otherwise the control process might become unstable, but it can be a
disadvantage if one wants to increase the sensitivity near the reference points. Had
we used input sets with flat peaks, this would have resulted in various
deadzones, regions where the control behavior stagnates. The behavior involving
the smoothing of sharp corners or the steepness of the slopes is largely similar to
the one described in Example 9.2.1. The design choices in Example 9.2.1(3) are
also a set of necessary conditions to obtain a diagonal plane as control surface. In
that case, the stability analysis for open-loop systems can be performed by the usual
methods regarding tuning and stability of a closed-loop system.
The concept of a linguistic trajectory in the state space of a fuzzy controller was
introduced by M. Braae and D.A. Rutherford in [14] and [15]. Because of the fundamental
difference between the representation of a phase plane in two dimensions and
in higher-dimensional systems, this method is in practice only applicable in the case of
an input with two state variables, typically the error and the change of error; to
keep the notation general, we will denote these variables by X1 and X2. Suppose the numbers of
linguistic values that these variables can attain are N1 and N2 respectively. In such a
case, the maximal number of rules in the rule base equals N1 × N2. Suppose furthermore
that the output variable, u, can attain M different linguistic values. Then it is
possible to divide the state space into a partition of at most M parts (disregarding
borders, which have zero Lebesgue measure, see [78]), according to the following
convention: each crisp element (x1, x2) belongs to the part for which the membership
grade of the antecedents is maximal. In other words, for any rules j, k ∈ R, denoted
as
we have that (x1, x2) belongs to the part of the partition associated with uk if and only if
∀ j ≠ k : µj(x1, x2) ≤ µk(x1, x2)
where µj and µk are the fuzzy sets representing the rule antecedents of the j-th and
the k-th rule. For instance, using the Mamdani-type implication,
A linguistic trajectory is the sequence of rules that are successively fired. In the
following example, the subsequent points in the state space at discrete points in
time are marked with an X; multiple successive occurrences of the same rule fired
are reduced to only one mention. The linguistic trajectory (see Figure 51) of this
particular path hence equals
[Fig. 50 State space: a 5 × 5 grid of rules numbered 1 to 25, each cell carrying one of the output labels NA, D or I; horizontal axis E (labels f, c, a, w, h), vertical axis ∆E (labels wf, w, sts, c, ef).]
[Fig. 51 Linguistic trajectory: the same 5 × 5 grid of rules 1 to 25 over the axes E and ∆E, with the visited states marked.]
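The partition convention and the reduction of repeated rules can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration only: the triangular membership functions, the five evenly spaced linguistic values on [-1, 1], the use of the minimum as conjunction, and the row-by-row numbering of the 25 rules.

```python
from itertools import groupby

def triangle(center, width):
    """Return a triangular membership function peaking at `center` (hypothetical shape)."""
    def mu(x):
        return max(0.0, 1.0 - abs(x - center) / width)
    return mu

# Five evenly spaced linguistic values per variable on [-1, 1] (an assumption).
CENTERS = [-1.0, -0.5, 0.0, 0.5, 1.0]
SETS_1 = [triangle(c, 0.5) for c in CENTERS]
SETS_2 = [triangle(c, 0.5) for c in CENTERS]

def dominant_rule(x1, x2):
    """Number (1..25) of the rule whose antecedent grade min(mu1, mu2) is maximal."""
    best_rule, best_grade = None, -1.0
    for row, mu2 in enumerate(SETS_2):
        for col, mu1 in enumerate(SETS_1):
            grade = min(mu1(x1), mu2(x2))  # Mamdani-type conjunction
            if grade > best_grade:
                best_rule, best_grade = row * 5 + col + 1, grade
    return best_rule

def linguistic_trajectory(points):
    """Sequence of fired rules, with successive repetitions reduced to one mention."""
    fired = [dominant_rule(x1, x2) for x1, x2 in points]
    return [rule for rule, _ in groupby(fired)]

# A hypothetical path of crisp states sampled at discrete points in time.
path = [(-0.9, -0.9), (-0.8, -0.7), (-0.4, -0.4), (-0.1, -0.1), (0.0, 0.05)]
trajectory = linguistic_trajectory(path)
print(trajectory)
```

The last two points of the path fall into the same cell, so that rule is mentioned only once in the resulting trajectory.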
M. Braae and D.A. Rutherford formulated a simple set of observation rules that relate
the number of visited states (colored in gray in the example) and their relative
position to some simple conclusions about the scaling factors of the intervals
in which the variables x1 and x2 should lie. Let S1 be the total width of the first
interval and S2 the total width of the second. A scaling factor that is too large or
too small may result in an inadequate covering of the partition space, which implies
that the scaling factors should be adjusted (see Figure 52).
[Fig. 52: four 5 × 5 state-space grids (rules 1 to 25, axes X1 and X2) illustrating badly chosen scaling factors; the recoverable panel titles are "both too large" and "S2 too large".]
Another problem that will quite often arise is the occurrence of loop behavior.
A system may start alternating between two or more states without ever reaching
the desired equilibrium. Since no general solution has been found for fuzzy
controllers, some techniques involving crisp controllers are used to determine the
stability and the convergence. We will now describe some of these in more detail.
An equilibrium point x∗ is then a state of the system for which the following condi-
tion holds:
∃t0 ∈ R+ , ∀t ≥ t0 : x(t) = x∗ ,
or, in other words, once a state has reached an equilibrium, it will remain in this
point for all future times.
A nonlinear system may demonstrate much more complicated behavior than a
linear system. For instance, unlike a linear system, it may feature multiple equilibrium
points. Also, the stability analysis may depend on the system input as
well as on the initial condition. One major problem with nonlinear systems is that
an exact mathematical solution is in many cases impossible to obtain (see also [97]).
First of all, we have to make a distinction between the local and global stability
behavior of such a state. Most of the results we obtain concern local stability. In
that case, we only consider states that are close to some equilibrium point, which,
without loss of generality, can be taken as the origin o of the state space by applying
an affine transformation. If dim(x) = n is the number of state variables
and dim(u) = m is the number of inputs, a common mathematical technique is to
linearize the problem around o ([48]). In that case we obtain
ẋ = Ax + Bu
where the dot denotes differentiation with respect to the time parameter t, A ∈ Mn×n
is the constant coefficient matrix of the linear system and B ∈ Mn×m is the input
matrix. A study of the linear dynamical system can therefore be reduced to a study
of the matrices A and B.
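As a minimal sketch of how such a linearization can be used in practice, the Python fragment below approximates the matrix A of an unforced two-state system by central differences and inspects its eigenvalues. It relies on the standard fact that the origin is locally asymptotically stable when all eigenvalues of A have negative real part. The damped pendulum, the step size h and all function names are assumptions chosen for illustration.

```python
import cmath
import math

def jacobian(f, x0, h=1e-6):
    """Central-difference approximation of the Jacobian of f: R^n -> R^n at x0."""
    n = len(x0)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x0)
        xm = list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A) * lambda + det(A) for a 2 x 2 matrix A."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)

def pendulum(x):
    """Hypothetical example: damped pendulum with the origin as equilibrium."""
    return [x[1], -math.sin(x[0]) - 0.5 * x[1]]

A = jacobian(pendulum, [0.0, 0.0])
eigs = eigenvalues_2x2(A)
locally_stable = all(ev.real < 0 for ev in eigs)
print(A, eigs, locally_stable)
```

For this example the matrix A is close to [[0, 1], [-1, -0.5]], whose eigenvalues form a complex pair with real part -0.25.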
It is also possible that for any initial state, the trajectory can be made “close enough”
to the equilibrium (without necessarily really converging to it). In that case, we call
We also need a measure describing how fast the system converges to the equilibrium
point. We say that the state x(t) converges exponentially if and only if it converges
to the origin no slower than a certain exponential function. In other words, if and only
if
$$\exists \delta > 0 : \lim_{t \to \infty} \frac{\|x(t)\|}{e^{-\delta t}} = 0$$
Now in the case of a fuzzy controller, let a state space with two variables X1 and
X2 be given, and let the closed-loop system be described by the following vector
differential equation:
ẋ = f(x) + bΦ(x)
where f(x) is the nonlinear function that describes the dynamics of the system without
correction. We have to make sure that f(o) = o, stating that o is an equilibrium.
Furthermore, x as well as b are vectors of dimension n, and Φ(x) is the scalar, nonlinear
control function that represents the correction supplied by the fuzzy rule base.
As a minimal condition, this correction should be zero at the equilibrium, so
we demand that Φ(o) = 0. Using the notations of Section 5, we may consider for any
input vector x that Φ(x) equals the result of a defuzzification operator D on a fuzzy
set µ that results from firing the antecedents of the rule base with the given
input (see Figure 53).
Whether or not the behavior of this closed-loop nonlinear system is stable
will depend on f(x) and Φ(x). With a fixed vector b, the direction of the control
action is entirely determined by the sign of the scaling factor Φ(x). It is therefore
important to determine the subspace of the state space for which Φ(x) = 0. This
subspace will be called the switching line; it divides the state space into regions
with positive and negative control actions (see Figure 54).
Just as was the case in the state space approach described in 9.3, it is fairly
simple to state some heuristic rules that guarantee stability. Generally, a control
system will be stable if Φ(x) is such that the control action bΦ(x) points toward the
switching line, and it will be unstable if it points away from it. The critical case is
the one where the vector field bΦ(x) is parallel to the tangent to the switching line,
in which the influence of the component f(x) becomes dominant. This approach can
also be useful to determine limit cycles, which are caused by multiple crossings of
the switching lines through the coordinate axes, or isolated areas, which have a different
behavior from the dominant area. Isolated areas are closed sets that do not
contain the equilibrium point, and for which the trajectories tend to go around this
area (see Figure 55).
[Figs. 53 and 54: sketches in the (X1, X2) state space showing the vectors f(x), bΦ(x) and f(x) + bΦ(x), and the switching line Φ(x) = 0 together with the control action bΦ(x).]
[Fig. 55: two 5 × 5 state-space grids (rules 1 to 25, axes X1 and X2) with the switching line Φ(x) = 0; the left grid shows a limit cycle, the right grid an isolated area.]
The occurrence of limit cycles and/or isolated areas is often an indication that the
rule base has to be modified. In the case shown in Figure 55, the limit cycle in
the left state space indicates the need to change rules R8, R9, R12, R13, R14, R17 and
R18, while the isolated area in the right state space justifies a change in rule R17.
Apart from this heuristic approach, Lyapunov stability can also be used to define
some indices that quantitatively measure the stability properties. Let us consider a
few particular cases, depending on the dimension of the problem.
1. n = 1
Let X ⊆ R. The mathematical model in the one–dimensional case becomes
ẋ = f (x) + Φ(x)
[Figure: graphs of f(x), Φ(x) and f(x) + Φ(x) over the X-axis.]
The first condition guarantees the equilibrium to be stable (see for instance
Devaney [23]); the second condition prevents the appearance of other equilibria,
which are equivalent to intersections of Φ(x) + f(x) with the X-axis. If Φ(x)
or f(x) are continuously deformed, the loss of a stable equilibrium and the appearance
of supplementary equilibrium points are called bifurcations. Under
reasonable continuity conditions, such new equilibria always
appear pairwise, and one of the two new equilibrium points
will be stable while the other one will be unstable.
These considerations also permit us to define two important indices that
measure stability. Condition (1) can be rewritten as −(Φ′(0) + f′(0)) > 0,
so −(Φ′(0) + f′(0)) can serve as a measure of robustness of the
system against the loss of stability at the origin. Similarly, we could take the infimum
of the distance between Φ(x) and −f(x), being inf |Φ(x) + f(x)|, as a second
stability measure. However, this value would always be zero, since it is invariably
attained at the origin. Therefore, we have to exclude a certain region
around the origin. To this end, define
$$\beta_1 := \sup \{ x < 0 : \Phi(x) = -f(x) \}$$
$$\beta_2 := \inf \{ x > 0 : \Phi(x) = -f(x) \}$$
2. n = 2
Let X ⊆ R2 . The mathematical model describing the fuzzy controller can in this
case be given by the following set of coupled differential equations:
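[Figure: the vector b = (b1, b2) together with −f(x) and Φ(x) in the state space X.]
$$\begin{aligned}
P(\lambda) &= \det(J - \lambda \cdot I) \\
&= \lambda^2 - (a_{11} + a_{22}) \cdot \lambda + a_{11} a_{22} - a_{12} a_{21} \\
&= \lambda^2 - \operatorname{tr}(J) \cdot \lambda + \det(J)
\end{aligned}$$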
where I is the unit matrix, tr(J) := a11 + a22 is the trace of J and det(J) :=
a11 a22 − a12 a21 is the determinant of J.
First, we will generalize the stability of the equilibrium, which was, in the case
n = 1, given by the index I1. A static bifurcation will happen if and only if one of
the roots of the characteristic polynomial P(λ) is zero, i.e. when det(J) = 0. It is
therefore logical to assume that the larger the difference between I1 := det(J) and
0, the more stability the system possesses. On the other hand, a Hopf bifurcation
will occur if and only if two complex eigenvalues cross the imaginary axis, in
which case tr(J) = 0. A second stability index will therefore be defined by I1′ :=
−tr(J). Both these values are generalizations of the index I1 as described in the
case n = 1.
For generalizing the index I2, we must remember that in the one-dimensional
case, we had a bifurcation in case the vector fields Φ(x) and f(x) compensated
each other. However, in that case, we had included b in the
vector field Φ(x). In the two-dimensional case, such a compensation can only occur
in case the vector field f(x) is parallel to the direction given by b =
(b1, b2). We therefore define the auxiliary subspace as the set of all points
(x1, x2) for which
$$\frac{f_1(x_1, x_2)}{b_1} = \frac{f_2(x_1, x_2)}{b_2}$$
The auxiliary subspace is a one-dimensional subspace of the state space. In this
subspace, we can perform an analysis that is similar to the case n = 1. We would
like I2 to be defined as a minimal distance between the plant and the controller
components around the origin, excluding a certain region B around the origin.
This region is obtained by again calculating values equivalent to β1 and β2 in the
linear subspace. Therefore, we find that
3. n> 2
Let X ⊆ Rn . Then we can generalize the previous results straightforwardly. The
mathematical model describing the fuzzy controller is given by the following
system of coupled differential equations:
The generalization of the index I1 in the case of a static bifurcation will become
$$I_1 = a_n = (-1)^n \cdot \det(J)$$
For a Hopf bifurcation, it is necessary to have the same conditions as one
obtains when having two purely imaginary eigenvalues. Therefore, it must be possible to
rewrite the characteristic polynomial as
$$P(\lambda) = P_1(\lambda) \cdot (w^2 + \lambda^2) + b_1 \lambda + b_2$$
for some real w. The condition for having two purely imaginary eigenvalues is in this case
that b1 = b2 = 0.
As an example, let us see what this will become in the case n = 3. The characteristic
polynomial becomes
$$\begin{aligned}
P(\lambda) &= \lambda^3 + a_1 \lambda^2 + a_2 \lambda + a_3 \\
&= (\lambda + a_1)(w^2 + \lambda^2) + (a_2 - w^2)\lambda + a_3 - a_1 w^2
\end{aligned}$$
I1 = a1 · a2 − a3
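For n = 2 and n = 3 the static-bifurcation and Hopf-type quantities discussed above can be evaluated directly from the entries of J. The following Python sketch is only an illustration under stated assumptions: the function names and the two example matrices are hypothetical, and the coefficients a1, a2, a3 are taken with respect to the monic polynomial det(λI − J).

```python
def indices_2d(J):
    """Return (det(J), -tr(J)) for a 2 x 2 Jacobian J."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    trace = J[0][0] + J[1][1]
    return det, -trace

def char_poly_3d(J):
    """Coefficients (a1, a2, a3) of det(lambda*I - J) = l^3 + a1*l^2 + a2*l + a3."""
    a1 = -(J[0][0] + J[1][1] + J[2][2])
    # a2 is the sum of the three principal 2 x 2 minors.
    a2 = (J[0][0] * J[1][1] - J[0][1] * J[1][0]
          + J[0][0] * J[2][2] - J[0][2] * J[2][0]
          + J[1][1] * J[2][2] - J[1][2] * J[2][1])
    det = (J[0][0] * (J[1][1] * J[2][2] - J[1][2] * J[2][1])
           - J[0][1] * (J[1][0] * J[2][2] - J[1][2] * J[2][0])
           + J[0][2] * (J[1][0] * J[2][1] - J[1][1] * J[2][0]))
    a3 = -det  # a_n = (-1)^n * det(J) with n = 3
    return a1, a2, a3

def indices_3d(J):
    """Return (a3, a1*a2 - a3): the static index and the Hopf-type quantity."""
    a1, a2, a3 = char_poly_3d(J)
    return a3, a1 * a2 - a3

# Hypothetical examples with eigenvalues {-1, -2} and {-1, -2, -3}.
J2 = [[0, 1], [-2, -3]]
J3 = [[0, 1, 0], [0, 0, 1], [-6, -11, -6]]
print(indices_2d(J2))
print(indices_3d(J3))
```

Both example matrices are stable, and all returned quantities are strictly positive, in line with the interpretation that larger values indicate a larger distance from a bifurcation.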
As all vector functions x(t) are elements of a normed space X and are given as
functions of a time parameter, we need a norm which makes the space of
all such signals, X, a normed vector space. This can be done by either taking the
L2-norm
$$\|x\|_2 := \sqrt{\int_0^\infty \|x(t)\|^2 \, dt}$$
representing all the signals with finite energy, or the essential supremum norm as
defined in Kolmogorov et al. ([78])
$$\|x\|_\infty := \operatorname{ess\,sup}_{t \ge 0} \|x(t)\|$$
representing all bounded signals, by considering all vectors x(t) for which the respective
norms are finite. Any signal x(t) that is unbounded in time can then be
truncated by defining
$$\forall T > 0 : (x(t))_T := \begin{cases} x(t) & \text{if } t \le T \\ 0 & \text{if } t > T \end{cases}$$
The extended space Xe consists of all signals x(t) for which
$$\forall T > 0 : (x(t))_T \in X$$
Any system G that has input vectors x(t) and output vectors y(t), both
elements of the extended space Xe, can be considered as a relation G ⊆ Xe × Xe by
stating that it contains all pairs (x(t), y(t)) for which y(t) is a possible output for the
input x(t). Hence, any input signal may produce either one output, several outputs
or no outputs at all, so it is wrong to write y(t) = G(x(t)), since G need not be a
well-defined function. The advantage of using relations instead of functions is that we
do not have to consider existence or uniqueness conditions.
The analysis of input–output stability (Figure 58) can be credited to Safonov
([132]) and Vidyasagar ([150]). Consider a feedback system as above, where y is the
control vector and z the control output. G(v) represents the system to be controlled,
while H(u) is the controller. In that regard, v can be considered as the reference
We call a system G finite-gain stable if and only if the gain of G, which we will
define as
$$g(G) := \sup_{x(t) \neq o} \frac{\|(Gx(t))_T\|}{\|(x(t))_T\|}$$
is finite. Put differently, the system output can be made arbitrarily small by making
the inputs small. In case the loop around the system G is closed, with (u, v) as closed-loop
inputs and (y, z) as closed-loop outputs, the idea is that we can obtain small (y, z),
in which y may for instance represent the output error, by making (u, v) small, in
which the former denotes the disturbances to be avoided and the latter the reference
set-point.
One of the main results in this approach is the so-called Small Gain Theorem
([167]), stating that if the loop in the system described above is closed, a sufficient condition
for its stability is that g(G) · g(H) < 1. Two other important criteria which
can be derived from this theorem are the Circle Criterion, which was studied in
Ray et al. [121] and [122], and its generalization, the Conicity Criterion (see for
instance [132]).
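The small gain condition can be illustrated numerically in the simplest possible setting. The Python sketch below assumes that both G and H are memoryless maps y(t) = φ(x(t)), for which the gain reduces to the supremum of |φ(s)| / |s| over nonzero s; the two maps, the sampling grid and all names are hypothetical. Since the supremum is only sampled, the computed values are estimates from below and not a proof of stability.

```python
import math

def sector_gain(phi, samples):
    """Estimate sup |phi(s)| / |s| over the nonzero sample points."""
    return max(abs(phi(s)) / abs(s) for s in samples if s != 0.0)

def plant(s):
    """Hypothetical memoryless system G."""
    return 0.8 * math.tanh(s)

def controller(s):
    """Hypothetical memoryless controller H."""
    return s / (1.0 + abs(s))

# Sample points in [-5, 5] with step 0.01 (an arbitrary choice).
samples = [k / 100.0 for k in range(-500, 501)]
g_G = sector_gain(plant, samples)
g_H = sector_gain(controller, samples)
small_gain_holds = g_G * g_H < 1.0
print(g_G, g_H, small_gain_holds)
```

Both estimated gains are attained near the origin, where the two maps are steepest, and their product stays below 1.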
The use of fuzzy control has always been subject to discussion over the question
whether it actually improves controller schematics (see [103]). While we
are convinced that fuzzy techniques perform at least as well as classical techniques,
especially in low-dimensional systems, we admit at the same time that
for more complex systems, the gain achieved by replacing the classical control
techniques by their fuzzy counterparts is indeed minimal. The most interesting
behavior however arises when fuzzy techniques are crossbred with other relatively
new mathematical theories that try to model other biological processes,
and which have only become a subject of study since the exponential increase in
computing possibilities has taken place. We mainly think of two particular
techniques that have recently acquired a relative success: artificial neural networks,
for which a good introduction can be found in [36], [47] and [50], and
genetic algorithms, see for instance [46], [72] and [136]. While it is not our purpose
to give a detailed description of the theory involved, we will try to summarize
the basics of each of the two techniques, and point out the areas where successful
combinations with fuzzy control techniques are possible. Systems combining
fuzzy control theory with one or more of the above will be referred to as hybrid
systems (see [66]).
In this section, we will first give an overview of the main theory involving neural
networks, followed by an overview of hybrid techniques where successful combinations
of fuzzy set theory and neural networks have been made. We will follow the
approach presented very eloquently by Fullér in [39].
10.1.1 Introduction
[Fig. 59: a neural network whose nodes are arranged in four layers, Layer 1 to Layer 4.]
A neural network is a collection of cellular units, which we will call the nodes or
neurons of the network, and which serve as storage units for (binary) information. Each
neuron is characterized by an activity level, representing the state of its polarization,
an output value, representing its firing rate, and a set of input and output connections,
which we will call synapses, and which connect the neurons as a directed graph would
do (see Figure 59). All these are characterized by real numbers.
Furthermore, the different neurons are ordered in layers, which may or may not
consist of hidden units (to which we will come later). If only one-directional arrows exist,
we will call the neural network a feedforward network; if, moreover, the “previous”
nodes also obtain information from the “succeeding” nodes, e.g. by arrows in the
reverse direction (though there are other methods to achieve this), we will call the
network a feedback network.
Each neuron possesses a finite number of input connections {x1, x2, ..., xn}, each with
an associated weight value wi, which we will call the synaptic strength, and, for the
sake of simplicity, one output connection o, determined as a function of the input
signals as described in Figure 60.
[Fig. 60 Example of a neuron: inputs X1, ..., Xn with weights W1, ..., Wn, threshold θ, transfer function f and output o.]
The output signal is given by the following relationship, in the particular case of
a single output connection:
$$o = f(\langle w, x \rangle) = f\Big( \sum_{j=1}^{n} w_j x_j \Big)$$
where the vector w = (w1, w2, ..., wn) ∈ Rn is called the weight vector. The weights
wj, j = 1, ..., n, assign to each incoming synapse the strength of its effect, hence the name
synaptic strength. It may be positive (excitatory) or negative (inhibitory). The function
f will be called the activation function or transfer function. For this transfer
function, a myriad of possible choices can be made, of which we will only highlight
the most important ones.
10.1.5 Example
It is possible to model both the boolean “AND” and “OR” operators as a neural
network with a binary linear transfer function. Suppose that the two input values x1
and x2 as well as the output value o are in {0, 1}, then the operators are modelled by
the weight vector w = (1/2, 1/2) and threshold θ = 0.6, and the weight vector w = (1, 1)
and threshold θ = 0.8 respectively. Put differently,
$$x_1 \wedge x_2 = \begin{cases} 1 & \text{if } \frac{x_1 + x_2}{2} \ge 0.6 \\ 0 & \text{if } \frac{x_1 + x_2}{2} < 0.6 \end{cases}$$
and
$$x_1 \vee x_2 = \begin{cases} 1 & \text{if } x_1 + x_2 \ge 0.8 \\ 0 & \text{if } x_1 + x_2 < 0.8 \end{cases}$$
Geometrically, for both connectives, this means that the points with output 1 and
those with output 0 can be separated by a (hyper)plane in the state space, for which
the weight vector is perpendicular to this hyperplane (see Figure 61).
[Fig. 61: the “AND” and “OR” operators in the (X1, X2) plane, with the weight vector W and the separating lines (x1 + x2)/2 = 0.6 and x1 + x2 = 0.8.]
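The two threshold neurons of this example are small enough to be checked exhaustively. The Python sketch below uses the weights and thresholds given in the text; the helper name threshold_neuron is a hypothetical choice.

```python
def threshold_neuron(weights, theta):
    """Neuron with binary transfer function: output 1 if <w, x> >= theta, else 0."""
    def fire(inputs):
        s = sum(w * x for w, x in zip(weights, inputs))
        return 1 if s >= theta else 0
    return fire

and_gate = threshold_neuron((0.5, 0.5), 0.6)  # w = (1/2, 1/2), theta = 0.6
or_gate = threshold_neuron((1.0, 1.0), 0.8)   # w = (1, 1), theta = 0.8

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_table = [and_gate(p) for p in inputs]
or_table = [or_gate(p) for p in inputs]
print(and_table, or_table)
```

The resulting truth tables are those of the boolean “AND” and “OR” operators.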
Consequently, it is easy to see that (and this is the major drawback of these
basic neural networks) it is not possible to model the exclusive OR operator,
defined by
$$x_1 \text{ XOR } x_2 := \begin{cases} 1 & \text{if } (x_1 = 0 \text{ and } x_2 = 1) \text{ or } (x_2 = 0 \text{ and } x_1 = 1) \\ 0 & \text{otherwise} \end{cases}$$
by means of a single-neuron network with a binary transfer function (Figure 62).
The reason is that, by elementary geometry, it can readily be seen in
Figure 62 that it is impossible to separate the two white dots from the two black dots
by means of a single hyperplane.
10.1.6 Proposition
It is quite easy to see that in case of a binary linear transfer function, a neuron with
n synapses and a threshold θ is equivalent to a neuron with n + 1 synapses and threshold 0, with
as one additional synapse a constant inhibitory input −1 with fixed weight
θ (see Figure 63).
[Fig. 62: the “XOR” points in the (X1, X2) plane.]
[Fig. 63 Including the threshold value as an extra weight: a neuron with inputs X1, ..., Xn, weights w1, ..., wn and threshold θ, next to the equivalent neuron with threshold 0 and an extra input −1 with weight θ.]
Therefore, without loss of generality, we may always assume the threshold value
to be zero. Graphically, this means that all the hyperplanes described above include
the origin o of the Euclidean state space.
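The equivalence can be checked on a small case. In the Python sketch below, the helper names are hypothetical and the “AND” weights (1/2, 1/2) with threshold 0.6 from Example 10.1.5 serve as test data: the threshold is appended as an extra weight and every input is extended by the constant −1.

```python
def fires(weights, inputs, theta=0.0):
    """Binary transfer function: 1 if <w, x> >= theta, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def absorb_threshold(weights, theta):
    """Append the threshold as the weight of one additional synapse."""
    return tuple(weights) + (theta,)

def extend_input(inputs):
    """Append the constant inhibitory input -1."""
    return tuple(inputs) + (-1,)

w, theta = (0.5, 0.5), 0.6
w_ext = absorb_threshold(w, theta)
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
original = [fires(w, p, theta) for p in points]
extended = [fires(w_ext, extend_input(p)) for p in points]
print(original, extended)
```

Both neurons produce the same outputs, since the condition ⟨w, x⟩ ≥ θ is the same as ⟨w, x⟩ − θ ≥ 0.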
possible, preferably in such a way that the error between the output and the desired
output on the training set is as close to zero as possible. The networks are then tested
for their ability to generalize.
The error-correction learning procedure is based on a simple concept: during training,
an input that is fed into the network generates a set of output values.
The actual output is then compared with the desired output, and if these match, no
change is made to the weights in the net. However, if the output differs from the
target, a change must be made to some of the weights.
An often-needed modification of the binary linear transfer function is the follow-
ing:
The first, and most effective, way to train the weights is the perceptron learning
rule, introduced by Rosenblatt in [126]. Basically, it is an error-correction learning
algorithm for a single-layer feedforward network with a hard transfer function.
Let the weight wij denote the synapse strength between the j-th input
and the i-th output. Let a training set of K input values and their
corresponding output values be given as
for which all output and input values are considered to be in the binary set {1, −1}
(see Figure 64).
Our aim is to find the weight vectors $(w_i)_{i=1}^{m}$, with $w_i := (w_{ij})_{j=1}^{n}$, such that
[Fig. 64: single-layer feedforward network with inputs X1, ..., Xn, outputs o1, ..., om and weights w11, ..., wmn.]
where we define the activation function oi(xk) as the hard transfer function. Given a
parameter η > 0, which we will call the learning rate, the weights will consequently
be adjusted by the following rule:
$$\forall i \in \{1, ..., m\} : w_i^{new} := w_i^{old} + \eta (y_i - o_i) x$$
From this equation it follows that if the desired output is equal to the computed
output, yi = oi , then the weight vector of the i–th output node has reached a stable
state, and will not change anymore. The learning process stops when all the weight
vectors remain unchanged during a complete training cycle.
Therefore, the following algorithm, which we will call the perceptron learning algorithm,
provides a systematic way to determine the weight functions wi:
1. Choose η > 0.
2. Initialize the weight functions wi with small random values; put the error function
E := 0 and let k := 1.
3. The training cycle begins. Take xk and compute the output
$$o_i(x^k) := \begin{cases} 1 & \text{if } \langle w_i, x^k \rangle \ge 0 \\ -1 & \text{if } \langle w_i, x^k \rangle < 0 \end{cases}$$
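A minimal Python sketch of the perceptron learning rule is given below. It applies the update wi := wi + η(yi − oi)x with the hard transfer function and stops when a complete training cycle leaves all weights unchanged. The training set (bipolar “AND” and “OR” targets, with a constant input −1 that absorbs the threshold), the seed, the cap on the number of cycles and all names are assumptions made for illustration.

```python
import random

def hard(s):
    """Hard transfer function with values in {1, -1}."""
    return 1 if s >= 0 else -1

def predict(weights, x):
    """Outputs o_1, ..., o_m of the single-layer network for input x."""
    return tuple(hard(sum(w * xj for w, xj in zip(w_i, x))) for w_i in weights)

def train_perceptron(patterns, n_outputs, eta=0.1, max_cycles=100, seed=0):
    rng = random.Random(seed)
    n = len(patterns[0][0])
    # Small random initial weights, one weight vector per output node.
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n)] for _ in range(n_outputs)]
    for _ in range(max_cycles):
        changed = False
        for x, y in patterns:
            o = predict(weights, x)
            for i in range(n_outputs):
                if o[i] != y[i]:
                    # w_i := w_i + eta * (y_i - o_i) * x
                    weights[i] = [w + eta * (y[i] - o[i]) * xj
                                  for w, xj in zip(weights[i], x)]
                    changed = True
        if not changed:  # a complete cycle without changes: stop
            break
    return weights

# Inputs (x1, x2, -1); targets (AND, OR), everything in {1, -1}.
patterns = [((-1, -1, -1), (-1, -1)), ((-1, 1, -1), (-1, 1)),
            ((1, -1, -1), (-1, 1)), ((1, 1, -1), (1, 1))]
weights = train_perceptron(patterns, n_outputs=2)
learned = [predict(weights, x) for x, _ in patterns]
print(learned)
```

Both targets are linearly separable, so the perceptron convergence theorem ensures that the loop terminates after finitely many corrections.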
can also be obtained as the outcome of a gradient descent method. For a description
of this method, see for instance [3]. More generally, this rule is known in the literature
as the delta learning rule. The basic idea of the delta learning rule is to define a
measure of the overall performance of the system, and then to find a way to optimize
that performance. In our network, we can define the performance of the system as
the total error function
$$E = \sum_{k=1}^{K} E_k = \frac{1}{2} \sum_{k=1}^{K} \| y^k - o^k \|^2$$
Then
$$\begin{aligned}
E &= \frac{1}{2} \sum_{k=1}^{K} \sum_{i=1}^{m} (y_i^k - o_i^k)^2 \\
&= \frac{1}{2} \sum_{k=1}^{K} \sum_{i=1}^{m} (y_i^k - \langle w_i, x^k \rangle)^2
\end{aligned}$$
The goal, then, is to minimize this function. As it turns out, if the output functions
are differentiable, we change the weights of the system in proportion to the
derivative of the error with respect to the weights. The rule for changing weights
is given by minimizing the quadratic error function by using the following iteration
process:
$$w_{ij}^{new} := w_{ij}^{old} - \eta \frac{\partial E}{\partial w_{ij}}$$
In particular, using the chain rule,
$$\begin{aligned}
\frac{\partial E}{\partial w_{ij}} &= \frac{\partial E}{\partial o_i} \frac{\partial o_i}{\partial w_{ij}} \\
&= -(y_i - o_i) x_j
\end{aligned}$$
yielding the same update formula. If we have only one output unit then the delta
learning rule collapses into
$$w^{new} := w^{old} + \eta (y - o) x = w^{old} + \eta \delta x$$
with δ denoting the difference between the desired and the computed output; hence
the name “delta learning rule”.
In conclusion, the standard delta rule essentially implements the gradient descent
method on the sum-squared error for linear activation functions. The use of the delta
learning rule, which is a generalization of the discrete perceptron training rule, in
neural network training should be credited to McClelland and Rumelhart in [92].
It is sometimes also called the continuous perceptron training rule.
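For a single linear output unit, the delta learning rule amounts to repeating the update w := w + ηδx with δ = y − o over the training set. The Python sketch below is a minimal illustration; the training data (generated from the hypothetical linear target y = 2x1 − x2), the zero initial weights, the learning rate and the number of cycles are assumptions.

```python
def train_delta_rule(patterns, eta=0.1, cycles=500):
    """Delta learning rule for one linear output unit o = <w, x>."""
    n = len(patterns[0][0])
    w = [0.0] * n  # zero start for reproducibility; small random values work as well
    for _ in range(cycles):
        for x, y in patterns:
            o = sum(wj * xj for wj, xj in zip(w, x))
            delta = y - o  # desired minus computed output
            w = [wj + eta * delta * xj for wj, xj in zip(w, x)]
    return w

def total_error(w, patterns):
    """Sum-squared error E = 1/2 * sum_k (y^k - <w, x^k>)^2."""
    return 0.5 * sum((y - sum(wj * xj for wj, xj in zip(w, x))) ** 2
                     for x, y in patterns)

# Patterns generated by the hypothetical target y = 2*x1 - x2.
patterns = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0),
            ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]
w = train_delta_rule(patterns)
print(w, total_error(w, patterns))
```

Because the patterns are exactly linear, the learned weights approach (2, −1) and the total error tends to zero; for data that is not derived from a linear function, a residual error remains.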
If we use a linear output unit then, whatever the final weight vector is, the output
function of the network is linear, which means that the delta learning
rule with linear output function can approximate only a pattern set derived from an
almost linear function, which is, needless to say, unsatisfactory in certain real-world
applications. Therefore, activation functions other than the binary or hard transfer
functions are also commonly used, especially for their differentiability properties,
which allow the learning rule to be derived from a similar gradient descent method with a different
output function. The unipolar sigmoidal activation function is one such
commonly used example.
10.1.13 Example
For the sake of simplicity we will explain the learning algorithm in the case
of a multiple-input, single-output (MISO) network, with input functions x =
{x1 , x2 , ..., xn }, weight functions w = {w1 , w2 , ..., wn }, and a single neuron output o.
Suppose a training set
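The rule for changing weights following presentation of input–output pair (xk, yk)
will be given by the gradient descent method, i.e. we minimize the quadratic error
$$w_j^{new} := w_j^{old} - \eta \frac{\partial E_k}{\partial w_j}$$
where
$$\frac{\partial E_k}{\partial w_j} = -\left( y^k - \frac{1}{1 + e^{-\langle w, x^k \rangle}} \right) \frac{1}{1 + e^{-\langle w, x^k \rangle}} \left( 1 - \frac{1}{1 + e^{-\langle w, x^k \rangle}} \right) x_j^k$$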
Then the following algorithm, which we will call the delta learning rule for a unipolar
sigmoidal activation function, provides a systematic way to determine the weight
functions wi:
1. Choose η > 0 and Emax > 0.
2. Initialize the weight functions wi with small random values; put the error function
E := 0 and let k := 1.
3. The training cycle begins. Take xk and compute the output
$$o^k = o(\langle w, x^k \rangle) = \frac{1}{1 + e^{-\sum_{j=1}^{n} w_j x_j^k}}$$
4. Update the weights by putting
In case of the unipolar sigmoidal activation function, without hidden units, the error
surface is shaped like a bowl with only one minimum, so gradient descent is eventually
guaranteed to find a globally optimal set of weights. With for instance the
presence of hidden units, however, it is not so obvious how to compute the derivatives,
and the error surface is not concave upwards, so there is the danger of getting
stuck in local minima. We then use the delta learning rule with the bipolar sigmoidal
activation function
$$o(\langle w, x \rangle) = \frac{2}{1 + e^{-\langle w, x \rangle}} - 1 = \frac{2}{1 + e^{-\sum_{j=1}^{n} w_j x_j}} - 1$$
instead. It is left as a verification for the reader that the gradient descent method then
yields a weight update algorithm where
$$w^{new} := w^{old} + \frac{1}{2} \eta (y - o)(1 - o^2) x$$
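Both sigmoidal variants can be compared in a short Python sketch. The unipolar update used here, w := w + η(y − o)o(1 − o)x, follows from the gradient of the quadratic error for the unipolar sigmoid, and the bipolar update is the one stated above. The weights, the pattern, the learning rate and all function names are hypothetical.

```python
import math

def unipolar(s):
    return 1.0 / (1.0 + math.exp(-s))

def bipolar(s):
    return 2.0 / (1.0 + math.exp(-s)) - 1.0

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def squared_error(activation, w, x, y):
    """Quadratic error 1/2 * (y - o)^2 for a single pattern."""
    return 0.5 * (y - activation(dot(w, x))) ** 2

def unipolar_step(w, x, y, eta):
    """w := w + eta * (y - o) * o * (1 - o) * x."""
    o = unipolar(dot(w, x))
    factor = eta * (y - o) * o * (1.0 - o)
    return [wj + factor * xj for wj, xj in zip(w, x)]

def bipolar_step(w, x, y, eta):
    """w := w + 1/2 * eta * (y - o) * (1 - o^2) * x."""
    o = bipolar(dot(w, x))
    factor = 0.5 * eta * (y - o) * (1.0 - o * o)
    return [wj + factor * xj for wj, xj in zip(w, x)]

w0, x, y, eta = [0.1, -0.2], [1.0, 2.0], 1.0, 0.5
err_uni_before = squared_error(unipolar, w0, x, y)
err_uni_after = squared_error(unipolar, unipolar_step(w0, x, y, eta), x, y)
err_bi_before = squared_error(bipolar, w0, x, y)
err_bi_after = squared_error(bipolar, bipolar_step(w0, x, y, eta), x, y)
print(err_uni_before, err_uni_after, err_bi_before, err_bi_after)
```

In both cases a single update step reduces the error on the presented pattern, as expected from a gradient descent step with a moderate learning rate.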
as unknown, variable vectors that need to be “learned”. Prior to the learning, the
normalization of all (randomly chosen) weight vectors is required, such that ∀j ∈
{1, ..., m} : ‖wj‖ = 1. The weight adjustment criterion for this mode of training is
the selection of an index r such that
$$\|x - w_r\| = \min_{j=1,...,m} \|x - w_j\|,$$
[Fig. 65: the input vector x and the weight vectors wj drawn in two dimensions.]
Graphically, since the scalar product < w j , x > is the projection of x on the direction
of w j , we are in fact looking for the weight vector w j that is closest to x. In two
dimensions, consider the example given in Figure 65.
The winning weight vector is w1 , being the most similar to the vector x. With the
similarity criterion being the value of cos (x, w j ), the weight vector lengths should
be identical for this particular way of training. However, their directions should not
be modified. Intuitively, it is clear that a very long weight vector could lead to a very
large output value for its associated neuron, even if there was a large angle between
the weight vector and the pattern. This explains the need for weight normalization.
After one optimally located neuron has been identified and declared a winner, its
weight must be adjusted so that the distance ‖x − wr‖ is reduced in the current training
step, preferably along the gradient direction. Now, using the gradient descent
method,
$$\begin{aligned}
\frac{\partial \|x - w_r\|^2}{\partial w_{ir}} &= \frac{\partial}{\partial w_{ir}} \langle x - w_r, x - w_r \rangle \\
&= \frac{\partial}{\partial w_{ir}} \left( \langle x, x \rangle - 2 \langle w_r, x \rangle + \langle w_r, w_r \rangle \right) \\
&= -2 \frac{\partial (w_{1r} x_1 + w_{2r} x_2 + ... + w_{nr} x_n)}{\partial w_{ir}} + \frac{\partial (w_{1r}^2 + w_{2r}^2 + ... + w_{nr}^2)}{\partial w_{ir}} \\
&= -2 x_i + 2 w_{ir} \\
&= -2 (x_i - w_{ir})
\end{aligned}$$
Transformed into vector notation, the following adaptation rule for the weight function
must be carried out:
$$\begin{aligned}
w_r^{new} &:= w_r^{old} + \eta (x - w_r^{old}) \\
&= (1 - \eta) w_r^{old} + \eta x
\end{aligned}$$
where the constant 2 is, without loss of generality, incorporated in the learning rate
parameter η . It seems reasonable to reward the weights of the winning neuron with
an increment of weight in the negative gradient direction x − wr . The remaining
weight vectors are left unaffected. Note that from this identity, it follows that the
updated weight vector is a convex linear combination of the old weight and the
pattern vectors, as can be seen in the last equation.
$$w_r^{new} := w_r^{old} + \eta (x - w_r^{old})$$
and or := 1.
2. Normalize the weight vectors by putting
$$w_r^{new} := \frac{w_r^{old}}{\|w_r^{old}\|}$$
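One step of this winner-take-all scheme can be sketched in Python as follows: the winner is the weight vector closest to x, only that vector is moved toward x, and it is renormalized afterwards. The three initial weight vectors, the pattern, the learning rate and all names are assumptions for illustration.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def normalize(v):
    length = norm(v)
    return [c / length for c in v]

def distance(a, b):
    return norm([ai - bi for ai, bi in zip(a, b)])

def winner_take_all_step(weights, x, eta):
    """Update the winning weight vector in place and return its index r."""
    r = min(range(len(weights)), key=lambda j: distance(x, weights[j]))
    moved = [(1.0 - eta) * wi + eta * xi for wi, xi in zip(weights[r], x)]
    weights[r] = normalize(moved)  # keep the weight vector on the unit sphere
    return r

# Three normalized weight vectors and a normalized pattern (hypothetical data).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
x = normalize([1.0, 0.5])
dist_before = distance(x, weights[0])
winner = winner_take_all_step(weights, x, eta=0.5)
dist_after = distance(x, weights[winner])
print(winner, dist_before, dist_after)
```

The losing weight vectors are left untouched, and the winner ends up closer to the pattern while keeping unit length.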
[Fig. 67 The XOR-perceptron: network diagram with inputs x and y, neurons l1, l2, l3 and output 1 − z.]
latter. Remark also that in many practical cases instead of linear activation functions
we use semi-linear ones.
We recall that the “XOR” problem mentioned above cannot be solved by a single-layer
perceptron neural network. This result is due to Minsky and Papert
([100]), who proved it. As a solution, a supplementary layer, which will be called
a hidden layer, is needed. The neural network shown in Figure 67 is known to do
the trick, with the numbers in the neurons denoting the threshold values.
If the only possible outputs of the neurons are 0 and 1, then it is easy to see that
with the above weight functions and threshold values, z = 1 if and only if (x, y) =
(0, 0) or (x, y) = (1, 1).
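That one hidden layer suffices can also be checked with a small Python sketch. The weights and thresholds below are not those of Figure 67; they are a hypothetical choice that reuses the “OR” and “AND” neurons of Example 10.1.5 as hidden units and combines them in one output neuron.

```python
def threshold_neuron(weights, theta):
    """Neuron with binary transfer function: output 1 if <w, x> >= theta, else 0."""
    def fire(inputs):
        s = sum(w * x for w, x in zip(weights, inputs))
        return 1 if s >= theta else 0
    return fire

# Hidden layer: the "OR" and "AND" neurons of Example 10.1.5.
hidden_or = threshold_neuron((1.0, 1.0), 0.8)
hidden_and = threshold_neuron((0.5, 0.5), 0.6)
# Output neuron (hypothetical weights): fires if OR is on and AND is off.
output = threshold_neuron((1.0, -1.0), 0.5)

def xor_network(x, y):
    return output((hidden_or((x, y)), hidden_and((x, y))))

xor_table = [xor_network(x, y) for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(xor_table)
```

The output neuron separates the hidden-layer patterns (1, 0) from (0, 0) and (1, 1), which is possible with a single hyperplane even though the original inputs are not linearly separable.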
This calls for an interesting generalization. If we study networks with a supplementary
layer, the delta learning rule should also be generalized to neural networks
with a two-layer (or three-layer, if the nodes are counted instead of the synapses)
architecture. Such a network in its most elementary form may look like
Figure 68.
[Fig. 68: a network with inputs x1, ..., xn, L hidden nodes h1, ..., hL and outputs o1, ..., om, with weights w11, ..., wLn between inputs and hidden nodes and w11, ..., wmL between hidden nodes and outputs.]
A layer with neurons whose outputs are inaccessible to the user, and thus not
comparable to a given data set, will be called a hidden layer.
The generalized delta rule is the most often used supervised learning algorithm in
the study of multilayer neural networks. For reasons of simplicity, we will restrict
ourselves to the study of a neural network with one input layer with n inputs x =
(x1, ..., xn), one hidden layer with L nodes (h1, ..., hL), which we, so to speak, cannot
externally control, and one output node o. Denote the weights between input
xi and hidden node hl by wli, in vector notation wl = (wl1, ..., wln), and the
weights between hidden node hl and the output node o by Wl, in vector
notation W = (W1, ..., WL).
Let furthermore a training set ((xk, yk)), k = 1, ..., K, be given. The problem is
to adjust the weights in such a manner that the total error of the system is minimized
with respect to the given input and output values. Furthermore, we opt for the
unipolar sigmoidal activation function as output function (of course other
options as transfer function are possible), for the hidden layer as well as for the
generated output. Hence we define the internal output layer as
$$\forall l \in \{1, ..., L\} : o_l^k(\langle w_l, x^k \rangle) = \frac{1}{1 + e^{-\langle w_l, x^k \rangle}} = \frac{1}{1 + e^{-\sum_{j=1}^{n} w_{lj} x_j^k}}$$
and if we put the output vector of the hidden layer as ok := (ok1, ..., okL), then we
define the external output layer as
$$O^k(\langle W, o^k \rangle) = \frac{1}{1 + e^{-\langle W, o^k \rangle}} = \frac{1}{1 + e^{-\sum_{l=1}^{L} W_l o_l^k}}$$
Again, the appropriate rule for adapting the weights is given by the gradient
descent method. Given a learning rate η > 0, we adapt the external and internal
weights by the following iteration process:
$$w_{lj}^{new} := w_{lj}^{old} - \eta \frac{\partial E(W, w)}{\partial w_{lj}}$$
$$W_l^{new} := W_l^{old} - \eta \frac{\partial E(W, w)}{\partial W_l}$$
Analogously to the calculations in Section 10.1.12, and making use of the chain
rule for differentiation, the rules for changing weights will turn out to be, in vector
notation,
In summary, the following algorithm, which we will call the generalized delta
learning rule, presented here for the particular case of a unipolar sigmoidal
activation function, provides a systematic way to determine the weight functions
wl j and Wl :
1. Choose η > 0 and Emax > 0.
2. Initialize the weight functions wi with small random values; put the error function
E := 0 and let k := 1.
3. The training cycle begins. Take xk , determine the output.
$$\forall l \in \{1, \dots, L\} : \quad o_l = \frac{1}{1 + e^{-\langle w_l, x^k \rangle}}$$
and
$$O = \frac{1}{1 + e^{-\langle W, o \rangle}}$$
4. Update the output weights by putting
Wnew := Wold + ηδ o
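The remaining steps of the algorithm are not legible in this copy. As a rough illustration only, the following Python sketch shows one common form of the generalized delta rule for the network described above. The error signal delta = (y − O) O (1 − O), the hidden weight update and the stopping test E < Emax are assumptions based on the standard derivation for a unipolar sigmoid and the quadratic error; the function names are hypothetical.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def forward(w, W, x):
    """Return the hidden outputs o_1..o_L and the network output O."""
    o = [sigmoid(sum(wj * xj for wj, xj in zip(wl, x))) for wl in w]
    O = sigmoid(sum(Wl * ol for Wl, ol in zip(W, o)))
    return o, O

def update(w, W, x, y, eta):
    """One gradient step on a single pattern; returns the error before the step."""
    o, O = forward(w, W, x)
    delta = (y - O) * O * (1.0 - O)          # assumed output error signal
    for l, wl in enumerate(w):               # hidden weights, using the old W
        g = delta * W[l] * o[l] * (1.0 - o[l])
        for j in range(len(wl)):
            wl[j] += eta * g * x[j]
    for l in range(len(W)):                  # W_new := W_old + eta * delta * o
        W[l] += eta * delta * o[l]
    return 0.5 * (y - O) ** 2

def train(patterns, L, eta=0.5, E_max=0.01, max_cycles=10000, seed=0):
    rng = random.Random(seed)
    n = len(patterns[0][0])
    # small random initial weights
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(L)]
    W = [rng.uniform(-0.5, 0.5) for _ in range(L)]
    for _ in range(max_cycles):
        E = sum(update(w, W, x, y, eta) for x, y in patterns)
        if E < E_max:                        # assumed stopping criterion
            break
    return w, W, E
```

A constant input component equal to 1 can be appended to each pattern if a bias term is wanted; the sketch itself does not add one.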
10.1.22 Theorem
The previous result can be refined by using the Stone–Weierstrass theorem from
real analysis, to show that certain neural network architectures possess the universal
approximation capability. By using the Stone–Weierstrass theorem in the design of
our networks, we also guarantee that these can compute certain polynomial expres-
sions of a certain set of given functions, as follows:
10.2.1 Introduction
The aim of any hybrid system is to join the strengths of several intelligent
computing techniques, and hence to reinforce the control method as a whole. Every
intelligent technique has particular computational properties that make it suited
to particular problems and not to others. For example, neural networks
have a particularly strong reputation when it comes to solving pattern
recognition problems, but perform rather poorly at decision making.
On the other hand, fuzzy logic is a very suitable instrument for making decisions
and for studying the transparency of how a certain decision is reached, but its design
is not at all suited to, e.g., automatically generating the rules that are responsible
for those decisions. These limitations have been a central driving force behind
the creation of intelligent hybrid systems, in which two or more techniques are combined
in such a way that they reinforce each other's strengths and overcome each other's
limitations.
Also, hybrid systems are designed to take into account the “best of both worlds”
when trying to model an application, which may be of a very variable nature, and
therefore may be a complex superposition of different components that require a
different approach. For instance, when some application consists of a combination
with k ∈ {1, ..., K} be given. Then every combination of input and output vectors
can be considered as a training pattern for a neural network, where the antecedent is
the input and the consequence is the output for a neural net. Such a neural net with
fuzzy sets as inputs and outputs will be called a fuzzy neural net.
k : IF (X = Ak ) THEN (Y = Bk )
for which the input–output training pairs are ((Ak , Bk ), (Ck , Dk )).
One of the simplest methods to incorporate the fuzzy component into a neural
network is to sample the fuzzy sets at a discrete number of input and output points,
and to use the resulting membership values as input and output values for the neural network ([147]). Let us for
instance consider a SISO network, let [α1 , α2 ] be an interval containing all possible input
values, such that
∀k ∈ {1, ..., K} : supp(µAk ) ⊆ [α1 , α2 ]
and let [β1 , β2 ] be an interval containing all possible output values, such that
∀k ∈ {1, ..., K} : supp(µBk ) ⊆ [β1 , β2 ]
Then we divide the intervals [α1 , α2 ] and [β1 , β2 ] into equal parts. Choose two arbitrary
constants M, N ∈ N0 , then put
$$\forall i \in \{0, \dots, M\} : \quad x_i := \alpha_1 + \frac{i}{M} (\alpha_2 - \alpha_1)$$
$$\forall j \in \{0, \dots, N\} : \quad y_j := \beta_1 + \frac{j}{N} (\beta_2 - \beta_1)$$
Then a discrete version of the continuous training set is given by the input/output
pairs
$$\{((A_k(x_0), A_k(x_1), \dots, A_k(x_M)), (B_k(y_0), B_k(y_1), \dots, B_k(y_N)))\}_{k=1}^{K}$$
Putting $a_{ki} = A_k(x_i)$ and $b_{kj} = B_k(y_j)$, the fuzzy neural network reduces to an ordinary
neural network with (M + 1) inputs and (N + 1) outputs, which can be trained
by the generalized delta rule from 10.1.21.
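The discretization above can be sketched in a few lines of Python. The triangular membership shape and the function names are hypothetical choices made only for this illustration; any membership function with support in the given interval could be used instead.

```python
def discretize(mu, lower, upper, steps):
    """Sample mu at lower + i/steps * (upper - lower) for i = 0..steps."""
    return [mu(lower + i / steps * (upper - lower)) for i in range(steps + 1)]

def triangular(center, spread):
    # hypothetical membership shape, used only for illustration
    return lambda x: max(0.0, 1.0 - abs(x - center) / spread)

def training_pair(mu_A, mu_B, alpha, beta, M, N):
    """Return the (M + 1)-vector of inputs and the (N + 1)-vector of outputs."""
    return (discretize(mu_A, alpha[0], alpha[1], M),
            discretize(mu_B, beta[0], beta[1], N))

# One rule "IF X = A THEN Y = B" becomes one crisp training pattern.
a, b = training_pair(triangular(0.5, 0.5), triangular(2.0, 1.0),
                     (0.0, 1.0), (1.0, 3.0), M=4, N=2)
```

The vectors a and b can then be fed to any ordinary network with M + 1 inputs and N + 1 outputs.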
10.2.5 Modifications
Uehara and Fujise proposed in [146] to work with a finite number of α -levels of
the fuzzy set to represent the fuzzy numbers, which leads to a generally similar
approach.
Another idea is to replace, in selected applications, certain elements in the definition
of a neural network by their counterparts from fuzzy set theory. These generally
simple modifications lead to a fuzzy neural architecture based on fuzzy arithmetic
operations. While the transfer function is usually given by
$$o(\langle w, x \rangle) = f(\langle w, x \rangle) = f\left(\sum_{j=1}^{n} w_j x_j\right)$$
a more general definition is the following, under the additional condition
that the arguments x j as well as the weight functions w j lie in [0, 1];
otherwise a rescaling is required:
A hybrid neural network is a neural network with crisp signals and weight functions
in [0, 1], crisp transfer functions f : [0, 1] → [0, 1], but where the following deviations
with respect to an ordinary neural network are allowed:
1. Instead of combining x j and w j to the product w j x j , any t–norm (or t–conorm, or
other continuous operation) is allowed.
2. Instead of combining w1 x1 , w2 x2 , ..., wn xn to the sum $\sum_{j=1}^{n} w_j x_j$, any t–conorm (or
t–norm, or other continuous operation) is allowed.
3. f may be replaced by any continuous function from the input set to the output
set.
Conversely, a hybrid neural net may not use multiplication, addition, or a sigmoidal
function (because the results of these operations are not necessarily in the unit
interval). A processing element of a hybrid neural net is called a fuzzy neuron.
10.2.7 Examples
1. Let a t–norm T, a t–conorm S and a weight vector (w1 , w2 ) be given. Then the
output function, which we will call the AND–composition, is given by
In particular if T = min and S = max, we call this fuzzy neuron the min–max
composition.
2. Again let a t–norm T, a t–conorm S and a weight vector (w1 , w2 ) be given. Then
the output function, which we will call the OR–composition, is given by
In particular if T = min and S = max, we call this fuzzy neuron the max–min
composition.
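The two output formulas are not legible in this copy. The sketch below therefore assumes the definitions that are commonly used for these fuzzy neurons, namely T(S(w1, x1), S(w2, x2)) for the AND–composition and S(T(w1, x1), T(w2, x2)) for the OR–composition, which agree with the names min–max and max–min for T = min and S = max. The function names are hypothetical.

```python
def and_neuron(x, w, T=min, S=max):
    """Assumed AND-composition: T(S(w1, x1), S(w2, x2))."""
    return T(S(w[0], x[0]), S(w[1], x[1]))

def or_neuron(x, w, T=min, S=max):
    """Assumed OR-composition: S(T(w1, x1), T(w2, x2))."""
    return S(T(w[0], x[0]), T(w[1], x[1]))

def product(a, b):
    # product t-norm
    return a * b

def probabilistic_sum(a, b):
    # t-conorm dual to the product t-norm
    return a + b - a * b

x, w = (0.3, 0.8), (0.6, 0.4)
min_max = and_neuron(x, w)                                   # min(0.6, 0.8)
max_min = or_neuron(x, w)                                    # max(0.3, 0.4)
smooth = or_neuron(x, w, T=product, S=probabilistic_sum)     # another t-norm pair
```

Because all operations map the unit square into the unit interval, the outputs stay in [0, 1] without any rescaling.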
It is now quite simple to change the arguments and weight functions of a
hybrid neural network from elements in [0, 1] to fuzzy sets which are elements in
F(X). All definitions above remain valid when the arguments are fuzzy sets, and
the operations are naturally extended pointwise in the image space
of the fuzzy sets.
The most effective way in which a subprocess of fuzzy control can benefit from
neural network techniques is by having the network steer the process of adjusting the
parameters of the fuzzy linguistic variables. Since the effectiveness of fuzzy models
representing nonlinear input–output relationships depends strongly on the way
in which the input–output spaces are partitioned, the tuning of membership functions
will always be a very important issue in fuzzy modelling. Since this tuning task can
be viewed as an optimization problem, neural networks offer a possibility for solving
it effectively. It is also reasonable to assume that the membership functions belong
to a certain parametric class of shapes that are heuristically feasible, yet sufficiently
adjustable, so that the parameters can be trained by a neural network, given
once again a set of correct training input–output values.
Let the fuzzy training data be given by
and let us for the set of fuzzy rules particularly focus on a Sugeno controller (see
Section 7.2) with rules
i : IF (X1 = Ai1 ) and (X2 = Ai2 ) and ... and (Xn = Ain ) THEN (Y = zi )
and the output of the system will be computed by a discretized version of the Center-
of-Gravity defuzzification method as
$$o^k := \frac{\sum_{i=1}^{m} \alpha_i(x^k) z_i}{\sum_{i=1}^{m} \alpha_i(x^k)}$$
First of all, we can derive the most appropriate values for zi by minimizing the
total error function of the quadratic sum of the errors and using a gradient descent
method.
$$E = \sum_{k=1}^{K} E_k = \sum_{k=1}^{K} \frac{1}{2} (o^k - y^k)^2$$
$$z_i(t+1) = z_i(t) - \eta \frac{\partial E_k}{\partial z_i} = z_i(t) - \eta (o^k - y^k) \frac{\alpha_i(x^k)}{\sum_{j=1}^{m} \alpha_j(x^k)}$$
Here t counts the number of adjustments made to the parameters, and can therefore
be considered as a discrete time parameter.
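As a small illustration of the update rule for the consequents, the following Python sketch computes the discretized Center-of-Gravity output and performs one gradient step on the values zi. The firing levels are passed in as plain numbers; the function names are hypothetical.

```python
def sugeno_output(alphas, z):
    """Weighted average of the consequents z_i with firing levels alpha_i."""
    return sum(a * zi for a, zi in zip(alphas, z)) / sum(alphas)

def update_consequents(alphas, z, y, eta):
    """One step z_i := z_i - eta * (o - y) * alpha_i / sum_j alpha_j."""
    o = sugeno_output(alphas, z)
    total = sum(alphas)
    return [zi - eta * (o - y) * a / total for a, zi in zip(alphas, z)]

alphas, z, y = [0.25, 0.75], [0.0, 4.0], 2.0
o_before = sugeno_output(alphas, z)              # (0.25*0 + 0.75*4) / 1
z_new = update_consequents(alphas, z, y, eta=1.0)
o_after = sugeno_output(alphas, z_new)
```

Rules with a larger firing level receive a larger share of the correction, which is exactly what the factor in front of the error expresses.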
But also the parameters of fuzzy numbers in the premises can be adjusted by
the gradient descent method. Rather than explaining all available possibilities for a
wide scope of choices for the fuzzy set shapes, we will use an example to illustrate
the process.
10.2.9 Example
Consider a fuzzy controller consisting of two fuzzy rules with one input and one
output variable, as follows:
1 : IF (x = A1 ) THEN (Y = z1 )
2 : IF (x = A2 ) THEN (Y = z2 )
where a1 , a2 , b1 and b2 are adjustable parameters for the premises. Let a given value
x be the input to the fuzzy system, and let the firing levels of the rules be α1 :=
µA1 (x) and α2 := µA2 (x). Then the output of the system is computed by the discrete
COG-defuzzification as
$$o = \frac{\alpha_1 z_1 + \alpha_2 z_2}{\alpha_1 + \alpha_2}$$
Suppose furthermore that we have a training set $\{(x^k, y^k)\}_{k=1}^{K}$ at our disposal. Then
our problem is reduced to finding the two fuzzy rules with appropriate membership
functions and consequence parts that generate the given input-output pairs. This
means that we have to adjust the following parameters:
• a1 , a2 , b1 and b2 , the parameters of the fuzzy numbers representing the linguistic
variables
• z1 and z2 , the values of the consequences of the Sugeno controller
Once more, we will use the gradient descent method on the total sum of quadratic
errors
$$E = \sum_{k=1}^{K} E_k = \sum_{k=1}^{K} \frac{1}{2} \left(o^k(a_1, a_2, b_1, b_2, z_1, z_2) - y^k\right)^2$$
where ok is the computed output from the fuzzy system corresponding to the input
pattern xk , and yk is the desired output.
First of all, we determine the adjustment for zi in the consequence; that is,
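$$\begin{aligned}
z_1(t+1) &= z_1(t) - \eta \frac{\partial E_k}{\partial z_1} \\
&= z_1(t) - \eta (o^k - y^k) \frac{\alpha_1}{\alpha_1 + \alpha_2} \\
&= z_1(t) - \eta (o^k - y^k) \frac{\mu_{A_1}(x^k)}{\mu_{A_1}(x^k) + \mu_{A_2}(x^k)}
\end{aligned}$$
$$\begin{aligned}
z_2(t+1) &= z_2(t) - \eta \frac{\partial E_k}{\partial z_2} \\
&= z_2(t) - \eta (o^k - y^k) \frac{\alpha_2}{\alpha_1 + \alpha_2} \\
&= z_2(t) - \eta (o^k - y^k) \frac{\mu_{A_2}(x^k)}{\mu_{A_1}(x^k) + \mu_{A_2}(x^k)}
\end{aligned}$$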
In a similar manner we can find the shape parameters (center and slope) of the
membership functions µA1 and µA2 :
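$$a_1(t+1) = a_1(t) - \eta \frac{\partial E_k}{\partial a_1}, \qquad a_2(t+1) = a_2(t) - \eta \frac{\partial E_k}{\partial a_2}$$
$$b_1(t+1) = b_1(t) - \eta \frac{\partial E_k}{\partial b_1}, \qquad b_2(t+1) = b_2(t) - \eta \frac{\partial E_k}{\partial b_2}$$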
Let us furthermore assume that the parameters of the fuzzy membership functions
are not independent. In fact, it is reasonable to assume that a := a1 = a2 and b :=
b1 = −b2 . In that case the fuzzy membership functions become
$$\mu_{A_1}(x) = \frac{1}{1 + e^{b(x-a)}}, \qquad \mu_{A_2}(x) = \frac{1}{1 + e^{-b(x-a)}}$$
and form a partition of unity (see Definition 2.3.1, and for example, see Figure 69)
since
∀x : µA1 (x) + µA2 (x) = 1.
In that case, the number of membership function parameters to be adjusted is halved
(from four to two), which simplifies the algorithm, and we get
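$$\begin{aligned}
a(t+1) &= a(t) - \eta \frac{\partial E_k(a, b)}{\partial a} \\
&= a(t) - \eta (o^k - y^k) \frac{\partial o^k}{\partial a} \\
&= a(t) - \eta (o^k - y^k) \frac{\partial}{\partial a} \left(z_1 \mu_{A_1}(x^k) + z_2 \mu_{A_2}(x^k)\right) \\
&= a(t) - \eta (o^k - y^k) \frac{\partial}{\partial a} \left(z_1 \mu_{A_1}(x^k) + z_2 (1 - \mu_{A_1}(x^k))\right) \\
&= a(t) - \eta (o^k - y^k)(z_1 - z_2) \frac{\partial \mu_{A_1}(x^k)}{\partial a} \\
&= a(t) - \eta (o^k - y^k)(z_1 - z_2) \frac{b \, e^{b(x^k - a)}}{\left(1 + e^{b(x^k - a)}\right)^2}
\end{aligned}$$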
Fig. 69 Complementary fuzzy partition (graphs of µ1(x) and µ2(x))
and
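$$\begin{aligned}
b(t+1) &= b(t) - \eta \frac{\partial E_k(a, b)}{\partial b} \\
&= b(t) - \eta (o^k - y^k) \frac{\partial o^k}{\partial b} \\
&= b(t) - \eta (o^k - y^k) \frac{\partial}{\partial b} \left(z_1 \mu_{A_1}(x^k) + z_2 \mu_{A_2}(x^k)\right) \\
&= b(t) - \eta (o^k - y^k) \frac{\partial}{\partial b} \left(z_1 \mu_{A_1}(x^k) + z_2 (1 - \mu_{A_1}(x^k))\right) \\
&= b(t) - \eta (o^k - y^k)(z_1 - z_2) \frac{\partial \mu_{A_1}(x^k)}{\partial b} \\
&= b(t) + \eta (o^k - y^k)(z_1 - z_2) \frac{(x^k - a) \, e^{b(x^k - a)}}{\left(1 + e^{b(x^k - a)}\right)^2}
\end{aligned}$$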
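The partial derivatives used in the two update rules can be checked numerically. The following Python sketch assumes the complementary sigmoidal membership functions given above, for which e^{b(x−a)} / (1 + e^{b(x−a)})^2 equals µA1 (x)(1 − µA1 (x)), and compares the analytic gradients of Ek with central finite differences. The function names and the test values are hypothetical.

```python
import math

def mu1(x, a, b):
    return 1.0 / (1.0 + math.exp(b * (x - a)))

def output(x, a, b, z1, z2):
    m = mu1(x, a, b)
    return z1 * m + z2 * (1.0 - m)     # mu_A2 = 1 - mu_A1

def error(x, y, a, b, z1, z2):
    return 0.5 * (output(x, a, b, z1, z2) - y) ** 2

def grad_a(x, y, a, b, z1, z2):
    m = mu1(x, a, b)
    return (output(x, a, b, z1, z2) - y) * (z1 - z2) * b * m * (1.0 - m)

def grad_b(x, y, a, b, z1, z2):
    m = mu1(x, a, b)
    return -(output(x, a, b, z1, z2) - y) * (z1 - z2) * (x - a) * m * (1.0 - m)

# Central finite differences as an independent check of the formulas.
x, y, a, b, z1, z2, h = 1.5, 0.2, 1.0, 2.0, 1.0, -1.0, 1e-6
num_a = (error(x, y, a + h, b, z1, z2) - error(x, y, a - h, b, z1, z2)) / (2 * h)
num_b = (error(x, y, a, b + h, z1, z2) - error(x, y, a, b - h, z1, z2)) / (2 * h)
```

The update rules then read a := a − η grad_a(...) and b := b − η grad_b(...).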
For an arbitrary algorithm in which the parameters of the fuzzy variables still have to
be determined, given a training set $\{(x^k, y^k)\}_{k=1}^{K}$, the following steps should be carried
out:
1. Choose η > 0.
2. Take initial values for all parameters involved in the problem, and put the error
function E := 0 and let k := 1.
3. The training cycle begins. Take xk and compute the output ok as the output given
by the algorithm, which possibly may contain some unknown parameters.
4. Adjust each parameter a involved by
$$a(t+1) = a(t) - \eta \frac{\partial E_k}{\partial a}$$
where the error function is defined as
$$E_k = \frac{1}{2} (o^k(x^k) - y^k)^2$$
5. Accumulate the error function by putting
$$E^{\mathrm{new}} := E^{\mathrm{old}} + \frac{1}{2} (o^k(x^k) - y^k)^2$$
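The later steps of this procedure are not legible in this copy. The Python sketch below illustrates steps 1 to 5 for the two-rule example with complementary sigmoidal membership functions; repeating the cycle for a fixed number of epochs is an assumption made here in place of the missing stopping rule, and all names and numerical values are hypothetical.

```python
import math

def mu1(x, a, b):
    return 1.0 / (1.0 + math.exp(b * (x - a)))

def output(x, a, b, z1, z2):
    m = mu1(x, a, b)
    return z1 * m + z2 * (1.0 - m)

def train(data, a, b, z1, z2, eta=0.5, epochs=200):
    history = []
    for _ in range(epochs):                 # assumed: fixed number of cycles
        E = 0.0                             # step 2: E := 0
        for x, y in data:                   # step 3: take x^k, compute o^k
            m = mu1(x, a, b)
            err = z1 * m + z2 * (1.0 - m) - y
            s = m * (1.0 - m)
            # step 4: gradients of E_k, all evaluated at the old parameters
            da = err * (z1 - z2) * b * s
            db = -err * (z1 - z2) * (x - a) * s
            dz1, dz2 = err * m, err * (1.0 - m)
            a, b = a - eta * da, b - eta * db
            z1, z2 = z1 - eta * dz1, z2 - eta * dz2
            E += 0.5 * err ** 2             # step 5: accumulate the error
        history.append(E)
    return (a, b, z1, z2), history

# Synthetic training set generated by a = 2, b = 1.5, z1 = 1, z2 = 0.
data = [(k / 2, output(k / 2, 2.0, 1.5, 1.0, 0.0)) for k in range(9)]
params, history = train(data, a=1.0, b=1.0, z1=0.8, z2=0.2)
```

The list history records the accumulated error per cycle, which makes it easy to inspect whether the chosen learning rate leads to a decreasing error.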
a fuzzy rule base for the classification problem looks like this:
IF ($x_1^1 = A_1^1$) and ... and ($x_n^1 = A_n^1$) THEN $x^1$ belongs to class 1
IF ($x_1^2 = A_1^2$) and ... and ($x_n^2 = A_n^2$) THEN $x^2$ belongs to class 2
...
where $A_i^k$ are linguistic variables that characterize the properties of the classes. By
combining the individual rules by means of the appropriate aggregation functions,
such as t–norms and t–conorms, the different actions are considered together, and
based on the result of pattern matching between rule antecedents and input signals, a
number of fuzzy rules are triggered in parallel with various values of firing strength.
Furthermore, we want the system to have the capability to learn, and hence to
update and fine-tune itself, based on newly acquired information. The task of fuzzy
classification is to generate an appropriate fuzzy partition of the feature space; in
this context the word “appropriate” means that the number of misclassified patterns
should be minimized. Also, the rule base should be optimized by deleting rules
which are not used or have a negligible influence.
To achieve this goal, each of the input domains is assigned a partition of unity
as an antecedent rule base. Considering that the minimum is the largest t–norm,
and that the firing strength, being the combination of the rule antecedents for
$x^k = (x_1^k, x_2^k, \dots, x_n^k)$, is realized through such a t–norm, a pattern vector $x^k$ is then suitably
classified as belonging to class j if and only if its firing strength is larger than or
equal to 0.5. A rule is created from a given input pattern $x^k$ by combining, for each
input feature, the fuzzy set that yields the highest degree of membership for
that feature. If this rule antecedent combination is not
present as an existing rule in the rule base yet, a new rule is created. This method,
however, does not prevent some patterns from being misclassified. In particular, this
may happen when either the fuzzy partition is not set up correctly, or the number
of fuzzy linguistic variables is too small.
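The rule creation step can be sketched as follows in Python. The sketch assumes triangular fuzzy sets that form a partition of unity on [0, 1], the minimum as t–norm, and that the first pattern producing an antecedent combination determines the class of the new rule; these choices and all names are hypothetical.

```python
def partition(p):
    """p triangular fuzzy sets on [0, 1] forming a partition of unity (assumed shape)."""
    width = 1.0 / (p - 1)
    centers = [i * width for i in range(p)]
    return [lambda x, c=c: max(0.0, 1.0 - abs(x - c) / width) for c in centers]

def best_sets(x, sets):
    """For every feature, the index of the fuzzy set with the highest membership."""
    return tuple(max(range(len(sets)), key=lambda i: sets[i](xi)) for xi in x)

def learn_rules(patterns, sets):
    rules = {}
    for x, label in patterns:
        rules.setdefault(best_sets(x, sets), label)   # create the rule only if it is new
    return rules

def firing_strength(antecedent, x, sets):
    return min(sets[i](xi) for i, xi in zip(antecedent, x))

def classify(x, rules, sets):
    """Class of the strongest rule, or None if no rule fires with strength >= 0.5."""
    best = max(rules, key=lambda ant: firing_strength(ant, x, sets))
    return rules[best] if firing_strength(best, x, sets) >= 0.5 else None

sets = partition(3)
rules = learn_rules([((0.1, 0.9), 1), ((0.15, 0.95), 1), ((0.9, 0.1), 2)], sets)
```

With a partition of unity in which at most two neighbouring sets overlap, the highest membership of every feature is at least 0.5, so the rule created from a pattern always classifies that pattern with sufficient firing strength.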
Since a general description of this method, incorporating all the possible choices
for aggregation operators, shapes of the membership functions, number of input and
output values and degree of overlap, would lead to a too general meta-description
of the method of neuro-classification, we will restrict ourselves once more to give a
few detailed examples, in several dimensions.
10.2.12 Example
Fig. 71 Training set (fuzzy partitions µ1 , µ2 , µ3 of the first input and ν1 , ν2 , ν3 of the second input)
where Aµi denotes the i–th linguistic variable for the first input, represented by the
fuzzy set µi , and where Aν j denotes the j–th linguistic variable for the second input,
represented by the fuzzy set ν j . Two observations should now be obvious:
• The contraction of rules R4 , R5 and R6 to one single rule
does not in any way influence the classification, so we reach the same precision
with fewer rules.
• Nevertheless, the number of rules seems to be too small, as there are clearly two
misclassified data sets in the example.
10.2.13 Example
Consider another example (Figure 72), in which we will show that this reduction of the
number of rules can be quite drastic.
If one tried to classify all the given patterns by fuzzy rules based on a simple
fuzzy grid, a fine fuzzy partition and (6 × 6 = 36) rules would be necessary.
Fig. 72 Another training set (fuzzy partitions µ1 , ..., µ6 of the first input and ν1 , ..., ν6 of the second input)
However, it is easy to see that the patterns may be correctly classified with only the
following five IF–THEN rules:
Another example of the successful combination of neural networks and fuzzy linguistic
variables is given by Sun and Jang ([68]), who have successfully constructed
a fuzzy classifier based on an adaptive network, which they call an ANFIS (adaptive
network-based fuzzy inference system) structure. The architecture is shown in
Figure 73.
Given two input variables x1 and x2 , the training data set is categorized into two
classes C1 and C2 . Each input is supposed to satisfy to a certain degree two linguistic
terms, hence we have four rules.
Fig. 73 ANFIS structure (inputs x1 , x2 ; nodes A1 , A2 , B1 , B2 , T , S and fθ ; outputs C1 , C2 )
• In the first layer, the output is defined as the degree to which the given input
satisfies the given linguistic variable. Fuzzy variables describing this degree of
membership may be of the following normal and convex shape:
$$\mu_{A_i}(x) = \exp\left(-\frac{1}{2} \left(\frac{x - a_{i1}}{b_{i1}}\right)^2\right)$$
where $\{a_{ij}, b_{ij}\}_{i,j=1}^{2}$ are parameters that still have to be determined. The shape
of the membership function changes as a function of these parameters. The
functions may, e.g. also have a trapezoidal or triangular shape; the parameters
are tuned by means of a gradient descent method.
• The signals that are generated by each of the nodes are combined in the second
layer by means of a t–norm T representing the AND-conjunction.
• The different outcomes are then combined through a t-conorm S or a linear com-
bination.
• Finally, in the last layer, a sigmoidal function is applied to calculate the degree
of membership to each of the classes.
Let therefore a training set $\{(x^k, y^k)\}_{k=1}^{K}$ be given, where $x^k$ is the k-th input
pattern and
$$y^k = \begin{cases} (1, 0) & \text{if } x^k \text{ belongs to class 1} \\ (0, 1) & \text{if } x^k \text{ belongs to class 2} \end{cases}$$
then the parameters of this hybrid neural net determine the shape of the membership
functions, and can be learned by gradient descent methods. The error function is
defined as
$$E = \sum_{k=1}^{K} E_k = \frac{1}{2} \sum_{k=1}^{K} \left( (o_1^k - y_1^k)^2 + (o_2^k - y_2^k)^2 \right)$$
where yk is the desired output vector and ok is the output given by the hybrid neural
net.
In this last section, we will again first give an overview of the main theory of
genetic algorithms, followed by an overview of hybrid techniques in which successful
combinations of fuzzy set theory and genetic algorithms have been made. We will
follow the approach presented by Obitko and Slavík in [108].
10.3.1 Introduction
NP problem is the traveling salesman problem, see for instance [70]. Usually, NP
problems are solved by some sort of “guessing” of the solution, followed by checking its
fitness. A characteristic of NP problems is that a simple algorithm, perhaps obvious
at first sight, can be used to find usable solutions. But this approach generally has to
consider many candidate solutions; just trying all possible solutions, even for a simple
problem for which the answer is either yes or no, is already a very slow process of
order $O(2^n)$. The question whether for every NP problem an algorithm exists that
provides the exact answer in polynomial time is still an open problem.
Because no way is known to construct such an efficient algorithm, scientists apply
alternative methods such as genetic algorithms.
Crossover and mutation are the most important parts of the genetic algorithm.
The performance is influenced mainly by these two operators. Before we can explain
more about crossover and mutation, some information about chromosomes will be
given.
A chromosome should in some way contain information about the solution that it
represents. We will illustrate the crossover and mutation operators in case the chro-
mosomes are defined as binary strings. Chromosomes then could look like this:
Chromosome 1 : 1101100100110110
Chromosome 2 : 1101111000011110
Each bit in this string can represent some characteristic of the solution. Of course,
there are many other ways of encoding, depending mainly on the problem to be
solved. For example, one can directly encode integer or real numbers, and sometimes it
is useful to encode permutations, and so on.
10.3.4 Example
Let the same chromosomes as above be given, and denote the crossover point by |.
Then
Chromosome 1 : 1101100|100110110
Chromosome 2 : 1101111|000011110
Variations in how to create crossover offspring include, for example, the choice of
more than one crossover point. Crossover can be quite complicated and depends
mainly on the encoding of chromosomes. Specific crossovers made for a specific
problem can improve the performance of the genetic algorithm.
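For binary strings, single point crossover can be sketched in Python as follows. The function name is hypothetical, and the two parents are the chromosomes of the example above with the crossover point after the seventh bit.

```python
def single_point_crossover(parent1, parent2, point):
    """Swap the tails of two equally long chromosomes after position `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

parent1 = "1101100100110110"
parent2 = "1101111000011110"
child1, child2 = single_point_crossover(parent1, parent2, 7)
```

Both offspring keep the length of the parents, so the same operator can be applied again in the next generation.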
After a crossover is performed, mutation takes place.
The mutation operation is intended to prevent all solutions in the population
from falling into a local optimum of the problem to be solved. The mutation operation randomly
changes the offspring resulting from crossover, though with a low probability. In the case
of binary encoding, we can switch a few randomly chosen bits from 1 to 0 or from
0 to 1.
10.3.6 Example
Let the same offspring as caused by the crossover above be given. Then the result
of mutation of
Original offspring chromosome 1 : 1101100|000011110
Original offspring chromosome 2 : 1101111|100110110
Mutation should not occur very often, because the genetic algorithm would then in fact
degenerate into a random search. The technique of mutation (as well as crossover) depends
mainly on the encoding of chromosomes. For example, when we are encoding permutations,
mutation could be performed as an exchange of two genes.
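A minimal Python sketch of mutation for binary encoding is given below. Flipping every bit independently with a small probability is one common way to realize the random change described above; the function name and the example rate are hypothetical.

```python
import random

def bit_flip_mutation(chromosome, rate, rng=random):
    """Flip each bit of a binary string independently with probability `rate`."""
    return "".join(
        ("0" if bit == "1" else "1") if rng.random() < rate else bit
        for bit in chromosome
    )

rng = random.Random(1)          # fixed seed so that runs can be repeated
mutated = bit_flip_mutation("1101100000011110", rate=0.05, rng=rng)
```

With a rate of 0 the chromosome is returned unchanged, and with a rate close to 1 the operator approaches a complete inversion, which illustrates why a low rate is needed to avoid a random search.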
Chromosomes are selected from the population to be parents for crossover. The
problem is how to select which chromosomes will be given a chance to procreate.
According to Darwin’s theory of evolution, the best ones survive to create new
offspring. There are many methods for selecting the best chromosomes. Any such
algorithm is called a selection. Examples are roulette wheel selection, Boltzmann
selection, tournament selection, rank selection, steady state selection and some other
selection methods.
10.3.8 Examples
3. Steady-state selection
This is not a particular method of selecting parents. The main idea of this type
of selection is that a large part of the chromosomes can survive
to the next generation. The steady-state selection genetic algorithm works in the
following way: in every generation a few good (with higher fitness) chromosomes
are selected for creating new offspring. Then some bad (with lower fitness)
chromosomes are removed and the new offspring are put in their place. The rest
of the population, including the parents with high fitness values, survives to the new
generation.
4. Elitism
The idea of elitism has already been introduced. When creating a new population
by crossover and mutation, there is a large chance that we will lose the
best chromosome. Elitism is the name of the method that first copies the best
chromosome (or a few of the best chromosomes) to the new population. The rest of the
population is constructed in the ways described above. Elitism can rapidly increase
the performance of a GA, because it prevents the loss of the best solution found.
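The last two methods can be sketched in Python as follows. The callbacks breed and make_children stand for any combination of crossover and mutation and are hypothetical, as are the function names and the toy fitness function used in the example.

```python
def steady_state_step(population, fitness, breed, n_parents=2):
    """Replace the worst chromosomes by offspring of a few good ones."""
    ranked = sorted(population, key=fitness, reverse=True)
    children = breed(ranked[:n_parents])               # offspring of the fittest
    survivors = ranked[:len(ranked) - len(children)]   # drop the worst ones
    return survivors + children

def with_elitism(population, fitness, make_children, n_elite=1):
    """Copy the best chromosomes unchanged, fill the rest with new offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:n_elite]
    return elite + make_children(population, len(population) - n_elite)

population = ["0000", "1111", "1100", "1000"]
count_ones = lambda c: c.count("1")                    # toy fitness function
after_steady = steady_state_step(population, count_ones, lambda parents: ["1110"])
after_elite = with_elitism(population, count_ones, lambda pop, k: ["0000"] * k)
```

In both cases the population size stays constant, and the best chromosome of the old population is guaranteed to be present in the new one.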
10.3.9 Parameters
There are two basic parameters of a GA: crossover probability and mutation
probability.
genetic algorithms, on which the performance depends very much, the type and
implementation of operators depend on the encoding that has been chosen as being
suitable to the problem. In the following examples we briefly describe some frequently
encountered encoding methods.
Binary encoding is the most commonly used type of encoding, for historical reasons
as well as for its computational simplicity. Furthermore, binary encoding yields
many possible chromosomes even with a small number of data. On the other hand,
this encoding is often not natural for many problems and sometimes corrections
must be made after crossover and/or mutation. In binary encoding, every chromosome
is a string of bits, each 0 or 1.
Chromosome A : 1001001001100101100110
Chromosome B : 1110100011110101101110
• Two point crossover: two crossover points are selected, then consequently the
binary string from the beginning of the chromosome to the first crossover point
is copied from the first parent, the part from the first to the second crossover point
is copied from the other parent and the rest is copied from the first parent again.
11|0010|11 + 11|0111|11 ⇒ 11|0111|11
• Uniform crossover: bits are randomly copied from the first or from the second
parent.
11001011 + 11011111 ⇒ 11011111
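The two crossover variants above can be sketched in Python as follows. The function names are hypothetical, and the parents are the bit strings of the two examples.

```python
import random

def two_point_crossover(parent1, parent2, first, second):
    """Take the middle part from parent2 and both outer parts from parent1."""
    return parent1[:first] + parent2[first:second] + parent1[second:]

def uniform_crossover(parent1, parent2, rng=random):
    """Copy every bit at random from one of the two parents."""
    return "".join(rng.choice((a, b)) for a, b in zip(parent1, parent2))

child_two_point = two_point_crossover("11001011", "11011111", 2, 6)
child_uniform = uniform_crossover("11001011", "11011111", random.Random(0))
```

Since uniform crossover draws every bit at random, the child shown in the example above is only one of several possible outcomes.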
As far as mutation is concerned, only one feasible method is possible here: bit
inversion, where randomly selected bits are inverted.
11001001 ⇒ 10001001
Permutation encoding can be used in ordering problems, such as the travelling sales-
man problem or, more generally, any task ordering problem. In permutation encod-
ing, every chromosome is a string of numbers that represent a position in a sequence.
Chromosome A : 7 4 1 9 6 3 2 5 8
Chromosome B : 3 8 9 5 2 6 1 4 7
The standard problem that is associated with permutation encoding is the travelling
salesman problem: a number of cities and a matrix denoting the distances
between them are given. A travelling salesman has to visit all of them exactly once, but at
the same time he wants to minimize his travel time. The aim of the genetic algorithm is then
to find the ideal order in which the salesman has to visit the cities. The chromosomes
of course describe the order in which the salesman will visit the cities. A myriad
of variations on the problem exist (e.g. the salesman wants to end in the same city
in which he started, city A must be visited before city B, one particular ordered pair (A, B)
should be excluded because of road works).
Several methods for crossover exist. Single point crossover can be achieved as
follows: one crossover point is selected, the permutation is copied from the first
parent till the crossover point, then the other parent is scanned, where all numbers
that are not yet in the offspring, are added in the same order as they occur in the
second parent. Note that there are more ways to produce the remainder of the string
after the crossover point.
12345|6789 + 45368|9721 ⇒ 123456897
For mutation (and also for some types of crossover) corrections must be made
to leave the chromosome consistent (i.e. making sure that the chromosomes still
are feasible solutions). One could imagine for instance that random mutation would
cause the sequence not to contain all numbers anymore. A mutation therefore will
be encoded as the random exchange of a pair of numbers.
123456897⇒183456297
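Both operators for permutation encoding can be sketched in Python as follows. The function names are hypothetical; the inputs are the chromosomes of the two examples above, and the mutation positions are chosen by hand here, whereas a genetic algorithm would draw them at random.

```python
def permutation_crossover(parent1, parent2, point):
    """Copy parent1 up to `point`, then append the missing genes in parent2's order."""
    head = parent1[:point]
    return head + [gene for gene in parent2 if gene not in head]

def swap_mutation(chromosome, i, j):
    """Exchange the genes at positions i and j (0-based)."""
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

child = permutation_crossover(list("123456789"), list("453689721"), 5)
mutated = swap_mutation(child, 1, 6)
```

Both operators return a permutation of the same genes, so no repair step is needed afterwards.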
Direct value encoding can be used in a wide range of problems in which more complicated
values such as real numbers are used, and for which binary encoding
would be meaningless. In value encoding, every chromosome is a
sequence of values, possibly anything connected to the problem, such as (real)
numbers, characters or any objects. Almost any mathematical problem can be
treated by genetic algorithms with real numbers
which may for instance be the weights of the synapses between the neurons of a
neural network; but also, e.g. sequences of motions to find the shortest path through
a maze could be the object of study for a genetic algorithm:
For crossover, all crossovers from binary encoding can be used. In the case of
real value encoding, mutation can be performed by adding or subtracting a small
number to or from selected values.
1.12 0.24 5.71 4.33 2.05 ⇒ 1.12 0.24 5.71 4.56 2.05
The following recommendations for the design of a genetic algorithm are mostly
heuristically derived from the results of empirical studies of genetic algorithms with
binary encoding:
• The crossover rate should generally be high, about 80–95%. However, some results
show that for some problems a crossover rate of about 60% is the best.
• On the other hand, the mutation rate should be very low. The best rates seem to be
about 0.5–1%.
• It may be surprising that, concerning the population size, very big populations
usually do not improve the performance of the genetic algorithm, in the sense of
the speed of finding an optimal solution. A good population size is about 20–30,
however sometimes sizes of 50–100 are reported as the best. Some research also
shows that the best population size depends on the size of the encoded strings
(chromosomes). For instance chromosomes with 32 bits require a larger population
than chromosomes with 16 bits.
• For the selection, a basic roulette wheel selection can be used, but sometimes
rank selection can be better, as each method has its advantages and disadvantages.
• There are also some more sophisticated methods that change the parameters of
selection during the run of the genetic algorithm. Basically, these behave similarly
to simulated annealing.
• Elitism should certainly be used if no other method is used for saving the
best solution found. Steady-state selection can also be tried.
• The encoding depends on the problem and also on the size of the instance of the
problem. Operators for crossover and mutation depend on the chosen encoding
and on the problem.
Although it would be beyond the scope of this article to give a complete overview
of all successful combination techniques involving fuzzy control and genetic
algorithms, we would like to mention a few of the most obvious applications.
The first approach is due to Hashiyama et al. ([49]), who incorporated
ideas due to Karr ([72]), for designing a fuzzy antecedent rule base without prior
knowledge. Let an n–input, single output fuzzy controller be given by the following
set of linguistic rules:
For all i ∈ {1, ..., n}, let Ai assume a linguistic value in the range {ai,1 , ai,2 , ..., ai,n(i) }
and let B assume a linguistic value in the range {b1 , b2 , ..., bm }. The purpose of this
method is to find a validation of all possible rules that can be created, assuming
there is a training set at hand, as well as a performance measure (see Section 8.1) for
the rule base. Unsupervised, the number of possible rules equals $\left(\prod_{k=1}^{n} n(k)\right) \times m$,
which means that it is virtually impossible to test all the rules for validity within a
reasonable time period.
Therefore, only a selected number of rules will be created at random, and these
will be considered as a population of a genetic algorithm. Hence, a chromosome
will be given by
where ∀i ∈ {1, ..., n} : ji ∈ {1, ..., n(i)} and k ∈ {1, ..., m}. A chromosome is formed by
taking one linguistic value for each of the input variables together with a single linguistic
output value, hence yielding chromosomes of length n + 1. Given a population of N
randomly determined chromosomes, it is sufficient to define a crossover and a mutation
operator in order to be able to apply the techniques described in Section 10.3.
A crossover at crossover point q ∈ {1, ..., n + 1} will be defined as follows: for all
K1 , K2 ∈ {1, ..., N},
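$$\left. \begin{array}{l} C_{K_1} : a_1^{K_1} \dots a_q^{K_1} a_{q+1}^{K_1} \dots a_n^{K_1} b^{K_1} \\ C_{K_2} : a_1^{K_2} \dots a_q^{K_2} a_{q+1}^{K_2} \dots a_n^{K_2} b^{K_2} \end{array} \right\} \Rightarrow a_1^{K_1} \dots a_q^{K_1} a_{q+1}^{K_2} \dots a_n^{K_2} b^{K_2}$$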
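The size of the search space and the crossover operator above can be illustrated by the following Python sketch. A rule chromosome is represented as a tuple of n antecedent values followed by one consequent value; the function names and the example values are hypothetical.

```python
import math

def rule_count(n_values, m):
    """Number of possible rules: the product of all n(k), times m."""
    return math.prod(n_values) * m

def rule_crossover(c1, c2, q):
    """Take the first q genes from c1 and the remaining genes from c2."""
    return c1[:q] + c2[q:]

total = rule_count([3, 3, 5], 4)            # three inputs, four output values
c1 = ("low", "low", "hot", "open")          # antecedents a_1..a_3 and consequent b
c2 = ("high", "mid", "cold", "closed")
child = rule_crossover(c1, c2, q=3)         # q = n
```

For q = n the child keeps all antecedents of the first rule and takes the consequent of the second rule.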
In particular, when q = n, the rule antecedents will be matched with another conse-
quence. Mutation is performed as follows:
The approach described in the previous section is quite crude, and without expert
knowledge, convergence to an optimal rule base is not guaranteed to be fast
enough. In this section, we would like to propose some simple heuristics which
will improve the quality of the solutions derived from the genetic algorithm. We
will again make the distinction between the self-tuning case, where the parameters
occurring in the fuzzy rule definitions are changed, and the self-organizing case,
where fuzzy rules can be omitted and/or added.
• Given that the shape functions for the linguistic variables are fixed, it is also
possible to use a genetic algorithm to tune the parameters. Let for instance a set
of antecedent rules be given, where each membership function is of the shape
$$\left\{ \mu_i(x) = \left(1 - \left|1 - \frac{x - a_i}{b_i}\right|\right) \vee 0 \right\}_{i=1}^{n}$$
Then we shall assume that the value ai is equal to a member of a discrete set
of possible center values A = {a0 , a1 , a2 , ..., a p } and bi equals a possible spread
value in B = {b0 , b1 , b2 , ..., bq }. A chromosome then consists of a string of length 2n
a1 b1 a2 b2 ... an bn
with the usual crossover operator and as mutation a random selection of another
value a j ∈ A or b j ∈ B.
• Analogously, several shape functions can be considered at once as
antecedent rule base variables, letting an evolutionary algorithm as in Section 10.4.1
determine which shape yields the best performance. Of course, as the number of degrees of
freedom increases, the search space grows, and so does the time needed to reach
convergent behavior. This extension of the design of genetic rule bases should therefore
be approached with caution.
• An important extension, however, is inspired by Example 10.2.13, which showed
that it is important to allow a variable number of antecedents in a particular
rule. This is where genetic algorithms fail, since the chromosomes always
have the same length, say n. To fix this problem without fundamentally
changing the algorithm, one can add a single “dummy value”, which has no effect,
to the possible values of each gene. Consider as an
example again a rule base with rules
and suppose that rules with fewer than n antecedent
conditions should also be considered. If, for instance, the linguistic variable $A_1$ ranges
over the set $A_1 = \{a_{1,1}, a_{1,2}, ..., a_{1,n(1)}\}$, it is easiest to extend $A_1$ to the set
$A_1 \cup \{a_{1,n(1)+1}\}$ with $a_{1,n(1)+1}$ := “always true”, where the latter is easily encoded by taking
the fuzzy set $\mu_{a_{1,n(1)+1}} := 1$. This can be done for all variables, so that the rule
IF ... and ($X_{k-1} = A_{k-1}$) and ($X_k = a_{k,n(k)+1}$) and ($X_{k+1} = A_{k+1}$) and ... THEN (Y = B)
is equivalent to the shorter rule
IF ... and ($X_{k-1} = A_{k-1}$) and ($X_{k+1} = A_{k+1}$) and ... THEN (Y = B)
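The dummy-value device can be sketched as follows. This is an illustration under stated assumptions rather than the chapter's implementation: the marker `ALWAYS_TRUE`, the helper names, the membership degrees and the use of the minimum as the "and" operator are all hypothetical choices.

```python
ALWAYS_TRUE = "always true"  # hypothetical marker for the dummy value


def firing_strength(antecedents, degrees):
    """Combine antecedent degrees with min (an assumed t-norm).

    antecedents[i] is the linguistic value chosen for variable i;
    degrees[i][value] is its membership degree for the current input.
    The dummy value has membership 1, so it never lowers the minimum.
    """
    return min(
        1.0 if value == ALWAYS_TRUE else degrees[i][value]
        for i, value in enumerate(antecedents)
    )


def describe(antecedents, names, consequent):
    """Render the rule, omitting the conditions that hold the dummy value."""
    conditions = [
        f"({name} = {value})"
        for name, value in zip(names, antecedents)
        if value != ALWAYS_TRUE
    ]
    return "IF " + " and ".join(conditions) + f" THEN (Y = {consequent})"


# Illustrative membership degrees for two input variables.
degrees = [{"low": 0.3, "high": 0.7}, {"neg": 0.6, "pos": 0.4}]
full_rule = firing_strength(("low", "neg"), degrees)
short_rule = firing_strength(("low", ALWAYS_TRUE), degrees)
text = describe(("low", ALWAYS_TRUE), ("X1", "X2"), "B")
```

Because every chromosome keeps the same length, the crossover and mutation operators need no modification; only the set of admissible gene values grows by one per variable.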
132 W. Peeters
References
1. D.Y. Abramovitch and L.G. Bushnell. Report on the Fuzzy versus Conventional Control De-
bate. IEEE Control Systems 19(3), pp. 88–91, 1999
2. C. Alsina. On a family of connectives for fuzzy sets. Fuzzy Sets and Systems 16, pp. 231–235,
1985
3. G. Arfken. The Method of Steepest Descents. in: Mathematical Methods for Physicists, 3rd
ed. Orlando, FL, Academic Press, pp. 428–436, 1985
4. S. Arnone, M. Dell’Orto and A. Tettamanzi A. Towards a fuzzy government of genetic popu-
lations. In Proc. Sixth IEEE Conference on Tools with Artificial Intelligence, Los Alamitos,
pp. 585–591, 1994
5. K.J. Aström and B. Wittenmark Adaptive Control. Addison-Wesley, 1989
6. S.M. Baas and H. Kwakernaak. Rating and ranking of multiple–aspects alternatives using
fuzzy sets. Automatica, 13, pp. 47–58, 1977
7. J.F. Baldwin. A new approach to approximate reasoning using a fuzzy logic. Fuzzy Sets and
Systems 2, pp. 309–325, 1979
8. G. Bartolini, G. Casalino, F. Davoli, M. Mastretta, R. Minciardi and E. Morten Development
of performance adaptive fuzzy controllers with application to continuous casting plants. In
R. Trappl Ed., Cyvbernetics and Systems Research. Amsterdam, North–Holland, pp. 721–
728, 1982
9. A. Bergman, W. Burgar and A. Hemker Adjusting parameters of genetic algorithms by fuzzy
control rules. Proc. Third International Workshop on Software Engineering and Expert Sys-
tems for High Energy and Nuclear Physics, Oberammergau In K.H. Becks and D.P. Gallix,
Eds. New Computer Techniques in Physics Research III, pp. 235–240, 1994
10. P.P. Bonissone and K.S. Decker. Selecting uncertainty calculi and granularity: An experi-
ment in trading-off precisionand complexity. In: L.N. Kanal and J.F. Lemmer. Uncertainty In
Artificial Intelligence, pp. 217–247, 1986
11. G. Bortolan and R. Degani. A review of some methods for ranking fuzzy subsets. Fuzzy Sets
and Systems 15, pp. 1–19, 1985
12. S.B. Boswell and M.S. Taylor. A central limit theorem for fuzzy random variables. Fuzzy
Sets and Systems 24, pp. 331–344, 1987
13. G.E.P. Box and G.M. Jenkins. Time Series Analysis: Forecasting and Control. Holden-Day,
1989
14. M. Braae and D.A. Rutherford. Selection of parameters for a fuzzy logic controller. Fuzzy
Sets and Systems 2, pp. 185–199, 1979
15. M. Braae and D.A. Rutherford. Theoretical and linguistical aspects of the fuzzy logic con-
troller. Automatica 15, pp. 553-577, 1979
16. Z.-X. Cai. Intelligent control: Principles, Techniques and Applications. World Scientific,
1997
17. L. Campos and J.L. Verdegay. Linear programming problems and ranking of fuzzy numbers.
Fuzzy Sets and Systems 32, pp. 1–11, 1989
18. W. Cong–Xin and M. Ming. Embedding problem of fuzzy number space, Part I. Fuzzy Sets
and Systems 44, pp. 33–38, 1991
19. W. Cong–Xin and M. Ming. Embedding problem of fuzzy number space, Part II. Fuzzy Sets
and Systems 45, pp. 189–202, 1992
An Overview of Fuzzy Control Theory 133
20. W. Cong–Xin and M. Ming. Embedding problem of fuzzy number space, Part III. Fuzzy Sets
and Systems 46, pp. 281–286, 1992
21. E. Czogala and W. Pedrycz On identification in fuzzy systems and its applicatons in control
problems. Fuzzy Sets and Systems 6, pp. 73–83, 1981
22. M. Delgado, J.L. Verdegay and M.A. Villa. A procedure for ranking fuzzy numbers using
fuzzy relations. Fuzzy Sets and Systems 26, pp. 49–62, 1988
23. R.L. Devaney. An Introduction to Chaotic Dynamical Systems. Addison-Wesley, 1989
24. D. Driankov, H. Hellendoorn and M. Reinfrank. An introduction to fuzzy control. Springer-
Verlag, 1993
25. D. Dubois and H. Prade. Operations on fuzzy numbers. Internat. J. Systems Sci. 9, pp. 613–
626, 1978
26. D. Dubois and H. Prade. Fuzzy real algebra: some results. Fuzzy Sets and Systems 2, pp.
327–348, 1979
27. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Academic Press,
1980
28. D. Dubois and H. Prade. Towards fuzzy differential calculus, Part 1: Integration of fuzzy
mappings. Fuzzy Sets and Systems 8, pp. 1–17, 1982
29. D. Dubois and H. Prade. Towards fuzzy differential calculus, Part 2: Integration on fuzzy
intervals Fuzzy Sets and Systems 8, pp. 105–116, 1982
30. D. Dubois and H. Prade. Towards fuzzy differential calculus, Part 3: Differentiation. Fuzzy
Sets and Systems 8, pp. 225–233, 1982
31. D. Dubois and H. Prade. Ranking fuzzy numbers in the setting of possibility theory. Inform.
Sci. 30, pp. 183–224, 1983
32. D. Dubois and H. Prade. The mean of a fuzzy number. Fuzzy Sets and Systems 24, pp. 279–
300, 1987
33. D. Dubois, J. Lang and H. Prade. Fuzzy sets in approximate reasoning part 2: Logical ap-
proaches. Fuzzy Sets and Systems 40, pp. 203–244, 1991
34. D. Dubois and H. Prade. Fuzzy sets in approximate reasoning, Part 1: Inference with possi-
bility distributions. Fuzzy Sets and Systems 40, pp. 143–202, 1991
35. D. Dubois and H. Prade. Basic issues on fuzzy rules and their application to fuzzy control.
Proceedings of the IJCAI-91 Workshop on Fuzzy Control, Sydney, pp. 5–17, 1991
36. L. Fausett. Fundamentals of Neural Networks. Prentice-Hall, 1994
37. D.P. Filev and R.R. Yager. A generalized defuzzification method via BADD distributions.
Internat. J. Intelligent Systems 6, 1991, pp. 687–697
38. D.P. Filev and R.R. Yager. An adaptive approach to defuzzification based on level sets. Fuzzy
Sets and Systems 53, pp. 355–360, 1993
39. R. Fullér. Introduction to Neuro-Fuzzy Systems. Advances in Soft Computing Series,
Springer-Verlag, Berlin/Heidelberg, 2000
40. K.I. Funahashi. On the Approximate Realization of continuous Mappings by Neural Net-
works. Neural Networks, vol. 2, pp. 183–192, 1989
41. S. Gähler and W. Gähler. Fuzzy real numbers. Fuzzy Sets and Systems 66, pp. 137–158, 1994
42. M. de Glas. Invariance and stability of fuzzy systems. Journal of Mathematical Analysis and
Applications, 199, pp. 299–319, 1984
43. R. Goetschel and W. Voxman. Topological properties of fuzzy numbers. Fuzzy Sets and Sys-
tems 10, pp. 87–99, 1983
44. R. Goetschel and W. Voxman. Eigen fuzzy number sets. Fuzzy Sets and Systems 16, pp.
75–85, 1985
45. R. Goetschel and W. Voxman. Elementary fuzzy calculus. Fuzzy Sets and Systems 18, pp.
31–43, 1986
46. David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning.
Addison-Wesley Publishing Company, 1989
47. K. Gurney. An Introduction to Neural Networks. UCL Press, 1997
48. J.C. Harris and J.F. Miles. Stability of linear systems: Some aspects of kinematic similarity.
Academic Press, New York, 1980
134 W. Peeters
49. T. Hashiyama, F. Furuhashiu and Y. Uchikawa. A creative design of fuzzy logic controller
using a genetic algorithm. In [134], pp. 37–48, 1997
50. S. Haykin. Neural Networks: A Comprehensive Foundation, 2nd Ed. Prentice-Hall, 1999
51. F. Herrera, E. Herrera-Viedma, M. Lozano and J.L. Verdegay. Fuzzy tools to improve ge-
netic algorithms. In Proc. Second European Conference on Intelligent Techniques and Soft
Computing (EUFIT Aachen’94), vol. 3, pp. 1532–1539, 1994
52. F. Herrera and M. Lozano. Adaption of genetic algorithm parameters based on fuzzy logic
controllers. In [57], pp. 95–125, 1996
53. F. Herrera and M. Lozano. Adaptive genetic algorithms based on fuzzy techniques. In Proc.
Sixth International Conference on Information Processing and Management of Uncertainty
in Knowledge Based Systems (IPMU’96), Granada, pp. 775–780, 1996
54. F. Herrera and M. Lozano. Heuristic crossovers for real-coded genetic algorithms based on
fuzzy connectives. In H.K. Voight, W. Ebeling, I. Rechenberg and H.P. Schwefel, Eds. Proc.
Fourth Paralell Problem Solving from Nature - PPSN IV. LCNS 1141 Springer-Verlag, Berlin,
pp. 336–345, 1996
55. F. Herrera, M. Lozano and J.L. Verdegay. The use of fuzzy connectives to design real-coded
genetic algorithms. Mathware & Soft Computing 1(3), pp. 239–251, 1995
56. F. Herrera, M. Lozano and J.L. Verdegay. Dynamic and heuristic fuzzy connectives based
crossover operators for controlling the diversity and convergence of real-coded genetic al-
gorithms. International Journal of Intelligent Systems 11(12), pp. 1013–1040, 1996
57. F. Herrera and J.L. Verdegay. Genetic Algorithms and Soft Computing. Physica Verlag, 1996
58. F. Herrera, M. Lozano and J.L. Verdegay. Tackling fuzzy genetic algorithms. In G. Winter,
J. Periaux, M. Galáan, and P. Cuesta, Eds. Genetic Algorithms in Engineering and Computer
Science. Wiley, Chichester, UK, pp. 167–189, 1995
59. F. Herrera, M. Lozano and J.L. Verdegay. Fuzzy connective based crossover operators to
model genetic algorithms population diversity. Fuzzy Sets and Systems, 92 (1), pp. 21–30
60. K. Hirota. Industrial Applications Of Fuzzy Technology. Tokyo, Berlin, Heidelberg, 1993
61. J.H. Holland. Adaption In Natural And Artificial Systems. MIT Press, Ann Arbor, 1975.
62. L.P. Holmblad and J.J. Østergaard. Control of cement kiln by fuzzy logic. in: Approximate
Reasoning In Decision Analysis. Eds. M.M. Gupta and E. Sanchez, Amsterdam, New York,
Oxford pp. 389–400, 1982
63. S. Isaka and A.V. Sebald. An optimization for fuzzy controller design. IEEE Trans. SMC, 22,
p. 1469, 1992
64. H. Ishigami, T. Fukuda and T. Shibata. Automatic fuzzy tuning and its applications. In [134],
pp. 49–70, 1997
65. R. Jager. Fuzzy logic in control. Ph.D. thesis, T.U. Delft, 1995
66. L.C. Jain and R.K. Jain. Hybrid intelligent engineering systems. in: Advances in Fuzzy Sys-
tems — Applications And Theory, Vol. 11. World Scientific, 1997
67. J.S.R. Jang. ANFIS: Adaptive-Network-Based Fuzzy Inference System. IEEE Transactions on
Systems, Man and Cybernetics, Vol. 23, pp. 665–685, 1993
68. J.S.R. Jang and C. Sun. Neuro-Fuzzy Modeling and Control. Proceedings of the IEEE, 83,
pp. 378–406, 1995
69. Jan Jantzen. Design of fuzzy controllers. Tech. report no. 98-E 384, TU denmark, Dept. of
Automation, Lyngby, Denmark, 1998
70. D.S. Johnson and L.A. McGeoch. The Traveling Salesman Problem: A Case Study In Local
Optimization In E.H.L. Aarts and J.K. Lenstra, Eds. Local Seach in Combinatorial Optimiza-
tion. To appear
71. A. Kandel, Y. Luo and Y.Q. Zhang. Stability analysis of fuzzy control systems. Fuzzy Sets
and Systems 105, pp. 33–48, 1999
72. C.L. Karr. Design of an adaptive fuzzy logic controller using a genetic algorithm. Proc. of
the 4th International Conference on Genetic Algorithms, pp. 450–457, 1992
73. E.E. Kerre. A comparative study of the behavior of some popular fuzzy implication operators.
In L.A. Zadeh and J. Kacprzyk, Eds., Fuzzy Logic For The Management Of Uncertainty.,
Wiley, New York, 1992
An Overview of Fuzzy Control Theory 135
74. W.M. Kickert and E.H. Mamdani. Analysis of a fuzzy logic controller. Fuzzy Sets and Sys-
tems 1, pp. 29–44, 1978
75. J.B. Kiszka, M.M. Gupta and M.N. Nikiforuk. Energetistic stability of fuzzy dynamic systems.
IEEE Trans. on Systems, Man and Cybernetics, 15, pp. 783–792, 1985
76. P.E. Kloeden. Fuzzy dynamical systems. Fuzzy Sets and Systems 7, pp. 275–296, 1982
77. T. Kohonen. Self-organising and Associative Memory, 3rd Ed, Springer Verlag, New York,
1988
78. A.N. Kolmogorov and S.V. Fomin. Measure, Lebesgue Integrals and Hilbert Space. Acad-
emic Press, New York, 1961
79. J.R. Koza. Genetic Programming: On The Programming Of Computers By Means Of Natural
Selection. MIT Press, 1992
80. K. Kristinsson and G.A. Dumont. System identification and control using genetic algorithms.
IEEE Transactions on System, Man, and Cybernetics, SMC-22(5), pp 1033–1046, 1992
81. H. Kwakernaak and R. Sivan Linear Optimal Control Systems. Wiley-Interscience, New
York, 1972
82. A.M. Lee and H. Takagi. A framework for studying the effects of dynamic crossover, mutation,
and population sizing in genetic algorithms. In T. Furuhashi, Ed. Advances in Fuzzy Logic,
Neural Networks and Genetic Algorithms. Proc. 1994 IEEE/Nagoya-University World Wide
Wisepersons. Selected papers. LNAI 1011 Springer-Verlag, Berlin, pp. 111–126, 1995
83. A.M. Lee and H. Takagi. Dynamic control of genetic algorithms using fuzzy logic techniques.
In Proc. Fifth International Conference on Genetic Algorithms (ICGA’93), San Mateo, pp.
76–83, 1993
84. C.C. Lee. Fuzzy logic in control systems: fuzzy logic controller, Parts I and II. IEEE Trans.
SMC. 20, pp. 405–435, 1900
85. H.K. Lee, E. Paillet and W. Peeters. A consistency criterion for optimizing defuzzification in
fuzzy control. In R. Lowen and A. Verschoren, Eds. Foundations of Generic Optimization Vol
II: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, Mathematical
Modelling: Theory and Applications, Springer Verlag, 2007
86. H.K. Lee, E. Paillet and W. Peeters. An asymptotic consistency criterion for optimizing de-
fuzzification in fuzzy control. In R. Lowen and A. Verschoren, Eds. Foundations of Generic
Optimization Vol II: Applications of Fuzzy Control, Genetic Algorithms and Neural Net-
works, Mathematical Modelling: Theory and Applications, Springer Verlag, 2007
87. C.H. Ling. Representation of associative functions. Publ. Math. Debrecen 12, pp. 182–212,
1965
88. R. Lowen. On (R(L), ⊕). Fuzzy Sets and Systems 10, pp. 203–209, 1983
89. R. Lowen. Fuzzy integers, fuzzy rationals and other subspaces of the fuzzy real line. Fuzzy
Sets and Systems 14, pp. 231–236, 1984
90. R. Lowen. The order aspect of the fuzzy real line. Manuscripta Math. 39, pp. 293–309, 1985
91. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer Acad-
emic, Dordrechit, 1996
92. J.L. McClelland and D.E. Rumelhart. Explorations in Parallel Distributed Processing. MIT
Press, 1988
93. A. Maeda, S. Someya and M. Funabashi. A self–tuning algorithm for fuzzy membership func-
tions using a computational flow network. Proceedings of the IFSA ’91, Brussels, 1991
94. E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic con-
troller. Int. Journal of Man-Machine Studies 7, pp. 1–13, 1975
95. E.H. Mamdani and N. Baaklini. Prescriptive method for deriving control policy in a fuzzy
logic controller. Electronic Letters, 11, pp. 625–626, 1975
96. E.H. Mamdani T. Procyk and N. Baaklini. Application of fuzzy logic to controller design
based on linguistic protocol. In: Discrete Systems And Fuzzy Reasoning, E.H. Mamdani and
B.R. Gaines, eds. Queen Mary College, University of London, pp. 125–149, 1976
97. M. Margialot and G. Langholz. Fuzzy Lyapunov–based approach to the design of fuzzy con-
trollers. Fuzzy Sets and Systems 106, pp. 49–59, 1999
136 W. Peeters
98. L. Meyer and X. Feng X. A fuzzy stop criterion for genetic algorithms using performance es-
timation. In Proc. of 3rd IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’94),
Orlando, pp. 1990–1995, 1994
99. M. Ming. On embedding problems of fuzzy number space: part 5. Fuzzy Sets and Systems
55, pp. 313–318, 1993
100. M. Minsky and A. Papert. Perceptrons. MIT Press, 1969
101. M. Mizumoto. Pictorial representations of fuzzy connectives part I: cases of t–norms, t–
conorms and averaging operators. Fuzzy Sets and Systems 31, pp. 217–242, 1989
102. M. Mizumoto. Realization of PID controllers by fuzzy control methods. In IEEE First Int.
Conf. on Fuzzy Systems, number 92CH3073-4. Institute of Electrical and Electronics Engi-
neers Inc, San Diego, pp. 1–16, 1992
103. M. Mizumoto. Improvement of fuzzy control methods. In H. Li and M.M. Gupta, Eds.
International Series In Intelligent Technologies: Fuzzy Logic And Intelligent Systems.
Kluwer Academic Publishers, pp.1–16, 1995
104. M. Mizumoto and J. Tanaka. Some properties of fuzzy numbers. In M.M. Gupta, R.K. Ragade
and R.R. Yager, Eds. Advances in Fuzzy Set Theory and Applications. North–Holland, New
York, pp. 153–164, 1979
105. C.V. Negoita. On te stability of fuzzy systems. Proc. IEEE Internat. Conf. Cybernetics and
Society, , pp. 936–937, 1978
106. H. Nomura, I. Hayashi and N. Wakami. A self–tuning method of fuzzy control by descent
method. Proceedings of the IFSA ’91, Brussels, pp. 155–158, 1991
107. A.M. Norwich and I.B. Turksen. A model for the measurement of membership and the con-
sequences of its empirical implementation. Fuzzy Sets and Systems 12, pp. 1–25, 1985
108. M. Obitko and P. Slavı́k. Visualization of Genetic Algorithms in a Learning Environment. In
Spring Conference on Computer Graphics, SCCG ’99. Bratislava: Comenius University, pp.
101–106, 1999
109. A. Ollero, A. Garcia–Cerezo and J. Aracil. Design of Fuzzy Control Systems. Dpto. Ing. Sist.,
University of Malaga, Research report, 1992
110. S.V. Ovchinnikov. Transitive fuzzy orderings of fuzzy numbers. Fuzzy Sets and Systems 30,
pp. 283–295, 1989
111. K.M. Passino and S. Yurkovich. Fuzzy Control. Addison Wesley Longman Inc., Menlo Park,
CA, USA, 1998
112. R. Pearce and P.H. Cowley P. H. Use of fuzzy logic to overcome constraint problems in genetic
algorithms. In Proc. of 1st IEE/IEEE International Conference on Genetic Algorithms in
Engineering Systems: Innovations and Applications, Sheffield, pp.13–17, 1995
113. W. Pedrycz. An identification algorithm in fuzzy relational equations. Fuzzy Sets and Sys-
tems 13, pp. 153–167, 1984
114. W. Pedrycz. Identification in fuzzy systems. IEEE Trans. Systems, Man and Cybernetics 14,
pp. 361–366, 1984
115. W. Pedrycz. Approximate solutions of fuzzy relational equations. Fuzzy Sets and Systems 28,
pp. 183-202, 1988
116. W. Pedrycz. Fuzzy control and fuzzy systems, 2nd Ed. Wiley, New York, 1993
117. W. Pedrycz. Fuzzy Sets Engineering. CRC Press, 1995
118. W. Pedrycz and M. Reformat. Genetic optimization with fuzzy coding. In [57], pp. 51–67,
1996
119. T.J. Proczyk and E.H. Mamdani. A linguistic self-organizing process controller. Automatica
15(1), pp. 15–30, 1979
120. W. Qiao and M. Mizumoto. PID type fuzzy controller and parameters adaptive method.
Fuzzy Sets and Systems 78, pp. 23–35, 1996
121. K.S. Ray and D.D. Majumder. Application of the Circle Criteria for Stability Analysis of Lin-
ear SISO and MIMO Systems Associated With Fuzzy Logic Controller. IEEE Trans. Systems,
Man and Cybernetics, 14(2), pp. 345–349, 1984
122. K.S. Ray, A. Ghosh and D.D. Majumder. L2 -Stability and the Related Design Concept for
SISO Linear System Associated With Fuzzy Logic Controllers. IEEE Trans. Systems, Man
and Cybernetics, 14(6), pp. 932–939, 1984
An Overview of Fuzzy Control Theory 137
150. M. Vidyasagar. New directions of research in nonlinear systems theory. Proc. of the IEEE
77(8), pp. 1060–1090, 1986
151. S. Voget. Multiobjective optimization with genetic algorithm and fuzzy control. In Proc.
of the 4th European Conference on Intelligent Techniques and Soft Computing (EUFIT
Aachen’96), pp. 391–394, 1996
152. H.M. Voigt. Fuzzy evolutionary algorithms. Technical Report 92-038, International Com-
puter Science Institute (ICSI), 1947 Center Street, Suite 600, Berkeley, CA, 94704, 1992
153. H.M. Voigt, J. Born and I. Santibanez-Koref . A multivalued evolutionary algorithm. Tech-
nical Report 93-022, International Computer Science Institute (ICSI), 1947 Center Street,
Suite 600, Berkeley, CA, 94704, 1993
154. H.M. Voigt, H. Muhlenbein and D. Cvetkovic D. Fuzzy recombination for the continuous
breeder genetic algorithm. In Proc. of the 6th International Conference on Genetic Algo-
rithms (ICGA’95), Pittsburgh, pp. 104–111, 1995
155. M. Wakami and H. Terai. Application of fuzzy theory to home appliances. in: K. Hirota
Industrial Applications of Fuzzy Technology, Tokyo, Berlin, Heidelberg, pp. 283–310, 1993
156. P.Y. Wang, G.S. Wang, Y.H. Song and A.T. Johns. Fuzzy logic controlled genetic algorithms.
In Proc. of 5th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’96), vol. 2,
New Orleans, pages 972–979, 1996
157. S. Weber. A general concept of fuzzy connectives, negations and implications based on t–
norms and t–conorms. Fuzzy Sets and Systems 11, pp. 115–134, 1983
158. H.Y. Xu and G. Vukovich. A fuzzy genetic algorithm with effective search and optimization.
In Proc. International Joint Conference on Neural Networks (IJCNN’93), Nagoya, pp. 2967–
2970, 1993
159. R.R. Yager. A procedure for ordering fuzzy subsets of the unit interval. Information Sci. 24,
pp. 143–151, 1981
160. R.R. Yager and D.P. Filev. SLIDE: A simple adaptive defuzzification method. IEEE Trans.
Fuzzy Systems 1(1), pp. 69–78, 1993
161. Y. Yamashita, S. Matsumoto and M. Suzuki. Start-up of a catalytic reactor by fuzzy con-
troller. J. Chemical Engineering of Japan, 21, pp. 277–281, 1988
162. T. Yamazaki and E.H. Mamdani. On the performance of a rule-based self–organising con-
troller. Proc. of IEEE Conf. on Applications of Adaptive and Multivariable Control. Hull,
England, pp. 50–55, 1982
163. S. Yasunobu and S. Miamoto. Automatic train operation by predictive fuzzy control. In
M. Sugeno. Industrial Applications of Fuzzy Control., Amsterdam, New York, 1985
164. L.A. Zadeh. Fuzzy sets. Information and Control 8, pp. 338–353, 1965
165. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision
processes. IEEE Trans. Syst. Man. Cybernet., 3, pp. 28–44, 1973
166. L.A. Zadeh. The concept of a linguistic variable and its application to approximate reason-
ing. Information Sci. 8, pp. 199–249 and 9, pp. 43–80, 1975
167. G. Zames. On the I–O Stability of Time Varying Nonlinear Feedback Systems. IEEE Trans.
on Automatic Control, 11, pp. 228–238, 1966a
168. H.J. Zimmermann. Fuzzy Set Theory and Its Applications. Kluwer Academic, Boston/
Dordrecht/London, 1996
Optimal Fuzzy Management of Reservoir
based on Genetic Algorithm
Abstract This chapter deals with water resource management problems approached from
an Automatic Control point of view. The motivation for the study is the need for
an automated management policy for an artificial reservoir (dam). A hybrid model
of the reservoir is considered and implemented in Stateflow/Simulink, and a fuzzy
decision mechanism is implemented in order to produce different water release
strategies. A new cost functional is proposed that weighs the users' desiderata (in
terms of water demand) against water waste (in terms of water spills). The parameters
of the fuzzy system are optimized by employing Genetic Algorithms, which
have proved very effective due to the strong nonlinearity of the problem. Modified
AR and ARMAX models of the inflow are identified, and Monte Carlo simulations
are used to test the effectiveness of the proposed strategy in different operating
scenarios.
1 Introduction
Alberto Cavallo
Dipartimento di Ingegneria dell’Informazione, Seconda Università degli Studi di Napoli,
via Roma 29, 81031 Aversa, Italy
Armando Di Nardo
Dipartimento di Ingegneria Civile, Seconda Università degli Studi di Napoli, via Roma 29, 81031
Aversa, Italy
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 139–159.
© 2008 Springer.
In this section the basics of the release policy are presented. The life cycle of the
reservoir has been divided into two conditions [8]:
1. Ordinary management condition
2. Emergency management condition
The first refers to the case where, in a given time interval, the total available water
volume is not less than the required one. In this case there is enough water to satisfy
the users' demand, and the decision strategy must select whether to supply all the
water the users ask for or to save some water for possible future needs. Note that,
due to evaporation losses, overly conservative strategies would result in water waste
without fulfilling future users' demand. The second management condition arises
during drought periods. In this case the system enters an “emergency operation
condition”, where reduced water flows are supplied while trying to minimize the discomfort
of the users.
The decision strategy is based on the values of $h(t)$, the water level in the reservoir
at time $t$, and $\dot h(t)$, the height rate, as internal variables, and on $q^{id}_{ref}(t)$, the “ideal” (i.e.
in the case of infinite water availability) desired water supply, the current month $m$
and the water inflow $q_{in}(t)$ as external variables. It produces the water supply $q_{out}$,
taking into account current and foreseen water availability. Basically, the idea is to use a
set of empirical rules to define the water release as a function of the input variables.
This can be implemented naturally by using heuristic fuzzy rules. The rules will
later be optimized by using a genetic algorithm. However, in order to design the control
laws for the reservoir operations, the mathematical structure of the reservoir must
be examined first.
A typical profile of the water inflow $q_{in}(t)$ and of the required outflow $q^{id}_{out}(t)$ is
depicted in Figure 1 over a time span of 24 months. Note that the two curves
are, roughly speaking, out of phase by six months, corresponding to water
availability and demand during the wet and dry seasons. The mathematical model of the
dynamics of the reservoir is described by the differential equation:
$$\dot V(t) = q_{in}(t) - q_{out}(t) - q_{ev}(t), \qquad (1)$$
where $V(t)$ is the reservoir volume at the generic time instant $t$, which depends on
the geometry of the reservoir, and $q_{ev}(t)$ is the evaporation. In particular
$V = \int_0^h A(\lambda)\,d\lambda$, where $A(h)$ is the area of the water surface and $h$ is the water height
in the reservoir. By applying the chain rule:
$$\dot V = \frac{dV}{dh}\,\dot h = A(h)\,\dot h. \qquad (2)$$
[Figure 1: water flows $q_{in}$ and $q^{id}_{out}$ in m³/month (axis scale ×10^7) versus time in months.]
The evaporation $q_{ev}(t)$ is usually modelled via an evaporation coefficient $k_{ev}(t)$
deduced from the reservoir's losses at time instant $t$:
$$q_{ev}(t) = k_{ev}(t)\,A(h). \qquad (3)$$
Physically, the volume is bounded from below by the “dead volume”, hence $A(h) \neq 0$.
Thus, the model of the reservoir can be written
$$\dot h = -k_{ev} + \frac{1}{A(h)}\left(q_{in}(t) - q_{out}(t)\right). \qquad (4)$$
Finally, a simple discrete time version of eqn. (4), computed at time instants
$t = kT$, $k = 0, 1, \dots$, can be derived using an integration stepsize $T = 4$ hours
$$h[(k+1)T] = h(kT) - k_{ev}(kT)\,T + \frac{1}{A[h(kT)]}\left[q_{in}(kT) - q_{out}(kT)\right]T. \qquad (5)$$
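The update (5) is a forward Euler step and can be sketched in a few lines. The sketch below assumes time measured in months (so that T = 1/180 month corresponds to 4 h in a 30-day month), flows in m³/month and an evaporation coefficient in m/month; the function names, the constant-area reservoir and all numerical values are hypothetical.

```python
def height_step(h, q_in, q_out, k_ev, area, T=1.0 / 180.0):
    """One step of eq. (5): returns h[(k+1)T] from the values at time kT."""
    return h - k_ev * T + (q_in - q_out) * T / area(h)


def constant_area(h):
    """Hypothetical reservoir with vertical banks: 1e6 m^2 at any level."""
    return 1.0e6


h = 20.0  # m, illustrative initial level
for _ in range(180):  # 180 steps of 4 h = one 30-day month
    h = height_step(h, q_in=3.0e6, q_out=1.2e6, k_ev=0.1, area=constant_area)
```

With constant inputs and a constant area the level changes linearly, here by (3.0e6 − 1.2e6)/1e6 − 0.1 = 1.7 m over the month; a level-dependent `area` function makes the update nonlinear.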
Equation (5) describes the hydraulic balance in the reservoir only if the water volume
belongs to a given interval at each time instant, i.e.
$$V_{min} \le V(t) \le V_{max}, \qquad (6)$$
where $V_{min}$ is the dead volume and $V_{max}$ is the reservoir volume, depending on the
dam height. If $V(t)$ tends to increase above $V_{max}$, an overflow $q_{sp}$ (water spill) occurs,
while if it falls below $V_{min}$ it will be impossible for the dam to supply any
desired flow. The above consideration naturally suggests a hybrid model for the
reservoir, in which three states of the reservoir can be identified.
Some additional variables are defined, namely the tentative water volume Vt and
the actually released water flow qact . The hybrid model of the reservoir encompasses
three states (conditions), as follows.
1. A standard condition (NORMAL), when the bounds (6) are satisfied and eq. (5)
applies
2. An overflow condition (SPILLS), where the water volume is constrained to its
maximum value
3. A drought condition (EMPTY), where no water can be supplied to the user ($q_{act} = 0$)
and no evaporation occurs (at least approximately: a small amount of evaporation
does occur, but it can be neglected)
The input variables are the water volume at the previous step, the current water
inflow, outflow and assumed evaporation, while the outputs are the water volume
in the reservoir, the corrected evaporation and the spills (needed to compute the
performance indices in Section 7).
Finally, a fixed integration step T = 1/180 month (i.e. 4 h) is considered. The resulting
statechart is reported in Figure 2.
The Stateflow element is integrated into a MATLAB/SIMULINK simulation
scheme, to be used to evaluate and compare different operation strategies.
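The three-state logic can be sketched outside Stateflow as a single update function. This is an illustrative reading of the description above, not the authors' chart: the function name, the transition tests on the tentative volume and, in particular, the EMPTY-state update (inflow still accumulating while nothing is released) are assumptions.

```python
def reservoir_step(V, q_in, q_out, q_ev, V_min, V_max, steps_per_month=180):
    """Advance the volume by one step.

    Returns (V_new, q_act, q_ev_c, q_sp, state), with flows per month.
    """
    Vt = V + (q_in - q_out - q_ev) / steps_per_month  # tentative volume
    if Vt > V_max:
        # SPILLS: volume held at its maximum, the excess leaves as spill.
        q_sp = (Vt - V_max) * steps_per_month
        return V_max, q_out, q_ev, q_sp, "SPILLS"
    if Vt < V_min:
        # EMPTY: no release and (approximately) no evaporation.
        # Assumption: the inflow still accumulates in the reservoir.
        V_new = min(V + q_in / steps_per_month, V_max)
        return V_new, 0.0, 0.0, 0.0, "EMPTY"
    # NORMAL: bounds (6) hold and the plain balance applies.
    return Vt, q_out, q_ev, 0.0, "NORMAL"


normal = reservoir_step(100.0, 1800.0, 900.0, 0.0, V_min=10.0, V_max=120.0)
spills = reservoir_step(118.0, 1800.0, 0.0, 0.0, V_min=10.0, V_max=120.0)
empty = reservoir_step(12.0, 0.0, 900.0, 0.0, V_min=10.0, V_max=120.0)
```

Returning the corrected evaporation and the spill alongside the volume mirrors the outputs listed for the hybrid model, so the same function could feed a performance index.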
The fuzzy automatic decision system defines, in real time, the “actual outflow” in
the case of “emergency management conditions”. As stated above, the key idea is
to modulate the outflow, i.e. to decide a multiplicative (time-varying) factor $\rho(t)$,
with $\rho \in [0, 1]$, such that
$$q_{out}(t) = \rho(t)\,q^{id}_{out}(t) \qquad (7)$$
is the released water, expressed as a fraction of the ideal one. As is known in the
literature (e.g. [10] and references therein), fuzzy systems turn numeric inputs
into numeric outputs by way of linguistic knowledge. Moreover, strategy (7) naturally
suggests the use of a Sugeno-type FIS [23].
[Figure 2: Stateflow statechart of the hybrid reservoir model. Of the chart, only the actions of the NORMAL state are recoverable:
NORMAL: q_ev_c = q_ev; q_sp = 0; q_act = q_out; Vt = V; Vt = Vt + (q_in-q_act-q_ev)/180; V = Vt;]
The core of a fuzzy system is its set of linguistic rules. In this study, in order to
take into account the knowledge of the reservoir management operator, the following
Sugeno-type rule system is developed:
$$R^{(l)} : \text{if } x_1 \text{ is } P_1^{(l)} \text{ and } \dots \text{ and } x_n \text{ is } P_n^{(l)} \text{ then } y = \rho^{(l)} y^{id}$$
where
• $x_i \in U_i \subset \mathbb{R}$ is the i-th input linguistic variable in the universe of discourse $U_i \subset \mathbb{R}$,
i = 1, ..., n.
• $y \in S \subset \mathbb{R}$ is the output linguistic variable in the universe of discourse $S$, expressed
as the product of a coefficient $\rho^{(l)} \in [0, 1]$ and an ideal output $y^{id} \in S$.
• $P_i^{(l)}$ is the fuzzy set referred to the i-th input variable and the l-th decision rule,
i = 1, ..., n, l = 1, ..., r.
• $\rho^{(l)} \in C_1 \subset [0, 1]$ is a crisp multiplier for the l-th rule, l = 1, ..., r, assuming
values in the set $C_1$, with cardinality $\gamma_1$. This is a “reduction factor” of the output
with respect to an “ideal” output.
The range of values of the coefficient $\rho^{(l)}$ is chosen so as to reduce the user's
water demand. In particular, a system of decision rules consisting of r = 13 rules and
$\gamma_1 = 6$ levels of output reduction has been selected, of the form
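$$R^{(l)} : \text{if } h \text{ is LOW and } \dot h \text{ is ZERO and month is DRY then } q_{out} = \text{LITTLE} \cdot q^{id}_{out}$$
with linguistic values and variables:
$$
\begin{aligned}
x_1 &= h \\
x_2 &= \dot h \\
x_3 &= \text{month} \\
x_4 &= q^{\Sigma}_{in}(t) = \int_0^{1\,\text{year}} q_{in}(t - \tau)\,d\tau \\
P_1 &= \{\text{LOW}, \text{HIGH}\} \\
P_2 &= \{\text{NEGATIVE}, \text{ZERO}, \text{POSITIVE}\} \\
P_3 &= \{\text{DRY}, \text{WET}\} \\
P_4 &= \{\text{DROUGHT}, \text{NOT DROUGHT}\} \\
C_1 &= \{\text{NOTHING}, \text{VERY LITTLE}, \text{LITTLE}, \text{MUCH}, \text{VERY MUCH}, \text{EMERGENCY}\}
\end{aligned}
$$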
where $q^{\Sigma}_{in}$ is the cumulative value of the inflow over the last year. The choice of the
variables has the following rationale: $h$ accounts for the water currently
available, $\dot h$ for the presumed future volume trend, month for the expected future inflow, and $q^{\Sigma}_{in}$ for the
past inflow history. Based on these variables, the decision strategy tries to foresee
the water availability to satisfy current and future customers' requirements, suitably
reducing water supply in the case of hypothetical future negative scenarios.
The heuristic rules are summarized in Table 1.
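How a rule system of this form turns inputs into a release can be sketched as a zero-order Sugeno inference: each rule fires with some strength, and the output multiplier is the firing-strength-weighted average of the rule multipliers. The sketch below is illustrative only. The membership functions, the two rules and the numerical multipliers are hypothetical (they are not the entries of Table 1), and the minimum is assumed for the "and" connective.

```python
def ramp_down(x, lo, hi):
    """Membership equal to 1 below lo, 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def ramp_up(x, lo, hi):
    """Complement of ramp_down: 0 below lo, 1 above hi."""
    return 1.0 - ramp_down(x, lo, hi)


def sugeno_release(inputs, rules, q_id):
    """Weighted average of the rule multipliers, applied to the ideal outflow.

    Each rule is (antecedent, rho): antecedent maps an input name to a
    membership function, rho is the crisp reduction factor in [0, 1].
    """
    num = den = 0.0
    for antecedent, rho in rules:
        # "and" realised as min over the antecedent memberships (assumption)
        w = min(mu(inputs[name]) for name, mu in antecedent.items())
        num += w * rho
        den += w
    if den == 0.0:
        raise ValueError("no rule fires for these inputs")
    return (num / den) * q_id


# Hypothetical two-rule base on the level h only (metres).
rules = [
    ({"h": lambda h: ramp_down(h, 10.0, 30.0)}, 0.4),  # h is LOW  -> release 40 %
    ({"h": lambda h: ramp_up(h, 10.0, 30.0)}, 1.0),    # h is HIGH -> full release
]
q_out = sugeno_release({"h": 20.0}, rules, q_id=2.0e7)
```

At h = 20 m both rules fire with strength 0.5, so the multiplier is the midpoint 0.7 of the two reduction factors; further inputs such as the height rate or the month enter simply as extra keys in each antecedent.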
Three different reservoir management strategies have been designed and analyzed.
As already stated, the first fuzzy strategy, FOP, was developed with an empirical
approach: in particular, both the shape and the values of the membership functions have
been chosen by exploiting the expertise of reservoir operators and then tuned by simulation.
However, this way of operating does not guarantee the optimal fulfillment of operating
rules because of the large number of parameters involved. Moreover, the reservoir
management problem is strongly nonlinear and time-varying, and it is necessary to
apply an efficient optimization technique. In this context, as stated in the Introduction,
Genetic Algorithms (GAs) [11, 17] have been recognized as a suitable tool
to solve the optimization problem, since they are conceptually powerful, yet
flexible and relatively easy to implement. The Matlab GA Toolbox has been used to
optimize the 21 fuzzy parameters with historical data as input. The problem is a
nonlinear and constrained optimization problem since, in order to preserve the linguistic
meaning of the fuzzy rules presented in Section 2.4, it is necessary to constrain all the
variables. For instance, the following upper and lower bounds have been imposed
on the 6 output variables $C_1$
where $w(q_{sp})$ is a fuzzy weighting function which penalizes situations with high
spills. This accounts for the fact that saving more water can alleviate droughts
but increases water waste due to spills and evaporation.
The GA solution is obtained with a population size of 40 individuals and with
the following principal GA parameters:
• Crossover Fraction = 0.80
• Migration Interval = 20
• Migration Fraction = 0.20
• Initial Penalty = 10
• Penalty Factor = 100
Finally, OFOP starts the GA optimization using the FOP solution as a starting
guess. In this way, the optimization solver is allowed to start from a “good” starting
guess, and trivial local minima are avoided a priori.
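The idea of a bound-constrained, warm-started genetic search can be sketched in a few lines of Python. This is not the Matlab GA Toolbox used in the study and it does not reproduce its migration or penalty settings: it is a generic real-coded GA written for illustration, in which bounds are enforced by clipping, and the function name `bounded_ga`, the selection scheme, the crossover and mutation operators and the toy cost function are all assumptions.

```python
import numpy as np


def bounded_ga(cost, lower, upper, x0=None, pop_size=40, generations=50,
               crossover_fraction=0.8, seed=0):
    """Minimize cost(x) subject to lower <= x <= upper with a simple real-coded GA."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    if x0 is not None:
        # Warm start: seed the population with a known good guess.
        pop[0] = np.clip(x0, lower, upper)
    n_elite = 2
    n_cross = int(round(crossover_fraction * (pop_size - n_elite)))
    for _ in range(generations):
        costs = np.array([cost(ind) for ind in pop])
        pop = pop[np.argsort(costs)]          # best individuals first
        elite = pop[:n_elite].copy()          # elitism keeps the best so far
        children = []
        for i in range(pop_size - n_elite):
            # Binary tournament on the sorted population: lower index wins.
            p1 = pop[min(rng.integers(pop_size, size=2))]
            p2 = pop[min(rng.integers(pop_size, size=2))]
            if i < n_cross:
                alpha = rng.random(lower.size)
                child = alpha * p1 + (1.0 - alpha) * p2      # blend crossover
            else:
                step = 0.1 * (upper - lower) * rng.standard_normal(lower.size)
                child = p1 + step                            # Gaussian mutation
            children.append(np.clip(child, lower, upper))    # enforce the bounds
        pop = np.vstack([elite, np.array(children)])
    costs = np.array([cost(ind) for ind in pop])
    best = int(np.argmin(costs))
    return pop[best], float(costs[best])


# Toy usage: three bounded parameters and a hypothetical quadratic cost.
target = np.array([0.2, 0.5, 0.9])
toy_cost = lambda x: float(np.sum((x - target) ** 2))
start = np.array([0.3, 0.4, 0.8])
best_x, best_cost = bounded_ga(toy_cost, [0, 0, 0], [1, 1, 1], x0=start)
```

Because the starting guess is part of the initial population and the best individuals are always carried over, the returned solution can never be worse than the starting guess, which mirrors the motivation given for initializing OFOP from the FOP solution.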
Each policy has its advantages and drawbacks. Therefore
the three strategies SOP, FOP and OFOP are compared using different performance
indices, not all of them mutually homogeneous, in order to evaluate the effectiveness
of the proposed approaches from different points of view. In particular, the following
performance indices are defined:
Optimal Fuzzy Management of Reservoir based on Genetic Algorithm 149
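• Volumetric Reliability: $\dfrac{\sum_t q_{out}(t)}{\sum_t q_{out}^{id}(t)} \times 100$
• Integral of Squared Deficits: $\sum_t \left( q_{out}(t) - q_{out}^{id}(t) \right)^2$
• Deficit Frequency: $\dfrac{\sum_t d(t)}{T} \times 100$, with $T$ the number of months considered
• Maximum Seasonal Deficit: $\dfrac{\max_i d_i^s}{\sum_t q_{out}^{id}(t)} \times 100$
• Total Spills: $\sum_t q_{sp}(t)$
• Total Evaporation: $\sum_t q_{ev}(t)$
where
\[ d(t) = \begin{cases} 1 & \text{if } q_{out}(t) < q_{out}^{id}(t) \\ 0 & \text{otherwise} \end{cases} \]
and
\[ d_i^s = \sum_{t=1}^{12} \left( q_{out}^{id,i}(t) - q_{out}^{i}(t) \right), \quad i = 1, \ldots, n \tag{11} \]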
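The indices above can be computed directly from the simulated monthly series. The sketch below is one possible reading of the definitions, under two assumptions that are not stated explicitly in the text: the series length is a whole number of 12-month years, and the Deficit Frequency is normalized by the total number of months. The function name and the dictionary keys are hypothetical.

```python
import numpy as np


def performance_indices(q_out, q_id, q_sp, q_ev):
    """Compute the six performance indices from monthly time series."""
    q_out = np.asarray(q_out, dtype=float)
    q_id = np.asarray(q_id, dtype=float)
    deficit = q_id - q_out
    # d(t) = 1 in the months where the release is below the ideal demand.
    in_deficit = q_out < q_id
    # Yearly (seasonal) deficits d_i^s; assumes len(q_out) is a multiple of 12.
    seasonal = deficit.reshape(-1, 12).sum(axis=1)
    return {
        "volumetric_reliability": q_out.sum() / q_id.sum() * 100.0,
        "integral_squared_deficits": float(np.sum((q_out - q_id) ** 2)),
        # Assumption: percentage of months in deficit.
        "deficit_frequency": in_deficit.mean() * 100.0,
        "max_seasonal_deficit": seasonal.max() / q_id.sum() * 100.0,
        "total_spills": float(np.sum(q_sp)),
        "total_evaporation": float(np.sum(q_ev)),
    }


# Hypothetical two-year example: demand of 10 units per month,
# only 5 units released during the first six months.
q_id = np.full(24, 10.0)
q_out = q_id.copy()
q_out[:6] = 5.0
idx = performance_indices(q_out, q_id, q_sp=np.zeros(24), q_ev=np.ones(24))
```

In this toy case the reliability is 87.5%, one quarter of the months are in deficit, and the worst year accounts for 12.5% of the total demand.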
The above procedure has one main disadvantage: the
result depends not only on the ability of the genetic algorithm to find a “good”
suboptimum, but also on the historical inflow data entering the system. If, for
instance, more water were available, different results would be obtained. The
problem is that the above procedure heavily relies on the vector of data qin (·), that is
actually a single realization of a stochastic process. Thus, the proposed strategy is
prone to the risk of overfitting a single (although significant) case, thus resulting
in a low level of generality. A possible, classic alternative is to use only a sub-
set of the data for the optimization, while the remaining data are employed for an
“objective” assessment (validation) of the result. However, this approach is accept-
able only when plenty of time-history data are available. In the present case, the
data are characterized by two dramatic events: a large peak in the first half of the
time history (around month 120), and a large drought in the second half (months
320–350). Thus, halving the data inevitably implies loss of meaningful pieces of
information. When few data are available, it is advisable to “generate” new data by
running a simulation model that condenses the statistics of the inflow process. This can
be accomplished by identifying a dynamical model of the inflow time history [4]
and using a random generator to produce simulated inflow processes, i.e. vectors of
random numbers preserving the statistics of the original process [15]. Thus the
decision strategy is defined on the whole original set of data, and its performance is
assessed by checking its behavior when inputs generated by the identified model are
used as new inputs.
Although there are plenty of mathematical tools for dealing with identification
problems, a deep understanding of the physics of the phenomenon to identify is still
necessary in order to obtain good results. In the case of the considered inflow, a
record of 36 years of monthly precipitations, the following considerations can be
deduced by looking at the time history in Figure 3.
• Occasionally, large values of the inflow appear.
• In most cases, very low values (close to zero) occur. This behavior closely
resembles what is called an “intermittent time series”, although, strictly speaking,
intermittent time series must have zero values [22].
• The behavior exhibits a clear periodicity, mainly due to the seasonal repetition.
The first step in identifying a dynamic system or a time history is to prefilter
the data. Generally, everything that can be easily extracted from the data, such as mean and
trends, is removed. In the case of hydrologic seasonal data, and in general when
periodic behaviors are present, simply removing the mean value has little effect. It is
better to remove seasonal means and to perform a seasonal normalization, in order
to obtain data where only stationary stochastic behaviors are present. This can be
accomplished as follows. Let $Q_{in}^k(t)$, $t = 1, \ldots, 12$, $k = 1, \ldots, 36$, be the inflow related
to year $k$ and month $t$. Then the seasonal mean (monthly mean) is estimated as
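\[ \bar{Q}_{in}(t) = \frac{1}{36} \sum_{k=1}^{36} Q_{in}^k(t), \quad t = 1, \ldots, 12, \tag{12} \]
and the seasonal variance as
\[ S_Q^2(t) = \frac{1}{35} \sum_{k=1}^{36} \left( Q_{in}^k(t) - \bar{Q}_{in}(t) \right)^2, \quad t = 1, \ldots, 12. \tag{13} \]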
The time series to identify is next normalized by removing the seasonal mean and
variance.
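A seasonal normalization of this kind can be sketched in Python as follows. The sketch assumes the usual standardization, i.e. subtracting the monthly mean (12) and dividing by the square root of the monthly variance (13); the exact normalization formula is an assumption here, as are the function name and the synthetic test data.

```python
import numpy as np


def seasonal_standardize(q):
    """Standardize monthly data month by month.

    q : array of shape (years, 12), with q[k, t] the inflow of year k, month t.
    Returns the standardized series, the monthly means and the monthly variances.
    """
    q = np.asarray(q, dtype=float)
    mean = q.mean(axis=0)             # seasonal mean, one value per month
    var = q.var(axis=0, ddof=1)       # unbiased seasonal variance (divisor K - 1)
    z = (q - mean) / np.sqrt(var)     # assumed normalization
    return z, mean, var


# Synthetic, purely illustrative data: 36 years of positive monthly inflows
# with a seasonal pattern in the scale.
rng = np.random.default_rng(0)
scale = 1.0 + np.cos(2.0 * np.pi * np.arange(12) / 12.0) ** 2
q = rng.gamma(shape=2.0, scale=scale, size=(36, 12))
z, monthly_mean, monthly_var = seasonal_standardize(q)
```

After the transformation each month has zero sample mean and unit sample variance, so that any remaining structure is in the correlation between consecutive values.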
member of the family, i.e. the model fitting the data optimally according to the given
criterion.
The operation is performed by using the System Identification Toolbox of Matlab,
which implements a large set of techniques based on classical concepts [15].
Using the classical prediction error as optimality criterion, the following families
are inspected.
1. ARX (Auto Regressive with eXogenous input)
2. ARMAX (Auto Regressive Moving Average with eXogenous input)
3. State-space
Moreover, the model order is selected along with the model family. The worst results are
obtained with the state-space model, essentially because there is no sharp variation
in the singular values of the Hankel matrix [14], hence it is not easy to select the
“right” order. As far as the ARX model is concerned, two popular criteria for
selecting model complexity are used, i.e. the FPE (Final Prediction Error) and the AIC
(Akaike Information Criterion) [15]. For the sake of notational simplicity,
let us drop all the subscripts and denote the time history to identify by
q(t) and the fictitious input defined above by u(t). The model obtained by minimizing
the FPE is an ARX(1, 3) with a three-step delayed input, i.e. q(t) = a1 q(t − 1) +
b1 u(t − 3) + b2 u(t − 4) + b3 u(t − 5), while the AIC gives an ARX(1, 1) with the
same delay, i.e. q(t) = a1 q(t − 1) + b1 u(t − 3). However, in both cases the values
of the input coefficients are very small, and below their standard deviation,
which means that they are hardly reliable. Since from a physical point of view the
input is only a marker of the seasonal periodicity of the data, it can be concluded that all the
seasonality has been removed from the data in the pretreatment phase
(or, more correctly, that there is no further evidence of a definite yearly pattern in the
data when using an ARX-family model). The next step is thus to remove the
fictitious input and to identify the time sequence by using a simple AR model. In
this case both the AIC and the FPE give an AR(1) as the best model; in particular, the
result is
y(t) = 0.37(±0.063)y(t − 1) + ξ (t) (16)
where also the standard deviation of the estimate has been indicated and ξ (t) is a
white Gaussian noise, as can be easily verified by using suitable whitening tests (e.g.
Anderson’s test) and normality tests (e.g. Kolmogorov–Smirnov test for normality).
Moreover, an AR(3) model has also been tested, motivated by the three-step de-
lay computed with the ARX model above. The identification shows that actually a
special AR(3) model gives a good result, namely one with zero two-step delay:
However, this model is equivalent to model (16) from a prediction error criterion
point of view, hence the former is preferred for its lower complexity.
from which the following considerations are deduced. The ARMAX model is able
to detect a slight periodicity in the data, although with relatively high variance and
hence low reliability. Moreover, the model is considerably more complex than the
AR(1), and simply trying to identify an ARMA model by removing the fictitious
input, as in the AR case, leads to a completely unreliable model, with coefficients
whose standard deviations are larger than the coefficients themselves. On the other
hand, the global improvement obtained by using such a model is not worth the increase in
complexity, hence the model (16) is selected.
The model thus deduced is used for simulation, by feeding the identified system
with a Gaussian pseudo-white noise with variance computed from the model error
variance. A plot of a realization of the simulated inflow vs the true data is shown in
Figure 4.
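The generation of synthetic inflows from the identified model can be sketched as follows. Only the AR coefficient 0.37 is taken from model (16); everything else is an assumption made for illustration: the noise variance is set so that the standardized series has unit stationary variance, the normalization is inverted as mean plus standard deviation times the simulated value, negative inflows are clipped to zero, and the function name and the monthly statistics in the usage lines are hypothetical.

```python
import numpy as np


def simulate_inflow(monthly_mean, monthly_var, years, a=0.37,
                    noise_var=None, seed=0):
    """Simulate monthly inflows by driving an AR(1) model with Gaussian white noise."""
    rng = np.random.default_rng(seed)
    if noise_var is None:
        # Assumption: choose the noise variance giving unit stationary variance.
        noise_var = 1.0 - a ** 2
    n = 12 * years
    xi = rng.normal(0.0, np.sqrt(noise_var), size=n)
    y = np.empty(n)
    # Start from the stationary distribution of the AR(1) process.
    y[0] = rng.normal(0.0, np.sqrt(noise_var / (1.0 - a ** 2)))
    for t in range(1, n):
        y[t] = a * y[t - 1] + xi[t]           # y(t) = a y(t-1) + xi(t)
    y = y.reshape(years, 12)
    # Undo the (assumed) seasonal normalization, month by month.
    q = np.asarray(monthly_mean, dtype=float) + np.sqrt(monthly_var) * y
    # Assumption: a Gaussian model can produce negative values; clip them.
    return np.clip(q, 0.0, None), y


# Hypothetical monthly statistics, for illustration only.
mean = np.full(12, 5.0)
var = np.full(12, 4.0)
q_sim, y_sim = simulate_inflow(mean, var, years=2000)
lag1 = np.corrcoef(y_sim.ravel()[:-1], y_sim.ravel()[1:])[0, 1]
```

Repeating the call with different seeds yields independent realizations of the inflow, which is the mechanism behind a Monte Carlo assessment of the decision strategies.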
7 Case Study
The methodology developed in this paper has been applied to the
management of the Pozzillo reservoir, on the Salso River in Sicily (Italy). The Pozzillo reservoir
is a multipurpose system (hydroelectric, irrigation and municipal); the basin area is
about $577\ \mathrm{km}^2$ and the net storage is $123 \times 10^6\ \mathrm{m}^3$.
The available data refer to the years 1962–1998 and include 432 months of
water inflow $q_{in}$, represented in Figure 3, monthly evaporation rates $k_{ev}$, the reservoir
volume as a function of the water height, $V = V(h)$, and the ideal water demand $q_{out}^{id}$.
Referring to the hydrologic year (October–September), it is possible to see the recent
drought events that struck Southern Italy in the years 1988–1990.
The three strategies SOP, FOP and OFOP have been tested both with the
available historical data from 1962–1998 and with 10,000 Monte Carlo runs based
on historical data, as explained in Section 6. The results from the historical data are
described below.
In Figure 5 it is possible to observe several months in which the reservoir does not
succeed in fulfilling the water demand; in particular, during the drought in months
320–340, the SOP strategy is unable to reduce the customer discomfort.
A dramatic improvement is obtained with the FOP strategy, as the water crisis is
prevented by preserving water in the previous months and releasing it in the drought
months (Figure 6).
The OFOP strategy performs even better, and represents an optimal solution for
improving the reservoir management. In fact, as shown in Figure 7, during the
winters, when the water demand is smaller, the demand is
almost completely satisfied, while in summer drought months, for example in months
390–400, the strategy behaves better in overcoming the crisis, guaranteeing a
reduced (but non-null) water yield to the user.
[Figure 5: flow [m³/month] (scale ×10^7) versus time [months], months 240–420; curves: $q_{out}^{id}$, $q_{out}$.]
[Figure 6: flow [m³/month] (scale ×10^7) versus time [months], months 240–420; curves: $q_{act}$, $q_{out}^{id}$, $q_{out}$.]
[Figure 7: flow [m³/month] (scale ×10^7) versus time [months], months 240–420; curves: $q_{act}$, $q_{out}^{id}$, $q_{out}$.]
Table 2 Performance indices of Pozzillo reservoir operation during 1982–1998 (historical data)
Operat. Policy | Volum. Reliab. (%) | Sum of Sq. Def. ($10^5\ \mathrm{m}^3$) | Def. Freq. (%) | Tot. Spill ($10^7\ \mathrm{m}^3$) | Tot. Evap. ($10^7\ \mathrm{m}^3$) | Max. Mean Seas. Def. (% demand)
Table 3 Performance indices of Pozzillo reservoir operation during 1982–1998 (mean values on
10,000 Monte Carlo runs)
Operat. Policy | Volum. Reliab. (%) | Sum of Sq. Def. ($10^5\ \mathrm{m}^3$) | Def. Freq. (%) | Tot. Spill ($10^7\ \mathrm{m}^3$) | Tot. Evap. ($10^7\ \mathrm{m}^3$) | Max. Mean Seas. Def. (% demand)
Naturally, as already noted for the FOP strategy, the improved result depends
on the fact that the user is generally given less water than required, because the
fuzzy strategies save some of the resource for possible future shortages. Nevertheless, such
a criterion presents some disadvantages, namely the increase of water spills and water
evaporation. Table 2 therefore reports a comparison between the three strategies,
based on a simulation with the historical data.
From Table 2 it is possible to note that the Sum of Squared Deficits is drastically
reduced as the strategy changes from SOP to OFOP. However, this happens at the
expense of Deficit Frequency and Volumetric Reliability, because the fuzzy strategy and
the optimized fuzzy strategy preserve water in some previous months and,
as a consequence, spills and evaporation losses increase.
In order to perform a more objective test, a campaign of 10,000 Monte Carlo runs
has been performed on the three analyzed strategies. The results obtained from the
historical data are confirmed by the Monte Carlo approach, which presents better
values for all performance indices because the historical data are strongly affected
by a heavy drought period (see Table 3).
The superiority of the OFOP approach from the point of view of the minimization
of the sum of squared deficits is apparent with respect to both the SOP and the FOP, and is
confirmed by using an Optimal Comparison Technique [1] for testing the hypothesis
of superiority of the OFOP decision strategy compared to the others at any common
level of significance (e.g. α = 5% or α = 1%).
A final observation concerns the actual water availability. Indeed, the proposed
strategy simply modulates the required water. However, in periods of severe drought
it can happen that the reservoir is unable to satisfy even a reduced demand. This
explains why in Figures 6 and 7 the new variable $q_{act}$ appears: it is the water outflow
actually released to the user, i.e. what the reservoir is able to yield, which is what really
matters to the user. A final figure comparing the real outflows $q_{act}$ in the three cases
is useful to stress the differences among the strategies (Figure 8).
[Figure 8: flow [m³/month] (scale ×10^7) versus time [months], months 240–420; curves: $q_{out}^{id}$, $q_{out}^{SOP}$, $q_{out}^{FOP}$, $q_{out}^{OFOP}$.]
8 Conclusions
In this chapter different decision strategies for handling the water
management of an artificial reservoir in a fully automatic way have been analyzed
and compared. In particular, a Standard Operation Policy (SOP), a Fuzzy Operation
Policy (FOP) and an Optimized Fuzzy Operation Policy (OFOP) have been
considered. The SOP releases water whenever possible, regardless of foreseen water
demand. The FOP supplies water based on the state of the reservoir and of external variables,
thus exhibiting forecasting properties and reducing the water release, even if some
water is currently available, when it seems that saving water can alleviate foreseen
future droughts. OFOP is an optimized version of FOP obtained with Genetic
Algorithm techniques. To test the proposed strategies, a dynamic hybrid model of the
reservoir is deduced, and different operative situations are simulated with 10,000
Monte Carlo runs. While an unconstrained optimization is prone to the risk of
overspecializing on a single realization of the data set, the work shows that by
suitably mixing heuristic and optimization strategies (by constraining the optimization
according to the heuristic) a “smart” decision policy can be defined that is able to perform
satisfactorily also in cases not considered in the optimization phase, thus showing that
the decision law has “learned” the rules for the optimal management of the reservoir.
References
1. Bar-Shalom Y. and X.-R. Li (1993). Estimation and Tracking: Principles, Techniques and
Software. Artech House, Boston. MA.
2. Basson M.S. and J.A. van Rooyen (2001). Practical Application of Probabilistic Approaches
to the Management of Water Resource Systems. Journ. of Hydrology 241, pp. 53–61.
3. Bing-Yuan C. (2003). Fuzzy Allotment Model in Water and Electricity Resources Shortage
and its Application Software. Proc. of the 12th IEEE Int. Conf. on Fuzzy Systems FUZZ ’03.
2. pp. 1317–1320.
4. Box, G. and G. Jenkins (1970). Time series analysis: Forecasting and control. Holden-Day.
San Francisco.
5. Bender M.J. and S.P. Simonovic (2000). A Fuzzy Compromise Approach to Water Resource
Systems Planning under Uncertainty. Fuzzy Sets and Systems 115, pp. 35–44.
6. Cancelliere, A., A. Ancarini and G. Rossi (2002). A neural networks approach for deriving
irrigation reservoir operating rules. Water Res. Management 16, pp. 71–88.
7. Castelletti, A., D. de Rigo, A.E. Rizzoli, R. Soncini-Sessa and E. Weber (2005). A Selective
Improvement Technique for Fastening Neuro-Dynamic Programming in Water Resource
Network Management. Proc. of the 16th IFAC World Congress. Praha (CZ).
8. Cavallo, A., A. Di Nardo and M. Di Natale (2003). A fuzzy control strategy for the regula-
tion of an artificial reservoir. In: Sustainable Planning and Development (E. Beriatos, C.A.
Brebbia, H. Coccossis and A. Kungolos, Eds.). WIT Press, pp. 629–639.
9. Chen, Y.-M. (1997). Management of Water Resources using Improved Genetic Algorithms.
Computers and Electronics in Agriculture 18, pp. 117–127.
10. Dubois, D. and Prade, H. (Eds.) (1980). Fuzzy Sets and Systems: Theory and Applications.
Academic Press, New York.
11. Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning.
Addison-Wesley.
12. Gollu, A. and P.P. Varaiya (1989). Hybrid dynamical systems. In: Proc. of IEEE Conference
on Decision and Control. Tampa, FL.
13. Hobbs, P.F. (1997). Bayesian Methods for Analysing Climate Change and Water Resource
Uncertainties. Journ. of Environmental Management 49, pp. 53–72.
14. Katayama T. (2005). Subspace Methods for System Identification. Springer-Verlag, London.
15. Ljung, L. (1987). System Identification: Theory for the User. Prentice-Hall, Englewood
Cliffs, NJ.
16. Matondo, J.I. (2002). A Comparison between Conventional and Integrated Water Resources
Planning and Management. Physics and Chemistry of the Earth, Parts A/B/C 27, pp. 831–838.
17. Michalewicz, Z. (1999). Genetic Algorithms + Data Structures = Evolution Programs, 3rd ed.
Springer, Berlin.
18. Nazemi, A.R., M.R. Akbarzadeh and S.M. Hosseini. (2002). Fuzzy-stochastic Linear Pro-
gramming in Water Resources Engineering. Proc. of the IEEE Fuzzy Information Processing
Society, NAFIPS 2002. pp. 227–232.
19. Panigrahi, D.P. and P. P. Mujumdar (2000). Reservoir operation modeling with fuzzy logic.
Water Res. Management 14, pp. 89–109.
20. Russel, S.O. and P.E. Campbell (1996). Reservoir operating rules with fuzzy logic program-
ming. Journal Water Resources Planning Management 122(3), pp. 165–170.
21. Salas J.D., J.R. Delleur, V. Yevjevich and W.L. Lane (1980). Applied Modelling of Hydrologic
Time Series. Water Resources Publications. Littleton, CO.
22. Salas J.D. (1993). Analysis and Modeling of Hydrologic Time Series. In: Handbook of Hy-
drology (D.R. Maidment, Ed.). pp. 19.1–19.72. McGraw-Hill, New York.
23. Sugeno, M. (1985). Industrial Applications of Fuzzy Control, Elsevier Science.
24. Thomas, J.-S. and B. Durham (2000). Integrated Water Resource Management: Looking at the
Whole Picture. Desalination 156, pp. 21–28.
25. Thomas H.A.Jr. and M.B. Fiering (1962). Mathematical Synthesis of Streamflow Sequences
for Analysis of River Basins by Simulations. In: The Design of Water Resources Systems
(A. Maas et al. Ed.). Harvard University Press, Cambridge, MA, pp. 459–493.
26. Vemuri V. R. and W. Cedeño (1995). A New Genetic Algorithm for Multi-Objective Opti-
mization in Water Resource Management. Proc. of the IEEE Int. Conf. on Evolutionary Com-
putation 1, pp. 495–500.
27. Wang, L., L. Fang and K.W. Hipel (2003). Cooperative Water Resource Allocation based on
Equitable Water Rights. Proc. of the IEEE International Conference on Systems, Man and
Cybernetics 5, pp. 4425–4430.
28. Yeh, W. (1985). Reservoir management and operation models: a state of the art review. Water
Resources Research 21(12), pp. 1797–1818.
Genetic Fuzzy Modeling of Supervisory
Scheduling of Freight Rail Systems
Abstract This chapter develops a genetic fuzzy modeling approach for train schedu-
ling of freight rail network systems. A genetic fuzzy algorithm is suggested as a
means to solve train scheduling problems. The algorithm uses a fitness estimation
model based on participatory learning fuzzy clustering to improve its processing
speed and to keep solution quality. The approach is particularly useful in schedul-
ing problems involving dynamic environments because in these instances fitness
evaluation usually is costly. In dynamic environments such as rail network sys-
tems, decision-making demands feasible train movement plans to control traffic
and operate yards, stations and terminals. The genetic fuzzy algorithm is com-
pared against exact optimal solutions given by classic optimization and genetic
algorithms. To illustrate the usefulness of the approach, a real-world freight rail
system problem is solved using the genetic fuzzy approach and the classic genetic
algorithm. Results suggest that the genetic fuzzy approach constitutes a promising
alternative to solve scheduling problems in general, but performs particularly well
to produce supervisory train schedules.
1 Introduction
Traffic over rail networks has increased substantially during the last decade. Most
world rail network freight systems consist of single track with passing and crossing
sidings, although a fair amount of double track and few multiple mainline tracks
do exist. The growth in the transportation demand is introducing congestion and
complicating the accessibility and capacity of rail networks. For instance, container
trade is growing at a 9.5% annual rate worldwide and ports are expected to double
and possibly triple their cargo by the next decade [24]. Estimates assuming 3% per
year growth in a national economy indicate that railroads must carry an additional
888 million tons by 2020, a 44% increase from 2003 [27]. Correcting congestion
with additional capital expenditures is costly in an industry that already has a low
return on capital expended. Therefore, many railroads are looking to technology to
provide better utilization of the capital and system capacity that is already in place.
Until recently most freight railroads used a tonnage-based approach to dispatch
trains. This means that trains are held until they have enough tonnage to fill them
to capacity. Under the tonnage-based approach, the operating plan lists a train as
operating every day, but if the railroad does not fill enough railcars, then it cancels
or delays the train. The idea is to minimize the total number of trains by choosing
larger trains, which should help to decrease operation costs and increase track
capacity. However, tonnage-based train planning requires more railcars and higher
yard storage capacity to cope with traffic variability. It may also increase crew and
locomotive repositioning costs, and may jeopardize customer needs due to higher
emphasis on train operation economics [18]. In contrast to the tonnage-based approach,
scheduled railroading is gaining attention since it forces trains to run on time even if
they are partially loaded. Schedule-based schemes require running trains with low tonnage
when demand is below expectations, as well as systematic and precise forecasts of
transportation demand. Quick schedule adaptation, more advanced decision-making support
procedures, and methodologies to analyze different alternatives in a timely manner are also
important. Current practice uses hierarchical hybrids of tonnage-based and schedule-based
approaches because different commodities require distinct degrees of flexibility to
accommodate trade-offs between customer needs and economic operation. For
instance, in hierarchical systems a supervisory scheduling level develops medium-range
train schedules (typically for periods of 6–24 hours) to provide references for train
movements. Whenever an unscheduled train enters the rail network or disturbances
occur, lower level real-time scheduling systems adjust current movement plans to
account for new traffic conditions. Adjustment must attempt to maintain the new
movement plans as compatible as possible with schedules given by the supervisory
level. If unfeasible movements occur, then a request is issued and the supervisory
level develops a new schedule.
There are many areas in which technology can improve the efficiency of railroad
operations. A railroad operation plan describes how railcars, trains and locomotives
should travel, and how to assign the major assets needed to move the fleet,
especially train crews, yards, tracks and maintenance crews. Railroad operation
planning involves a multitude of complex tasks. It starts with transportation demand and
movement requirements, establishes railcar routes and train formation, assigns
resources and plans train movements. One major issue is the management of train
movements across the network, because it may improve the utilization of capital and system
capacity that are already in place. It can also help to discover bottlenecks and guide
investment. Controllers control the setting of switches and signals, issue movement
authority in dark territories and manage movement plans remotely. In centralized
control systems, track occupancy is detected through track circuits, GPS (Global
Positioning System) signals and voice communication via radio links, and is displayed to
the controllers. A controller must deal with a variety of track and signal infrastructure,
and a wide variation in train performance. Maintenance crew requests and
safety in unsignaled areas must also be managed. As trains move across the rail
network, control of trains progresses from controller to controller, often with frequent
interactions with yard and terminal managers for refueling, crew changes, car blocking
and train formation. Moreover, not all trains are of the same economic value for
the railroad, and priorities must be dynamically assigned to trains. Controllers’
performance is measured by how well they move trains over the network. Therefore,
scheduling methodologies and algorithms provide a means to plan train movements
to their destinations based on the value of trains and on physical, safety, and
operational constraints. Supervisory scheduling algorithms and procedures are essential
to develop globally optimal schedules for trains moving in different railroad territories.
Global schedules act as set-points for real-time train movement planning and
control systems.
Train scheduling over a rail network of track segments resembles the job-shop
problem of scheduling jobs on machines. A rail network comprises a set of track
segments that cannot be occupied by opposing trains at any instant, just as
machines in a job-shop can process only one job at a time. From the job-shop scheduling
point of view, however, there is a major difference, since railroads often have yards and
stations with multiple tracks, and possibly single or double track segments between
yards and stations. The difference lies in the set of constraints, which depends on
the track assignment and the selection of
tracks in multiple-track yards and stations. Train scheduling is actually similar to
job-shop scheduling with alternative machines, which makes it much more difficult
than conventional job-shop scheduling. Currently it is virtually impractical, even for
moderate-size instances, to solve this class of scheduling problems using exact methods.
The first attempts to solve train scheduling problems using both exact and approximate
methods date back to the beginning of the 1970s, when linear mathematical
programming models were developed [1], [33]. Linear and nonlinear mixed programming
models became available [21], [23], [4], [14], but soon the intractability
of these models for complex real-world problems became apparent. Heuristics
such as tabu search [14], [16], greedy search [22] and genetic algorithms [31], [15],
[16], [25] were attempted to solve the problem. Knowledge-based techniques [5],
hybridizations of discrete event models and greedy search techniques [8], and
combinations of discrete event models with fuzzy rule-based techniques [28], [35] have
been shown to provide a pragmatic and efficient approach to develop schedules for actual
system instances in real time. Distributed [19] and agent-based approaches [3] have
also been investigated. Recently, new classes of models were proposed to account
for the inherent multi-objective nature [11] and the flexibility required [37] by train
scheduling problems.
Despite the significant performance of current high-speed computer systems,
exact solution of mixed optimization models with constraints for every train and
segment of a rail network still requires unreasonably long processing time. Usually
164 F. M. Filho et al.
Genetic algorithms are search algorithms based on the principles of natural genetics
whose purpose is to develop solutions for optimization problems. The main idea
[Figure: flowchart of a genetic algorithm with the blocks “Initial population”, “Selection”, “Reproduction”, “Next generation” (evolution of population), the test “Stopping criteria satisfied?” (yes/no) and “Return best individual”.]
Genetic fuzzy systems are fuzzy systems augmented with a learning process
based on genetic algorithms. Like genetic algorithms, they provide robust
search capabilities in complex spaces and offer a powerful way to approach
problems requiring efficient and effective search processes [6], [7]. Genetic fuzzy
systems embrace different levels of complexity, from parameter optimization to
learning of fuzzy rule bases and inference mechanisms. During the last ten years,
most of the effort in the area of genetic fuzzy systems has been devoted to fuzzy
rule-based systems. Recently, a new class of genetic fuzzy systems emerged from
experiments with complex scheduling and sequencing problems for hybrid systems.
Rail networks are a particularly important class of hybrid systems [25]. Scheduling
of rail systems involves continuous and discrete decision variables associated with
train movements in a rail line. The search space is considerably complex. In
addition, fitness evaluation is expensive in rail systems because it involves the dynamics
of train movements, namely, train time trajectories. The genetic fuzzy system
addressed in this chapter uses fitness estimation procedures based on participatory
learning fuzzy clustering. The result is a genetic fuzzy system in which, contrary to
the prevailing view of genetic algorithms, learning occurs concurrently with
population evolution.
Most genetic algorithms require a large number of fitness evaluations before
acceptable solutions are found. In many practical situations fitness evaluation may
demand computationally expensive procedures. In these cases, fitness estimation
models can be adopted to alleviate computational costs, but solution quality must
remain within acceptable bounds. In general, fitness estimation is useful when fitness
function evaluation is complex and time-consuming, such as when there is no
analytic mathematical model, the environment is stochastic, or the fitness landscape is
complex [20].
The use of fitness estimation models to improve computational performance of
evolutionary optimization algorithms dates back to the 1960s [9]. Previous efforts
have concentrated on using response surface approximations in place of the original
evaluation function [36]. Alternative approaches rely on special relations between the
approximate and the original model to develop multilevel search strategies [10].
Other schemes use functional approximation methods to form reduced models. A
comprehensive survey of fitness estimation models can be found in [20].
Two classes of genetic algorithms emerge from two main classes of fitness esti-
mation models, namely, fitness inheritance and fitness imitation [20]:
A. Fitness Inheritance
Fitness inheritance refers to all fitness estimation methods in which the fitness values
of the offspring individuals are directly derived from the fitness values of their parents.
These estimation methods can be interpreted as local since they consider only
parental information to estimate fitness, neglecting any information from the search
space. On the other hand, since they rely on local information only, they are easier
to use. An example of a simple fitness inheritance mechanism as a fitness estimation
strategy is suggested in [22]. Genetic algorithms with fitness inheritance follow
the steps of the basic genetic algorithm, except that they add a confidence degree as
an attribute to each individual of the population. Next, during parent selection for
reproduction, offspring are evaluated using the original fitness function only if they
are chosen according to a probability density function. Otherwise they are evaluated
as a weighted combination of the parent fitness values. Weights depend on the
similarity between parents and offspring. Confidence degrees are updated accordingly.
We refer the reader to [30] for a detailed explanation of the algorithm, since fitness
inheritance will not be emphasized here. Detailed description, analysis and
comparisons are given in [20], [25].
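The inheritance step described above can be sketched as follows. This is only an illustration under stated assumptions: the similarity measure (fraction of matching genes), the fixed evaluation probability `p_eval`, and all function names are hypothetical and are not taken from [22] or [30]; the confidence degree update is omitted.

```python
import random

def similarity(a, b):
    """Fraction of matching genes between two equal-length genotypes (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def inherit_fitness(child, parents, parent_fitness):
    """Estimate child fitness as a similarity-weighted mean of the parent fitness values."""
    weights = [similarity(child, p) for p in parents]
    total = sum(weights)
    if total == 0.0:
        # Assumption: a child sharing no gene with any parent gets the plain mean.
        return sum(parent_fitness) / len(parent_fitness)
    return sum(w * f for w, f in zip(weights, parent_fitness)) / total

def evaluate_offspring(child, parents, parent_fitness, fitness, p_eval=0.3, rng=random):
    """With probability p_eval use the original fitness function, otherwise inherit.

    Returns (fitness value, True if the original function was used).
    """
    if rng.random() < p_eval:
        return fitness(child), True
    return inherit_fitness(child, parents, parent_fitness), False
```

For a child `[1, 1, 1, 0]` with parents `[1, 1, 1, 1]` (fitness 4.0) and `[0, 0, 0, 0]` (fitness 0.0), the weights are 0.75 and 0.25, so the inherited estimate is 3.0.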
B. Fitness Imitation
Fitness imitation embraces all fitness estimation methods that do not use any form
of fitness inheritance mechanism. This class can be viewed as global because it
considers information from the search space to estimate fitness. However, because of the
need for global information, it tends to be more complex to use. Fitness imitation
genetic algorithms require the choice of a set of individuals to represent the whole
population. These representative individuals are evaluated using the original fitness
function while the remaining individuals are evaluated using the estimation
procedure. Therefore, fitness estimation procedures must also be selected. The
performance of the genetic algorithm depends on the mechanism to choose representative
individuals and on the fitness estimation procedure.
1) Choice of Representatives
The choice of representative individuals can be random or deterministic. A possi-
ble choice of representatives is to randomly sample the population using, e.g. the
roulette wheel procedure and store them in a fixed size memory. Only a subset of
the sampled individuals in memory is directly evaluated using the original fitness
function [13]. While intuitively simple and appealing, this method is very sensitive
to the choices of memory size and number of individuals for direct evaluation.
Alternatively, representative individuals can be chosen deterministically in each
generation by clustering population individuals into several groups [22], [25].
Clustering is typically conducted in the genotype space. In this case, only those individuals
that represent the groups, that is, the cluster centers, are evaluated using the original
fitness function. The fitness of the remaining individuals is computed using
a weighted combination of the fitness values of the representative individuals.
One mechanism to implement deterministic selection of representatives, the one
suggested in this chapter, is to use fuzzy clustering techniques. Fuzzy clustering is
interesting because it accounts for the fact that grouping is imprecise and allows the
same individual to be compatible with different clusters to different degrees. The
use of the fuzzy c-means [2], a powerful and efficient fuzzy clustering method that is
supervised in the sense that the number of clusters must be given, has been addressed
in [26]. Here we suggest the use of the participatory learning fuzzy clustering
algorithm [32]. Contrary to fuzzy c-means, the participatory learning fuzzy clustering
algorithm is unsupervised and groups individuals adaptively through generations.
The result is a new class of genetic fuzzy system in which, contrary to the current
status of genetic fuzzy systems, learning occurs concurrently with evolution. Figure 2
illustrates how individuals of a population evolve when using the fuzzy c-means (FCM)
and the participatory learning fuzzy clustering algorithms (PL) in genetic fuzzy
algorithms (GFA). The figure emphasizes the first, tenth and twentieth generation,
respectively.

[Figure 2: clustering of the population by FCM and by PL at the first, tenth and twentieth generations]
In genetic algorithms, individuals tend to concentrate around the optimal solutions
as the population evolves and are likely to become genetically similar. This
fact suggests that the number of clusters should decrease over the generations. As
Figure 2 shows, the fuzzy c-means always groups individuals into the same number
of clusters because it assumes that the number of clusters is given. This generates
genetically redundant cluster centers, as we notice in the tenth and twentieth
generations. In contrast, participatory learning fuzzy clustering recognizes the distribution
of individuals over the search space and clusters individuals into a smaller number of
groups through generations. This avoids genetically redundant clusters and makes
the genetic algorithm faster. Due to its adaptive nature, the participatory learning
fuzzy clustering algorithm performs better than the fuzzy c-means.
2) Fitness Estimation
After their choice, the representative individuals are evaluated using the original
fitness function. The fitness of the remaining individuals is estimated using the values
of the representative individuals. Here we suggest two techniques. The first relies on
a normalized similarity (1) between the individual whose fitness is to be estimated
and the representative individuals. The estimate of the fitness of an individual uses (2),
a weighted combination of the fitness of the representative individuals.
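\[
S_{kj} = \frac{d_{\max} - d_{kj}}{d_{\max}} \tag{1}
\]

\[
\hat{f}(x_k) =
\begin{cases}
\dfrac{\sum_{j=1}^{r} S_{kj}\, f(x_j)}{\sum_{j=1}^{r} S_{kj}}, & x_j \in R, \ \text{if } r > 1 \\[2ex]
S_{kj}\, f(x_j), & x_j \in R, \ \text{if } r = 1
\end{cases} \tag{2}
\]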
In (1), $d_{\max}$ denotes the maximum distance between any two individuals in
the population and $d_{kj}$ the distance between individuals $x_k$ and $x_j$. Notice that
$S_{kj} \in [0, 1]$. In (2), $\hat{f}(x_k)$ is the fitness estimate for individual $x_k$, $r$ is the number
of representative individuals, $f(x_j)$ is the fitness value of individual $x_j$, and $R$ is the
set of representative individuals.
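A minimal sketch of the first technique, following (1) and (2), is given below. The Euclidean distance, the function names and the fallback used when all similarities are zero are assumptions made for illustration; the chapter does not specify them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two genotypes (assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate_by_similarity(x_k, reps, rep_fitness, d_max, dist=euclidean):
    """Fitness estimate of x_k from the representatives, following (1) and (2).

    reps        -- representative individuals (the set R)
    rep_fitness -- their fitness values f(x_j), evaluated with the original function
    d_max       -- maximum distance between any two individuals (assumed > 0)
    """
    # Equation (1): normalized similarity between x_k and each representative.
    s = [(d_max - dist(x_k, x_j)) / d_max for x_j in reps]
    if len(reps) == 1:
        return s[0] * rep_fitness[0]          # case r = 1 of equation (2)
    total = sum(s)
    if total == 0.0:
        # Not covered by (2): x_k is at distance d_max from every representative.
        # Assumption: fall back to the plain mean of the representative fitness values.
        return sum(rep_fitness) / len(rep_fitness)
    return sum(w * f for w, f in zip(s, rep_fitness)) / total   # case r > 1
```

With representatives `(0, 0)` and `(4, 0)` of fitness 1.0 and 3.0, `d_max = 4` and `x_k = (1, 0)`, the similarities are 0.75 and 0.25 and the estimate is 1.5.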
The second technique estimates fitness values considering $u_{kj}$, the membership
degree of the $k$-th individual in the $j$-th cluster, the cluster whose center is the
individual $x_j$. In (3), $f(x_j)$ is the fitness of the individual $x_j$, $\hat{f}(x_k)$ the fitness
estimate of the individual $x_k$, $c$ is the number of representative individuals, that is, the
number of clusters, and $V$ a matrix whose columns are cluster centers.
\[
\hat{f}(x_k) =
\begin{cases}
\dfrac{\sum_{j=1}^{c} u_{kj}\, f(x_j)}{\sum_{j=1}^{c} u_{kj}}, & x_j \in V, \ \text{if } c > 1 \\[2ex]
u_{kj}\, f(x_j), & x_j \in V, \ \text{if } c = 1
\end{cases} \tag{3}
\]
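The second technique can be sketched in the same way. The function `estimate_by_membership` follows (3) directly. The helper `fcm_memberships` uses the standard fuzzy c-means membership formula with fuzzifier `m` only as an illustrative source of membership degrees; the participatory learning algorithm computes its degrees differently, and all names here are hypothetical.

```python
def fcm_memberships(distances, m=2.0):
    """Standard fuzzy c-means membership degrees from the distances to the c centers."""
    zero = [j for j, d in enumerate(distances) if d == 0.0]
    if zero:
        # The individual coincides with one or more centers: share full membership.
        return [1.0 / len(zero) if j in zero else 0.0 for j in range(len(distances))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_l) ** power for d_l in distances) for d_j in distances]

def estimate_by_membership(u_k, center_fitness):
    """Fitness estimate of individual k following (3).

    u_k            -- membership degrees u_kj of individual k in the c clusters
    center_fitness -- fitness values f(x_j) of the cluster centers
    """
    if len(u_k) == 1:
        return u_k[0] * center_fitness[0]      # case c = 1 of equation (3)
    total = sum(u_k)                           # assumed > 0
    return sum(u * f for u, f in zip(u_k, center_fitness)) / total
```

For distances 1 and 3 to two centers and `m = 2`, the memberships are 0.9 and 0.1; with center fitness values 10 and 20 the estimate is 11.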
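[Figure: flowchart of the genetic algorithm with fitness imitation. Initial population, choice of representatives, evaluation of representatives, selection, reproduction, next generation; when the stopping criteria are satisfied, return the best individual.]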
[Figure: surface plot; horizontal axes from –500 to 500, vertical axis from 0 to 2000]
\[
f(x) = 418.9829\,p + \sum_{i=1}^{p} x_i \sin\!\left(\sqrt{|x_i|}\right), \quad x \in \Re^{p} \tag{4}
\]
In the next section we address the train scheduling problem for freight rail networks
to illustrate the usefulness of HGA in practical situations. Before proceeding,
we note that most freight rail networks are divided into territories consisting
mainly of single tracks with sidings and, to a lesser degree, a mixture of single track
and double track. Here we emphasize a single-territory, single-track line.
One of the major goals of current research in scheduling concerns the trade-off
between processing time and optimality. In practice, scheduling algorithms and
procedures that provide near-optimal solutions are preferable because they offer
satisfactory and pragmatic solutions faster than exact algorithms.
In supervisory traffic control, train dispatchers control train movement, plan the
meeting and passing of trains on single-track sections, align switches to control
each train movement, gather and report information, and communicate with train crews
and with station and yard managers. Supervisory train scheduling is one of the main
tasks in supervisory traffic control. The aim is to find a meet and pass plan for the rail line
and the speed of each train over each track segment to minimize an objective function.
The objective function commonly is a weighted sum of the objective functions
of all trains, such as delay and operational costs. Generally, train delay means the
additional amount of time a train needs to satisfy following and meet and pass
constraints. The simplest way to determine delay is to compute the difference between
the actual and the free transit time of a train journey. The supervisory train schedule
translates into a movement plan composed of the arrival and departure times of each train
at each rail line segment within the scheduling horizon.
This section details the use of genetic fuzzy algorithms to produce train movement
plans for single-track railroads. Preliminary developments of the genetic fuzzy
system approach for train movement planning have been discussed in [25] using
fuzzy c-means clustering. Here we emphasize the genetic fuzzy algorithm with fitness
estimation using participatory learning clustering.
The supervisory train schedule problem assumes, without loss of generality, a
rail line with trains moving eastbound and westbound. Trains may enter sidings to allow
trains moving in the opposite direction to pass or to let other trains overtake. Trains should
only move when there is no chance of deadlock. Deadlock is the state in which
no train is able to progress in the rail line unless one of them backtracks to allow
the other trains to move.
A. Genetic Fuzzy Algorithm
The GFA addressed in Section 2 uses a discrete event model reported in [28], [29].
Figure 6 shows the model developed, emphasizing where the genetic code is placed.
The model requires the following input data:
• Railway line topology
• Departure time of all trains
Basically, the role of the discrete event model is to simulate the train dispatch and
movement processes. Briefly, it works as follows. Events are the arrival and departure
times of each train at each rail line segment. Trains generate events as they move.
Whenever a train is to be dispatched, all potential conflicts with trains competing
for the use of a common segment must be resolved first. The purpose is to decide whether
the train should proceed, or whether it must stop and wait for another conflicting train to
cross or overtake. Conflict decisions are handled by the subsystem called Dispatch
Policy. After the conflict decision, the model checks whether the train movement causes
deadlock. If it does, the train must be kept stopped at its current segment until a
deadlock-free movement is possible.
The Dispatch Policy decides which train should move, and different policies
mean different dispatching decisions. In other words, different policies mean different
schedules. Certain schedules are preferable to others with respect to the
objective function value. The idea of the GFA is to evolve a Dispatch Policy that
provides near-optimal solutions within short processing time bounds.
It is interesting to note in Figure 6 that, to evaluate a candidate solution directly,
we must simulate the movement of all trains within the scheduling horizon. This is a very
time-consuming task for complex scenarios such as large railway lines with a small
number of sidings and a large number of trains. This is the situation where a large number
of movement conflicts is likely to occur. Conventional optimization models do provide
optimal solutions, but with current computer technology they are inapplicable
in practice because processing time is prohibitive. Heuristics and local search
methods can provide feasible solutions fast, but solution quality may be poor. The GFA
approach provides an attractive trade-off between solution quality and processing
time requirements.

[Figure 6: discrete event model; block labels: Input Data, Genetic Code, Phenotype, Dispatch Policy, Discrete Event Simulator, Feasible Schedule, Fitness]

[Figure 7: chromosome]

                            segment 1  segment 2  segment 3  segment 4
Train 1   priority              2          4          8          1
          velocity (km/h)      60         73         40         55
Train n   priority              7          2         10          9
          velocity (km/h)      57         80         45         50
1) Representation
The representation of individuals is through a chromosome consisting of 2n vectors,
where n is the number of trains in the rail line. The length of each vector is the
number of segments in the train route. As Figure 7 shows, two vectors characterize
each train. Each component of the first vector, called priority vector, defines the
priority of the train to occupy the segment in the corresponding position in its route.
In the second vector, called speed vector, each component gives the train speed
when moving in the segment in the corresponding position of its route. Therefore,
whenever movement conflicts happen, the Dispatching Policy must decide which
train will occupy the segment first: the train with the highest priority among the
competing trains is the one chosen to proceed.
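The representation and the priority rule above can be sketched as follows. The class and function names are hypothetical, and the sketch assumes that a larger number means a higher priority and that ties go to the first contender listed; the chapter does not state either convention. The values are those shown in Figure 7.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrainGenes:
    """The two vectors that describe one train in the chromosome."""
    priority: List[int]    # priority vector, one entry per segment of the train route
    speed: List[float]     # speed vector in km/h, one entry per segment of the route

def resolve_conflict(chromosome: List[TrainGenes],
                     contenders: List[Tuple[int, int]]) -> int:
    """Return the index of the train allowed to occupy the disputed segment.

    contenders -- (train index, position of the disputed segment in that train's route)
    Assumption: the largest priority value wins; ties go to the first contender.
    """
    winner = max(contenders, key=lambda c: chromosome[c[0]].priority[c[1]])
    return winner[0]

# Chromosome with the two trains of Figure 7 (2n = 4 vectors).
chromosome = [
    TrainGenes(priority=[2, 4, 8, 1], speed=[60, 73, 40, 55]),
    TrainGenes(priority=[7, 2, 10, 9], speed=[57, 80, 45, 50]),
]
```

If both trains compete for the segment at position 2 of their routes, the priorities are 8 and 10, so the second train proceeds under the stated assumption.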
2) Fitness Function
For simplicity, in what follows we assume that the aim is to minimize the total delay
in the schedule, as shown in (5). The fitness function used by the GFA is given in (6).
\[
F(S_i) = \sum_{j=1}^{n} \sum_{k=1}^{m} \mathrm{delay}(j, k) \tag{5}
\]

\[
\mathit{fitness}_i = \frac{1}{1 + F(S_i)} \tag{6}
\]
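As a small worked illustration of (5) and (6), the sketch below assumes that the delays of a schedule are stored as a nested list `delay[j][k]`, with `j` indexing the `n` trains and `k` indexing `m` segments. That reading of `m` and the function names are assumptions, since this excerpt does not define them.

```python
def total_delay(delay):
    """Equation (5): sum of delay(j, k) over all trains j and segments k."""
    return sum(sum(row) for row in delay)

def fitness(delay):
    """Equation (6): fitness in (0, 1]; equal to 1 only when the total delay is zero."""
    return 1.0 / (1.0 + total_delay(delay))

# Two trains, two segments, delays in arbitrary time units.
example = [[0.0, 5.0],
           [3.0, 2.0]]
```

Here the total delay is 10, so the fitness is 1/11; a schedule without delay has fitness 1, which makes minimizing (5) equivalent to maximizing (6).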
Table 2 shows that, as the number of trains increases, the optimality gap between
the genetic algorithms and the exact optimal solution increases as well, because
scenarios become more complex. The gap reaches 2.42% for the scenario with 5 trains,
2.39% for 6 trains and 6.76% for 7 trains. If the same comparison is made between
the GFA and CGA, the gap is much lower. For 7 trains GFP achieves a better solution
than CGA. For 9 trains the gap between GFP and CGA is 1.4%.
Table 3 indicates that, as the number of trains increases, the exact optimal solution
becomes difficult to obtain using the classic optimization modeling approach. In general,
GFAs run faster than CGA. For 8 and 9 trains, GFP was considerably faster than CGA while
achieving 96.45% and 98.6% of the CGA fitness function values, respectively.
2) Varying the number of sidings
Tables 4 and 5 summarize the behavior of the GFAs and the CGA as the number
of sidings increases while keeping 5 trains moving in the rail line. Entries of Table 4
are the average minimum total delay over 5 runs. Table 5 shows the corresponding
average processing times over 5 runs.
As Table 4 indicates, except for 6 sidings the GFA achieved the optimal solution for all
test instances. For 6 sidings, the optimality gap is 2.42%. Notice that there is no gap
between GFP and CGA solutions.
From Table 5 we conclude that the computational effort to find exact optimal
solutions increases fast as the number of sidings increases. Clearly, all GFAs run
faster than CGA and achieve the optimum solution for most instances.
3) Real-world scenario
In this section we consider a rail line composed of 43 segments: 22 sidings and 21
single-track segments. We assume 27 trains to be scheduled within a 24-hour
period. This corresponds to a territory of a major railroad network of the state
of Sao Paulo, Brazil.
Table 6 summarizes the performance of the GFAs and the CGA. As
in previous sections, entries of Table 6 are the averages over 5 runs of the values
computed using (5) and (6) and the corresponding processing times. The exact optimal
solution for this instance is not available, since the problem size is above the one for
which reasonable processing times could be expected.
Table 6 shows that, from the point of view of solution quality, the GFAs perform
as well as CGA. The fitness values of the GFAs are very close to the one achieved
by CGA. However, the GFAs run considerably faster than CGA. It is worth noting
that, in particular, GFP is able to provide near-optimal solutions within a period of
time fully consistent with the requirements for train movement plans at the supervisory
control level. Figure 9 shows the schedule as a train graph.
4 Conclusion
Although rail is an old technology, current rail systems are complex and require
advanced techniques to be operated. This chapter has addressed the development
of supervisory train schedules for railroad network systems using genetic fuzzy
algorithms. Supervisory train scheduling is a major issue in the railroad industry since
it provides a key to improving operational and economic performance. Supervisory
scheduling provides references on how to best manage and control train movements
in a rail network. The genetic fuzzy algorithm uses fitness estimation procedures
as a mechanism to reduce genetic algorithm complexity when handling heavily
constrained optimization problems whose fitness and performance evaluations are
computationally expensive. This is the case of train scheduling and movement planning
problems. The genetic fuzzy algorithm approach suggested in this chapter
significantly reduces the number of direct fitness evaluations and decreases processing
times without significantly affecting solution quality. A fitness estimation model that
uses the participatory learning clustering technique was emphasized and shown to
perform best in all experiments conducted. The genetic fuzzy algorithm with
participatory learning clustering achieves high fitness values with a reduced number of
direct evaluations.
Despite promising performance, genetic fuzzy algorithms still need considerable
effort for further improvement. For instance, new fitness estimation models based on
statistical and neural network models could be useful. The use of fuzzy rule-based
systems to control key genetic algorithm parameters such as crossover and mutation
rate, population size and the use of rule-based genetic fuzzy systems could be an
Acknowledgments The first author acknowledges CAPES, the Brazilian Ministry of Education,
for a fellowship. The second author thanks FAPESP, the Research Foundation of the State of Sao
Paulo for its support. Currently he is with Cflex Computacao Flexivel Ltda, Campinas, Sao Paulo,
Brazil. The third author is grateful to CNPq, the Brazilian National Research Council, for grant
304299/2003-0.
References
1. I. Amit and D. Goldfarb. The timetable problem for railways. Developments in Operations
Research 2, pp. 379–387, 1971
2. J. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press,
1981
3. J. Blum and A. Eskandrian. Enhancing intelligent agent collaboration for flow optimization
of railroad traffic. Transportation Research A, pp. 919–930, 2002
4. M. Carey and D. Lockwood. A model, algorithms and strategy for train pathing. Journal of
the Operational Research Society, 46, pp. 988–1005, 1995
5. T. Chiang and H. Hau. Railway scheduling system using repair-based approach. Proc. of the
IEEE Conf. on Decision and Control, pp. 71–78, 1995
6. O. Cordón, F. Herrera, F. Hoffmann and L. Magdalena. Genetic fuzzy systems: Evolutionary
tuning and learning of fuzzy knowledge bases. World Scientific, 2001
7. O. Cordón, F. Gomide, F. Herrera, F. Hoffmann and L. Magdalena. Ten years of genetic fuzzy
systems: Current framework and new trends. Fuzzy Sets and Systems, 141, pp. 5–31, 2004
8. M. Dorfman and J. Medanic. Scheduling trains on a railway network using a discrete event
model of railway traffic. Transportation Research B, 38, pp. 81–98, 2005
9. B. Dunham, D. Fridshal, R. Fridshal and J. North. Design by natural selection. Synthese, 15,
pp. 254–259, 1963
10. D. Eby, R. Averill, W. Punch and E. Goodman. Evaluation of injection island ga performance
on flywheel design optimization. Proc. 3rd Conf. on Adaptive Computing in Design and Man-
ufacturing, 1998
11. K. Ghoseiri, F. Szidarovszky and M.J. Asgharpour. A multi-objective train scheduling model
and solution. Transportation Research B, 38, pp. 927–952, 2004
12. D. Goldberg. The Design of Competent Genetic Algorithms: Steps Toward a Computational
Theory of Innovation. Kluwer Academic, 2002
13. Y. Hanaki, T. Hashiyama and S. Okuma. Accelerated evolutionary computation using fitness
estimation. Proc. of Int. Conf. on Systems, Man and Cybernetics, pp.643–648, 1999
14. A. Higgins, E. Kozan and L. Ferreira. Heuristic techniques for single line train scheduling.
Journal of Heuristics, 3(1), pp. 43–62, 1997
15. T. Ho and T. Yeung. Railway junction conflict resolution by genetic algorithm. Electronics
Letters, 36(8), pp. 771–772, 2000
16. T. Ho and T. Yeung. Railway junction traffic control by heuristic methods. IEE Proc. Electronic
Power Applications, 148(1), pp. 77–84, 2001
17. J. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975
18. P. Ireland, R. Case, J. Fallis and C. Dyke. The Canadian pacific railway transforms operations
using models to develop its operating plans. Interfaces, 34(1), pp. 5–14, 2004
19. R. Iyer and S. Ghosh. DARYN — a distributed decision-making algorithm for railway net-
works: Modeling and simulation. IEEE Trans. on Vehicular Technology, 44(1), pp. 180–191,
1995
20. Y. Jin. A comprehensive survey of fitness approximation in evolutionary computation. Soft
Computing, 9, 1, pp. 3–12, 2005
21. D. Jovanovic and P. Harker. Tactical scheduling of rail operations: the scan I systems. Trans-
portation Science, 25, pp. 46–64, 1991
22. H. Kin and S. Cho. An efficient genetic algorithm with less evaluation by clustering. Proc. of
IEEE Congress on Evolutionary Computation Conference, pp. 786–792, 2000
23. D. Kraay and P. Harker. Real-time scheduling of freight railroads. Transportation Research B,
29, pp. 213–229, 1995
24. Q. Lu, M. Dessouky and R. Leachman. Modeling train movements through complex rail net-
works. ACM Transactions on Modeling and Computer Simulation, 14(1), pp. 48–75, 2004
25. F. Mota Filho, F. Gomide and R. Goncalves. Genetic algorithms, fuzzy clustering and discrete
event systems: An application in scheduling. Proc. First Workshop on Genetic Fuzzy Systems,
Granada, Spain, pp. 83–88, 2005
26. F. Mota Filho. Estimation fitness methods for genetic algorithms and applications. Master’s
thesis, State University of Campinas, Faculty of Electrical and Computer Engineering, São
Paulo, Brazil, 2005
27. Railway Age Magazine, 2003
28. M. Rondón and F. Gomide. Railway simulation and optimization system. World Automation
Congress, pp. 1–6, 2000
29. M. Rondón and F. Gomide. Line block analysis in railway dispatch and simulation systems.
Proc. 9th IFAC Symposium on Control in Transportation Systems, pp. 405–409, 2000
30. M. Salami and T. Hendtlass. A fitness estimation strategy for genetic algorithms. Proc. 15th
Int. Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert
Systems, Vol. 2358, pp. 319–326, 2002
31. V. Salim and X. Cai. Scheduling cargo trains using genetic algorithms. IEEE Int. Conf. on
Evolutionary Computation, Vol. 1, 1995
32. L. Silva, F. Gomide and R. Yager. Participatory learning in fuzzy clustering. 14th IEEE Annual
Int. Conf. on Fuzzy Systems, pp. 857–861, 2005
33. B. Szpigel. Optimal train scheduling on single track railway. In M. Ross (Ed) OR’72, North-
Holland, pp. 343–351, 1973
34. H. Takagi. Interactive evolutionary computation. Proc. 5th Int. Conf. on Soft Computing and
Information/Intelligent Systems, pp. 41–50, 1998
35. A. Tazoniero, R. Gonçalves and F. Gomide. Fuzzy algorithm for real-time train dispatch and
control. Proc. of North American Fuzzy Information Processing Society, pp. 1–5, 2005
36. V. Toropov and L. Alvarez. Application of genetic programming to the choice of a structure of
global approximations. Proc. 3rd Annual Conf. on Genetic Programming, 1998
37. A. Valle, F. Gomide and R. Gonçalves. Fuzzy optimization model for train dispatch systems.
11th Int. Fuzzy Systems Association World Congress, Vol. 3, pp. 1788–1793, 2005
Multiobjective Evolutionary Search
of Difference Equations-based Models
for Understanding Chaotic Systems
1 Introduction
When modeling complex processes, there is always a balance between the transparency
of the model and its accuracy. Chaotic signals are not an exception to this
rule: we expect a technique that produces a black box from data [12,20,25,32,34,41]
to produce more accurate results than procedures that also gain insight into
the block structure of the system. The best representative of this last kind of model
(which we will call white boxes, understandable, or transparent models) is arguably a
set of difference equations. A difference equations-based model allows the user not
only to predict the output of the process, but also to know the dynamics of the model and
ultimately to design a control system for it. Nevertheless, obtaining an appropriate
set of equations from data is a problem that cannot be regarded as solved.
Many of the most recent approaches to obtain understandable descriptions of
chaotic systems are based on evolutionary techniques. In particular, the use of
tree-based codifications allows us to define a simultaneous search over both the different
families of models and the parameters that define a model within one of these families.
Since we want to discover the structure of the set of equations (i.e., a consistent
subset of state variables and the dependences between them) and also the numerical
values of the coefficients in these equations, it is convenient for us to combine an
evolutionary search with a tree-based representation of the model, as was done,
among others, in [3, 4, 11, 16, 40, 49].
Some of the latest algorithms are able to obtain difference equations, but there is
work yet to be done. Many evolutionary modeling methods minimize the discrepancies
between the data and the one-step prediction of the model, and do not take
into account the dynamic behavior of the model [41, 47]. As we will show later in
this paper, if we search for a model on the basis of the lowest one-step prediction
error, there is a high chance of finding a non-chaotic model. In that case, the
obtained equations would be meaningless. The use of longer prediction horizons
is not always feasible, though. Since the systems are chaotic, we can find large deviations
between the recursive evaluation of the model and the training data.
In the following sections we will solve this problem by enforcing an additional
constraint: the value of the largest Lyapunov exponent of our model has to match
the value estimated from our training data. The largest Lyapunov exponent is a measure
of the amount of chaos in the signal [24,25,48], and the difference between the
maximum Lyapunov exponents of two models also gives us a measure of similarity
between the complexities of their dynamics [17, 47]. Accordingly, we propose
to extend the aforesaid balance between transparency and accuracy to a new triplet
transparency/accuracy/dynamics. We define a multiobjective problem, designed to
minimize the square error and the complexity of the model, while restricting the
search to those models whose largest Lyapunov exponents are similar to the value
estimated from the time series we want to analyze. Since the evaluation of the
Lyapunov exponents is computationally expensive, we also propose to use our own custom
evolutionary algorithm, which combines a tree-based codification with a
population-based, multiobjective extension of Simulated Annealing. The algorithm that we
propose in this paper is able to find a set of difference equations that reproduces
the dynamics of a given chaotic time series, and improves the results of modern
multiobjective evolutionary algorithms like NSGA-II [9, 10] when the number of
evaluations of the objective function is limited.
The organization of this paper is as follows: in Section 2, we make a brief
bibliographic analysis of transparent models of chaotic systems, detailing the unsolved
problems. In Section 3, we describe our own proposal. Experiments and results are
shown in Section 4, and the paper finishes with the concluding remarks and the
future work.
posed to use a linear combination of the quadratic error and the largest Lyapunov
exponent for the fitness function, and have optimized it by means of a genetic
algorithm. In [12, 39] a Pareto-based approach is used instead of scalar functions,
in combination with the MOGA algorithm described in [15]. There are some dif-
ferent Pareto-based multiobjective strategies that could also be applied for the same
problem, as can be seen in [6]. Later in this paper, we will evaluate a more recent
approach, the NSGA-II algorithm [9, 10].
Given the computational cost of evaluating the Lyapunov exponents of a model,
and the potentially large size of some individuals, we are mostly interested in
algorithms that need a low number of iterations and small population sizes. It is widely
accepted that genetic algorithms are the best choice for this purpose. As a matter of
fact, these algorithms have become a standard in all kinds of multiobjective
problems [50]. However, in our opinion, the experimentation that supports this assertion
was intended to solve problems based on a linear genotype, and its conclusions cannot
be directly extrapolated to tree-based representations. In previous
works [42], we have combined a simulated annealing (SA) global search with a
grammar-tree-based codification, in the context of the learning of fuzzy rules. A
strategy as simple as keeping only one individual and repeatedly mutating it,
accepting or discarding the result according to a probability decreasing with time and
distance, was able to improve the results of the GA. With this result in mind, in
this paper we will extend our own algorithm to multiobjective problems, and propose
a new population-based, multiobjective SA search (MOSA) able to elicit a set
of nondominated solutions. In the following sections we will show that the genetic
search (the NSGA-II algorithm), while equally efficient in the long term, can be
improved upon in this specific problem by a Simulated Annealing-based search in both
accuracy and memory usage.
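The single-individual strategy mentioned above can be sketched as a generic simulated annealing loop. This is not the algorithm of [42]: the Metropolis acceptance rule, the geometric cooling schedule and all names and default values are assumptions chosen for illustration. The acceptance probability decreases with time (through the temperature) and with the cost difference between the mutated and the current individual.

```python
import math
import random

def simulated_annealing(initial, cost, mutate, t0=1.0, alpha=0.95, steps=1000, rng=None):
    """Keep one individual, mutate it repeatedly, and accept or discard the mutant.

    cost   -- function to minimize
    mutate -- function (individual, rng) -> new individual
    t0, alpha -- initial temperature and geometric cooling factor (assumed schedule)
    """
    rng = rng or random.Random(0)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(steps):
        candidate = mutate(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Improvements are always accepted; worse mutants are accepted with a
        # probability that shrinks as the temperature drops and as delta grows.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t = max(t * alpha, 1e-12)   # cool down, keeping the temperature positive
    return best, best_cost
```

As a toy usage example, minimizing `(x - 3) ** 2` over the integers with a mutation that adds or subtracts 1 drives the individual from 20 down to 3.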
Interestingly, a pure Pareto-based MOSA has not been previously defined, to our
knowledge. The most recent approaches weight the different
criteria into a scalar function [19, 31, 45]. In contrast, in [8] it was proposed to use
dominance to decide the evolution of the simulated annealing. That approach
was also used in [18], where fuzzy numbers and uncertainty in dominance are
used to decide whether an individual is better than another. Similarly, in [35, 36],
Pareto dominance is studied to decide how the multiobjective simulated annealing
evolves. But, in all of these cases, an aggregated function of the objectives is still
used to evaluate each individual. A different approach to Pareto-based MOSA,
nearer to ours, is presented in [2]. In that work, a comparison of a Pareto-based
evolutionary algorithm and a population-based simulated annealing with dominance
control is presented. In each simulated annealing iteration, a new
individual is obtained by means of a heuristic, and it is included in the population
if neither it nor the current individual dominates the other. If the new
one dominates the current one, then it becomes the current one. In the opposite case,
it is accepted with a temperature-dependent probability. Observe that, even in
this last case, it is required that an individual either dominates or is dominated by
another. This is done, again, by weighting the different objectives into a scalar function,
and therefore the algorithm does not homogeneously sample the Pareto front.
In the next sections we will propose a different algorithm that does not pose this
problem.
The experimental analysis that we will show later compares the NSGA-II and the
MOSA algorithms, both sharing the same representation and operators. Our SA
search will be based on the mutation operator, in turn based on the genetic
crossover [42].
In this section we will state, for both search schemes, the representation of an
individual, its validation procedure, how to generate an individual at random, how
to evaluate it, the crossover and the mutation operators. In the next section we will
describe the pseudocode of the algorithms.
We will build the input data from a time series, given an embedding dimension
n, thus the training set contains the sampled values of n system state variables
$x^1_k, \ldots, x^n_k$, at times $k = 1, 2, \ldots$. We wish to obtain a set of
$m \le n$ difference equation-based models, with the structure that follows:

$$x^i_{k+1} = f_i(x^1_k, \ldots, x^n_k), \quad i \in \{1, \ldots, n\}. \qquad (1)$$

One of these state variables will be identified as the output of the system. It is
assumed that $x^i_{k+1} = x^i_k$ for all those variables without an equation
assigned.
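As a minimal sketch of how such a training set and model could be handled, the
code below builds the state vectors from a scalar series and applies one step of
a model of the form (1). The delay-coordinate construction, the function names
and the use of the mean absolute error are assumptions made for illustration;
the text only states that the states are built from a time series with embedding
dimension n.

```python
def embed(series, n):
    """Return the delay vectors (s[k], s[k-1], ..., s[k-n+1]) for every valid k."""
    return [tuple(series[k - j] for j in range(n))
            for k in range(n - 1, len(series))]

def step(state, equations):
    """Apply x^i_{k+1} = f_i(x_k); variables without an equation keep their value."""
    return tuple(equations[i](state) if i in equations else state[i]
                 for i in range(len(state)))

def one_step_error(states, equations, output=0):
    """Mean absolute one-step prediction error on the output variable."""
    errors = [abs(step(states[k], equations)[output] - states[k + 1][output])
              for k in range(len(states) - 1)]
    return sum(errors) / len(errors)

# Toy usage: a logistic series and the model that generated it.
series = [0.3]
for _ in range(99):
    series.append(4.0 * series[-1] * (1.0 - series[-1]))
states = embed(series, 2)
model = {0: lambda x: 4.0 * x[0] * (1.0 - x[0])}  # equation for variable 0 only
error = one_step_error(states, model)
```

Here the model is a dictionary that maps the index of a state variable to its
equation, so that a model with m < n equations leaves the remaining variables
unchanged, as assumed in the text.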
The phenotype of an individual is, therefore, a list of m valid equations. We will
define the concept “valid equation” by means of the grammar shown in Figure 1.
S → Structure Parameters
Structure → ArithOp ∨ NonLinearOp ∨ DelayOp
Parameters → Variable ∨ Constant
Variable → System signal
Constant → ℜ
ArithOp → (+ Exp, Exp) ∨ (− Exp, Exp) ∨ (* Exp, Exp)
NonLinearOp → (G [LC, UC] → OC, Exp) ∨ (Dz [LC, UC] → OC, Exp)
DelayOp → (Ret delay Variable)
LC → Constant
UC → Constant
OC → Constant
Fig. 1 Grammar defining a valid equation. “G” means “gain”, “Dz” means “dead zone”. There are
some restrictions in the value of the constants that are also enforced: LC < UC, and all constants
are bounded
186 L. Sánchez and J.R. Villar
createModel
needs: list of system signals, id. of the output-signal, experiment parameters
produce: a random set of signals, including the one designated as system output,
and a randomly generated equation for every one of them
Fig. 2 Simplified pseudocode of the random generation of a model using the PTC2 algorithm. The
function createRandomEquation takes into account constraints like the maximum height of a tree,
the probabilities of each type of node and the grammar shown in Figure 1
The genotype will be the syntactic tree of a valid chain in this grammar. Each node
of this tree will encode the name of the production rule that originated each subtree.
This information will be used later to define a typed crossover.
It can be observed that each equation comprises two parts, associated with the
productions “Structure” and “Parameters”. The first production defines which
operations are valid to define the functions $f_i$, and the second one is a list of
numerical parameters on which these functions depend. Following [26], the
nonlinear elements in the definition of $f_i$ are selected from the usual catalog of
building blocks in control engineering. We have restricted ourselves to the blocks
“gain with saturation” and “dead zone”.
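For readers unfamiliar with these two blocks, the following sketch gives their
usual textbook definitions. How the constants LC, UC and OC of the grammar map
onto the arguments below is not specified in the text, so the signatures
(`gain`, `low`, `high`) are assumptions made for illustration only.

```python
def gain_with_saturation(u, gain, low, high):
    """Linear gain whose output is clipped to the interval [low, high]."""
    return max(low, min(high, gain * u))

def dead_zone(u, low, high, gain=1.0):
    """Zero output while u lies in [low, high]; linear outside that band."""
    if u > high:
        return gain * (u - high)
    if u < low:
        return gain * (u - low)
    return 0.0
```

Both blocks are piecewise linear, which keeps the resulting difference equations
easy to read while still allowing nonlinear behavior.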
The PTC2 algorithm (see Figure 2) is used to generate random trees [27, 28]. This
algorithm allows us to specify the maximum number of nodes, the maximum height,
the types of nodes, the probability distribution of the tree height and the
probability distribution of each type of node, subject to our grammar.
Our crossover operator has two different forms, to which we will refer as
parametric and structural. The parametric crossover takes place between the parts of
the individuals that derive from the production rule “Parameters”, and the structural
crossover involves the parts originated in the production “Structure”. Leaving aside
the differences in the grammar, the same operators proposed in [42] were used:
• To perform the parametric crossover we select one of the nodes derived from the
production Constant in each one of the trees, and modify both values with an
extended intermediate crossover [33].
• To carry out the structural crossover of two individuals, a random node of the
first parent is selected. The subtree rooted in this node is to be interchanged
with another one in the second parent. A list of valid nodes of this last parent is
produced. That list of valid nodes not only has to take into account the syntactic
restrictions of the grammar, but there are also semantic constraints: the height of
the offspring must not be higher than the limit, and the individuals must not have
more than one equation for each one of the state variables. If the list is empty,
the procedure is repeated with a different node in the first parent. Once we have
a nonempty list, one of its elements is randomly chosen and interchanged with
the former one.
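The parametric crossover described in the first item can be sketched as follows.
The code uses a common formulation of the extended intermediate crossover, in
which each child lies on the line through both parents, possibly slightly
outside the segment that joins them. The extension factor `d = 0.25` is a
customary value and an assumption here, as is the clipping of the children to the
bounds of the constants; the exact variant of [33] used by the authors may
differ.

```python
import random

def extended_intermediate_crossover(a, b, d=0.25, low=-5.0, high=5.0, rng=random):
    """Recombine two constants; children may fall slightly outside [a, b]."""
    alpha = rng.uniform(-d, 1.0 + d)
    beta = rng.uniform(-d, 1.0 + d)
    child_a = a + alpha * (b - a)
    child_b = b + beta * (a - b)
    # The grammar requires bounded constants, so both children are clipped
    # (assumed bounds, taken here as -5 and 5).
    clip = lambda v: max(low, min(high, v))
    return clip(child_a), clip(child_b)

children = extended_intermediate_crossover(1.0, 3.0, rng=random.Random(0))
```

With `d = 0` the operator reduces to a plain intermediate crossover, whose
children always lie between both parents.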
In previous works [42], we proposed implementing the macromutation in the SA
algorithm by means of a subtree crossover with a randomly generated individual
[21, 37]. In our MOSA implementation we will use this technique: a crossover
with a random individual, followed by a selection at random from the offspring. The
same mutation operator will also be used in our implementation of the NSGA-II
algorithm.
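The macromutation just described can be sketched on a simplified representation.
In the code below a tree is either a leaf (a variable name or a constant) or a
tuple `(operator, left, right)`. This is a deliberately reduced, single-type
version of the operator: the grammar-based type checks and the restriction of one
equation per state variable are omitted, and the random trees are produced by a
plain recursive generator rather than by PTC2. All names are hypothetical.

```python
import random

OPS = ("+", "-", "*")

def random_tree(rng, max_height, variables=("x1", "x2")):
    """Grow a random arithmetic tree (a plain 'grow' generator, not PTC2)."""
    if max_height <= 1 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return rng.choice(variables)
        return round(rng.uniform(-5.0, 5.0), 3)
    return (rng.choice(OPS),
            random_tree(rng, max_height - 1, variables),
            random_tree(rng, max_height - 1, variables))

def height(tree):
    return 1 if not isinstance(tree, tuple) else 1 + max(height(tree[1]), height(tree[2]))

def paths(tree, prefix=()):
    """Yield the path (sequence of child indices) that leads to every node."""
    yield prefix
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], subtree)
    return tuple(node)

def subtree_crossover(a, b, rng, max_height):
    """Swap a random subtree of a with a valid one of b (height limit only)."""
    nodes_a = list(paths(a))
    rng.shuffle(nodes_a)
    for pa in nodes_a:  # try another node of the first parent if none is valid
        valid = [pb for pb in paths(b)
                 if height(replace(a, pa, get(b, pb))) <= max_height
                 and height(replace(b, pb, get(a, pa))) <= max_height]
        if valid:
            pb = rng.choice(valid)
            return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))
    return a, b

def macromutation(individual, rng, max_height=7):
    """Cross with a fresh random individual and keep one offspring at random."""
    stranger = random_tree(rng, max_height)
    return rng.choice(subtree_crossover(individual, stranger, rng, max_height))
```

Swapping the two roots is always a valid move when both parents respect the
height limit, so the loop in `subtree_crossover` finds a nonempty list in that
case.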
The fitness function comprises a pair of numbers: the mean error of the one-step
prediction of the model, and the absolute difference between the largest Lyapunov
exponents of the model and the training data. Different procedures have been
proposed to compare this kind of compound values [7]. We will use a Pareto
multiobjective evaluation, and guide the search towards obtaining a set of
nondominated individuals. In the most general case, it is said that an individual x
dominates another individual y ($x \prec y$) if all the components $F_j$ of the
fitness vector F verify $F_j(x) \le F_j(y)$, and $\exists t \mid F_t(x) < F_t(y)$.
However, we are not interested in the whole Pareto front, because models with a
high prediction error are not of practical interest. We will discard all models
whose one-step prediction error is higher than the variance of the time series,
no matter their Lyapunov value.
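The dominance relation and the filtering of the front can be sketched in a few
lines. The function names and the convention that the first component of each
fitness pair is the one-step error are assumptions made for illustration.

```python
def dominates(fx, fy):
    """True if fitness vector fx Pareto-dominates fy (all objectives minimized)."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def nondominated(fitnesses, max_error):
    """Drop models whose one-step error exceeds max_error, keep the nondominated rest."""
    admissible = [f for f in fitnesses if f[0] <= max_error]
    return [f for f in admissible
            if not any(dominates(g, f) for g in admissible)]

# Pairs (one-step error, error in the Lyapunov exponent).
population = [(0.10, 0.50), (0.20, 0.10), (0.15, 0.60), (0.90, 0.01)]
front = nondominated(population, max_error=0.5)
```

In this toy population the last model is discarded because of its prediction
error, even though it has the best Lyapunov value, and the third one is
dominated by the first.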
The estimation of the one-step prediction error is immediate. Unfortunately, the
same cannot be said about estimating the largest Lyapunov exponent of a model.
It will be computed, as mentioned, from the time series produced by the recursive
evaluation of the model from a given initial state, discarding the first samples of
the recursive evaluation, so that we are certain that the trajectory is in the
attractor.
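The role of the discarded transient can be illustrated with a simple case. For a
one-dimensional map whose derivative is known, the largest Lyapunov exponent is
the orbit average of log|f'(x)|. The sketch below uses this textbook shortcut; it
is not one of the time-series estimators discussed in this chapter, and the
function name and default sample counts are assumptions.

```python
import math

def largest_lyapunov_1d(f, dfdx, x0, n_discard=1000, n_samples=20000):
    """Average log|f'(x)| along the orbit of a 1-D map, after a transient."""
    x = x0
    for _ in range(n_discard):  # let the trajectory settle on the attractor
        x = f(x)
    total = 0.0
    for _ in range(n_samples):
        total += math.log(max(abs(dfdx(x)), 1e-300))  # guard against log(0)
        x = f(x)
    return total / n_samples

# Logistic map with parameter 4, whose exponent is known to be ln 2.
lam = largest_lyapunov_1d(lambda x: 4.0 * x * (1.0 - x),
                          lambda x: 4.0 - 8.0 * x,
                          x0=0.3)
```

The estimate should be close to ln 2 ≈ 0.693, a positive value, as expected from
a chaotic map. A model that converges to a stable point would yield a negative
value instead.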
We evaluated several numerical algorithms. Our first choice was the well-known
Wolf algorithm [48], which we had already used in previous works.
Unfortunately, the number of samples that this algorithm needs is rather high; this,
in combination with the large number of iterations and the population sizes needed
to obtain good models with multiobjective genetic algorithms makes the whole
The size of the intermediate population can be twice as high as the current
population size, in the worst case. To control the maximum population size, all the
dominated values and duplicated search points are removed at each iteration.
Our selection operator is a variation of that used in the NSGA-II algorithm [9, 10].
In the first place, the set of nondominated search points is computed by pairwise
comparisons of all individuals in X. Observe that we do not need to use fast sorting
algorithms to compute this set, because the size of X in our experiments ranges
between 10 and 25 individuals, and performing $25^2$ comparisons is much faster than
evaluating the fitness value once. Secondly,
• If the size of the set of nondominated search points is small enough, this set is
the new population.
• If its size must be further reduced, we sort the individuals in this last set by
means of the same crowding distance defined in the NSGA-II algorithm, and
choose them in inverse order of distance.
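The two steps of this selection can be sketched as follows, using the crowding
distance as it is defined for NSGA-II. The sketch reads “inverse order of
distance” as keeping the individuals with the largest crowding distance first,
which is an assumption, as are the function names.

```python
def dominates(fx, fy):
    """True if fx Pareto-dominates fy (all objectives minimized)."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def crowding_distance(front):
    """NSGA-II crowding distance of each point in a list of fitness tuples."""
    n = len(front)
    distance = [0.0] * n
    for j in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][j])
        span = front[order[-1]][j] - front[order[0]][j]
        # Boundary points are always preferred.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if span == 0:
            continue
        for r in range(1, n - 1):
            gap = front[order[r + 1]][j] - front[order[r - 1]][j]
            distance[order[r]] += gap / span
    return distance

def select(points, max_size):
    """Nondominated points, truncated by crowding distance when there are too many."""
    unique = list(dict.fromkeys(points))  # remove duplicated search points
    front = [p for p in unique if not any(dominates(q, p) for q in unique)]
    if len(front) <= max_size:
        return front
    dist = crowding_distance(front)
    keep = sorted(range(len(front)), key=lambda i: dist[i], reverse=True)[:max_size]
    return [front[i] for i in sorted(keep)]

points = [(0.0, 1.0), (0.1, 0.9), (0.5, 0.5), (1.0, 0.0), (0.6, 0.6), (0.5, 0.5)]
new_population = select(points, max_size=3)
```

The pairwise comparison is quadratic in the population size, which is harmless
for the 10 to 25 individuals mentioned above.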
The problem with this reasoning is that there are many different networks able to
approximate the former training set without error. Most of them do not correspond
to chaotic models. For instance, observe the one-step prediction errors of the
networks in the table that follows. They are all near zero, and apparently the
models are very precise, although some of them have too low an embedding dimension.
However, in Figure 5 we have plotted the step responses of these models. Observe
that all of them are stable systems, with a point attractor. None of the nets
captured the chaotic nature of the signal.
Multilayer Perceptron
Embedding dimension   Nodes in each layer   Err
1                     1-3-1                 0.000972
2                     2-5-1                 0.000034
3                     3-10-1                0.000004
4                     4-10-1                0.000009
If we use a transparent model instead, the same can happen. In Figure 6 a Genetic
Algorithm was used, with the same representation and operators described in the text
but a scalar fitness (based only on the one-step error). We have trained it with data
from the Henon map. The learned model is not chaotic, though, as pictured in the
center part of the figure. Lastly, in the lower part of the same figure the step
response of a model learned by the MOSA algorithm is shown. This is a chaotic model,
and in the next section we will also show some examples of reconstructed attractors.
Note that the one-step errors of both models, the MOSA and the GA, are close to
zero.
[Plot omitted. Data series: "pareto0.dat", "pareto40.dat", "pareto60.dat", "pareto80.dat",
"pareto100.dat" and "pareto.dat" (columns 3:4); both axes range from 0.4 to 1.2.]
Fig. 4 Evolution of the population in the MOSA algorithm, for a two-criteria problem taken
from [15]
[Plots omitted. Upper panel: "mac2.dat". Central and lower panels: "SerieArtificial.dat".
Horizontal axes range from 0 to 1000.]
Fig. 5 Chaotic time series, and multilayer perceptrons with one hidden layer, trained to minimize
the one-step error. Upper part: Chaotic signal (train data). Central part: Step responses of the neural
networks 1-3-1 and 2-5-1. Lower part: Networks 3-10-1 and 4-10-1. Despite the low values of the
objective function shown in the text, none of the neural nets is a chaotic model
In this section we will compare the results of MOSA and NSGA-II over some
benchmark problems. The NSGA-II is an implementation of the Pareto-based
multiobjective genetic algorithm detailed in [9, 10], which is currently regarded
as one of the best available implementations of this kind of algorithm.
The results will be shown with two different methodologies, graphical and statis-
tical. The graphical (qualitative) approach serves to identify the differences between
the combined Pareto fronts after a certain number of repetitions of each exper-
Fig. 6 Graphical analysis of experimental results, Henon map. Upper part: Train data. Center:
Typical recursive evaluation of a transparent model obtained by an evolutionary algorithm when the
maximum Lyapunov exponent is not included in the fitness function: In this case, the optimization
has converged to a stable model (Lyapunov exponent < 0.) Bottom: Step response of one of the
models found with the procedures mentioned in this paper
The parameters of the operators used in the experiments are shown in the
following tables:

NSGA-II
Parameter                   Value
Structural crossover        0.5
Parametric crossover        0.5
Mutation                    0.01
Embedding dimension         2
Population size             100
Evaluations of fitness      5000
Constants minimum value     −5
Constants maximum value     5

MOSA
Parameter                   Value
Initial temperature         1.00
Cooling factor              0.999
Structural mutation         0.5
Parametric mutation         0.5
Embedding dimension         2
Maximum population size     10
Evaluations of fitness      5000
Constants minimum value     −5
Constants maximum value     5
The learning time is roughly proportional to the number of times that we estimate
the largest Lyapunov exponent of a model, and both algorithms are allowed to
evaluate this function 5,000 times. Since this estimation is not performed when
the one-step error is higher than the variance of the time series, this is
equivalent to roughly 50 to 100 generations of the NSGA-II algorithm. The
parameters defining the random initialization of the individuals are as follows:
Parameter                                   Value
Maximum number of nodes in equations        10
Prob. of number of nodes/equation, 1-10     .05 .12 .11 .15 .15 .15 .11 .08 .05 .03
Maximum height                              7
Height probability distribution, 1-7        .05 .4 .3 .15 .05 .025 .025
Node types                                  +; −; *; G; Dz;
Node type probability distribution          .21 .21 .21 .21 .09 .07
Each experiment was repeated 10 times. The time series used for training and
validation have size 1,000. The chaotic systems that have been used are the
Logistic and the Henon maps, with the set of parameters shown in the equations
that follow:

Logistic map: $$x_{k+1} = 4.0\, x_k (1 - x_k) \qquad (2)$$

Henon map: $$x_{k+1} = 0.3\, y_k + 1 - 1.4\, x_k^2, \qquad y_{k+1} = x_k \qquad (3)$$
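Series such as those used for training can be generated with a few lines of
code. The sketch below iterates equations (2) and (3); the initial states and
the function names are assumptions, since the text does not state which initial
conditions were used.

```python
def logistic_series(n, x0=0.3):
    """Iterate the Logistic map of equation (2) and return n samples."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs

def henon_series(n, x0=0.0, y0=0.0):
    """Iterate the Henon map of equation (3) and return n samples of x."""
    xs, x, y = [], x0, y0
    for _ in range(n):
        xs.append(x)
        # Both variables are updated simultaneously from the previous state.
        x, y = 0.3 * y + 1.0 - 1.4 * x * x, x
    return xs

logistic = logistic_series(1000)
henon = henon_series(1000)
```

In practice the first samples would be discarded, so that the training data lie
on the attractor.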
The graphical results are displayed in Figures 7 and 8. In both cases, we have
obtained the combined Pareto front (upper left part) after 10 repetitions of either
algorithm. This combined Pareto front is formed by selecting all the nondominated
individuals of the 10 runs. In the upper right part, all the elements of the 10 Pareto
fronts of each algorithm are displayed together, in the same graph. Lastly, in the
lower right part of the figures we have displayed a couple of reconstructed attractors
[Plots omitted. Upper panels: MOSA ("PAR-GLOBAL-MOSA") and NSGA-II ("PAR-GLOBAL-NSGA") on
logarithmic axes. Lower panels: "atractor.dat", columns 1:2 (left) and 3:4 (right), with both
axes ranging from −1.5 to 1.5.]
Fig. 7 Graphical analysis of experimental results, Henon map. Upper part, left: Combined Pareto
front of ten repetitions of the algorithms NSGA-II (triangles) and MOSA (circles). All the models
in the Pareto front of the NSGA-II algorithm are dominated by at least one element in the Pareto
front of the MOSA. A logarithmic scale is used, to enhance the differences. The vertical axis
represents the error in the Lyapunov exponent, the horizontal one is the one-step error. Upper part,
right: combined cloud of the 10 Pareto fronts of both experiments, from which the Pareto fronts
were calculated. Lower part, left: Attractor of the Henon map. Lower part, right: Attractor of one
of the models induced by the MOSA method
[Plots omitted. Upper panels: MOSA ("PAR-GLOBAL-MOSA") and NSGA-II ("PAR-GLOBAL-NSGA") on
logarithmic axes. Lower panels: "atractor.dat", columns 1:2 (left) and 3:4 (right), with both
axes ranging from 0 to 1.]
Fig. 8 Graphical analysis of experimental results, Logistic map. Upper part, left: Combined Pareto
front of ten repetitions of the algorithms NSGA-II (triangles) and MOSA (circles). All but one of
the models in the Pareto front of the NSGA-II algorithm are dominated by at least one element in
the Pareto front of the MOSA. A logarithmic scale is used, to enhance the differences. The vertical
axis represents the error in the Lyapunov exponent, the horizontal one is the one-step error. Upper
part, right: combined cloud of the 10 Pareto fronts of both experiments, from which the Pareto
fronts were calculated. Lower part, left: Attractor of the Logistic map. Lower part, right: Attractor
of one of the models induced by the MOSA method
that show the similarities between the dynamic behavior of the models and that of
the original system (left part).
While there is a clear difference between the combined fronts (all of the points in
the NSGA-II front are dominated by those of the MOSA), this is not so in the second
graph, since some of the executions of MOSA were dominated by NSGA-II and
vice versa. The extent to which, on average, one algorithm is better than the other
will be studied in the next section.
There are functions (unary indicators) that can convert a Pareto front into a repre-
sentative value. It is possible to compare sets of these representative values with
the same methodology used in scalar evolutionary algorithms, i.e., a statistical test
able to discard that the expected errors are the same. However, some studies have
shown that these unary indicators are not able to show all the dominance relations
that can happen between Pareto fronts [52]. Therefore, to assess the average
improvement between one algorithm and the other, we will use a method based on a
binary indicator, namely, the binary ε-indicator defined in [51].
Two different definitions of this last indicator are possible: the standard
(multiplicative) $I_\varepsilon$ and the additive indicator $I_{\varepsilon+}$. Given
two fronts A and B, if $I_\varepsilon(A, B) < 1$ and $I_\varepsilon(B, A) > 1$, or if
$I_{\varepsilon+}(A, B) < 0$ and $I_{\varepsilon+}(B, A) > 0$, we can state that A
dominates B. The values of these indicators for our combined Pareto fronts follow:
            $I_\varepsilon$(MOSA, NSGA)      $I_\varepsilon$(NSGA, MOSA)
Henon       0.25                             122.98
Logistic    0.13                             201.483

            $I_{\varepsilon+}$(MOSA, NSGA)   $I_{\varepsilon+}$(NSGA, MOSA)
Henon       −0.04                            0.19
Logistic    $-6 \cdot 10^{-4}$               0.04
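For reference, both indicators can be computed with the usual max-min-max
formulas for minimization problems. The sketch below assumes that every front is
a list of fitness tuples and, for the multiplicative version, that all objective
values are strictly positive; the function names are hypothetical.

```python
def eps_multiplicative(A, B):
    """Smallest factor by which A must be scaled to weakly dominate B."""
    return max(min(max(a_j / b_j for a_j, b_j in zip(a, b)) for a in A)
               for b in B)

def eps_additive(A, B):
    """Smallest shift that must be added to A for it to weakly dominate B."""
    return max(min(max(a_j - b_j for a_j, b_j in zip(a, b)) for a in A)
               for b in B)

# Toy fronts: every point of B is dominated by a point of A.
A = [(1.0, 2.0), (2.0, 1.0)]
B = [(2.0, 4.0), (4.0, 2.0)]
values = (eps_multiplicative(A, B), eps_multiplicative(B, A),
          eps_additive(A, B), eps_additive(B, A))
```

For these toy fronts the multiplicative indicator is below 1 in one direction
and above 1 in the other, and the additive indicator changes sign accordingly,
which is the same pattern shown by the values in the tables above.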
In both cases, we can conclude that the combined MOSA results dominate those of
NSGA-II. These results are not conclusive, though, since one exceptionally good
result of either algorithm could be responsible for the dominance of the combined
Pareto front. Therefore, we propose to apply the ε-indicator to perform a full set of
comparisons between all pairs of fronts, and to calculate the fraction of times each
instance of the algorithm A dominates one of the instances of the algorithm B, and
vice versa.
Our methodology is as follows: Let $p_A(B)$ be 1 if A dominates B (i.e., when
$I_\varepsilon(A, B) < 1$ and $I_\varepsilon(B, A) > 1$), and 0 otherwise. Given 10
repetitions $B_1, \ldots, B_{10}$ of an algorithm B, let

$$P_A(B) = \frac{1}{10} \sum_{i=1}^{10} p_A(B_i). \qquad (4)$$

Evaluating $P_A(B)$ for each repetition of the algorithm A yields a vector that can
be seen as a sample of a random variable: the fraction of times that the output of
the algorithm A dominates that of the algorithm B. If the expectation of $P_A(B)$ is
greater than the expectation of $P_B(A)$, then we can state that the algorithm A is
better than the algorithm B, since it is more likely that the results of the former
improve those of the latter than the opposite.
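The fractions of equation (4) can be computed as sketched below. The code takes A
to dominate B when the multiplicative indicator is below 1 from A to B and above
1 from B to A, which is the reading consistent with the indicator values reported
for the combined fronts. Function names are hypothetical, and the fronts are
assumed to hold strictly positive objective values.

```python
def eps(A, B):
    """Multiplicative epsilon indicator for minimization (positive objectives)."""
    return max(min(max(a_j / b_j for a_j, b_j in zip(a, b)) for a in A)
               for b in B)

def front_dominates(A, B):
    """p_A(B): True when front A dominates front B according to the indicator."""
    return eps(A, B) < 1.0 and eps(B, A) > 1.0

def dominance_fractions(runs_a, runs_b):
    """One value of P_A(B) per run of algorithm A, as in equation (4)."""
    return [sum(front_dominates(A, B) for B in runs_b) / len(runs_b)
            for A in runs_a]

# Two runs of each algorithm, each run being a small Pareto front.
runs_a = [[(1.0, 2.0), (2.0, 1.0)], [(3.0, 5.0), (5.0, 3.0)]]
runs_b = [[(2.0, 4.0), (4.0, 2.0)], [(1.5, 3.0), (3.0, 1.5)]]
p_ab = dominance_fractions(runs_a, runs_b)
p_ba = dominance_fractions(runs_b, runs_a)
```

The two resulting samples, `p_ab` and `p_ba`, are the inputs of the statistical
comparison between both algorithms.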
Therefore, to know whether there is a significant difference between the two
algorithms we can use a statistical test to discard that the expectations of
$P_A(B)$ and $P_B(A)$ are the same. Since neither distribution was compatible with
the Gaussian distribution, we have used a Wilcoxon test (null hypothesis
$E(P_A(B)) = E(P_B(A))$, alternate hypothesis $E(P_A(B)) > E(P_B(A))$). The resulting
p-values are shown in the following table:
[Boxplots omitted. Vertical axis from 0.0 to 1.0; boxes 1 and 2 in each panel.]
Fig. 9 Boxplots of (1) $P_{\text{MOSA}}(\text{NSGA-II})$ and (2) $P_{\text{NSGA-II}}(\text{MOSA})$ for the Henon map (left part) and
Logistic map (right part). This graph shows that the probability that MOSA improves on NSGA-II is
higher than the probability that NSGA-II improves on MOSA in both problems
p-value
Henon 0.00020
Logistic 0.00013
We can discard, with a confidence greater than 99%, that the means of both
variables are the same, in favor of the alternate hypothesis. Thus we can conclude
that MOSA is a significant improvement with respect to NSGA-II in this particular
application. In Figure 9 the boxplots of $P_{\text{MOSA}}(\text{NSGA-II})$ and
$P_{\text{NSGA-II}}(\text{MOSA})$ for both problems are also given.
The same can be said about unstable models, which are currently detected by means of
heuristics (i.e., limits in the range of the output of the recursive evaluation). The
full spectrum or, at the least, the Kolmogorov entropy of the model should be
evaluated and taken into account along with the one-step error and the largest
exponent.
Acknowledgments The research in this paper has been funded by project TIN2005-08386-C05-
05, M.E.C., Spain
References
1. A. Alvarez, A. Orfila and J. Tintore. DARWIN: An evolutionary program for nonlinear
modeling of chaotic time series. Computer Physics Communications, 136, pp. 334–349, 2000
2. E.K. Burke and J.D. Landa Silva. Improving the Performance of Trajectory-Based Multiob-
jective Optimisers by Using Relaxed Dominance. In: Lipo Wang, Kay Chen Tan, Takeshi
Furuhashi, Jong-Hwan Kim and Xin Yao, editors, Proceedings of the 4th Asia-Pacific Con-
ference on Simulated Evolution and Learning (SEAL’02), 1, pp. 203–207, Nanyang Technical
University, Orchid Country Club, Singapore, November 2002
3. H. Cao, L. Guo, Y. Chen and T. Guo, The Dynamic Evolutionary Modeling of HODEs for
Time Series Prediction., Computers and Mathematics with Applications, 46, pp. 1397–1411,
2003
4. Y.S. Chang, K.S. Park and B.Y. Kim, Nonlinear model for ECG R-R interval variation using
genetic programming approach. Future Generation Computer Systems, 21(7), pp. 1117–1123,
2005
5. I-F. Chung, C-J. Lin and C-T. Lin. A GA-based fuzzy adaptive learning control network. Fuzzy
Sets and Systems, 112, pp. 65–84, 2000
6. C.A. Coello. List of References on Evolutionary Multiobjective Optimization.
http://www.lania.mx/ccoello/EMOO/EMOObib.html
7. C.A. Coello. An Updated Survey of Evolutionary Multiobjective Optimization Techniques:
State of the Art and Future Trends. In 1999 Congress on Evolutionary Computation, IEEE
Service Center, 1, pp. 3–13, Washington, DC, 1999
8. P. Czyzak and A. Jaszkiewicz. Pareto simulated annealing — a metaheuristic technique for
multiple-objective combinatorial optimization. Journal of Multi-Criteria Decision Analysis, 7,
pp. 34–47, 1998
9. K. Deb, Samir Agrawal, Amrit Pratap and T. Meyarivan. A Fast Elitist Non-Dominated
Sorting Genetic Algorithm for Multi-Objective Optimization: NSGA-II. In: Marc Schoenauer,
K. Deb, Günter Rudolph, Xin Yao, Evelyne Lutton, Juan Julian Merelo and Hans-Paul
Schwefel, editors. Proceedings of the Parallel Problem Solving from Nature VI Conference.
Paris, France. Springer. Lecture Notes in Computer Science No. 1917, pp. 849–858, 2000
10. K. Deb and Tushar Goel. Controlled Elitist Non-dominated Sorting Genetic Algorithms for
Better Convergence. In: E. Zitzler, K. Deb, L. Thiele, C.A. Coello and David Corne, edi-
tors. First International Conference on Evolutionary Multi-Criterion Optimization. Springer-
Verlag. Lecture Notes in Computer Science No. 1993, pp. 67–81, 2001
11. K. Downing. Using evolutionary computational techniques in environmental modelling. Envi-
ronmental Modelling and Software, 13, pp. 519–528, 1998
12. C. Evans, P.J. Fleming, D.C. Hill, J.P. Norton, I. Pratt, D. Rees and K. Rodriguez-Vazquez.
Application of system identification techniques to aircraft gas turbine engines. Control
Engineering Practice, 9, pp. 135–148, 2001
13. A.I. Fernandez, L. Sanchez and J. J. Navarro. Approximating the discrete space equation from
chaotic noisy data (IPMU’2000). In Proceedings of Information Processing and Management
of Uncertainty in Knowledge-Based Systems, Madrid, Spain, pp. 149–156, 2000
14. D.B. Fogel and L.J. Fogel. Preliminary Experiments on Discriminating between Chaotic Sig-
nals and Noise using Evolutionary Programming. In: J.R. Koza, D.E. Goldberg, D.B. Fogel
and R. L. Riolo, editors. Genetic Programming 96., MIT Press, Cambridge, MA, 1996
15. C.M. Fonseca and Peter J. Fleming. Multiobjective Optimization and Multiple Constraint
Handling with Evolutionary Algorithms — Part I: A Unified Formulation. IEEE Transactions
on Systems, Man, and Cybernetics, Part A: Systems and Humans, 28(1), pp. 26–37, 1998
16. G.J. Gray, D.J. Murray-Smith, Y. Li, K.C. Sharman and T. Weinbrenner. Nonlinear model
structure identification using genetic programming. Control Engineering Practice, 6, pp.
1341–1352, 1998
17. N.F. Guler, E.D. Ubeyli and I. Guler. Recurrent neural networks employing Lyapunov ex-
ponents for EEG signals classification. Expert Systems with Applications, 29, pp. 506–514,
2005
18. M. Hapke, Andrzej Jaszkiewicz and Roman Slowinski. Pareto Simulated Annealing for Fuzzy
Multi-Objective Combinatorial Optimization. Journal of Heuristics, 6(3) pp. 329–345, August
2000
19. M. Hernández-Guía, R. Mulet and S. Rodríguez-Pérez. A New Simulated Annealing Algorithm
for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media.
Physical Review E, 72 (3), 2005
20. W. Jiang, Q. Guo-Dong and D. Bin. Observer-based robust adaptive variable universe fuzzy
control for chaotic system. Chaos, Solitons and Fractals, 23, pp. 1013–1032, 2005
21. T. Jones. Crossover, macromutation and population-based search. In: 6th International
Conference on Genetic Algorithms. San Francisco, July 15–19, Morgan Kaufmann, 1, pp. 73–80,
2005
22. H. Kantz. A robust method to estimate the maximal Lyapunov exponent of a time series. Phys.
Lett. A, 185, pp. 77–87, 1994
23. D. Kim. Improving the fuzzy system performance by fuzzy system ensemble. Fuzzy Sets and
Systems, 98, pp. 43–56, 1998
24. D. Kugiumtzis, B. Lillekjendlie and N. Christophersen. Chaotic time series. Part I:
Estimation of some invariant properties in state space. Identification and Control, 4(15),
pp. 205–224, 1995
25. B. Lillekjendlie, D. Kugiumtzis and N. Christophersen. Chaotic time series part II: System
identification and prediction. Identification and Control, 4(15), pp. 225–243, 1995
26. A. Lopez, H. Lopez and L. Sanchez. Graph based GP applied to dynamical system modeling.
In: Connectionist Models of Neurons, Learning Processes and Artificial Intelligence, 6th Inter-
national Work-Conference on Artificial and Natural Neural Networks, IWANN 2001. Lecture
Notes in Computer Science, 2084, pp. 725–732, 2001
27. S. Luke. Two Fast Tree-Creation Algorithms for Genetic Programming. IEEE Transactions on
Evolutionary Computation, 4(3), pp. 274–283, September 2000
28. S. Luke and Liviu Panait. A Survey and Comparison of Tree Generation Algorithms. In: Lee
Spector et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO’2001). San Francisco, California, Morgan Kaufmann, pp. 81–88, 2001
29. M.T. Rosenstein, J.J. Collins and C.J. De Luca. A practical method for calculating largest
Lyapunov exponents from small data sets. Physica D, 65, pp. 117-134, June, 1993
30. M.W. Mak, K.W. Ku and Lu. On the improvement of the real time recurrent learning algorithm
for recurrent neural networks. Neurocomputing, 24, pp. 13–36, 1999
31. M.A. Matos and Paulo Melo. Multiobjective Reconfiguration for Loss Reduction and Service
Restoration Using Simulated Annealing. In: International Conference on Electric Power
Engineering, Budapest 99. IEEE, pp. 213–218, 1999
32. Y. Mei-Ying and W. Xiao-Dong. Chaotic time series prediction using least squares support
vector machines. Chinese Physics, 13, pp. 454–458, 2004
33. Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs, 3rd ed.
Springer-Verlag, 1996
34. S. Mukherjee, E. Osuna and F. Girosi. Nonlinear Prediction of Chaotic Time Series Using
Support Vector Machines. in: Proceedings of IEEE NNSP’97, Amelia Island, FL, USA, IEEE
Service Center, pp. 24–26, September 1997
35. D. Nam and Cheol Hoon Park. Multiobjective Simulated Annealing: A Comparative Study to
Evolutionary Algorithms. International Journal of Fuzzy Systems, 2(2), pp. 87–97, 2000
36. D. Nam and Cheol Hoon Park. Pareto-Based Cost Simulated Annealing for Multiobjective Op-
timization. In: Lipo Wang, Kay Chen Tan, Takeshi Furuhashi, Jong-Hwan Kim and Xin Yao,
editors. Proceedings of the 4th Asia-Pacific Conference on Simulated Evolution and Learning
(SEAL’02). Nanyang Technical University, Orchid Country Club, Singapore, 2, pp. 522–526,
November 2002
37. R. Poli and Nicholas F. McPhee. Exact GP Schema Theory for Headless Chicken Crossover
and Subtree Mutation. In: Proceedings of the 2001 Congress on Evolutionary Computation
CEC2001 COEX, World Trade Center, 159 Samseong-dong, Gangnam-gu, Seoul, Korea,
IEEE Press, pp. 1062–1069, 2001
38. P. Potocnik and I. Grabec. Nonlinear model predictive control of a cutting process. Neuro-
computing, 43, pp. 107–126, 2002
39. K. Rodríguez-Vázquez, C.M. Fonseca and P.J. Fleming. Identifying the Structure of
NonLinear Dynamic Systems Using Multiobjective Genetic Programming. IEEE Transactions on
Systems, Man, and Cybernetics, Part A: Systems and Humans, 34(4), pp. 531–545, July 2004
40. J.J. Rowland. Model selection methodology in supervised learning with evolutionary compu-
tation. BioSystems, 72, pp. 187–196, 2003
41. A.E. Ruano, P.J. Fleming, C. Teixeira, K. Rodriguez-Vazquez and C.M. Fonseca. Nonlinear
identification of aircraft gas-turbine dynamics. Neurocomputing, 55, pp. 551–579, 2003
42. L. Sanchez, I. Couso and J.A. Corrales. Combining GP operators with SA search to evolve
Fuzzy Rule based classifiers. Information Sciences, 1–5, pp. 175–192, 2001
43. R.S. Sexton and J.N.D. Gupta. Comparative evaluation of genetic algorithm and backpropa-
gation for training neural networks. Information Sciences, 129, pp. 45–59, 2000
44. T. Shin and I. Han. Optimal signal multi-resolution by genetic algorithms to support artificial
neural networks for exchange-rate forecasting. Expert Systems with Applications, 18, pp.
257–269, 2000
45. Kevin I. Smith, Richard M. Everson and Jonathan E. Fieldsend. Dominance Measures
for Multi-Objective Simulated Annealing. In: 2004 Congress on Evolutionary Computation
(CEC’2004). Portland, Oregon, USA. IEEE Service Center, 1, pp. 23–30, June, 2004
46. J.C. Sprott. Chaos and Time-Series Analysis. Oxford University Press, 2003
47. Z. Wei, W. Zhi-ming and Y. Gen-ke. Genetic programming-based chaotic time series model-
ing. Journal of Zhejiang University SCIENCE, 5(11), pp. 1432–1439, 2004
48. A. Wolf, J.B. Swift, H.L. Swinney and J. A. Vastano. Determining Lyapunov Exponents from
a Time Series. Physica D, 16, pp. 285–317, 1985
49. A.M. Woodward, R.J. Gilbert and D. B. Kell. Genetic programming as an analytical tool for
non-linear dielectric spectroscopy. Bioelectrochemistry and Bioenergetics, 48, pp. 389–396,
1999
50. E. Zitzler, K. Deb and L. Thiele. Comparison of Multiobjective Evolutionary Algorithms:
Empirical Results. Evol. Comput., 8(2), pp. 173–195, 2000
51. E. Zitzler, M. Laumanns, L. Thiele, C.M. Fonseca and V. Grunert da Fonseca, Why Qual-
ity Assessment of Multiobjective Optimizers Is Difficult. In: W.B. Langdon, E. Cantú-Paz,
K. Mathias, R. Roy, D. Davis, R. Poli and K. Balakrishnan, V. Honavar, G. Rudolph,
J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E. Burke and N. Jonoska, edi-
tors. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’2002).
San Francisco, California, pp. 666–673, July 2002
52. E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca and V. Grunert da Fonseca. Performance
Assessment of Multiobjective Optimizers: An Analysis and Review. IEEE Transactions on Evo-
lutionary Computation, 7(2), pp. 117–132, April 2003
An Integrated Fuzzy Inference-based
Monitoring, Diagnostic, and Prognostic System
for Intelligent Control and Maintenance
Abstract With the advent of modern computation, intelligent control and mainte-
nance systems have become a viable option for complex engineering processes and
systems. Such control and maintenance systems can be generically described as be-
ing composed of 5 analysis steps: (1) predict the expected system signals from their
measured values, (2) use the residual of the measured and predicted value to deter-
mine if the system is operating in a nominal or a degraded mode, (3) if the system
is operating in a degraded mode, diagnose the fault, (4) prognose the failure by es-
timating the remaining useful life (RUL) of the system, and (5) use the collected
information to determine if an appropriate control or maintenance action should be
performed to maintain the health and safety of the system performance. This chap-
ter presents the development and adaptation of a single generic inference procedure,
namely the nonparametric fuzzy inference system (NFIS), for monitoring, diagnos-
tics, and prognostics. To illustrate the proposed methodologies, the embodiments of
the NFIS are used to detect, diagnose, and prognose faults in the steering system
of an automated oil drill. The embodiments of the NFIS were found to have simi-
lar performance to traditional algorithms, such as autoassociative kernel regression
(AAKR) and k-nearest neighbor (kNN), for monitoring and diagnosis. The NFIS
prognoser was also shown to estimate the remaining useful life of the steering sys-
tem to within an hour of its actual time of failure.
Dustin R. Garvey and J. Wesley Hines
The University of Tennessee, Knoxville, Department of Nuclear Engineering, Knoxville, TN
37996-2300, United States of America, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 203–222.
© 2008 Springer.
1 Introduction
The ability to monitor and control complex systems has been of interest for decades
with a myriad of successful applications; however, the ability to identify system
degradation and predict remaining useful life (RUL) has proved much more difficult.
Research in prognostic methods has recently come to the forefront as companies
strive to become more competitive and as the US Department of Defense
requires prognostic capabilities in new weapon systems. The desired system would
take the form of an integrated system for monitoring, detection, identification, and
prognostics.
Traditional reliability methods [11, 24] predict system or device RUL based on his-
torical data collected from a population of identical or similar devices. However,
these predictions are accurate only for an “average”, or typical, device. Predictions
for an individual device are far more useful because the uncertainty is typically
much smaller, and, thus, are the focus of more recent research. Improved prognostic
methods use covariate information and cumulative damage models [5]. These meth-
ods provide a prediction based on how long the average component would operate
under the current conditions. More recent techniques use degradation data to assess
equipment condition and predict future behavior, such as time to failure (TTF) or
RUL. These individualized prognostics techniques have the ability to make RUL
predictions with less uncertainty than population-based methods; however, they re-
quire measurement information related to the equipment degradation. A detailed re-
view of the reliability data-analysis methods using degradation measurements rather
than time-to-failure data is given by Lu and Meeker [23] and a recent review of re-
search in the field of prognostics and health management (PHM) for electronics is
given by Vichare and Pecht [28].
Prognostics methods require either detailed physics-of-failure models or failure
data to train empirical failure modes. Because detailed physics models are usually
difficult to construct for each failure mode and sufficient historical failure data is
rarely available, successful prognostic applications are rare. In industry, when equip-
ment degradation is detected, maintenance procedures are implemented that restore
or replace the failing item. If items are not maintainable, usually the item is re-
designed to remove the fault mode. Items that are allowed to fail in service are not
usually monitored, or they fail so rapidly that prognostics would not be beneficial.
These are the main reasons that successful prognostics applications are not readily
available for study.
The nonparametric fuzzy inference system (NFIS) is a fuzzy inference system (FIS),
whose membership function centers and parameters are observations of exemplar
inputs and outputs. This approach is unique in that previous algorithms described
in the literature use “composed” observations to parameterize the membership func-
tions (MF) of the FIS. For example, Germond and Niebur [13] use expert knowledge
to create MFs about composed patterns that map to qualitative features such as hot,
cold, high, and low. Another popular approach for MF parameterization is partition-
ing [22]. In fuzzy partitioning, the data space is partitioned into regions and MFs
are created about the centers of these regions. Here, the composed patterns are the
region centers. A similar approach implemented in unsupervised clustering algorithms,
such as fuzzy c-means [3, 10] and Adeli-Hung [1] clustering, centers the
MFs on composed cluster centers and calculates the cluster parameters in terms of
the distance from the cluster center. In yet another approach, the parameters of the
MFs can be determined by performing least squares optimization of the FIS inputs
and outputs [21].
At this point, the NFIS inference procedure will be briefly described. For a more
detailed explanation, refer to Garvey [12]. Suppose n exemplar observations of the p
inputs and r outputs that characterize the system’s normal operating conditions (S)
are collected. These observations should cover the system’s future operating space.
As with any nonlinear, empirical prediction algorithm, confidence cannot be given
to predictions made outside the trained region. The NFIS will infer the system’s
mathematical relationship:
Y = S(X)
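The exemplar observations are represented by two matrices, X and Y, in which
$X_{i,j}$ is observation i of input j and $Y_{i,k}$ is observation i of output k.
\[
X = \begin{bmatrix}
X_{1,1} & X_{1,2} & \cdots & X_{1,p} \\
X_{2,1} & X_{2,2} & \cdots & X_{2,p} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n,1} & X_{n,2} & \cdots & X_{n,p}
\end{bmatrix}
\qquad
Y = \begin{bmatrix}
Y_{1,1} & Y_{1,2} & \cdots & Y_{1,r} \\
Y_{2,1} & Y_{2,2} & \cdots & Y_{2,r} \\
\vdots & \vdots & \ddots & \vdots \\
Y_{n,1} & Y_{n,2} & \cdots & Y_{n,r}
\end{bmatrix}
\]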
In the NFIS, the MFs from the exemplar inputs and outputs are directly defined by
the data matrices. As an example, consider creating the MFs for five exemplar obser-
vations of a single input. For the sake of simplicity, also assume that the exemplars
are sorted from smallest to largest, i.e.
\[
X = \begin{bmatrix} X_{1,1} \\ X_{2,1} \\ X_{3,1} \\ X_{4,1} \\ X_{5,1} \end{bmatrix},
\qquad X_{1,1} < X_{2,1} < X_{3,1} < X_{4,1} < X_{5,1}
\]
Fig. 2 Final triangular membership functions for the five exemplar observations and an overlap of
two (horizontal axis: x1, with the exemplars X1,1 through X5,1 marked; vertical axis: membership
µ(x1), from 0 to 1)
In the NFIS MF creation algorithm, triangular MFs are centered on the exemplar
observations and the endpoints of each MF support are set to neighboring signal
observations. The proximity of the neighbors is controlled by an overlap parameter.
For example, the right endpoint of a triangular MF for the ith exemplar observation
is set to the (i + overlap)th observation. The parameters for the boundary MFs are
defined in terms of the half-width of the current MF. For an overlap parameter of 2,
the MFs presented in Figure 2 are obtained. This process is repeated for each input
and output signal to obtain the remaining MFs.
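The MF creation rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the boundary rule (mirroring the one available half-width when the (i ± overlap)th neighbor does not exist) is an assumed reading of "defined in terms of the half-width of the current MF", not a detail confirmed by the text.

```python
import numpy as np

def triangular_mf_params(x, overlap=2):
    """Return one (left, center, right) triple per exemplar of a single signal."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    params = []
    for i in range(n):
        center = x[i]
        # Interior rule from the text: endpoints are the (i - overlap)th and
        # (i + overlap)th exemplar observations.
        left = x[i - overlap] if i - overlap >= 0 else None
        right = x[i + overlap] if i + overlap < n else None
        if left is None and right is None:
            raise ValueError("need more than `overlap` exemplars")
        # ASSUMPTION: at a boundary, mirror the half-width that is available.
        if left is None:
            left = center - (right - center)
        if right is None:
            right = center + (center - left)
        params.append((left, center, right))
    return np.array(params)

def triangular_membership(q, left, center, right):
    """Membership of query value q to a triangular MF."""
    if q == center:
        return 1.0
    if q <= left or q >= right:
        return 0.0
    if q < center:
        return (q - left) / (center - left)
    return (right - q) / (right - center)

# Five sorted exemplars and an overlap of 2, as in the example above.
params = triangular_mf_params([1.0, 2.0, 3.0, 4.0, 5.0], overlap=2)
```

With these hypothetical exemplar values, the MF centered on the third exemplar spans the first to the fifth exemplar, which matches the overlap-of-two construction described above.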
To estimate the response for an observation of the inputs, the previously presented
FIS with the created MFs is used. Instead of the traditional MIN (AND) operator,
the MEAN operator is used to determine the degree of fulfillment (DOF), i.e., the
extent to which each rule fires. This concludes the derivation of the general NFIS
framework. Next, the framework will be used to implement the five analysis steps of
the control/maintenance system.
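The difference between the MEAN and MIN operators can be illustrated with a small sketch. The function name and the membership values are hypothetical; the array layout (one row per exemplar rule, one column per input signal) is an assumption made for the example.

```python
import numpy as np

def degree_of_fulfillment(memberships, operator="mean"):
    """Combine per-input memberships into one DOF per exemplar (rule).

    memberships: array of shape (n_exemplars, n_inputs).
    """
    m = np.asarray(memberships, dtype=float)
    if operator == "mean":
        return m.mean(axis=1)
    if operator == "min":
        return m.min(axis=1)
    raise ValueError("operator must be 'mean' or 'min'")

# Hypothetical memberships of one query to two exemplars over two inputs.
memberships = [[0.8, 0.0],
               [0.5, 0.5]]
dof_mean = degree_of_fulfillment(memberships, "mean")
dof_min = degree_of_fulfillment(memberships, "min")
```

In this example the first rule still fires partially under MEAN even though one input has zero membership, whereas MIN suppresses it entirely.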
This section provides a description of the different embodiments of the general NFIS
used in the integrated system: prediction, detection, diagnosis, and prognosis. As a
starting point, the integrated monitoring, diagnosis, and prognosis system is pre-
sented in Figure 3. Here, asset (system or process) data is collected and digitized.
The collected data is then passed to a signal selector, which takes the input signals
and extracts previously identified, correlated signals. The collected observations
of the signals are then presented as inputs to an NFIS predictor, which produces
estimates of the “correct” signal values from their measured values. The prediction
residuals are then compared to the NFIS estimates by a cumulative sum (CUMSUM)
or sequential probability ratio test (SPRT) statistical detector, which determines if
the asset is operating in a nominal or degraded mode. If the detector output indicates
that the asset is operating normally (no fault/anomaly), then no maintenance/control
action is executed and the monitoring, diagnostic, and prognostic system examines
the next observation of the asset signals. However, if the detector output indicates
that the asset is operating in a degraded mode, the prediction and detection results
are passed to an NFIS diagnoser, which maps the provided symptom patterns (prediction
residuals, signals, alarms, etc.) to known fault conditions. Next, the prediction,
detection, and diagnosis results are passed to an NFIS prognoser, which estimates
RUL of the asset. Finally, the prediction, detection, diagnosis, and prognosis
results are used to determine an appropriate maintenance or control action.
Fig. 3 Block diagram of the fuzzy inference-based monitoring, diagnostic, and prognostic system
for an autoassociative predictor architecture (blocks: SPRT/CUMSUM Detector, Fault? decision
with NO/YES branches, NFIS Diagnoser, NFIS Prognoser, Control/Maintenance Action, Operator)
In the remaining sections, the details of the different embodiments of the NFIS
will be described, beginning with the NFIS predictor.
3.1 Prediction
The NFIS methodology was previously presented for the prediction application;
therefore, an extensive discussion of the NFIS as a predictor is not necessary here.
It is, however, important to describe the settings that are used to define the NFIS
architecture.
The NFIS architecture settings include options that are common to other non-
parametric predictors, such as the number of memory or exemplar vectors used to
define the system and the vector selection technique. A discussion of optimal vec-
tor selection is beyond the scope of this work and the reader is referred to a survey
paper by Hines and Garvey [17].
An important user selectable NFIS parameter is the membership function over-
lap. Recall that the overlap parameter controls the width of the MFs that are created
for each of the selected exemplar observations. This is similar to the kernel width
used in radial basis functions, generalized regression neural networks, and kernel
regression. The overlap parameter can be interpreted as a regularization parameter
because a larger overlap allows more exemplars to be deemed similar to the query,
which results in smoother model predictions.
The final NFIS architecture parameter is the implication method, which controls
how the memberships to each of the signals or variables are combined to obtain a
DOF for each exemplar observation. Common implication methods are the mini-
mum, maximum, sum, and mean operators. In general, the implication method does
not significantly affect the NFIS predictions, but may offer advantages and disad-
vantages for specific applications.
3.2 Detection
The NFIS is not explicitly used for anomaly and fault detection, but it does per-
form a critical task in the process. Isermann [20] describes the process by which
an anomaly or fault can be detected as being composed of two steps: (1) make a
prediction and (2) generate a residual and evaluate whether it is representative
of a degraded system condition. For this work, the NFIS is used to predict
a system signal from other signal measurements. The residual is generated by cal-
culating the error between the predicted and measured values. Finally, the residual
is passed to a statistical routine that compares the current residual distribution to a
nominal distribution. More specifically, the statistical routine uses the distribution
of the residuals to determine if the system is currently operating in a nominal or de-
graded mode. Statistical routines that have been historically used for fault detection
include the sequential probability ratio test (SPRT) [14, 29] and the cumulative sum
(CUMSUM) test [2, 25].
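As a rough illustration of the residual-evaluation step, the sketch below applies Wald's SPRT to residuals (measured minus predicted values), testing a zero-mean Gaussian nominal hypothesis against a mean shift. The Gaussian assumption, the function name, the parameter values, and the reset-after-decision behavior are all assumptions made for this example and are not taken from the chapter.

```python
import math

def sprt_mean_shift(residuals, sigma, shift, alpha=0.01, beta=0.1):
    """Return a 0/1 alarm flag per residual using a mean-shift SPRT.

    sigma: nominal residual standard deviation (assumed known).
    shift: mean of the degraded-mode residual distribution (assumed).
    alpha, beta: false-alarm and missed-alarm probabilities.
    """
    upper = math.log((1.0 - beta) / alpha)   # accept "degraded" above this
    lower = math.log(beta / (1.0 - alpha))   # accept "nominal" below this
    llr = 0.0
    alarms = []
    for r in residuals:
        # Log-likelihood ratio increment for N(shift, sigma^2) vs N(0, sigma^2).
        llr += (shift / sigma**2) * (r - shift / 2.0)
        if llr >= upper:
            alarms.append(1)
            llr = 0.0          # restart the test after a decision
        elif llr <= lower:
            alarms.append(0)
            llr = 0.0
        else:
            alarms.append(0)
    return alarms

nominal_alarms = sprt_mean_shift([0.0] * 10, sigma=1.0, shift=2.0)
degraded_alarms = sprt_mean_shift([2.0] * 6, sigma=1.0, shift=2.0)
```

Residuals that stay at zero never raise an alarm here, while residuals that sit at the assumed shift raise an alarm every third observation with these settings.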
3.3 Diagnosis
When considering the application of the NFIS to diagnosis, the most apparent approach
would be to construct an NFIS predictor with symptom patterns as inputs
(e.g., prediction residuals for observations with fault alarms) and integer fault iden-
tifiers (ID) as the output. To obtain a classification, the observed residuals would
be input to the NFIS predictor and the predicted output would be rounded to the
nearest integer to obtain the fault type. The problem with this approach can be made
apparent by considering a quick example. Suppose symptom patterns for three different
fault conditions exist and that an NFIS is trained to estimate the fault ID (1, 2,
or 3) for query symptom residual patterns. If there are no overlaps of the symptom
patterns for these three fault conditions, this approach should work well, but how
would the NFIS perform when there is symptom pattern overlap? To answer this
question, let us consider the case in which there is overlap between the symptom
patterns of the first and third fault types. Next, suppose that the goal is to diagnose
the fault of a query symptom pattern that lies in the overlapping regions of the first
and third faults. For this example, the memberships are near 0.5 for both the first
and third fault condition. The resulting diagnosis estimate would be near 2, which
means that we have diagnosed the query as belonging to the second fault condition.
Does this make sense? For a predictor model with continuous inputs and outputs,
this would be an appropriate estimate, since the inputs map to a value that is nu-
merically between 1 and 3. However, in some situations, a classification of 2 as
being an intermediate between the first and third class might not make any sense.
Therefore, the NFIS structure must be modified to reflect the occurrence of partial
memberships.
For this discussion, suppose that n observations of p inputs (variables) that are
examples of $n_c$ classes (fault conditions) are collected. Also, let $C_i$ designate
the ith class and $n_i$ the number of examples for this class. Using these definitions,
the sum of the number of examples for each class is equal to the number of example
observations.
\[
n = \sum_{i=1}^{n_c} n_i
\]
To use the NFIS for diagnosis, the output Y is converted to a binary format, which
will be designated by $Y^{*}$. To do this, create an $n \times n_c$ matrix of zeros and
then set the ith column elements to 1 for the symptom observations for fault $C_i$.
Therefore, Y can be rewritten as follows, where the columns correspond to the classes
$C_1, C_2, \ldots, C_{n_c}$:
\[
Y^{*} =
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
\]
Traditionally, $n_c - 1$ dummy variables are used to fully define $n_c$ fault classes.
However, $n_c$ dummy variables should be used in this application to allow for partial
memberships to each fault class. To diagnose a fault from an observation of the
symptom patterns, simply stimulate the NFIS with the observed symptom pattern
as an input. The output of the NFIS diagnoser is a vector of $n_c$ memberships of the
symptom pattern to each of the fault classes. Finally, diagnose the fault as belonging
to the class to which it has the largest membership.
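The contrast between integer fault IDs and the binary output format can be sketched as follows. The DOF-weighted average used here is only a stand-in for the NFIS inference step, and the labels, DOF values, and function names are hypothetical.

```python
import numpy as np

def one_hot_targets(labels, n_classes):
    """Build the n x n_c binary output matrix from 1-based fault IDs."""
    labels = np.asarray(labels)
    y_star = np.zeros((len(labels), n_classes))
    y_star[np.arange(len(labels)), labels - 1] = 1.0
    return y_star

def weighted_output(dof, targets):
    """ASSUMPTION: DOF-weighted average of exemplar outputs as the inference."""
    dof = np.asarray(dof, dtype=float)
    return dof @ np.asarray(targets, dtype=float) / dof.sum()

# Six exemplars: two for each of three fault classes.
labels = np.array([1, 1, 2, 2, 3, 3])
# Hypothetical DOFs for a query lying in the overlap of faults 1 and 3.
dof = [0.6, 0.5, 0.0, 0.0, 0.4, 0.5]

integer_estimate = weighted_output(dof, labels)              # near 2
class_memberships = weighted_output(dof, one_hot_targets(labels, 3))
diagnosed_class = int(np.argmax(class_memberships)) + 1      # 1-based class
```

With integer IDs the estimate rounds to fault 2, which the query does not resemble at all, while the binary format reports partial memberships to faults 1 and 3 and selects fault 1.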
3.4 Prognosis
Vichare and Pecht [28] define prognostics as being “the process of predicting a
future state (of reliability) based on current and historic conditions.” Since the even-
tual goal of any prognostic system is to be able to determine when a component
is going to fail, another appropriate definition of prognostics that will be adopted
for this work is “the process by which the remaining useful life of a component or
system is estimated” [27].
Before examining how the NFIS can be used for RUL estimation, the general
prognostic approach that will be implemented in this work should be examined.
Suppose the degradation of a system or component can be quantified by a single
parameter, which is referred to as a prognostic parameter. This parameter may be
constructed as a function of several measured parameters or residuals. As the system
degrades, the prognostic parameter should increase until a threshold is reached and
a failure occurs [16]. As an example, consider the plot presented in Figure 4. Notice
that as time progresses, the prognostic parameter generally increases until it reaches
the threshold Y*. The threshold may simply be a specified operating level such as
an upper allowed voltage threshold or vibration level. At this point, a failure is said
to have occurred.
To obtain RUL estimates for observations of the degradation parameter, tradi-
tional regression techniques can be used. For example, in Figure 5, a nonlinear
function of the form $y = at^{-b}$ is fit to the observed data.
There are two major problems that must be addressed if this approach is to be
used: (1) a viable prognostic parameter must be identified and (2) the threshold for
failure must be identified. Gross et al. [14] suggest using the alarm frequency since
it “scales monotonically with the degree of severity of the degradation, regardless
of the magnitude or units for the original monitored signals (e.g., temperatures,
voltages).” If the alarm frequency is implemented by using a local window, in some
situations the parameter will stop increasing prior to failure if it flat lines at 1.0
or 100% of the window observations. For this reason, the cumulative sum of the
number of fault alarms is more suitable as a prognostic parameter. It is important
to note that, if an appropriate window size can be determined, both methods could
produce equivalent results, but the latter was selected for this work to avoid the
window-size problem. Also, since the NFIS prognosis algorithm is formally based
on the concept of a generic prognostic parameter, either parameter could be used.
For this work, a modified form of the previously described algorithm will be im-
plemented, which does not make use of a prognostic parameter threshold. Rather
than define failure explicitly in terms of the value of a prognostic parameter, failure
will be defined in terms of how long the system has been operating after the onset
of a failure mechanism. Here, onset to failure is defined as the time at which a
specified number of fault alarms have occurred (e.g., 25–100 alarms). For the example
presented in the next sections, onset to failure was defined as the instant at which 100
fault alarms have been registered.
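The two candidate prognostic parameters and the onset-to-failure rule can be sketched as below. The function names and the short alarm sequence are hypothetical, and a small alarm count is used only to keep the example readable.

```python
import numpy as np

def cumulative_alarms(alarms):
    """Prognostic parameter: cumulative sum of the 0/1 fault alarms."""
    return np.cumsum(np.asarray(alarms, dtype=int))

def onset_index(alarms, n_alarms=100):
    """Index of the observation at which n_alarms alarms have been registered."""
    hits = np.flatnonzero(cumulative_alarms(alarms) >= n_alarms)
    return int(hits[0]) if hits.size else None

def windowed_alarm_frequency(alarms, window):
    """Alternative parameter: alarm frequency over a sliding local window."""
    a = np.asarray(alarms, dtype=float)
    return np.convolve(a, np.ones(window) / window, mode="valid")

alarms = [0, 1, 0, 1, 1, 1, 1, 1]
cum = cumulative_alarms(alarms)
freq = windowed_alarm_frequency(alarms, window=3)
onset = onset_index(alarms, n_alarms=3)
```

Here the windowed frequency flat lines at 1.0 once every observation in the window carries an alarm, while the cumulative sum keeps increasing, which is the behavior that motivates the choice made above.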
Now that the general RUL estimation process has been described, the use of
NFIS for prognosis will be examined. Suppose that n histories for the prognostic
parameter of a system have been collected. From these histories a vector of the
time-to-failures after onset can be extracted.
\[
\mathbf{TTF} = \begin{bmatrix} \mathrm{TTF}_1 \\ \mathrm{TTF}_2 \\ \vdots \\ \mathrm{TTF}_n \end{bmatrix}
\]
Furthermore, suppose that regression on each of the histories has been performed.
The bank of equations that relate the time after the onset of the failure (t) to the
prognostic parameter (Y) of the system may be expressed by the following equation,
where $\hat{\theta}_i$ are the regressed parameters for the ith history and $\hat{\Theta}$
are all of the regressed parameters.
\[
\mathbf{Y}(t, \hat{\Theta}) =
\begin{bmatrix}
Y_1(t, \hat{\theta}_1) \\
Y_2(t, \hat{\theta}_2) \\
\vdots \\
Y_n(t, \hat{\theta}_n)
\end{bmatrix}
\]
For this discussion, suppose that the time after the onset to failure is the number of
observations after a specified number of fault alarms (e.g., 25−100) have occurred.
This method can be easily implemented for time-series data with a constant sample
rate. For example, if N observations after onset to failure are observed, then evalu-
ate Y(t, Θ̂) for t = N to determine what the prognostic parameter value should be
according to the n regressed histories.
\[
\mathbf{Y}(t = N, \hat{\Theta}) =
\begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix}
\]
At this point, the number of observations after the onset-to-failure (N) and cur-
rent prognostic parameter (Y ) have been determined. Additionally the vector of n
estimates for the prognostic parameter Ŷ and a vector of the time-to-failures after
onset TTF are determined. These values are used to build a predictor that maps
the observed prognostic parameter to the RUL of the system. To do this, an NFIS
predictor is created “on the fly” with predicted prognostic parameter values Ŷ as
exemplar inputs and their corresponding RULs as outputs. Here, the RUL of the
ith prognostic parameter estimate is simply its time-to-failure minus the number of
current observations:
\[
\mathrm{RUL}_i = \mathrm{TTF}_i - N
\]
Finally, the RUL is estimated by supplying the NFIS predictor with the current prog-
nostic parameter.
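The "on the fly" RUL estimation can be sketched as follows. The regressed histories are passed in as callables; the linear histories, the numbers, and the function name are hypothetical. The triangular similarity with a DOF-weighted average is a simplified stand-in for the NFIS predictor, not the authors' implementation.

```python
import numpy as np

def estimate_rul(histories, ttf, n_obs, y_now, width):
    """Estimate RUL from n regressed degradation histories.

    histories: callables mapping time after onset to the prognostic parameter.
    ttf: time to failure after onset for each history.
    n_obs: number of observations since onset to failure (N).
    y_now: current value of the prognostic parameter.
    width: half-width of the triangular similarity (ASSUMPTION).
    """
    # Evaluate the bank of regressed equations at t = N.
    y_hat = np.array([h(n_obs) for h in histories], dtype=float)
    # RUL_i = TTF_i - N for each history.
    rul = np.asarray(ttf, dtype=float) - n_obs
    # Stand-in predictor: triangular membership of y_now to each y_hat,
    # followed by a membership-weighted average of the exemplar RULs.
    weights = np.clip(1.0 - np.abs(y_hat - y_now) / width, 0.0, None)
    if weights.sum() == 0.0:
        return None  # query lies outside the region covered by the histories
    return float(weights @ rul / weights.sum())

# Two hypothetical histories with known times to failure after onset.
histories = [lambda t: 2.0 * t, lambda t: 1.0 * t]
ttf = [100.0, 200.0]

rul_between = estimate_rul(histories, ttf, n_obs=50, y_now=75.0, width=50.0)
rul_on_first = estimate_rul(histories, ttf, n_obs=50, y_now=100.0, width=50.0)
```

When the current prognostic parameter coincides with the first history's estimate, the result is that history's remaining life; a value midway between the two histories yields an RUL midway between theirs.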
To review, several degradation histories are collected that have known lifetimes
after the onset of a failure and have specific structures that are characterized by the
“shape” of the prognostic parameter history. If it is desired to calculate the RUL of
another similar system or component, two pieces of information are available: the
elapsed time after onset to failure and the value of the prognostic parameter. Next,
the regressed functions are evaluated for the current elapsed time. This provides
“guide posts” that can be compared to the current prognostic parameter to determine
the “shape” of its progression. In essence, the estimates of the prognostic parameter
are used to determine which degradation history the system is similar to, and since
the time-to-failure for the failure histories are known, the similarities can be related
to the system RUL.
4 Methodology
The data used in the example presented in this section were collected from the
hydraulic steering system of a drill used for deep oil exploration. In the system,
the drill bit rotates and dislodged material is pumped to the surface. For this work
we are interested in the steering system, whose major components are the three hy-
draulic units that are located near the drill bit. To steer the unit, ribs are extended
in their respective directions to “push” the head in the desired direction. To empir-
ically model each hydraulic unit, four sensor measurements were used: the target
hydraulic pressure (calculated by the control system), measured hydraulic pressure,
electrical current to the hydraulic pump motor, and the motor RPM.
For this work, 11 data sets which progress to failure are used. These data sets
represent three different fault conditions:
1. Mud invasion — mud enters the hydraulic units and causes failure (3 data sets)
2. Pressure transducer offset — sensor offset (negative and positive) causes prob-
lems in the control of the system, which eventually results in system failure (2
negative offset and 3 positive offset)
3. Pump startup failure — pump failure shortly after the drill is started (3 data sets)
For each data set the embodiments of the NFIS were used for monitoring (pre-
diction and detection), diagnosis, and prognosis. Traditional algorithms that can be
found in the literature are compared to the embodiments of the NFIS, when applica-
ble. Before the results of this study are examined, the methodologies used in each
analysis step are briefly presented.
To evaluate the effectiveness of the NFIS for monitoring, it was used as a pre-
dictor with the SPRT to detect faults in the 11 data sets discussed earlier. For the
sake of comparison, a comparable system implementing an autoassociative kernel
regression (AAKR) [7,8] predictor was used with the same SPRT test. For this work
the predictors and detectors were trained on the first 8 hours of operational data ex-
tracted from each data set, which was determined to be fault-free based on visual
inspection.
To evaluate the effectiveness of the NFIS for diagnosis, a “bagging” architecture
was used [4]. Notice in Figure 6 that the 4 signal observations with fault alarms
generated by the NFIS monitoring system are used to diagnose the three fault con-
ditions. The final classification is made by fusing the output of the four classifiers
via the mean operator and then identifying the class with the maximum fused mem-
bership. For the sake of comparison, a comparable system implementing a k-nearest
neighbor (kNN) diagnoser was used [6, 9]. The bagged diagnoser architecture was
selected because it was found to significantly outperform diagnosers that use ob-
servations of all of the symptom patterns as inputs. This structure more effectively
uses the symptom patterns and fault alarms since the individual diagnosers examine
symptom patterns for signals with alarms. For example, if there are faults in the
first two signals, then the diagnosis is based on the observed symptom patterns for
these two signals and not the other signals. For this work, the diagnosers were trained
on two mud invasion, two transducer offset (one positive and one negative), and two
pump startup data sets. To test the diagnoser, it was simulated with one mud inva-
sion, two transducer offset (one positive and one negative), and one pump startup
data set. One of the pressure transducer data sets was not used because a fault was
not detectable (Section 5).
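The fusion step of the bagged diagnoser can be sketched as below: only the diagnosers for signals with fault alarms contribute, their class memberships are averaged, and the class with the largest fused membership is selected. The membership values and the function name are hypothetical.

```python
import numpy as np

FAULTS = ["mud invasion", "pressure transducer offset", "pump startup"]

def fuse_diagnoses(memberships, alarmed):
    """Fuse per-signal class memberships with the mean operator.

    memberships: array of shape (n_signals, n_classes), one row per diagnoser.
    alarmed: boolean flag per signal; only alarmed signals are fused.
    """
    m = np.asarray(memberships, dtype=float)
    mask = np.asarray(alarmed, dtype=bool)
    if not mask.any():
        return None, None  # no fault alarm, nothing to diagnose
    fused = m[mask].mean(axis=0)
    return int(np.argmax(fused)), fused

# Hypothetical outputs of the four per-signal diagnosers.
memberships = [[0.7, 0.2, 0.1],
               [0.5, 0.4, 0.1],
               [0.0, 0.1, 0.9],
               [0.1, 0.1, 0.8]]
alarmed = [True, True, False, False]

class_index, fused = fuse_diagnoses(memberships, alarmed)
diagnosis = FAULTS[class_index]
```

In this example the two signals without alarms would have pulled the decision toward a different class had they been included, which illustrates why the fusion is restricted to signals with alarms.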
Finally, to evaluate the effectiveness of the NFIS for prognosis, an NFIS prog-
noser was trained on each of the fault conditions. Here, the prognostic parameter is
the cumulative sum of the fault alarms and onset to failure was defined as being the
observation when 100 fault alarms have been registered. For this work a prognoser
is trained on two mud invasion, two transducer offset (one positive and one nega-
tive), and two pump startup data sets. To test the prognoser, the RUL is estimated for
the steering system with one mud invasion, two transducer offset (one positive and
one negative), and one pump startup data set. Again, one of the pressure transducer
data sets is not used because a fault was not detectable (Section 5).
5 Results
The results of applying the previously described monitoring, diagnostic, and prog-
nostic systems to the hydraulic steering system are presented in this section.
5.1 Monitoring
The results of the monitoring systems implementing an NFIS and AAKR predictor
and SPRT detector are presented in Table 1. For this work, the warning time is
defined as the length of time between the instance of five sequential alarms and the
time of failure. The instance of five sequential alarms was used as an indicator of
warning time because the occurrence of multiple sequential alarms is more likely
due to an actual fault or anomaly as opposed to spurious alarms. Notice that both
monitoring systems detect faults in 10 of the 11 data sets, which translates to a
detection rate of approximately 91%. The missed detection was determined to be
Fig. 7 NFIS fault detection results for the first hydraulic unit of the Mud Invasion #1 data
(panels for Hydraulic Unit #1: predictions for measured pressure (bar), electric current (A),
and motor RPM, each plotted against time (hrs))
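The warning-time definition used in Section 5.1 can be sketched as follows. The function name and the sample data are hypothetical, and measuring from the observation that completes the first run of five sequential alarms is an assumed convention.

```python
def warning_time(alarms, times, failure_time, run_length=5):
    """Time from the first run of `run_length` sequential alarms to failure."""
    run = 0
    for alarm, t in zip(alarms, times):
        run = run + 1 if alarm else 0   # a missing alarm resets the run
        if run == run_length:
            return failure_time - t
    return None  # no qualifying run: the fault was not detected

# Hypothetical hourly alarm flags and a failure at 20 hours.
alarms = [0, 1, 0, 1, 1, 1, 1, 1, 1]
times = list(range(len(alarms)))
hours_of_warning = warning_time(alarms, times, failure_time=20)
```

The isolated alarm in the second observation does not count toward the run, which reflects the intent of ignoring spurious alarms.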
5.2 Diagnosis
The diagnosis results of the NFIS and kNN diagnosers are presented in the confu-
sion matrices below, Tables 2 and 3 respectively. In the following tables, the number
of NFIS or kNN classifications for the different data sets is presented in the columns.
For example, the number of classifications for the test mud invasion (MI) data set is
presented in the first column. The count in the first row is the number of MI faults
that are classified correctly as being MI, the second row is the number of pressure
transducer offset (PTO) faults that are incorrectly classified as MI faults, and the
third row is the number of pump startup (PS) faults that are incorrectly classified
as MI faults. Ideally, only the diagonal elements of the confusion matrix (CM)
should be nonzero, since these elements represent correct classifications.
[Table fragment, MI row: 1616, 19, 24; 97.41 %]
Notice that both diagnosers are able to accurately diagnose the 3 fault conditions.
More specifically, the overall accuracy of the NFIS diagnosis system is ∼94%, while
the accuracy of the kNN diagnosis system is ∼88%. For this analysis, the perfor-
mance of the NFIS diagnosis system is slightly better than the kNN system, but
a more important feature of the results is that the NFIS diagnoser performance is
comparable to the traditional kNN diagnoser.
5.3 Prognosis
The results of using the NFIS for RUL estimation are presented in Table 4. Again,
MI refers to mud invasion, PTO refers to pressure transducer offset, and PS refers to
pump startup. Here, OTF refers to onset to failure or the time when 100 fault alarms
have been registered. The mean lifetime after OTF is included to aid in interpreting
the scale in the RUL estimate errors, i.e., the mean absolute error (MAE) should be
small relative to the lifetime after OTF.
It can be seen that for the MI and PTO data sets, we are able to estimate the RUL
with a high degree of accuracy, in that the MAE is less than an hour. Next, notice
that the RUL estimates for the PTO and PS data sets are progressively less accurate
than the estimates for the MI data. This result is expected since we are estimating
the RUL by performing a regression with two data points (two training histories for
MI, PTO, and PS). As additional data is integrated into the described system, the
performance should improve considerably.
6 Conclusions
This paper has described an intelligent control and maintenance system that includes
modules for prediction, detection, diagnosis, prognosis, and evaluation. This pa-
per has also addressed a major hurdle in the development of such a system for a
“real world” process or system by developing an integrated monitoring (prediction
and detection), diagnostic, and prognostic system by adapting the newly developed
nonparametric fuzzy inference system (NFIS) for each task. To validate the pro-
posed methodologies, the embodiments of the NFIS were used to detect, diagnose,
and prognose faults in the hydraulic steering system of an automated oil drill. The
embodiments of the NFIS were found to have similar performance to traditional
algorithms, such as autoassociative kernel regression (AAKR) and k-nearest neigh-
bor (kNN), for monitoring and diagnosis. The NFIS prognoser was also shown to
be able to estimate the remaining useful life (RUL) of a steering system to within
an hour of its actual time of failure. In closing, it is important to note that the re-
sults presented in this paper are founded on a very limited amount of data, namely
11 failure data sets for 4 fault conditions. While the results presented here are
promising, models developed with more data are expected to outperform the current
system.
References
1. H. Adeli and S.L. Hung. Machine Learning, Neural Networks, Genetic Algorithms, and Fuzzy
Logic. Wiley, New York, 1995
2. M. Basseville and I.V. Nikiforov. Detection of Abrupt Changes: Theory and Application.
Prentice-Hall, Englewood Cliffs, NJ, 1993
3. J.C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press,
New York: 1981
4. C.L. Black, R.E. Uhrig and J.W. Hines. Inferential Neural Networks for Nuclear Power Plant
Sensor Channel Drift Monitoring. Proceedings of the ANS Topical Meeting on NPIC &
HMIT: 1996
5. J.L. Bogdanoff and F. Kozin. Probabilistic Models of Cumulative Damage. Wiley, New York,
1985
6. T.M. Cover and P.E. Hart. Nearest Neighbor Pattern Classification. IEEE Transactions on
Information Theory, Vol. 13, No. 1: January 1967
28. N.M. Vichare and M.G. Pecht. Prognostic and Health Management of Electronics IEEE Trans-
actions on Components and Packaging Technologies, Vol. 29, No. 1, pp. 222–229: March 2006
29. A. Wald. Sequential Analysis. Wiley, New York, 1947
30. K. Whisnant, K. Gross and N. Lingurovska. Proactive Fault Monitoring in Enterprise Servers.
International Conference on Computer Design (CDES’05), Las Vegas, NV: June 27–30, 2005
31. W. Yan, C.J. Li and K.F. Goebel. A Multiple Classifier System for Aircraft Engine Fault Di-
agnosis. Proceedings of the 50th Meeting of the Machinery Failure Prevention Technology
(MFPT) Society, pp. 271–279, Virginia Beach, VA: April 3–6, 2006
Stable Anti-Swing Control for an Overhead
Crane with Velocity Estimation and Fuzzy
Compensation
Abstract This chapter proposes a novel anti-swing control strategy for an overhead
crane. The controller includes both position regulation and anti-swing control. Since
the crane model is not exactly known, fuzzy rules are used to compensate friction,
gravity as well as the coupling between position and anti-swing control. A high-
gain observer is introduced to estimate the joint velocities to realize PD control.
Using a Lyapunov method and an input-to-state stability technique, the controller is
proven to be robustly stable with bounded uncertainties, if the membership functions
are changed by certain learning rules and the observer is fast enough. Real-time
experiments are presented comparing this new stable anti-swing PD control strategy
with regular crane controllers.
1 Introduction
Although cranes are very important systems for handling heavy goods, automatic
cranes are comparatively rare in industrial practice [24], because of high investment
costs. The need for faster cargo handling requires control of the crane motion so that
Wen Yu
Department of Automatic Control
Xiaoou Li
Department of Computer Science
Center for Research and Advanced Studies of the National Polytechnic Institute
(CINVESTAV-IPN)
A.P. 14-740, Av.IPN 2508, México D.F., 07360, México, e-mail: [email protected]
George W. Irwin
School of Electronics, Electrical Engineering and Computer Science
Queen’s University Belfast, Ashby Building, Stranmillis Road, Belfast, BT9 5AH, UK
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization, 223
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 223–240.
© 2008 Springer.
224 W. Yu et al.
then used to estimate both friction and gravity. Unlike other work which used
the singular perturbation method [22], a new proof of stability is presented using
Lyapunov analysis. This proof explains the relation between the observer error and
the observer gain.
Since the swing of the payload depends on the acceleration of the trolley, mini-
mizing both the operation time and the payload swing produces partially conflicting
requirements. The anti-swing control problem involves reducing the swing of the
payload while moving it to the desired position as fast as possible [1]. One particu-
lar feedforward approach is input shaping [26], which is an especially practical and
effective method of reducing vibrations in flexible systems. In [20] the anti-swing
motion-planning problem is solved using the kinematic model in [17]. Here, anti-
swing control for a three-dimensional overhead crane is proposed, which addresses
the suppression of load swing. Non-linear anti-swing control based on the singu-
lar perturbation method is presented in [30]. Unfortunately, all of these anti-swing
controllers are model-based.
In this chapter, a PID law is used for anti-swing control which, being model-free,
will affect the position control. The same fuzzy compensator used for friction and
gravity is applied to handle the position error. The required online learning rule is
obtained from the tracking error analysis and there is no requirement for off-line
learning. The overall closed-loop system with the high-gain observer and the fuzzy
compensator is shown to be stable if the membership functions have certain learning
rules and the observer is fast enough. Finally, results from experimental tests carried
out to validate the controller are presented.
2 Preliminaries
The overhead crane system described schematically in Figure 1 (a) has the system
structure shown in Figure 1 (b). Here α is the payload angle with respect to the
vertical and β is the payload projection angle along the X-coordinate axis. The
dynamics of the overhead crane are given by [28]:
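[Figure 1: Overhead crane. (a) Schematic of the 3D crane showing the rail, cart and payload; (b) system structure with the coordinates xw, yw and R, the angles α and β, the forces Fx, Fy and FR, and the payload weight Mc g.]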
Coriolis matrix C (x, ẋ) is skew-symmetric, i.e., it satisfies the following relation-
ship [9]
$$x^T \left[ \dot M(x) - 2C(x, \dot x) \right] x = 0 \qquad (2)$$
A normal PD control law has the following form
$$\tau = -K_p (x - x^d) - K_d (\dot x - \dot x^d)$$
where $K_p$ and $K_d$ are positive definite, symmetric and constant matrices, which
correspond to the proportional and derivative coefficients, $x^d \in \Re^5$ is the desired
position, and $\dot x^d \in \Re^5$ is the desired joint velocity. Here the regulation problem is
discussed, so $\dot x^d = 0$.
Input-to-state stability (ISS) is another elegant approach for stability analysis be-
sides the Lyapunov method. It can lead to general conclusions on stability using
the input and state characteristics. Thus, consider a class of non-linear systems de-
scribed by
$$\dot x_t = f(x_t, u_t) \qquad (3)$$
where $x_t \in \Re^n$ is the state vector, $u_t \in \Re^m$ is the input vector, and $y_t \in \Re^m$ is the output
vector. $f : \Re^n \times \Re^m \to \Re^n$ is locally Lipschitz. Some passivity properties, as well
as some stability properties of passive systems are now recalled [4].
for each t ≥ 0.
Stable Anti-Swing Control for an Overhead Crane 227
This definition implies that if a system has input-to-state stability, its behaviour
should remain bounded when its inputs are bounded.
The control problem is to move the rail in such a way that the actual position of
the payload reaches the desired one. The three control inputs [Fx , Fy , FR ] can force
the crane to the position [xw , yw , R] , but the swing angles [α , β ] cannot be controlled
using the dynamic model (1) directly. In order to design an anti-swing control, lin-
earization models for [α , β ] are analyzed. Because the acceleration of the crane
is much smaller than the gravitational acceleration, the rope length is kept slowly
varying and the swing is not big, giving
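$$|\ddot x_w| \ll g, \quad |\ddot y_w| \ll g, \quad |\ddot R| \ll g$$
$$|\dot R| \ll R, \quad |\dot\alpha| \ll 1, \quad |\dot\beta| \ll 1$$
$$s_1 = \sin\alpha \approx \alpha, \quad c_1 = \cos\alpha \approx 1,$$
$$\ddot\alpha + \ddot x_w + g\alpha = 0, \quad \ddot\beta + \ddot y_w + g\beta = 0$$
Since $\ddot x_w = F_x / M_r$, $\ddot y_w = F_y / M_m$, the dynamics of the swing angles are
$$\ddot\alpha + g\alpha = -\frac{F_x}{M_r}, \quad \ddot\beta + g\beta = -\frac{F_y}{M_m} \qquad (4)$$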
The control forces $F_x$ and $F_y$ are assumed to have the following form
$$F_x = A_1(x_w, \dot x_w) + A_2(\alpha, \dot\alpha), \qquad F_y = B_1(y_w, \dot y_w) + B_2(\beta, \dot\beta) \qquad (5)$$
where $A_1(x_w, \dot x_w)$ and $B_1(y_w, \dot y_w)$ are position controllers, and $A_2(\alpha, \dot\alpha)$ and $B_2(\beta, \dot\beta)$
are anti-swing controllers. Substituting (5) into (4) produces the anti-swing control
model
$$\ddot\alpha + g\alpha + \frac{A_1}{M_r} = -\frac{A_2}{M_r}, \qquad \ddot\beta + g\beta + \frac{B_1}{M_m} = -\frac{B_2}{M_m} \qquad (6)$$
Now if $A_1/M_r$ and $B_1/M_m$ are regarded as disturbances, and $A_2/M_r$ and $B_2/M_m$ as control inputs, then (6)
is a second-order linear system with disturbances. Standard PID control can now be
applied to regulate $\alpha$ and $\beta$, thereby producing the anti-swing controllers
$$A_2(\alpha, \dot\alpha) = k_{pa2}\,\alpha + k_{da2}\,\dot\alpha + k_{ia2} \int_0^t \alpha \, dt, \qquad B_2(\beta, \dot\beta) = k_{pb2}\,\beta + k_{db2}\,\dot\beta + k_{ib2} \int_0^t \beta \, dt \qquad (7)$$
where $k_{pa2}$, $k_{da2}$ and $k_{ia2}$ are positive constants corresponding to proportional, derivative
and integral gains.
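The anti-swing law (7) acting on the linearized swing model (6) can be sketched numerically as follows. This is only an illustrative discretization, not the implementation used in the experiments: the gains, the value of $M_r$, the step size and the initial angle are hypothetical, and the position-control term $A_1$ is set to zero so that only the PID action on $\alpha$ is visible.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration (m/s^2)


@dataclass
class AntiSwingPID:
    """Discrete form of A2 = kp*alpha + kd*alpha_dot + ki*integral(alpha)."""
    kp: float
    kd: float
    ki: float
    integral: float = 0.0

    def step(self, angle: float, rate: float, dt: float) -> float:
        self.integral += angle * dt  # rectangular approximation of the integral
        return self.kp * angle + self.kd * rate + self.ki * self.integral


def simulate_swing(pid, alpha0=0.2, m_r=1.0, dt=1e-3, t_end=5.0):
    """Integrate alpha'' + g*alpha = -A2/M_r (model (6) with A1 = 0)."""
    alpha, rate = alpha0, 0.0
    for _ in range(int(round(t_end / dt))):
        a2 = pid.step(alpha, rate, dt)
        accel = -G * alpha - a2 / m_r
        rate += accel * dt   # semi-implicit Euler: update the rate first,
        alpha += rate * dt   # then the angle with the new rate
    return alpha


# Hypothetical gains, chosen only to illustrate the damping effect.
final_angle = simulate_swing(AntiSwingPID(kp=2.0, kd=4.0, ki=0.1))
print(f"swing angle after 5 s: {final_angle:.5f} rad")
```

With positive gains the derivative term adds damping to the otherwise undamped pendulum equation, which is why the swing decays in this sketch.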
Substituting (5) into (1), produces the position control model
A generic fuzzy model for friction and gravity is provided by a collection of l fuzzy
rules (Mamdani fuzzy model [18])
Here $f_x$, $f_y$ and $f_z$ are the uncertainties (friction, gravity and coupling errors) along
the X, Y and Z coordinate axes, and $i = 1, 2, \cdots, l$. A total of $l$ fuzzy IF-THEN rules are used to
perform the mapping from the input vector $x = [x_w, y_w, \alpha, \beta, R]^T \in \Re^5$ to the output
vector $y(k) = [f_1, f_2, f_3]^T = [y_1, y_2, y_3]^T \in \Re^3$. Here $A_{1i}, \cdots, A_{ni}$ and $B_{1i}, \cdots, B_{mi}$ are
standard fuzzy sets. In this chapter, some on-line learning algorithms are introduced
for the membership functions $B_{ji}$ such that the PD controller is stable.
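By using product inference, centre-average defuzzification and a singleton fuzzifier,
the pth output of the fuzzy logic system can be expressed as [29]
$$y_p = \left( \sum_{i=1}^{l} w_{pi} \left[ \prod_{j=1}^{n} \mu_{A_{ji}} \right] \right) \Big/ \left( \sum_{i=1}^{l} \left[ \prod_{j=1}^{n} \mu_{A_{ji}} \right] \right) = \sum_{i=1}^{l} w_{pi} \phi_i \qquad (10)$$
$$\phi_i = \prod_{j=1}^{n} \mu_{A_{ji}} \Big/ \sum_{i=1}^{l} \prod_{j=1}^{n} \mu_{A_{ji}} \qquad (11)$$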
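Equations (10) and (11) can be evaluated directly once membership functions are fixed. The sketch below assumes Gaussian memberships of the form exp(−(x_j − m_ji)²/100), which is the form chosen later in the experimental section; the three rule centres, the weights and the input vector are hypothetical values used only to exercise the formulas.

```python
import math


def gaussian(x, centre, width=100.0):
    """Membership grade exp(-(x - centre)^2 / width)."""
    return math.exp(-((x - centre) ** 2) / width)


def basis(x, centres):
    """phi_i of (11); centres[i][j] is the centre m_ji of rule i, input j."""
    firing = [math.prod(gaussian(xj, m) for xj, m in zip(x, rule))
              for rule in centres]
    total = sum(firing)
    return [f / total for f in firing]


def fuzzy_output(x, centres, weights):
    """y_p of (10) for every output p; weights[p][i] is w_pi."""
    phi = basis(x, centres)
    return [sum(w * p for w, p in zip(row, phi)) for row in weights]


# Hypothetical data: 3 rules, 5 inputs, 2 outputs.
centres = [[0.1, 0.2, 0.0, 0.0, 0.5],
           [0.4, 0.6, 0.1, 0.1, 0.8],
           [0.9, 0.3, 0.2, 0.0, 0.2]]
weights = [[1.0, 2.0, 3.0],
           [3.0, 3.0, 3.0]]
x = [0.2, 0.3, 0.0, 0.0, 0.6]

phi = basis(x, centres)
y = fuzzy_output(x, centres, weights)
print(phi, y)
```

Because the $\phi_i$ sum to one, each output is a convex combination of its weights $w_{pi}$; the second output above therefore equals 3 regardless of the input.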
compensator
$$u_1 = [A_1(x_w, \dot x_w), B_1(y_w, \dot y_w), 0, 0, F_R]^T = -K_{p1}(x - x^d) - K_{d1}(\dot x - \dot x^d) + \hat W_t \Phi(x) \qquad (13)$$
where $x = [x_w, y_w, \alpha, \beta, R]^T$, $x^d = [x_w^d, y_w^d, 0, 0, R^d]^T$, and $x_w^d$, $y_w^d$ and $R^d$ are the
desired positions. In the regulation case $\dot x_w^d = \dot y_w^d = \dot R^d = 0$. Further,
$K_{p1} = \mathrm{diag}[k_{pa1}, k_{pb1}, 0, 0, k_{pr}]$, $K_{d1} = \mathrm{diag}[k_{da1}, k_{db1}, 0, 0, k_{dr}]$. The time-varying weight
matrix $\hat W_t$ is determined by the fuzzy learning law. According to the Stone–Weierstrass
theorem [8], a general non-linear smooth function can be written as
where $W^*$ is the optimal weight matrix, and $\mu(t)$ is the modeling error. In this chapter
we use the fuzzy compensator (12) to approximate the unknown non-linearity
(gravity, friction and coupling of anti-swing control) as
When the velocity ẋ is not available, a velocity observer is needed. Section 6.5
describes how to incorporate a model-free observer to PD control for the overhead
crane.
The overhead crane dynamics (1) can be rewritten in state-space form as [22]
ẋ1 = x2
ẋ2 = H1 (X, u) (16)
y = x1
If the velocity vector x2 is not measurable and the dynamics of manipulator are
unknown, a high-gain observer can be used to estimate x2 [22]
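$$\frac{d}{dt}\hat x_1 = \hat x_2 + \frac{1}{\varepsilon} K_1 (x_1 - \hat x_1), \qquad \frac{d}{dt}\hat x_2 = \frac{1}{\varepsilon^2} K_2 (x_1 - \hat x_1) \qquad (18)$$
where $\hat x_1 \in \Re^5$, $\hat x_2 \in \Re^5$ denote the estimated values of $x_1$, $x_2$ respectively; $\varepsilon$ is a
small positive parameter, and $K_1$ and $K_2$ are positive definite matrices chosen such
that the matrix $\begin{bmatrix} -K_1 & I \\ -K_2 & 0 \end{bmatrix}$ is stable. Defining the observer error as
where $\hat x = [\hat x_1^T, \hat x_2^T]^T$, the observer error equation can then be formed from (16) and
(18)
$$\varepsilon \frac{d}{dt}\tilde z_1 = \tilde z_2 - K_1 \tilde z_1, \qquad \varepsilon \frac{d}{dt}\tilde z_2 = -K_2 \tilde z_1 + \varepsilon^2 H_1 \qquad (20)$$
or in the matrix form:
$$\varepsilon \frac{d}{dt}\tilde z = A \tilde z + \varepsilon^2 B H_1 \qquad (21)$$
where $A = \begin{bmatrix} -K_1 & I \\ -K_2 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 \\ I \end{bmatrix}$. The structure of the velocity observer is the same
as in [22], but a new theorem is proposed here in order to integrate the observer and
the fuzzy compensator.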
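The behaviour of the high-gain observer (18) can be illustrated on a scalar signal. The sketch below is a forward-Euler discretization under several assumptions that are not taken from the chapter: a single degree of freedom, scalar gains K1 = 2 and K2 = 1, ε = 0.01, and a test position signal sin(t) whose true velocity cos(t) is known.

```python
import math


def high_gain_observer(y, t_end, dt=1e-4, eps=0.01, k1=2.0, k2=1.0):
    """Forward-Euler integration of observer (18) for a scalar position y(t).

    Returns the position and velocity estimates at time t_end.
    """
    x1_hat, x2_hat = y(0.0), 0.0
    for n in range(int(round(t_end / dt))):
        err = y(n * dt) - x1_hat            # measured position minus estimate
        dx1 = x2_hat + (k1 / eps) * err
        dx2 = (k2 / eps ** 2) * err
        x1_hat += dx1 * dt
        x2_hat += dx2 * dt
    return x1_hat, x2_hat


# Assumed test signal: position sin(t), so the true velocity is cos(t).
x1_hat, x2_hat = high_gain_observer(math.sin, t_end=2.0)
velocity_error = abs(x2_hat - math.cos(2.0))
print(f"velocity estimate {x2_hat:.4f}, error {velocity_error:.4f}")
```

The step size must be small compared with ε for the Euler scheme to remain stable, and the residual velocity error shrinks as ε is reduced, which is the qualitative content of the residual-set statement in the theorem that follows.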
Theorem 2. If the high gain observer (18) is used to estimate the velocity of the
overhead crane (16), the observer error x̃ will converge to the following residual set
$$D_\varepsilon = \{ \tilde x \mid \|\tilde x\| \le \bar K(\varepsilon) \}$$
$$A^T P + P A = -I \qquad (22)$$
where $x_1^d \in \Re^5$ is the desired position and $x_2^d \in \Re^5$ is the desired velocity. In the
regulation case $x_2^d = 0$, and $\hat x_2$ is the velocity approximation from the high-gain
observer.
The coupling between anti-swing control and position control can be explained
as follows. For the anti-swing control (6), the position control A1 and B1 are distur-
bances, which can be decreased by the integral action in PID control. The anti-swing
model (6) is an approximator, but the anti-swing control (7) does not in fact use this,
as it is model-free. Hence while the anti-swing control law (7) cannot suppress the
swing completely, it can minimize any consequent vibration.
For the position control (8), the anti-swing control lies in the term D =
[A2 , B2 , 0, 0, 0]T , which can also be regarded as a disturbance. The coupling due to
anti-swing control can be compensated by the fuzzy system. Consequently, the PD
control with the fuzzy compensation can be expressed as
If neither the velocity x2 nor the friction and gravity are known, the normal PD
control needs to be combined with velocity estimation and fuzzy compensation to
give
$$\tau = -K_p (x_1 - x_1^d) - K_d (\hat x_2 - x_2^d) + \hat W_t \Phi(s) \qquad (25)$$
where $s = [x_1^T, \hat x_2^T]^T$ and $x_2^d = 0$. The stability of this controller is analysed next.
6 Stability Analysis
where Ŵt is a time-varying weight matrix for the fuzzy system. The following rela-
tion holds
W ∗ Φ(x) − Ŵt Φ(x) = W̃t Φ(s) (29)
where $\tilde W_t = W^* - \hat W_t$. From Theorem 1 it is known that the high gain observer (18)
can make $(\hat x_2 - x_2)$ converge to a residual set and it is possible to write $x_2 = \hat x_2 + \delta$,
where $\delta$ is bounded such that $\delta^T \Lambda_\delta \delta \le \bar\eta_\delta$. Now defining the tracking error as
($x_2^d = 0$), $\bar x_1 = x_1 - x_1^d$:
$$\tilde x_2 = \hat x_2 = \bar x_2 - \delta \qquad (30)$$
the following theorem holds.
Theorem 3. If the updating laws for the membership functions in (28) are
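$$\frac{d}{dt}\hat W_t = -K_w \Phi(s) \tilde x_2^T \qquad (31)$$
where $K_w$, $K_v$ and $\Lambda_3$ are positive definite matrices, and $K_d$ satisfies
$$K_d > \Lambda_g^{-1} + \Lambda_\delta^{-1} \qquad (32)$$
then the PD control law with fuzzy compensation in (25) can make the tracking error
stable. In fact, the average tracking error $\bar x_2$ converges to
$$\limsup_{T \to \infty} \frac{1}{T} \int_0^T \|\bar x_2\|_{Q_1}^2 \, dt \le \bar\eta_g + 2\bar\eta_\delta \qquad (33)$$
where $Q_1 = K_d - \left( \Lambda_g^{-1} + \Lambda_\delta^{-1} \right)$.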
7 Experimental Comparisons
The proposed anti-swing control for overhead crane systems has been implemented
on an InTeCo [10] overhead crane test-bed, see Figure 2. The rail is 150 cm long,
and the physical parameters for the system are as follows:
The resulting angles are shown in Figure 3 for the position control without anti-
swing, and in Figure 4 for the position control with anti-swing. It can be seen that
the swing angles α and β are reduced considerably with the anti-swing controller.
The position control law in equation (13) is discussed next. In this case there
are two types of input to the position model (8), $D = [A_2, \ldots]^T$ and $u_1 = [A_1, \ldots]^T$.
When the position control $A_1$ is designed by (25) with $u_1 = \tau$, the anti-swing
control $A_2$ in (8) is regarded as a disturbance which will be compensated for by the
fuzzy system (12). Theorem 2 implies that to assure stability, $K_d$ should be large
enough such that $K_d > \Lambda_g^{-1} + \Lambda_\delta^{-1}$. Since these upper bounds are not known,
$K_{d1} = \mathrm{diag}[80, 80, 0, 0, 10]$ is selected. The position feedback gain does not affect
the stability, but it should be positive, and was chosen as $K_{p1} = \mathrm{diag}[5, 5, 0, 0, 1]$.
A total of 20 fuzzy rules were used to compensate the friction, gravity and the
coupling from anti-swing control. The membership function for $A_{ji}$ was chosen to
be the Gaussian function
$$\mu_{A_{ji}} = \exp\left( -(x_j - m_{ji})^2 / 100 \right), \quad j = 1, \cdots, 5, \quad i = 1, \cdots, 20$$
where the centres $m_{ji}$ were selected randomly to lie in the interval (0, 1). Hence,
$\hat W_t \in \Re^{5 \times 20}$, $\Phi(x) = [\sigma_1 \cdots \sigma_{20}]^T$. The learning law took the form in (31) with
$K_w = 10$. The desired gantry position was selected as a square wave, and the
resulting gantry positions are shown in Figure 5. The regulation results from PD
control without fuzzy compensation [15] are shown in Figure 6. For comparison
the PID control results ($K_{d1} = \mathrm{diag}[80, 80, 0, 0, 10]$, $K_{p1} = \mathrm{diag}[5, 5, 0, 0, 1]$,
$K_{i1} = \mathrm{diag}[0.25, 0.25, 0, 0, 0.1]$) are shown in Figure 7.
[Figure 3: Swing angles α and β (rad) versus time (s), and β versus α, for position control without anti-swing.]
[Figure 4: Swing angles α and β (rad) versus time (s), and β versus α, for position control with anti-swing.]
Clearly, PD control with fuzzy compensation can successfully compensate the
uncertainties such as friction, gravity and anti-swing coupling. Because the PID
controller has no adaptive mechanism, it does not work well for anti-swing coupling
in contrast to the fuzzy compensator which can adjust its control action. On the other
hand, the PID controller is faster than the PD control with fuzzy compensation in
the case of small anti-swing coupling.
The structure of the fuzzy compensator is very important. The constants in the
membership functions of the fuzzy system have to be chosen either by simulation or
experiment. From fuzzy theory the form of the membership function is known not
to influence the stability of the fuzzy control, but the approximation ability of the fuzzy
system for a particular non-linear process depends on the membership functions
selected. The number of fuzzy rules constitutes a structural problem for fuzzy
systems. It is well known that increasing the dimension of the fuzzy rules can cause the
“overlap” problem and add to the computational burden [29]. The best dimension
to use is still an open problem for the fuzzy research community. In this application
20 fuzzy rules were used. Since it is difficult to obtain the fuzzy structure from prior
knowledge, several fuzzy identifiers can be put in parallel and the best one selected
by a switching algorithm. The learning gain Kw influences the learning speed: a very
large gain can cause unstable learning, while a very small gain produces a slow
learning process.
8 Conclusion
In this chapter, the disadvantages of the popular PD control for overhead crane are
overcome in the following two ways: (1) a high-gain observer is proposed for the
estimation of the velocities of the joints; (2) a fuzzy compensator is used to com-
pensate for gravity and friction. Using Lyapunov-like analysis, the stability of the
closed-loop system with velocity estimation and fuzzy compensation was proven.
Real-time experiments were presented comparing our stable anti-swing PD control
strategy with regular crane controllers. These showed that the PD control law with
the anti-swing and fuzzy compensations is effective for the crane system.
Acknowledgments Wen Yu would like to thank CONACyT for supporting his visit to Queen’s
University Belfast under the projects 46729Y and 50480Y. The second author would like to ac-
knowledge the support received from the International Exchange Scheme of Queen’s University
Belfast.
[Fig. 5: Gantry positions xw, yw and R (m) versus time (s), PD control with fuzzy compensation.]
[Fig. 6: Gantry positions xw, yw and R (m) versus time (s), PD control without fuzzy compensation.]
[Fig. 7 PID: Gantry positions xw, yw and R (m) versus time (s).]
9 Appendix
Proof of Theorem 1. Since the spectra of $K_1$ and $K_2$ are in the left half plane,
(22) has a positive definite solution $P$. Consider the following candidate Lyapunov
function: $V_0(\tilde z) = \varepsilon \tilde z^T P \tilde z$. The derivative of this along the solutions of (20) is:
$$\dot V_0 = \varepsilon \left( \frac{d}{dt} \tilde z^T \right) P \tilde z + \varepsilon \tilde z^T P \frac{d}{dt} \tilde z = \tilde z^T \left( A^T P + P A \right) \tilde z + 2\varepsilon^2 (B H_1)^T P \tilde z \le -\|\tilde z\|^2 + 2\varepsilon^2 \|B H_1\| \|P\| \|\tilde z\| \qquad (34)$$
Since (16) has a solution for any $t \in [0, T]$, $H_1$ is bounded for any finite time $T$.
It can therefore be concluded that $\|B H_1\| \|P\|$ is bounded.
$$\dot V_0 \le -\|\tilde z\|^2 + \bar K(\varepsilon) \|\tilde z\|$$
Now, let $T_k$ denote the time interval during which $\|\tilde z(t)\| > \bar K(\varepsilon)$. Then $\dot V_0 < 0$,
$\forall t \in [0, T]$ means the total time during which $\|\tilde z(t)\| > \bar K(\varepsilon)$ is finite
$$\sum_{k=1}^{\infty} T_k < \infty \qquad (36)$$
If $\tilde z(t)$ falls outside a ball of radius $\bar K(\varepsilon)$ for only a finite time and then re-enters
it, $\tilde z(t)$ will eventually remain completely inside. If $\tilde z(t)$ leaves the ball infinitely
many times ($k \to \infty$), since $\sum_{k=1}^{\infty} T_k < \infty$ and $T_k > 0$, then it follows that $T_k \to 0$. This then
means that $\tilde z(t)$ finally stays inside the ball and so $\tilde z(t)$ is bounded from an invariant
set argument.
Now, from (21), $\frac{d}{dt}\tilde z(t)$ is also bounded. If $\tilde z_k(t)$ is defined as the largest tracking
error during $T_k$, (36) and a bounded $\frac{d}{dt}\tilde z(t)$ imply that $\lim_{k \to \infty} \left[ \|\tilde z_k(t)\| - \bar K(\varepsilon) \right] = 0$,
and $\|\tilde z_k(t)\|$ will converge to $\bar K(\varepsilon)$. Because $\tilde x = \begin{bmatrix} I & 0 \\ 0 & \frac{1}{\varepsilon} I \end{bmatrix} \tilde z$ and $\varepsilon < 1$, as a result
$\tilde x$ converges to the ball of radius $\bar K(\varepsilon)$. QED
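As a small worked illustration of the Lyapunov equation (22) used in this proof, consider a single degree of freedom with the scalar gains $K_1 = 2$ and $K_2 = 1$. These values are assumptions chosen only to keep the algebra short; they are not the gains of the experiments.

```latex
% Assumed scalar gains (illustrative only): K_1 = 2, K_2 = 1
A = \begin{bmatrix} -2 & 1 \\ -1 & 0 \end{bmatrix}, \qquad
\det(sI - A) = s^2 + 2s + 1 = (s + 1)^2 .

% Solve A^T P + P A = -I for a symmetric P with entries p_1, p_2, p_3
P = \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix}, \qquad
A^T P + P A =
\begin{bmatrix} -4p_1 - 2p_2 & p_1 - 2p_2 - p_3 \\ p_1 - 2p_2 - p_3 & 2p_2 \end{bmatrix}
= -I

\Longrightarrow \quad p_2 = -\tfrac{1}{2}, \quad p_1 = \tfrac{1}{2}, \quad p_3 = \tfrac{3}{2},
\qquad
P = \begin{bmatrix} 1/2 & -1/2 \\ -1/2 & 3/2 \end{bmatrix}, \quad
p_1 > 0, \quad \det P = \tfrac{1}{2} > 0 .
```

Both eigenvalues of $A$ lie at $-1$, so $A$ is stable, and the resulting $P$ is positive definite, as the proof requires.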
which is valid for any $X, Y \in \Re^{n \times k}$ and for any positive definite matrix $0 < \Lambda = \Lambda^T \in \Re^{n \times n}$,
it follows that if $X = \tilde x_2$ and $Y = \delta$, then $\tilde x_2^T \delta \le \tilde x_2^T \Lambda_\delta^{-1} \tilde x_2 + \bar\eta_\delta$. Since
$x_2^d = \dot x_2^d = 0$, and using the learning law (31) and the skew-symmetry property (2), then (40)
becomes
and, since $\|\bar x_2\|_Q^2 = \|\tilde x_2\|_Q^2 + \bar\eta_\delta$, equation (33) is established. QED
References
1. E.M. Abdel-Rahman, A.H. Nayfeh and Z.N. Masoud. Dynamics and control of cranes: a
review. Journal of Vibration and Control, Vol. 9, No. 7, 863–908, 2003
2. J.W. Auernig and H. Troger. Time optimal control of overhead cranes with hoisting of the
payload. Automatica, Vol. 23, No. 4, 437–447, 1987
3. J.W. Beeston. Closed-loop time optimal control of a suspended payload-a design study.
Proc. 4th IFAC World Congress, 85–99, Warsaw Poland, 1969
4. C.I. Byrnes, A. Isidori and J.C. Willems. Passivity, feedback equivalence, and the global
stabilization of minimum phase nonlinear systems. IEEE Trans. Automat. Contr., Vol. 36,
1228–1240, 1991
5. S.K. Cho and H.H. Lee. A fuzzy-logic antiswing controller for three-dimensional overhead
cranes. ISA Trans., Vol. 41, No. 2, 235–43, 2002
6. G. Corriga, A. Giua and G. Usai. An implicit gain-scheduling controller for cranes. IEEE
Trans. Control Systems Technology, Vol. 6, No. 1, 15–20, 1998
7. C. Canudas de Wit and J.J.E. Slotine. Sliding observers for overhead crane manipulator. Au-
tomatica, Vol. 27, No. 5, 859–864, 1991
8. G. Cybenko. Approximation by superposition of sigmoidal activation function. Math. Control,
Sig Syst, Vol. 2, 303–314, 1989
9. Y. Fang, W.E. Dixon, D.M. Dawson and E. Zergeroglu. Nonlinear coupling control laws for
an underactuated overhead crane system. IEEE/ASME Trans. Mechatronics, Vol. 8, No. 3,
418–423, 2003
10. InTeCo, 3DCrane: Installation and Commissioning Version 1.2, Krakow, Poland, 2000
11. P.A. Ioannou and J. Sun. Robust adaptive control. Prentice-Hall Inc., NJ, 1996
12. Y.H. Kim and F.L. Lewis. Neural Network Output Feedback Control of overhead crane
Manipulator. IEEE Trans. Neural Networks, Vol. 15, 301–309, 1999
13. R. Kelly. Global Positioning on overhead crane manipulators via PD control plus a class of
nonlinear integral actions. IEEE Trans. Automat. Contr., Vol. 43, No. 7, 934–938, 1998
14. R. Kelly. A tuning procedure for stable PID control of robot manipulators. Robotica, Vol. 13,
141–148, 1995
15. B. Kiss, J. Levine and P. Mullhaupt. A simple output feedback PD controller for nonlinear
cranes. Proc. Conf. Decision and Control, 5097–5101, 2000
16. H.H. Lee. Modeling and control of a three-dimensional overhead crane. Journal of Dynamic
Systems, Measurement, and Control, Vol. 120, 471–476, 1998
17. H.H. Lee. A new motion-planning scheme for overhead cranes with high-speed hoisting. Jour-
nal of Dynamic Systems, Measurement, and Control, Vol. 126, 359–364, 2004
18. E.H. Mamdani. Application of fuzzy algorithms for control of simple dynamic plant. IEE Pro-
ceedings — Control Theory and Applications, Vol. 121, No. 12, 1585–1588, 1976
19. J.A. Méndez, L. Acosta, L. Moreno, S. Torres and G.N. Marichal. An application of a neural
self-tuning controller to an overhead crane. Neural Computing and Applications, Vol. 8, No. 2,
143–150, 1999
20. K.A. Moustafa and A.M. Ebeid. Nonlinear modeling and control of overhead crane load sway.
Journal of Dynamic Systems, Measurement, and Control, Vol. 110, 266–271, 1988
21. M.W. Noakes and J.F. Jansen. Generalized input for damped-vibration control of suspended
payloads. Journal of Robotics and Autonomous Systems, Vol. 10, No. 2, 199–205, 1992
22. S. Nicosia and A. Tornambe. High-gain observers in the state and parameter estimation of
overhead cranes having elastic joints. System & Control Letters, Vol. 13, 331–337, 1989
23. E.D. Sontag and Y. Wang. On characterization of the input-to-state stability property. System
& Control Letters, Vol. 24, 351–359, 1995
24. O. Sawodny, H. Aschemann and S. Lahres. An automated gantry crane as a large workspace
robot. Control Engineering Practice, Vol. 10, No. 12, 1323–1338, 2002
25. Y. Sakawa and Y. Shindo. Optimal control of container cranes. Automatica, Vol. 18, No. 3,
257–266, 1982
26. W. Singhose, W. Seering and N. Singer. Residual vibration reduction using vector diagrams
to generate shaped inputs. Journal of Dynamic Systems, Measurement, and Control, Vol. 116,
654–659, 1994
27. M. Takegaki and S. Arimoto. A new feedback method for dynamic control of manipulator.
ASME J. Dynamic Syst. Measurement, and Contr., Vol. 103, 119–125, 1981
28. R. Toxqui, W. Yu and X. Li. PD control of overhead crane systems with neural compensa-
tion. Advances in Neural Networks -ISNN 2006, Springer-Verlag, Lecture Notes in Computer
Science, LNCS 3972, 1110–1115, 2006
29. L.X. Wang. Adaptive Fuzzy Systems and Control. Englewood Cliffs NJ: Prentice-Hall, 1994.
30. J. Yu, F.L. Lewis and T. Huang. Nonlinear feedback control of a gantry crane. Proc. 1995
American Control Conference, Seattle, 4310–4315, USA, 1995
Intelligent Fuzzy PID Controller
Abstract This chapter aims to describe the development and two tuning methods
for a self-organising fuzzy PID controller. Before application of fuzzy logic, the
PID gains are tuned by conventional tuning methods. In the first tuning method,
fuzzy logic at the supervisory level readjusts the three PID gains during the system
operation. In the second tuning method fuzzy logic only readjusts the values of the
proportional PID gain, and the corresponding integral and derivative gains are
readjusted using the Ziegler-Nichols tuning method while the system is in operation. For the
compositional rule of inferences in the fuzzy PID and the self-organising fuzzy PID
schemes two new approaches are introduced: the Min implication function with the
Mean-of-Maxima defuzzification method, and the Max-product implication func-
tion with the Centre-of-Gravity defuzzification method. The self-organising fuzzy
PID controller, the fuzzy PID controller and the PID controller are all applied to
a non-linear revolute-joint robot-arm for step input and path tracking experiments
using computer simulation. For the step input and path tracking experiments, the
novel self-organising fuzzy PID controller produces a better output response than
the fuzzy PID controller; and in turn both controllers produce better process output
than the PID controller.
Keywords: Fuzzy controller, fuzzy PID controller, self-organising fuzzy PID con-
troller, implication function, defuzzification method
1 Introduction
The Proportional Integral Derivative (PID) controller is one of the most popular
controllers in industrial applications. However, the PID controller has suboptimal
performance in industrial processes. There have been many attempts in the past
to develop control techniques and algorithms to tune the PID gains KP, KI and KD
[1]–[3]. These control techniques and algorithms are largely inadequate for tuning
the gains of PID controllers for non-linear systems. Some of the techniques and
algorithms used to tune the PID gains demonstrate that further retuning by a skilled
human operator is necessary during the application of the controller to a process.
Fuzzy controllers have been applied to industrial processes with some degree of
success [4]–[6], where the rule buffer codifies the experience of a skilled human
operator. As a result of the successes of fuzzy controllers, fuzzy PID controllers have
been studied in the past decade [7]–[14]. Furthermore, the applications of autonomous
or intelligent fuzzy PID controllers have recently been gathering momentum and
many researchers have worked in the area of self-tuning fuzzy PID controllers. For
example, self-tuning fuzzy PID controllers have been applied to load and frequency control in
energy conversion and management [15], heating, ventilating and air conditioning
plant [16], and programmable logic controllers [17], to name a few. This article takes
the fuzzy PID controller and the self-tuning fuzzy PID controller research further
by developing a novel self-organising fuzzy PID controller.
The self-organising fuzzy PID controller is a learning controller. The rule pro-
duction and modification of the self-organising fuzzy PID controller generates its
own control rule strategies, and deposits the new rules in the rule buffer. The rules
are produced and updated constantly in the rule buffer during the system opera-
tion, according to the new experience encountered both at the setpoint and from the
process under control. For the self-organising fuzzy PID controller, the step input
and path tracking trajectories are applied to a non-linear revolute-joint robot-arm,
with presence of noise and time variant dynamics. The revolute-joint robot-arm is
used as a test bed to study the behaviour of the self-organising fuzzy PID controller
for dynamic system applications. The results of the computer simulation experi-
ments for the self-organising fuzzy PID controller are compared with the fuzzy PID
controller and the PID controller, to evaluate the suitability of the self-organising
fuzzy PID controller for dynamic system applications and also obtain some infor-
mation about the tuning procedure. In order to have measurements of the perfor-
mances of the self-organising fuzzy PID controller, the fuzzy PID controller and the
PID controller, the Integral of the Absolute magnitude of the Error (IAE) criterion
is used. The performance index IAE is particularly useful for computer simulation
studies.
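For sampled simulation data the IAE index can be approximated numerically. The following minimal sketch uses the trapezoidal rule; the function name and the sample error signal are hypothetical and are not taken from the experiments reported in this chapter.

```python
def iae(times, errors):
    """Trapezoidal approximation of the integral of |e(t)| over the samples."""
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        total += 0.5 * (abs(errors[k]) + abs(errors[k - 1])) * dt
    return total


# Hypothetical error signal e(t) = 1 - t sampled on [0, 2].
times = [0.0, 0.5, 1.0, 1.5, 2.0]
errors = [1.0 - t for t in times]
score = iae(times, errors)
print(score)
```

A smaller IAE indicates that the process output stayed closer to the setpoint over the run, so the index gives a single number by which the three controllers can be ranked.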
Section 2 describes the fuzzy PID controller and the self-organising fuzzy PID
controller. Section 3 describes the applications of the controllers to a non-linear
revolute-joint robot-arm. Section 4 presents the computer simulation results for the step
input and the path tracking trajectories. Section 5 is the conclusion.
Intelligent Fuzzy PID Controller 243
In Figure 1, the general structure of the fuzzy controller is derived from the general
structure of the PID controller. Assilian in 1974 defined the fuzzy controller’s inputs
as the error and the change of error, and the output as an incremental one, similar
to the PID controller [18]. The fuzzy section of the fuzzy controller from Figure 1
is used for the fuzzy PID controller, as shown in Figure 2. The fuzzy section of the
fuzzy PID controller comprises the fuzzifier, the rule buffer, the fuzzy control
and the defuzzifier blocks. The remaining blocks of the fuzzy PID controller of
Figure 2 are the PID gains and the revolute-joint robot-arm. The gains of the fuzzy
PID controller are initially tuned using a conventional tuning technique. The fuzzy
section has a supervisory role to readjust the gains of the PID controller during the
system operation.
In Figure 2, the fuzzifier block fuzzifies the error and the change of error. Scaling
and quantisation constitute the fuzzification of the error and the change of error.
The values of the scaling factors are obtained by trial and error during the tuning
of the controller. The scaling factor for the error is denoted ESF and the scaling
factor for the change of error is denoted CESF. Quantisation of the error and the
change of error requires all the fuzzified values to remain within a certain range.
In the experiments presented in this work, the range is from Negative Large to
Positive Large. In Figure 3, the linguistic codes for this range are: Negative Large
(NL) = −4 or −3, Negative Small (NS) = −2 or −1, Zero (ZE) = 0, Positive Small
(PS) = +1 or +2, Positive Large (PL) = +3 or +4.

[Figures 1 and 2: block diagrams of the fuzzy controller and the fuzzy PID controller.
Recoverable block labels: setpoint, error, fuzzy section (fuzzifier, rule buffer,
fuzzy control, defuzzifier), revolute-joint robot-arm, process O/P]

[Figure 3: membership functions of the linguistic sets NL, NS, ZE, PS and PL over the
quantised universe −4, −3, −2, −1, −0, +0, +1, +2, +3, +4; vertical axis from 0 to 1]

Table 1 Membership matrix: membership function values of the linguistic sets

Linguistic set    −4     −3     −2     −1      0      1      2      3      4
PL                 0      0      0      0      0      0    0.3    0.7    1.0
PS                 0      0      0      0    0.3    0.7    1.0    0.7      0
ZE                 0      0    0.3    0.7    1.0    0.7      0      0      0
NS               0.3    0.7    1.0    0.7      0      0      0      0      0
NL               1.0    0.7      0      0      0      0      0      0      0

In the fuzzy control block, the fuzzified error, the fuzzified change of error and the
rules from the rule buffer block produce an output using the compositional rule of
inference [19]. An implication function and a defuzzification method constitute the
compositional rule of inference. There are many types of implication functions and
defuzzification methods. However, in the experiments carried out for the novel
self-organising fuzzy PID controller, the Min implication function with the
Mean-of-Maxima defuzzification method and the Max-product implication function with
the Centre-of-Gravity defuzzification method produced better results for the process
output. The Min implication function [20] and the Mean-of-Maxima defuzzification
method are given by equations (1) and (2) respectively:
$$u_R(x, y) = u_A(x) \cap u_B(y). \tag{1}$$
$$u_{Pi} = \left[ U_{Pi(\max)} + U_{Pi(\max-1)} \right] / 2. \tag{2}$$
where a fuzzy subset A with elements x has a membership function uA(x) taking values
in the range 0–1 (see the membership matrix in Table 1). Equally, a fuzzy subset B
with elements y has a membership function uB(y), and uR(x, y) is the result of the
Min implication function. The Mean of Maxima is defined [21] by taking the average of
the two elements in the universe of discourse that correspond to the two largest
values of the membership function. UPi(max) is the element of the universe of
discourse with the highest value of the membership function, UPi(max−1) is the
element with the second highest value and uPi is the resultant. P denotes the
proportional gain and i is the sampling instant. The Max-product [22, 23] implication
function and the Centre-of-Gravity [24] defuzzification method are given by
equations (3) and (4) respectively:
$$u_R(x, y) = u_A(x) \cdot u_B(y). \tag{3}$$
$$u_{Pi} = \sum_{1}^{n} (x_n \cdot U_{ni}) \Big/ \sum_{1}^{n} x_n. \tag{4}$$
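The fuzzification, implication and defuzzification steps can be illustrated with a
short Python sketch built on the membership matrix of Table 1. Several details are
assumptions made only for illustration: the rounding used in `quantise`, the reading
of equation (4) with x_n as the membership grade and U_n as the element of the
universe of discourse, the way the implication result shapes the output set, and the
single rule "IF error is PS AND change of error is PS THEN output is PS". None of
these is confirmed by the chapter.

```python
# Universe of discourse and membership matrix, copied from Table 1.
UNIVERSE = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
MEMBERSHIP = {
    "PL": [0, 0, 0, 0, 0, 0, 0.3, 0.7, 1.0],
    "PS": [0, 0, 0, 0, 0.3, 0.7, 1.0, 0.7, 0],
    "ZE": [0, 0, 0.3, 0.7, 1.0, 0.7, 0, 0, 0],
    "NS": [0.3, 0.7, 1.0, 0.7, 0, 0, 0, 0, 0],
    "NL": [1.0, 0.7, 0, 0, 0, 0, 0, 0, 0],
}


def quantise(value, scaling_factor):
    """Scale a crisp value and clamp it to -4..+4 (rounding rule is an assumption)."""
    return max(-4, min(4, round(value * scaling_factor)))


def grade(label, level):
    """Membership grade of a quantised level in a linguistic set."""
    return MEMBERSHIP[label][UNIVERSE.index(level)]


def min_implication(u_a, u_b):
    """Equation (1): Min implication."""
    return min(u_a, u_b)


def product_implication(u_a, u_b):
    """Equation (3): product implication."""
    return u_a * u_b


def mean_of_maxima(grades):
    """Equation (2): average of the two elements with the two largest grades."""
    ranked = sorted(zip(grades, UNIVERSE), key=lambda pair: pair[0], reverse=True)
    return (ranked[0][1] + ranked[1][1]) / 2


def centre_of_gravity(grades):
    """Equation (4), reading x_n as the grade and U_n as the element (assumption)."""
    total = sum(grades)
    if total == 0:
        return 0.0
    return sum(g * u for g, u in zip(grades, UNIVERSE)) / total


# Hypothetical crisp inputs; ESF = 0.3 and CESF = 12 are values quoted in the text.
e_level = quantise(3.0, 0.3)
ce_level = quantise(0.1, 12)
strength = min_implication(grade("PS", e_level), grade("PS", ce_level))
clipped = [min(strength, g) for g in MEMBERSHIP["PS"]]  # assumed output shaping
print(e_level, ce_level, strength)
print(mean_of_maxima(clipped), centre_of_gravity(clipped))
```

When several elements share the largest grade, `mean_of_maxima` takes the first two
in the order of the universe of discourse; this tie-breaking rule is also an
assumption.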
KP, KI and KD on the right of equations (5), (6) and (7) represent the PID gains
before the readjustments, and KP(Fuzzy−apps), KI(Fuzzy−apps) and KD(Fuzzy−apps) on
the left of the equations represent the PID gains after the application of the fuzzy
PID controller. KCP, KCI and KCD are the descaling factor coefficients for the
proportional, integral and derivative PID gains, respectively. As the values of the
PID gains KP, KI and KD change at different rates, three different values for the
descaling factor coefficients are used. For instance, the range of variation of KP
is greater than that of KI and KD. The values of the descaling factor coefficients
KCP, KCI and KCD are also chosen to be different for each link. For a two-link
revolute-joint robot-arm, KCPS, KCIS and KCDS are the descaling factor coefficients
for the shoulder; KCPA, KCIA and KCDA are the descaling factor coefficients for the
arm.
Finally in Figure 2, the controller output from the PID gains block has a transfer
function
$$\frac{U(s)}{E(s)} = K_{P(\text{Fuzzy-apps})} + \frac{K_{I(\text{Fuzzy-apps})}}{s} + K_{D(\text{Fuzzy-apps})} \cdot s. \tag{8}$$
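In a sampled simulation, a transfer function of this form is commonly approximated by
a rectangle-rule integral and a backward-difference derivative. The sketch below is
one such discretisation, offered as an assumption rather than as the scheme used in
the experiments; the class name `DiscretePID` is hypothetical. The gains are ordinary
attributes, so a supervisory section could overwrite them between samples.

```python
class DiscretePID:
    """Discrete approximation of U(s)/E(s) = KP + KI/s + KD*s (assumed scheme)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt              # sampling interval in seconds
        self.integral = 0.0       # running rectangle-rule integral of the error
        self.prev_error = None    # no derivative term on the first sample

    def update(self, error):
        """Return the controller output for the current error sample."""
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Initial gains and the 12 ms sampling interval quoted for the step experiments.
pid = DiscretePID(kp=50.0, ki=0.55, kd=1.0, dt=0.012)
print(pid.update(1.0))
pid.kp = 45.0  # hypothetical readjustment by the fuzzy section
print(pid.update(0.5))
```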
The block diagram of the novel self-organising fuzzy PID controller is shown in
Figure 4. The broken lines in the block diagram show the self-organising fuzzy section
at the supervisory controller level and the PID at the actuator level. The rule
production and modification section of the self-organising fuzzy PID controller
presented in Figure 4 was initially proposed by Mamdani and Baaklini [25], and has
been studied by various researchers such as Procyk and Mamdani [26] and Kazemian and
Scharf [27], to name a few. However, the use of the rule production and modification
section at the supervisory level to readjust the PID gains at the actuator level has
only been studied by Kazemian [28–31]. The self-organising fuzzy PID controller in
this research is in effect the fuzzy PID controller with an additional rule
production and modification section. Starting with no rules in the rule buffer, the
self-organising fuzzy PID controller automatically builds its own control rule
strategies in the rule buffer during its application to the dynamic system,
according to the changes encountered both at the setpoint and from the process under
control.

[Figure 4: block diagram of the self-organising fuzzy PID controller. Recoverable
block labels: fuzzifier, fuzzy control, defuzzifier, PID gains (KP(Fuzzy-apps),
KI(Fuzzy-apps), KD(Fuzzy-apps)), revolute-joint robot-arm, setpoint, E, U, Y]

In Figure 4, the rule production and modification section comprises four blocks: the
linguistic rule table, the PID fuzzifier, the past states buffer and the rule
reinforcement. The linguistic rule table is responsible for keeping the
revolute-joint robot-arm output as close as possible to the setpoint. If the
revolute-joint robot-arm output approaches or follows the setpoint, then no value
(zero) is output from the linguistic rule table block. If the revolute-joint
robot-arm output deviates from the setpoint, a value called the gain correction KGC
is output from the linguistic rule table block. Based on these objectives, a set of
nine linguistic rules is produced (Figure 5). The nine linguistic rules are converted
into a table (Table 2), which is placed in the linguistic rule table block of the rule
production and modification section. From the fuzzifier block of Figure 2, the
fuzzified error and the fuzzified change of error are fed into the linguistic rule
table block (Table 2) and a corresponding value of KGC is output. The PID fuzzifier
block obtains and fuzzifies the PID gains from the PID gains block. The scaling
factors for the PID fuzzifier block are denoted SFpf, SFif and SFdf. The past states
buffer is a storage block for the past values of the PID gains. The number of PID
gains held in the past states buffer is based on the time lag of the system, and the
time lag in turn depends on the delay-in-reward. The new values of the gain correction
(KGC) and the values of the past states buffer generate new control rules in the rule
reinforcement block when the revolute-joint robot-arm output deviates from the
setpoint.
where KProp(i−N), KInt(i−N) and KDeri(i−N) are the PID gains from the past states
buffer block, i is the sampling instant and N is the number of past samples before
the present sample. The new rules from the rule reinforcement block are transferred
continuously to the rule buffer block during the system operation.
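The interplay of the past states buffer, the gain correction and the rule buffer can
be sketched as follows. This is only an illustration under stated assumptions: the
3x3 table `HYPOTHETICAL_KGC` is invented for the example and is not the linguistic
rule table of Table 2, and the update "past gain plus KGC" is an assumed form of the
rule reinforcement in the spirit of the self-organising controllers cited above, not
a transcription of the chapter's equations.

```python
from collections import deque

# HYPOTHETICAL gain-correction table indexed by the signs of the fuzzified error
# and change of error. It is NOT the chapter's Table 2.
HYPOTHETICAL_KGC = {
    (-1, -1): -2, (-1, 0): -1, (-1, 1): 0,
    (0, -1): -1, (0, 0): 0, (0, 1): 1,
    (1, -1): 0, (1, 0): 1, (1, 1): 2,
}


def sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (0 > value)


class RuleReinforcement:
    """Sketch of rule production using a past states buffer of length N."""

    def __init__(self, delay_in_reward):
        # Past states buffer: (error, change of error, fuzzified gain) triples.
        self.past_states = deque(maxlen=delay_in_reward)
        self.rule_buffer = {}  # starts with no rules

    def step(self, error_q, change_q, fuzzified_gain):
        """Process one sample and return the gain correction KGC."""
        kgc = HYPOTHETICAL_KGC[(sign(error_q), sign(change_q))]
        if kgc != 0 and len(self.past_states) == self.past_states.maxlen:
            # Assumed update: reward or punish the gain applied N samples ago.
            old_error, old_change, old_gain = self.past_states[0]
            self.rule_buffer[(old_error, old_change)] = old_gain + kgc
        self.past_states.append((error_q, change_q, fuzzified_gain))
        return kgc


learner = RuleReinforcement(delay_in_reward=2)
for sample in [(0, 0, 1.0), (0, 0, 1.0), (2, 1, 1.0)]:
    learner.step(*sample)
print(learner.rule_buffer)
```

A bounded `deque` drops the oldest entry automatically, which mirrors a buffer whose
length is tied to the delay-in-reward.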
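[Figure: link coordinate frames of the revolute-joint robot-arm. Recoverable labels:
joints i and i + 1; links i − 1, i and i + 1; axes xi−1, xi, zi−1 and zi; joint
angles θi and θi+1; parameters αi, ai and di]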
of the links and joints as such that, link i rotates around the Zi−1 axis of link i − 1
when joint i turns. Similarly, link i + 1, rotates around Zi at joint i + 1, etc. Xi is
related to link i and points along the common normal of Zi and Zi−1 . The D-H
representation of a link is based on four geometric parameters:
• θi is the angle between links, measuring the joint angle from the Xi−1 axis to the
Xi axis about the Zi−1 axis.
• αi is the twist of the link, the angle between axes Zi−1 and Zi about the Xi axis.
• ai is the length of the link, the shortest distance between the Zi−1 and Zi axes.
• di is the distance between the links, from the link i − 1 to the link i along the Zi−1
axis.
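The four parameters listed above define the standard Denavit-Hartenberg homogeneous
transformation from frame i − 1 to frame i, a rotation θi about Zi−1, a translation
di along Zi−1, a translation ai along Xi and a rotation αi about Xi. The following
Python sketch builds that matrix and chains two links. The link lengths and joint
angles are hypothetical values for a planar two-link arm, not parameters of the
robot-arm used in the experiments.

```python
import math


def dh_transform(theta, alpha, a, d):
    """Standard D-H homogeneous transform from frame i-1 to frame i."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(left, right):
    """Multiply two 4x4 matrices given as nested lists."""
    return [
        [sum(left[r][k] * right[k][c] for k in range(4)) for c in range(4)]
        for r in range(4)
    ]


# Hypothetical planar arm: shoulder link of length 1.0, arm link of length 0.5.
shoulder = dh_transform(theta=math.pi / 2, alpha=0.0, a=1.0, d=0.0)
arm = dh_transform(theta=-math.pi / 2, alpha=0.0, a=0.5, d=0.0)
tip = matmul(shoulder, arm)
print(tip[0][3], tip[1][3], tip[2][3])  # position of the arm tip in the base frame
```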
The driving force for each link is an armature controlled DC motor. The voltage is
applied at the input of the armature terminals and speed of rotation is produced at
the output. A second order differential equation is used to represent the dynamics of
a DC motor and load.
$$\frac{d^2 y}{dt^2} + f \cdot \frac{dy}{dt} + r(t) = r(t)\,u. \tag{12}$$
where u is the input to the process, y is the output from the process, f is the
friction, and r(t) represents the small friction values, which vary with time. The
non-linearities in a revolute-joint robot-arm are caused by backlash, friction and
motor characteristics. In the robot-arm, the moment of inertia varies with time due
to the movements of the links. The DC motor dynamics form a time-variant system,
which can represent small friction values and changes in the moment of inertia of the
motor and load [35]. Varying the term r(t), which stands for the small friction
values, produces changes in the moment of inertia of the motor and load. A sharp
decrease or increase in the moment of inertia makes the system more difficult to
control. The third-order Runge-Kutta method [36] is used to integrate the
second-order dynamic equation. To simulate the noise, a random number generator
program is used to produce 5,000 different numbers. This is based on a linear
congruential random number generator, which gives a distribution close to
rectangular. In accordance with observations made with a practical system, the output
is scaled to give a deviation of ±0.8 units, which is added to the process output.
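The simulation steps described above can be sketched in Python as follows. The
integrator is Kutta's classical third-order scheme, which is one common third-order
Runge-Kutta method and is assumed here; the chapter does not state which variant was
used. The motor model follows equation (12) exactly as printed. The generator
constants (1664525, 1013904223, modulus 2^32), the seed, the friction value, the
constant r(t) and the step input are all illustrative assumptions.

```python
def rk3_step(f, t, y, h):
    """One step of Kutta's third-order method for y' = f(t, y), y given as a list."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * a for yi, a in zip(y, k1)])
    k3 = f(t + h, [yi - h * a + 2 * h * b for yi, a, b in zip(y, k1, k2)])
    return [yi + h / 6 * (a + 4 * b + c) for yi, a, b, c in zip(y, k1, k2, k3)]


def motor(t, state, u, friction, r):
    """Equation (12) as printed, rewritten as a first-order system [y, dy/dt]."""
    _, dy = state
    return [dy, r(t) * u - friction * dy - r(t)]


class LinearCongruential:
    """Linear congruential generator; the constants are assumed, not the chapter's."""

    def __init__(self, seed=1):
        self.state = seed

    def uniform(self, low, high):
        self.state = (1664525 * self.state + 1013904223) % 2**32
        return low + (high - low) * self.state / 2**32


noise = LinearCongruential()
state, h = [0.0, 0.0], 0.012  # 12 ms per sample
for n in range(5):
    state = rk3_step(
        lambda t, s: motor(t, s, u=2.0, friction=1.0, r=lambda _t: 0.5),
        n * h, state, h,
    )
    noisy_output = state[0] + noise.uniform(-0.8, 0.8)  # deviation of +/-0.8 units
    print(n, noisy_output)
```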
calculates the corresponding values of KI and KD. Method 1 is only suitable for a step
input, since the values of KP, KI and KD are readjusted at the rise time, the steady
state error period and between the rise time and the steady state error period,
respectively. For a multi-input multi-output path tracking experiment, such as the
trapezium waveform, the rise time, the steady state error and the overshoot do not
apply.
The step input experiments are intended to produce some initial results for the fuzzy
PID controller, the self-organising fuzzy PID controller and the conventional PID
controller. For a revolute-joint robot-arm with pick and place in mind, the
parameters for the three controllers are tuned to obtain an appropriate damping
around the setpoint, minimise the overshoot and reduce the steady state error. The
PID gains KP, KI and KD are initially tuned off line, without the fuzzy controllers.
The experimental results presented in this article are based on the following
off-line tuning method. Firstly, a large value of KP is chosen and the KP value is
gradually reduced until the overshoot of the process output is minimised.
Subsequently, KI and KD are tuned; and finally KP is re-tuned to obtain the best
possible output response. Once the PID gains KP, KI and KD are tuned, the fuzzy
controllers readjust the PID gains during the system operation. For the step input
experiments, only one of the PID gains is readjusted at a time. From the start of the
signal to the point near the setpoint, KP is readjusted to improve the rise time;
from this point until approaching the steady state error region, KD is readjusted to
dampen the overshoot; and lastly, KI is readjusted to reduce the steady state error.
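The one-gain-at-a-time schedule for a step input can be sketched as a small selection
function. How the three phases are detected is not stated here, so the error-band
test below, the function name `gain_to_readjust` and the two band widths are purely
hypothetical choices for illustration.

```python
def gain_to_readjust(output, setpoint, near_band, settle_band):
    """Return the PID gain readjusted in the current phase of a step response."""
    error = abs(setpoint - output)
    if error > near_band:
        return "KP"  # from the start of the signal up to a point near the setpoint
    if error > settle_band:
        return "KD"  # approaching the steady state region: dampen the overshoot
    return "KI"      # steady state region: reduce the steady state error


# Hypothetical bands of 5 and 1 degrees around a 50 degree setpoint.
for y in (0.0, 47.0, 49.5):
    print(y, gain_to_readjust(y, 50.0, near_band=5.0, settle_band=1.0))
```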
One experiment for the fuzzy PID controller and one experiment for the
self-organising fuzzy PID controller are outlined with the following parameter
values: the initial PID gains are KP = 50, KI = 0.55, KD = 1.0; ESF = 0.3, CESFL for
the linguistic rule table = 4, CESFF for the fuzzy control = 12; delay-in-reward = 6;
the descaling factor coefficients for the defuzzifier block are KCP = 0.5,
KCI = 0.05, KCD = 0.1; the scaling factors for the PID fuzzifier block are
SFpf = 0.12, SFif = 12 and SFdf = 6; and the linguistic rule table of Table 2 is
used. Figures 7 and 8 show examples of the process output for a step input using the
fuzzy PID controller (method 1) and the self-organising fuzzy PID controller
(method 1), respectively. The Y-axis is the process output in degrees and the X-axis
is the sample number, at 12 ms per sample. As the figures indicate, the process
output is better for the self-organising fuzzy PID controller than for the fuzzy PID
controller. Because less computation is involved in the simulation, the rise time is
slightly faster for the fuzzy PID controller than for the self-organising fuzzy PID
controller. The overshoot is virtually non-existent for both the self-organising
fuzzy PID controller and the fuzzy PID controller. The steady state error is
considerably smaller for the self-organising fuzzy PID controller than for the fuzzy
PID controller. This is because the self-organising fuzzy PID controller continuously
changes the values of the rule buffer block during the system operation. In contrast,
the values of the rule buffer block for the fuzzy PID controller are predetermined
before the experiments are carried out.

Fig. 7 Step response, method 1: using the fuzzy PID controller. Process Output
against Sample No. Scaling: X-axis: 12 ms/sample; Y-axis: output - degrees

Fig. 8 Step response, method 1: using the self-organising fuzzy PID controller, run
number 5. Process Output against Sample No. Scaling: X-axis: 12 ms/sample; Y-axis:
output - degrees
Fig. 9 Step response, method 2: using the fuzzy PID controller. Process Output
against Sample No. Scaling: X-axis: 12 ms/sample; Y-axis: output - degrees

Fig. 10 Step response, method 2: using the self-organising fuzzy PID controller, run
number 4. Process Output against Sample No. Scaling: X-axis: 12 ms/sample; Y-axis:
output - degrees
PID controller, as the self-organising fuzzy PID controller continuously changes the
values of the rule buffer during the system operation. Finally, comparing the fuzzy
PID controller and the self-organising fuzzy PID controller using method 1 (Figures
7 and 8) and method 2 (Figures 9 and 10), the two methods produce very similar
results in computer simulation. However, method 1 and method 2 might produce
different results in practical applications. For a step input experiment, after the
initial tuning of the PID gains using conventional methods, it is possible to predict
which of the three PID gains should be readjusted by the rule production and
modification section to further improve the process output response. However, it
should be noted that readjusting the gains KP, KI and KD improves some parts of the
output response and deteriorates others. For instance, for a step input, the
proportional gain KP has a direct effect on the rise time and oscillation, the
integral gain KI reduces the steady state error but increases the possibility of
instability, and the derivative gain KD reduces the overshoot but may cause major
fluctuations in the process output in the presence of high rates of change such as
noise. In contrast, for the path tracking experiments, with continuous changes at the
setpoint and from the process itself during the system operation, one cannot
instantaneously decide which PID gains should be readjusted in order to obtain an
optimum path. Therefore, it is better to apply method 2 to the path tracking
experiments, as only KP needs readjusting by the rule production and modification
section.
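Method 2 readjusts only KP and derives the other two gains from it. As a rough sketch,
the classical Ziegler-Nichols closed-loop settings (integral time equal to half the
ultimate period, derivative time equal to one eighth of it) can be used to express KI
and KD in terms of KP. Whether the experiments used exactly these relations is an
assumption, and both the function name and the ultimate period of 0.8 s below are
hypothetical.

```python
def integral_and_derivative_from_kp(kp, ultimate_period):
    """Derive KI and KD from KP with the classical Ziegler-Nichols PID ratios."""
    integral_time = 0.5 * ultimate_period      # Ti = Tu / 2
    derivative_time = 0.125 * ultimate_period  # Td = Tu / 8
    ki = kp / integral_time
    kd = kp * derivative_time
    return ki, kd


# Hypothetical ultimate period; KP = 4.5 is the shoulder gain quoted in the text.
ki, kd = integral_and_derivative_from_kp(kp=4.5, ultimate_period=0.8)
print(ki, kd)
```

With this arrangement the supervisory section only has to decide how to move one
gain, which matches the argument for preferring method 2 in path tracking.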
Figure 11 compares the fuzzy PID controller (method 2) and the self-organising fuzzy
PID controller (method 2) with the PID controller. In Figure 11, the steady state
errors are about 1.6% for the fuzzy PID controller, 1.1% for the self-organising
fuzzy PID controller and 2.3% for the PID controller. The overshoot is negligible
for the fuzzy PID controller and the self-organising fuzzy PID controller. However,
the overshoot is high for the PID controller. This is because the derivative part of
the PID controller causes the process output to fluctuate in the presence of high
rates of change such as noise.
Fig. 11 Step response, method 2: using the fuzzy PID controller, the self-organising
fuzzy PID controller (run number 4) and the PID controller. Outputs against Sample
No. Scaling: X-axis: 12 ms/sample; Y-axis: outputs - degrees
Fig. 12 (a, b). Tracking a trapezium waveform: using two fuzzy PID controllers, Min
implication function with Mean-of-Maxima defuzzification method, run number 4.
Scaling: X-axis: 6 ms/sample; Y-axis: outputs - degrees

Fig. 13 (a, b). Tracking a trapezium waveform: using two self-organising fuzzy PID
controllers, Min implication function with Mean-of-Maxima defuzzification method,
run number 5. Scaling: X-axis: 6 ms/sample; Y-axis: outputs - degrees

Fig. 14 (a, b). Tracking a trapezium waveform: using two fuzzy PID controllers,
Max-product implication function with Centre-of-Gravity defuzzification method, run
number 5. Scaling: X-axis: 6 ms/sample; Y-axis: outputs - degrees
of Gravity is used in Figure 14. In Figures 13 and 15, a path tracking experiment
for a trapezium waveform is shown, using two self-organising fuzzy PID controllers
with the following parameter values: ESF = 0.45, the change-of-error scaling factor
for the fuzzy control block CESFF = 12, the change-of-error scaling factor for the
linguistic rule table block CESFL = 6, SFpf = 1.1, KCPS = 0.4, KCIS = 0.05,
KCDS = 0.11, KCPA = 0.35, KCIA = 0.05, KCDA = 0.1 and delay-in-reward = 6.
The Min implication function with the Mean of Maxima is used in Figure 13 and the
Max-product implication function with the Centre of Gravity is used in Figure 15.
Fig. 15 (a, b). Tracking a trapezium waveform: using two self-organising fuzzy PID
controllers, Max-product implication function with Centre-of-Gravity defuzzification
method, run number 3. Scaling: X-axis: 6 ms/sample; Y-axis: outputs - degrees
For the purpose of comparison, the gains in the PID gains block for the fuzzy PID
controller and the self-organising fuzzy PID controller are chosen to be the same:
KP[S] = 4.5, KI[S] = 1.6, KD[S] = 1.15 for the shoulder, and KP[A] = 4, KI[A] = 1.3,
KD[A] = 1.1 for the arm. For the path tracking experiments, two self-organising
fuzzy PID controllers trace the trapezium waveform more closely and more smoothly
than two fuzzy PID controllers (refer to Figures 12–15). Increasing the amplitude and
frequency of the trapezium waveform affects the fuzzy PID controller more than the
self-organising fuzzy PID controller. This shows that the self-organising fuzzy PID
controller can react quickly to the changes experienced both at the setpoint and from
the process. Numerous experiments have been carried out with different implication
functions and defuzzification methods using different fuzzy controllers.
Yamazaki [23] used the Max-product implication function in conjunction with the
Centre-of-Gravity defuzzification method and concluded that the process output is
smoother. Lembessis [38] combined the Min implication function [20] with the
Mean-of-Maxima defuzzification method [21] and argued that this combination produces
a faster convergence to the setpoint. Some initial experiments were carried out in
this research to apply the fuzzy PID controller and the self-organising fuzzy PID
controller to a revolute-joint robot-arm using the Max-product implication function
with the Centre-of-Gravity defuzzification method, as well as the Min implication
function with the Mean-of-Maxima defuzzification method. The experimental results of
Figures 12 and 13 show that the Min implication function with the Mean of Maxima
produces a faster convergence to the setpoint. The experimental results of Figures 14
and 15 also reveal that the Max-product implication function with the Centre of
Gravity produces a smoother transient response. In either case, the process output
response is much better for two fuzzy PID controllers and two self-organising fuzzy
PID controllers than for two PID controllers (see Figures 12–16). Introducing noise
to the system produces fewer disturbances in the process output response for the
fuzzy PID controller and the self-organising fuzzy PID controller than for the PID
controller.
In many cases, the number of rules that define different output conditions is
limited. Consequently, it often happens that no rule in particular satisfies certain
outputs. This is of course one of the biggest drawbacks of the fuzzy controllers, as
it undermines the
Fig. 16 (a, b). Tracking a trapezium waveform: using two PID controllers, run number
4. Panel axes: Shoulder Output and Arm Output against Sample No. Scaling: X-axis:
6 ms/sample; Y-axis: outputs - degrees
In Equation (22), T is a finite time chosen so that the integral approaches a steady
state value. In the trapezium waveform experiments, the IAE criterion provides use-
ful information in the analysis of the path tracking ability of the system. Table 3
presents the IAE performance for the fuzzy PID controller, the self-organising fuzzy
PID controller and the PID controller. As already explained in this section, the Min
implication function with the Mean-of-Maxima defuzzification method as well as
5 Conclusion
For the step input experiments, the fuzzy PID controller and the self-organising
fuzzy PID controller are applied to a non-linear robot-arm using computer simulation.
The results of the computer simulation for the fuzzy PID controller and the
self-organising fuzzy PID controller are compared with those of a conventional PID
controller subject to the same data provided at the setpoint, in order to analyse the
results and to obtain some information about the tuning procedure. The results of the
step input experiments for the fuzzy PID controller and the self-organising fuzzy PID
controller demonstrate that the first method, which readjusts the three PID gains
individually, produces virtually the same results as the second method, which
readjusts the proportional gain first and applies the Ziegler–Nichols method to
calculate the corresponding values of the integral and derivative gains. In general,
the rise time for the fuzzy PID controller is faster than that for the
self-organising fuzzy PID controller. The steady state error is smaller for the
self-organising fuzzy PID controller than for the fuzzy PID controller. The overshoot
for the fuzzy PID controller and the self-organising fuzzy PID controller is
virtually non-existent. It is concluded that, for the step input experiments, the
novel self-organising fuzzy PID controller is capable of producing a better process
output than the fuzzy PID controller and the PID controller in controlling a
non-linear robot-arm. Introducing noise to the system creates fewer disturbances in
the process output response for the fuzzy PID controller and the self-organising
fuzzy PID controller than for the PID controller.
The fuzzy PID controller and the self-organising fuzzy PID controller are both
also applied to a non-linear revolute-joint robot-arm for a path tracking experiment
to track a trapezium waveform. The new self-organising fuzzy PID controller traces
the trapezium better than the fuzzy PID controller. This is because the rules in the
rule buffer are updated and changed constantly during the application of the
self-organising fuzzy PID controller to the process. The results of the experiments
for the fuzzy PID controller and the self-organising fuzzy PID controller show a
smoother process output response using the Max-product implication function with the
Centre-of-Gravity defuzzification method, and a swifter convergence to the setpoint
using the Min implication function with the Mean-of-Maxima defuzzification method.
For the path tracking experiments, the fuzzy PID controller and the self-organising
fuzzy PID controller both produce a better process output response than the PID
controller, in the presence of noise and time-variant dynamics.
References
1. M.M. Zavarei and M. Jamshidi. Time-delay systems — analysis, optimisation and applica-
tions. Amsterdam: North-Holland Systems and Control Series, vol. 9, 1987
2. D.P. Atherton. PID controller tuning. IEE Computing & Control Engineering journal,
pp. 44–50, April 1999
3. P. Airikka. PID controller: algorithm and implementation. IEE Computing & Control Engi-
neering journal, pp. 6–11, Dec/Jan 2003/2004
4. M.S. Fodil, P. Siarry, F. Guely and J.L. Tyran. A fuzzy rule base for the improved control
of a pressurised water nuclear reactor. IEEE Transactions on Fuzzy Systems, vol. 8, no. 1,
pp. 1–10, February 2000
5. J.S. Won and R. Langari. Fuzzy torque distribution control for a parallel hybrid vehicle. Expert
Systems, Int. J. of Knowledge Engineering and Neural Networks, vol. 19, no. 1, pp. 4–10,
February 2002
6. S.X. Yang, H. Li, M.Q.-H. Meng and P.X. Liu. An embedded fuzzy controller for a behaviour-
based mobile robot with guaranteed performance. IEEE Transactions on Fuzzy Systems,
vol. 12, no. 4, pp. 436–446, August 2004
7. W. Li. Design of a hybrid fuzzy logic proportional plus conventional integral-derivative con-
troller. IEEE Trans. Fuzzy Systems, vol. 6, no. 4, pp. 449–463, 1998
8. R.K. Mudi and N.R. Pal. A robust self-tuning scheme for PI- and PD-type fuzzy controllers.
IEEE Trans. on Fuzzy Systems, vol. 7, no. 1, pp. 2–16, 1999
9. G.K.I. Mann, B.G. Hu and R.G. Gosine. Two level tuning of fuzzy PID controllers. IEEE
Transactions on Systems, Man and Cybernetics, Part B, vol. 31, no. 5, pp. 263–269, April
2001
10. K.S. Tang, K.F. Man, G. Chen and S. Kwong. An optimal fuzzy PID controller. IEEE Trans-
actions on Industrial Electronics, vol. 48, no. 4, pp. 757–765, August 2001
11. B.G. Hu, G.K.I. Mann and R.G. Gosine. A systematic study of fuzzy PID controllers-function-
based evaluation approach. IEEE Transactions on Fuzzy Systems, vol. 9, no. 5, pp. 699–712,
October 2001
12. R.S. Ranganathan, H.A. Malki and G. Chen. Fuzzy predictive PI control for processes with
large time delays. Expert Systems, Int. J. of Knowledge Engineering and Neural Networks,
vol. 19, no. 1, pp. 21–33, February 2002
13. G.K.I. Mann and R.G. Gosine. Adaptive hierarchical tuning of fuzzy controllers. Expert
Systems, Int. J. of Knowledge Engineering and Neural Networks, vol. 19, no. 1, pp. 34–45,
February 2002
14. Y. Zhao and E.G. Collins Jr. Fuzzy PI control design for an industrial weigh belt feeder. IEEE
Trans. Fuzzy Systems, vol. 11, no. 3, pp. 311–319, June 2003
15. E. Yesil, M. Guzelkaya and I. Eksin. Self tuning fuzzy PID type load and frequency controller.
Energy Conversion and Management Journal, vol. 45, no. 3, pp. 377–390, ISSN. 0196-8904,
2004
16. B. Moshiri and F. Rashidi. Self-tuning based fuzzy PID controllers: application to control
of nonlinear HVAC systems. Intelligent Data Engineering and Automated Learning - IDEAL
2004, vol. 3177, pp. 437–442, ISBN. 978-3-540-22881-3, October 2004
17. O. Karasakal, E. Yesil, M. Guzelkaya and I. Eksin. Implementation of a new self-tuning fuzzy
PID controller on PLC. Turk Journal of Elec. Eng., vol. 13, no. 2, pp. 277–286, 2005
18. S. Assilian. Artificial Intelligence in the control of real dynamic systems. PhD. Thesis, Queen
Mary University of London, 1974
19. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision
processes. IEEE Trans. Syst., Man and Cybern., vol. 3, no. 1, pp. 28–44, 1973
20. E.H. Mamdani. Advances in linguistic synthesis of fuzzy controllers. Int. J. Man-Machine
Studies, vol. 8, pp. 669–678, 1976
21. W. Pedrycz. Fuzzy control and fuzzy systems, Second Extended Edition. Research Studies
Press LTD, Taunton, Somerset, England TA1 1HD, 1993
22. I.P. Holmblad and J.J. Ostergaard. Fuzzy logic control: operator experience applied in auto-
matic process control. FLS Review, F.L. Smidth & Co., 77 Vigerslev Alle, DK-2500, Valby,
Copenhagen, Denmark, vol. 45, pp. 11–16, 1981
23. T. Yamazaki. An improved algorithm for a self-organising controller. PhD. Thesis, Queen Mary
University of London, 1982
24. Y.F. Li and C.C. Lau. Development of Fuzzy Algorithms for Servo Systems. IEEE Control
Systems Magazine, pp. 65–72, April 1989
25. E.H. Mamdani and N. Baaklini. Prescriptive method for deriving control policy in a fuzzy
logic controller. Electronics Letters, vol. 1, pp. 625–626, 1975
26. T.J. Procyk and E.H. Mamdani. A linguistic self-organising process controller. Automatica,
vol. 15, pp. 15–30, 1979
27. H.B. Kazemian and E.M. Scharf. An application of multi-input multi-output self organising
fuzzy controller for a robot-arm. IEEE Int. Journal Neural Network World, vol. 6, no. 4,
pp. 631–641, 1996
28. H.B. Kazemian. Study of learning fuzzy controllers. Expert Systems: The Int. Journal of
Knowledge Engineering and Neural Networks. Blackwell publishers Ltd., vol. 18, no. 4,
pp. 186–193, September 2001
29. H.B. Kazemian. Comparative study of a learning fuzzy PID controller and a self-tuning con-
troller. ISA Transactions the Int. Journal of Science and Engineering of Measurement and
Automation. Elsevier Science Ltd., vol. 40, no. 3, pp. 245–253, July 2001
30. H.B. Kazemian. The SOF-PID controller for the control of a MIMO robot-arm. IEEE Trans-
actions on Fuzzy Systems, vol. 10, no. 4, pp. 523–532, August 2002
31. H.B. Kazemian. Developments of fuzzy PID controllers. Expert Systems: The Int. Journal
of Knowledge Engineering and Neural Networks. Blackwell publishers Ltd., vol. 22, no. 5,
pp. 254–264, November 2005
32. J. Denavit and R.S. Hartenberg. A kinematic notation for lower-pair mechanisms based on
matrices. J. Applied Mechanics, pp. 215–221, 1955
33. M.W. Walker and D.E. Orin. Efficient dynamic computer simulation of robotics mechanisms.
J. Dyn. Sys., Meas., and Control, vol. 104, pp. 205–211, 1982
34. K.S. Fu, R.C. Gonzalez and C.S.G. Lee. Robotics: control, sensing, vision, and intelligence.
McGraw-Hill Int. Eds., Industrial Engineering Series, 1988
35. R.C. Dorf and R.H. Bishop. Modern control systems. Addison-Wesley Publishing Company,
10th Ed., 2004
36. W. Bolton. Essential mathematics for engineering. Butterworth Heinemann Publishing Com-
pany, 1st Ed., 1997
37. J.G. Ziegler and N.B. Nichols. Optimum settings for automatic controllers. Transaction of
ASME, vol. 65, pp. 433–444, 1943
38. E. Lembessis. Dynamic learning behaviour of a rule-based self organizing controller. Ph.D.
Thesis, Queen Mary University of London, UK, 1984
Stability Analysis and Performance Design
for Fuzzy Model-based Control Systems
using a BMI-based Approach
H.K. Lam, Member, IEEE and F.H.F. Leung, Senior Member, IEEE
Abstract This chapter presents the stability analysis and performance design for
nonlinear systems. To facilitate the stability analysis, the T-S fuzzy model is
employed to represent the nonlinear plant. A fuzzy controller with enhanced
stabilization ability is proposed to close the feedback loop. Membership functions
different from those of the fuzzy model are used by the fuzzy controller to simplify
its structure. However, in such a case an imperfect premise-matching condition
results, which leads to conservative stability conditions. To reduce the
conservativeness, the information of the membership functions of the fuzzy model and
controller is employed. The enhanced stabilization ability of the fuzzy controller
further relaxes the stability conditions. However, the stability conditions derived
using the Lyapunov-based approach are in the form of bilinear matrix inequalities
(BMIs), whose solution is difficult to find. A genetic-algorithm-based convex
programming technique is proposed to solve the BMIs. BMI-based performance conditions
subject to a scalar performance index are derived to guarantee the system
performance. Simulation examples are given to illustrate that the proposed approach
provides a systematic and effective way to help design stable and well-performing
fuzzy model-based control systems.
1 Introduction
The T-S fuzzy modelling approach [1, 2] provides a systematic framework to rep-
resent nonlinear plants and facilitates the stability analysis and controller synthesis.
Using the Lyapunov-based method, various stability conditions [3–11] have been
derived to guarantee the system stability. Furthermore, stability conditions can be
expressed in terms of linear matrix inequalities (LMIs) [12] of which the solution
can be found by using some convex programming techniques.
In general, two cases of fuzzy model-based control systems have been investigated.
In the first case, the fuzzy controller is designed under the imperfect
premise-matching condition, in which the fuzzy model and the fuzzy controller do not
share the same premises. In [3, 4], LMI-based stability conditions were derived to
guarantee the stability of this class of fuzzy model-based control systems.
Under the imperfect premise-matching condition, the fuzzy controller exhibits two
favourable features. One, the premise membership functions can be freely designed
so that the design flexibility for the fuzzy controllers is enhanced. Some simple
and commonly used membership functions can be employed to lower the structural
complexity, computational demand and implementation cost of the fuzzy controller.
Two, the fuzzy controller displays an inherent robustness property to handle para-
meter uncertainties of the nonlinear plant. In [3, 4], it can be seen that the stability
conditions are not related to the membership functions of the non-linear plant. Con-
sequently, the fuzzy controller designed under imperfect premise-matching condi-
tion is able to stabilize nonlinear plant with its fuzzy model subject to uncertain
grades of membership due to the presence of parameter uncertainties. However, the
imperfect premise-matching condition will lead to conservative stability conditions
as the membership functions of the fuzzy model are not considered during the stabil-
ity analysis. This problem is partially answered by the second case of fuzzy model-
based control system design. In this case, the fuzzy controller is designed under the
perfect premise-matching condition. Unlike the imperfect premise-matching condi-
tion, the fuzzy model and the fuzzy controller share the same premises during the
design of the fuzzy controller. As the membership functions of the fuzzy model are
considered during stability analysis, the stability conditions can be relaxed [4–11].
However, as the grades of membership must be known, the fuzzy model considered
in [4–11] must be free of uncertainty. Hence, under the perfect premise-matching
condition, the stability conditions are relaxed at the cost of the inherent
robustness property of the fuzzy controller. Fuzzy controllers designed under the
imperfect and perfect premise-matching conditions therefore cannot replace each
other; each has its own advantages in various applications.
In this chapter, the stability of fuzzy model-based control systems under the
imperfect premise-matching conditions is investigated. As revealed by the stability
analysis results of fuzzy model-based systems under perfect premise-matching con-
ditions [4–11] and the preliminary stability analysis result in [13] published by the
same authors, the information of the fuzzy model is important to relax the stability
conditions. The knowledge on the membership functions of the fuzzy model is
employed to design the membership functions of the fuzzy controller. Furthermore,
Let p be the number of fuzzy rules describing the non-linear plant. The i-th rule is
of the following format:
\[ \text{Rule } i:\ \text{IF } f_1(x(t)) \text{ is } M_1^i \text{ AND } \dots \text{ AND } f_\Psi(x(t)) \text{ is } M_\Psi^i \ \text{ THEN } \dot{x}(t) = A_i x(t) + B_i u(t) \tag{1} \]
where $M_\alpha^i$ is a fuzzy term of rule i corresponding to the known function $f_\alpha(x(t))$,
$\alpha = 1, 2, \dots, \Psi$; $i = 1, 2, \dots, p$; $\Psi$ is a positive integer; $A_i \in \mathbb{R}^{n \times n}$ and $B_i \in \mathbb{R}^{n \times m}$ are
known constant system and input matrices respectively; $x(t) \in \mathbb{R}^{n \times 1}$ is the system
state vector and $u(t) \in \mathbb{R}^{m \times 1}$ is the input vector. The system dynamics are described
by
\[ \dot{x}(t) = \sum_{i=1}^{p} w_i(x(t)) \left( A_i x(t) + B_i u(t) \right) \tag{2} \]
where
\[ \sum_{i=1}^{p} w_i(x(t)) = 1, \quad w_i(x(t)) \in [0, 1] \text{ for all } i \tag{3} \]
is a non-linear function of x(t) and µMαi (xα (t)) is the grade of membership corre-
sponding to the fuzzy term Mαi . The grade of membership is affected by any plant
parameter uncertainty.
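The blended dynamics of (2) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a single-input plant, and the two-rule matrices, weights and the function name `ts_fuzzy_derivative` are hypothetical values chosen only for illustration.

```python
def ts_fuzzy_derivative(x, u, weights, A_list, B_list):
    """Return x_dot = sum_i w_i (A_i x + B_i u) for a single-input plant."""
    n = len(x)
    x_dot = [0.0] * n
    for w, A, B in zip(weights, A_list, B_list):
        for r in range(n):
            # Local linear model of rule i, evaluated row by row.
            local = sum(A[r][c] * x[c] for c in range(n)) + B[r] * u
            x_dot[r] += w * local
    return x_dot

# Hypothetical two-rule, two-state example (illustrative values only).
A_list = [[[0.0, 1.0], [-1.0, -2.0]],
          [[0.0, 1.0], [-3.0, -1.0]]]
B_list = [[0.0, 1.0], [0.0, 2.0]]
weights = [0.25, 0.75]   # non-negative and summing to 1, as required by (3)

x_dot = ts_fuzzy_derivative([1.0, 0.0], 0.5, weights, A_list, B_list)
print(x_dot)
```

Because the weights sum to one, the result is a convex combination of the local linear models, which is what makes the Lyapunov analysis over the vertex matrices possible.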
A fuzzy controller with p fuzzy rules is to be designed for the non-linear plant. The
j-th rule of the fuzzy controller is of the following format:
\[ \text{Rule } j:\ \text{IF } g_1(x(t)) \text{ is } N_1^j \text{ AND } \dots \text{ AND } g_\Omega(x(t)) \text{ is } N_\Omega^j \ \text{ THEN } u(t) = F_j x(t) \tag{5} \]
where $N_\beta^j$ is a fuzzy term of rule j corresponding to the known function $g_\beta(x(t))$,
$\beta = 1, 2, \dots, \Omega$; $j = 1, 2, \dots, p$; $\Omega$ is a positive integer; $F_j \in \mathbb{R}^{m \times n}$ is the feedback
gain of rule j to be designed. The inferred output of the fuzzy controller is given by
\[ u(t) = \sum_{j=1}^{p} m_j(x(t)) F_j x(t) \tag{6} \]
where
\[ \sum_{j=1}^{p} m_j(x(t)) = 1, \quad m_j(x(t)) \in [0, 1] \text{ for all } j \tag{7} \]
is a non-linear function of x(t) and $\mu_{N_\beta^j}(g_\beta(x(t)))$ is the grade of membership
corresponding to the fuzzy term $N_\beta^j$.
In order to improve the stabilization ability of the fuzzy controller, the feedback
gains are chosen to be $F_j = \dfrac{G_j}{\sum_{k=1}^{p} m_k(x(t)) a_k}$ to enhance the non-linearity for
compensating the non-linear plant dynamics. From (6), we have,
\[ u(t) = \frac{\sum_{j=1}^{p} m_j(x(t)) G_j x(t)}{\sum_{k=1}^{p} m_k(x(t)) a_k} \tag{9} \]
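The control law of (9) can be evaluated directly once the grades of membership are known. The sketch below is a hedged illustration for a single-input controller; the gains, scalars and membership grades are hypothetical, and `fuzzy_control` is a name introduced here only for the example.

```python
def fuzzy_control(x, m, G_list, a):
    """u = (sum_j m_j G_j x) / (sum_k m_k a_k), for a scalar input u."""
    denom = sum(mk * ak for mk, ak in zip(m, a))
    numer = sum(mj * sum(g * xi for g, xi in zip(Gj, x))
                for mj, Gj in zip(m, G_list))
    return numer / denom

# Hypothetical values, chosen only to make the arithmetic easy to follow.
m = [0.4, 0.6]                          # controller membership grades, sum to 1
a = [0.5, 2.0]                          # positive scalars a_k
G_list = [[-2.0, -1.0], [-4.0, -3.0]]   # row vectors G_j
x = [1.0, 0.5]

u = fuzzy_control(x, m, G_list, a)
u_unit = fuzzy_control(x, m, G_list, [1.0, 1.0])   # a_k = 1 for all k
print(u, u_unit)
```

With all $a_k = 1$ the denominator equals one and the law reduces to the usual weighted sum of linear state-feedback gains; other choices of $a_k$ make the effective gain depend non-linearly on the membership grades.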
3 Stability Analysis
The fuzzy model-based control system is formed by the fuzzy model of (2) and the
fuzzy controller of (9) connected in a closed loop. From (2) and (9), we have,
\[
\begin{aligned}
\dot{x}(t) &= \sum_{i=1}^{p} w_i(x(t)) \left( A_i x(t) + B_i \frac{\sum_{j=1}^{p} m_j(x(t)) G_j x(t)}{\sum_{k=1}^{p} m_k(x(t)) a_k} \right) \\
&= \frac{1}{\sum_{k=1}^{p} m_k(x(t)) a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i(x(t)) m_j(x(t)) \left( a_j A_i + B_i G_j \right) x(t)
\end{aligned}
\tag{10}
\]
is used. For simplicity, wi (x(t)) and m j (x(t)) are written as wi and m j . To investi-
gate the stability of system of (10), the following Lyapunov function candidate is
considered.
\[ V(t) = x(t)^T P x(t) \tag{11} \]
where $P = P^T \in \mathbb{R}^{n \times n} > 0$. From (10) and (11), we have,
It can be seen that $\dot{V}(t) < 0$, which implies the asymptotic stability of the fuzzy
model-based control system, is satisfied when
\[ (a_j A_i + B_i G_j)^T P + P (a_j A_i + B_i G_j) < 0 \]
for all i and j. In order to relax the conservativeness of the stability conditions, the
membership functions of the fuzzy controller are designed such that $m_i - \rho w_i > 0$
for all i and x(t), where $0 < \rho < 1$ is a constant scalar to be determined. Let
$X = X^T = P^{-1}$ and $z(t) = X^{-1} x(t)$; from (12), we have,
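\[
\begin{aligned}
\dot{V}(t) &= \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i m_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&= \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j + \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i w_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( \Lambda_i - \Lambda_i \right) z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i w_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \Lambda_i z(t) - \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} w_i \frac{(1 - \rho)}{\rho} z(t)^T \Lambda_i z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i w_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{(1 - \rho)}{\rho} \Lambda_i \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t)
\end{aligned}
\tag{13}
\]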
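\[ S_{ii} > a_i X A_i^T + a_i A_i X + X G_i^T B_i^T + B_i G_i X - \frac{(1 - \rho)}{\rho} \Lambda_i, \quad i = 1, 2, \dots, p \tag{15} \]
\[
\begin{aligned}
S_{ij} + S_{ij}^T \ge{}& a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{(1 - \rho)}{\rho} \Lambda_i \\
& + a_i X A_j^T + a_i A_j X + X G_i^T B_j^T + B_j G_i X - \frac{(1 - \rho)}{\rho} \Lambda_j, \quad i, j = 1, 2, \dots, p;\ i < j
\end{aligned}
\tag{16}
\]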
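\[
\begin{aligned}
\dot{V}(t) &< \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} w_i^2\, z(t)^T S_{ii} z(t) + \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{j=1}^{p} \sum_{i<j} w_i w_j\, z(t)^T \left( S_{ij} + S_{ij}^T \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k}
\begin{bmatrix} w_1 z(t) \\ w_2 z(t) \\ \vdots \\ w_p z(t) \end{bmatrix}^T
S
\begin{bmatrix} w_1 z(t) \\ w_2 z(t) \\ \vdots \\ w_p z(t) \end{bmatrix} \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t)
\end{aligned}
\tag{17}
\]
where
\[ S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix}. \]
It can be seen from (17) that $\dot{V}(t) < 0$, which implies
the asymptotic stability of the fuzzy model-based control system, if $S < 0$ and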
for all i and j. The stability analysis result is summarized in the following theorem.
Theorem 1: The fuzzy model-based control system of (10) formed by the non-linear
plant in the form of (2) and the fuzzy controller of (9) is asymptotically stable
if the membership functions of the fuzzy controller are designed such that
$m_i(x(t)) - \rho w_i(x(t)) > 0$ for all i and x(t), where $0 < \rho < 1$, and there exist non-zero positive
scalars $a_i$ and matrices $P = P^T \in \mathbb{R}^{n \times n}$, $S_{ij} = S_{ji}^T \in \mathbb{R}^{n \times n}$, $G_i \in \mathbb{R}^{m \times n}$, and
$\Lambda_i = \Lambda_i^T \in \mathbb{R}^{n \times n}$ such that the following BMIs are satisfied:
• $P > 0$;
• $S_{ii} > a_i X A_i^T + a_i A_i X + X G_i^T B_i^T + B_i G_i X - \frac{(1 - \rho)}{\rho} \Lambda_i$, $i = 1, 2, \dots, p$;
• $S_{ij} + S_{ij}^T \ge a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{(1 - \rho)}{\rho} \Lambda_i + a_i X A_j^T + a_i A_j X + X G_i^T B_j^T + B_j G_i X - \frac{(1 - \rho)}{\rho} \Lambda_j$, $i, j = 1, 2, \dots, p$; $i < j$;
• $S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix} < 0$;
• $a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i < 0$, $i, j = 1, 2, \dots, p$.
In the following, the feedback gains G j and a j for the fuzzy controller are deter-
mined using the BMI-based approach.
Fig. 1 Flowchart of the GA-based convex programming technique (blocks: Start; Genetic Algorithm; Ps; fitness = z; Stop criterion reached? No/Yes; END)
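• $S_{ii} > a_i X A_i^T + a_i A_i X + N_i^T B_i^T + B_i N_i - \frac{(1 - \rho)}{\rho} \Lambda_i$, $i = 1, 2, \dots, p$;
• $S_{ij} + S_{ij}^T \ge a_j X A_i^T + a_j A_i X + N_j^T B_i^T + B_i N_j - \frac{(1 - \rho)}{\rho} \Lambda_i + a_i X A_j^T + a_i A_j X + N_i^T B_j^T + B_j N_i - \frac{(1 - \rho)}{\rho} \Lambda_j$, $i, j = 1, 2, \dots, p$; $i < j$;
• $S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix} < 0$;
• $a_j X A_i^T + a_j A_i X + N_j^T B_i^T + B_i N_j + \Lambda_i < 0$, $i, j = 1, 2, \dots, p$.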
Based on Theorem 1 and Theorem 2, the fuzzy model-based control system of (10)
is guaranteed to be asymptotically stable if there exist scalars $a_j$, $j = 1, 2, \dots, p$, such
that the stability conditions are satisfied. It should be noted that the stability
conditions in Theorem 1 and Theorem 2 are not LMIs if $a_j$ for all j are variables. To
deal with this problem, the GA-based convex programming technique is proposed
to find the solution. The procedure is illustrated in Figure 1 and is summarized as
follows:
Step 1) The GA generates a potential solution $P_s = [a_1, a_2, \dots, a_p]$, which is kept
constant and fed to an LMI solver in the subsequent stage. When the value of $P_s$
is kept constant, the BMI-based stability conditions become LMIs, which can be
solved using a convex programming technique. In general, the initial value of $P_s$
is randomly generated.
Step 2) The LMI solver finds the solution $P_m$ to the LMI conditions based on the
fixed value of $P_s$ generated by the GA in Step 1. The LMI problem is generally denoted
by $L(P_m, P_s) + zI > 0$, where $P_m$
denotes the potential solution of the LMI problem and z is a scalar. The initial
value of $P_m$ is randomly generated or determined by the LMI solver.
Step 3) If there exists a negative z such that $L(P_m, P_s) + zI > 0$, both $P_m$ and
$P_s$ satisfy the stability conditions and a solution has been found. In the
GA-based convex programming process, z is taken as the fitness function that
indicates the degree to which $P_m$ and $P_s$ satisfy the inequality problem. A
more negative value of z indicates better solutions $P_m$ and $P_s$. Consequently,
finding a solution is posed as a minimization problem (minimizing the value of
z). A stopping criterion should be set to end the search, e.g., a predefined
number of iterations has been reached.
Step 4) If the stopping criterion is not met, return to Step 1).
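The four steps above can be sketched as an outer GA loop wrapped around an inner solver call. This is a minimal illustration under stated assumptions, not the authors' implementation: `inner_solver` is a hypothetical stand-in that returns a toy value of z instead of solving the LMIs of Theorem 1 or 2, and the selection, crossover and mutation rules are deliberately simple.

```python
import random

def inner_solver(a):
    """Hypothetical stand-in for the LMI solver: returns the scalar z for fixed a.
    A real implementation would solve the LMIs of Theorem 1 or 2 here."""
    return sum((ai - 0.5) ** 2 for ai in a) - 0.1

def ga_search(num_scalars, lower, upper, pop_size=20, iterations=50, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial candidates Ps = [a_1, ..., a_p].
    pop = [[rng.uniform(lower, upper) for _ in range(num_scalars)]
           for _ in range(pop_size)]
    best_a, best_z = None, float("inf")
    for _ in range(iterations):            # stopping criterion: iteration count
        # Steps 2 and 3: evaluate z for each fixed Ps; z is the fitness.
        scored = sorted(((inner_solver(a), a) for a in pop), key=lambda s: s[0])
        if scored[0][0] < best_z:
            best_z, best_a = scored[0][0], list(scored[0][1])
        parents = [a for _, a in scored[:pop_size // 2]]
        children = [list(best_a)]          # keep the best candidate (elitism)
        while len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            lam = rng.random()
            child = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]
            if rng.random() < 0.5:         # occasional random reset of one gene
                k = rng.randrange(num_scalars)
                child[k] = rng.uniform(lower, upper)
            children.append(child)
        pop = children                     # Step 4: next generation
    return best_a, best_z

best_a, best_z = ga_search(num_scalars=2, lower=1e-3, upper=2.0)
print(best_a, best_z)
```

A negative `best_z` plays the role of a feasible solution in Step 3. Splitting the problem this way keeps every inner problem convex, at the price of one solver call per candidate per generation.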
In this section, BMI-based performance conditions are derived to guarantee the
system performance under the consideration of system stability. The performance
conditions are extra constraints added to the stability conditions in Theorem 2, which
confine the search domain of $N_j$ for all j. Any value of $N_j$ inside that search
domain satisfies a pre-defined scalar performance index [17]. The performance
index, which measures the system performance quantitatively, is defined as follows.
\[ J = \int_0^\infty \sum_{\gamma=1}^{p} \sum_{\lambda=1}^{p} m_\gamma m_\lambda a_\gamma a_\lambda \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}^T \begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt \tag{18} \]
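where $J_1 = J_1^T \in \mathbb{R}^{n \times n} > 0$, $J_2 = J_2^T \in \mathbb{R}^{m \times m} > 0$, which are constant weighting
matrices determined by designers. The weighting matrices allocate the importance
of each system state or control signal contributed to the performance index of (18).
It can be seen that the performance index of (18) reflects the integral of energy of the
system states and control signals. A smaller scalar value of J indicates better system
performance. From (9) and (18), and with the property that
$\sum_{j=1}^{p} m_j = \sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k = 1$, we have,
\[
\begin{aligned}
J &= \int_0^\infty \sum_{\gamma=1}^{p} \sum_{\lambda=1}^{p} m_\gamma m_\lambda a_\gamma a_\lambda
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} I & 0 \\ 0 & \dfrac{\sum_{j=1}^{p} m_j G_j}{\sum_{\xi=1}^{p} m_\xi a_\xi} \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & \dfrac{\sum_{k=1}^{p} m_k G_k}{\sum_{\varphi=1}^{p} m_\varphi a_\varphi} \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt \\
&= \int_0^\infty
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} \sum_{j=1}^{p} m_j a_j I & 0 \\ 0 & \sum_{j=1}^{p} m_j G_j \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} \sum_{k=1}^{p} m_k a_k I & 0 \\ 0 & \sum_{k=1}^{p} m_k G_k \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt \\
&= \int_0^\infty \sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} a_j I & 0 \\ 0 & G_j \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} a_k I & 0 \\ 0 & G_k \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt
\end{aligned}
\tag{19}
\]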
where
\[ W_j = \begin{bmatrix} -\eta X & 0 & a_j X & 0 \\ 0 & -\eta X & 0 & N_j^T \\ a_j X & 0 & -J_1^{-1} & 0 \\ 0 & N_j & 0 & -J_2^{-1} \end{bmatrix}, \quad j = 1, 2, \dots, p. \]
The inequality of (23) is satisfied when $W_j < 0$ for all j, which are the BMI-based
performance conditions. The analysis result is summarized in the following theorem.
Theorem 3: The scalar performance index of (18), which measures quantitatively
the system performance of the fuzzy model-based control system of (10), is
attenuated to a prescribed level governed by the non-zero positive scalar $\eta$ if
there exist non-zero positive scalars $a_j$, $j = 1, 2, \dots, p$, and matrices $X = X^T \in \mathbb{R}^{n \times n}$,
$J_1 = J_1^T \in \mathbb{R}^{n \times n} > 0$, $J_2 = J_2^T \in \mathbb{R}^{m \times m} > 0$, and $N_j \in \mathbb{R}^{m \times n}$ such that the following BMIs
are satisfied.
• $X > 0$;
• $W_j = \begin{bmatrix} -\eta X & 0 & a_j X & 0 \\ 0 & -\eta X & 0 & N_j^T \\ a_j X & 0 & -J_1^{-1} & 0 \\ 0 & N_j & 0 & -J_2^{-1} \end{bmatrix} < 0$, $j = 1, 2, \dots, p$.
Step II) Determine $m_j(x(t))$ for the fuzzy controller and obtain the value of
$0 < \rho < 1$ subject to the conditions $m_j(x(t)) - \rho w_j(x(t)) > 0$ for all j and x(t).
Determine the ranges of $a_j$ for the GA-based convex programming technique.
Step III) Solve the stability conditions in Theorem 1 (if the values
of $G_j$ are pre-determined) or Theorem 2 (if the values of $G_j$ are determined
automatically) using the GA-based convex programming technique shown
in Figure 1. If the system performance is considered, the BMI-based performance
conditions in Theorem 3 must be added to the conditions in Theorem 2.
$J_1$ and $J_2$ have to be determined beforehand.
Step IV) Implement the fuzzy controller of (9) according to the values of $G_j$ and
$a_j$.
6 Simulation Examples
Two simulation examples will be given to illustrate the merits of the proposed
approach.
where
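\[ A_1 = \begin{bmatrix} 2 & -10 \\ 1 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} a & -10 \\ 1 & 3 \end{bmatrix}, \quad B_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \text{and} \quad B_2 = \begin{bmatrix} b \\ 0 \end{bmatrix}; \]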
Fig. 2 Stability region based on Theorem 1 with $a_j = 1$ for all j for Simulation Example 1 (horizontal axis a, vertical axis b): (a) ρ = 0.75; (b) ρ = 0.9
Fig. 3 Stability region based on Theorem 1 for Simulation Example 1 (horizontal axis a, vertical axis b): (a) ρ = 0.75; (b) ρ = 0.9
An example of stabilizing a cart-pole type inverted pendulum [20] using the
proposed non-linear controller is given below.
Step I) Figure 4 shows a diagram of the cart-pole type inverted pendulum. The
dynamic equations of the inverted pendulum on the cart [20] are given by,
where $x_1(t)$ and $x_2(t)$ denote the angular displacement (rad) and the angular velocity
(rad/s) of the pendulum from vertical respectively, $x_3(t)$ and $x_4(t)$ denote the
displacement (m) and the velocity (m/s) of the cart respectively, $g = 9.8$ m/s$^2$ is the
acceleration due to gravity, $m = 0.22$ kg is the mass of the pendulum, $M = 1.3282$
kg is the mass of the cart, $l = 0.304$ m is the length from the centre of mass of the
pendulum to the shaft axis, $J = ml^2/3$ kg m$^2$ is the moment of inertia of the pendulum
around the centre of mass, $F_0 = 22.915$ N/ms and $F_1 = 0.007056$ N/rads are the
friction factors of the cart and the pendulum respectively, and u(t) is the force (N)
applied to the cart. The non-linear plant can be represented by a fuzzy model with
two fuzzy rules [20]. The i-th rule is given by,
where
• $x(t) = [x_1(t)\ x_2(t)\ x_3(t)\ x_4(t)]^T$;
• $A_1 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ (M + m)mgl/a_1 & -F_1(M + m)/a_1 & 0 & F_0 ml/a_1 \\ 0 & 0 & 1 & 0 \\ -m^2 g l^2/a_1 & F_1 Ml/a_1 & 0 & -F_0(J + ml^2)/a_1 \end{bmatrix}$;
• $B_1 = \begin{bmatrix} 0 \\ -ml/a_1 \\ 0 \\ (J + ml^2)/a_1 \end{bmatrix}$;
• $A_2 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \frac{3\sqrt{3}}{2\pi}(M + m)mgl/a_2 & -F_1(M + m)/a_2 & 0 & F_0 ml\cos(\pi/3)/a_2 \\ 0 & 0 & 1 & 0 \\ -\frac{3\sqrt{3}}{2\pi} m^2 g l^2 \cos(\pi/3)/a_2 & F_1 ml\cos(\pi/3)/a_2 & 0 & -F_0(J + ml^2)/a_2 \end{bmatrix}$;
• $B_2 = \begin{bmatrix} 0 \\ -ml\cos(\pi/3)/a_2 \\ 0 \\ (J + ml^2)/a_2 \end{bmatrix}$;
• $a_1 = (M + m)(J + ml^2) - m^2 l^2$;
• $a_2 = (M + m)(J + ml^2) - m^2 l^2 (\cos(\pi/3))^2$.
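As a quick numerical check, the constants $a_1$ and $a_2$ and the input vectors $B_1$ and $B_2$ can be evaluated from the physical parameters stated in Step I. This is a small sketch that uses only the formulas listed above; the variable names are chosen here for illustration. Note that these plant constants $a_1$, $a_2$ are distinct from the controller scalars $a_j$ of (9).

```python
import math

# Physical parameters as stated in the text.
M = 1.3282          # mass of the cart (kg)
m = 0.22            # mass of the pendulum (kg)
l = 0.304           # length to the pendulum's centre of mass (m)
J = m * l**2 / 3    # moment of inertia (kg m^2)

c = math.cos(math.pi / 3)   # operating point of the second rule

# Plant constants a1, a2 (not the controller scalars a_j).
a1 = (M + m) * (J + m * l**2) - m**2 * l**2
a2 = (M + m) * (J + m * l**2) - m**2 * l**2 * c**2

# Input vectors of the two local models.
B1 = [0.0, -m * l / a1, 0.0, (J + m * l**2) / a1]
B2 = [0.0, -m * l * c / a2, 0.0, (J + m * l**2) / a2]

print(a1, a2)
print(B1)
print(B2)
```

Both constants are positive, so the divisions in the local models are well defined for these parameter values.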
The membership functions are defined as
\[ w_1(x_1(t)) = \mu_{M_1^1}(x_1(t)) = \left( 1 - \frac{1}{1 + e^{-7(x_1(t) - \pi/6)}} \right) \frac{1}{1 + e^{-7(x_1(t) + \pi/6)}} \]
and
\[ w_2(x_1(t)) = \mu_{M_1^2}(x_1(t)) = 1 - \mu_{M_1^1}(x_1(t)) \]
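The shape of these membership functions is easy to check numerically. The sketch below assumes that $w_1$ is the product $(1 - \sigma(7(x_1 - \pi/6)))\,\sigma(7(x_1 + \pi/6))$ with $\sigma$ the logistic function, which is the reading of the definition adopted here and is consistent with the bell shape described in the caption of Fig. 5; the function names are illustrative.

```python
import math

def w1(x1):
    """Bell-shaped grade of membership for rule 1 (assumed product form)."""
    s_minus = 1.0 / (1.0 + math.exp(-7.0 * (x1 - math.pi / 6)))
    s_plus = 1.0 / (1.0 + math.exp(-7.0 * (x1 + math.pi / 6)))
    return (1.0 - s_minus) * s_plus

def w2(x1):
    """Grade of membership for rule 2, the complement of w1."""
    return 1.0 - w1(x1)

# Sample the grades over the range plotted in Fig. 5.
for x1 in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    print(f"x1 = {x1:5.2f}  w1 = {w1(x1):.4f}  w2 = {w2(x1):.4f}")
```

Under this reading $w_1$ is symmetric about $x_1 = 0$, is close to one near the upright position and decays towards zero for large $|x_1|$, so rule 1 dominates near the equilibrium.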
Step II) A two-rule fuzzy controller is proposed to control the non-linear plant.
The j-th rule is given by,
Fig. 5 Membership functions of fuzzy model and fuzzy controller in Simulation Example 2 (horizontal axis $x_1(t)$ in rad).
$\rho\mu_{M_1^1}(x_1(t))$ (bell in solid line) and $\mu_{N_1^1}(x_1(t))$ (trapezoid in solid line), $\rho\mu_{M_1^2}(x_1(t))$ (bell in dotted
line) and $\mu_{N_1^2}(x_1(t))$ (trapezoid in dotted line) with ρ = 0.8
\[ u(t) = \sum_{j=1}^{2} m_j(x(t)) F_j x(t) = \frac{\sum_{j=1}^{2} m_j(x(t)) G_j x(t)}{\sum_{k=1}^{2} m_k(x(t)) a_k} \tag{32} \]
The membership functions of the fuzzy controller are shown in Figure 5. A simple,
commonly used trapezoidal membership function is employed to implement
the fuzzy controller. Based on the membership information of the fuzzy model
and fuzzy controller, we choose ρ = 0.8 so that the conditions
$m_j(x_1(t)) - \rho w_j(x_1(t)) > 0$ hold for all j and $x_1(t)$.
Step III) Theorem 2 is employed to help design a stable fuzzy controller for
the inverted pendulum. The BMI-based performance conditions in Theorem 3 are added to
Theorem 2 to govern the system performance. To measure the system performance,
the scalar performance index of (18), with η = 0.01 and weighting matrices
\[ J_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad J_2 = 0.1, \]
is used. The proposed GA-based convex programming technique is employed to solve
the BMI-based stability and performance conditions. The lower and upper bounds
of $a_j$, j = 1, 2, are chosen empirically to be $10^{-3}$ and 2 respectively. The
real-coded GA with arithmetic crossover and non-uniform mutation [16] is used in the
GA-based convex programming technique in this application example. The parameters
$a_j$, j = 1, 2, form the chromosomes of the GA process. Their initial values are
randomly generated. The control parameters of the real-coded GA are as follows.
The probability of crossover is 0.8; the probability of mutation is 0.5; the shape
parameter is 1; the population size is 40 and the number of iterations is 500.
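The two genetic operators named above can be sketched as follows. This is a hedged illustration based on the standard definitions of arithmetic crossover and non-uniform mutation as described by Michalewicz [16], not the authors' code; in particular, interpreting the "shape parameter" as the exponent b of the non-uniform mutation is an assumption, and the function names are chosen for this example.

```python
import random

def arithmetic_crossover(p, q, lam):
    """Two children as complementary convex combinations of the parents."""
    c1 = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]
    c2 = [lam * qi + (1 - lam) * pi for pi, qi in zip(p, q)]
    return c1, c2

def non_uniform_mutation(gene, lower, upper, t, T, b, rng):
    """Perturb one gene; the step size shrinks as iteration t approaches T.
    b is the shape parameter (assumed to be the one quoted in the text)."""
    def delta(y):
        return y * (1.0 - rng.random() ** ((1.0 - t / T) ** b))
    if rng.random() < 0.5:
        return gene + delta(upper - gene)   # move towards the upper bound
    return gene - delta(gene - lower)       # move towards the lower bound

rng = random.Random(1)
c1, c2 = arithmetic_crossover([0.0, 2.0], [2.0, 0.0], 0.25)
early = non_uniform_mutation(1.0, 1e-3, 2.0, 0, 500, 1.0, rng)
final = non_uniform_mutation(1.0, 1e-3, 2.0, 500, 500, 1.0, rng)
print(c1, c2, early, final)
```

Both operators keep genes inside the bounds $[10^{-3}, 2]$: the crossover produces convex combinations, and the mutation step never exceeds the distance to the nearer bound in the chosen direction. At the final iteration the mutation step vanishes, so the search becomes purely local towards the end.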
After the process, we obtain $a_1 = 0.1467$ and $a_2 = 0.1752$, and the feedback
gains as
\[ G_1 = [79.7443\ \ 6.8117\ \ 0.1839\ \ 6.6751], \quad G_2 = [106.1353\ \ 8.1176\ \ 0.1482\ \ 6.6387] \]
such that the BMI-based stability and performance conditions in Theorem 2 and
Theorem 3 are satisfied. In the following, the fuzzy controller with these feedback
gains is referred to as fuzzy controller 1. For comparison and to show the
effectiveness of the performance conditions, another set of feedback gains is
obtained for fuzzy controller 2, for which every parameter is kept unchanged except
\[ J_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 100 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
On solving the stability and performance conditions, the feedback gains obtained
for fuzzy controller 2 are
\[ G_1 = [139.3695\ \ 11.8497\ \ 3.7243\ \ 12.0212], \quad G_2 = [165.7844\ \ 13.4360\ \ 3.7465\ \ 12.8725]. \]
Both fuzzy controllers 1 and 2 in the form of (32) are employed to stabilize the
inverted pendulum described in (25) to (28). Figure 6 shows the system state responses
under the initial system state $x(0) = [\tfrac{5\pi}{12}\ 0\ 0\ 0]^T$. Referring to this figure, it can
be seen that the inverted pendulum can be stabilized by both fuzzy controllers.
For fuzzy controller 2, we have
\[ J_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 100 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \]
in which a heavier weight is put on $x_3(t)$ in the performance index. Consequently, the
response of $x_3(t)$ of the controlled inverted pendulum with fuzzy controller 2 offers
better system performance than that with fuzzy controller 1 in terms of transient
response and settling time.
In this example, it can be seen that the fuzzy controller can use simple membership
functions instead of the complicated membership functions of the fuzzy model that
the perfect premise-matching condition would require. Moreover, under the imperfect
premise-matching condition, the stability conditions in [4–10] cannot be applied to
aid the design of the fuzzy controller. The proposed BMI-based stability and
performance conditions, in contrast, offer a systematic way to realize a stable and
well-performing fuzzy controller for the non-linear system under the imperfect
premise-matching condition.
Fig. 6 System responses of the inverted pendulum with fuzzy controller 1 (solid lines) and fuzzy
controller 2 (dotted lines), plotted against time (sec); (a) $x_1(t)$ (rad); (b) $x_2(t)$ (rad/s); (c) $x_3(t)$ (m); (d) $x_4(t)$ (m/s)
7 Conclusion
System stability of fuzzy model-based control systems under the imperfect premise-
matching condition has been investigated. A fuzzy controller with enhanced stabi-
lization ability has been proposed to deal with non-linear systems. The information
of the membership functions of the fuzzy model and fuzzy controller has been used
to facilitate the system analysis. Relaxed BMI-based stability conditions have been
derived using the Lyapunov-based approach to guarantee the system stability. Under
the imperfect premise-matching condition, simple membership functions can be em-
ployed to lower the structural complexity of the fuzzy controller. BMI-performance
conditions have been derived subject to a scalar performance index to guarantee
the system performance. The GA-based convex programming technique has been
proposed to solve the BMI-based stability and performance conditions
so as to aid the design of stable and well-performing fuzzy model-based control
systems. Simulation examples have been given to illustrate the effectiveness of the
proposed approach.
Acknowledgment The work described in this paper was supported by grants from King’s College
London and The Hong Kong Polytechnic University (Project No. G-YE92).
References
1. T. Takagi and M. Sugeno. Fuzzy identification of systems and its applications to modeling and
control. IEEE Trans. Sys., Man., Cybern., vol. smc-15 no. 1, pp. 116–132, Jan 1985
2. M. Sugeno and G.T. Kang. Structure identification of fuzzy model. Fuzzy Sets and Systems,
vol. 28, pp. 15–33, 1988
3. C.L. Chen, P.C. Chen and C.K. Chen. Analysis and design of fuzzy control system. Fuzzy Sets
and Systems, vol. 57, no 2, 26, pp. 125–140, Jul 1993
4. H.O. Wang, K. Tanaka and M.F. Griffin. An approach to fuzzy control of nonlinear systems:
stability and the design issues. IEEE Trans. Fuzzy Syst., vol. 4, no. 1, pp. 14–23, Feb 1996
5. K. Tanaka, T. Ikeda and H.O. Wang. Fuzzy regulator and fuzzy observer: Relaxed stability
conditions and LMI-based designs. IEEE Trans. Fuzzy Syst., vol. 6, no. 2, pp. 250–265, 1998
6. W.J. Wang, S.F. Yan and C.H. Chiu. Flexible stability criteria for a linguistic fuzzy dynamic
system. Fuzzy Sets and Systems, vol. 105, no. 1, pp. 63–80, Jul 1999
7. E. Kim and H. Lee. New approaches to relaxed quadratic stability conditions of fuzzy control
systems. IEEE Trans. Fuzzy Syst., vol. 8, no. 5, pp. 523–534, 2000
8. X. Liu and Q. Zhang. New approaches to H∞ -controller designs based on fuzzy observers for
T-S fuzzy systems via LMI. Automatica, vol. 39, no. 9, pp. 1571–1582, Sep 2003
9. X. Liu and Q. Zhang. Approaches to quadratic stability conditions and H∞ -control designs for
T-S fuzzy systems. IEEE Trans. Fuzzy Syst., vol. 11, no. 6, pp. 830–839, 2003
10. M.C.M. Teixeira, E. Assunção and R.G. Avellar. On relaxed LMI-based designs for fuzzy
regulators and fuzzy observers. IEEE Trans. on Fuzzy Systems, vol. 11, no. 5, pp. 613–623,
Oct 2003
11. C.H. Fang, Y.S. Liu, S.W. Kau, L. Hong and C.H. Lee. A new LMI-based approach to relaxed
quadratic stabilization of T-S fuzzy control systems. IEEE Trans. on Fuzzy Systems, vol. 14,
no. 3, pp. 386–397, Jun 2006
12. S. Boyd, L. El Ghaoui, E. Feron and V. Balakrishnan. Linear Matrix Inequalities in Systems
and Control Theory. ser. SIAM studies in Applied Mathematics, Philadelphia, PA: SIAM,
1994
13. H.K. Lam and F.H.F. Leung. Stability analysis and synthesis of fuzzy control systems subject
to uncertain grades of membership. IEEE Trans. Syst., Man and Cybern, Part B: Cybernetics,
vol. 35, no. 6, pp. 1322–1325, Dec 2005
14. T.M. Guerra and L. Vermeiren. LMI-based relaxed nonquadratic stabilization conditions for
nonlinear systems in the Takagi-Sugeno’s form. Automatica, vol. 40, pp. 823–829, 2004
15. B.C. Ding, H.X. Sun and P. Yang. Further study on LMI-based relaxed nonquadratic stabi-
lization conditions for nonlinear systems in the Takagi-Sugeno’s form. Automatica, vol. 42,
pp. 503–508, 2006
16. Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. 2nd ed.
Springer-Verlag, 1994
17. B.D.O. Anderson and J.B. Moore. Optimal Control: Linear Quadratic Methods. Prentice-Hall,
1990
18. T.M. Guerra, F. Delmotte, L. Vermeiren and H. Tirmant. Compensation and division control
law for fuzzy models. Fuzzy IEEE 2001, Australia, December, pp. 521–524, 2001
19. E. Kim, M. Park, S. Ji and M. Park. A new approach to fuzzy modeling. IEEE Trans. Fuzzy
Syst., vol. 7, no. 2, pp. 236–240, 1999
20. X.J. Ma and Z.Q. Sun. Analysis and design of fuzzy reduced-dimensional observer and fuzzy
functional observer. Fuzzy Sets and Systems, vol. 120, pp. 35–63, 2001
Two-Level Tuning of Fuzzy PID Controllers
for Multivariable Process Systems
Abstract This paper presents a novel design and tuning technique for fuzzy PID
(FPID) controllers for multivariable process systems. The inference mechanism of
the FPID system follows the Standard Additive Model (SAM)-based fuzzy rule
structure. The proposed design method can be used for any n × n
multi-input–multi-output (MIMO) process system and guarantees closed-loop stability. In
general, the design of FPID controllers for MIMO systems is challenging, mainly
because of loop interactions. To address this issue, a static decoupler is implemented
to remove steady-state loop interactions. Each control loop is then assigned an
FPID system. Two types of FPID configuration are considered. The first follows the
Mamdani-type rule structure, where the error and error rates are used directly in
the input space to derive fuzzy rules. The second consists of decoupled fuzzy rules,
where three decoupled rule bases are assigned to the individual PID actions. Tuning
follows the two-level tuning principle described in [1]. The low-level tuning
determines the linear gain parameters of the FPID system, whereas the high-level
tuning adjusts the fuzzy rule base parameters. The low-level tuning method adopts
a novel linear tuning scheme for general decoupled PID controllers, and the
high-level tuning adopts a heuristic-based method to change the nonlinearity
in the fuzzy output. For robust implementation, a stability analysis is performed
using the Nyquist array and Gershgorin bands. The stability properties provide hard
limits for the fuzzy rule parameters and also guarantee operation within given
gain and phase margin limits. Finally, the performance and the design criteria are
evaluated using several control simulations.
List of abbreviations
1 Introduction
is presented by Hassan et al. [27] for robot arm. In these applications fuzzy logic
controllers are used at supervisory level for self tuning of conventional PID gains
at the lower level. In all aforementioned methods the design of FPID have been
arbitrary and the gain parameters were chosen using trial and error methods.
The literature review revealed that no systematic procedure is available to design
and tune FPID controllers for MIMO process systems. The available SISO-based
FPID design techniques clearly have limitations when extended to general MIMO
systems. As an alternative, this paper proposes a generalized tuning scheme for
both linear PID and FPID controllers. The FPID controller follows fuzzy inference
based on the standard additive model (SAM) proposed in [28]. The proposed tuning
scheme has two levels, namely low-level tuning followed by high-level tuning [1].
By considering an interaction measure among loops, a generalized tuning technique
is developed for the low-level tuning of MIMO processes. In SAM-based fuzzy
inference the consequent fuzzy sets are weighted using either the centroid or the
volume of the membership functions, which can also be calculated in advance using
the SAM theorem. In the proposed design the high-level tuning is dedicated to
determining these centroids and volumes with a view to achieving the desired
nonlinearity of the fuzzy output.
This paper is organized as follows. First, the system description is presented in
Section 2. In Section 3, the two-level tuning technique is described. Low-level tuning
is performed and a generalized linear PID controller design technique is described in
Section 4. In Section 4, a new interaction measure is derived via an interaction index
and the PID controllers are tuned for a MIMO process based on this index. In Section 5,
high-level tuning is performed using a SAM-based fuzzy system. Two types of FPID
configurations are considered in Section 6 and SAM-based fuzzy controllers are
designed for each system. In Section 7, the stability analysis is performed using
the direct Nyquist array (DNA) theorem, from which the hard limits of the high-level tuning
parameters are found. In Section 8, the proposed tuning algorithms, FPID type
I and FPID type II, are simulated for two examples and the results are compared with
a linear PID controller system. Sections 9 and 10 deal with performance analysis and
conclusions.
2 System Description
The open-loop SISO transfer function between the ith output and the jth input, when all
other inputs are zero, is denoted by $g_{ij}$, where $i, j = 1, 2, \ldots, n$. The static decoupler
$D$ for the above system can be described using (2).
\[ D = G^{-1}(0) \tag{2} \]
where it is assumed that $G(0)$ is nonsingular.
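As a minimal sketch of (2), the decoupler can be computed by inverting the steady-state gain matrix. The gains below are read off the numerators of the Example 1 process (63); the helper names `invert` and `matmul` are illustrative and not part of the original design procedure.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("G(0) is singular; no static decoupler exists")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Steady-state gains G(0) of the Example 1 process (63).
G0 = [[0.0288, 0.0119, 0.00028],
      [0.0141, 0.0295, 0.0035],
      [0.0015, 0.0143, 0.0282]]

D = invert(G0)        # static decoupler, eq. (2)
GD = matmul(G0, D)    # should be the identity: decoupled at steady state
```

The product `GD` being the identity matrix reflects that the decoupler removes loop interaction only at zero frequency; dynamic interaction remains and is what the interaction index addresses.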
3 Two-Level Tuning
The main challenge in fuzzy control design is the tuning, particularly choosing
the correct fuzzy system and its associated fuzzy parameters. The curse of
dimensionality caused by rule explosion [28] has been the main drawback of FLC
designs. In a typical tuning problem the parameters include the linear scaling
parameters of the control variables, the fuzzy membership parameters, the rules and other
associated fuzzy variables in the rule base, such as the number of rules, the membership
distribution and the rule composition. The mathematical complexity of nonlinear
fuzzy control makes the formulation of a tuning mechanism an extremely complex
problem. However, the recent increase in computing power has enabled most designers
to adopt numerical optimization techniques, such as genetic algorithms and neural
networks, for generating optimum or near-optimum solutions to fuzzy systems;
these techniques have the capacity to determine a large number of unknown
parameters in fuzzy systems [29], [30]. However, those applications are somewhat
specific and cannot be generalized to wider process specifications. Most of those
designs adopt off-line optimization methods and cannot be implemented for online
control. Moreover, the optimization requires an accurate process model, and any
process mismatch during operation can result in poor stability and affect the overall
performance.
The FPID design can be classified as a two-level tuning problem [1] in which
the tuning process is decomposed into two tuning levels. While low-level tuning
addresses the linear gains and overall stability, the high-level tuning provides nonlinear
control to enable superior performance. In a rule-coupled fuzzy system, such as a
Mamdani–Zadeh-based system, the inputs (error and its derivative) are coupled to
produce a combined fuzzy PI output [1]. The coupled nature of the inputs generally
makes the nonlinear output a complex function. As a result, it is difficult for one
288 G.K.I. Mann and E. Harinath
to isolate linear gains from the nonlinear output. In order to facilitate the two-level
tuning, we define apparent linear gains (ALG) and apparent nonlinear gains (ANG).
While the ALG terms are related to the overall performance and stability of the system,
the ANG terms provide the nonlinearity that is necessary in the fuzzy output.
In the past, for SISO systems, some have attempted to provide tuning rules for linear
gains [31], [32], [33]. However, the nonlinear tuning was not sufficiently or explicitly
described. In [34], the design of a conventional FPID is identified as a two-level
tuning problem and described as a way of obtaining ALG terms for conventional
FPID type controllers. However, the nonlinearity tuning was not sufficiently or explicitly
described for implementing a two-level tuning. In this section a systematic
procedure is developed to devise a two-level tuning methodology for general FPID
controllers for MIMO systems.
where
\[ c_i(s) = K_{Pi} + \frac{K_{Ii}}{s} + K_{Di}\, s \]
and $K_{Pi}$, $K_{Ii}$ and $K_{Di}$ are the proportional, integral and derivative gains of the ith PID
controller. For the above system, shown in Figure 1, the overall compensated system,
i.e. the process model with the static decoupler, can be written as,
where $T_{ii}$ represents the time constant of the ith SISO loop and
$K_{ij}$; $i \neq j$
Two-Level Tuning of Fuzzy PID Controllers for Multivariable Process Systems 289
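Let
\[ Q(s) = \begin{bmatrix}
q_{11}(s) & q_{12}(s) & \cdots & q_{1n}(s) \\
q_{21}(s) & q_{22}(s) & \cdots & q_{2n}(s) \\
\vdots & \vdots & \ddots & \vdots \\
q_{n1}(s) & q_{n2}(s) & \cdots & q_{nn}(s)
\end{bmatrix}, \tag{6} \]
where
\[ q_{ij}(s) = \begin{cases}
K_{ij}\, s \left( K_{Pi} + \dfrac{K_{Ii}}{s} + K_{Di}\, s \right), & i \neq j \\[2ex]
\dfrac{K_{Pi} + \dfrac{K_{Ii}}{s} + K_{Di}\, s}{T_{ii}\, s + 1}, & i = j.
\end{cases} \tag{7} \]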
The closed-loop relation for this system is expressed as,
where r and y are the input and output vectors, respectively. Then, the closed-loop transfer
matrix H(s) between y and r can be written as,
Let
\[ H(s) = \begin{bmatrix}
h_{11}(s) & h_{12}(s) & \cdots & h_{1n}(s) \\
h_{21}(s) & h_{22}(s) & \cdots & h_{2n}(s) \\
\vdots & \vdots & \ddots & \vdots \\
h_{n1}(s) & h_{n2}(s) & \cdots & h_{nn}(s)
\end{bmatrix}. \tag{9} \]
When all other loops are open, the elements in the first column of H(s) can be written
as,
\[ h_{i1}(s) = \frac{q_{i1}(s)}{1 + q_{11}(s)} = q_{i1}(s)\, S_1 \]
where $S_1 = (1 + q_{11}(s))^{-1}$ is defined as the sensitivity function of the first loop [35].
Thus, for a step input change in the first loop, the interactions to other loops at low
frequencies can be computed as,
where $(S_1)_{\max}$ is the maximum value of $S_1$ and $\max(|K_{i1}|)$ is the maximum absolute
value of $K_{i1}$; $i \neq 1$. Hence we can introduce the interaction index of the first loop as,
The value of $K_{I1}$ can be calculated at a particular value of $(S_1)_{\max}$ so that the
interaction index $I_1$ is kept as low as possible. Then, the rest of the interactions can also be
reduced according to the inequality (11). The proportional gain $K_{P1}$ of the PID controller
is computed using the time constant of the first-order approximated process and
the designed integral gain. The derivative gain $K_{D1}$ is chosen from the ZN formula as,
\[ T_{D1} = \frac{1}{4}\, T_{I1}. \tag{13} \]
where $T_{D1}$ and $T_{I1}$ are the derivative and integral time constants of the PID controller at
the first loop. Then,
\[ K_{D1} = \frac{K_{P1}^2}{4 K_{I1}}. \tag{14} \]
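The step from (13) to (14) is not spelled out. A short worked derivation follows, assuming the usual relations between the parallel-form gains and the time constants, $K_{I1} = K_{P1}/T_{I1}$ and $K_{D1} = K_{P1} T_{D1}$; these relations are an assumption, since the text does not state them.

```latex
\begin{align*}
T_{D1} &= \tfrac{1}{4}\, T_{I1}, \qquad T_{I1} = \frac{K_{P1}}{K_{I1}}, \\
K_{D1} &= K_{P1}\, T_{D1} = \frac{K_{P1}\, T_{I1}}{4} = \frac{K_{P1}^2}{4 K_{I1}}.
\end{align*}
```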
In order to find $K_{P1}$, in this analysis we use the direct pole placement method [35] as
follows. The closed-loop transfer function of the first loop with the reduced first-order
model and PID controller is given by,
\[ h_{11}(s) = \frac{\dfrac{K_{D1} s^2 + K_{P1} s + K_{I1}}{T_{11} + K_{D1}}}
{s^2 + \left( \dfrac{1 + K_{P1}}{T_{11} + K_{D1}} \right) s + \dfrac{K_{I1}}{T_{11} + K_{D1}}}. \tag{15} \]
The same procedure is repeated for the other loops, which are tuned while keeping the interaction
index to a minimum.
This section introduces a generalized interaction index for an n × n MIMO process system
as follows.
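To illustrate how the denominator of (15) ties the gains to the closed-loop dynamics, the sketch below reads off the natural frequency and damping ratio by matching the denominator to the standard form $s^2 + 2\zeta\omega_n s + \omega_n^2$. The function name and the numeric values are hypothetical; only (14) and (15) come from the text.

```python
import math


def closed_loop_second_order(kp, ki, t11):
    """Return (kd, wn, zeta) for the first loop described by eq. (15).

    kd follows eq. (14); wn and zeta come from matching the denominator of
    eq. (15) to s^2 + 2*zeta*wn*s + wn^2. Requires ki > 0 and t11 + kd > 0.
    """
    kd = kp ** 2 / (4.0 * ki)          # eq. (14)
    a1 = (1.0 + kp) / (t11 + kd)       # coefficient of s in (15)
    a0 = ki / (t11 + kd)               # constant term in (15)
    wn = math.sqrt(a0)
    zeta = a1 / (2.0 * wn)
    return kd, wn, zeta


# Hypothetical gains and time constant, chosen only for illustration.
kd, wn, zeta = closed_loop_second_order(kp=1.0, ki=0.25, t11=1.0)
```

In a design setting the relation would be used in the other direction: a target damping ratio is chosen and the proportional gain is solved for, which is what the pole placement step does.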
\[ I_i = \max_{i \neq j}(|K_{ij}|)\, (|K_{Ii}|)\, (S_i)_{\max} \tag{18} \]
where
\[ (S_i)_{\max} = \max\, (1 + q_{ii}(s))^{-1} \]
is the maximum value of the ith loop sensitivity function and the reasonable range of
$(S_i)_{\max}$ is 1.3–2 [35]. The term $\max(|K_{ij}|)$ is the maximum absolute value of $K_{ij}$; $i \neq j$.
The integral and proportional gains of each loop can be evaluated as,
\[ K_{Ii} = \frac{I_i}{\max_{i \neq j}(|K_{ij}|)\, (S_i)_{\max}} \tag{19} \]
and
\[ K_{Pi} = \frac{1 \pm \zeta_i \sqrt{K_{Ii}^2 - 4 \zeta_i^2 K_{Ii}^3 T_{ii} + 4 K_{Ii} T_{ii}^2}}{4 K_{Ii}^2 \zeta_i^2 - 1}. \tag{20} \]
By selecting a suitable value for $\zeta_i$, $K_{Pi}$ can be calculated. Then,
\[ K_{Di} = \frac{K_{Pi}^2}{4 K_{Ii}} \tag{21} \]
and we can define $K_{Pi}$, $K_{Ii}$ and $K_{Di}$ as the ALG terms for the FLC.
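The integral-gain step in (19) can be sketched as follows. The coupling matrix `K`, the target index and the value of `s_max` are hypothetical. Taking the maximum over the off-diagonal entries of column i mirrors the first-loop case $\max(|K_{i1}|)$, $i \neq 1$, and is an assumption about how the general index is meant to be read.

```python
def worst_coupling(K, i):
    """Largest absolute off-diagonal gain in column i (assumed reading of eq. 18)."""
    return max(abs(K[k][i]) for k in range(len(K)) if k != i)


def integral_gain(K, i, target_index, s_max):
    """Integral gain of loop i from eq. (19)."""
    worst = worst_coupling(K, i)
    if worst == 0.0:
        raise ValueError("loop has no cross-coupling; eq. (19) is undefined")
    return target_index / (worst * s_max)


def interaction_index_of(K, i, k_i, s_max):
    """Interaction index of loop i from eq. (18)."""
    return worst_coupling(K, i) * abs(k_i) * s_max


# Hypothetical cross-coupling gains; s_max is taken inside the 1.3-2 range.
K = [[1.0, 0.2, 0.1],
     [0.5, 1.0, 0.3],
     [0.1, 0.4, 1.0]]
k_i1 = integral_gain(K, 0, target_index=0.1, s_max=1.5)
```

Substituting the result back into (18) returns the chosen target index, which is a quick consistency check on the two formulas.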
The high-level tuning is dedicated to determining the fuzzy rule base parameters, which
have direct relevance to the nonlinearity of the FLC output. The nonlinearity that
is generated through the fuzzy mapping is then adjusted using the high-level tuning
parameters. In general the nonlinearity can be adjusted either by changing the rules or by
changing the knowledge base rule parameters, such as the membership shapes and their
distributions in the universe of discourse of the variables. An effective nonlinearity
tuning mechanism should have the flexibility to change the
nonlinearity of the fuzzy output over a wide range. A proper selection of the fuzzy
inference mechanism is quite important in achieving efficient high-level tuning [38].
It is found that SAM-based fuzzy inference provides a convenient
way to obtain the desired nonlinearity by changing the membership parameters.
In additive fuzzy systems (controllers), rules are fired in parallel to some degree.
The system then weights and averages the then-part, or consequent, fuzzy sets to infer the
output fuzzy set [36], [37]. Finally, the system defuzzifies the output fuzzy set using
the centroid of the membership functions to generate the fuzzy output. An additive fuzzy
system is a function approximator and SAM is the simplest form of an additive fuzzy
system [28]. According to Kosko, an additive FLC divides the global conditional
mean into a convex sum of local conditional means, while the conventional centroid
type FLC computes the conditional mean as output. The then-part, or consequent,
fuzzy sets of the SAM are characterized by a centroid and an area or volume. The SAM theorem
[28], which is described in Section 5.2, allows these volumes and centroids to be
computed in advance, and this particular feature allows fast implementation of the FLC
for real-time control. Consider fuzzy rules of the form
IF $X = A_\alpha$ THEN $Y = B_\beta$
where X and Y are nonempty sets and λ and ζ are nonempty index sets. Then $A_\alpha$:
$\alpha \in \lambda$ and $B_\beta$: $\beta \in \zeta$ represent the input fuzzy sets of X and the output fuzzy sets of Y,
respectively. An additive fuzzy system stores m such fuzzy rules. These
rules describe fuzzy subsets, or fuzzy patches, in the Cartesian product space X × Y,
as shown in Figure 2. Hence an additive fuzzy system (a collection of IF-THEN
rules) approximates a function F : X → Y. The general framework for a feed-forward
additive fuzzy system is shown in Figure 3. An input x fires
the if-part of all m rules to some degree in parallel. The system then weights (using
the rule weights $w_\beta$) the then-parts to produce new fuzzy sets $B'_\beta$. The weighted sum of
the inferred fuzzy sets forms the output set B.
\[ B = \sum_{\beta=1}^{m} w_\beta B'_\beta(x). \tag{22} \]
The weights $w_\beta$ reflect rule credibility or frequency and provide
an extra term for a learning system to tune. In practice the rule weights are often set
equal to unity: $w_1 = \ldots = w_m = 1$. SAM is a special case of the additive model
framework, and the following special properties can be observed in SAM.
1. The fired then-part set $B'_\beta$ is the fit product $a_\beta(x) B_\beta$, where the fit value $a_\beta(x)$
($a_\beta$ is called the membership function) expresses the membership grade of the input x in
the if-part fuzzy set $A_\alpha$. Then the output set can be expressed as,
\[ B = \sum_{\beta=1}^{m} w_\beta\, a_\beta(x)\, B_\beta(x). \tag{23} \]
2. The system output F(x) is computed as the centroid of the output set B(x) and defuzzifies
to a scalar or a vector.
\[ F(x) = \mathrm{Centroid}\left( \sum_{\beta=1}^{m} w_\beta\, a_\beta(x)\, B_\beta(x) \right) \tag{24} \]
The centroid provides the structure of a conditional expectation to the fuzzy system
F and it acts as an optimal nonlinear approximator in the mean-squared sense.
[Fig. 2: fuzzy patches formed by the if-part sets A1–A5 and the then-part sets B1–B5]
[Fig. 3: feed-forward additive fuzzy system; the input x fires the rules "If A0 then B0" to "If Am then Bm" in parallel, the weighted sets w0 B'0 to wm B'm are summed into B, and a centroidal defuzzifier gives y = F(x)]
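The SAM theorem, proposed by Kosko [28], allows us to compute the then-part parameters
in advance. Suppose the fuzzy system $F : R^n \to R^p$ is a standard additive model
as shown in (24). Then F(x) is a convex sum of the m then-part set centroids:
\begin{align}
F(x) &= \frac{\sum_{\beta=1}^{m} w_\beta\, a_\beta(x)\, V_\beta\, C_\beta}{\sum_{\beta=1}^{m} w_\beta\, a_\beta(x)\, V_\beta} \tag{25} \\
&= \sum_{\beta=1}^{m} p_\beta(x)\, C_\beta. \tag{26}
\end{align}
\[ p_\beta(x) = \frac{w_\beta\, a_\beta(x)\, V_\beta}{\sum_{k=1}^{m} w_k\, a_k(x)\, V_k}. \tag{27} \]
\[ C_\beta = \frac{\int_{-\infty}^{\infty} y\, b_\beta(y)\, dy}{\int_{-\infty}^{\infty} b_\beta(y)\, dy}. \tag{31} \]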
The SAM theorem thus allows us to calculate these volumes and centroids (or local
conditional means) in advance. They can also be made adaptive in real-time control.
For each input x we need to compute only the m fit values $a_\beta(x)$ and then update
the ratio in (25). The consequent then-part fuzzy sets $B_\beta$ can take the form of a
symmetrical triangle, trapezoid or bell curve, so that the area and centroid are easy
to calculate. The SAM structure (25) allows us to replace all then-part fuzzy sets $B_\beta$
with rectangles or non-singleton sets $R_\beta$ having the same volume $V_\beta$ and centroid
$C_\beta$. This would not change the output value F(x).
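The ratio in (25) and the convex weights in (27) can be sketched in a few lines. The triangular if-part sets, their centres and half-widths, and the volumes and centroids used below are illustrative assumptions; only the formulas (25)–(27) are taken from the text.

```python
def triangle(x, center, half_width):
    """Symmetric triangular membership function with peak 1 at `center`."""
    return max(0.0, 1.0 - abs(x - center) / half_width)


def sam_output(x, centers, half_widths, volumes, centroids, weights=None):
    """Evaluate eq. (25) and return (F(x), convex weights p of eq. (27))."""
    m = len(centers)
    if weights is None:
        weights = [1.0] * m                      # unit rule weights
    fits = [triangle(x, c, h) for c, h in zip(centers, half_widths)]
    terms = [w * a * v for w, a, v in zip(weights, fits, volumes)]
    total = sum(terms)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    p = [t / total for t in terms]               # eq. (27)
    return sum(pj * cj for pj, cj in zip(p, centroids)), p   # eq. (26)


# Three hypothetical rules on the normalised axis [-1, 1].
centers = [-1.0, 0.0, 1.0]
half_widths = [1.0, 1.0, 1.0]
volumes = [2.0, 1.0, 0.5]
centroids = [-1.0, 0.0, 1.0]
F, p = sam_output(-0.5, centers, half_widths, volumes, centroids)
```

Because only volumes and centroids enter the computation, replacing each then-part set by another shape with the same volume and centroid leaves `F` unchanged, which is the property stated above.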
Fig. 4 FPID configurations. (a) Type I: rule-coupled FPID, (b) Type II: rule-decoupled FPID
\[ e_{1i} = e_i, \qquad e_{2i} = \Delta e_i, \qquad e_{3i} = \Delta^2 e_i. \tag{33} \]
All FLC input variables are normalized to the compact region [−1, 1]. The error variables
are normalized using the condition $\hat{e}_{wi} = \max(-1, \min(1, S_{wi}\, e_{wi}))$. The
defuzzified controller output after the fuzzy inference is denoted by $\hat{u}$. Similarly, the
FLC output is normalized using the condition $\hat{u} \equiv u / u_{\max}$.
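The two normalization conditions translate directly into code. The function names and the numeric scale factor below are illustrative only.

```python
def normalize_error(e, scale):
    """Scale an error variable and saturate it to [-1, 1]."""
    return max(-1.0, min(1.0, scale * e))


def normalize_output(u, u_max):
    """Normalise a controller output by its maximum value."""
    return u / u_max


# Hypothetical scaling factor S = 0.5.
inside = normalize_error(-0.4, 0.5)      # stays inside the region
clipped_high = normalize_error(3.0, 0.5) # saturates at the upper bound
clipped_low = normalize_error(-10.0, 0.5)
```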
The nonlinear tuning variables are selected to affect the ANG terms at any given local
control point on the control surface. As the PID gains are proportional to the slopes of the
control surface, the slope angles of the tangents drawn at a given point on the nonlinear
control surface are taken as the nonlinear tuning variables. In order to
isolate them from their associated outputs of the type I controller, the slopes are measured
in the planes of the individual error axes. The measurement of these angles with
respect to a two-dimensional control surface is shown in Figure 5(a). Figure 5(b)
shows a control curve that has been projected onto a chosen error variable. In general,
for a three-input coupled rule base the slope angles can be described by
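\[ (\theta_0)_{wi} = \left. \frac{\partial \hat{u}_f}{\partial \hat{e}_{wi}} \right|_{\hat{e}_{wi} = -1}, \qquad
(\theta_1)_{wi} = \left. \frac{\partial \hat{u}_f}{\partial \hat{e}_{wi}} \right|_{\hat{e}_{wi} = 1}. \tag{34} \]
[Fig. 5: (a) two-dimensional control surface $\hat{u}_f$ over $(\hat{e}_{1i}, \hat{e}_{2i})$ with the slope angles $(\theta_0)_{1i}$, $(\theta_1)_{1i}$, $(\theta_0)_{2i}$ and $(\theta_1)_{2i}$; (b) control curve projected onto $\hat{e}$ with the slope angles $(\theta_0)_{wi}$, $(\theta_1)_{wi}$, $(\alpha_0)_{wi}$ and $(\alpha_1)_{wi}$]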
Consider two control regions in the controller output space. The first region is where
the normalized error variables satisfy $-1 \le \hat{e}_i < 0$. The local control in this region
affects the steady state, load disturbance and overshoot properties. The second region is
where $0 \le \hat{e}_i \le 1$. The control in this region affects the speed of response during the
transient, the undershoot and the steady state properties. The objective is to realize
independent adjustment of the FLC parameters with a view to changing the ANG terms at the
chosen control points. The membership functions ($a_i$) for the if-part in SAM are
chosen as triangular functions, as shown in Figure 6. The slope angle θ for type II
(see Figure 5(b)) can be described by,
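\[ \theta = \begin{cases}
\arctan\left( \dfrac{-V_{0i} C_{0i} + V_{1i} C_{1i}}{-\hat{e}_i V_{0i} + (\hat{e}_i + 1) V_{1i}}
- \dfrac{\left( -\hat{e}_i V_{0i} C_{0i} + (\hat{e}_i + 1) V_{1i} C_{1i} \right) \left( -V_{0i} + V_{1i} \right)}{\left( -\hat{e}_i V_{0i} + (\hat{e}_i + 1) V_{1i} \right)^2} \right) \\
\quad \vdots
\end{cases} \]
[Fig. 6: triangular if-part membership functions on the normalized axis $\hat{e}_w$ from −1 to 1]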
The stability properties are determined by the extreme values of the equivalent PID
gains. Therefore, to guarantee stability, the maximum and minimum ANG terms are
considered in an equivalent linear PID system. In the SAM inference the maximum
or minimum of ANG occurs when $\hat{e}_i = -1$, $\hat{e}_i = 0$ and $\hat{e}_i = 1$. Then, the slope angles
at the four selected points (see Figure 4) are,
It is clear that the pairs $\{(\theta_0)_{wi}, (\alpha_0)_{wi}\}$ and $\{(\theta_1)_{wi}, (\alpha_1)_{wi}\}$ each form a right angle. There
are two independent slope angles that can be defined over the control surface of
SAM, corresponding to the two regions $-1 \le \hat{e}_i < 0$ and $0 \le \hat{e}_i \le 1$. Therefore we select
$(\theta_0)_{wi}$ and $(\theta_1)_{wi}$ as the two independent slope angles to be adjusted within the
range [0°, 90°] for high-level tuning. In order to find the two independent angles, the
then-part volume for the second membership function is set to unity: $V_{1i} = 1$. Then,
\[ \theta_0 = \arctan(1/V_0) \tag{41} \]
\[ \theta_1 = \arctan(1/V_2) \tag{42} \]
Hence the terms $V_0$ and $V_2$ are the nonlinear tuning variables for the SAM.
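Relations (41) and (42) can be checked numerically against a small three-rule SAM. The sketch below assumes triangular if-part sets centred at −1, 0 and 1 with unit half-width and then-part centroids at −1, 0 and 1; this layout is an assumption, since Figure 6 is not reproduced here. The end-point slopes are estimated with one-sided finite differences.

```python
import math


def sam3(e, v0, v2, v1=1.0):
    """Three-rule SAM output on [-1, 1] (assumed layout, see lead-in)."""
    points = (-1.0, 0.0, 1.0)                 # if-part centres and centroids
    fits = [max(0.0, 1.0 - abs(e - c)) for c in points]
    volumes = (v0, v1, v2)
    num = sum(a * v * c for a, v, c in zip(fits, volumes, points))
    den = sum(a * v for a, v in zip(fits, volumes))
    return num / den


def slope_angles_deg(v0, v2):
    """Slope angles of eqs. (41) and (42), in degrees."""
    return (math.degrees(math.atan(1.0 / v0)),
            math.degrees(math.atan(1.0 / v2)))


v0, v2 = 2.0, 0.5                              # hypothetical tuning volumes
h = 1e-6
slope_left = (sam3(-1.0 + h, v0, v2) - sam3(-1.0, v0, v2)) / h
slope_right = (sam3(1.0, v0, v2) - sam3(1.0 - h, v0, v2)) / h
theta0, theta1 = slope_angles_deg(v0, v2)
```

Under these assumptions a large `v0` flattens the surface near ê = −1 and a small `v2` steepens it near ê = 1, which is how the two volumes act as independent nonlinear tuning knobs.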
7 Stability Analysis
where
\[ R_i(\omega) = \sum_{j,\, j \neq i} |q_{ij}(j\omega)| \quad \text{for } i = 1, 2, \ldots, n \tag{43} \]
is the radius of the ith Gershgorin circle. Then, the DNA stability theorem [39], [40], [41] is
expressed as follows. When the Gershgorin bands based on the diagonal elements
$q_{ii}(s)$ of $Q(s)$ exclude the point (−1 + j0) and the ith Gershgorin band encircles the
point (−1 + j0) $N_i$ times anticlockwise, then the closed-loop system is stable if,
and only if,
\[ \sum_{i=1}^{n} N_i = p_0, \]
where $p_0$ is the number of unstable poles of $Q(s)$. In this work we have assumed that
the process $Q(s)$ is open-loop stable, since most industrial processes are open-loop
stable systems [42]. Then $p_0 = 0$ for this stability analysis. Hence, if the Gershgorin
bands neither encircle nor include the critical point (−1 + j0) for all i, the closed-loop
system is stable.
Ho et al. [41] have given definitions for the gain and phase margins of MIMO
systems as follows. Figure 7 shows a Nyquist diagram with the Gershgorin circle at
the gain crossover frequency (defined as $\omega_{gi}$) of the ith loop. The Gershgorin circle
intersects the unit circle at A. At the phase crossover frequency (defined as $\omega_{pi}$),
the Gershgorin circle intersects the negative real axis at C, as shown in Figure 8.
Then the phase and gain margins for the MIMO system are defined as,
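The exclusion part of this test is easy to check on a frequency grid: the ith band excludes the critical point at a given frequency when the distance from $q_{ii}(j\omega)$ to −1 exceeds the radius (43). The sketch below covers only that exclusion check, not the encirclement count, and the two 2 × 2 transfer matrices are hypothetical examples.

```python
def gershgorin_radius(row, i):
    """Row radius of eq. (43): sum of off-diagonal magnitudes."""
    return sum(abs(q) for j, q in enumerate(row) if j != i)


def bands_exclude_critical_point(q_at, freqs):
    """True if no Gershgorin circle contains -1 + j0 on the frequency grid."""
    for w in freqs:
        Q = q_at(w)
        for i, row in enumerate(Q):
            if abs(row[i] + 1.0) <= gershgorin_radius(row, i):
                return False
    return True


def weakly_coupled(w):
    """Hypothetical open-loop matrix Q(jw) with small off-diagonal terms."""
    s = 1j * w
    return [[2.0 / (s + 1.0), 0.2 / (2.0 * s + 1.0)],
            [0.1 / (s + 1.0), 1.0 / (3.0 * s + 1.0)]]


def strongly_coupled(w):
    """Same diagonal, but a large coupling term in the first row."""
    s = 1j * w
    return [[2.0 / (s + 1.0), 5.0 / (s + 1.0)],
            [0.1 / (s + 1.0), 1.0 / (3.0 * s + 1.0)]]


freqs = [10.0 ** (k / 10.0) for k in range(-30, 31)]   # 1e-3 to 1e3 rad/time
weak_ok = bands_exclude_critical_point(weakly_coupled, freqs)
strong_ok = bands_exclude_critical_point(strongly_coupled, freqs)
```

A grid check of this kind can miss a narrow frequency range, so in practice the grid would be refined around the crossover frequencies.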
Fig. 7 Nyquist diagram with the Gershgorin circle at the gain crossover frequency ωg
Fig. 8 Nyquist diagram with the Gershgorin circle at the phase crossover frequency ωp
(marked lengths: $|q_{ii}(j\omega_p)| = 1/\alpha_i$ and $R_i(\omega_p) + |q_{ii}(j\omega_p)| = 1/\alpha'_i$)
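does not encircle the point (−1 + j0). As a rule of thumb, $\phi'_i$ and $\alpha'_i$ should satisfy
the following conditions,
\[ 30^\circ \le \phi'_i \le 60^\circ \quad \text{and} \tag{46} \]
\[ 2 \le \alpha'_i \le 5. \tag{47} \]
The $\phi_i$ in Figure 7 and $\alpha_i$ in Figure 8 are the phase and gain margins of the SISO system,
respectively. The following expression can be derived for $\phi_i$ (see Figure 7)
\begin{align}
\phi_i &= \phi'_i + 2 \arcsin\left( \frac{\sum_{j,\, j \neq i} |q_{ij}(j\omega_{gi})|}{2\, |q_{ii}(j\omega_{gi})|} \right) \notag \\
&= \phi'_i + 2 \arcsin\left( \frac{\sum_{j,\, j \neq i} |g_{ij}(j\omega_{gi})|}{2\, |g_{ii}(j\omega_{gi})|} \right). \tag{48}
\end{align}
\begin{align}
\alpha_i &= \alpha'_i \left( 1 + \frac{\sum_{j,\, j \neq i} |q_{ij}(j\omega_{pi})|}{2\, |q_{ii}(j\omega_{pi})|} \right) \notag \\
&= \alpha'_i \left( 1 + \frac{\sum_{j,\, j \neq i} |g_{ij}(j\omega_{pi})|}{2\, |g_{ii}(j\omega_{pi})|} \right). \tag{49}
\end{align}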
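Rearranging (48) gives the MIMO phase margin once the SISO phase margin and the Gershgorin radius at the gain crossover frequency are known. The sketch below assumes that the primed symbol denotes the MIMO margin, as suggested by Figure 7, and uses hypothetical numbers; the function names are illustrative.

```python
import math


def mimo_phase_margin_deg(phi_siso_deg, off_diag_sum, q_ii_mag=1.0):
    """MIMO phase margin from eq. (48), rearranged.

    off_diag_sum is the Gershgorin radius at the gain crossover frequency;
    q_ii_mag is |q_ii| there, which equals 1 by definition of gain crossover.
    """
    ratio = off_diag_sum / (2.0 * q_ii_mag)
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("Gershgorin circle too large for eq. (48)")
    return phi_siso_deg - math.degrees(2.0 * math.asin(ratio))


def satisfies_phase_rule(phi_mimo_deg):
    """Rule of thumb (46): 30 to 60 degrees."""
    return 30.0 <= phi_mimo_deg <= 60.0


# Hypothetical loop: 60 degree SISO margin, radius 0.2 at gain crossover.
phi_mimo = mimo_phase_margin_deg(60.0, 0.2)
phi_small = mimo_phase_margin_deg(35.0, 0.2)
```

The interaction always reduces the margin, so a loop that only just meets the SISO specification can fall outside (46) once the Gershgorin radius is accounted for.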
In order to guarantee stability, the gain margin $\alpha'_i$ and phase margin
$\phi'_i$ of the MIMO process can be predefined so as to satisfy (46) and (47). The limits of
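PID parameters can be calculated for the ith loop. The following four equations can be used
to calculate the four unknowns $\omega_{pi}$, $\omega_{gi}$, $K_{Pi}$ and $K_{Ii}$ in the ith loop.
\[ \alpha_i = \frac{1}{|g_{ii}(j\omega_{pi})\, c_i(j\omega_{pi})|} \tag{50} \]
\[ \arg[g_{ii}(j\omega_{pi})\, c_i(j\omega_{pi})] = -\pi \tag{51} \]
\[ \phi_i = \pi + \arg[g_{ii}(j\omega_{gi})\, c_i(j\omega_{gi})] \tag{52} \]
\[ |g_{ii}(j\omega_{gi})\, c_i(j\omega_{gi})| = 1 \tag{53} \]
\[ K_{Di\,\max} = \frac{K_{Pi\,\max}^2}{4 K_{Ii\,\max}}. \tag{59} \]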
Since the PID gains are proportional to the slopes of the control surface shown in
Figure 5, we can find the maximum values of the slope angles corresponding to $K_{Pi\,\max}$, $K_{Ii\,\max}$
and $K_{Di\,\max}$. For instance, let the proportional SAM-based fuzzy controller for the ith
loop have the high-level tuning parameters $V_0$ and $V_2$. From (37)–(42) the following expression
can be derived
If $K_{Pi\,\max}/K_{Pi} \ge 1.571$, the fuzzy controller has independent variations of $\theta_0$ and
$\theta_1$ within the range [0°, 90°]. Otherwise, it has the feasible stability region shown in
Figure 9.
[Fig. 9: feasible stability region; axis α0 (degree) with the limits α0min and α0max and the 45° and 90° marks]
8 Control Simulation
The proposed FPID controller tuning techniques are applied to a multivariable
process, the Finite Element (FE)-based model of a 3 × 3 soil cell [43]. Here, two
transfer functions are derived. The first is obtained directly from FE analysis
of the soil model, and the second is obtained by doubling the time delays of
the FE-based transfer function. This is done in order to demonstrate the
robustness of the proposed controllers for different processes. In addition, equivalent
delayed first-order models for all the higher-order subprocesses are obtained
by analyzing the response using plant reaction curve methods [44]. These equivalent
first-order models with dead time are then used to design the linear PID controllers. Since
the models and the processes are mismatched, the controllers are more robust to
uncertainty. The linear PID tuning method is also simulated to confirm the superiority of
the FPID controller techniques. The following steps summarize the design of the FPID
controller.
1. Equivalent first-order delayed models are derived for all higher-order subprocesses
by analyzing the response using plant reaction curve methods.
2. The static decoupler is obtained for the first-order plus dead time model.
3. An equivalent first-order model for the overall compensated system (first-order
model with static decoupler) is obtained using a truncated Taylor series
approximation at low frequencies.
4. A measure of interaction is developed and the integral gains are calculated for each
loop at particular values of the interaction indices.
5. Using the direct pole placement method and the Ziegler–Nichols tuning formula, the
proportional and derivative gains of the linear PID controllers are calculated.
6. The nonlinear tuning parameters (volumes of the then-part fuzzy sets of the SAM) are designed so
that the overall system has specific gain and phase margins.
8.1 Example 1
The dynamics of the transfer function between heat input (W) and temperature output
(°C) are described by
\[ \begin{bmatrix}
\dfrac{0.0288 e^{-0.6s}}{6.605s^2 + 5.14s + 1} & \dfrac{0.0119 e^{-1.2s}}{97.02s^2 + 19.7s + 1} & \dfrac{0.00028 e^{-3.6s}}{23.52s^2 + 9.7s + 1} \\[2ex]
\dfrac{0.0141 e^{-1.2s}}{10.11s^2 + 6.36s + 1} & \dfrac{0.0295 e^{-0.6s}}{5.523s^2 + 4.7s + 1} & \dfrac{0.0035 e^{-1.8s}}{23.52s^2 + 9.7s + 1} \\[2ex]
\dfrac{0.0015 e^{-3.6s}}{17.56s^2 + 8.38s + 1} & \dfrac{0.0143 e^{-1.8s}}{6.605s^2 + 5.14s + 1} & \dfrac{0.0282 e^{-0.6s}}{7.29s^2 + 5.4s + 1}
\end{bmatrix}. \tag{63} \]
8.2 Example 2
Fig. 10 Example 1, Simulation of closed-loop system with PID and FPID controllers
Table 1 Performance characteristic indices of proposed FPID methods and PID method for set
point tracking in Example 1
Output  Set point tracking
        Rise time (minute)     Overshoot %            Settling time (minute)
        PID   FPID1  FPID2     PID   FPID1  FPID2     PID   FPID1  FPID2
y1      5     11     5         19    11     17        15    26     11
y2      5     11     5         25    6      16        14    20     9
y3      5     11     6         33    8      25        19    14     20
Table 2 Performance characteristic indices of proposed FPID methods and PID method for load
disturbance in Example 1
Output  Load disturbance
        Overshoot %            Settling time (minute)
        PID   FPID1  FPID2     PID   FPID1  FPID2
(1) 2.61 0.61 2.79 0.7 0.9 2.2 1.3 2.3 1.5 1.8 1.1
(2) 2.03 0.57 1.82 0.9 1.1 4.0 1.4 3.5 1.6 3.8 1.4
(3) 1.7 0.52 1.40 0.9 1.0 2.0 0.6 1.8 0.8 2.3 1.3
\[ \begin{bmatrix}
\dfrac{0.0288 e^{-1.2s}}{6.605s^2 + 5.14s + 1} & \dfrac{0.0119 e^{-2.4s}}{97.02s^2 + 19.7s + 1} & \dfrac{0.00028 e^{-7.2s}}{23.52s^2 + 9.7s + 1} \\[2ex]
\dfrac{0.0141 e^{-2.4s}}{10.11s^2 + 6.36s + 1} & \dfrac{0.0295 e^{-1.2s}}{5.523s^2 + 4.7s + 1} & \dfrac{0.0035 e^{-3.6s}}{23.52s^2 + 9.7s + 1} \\[2ex]
\dfrac{0.0015 e^{-7.2s}}{17.56s^2 + 8.38s + 1} & \dfrac{0.0143 e^{-3.6s}}{6.605s^2 + 5.14s + 1} & \dfrac{0.0282 e^{-1.2s}}{7.29s^2 + 5.4s + 1}
\end{bmatrix}. \tag{65} \]
The equivalent first-order model obtained from the plant reaction curve is given by
\[ \begin{bmatrix}
\dfrac{0.0288 e^{-2.45s}}{4.35s + 1} & \dfrac{0.0119 e^{-7.45s}}{16.05s + 1} & \dfrac{0.00028 e^{-9.7s}}{8.1s + 1} \\[2ex]
\dfrac{0.0015 e^{-3.5s}}{6.6s + 1} & \dfrac{0.0295 e^{-2.4s}}{3.9s + 1} & \dfrac{0.0035 e^{-6.05s}}{7.95s + 1} \\[2ex]
\dfrac{0.0015 e^{-9.5s}}{6.6s + 1} & \dfrac{0.0143 e^{-4.9s}}{4.2s + 1} & \dfrac{0.0282 e^{-2.5s}}{4.5s + 1}
\end{bmatrix}. \tag{66} \]
Table 4 Performance characteristic indices of proposed FPID methods and PID method for set
point tracking in Example 2
Output  Set point tracking
        Rise time (minute)     Overshoot %            Settling time (minute)
        PID   FPID1  FPID2     PID   FPID1  FPID2     PID   FPID1  FPID2
y1      10    17     15        37    10     7         30    30     35
y2      9     20     11        8     0      6         22    23     22
y3      5     15     6         20    0      6         6     22     7
Table 5 Performance characteristic indices of proposed FPID methods and PID method for load
disturbance in Example 2
Output  Load disturbance
        Overshoot %            Settling time (minute)
        PID   FPID1  FPID2     PID   FPID1  FPID2
y1      34    38     17        14    16     23
y2      12    19     16        13    20     10
y3      1     2      3         0     0      0
(1) 0.56 0.25 0.32 1.1 0.8 5.5 0.35 4.2 0.5 3.5 0.2
(2) 1.45 0.23 2.29 0.8 1.0 1.1 0.80 1.3 0.9 1.1 1.8
(3) 2.34 0.39 3.54 0.9 1.2 1.5 0.50 1.8 0.2 0.9 0.6
The initial set-points are 40°C, 50°C and 40°C. Once steady conditions have
been reached, the set-points of all three outputs are changed to 100°C at 80 minutes.
In order to measure the load disturbance rejection capability, a step-load disturbance
was applied to the first loop (y1) of the process. Figure 12 summarizes the output
behavior in the experiments. The system with the FPID type II controller has again
shown less overshoot, although its response is slower than that of the linear PID
system. However, all the systems show the same capability of load
disturbance rejection. Tables 4 and 5 summarize the comparisons of the performance
indices. Table 6 provides all the tuning parameters.
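The indices reported in the tables (rise time, overshoot and settling time) can be extracted from a sampled step response along the following lines. The text does not state the exact definitions used, so the 10–90% rise time and the 2% settling band below are assumptions, and the test signal is a synthetic underdamped second-order response rather than data from the examples.

```python
import math


def step_metrics(t, y, y_start, y_final, band=0.02):
    """Return (rise time, overshoot in %, settling time) of a step response.

    Assumed definitions: 10-90 % rise time, peak overshoot relative to the
    step size, and the first sample after the last excursion outside the band.
    """
    span = y_final - y_start
    frac = [(yi - y_start) / span for yi in y]   # normalised response, 0 -> 1
    t10 = next(ti for ti, f in zip(t, frac) if f >= 0.1)
    t90 = next(ti for ti, f in zip(t, frac) if f >= 0.9)
    overshoot = max(0.0, (max(frac) - 1.0) * 100.0)
    last_outside = -1
    for k, f in enumerate(frac):
        if abs(f - 1.0) > band:
            last_outside = k
    # None signals that the response never settles within the record.
    settling = t[last_outside + 1] - t[0] if last_outside + 1 < len(t) else None
    return t90 - t10, overshoot, settling


# Synthetic response with damping ratio 0.5 and unit natural frequency.
zeta, wn = 0.5, 1.0
wd = wn * math.sqrt(1.0 - zeta ** 2)
t = [0.01 * k for k in range(3001)]
y = [1.0 - math.exp(-zeta * wn * ti)
     * (math.cos(wd * ti) + zeta * wn / wd * math.sin(wd * ti)) for ti in t]
rise, overshoot, settling = step_metrics(t, y, 0.0, 1.0)
```

For a set-point change such as the one described above, `y_start` and `y_final` would be the old and new set-points and `t` would be measured from the instant of the change.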
9 Performance Analysis
Fig. 11 Example 1, Nyquist array and Gershgorin bands of system with linear PID controller
(panels q11 to q33)
operation of this controller for any other frequency, a stability analysis has
then been performed. Nyquist arrays and Gershgorin bands have been constructed for both
examples assuming a linear PID system (Figures 11 and 13). For the simulations a
second-order plant has been modeled using the plant reaction curve, so model/plant
mismatch has already been considered. The results justify the robustness of the
proposed method. The gain and phase margins for the individual loops are shown in Tables
7 and 8 for linear PID systems. The results reveal that the gain and phase margins for
both examples are within the specified limits as proposed in Ho et al. [41].
Therefore both examples conform to the DNA stability theorem. Overall, the
FPID type II system is able to provide improved control as compared to the linear and
type I FPID.
Fig. 12 Example 2, Simulation of closed-loop system with PID and FPID controllers
Table 7 Gain and phase margins of each loop of the system with linear PID controller, Example 1
Loop No   Gain margin   Phase margin
1         2.2           32°
2         2.6           31°
3         8.1           41°
Table 8 Gain and phase margins of each loop of the system with linear PID controller, Example 2
Loop No   Gain margin   Phase margin
1         3.2           24°
2         3.0           40°
3         8.0           56°
Fig. 13 Example 2, Nyquist array and Gershgorin bands of system with linear PID controller
(panels q11 to q33)
10 Conclusions
References
1. George K.I. Mann, Bao-Gang Hu and Raymond G. Gosine. Two-Level Tuning of Fuzzy PID
Controllers. IEEE Transactions on Systems, Man and Cybernetics, Part B, 31(2), pp. 263–269,
Apr 2001
2. Jiawen Dong and Coleman B. Brosilow. Design of Robust Multivariable PID controllers via
IMC. Proceedings of the American Control Conference, 5, pp. 3380–3384, June 4–6, 1997
3. S. Yamamoto and I. Hashimoto. Present status and future needs: The view from Japanese
industry. Proceedings of the 4th International Conference on Chemical Process Control, in
I. Arkun and I. Ray, Eds., New York: AIChE, 1991
4. J.G. Ziegler and N.B. Nichols. Optimum settings for automatic controllers. Trans. ASME, 64,
pp. 759–768, 1942
5. A. Niederlinski. A Heuristic Approach to the Design of Linear Multivariable Interacting
Control Systems. Automatica, 7, pp. 691–701, 1971
6. William L. Luyben. Simple Method for Tuning SISO Controllers in Multivariable Systems.
Ind. Eng. Chem. Process Des. Dev, 25(3), pp. 654–660, July 1986
7. D. Chen and D.E. Seborg. Multiloop PI/PID controller design based on Gershgorin bands.
Proceedings of the American Control Conference, 5, pp. 4122–4127, June 25-27, 2001
8. M. Witcher and T.J. McAvoy. Interacting Control Systems: Steady State and Dynamic Mea-
surement of Interaction. ISA Transactions, 16(3), pp. 35–41, 1977
9. Karl Johan Astrom, Karl Henrik Johansson and Qing-Guo Wang. Design of decoupled PID
controllers for MIMO systems. Proceedings of the American Control Conference, 3, pp. 2015–
2020, June 2001
10. Jietae Lee and Thomas F. Edgar. Interaction measure for decentralized control of multivari-
able processes. Proceedings of the American Control Conference, Anchorage, AK, United
States, 1, pp. 454–458, May 2002
11. M.H. Moradi, M.R. Katebi and M.A. Johnson. The MIMO Predictive PID Controller Design.
Asian Journal of Control, 4(4), pp. 452–463, Dec 2002
12. D.E. Rivera, S.M. Morari and S. Skogestad. Internal Model Control 4. PID Controller Design.
Ind. Eng. Chem. Proc. Des. Dev., 25(1), pp. 252–265, Jan 1986
13. J. Lieslehto, J.T. Tanttu and H.N. Koivo. An Expert System for Multivariable Controller
Design. Automatica, 29(4), pp. 953–968, 1993
14. R. Sehab, M. Remy and C. Renotte. An approach to design fuzzy PI supervisor for a nonlinear
system. IFSA World Congress and 20th NAFIPS International Conference, 2001. Joint 9th, 2,
pp. 894–899, July 25–28, 2001
15. A. Selk Ghafari and A. Alasty. Design and real-time experimental implementation of gain
scheduling PID fuzzy controller for hybrid stepper motor in micro-step operation. Proceedings
of the IEEE International Conference on Mechatronics ICM ’04., pp. 421–426, June 2004
16. E.H. Mamdani. Application of fuzzy algorithms for control of simple dynamic plant.
Proceedings of the Institution of Electrical Engineers, 121(12), pp. 1585–1588, 1974
17. F.L. Lewis and Kai Liu. Towards a paradigm for fuzzy logic control. Automatica, 32(2),
pp. 167–181, Feb 1996
18. A. Rahmati, F. Rashidi and M. Rashidi. A hybrid fuzzy logic and PID controller for control of
nonlinear HVAC systems. IEEE Transactions on Systems, Man and Cybernetics, 3, pp. 2249–
2254, Oct 2003
19. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Academic Press,
1980
20. M. Sugeno. Industrial Application of Fuzzy Control. North-Holland, Amsterdam, The
Netherlands. 1985
21. Han Xiong Li and Shaocheng Tong. A hybrid adaptive fuzzy control for a class of nonlinear
MIMO systems. Fuzzy Systems, IEEE Transactions on, 11(1), pp. 24–34, Feb 2003
22. Timothy J. Ross. Fuzzy Logic with Engineering Applications. Wiley, Chichester, UK, 2nd
edition, 2004
23. G.I. Eduardo and M.R. Hiram. Fuzzy multivariable control of a class of a biotechnology
process. Proceedings of the IEEE International Symposium on Industrial Electronics, 1,
pp. 419–424, July 1999
24. Chieh-Li Chen and Pey-Chung Chen. Application of fuzzy logic controllers in single-loop
tuning of multivariable system design. Computers in Industry, 17(1), pp. 33–41, 1991
25. B. Wayne Bequette. Process Control Modeling, Design and Simulation. Prentice-Hall of India,
2003
26. Shaoyuan Li, Hongbo Liu, Wen-Jian Cai, Yeng-Chai Soh and Li-Hua Xie. A new coordinated
control strategy for boiler-turbine system of coal-fired power plant. IEEE Transactions on
Control Systems Technology, 13(6), pp. 943–954, Nov 2005
27. Hassan B. Kazemian. The SOF-PID controller for the control of a MIMO robot arm. IEEE
Transactions on Fuzzy Systems, 10(4), pp. 523–532, Aug 2002
28. Bart Kosko. Fuzzy Engineering. Prentice-Hall, Simon & Schuster/A Viacom Company Upper
Saddle River, New Jersey, 1997
29. H.A. Malki and D. Misir. Determination of the control gains of a fuzzy PID controller using
neural networks. Fuzzy Systems, Proceedings of the Fifth IEEE International Conference on,
2, pp. 1303–1307, Sept 1996
30. Yu Yongquan, Huang Ying, Wang Minghui, Zeng Bi and Zhong Guokun. Fuzzy neural PID
controller and tuning its weight factors using genetic algorithm based on different location
crossover. Systems, Man and Cybernetics, 2004 IEEE International Conference on, 4, pp.
3709–3713, Oct 10–13 2004
31. J.-X. Xu, C. Liu and C.C. Hang. Tuning of fuzzy PI controllers based on gain/phase margin
specifications and ITAE index. ISA Transactions, 35(1), pp. 59–91, May 1996
32. J.-X. Xu, C. Liu and C.C. Hang. Designing a stable fuzzy PI control system using extended
circle criterion. Int. J. of Intelligent Control and Systems, 1, pp. 355–366, 1996
33. S. Hayashi. Auto-tuning fuzzy PI Controller. Proceedings of the Intn’l Fuzzy Systems Associ-
ation Conference, pp. 41–44, 1991
34. H.-X. Li and H.B. Gatland. A new methodology for designing a fuzzy logic controller. IEEE
Transactions on Systems, Man and Cybernetics, 25(3), pp. 505–512, Mar 1995
35. K.J. Astrom and T. Hagglund. PID Controllers: Theory, Design and Tuning. Instrument Soci-
ety of America, Research Triangle Park, 2nd edition, NC, 1995
36. C.W. Reynolds. Flocks, Herds, and Schools: A Distributed Behavioral Model. Computer
Graphics, 21(4), pp. 25–45, July 1987
37. J.P. Martino. Technological Forecasting for Decisionmaking. Elsevier, 8(1), 1972
38. Baogang Hu, George K.I. Mann and Raymond G. Gosine. New methodology for analytical
and optimal design of fuzzy PID controllers. IEEE Transactions on Fuzzy Systems, 7(5), pp.
521–539, Oct 1999
39. H.H. Rosenbrock. State -Space and Multivariable Theory. Nelson, London, 1970
40. J.M. Maciejowski. Multivariable Feedback Design. Addison-Wesley, 1989
41. K. Ho Weng and H. Lee Tong, and Oon P. Gan. Tuning of Multiloop Proportional-Intergral-
Derivative Controllers Based on Gain and Phase Margin Specifications. Ind. Eng. Chem.
Res., 36, pp. 2231–2238, 1997
42. Weng Khuen Ho, Tong Heng Lee, Wen Xu, Jinrong R. Zhou and Ee Beng Tay. Direct Nyquist
array design of PID controllers. IEEE Transactions on Industrial Electronics, 47(1), pp 175–
185, Feb 2000
43. P.K. Roy and G. Mann and B.C. Hawlader. Fuzzy rule-adaptive model predictive control for a
multi-variable heating system. IEEE Conference on Control Applications., pp. 260–265, Aug
2005
44. E.F. Camacho and C. Bordons. Model Predictive Control. Springer-Verlag, London, 1999
Evaluation of Fuzzy Implications
and Intuitive Criteria of GMP and GMT
using MATLAB GUI
1 Introduction
Fuzzy logic (FL) is a multivalued logic used to model events or conditions that are not precisely
defined or known. The inherent approximate-reasoning capabilities of FL make it an
ideal tool for developing applications that require logical reasoning to define the
The two important fuzzy rules used in FL for approximate reasoning, or inferencing, are GMP and GMT [2]. The basic definitions of these intuitive rules are as follows:
Generalized Modus Ponens: GMP is known as the direct reasoning or forward-driven inferencing rule. It is defined by the following inference procedure:
Premise 1 : u is A′
Premise 2 : IF u is A THEN v is B
Consequence : v is B′
where A and A′ are input fuzzy sets, B and B′ are output fuzzy sets, and u and v are the linguistic variables corresponding to the input and output fuzzy sets, respectively.
The fuzzy set A′ of premise 1 can take the values A, very A, more or less A, and not A. The linguistic values very and more or less are known as hedges and are defined in terms of the membership grade as $\mu_{(\cdot)}^{2}$ and $\mu_{(\cdot)}^{1/2}$, respectively, where (·) denotes the fuzzy set A or B. Figure 1 shows the profiles of these hedges. The intuitive criteria of GMP, relating premise 1 and the consequence for any given premise 2, are listed in Table 1 [2]. The table contains seven criteria, each of which can be related to everyday reasoning. If the fundamental relation between “u is A” and “v is B” in premise 2 is not strong, then satisfaction of criteria C2-2 and C3-2 is allowed.
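The hedge definitions above can be tried on a few membership grades. The following is a minimal sketch; the function names `very`, `more_or_less` and `negate` are illustrative choices, not names used in the chapter or its MATLAB tool.

```python
import math

def very(mu):
    """Hedge "very": square of the membership grade."""
    return mu ** 2

def more_or_less(mu):
    """Hedge "more or less": square root of the membership grade."""
    return math.sqrt(mu)

def negate(mu):
    """Linguistic "not": complement of the membership grade."""
    return 1.0 - mu

# Tabulate the four possible premise-1 grades for a few values of mu_A(u).
for mu in (0.0, 0.25, 0.81, 1.0):
    print(mu, very(mu), more_or_less(mu), negate(mu))
```

For any grade in [0, 1], "very" never raises the grade and "more or less" never lowers it, which is why the two hedges lie on opposite sides of the original membership function.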
Generalized Modus Tollens: GMT, known as the indirect or backward goal-driven inferencing rule, is defined by the following inference procedure:
Premise 1 : v is B′
Premise 2 : IF u is A THEN v is B
Consequence : u is A′
Table 1 Intuitive criteria of GMP, a direct reasoning or forward goal-driven inference rule
Criterion   Premise 1 (u is A′)      Consequence (v is B′)
C1          u is A                   v is B
C2-1        u is very A              v is very B
C2-2        u is very A              v is B
C3-1        u is more or less A      v is more or less B
C3-2        u is more or less A      v is B
C4-1        u is not A               v is unknown
C4-2        u is not A               v is not B
316 S.K. Kashyap et al.
Fig. 2 Linguistic variables (hedges) “not very” and “not more or less”
Table 2 Intuitive criteria of GMT, an indirect reasoning or backward goal-driven inference rule
Criterion   Premise 1 (v is B′)          Consequence (u is A′)
C5          v is not B                   u is not A
C6          v is not very B              u is not very A
C7          v is not more or less B      u is not more or less A
C8-1        v is B                       u is unknown
C8-2        v is B                       u is A
The fuzzy set B′ of premise 1 can take the values not B, not very B, not more or less B, and B. The linguistic values not very and not more or less are hedges defined in terms of the membership grade as $1 - \mu_{(\cdot)}^{2}$ and $1 - \mu_{(\cdot)}^{1/2}$, respectively. Figure 2 shows the profiles of these hedges. The intuitive criteria of GMT, relating premise 1 and its consequence for any given premise 2, are listed in Table 2.
IF u is A, THEN v is B (1)
The rule has two parts: the antecedent (or premise), “IF u is A”, and the consequent, “THEN v is B”. Here, the crisp variable u, fuzzified by the set A in a universe of discourse U, is an input to the inference engine, whereas the crisp variable v, represented by the set B in a universe of discourse V, is an output from the inference engine. The formula used to compute the fuzzified output is given by
B = R◦A (2)
The following seven standard interpretations of the fuzzy IF-THEN rule exist, based on
intuitive criteria or classical logic, to define the fuzzy implication.
• Fuzzy conjunction (FC):
In this report, a procedure is developed for investigating the consequences obtained when the fuzzy implication methods of Equations (12)–(18) (except Equation (16)) are applied in the fuzzy inference process, and for checking whether these consequences match any of the intuitive criteria (see Tables 1 and 2) of GMP and GMT, the two ideal inference rules of day-to-day reasoning. MATLAB and its GUI are used to speed up the investigation by presenting the output as plots or numerical results. In this section, we establish the formulas that form the backbone for evaluating the fuzzy implication methods against the intuitive criteria of GMP and GMT.
The following formula is required to compute the consequences (so that they can be compared with the consequences of GMP) when the fuzzy implication methods are applied in the fuzzy inference process:
$$B' = R \circ A' \tag{19}$$
Here $\circ$ denotes the sup-star composition, with the min operator used for “star”. Thus, Equation (19), in terms of membership functions, is given by
$$\mu_{B'}(v) = \sup_{u \in U} \left\{ \min\left[ \mu_{A \to B}(u,v),\ \mu_{A'}(u) \right] \right\} \tag{20}$$
where $\mu_{A \to B}(u,v)$ is a fuzzy implication method from Equations (12)–(18) and $\mu_{A'}(u)$ is the premise 1 of GMP (see Table 1), containing any one of the following: $\mu_{A'}(u) = \mu_A(u)$, $\mu_{A'}(u) = \mu_A^2(u)$, $\mu_{A'}(u) = \sqrt{\mu_A(u)}$, or $\mu_{A'}(u) = 1 - \mu_A(u)$. It is assumed that the fuzzy sets A and B are normalized, i.e. their membership grades fall between 0 and 1.
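The sup-min composition for GMP can be approximated numerically. The sketch below is an illustration, not the chapter's MATLAB code: it replaces the supremum over u by a maximum over a uniform grid of $\mu_A(u)$ values, which assumes that $\mu_A(u)$ sweeps the whole interval [0, 1]. The names `sup_min_gmp` and `morfi` are hypothetical.

```python
def sup_min_gmp(implication, premise, mu_b, steps=1000):
    """Approximate sup over u of min(mu_{A->B}(u, v), mu_{A'}(u)).

    Assumption: mu_A(u) takes every value in [0, 1], so the supremum over u
    is replaced by a maximum over a uniform grid of mu_A values.
    """
    best = 0.0
    for i in range(steps + 1):
        mu_a = i / steps
        best = max(best, min(implication(mu_a, mu_b), premise(mu_a)))
    return best

def morfi(mu_a, mu_b):
    """Mini-operation rule: min(mu_A, mu_B)."""
    return min(mu_a, mu_b)

def premise_c1(mu_a):
    """Premise 1 of criterion C1: mu_A'(u) = mu_A(u)."""
    return mu_a

# Consequence for one fixed output grade mu_B(v) = 0.3.
print(sup_min_gmp(morfi, premise_c1, 0.3))
```

A finer grid (larger `steps`) tightens the approximation when the supremum is attained at an intersection point that does not fall exactly on the grid.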
Similarly, in GMT the following formula is required to compute the consequences (so that they can be compared with the consequences of GMT) when the fuzzy implication methods are applied in the fuzzy inference process:
$$A' = R \circ B' \tag{21}$$
$$\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \mu_{A \to B}(u,v),\ \mu_{B'}(v) \right] \right\} \tag{22}$$
where $\mu_{B'}(v)$ is the premise 1 of GMT (see Table 2), containing any one of the following: $\mu_{B'}(v) = 1 - \mu_B(v)$, $\mu_{B'}(v) = 1 - \mu_B^2(v)$, $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$, or $\mu_{B'}(v) = \mu_B(v)$.
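The GMT composition differs from the GMP one only in the variable over which the supremum is taken. A minimal sketch, under the same grid assumption as for GMP but now over $\mu_B(v)$; `sup_min_gmt` and the example premise are illustrative names, and the single evaluation shown makes no claim about which GMT criterion is met.

```python
def sup_min_gmt(implication, premise, mu_a, steps=1000):
    """Approximate sup over v of min(mu_{A->B}(u, v), mu_{B'}(v)).

    Assumption: mu_B(v) takes every value in [0, 1], so the supremum over v
    becomes a maximum over a uniform grid of mu_B values.
    """
    best = 0.0
    for j in range(steps + 1):
        mu_b = j / steps
        best = max(best, min(implication(mu_a, mu_b), premise(mu_b)))
    return best

def morfi(mu_a, mu_b):
    """Mini-operation rule: min(mu_A, mu_B)."""
    return min(mu_a, mu_b)

def premise_not_b(mu_b):
    """Premise 1 "v is not B": mu_B'(v) = 1 - mu_B(v)."""
    return 1.0 - mu_b

# Resulting grade mu_A'(u) for one fixed input grade mu_A(u) = 0.8.
print(sup_min_gmt(morfi, premise_not_b, 0.8))
```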
and Figures 4–5 show the panel for selecting premise 1 of GMP and GMT criteria
to be applied to the selected implication method.
The following steps are used to check whether the criteria are satisfied, using MATLAB/Graphics:
Step 1: Generation of 2D plots of selected implication method
Consider a fuzzy input set A and output set B with following membership grades:
2D plots are generated by taking one value of Equation (24) at a time for the entire set of values $\mu_A(u)$ of Equation (23) and applying them to the selected implication method of Equations (12)–(18). In these plots the X-axis is $\mu_A(u)$ and the Y-axis is $\mu_{A \to B}(u,v)$ for each value of $\mu_B(v)$. Figures 6–11 show the 2D plots of the various implication methods. In these figures, the symbol coding indicates the values of the fuzzy implication methods computed by varying $\mu_B$ between 0 and 1 in fixed steps of 0.05. Infinitely many such curves are possible if the step in $\mu_B$ is made arbitrarily small; however, as a proof of concept, the values shown in the aforementioned figures are sufficient and easy to visualize.
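Step 1 amounts to evaluating the chosen implication on a grid. The sketch below builds one curve per $\mu_B$ level in steps of 0.05, as described above; the 101-point grid for $\mu_A(u)$ is an assumption (the source grid is not reproduced here), and `implication_curves` is a hypothetical helper name. Plotting is left out so that the example stays dependency-free.

```python
def implication_curves(implication, mu_b_step=0.05, mu_a_points=101):
    """Return {mu_B level: [implication value for each mu_A grid point]}.

    mu_B runs from 0 to 1 in steps of mu_b_step; the mu_A grid size is an
    assumed value chosen only for illustration.
    """
    n_levels = round(1.0 / mu_b_step)
    curves = {}
    for j in range(n_levels + 1):
        mu_b = j / n_levels
        curves[mu_b] = [implication(i / (mu_a_points - 1), mu_b)
                        for i in range(mu_a_points)]
    return curves

def porfi(mu_a, mu_b):
    """Product-operation rule: mu_A * mu_B."""
    return mu_a * mu_b

curves = implication_curves(porfi)
print(len(curves), "curves, each with", len(curves[0.5]), "points")
```

Each list in `curves` corresponds to one symbol-coded trace of the 2D plots: the X-axis values are the grid points of $\mu_A(u)$ and the list holds the Y-axis values.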
In Steps 2–3, the GMP and GMT criteria are applied to the implication methods, and the consequences are obtained visually as well as analytically.
Step 2: The premise 1 of each GMP criterion, C1 to C4-2, is applied in turn to the implication methods. Let us first take the implication method “MORFI” for investigation under the heading “MORFI: C#”, where # is a criterion index: 1, 2-1, 2-2, 3-1, 3-2, 4-1, or 4-2.
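The comparison of a computed consequence with the right-hand column of Table 1 can also be automated. The following is a rough sketch under stated assumptions: the supremum is approximated on a grid, “unknown” is read as a consequence of 1 for every $\mu_B(v)$, and a match is accepted within a tolerance `tol`. All names (`consequence`, `CANDIDATES`, `match_consequence`) are hypothetical.

```python
import math

def consequence(implication, premise, mu_b, steps=2000):
    """Grid approximation of the sup-min composition for one mu_B(v)."""
    return max(min(implication(i / steps, mu_b), premise(i / steps))
               for i in range(steps + 1))

# Candidate consequences from Table 1, as functions of mu_B(v).
CANDIDATES = {
    "B": lambda b: b,
    "very B": lambda b: b ** 2,
    "more or less B": lambda b: math.sqrt(b),
    "not B": lambda b: 1.0 - b,
    "unknown": lambda b: 1.0,
}

def match_consequence(implication, premise, tol=1e-3):
    """Return the first candidate matched for all mu_B levels, else None."""
    levels = [j / 20 for j in range(21)]
    for name, target in CANDIDATES.items():
        if all(abs(consequence(implication, premise, b) - target(b)) <= tol
               for b in levels):
            return name
    return None

morfi = lambda a, b: min(a, b)
arfi = lambda a, b: min(1.0, 1.0 - a + b)

print(match_consequence(morfi, lambda a: a ** 2))    # premise "very A"
print(match_consequence(arfi, lambda a: 1.0 - a))    # premise "not A"
print(match_consequence(morfi, lambda a: 1.0 - a))   # premise "not A"
```

A return value of `None` means the consequence matches none of the listed targets over the sampled levels, which corresponds to a criterion pair not being satisfied.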
MORFI: C1: $\mu_{A'}(u) = \mu_A(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. We first interpret the “min” operation of Equation (20) by superimposing the 2D view of the implication method $\mu_{A \to B}(u,v)$ (Figure 6) and premise 1, $\mu_{A'}(u)$ (Table 1). Figures 12 and 13 (for only one value of $\mu_B(v)$) show this superimposition. It is observed that $\mu_{A'}(u)$ is always larger than or equal to $\mu_{A \to B}(u,v)$ for any value of $\mu_A(u)$, which means that the outcome of the “min” operation is $\mu_{A \to B}(u,v)$ itself, i.e. Figure 6. It is also observed from Figures 12 and 13 that $\mu_{A \to B}(u,v) = \min(\mu_A(u), \mu_B(v))$ converges to $\mu_B(v)$ (also the maximal value of $\mu_{A \to B}(u,v)$) for $\mu_A(u) \ge \mu_B(v)$; hence the supremum of $\mu_{A \to B}(u,v)$ is $\mu_B(v)$, i.e. $\mu_{B'}(v) = \mu_B(v)$. Therefore, it is concluded that MORFI satisfies the intuitive criterion C1 of GMP (see Table 1). This is also proved analytically below:
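$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \min\{\mu_A(u), \mu_B(v)\},\ \mu_A(u) \right] \right\} \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\{\mu_A(u), \mu_A(u)\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v), \mu_A(u)\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \mu_B(v); & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{25}
$$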
It is observed from the above equations that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$ and $y_2$. The outcome starts with $y_1$, which increases to a maximum value of $\mu_B(v)$ as $\mu_A(u)$ increases from zero to $\mu_B(v)$. The branch $y_2$ starts from the maximum value of $y_1$ and stays constant at that value for any further increase in $\mu_A(u)$. Hence the supremum is $y_2$, i.e. $\mu_{B'}(v) = \mu_B(v)$.
MORFI: C2-1/C2-2: $\mu_{A'}(u) = \mu_A^2(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. Figures 14 and 15 illustrate the superimposed plots of $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$. It is observed that the area below the intersection point of $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ corresponds to the “min” operation of Equation (20). It is also noticed that the suprema of the resultant area are those intersection points, whose values equal $\mu_B(v)$. Therefore, it is concluded that MORFI satisfies the intuitive criterion C2-2 (not C2-1) of GMP. The analytical proof is given below:
$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \min\left[ \min\{\mu_A(u), \mu_B(v)\},\ \mu_A^2(u) \right] \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\{\mu_A(u), \mu_A^2(u)\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v), \mu_A^2(u)\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A^2(u), \text{ since } \mu_A^2(u) \le \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\ \left.\begin{array}{ll} y_{21} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\ y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)} \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{26}
$$
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{21}$, and $y_{22}$. Since $\sqrt{\mu_B(v)} > \mu_B(v)$, $y_1$ and $y_{21}$ can be treated as one branch with the value $\mu_A^2(u)$ for $\mu_A(u) \le \sqrt{\mu_B(v)}$. The outcome starts with $y_1$/$y_{21}$, which increases to a maximum value of $\mu_B(v)$ as $\mu_A(u)$ increases from zero to $\sqrt{\mu_B(v)}$. The branch $y_{22}$ starts from the maximum value of $y_1$/$y_{21}$ and stays constant at that value for any further increase in $\mu_A(u)$. Hence the supremum is $y_{22}$, i.e. $\mu_{B'}(v) = \mu_B(v)$.
MORFI: C3-1/C3-2: $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. Figure 16 illustrates the superimposed plots of $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$. It is observed from the figure that $\mu_{A'}(u)$ is always greater than the implication $\mu_{A \to B}(u,v)$ for any value of $\mu_B(v)$; therefore the outcome of the “min” operation is $\mu_{A \to B}(u,v)$ itself. It is also observed from Figure 16 that $\mu_{A \to B}(u,v) = \min(\mu_A(u), \mu_B(v))$ converges to $\mu_B(v)$ for $\mu_A(u) \ge \mu_B(v)$; hence the supremum of $\mu_{A \to B}(u,v)$ is $\mu_B(v)$, i.e. $\mu_{B'}(v) = \mu_B(v)$. Therefore, it is concluded that MORFI satisfies the intuitive criterion C3-2 (not C3-1) of GMP. The analytical proof is given below:
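$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \min\left[ \min\{\mu_A(u), \mu_B(v)\},\ \sqrt{\mu_A(u)} \right] \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\left\{\mu_A(u), \sqrt{\mu_A(u)}\right\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\left\{\mu_B(v), \sqrt{\mu_A(u)}\right\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A(u), \text{ since } \mu_A(u) \le \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \mu_B(v), \text{ since } \mu_A(u) > \mu_B(v) \text{ and } \mu_A(u) < \sqrt{\mu_A(u)}, & \text{hence } \sqrt{\mu_A(u)} > \mu_B(v) \end{cases}
\end{aligned}
\tag{27}
$$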
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$ and $y_2$. The outcome starts with $y_1$, which increases to a maximum value of $\mu_B(v)$ as $\mu_A(u)$ increases from zero to $\mu_B(v)$. The branch $y_2$ starts from the maximum value of $y_1$ and stays constant at that value for any further increase in $\mu_A(u)$. Hence the supremum is $y_2$, i.e. $\mu_{B'}(v) = \mu_B(v)$.
MORFI: C4-1/C4-2: $\mu_{A'}(u) = 1 - \mu_A(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. Figure 17 illustrates the superimposed plots of $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$. It is noticed from the figure that $\mu_{A'}(u)$ first intersects $\mu_{A \to B}(u,v)$ at $\mu_A(u) = \mu_B(v) = 0.5$, the point at which the outcome of the “min” operation is at its peak and equals the maximal value that the consequence $\mu_{B'}(v)$ can achieve. The other values of $\mu_{B'}(v)$ are the subsequent intersection points (below 0.5) of $\mu_{A'}(u)$ with $\mu_{A \to B}(u,v)$. Hence $\mu_{B'}(v)$ follows $\mu_B(v)$ up to a ceiling of 0.5; in other words, $\mu_{B'}(v) = \min(0.5, \mu_B(v))$, that is, $\mu_{B'}(v) = 0.5 \cap \mu_B(v)$. Note that MORFI therefore does not satisfy the intuitive criteria C4-1/C4-2 of GMP. The analytical proof is given below:
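$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \min\{\mu_A(u), \mu_B(v)\},\ 1 - \mu_A(u) \right] \right\} \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\{\mu_A(u), 1 - \mu_A(u)\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v), 1 - \mu_A(u)\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \mu_A(u); & \text{for } \mu_A(u) \le 0.5 \\ y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \mu_B(v); & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{28}
$$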
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_{11}$, $y_{12}$, $y_{21}$ and $y_{22}$, which can be divided into two regions: the first consists of $y_{11}$, $y_{21}$, and $y_{22}$ when $\mu_B(v) < 0.5$, and the second consists of $y_{11}$ and $y_{12}$ when $\mu_B(v) \ge 0.5$. In the first region, the outcome starts with $y_{11}$, which increases to a maximum value of $\mu_B(v)$ as $\mu_A(u)$ increases; $y_{22}$ then holds that maximum value as long as $\mu_A(u) < 1 - \mu_B(v)$, after which $y_{21}$ takes over and decreases from $\mu_B(v)$ to zero. Hence the supremum in this region is $\mu_B(v)$, with a value less than 0.5. In the second region, the outcome begins with $y_{11}$, which increases to a maximum value of 0.5; from there $y_{12}$ takes over and decreases to zero with any further increase of $\mu_A(u)$. Hence the supremum of this region is 0.5, which is also the maximum value that the consequence $\mu_{B'}(v)$ can achieve; otherwise it is $\mu_B(v) < 0.5$. In other words, $\mu_{B'}(v) = \min(0.5, \mu_B(v))$, that is, $\mu_{B'}(v) = 0.5 \cap \mu_B(v)$.
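The piecewise branches of the “not A” case can be cross-checked against the direct expression. The sketch below is illustrative only: it assumes $\mu_A(u)$ sweeps [0, 1] on a uniform grid, and the function names `direct`, `piecewise` and `supremum` are hypothetical.

```python
def direct(mu_a, mu_b):
    """min[ min(mu_A, mu_B), 1 - mu_A ]: MORFI combined with premise "not A"."""
    return min(min(mu_a, mu_b), 1.0 - mu_a)

def piecewise(mu_a, mu_b):
    """Branches y11, y12, y21, y22 of the analytical derivation."""
    if mu_a <= mu_b:
        return mu_a if mu_a <= 0.5 else 1.0 - mu_a       # y11 / y12
    return 1.0 - mu_a if mu_a >= 1.0 - mu_b else mu_b     # y21 / y22

def supremum(mu_b, steps=1000):
    """Grid approximation of the supremum over mu_A in [0, 1]."""
    return max(piecewise(i / steps, mu_b) for i in range(steps + 1))

grid = [i / 1000 for i in range(1001)]
for mu_b in (0.2, 0.5, 0.8):
    # The branch form and the direct form agree pointwise (up to rounding).
    assert all(abs(direct(a, mu_b) - piecewise(a, mu_b)) < 1e-12 for a in grid)
    print(mu_b, supremum(mu_b), min(0.5, mu_b))
```

The two printed columns compare the grid supremum with $\min(0.5, \mu_B(v))$ for each sampled level.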
As with MORFI, the implication method PORFI is investigated by applying the intuitive criteria of GMP.
PORFI: C1: $\mu_{A'}(u) = \mu_A(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 18 that $\mu_{A'}(u)$ is always equal to or greater than the implication $\mu_{A \to B}(u,v)$ for any value of $\mu_B(v)$; therefore the outcome of the “min” operation is $\mu_{A \to B}(u,v)$ itself. It is also noticed that the supremum of $\mu_{A \to B}(u,v)$ turns out to be $\mu_B(v)$. Therefore, it is concluded that PORFI satisfies the intuitive criterion C1 of GMP. The analytical proof is given below:
PORFI: C2-1/C2-2: $\mu_{A'}(u) = \mu_A^2(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figures 19 and 20 that the point of intersection of $\mu_{A \to B}(u,v)$ and $\mu_A^2(u)$ is at $\mu_A(u)\mu_B(v) = \mu_A^2(u)$, i.e. $\mu_B(v) = \mu_A(u)$. Hence the outcome of the “min” operation equals $\mu_A^2(u)$ for $\mu_B(v) \ge \mu_A(u)$ and $\mu_A(u)\mu_B(v)$ for $\mu_B(v) < \mu_A(u)$. It is also noticed that as $\mu_A(u)$ tends to unity, the outcome of min converges to $\mu_B(v)$, which is also the largest value of the min operation. Hence the supremum of the “min” operation becomes $\mu_B(v)$, i.e. $\mu_{B'}(v) = \mu_B(v)$. Therefore, PORFI satisfies the C2-2 (not the C2-1) criterion of GMP. The analytical proof is given below:
$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \min\left[ \mu_A(u)\mu_B(v),\ \mu_A^2(u) \right] \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A^2(u); & \text{for } \mu_A^2(u) \le \mu_A(u)\mu_B(v), \text{ i.e. } \mu_A(u) \le \mu_B(v) \\ y_2 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \mu_B(v); \quad (\text{since } \mu_A^2(u) < \mu_A(u)\mu_B(v) \text{ and } \mu_A(u)\mu_B(v) \text{ tends to } \mu_B(v) \text{ as } \mu_A(u) \to 1)
\end{aligned}
\tag{30}
$$
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$ and $y_2$. The branch $y_1$ increases to a maximum value of $\mu_B^2(v)$ as $\mu_A(u)$ increases; $y_2$ then starts from that value and increases further to a new maximum value of $\mu_B(v)$ as $\mu_A(u) \to 1$. Since $\mu_B^2(v) < \mu_B(v)$, the supremum of the outcome of the “min” operation is $\mu_B(v)$, i.e. $\mu_{B'}(v) = \mu_B(v)$.
PORFI: C3-1/C3-2: $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 21 that $\mu_{A'}(u)$ is always greater than the implication $\mu_{A \to B}(u,v)$ for any value of $\mu_B(v)$; therefore the outcome of the “min” operation is $\mu_{A \to B}(u,v)$ itself. The supremum of $\mu_{A \to B}(u,v)$ is thus $\mu_B(v)$, i.e. $\mu_{B'}(v) = \mu_B(v)$. Therefore, PORFI satisfies the C3-2 (not C3-1) criterion of GMP. The analytical proof is given below:
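$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \min\left[ \mu_A(u)\mu_B(v),\ \sqrt{\mu_A(u)} \right] \\
&= \sup_{u \in U} \{\mu_A(u)\mu_B(v)\}; \quad (\text{since } \sqrt{\mu_A(u)} > \mu_A(u) \text{ and } \mu_A(u)\mu_B(v) < \mu_A(u)) \\
&= \mu_B(v); \quad (\mu_A(u)\mu_B(v) \text{ tends to } \mu_B(v) \text{ as } \mu_A(u) \to 1)
\end{aligned}
\tag{31}
$$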
Since $\mu_{B'}(v) = \mu_A(u)\mu_B(v) = 1 - \mu_A(u)$, solving $\mu_A(u)\mu_B(v) = 1 - \mu_A(u)$ gives $\mu_A(u) = \dfrac{1}{1 + \mu_B(v)}$. Therefore, $\mu_{B'}(v) = \dfrac{1}{1 + \mu_B(v)}\,\mu_B(v)$, or $\dfrac{\mu_B(v)}{1 + \mu_B(v)}$. Therefore, PORFI does not satisfy the C4-1/C4-2 criteria of GMP. The analytical proof is given below:
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$ and $y_2$. The branch $y_1$ increases to a maximum value of $\dfrac{\mu_B(v)}{1 + \mu_B(v)}$ as $\mu_A(u)$ increases from zero to $\dfrac{1}{1 + \mu_B(v)}$; $y_2$ then starts from that maximum value and decreases with any further increase of $\mu_A(u)$. Hence, the supremum of the outcome of the “min” operation is $\mu_{B'}(v) = \dfrac{\mu_B(v)}{1 + \mu_B(v)}$.
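The closed form for PORFI under the premise “not A” can be compared with a brute-force evaluation. A minimal sketch, assuming $\mu_A(u)$ sweeps [0, 1] on a uniform grid; the helper names are illustrative and the grid result is only accurate to roughly the grid spacing.

```python
def porfi(mu_a, mu_b):
    """Product-operation rule: mu_A * mu_B."""
    return mu_a * mu_b

def sup_min(implication, premise, mu_b, steps=20000):
    """Grid approximation of sup over mu_A of min(implication, premise)."""
    return max(min(implication(i / steps, mu_b), premise(i / steps))
               for i in range(steps + 1))

def porfi_not_a_closed_form(mu_b):
    """Value at the intersection mu_A * mu_B = 1 - mu_A."""
    return mu_b / (1.0 + mu_b)

for mu_b in (0.1, 0.5, 1.0):
    numeric = sup_min(porfi, lambda a: 1.0 - a, mu_b)
    print(f"mu_B={mu_b:.1f}  grid={numeric:.4f}  closed={porfi_not_a_closed_form(mu_b):.4f}")
```

Since $\mu_B(v)/(1 + \mu_B(v))$ equals neither 1 nor $1 - \mu_B(v)$ in general, the comparison illustrates why neither C4-1 nor C4-2 is met.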
ARFI: C1: $\mu_{A'}(u) = \mu_A(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figures 24 and 25 that $\mu_{A \to B}(u,v)$ equals unity as long as $\mu_A(u)$ does not exceed $\mu_B(v)$; otherwise $\mu_{A \to B}(u,v)$ equals $1 - \mu_A(u) + \mu_B(v)$. It is also noticed that the curve $\mu_{A'}(u)$ intersects only the branch $1 - \mu_A(u) + \mu_B(v)$; therefore the suprema of the outcome of the “min” operation between the curves $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ are their intersection points, obtained by solving the following equality:
$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_{A'}(u) = \mu_A(u) = \frac{1 + \mu_B(v)}{2}$$
Solving $1 - \mu_A(u) + \mu_B(v) = \mu_A(u)$ gives $\mu_A(u) = \dfrac{1 + \mu_B(v)}{2}$. Based on the consequence $\mu_{B'}(v)$ obtained, it is concluded that ARFI does not satisfy the C1 criterion of GMP. The analytical proof is given below:
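$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \min\{1,\ 1 - \mu_A(u) + \mu_B(v)\},\ \mu_A(u) \right] \right\} \\
&= \sup_{u \in U} \left\{ \min\left[ 1 - \mu_A(u) + \mu_B(v),\ \mu_A(u) \right] \right\} \quad (\text{for } \mu_A(u) > \mu_B(v)) \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_{A'}(u) = \mu_A(u); & \text{for } \mu_A(u) \le \dfrac{1 + \mu_B(v)}{2} \\[6pt] y_2 = 1 - \mu_A(u) + \mu_B(v); & \text{for } \mu_A(u) > \dfrac{1 + \mu_B(v)}{2} \end{cases}
\end{aligned}
\tag{33}
$$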
It is observed from the above equation that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ is either $y_2 = 1 - \mu_A(u) + \mu_B(v)$ or $y_1 = \mu_{A'}(u)$. Since $\mu_{A'}(u)$ increases with $\mu_A(u)$, whereas $1 - \mu_A(u) + \mu_B(v)$ decreases, the supremum of the “min” operation is the point of intersection of $y_1$ and $y_2$, i.e. $\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_{A'}(u)$, or $\mu_{B'}(v) = \dfrac{1 + \mu_B(v)}{2}$.
ARFI: C2-1/C2-2: $\mu_{A'}(u) = \mu_A^2(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 26 that the suprema of the outcome of the “min” operation between the curves $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ are the intersection points of these curves for any given value of $\mu_B(v)$. These intersection points are obtained by solving the following equality:
$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_{A'}(u) = \mu_A^2(u)$$
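$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \min(1,\ 1 - \mu_A(u) + \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \\
&= \sup_{u \in U} \min\left[ 1 - \mu_A(u) + \mu_B(v),\ \mu_A^2(u) \right]; \quad \text{for } \mu_A(u) > \mu_B(v) \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_{A_{\min}}(u) \\ y_2 = 1 - \mu_A(u) + \mu_B(v); & \text{for } \mu_A(u) > \mu_{A_{\min}}(u) \end{cases}
\end{aligned}
\tag{34}
$$
where $\mu_{A_{\min}}(u) = \dfrac{\sqrt{5 + 4\mu_B(v)} - 1}{2}$ is obtained by solving $1 - \mu_A(u) + \mu_B(v) = \mu_A^2(u)$.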
It is observed from the above equation that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ is either $1 - \mu_A(u) + \mu_B(v)$ or $\mu_{A'}(u)$. Since $\mu_{A'}(u)$ increases with $\mu_A(u)$, whereas $1 - \mu_A(u) + \mu_B(v)$ decreases, the supremum of the “min” operation is the point of intersection of $y_1$ and $y_2$, i.e.
$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_{A'}(u), \quad \text{or} \quad \mu_{B'}(v) = \frac{3 + 2\mu_B(v) - \sqrt{5 + 4\mu_B(v)}}{2}.$$
ARFI: C3-1/C3-2: $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 27 that the suprema of the outcome of the “min” operation between the curves $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ are the intersection points of these curves for any given value of $\mu_B(v)$. These intersection points are obtained by solving the following equality:
$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_{A'}(u) = \sqrt{\mu_A(u)}$$
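Since $\sqrt{\mu_A(u)} = 1 - \mu_A(u) + \mu_B(v)$, squaring gives $\mu_A(u) = [1 - \mu_A(u) + \mu_B(v)]^2$, or $\mu_A^2(u) - (3 + 2\mu_B(v))\mu_A(u) + \mu_B^2(v) + 2\mu_B(v) + 1 = 0$, whose admissible root is the $\mu_{A_{\min}}(u)$ given after Equation (35).
$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \min(1,\ 1 - \mu_A(u) + \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \\
&= \sup_{u \in U} \min\left[ 1 - \mu_A(u) + \mu_B(v),\ \sqrt{\mu_A(u)} \right]; \quad \text{for } \mu_A(u) > \mu_B(v) \\
&= \sup_{u \in U} \begin{cases} y_1 = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_{A_{\min}}(u) \\ y_2 = 1 - \mu_A(u) + \mu_B(v); & \text{for } \mu_A(u) > \mu_{A_{\min}}(u) \end{cases}
\end{aligned}
\tag{35}
$$
where $\mu_{A_{\min}}(u) = \dfrac{3 + 2\mu_B(v) - \sqrt{5 + 4\mu_B(v)}}{2}$ is obtained by solving $1 - \mu_A(u) + \mu_B(v) = \sqrt{\mu_A(u)}$.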
It is observed from the above equation that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ is either $1 - \mu_A(u) + \mu_B(v)$ or $\mu_{A'}(u)$. Since $\mu_{A'}(u)$ increases with $\mu_A(u)$, whereas $1 - \mu_A(u) + \mu_B(v)$ decreases, the supremum of the “min” operation is the point of intersection of $y_1$ and $y_2$, i.e. $\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \sqrt{\mu_A(u)}$, or $\mu_{B'}(v) = \dfrac{\sqrt{5 + 4\mu_B(v)} - 1}{2}$.
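The two ARFI results for the hedged premises can be checked against each other with a short calculation. The abbreviation $s$ below is introduced only for this check and is not notation from the chapter.

```latex
% Shorthand (assumption for this check only): s = \sqrt{5 + 4\mu_B(v)}.
\begin{align*}
\text{Premise very } A:\quad
  & \mu_A(u) = \frac{s-1}{2}
  \;\Rightarrow\;
  \mu_{B'}(v) = \mu_A^2(u) = \frac{s^2 - 2s + 1}{4}
              = \frac{6 + 4\mu_B(v) - 2s}{4}
              = \frac{3 + 2\mu_B(v) - s}{2}, \\
\text{Premise more or less } A:\quad
  & \mu_A(u) = \frac{3 + 2\mu_B(v) - s}{2}
  \;\Rightarrow\;
  \mu_{B'}(v) = \sqrt{\mu_A(u)} = \frac{s-1}{2}.
\end{align*}
% End-point check: \mu_B(v) = 1 gives s = 3, so both consequences equal 1;
% \mu_B(v) = 0 gives s = \sqrt{5}, i.e. (3-\sqrt{5})/2 and (\sqrt{5}-1)/2.
```

The intersection abscissa of one case is the consequence of the other, which is a quick way to confirm that the two closed forms are mutually consistent.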
ARFI: C4-1/C4-2: $\mu_{A'}(u) = 1 - \mu_A(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. We see from Figure 28 that $\mu_{A \to B}(u,v)$ is always greater than or equal to $\mu_{A'}(u)$; thus the outcome of the “min” operation is always $\mu_{A'}(u)$, i.e. $1 - \mu_A(u)$, for any value of $\mu_B(v)$. Since the supremum of $1 - \mu_A(u)$ is always unity, $\mu_{B'}(v) = 1$. Therefore, ARFI satisfies the C4-1 (not C4-2) criterion of GMP. The analytical proof is given below:
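$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \min(1,\ 1 - \mu_A(u) + \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \\
&= \sup_{u \in U} \left\{ \min\left[ 1 - \mu_A(u) + \mu_B(v),\ \mu_{A'}(u) \right] \right\}; \quad \text{for } \mu_A(u) > \mu_B(v) \\
&= \sup_{u \in U} \begin{cases} y_1 = 1 - \mu_A(u) + \mu_B(v); & \text{for } 1 - \mu_A(u) + \mu_B(v) < 1 - \mu_A(u), \text{ i.e. } \mu_B(v) < 0 \\ y_2 = \mu_{A'}(u) = 1 - \mu_A(u); & \text{for } 1 - \mu_A(u) + \mu_B(v) > 1 - \mu_A(u), \text{ i.e. } \mu_B(v) > 0 \end{cases}
\end{aligned}
\tag{36}
$$
It is observed from the above equation that $y_1$ is not valid, as $\mu_B(v) < 0$ is not possible. Hence, $\mu_{B'}(v)$ is given by
$$\mu_{B'}(v) = \sup_{u \in U} \{ y_2 = \mu_{A'}(u) = 1 - \mu_A(u); \ \text{for } \mu_B(v) > 0 \}.$$
Since the supremum of $1 - \mu_A(u)$ is always unity, $\mu_{B'}(v) = 1$.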
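The four ARFI consequences derived above can be tabulated side by side with a brute-force evaluation. This is a sketch under the assumption that $\mu_A(u)$ sweeps [0, 1] on a uniform grid; the dictionary keys are shorthand labels for the premise used, and none of the names come from the chapter's MATLAB code.

```python
import math

def arfi(mu_a, mu_b):
    """Arithmetic rule: min(1, 1 - mu_A + mu_B)."""
    return min(1.0, 1.0 - mu_a + mu_b)

def sup_min(implication, premise, mu_b, steps=20000):
    """Grid approximation of sup over mu_A of min(implication, premise)."""
    return max(min(implication(i / steps, mu_b), premise(i / steps))
               for i in range(steps + 1))

def arfi_closed_forms(mu_b):
    """Closed-form consequences for premises A, very A, more or less A, not A."""
    root = math.sqrt(5.0 + 4.0 * mu_b)
    return {
        "C1": (1.0 + mu_b) / 2.0,
        "C2": (3.0 + 2.0 * mu_b - root) / 2.0,
        "C3": (root - 1.0) / 2.0,
        "C4": 1.0,
    }

PREMISES = {
    "C1": lambda a: a,
    "C2": lambda a: a * a,
    "C3": math.sqrt,
    "C4": lambda a: 1.0 - a,
}

for mu_b in (0.0, 0.4, 1.0):
    closed = arfi_closed_forms(mu_b)
    for name, premise in PREMISES.items():
        numeric = sup_min(arfi, premise, mu_b)
        print(f"mu_B={mu_b:.1f} {name}: grid={numeric:.4f} closed={closed[name]:.4f}")
```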
MRFI: C1: $\mu_{A'}(u) = \mu_A(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 29 that the suprema of the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ are their intersection points. The first point of intersection gives 0.5 as the minimum value of the supremum; the other values of the supremum are equal to $\mu_B(v)$ where this exceeds 0.5. In other words, $\mu_{B'}(v) = \max(0.5, \mu_B(v))$, or $\mu_{B'}(v) = 0.5 \cup \mu_B(v)$. Based on the consequence $\mu_{B'}(v)$, it is concluded that MRFI does not satisfy the C1 criterion of GMP for values of $\mu_B(v)$ less than 0.5, where the consequence is 0.5 rather than $\mu_B(v)$. The analytical proof is given below:
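$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \left\{ \min\left[ \max\left[ \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right],\ \mu_A(u) \right] \right\} \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\left[ \max[\mu_A(u), 1 - \mu_A(u)],\ \mu_A(u) \right]; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\left[ \max[\mu_B(v), 1 - \mu_A(u)],\ \mu_A(u) \right]; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \min[\mu_A(u), \mu_A(u)]; & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \min[1 - \mu_A(u), \mu_A(u)]; & \text{for } \mu_A(u) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{21} = \min[\mu_B(v), \mu_A(u)]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min[1 - \mu_A(u), \mu_A(u)]; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \mu_A(u); & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \mu_A(u); & \text{for } \mu_A(u) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{21} = \min[\mu_B(v), \mu_A(u)]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min[1 - \mu_A(u), \mu_A(u)]; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\[4pt] \left.\begin{array}{l} y_{21} = \mu_B(v); \quad \text{for } \mu_A(u) \ge 1 - \mu_B(v), \text{ i.e. } \mu_B(v) \ge 0.5 \\[4pt] \left.\begin{array}{ll} y_{221} = \mu_A(u); & \text{for } \mu_A(u) \le 0.5 \\ y_{222} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 0.5 \end{array}\right\}\ \text{for } \mu_A(u) < 1 - \mu_B(v), \text{ i.e. } \mu_B(v) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{37}
$$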
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{21}$, $y_{221}$, and $y_{222}$, which are divided into two regions depending on the value of $\mu_B(v)$. The first region contains $y_1$ and $y_{21}$ for $\mu_B(v) \ge 0.5$, and the second region contains $y_{221}$ and $y_{222}$ for $\mu_B(v) < 0.5$. In the first region, the outcome of the “min” operation starts with $y_1$, which increases to a maximal value of $\mu_B(v)$ as $\mu_A(u)$ increases; $y_{21}$ then starts with that maximal value and stays there for any further increase in $\mu_A(u)$. Clearly, the supremum of this region is $\mu_B(v)$, with a value of at least 0.5.
In the second region, the outcome of the “min” operation starts with $y_{221}$, which increases to a maximal value of 0.5 as $\mu_A(u)$ increases; $y_{222}$ then starts with that maximal value and decreases with any further increase in $\mu_A(u)$. Hence, the supremum of this region is 0.5. Since the supremum of the first region is equal to or greater than the supremum of the second region, $\mu_{B'}(v) = \max(0.5, \mu_B(v))$, or $\mu_{B'}(v) = 0.5 \cup \mu_B(v)$.
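The floor of 0.5 in the MRFI consequence shows up directly in a numerical evaluation. A minimal sketch, assuming $\mu_A(u)$ sweeps [0, 1] on a uniform grid; `mrfi` and `sup_min` are illustrative names.

```python
def mrfi(mu_a, mu_b):
    """Max-min rule: max(min(mu_A, mu_B), 1 - mu_A)."""
    return max(min(mu_a, mu_b), 1.0 - mu_a)

def sup_min(implication, premise, mu_b, steps=1000):
    """Grid approximation of sup over mu_A of min(implication, premise)."""
    return max(min(implication(i / steps, mu_b), premise(i / steps))
               for i in range(steps + 1))

# Premise 1 "u is A": mu_A'(u) = mu_A(u).
for mu_b in (0.1, 0.5, 0.8):
    print(mu_b, sup_min(mrfi, lambda a: a, mu_b), max(0.5, mu_b))
```

For an output grade below 0.5 the computed consequence stays at 0.5, while for grades of 0.5 and above it follows $\mu_B(v)$.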
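MRFI: C2-1/C2-2: $\mu_{A'}(u) = \mu_A^2(u)$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figures 30 and 31 that the first supremum of the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ is the intersection point of the curves $1 - \mu_A(u)$ and $\mu_{A'}(u) = \mu_A^2(u)$, computed by solving the following equality:
$$\mu_{B'}(v) = 1 - \mu_A(u) = \mu_A^2(u)$$
Since $1 - \mu_A(u) = \mu_A^2(u)$, or $\mu_A^2(u) + \mu_A(u) - 1 = 0$, solving this equation gives $\mu_A(u) = \dfrac{-1 \pm \sqrt{1 + 4}}{2} = \dfrac{-1 \pm \sqrt{5}}{2}$. Since a membership grade must lie between 0 and 1, $\mu_A(u) = \dfrac{\sqrt{5} - 1}{2}$; thus the consequence $\mu_{B'}(v)$ is given by
$$\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{\sqrt{5} - 1}{2} = \frac{2 - \sqrt{5} + 1}{2} = \frac{3 - \sqrt{5}}{2}$$
It is also observed that $\dfrac{3 - \sqrt{5}}{2}$ is the lowest value of $\mu_{B'}(v)$, and the other values of the supremum are equal to $\mu_B(v)$ where this exceeds $\dfrac{3 - \sqrt{5}}{2}$. In other words, $\mu_{B'}(v) = \max\left( \dfrac{3 - \sqrt{5}}{2}, \mu_B(v) \right)$, or $\mu_{B'}(v) = \dfrac{3 - \sqrt{5}}{2} \cup \mu_B(v)$. Based on the consequence $\mu_{B'}(v)$, it is concluded that MRFI does not satisfy the C2-2 and C2-1 criteria of GMP. The analytical proof is given below:
$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \min\left\{ \max\left[ \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right],\ \mu_A^2(u) \right\} \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\left\{ \max[\mu_A(u), 1 - \mu_A(u)],\ \mu_A^2(u) \right\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\left\{ \max[\mu_B(v), 1 - \mu_A(u)],\ \mu_A^2(u) \right\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \min\{\mu_A(u), \mu_A^2(u)\}; & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \min\{1 - \mu_A(u), \mu_A^2(u)\}; & \text{for } \mu_A(u) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{21} = \min\{\mu_B(v), \mu_A^2(u)\}; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min\{1 - \mu_A(u), \mu_A^2(u)\}; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \mu_A^2(u); & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \mu_A^2(u); & \text{for } \mu_A(u) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{l} \left.\begin{array}{ll} y_{211} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\ y_{212} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)} \end{array}\right\}\ \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{221} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge \mu_{A_{\min}}(u) \\ y_{222} = \mu_A^2(u); & \text{for } \mu_A(u) < \mu_{A_{\min}}(u) \end{array}\right\}\ \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} y_1 = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_B(v) \\[4pt] \left.\begin{array}{l} \left.\begin{array}{ll} y_{211} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\ y_{212} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)} \end{array}\right\}\ \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{221} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge \mu_{A_{\min}}(u) \\ y_{222} = \mu_A^2(u); & \text{for } \mu_A(u) < \mu_{A_{\min}}(u) \end{array}\right\}\ \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{38}
$$
where $\mu_{A_{\min}}(u) = \dfrac{\sqrt{5} - 1}{2}$ is obtained by solving $1 - \mu_A(u) = \mu_A^2(u)$.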
It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{211}$, $y_{212}$, $y_{221}$, and $y_{222}$, which are divided into two regions depending on the value of $\mu_B(v)$. The first region contains $y_1$, $y_{211}$, and $y_{212}$ for $\mu_B(v) \ge 0.5$, and the second region contains $y_{221}$ and $y_{222}$ for $\mu_B(v) < 0.5$. In the first region, the outcome of the “min” operation starts with $y_1$ or $y_{211}$, which increases to a maximal value of $\mu_B(v)$ as $\mu_A(u)$ increases; $y_{212}$ then starts with that maximal value and stays there for any further increase in $\mu_A(u)$. Clearly, the supremum of this region is $\mu_B(v)$, with a value of at least 0.5.
In the second region, the outcome of the “min” operation starts with $y_{222}$, which increases to a maximal value of $\mu_{A_{\max}}(u)$ as $\mu_A(u)$ increases; $y_{221}$ then starts with that maximal value and decreases with any further increase in $\mu_A(u)$. Hence, the supremum of this region is $\mu_{A_{\max}}(u)$. Since the supremum of the first region is greater than the supremum of the second region, $\mu_{B'}(v) = \max(\mu_{A_{\max}}(u), \mu_B(v))$, or $\mu_{B'}(v) = \mu_{A_{\max}}(u) \cup \mu_B(v)$, where $\mu_{A_{\max}}(u) = \dfrac{3 - \sqrt{5}}{2}$.
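The intersection value used above can also be found numerically instead of by the quadratic formula. The sketch below uses plain bisection on $\mu_A^2 + \mu_A - 1 = 0$ over [0, 1]; bisection is a substitute technique chosen for illustration, and `bisect_root` and `mrfi_very_a_consequence` are hypothetical names.

```python
import math

def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

# Intersection of 1 - mu_A and mu_A^2, i.e. the root of mu_A^2 + mu_A - 1.
mu_a_star = bisect_root(lambda a: a * a + a - 1.0, 0.0, 1.0)
floor_value = 1.0 - mu_a_star       # lowest value of the consequence

def mrfi_very_a_consequence(mu_b):
    """Consequence for premise "very A": max(floor value, mu_B)."""
    return max(floor_value, mu_b)

print(mu_a_star, (math.sqrt(5.0) - 1.0) / 2.0)
print(floor_value, (3.0 - math.sqrt(5.0)) / 2.0)
```

The same routine can be pointed at $1 - \mu_A = \sqrt{\mu_A}$ to obtain the corresponding floor for the premise “more or less A”.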
MRFI: C3-1/C3-2: $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ is applied to the RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. The observations are similar to those for MRFI: C2-1/C2-2; the only difference is the minimum supremum point, at which $\mu_{A \to B}(u,v)$ and $\mu_{A'}(u)$ intersect for the first time. This point is computed by solving the following equality:
$$\mu_{B'}(v) = 1 - \mu_A(u) = \sqrt{\mu_A(u)}$$
Since $1 - \mu_A(u) = \sqrt{\mu_A(u)}$, squaring gives $1 + \mu_A^2(u) - 2\mu_A(u) = \mu_A(u)$, i.e. $\mu_A^2(u) - 3\mu_A(u) + 1 = 0$. Solving this equation, we get $\mu_A(u) = \dfrac{3 \pm \sqrt{9 - 4}}{2} = \dfrac{3 \pm \sqrt{5}}{2}$; the root within [0, 1] is $\mu_A(u) = \dfrac{3 - \sqrt{5}}{2}$. Thus $\mu_{B'}(v)$ is given by $\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \dfrac{3 - \sqrt{5}}{2} = \dfrac{2 - 3 + \sqrt{5}}{2} = \dfrac{\sqrt{5} - 1}{2}$. It is observed from Figure 32 that the other values of the supremum are equal to $\mu_B(v)$ where this exceeds $\dfrac{\sqrt{5} - 1}{2}$. Based on the consequence $\mu_{B'}(v)$, it is concluded that MRFI does not satisfy the C3-2 and C3-1 criteria of GMP. The analytical proof is given below:
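$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U} \min\left\{ \max\left[ \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right],\ \sqrt{\mu_A(u)} \right\} \\
&= \sup_{u \in U} \begin{cases} y_1 = \min\left\{ \max[\mu_A(u), 1 - \mu_A(u)],\ \sqrt{\mu_A(u)} \right\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\left\{ \max[\mu_B(v), 1 - \mu_A(u)],\ \sqrt{\mu_A(u)} \right\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \min\left\{\mu_A(u), \sqrt{\mu_A(u)}\right\}; & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \min\left\{1 - \mu_A(u), \sqrt{\mu_A(u)}\right\}; & \text{for } \mu_A(u) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{21} = \min\left\{\mu_B(v), \sqrt{\mu_A(u)}\right\}; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min\left\{1 - \mu_A(u), \sqrt{\mu_A(u)}\right\}; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U} \begin{cases} \left.\begin{array}{ll} y_{11} = \mu_A(u); & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) < 0.5 \end{array}\right\} & \text{for } \mu_A(u) \le \mu_B(v) \\[6pt] \left.\begin{array}{l} \left.\begin{array}{ll} y_{211} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B^2(v) \\ y_{212} = \mu_B(v); & \text{for } \mu_A(u) > \mu_B^2(v) \end{array}\right\}\ \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\[6pt] \left.\begin{array}{ll} y_{221} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge \mu_{A_{\min}}(u) \\ y_{222} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) < \mu_{A_{\min}}(u) \end{array}\right\}\ \text{for } \mu_A(u) < 1 - \mu_B(v) \end{array}\right\} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{39}
$$
where $\mu_{A_{\min}}(u) = \dfrac{3 - \sqrt{5}}{2}$ is obtained by solving $1 - \mu_A(u) = \sqrt{\mu_A(u)}$.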
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y211 , y212 , y221 , and y222 , which are divided into two re-
gions depending on the values of µB (v). The first region contains y11 , y12 , y211 , and
y212 for µB (v) ≥ 0.5 and the second region having y221 and y222 for µB (v) < 0.5. In
the first region, the outcome of the “min” operation starts with y211 which increases
to some maximal value with an increase in µA (u), then y12 which starts with that
maximal value and decreases to some value, then y11 which increases to a maxi-
mum value of µB (v) and y212 remains equal to that maximum value in spite of any
further increase in µA (u). Clearly, the supremum of this region is µB (v) with value
greater than 0.5.
In the second region, the outcome of the “min” operation starts with y222 which
increases to a maximal value of µAmax (u) with an increase in µA (u) and then y221
starts with that maximal value and decreases with any further increase in µA (u).
Hence, the supremum of this region is µAmax(u) only. It is observed that the
supremum of the first region is greater than the supremum of the second region,
therefore µB′(v) = max(µAmax(u), µB(v)), or µB′(v) = µAmax(u) ∪ µB(v), where
µAmax(u) = (√5 − 1)/2.
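The two MRFI results above, max((3 − √5)/2, µB(v)) for the premise "very A" and max((√5 − 1)/2, µB(v)) for "more or less A", can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes that µA(u) sweeps the whole interval [0, 1] and replaces the supremum by a maximum over a uniform grid; the names `mrfi` and `gmp_sup_min` are hypothetical helpers introduced only for this illustration.

```python
import math

def mrfi(a, b):
    """Max-min rule of fuzzy implication: max(min(a, b), 1 - a)."""
    return max(min(a, b), 1.0 - a)

def gmp_sup_min(implication, premise, b, steps=20001):
    """Grid approximation of sup over mu_A in [0, 1] of min(implication, premise)."""
    best = 0.0
    for i in range(steps):
        a = i / (steps - 1)          # assumed: mu_A(u) covers all of [0, 1]
        best = max(best, min(implication(a, b), premise(a)))
    return best

def very(a):
    return a * a                     # premise A' = very A

def more_or_less(a):
    return math.sqrt(a)              # premise A' = more or less A

low = (3 - math.sqrt(5)) / 2         # floor stated for C2-1/C2-2
high = (math.sqrt(5) - 1) / 2        # floor stated for C3-1/C3-2

for b in (0.0, 0.2, 0.5, 0.8, 1.0):
    c2 = gmp_sup_min(mrfi, very, b)
    c3 = gmp_sup_min(mrfi, more_or_less, b)
    print(f"mu_B={b:.1f}  C2: {c2:.4f} vs {max(low, b):.4f}  "
          f"C3: {c3:.4f} vs {max(high, b):.4f}")
```

The grid values agree with the closed forms only up to the grid resolution, so any comparison should use a tolerance rather than exact equality.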
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) is 1 − µA (u) only and hence the supremum of that would be the unity, i.e.
µB (v) = 1.
BRFI: C1: µA′(u) = µA(u) is applied to RHS of Equation (20) to get the
consequence µB′(v). It is observed from Figures 34 and 35 that the first supremum
of the outcome of the “min” operation between µA→B(u, v) and µA′(u) is the
intersection point of the curve 1 − µA(u) and the curve µA′(u) = µA(u), and is
computed as follows:

\[
\mu_{B'}(v) = 1 - \mu_A(u) = \mu_A(u)
\]

which gives µA(u) = 0.5, hence µB′(v) = 0.5. It is also observed from the figure
that the first supremum is the lowest among the other values of µB′(v), which are
equal to µB(v), or in other words, µB′(v) = max(0.5, µB(v)) = 0.5 ∪ µB(v). Based on
the consequence µB′(v) it is concluded that BRFI does not satisfy the C1 criterion
of GMP. The analytical proof is given below:
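\[
\mu_{B'}(v) = \sup_{u \in U} \big\{ \min\big[\max(1 - \mu_A(u),\, \mu_B(v)),\, \mu_{A'}(u)\big] \big\} \tag{41}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \min[1 - \mu_A(u),\, \mu_A(u)]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min[\mu_B(v),\, \mu_A(u)]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]
\[
= \sup_{u \in U}
\begin{cases}
\begin{cases}
y_{11} = \mu_A(u); & \text{for } \mu_A(u) \le 0.5 \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 0.5
\end{cases} & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \mu_B(v)
\end{cases} & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]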
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y21 and y22 which are divided into two regions depending
on the values of µB (v). The first region contains y11 and y12 for µB (v) ≤ 0.5 and the
second region having y21 and y22 for µB (v) > 0.5. In the first region, the outcome
of the “min” operation starts with y11 which increases to a maximal value of 0.5
with an increase in µA (u) and then y12 which starts with that maximal value and
decreases with any further increase in µA (u). Clearly, the supremum of this region
is 0.5.
In the second region, the outcome of the “min” operation starts with y21 which
increases to a maximal value of µB (v) with an increase in µA (u) and then y22 starts
with that maximal value and remains there in spite of any further increase in µA (u).
Hence, the supremum of this region is µB (v) only. It is observed that the supremum
of the second region is either equal to or greater than the supremum of first region,
therefore µB (v) = max(0.5, µB (v)) or µB (v) = 0.5 ∪ µB (v).
BRFI: C2-1/C2-2: µA′(u) = µA²(u) is applied to RHS of Equation (20) to get the
consequence µB′(v). It is observed from Figure 36 that the first supremum of the
outcome of the “min” operation between µA→B(u, v) and µA′(u) is the intersection
point of the curve 1 − µA(u) and the curve µA′(u) = µA²(u), and is computed as
follows:

\[
\mu_{B'}(v) = 1 - \mu_A(u) = \mu_A^2(u)
\]

Since 1 − µA(u) = µA²(u), or µA²(u) + µA(u) − 1 = 0, solving the above equation
we get

\[
\mu_A(u) = \frac{-1 \pm \sqrt{1 + 4}}{2} = \frac{-1 \pm \sqrt{5}}{2}.
\]

Since the membership grade cannot exceed the limits 0 and 1, µA(u) has the value
µA(u) = (√5 − 1)/2, thus the consequence µB′(v) is given by

\[
\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{\sqrt{5} - 1}{2} = \frac{2 - \sqrt{5} + 1}{2} = \frac{3 - \sqrt{5}}{2}.
\]

It is also observed that (3 − √5)/2 is the lowest value of µB′(v), and the other
values of the supremum are equal to µB(v) with values greater than (3 − √5)/2, or,
in other words, µB′(v) = max((3 − √5)/2, µB(v)). The analytical proof is given below:
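\[
\mu_{B'}(v) = \sup_{u \in U} \big\{ \min\big[\max(1 - \mu_A(u),\, \mu_B(v)),\, \mu_{A'}(u)\big] \big\} \tag{42}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \min[1 - \mu_A(u),\, \mu_A^2(u)]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min[\mu_B(v),\, \mu_A^2(u)]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]
\[
= \sup_{u \in U}
\begin{cases}
\begin{cases}
y_{11} = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_A^{\min}(u) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > \mu_A^{\min}(u)
\end{cases} & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)}
\end{cases} & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]

where µAmin(u) = (√5 − 1)/2 is obtained by solving 1 − µA(u) = µA²(u).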
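The two crossing points used in the C2 and C3 analyses come from the quadratics µA² + µA − 1 = 0 and µA² − 3µA + 1 = 0. As a quick check, the sketch below solves both and confirms that the admissible root satisfies the original, unsquared equality. The helper `unit_interval_root` is a hypothetical name introduced here and is not part of the chapter.

```python
import math

def unit_interval_root(a, b, c):
    """Return the root of a*x^2 + b*x + c = 0 that lies in [0, 1]."""
    disc = b * b - 4 * a * c
    roots = [(-b + sign * math.sqrt(disc)) / (2 * a) for sign in (1, -1)]
    return next(x for x in roots if 0.0 <= x <= 1.0)

# 1 - x = x^2       <=>  x^2 + x - 1 = 0    (premise "very A")
x_very = unit_interval_root(1, 1, -1)
# 1 - x = sqrt(x)    =>  x^2 - 3x + 1 = 0   (premise "more or less A")
x_mol = unit_interval_root(1, -3, 1)

print(x_very, 1 - x_very)   # crossing abscissa and height for C2-1/C2-2
print(x_mol, 1 - x_mol)     # crossing abscissa and height for C3-1/C3-2
```

Squaring 1 − x = √x introduces a spurious root above 1, which is why only the root inside [0, 1] is kept, in line with the remark that a membership grade cannot exceed the limits 0 and 1.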
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y21 , and y22 which are divided into two regions depend-
ing on the values of µB (v). The first region contains y11 and y12 for µB (v) ≤ 0.5
and the second region having y21 and y22 for µB (v) > 0.5. In the first region, the
outcome of the “min” operation starts with y11 which increases to a maximal value
of µAmax (u) with an increase in µA (u) and then y12 which starts with that maximal
value and decreases with any further increase in µA (u). Clearly, the supremum of
this region is µAmax (u). In the second region, the outcome of the “min” operation
starts with y21 which increases to a maximal value of µB (v) with an increase in
µA (u) and then y22 starts with that maximal value and remains there in spite of any
further increase in µA(u). Hence, the supremum of this region is µB(v) only. It is
observed that the supremum of the second region is greater than the supremum of the
first region, therefore µB′(v) = max(µAmax(u), µB(v)) or µB′(v) = µAmax(u) ∪ µB(v),
where µAmax(u) = (3 − √5)/2.
BRFI: C3-1/C3-2: µA′(u) = √µA(u) is applied to RHS of Equation (20) to get
the consequence µB′(v). Similar observations are made as for BRFI: C2-1/C2-2,
the only difference being the minimum supremum point at which µA→B(u, v) and
µA′(u) intersect for the first time. The minimum supremum point is computed by
solving the following equality:

\[
\mu_{B'}(v) = 1 - \mu_A(u) = \sqrt{\mu_A(u)}
\]

Since 1 − µA(u) = √µA(u), or 1 + µA²(u) − 2µA(u) = µA(u), i.e.
µA²(u) − 3µA(u) + 1 = 0, solving the above equation we get

\[
\mu_A(u) = \frac{3 \pm \sqrt{9 - 4}}{2} = \frac{3 \pm \sqrt{5}}{2}; \qquad \mu_A(u) = \frac{3 - \sqrt{5}}{2}.
\]

Thus µB′(v) is given by

\[
\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{3 - \sqrt{5}}{2} = \frac{2 - 3 + \sqrt{5}}{2} = \frac{\sqrt{5} - 1}{2}.
\]

It is observed from Figure 37 that the other values of the supremum are equal to
µB(v) with values greater than (√5 − 1)/2, or, in other words,
µB′(v) = max((√5 − 1)/2, µB(v)) or µB′(v) = (√5 − 1)/2 ∪ µB(v). Based on the
consequence µB′(v), it is concluded that BRFI does not satisfy the C3-2 and C3-1
criteria of GMP. The analytical proof is given below:
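\[
\mu_{B'}(v) = \sup_{u \in U} \big\{ \min\big[\max(1 - \mu_A(u),\, \mu_B(v)),\, \mu_{A'}(u)\big] \big\} \tag{43}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \min\big[1 - \mu_A(u),\, \sqrt{\mu_A(u)}\big]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min\big[\mu_B(v),\, \sqrt{\mu_A(u)}\big]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]
\[
= \sup_{u \in U}
\begin{cases}
\begin{cases}
y_{11} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_A^{\min}(u) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > \mu_A^{\min}(u)
\end{cases} & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B^2(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \mu_B^2(v)
\end{cases} & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]

where µAmin(u) = (3 − √5)/2 is obtained by solving 1 − µA(u) = √µA(u).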
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y21 and y22 , which are divided into two regions depending
on the values of µB (v). The first region contains y11 and y12 for µB (v) ≤ 0.5 and the
second region having y21 and y22 for µB (v) > 0.5. In the first region, the outcome of
the “min” operation starts with y11 which increases to a maximal value of µAmax (u)
with an increase in µA (u) and then y12 which starts with that maximal value and
decreases with any further increase in µA (u). Clearly, the supremum of this region
is µAmax (u).
In the second region, the outcome of the “min” operation starts with y21 which
increases to a maximal value of µB(v) with an increase in µA(u) and then y22
starts with that maximal value and remains there in spite of any further increase
in µA(u). Hence, the supremum of this region is µB(v) only. It is observed that
the supremum of the second region is greater than the supremum of the first region,
therefore µB′(v) = max(µAmax(u), µB(v)) or µB′(v) = µAmax(u) ∪ µB(v), where
µAmax(u) = (√5 − 1)/2.
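\[
\mu_{B'}(v) = \sup_{u \in U} \big\{ \min\big[\max(1 - \mu_A(u),\, \mu_B(v)),\, \mu_{A'}(u)\big] \big\} \tag{44}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \min[1 - \mu_A(u),\, 1 - \mu_A(u)]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min[\mu_B(v),\, 1 - \mu_A(u)]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = 1 - \mu_A(u); & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 1 - \mu_B(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) \le 1 - \mu_B(v)
\end{cases} & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
\]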
It is observed that y22 is not valid, hence, not considered. Therefore, the outcome
of the “min” operation between µA→B (u, v) and µA (u) is 1 − µA (u) only and the
supremum of the outcome will be the unity only, i.e. µB (v) = 1.
GRFI: C1: µA′(u) = µA(u) is applied to RHS of Equation (20) to get the
consequence µB′(v). It is observed from Figure 39 that µB′(v), i.e. the supremum of
the outcome of the “min” operation between µA→B(u, v) and µA′(u), is given by the
intersection points of µA′(u) = µA(u) and the curve µB(v)/µA(u) (selected because
µA′(u) intersects this curve only) at various values of µB(v). These intersection
points are computed by solving the following equality:

\[
\mu_{B'}(v) = \frac{\mu_B(v)}{\mu_A(u)} = \mu_A(u)
\]

Since µB(v)/µA(u) = µA(u), solving this equation we get µA(u) = √µB(v), hence
µB′(v) = √µB(v). Therefore, it is concluded that GRFI does not satisfy the C1
criterion of GMP. The analytical proof is given below:
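\[
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min[1,\, \mu_A(u)]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, \mu_A(u)\right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases} \tag{45}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = \mu_A(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\
y_{22} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > \sqrt{\mu_B(v)}
\end{cases} & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\]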
It is observed that the outcome of the “min” operation between µA→B(u, v) and
µA′(u) consists of y1, y21 and y22. It is noticed that y1 and y21 can be treated as
the same up to µA(u) ≤ √µB(v), therefore, the outcome starts with y1/y21 which
increases to a maximal value of √µB(v) with an increase of µA(u) and then y22 which
starts with that maximal value and decreases with any further increase of µA(u). So
clearly the supremum of the outcome is √µB(v) only, i.e. µB′(v) = √µB(v).
GRFI: C2-1/C2-2: µA′(u) = µA²(u) is applied to RHS of Equation (20) to get
the consequence µB′(v). It is observed from Figure 40 that µB′(v), i.e. the supremum
of the outcome of the “min” operation between µA→B(u, v) and µA′(u), is given by the
intersection points of µA′(u) = µA²(u) and the curve µB(v)/µA(u) (selected because
µA′(u) intersects this curve only) at various values of µB(v). These intersection
points are computed by solving the following equality:

\[
\mu_{B'}(v) = \frac{\mu_B(v)}{\mu_A(u)} = \mu_A^2(u)
\]

Since µB(v)/µA(u) = µA²(u), solving this equation we get µA(u) = (µB(v))^(1/3),
hence µB′(v) = (µB(v))^(2/3). Therefore, it is concluded that GRFI does not satisfy
the C2-1/C2-2 criteria of GMP. The analytical proof is given below:
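\[
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min[1,\, \mu_A^2(u)]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, \mu_A^2(u)\right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases} \tag{46}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = \mu_A^2(u); & \text{for } \mu_A(u) \le (\mu_B(v))^{1/3} \\
y_{22} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > (\mu_B(v))^{1/3}
\end{cases} & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\]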
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y1 , y21 and y22 . It is noticed that y1 and y21 can be treated as the
same up to µA (u) ≤ (µB (v))1/3 , therefore, the outcome starts with y1 /y21 which
increases to a maximal value of (µB (v))2/3 with an increase of µA (u) and then y22
which starts with that maximal value and decreases with any further increase of
µA (u). So clearly the supremum of the outcome is (µB (v))2/3 only, i.e. µB (v) =
(µB (v))2/3 .
GRFI: C3-1/C3-2: µA′(u) = √µA(u) is applied to RHS of Equation (20) to get
the consequence µB′(v). It is observed from Figure 41 that µB′(v), i.e. the supremum
of the outcome of the “min” operation between µA→B(u, v) and µA′(u), is given by the
intersection points of µA′(u) = √µA(u) and the curve µB(v)/µA(u) (selected because
µA′(u) intersects this curve only) at various values of µB(v). These intersection
points are computed by solving the following equality:

\[
\mu_{B'}(v) = \frac{\mu_B(v)}{\mu_A(u)} = \sqrt{\mu_A(u)}
\]

Since µB(v)/µA(u) = √µA(u), solving this equation we get µA(u) = (µB(v))^(2/3),
hence µB′(v) = (µB(v))^(1/3). Therefore, it is concluded that GRFI does not satisfy
the C3-1/C3-2 criteria of GMP. The analytical proof is given below:
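\[
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min\big[1,\, \sqrt{\mu_A(u)}\big]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, \sqrt{\mu_A(u)}\right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases} \tag{47}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le (\mu_B(v))^{2/3} \\
y_{22} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > (\mu_B(v))^{2/3}
\end{cases} & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\]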
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y1 , y21 and y22 . It is noticed that y1 and y21 can be treated as the
same up to µA (u) ≤ (µB (v))2/3 , therefore, the outcome starts with y1 /y21 which
increases to a maximal value of (µB (v))1/3 with an increase of µA (u) and then y22
which starts with that maximal value and decreases with any further increase of
µA (u). So clearly the supremum of the outcome is (µB (v))1/3 only, i.e. µB (v) =
(µB (v))1/3 .
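The three GRFI consequences derived above, √µB(v) for C1, (µB(v))^(2/3) for C2-1/C2-2 and (µB(v))^(1/3) for C3-1/C3-2, can be compared with a brute-force evaluation of the sup-min composition. This is only an illustrative sketch: it assumes µA(u) ranges over all of [0, 1], approximates the supremum on a uniform grid, and uses the hypothetical helper names `grfi` and `gmp`.

```python
import math

def grfi(a, b):
    """Goguen's rule of fuzzy implication: 1 if a <= b, otherwise b / a."""
    return 1.0 if a <= b else b / a

def gmp(premise, b, steps=20001):
    """Grid approximation of sup over mu_A in [0, 1] of min(grfi, premise)."""
    grid = (i / (steps - 1) for i in range(steps))
    return max(min(grfi(a, b), premise(a)) for a in grid)

# premise function and the exponent of mu_B in the stated closed form
premises = {
    "C1 (A' = A)": (lambda a: a, 1 / 2),
    "C2 (A' = very A)": (lambda a: a * a, 2 / 3),
    "C3 (A' = more or less A)": (math.sqrt, 1 / 3),
}

b = 0.3  # an arbitrary sample value of mu_B(v)
for name, (premise, exponent) in premises.items():
    print(f"{name}: numeric {gmp(premise, b):.4f}, closed form {b ** exponent:.4f}")
```

The branch `a <= b` is tested first, so the division `b / a` is never reached with `a` equal to zero.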
GRFI: C4-1/C4-2: µA′(u) = 1 − µA(u) is applied to RHS of Equation (20) to get
the consequence µB′(v). It is observed from Figures 42 and 43 that the outcome of
the “min” operation between the curves µA→B(u, v) and µA′(u) always starts from
the unity value and follows the curve µA′(u) = 1 − µA(u) for the value of µA(u),
then µA→B(u, v), and then goes back to µA′(u). So we observe that the maximum value
that comes out of the “min” operation is unity, therefore µB′(v) = 1. Hence,
GRFI satisfies the C4-1 (not C4-2) criterion of GMP. The analytical proof is given
below:
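\[
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min[1,\, 1 - \mu_A(u)]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, 1 - \mu_A(u)\right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases} \tag{48}
\]
\[
= \sup_{u \in U}
\begin{cases}
y_1 = 1 - \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\[1ex]
\begin{cases}
y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) \le \mu_A^{\min}(u) \\
y_{22} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > \mu_A^{\min}(u)
\end{cases} & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\]

where

\[
\mu_A^{\min}(u) = \frac{1 - \sqrt{1 - 4\mu_B(v)}}{2}
\]

is obtained by solving 1 − µA(u) = µB(v)/µA(u).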
It is observed that the outcome of the “min” operation between µA→B(u, v) and
µA′(u) consists of y1, y21 and y22. It is important to note that µAmin(u), and
hence y21/y22, is valid only when µB(v) ≤ 0.25. Therefore, for µB(v) ≤ 0.25, the
outcome starts with y1, which decreases from its maximal value of unity to some
value with an increase of µA(u), is then handed over to y22, which also decreases,
and finally ends with y21. Hence we see that the supremum of this region is unity.
Similarly, for µB(v) > 0.25 there is only y1 and hence its supremum is again unity.
Therefore, for any value of µB(v), the consequence µB′(v) will be unity.
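The threshold µB(v) ≤ 0.25 is simply the condition for the quadratic µA² − µA + µB = 0 (equivalent to 1 − µA = µB/µA) to have real roots. The sketch below illustrates this and also evaluates the GRFI consequence for the premise "not A" on a grid. The function names are hypothetical, and the grid over [0, 1] is an assumption standing in for the supremum over U.

```python
import math

def crossing_points(b):
    """Real roots of mu_A^2 - mu_A + b = 0, i.e. where 1 - mu_A meets b / mu_A."""
    disc = 1.0 - 4.0 * b
    if disc < 0.0:
        return []                     # no crossing when mu_B > 0.25
    root = math.sqrt(disc)
    return [(1.0 - root) / 2.0, (1.0 + root) / 2.0]

def consequence_c4(b, steps=10001):
    """Grid approximation of sup_u min(Goguen implication, 1 - mu_A)."""
    best = 0.0
    for i in range(steps):
        a = i / (steps - 1)
        implication = 1.0 if a <= b else b / a
        best = max(best, min(implication, 1.0 - a))
    return best

for b in (0.1, 0.25, 0.6):
    print(b, crossing_points(b), consequence_c4(b))
```

In every case the maximum is reached at µA(u) = 0, where both the implication and 1 − µA(u) equal one, which matches the conclusion that the consequence is unity regardless of µB(v).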
Step 3: Premise 1 of each GMT criterion, i.e. C5 to C8-2, is applied one by one
to the implication methods. Before that, it is essential to realize that the
relational matrices (R) of the implication methods, shown in Figures 6–11, should
be transposed before being investigated against the criteria of GMT. By doing so,
the X-axis now represents the fuzzy set µB(v) (in the case of GMP it is µA(u)) and
the Y-axis represents the implication µA→B(u, v) computed for each value of the
fuzzy set µA(u). The transpose of (R) is needed due to the fact that the inference
rule of GMT is backward goal-driven, whereas GMP is a forward goal-driven inference
rule. Figures 44–49 show the transposes of the relational matrices of the various
implication methods to be investigated against the intuitive criteria of GMT. Let
us first take the implication method “MORFI” for investigation under the heading
“MORFI: C#”, where # is a criterion index such as 5, 6, 7, 8-1 and 8-2.
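The role of the transpose can be made concrete with a small discrete example. The sketch below assumes that Equation (20) is the sup-min composition over u and Equation (22) the sup-min composition over v; the sample membership grades and the helper names `relation`, `transpose` and `sup_min` are hypothetical and chosen only for illustration.

```python
def relation(implication, mu_a, mu_b):
    """Relational matrix R[i][j] = implication(mu_A(u_i), mu_B(v_j))."""
    return [[implication(a, b) for b in mu_b] for a in mu_a]

def transpose(matrix):
    return [list(row) for row in zip(*matrix)]

def sup_min(premise, matrix):
    """Sup-min composition: out[j] = max_i min(premise[i], matrix[i][j])."""
    return [max(min(p, row[j]) for p, row in zip(premise, matrix))
            for j in range(len(matrix[0]))]

mu_a = [0.0, 0.3, 0.7, 1.0]          # hypothetical samples of mu_A(u)
mu_b = [0.0, 0.25, 0.5, 0.75, 1.0]   # hypothetical samples of mu_B(v)
R = relation(min, mu_a, mu_b)        # mini-operation rule (MORFI)

# GMP: the premise lives on U, so R is used as it is (here A' = A).
b_prime = sup_min(mu_a, R)
# GMT: the premise lives on V, so R is transposed first (here B' = not B).
a_prime = sup_min([1.0 - b for b in mu_b], transpose(R))

print(b_prime)   # consequence over V
print(a_prime)   # consequence over U
```

With the same `sup_min` routine serving both inference rules, the only difference between GMP and GMT is which axis of R the premise is matched against.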
MORFI: C5: µB′(v) = 1 − µB(v) is applied to RHS of Equation (22) to get
the consequence µA′(u). Figure 50 illustrates the superimposed plots of µA→B(u, v)
and µB′(v). It is noticed from the figure that µB′(v) intersects µA→B(u, v) for the
first time at µA(u) = µB(v) = 0.5, a point at which the outcome of the “min”
operation is at its peak value and also equal to the maximal value that the
consequence µA′(u) can achieve. Similarly the other values of µA′(u) are the next
intersection points (below 0.5) of µB′(v) with µA→B(u, v). Hence it can be concluded
that µA′(u) never exceeds 0.5 and otherwise equals µA(u), or, in other words,
µA′(u) = min(0.5, µA(u)), that is, µA′(u) = 0.5 ∩ µA(u). The point to be noted here
is that MORFI does not satisfy the intuitive criterion C5 of GMT. The analytical
proof is given below:
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µB (v) results in y11 , y12 , y21 and y22 depending on the values of µA (u) and µB (v).
We see that the outcome starts with y11 when the value of µB (v) is less than 0.5
and µB (v) ≤ µA (u), then y21 for µA (u) < µB (v) < 1 − µA (u) and finally y22 (when
µB (v) > 1 − µA (u) as well as µB (v) > µA (u), which is only possible when µA (u) is
less than 0.5). In this case, the supremum will be y21 , i.e. µA (u) only. It is also ob-
served that the outcome, for µA (u) greater than 0.5, will be y11 (when µB (v) ≤ 0.5)
and then y12 (when µB (v) > 0.5). Thus we see that the supremum of the outcome of
the “min” operation between µA→B (u, v) and µB (v) will be the intersection point of
y11 and y12 and that happens to always be 0.5. We also see that 0.5 is a maximum
value that the supremum can achieve, otherwise it is just µA (u), in other words
µA (u) = min(0.5, µA (u)), that is µA (u) = 0.5 ∩ µA (u).
MORFI: C6: µB′(v) = 1 − µB²(v) is applied to RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figures 51 and 52 that µB′(v) intersects
µA→B(u, v) for the first time at a point at which the curve µB′(v) is equal to
µB(v) for some value of µA(u). It is also observed that this point, computed below,
is also the maximum value that the consequence µA′(u) can achieve.
Since µB(v) = 1 − µB²(v), or µB²(v) + µB(v) − 1 = 0, by solving the above equation
we get µB(v) = (−1 ± √5)/2, of which only µBmin(v) = (√5 − 1)/2 lies between 0
and 1.
It is observed that the outcome of the “min” operation between µA→B(u, v) and
µB′(v) results in y11, y12, y21 and y22 depending on the values of µA(u) and µB(v).
We see that the outcome starts with y11 when the value of µB(v) is less than
µBmin(v) and µB(v) ≤ µA(u), then y21 for µA(u) < µB(v) < √(1 − µA(u)) and finally
y22 (when µB(v) > √(1 − µA(u)) as well as µB(v) > µA(u), which is only possible
when µA(u) is less than µBmin(v)). In this case, the supremum will be y21, i.e. µA(u)
only. It is also observed that the outcome, for µA(u) greater than µBmin(v), will be
y11 (when µB(v) ≤ µBmin(v)) and then y12 (when µB(v) > µBmin(v)). Thus we see
that the supremum of the outcome of the “min” operation between µA→B(u, v) and
µB′(v) will be the intersection point of y11 and y12, and that always turns out to
be µBmin(v). We also see that µBmin(v) is the maximum value that the supremum can
achieve, otherwise it is just µA(u), in other words, µA′(u) = min(µBmin(v), µA(u)),
that is µA′(u) = µBmin(v) ∩ µA(u).
MORFI: C7: µB′(v) = 1 − √µB(v) is applied to RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figure 53 that µB′(v) intersects µA→B(u, v)
for the first time at a point at which the curve µB′(v) is equal to µB(v) for some
value of µA(u). It is also observed that this point, computed below, is also the
maximum value that the consequence µA′(u) can achieve.
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µB (v) results in y11 , y12 , y21 and y22 , depending on the values of µA (u) and µB (v).
We see that the outcome starts with y11 when the value of µB (v) is less than µBmin (v)
and µB (v) ≤ µA (u), then y21 for µA (u) < µB (v) < (1− µA (u))2 , and finally y22 (when
µB (v) > (1 − µA (u))2 as well as µB (v) > µA (u), which is only possible when µA (u)
is less than µBmin (v)). In this case, the supremum will be y21 , i.e. µA (u) only. It is
also observed that the outcome, for µA (u) greater than µBmin (v), will be y11 (when
µB (v) ≤ µBmin (v)) and then y12 (when µB (v) > µBmin (v)). Thus we see that the supre-
mum of the outcome of the “min” operation between µA→B (u, v) and µB (v) will
be the intersection point of y11 and y12 and that turns always out to be µBmin (v).
We also see that µBmin (v) is a maximum value that the supremum can achieve,
otherwise it is just µA (u), in other words, µA (u) = min(µBmin (v), µA (u)), that is
µA (u) = µBmin (v) ∩ µA (u).
MORFI: C8-1/C8-2: µB′(v) = µB(v) is applied to RHS of Equation (22) to get
the consequence µA′(u). It is observed from Figure 54 that µB′(v) is always larger
than or equal to µA→B(u, v) for any value of µB(v), which means that the outcome of
the “min” operation is µA→B(u, v) itself, i.e. Figure 44. It is also observed from
Figure 54 that µA→B(u, v) = min(µA(u), µB(v)) converges to µA(u) (which is also the
maximal value of µA→B(u, v)) for µB(v) ≥ µA(u) and hence the supremum of µA→B(u, v)
is µA(u), i.e. µA′(u) = µA(u). Therefore, it is concluded that MORFI satisfies the
intuitive criterion C8-2 (not C8-1) of GMT (refer to Table 2). The analytical proof
is given below:
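\[
\mu_{A'}(u) = \sup_{v \in V} \big\{ \min\big[\min(\mu_A(u),\, \mu_B(v)),\, \mu_{B'}(v)\big] \big\}
= \sup_{v \in V}
\begin{cases}
y_1 = \min[\mu_B(v),\, \mu_B(v)]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min[\mu_A(u),\, \mu_B(v)]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \tag{52}
\]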
It is observed from the above formula that the outcome of the “min” operation
between µA→B (u, v) and µB (v) (for some fixed value of µA (u)) consists of y1 having
value µB (v) when µB (v) ≤ µA (u) and which increases up to a value of µA (u), then y2
is equal to that fixed value of µA (u) when µB (v) > µA (u). Therefore, it can be con-
cluded that the supremum of y1 and y2 will be the curve µA (u), i.e. µA (u) = µA (u).
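The MORFI results for GMT can be spot-checked in the same brute-force manner: min(0.5, µA(u)) for C5, min(µBmin(v), µA(u)) with µBmin(v) the root of µB = 1 − µB² for C6, and µA(u) itself for C8-1/C8-2. The sketch assumes µB(v) sweeps [0, 1] and uses a uniform grid for the supremum; `gmt` is a hypothetical helper name.

```python
import math

def gmt(mu_a, premise, steps=20001):
    """Grid approximation of sup over mu_B in [0, 1] of
    min(min(mu_A, mu_B), premise(mu_B)) for the mini-operation rule."""
    grid = (j / (steps - 1) for j in range(steps))
    return max(min(mu_a, b, premise(b)) for b in grid)

golden = (math.sqrt(5) - 1) / 2      # root of mu_B = 1 - mu_B^2 in [0, 1]

for mu_a in (0.2, 0.55, 0.9):
    c5 = gmt(mu_a, lambda b: 1 - b)        # B' = not B
    c6 = gmt(mu_a, lambda b: 1 - b * b)    # B' = not very B
    c8 = gmt(mu_a, lambda b: b)            # B' = B
    print(mu_a, c5, min(0.5, mu_a), c6, min(golden, mu_a), c8)
```

For C8 the grid contains µB(v) = 1, so the computed value equals µA(u) exactly; the C5 and C6 values match the closed forms up to the grid resolution.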
PORFI: C5: µB′(v) = 1 − µB(v) is applied to RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figures 55 and 56 that the supremum of
the “min” operation, i.e. µA′(u), is nothing but the intersection point of µB′(v)
and µA→B(u, v) at which µB′(v) = µA→B(u, v).

\[
\mu_{A'}(u) = \sup_{v \in V} \big\{ \min\big[\mu_A(u)\mu_B(v),\, 1 - \mu_B(v)\big] \big\} = \frac{\mu_A(u)}{1 + \mu_A(u)}
\]

Since µA′(u) = µA(u)µB(v) = 1 − µB(v), by solving µA(u)µB(v) = 1 − µB(v) we get
µB(v) = 1/(1 + µA(u)). Therefore, µA′(u) = µA(u) · 1/(1 + µA(u)), or
µA(u)/(1 + µA(u)). Hence, PORFI does not satisfy the C5 criterion of GMT. The
analytical proof is given below:
It is observed that y1 and y2, computed for some fixed value of µA(u), are
the outcome of the “min” operation between the implication µA→B(u, v) and µB′(v).
The y1 increases with an increase in µB(v) to a maximum value equalling
µA(u)/(1 + µA(u)) for µB(v) ≤ 1/(1 + µA(u)), whereas y2 starts from its maximum
value of µA(u)/(1 + µA(u)) and then decreases with any further increase of µB(v).
Therefore, the supremum of these curves will be µA(u)/(1 + µA(u)) only.
PORFI: C6: µB′(v) = 1 − µB²(v) is applied to RHS of Equation (22) to get
the consequence µA′(u). It is observed from Figure 57 that the supremum of
the “min” operation, i.e. µA′(u), is nothing but the intersection point of µB′(v)
and µA→B(u, v) at which µB′(v) = µA→B(u, v).

\[
\mu_{A'}(u) = \sup_{v \in V} \big\{ \min\big[\mu_A(u)\mu_B(v),\, 1 - \mu_B^2(v)\big] \big\} = \frac{\mu_A(u)\sqrt{\mu_A^2(u) + 4} - \mu_A^2(u)}{2}
\]
It is observed that y1 and y2, computed for some fixed value of µA(u), are
the outcome of the “min” operation between the implication µA→B(u, v) and
µB′(v). The y1 increases with an increase in µB(v) to a maximum value equal to
µA(u)µB(v) = µA(u)µBmin(v) for µB(v) ≤ µBmin(v), whereas y2 starts from its
maximum value of µA(u)µBmin(v) and then decreases with any further increase of
µB(v). Therefore, the supremum of these curves will be
µA(u)µBmin(v) = (µA(u)√(µA²(u) + 4) − µA²(u))/2 only.
PORFI: C7: µB′(v) = 1 − √µB(v) is applied to RHS of Equation (22) to get
the consequence µA′(u). It is observed from Figure 58 that the supremum of
the “min” operation, i.e. µA′(u), is nothing but the intersection point of µB′(v)
and µA→B(u, v) at which µB′(v) = µA→B(u, v). These intersection points are computed
as follows:

\[
\mu_{A'}(u) = \sup_{v \in V} \Big\{ \min\Big[\mu_A(u)\mu_B(v),\, 1 - \sqrt{\mu_B(v)}\Big] \Big\} = \frac{2\mu_A(u) + 1 - \sqrt{4\mu_A(u) + 1}}{2\mu_A(u)}
\]

Since µA′(u) = µA(u)µB(v) = 1 − √µB(v), or
µA²(u)µB²(v) − (2µA(u) + 1)µB(v) + 1 = 0, by solving the above equation,

\[
\mu_B(v) = \frac{2\mu_A(u) + 1 \pm \sqrt{(2\mu_A(u) + 1)^2 - 4\mu_A^2(u)}}{2\mu_A^2(u)} = \frac{2\mu_A(u) + 1 \pm \sqrt{4\mu_A(u) + 1}}{2\mu_A^2(u)}
\]

or

\[
\mu_B(v) = \frac{2\mu_A(u) + 1 - \sqrt{4\mu_A(u) + 1}}{2\mu_A^2(u)}
\quad \text{and} \quad
\mu_{A'}(u) = \mu_A(u)\mu_B(v) = \frac{2\mu_A(u) + 1 - \sqrt{4\mu_A(u) + 1}}{2\mu_A(u)}.
\]

Therefore, PORFI does not satisfy the C7 criterion of GMT. The analytical proof is
given below:

\[
\mu_{A'}(u) = \sup_{v \in V} \Big\{ \min\Big[\mu_A(u)\mu_B(v),\, 1 - \sqrt{\mu_B(v)}\Big] \Big\} \tag{55}
\]
\[
= \sup_{v \in V}
\begin{cases}
y_1 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u)\mu_B(v) \le 1 - \sqrt{\mu_B(v)} \text{ or } \mu_B(v) \le \mu_B^{\min}(v) \\
y_2 = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{cases}
\]

where

\[
\mu_B^{\min}(v) = \frac{2\mu_A(u) + 1 - \sqrt{4\mu_A(u) + 1}}{2\mu_A^2(u)}
\]

is obtained by solving the equation µA(u)µB(v) = 1 − √µB(v).
It is observed that y1 and y2, computed for some fixed value of µA(u), are the
outcome of the “min” operation between the implication µA→B(u, v) and µB′(v). The
y1 increases with an increase in µB(v) to a maximum value equal to µA(u)µB(v) =
µA(u)µBmin(v) for µB(v) ≤ µBmin(v), whereas y2 starts from its maximum value
of µA(u)µBmin(v) and then decreases with any further increase of µB(v). Therefore,
the supremum of these curves will be µA(u)µBmin(v), or
(2µA(u) + 1 − √(4µA(u) + 1))/(2µA(u)), only.
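The three PORFI closed forms for C5, C6 and C7 can be compared against a direct grid evaluation of the sup-min composition with the product rule µA(u)µB(v). This is an illustrative sketch under the assumption that µB(v) covers [0, 1]; `gmt_product` and `closed_forms` are names introduced here, and the C7 expression is only evaluated for µA(u) > 0 because it divides by µA(u).

```python
import math

def gmt_product(mu_a, premise, steps=40001):
    """Grid approximation of sup over mu_B in [0, 1] of min(mu_A * mu_B, premise(mu_B))."""
    grid = (j / (steps - 1) for j in range(steps))
    return max(min(mu_a * b, premise(b)) for b in grid)

# premise B'(mu_B) paired with the closed-form consequence as a function of mu_A
closed_forms = {
    "C5": (lambda b: 1 - b,
           lambda a: a / (1 + a)),
    "C6": (lambda b: 1 - b * b,
           lambda a: (a * math.sqrt(a * a + 4) - a * a) / 2),
    "C7": (lambda b: 1 - math.sqrt(b),
           lambda a: (2 * a + 1 - math.sqrt(4 * a + 1)) / (2 * a)),
}

for mu_a in (0.25, 0.5, 1.0):
    for name, (premise, formula) in closed_forms.items():
        numeric = gmt_product(mu_a, premise)
        print(name, mu_a, round(numeric, 4), round(formula(mu_a), 4))
```

None of the three expressions reduces to the complement-type consequence named by the corresponding criterion, which is consistent with the conclusion that PORFI fails C5 to C7.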
PORFI: C8-1/C8-2: µB (v) = µB (v) is applied to RHS of Equation (22) to get
the consequence µA (u). It is observed from the Figure 59 that µB (v) is always equal
to or greater than the implication µA→B (u, v) for any value of µA (u), therefore the
outcome of the “min” operation results in µA→B (u, v) itself. It is also noticed that
the supremum of µA→B (u, v) turns out to be µA (u). Therefore, it is concluded that
PORFI satisfies the intuitive criterion C8-2 (not C8-1) of GMT. The analytical proof
is given below:
ARFI: C5: µB′(v) = 1 − µB(v) is applied to RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figures 60 and 61 that the supremum of
the outcome of the “min” operation between the curves µA→B(u, v) and µB′(v) is
their intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = 1 - \mu_A(u) + \mu_B(v) = 1 - \mu_B(v)
\]

Since 1 − µA(u) + µB(v) = 1 − µB(v), or 2µB(v) = µA(u), or µB(v) = µA(u)/2,
the consequence µA′(u) is given by µA′(u) = 1 − µA(u)/2. Therefore it is concluded
that ARFI does not satisfy the criterion C5 of GMT. The analytical proof is given
below:
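\[
\mu_{A'}(u) = \sup_{v \in V} \big\{ \min\big[\min(1,\, 1 - \mu_A(u) + \mu_B(v)),\, 1 - \mu_B(v)\big] \big\} \tag{57}
\]
\[
= \sup_{v \in V}
\begin{cases}
y_1 = \min[1 - \mu_A(u) + \mu_B(v),\, 1 - \mu_B(v)]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min[1,\, 1 - \mu_B(v)]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v \in V}
\begin{cases}
\begin{cases}
y_{11} = 1 - \mu_A(u) + \mu_B(v); & \text{for } 0 < \mu_B(v) < \mu_A(u)/2 \\
y_{12} = 1 - \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)/2
\end{cases} & \text{for } \mu_B(v) \le \mu_A(u) \\[1ex]
y_2 = 1 - \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]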
It is observed that the outcome of the “min” operation between µA→B(u, v) and
µB′(v) consists of y11, y12, and y2. It is noticed that y12 and y2 represent the
same equation, that is 1 − µB(v), for the range µA(u)/2 < µB(v) ≤ 1. So it will not
be wrong to say that the outcome basically consists of c1 = 1 − µA(u) + µB(v) and
then c2 = 1 − µB(v). The c1 increases up to a maximum value of 1 − µA(u)/2 with an
increase in µB(v) from zero to µA(u)/2, whereas c2 starts from its maximum value of
1 − µA(u)/2 and then decreases with any further increase in µB(v) from the value
µA(u)/2. Hence, it can be concluded that the supremum over c1 and c2 is
1 − µA(u)/2 only.
ARFI: C6: µB′(v) = 1 − µB²(v) is applied to RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figure 62 that the supremum of the outcome
of the “min” operation between the curves µA→B(u, v) and µB′(v) is their
intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = 1 - \mu_A(u) + \mu_B(v) = 1 - \mu_B^2(v)
\]

Since 1 − µA(u) + µB(v) = 1 − µB²(v), or µB²(v) + µB(v) − µA(u) = 0, by solving
the above equation we get

\[
\mu_B(v) = \frac{-1 \pm \sqrt{1 + 4\mu_A(u)}}{2} = \frac{\sqrt{1 + 4\mu_A(u)} - 1}{2}
\]

and

\[
\mu_{A'}(u) = 1 - \mu_A(u) + \mu_B(v) = \frac{1 - 2\mu_A(u) + \sqrt{1 + 4\mu_A(u)}}{2}.
\]
whereas c2 starts from its maximum value of (1 − 2µA(u) + √(1 + 4µA(u)))/2 and then
decreases with any further increase in µB(v) from the value µBmin(v). Hence, it is
finally concluded that the supremum over c1 and c2 is
(1 − 2µA(u) + √(1 + 4µA(u)))/2 only.
ARFI: C7: µB′(v) = 1 − √µB(v) is applied to RHS of Equation (22) to get
the consequence µA′(u). It is observed from Figure 63 that the supremum of the
outcome of the “min” operation between the curves µA→B(u, v) and µB′(v) is their
intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = 1 - \mu_A(u) + \mu_B(v) = 1 - \sqrt{\mu_B(v)}
\]

Since 1 − µA(u) + µB(v) = 1 − √µB(v), or µB²(v) − (2µA(u) + 1)µB(v) + µA²(u) = 0,
by solving the above equation we get

\[
\mu_B(v) = \frac{1 + 2\mu_A(u) \pm \sqrt{(1 + 2\mu_A(u))^2 - 4\mu_A^2(u)}}{2}
= \frac{1 + 2\mu_A(u) \pm \sqrt{1 + 4\mu_A^2(u) + 4\mu_A(u) - 4\mu_A^2(u)}}{2}
= \frac{1 + 2\mu_A(u) \pm \sqrt{1 + 4\mu_A(u)}}{2}
\]

and

\[
\mu_{A'}(u) = 1 - \mu_A(u) + \mu_B(v) = 1 - \mu_A(u) + \frac{1 + 2\mu_A(u) \pm \sqrt{1 + 4\mu_A(u)}}{2}
\]

or

\[
\mu_{A'}(u) = \frac{2 - 2\mu_A(u) + 1 + 2\mu_A(u) \pm \sqrt{1 + 4\mu_A(u)}}{2} = \frac{3 - \sqrt{1 + 4\mu_A(u)}}{2}.
\]

Therefore it is concluded that ARFI does not satisfy criterion C7 of GMT. The
analytical proof is given below:
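\[
\mu_{A'}(u) = \sup_{v \in V} \Big\{ \min\Big[\min(1,\, 1 - \mu_A(u) + \mu_B(v)),\, 1 - \sqrt{\mu_B(v)}\Big] \Big\} \tag{59}
\]
\[
= \sup_{v \in V}
\begin{cases}
y_1 = \min\big[1 - \mu_A(u) + \mu_B(v),\, 1 - \sqrt{\mu_B(v)}\big]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\big[1,\, 1 - \sqrt{\mu_B(v)}\big]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v \in V}
\begin{cases}
\begin{cases}
y_{11} = 1 - \mu_A(u) + \mu_B(v); & \text{for } 0 < \mu_B(v) \le \mu_B^{\min}(v) \\
y_{12} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{cases} & \text{for } \mu_B(v) \le \mu_A(u) \\[1ex]
y_2 = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]

where

\[
\mu_B^{\min}(v) = \frac{1 + 2\mu_A(u) - \sqrt{1 + 4\mu_A(u)}}{2}
\]

is obtained by solving the equation 1 − µA(u) + µB(v) = 1 − √µB(v).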
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. It is noticed that $y_{12}$ and $y_2$ represent the same equation, that is $1 - \sqrt{\mu_B(v)}$, for the range $\mu_B^{\min}(v) < \mu_B(v) \le 1$. So it is not wrong to say that the outcome basically consists of $c_1 = 1 - \mu_A(u) + \mu_B(v)$ and then $c_2 = 1 - \sqrt{\mu_B(v)}$. The curve $c_1$ increases up to a maximum value of $\frac{3 - \sqrt{1 + 4\mu_A(u)}}{2}$ with an increase in $\mu_B(v)$ from zero to $\mu_B^{\min}(v)$, whereas $c_2$ starts from its maximum value of $\frac{3 - \sqrt{1 + 4\mu_A(u)}}{2}$ and then decreases with any further increase in $\mu_B(v)$ from the value of $\mu_B^{\min}(v)$. Hence, it is finally concluded that the supremum over $c_1$ and $c_2$ is $\frac{3 - \sqrt{1 + 4\mu_A(u)}}{2}$ only.
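The closed form above can be cross-checked numerically. The following is a minimal sketch, not the MATLAB package described in this chapter: it assumes that $\mu_B(v)$ takes every value in $[0, 1]$ as $v$ ranges over $V$, replaces the supremum by a maximum over a uniform grid, and uses illustrative function names (`arfi`, `gmt_c7_sup`, `closed_form`, `criterion_c7`) that do not appear in the original text.

```python
import math


def arfi(a: float, b: float) -> float:
    """Arithmetic rule of fuzzy implication: min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def gmt_c7_sup(a: float, steps: int = 20000) -> float:
    """Grid approximation of sup over b in [0, 1] of min(ARFI(a, b), 1 - sqrt(b))."""
    return max(
        min(arfi(a, k / steps), 1.0 - math.sqrt(k / steps))
        for k in range(steps + 1)
    )


def closed_form(a: float) -> float:
    """Supremum stated in the text: (3 - sqrt(1 + 4a)) / 2."""
    return (3.0 - math.sqrt(1.0 + 4.0 * a)) / 2.0


def criterion_c7(a: float) -> float:
    """Value that criterion C7 would require: 1 - sqrt(a)."""
    return 1.0 - math.sqrt(a)


for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"a={a:.2f} grid={gmt_c7_sup(a):.4f} "
          f"closed={closed_form(a):.4f} C7={criterion_c7(a):.4f}")
```

The grid value and the closed form should agree to within the grid resolution, while the last column shows how far the result is from the value that C7 asks for.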
ARFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 64 that $\mu_{B'}(v)$ is always equal to or less than the implication $\mu_{A\to B}(u,v)$ for any value of $\mu_A(u)$; therefore the outcome of the “min” operation results in $\mu_{B'}(v)$ itself. Hence, the supremum of $\mu_{B'}(v)$ turns out to be unity (since the maximum value of $\mu_{B'}(v) = \mu_B(v)$ is unity). Therefore, it is concluded that ARFI satisfies the intuitive criterion C8-1 (not C8-2) of GMT. The analytical proof is given below:
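\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V} \left\{ \min\left[ \min\left(1,\ 1 - \mu_A(u) + \mu_B(v)\right),\ \mu_B(v) \right] \right\} \qquad (60) \\
&= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[1 - \mu_A(u) + \mu_B(v),\ \mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ \mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = 1 - \mu_A(u) + \mu_B(v); & \text{for } 1 - \mu_A(u) + \mu_B(v) < \mu_B(v), \text{ i.e. } \mu_A(u) > 1, \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = \mu_B(v); & \text{for } \mu_A(u) < 1 \text{ and } \mu_B(v) \le \mu_A(u) \\
y_2 = \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]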
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. Out of these, $y_{11}$ is not valid as $\mu_A(u) > 1$ is not possible. Hence, the outcome is $\mu_B(v)$ only for any fixed value of $\mu_A(u)$. Therefore the supremum of $\mu_B(v)$, i.e. $\mu_{A'}(u)$, will be unity only.
MRFI: C5: $\mu_{B'}(v) = 1 - \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 65 and 66 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, for the value of $\mu_A(u)$ greater than 0.5, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1 - \mu_B(v)$. This intersection point is obtained by solving the equality $\mu_{A'}(u) = 1 - \mu_B(v) = \mu_B(v) = 0.5$. It is also observed from Figure 65 that $\mu_{A'}(u) = 0.5$ is also the minimum value that the consequence $\mu_{A'}(u)$ can achieve. For the value of $\mu_A(u)$ less than 0.5, another point of the supremum happens to be $1 - \mu_A(u)$. Therefore, $\mu_{A'}(u)$ falls between 0.5 and $1 - \mu_A(u)$, whichever is the maximum, i.e. $\mu_{A'}(u) = \max(0.5,\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup (1 - \mu_A(u))$. It is noticed that MRFI does not satisfy the intuitive criterion C5 of GMT. The analytical proof is given below:
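\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V} \left\{ \min\left[ \max\left( \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right),\ 1 - \mu_B(v) \right] \right\} \\
&= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \max(\mu_B(v),\ 1 - \mu_A(u)),\ 1 - \mu_B(v) \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \max(\mu_A(u),\ 1 - \mu_A(u)),\ 1 - \mu_B(v) \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \qquad (61) \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \min\left[\mu_B(v),\ 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) \ge 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = \min\left[1 - \mu_A(u),\ 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) < 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{21} = \min\left[\mu_A(u),\ 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) \ge 1 - \mu_A(u) \text{ or } \mu_A(u) \ge 0.5, \text{ and } \mu_B(v) > \mu_A(u) \\
y_{22} = \min\left[1 - \mu_A(u),\ 1 - \mu_B(v)\right]; & \text{for } \mu_A(u) < 0.5 \text{ and } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{111} = \mu_B(v); & \text{for } \mu_B(v) \le 0.5,\ \mu_B(v) \ge 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{112} = 1 - \mu_B(v); & \text{for } \mu_B(v) > 0.5,\ \mu_B(v) \ge 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_B(v) < 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{21} = 1 - \mu_B(v); & \text{for } \mu_B(v) \ge 1 - \mu_A(u) \text{ or } \mu_A(u) \ge 0.5, \text{ and } \mu_B(v) > \mu_A(u) \\
y_{22} = 1 - \mu_B(v); & \text{for } \mu_A(u) < 0.5 \text{ and } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]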
It is observed that $y_{111} = \mu_B(v)$ is only possible when $\mu_B(v) \le 0.5$ and $\mu_B(v) \ge 1 - \mu_A(u)$, i.e. $1 - \mu_A(u) \le 0.5$ or $\mu_A(u) \ge 0.5$. In a similar way, $y_{112} = 1 - \mu_B(v)$ is only possible when $\mu_B(v) > 0.5$ and hence $\mu_A(u) \ge 0.5$. If we observe carefully, then we find that $y_{21}$ is the same as $y_{112}$. Now we have two equations for $\mu_A(u) < 0.5$: the first is $y_{12} = 1 - \mu_A(u)$ when $\mu_B(v) \le \mu_A(u)$ and then $y_{22} = 1 - \mu_B(v)$ when $\mu_B(v) > \mu_A(u)$. Hence, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ can be divided into two regions. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < 0.5$ and the second region consists of $y_{111}$ and $y_{21}$ when $\mu_A(u) \ge 0.5$. We observe that the supremum of the first region will be $y_{12}$, i.e. $1 - \mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{111}$ and $y_{21}$. This intersection point is computed by solving the equality $1 - \mu_B(v) = \mu_B(v)$ and turns out to be 0.5. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is 0.5 only, otherwise whichever is the maximum, i.e. $\mu_{A'}(u) = \max(0.5,\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup (1 - \mu_A(u))$.
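The two-region result can be tabulated against the value that criterion C5 would require, $1 - \mu_A(u)$. The sketch below is illustrative only: it assumes that $\mu_B(v)$ sweeps the whole interval $[0, 1]$, approximates the supremum on a uniform grid, and the names `mrfi` and `gmt_c5_sup` are introduced here and are not part of the chapter's software.

```python
def mrfi(a: float, b: float) -> float:
    """Max-min rule of fuzzy implication: max(min(a, b), 1 - a)."""
    return max(min(a, b), 1.0 - a)


def gmt_c5_sup(a: float, steps: int = 1000) -> float:
    """Grid approximation of sup over b in [0, 1] of min(MRFI(a, b), 1 - b)."""
    return max(min(mrfi(a, k / steps), 1.0 - k / steps) for k in range(steps + 1))


rows = []
for k in range(11):
    a = k / 10
    # (mu_A, grid supremum, value stated in the text, value required by C5)
    rows.append((a, gmt_c5_sup(a), max(0.5, 1.0 - a), 1.0 - a))

for a, grid, stated, wanted in rows:
    flag = "matches C5" if abs(grid - wanted) < 1e-9 else "differs from C5"
    print(f"a={a:.1f} sup={grid:.3f} max(0.5,1-a)={stated:.3f} 1-a={wanted:.3f} {flag}")
```

Under these assumptions the supremum follows $1 - \mu_A(u)$ only while $\mu_A(u) \le 0.5$ and stays at 0.5 afterwards, which is the deviation from C5 described above.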
MRFI: C6: $\mu_{B'}(v) = 1 - \mu_B^2(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 67 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, is the intersection point of the curve of $\mu_B(v)$ and the curve $\mu_{B'}(v) = 1 - \mu_B^2(v)$. This intersection point is obtained by solving the following equality:
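MRFI: C7: $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 68 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, is the intersection point of the curve of $\mu_B(v)$ and the curve $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$. This intersection point is obtained by solving the following equality:
\[ \mu_{A'}(u) = 1 - \sqrt{\mu_B(v)} = \mu_B(v) \quad\text{or}\quad \mu_B^2(v) - 3\mu_B(v) + 1 = 0 \]
By solving the above equation, we get $\mu_B(v) = \frac{3 \pm \sqrt{5}}{2} = \frac{3 - \sqrt{5}}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{3 - \sqrt{5}}{2}$. It is also observed from Figure 68 that $\mu_{A'}(u) = \frac{3 - \sqrt{5}}{2}$ is also the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and another point of the supremum turns out to be $1 - \mu_A(u)$. Therefore, $\mu_{A'}(u)$ falls between $\frac{3 - \sqrt{5}}{2}$ and $1 - \mu_A(u)$, whichever is the maximum, i.e. $\mu_{A'}(u) = \max\left(\frac{3 - \sqrt{5}}{2},\ 1 - \mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{3 - \sqrt{5}}{2} \cup (1 - \mu_A(u))$. It is noticed that MRFI does not satisfy the intuitive criterion C7 of GMT. The analytical proof is given below: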
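\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V} \min\left\{ \max\left( \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right),\ 1 - \sqrt{\mu_B(v)} \right\} \\
&= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \max(\mu_B(v),\ 1 - \mu_A(u)),\ 1 - \sqrt{\mu_B(v)} \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \max(\mu_A(u),\ 1 - \mu_A(u)),\ 1 - \sqrt{\mu_B(v)} \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \qquad (63) \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \min\left[\mu_B(v),\ 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \ge 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = \min\left[1 - \mu_A(u),\ 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) < 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{21} = \min\left[\mu_A(u),\ 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \ge 1 - \mu_A(u) \text{ or } \mu_A(u) \ge 0.5, \text{ and } \mu_B(v) > \mu_A(u) \\
y_{22} = \min\left[1 - \mu_A(u),\ 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_A(u) < 0.5 \text{ and } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{111} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v),\ \mu_B(v) \ge 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{112} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v),\ \mu_B(v) \ge 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_B(v) < 1 - \mu_A(u) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{21} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) \ge 1 - \mu_A(u) \text{ or } \mu_A(u) \ge 0.5, \text{ and } \mu_B(v) > \mu_A(u) \\
y_{22} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_A(u) < 0.5 \text{ and } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]
where $\mu_B^{\min}(v) = \frac{3 - \sqrt{5}}{2}$ is obtained by solving the equation $\mu_B(v) = 1 - \sqrt{\mu_B(v)}$.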
It is observed that $y_{111} = \mu_B(v)$ is only possible when $\mu_B(v) \le \mu_B^{\min}(v)$ and $\mu_B(v) \ge 1 - \mu_A(u)$, i.e. $1 - \mu_A(u) \le \mu_B^{\min}(v)$ or $\mu_A(u) \ge 1 - \mu_B^{\min}(v)$. In a similar way, $y_{112} = 1 - \sqrt{\mu_B(v)}$ is only possible when $\mu_B(v) > \mu_B^{\min}(v)$ and hence $\mu_A(u) \ge \mu_B^{\min}(v)$. If we observe carefully, then we find that $y_{21}$ is the same as $y_{112}$. Now we have two equations for $\mu_A(u) < \mu_B^{\min}(v)$: the first is $y_{12} = 1 - \mu_A(u)$ when $\mu_B(v) \le \mu_A(u)$ and then $y_{22} = 1 - \sqrt{\mu_B(v)}$ when $\mu_B(v) > \mu_A(u)$. Hence, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ can be divided into two regions. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < \mu_B^{\min}(v)$ and the second region consists of $y_{111}$ and $y_{21}$ when $\mu_A(u) \ge \mu_B^{\min}(v)$. We observe that the supremum of the first region will be $y_{12}$, i.e. $1 - \mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{111}$ and $y_{21}$. This intersection point is computed by solving the equality $1 - \sqrt{\mu_B(v)} = \mu_B(v)$ and turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$ only, otherwise whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup (1 - \mu_A(u))$.
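The value of $\mu_B^{\min}(v)$ used in this case can be verified with a short substitution. The step below is a worked sketch added for illustration; the auxiliary variable $t$ is an assumption of this sketch and does not appear in the original derivation.

```latex
\begin{aligned}
\mu_B(v) = 1 - \sqrt{\mu_B(v)}, \quad t = \sqrt{\mu_B(v)} \ge 0
  &\;\Longrightarrow\; t^2 + t - 1 = 0 \\
  &\;\Longrightarrow\; t = \frac{\sqrt{5} - 1}{2} \approx 0.618 \\
  &\;\Longrightarrow\; \mu_B^{\min}(v) = t^2 = \frac{3 - \sqrt{5}}{2} \approx 0.382
\end{aligned}
```

The other root of $\mu_B^2(v) - 3\mu_B(v) + 1 = 0$, namely $\frac{3 + \sqrt{5}}{2} \approx 2.618$, lies outside $[0, 1]$ and therefore cannot be a membership grade.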
MRFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 69 that for some values of $\mu_A(u)$ (lower values), the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of two curves, starting with the curve $\mu_B(v)$ and then $1 - \mu_A(u)$. It is noticed that the supremum of these curves is $1 - \mu_A(u)$ only. Similarly, for higher values of $\mu_A(u)$, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of two curves, starting with the curve $\mu_B(v)$ and then $\mu_A(u)$. In this case, it is noticed that the supremum is $\mu_A(u)$ only. Therefore, the consequence $\mu_{A'}(u)$ is either $1 - \mu_A(u)$ or $\mu_A(u)$, whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_A(u),\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = \mu_A(u) \cup (1 - \mu_A(u))$. It is noticed that MRFI does not satisfy the intuitive criterion C8-1/C8-2 of GMT. The analytical proof is given below:
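\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V} \left\{ \min\left[ \max\left( \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right),\ \mu_B(v) \right] \right\} \qquad (64) \\
&= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \max(\mu_B(v),\ 1 - \mu_A(u)),\ \mu_B(v) \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \max(\mu_A(u),\ 1 - \mu_A(u)),\ \mu_B(v) \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]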
It is observed from $y_{11}$, $y_{12}$, $y_{21}$ and $y_{22}$ that they can be divided into two regions, based on the value of $\mu_A(u)$. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < 0.5$ and the second region consists of $y_{11}$ and $y_{21}$ when $\mu_A(u) \ge 0.5$. We also observe that the supremum of the first region is $y_{22} = 1 - \mu_A(u)$, and the supremum of the second region is $y_{21} = \mu_A(u)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_A(u)$ only, otherwise whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_A(u),\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = \mu_A(u) \cup (1 - \mu_A(u))$.
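A small script can show where the result $\max(\mu_A(u), 1 - \mu_A(u))$ departs from the two candidate conclusions, unity for C8-1 and $\mu_A(u)$ for C8-2. This is a sketch under the assumption that $\mu_B(v)$ covers $[0, 1]$; the helper names `gmt_c8_sup` and `check_c8` are hypothetical.

```python
def mrfi(a: float, b: float) -> float:
    """Max-min rule of fuzzy implication: max(min(a, b), 1 - a)."""
    return max(min(a, b), 1.0 - a)


def gmt_c8_sup(a: float, steps: int = 1000) -> float:
    """Grid approximation of sup over b in [0, 1] of min(MRFI(a, b), b)."""
    return max(min(mrfi(a, k / steps), k / steps) for k in range(steps + 1))


def check_c8(a: float, tol: float = 1e-9) -> dict:
    """Compare the supremum with unity (C8-1) and with mu_A itself (C8-2)."""
    value = gmt_c8_sup(a)
    return {
        "sup": value,
        "C8-1": abs(value - 1.0) < tol,
        "C8-2": abs(value - a) < tol,
    }


for a in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(a, check_c8(a))
```

For $\mu_A(u) < 0.5$ neither comparison holds at the sampled points, so the criterion cannot hold for all membership grades, in line with the conclusion drawn in the text.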
BRFI: C5: $\mu_{B'}(v) = 1 - \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 70 and 71 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and the premise 1, $\mu_{B'}(v)$, for the value of $\mu_A(u)$ greater than 0.5, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1 - \mu_B(v)$. This intersection point is obtained by solving the following equality:
\[ \mu_{A'}(u) = 1 - \mu_B(v) = \mu_B(v) = 0.5 \]
It is also observed from Figure 70 that $\mu_{A'}(u) = 0.5$ is also the minimum value that the consequence $\mu_{A'}(u)$ can achieve. For the value of $\mu_A(u)$ less than 0.5, another point of the supremum happens to be $1 - \mu_A(u)$. Therefore, $\mu_{A'}(u)$ falls between 0.5 and $1 - \mu_A(u)$, whichever is the maximum, i.e. $\mu_{A'}(u) = \max(0.5,\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup (1 - \mu_A(u))$. It is noticed that BRFI does not satisfy the intuitive criterion C5 of GMT. The analytical proof is given below:
It is observed that the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and the premise 1, $\mu_{B'}(v)$, can be divided into two regions based on the value of $\mu_A(u)$. The first region consists of $y_{11}$ and $y_{12}$ for $\mu_A(u) \le 0.5$, and the second region consists of $y_{21}$ and $y_{22}$ for $\mu_A(u) > 0.5$. We observe that the supremum of the first region will be $y_{11}$, i.e. $1 - \mu_A(u)$ only, and for the second region the supremum is the intersection point of $y_{21}$ and $y_{22}$. This intersection point is computed by solving the equality $1 - \mu_B(v) = \mu_B(v)$ and turns out to be 0.5. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is 0.5 only, otherwise whichever is the maximum, i.e. $\mu_{A'}(u) = \max(0.5,\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup (1 - \mu_A(u))$.
BRFI: C6: $\mu_{B'}(v) = 1 - \mu_B^2(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 72 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and the premise 1, $\mu_{B'}(v)$, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1 - \mu_B^2(v)$. This intersection point is obtained by solving the following equality:
\[ \mu_{A'}(u) = 1 - \mu_B^2(v) = \mu_B(v) \quad\text{or}\quad \mu_B^2(v) + \mu_B(v) - 1 = 0 \]
By solving the above equation, we get $\mu_B(v) = \frac{-1 \pm \sqrt{5}}{2} = \frac{\sqrt{5} - 1}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{\sqrt{5} - 1}{2}$. It is also observed from Figure 72 that $\mu_{A'}(u) = \frac{\sqrt{5} - 1}{2}$ is also the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and another point of the supremum turns out to be $1 - \mu_A(u)$. Therefore, $\mu_{A'}(u)$ falls between $\frac{\sqrt{5} - 1}{2}$ and $1 - \mu_A(u)$, whichever is the maximum, i.e. $\mu_{A'}(u) = \max\left(\frac{\sqrt{5} - 1}{2},\ 1 - \mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{\sqrt{5} - 1}{2} \cup (1 - \mu_A(u))$. It is noticed that BRFI does not satisfy the intuitive criterion C6 of GMT. The analytical proof is given below:
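\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V} \min\left\{ \max\left(1 - \mu_A(u),\ \mu_B(v)\right),\ 1 - \mu_B^2(v) \right\} \qquad (66) \\
&= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[1 - \mu_A(u),\ 1 - \mu_B^2(v)\right]; & \text{for } \mu_B(v) \le 1 - \mu_A(u) \\
y_2 = \min\left[\mu_B(v),\ 1 - \mu_B^2(v)\right]; & \text{for } \mu_B(v) > 1 - \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = 1 - \mu_A(u); & \text{for } \mu_B(v) \le \sqrt{\mu_A(u)} \text{ and } \mu_B(v) \le 1 - \mu_A(u) \\
y_{12} = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \sqrt{\mu_A(u)} \text{ and } \mu_B(v) \le 1 - \mu_A(u) \\
y_{21} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v) \text{ and } \mu_B(v) > 1 - \mu_A(u) \\
y_{22} = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \mu_B^{\min}(v) \text{ and } \mu_B(v) > 1 - \mu_A(u)
\end{cases}
\end{aligned}
\]
where $\mu_B^{\min}(v) = \frac{\sqrt{5} - 1}{2}$ is obtained by solving the equation $\mu_B(v) = 1 - \mu_B^2(v)$.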
It is observed that the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and the premise 1, $\mu_{B'}(v)$, can be divided into two regions, based on the value of $\mu_A(u)$. The first region consists of $y_{11}$ and $y_{12}$ for $\mu_A(u) \le 0.5$, and the second region consists of $y_{21}$ and $y_{22}$ for $\mu_A(u) > 0.5$. We observe that the supremum of the first region will be $y_{11}$, i.e. $1 - \mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{21}$ and $y_{22}$. This intersection point is computed by solving the equality $1 - \mu_B^2(v) = \mu_B(v)$ and turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$ only, otherwise whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup (1 - \mu_A(u))$.
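The stated result can be compared with a brute-force evaluation of Equation (66). The sketch below assumes that $\mu_B(v)$ ranges over all of $[0, 1]$ and uses a uniform grid in place of the supremum; `brfi`, `gmt_c6_sup`, `stated_result` and the constant `B_MIN` are names chosen for this illustration only.

```python
import math

# Root of b = 1 - b**2 in [0, 1], i.e. (sqrt(5) - 1) / 2.
B_MIN = (math.sqrt(5.0) - 1.0) / 2.0


def brfi(a: float, b: float) -> float:
    """Boolean rule of fuzzy implication: max(1 - a, b)."""
    return max(1.0 - a, b)


def gmt_c6_sup(a: float, steps: int = 20000) -> float:
    """Grid approximation of sup over b in [0, 1] of min(BRFI(a, b), 1 - b**2)."""
    return max(
        min(brfi(a, k / steps), 1.0 - (k / steps) ** 2)
        for k in range(steps + 1)
    )


def stated_result(a: float) -> float:
    """Result given in the text: max(B_MIN, 1 - a)."""
    return max(B_MIN, 1.0 - a)


for a in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    wanted = 1.0 - a * a  # value that criterion C6 would require
    print(f"a={a:.1f} grid={gmt_c6_sup(a):.4f} "
          f"stated={stated_result(a):.4f} C6={wanted:.4f}")
```

The grid value should track `stated_result` to within the grid spacing, and the last column makes the gap to $1 - \mu_A^2(u)$ visible for intermediate membership grades.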
BRFI: C7: $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 73 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and the premise 1, $\mu_{B'}(v)$, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$. This intersection point is obtained by solving the following equality:
\[ \mu_{A'}(u) = 1 - \sqrt{\mu_B(v)} = \mu_B(v) \quad\text{or}\quad \mu_B^2(v) - 3\mu_B(v) + 1 = 0 \]
By solving the above equation, we get $\mu_B(v) = \frac{3 \pm \sqrt{5}}{2} = \frac{3 - \sqrt{5}}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{3 - \sqrt{5}}{2}$. It is also observed from Figure 73 that $\mu_{A'}(u) = \frac{3 - \sqrt{5}}{2}$ is also the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and another point of the supremum turns out to be $1 - \mu_A(u)$. Therefore, $\mu_{A'}(u)$ falls between $\frac{3 - \sqrt{5}}{2}$ and $1 - \mu_A(u)$, whichever is the maximum, i.e. $\mu_{A'}(u) = \max\left(\frac{3 - \sqrt{5}}{2},\ 1 - \mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{3 - \sqrt{5}}{2} \cup (1 - \mu_A(u))$. It is noticed that BRFI does not satisfy the intuitive criterion C7 of GMT. The analytical proof is given below:
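\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V} \min\left\{ \max\left(1 - \mu_A(u),\ \mu_B(v)\right),\ 1 - \sqrt{\mu_B(v)} \right\} \qquad (67) \\
&= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[1 - \mu_A(u),\ 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \le 1 - \mu_A(u) \\
y_2 = \min\left[\mu_B(v),\ 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) > 1 - \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = 1 - \mu_A(u); & \text{for } \mu_B(v) \le \mu_A^2(u) \text{ and } \mu_B(v) \le 1 - \mu_A(u) \\
y_{12} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_A^2(u) \text{ and } \mu_B(v) \le 1 - \mu_A(u) \\
y_{21} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v) \text{ and } \mu_B(v) > 1 - \mu_A(u) \\
y_{22} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v) \text{ and } \mu_B(v) > 1 - \mu_A(u)
\end{cases}
\end{aligned}
\]
where $\mu_B^{\min}(v) = \frac{3 - \sqrt{5}}{2}$ is obtained by solving the equation $\mu_B(v) = 1 - \sqrt{\mu_B(v)}$.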
It is observed that the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and the premise 1, $\mu_{B'}(v)$, can be divided into two regions based on the value of $\mu_A(u)$. The first region consists of $y_{11}$ and $y_{12}$ for $\mu_A(u) \le 0.5$, and the second region consists of $y_{21}$ and $y_{22}$ for $\mu_A(u) > 0.5$. We observe that the supremum of the first region will be $y_{11}$, i.e. $1 - \mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{21}$ and $y_{22}$. This intersection point is computed by solving the equality $1 - \sqrt{\mu_B(v)} = \mu_B(v)$, and turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$ only, otherwise whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1 - \mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup (1 - \mu_A(u))$.
BRFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 74 that $\mu_{B'}(v)$ is always equal to or less than the implication $\mu_{A\to B}(u,v)$ for any value of $\mu_A(u)$; therefore the outcome of the “min” operation results in $\mu_{B'}(v)$ itself. Hence, the supremum of $\mu_{B'}(v)$ turns out to be unity (since the maximum value of $\mu_{B'}(v) = \mu_B(v)$ is unity). Therefore, it is concluded that BRFI satisfies the intuitive criterion C8-1 (not C8-2) of GMT. The analytical proof is given below:
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is always $\mu_{B'}(v)$, thus $\mu_{A'}(u) = 1$.
GRFI: C5: $\mu_{B'}(v) = 1 - \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 75 and 76 that the supremum of the outcome of the “min” operation between the curves $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is their intersection point, obtained by solving the following equality:
\[ \mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B(v), \]
since
\[ \frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B(v), \quad\text{or}\quad (1 + \mu_A(u))\,\mu_B(v) = \mu_A(u), \quad\text{or}\quad \mu_B(v) = \frac{\mu_A(u)}{1 + \mu_A(u)}, \]
\[ \mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = \frac{1}{1 + \mu_A(u)}. \]
Therefore it is concluded that GRFI does not satisfy criterion C5 of GMT. The analytical proof is given below:
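\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\ 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \qquad (69) \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le 1 - \mu_B(v) \text{ or } \mu_B(v) \le \dfrac{\mu_A(u)}{1 + \mu_A(u)}, \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_B(v); & \text{for } \mu_B(v) > \dfrac{\mu_A(u)}{1 + \mu_A(u)} \text{ and } \mu_B(v) \le \mu_A(u) \\
y_2 = 1 - \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]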
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. The outcome starts with $y_{11}$, which increases to a maximum value of $\frac{1}{1 + \mu_A(u)}$ with an increase in $\mu_B(v)$ from 0 to $\mu_B(v) \le \frac{\mu_A(u)}{1 + \mu_A(u)}$. It is observed that $y_{12}$ and $y_2$ are the same; therefore, we take $y_{12}$ or $y_2$, which starts from its maximum value of $\frac{1}{1 + \mu_A(u)}$ and decreases with any further increase of $\mu_B(v)$. Hence, the supremum of the outcome of the “min” operation is $\frac{1}{1 + \mu_A(u)}$ only.
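As a quick plausibility check of the value $\frac{1}{1 + \mu_A(u)}$, Equation (69) can be evaluated on a grid. This is a sketch, assuming $\mu_B(v)$ covers $[0, 1]$ and taking the implication as 1 when $\mu_A(u) \le \mu_B(v)$ and $\mu_B(v)/\mu_A(u)$ otherwise, as read off the two branches of Equation (69); the function names are illustrative.

```python
def grfi(a: float, b: float) -> float:
    """Goguen-type implication: 1 if a <= b, otherwise b / a (a > 0 in that branch)."""
    return 1.0 if a <= b else b / a


def gmt_c5_sup(a: float, steps: int = 20000) -> float:
    """Grid approximation of sup over b in [0, 1] of min(GRFI(a, b), 1 - b)."""
    return max(min(grfi(a, k / steps), 1.0 - k / steps) for k in range(steps + 1))


def stated_result(a: float) -> float:
    """Result given in the text: 1 / (1 + a)."""
    return 1.0 / (1.0 + a)


for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"a={a:.2f} grid={gmt_c5_sup(a):.4f} "
          f"1/(1+a)={stated_result(a):.4f} 1-a={1.0 - a:.4f}")
```

The last column is the value C5 would require; it coincides with $\frac{1}{1 + \mu_A(u)}$ only at $\mu_A(u) = 0$.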
GRFI: C6: $\mu_{B'}(v) = 1 - \mu_B^2(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 77 that the supremum of the outcome of the “min” operation between the curves $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is their intersection point, obtained by solving the following equality:
\[ \mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B^2(v), \]
since
\[ \frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B^2(v), \quad\text{or}\quad \mu_A(u)\,\mu_B^2(v) + \mu_B(v) - \mu_A(u) = 0. \]
By solving the above equation, we get
\[ \mu_B(v) = \frac{-1 \pm \sqrt{1 + 4\mu_A^2(u)}}{2\mu_A(u)} \]
and
\[ \mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = \frac{-1 \pm \sqrt{1 + 4\mu_A^2(u)}}{2\mu_A^2(u)} = \frac{\sqrt{1 + 4\mu_A^2(u)} - 1}{2\mu_A^2(u)}. \]
Therefore it is concluded that GRFI does not satisfy criterion C6 of GMT. The analytical proof is given below:
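\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\ 1 - \mu_B^2(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ 1 - \mu_B^2(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \qquad (70) \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le 1 - \mu_B^2(v) \text{ or } \mu_B(v) \le \mu_B^{\min}(v), \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \mu_B^{\min}(v) \text{ and } \mu_B(v) \le \mu_A(u) \\
y_2 = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]
where $\mu_B^{\min}(v) = \frac{\sqrt{1 + 4\mu_A^2(u)} - 1}{2\mu_A(u)}$ is obtained by solving the equation $\frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B^2(v)$.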
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. The outcome starts with $y_{11}$, which increases to a maximum value of $\mu_B^{\max}(v)$ with an increase in $\mu_B(v)$ from 0 to $\mu_B(v) \le \mu_B^{\min}(v)$. It is observed that $y_{12}$ and $y_2$ are the same; therefore, we take $y_{12}$ or $y_2$, which starts from its maximum value of $\mu_B^{\max}(v)$ and decreases with any further increase of $\mu_B(v)$. Hence, the supremum of the outcome of the “min” operation is $\mu_B^{\max}(v)$ only, where $\mu_B^{\max}(v) = \frac{\sqrt{1 + 4\mu_A^2(u)} - 1}{2\mu_A^2(u)}$.
GRFI: C7: $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 78 that the supremum of the outcome of the “min” operation between the curves $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is their intersection point, obtained by solving the following equality:
\[ \mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = 1 - \sqrt{\mu_B(v)}, \]
since
\[ \frac{\mu_B(v)}{\mu_A(u)} = 1 - \sqrt{\mu_B(v)}, \quad\text{or}\quad \mu_B^2(v) - \left(2\mu_A(u) + \mu_A^2(u)\right)\mu_B(v) + \mu_A^2(u) = 0. \]
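The step from the intersection equality to the quadratic, and from there to the supremum quoted for this case, can be written out as follows. This is a worked sketch for $\mu_A(u) > 0$ added for illustration; the choice of the smaller root is justified here by the requirement $1 - \mu_B(v)/\mu_A(u) = \sqrt{\mu_B(v)} \ge 0$, a reasoning step that is not spelled out in the text.

```latex
\begin{aligned}
\frac{\mu_B(v)}{\mu_A(u)} = 1 - \sqrt{\mu_B(v)}
  &\;\Longrightarrow\; \mu_B(v) = \left(1 - \frac{\mu_B(v)}{\mu_A(u)}\right)^2 \\
  &\;\Longrightarrow\; \mu_B^2(v) - \left(2\mu_A(u) + \mu_A^2(u)\right)\mu_B(v) + \mu_A^2(u) = 0 \\
  &\;\Longrightarrow\; \mu_B(v) = \frac{2\mu_A(u) + \mu_A^2(u) \pm \mu_A(u)\sqrt{\mu_A^2(u) + 4\mu_A(u)}}{2} \\
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)}
  &= \frac{2 + \mu_A(u) - \sqrt{\mu_A^2(u) + 4\mu_A(u)}}{2}
\end{aligned}
```

The root with the plus sign gives $\mu_B(v)/\mu_A(u) \ge 1$, which would make $1 - \mu_B(v)/\mu_A(u)$ non-positive; it is introduced by the squaring step and is discarded. For $\mu_A(u) = 1$ the result reduces to $\frac{3 - \sqrt{5}}{2}$.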
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. The outcome starts with $y_{11}$, which increases to a maximum value of $\mu_B^{\max}(v)$ with an increase in $\mu_B(v)$ from 0 to $\mu_B(v) \le \mu_B^{\min}(v)$. It is observed that $y_{12}$ and $y_2$ are the same; therefore, we take $y_{12}$ or $y_2$, which starts from its maximum value of $\mu_B^{\max}(v)$ and decreases with any further increase of $\mu_B(v)$. Hence, the supremum of the outcome of the “min” operation is $\mu_B^{\max}(v)$ only, where $\mu_B^{\max}(v) = \frac{2 + \mu_A(u) - \sqrt{\mu_A^2(u) + 4\mu_A(u)}}{2}$.
GRFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 79 that $\mu_{B'}(v)$ is always equal to or less than the implication $\mu_{A\to B}(u,v)$ for any value of $\mu_A(u)$; therefore the outcome of the “min” operation results in $\mu_{B'}(v)$ itself. Hence, the supremum of $\mu_{B'}(v)$ turns out to be unity (since the maximum value of $\mu_{B'}(v) = \mu_B(v)$ is unity). Therefore, it is concluded that GRFI satisfies the intuitive criterion C8-1 (not C8-2) of GMT. The analytical proof is given below:
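\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\ \mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ \mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \qquad (72) \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le \mu_B(v) \text{ or } \mu_A(u) > 1, \text{ and } \mu_B(v) \le \mu_A(u) \\
y_{12} = \mu_B(v); & \text{for } \mu_A(u) < 1 \text{ and } \mu_B(v) \le \mu_A(u) \\
y_2 = \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\]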
It is observed that $y_{11}$ is not valid as $\mu_A(u) > 1$ is not possible. Hence, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is always $\mu_{B'}(v)$, thus $\mu_{A'}(u) = 1$.
6 Discussions
Tables 3 and 4 below summarize the results of the investigation [2] of the various implication methods against the intuitive criteria of GMP and GMT. The tables show that MORFI and PORFI satisfy exactly the same intuitive criteria of GMP and GMT, four in total. A similar observation is made for ARFI, BRFI and GRFI: each of these methods satisfies only two intuitive criteria of GMP and GMT. MRFI satisfies the smallest number of criteria, namely one. A logical explanation of these observations follows from the curve profiles of the methods. Figures 6 and 7 show that the curve profiles of MORFI and PORFI (as far as the shape of their envelope is concerned) are similar, both starting from the origin and ending at the membership grade $\mu_B(v)$. Similarly, Figures 8, 10 and 11 show that ARFI, BRFI and GRFI, respectively, also have a similar curve profile, each starting at unity and finally converging to $\mu_B(v)$. MRFI has a unique curve profile that does not match that of any other method, which makes MRFI a separate member among the existing implication methods. Finally, it can be concluded that the similarity in curve profile within each group (MORFI and PORFI in one group; ARFI, BRFI and GRFI in another) probably leads to an equal number of satisfied intuitive criteria of GMP and GMT.
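The curve profiles referred to above can be reproduced qualitatively by sweeping $\mu_A(u)$ from 0 to 1 for a fixed $\mu_B(v)$. The sketch below is not the chapter's MATLAB package. The definitions of ARFI, MRFI, BRFI and GRFI are read off the equations of the preceding section, whereas the definitions used for MORFI (minimum) and PORFI (product) are assumptions of this sketch, as are the helper names and the sample value $\mu_B(v) = 0.8$.

```python
def morfi(a, b):
    return min(a, b)  # assumed: mini-operation rule

def porfi(a, b):
    return a * b  # assumed: product-operation rule

def arfi(a, b):
    return min(1.0, 1.0 - a + b)

def mrfi(a, b):
    return max(min(a, b), 1.0 - a)

def brfi(a, b):
    return max(1.0 - a, b)

def grfi(a, b):
    return 1.0 if a <= b else b / a

RULES = {"MORFI": morfi, "PORFI": porfi, "ARFI": arfi,
         "MRFI": mrfi, "BRFI": brfi, "GRFI": grfi}


def profile(rule, b, steps=100):
    """Implication values as mu_A sweeps from 0 to 1 for a fixed mu_B."""
    return [rule(k / steps, b) for k in range(steps + 1)]


def describe(values, tol=1e-12):
    """Return the start value, the end value and the trend of a profile."""
    diffs = [later - earlier for earlier, later in zip(values, values[1:])]
    if all(d >= -tol for d in diffs):
        trend = "non-decreasing"
    elif all(d <= tol for d in diffs):
        trend = "non-increasing"
    else:
        trend = "non-monotonic"
    return values[0], values[-1], trend


B = 0.8  # illustrative membership grade mu_B(v)
for name, rule in RULES.items():
    start, end, trend = describe(profile(rule, B))
    print(f"{name}: starts at {start:.2f}, ends at {end:.2f}, {trend}")
```

For this choice of $\mu_B(v)$ the sweep groups the rules in the same way as the discussion: two profiles rise from the origin, three fall from unity, and MRFI first falls and then rises. For $\mu_B(v) \le 0.5$ the MRFI sweep no longer rises again, so the grouping shown here is specific to the sample value.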
7 Conclusions
A systematic approach has been followed to find out whether any of the existing im-
plication methods match with a given set of intuitive criteria of GMP and GMT. For
that, MATLAB with graphics is used to develop a user interactive package to eval-
uate the implication methods with respect to those criteria. The results are provided
in terms of tables and figures. It is found that the graphical method of investigation
is much quicker and requires less effort from the user as compared to the analyti-
cal method. Also, the analytical method seeks diagnosis of various curves (i.e. the
nature of these curves with respect to variation of the fuzzy sets and) involved in
finding the consequences when intuitive criteria of GMP and GMT are applied to
various implication methods.
References
1. S.K. Kashyap and J.R. Raol. Unification and Interpretation of Fuzzy Set Operations.
CCECE/CCGEI, IEEE Electrical and Computer Engineering Conference, Ottawa, Canada,
May 7–10, 2006.
2. Li-Xin Wang. Adaptive Fuzzy Systems and Control, Design and Stability Analysis. Prentice-
Hall, Englewood Cliffs, NJ, 1994.
FzController: A Development Environment
for Fuzzy Controllers
1 Introduction
As is well known, the theory of fuzzy sets, and hence fuzzy logic, was introduced by L.A. Zadeh in the mid-1960s as a way to describe the mechanisms of approximate inference performed in the human brain [1]. Since then, automatic process control has been the field in which applications of fuzzy logic have gained the most importance, as demonstrated by the diversity of registered patents based on fuzzy logic and by the very large number of papers presented at congresses and published in specialized journals over the last three decades. In spite of these facts, and in order to help users, it is still necessary to develop CAD tools that allow the specification, verification and synthesis of fuzzy controllers.
In past years many tools have been developed for building systems based on fuzzy logic. Without doubt, the tool with the largest number of users is the MATLAB package for fuzzy logic. This tool, which has all the potential that this powerful package offers, presents some basic drawbacks: it allows neither a hardware implementation of the designed controller nor the direct synthesis of controllers in industrial devices such as Programmable Logic Controllers (PLC). Another tool is XFuzzy [3–5], which is very useful for those who want to implement the designed controller in hardware, but it does not allow the direct synthesis of the controller in industrial devices such as PLCs, nor does it offer facilities for identifying the process to be controlled.
Many other tools have been developed that use either a graphic interface or a description language for the specification of the system, but in general most of them have serious limitations in the fuzzy operators they implement, and/or are closed systems that do not allow users to implement their own operators, or are development systems not aimed at specific technologies [6].
This chapter presents FzController, a prototype tool that, although already used in practice, is still under development. The tool is characterized by a friendly and clear graphic interface which, besides avoiding some of the above-mentioned drawbacks, allows users to perform the following main actions:
1. Process identification
2. Fuzzy controllers design using graphic design tools
3. Real-time control
4. Automatic generation of code for PLC and high level languages
Each action is associated with a module in the system.
To present the tool, the chapter is organized as follows. Section 2 presents the general diagram of the system, to give an overall idea. Section 3 presents the basic characteristics of the modules for identification, fuzzy controller design, real-time control and automatic code generation, illustrating the main distinctive features of this development environment with regard to other existing ones. Finally, some conclusions are pointed out.
As is well known [2], there are two basic methods to implement fuzzy systems, the exact one and the approximate one, each with its pros and cons.
The exact method is based on studying the form that the fuzzy sets adopt under each implication operator. As a whole, a parametric representation of the inferred fuzzy sets is obtained. The inconvenience of this method is that the parametric expressions of the fuzzy sets have to be computed before implementing the controller [7].
The basic characteristic of the approximate method is that no previous computation is necessary, since the universe of discourse of each variable of the consequent is defined as a finite discrete set. From a computational point of view, this makes the method more time consuming, and the accuracy is given by the number of points in the universe of discourse. Hence it is necessary to reach a balance between computational speed and accuracy. The method has the advantage of being able to work with a larger number of implication operators, since it does not require a previous study of the parameterized expressions of the fuzzy sets [7]. Thus the implication, aggregation and defuzzification operators act on each of the elements of the vectors obtained as a result of the discretization process. The implementation of the FzController system uses the approximate method.
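The element-by-element processing on a discretized universe of discourse can be sketched as follows. This is a minimal illustration, not FzController code: the helper names (`triangle`, `discretize`, `infer`, `centroid`, `control_output`), the two consequent sets on $[0, 10]$, the choice of minimum for the implication, maximum for the aggregation and centre of gravity for the defuzzification are all assumptions of the sketch.

```python
def triangle(y, left, peak, right):
    """Triangular membership function (hypothetical helper)."""
    if y <= left or y >= right:
        return 0.0
    if y <= peak:
        return (y - left) / (peak - left)
    return (right - y) / (right - peak)


def discretize(low, high, points):
    """Finite discrete universe of discourse with `points` samples."""
    step = (high - low) / (points - 1)
    return [low + k * step for k in range(points)]


def infer(firing, consequents, universe, implication=min, aggregation=max):
    """Apply implication and aggregation to each element of the universe."""
    aggregated = [0.0] * len(universe)
    for strength, membership in zip(firing, consequents):
        for k, y in enumerate(universe):
            clipped = implication(strength, membership(y))
            aggregated[k] = aggregation(aggregated[k], clipped)
    return aggregated


def centroid(universe, membership):
    """Centre-of-gravity defuzzification on the discrete universe."""
    total = sum(membership)
    if total == 0.0:
        raise ValueError("no rule fired; the centroid is undefined")
    return sum(y * m for y, m in zip(universe, membership)) / total


def low(y):
    return triangle(y, 0.0, 2.5, 5.0)


def high(y):
    return triangle(y, 5.0, 7.5, 10.0)


def control_output(firing, points):
    """Crisp output for given rule firing strengths and grid size."""
    universe = discretize(0.0, 10.0, points)
    return centroid(universe, infer(firing, [low, high], universe))


# More points give a finer approximation at a higher computational cost.
for points in (11, 101, 1001):
    print(points, control_output([0.8, 0.3], points))
```

Because `implication` and `aggregation` are ordinary function arguments, a different operator can be substituted without deriving any parametric expression first, which is the flexibility attributed to the approximate method above.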
The FzController system is developed for the Windows environment; for efficient operation it needs, as a minimum configuration, any Pentium computer with 128 MB of RAM and 40 MB of hard disk space.
Figure 1 presents the structure of the FzController system and the flow of information among the different modules.
[Fig. 1 Structure of the FzController system. Blocks: editor of operators defined by the user; code editor (VBScript, JavaScript, DelphiScript); module of automatic generation of code for PLC and high-level programming languages; graphic editor of systems (controller, variables); rules editor; properties editor; real-time control module; interface; industrial plant; graphic visualization of rules and inference process; graphic representation of system response; control surface.]
3 Modules in FzController
As said above, there are four main modules in FzController. Each of them is described in the following.
[Fig. 2 Basic outline of the pattern of parallel system identification. Blocks and signals: Plant, fuzzy logic, $u$, $y$, $\hat{y}$, $e$.]
[Fig. 3 Basic outline of the series-parallel system identification. Blocks and signals: Plant, fuzzy logic, $u$, $y$, $\hat{y}$, $e$.]
a) Graphic editor
b) Properties editor
c) Variables editor
d) Rules editor
e) Operators defined by the user editor
f) Graphic visualization of the rules and the inference process
g) Graphic representation of the system answer, control surface
The first five elements allow the controller’s specification, and the last two the
verification of its operation.
a) Graphic editor
When a fuzzy controller is designed, the first step is to select the controller
structure to be implemented. FzController supports controllers of the Sugeno and
Mamdani types (the classic structures for a fuzzy controller).
The system editor (Figure 4) allows the user to select the type of controller
(Sugeno or Mamdani) and to define its linguistic variables. Once the controller
has been selected, its logical operators are defined. In addition, the
controller's linguistic variables are added in the system editor, together with
their universes of discourse and their linguistic labels. The editor's main
characteristic is that it simplifies the specification of the system.
b) Properties editor
By means of the properties editor (Figure 5) the controller's fuzzy operators are
edited. Here the user selects the operators for the connective AND, the
connective OR, the implication, the aggregation, the addition operator for
conjunction and disjunction and the defuzzification method, as well as the
controller's name.
Table 1 shows the operators that FzController implements by default.
One of the characteristics that enhance the system is that it is a general-purpose
system that allows the implementation of any operator defined by the user. We
return to this characteristic later on.
The properties editor also allows the user to modify, in a very simple way, each
of the parameters of the membership functions or fuzzy sets defined for each
variable.
c) Variables editor
The variables editor allows the user to edit the membership functions of each
linguistic variable defined in the system.
By default, FzController implements membership functions of trapezoid type
(which includes triangular functions as a particular case), S-function,
Z-function, Pi-function, Gauss bell and singleton type (for a controller with a
Sugeno structure). In addition, the system lets the user define membership
functions by means of mathematical expressions or by a vector (Fig. 6).
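A few of these membership function types can be sketched as follows. This is only an illustration under stated assumptions: the function names and parameterizations are hypothetical, not FzController's own, and the S-, Z- and Pi-functions are omitted because their exact definitions vary between sources.

```python
import math


def trapezoid(a, b, c, d):
    """Trapezoid with feet a, d and plateau [b, c]; b == c gives a triangle."""
    def mu(x):
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def gaussian(center, sigma):
    """Gauss bell centred at `center` with width parameter `sigma`."""
    def mu(x):
        return math.exp(-0.5 * ((x - center) / sigma) ** 2)
    return mu


def singleton(value):
    """Singleton, as used for the consequents of a Sugeno-type controller."""
    def mu(x):
        return 1.0 if x == value else 0.0
    return mu


def from_vector(xs, values):
    """Membership function given by samples (xs increasing), linearly interpolated."""
    def mu(x):
        if x < xs[0] or x > xs[-1]:
            return 0.0
        for k in range(len(xs) - 1):
            if xs[k] <= x <= xs[k + 1]:
                t = (x - xs[k]) / (xs[k + 1] - xs[k])
                return values[k] + t * (values[k + 1] - values[k])
        return values[-1]
    return mu
```

Representing every type as a plain function of x keeps the rest of an inference routine independent of how a particular membership function was specified.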
d) Rules editor
The rule base of a fuzzy controller contains the information, or logical
connection, between the input and output linguistic variables of the system. The
FzController system works with MISO rules (Multiple Inputs, Single Output) and
allows the controller's rules to be added in a simple way, without syntax errors.
Figure 7 shows the window of the rule base editor.
394 I. Alvarez-López et al.
When users implement their own operators, they can analyze not only the behavior
of an operator by means of the editor of operators, but also its effect on the
controller's response by studying the inference process graphically. Through an
exhaustive analysis of the inference process, the designed operator can be
corrected to ensure the desired response of the system.
As can be seen in Figure 8, this screen shows not only the activated (implied)
fuzzy set of each rule but also the global fuzzy set that results from the
aggregation process.
Through analysis of the control surface one can identify the combinations of
inputs for which the behavior of the system is not the desired one, and fix it
by changing the rules, the operators or the membership functions of the
linguistic variables.
When a controller has been designed for an industrial plant or process, it is
important to check the results not only in simulation but also on a real plant.
The purpose of the real-time control module is to control a real plant
experimentally, in order to improve the designed controller before synthesizing
it for a PLC. A data acquisition card is used to interact with the plant or
process.
This again shows the general-purpose character of the system, since it can work
with any data acquisition device. Communication with the data acquisition card
is carried out by means of a dynamic link library (DLL) that may be programmed
and added by the final user of the system.
The real-time control module also performs the signal conditioning, which
consists of filtering, scale adjustment, or obtaining new signals as the result
of a mathematical transformation or operation on the signals read directly from
the process (Fig. 11).
FzController also allows filtering algorithms, scale adjustment, and derivative,
error and integration functions to be applied to the signals read from the
physical process or sent to the process by means of the data acquisition
device.
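The kinds of signal conditioning listed above can be sketched in a few lines. The following is a hypothetical illustration, not the algorithms shipped with FzController: the moving-average filter, the backward-difference derivative and the rectangular integration rule are assumptions chosen for simplicity.

```python
def rescale(value, in_min, in_max, out_min, out_max):
    """Scale adjustment: map [in_min, in_max] linearly onto [out_min, out_max]."""
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)


def moving_average(samples, window):
    """Simple low-pass filter: mean of the last `window` samples seen so far."""
    out = []
    for k in range(len(samples)):
        recent = samples[max(0, k - window + 1): k + 1]
        out.append(sum(recent) / len(recent))
    return out


def error_signals(setpoint, measurements, dt):
    """Return the error, its backward-difference derivative and its running integral."""
    errors = [setpoint - m for m in measurements]
    derivative = [0.0] + [(errors[k] - errors[k - 1]) / dt
                          for k in range(1, len(errors))]
    integral, acc = [], 0.0
    for e in errors:
        acc += e * dt          # rectangular rule
        integral.append(acc)
    return errors, derivative, integral
```

The error and its derivative are the typical inputs of a PD-like fuzzy controller, which is why such derived signals are usually computed before the rule base is evaluated.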
Nowadays most of the industrial applications that have been developed using fuzzy
logic or fuzzy systems have been implemented using PLCs with fuzzy processing
modules or customized hardware [11, 12]. With the module for automatic generation
of code for PLCs it is possible to implement the designed controller in native
PLC code. This feature allows as many fuzzy controllers as desired to be
implemented without the need to incorporate an additional fuzzy processing
module. To this end, the FzController system is able to generate the controller's
code in structured text, using functions of the standard IEC 61131-3.
The standard IEC 61131-3 [14] has a great impact in the world of industrial
control, and this is not restricted to the conventional PLC market. Its adoption
provides many benefits for users and programmers, depending on the application
area: process control, system integration, education, programming, maintenance
and system installation, among others. IEC 61131-3 is the result of a great
effort carried out by 7 multinational companies with many years of experience in
the field of industrial automation. The standard specifies the syntax and
semantics of a programming language (structured text), including the software
model and the structure of the language. Structured Text (ST) is a high-level
language with origins in Ada, Pascal and C. It may be used to code complex
expressions and nested instructions. The language has structures for loops
(REPEAT-UNTIL; WHILE-DO), conditional execution (IF-THEN-ELSE; CASE) and
functions (SQRT, SIN, etc.). The generated code makes use of a library of
functions in which all the operators and membership functions that the system
works with are implemented. This library of functions is freely distributed.
Initially it was developed for “Panasonic” PLCs (formerly NAIS). Being standard
code, it is in principle valid for any PLC whose development environment
incorporates the standard IEC 61131-3.
The introduction of this module, which is the fundamental distinctive
characteristic of the system with regard to other well-known systems, offers the
following advantages in the implementation of fuzzy logic control systems in PLCs:
• An unlimited number of operators and fuzzy sets that can be used
• Versatility of the generated code
• Possibility to implement as many controllers as desired, as far as the
limitations of the CPU and the scan cycle allow
• Lower cost than other existing solutions
• Possibility to implement fuzzy control systems in industrial plants that are
already operative, with no new investments
• Shorter development times for final applications
4 Conclusions
This chapter has presented a prototype system, called FzController, which
constitutes an important tool for the development, implementation and real-time
control of a plant using a fuzzy controller. The system offers many tools that
cover to a large extent the different stages of specification, verification and
synthesis in the design of a fuzzy control system.
At this moment FzController is freely distributed (upon request to the authors)
and it runs on any Windows OS. The authors are working on a version for Linux, as
well as on the incorporation of learning methods and on the automatic adjustment
of the controller.
FzController is able to generate code for PLCs. The generated code complies with
the standard IEC 61131-3. Initially the work targeted Panasonic PLCs; at present
it targets Siemens PLCs. Because it follows a standard, the generated code should
be valid for any PLC that complies with the standard. Work is also under way on
the option of code generation for high-level languages.
References
1. L.A. Zadeh. Fuzzy Sets. Information and Control 8 pp. 338–358, 1965
2. P. Bonissone. Fuzzy Logic and Soft Computing. Technology Development and Applications.
GE Technical Report, 1997
3. http://www.imse.cnm.es/Xfuzzy/
4. F.J. Moreno, I. Baturone, S. Sánchez and A. Barriga. Rapid design of fuzzy systems with
XFUZZY. IEEE International Conference on Fuzzy Systems FUZZ-IEEE, pp. 342–347, 2003
5. D.R. López, S. Sánchez-Solano and A. Barriga. XFL: a fuzzy logic systems language. Sixth
IEEE International Conference on Fuzzy Systems 3, pp. 1585–1591, 1997
6. J. Schwarz. Motorola microcontroller as the platform for fuzzy applications. In Scientific Inter-
national Conference on Communications, Signal and Systems CSS’96, Brno, Czech Republic,
AMSE, Sept. 1996
7. O. Cordón, F. Herrera and A. Peregrı́n. A Practical Study on the Implementation of Fuzzy
Logic Controllers. The International Journal of Intelligent Control and Systems 3, pp. 49–91,
1991
8. J.L. Castro. Fuzzy Logic Controllers are Universal Approximators. IEEE transactions on Sys-
tems, Man and Cybernetics 25, pp. 629–635, 1995
9. J.L. Castro and M. Delgado. Fuzzy Systems with Defuzzification are Universal Approximators.
IEEE transactions on Systems, Man and Cybernetics. 26, pp. 149–152, 1996
10. K.S. Narendra and K. Parthasarathy. Identification and control of dynamical systems using
neural networks. IEEE Transactions on Neural Networks 1 (1), pp. 4–27, 1990
11. J. Balcells and J.L. Romeral. Programable Automata. Marcombo, Madrid (In Spanish), 1997
12. U. Michel. Industry Programable Automata. Marcombo, Madrid (In Spanish), 1990
13. AENOR: “UNE-EN-61131-1,2,3”, 1994
14. http://www.nais-e.com/plc/uacs/plc dl manual.html
A Consistency Criterion for Optimizing
Defuzzification in Fuzzy Control
1 Introduction
Fuzzy control ([16]) is used in a wide range of applied sciences, including
physics, electronics and economics. It is based on the concept of fuzzy sets as
introduced by L.A. Zadeh ([14] and [15]), which extends the notion of a
membership function from a two-valued logic to one in which the values vary
continuously within I = [0, 1]. The reason for its vast success is its fairly
simple computational behaviour; its obvious weakness, however, is the inherently
heuristic nature of the design of such a fuzzy controller. The wide choice of
shapes and parameters for the control variables shows the need for a solid
mathematical foundation, next to some obvious heuristic constraints which the
controlled system has to satisfy.
As anyone familiar with the basic concept of fuzzy control knows, three key
issues in the design of a fuzzy control system are:
• The choice of a suitable set of fuzzy variables, being functions on the space
in which control measurements are performed. Mostly these will be functions α
from R (or a part thereof) to I. Either the space R or an interval [a, b] will
be denoted as X.
• The choice of an implication function, or, equivalently, a set of linguistic
rules, each of the type
where the denoted variables Xi are linguistic and linked to the fuzzy membership
sets αi, coupled with an aggregation function to combine the consequences of
these assertions, and
• The choice of a suitable defuzzification method, assigning one crisp value to
the aggregated consequence function.
A fuzzy controller is then designed by choosing in a suitable way any
combination of the three above. As for the fuzzy variables, every input variable
of the fuzzy controller has a finite collection of rule antecedents consisting of
fuzzy functions, which we will denote by $\{\alpha_i : X \to I\}_{i=1}^{n}$, and
a consequence rule $\beta : X \to I$.
Definition 1. For one such function α : X −→ I, the support will be defined as
If two rule antecedents αi and α j are not disjoint, they will be called overlapping.
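Definition 3. On the other hand, the core of a fuzzy set $\alpha : X \to I$ will
be defined as
$$\operatorname{core}\,\alpha := \Bigl\{x \in X : \alpha(x) = \sup_{y \in X} \alpha(y)\Bigr\}.$$
Definition 4. Given two non-disjoint fuzzy sets $\alpha_i$ and $\alpha_j$, we will
say that $\alpha_i$ supercentrally overlaps $\alpha_j$ if and only if
$\operatorname{core}\,\alpha_i \subseteq \operatorname{supp}\,\alpha_j$, and
$\alpha_i$ subcentrally overlaps $\alpha_j$ if and only if
$\operatorname{core}\,\alpha_i \cap \operatorname{supp}\,\alpha_j = \emptyset$.
If $\alpha_i$ overlaps $\alpha_j$ both subcentrally and supercentrally, we say
that $\alpha_i$ centrally overlaps $\alpha_j$.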
where $A^i_{j_i}$ is the value of the $j$-th term of the linguistic variable $i$
corresponding to the antecedent membership function $\alpha^i_{j_i}$, and $B_j$
is the value of the $j$-th term of the linguistic variable corresponding to the
consequence membership function $\beta_j$, then the aggregation of the rules is
made by calculating
$$k_r(x) := \bigwedge_{i=1}^{n} \alpha^i_{j_i}(x_i)$$
for each of the input vectors $x = (x_1, \ldots, x_n)$, and determining the
consequence fuzzy set as
$$\mu_x(y) := \rho(x, y) = \bigvee_{r} \bigl(\beta_j(y) \wedge k_r(x)\bigr).$$
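As a small worked illustration of the two formulas above, take n = 2 inputs and two rules r = 1, 2. The membership values are invented for the example, and minimum and maximum are assumed for the conjunction and the aggregation respectively.

```latex
\begin{align*}
k_1(x) &= 0.7 \wedge 0.4 = 0.4, &
k_2(x) &= 0.2 \wedge 0.9 = 0.2, \\
\mu_x(y) &= \bigl(\beta_1(y) \wedge 0.4\bigr) \vee \bigl(\beta_2(y) \wedge 0.2\bigr).
\end{align*}
```

Each consequence set is thus clipped at the firing strength of its own rule before the clipped sets are combined pointwise.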
$$a\mu + b : X \to I, \qquad x \mapsto a\mu(x) + b$$
(of course on condition that this is well defined). Then the defuzzification
value should not be changed, or, in other words,
$$\forall \mu \in F(X) \text{ such that } a\mu + b \in F(X) : \quad D(a\mu + b) = D(\mu).$$
A defuzzifier $D$ that satisfies this property will be called ordinal
scale-invariant ([7], [9], nicely summarized in [12]).
2 Any positive affine transformation on $X$ should induce the inverse affine
transformation on the defuzzification value. Stated differently, for all
$\mu \in F(X)$, for all $a \in \mathbb{R}_0$ and $b \in \mathbb{R}$, define
$$\mu^{a,b} : X \to I, \qquad x \mapsto \mu\!\left(\frac{x - b}{a}\right)$$
(of course again on condition that this is well defined). Then the
defuzzification value should be
$$D\bigl(\mu^{a,b}\bigr) = aD(\mu) + b.$$
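The second requirement can be checked numerically for a concrete defuzzifier. The sketch below does so for the centre of gravity with a midpoint-rule approximation; the triangular test set, the values a = 2 and b = 5 and all function names are assumptions made for the illustration.

```python
def triangle(a, c, b):
    """Triangular membership function through (a, 0), (c, 1), (b, 0), a < c < b."""
    def mu(x):
        if a <= x <= c:
            return (x - a) / (c - a)
        if c < x <= b:
            return (b - x) / (b - c)
        return 0.0
    return mu


def cog(mu, lo, hi, n=2000):
    """Centre of gravity of mu on [lo, hi], midpoint rule with n cells."""
    h = (hi - lo) / n
    ys = [lo + (k + 0.5) * h for k in range(n)]
    weights = [mu(y) for y in ys]
    return sum(y * w for y, w in zip(ys, weights)) / sum(weights)


mu = triangle(0.0, 1.0, 3.0)
a, b = 2.0, 5.0


def mu_ab(x):
    """The transformed set x -> mu((x - b) / a)."""
    return mu((x - b) / a)


lhs = cog(mu_ab, a * 0.0 + b, a * 3.0 + b)   # D(mu^{a,b}) on the transformed domain
rhs = a * cog(mu, 0.0, 3.0) + b              # a * D(mu) + b
```

The two values agree up to rounding, because the affine map carries the integration grid of one integral onto that of the other.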
Some of the most commonly used criteria satisfying these conditions include:
Ideally, the following assumptions should hold for all applicable f : X −→ X, but
one can easily see that this demand is far too strict; an appropriate choice of
such f has to be made. In order to achieve a certain degree of regularity, we
choose for f : X −→ X the most obvious functions. A study of all such possible
functions is well beyond the scope of this article, although it is an
interesting topic for further research; in the remainder of this article we will
suppose that f is the identity mapping. So the collection of rule antecedents
α : X −→ I is used, unchanged, as the collection of fuzzy consequence variables,
and this for each rule.
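The function $\theta : X \times P^*(F(X)) \to X$ associates with each input value
$x$ and each antecedent (and consequent) fuzzy set collection
$\{\alpha_i\}_{i=1}^{n}$ in $P^*(F(X))$ an output
$\theta(x, \Xi) = D_{***}(\mu_x)$ for some fixed choice of a defuzzifier $***$,
where $\mu_x : X \to I$ is derived from the rule base
$\Xi = \{\alpha_i : X \to I\}_{i=1}^{n}$ as described above. Schematically,
$$\theta : X \times P^*(F(X)) \to X, \qquad (x, \Xi) \mapsto D_{***}(\mu_x),$$
$$X \times P^*(F(X)) \xrightarrow{\ \theta^*\ } F(X) \xrightarrow{\ D_{***}\ } X, \qquad (x, \Xi) \mapsto \mu_x \mapsto D_{***}(\mu_x).$$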
Ideally, we look for a Ξ and a D such that θ(·, Ξ) is identical to the function f
we started with; in this case, ideally, θ(·, Ξ) should be the identity function.
We find however that this is almost never the case, apart from some degenerate
situations which never occur in practice. While this is understandable in the
case of a discontinuous defuzzifier such as MOM, it is surprising to see that a
continuous defuzzifier such as COG does not satisfy this property either. The
continuity of the restriction of the function θ to some fixed rule base in
P ∗ (F(X)), as a function X −→ X (omitting the need for a topology on F(X)),
depends strongly on the
The triangular-shaped fuzzy set passing through the points (a, 0), (c, 1) and (b, 0)
with c ∈ ]a, b[ is defined by
$$\alpha(x) = \begin{cases}
\dfrac{x-a}{c-a} & \text{if } x \in [a, c] \\[1ex]
\dfrac{x-b}{c-b} & \text{if } x \in [c, b] \\[1ex]
0 & \text{otherwise}
\end{cases}$$
[Fig. 3 Single controller]
$$\int_a^b y\,\mu_x(y)\,dy = \int_a^b y\,\bigl(\alpha(x) \wedge \alpha(y)\bigr)\,dy = \frac{(a+b)(x-a)(b-x)}{b-a}$$
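The closed form for this integral can be checked numerically. The sketch below assumes a single symmetric triangular controller on [a, b], with its top at (a + b)/2, and uses a midpoint rule; the numerical values and function names are chosen only for the illustration.

```python
def triangle(a, c, b):
    """Triangular membership function through (a, 0), (c, 1), (b, 0), a < c < b."""
    def mu(x):
        if a <= x <= c:
            return (x - a) / (c - a)
        if c < x <= b:
            return (b - x) / (b - c)
        return 0.0
    return mu


def clipped_moment_and_area(a, b, x, n=4000):
    """Midpoint-rule values of the integrals of y*mu_x(y) and mu_x(y) over [a, b]."""
    alpha = triangle(a, (a + b) / 2, b)
    level = alpha(x)                      # alpha(x) clips the consequence set
    h = (b - a) / n
    moment = area = 0.0
    for k in range(n):
        y = a + (k + 0.5) * h
        m = min(level, alpha(y))          # mu_x(y) = alpha(x) ^ alpha(y)
        moment += y * m * h
        area += m * h
    return moment, area


a, b, x = 1.0, 5.0, 2.0
moment, area = clipped_moment_and_area(a, b, x)
closed_form = (a + b) * (x - a) * (b - x) / (b - a)
```

Dividing the moment by the area gives the centre of gravity, which for a symmetric single controller is (a + b)/2 whatever the input x, so the output cannot follow the input.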
Let $\Xi \in P^*(F([a, c]))$ be the fuzzy controller with as an antecedent rule
base the fuzzy sets
• $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\bigl(\tfrac{a+b}{2}, 1\bigr)$ and $(b, 0)$;
• $\alpha_2 : [b, c] \to I$ passing through the points $(b, 0)$, $\bigl(\tfrac{b+c}{2}, 1\bigr)$ and $(c, 0)$;
with $a < b < c$ as seen in Figure 5. The consequence is then given by
$$\mu_x(y) = \max_{i=1}^{2} \bigl(\alpha_i(x) \wedge \alpha_i(y)\bigr).$$
[Fig. 5 Two disjoint controllers]
412 H.K. Lee et al.
The two are hence identical except for a negligible set. It is easy to extend this result
to a finite number of single disjoint controllers.
Let $\Xi \in P^*(F([a, d]))$ be the fuzzy controller with as an antecedent rule
base the overlapping fuzzy sets
• $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\bigl(\tfrac{a+b}{2}, 1\bigr)$ and $(b, 0)$;
• $\alpha_2 : [c, d] \to I$ passing through the points $(c, 0)$, $\bigl(\tfrac{c+d}{2}, 1\bigr)$ and $(d, 0)$.
⎧ d−c 2
⎪ 0 if y ∈ [a, c]
c+d ' c−x ( ⎨ 2((d−x)∧(y−c)) c+d
, d 0 2 if y ∈ c,
⎩ 2((d−x)∧(d−y)) if y ∈ c+d , d
2 c−b ⎪ d−c 2
d−c 2
As a graphical example, let (a, b, c, d) = (2, 4, 3.5, 5.5). Then we obtain the rule
consequence functions given in Figure 9.
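Calculating the FOM- and LOM-defuzzification then gives
$$D_{FOM}(\mu_x) = \begin{cases}
x & \text{if } x \in \left]a, \frac{a+b}{2}\right] \\[1ex]
b + a - x & \text{if } x \in \left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right[ \\[1ex]
\dfrac{a^2 - b^2 + bc - ad}{a-b+c-d} & \text{if } x = \frac{ac-bd}{a-b+c-d} \\[1ex]
x & \text{if } x \in \left]\frac{ac-bd}{a-b+c-d}, \frac{c+d}{2}\right] \\[1ex]
d + c - x & \text{if } x \in \left[\frac{c+d}{2}, d\right[ \\[1ex]
a & \text{if } x \in \{a, d\}
\end{cases}$$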
[Fig. 9 Rule consequence functions; panels for x = 2.3, x = 3.6 and x = 5.1]
Namely, if $x \in [c, b]$, then the two fuzzy sets intersect in the point $x$ for
which $2\,\frac{b-x}{b-a} = 2\,\frac{x-c}{d-c}$; i.e.
$x = \frac{ac-bd}{a-b+c-d}$. One can easily verify that this always is a point
in the interval $[c, b]$. Analogously,
$$D_{LOM}(\mu_x) = \begin{cases}
b + a - x & \text{if } x \in \left]a, \frac{a+b}{2}\right] \\[1ex]
x & \text{if } x \in \left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right[ \\[1ex]
\dfrac{c^2 - d^2 + ad - bc}{a-b+c-d} & \text{if } x = \frac{ac-bd}{a-b+c-d} \\[1ex]
d + c - x & \text{if } x \in \left]\frac{ac-bd}{a-b+c-d}, \frac{c+d}{2}\right] \\[1ex]
x & \text{if } x \in \left[\frac{c+d}{2}, d\right[ \\[1ex]
d & \text{if } x \in \{a, d\}
\end{cases}$$
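Thus
$$D_{MOM}(\mu_x) = \begin{cases}
\dfrac{a+b}{2} & \text{if } x \in \left]a, \frac{ac-bd}{a-b+c-d}\right[ \\[1ex]
\dfrac{a^2 - b^2 + c^2 - d^2}{2(a-b+c-d)} & \text{if } x = \frac{ac-bd}{a-b+c-d} \\[1ex]
\dfrac{c+d}{2} & \text{if } x \in \left]\frac{ac-bd}{a-b+c-d}, d\right[ \\[1ex]
\dfrac{a+d}{2} & \text{if } x \in \{a, d\}
\end{cases}$$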
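The three maximum-based defuzzifiers can be reproduced on a grid for the example (a, b, c, d) = (2, 4, 3.5, 5.5). The sketch below is an illustration only: the grid resolution, the tolerance used to detect the plateau and the function names are assumptions, and MOM is taken as the midpoint of the first and last maximum, which matches the mean of maxima whenever the maximising set is an interval.

```python
def triangle(a, c, b):
    """Triangular membership function through (a, 0), (c, 1), (b, 0), a < c < b."""
    def mu(x):
        if a <= x <= c:
            return (x - a) / (c - a)
        if c < x <= b:
            return (b - x) / (b - c)
        return 0.0
    return mu


def maxima_defuzzifiers(sets, x, lo, hi, n=3500, tol=1e-9):
    """Return (FOM, LOM, MOM) of mu_x(y) = max_i min(alpha_i(x), alpha_i(y))."""
    h = (hi - lo) / n
    ys = [lo + k * h for k in range(n + 1)]
    levels = [alpha(x) for alpha in sets]
    mu = [max(min(level, alpha(y)) for level, alpha in zip(levels, sets))
          for y in ys]
    top = max(mu)
    plateau = [y for y, m in zip(ys, mu) if m >= top - tol]
    fom, lom = plateau[0], plateau[-1]
    return fom, lom, (fom + lom) / 2


a, b, c, d = 2.0, 4.0, 3.5, 5.5
sets = [triangle(a, (a + b) / 2, b), triangle(c, (c + d) / 2, d)]

left = maxima_defuzzifiers(sets, 3.6, a, d)    # alpha_1 dominates at x = 3.6
right = maxima_defuzzifiers(sets, 4.0, a, d)   # alpha_2 dominates at x = 4.0
```

On either side of the intersection point (here 3.75) the MOM value stays at the midpoint of the dominating set, which is the jump responsible for the discontinuity.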
This function is, obviously, not continuous. For the calculation of the COG-
defuzzification, note that we can obtain the following results:
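$$\begin{array}{l|l}
x \in & D_{COG}(\mu_x) = \\ \hline
\left]a, c\right] & \dfrac{a+b}{2} \\[2ex]
\left[c, \dfrac{ac-bd}{a-b+c-d}\right] & \left(\dfrac{I_1}{b-a} + \dfrac{I_2}{d-c}\right) \Big/ \left(\dfrac{I_3}{b-a} + \dfrac{I_4}{d-c}\right) \\[2ex]
\left[\dfrac{ac-bd}{a-b+c-d}, b\right] & \left(\dfrac{J_1}{b-a} + \dfrac{J_2}{d-c}\right) \Big/ \left(\dfrac{J_3}{b-a} + \dfrac{J_4}{d-c}\right) \\[2ex]
\left[b, d\right[ & \dfrac{c+d}{2}
\end{array}$$
with
$$\begin{aligned}
I_1 &= \int_a^{b+a-x} y(y-a)\,dy + \int_{b+a-x}^{x} y(b-x)\,dy + \int_x^{t_1} y(b-y)\,dy \\
I_2 &= \int_{t_1}^{d+c-x} y(x-c)\,dy + \int_{d+c-x}^{d} y(d-y)\,dy \\
I_3 &= \int_a^{b+a-x} (y-a)\,dy + \int_{b+a-x}^{x} (b-x)\,dy + \int_x^{t_1} (b-y)\,dy \\
I_4 &= \int_{t_1}^{d+c-x} (x-c)\,dy + \int_{d+c-x}^{d} (d-y)\,dy
\end{aligned}$$
and
$$\begin{aligned}
J_1 &= \int_a^{b+a-x} y(y-a)\,dy + \int_{b+a-x}^{t_2} y(b-x)\,dy \\
J_2 &= \int_{t_2}^{x} y(y-c)\,dy + \int_x^{d+c-x} y(x-c)\,dy + \int_{d+c-x}^{d} y(d-y)\,dy \\
J_3 &= \int_a^{b+a-x} (y-a)\,dy + \int_{b+a-x}^{t_2} (b-x)\,dy \\
J_4 &= \int_{t_2}^{x} (y-c)\,dy + \int_x^{d+c-x} (x-c)\,dy + \int_{d+c-x}^{d} (d-y)\,dy
\end{aligned}$$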
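$$\lim_{\substack{x \to c \\ x > c}} D_{COG}(\mu_x) = \lim_{\substack{x \to c \\ x > c}} \frac{\frac{I_1}{b-a} + \frac{I_2}{d-c}}{\frac{I_3}{b-a} + \frac{I_4}{d-c}} = \frac{a+b}{2}$$
and
$$\lim_{\substack{x \to b \\ x < b}} D_{COG}(\mu_x) = \lim_{\substack{x \to b \\ x < b}} \frac{\frac{J_1}{b-a} + \frac{J_2}{d-c}}{\frac{J_3}{b-a} + \frac{J_4}{d-c}} = \frac{c+d}{2}$$
and furthermore that
$$\lim_{\substack{x \to \frac{ac-bd}{a-b+c-d} \\ x < \frac{ac-bd}{a-b+c-d}}} \frac{\frac{I_1}{b-a} + \frac{I_2}{d-c}}{\frac{I_3}{b-a} + \frac{I_4}{d-c}} \qquad \text{and} \qquad \lim_{\substack{x \to \frac{ac-bd}{a-b+c-d} \\ x > \frac{ac-bd}{a-b+c-d}}} \frac{\frac{J_1}{b-a} + \frac{J_2}{d-c}}{\frac{J_3}{b-a} + \frac{J_4}{d-c}}$$
both are equal to the same rational expression of $a$, $b$, $c$ and $d$, which
proves the assertion. QED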
Let $\Xi \in P^*(F([a, d]))$ again be a fuzzy controller with as an antecedent
rule base the overlapping fuzzy sets
• $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\bigl(\tfrac{a+b}{2}, 1\bigr)$ and $(b, 0)$;
• $\alpha_2 : [c, d] \to I$ passing through the points $(c, 0)$, $\bigl(\tfrac{c+d}{2}, 1\bigr)$ and $(d, 0)$.
As an example, let (a, b, c, d) = (2, 4, 2.5, 4.5). Then we obtain the rule conse-
quence functions seen in Figure 10. Calculating the FOM- and LOM-defuzzification
then yields exactly the same result as for two subcentrally overlapping controllers.
For the calculation of the COG-defuzzification, we can obtain the following results:
[Fig. 10 Rule consequence functions; panels for x = 2.3, x = 2.7, x = 3.1,
x = 3.25, x = 3.3, x = 3.8 and x = 4.2]
$$\begin{array}{l|l}
x \in & D_{COG}(\mu_x) = \\ \hline
\left]a, c\right] & \dfrac{a+b}{2} \\[2ex]
\left[c, \dfrac{a+b}{2}\right] & \left(\dfrac{K_1}{b-a} + \dfrac{K_2}{d-c}\right) \Big/ \left(\dfrac{K_3}{b-a} + \dfrac{K_4}{d-c}\right) \\[2ex]
\left[\dfrac{a+b}{2}, \dfrac{ac-bd}{a-b+c-d}\right] & \left(\dfrac{L_1}{b-a} + \dfrac{L_2}{d-c}\right) \Big/ \left(\dfrac{L_3}{b-a} + \dfrac{L_4}{d-c}\right) \\[2ex]
\left[\dfrac{ac-bd}{a-b+c-d}, \dfrac{c+d}{2}\right] & \left(\dfrac{M_1}{b-a} + \dfrac{M_2}{d-c}\right) \Big/ \left(\dfrac{M_3}{b-a} + \dfrac{M_4}{d-c}\right) \\[2ex]
\left[\dfrac{c+d}{2}, b\right] & \left(\dfrac{N_1}{b-a} + \dfrac{N_2}{d-c}\right) \Big/ \left(\dfrac{N_3}{b-a} + \dfrac{N_4}{d-c}\right) \\[2ex]
\left[b, d\right[ & \dfrac{c+d}{2}
\end{array}$$
with
$$\begin{aligned}
K_1 &= \int_a^{x} y(y-a)\,dy + \int_x^{b+a-x} y(x-a)\,dy + \int_{b+a-x}^{t_1} y(b-y)\,dy \\
K_2 &= \int_{t_1}^{d+c-x} y(x-c)\,dy + \int_{d+c-x}^{d} y(d-y)\,dy \\
K_3 &= \int_a^{x} (y-a)\,dy + \int_x^{b+a-x} (x-a)\,dy + \int_{b+a-x}^{t_1} (b-y)\,dy \\
K_4 &= \int_{t_1}^{d+c-x} (x-c)\,dy + \int_{d+c-x}^{d} (d-y)\,dy,
\end{aligned}$$
$$\begin{aligned}
L_1 &= \int_a^{b+a-x} y(y-a)\,dy + \int_{b+a-x}^{x} y(b-x)\,dy + \int_x^{t_1} y(b-y)\,dy \\
L_2 &= \int_{t_1}^{d+c-x} y(x-c)\,dy + \int_{d+c-x}^{d} y(d-y)\,dy \\
L_3 &= \int_a^{b+a-x} (y-a)\,dy + \int_{b+a-x}^{x} (b-x)\,dy + \int_x^{t_1} (b-y)\,dy \\
L_4 &= \int_{t_1}^{d+c-x} (x-c)\,dy + \int_{d+c-x}^{d} (d-y)\,dy,
\end{aligned}$$
$$\begin{aligned}
M_1 &= \int_a^{b+a-x} y(y-a)\,dy + \int_{b+a-x}^{t_2} y(b-x)\,dy \\
M_2 &= \int_{t_2}^{x} y(y-c)\,dy + \int_x^{d+c-x} y(x-c)\,dy + \int_{d+c-x}^{d} y(d-y)\,dy \\
M_3 &= \int_a^{b+a-x} (y-a)\,dy + \int_{b+a-x}^{t_2} (b-x)\,dy \\
M_4 &= \int_{t_2}^{x} (y-c)\,dy + \int_x^{d+c-x} (x-c)\,dy + \int_{d+c-x}^{d} (d-y)\,dy,
\end{aligned}$$
and
$$\begin{aligned}
N_1 &= \int_a^{b+a-x} y(y-a)\,dy + \int_{b+a-x}^{t_2} y(b-x)\,dy \\
N_2 &= \int_{t_2}^{d+c-x} y(y-c)\,dy + \int_{d+c-x}^{x} y(d-x)\,dy + \int_x^{d} y(d-y)\,dy \\
N_3 &= \int_a^{b+a-x} (y-a)\,dy + \int_{b+a-x}^{t_2} (b-x)\,dy \\
N_4 &= \int_{t_2}^{d+c-x} (y-c)\,dy + \int_{d+c-x}^{x} (d-x)\,dy + \int_x^{d} (d-y)\,dy.
\end{aligned}$$
Lemma 7. The formulas for the COG–defuzzification DCOG (µx ) in the subcentrally
overlapping case as given in Table 1 and those for the supercentrally overlapping
case as given in Table 2 are identical.
Proof: First of all, considering Table 2 only, it is easy to see by simple
calculation that $K_3 = L_3$ and $K_1 = L_1$. Since furthermore $K_2 = L_2$ and
$K_4 = L_4$, being literally the same, the expressions for $D_{COG}(\mu_x)$ on
$\left[c, \frac{a+b}{2}\right]$ and on
$\left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right]$ are identical. In exactly the
same way, it can be proved that $M_4 = N_4$ and $M_2 = N_2$, and obviously
$M_1 = N_1$ and $M_3 = N_3$. Thus also the expressions for $D_{COG}(\mu_x)$ on
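$$\begin{array}{l|l}
x \in & D_{COG}(\mu_x) = \\ \hline
\left]a, c\right] & \dfrac{a+b}{2} \\[2ex]
\left[c, \dfrac{ac-bd}{a-b+c-d}\right] & \left(\dfrac{K_1}{b-a} + \dfrac{K_2}{d-c}\right) \Big/ \left(\dfrac{K_3}{b-a} + \dfrac{K_4}{d-c}\right) \\[2ex]
\left[\dfrac{ac-bd}{a-b+c-d}, b\right] & \left(\dfrac{N_1}{b-a} + \dfrac{N_2}{d-c}\right) \Big/ \left(\dfrac{N_3}{b-a} + \dfrac{N_4}{d-c}\right) \\[2ex]
\left[b, d\right[ & \dfrac{c+d}{2}
\end{array}$$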
Now it is easy to see that the formulas in Table 1 for the subcentrally
overlapping case and in Table 2b for the supercentrally overlapping case are in
fact identical. This statement is trivial for $x \in \left]a, c\right]$ and
$x \in \left[b, d\right[$. We will now prove that this is also the case for
$x \in \left[c, \frac{ac-bd}{a-b+c-d}\right]$ and
$x \in \left[\frac{ac-bd}{a-b+c-d}, b\right]$. If
$x \in \left[c, \frac{ac-bd}{a-b+c-d}\right]$, then an easy calculation verifies
that $I_3 = K_3$ and $I_1 = K_1$, while $I_2 = K_2$ and $I_4 = K_4$ are literally
identical. If on the other hand $x \in \left[\frac{ac-bd}{a-b+c-d}, b\right]$,
then it is equally easy to verify that $J_4 = N_4$ and $J_2 = N_2$, while
$J_1 = N_1$ and $J_3 = N_3$ are literally identical. Hence, in both cases the
expressions for $D_{COG}(\mu_x)$ are identical.
QED
Consequently, without having to apply any limit theorem, also in this case the
COG-defuzzification is a continuous function. Furthermore, we will from now on
omit the second set of formulas.
with $\frac{a+b}{2} < c < b$ and $g < d < \frac{g+f}{2}$. Then take the
restriction to the domain $\left[\frac{a+b}{2}, \frac{g+f}{2}\right]$. These
functions look like Figure 11. The consequence then is given by
$$\mu_x(y) = \max_{i=1}^{3} \bigl(\alpha_i(x) \wedge \alpha_i(y)\bigr).$$
A tedious but similar verification shows that the MOM-defuzzification is not
continuous. For the COG-defuzzification we analogously obtain a limit theorem
similar to 6, proving that the COG-defuzzification is a continuous function.
Although we leave the necessary calculations to the interested reader, some of
the typical shape functions of overlapping fuzzifiers with border constraints
can be seen in Figure 12.
[Figures 11 and 12: plot residue; Figure 12 consists of panels (a) to (h), each
plotted against x]
Our goal now is to find out how the fuzzy controllers should be positioned with
respect to each other such that the difference between the input value x and the
defuzzification value D(µx) is minimal. Ideally, this difference should be zero,
but even in simple cases this does not hold. Various choices are possible for
how this distance should be calculated, but as a trade-off between computational
complexity and intuitive correctness it seems only reasonable to take the
$L^1$-distance, defined on $X$ by
$$\forall \mu, \nu \in F(X) : \quad d_1(\mu, \nu) = \int_X |\mu(x) - \nu(x)|\,dx$$
$$D_{***}(\mu) : X \to X, \qquad x \mapsto D_{***}(\mu_x) = \theta(x, \Xi)$$
is smaller.
We will investigate this claim on some concrete examples.
3.1 Example
Because of the scaling invariance demands as stated in the introduction, only the
relative position of the controllers with respect to each other is considered to be
important. All other defuzzification values can be calculated through applying the
appropriate affine transformations. Therefore, fix α1 (x) to be the triangular fuzzy
set through the points (0, 0), (1, 1) and (2, 0), and let $\alpha_2(x)$ be a
variable fuzzy set through the points $(\lambda, 0)$,
$\bigl(\tfrac{\lambda+\mu}{2}, 1\bigr)$ and $(\mu, 0)$. This yields the following
result:
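$$\begin{array}{l|l}
x \in & D_{COG}(\mu_x) = \\ \hline
\left[0, \lambda\right] & 1 \\[1ex]
\left[\lambda, \dfrac{2\mu}{\mu-\lambda+2}\right] & D_{COG}(\mu_x)_l = \dfrac{A_1 + A_2}{A_3 + A_4} \\[2ex]
\left[\dfrac{2\mu}{\mu-\lambda+2}, 2\right] & D_{COG}(\mu_x)_r = \dfrac{B_1 + B_2}{B_3 + B_4} \\[2ex]
\left[2, \mu\right[ & \dfrac{\lambda+\mu}{2}
\end{array}$$
with
$$\begin{aligned}
A_1 &= \frac{1}{2}\left(\int_0^{x} y^2\,dy + \int_x^{2-x} xy\,dy + \int_{2-x}^{\frac{2\mu-2x}{\mu-\lambda}} y(2-y)\,dy\right) \\
A_2 &= \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-2x}{\mu-\lambda}}^{\mu+\lambda-x} y(x-\lambda)\,dy + \int_{\mu+\lambda-x}^{\mu} y(\mu-y)\,dy\right) \\
A_3 &= \frac{1}{2}\left(\int_0^{x} y\,dy + \int_x^{2-x} x\,dy + \int_{2-x}^{\frac{2\mu-2x}{\mu-\lambda}} (2-y)\,dy\right) \\
A_4 &= \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-2x}{\mu-\lambda}}^{\mu+\lambda-x} (x-\lambda)\,dy + \int_{\mu+\lambda-x}^{\mu} (\mu-y)\,dy\right)
\end{aligned}$$
and
$$\begin{aligned}
B_1 &= \frac{1}{2}\left(\int_0^{2-x} y^2\,dy + \int_{2-x}^{\frac{2\mu-x\mu+x\lambda}{2}} y(2-x)\,dy\right) \\
B_2 &= \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-x\mu+x\lambda}{2}}^{x} y(y-\lambda)\,dy + \int_x^{\mu+\lambda-x} y(x-\lambda)\,dy + \int_{\mu+\lambda-x}^{\mu} y(\mu-y)\,dy\right) \\
B_3 &= \frac{1}{2}\left(\int_0^{2-x} y\,dy + \int_{2-x}^{\frac{2\mu-x\mu+x\lambda}{2}} (2-x)\,dy\right) \\
B_4 &= \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-x\mu+x\lambda}{2}}^{x} (y-\lambda)\,dy + \int_x^{\mu+\lambda-x} (x-\lambda)\,dy + \int_{\mu+\lambda-x}^{\mu} (\mu-y)\,dy\right)
\end{aligned}$$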
Remark that if $(\lambda, \mu) = (0, 2)$, then the function is the constant
1-mapping. Writing this out explicitly is long and cumbersome work. However,
when trying to achieve that $D_{COG}(\mu_x) - x$ is the constant zero function,
it can be calculated that the only solution of the non-linear equation system
equals $(\lambda, \mu) = (0, 2)$, which is trivial. On the other hand, it might
be interesting to note that this function is not only continuous in
$x = \frac{2\mu}{\mu-\lambda+2}$, but also continuously differentiable. Indeed,
$$\lim_{\substack{x \to \frac{2\mu}{\mu-\lambda+2} \\ x < \frac{2\mu}{\mu-\lambda+2}}} \frac{d}{dx}\bigl(D_{COG}(\mu_x)\bigr)_l = \lim_{\substack{x \to \frac{2\mu}{\mu-\lambda+2} \\ x > \frac{2\mu}{\mu-\lambda+2}}} \frac{d}{dx}\bigl(D_{COG}(\mu_x)\bigr)_r,$$
$$\begin{array}{l|l}
x \in & D_{COG}(\mu_x) = \\ \hline
\left]0, \lambda\right] & 1 \\[1ex]
\left[\lambda, 2\right] & \dfrac{-4x + 2x^2 + 2\lambda^3 - 3\lambda^2 x + \lambda x^2 - 4\lambda x + 4\lambda^2}{2(x^2 - 2x + \lambda^2 - \lambda x)} \\[2ex]
\left[2, \mu\right[ & \lambda + 1
\end{array}$$
Let us determine the collection of points for which the middle expression equals
$x$. The solutions of this third-degree equation equal
$$\frac{\lambda}{2} + 1, \qquad 1 + \frac{\lambda}{2} \pm \frac{1}{2}\sqrt{4 + 4\lambda - 7\lambda^2}.$$
In that case, the point $\frac{\lambda}{2} + 1$ is always a fixpoint. The
existence of other fixpoints depends on the sign of
$\Delta(\lambda) = 4 + 4\lambda - 7\lambda^2$:
$$\begin{array}{c|ccccc}
\lambda & & \frac{2}{7} - \frac{4}{7}\sqrt{2} & & \frac{2}{7} + \frac{4}{7}\sqrt{2} & \\ \hline
\Delta(\lambda) & - & 0 & + & 0 & -
\end{array}$$
So there are two more fixpoints if and only if
$\lambda \in \left[\frac{2}{7} - \frac{4}{7}\sqrt{2}, \frac{2}{7} + \frac{4}{7}\sqrt{2}\right]$,
approximately equalling $[-0.52, 1.09]$. Three is also the maximal number of
fixpoints there can be, since the required equation is of degree 3. To
illustrate this, we sketch some of the graphs obtained by plotting the
defuzzification value $D_{COG}(\mu_x)$ against the value $x$. Writing out the
description of the functions is left to the reader; the actual graphs can be
found in Figure 13. Notice the change in the number of fixpoints.
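The algebra behind these fixpoints can be checked with a short sketch that evaluates the middle expression of the table and the three stated roots. The function names are hypothetical, and the additional filter that keeps only roots inside [λ, 2], the interval on which the middle expression applies, is an assumption of this sketch rather than a statement from the text.

```python
import math


def d_cog_middle(x, lam):
    """Middle branch of D_COG(mu_x) for mu = lam + 2, valid for x in [lam, 2]."""
    num = (-4 * x + 2 * x**2 + 2 * lam**3 - 3 * lam**2 * x
           + lam * x**2 - 4 * lam * x + 4 * lam**2)
    den = 2 * (x**2 - 2 * x + lam**2 - lam * x)
    return num / den


def cubic_roots(lam):
    """Real solutions of d_cog_middle(x, lam) = x, from the closed formulas."""
    centre = lam / 2 + 1
    delta = 4 + 4 * lam - 7 * lam**2
    if delta <= 0:
        return [centre]
    half = 0.5 * math.sqrt(delta)
    return [centre - half, centre, centre + half]


def fixpoints_in_domain(lam):
    """Roots that also lie in [lam, 2] (filter assumed for this sketch)."""
    return [r for r in cubic_roots(lam) if lam <= r <= 2]
```

For instance, `cubic_roots(1.05)` returns three values that all satisfy `d_cog_middle(r, 1.05) == r` up to rounding, whereas for λ = 1.2 the discriminant is negative and only λ/2 + 1 remains.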
[Fig. 13 Plots of the defuzzification value against x; labelled panels for
λ = 0.1, 0.3, 0.5, 0.7, 0.9 (with detail), 1 (with detail), 1.3, 1.5, 1.7 and
1.9]
which has as a graphical plot the left graph in Figure 14. Moreover, since the
unions of the supports of the different rule bases are not of equal width, namely
$[0, \lambda + 2]$, if we want to find the value for $\lambda$ with, relatively
speaking, the smallest overlap, we have to examine the extreme values of
$\frac{d_1(D_{COG}(\mu_x), x)}{\lambda + 2}$ on the interval $[0, 2]$. The
graphical plot of this function is the right graph in Figure 14. It is easy to
check that
$\frac{d}{d\lambda}\,\frac{d_1(D_{COG}(\mu_x), x)}{\lambda + 2} = 0$ for
$\lambda \approx 1.071791$. Hence, the optimal value for $\lambda$ is not equal
to 1, which is, given the symmetric nature of the problem, a remarkable result.
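The normalised distance can also be estimated numerically, which allows the reported optimum to be explored without the closed formulas. The sketch below is an assumption-laden illustration: it takes two triangular sets of width 2, the second shifted by λ, integrates over ]0, λ + 2[ with a midpoint rule, and uses hypothetical function names and grid sizes.

```python
def triangle(a, c, b):
    """Triangular membership function through (a, 0), (c, 1), (b, 0), a < c < b."""
    def mu(x):
        if a <= x <= c:
            return (x - a) / (c - a)
        if c < x <= b:
            return (b - x) / (b - c)
        return 0.0
    return mu


def cog_output(x, sets, lo, hi, n=400):
    """COG of mu_x(y) = max_i min(alpha_i(x), alpha_i(y)) by the midpoint rule."""
    h = (hi - lo) / n
    levels = [alpha(x) for alpha in sets]
    moment = area = 0.0
    for k in range(n):
        y = lo + (k + 0.5) * h
        m = max(min(level, alpha(y)) for level, alpha in zip(levels, sets))
        moment += y * m
        area += m
    return moment / area if area > 0 else None


def normalised_distance(lam, n_x=200):
    """Estimate of d1(D_COG(mu_x), x) / (lam + 2) over ]0, lam + 2[."""
    sets = [triangle(0.0, 1.0, 2.0), triangle(lam, lam + 1.0, lam + 2.0)]
    hi = lam + 2.0
    h = hi / n_x
    total = 0.0
    for k in range(n_x):
        x = (k + 0.5) * h
        out = cog_output(x, sets, 0.0, hi)
        if out is not None:
            total += abs(out - x) * h
    return total / hi
```

Evaluating `normalised_distance` on a range of λ values in [0, 2] and locating the smallest result gives a rough numerical counterpart of the right graph in Figure 14; finer grids are needed to resolve the optimum to several digits.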
One possible drawback of the method as described above is the absence of proper
fuzzy sets in the antecedent rule base that “round off the borders”. But even
then, asymmetries in the results still occur: adding border constraints does not
seem to fix the problem of asymmetry in the search for a minimal value of
$d_1(D_{COG}(\mu_x), x)$. We have checked this on an example with two
semi-triangular fuzzy sets, passing through the points $\bigl(-\tfrac{3}{2}, 1\bigr)$,
$(\lambda - 1, 0)$ and $(-\lambda + 1, 0)$, $\bigl(\tfrac{3}{2}, 1\bigr)$, and two
triangular fuzzy sets, passing through the points
$\{(-\lambda - 1, 0), (-\lambda, 1), (-\lambda + 1, 0)\}$ and
$\{(\lambda - 1, 0), (\lambda, 1), (\lambda + 1, 0)\}$. One would expect the
optimal value for $\lambda$, for which the aforementioned difference is minimal,
to occur at $\lambda = \tfrac{1}{2}$. A closed yet very tedious set of continuous
functions, parametrically dependent on $\lambda$, could be deduced. As an
illustration, some of the graphs of these functions can be found in Figure 15.
These formulas tend to be much more complicated, however, except for some easy
values of $\lambda$. Instead of tracking down the optimum by calculating the
derivative, we evaluated this value against the values of $\lambda$ by means of
Simpson's integration rule, dividing the X-interval into 100 equal intervals,
using Matlab. All calculations were carried out with 64-bit precision, and we
found the minimum to occur for $\lambda \approx 0.5377$, with error margin
$10^{-4}$, which is most certainly not equal to $\tfrac{1}{2}$.
Fig. 15 Plot of the defuzzification value against x for various λ in case of border constraints
4 BADD-defuzzification
The previous result leads us to believe that better defuzzifiers with respect to the consistency criterion must exist, because it seems only fair that the optimum should depend not only on the appropriate choice of a rule base, but also on the defuzzifier. While it is virtually impossible to study all defuzzifiers that have been mentioned in the literature, the fact that the COG-defuzzifier is not necessarily the best one can be asserted by studying a few simple examples. An interesting parametric family to consider, which incorporates the aforementioned defuzzifiers as well as many others, consists of the so-called basic defuzzification distributions $D^{BADD}(-, \gamma)$,
$$D^{BADD}(\mu, \gamma) = \frac{\int_X y\,\mu^{\gamma}(y)\,dy}{\int_X \mu^{\gamma}(y)\,dy},$$
as introduced by D.P. Filev and R.R. Yager in [3]. It is generally known (see also [12]) that
• $D^{BADD}(\mu, 0) = D^{MOS}(\mu)$
• $D^{BADD}(\mu, 1) = D^{COG}(\mu)$
• $\lim_{\gamma \to \infty} D^{BADD}(\mu, \gamma) = D^{MOM}(\mu)$
Although a partition of unity may not be the optimal choice for an antecedent rule base Ξ ∈ P∗(F(X)), we have examined this one nevertheless because of its symmetric nature. Using a long and cumbersome calculation (which we can of course provide to the reader upon request), we found the results of the BADD-defuzzification to be valid extensions of the three defuzzification operators which are given as limit cases by the expressions above, at least in the case of single controllers and overlapping controllers, with and without border conditions. One striking result was the continuity of the BADD-defuzzification for any $\gamma \in \mathbb{R}_0^+$, while it is explicitly not continuous for γ = 0 or γ → ∞, which can be considered as hybrid cases.
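The limit cases listed above can be illustrated numerically. The following is only a minimal sketch, not the authors' Matlab code: the helper names `simpson`, `badd` and `mu_example` are hypothetical, the fuzzy output set is an assumed example (a plateau of height 1/2 on [0, 3/2] followed by the line 2 − y), and the convention that points outside the support get weight 0 for every γ is an assumption made here so that γ = 0 yields the middle of the support.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def badd(mu, gamma, a, b, n=2000):
    """BADD value: integral of y*mu(y)^gamma divided by integral of mu(y)^gamma on [a, b]."""
    def weight(y):
        m = mu(y)
        # Assumed convention: weight 0 outside the support, also for gamma = 0.
        return m ** gamma if m > 0.0 else 0.0
    return simpson(lambda y: y * weight(y), a, b, n) / simpson(weight, a, b, n)


def mu_example(y):
    """Hypothetical fuzzy output on X = [0, 2]: 1/2 on [0, 3/2], then 2 - y."""
    return min(0.5, 2.0 - y) if 0.0 <= y <= 2.0 else 0.0


cog = badd(mu_example, 1.0, 0.0, 2.0)         # gamma = 1: centre of gravity, 37/42 here
mos = badd(mu_example, 0.0, 0.0, 2.0)         # gamma = 0: middle of support, close to 1
near_mom = badd(mu_example, 200.0, 0.0, 2.0)  # large gamma: close to the mean of maxima 3/4
```

For this particular set the maxima form the plateau [0, 3/2], so the mean of maxima is 3/4, while the centre of gravity is 37/42; a large but finite γ lands close to the former without reaching it.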
Again, to give the reader an idea, we will sketch some of the graphs we obtain, where
we plot the defuzzification value DBADD (µx , γ ) in the case of no border constraints,
against the value x, for different values of γ , in Figure 16.
Since the only difference occurs on [1, 2], we tried to minimize
$$d_1\left(D^{BADD}(\mu_x, \gamma), x\right) = \int_1^2 \left|D^{BADD}(\mu_x, \gamma) - x\right| dx$$
with respect to $\gamma \in \mathbb{R}_0^+$. Contrary to expectation, the optimum was not found for γ = 1. Calculating the derivative $\frac{d}{d\gamma}\, d_1\left(D^{BADD}(\mu_x, \gamma), x\right)$ is even more difficult than the problem in Example 3.1, so we again evaluated this value numerically, this time against the values of γ, by means of Simpson’s integration rule, dividing the X-interval into 100 equal intervals, using Matlab. All calculations have been carried out with 64-bit precision. As a verification, we could moreover calculate the precise results for γ = 1 and γ = 2, being
$$d_1\left(D^{BADD}(\mu_x, 1), x\right) = \tfrac{1}{4} + \ln\tfrac{4}{5} \approx 0.02685644868579$$
$$d_1\left(D^{BADD}(\mu_x, 2), x\right) \approx 0.0764924981457362$$
respectively. The two graphs shown in Figure 17 sketch the result with γ sampled every $10^{-3}$, the second graph being a close-up of the first one. With a minimal step for γ of $10^{-5}$ used in the calculations, the minimum occurs for γ ≈ 1.2041, with error margin $10^{-4}$, which is most certainly not equal to 1. Moreover, the minimum itself is almost, but not quite, zero.
In a similar case with border constraints, to give the reader an idea again, we sketched some of the graphs in Figure 18. Again, we tried to minimize
$$d_1\left(D^{BADD}(\mu_x, \gamma), x\right) = \int_0^2 \left|D^{BADD}(\mu_x, \gamma) - x\right| dx$$
with respect to $\gamma \in \mathbb{R}_0^+$. Yet again, the optimum was not found for γ = 1. As is obvious from the graphs, adding border values to the fuzzy controller does not improve the result. We evaluated $d_1\left(D^{BADD}(\mu_x, \gamma), x\right)$ against the values of γ in a similar Matlab environment as before, with only the values for γ = 1 and γ = 2 being known exactly (available as a means of verification):
$$d_1\left(D^{BADD}(\mu_x, 1), x\right) = \tfrac{8}{3} + \tfrac{2}{3}\ln 2 - 2\sqrt{2}\,\operatorname{arctanh}\tfrac{\sqrt{2}}{2} \approx 0.63586382647903$$
$$d_1\left(D^{BADD}(\mu_x, 2), x\right) \approx 0.459352869168308$$
The two graphs shown in Figure 19 sketch the result with γ sampled every $10^{-3}$, the second graph being a close-up of the first one. With a minimal step for γ of $10^{-8}$ (convergence is very slow in this case!) used in the calculations, the minimum occurs for γ ≈ 5.24478, with error margin $10^{-5}$, which is most certainly not even anywhere near γ = 1. Unlike the example with no border constraints, the minimum itself is not even close to zero.
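The two verification values quoted above can be reproduced with a short numerical sketch. It is an illustration under stated assumptions rather than the original Matlab program: the rule base is assumed to consist of three triangular sets on X = [0, 2] with peaks at 0, 1 and 2 (so the two outer sets act as border sets), the consequents are taken equal to the antecedents, and the names `tri`, `mu_x`, `badd` and `d1` are hypothetical.

```python
def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def tri(center):
    """Triangular fuzzy set with peak 1 at `center` and base width 2."""
    return lambda t: max(0.0, 1.0 - abs(t - center))


# ASSUMPTION: rule base with border sets on X = [0, 2], peaks at 0, 1 and 2.
RULES = [tri(0.0), tri(1.0), tri(2.0)]


def mu_x(x):
    """Output fuzzy set for input x (min for clipping, max for aggregation)."""
    levels = [alpha(x) for alpha in RULES]
    return lambda y: max(min(alpha(y), k) for alpha, k in zip(RULES, levels))


def badd(mu, gamma, a=0.0, b=2.0, n=400):
    """BADD defuzzification: ratio of the integrals of y*mu^gamma and mu^gamma."""
    def weight(y):
        m = mu(y)
        return m ** gamma if m > 0.0 else 0.0
    return simpson(lambda y: y * weight(y), a, b, n) / simpson(weight, a, b, n)


def d1(gamma, n=200):
    """L1 distance on [0, 2] between x -> D^BADD(mu_x, gamma) and the identity."""
    return simpson(lambda x: abs(badd(mu_x(x), gamma) - x), 0.0, 2.0, n)


d1_gamma1 = d1(1.0)  # to be compared with the value 0.63586... quoted in the text
d1_gamma2 = d1(2.0)  # to be compared with the value 0.45935... quoted in the text
```

Evaluating `d1` on a grid of γ-values gives a curve of the kind sketched in Figure 19; locating the minimizer to the precision reported in the text would require a much finer scan than this sketch attempts.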
5 Conclusions
While the consistency criterion 3 seems a very reasonable demand, it is very easy to debunk it: even for relatively simple functions, such as the identity, simple rule bases, such as triangular-shaped fuzzy sets forming a partition of unity, and simple defuzzification methods, such as COG-defuzzification, it is not hard to find either rule bases or defuzzification methods that yield better results. Does this mean that the whole study has been pointless? Absolutely not: the consistency criterion 3 is a good way to quantify the quality of a defuzzification, by measuring the difference between inputting the identity and obtaining the same identity as the result of a defuzzification process. While this is not an absolute measure, and is strongly dependent on the chosen antecedent rule base, it can be used, e.g., to compare two different defuzzification methods on the same rule base.
Further investigation still has to be carried out, for instance into the influence of an increase in the number of rules in the antecedent rule base. This will be the topic of a sequel article.
References
1. D. Dubois, J. Lang and H. Prade. Fuzzy sets in approximate reasoning part 2: Logical approaches. Fuzzy Sets and Systems 40, pp. 203–244, 1991
2. D. Dubois and H. Prade. Fuzzy sets in approximate reasoning part 1: Inference with possibility distributions. Fuzzy Sets and Systems 40, pp. 143–202, 1991
3. D.P. Filev and R.R. Yager. A generalized defuzzification method via BADD distributions. Internat. J. Intelligent Systems 6, pp. 687–697, 1991
4. E.E. Kerre. A comparative study of the behaviour of some popular fuzzy implication operators. In: L.A. Zadeh and J. Kacprzyk, eds., Fuzzy Logic For The Management of Uncertainty. Wiley, New York, 1992
5. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer Academic, Dordrecht, 1996
6. E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic controller. Int. Journal of Man-Machine Studies 7, pp. 1–13, 1975
7. A.M. Norwich and I.B. Turksen. A model for the measurement of membership and the consequences of its empirical implementation. Fuzzy Sets and Systems 12, pp. 1–25, 1985
8. D. Ruan, E.E. Kerre, G. De Cooman, B. Cappelle and F. Vanmassenhove. Influence of the fuzzy implication operator on the method-of-cases inference rule. Internat. J. Approx. Reasoning 4, pp. 307–318, 1990
9. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies: towards a theory of rational defuzzification operators. Second IEEE International Conference on Fuzzy Systems, San Francisco, pp. 1161–1166, 1994
10. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies: towards a theory of rational defuzzification operators. Second IEEE International Conference on Fuzzy Systems, San Francisco, pp. 1161–1166, 1993
11. M. Sugeno. An introductory survey of fuzzy control. Inform. Sci. 36, pp. 59–83, 1985
12. W. Van Leekwijck and E.E. Kerre. Defuzzification: criteria and classification. Fuzzy Sets and Systems 108, pp. 159–178, 1999
13. R.R. Yager and D.P. Filev. SLIDE: A simple adaptive defuzzification method. IEEE Trans. Fuzzy Systems 1(1), pp. 69–78, 1993
14. L.A. Zadeh. Fuzzy sets. Inform. Control 8, pp. 338–353, 1965
15. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man. Cybernet. 3, pp. 28–44, 1973
16. H.J. Zimmermann. Fuzzy Set Theory And Its Applications. Kluwer Academic, Boston/Dordrecht/London, 1996
An Asymptotic Consistency Criterion for
Optimizing Defuzzification in Fuzzy Control
Abstract In [6], we already pointed out that in a fuzzy control process, the choice of
a good defuzzification method is quintessential. Throughout the literature, various
defuzzification methods have been proposed, classified according to the properties
they fulfil, such as continuity, scale invariance, core consistency and so forth. In [6]
we added a new criterion, by demanding that the defuzzification of the fuzzy image
of a basic function, such as the identity, should still yield the identity, and we imme-
diately found that this is almost never the case. However, the numerical deviation
of this result can be established as a measure of fitness for the fuzzy controller in
the particular problem. Moreover, given a parametric family of such defuzzification
operators, such as D.P. Filev and R.R. Yager’s BADD-defuzzification ([3]), we were
able to optimize the problem with respect to the arbitrary parameter. In this chapter, we will weaken our Consistency Criterion posed in [6] to a version that only
needs to hold in an asymptotic case, namely with an infinite refinement of the width
of the fuzzy antecedent rules. We will show that what ensues is a nice numerical
description of the fitness of certain (families of) fuzzy defuzzification operators.
1 Introduction
Fuzzy control ([18]) is used in a wide scope of applied sciences, including physics,
electronics and economy. It is based on the concept of fuzzy sets as introduced
by L.A. Zadeh ([16] and [17]), extending the notion of membership of a function
from a two-valued logic to one in which the range values continuously vary within
I = [0, 1]. One major step is the defuzzification process, in which the fuzzy data
is again sampled into a single output value which is asserted to be a good repre-
sentation value of the (fuzzy) outcome of the control process. Depending on the
circumstances, several properties of defuzzification techniques have been studied
extensively, such as continuity, core representation and scaling invariance, and for
a good overview we would like to refer to the excellent articles of T.A. Runkler
et al. [12] and W. Van Leekwijck et al. [14]. In [6], we defined a consistency criterion (CC) which should be a measure of the effectiveness of a fuzzy controller, by calculating how much a given function (usually the identity) differs in L1-measure from its image through a fuzzy controller and defuzzification process. For a more detailed description of the L1-measure between functions, we refer to [4]. As we would expect, the result turns out to be dependent on the fuzzy rule base as well as on the chosen defuzzification process, which allows for a quantitative comparative study. One major drawback though is that the identity function is rarely mapped onto the identity function, even with the most obvious and usually well-behaved defuzzification operators. And we have not even asked ourselves the question whether other “basic” functions are mapped onto themselves through the (fuzzy) identity operator.
In this chapter however, we will study the influence of increasing the number of
controller functions in the rule base to this L1 –measure, and use it as a means to com-
pare the quality of two different fuzzy controller sets. Logically, an increase of the
number of controllers represents a fine-tuning of the way information is handled by
the antecedent rule base. We therefore expect the results to improve with the number
of controllers, and it indeed turns out to be the case when some selected additional
assumptions are made, regarding scale-invariance for instance. However, rather than
establishing this convergence, we would like to render the information obtained
in a numerical way, in order to compare the different defuzzification methods and
their consistency — or rather a concept we will call asymptotic consistency — as
described in [6]. The closed formulas are still manageable in the case of the most common defuzzifiers, such as center-of-gravity defuzzification (a canonical continuous example) or Mean-of-Maximum defuzzification (a canonical discontinuous example). When we investigate this asymptotic consistency on a parametric family of defuzzification operators such as the BADD-defuzzification presented in [3] (chosen because it incorporates both former examples), the formulas become far too complicated, and we hence have to rely on numerical techniques to draw some sensible conclusions.
In this section, we will establish the notations that will be used throughout this
article. Many of the already established results can be traced back to [6]. The set X
will denote the domain of the fuzzy sets, and can either be considered as R or any
(closed) interval thereof.
Definition 1. A fuzzy rule antecedent base will be defined as a finite collection of rule antecedents consisting of fuzzy functions, which we will denote by $\{\alpha_i : X \to I\}_{i=1}^{n}$, and a consequence rule β : X → I.
For one such function α : X → I, the support will be defined as
supp α := {x ∈ X : α(x) > 0}.
= {x ∈ X : ∀y ∈ X : µ (y) ≤ µ (x)} .
The set of all such collections of rule antecedents $\Xi = \{\alpha_i : X \to I\}_{i=1}^{n}$ shall be denoted as P∗(F(X)), being the collection of all finite subsets of F(X), the fuzzy sets on X. A collection of rule antecedents Ξ will be called a partition of unity if and only if
$$\forall x \in X:\quad \sum_{i=1}^{n} \alpha_i(x) = 1.$$
The consequence functions can be considered as members of the same set. As for the implication, following E.H. Mamdani et al. in [8], given that each rule is of the type
$$r:\ \text{IF } \left(X_1 = A_1^{j_1} \text{ and } \dots \text{ and } X_n = A_n^{j_n}\right) \text{ THEN } \left(Y = B^{j}\right),$$
where $A_i^{j_i}$ is the value of the j-th term of the linguistic variable i corresponding to the antecedent membership function $\alpha_i^{j_i}$, and $B^{j}$ is the value of the j-th term of the linguistic variable corresponding to the consequence membership function $\beta^{j}$, the aggregation of the rules is made by calculating
$$k_r(x) := \bigwedge_{i=1}^{n} \alpha_i^{j_i}(x_i)$$
for each of the input vectors x = (x1, ..., xn), and determining the consequence fuzzy set as
$$\mu_x(y) := \rho(x, y) = \bigvee_{r} \left(\beta^{j}(y) \wedge k_r(x)\right).$$
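The inference step just described can be sketched in a few lines. This is a minimal illustration, assuming that the conjunction is the minimum and the aggregation over rules is the maximum (the usual Mamdani choice); the function names and the three-rule single-input rule base with consequents equal to the antecedents are hypothetical.

```python
def tri(center):
    """Triangular membership function with peak 1 at `center` and base width 2."""
    return lambda t: max(0.0, 1.0 - abs(t - center))


def firing_strength(antecedents, x):
    """k_r(x): minimum of the antecedent memberships alpha_i(x_i) of one rule."""
    return min(alpha(xi) for alpha, xi in zip(antecedents, x))


def consequence_set(rules, x):
    """mu_x(y): maximum over the rules of min(beta_r(y), k_r(x))."""
    strengths = [(beta, firing_strength(ants, x)) for ants, beta in rules]
    return lambda y: max(min(beta(y), k) for beta, k in strengths)


# Hypothetical single-input rule base: IF X = A_j THEN Y = B_j, with B_j = A_j.
rules = [((tri(c),), tri(c)) for c in (0.0, 1.0, 2.0)]

mu = consequence_set(rules, (0.5,))
values = [mu(0.2), mu(1.0), mu(1.8)]  # clipped at 1/2 near the left, decaying to the right
```

For the input x = 0.5 the first two rules fire with strength 1/2 and the third does not fire, so the output set is a plateau of height 1/2 followed by the right flank of the middle triangle.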
crisp value, assigned to the fuzzy output. Most of these properties are described in
T.A. Runkler et al. [12] and W. Van Leekwijck et al. [14]. In what follows, we will
have the need to apply various scaling arguments; therefore the conditions of ordinal scale-invariance ([9], [11], [14]) and universal scale-invariance ([11]) should hold. The descriptions are re-formulated in [6]. As also mentioned in that article, the most commonly used defuzzifiers satisfying these conditions include: the first, last and middle of maxima defuzzification $D^{FOM}$, $D^{LOM}$ and $D^{MOM}$, the middle of support defuzzification $D^{MOS}$, the center-of-gravity defuzzification
$$D^{COG}(\mu) = \frac{\int_X y\,\mu(y)\,dy}{\int_X \mu(y)\,dy}$$
One other obvious criterion a defuzzifier has to satisfy is continuity, implying that
F(X) carries some topology to describe the distance between two fuzzy sets. In [6],
we described an alternative approach that does not rely on any structure on F(X)
as follows: suppose one has a finite collection of fuzzy sets as rule antecedents.
Each rule antecedent consists of only one fuzzy set α : X −→ I. Suppose that each
consequence function β : X −→ I is obtained as the image of the fuzzy set through
the mapping id : X −→ X. The function θ : X × P ∗ (F(X)) −→ X associates with
each input value x and each antecedent (and consequent) fuzzy set collection {αi }ni=1
in P ∗ (F(X)) an output θ (x, Ξ) = D∗∗∗ (µx ) for some fixed choice of a defuzzifier
∗ ∗ ∗, where µx : X −→ I is derived from the rule base Ξ = {αi : X −→ I}ni=1 as
described above. Schematically,
$$\theta:\ X \times P^{*}(F(X)) \xrightarrow{\ \theta^{*}\ } F(X) \xrightarrow{\ D^{***}\ } X, \qquad (x, \Xi) \mapsto \mu_x \mapsto D^{***}(\mu_x).$$
Ideally, the function θ(·, Ξ) : X → X should be the identity function again, yet we found this is almost never the case. Nevertheless, we have assumed the following condition to hold:
Ansatz (Consistency Criterion) A rule base Ξ ∈ P∗(F(X)) and a defuzzification operator D∗∗∗ are better suited for fuzzy control the smaller the value $d_1(id, D^{***}(\mu))$ is, where
$$D^{***}(\mu):\ X \to X:\ x \mapsto D^{***}(\mu_x) = \theta(x, \Xi).$$
Definition 3. If X = (X, d) is a metric space, then the width of a fuzzy set α ∈ F(X) will be defined as
$$\operatorname{width}(\alpha) = \sup_{x, y \in \operatorname{supp} \alpha} d(x, y).$$
By extension, for any rule base $\Xi = \{\alpha_i : X \to I\}_{i=1}^{n} \in P^{*}(F(X))$, we can define the width of Ξ as
$$\operatorname{width}(\Xi) = \max_{i=1}^{n} \operatorname{width}(\alpha_i).$$
Definition 5. A rule base sequence (Ξn)n will be called a zero rule base sequence if and only if
$$\lim_{n \to \infty} \operatorname{width}(\Xi_n) = 0.$$
The subcollection of all such zero rule base sequences will be denoted R0 (X) ⊆
R(X).
Example 6
Fig. 1 Ξ1 and Ξ2
• α2 (x), passing through the points (0, 0), (1, 1) and (2, 0);
• α3 (x), passing through the points (1, 0) and (2, 1).
All of these are special cases of the antecedent rules Ξn on [0, n + 1], consisting of
the antecedent rules
• α1 (x), passing through the points (0, 1) and (1, 0);
• α2 (x), passing through the points (0, 0), (1, 1) and (2, 0);
• ...
• αn+1 (x), passing through the points (n − 1, 0), (n, 1) and (n + 1, 0);
• αn+2 (x), passing through the points (n, 0) and (n + 1, 1)
Obviously, in any of the examples above, width(Ξn) = 2. Remark that these are all partitions of unity. We would like to redefine all these onto the same base space X = [0, 1], since the defuzzifications that will be used will all be universe scale-invariant. Therefore, let us define
$$\forall n \in \mathbb{N}_0,\ \forall x \in [0, 1]:\quad \Theta_n(x) = \Xi_n\left(\frac{x}{n+1}\right)$$
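The two claims made about the rule bases Ξn (they are partitions of unity and have width 2) are easy to check numerically. The sketch below assumes that Ξn consists of the n + 2 triangular sets with peaks at 0, 1, ..., n + 1, restricted to [0, n + 1], as in the list above; the function names are hypothetical, and the rescaled width 2/(n + 1) is simply the width divided by the length of the original interval.

```python
def centres(n):
    """Peaks 0, 1, ..., n+1 of the n+2 triangular antecedents of Xi_n on [0, n+1]."""
    return [float(c) for c in range(n + 2)]


def membership(centre, x):
    """Triangular membership with peak 1 at `centre` and base width 2."""
    return max(0.0, 1.0 - abs(x - centre))


def is_partition_of_unity(n, samples=1000):
    """Check on a sample grid that the memberships of Xi_n add up to 1 on [0, n+1]."""
    hi = n + 1
    for i in range(samples + 1):
        x = hi * i / samples
        if abs(sum(membership(c, x) for c in centres(n)) - 1.0) > 1e-12:
            return False
    return True


def width(n):
    """width(Xi_n): largest support length, supports being cut off at the borders of [0, n+1]."""
    hi = float(n + 1)
    return max(min(c + 1.0, hi) - max(c - 1.0, 0.0) for c in centres(n))


def scaled_width(n):
    """Width after rescaling the domain [0, n+1] onto [0, 1]."""
    return width(n) / (n + 1)
```

Since `scaled_width(n)` equals 2/(n + 1), the rescaled rule bases form a sequence whose width tends to zero as n grows.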
µ a,b : X −→ I
' (
x → [τ (µ )](x) = µ τ −1 (x) = µ x −
a
b
QED.
440 H.K. Lee et al.
In this section, we will study the different defuzzification operators on the standard zero rule base sequence (Θn)n as given in Section 3, even though we learned from [6] that the partition of unity is by no means the optimal antecedent rule base. Instead of refining the control by increasing the number of controllers on the same unit interval, we can also use scale invariance and, along with the increase of the number n of controllers, extend the interval on which we are working. To make a feasible comparison, we will afterwards scale the d1-distance by a factor $(n + 1)^2$. Using the defuzzification formulas we obtained in the aforementioned article, we find the following results:
A first question which arises is whether the L1-difference between D∗∗∗(µx) and the identity function decreases with an increase in the number of antecedent rules. Roughly put, would it be true that
$$\lim_{n \to \infty} d_1\left(D^{***}(\theta^{*}(x, \Theta_n)), id_{[0,1]}\right) = 0\ ?$$
In this section we will prove that for some important defuzzification operators D∗∗∗ such as $D^{MOM}$ and $D^{COG}$ not only $\left(D^{***}(\theta^{*}(x, \Theta_n))\right)_n \xrightarrow{L^1} id_{[0,1]}$, but even $\left(D^{***}(\theta^{*}(x, \Theta_n))\right)_n \xrightarrow{u} id_{[0,1]}$, which is a much stronger assertion. The purpose of calculating the L1-distance nevertheless is that it permits us to compare different defuzzification methods with each other in terms of how many more or fewer antecedent rules are needed to achieve the same degree of accuracy.
As we have seen, the consistency criterion does mostly not hold, so we will now
first weaken it to an asymptotic form:
Ansatz (Asymptotic Consistency Criterion) A zero rule base sequence
(Ξn )n ⊆ R0 (X)
It is trivial to see that if an antecedent rule base Ξ fulfills the consistency criterion, then the constant rule base sequence (Ξn = Ξ)n, even though it is not a zero rule base sequence, fulfills
$$\left(D^{***}(\theta^{*}(x, \Xi_n)) = id_X\right)_n \xrightarrow{u} id_X.$$
On the other hand, this condition is far too strong to be fulfilled, certainly for the most common types of defuzzifiers. Therefore, the consistency criterion CC is stronger than the asymptotic consistency criterion ACC.
4.1 MOM-defuzzification
Before stating and proving the general theorems, we would like to provide the reader with some basic examples, in order to show the techniques involved.
Examples 8
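$$d_1\left(D^{MOM}(\mu), id_{[0,2]}\right) = \int_0^2 \left|D^{MOM}(\mu_x) - x\right| dx$$
$$= \int_0^{1/2} \left(x - \frac{x}{2}\right) dx + \int_{1/2}^{1} (1 - x)\,dx + \int_{1}^{3/2} (x - 1)\,dx + \int_{3/2}^{2} \left(\frac{2 + x}{2} - x\right) dx = \frac{3}{8}$$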
Fig. 2 The input–output curve for the rule bases Ξ1 and Ξ2 with MOM-defuzzification
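$$D^{MOM}(\mu_x) = \begin{cases}
0 & \text{if } x = 0 \\
x/2 & \text{if } x \in\, ]0, 1/2[ \\
3/4 & \text{if } x = 1/2 \\
1 & \text{if } x \in\, ]1/2, 3/2[ \\
3/2 & \text{if } x = 3/2 \\
2 & \text{if } x \in\, ]3/2, 5/2[ \\
9/4 & \text{if } x = 5/2 \\
(3 + x)/2 & \text{if } x \in\, ]5/2, 3[ \\
3 & \text{if } x = 3
\end{cases}$$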
which looks like the right graph in Figure 2. Consequently, analogously to the previous example,
$$d_1\left(D^{MOM}(\mu), id_{[0,3]}\right) = \int_0^3 \left|D^{MOM}(\mu_x) - x\right| dx = \frac{5}{8}$$
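Consequently,
$$d_1\left(D^{MOM}(\mu), id_{[0,n+1]}\right) = \int_0^{n+1} \left|D^{MOM}(\mu_x) - x\right| dx$$
$$= \int_0^{1/2} \left(x - \frac{x}{2}\right) dx + 2n \int_{1/2}^{1} (1 - x)\,dx + \int_{n + 1/2}^{n+1} \left(\frac{n + 1 + x}{2} - x\right) dx$$
$$= \frac{1}{16} + \frac{1}{4} n + \frac{1}{16} = \frac{1}{4} n + \frac{1}{8}$$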
Scaled to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^{*}(x, \Theta_n)$, we therefore obtain
$$D_n = d_1\left(D^{MOM}(\mu^{a,b}), id_{[0,1]}\right) = \frac{\frac{1}{8} + \frac{1}{4} n}{(n + 1)^2} = \frac{1 + 2n}{8 (n + 1)^2}$$
Scaled to the unit interval, and considering Θn on [0, n + 1] and putting $\mu_x^{n} := \theta^{*}(x, \Theta_n)$, we have that
$$D^{MOM}(\mu_x^{n}) = \begin{cases}
0 & \text{if } x = 0 \\
\dfrac{x}{2} & \text{if } x \in \left]0, \dfrac{1}{2n+2}\right[ \\
\dfrac{3}{4n+4} & \text{if } x = \dfrac{1}{2n+2} \\
\dfrac{k}{n+1} & \text{if } x \in \left]\dfrac{2k-1}{2n+2}, \dfrac{2k+1}{2n+2}\right[ \text{ with } k \in \{1, 2, ..., n\} \\
\dfrac{2k+1}{2n+2} & \text{if } x = \dfrac{2k+1}{2n+2} \text{ with } k \in \{1, 2, ..., n-1\} \\
\dfrac{4n+1}{4n+4} & \text{if } x = \dfrac{2n+1}{2n+2} \\
\dfrac{1+x}{2} & \text{if } x \in \left]\dfrac{2n+1}{2n+2}, 1\right[ \\
1 & \text{if } x = 1
\end{cases}$$
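Therefore,
$$\left\| D^{MOM}(\mu_x^{n}) - id_{[0,1]} \right\|_{\infty} = \left( \sup_{t \in \left]0, \frac{1}{2n+2}\right[} \left| t - \frac{t}{2} \right| \right) \vee \left| \frac{3}{4n+4} - \frac{1}{2n+2} \right| \vee \left( \sup_{k=1}^{n}\ \sup_{t \in \left]\frac{2k-1}{2n+2}, \frac{2k+1}{2n+2}\right[} \left| t - \frac{k}{n+1} \right| \right)$$
$$\vee \left| \frac{2n+1}{2n+2} - \frac{4n+1}{4n+4} \right| \vee \left( \sup_{t \in \left]\frac{2n+1}{2n+2}, 1\right[} \left| t - \frac{1+t}{2} \right| \right)$$
$$= \frac{1}{4|n+1|} \vee \frac{1}{4|n+1|} \vee \frac{1}{2|n+1|} \vee \frac{1}{4|n+1|} \vee \frac{1}{4|n+1|} = \frac{1}{2|n+1|}$$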
4.2 COG-defuzzification
Examples 11
$$d_1\left(D^{COG}(\mu_x), id_{[0,2]}\right) = \int_0^2 \left|D^{COG}(\mu_x) - x\right| dx = \frac{8}{3} + \frac{2}{3} \ln 2 - 2\sqrt{2}\, \ln\left(\sqrt{2} + 1\right)$$
II. Considering Ξ2 on [0, 3] and putting $\mu_x := \theta^{*}(x, \Xi_2)$, following [6], we have that
$$D^{COG}(\mu_x) = \begin{cases}
\dfrac{x^3 + 3x^2 - 9x - 1}{3(x^2 - 2x - 1)} & \text{if } 0 \le x \le 1 \\
\dfrac{3x^2 - 11x + 6}{2(x^2 - 3x + 1)} & \text{if } 1 \le x \le 2 \\
\dfrac{x^3 - 3x^2 - 8}{3(x^2 - 4x + 2)} & \text{if } 2 \le x \le 3
\end{cases}$$
which looks like the right graph in Figure 3. Consequently, analogously to the first example,
$$d_1\left(D^{COG}(\mu_x), id_{[0,3]}\right) = \int_0^3 \left|D^{COG}(\mu_x) - x\right| dx = \frac{35}{12} + \frac{8}{3} \ln 2 - 2\sqrt{2}\, \ln\left(\sqrt{2} + 1\right) - \ln 5$$
Fig. 3 The input–output curve for the rule bases Ξ1 and Ξ2 with COG-defuzzification
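$$d_1\left(D^{COG}(\mu_x), id_{[0,n+1]}\right) = \int_0^{n+1} \left|D^{COG}(\mu_x) - x\right| dx$$
$$= \frac{8}{3} - 2\sqrt{2}\, \ln\left(\sqrt{2} + 1\right) + \frac{2}{3} \ln 2 + 2(n - 1)\left(-\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8}\right)$$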
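Scaled to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^{*}(x, \Theta_n)$, we therefore obtain
$$D_n = d_1\left(D^{COG}(\mu_x^{a,b}), id_{[0,1]}\right) = \frac{\frac{8}{3} - 2\sqrt{2}\, \ln\left(\sqrt{2} + 1\right) + \frac{2}{3} \ln 2 + 2(n - 1)\left(-\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8}\right)}{(n + 1)^2}$$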
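We might as well apply scaling and consider Θn on [0, 1] right away. Then, putting $\nu_x^{n} := \theta^{*}(x, \Theta_n)$, we obtain that on $\left[0, \frac{1}{n+1}\right]$,
$$\lim_{n \to \infty} \left|D^{COG}(\nu_x^{n}) - x\right| \le \lim_{n \to \infty} \frac{|(n - 2)x| + |3(n + 3)x| + |3(3n + 2)x| + 1}{3 \left|(n + 1)^2 x^2 - 2(n + 1) - 1\right|} = 0,$$
and that on $\left[\frac{1}{n+1}, \frac{2}{n+1}\right]$,
$$\lim_{n \to \infty} \left|D^{COG}(\nu_x^{n}) - x\right| \le \lim_{n \to \infty} \frac{|8x| + |6(n + 3)x| + |(11n + 13)x| + 6}{2 \left|(n^2 + 2n + 1)x^2 + (-3n - 3)x + 1\right|} = 0,$$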
4.3 BADD-defuzzification
which are calculated by combining the results in [6] for the appropriate domains, which means that we consider the defuzzification $D^{BADD}(\theta^{*}(x, \Xi_n), \gamma)$ first, and then scale it to the unit interval by use of Proposition 7. Therefore we can derive that
$$d_1\left(D^{BADD}(\theta^{*}(x, \Theta_n), \gamma), id_{[0,1]}\right) = 2\, \frac{I_1^{\gamma} + I_2^{\gamma} + (n - 1) I_3^{\gamma}}{(n + 1)^2}$$
It is then a triviality to see that
$$\lim_{n \to \infty} d_1\left(D^{BADD}(\theta^{*}(x, \Theta_n), \gamma), id_{[0,1]}\right) = 0,$$
There are two standard ways to compare two different defuzzification methods and
the amount to which they fulfill the asymptotic consistency criterion. It is possible to
compare the quotient of the L1 –distances obtained by putting the same antecedent
rule base sequences through two different defuzzification processes, and take the
limit for an increasing number of antecedent rules. Alternatively, one may compare
this distance of one particular defuzzification operator on the elements of a fixed an-
tecedent rule base sequence with a parameter that is characteristic for this sequence.
As an immediate candidate, the width of the antecedent rule base comes to mind.
Definition 6. For any rule base sequence (Ξn)n ∈ R(X), we will define the relative fitness of the defuzzification operators D∗∗∗ and D··· as
$$RF^{***,\cdots}[(\Xi_n)_n] = \limsup_{n \to \infty} \frac{d_1\left(D^{***}(\theta^{*}(x, \Xi_n)), id_X\right)}{d_1\left(D^{\cdots}(\theta^{*}(x, \Xi_n)), id_X\right)}$$
It is easy to see that if both defuzzification operators D∗∗∗ and D··· satisfy the asymptotic consistency criterion, this limit is obviously of the form 0/0, and hence initially undetermined. One can calculate this value in particular cases, though.
Example 16
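For instance, let us compare the MOM-defuzzification (thin line) with the COG-defuzzification (thick line) on the rule base sequence (Θn)n mentioned in Section 3. The graphs are sketched in Figure 4. We obtain asymptotically
$$RF^{COG,MOM}[(\Theta_n)_n] = \limsup_{n \to \infty} \frac{d_1\left(D^{COG}(\theta^{*}(x, \Theta_n)), id_X\right)}{d_1\left(D^{MOM}(\theta^{*}(x, \Theta_n)), id_X\right)}$$
$$= \limsup_{n \to \infty} \frac{\dfrac{\frac{8}{3} - 2\sqrt{2}\, \ln\left|\sqrt{2} + 1\right| + \frac{2}{3} \ln 2 + 2(n - 1)\left(-\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8}\right)}{(n + 1)^2}}{\dfrac{1 + 2n}{8 (n + 1)^2}}$$
$$= 1 + 4 \ln \frac{4}{5} \approx 0.1074257947$$
which indicates that, with an increasing number of controllers, the L1-distance obtained with the COG-defuzzification eventually becomes about 10 times smaller than the one obtained with the MOM-defuzzification.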
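The convergence of this quotient can be followed numerically from the two closed formulas for the scaled distances given in Sections 4.1 and 4.2. The sketch below only evaluates those formulas; the names `C0`, `C1`, `d1_cog`, `d1_mom` and `ratio` are hypothetical shorthands introduced here.

```python
import math

# Constant and per-rule terms of the COG distance, copied from the closed formula above.
C0 = 8.0 / 3.0 - 2.0 * math.sqrt(2.0) * math.log(math.sqrt(2.0) + 1.0) + 2.0 / 3.0 * math.log(2.0)
C1 = -0.5 * math.log(5.0) + math.log(2.0) + 1.0 / 8.0


def d1_cog(n):
    """Scaled L1 distance for COG-defuzzification on Theta_n."""
    return (C0 + 2.0 * (n - 1) * C1) / (n + 1) ** 2


def d1_mom(n):
    """Scaled L1 distance for MOM-defuzzification on Theta_n."""
    return (1.0 + 2.0 * n) / (8.0 * (n + 1) ** 2)


def ratio(n):
    """Quotient whose limit superior is the relative fitness RF^{COG,MOM}."""
    return d1_cog(n) / d1_mom(n)


limit = 1.0 + 4.0 * math.log(4.0 / 5.0)
table = [(n, ratio(n)) for n in (1, 2, 5, 10, 100, 10 ** 6)]
```

According to these formulas the quotient is larger than 1 for n = 1 (the COG distance 0.6359 exceeds the MOM distance 3/8 there) and only drops towards 0.107 as n grows, which is why the statement above is an asymptotic one.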
For an absolute measure of fitness, we suggest the following definition:
Definition 7. For any rule base sequence (Ξn)n ∈ R(X), we will define the fitness of the defuzzification D∗∗∗ as
$$F^{***}[(\Xi_n)_n] = \limsup_{n \to \infty} \frac{d_1\left(D^{***}(\theta^{*}(x, \Xi_n)), id_X\right)}{\operatorname{width}(\Xi_n)}$$
The smaller this value is, the better the defuzzification D∗∗∗ is as a defuzzifier for the rule base (Ξn)n. If (Ξn)n ∈ R0(X) is a zero rule base sequence and if the defuzzification operator D∗∗∗ satisfies the asymptotic consistency criterion on (Ξn)n, then this limit is obviously again of the form 0/0. Note that this fitness depends on the antecedent rule base (Ξn)n as well as on the chosen defuzzification operator D∗∗∗; however, when applied to the same rule base, it is a means of comparing the speed with which a defuzzification operator tends to fulfill the asymptotic consistency criterion. It is furthermore trivial to see that any two defuzzification operators D∗∗∗ and D··· on the same rule base (Ξn)n fulfill
$$\frac{F^{***}[(\Xi_n)_n]}{F^{\cdots}[(\Xi_n)_n]} = RF^{***,\cdots}[(\Xi_n)_n]$$
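II. For the COG-defuzzification on the rule base sequence (Θn)n mentioned in Section 3, we obtain that
$$F^{COG}[(\Theta_n)_n] = \limsup_{n \to \infty} \frac{\dfrac{\frac{8}{3} - 2\sqrt{2}\, \ln\left|\sqrt{2} + 1\right| + \frac{2}{3} \ln 2 + 2(n - 1)\left(-\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8}\right)}{(n + 1)^2}}{\dfrac{2}{n + 1}} = \frac{1}{8} + \ln \frac{2}{\sqrt{5}} \approx 0.013428$$
Therefore
$$RF^{COG,MOM}[(\Theta_n)_n] = \frac{F^{COG}[(\Theta_n)_n]}{F^{MOM}[(\Theta_n)_n]} = \frac{\frac{1}{8} + \ln \frac{2}{\sqrt{5}}}{\frac{1}{8}} = 1 + 8 \ln \frac{2}{\sqrt{5}} \approx 0.107426$$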
then still vary with different values of γ and n. Remark that for all γ > 0 obviously $D_1^{\gamma} = E_1^{\gamma}$. If one tries for instance to find an optimal γ-value for a fixed n, this requires finding the derivative $\frac{dD_n^{\gamma}}{d\gamma}$, which is another good reason to seek refuge in numerical techniques, such as Simpson’s integration rule with steps of order $10^{-5}$. We programmed this in C so as to optimize the speed, and all calculations have been carried out with 80-bit precision. Section 4.2 teaches us that
$$E_1^{1} = D_1^{1} = \frac{8}{3} + \frac{2}{3} \ln 2 - 2\sqrt{2}\, \ln\left(\sqrt{2} + 1\right) \approx 0.635864$$
$$E_2^{1} = \frac{D_2^{1}}{\frac{2}{3}} = \frac{\frac{35}{108} + \frac{8}{27} \ln 2 - \frac{2}{9} \sqrt{2}\, \ln\left(\sqrt{2} + 1\right) - \frac{1}{9} \ln 5}{\frac{2}{3}} \approx 0.110453$$
Fig. 5 Plotting γ against $E_1^{\gamma} = D_1^{\gamma}$
Fig. 6 $\limsup_{\gamma \to \infty} E_1^{\gamma}$ converges to $E_1^{MOM}$
this for example for n = 1, for which the graph is given in Figure 6. Noticeably, quite soon a relaxation to $\limsup_{\gamma \to \infty} D_1^{\gamma} = 0.375 = \frac{3}{8}$ occurs, which can equally be found by considering Section 4.1.
(c). Can this result be extended such that for all other n, $\limsup_{\gamma \to \infty} E_n^{\gamma}$ converges to $E_n^{MOM}$?
Absolutely. In the table in Figure 7, we see the results for n ∈ {1, 2, 3} when γ increases in steps of 1,000. Since apparently $\limsup_{\gamma \to \infty} I_1^{\gamma} = \frac{1}{16}$ and both $\limsup_{\gamma \to \infty} I_2^{\gamma}$ and $\limsup_{\gamma \to \infty} I_3^{\gamma}$ equal $\frac{1}{8}$, we can find a closed formula
$$\limsup_{\gamma \to \infty} E_n^{\gamma} = \limsup_{\gamma \to \infty} \frac{I_1^{\gamma} + I_2^{\gamma} + (n - 1) I_3^{\gamma}}{n + 1} = \frac{\frac{1}{16} + \frac{1}{8} + \frac{n - 1}{8}}{n + 1} = \frac{2n + 1}{16 (n + 1)}$$
which emphasizes the correctness of the result. Obviously this remains true when we take the limit for n → ∞.
(d). What about the fitness of $D^{BADD,\gamma}$?
If we fix a γ, we consequently obtain that
$$F^{BADD,\gamma}[(\Theta_n)_n] = \limsup_{n \to \infty} E_n^{\gamma} = \lim_{n \to \infty} \frac{I_1^{\gamma} + I_2^{\gamma} + (n - 1) I_3^{\gamma}}{n + 1} = I_3^{\gamma}$$
Therefore, a study of the values of $I_3^{\gamma}$ with respect to γ is required. We already know that $\lim_{\gamma \to \infty} I_3^{\gamma} = \frac{1}{8}$. Investigating $I_3^{\gamma}$ for γ ∈ {1, 2, 3, ..., 50} and γ ∈ {1, 1/2, 1/3, ..., 1/50}, we obtain the results in Table 4 in Figure 8. One would
γ | $I_3^{\gamma}$ | k | γ = 1/k | $I_3^{\gamma}$
1 0,013428224 1 1 0,013428224
2 0,038246249 2 0,5 0,059122185
3 0,064139561 3 0,333333 0,078830338
4 0,078842882 4 0,25 0,089601578
5 0,08807597 5 0,2 0,096345332
6 0,094331906 6 0,166667 0,100950403
7 0,098820545 7 0,142857 0,104289498
8 0,102185503 8 0,125 0,106819211
9 0,104796014 9 0,111111 0,108800879
10 0,106877353 10 0,1 0,110394597
11 0,108574081 11 0,090909 0,111703802
12 0,109982924 12 0,083333 0,112798246
13 0,111170909 13 0,076923 0,113726648
14 0,112185881 14 0,071429 0,114524057
15 0,113062866 15 0,066667 0,11521632
16 0,11382808 16 0,0625 0,115822915
17 0,114501521 17 0,058824 0,116358795
18 0,115098701 18 0,055556 0,116835627
19 0,115631837 19 0,052632 0,117262652
20 0,116110674 20 0,05 0,117647276
21 0,116543082 21 0,047619 0,11799551
22 0,116935485 22 0,045455 0,118312281
23 0,117293175 23 0,043478 0,118601664
24 0,117620554 24 0,041667 0,118867062
25 0,117921311 25 0,04 0,119111336
26 0,118198558 26 0,038462 0,119336909
27 0,118454945 27 0,037037 0,119545849
28 0,118692737 28 0,035714 0,119739929
29 0,118913883 29 0,034483 0,119920678
30 0,119120069 30 0,033333 0,120089424
31 0,119312761 31 0,032258 0,120247324
32 0,11949324 32 0,03125 0,12039539
33 0,119662631 33 0,030303 0,120534513
34 0,119821924 34 0,029412 0,120665479
35 0,119971994 35 0,028571 0,120788984
36 0,12011362 36 0,027778 0,120905649
37 0,120247493 37 0,027027 0,121016025
38 0,120374233 38 0,026316 0,121120609
39 0,120494395 39 0,025641 0,121219844
40 0,120608477 40 0,025 0,121314129
41 0,120716929 41 0,02439 0,121403827
42 0,120820158 42 0,02381 0,121489264
43 0,120918532 43 0,023256 0,121570737
44 0,121012384 44 0,022727 0,121648515
45 0,121102021 45 0,022222 0,121722843
46 0,121187718 46 0,021739 0,121793947
47 0,121269731 47 0,021277 0,121862032
48 0,121348292 48 0,020833 0,121927285
49 0,121423613 49 0,020408 0,12198988
50 0,121495892 50 0,02 0,122049976
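The first entries of this table can be cross-checked with a small numerical sketch. It rests on an assumption that is a reading of the formulas above rather than something stated explicitly: $I_3^{\gamma}$ is taken to be half of the L1-deviation over one interior cell, here the cell [1, 2] of the rule base Ξ2 on [0, 3], with consequents equal to the antecedents. The helper names are hypothetical and the code is not the authors' C program.

```python
def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def tri(center):
    """Triangular fuzzy set with peak 1 at `center` and base width 2."""
    return lambda t: max(0.0, 1.0 - abs(t - center))


# Xi_2 on [0, 3]: peaks at 0, 1, 2 and 3 (the outer sets are half triangles on X).
RULES = [tri(float(c)) for c in range(4)]


def mu_x(x):
    """Output fuzzy set for input x (min for clipping, max for aggregation)."""
    levels = [alpha(x) for alpha in RULES]
    return lambda y: max(min(alpha(y), k) for alpha, k in zip(RULES, levels))


def badd(mu, gamma, a=0.0, b=3.0, n=600):
    """BADD defuzzification: ratio of the integrals of y*mu^gamma and mu^gamma."""
    def weight(y):
        m = mu(y)
        return m ** gamma if m > 0.0 else 0.0
    return simpson(lambda y: y * weight(y), a, b, n) / simpson(weight, a, b, n)


def i3(gamma, n=100):
    """ASSUMED reading of I_3^gamma: half the L1 deviation over the interior cell [1, 2]."""
    return 0.5 * simpson(lambda x: abs(badd(mu_x(x), gamma) - x), 1.0, 2.0, n)


i3_gamma1 = i3(1.0)  # table entry for gamma = 1: 0.013428224
i3_gamma2 = i3(2.0)  # table entry for gamma = 2: 0.038246249
```

For γ = 1 the value can also be obtained in closed form as $\frac{1}{2}\left(\frac{1}{4} + \ln\frac{4}{5}\right)$, which agrees with the first row of the table.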
6 Conclusions
Although we are fully aware of the limited number of cases we investigated in this article, we would nevertheless like to point out that a consistency criterion as formulated in [6], or an asymptotic consistency criterion as formulated in Section 4, turns out to be a key notion in understanding the consistency of a fuzzy controller. When looking at a family of defuzzification operators, such as the BADD-defuzzification operators introduced in [3], the optimal value for the adjustable parameter γ turns out to be anything but the obvious one. We therefore think that a much deeper investigation is needed to establish a link between the rule bases, the defuzzification operators, their width, and the consistency of functions other than the identity that are mapped through the fuzzy controller, not to mention the computational complexity involved.
References
1. D. Dubois, J. Lang and H. Prade. Fuzzy sets in approximate reasoning, Part 2: Logical ap-
proaches. Fuzzy Sets and Systems 40, pp. 203–244, 1991
2. D. Dubois and H. Prade. Fuzzy sets in approximate reasoning, Part 1: Inference with possibility
distributions. Fuzzy Sets and Systems 40, pp. 143–202, 1991
3. D.P. Filev and R.R. Yager. A generalized defuzzification method via BADD distributions. In-
ternat. J. Intelligent Systems 6, pp. 687–697, 1991
4. A.N. Kolmogorov and S.V. Fomin. Measure, Lebesgue Integrals, and Hilbert Space. Acad-
emic Press, New York, 1961
5. E.E. Kerre. A comparative study of the behaviour of some popular fuzzy implication operators.
In: L.A. Zadeh and J. Kacprzyk, eds., Fuzzy Logic For The Management of Uncertainty.
Wiley, New York, 1992
6. H. Lee, E. Paillet and W. Peeters. A Consistency Criterion for Optimizing Defuzzification in
Fuzzy Control. In: Foundations of Generic Optimization Vol II: Applications of Fuzzy Control,
Genetic Algorithms and Neural Networks, Editors R. Lowen and A. Verschoren. Mathematical
Modelling: Theory and Applications, Springer Verlag, 2007
7. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer
Academic, Dordrecht, 1996
8. E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic con-
troller. Int. Journal of Man–Machine Studies 7, pp. 1–13, 1975
9. A.M. Norwich and I.B. Turksen. A model for the measurement of membership and the conse-
quences of its empirical implementation. Fuzzy Sets and Systems 12, pp. 1–25, 1985
10. D. Ruan, E.E. Kerre, G. De Cooman, B. Cappelle and F. Vanmassenhove. Influence of the fuzzy
implication operator on the method-of-cases inference rule. Internat. J. Approx. Reasoning, 4,
pp. 307–318, 1990
11. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies — towards a the-
ory of rational defuzzification operators. Second IEEE International Conference on Fuzzy
Systems, San Francisco, pp. 1161–1166, 1994
12. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies — towards a the-
ory of rational defuzzification operators. Second IEEE International Conference on Fuzzy
Systems, San Francisco, pp. 1161–1166, 1993
13. M. Sugeno. An introductory survey of fuzzy control. Inform. Sci. 36, pp. 59–83, 1985
14. W. Van Leekwijck and E.E. Kerre. Defuzzification: criteria and classification. Fuzzy Sets and
Systems 108, pp. 159–178, 1999
15. R.R. Yager and D.P. Filev. SLIDE: A simple adaptive defuzzification method. IEEE Trans.
Fuzzy Systems 1(1), pp. 69–78, 1993
16. L.A. Zadeh. Fuzzy sets. Inform. Control 8, pp. 338–353, 1965
17. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision
processes. IEEE Trans. Syst. Man. Cybernet., 3, pp. 28–44, 1973
18. H.J. Zimmermann. Fuzzy Set Theory And Its Applications. Kluwer Academic, Boston/
Dordrecht/London, 1996
MATHEMATICAL MODELLING:
Theory and Applications
8. M.C. Bustos, F. Concha, R. Bürger and E.M. Tory: Sedimentation and Thick-
ening. Phenomenological Foundation and Mathematical Theory. 1999
ISBN 0-7923-5960-7
9. A.P. Wierzbicki, M. Makowski and J. Wessels (eds.): Model-Based Decision Sup-
port Methodology with Environmental Applications. 2000 ISBN 0-7923-6327-2