Verschoren A. (Ed), Lowen R. (Ed) - Foundations of Generic Optimization, Volume 2 - Applications of Fuzzy Control, Genetic Algorithms and Neural Networks (2008)

Foundations of Generic Optimization

MATHEMATICAL MODELLING:
Theory and Applications
VOLUME 24

This series is aimed at publishing work dealing with the definition, development
and application of fundamental theory and methodology, computational and
algorithmic implementations and comprehensive empirical studies in mathematical
modelling. Work on new mathematics inspired by the construction of
mathematical models, combining theory and experiment and furthering the
understanding of the systems being modelled are particularly welcomed.

Manuscripts to be considered for publication lie within the following, non-exhaustive
list of areas: mathematical modelling in engineering, industrial
mathematics, control theory, operations research, decision theory, economic
modelling, mathematical programming, mathematical system theory, geophysical
sciences, climate modelling, environmental processes, mathematical modelling
in psychology, political science, sociology and behavioural sciences,
mathematical biology, mathematical ecology, image processing, computer
vision, artificial intelligence, fuzzy systems, and approximate reasoning, genetic
algorithms, neural networks, expert systems, pattern recognition, clustering,
chaos and fractals.

Original monographs, comprehensive surveys as well as edited collections will
be considered for publication.

Managing Editor:
R. Lowen (Antwerp, Belgium)

Series Editors:
R. Laubenbacher (Virginia Bioinformatics Institute, Virginia Tech, USA)
A. Stevens (Max Planck Institute for Mathematics in the Sciences, Leipzig,
Germany)

The titles published in this series are listed at the end of this volume.
Foundations of Generic
Optimization
Volume 2: Applications of Fuzzy Control, Genetic
Algorithms and Neural Networks

Edited by

R. Lowen
University of Antwerp, Belgium

and

A. Verschoren
University of Antwerp, Belgium
Robert Lowen, University of Antwerp, Belgium
Alain Verschoren, University of Antwerp, Belgium

ISBN: 978-1-4020-6667-2 e-ISBN: 978-1-4020-6668-9

All Rights Reserved


© 2008 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means, electronic, mechanical, photocopying, microfilming, recording
or otherwise, without written permission from the Publisher, with the exception of any
material supplied specifically for the purpose of being entered and executed on a computer
system, for exclusive use by the purchaser of the work.

Printed on acid-free paper.


springer.com
Contents

An Overview of Fuzzy Control Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
W. Peeters
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Structure of a Fuzzy Controller . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Fuzzy Modelling Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 The Fuzzy Controller Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 Fuzzy Rule Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Linguistic Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Linguistic Hedges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Fuzzy Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Linguistic Variables Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 The Design of a Fuzzy Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.1 Choice of Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 Design Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4 Aggregation and Implication Operators . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.1 t–norms and t–conorms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Extension of Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3 Conjunction and Disjunction Operators . . . . . . . . . . . . . . . . . . . 30
4.4 Implication Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5 Defuzzification Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.1 Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2 Overview of the Different Defuzzification Operators . . . . . . . . 43
6 An Extended Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
7 Simplified Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
7.1 Table-Based Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
7.2 Sugeno Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8 Adaptive Fuzzy Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
8.1 General Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
8.2 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
8.3 Membership Function Tuning using Performance Criteria . . . 60
8.4 Gradient Descent Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8.5 Self-Organizing Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9 Stability Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.1 General Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.2 The Input–Output Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.3 The State Space Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
9.4 Lyapunov Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
9.5 Input–Output Stability and Related Techniques . . . . . . . . . . . . 87
10 Other Adaptive Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.1 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
10.2 Neuro-fuzzy Hybrid Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
10.3 Genetic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
10.4 Fuzzy-Genetic Hybrid Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 129
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Optimal Fuzzy Management of Reservoir based on Genetic Algorithm . . . 139
Alberto Cavallo and Armando Di Nardo
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
2 Reservoir Water Release Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
3 Mathematical Model of the Reservoir . . . . . . . . . . . . . . . . . . . . . . . . . . 142
3.1 Volume Balance Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
3.2 Hybrid Dynamical Model of the Reservoir . . . . . . . . . . . . . . . . 144
4 Fuzzy Decision System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5 Optimizing the Decision Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.1 Genetic Algorithm and Fuzzy Membership Function
Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.2 Performances Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6 Inflow Identification and Montecarlo Simulation . . . . . . . . . . . . . . . . . 149
7 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

Genetic Fuzzy Modeling of Supervisory Scheduling of Freight Rail
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Francisco Mota Filho, Rodrigo Goncalves, and Fernando Gomide
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
2 Genetic Fuzzy Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
3 Supervisory Train Schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
Multiobjective Evolutionary Search of Difference Equations-based
Models for Understanding Chaotic Systems . . . . . . . . . . . . . . . . . . . . . . . . . 181
Luciano Sánchez and José R. Villar
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
2 Evolutionary Transparent Modeling of Chaotic Systems . . . . . . . . . . . 183
3 Operators Used in the Evolutionary Searches . . . . . . . . . . . . . . . . . . . . 185
3.1 Representation of an Individual . . . . . . . . . . . . . . . . . . . . . . . . . 185
3.2 Random Generation of Genotypes . . . . . . . . . . . . . . . . . . . . . . . 186
3.3 Genetic Crossover and Mutation . . . . . . . . . . . . . . . . . . . . . . . . . 186
3.4 Fitness Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
4 Detailed Description of the MOSA Algorithm . . . . . . . . . . . . . . . . . . . 188
4.1 Outline of the Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
4.2 The Distance Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
4.3 The Selection Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
4.4 Example of a MOSA Evolution . . . . . . . . . . . . . . . . . . . . . . . . . 190
5 Experiment and Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
5.1 Dynamic Behavior of Universal Approximators . . . . . . . . . . . . 190
5.2 Benchmark Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
6 Concluding Remarks and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 198
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
An Integrated Fuzzy Inference-based Monitoring, Diagnostic,
and Prognostic System for Intelligent Control and Maintenance . . . . . . . . 203
Dustin R. Garvey and J. Wesley Hines
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
1.1 Reliability Engineering Methods . . . . . . . . . . . . . . . . . . . . . . . . 204
1.2 Integrated Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
2 Nonparametric Fuzzy Inference System . . . . . . . . . . . . . . . . . . . . . . . . 206
3 Embodiments of the NFIS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
3.1 Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
3.2 Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
3.3 Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
3.4 Prognosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
4 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
5.1 Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
5.2 Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
5.3 Prognosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

Stable Anti-Swing Control for an Overhead Crane with Velocity
Estimation and Fuzzy Compensation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Wen Yu, Xiaoou Li, and George W. Irwin
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
3 Anti-Swing Control for the Overhead Crane . . . . . . . . . . . . . . . . . . . . . 227
4 Position Control with Fuzzy Compensation . . . . . . . . . . . . . . . . . . . . . 228
5 PD Control with a Velocity Observer . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
6 Stability Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7 Experimental Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

Intelligent Fuzzy PID Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Prof. H.B. Kazemian, PhD, SMIEEE
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
2 The Development of Self-Organizing Fuzzy PID Controller . . . . . . . . 243
3 Kinematics and Dynamics of the Robot-Arm . . . . . . . . . . . . . . . . . . . . 248
4 Computer Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Stability Analysis and Performance Design for Fuzzy Model-based
Control Systems using a BMI-based Approach . . . . . . . . . . . . . . . . . . . . . . . 261
H.K. Lam, Member, IEEE and F.H.F. Leung, Senior Member, IEEE
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
2 Fuzzy Model and Fuzzy Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
2.1 Fuzzy Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
2.2 Fuzzy Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
3 Stability Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
4 Design of G_j and a_j for the Fuzzy Controller . . . . . . . . . . . . . . . . . . . . 269
4.1 Design of Feedback Gains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
4.2 Solution Solving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
5 BMI-Based Performance Design of Fuzzy Model-Based Control
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
6 Simulation Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
6.1 Simulation Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
6.2 Simulation Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Two-Level Tuning of Fuzzy PID Controllers for Multivariable Process
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
George K.I. Mann and Eranda Harinath
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
2 System Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
3 Two-Level Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
4 Low-Level Tuning: Linear PID Controller Tuning . . . . . . . . . . . . . . . . 288
4.1 Tuning First Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
4.2 Tuning ith loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
5 High-Level Tuning: Nonlinearity Tuning . . . . . . . . . . . . . . . . . . . . . . . 291
5.1 Standard Additive Model (SAM) . . . . . . . . . . . . . . . . . . . . . . . . 292
5.2 SAM Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
6 Fuzzy PID (FPID) Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
6.1 High-Level Nonlinear Tuning Variables . . . . . . . . . . . . . . . . . . . 296
6.2 Design of SAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
7 Stability Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
7.1 Direct Nyquist Array (DNA) Stability Theorem . . . . . . . . . . . . 298
7.2 Maximum Values of PID Parameters . . . . . . . . . . . . . . . . . . . . . 299
8 Control Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
8.1 Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
8.2 Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
9 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
10 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310

Evaluation of Fuzzy Implications and Intuitive Criteria of GMP
and GMT using MATLAB GUI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Sudesh K. Kashyap, J.R. Raol, and Ambalal V. Patel
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
2 Intuitive Criteria of GMP and GMT . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
3 Fuzzy Implication Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
4 Properties of Interpretations of Fuzzy IF-THEN Rules . . . . . . . . . . . . 319
5 Study of Satisfaction of Criteria using MATLAB/Graphics . . . . . . . . 320
6 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

FzController: A Development Environment for Fuzzy Controllers . . . . . . 387
I. Alvarez-López, O. Llanes-Santiago, and J.L. Verdegay
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
2 General Conception of the FzController System . . . . . . . . . . . . . . . . . . 388
2.1 Exact method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
2.2 Approximated Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
3 Modules in FzController . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
3.1 Identification Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
3.2 Design of Fuzzy Controllers Module . . . . . . . . . . . . . . . . . . . . . 391
3.3 Real-Time Control Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
3.4 Automatic Generation of Codes Module . . . . . . . . . . . . . . . . . . 397
4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
A Consistency Criterion for Optimizing Defuzzification
in Fuzzy Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Hyei Kyung Lee, Eric Paillet, and Werner Peeters
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
2 MOM- and COG-defuzzification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
2.1 Single Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
2.2 Two Single Disjoint Controllers . . . . . . . . . . . . . . . . . . . . . . . . . 411
2.3 Two Subcentrally Overlapping Controllers . . . . . . . . . . . . . . . . 412
2.4 Two Supercentrally Overlapping Controllers . . . . . . . . . . . . . . 416
2.5 Overlapping Controllers with Border Conditions . . . . . . . . . . . 420
3 The Consistency Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
4 BADD-defuzzification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
4.1 Results with No Border Constraints . . . . . . . . . . . . . . . . . . . . . . 428
4.2 Results with Border Constraints . . . . . . . . . . . . . . . . . . . . . . . . . 429
5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431

An Asymptotic Consistency Criterion for Optimizing Defuzzification
in Fuzzy Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
Hyei Kyung Lee, Eric Paillet, and Werner Peeters
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
2 Rule Antecedent Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
3 Rule Base Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
4 The Asymptotic Consistency Criterion . . . . . . . . . . . . . . . . . . . . . . . . . 440
4.1 MOM-defuzzification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
4.2 COG-defuzzification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
4.3 BADD-defuzzification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
5 Defuzzification Fitness Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
An Overview of Fuzzy Control Theory

W. Peeters

Abstract This chapter serves as an introductory article and gives an overview of
the mathematical methods applied in fuzzy control, such as fuzzification,
aggregation and defuzzification. We also discuss the advantages and disadvantages
of the various techniques with respect to the achievability of their goals, and we
give a brief overview of “hybrid techniques”, i.e. techniques that involve fuzzy
control as well as other artificial intelligence computing methods, such as neural
networks and genetic algorithms.

Keywords: fuzzy control, fuzzification, defuzzification, hybrid neural networks,
fuzzy genetic algorithms

1 Introduction

1.1 History

Fuzzy control ([21, 168]) is a tool for modelling the control of complex systems,
derived from knowledge obtained through human experience. Unlike ordinary expert
systems, fuzzy control systems do not require the time-consuming process of
designing appropriate algorithms for modelling human behavior. Owing to its relative
heuristic simplicity, fuzzy control is an excellent means of controlling
engineering-oriented applications without a thorough understanding of the underlying
mechanism; often it is sufficient to develop a control strategy from a few simple
“rules of thumb”, which constitute a merely sufficient collection of conditions to
keep the system stable, i.e. to keep the error within reasonable bounds. In the
example of a fuzzy controlled car ([142]), for instance, the designers would want to
make sure that their vehicle does not bump into other objects, without prior
knowledge of the precise location of those objects, so that the car would still
function in a different environment. Fuzzy control is used in a wide range of applied
sciences, including physics, electronics and economics. It is a powerful tool for
steering complex processes without the need to design difficult tuning functions.
The nature of these systems makes fuzzy control an easy tool for modelling human
experience, and even for adaptive learning of control behavior. In particular, the
areas where these techniques are crossbred with other successful self-tuning
algorithms, such as neural networks and genetic algorithms (see Section 10), have
produced very interesting results, although the design of a fuzzy controller is
inherently heuristic.

W. Peeters
University of Antwerp
Dept. of Mathematics and Computer Science
Middelheimlaan 1
B-2020 Antwerp, Belgium, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 1–138.
© 2008 Springer.

While the first application of fuzzy sets to control theory, in this case to a steam
engine, was carried out in 1975 by E.H. Mamdani and S. Assilian ([94]), the first
practical industrial application can be traced back to 1982, when L.P. Holmblad
and J.J. Østergaard applied fuzzy control to a cement kiln. Methods to control
an automated car ([142]) were extended to automated steering systems for trains
([163]); these examples show that the first applications of fuzzy control invariably
occurred in large industrial processes. Only in the late 1980s, after some successful
implementations of fuzzy controllers by Japanese manufacturers in household
appliances such as vacuum cleaners and cameras ([60, 155]), did interest in the study
of fuzzy controllers grow to worldwide proportions. Credit is due to M. Sugeno
([141]), whose work was an important source of inspiration for implementing fuzzy
control systems as a contemporary innovation in popular appliances, thus making
fuzzy control a widely accepted, economically profitable and quite popular topic in
the engineering sciences. Fuzzy control is an approach to control systems that aims
to model human experience, as an alternative to expert control systems ([16]).
However, its origins trace back to control engineering rather than to techniques of
artificial intelligence. Fuzzy control is mostly a rule-based system, in which the
designer heuristically formulates a set of control rules, which makes the scope of
fuzzy control narrower than that of general expert control systems. The main
advantage is the relative simplicity with which fuzzy rule bases can be defined,
refined or tuned. Further studies, however, have shown that the design, the
robustness and the capability of outperforming conventional control systems, such as
PID controllers, depend largely on the circumstances in which one wants to perform
fuzzy control.
We have to admit at the same time that fuzzy control theory suffers from some
serious drawbacks, which have been repeatedly targeted by its adversaries and which
have been the subject of heated debates over the last few decades, in which the
question openly arises what the advantage of fuzzy control is over classical control
theory ([1]). Without going into detail, we feel it necessary to summarize these
counterarguments, so that the reader can bear them in mind, although we do not
think any of them makes fuzzy control theory superfluous.
• Fuzzy control theory is a largely empirical and heuristic theory, which lacks a
unifying design theory.
• For much too long, “fuzzy” has been a buzzword in commercial applications,
which has emptied the notion of its content.
• While fuzzy controllers have proved their usefulness in relatively simple control
schemes, multivariable control systems are much harder to develop, whereas crisp
control methods do not suffer from this drawback. For larger, more complex
systems, the time consumed by the design of a fuzzy controller is almost equal
to, or even exceeds, the time needed to construct a classical controller derived
from knowledge about dynamical systems.
• A general stability analysis method specific to fuzzy control (see Section 9)
does not exist yet, and many of the existing methods are simply generalizations
of crisp control stability methods.
• The mathematics behind crisp control theory involves much more difficult
methods, which makes fuzzy control “the easy way out”. We believe, however,
that the relative simplicity of fuzzy control can be an advantage as well: some
of the nonlinear differential equations that accurately describe the physical
model of a control system are not analytically solvable anyhow, or are very
unstable under perturbations, and then fuzzy control may well be as good as
any other approximation theory.
This first section focuses on the basic definitions and notations. In Section 2 we
establish a working definition for fuzzy rule bases, thus creating an environment in
which fuzzified data can be considered as input, so that we can complete the design
of a fuzzy controller in Section 3. Aggregation and implication operators and the
process of defuzzification are studied in Sections 4 and 5, respectively. All this
theory is illustrated with an extended example of an automated heating system in
Section 6. Simplifications of the fuzzy control theory, such as table-based
controllers and Sugeno controllers, are studied in Section 7, and the design of
adaptive fuzzy controllers, such as self-tuning and self-organizing controllers, in
Section 8. Section 9 contains a brief summary of stability control techniques for
fuzzy controllers, and the last section, Section 10, briefly describes which other
artificial computing techniques can successfully be combined with fuzzy controllers.

1.2 Structure of a Fuzzy Controller

Figure 1 shows a schematic breakdown of a fuzzy controller ([69]). As we can see,
the fuzzy controller is preceded by a preprocessor and followed by a postprocessor
block. The preprocessor usually is a device that makes crisp measurements, which
are most often numerical rather than linguistic in nature. During preprocessing,
some calculations are already performed which have no real connection to the
fuzzy control process, but which can nevertheless have considerable influence. Some of the
4 W. Peeters

Fig. 1 Schematical breakdown of a fuzzy controller (blocks, in order: preprocessing;
fuzzification; rule base / inference, comprising implication and aggregation;
defuzzification; postprocessing)

Fig. 2 An example of nonlinear scaling (horizontal axis from −10 to 10 with
breakpoints “small”, “medium” and “large”; vertical axis from −1 to 1)

processes that may be carried out in the preprocessing block comprise, but are not
limited to:
• Quantization of the measurements. When sampling, typical errors occur that are
caused by rounding off to the nearest allowed value, depending on the coarseness
of the quantization steps or the precision of the measuring equipment.
Quantization is a means to reduce the data input, but if it is too coarse, the
controller may oscillate around the reference or even become unstable. The number
of quantization steps is therefore always a trade-off between the computing
resources at one's disposal and the desired precision. If the allowed measurement
values are for instance only −4, −3, −2, −1, 0, 1, 2, 3, 4, a measurement of x = 2.5
is rounded off to 3, causing an error of 0.5, which is 6.25% of the total width of the
range space.
One possible solution to overcome problems with quantization is nonlinear
scaling ([62]) — see Figure 2. Typically, the end user is asked to enter three typ-
ical numbers for a small, medium and a large measurement, respectively. These
numbers then are considered as the break–points on a piecewise linear curve that
scale the incoming measurements. Although similar techniques are used in the
fuzzy controller itself, this technique is strictly speaking completely independent
of it.
• Normalization or scaling of the measurements onto a particular, standard range
• Removing noise by filtering
• Averaging out the results over a number of measurements, in order to obtain the
tendencies over a longer term
• Differentiation and integration (or their discrete equivalents) in order to
calculate the rate of change or the cumulative result of the measurements,
respectively. This step is mostly carried out whenever the input to the controller
is some numerical value describing the error. Whenever the preprocessor also
calculates the derivative and the integral of this error function, the change in
error and the error history are also considered as inputs for the fuzzy controller.
This is in fact the basic idea behind what ordinary control theory calls a
PID–controller.
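The preprocessing steps listed above can be illustrated with a short Python sketch. It shows quantization to a set of allowed values, piecewise linear (nonlinear) scaling through user-supplied breakpoints, and the discrete change and cumulative sum of an error sequence. The function names, the breakpoint values and the clamping outside the breakpoints are assumptions made for illustration only; they are not taken from [62] or [69].

```python
from bisect import bisect_right

def quantize(x, levels):
    """Round x to the nearest allowed value; ties go to the larger level."""
    return min(levels, key=lambda q: (abs(q - x), -q))

def nonlinear_scale(x, breakpoints, targets):
    """Piecewise linear scaling: each breakpoint is mapped to its target and
    values in between are interpolated; values outside are clamped."""
    if x <= breakpoints[0]:
        return targets[0]
    if x >= breakpoints[-1]:
        return targets[-1]
    i = bisect_right(breakpoints, x) - 1
    x0, x1 = breakpoints[i], breakpoints[i + 1]
    y0, y1 = targets[i], targets[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def error_signals(errors, dt=1.0):
    """Return (error, change in error, cumulative error) for the latest sample."""
    e = errors[-1]
    de = (errors[-1] - errors[-2]) / dt if len(errors) > 1 else 0.0
    ie = sum(errors) * dt
    return e, de, ie

# The example from the text: 2.5 is rounded to 3 on the grid -4, ..., 4,
# an error of 0.5 on a range of width 8, i.e. 6.25%.
levels = list(range(-4, 5))
q = quantize(2.5, levels)
relative_error = abs(q - 2.5) / (levels[-1] - levels[0])

# Hypothetical breakpoints for "small", "medium" and "large" measurements.
scaled = nonlinear_scale(2.5, [0.0, 1.0, 4.0, 10.0], [0.0, 1 / 3, 2 / 3, 1.0])
```

The three values returned by `error_signals` correspond to the error, the change in error and the error history that a PID-like fuzzy controller would receive as inputs.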
The issues that typically surface during the postprocessing are similar to the ones
in preprocessing. Both preprocessing and postprocessing fall beyond the scope of
this overview article; we will therefore concentrate on the fuzzy controller processes
themselves. We assume that the preprocessor passes a finite number of crisp mea-
surement values into the controller, and we expect the fuzzy controller to feed a
numerical output value to the postprocessor, on which value some control decision
may be based.

1.3 Fuzzy Modelling Requirements

Ideally, fuzzy control, and by extension any kind of automated control, would
be a process in which a relatively small number of control parameters is given as
input, and a desired output state is required. Of course this will almost never succeed
immediately, and an error between the desired output and the actual output will be
generated. Consequently, the controller designer will try to adjust the input values
in such a way that the output varies as a (preferably continuous) function of
the input values. In conventional (nonfuzzy) controllers, depending on
the chosen model, the errors over a certain time window, say k measurements
$e_t, e_{t-1}, \ldots, e_{t-k+1}$, as well as the output control values
$u_{t-1}, \ldots, u_{t-k+1}$, will be stored in memory, and possibly a model
$u_t := f(e, u)$ will be designed that determines the action to be taken.
In fuzzy control, it is not necessary to explicitly define the control action as a
function of the previous control and error variables; instead, a set of control rules
is defined by means of linguistic variables. The various rules generate a number
of rule consequences, which are then combined into one fuzzy set that describes the
possible control actions that can be taken, a process called aggregation
(see Section 4). Finally, a suitable method has to be designed to generate
one crisp control value from this aggregated consequence; this latter process is called
defuzzification (see Section 5). Apart from this, one also has to consider the number
of input signals, the shape of the fuzzy membership functions that make up the
linguistic variables, the number of fuzzy rules and much more.
Since, however, the rules in this knowledge base are the only tool with which the
system designer can express his expert knowledge, the behavior of the system will be
fundamentally influenced by this design. Therefore, sufficient time should be reserved
to obtain and derive these rules. A suitable set of rules is mandatory
to obtain closed–loop behavior of the system and to finally reach some kind of
equilibrium. Sugeno and Nishida recommend in [142] the following ways to find
control rules:
• The operator’s experience and the control engineer’s knowledge. In [62], an op-
erator’s handbook for a cement kiln, such a collection of rules of thumb is estab-
lished by organizing an extensive questioning of experts on the subject. This is a
very time-consuming process.
• Fuzzy modelling of the operator’s control actions. Fuzzy IF–THEN rules can be
deduced from observation of an operator’s control actions or a log book. The
rules express input–output relationship.
• Fuzzy modelling of the process. Considering the linguistic rule base as the inverse
model of the control process, this inverse model may be used to obtain the fuzzy
control rules. This model can only be used with relatively low order systems,
but it provides an explicit solution to the inverse problem, assuming that fuzzy
models of the open- and closed-loop systems are available. For more information,
we refer to [84] and [116].
H.-J. Zimmermann adds in [168] that the following sources may also be useful:
– Crisp modelling of the process
– Heuristic design rules
– On–line adaptation of the rules
• Self–learning controllers. Other interesting and more recent approaches are those
in which the controller determines the rules itself. The theory of fuzzy control
is crossbred with theory involving genetic algorithms and neural networks (see
Section 10), and this has recently produced some encouraging results.
This list is, however, neither complete nor universally necessary. Just as in
conventional control, better knowledge of the system design leads to
better control results. There is, however, no fixed design procedure in fuzzy control;
the various freeware and commercial software tools all use different strategies to
establish the rule base.
The reason for the vast success of fuzzy controllers is their fairly simple
computational behavior; their obvious weakness, as is well known, is the inherently
heuristic nature of the design of a fuzzy controller. The wide choice
of shapes and parameters for the control variables shows the need for a solid
mathematical foundation, next to some obvious heuristic constraints which the controlled
system has to satisfy. Mathematically speaking, fuzzy control is based on the
concept of fuzzy sets as introduced by L.A. Zadeh ([164] and [165]), extending the
notion of a membership function from a two-valued one to one whose
values vary continuously within I = [0, 1].
The most obvious kind of fuzzy control is so-called direct control. The output
of the process is directly compared to a desired reference value, and if there
is a deviation, the controller takes action depending on the numerical value of
the error, the change in error and/or the cumulative error. This kind of controller is
an immediate substitute for the so-called PID–controllers (proportional, integral,
derivative). Another possible strategy is the so-called feedforward controller, which
is put parallel to a conventional controller, say a PID–controller; if the latter
becomes mathematically too complicated, the fuzzy controller takes over and
compensates for the possible disturbances of the model. A third, widespread technique is
to use fuzzy rules to fine-tune the parameters in a parameter adaptive controller,
also called gain scheduling. A gain scheduling controller contains a linear
controller whose parameters are changed as a function of the operating point, and is a
good way to compensate for nonlinearities in the parameter variations.

1.4 The Fuzzy Controller Block

As everybody who is familiar with the basic concept of fuzzy control knows, three
key issues in the design of a fuzzy control system are:
• The choice of a suitable set of fuzzy variables, being functions from the space
in which control measurements are performed. Mostly this will be functions α
from R (or commonly, a closed interval thereof) to I.
• The choice of an implication function, or, equivalently, a set of linguistic rules,
each of the type

IF ($X_1 = A_1$) and ... and ($X_n = A_n$) THEN ($Y = B$)

where the variables $X_i$ are linguistic and linked to the fuzzy membership
sets $\alpha_i$, coupled with an aggregation function to combine the consequences
of these assertions, and an implication function
• The choice of a suitable defuzzification method, assigning one crisp value to
the aggregated consequence function
Any combination of the three above will be referred to as a fuzzy controller block.

1.5 Notations

1.5.1 Definitions (Fuzzy Sets)

A fuzzy set will be denoted as µ : X −→ I, where X is the universe and I the unit
interval [0, 1]. The collection of all fuzzy sets on X shall be denoted as F(X).
The following properties of fuzzy sets will be used throughout this text:

1.5.2 Definition (α –Cuts)

For any fuzzy set µ ∈ F(X) and any number α ∈ [0, 1], the α –cut of µ will be
denoted as the following crisp subset of X:

Γα (µ ) = {x ∈ X : µ (x) ≥ α }

and the strong α –cut of µ will be denoted as the following crisp subset of X:

Γ∗α (µ ) = {x ∈ X : µ (x) > α }.

1.5.3 Definition (Height, Normality)

For any fuzzy set µ ∈ F(X), the height of µ will be defined as

$h(\mu) = \sup_{x \in X} \mu(x).$

A fuzzy set µ ∈ F(X) will be called normal if and only if h(µ ) = 1.

1.5.4 Definition (Convexity)

A fuzzy set µ ∈ F(X) will be called convex if and only if

∀λ ∈ [0, 1], ∀x, y ∈ X : µ (λ y + (1 − λ )x) ≥ λ µ (y) + (1 − λ )µ (x).

1.5.5 Definition (Support and Core of a Fuzzy Set)

For any fuzzy set µ ∈ F(X), the following crisp subsets of X will be of utmost
importance:
• The support of µ equals

$\mathrm{supp}(\mu) := \overline{\{x \in X : \mu(x) > 0\}} = \overline{\Gamma^*_0(\mu)}$

where the bar denotes the usual closure operator for a topology on X, in the
worst case the discrete one.
• The core of µ equals

$\mathrm{core}(\mu) := \{x \in X : \mu(x) = h(\mu)\}$
$= \{x \in X : \mu(x) = \sup_{y \in X} \mu(y)\}$
$= \{x \in X : \forall y \in X : \mu(y) \le \mu(x)\}.$

The core of a normal fuzzy set is called the kernel of the fuzzy set.
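To make these notions concrete, the following Python sketch evaluates α–cuts, height, support and core for a fuzzy set sampled on a finite universe. This is a simplification and an assumption of the sketch: on a finite universe the supremum is a maximum and the closure in the definition of the support is trivial, so the subtleties of the continuous case do not show up. The triangular set used here is purely illustrative.

```python
def alpha_cut(mu, universe, alpha, strong=False):
    """Crisp set of points whose membership is >= alpha (or > alpha if strong)."""
    if strong:
        return [x for x in universe if mu(x) > alpha]
    return [x for x in universe if mu(x) >= alpha]

def height(mu, universe):
    """Largest membership value attained on the (finite) universe."""
    return max(mu(x) for x in universe)

def support(mu, universe):
    """Points with strictly positive membership, i.e. the strong 0-cut."""
    return alpha_cut(mu, universe, 0.0, strong=True)

def core(mu, universe):
    """Points where the membership equals the height."""
    h = height(mu, universe)
    return [x for x in universe if mu(x) == h]

# Illustrative triangular fuzzy set on the universe {0, 1, ..., 10}, peak at 5.
universe = list(range(11))
mu = lambda x: max(0.0, 1.0 - abs(1.0 - x / 5.0))

half_cut = alpha_cut(mu, universe, 0.5)
is_normal = height(mu, universe) == 1.0
```

For this set, `half_cut` contains the points 3 to 7, the support contains the points 1 to 9, and the core is the single point 5.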

1.5.6 Example

Note that the core of a fuzzy set may be empty. For instance, consider the following
fuzzy set on X = [0, 1]:

$\mu(x) = \begin{cases} x & \text{if } x \in [0, 1[ \\ 0 & \text{if } x = 1 \end{cases}$

Then $h(\mu) = 1$, yet $\mathrm{core}(\mu) = \emptyset$. In order to guarantee that the core of the fuzzy set
is nonempty, one has to assume at least that there is a topological structure on X
and, for instance with the standard Euclidean topology on [0, 1], that the fuzzy
set is upper semi–continuous.

1.5.7 Definition (Width of a Fuzzy Set)

If (X, d) is a metric space, then the width of a fuzzy set µ ∈ F(X) is
defined as

$\mathrm{width}(\mu) = \sup_{x, y \in \mathrm{supp}\,\mu} d(x, y).$

1.5.8 Definition (Image and Preimage of a Fuzzy Set)

Let f : X −→ Y be a (crisp) function between two universes X and Y. For all
µ ∈ F(X), the image f(µ) ∈ F(Y) is defined by

$\forall y \in Y : [f(\mu)](y) := \sup\{\mu(x) : x \in f^{-1}(y)\}$

while the preimage of any fuzzy set ν ∈ F(Y) is defined by

$\forall x \in X : [f^{-1}(\nu)](x) := (\nu \circ f)(x)$

([91]).
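On finite universes, the image and preimage can be computed directly from these formulas. The following Python sketch represents fuzzy sets as dictionaries from points to membership degrees; this representation, and the convention that the supremum over an empty set equals 0, are assumptions of the sketch.

```python
def image(f, mu, X, Y):
    """[f(mu)](y) = sup{mu(x) : f(x) = y}; the sup over the empty set is 0."""
    return {y: max((mu[x] for x in X if f(x) == y), default=0.0) for y in Y}

def preimage(f, nu, X):
    """[f^{-1}(nu)](x) = nu(f(x))."""
    return {x: nu[f(x)] for x in X}

# Illustrative data: f(x) = x^2 maps X = {-2, ..., 2} into Y = {0, ..., 4}.
X = [-2, -1, 0, 1, 2]
Y = [0, 1, 2, 3, 4]
square = lambda x: x * x
mu = {-2: 0.1, -1: 0.5, 0: 1.0, 1: 0.7, 2: 0.3}

nu = image(square, mu, X, Y)
back = preimage(square, nu, X)
```

Here the points −1 and 1 are both mapped to 1, so the image takes the larger of their two membership degrees. Pulling the image back along f therefore yields a fuzzy set that is pointwise at least as large as the original one.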

2 Fuzzy Rule Bases

2.1 Linguistic Variables

2.1.1 Zadeh’s Definition

The term linguistic variable was used for the first time by L.A. Zadeh in [166]. In a
form that is not yet mathematically rigorous, and which we will refine further on, Zadeh
defined a linguistic variable as a quintuple

$X = (x, T(x), U, G, \tilde{M}).$

Here x is the name of the variable and T(x) the list of linguistic values the
variable is able to assume. To a mathematician, the idea of T(x) being a
multiple-valued function is of course uncomfortable. Each of these values is taken
as a representative for a fuzzy variable $\mu_{\text{name}} : U \to [0, 1]$ denoting the degree
to which an object u satisfies the linguistic variable X. U is then the set of
possible crisp outcomes of some measurement that determines the degree of
membership of the different linguistic values of T(x). For any y ∈ T(x), $\tilde{M}(y)$ is
then the graph of the fuzzy set $\mu_{T(x)}$, while G is a syntactic grammar rule that
associates the elements of T(x) with their meaning.

2.1.2 Example

Let x for instance be “body weight”; then T(x) might be

$T(x) = \{\mu_{\text{featherweight}}, \mu_{\text{lightweight}}, \mu_{\text{thin}}, \mu_{\text{average}}, \mu_{\text{fat}}, \mu_{\text{overweight}}, \mu_{\text{obese}}\},$

and if the universe U denotes the weight of a person in kilograms, for instance

$\mu_{\text{overweight}}(u) = \begin{cases} 1 - \left|1 - \frac{u - 80}{20}\right| & \text{if } u \in [80, 120] \\ 0 & \text{otherwise} \end{cases}$

In that case, quite some confusion arises between the function as an object and its
outcome values, and also with the terms that are generated by G. We would find for
instance

$G(T(x)) = \{\text{featherweight}, \text{lightweight}, \text{thin}, \text{average}, \text{fat}, \text{overweight}, \text{obese}\},$

although in the literature, these terms are often denoted as T(x) too. In order to avoid all
this confusion, we will take the liberty of denoting linguistic variables as well as
their possible outcome values by fuzzy sets µ : X −→ [0, 1] where X is the universe
of discourse, either equalling R or a subset thereof.

2.2 Linguistic Hedges

The only instance where the use of the “grammar” G may be useful, is when we try
to define new linguistic variables, starting from other existing linguistic variables.

2.2.1 Definition

A functional G which associates with any linguistic variable µ ∈ F(X) another
linguistic variable G(µ) ∈ F(X) will be called a linguistic modifier or a linguistic
hedge if and only if it is a pointwise operation ([91]).
This means that for any index set S,

$G : (F(X))^S \to F(X), \quad (\mu_s)_{s \in S} \mapsto G((\mu_s)_{s \in S})$

in such a way that for all x, y ∈ X and for all $(\mu_s)_{s \in S}$, $(\nu_s)_{s \in S}$ we have that

if $\forall s \in S : \mu_s(x) = \nu_s(y)$, then $G((\mu_s)_{s \in S})(x) = G((\nu_s)_{s \in S})(y)$.

In summary, the value of (G(µ))(x) depends only on the value of µ(x) and not
on the values µ(y) with y ≠ x ∈ X.

2.2.2 Examples

Let us now consider some examples of linguistic hedges:

1. Let

$\mu_{\text{old}}(x) := \begin{cases} \frac{x}{100} & \text{if } x \le 100 \\ 1 & \text{if } x \ge 100 \end{cases}$

be a linguistic variable on $\mathbb{R}^+$ denoting “a person aged x is old”. Then the
pointwise negation is given by (Figure 3)

$\mu_{\text{not old}}(x) := 1 - \mu_{\text{old}}(x)$

2. ([7]) Let $\mu_{\text{old}}(x)$ be defined as above; then we define two new linguistic variables
(Figure 4)

$\mu_{\text{very old}}(x) := (\mu_{\text{old}}(x))^2$

$\mu_{\text{fairly old}}(x) := (\mu_{\text{old}}(x))^{1/2}$

Fig. 3 Graph for the linguistic hedge “not”

Fig. 4 Graph for the linguistic hedges “very” and “fairly”

3. It is possible to apply the induction principle to the formalism defined by
Baldwin ([7]) to conceive notions such as the following:

$\mu_{\text{very very old}}(x) := (\mu_{\text{very old}}(x))^2 = (\mu_{\text{old}}(x))^4$
$\mu_{\text{extremely old}}(x) := (\mu_{\text{very old}}(x))^3$
$\mu_{\text{slightly old}}(x) := (\mu_{\text{very old}}(x))^{1/3}$

In fact, if we define $\mu_{p\text{–old}}(x)$ to be $(\mu_{\text{old}}(x))^p$, we can even consider the
linguistic variable

$\mu_{\text{absolutely old}}(x) := \begin{cases} 0 & \text{if } x < 100 \\ 1 & \text{if } x \ge 100 \end{cases} = \lim_{p \to +\infty} (\mu_{\text{old}}(x))^p$

On the other hand, taking $\lim_{p \to 0} (\mu_{\text{old}}(x))^p$ results in the linguistic variable

$\mu_0(x) := \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$

which represents undecidedness, since all ages, except for a negligible set, are
considered to be equally “old”.
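Since hedges are pointwise operations, they translate naturally into functions that take a membership function and return a new one. The following Python sketch implements the hedges “not”, “very” and “fairly” from the examples above; the helper names are chosen for illustration only.

```python
def mu_old(x):
    """Membership of age x (x >= 0) in "old": x/100 up to 100, then 1."""
    return min(x / 100.0, 1.0)

def hedge_not(mu):
    """Pointwise negation: 1 - mu(x)."""
    return lambda x: 1.0 - mu(x)

def hedge_power(mu, p):
    """Pointwise power: mu(x)^p; p > 1 concentrates, p < 1 dilates."""
    return lambda x: mu(x) ** p

def very(mu):
    return hedge_power(mu, 2)

def fairly(mu):
    return hedge_power(mu, 0.5)

mu_not_old = hedge_not(mu_old)
mu_very_old = very(mu_old)
mu_fairly_old = fairly(mu_old)
mu_very_very_old = very(very(mu_old))  # equals mu_old(x)^4
```

Because each hedge only looks at the value µ(x), hedges can be composed freely, as the last line shows for “very very old”.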

2.3 Fuzzy Rules

2.3.1 Definition (Antecedent Rule Base)

When designing any fuzzy controller, one starts by taking a finite collection of
rule antecedents, consisting of fuzzy variables, which we will denote by

$A = \{\alpha_i : X \to I\}_{i=1}^n$

Such a collection will be called an antecedent rule base.

• An antecedent rule base will be called disjoint if

$\forall i \neq j \in \{1, \ldots, n\} : \mathrm{supp}\,\alpha_i \cap \mathrm{supp}\,\alpha_j = \emptyset,$

or sometimes, when considering two adjacent rules,

$\forall i \neq j \in \{1, \ldots, n\} : \#(\mathrm{supp}\,\alpha_i \cap \mathrm{supp}\,\alpha_j) \le 1.$

If two rule antecedents $\alpha_i$ and $\alpha_j$ are not disjoint, they will be called overlapping.
• An antecedent rule base will be called a cover if and only if

$\forall x \in X, \exists i \in \{1, \ldots, n\} : \alpha_i(x) > 0.$

Moreover, it will be called a partition of unity if and only if

$\forall x \in X : \sum_{i=1}^n \alpha_i(x) = 1.$

The set of all such collections A of rule antecedents shall be denoted as P ∗ (F(X)),
being the collection of all finite subsets of F(X), the fuzzy sets on X. The conse-
quence functions can be considered as members of the same set.
Sooner or later, the designer will have to face the question of how to build the
terms of the fuzzy rule base. Two important questions should therefore be answered:
(1) how are the shapes of the fuzzy sets determined, and (2) how many sets are
necessary and sufficient? As for the first question, we will give an overview of the
most commonly used fuzzy sets in fuzzy control.

2.3.2 Example

As rules on R, the following functions are commonly used:

1. Triangular rules (Figure 5)

$\mu(x) = \begin{cases} \frac{2(x - a)}{b - a} & \text{if } x \in \left[a, \frac{a + b}{2}\right] \\ \frac{2(x - b)}{a - b} & \text{if } x \in \left[\frac{a + b}{2}, b\right] \\ 0 & \text{otherwise} \end{cases} = \left(1 - \left|1 - 2\,\frac{x - a}{b - a}\right|\right) \vee 0$

The core of the triangular rule therefore equals the singleton $\left\{\frac{a + b}{2}\right\}$, while its
support equals [a, b]. For example, µ(x) = “x is a good temperature for baking
fries” may be represented by (a, b) = (160, 200).

Fig. 5 Triangular fuzzy set

Fig. 6 Asymmetric triangular fuzzy set

Fig. 7 Trapezoidal fuzzy set

2. Asymmetric triangular rules (Figure 6)

$\mu(x) = \begin{cases} \frac{x - a}{b - a} & \text{if } x \in [a, b] \\ \frac{x - c}{b - c} & \text{if } x \in [b, c] \\ 0 & \text{otherwise} \end{cases}$

The core is still a singleton {b}, while the support equals [a, c]. For example
µ(x) = “x is a good temperature for swimming” may be represented by (a, b, c) =
(15, 40, 50).
3. Trapezoidal rules (Figure 7)

$\mu(x) = \begin{cases} \frac{x - a}{b - a} & \text{if } x \in [a, b] \\ 1 & \text{if } x \in [b, c] \\ \frac{x - d}{c - d} & \text{if } x \in [c, d] \\ 0 & \text{otherwise} \end{cases}$

This time, the core of the trapezoidal rule equals the crisp interval [b, c], while its
support equals [a, d]. For example µ(x) = “x is a good temperature for
gardening” may be represented by (a, b, c, d) = (10, 15, 20, 25).
These three examples have the disadvantage that they are not differentiable. In
some cases, for example when a smooth change in the controller function is
desired, we would prefer the use of $C^\infty$–functions. Therefore, some smooth
modifications of the basic piecewise linear rules mentioned above exist in the
literature. We will give a few examples:
Fig. 8 Gaussian fuzzy set

Fig. 9 A variation on the Gaussian set that does not make use of the exponential
function

4. Gaussian fuzzy sets (Figure 8)
The standard Gaussian curve centered around $x_0$ is given by the equation

$\mu(x) = e^{-\frac{(x - x_0)^2}{2\sigma^2}}$

where $x_0$ is called the mean and σ the standard deviation, a parameter that
determines the width of the fuzzy set. Normally, the support of µ equals the
(unbounded) set R; however, any restriction to a closed interval [a, b] ⊆ R may
also be considered. Note however that in this context, the Gaussian curve does
not have its traditional probabilistic meaning.
A variation on this definition that does not make use of the exponential function
is given by (Figure 9)

$\mu(x) = \frac{1}{1 + \left(\frac{x - x_0}{\sigma}\right)^2}$

where again, σ is a parameter that determines the width. The same remarks
regarding the support of this fuzzy set are valid.
Fig. 10 FL Smidth controllers (panels for a = 1, a = 2, a = 3 and a = 4)

5. FL Smidth controllers
A parameter family of fuzzy sets that is often used in fuzzy control is the
so-called FL Smidth controllers collection (Figure 10). It is given by

$\mu(x) = 1 - e^{-\left|\frac{\sigma}{x - x_0}\right|^a}$

in which the extra parameter a controls the gradient of the sloping sides.
Figure 10 shows examples of FL Smidth controllers for a ∈ {1, 2, 3, 4}.
Note however that these fuzzy sets are only differentiable in certain particular
cases (a = 2, a = 4, ...).
6. Cosine functions
Another way to generate a variety of membership functions is by using a
composition of a linear function and a cosine function. We define an s–curve as

$\mu_{s(a,b)}(x) = \begin{cases} 0 & \text{if } x < a \\ \frac{1}{2} + \frac{1}{2} \cos\left(\frac{x - b}{b - a}\,\pi\right) & \text{if } x \in [a, b] \\ 1 & \text{if } x > b \end{cases}$

where a, b ∈ X will be called the left breakpoint and right breakpoint respectively
(Figure 11).
A z–curve is then defined as a reflection of an s–curve: for the breakpoints
c, d ∈ X, we define (Figure 12)
Fig. 11 s-curve

Fig. 12 z-curve

Fig. 13 π–curve



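$\mu_{z(c,d)}(x) = \begin{cases} 1 & \text{if } x < c \\ \frac{1}{2} + \frac{1}{2} \cos\left(\frac{x - c}{d - c}\,\pi\right) & \text{if } x \in [c, d] \\ 0 & \text{if } x > d \end{cases}$

Finally, a π–curve can be implemented as a combination of an s–curve and a
z–curve. For any a < b < c < d ∈ X we define (Figure 13)

$\mu_{\pi(a,b,c,d)}(x) = \min\{\mu_{s(a,b)}(x), \mu_{z(c,d)}(x)\} = \begin{cases} 0 & \text{if } x < a \\ \frac{1}{2} + \frac{1}{2} \cos\left(\frac{x - b}{b - a}\,\pi\right) & \text{if } x \in [a, b] \\ 1 & \text{if } x \in [b, c] \\ \frac{1}{2} + \frac{1}{2} \cos\left(\frac{x - c}{d - c}\,\pi\right) & \text{if } x \in [c, d] \\ 0 & \text{if } x > d \end{cases}$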
7. LR–rules (Dubois/Prade [25])
The following family of fuzzy rules is suitable for differentiable as well as
non-differentiable functions. A shape function is a decreasing function
$S : \mathbb{R}^+ \to [0, 1]$ that satisfies the following three conditions:
a. S(0) = 1
b. ∀x > 0 : S(x) ∈ ]0, 1[
c. $\lim_{x \to +\infty} S(x) = 0$
Fig. 14 LR–fuzzy real number defined by a shape function (left shape L with
spread α, right shape R with spread β, peak at m)

Fig. 15 Grade of membership table

Examples of such shape functions include $S(x) = \frac{1}{1 + x^2}$, $S(x) = \frac{1}{1 + 2|x|}$ and
many more.
Given two shape functions L, R, an LR–fuzzy real number (Figure 14) is then, for
any m ∈ R, α, β > 0,

$(m, \alpha, \beta)_{LR}(x) = \begin{cases} L\left(\frac{m - x}{\alpha}\right) & \text{if } x \le m \\ R\left(\frac{x - m}{\beta}\right) & \text{if } x \ge m \end{cases}$

An LR–fuzzy real interval is then, for any $m_1 \le m_2$ ∈ R, α, β > 0,

$(m_1, m_2, \alpha, \beta)_{LR}(x) = \begin{cases} L\left(\frac{m_1 - x}{\alpha}\right) & \text{if } x \le m_1 \\ 1 & \text{if } x \in [m_1, m_2] \\ R\left(\frac{x - m_2}{\beta}\right) & \text{if } x \ge m_2 \end{cases}$

Both LR–fuzzy real numbers and fuzzy real intervals make particularly good
choices as fuzzy rules. The advantage is that, in the literature, many interesting
descriptions of the algebraic operations on such LR–fuzzy sets exist, such as

$(m, \alpha, \beta)_{LR} \oplus (n, \gamma, \delta)_{LR} = (m + n, \alpha + \gamma, \beta + \delta)_{LR}$

$-(m, \alpha, \beta)_{LR} = (-m, \beta, \alpha)_{LR}$

8. Grade of membership–rules (Figure 15)
In case the space X is finite, or can be subdivided into a finite number of classes,
it is also feasible to make a table in which each class is assigned a certain grade
of membership; the fuzzy set is then completely determined by this table.
For example µ(x) = “the degree to which it is pleasant to teach mathematical
exercise sessions to a group of x students” may be represented by the following
table, if we round off the number of students to the nearest multiple of 10, and
cap the number of students at 80:

x     10   20   30   40   50   60   70   80
µ(x)  0.4  0.6  0.8  1    0.9  0.6  0.4  0.1

In this case, X need not even be a set of numbers, but this approach falls outside
the scope of this article.
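Several of the membership shapes listed above are easy to evaluate numerically. The following Python sketch implements the triangular, trapezoidal, Gaussian and cosine-based (s, z and π) shapes. The function names are chosen for illustration, and the Gaussian is written with a negative exponent so that its values lie in [0, 1]; this sign is an assumption of the sketch.

```python
import math

def triangular(x, a, b):
    """Symmetric triangle on [a, b] with peak at (a + b) / 2."""
    return max(1.0 - abs(1.0 - 2.0 * (x - a) / (b - a)), 0.0)

def trapezoidal(x, a, b, c, d):
    """Rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if a <= x < b:
        return (x - a) / (b - a)
    if b <= x <= c:
        return 1.0
    if c < x <= d:
        return (x - d) / (c - d)
    return 0.0

def gaussian(x, x0, sigma):
    """Bell curve with value 1 at x0; sigma controls the width."""
    return math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

def s_curve(x, a, b):
    """Smooth rise from 0 at a to 1 at b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return 0.5 + 0.5 * math.cos((x - b) / (b - a) * math.pi)

def z_curve(x, c, d):
    """Smooth fall from 1 at c to 0 at d."""
    if x < c:
        return 1.0
    if x > d:
        return 0.0
    return 0.5 + 0.5 * math.cos((x - c) / (d - c) * math.pi)

def pi_curve(x, a, b, c, d):
    """Minimum of an s-curve on [a, b] and a z-curve on [c, d]."""
    return min(s_curve(x, a, b), z_curve(x, c, d))

# The examples from the text: frying temperature and gardening temperature.
frying = triangular(170.0, 160.0, 200.0)
gardening = trapezoidal(12.5, 10.0, 15.0, 20.0, 25.0)
```

Both example values equal 0.5, since 170 lies halfway up the rising side of the triangle and 12.5 halfway up the rising side of the trapezoid.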

2.3.3 Necessity and Sufficiency of a Fuzzy Rule Base

The question whether a fuzzy rule base contains the necessary and sufficient amount
of fuzzy sets is not so straightforward. Several considerations should be taken into
account ([69]):

• A term set should be sufficiently wide to allow for noise in the measurement.
• If there is a gap between two fuzzy sets in the fuzzy rule base, no rule will fire
for values in this gap. Hence a certain amount of overlap is desirable; otherwise
the controller may run into poorly defined states, where it does not return a
well–defined output.
• A good rule of thumb is that the overlap should be at least
50%. The widths of the fuzzy sets should initially be chosen so that each value
of the universe yields a nonzero value for at least two fuzzy sets in the fuzzy rule
base, except maybe for the elements at both extreme ends of the universe.

Hence the number of fuzzy sets required is invariably dependent on the width of
the fuzzy sets, and vice versa. This does not solve the question of which particular
shapes of curves should be used, though.

2.4 Linguistic Variables Revisited

2.4.1 Definition (Antecedent Rule Base)

Any antecedent rule base $A = \{\alpha_i : X \to I\}_{i=1}^n \in P^*(F(X))$ may be interpreted
as the possible (fuzzy) outcomes of a linguistic variable. Such a variable is used to
associate with a value x ∈ X the degree to which the different states of the process
are fulfilled.

2.4.2 Example

Consider the collection of fuzzy sets as given in Figure 16.

Fig. 16 Example of an antecedent rule base of linguistic variables (sets “very young”,
“young”, “middle-aged”, “old” and “very old” on the age axis from 0 to 80)

Let x ∈ X represent the age of a person; we then define five linguistic variables on
the space X, which denote the degree to which a person is “very young”, “young”,
“middle-aged”, “old” or “very old”. We could for instance take the following
antecedent rule base:

$\mu_{\text{very young}}(x) = \begin{cases} 1 - \frac{x}{20} & \text{if } x \in [0, 20] \\ 0 & \text{otherwise} \end{cases}$

$\mu_{\text{young}}(x) = \left(1 - \left|1 - \frac{x}{20}\right|\right) \vee 0$

$\mu_{\text{middle-aged}}(x) = \left(1 - \left|1 - \frac{x - 20}{20}\right|\right) \vee 0$

$\mu_{\text{old}}(x) = \left(1 - \left|1 - \frac{x - 40}{20}\right|\right) \vee 0$

$\mu_{\text{very old}}(x) = \begin{cases} 0 & \text{if } x \le 60 \\ \frac{x - 60}{20} & \text{if } x \in [60, 80] \\ 1 & \text{otherwise} \end{cases}$

For instance, if a person is 28 years old, then $\mu_{\text{young}}(28) = 0.6$ and
$\mu_{\text{middle-aged}}(28) = 0.4$, while the other three linguistic variables are zero. In case of
a partition of unity, we always have that $\sum_{\mu \in A} \mu(x) = 1$, as is the case here.
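The age example can be checked numerically. The following Python sketch encodes the five membership functions of the example and tests the cover and partition-of-unity properties on a grid of integer ages; the grid, the tolerance and the helper names are assumptions made for illustration.

```python
AGE_RULE_BASE = {
    "very young": lambda x: 1.0 - x / 20.0 if 0 <= x <= 20 else 0.0,
    "young": lambda x: max(1.0 - abs(1.0 - x / 20.0), 0.0),
    "middle-aged": lambda x: max(1.0 - abs(1.0 - (x - 20.0) / 20.0), 0.0),
    "old": lambda x: max(1.0 - abs(1.0 - (x - 40.0) / 20.0), 0.0),
    "very old": lambda x: 0.0 if x <= 60 else min((x - 60.0) / 20.0, 1.0),
}

def memberships(x, rule_base=AGE_RULE_BASE):
    """Degree to which x belongs to each fuzzy set of the rule base."""
    return {name: mu(x) for name, mu in rule_base.items()}

def is_cover(rule_base, universe):
    """Every sampled point has a positive degree in at least one set."""
    return all(any(mu(x) > 0 for mu in rule_base.values()) for x in universe)

def is_partition_of_unity(rule_base, universe, tol=1e-9):
    """The degrees sum to 1 (up to rounding) at every sampled point."""
    return all(
        abs(sum(mu(x) for mu in rule_base.values()) - 1.0) <= tol
        for x in universe
    )

ages = range(0, 101)
at_28 = memberships(28)
```

For a person aged 28, `at_28` reproduces the degrees 0.6 for “young” and 0.4 for “middle-aged” up to floating-point rounding, and both checks succeed on the sampled ages.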

3 The Design of a Fuzzy Controller

3.1 Choice of Rules

3.1.1 Definition (Fuzzy Controller)

A fuzzy controller is a finite set of rules

k : IF ($X_1 = A_{k1}$) and ($X_2 = A_{k2}$) and ... and ($X_n = A_{kn}$) THEN ($Y = B_k$)

with k ∈ {1, ..., K} and where $\{A_{ki} : k \in \{1, \ldots, K\}, i \in \{1, \ldots, n\}\}$ and
$\{B_k : k \in \{1, \ldots, K\}\}$ are sets of linguistic values for the linguistic variables
$X_1, X_2, \ldots, X_n$, which we will call the antecedents, and Y, which we will call the
consequence. Basically, we would not want these rules to contradict each other, so any
set of inputs $(A_1, A_2, \ldots, A_n)$ should yield only one output B. Furthermore, we will
call the fuzzy controller complete if every possible combination of antecedents occurs
exactly once in the rule base. In such a case, it is easy to see that K, the number of
rules in the base, equals the product of the cardinalities of the sets of possible
linguistic values of the antecedents.
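Completeness in the sense that every combination of antecedent values occurs exactly once can be checked mechanically. The following Python sketch represents a rule as a pair (tuple of antecedent values, consequence value); this representation and the three-term example are assumptions made for illustration.

```python
from itertools import product

def is_complete(rules, term_sets):
    """True if every combination of antecedent values occurs exactly once."""
    combos = list(product(*term_sets))
    antecedents = [tuple(a) for a, _ in rules]
    return len(antecedents) == len(combos) and set(antecedents) == set(combos)

def expected_rule_count(term_sets):
    """Number of distinct antecedent combinations."""
    count = 1
    for terms in term_sets:
        count *= len(terms)
    return count

# Hypothetical rule base with two antecedents, each taking three values.
terms = ["N", "Z", "P"]
rules = [((e, de), "Z") for e in terms for de in terms]

complete = is_complete(rules, [terms, terms])
seven_terms = ["NB", "NM", "NS", "ZE", "PS", "PM", "PB"]
n_rules = expected_rule_count([seven_terms, seven_terms])
```

With two antecedents of seven linguistic values each, a complete rule base contains 49 rules, one for each combination of antecedent values.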

3.1.2 Rule Design

The key to the design of a fuzzy controller is a suitable choice of rules. When
designing an automatic steering system for their model car, M. Sugeno and M. Nishida
suggested in [142] that the main element is the translation of the operator’s
experience and knowledge about the control actions into a fuzzy model. The design of
a fuzzy controller and the speed of development may be greatly improved, though,
by applying purely heuristic design rules and by fine-tuning the
model through on–line adaptation of the rules (see Section 8). Combinations with
other techniques have also turned out to be fruitful, both crisp ones, such as PID
controllers, and self–learning ones, such as neural networks and genetic algorithms
(see Section 10).

3.1.3 Example

Suppose that we want to control a variable that has as desired value $\bar{x}_t$ ∈ R, and
suppose that we are able to measure the outcome $x_t$ at certain discrete time steps
t ∈ N. Then the error $e_t$ is given by the difference between $x_t$ and $\bar{x}_t$, and usually
the change of error $\Delta e_t := e_t - e_{t-1}$ is also taken into account ([16]). Given that
neither $e_t$ nor $\Delta e_t$ leaves a certain interval, which through scaling can always be
taken to be [−1, 1], a very commonly used set of rules is given by Figure 17, where NB
means “negative big”, NM means “negative medium”, NS means “negative small”, ZE
means “almost zero”, PS means “positive small”, PM means “positive medium”
and PB means “positive big”. Each of the control variables, as well as the output
variable, is then modelled in a similar way, up to different scaling factors.

Fig. 17 Commonly used rule base for error (sets NB, NM, NS, ZE, PS, PM and PB on
the interval from −1 to +1)

Some improvements one could apply to refine the controller include, but are not
limited to:

• Increasing the number of control variables


• Not restricting oneself to antecedent rule bases that are partitions
of unity. The rule base should however always be a cover of X.
• Discretizing the fuzzy sets; instead of functions, one can then for instance
consider tables as follows:

x     −1    −0.8  −0.6  −0.4  −0.2  0     +0.2  +0.4  +0.6  +0.8  +1
µNB   1     0.8   0.4   0.2   0     0     0     0     0     0     0
µNM   0.4   0.8   1     0.8   0.4   0.2   0     0     0     0     0
µNS   0     0.2   0.4   0.8   1     0.8   0.4   0.2   0     0     0
µZE   0     0     0.2   0.4   0.8   1     0.8   0.4   0.2   0     0
µPS   0     0     0     0.2   0.4   0.8   1     0.8   0.4   0.2   0
µPM   0     0     0     0     0     0.2   0.4   0.8   1     0.8   0.4
µPB   0     0     0     0     0     0     0     0.2   0.4   0.8   1

Note the arbitrariness with which the values in the table are chosen; the reason
why table lookups are preferred over the calculation of antecedent rule base
values is that they speed up the process considerably.
• If one wants to achieve a greater precision around the stable zero situation, one
could consider taking fuzzy linguistic variables with a different width (Figure 18).
The same goal is achieved by applying a logarithmic transformation to the dis-
cretized input values. Instead of considering the values

(−1, −0.8, −0.6, −0.4, −0.2, 0, +0.2, +0.4, +0.6, +0.8, +1)

on could take for instance



log(α +1) (|α x + 1|) if x ≥ 0
f (x) =
− log(α +1) (|α x + 1|) if x ≤ 0

µ NB µ NM µ NS µ ZE µ PS µ PM µ PB
1

Fig. 18 Base sets with differ- X


–1 0 +1
ent width
An Overview of Fuzzy Control Theory 23

with α > 0 an adjustable parameter that characterizes the relative deformation
around the origin. The transformed values f(x) for some choices of α are given in
the table below

α \ x   −1    −0.8    −0.6    −0.4    −0.2    0    +0.2    +0.4    +0.6    +0.8    +1
0.1     −1    −0.81   −0.61   −0.41   −0.21   0    +0.21   +0.41   +0.61   +0.81   +1
0.5     −1    −0.82   −0.65   −0.45   −0.24   0    +0.24   +0.45   +0.65   +0.82   +1
1       −1    −0.85   −0.68   −0.49   −0.26   0    +0.26   +0.49   +0.68   +0.85   +1
10      −1    −0.92   −0.81   −0.67   −0.46   0    +0.46   +0.67   +0.81   +0.92   +1
50      −1    −0.94   −0.87   −0.77   −0.61   0    +0.61   +0.77   +0.87   +0.94   +1

The reverse transformation, which can be used on the output value, is then given
by
\[
g(y) =
\begin{cases}
\dfrac{(\alpha+1)^{y} - 1}{\alpha} & \text{if } y \ge 0 \\
-\dfrac{(\alpha+1)^{-y} - 1}{\alpha} & \text{if } y \le 0
\end{cases}
\]
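The logarithmic rescaling and its reverse can be sketched in a few lines of Python.
This is only an illustration: it assumes the odd reading
f(x) = sign(x) · log_{α+1}(α|x| + 1), which is the one that reproduces the tabulated
values, and the function names `log_scale` and `inverse_log_scale` are hypothetical.

```python
import math


def log_scale(x: float, alpha: float) -> float:
    """Map x in [-1, 1] to sign(x) * log_{alpha+1}(alpha*|x| + 1)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return math.copysign(math.log(alpha * abs(x) + 1.0, alpha + 1.0), x)


def inverse_log_scale(y: float, alpha: float) -> float:
    """Reverse map: sign(y) * ((alpha+1)**|y| - 1) / alpha."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return math.copysign(((alpha + 1.0) ** abs(y) - 1.0) / alpha, y)


if __name__ == "__main__":
    # Print one row of transformed grid values per choice of alpha.
    for alpha in (0.1, 1.0, 10.0):
        print(alpha, [round(log_scale(k / 5, alpha), 2) for k in range(-5, 6)])
```

A larger α stretches the neighbourhood of the origin more strongly, so a uniform
lookup grid in the transformed variable is finer around zero in the original one.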
The rule base can then be written as statements using the linguistic variables,
which makes them easy to read and interpret. For instance, feasible heuristic rules
would then be

IF (E is NB) and (∆E is NB) THEN (U is PB)


IF (E is NM) and (∆E is NB) THEN (U is PM)
IF (E is NS) and (∆E is NB) THEN (U is PM)
IF (E is PM) and (∆E is NM) THEN (U is ZE)
etc.

where E = e_t and ∆E = e_t − e_{t−1}, and U is the control output. Any time one of these
rules is used, we say that the rule fires. For instance, if the error is positive medium,
but the change in error is negative medium, this means that the positive error
tends to decrease, and therefore it is reasonable to believe that taking no action at
all will stabilize the controller. If this is not the case, another applicable rule will
fire. The main work in designing a fuzzy controller is adjusting the parameters, the
number of rules and the fuzzy rule base in such a manner that the system converges
as quickly as possible to a stable situation (see Section 9). Just as in expert control
systems, this may trigger phenomena such as overshoot and some related problems.
For further information, we refer to works such as [63, 84, 145] and [165].

3.1.4 Remark

One final remark about the design of a fuzzy rule base is that instead of a required
correction action U to be taken, it is also possible to define a performance measure P
that indicates how well the controller behaves, e.g. by comparing the output results
to a given desired output. For this issue, see also subsection 8.1. For instance, rules
like

...
n : IF (E is PB) and (∆E is PM) THEN (U is NB)
n + 1 : IF (E is PM) and (∆E is NM) THEN (U is ZE)
...

will then be rewritten as

...
n : IF (E is PB) and (∆E is PM) THEN (P is small)
n + 1 : IF (E is PM) and (∆E is NM) THEN (P is large)
...

The reverse, where the rules that yield a performance measure are translated into
a set of possible correction actions, is also possible, of course, although the (poor)
quality of performance does not indicate in which direction action should be taken.
For now, this however remains a heuristic approach to the design of the fuzzy rule
base.

3.2 Design Parameters

When designing a fuzzy controller, there are numerous adjustable parameters,
such as the number of controllers, nominal (in the output) and ordinal (in the
input) scaling parameters, different inference methods, which will be discussed in
Section 4, and a suitable choice of defuzzification parameters, as will be discussed
in Section 5. For now, however, we will focus on some important parameters that
are considered in the design of the fuzzy rule base.

3.2.1 Definition (Affine Transformation)

Note first of all that if the universe X is, or can be embedded in, a bounded and
closed subset of R, it is always possible to consider a fuzzy controller on the same
base space, by using ordinal scaling parameters. Let X ⊆ [a, b], then consider the
following affine transformation:

\[
T_{[a,b]} : [a, b] \longrightarrow [-1, 1], \qquad x \longmapsto 2\,\frac{x - a}{b - a} - 1
\]
By means of the composition map
\[
T_{[c,d]}^{-1} \circ T_{[a,b]},
\]

it is possible to map any X ⊆ [a, b] into a subset of any other interval [c, d]. Therefore,
any set of linguistic variables can be rescaled to the same domain. This permits, for
instance, the use of the same fuzzy rule bases on the domain of possible errors and
possible error gains. It may therefore be sufficient to study the behaviour of
fuzzy rule bases on, e.g., [−1,1], except, of course, when the domain
is unbounded.
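As a minimal sketch of this rescaling, the affine map onto [−1, 1] and the
composition with the inverse map for a second interval [c, d] can be written as
follows. The function names `to_unit`, `from_unit` and `rescale` are chosen here
for illustration only.

```python
def to_unit(x: float, a: float, b: float) -> float:
    """Affine map from [a, b] onto [-1, 1]: x -> 2 (x - a) / (b - a) - 1."""
    if a >= b:
        raise ValueError("need a < b")
    return 2.0 * (x - a) / (b - a) - 1.0


def from_unit(u: float, c: float, d: float) -> float:
    """Inverse of to_unit for the interval [c, d]."""
    if c >= d:
        raise ValueError("need c < d")
    return c + (u + 1.0) * (d - c) / 2.0


def rescale(x: float, a: float, b: float, c: float, d: float) -> float:
    """Composition of the inverse map for [c, d] with the map for [a, b]."""
    return from_unit(to_unit(x, a, b), c, d)
```

With such a helper, errors measured on one interval and error changes measured on
another can both be brought to [−1, 1] before the same rule base is applied.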
Let us now assume that we only consider triangular membership functions, which
are computationally the simplest objects one can consider. Furthermore, let us
assume that all membership functions are normalized; if this is not the case, one can
also apply a scaling function on the ordinal scale.

3.2.2 Definition (Left and Right Width)

For any triangular membership function µ with peak in a ∈ X, we define the left
width as |a − b| where

b = sup{x ∈ X : x < a and µ (x) = 0}

and the right width as |c − a| where

c = inf{x ∈ X : x > a and µ (x) = 0}

A fuzzy membership function µ will then be called symmetric if and only if left
width (µ ) and right width (µ ) are equal. Symmetry is necessary to obtain the follow-
ing property: suppose a fuzzy controller consists of only a single rule and a single
input, with a one-term triangular linguistic variable as consequence. Using Mamdani
inference (see Section 4) and Center-of-Gravity defuzzification (see Section 5), one
would expect that if the input equals the peak of the antecedent rule, the defuzzifi-
cation value would also be the peak of the rule consequence. This is however not
true if the latter is not symmetric, see for instance Figure 19.
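The effect of a nonsymmetric consequence can be checked numerically. The sketch
below makes two assumptions that are not spelled out at this point of the text:
Mamdani inference clips the consequence at the firing level (here 1, since the
input equals the antecedent peak), and the center of gravity is the ratio of
∫ y µ(y) dy to ∫ µ(y) dy, approximated by a midpoint sum. All names are
illustrative.

```python
def triangle(b: float, a: float, c: float):
    """Triangular membership function with support [b, c] and peak at a."""
    def mu(y: float) -> float:
        if b < y <= a:
            return (y - b) / (a - b)
        if a < y < c:
            return (c - y) / (c - a)
        return 0.0
    return mu


def clip(mu, level: float):
    """Mamdani-style clipping of a membership function at a firing level."""
    return lambda y: min(level, mu(y))


def cog(mu, lo: float, hi: float, steps: int = 20000) -> float:
    """Center of gravity of mu on [lo, hi], approximated by a midpoint sum."""
    h = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        m = mu(y)
        num += y * m
        den += m
    return num / den


if __name__ == "__main__":
    symmetric = clip(triangle(-1.0, 0.0, 1.0), 1.0)
    skewed = clip(triangle(0.0, 1.0, 4.0), 1.0)
    print(cog(symmetric, -1.0, 1.0))  # close to the peak 0
    print(cog(skewed, 0.0, 4.0))      # close to (0 + 1 + 4) / 3, not the peak 1
```

For a full triangle the center of gravity is the mean of its three corner
abscissae, which coincides with the peak only when left and right width are equal.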

3.2.3 Definition (Condition Width)

If, for any two overlapping triangular membership functions in an
antecedent rule base, the left width of the right membership function, the right width
of the left membership function and the distance between the two peaks are equal,
we say that the condition width is fulfilled.

Fig. 19 COG-defuzzification in the nonsymmetric case

Fig. 20 Condition width

D. Driankov, H. Hellendoorn and M. Reinfrank showed in [24] that the condition
width is a sufficient condition on an antecedent rule base for a smooth change
of the control values with respect to a change in the value x ∈ X. Note that the
condition width does not necessarily imply symmetry, as can be seen in Figure 20.
Of course, a combination of symmetry and condition width yields the best results.
Another parametric concept introduced by D. Driankov, H. Hellendoorn and
M. Reinfrank in [24] is the following.

3.2.4 Definition (Cross Point Ratio)

For any two overlapping triangular membership functions µ_a and µ_b with peaks in
a, b ∈ X respectively, we define the cross point ratio as the number of elements
in the set
{x ∈ X : µ_a(x) = µ_b(x)}
The value µ_a(x) = µ_b(x) will be called the cross point level.
This set may contain more than one element. If it is a singleton, however,
the cross point level is the single value µ_a(x) = µ_b(x). Often,
it is assumed that the cross point ratio is equal to one and that the cross point level
is 0.5.
Combining the width and cross point conditions yields the best results
in terms of smoothness. This explains at once why partitions of unity are often used
as fuzzy antecedent rule bases.

4 Aggregation and Implication Operators

In order to combine several fuzzy sets into statements that can be regarded
as the rules of the fuzzy controller, one needs unary and binary operations
similar to those used in classical logic, which produce new statements by
combining one or more “atomic” statements. The five “classical” operations in logic
are: negation, conjunction, disjunction, implication and equivalence. The binary
operations conjunction (“AND”) and disjunction (“OR”) also have to be extendable to
an arbitrary yet finite number of arguments in an associative and commutative way,
i.e. such that neither the order of the statements nor the order in which they are
parsed matters. To this end, the following binary operators play an important role
in fuzzy control theory:

4.1 t–norms and t–conorms

4.1.1 Definition (t–norm)

A function T : I ×I −→ I will be called a t–norm if and only if it fulfills the following


properties:

• T is increasing in both arguments, i.e. ∀x1 , y1 , x2 , y2 ∈ I :

x1 ≤ x2 and y1 ≤ y2 ⇒ T (x1 , y1 ) ≤ T (x2 , y2 )

• T is commutative, i.e. ∀x, y ∈ I : T (x, y) = T (y, x)


• T is associative, i.e. ∀x, y, z ∈ I : T (x, T (y, z)) = T (T (x, y), z)
• 1 is the unit element for T , i.e. ∀x ∈ I : T (x, 1) = T (1, x) = x

4.1.2 Example

The following functions are t–norms:


1. T (x, y) := min(x, y)
2. T (x, y) := xy
3. T (x, y) := max(0, x + y − 1)

4.1.3 Definition (t–conorm)

A function S : I × I −→ I will be called a t–conorm if and only if it fulfills the


following properties:
• S is increasing in both arguments, i.e. ∀x1 , y1 , x2 , y2 ∈ I :

x1 ≤ x2 and y1 ≤ y2 ⇒ S(x1 , y1 ) ≤ S(x2 , y2 )

• S is commutative, i.e. ∀x, y ∈ I : S(x, y) = S(y, x)


• S is associative, i.e. ∀x, y, z ∈ I : S(x, S(y, z)) = S(S(x, y), z)
• 0 is the unit element for S, i.e. ∀x ∈ I : S(x, 0) = S(0, x) = x

4.1.4 Example

The following functions are t–conorms:


1. S(x, y) := max(x, y)
2. S(x, y) := x + y − xy
3. S(x, y) := min(1, x + y)
t–norms and t–conorms play an important part in the development of fuzzy set
theory, and act as generalizations of the conjunction and disjunction operators,
respectively, in fuzzy logic. For more information, we refer to [91].
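The three example t–norms and t–conorms can be checked against the four defining
properties on a finite grid. The following sketch does not prove anything; it only
tests the axioms at the sampled points, with a small tolerance for rounding. The
helper `satisfies_axioms` and the grid are assumptions made for this illustration.

```python
from itertools import product

GRID = [i / 10 for i in range(11)]


def t_min(x, y):
    return min(x, y)


def t_prod(x, y):
    return x * y


def t_luk(x, y):
    return max(0.0, x + y - 1.0)


def s_max(x, y):
    return max(x, y)


def s_prob(x, y):
    return x + y - x * y


def s_luk(x, y):
    return min(1.0, x + y)


def satisfies_axioms(op, unit: float, tol: float = 1e-9) -> bool:
    """Check commutativity, associativity, monotonicity and the unit on GRID."""
    for x, y, z in product(GRID, repeat=3):
        if abs(op(x, y) - op(y, x)) > tol:
            return False
        if abs(op(x, op(y, z)) - op(op(x, y), z)) > tol:
            return False
        # Monotone in the second argument; commutativity covers the first.
        if y <= z and op(x, y) > op(x, z) + tol:
            return False
    return all(abs(op(x, unit) - x) <= tol for x in GRID)


if __name__ == "__main__":
    for name, op in (("min", t_min), ("product", t_prod), ("bounded", t_luk)):
        print("t-norm", name, satisfies_axioms(op, unit=1.0))
    for name, op in (("max", s_max), ("prob. sum", s_prob), ("bounded", s_luk)):
        print("t-conorm", name, satisfies_axioms(op, unit=0.0))
```

An operator such as the arithmetic mean fails the check because 1 is not its unit
element, which is why it is not a t–norm.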

4.2 Extension of Logic

4.2.1 Definition (Fuzzy Rule Base)

Following E.H. Mamdani et al. in [94], assume that each rule is of the type

r : IF (X1 = A1 ) and ... and (Xn = An ) THEN (Y = B),

where Ai is the value of the linguistic variable i corresponding to the antecedent
membership function αi , and B is the value of the linguistic variable corresponding
to the consequence membership function β . Denote the set of all the applicable rules
as K. The design of a fuzzy controller is invariably linked to a suitable choice of
the following three entities:

• A conjunction (combining the rule antecedents of a single rule), say
\[
k_r(x) := \bigwedge_{i=1}^{n} \alpha_i(x_i)
\]
for each of the input vectors x = (x1 , ..., xn ),
• An implication
\[
\beta_r(x, y) = (k_r(x) \Rightarrow \beta(y)),
\]
indicating to which degree the antecedents of the r–th rule imply the
consequence, and
• A disjunction, combining the different rules into one fuzzy relation
\[
\rho(x, y) = \beta_x(y) := \bigvee_{r \in K} (k_r(x) \Rightarrow \beta(y))
\]

A list of such fuzzy variables and their operators will be called a fuzzy rule base.
We will now illustrate the need to consider various conjunction and disjunction
operators. Although the conjunction and disjunction are denoted as ∧ and ∨
respectively, and although the minimum and maximum operators could fulfill the
necessary conditions, many more choices are possible, and equally so for the
implication. A major reason, for instance, to choose the product as a conjunction
over the minimum is that, given an adaptive control situation in which we try to
improve the worst performing rule, the minimum will only select the worst condition
for each of the rules separately as a criterion for selection, while the product
combines all the conditions in the same rule. If for instance three conditions
µ1 , µ2 , µ3 have the values (0.8, 0.9, 0.1) for a first input value and
(0.3, 0.4, 0.2) for a second input value, the minimum will regard the first one as
the worst (0.1 < 0.2), while the product will consider the second, with value 0.024,
as three times worse than the first, with value 0.072.
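The numbers in this comparison can be reproduced directly. The snippet below is a
small illustration; the names `firing_min` and `firing_product` are not taken from
the text.

```python
import math

first = (0.8, 0.9, 0.1)   # condition values for the first input
second = (0.3, 0.4, 0.2)  # condition values for the second input


def firing_min(values) -> float:
    """Conjunction by the minimum t-norm."""
    return min(values)


def firing_product(values) -> float:
    """Conjunction by the product t-norm."""
    return math.prod(values)


if __name__ == "__main__":
    print(firing_min(first), firing_min(second))
    print(firing_product(first), firing_product(second))
```

The two conjunctions therefore rank the same pair of inputs in opposite order,
which matters whenever the worst performing rule is selected for adaptation.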
In this section, we are going to give an overview of the different properties that
conjunction, disjunction and implication should fulfill under ideal circumstances,
as well as a list of commonly used operators. The quality of the choice of logical
operators can then be judged by the number of properties that are fulfilled.

4.2.2 Pointwise Extension Property

For our purposes, let us call the operators ∧, ∨ and ⇒ respectively a conjunction,
a disjunction and an implication. All three operators then should be considered as
pointwise extensions of the similar maps on I:
\[
\ast : F(X) \times F(X) \longrightarrow F(X), \qquad
(\mu, \nu) \longmapsto \mu \ast \nu, \qquad
(\mu \ast \nu)(x) := \mu(x) \ast \nu(x) \quad (x \in X)
\]
with ∗ ∈ {∧, ∨, ⇒}. It therefore is sufficient to study the behavior of the operators
∧, ∨ and ⇒ on I only. Therefore, it is also natural to assume that a (pointwise)
pseudocomplementation
\[
\sim : I \longrightarrow I, \qquad x \longmapsto 1 - x
\]
exists, which can be used to formulate the different logical axioms (see also [21]).
The first property that these three operators should fulfill (although they do not
always do!) is that they should be an extension of the classical two-valued logical
operators. We formulate this property as follows:

4.2.3 Claim I

Any conjunction ∧, disjunction ∨ and implication ⇒ should be a generalization of
the corresponding two-valued classical logic operator. Hence the following table
should hold:

a b a∧b a∨b a⇒b


0 0 0 0 1
0 1 0 1 1
1 0 0 1 0
1 1 1 1 1

4.3 Conjunction and Disjunction Operators

Let us now consider the properties that any conjunction ∧ and any disjunction ∨
should fulfill. There is general agreement that the four most important conditions
these should satisfy are the following:

4.3.1 Claim II

1. ∧ and ∨ should be monotonous:

∀a, b, c, d ∈ I : a ≤ b and c ≤ d ⇒ a ∧ c ≤ b ∧ d

and
∀a, b, c, d ∈ I : a ≤ b and c ≤ d ⇒ a ∨ c ≤ b ∨ d

2. ∧ and ∨ should be associative:


∀a, b, c ∈ I : a ∧ (b ∧ c) = (a ∧ b) ∧ c
and
∀a, b, c ∈ I : a ∨ (b ∨ c) = (a ∨ b) ∨ c

3. ∧ and ∨ should be commutative:

∀a, b ∈ I : a ∧ b = b ∧ a

and
∀a, b ∈ I : a ∨ b = b ∨ a

4. 1 should be the neutral element for ∧ and 0 should be the neutral element for ∨:

∀a ∈ I : a ∧ 1 = 1 ∧ a = a

and
∀a ∈ I : a ∨ 0 = 0 ∨ a = a

Comparing with the definitions in 4.1, one immediately finds that the suitable
conjunctions and disjunctions are, by definition, exactly the t–norms and
t–conorms, respectively. These are each other’s logical dual, as shown in [2], in
the following sense:

4.3.2 Proposition

For any t–norm T : I × I −→ I, the function

S(x, y) = 1 − T (1 − x, 1 − y)

is a t–conorm and vice versa.
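The duality can be illustrated numerically. The sketch below builds the dual of an
operator with respect to the complement x ↦ 1 − x and compares it with the
expected t–conorm on a grid; `dual` and `same_on_grid` are hypothetical helper
names.

```python
GRID = [i / 10 for i in range(11)]


def dual(op):
    """Dual of a binary operation on [0, 1] with respect to x -> 1 - x."""
    return lambda x, y: 1.0 - op(1.0 - x, 1.0 - y)


def t_min(x, y):
    return min(x, y)


def t_prod(x, y):
    return x * y


def t_luk(x, y):
    return max(0.0, x + y - 1.0)


def same_on_grid(f, g, tol: float = 1e-12) -> bool:
    """True if f and g agree, up to rounding, at every grid point."""
    return all(abs(f(x, y) - g(x, y)) <= tol for x in GRID for y in GRID)


if __name__ == "__main__":
    print(same_on_grid(dual(t_min), max))
    print(same_on_grid(dual(t_prod), lambda x, y: x + y - x * y))
    print(same_on_grid(dual(t_luk), lambda x, y: min(1.0, x + y)))
```

Applying `dual` twice returns the original operator, which is the "vice versa"
part of the statement.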



One thus obtains an extension of De Morgan’s laws of logic. It can furthermore


be proved ([10]) that for any associated pair of t–norm and t–conorm, De Morgan’s
laws still hold for other well–chosen pseudocomplementation operators.
Many of these t–norms and t–conorms are created by so-called additive and mul-
tiplicative generators.

4.3.3 Definition (Pseudo-inverse)

Let f : K −→ J be a continuous, strictly increasing function with K and J possibly
unbounded subintervals of [0, ∞]. Then we define the pseudo-inverse of f as
\[
f^{(-1)} : J \longrightarrow K : y \longmapsto
\begin{cases}
\min K & \text{if } y \le f(\min K) \\
f^{-1}(y) & \text{if } y \in [f(\min K), f(\max K)] \\
\max K & \text{if } y \ge f(\max K)
\end{cases}
\]
In case f is strictly decreasing, we change all min to max and vice versa in the
definition above.

4.3.4 Definition (Additive Generator)

A continuous, strictly decreasing function f : I −→ [0, ∞] with f (1) = 0 is called an
additive generator for the t–norm T if and only if
\[
\forall x, y \in I : T(x, y) = f^{(-1)}(f(x) + f(y))
\]
and a strictly increasing function f : I −→ [0, ∞] with f (0) = 0 is called an additive
generator for the t–conorm S if and only if
\[
\forall x, y \in I : S(x, y) = f^{(-1)}(f(x) + f(y))
\]
There is also a multiplicative modification of this definition.

4.3.5 Definition (Multiplicative Generator)

A continuous, strictly increasing function f : I −→ I with f (1) = 1 is called a
multiplicative generator for the t–norm T if and only if
\[
\forall x, y \in I : T(x, y) = f^{(-1)}(f(x) f(y))
\]
and a strictly decreasing function f : I −→ I with f (1) = 0 is called a multiplicative
generator for the t–conorm S if and only if
\[
\forall x, y \in I : S(x, y) = f^{(-1)}(f(x) f(y))
\]



The additive generators generate a lot of commonly used t–norms and t–conorms.
For more information, we refer to [87].
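Two standard instances may help to see how an additive generator works. In the
sketch below, f(x) = −ln x (with f(0) = ∞) generates the algebraic product and
f(x) = 1 − x generates the bounded difference max(0, x + y − 1). The pseudo-inverses
are written out by hand, and all function names are chosen for illustration.

```python
import math


def tnorm_from_additive_generator(f, f_pseudo_inverse):
    """Build T(x, y) = f^(-1)(f(x) + f(y)) from a generator and its pseudo-inverse."""
    return lambda x, y: f_pseudo_inverse(f(x) + f(y))


def neg_log(x: float) -> float:
    """Generator f(x) = -ln(x), extended by f(0) = infinity."""
    return math.inf if x == 0.0 else -math.log(x)


def neg_log_pseudo_inverse(u: float) -> float:
    """Pseudo-inverse of -ln: exp(-u), which is 0 at u = infinity."""
    return math.exp(-u)


def one_minus(x: float) -> float:
    """Generator f(x) = 1 - x, with f(0) = 1."""
    return 1.0 - x


def one_minus_pseudo_inverse(u: float) -> float:
    """Pseudo-inverse of 1 - x: values above f(0) = 1 are sent to 0."""
    return max(0.0, 1.0 - u)


t_product = tnorm_from_additive_generator(neg_log, neg_log_pseudo_inverse)
t_bounded = tnorm_from_additive_generator(one_minus, one_minus_pseudo_inverse)

if __name__ == "__main__":
    print(t_product(0.5, 0.4))
    print(t_bounded(0.8, 0.7), t_bounded(0.5, 0.4))
```

The clipping in the second pseudo-inverse is what turns plain addition into the
bounded difference; for the first generator no clipping is needed because its
range is all of [0, ∞].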
We will now give an overview of the most commonly used conjunction and dis-
junction operators ([10] and [101]). Some parametric families of such operators can
be found in detail in [91, 101, 157] and [168]. In what follows, let a, b ∈ I.

4.3.6 Examples

1. The drastic product and drastic sum are defined as
\[
t_W(a, b) =
\begin{cases}
\min(a, b) & \text{if } \max(a, b) = 1 \\
0 & \text{otherwise}
\end{cases}
\qquad \text{and} \qquad
s_W(a, b) =
\begin{cases}
\max(a, b) & \text{if } \min(a, b) = 0 \\
1 & \text{otherwise}
\end{cases}
\]
2. The bounded difference and bounded sum are defined as
\[
t_1(a, b) = \max(0, a + b - 1) \qquad \text{and} \qquad s_1(a, b) = \min(1, a + b)
\]
3. The Einstein product and Einstein sum are defined as
\[
t_{1\frac{1}{2}}(a, b) = \frac{ab}{2 - (a + b - ab)} \qquad \text{and} \qquad
s_{1\frac{1}{2}}(a, b) = \frac{a + b}{1 + ab}
\]
4. The algebraic product and algebraic sum are defined as
\[
t_2(a, b) = ab \qquad \text{and} \qquad s_2(a, b) = a + b - ab
\]
5. The Hamacher product and Hamacher sum are defined as
\[
t_{2\frac{1}{2}}(a, b) = \frac{ab}{a + b - ab} \qquad \text{and} \qquad
s_{2\frac{1}{2}}(a, b) = \frac{a + b - 2ab}{1 - ab}
\]
6. The minimum and maximum are defined as
\[
t_\infty(a, b) = \min(a, b) \qquad \text{and} \qquad s_\infty(a, b) = \max(a, b)
\]

4.3.7 Proposition

The following pointwise inequalities hold:

1. For any t–norm T , the inequality
\[
t_W \le T \le t_\infty
\]
holds, and equally for any t–conorm S, the inequality
\[
s_\infty \le S \le s_W
\]
holds.
2. \( t_1 \le t_{1\frac{1}{2}} \le t_2 \le t_{2\frac{1}{2}} \) and
\( s_{2\frac{1}{2}} \le s_2 \le s_{1\frac{1}{2}} \le s_1 \)

For a proof, see [91] and [168].
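The ordering of the six operator pairs can be spot-checked on a grid. This is a
numerical illustration only, not a proof. The Hamacher product is set to 0 at
(0, 0) and the Hamacher sum to 1 at (1, 1), where the quotients are undefined;
that convention and all function names are assumptions of this sketch.

```python
GRID = [i / 20 for i in range(21)]


def t_drastic(a, b):
    return min(a, b) if max(a, b) == 1.0 else 0.0


def t_bounded(a, b):
    return max(0.0, a + b - 1.0)


def t_einstein(a, b):
    return a * b / (2.0 - (a + b - a * b))


def t_algebraic(a, b):
    return a * b


def t_hamacher(a, b):
    return 0.0 if a == 0.0 and b == 0.0 else a * b / (a + b - a * b)


def t_minimum(a, b):
    return min(a, b)


def s_drastic(a, b):
    return max(a, b) if min(a, b) == 0.0 else 1.0


def s_bounded(a, b):
    return min(1.0, a + b)


def s_einstein(a, b):
    return (a + b) / (1.0 + a * b)


def s_algebraic(a, b):
    return a + b - a * b


def s_hamacher(a, b):
    return 1.0 if a == 1.0 and b == 1.0 else (a + b - 2.0 * a * b) / (1.0 - a * b)


def s_maximum(a, b):
    return max(a, b)


# Operators listed from pointwise smallest to pointwise largest.
T_CHAIN = [t_drastic, t_bounded, t_einstein, t_algebraic, t_hamacher, t_minimum]
S_CHAIN = [s_maximum, s_hamacher, s_algebraic, s_einstein, s_bounded, s_drastic]


def is_pointwise_chain(ops, tol: float = 1e-12) -> bool:
    """True if ops[0] <= ops[1] <= ... holds at every grid point."""
    for a in GRID:
        for b in GRID:
            values = [op(a, b) for op in ops]
            if any(u > v + tol for u, v in zip(values, values[1:])):
                return False
    return True


if __name__ == "__main__":
    print(is_pointwise_chain(T_CHAIN), is_pointwise_chain(S_CHAIN))
```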

4.4 Implication Operators

The choice of a suitable implication operator is less well settled than was the
case for conjunction and disjunction. In fact, so many different possible implication
operators can be considered that it is virtually impossible to list them all. Some
important classes, however, are described by D. Dubois et al. in [33] and [34] and by
D. Ruan et al. in [127]. There is no agreement, though, on which implication
properties of two–valued logic operators should be extended to the fuzzy case, unlike
the conjunction and disjunction properties. An example of such an axiom system
is that of Smets and Magrez in [139], which fundamentally assumes that the truth
value of an implication of two statements depends only on the truth values of
the separate statements, which is a reasonable assumption.
The following properties may or may not be desirable when constructing
an implication operator:

4.4.1 Claim III

1. Contrapositive symmetry:

∀a, b ∈ I : (a ⇒ b) = ((∼ b) ⇒ (∼ a))

2. Exchange principle:

∀a, b, c ∈ I : (a ⇒ (b ⇒ c)) = (b ⇒ (a ⇒ c))

3. Monotony:

∀a, b, c, d ∈ I : if a ≤ c and b ≥ d, then (a ⇒ b) ≥ (c ⇒ d)

4. Boundary condition:

∀a, b ∈ I : if a ≤ b, then (a ⇒ b) = 1

5. Neutrality principle:

∀b ∈ I : (1 ⇒ b) = b

6. Continuity:

x ⇒ y is continuous in its arguments

Some of the most commonly used implication operators encountered in literature
are then the following (Figure 21).

Fig. 21 Different implication operators (curves: µ in, µ out for the Gödel, Mamdani and product implications)

4.4.2 Examples

1. Zadeh’s implication operator is defined as
\[
a \Rightarrow_{ZAD} b = \max(1 - a, \min(a, b))
\]
This operator is derived from the fact that in two-valued logic, a ⇒ b is
equivalent to (a ∧ b) ∨ (∼ a), using the minimum as conjunction and the maximum as
disjunction.
2. Łukasiewicz’ implication operator is defined as
\[
a \Rightarrow_{LUC} b = \min(1, 1 - a + b)
\]
This operator is derived from the fact that in two-valued logic, a ⇒ b is equivalent
to (∼ a) ∨ b, using the bounded sum as disjunction.
3. Mamdani’s implication operator is defined as
\[
a \Rightarrow_{MAM} b = \min(a, b)
\]
While this operation is commonly referred to as a fuzzy implication, we would
like to stress that we are in fact taking the cartesian product. Hence this should be
interpreted more as a fuzzy (and in this case, symmetric) relation than as a fuzzy
logical implication.
4. Gödel’s implication operator is defined as
\[
a \Rightarrow_{GOD} b =
\begin{cases}
1 & \text{if } b \ge a \\
b & \text{if } b < a
\end{cases}
\]
5. Kleene–Dienes’ implication operator is defined as
\[
a \Rightarrow_{KLE} b = \max(1 - a, b)
\]
This operator is derived from the fact that in two-valued logic, a ⇒ b is equivalent
to (∼ a) ∨ b, using the maximum as disjunction.
6. Gaines’ implication operator is defined as
\[
a \Rightarrow_{GAI} b =
\begin{cases}
1 & \text{if } a \le b \\
\dfrac{b}{a} & \text{if } a > b
\end{cases}
\]
7. Yager’s implication operator is defined as
\[
a \Rightarrow_{YAG} b = b^{a}
\]
8. The product implication operator is defined as
\[
a \Rightarrow_{PRD} b = a \cdot b
\]

The difference between some of the implications can be seen in Figure 22. We
show the Mamdani implication next to the Gödel implication, being the pointwise
“largest” implication possible, and the product implication.

Fig. 22 Mamdani vs Gödel implication


4.4.3 Properties

As stated in [168], the following properties hold:

property \ ⇒              ZAD   LUC   MAM   GOD   KLE   GAI   YAG   PRD
contrapositive symmetry   No    Yes   No    No    Yes   No    No    No
exchange principle        No    Yes   Yes   Yes   Yes   No    Yes   Yes
monotony                  No    Yes   No    Yes   Yes   Yes   Yes   No
boundary condition        No    Yes   No    Yes   No    Yes   No    No
neutrality principle      Yes   Yes   Yes   Yes   Yes   Yes   Yes   Yes
continuity                Yes   Yes   Yes   No    Yes   No    No    Yes

One could then state that, for instance, the Łukasiewicz implication is better
than the Mamdani implication, because it satisfies more of the axioms. This is,
however, a very heuristic approach to the choice of a suitable implication
operator.
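Some of these properties are easy to test numerically. The sketch below implements
the eight implication operators listed in 4.4.2 and checks three things on a grid:
agreement with the two-valued truth table of Claim I, the neutrality principle and
the boundary condition. The grid resolution, the tolerance and all function names
are assumptions of this illustration, and a grid test can only refute a property,
not prove it.

```python
GRID = [i / 10 for i in range(11)]
TOL = 1e-9

IMPLICATIONS = {
    "ZAD": lambda a, b: max(1.0 - a, min(a, b)),
    "LUC": lambda a, b: min(1.0, 1.0 - a + b),
    "MAM": lambda a, b: min(a, b),
    "GOD": lambda a, b: 1.0 if b >= a else b,
    "KLE": lambda a, b: max(1.0 - a, b),
    "GAI": lambda a, b: 1.0 if a <= b else b / a,
    "YAG": lambda a, b: b ** a,  # Python evaluates 0.0 ** 0.0 as 1.0
    "PRD": lambda a, b: a * b,
}

# Two-valued truth table of the classical implication.
CLASSICAL = {(0.0, 0.0): 1.0, (0.0, 1.0): 1.0, (1.0, 0.0): 0.0, (1.0, 1.0): 1.0}


def extends_classical(imp) -> bool:
    """True if imp reproduces the two-valued implication on {0, 1}."""
    return all(abs(imp(a, b) - v) <= TOL for (a, b), v in CLASSICAL.items())


def is_neutral(imp) -> bool:
    """Neutrality principle: (1 => b) = b on the grid."""
    return all(abs(imp(1.0, b) - b) <= TOL for b in GRID)


def has_boundary_condition(imp) -> bool:
    """Boundary condition: a <= b implies (a => b) = 1 on the grid."""
    return all(abs(imp(a, b) - 1.0) <= TOL for a in GRID for b in GRID if a <= b)


if __name__ == "__main__":
    for name, imp in IMPLICATIONS.items():
        print(name, extends_classical(imp), is_neutral(imp), has_boundary_condition(imp))
```

On this grid the Mamdani and product operators do not reproduce the classical truth
table, since both give 0 for (0 ⇒ 0), which fits the reading of the Mamdani operator
as a fuzzy relation rather than a logical implication.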

4.4.4 Remarks

Applying the implications to the different rules in the antecedent rule base
yields a set of fuzzy consequence rules, which still requires
the application of yet another aggregation operator that combines the results of
the individually fired rules into one resulting fuzzy set. If we used
the Mamdani implication, this aggregation is the t–conorm max; if we used the
Gödel implication, it is more logical to use the t–norm min. When we look at the
graphs of Figure 22, in the Mamdani case, we take the union of the fuzzy graphs,
while in the Gödel case, we take the intersection of the graphs. Generally, though,
it is possible to take any t–norm as an intersection operator and any t–conorm
as a union operator.
Another difference that is important when applying the rules of inference, is the
following: we could first combine all the rules, and then fire them through a com-
position operation. We call this procedure a composition-based inference. On the
other hand, it is also possible to fire each rule individually, and then combine all of
the resulting fuzzy outputs into one fuzzy set. This procedure is called individual
rule-based inference. It is easy to see that in the case of a Mamdani implication,
these two concepts are equivalent, while in the case of a Gödel implication, they are
not — see [24]. A detailed study of the difference between the Mamdani and Gödel
approaches can be found in [35].
Often, to diminish the computational work, the aggregation and implication
are performed together in one operator, that we call a generalized aggregation
operator. One should heed though that this may give rise to unexpected problems,
of which the most important is that these operators do not necessarily
commute.

5 Defuzzification Operators

Defuzzification is a necessary tool to make a fuzzy control system interact with
real-world models. In its strictest sense this contradicts the idea of fuzzification,
which extends the notion of crisp sets with a degree of uncertainty. Nevertheless,
defuzzification is unavoidable when a crisp output is desired, as is the case in many
practical applications. A defuzzification can be seen as an operator

D : F(X) −→ X

assigning to each fuzzy set µ ∈ F(X) a crisp value D(µ ) ∈ X. In most cases, µ will
be the result of an aggregation process on some fuzzy rule base, with the resulting
fuzzy output looking like the one in Figure 23.
The goal is then to make D(µ ) an element of X which approaches the
semantic essence of the fuzzy set µ as well as possible.

5.1 Criteria

It is useful to state beforehand which conditions a good
defuzzification operator D should satisfy. It will be practically impossible to find
a defuzzification operator which satisfies all conditions, so it is of the utmost
importance to select beforehand which criteria matter in
our particular application. A nice description of these defuzzification criteria can be
found in W. Van Leekwijck and E. Kerre [148, 149]. Some defuzzification criteria
historically go back to the ideas T. Runkler proposed in [128]. Other interesting
remarks about criteria that influence the choice of a defuzzification operator are
given in Section 3.6 of [24]. The different defuzzification operators can be classified
following two criteria: either by looking at their mathematical properties, or else by
considering the computational efficiency and mathematical transparency. Some
remarks concerning the former can be found in [73] and [148], and in this article, we
will also restrict ourselves to the former. Practically, a distinction has to be made
regarding which structure on X exists, if any. We will distinguish between the cases
where X has no structure at all, (X, ≤) is an ordered lattice, (X, T ) is a topological
space and (X, T , +, ·) is a topological vector space.

Fig. 23 Fuzzy consequence after aggregation

We list the following criteria:

5.1.1 Uniqueness Criterion (UC)

For an arbitrary universe X, the defuzzification value should be unique, and therefore
should not depend on any stochastic process. Stated differently, the output of
the defuzzification process should be unique for every choice of the fuzzy set
µ ∈ F(X):
∀µ ∈ F(X), ∃!x ∈ X : D(µ ) = x
A defuzzifier D that satisfies this property will be called unique.

5.1.2 Core Selection Criterion (CSC)

For an arbitrary universe X, the defuzzification value should be one of the elements
at which µ ∈ F(X) attains its maximal membership. Stated differently,
the defuzzification value should be in the core of the fuzzy set ([129]):

∀µ ∈ F(X) : D(µ ) ∈ core(µ )

or, equivalently,

∀µ ∈ F(X) : ∀y ∈ X : µ (y) ≤ µ (D(µ ))

A defuzzifier D that satisfies this property will be called semantically correct.

5.1.3 Ordinal Scale Invariance Criterion (OSIC)

For an arbitrary universe X, the defuzzification value should be independent of any
positive affine transformation applied to the values in the range space I. Stated
differently, for all µ ∈ F(X), for all \( a \in \mathbb{R}_0^+ \) and \( b \in \mathbb{R} \), define
\[
a\mu + b : X \longrightarrow I, \qquad x \longmapsto a\mu(x) + b
\]
(of course on condition that these operations are well defined). Then the
defuzzification value should not be changed, or, in other words,

∀µ ∈ F(X) such that aµ + b ∈ F(X) : D(aµ + b) = D(µ )

A defuzzifier D that satisfies this property will be called ordinal scale-invariant
([107], [128], nicely summarized in [148]).
Let us now assume that (X, ≤) is an ordered lattice.

5.1.4 Monotony Criterion (MC)

For an ordered universe (X, ≤), the defuzzification should respect the order of
(X, ≤). For all µ ∈ F(X), if ν ∈ F(X) is such that µ (D(µ )) = ν (D(µ )) and
furthermore

∀x < D(µ ) : ν (x) ≤ µ (x)
∀x > D(µ ) : ν (x) ≥ µ (x)

then D(ν ) ≥ D(µ ) and vice versa. A defuzzifier D that satisfies this property will
be called monotonous ([128], Figure 24).
This means that the defuzzification value will not decrease when D is evaluated
on a fuzzy set ν whose membership values, compared with those of a given fuzzy
set µ , are lower to the left of the defuzzification value D(µ ) and higher to the
right of it.

5.1.5 Triangular Conorm Criterion (TNC)

For an ordered universe (X, ≤), given a conorm S : I × I −→ I, for all µ , ν ∈ F(X)
such that D(µ ) ≤ D(ν ), define
\[
\mu \vee_S \nu : X \longrightarrow I, \qquad x \longmapsto S(\mu(x), \nu(x))
\]
Then D(µ ) ≤ D(µ ∨S ν ) ≤ D(ν ). A defuzzifier D that satisfies this property will be
called S–conjunctive.
This means that, given a conorm S, the defuzzification of the combination of two
fuzzy sets under S stays between the defuzzification values of the fuzzy sets
separately. One can extend this criterion to hold for larger collections of conorms,
or perhaps even all conorms, but one may expect that this is a very strict criterion.
One of the obvious criteria a defuzzifier has to satisfy is continuity: when the
rule antecedents are only modified slightly, this should not drastically affect the
output of the defuzzifier. It has been studied extensively that, for instance, the
MOM-defuzzifier is not continuous, unlike the COG-defuzzifier (see further on in this
section). However, in doing so, one has to assume that X carries some topology to
describe the distance between two fuzzy sets. So let us now assume that (X, T ) is
a topological space; then it will be possible to state criteria that make use of the
topological structure of X. We have to make the following distinction:

Fig. 24 Monotony

5.1.6 Weak Continuity Criterion (WCC)

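For a topological universe (X, T ), a defuzzifier D will be called weakly continuous
if and only if for all x0 ∈ X and ξ ∈ R, if we define
\[
\mu_{x_0}^{\xi} : X \longrightarrow I, \qquad
x \longmapsto
\begin{cases}
\mu(x) + \xi & \text{if } x = x_0 \\
\mu(x) & \text{if } x \ne x_0
\end{cases}
\]
(given that ξ ∈ R is chosen such that this is well defined), then
\[
\forall \varepsilon > 0, \exists \delta > 0 : \forall |\xi| < \delta :
|D(\mu_{x_0}^{\xi}) - D(\mu)| < \varepsilon
\]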

The defuzzification operator D : F(X) −→ (X, T ) must be continuous for “some”
structure on F(X). This raises the question of which topology should be put on the
latter. As described above, this turns out to be a pointwise topology. It seems more
natural, though, to put some stronger (uniform) topology on F(X). Therefore, put the
following pseudometric on F(X):
\[
d(\mu, \nu) = \|\mu - \nu\| := \sup_{x \in X} |\mu(x) - \nu(x)|
\]
Then the following criterion should apply:

5.1.7 Strong Continuity Criterion (SCC)

For a topological universe (X, T ), a defuzzifier D will be called strongly continuous
if and only if for all µ ∈ F(X),
\[
\forall \varepsilon > 0, \exists \delta > 0 : \forall \nu \in F(X) \text{ such that }
\|\mu - \nu\| < \delta : |D(\mu) - D(\nu)| < \varepsilon
\]
The strong continuity criterion obviously implies the weak one. Yet, just as was
the case with the monotony criterion, one could extend this criterion to hold for other
topological structures on F(X), perhaps even all topological structures satisfying a
certain set of properties at once, but again, this condition may be just too strict.
Suppose now that there exist an addition and a scalar multiplication on (X, T )
which are continuous with respect to T . In that case, we
can endow X with the structure of a topological vector space, and another
criterion can be stated:

5.1.8 Nominal Scale Invariance Criterion (NSIC)

For a topological vector space universe (X, T , +, ·), any positive affine
transformation on the universe X should induce the inverse affine transformation on
the defuzzification value. Stated differently, for all µ ∈ F(X), for all
\( a \in \mathbb{R}_0 \) and \( b \in \mathbb{R} \), define
\[
\mu^{a,b} : X \longrightarrow I, \qquad x \longmapsto \mu\!\left(\frac{x - b}{a}\right)
\]
(of course again on condition that this is well defined). Then the defuzzification
value should be
\[
D(\mu^{a,b}) = a D(\mu) + b.
\]
A defuzzifier D that satisfies this property will be called universe scale-invariant
([128]).
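As an illustration of this criterion on X = R, one can check numerically that a
center-of-gravity defuzzifier behaves as required for a > 0. The sketch assumes
that the transformed fuzzy set is x ↦ µ((x − b)/a) and that the center of gravity
is the ratio of ∫ x µ(x) dx to ∫ µ(x) dx, approximated here by a midpoint sum. The
names `triangle`, `transformed` and `cog` are hypothetical.

```python
def triangle(b: float, a: float, c: float):
    """Triangular membership function with support [b, c] and peak at a."""
    def mu(x: float) -> float:
        if b < x <= a:
            return (x - b) / (a - b)
        if a < x < c:
            return (c - x) / (c - a)
        return 0.0
    return mu


def transformed(mu, a: float, b: float):
    """Fuzzy set x -> mu((x - b) / a), i.e. mu carried along x -> a x + b."""
    return lambda x: mu((x - b) / a)


def cog(mu, lo: float, hi: float, steps: int = 20000) -> float:
    """Center of gravity of mu on [lo, hi], approximated by a midpoint sum."""
    h = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        m = mu(x)
        num += x * m
        den += m
    return num / den


if __name__ == "__main__":
    mu = triangle(0.0, 1.0, 4.0)
    a, b = 2.0, 3.0
    original = cog(mu, 0.0, 4.0)
    # The support [0, 4] is carried to [a*0 + b, a*4 + b] = [3, 11].
    moved = cog(transformed(mu, a, b), 3.0, 11.0)
    print(moved, a * original + b)
```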
In the particular case where X = R, being a topological vector space as well as
an ordered lattice, of course all of these criteria apply at once. Also, in that case
we are considering fuzzy real numbers, which means that all theory developed for
the treatment of the fuzzy real line can be used. We have included in the bibliogra-
phy a number of references dealing with the implementation of a structure on the
fuzzy real line; especially the work of D. Dubois and H. Prade ([25, 32]), S. Gähler
and W. Gähler ([41]), R. Goetschel and W. Voxman ([43]), R. Lowen ([88–90]),
M. Mizumoto and J. Tanaka ([104]) is interesting in this context.
In [85] and [86] we have been working on two new criteria, only applicable on
compact subsets X ⊆ R or subsets thereof, which nevertheless seem to be important.
Suppose a controller is given by a rule base consisting of a finite number of fuzzy
variables A = {α1 , ..., αn } ⊆ F(X).

5.1.9 Definition (Control Function)

A function f : X −→ X will be called the control function. For every i ∈ {1, ..., n}, de-
fine βi := f(αi ), being the image of the fuzzy set as defined in Definition 1.5.8. Given
that the collection A = {α1 , ..., αn } covers X, then so does f(A) = {β1 , ..., βn }.
The reason why the cartesian product is often taken as implication is the
following: when the fuzzy variables in the antecedent rule base overlap, this yields
a certain degree of uncertainty, which increases with the length of the overlap,
as can be seen in Figure 25. In the product space, this means that the graph of the
control function f is to be found within a certain region of uncertainty.
Now one logical criterion that should hold is that, if f is the identity function, and
for any x ∈ X, given that µ (x) is the aggregation of the antecedent rule base with
x as input value, this µ (x) again defuzzifies to its original value x. In other words,
D ◦ µ = idX . However, it turns out that even for the most simple control functions,
this is not necessarily true. For instance, while this is understandable in the case
of a discontinuous defuzzifier such as Mean Of Maxima, it is surprising to see that
a continuous defuzzifier such as Center Of Gravity does not satisfy this property
42 W. Peeters

Fig. 25 Graph within a certain region of uncertainty

either. Therefore, in [85] and [86], we stated two new criteria that a defuzzifier may
or may not satisfy.
In the following criteria, put f = id and ∀i ∈ {1, ..., n} : βi := αi :

5.1.10 Consistency Criterion (CC)

For a universe X ⊆ R compact, let A = {α1 , ..., αn } be an antecedent rule base that
covers X. Furthermore, let µ ∈ F(X) be the fuzzy set resulting from aggregation and
implication. A defuzzifier D will be called consistent if and only if for all x ∈ X,
D(µ (x)) = x(= id(x)).
One will rarely encounter a defuzzification operator that is consistent. Mostly,
our goal is to find an upper bound for the supremum distance ‖D ◦ µ − f‖∞ ≤ l(n),
where n is the number of fuzzy sets in the rule base.
As the number of rules increases and the area of overlap shrinks, one can become
more confident that the defuzzified function is indeed the identity, but this is by
no means guaranteed. Therefore, we will weaken the criterion as follows:

5.1.11 Asymptotic Consistency Criterion (ACC)

For a universe X ⊆ R compact, for every n ∈ N0 let
\(A_n = \{\alpha_1^n, \dots, \alpha_{N(n)}^n\}\) be an antecedent rule base that covers X,
with µn ∈ F(X) the fuzzy set resulting from aggregation and implication. Furthermore,
we demand that

\[
\lim_{n \to \infty} \; \max_{i = 1, \dots, N(n)} \operatorname{width}(\alpha_i^n) = 0.
\]

A defuzzifier D will be called asymptotically consistent if and only if

\[
\lim_{n \to \infty} \left\| D \circ \mu_n - f \right\|_\infty = 0.
\]

These two criteria mean that given an antecedent rule base A = {α1 , ..., αn }, the
difference between the output and the image through f tends to zero when increasing
the number of rules in the base. One can extend these criteria to hold for larger
collections of antecedent rule bases, all those which cover X, all partitions of unity,
or perhaps even all of them, and for other larger collections of test functions, but
one may expect that these criteria will become too strict again.

5.1.12 Corollary

Due to the scaling arguments 5.1.3 and 5.1.8, one may assume that the universe
X = [0, 1]. An often used standard rule base is the following collection of partitions
of unity:

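\[
\alpha_1(x) =
\begin{cases}
-xn + 1 & \text{if } x \in \left[0, \tfrac{1}{n}\right] \\
0 & \text{otherwise}
\end{cases}
\]

\[
\alpha_k(x) =
\begin{cases}
xn + 2 - k & \text{if } x \in \left[\tfrac{k-2}{n}, \tfrac{k-1}{n}\right] \\
-xn + k & \text{if } x \in \left[\tfrac{k-1}{n}, \tfrac{k}{n}\right] \\
0 & \text{otherwise}
\end{cases}
\qquad (k = 2, \dots, n)
\]

\[
\alpha_{n+1}(x) =
\begin{cases}
xn - n + 1 & \text{if } x \in \left[\tfrac{n-1}{n}, 1\right] \\
0 & \text{otherwise}
\end{cases}
\]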
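
The standard rule base above can be sketched in a few lines of code. The sketch below
builds the n + 1 triangular membership functions on [0, 1] and checks numerically that
they sum to 1 on a grid, i.e. that they form a partition of unity. The function name
standard_rule_base and the choice n = 4 are illustrative assumptions.

```python
def standard_rule_base(n):
    """Return [alpha_1, ..., alpha_{n+1}], the triangular rule base on [0, 1]."""

    def make(k):
        center = (k - 1) / n          # alpha_k peaks at (k - 1) / n

        def alpha(x):
            # 1 - n * |x - center| reproduces both linear branches of alpha_k
            return max(0.0, 1.0 - n * abs(x - center))

        return alpha

    return [make(k) for k in range(1, n + 2)]


n = 4
alphas = standard_rule_base(n)
grid = [i / 100 for i in range(101)]
sums = [sum(alpha(x) for alpha in alphas) for x in grid]
print(min(sums), max(sums))           # both should equal 1 up to rounding
```

Writing each αk as 1 − n|x − (k − 1)/n| clipped at 0 is only a compact way of encoding
the two linear branches of the piecewise definition.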

5.2 Overview of the Different Defuzzification Operators

The most crucial step in the construction of a fuzzy controller, however, is the de-
fuzzification method. In physical applications, at one stage in the adaptive process,
a decision has to be taken as how to adjust the system, thereby needing one out-
put variable. Several defuzzification techniques have been studied extensively, and
for a good overview we refer to the articles of T.A. Runkler et al. [129] and
W. Van Leekwijck et al. [148]. We will now give an overview of the different possi-
ble defuzzification operators
D· : F(X) −→ X,
together with a list of criteria they either do or do not fulfill. This list is by no
means meant to be exhaustive, but rather meant as an overview of the most important
possibilities. A defuzzifier that satisfies all criteria does not exist. We assume that
any element µ ∈ F(X) is the result of an aggregation and implication of a certain
fuzzy rule base A = {α1 , ..., αn } with a given input value x ∈ X.

5.2.1 Random Choice of Maxima

The random choice of maxima defuzzification DRCM is a stochastic variable that
maps µ ∈ F(X) to a random element x ∈ core(µ) with a probability

\[
P(x) = \frac{\lambda(\{x\})}{\lambda(\operatorname{core}(\mu))},
\]

λ being the Lebesgue measure on X (see [78]).


The following defuzzifications will be called core defuzzifications, because the
defuzzified values are always a member of the core set.

5.2.2 Definition (FOM-, LOM-, MOM- and MOS-defuzzification)

For an ordered universe (X, ≤),

1. The first of maxima defuzzification DFOM (Figure 26) is a function that maps
µ ∈ F(X) to

\[
D_{FOM}(\mu) = \inf \Big\{ y \in X : \mu(y) = \sup_{z \in X} \mu(z) \Big\}
\]

2. The last of maxima defuzzification DLOM (Figure 26) is a function that maps
µ ∈ F(X) to

\[
D_{LOM}(\mu) = \sup \Big\{ y \in X : \mu(y) = \sup_{z \in X} \mu(z) \Big\}
\]

3. The middle of maxima defuzzification DMOM (Figure 26) is a function that maps
µ ∈ F(X) to

\[
D_{MOM}(\mu) = \frac{D_{LOM}(\mu) + D_{FOM}(\mu)}{2}
\]

4. The middle of support defuzzification DMOS is a function that maps µ ∈ F(X) to

\[
D_{MOS}(\mu) = \frac{\inf\{y \in X : \mu(y) > 0\} + \sup\{y \in X : \mu(y) > 0\}}{2}
\]
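
On a sampled membership function, the four definitions above reduce to simple minimum
and maximum operations. The following sketch is only an approximation under stated
assumptions: the fuzzy set is given by its values on a finite grid, so the infimum and
supremum are taken over grid points, and the trapezoidal test set is a hypothetical
example.

```python
def core_defuzzifiers(xs, mus, tol=1e-12):
    """Return (FOM, LOM, MOM, MOS) for a fuzzy set sampled as mus[i] = mu(xs[i])."""
    height = max(mus)
    core = [x for x, m in zip(xs, mus) if abs(m - height) <= tol]
    support = [x for x, m in zip(xs, mus) if m > 0.0]
    fom, lom = min(core), max(core)            # first and last of maxima
    mom = (fom + lom) / 2.0                    # middle of maxima
    mos = (min(support) + max(support)) / 2.0  # middle of (sampled) support
    return fom, lom, mom, mos


def trapezoid(x):
    """Hypothetical test set: support (1, 6), plateau of height 1 on [2, 4]."""
    return max(0.0, min(x - 1.0, 1.0, (6.0 - x) / 2.0))


xs = [i * 0.5 for i in range(15)]              # grid 0.0, 0.5, ..., 7.0
mus = [trapezoid(x) for x in xs]
fom, lom, mom, mos = core_defuzzifiers(xs, mus)
print(fom, lom, mom, mos)
```

Note that FOM, LOM and MOM only look at the plateau, whereas MOS only looks at the
support; none of them uses the shape of the set in between.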
The problem with core defuzzification methods is that they tend to select an
occasional peak value over a centroid mass that is located elsewhere but carries
substantially more weight. As a counterexample, consider the fuzzy set from Figure 28,
where anyone would agree that the main mass is located on the right side of the
defuzzification value. While core defuzzification methods are computationally much
simpler, the supplementary cost of a calculation that takes the whole fuzzy set into
account is generally acceptable. The defuzzifications that make use of such a total
consideration will be called centroid defuzzifications.

Fig. 26 FOM-, LOM- and MOM-defuzzification
Fig. 27 MOS-defuzzification
Fig. 28 Counterexample

The following criterion is only useful in the case where the rules
\(\{\alpha_i\}_{i=1}^{n}\) are functions X ⊆ R compact −→ [0, 1].

5.2.3 Definition (COG-defuzzification)

For a universe X ⊆ R compact, the Center-of-Gravity defuzzification DCOG is a
function that maps µ ∈ F(X) to

\[
D_{COG}(\mu) = \frac{\displaystyle\int_X x\,\mu(x)\,dx}{\displaystyle\int_X \mu(x)\,dx}
\]

DCOG is perhaps the most commonly used defuzzification method, although it relies
heavily on interpreting the membership function as a probability distribution, which,
strictly speaking, is not necessarily justified. The major drawback of this method is
its relative computational complexity: because the mass under the function µ is
considered to be uniformly distributed, all the intersection points of the different
fired rules have to be calculated. A relatively simpler method is to compute the
integrals under the firing of each rule separately, and to superpose the results.

5.2.4 Definition (COS-defuzzification)

For a universe X ⊆ R compact, the Center Of Sums-defuzzification DCOS is a function
that maps a rule base consisting of rules µ1, ..., µn ∈ F(X) to

\[
D_{COS}(\mu) = \frac{\displaystyle\sum_{i=1}^{n} \int_X x\,\mu_i(x)\,dx}
                    {\displaystyle\sum_{i=1}^{n} \int_X \mu_i(x)\,dx}
\]

The difference with the Center-of-Gravity defuzzification is that some parts of
the area may be counted multiple times, as can be seen in the difference in grey
tones in Figure 29.
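
The difference between the two methods can be made concrete with a small numerical
sketch. It assumes two hypothetical fired consequents on X = [0, 4] (a triangle and a
triangle clipped at 0.5) that overlap, aggregates them with the maximum for COG, and
integrates with a simple trapezoidal rule; none of these choices comes from the text.

```python
def integrate(f, lo, hi, steps=20000):
    """Composite trapezoidal rule for f on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * (f(lo) + f(hi)) + inner)


def cog(mu, lo, hi):
    """Center of gravity of one aggregated fuzzy set."""
    return integrate(lambda x: x * mu(x), lo, hi) / integrate(mu, lo, hi)


def center_of_sums(mus, lo, hi):
    """Center of sums: every fired rule is integrated separately."""
    num = sum(integrate(lambda x, m=m: x * m(x), lo, hi) for m in mus)
    den = sum(integrate(m, lo, hi) for m in mus)
    return num / den


def mu1(x):
    """Hypothetical fired rule: triangle on [0, 2] with peak at 1."""
    return max(0.0, 1.0 - abs(x - 1.0))


def mu2(x):
    """Hypothetical fired rule: triangle on [1, 4], clipped at height 0.5."""
    return min(0.5, max(0.0, 1.0 - abs(x - 2.5) / 1.5))


def aggregated(x):
    """Maximum of the fired rules, as used by COG."""
    return max(mu1(x), mu2(x))


d_cog = cog(aggregated, 0.0, 4.0)
d_cos = center_of_sums([mu1, mu2], 0.0, 4.0)
print(d_cog, d_cos)    # the overlap on [1, 2] is counted twice by COS only
```

COS never needs the intersection points of the fired rules, which is where its
computational advantage comes from; the price is a small shift of the result whenever
the fired rules overlap.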
In [37], D.P. Filev and R.R. Yager considered the Center-of-Gravity defuzzifi-
cation as one particular case of a more general parametric family of probability
distributions.

5.2.5 Definition (BADD-defuzzification)

For a universe X ⊆ R compact, and any Γ ∈ R+, the Basic Defuzzification Distributions
DBADD(−, Γ) are a parametric family of functions that map µ ∈ F(X) to

\[
D_{BADD}(\mu, \Gamma) = \frac{\displaystyle\int_X x\,(\mu(x))^{\Gamma}\,dx}
                             {\displaystyle\int_X (\mu(x))^{\Gamma}\,dx}
\]

Fig. 29 COG-defuzzification vs. COS-defuzzification

5.2.6 Proposition

For a universe X ⊆ R compact,


1. DBADD (µ , 0) = DMOS (µ )
2. DBADD (µ , 1) = DCOG (µ )
3. \(\lim_{\Gamma \to \infty} D_{BADD}(\mu, \Gamma) = D_{MOM}(\mu)\)

The parameter Γ is hence a measure of confidence: the higher Γ, the more one is
convinced that the mean of the core is a good defuzzification value, meaning that as
a distribution at least the core of µ is more or less symmetric.
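
The role of Γ can be explored numerically. The sketch below evaluates the BADD formula
for a hypothetical asymmetric trapezoidal set with support (0, 6) and core [2, 3], for
which DMOS = 3 and DMOM = 2.5. It assumes that points outside the support receive
weight 0 even when Γ = 0 (otherwise the case Γ = 0 would return the midpoint of the
integration interval instead of the middle of the support); the helper names and the
trapezoidal integration are illustrative choices.

```python
def integrate(f, lo, hi, steps=20000):
    """Composite trapezoidal rule for f on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * (f(lo) + f(hi)) + inner)


def badd(mu, lo, hi, gamma):
    """BADD defuzzification with exponent gamma >= 0."""

    def weight(x):
        m = mu(x)
        # Assumption: points outside the support never contribute.
        return m ** gamma if m > 0.0 else 0.0

    return integrate(lambda x: x * weight(x), lo, hi) / integrate(weight, lo, hi)


def mu(x):
    """Hypothetical set: rises on [0, 2], equals 1 on [2, 3], falls on [3, 6]."""
    return max(0.0, min(x / 2.0, 1.0, (6.0 - x) / 3.0))


for gamma in (0, 1, 5, 50, 200):
    print(gamma, badd(mu, 0.0, 6.0, gamma))
```

The printed values move from the middle of the support (Γ = 0) through the center of
gravity (Γ = 1) towards the middle of the core as Γ grows.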
Another centroid defuzzification method was stated by R. Jager in [65], by omit-
ting all values of µ that lie below a certain threshold value α ∈ [0, 1], and subse-
quently taking the Center-of-Gravity defuzzification.

5.2.7 Definition (ICOG-defuzzification)

For a universe X ⊆ R compact, for any α ∈ [0, 1], the Indexed Center-of-Gravity
defuzzification DICOG (Figure 30) is a function that maps µ ∈ F(X) to

\[
D_{ICOG}(\mu, \alpha) = \frac{\displaystyle\int_{\Gamma_\alpha(\mu)} x\,\mu(x)\,dx}
                             {\displaystyle\int_{\Gamma_\alpha(\mu)} \mu(x)\,dx}
                      = D_{COG}(\mu_\alpha^{*})
\]

where

\[
\mu_\alpha^{*}(x) =
\begin{cases}
\mu(x) & \text{if } \mu(x) \geq \alpha \\
0 & \text{if } \mu(x) < \alpha
\end{cases}
\]

Fig. 30 ICOG-defuzzification
Fig. 31 SLIDE-defuzzification

Obviously, the following proposition then holds:

5.2.8 Proposition

For a universe X ⊆ R compact,


1. DICOG (µ , h(µ )) = DMOM (µ )
2. DICOG (µ , 0) = DCOG (µ )
Another two-parameter family of probability-based defuzzifications, called
semilinear defuzzification, was introduced by R.R. Yager and D.P. Filev in [160].

5.2.9 Definition (SLIDE-defuzzification)

For a universe X ⊆ R compact, for any α, β ∈ [0, 1], the SemiLinear Defuzzification
DSLIDE (Figure 31) (see [160]) is a function that maps µ ∈ F(X) to

\[
D_{SLIDE}(\mu, \alpha, \beta) =
\frac{(1 - \beta) \displaystyle\int_{(\Gamma_\alpha(\mu))^{C}} x\,\mu(x)\,dx
      + \displaystyle\int_{\Gamma_\alpha(\mu)} x\,\mu(x)\,dx}
     {(1 - \beta) \displaystyle\int_{(\Gamma_\alpha(\mu))^{C}} \mu(x)\,dx
      + \displaystyle\int_{\Gamma_\alpha(\mu)} \mu(x)\,dx}
\]

Whereas the parameter α is again a measure of confidence in the system, the
parameter β on the contrary denotes the degree of rejection of all points with
membership µ(x) < α.

5.2.10 Proposition

For a universe X ⊆ R compact,


1. ∀β ∈ [0, 1], DSLIDE(µ, 0, β) = DCOG(µ)
2. ∀α ∈ ]0, h(µ)], DSLIDE(µ, α, 0) = DCOG(µ)
3. ∀α ∈ ]0, h(µ)], DSLIDE(µ, α, 1) = DICOG(µ, α)
4. DSLIDE(µ, h(µ), 1) = DMOM(µ)

Taking α = h(µ), the SLIDE parametric family DSLIDE(µ, h(µ), β) hence contains
both the (continuous) DCOG for β = 0 and the (noncontinuous) DMOM for
β = 1.
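
The ICOG and SLIDE formulas differ only in the weight given to the points below the
threshold α, which the following sketch makes explicit. It evaluates both on a
hypothetical trapezoidal set with height h(µ) = 1 and core [2, 3]; the helper names,
the test set and the trapezoidal integration are assumptions made for illustration.

```python
def integrate(f, lo, hi, steps=20000):
    """Composite trapezoidal rule for f on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * (f(lo) + f(hi)) + inner)


def cog(mu, lo, hi):
    return integrate(lambda x: x * mu(x), lo, hi) / integrate(mu, lo, hi)


def icog(mu, lo, hi, alpha):
    """COG of the set truncated below the threshold alpha."""

    def truncated(x):
        m = mu(x)
        return m if m >= alpha else 0.0

    return cog(truncated, lo, hi)


def slide(mu, lo, hi, alpha, beta):
    """SLIDE: points with mu(x) < alpha are damped by the factor (1 - beta)."""

    def weight(x):
        m = mu(x)
        return m if m >= alpha else (1.0 - beta) * m

    return cog(weight, lo, hi)


def mu(x):
    """Hypothetical set: rises on [0, 2], equals 1 on [2, 3], falls on [3, 6]."""
    return max(0.0, min(x / 2.0, 1.0, (6.0 - x) / 3.0))


print(slide(mu, 0.0, 6.0, 0.0, 0.3), cog(mu, 0.0, 6.0))          # alpha = 0
print(slide(mu, 0.0, 6.0, 0.5, 1.0), icog(mu, 0.0, 6.0, 0.5))    # beta = 1
print(slide(mu, 0.0, 6.0, 1.0, 1.0))                             # alpha = h(mu), beta = 1
```

With β = 1 the points below the threshold are rejected completely, so SLIDE and ICOG
coincide; with α = h(µ) and β = 1 only the core survives and the result is its midpoint.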

6 An Extended Example

Following the outline proposed in [168], Chapter 11, we will now give an extended
example of a fuzzy controller that is used to steer an automated heating system.
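Let t ∈ T = [0, 40] represent the current temperature in a room. We then define five
linguistic variables on the space T, which denote the degree to which this temperature
is “freezing”, “cold”, “average”, “warm” or “hot”. We could for instance take the
following antecedent rule base, which is a partition of unity (Figure 32):

\[
\begin{aligned}
\mu_{\text{freezing}}(t) &= \left(1 - \frac{t}{10}\right) \vee 0 \\
\mu_{\text{cold}}(t) &= \left(1 - \left|1 - \frac{t}{10}\right|\right) \vee 0 \\
\mu_{\text{average}}(t) &= \left(1 - \left|1 - \frac{t - 10}{10}\right|\right) \vee 0 \\
\mu_{\text{warm}}(t) &= \left(1 - \left|1 - \frac{t - 20}{10}\right|\right) \vee 0 \\
\mu_{\text{hot}}(t) &= \frac{t - 30}{10} \vee 0
\end{aligned}
\]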

Apart from that, we must make sure that the temperature never exceeds the
boundary values of [0, 40]. This can be done by simply clipping the value t to
0 ∨ t ∧ 40. Suppose now that we also know the value ∆t ∈ [−1, 1] denoting
the recent change of temperature, which can be “cooling fast”, “cooling”, “staying
the same”, “warming” or “warming fast”. Such a value ∆t can be obtained for
example by evaluating the temperature at two consecutive measurement points in time
and clipping the difference to a certain minimum and maximum. In our example,
∆t(n) := −1 ∨ (t(n) − t(n − 1)) ∧ 1,
which is reasonable on condition that the change in temperature between two
subsequent measurement points in time only exceptionally exceeds the threshold.

Fig. 32 Rule base for the temperature T
Fig. 33 Rule base for the change in temperature ∆T
Fig. 34 Consequence rule base describing the action to be taken

If this is not the case, a higher sampling frequency may be required. Therefore it
is feasible to propose the following antecedent rule base, which is also a partition of
unity (and, in fact, the same as the one for t up to a scaling factor) (Figure 33):

µcooling fast (∆t) = (1 − |2 + 2∆t|) ∨ 0
µcooling (∆t) = (1 − |1 + 2∆t|) ∨ 0
µstaying the same (∆t) = (1 − |2∆t|) ∨ 0
µwarming (∆t) = (1 − |1 − 2∆t|) ∨ 0
µwarming fast (∆t) = (1 − |2 − 2∆t|) ∨ 0

Finally, a third rule base will serve as the consequence. The action to be taken
will either be to “decrease” the power of the heating system, to “take no action” or
to “increase” its power. For simplicity reasons, we will take the power p ∈ [−1, 1]
as well, which can be simply adjusted by any desired scale factor. Let us consider
the following rule base (Figure 34):

νdecrease (p) = (1 − |1 + p|) ∨ 0


νno action (p) = (1 − |p|) ∨ 0
νincrease (p) = (1 − |1 − p|) ∨ 0

Secondly, we will establish a (heuristic) rule base. A suitable rule for instance
would be

IF (t is cold) and (∆t is cooling) THEN (p is increasing)


However, instead of writing out all the rules, it is much easier to consider the
following table (i = increase, na = no action, d = decrease):

t/∆t   cf   c    sts  w    wf
f      i    i    i    i    na
c      i    i    i    na   na
a      i    na   na   na   d
w      na   na   d    d    d
h      na   d    d    d    d
This table is complete, in the sense that any input values (t, ∆t) in the given
intervals trigger at least one consequence rule. As a rule of inference, we will use the
minimum operator, which we will also use as the rule of consequence, following
the approach of Mamdani. Suppose a measurement is performed, and we find that
t = 27 and ∆t = −0.4. Then µaverage (t) and µwarm (t) are nonzero with respect to
t, and µcooling (∆t) and µstaying the same (∆t) are nonzero with respect to ∆t. From the
table, the rules marked with an asterisk therefore fire:

t/∆t   cf   c    sts  w    wf
f      i    i    i    i    na
c      i    i    i    na   na
a      i    na*  na*  na   d
w      na   na*  d*   d    d
h      na   d    d    d    d
The grades of membership are respectively µaverage (27) = 0.3, µwarm (27) = 0.7,
µcooling (−0.4) = 0.8 and µstaying the same (−0.4) = 0.2. The four antecedents are
therefore aggregated by means of the minimum operator:

t/∆t      c (0.8)   sts (0.2)
a (0.3)   0.3       0.2
w (0.7)   0.7       0.2

The consequence rules that fire are
min{νno action , 0.3}, min{νno action , 0.2}, min{νno action , 0.7} and min{νdecrease , 0.2}
respectively. Considering the maximum over these four clipped fuzzy sets, we obtain
the consequence function shown in Figure 35.

Fig. 35 Consequence

Analytically, this function can be defined as

\[
\nu_{\text{consequence}}(p) =
\begin{cases}
0.2 & \text{if } p \in [-1, -0.8] \\
1 + p & \text{if } p \in [-0.8, -0.3] \\
0.7 & \text{if } p \in [-0.3, 0.3] \\
1 - p & \text{if } p \in [0.3, 1]
\end{cases}
\]

Using COG-defuzzification, we obtain that

\[
D_{COG}(\nu) = \frac{\displaystyle\int_{-1}^{1} p\,\nu(p)\,dp}{\displaystyle\int_{-1}^{1} \nu(p)\,dp}
             = \frac{-28}{1395} \approx -0.0201
\]

Since the temperature is too warm but the temperature has a negative gradient, the
fuzzy control system will advise the heating system to diminish its power, but only
slightly, in order to prevent overshoot.
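
The whole inference chain of this example (fuzzification, minimum as inference,
maximum as aggregation, COG as defuzzification) can be sketched as follows. The rule
table is the complete table given above; the function names, the dictionary keys and
the numerical integration by the trapezoidal rule are illustrative assumptions and not
part of the original example.

```python
def tri(u):
    """Triangular shape (1 - |u|) v 0 used by all rule bases of the example."""
    return max(0.0, 1.0 - abs(u))


TEMP = [  # freezing, cold, average, warm, hot
    lambda t: max(0.0, 1.0 - t / 10.0),
    lambda t: tri(1.0 - t / 10.0),
    lambda t: tri(1.0 - (t - 10.0) / 10.0),
    lambda t: tri(1.0 - (t - 20.0) / 10.0),
    lambda t: max(0.0, (t - 30.0) / 10.0),
]
DELTA = [  # cooling fast, cooling, staying the same, warming, warming fast
    lambda d: tri(2.0 + 2.0 * d),
    lambda d: tri(1.0 + 2.0 * d),
    lambda d: tri(2.0 * d),
    lambda d: tri(1.0 - 2.0 * d),
    lambda d: tri(2.0 - 2.0 * d),
]
ACTION = {  # d = decrease, na = no action, i = increase
    "d": lambda p: tri(1.0 + p),
    "na": lambda p: tri(p),
    "i": lambda p: tri(1.0 - p),
}
RULES = [  # rows follow TEMP, columns follow DELTA
    ["i", "i", "i", "i", "na"],
    ["i", "i", "i", "na", "na"],
    ["i", "na", "na", "na", "d"],
    ["na", "na", "d", "d", "d"],
    ["na", "d", "d", "d", "d"],
]


def firing_strengths(t, dt):
    """Largest firing strength per action, using min as inference operator."""
    t = min(40.0, max(0.0, t))        # clip the inputs as in the text
    dt = min(1.0, max(-1.0, dt))
    strength = {"d": 0.0, "na": 0.0, "i": 0.0}
    for row, mu_t in zip(RULES, TEMP):
        for action, mu_d in zip(row, DELTA):
            strength[action] = max(strength[action], min(mu_t(t), mu_d(dt)))
    return strength


def control(t, dt, steps=20000):
    """Mamdani controller output: COG of the maximum of the clipped consequents."""
    strength = firing_strengths(t, dt)

    def nu(p):
        return max(min(strength[a], ACTION[a](p)) for a in ACTION)

    h = 2.0 / steps
    num = den = 0.0
    for i in range(steps + 1):
        p = -1.0 + i * h
        w = 0.5 if i in (0, steps) else 1.0   # trapezoidal end-point weights
        num += w * p * nu(p)
        den += w * nu(p)
    return num / den


print(firing_strengths(27.0, -0.4))
print(control(27.0, -0.4))   # slightly negative: reduce the power a little
```

Because the rule table is complete, at least one rule fires for every clipped input and
the denominator of the COG never vanishes.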
It is fairly easy to calculate the outcome of the controller for other inputs; the
difficulty lies in tuning the antecedent rule bases and, more importantly, in deciding
which fuzzy rules are to fire under what conditions. In the given example, by clipping
the input values and by ensuring that any input (t, ∆t) makes at least one rule fire,
the fuzzy controller is turned into a closed system. Had the system not been closed,
in the sense that some cells in the table had been left void, it would have
been necessary to complete the table with a “default” consequence rule, implying
no action whatsoever. The clipping also has the side effect that no state outside
[0, 40] × [−1, 1] can be reached, because we forced it to be so. It would be an
advantage if the system were naturally closed, in the sense that no clipping (at
least not of the temperature values t) would be necessary.
Another important factor is whether the given control system eventually reaches
an equilibrium state, after which the temperature hardly needs to be adjusted
anymore. There is an important difference between a stable state, which means that
small perturbations in the input values will eventually lead to the same equilibrium
point, and an unstable state, for which a small disruption can lead either to a
different stable state or to no stability at all. A notorious example is the so-called
inverted pendulum, for which the problem was already stated by H. Kwakernaak and
R. Sivan in [81]. More about this stability issue will be explained in Section 9.

7 Simplified Controllers

In this section, we will give an overview of various techniques that may simplify a
part of the control process. The most obvious reason for doing this is gaining pre-
cious computation time. We should ask ourselves two questions when determining
whether or not to use these techniques: (1) Do the calculations give the same or
at least a similar precision without affecting the control process, and (2) Are they
really time-saving?

7.1 Table-Based Controllers

When the universes of discourse are discrete, or at least can be discretized to a
finite number of states, it is always possible to calculate all conceivable combinations
of inputs before putting the controller into operation. Because each possible
defuzzification only has to be calculated once, this drastically reduces the computation
time. The relation between all input combinations and their corresponding
outputs is then arranged in a table. If we assume that there are only two inputs and
one output, this results in a two-dimensional lookup table, which we can easily
visualize. For a higher dimension, the principle stays the same and will not
lead to a drastic increase in calculation time, but in practice a computer will be
needed.

7.1.1 Example

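As an illustration, let us consider an example similar to the one given in Section 6.
As membership functions, we will consider the triangular fuzzy sets

\[
\begin{aligned}
\mu_{\text{cold}}(t) &= \left(1 - \frac{t}{20}\right) \vee 0 \\
\mu_{\text{average}}(t) &= \left(1 - \left|1 - \frac{t}{20}\right|\right) \vee 0 \\
\mu_{\text{warm}}(t) &= \left(1 - \left|2 - \frac{t}{20}\right|\right) \vee 0
\end{aligned}
\]

for the temperature, and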
µcooling (∆t) = (1 − |1 + ∆t|) ∨ 0
µstaying the same (∆t) = (1 − |∆t|) ∨ 0
µwarming (∆t) = (1 − |1 − ∆t|) ∨ 0

for the change of temperature. Up to a scaling factor, the outputs denote the appro-
priate action that should be taken to adjust the heating system. Therefore, for the
output, we consider five possibilities: positive big (PB), positive small (PS), zero
(ZE), negative small (NS) and negative big (NB). The corresponding fuzzy sets will
be given by

νPB (p) = (1 − |2 − 2p|) ∨ 0


νPS (p) = (1 − |1 − 2p|) ∨ 0
νZE (p) = (1 − |2p|) ∨ 0
νNS (p) = (1 − |1 + 2p|) ∨ 0
νNB (p) = (1 − |2 + 2p|) ∨ 0
The antecedent rule base is given by the following table:

t/∆t   c    sts  w
c      PB   PS   ZE
a      PS   ZE   NS
w      ZE   NS   NB

Let us make the additional assumption that the temperature t belongs to a discrete
space {0, 10, 20, 30, 40} and the change in temperature ∆t belongs to a discrete space
{−1, −1/2, 0, 1/2, 1}. Using the membership functions described above, we would only
get one rule firing with each input. Therefore, using a triangular set of fuzzy rules
and the appropriate defuzzification method (center of gravity), the resulting output
equals

t/∆t   −1      −1/2    0       1/2     1
0      +0.83   +0.56   +0.50   +0.25   0
10     +0.56   +0.31   +0.25   0       −0.25
20     +0.50   +0.25   0       −0.25   −0.50
30     +0.25   0       −0.25   −0.31   −0.56
40     0       −0.25   −0.50   −0.56   −0.83
The array implementation considerably improves the execution speed, because the
repeated application of the inference and defuzzification is reduced to a simple table-
lookup, which is a lot faster.
The antidiagonal in this table represents the states where no supplementary action
should be taken. There, either the temperature is average and the change in temper-
ature is zero, being the reference value, or it is tending toward the reference value.
Should the process move away from the zero diagonal, a supplementary action will
have to be taken to move the controller back to its stable state. The further away
from the diagonal, the more drastic the action to be taken becomes. The numerical
values on the two sides of the antidiagonal need not necessarily be antisymmetric,
but in this case, they are. If we follow the subsequent states a process visits, we get
an equivalent of a phase plane in dynamical systems ([23]). See again Section 9.
If the resolution in the table is too coarse, it may cause cycles in the trajec-
tory behavior of the system, oscillations around the reference. The only feasible
solution is to refine the table. Instead of a tedious recalculation, the more obvious
thing to do would be to use bilinear interpolation. Suppose a temperature t ∈ [t1 ,t2 ]
and a change in temperature ∆t ∈ [∆t1 , ∆t2 ] would be given, where t1 ,t2 as well as
∆t1 , ∆t2 are neighboring points in the table. The resulting table value can then be
found by first linearly interpolating in the direction of the T –axis, yielding the val-
ues u1 between p(t1 , ∆t1 ) and p(t2 , ∆t1 ) and u2 between p(t1 , ∆t2 ) and p(t2 , ∆t2 ),
and subsequently linearly interpolating in the direction of the ∆T –axis between the
aforementioned points. Reversing the order is of course also possible.
Let us for instance consider the point (t, ∆t) = (12, −0.7). We then find that

t = 12 = 0.8 · 10 + 0.2 · 20  ⇒  u1 = 0.8 · 0.56 + 0.2 · 0.50 = 0.548 (for ∆t = −1)
                                 u2 = 0.8 · 0.31 + 0.2 · 0.25 = 0.298 (for ∆t = −0.5)

and subsequently

∆t = −0.7 = 0.4 · (−1) + 0.6 · (−0.5)  ⇒  p = 0.4 · 0.548 + 0.6 · 0.298 = +0.398

On the other hand, a direct computation of the inference and defuzzification yields
p = +0.362.
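
The interpolation step can be sketched as a small lookup routine. The table values are
those of the lookup table of this example; the function names, the use of the standard
bisect module to locate the grid cell, and the assumption that the inputs have already
been clipped to the table range are illustrative choices.

```python
import bisect

T_GRID = [0.0, 10.0, 20.0, 30.0, 40.0]
D_GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]
TABLE = [  # rows follow T_GRID, columns follow D_GRID
    [0.83, 0.56, 0.50, 0.25, 0.00],
    [0.56, 0.31, 0.25, 0.00, -0.25],
    [0.50, 0.25, 0.00, -0.25, -0.50],
    [0.25, 0.00, -0.25, -0.31, -0.56],
    [0.00, -0.25, -0.50, -0.56, -0.83],
]


def cell(grid, value):
    """Index i with grid[i] <= value <= grid[i + 1] and the relative position in it."""
    i = min(max(bisect.bisect_right(grid, value) - 1, 0), len(grid) - 2)
    weight = (value - grid[i]) / (grid[i + 1] - grid[i])
    return i, weight


def lookup(t, dt):
    """Bilinear interpolation: first along the T-axis, then along the dT-axis."""
    i, wt = cell(T_GRID, t)
    j, wd = cell(D_GRID, dt)
    u1 = (1.0 - wt) * TABLE[i][j] + wt * TABLE[i + 1][j]
    u2 = (1.0 - wt) * TABLE[i][j + 1] + wt * TABLE[i + 1][j + 1]
    return (1.0 - wd) * u1 + wd * u2


print(lookup(12.0, -0.7))
```

At the grid points themselves the routine returns the tabulated values, so the refined
controller agrees with the original table wherever the latter is defined.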

7.2 Sugeno Controllers

An interesting and widely used kind of controller in which the defuzzification
process is incorporated in the rule base, being an alteration of the method proposed
by E.H. Mamdani et al. in [94], is given by M. Sugeno in [140] and was brought into
practice by M. Sugeno and M. Nishida in [142]. The idea is that our fuzzy controller
still has fuzzy antecedents, but that the consequences are already crisp
values, being functions of the input variables. As an output value, one then takes the
aggregation of the results of the different rules, weighted with the degree of
membership of the input values in the rule antecedents, which eliminates the need for a
defuzzification procedure. A rule hence has the following form:

r : IF (X1 = A1 ) and ... and (Xn = An ) THEN Y = fr (X1 , ..., Xn ).

Given a mapping fr : X1 × X2 × ... × Xn −→ Y , associated with the r–th rule of
the antecedent rule base, and an input vector x = (x1 , ..., xn ), the output value then
becomes

\[
y = \frac{\displaystyle\sum_{r \in K} k_r(x) \cdot f_r(x)}{\displaystyle\sum_{r \in K} k_r(x)}
\]

where kr (x) is defined as in Section 4.

7.2.1 Example

In the example of Section 6, we could for instance write a rule as

IF (t is cold) and (∆t is cooling) THEN p = (t − 20) · (∆t) / 20

The Sugeno controllers have the advantage that a defuzzification afterwards is
no longer needed; the defuzzification values are replaced by the outcome values
of fr , and the aggregation procedure assigns a certain “weight” to each of these
values.
The simplest Sugeno controllers are those for which fr (x) is a constant function.
One could for instance state that fr (x) = DMOM (βr ), where βr is the fuzzy set
associated with the consequence of the r–th rule. Therefore, the main advantage of
this kind of controller is that the defuzzification need not be performed in every
step; instead, one can consider a finite set {DMOM (βk )}k∈K of predetermined or
precalculated values.

7.2.2 Example

In the example of Section 6 again, we could for instance write a rule as

IF (t is cold) and (∆t is cooling) THEN (p = +0.3)

The clear disadvantage of a Sugeno controller compared with a Mamdani controller is
that the functional relation between the input values and the output value is not
straightforward. However, it allows different scaling parameters to be introduced and
adjusted in the output function, which makes Sugeno controllers
extremely suitable for fine-tuning. Its main advantage, however, is its computational
simplicity, since the time-consuming defuzzification step is omitted. In fact,
a Sugeno fuzzy controller can be seen as a modification of an ordinary linear
controller for which only the input value has been fuzzified, and can thus be regarded
as a combination of several linear control strategies.
One can also apply inference with several rules of various firing strengths. The
output from each rule is then a moving singleton, and the defuzzified output is
the weighted average of the contributions of each of the rules. In such a case, the
controller interpolates between several linear controllers, and the weighting system
yields a region of interpolation in the overlap between the linear controllers. We
then say that the rules interpolate smoothly between the linear gains.

7.2.3 Example

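Consider a single-input single-output rule base “error” on a space X = [0, 100] with
the following fuzzy sets:

\[
\forall e \in [0, 100] : \quad
\begin{cases}
\mu_{\text{small}}(e) = \left(1 - \left|\dfrac{e}{60}\right|\right) \vee 0 \\[2mm]
\mu_{\text{large}}(e) = \left(1 - \left|\dfrac{5}{3} - \dfrac{e}{60}\right|\right) \vee 0
\end{cases}
\]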
Then consider the following two rules (Figure 36)

IF (error is large) THEN (o1 (e) = 0.2 ∗ e + 90)


IF (error is small) THEN (o2 (e) = 0.6 ∗ e + 20)

Using

\[
D_{COG}(e) := \frac{\mu_{\text{small}}(e)\,o_2(e) + \mu_{\text{large}}(e)\,o_1(e)}
                   {\mu_{\text{small}}(e) + \mu_{\text{large}}(e)},
\]

outside the overlap region we obtain a linear function of the error, and inside the
region we obtain an interpolation between the two linear functions, weighted by the
degrees of membership (Figure 37).

Fig. 36 Overlapping Sugeno controller
Fig. 37 Linear interpolation
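
The weighted average of this example can be sketched as follows. The two output lines
are those of the rules above; the membership functions are an assumption: “small”
decreases linearly from 1 at e = 0 to 0 at e = 60, and “large” increases linearly from
0 at e = 40 to 1 at e = 100, so that the two sets overlap on [40, 60].

```python
def mu_small(e):
    """Assumed 'small' set: 1 at e = 0, linearly down to 0 at e = 60."""
    return max(0.0, 1.0 - abs(e / 60.0))


def mu_large(e):
    """Assumed 'large' set: 0 at e = 40, linearly up to 1 at e = 100."""
    return max(0.0, 1.0 - abs(5.0 / 3.0 - e / 60.0))


def o1(e):
    return 0.2 * e + 90.0    # consequent of the rule "error is large"


def o2(e):
    return 0.6 * e + 20.0    # consequent of the rule "error is small"


def sugeno(e):
    """Weighted average of the rule outputs; no separate defuzzification step."""
    w_small, w_large = mu_small(e), mu_large(e)
    return (w_small * o2(e) + w_large * o1(e)) / (w_small + w_large)


for e in (0, 20, 40, 50, 60, 80, 100):
    print(e, sugeno(float(e)))
```

Below e = 40 only the rule for “small” fires and the output follows o2; above e = 60
only the rule for “large” fires and the output follows o1; in between, the output
blends the two lines according to the membership degrees.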

8 Adaptive Fuzzy Control

8.1 General Remarks

Most processes that require automatic control are nonlinear, in the sense that
certain parameters will change as a function of time, of the state the process is in,
or more likely, both. Therefore, linear controllers can only function in a limited
neighborhood of the operating point and for a limited period of time. Due to
external circumstances, it may be necessary to retune the controller at various moments
in time. It would therefore be particularly convenient if controllers were
able to periodically retune themselves. Any fuzzy controller for which the fuzzy
knowledge base is changed throughout the control process will be called an
adaptive fuzzy controller. The adaptive component of such a controller consists of two
parts: the process monitor, which looks for changes in the process characteristics, and
the adaptation mechanism, which alters the controller parameters on the basis of
any detected changes. Note that the first component is equally present in nonfuzzy
adaptive controllers (see [5]).

8.1.1 Self-Tuning vs Self-Organizing Controllers

We will first of all make the distinction between self-tuning controllers and self-
organizing controllers (see [162]). Both are fuzzy controllers that are able to adapt
following the outcome of some performance measure. However, we will speak about
self-tuning controllers if only the fuzzy set definitions are changed, and about self-
organizing controllers if the rules themselves, and particularly, their activations, are
changed, or if new rules are added or old ones omitted. Self-tuning controllers essen-
tially can only fine-tune a controller that is already designed, while self-organizing
controllers can be built from scratch.

8.1.2 Performance-Adaptive vs. Parameter-Adaptive Controllers

Another common distinction that is used throughout the literature is the one between
performance-adaptive controllers and parameter-adaptive controllers; see for
instance [135]. The distinction lies in which method is used as a process
monitor to update the controller parameters. In the first case, some performance
measure is used that assesses how well the controller is controlling; in the second
case, a parameter estimator is used that instantly updates a model of the process. We
need to remark however that a unifying theory about the performance evaluation of
adaptive fuzzy controllers is still lacking, and that most methods are just a heuristic
adaptation of the performance criteria used in conventional control theory.

8.1.3 Parameter Estimators

As can be seen in [24], a parameter estimator can in its turn be modelled as a
fuzzy controller. Such a model consists of a similar set of fuzzy control rules as
the main controller, but with the difference that it describes the linguistic values
of the process output for given linguistic values of the process input, rather than
the control action. For instance, again referring to the heating system described in
the example of Section 6, a parameter estimation rule could look like this:

IF (tn is cold) and (tn−1 is average) THEN (tn+1 is average)

rather than the action that has to be taken, which would look like this:

IF (tn is cold) and (tn−1 is average) THEN (p is increasing)

Various techniques exist to obtain from such a base of rules a measure of
performance for the original controller. See for instance the work of W. Pedrycz [21],
[113], [114] and [115]. Other techniques involve for instance the use of time series
([13]).

8.1.4 Performance Measures

Performance measures, on the other hand, include, but are not limited to, one or more
appropriate values chosen among the following: overshoot, rise time, settling
time, decay ratio, frequency of oscillations, integral of the squared error, integral of
the absolute value of the error, integral of the time-weighted absolute error, and gain
and phase margins. Either the values of these performance measures are used directly
(e.g. [8]), or several of them are combined into a performance index (e.g. [119]).
Typically, a controller performance is measured as a trade-off between the different
goals and the constraints.
In the following subsections, we will give an overview of the basics of the most
commonly used adaptation techniques.

8.2 Scaling

In many cases, the fuzzy set definitions are defined on a normalized universe, for
instance the closed interval [−1, +1]. Any real-valued input can be scaled by mul-
tiplying the control parameter by an appropriate scale parameter. If we have for
instance a universe of discourse equalling [−20, +20], then we need to multiply the
input value by a scaling factor λ = 0.05. An input value x = +10 will then classify
as “positive medium (PM)”. Using a scaling factor λ = 0.025 will yield a universe
of discourse [−40, +40], in which the same input value x = 10 will be classified as
“positive small (PS)” (Figure 38).
For some applications, it may therefore be suitable not to consider the scaling fac-
tors as constants. Altering the scaling factors during a control process is the equiv-
alent of what is called gain tuning in the context of nonfuzzy PID controllers. The
most obvious way to incorporate this principle in fuzzy controllers is to change the
rule base. For instance, the rule
IF (temperature is cold) THEN (power gain = POSITIVE SMALL)

Fig. 38 An example of scaling

would be changed to

IF (temperature is cold) THEN (power gain = POSITIVE LARGE)

In that case, the adapted fuzzy controller will be considered as a self-organizing
controller, since the rules are modified.

8.2.1 Example

Another commonly used technique is increasing the precision around the origin by
a logarithmic transformation, as we described in Section 3.1. It is possible however
to simply alter the scale factors following the result of certain performance criteria.
A well-known example is given by Y. Yamashita et al. in [161]. The article describes a
chemical process in which the temperature needs to be increased slowly at first, but
in which the increase has to be subdued after a present flow of hydrogen gas starts
to combust. The idea that is applied there is to have a variable scaling factor that is
controlled by a fuzzy controller. Using a performance measure Pt at sampling time
t, being the average of the squared error over the previous three sampling times,
the scaling factor Ct is controlled according to the following set of linguistic rules,
which only depend on the largeness of the performance measure:

IF (Pt is very large) THEN Ct is very small
IF (Pt is large) THEN Ct is small
IF (Pt is medium) THEN Ct is medium
IF (Pt is small) THEN Ct is large

The scaling factors for the error (E) and the change of error (∆E) are then updated
as follows:

Et = Ct · E0
∆Et = Ct · ∆E0

where E0 and ∆E0 are fixed initial values. These scaling factors may be implemented
as a fuzzy controller as well as a crisp controller. Various other schemes for altering
the scaling factors are of course possible, although the design is, once more, mostly
heuristic in nature.
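As a rough illustration of such a scheme, the following sketch implements a crisp variant of the four rules. It is not the controller of [161]: the class boundaries in `bounds` and the factor values in `levels` are hypothetical numbers chosen only to show the mechanism, and all function names are assumptions.

```python
def performance(errors):
    """P_t: mean squared error over the previous three sampling times."""
    recent = errors[-3:]
    return sum(e * e for e in recent) / len(recent)


def scaling_level(p, bounds=(0.01, 0.1, 1.0), levels=(1.0, 0.5, 0.25, 0.1)):
    """Crisp version of the four rules: the larger P_t, the smaller C_t.

    bounds and levels are hypothetical values used for illustration only.
    """
    for bound, level in zip(bounds, levels):
        if p < bound:
            return level
    return levels[-1]  # P_t is very large -> C_t is very small


def updated_scaling(errors, e0, de0):
    """Return (E_t, dE_t) = (C_t * E_0, C_t * dE_0)."""
    c = scaling_level(performance(errors))
    return c * e0, c * de0


print(updated_scaling([0.5, 0.1, 0.2, 0.2], e0=2.0, de0=4.0))
```

A fuzzy implementation would replace the hard class boundaries by overlapping fuzzy sets on Pt and defuzzify the resulting Ct.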

8.3 Membership Function Tuning using Performance Criteria

One of the earliest examples of a performance adaptive fuzzy controller was given
by G. Bartolini et al. in [8], and consists of a controller that adapts the member-
ship functions in the rule base online, according to the outcome of a series of
performance criteria. The controller is of the PD-like fuzzy type, with as inputs
the error and the change in error, and its output being the required change in the
control variable. Six performance criteria are used to assess the quality of the controller
on-line. Over a fixed observation period, the length of which is itself
a tuning parameter for the controller, the following indices are calculated:
• ē2 , the average square error
• ē, the average error
• |ē|, the average absolute error
• |e|max , the maximum absolute error
• n1 , the number of consecutive variations in control output
• n2 , the number of variations in control output during the given time interval
While the first four indices are meant to keep the controller at set-point, the latter two
serve the secondary objective of reducing the number of (unnecessary) command
variations. Let us assume for instance that the error function E can assume one of
the following three linguistic values: Negative, Zero and Positive. The shape of
the fuzzy sets can be one out of the list described in Section 2.3. The adaptation
is then done by modifying the shapes of the membership functions in proportion to
the undesired effects that are being corrected. Depending on the outcome of the first
four performance indices, one of the actions in Figure 39 is taken.
Whether one of the four actions, if any, has to be taken, depends on the outcome
of the algorithm shown in Figure 40.
If the average error is too large, then adaptation action (a) or (b) is carried out,
depending on the sign of the difference between the error and the set-point. Adapta-
tion action (a) for instance improves the controller performance when the process is
constantly below set-point. If either the squared error is too large, which indicates
imprecise control, or the error function produces an outlier, even one, adaptation
action (c) increases the sensitivity of the controller.
Applying only these rules would merely drive the controller towards ever higher
precision, which will eventually make the controller unworkable, as it has to make
too many adaptations. Therefore, we apply a second flow chart, aimed at reducing
the number of command variations (Figure 41).
Adaptation action (d) is the reverse of adaptation action (c), and decreases the
sensitivity of the controller. The parameters n1 and n2 specify the level of command
variation that is considered to be intolerable.
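The two flow charts can be summarized in a short sketch. Several points are assumptions rather than statements from [8]: the function names are hypothetical, the threshold names follow the parameters α, β, Γ, ε1 and ε2 discussed in Section 8.3.1, and both the assignment of actions (a) and (b) to the sign of the average error and the use of the absolute average error in the test against Γ are one possible reading of Figures 40 and 41.

```python
def primary_action(e_mean, e_sq_mean, e_abs_max, alpha, beta):
    """Primary flow chart: keep the controller at set-point."""
    if not (-alpha < e_mean < alpha):
        # Average error too large: shift the membership functions.
        # Which sign maps to (a) and which to (b) is an assumption here.
        return "b" if e_mean > 0 else "a"
    if e_abs_max > alpha or e_sq_mean > beta:
        return "c"  # sharpen: increase the sensitivity
    return None     # no adaptation action


def secondary_action(n1, n2, e_mean, eps1, eps2, gamma):
    """Secondary flow chart: reduce the number of command variations."""
    if (n1 > eps1 or n2 > eps2) and abs(e_mean) < gamma:
        return "d"  # smooth: decrease the sensitivity
    return None


print(primary_action(0.0, 0.2, 0.1, alpha=0.2, beta=0.1))          # "c"
print(secondary_action(5, 0, 0.01, eps1=3, eps2=10, gamma=0.1))    # "d"
```

Keeping the two charts as separate functions mirrors the text, where the second chart is introduced only to counteract the ever-increasing precision produced by the first.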

8.3.1 Parameter Heuristics

It is obvious that the performance of this adaptive fuzzy controller relies signifi-
cantly on the appropriate choice of the parameters α , β , Γ, ε1 and ε2 , for which there
are no standard rules but heuristics. The following observations are helpful though.
Centrally, the idea behind this adaptation process is to provide a quick controller
adaptation (which absolutely need not be carried out at every sampling time)
without causing instability or oscillations, and with only small adaptations to the
fuzzy set definitions. It is also feasible to monitor the excessive command variations
at every sample, but to monitor the set-point control only relatively sparsely, because too
many control adaptations could lead to unstable situations as well.

Fig. 39 Membership tuning (four panels, each showing the fuzzy sets E = negative,
E = zero and E = positive: adaptation action a (right shift), adaptation action b
(left shift), adaptation action c (sharpening), adaptation action d (smoothing))

Fig. 40 Decision algorithm for membership tuning (primary flow chart: test
−α < ē < +α; if not, test ē > 0 and take adaptation action (a) or (b); otherwise
take adaptation action (c) if |e|max > α or ē2 > β, and no adaptation action if neither holds)

Fig. 41 Refinement of the decision algorithm for membership tuning (secondary flow
chart: if n1 > ε1 or n2 > ε2, test whether the error is below Γ; if so, take
adaptation action (d); in all other cases, no adaptation action)
The parameters α and β are designed to specify what levels of the set-point error
criteria are still acceptable before an adaptation of the fuzzy membership definition
is required. If these parameters are chosen too small, the system will constantly try
to adapt itself, thereby overshooting its goal of producing a desired performance level.
On the other hand, if these values are chosen too high, the controller will not make nearly
enough adaptations and will therefore perform badly. In [8], G. Bartolini et
al. showed that an increase in β causes an increase in average square error ē2 , as
well as a decrease in the number of command variations. An appropriate choice
for β therefore is a trade-off between performance and precision, and the same can
be said about α. The thresholds ε1 and ε2 are determined by energy conservation requirements,
and Γ specifies that such an action is only to be carried out when the set-point
error is less than a certain upper bound, to prevent excessive control output. Choosing Γ
too high may lead to too much adaptation and therefore insensitivity; if Γ is chosen
too low, no adaptation to minimize control output may ever take place. Another
parameter that is crucial in this approach is the length of the time window. A longer
interval makes the error values more significant, and is a great tool to filter out
unwanted noise in the signals, but also causes the adaptation mechanism to react too
sluggishly.

8.4 Gradient Descent Method

The method developed by H. Nomura et al. in [106] tunes the membership functions
of a fuzzy controller by means of a set of input–output training data, using numerical
techniques that closely resemble those used in physics to minimize an energy function. Let a
fuzzy controller consist of n fuzzy rules of the form

Rule i: IF x1 is INPUT (i,1) and ... and xm is INPUT (i,m) THEN u is OUTPUT (i)

where x1 , ..., xm are the controller inputs and the fuzzy sets of the membership functions
are given by

$$\forall i \in \{1,\dots,n\},\ \forall j \in \{1,\dots,m\}: \quad \mu_{(i,j)}(x) := 1 - \frac{2\,|x - a_{ij}|}{b_{ij}}$$

The membership functions are thus triangular rules as defined in Section 2.3, but
with center ai j and base length bi j which may be subject to adaptation. The output
states are defined by the fuzzy singletons

∀i ∈ {1, ..., n} : ν(i) (u) := 1ui (u)


Furthermore, the max-product composition is used, as well as the center of gravity
defuzzification DCOG , yielding for a crisp input vector x = (x1 , ..., xm ) the nonfuzzy
output

$$u^* = \frac{\sum_{i=1}^{n}\left(\prod_{j=1}^{m}\mu_{(i,j)}(x_j)\right)\cdot u_i}{\sum_{i=1}^{n}\left(\prod_{j=1}^{m}\mu_{(i,j)}(x_j)\right)}$$

8.4.1 Gradient or Steepest Descent Algorithm

Now, let us assume that there exists a set of data R that describes for any r ∈ R a
series of input vectors $x^r = (x_1^r, \dots, x_m^r)$ together with the desired output values $u^r$.
The idea H. Nomura et al. proposed in [106] is to minimize the objective function

$$E = \frac{1}{2}\,(u^* - u^r)^2$$
with respect to the parameters ai j , bi j and ui . One of the simplest methods to achieve
this goal is to use the gradient or steepest descent algorithm (which is described in,
e.g., D.R. Sadler [131]), which is an iterative algorithm that decreases the value of
the objective function, relying on the fact that from any point, the objective function
decreases most rapidly in the direction of the negative gradient of its parameters.
Let E(z1 , ..., z p ) be such an objective function, then this vector is

$$-\nabla(E) = \left(-\frac{\partial E}{\partial z_1},\ -\frac{\partial E}{\partial z_2},\ \dots,\ -\frac{\partial E}{\partial z_p}\right)$$

and if zi (t) is the value of the i–th parameter after t iterations, then the next estimate
for the same parameter is given by

$$\forall i \in \{1,\dots,p\}: \quad z_i(t+1) = z_i(t) - K\cdot\frac{\partial E}{\partial z_i}$$

with K a constant that controls the maximal speed by which the parameters are
altered at each iteration.
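The iteration itself takes only a few lines. The sketch below applies it to a hypothetical quadratic objective E(z1, z2) = (z1 − 1)² + 2(z2 + 0.5)², chosen only because its minimum (1, −0.5) is known. The function names, the step size K = 0.1 and the iteration count are assumptions of this example.

```python
def steepest_descent(grad, z, k, iterations):
    """Repeat z_i(t+1) = z_i(t) - K * dE/dz_i for a fixed number of iterations."""
    for _ in range(iterations):
        g = grad(z)
        z = [zi - k * gi for zi, gi in zip(z, g)]
    return z


def grad_e(z):
    """Gradient of the illustrative objective E = (z1 - 1)^2 + 2 (z2 + 0.5)^2."""
    return [2.0 * (z[0] - 1.0), 4.0 * (z[1] + 0.5)]


z_min = steepest_descent(grad_e, [0.0, 0.0], k=0.1, iterations=200)
print(z_min)  # close to the minimum at (1, -0.5)
```

If K is chosen too large, the iterates overshoot the minimum and may diverge; if it is too small, convergence becomes slow.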
In this case, the parameters we wish to alter are ai j , bi j and ui . Considering the
objective function in terms of the membership functions,

$$E = \frac{1}{2}\cdot\left(\frac{\sum_{i=1}^{n}\left(\prod_{j=1}^{m}\mu_{(i,j)}(x_j^r)\right)\cdot u_i}{\sum_{i=1}^{n}\left(\prod_{j=1}^{m}\mu_{(i,j)}(x_j^r)\right)} - u^r\right)^2$$

and using the steepest descent algorithm, we obtain the following adaptation equations:
$$a_{ij}(t+1) = a_{ij}(t) - \frac{K_a\,(u^* - u^r)\cdot\mu_i}{\mu_{ij}(x_j^r)}\cdot\frac{\sum_{k=1}^{n}\mu_k\,\bigl(u_i(t) - u_k(t)\bigr)}{\left(\sum_{k=1}^{n}\mu_k\right)^2}\cdot\frac{2\cdot\mathrm{sgn}\bigl(x_j^r - a_{ij}(t)\bigr)}{b_{ij}(t)}$$

$$b_{ij}(t+1) = b_{ij}(t) - \frac{K_b\,(u^* - u^r)\cdot\mu_i}{\mu_{ij}(x_j^r)}\cdot\frac{\sum_{k=1}^{n}\mu_k\,\bigl(u_i(t) - u_k(t)\bigr)}{\left(\sum_{k=1}^{n}\mu_k\right)^2}\cdot\frac{1 - \mu_{ij}(x_j^r)}{b_{ij}(t)}$$

$$u_i(t+1) = u_i(t) - \frac{K_u\,(u^* - u^r)\cdot\mu_i}{\sum_{k=1}^{n}\mu_k}$$

with $\mu_i := \prod_{l=1}^{m}\mu_{il}(x_l^r)$. Starting from a reliable controller input/output data set, one
can optimize the parameters as follows:
• Insert the data $x^r = (x_1^r, \dots, x_m^r)$ and calculate for each rule $\mu_i$ as well as the control
output $u^*$.
• Update the values of ui (t).
• Repeat the rule firing using the new values of ui .
• Update the values of ai j (t) and bi j (t).
• Calculate the inference error E(t)
and repeat these steps for other r ∈ R until |E(t) − E(t − 1)| is sufficiently small.
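A single adaptation step for one training sample can be sketched as follows. This is an illustration under stated assumptions, not the implementation of [106]: all function names are hypothetical, the triangles are clipped at zero, at least one rule is assumed to fire, and the parameters ai j , bi j and ui are updated simultaneously instead of in the two stages listed above. The gradient is taken as the partial derivative of E = ½(u∗ − ur)², using (ui − u∗)/Σµk for the derivative of u∗ with respect to µi.

```python
from math import prod


def fire(x, a, b, u):
    """Return u*, the grades mu[i][j], the rule weights w[i] and their sum."""
    mu = [[max(0.0, 1.0 - 2.0 * abs(xj - aij) / bij)
           for xj, aij, bij in zip(x, ai, bi)]
          for ai, bi in zip(a, b)]
    w = [prod(row) for row in mu]
    s = sum(w)  # assumed to be nonzero, i.e. at least one rule fires
    return sum(wi * ui for wi, ui in zip(w, u)) / s, mu, w, s


def sgn(v):
    return (v > 0) - (v < 0)


def descent_step(x, u_ref, a, b, u, ka, kb, ku):
    """One simultaneous update of a_ij, b_ij and u_i for the sample (x, u_ref)."""
    u_star, mu, w, s = fire(x, a, b, u)
    err = u_star - u_ref
    new_a = [row[:] for row in a]
    new_b = [row[:] for row in b]
    new_u = u[:]
    for i in range(len(u)):
        du_dw = (u[i] - u_star) / s  # equals sum_k w_k (u_i - u_k) / s^2
        new_u[i] = u[i] - ku * err * w[i] / s
        for j in range(len(x)):
            if mu[i][j] == 0.0:
                continue  # clipped membership grade: zero gradient
            # Product of the other grades, i.e. mu_i / mu_ij without dividing.
            rest = prod(mu[i][l] for l in range(len(x)) if l != j)
            common = err * du_dw * rest
            new_a[i][j] = a[i][j] - ka * common * 2.0 * sgn(x[j] - a[i][j]) / b[i][j]
            new_b[i][j] = b[i][j] - kb * common * (1.0 - mu[i][j]) / b[i][j]
    return new_a, new_b, new_u


def loss(x, u_ref, a, b, u):
    return 0.5 * (fire(x, a, b, u)[0] - u_ref) ** 2


# Two rules on one input; the desired output for x = 0.25 is 0.5.
a, b, u = [[0.0], [1.0]], [[2.0], [2.0]], [0.0, 1.0]
a2, b2, u2 = descent_step([0.25], 0.5, a, b, u, ka=0.1, kb=0.1, ku=0.1)
print(loss([0.25], 0.5, a, b, u), loss([0.25], 0.5, a2, b2, u2))
```

Computing the product of the remaining membership grades directly avoids the division by µi j that appears in the closed-form equations and would fail for a grade of zero.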
The first practical application of this adaptive fuzzy method was made by
H. Nomura et al. in [106], where a mobile robot was trained to avoid a moving
obstacle. Starting with 625 rules and a set of manual operations to obtain train-
ing data, 66 input/output sets provided enough information to tune the membership
functions. Many variations on this method exist, that mainly differ in the optimiza-
tion procedure, although one could also heuristically adapt the fuzzy set definitions
and the defuzzification method. For instance A. Maeda et al. claim in [93] to obtain
a learning speed 40 times faster than the algorithm described above.

8.5 Self-Organizing Controllers

Just like the membership function tuning controller described in Section 8.3, the
self-organizing controllers constructed by E.H. Mamdani et al. in [95] and [96],
of which a detailed description can be found in T.J. Procyk et al. [119], carry out
their adaptation by calculating performance measures. The
difference, however, is that neither the fuzzy set definitions nor the scaling factors
are adapted, but the rules themselves are. To be more accurate, each of the possible
rules is fired, and afterward, it is determined which of the rules causes the best
performance measure.
Without loss of generality, let us assume for the sake of simplicity a two-input,
one-output fuzzy rule base, with as input values the error and the change of error.
As is shown in D. Driankov et al. [24], the case with one input is trivially simple
and the case with two inputs allows for an easy matrix representation. Suppose
the linguistic arguments of the error function are E1 , ..., E p , those of the change of
error are ∆E1 , ..., ∆Eq and those of the output function are U1 , ...,Ur . Let the fuzzy
controller consist of rules like

Rule i: IF µe = Ea and µ∆e = ∆Eb THEN νu = Uc

for all i ∈ N = {1, ..., n} , a ∈ P = {1, ..., p} , b ∈ Q = {1, ..., q} , c ∈ R = {1, ..., r}. It
is only sensible that the number n should never exceed all possible combinations, so
without loss of generality, n ≤ p × q × r. Optimally, every pair of antecedents, the
occurrence of which we called the completeness of the antecedent rule base, yields
one and only one consequence, so n = p × q. The following algorithm can then be
used to determine the optimal linguistic value for Uc .
Suppose that there also exists a performance measure P. For any (a, b) ∈ P × Q
fixed, this may for instance be the difference between the defuzzification output
and the given output of a control data set. Therefore, let any rule with antecedents
µe = Ea and µ∆e = ∆Eb fire, and calculate

$$c_0 = \underset{c \in R}{\arg\max}\ P(E_a, \Delta E_b, U_c)$$

8.5.1 Example

Let us for instance consider the example given in Section 6 again. Suppose the
antecedent rule base contains the following rule:

IF (t is cold) and (∆t is warming) THEN (p is increasing)

The linguistic value “increasing” was introduced in a heuristic way. We might won-
der what output will give the best result, when we have to choose for instance be-
tween “increasing” or “no action”. In other words, we wonder which of the two
tables will yield the best result: the original one
t/∆t   cf   c    sts  w    wf
f      i    i    i    i    na
c      i    i    i    na   na
a      i    na   na   na   d
w      na   na   d    d    d
h      na   d    d    d    d

or the altered one,

t/∆t   cf   c    sts  w    wf
f      i    i    i    na   na
c      i    i    i    na   na
a      i    na   na   na   d
w      na   na   d    d    d
h      na   d    d    d    d
Given a data set (ti , ∆ti ) at a certain point i in time, and given a control output value
u, we can determine both defuzzification values, which are derived in the same way
as in Section 6. Suppose the first table gives as defuzzification value −0.03 and the
second one −0.06, while the given value is u = −0.05, then we have a reasonable
cause to omit the former rule in favour of the latter. The performance measure is
here P(ti , ∆ti , u).
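The comparison in this example can be written as a small selection routine. The sketch assumes that the performance measure is the negative absolute deviation between the defuzzified output and the given output, so that the best consequent is the one with the largest value of P. The function name and the dictionary layout are hypothetical.

```python
def best_consequent(candidates, performance):
    """Return the candidate consequent with the highest performance measure."""
    return max(candidates, key=performance)


# Defuzzification values obtained with each candidate consequent (from the example).
defuzzified = {"increasing": -0.03, "no action": -0.06}
u_given = -0.05


def performance(consequent):
    # Assumed measure: the closer to the given output, the better.
    return -abs(defuzzified[consequent] - u_given)


print(best_consequent(defuzzified, performance))  # "no action"
```

The same routine applies unchanged when all linguistic values U1 , ...,Ur are offered as candidates instead of only two.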

8.5.2 Remark

For the sake of completeness, it is common that all possible linguistic values
U1 , ...,Ur are used in the rule base, to come up with the best performing value.
To measure the influence of one rule at a time, it is recommended not to use this
algorithm for the substitution of two rules in the antecedent rule base at once, since
then, it cannot be determined anymore which of them is responsible for the better or
worse performance individually. It would take too long to have all rules undergo this
testing procedure in every step; it is therefore advisable to only sporadically control
the rules, in such a way that all rules are eventually revised an equal number of
times, with enough uncontrolled runs of the control process in between. Following
D. Driankov et al. in [24], when making use of a matrix notation, this algorithm can
be simplified to a finite number of lattice theoretic and algebraic operations, yielding
a high performance. This method can not only be used to enhance the performance
of an existing controller, it is also possible to build up a controller from scratch, if
enough supervised test runs are performed.

8.5.3 Time Dependence

We have, however, omitted one particular drawback in the description of this
method, and that is the time dependence. Naturally, it could be possible that a bad
performance at time i is caused by a faulty control rule m time steps earlier. Formally,
therefore the description of the rules should be changed to something like

IF µe (i) = Ea and µ∆e (i) = ∆Eb THEN νu (i − m) = Uc

which makes the process a little bit more difficult, especially since m, the number of
steps that have to be traced back, is unknown. Some work on this “delay in reward”
parameter has been done by T. Yamazaki et al. in [162].

9 Stability Analysis

9.1 General Remarks

Owing to the heuristic nature of fuzzy controllers, and the positive correlation between
the heuristic experience of the controller’s designer and the performance of
the controller, relatively little effort has been made to develop a solid theoretical
background for general analysis of the dynamic behavior of control loops. Nevertheless,
the stability of fuzzy controllers ([143]) is an extremely important factor in
their design. To quote A. Kandel et al. in [71], “Stability is (...) the first and last concept
for any system design and a fundamental issue in every control system”. In fact,
the lack of a solid stability analysis has been considered as the major drawback for
fuzzy controllers, and is one of the main arguments against the use of fuzzy control
over conventional control ([1]). Because fuzzy controllers are essentially nonlinear
systems (see [24]), it will be hard to obtain general results, and one must be satisfied
with results that are only interpretable on a very local scale. There are two
main approaches to study the behavior of a fuzzy controller: one relies on classical
nonlinear dynamic systems theory, where we assume the fuzzy controller to be a
particular class of nonlinear controllers, and the other on the theory of fuzzy dynamical
systems, associated with Zadeh’s Extension principle (see [165]). The basic theory of fuzzy
dynamical systems can be found in P.E. Kloeden [76], and research concerning the
stability of a fuzzy system makes use of the concept of energy or controllability of
the fuzzy system. See for instance [42] and [75]. The first articles specifically on
stability of fuzzy controllers are due to C.V. Negoita ([105]) and W.M. Kickert and
E.H. Mamdani ([74]).
Whether a fuzzy control design will be stable, i.e. whether it will reach (or
at least stay close enough to, within a preset boundary) an equilibrium, is still
an open question, unlike for linear controllers. In linear control, we will call a system
stable if it converges to the equilibrium, no matter where the system state variables
start. It is a known result that a necessary and sufficient condition for a linear system
to be stable, is that all eigenvalues of the system matrix are situated in the open left
half of the complex plane.
For nonlinear systems, such as fuzzy control systems, the concept of stability is
much more intricate. Fuzzy controllers are particularly nonlinear; mainly,
there are three sources of nonlinearity in fuzzy control:
• The rule base. The position, shape and number of fuzzy sets are nonlinear, and
the nonlinearity may be reinforced by nonlinear scaling of the input values. The
rules themselves also often express a nonlinear control strategy.
• The inference engine. Connectives like (and,or):=(∧, ∨) are nonlinear.
• The defuzzification. Several defuzzification methods are nonlinear.
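For comparison, the eigenvalue criterion for linear systems mentioned above is easy to check numerically. The sketch below assumes a continuous-time system dx/dt = Ax with a 2 × 2 matrix A, so that the eigenvalues follow from the trace and the determinant. The function names and the example matrices are hypothetical.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix: roots of l^2 - trace * l + det = 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def is_stable(a):
    """True if all eigenvalues lie strictly in the left half of the complex plane."""
    return all(ev.real < 0.0 for ev in eigenvalues_2x2(a))


print(is_stable([[0.0, 1.0], [-2.0, -3.0]]))  # eigenvalues -1 and -2
print(is_stable([[0.0, 1.0], [2.0, -1.0]]))   # eigenvalues 1 and -2
```

No comparably simple test exists for the nonlinear case, which is what motivates the methods discussed in the remainder of this section.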
Checking conditions for stability on nonlinear systems is much more difficult
than on linear systems; some theoretical background can be found in [24] and [111].
Four methods are mentioned there: Lyapunov stability, Popov stability, circle stability
and conicity. However, the results of these methods are rather strict, and
sometimes yield unrealistically small areas where stability is guaranteed. Therefore,
another possibility is approximating the fuzzy controller with a linear controller, and
then applying the conventional linear analysis and design procedures to the approximation.
It must be said that the theoretical background as to why the stability margins
of the nonlinear system would be “close” to those of the linear approximation, is
still unexplored.

9.2 The Input–Output Mapping

But the cases in which the nonlinear fuzzy controller can be approximated by a lin-
ear controller, are, to say the least, scarce. As mentioned in [69], when we consider
a single-input, single-output controller, we can control the shape of the surface to
a certain extent by manipulating the membership functions. In [85] and [86], we
studied the behavior of the identity mapping between two fuzzy antecedent rule
bases, and whether or not it yields an identity mapping between the input before
fuzzification and the output after defuzzification. We found that this was not the
case. This mapping is called the input-output mapping, and can be used as a design aid
when one has to choose the membership functions and construct the rules. Ideally,
we will want to make the distance between this mapping and the diagonal as small
as possible, for instance with respect to the L1 –metric.

9.2.1 Example

In each of the following examples, a set of IF–THEN rules on the universe of dis-
course X = [−1, 1] is given, for which we will determine the input-output mapping,
which we will plot against the linear controller. These results depend of course on
the choice of aggregation and defuzzification operators. We will choose the product
implication for its computational simplicity and for its continuity, and the Center-
of-Gravity defuzzification, because it is continuous, satisfies the uniqueness crite-
rion, and in case of a singleton output, it degenerates to a quotient of finite sums.
If there had been more than one input rule, we would have chosen the product
as the aggregation operator too, because it takes into account all bad performing
antecedents.
Consider for instance the following linguistic rule base:

IF (X is Neg) THEN (Y is Neg)


IF (X is Zero) THEN (Y is Zero)
IF (X is Pos) THEN (Y is Pos)

1. Let us take as condition and as consequence the same antecedent rule base consisting
of the triangular fuzzy sets

µNeg (x) = (1 − |1 − (x + 2)|) ∨ 0
µZero (x) = (1 − |1 − (x + 1)|) ∨ 0
µPos (x) = (1 − |1 − x|) ∨ 0

Then the output curve has the shape given in Figure 42.
As we have seen in [85], the Consistency Criterion 5.1.10 is not fulfilled; the
third example will show us that this can never be the case when we consider the
identity mapping between the same rule bases. Notice also that it is impossible
to drive the output to its full potential of 100% output range.

Fig. 42 Input–output mapping (IF and THEN fuzzy sets Neg, Zero, Pos on [−1, 1];
output Y against input X)
2. The less overlap there is in the antecedent rule base, the steeper the slopes between
the various regions will get. As an example consider the following rule
base:

$$\mu_{Neg}(x) = \left(1 - \left|\tfrac{3}{2} + \tfrac{3}{2}x\right|\right) \vee 0$$

$$\mu_{Zero}(x) = \left(1 - \left|\tfrac{3}{2}x\right|\right) \vee 0$$

$$\mu_{Pos}(x) = \left(1 - \left|\tfrac{3}{2} - \tfrac{3}{2}x\right|\right) \vee 0$$

Then the output curve looks as in Figure 43.

3. The problem with the output range is eliminated when we replace the outputs —
not the inputs — by crisp sets. Taking the same antecedent rule base as in the
first example, we replace the output functions by the Dirac fuzzy sets µNeg (x) =
δ−1 (x), µZero (x) = δ0 (x) and µPos (x) = δ1 (x). The input-output mapping even
becomes linear in this case (see Figure 44). The only way to assure that the
input-output mapping is linear, is the following set of conditions on the design
parameters (see Sections 1.5 and 3.2, credit due to W. Siler & H. Ying ([138]), M.
Mizumoto ([102]) and W. Qiao and M. Mizumoto ([120])):
• The input sets must be triangular with cross point level 0.5.
• The aggregation operator must be the algebraic product.
• The rule base must be a complete combination (cartesian product) of all input
families.
• The outputs must be singletons at the points in which the input sets reach their
core value.
• The defuzzification operator must be DCOG (or its discrete counterpart).

Fig. 43 Input–output mapping (IF and THEN fuzzy sets Neg, Zero, Pos on [−1, 1];
output Y against input X)

Fig. 44 Input–output mapping (IF and THEN fuzzy sets Neg, Zero, Pos on [−1, 1];
output Y against input X)

4. Let us now change the triangular fuzzy sets in the antecedent rule base into trapezoidal
sets

$$\mu_{Neg}(x) = \bigl((-2x - 0.5) \vee 0\bigr) \wedge 1$$

$$\mu_{Zero}(x) = \begin{cases} 0 & \text{if } x \in [-1, -0.75] \\ 2x + 1.5 & \text{if } x \in [-0.75, -0.25] \\ 1 & \text{if } x \in [-0.25, 0.25] \\ -2x + 1.5 & \text{if } x \in [0.25, 0.75] \\ 0 & \text{if } x \in [0.75, 1] \end{cases}$$

$$\mu_{Pos}(x) = \bigl((2x - 0.5) \vee 0\bigr) \wedge 1$$

combined with the same Dirac outputs as in the third example. The resulting
curve is seen in Figure 45.
The “flat” input sets produce horizontal pieces in the input–output curve, which
inevitably cause large gains away from the reference value. This is the equivalent
of a deadzone with saturation. Increasing the width of the middle term results in
a wider stable area around the reference.

Fig. 45 Input–output mapping (IF and THEN fuzzy sets Neg, Zero, Pos on [−1, 1];
output Y against input X)

Fig. 46 Input–output mapping (IF and THEN fuzzy sets Neg, Zero, Pos on [−1, 1];
output Y against input X)

5. The sharp corners may cause a problem when, e.g. considering differentiability.
To avoid this, one can introduce nonlinear input sets, then also the input–output
mapping becomes smooth. For instance, the antecedent rule base

$$\mu_{Neg}(x) = \begin{cases} 1 & \text{if } x \in [-1, -0.75] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(2x - 0.5)) & \text{if } x \in [-0.75, -0.25] \\ 0 & \text{if } x \in [-0.25, 1] \end{cases}$$

$$\mu_{Zero}(x) = \begin{cases} 0 & \text{if } x \in [-1, -0.75] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(-2x + 1.5)) & \text{if } x \in [-0.75, -0.25] \\ 1 & \text{if } x \in [-0.25, 0.25] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(2x + 1.5)) & \text{if } x \in [0.25, 0.75] \\ 0 & \text{if } x \in [0.75, 1] \end{cases}$$

$$\mu_{Pos}(x) = \begin{cases} 0 & \text{if } x \in [-1, 0.25] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(-2x - 0.5)) & \text{if } x \in [0.25, 0.75] \\ 1 & \text{if } x \in [0.75, 1] \end{cases}$$

combined with the same Dirac outputs as in the third example results in the
input–output curve of Figure 46.

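6. Adding more sets only makes the mapping more bumpy. Consider for instance
the following antecedent rule base:

$$\mu_{NB}(x) = \begin{cases} \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(2x + 2)) & \text{if } x \in [-1, -0.5] \\ 0 & \text{if } x \in [-0.5, 1] \end{cases}$$

$$\mu_{NS}(x) = \begin{cases} \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(2x + 1)) & \text{if } x \in [-1, -0.5] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(-2x + 1)) & \text{if } x \in [-0.5, 0] \\ 0 & \text{if } x \in [0, 1] \end{cases}$$

$$\mu_{ZE}(x) = \begin{cases} 0 & \text{if } x \in [-1, -0.5] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(2x)) & \text{if } x \in [-0.5, 0.5] \\ 0 & \text{if } x \in [0.5, 1] \end{cases}$$

$$\mu_{PS}(x) = \begin{cases} 0 & \text{if } x \in [-1, 0] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(2x - 1)) & \text{if } x \in [0, 0.5] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(-2x - 1)) & \text{if } x \in [0.5, 1] \end{cases}$$

$$\mu_{PB}(x) = \begin{cases} 0 & \text{if } x \in [-1, 0.5] \\ \tfrac{1}{2} + \tfrac{1}{2}\cos(\pi(-2x - 2)) & \text{if } x \in [0.5, 1] \end{cases}$$

The output functions will be the Dirac fuzzy sets µNB (x) = δ−1 (x), µNS (x) =
δ−0.5 (x), µZE (x) = δ0 (x), µPS (x) = δ0.5 (x) and µPB (x) = δ1 (x). The input-output
mapping looks as in Figure 47.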
7. More sets make it easier however to manipulate the position of the reference
plateau by moving the singletons around. Replacing the output functions
in the previous example by µNB (x) = δ−1 (x), µNS (x) = δ−0.25 (x), µZE (x) =
δ0 (x), µPS (x) = δ0.25 (x) and µPB (x) = δ1 (x) for instance yields the input–output
curve of Figure 48.
Fig. 47 Input–output mapping (IF and THEN fuzzy sets NB, NS, ZE, PS, PB on [−1, 1];
output Y against input X)

Fig. 48 Input–output mapping (IF and THEN fuzzy sets NB, NS, ZE, PS, PB on [−1, 1];
output Y against input X)

The only other case that can be visualized, is the one that involves two inputs and
one output. The graph of the input–output mapping is then a surface. Let us consider
the example given in Section 7.1 again. We plot the output power as a function of
temperature and change in temperature, yielding the result in Figure 49.

Fig. 49 Control surface

We see that the graph contains peaks as well as more horizontal plateaus. It is
desirable to have the latter in regions near the desired equilibrium point, because
otherwise the control process might result in an unstable behavior, but it can be a
disadvantage if one wants to increase the sensitivity near the reference points. If
we had used input sets with flat peaks, this would have resulted in various
deadzones, regions where the control behavior stagnates. The behavior involving
the smoothing of sharp corners or the steepness of the slopes is largely similar to
the one described in Example 9.2.1. The design choices in Example 9.2.1(3) are
also a set of necessary conditions to obtain a diagonal plane as control surface. In
that case, the stability analysis for open-loop systems can be performed by the usual
methods regarding tuning and stability of a closed-loop system.
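The one-dimensional mappings of Example 9.2.1 are easy to reproduce numerically. The following sketch evaluates the input–output mapping for singleton outputs at −1, 0 and +1, once with the triangular input sets of the first and third examples and once with the trapezoidal input sets of the fourth example. The function names are hypothetical, and the discrete center of gravity is assumed for the defuzzification.

```python
def tri_neg(x):
    return max(0.0, 1.0 - abs(x + 1.0))


def tri_zero(x):
    return max(0.0, 1.0 - abs(x))


def tri_pos(x):
    return max(0.0, 1.0 - abs(x - 1.0))


def trap_neg(x):
    return min(1.0, max(0.0, -2.0 * x - 0.5))


def trap_zero(x):
    return min(1.0, max(0.0, 1.5 - 2.0 * abs(x)))


def trap_pos(x):
    return min(1.0, max(0.0, 2.0 * x - 0.5))


SINGLETONS = (-1.0, 0.0, 1.0)
TRIANGLES = (tri_neg, tri_zero, tri_pos)
TRAPEZOIDS = (trap_neg, trap_zero, trap_pos)


def io_map(x, input_sets, singletons=SINGLETONS):
    """Discrete center of gravity of the singleton outputs, weighted by the input grades."""
    weights = [mu(x) for mu in input_sets]
    return sum(w * y for w, y in zip(weights, singletons)) / sum(weights)


for x in (-0.6, 0.1, 0.4, 0.9):
    print(x, io_map(x, TRIANGLES), io_map(x, TRAPEZOIDS))
```

With the triangular sets the mapping coincides with the diagonal, whereas the trapezoidal sets produce a plateau around the origin and saturation near the borders of the universe.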

9.3 The State Space Approach

The concept of a linguistic trajectory in the state space of a fuzzy controller was
introduced by M. Braae and D.A. Rutherford in [14] and [15]. Because of the fundamental
difference between the representation of a phase plane in two dimensions and
in higher-dimensional systems, this method is in practice only applicable in the case of
an input with two state variables, typically the error and the change of error; to
generalize this, we will denote these variables by X1 and X2 . Suppose the numbers of
linguistic values the two variables can attain are N1 and N2 respectively. In such a
case, the maximal number of rules in the rule base equals N1 × N2 . Suppose further-
more that the output variable, u, can attain M different linguistic values. Then it is
possible to divide the state space into a partition of maximally M parts (disregarding
borders, which have zero Lebesgue measure, see [78]), according to the following
convention: each crisp element (x1 , x2 ) belongs to the set where the membership
grade of the antecedents is maximal. In other words, for any rules j, k ∈ R, denoted
as

Rule j: IF x1 = X1^j and x2 = X2^j THEN U = u_j

Rule k: IF x1 = X1^k and x2 = X2^k THEN U = u_k

we have that (x1 , x2 ) belongs to the partition, associated with uk , if and only if

$$\forall j \neq k: \quad \mu_j(x_1, x_2) \leq \mu_k(x_1, x_2)$$

where µ j and µk are the fuzzy sets representing the rule antecedents of the j–th and
the k–th rule. For instance, using the Mamdani-type implication,

$$\mu_i(x_1, x_2) = \min\bigl(\mu_{X_1^i}(x_1),\ \mu_{X_2^i}(x_2)\bigr)$$

As an illustration, let us use the antecedent rule base described in Example 6
again, and remark that this is a complete table. The partition of the state space then
looks as in Figure 50. In the lower right corner, we have enumerated the rules.

9.3.1 Definition (Linguistic Trajectory)

A linguistic trajectory is the sequence of rules that are successively fired. In the
following example, the subsequent points in the state space at discrete points in
time are marked with an X; multiple successive occurrences of the same rule fired
are reduced to only one mention. The linguistic trajectory (see Figure 51) of this
particular path hence equals

{R17 , R18 , R19 , R20 , R15 , R10 , R9 , R8 , R7 , R12 , R13 }
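As a small sketch of this definition, the reduction of repeated firings to a single mention can be written as follows. The sample sequence of fired rule numbers is made up for illustration and is not the path of Figure 51.

```python
from itertools import groupby

def linguistic_trajectory(fired_rules):
    """Collapse successive occurrences of the same rule into one mention."""
    return [rule for rule, _ in groupby(fired_rules)]

# Hypothetical rule numbers fired at successive sampling instants.
samples = [17, 17, 18, 19, 19, 19, 20, 15, 15, 10]
print(linguistic_trajectory(samples))
```

Only consecutive repetitions are merged, so a rule that is revisited later in the path still appears again in the trajectory.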


An Overview of Fuzzy Control Theory 77

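Fig. 50 State space. Rows give the linguistic value of ∆E, columns that of E; each cell
shows the rule output followed by the rule number.

∆E \ E |  f      c      a      w      h
wf     |  NA 1   NA 2   D 3    D 4    D 5
w      |  I 6    NA 7   NA 8   D 9    D 10
sts    |  I 11   I 12   NA 13  D 14   D 15
c      |  I 16   I 17   NA 18  NA 19  D 20
ef     |  I 21   I 22   I 23   NA 24  NA 25

[Fig. 51 Linguistic trajectory: the same 5 × 5 grid of rules 1 to 25, with ∆E on the vertical and E on the horizontal axis, on which the successive states are marked]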

M. Braae and D.A. Rutherford formulated a simple set of observation rules that relate
the number of visited states (colored in gray in the example) and their relative
position to some simple conclusions about the scaling factors of the intervals
in which the variables x1 and x2 should lie. Let S1 be the total width of the first
interval and S2 the total width of the second. A scaling factor that is too large or
too small may result in an inadequate covering of the partition space, which implies
that the scaling factors should be adjusted (see Figure 52).
Another problem that will quite often arise is the occurrence of loop behavior.
A system may start alternating between two or more states without ever reaching
the desired equilibrium. To address this, since no general solution is known for fuzzy
78 W. Peeters

[Fig. 52 Linguistic trajectory: adjusting the scaling factors. Four 5 × 5 rule grids (rules 1 to 25) over the axes X1 and X2, with panel titles “S1 too large”, “both too small”, “both too large” and “S2 too large”]

controllers, some techniques involving crisp controllers are used to determine the
stability and the convergence. We will now describe some of these in a more de-
tailed way.

9.4 Lyapunov Stability

Generally, a nonlinear control system, not necessarily fuzzy, can be described by a
differential equation

\[ \frac{dx}{dt} = f(x, u) \]

where x is the state of the system. Commonly, a point x = (e, ∆e) describes the error
and the change in error, and u is the system input, which in the fuzzy case can be
derived from the defuzzification value.

9.4.1 Definition (Equilibrium Point)

An equilibrium point x∗ is then a state of the system for which the following condi-
tion holds:
∃t0 ∈ R+ , ∀t ≥ t0 : x(t) = x∗ ,
or, in other words, once a state has reached an equilibrium, it will remain in this
point for all future times.
A nonlinear system may demonstrate a much more complicated behavior than a
linear system. For instance, unlike a linear system, it may feature multiple equilibrium
points. Also, the stability analysis may depend on the system input as
well as on the initial condition. One major problem with nonlinear systems is that
an exact mathematical solution is in many cases impossible to obtain (see also [97]).

9.4.2 Local and Global Stability

First of all, we have to make a distinction between the local and global stability
behavior of such a state. Most of the results we obtain concern local stability. In
that case, we only consider states that are close to some equilibrium point
which, without loss of generality (after applying some affine transformation), can be
taken as the origin o of the state space. If dim(x) = n is the number of state variables
and dim(u) = m is the number of inputs, a common mathematical technique is then to
linearize the problem around o ([48]). In that case we obtain

ẋ = Ax + Bu

with the dot notation for differentiation with respect to the time parameter t, A ∈ Mn×n
being the constant coefficient matrix of the linear system and B ∈ Mn×m being the input
matrix. A study of the linear dynamical system can therefore be reduced to a study
of the matrices A and B.

9.4.3 Definition (Asymptotic Stability)

In dynamical systems ([23]), a nonlinear system is called asymptotically stable if
any state in a certain neighbourhood (“close enough”) of the origin converges to its
centre, being the equilibrium state.

9.4.4 Definition (Lyapunov Stable)

It is also possible that for any initial state, the trajectory can be made “close enough”
to the equilibrium (without necessarily really converging to it). In that case, we call
an equilibrium Lyapunov-stable. Equivalently, a Lyapunov-stable system is a system
for which the states will remain bounded for all time, for any finite initial condition.
Asymptotic stability implies Lyapunov stability, although the reverse is not nec-
essarily true.

9.4.5 Definition (Exponential Convergence)

We also need a measure describing how fast the system converges to the equilibrium
point. We say that the state x(t) converges exponentially if and only if it converges
to the origin no slower than a certain decaying exponential function. In other words,
if and only if

\[ \exists \delta > 0 : \lim_{t \to \infty} \frac{\|x(t)\|}{e^{-\delta t}} = 0 \]

Exponential stability guarantees asymptotic stability, and hence Lyapunov stability
as well.

9.4.6 Switching Line

Now in the case of a fuzzy controller, let a state space with two variables X1 and
X2 be given, and let the closed-loop system be described by the following vectorial
differential equation:
ẋ = f(x) + bΦ(x)

where f(x) is the nonlinear function that describes the dynamics of the system without
correction. We have to make sure that f(o) = o, stating that o is an equilibrium.
Furthermore, x as well as b are vectors of dimension n, and Φ(x) is the scalar, nonlinear
control function that represents the correction supplied by the fuzzy rule base.
As a minimal condition, this correction should be zero in case of an equilibrium, so
we demand that Φ(o) = 0. Using the notations of Section 5, we may consider for any
input vector x that Φ(x) equals the result of a defuzzification operator D on a fuzzy
set µ that is the consequence of the firing of the antecedent rule base with the given
input (see Figure 53).
The closed-loop behavior of this nonlinear system depends on f(x) and Φ(x).
With a fixed vector b, the direction of the control
action is entirely determined by the sign of the scaling factor Φ(x). It is therefore
important to determine the subspace of the state space for which Φ(x) = 0. This
subspace will be called the switching line; it divides the state space into regions
with positive and negative control actions (see Figure 54).
Just as was the case in the state space approach described in 9.3, it is fairly
simple to state some heuristic rules that indicate stability. Generally, a control
system will be stable if the control action bΦ(x) points toward the switching line, and
it will be unstable if it points away from it. The case where the vector field bΦ(x)
is parallel to the tangent to the switching line is a critical case,
in which the influence of the component f(x) becomes dominant. This approach can
also be useful to determine limit cycles, which are caused by multiple crossings of
the switching lines through the coordinate axes, or isolated areas, which have a different
behavior from the dominant area. Isolated areas are closed sets that do not
contain the equilibrium point, and for which the trajectories tend to go around this
area (see Figure 55).

[Fig. 53 Dynamics of the trajectory and Lyapunov stability: the vectors f(x), bΦ(x) and f(x) + bΦ(x) in the (X1, X2) plane]

[Fig. 54 Switching line: the curve Φ(x) = 0 in the (X1, X2) plane, together with the control vector bΦ(x)]
[Fig. 55 Limit cycles and isolated areas: two 5 × 5 rule grids (rules 1 to 25) over the axes X1 and X2, each showing the switching line Φ(x) = 0; left panel “limit cycle”, right panel “isolated area”]

The occurrence of limit cycles and/or isolated areas is often an indication that the
rule base has to be modified. In the case shown in Figure 55, the limit cycle in
the left state space implies the need to change rules R8, R9, R12, R13, R14, R17 and
R18, while the isolated area in the right state space justifies a change in rule R17.

9.4.7 Measures of Stability

Apart from this heuristic approach, the Lyapunov stability can also be used to define
some indices that measure quantitatively the stability properties. Let us consider a
few particular cases, depending on the dimension of the problem.

1. n = 1
Let X ⊆ R. The mathematical model in the one-dimensional case becomes

\[ \dot{x} = f(x) + \Phi(x) \]

where, without loss of generality, we consider the constant b to be incorporated
in Φ(x). We still demand however that f(0) = 0 and Φ(0) = 0. The equilibrium
points then occur whenever ẋ = 0 or, alternatively, Φ(x) = −f(x). The origin needs to be
an attractor of this system; the undesired occurrence of other attractor points
corresponds to other zeroes of the function Φ(x) + f(x) (see Figure 56).
In order to be a stable equilibrium, the functions f(x) and Φ(x) have to fulfill the
following two conditions:

(1) $\Phi'(0) < -f'(0)$

(2) $\forall x \neq 0 : |\Phi(x)| < |f(x)|$
[Fig. 56 Presence of multiple attractors: graphs of f(x), Φ(x) and f(x) + Φ(x) over X]

The first condition guarantees the equilibrium to be stable (see for instance
Devaney [23]); the second condition prevents the appearance of other equilibria,
which correspond to intersections of Φ(x) + f(x) with the X-axis. If Φ(x)
or f(x) are continuously deformed, the loss of a stable equilibrium and the appearance
of supplementary equilibria are called bifurcations. It is easy to see that,
under reasonable continuity conditions, such new equilibria always
appear in pairs, and that one of the two new equilibria
will be stable and the other one unstable.
These considerations also permit us to define two important indices that
measure stability. Condition (1) can be rewritten as −(Φ'(0) + f'(0)) > 0,
so −(Φ'(0) + f'(0)) can serve as a measure of robustness of the
system against the loss of stability at the origin. Similarly, we could take the infimum
of the distance between Φ(x) and −f(x), being inf |Φ(x) + f(x)|, as a second
stability measure. However, this value would always be zero, since it is invariably
attained at the origin. Therefore, we have to exclude a certain region
around the origin. To this end, define

\[ \beta_1 := \sup \{ x < 0 : \Phi'(x) = -f'(x) \} \]
\[ \beta_2 := \inf \{ x > 0 : \Phi'(x) = -f'(x) \} \]

The interpretation of these values is shown graphically in Figure 57.
Therefore, we redefine the stability indices as

\[ I_1 := -(\Phi'(0) + f'(0)) \]
\[ I_2 := \inf_{x \in X \setminus ]\beta_1, \beta_2[} |\Phi(x) + f(x)| \]
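The two indices can be approximated numerically. The following is a rough sketch only: the plant term f(x) = x, the control law Φ(x) = −2 tanh(2x), the interval X = [−1.5, 1.5] and all function names are hypothetical choices, and the derivatives are replaced by central differences on a grid.

```python
import math

def f(x):
    """Hypothetical plant term (unstable on its own)."""
    return x

def phi(x):
    """Hypothetical saturating control law with phi(0) = 0."""
    return -2.0 * math.tanh(2.0 * x)

def derivative(g, x, h=1e-5):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def stability_indices(f, phi, lo, hi, steps=6000):
    total = lambda x: f(x) + phi(x)
    i1 = -derivative(total, 0.0)                    # I1 = -(phi'(0) + f'(0))
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    slopes = [derivative(total, x) for x in xs]
    # A sign change of phi'(x) + f'(x) locates a point where phi'(x) = -f'(x).
    crossings = [0.5 * (xs[k] + xs[k + 1])
                 for k in range(steps) if slopes[k] * slopes[k + 1] < 0]
    beta1 = max(c for c in crossings if c < 0)
    beta2 = min(c for c in crossings if c > 0)
    # I2: smallest |phi(x) + f(x)| on the grid outside ]beta1, beta2[.
    i2 = min(abs(total(x)) for x in xs if x <= beta1 or x >= beta2)
    return i1, beta1, beta2, i2

i1, beta1, beta2, i2 = stability_indices(f, phi, -1.5, 1.5)
print(i1, beta1, beta2, i2)
```

For these choices Φ'(0) + f'(0) = −4 + 1, so I1 should come out close to 3, and β1, β2 close to ∓arccosh(2)/2. The sketch assumes that at least one crossing exists on each side of the origin.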

[Fig. 57 Stability indices: graphs of −f(x) and Φ(x) over X, with β1 and β2 marked on the X-axis]

2. n = 2
Let X ⊆ R². The mathematical model describing the fuzzy controller can in this
case be given by the following set of coupled differential equations:

\[ \dot{x}_1 = f_1(x_1, x_2) + b_1 \cdot \Phi(x_1, x_2) \]
\[ \dot{x}_2 = f_2(x_1, x_2) + b_2 \cdot \Phi(x_1, x_2) \]

with boundary conditions f1(0, 0) = f2(0, 0) = Φ(0, 0) = 0. Following Devaney
([23]) again, the equilibrium at the origin is stable if and only if the eigenvalues of
the linear approximation around the origin have a negative real part. Bifurcation
then occurs whenever a real negative eigenvalue becomes positive (static bifurcation),
or whenever a pair of complex conjugate eigenvalues crosses the imaginary
axis such that the negative real parts become positive (Hopf bifurcation). The
linearization of this nonlinear system is obtained by considering the Jacobian matrix,
given by

\[ J = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_2}{\partial x_1} \\ \frac{\partial f_1}{\partial x_2} & \frac{\partial f_2}{\partial x_2} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]

The characteristic polynomial then is given as

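\[ P(\lambda) = \det(J - \lambda \cdot I) \]
\[ \phantom{P(\lambda)} = \lambda^2 - (a_{11} + a_{22}) \cdot \lambda + a_{11} a_{22} - a_{12} a_{21} \]
\[ \phantom{P(\lambda)} = \lambda^2 - \mathrm{tr}(J) \cdot \lambda + \det(J) \]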

where I is the unit matrix, tr(J) := a11 + a22 is the trace of J and det(J) :=
a11 a22 − a12 a21 is the determinant of J.
First, we will generalize the stability of the equilibrium, which was, in the case of
n = 1, given by the index I1. A static bifurcation will happen if and only if one of
the roots of the characteristic polynomial P(λ) is zero, i.e. when det(J) = 0. It is
therefore logical to assume that the larger the difference between I1 := det(J) and
0, the more stability the system possesses. On the other hand, a Hopf bifurcation
will occur if and only if two complex eigenvalues cross the imaginary axis, in
which case tr(J) = 0. A second stability index will therefore be defined by I1' :=
−tr(J). Both these values are generalizations of the index I1 as described in the
case n = 1.
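As a hedged illustration of these two indices, the sketch below computes det(J), −tr(J) and the two eigenvalues from the four entries of a 2 × 2 matrix. The function name and the numerical entries are hypothetical; the entries are simply taken as given.

```python
import cmath

def planar_indices(a11, a12, a21, a22):
    """Return (I1, I1', eigenvalues) for J = [[a11, a12], [a21, a22]]."""
    det = a11 * a22 - a12 * a21
    tr = a11 + a22
    # Roots of P(lambda) = lambda^2 - tr(J) * lambda + det(J).
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    eigenvalues = ((tr + disc) / 2.0, (tr - disc) / 2.0)
    return det, -tr, eigenvalues

# Hypothetical linearization with eigenvalues -1 and -2.
i1, i1_prime, eig = planar_indices(0.0, 1.0, -2.0, -3.0)
print(i1, i1_prime, eig)
```

For a 2 × 2 matrix both eigenvalues have negative real part exactly when det(J) > 0 and tr(J) < 0, which is why positive values of both indices indicate a stable origin.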
For generalizing the index I2, we must remember that in the one-dimensional
case, we had a bifurcation in case the vector fields Φ(x) and f(x) compensated
each other. However, in that case, we had included the occurrence of b in the
vector field Φ(x). In the two-dimensional case, such a compensation can only occur
where the vector field f(x) is parallel to the direction given by b =
(b1, b2). We therefore define the auxiliary subspace as the space of all points
(x1, x2) for which

\[ \frac{f_1(x_1, x_2)}{b_1} = \frac{f_2(x_1, x_2)}{b_2} \]

The auxiliary subspace is a one-dimensional subspace of the state space. In this
subspace, we can perform an analysis that is similar to the case n = 1. We would
like I2 to be defined as a minimal distance between the plant and the controller
components around the origin, excluding a certain region B around the origin.
This region is obtained by again calculating equivalent values to β1 and β2 in
this subspace. Therefore, we find that

\[ I_2 = \min_{x \in X \setminus B} |f(x) + b \cdot \Phi(x)| \]

3. n > 2
Let X ⊆ Rⁿ. Then we can generalize the previous results straightforwardly. The
mathematical model describing the fuzzy controller is given by the following
system of coupled differential equations:

\[ \dot{x}_1 = f_1(x_1, x_2, ..., x_n) + b_1 \cdot \Phi(x_1, x_2, ..., x_n) \]
\[ \dot{x}_2 = f_2(x_1, x_2, ..., x_n) + b_2 \cdot \Phi(x_1, x_2, ..., x_n) \]
\[ \vdots \]
\[ \dot{x}_n = f_n(x_1, x_2, ..., x_n) + b_n \cdot \Phi(x_1, x_2, ..., x_n) \]

with boundary conditions

\[ f_1(0, 0, ..., 0) = f_2(0, 0, ..., 0) = ... = f_n(0, 0, ..., 0) = \Phi(0, 0, ..., 0) = 0 \]

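The linearization of this nonlinear system is obtained by considering the Jacobian
matrix, given by

\[ J = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_2}{\partial x_1} & \cdots & \frac{\partial f_n}{\partial x_1} \\
\frac{\partial f_1}{\partial x_2} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_n}{\partial x_2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_1}{\partial x_n} & \frac{\partial f_2}{\partial x_n} & \cdots & \frac{\partial f_n}{\partial x_n}
\end{pmatrix} \]
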
The characteristic polynomial then is given as

\[ P(\lambda) = \lambda^n + a_1 \lambda^{n-1} + a_2 \lambda^{n-2} + ... + a_{n-2} \lambda^2 + a_{n-1} \lambda + a_n \]

The generalization of the index I1 in the case of a static bifurcation will become

\[ I_1 = a_n = (-1)^n \cdot \det(J) \]

For a Hopf bifurcation, the characteristic polynomial must have a pair of purely
imaginary roots ±iw. To test for this, we rewrite the characteristic polynomial as

\[ P(\lambda) = P_1(\lambda) \cdot (w^2 + \lambda^2) + b_1 \lambda + b_2 \]

for some real w, where b1 and b2 here denote the coefficients of the remainder (not the
components of the vector b). The condition for having two purely imaginary roots is in
this case that b1 = b2 = 0.
As an example let us see what this will become in the case n = 3. The characteristic
polynomial becomes

\[ P(\lambda) = \lambda^3 + a_1 \lambda^2 + a_2 \lambda + a_3 \]
\[ \phantom{P(\lambda)} = (\lambda + a_1)(w^2 + \lambda^2) + (a_2 - w^2) \lambda + a_3 - a_1 w^2 \]

so b1 = a2 − w² and b2 = a3 − a1 w². Demanding that both should be zero yields
w² = a2 and hence a1 · a2 − a3 = 0. Therefore, it is feasible to take as an index for the measure
of the “distance” between the complex poles with negative real part and
the points on the imaginary axis where a Hopf bifurcation occurs

\[ I_1' = a_1 \cdot a_2 - a_3 \]

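As demonstrated in [109], if we define the Hurwitz matrix Hn and its leading
principal submatrix Hn−1 as

\[ H_n = \begin{pmatrix}
a_1 & a_3 & a_5 & 0 & 0 & \cdots \\
1 & a_2 & a_4 & a_6 & 0 & \cdots \\
0 & a_1 & a_3 & a_5 & 0 & \cdots \\
0 & 1 & a_2 & a_4 & a_6 & \cdots \\
0 & 0 & a_1 & a_3 & a_5 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots & a_n
\end{pmatrix}
\quad \text{and} \quad
H_{n-1} = \begin{pmatrix}
a_1 & a_3 & a_5 & 0 & \cdots \\
1 & a_2 & a_4 & a_6 & \cdots \\
0 & a_1 & a_3 & a_5 & \cdots \\
0 & 1 & a_2 & a_4 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
\cdots & \cdots & \cdots & \cdots & a_{n-1}
\end{pmatrix} \]

it is easy to see that

\[ I_1' = \det(H_{n-1}) \]

In fact, it is even true that I1 · I1' = det(Hn).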
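These two identities can be checked on a small case. The sketch below builds the Hurwitz matrix from the coefficients a1, ..., an of a monic polynomial, using the entry pattern of the matrices above (entry (i, j) equals a_{2j−i} with a0 = 1 and ak = 0 outside 0..n). The function names and the sample polynomial (λ + 1)(λ + 2)(λ + 3) are illustrative assumptions.

```python
def hurwitz_matrix(coeffs):
    """Hurwitz matrix of lambda^n + a1*lambda^(n-1) + ... + an, coeffs = [a1, ..., an]."""
    n = len(coeffs)

    def a(k):
        if k == 0:
            return 1
        return coeffs[k - 1] if 1 <= k <= n else 0

    # With 0-based indices the entry in row i, column j is a_(2j - i + 1).
    return [[a(2 * j - i + 1) for j in range(n)] for i in range(n)]

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(minor)
    return total

# (lambda + 1)(lambda + 2)(lambda + 3) = lambda^3 + 6 lambda^2 + 11 lambda + 6
a1, a2, a3 = 6, 11, 6
h3 = hurwitz_matrix([a1, a2, a3])
h2 = [row[:2] for row in h3[:2]]       # leading principal submatrix
print(det(h2), a1 * a2 - a3)           # I1' computed in two ways
print(det(h3), a3 * det(h2))           # det(H3) versus I1 * I1'
```

Cofactor expansion is used only to keep the sketch dependency-free; for larger n a numerical library would be the usual choice.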
For generalizing the index I2, again we define the auxiliary subspace as the space
of all points (x1, x2, ..., xn) for which

\[ \frac{f_1(x_1, x_2, ..., x_n)}{b_1} = \frac{f_2(x_1, x_2, ..., x_n)}{b_2} = ... = \frac{f_n(x_1, x_2, ..., x_n)}{b_n} \]

in which I2 is defined as a minimal distance between the plant and the controller
components around the origin, excluding a certain region B around the origin:

\[ I_2 = \min_{x \in X \setminus B} |f(x) + b \cdot \Phi(x)| \]

Remarkably, for every n, the auxiliary subspace is always one-dimensional.

9.5 Input–Output Stability and Related Techniques

9.5.1 The Extended Space Xe

As all vector functions x(t) are given as functions of a time parameter, we need
a norm which makes the space of
all such signals, X, a normed vector space. This can be done by either taking the
L2-norm

\[ \|x\|_2 := \sqrt{\int_0^\infty \|x(t)\|^2 \, dt} \]

representing all the signals with finite energy, or the essential supremum norm as
defined in Kolmogorov et al. ([78])

\[ \|x\|_{\mathrm{ess.sup}} := \inf \{ \alpha \in \mathbb{R} : \lambda(\{ t \in [0, \infty) : \|x(t)\| > \alpha \}) = 0 \} \]

where λ denotes the Lebesgue measure, representing all bounded signals, by considering
all vectors x(t) for which the respective norms are finite. Any signal x(t) that is
unbounded in time can then be truncated by defining

\[ \forall T > 0 : (x(t))_T := \begin{cases} x(t) & \text{if } t \le T \\ 0 & \text{if } t > T \end{cases} \]

thus extending the space X to the extended space Xe for which

\[ \forall T > 0 : (x(t))_T \in X \]
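A small numerical sketch may make the role of the truncation clearer. The ramp signal, the helper names and the midpoint-rule integration over a finite horizon are all assumptions made for illustration; the ramp has infinite energy, but each of its truncations has a finite L2-norm.

```python
import math

def truncate(x, T):
    """Truncation (x(t))_T: equal to x(t) for t <= T and 0 afterwards."""
    return lambda t: x(t) if t <= T else 0.0

def l2_norm(x, horizon, steps=100000):
    """Midpoint-rule approximation of the L2-norm of a scalar signal on [0, horizon]."""
    dt = horizon / steps
    energy = sum(x((k + 0.5) * dt) ** 2 for k in range(steps)) * dt
    return math.sqrt(energy)

ramp = lambda t: t                      # unbounded signal, not in X
ramp_2 = truncate(ramp, 2.0)            # truncated at T = 2, an element of X
print(l2_norm(ramp_2, horizon=10.0))    # close to sqrt(8 / 3)
```

Because the truncated signal vanishes after T, integrating over any horizon larger than T gives the same value, here the square root of the integral of t² over [0, 2].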

Any system G that takes vectors x(t) as input and produces vectors y(t) as output, both
elements of the extended space Xe, can be considered as a relation G ⊆ Xe × Xe
containing all pairs (x(t), y(t)) for which y(t) is a possible output for the
input x(t). Hence, any input signal may produce either one output, several outputs
or no outputs at all, so it is in general wrong to write y(t) = G(x(t)), since G need
not define a function. The advantage of using relations instead of functions is that we
do not have to consider existence or uniqueness conditions.
The analysis of input–output stability (Figure 58) can be credited to Safonov
([132]) and Vidyasagar ([150]). Consider a feedback system as above, where y is the
control vector and z the control output. G(v) represents the system to be controlled,
while H(u) is the controller. In that regard, v can be considered as the reference
variable, and u represents the dependence on initial conditions or the disturbances
that influence the system.

[Fig. 58 Input–output stability schematics: feedback loop with the signals labelled z, y and u]

9.5.2 Definition (Finite-Gain Stable)

We call a system G finite-gain stable if and only if the gain of G, which we will
define as

\[ g(G) := \sup_{x(t) \neq o} \frac{\|(Gx(t))_T\|}{\|(x(t))_T\|} \]

is finite. In that case, the system output can be made arbitrarily small by making
the inputs small. When the loop around the open-loop system G is closed, with (u, v) as
closed-loop inputs and (y, z) as closed-loop outputs, the idea is that we can obtain small (y, z),
in which y may for instance represent the output error, by making (u, v) small, in
which the former denotes the disturbances to be avoided and the latter the reference
set-point.
One of the main results in this approach is the so-called Small Gain Theorem
([167]), stating that if the system described above is closed-loop, a sufficient condition
for its stability is that g(G) · g(H) < 1. Two other important criteria which
can be derived from this theorem are the Circle Criterion, which was studied in
Ray et al. [121] and [122], and its generalization, the Conicity Criterion (see for
instance [132]).

10 Other Adaptive Techniques

The use of fuzzy control has always been subject to discussion over the question
whether it actually improves controller schematics (see [103]). While we
are convinced that fuzzy techniques perform at least as well as classical techniques,
especially in low-dimensional systems, we admit at the same time that
for more complex systems, the gain achieved by replacing the classical control
techniques by their fuzzy counterparts is indeed minimal. The most interesting
behavior however arises when fuzzy techniques are crossbred with other relatively
new mathematical theories that try to model other biological processes,
which have only become a subject of study since the exponential increase in
computing possibilities has taken place. We mainly think then of two particular
techniques that have acquired a relative success recently: artificial neural networks,
for which a good introduction can be found in [36], [47] and [50], and
genetic algorithms, see for instance [46], [72] and [136]. While it is not our purpose
to give a detailed description of the theory involved, we will try to summarize
the basics of each of the two techniques, and point out the areas where successful
combinations with fuzzy control techniques are possible. Systems combining
fuzzy control theory with one or more of the above will be referred to as hybrid
systems (see [66]).

10.1 Neural Networks

In this section, we will first give an overview of the main theory involving neural
networks, followed by an overview of hybrid techniques where successful combinations
of fuzzy set theory and neural networks have been made. We will follow the
approach as presented very eloquently by Fullér in [39].

10.1.1 Introduction

Neural networks can be considered as simplified mathematical models of computationally
complex systems like a human brain, working as parallel distributed
computing systems. However, in contrast to conventional computers, which are programmed
to perform specific tasks, most neural networks must be taught, or trained,
in a supervised way, where the designer provides a training data set, consisting of
the input values of the system, together with the desired output values. The neural
network can learn new associations, new functional dependencies and new patterns.
Although computers outperform both biological and artificial neural systems for
tasks based on precise and fast arithmetic operations, artificial neural systems represent
the promising new generation of information processing networks.
A key advantage of neural network systems is their flexibility: simple, yet
powerful learning procedures can be defined, allowing the systems to adapt to their
environments. The essential character of such networks is that they map similar
input patterns to similar output patterns. This characteristic is what allows these
networks to make reasonable generalizations and perform reasonably well on patterns
that have never before been presented but belong to the same class of possible
input patterns as the training set. A good general work on neural networks and
self-organizing maps is Haykin [50].
[Fig. 59 Example of a neural network: nodes arranged in four layers, labelled Layer 1 to Layer 4]

10.1.2 Definition (Neural Network)

A neural network is a collection of cellular units, which we will call the nodes or
neurons of the network, which serve as storage units for (binary) information. Each
neuron is characterized by an activity level, representing the state of its polarization,
an output value, representing its firing rate, and a set of input and output connections,
which we will call synapses, which connect the neurons as a directed graph would
do (see Figure 59). All these are characterized by real numbers.
Furthermore, the different neurons are ordered in layers, which may or may not
be hidden (to which we will come later). If only one-directional arrows exist,
we will call the neural network a feedforward network; if moreover, the “previous”
nodes also obtain information from the “succeeding” nodes, e.g. by arrows in the
reverse direction (though there are other methods to achieve this), we will call the
network a feedback network.
Each neuron possesses a finite number of input connections {x1, x2, ..., xn}, each
with an associated weight value wi, which we will call the synaptic strength, and, for the
sake of simplicity, one output connection o, determined as a function of the input
signals as described in Figure 60.
[Fig. 60 Example of a neuron: inputs X1, X2, ..., Xn with weights W1, W2, ..., Wn feeding a unit with threshold θ and transfer function f, producing the output o]

The output signal is given by the following relationship, in the particular case of
a single output connection:

\[ o = f(\langle w, x \rangle) = f\left( \sum_{j=1}^{n} w_j x_j \right) \]

where the vector w = (w1, w2, ..., wn) ∈ Rⁿ is called the weight vector. The weights
$(w_j)_{j=1}^{n}$ assign to each incoming synapse the strength of its effect, hence the name
synaptic strength. A weight may be positive (excitatory) or negative (inhibitory). The
function f will be called the activation function or transfer function. For this transfer
function, a myriad of possible choices can be made, of which we will only highlight
the most important ones.

10.1.3 Definition (Linear Transfer Function)

The linear transfer function is defined by

\[ o = \langle w, x \rangle = \sum_{j=1}^{n} w_j x_j \]

10.1.4 Definition (Binary Linear Transfer Function)

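The binary linear transfer function is defined by

\[ o = f(\langle w, x \rangle) = \begin{cases} 1 & \text{if } \langle w, x \rangle \ge \theta \\ 0 & \text{if } \langle w, x \rangle < \theta \end{cases} \]

θ will be called the threshold level.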

10.1.5 Example

It is possible to model both the boolean “AND” and “OR” operators as a neural
network with a binary linear transfer function. Suppose that the two input values x1
and x2 as well as the output value o are in {0, 1}, then the operators are modelled by
the weight vector w = (1/2, 1/2) and threshold θ = 0.6, and the weight vector w = (1, 1)
and threshold θ = 0.8 respectively. Put differently,

\[ x_1 \wedge x_2 = \begin{cases} 1 & \text{if } \frac{x_1 + x_2}{2} \ge 0.6 \\ 0 & \text{if } \frac{x_1 + x_2}{2} < 0.6 \end{cases} \]

and

\[ x_1 \vee x_2 = \begin{cases} 1 & \text{if } x_1 + x_2 \ge 0.8 \\ 0 & \text{if } x_1 + x_2 < 0.8 \end{cases} \]

Geometrically, for both connectives, this means that the points with output 1 and
those with output 0 can be separated by a (hyper)plane in the state space, to which
the weight vector is perpendicular (see Figure 61).

[Fig. 61 Geometric interpretation of the “AND” and “OR” neurons: the four input points in the (X1, X2) plane, the weight vector W and the separating lines (x1 + x2)/2 = 0.6 for “AND” and x1 + x2 = 0.8 for “OR”]
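The two neurons of this example can be written out directly. The sketch below uses the weights and thresholds quoted in the text; the helper name binary_neuron is a hypothetical choice.

```python
def binary_neuron(weights, theta):
    """Neuron with binary linear transfer function: fires iff <w, x> >= theta."""
    def fire(x):
        activation = sum(w * xi for w, xi in zip(weights, x))
        return 1 if activation >= theta else 0
    return fire

AND = binary_neuron((0.5, 0.5), 0.6)
OR = binary_neuron((1.0, 1.0), 0.8)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, AND(x), OR(x))
```

Any threshold strictly between 0.5 and 1 would serve for this “AND” neuron, and any threshold in ]0, 1] for this “OR” neuron; the values 0.6 and 0.8 are just the ones used in the example.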
Consequently, it is easy to see that it is not possible to model the exclusive
OR-operator, defined by

\[ x_1 \text{ XOR } x_2 := \begin{cases} 1 & \text{if } (x_1 = 0 \text{ and } x_2 = 1) \text{ or } (x_2 = 0 \text{ and } x_1 = 1) \\ 0 & \text{otherwise} \end{cases} \]

by means of a single such neural network with a binary transfer function (Figure 62),
and this is the major drawback of these basic neural networks.
The reason is that, by elementary geometry, it can readily be seen in
Figure 62 that it is impossible to separate the two white dots from the two black dots
by means of a single hyperplane.
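The geometric argument can be complemented by a brute-force search over a coarse grid of weights and thresholds. This is an illustration rather than a proof, and the grid and the function name are arbitrary assumptions. The actual reason no solution exists is that the four conditions θ > 0, w1 ≥ θ, w2 ≥ θ and w1 + w2 < θ contradict each other.

```python
from itertools import product

def realizable(truth_table, grid):
    """Return some (w1, w2, theta) on the grid reproducing the table, or None."""
    for w1, w2, theta in product(grid, repeat=3):
        if all((1 if w1 * x1 + w2 * x2 >= theta else 0) == out
               for (x1, x2), out in truth_table.items()):
            return (w1, w2, theta)
    return None

grid = [k / 4 for k in range(-8, 9)]    # -2.0, -1.75, ..., 2.0
AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print(realizable(AND_TABLE, grid))      # some valid parameter triple
print(realizable(XOR_TABLE, grid))      # None: no single neuron fits
```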

10.1.6 Proposition

It is quite easy to see that in case of a binary linear transfer function, a neuron with
n synapses and a threshold θ is equivalent to a neuron with n + 1 synapses and
threshold zero, with as one additional synapse a constant inhibitory input −1 with
fixed weight θ (see Figure 63). Indeed, ⟨w, x⟩ ≥ θ holds if and only if
⟨w, x⟩ + θ · (−1) ≥ 0.
Therefore, without loss of generality, we may always assume the threshold value
to be zero. Graphically, this means that all the hyperplanes described above include
the origin o of the Euclidean state space.

[Fig. 62 Geometric interpretation of the “XOR”: the four input points in the (X1, X2) plane]

[Fig. 63 Including the threshold value as an extra weight: a neuron with inputs X1, ..., Xn, weights w1, ..., wn and threshold θ, and the equivalent neuron with threshold 0 and an extra input −1 with weight θ]
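The equivalence stated in this proposition can be checked on the binary inputs of the earlier example. The function names below are hypothetical; the extra input is the constant −1 and its weight is the former threshold θ.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def fires_with_threshold(w, theta, x):
    """Original neuron: n synapses and threshold theta."""
    return 1 if dot(w, x) >= theta else 0

def fires_augmented(w, theta, x):
    """Equivalent neuron: n + 1 synapses, extra input -1 with weight theta, threshold 0."""
    return 1 if dot(tuple(w) + (theta,), tuple(x) + (-1,)) >= 0 else 0

for w, theta in [((0.5, 0.5), 0.6), ((1.0, 1.0), 0.8)]:
    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(w, theta, x, fires_with_threshold(w, theta, x), fires_augmented(w, theta, x))
```

In floating-point arithmetic the two tests could differ for inputs lying exactly on the separating hyperplane; none of the sample points above is such a boundary case.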

10.1.7 Training Sets

The process of “learning” in terms of neural networks is simply the problem of
finding a set of synaptic strengths (weights) which allow the network to carry out
a certain desired calculation. The network is provided with a finite set of example
input/output pairs, which we will call a training set and whose purpose it is to fine-tune
its weights in order to approximate the given function as closely as
possible, preferably in such a way that the error between the output and the desired
output on the training set is as close to zero as possible. The networks are then tested
for their ability to generalize.
The error-correction learning procedure is based on a simple concept: during training,
an input that is presented to the network generates a set of output values.
Then, the actual output is compared with the desired output, and if these match, no
change is made to the weights in the net. However, if the output differs from the
target, a change must be made to some of the weights.
An often-needed modification of the binary linear transfer function is the follow-
ing:

10.1.8 Definition (Hard Transfer Function)

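The hard transfer function is defined by

\[ o(\langle w, x \rangle) := \mathrm{sign} \langle w, x \rangle = \begin{cases} 1 & \text{if } \langle w, x \rangle \ge 0 \\ -1 & \text{if } \langle w, x \rangle < 0 \end{cases} \]

with the scalar product

\[ \langle w, x \rangle := \sum_{j=1}^{n} w_j x_j \]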

10.1.9 Perceptron Learning Rule

The first, and most effective, way to train the weights is the perceptron learning
rule, introduced by Rosenblatt in [126]. Basically, it is an error-correction learning
algorithm for a single-layer feedforward network with a hard transfer function.
Let the weight $w_{ij}$ denote the synaptic strength between the j-th input
component and the i-th output node. Let a training set of K input values and their
corresponding output values be given as

\[ 1 : x^1 = (x_1^1, x_2^1, ..., x_n^1) \to y^1 = (y_1^1, y_2^1, ..., y_m^1) \]
\[ 2 : x^2 = (x_1^2, x_2^2, ..., x_n^2) \to y^2 = (y_1^2, y_2^2, ..., y_m^2) \]
\[ \vdots \]
\[ K : x^K = (x_1^K, x_2^K, ..., x_n^K) \to y^K = (y_1^K, y_2^K, ..., y_m^K) \]

for which all output and input values are considered to be in the binary set {1, −1}
(see Figure 64).
Our aim is to find the weight vectors $(w_i)_{i=1}^{m}$, with $w_i := (w_{ij})_{j=1}^{n}$, such that

\[ \forall k \in \{1, ..., K\}, \forall i \in \{1, ..., m\} : o_i(x^k) = y_i^k \]

where we define the activation function $o_i(x^k)$ as the hard transfer function. Given a
parameter η > 0, which we will call the learning rate, the weights will consequently
be adjusted by the following rule:

\[ \forall i \in \{1, ..., m\} : w_i^{new} := w_i^{old} + \eta (y_i - o_i) x \]

From this equation it follows that if the desired output is equal to the computed
output, $y_i = o_i$, then the weight vector of the i-th output node is left unchanged
by this training pair. The learning process stops when all the weight
vectors remain unchanged during a complete training cycle.

[Fig. 64 Example of a perceptron: inputs X1, X2, ..., Xn connected to the outputs o1, ..., om through the weights w11, ..., wmn]

10.1.10 Perceptron Learning Algorithm

Therefore, the following algorithm, which we will call the perceptron learning
algorithm, provides a systematic way to determine the weight vectors $w_i$:
1. Choose η > 0.
2. Initialize the weight vectors $w_i$ with small random values; put the error function
E := 0 and let k := 1.
3. The training cycle begins. Take $x^k$ and compute the output

\[ o_i(x^k) := \begin{cases} 1 & \text{if } \langle w_i, x^k \rangle \ge 0 \\ -1 & \text{if } \langle w_i, x^k \rangle < 0 \end{cases} \]

with the scalar product

\[ \langle w_i, x \rangle := \sum_{j=1}^{n} w_{ij} x_j \]

4. Update the weights by putting

\[ \forall i \in \{1, ..., m\} : w_i^{new} := w_i^{old} + \eta (y_i^k - o_i) x^k \]

5. Accumulate the error by putting

\[ E^{new} := E^{old} + \frac{1}{2} \| y^k - o \|^2 \]

6. If k < K, then increase k by 1 and go back to step 3.
7. The training cycle is completed. If E = 0 (or perhaps, if E < E_threshold), end the
training session; otherwise, set E := 0 and k := 1 and go back to step 3 to re-train.
It can be proved that under certain separation properties, namely if all inputs
that are mapped to one output can be separated by a hyperplane from all inputs
that are mapped to another one, this algorithm converges in a finite number of steps.
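The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the cap on the number of training cycles, the fixed random seed and the bipolar “AND” training set are all assumptions. Each input carries a constant third component −1, so that the threshold is learned as an extra weight.

```python
import random

def sign(s):
    """Hard transfer function: 1 if s >= 0, otherwise -1."""
    return 1 if s >= 0 else -1

def train_perceptron(samples, n, m, eta=0.1, max_cycles=100, seed=0):
    """Perceptron learning for m output nodes with n inputs each.

    samples is a list of (x, y) pairs with entries in {1, -1}.
    Returns (weights, cycles), or (weights, None) if the cap is reached.
    """
    rng = random.Random(seed)
    w = [[rng.uniform(-0.05, 0.05) for _ in range(n)] for _ in range(m)]  # step 2
    for cycle in range(1, max_cycles + 1):
        error = 0.0
        for x, y in samples:                                   # steps 3 to 6
            for i in range(m):
                o_i = sign(sum(wij * xj for wij, xj in zip(w[i], x)))
                delta = y[i] - o_i
                w[i] = [wij + eta * delta * xj for wij, xj in zip(w[i], x)]
                error += 0.5 * delta ** 2
        if error == 0:                                         # step 7
            return w, cycle
    return w, None

# Bipolar "AND" with a constant third input -1 (threshold as an extra weight).
samples = [((-1, -1, -1), (-1,)), ((-1, 1, -1), (-1,)),
           ((1, -1, -1), (-1,)), ((1, 1, -1), (1,))]
weights, cycles = train_perceptron(samples, n=3, m=1)
print(weights, cycles)
```

Since this training set is linearly separable, the run ends with an error-free cycle well before the cap on the number of cycles; for a non-separable set such as XOR the function would return None as the cycle count.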

10.1.11 Delta Learning Rule

The following observation is important when generalizing the methodics of neural


network training to more general networks, with, e.g. a supplementary layer, differ-
ent input and output spaces. Remark that the weight adjustment rule

∀i ∈ {1, ..., m} : wnew


i i + η (yi − oi )x
:= wold

can also be obtained as the outcome of a gradient descent method. For a description
of this method, see for instance [3]. More generally, this rule is known in the literature
as the delta learning rule. The basic idea of the delta learning rule is to define a
measure of the overall performance of the system, and then to find a way to optimize
that performance. In our network, we can define the performance of the system as
the total error function
$$ E = \sum_{k=1}^{K} E_k = \frac{1}{2} \sum_{k=1}^{K} \| y^k - o^k \|^2 $$

Then

$$ \begin{aligned}
E &= \frac{1}{2} \sum_{k=1}^{K} \sum_{i=1}^{m} (y_i^k - o_i^k)^2 \\
  &= \frac{1}{2} \sum_{k=1}^{K} \sum_{i=1}^{m} (y_i^k - \langle w_i, x^k \rangle)^2
\end{aligned} $$


The goal, then, is to minimize this function. As it turns out, if the output func-
tions are differentiable, we change the weights of the system in proportion to the
derivative of the error with respect to the weights. The rule for changing weights
is given by minimizing the quadratic error function by using the following iteration
process:
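$$ w_{ij}^{new} := w_{ij}^{old} - \eta \frac{\partial E}{\partial w_{ij}} $$
Particularly, using the chain rule,

$$ \frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial o_i} \frac{\partial o_i}{\partial w_{ij}} = -(y_i - o_i) x_j $$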

yielding the same weight update formula. If we have only one output unit then the delta
learning rule collapses into

$$ w^{new} := w^{old} + \eta (y - o) x = w^{old} + \eta \delta x $$

with δ denoting the difference between the desired and the computed output; hence
the name “delta learning rule”.
Concluding, the standard delta rule essentially implements the gradient descent
method on the sum-squared error for linear activation functions. The use of the delta
learning rule, which is a generalization of the discrete perceptron training rule, in
neural network training should be credited to McClelland and Rumelhart in [92].
It is sometimes also called the continuous perceptron training rule.
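
As an illustration, the single-output delta rule with a linear output unit can be sketched as below. The function name `delta_rule_linear`, the zero initial weights and the three training patterns (generated from the hypothetical target $y = 2x_1 - x_2$) are our own assumptions and serve only to show the update $w^{new} := w^{old} + \eta\delta x$ at work.

```python
def delta_rule_linear(patterns, eta=0.1, cycles=200):
    """Delta rule for one linear output unit o = <w, x>.

    patterns: list of (x, y) pairs with x a list of n inputs and y a number.
    Returns the weight vector after the given number of training cycles.
    """
    n = len(patterns[0][0])
    w = [0.0] * n                      # assumption: zero initial weights
    for _ in range(cycles):
        for x, y in patterns:
            o = sum(wj * xj for wj, xj in zip(w, x))
            delta = y - o              # desired minus computed output
            w = [wj + eta * delta * xj for wj, xj in zip(w, x)]
    return w


# Patterns generated from the linear target y = 2*x1 - x2 (illustrative choice).
LINEAR_PATTERNS = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
linear_weights = delta_rule_linear(LINEAR_PATTERNS)
```

Because the patterns come from an exactly linear function, the weights approach (2, -1); for data that are not linear the same rule only finds a least-squares compromise.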
If we use a linear output unit then, whatever the final weight vector is, the output
function of the network is linear, which means that the delta learning
rule with linear output function can approximate only a pattern set derived from an
almost linear function, which is, needless to say, unsatisfactory in certain real-world
applications. Therefore, activation functions other than the binary or hard transfer
functions are also commonly used, especially for their differentiability properties,
which make it possible to derive the corresponding learning rules from a similar
gradient descent method with a different output function. The unipolar sigmoidal
activation function is one such commonly used example.

10.1.12 Definition (Unipolar Sigmoidal Activation Function)

The unipolar sigmoidal transfer function is defined by


$$ o(\langle w, x \rangle) = \frac{1}{1 + e^{-\langle w, x \rangle}} = \frac{1}{1 + e^{-\sum_{j=1}^{n} w_j x_j}} $$

10.1.13 Example

For the sake of simplicity we will explain the learning algorithm in the case
of a multiple-input, single-output (MISO) network, with input functions x =
{x1 , x2 , ..., xn }, weight functions w = {w1 , w2 , ..., wn }, and a single neuron output o.
Suppose a training set

$$ \begin{aligned}
1 &: x^1 = (x_1^1, x_2^1, \dots, x_n^1) \to y^1 \\
2 &: x^2 = (x_1^2, x_2^2, \dots, x_n^2) \to y^2 \\
  &\ \ \vdots \\
K &: x^K = (x_1^K, x_2^K, \dots, x_n^K) \to y^K
\end{aligned} $$

is given. Similarly to the perceptron learning algorithm, given an input vector $x^k$,
an output $o^k$ is calculated and compared with the desired output $y^k$.
Only this time, we define the unipolar sigmoidal activation function as above. Then
let us again define the performance of the system as the total error function

$$ \begin{aligned}
E &= \sum_{k=1}^{K} E_k \\
  &= \frac{1}{2} \sum_{k=1}^{K} (y^k - o^k)^2 \\
  &= \frac{1}{2} \sum_{k=1}^{K} \left( y^k - \frac{1}{1 + e^{-\sum_{j=1}^{n} w_j x_j^k}} \right)^2
\end{aligned} $$


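The rule for changing weights following presentation of the input–output pair $(x^k, y^k)$
will be given by the gradient descent method, i.e. we minimize the quadratic error
$E_k$ of that pair by putting

$$ w_j^{new} := w_j^{old} - \eta \frac{\partial E_k}{\partial w_j} $$

In particular, this gradient equals

$$ \begin{aligned}
\frac{\partial E_k}{\partial w_j}
  &= \frac{1}{2} \frac{\partial}{\partial w_j} \left( y^k - \frac{1}{1 + e^{-\langle w, x^k \rangle}} \right)^2 \\
  &= -\left( y^k - \frac{1}{1 + e^{-\langle w, x^k \rangle}} \right) \frac{1}{1 + e^{-\langle w, x^k \rangle}} \left( 1 - \frac{1}{1 + e^{-\langle w, x^k \rangle}} \right) x_j^k \\
  &= -(y^k - o^k) o^k (1 - o^k) x_j^k
\end{aligned} $$

Therefore our learning rule for w, written as a vector, is

$$ w^{new} := w^{old} + \eta (y^k - o^k) o^k (1 - o^k) x^k $$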

10.1.14 Delta Learning Rule for a Unipolar Sigmoidal Activation Function

Then the following algorithm, which we will call the delta learning rule for a unipo-
lar sigmoidal activation function provides a systematic way to determine the weight
functions wi :
1. Choose η > 0 and Emax > 0.
2. Initialize the weight functions wi with small random values; put the error function
E := 0 and let k := 1.
3. The training cycle begins. Take $x^k$ and compute the output
   $$ o^k = o(\langle w, x^k \rangle) = \frac{1}{1 + e^{-\sum_{j=1}^{n} w_j x_j^k}} $$
4. Update the weights by putting
   $$ w^{new} := w^{old} + \eta (y^k - o^k) o^k (1 - o^k) x^k $$
5. Cumulate the error function by putting
   $$ E^{new} := E^{old} + \frac{1}{2} (y^k - o^k)^2 $$
6. If k < K, then increase k by 1 and go back to step 3.
7. The training cycle is completed. If E ≤ Emax , end the training session; if E >
Emax , set E := 0, k := 1 and go back to step 3 to retrain.
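
A minimal Python sketch of this algorithm is given below. The names `sigmoid`, `train_sigmoid_unit` and `OR_PATTERNS`, the zero initial weights, the cap `max_cycles` and the concrete values of η and Emax are assumptions made for the illustration only; the constant third input again acts as a threshold.

```python
import math


def sigmoid(s):
    """Unipolar sigmoidal activation 1 / (1 + e^(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def train_sigmoid_unit(patterns, eta=0.5, e_max=0.05, max_cycles=20000):
    """Delta learning rule for one unipolar sigmoidal unit.

    Returns (weights, cycles used, cumulated error of the last cycle).
    """
    n = len(patterns[0][0])
    w = [0.0] * n                               # assumption: zero initial weights
    error = 0.0
    for cycle in range(1, max_cycles + 1):
        error = 0.0
        for x, y in patterns:
            o = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))   # step 3
            g = (y - o) * o * (1.0 - o)
            w = [wj + eta * g * xj for wj, xj in zip(w, x)]     # step 4
            error += 0.5 * (y - o) ** 2                         # step 5
        if error <= e_max:                                      # step 7
            return w, cycle, error
    return w, max_cycles, error


# Logical OR with targets in {0, 1}; the constant third input acts as a threshold.
OR_PATTERNS = [
    ([0.0, 0.0, 1.0], 0.0),
    ([0.0, 1.0, 1.0], 1.0),
    ([1.0, 0.0, 1.0], 1.0),
    ([1.0, 1.0, 1.0], 1.0),
]
or_weights, or_cycles, or_error = train_sigmoid_unit(OR_PATTERNS)
```

Unlike the hard limiter, the sigmoidal output never reaches its targets 0 and 1 exactly, which is why the stopping test uses the threshold Emax rather than E = 0.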

10.1.15 Definition (Bipolar Sigmoidal Activation Function)

In case of the unipolar sigmoidal activation function, without hidden units, the error
surface is shaped like a bowl with only one minimum, so gradient descent is even-
tually guaranteed to find an absolutely optimal set of weights. With for instance the
presence of hidden units, however, it is not so obvious how to compute the deriva-
tives, and the error surface is not concave upwards, so there is the danger of getting
stuck in local minima. We then use the delta learning rule with the bipolar sigmoidal
activation function
$$ o(\langle w, x \rangle) = \frac{2}{1 + e^{-\langle w, x \rangle}} - 1 = \frac{2}{1 + e^{-\sum_{j=1}^{n} w_j x_j}} - 1 $$
instead. It is left as a verification for the reader that the gradient descent method then
yields a weight update algorithm where
$$ w^{new} := w^{old} + \frac{1}{2} \eta (y - o)(1 - o^2) x $$
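
The verification left to the reader can be sketched as follows, writing $s = \langle w, x \rangle$ and $\sigma(s) = 1/(1 + e^{-s})$ for the unipolar sigmoid (both abbreviations are ours) and using the single-pattern error $E_k = \frac{1}{2}(y - o)^2$, as in the unipolar case:

```latex
\begin{aligned}
o &= 2\sigma(s) - 1, \qquad \sigma(s) = \frac{1 + o}{2}, \\
\frac{\partial o}{\partial s} &= 2\sigma(s)\bigl(1 - \sigma(s)\bigr)
  = 2 \cdot \frac{1 + o}{2} \cdot \frac{1 - o}{2} = \frac{1}{2}(1 - o^2), \\
\frac{\partial E_k}{\partial w_j} &= -(y - o)\,\frac{\partial o}{\partial s}\,\frac{\partial s}{\partial w_j}
  = -\frac{1}{2}(y - o)(1 - o^2)\,x_j .
\end{aligned}
```

Substituting this gradient into $w_j^{new} := w_j^{old} - \eta\,\partial E_k / \partial w_j$ gives the stated update.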

10.1.16 Definition (Other Activation Functions)

More activations that are commonly used are the following:


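• The piecewise linear activation function, defined as

$$ o(\langle w, x \rangle) = \begin{cases} 1 & \text{if } \langle w, x \rangle > 1 \\ \langle w, x \rangle & \text{if } |\langle w, x \rangle| \le 1 \\ -1 & \text{if } \langle w, x \rangle < -1 \end{cases} $$

• The hyperbolic bipolar sigmoidal transfer function, defined as

$$ o(\langle w, x \rangle) = \tanh \langle w, x \rangle $$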
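
For comparison, the activation functions introduced so far can be written as functions of the net input $s = \langle w, x \rangle$. The function names below are our own; the sketch only restates the definitions.

```python
import math


def unipolar_sigmoid(s):
    """1 / (1 + e^(-s)), with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def bipolar_sigmoid(s):
    """2 / (1 + e^(-s)) - 1, with values in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-s)) - 1.0


def piecewise_linear(s):
    """Identity on [-1, 1], clipped to -1 and 1 outside."""
    return max(-1.0, min(1.0, s))


def hyperbolic(s):
    """Hyperbolic bipolar sigmoidal transfer function tanh(s)."""
    return math.tanh(s)
```

Note that the bipolar sigmoid equals tanh(s/2), so the two bipolar functions differ only by a rescaling of the net input.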

10.1.17 Unsupervised Classification

Another important application of neural networks is unsupervised classification
learning, which is based on clustering of input data for which no a priori knowledge
is assumed to be available regarding an input’s membership in a particular
class. Rather, the classes and possible boundaries between them are gradually
detected by searching for characteristics during training, which assists the network
in discerning classes. The technique involved, clustering, is understood to be
the grouping of similar objects and the separating of dissimilar ones. Much of this work
must be credited to Kohonen in [77].
Given K input vectors $\{ x^k = (x_1^k, x_2^k, \dots, x_n^k) \}_{k=1}^{K}$, we want to find a method to
divide these into a prespecified number of classes m according to clustering properties.
Consider therefore again the neural network for which the schematics were given in
Figure 64. The learning algorithm then treats the set of m weight vectors

$$ \{ w_j = (w_{1j}, w_{2j}, \dots, w_{nj}) \}_{j=1}^{m} $$

as unknown, variable vectors that need to be “learned”. Prior to the learning, the
normalization of all (randomly chosen) weight vectors is required, such that
$\forall j \in \{1, \dots, m\} : \| w_j \| = 1$. The weight adjustment criterion for this mode of training is
the selection of an index r such that

$$ \| x - w_r \| = \min_{j = 1, \dots, m} \| x - w_j \| , $$

corresponding to the vector $w_r$ which is the closest approximation of the current
input x. Since, however,

$$ \begin{aligned}
\| x - w_j \|^2 &= \langle x - w_j, x - w_j \rangle \\
  &= \langle x, x \rangle - 2 \langle w_j, x \rangle + \langle w_j, w_j \rangle \\
  &= \| x \|^2 - 2 \langle w_j, x \rangle + \| w_j \|^2 \\
  &= \| x \|^2 - 2 \langle w_j, x \rangle + 1
\end{aligned} $$

we find the same solution by selecting an index r such that

$$ \langle w_r, x \rangle = \max_{j = 1, \dots, m} \langle w_j, x \rangle $$

Fig. 65 Kohonen weight vectors

Graphically, since the scalar product < w j , x > is the projection of x on the direction
of w j , we are in fact looking for the weight vector w j that is closest to x. In two
dimensions, consider the example given in Figure 65.
The winning weight vector is $w_1$, being the most similar to the vector x. With the
similarity criterion being the value of $\cos \angle(x, w_j)$, the weight vector lengths should
be identical for this particular way of training, while the normalization should leave
their directions unmodified. Intuitively, it is clear that a very long weight vector could lead to a very
large output value for its associated neuron, even if there was a large angle between
the weight vector and the pattern. This explains the need for weight normalization.
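
The equivalence of the two selection criteria for normalized weight vectors can be checked numerically. The helper names and the sample vectors in the sketch below are illustrative assumptions of ours.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def winner_by_distance(x, weights):
    """Index r minimizing ||x - w_j||."""
    dists = [math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w))) for w in weights]
    return dists.index(min(dists))


def winner_by_scalar_product(x, weights):
    """Index r maximizing <w_j, x>; agrees with the above when all ||w_j|| = 1."""
    dots = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    return dots.index(max(dots))


sample_weights = [normalize(w) for w in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
sample_input = [0.9, 0.2]
```

Without the normalization the two functions may disagree, because a long weight vector can produce a large scalar product even at a large angle to the pattern.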
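After one optimally located neuron has been identified and declared a winner, its
weight must be adjusted so that the distance $\| x - w_r \|$ is reduced in the current training
step, preferably along the gradient direction. Now, using the gradient descent
method,

$$ \begin{aligned}
\frac{\partial \| x - w_r \|^2}{\partial w_{ir}}
  &= \frac{\partial}{\partial w_{ir}} \langle x - w_r, x - w_r \rangle \\
  &= \frac{\partial}{\partial w_{ir}} \left( \langle x, x \rangle - 2 \langle w_r, x \rangle + \langle w_r, w_r \rangle \right) \\
  &= -2 \frac{\partial (w_{1r} x_1 + w_{2r} x_2 + \dots + w_{nr} x_n)}{\partial w_{ir}} + \frac{\partial (w_{1r}^2 + w_{2r}^2 + \dots + w_{nr}^2)}{\partial w_{ir}} \\
  &= -2 x_i + 2 w_{ir} \\
  &= -2 (x_i - w_{ir})
\end{aligned} $$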

Transformed in vector notation, the following adaptation rule for the weight function
must be carried out:

$$ \begin{aligned}
w_r^{new} &:= w_r^{old} + \eta (x - w_r^{old}) \\
  &= (1 - \eta) w_r^{old} + \eta x
\end{aligned} $$

where the constant 2 is, without loss of generality, incorporated in the learning rate
parameter η . It seems reasonable to reward the weights of the winning neuron with
an increment of weight in the negative gradient direction x − wr . The remaining
weight vectors are left unaffected. Note that from this identity, it follows that the
updated weight vector is a convex linear combination of the old weight and the
pattern vectors, as can be seen in the last equation.

10.1.18 Kohonen’s Learning Algorithm

Kohonen’s learning algorithm for unsupervised learning can be summarized in the
following steps:
1. Let r be the winning neuron, then
   $$ w_r^{new} := w_r^{old} + \eta (x - w_r^{old}) $$
   and $o_r := 1$.
2. Normalize the weight vectors by putting
   $$ w_r^{new} := \frac{w_r^{old}}{\| w_r^{old} \|} $$
   and do not affect the nonwinning weight vectors.
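
The two steps can be sketched as follows. The data (two groups of unit vectors around 10 and 80 degrees), the fixed initial weight vectors at 30 and 60 degrees, the learning rate and the number of cycles are assumptions chosen by us to keep the run reproducible; the text itself starts from randomly chosen weights.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def kohonen_step(x, weights, eta):
    """Move the winning weight vector towards x, renormalize it, return its index."""
    dots = [sum(wj * xj for wj, xj in zip(w, x)) for w in weights]
    r = dots.index(max(dots))                                   # winner selection
    moved = [(1.0 - eta) * wj + eta * xj for wj, xj in zip(weights[r], x)]  # step 1
    weights[r] = normalize(moved)                               # step 2
    return r


def unit(deg):
    """Unit vector in the plane at the given angle in degrees."""
    a = math.radians(deg)
    return [math.cos(a), math.sin(a)]


def angle(v):
    """Angle of a plane vector in degrees."""
    return math.degrees(math.atan2(v[1], v[0]))


cluster_data = [unit(d) for d in (5, 10, 15, 75, 80, 85)]
cluster_weights = [unit(30), unit(60)]
for _ in range(20):
    for x in cluster_data:
        kohonen_step(x, cluster_weights, eta=0.3)
```

After training, `angle(cluster_weights[0])` and `angle(cluster_weights[1])` lie inside the angular ranges of the first and second group, respectively, so each weight vector points into one cluster.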


When performed correctly, the Kohonen training process terminates with
final weight vectors that point to the centers of gravity of the different classes that were
distinguished during the training. This network, however, will only be trainable if
the classes of patterns are linearly separable from other classes by hyperplanes
passing through the origin. In order to ensure separability of clusters when the
number of clusters is a priori unknown, the unsupervised training can be performed
with an excessive number of neurons, which provides a certain separability safety
margin. During the training, some neurons are likely not to develop their weights,
and if their weights change chaotically, they will not be considered as indicative
of membership in any of the particular clusters. Therefore such weights can be
omitted during the recall phase, since their outputs do not provide any essential
clustering information. The weights of the remaining neurons should settle at values that are
indicative of clusters (Figure 66).
This approach is tedious, however, and very often leads to unwanted results, such as a
separation into too many classes. It immediately comes to mind that defining the borders
between the different classes as fuzzy sets would be a suitable application of the
latter. Remark also that in many practical cases we use semi-linear activation
functions instead of linear ones.

Fig. 66 Result of Clustering Kohonen Weight Vectors

Fig. 67 The XOR-perceptron

10.1.19 Hidden Layers

We recall that the “XOR” problem mentioned above cannot be solved by a single
layer perceptron neural network. This result is due to Minsky and Papert
([100]). As a solution, a supplementary layer, which will be called
a hidden layer, is needed. The neural network shown in Figure 67 is known to do
the desired trick, with the numbers in the neurons denoting the threshold values.
If the only possible outputs of the neurons are 0 and 1, then it is easy to see that
with the above weight functions and threshold values, z = 1 if and only if (x, y) =
(0, 0) or (x, y) = (1, 1).
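
The role of the hidden layer can be checked with a small threshold network. The weights and thresholds in the sketch below are an illustrative choice of ours and are not the values shown in Figure 67; the two hidden neurons compute OR and AND, and the output neuron combines them into XOR, whose complement is 1 exactly for (0, 0) and (1, 1).

```python
def step(s):
    """Threshold neuron with outputs 0 and 1."""
    return 1 if s >= 0 else 0


def xor_net(x, y):
    """Two hidden threshold neurons (OR, AND) feeding one output neuron."""
    h_or = step(x + y - 0.5)      # fires if at least one input is 1
    h_and = step(x + y - 1.5)     # fires only if both inputs are 1
    return step(h_or - 2 * h_and - 0.5)


xor_table = {(x, y): xor_net(x, y) for x in (0, 1) for y in (0, 1)}
```

No single threshold neuron can reproduce `xor_table`, since the inputs with output 1 cannot be separated from those with output 0 by one line.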
This calls for an interesting generalization. If we study networks with a supplementary
layer, the delta learning rule should also be generalized to neural networks
with a two-layer (or three-layer, if the nodes are counted instead of the synapses)
architecture. Such a network in its most elementary form may look, e.g., like
Figure 68.

Fig. 68 Example of a neural network with hidden layer

A layer with neurons whose outputs are inaccessible to the user, and thus not
comparable to a given data set, will be called a hidden layer.

10.1.20 Generalized Delta Rule

The generalized delta rule is the most often used supervised learning algorithm in
the study of multilayer neural networks. For reasons of simplicity, we will restrict
ourselves to the study of a neural network with one input layer with n inputs
$x = (x_1, \dots, x_n)$, one hidden layer with L nodes $(h_1, \dots, h_L)$, which we, so to speak, cannot
externally control, and one output node o. Denote the weight synapses between input
$x_i$ and hidden node $h_l$ as $w_{li}$ and, in vectorial notation, $w_l = (w_{l1}, \dots, w_{ln})$, and the
weight synapses between hidden node $h_l$ and the output node o as $W_l$, in vectorial
notation, $W = (W_1, \dots, W_L)$.
Let furthermore a set of training patterns $((x^k, y^k))_{k=1}^{K}$ be given. The given problem is
to adjust the weights in such a manner that the total error of the system is minimized
with respect to the given input and output values. Furthermore, we opt for the output
function given by the unipolar sigmoidal activation function (of course other
options as transfer function are possible), for the hidden layer as well as for the
generated output. Hence we define the internal output layer as

$$ \forall l \in \{1, \dots, L\} : o_l^k(\langle w_l, x^k \rangle) = \frac{1}{1 + e^{-\langle w_l, x^k \rangle}} = \frac{1}{1 + e^{-\sum_{j=1}^{n} w_{lj} x_j^k}} $$

and if we put the output vector of the hidden layer as $o^k := (o_1^k, \dots, o_L^k)$, then we
define the external output layer as
$$ O^k(\langle W, o^k \rangle) = \frac{1}{1 + e^{-\langle W, o^k \rangle}} = \frac{1}{1 + e^{-\sum_{l=1}^{L} W_l o_l^k}} $$

A measure for the error on an input/output training pattern is then given by

$$ E(W, w) = \sum_{k=1}^{K} E_k(W, w) = \sum_{k=1}^{K} \frac{1}{2} (y^k - O^k)^2 $$

Again, the appropriate rule for adapting the weight synapses is given by the gra-
dient descent method. Given a learning rate η > 0, we adapt the external and internal
weights following the next iteration process:

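$$ \begin{aligned}
w_{lj}^{new} &:= w_{lj}^{old} - \eta \frac{\partial E(W, w)}{\partial w_{lj}} \\
W_l^{new} &:= W_l^{old} - \eta \frac{\partial E(W, w)}{\partial W_l}
\end{aligned} $$
Analogously to the calculations in Section 10.1.12, and making use of the chain
rule for differentiation, the rules for changing weights will turn out to be, in vectorial
notation,

$$ \begin{aligned}
W^{new} &= W^{old} + \eta (y^k - O^k) O^k (1 - O^k) o^k \\
\forall l \in \{1, \dots, L\} : w_l^{new} &= w_l^{old} + \eta (y^k - O^k) O^k (1 - O^k) W_l o_l^k (1 - o_l^k) x^k
\end{aligned} $$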

as can be verified easily.

10.1.21 Generalized Delta Learning Rule Algorithm

Summarizing, the following algorithm, which we will call the generalized delta
learning rule, here in this case presented in particular for a unipolar sigmoidal
activation function, provides a systematic way to determine the weight functions
wl j and Wl :
1. Choose η > 0 and Emax > 0.
2. Initialize the weight functions wi with small random values; put the error function
E := 0 and let k := 1.
3. The training cycle begins. Take $x^k$ and determine the outputs
   $$ \forall l \in \{1, \dots, L\} : o_l = \frac{1}{1 + e^{-\langle w_l, x^k \rangle}} $$
   and
   $$ O = \frac{1}{1 + e^{-\langle W, o \rangle}} $$
4. Update the output weights by putting
   $$ W^{new} := W^{old} + \eta \delta o $$
   with $\delta = (y - O) O (1 - O)$.
5. Update the hidden layer weights by putting
   $$ \forall l \in \{1, \dots, L\} : w_l^{new} := w_l^{old} + \eta \delta W_l o_l (1 - o_l) x^k $$
6. Cumulate the error function by putting
   $$ E^{new} := E^{old} + \frac{1}{2} (y - O)^2 $$
7. If k < K, then increase k by 1 and go back to step 3.
8. The training cycle is completed. If E ≤ Emax , end the training session; if E >
Emax , set E := 0, k := 1 and go back to step 3 to retrain.
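
Steps 3 to 5 for a single training pattern can be sketched as follows. The function names are hypothetical, and one reading of the algorithm is fixed by assumption: the hidden-layer update uses the output weights $W_l$ from before step 4, which is what the gradient of the single-pattern error prescribes.

```python
import math


def sigmoid(s):
    """Unipolar sigmoidal activation."""
    return 1.0 / (1.0 + math.exp(-s))


def forward(w, W, x):
    """Return the hidden outputs o_l and the network output O."""
    o = [sigmoid(sum(wlj * xj for wlj, xj in zip(wl, x))) for wl in w]
    O = sigmoid(sum(Wl * ol for Wl, ol in zip(W, o)))
    return o, O


def pattern_error(w, W, x, y):
    """Single-pattern error E_k = (y - O)^2 / 2."""
    _, O = forward(w, W, x)
    return 0.5 * (y - O) ** 2


def generalized_delta_step(w, W, x, y, eta):
    """One update of hidden weights w (L x n) and output weights W (length L)."""
    o, O = forward(w, W, x)                                   # step 3
    delta = (y - O) * O * (1.0 - O)
    new_W = [Wl + eta * delta * ol for Wl, ol in zip(W, o)]   # step 4
    # Step 5, using the old output weights W_l (assumption, see lead-in).
    new_w = [
        [wlj + eta * delta * W[l] * o[l] * (1.0 - o[l]) * xj
         for wlj, xj in zip(w[l], x)]
        for l in range(len(w))
    ]
    return new_w, new_W
```

A full training session wraps this step in the loop over k and the test on Emax of steps 6 to 8; comparing the returned increments with finite differences of `pattern_error` is a quick way to check the derivatives.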
As far as the effectiveness of neural networks is concerned, Funahashi showed
in [40] that neural networks with a single, sufficiently large hidden layer are capable
of approximating all continuous functions on compact sets, as stated in the following theorem:

10.1.22 Theorem

Let φ (x) be a nonconstant, bounded and monotone increasing continuous function.
Let $K \subseteq \mathbb{R}^n$ be a compact set and $f : K \to \mathbb{R}$ be a real–valued continuous function
on K. Then for arbitrary ε > 0, there exist an integer N and real constants $w_i$, $w_{ij}$
such that
$$ \tilde{f}(x_1, \dots, x_n) = \sum_{i=1}^{N} w_i \, \varphi \Bigl( \sum_{j=1}^{n} w_{ij} x_j \Bigr) $$
satisfies
$$ \| f - \tilde{f} \|_\infty = \sup_{x \in K} | f(x) - \tilde{f}(x) | \le \varepsilon $$

Stated differently, any continuous mapping on a compact set K can be approximated
with respect to the uniform topology on K by input–output mappings of two-layer
neural networks with φ (x) as output function for the hidden layer and
linear output functions for the output layer.

The previous result can be refined by using the Stone–Weierstrass theorem from
real analysis, to show that certain neural network architectures possess the universal
approximation capability. By using the Stone–Weierstrass theorem in the design of
our networks, we also guarantee that these can compute certain polynomial expres-
sions of a certain set of given functions, as follows:

10.1.23 Theorem (Stone–Weierstrass)

Let K be an n-dimensional compact space and let G be a set of continuous real-valued
functions on K, satisfying the following three conditions:
1. The constant function f (x) = 1 is in G.
2. G is point–separating, i.e. for any two points $x_1 \ne x_2$ in K, there is an f in G such
that $f(x_1) \ne f(x_2)$.
3. If $f_1$ and $f_2$ are two functions in G, then $f_1 f_2$ as well as, for any two real numbers
α , β ∈ R, $\alpha f_1 + \beta f_2$ are in G.
Then G is dense in C(K), the set of continuous real-valued functions on K. In other
words, for any ε > 0 and for any f ∈ C(K), there exists g ∈ G such that

$$ \| f - g \|_\infty = \sup_{x \in K} | f(x) - g(x) | \le \varepsilon $$

10.2 Neuro-fuzzy Hybrid Systems

10.2.1 Introduction

The aim of any hybrid system is to join the strengths of several intelligent
computing techniques, and hence to reinforce the control method as a whole. Every
intelligent technique has particular computational properties that make it suited
for application in particular problems and not for others. For example, neural
networks have a particularly great reputation when it comes to solving pattern
recognition problems, but perform rather poorly at the process of decision making.
On the other hand, fuzzy logic is a very suitable instrument for making decisions
and for studying the transparency of how a certain decision is reached, but its design
is not at all suited for, e.g., automatically generating the rules that are responsible
for those decisions. These limitations have been a central driving force behind
the creation of intelligent hybrid systems, where two or more techniques are combined
in such a way that they reinforce each other's strengths and overcome each other's
limitations.
Also, hybrid systems are designed to take into account the “best of both worlds”
when trying to model an application, which may be of a very variable nature, and
therefore may be a complex superposition of different components that require
different approaches. For instance, when some application consists of a combination
of a signal processing task with a decision process, then a well-chosen combination
of a neural network and a fuzzy controller may yield a better result than each of
the approaches separately. One example that immediately comes to mind is using
neural networks to tune the membership functions of a fuzzy controller, given a
particular training set. The fuzzy control rules may be heuristic in nature, or apply
exterior knowledge, but, although self-tuning fuzzy controllers are at hand, as we
have pointed out before, the use of a neural network may considerably shorten
the design time, especially when the process to be controlled is quite intricate, while
at the same time improving the performance. Other successful combinations of both
may include the extension of the crisp output space {0, 1} to its fuzzy counterpart
[0, 1], or the application of fuzzy borders in clustering problems, as was described
in Section 10.1.17.
Strictly speaking however, neural networks and fuzzy controllers can be proven
to be equivalent, yet their method of design and parametrization each has its own
advantages and disadvantages. Neural networks generally tend to be slow in learn-
ing, and hard to analyze, because for instance the analysis of the behavior and the
separate influence of their weight functions is by no means straightforward. Neither
is it possible to extract structural knowledge such as fuzzy control rules directly
from a trained neural network, nor to change the network as to simplify certain
computations, which is in its turn very easy in fuzzy control theory. One may for
instance omit the defuzzification step in favor of applying a Sugeno-type fuzzy con-
troller (see Section 7.2), or join several fuzzy rules into one, leave out rules that turn
out to be insignificant as to speed up the control process, or split up rules to create
a more detailed precision. Without the use of neural networks, fuzzy control will
only be possible on a relatively simple universe of discourse where a great deal of
expert knowledge is available, and the number of input variables is small. There-
fore, neural networks are of use when either trying to fine-tune the parameters of an
existing fuzzy rule base, or to create new rules.
In our first example, we will see how we can manipulate a given fuzzy antecedent
rule base in order to achieve a neural net with both inputs and outputs which are
fuzzy sets, hence incorporating in the model a certain degree of uncertainty.

10.2.2 Definition (Neural Fuzzy Net)

Let an output block of K fuzzy controller rules

$$ k : \text{IF } (X_1 = A_{k1}) \text{ and } \dots \text{ and } (X_n = A_{kn}) \text{ THEN } (Y_1 = B_{k1}) \text{ and } \dots \text{ and } (Y_m = B_{km}) $$

with k ∈ {1, ..., K} be given. Then every combination of input and output vectors
can be considered as a training pattern for a neural network, where the antecedent is
the input and the consequence is the output for a neural net. Such a neural net with
fuzzy sets as inputs and outputs will be called a fuzzy neural net.

10.2.3 Special Cases

1. Especially, in case n = m = 1, we get a single-input, single-output (SISO) fuzzy
rule base with rules

k : IF (X = Ak ) THEN (Y = Bk )

The input–output training set then consists of the pairs (Ak , Bk ).
2. If n = 2 and m = 1, we get a two-input, single-output (MISO) fuzzy rule base
with rules

k : IF (X1 = Ak ) and (X2 = Bk ) THEN (Y = Ck )

and in that case the input–output training pairs consist of the vectors ((Ak , Bk ),Ck ).
3. It is also possible to consider multiple output networks, such as the two-input,
two-output (MIMO) rules

k : IF (X1 = Ak ) and (X2 = Bk ) THEN (Y1 = Ck ) and (Y2 = Dk )

for which the input–output training pairs are ((Ak , Bk ), (Ck , Dk )).

10.2.4 Standard Error Backpropagation Networks

One of the simplest methods to incorporate the fuzzy component into a neural
network is to sample the membership functions at a discrete number of input and
output points, and to take the sampled membership values as input and output
values for the neural network ([147]). Let us for
instance consider a SISO network, let $[\alpha_1, \alpha_2]$ be an interval containing all possible input
values, such that
$$ \forall k \in \{1, \dots, K\} : \operatorname{supp}(\mu_{A_k}) \subseteq [\alpha_1, \alpha_2] $$
and let $[\beta_1, \beta_2]$ be an interval containing all possible output values, such that
$$ \forall k \in \{1, \dots, K\} : \operatorname{supp}(\mu_{B_k}) \subseteq [\beta_1, \beta_2] $$
Then we divide the intervals $[\alpha_1, \alpha_2]$ and $[\beta_1, \beta_2]$ in equal parts. Choose two arbitrary
constants $M, N \in \mathbb{N}_0$, then put
$$ \begin{aligned}
\forall i \in \{0, \dots, M\} &: x_i := \alpha_1 + \frac{i}{M} (\alpha_2 - \alpha_1) \\
\forall j \in \{0, \dots, N\} &: y_j := \beta_1 + \frac{j}{N} (\beta_2 - \beta_1)
\end{aligned} $$
Then a discrete version of the continuous training set is given by the input/output
pairs
$$ \{ ((A_k(x_0), A_k(x_1), \dots, A_k(x_M)), (B_k(y_0), B_k(y_1), \dots, B_k(y_N))) \}_{k=1}^{K} $$

Putting $a_{ki} = A_k(x_i)$ and $b_{kj} = B_k(y_j)$, the fuzzy neural network reduces to an ordinary
neural network with (M + 1) inputs and (N + 1) outputs, which can be trained
by the generalized delta rule from 10.1.21.
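
The discretization can be sketched as follows. The triangular membership shape and the names `grid`, `triangular` and `discretize_rule` are assumptions of ours, chosen only to have concrete fuzzy sets to sample; any membership functions with support in the given intervals could be used instead.

```python
def grid(lo, hi, parts):
    """Points lo + i * (hi - lo) / parts for i = 0, ..., parts."""
    return [lo + i * (hi - lo) / parts for i in range(parts + 1)]


def triangular(left, peak, right):
    """Hypothetical triangular membership function with support [left, right]."""
    def mu(t):
        if t <= left or t >= right:
            return 0.0
        if t <= peak:
            return (t - left) / (peak - left)
        return (right - t) / (right - peak)
    return mu


def discretize_rule(mu_a, mu_b, alpha, beta, M, N):
    """Turn one rule (A_k, B_k) into a crisp training pair of sampled values."""
    inputs = [mu_a(x) for x in grid(alpha[0], alpha[1], M)]
    outputs = [mu_b(y) for y in grid(beta[0], beta[1], N)]
    return inputs, outputs


sampled_in, sampled_out = discretize_rule(
    triangular(0.0, 1.0, 2.0), triangular(0.0, 2.0, 4.0),
    alpha=(0.0, 2.0), beta=(0.0, 4.0), M=4, N=2,
)
```

Each rule thus yields one vector of M + 1 input values and one vector of N + 1 target values for an ordinary network.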

10.2.5 Modifications

Uehara and Fujise proposed in [146] to work with a finite number of α -levels of
the fuzzy set to represent the fuzzy numbers, which leads to a generally similar
approach.
Another idea is to replace, in selected applications, certain elements in the definition
of a neural network by their counterparts from fuzzy set theory. These generally
simple modifications lead to a fuzzy neural architecture based on fuzzy arithmetic
operations. While generally, the transfer function is given by
$$ o(\langle w, x \rangle) = f(\langle w, x \rangle) = f \Bigl( \sum_{j=1}^{n} w_j x_j \Bigr) $$

a more general definition might be the following, with the additional condition
that the arguments $x_j$ as well as the weight functions $w_j$ are in [0, 1];
otherwise a rescaling is required:

10.2.6 Definition (Hybrid Neural Network)

A hybrid neural network is a neural network with crisp signals and weight functions
in [0, 1], crisp transfer functions f : [0, 1] → [0, 1], but where the following deviations
with respect to an ordinary neural network are allowed:
1. Instead of combining $x_j$ and $w_j$ to the product $w_j x_j$, any t–norm (or t–conorm, or
other continuous operation) is allowed.
2. Instead of combining $w_1 x_1, w_2 x_2, \dots, w_n x_n$ to the sum $\sum_{j=1}^{n} w_j x_j$, any t–conorm (or
t–norm, or other continuous operation) is allowed.
3. f may be replaced by any continuous function from the input set to the output
set.
Contrarily, a hybrid neural net may not use multiplication, addition, or a sigmoidal
function (because the results of these operations are not necessarily in the unit
interval). A processing element of a hybrid neural net is called a fuzzy neuron.

10.2.7 Examples

1. Given a t–norm T and a t–conorm S, and a weight vector $(w_1, w_2)$. Then the
output function, which we will call the AND–composition, is given by

$$ o(x_1, x_2) := T(S(w_1, x_1), S(w_2, x_2)) $$

In particular if T = min and S = max, we call this fuzzy neuron the min–max
composition.
2. Again given a t–norm T and a t–conorm S, and a weight vector $(w_1, w_2)$. Then
the output function, which we will call the OR–composition, is given by

$$ o(x_1, x_2) := S(T(w_1, x_1), T(w_2, x_2)) $$

In particular if T = min and S = max, we call this fuzzy neuron the max–min
composition.

It is now really quite simple to change the arguments and weight functions of a
hybrid neural network from elements in [0, 1] to fuzzy sets which are elements in
F(X). All definitions above remain valid when the arguments are fuzzy sets, and
the operations are naturally extended pointwise to the image space
of the fuzzy sets.
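
For crisp arguments in [0, 1], the two compositions can be sketched as below. The function names are ours, and the extension to n inputs via `functools.reduce` is an assumption going slightly beyond the two-input examples; min and max are the defaults, and the product with the probabilistic sum is included as a second, standard pair of t–norm and t–conorm.

```python
from functools import reduce


def and_neuron(w, x, t_norm=min, t_conorm=max):
    """AND-composition T(S(w_1, x_1), ..., S(w_n, x_n))."""
    return reduce(t_norm, (t_conorm(wj, xj) for wj, xj in zip(w, x)))


def or_neuron(w, x, t_norm=min, t_conorm=max):
    """OR-composition S(T(w_1, x_1), ..., T(w_n, x_n))."""
    return reduce(t_conorm, (t_norm(wj, xj) for wj, xj in zip(w, x)))


def product(a, b):
    """Product t-norm."""
    return a * b


def probabilistic_sum(a, b):
    """Probabilistic sum t-conorm a + b - ab."""
    return a + b - a * b
```

With the weights (0.3, 0.8) and the inputs (0.6, 0.5), the min–max composition gives min(max(0.3, 0.6), max(0.8, 0.5)) = 0.6 and the max–min composition gives max(min(0.3, 0.6), min(0.8, 0.5)) = 0.5.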

10.2.8 Neural IF–THEN Rules

The most effective way a subprocess of fuzzy control can benefit from techniques
of neural networking is by having the network steer the process of adjusting the
parameters of the fuzzy linguistic variables. Since the effectiveness of the fuzzy
models representing nonlinear input–output relationships depends strongly on the way
the input–output spaces are partitioned, the tuning of membership functions
will always be a very important issue in fuzzy modelling. Since this tuning task can
be viewed as an optimization problem, neural networks offer a possibility for
effectively solving it. It is also reasonable to assume that the membership functions belong
to a certain parametric class of shapes that are heuristically feasible, yet broadly
enough adjustable, so that the parameters can be trained by a neural network, given
once again a set of correct training input–output values.
Let the fuzzy training data be given by

$$ \forall k \in \{1, \dots, K\} : x^k = (x_1^k, x_2^k, \dots, x_n^k) \to y^k $$

and let us for the set of fuzzy rules particularly focus on a Sugeno controller (see
Section 7.2) with rules

$$ i : \text{IF } (X_1 = A_{i1}) \text{ and } (X_2 = A_{i2}) \text{ and } \dots \text{ and } (X_n = A_{in}) \text{ THEN } (Y = z_i) $$

for i ∈ {1, ..., m} and $z_i \in \mathbb{R}$.
As an aggregation operation for the firing level of the i–th rule, we are allowed
to use any t–norm; we will for instance choose the product operator
$$ \alpha_i(x^k) = \prod_{j=1}^{n} \mu_{A_{ij}}(x_j^k) $$

and the output of the system will be computed by a discretized version of the Center-
of-Gravity defuzzification method as
$$ o^k := \frac{\sum_{i=1}^{m} \alpha_i(x^k) z_i}{\sum_{i=1}^{m} \alpha_i(x^k)} $$

First of all, we can derive the most appropriate values for $z_i$ by minimizing the
total error function of the quadratic sum of the errors and using a gradient descent
method.
$$ E = \sum_{k=1}^{K} E_k = \sum_{k=1}^{K} \frac{1}{2} (o^k - y^k)^2 $$

We then have that ∀i ∈ {1, ..., m}:

$$ \begin{aligned}
z_i(t + 1) &= z_i(t) - \eta \frac{\partial E_k}{\partial z_i} \\
  &= z_i(t) - \eta (o^k - y^k) \frac{\alpha_i(x^k)}{\sum_{l=1}^{m} \alpha_l(x^k)}
\end{aligned} $$

Here t indexes the number of adjustments made to the parameters, and can therefore
be considered as a discrete time parameter.
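
The firing levels, the discretized Center-of-Gravity output and the update of the consequents can be sketched as follows. The function names and the representation of the membership functions as a nested list of callables are illustrative assumptions of ours.

```python
def firing_levels(memberships, x):
    """alpha_i(x) as the product over j of mu_{A_ij}(x_j).

    memberships[i][j] is a callable standing for the membership function of A_ij.
    """
    levels = []
    for rule in memberships:
        alpha = 1.0
        for mu, xj in zip(rule, x):
            alpha *= mu(xj)
        levels.append(alpha)
    return levels


def sugeno_output(alphas, z):
    """Discretized Center-of-Gravity output sum(alpha_i z_i) / sum(alpha_i)."""
    return sum(a * zi for a, zi in zip(alphas, z)) / sum(alphas)


def update_consequents(z, alphas, y, eta):
    """One gradient step z_i := z_i - eta (o - y) alpha_i / sum(alpha)."""
    o = sugeno_output(alphas, z)
    total = sum(alphas)
    return [zi - eta * (o - y) * a / total for zi, a in zip(z, alphas)]
```

For instance, with firing levels (0.2, 0.6), consequents (1, 3) and desired output 2, the system output is 2.5; one step with η = 0.5 moves the consequents to (0.9375, 2.8125), which brings the output to about 2.34, closer to the target.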
But also the parameters of fuzzy numbers in the premises can be adjusted by
the gradient descent method. Rather than explaining all available possibilities for a
wide scope of choices for the fuzzy set shapes, we will use an example to illustrate
the process.

10.2.9 Example

Consider a fuzzy controller consisting of two fuzzy rules with one input and one
output variable, as follows:

$$ \begin{aligned}
1 &: \text{IF } (x = A_1) \text{ THEN } (Y = z_1) \\
2 &: \text{IF } (x = A_2) \text{ THEN } (Y = z_2)
\end{aligned} $$

with fuzzy membership functions

$$ \begin{aligned}
\mu_{A_1}(x) &= \frac{1}{1 + e^{b_1 (x - a_1)}} \\
\mu_{A_2}(x) &= \frac{1}{1 + e^{b_2 (x - a_2)}}
\end{aligned} $$

where $a_1$, $a_2$, $b_1$ and $b_2$ are adjustable parameters for the premises. Let a given value
x be the input to the fuzzy system, and let the firing levels of the rules be
$\alpha_1 := \mu_{A_1}(x)$ and $\alpha_2 := \mu_{A_2}(x)$. Then the output of the system is computed by the discrete
COG-defuzzification as
$$ o = \frac{\alpha_1 z_1 + \alpha_2 z_2}{\alpha_1 + \alpha_2} $$
Suppose furthermore that we have a training set $(x^k, y^k)_{k=1}^{K}$ at our disposal. Then
our problem is reduced to finding the two fuzzy rules with appropriate membership
functions and consequence parts that generate the given input-output pairs. This
means that we have to adjust the following parameters:
• $a_1$, $a_2$, $b_1$ and $b_2$, the parameters of the fuzzy numbers representing the linguistic
variables
• $z_1$ and $z_2$, the values of the consequences of the Sugeno controller
Once more, we will use the gradient descent method on the total sum of quadratic
errors
$$ E = \sum_{k=1}^{K} E_k = \sum_{k=1}^{K} \frac{1}{2} \bigl( o^k(a_1, a_2, b_1, b_2, z_1, z_2) - y^k \bigr)^2 $$

where $o^k$ is the computed output from the fuzzy system corresponding to the input
pattern $x^k$, and $y^k$ is the desired output.
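First of all, we determine the adjustment for $z_i$ in the consequence; that is,

$$ \begin{aligned}
z_1(t + 1) &= z_1(t) - \eta \frac{\partial E_k}{\partial z_1} \\
  &= z_1(t) - \eta (o^k - y^k) \frac{\alpha_1}{\alpha_1 + \alpha_2} \\
  &= z_1(t) - \eta (o^k - y^k) \frac{\mu_{A_1}(x^k)}{\mu_{A_1}(x^k) + \mu_{A_2}(x^k)} \\
z_2(t + 1) &= z_2(t) - \eta \frac{\partial E_k}{\partial z_2} \\
  &= z_2(t) - \eta (o^k - y^k) \frac{\alpha_2}{\alpha_1 + \alpha_2} \\
  &= z_2(t) - \eta (o^k - y^k) \frac{\mu_{A_2}(x^k)}{\mu_{A_1}(x^k) + \mu_{A_2}(x^k)}
\end{aligned} $$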

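In a similar manner we can find the shape parameters (center and slope) of the
membership functions $\mu_{A_1}$ and $\mu_{A_2}$:

$$ \begin{aligned}
a_1(t + 1) &= a_1(t) - \eta \frac{\partial E_k}{\partial a_1} \\
a_2(t + 1) &= a_2(t) - \eta \frac{\partial E_k}{\partial a_2} \\
b_1(t + 1) &= b_1(t) - \eta \frac{\partial E_k}{\partial b_1} \\
b_2(t + 1) &= b_2(t) - \eta \frac{\partial E_k}{\partial b_2}
\end{aligned} $$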
Let us furthermore assume that the parameters of the fuzzy membership functions
are not independent. In fact, it is reasonable to assume that $a := a_1 = a_2$ and
$b := b_1 = -b_2$. In that case the fuzzy membership functions become
$$ \begin{aligned}
\mu_{A_1}(x) &= \frac{1}{1 + e^{b (x - a)}} \\
\mu_{A_2}(x) &= \frac{1}{1 + e^{-b (x - a)}}
\end{aligned} $$
and form a partition of unity (see Definition 2.3.1, and for example, see Figure 69)
since
$$ \forall x : \mu_{A_1}(x) + \mu_{A_2}(x) = 1. $$
In that case, the number of shape parameters to be adjusted is reduced by half, from
four to two, doubling the efficiency of this part of the algorithm, and we get

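$$ \begin{aligned}
a(t + 1) &= a(t) - \eta \frac{\partial E_k(a, b)}{\partial a} \\
  &= a(t) - \eta (o^k - y^k) \frac{\partial o^k}{\partial a} \\
  &= a(t) - \eta (o^k - y^k) \frac{\partial}{\partial a} \bigl( z_1 \mu_{A_1}(x^k) + z_2 \mu_{A_2}(x^k) \bigr) \\
  &= a(t) - \eta (o^k - y^k) \frac{\partial}{\partial a} \bigl( z_1 \mu_{A_1}(x^k) + z_2 (1 - \mu_{A_1}(x^k)) \bigr) \\
  &= a(t) - \eta (o^k - y^k) (z_1 - z_2) \frac{\partial \mu_{A_1}(x^k)}{\partial a} \\
  &= a(t) - \eta (o^k - y^k) (z_1 - z_2) \, b \, \frac{e^{b (x^k - a)}}{\bigl( 1 + e^{b (x^k - a)} \bigr)^2} \\
  &= a(t) - \eta (o^k - y^k) (z_1 - z_2) \, b \, \mu_{A_1}(x^k) (1 - \mu_{A_1}(x^k)) \\
  &= a(t) - \eta (o^k - y^k) (z_1 - z_2) \, b \, \mu_{A_1}(x^k) \mu_{A_2}(x^k)
\end{aligned} $$

Fig. 69 Complementary fuzzy partition
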

and
∂ Ek (a, b)
b(t + 1) = b(t) − η
∂b
∂ ok
= b(t) − η (ok − yk )
∂b

= b(t) − η (ok − yk ) (z1 µA1 (xk ) + z2 µA2 (xk ))
∂b
k ∂
= b(t) − η (o − y ) (z1 µA1 (xk ) + z2 (1 − µA1 (xk )))
k
∂b
∂ µA1 (xk )
= b(t) − η (ok − yk )(z1 − z2 )
∂b
eb(x −a)
k

= b(t) + η (ok − yk )(z1 − z2 )(xk − a)  2


1 + eb(x −a)
k

= b(t) + η (ok − yk )(z1 − z2 )(xk − a)µA1 (x)(1 − µA1 (x))


= b(t) + η (ok − yk )(z1 − z2 )(xk − a)µA1 (x)µA2 (x)
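As a sanity check on the two update rules for a and b, the following sketch compares the analytic partial derivatives of E_k with central finite differences. It is a minimal illustration under stated assumptions: the function names and the sample values of x, y, a, b, z1 and z2 are hypothetical and do not come from the text.

```python
import math


def mu1(x, a, b):
    """Membership function mu_A1(x) = 1 / (1 + exp(b (x - a)))."""
    return 1.0 / (1.0 + math.exp(b * (x - a)))


def output(x, a, b, z1, z2):
    """Sugeno output o = z1*mu_A1 + z2*mu_A2, using mu_A2 = 1 - mu_A1."""
    m1 = mu1(x, a, b)
    return z1 * m1 + z2 * (1.0 - m1)


def error(x, y, a, b, z1, z2):
    """Quadratic error E_k = (o - y)^2 / 2 for a single pattern."""
    return 0.5 * (output(x, a, b, z1, z2) - y) ** 2


def analytic_gradients(x, y, a, b, z1, z2):
    """Partial derivatives of E_k with respect to a and b (closed form)."""
    m1 = mu1(x, a, b)
    m2 = 1.0 - m1
    delta = output(x, a, b, z1, z2) - y
    dE_da = delta * (z1 - z2) * b * m1 * m2
    dE_db = -delta * (z1 - z2) * (x - a) * m1 * m2
    return dE_da, dE_db


def numeric_gradients(x, y, a, b, z1, z2, h=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    dE_da = (error(x, y, a + h, b, z1, z2) - error(x, y, a - h, b, z1, z2)) / (2 * h)
    dE_db = (error(x, y, a, b + h, z1, z2) - error(x, y, a, b - h, z1, z2)) / (2 * h)
    return dE_da, dE_db


# Hypothetical sample values, chosen only for illustration.
sample = dict(x=2.5, y=0.3, a=2.0, b=1.5, z1=1.0, z2=-0.5)
analytic = analytic_gradients(**sample)
numeric = numeric_gradients(**sample)
print(analytic, numeric)
```

The signs in the update rules follow directly: a(t+1) = a(t) − η·dE_da and b(t+1) = b(t) − η·dE_db, where dE_db already carries the minus sign that turns the b-update into a sum.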

10.2.10 Generalized Delta Rule with Fuzzy Membership Functions

For an arbitrary algorithm in which the parameters of the fuzzy variables still have to
be determined, given a training set $\{(x^k, y^k)\}_{k=1}^{K}$, the following steps should be carried
out:
1. Choose η > 0.
2. Take initial values for all parameters involved in the problem, and put the error
function E := 0 and let k := 1.
3. The training cycle begins. Take $x^k$ and compute the output $o^k$ as the output given
by the algorithm, which may contain some unknown parameters.
4. Adjust each of the parameters involved, denoted generically by $a$, by
\[
a(t+1) = a(t) - \eta \frac{\partial E_k}{\partial a}
\]
where the energy function is defined as
\[
E_k = \frac{1}{2} \left( o^k(x^k) - y^k \right)^2
\]
5. Accumulate the error function by putting
\[
E^{\mathrm{new}} := E^{\mathrm{old}} + \frac{1}{2} \left( o^k(x^k) - y^k \right)^2
\]
6. If $k < K$, then increase $k$ by 1 and go back to step 3.
7. The training cycle is completed. If $E \le E_{\max}$, a certain predefined threshold, end
the training session; otherwise, set $E := 0$ and $k := 1$, and go back to step 3 to retrain.
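The steps above can be sketched for the two-rule Sugeno system with the simplification a := a1 = a2 and b := b1 = −b2. This is only an illustrative sketch: the function names, the initial parameter values, the learning rate, the training data and the cap on the number of training cycles (max_cycles, added so that the loop always terminates) are assumptions and not part of the text.

```python
import math


def mu1(x, a, b):
    """Membership function mu_A1(x) = 1 / (1 + exp(b (x - a)))."""
    return 1.0 / (1.0 + math.exp(b * (x - a)))


def train(samples, a, b, z1, z2, eta=0.1, e_max=1e-3, max_cycles=500):
    """Generalized delta rule for the parameters a, b, z1, z2."""
    history = []                      # cumulated error E of every training cycle
    for _ in range(max_cycles):
        total_error = 0.0             # E := 0 at the start of a cycle
        for x, y in samples:          # steps 3 to 6: one pass over the patterns
            m1 = mu1(x, a, b)
            m2 = 1.0 - m1             # partition of unity
            delta = z1 * m1 + z2 * m2 - y
            # All partial derivatives are evaluated with the current
            # parameters before any parameter is updated.
            grad_z1 = delta * m1
            grad_z2 = delta * m2
            grad_a = delta * (z1 - z2) * b * m1 * m2
            grad_b = -delta * (z1 - z2) * (x - a) * m1 * m2
            z1 -= eta * grad_z1
            z2 -= eta * grad_z2
            a -= eta * grad_a
            b -= eta * grad_b
            total_error += 0.5 * delta ** 2   # step 5
        history.append(total_error)
        if total_error <= e_max:      # step 7: stop or retrain
            break
    return (a, b, z1, z2), history


# Hypothetical training data generated by a system with a = 2, b = 2, z1 = 1, z2 = 0.
samples = [(0.5 * i, mu1(0.5 * i, 2.0, 2.0)) for i in range(9)]
params, history = train(samples, a=1.5, b=1.0, z1=0.8, z2=0.2)
print(params, history[0], history[-1])
```

The list `history` makes it easy to inspect whether the cumulated error E actually decreases from one training cycle to the next.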
In 1993 Jang showed ([67]) that fuzzy inference systems with Sugeno fuzzy IF–THEN
rules are universal approximators, i.e. they can approximate any continuous
function on a compact set to arbitrary accuracy. In practice this means that the output
of the fuzzy system can be brought as close as desired to the values of the function to be
approximated by using sufficiently many fuzzy terms (and consequently more rules) in the rule base.

10.2.11 Neuro-Fuzzy Classifiers

Another important application of neural networks is pattern classification. A neural
network is trained by presenting sample patterns together with the category label
to which they are assigned. The major drawback is that it is certainly not straightforward to
define the boundaries between the different classes, as we already explained in Subsection
10.1.17. If the boundaries can only be defined by hyperplanes, as assumed
before, it is for example not possible to classify the set of data of Figure 70 into two
separate classes.
And on top of that, this is only a two-dimensional problem with one classification
feature. When the number of classes and the dimension of the space increase, the
problem becomes intricately complex. One obvious solution is then to define fuzzy
boundary conditions, so that an overlapping area is created where the membership
value in two or more different classes can at the same time be nonzero. This solu-
tion tackles both problems at once, and also reflects the reality of many real-world
applications, where classification according to certain features need not be unique
either.
Fig. 70 Data which cannot be divided into two classes

A classifier can be defined by a number of IF–THEN rules, where as usual, $K$
$n$-dimensional pattern vectors $\{x^k = (x_1^k, x_2^k, \ldots, x_n^k)\}_{k=1}^{K}$ are given as a training set,
each of which can, e.g., be considered as crisply belonging to one of two classes. Then
a fuzzy rule base for the classification problem looks like this:

IF ($x_1^1 = A_1^1$) and ... and ($x_n^1 = A_n^1$) THEN $x^1$ belongs to class 1
IF ($x_1^2 = A_1^2$) and ... and ($x_n^2 = A_n^2$) THEN $x^2$ belongs to class 2
...

where $A_i^k$ are linguistic variables that characterize the properties of the classes. By
combining the individual rules by means of the appropriate aggregation functions,
such as t–norms and t–conorms, the different actions are considered together, and
based on the result of pattern matching between rule antecedents and input signals, a
number of fuzzy rules are triggered in parallel with various values of firing strength.
Furthermore, we want the system to have the capability to learn, and hence to
update and fine-tune itself, based on newly acquired information. The task of fuzzy
classification is to generate an appropriate fuzzy partition of the feature space; in
this context the word “appropriate” means that the number of misclassified patterns
should be minimized. Also, the rule base should be optimized by deleting rules
which are not used or have a negligible influence.
To achieve this goal, each of the input domains is assigned a partition of unity
as an antecedent rule base. The firing strength of a rule for a pattern vector
$x^k = (x_1^k, x_2^k, \ldots, x_n^k)$ is the combination of the degrees to which the components
of $x^k$ satisfy the rule antecedents, and it is realized through a t–norm; recall that the
minimum is the largest t–norm. A pattern vector $x^k$ is then suitably
classified as belonging to class $j$ if and only if its firing strength is larger than or
equal to 0.5. For a given input pattern $x^k$, the rule antecedent is formed by taking,
for each input feature, the fuzzy set that yields the highest degree of membership.
If this rule antecedent combination is not yet present as an existing rule in the rule base,
a new rule is created. This method does not, however, guarantee that no patterns are
misclassified. In particular, misclassification may happen when the fuzzy partition is not
set up correctly, or when the number of fuzzy linguistic variables is too small.
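The rule-creation step described above can be sketched as follows. The sketch assumes, purely for illustration, that every input domain is [0, 1] and carries the same partition of unity made of uniformly spaced triangular fuzzy sets, and that the minimum is used as t–norm; the function names and the small training set are hypothetical.

```python
def memberships(value, centers):
    """Degrees of membership in uniformly spaced triangular fuzzy sets.

    For a value between the first and the last center the degrees sum to 1,
    so the sets form a partition of unity.
    """
    width = centers[1] - centers[0]
    return [max(0.0, 1.0 - abs(value - c) / width) for c in centers]


def best_antecedent(pattern, centers):
    """For every feature pick the fuzzy set with the highest membership.

    Returns the tuple of selected set indices and the firing strength,
    computed with the minimum t-norm.
    """
    indices = []
    strength = 1.0
    for value in pattern:
        degrees = memberships(value, centers)
        best = max(range(len(degrees)), key=degrees.__getitem__)
        indices.append(best)
        strength = min(strength, degrees[best])
    return tuple(indices), strength


def build_rule_base(training_set, centers):
    """Create a rule for every antecedent combination not seen before."""
    rules = {}
    for pattern, label in training_set:
        antecedent, _ = best_antecedent(pattern, centers)
        if antecedent not in rules:
            rules[antecedent] = label
    return rules


centers = [0.0, 0.5, 1.0]   # three fuzzy sets per input (assumed)
training_set = [((0.1, 0.9), 1), ((0.15, 0.8), 1), ((0.5, 0.5), 2)]
rules = build_rule_base(training_set, centers)
print(rules)
```

With such a partition the highest degree of membership of each feature is at least 0.5, so the firing strength of the selected antecedent under the minimum t–norm is also at least 0.5, which matches the threshold mentioned in the text.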
Since a general description of this method, incorporating all the possible choices
for aggregation operators, shapes of the membership functions, number of input and
output values and degree of overlap, would lead to an overly general meta-description
of the method of neuro-classification, we will once more restrict ourselves to giving a
few detailed examples, in several dimensions.

10.2.12 Example

Consider the following example, where a number of patterns have to be subdivided
into two classes. A training set is given by the set of data given in Figure 71.

Fig. 71 Training set (fuzzy sets µ1, µ2, µ3 on the first input axis and ν1, ν2, ν3 on the second)

The two-dimensional space is partitioned into nine subspaces. The following nine
rules can be generated from the partitions in the figure:

R1 : IF ($x_1 = A_{\mu_1}$) and ($x_2 = A_{\nu_1}$) THEN $x \in C_1$
R2 : IF ($x_1 = A_{\mu_1}$) and ($x_2 = A_{\nu_2}$) THEN $x \in C_1$
R3 : IF ($x_1 = A_{\mu_1}$) and ($x_2 = A_{\nu_3}$) THEN $x \in C_1$
R4 : IF ($x_1 = A_{\mu_2}$) and ($x_2 = A_{\nu_1}$) THEN $x \in C_2$
R5 : IF ($x_1 = A_{\mu_2}$) and ($x_2 = A_{\nu_2}$) THEN $x \in C_2$
R6 : IF ($x_1 = A_{\mu_2}$) and ($x_2 = A_{\nu_3}$) THEN $x \in C_2$
R7 : IF ($x_1 = A_{\mu_3}$) and ($x_2 = A_{\nu_1}$) THEN $x \in C_1$
R8 : IF ($x_1 = A_{\mu_3}$) and ($x_2 = A_{\nu_2}$) THEN $x \in C_2$
R9 : IF ($x_1 = A_{\mu_3}$) and ($x_2 = A_{\nu_3}$) THEN $x \in C_1$

where $A_{\mu_i}$ denotes the $i$-th linguistic variable for the first input, represented by the
fuzzy set $\mu_i$, and where $A_{\nu_j}$ denotes the $j$-th linguistic variable for the second input,
represented by the fuzzy set $\nu_j$. Two observations should now be obvious:
• The contraction of rules R4, R5 and R6 to one single rule

R456 : IF ($x_1 = A_{\mu_2}$) THEN $x \in C_2$

does not in any way influence the classification, so we reach the same precision
with fewer rules.
• Nevertheless, the number of rules seems to be too small, as there are clearly two
misclassified patterns in the example.

10.2.13 Example

Consider another example (Figure 72), in which we will show that this reduction of
the number of rules can be quite drastic.
If one tried to classify all the given patterns by fuzzy rules based on a simple
fuzzy grid, a fine fuzzy partition and 6 × 6 = 36 rules would be necessary.

Fig. 72 Another training set (fuzzy sets µ1, ..., µ6 on the first input axis and ν1, ..., ν6 on the second)

However, it is easy to see that the patterns may be correctly classified with only the
following five IF–THEN rules:

R1 : IF ($x_1 = A_{\mu_1}$) THEN $x \in C_1$
R2 : IF ($x_1 = A_{\mu_6}$) THEN $x \in C_1$
R3 : IF ($x_2 = A_{\nu_1}$) THEN $x \in C_1$
R4 : IF ($x_2 = A_{\nu_6}$) THEN $x \in C_1$
R5 : IF ($x_1 \neq A_{\mu_1}$) and ($x_1 \neq A_{\mu_6}$)
     and ($x_2 \neq A_{\nu_1}$) and ($x_2 \neq A_{\nu_6}$) THEN $x \in C_2$

10.2.14 Example (Adaptive Network-based Fuzzy Inference System)

Another example of the successful combination of neural networks and fuzzy linguistic
variables is given by Sun and Jang ([68]), who have successfully constructed
a fuzzy classifier based on an adaptive network, which they call an ANFIS (adaptive
network-based fuzzy inference system) structure. The architecture is shown in
Figure 73.
Given two input variables x1 and x2 , the training data set is categorized into two
classes C1 and C2 . Each input is supposed to satisfy to a certain degree two linguistic
terms, hence we have four rules.
• In the first layer, the output is defined as the degree to which the given input
satisfies the given linguistic variable. Fuzzy variables describing this degree of
membership may be of the following normal and convex shape:
\[
\mu_{A_i}(x) = \exp\left( -\frac{1}{2} \left( \frac{x - a_{i1}}{b_{i1}} \right)^2 \right)
\]
where $\{a_{ij}, b_{ij}\}_{i,j=1}^{2}$ are parameters that still have to be determined. The shape
of the membership function changes as a function of these parameters. The
functions may, e.g., also have a trapezoidal or triangular shape; the parameters
are tuned by means of a gradient descent method.
• The signals that are generated by each of the nodes are combined in the second
layer by means of a t–norm T representing the AND-conjunction.
• The different outcomes are then combined through a t–conorm S or a linear combination.
• Finally, in the last layer, a sigmoidal function is applied to calculate the degree
of membership to each of the classes.

Fig. 73 ANFIS structure (inputs x1, x2; nodes A1, A2, B1, B2, T, S, f, θ; outputs C1, C2)
Let therefore a training set $\{(x^k, y^k)\}_{k=1}^{K}$ be given, where $x^k$ is the $k$-th input
pattern and
\[
y^k = \begin{cases} (1, 0) & \text{if } x^k \text{ belongs to class 1} \\ (0, 1) & \text{if } x^k \text{ belongs to class 2} \end{cases}
\]
Then the parameters of this hybrid neural net determine the shape of the membership
functions, and can be learned by gradient descent methods. The error function is
defined as
\[
E = \sum_{k=1}^{K} E_k = \frac{1}{2} \sum_{k=1}^{K} \left[ \left( o_1^k - y_1^k \right)^2 + \left( o_2^k - y_2^k \right)^2 \right]
\]
where $y^k$ is the desired output vector and $o^k$ is the output given by the hybrid neural
net.
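A forward pass through a network of this kind can be sketched as follows. Several choices are assumptions made only for illustration: the product is used as t–norm T, the third layer is taken to be a linear combination with hypothetical weights, a threshold is subtracted before the sigmoid, and the error of a single pattern is taken to be quadratic, as elsewhere in this chapter. None of the names below comes from [68].

```python
import math


def gaussian(x, center, width):
    """Bell-shaped membership function exp(-((x - center) / width)^2 / 2)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def forward(x1, x2, sets_x1, sets_x2, weights, thresholds):
    """Return the degrees of membership to the classes C1 and C2.

    sets_x1, sets_x2: lists of (center, width) pairs for A1, A2 and B1, B2.
    weights: one row of four weights per class (assumed linear combination).
    thresholds: one threshold per class (assumed).
    """
    # Layer 1: degree to which each input satisfies each linguistic term.
    a = [gaussian(x1, c, w) for c, w in sets_x1]
    b = [gaussian(x2, c, w) for c, w in sets_x2]
    # Layer 2: firing strength of the four rules (product t-norm, assumed).
    strengths = [ai * bj for ai in a for bj in b]
    # Layers 3 and 4: linear combination followed by a sigmoid.
    return [
        sigmoid(sum(w * s for w, s in zip(row, strengths)) - theta)
        for row, theta in zip(weights, thresholds)
    ]


def pattern_error(output, target):
    """Quadratic error of one pattern, summed over both class outputs."""
    return 0.5 * sum((o - y) ** 2 for o, y in zip(output, target))


sets_x1 = [(0.0, 1.0), (1.0, 1.0)]      # A1, A2 (hypothetical parameters)
sets_x2 = [(0.0, 1.0), (1.0, 1.0)]      # B1, B2 (hypothetical parameters)
weights = [[0.0] * 4, [0.0] * 4]        # untrained network
thresholds = [0.0, 0.0]
out = forward(0.2, 0.7, sets_x1, sets_x2, weights, thresholds)
print(out, pattern_error(out, (1, 0)))
```

With all weights and thresholds equal to zero both outputs are 0.5, so the network does not yet prefer either class; gradient descent on the centers, widths, weights and thresholds would then reduce the error.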

10.3 Genetic Algorithms

In this last section, we will again first give an overview of the main theory involving
genetic algorithms, followed by an overview of hybrid techniques in which fuzzy set
theory and genetic algorithms have been successfully combined. We will
follow the approach presented by Obitko and Slavík in [108].

10.3.1 Introduction

Genetic algorithms are a part of evolutionary computing, which is a rapidly growing
area of artificial intelligence. Evolutionary computing was introduced by Rechenberg
in [123], while genetic algorithms must be credited to Holland in [61]. The crossbreeding
with computer science led to the successful technique of genetic programming,
introduced by Koza in [79]. While neural networks are a mathematical
model for the working of the human brain, genetic algorithms are strongly inspired
by Darwin’s theory of evolution. To find a solution to a problem by means of a genetic
algorithm, we use an evolutionary process in which possible solutions, called
chromosomes, are used to create new solutions. A set of
possible solutions is called a population. Whether a solution in a particular
population survives into the next generation, either itself or through its offspring,
depends on a fitness measure denoting
the suitability of the solution to the given problem. The idea of “survival of the
fittest” is a key concept in Darwin’s theory. The more suitable a chromosome, the
more chances it will get to reproduce. When a new generation is formed, the following
techniques are used to make new chromosomes out of old ones:
• Crossover, where genes from parents are recombined to form a whole new chromosome.
• Mutation, where, with a certain low probability, some elements of the newly created
offspring are slightly changed. These changes are, just as is the case with
human DNA, mainly caused by errors in copying genes from parents.
• Elitism, where the chromosomes with the best fitness values are selected for the next
generation.
This process is repeated until some stopping condition is satisfied, e.g. until the
solution no longer improves.
Genetic algorithms play an important role in solving problems involving large
search spaces of feasible solutions. Each possible solution has its own fitness value
for the particular problem. Genetic algorithms tend
to look for the best solution, but usually only yield a “good” solution, i.e. one that is better
than the solutions immediately surrounding it in the search space. Usually, we will already
be satisfied if we find such a solution representing a local optimum. The difficulty is
that the search space can be very complicated: one may not know where to look for a
solution or where to start. There are, however, many good methods one can use
for finding a suitable solution, although these methods do not necessarily provide
the best solution. Some other methods besides genetic algorithms are hill climbing,
tabu search and simulated annealing. The solutions found by these methods are
often accepted as good solutions, because it is often not possible to prove what
the optimum is.
One example of a class of problems which cannot be solved in the “traditional”
way consists of the so-called nondeterministic polynomial problems, or NP problems for short.
These are problems for which a proposed solution can be checked in a time that is bounded
by a polynomial in the size of the input; for the hardest among them, the NP-complete
problems, no algorithm is known whose running time is bounded by such a polynomial.
The most notorious example of such a
problem is the traveling salesman problem, see for instance [70]. Usually, NP
problems are solved by some sort of “guessing” the solution, and then checking its
fitness. A characteristic of NP problems is that a simple algorithm, perhaps obvious
at first sight, can be used to find usable solutions. But this approach generally has to
consider many possible solutions: just trying all possible solutions in the case of a simple
problem with $n$ yes-or-no choices is already a very slow process of
order $O(2^n)$. The question whether every NP problem can be solved exactly by an
algorithm that runs in polynomial time is still an open problem.
Because no way is known to construct such an efficient algorithm, scientists apply
alternative methods such as genetic algorithms.
Crossover and mutation are the most important parts of the genetic algorithm.
The performance is influenced mainly by these two operators. Before we can explain
more about crossover and mutation, some information about chromosomes will be
given.

10.3.2 Definition (Chromosomes)

A chromosome should in some way contain information about the solution that it
represents. We will illustrate the crossover and mutation operators in case the chro-
mosomes are defined as binary strings. Chromosomes then could look like this:

Chromosome 1 : 1101100100110110
Chromosome 2 : 1101111000011110

Each bit in this string can represent some characteristic of the solution. Of course,
there are many other ways of encoding, depending mainly on the problem to be
solved. For example, one can directly encode integers or real numbers, and sometimes it
is useful to encode permutations.

10.3.3 Definition (Crossover)

The crossover operation on two selected parent chromosomes is the creation of a
new offspring chromosome. Crossover is performed in the hope that new chromosomes
will contain good parts of old chromosomes and will therefore
have a larger fitness value. However, it is good to ensure that some part of
the old population survives to the next generation. The simplest way to perform
crossover is to choose a crossover point at random, copy everything before
this point from the first parent, and then copy everything after the crossover point
from the other parent.

10.3.4 Example

Let the same chromosomes as above be given, and denote the crossover point by |.
Then

Chromosome 1 : 1101100|100110110
Chromosome 2 : 1101111|000011110

creates two new offspring chromosomes,

Offspring chromosome 1 : 1101100|000011110
Offspring chromosome 2 : 1101111|100110110

Variations in how to create crossover offspring include, for example, the choice of
more than one crossover point. Crossover can be quite complicated and depends
mainly on the encoding of chromosomes. Specific crossovers made for a specific
problem can improve the performance of the genetic algorithm.
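The single point crossover of the example can be sketched in a few lines. The function name and its parameters are illustrative choices; chromosomes are represented as strings of the characters 0 and 1.

```python
import random


def single_point_crossover(parent1, parent2, point=None, rng=random):
    """Return the two offspring obtained by swapping the tails of the parents.

    If no crossover point is given, one is chosen at random so that both
    parents contribute at least one gene.
    """
    if point is None:
        point = rng.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


# The chromosomes of the example, with the crossover point after the seventh bit.
offspring = single_point_crossover("1101100100110110", "1101111000011110", point=7)
print(offspring)
```

Choosing more than one crossover point amounts to applying the same slicing idea to several segments.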
After a crossover is performed, mutation takes place.

10.3.5 Definition (Mutation)

The mutation operation is intended to prevent all solutions in the population from
falling into a local optimum of the problem being solved. The mutation operation randomly
changes the offspring resulting from crossover, albeit with a low probability. In the case
of binary encoding, we can switch a few randomly chosen bits from 1 to 0 or from
0 to 1.

10.3.6 Example

Let the offspring produced by the crossover above be given. Then the result
of mutation of

Original offspring chromosome 1 : 1101100|000011110
Original offspring chromosome 2 : 1101111|100110110

can for instance be

Mutated offspring chromosome 1 : 1101101|000010110
Mutated offspring chromosome 2 : 1001111|100110111

Mutation should not occur very often, because the genetic algorithm would then in fact
degenerate into a random search. The technique of mutation (as well as crossover) depends
mainly on the encoding of chromosomes. For example, when we are encoding permutations,
mutation could be performed as an exchange of two genes.
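For binary encoding, mutation can be sketched as an independent inversion of every bit with a small probability. The function name, the parameter `rate` and the use of a seeded random generator are assumptions made for this illustration.

```python
import random


def mutate(chromosome, rate, rng):
    """Invert every bit of a binary string independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < rate else bit for bit in chromosome
    )


rng = random.Random(42)            # seeded only to make the run repeatable
original = "1101100000011110"      # offspring chromosome 1 of the example
mutated = mutate(original, rate=0.05, rng=rng)
print(original, mutated)
```

A rate of 0 leaves the chromosome unchanged and a rate of 1 inverts every bit, so applying the latter twice returns the original chromosome.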

10.3.7 Definition (Selection)

Chromosomes are selected from the population to be parents for crossover. The
problem is how to select which chromosomes will be given a chance to procreate.
According to Darwin’s theory of evolution, the best ones survive to create new
offspring. There are many methods for selecting the best chromosomes. Any such
algorithm is called a selection. Examples are roulette wheel selection, Boltzmann
selection, tournament selection, rank selection, steady-state selection and some other
selection methods.

10.3.8 Examples

1. Roulette wheel selection
Imagine a wheel of fortune on which every chromosome in the population is
given a section whose size is proportional to the value of its fitness function:
the bigger the value, the larger the section. Then
clearly, the chromosomes with a bigger fitness value will be selected more often.
This process can be described by the following algorithm.
* Calculate S, the sum of all chromosome fitnesses in the population.
* Generate a random number r from the interval ]0, S[.
* Go through the population and sum the fitnesses. When this sum, s, is greater
than r, stop and return the chromosome where you are.
2. Rank selection
The previous type of selection will cause problems whenever there are big differences
between the fitness values. For example, if the fitness value of the best chromosome
is 90% of the sum of all fitnesses, then the other chromosomes will have
very poor chances of being selected. Rank selection ranks the population first, and
then every chromosome receives a fitness value determined by this ranking. The
worst performing chromosome will have fitness 1, the second worst 2, etc.,
and the best will have fitness N, the number of chromosomes in the population.
While this method permits all the chromosomes to be selected, it
can lead to slower convergence, because the best chromosomes no longer differ
much from the others.

Chromosome   Fitness value   Roulette probability   Rank   Rank probability
A             1.2              9.7 %                 3      20.0 %
B             0.7              5.6 %                 1       6.7 %
C             8.1             65.3 %                 5      33.3 %
D             1.6             12.9 %                 4      26.7 %
E             0.8              6.5 %                 2      13.3 %
Total        12.4            100 %                  15     100 %

3. Steady-state selection
This is not a particular method of selecting parents. The main idea of this type
of selection is that a large part of the chromosomes should survive
to the next generation. The steady-state selection genetic algorithm works in the
following way: in every generation a few good chromosomes (with higher fitness)
are selected for creating new offspring. Then some bad chromosomes (with lower fitness)
are removed and the new offspring are put in their place. The rest
of the population, including the parents with high fitness values, survives to the new
generation.
4. Elitism
The idea of elitism has already been introduced. When creating a new population
by crossover and mutation, there is a big chance that we will lose the
best chromosome. Elitism is the name of the method that first copies the best
chromosome (or a few of the best chromosomes) to the new population. The rest of the
population is constructed in the ways described above. Elitism can rapidly increase
the performance of a genetic algorithm, because it prevents the loss of the best solution found so far.
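The selection probabilities of roulette wheel selection and rank selection, as well as the three-step roulette algorithm, can be sketched as follows for the five chromosomes A to E with fitness values 1.2, 0.7, 8.1, 1.6 and 0.8 used in the table. The function names and the dictionary representation of the population are illustrative assumptions.

```python
import random


def roulette_probabilities(fitness):
    """Selection probability proportional to the fitness value."""
    total = sum(fitness.values())
    return {name: value / total for name, value in fitness.items()}


def rank_probabilities(fitness):
    """Selection probability proportional to the rank (worst = 1, best = N)."""
    ordered = sorted(fitness, key=fitness.get)          # worst first
    ranks = {name: position + 1 for position, name in enumerate(ordered)}
    total = sum(ranks.values())                         # N (N + 1) / 2
    return {name: ranks[name] / total for name in fitness}


def roulette_select(fitness, rng):
    """Spin the wheel once: sum the fitnesses until the sum exceeds r."""
    total = sum(fitness.values())
    r = rng.uniform(0.0, total)
    running = 0.0
    for name, value in fitness.items():
        running += value
        if running > r:
            return name
    return name  # guard against rounding when r is (almost) equal to the total


fitness = {"A": 1.2, "B": 0.7, "C": 8.1, "D": 1.6, "E": 0.8}
print(roulette_probabilities(fitness))
print(rank_probabilities(fitness))

rng = random.Random(0)             # seeded only to make the run repeatable
draws = [roulette_select(fitness, rng) for _ in range(1000)]
print({name: draws.count(name) for name in fitness})
```

Under roulette wheel selection chromosome C dominates the draws, while rank selection gives it only one third of the probability mass.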

10.3.9 Parameters

There are two basic parameters of a genetic algorithm: the crossover probability and
the mutation probability.

• Crossover probability describes how often crossover will be performed. If there is no
crossover, the offspring are clones of single parents and a whole new generation
is made from exact copies of chromosomes from the old population. This need
not mean that the new generation as a whole is the same as the old one, since, through
selection, only the strongest individuals will survive. If on the other hand the crossover
probability is 100%, then all offspring are made by crossover, so technically speaking, no
chromosomes survive for more than one generation, although it is possible
that some of the offspring look exactly like their parents.
• Mutation probability determines how often parts of a chromosome will be mutated
at random. If there is no mutation, the offspring are taken over without any change
immediately after crossover (or after being directly copied), and are “genetically pure”.
If the mutation probability is 100%, the whole chromosome is changed, and the algorithm
is reduced to pure random search. In the case of binary chromosomes, a mutation probability of
100% means that all chromosomes are in fact inverted, which means that after
an even number of steps in the algorithm, no mutation has taken place at all. In
such a case, choosing the mutation probability to be either 0% or 100% yields a similar
(bad) performance: in either case, the population will degenerate very quickly.
Another parameter of the genetic algorithm that is particularly important is the
population size, the number of chromosomes that are present in one generation. If
there are too few chromosomes, the genetic algorithm will have too few candidates
to perform crossover with, resulting in only a partial exploration of the search space.
This is similar to the biological phenomenon of inbreeding. On the
other hand, a population that is too large slows the genetic algorithm down considerably,
although the algorithm was specifically designed to speed up certain search
problems. Research shows that beyond some limit, which depends mainly on the encoding
and the problem, it is not useful to use very large populations because they do not
solve the problem faster than moderate-sized populations.
The encoding of chromosomes is the first step in solving a problem by using
a genetic algorithm. Crossover and mutation are the two basic operators of
genetic algorithms, on which the performance depends very much, and the type and
implementation of these operators depend on the encoding that has been chosen as being
suitable to the problem. In the following examples we briefly discuss some frequently
encountered encoding methods.

10.3.10 Example: Binary Encoding

Binary encoding is the most commonly used type of encoding, for historical reasons
as well as for its computational simplicity. Furthermore, binary encoding creates
many possible chromosomes even though each gene can only take two values. On the other hand,
this encoding is often not natural for many problems and sometimes corrections
must be made after crossover and/or mutation. In binary encoding, every chromosome
is a string of bits, each of which is 0 or 1.

Chromosome A : 1001001001100101100110
Chromosome B : 1110100011110101101110

An important example of a problem that is solved through binary encoding is
the knapsack problem: given a number of objects, each with a given value and size,
and a knapsack of a given capacity, select the objects so as to maximize the total
value of the load. As a hard boundary
condition, the total load should exceed neither the size of the knapsack nor a certain
weight limit. As far as the encoding is concerned, every bit of a chromosome indicates
whether the corresponding object should be present in the knapsack.
Crossover can be performed in several variations. We make the following distinction:
• Single point crossover: one crossover point is selected; the
binary string from the beginning of the chromosome to the crossover point is
copied from the first parent, and the rest is copied from the other parent.

Parents 11001|011 and 11011|111 ⇒ offspring 11001|111

• Two point crossover: two crossover points are selected; the
binary string from the beginning of the chromosome to the first crossover point
is copied from the first parent, the part from the first to the second crossover point
is copied from the other parent, and the rest is copied from the first parent again.

Parents 11|0010|11 and 11|0111|11 ⇒ offspring 11|0111|11

• Uniform crossover: bits are randomly copied from the first or from the second
parent.

Parents 11001011 and 11011111 ⇒ offspring 11011111

• Arithmetic crossover: some arithmetic operation is performed to make a new
offspring (e.g. the logical AND-operator):

11001011 AND 11011111 ⇒ 11001011

As far as mutation is concerned, only one feasible method is possible here: bit
inversion, where randomly selected bits are inverted.

11001001 ⇒ 10001001
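The remaining crossover variants for binary strings can be sketched along the same lines. The function names are illustrative, and the arithmetic crossover is shown for the logical AND only, one of several possible operations.

```python
import random


def two_point_crossover(parent1, parent2, first, second):
    """Copy the middle segment from the second parent, the rest from the first."""
    return parent1[:first] + parent2[first:second] + parent1[second:]


def uniform_crossover(parent1, parent2, rng):
    """Copy every bit at random from the first or from the second parent."""
    return "".join(rng.choice(pair) for pair in zip(parent1, parent2))


def and_crossover(parent1, parent2):
    """Arithmetic crossover with the bitwise logical AND."""
    return "".join(
        "1" if a == "1" and b == "1" else "0" for a, b in zip(parent1, parent2)
    )


p1, p2 = "11001011", "11011111"
print(two_point_crossover(p1, p2, 2, 6))
print(uniform_crossover(p1, p2, random.Random(3)))
print(and_crossover(p1, p2))
```

Since the two parents differ only in the fourth and sixth bit, every offspring produced by uniform crossover agrees with both parents in the remaining six positions.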

10.3.11 Example: Permutation Encoding

Permutation encoding can be used in ordering problems, such as the travelling salesman
problem or, more generally, any task ordering problem. In permutation encoding,
every chromosome is a string of numbers, each of which represents a position in a sequence.

Chromosome A : 7 4 1 9 6 3 2 5 8
Chromosome B : 3 8 9 5 2 6 1 4 7

The standard problem that is associated with permutation encoding is the travelling
salesman problem: a number of cities is given, together with a matrix denoting the distances
between them. A travelling salesman has to visit all of them exactly once, but at
the same time he wants to minimize his travel time. The aim of the genetic algorithm is then
to find the ideal order in which the salesman has to visit the cities. The chromosomes
of course describe the order in which the salesman will visit the cities. A myriad
of variations on the problem exist (e.g. the salesman wants to end in the same city
where he started, city A must be visited before city B, one particular ordered pair (A, B)
should be excluded because of road works).
Several methods for crossover exist. Single point crossover can be achieved as
follows: one crossover point is selected, the permutation is copied from the first
parent up to the crossover point, then the other parent is scanned, and all numbers
that are not yet in the offspring are added in the same order as they occur in the
second parent. Note that there are more ways to produce the remainder of the string
after the crossover point.

Parents 12345|6789 and 45368|9721 ⇒ offspring 12345|6897

For mutation (and also for some types of crossover) corrections must be made
to keep the chromosome consistent (i.e. to make sure that the chromosomes still
are feasible solutions). One could imagine for instance that random mutation would
cause the sequence not to contain all numbers anymore. A mutation will therefore
be implemented as the random exchange of a pair of numbers.

123456897⇒183456297
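Both operators for permutation encoding can be sketched as follows; the two examples of this subsection are reproduced at the end. The function names are illustrative, and the positions passed to the mutation are zero-based list indices.

```python
def permutation_crossover(parent1, parent2, point):
    """Copy the head from the first parent, then append the missing numbers
    in the order in which they occur in the second parent."""
    head = list(parent1[:point])
    tail = [gene for gene in parent2 if gene not in head]
    return head + tail


def swap_mutation(chromosome, i, j):
    """Exchange the genes at positions i and j, keeping the permutation valid."""
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


parent1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
parent2 = [4, 5, 3, 6, 8, 9, 7, 2, 1]
child = permutation_crossover(parent1, parent2, point=5)
mutant = swap_mutation(child, 1, 6)
print(child, mutant)
```

Both operators return a list that contains every number exactly once, so no repair step is needed afterwards.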

10.3.12 Example: Value Encoding

Direct value encoding can be used in a wide range of problems where more complicated
values such as real numbers are used, and where binary encoding would be
impractical. In value encoding, every chromosome is a sequence
of values, which may be anything connected to the problem, such as (real)
numbers, characters or any other objects. Almost any mathematical problem can be
treated by a genetic algorithm with real-valued chromosomes such as

Chromosome : 1.82521 0.87243 5.00231 3.92321 −0.87625

which may for instance be the weights of the synapses between the neurons of a
neural network; but also, e.g., sequences of motions to find the shortest path through
a maze could be the object of study for a genetic algorithm:

Chromosome : back, forward, forward, left, right, back, left

For crossover, all crossovers from binary encoding can be used. In the case of
real value encoding, mutation can be performed by adding or subtracting a small
number to or from selected values.

1.12 0.24 5.71 4.33 2.05 ⇒ 1.12 0.24 5.71 4.56 2.05
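Mutation for real value encoding can be sketched as a small random perturbation of selected genes. The uniform perturbation, the function name and the parameters `rate` and `step` are assumptions chosen for this illustration.

```python
import random


def perturb(chromosome, rate, step, rng):
    """Add a random number from [-step, step] to each gene with probability `rate`."""
    return [
        gene + rng.uniform(-step, step) if rng.random() < rate else gene
        for gene in chromosome
    ]


rng = random.Random(7)             # seeded only to make the run repeatable
chromosome = [1.12, 0.24, 5.71, 4.33, 2.05]
print(perturb(chromosome, rate=0.2, step=0.25, rng=rng))
```

The value of `step` plays the role of the "small number" in the text: the smaller it is, the more local the search around the current solution becomes.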

10.3.13 General Outline for Design

The following recommendations for the design of a genetic algorithm are mostly
heuristically derived from the results of empirical studies of genetic algorithms with
binary encoding:
• The crossover rate should generally be high, about 80–95%. However, some results
show that for some problems a crossover rate of about 60% is the best.
• On the other hand, the mutation rate should be very low. The best rates seem to be
about 0.5–1%.
• It may be surprising that, concerning the population size, very big populations
usually do not improve the performance of the genetic algorithm, in the sense of
the speed of finding an optimal solution. A good population size is about 20–30,
although sometimes sizes of 50–100 are reported as the best. Some research also
shows that the best population size depends on the size of the encoded strings
(chromosomes). For instance, chromosomes with 32 bits require a larger population
than chromosomes with 16 bits.
• For the selection, a basic roulette wheel selection can be used, but sometimes
rank selection can be better, as each method has its advantages and disadvantages.
• There are also some more sophisticated methods that change the parameters of
selection during the run of the genetic algorithm. Basically, these behave similarly
to simulated annealing.
• Elitism should certainly be used if you do not use any other method for saving the
best solution found. You can also try steady-state selection.
• The encoding depends on the problem and also on the size of the instance of the
problem. Operators for crossover and mutation depend on the chosen encoding
and on the problem.
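The recommendations above can be combined into a small complete genetic algorithm. The sketch below uses binary encoding, roulette wheel selection, single point crossover, bit inversion and elitism, with a crossover rate of 90%, a mutation rate of 1% and a population of 30 chromosomes. The fitness function (the number of ones in the chromosome) is a toy problem chosen only for illustration, and all names are assumptions.

```python
import random


def count_ones(chromosome):
    """Toy fitness function: the number of ones in the chromosome."""
    return sum(chromosome)


def roulette_select(population, fitnesses, rng):
    """Select one chromosome with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)
    r = rng.uniform(0.0, total)
    running = 0.0
    for chromosome, value in zip(population, fitnesses):
        running += value
        if running > r:
            return chromosome
    return population[-1]


def run_ga(length=16, pop_size=30, crossover_rate=0.9, mutation_rate=0.01,
           generations=50, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        fitnesses = [count_ones(c) for c in population]
        best = max(population, key=count_ones)
        best_history.append(count_ones(best))
        new_population = [best[:]]                 # elitism: keep the best one
        while len(new_population) < pop_size:
            parent1 = roulette_select(population, fitnesses, rng)
            parent2 = roulette_select(population, fitnesses, rng)
            if rng.random() < crossover_rate:      # single point crossover
                point = rng.randint(1, length - 1)
                child = parent1[:point] + parent2[point:]
            else:                                  # clone of a single parent
                child = parent1[:]
            # Bit inversion with a low probability per bit.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            new_population.append(child)
        population = new_population
    best = max(population, key=count_ones)
    best_history.append(count_ones(best))
    return best, best_history


best, history = run_ga()
print(best, history[0], history[-1])
```

Because the best chromosome is always copied unchanged, the best fitness value can never decrease from one generation to the next, which is exactly the effect of elitism described above.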

10.4 Fuzzy-Genetic Hybrid Systems

It is possible both to apply fuzzy control techniques to improve the performance
of a genetic algorithm, e.g., by relaxing border constraints, and to use genetic
algorithms in the design of a fuzzy controller. The crossbreeding of fuzzy and
genetic techniques provides for a very large application domain. There is a vast
literature available on the use of fuzzy methods in genetic algorithms.
There are obvious control processes that can be applied, such as the fine-tuning of
parameters such as the crossover and mutation rates by means of a fuzzy controller
([9], [52], [82]). Other work includes the use of fuzzy connectives in crossover operators,
work which is largely due to Herrera et al. ([55], [56], [54], [59]), fuzzy
control processes of the genetic algorithm population ([4], [154]), the application
of fuzzy control to the constraints of a genetic algorithm ([112]), improved
optimization problems ([118], [151], [158]) and applications in soft computing
([133], [134]). More general results about fuzzy genetic algorithms can be found
in [51], [58], [53], [80], [83], [98], [137], [152], [153], and [156]. More refined methods
include the automatic tuning of a fuzzy neural network by a genetic algorithm
([64]) and fuzzy classification methods based on neural networks and genetic algorithms
([144]).

10.4.1 Genetic Rule Bases

Although it would be beyond the scope of this article to give a complete overview
of all successful techniques combining fuzzy control and genetic algorithms, we
would like to describe a few of the most obvious applications. The first approach
is due to Hashiyama et al. ([49]), who incorporated ideas due to Karr ([72]) for
designing a fuzzy antecedent rule base without prior knowledge. Let an n-input,
single-output fuzzy controller be given by the following set of linguistic rules:

IF (X1 = A1 ) and ... and (Xn = An ) THEN (Y = B)

For all $i \in \{1, \ldots, n\}$, let $A_i$ assume a linguistic value in the range
$\{a_{i,1}, a_{i,2}, \ldots, a_{i,n(i)}\}$ and let $B$ assume a linguistic value in the range
$\{b_1, b_2, \ldots, b_m\}$. The purpose of this method is to assess the validity of the rules
that can be created, assuming that a training set is at hand, as well as a
performance measure (see Section 8.1) for the rule base. Without supervision, the
number of possible rules equals $\prod_{k=1}^{n} n(k) \times m$, which means that it is
virtually impossible to test all the rules for validity within a
reasonable time period.
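
To get a feeling for how quickly this product grows, the count can be evaluated for a few made-up controller sizes. The figures below are purely illustrative and do not come from the text, and `count_possible_rules` is a hypothetical helper name.

```python
from math import prod


def count_possible_rules(antecedent_sizes, n_consequents):
    """Number of distinct rules: n(1) * n(2) * ... * n(n) * m."""
    return prod(antecedent_sizes) * n_consequents


# Hypothetical controller: 4 inputs with 7 linguistic values each, 7 output values.
print(count_possible_rules([7, 7, 7, 7], 7))        # 16807
# Two more inputs of the same granularity multiply the count by 49.
print(count_possible_rules([7, 7, 7, 7, 7, 7], 7))  # 823543
```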
Therefore, only a selected number of rules will be created at random, and these
will be considered as the population of a genetic algorithm. Hence, a chromosome
will be given by

$$C^K : a^K_{1,j_1}\, a^K_{2,j_2} \ldots a^K_{n,j_n}\, b^K_k$$

where $\forall i \in \{1, \ldots, n\} : j_i \in \{1, \ldots, n(i)\}$ and $k \in \{1, \ldots, m\}$. A chromosome
consists of one linguistic value for each input together with a single linguistic
output value, hence yielding chromosomes of length $n + 1$. Given a population of
$N$ randomly determined chromosomes, it is sufficient to define a crossover and a
mutation operator in order to be able to apply the techniques described in
Section 10.3. A crossover at crossover point $q \in \{1, \ldots, n + 1\}$ will be defined as
follows: for all $K_1, K_2 \in \{1, \ldots, N\}$,

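$$
\left.
\begin{array}{l}
C^{K_1} : a^{K_1}_1 \ldots a^{K_1}_q\, a^{K_1}_{q+1} \ldots a^{K_1}_n\, b^{K_1} \\[4pt]
C^{K_2} : a^{K_2}_1 \ldots a^{K_2}_q\, a^{K_2}_{q+1} \ldots a^{K_2}_n\, b^{K_2}
\end{array}
\right\}
\;\Rightarrow\;
a^{K_1}_1 \ldots a^{K_1}_q\, a^{K_2}_{q+1} \ldots a^{K_2}_n\, b^{K_2}
$$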

In particular, when $q = n$, the rule antecedents of one chromosome will be matched
with the consequent of the other. Mutation is performed as follows:

$$C^K : a^K_{1,j_1}\, a^K_{2,j_2} \ldots a^K_{q,j_q} \ldots a^K_{n,j_n}\, b^K_k \;\Rightarrow\; a^K_{1,j_1}\, a^K_{2,j_2} \ldots a^K_{q,j'_q} \ldots a^K_{n,j_n}\, b^K_k$$

with $j'_q$ a random index in $\{1, \ldots, n(q)\}$ that replaces $j_q$.
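
The two operators can be sketched in a few lines. The sketch assumes that a chromosome is stored as a list of n + 1 integers, namely the indices of the n antecedent values followed by the index of the consequent. This storage format and the function names are illustrative choices, not something prescribed by the text.

```python
import random


def crossover(parent_1, parent_2, q):
    """One-point crossover: genes 1..q come from parent_1, the rest from parent_2."""
    if len(parent_1) != len(parent_2):
        raise ValueError("chromosomes must have equal length")
    if not 1 <= q <= len(parent_1):
        raise ValueError("crossover point out of range")
    return parent_1[:q] + parent_2[q:]


def mutate(chromosome, q, antecedent_sizes, rng=random):
    """Replace the index of antecedent q (1-based) by a random index in 1..n(q).

    The new index may coincide with the old one; the input list is not modified.
    """
    mutated = list(chromosome)
    mutated[q - 1] = rng.randint(1, antecedent_sizes[q - 1])
    return mutated


# Hypothetical example with n = 3 inputs, n(1) = n(2) = 3, n(3) = 5 and m = 4.
rule_1 = [1, 2, 3, 1]   # antecedent indices (1, 2, 3), consequent index 1
rule_2 = [3, 1, 5, 4]   # antecedent indices (3, 1, 5), consequent index 4

print(crossover(rule_1, rule_2, q=3))  # [1, 2, 3, 4]
print(crossover(rule_1, rule_2, q=1))  # [1, 1, 5, 4]
```

The first call uses q = n, so the antecedents of the first rule are paired with the consequent of the second, as noted in the text.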


After a number of runs, the rule base should be optimized by this nonlinear
technique, which especially pays off whenever the number of possible rules,
$\prod_{k=1}^{n} n(k) \times m$, becomes large. Of course, a number of parameters still need to be
fine-tuned, such as the number of surviving chromosomes in one generation.

10.4.2 Improvements of Genetic Rule Bases

The approach described in the previous section is quite crude, and without expert
knowledge, convergence to an optimal rule base is not guaranteed to be quick
enough. In this section, we would like to propose some simple heuristics which
can improve the quality of the solutions derived from the genetic algorithm. We
will again make the distinction between the self-tuning case, where the parameters
occurring in the fuzzy rule definitions are changed, and the self-organizing case,
where fuzzy rules can be omitted and/or added.

• Given that the shape functions for the linguistic variables are fixed, a genetic
algorithm can also be used to tune their parameters. Suppose, for instance, that
a set of rule antecedents is given, where each membership function is of the shape

$$\left\{ \mu_i(x) = \left(1 - \left|1 - \frac{x - a_i}{b_i}\right|\right) \vee 0 \right\}_{i=1}^{n}$$

Then we shall assume that the value $a_i$ is a member of a discrete set of
possible center values $A = \{a_0, a_1, a_2, \ldots, a_p\}$ and that $b_i$ is a member of a
discrete set of possible spread values $B = \{b_0, b_1, b_2, \ldots, b_q\}$. A chromosome
then consists of a string of length $2n$

$$a_1 b_1 a_2 b_2 \ldots a_n b_n$$

with the usual crossover operator and, as mutation, a random selection of another
value $a_j \in A$ or $b_j \in B$.
• Analogously, it is possible to consider several shape functions at once as
antecedent rule base variables, and to let an evolutionary algorithm as in
Section 10.4.1 determine which shape yields the best performance. Of course, as
the number of degrees of freedom increases, the search space grows, and so does
the time needed to reach convergent behavior. This extension of the design of
genetic rule bases should therefore be approached with caution.
• An important extension, however, is inspired by Example 10.2.13, which showed
that it is important to allow a variable number of antecedents in a particular
rule. This is where genetic algorithms fall short, since the chromosomes always
have the same length, say n. To fix this problem without fundamentally changing
the algorithm, one can add a “dummy value”, which has no effect on the rule, to
the set of possible values of each gene. Consider as an example again a rule base
with rules

IF (X1 = A1 ) and ... and (Xn = An ) THEN (Y = B)

and suppose that rules with fewer than n antecedent conditions should also be
considered. If, for instance, the linguistic variable $A_1$ ranges over the set
$\{a_{1,1}, a_{1,2}, \ldots, a_{1,n(1)}\}$, it is easiest to extend this set with one additional value
$a_{1,n(1)+1}$ := “always true”, which can easily be encoded by taking the fuzzy set
$\mu_{a_{1,n(1)+1}} := 1$. This can be done for all variables, so that the rule

IF ... and (Xk−1 =Ak−1 ) and (Xk =ak,n(k)+1 ) and (Xk+1 =Ak+1 ) and ... THEN (Y =B)

has the same effect as

IF ... and (Xk−1 = Ak−1 ) and (Xk+1 = Ak+1 ) and ... THEN (Y = B)
Whether or not it is suitable to reduce the number of conditions in a rule of the
rule base will automatically be judged by the performance measure of the fuzzy
controller. One provision should then be added: a rule in which all antecedents
are always true should automatically be deleted.
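
The effect of the “always true” dummy value can be checked with a small sketch. The symmetric triangular membership function, the use of the minimum as fuzzy “and”, and all function names below are assumptions made for illustration only; they are not necessarily the parameterization used in the text. The argument carries over to any t-norm, since combining a membership degree with 1 leaves it unchanged.

```python
def triangle(center, spread):
    """Symmetric triangular membership function (an illustrative assumption)."""
    def mu(x):
        return max(0.0, 1.0 - abs(x - center) / spread)
    return mu


def always_true(x):
    """Membership function of the dummy value: constantly equal to 1."""
    return 1.0


def firing_strength(antecedents, inputs):
    """Degree to which a rule fires, using min as the fuzzy 'and'.

    antecedents holds one membership function per input variable.
    """
    return min(mu(x) for mu, x in zip(antecedents, inputs))


def is_degenerate(antecedents):
    """True if every antecedent is the dummy value, so the rule should be deleted."""
    return all(mu is always_true for mu in antecedents)


low = triangle(0.0, 1.0)
high = triangle(1.0, 1.0)

# Rule with the dummy value in position 2 versus the same rule without X2.
with_dummy = firing_strength([low, always_true, high], [0.25, 0.9, 0.5])
without_x2 = firing_strength([low, high], [0.25, 0.5])
print(with_dummy, without_x2)                      # 0.5 0.5
print(is_degenerate([always_true, always_true]))   # True
```

Because the dummy value is just one more admissible gene value, crossover and mutation need no changes; only the degenerate rule has to be filtered out.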

References

1. D.Y. Abramovitch and L.G. Bushnell. Report on the Fuzzy versus Conventional Control De-
bate. IEEE Control Systems 19(3), pp. 88–91, 1999
2. C. Alsina. On a family of connectives for fuzzy sets. Fuzzy Sets and Systems 16, pp. 231–235,
1985
3. G. Arfken. The Method of Steepest Descents. in: Mathematical Methods for Physicists, 3rd
ed. Orlando, FL, Academic Press, pp. 428–436, 1985
4. S. Arnone, M. Dell’Orto and A. Tettamanzi. Towards a fuzzy government of genetic
populations. In Proc. Sixth IEEE Conference on Tools with Artificial Intelligence, Los Alamitos,
pp. 585–591, 1994
5. K.J. Åström and B. Wittenmark. Adaptive Control. Addison-Wesley, 1989
6. S.M. Baas and H. Kwakernaak. Rating and ranking of multiple–aspects alternatives using
fuzzy sets. Automatica, 13, pp. 47–58, 1977
7. J.F. Baldwin. A new approach to approximate reasoning using a fuzzy logic. Fuzzy Sets and
Systems 2, pp. 309–325, 1979
8. G. Bartolini, G. Casalino, F. Davoli, M. Mastretta, R. Minciardi and E. Morten. Development
of performance adaptive fuzzy controllers with application to continuous casting plants. In
R. Trappl, Ed., Cybernetics and Systems Research. Amsterdam, North–Holland,
pp. 721–728, 1982
9. A. Bergman, W. Burgar and A. Hemker. Adjusting parameters of genetic algorithms by fuzzy
control rules. Proc. Third International Workshop on Software Engineering and Expert
Systems for High Energy and Nuclear Physics, Oberammergau. In K.H. Becks and D.P. Gallix,
Eds. New Computer Techniques in Physics Research III, pp. 235–240, 1994
10. P.P. Bonissone and K.S. Decker. Selecting uncertainty calculi and granularity: An
experiment in trading-off precision and complexity. In: L.N. Kanal and J.F. Lemmer. Uncertainty In
Artificial Intelligence, pp. 217–247, 1986
11. G. Bortolan and R. Degani. A review of some methods for ranking fuzzy subsets. Fuzzy Sets
and Systems 15, pp. 1–19, 1985
12. S.B. Boswell and M.S. Taylor. A central limit theorem for fuzzy random variables. Fuzzy
Sets and Systems 24, pp. 331–344, 1987
13. G.E.P. Box and G.M. Jenkins. Time Series Analysis: Forecasting and Control. Holden-Day,
1989
14. M. Braae and D.A. Rutherford. Selection of parameters for a fuzzy logic controller. Fuzzy
Sets and Systems 2, pp. 185–199, 1979
15. M. Braae and D.A. Rutherford. Theoretical and linguistical aspects of the fuzzy logic con-
troller. Automatica 15, pp. 553-577, 1979
16. Z.-X. Cai. Intelligent control: Principles, Techniques and Applications. World Scientific,
1997
17. L. Campos and J.L. Verdegay. Linear programming problems and ranking of fuzzy numbers.
Fuzzy Sets and Systems 32, pp. 1–11, 1989
18. W. Cong–Xin and M. Ming. Embedding problem of fuzzy number space, Part I. Fuzzy Sets
and Systems 44, pp. 33–38, 1991
19. W. Cong–Xin and M. Ming. Embedding problem of fuzzy number space, Part II. Fuzzy Sets
and Systems 45, pp. 189–202, 1992
20. W. Cong–Xin and M. Ming. Embedding problem of fuzzy number space, Part III. Fuzzy Sets
and Systems 46, pp. 281–286, 1992
21. E. Czogala and W. Pedrycz. On identification in fuzzy systems and its applications in control
problems. Fuzzy Sets and Systems 6, pp. 73–83, 1981
22. M. Delgado, J.L. Verdegay and M.A. Villa. A procedure for ranking fuzzy numbers using
fuzzy relations. Fuzzy Sets and Systems 26, pp. 49–62, 1988
23. R.L. Devaney. An Introduction to Chaotic Dynamical Systems. Addison-Wesley, 1989
24. D. Driankov, H. Hellendoorn and M. Reinfrank. An introduction to fuzzy control. Springer-
Verlag, 1993
25. D. Dubois and H. Prade. Operations on fuzzy numbers. Internat. J. Systems Sci. 9, pp. 613–
626, 1978
26. D. Dubois and H. Prade. Fuzzy real algebra: some results. Fuzzy Sets and Systems 2, pp.
327–348, 1979
27. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Academic Press,
1980
28. D. Dubois and H. Prade. Towards fuzzy differential calculus, Part 1: Integration of fuzzy
mappings. Fuzzy Sets and Systems 8, pp. 1–17, 1982
29. D. Dubois and H. Prade. Towards fuzzy differential calculus, Part 2: Integration on fuzzy
intervals. Fuzzy Sets and Systems 8, pp. 105–116, 1982
30. D. Dubois and H. Prade. Towards fuzzy differential calculus, Part 3: Differentiation. Fuzzy
Sets and Systems 8, pp. 225–233, 1982
31. D. Dubois and H. Prade. Ranking fuzzy numbers in the setting of possibility theory. Inform.
Sci. 30, pp. 183–224, 1983
32. D. Dubois and H. Prade. The mean of a fuzzy number. Fuzzy Sets and Systems 24, pp. 279–
300, 1987
33. D. Dubois, J. Lang and H. Prade. Fuzzy sets in approximate reasoning part 2: Logical ap-
proaches. Fuzzy Sets and Systems 40, pp. 203–244, 1991
34. D. Dubois and H. Prade. Fuzzy sets in approximate reasoning, Part 1: Inference with possi-
bility distributions. Fuzzy Sets and Systems 40, pp. 143–202, 1991
35. D. Dubois and H. Prade. Basic issues on fuzzy rules and their application to fuzzy control.
Proceedings of the IJCAI-91 Workshop on Fuzzy Control, Sydney, pp. 5–17, 1991
36. L. Fausett. Fundamentals of Neural Networks. Prentice-Hall, 1994
37. D.P. Filev and R.R. Yager. A generalized defuzzification method via BADD distributions.
Internat. J. Intelligent Systems 6, 1991, pp. 687–697
38. D.P. Filev and R.R. Yager. An adaptive approach to defuzzification based on level sets. Fuzzy
Sets and Systems 53, pp. 355–360, 1993
39. R. Fullér. Introduction to Neuro-Fuzzy Systems. Advances in Soft Computing Series,
Springer-Verlag, Berlin/Heidelberg, 2000
40. K.I. Funahashi. On the Approximate Realization of continuous Mappings by Neural Net-
works. Neural Networks, vol. 2, pp. 183–192, 1989
41. S. Gähler and W. Gähler. Fuzzy real numbers. Fuzzy Sets and Systems 66, pp. 137–158, 1994
42. M. de Glas. Invariance and stability of fuzzy systems. Journal of Mathematical Analysis and
Applications, 199, pp. 299–319, 1984
43. R. Goetschel and W. Voxman. Topological properties of fuzzy numbers. Fuzzy Sets and Sys-
tems 10, pp. 87–99, 1983
44. R. Goetschel and W. Voxman. Eigen fuzzy number sets. Fuzzy Sets and Systems 16, pp.
75–85, 1985
45. R. Goetschel and W. Voxman. Elementary fuzzy calculus. Fuzzy Sets and Systems 18, pp.
31–43, 1986
46. David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning.
Addison-Wesley Publishing Company, 1989
47. K. Gurney. An Introduction to Neural Networks. UCL Press, 1997
48. J.C. Harris and J.F. Miles. Stability of linear systems: Some aspects of kinematic similarity.
Academic Press, New York, 1980
49. T. Hashiyama, T. Furuhashi and Y. Uchikawa. A creative design of fuzzy logic controller
using a genetic algorithm. In [134], pp. 37–48, 1997
50. S. Haykin. Neural Networks: A Comprehensive Foundation, 2nd Ed. Prentice-Hall, 1999
51. F. Herrera, E. Herrera-Viedma, M. Lozano and J.L. Verdegay. Fuzzy tools to improve ge-
netic algorithms. In Proc. Second European Conference on Intelligent Techniques and Soft
Computing (EUFIT Aachen’94), vol. 3, pp. 1532–1539, 1994
52. F. Herrera and M. Lozano. Adaption of genetic algorithm parameters based on fuzzy logic
controllers. In [57], pp. 95–125, 1996
53. F. Herrera and M. Lozano. Adaptive genetic algorithms based on fuzzy techniques. In Proc.
Sixth International Conference on Information Processing and Management of Uncertainty
in Knowledge Based Systems (IPMU’96), Granada, pp. 775–780, 1996
54. F. Herrera and M. Lozano. Heuristic crossovers for real-coded genetic algorithms based on
fuzzy connectives. In H.M. Voigt, W. Ebeling, I. Rechenberg and H.P. Schwefel, Eds. Proc.
Fourth Parallel Problem Solving from Nature - PPSN IV. LNCS 1141 Springer-Verlag, Berlin,
pp. 336–345, 1996
55. F. Herrera, M. Lozano and J.L. Verdegay. The use of fuzzy connectives to design real-coded
genetic algorithms. Mathware & Soft Computing 1(3), pp. 239–251, 1995
56. F. Herrera, M. Lozano and J.L. Verdegay. Dynamic and heuristic fuzzy connectives based
crossover operators for controlling the diversity and convergence of real-coded genetic al-
gorithms. International Journal of Intelligent Systems 11(12), pp. 1013–1040, 1996
57. F. Herrera and J.L. Verdegay. Genetic Algorithms and Soft Computing. Physica Verlag, 1996
58. F. Herrera, M. Lozano and J.L. Verdegay. Tackling fuzzy genetic algorithms. In G. Winter,
J. Periaux, M. Galán, and P. Cuesta, Eds. Genetic Algorithms in Engineering and Computer
Science. Wiley, Chichester, UK, pp. 167–189, 1995
59. F. Herrera, M. Lozano and J.L. Verdegay. Fuzzy connective based crossover operators to
model genetic algorithms population diversity. Fuzzy Sets and Systems, 92 (1), pp. 21–30
60. K. Hirota. Industrial Applications Of Fuzzy Technology. Tokyo, Berlin, Heidelberg, 1993
61. J.H. Holland. Adaption In Natural And Artificial Systems. MIT Press, Ann Arbor, 1975.
62. L.P. Holmblad and J.J. Østergaard. Control of cement kiln by fuzzy logic. in: Approximate
Reasoning In Decision Analysis. Eds. M.M. Gupta and E. Sanchez, Amsterdam, New York,
Oxford pp. 389–400, 1982
63. S. Isaka and A.V. Sebald. An optimization for fuzzy controller design. IEEE Trans. SMC, 22,
p. 1469, 1992
64. H. Ishigami, T. Fukuda and T. Shibata. Automatic fuzzy tuning and its applications. In [134],
pp. 49–70, 1997
65. R. Jager. Fuzzy logic in control. Ph.D. thesis, T.U. Delft, 1995
66. L.C. Jain and R.K. Jain. Hybrid intelligent engineering systems. in: Advances in Fuzzy Sys-
tems — Applications And Theory, Vol. 11. World Scientific, 1997
67. J.S.R. Jang. ANFIS: Adaptive-Network-Based Fuzzy Inference System. IEEE Transactions on
Systems, Man and Cybernetics, Vol. 23, pp. 665–685, 1993
68. J.S.R. Jang and C. Sun. Neuro-Fuzzy Modeling and Control. Proceedings of the IEEE, 83,
pp. 378–406, 1995
69. Jan Jantzen. Design of fuzzy controllers. Tech. report no. 98-E 384, TU Denmark, Dept. of
Automation, Lyngby, Denmark, 1998
70. D.S. Johnson and L.A. McGeoch. The Traveling Salesman Problem: A Case Study In Local
Optimization. In E.H.L. Aarts and J.K. Lenstra, Eds. Local Search in Combinatorial
Optimization. To appear
71. A. Kandel, Y. Luo and Y.Q. Zhang. Stability analysis of fuzzy control systems. Fuzzy Sets
and Systems 105, pp. 33–48, 1999
72. C.L. Karr. Design of an adaptive fuzzy logic controller using a genetic algorithm. Proc. of
the 4th International Conference on Genetic Algorithms, pp. 450–457, 1992
73. E.E. Kerre. A comparative study of the behavior of some popular fuzzy implication operators.
In L.A. Zadeh and J. Kacprzyk, Eds., Fuzzy Logic For The Management Of Uncertainty.,
Wiley, New York, 1992
74. W.M. Kickert and E.H. Mamdani. Analysis of a fuzzy logic controller. Fuzzy Sets and Sys-
tems 1, pp. 29–44, 1978
75. J.B. Kiszka, M.M. Gupta and M.N. Nikiforuk. Energetistic stability of fuzzy dynamic systems.
IEEE Trans. on Systems, Man and Cybernetics, 15, pp. 783–792, 1985
76. P.E. Kloeden. Fuzzy dynamical systems. Fuzzy Sets and Systems 7, pp. 275–296, 1982
77. T. Kohonen. Self-organising and Associative Memory, 3rd Ed, Springer Verlag, New York,
1988
78. A.N. Kolmogorov and S.V. Fomin. Measure, Lebesgue Integrals and Hilbert Space. Acad-
emic Press, New York, 1961
79. J.R. Koza. Genetic Programming: On The Programming Of Computers By Means Of Natural
Selection. MIT Press, 1992
80. K. Kristinsson and G.A. Dumont. System identification and control using genetic algorithms.
IEEE Transactions on System, Man, and Cybernetics, SMC-22(5), pp 1033–1046, 1992
81. H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. Wiley-Interscience, New
York, 1972
82. A.M. Lee and H. Takagi. A framework for studying the effects of dynamic crossover, mutation,
and population sizing in genetic algorithms. In T. Furuhashi, Ed. Advances in Fuzzy Logic,
Neural Networks and Genetic Algorithms. Proc. 1994 IEEE/Nagoya-University World Wide
Wisepersons. Selected papers. LNAI 1011 Springer-Verlag, Berlin, pp. 111–126, 1995
83. A.M. Lee and H. Takagi. Dynamic control of genetic algorithms using fuzzy logic techniques.
In Proc. Fifth International Conference on Genetic Algorithms (ICGA’93), San Mateo, pp.
76–83, 1993
84. C.C. Lee. Fuzzy logic in control systems: fuzzy logic controller, Parts I and II. IEEE Trans.
SMC. 20, pp. 405–435, 1990
85. H.K. Lee, E. Paillet and W. Peeters. A consistency criterion for optimizing defuzzification in
fuzzy control. In R. Lowen and A. Verschoren, Eds. Foundations of Generic Optimization Vol
II: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, Mathematical
Modelling: Theory and Applications, Springer Verlag, 2007
86. H.K. Lee, E. Paillet and W. Peeters. An asymptotic consistency criterion for optimizing de-
fuzzification in fuzzy control. In R. Lowen and A. Verschoren, Eds. Foundations of Generic
Optimization Vol II: Applications of Fuzzy Control, Genetic Algorithms and Neural Net-
works, Mathematical Modelling: Theory and Applications, Springer Verlag, 2007
87. C.H. Ling. Representation of associative functions. Publ. Math. Debrecen 12, pp. 182–212,
1965
88. R. Lowen. On (R(L), ⊕). Fuzzy Sets and Systems 10, pp. 203–209, 1983
89. R. Lowen. Fuzzy integers, fuzzy rationals and other subspaces of the fuzzy real line. Fuzzy
Sets and Systems 14, pp. 231–236, 1984
90. R. Lowen. The order aspect of the fuzzy real line. Manuscripta Math. 39, pp. 293–309, 1985
91. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer
Academic, Dordrecht, 1996
92. J.L. McClelland and D.E. Rumelhart. Explorations in Parallel Distributed Processing. MIT
Press, 1988
93. A. Maeda, S. Someya and M. Funabashi. A self–tuning algorithm for fuzzy membership func-
tions using a computational flow network. Proceedings of the IFSA ’91, Brussels, 1991
94. E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic con-
troller. Int. Journal of Man-Machine Studies 7, pp. 1–13, 1975
95. E.H. Mamdani and N. Baaklini. Prescriptive method for deriving control policy in a fuzzy
logic controller. Electronic Letters, 11, pp. 625–626, 1975
96. E.H. Mamdani, T. Procyk and N. Baaklini. Application of fuzzy logic to controller design
based on linguistic protocol. In: Discrete Systems And Fuzzy Reasoning, E.H. Mamdani and
B.R. Gaines, eds. Queen Mary College, University of London, pp. 125–149, 1976
97. M. Margialot and G. Langholz. Fuzzy Lyapunov–based approach to the design of fuzzy con-
trollers. Fuzzy Sets and Systems 106, pp. 49–59, 1999
98. L. Meyer and X. Feng. A fuzzy stop criterion for genetic algorithms using performance
estimation. In Proc. of 3rd IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’94),
Orlando, pp. 1990–1995, 1994
99. M. Ming. On embedding problems of fuzzy number space: part 5. Fuzzy Sets and Systems
55, pp. 313–318, 1993
100. M. Minsky and A. Papert. Perceptrons. MIT Press, 1969
101. M. Mizumoto. Pictorial representations of fuzzy connectives part I: cases of t–norms, t–
conorms and averaging operators. Fuzzy Sets and Systems 31, pp. 217–242, 1989
102. M. Mizumoto. Realization of PID controllers by fuzzy control methods. In IEEE First Int.
Conf. on Fuzzy Systems, number 92CH3073-4. Institute of Electrical and Electronics Engi-
neers Inc, San Diego, pp. 1–16, 1992
103. M. Mizumoto. Improvement of fuzzy control methods. In H. Li and M.M. Gupta, Eds.
International Series In Intelligent Technologies: Fuzzy Logic And Intelligent Systems.
Kluwer Academic Publishers, pp.1–16, 1995
104. M. Mizumoto and J. Tanaka. Some properties of fuzzy numbers. In M.M. Gupta, R.K. Ragade
and R.R. Yager, Eds. Advances in Fuzzy Set Theory and Applications. North–Holland, New
York, pp. 153–164, 1979
105. C.V. Negoita. On the stability of fuzzy systems. Proc. IEEE Internat. Conf. Cybernetics and
Society, pp. 936–937, 1978
106. H. Nomura, I. Hayashi and N. Wakami. A self–tuning method of fuzzy control by descent
method. Proceedings of the IFSA ’91, Brussels, pp. 155–158, 1991
107. A.M. Norwich and I.B. Turksen. A model for the measurement of membership and the con-
sequences of its empirical implementation. Fuzzy Sets and Systems 12, pp. 1–25, 1985
108. M. Obitko and P. Slavı́k. Visualization of Genetic Algorithms in a Learning Environment. In
Spring Conference on Computer Graphics, SCCG ’99. Bratislava: Comenius University, pp.
101–106, 1999
109. A. Ollero, A. Garcia–Cerezo and J. Aracil. Design of Fuzzy Control Systems. Dpto. Ing. Sist.,
University of Malaga, Research report, 1992
110. S.V. Ovchinnikov. Transitive fuzzy orderings of fuzzy numbers. Fuzzy Sets and Systems 30,
pp. 283–295, 1989
111. K.M. Passino and S. Yurkovich. Fuzzy Control. Addison Wesley Longman Inc., Menlo Park,
CA, USA, 1998
112. R. Pearce and P.H. Cowley. Use of fuzzy logic to overcome constraint problems in genetic
algorithms. In Proc. of 1st IEE/IEEE International Conference on Genetic Algorithms in
Engineering Systems: Innovations and Applications, Sheffield, pp. 13–17, 1995
113. W. Pedrycz. An identification algorithm in fuzzy relational equations. Fuzzy Sets and Sys-
tems 13, pp. 153–167, 1984
114. W. Pedrycz. Identification in fuzzy systems. IEEE Trans. Systems, Man and Cybernetics 14,
pp. 361–366, 1984
115. W. Pedrycz. Approximate solutions of fuzzy relational equations. Fuzzy Sets and Systems 28,
pp. 183-202, 1988
116. W. Pedrycz. Fuzzy control and fuzzy systems, 2nd Ed. Wiley, New York, 1993
117. W. Pedrycz. Fuzzy Sets Engineering. CRC Press, 1995
118. W. Pedrycz and M. Reformat. Genetic optimization with fuzzy coding. In [57], pp. 51–67,
1996
119. T.J. Proczyk and E.H. Mamdani. A linguistic self-organizing process controller. Automatica
15(1), pp. 15–30, 1979
120. W. Qiao and M. Mizumoto. PID type fuzzy controller and parameters adaptive method.
Fuzzy Sets and Systems 78, pp. 23–35, 1996
121. K.S. Ray and D.D. Majumder. Application of the Circle Criteria for Stability Analysis of Lin-
ear SISO and MIMO Systems Associated With Fuzzy Logic Controller. IEEE Trans. Systems,
Man and Cybernetics, 14(2), pp. 345–349, 1984
122. K.S. Ray, A. Ghosh and D.D. Majumder. L2 -Stability and the Related Design Concept for
SISO Linear System Associated With Fuzzy Logic Controllers. IEEE Trans. Systems, Man
and Cybernetics, 14(6), pp. 932–939, 1984
123. I. Rechenberg. Evolutionsstrategie. Frd. Fromm Verlag, 1973


124. S.E. Rodabaugh. Fuzzy addition in the L–fuzzy real line. Fuzzy Sets and Systems 8, pp.
39–52, 1982
125. S.E. Rodabaugh. Complete fuzzy topological hyperfields and fuzzy multiplication in the fuzzy
real lines. Fuzzy Sets and Systems 15, pp. 285–311, 1985
126. F. Rosenblatt. Principles of Neurodynamics. Washington, DC, Spartan Press, 1961
127. D. Ruan, E.E. Kerre, G. De Cooman, B. Cappelle and F. Vanmassenhove. Influence of the
fuzzy implication operator on the method-of-cases inference rule. Internat. J. Approx. Rea-
soning, 4, pp. 307–318, 1990
128. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies — towards a
theory of rational defuzzification operators. Second IEEE International Conference on Fuzzy
Systems, San Francisco, pp. 1161–1166, 1994
129. T.A. Runkler and M. Glesner. Defuzzification and ranking in the context of membership value
semantics, rule modality, and measurement theory. In Proc. of the 1st European Congress on
Fuzzy and Intelligent Techniques, Aachen, 1994
130. J.J. Saade and H. Schwarzlander. Ordering fuzzy sets of the real line: An approach based on
decision making under uncertainty. Fuzzy Sets and Systems 50, pp. 237–246, 1992
131. D.R. Sadler. Numerical Methods for Nonlinear Regression. St. Lucia, University of Queens-
land Press, 1975
132. M.G. Safonov. Stability and Robustness of Multivariable Feedback Systems. Cambridge,
MA, MIT Press, 1980
133. E. Sanchez. Fuzzy genetic algorithms in soft computing environment. In Proc. of 5th
International Fuzzy Systems Association World Congress (IFSA’93), Seoul, pp. 1–13, 1993
134. E. Sanchez, T. Shibata and L.A. Zadeh. Genetic Algorithms And Fuzzy Logic Systems: Soft
Computing Perspectives In: Advances in Fuzzy Systems — Applications And Theory, vol. 7
World Scientific, 1997
135. G. Saridis. Towards the Realization of Intelligent Controls. Proceedings of the IEEE, 67
pp. 1115–1133, 1979
136. E. Schönburg, F. Heinzmann and S. Feddersen. Genetische Algorithmen und Evolution-
sstrategien. Addison-Wesley, 1994
137. A.B.S. Serapião, A.F. Rocha, M.P. Rebello and W. Pedrycz. Towards a theory of genetic
systems. In [57], pp. 68–94, 1996
138. W. Siler and H. Ying. Fuzzy control theory: The linear case. Fuzzy Sets and Systems 33,
pp. 275–290, 1989
139. P. Smets and P. Magrez. Implication in fuzzy logic. International Journal of Applied Reason-
ing, 1, pp. 327–347, 1987
140. M. Sugeno. An introductory survey of fuzzy control. Inform. Sci 36, pp. 59–83, 1985
141. M. Sugeno. Industrial Applications of Fuzzy Control. North-Holland, Amsterdam, 1985
142. M. Sugeno and M. Nishida. Fuzzy control of model car. Fuzzy Sets and Systems 16, pp. 103–
113, 1985
143. M. Sugeno and K. Tanaka. Stability analysis and design of fuzzy systems. Fuzzy Sets and
Systems 45, pp. 136–156, 1992
144. C.T. Sun and J.S. Jang. Fuzzy classification based on adaptive network and genetic algo-
rithms. In [134], pp. 113–131, 1997
145. K.L. Tang and R.J. Mulholland. Comparing fuzzy logic with classical controller designs.
IEEE Trans. SMC 17, pp. 151–164, 1987
146. K. Uehara and M. Fujise. Fuzzy inference based on families of α -level sets. IEEE Trans.
Fuzzy Systems 1 (2) pp. 111–124, 1993
147. M. Umano and Y. Ezawa. Execution of approximate reasoning by neural network. (in
Japanese) Proceedings of FAN Symposium, pp. 267–273, 1991
148. W. Van Leekwijck and E.E. Kerre. Defuzzification: criteria and classification. Fuzzy Sets
and Systems 108, 1999, pp. 159–178.
149. W. Van Leekwijck and E.E. Kerre. Continuity focused choice of maxima: Yet another de-
fuzzification method. Fuzzy Sets and Systems 122, pp. 303–314, 2001
150. M. Vidyasagar. New directions of research in nonlinear systems theory. Proc. of the IEEE
77(8), pp. 1060–1090, 1986
151. S. Voget. Multiobjective optimization with genetic algorithm and fuzzy control. In Proc.
of the 4th European Conference on Intelligent Techniques and Soft Computing (EUFIT
Aachen’96), pp. 391–394, 1996
152. H.M. Voigt. Fuzzy evolutionary algorithms. Technical Report 92-038, International Com-
puter Science Institute (ICSI), 1947 Center Street, Suite 600, Berkeley, CA, 94704, 1992
153. H.M. Voigt, J. Born and I. Santibanez-Koref. A multivalued evolutionary algorithm.
Technical Report 93-022, International Computer Science Institute (ICSI), 1947 Center Street,
Suite 600, Berkeley, CA, 94704, 1993
154. H.M. Voigt, H. Muhlenbein and D. Cvetkovic. Fuzzy recombination for the continuous
breeder genetic algorithm. In Proc. of the 6th International Conference on Genetic
Algorithms (ICGA’95), Pittsburgh, pp. 104–111, 1995
155. M. Wakami and H. Terai. Application of fuzzy theory to home appliances. in: K. Hirota
Industrial Applications of Fuzzy Technology, Tokyo, Berlin, Heidelberg, pp. 283–310, 1993
156. P.Y. Wang, G.S. Wang, Y.H. Song and A.T. Johns. Fuzzy logic controlled genetic algorithms.
In Proc. of 5th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’96), vol. 2,
New Orleans, pages 972–979, 1996
157. S. Weber. A general concept of fuzzy connectives, negations and implications based on t–
norms and t–conorms. Fuzzy Sets and Systems 11, pp. 115–134, 1983
158. H.Y. Xu and G. Vukovich. A fuzzy genetic algorithm with effective search and optimization.
In Proc. International Joint Conference on Neural Networks (IJCNN’93), Nagoya, pp. 2967–
2970, 1993
159. R.R. Yager. A procedure for ordering fuzzy subsets of the unit interval. Information Sci. 24,
pp. 143–151, 1981
160. R.R. Yager and D.P. Filev. SLIDE: A simple adaptive defuzzification method. IEEE Trans.
Fuzzy Systems 1(1), pp. 69–78, 1993
161. Y. Yamashita, S. Matsumoto and M. Suzuki. Start-up of a catalytic reactor by fuzzy con-
troller. J. Chemical Engineering of Japan, 21, pp. 277–281, 1988
162. T. Yamazaki and E.H. Mamdani. On the performance of a rule-based self–organising con-
troller. Proc. of IEEE Conf. on Applications of Adaptive and Multivariable Control. Hull,
England, pp. 50–55, 1982
163. S. Yasunobu and S. Miamoto. Automatic train operation by predictive fuzzy control. In
M. Sugeno. Industrial Applications of Fuzzy Control., Amsterdam, New York, 1985
164. L.A. Zadeh. Fuzzy sets. Information and Control 8, pp. 338–353, 1965
165. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision
processes. IEEE Trans. Syst. Man. Cybernet., 3, pp. 28–44, 1973
166. L.A. Zadeh. The concept of a linguistic variable and its application to approximate reason-
ing. Information Sci. 8, pp. 199–249 and 9, pp. 43–80, 1975
167. G. Zames. On the I–O Stability of Time Varying Nonlinear Feedback Systems. IEEE Trans.
on Automatic Control, 11, pp. 228–238, 1966a
168. H.J. Zimmermann. Fuzzy Set Theory and Its Applications. Kluwer Academic, Boston/
Dordrecht/London, 1996
Optimal Fuzzy Management of Reservoir
based on Genetic Algorithm

Alberto Cavallo and Armando Di Nardo

Abstract This chapter deals with water resource management problems approached
from an automatic control point of view. The motivation for the study is the need
for an automated management policy for an artificial reservoir (dam). A hybrid model
of the reservoir is considered and implemented in Stateflow/Simulink, and a fuzzy
decision mechanism is implemented in order to produce different water release
strategies. A new cost functional is proposed that weighs the users’ desiderata (in
terms of water demand) against water waste (in terms of water spills). The
parameters of the fuzzy system are optimized by means of genetic algorithms, which
have proved very effective due to the strong nonlinearity of the problem. Modified
AR and ARMAX models of the inflow are identified, and Monte Carlo simulations
are used to test the effectiveness of the proposed strategy in different operating
scenarios.

Keywords: Fuzzy control; Hybrid control; Montecarlo simulation; Cost functional

Alberto Cavallo
Dipartimento di Ingegneria dell’Informazione, Seconda Università degli Studi di Napoli,
via Roma 29, 81031 Aversa, Italy
Armando Di Nardo
Dipartimento di Ingegneria Civile, Seconda Università degli Studi di Napoli, via Roma 29, 81031
Aversa, Italy
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 139–159.
© 2008 Springer.

1 Introduction

Water resources management is a multiobjective problem in which many different
disciplines are involved. Management decisions must often be based on very
different considerations (political, economic, etc.), which are rather hard to express
in mathematical terms. Moreover, the success of water management policies
is not only a matter of technical approaches (see e.g. [16]). However, the best the
engineer can do is to employ different Multiobjective Optimization Techniques and
a multidisciplinary approach that is as flexible as possible, in order to take into account
the wide spectrum of aspects to be considered in the decision strategy. Basically, a
reservoir system is a water storage device: water volumes are stored during rainy
seasons and released in dry seasons. Starting from this basic concept, several reservoirs
can be interconnected, thus resulting in the “Integrated Water Resource Management”
paradigm, where different levels of complexity can be faced. In some
cases, careful water resource planning is crucial, as in the case of countries like
South Africa, where large demands caused by an economy based on mining and
heavy industries contrast with reduced water availability, thus calling for complex
water resources management [2]. There are different examples of successful applications
of water cycle management made possible by a careful integration of
different technologies [24].
Classical approaches to optimization problems in water resources management
involve the use of linear, dynamic, nonlinear or stochastic programming (see [28]
and references therein for a good survey on the topic). Neuro-Dynamic Program-
ming has been used in [7], where Evolutionary concepts have been used to accel-
erate the training phase of a neural network. Also game theory has been used, as
in [27], where a cooperative game model of the cooperative water allocation poli-
cies is deduced, and different parameters like water quantity and quality are taken
into account.
Moreover, in the last decade, a large number of papers devoted to the solution
of reservoir management problems based on fuzzy logic approaches have appeared
(e.g. [20], [19] and references therein). The fuzzy approach has proved to be very
effective both for its “native” capability to deal with nonlinear models and for the
possibility of taking into account heuristic and political rules. However, pure (heuristic)
fuzzy reasoning is very complex in practical applications [3]. Thus, after an initial
“naïve” approach, fuzzy modelling has become more and more formalized: “black
box” identification, optimality issues, clustering, stability proofs and other mathematical
procedures have given the fuzzy approach a strong mathematical background,
allowing the engineer to use a single design tool for problems described
in terms of both heuristic and classical mathematical structures. In such a context,
genetic algorithms have been applied because of the high computational complexity of
the phenomena to be faced [26] or because of nonlinearities in the objective function to
be minimized [9].
In this chapter a novel decision strategy for reservoir management is proposed,
based on a cost index proposed by the authors. In particular, a monthly water
demand is considered, resulting from mean values of historical user demands, and a
water release policy is to be decided, based on this demand and on the current water
availability. Of course, if water availability were infinite the best water release
policy would be to release all the water requested. Thus, the user demand can be viewed
as an “ideal” water demand. If, more realistically, water availability is limited, the
water manager must decide how much water to release. In the case of water
shortage, it is clear that it is useless, or even harmful, to ask for a water release that
cannot be yielded by the reservoir, hence it makes sense to reduce the water demand
to the reservoir. A fuzzy decision controller is used to modulate (i.e. to reduce,
according to the above discussion) the “ideal” water demand so that a nominally
requested water is released when available, but, if a drought is expected, lower water
demand levels are imposed. The choice of the decision strategy is critical. It is apparent
that too low water levels inside the reservoir prevent any corrective actions in the
case of droughts, while too high water levels in the reservoir are generally useless
and wasteful, since large volumes of water are lost to evaporation and/or water overflow.
The objective of this paper is to investigate different management strategies based
on mixed heuristic and nonlinear mathematical approaches. Moreover, quantitative
comparisons are carried out by evaluating standard quality indices for the proposed
solutions. The advantage of the fuzzy implementation is the possibility of defining
a linguistic meaning for the rules resulting from the mathematical optimization
and of adding heuristic rules as well, thus combining heuristic and rigorous mathematical
treatment. As a case study, the management of the Pozzillo (South Italy) river
basin is considered and simulations are carried out in the MATLAB/SIMULINK
integrated environment, using 36 years (1962–1998) of monthly data.
In the case of severe water shortage, even the reduced demand cannot be met
by the water volumes in the reservoir. This situation is detected by simulating the
behavior of the reservoir with a detailed model. Since extreme
situations (droughts, reservoir overflows) are to be taken into account, the standard
model of the reservoir based on the volume balance is not sufficient, as in the extreme
cases the very structure of the system changes. Thus a hybrid model [12]
is used to describe the reservoir with a single model, also in the presence of water
spills and water shortage.
A problem to face when defining optimal water management policies is the presence
of uncertainties in several data of the problem (e.g. water inflow, user demand,
etc.). For instance, in [5] a fuzzy version of Compromise Programming is proposed
to tackle the problem of resource planning for long-range water management.
However, the most used approach to uncertainties in water resource applications
is the stochastic approach, where uncertainties are considered as random variables
affecting the process in different ways. The use of stochastic approaches to modelling
hydrologic time series and their connections with water resources management
is well known. For instance, Hobbs [13] considers water resources uncertainties
resulting from long-term climate changes by using a Bayesian approach, i.e. a subjective
approach. Also, fuzzy-stochastic linear programming has been proposed for
the case of uncertain evaporation losses [18]. Another use of stochastic modelling
is the evaluation of control policies in different scenarios by using Montecarlo simulations
in order to assess reliability and effectiveness of decision strategies [2].
The latter is the approach used in this study. Specifically, a periodic AR-lognormal
model and a more complex periodic ARMAX-lognormal model of the inflow
are identified and used to assess the performance of a strategy selected by using
Genetic Algorithms on a fuzzy decision system, which defines the water release based on
information on water levels, water level rates, current month and ideal water request.
The effectiveness of the proposed strategy is shown against different strategies proposed
in practical use and in the literature.

2 Reservoir Water Release Policy

In this section the basics of the release policy are presented. The life cycle of the
reservoir has been divided into [8]:
1. Ordinary management condition
2. Emergency management condition
The first refers to the case where, in a given time interval, the total available water
volume is not less than the required one. In this case there is enough water to satisfy
the users’ demand, and the decision strategy must select whether to supply all the
water the users ask for or to save some water for possible future needs. Note that,
due to evaporation losses, overly conservative strategies would result in water waste
without fulfilling future users’ demand. The second management condition takes
place in drought periods. In this case the system enters an “emergency operation
condition”, where reduced water flows are supplied, trying to minimize the discomfort
of the users.
The decision strategy is based on the values of $h(t)$, the water level in the reservoir
at time $t$, and $\dot{h}(t)$, the height rate, as internal variables, and on $q^{id}_{ref}(t)$, the “ideal” (i.e.
in the case of infinite water availability) desired water supply, the current month $m$
and the water inflow $q_{in}(t)$ as external variables, and produces the water supply $q_{out}$,
considering current and foreseen water availability. Basically, the idea is to use a
set of empirical rules to define the water release as a function of the input variables.
This can be naturally implemented by using heuristic fuzzy rules. The rules will be
later optimized by using a genetic algorithm. However, in order to design the control
laws for the reservoir operations, the mathematical structure of the reservoir must
be examined first.

3 Mathematical Model of the Reservoir

3.1 Volume Balance Equation

A typical profile of the water inflow $q_{in}(t)$ and of the required outflow $q^{id}_{out}(t)$ is
depicted in Figure 1 over a time span of 24 months. Note that the two curves
are, roughly speaking, out of phase by six months, corresponding to water availability
and demand during the wet and dry seasons. The mathematical model of the
dynamics of the reservoir is described by the differential equation:

$$\dot{V} = q_{in}(t) - q_{ev}(t) - q_{out}(t), \tag{1}$$

where $V(t)$ is the reservoir volume at the generic time instant $t$, which depends on
the geometry of the reservoir, and $q_{ev}(t)$ is the evaporation. In particular
$V = \int_0^h A(\lambda)\,d\lambda$, where $A(h)$ is the area of the water surface and $h$ is the water height
in the reservoir. By applying the chain rule:

$$\dot{V} = \frac{dV}{dh}\,\dot{h} = A(h)\,\dot{h}. \tag{2}$$

Fig. 1 Typical behavior of natural inflow and required outflow (curves: $q^{id}_{out}$ and $q_{in}$; horizontal axis: Time [month]; vertical axis: Water Flows [m3/month], scale ×10^7)

The evaporation $q_{ev}(t)$ is usually modelled via an evaporation coefficient $k_{ev}(t)$
deduced from the reservoir’s losses at time instant $t$:

$$q_{ev}(t) = k_{ev}(t)\,A(h). \tag{3}$$

Physically, the volume is lower bounded by the “dead volume”, hence $A(h) \neq 0$.
Thus, substituting (2) and (3) into (1) and dividing by $A(h)$, the model of the
reservoir can be written

$$\dot{h} = -k_{ev} + \frac{1}{A(h)}\left(q_{in}(t) - q_{out}(t)\right). \tag{4}$$

Finally, a simple discrete time version of eqn. (4), computed at time instants
$t = kT$, $k = 0, 1, \ldots$, can be derived using an integration stepsize $T = 4$ hours:

$$h[(k+1)T] = h(kT) - k_{ev}(kT)\,T + \frac{1}{A[h(kT)]}\left[q_{in}(kT) - q_{out}(kT)\right] T. \tag{5}$$
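
The update in eq. (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `euler_step`, the constant-area reservoir `constant_area` and all numerical values are hypothetical, and flows are taken in m3/month with T in months.

```python
def euler_step(h, q_in, q_out, k_ev, area, T):
    """One forward-Euler step of eq. (5): returns h[(k+1)T] from h(kT)."""
    a = area(h)                      # water surface area A[h(kT)]
    if a <= 0.0:
        raise ValueError("A(h) must be positive above the dead volume")
    return h - k_ev * T + (q_in - q_out) * T / a


def constant_area(h):
    """Hypothetical reservoir geometry: constant surface area in m^2."""
    return 2.0e6


T = 1.0 / 180.0                      # 4 hours expressed in months
h_next = euler_step(h=10.0, q_in=1.8e7, q_out=0.0, k_ev=0.0,
                    area=constant_area, T=T)
print(h_next)                        # water level after one 4-hour step
```

With a real reservoir, `area` would be obtained from the measured volume-height curve V(h) rather than from a constant.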

3.2 Hybrid Dynamical Model of the Reservoir

Equation (5) describes the hydraulic balance in the reservoir only if the water vol-
ume belongs to a given interval at each time instant, i.e.

Vmin ≤ V (t) ≤ Vmax , (6)

where Vmin is the dead volume and Vmax is the reservoir volume, depending on the
dam height. If V (t) tends to increase over Vmax , an overflow qsp (water spill) hap-
pens, while if it reduces below Vmin it will be impossible for the dam to supply any
desired flow. The above consideration naturally suggests a hybrid model for the
reservoir, where three states of the reservoir can be identified.
Some additional variables are defined, namely the tentative water volume Vt and
the actually released water flow qact . The hybrid model of the reservoir encompasses
three states (conditions), as follows.
1. A standard condition (NORMAL), when the bounds (6) are satisfied and eq. (5)
applies
2. An overflow condition (SPILLS), where the water volume is constrained to its
maximum value
3. A drought condition (EMPTY), where no water can be supplied to the user (qact =
0) and no evaporation occurs (at least approximately, actually a small evaporation
happens, but can be neglected)
The input variables are the water volume at the previous step, the current water
inflow, outflow and assumed evaporation, while the outputs are the water volume
in the reservoir, the corrected evaporation and the spills (needed to compute the
performance indices in Section 7).
Finally, a fixed integration step T = 1/180 of a month (i.e. 4 h) is considered. The resulting
statechart is reported in Figure 2.
The Stateflow element is integrated into a MATLAB/SIMULINK simulation
scheme, to be used to evaluate and compare different operation strategies.
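
The three-state logic described above can be approximated outside Stateflow by a small state machine. The sketch below is a simplified reading of the chart, not the authors' implementation: the class name `HybridReservoir` is hypothetical, the EMPTY state is entered as soon as the tentative volume falls below the dead volume, and the hysteresis `thresh` is applied only when leaving EMPTY.

```python
from dataclasses import dataclass


@dataclass
class HybridReservoir:
    v: float                 # current volume
    v_min: float             # dead volume
    v_max: float             # maximum volume
    thresh: float = 0.0      # hysteresis used when leaving EMPTY (assumed)
    state: str = "NORMAL"

    def step(self, q_in, q_out, q_ev, T=1.0 / 180.0):
        """Advance one step of length T; return (q_act, q_ev_c, q_sp)."""
        if self.state == "EMPTY":
            # No release; the tentative volume follows the chart's EMPTY action.
            v_t = self.v_min + (q_in - q_ev) * T
            if v_t > self.v_min + self.thresh:
                self.state, self.v = "NORMAL", v_t
            else:
                self.v = self.v_min
            return 0.0, 0.0, 0.0

        # NORMAL or SPILLS: tentative volume from the balance equation.
        v_t = self.v + (q_in - q_out - q_ev) * T
        if v_t > self.v_max:
            self.state = "SPILLS"
            q_sp = (v_t - self.v_max) / T     # excess volume leaves as spill
            self.v = self.v_max
            return q_out, q_ev, q_sp
        if v_t < self.v_min:
            self.state, self.v = "EMPTY", self.v_min
            return 0.0, 0.0, 0.0
        self.state, self.v = "NORMAL", v_t
        return q_out, q_ev, 0.0


res = HybridReservoir(v=9.0, v_min=1.0, v_max=10.0)
q_act, q_ev_c, q_sp = res.step(q_in=360.0, q_out=0.0, q_ev=0.0)
print(res.state, res.v, q_sp)
```

The returned spill and corrected evaporation are the quantities that the performance indices later accumulate.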

4 Fuzzy Decision System

The fuzzy automatic decision system defines, in real time, the “actual outflow” in
the case of “emergency management conditions”. As stated above, the key idea is
to modulate the outflow, i.e. to decide a multiplicative (time-varying) factor $\rho(t)$,
with $\rho \in [0, 1]$, such that

$$q_{out}(t) = \rho(t)\, q^{id}_{out}(t) \tag{7}$$

is the released water, expressed as a fraction of the ideal one. As is known in the
literature (e.g. [10] and references therein), fuzzy systems turn numeric input
into numeric output through linguistic knowledge. Moreover, strategy (7) naturally
suggests the use of a Sugeno-type FIS [23].

NORMAL
    q_ev_c = q_ev;
    q_sp = 0;
    q_act = q_out;
    Vt = V;
    Vt = Vt + (q_in - q_act - q_ev)/180;
    V = Vt;

SPILLS
    V = VM;
    q_ev_c = q_ev;
    during:
    q_act = q_out;
    q_sp = q_in - q_act - q_ev;
    Vt = VM + q_sp/180;
    exit:
    q_sp = 0;

EMPTY
    q_ev_c = 0;
    q_sp = 0;
    q_act = 0;
    during:
    Vt = Vm + (q_in - q_ev)/180;
    V = Vm;

Transitions
    NORMAL -> SPILLS  [Vt > VM]
    SPILLS -> NORMAL  [Vt < VM]
    NORMAL -> EMPTY   [Vt < Vm - thresh]
    EMPTY  -> NORMAL  [Vt > Vm + thresh]

Fig. 2 Stateflow statechart for the reservoir

The core of fuzzy logic theory is the set of linguistic rules: in this study, trying to
take into account the knowledge of the reservoir management operator, the following
Sugeno-type rule system is developed:

$$R^{(l)}: \text{ if } x_1 \text{ is } P_1^{(l)} \text{ and } \ldots \text{ and } x_n \text{ is } P_n^{(l)} \text{ then } y = \rho^{(l)} y_{id}$$

where
• $x_i \in U_i \subset \mathbb{R}$ is the i-th input linguistic variable in the universe of discourse $U_i \subset \mathbb{R}$,
$i = 1, \ldots, n$.
• $y \in S \subset \mathbb{R}$ is the output linguistic variable in the universe of discourse $S$, expressed
as the product of a coefficient $\rho^{(l)} \in [0, 1]$ and an ideal output $y_{id} \in S$.
• $P_i^{(l)}$ is the fuzzy set referred to the i-th input variable and the l-th decision rule,
$i = 1, \ldots, n$, $l = 1, \ldots, r$.
• $\rho^{(l)} \in C_1 \subset [0, 1]$ is a crisp multiplier for the l-th rule, $l = 1, \ldots, r$, assuming
values in the set $C_1$, with cardinality $\gamma_1$. This is a “reduction factor” of the output
with respect to an “ideal” output.
The range of values of the coefficient $\rho^{(l)}$ is chosen so as to reduce the user’s
water demand. In particular, a decision rule system consisting of $r = 13$ rules and
$\gamma_1 = 6$ levels of output reduction has been selected, of the form

$$R^{(l)}: \text{ if } h \text{ is LOW and } \dot{h} \text{ is ZERO and month is DRY then } q_{out} = \text{LITTLE} \cdot q^{id}_{out}$$

with linguistic values and variables:

$x_1 = h$
$x_2 = \dot{h}$
$x_3 = \text{month}$
$x_4 = q_{\Sigma in}(t) = \int_0^{1\,\text{year}} q_{in}(t - \tau)\,d\tau$
$P_1$ = {LOW, HIGH}
$P_2$ = {NEGATIVE, ZERO, POSITIVE}
$P_3$ = {DRY, WET}
$P_4$ = {DROUGHT, NOT DROUGHT}
$C_1$ = {NOTHING, VERY LITTLE, LITTLE, MUCH, VERY MUCH, EMERGENCY}

where $q_{\Sigma in}$ is the cumulative value of the inflow over the last year. The choice of the
variables has the following rationale: $h$ takes into account the water currently at
disposal, $\dot{h}$ the presumed future volume trend, month the expected future inflow, and
$q_{\Sigma in}$ the past inflow history. Based on these variables, the decision strategy tries to foresee
the water availability to satisfy current and future customers’ requirements, suitably
reducing water supply in the case of hypothetical future negative scenarios.
The heuristic rules are summarized in Table 1.

Table 1 Fuzzy system rules

INPUT                                                    OUTPUT
P1 (h)    P2 (ḣ)      P3 (month)    P4 (qΣin(t))         C1
LOW       NEGATIVE    DRY           -                    LITTLE
LOW       ZERO        DRY           -                    LITTLE
LOW       POSITIVE    DRY           -                    LITTLE
HIGH      NEGATIVE    DRY           -                    VERY LITTLE
HIGH      ZERO        DRY           -                    NOTHING
HIGH      POSITIVE    DRY           -                    NOTHING
LOW       NEGATIVE    WET           -                    VERY MUCH
LOW       ZERO        WET           -                    VERY MUCH
LOW       POSITIVE    WET           -                    MUCH
HIGH      NEGATIVE    WET           -                    NOTHING
HIGH      ZERO        WET           -                    NOTHING
HIGH      POSITIVE    WET           -                    NOTHING
LOW       -           -             DROUGHT              EMERGENCY
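
To make the rule base of Table 1 concrete, the following sketch evaluates a zero-order Sugeno system on precomputed membership degrees. Several elements are assumptions rather than the authors' choices: the product is used as the "and" operator, the numerical multipliers in `RHO` are purely illustrative, and the membership degrees in `mu` are supplied by hand instead of being computed from triangular or Gaussian membership functions.

```python
# Rules of Table 1 as (h, hdot, month, drought, output level); None stands for "-".
RULES = [
    ("LOW", "NEGATIVE", "DRY", None, "LITTLE"),
    ("LOW", "ZERO", "DRY", None, "LITTLE"),
    ("LOW", "POSITIVE", "DRY", None, "LITTLE"),
    ("HIGH", "NEGATIVE", "DRY", None, "VERY LITTLE"),
    ("HIGH", "ZERO", "DRY", None, "NOTHING"),
    ("HIGH", "POSITIVE", "DRY", None, "NOTHING"),
    ("LOW", "NEGATIVE", "WET", None, "VERY MUCH"),
    ("LOW", "ZERO", "WET", None, "VERY MUCH"),
    ("LOW", "POSITIVE", "WET", None, "MUCH"),
    ("HIGH", "NEGATIVE", "WET", None, "NOTHING"),
    ("HIGH", "ZERO", "WET", None, "NOTHING"),
    ("HIGH", "POSITIVE", "WET", None, "NOTHING"),
    ("LOW", None, None, "DROUGHT", "EMERGENCY"),
]

# Hypothetical crisp multipliers rho for the six reduction levels.
RHO = {"NOTHING": 1.0, "VERY LITTLE": 0.9, "LITTLE": 0.7,
       "MUCH": 0.5, "VERY MUCH": 0.3, "EMERGENCY": 0.1}


def sugeno_rho(mu, rules=RULES, rho=RHO):
    """Weighted average of the rule multipliers (zero-order Sugeno output)."""
    num = den = 0.0
    for h, hdot, month, drought, level in rules:
        w = 1.0
        for var, label in (("h", h), ("hdot", hdot),
                           ("month", month), ("drought", drought)):
            if label is not None:
                w *= mu[var][label]          # product "and" (assumed)
        num += w * rho[level]
        den += w
    return num / den if den > 0.0 else 1.0   # no rule fired: no reduction


# Membership degrees for one operating point (illustrative values).
mu = {
    "h": {"LOW": 0.5, "HIGH": 0.5},
    "hdot": {"NEGATIVE": 0.0, "ZERO": 1.0, "POSITIVE": 0.0},
    "month": {"DRY": 1.0, "WET": 0.0},
    "drought": {"DROUGHT": 0.0, "NOT DROUGHT": 1.0},
}
rho_t = sugeno_rho(mu)
q_out = rho_t * 2.0e7        # eq. (7) with a hypothetical ideal demand
print(rho_t, q_out)
```

In this operating point only the rules "LOW, ZERO, DRY" and "HIGH, ZERO, DRY" fire, each with weight 0.5, so the multiplier is the average of the LITTLE and NOTHING levels.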

5 Optimizing the Decision Strategy

Three different reservoir management strategies have been designed and analyzed.

1. SOP: Standard Operation Policy
2. FOP: Fuzzy Operation Policy
3. OFOP: Optimized Fuzzy Operation Policy
The SOP [6] policy releases all the water demanded if there is enough available water
stored, whether there is an ordinary management condition or an emergency management
condition. This policy, although often used by reservoir managers, can
cause many disadvantages to the users. The FOP strategy distinguishes between ordinary
and emergency working conditions, trying to reduce negative consequences
for users in drought situations. It is designed with a heuristic estimation of all parameters
according to the rules described in Section 4. Finally, the OFOP strategy is
an optimized version of the FOP. In particular, 21 parameters are optimized in the
fuzzy system, as detailed below, i.e. center and support for triangular membership
functions and center and variance for Gaussian membership functions. In contrast,
the shape and the rules are still chosen heuristically, as in Table 1. With
a suitable parameter description, it has been possible to identify 15 variables to
optimize for the inputs. For the output membership functions, instead, the
optimization problem involves six variables only, due to the specific
Sugeno structure of the fuzzy membership function, which has six values for the output. Thus,
a total of 21 different variables have to be optimized, and in particular:
• LOW and HIGH for the input fuzzy set P1 depend on two parameters, as well
as NEGATIVE and POSITIVE for the fuzzy set P2 , DRY and WET for P3 and
DROUGHT for P4 .
• ZERO for the fuzzy set P2 depends on a single parameter (its center is fixed to
the value 0, only the variance is considered as a parameter).
• All the membership functions of the output fuzzy set C1 depend on a single
parameter.
So optimizing fuzzy parameters aims to improve the fuzzy rules system by
changing the shape, the overlap and the significance of linguistic rules. It is im-
portant to note that each parameter has to be constrained in order to preserve the
linguistic meaning of the rule system.

5.1 Genetic Algorithm and Fuzzy Membership Function Parameters

As already stated, the first fuzzy strategy FOP was developed with an empirical
approach: in particular both the shape and the values of the membership functions have been
chosen by exploiting the expertise of reservoir operators and then set by simulation.
However, this way of operating does not guarantee the optimal fulfillment of operating
rules because of the large number of parameters involved. Moreover, the reservoir
management problem is strongly nonlinear and time-varying, and it is necessary to
apply an efficient optimization technique. In this context, as stated in the Introduction,
Genetic Algorithms (GAs) [11, 17] have been recognized as a suitable tool
to solve the optimization problem, since they are conceptually powerful, yet
flexible and relatively easy to implement. The Matlab GA Toolbox has been used to
optimize the 21 fuzzy parameters with historical data as input. The problem is a nonlinear
and constrained optimization problem, since, in order to preserve the linguistic
meaning of the fuzzy rules presented in Section 4, it is necessary to constrain all the
variables. For instance, the following upper and lower bounds have been imposed
on the 6 output variables C1

UB = [0.60, 1.00, 0.80, 1.00, 0.40, 0.20] (8)

LB = [0.40, 0.95, 0.60, 0.80, 0.20, 0.05] (9)


The fitness function defined in the GA-based optimization procedure is:

$$y = \int w(q_{sp}) \left(q^{id}_{out}(t) - q_{out}(t)\right)^2 dt, \tag{10}$$

where $w(q_{sp})$ is a fuzzy weighting function which penalizes situations with high
spills. This accounts for the fact that saving more water can alleviate droughts
but increases water waste due to spills and evaporation.
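
A discrete approximation of the cost (10) on sampled series might look as follows. The chapter does not give the form of the fuzzy weight w, so `spill_weight` below is a hypothetical stand-in (equal to 1 without spills and growing up to 2 for large spills); the reference spill `q_ref` and the `gain` are likewise illustrative.

```python
import numpy as np


def spill_weight(q_sp, q_ref=1.0e7, gain=1.0):
    """Hypothetical weight: 1 with no spill, up to 1 + gain for large spills."""
    return 1.0 + gain * np.clip(np.asarray(q_sp, float) / q_ref, 0.0, 1.0)


def cost(q_id, q_out, q_sp, dt=1.0, weight=spill_weight):
    """Rectangle-rule approximation of eq. (10) on equally spaced samples."""
    q_id = np.asarray(q_id, float)
    q_out = np.asarray(q_out, float)
    return float(np.sum(weight(q_sp) * (q_id - q_out) ** 2) * dt)


# Same deficit, with and without a simultaneous spill.
print(cost([2.0, 2.0], [1.0, 2.0], [0.0, 0.0]))
print(cost([2.0, 2.0], [1.0, 2.0], [1.0e7, 0.0]))
```

In an optimization loop, `q_out` and `q_sp` would be produced by simulating the reservoir under the candidate fuzzy parameters.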
The GA solution is obtained with a population size of 40 individuals and with the
following principal GA parameters:
• Crossover Fraction = 0.80
• Migration Interval = 20
• Migration Fraction = 0.20
• Initial Penalty = 10
• Penalty Factor = 100
Finally, OFOP starts the GA optimization using the FOP solution as a starting
guess. In this way, the optimization solver is allowed to start from a “good” starting
guess, and trivial local minima are avoided a priori.
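
The idea of a bounded, seeded genetic search can be illustrated with a very small real-coded GA. This is a generic sketch and not the Matlab GA Toolbox: migration and penalty settings are not modelled, the selection, crossover and mutation operators are assumptions, and `toy_cost`, `target` and `x_start` are hypothetical stand-ins for the simulation-based cost and the FOP guess. Only the bounds are taken from eqs. (8) and (9).

```python
import numpy as np


def ga_minimize(cost, lb, ub, x0=None, pop_size=40, generations=60,
                crossover_fraction=0.8, n_elite=2, mutation_scale=0.1, seed=0):
    """Tiny real-coded GA: elitism, truncation selection, blend crossover, mutation."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    pop = rng.uniform(lb, ub, size=(pop_size, lb.size))
    if x0 is not None:
        pop[0] = np.clip(x0, lb, ub)               # seed with a starting guess
    n_cross = int(round(crossover_fraction * (pop_size - n_elite)))
    for _ in range(generations):
        fitness = np.array([cost(x) for x in pop])
        pop = pop[np.argsort(fitness)]             # best individuals first
        parents = pop[: pop_size // 2]
        children = [pop[i].copy() for i in range(n_elite)]  # elites survive
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            if len(children) < n_elite + n_cross:
                alpha = rng.uniform(size=lb.size)
                child = alpha * a + (1.0 - alpha) * b        # blend crossover
            else:
                child = a + mutation_scale * (ub - lb) * rng.normal(size=lb.size)
            children.append(np.clip(child, lb, ub))          # enforce the bounds
        pop = np.array(children)
    fitness = np.array([cost(x) for x in pop])
    best = int(np.argmin(fitness))
    return pop[best], float(fitness[best])


UB = [0.60, 1.00, 0.80, 1.00, 0.40, 0.20]          # eq. (8)
LB = [0.40, 0.95, 0.60, 0.80, 0.20, 0.05]          # eq. (9)
target = np.array([0.5, 0.97, 0.7, 0.9, 0.3, 0.1]) # hypothetical optimum


def toy_cost(x):
    """Stand-in for the simulation-based cost of eq. (10)."""
    return float(np.sum((np.asarray(x) - target) ** 2))


x_start = np.array(LB)                             # stand-in for the FOP guess
x_best, f_best = ga_minimize(toy_cost, LB, UB, x0=x_start)
print(x_best, f_best)
```

Because the elite individuals are copied unchanged, the best cost found can never be worse than that of the seeded starting guess, which mirrors the motivation for starting OFOP from the FOP solution.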

5.2 Performance Indices

It is easy to understand that each policy has its advantages and drawbacks. Therefore
the three strategies SOP, FOP and OFOP are compared using different performance
indices, not all of them mutually homogeneous, in order to evaluate the effectiveness
of the proposed approaches from different points of view. In particular, the following
performance indices are defined

• Volumetric Reliability: $\dfrac{\sum_t q_{out}}{\sum_t q^{id}_{out}} \times 100$
• Integral of Squared Deficits: $\sum_t \left(q_{out} - q^{id}_{out}\right)^2$
• Deficit Frequency: $\dfrac{\sum_t d(t)}{N} \times 100$, with $N$ the number of time instants considered
• Maximum Seasonal Deficit: $\dfrac{\max_i d_i^s}{\sum_t q^{id}_{out}} \times 100$
• Total Spills: $\sum_t q_{sp}(t)$
• Total Evaporation: $\sum_t q_{ev}(t)$

where

$$d(t) = \begin{cases} 1 & \text{if } q_{out}(t) < q^{id}_{out}(t) \\ 0 & \text{otherwise} \end{cases}$$

and

$$d_i^s = \sum_{t=1}^{12} \left(q^{id}_{out,i}(t) - q_{out,i}(t)\right), \quad i = 1, \ldots, n \tag{11}$$
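
The indices above can be computed from monthly series as in the following sketch. It assumes that the series length is a whole number of years, that each block of 12 consecutive samples is one year, and that the deficit frequency is normalized by the series length; the function name and the dictionary keys are hypothetical.

```python
import numpy as np


def performance_indices(q_out, q_id, q_sp, q_ev):
    """Performance indices from monthly series (length a multiple of 12)."""
    q_out, q_id, q_sp, q_ev = (np.asarray(a, float)
                               for a in (q_out, q_id, q_sp, q_ev))
    deficit = q_id - q_out
    yearly_deficit = deficit.reshape(-1, 12).sum(axis=1)   # d_i^s of eq. (11)
    return {
        "volumetric_reliability": 100.0 * q_out.sum() / q_id.sum(),
        "integral_squared_deficit": float(np.sum(deficit ** 2)),
        "deficit_frequency": 100.0 * float(np.mean(q_out < q_id)),
        "max_seasonal_deficit": 100.0 * yearly_deficit.max() / q_id.sum(),
        "total_spills": float(q_sp.sum()),
        "total_evaporation": float(q_ev.sum()),
    }


# Two years of constant demand with a shortfall in the first three months.
q_id = np.full(24, 10.0)
q_out = q_id.copy()
q_out[:3] = 5.0
indices = performance_indices(q_out, q_id, np.zeros(24), np.ones(24))
print(indices)
```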

6 Inflow Identification and Montecarlo Simulation

The above procedure suffers from one main disadvantage. In fact, it is clear that the
result depends not only on the ability of the genetic algorithm to find a “good”
suboptimum, but also on the inflow historical data entering the system. If for instance
more water were available, different results would have been obtained. The
problem is that the above procedure heavily relies on the vector of data qin (·), that is
actually a single realization of a stochastic process. Thus, the proposed strategy is
prone to the risk of overfitting a single (although significant) case, thus resulting
in a low level of generality. A possible, classic alternative is to use only a sub-
set of the data for the optimization, while the remaining data are employed for an
“objective” assessment (validation) of the result. However, this approach is accept-
able only when plenty of time-history data are available. In the present case, the
data are characterized by two dramatic events: a large peak in the first half of the
time history (around month 120), and a large drought in the second half (months
320–350). Thus, halving the data inevitably implies loss of meaningful pieces of
information. In the case of few data, it is advisable to “generate” new data by run-
ning a simulation model condensing the statistics of the inflow process. This can
be accomplished by identifying a dynamical model of the inflow [4] time history,
and using a random generator to produce simulated inflow processes, i.e. vectors of
random numbers preserving the statistics of the original process [15]. Thus the decision
strategy is defined on the whole original set of data, and its performance is
assessed by checking its behavior when inputs generated by the identified model are
used as new inputs.

Fig. 3 Historical inflows

Although there are plenty of mathematical tools for dealing with identification
problems, a deep understanding of the physics of the phenomenon to be identified is still
necessary in order to obtain good results. In the case of the considered inflow, a
record set of 36 years of monthly precipitations, the following considerations can be
deduced by looking at the time history in Figure 3.
• Occasionally, large values of the inflow appear.
• In most cases, very low values (close to zero) occur. This behavior closely
resembles what is called an “intermittent time series”, although, strictly speaking,
intermittent time series must have zero values [22].
• The behavior exhibits a clear periodicity, mainly based on the seasonal repetition.
The first step in identifying a dynamic system or a time history is to prefilter
the data. Generally, everything that can be easily extracted from the data, such as mean and
trends, is removed. In the case of hydrologic seasonal data, and in general when
periodic behaviors are present, simply removing the mean value has little impact. It is
better to remove seasonal means and to perform a seasonal normalization, in order
to have data where only stationary stochastic behaviors are present. This can be
accomplished as follows. Let $Q^k_{in}(t)$, $t = 1, \ldots, 12$, $k = 1, \ldots, 36$ be the inflow related
to the year $k$ and month $t$. Then the seasonal mean (monthly mean) is estimated as

$$\bar{Q}_{in}(t) = \frac{1}{36} \sum_{k=1}^{36} Q^k_{in}(t), \quad t = 1, \ldots, 12. \tag{12}$$

Analogously, an estimate of the variance is obtained

$$S_Q^2(t) = \frac{1}{35} \sum_{k=1}^{36} \left(Q^k_{in}(t) - \bar{Q}_{in}(t)\right)^2, \quad t = 1, \ldots, 12. \tag{13}$$

The time series to identify is next normalized by removing the seasonal mean and
variance:

$$Q^*_{in}(t) = \frac{Q_{in}(t) - \bar{Q}_{in}(t)}{S_Q(t)} \tag{14}$$
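
Equations (12) to (14) amount to a column-wise standardization once the series is arranged as a years-by-months array. The NumPy sketch below assumes a monthly series whose length is a multiple of 12 and starts at the first month of a year; the function name and the synthetic data are illustrative.

```python
import numpy as np


def seasonal_normalize(q_in, period=12):
    """Remove monthly means and scale by monthly standard deviations."""
    q = np.asarray(q_in, float).reshape(-1, period)  # rows: years, columns: months
    mean = q.mean(axis=0)                            # eq. (12)
    std = q.std(axis=0, ddof=1)                      # square root of eq. (13)
    return ((q - mean) / std).ravel(), mean, std     # eq. (14)


# Synthetic stand-in for 36 years of monthly inflows.
rng = np.random.default_rng(1)
inflow = rng.lognormal(size=36 * 12)
q_star, monthly_mean, monthly_std = seasonal_normalize(inflow)
print(q_star[:12])
```

The `ddof=1` argument reproduces the 1/35 factor of eq. (13) for 36 years of data.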
However, it is well known that, in the case of seasonal data, removing the seasonal
mean and variance is not sufficient to guarantee that all periodicity has been
removed from the data. In fact, autocorrelation in data is in general present, and
periodicity can express itself by affecting the correlation coefficients, which assume a
periodic pattern. Unlike the computation of seasonal mean and variance,
identifying residual periodicity is not a straightforward task, and ad hoc procedures
have been developed, such as the use of Periodic Autoregressive Models (PAR) [21, 25],
where the data are given in a “circular” fashion, i.e. the “head” and the “tail” of
the series are assumed coincident. However, in this way a periodicity is assumed,
rather than looked for. Since in our case a form of seasonality has already been
removed, a simple idea to search for a periodicity and simultaneously identify
it is to assume that the time series depends on a fictitious exogenous periodic input
$u(t) = \{(1, \ldots, 432) \bmod 12\}$.
Before performing this operation, the data need further preprocessing. In fact,
both the intermittent character mentioned above and the presence in the data of
extreme events call for a mathematical transformation of the data in order to “compress”
the extreme differences. By resorting to a distribution that is widely used in the
hydrologic field, the lognormal distribution, a reasonable transformation is to compute
the logarithm of the sequence. The set of data is thus transformed as

$$\hat{Q}(t) = \log_{10}\left(Q^*_{in}(t) - q_m + q_M\right), \tag{15}$$

where $q_m = \min_t Q^*_{in}(t)$ and $q_M = \max_t Q^*_{in}(t)$ are chosen so as to “symmetrize”
the variable $\hat{Q}$ around zero.
Finally, fitting the transformed data with a second-degree polynomial shows that
a slight parabolic trend is present in the data. Also this trend is removed, in order to
exploit the identification step at its best.
This concludes the data pretreatment phase. The data now exhibit more “uni-
form” variations, hence a stationary behavior is expected.
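
The last two pretreatment steps, the logarithmic transformation (15) and the removal of a second-degree polynomial trend, can be sketched with NumPy as follows. The function names are hypothetical and the least-squares fit via `numpy.polyfit` is one possible way to estimate the trend, not necessarily the one used by the authors.

```python
import numpy as np


def log_transform(q_star):
    """Eq. (15): shift the normalized series and take the base-10 logarithm."""
    q_star = np.asarray(q_star, float)
    q_m, q_M = q_star.min(), q_star.max()
    return np.log10(q_star - q_m + q_M)      # argument is at least q_M


def remove_quadratic_trend(x):
    """Subtract the least-squares second-degree polynomial from the series."""
    x = np.asarray(x, float)
    t = np.arange(x.size)
    coeffs = np.polyfit(t, x, deg=2)
    return x - np.polyval(coeffs, t), coeffs


q_hat = log_transform([-1.0, 0.0, 2.0])
residual, trend_coeffs = remove_quadratic_trend(q_hat)
print(q_hat, residual)
```

The logarithm is defined as long as the maximum of the normalized series is positive, which holds for any non-constant zero-mean series.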
At this point a linear stationary model able to fit the data is sought. Specifically,
a family of models is postulated, and a criterion is minimized to compute the best
member of the family, i.e. the model fitting the data optimally according to the given
criterion.
The operation is performed by using the System Identification Toolbox of Matlab,
which implements a large set of techniques based on classical concepts [15].
Using the classical prediction error as optimality criterion, the following families
are inspected.
1. ARX (Auto Regressive with eXogenous input)
2. ARMAX (Auto Regressive Moving Average with eXogenous input)
3. State-space
Moreover, the order is also selected along with the model. The worst results are
obtained with the state-space model, essentially because there is no sharp variation
in the singular values of the Hankel matrix [14], hence it is not easy to select the
“right” order. As far as the ARX model is concerned, two popular criteria for
model complexity selection are used, i.e. the FPE (Final Prediction Error) and the AIC
(Akaike Information Criterion) [15]. For the sake of notational simplicity,
let us drop all the subscripts, and denote the time history to identify by
$q(t)$ and the fictitious input defined above by $u(t)$. The model obtained by minimizing
the FPE is an ARX(1, 3), with a three-step delayed input, i.e. $q(t) = a_1 q(t-1) +
b_1 u(t-3) + b_2 u(t-4) + b_3 u(t-5)$, while the AIC gives an ARX(1, 1) with the
same delay, i.e. $q(t) = a_1 q(t-1) + b_1 u(t-3)$. However, in both cases the values
of the coefficients of the input are very small, and below their standard deviation,
which means that they are hardly reliable. Since from a physical point of view the
input is only a sign of the seasonal periodicity of the data, the conclusion can be drawn
that all the seasonality has been removed from the data in the pretreatment phase
(or, more correctly, that there is no further evidence of a definite yearly pattern in the
data when using an ARX-family model). The next step is thus to remove the
fictitious input and to identify the time sequence by using a simple AR model. In
this case both the AIC and the FPE give an AR(1) as best model; in particular the
result is
y(t) = 0.37(±0.063)y(t − 1) + ξ (t) (16)

where also the standard deviation of the estimate has been indicated and ξ (t) is a
white Gaussian noise, as can be easily verified by using suitable whitening tests (e.g.
Anderson’s test) and normality tests (e.g. Kolmogorov–Smirnov test for normality).
Moreover, an AR(3) model has also been tested, motivated by the three-step delay
found with the ARX model above. The identification shows that a
special AR(3) model actually gives a good result, namely one with a zero coefficient
on the two-step delay:

y(t) = 0.40(±0.069)y(t − 1) + 0.108(±0.074)y(t − 3) + ξ (t) (17)

However, this model is equivalent to model (16) from a prediction error criterion
point of view, hence the former is preferred for its lower complexity.
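
The order selection discussed above can be mimicked with an ordinary least-squares AR fit and the classical normalized forms of the FPE and AIC criteria, FPE = V(1 + d/N)/(1 - d/N) and AIC = log V + 2d/N, where V is the residual variance, d the number of parameters and N the number of samples. The sketch below is not the System Identification Toolbox; the function name `fit_ar` is hypothetical.

```python
import numpy as np


def fit_ar(y, p):
    """Least-squares fit of y(t) = a_1 y(t-1) + ... + a_p y(t-p) + xi(t)."""
    y = np.asarray(y, float)
    # Column i holds y(t-i) for t = p, ..., len(y) - 1.
    X = np.column_stack([y[p - i: len(y) - i] for i in range(1, p + 1)])
    target = y[p:]
    a, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ a
    n = target.size
    v = float(resid @ resid) / n              # residual (loss) variance
    fpe = v * (1 + p / n) / (1 - p / n)       # Final Prediction Error
    aic = np.log(v) + 2 * p / n               # Akaike Information Criterion
    return a, v, fpe, aic


# Compare candidate orders on a synthetic white-noise series.
rng = np.random.default_rng(0)
series = rng.normal(size=432)
for order in (1, 2, 3):
    coeffs, v, fpe, aic = fit_ar(series, order)
    print(order, coeffs, fpe, aic)
```

The order with the smallest criterion value would be retained; when two orders give nearly equal values, the lower one is preferred, as done above for models (16) and (17).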

Finally, an ARMAX model is identified by using the ARX analysis as starting
point. However, the identification is now more complex and the interpretation of the
results is less intuitive. By operating as in the ARX case, an ARMAX(3, 1, 2) model
with delay 3 in the input is obtained,

$$\begin{aligned} y(t) = {} & 0.86(\pm 0.066)\,y(t-1) + 1.15(\pm 0.044)\,y(t-2) - 0.35(\pm 0.066)\,y(t-3) \\ & + 0.0064(\pm 0.0038)\,u(t-3) + \xi(t) - 0.49(\pm 0.01)\,\xi(t-1) + \xi(t-2) \end{aligned} \tag{18}$$

from which the following considerations are deduced. The ARMAX model is able
to detect a slight periodicity in the data, although with relatively high variance and
hence low reliability. Moreover, the model is considerably more complex than the
AR(1), and simply trying to identify an ARMA model by removing the fictitious
input, as in the AR case, leads to a completely unreliable model, with coefficients
whose standard deviations are larger than the coefficients themselves. On the other
hand, the global improvement in using such a model is not worth the increase in
complexity, hence the model (16) is selected.
The model thus deduced is used for simulation, by feeding the identified system
with a Gaussian pseudo-white noise with variance computed from the model error
variance. A plot of a realization of the simulated inflow vs the true data is shown in
Figure 4.

Fig. 4 Measured and simulated inflows
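The simulation step described above can be sketched as follows. This is only a minimal illustration, assuming that the series has already been preprocessed as in the identification step; the coefficient 0.37 is taken from model (16), whereas the noise standard deviation `sigma`, the function names and the seed are hypothetical placeholders, since the model error variance is not reported in this passage.

```python
import random

def simulate_ar1(n_steps, a=0.37, sigma=1.0, y0=0.0, seed=None):
    """Simulate y(t) = a*y(t-1) + xi(t) with xi(t) ~ N(0, sigma^2).

    a=0.37 is the AR(1) coefficient of model (16); sigma is a
    hypothetical placeholder for the model error standard deviation.
    """
    rng = random.Random(seed)
    series, previous = [], y0
    for _ in range(n_steps):
        # Feed the identified system with Gaussian pseudo-white noise.
        previous = a * previous + rng.gauss(0.0, sigma)
        series.append(previous)
    return series

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation, a quick check on the simulated data."""
    mean = sum(series) / len(series)
    num = sum((series[t] - mean) * (series[t - 1] - mean)
              for t in range(1, len(series)))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

# One realization as long as the historical record (432 months).
realization = simulate_ar1(432, seed=1)
print(len(realization), round(lag1_autocorrelation(realization), 2))
```

For a long realization the sample lag-1 autocorrelation should approach the AR(1) coefficient, which gives a simple sanity check before the synthetic inflows are used in Monte Carlo runs.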


154 A. Cavallo and A.D. Nardo

7 Case Study

The methodology developed in this paper has been applied to the management
of the Pozzillo reservoir, on the Salso River in Sicily (Italy). The Pozzillo reservoir
is a multipurpose system (hydroelectric, irrigation and municipal); the basin area is
about 577 km² and the net storage is 123 × 10^6 m³.
The available data refer to the years 1962–1998 and comprise 432 months of water
inflow q_in, represented in Figure 3, the monthly evaporation rates k_ev, the reservoir
volume as a function of the water height, V = V(h), and the ideal water demand q_out^id.
Referring to the hydrologic year (October–September), it is possible to see the recent
drought events that struck Southern Italy in the years 1988–1990.
The three different strategies SOP, FOP and OFOP have been tested both with the
available historical data from 1962–1998 and with 10,000 Monte Carlo runs based
on historical data, as explained in Section 6. The results from the historical data are
described below.
In Figure 5 it is possible to observe several months in which the reservoir does not
succeed in fulfilling the water demand; in particular, during the drought in months
320–340, the SOP strategy is unable to reduce the customer discomfort.
A dramatic improvement is obtained with the FOP strategy, as the water crisis is
prevented by preserving water in the previous months and releasing it in the drought
months (Figure 6).
The OFOP strategy performs even better and represents an optimal solution for
improving the management of the reservoir. In fact, as shown in Figure 7, during
the winters, when the water demand is smaller, the demand is almost completely
satisfied, and in summer drought months, for example in months 390–400, the
strategy copes better with the crisis by guaranteeing a reduced (but non-null) water
yield to the user.

[Plot: flow (×10^7 m³/month) versus time (months 240–420); curves q_out^id and q_out]

Fig. 5 SOP (Standard Operation Policy), years 1982–1998

[Plot: flow (×10^7 m³/month) versus time (months 240–420); curves q_act, q_out^id and q_out]

Fig. 6 FOP (Fuzzy Operation Policy), years 1982–1998

[Plot: flow (×10^7 m³/month) versus time (months 240–420); curves q_act, q_out^id and q_out]

Fig. 7 OFOP (Optimized Fuzzy Operation Policy), years 1982–1998

Table 2 Performance indices of Pozzillo reservoir operation during 1982–1998 (historical data)

Operat.   Volum.    Sum of       Def.     Tot.        Tot.        Max. Mean
Policy    Reliab.   Sq. Def.     Freq.    Spill       Evap.       Seas. Def.
          (%)       (10^5 m³)    (%)      (10^7 m³)   (10^7 m³)   (% demand)

SOP       84        969          17       51.5        22.0        100
FOP       82        812          100      53.8        26.3        100
OFOP      78        685          100      61.3        28.2        100

Table 3 Performance indices of Pozzillo reservoir operation during 1982–1998 (mean values over
10,000 Monte Carlo runs)

Operat.   Volum.    Sum of       Def.     Tot.        Tot.        Max. Mean
Policy    Reliab.   Sq. Def.     Freq.    Spill       Evap.       Seas. Def.
          (%)       (10^5 m³)    (%)      (10^7 m³)   (10^7 m³)   (% demand)

SOP       89        754          12       51.0        24.0        60
FOP       85        650          100      56.0        27.5        56
OFOP      81        545          100      67.3        29.3        52

Naturally, as already noted with reference to the FOP strategy, the improved result
depends on the fact that the user is generally given less water than required, because
the fuzzy strategies save some of the resource for possible future shortages.
Nevertheless, such a criterion presents some disadvantages, namely an increase in
water spill and water evaporation. For this reason, a comparison between the three
strategies, based on a simulation with the historical data, is reported in Table 2.
From Table 2 it is possible to note that the Sum of Square Deficits is drastically
reduced as the strategy changes from SOP to OFOP. However, this happens at the
expense of Deficit Frequency and Volumetric Reliability, because the fuzzy strategy
and the optimized fuzzy strategy preserve water in the previous months and,
as a consequence, spills and evaporation losses increase.
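The trade-off just described can be made concrete with a small sketch that computes three of the indices from monthly demand and release series. The definitions used here (volumetric reliability as total supplied volume over total demand, deficit frequency as the share of months with any deficit) are common conventions assumed for illustration and may differ in detail from those adopted in the chapter; the function name and the four-month sample data are hypothetical.

```python
def performance_indices(demand, release):
    """Return (volumetric reliability %, sum of squared deficits,
    deficit frequency %) for monthly demand and release series.

    The index definitions are assumed conventions, not necessarily
    the exact ones used for Tables 2 and 3.
    """
    if len(demand) != len(release) or not demand:
        raise ValueError("series must be non-empty and of equal length")
    deficits = [max(d - r, 0.0) for d, r in zip(demand, release)]
    supplied = sum(min(d, r) for d, r in zip(demand, release))
    volumetric_reliability = 100.0 * supplied / sum(demand)
    sum_squared_deficits = sum(x * x for x in deficits)
    deficit_frequency = 100.0 * sum(1 for x in deficits if x > 0) / len(deficits)
    return volumetric_reliability, sum_squared_deficits, deficit_frequency

# Hypothetical four-month example: same total shortage, distributed differently.
demand = [10.0, 10.0, 10.0, 10.0]
release_all_then_fail = [10.0, 10.0, 10.0, 2.0]   # full release, then one large deficit
release_hedged = [8.0, 8.0, 8.0, 8.0]             # small deficit every month

print(performance_indices(demand, release_all_then_fail))
print(performance_indices(demand, release_hedged))
```

In this toy case both release patterns supply the same total volume (80% of demand), but spreading the shortage reduces the sum of squared deficits from 64 to 16 while raising the deficit frequency from 25% to 100%, which mirrors the qualitative behaviour reported for the fuzzy strategies.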
In order to perform a more objective test, a campaign of 10,000 Monte Carlo runs
has been performed on the three analyzed strategies. The results obtained from the
historical data are confirmed by the Monte Carlo approach, which yields better
values for all performance indices because the historical data are strongly affected
by a heavy drought period (see Table 3).
The superiority of the OFOP approach from the point of view of the minimization
of the sum of the square deficits is apparent w.r.t. both the SOP and the FOP, and is
confirmed by using an Optimal Comparison Technique [1] for testing the hypothesis
of superiority of the OFOP decision strategy compared to the others at any common
level of significance (e.g. α = 5% or α = 1%).
A final observation concerns the actual water availability. Indeed, the proposed
strategy simply modulates the required water. However, in periods of severe drought
it can happen that the reservoir is unable to satisfy even a reduced demand. This
explains why the new variable q_act appears in Figures 6 and 7: it is the water outflow
actually released to the user, i.e. what the reservoir is able to yield, which is what
really matters to the user. A final figure comparing the real outflows q_act in the three
cases is useful to stress the differences between the strategies (Figure 8).

[Plot: flow (×10^7 m³/month) versus time (months 240–420); curves q_out^id, q_out^SOP, q_out^FOP and q_out^OFOP]

Fig. 8 Simulation results in the period 1986–1998

8 Conclusions

In this chapter different decision strategies for the problem of handling the water
management of an artificial reservoir in a fully automatic way have been analyzed
and compared. In particular, a Standard Operation Policy (SOP), a Fuzzy Operation
Policy (FOP) and an Optimized Fuzzy Operation Policy (OFOP) have been considered.
The SOP releases water whenever possible, regardless of the foreseen water
demand. The FOP supplies water based on the state of the reservoir and of external
variables, thus exhibiting forecasting properties: it reduces the water release, even if
some water is currently available, whenever saving water appears likely to alleviate
foreseen future droughts. OFOP is an optimized version of FOP obtained with Genetic
Algorithm techniques. To test the proposed strategies, a dynamic hybrid model of the
reservoir is deduced, and different operating situations are simulated with 10,000
Monte Carlo runs. While an unconstrained optimization is prone to the risk of
overspecializing on a single realization of the data set, the work shows that by
suitably mixing heuristic and optimization strategies (by constraining the optimization
according to the heuristic) a “smart” decision policy can be defined that performs
satisfactorily also in cases not considered in the optimization phase, thus showing that
the decision law has “learned” the rules for the optimal management of the reservoir.

References

1. Bar-Shalom Y. and X.-R. Li (1993). Estimation and Tracking: Principles, Techniques and
Software. Artech House, Boston. MA.
2. Basson M.S. and J.A. van Rooyen (2001). Practical Application of Probabilistic Approaches
to the Management of Water Resource Systems. Journ. of Hydrology 241, pp. 53–61.
3. Bing-Yuan C. (2003). Fuzzy Allotment Model in Water and Electricity Resources Shortage
and its Application Software. Proc. of the 12th IEEE Int. Conf. on Fuzzy Systems FUZZ ’03.
2. pp. 1317–1320.
4. Box, G. and G. Jenkins (1970). Time series analysis: Forecasting and control. Holden-Day.
San Francisco.
5. Bender M.J. and S.P. Simonovic (2000). A Fuzzy Compromise Approach to Water Resource
Systems Planning under Uncertainty. Fuzzy Sets and Systems 115, pp. 35–44.
6. Cancelliere, A., A. Ancarini and G. Rossi (2002). A neural networks approach for deriving
irrigation reservoir operating rules. Water Res. Management 16, pp. 71–88.
7. Castelletti, A., D. de Rigo, A.E. Rizzoli, R. Soncini-Sessa and E. Weber (2005). A Selective
Improvement Technique for Fastening Neuro-Dynamic Programming in Water Resource
Network Management. Proc. of the 16th IFAC World Congress. Praha. (CZ).
8. Cavallo, A., A. Di Nardo and M. Di Natale (2003). A fuzzy control strategy for the regula-
tion of an artificial reservoir. In: Sustainable Planning and Development (E. Beriatos, C.A.
Brebbia, H. Coccossis and A. Kungolos, Eds.). WIT Press, pp. 629–639.
9. Chen, Y.-M. (1997). Management of Water Resources using Improved Genetic Algorithms.
Computers and Electronics in Agriculture 18, pp. 117–127.
10. Dubois, D. and Prade, H. (Eds.) (1980). Fuzzy Sets and Systems: Theory and Applications.
Academic Press, New York.
11. Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning.
Addison-Wesley.
12. Gollu, A. and P.P. Varaiya (1989). Hybrid dynamical systems. In: Proc. of IEEE Conference
on Decision and Control. Tampa, FL.
13. Hobbs, P.F. (1997). Bayesian Methods for Analysing Climate Change and Water Resource
Uncertainties. Journ. of Environmental Management 49, pp. 53–72.
14. Katayama T. (2005). Subspace Methods for System Identification. Springer-Verlag, London.
15. Ljung, L. (1987). System Identification: Theory for the User. Prentice-Hall, Englewood
Cliffs, NJ.
16. Matondo, J.I. (2002). A Comparison between Conventional and Integrated Water Resources
Planning and Management. Physics and Chemistry of the Earth, Parts A/B/C 27, pp. 831–838.
17. Michalewicz, Z. (1999). Genetic Algorithms + Data Structures = Evolution Programs, 3rd ed.
Springer, Berlin.
18. Nazemi, A.R., M.R. Akbarzadeh and S.M. Hosseini. (2002). Fuzzy-stochastic Linear Pro-
gramming in Water Resources Engineering. Proc. of the IEEE Fuzzy Information Processing
Society, NAFIPS 2002. pp. 227–232.
19. Panigrahi, D.P. and P. P. Mujumdar (2000). Reservoir operation modeling with fuzzy logic.
Water Res. Management 14, pp. 89–109.
20. Russel, S.O. and P.E. Campbell (1996). Reservoir operating rules with fuzzy logic program-
ming. Journal Water Resources Planning Management 122(3), pp. 165–170.
21. Salas J.D., J.R. Delleur, V. Yevjevich and W.L. Lane (1980). Applied Modelling of Hydrologic
Time Series. Water Resources Publications. Littleton, CO.
22. Salas J.D. (1993). Analysis and Modeling of Hydrologic Time Series. In: Handbook of Hy-
drology (D.R. Maidment, Ed.). pp. 19.1–19.72. McGraw-Hill, New York.
23. Sugeno, M. (1985). Industrial Applications of Fuzzy Control, Elsevier Science.
24. Thomas, J.-S. and B. Durham (2000). Integrated Water Resource Management: Looking at the
Whole Picture. Desalination 156, pp. 21–28.
25. Thomas H.A.Jr. and M.B. Fiering, (1962). Mathematical Synthesis of Streamflow Sequences
for Analysis of River Basins by Simulations. In: The Design of Water Resources Systems
(A. Maas et al. Eds.). Harvard University Press, Cambridge, MA, pp. 459–493.
26. Vemuri V. R. and W. Cedeño (1995). A New Genetic Algorithm for Multi-Objective Opti-
mization in Water Resource Management. Proc. of the IEEE Int. Conf. on Evolutionary Com-
putation 1, pp. 495–500.
27. Wang, L., L. Fang and K.W. Hipel (2003). Cooperative Water Resource Allocation based on
Equitable Water Rights. Proc. of the IEEE International Conference on Systems, Man and
Cybernetics 5, pp. 4425–4430.
28. Yeh, W. (1985). Reservoir management and operation models: a state of the art review. Water
Resources Research 21(12), pp. 1797–1818.
Genetic Fuzzy Modeling of Supervisory
Scheduling of Freight Rail Systems

Francisco Mota Filho, Rodrigo Goncalves, and Fernando Gomide

Abstract This chapter develops a genetic fuzzy modeling approach for train schedu-
ling of freight rail network systems. A genetic fuzzy algorithm is suggested as a
means to solve train scheduling problems. The algorithm uses a fitness estimation
model based on participatory learning fuzzy clustering to improve its processing
speed and to keep solution quality. The approach is particularly useful in schedul-
ing problems involving dynamic environments because in these instances fitness
evaluation usually is costly. In dynamic environments such as rail network sys-
tems, decision-making demands feasible train movement plans to control traffic
and operate yards, stations and terminals. The genetic fuzzy algorithm is com-
pared against exact optimal solutions given by classic optimization and genetic
algorithms. To illustrate the usefulness of the approach, a real-world freight rail
system problem is solved using the genetic fuzzy approach and the classic genetic
algorithm. Results suggest that the genetic fuzzy approach constitutes a promising
alternative to solve scheduling problems in general, but performs particularly well
to produce supervisory train schedules.

Keywords: Genetic algorithm; Fuzzy control; Fitness estimation; Scheduling problems

1 Introduction

Traffic over rail networks has increased substantially during the last decade. Most
world rail network freight systems consist of single track with passing and crossing
sidings, although a fair amount of double track and few multiple mainline tracks
do exist. The growth in the transportation demand is introducing congestion and

Francisco Mota Filho, Rodrigo Goncalves, and Fernando Gomide


Department of Computer Engineering and Automation
Faculty of Electrical and Computer Engineering, State University of Campinas, 13083-512
Campinas, SP, Brazil, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization, 161
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 161–180.
© 2008 Springer.
162 F. M. Filho et al.

complicating the accessibility and capacity of rail networks. For instance, container
trade is growing at a 9.5% annual rate worldwide and ports are expected to double
and possibly triple their cargo by the next decade [24]. Estimates assuming 3% per
year growth in a national economy indicate that railroads must carry an additional
888 million tons by 2020, a 44% increase from 2003 [27]. Correcting congestion
with additional capital expenditures is costly in an industry that already has a low
return on capital expended. Therefore, many railroads are looking to technology to
provide better utilization of the capital and system capacity that is already in place.
Until recently most freight railroads used a tonnage-based approach to dispatch
trains. This means that trains are held until they have enough tonnage to fill them
to capacity. Under the tonnage-based approach, the operating plan lists a train as
operating every day, but if the railroad does not fill enough railcars, then it cancels
or delays the train. The idea is to minimize the total number of trains by choosing
larger trains, which should help to decrease operation costs and increase track
capacity. However, tonnage-based train planning requires more railcars and higher
yard storage capacity to cope with traffic variability. It may also increase crew and
locomotive repositioning costs, and may jeopardize customer needs due to the higher
emphasis on train operation economics [18]. In contrast to the tonnage-based approach,
scheduled railroading is gaining attention because it forces trains to run on time even
if they are only partially loaded. Schedule-based schemes require running trains with
low tonnage when demand is below expectations, as well as systematic and precise
forecasts of transportation demand. Quick schedule adaptation, more advanced
decision-making support procedures, and methodologies to analyze different
alternatives in a timely manner are also important. Current practice uses hierarchical
hybrids of tonnage and schedule-based approaches because different commodities
require distinct degrees of flexibility to accommodate trade-offs between customer
needs and economic operation. For instance, in hierarchical systems a supervisory
scheduling level develops medium-range
(typically for a 6–24 hour period) train schedules to provide references for train
movements. Whenever an unscheduled train enters the rail network or disturbances
occur, lower level real-time scheduling systems adjust current movement plans to
account for new traffic conditions. Adjustment must attempt to maintain the new
movement plans as compatible as possible with schedules given by the supervisory
level. If unfeasible movements occur, then a request is issued and the supervisory
level develops a new schedule.
There are many areas in which technology can improve the efficiency of railroad
operations. A railroad operation plan describes how railcars, trains and locomotives
should travel, and how to assign the major assets needed to move the fleet, especially
train crews, yards, tracks and maintenance crews. Railroad operation planning
involves a multitude of complex tasks. It starts with transportation demand and
movement requirements, establishes railcar routes and train formation, and assigns
resources and plans train movements. One major issue is the management of train
movements across the network because it may improve the utilization of capital and
system capacity that are already in place. It can also help to discover bottlenecks and
guide investment. Controllers set switches and signals, issue movement authorities
in dark territories and manage movement plans remotely. In centralized
Genetic Fuzzy Modeling of Supervisory Scheduling of Freight Rail Systems 163

control systems, track occupancy is detected through track circuits, GPS (Global
Positioning System) signals, voice communication via radio links, and displayed to
the controllers. A controller must deal with a variety of track and signal infrastructure,
and a wide variation in train performance. Maintenance crew requests and
safety in unsignaled areas must also be managed. As trains move across the rail
network, control of trains passes from controller to controller, often with frequent
interactions with yard and terminal managers for refueling, crew changes, car blocking
and train formation. Moreover, not all trains are of the same economic value for
the railroad, and priorities must be dynamically assigned to trains. Controller
performance is measured by how well controllers move trains over the network.
Therefore, scheduling methodologies and algorithms provide a means to plan train
movements to their destinations based on the value of trains and on physical, safety,
and operational constraints. Supervisory scheduling algorithms and procedures are
essential to develop globally optimal schedules for trains moving through different
railroad territories. Global schedules act as set-points for real-time train movement
planning and control systems.
Train scheduling over a rail network of track segments resembles the job-shop
problem of scheduling jobs on machines. A rail network comprises a set of track
segments that cannot be occupied by opposing trains at any instant, just as machines
in a job-shop can process only one job at a time. From the job-shop scheduling
point of view, however, there is a major difference: railroads often have yards and
stations with multiple tracks, and possibly single or double track segments between
yards and stations. The difference between job-shop and train schedules therefore
lies in the set of constraints, which depends on the track assignment and on the
selection of tracks in multiple track yards and stations. Train scheduling is actually
similar to job-shop scheduling with alternative machines, which makes it much more
difficult than conventional job-shop scheduling. Currently it is virtually impractical,
even for moderate
size instances, to solve this class of scheduling problems using exact methods.
The first attempts to solve train scheduling problems using both exact and approximate
methods date back to the beginning of the 1970s, when linear mathematical
programming models were developed [1], [33]. Linear and nonlinear mixed programming
models became available [21], [23], [4], [14], but the intractability
of these models for complex real-world problems soon became apparent. Heuristics
such as tabu search [14], [16], greedy search [22] and genetic algorithms [31], [15],
[16], [25] were attempted to solve the problem. Knowledge-based techniques [5],
hybridizations of discrete event models and greedy search techniques [8], and
combinations of discrete event models with fuzzy rule-based techniques [28], [35] have
been shown to provide a pragmatic and efficient approach to develop schedules for actual
system instances in real time. Distributed [19] and agent-based approaches [3] have
also been investigated. Recently, new classes of models were proposed to account
for the inherent multi-objective nature [11] and the flexibility required [37] by train
schedule problems.
Despite the significant performance of current high-speed computer systems,
exact solution of mixed optimization models with constraints for every train and
segment of a rail network still requires unreasonably long processing time. Usually
solution procedures must rely on approximations of actual rail operation procedures.
Heuristic procedures alleviate processing time requirements, but generally
they depend heavily on the problem characteristics and in many cases they may
not be effective in producing acceptable solutions. Optimization-based heuristic
approaches rely on simplifications to force auxiliary optimization problems to produce
unique feasible solutions at each node of a search tree. This is especially critical for
nonlinear schedule models. Moreover, because current multiobjective scheduling
models are extensions of their single objective counterparts, they share the
computational complexity of single objective optimization models. The main reason
behind these difficulties is the fact that trains can pass or overtake only at sidings.
Discrete event system models remove the complexities behind passing and overtak-
ing, but must rely on deadlock avoidance procedures, itself a very complex problem.
In general, discrete event models and greedy search produce mathematically suboptimal
schedules that perform well when solutions approach the optimal. Genetic
algorithms developed so far suffer from poor scalability due to inappropriate
representation of individuals, and the need to include the constraints via penalty functions.
Small population sizes must often be adopted because there is a high computational
cost to evaluate the fitness of each individual.
In this chapter we introduce a novel genetic fuzzy system approach to solve
supervisory train scheduling. The purpose is to develop train movement plans to
act as references for real-time train control systems, reflecting acceptable trade-off
between processing time and solution quality. The genetic fuzzy system benefits
from a fitness estimation model based on participatory learning fuzzy clustering
to improve its computational performance. The fitness estimation model addressed
in this paper assumes that individuals are genetically related. The participatory
learning fuzzy clustering [32] is used to cluster population into groups with sim-
ilar individuals during the fitness evaluation step. Clustering reduces the number
of direct evaluations and improves computational performance of the evolutionary
process. In addition, cluster-based schemes help to maintain population diversity, a
key mechanism to obtain good quality solutions. Overall, the genetic fuzzy approach
produces good quality schedules and runs significantly faster than conventional genetic
algorithms.
The chapter is organized as follows. Section 2 introduces the fitness estimation
model adopted and its role in the genetic fuzzy algorithms. An unconstrained, non-
linear function optimization example illustrates the performance of the algorithm.
Section 3 considers freight train scheduling in single track rail lines and presents
experimental results. Section 4 concludes the chapter and suggests issues for future
consideration.

2 Genetic Fuzzy Algorithm

Genetic algorithms are search algorithms based on the principles of natural genetics
whose purpose is to develop solutions for optimization problems. The main idea
is to start with a population of candidate solutions (individuals) encoded in a data
structure called a chromosome, and to evolve the population through a process of
competition and controlled stochastic variation. The population and its individual
members evolve during successive iterations called generations. Evolution proceeds
by natural selection, with individuals evaluated via a fitness function. Based
on these evaluations, a new population is formed using a selection mechanism and
specific genetic operators such as crossover and mutation. This procedure is repeated
until a stopping criterion is met. The best chromosome in the final population
expresses a solution. Although there are many possible variants of the main idea of
genetic algorithms, the fundamental mechanism consists of three steps: (a) evaluation
of each individual using the fitness function; (b) formation of an intermediate
population using a selection mechanism; and (c) recombination of individuals
using crossover and mutation operators [17], [12]. Figure 1 summarizes the basic
genetic algorithm. Generally, genetic algorithms are good choices when problems
involve discontinuous, nondifferentiable, and multimodal objective functions and
constraints. They are also useful for handling discrete search spaces, and interactive
optimization models involving subjective evaluations such as in system design and
simulation [34].

[Flowchart with boxes: Initial population; Evolution of population (Selection, Reproduction); decision "Stopping criteria satisfied?" with branch "no" leading to Next generation and branch "yes" leading to Return best individual]

Fig. 1 Genetic algorithms
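The three steps (a)–(c) of the basic genetic algorithm can be sketched as follows. This is only a minimal illustration under stated assumptions, not the algorithm used later in the chapter: the binary encoding, binary tournament selection, one-point crossover, bit-flip mutation, the fixed number of generations as stopping criterion and all parameter values are hypothetical choices made for the example.

```python
import random

def tournament(pop, scores, rng):
    """Binary tournament: return the fitter of two randomly drawn individuals."""
    i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[i] if scores[i] >= scores[j] else pop[j]

def run_ga(fitness, n_bits=20, pop_size=30, generations=40,
           p_cross=0.9, p_mut=0.02, seed=0):
    """Maximize `fitness` over bit strings; returns the best individual seen."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):          # stopping criterion: generation budget
        # (a) evaluation of each individual with the fitness function
        scores = [fitness(ind) for ind in pop]
        # (b) intermediate population formed by a selection mechanism
        parents = [tournament(pop, scores, rng) for _ in range(pop_size)]
        # (c) recombination with crossover and mutation operators
        offspring = []
        for k in range(0, pop_size, 2):
            p1, p2 = parents[k], parents[(k + 1) % pop_size]
            c1, c2 = p1[:], p2[:]
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_bits)      # one-point crossover
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):
                for g in range(n_bits):
                    if rng.random() < p_mut:        # bit-flip mutation
                        child[g] = 1 - child[g]
                offspring.append(child)
        pop = offspring[:pop_size]
        generation_best = max(pop, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best[:]
    return best

# Toy objective (hypothetical): maximize the number of ones in the chromosome.
best = run_ga(sum)
print(sum(best), best)
```

Note that step (a) calls the fitness function once per individual per generation; when each call is expensive, as in train scheduling, this is precisely the cost that fitness estimation aims to reduce.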



Genetic fuzzy systems are fuzzy systems augmented with a learning process
based on genetic algorithms. Like genetic algorithms, they provide robust
search capabilities in complex spaces and offer a powerful way to approach
problems requiring efficient and effective search processes [6], [7]. Genetic fuzzy
systems embrace different levels of complexity, from parameter optimization, to
learning of fuzzy rule bases and inference mechanisms. During the last ten years,
most of the effort in the area of genetic fuzzy systems has been devoted to fuzzy
rule-based systems. Recently, a new class of genetic fuzzy system emerged from
experiments with complex scheduling and sequencing problems for hybrid systems.
A particularly important class of hybrid systems are rail networks [25]. Scheduling
of rail systems involves continuous and discrete decision variables associated with
train movements in a rail line. The search space is considerably complex. In addi-
tion, fitness evaluation is expensive in rail systems because it involves the dynamics
of train movements, namely, train time trajectories. The genetic fuzzy system addressed
in this chapter uses fitness estimation procedures based on participatory
learning fuzzy clustering. The result is a genetic fuzzy system in which, contrary to
the prevailing view of current genetic algorithms, learning occurs concurrently with
population evolution.
Most genetic algorithms require a large number of fitness evaluations before ac-
ceptable solutions are found. In many practical situations fitness evaluation may
demand computationally expensive procedures. In these cases, fitness estimation
models can be adopted to alleviate computational costs, but solution quality must
be within acceptable bounds. In general, fitness estimation is useful when fitness
function evaluation is complex and time-consuming such as when there is no an-
alytic mathematical model, the environment is stochastic, and fitness landscape is
complex [20].
The use of fitness estimation models to improve computational performance of
evolutionary optimization algorithms dates back to the 1960s [9]. Previous efforts
have concentrated on response surface approximations used instead of the original
evaluation function [36]. Alternative approaches rely on special relations between the
approximate and the original model to develop multilevel search strategies [10].
Other schemes use functional approximation methods to form reduced models. A
comprehensive survey of fitness estimation models can be found in [20].
Two classes of genetic algorithms emerge from two main classes of fitness esti-
mation models, namely, fitness inheritance and fitness imitation [20]:
A. Fitness Inheritance
Fitness inheritance refers to all fitness estimation methods in which the fitness values
of the offspring individuals are directly derived from the fitness values of their parents.
These estimation methods can be interpreted as local because they consider only
parental information to estimate fitness, neglecting any information from the search
space. On the other hand, since they rely on local information only, they are easier
to use. An example of a simple fitness inheritance mechanism as a fitness estimation
strategy is suggested in [22]. Genetic algorithms with fitness inheritance follow
the steps of the basic genetic algorithm except that they add a confidence degree as
an attribute to each individual of the population. Next, during parent selection for
reproduction, offspring are evaluated directly only if they are chosen according to a
probability density function. If they are chosen, they are evaluated using the original
fitness function. Otherwise their fitness is estimated as a weighted combination of the
parent fitness values. Weights depend on the similarity between parents and offspring.
Confidence degrees are updated accordingly. We refer the reader to [30] for a detailed
explanation of the algorithm, since fitness inheritance will not be emphasized
here. Detailed description, analysis and comparisons are given in [20], [25].
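As a rough illustration of the inheritance mechanism described above, the following sketch evaluates an offspring directly with a given probability and otherwise estimates its fitness from its parents. The similarity measure (fraction of matching genes), the weighting by similarity times parental confidence and the confidence update rule are assumptions made for this example only; the actual scheme is the one detailed in [30].

```python
import random

def similarity(a, b):
    """Fraction of matching genes (1.0 means identical chromosomes)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def offspring_fitness(child, parents, true_fitness, p_eval, rng):
    """Return (fitness, confidence) for `child`.

    parents: list of (chromosome, fitness, confidence) tuples.
    With probability p_eval the original fitness function is used and the
    confidence is set to 1.0; otherwise the fitness is a weighted combination
    of the parents' fitness values (hypothetical weighting and update rule).
    """
    if rng.random() < p_eval:
        return true_fitness(child), 1.0
    weights = [similarity(child, parent[0]) * parent[2] for parent in parents]
    total = sum(weights)
    if total == 0.0:
        # No usable parental information: fall back to a direct evaluation.
        return true_fitness(child), 1.0
    estimate = sum(w * parent[1] for w, parent in zip(weights, parents)) / total
    confidence = total / len(parents)
    return estimate, confidence

rng = random.Random(0)
parents = [([1, 1, 1, 1], 4.0, 1.0), ([0, 0, 0, 0], 0.0, 1.0)]
child = [1, 1, 0, 0]
print(offspring_fitness(child, parents, sum, p_eval=0.0, rng=rng))
```

With `p_eval=0.0` the child, which shares half of its genes with each parent, inherits the average of the two parental fitness values and a reduced confidence of 0.5.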
B. Fitness Imitation
Fitness imitation embraces all fitness estimation methods that do not use any form
of fitness inheritance mechanism. This class can be viewed as global because it considers
information of the search space to estimate fitness. However, because of the
need for global information, it tends to be more complex to use. Fitness imitation
genetic algorithms require the choice of a set of individuals to represent the whole
population. These representative individuals are evaluated using the original fitness
function while the remaining individuals are evaluated using the estimation proce-
dure. Therefore, fitness estimation procedures must also be selected. The perfor-
mance of the genetic algorithm depends on the mechanism to choose representative
individuals and on the fitness estimation procedure.
1) Choice of Representatives
The choice of representative individuals can be random or deterministic. A possi-
ble choice of representatives is to randomly sample the population using, e.g. the
roulette wheel procedure and store them in a fixed size memory. Only a subset of
the sampled individuals in memory is directly evaluated using the original fitness
function [13]. While intuitively simple and appealing, this method is very sensitive
to the choices of memory size and number of individuals for direct evaluation.
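A random choice of representatives along these lines might look as follows. The sketch assumes that the roulette wheel operates on the current (possibly estimated) non-negative fitness values and that the subset chosen for direct evaluation is drawn uniformly from the memory; both assumptions, as well as the function names and sample values, are hypothetical and not taken from [13].

```python
import random

def roulette_wheel_sample(population, fitness_values, memory_size, rng):
    """Fitness-proportional sampling (with replacement) into a fixed-size memory."""
    if len(population) != len(fitness_values):
        raise ValueError("one fitness value per individual is required")
    if any(f < 0 for f in fitness_values):
        raise ValueError("roulette wheel needs non-negative fitness values")
    if sum(fitness_values) == 0:
        # Degenerate wheel: fall back to uniform sampling.
        return [rng.choice(population) for _ in range(memory_size)]
    return rng.choices(population, weights=fitness_values, k=memory_size)

def pick_for_direct_evaluation(memory, n_direct, rng):
    """Choose the memory entries that will be evaluated with the original function."""
    return rng.sample(memory, n_direct)

rng = random.Random(0)
population = ["A", "B", "C", "D"]
fitness_values = [1.0, 2.0, 3.0, 4.0]
memory = roulette_wheel_sample(population, fitness_values, memory_size=6, rng=rng)
to_evaluate = pick_for_direct_evaluation(memory, n_direct=2, rng=rng)
print(memory, to_evaluate)
```

The two parameters `memory_size` and `n_direct` are exactly the quantities to which the method is reported to be sensitive.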
Alternatively, representative individuals can be chosen deterministically in each
generation by clustering population individuals in several groups [22]; [25]. Cluster-
ing is typically conducted in the genotype space. In this case, only those individuals
that represent the groups, that is, the cluster centers, are evaluated using the original
fitness function. The fitness of the remaining individuals is computed as a
weighted combination of the fitness values of the representative individuals.
One mechanism to implement deterministic selection of representatives, the one
suggested in this chapter, is to use fuzzy clustering techniques. Fuzzy clustering is
interesting because it accounts for the fact that grouping is imprecise and allows the
same individual to be compatible with different clusters with different degrees. The
use of the fuzzy c-means [2], a powerful and efficient supervised fuzzy clustering
method, has been addressed in [26]. Here we suggest the use of the participatory
learning fuzzy clustering algorithm [32]. Contrary to fuzzy c-means, the partici-
patory learning fuzzy clustering algorithm is unsupervised and groups individuals
adaptively through generations. The result is a new class of genetic fuzzy system in
which, contrary to the current status of genetic fuzzy systems, learning occurs con-
currently with evolution. Figure 2 illustrates how individuals of a population evolve
168 F. M. Filho et al.

[Figure 2: scatter plots of the population at the 1st, 10th and 20th generations.
Panels: clustering by PL; clustering by FCM with four clusters in each generation.
Legend: individuals, optimum, cluster centers.]

Fig. 2 Fuzzy c-means and participatory learning clustering in GFA

when using the fuzzy c-means (FCM) and the participatory learning fuzzy cluster-
ing algorithms (PL) in genetic fuzzy algorithms (GFA). The figure emphasizes the
first, tenth and twentieth generation, respectively.
In genetic algorithms, individuals tend to concentrate around the optimal
solutions as the population evolves and are likely to become genetically similar.
This fact suggests that the number of clusters should decrease over the generations.
As Figure 2 shows, the fuzzy c-means always groups individuals in the same number
of clusters because it assumes that the number of clusters is given. This generates
genetically redundant cluster centers, as we notice in the tenth and twentieth
generations. In contrast, participatory learning fuzzy clustering recognizes the
distribution of individuals over the search space and clusters individuals into a
smaller number of groups over the generations. This avoids genetically redundant
clusters and makes the genetic algorithm faster. Due to its adaptive nature, the
participatory learning fuzzy clustering algorithm performs better than the fuzzy
c-means.
2) Fitness Estimation
After their choice, the representative individuals are evaluated using the original
fitness function. The fitness of the remaining individuals is estimated using the values
Genetic Fuzzy Modeling of Supervisory Scheduling of Freight Rail Systems 169

of the representative individuals. Here we suggest two techniques. The first relies on
a normalized similarity (1) between the individual whose fitness is to be estimated
and the representative individuals. The fitness of an individual is then estimated
using (2), a weighted combination of the fitness values of the representative
individuals.

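\[
S_{kj} = \frac{d_{\max} - d_{kj}}{d_{\max}} \tag{1}
\]

\[
\hat{f}(x_k) =
\begin{cases}
\dfrac{\sum_{j=1}^{r} S_{kj}\, f(x_j)}{\sum_{j=1}^{r} S_{kj}}, & x_j \in R, \ \text{if } r > 1 \\[2ex]
S_{kj}\, f(x_j), & x_j \in R, \ \text{if } r = 1
\end{cases} \tag{2}
\]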

In (1), $d_{\max}$ denotes the maximum distance between any two individuals in
the population and $d_{kj}$ the distance between individuals $x_k$ and $x_j$. Notice
that $S_{kj} \in [0, 1]$. In (2), $\hat{f}(x_k)$ is the fitness estimate for individual
$x_k$, $r$ is the number of representative individuals, $f(x_j)$ is the fitness value
of individual $x_j$, and $R$ is the set of representative individuals.
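Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the choice of Euclidean distance in genotype space, the fallback used when all similarities vanish, and the sample data are ours and are not prescribed by the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two genotypes (an assumed choice of metric)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def estimate_fitness_by_similarity(x_k, representatives, rep_fitness, d_max):
    """Estimate the fitness of x_k from the representatives, following (1)-(2)."""
    # Equation (1): normalized similarity S_kj in [0, 1] for each representative.
    sims = [(d_max - euclidean(x_k, x_j)) / d_max for x_j in representatives]
    if len(representatives) == 1:
        # Case r = 1 of equation (2): no normalization.
        return sims[0] * rep_fitness[0]
    total = sum(sims)
    if total == 0.0:
        # Assumed fallback (not in the text): x_k is at distance d_max from
        # every representative, so use the plain average.
        return sum(rep_fitness) / len(rep_fitness)
    # Case r > 1 of equation (2): similarity-weighted average.
    return sum(s * f for s, f in zip(sims, rep_fitness)) / total


# Hypothetical data: two representatives, one individual halfway between them.
estimate = estimate_fitness_by_similarity(
    (1.0, 0.0), [(0.0, 0.0), (2.0, 0.0)], [0.9, 0.5], d_max=4.0
)
print(estimate)  # approximately 0.7
```

Because the individual is equally similar to both representatives, the estimate is simply the average of their fitness values.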
The second technique estimates fitness values considering $u_{kj}$, the membership
degree of the $k$-th individual in the $j$-th cluster, the cluster whose center is the
individual $x_j$. In (3), $f(x_j)$ is the fitness of the individual $x_j$, $\hat{f}(x_k)$
the fitness estimate of the individual $x_k$, $c$ is the number of representative
individuals, that is, the number of clusters, and $V$ a matrix whose columns are
cluster centers.
\[
\hat{f}(x_k) =
\begin{cases}
\dfrac{\sum_{j=1}^{c} u_{kj}\, f(x_j)}{\sum_{j=1}^{c} u_{kj}}, & x_j \in V, \ \text{if } c > 1 \\[2ex]
u_{kj}\, f(x_j), & x_j \in V, \ \text{if } c = 1
\end{cases} \tag{3}
\]
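The membership-based estimate (3) has the same structure as (2), with membership degrees in place of similarities. The sketch below is a minimal illustration; the function name and the sample membership values are hypothetical, and the membership degrees are assumed to be supplied by whatever clustering algorithm is in use.

```python
def estimate_fitness_by_membership(memberships, center_fitness):
    """Equation (3): memberships[j] is u_kj, center_fitness[j] is f(x_j)."""
    if len(center_fitness) == 1:
        # Case c = 1: no normalization.
        return memberships[0] * center_fitness[0]
    # Case c > 1: membership-weighted average of the cluster-center fitness values.
    total = sum(memberships)
    return sum(u * f for u, f in zip(memberships, center_fitness)) / total


# Hypothetical membership degrees of one individual in three clusters.
estimate_u = estimate_fitness_by_membership([0.7, 0.2, 0.1], [0.9, 0.5, 0.1])
print(estimate_u)  # approximately 0.74
```

In standard fuzzy c-means the membership degrees of an individual sum to one, in which case the denominator of (3) has no effect; the normalization matters when the memberships are not constrained in this way.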

Figure 3 summarizes the genetic fuzzy algorithm (GFA).


To quickly and intuitively illustrate its computational properties, the GFA of
Figure 3, using FCM and PL to choose representatives and estimate fitness, and the
classic genetic algorithm are used to find the minimum of the Schwefel function.
In this example all genetic algorithms use floating-point representation of
genotypes, arithmetic crossover, and Gaussian mutation. The selection operator is a
four-round tournament procedure. The crossover rate is kept at 0.75, the mutation rate
at 0.04, and the maximum number of generations is 1,000 for all cases. The Schwefel
function (4) is interesting because it is a nonlinear function with many local
optima scattered over the search space (Figure 4). This example challenges most
classic, unconstrained optimization algorithms. The global minimum is located at
x∗ = (−420.9687, −420.9687).

[Figure 3: flowchart with the boxes Initial population, Choice of representatives,
Evaluation of representatives, Fitness estimation of the remaining individuals, and
the test "Stopping criteria satisfied?" (yes: Return best individual; no: Selection,
Reproduction, Next generation).]

Fig. 3 Genetic fuzzy algorithm with fitness estimation

[Figure 4: surface plot of the Schwefel function; horizontal axes from −500 to 500,
vertical axis from 0 to 2000.]

Fig. 4 Schwefel function



Table 1 GFA performance for the Schwefel function

Model   Fitness      Number of direct evaluations
CGA     0.99997454   76065.8
GFP     0.99997450   1645.6
GFC     0.99997454   10010

Fig. 5 Convergence of the genetic fuzzy and classic algorithm

\[
f(x) = 418.9829\, p + \sum_{i=1}^{p} x_i \sin\!\left(\sqrt{|x_i|}\right), \qquad x \in \mathbb{R}^p \tag{4}
\]
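As a quick numerical check of (4), the sketch below evaluates the Schwefel function in Python. It assumes the usual form of the function, with the square root of |x_i| inside the sine and the plus sign used in (4); the function name and the test points are ours.

```python
import math


def schwefel(x):
    """Schwefel function as in (4), with sqrt(|x_i|) inside the sine."""
    p = len(x)
    return 418.9829 * p + sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)


x_star = (-420.9687, -420.9687)
value_at_minimum = schwefel(x_star)        # very close to zero
value_at_origin = schwefel((0.0, 0.0))     # 418.9829 * 2
value_nearby = schwefel((-400.0, -400.0))  # clearly larger than at x_star

print(value_at_minimum, value_at_origin, value_nearby)
```

With the plus sign in (4) the global minimum sits at negative coordinates, which agrees with the location x∗ = (−420.9687, −420.9687) quoted in the text.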

Table 1 summarizes the performance of the genetic fuzzy algorithm using fuzzy
c-means (GFC), participatory learning fuzzy clustering (GFP), and the classic genetic
algorithm (CGA). The values in Table 1 are the best fitness values found and the
number of direct evaluations necessary to find the optimal solution, both averaged
over 10 runs. Whenever an analytical function is available, evaluating individuals
directly and accurately is, of course, much more efficient than using fitness
estimation models. For our purposes, however, performance is assessed more
appropriately by the number of direct evaluations.
As Table 1 indicates, all genetic algorithms achieve essentially the same fitness
value, but the genetic fuzzy algorithms require far fewer direct evaluations than the
classic genetic algorithm. The participatory learning fuzzy clustering algorithm, in
particular, outperforms the fuzzy c-means. Since we are interested in complex
real-world problems in which direct evaluations can be considerably expensive,
reaching the same fitness value with fewer direct evaluations means obtaining good
quality solutions faster. Figure 5 shows the convergence of the genetic
algorithms to obtain the solutions of Table 1.

In the next section we address the train scheduling problem for freight rail
networks to illustrate the usefulness of the GFA in practical situations. Before
proceeding, we notice that most freight rail networks are divided into territories
consisting mainly of single tracks with sidings and, to a lesser degree, a mixture of
single track and double track. Here we emphasize a single territory, single track line.

3 Supervisory Train Schedule

One of the major goals of current research in scheduling concerns the trade-off
between processing time and optimality. In practice, scheduling algorithms and
procedures that provide near-optimal solutions are preferable because they offer
satisfactory and pragmatic solutions faster than exact algorithms.
In supervisory traffic control, train dispatchers control train movement, plan the
meeting and passing of trains on single-track sections, align switches to control
each train movement, gather and report information, and communicate with train
crews, station managers, and yard managers. Supervisory train scheduling is one of
the main tasks in supervisory traffic control. The aim is to find a meet and pass plan
for the rail line and the speed of each train over each track segment that minimize
an objective function. The objective function is commonly a weighted sum of the
objective functions of all trains, such as delay and operational costs. Generally,
train delay means the additional amount of time a train needs to satisfy following
and meet and pass constraints. The simplest way to determine delay is to compute the
difference between the free and actual transit times of a train journey. A
supervisory train schedule translates into a movement plan composed of the arrival
and departure times of each train at each rail line segment within the scheduling
horizon.
This section details the use of genetic fuzzy algorithms to produce train move-
ment plans for single track railroads. Preliminary developments of the genetic fuzzy
system approach for train movement planning have been discussed in ([25]) using
fuzzy c-means clustering. Here we emphasize the genetic fuzzy algorithm with fit-
ness estimation using participatory learning clustering.
The supervisory train schedule problem assumes, without loss of generality, a
rail line with trains moving east and west bound. Trains may enter sidings to allow
trains moving in opposite directions to pass or to allow other trains to overtake.
Trains should only move when there is no chance of deadlock. Deadlock is the state in
which no train is able to progress in the rail line unless one of them backtracks to
allow the other trains to move.
A. Genetic Fuzzy Algorithm
The GFA addressed in Section 2 uses a discrete event model reported in [28], [29].
Figure 6 shows the model developed, emphasizing where the genetic code is placed.
The model requires the following input data:
• Railway line topology
• Departure time of all trains
• Initial segment (track) of all trains
• Route of each train
• Dispatch policy
• Train activities to be completed during its journey
• Track maintenance schedule of the rail line

Basically, the role of the discrete event model is to simulate the train dispatch and
movement processes. Briefly, it works as follows. Events are the arrival and departure
times of each train at each rail line segment. Trains generate events as they move.
Whenever a train is to be dispatched, all potential conflicts with trains competing
for the use of a common segment must be resolved first. The purpose is to decide if
the train should proceed, or if it must stop and wait for another conflicting train to
cross or overtake. Conflict decisions are handled by the subsystem called Dispatch
Policy. After the conflict decision, the model checks if the train movement causes
deadlock. If it does, the train must be kept stopped at its current segment until a
deadlock-free movement is possible.
The Dispatch Policy decides which train should move, and different policies
mean different dispatching decisions. In other words, different policies mean
different schedules. Certain schedules are preferable to others with respect to the
objective function value. The idea of the GFA is to evolve a Dispatch Policy that
provides near-optimal solutions within short processing time bounds.
It is interesting to note in Figure 6 that to evaluate a candidate solution directly
we must simulate the movement of all trains within the scheduling horizon. This is a
very time-consuming task for complex scenarios such as large railway lines with a
small number of sidings and a large number of trains. This is the situation where a
large number of movement conflicts is likely to occur. Conventional optimization
models do provide optimal solutions, but with current computer technology they are
inapplicable in practice because processing time is prohibitive. Heuristics and local search

[Figure 6: block diagram with the blocks Genetic Code (genotype), Dispatch Policy,
Input Data, Discrete Event Simulator, Feasible Schedule (phenotype), Evaluators Set,
and Fitness.]

Fig. 6 Supervisory train scheduling model using genetic fuzzy algorithms



Chromosome

Train 1           segment 1   segment 2   segment 3   segment 4
priority          2           4           8           1
velocity (km/h)   60          73          40          55

Train n           segment 1   segment 2   segment 3   segment 4
priority          7           2           10          9
velocity (km/h)   57          80          45          50

Fig. 7 Chromosome representation

methods can provide feasible solutions fast, but solution quality may be poor. The GFA
approach provides an attractive trade-off between solution quality and processing
time requirements.
1) Representation
The representation of individuals is through a chromosome consisting of 2n vectors,
where n is the number of trains in the rail line. The length of each vector is the
number of segments in the train route. As Figure 7 shows, two vectors characterize
each train. Each component of the first vector, called priority vector, defines the
priority of the train to occupy the segment in the corresponding position in its route.
In the second vector, called speed vector, each component gives the train speed
when moving in the segment in the corresponding position of its route. Therefore,
whenever movement conflicts happen, the Dispatching Policy must decide which
train will occupy the segment first: the train with the highest priority among the
competing trains is the one chosen to proceed.
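The representation and the priority rule above can be sketched as follows. This is a minimal illustration using the values of Figure 7; the class and function names are hypothetical, and we assume that a larger number means a higher priority, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrainGenes:
    """The two vectors that characterize one train in the chromosome."""
    priority: List[int]      # priority to occupy each segment of the route
    speed_kmh: List[float]   # speed on each segment of the route


# A chromosome holds 2n vectors: one priority and one speed vector per train.
chromosome: Dict[str, TrainGenes] = {
    "train_1": TrainGenes([2, 4, 8, 1], [60, 73, 40, 55]),
    "train_n": TrainGenes([7, 2, 10, 9], [57, 80, 45, 50]),
}


def resolve_conflict(chromosome: Dict[str, TrainGenes],
                     competing: Dict[str, int]) -> str:
    """Return the train allowed to occupy the contested segment first.

    competing maps each train name to the position of the contested segment
    in that train's own route. Assumption: larger value = higher priority.
    Ties go to the first train listed.
    """
    return max(competing, key=lambda t: chromosome[t].priority[competing[t]])


winner = resolve_conflict(chromosome, {"train_1": 1, "train_n": 1})
print(winner)  # train_1 (priority 4 against priority 2)
```

Keeping the contested position per train reflects the fact that the same physical segment may appear at different positions in the routes of different trains.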
2) Fitness Function
For simplicity, in what follows we assume that the aim is to minimize the total delay
in the schedule, as shown in (5). The fitness function used by the GFA is given in (6).

\[
F(S_i) = \sum_{j=1}^{n} \sum_{k=1}^{m} \mathrm{delay}(j, k) \tag{5}
\]

\[
\mathit{fitness}_i = \frac{1}{1 + F(S_i)} \tag{6}
\]

In (5), $S_i$ is the schedule of the $i$-th individual, a chromosome of the form
shown in Figure 7, $n$ is the number of trains, and $m$ is the number of track segments.

Individual i, train n                      Individual i, train n
           seg1  seg2  seg3  seg4                     seg1  seg2  seg3  seg4
velocity   50    34    70    60            velocity   33    46    70    60
priority   2     5     8     6             priority   1     8     8     6

Individual j, train n                      Individual j, train n
           seg1  seg2  seg3  seg4                     seg1  seg2  seg3  seg4
velocity   53    46    65    90            velocity   50    34    65    90
priority   1     8     4     10            priority   2     5     4     10

Fig. 8 One-point crossover

Therefore, $\mathrm{delay}(j, k)$ refers to the delay of train $j$ at segment $k$. In (6),
$\mathit{fitness}_i$ refers to the fitness of the $i$-th individual. Clearly, the longer
the total delay, the worse the schedule.
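Equations (5) and (6) translate directly into code. In the sketch below the delay matrix is hypothetical sample data; in the actual system the delays would come from the discrete event simulation of the schedule.

```python
def total_delay(delays):
    """Equation (5): delays[j][k] is the delay of train j at segment k."""
    return sum(sum(row) for row in delays)


def fitness(delays):
    """Equation (6): fitness decreases as the total delay grows."""
    return 1.0 / (1.0 + total_delay(delays))


# Hypothetical delays for n = 2 trains over m = 3 segments.
sample_delays = [[0, 5, 10],
                 [3, 0, 2]]

print(total_delay(sample_delays))   # 20
print(fitness(sample_delays))       # 1/21
print(fitness([[0, 0], [0, 0]]))    # 1.0 for a schedule without delay
```

The transformation in (6) maps a minimization problem onto a fitness in (0, 1], with the value 1 reached only by a delay-free schedule.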
3) Selection
Selection is performed using the tournament ([12]), a procedure that selects individ-
uals from small subsets of the population based on a fitness rank mechanism.
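A tournament step of this kind can be sketched as below. The function name, the tournament size `k`, and the sample population are assumptions made for illustration; the sketch compares the contestants by fitness value directly.

```python
import random


def tournament_select(population, fitness_values, k, rng):
    """Pick k distinct individuals at random and return the fittest of them."""
    contestants = rng.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitness_values[i])
    return population[winner]


rng = random.Random(0)  # seeded only to make the sketch reproducible
population = ["a", "b", "c", "d"]
fitness_values = [0.1, 0.9, 0.5, 0.3]

# When the tournament includes the whole population, the best individual wins.
best = tournament_select(population, fitness_values, k=4, rng=rng)
print(best)  # b
```

Smaller values of `k` lower the selection pressure, since weaker individuals then have a better chance of never meeting the best one.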
4) Reproduction
Since the supervisory train scheduling model of Figure 6 does not generate
infeasible schedules during generations (because the discrete event simulation model
allows only safe and feasible train movements), arithmetic crossover and Gaussian
mutation are adopted. Figure 8 shows an example of one-point crossover operation
between individuals i and j and train n.
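The two crossover operators mentioned above can be sketched as follows. The one-point example uses the priority vectors of Figure 8 with the cut after the second segment; the arithmetic crossover uses the standard convex-combination form, and the mixing coefficient `alpha` is a hypothetical parameter, not a value given in the text.

```python
def one_point_crossover(parent_a, parent_b, point):
    """Swap the genes located before the cut point between two parents."""
    child_a = parent_b[:point] + parent_a[point:]
    child_b = parent_a[:point] + parent_b[point:]
    return child_a, child_b


def arithmetic_crossover(parent_a, parent_b, alpha):
    """Standard arithmetic crossover: convex combinations of the parents."""
    child_a = [alpha * a + (1.0 - alpha) * b for a, b in zip(parent_a, parent_b)]
    child_b = [(1.0 - alpha) * a + alpha * b for a, b in zip(parent_a, parent_b)]
    return child_a, child_b


# Priority vectors of train n for individuals i and j (Figure 8).
priority_i = [2, 5, 8, 6]
priority_j = [1, 8, 4, 10]
child_i, child_j = one_point_crossover(priority_i, priority_j, point=2)
print(child_i, child_j)  # [1, 8, 8, 6] [2, 5, 4, 10]

# Arithmetic crossover applied to the speed vectors, with alpha = 0.5.
speed_a, speed_b = arithmetic_crossover([50, 34, 70, 60], [53, 46, 65, 90], 0.5)
print(speed_a)  # [51.5, 40.0, 67.5, 75.0]
```

Arithmetic crossover suits the real-valued speed vectors, whereas exchanging whole genes keeps the priority values integer.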
B. Experimental Results
To verify the performance of GFA, a number of experiments were conducted starting
with small size instances to compare the GFA solutions with the optimal solutions
produced by an exact algorithm and with those of a classic genetic algorithm. All
examples were run on a 2 GHz Pentium IV computer with 512 MB of RAM. In all cases,
the crossover rate was 0.75, the mutation rate 0.04, and the maximum number of
generations was 1,000. For small instances, the optimal solution was computed using
the branch and bound algorithm and the optimization model reported in [37].
1) Varying the number of trains
Tables 2 and 3 show the behavior of the GFA and the CGA as the number of trains
increases in a rail line with 11 segments and 4 sidings. Trains are conveniently
inserted in opposing directions with departure times close enough to force as many
conflicts as possible. Entries of Table 2 are the average minimum total delay, over
5 runs, of the schedule for each case. The lower these values are, the better the
schedule. Table 3 presents the processing times, over 5 runs, of each model needed
to achieve the corresponding total delays of Table 2.

Table 2 Minimum total delay in minutes

Model     5 trains   6 trains   7 trains   8 trains   9 trains
Optimal   330        709        1050       -          -
CGA       338        726        1139       1689       2487
GFP       338        726        1131       1749       2522
GFC       338        744        1141       1765       2615

Table 3 Processing time in minutes

Model     5 trains   6 trains   7 trains   8 trains   9 trains
Optimal   0.4        6.51       2433       -          -
CGA       1.81       2.46       2.08       7.71       7.51
GFP       1          1.45       0.30       0.59       1.39
GFC       1.93       1.58       1          0.66       1.48

Table 2 shows that as the number of trains increases, the optimality gap between
the genetic algorithms and the exact optimal solution increases as well because
scenarios become more complex. Gaps reach 2.42% for the scenario with 5 trains,
2.39% for 6 trains and 6.76% for 7 trains. If the same comparison is made between
the GFA and CGA, the gap is much lower. For 7 trains GFP achieves a better solution
than CGA. For 9 trains the gap between GFP and CGA is 1.4%.
Table 3 indicates that, as the number of trains increases, the exact optimum becomes
difficult to obtain using the classic optimization modeling approach. In general, GFAs
run faster than CGA. For 8 and 9 trains, GFP was considerably faster than CGA while
achieving 96.45% and 98.6% of the CGA fitness function values, respectively.
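The percentages quoted for 8 and 9 trains can be reproduced from the total delays of Table 2. The text does not state the formula used, so the sketch below shows one reading that matches the reported figures: one minus the relative excess delay of GFP over CGA. The function name is ours.

```python
# Total delays in minutes, taken from Table 2.
cga_delay = {8: 1689, 9: 2487}
gfp_delay = {8: 1749, 9: 2522}


def relative_quality(gfp, cga):
    """Percentage of the CGA solution quality reached by GFP (assumed formula)."""
    return 100.0 * (1.0 - (gfp - cga) / cga)


quality = {n: round(relative_quality(gfp_delay[n], cga_delay[n]), 2)
           for n in cga_delay}
print(quality)  # {8: 96.45, 9: 98.59}
```

The second value rounds to the 98.6% given in the text, and its complement matches the 1.4% gap between GFP and CGA reported for 9 trains.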
2) Varying the number of sidings
Tables 4 and 5 summarize the behavior of the GFAs and the CGA as the number
of sidings increases, while keeping 5 trains moving in the rail line. Entries of Table 4
are the average minimum total delay over 5 runs. Table 5 shows the corresponding
average processing times over 5 runs.
As Table 4 indicates, except for 6 sidings, the GFP achieved the optimal solution for
all test instances. For 6 sidings, the optimality gap is 2.42%. Notice that there is
no gap between the GFP and CGA solutions.
From Table 5 we conclude that the computational effort to find exact optimal
solutions increases quickly as the number of sidings increases. In general, the GFAs
run faster than CGA and achieve the optimum solution for most instances.
3) Real-world scenario
In this section we consider a rail line composed of 43 segments: 22 sidings and 21
single track segments. We assume 27 trains to be scheduled within 24

Table 4 Minimum total delay in minutes

Model     6 sidings   7 sidings   8 sidings   9 sidings
Optimal   330         285         355         251
CGA       338         285         355         251
GFP       338         285         355         251
GFC       338         319         355         289

Table 5 Processing times in minutes

Model     6 sidings   7 sidings   8 sidings   9 sidings
Optimal   0.4         0.5         2.03        7.88
CGA       1.81        5.63        3.10        3.20
GFP       1           0.53        1.05        0.60
GFC       1.93        0.61        1.13        0.71

Table 6 Performance in real world scenario

Model   Fitness    Processing time (minutes)
CGA     0.939946   72.64
GFP     0.928611   3.61
GFC     0.926666   32.3

hours. This corresponds to a territory of a major railroad network of the state
of São Paulo, Brazil.
Table 6 summarizes the performance of the GFAs and the CGA. As in previous
sections, entries of Table 6 are the averages over 5 runs of the values
computed using (5) and (6) and the corresponding processing times. The exact optimal
solution for this instance is not available, since the problem size is above the one
for which reasonable processing times could be expected.
Table 6 shows that, from the point of view of solution quality, the GFAs perform
nearly as well as CGA. The fitness values of the GFAs are very close to the one
achieved by CGA. However, the GFAs run considerably faster than CGA. It is worth
noting that, in particular, GFP is able to provide near-optimal solutions within a
period of time fully consistent with the requirements for train movement plans at the
supervisory control level. Figure 9 shows the schedule using a train graph.

Fig. 9 Example of a supervisory schedule

4 Conclusion

Although rail is an old technology, current rail systems are complex and require
advanced techniques to be operated. This chapter has addressed the development
of supervisory train schedules for railroad network systems using genetic fuzzy
algorithms. Supervisory train scheduling is a major issue in the railroad industry
since it provides a key to improving operational and economic performance. Supervisory
scheduling provides references on how to best manage and control train movements
in a rail network. The genetic fuzzy algorithm uses fitness estimation procedures
as a mechanism to reduce genetic algorithm complexity when handling heavily
constrained optimization problems whose fitness and performance evaluations are
computationally expensive. This is the case of train scheduling and movement
planning problems. The genetic fuzzy algorithm approach suggested in this chapter
significantly reduces the number of direct fitness evaluations and decreases processing
times without significantly affecting solution quality. A fitness estimation model that
uses the participatory learning clustering technique was emphasized and shown to
perform best in all experiments conducted. The genetic fuzzy algorithm with
participatory learning clustering achieves high fitness values with a reduced number of
direct evaluations.
Despite promising performance, genetic fuzzy algorithms still need considerable
effort for further improvement. For instance, new fitness estimation models based on
statistical and neural network models could be useful. The use of fuzzy rule-based
systems to control key genetic algorithm parameters such as crossover and mutation
rates and population size, and the use of rule-based genetic fuzzy systems, could be
an alternative to learn supervisory scheduling rules. More detailed investigation must
be done to verify how clustering performs when tackling decision problems with
discrete search spaces. Comparisons of the genetic fuzzy algorithms with
alternative scheduling approaches still need to be completed. Hopefully these issues
will be addressed in the near future.

Acknowledgments The first author acknowledges CAPES, the Brazilian Ministry of Education,
for a fellowship. The second author thanks FAPESP, the Research Foundation of the State of São
Paulo, for its support. Currently he is with Cflex Computacao Flexivel Ltda, Campinas, São Paulo,
Brazil. The third author is grateful to CNPq, the Brazilian National Research Council, for grant
304299/2003-0.

References

1. I. Amit and D. Goldfarb. The timetable problem for railways. Developments in Operations
Research 2, pp. 379–387, 1971
2. J. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press,
1981
3. J. Blum and A. Eskandarian. Enhancing intelligent agent collaboration for flow optimization
of railroad traffic. Transportation Research A, pp. 919–930, 2002
4. M. Carey and D. Lockwood. A model, algorithms and strategy for train pathing. Journal of
the Operational Research Society, 46, pp. 988–1005, 1995
5. T. Chiang and H. Hau. Railway scheduling system using repair-based approach. Proc. of the
IEEE Conf. on Decision and Control, pp. 71–78, 1995
6. O. Cordón, F. Herrera, F. Hoffmann and L. Magdalena. Genetic fuzzy systems: Evolutionary
tuning and learning of fuzzy knowledge bases. World Scientific, 2001
7. O. Cordón, F. Gomide, F. Herrera, F. Hoffmann and L. Magdalena. Ten years of genetic fuzzy
systems: Current framework and new trends. Fuzzy Sets and Systems, 141, pp. 5–31, 2004
8. M. Dorfman and J. Medanic. Scheduling trains on a railway network using a discrete event
model of railway traffic. Transportation Research B, 38, pp. 81–98, 2005
9. B. Dunham, D. Fridshal, R. Fridshal and J. North. Design by natural selection. Synthese, 15,
pp. 254–259, 1963
10. D. Eby, R. Averill, W. Punch and E. Goodman. Evaluation of injection island ga performance
on flywheel design optimization. Proc. 3rd Conf. on Adaptive Computing in Design and Man-
ufacturing, 1998
11. K. Ghoseiri, F. Szidarovszky and M.J. Asgharpour. A multi-objective train scheduling model
and solution. Transportation Research B, 38, pp. 927–952, 2004
12. D. Goldberg. The Design of Competent Genetic Algorithms: Steps Toward a Computational
Theory of Innovation. Kluwer Academic, 2002
13. Y. Hanaki, T. Hashiyama and S. Okuma. Accelerated evolutionary computation using fitness
estimation. Proc. of Int. Conf. on Systems, Man and Cybernetics, pp.643–648, 1999
14. A. Higgins, E. Kozan and L. Ferreira. Heuristic techniques for single line train scheduling.
Journal of Heuristics, 3(1), pp. 43–62, 1997
15. T. Ho and T. Yeung. Railway junction conflict resolution by genetic algorithm. Electronics
Letters, 36(8), pp. 771–772, 2000
16. T. Ho and T. Yeung. Railway junction traffic control by heuristic methods. IEE Proc. Electronic
Power Applications, 148(1), pp. 77–84, 2001
17. J. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975
18. P. Ireland, R. Case, J. Fallis and C. Dyke. The Canadian pacific railway transforms operations
using models to develop its operating plans. Interfaces, 34(1), pp. 5–14, 2004

19. R. Iyer and S. Ghosh. DARYN — a distributed decision-making algorithm for railway net-
works: Modeling and simulation. IEEE Trans. on Vehicular Technology, 44(1), pp. 180–191,
1995
20. Y. Jin. A comprehensive survey of fitness approximation in evolutionary computation. Soft
Computing, 9(1), pp. 3–12, 2005
21. D. Jovanovic and P. Harker. Tactical scheduling of rail operations: the SCAN I system.
Transportation Science, 25, pp. 46–64, 1991
22. H. Kim and S. Cho. An efficient genetic algorithm with less evaluation by clustering. Proc. of
IEEE Congress on Evolutionary Computation Conference, pp. 786–792, 2000
23. D. Kraay and P. Harker. Real-time scheduling of freight railroads. Transportation Research B,
29, pp. 213–229, 1995
24. Q. Lu, M. Dessouky and R. Leachman. Modeling train movements through complex rail net-
works. ACM Transactions on Modeling and Computer Simulation, 14(1), pp. 48–75, 2004
25. F. Mota Filho, F. Gomide and R. Goncalves. Genetic algorithms, fuzzy clustering and discrete
event systems: An application in scheduling. Proc. First Workshop on Genetic Fuzzy Systems,
Granada, Spain, pp. 83–88, 2005
26. F. Mota Filho. Estimation fitness methods for genetic algorithms and applications. Master’s
thesis, State University of Campinas, Faculty of Electrical and Computer Engineering, São
Paulo, Brazil, 2005
27. Railway Age Magazine, 2003
28. M. Rondón and F. Gomide. Railway simulation and optimization system. World Automation
Congress, pp. 1–6, 2000
29. M. Rondón and F. Gomide. Line block analysis in railway dispatch and simulation systems.
Proc. 9th IFAC Symposium on Control in Transportation Systems, pp. 405–409, 2000
30. M. Salami and T. Hendtlass. A fitness estimation strategy for genetic algorithms. Proc. 15th
Int. Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert
Systems, Vol. 2358, pp. 319–326, 2002
31. V. Salim and X. Cai. Scheduling cargo trains using genetic algorithms. IEEE Int. Conf. on
Evolutionary Computation, Vol. 1, 1995
32. L. Silva, F. Gomide and R. Yager. Participatory learning in fuzzy clustering. 14th IEEE Annual
Int. Conf. on Fuzzy Systems, pp. 857–861, 2005
33. B. Szpigel. Optimal train scheduling on single track railway. In M. Ross (Ed) OR’72, North-
Holland, pp. 343–351, 1973
34. H. Takagi. Interactive evolutionary computation. Proc. 5th Int. Conf. on Soft Computing and
Information/Intelligent Systems, pp. 41–50, 1998
35. A. Tazoniero, R. Gonçalves and F. Gomide. Fuzzy algorithm for real-time train dispatch and
control. Proc. of North American Fuzzy Information Processing Society, pp. 1–5, 2005
36. V. Toropov and L. Alvarez. Application of genetic programming to the choice of a structure of
global approximations. Proc. 3rd Annual Conf. on Genetic Programming, 1998
37. A. Valle, F. Gomide and R. Gonçalves. Fuzzy optimization model for train dispatch systems.
11th Int. Fuzzy Systems Association World Congress, Vol. 3, pp. 1788–1793, 2005
Multiobjective Evolutionary Search
of Difference Equations-based Models
for Understanding Chaotic Systems

Luciano Sánchez and José R. Villar

Abstract In control engineering, it is well known that many physical processes
exhibit a chaotic component. It is also commonly assumed that conventional
modeling procedures disregard this component, treating it as stochastic noise,
whereas nonlinear universal approximators (like neural networks, fuzzy rule-based or
genetic programming-based models) can capture the chaotic nature of the process.
In this chapter we will show that this is not always true. Despite the nonlinear
capabilities of the universal approximators, these methods optimize the one-step
prediction of the model. This is not the most adequate objective function for a chaotic
model, because there may exist many different nonchaotic processes that have near-zero
prediction error for such a horizon. The learning process will surely converge
to one of them. Unless we include in the objective function some terms that depend
on the properties of the reconstructed attractor, we may end up with a nonchaotic
model. Therefore, we propose to follow a multiobjective approach to model chaotic
processes, and we also detail how to apply either genetic algorithms or simulated
annealing to obtain a difference equations-based model.

Keywords: Nonlinear approximation; Chaotic signals; Genetic algorithms;
Simulated annealing

1 Introduction

When modeling complex processes, there is always a balance between the trans-
parency of the model and its accuracy. Chaotic signals are not an exception to this
rule: we expect a technique that produces a black box from data [12,20,25,32,34,41]

Luciano Sánchez and José R. Villar


Computer Science Department, Universidad de Oviedo, Edificio Departamental 1, Campus de
Viesques, 33213 Gijon (Spain), Tel.: +34 985182597; fax: +34 985 181 986., e-mail: villar-
[email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 181–201.
© 2008 Springer.
182 L. Sánchez and J.R. Villar

to produce more accurate results than other procedures that also gain insight into
the block structure of the system. The best representative of this last kind of model
(that we will name white boxes, understandable, or transparent models) is arguably a
set of difference equations. A difference equations-based model allows the user not
only to predict the output of the process, but to know the dynamics of the model and
ultimately to design a control system for it. Nevertheless, obtaining an appropriate
set of equations from data is a problem that cannot be regarded as solved.
Many of the most recent approaches to obtain understandable descriptions of
chaotic systems are based on evolutionary techniques. In particular, the use of tree-
based codifications allows us to define a simultaneous search in both the different
families of models, and the parameters that define a model within one of these fami-
lies. Since we want to discover the structure of the set of equations (i.e., a consistent
subset of state variables and the dependences between them) and also the numerical
values of the coefficients in these equations, it is convenient for us to combine an
evolutionary search with a tree-based representation of the model, as it was done,
among others, in [3, 4, 11, 16, 40, 49].
Some of the latest algorithms are able to obtain difference equations, but there is
work yet to be done. Many evolutionary modeling methods minimize the discrepancies
between the data and the one-step prediction of the model, and do not take into
account the dynamic behavior of the model [41, 47]. As we will show later in this
paper, if we search for a model on the basis of the lowest one-step prediction error,
we have a high chance of finding a nonchaotic model. In that case, the obtained
equations would be meaningless. The use of longer prediction horizons is not always
feasible, though. Because the systems are chaotic, large deviations can appear
between the recursive evaluation of the model and the training data.
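The contrast between the one-step error and the error of the recursive evaluation can be illustrated with a small sketch. It assumes the Logistic map as the data source and, purely for illustration, a hypothetical "learned" model whose only defect is a slightly wrong coefficient (3.9999 instead of 4); neither choice comes from the experiments reported in this paper.

```python
def logistic(x, r=4.0):
    """Logistic map; r = 4 is the chaotic regime used as the data source."""
    return r * x * (1.0 - x)

def one_step_and_recursive_error(series, model):
    """Mean squared error of the one-step and of the recursive prediction."""
    n = len(series) - 1
    # One-step prediction: the model is always fed the true previous sample.
    one_step = sum((model(x) - y) ** 2 for x, y in zip(series, series[1:])) / n
    # Recursive prediction: the model is fed its own previous output.
    prediction, recursive = series[0], 0.0
    for y in series[1:]:
        prediction = model(prediction)
        recursive += (prediction - y) ** 2
    return one_step, recursive / n

# Training data: 1,000 samples of the Logistic map (initial state is an assumption).
series = [0.3]
for _ in range(999):
    series.append(logistic(series[-1]))

# Hypothetical candidate model with a tiny coefficient error.
def candidate(x):
    return logistic(x, r=3.9999)

one_step, recursive = one_step_and_recursive_error(series, candidate)
print(one_step, recursive)
```

Under these assumptions the one-step error is negligible, while the recursive error is many orders of magnitude larger, because the two trajectories separate after a few tens of iterations.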
In the following sections we will solve this problem by enforcing an additional
constraint: the value of the largest Lyapunov exponent of our model has to match
the value estimated from our training data. The largest Lyapunov exponent is a
measure of the amount of chaos in the signal [24, 25, 48], and the difference between
the largest Lyapunov exponents of two models also gives us a measure of the
similarity between the complexities of their dynamics [17, 47]. Accordingly, we propose
to extend the aforesaid balance between transparency and accuracy to a new triplet:
transparency/accuracy/dynamics. We define a multiobjective problem, designed to
minimize the squared error and the complexity of the model, while restricting the
search to those models whose largest Lyapunov exponents are similar to the value
estimated from the time series we want to analyze. Since the evaluation of the
Lyapunov exponents is computationally expensive, we also propose to use our own
custom evolutionary algorithm, which combines a tree-based codification with a
population-based, multiobjective extension of simulated annealing. The algorithm that
we propose in this paper is able to find a set of difference equations that reproduces
the dynamics of a given chaotic time series, and improves on the results of modern
multiobjective evolutionary algorithms like NSGA-II [9, 10] when the number of
evaluations of the objective function is limited.
The organization of this paper is as follows: in Section 2, we make a brief bibli-
ographic analysis of transparent models of chaotic systems, detailing the unsolved
Multiobjective Evolutionary Search of Difference Equations-based Models 183

problems. In Sections 3 and 4, we describe our own proposal. Experiments and results
are shown in Section 5, and the paper finishes with the concluding remarks and the
future work.

2 Evolutionary Transparent Modeling of Chaotic Systems

Genetic algorithms, genetic programming and evolutionary programming techniques
have been applied to identify and control nonlinear and chaotic systems.
The reader can refer to [30, 43], where genetic algorithms are compared against
different identification techniques, or review the results in [5, 23, 38, 44]. The control
problem is less studied, but there are also works like [38], where a genetic algorithm
was used to find the optimal sequence of control signals in a chaotic cutting
process.
There are different views of the concept of “transparent model” for chaos. For
instance, linguistic fuzzy rules were combined with genetic algorithms in [5] and
in [23], and applied to analyze chaotic time series. Wavelet coefficients are also
considered to provide a certain degree of interpretability; they were used in [44],
where genetic algorithms were applied to select wavelet threshold parameters in
an exchange-rate forecasting problem. Other approaches to nonlinear modeling
use polynomial models, as can be seen in [12, 39]. Many other problem-specific
analytical modeling approaches have been developed. For example, in [1],
evolutionary algorithms were used to propose nonlinear models for a satellite-based
ocean forecasting system. In [11], evolutionary computing was used to extract
mathematical models, and this proposal was analyzed with three different
applications. Lastly, in [16] genetic programming was used to find difference-equation
models of nonlinear processes, as we propose in this paper.
In all of the preceding methods, the fitness of an individual is based only on
instantaneous error measures, so not all the available information about the dynamics
of the process is used. As we mentioned in the introduction, this means that the
learning algorithm is very likely to converge to a nonchaotic model. In this case, the
usefulness of a transparent model is limited. A different approach was presented
in [14], where evolutionary computing was used to obtain models for chaotic
time series, using the error of the recurrent outcome of the model, which is a
measure of its dynamical behavior. The recurrent outcomes of a chaotic model differ
greatly under small differences in the initial state, so this measure of
error has to be taken with care, but our own approach shares properties with this
method. In the following section, we will propose to evaluate the dynamical
properties of a candidate model by means of its recursive evaluation, not through
the error in the trajectory, but by estimating the largest Lyapunov exponent of the time
series formed by this recursive prediction and including it in a multiobjective fitness
function.
Multiobjective techniques have been previously used to develop models for
nonlinear and chaotic systems. In some of our own previous works [13], we proposed
to use a linear combination of the quadratic error and the largest Lyapunov
exponent as the fitness function, and optimized it by means of a genetic
algorithm. In [12, 39] a Pareto-based approach is used instead of scalar functions,
in combination with the MOGA algorithm described in [15]. There are other
Pareto-based multiobjective strategies that could also be applied to the same
problem, as can be seen in [6]. Later in this paper, we will evaluate a more recent
approach, the NSGA-II algorithm [9, 10].
Given the computational cost of evaluating the Lyapunov exponents of a model,
and the potentially large size of some individuals, we are mostly interested in
algorithms that need a low number of iterations and small population sizes. It is widely
accepted that genetic algorithms are the best choice in this respect. As a matter of
fact, these algorithms have become a standard in all kinds of multiobjective
problems [50]. However, in our opinion, the experiments that support this assertion
were intended to solve problems based on a linear genotype, and their conclusions
cannot be immediately extrapolated to tree-based representations. In previous
works [42], we combined a simulated annealing (SA) global search with a
grammar-tree-based codification, in the context of learning fuzzy rules. A
strategy as simple as keeping only one individual and repeatedly mutating it,
accepting or discarding the result according to a probability that decreases with time
and distance, was able to improve on the results of the GA. With this result in mind, in
this paper we will extend our own algorithm to multiobjective problems, and
propose a new population-based, multiobjective SA search (MOSA) able to elicit a set
of nondominated solutions. In the following sections we will show that the genetic
search (the NSGA-II algorithm), while equally efficient in the long term, can be
improved on in this specific problem by a simulated annealing-based search in both
accuracy and memory usage.
Interestingly, to our knowledge a pure Pareto-based MOSA has not been previously
defined. The most recent approaches weight the different
criteria into a scalar function [19, 31, 45]. Alternatively, in [8] it was proposed to use
dominance to decide the evolution of the simulated annealing. That approach
was also used in [18], where fuzzy numbers and uncertainty in dominance are used
to decide whether an individual is better than another. Similarly, in [35, 36],
Pareto dominance is studied to decide how the multiobjective simulated annealing
evolves. But, in all of these cases, an aggregated function of the objectives is still
used to evaluate each individual. A different approach to Pareto-based MOSA,
nearer to ours, is presented in [2]. In that work, a Pareto-based
evolutionary algorithm is compared with a population-based simulated annealing
with dominance control. In each simulated annealing iteration, a new
individual is obtained by means of a heuristic, and it is included in the population
if neither it nor the current individual dominates the other. If the new
one dominates the current one, then it becomes the current one. In the opposite case,
it is accepted with a temperature-dependent probability. Observe that, even in
this last case, it is required that an individual either dominates or is dominated by
another. This is done, again, by weighting the different objectives into a scalar
function, and therefore the algorithm does not sample the Pareto front homogeneously.
In the next sections we will propose a different algorithm that does not have this
problem.

3 Operators Used in the Evolutionary Searches

The experimental analysis that we will show later compares the NSGA-II and the
MOSA algorithms, both sharing the same representation and operators. Our SA
search will be based on the mutation operator, which is in turn based on the genetic
crossover [42].
In this section we will state, for both search schemes, the representation of an
individual, its validation procedure, how to generate an individual at random, how
to evaluate it, and the crossover and mutation operators. In the next section we will
describe the pseudocode of the algorithms.

3.1 Representation of an Individual

We will build the input data from a time series, given an embedding dimension
n; thus the training set contains the sampled values of n system state variables
$x_k^1, \ldots, x_k^n$ at times $k = 1, 2, \ldots$. We wish to obtain a set of $m \le n$
difference-equation-based models, with the following structure:

$$x_{k+1}^i = f_i(x_k^1, \ldots, x_k^n), \quad i \in \{1, \ldots, n\}. \qquad (1)$$

One of these state variables will be identified as the output of the system. It is
assumed that $x_{k+1}^i = x_k^i$ for all those variables without an equation assigned.
The phenotype of an individual is, therefore, a list of m valid equations. We will
define the concept “valid equation” by means of the grammar shown in Figure 1.

S → Structure Parameters
Structure → ArithOp ∨ NonLinearOp ∨ DelayOp
Parameters → Variable ∨ Constant
Variable → System signal
Constant → ℜ
ArithOp → (+ Exp, Exp) ∨ (− Exp, Exp) ∨ (* Exp, Exp)
NonLinearOp → (G [LC, UC] → OC, Exp) ∨ (Dz [LC, UC] → OC, Exp)
DelayOp → (Ret delay Variable)
LC → Constant
UC → Constant
OC → Constant

Fig. 1 Grammar defining a valid equation. “G” means “gain”, “Dz” means “dead zone”. Some
restrictions on the values of the constants are also enforced: LC < UC, and all constants
are bounded

createModel
needs: list of system signals, id. of the output signal, experiment parameters
produces: a random set of signals, including the one designated as system output,
          and a randomly generated equation for every one of them

for each signal s in the list of system signals
    if (s = output-signal) or (random() < threshold) then
        signals.push(s)
for each s in signals
    equations.push(
        createRandomEquation(signals, experiment parameters)
    )
return { signals, equations }

Fig. 2 Simplified pseudocode of the random generation of a model using the PTC2 algorithm. The
function createRandomEquation takes into account constraints like the maximum height of a tree,
the probabilities of each type of node and the grammar shown in Figure 1

The genotype will be the syntactic tree of a valid string in this grammar. Each node
of this tree will encode the name of the production rule that originated its subtree.
This information will be used later to define a typed crossover.
It can be observed that each equation comprises two parts, associated with the
productions “Structure” and “Parameters”. The first production defines which
operations are valid to define the functions $f_i$, and the second one is a list of
numerical parameters on which these functions depend. Following [26], the
nonlinear elements in the definition of $f_i$ are selected from the usual catalog of
building blocks in control engineering. We have restricted ourselves to the blocks
“gain with saturation” and “dead zone”.
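The model structure of Eq. (1), including the convention that a variable without an assigned equation keeps its value, can be sketched as follows. The dictionary-based representation, the function names and the extra variable `u` are illustrative assumptions; the equations are those of the Henon map used later in the experiments.

```python
def step(state, equations):
    """One step of Eq. (1): variables with no equation keep their value."""
    return {name: equations[name](state) if name in equations else value
            for name, value in state.items()}

def simulate(state, equations, n_steps):
    """Recursive evaluation of the model from a given initial state."""
    trajectory = [dict(state)]
    for _ in range(n_steps):
        state = step(state, equations)
        trajectory.append(state)
    return trajectory

# Two equations (m = 2) over three state variables (n = 3).
# "u" is a hypothetical variable with no equation assigned.
equations = {
    "x": lambda s: 0.3 * s["y"] + 1.0 - 1.4 * s["x"] ** 2,
    "y": lambda s: s["x"],
}
trajectory = simulate({"x": 0.0, "y": 0.0, "u": 1.0}, equations, 3)
for point in trajectory:
    print(point)
```

Every right-hand side is evaluated on the state at time k before any variable is updated, which is what the simultaneous form of Eq. (1) requires.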

3.2 Random Generation of Genotypes

The PTC2 algorithm (see Figure 2) is used to generate random trees [27, 28]. This
algorithm allows us to specify the maximum number of nodes, the maximum height,
the types of nodes, the probability distribution of the tree heights and the
probability distribution of each type of node, conditioned on our grammar.

3.3 Genetic Crossover and Mutation

Our crossover operator has two different forms, which we will refer to as
parametric and structural. The parametric crossover takes place between the parts of
the individuals that derive from the production rule “Parameters”, and the structural
crossover involves the parts originating in the production “Structure”. Apart from
the differences in the grammar, the same operators proposed in [42] were used:

• To perform the parametric crossover we select one of the nodes derived from the
production Constant in each one of the trees, and modify both values with an
extended intermediate crossover [33].
• To carry out the structural crossover of two individuals, a random node of the
first parent is selected. The subtree rooted in this node is to be interchanged
with another one in the second parent. A list of valid nodes of this last parent is
produced. That list of valid nodes has to take into account not only the syntactic
restrictions of the grammar, but also semantic constraints: the height of
the offspring must not be higher than the limit, and the individuals must not have
more than one equation for each one of the state variables. If the list is empty,
the procedure is repeated with a different node in the first parent. Once we have
a nonempty list, one of its elements is randomly chosen and interchanged with
the former one.
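The parametric crossover of the first item can be sketched as below, assuming the usual form of the extended intermediate crossover, in which each offspring value is p1 + α(p2 − p1) with α drawn uniformly from [−d, 1 + d]. The value d = 0.25, the clipping to the bounds of the constants and the function name are assumptions made for illustration.

```python
import random

def extended_intermediate_crossover(a, b, d=0.25, low=-5.0, high=5.0, rng=random):
    """Return two new constants obtained from the parent constants a and b.

    Each child lies on the line through a and b, possibly slightly outside
    the segment [a, b] (by a fraction d of its length), and is clipped to
    the admissible range [low, high] of the constants.
    """
    alpha_a = rng.uniform(-d, 1.0 + d)
    alpha_b = rng.uniform(-d, 1.0 + d)
    child_a = a + alpha_a * (b - a)
    child_b = b + alpha_b * (a - b)

    def clip(value):
        return max(low, min(high, value))

    return clip(child_a), clip(child_b)

print(extended_intermediate_crossover(-4.0, 4.5, rng=random.Random(1)))
```

Allowing α to leave the interval [0, 1] lets the search explore values beyond the two parents, which a plain intermediate crossover cannot do.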
In previous works [42], we have proposed to implement the macromutation in
the SA algorithm by means of a subtree crossover with a randomly generated indi-
vidual [21, 37]. In our MOSA implementation we will use this technique: crossover
with a random individual followed by a selection at random from the offspring. The
same mutation operator will also be used in our implementation of the NSGA-II
algorithm.

3.4 Fitness Function

The fitness function comprises a pair of numbers: the mean error of the one-step
prediction of the model, and the absolute difference between the largest Lyapunov
exponents of the model and the training data. Different procedures have been
proposed to compare this kind of compound value [7]. We will use a Pareto
multiobjective evaluation, and guide the search towards obtaining a set of nondominated
individuals. In the most general case, an individual x is said to dominate
another individual y ($x \prec y$) if all the components $F_j$ of the fitness vector F satisfy
$F_j(x) \le F_j(y)$, and $\exists t \mid F_t(x) < F_t(y)$. However, we are not interested in the whole
Pareto front, because models with a high prediction error are not of practical
interest. We will discard all models whose one-step prediction error is higher than the
variance of the time series, regardless of their Lyapunov value.
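The dominance relation and the rule for discarding models can be written compactly. In the sketch below each fitness is a tuple (one-step error, absolute difference of Lyapunov exponents); the function names and the sample values are illustrative assumptions.

```python
def dominates(fx, fy):
    """True if fitness vector fx dominates fy (all objectives are minimized)."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def useful_front(fitnesses, series_variance):
    """Nondominated fitness vectors among the models with an acceptable error.

    Models whose one-step error (first component) exceeds the variance of
    the time series are discarded, whatever their Lyapunov value.
    """
    kept = [f for f in fitnesses if f[0] <= series_variance]
    return [f for f in kept if not any(dominates(g, f) for g in kept)]

# Hypothetical fitness values: (one-step error, |difference of exponents|).
fits = [(0.01, 0.5), (0.02, 0.1), (0.03, 0.3), (2.0, 0.0)]
print(useful_front(fits, series_variance=1.0))
```

In this example the last model has the best Lyapunov value but is discarded because of its error, and (0.03, 0.3) is removed because (0.02, 0.1) dominates it.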
The estimation of the one-step prediction error is immediate. Unfortunately, the
same cannot be said about estimating the largest Lyapunov exponent of a model.
It will be computed, as mentioned, from the time series produced by the recursive
evaluation of the model from a given initial state, discarding the first samples of the
recursive evaluation so that we can be confident that the trajectory is on the attractor.
We evaluated several numerical algorithms. Our first choice was
the well-known Wolf algorithm [48], which we had already used in previous works.
Unfortunately, the number of samples that this algorithm needs is rather high; this,
in combination with the large number of iterations and the population sizes needed
to obtain good models with multiobjective genetic algorithms, makes the whole
identification procedure impractical (more than one week on a modern scientific
workstation). There exist other algorithms, in particular those of Rosenstein and
Kantz [22, 29], which need smaller sample sizes than Wolf’s; we have successfully
used a combination of the Rosenstein algorithm and our own heuristic estimation of
the point where the slope of the time vs. divergence curves changes. The use of the
Rosenstein algorithm, in combination with the MOSA algorithm that we will detail
in the next section, reduces the computation time from days to hours. However, the
best results in both accuracy and computational effort have been obtained by an
estimation based on the equations of the model and the principal axes of expansion, as
discussed in [46]: we follow the divergence of two close trajectories. One of them is
retained for reference. The other one is repeatedly renormalized so that the distance
between both is kept small. The maximum Lyapunov exponent is then estimated
as the average value of the logarithm of the ratio between the distance after one step
(before renormalizing) and the starting distance between the trajectories.
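The two-trajectory estimate described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the starting separation `d0`, the number of discarded samples and the function names are assumptions, and the model is any function that maps a state tuple to the next state tuple.

```python
import math

def largest_lyapunov(f, x0, n_steps=10000, n_transient=100, d0=1e-8):
    """Estimate the largest Lyapunov exponent of the map f (natural logarithm)."""
    x = tuple(x0)
    for _ in range(n_transient):          # discard the first samples
        x = f(x)
    y = (x[0] + d0,) + x[1:]              # second trajectory, at distance d0
    total, used = 0.0, 0
    for _ in range(n_steps):
        x, y = f(x), f(y)
        d1 = math.dist(x, y)              # distance after one step
        if d1 == 0.0:                     # degenerate step: restart the pair
            y = (x[0] + d0,) + x[1:]
            continue
        total += math.log(d1 / d0)
        used += 1
        # Renormalize: keep the direction, reset the distance to d0.
        y = tuple(xi + d0 * (yi - xi) / d1 for xi, yi in zip(x, y))
    return total / used

def logistic(state):
    return (4.0 * state[0] * (1.0 - state[0]),)

estimate = largest_lyapunov(logistic, (0.3,))
print(estimate, math.log(2.0))
```

For the Logistic map with coefficient 4 the exact value of the exponent is ln 2, which gives a simple check of the estimate.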

4 Detailed Description of the MOSA Algorithm

4.1 Outline of the Algorithm

The pseudocode of the Multi-Objective Simulated Annealing-Programming (MOSA)
algorithm is shown in Figure 3. This algorithm is based on a variable-sized population
of search points. At each iteration, all the search points are mutated and their
respective fitness values are evaluated. The comparison between the fitness of the mutated
individual and that of its corresponding search point can produce three different results:
1. The new individual dominates the current search point.
2. The new individual is dominated by the search point.
3. Neither of them dominates the other.
The strategy of MOSA for these three cases is as follows:
1. If the mutated individual dominates the current search point, it replaces its parent
in an intermediate population.
2. If the mutated individual is dominated, then a random decision is made between
keeping the current search point or the mutated one. Observe that, this being an SA
search, the probability of accepting the mutated point depends on the cooling
schedule and decreases with both the distance between the fitness values and
time. The distance between the fitness values is explained in the next subsection.
3. Otherwise, the size of the intermediate population is increased, and the mutated
model initiates a new search path.
Once all the individuals in the population have been mutated and the preceding
decisions have been taken, the intermediate population is sampled by means of the
selection operator to form the following generation. Aside from the population,
note that an elitist set of nondominated solutions is also kept; this set is the current
sample of the Pareto front and eventually will be the output of the algorithm.

4.2 The Distance Operator

To implement the simulated annealing we need to generate new individuals in the
neighborhood of the current one. The chances of a new individual being accepted
depend on the distance between the current and the new individual. When vector-based
individuals are used, the Euclidean distance can be used, but this is no longer true
with tree-based representations. In previous works [42], we postulated the use of an
edit distance between trees, defined as the number of edit operations (add, remove or
replace a node) needed to transform the current model into the new one. In the
same paper we also observed that nearby individuals could have
very different fitness values, and the same happens here. Therefore, we
have chosen to implement a distance in the fitness landscape (the supremum of the
distances in all the criteria) instead of an edit distance in the genotypical space.
Select initial and final temperatures: T0, T1
Select the cooling factor: C
Select a starting model: x0
Initialize the population of search paths: X = {x0}
Initialize the set of elites (sample of Pareto front): P = {x0}
T ← T0
while T ≥ T1
    // Initialize intermediate population X'
    X' ← X
    for path ← 1 to size(X)
        x ← mutation(X_path)
        if x ≺ X_path then
            // The search point is updated
            X'_path ← x
        else if X_path ≺ x then
            // The search point might be updated
            if rnd() < exp(-distance(X_path, x)/T) then X'_path ← x
        else
            // A new search path is generated
            X' ← X' ∪ {x}
        end if
    end for
    // The set of nondominated values up to this moment is updated
    P ← nondominated models of P ∪ X'
    // If needed, the size of the set of paths is adjusted
    X ← selection(X')
    T ← T · C
end while

Fig. 3 Pseudocode of the MOSA algorithm
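The loop of Figure 3 can be exercised on a toy problem with the sketch below. Several simplifications are assumptions made for illustration only: individuals are integers instead of trees, the mutation is a unit step instead of a crossover with a random individual, the two objectives are x² and (x − 10)², and the selection operator is replaced by a random truncation of the nondominated search points.

```python
import math
import random

def dominates(fa, fb):
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def distance(fa, fb):
    # Supremum of the distances in all the criteria (fitness space).
    return max(abs(a - b) for a, b in zip(fa, fb))

def nondominated(points, fitness):
    unique = list(dict.fromkeys(points))   # drop duplicated search points
    return [p for p in unique
            if not any(dominates(fitness(q), fitness(p)) for q in unique)]

def mosa(fitness, mutate, x0, t0=1.0, t1=0.01, cooling=0.95,
         max_size=10, rng=random):
    paths, elites, t = [x0], [x0], t0
    while t >= t1:
        intermediate = list(paths)
        for i, current in enumerate(paths):
            candidate = mutate(current, rng)
            f_cur, f_new = fitness(current), fitness(candidate)
            if dominates(f_new, f_cur):            # search point is updated
                intermediate[i] = candidate
            elif dominates(f_cur, f_new):          # might be updated
                if rng.random() < math.exp(-distance(f_cur, f_new) / t):
                    intermediate[i] = candidate
            else:                                  # new search path
                intermediate.append(candidate)
        elites = nondominated(elites + intermediate, fitness)
        survivors = nondominated(intermediate, fitness)
        # Stand-in for the selection operator: random truncation.
        if len(survivors) > max_size:
            survivors = rng.sample(survivors, max_size)
        paths = survivors
        t *= cooling
    return elites

def fitness(x):                       # hypothetical two-criteria toy problem
    return (x * x, (x - 10) * (x - 10))

def mutate(x, rng):                   # hypothetical mutation: unit step
    return x + rng.choice((-1, 1))

front = mosa(fitness, mutate, 25, rng=random.Random(0))
print(sorted(front))
```

In this toy problem the Pareto-optimal individuals are the integers from 0 to 10, so the returned elite set should contain only values in that range.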



4.3 The Selection Operator

The size of the intermediate population can be twice the current population
size, in the worst case. To control the maximum population size, all the
dominated values and duplicated search points are removed at each iteration.
Our selection operator is a variation of that used in the NSGA-II algorithm [9, 10].
In the first place, the set of nondominated search points is computed by pairwise
comparisons of all individuals in X'. Observe that we do not need to use fast sorting
algorithms to compute this set, because the size of X in our experiments ranges
between 10 and 25 individuals, and performing $25^2$ comparisons is much faster than
a single evaluation of the fitness value. Secondly,
• If the size of the set of nondominated search points is small enough, this set is
the new population.
• If its size must be further reduced, we sort the individuals in this last set by
means of the same crowding distance defined in the NSGA-II algorithm, and
choose them in inverse order of distance.
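The second step of the selection can be sketched as follows, for a set of fitness vectors that is already nondominated. The crowding distance is computed as in NSGA-II; reading "inverse order of distance" as keeping the most isolated points first is an assumption, as are the function names and the sample front.

```python
def crowding_distances(front):
    """NSGA-II crowding distance of each fitness vector in a nondominated set."""
    n = len(front)
    dist = [0.0] * n
    for j in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][j])
        low, high = front[order[0]][j], front[order[-1]][j]
        # Extreme points of each objective are always preferred.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if high == low:
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][j] - front[order[k - 1]][j]
            dist[order[k]] += gap / (high - low)
    return dist

def truncate(front, max_size):
    """Keep at most max_size points, preferring large crowding distances."""
    if len(front) <= max_size:
        return list(front)
    dist = crowding_distances(front)
    best = sorted(range(len(front)), key=lambda i: dist[i], reverse=True)
    return [front[i] for i in sorted(best[:max_size])]

front = [(0.0, 1.0), (0.1, 0.9), (0.5, 0.5), (1.0, 0.0)]
print(truncate(front, 3))
```

With this reading, the point (0.1, 0.9) is the one removed, because it lies in the most crowded region of the front.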

4.4 Example of a MOSA Evolution

In Figure 4 a typical example of the evolution of the MOSA algorithm is shown.
The problem being solved is taken from [15]. Since this problem consists in finding
two real values, we have codified each individual by means of a vector instead of a
tree, and used an extended intermediate crossover with a randomly generated vector
to mutate them, but otherwise the search scheme of MOSA was followed. It can be
observed that all the solutions are in the Pareto front after 100 iterations, and it is
also shown how the population size evolves.

5 Experiment and Results

5.1 Dynamic Behavior of Universal Approximators

As we have mentioned in the introduction, a low one-step prediction
error does not necessarily imply that the dynamic behavior of the system has been
captured. Suppose we intend to model a chaotic time series with a universal model,
a neural network, say. We first choose an embedding dimension d and convert the
time series into a training set. Each instance of this set has d inputs (the last d values
of the series) and one output (the next value in the series). We expect that, if the
embedding dimension is high enough, the neural network will capture the dynamics
of the model that generated the series.
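The conversion of a time series into a training set can be sketched in a few lines; the function name and the sample series are illustrative.

```python
def embed(series, d):
    """Training set for embedding dimension d.

    Each instance pairs the last d values of the series (inputs)
    with the next value (output).
    """
    return [(tuple(series[k - d:k]), series[k]) for k in range(d, len(series))]

training_set = embed([0.1, 0.2, 0.3, 0.4, 0.5], d=2)
for inputs, output in training_set:
    print(inputs, "->", output)
```

A series of N samples yields N − d instances, so larger embedding dimensions slightly reduce the size of the training set.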

The problem with this reasoning is that there are many different networks able to
approximate the former training set without error. Most of them do not correspond
to chaotic models. For instance, observe the one-step prediction errors of the
networks in the table that follows. They are all near zero, and apparently the models are
very precise, although some of them have too low an embedding dimension.
However, in Figure 5 we have plotted the step responses of these models. Observe that
all of them are stable systems, with a point attractor. None of the nets captured
the chaotic nature of the signal.

Multilayer Perceptron
Embedding dimension    Nodes in each layer    One-step error
1                      1-3-1                  0.000972
2                      2-5-1                  0.000034
3                      3-10-1                 0.000004
4                      4-10-1                 0.000009
If we use a transparent model instead, the same can happen. In Figure 6 a Genetic
Algorithm was used, with the same representation and operators described in the text
but with a scalar fitness (based only on the one-step error). We have trained it with data
from the Henon map. The learned model is not chaotic, though, as pictured in the
center part of the figure. Lastly, in the lower part of the same figure the step response
of a model learned by the MOSA algorithm is shown. This is a chaotic model, and
in the next section we will also show some examples of reconstructed attractors.
Note that the one-step errors of both models, the one found by MOSA and the one
found by the GA, are close to zero.

[Figure 4 plot: data series pareto0.dat, pareto40.dat, pareto60.dat, pareto80.dat, pareto100.dat and pareto.dat; both axes range from 0.4 to 1.2]

Fig. 4 Evolution of the population in the MOSA algorithm, for a two-criteria problem taken
from [15]

[Figure 5 plots: five panels, each over samples 0 to 1000; the upper panel is labeled mac2.dat and the four lower panels are labeled SerieArtificial.dat]

Fig. 5 Chaotic time series, and multilayer perceptrons with one hidden layer, trained to minimize
the one-step error. Upper part: Chaotic signal (training data). Central part: Step responses of the neural
networks 1-3-1 and 2-5-1. Lower part: Networks 3-10-1 and 4-10-1. Despite the low values of the
objective function shown in the text, none of the neural nets is a chaotic model

5.2 Benchmark Problems

In this section we will compare the results of MOSA and NSGA-II over some
benchmark problems. The NSGA-II is an implementation of the Pareto-based
multiobjective genetic algorithm detailed in [9, 10], which is currently assumed to
be among the best available implementations of this kind of algorithm.
The results will be shown with two different methodologies, graphical and
statistical. The graphical (qualitative) approach serves to identify the differences between
the combined Pareto fronts after a certain number of repetitions of each
experiment. The statistical (quantitative) comparison of the results of multiobjective
evolutionary algorithms is a current research field. There exist many different
measures of the degree to which one Pareto front improves on another, but
it is acknowledged that there are problems derived from the stochastic nature of
evolutionary algorithms that are still unsolved [50–52]. We propose to use a
statistical test of the probability that either algorithm dominates the other, based on the
binary ε-indicator described in [51]. Both the qualitative and the quantitative analyses
will be explained in the sections that follow.

Fig. 6 Graphical analysis of experimental results, Henon map. Upper part: Training data. Center:
Typical recursive evaluation of a transparent model obtained by an evolutionary algorithm when the
maximum Lyapunov exponent is not included in the fitness function: in this case, the optimization
has converged to a stable model (Lyapunov exponent < 0). Bottom: Step response of one of the
models found with the procedures mentioned in this paper

5.2.1 Experimental Setup

The parameters of the operators used in the experimentation are shown in the
following tables:

NSGA-II
Parameter                  Value    Parameter                  Value
Structural crossover       0.5      Parametric crossover       0.5
Mutation                   0.01     Embedding dimension        2
Population size            100      Evaluations of fitness     5000
Constants minimum value    −5       Constants maximum value    5

MOSA
Parameter                  Value    Parameter                  Value
Initial temperature        1.00     Cooling factor             0.999
Structural mutation        0.5      Parametric mutation        0.5
-                          -        Embedding dimension        2
Maximum population size    10       Evaluations of fitness     5000
Constants minimum value    −5       Constants maximum value    5
The learning time is roughly proportional to the number of times that we estimate
the largest Lyapunov exponent of a model, and both algorithms are allowed
to evaluate this function 5,000 times. Since this estimation is not performed when
the one-step error is higher than the variance of the time series, this is equivalent
to roughly 50 to 100 generations of the NSGA-II algorithm. The parameters defining the
random initialization of the individuals are as follows:
Parameter                                     Value
Maximum number of nodes in equations          10
Prob. of number of nodes/equation, 1 - 10     .05 .12 .11 .15 .15 .15 .11 .08 .05 .03
Maximum height                                7
Height probability distribution, 1 - 7        .05 .4 .3 .15 .05 .025 .025
Node types                                    +; −; *; G; Dz;
Node type probability distribution            .21 .21 .21 .21 .09 .07
Each experiment was repeated 10 times. The time series used for training and
validation have size 1,000. The chaotic systems that have been used are the
Logistic and the Henon maps, with the set of parameters shown in the equations
that follow:

Logistic map: $$x_{k+1} = 4.0\, x_k (1 - x_k) \qquad (2)$$

Henon map: $$\begin{aligned} x_{k+1} &= 0.3\, y_k + 1 - 1.4\, x_k^2 \\ y_{k+1} &= x_k \end{aligned} \qquad (3)$$
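The two benchmark series can be generated directly from Eqs. (2) and (3). In the sketch below the initial states and the function names are assumptions; the variance is computed because it is the threshold used to discard models with a high one-step error.

```python
def logistic_series(n, x0=0.3):
    """n samples of the Logistic map, Eq. (2)."""
    xs = [x0]
    while len(xs) < n:
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs

def henon_series(n, x0=0.0, y0=0.0):
    """n samples of the x variable of the Henon map, Eq. (3)."""
    x, y, xs = x0, y0, []
    while len(xs) < n:
        xs.append(x)
        x, y = 0.3 * y + 1.0 - 1.4 * x * x, x   # simultaneous update
    return xs

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((v - mean) ** 2 for v in xs) / len(xs)

logistic_data = logistic_series(1000)
henon_data = henon_series(1000)
print(variance(logistic_data), variance(henon_data))
```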

5.2.2 Commented Graphical Results

The graphical results are displayed in Figures 7 and 8. In both cases, we have
obtained the combined Pareto front (upper left part) after 10 repetitions of either
algorithm. This combined Pareto front is formed by selecting all the nondominated
individuals of the 10 runs. In the upper right part, all the elements of the 10 Pareto
fronts of each algorithm are displayed together, in the same graph. By last, in the
right lower part of the figures we have displayed a couple of reconstructed attractors

10 10
MOSA "PAR-GLOBAL-MOSA" u 1:2
NSGA-II "PAR-GLOBAL-NSGA" u 1:2
1 1

0.1 0.1

0.01 0.01

0.001 0.001

1e-04 1e-04
0.01 0.1 1 0.01 0.1 1

1.5 1.5
"atractor.dat" u 1:2 "atractor.dat" u 3:4
1 1

0.5 0.5

0 0

–0.5 –0.5

–1 –1

–1.5 –1.5
–1.5 –1 –0.5 0 0.5 1 1.5 –1.5 –1 –0.5 0 0.5 1 1.5

Fig. 7 Graphical analysis of experimental results, Henon map. Upper part, left: Combined Pareto
front of ten repetitions of the algorithms NSGA-II (triangles) and MOSA (circles). All the models
in the Pareto front of the NSGA-II algorithm are dominated by at least one element in the Pareto
front of the MOSA. A logarithmic scale is used, to enhance the differences. The vertical axe rep-
resents the error in the Lyapunov exponent, the horizontal one is the one-step error. Upper part,
right: combined cloud of the 10 Pareto fronts of both experiments, from which the Pareto fronts
were calculated. Lower part, left: Attractor of the Henon map. Lower part, right: Attractor of one
of the models induced by the MOSA method
196 L. Sánchez and J.R. Villar

10 10
"PAR-GLOBAL-MOSA" u 1:2
MOSA "PAR-GLOBAL-NSGA" u 1:2
1 NSGA-II 1

0.1 0.1

0.01 0.01

0.001 0.001

1e-04 1e-04

1e-05 1e-05

1e-06 1e-06
1e-04 0.001 0.01 0.1 1e-04 0.001 0.01 0.1 1

1 1
0.9
"atractor.dat" u 1:2
0.8 0.8 "atractor.dat" u 3:4
0.7
0.6 0.6
0.5
0.4 0.4
0.3
0.2 0.2
0.1
0 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.2 0.4 0.6 0.8 1

Fig. 8 Graphical analysis of experimental results, Logistic map. Upper part, left: Combined Pareto
front of ten repetitions of the algorithms NSGA-II (triangles) and MOSA (circles). All but one of
the models in the Pareto front of the NSGA-II algorithm are dominated by at least one element in
the Pareto front of the MOSA. A logarithmic scale is used, to enhance the differences. The vertical
axis represents the error in the Lyapunov exponent; the horizontal axis, the one-step error. Upper
part, right: combined cloud of the 10 Pareto fronts of both experiments, from which the Pareto
fronts were calculated. Lower part, left: Attractor of the Logistic map. Lower part, right: Attractor
of one of the models induced by the MOSA method

that show the similarity between the dynamic behavior of the models and that of
the original system (left part).
While there is a clear difference between the combined fronts (all of the points in the
NSGA-II front are dominated by those of the MOSA), this is not so in the second
graph, since some of the executions of MOSA were dominated by NSGA-II and
vice versa. The extent to which, on average, one algorithm is better than the other
will be studied in the next section.
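
The construction of a combined Pareto front from several runs can be sketched as follows. This is only an illustration, assuming that each front is a list of (one-step error, Lyapunov exponent error) pairs and that both objectives are minimized; the function names and the sample data are hypothetical and are not taken from the experiments.

```python
from typing import List, Tuple

# Assumed encoding: (one-step error, Lyapunov exponent error), both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def combined_front(runs: List[List[Point]]) -> List[Point]:
    """Pool the fronts of all runs and keep only the nondominated points."""
    pooled = [p for run in runs for p in run]
    return nondominated(pooled)


# Hypothetical fronts of two runs (illustrative numbers only).
runs = [
    [(0.10, 0.50), (0.30, 0.20)],
    [(0.05, 0.60), (0.20, 0.10)],
]
front = combined_front(runs)
```

In this toy example the point (0.30, 0.20) is dropped because (0.20, 0.10) from the other run is better in both objectives.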

5.2.3 Numerical Comparison

There are functions (unary indicators) that convert a Pareto front into a single
representative value. Sets of these representative values can be compared with
the same methodology used for scalar evolutionary algorithms, i.e., a statistical test
able to reject the hypothesis that the expected errors are the same. However, some studies
have shown that these unary indicators cannot capture all the dominance relations
that can hold between Pareto fronts [52]. Therefore, to assess the average
improvement of one algorithm over the other, we will use a method based on a
binary indicator, namely, the binary ε-indicator defined in [51].
Two different definitions of this last indicator are possible: the standard
(multiplicative) Iε and the additive indicator Iε+. Given two fronts A and B, if Iε(A, B) < 1
and Iε(B, A) > 1, or if Iε+(A, B) < 0 and Iε+(B, A) > 0, we can state that A
dominates B. The values of these indicators for our combined Pareto fronts follow:

            Iε(MOSA, NSGA)     Iε(NSGA, MOSA)
Henon       0.25               122.98
Logistic    0.13               201.483

            Iε+(MOSA, NSGA)    Iε+(NSGA, MOSA)
Henon       −0.04              0.19
Logistic    −6 · 10^−4         0.04
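
As an illustration of how such values can be obtained, the following sketch implements the two indicators as they are commonly defined for minimization problems: the multiplicative version takes the largest, over the points of B, of the smallest factor by which some point of A must be scaled to be no worse in every objective, and the additive version uses differences instead of ratios. The function names and the two small fronts are hypothetical; strictly positive objective values are assumed for the multiplicative version.

```python
from typing import List, Sequence

Front = List[Sequence[float]]


def eps_multiplicative(A: Front, B: Front) -> float:
    """Multiplicative epsilon-indicator (minimization, positive objectives assumed)."""
    return max(
        min(max(ai / bi for ai, bi in zip(a, b)) for a in A)
        for b in B
    )


def eps_additive(A: Front, B: Front) -> float:
    """Additive epsilon-indicator (minimization)."""
    return max(
        min(max(ai - bi for ai, bi in zip(a, b)) for a in A)
        for b in B
    )


# Hypothetical fronts: every point of A is better than the single point of B.
A = [(0.1, 0.2), (0.2, 0.1)]
B = [(0.4, 0.4)]

mult_ab = eps_multiplicative(A, B)   # below 1 when A is ahead of B
mult_ba = eps_multiplicative(B, A)   # above 1 in that case
add_ab = eps_additive(A, B)          # below 0 when A is ahead of B
add_ba = eps_additive(B, A)          # above 0 in that case
```

With these definitions the pattern in the tables above (first column below the threshold, second column above it) is the one that indicates that the first front dominates the second.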

In both cases, we can conclude that the combined MOSA results dominate those of
NSGA-II. These results are not conclusive, though, since one exceptionally good
result of either algorithm could be responsible for the dominance of the combined
Pareto front. Therefore, we propose to apply the ε-indicator to a full set of
comparisons between all pairs of fronts, and to calculate the fraction of times each
instance of algorithm A dominates one of the instances of algorithm B, and
vice versa.
Our methodology is as follows: Let p_A(B) be 1 if A dominates B (i.e., when
Iε(A, B) < 1 and Iε(B, A) > 1), and 0 otherwise. Given 10 repetitions B_1, ..., B_10 of an
algorithm B, let

$$P_A(B) = \frac{1}{10} \sum_{i=1}^{10} p_A(B_i), \qquad (4)$$

and, given another 10 repetitions A_1, ..., A_10 of an algorithm A, let

$$P_A(B) = \left(P_{A_1}(B), P_{A_2}(B), \ldots, P_{A_{10}}(B)\right). \qquad (5)$$

The vector P_A(B) can be seen as a sample of a random variable: the fraction of times
that the output of algorithm A dominates that of algorithm B. If the expectation of
P_A(B) is greater than the expectation of P_B(A), then we can state that algorithm
A is better than algorithm B, since results of the former are more likely to improve
on those of the latter than the reverse.
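
The pairwise comparison can be sketched as follows. The code is a minimal illustration, assuming the multiplicative ε-indicator for minimization and the convention that a value of Iε(A, B) below 1 together with a value of Iε(B, A) above 1 means that A dominates B, as in the table of combined fronts. The function names and the two-run example are hypothetical (the experiments use 10 runs per algorithm).

```python
from typing import List, Sequence

Front = List[Sequence[float]]


def eps_indicator(A: Front, B: Front) -> float:
    """Multiplicative epsilon-indicator (minimization, positive objectives assumed)."""
    return max(min(max(ai / bi for ai, bi in zip(a, b)) for a in A) for b in B)


def p_dominates(A: Front, B: Front) -> int:
    """1 if front A dominates front B according to the indicator, 0 otherwise."""
    return 1 if eps_indicator(A, B) < 1 and eps_indicator(B, A) > 1 else 0


def dominance_fractions(runs_a: List[Front], runs_b: List[Front]) -> List[float]:
    """For each run of A, the fraction of runs of B that it dominates."""
    return [sum(p_dominates(A, B) for B in runs_b) / len(runs_b) for A in runs_a]


# Hypothetical one-point fronts, two runs per algorithm.
runs_a = [[(0.1, 0.1)], [(0.5, 0.5)]]
runs_b = [[(0.2, 0.2)], [(1.0, 1.0)]]

P_a_over_b = dominance_fractions(runs_a, runs_b)
P_b_over_a = dominance_fractions(runs_b, runs_a)
```

The two resulting vectors are the samples whose expectations are compared in the statistical test.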
Therefore, to know whether there is a significant difference between the two
algorithms we can use a statistical test to reject the hypothesis that the expectations of
P_A(B) and P_B(A) are the same. Since neither distribution was compatible
with a Gaussian distribution, we have used a Wilcoxon test (null hypothesis
E(P_A(B)) = E(P_B(A)), alternative hypothesis E(P_A(B)) > E(P_B(A))). The resulting
p-values are shown in the following table:
[Fig. 9 plots. Two boxplot panels with a vertical axis from 0.0 to 1.0; the boxes in each panel are labeled 1 and 2.]

Fig. 9 Boxplots of (1) P_MOSA(NSGA-II) and (2) P_NSGA-II(MOSA) for the Henon map (left part) and
Logistic map (right part). This graph shows that the probability that MOSA improves on NSGA-II is
higher than the probability that NSGA-II improves on MOSA in both problems

            p-value
Henon       0.00020
Logistic    0.00013
We can reject, with a confidence greater than 99%, the hypothesis that the means of both
variables are the same in favor of the alternative hypothesis; thus we can conclude that
MOSA is a significant improvement with respect to NSGA-II in this particular application.
Figure 9 also gives the boxplots of P_MOSA(NSGA-II) and P_NSGA-II(MOSA) for both problems.
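
A one-sided test of this kind can be sketched in a few lines. The text does not say which variant of the Wilcoxon test was used; the sketch below assumes the unpaired rank-sum (Mann-Whitney) form with a normal approximation and tie correction, which is only one possible choice. The function name and the two samples are hypothetical and do not reproduce the p-values of the table.

```python
import math
from typing import List, Tuple


def rank_sum_test_greater(x: List[float], y: List[float]) -> Tuple[float, float]:
    """One-sided Wilcoxon rank-sum test of 'x tends to be larger than y'.

    Returns (U, p), where p comes from the normal approximation with tie correction.
    """
    n, m = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * (n + m)
    tie_term = 0.0
    i = 0
    while i < n + m:
        j = i
        while j < n + m and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0          # average of the 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n * (n + 1) / 2.0
    total = n + m
    var_u = n * m / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var_u == 0.0:                          # all values identical: no evidence
        return u, 0.5
    z = (u - n * m / 2.0) / math.sqrt(var_u)
    p = 0.5 * math.erfc(z / math.sqrt(2.0))   # upper-tail probability of N(0, 1)
    return u, p


# Hypothetical dominance fractions for two algorithms (illustrative numbers only).
p_a = [0.9, 0.8, 1.0, 0.7, 0.9, 1.0, 0.8, 0.9, 1.0, 0.6]
p_b = [0.1, 0.0, 0.2, 0.0, 0.1, 0.3, 0.0, 0.2, 0.1, 0.0]
u_stat, p_value = rank_sum_test_greater(p_a, p_b)
```

For samples of size 10 an exact test would be preferable to the normal approximation; the approximation is used here only to keep the sketch short.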

6 Concluding Remarks and Future Work

Modeling systems with chaotic dynamics is a complex task. It is easy to obtain a
model with low error in a one-step prediction, but it is not easy to capture the
dynamical properties of the system. In this paper we have shown that many of these
short-term models are stable, and not chaotic.
If a transparent model is needed, the one-step approach is questionable. However,
using a larger prediction horizon is not feasible, since chaotic systems show
a high dependency on the initial conditions. Therefore, we have decided to combine
the one-step error and an invariant of the recursive evaluation of the model,
its largest Lyapunov exponent. Our results have shown that, for simple chaotic systems,
we are able to obtain a model whose recursive evaluation converges to
a strange attractor very similar to that of the original system. Moreover, we have
shown that, for this task, the use of a Simulated Annealing-based search can improve
on the results of recent multicriteria genetic algorithms in both memory requirements
and computational time.
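
For a one-dimensional model, the largest Lyapunov exponent of the recursive evaluation can be estimated by averaging the logarithm of the absolute derivative along an orbit. The sketch below illustrates this on the logistic map; the parameter value r = 4, the initial condition, and the orbit lengths are assumptions chosen for the illustration, not the settings of the experiments. For r = 4 the exact exponent is known to be ln 2, so the estimate should be close to 0.693.

```python
import math
from typing import Callable


def largest_lyapunov_1d(
    f: Callable[[float], float],
    dfdx: Callable[[float], float],
    x0: float,
    n_transient: int = 1000,
    n_steps: int = 20000,
) -> float:
    """Average of log|f'(x)| along an orbit of the map x -> f(x)."""
    x = x0
    for _ in range(n_transient):      # discard the transient part of the orbit
        x = f(x)
    total = 0.0
    for _ in range(n_steps):
        # The floor avoids log(0) if the orbit lands exactly on a critical point.
        total += math.log(max(abs(dfdx(x)), 1e-300))
        x = f(x)
    return total / n_steps


R = 4.0  # assumed parameter of the logistic map


def logistic(x: float) -> float:
    return R * x * (1.0 - x)


def d_logistic(x: float) -> float:
    return R * (1.0 - 2.0 * x)


lyap = largest_lyapunov_1d(logistic, d_logistic, x0=0.3)
```

A stable (non-chaotic) model would give a negative value with the same procedure, which is why this quantity separates models that merely have a low one-step error from models that reproduce the dynamics.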
Future work will be devoted to integrating the full spectrum of Lyapunov exponents
into the learning. This is needed to identify models with more than one positive
exponent. In this last case, it is hard for our algorithm to obtain a good model, since
most of the search is spent on models where only the largest exponent is similar.
The same can be said about unstable models, which are currently detected by means of
heuristics (i.e., limits on the range of the output of the recursive evaluation). The full
spectrum or, at least, the Kolmogorov entropy of the model should be evaluated
and taken into account along with the one-step error and the largest exponent.

Acknowledgments The research in this paper has been funded by project TIN2005-08386-C05-
05, M.E.C., Spain

References

1. A. Alvarez, A. Orfila and J. Tintore, DARWIN: An evolutionary program for nonlinear modeling
of chaotic time series. Computer Physics Communications, 136, pp. 334–349, 2000
2. E.K. Burke and J.D. Landa Silva. Improving the Performance of Trajectory-Based Multiob-
jective Optimisers by Using Relaxed Dominance. In: Lipo Wang, Kay Chen Tan, Takeshi
Furuhashi, Jong-Hwan Kim and Xin Yao, editors, Proceedings of the 4th Asia-Pacific Con-
ference on Simulated Evolution and Learning (SEAL’02), 1, pp. 203–207, Nanyang Technical
University, Orchid Country Club, Singapore, November 2002
3. H. Cao, L. Guo, Y. Chen and T. Guo, The Dynamic Evolutionary Modeling of HODEs for
Time Series Prediction., Computers and Mathematics with Applications, 46, pp. 1397–1411,
2003
4. Y.S. Chang, K.S. Park and B.Y. Kim, Nonlinear model for ECG R-R interval variation using
genetic programming approach. Future Generation Computer Systems, 21(7), pp. 1117–1123,
2005
5. I-F. Chung, C-J. Lin and C-T. Lin. A GA-based fuzzy adaptive learning control network. Fuzzy
Sets and Systems, 112, pp. 65–84, 2000
6. C.A. Coello. List of References on Evolutionary Multiobjective Optimization.
http://www.lania.mx/ccoello/EMOO/EMOObib.html
7. C.A. Coello. An Updated Survey of Evolutionary Multiobjective Optimization Techniques:
State of the Art and Future Trends. In 1999 Congress on Evolutionary Computation, IEEE
Service Center, 1, pp. 3–13, Washington, DC, 1999
8. P. Czyzak and A. Jaszkiewicz. Pareto simulated annealing — a metaheuristic technique for
multiple-objective combinatorial optimization. Journal of Multi-Criteria Decision Analysis, 7,
pp. 34–47, 1998
9. K. Deb, Samir Agrawal, Amrit Pratab and T. Meyarivan, A Fast Elitist Non-Dominated Sort-
ing Genetic Algorithm for Multi-Objective Optimization: NSGA-II. In: Marc Schoenauer, K.
Deb, Günter Rudolph, Xin Yao, Evelyne Lutton, Juan Julian Merelo and Hans-Paul Schwe-
fel, editors. Proceedings of the Parallel Problem Solving from Nature VI Conference. Paris,
France. Springer. Lecture Notes in Computer Science No. 1917, pp. 849–858, 2000
10. K. Deb and Tushar Goel. Controlled Elitist Non-dominated Sorting Genetic Algorithms for
Better Convergence. In: E. Zitzler, K. Deb, L. Thiele, C.A. Coello and David Corne, edi-
tors. First International Conference on Evolutionary Multi-Criterion Optimization. Springer-
Verlag. Lecture Notes in Computer Science No. 1993, pp. 67–81, 2001
11. K. Downing. Using evolutionary computational techniques in environmental modelling. Envi-
ronmental Modelling and Software, 13, pp. 519–528, 1998
12. C. Evans, P.J. Fleming, D.C. Hill, J.P. Norton, I. Pratt, D. Rees and K. Rodriguez-Vazquez,
Application of system identification techniques to aircraft gas turbine engines. Control
Engineering Practice, 9, pp. 135–148, 2001
13. A.I. Fernandez, L. Sanchez and J. J. Navarro. Approximating the discrete space equation from
chaotic noisy data (IPMU’2000). In Proceedings of Information Processing and Management
of Uncertainty in Knowledge-Based Systems, Madrid, Spain, pp. 149–156, 2000

14. D.B. Fogel and L.J. Fogel. Preliminary Experiments on Discriminating between Chaotic Sig-
nals and Noise using Evolutionary Programming. In: J.R. Koza, D.E. Goldberg, D.B. Fogel
and R. L. Riolo, editors. Genetic Programming 96., MIT Press, Cambridge, MA, 1996
15. C.M. Fonseca and Peter J. Fleming. Multiobjective Optimization and Multiple Constraint
Handling with Evolutionary Algorithms — Part I: A Unified Formulation. IEEE Transactions
on Systems, Man, and Cybernetics, Part A: Systems and Humans, 28(1), pp. 26–37, 1998
16. G.J. Gray, D.J. Murray-Smith, Y. Li, K.C. Sharman and T. Weinbrenner. Nonlinear model
structure identification using genetic programming. Control Engineering Practice, 6, pp.
1341–1352, 1998
17. N.F. Guler, E.D. Ubeyli and I. Guler. Recurrent neural networks employing Lyapunov ex-
ponents for EEG signals classification. Expert Systems with Applications, 29, pp. 506–514,
2005
18. M. Hapke, Andrzej Jaszkiewicz and Roman Slowinski. Pareto Simulated Annealing for Fuzzy
Multi-Objective Combinatorial Optimization. Journal of Heuristics, 6(3) pp. 329–345, August
2000
19. M. Hernández-Guía, R. Mulet and S. Rodríguez-Pérez, A New Simulated Annealing Algorithm
for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media.
Physical Review E, 72 (3), 2005
20. W. Jiang, Q. Guo-Dong and D. Bin. Observer-based robust adaptive variable universe fuzzy
control for chaotic system. Chaos, Solitons and Fractals, 23, pp. 1013–1032, 2005
21. T. Jones. Crossover, macromutation and population-based search. In: 6th International
Conference on Genetic Algorithms. San Francisco, July 15–19, Morgan Kaufmann, 1, pp. 73–80,
2005
22. H. Kantz. A robust method to estimate the maximal Lyapunov exponent of a time series. Phys.
Lett. A, 185, pp. 77–87, 1994
23. D. Kim. Improving the fuzzy system performance by fuzzy system ensemble. Fuzzy Sets and
Systems, 98, pp. 43–56, 1998
24. D. Kugiumtzis, B. Lillekjendlie and N. Christophersen. Chaotic time series. Part I: Estimation
of some invariant properties in state space. Identification and Control, 4(15), pp. 205–224,
1995
25. B. Lillekjendlie, D. Kugiumtzis and N. Christophersen. Chaotic time series part II: System
identification and prediction. Identification and Control, 4(15), pp. 225–243, 1995
26. A. Lopez, H. Lopez and L. Sanchez. Graph based GP applied to dynamical system modeling.
In: Connectionist Models of Neurons, Learning Processes and Artificial Intelligence, 6th Inter-
national Work-Conference on Artificial and Natural Neural Networks, IWANN 2001. Lecture
Notes in Computer Science, 2084, pp. 725–732, 2001
27. S. Luke. Two Fast Tree-Creation Algorithms for Genetic Programming. IEEE Transactions on
Evolutionary Computation, 4(3), pp. 274–283, September 2000
28. S. Luke and Liviu Panait. A Survey and Comparison of Tree Generation Algorithms. In: Lee
Spector et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO’2001). San Francisco, California, Morgan Kaufmann, pp. 81–88, 2001
29. M.T. Rosenstein, J.J. Collins and C.J. De Luca. A practical method for calculating largest
Lyapunov exponents from small data sets. Physica D, 65, pp. 117–134, June, 1993
30. M.W. Mak, K.W. Ku and Lu. On the improvement of the real time recurrent learning algorithm
for recurrent neural networks. Neurocomputing, 24, pp. 13–36, 1999
31. M.A. Matos and Paulo Melo. Multiobjective Reconfiguration for Loss Reduction and Service
Restoration Using Simulated Annealing. In: International Conference on Electric Power
Engineering, Budapest 99. IEEE, pp. 213–218, 1999
32. Y. Mei-Ying and W. Xiao-Dong. Chaotic time series prediction using least squares support
vector machines. Chinese Physics, 13, pp. 454–458, 2004
33. Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs, 3rd ed.
Springer-Verlag, 1996
34. S. Mukherjee, E. Osuna and F. Girosi. Nonlinear Prediction of Chaotic Time Series Using
Support Vector Machines. in: Proceedings of IEEE NNSP’97, Amelia Island, FL, USA, IEEE
Service Center, pp. 24–26, September 1997

35. D. Nam and Cheol Hoon Park. Multiobjective Simulated Annealing: A Comparative Study to
Evolutionary Algorithms. International Journal of Fuzzy Systems, 2(2), pp. 87–97, 2000
36. D. Nam and Cheol Hoon Park. Pareto-Based Cost Simulated Annealing for Multiobjective Op-
timization. In: Lipo Wang, Kay Chen Tan, Takeshi Furuhashi, Jong-Hwan Kim and Xin Yao,
editors. Proceedings of the 4th Asia-Pacific Conference on Simulated Evolution and Learning
(SEAL’02). Nanyang Technical University, Orchid Country Club, Singapore, 2, pp. 522–526,
November 2002
37. R. Poli and Nicholas F. McPhee. Exact GP Schema Theory for Headless Chicken Crossover
and Subtree Mutation. In: Proceedings of the 2001 Congress on Evolutionary Computation
CEC2001 COEX, World Trade Center, 159 Samseong-dong, Gangnam-gu, Seoul, Korea,
IEEE Press, pp. 1062–1069, 2001
38. P. Potocnik and I. Grabec. Nonlinear model predictive control of a cutting process. Neuro-
computing, 43, pp. 107–126, 2002
39. K. Rodríguez-Vázquez, C.M. Fonseca and P.J. Fleming. Identifying the Structure of NonLinear
Dynamic Systems Using Multiobjective Genetic Programming. IEEE Transactions on Systems,
Man, and Cybernetics, Part A: Systems and Humans, 34(4), pp. 531–545, July 2004
40. J.J. Rowland. Model selection methodology in supervised learning with evolutionary compu-
tation. BioSystems, 72, pp. 187–196, 2003
41. A.E. Ruano, P.J. Fleming, C. Teixeira, K. Rodriguez-Vazquez and C.M. Fonseca. Nonlinear
identification of aircraft gas-turbine dynamics. Neurocomputing, 55, pp. 551–579, 2003
42. L. Sanchez, I. Couso and J.A. Corrales. Combining GP operators with SA search to evolve
Fuzzy Rule based classifiers. Information Sciences, 1–5, pp. 175–192, 2001
43. R.S. Sexton and J.N.D. Gupta. Comparative evaluation of genetic algorithm and backpropa-
gation for training neural networks. Information Sciences, 129, pp. 45–59, 2000
44. T. Shin and I. Han. Optimal signal multi-resolution by genetic algorithms to support artificial
neural networks for exchange-rate forecasting. Expert Systems with Applications, 18, pp.
257–269, 2000
45. Kevin I. Smith, Richard M. Everson and Jonathan E. Fieldsend. Dominance Measures
for Multi-Objective Simulated Annealing. In: 2004 Congress on Evolutionary Computation
(CEC’2004). Portland, Oregon, USA. IEEE Service Center, 1, pp. 23–30, June, 2004
46. J.C. Sprott. Chaos and Time-Series Analysis. Oxford University Press, 2003
47. Z. Wei, W. Zhi-ming and Y. Gen-ke. Genetic programming-based chaotic time series model-
ing. Journal of Zhejiang University SCIENCE, 5(11), pp. 1432–1439, 2004
48. A. Wolf, J.B. Swift, H.L. Swinney and J. A. Vastano. Determining Lyapunov Exponents from
a Time Series. Physica D, 16, pp. 285–317, 1985
49. A.M. Woodward, R.J. Gilbert and D. B. Kell. Genetic programming as an analytical tool for
non-linear dielectric spectroscopy. Bioelectrochemistry and Bioenergetics, 48, pp. 389–396,
1999
50. E. Zitzler, K. Deb and L. Thiele. Comparison of Multiobjective Evolutionary Algorithms:
Empirical Results. Evol. Comput., 8(2), pp. 173–195, 2000
51. E. Zitzler, M. Laumanns, L. Thiele, C.M. Fonseca and V. Grunert da Fonseca, Why Qual-
ity Assessment of Multiobjective Optimizers Is Difficult. In: W.B. Langdon, E. Cantú-Paz,
K. Mathias, R. Roy, D. Davis, R. Poli and K. Balakrishnan, V. Honavar, G. Rudolph,
J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E. Burke and N. Jonoska, edi-
tors. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO’2002).
San Francisco, California, pp. 666–673, July 2002
52. E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca and V. Grunert da Fonseca. Performance
Assessment of Multiobjective Optimizers: An Analysis and Review. IEEE Transactions on Evo-
lutionary Computation, 7(2), pp. 117–132, April 2003
An Integrated Fuzzy Inference-based
Monitoring, Diagnostic, and Prognostic System
for Intelligent Control and Maintenance

Dustin R. Garvey and J. Wesley Hines

Abstract With the advent of modern computation, intelligent control and mainte-
nance systems have become a viable option for complex engineering processes and
systems. Such control and maintenance systems can be generically described as be-
ing composed of 5 analysis steps: (1) predict the expected system signals from their
measured values, (2) use the residual of the measured and predicted value to deter-
mine if the system is operating in a nominal or a degraded mode, (3) if the system
is operating in a degraded mode, diagnose the fault, (4) prognose the failure by es-
timating the remaining useful life (RUL) of the system, and (5) use the collected
information to determine if an appropriate control or maintenance action should be
performed to maintain the health and safety of the system performance. This chap-
ter presents the development and adaptation of a single generic inference procedure,
namely the nonparametric fuzzy inference system (NFIS), for monitoring, diagnos-
tics, and prognostics. To illustrate the proposed methodologies, the embodiments of
the NFIS are used to detect, diagnose, and prognose faults in the steering system
of an automated oil drill. The embodiments of the NFIS were found to have simi-
lar performance to traditional algorithms, such as autoassociative kernel regression
(AAKR) and k-nearest neighbor (kNN), for monitoring and diagnosis. The NFIS
prognoser was also shown to estimate the remaining useful life of the steering sys-
tem to within an hour of its actual time of failure.

Keywords: online monitoring, sensor calibration, empirical modeling, diagnostics, surveillance.

1 Introduction

The ability to monitor and control complex systems has been of interest for decades
with a myriad of successful applications; however, the ability to identify system
Dustin R. Garvey and J. Wesley Hines
The University of Tennessee, Knoxville, Department of Nuclear Engineering, Knoxville, TN
37996-2300, United States of America, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 203–222.
© 2008 Springer.

degradation and predict remaining useful life (RUL) has proved much more diffi-
cult. Research in prognostic methods has recently come to the forefront as com-
panies strive to become more competitive and as the US Department of Defense
requires prognostic capabilities in new weapon systems. The desired system would
take the form of an integrated system for monitoring, detection, identification, and
prognostics.

1.1 Reliability Engineering Methods

Traditional reliability methods [11, 24] predict system or device RUL based on his-
torical data collected from a population of identical or similar devices. However,
these predictions are accurate only for an “average”, or typical, device. Predictions
for an individual device are far more useful because the uncertainty is typically
much smaller, and, thus, are the focus of more recent research. Improved prognostic
methods use covariate information and cumulative damage models [5]. These meth-
ods provide a prediction based on how long the average component would operate
under the current conditions. More recent techniques use degradation data to assess
equipment condition and predict future behavior, such as time to failure (TTF) or
RUL. These individualized prognostics techniques have the ability to make RUL
predictions with less uncertainty than population-based methods; however, they re-
quire measurement information related to the equipment degradation. A detailed re-
view of the reliability data-analysis methods using degradation measurements rather
than time-to-failure data is given by Lu and Meeker [23] and a recent review of re-
search in the field of prognostics and health management (PHM) for electronics is
given by Vichare and Pecht [28].
Prognostics methods require either detailed physics-of-failure models or failure
data to train empirical failure models. Because detailed physics models are usually
difficult to construct for each failure mode and sufficient historical failure data is
rarely available, successful prognostic applications are rare. In industry, when equip-
ment degradation is detected, maintenance procedures are implemented that restore
or replace the failing item. If items are not maintainable, usually the item is re-
designed to remove the fault mode. Items that are allowed to fail in service are not
usually monitored, or they fail so rapidly that prognostics would not be beneficial.
These are the main reasons that successful prognostics applications are not readily
available for study.

1.2 Integrated Framework

According to Isermann [18], to fully supervise a process or system we must be
able to detect, diagnose, and evaluate the magnitude of faults that occur. In his
proposed methodology, a predictive model is used to estimate model parameters, the
system state, and/or expected system signals. These predictions are then compared

[Fig. 1 diagrams. (a) PROCESS / SYSTEM with the blocks PREDICT (expected signal values?), DETECT (nominal or degraded?), DIAGNOSE (what is the fault?), and EVALUATE (control/maintenance action?). (b) The same blocks with PROGNOSE (what is the RUL?) added between DIAGNOSE and EVALUATE.]

Fig. 1 (a) Traditional and (b) modified supervisory control/maintenance system

to reference values (i.e. measurements, nominal values, etc.) to generate residuals,
which are subsequently used to determine if the system is operating in a nominal
or a degraded (faulted) mode. If the system is determined to be operating in a
degraded mode, the residuals can be used to generate fault symptoms for fault di-
agnosis. Once the fault has been diagnosed, the residuals and symptom patterns can
be evaluated to determine the severity of the fault and to determine whether or not
a control/maintenance action should be executed. This process is generalized by the
diagram presented in Figure 1(a). This work extends the framework described by
Isermann [18–20] to include a prognostic module. For the current discussion, prog-
nosis is interpreted as the process of estimating the RUL of a component or system.
Prognosis and RUL estimation will be used interchangeably throughout this report.
Returning to the problem at hand, the previously described process can be stated
as a five-step process which is presented in Figure 1(b). This process can also be
interpreted as answering the following questions:
1. From previous system behavior and current measurements, what should the
process signal values be?
2. By comparing the current prediction error to some nominal distribution, is there
a fault in the system?
3. If there is a fault, what is the fault?
4. For the identified fault, what is the RUL of the component and/or system?
5. Based on the collected information (i.e. prediction, fault alarms, diagnosis, and
RUL), what appropriate control or maintenance action can be performed to main-
tain the health and safety of the process or system performance?
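
The five questions above can be read as a simple processing chain. The following skeleton is only a sketch of that chain: the class, the function names, and the toy components passed to it are hypothetical placeholders, not the NFIS embodiments described later in this chapter.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Assessment:
    prediction: List[float]
    residual: List[float]
    degraded: bool
    fault: Optional[str] = None
    rul: Optional[float] = None
    action: str = "none"


def supervise(
    measured: List[float],
    predict: Callable[[List[float]], List[float]],
    detect: Callable[[List[float]], bool],
    diagnose: Callable[[List[float]], str],
    prognose: Callable[[str, List[float]], float],
    evaluate: Callable[[Assessment], str],
) -> Assessment:
    prediction = predict(measured)                                # step 1: predict
    residual = [m - p for m, p in zip(measured, prediction)]
    result = Assessment(prediction, residual, detect(residual))   # step 2: detect
    if result.degraded:
        result.fault = diagnose(residual)                         # step 3: diagnose
        result.rul = prognose(result.fault, residual)             # step 4: prognose
        result.action = evaluate(result)                          # step 5: evaluate
    return result


# Toy components, chosen only to exercise the chain.
result = supervise(
    measured=[1.0, 2.6],
    predict=lambda x: [1.0, 2.0],
    detect=lambda r: max(abs(v) for v in r) > 0.5,
    diagnose=lambda r: "sensor drift",
    prognose=lambda fault, r: 12.0,
    evaluate=lambda a: "schedule maintenance" if a.rul < 24 else "monitor",
)
```

Passing the components as functions keeps the chain independent of the inference procedure used in each step.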

To implement the described intelligent control and maintenance system, methods
need to be developed and validated for each of the analysis steps. This paper presents
methods that utilize a single inference procedure for each of the core analysis steps,
namely prediction, detection, diagnosis, and prognosis.
To date there has been a plethora of methodologies that address several key
requirements of monitoring (prediction and detection), diagnostic, and prognostic
systems. While the proposed methods are beneficial to the scientific and engineering
community at large, most do not address the issue of being readily integrated into
a real-world system. For example, recent work by Whisnant et al. [30] describes a
monitoring system that uses a nonparametric prediction algorithm to estimate the
state of the system and then applies a statistical test to the prediction residuals to
determine if a fault has occurred. In addition, recent work by Yan et al. [31] des-
cribes a diagnostic system that uses a multiple classification algorithm to diagnose
faults. Also, recent work by Vichare and Pecht [28] provides a survey of different
prognostic algorithms, which range from built-in-tests (BIT) to cumulative damage
modeling. While these three examples represent significant advances in
the monitoring, diagnostic, and prognostic fields, respectively,
they do not provide insight into how to bring these three pieces together
into an integrated system. This paper addresses this issue by describing a fuzzy
inference-based prediction algorithm and then modifying this base algorithm to perform
the monitoring, diagnostic, and prognostic tasks. Cornerstone procedures in
system monitoring, diagnostics, and prognostics are re-examined as inference prob-
lems (i.e., given X, what is Y ) and the newly developed nonparametric fuzzy in-
ference system (NFIS) is adapted for each situation. In addition to describing the
algorithmic framework, this paper presents results of applying the proposed system
to detect, diagnose, and prognose faults in the steering system of an automated oil
drill.

2 Nonparametric Fuzzy Inference System

The nonparametric fuzzy inference system (NFIS) is a fuzzy inference system (FIS),
whose membership function centers and parameters are observations of exemplar
inputs and outputs. This approach is unique in that previous algorithms described
in the literature use “composed” observations to parameterize the membership func-
tions (MF) of the FIS. For example, Germond and Niebur [13] use expert knowledge
to create MFs about composed patterns that map to qualitative features such as hot,
cold, high, and low. Another popular approach for MF parameterization is partition-
ing [22]. In fuzzy partitioning, the data space is partitioned into regions and MFs
are created about the centers of these regions. Here, the composed patterns are the
region centers. A similar approach, implemented in unsupervised clustering algorithms
such as fuzzy c-means [3, 10] and Adeli-Hung [1] clustering, centers the
MFs on composed cluster centers and calculates the cluster parameters in terms of
the distance from the cluster center. In yet another approach, the parameters of the
MFs can be determined by performing least squares optimization of the FIS inputs
and outputs [21].
At this point, the NFIS inference procedure will be briefly described. For a more
detailed explanation, refer to Garvey [12]. Suppose n exemplar observations of the p
inputs and r outputs that characterize the system’s normal operating conditions (S)
are collected. These observations should cover the system’s future operating space.
As with any nonlinear, empirical prediction algorithm, confidence cannot be given
to predictions made outside the trained region. The NFIS will infer the system’s
mathematical relationship:
Y = S(X)
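The exemplar observations are represented by two matrices, X and Y, in which
X_{i,j} is observation i of input j and Y_{i,k} is observation i of output k.

$$
X = \begin{bmatrix}
X_{1,1} & X_{1,2} & \cdots & X_{1,p} \\
X_{2,1} & X_{2,2} & \cdots & X_{2,p} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n,1} & X_{n,2} & \cdots & X_{n,p}
\end{bmatrix}
\qquad
Y = \begin{bmatrix}
Y_{1,1} & Y_{1,2} & \cdots & Y_{1,r} \\
Y_{2,1} & Y_{2,2} & \cdots & Y_{2,r} \\
\vdots & \vdots & \ddots & \vdots \\
Y_{n,1} & Y_{n,2} & \cdots & Y_{n,r}
\end{bmatrix}
$$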

A mapping of new inputs to their respective outputs can be constructed using an
NFIS. If x_i is a new observation of input i and y_j is an observation of output j, the
fuzzy inference system can be represented by the following set of if-then statements.

IF x_1 = X_{1,1} AND x_2 = X_{1,2} AND ... AND x_p = X_{1,p}
THEN y_1 = Y_{1,1} AND y_2 = Y_{1,2} AND ... AND y_r = Y_{1,r}
IF x_1 = X_{2,1} AND x_2 = X_{2,2} AND ... AND x_p = X_{2,p}
THEN y_1 = Y_{2,1} AND y_2 = Y_{2,2} AND ... AND y_r = Y_{2,r}
...
IF x_1 = X_{n,1} AND x_2 = X_{n,2} AND ... AND x_p = X_{n,p}
THEN y_1 = Y_{n,1} AND y_2 = Y_{n,2} AND ... AND y_r = Y_{n,r}

In the NFIS, the MFs from the exemplar inputs and outputs are directly defined by
the data matrices. As an example, consider creating the MFs for five exemplar obser-
vations of a single input. For the sake of simplicity, also assume that the exemplars
are sorted from smallest to largest, i.e.

$$
X = \begin{bmatrix} X_{1,1} \\ X_{2,1} \\ X_{3,1} \\ X_{4,1} \\ X_{5,1} \end{bmatrix},
\qquad X_{1,1} < X_{2,1} < X_{3,1} < X_{4,1} < X_{5,1}
$$

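[Fig. 2 plot. Overlap = 2. Vertical axis: membership µ, from 0 to 1. Horizontal axis: x1, with triangular MFs centered on X1,1, X2,1, X3,1, X4,1, and X5,1.]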

Fig. 2 Final triangular membership functions for the five exemplar observations and an overlap of
two

In the NFIS MF creation algorithm, triangular MFs are centered on the exemplar
observations and the MF support is bounded by neighboring signal observations. The
proximity of the neighbors is controlled by an overlap parameter. For example,
the right endpoint of a triangular MF for the ith exemplar observation is set to the
(i + overlap)th observation. The parameters for the boundary MFs are defined in
terms of the half-width of the current MF. For an overlap parameter of 2, the MFs
presented in Figure 2 are obtained. This process is repeated for each input and output
signal to obtain the remaining MFs.
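
The MF creation step can be sketched as follows. The interior rule follows the description above; the boundary rule is an assumption, since the text only states that boundary MFs are defined in terms of the half-width of the current MF: here a missing endpoint is obtained by mirroring the available half-width about the center. The function names are hypothetical, and distinct exemplar values are assumed.

```python
from typing import List, Tuple

Triangle = Tuple[float, float, float]  # (left endpoint, center, right endpoint)


def triangular_mfs(exemplars: List[float], overlap: int = 2) -> List[Triangle]:
    """One triangular MF per exemplar, with endpoints on neighboring exemplars."""
    xs = sorted(exemplars)
    n = len(xs)
    if n < 2 * overlap:
        raise ValueError("need at least 2 * overlap exemplars")
    mfs = []
    for i, center in enumerate(xs):
        left = xs[i - overlap] if i - overlap >= 0 else None
        right = xs[i + overlap] if i + overlap < n else None
        # Assumed boundary rule: mirror the half-width that is available.
        if left is None:
            left = center - (right - center)
        if right is None:
            right = center + (center - left)
        mfs.append((left, center, right))
    return mfs


def membership(x: float, mf: Triangle) -> float:
    """Membership of x in a triangular MF: 1 at the center, 0 outside the support."""
    left, center, right = mf
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


mfs = triangular_mfs([1.0, 2.0, 3.0, 4.0, 5.0], overlap=2)
mu = membership(2.5, mfs[2])   # membership of 2.5 in the MF centered on 3.0
```

With five equally spaced exemplars and an overlap of 2, each MF spans two neighbors on either side of its center, which matches the layout described for Figure 2.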
To estimate the response for an observation of the inputs, the previously presented
FIS with the created MFs is used. The MEAN operator, instead of the
traditional MIN (AND) operator, is used to determine the degree of fulfillment (DOF),
i.e., the extent to which each rule fires. This concludes the derivation of the general NFIS
framework. Next, the framework will be used to implement the five analysis steps of
the control/maintenance system.
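
The effect of replacing MIN with MEAN can be seen in a small sketch. The function name and the membership values are hypothetical; the sketch only covers the computation of the DOF of a single rule from the memberships of its inputs and does not attempt to reproduce the rest of the inference procedure.

```python
from typing import List


def degree_of_fulfillment(memberships: List[float], operator: str = "mean") -> float:
    """Combine the input memberships of one rule into its degree of fulfillment."""
    if operator == "mean":
        return sum(memberships) / len(memberships)
    if operator == "min":
        return min(memberships)
    raise ValueError("operator must be 'mean' or 'min'")


# Hypothetical memberships of three inputs in the antecedent of one rule.
# The third input lies outside the support of its MF.
memberships = [0.9, 0.8, 0.0]

dof_mean = degree_of_fulfillment(memberships, "mean")
dof_min = degree_of_fulfillment(memberships, "min")
```

With MIN, a single input that falls outside its MF support silences the rule entirely; with MEAN, the rule still fires in proportion to the inputs that do match.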

3 Embodiments of the NFIS

This section provides a description of the different embodiments of the general NFIS
used in the integrated system: prediction, detection, diagnosis, and prognosis. As a
starting point, the integrated monitoring, diagnosis, and prognosis system is pre-
sented in Figure 3. Here, asset (system or process) data is collected and digitized.
The collected data is then passed to a signal selector, which takes the input signals
and extracts previously identified, correlated signals. The collected observations
of the signals are then presented as inputs to an NFIS predictor, which produces
estimates of the “correct” signal values from their measured values. The measured
values are then compared to the NFIS estimates, and the resulting prediction residuals
are evaluated by a cumulative sum (CUMSUM) or sequential probability ratio test
(SPRT) statistical detector, which determines if the asset is operating in a nominal
or degraded mode. If the detector output indicates
that the asset is operating normally (no fault/anomaly), then no maintenance/control
action is executed and the monitoring, diagnostic, and prognostic system examines
the next observation of the asset signals. However, if the detector output indicates
that the asset is operating in a degraded mode, the prediction and detection results
are passed to an NFIS diagnoser, which maps the provided symptom patterns
(prediction residuals, signals, alarms, etc.) to known fault conditions. Next, the
prediction, detection, and diagnosis results are passed to an NFIS prognoser, which
estimates the RUL of the asset. Finally, the prediction, detection, diagnosis, and prognosis
results are used to determine an appropriate maintenance or control action.

[Figure 3: block diagram with the blocks Asset; Data Acquisition and Digitization Means; Signal Selector; Current System Signals X; NFIS Predictor; SPRT/CUMSUM Detector; Fault? (NO/YES); NFIS Diagnoser; NFIS Prognoser; Control/Maintenance Action; Operator]

Fig. 3 Block diagram of the fuzzy inference-based monitoring, diagnostic, and prognostic system
for an autoassociative predictor architecture
In the remaining sections, the details of the different embodiments of the NFIS
will be described, beginning with the NFIS predictor.

3.1 Prediction

The NFIS methodology was previously presented for the prediction application;
therefore, an extensive discussion of the NFIS as a predictor is not necessary here.
It is, however, important to describe the settings that are used to define the NFIS
architecture.
The NFIS architecture settings include options that are common to other non-
parametric predictors, such as the number of memory or exemplar vectors used to
define the system and the vector selection technique. A discussion of optimal vec-
tor selection is beyond the scope of this work and the reader is referred to a survey
paper by Hines and Garvey [17].
An important user-selectable NFIS parameter is the membership function overlap.
Recall that the overlap parameter controls the width of the MFs that are created
for each of the selected exemplar observations. This is similar to the kernel width
used in radial basis functions, generalized regression neural networks, and kernel
regression. The overlap parameter can be interpreted as a regularization parameter
because a larger overlap allows more exemplars to be deemed similar to the query,
which results in smoother model predictions.
The final NFIS architecture parameter is the implication method, which controls
how the memberships to each of the signals or variables are combined to obtain a
DOF for each exemplar observation. Common implication methods are the mini-
mum, maximum, sum, and mean operators. In general, the implication method does
not significantly affect the NFIS predictions, but may offer advantages and disad-
vantages for specific applications.
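As a small illustration of how the listed operators combine per-signal memberships into a DOF, consider the sketch below. The names `IMPLICATION` and `degree_of_fulfillment` are hypothetical and chosen only for this example.

```python
from statistics import mean

# The four operators named in the text, keyed by name.
IMPLICATION = {"min": min, "max": max, "sum": sum, "mean": mean}


def degree_of_fulfillment(memberships, method="mean"):
    """Combine the per-signal memberships of one exemplar into a DOF."""
    return IMPLICATION[method](memberships)


# Memberships of a query to one exemplar, one value per signal.
query_memberships = [0.5, 1.0]
dof_mean = degree_of_fulfillment(query_memberships, "mean")
dof_min = degree_of_fulfillment(query_memberships, "min")
```

One practical difference between the operators: with MIN, a single signal with zero membership removes the exemplar from the estimate entirely, whereas with MEAN the exemplar still contributes in proportion to the remaining signals.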

3.2 Detection

The NFIS is not explicitly used for anomaly and fault detection, but it does per-
form a critical task in the process. Isermann [20] describes the process by which
an anomaly or fault can be detected as being composed of two steps: (1) make a
prediction and (2) generate and evaluate a residual on the basis of being represen-
tative of a degraded system condition. For this work, the NFIS is used to predict
a system signal from other signal measurements. The residual is generated by cal-
culating the error between the predicted and measured values. Finally, the residual
is passed to a statistical routine that compares the current residual distribution to a
nominal distribution. More specifically, the statistical routine uses the distribution
of the residuals to determine if the system is currently operating in a nominal or de-
graded mode. Statistical routines that have been historically used for fault detection
include the sequential probability ratio test (SPRT) [14, 29] and the cumulative sum
(CUMSUM) test [2, 25].
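The detection step can be illustrated with a minimal SPRT for a mean shift in Gaussian residuals. This is a sketch under assumptions, not the detector used in this work: the chapter does not state which SPRT variant was applied, so the hypotheses (mean 0 versus mean `shift`, known `sigma`), the restart after each decision, and the function name `sprt_alarms` are all illustrative. The thresholds are Wald's approximations [29].

```python
import math


def sprt_alarms(residuals, shift, sigma, alpha=0.01, beta=0.01):
    """Flag observations at which a positive mean-shift SPRT accepts H1.

    H0: residuals ~ N(0, sigma^2); H1: residuals ~ N(shift, sigma^2).
    alpha and beta are the false-alarm and missed-alarm probabilities.
    The test restarts after every decision (an assumption of this sketch).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 (alarm)
    lower = math.log(beta / (1.0 - alpha))   # accept H0 (nominal)
    llr = 0.0
    alarms = []
    for r in residuals:
        # Log-likelihood ratio increment for a Gaussian mean shift.
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            alarms.append(True)
            llr = 0.0
        else:
            alarms.append(False)
            if llr <= lower:
                llr = 0.0
    return alarms


measured = [2.0] * 8
predicted = [0.0] * 8
# Residual sign convention (measured minus predicted) is an assumption.
residuals = [m - p for m, p in zip(measured, predicted)]
alarms = sprt_alarms(residuals, shift=1.0, sigma=1.0)
```

A two-sided detector would run a second test with a negative shift; the one-sided form is kept here for brevity.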

3.3 Diagnosis

When considering the application of the NFIS to diagnosis, the most apparent approach
would be to construct an NFIS predictor with symptom patterns as inputs
(e.g., prediction residuals for observations with fault alarms) and integer fault iden-
tifiers (ID) as the output. To obtain a classification, the observed residuals would
be input to the NFIS predictor and the predicted output would be rounded to the
nearest integer to obtain the fault type. The problem with this approach can be made
apparent by considering a quick example. Suppose symptom patterns for three different
fault conditions exist and that an NFIS is trained to estimate the fault ID (1, 2,
or 3) for query symptom residual patterns. If there are no overlaps of the symptom
patterns for these three fault conditions, this approach should work well, but how
would the NFIS perform when there is symptom pattern overlap? To answer this
question, let us consider the case in which there is overlap between the symptom
patterns of the first and third fault types. Next, suppose that the goal is to diagnose
the fault of a query symptom pattern that lies in the overlapping regions of the first
and third faults. For this example, the memberships are near 0.5 for both the first
and third fault condition. The resulting diagnosis estimate would be near 2, which
means that we have diagnosed the query as belonging to the second fault condition.
Does this make sense? For a predictor model with continuous inputs and outputs,
this would be an appropriate estimate, since the inputs map to a value that is nu-
merically between 1 and 3. However, in some situations, a classification of 2 as
being an intermediate between the first and third class might not make any sense.
Therefore, the NFIS structure must be modified to reflect the occurrence of partial
memberships.
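The failure mode in this example is easy to reproduce numerically. The sketch below assumes the predictor output is a membership-weighted average of the exemplar outputs; that weighting and the name `weighted_output` are stand-ins for the NFIS inference step, used only for illustration.

```python
def weighted_output(memberships, outputs):
    """Membership-weighted average of exemplar outputs (assumed inference step)."""
    total = sum(memberships)
    return sum(m * y for m, y in zip(memberships, outputs)) / total


fault_ids = [1, 2, 3]            # integer fault identifiers
memberships = [0.5, 0.0, 0.5]    # query overlaps faults 1 and 3 only
estimate = weighted_output(memberships, fault_ids)
diagnosed = round(estimate)      # rounds to the second fault class
```

The query has zero membership to the second fault, yet the rounded estimate selects it, which is exactly the problem described above.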
For this discussion, suppose that n observations of p inputs (variables) that are ex-
amples of nc classes (fault conditions) are collected. Also, let Ci designate the ith
class and ni the number of examples for this class. Using these definitions, the
sum of the number of examples for each class is equal to the number of example
observations:

$$n = \sum_{i=1}^{n_c} n_i$$

These definitions can be used to formulate the classification problem in a similar
fashion as the prediction problem discussed earlier. If the training (example) inputs
are denoted by X and outputs (classes) by Y, “memory” matrices for the inputs and
outputs can be created.

$$
X = \begin{bmatrix}
X_{1,1} & \cdots & X_{1,p} \\
\vdots & \cdots & \vdots \\
X_{n_1,1} & \cdots & X_{n_1,p} \\
X_{n_1+1,1} & \cdots & X_{n_1+1,p} \\
\vdots & \cdots & \vdots \\
X_{n_1+n_2,1} & \cdots & X_{n_1+n_2,p} \\
\vdots & \cdots & \vdots \\
X_{n_1+\dots+n_{n_c}-1,1} & \cdots & X_{n_1+\dots+n_{n_c}-1,p} \\
\vdots & \cdots & \vdots \\
X_{n,1} & \cdots & X_{n,p}
\end{bmatrix}
\qquad
Y = \begin{bmatrix}
C_1 \\ \vdots \\ C_1 \\ C_2 \\ \vdots \\ C_2 \\ \vdots \\ C_{n_c} \\ \vdots \\ C_{n_c}
\end{bmatrix}
$$
To use the NFIS for diagnosis, the output Y is converted to a binary format, which
will be designated by Y∗. To do this, create an n × nc matrix of zeros and then set
the ith column elements to 1 for the symptom observations for fault Ci. Therefore,
Y can be rewritten as Y∗, whose columns correspond to the classes C1, C2, ..., Cnc:

$$
Y^{*} = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
$$

Traditionally, nc − 1 dummy variables are used to fully define nc fault classes. However,
nc dummy variables should be used in this application to allow for partial
memberships to each fault class. To diagnose a fault from an observation of the
symptom patterns, simply stimulate the NFIS with the observed symptom pattern
as an input. The output of the NFIS diagnoser is a vector of nc memberships of the
symptom pattern to each of the fault classes. Finally, diagnose the fault as belonging
to the class to which it has the largest membership.
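The binary encoding and the final decision rule can be sketched as follows. The helper names (`one_hot`, `class_memberships`, `diagnose`) are hypothetical, class labels are zero-based for convenience, and the membership-weighted average over exemplars is again an assumed stand-in for the NFIS inference step.

```python
def one_hot(labels, n_classes):
    """Build Y* as an n x n_classes list of 0/1 rows from zero-based labels."""
    return [[1 if label == j else 0 for j in range(n_classes)] for label in labels]


def class_memberships(dofs, y_star):
    """Weighted average of the Y* rows, using exemplar DOFs as weights."""
    total = sum(dofs)
    n_classes = len(y_star[0])
    return [
        sum(d * row[j] for d, row in zip(dofs, y_star)) / total
        for j in range(n_classes)
    ]


def diagnose(memberships):
    """Index of the class with the largest membership."""
    return max(range(len(memberships)), key=memberships.__getitem__)


y_star = one_hot([0, 1, 2], n_classes=3)   # one exemplar per fault class
dofs = [0.6, 0.0, 0.4]                     # query overlaps classes 0 and 2
memberships = class_memberships(dofs, y_star)
fault = diagnose(memberships)
```

Unlike the integer-coded output, the query keeps zero membership to the middle class and is assigned to the class it resembles most.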

3.4 Prognosis

Vichare and Pecht [28] define prognostics as being “the process of predicting a
future state (of reliability) based on current and historic conditions.” Since the even-
tual goal of any prognostic system is to be able to determine when a component
is going to fail, another appropriate definition of prognostics that will be adopted
for this work is “the process by which the remaining useful life of a component or
system is estimated” [27].
Before examining how the NFIS can be used for RUL estimation, the general
prognostic approach that will be implemented in this work should be examined.
Suppose the degradation of a system or component can be quantified by a single
parameter, which is referred to as a prognostic parameter. This parameter may be
constructed as a function of several measured parameters or residuals. As the system
degrades, the prognostic parameter should increase until a threshold is reached and
a failure occurs [16]. As an example, consider the plot presented in Figure 4. Notice
that as time progresses, the prognostic parameter generally increases until it reaches
the threshold Y*. The threshold may simply be a specified operating level such as
an upper allowed voltage threshold or vibration level. At this point, a failure is said
to have occurred.

[Figure 4: prognostic parameter Y(t) plotted against time, starting at t0 and increasing until it reaches the threshold Y*]

Fig. 4 Example of a prognostic parameter progression until failure [26]

Fig. 5 Example of a parametric regression of the prognostic parameter [26]
To obtain RUL estimates for observations of the degradation parameter, tradi-
tional regression techniques can be used. For example, in Figure 5, a nonlinear
function of the form y = at−b is fit to the observed data.
There are two major problems that must be addressed if this approach is to be
used: (1) a viable prognostic parameter must be identified and (2) the threshold for
failure must be identified. Gross et al. [14] suggest using the alarm frequency since
it “scales monotonically with the degree of severity of the degradation, regardless
of the magnitude or units for the original monitored signals (e.g., temperatures,
voltages).” If the alarm frequency is computed over a local window, in some
situations the parameter will stop increasing prior to failure because it saturates at 1.0
(100% of the window observations). For this reason, the cumulative sum of the
number of fault alarms is more suitable as a prognostic parameter. It is important
to note that, if an appropriate window size can be determined, both methods could
produce equivalent results, but the latter was selected for this work to avoid the
window-size problem. Also, since the NFIS prognosis algorithm is formally based
on the concept of a generic prognostic parameter, either parameter could be used.
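The saturation issue can be seen by computing both candidate parameters for the same alarm sequence. The function names and the toy alarm sequence below are illustrative only.

```python
from itertools import accumulate


def windowed_alarm_frequency(alarms, window):
    """Fraction of alarms in a trailing window ending at each observation."""
    freq = []
    for i in range(len(alarms)):
        recent = alarms[max(0, i - window + 1): i + 1]
        freq.append(sum(recent) / len(recent))
    return freq


def cumulative_alarms(alarms):
    """Running count of fault alarms."""
    return list(accumulate(alarms))


alarms = [0, 1, 0, 1, 1, 1, 1, 1]          # 1 = fault alarm at that observation
frequency = windowed_alarm_frequency(alarms, window=3)
cumulative = cumulative_alarms(alarms)
```

Once every observation in the window is an alarm, the windowed frequency stays at 1.0 and carries no further information, while the cumulative sum keeps increasing.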
For this work, a modified form of the previously described algorithm will be im-
plemented, which does not make use of a prognostic parameter threshold. Rather
than define failure explicitly in terms of the value of a prognostic parameter, failure
will be defined in terms of how long the system has been operating after the onset
of a failure mechanism. Here, onset to failure is defined as the time at which a spec-
ified number of fault alarms have occurred (e.g., 25−100 alarms). For the example
presented in the next sections, onset to failure was defined as the instance where 100
fault alarms have been registered.
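Under this definition, locating the onset to failure amounts to finding the observation at which the alarm count first reaches the chosen number. A minimal sketch, with the hypothetical name `onset_to_failure`:

```python
def onset_to_failure(alarms, n_alarms=100):
    """Index of the observation at which n_alarms fault alarms have occurred.

    Returns None if the alarm count never reaches n_alarms.
    """
    count = 0
    for i, alarm in enumerate(alarms):
        if alarm:
            count += 1
        if count >= n_alarms:
            return i
    return None


# Toy sequence with a small alarm count so the result is easy to follow.
onset = onset_to_failure([0, 1, 1, 0, 1], n_alarms=3)
```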
Now that the general RUL estimation process has been described, the use of
NFIS for prognosis will be examined. Suppose that n histories for the prognostic
parameter of a system have been collected. From these histories a vector of the
time-to-failures after onset can be extracted.

$$
\mathbf{TTF} = \begin{bmatrix} TTF_1 \\ TTF_2 \\ \vdots \\ TTF_n \end{bmatrix}
$$

Furthermore, suppose that regression on each of the histories has been performed.
The bank of equations that relate the time after the onset of the failure (t) to the
prognostic parameter (Y) of the system may be expressed by the following equation,
where θ̂i are the regressed parameters for the ith history and Θ̂ are all of the regressed
parameters.

$$
\mathbf{Y}(t, \hat{\Theta}) = \begin{bmatrix}
Y_1(t, \hat{\theta}_1) \\ Y_2(t, \hat{\theta}_2) \\ \vdots \\ Y_n(t, \hat{\theta}_n)
\end{bmatrix}
$$

For this discussion, suppose that the time after the onset to failure is the number of
observations after a specified number of fault alarms (e.g., 25−100) have occurred.
This method can be easily implemented for time-series data with a constant sample
rate. For example, if N observations after onset to failure are observed, then evalu-
ate Y(t, Θ̂) for t = N to determine what the prognostic parameter value should be
according to the n regressed histories.

$$
\mathbf{Y}(t = N, \hat{\Theta}) = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix}
$$

At this point, the number of observations after the onset-to-failure (N) and cur-
rent prognostic parameter (Y ) have been determined. Additionally the vector of n
estimates for the prognostic parameter Ŷ and a vector of the time-to-failures after
onset TTF are determined. These values are used to build a predictor that maps
the observed prognostic parameter to the RUL of the system. To do this, an NFIS
predictor is created “on the fly” with predicted prognostic parameter values Ŷ as
exemplar inputs and their corresponding RULs as outputs. Here, the RUL of the
ith prognostic parameter estimate is simply its time-to-failure minus the number of
current observations:
$$RUL_i = TTF_i - N$$
Finally, the RUL is estimated by supplying the NFIS predictor with the current prog-
nostic parameter.
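The on-the-fly construction can be sketched as follows. This is an illustration under explicit assumptions: the regressed histories are passed in as plain Python functions of elapsed time, and the NFIS predictor is replaced by a simple triangular similarity weighting with a fixed `half_width`. The function `estimate_rul` and all of its parameters are hypothetical.

```python
def estimate_rul(histories, ttf, n_elapsed, y_now, half_width):
    """Estimate RUL from n regressed histories.

    histories  -- functions Y_i(t) giving the regressed prognostic parameter
    ttf        -- time-to-failure after onset for each history
    n_elapsed  -- observations since onset to failure (N in the text)
    y_now      -- current prognostic parameter value
    half_width -- half-width of the triangular similarity (assumed stand-in
                  for the NFIS membership functions)
    """
    y_hat = [f(n_elapsed) for f in histories]        # "guide posts" at t = N
    rul = [t - n_elapsed for t in ttf]               # RUL_i = TTF_i - N
    weights = [max(0.0, 1.0 - abs(y_now - y) / half_width) for y in y_hat]
    total = sum(weights)
    if total == 0.0:
        return None                                  # query resembles no history
    return sum(w * r for w, r in zip(weights, rul)) / total


# Two toy linear histories; real histories would come from regression.
histories = [lambda t: 2.0 * t, lambda t: 4.0 * t]
ttf = [100, 60]
rul_between = estimate_rul(histories, ttf, n_elapsed=10, y_now=30.0, half_width=20.0)
rul_on_first = estimate_rul(histories, ttf, n_elapsed=10, y_now=20.0, half_width=20.0)
```

A query value midway between the two guide posts yields the average of the two RULs, and a value that coincides with the first guide post returns that history's RUL.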
To review, several degradation histories are collected that have known lifetimes
after the onset of a failure and have specific structures that are characterized by the
“shape” of the prognostic parameter history. If it is desired to calculate the RUL of
another similar system or component, two pieces of information are available: the
elapsed time after onset to failure and the value of the prognostic parameter. Next,
the regressed functions are evaluated for the current elapsed time. This provides
“guide posts” that can be compared to the current prognostic parameter to determine
the “shape” of its progression. In essence, the estimates of the prognostic parameter
are used to determine which degradation history the system most resembles, and since
the times-to-failure for the failure histories are known, the similarities can be related
to the system RUL.

4 Methodology

The data used in the example presented in this section were collected from the
hydraulic steering system of a drill used for deep oil exploration. In the system,
the drill bit rotates and dislodged material is pumped to the surface. For this work
we are interested in the steering system, whose major components are the three hy-
draulic units that are located near the drill bit. To steer the unit, ribs are extended
in their respective directions to “push” the head in the desired direction. To empir-
ically model each hydraulic unit, four sensor measurements were used: the target
hydraulic pressure (calculated by the control system), measured hydraulic pressure,
electrical current to the hydraulic pump motor, and the motor RPM.
For this work, 11 data sets which progress to failure are used. These data sets
represent three different fault conditions:
1. Mud invasion — mud enters the hydraulic units and causes failure (3 data sets)
2. Pressure transducer offset — sensor offset (negative and positive) causes prob-
lems in the control of the system, which eventually results in system failure (2
negative offset and 3 positive offset)
3. Pump startup failure — pump failure shortly after the drill is started (3 data sets)
For each data set the embodiments of the NFIS were used for monitoring (pre-
diction and detection), diagnosis, and prognosis. Traditional algorithms that can be
found in the literature are compared to the embodiments of the NFIS, when applica-
ble. Before the results of this study are examined, the methodologies used in each
analysis step are briefly presented.
To evaluate the effectiveness of the NFIS for monitoring, it was used as a pre-
dictor with the SPRT to detect faults in the 11 data sets discussed earlier. For the
sake of comparison, a comparable system implementing an autoassociative kernel
regression (AAKR) [7,8] predictor was used with the same SPRT test. For this work
the predictors and detectors were trained on the first 8 hours of operational data ex-
tracted from each data set, which was determined to be fault-free based on visual
inspection.
To evaluate the effectiveness of the NFIS for diagnosis, a “bagging” architecture
was used [4]. Notice in Figure 6 that the 4 signal observations with fault alarms
generated by the NFIS monitoring system are used to diagnose the three fault con-
ditions. The final classification is made by fusing the output of the four classifiers
via the mean operator and then identifying the class with the maximum fused mem-
bership. For the sake of comparison, a comparable system implementing a k-nearest
neighbor (kNN) diagnoser was used [6, 9]. The bagged diagnoser architecture was
selected because it was found to significantly outperform diagnosers that use ob-
servations of all of the symptom patterns as inputs. This structure more effectively
uses the symptom patterns and fault alarms since the individual diagnosers examine
symptom patterns for signals with alarms. For example, if there are faults in the
first two signals, then the diagnosis is based on the observed symptom patterns for
these two signals and not the other signals. For this work, the diagnosers were trained
on two mud invasion, two transducer offset (one positive and one negative), and two
pump startup data sets. To test the diagnoser, it was presented with one mud invasion,
two transducer offset (one positive and one negative), and one pump startup
data set. One of the pressure transducer data sets was not used because a fault was
not detectable (Section 5).

[Figure 6: for each of the four signals (target pressure, measured pressure, electric current, motor RPM), a fault alarm (Fault? YES) triggers an NFIS diagnoser; the memberships to the 3 fault classes are combined by MEAN, and MAX selects the fault class]

Fig. 6 Illustration of the bagged diagnoser architecture
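The fusion step of the bagged architecture can be sketched as below. The function name `fuse_diagnoses` and the example membership values are hypothetical; the logic follows the description above (average the memberships of the diagnosers whose signals have alarms, then take the class with the largest fused membership).

```python
def fuse_diagnoses(memberships_by_signal, alarm_by_signal):
    """Fuse per-signal diagnoser outputs with the mean operator.

    Only diagnosers whose signal currently has a fault alarm contribute.
    Returns (class_index, fused_memberships), or (None, None) if no
    signal has an alarm.
    """
    active = [m for m, a in zip(memberships_by_signal, alarm_by_signal) if a]
    if not active:
        return None, None
    n_classes = len(active[0])
    fused = [sum(m[j] for m in active) / len(active) for j in range(n_classes)]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


# Memberships to the 3 fault classes from the 4 per-signal diagnosers.
memberships = [
    [0.8, 0.1, 0.1],   # target pressure
    [0.6, 0.3, 0.1],   # measured pressure
    [0.0, 1.0, 0.0],   # electric current (no alarm, ignored)
    [0.0, 1.0, 0.0],   # motor RPM (no alarm, ignored)
]
alarms = [True, True, False, False]
fault_class, fused = fuse_diagnoses(memberships, alarms)
```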
Finally, to evaluate the effectiveness of the NFIS for prognosis, an NFIS prog-
noser was trained on each of the fault conditions. Here, the prognostic parameter is
the cumulative sum of the fault alarms and onset to failure was defined as being the
observation when 100 fault alarms have been registered. For this work, a prognoser
was trained on two mud invasion, two transducer offset (one positive and one
negative), and two pump startup data sets. To test the prognoser, the RUL was estimated for
the steering system with one mud invasion, two transducer offset (one positive and
one negative), and one pump startup data set. Again, one of the pressure transducer
data sets was not used because a fault was not detectable (Section 5).

5 Results

The results of applying the previously described monitoring, diagnostic, and prog-
nostic systems to the hydraulic steering system are presented in this section.

5.1 Monitoring

The results of the monitoring systems implementing NFIS and AAKR predictors
with an SPRT detector are presented in Table 1. For this work, the warning time is
defined as the length of time between the instance of five sequential alarms and the
time of failure. The instance of five sequential alarms was used as an indicator of
warning time because the occurrence of multiple sequential alarms is more likely
due to an actual fault or anomaly as opposed to spurious alarms. Notice that both
monitoring systems detect faults in 10 of the 11 data sets, which translates to a
detection rate of approximately 91%. The missed detection was determined to be
attributable to insufficient data (approximately 1 hour of data compared to 100 hours
in the largest set). Therefore, if a precursor was present, it did not have enough time
to propagate to a measurable magnitude. Finally, notice that the warning times of
the NFIS and AAKR monitoring systems are comparable, both having values near
20 hours. The AAKR system performance is slightly better than the NFIS system,
in that the warning time is slightly larger. These results indicate that the NFIS is a
viable prediction algorithm for monitoring a system.

Table 1 Monitoring results for the NFIS and AAKR predictors

Predictor   Number Detected   Detection Rate   Warning Time (hrs)
NFIS        10                91%              19.70
AAKR        10                91%              21.97

[Figure 7: four panels for Hydraulic Unit #1 showing predictions and alarms versus time (hrs): target pressure (bar), measured pressure (bar), electric current (A), and motor RPM]

Fig. 7 NFIS fault detection results for the first hydraulic unit of the Mud Invasion #1 data
Before continuing, consider an example in which there are strong indicators for a
fault. For this discussion, consider the first mud invasion data set. The fault detection
results for the first hydraulic unit are displayed in Figure 7. Notice that there is a
series of fault alarms in the motor RPM signal (bottom) and measured pressure
(second down) starting around the 11th hour of operation. Fault alarms are also
present in the target pressure (top) and electric current (third
down) beginning around the 22nd hour of operation.
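The warning time used in Table 1 can be computed from an alarm sequence as sketched below. The function name `warning_time` is hypothetical, and taking "the instance of five sequential alarms" to mean the time stamp of the fifth alarm in the first such run is an assumption of this sketch.

```python
def warning_time(alarms, times, failure_time, run_length=5):
    """Time between the first run of `run_length` sequential alarms and failure.

    Returns None if no such run occurs (a missed detection).
    """
    run = 0
    for alarm, t in zip(alarms, times):
        run = run + 1 if alarm else 0
        if run == run_length:
            return failure_time - t
    return None


# Toy example: alarms at hours 1 through 5, failure at hour 20.
alarms = [0, 1, 1, 1, 1, 1, 0]
times = [0, 1, 2, 3, 4, 5, 6]
warning = warning_time(alarms, times, failure_time=20)
```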

5.2 Diagnosis

The diagnosis results of the NFIS and kNN diagnosers are presented in the confusion
matrices below, Tables 2 and 3 respectively. In the following tables, each row
corresponds to the true fault class of the test data and each column to the class
assigned by the NFIS or kNN diagnoser. For example, the observations classified as
mud invasion (MI) are presented in the first column. The count in the first row is
the number of MI faults that are classified correctly as being MI, the second row is
the number of pressure transducer offset (PTO) faults that are incorrectly classified
as MI faults, and the third row is the number of pump startup (PS) faults that are
incorrectly classified as MI faults. Ideally, only the diagonal elements of the
confusion matrix (CM) should be nonzero, since these elements represent correct
classifications.

Table 2 Confusion matrix for the NFIS diagnosis system

True class   Predicted MI   Predicted PTO   Predicted PS   Class Accuracy (%)
MI           2087           44              128            92.39
PTO          152            7350            117            96.47
PS           211            24              490            67.59
Overall Accuracy (%): 93.62

Table 3 Confusion matrix for the kNN diagnosis system

True class   Predicted MI   Predicted PTO   Predicted PS   Class Accuracy (%)
MI           1616           19              24             97.41
PTO          217            7076            43             96.46
PS           632            325             670            41.18
Overall Accuracy (%): 88.14
Notice that both diagnosers are able to accurately diagnose the 3 fault conditions.
More specifically, the overall accuracy of the NFIS diagnosis system is ∼94%, while
the accuracy of the kNN diagnosis system is ∼88%. For this analysis, the perfor-
mance of the NFIS diagnosis system is slightly better than the kNN system, but
a more important feature of the results is that the NFIS diagnoser performance is
comparable to the traditional kNN diagnoser.
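The class and overall accuracies follow directly from the confusion matrix counts. The short sketch below recomputes them for the NFIS diagnoser using the counts reported in Table 2 (rows are true classes, columns are predicted classes); the function name `accuracies` is illustrative.

```python
def accuracies(cm):
    """Per-class accuracy (row-wise) and overall accuracy, both in percent."""
    per_class = [100.0 * row[i] / sum(row) for i, row in enumerate(cm)]
    correct = sum(cm[i][i] for i in range(len(cm)))
    overall = 100.0 * correct / sum(sum(row) for row in cm)
    return per_class, overall


# Counts from Table 2: rows MI, PTO, PS (true); columns MI, PTO, PS (predicted).
nfis_cm = [
    [2087, 44, 128],
    [152, 7350, 117],
    [211, 24, 490],
]
per_class, overall = accuracies(nfis_cm)
```

Rounded to two decimals, the results reproduce the class accuracies and the overall accuracy listed in Table 2.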

5.3 Prognosis

The results of using the NFIS for RUL estimation are presented in Table 4. Again,
MI refers to mud invasion, PTO refers to pressure transducer offset, and PS refers to
pump startup. Here, OTF refers to onset to failure or the time when 100 fault alarms
have been registered. The mean lifetime after OTF is included to aid in interpreting
the scale in the RUL estimate errors, i.e., the mean absolute error (MAE) should be
small relative to the lifetime after OTF.
It can be seen that for the MI and PTO data sets, we are able to estimate the RUL
with a high degree of accuracy, in that the MAE is less than an hour. Next, notice
that the RUL estimates for the PTO and PS data sets are progressively less accurate
than the estimates for the MI data. This result is expected since we are estimating
the RUL by performing a regression with two data points (two training histories for
each of MI, PTO, and PS). As additional data is integrated into the described system, the
performance should improve considerably.
Table 4 RUL estimation results for the NFIS prognoser

Fault   Mean Lifetime after OTF (hrs)   Mean RUL Estimate after OTF (hrs)   MAE (hrs)   MAE (%)
MI      3.36                            4.00                                0.64        19.05
PTO     2.81                            1.94                                0.87        30.96
PS      9.82                            14.19                               4.37        44.45
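
As a quick check of how the last column relates to the others, assume that the percentage error is the MAE divided by the mean lifetime after OTF, which is how the text above suggests it should be read. For the MI and PTO rows this gives

$$
\mathrm{MAE}(\%) = 100 \times \frac{\mathrm{MAE\ (hrs)}}{\text{mean lifetime after OTF (hrs)}}, \qquad
100 \times \frac{0.64}{3.36} \approx 19.05, \qquad
100 \times \frac{0.87}{2.81} \approx 30.96 .
$$

The same calculation on the rounded PS entries gives about 44.5, close to the tabulated 44.45; the small difference is presumably due to rounding of the tabulated values.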

6 Conclusions

This paper has described an intelligent control and maintenance system that includes
modules for prediction, detection, diagnosis, prognosis, and evaluation. This pa-
per has also addressed a major hurdle in the development of such a system for a
“real world” process or system by developing an integrated monitoring (prediction
and detection), diagnostic, and prognostic system by adapting the newly developed
nonparametric fuzzy inference system (NFIS) for each task. To validate the pro-
posed methodologies, the embodiments of the NFIS were used to detect, diagnose,
and prognose faults in the hydraulic steering system of an automated oil drill. The
embodiments of the NFIS were found to have similar performance to traditional
algorithms, such as autoassociative kernel regression (AAKR) and k-nearest neighbor
(kNN), for monitoring and diagnosis. The NFIS prognoser was also shown to
be able to estimate the remaining useful life (RUL) of a steering system to within
an hour of its actual time of failure for the mud invasion and pressure transducer
offset faults. In closing, it is important to note that the results
presented in this paper are founded on a very limited amount of data, namely
11 failure data sets for 3 fault conditions. While the results presented here are
promising, models developed with more data are expected to outperform the current
system.

References

1. H. Adeli and S.L. Hung. Machine Learning, Neural Networks, Genetic Algorithms, and Fuzzy
Logic. Wiley, New York, 1995
2. M. Basseville and I.V. Nikiforov. Detection of Abrupt Changes: Theory and Application.
Prentice-Hall, Englewood Cliffs, NJ, 1993
3. J.C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press,
New York: 1981
4. C.L. Black, R.E. Uhrig and J.W. Hines. Inferential Neural Networks for Nuclear Power Plant
Sensor Channel Drift Monitoring. Proceedings of the ANS Topical Meeting on NPIC &
HMIT: 1996
5. J.L. Bogdanoff and F. Kozin. Probabilistic Models of Cumulative Damage. Wiley, New York,
1985
6. T.M. Cover and P.E. Hart. Nearest Neighbor Pattern Classification. IEEE Transactions on
Information Theory, Vol. 13, No. 1: January 1967
7. I. Diaz. Deteccion E Identificacion De Fallos En Procesos Industriales Mediante Tecnicas De
Procesamiento Digital De Senal Y Redes Neuronales: Aplicacion Al Mantenimiento Predictivo
De Accionamientos Electricos. Ph.D. Dissertation, Universidad De Oviedo, Departamento
de Ingenieria Electrica, Electronica, De Computadores Y Sistemas: July 2000
8. I. Diaz, A.B. Diez and A.A. Cuadrado Vega. Complex Process Visualization Through Continu-
ous Feature Maps Using Radial Basis Functions. Proceedings of the International Conference
on Artificial Neural Networks, Vienna, Austria: August 21–25, 2001
9. M. Dong, D.K. Xu, M.H. Li and X. Yan. Fault Diagnosis Model for Power Transformer Based
on Statistical Learning Theory and Dissolved Gas Analysis. Proceedings of the IEEE Inter-
national Symposium on Electrical Insulation, pp.85–88, Indianapolis, IN: September 19–22,
2004
10. J.C. Dunn. A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-
Separated Clusters. Journal of Cybernetics, Vol. 3: 1973
11. E.A. Elsayed. Reliability Engineering. Addison Wesley, 1996
12. D.R. Garvey. An Integrated Fuzzy Inference Based Monitoring, Diagnostic, and Prognos-
tic System. Ph.D. Dissertation, Nuclear Engineering Department, University of Tennessee,
Knoxville: May 2006
13. A.J. Germond and D. Niebur. Survey of Knowledge-Based Systems in Power Systems: Europe.
Proceedings of the IEEE, Vol. 80, No. 5: May 1992
14. K.C. Gross, V. Bhardwaj and R.L. Bickford. Proactive Detection of Software Aging Mech-
anisms in Performance-Critical Computers. Proc. 27th Annual IEEE/NASA Software Engi-
neering Symposium, Greenbelt, MD: December 4–6, 2006
15. K.C. Gross, K.A. Whisnant and A.M. Urmanov. Electronic Prognostics Through Continuous
System Telemetry. Proceedings of the 60th Meeting of the MFPT Society, Virginia Beach, VA,
pp. 56–62: April 3–6, 2006
16. A. Hess, G. Calvello, P. Frith, S. Engle and D. Hoitsma. More Challenges, Issues, and Lessons
Learned Chasing Real Prognostic Capabilities. Proceedings of the 60th Meeting of the MFPT
Society, Virginia Beach, VA, pp. 437–464: April 3–6, 2006
17. J.W. Hines and D. Garvey. Traditional and Robust Vector Selection Methods for Use with
Similarity Based Models. 5th International Topical Meeting on Nuclear Plant Instrumentation,
Control and Human-Machine Interface Technologies, Albuquerque, NM: November 12–14,
2006
18. R. Isermann. Process Fault Detection Based on Modeling and Estimation Methods – A Survey.
Automatica, Vol. 20, No. 4, pp. 387–404: 1984
19. R. Isermann. Model Based Fault Detection and Diagnosis Methods. Proceedings of the Amer-
ican Control Conference, pp. 1605–1609, Seattle, WA: 1995
20. R. Isermann. Model-Based Fault Detection and Diagnosis – Status and Applications. Pro-
ceedings of the 16th International Federation of Automatic Control (IFAC) Symposium on
Automatic Control in Aerospace, St. Petersburg, Russia: June 14–18, 2004
21. J.S. Jang. ANFIS: Adaptive-Network-Based Fuzzy Inference Systems. IEEE Transactions on
Systems, Man, and Cybernetics, Vol. 23, No. 3, pp. 665–685: 1993
22. J.S. Jang, C.T. Sun and E. Mizutani. Neuro-Fuzzy and Soft Computing. Prentice-Hall, Upper
Saddle River, NJ: 1997
23. C.J. Lu and W.Q. Meeker. Using Degradation Measures to Estimate a Time-to-Failure Distri-
bution. Technometrics, Vol. 35, 2, pp.161–174, 1993
24. W.Q. Meeker and L.A. Escobar. Statistical Methods for Reliability Data. Wiley, 1998.
25. V.M. Morgenstern, B.R. Upadhyaya and M. Benedetti. Signal Anomaly Detection Using Modified
CUSUM Method. Proceedings of the 27th Conference on Decision and Control, Austin,
TX: December 1988
26. A. Urmanov and J.W. Hines. Electronic Prognostics, Short Course on Fault Diagno-
sis/Prognosis for Engineering Systems. Georgia Tech, Atlanta, GA: May 15–18, 2006.
27. A. Usynin. Model-Fitting Approaches to Reliability Assessment and Prognostic Problems.
Journal of Pattern Recognition Research, Vol. 1, pp.32–36., 2006
28. N.M. Vichare and M.G. Pecht. Prognostics and Health Management of Electronics. IEEE
Transactions on Components and Packaging Technologies, Vol. 29, No. 1, pp. 222–229: March 2006
29. A. Wald. Sequential Analysis. Wiley, New York, 1947
30. K. Whisnant, K. Gross and N. Lingurovska. Proactive Fault Monitoring in Enterprise Servers.
International Conference on Computer Design (CDES’05), Las Vegas, NV: June 27–30, 2005
31. W. Yan, C.J. Li and K.F. Goebel. A Multiple Classifier System for Aircraft Engine Fault Di-
agnosis. Proceedings of the 50th Meeting of the Machinery Failure Prevention Technology
(MFPT) Society, pp. 271–279, Virginia Beach, VA: April 3–6, 2006
Stable Anti-Swing Control for an Overhead
Crane with Velocity Estimation and Fuzzy
Compensation

Wen Yu, Xiaoou Li, and George W. Irwin

Abstract This chapter proposes a novel anti-swing control strategy for an overhead
crane. The controller includes both position regulation and anti-swing control. Since
the crane model is not exactly known, fuzzy rules are used to compensate for friction,
gravity and the coupling between position and anti-swing control. A high-gain
observer is introduced to estimate the joint velocities so that PD control can be realized.
Using a Lyapunov method and an input-to-state stability technique, the controller is
proven to be robustly stable with bounded uncertainties, provided that the membership
functions are updated by certain learning rules and the observer is fast enough. Real-time
experiments are presented comparing this new stable anti-swing PD control strategy
with regular crane controllers.

Keywords: Lyapunov stability; PD controller; Motion control

1 Introduction

Although cranes are very important systems for handling heavy goods, automatic
cranes are comparatively rare in industrial practice [24] because of high investment
costs. The need for faster cargo handling requires control of the crane motion so that
its dynamic performance is optimized. Specifically, the control of overhead crane
systems aims to achieve both position regulation and anti-swing control. Several
authors have looked at this problem. In [3], time-optimal control was considered using
boundary conditions, an idea which was further developed in [2] and [25]. Unfortunately,
to increase robustness, some time-optimization requirements, such as zero angular
velocity at the target point [21], have to be given up. Gain scheduling has been
proposed as a practicable method [6] to increase tracking accuracy, while observer-based
feedback control was presented in [24].

Wen Yu
Department of Automatic Control
Xiaoou Li
Department of Computer Science
Center for Research and Advanced Studies of the National Polytechnic Institute
(CINVESTAV-IPN)
A.P. 14-740, Av.IPN 2508, México D.F., 07360, México, e-mail: [email protected]
George W. Irwin
School of Electronics, Electrical Engineering and Computer Science
Queen’s University Belfast, Ashby Building, Stranmillis Road, Belfast, BT9 5AH, UK
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 223–240.
© 2008 Springer.

Many attempts, such as planar operation [6] and assuming the absence of fric-
tion [21], have been made to introduce simplified models for application of model-
based control [24]. Thus, a self-tuning controller with a multilayer perceptron model
for an overhead crane system was proposed [19] while in reference [5], the con-
troller consists of a combined position servo control and a fuzzy-logic anti-swing
controller.
Classical proportional and derivative (PD) control has the advantage of not
requiring an overhead crane model but because of friction, gravitational forces and
the other uncertainties, it cannot guarantee a zero steady-state error. While PID con-
trol can remove this error, it lacks global asymptotic stability [14]. Several efforts
have therefore been made to improve the performance of PD controllers. Global
asymptotically stable PD control was realized by pulsing gravity compensation in
[27] while in [15], a PD controller for a vertical crane-winch system was developed,
which only requires the measurement of angles and their derivatives rather than a
cable angle measurement. In [9], a passivity-based controller was combined with a
PD control law. Here, asymptotic regulation of the gantry and payload position was
proven, but unfortunately both controllers again require a crane model to compen-
sate for the uncertainties.
There are two main weaknesses in applying PD control to this application:
(a) The PD controller requires suitable sensors to provide measurements of both
position and velocity. Position can be obtained very accurately by means of an
encoder, while velocity is usually measured by a tachometer, which can be expen-
sive and is often contaminated by noise [12]; (b) Due to the existence of friction
and gravitational forces, the steady-state error is not guaranteed to be zero [13].
It is therefore important to be able to realize PD control using only position mea-
surement. One possible approach is to use a velocity observer, which can be either
model-based or model-free. Model-based observers assume that the dynamics of
the overhead crane are either completely or partially known. For example, the vari-
able structure observer in [7] needed information about the inertia matrix to cal-
culate the sliding mode gain. In contrast model-free observers do not require such
exact knowledge about the overhead cranes. The most popular model-free observers
are high-gain ones which can estimate the derivative of the output [22]. Recently, an
observer was presented in reference [12], where the non-linearity of the manipulator
was estimated by a static neural network.
In this chapter, a new modified algorithm is proposed which overcomes both
these limitations of PD control at the same time. Firstly, a high-gain observer
which can achieve stability is added to regular PD control. A fuzzy system is
then used to estimate both friction and gravity. Unlike other work which used
the singular perturbation method [22], a new proof of stability is presented using
Lyapunov analysis. This proof explains the relation between the observer error and
the observer gain.
Since the swing of the payload depends on the acceleration of the trolley, mini-
mizing both the operation time and the payload swing produces partially conflicting
requirements. The anti-swing control problem involves reducing the swing of the
payload while moving it to the desired position as fast as possible [1]. One particu-
lar feedforward approach is input shaping [26], which is an especially practical and
effective method of reducing vibrations in flexible systems. In [20] the anti-swing
motion-planning problem is solved using the kinematic model in [17]. Here, anti-
swing control for a three-dimensional overhead crane is proposed, which addresses
the suppression of load swing. Non-linear anti-swing control based on the singu-
lar perturbation method is presented in [30]. Unfortunately, all of these anti-swing
controllers are model-based.
In this chapter, a PID law is used for anti-swing control which, being model-free,
will affect the position control. The same fuzzy compensator used for friction and
gravity is applied to handle the position error. The required online learning rule is
obtained from the tracking error analysis and there is no requirement for off-line
learning. The overall closed-loop system with the high-gain observer and the fuzzy
compensator is shown to be stable if the membership functions have certain learning
rules and the observer is fast enough. Finally, results from experimental tests carried
out to validate the controller are presented.

2 Preliminaries

The overhead crane system described schematically in Figure 1 (a) has the system
structure shown in Figure 1 (b). Here α is the payload angle with respect to the
vertical and β is the payload projection angle along the X-coordinate axis. The
dynamics of the overhead crane are given by [28]:

$$M(x)\ddot{x} + C(x,\dot{x})\dot{x} + G(x) + F = \tau \tag{1}$$

where $x = [x_w, y_w, \alpha, \beta, R]^T$, $(x_w, y_w, R)$ is the position of the payload,
$\tau = [F_x, F_y, 0, 0, F_R]^T$, $F_x$, $F_y$ and $F_R$ represent the control forces acting on the
cart and rail and along the lift-line, $F = [\mu_x, \mu_y, 0, 0, \mu_R]^T \dot{x}$, $\mu_x$, $\mu_y$ and $\mu_R$ are
friction factors, $G(x)$ is the gravitational force, $C(x,\dot{x})$ is the Coriolis matrix and
$M(x)$ is the dynamic matrix of the crane.
In (1), there are some differences from other crane models in the literature. The
length of the lift-line is not considered in [9], so the dimension of M is 4 × 4, while
in [20], which also addresses anti-swing control and position control, the dimension
of M is 3 × 3. In [16], the dimension of M is 5 × 5 as in this chapter. However,
some uncertainties such as friction and anti-swing control coupling are not included.
This overhead crane system shares one important property with robot systems: the
matrix $\dot{M}(x) - 2C(x,\dot{x})$ is skew-symmetric, i.e., it satisfies the following
relationship [9]

$$x^T\left[\dot{M}(x) - 2C(x,\dot{x})\right]x = 0 \tag{2}$$

Fig. 1 Overhead crane: (a) schematic of the 3D crane; (b) system structure

A normal PD control law has the following form

$$\tau = -K_p(x - x^d) - K_d(\dot{x} - \dot{x}^d)$$

where $K_p$ and $K_d$ are positive definite, symmetric and constant matrices, which
correspond to the proportional and derivative coefficients, $x^d \in \Re^5$ is the desired
position, and $\dot{x}^d \in \Re^5$ is the desired joint velocity. Here the regulation problem is
discussed, so $\dot{x}^d = 0$.
Input-to-state stability (ISS) is another elegant approach for stability analysis be-
sides the Lyapunov method. It can lead to general conclusions on stability using
the input and state characteristics. Thus, consider a class of non-linear systems de-
scribed by
ẋt = f (xt , ut ) (3)
where xt ∈ ℜn is the state vector, ut ∈ ℜm is the input vector, yt ∈ ℜm is the output
vector. f : ℜn × ℜm → ℜn is locally Lipschitz. Some passivity properties, as well
as some stability properties of passive systems are now recalled [4].

Definition 1. A system (3) is said to be globally input-to-state stable if there exists
a K-function $\gamma(s)$ (continuous and strictly increasing, with $\gamma(0) = 0$) and a KL-function
$\beta(s,t)$ (a K-function in $s$ and, for each fixed $s_0 \ge 0$, $\lim_{t\to\infty}\beta(s_0,t) = 0$), such that, for each
$u_t \in L_\infty$ ($\|u(t)\|_\infty < \infty$) and each initial state $x^0 \in \Re^n$, the following holds

$$\|x(t, x^0, u_t)\| \le \beta(\|x^0\|, t) + \gamma(\|u_t\|_\infty)$$

for each $t \ge 0$.
This definition implies that if a system has input-to-state stability, its behaviour
should remain bounded when its inputs are bounded.
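
As a small numerical illustration of Definition 1 (not taken from the chapter), consider the scalar toy system ẋ = −x + u, which is input-to-state stable with β(s, t) = s e^(−t) and γ(s) = s. The sketch below integrates it with the explicit Euler method and checks the ISS bound along the trajectory. The system, the input signal sin(3t), the initial state and the step size are all assumptions chosen for illustration.

```python
import math

def simulate_iss(x0=5.0, dt=1e-3, t_end=10.0):
    """Euler simulation of the toy system dx/dt = -x + u with a bounded input."""
    x, t = x0, 0.0
    u_sup = 1.0                                    # sup-norm of u(t) = sin(3 t)
    worst_margin = float("inf")
    while t < t_end:
        # ISS bound: beta(|x0|, t) + gamma(|u|_inf)
        bound = abs(x0) * math.exp(-t) + u_sup
        worst_margin = min(worst_margin, bound - abs(x))
        x += dt * (-x + math.sin(3.0 * t))         # explicit Euler step
        t += dt
    return x, worst_margin

x_final, margin = simulate_iss()
print(f"final state {x_final:.3f}, smallest bound margin {margin:.3f}")
```

A non-negative margin means the simulated state never leaves the envelope given by the decaying initial-condition term plus the input gain term, which is the behaviour the definition describes.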

3 Anti-Swing Control for the Overhead Crane

The control problem is to move the rail in such a way that the actual position of
the payload reaches the desired one. The three control inputs [Fx , Fy , FR ] can force
the crane to the position [xw , yw , R] , but the swing angles [α , β ] cannot be controlled
using the dynamic model (1) directly. In order to design an anti-swing control, lin-
earization models for [α , β ] are analyzed. Because the acceleration of the crane
is much smaller than the gravitational acceleration, the rope length is kept slowly
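varying and the swing is not big, giving

$$|\ddot{x}_w| \ll g,\quad |\ddot{y}_w| \ll g,\quad |\ddot{R}| \ll g$$
$$|\dot{R}| \ll R,\quad |\dot{\alpha}| \ll 1,\quad |\dot{\beta}| \ll 1$$
$$s_1 = \sin\alpha \approx \alpha,\quad c_1 = \cos\alpha \approx 1$$

The approximated dynamics of [α, β] are then

$$\ddot{\alpha} + \ddot{x}_w + g\alpha = 0,\quad \ddot{\beta} + \ddot{y}_w + g\beta = 0$$

Since $\ddot{x}_w = F_x/M_r$ and $\ddot{y}_w = F_y/M_m$, the dynamics of the swing angles are

$$\ddot{\alpha} + g\alpha = -\frac{F_x}{M_r},\quad \ddot{\beta} + g\beta = -\frac{F_y}{M_m} \tag{4}$$

The control forces $F_x$ and $F_y$ are assumed to have the following form

$$F_x = A_1(x_w,\dot{x}_w) + A_2(\alpha,\dot{\alpha}),\quad F_y = B_1(y_w,\dot{y}_w) + B_2(\beta,\dot{\beta}) \tag{5}$$

where $A_1(x_w,\dot{x}_w)$ and $B_1(y_w,\dot{y}_w)$ are position controllers, and $A_2(\alpha,\dot{\alpha})$ and $B_2(\beta,\dot{\beta})$
are anti-swing controllers. Substituting (5) into (4) produces the anti-swing control
model

$$\ddot{\alpha} + g\alpha + \frac{A_1}{M_r} = -\frac{A_2}{M_r},\quad \ddot{\beta} + g\beta + \frac{B_1}{M_m} = -\frac{B_2}{M_m} \tag{6}$$

Now if $A_1/M_r$ and $B_1/M_m$ are regarded as disturbances, and $A_2/M_r$ and $B_2/M_m$ as control
inputs, then (6) is a second-order linear system with disturbances. Standard PID control
can now be applied to regulate α and β, thereby producing the anti-swing controllers

$$\begin{aligned}
A_2(\alpha,\dot{\alpha}) &= k_{pa2}\,\alpha + k_{da2}\,\dot{\alpha} + k_{ia2}\int_0^t \alpha\,d\tau \\
B_2(\beta,\dot{\beta}) &= k_{pb2}\,\beta + k_{db2}\,\dot{\beta} + k_{ib2}\int_0^t \beta\,d\tau
\end{aligned} \tag{7}$$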

where k pa2 , kda2 and kia2 are positive constants corresponding to proportional, deriv-
ative and integral gains.
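
The effect of the PID law (7) on the linearized swing model (6) can be sketched numerically. The code below integrates only the α channel, treats the position control A1 as a short disturbance pulse, and compares the residual swing with and without the anti-swing term A2. The gains and the mass Mr are the values reported in the experimental section. The initial angle, the shape of the disturbance, g = 9.81 and the integration scheme are assumptions made for this illustration, not part of the chapter.

```python
def simulate_swing(use_pid, kp=2.5, kd=18.0, ki=0.01, Mr=6.5, g=9.81,
                   dt=1e-3, t_end=20.0):
    """Integrate the alpha channel of model (6).

    Returns the peak |alpha| observed over the last 5 s of the run.
    """
    alpha, alpha_dot, integral = 0.2, 0.0, 0.0     # assumed initial swing of 0.2 rad
    peak_tail = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        A1 = 1.0 if t < 1.0 else 0.0               # assumed position-control pulse (disturbance)
        if use_pid:
            A2 = kp * alpha + kd * alpha_dot + ki * integral   # PID law (7)
        else:
            A2 = 0.0
        alpha_ddot = -g * alpha - (A1 + A2) / Mr   # model (6) solved for the acceleration
        # semi-implicit Euler: update the velocity first, then the angle
        alpha_dot += dt * alpha_ddot
        alpha += dt * alpha_dot
        integral += dt * alpha
        if t >= t_end - 5.0:
            peak_tail = max(peak_tail, abs(alpha))
    return peak_tail

swing_with_pid = simulate_swing(True)
swing_without_pid = simulate_swing(False)
print(f"residual swing with PID: {swing_with_pid:.2e} rad, without: {swing_without_pid:.2e} rad")
```

In this idealized model the uncontrolled swing is undamped and persists, while the derivative gain in (7) adds damping and the swing dies out.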
Substituting (5) into (1), produces the position control model

M (x) ẍ +V (x, ẋ) ẋ + G (x) + T ẋ + D = u1 (8)

where $D = [A_2, B_2, 0, 0, 0]^T$ and $u_1 = [A_1, B_1, 0, 0, F_R]^T$. Using this model, a position
controller will be designed in Section 4.

4 Position Control with Fuzzy Compensation

A generic fuzzy model for friction and gravity is provided by a collection of l fuzzy
rules (Mamdani fuzzy model [18])

$R_i$: IF ($x_w$ is $A_{1i}$) and ($y_w$ is $A_{2i}$) and ($\alpha$ is $A_{3i}$) and ($\beta$ is $A_{4i}$) and ($R$ is $A_{5i}$)
THEN $f_x$ is $B_{1i}$ and $f_y$ is $B_{2i}$ and $f_z$ is $B_{3i}$ (9)

Here $f_x$, $f_y$ and $f_z$ are the uncertainties (friction, gravity and coupling errors) along
the X, Y and Z coordinate axes, and $i = 1, 2, \dots, l$. A total of $l$ fuzzy IF-THEN rules are
used to perform the mapping from the input vector $x = [x_w, y_w, \alpha, \beta, R]^T \in \Re^5$ to the
output vector $y = [f_x, f_y, f_z]^T = [y_1, y_2, y_3]^T \in \Re^3$. Here $A_{1i}, \dots, A_{ni}$ and
$B_{1i}, \dots, B_{mi}$ are standard fuzzy sets. In this chapter, some on-line learning algorithms
are introduced for the membership functions $B_{ji}$ such that the PD controller is stable.
By using product inference, centre-average defuzzification and a singleton fuzzifier,
the pth output of the fuzzy logic system can be expressed as [29]

$$y_p = \frac{\sum_{i=1}^{l} w_{pi} \prod_{j=1}^{5} \mu_{A_{ji}}}{\sum_{i=1}^{l} \prod_{j=1}^{5} \mu_{A_{ji}}} = \sum_{i=1}^{l} w_{pi}\,\phi_i \tag{10}$$

where $p = 1, 2, 3$, $\mu_{A_{ji}}$ is the membership function of the fuzzy set $A_{ji}$, and $w_{pi}$ is
the point at which $\mu_{B_{pi}} = 1$. Defining, with $n = 5$ inputs,

$$\phi_i = \frac{\prod_{j=1}^{n} \mu_{A_{ji}}}{\sum_{i=1}^{l} \prod_{j=1}^{n} \mu_{A_{ji}}} \tag{11}$$

then (10) can be expressed in matrix form

$$y = \hat{W}_t \Phi(x) \tag{12}$$

where the parameter matrix is

$$\hat{W}_t = \begin{bmatrix} w_{11} & \cdots & w_{1l} \\ w_{21} & \cdots & w_{2l} \\ w_{31} & \cdots & w_{3l} \end{bmatrix} \in \Re^{3\times l}$$

and the data vector is $\Phi(x) = [\phi_1 \cdots \phi_l]^T \in \Re^{l\times 1}$. The position controllers have
a PD form with a fuzzy compensator

$$u_1 = [A_1(x_w,\dot{x}_w), B_1(y_w,\dot{y}_w), 0, 0, F_R]^T = -K_{p1}(x - x^d) - K_{d1}(\dot{x} - \dot{x}^d) + \hat{W}_t\Phi(x) \tag{13}$$

where $x = [x_w, y_w, \alpha, \beta, R]^T$, $x^d = [x_w^d, y_w^d, 0, 0, R^d]^T$, and $x_w^d$, $y_w^d$ and $R^d$ are the
desired positions. In the regulation case $\dot{x}_w^d = \dot{y}_w^d = \dot{R}^d = 0$. Further,
$K_{p1} = \mathrm{diag}[k_{pa1}, k_{pb1}, 0, 0, k_{pr}]$ and $K_{d1} = \mathrm{diag}[k_{da1}, k_{db1}, 0, 0, k_{dr}]$. The time-varying
weight matrix $\hat{W}_t$ is determined by the fuzzy learning law. According to the
Stone–Weierstrass theorem [8], a general non-linear smooth function can be written as

f (x) = W ∗ Φ(x) + µ (t) (14)

where W ∗ is optimal weight matrix, and µ (t) is the modeling error. In this chap-
ter we use the fuzzy compensator (12) to approximate the unknown non-linearity
(gravity, friction and coupling of anti-swing control) as

Ŵt Φ(x) = G (x) + T ẋ + D + µ (t) (15)
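
The fuzzy system in (10) to (12) amounts to a normalised product of membership values followed by a weighted sum. The sketch below shows one way to compute Φ(x) and y = ŴΦ(x), assuming Gaussian membership functions of the kind used later in the experiments (exponent divided by 100, centres drawn from the interval (0, 1)). The function names, the random weights and the sample input are hypothetical and serve only as an illustration.

```python
import math
import random

def fuzzy_basis(x, centres, width=100.0):
    """Normalised firing strengths phi_i of (11) for Gaussian membership functions.

    centres[i][j] is the centre m_ji of the j-th input set in rule i.
    """
    strengths = []
    for rule_centres in centres:
        mu = 1.0
        for xj, mji in zip(x, rule_centres):
            mu *= math.exp(-(xj - mji) ** 2 / width)   # product inference over the inputs
        strengths.append(mu)
    total = sum(strengths)
    return [s / total for s in strengths]              # centre-average normalisation

def fuzzy_output(W, phi):
    """y = W Phi(x) as in (12); each row of W holds the singleton centres w_pi of one output."""
    return [sum(w * p for w, p in zip(row, phi)) for row in W]

random.seed(0)
n_rules, n_inputs, n_outputs = 20, 5, 3
centres = [[random.random() for _ in range(n_inputs)] for _ in range(n_rules)]
W = [[random.uniform(-1.0, 1.0) for _ in range(n_rules)] for _ in range(n_outputs)]
x = [0.3, 0.5, 0.05, -0.02, 0.4]       # [x_w, y_w, alpha, beta, R], arbitrary sample values
phi = fuzzy_basis(x, centres)
y = fuzzy_output(W, phi)
print(len(phi), [round(v, 4) for v in y])
```

Because the entries of Φ(x) are positive and sum to one, each output is a convex combination of the corresponding row of weights, which is why only the weights need to be adapted on-line.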

When the velocity ẋ is not available, a velocity observer is needed. Section 5
describes how to incorporate a model-free observer into PD control for the overhead
crane.

5 PD Control with a Velocity Observer

The overhead crane dynamics (1) can be rewritten in state-space form as [22]

ẋ1 = x2
ẋ2 = H1 (X, u) (16)
y = x1

where $x_1 = x = [x_w, y_w, \alpha, \beta, R]^T$ is the position vector, $x_2$ is the velocity vector,
$X = [x_1^T, x_2^T]^T$, and $u = \tau$ is the control input. The output is a position measurement, and

$$H_1(X,u) = -M(x_1)^{-1}\left[C(x_1,x_2)\dot{x}_1 + G(x_1) + F\dot{x}_1 + u\right] \tag{17}$$

If the velocity vector $x_2$ is not measurable and the dynamics of the crane are
unknown, a high-gain observer can be used to estimate $x_2$ [22]
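
$$\begin{aligned}
\frac{d}{dt}\hat{x}_1 &= \hat{x}_2 + \frac{1}{\varepsilon}K_1(x_1 - \hat{x}_1) \\
\frac{d}{dt}\hat{x}_2 &= \frac{1}{\varepsilon^2}K_2(x_1 - \hat{x}_1)
\end{aligned} \tag{18}$$

where $\hat{x}_1 \in \Re^5$ and $\hat{x}_2 \in \Re^5$ denote the estimated values of $x_1$ and $x_2$ respectively, $\varepsilon$ is a
small positive parameter, and $K_1$ and $K_2$ are positive definite matrices chosen such
that the matrix $\begin{bmatrix} -K_1 & I \\ -K_2 & 0 \end{bmatrix}$ is stable. Defining the observer error as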

x̃ = x − x̂, z̃1 = x̃1 , z̃2 = ε x̃2 (19)

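where $\hat{x} = [\hat{x}_1^T, \hat{x}_2^T]^T$, the observer error equation can then be formed from (16) and
(18)

$$\begin{aligned}
\varepsilon\frac{d}{dt}\tilde{z}_1 &= \tilde{z}_2 - K_1\tilde{z}_1 \\
\varepsilon\frac{d}{dt}\tilde{z}_2 &= -K_2\tilde{z}_1 + \varepsilon^2 H_1
\end{aligned} \tag{20}$$

or in the matrix form:

$$\varepsilon\frac{d}{dt}\tilde{z} = A\tilde{z} + \varepsilon^2 B H_1 \tag{21}$$

where $A = \begin{bmatrix} -K_1 & I \\ -K_2 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 \\ I \end{bmatrix}$. The structure of the velocity observer is the same
as in [22], but a new theorem is proposed here in order to integrate the observer and
the fuzzy compensator.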

Theorem 1. If the high-gain observer (18) is used to estimate the velocity of the
overhead crane (16), the observer error $\tilde{x}$ will converge to the following residual set

$$D_\varepsilon = \{\tilde{x} \mid \|\tilde{x}\| \le \bar{K}(\varepsilon)\}$$

where $\bar{K}(\varepsilon) = 2\varepsilon^2 \sup_{t\in[0,T]} \|(BH_1)^T P\|$ and $P$ is the solution of the Lyapunov equation:

$$A^T P + PA = -I \tag{22}$$

See appendix for the proof of Theorem 1.
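
As a worked illustration of the Lyapunov equation (22), which is not part of the original text, assume a single axis with the scalar gains K1 = 2 and K2 = 1. These values are hypothetical and were chosen so that both eigenvalues of A equal −1. Solving (22) by hand then gives:

```latex
A = \begin{bmatrix} -2 & 1 \\ -1 & 0 \end{bmatrix}, \qquad
\det(\lambda I - A) = \lambda^2 + 2\lambda + 1 = (\lambda + 1)^2

% With P = [a b; b c], the equation A^T P + P A = -I reads
\begin{bmatrix} -4a - 2b & a - 2b - c \\ a - 2b - c & 2b \end{bmatrix}
= \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}
\;\Longrightarrow\; b = -\tfrac{1}{2},\quad a = \tfrac{1}{2},\quad c = \tfrac{3}{2}

P = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{3}{2} \end{bmatrix}, \qquad
\lambda(P) = 1 \pm \tfrac{\sqrt{2}}{2} > 0
```

Under these assumed gains P is positive definite with spectral norm 1 + √2/2 ≈ 1.71, so the radius of the residual set is governed by ε² and the bound on H1.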


Reference [22] gave the proof of stability under the assumption of $\varepsilon \to 0$. Here $\varepsilon$
can be any positive constant. Since $\sup_{t\in[0,T]} \|(BH_1)^T P\|$ is bounded, $\varepsilon$ can be selected
arbitrarily small to make $\bar{K}(\varepsilon)$ small enough. Hence the observer error $\tilde{x}$ becomes
arbitrarily small as $\varepsilon \to 0$. However, a large observer gain $1/\varepsilon$ will amplify the observer
noise, so $\varepsilon$ should be selected to be as large as possible while the observer accuracy
$\bar{K}(\varepsilon)$ remains within tolerance.
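
The trade-off governed by ε can be seen in a scalar simulation of the observer (18). The sketch below feeds the observer the position signal x1 = sin t and compares the velocity estimate with the true velocity cos t for two values of ε. The test signal, the scalar gains K1 = 2 and K2 = 1, the explicit Euler discretisation and the step size are assumptions for illustration. Measurement noise is not modelled, so only the accuracy side of the trade-off appears.

```python
import math

def observer_error(eps, K1=2.0, K2=1.0, dt=1e-4, t_end=5.0):
    """Run the scalar high-gain observer (18) on x1 = sin(t).

    Returns the peak velocity-estimation error over the final second.
    """
    x1_hat, x2_hat = 0.0, 0.0
    worst = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        x1 = math.sin(t)                 # measured position
        x2 = math.cos(t)                 # true velocity, used only for evaluation
        if t >= t_end - 1.0:
            worst = max(worst, abs(x2 - x2_hat))
        innovation = x1 - x1_hat
        # explicit Euler step of the observer equations (18)
        x1_hat_next = x1_hat + dt * (x2_hat + K1 * innovation / eps)
        x2_hat_next = x2_hat + dt * (K2 * innovation / eps ** 2)
        x1_hat, x2_hat = x1_hat_next, x2_hat_next
    return worst

err_large_eps = observer_error(0.1)
err_small_eps = observer_error(0.01)
print(f"eps = 0.1: {err_large_eps:.3f}, eps = 0.01: {err_small_eps:.3f}")
```

Reducing ε shrinks the steady estimation error, in line with the residual set of Theorem 1, at the price of a higher gain 1/ε acting on whatever noise the position measurement carries.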
The PD control law in combination with the state estimate from a high-gain observer
is then given by

$$\tau = -K_p(x_1 - x_1^d) - K_d(\hat{x}_2 - x_2^d) \tag{23}$$

where x1d ∈ ℜ5 is the desired position, x2d ∈ ℜ5 is the desired velocity. In the regula-
tion case x2d = 0, and x̂2 is of course the velocity approximation from the high-gain
observer.
The coupling between anti-swing control and position control can be explained
as follows. For the anti-swing control (6), the position controls A1 and B1 are disturbances,
which can be decreased by the integral action in PID control. The anti-swing
model (6) is an approximation, but the anti-swing control (7) does not in fact use this,
as it is model-free. Hence while the anti-swing control law (7) cannot suppress the
swing completely, it can minimize any consequent vibration.
For the position control (8), the anti-swing control lies in the term D =
[A2 , B2 , 0, 0, 0]T , which can also be regarded as a disturbance. The coupling due to
anti-swing control can be compensated by the fuzzy system. Consequently, the PD
control with the fuzzy compensation can be expressed as

τ = −Kp (x1 − x1d ) − Kd x2 + Ŵt Φ(x) (24)

If neither the velocity $x_2$ nor the friction and gravity are known, the normal PD
control needs to be combined with velocity estimation and fuzzy compensation to
give

$$\tau = -K_p(x_1 - x_1^d) - K_d(\hat{x}_2 - x_2^d) + \hat{W}_t\Phi(s) \tag{25}$$

where $s = [x_1^T, \hat{x}_2^T]^T$ and $x_2^d = 0$. The stability of this controller is analysed next.

6 Stability Analysis

Equation (14) can be rewritten as

$$G(x) + F(x) = W^*\Phi(x) + \eta_g \tag{26}$$

where $x = [q^T, \dot{q}^T]^T$, $W^*$ is a fixed bounded matrix, and $\eta_g$ is the approximation
error whose magnitude also depends on the value of $W^*$. Now, $\eta_g$ is assumed to be
quadratically bounded such that

$$\eta_g^T \Lambda_g \eta_g \le \bar{\eta}_g \tag{27}$$
where η̄g is a positive constant. Friction and gravity can be estimated according to

G (x) + F (x) ≈ Ŵt Φ(s) (28)

where Ŵt is a time-varying weight matrix for the fuzzy system. The following rela-
tion holds
W ∗ Φ(x) − Ŵt Φ(x) = W̃t Φ(s) (29)

where $\tilde{W}_t = W^* - \hat{W}_t$. From Theorem 1 it is known that the high-gain observer (18)
can make $(\hat{x}_2 - x_2)$ converge to a residual set, and it is possible to write $x_2 = \hat{x}_2 + \delta$,
where $\delta$ is bounded such that $\delta^T \Lambda_\delta \delta \le \bar{\eta}_\delta$. Now defining the tracking errors
(with $x_2^d = 0$) as $\bar{x}_1 = x_1 - x_1^d$ and

$$\tilde{x}_2 = \hat{x}_2 = \bar{x}_2 - \delta \tag{30}$$

the following theorem holds.

Theorem 2. If the updating laws for the membership functions in (28) are

$$\frac{d}{dt}\hat{W}_t = -K_w\Phi(s)\tilde{x}_2^T \tag{31}$$

where $K_w$, $K_v$ and $\Lambda_3$ are positive definite matrices, and $K_d$ satisfies

$$K_d > \Lambda_g^{-1} + \Lambda_\delta^{-1} \tag{32}$$

then the PD control law with fuzzy compensation in (25) can make the tracking error
stable. In fact, the average tracking error $\bar{x}_2$ converges to

$$\limsup_{T\to\infty} \frac{1}{T}\int_0^T \|\bar{x}_2\|^2_{Q_1}\,dt \le \bar{\eta}_g + 2\bar{\eta}_\delta \tag{33}$$

where $Q_1 = K_d - \left(\Lambda_g^{-1} + \Lambda_\delta^{-1}\right)$.

The proof of Theorem 2 is contained in the Appendix.
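
In a sampled implementation, the control law (25) and the learning law (31) would be evaluated once per sampling period. The sketch below shows one such step under several assumptions that are not stated in the chapter: the gains Kp and Kd are diagonal and stored as lists, Kw is a scalar, the learning law is discretised with one explicit Euler step, and the weight matrix is stored with one row per coordinate, so the update is written in the transposed form of (31). The function name and the sample numbers are hypothetical.

```python
def control_and_learn(x1, x1_d, x2_hat, W, phi, Kp, Kd, Kw, dt):
    """One sampling step of the PD law with fuzzy compensation (25) and an
    Euler-discretised version of the learning law (31)."""
    # fuzzy compensation term W_t Phi(s)
    comp = [sum(w * p for w, p in zip(row, phi)) for row in W]
    # eq. (25) in the regulation case (x2_d = 0), with diagonal Kp and Kd
    tau = [-kp * (x - xd) - kd * v + c
           for kp, kd, x, xd, v, c in zip(Kp, Kd, x1, x1_d, x2_hat, comp)]
    # eq. (31) with x2_tilde = x2_hat in the regulation case; W is stored
    # transposed here (one row per coordinate), hence the index order
    W_next = [[w - dt * Kw * v * p for w, p in zip(row, phi)]
              for row, v in zip(W, x2_hat)]
    return tau, W_next

# Example with the gains reported in the experiments and two made-up rules
Kp = [5.0, 5.0, 0.0, 0.0, 1.0]
Kd = [80.0, 80.0, 0.0, 0.0, 10.0]
W0 = [[0.0, 0.0] for _ in range(5)]
tau, W1 = control_and_learn(
    x1=[0.1, 0.0, 0.0, 0.0, 0.5], x1_d=[0.5, 0.5, 0.0, 0.0, 0.5],
    x2_hat=[0.2, 0.0, 0.0, 0.0, 0.0], W=W0, phi=[0.25, 0.75],
    Kp=Kp, Kd=Kd, Kw=10.0, dt=1e-3)
print(tau, W1[0])
```

Only the estimated velocity from the observer enters the update, so the step needs no velocity sensor and no crane model.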

7 Experimental Comparisons

The proposed anti-swing control for overhead crane systems has been implemented
on an InTeCo [10] overhead crane test-bed, see Figure 2. The rail is 150 cm long,
and the physical parameters for the system are as follows:

Mr = 6.5 kg, Mc = 0.8 kg, Mm = 1.3 kg, I = 0.01 kg·m²

Here interfacing is based on a Xilinx FPGA microprocessor, comprising a multifunction
analog and digital I/O board dedicated to real-time data acquisition and
control in the Windows XP environment, mounted in a PC Pentium-III 500 MHz
host. Because the Xilinx FPGA chip supports real-time operations without introducing
latencies caused by the Windows default timing system, the control program
operated in Windows XP with Matlab 6.5/Simulink. All of the controllers employed
a sampling frequency of 1 kHz.
The anti-swing control is discussed first. There are two inputs in the anti-swing
model (6), A1 and A2 with A1 from the position controller and A2 from the anti-
swing controller. When the anti-swing control A2 is designed by (6), A1 is regarded
as a disturbance. The chosen parameters of the PID control law (7) were

$$k_{pa2} = 2.5,\quad k_{da2} = 18,\quad k_{ia2} = 0.01$$
$$k_{pb2} = 15,\quad k_{db2} = 10,\quad k_{ib2} = 0.6$$

Fig. 2 Real-time control for an overhead crane

The resulting angles are shown in Figure 3 for the position control without anti-swing,
and in Figure 4 for the position control with anti-swing. It can be seen that
the swing angles α and β are reduced considerably by the anti-swing controller.

Fig. 3 Without swing angles control (top: α and β versus time in seconds; bottom: β versus α)

Fig. 4 With swing angles control (top: α and β in rad versus time; bottom: β versus α in rad)

The position control law in equation (13) is discussed next. In this case there
are two types of input to the position model (8), $D = [A_2, \dots]^T$ and $u_1 = [A_1, \dots]^T$.
When the position control $A_1$ is designed by (25) with $u_1 = \tau$, the anti-swing
control $A_2$ in (8) is regarded as a disturbance which will be compensated for by the
fuzzy system (12). Theorem 2 implies that to assure stability, $K_d$ should be large
enough such that $K_d > \Lambda_g^{-1} + \Lambda_\delta^{-1}$. Since these upper bounds are not known,
$K_{d1} = \mathrm{diag}[80, 80, 0, 0, 10]$ is selected. The position feedback gain does not affect
the stability, but it should be positive, and was chosen as $K_{p1} = \mathrm{diag}[5, 5, 0, 0, 1]$.
A total of 20 fuzzy rules were used to compensate for the friction, gravity and the
coupling from anti-swing control. The membership function for $A_{ji}$ was chosen to
be the Gaussian function

$$\mu_{A_{ji}} = \exp\left(-(x_j - m_{ji})^2/100\right),\quad j = 1, \dots, 5,\quad i = 1, \dots, 20$$

where the centres $m_{ji}$ were selected randomly to lie in the interval (0, 1). Hence,
$\hat{W}_t \in \Re^{5\times 20}$ and $\Phi(x) = [\phi_1 \cdots \phi_{20}]^T$. The learning law took the form in (31) with
$K_w = 10$. The desired gantry position was selected as a square wave, and the resulting
gantry positions are shown in Figure 5. The regulation results from PD
control without fuzzy compensation [15] are shown in Figure 6. For comparison
the PID control results ($K_{d1} = \mathrm{diag}[80, 80, 0, 0, 10]$, $K_{p1} = \mathrm{diag}[5, 5, 0, 0, 1]$,
$K_{i1} = \mathrm{diag}[0.25, 0.25, 0, 0, 0.1]$) are shown in Figure 7.
Clearly, PD control with fuzzy compensation can successfully compensate the
uncertainties such as friction, gravity and anti-swing coupling. Because the PID
controller has no adaptive mechanism, it does not work well for anti-swing coupling
in contrast to the fuzzy compensator which can adjust its control action. On the other
hand, the PID controller is faster than the PD control with fuzzy compensation in
the case of small anti-swing coupling.
The structure of the fuzzy compensator is very important. The constants in the membership
functions of the fuzzy system have to be chosen either by simulation or
experiment. From fuzzy theory the form of the membership function is known not
to influence the stability of the fuzzy control, but the approximation ability of the fuzzy
system for a particular non-linear process depends on the membership functions
selected. The number of fuzzy rules constitutes a structural problem for fuzzy systems.
It is well known that increasing the dimension of the fuzzy rules can cause the
“overlap” problem and add to the computational burden [29]. The best dimension
to use is still an open problem for the fuzzy research community. In this application
20 fuzzy rules were used. Since it is difficult to obtain the fuzzy structure from prior
knowledge, several fuzzy identifiers can be put in parallel and the best one selected
by a switching algorithm. The learning gain $K_w$ influences the learning speed:
a very large gain can cause unstable learning, while a very small gain produces a slow
learning process.

8 Conclusion

In this chapter, the disadvantages of the popular PD control for overhead cranes are
overcome in the following two ways: (1) a high-gain observer is proposed for the
estimation of the velocities of the joints; (2) a fuzzy compensator is used to compensate
for gravity and friction. Using Lyapunov-like analysis, the stability of the
closed-loop system with velocity estimation and fuzzy compensation was proven.
Real-time experiments were presented comparing our stable anti-swing PD control
strategy with regular crane controllers. These showed that the PD control law with
the anti-swing and fuzzy compensations is effective for the crane system.

Acknowledgments Wen Yu would like to thank CONACyT for supporting his visit to Queen’s
University Belfast under the projects 46729Y and 50480Y. The second author would like to ac-
knowledge the support received from the International Exchange Scheme of Queen’s University
Belfast.

Fig. 5 PD control with fuzzy compensation (panels: xw, yw and R in m versus time in s)

Fig. 6 PD control without compensation (panels: xw, yw and R in m versus time in s)

Fig. 7 PID (panels: xw, yw and R in m versus time in s)

9 Appendix

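Proof of Theorem 1. Since $K_1$ and $K_2$ are chosen such that $A$ is stable, i.e. its spectrum
lies in the open left half plane, (22) has a positive definite solution $P$. Consider the
following candidate Lyapunov function: $V_0(\tilde{z}) = \varepsilon\tilde{z}^T P\tilde{z}$. The derivative of this along
the solutions of (20) is:

$$\begin{aligned}
\dot{V}_0 &= \varepsilon\frac{d}{dt}\tilde{z}^T P\tilde{z} + \varepsilon\tilde{z}^T P\frac{d}{dt}\tilde{z} \\
&= \tilde{z}^T\left(A^T P + PA\right)\tilde{z} + 2\varepsilon^2 (BH_1)^T P\tilde{z} \\
&\le -\|\tilde{z}\|^2 + 2\varepsilon^2 \|BH_1\|\,\|P\|\,\|\tilde{z}\|
\end{aligned} \tag{34}$$

Since (16) has a solution for any $t \in [0, T]$, $\|H_1\|$ is bounded for any finite time $T$.
It can therefore be concluded that $\|BH_1\|\,\|P\|$ is bounded, so that

$$\dot{V}_0 \le -\|\tilde{z}\|^2 + \bar{K}(\varepsilon)\|\tilde{z}\|$$

where $\bar{K}(\varepsilon) = 2\varepsilon^2 \sup_{t\in[0,T]} \|BH_1\|\,\|P\|$. Note that $\dot{V}_0 < 0$ if

$$\|\tilde{z}(t)\| > \bar{K}(\varepsilon) \tag{35}$$

Now, let $T_k$ denote the time interval during which $\|\tilde{z}(t)\| > \bar{K}(\varepsilon)$. Since $\dot{V}_0 < 0$ on these
intervals for all $t \in [0, T]$, the total time during which $\|\tilde{z}(t)\| > \bar{K}(\varepsilon)$ is finite

$$\sum_{k=1}^{\infty} T_k < \infty \tag{36}$$

If $\tilde{z}(t)$ falls outside the ball of radius $\bar{K}(\varepsilon)$ only a finite number of times and then
re-enters it, $\tilde{z}(t)$ will eventually remain completely inside. If $\tilde{z}(t)$ leaves the ball an
infinite number of times ($k \to \infty$), since $\sum_{k=1}^{\infty} T_k < \infty$ and $T_k > 0$, it follows that
$T_k \to 0$. This then means that $\tilde{z}(t)$ finally stays inside the ball and so $\tilde{z}(t)$ is bounded
from an invariant set argument.
Now, from (21), $\frac{d}{dt}\tilde{z}(t)$ is also bounded. If $\|\tilde{z}_k(t)\|$ is defined as the largest tracking
error during $T_k$, (36) and a bounded $\frac{d}{dt}\tilde{z}(t)$ imply that $\lim_{k\to\infty}\left[\|\tilde{z}_k(t)\| - \bar{K}(\varepsilon)\right] = 0$,
and $\|\tilde{z}_k(t)\|$ will converge to $\bar{K}(\varepsilon)$. Because $\tilde{x} = \begin{bmatrix} I & 0 \\ 0 & \frac{1}{\varepsilon}I \end{bmatrix}\tilde{z}$ and $\varepsilon < 1$, as a result
$\tilde{x}$ converges to the ball of radius $\bar{K}(\varepsilon)$. QED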


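Proof of Theorem 2. The following Lyapunov function is proposed

$$V_2 = \frac{1}{2}\tilde{x}_2^T M\tilde{x}_2 + \frac{1}{2}\bar{x}_1^T K_p\bar{x}_1 + \frac{1}{2}\mathrm{tr}\left(\tilde{W}_t^T K_w^{-1}\tilde{W}_t\right) \tag{37}$$

where $K_w$ and $K_v$ are any positive definite matrices. Using (1), (25) and (26), the
closed-loop system is given by

$$M\dot{x}_2 = -Cx_2 - K_p\bar{x}_1 - K_d\tilde{x}_2 + \hat{W}_t\Phi(s) - W^*\Phi(s) - \eta_g \tag{38}$$

Now the derivative of (37) is

$$\dot{V}_2 = \tilde{x}_2^T M\dot{\tilde{x}}_2 + \frac{1}{2}\tilde{x}_2^T\dot{M}\tilde{x}_2 + \tilde{x}_2^T K_p\bar{x}_1 + \mathrm{tr}\left(\tilde{W}_t^T K_w^{-1}\dot{\tilde{W}}_t\right) \tag{39}$$

and from (38) and (29) it follows that

$$\tilde{x}_2^T M\dot{\tilde{x}}_2 = -\tilde{x}_2^T M\dot{x}_2^d - \tilde{x}_2^T C\tilde{x}_2 - \tilde{x}_2^T Cx_2^d - \tilde{x}_2^T K_p\bar{x}_1 - \tilde{x}_2^T K_d\tilde{x}_2 - \tilde{x}_2^T\left[\tilde{W}_t\Phi(s) + \eta_g\right]$$

Using (2) and (39), this then can be written as

$$\begin{aligned}
\dot{V}_2 = {} & -\tilde{x}_2^T M\dot{x}_2^d - \tilde{x}_2^T Cx_2^d - \tilde{x}_2^T K_d\tilde{x}_2 - \tilde{x}_2^T\left[\nu_\sigma + \eta_g\right] \\
& + \tilde{x}_2^T\delta + \mathrm{tr}\left[\left(K_w^{-1}\frac{d}{dt}\tilde{W}_t - \Phi(s)\tilde{x}_2^T\right)\tilde{W}_t\right]
\end{aligned} \tag{40}$$

In view of the matrix inequality,

$$X^T Y + \left(X^T Y\right)^T \le X^T\Lambda^{-1}X + Y^T\Lambda Y \tag{41}$$

which is valid for any $X, Y \in \Re^{n\times k}$ and for any positive definite matrix $0 < \Lambda =
\Lambda^T \in \Re^{n\times n}$, it follows that if $X = \tilde{x}_2$ and $Y = \delta$, then $\tilde{x}_2^T\delta \le \tilde{x}_2^T\Lambda_\delta^{-1}\tilde{x}_2 + \bar{\eta}_\delta$. Since
$x_2^d = \dot{x}_2^d = 0$, and using the learning law (31) and the skew-symmetric property (2), (40)
becomes

$$\dot{V}_2 \le -\tilde{x}_2^T Q_1\tilde{x}_2 + \bar{\eta}_g + \bar{\eta}_\delta \tag{42}$$

where $Q_1 = K_d - \left(\Lambda_g^{-1} + \Lambda_\sigma^{-1} + \Lambda_\delta^{-1}\right)$. Now, from (32), it is known that $Q_1 > 0$, and
(42) can then be represented as

$$\dot{V}_2 \le -\lambda_{\min}(Q_1)\|\tilde{x}_2\|^2 + \eta_g^T\Lambda_g\eta_g + \delta^T\Lambda_\delta\delta$$

$V_2$ is therefore an ISS-Lyapunov function. Using Theorem 1 from [23], the boundedness
of $\eta_g$ and $\bar{\eta}_\delta$ implies that the tracking error $\|\tilde{x}_2\|$ is stable. Integrating (42)
from 0 to $T$ yields

$$\int_0^T \tilde{x}_2^T Q_1\tilde{x}_2\,dt \le V_{2,0} - V_{2,T} + (\bar{\eta}_g + \bar{\eta}_\delta)T \le V_{2,0} + (\bar{\eta}_g + \bar{\eta}_\delta)T$$

and, since $\|\bar{x}_2\|^2_{Q_1} = \|\tilde{x}_2\|^2_{Q_1} + \bar{\eta}_\delta$, equation (33) is established. QED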


References

1. E.M. Abdel-Rahman, A.H. Nayfeh and Z.N. Masoud. Dynamics and control of cranes: a
review. Journal of Vibration and Control, Vol. 9, No. 7, 863–908, 2003
2. J.W. Auernig and H. Troger. Time optimal control of overhead cranes with hoisting of the
payload. Automatica, Vol. 23, No. 4, 437–447, 1987
3. J.W. Beeston. Closed-loop time optimal control of a suspended payload - a design study.
Proc. 4th IFAC World Congress, 85–99, Warsaw, Poland, 1969
4. C.I. Byrnes, A. Isidori and J.C. Willems. Passivity, feedback equivalence, and the global
stabilization of minimum phase nonlinear systems. IEEE Trans. Automat. Contr., Vol. 36,
1228–1240, 1991
5. S.K. Cho and H.H. Lee. A fuzzy-logic antiswing controller for three-dimensional overhead
cranes. ISA Trans., Vol. 41, No. 2, 235–243, 2002
6. G. Corriga, A. Giua and G. Usai. An implicit gain-scheduling controller for cranes. IEEE
Trans. Control Systems Technology, Vol. 6, No. 1, 15–20, 1998
7. C. Canudas de Wit and J.J.E. Slotine. Sliding observers for robot manipulators.
Automatica, Vol. 27, No. 5, 859–864, 1991
8. G. Cybenko. Approximation by superposition of sigmoidal activation function. Math. Control,
Sig Syst, Vol. 2, 303–314, 1989
9. Y. Fang, W.E. Dixon, D.M. Dawson and E. Zergeroglu. Nonlinear coupling control laws for
an underactuated overhead crane system. IEEE/ASME Trans. Mechatronics, Vol. 8, No. 3,
418–423, 2003
10. InTeCo, 3DCrane: Installation and Commissioning Version 1.2, Krakow, Poland, 2000
11. P.A. Ioannou and J. Sun. Robust adaptive control. Prentice-Hall Inc., NJ, 1996
12. Y.H. Kim and F.L. Lewis. Neural network output feedback control of robot manipulators.
IEEE Trans. Robotics and Automation, Vol. 15, 301–309, 1999
13. R. Kelly. Global positioning of robot manipulators via PD control plus a class of
nonlinear integral actions. IEEE Trans. Automat. Contr., Vol. 43, No. 7, 934–938, 1998
14. R. Kelly. A tuning procedure for stable PID control of robot manipulators. Robotica, Vol. 13,
141–148, 1995
15. B. Kiss, J. Levine and P. Mullhaupt. A simple output feedback PD controller for nonlinear
cranes. Proc. Conf. Decision and Control, 5097–5101, 2000
16. H.H. Lee. Modeling and control of a three-dimensional overhead crane. Journal of Dynamic
Systems, Measurement, and Control, Vol. 120, 471–476, 1998
17. H.H. Lee. A new motion-planning scheme for overhead cranes with high-speed hoisting. Jour-
nal of Dynamic Systems, Measurement, and Control, Vol. 126, 359–364, 2004
18. E.H. Mamdani. Application of fuzzy algorithms for control of simple dynamic plant. IEE Pro-
ceedings — Control Theory and Applications, Vol. 121, No. 12, 1585–1588, 1976
19. J.A. Méndez, L. Acosta, L. Moreno, S. Torres and G.N. Marichal. An application of a neural
self-tuning controller to an overhead crane. Neural Computing and Applications, Vol. 8, No. 2,
143–150, 1999
20. K.A. Moustafa and A.M. Ebeid. Nonlinear modeling and control of overhead crane load sway.
Journal of Dynamic Systems, Measurement, and Control, Vol. 110, 266–271, 1988
21. M.W. Noakes and J.F. Jansen. Generalized input for damped-vibration control of suspended
payloads. Journal of Robotics and Autonomous Systems, Vol. 10, No. 2, 199–205, 1992
22. S. Nicosia and A. Tornambe. High-gain observers in the state and parameter estimation of
overhead cranes having elastic joins. System & Control Letters, Vol. 13, 331–337, 1989
23. E.D. Sontag and Y. Wang. On characterization of the input-to-state stability property. System
& Control Letters, Vol. 24, 351–359, 1995
24. O. Sawodny, H. Aschemann and S. Lahres. An automated gantry crane as a large workspace
robot. Control Engineering Practice, Vol. 10, No. 12, 1323–1338, 2002
25. Y. Sakawa and Y. Shindo. Optimal control of container cranes. Automatica, Vol. 18, No. 3,
257–266, 1982
26. W. Singhose, W. Seering and N. Singer. Residual vibration reduction using vector diagrams
to generate shaped inputs. Journal of Dynamic Systems, Measurement, and Control, Vol. 116,
654–659, 1994
27. M. Takegaki and S. Arimoto. A new feedback method for dynamic control of manipulator.
ASME J. Dynamic Syst. Measurement, and Contr., Vol. 103, 119–125, 1981
28. R. Toxqui, W. Yu and X. Li. PD control of overhead crane systems with neural compensa-
tion. Advances in Neural Networks -ISNN 2006, Springer-Verlag, Lecture Notes in Computer
Science, LNCS 3972, 1110–1115, 2006
29. L.X. Wang. Adaptive Fuzzy Systems and Control. Englewood Cliffs NJ: Prentice-Hall, 1994.
30. J. Yu, F.L. Lewis and T. Huang. Nonlinear feedback control of a gantry crane. Proc. 1995
American Control Conference, Seattle, 4310–4315, USA, 1995
Intelligent Fuzzy PID Controller

Prof. H.B. Kazemian, PhD, SMIEEE

Abstract This chapter aims to describe the development and two tuning methods
for a self-organising fuzzy PID controller. Before application of fuzzy logic, the
PID gains are tuned by conventional tuning methods. In the first tuning method,
fuzzy logic at the supervisory level readjusts the three PID gains during the system
operation. In the second tuning method fuzzy logic only readjusts the values of the
proportional PID gain, and the corresponding integral and derivative gains are read-
justed using Ziegler-Nichols tuning method while the system is in operation. For the
compositional rule of inferences in the fuzzy PID and the self-organising fuzzy PID
schemes two new approaches are introduced: the Min implication function with the
Mean-of-Maxima defuzzification method, and the Max-product implication func-
tion with the Centre-of-Gravity defuzzification method. The self-organising fuzzy
PID controller, the fuzzy PID controller and the PID controller are all applied to
a non-linear revolute-joint robot-arm for step input and path tracking experiments
using computer simulation. For the step input and path tracking experiments, the
novel self-organising fuzzy PID controller produces a better output response than
the fuzzy PID controller; and in turn both controllers produce better process output
than the PID controller.

Keywords: Fuzzy controller, fuzzy PID controller, self-organising fuzzy PID con-
troller, implication function, defuzzification method

Prof. H.B. Kazemian, PhD, SMIEEE


Computing, Communications Technology and Mathematics Department,
London Metropolitan University,
100 Minories, London EC3N 1JY,
England, UK.
TEL: ++44-20-7320 3109.
FAX:++44-20-7320 1717.
e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 241–260.
© 2008 Springer.
1 Introduction

The Proportional Integral Derivative (PID) controller is one of the most popular
controllers in industrial applications. However, the PID controller has suboptimal
performance in industrial processes. There have been many attempts in the past
to develop control techniques and algorithms to tune the PID gains KP, KI and KD
[1]–[3]. These control techniques and algorithms are largely inadequate for tuning
the gains of PID controllers for non-linear systems. Some of the techniques and
algorithms used to tune the PID gains show that further retuning by a skilled human
operator is necessary while the controller is applied to a process.
Fuzzy controllers have been applied to industrial processes with some degree of
success [4]–[6], where the rule buffer codifies the experience of a skilled human
operator. As a result of these successes, fuzzy PID controllers have
been studied over the past decade [7]–[14]. Furthermore, the applications of autonomous
or intelligent fuzzy PID controllers have recently been gathering momentum, and
many researchers have worked on self-tuning fuzzy PID controllers. For
example, self-tuning fuzzy PID controllers have been applied to load and frequency control in
energy conversion and management [15], heating, ventilating and air conditioning
plant [16], and programmable logic controllers [17], to name a few. This article takes
the fuzzy PID controller and the self-tuning fuzzy PID controller research further
by developing a novel self-organising fuzzy PID controller.
The self-organising fuzzy PID controller is a learning controller. The rule production
and modification section of the self-organising fuzzy PID controller generates its
own control rule strategies and deposits the new rules in the rule buffer. The rules
are produced and updated constantly in the rule buffer during the system operation,
according to the new experience encountered both at the setpoint and from the
process under control. For the self-organising fuzzy PID controller, the step input
and path tracking trajectories are applied to a non-linear revolute-joint robot-arm
in the presence of noise and time-variant dynamics. The revolute-joint robot-arm is
used as a test bed to study the behaviour of the self-organising fuzzy PID controller
for dynamic system applications. The results of the computer simulation experiments
for the self-organising fuzzy PID controller are compared with those for the fuzzy PID
controller and the PID controller, to evaluate the suitability of the self-organising
fuzzy PID controller for dynamic system applications and also to obtain some information
about the tuning procedure. To measure the performance of
the self-organising fuzzy PID controller, the fuzzy PID controller and the
PID controller, the Integral of the Absolute magnitude of the Error (IAE) criterion
is used. The performance index IAE is particularly useful for computer simulation
studies.
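As an illustration of the IAE criterion named above, the following minimal Python sketch approximates the integral of |e(t)| by a rectangular sum over sampled errors. The function name `iae`, the rectangular rule and the sample data are assumptions made for the example; the chapter does not state how the integral is evaluated. The 12 ms sample time is the one quoted in the step-response figure captions.

```python
def iae(setpoints, outputs, sample_time):
    """Approximate the Integral of the Absolute Error by a rectangular sum."""
    if len(setpoints) != len(outputs):
        raise ValueError("setpoints and outputs must have the same length")
    return sum(abs(s - y) for s, y in zip(setpoints, outputs)) * sample_time


if __name__ == "__main__":
    # Hypothetical data: a constant setpoint and a rising process output.
    setpoint = [70.0] * 5
    response = [0.0, 35.0, 60.0, 69.0, 70.0]
    print(iae(setpoint, response, sample_time=0.012))
```

A smaller IAE indicates that the process output stayed closer to the setpoint over the run, which is why the index suits comparisons between controllers in simulation.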
Section 2 describes the fuzzy PID controller and the self-organising fuzzy PID
controller. Section 3 describes the application of the controllers to a non-linear
revolute-joint robot-arm. Section 4 presents the computer simulation results for the step
input and the path tracking trajectories. Section 5 is the conclusion.
2 The Development of Self-Organizing Fuzzy PID Controller

In Figure 1, the general structure of the fuzzy controller is derived from the general
structure of the PID controller. Assilian in 1974 defined the fuzzy controller’s inputs
as the error and the change of error, and the output as an incremental one, similar
to the PID controller [18]. The fuzzy section of the fuzzy controller from Figure 1
is used for the fuzzy PID controller, as shown in Figure 2. The fuzzy section of the
fuzzy PID controller comprises the fuzzifier, the rule buffer, the fuzzy control
and the defuzzifier blocks. The remaining blocks of the fuzzy PID controller of
Figure 2 are the PID gains and the revolute-joint robot-arm. The gains of the fuzzy
PID controller are initially tuned using a conventional tuning technique. The fuzzy
section has a supervisory role to readjust the gains of the PID controller during the
system operation.
In Figure 2, the fuzzifier block fuzzifies the error and the change of error. Scaling
and quantisation constitute the fuzzification of the error and the change of error.
The values of the scaling factors are obtained by trial and error during the tuning of the

[Figure: block diagram with setpoint, error, fuzzy section (fuzzifier, fuzzy control, rule buffer, defuzzifier), revolute-joint robot-arm and process output]

Fig. 1 The fuzzy controller

[Figure: block diagram with setpoint, error E, fuzzy section (fuzzifier, fuzzy control, rule buffer, defuzzifier), PID gains block (KP(Fuzzy-apps), KI(Fuzzy-apps), KD(Fuzzy-apps)), control signal U, revolute-joint robot-arm and process output Y]

Fig. 2 The fuzzy PID controller


[Figure: membership functions NL, NS, ZE, PS and PL (membership grade 0 to 1) plotted over the universe of discourse from −4 to +4]

Fig. 3 Membership function against universe of discourse

Table 1 Membership matrix

Linguistic   Quantized levels / universe of discourse
set           −4    −3    −2    −1     0     1     2     3     4
PL             0     0     0     0     0     0   0.3   0.7   1.0
PS             0     0     0     0   0.3   0.7   1.0   0.7     0
ZE             0     0   0.3   0.7   1.0   0.7     0     0     0
NS           0.3   0.7   1.0   0.7     0     0     0     0     0
NL           1.0   0.7     0     0     0     0     0     0     0

(Entries are membership function values.)

controller. The scaling factor for the error is denoted ESF and that for the change of
error is denoted CESF. Quantisation of the error and the change of error requires
all the fuzzified values to remain within a certain range. In the experiments presented
in this work, the range is from Negative Large to Positive Large. In Figure 3,
the linguistic codes for this range are: Negative Large (NL) = −4 or −3, Negative
Small (NS) = −2 or −1, Zero (ZE) = 0, Positive Small (PS) = +1 or +2, Positive
Large (PL) = +3 or +4. In the fuzzy control block, the fuzzified error, the fuzzified
change of error and the rules from the rule buffer block produce an output using
the compositional rule of inference [19]. An implication function and a defuzzification
method constitute the compositional rule of inference. There are many types
of implication functions and defuzzification methods. However, in the experiments
carried out for the novel self-organising fuzzy PID controller, the Min implication
function with the Mean-of-Maxima defuzzification method and the Max-product
implication function with the Centre-of-Gravity defuzzification method produced
better results for the process output. The Min implication function [20] and the
Mean-of-Maxima defuzzification method are given by equations (1) and (2)
respectively:
$$u_R(x, y) = u_A(x) \cap u_B(y). \tag{1}$$
$$u_{Pi} = \left[ U_{Pi(\max)} + U_{Pi(\max-1)} \right] / 2. \tag{2}$$
where a fuzzy subset A with elements x has a membership function uA(x) within
the range 0–1 (see the membership matrix in Table 1). Equally, a fuzzy subset B with
elements y has a membership function uB(y). uR(x, y) is the result of the Min
implication function. The Mean of Maxima is defined [21] by taking the average
between the two elements in the universe of discourse that correspond to the two largest
values of the membership function. UPi(max) is the element of the universe of discourse
with the highest value of the membership function, UPi(max−1) is the element with
the second highest value and UPi is the resultant. P denotes the proportional gain and i
is the sampling instant. The Max-product [22]–[23] implication function and the
Centre-of-Gravity [24] defuzzification method are given by equations (3) and (4)
respectively:
$$u_R(x, y) = u_A(x) \cdot u_B(y). \tag{3}$$
$$u_{Pi} = \sum_{1}^{n} (x_n * U_{ni}) \Big/ \sum_{1}^{n} x_n. \tag{4}$$

where · represents multiplication, x denotes the elements of the membership function, UPi
is the universe of discourse and n is the number of membership function contributions
(n = 1, 2, ..., etc.). The output of the fuzzy control block needs to be defuzzified,
since a non-fuzzy signal is required for the PID gains block. The resulting UPi is
dequantised and descaled and is added to the proportional gain KP in the PID gains
block to readjust the value of KP, using equation (5). Similar methods are used
to readjust the values of KI and KD in equations (6) and (7).
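The two pairings in equations (1)–(4) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names are hypothetical, the sample membership row is the PL row of Table 1, and the Mean of Maxima follows the definition given in this section (the average of the two universe elements with the largest membership values) rather than any other textbook variant.

```python
UNIVERSE = [-4, -3, -2, -1, 0, 1, 2, 3, 4]  # quantised levels of Table 1


def min_implication(u_a, u_b):
    """Equation (1): combine two membership grades with the minimum."""
    return min(u_a, u_b)


def product_implication(u_a, u_b):
    """Equation (3): combine two membership grades by multiplication."""
    return u_a * u_b


def mean_of_maxima(universe, grades):
    """Equation (2): average the two elements with the largest grades."""
    ranked = sorted(zip(grades, universe), key=lambda pair: pair[0], reverse=True)
    return (ranked[0][1] + ranked[1][1]) / 2


def centre_of_gravity(universe, grades):
    """Equation (4): grade-weighted average of the universe elements."""
    total = sum(grades)
    if total == 0:
        return 0.0  # assumption: no contribution means no adjustment
    return sum(g * u for g, u in zip(grades, universe)) / total


if __name__ == "__main__":
    pl_row = [0, 0, 0, 0, 0, 0, 0.3, 0.7, 1.0]  # PL row of Table 1
    print(min_implication(0.7, 0.3), product_implication(0.7, 0.3))
    print(mean_of_maxima(UNIVERSE, pl_row))
    print(centre_of_gravity(UNIVERSE, pl_row))
```

When two elements share the same grade, this sketch keeps the one that appears first in the universe; the chapter does not say how such ties are resolved.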

$$K_{P(\text{Fuzzy-apps})} = K_P + U_{Pi} * K_{CP}. \tag{5}$$

$$K_{I(\text{Fuzzy-apps})} = K_I + U_{Ii} * K_{CI}. \tag{6}$$

$$K_{D(\text{Fuzzy-apps})} = K_D + U_{Di} * K_{CD}. \tag{7}$$

KP, KI and KD on the right of equations (5), (6) and (7) represent the PID gains
before the readjustments, and KP(Fuzzy−apps), KI(Fuzzy−apps) and KD(Fuzzy−apps) on
the left represent the PID gains after the application of the fuzzy PID
controller. KCP, KCI and KCD are the descaling factor coefficients for the proportional,
integral and derivative PID gains, respectively. As the values of the PID gains
KP, KI and KD change at different rates, three different values for the descaling factor
coefficients are used. For instance, the range of variation in the values of KP is
greater than that of KI and KD. The values of the descaling factor coefficients KCP, KCI
and KCD are also chosen to be different for each link. For a two-link revolute-joint
robot-arm, KCPS, KCIS and KCDS are the descaling factor coefficients for the shoulder;
KCPA, KCIA and KCDA are the descaling factor coefficients for the arm.
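A minimal Python sketch of the readjustment in equations (5)–(7) follows. The names `PIDGains` and `readjust_gains` are hypothetical, and the defuzzified outputs passed in the example are invented for illustration; the initial gains and descaling factor coefficients are the values quoted later for the step input experiments (KP = 50, KI = 0.55, KD = 1.0, KCP = 0.5, KCI = 0.05, KCD = 0.1).

```python
from dataclasses import dataclass


@dataclass
class PIDGains:
    kp: float
    ki: float
    kd: float


def readjust_gains(gains, u_p, u_i, u_d, kcp, kci, kcd):
    """Apply equations (5)-(7): add each descaled fuzzy output to its gain."""
    return PIDGains(
        kp=gains.kp + u_p * kcp,
        ki=gains.ki + u_i * kci,
        kd=gains.kd + u_d * kcd,
    )


if __name__ == "__main__":
    initial = PIDGains(kp=50.0, ki=0.55, kd=1.0)
    # u_p, u_i and u_d are hypothetical defuzzified outputs for one sample.
    adjusted = readjust_gains(initial, u_p=2.0, u_i=-1.0, u_d=1.0,
                              kcp=0.5, kci=0.05, kcd=0.1)
    print(adjusted)
```

For a two-link arm, one such call per joint with its own coefficients (shoulder and arm) would reflect the per-link descaling described above.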
Finally in Figure 2, the controller output from the PID gains block has the transfer
function
$$\frac{U(s)}{E(s)} = K_{P(\text{Fuzzy-apps})} + \frac{K_{I(\text{Fuzzy-apps})}}{s} + K_{D(\text{Fuzzy-apps})} * s. \tag{8}$$
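Equation (8) is a continuous-time transfer function, and the chapter does not state how it is discretised in the simulation. As a minimal sketch, assuming rectangular integration and a backward-difference derivative (both assumptions), one sample of the control signal could be computed as follows. The class name `DiscretePID` and its methods are hypothetical.

```python
class DiscretePID:
    """Positional PID: u = Kp*e + Ki*integral(e) + Kd*de/dt, sampled every dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.previous_error = 0.0

    def set_gains(self, kp, ki, kd):
        """Let a supervisory level overwrite the gains between samples."""
        self.kp, self.ki, self.kd = kp, ki, kd

    def update(self, setpoint, output):
        """Return the control signal U for the current sample."""
        error = setpoint - output
        self.integral += error * self.dt                      # rectangular rule
        derivative = (error - self.previous_error) / self.dt  # backward difference
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    pid = DiscretePID(kp=50.0, ki=0.55, kd=1.0, dt=0.012)
    print(pid.update(setpoint=1.0, output=0.0))
```

Keeping the gains as mutable attributes mirrors the structure of Figure 2, where the fuzzy section changes the gains while the PID loop keeps running.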
The block diagram of the novel self-organising fuzzy PID controller is shown in
Figure 4. The broken lines in the block diagram separate the self-organising fuzzy section at the
supervisory level from the PID controller at the actuator level. The rule production
and modification section of the self-organising fuzzy PID controller presented in
Figure 4 was initially proposed by Mamdani and Baaklini [25], and has been
[Figure: block diagram. Supervisory level: linguistic rule table, past states buffer holding KProp(i−N), KInt(i−N) and KDeri(i−N), rule reinforcement, PID fuzzifier, rule buffer, fuzzifier, fuzzy control and defuzzifier. Actuator level: PID gains block (KP(Fuzzy-apps), KI(Fuzzy-apps), KD(Fuzzy-apps)) and revolute-joint robot-arm, with setpoint, error E, control signal U and process output Y]

Fig. 4 The self-organising fuzzy PID controller

studied by various researchers such as Procyk and Mamdani [26] and Kazemian
and Scharf [27], to name a few. However, the use of the rule production and modification
section at the supervisory level to readjust the PID gains at the actuator level has only been
studied by Kazemian [28]–[31]. The self-organising fuzzy PID controller in this
research is in effect the fuzzy PID controller with an additional rule production and
modification section. The self-organising fuzzy PID controller automatically builds its own
control rule strategies in the rule buffer according to the changes encountered both
at the setpoint and from the process under control, starting with no rules in the rule
buffer, during the application of the self-organising fuzzy PID controller to the dynamic
system. In Figure 4, the rule production and modification section comprises four
blocks: the linguistic rule table, the PID fuzzifier, the past states buffer and the rule
reinforcement. The linguistic rule table is responsible for keeping the revolute-joint
robot-arm output as close as possible to the setpoint. If the revolute-joint robot-arm
output approaches or follows the setpoint, then no value (zero) is output
from the linguistic rule table block. If the revolute-joint robot-arm output deviates
from the setpoint, a value called the gain correction KGC is output from the linguistic
rule table block. Based on these objectives, a set of nine linguistic rules is
produced (Figure 5). The nine linguistic rules are converted into a table (Table 2),
which is placed in the linguistic rule table block of the rule production and modification
section. From the fuzzifier block of Figure 2, two values of the fuzzified error and
1- If E is NL and EC is NL or NS then KGC is ZE
2- If E is NL and EC is ZE or PS or PL then KGC is NL
3- If E is NS and EC is NL or NS then KGC is ZE
4- If E is NS and EC is ZE or PS or PL then KGC is NS
5- If E is ZE and EC is NL or NS or ZE or PS or PL then KGC is ZE
6- If E is PS and EC is NL or NS or ZE then KGC is PS
7- If E is PS and EC is PS or PL then KGC is ZE
8- If E is PL and EC is NL or NS or ZE then KGC is PL
9- If E is PL and EC is PS or PL then KGC is ZE

Fig. 5 The rules in the linguistic rule table

Table 2 The linguistic rule table

Fuzzy      Fuzzy change of error →
error ↓    NL   NL   NS   NS   ZE   PS   PS   PL   PL
NL         ZE   ZE   ZE   ZE   NL   NL   NL   NL   NL
NL         ZE   ZE   ZE   ZE   NL   NL   NL   NL   NL
NS         ZE   ZE   ZE   ZE   NS   NS   NS   NS   NS
NS         ZE   ZE   ZE   ZE   NS   NS   NS   NS   NS
ZE         ZE   ZE   ZE   ZE   ZE   ZE   ZE   ZE   ZE
PS         PS   PS   PS   PS   PS   ZE   ZE   ZE   ZE
PS         PS   PS   PS   PS   PS   ZE   ZE   ZE   ZE
PL         PL   PL   PL   PL   PL   ZE   ZE   ZE   ZE
PL         PL   PL   PL   PL   PL   ZE   ZE   ZE   ZE

the fuzzified change of error are fed into the linguistic rule table block (Table 2)
and a corresponding value of KGC is output. The PID fuzzifier block obtains and
fuzzifies the PID gains from the PID gains block. The scaling factors for the PID
fuzzifier block are denoted SFpf, SFif and SFdf. The past states buffer is a storage
block for the past values of the PID gains. The number of PID gains held in the past states
buffer is based on the time lag of the system, and in turn the time lag depends on the
delay-in-reward. The new values of the gain correction (KGC) and the values in the
past states buffer generate new control rules in the rule reinforcement block when
the revolute-joint robot-arm output deviates from the setpoint:

$$K_{Prop(i)} = K_{Prop(i-N)} + K_{GC}. \tag{9}$$

$$K_{Int(i)} = K_{Int(i-N)} + K_{GC}. \tag{10}$$

$$K_{Deri(i)} = K_{Deri(i-N)} + K_{GC}. \tag{11}$$

where KProp(i−N), KInt(i−N) and KDeri(i−N) are the PID gains from the past states
buffer block, i is the sampling instant and N is the number of past samples before the
present sample. The new rules from the rule reinforcement block are transferred
continuously to the rule buffer block during the system operation.
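The rule production and modification steps of this section can be sketched in Python as follows. Several details are assumptions made only for illustration: the rounding used in `quantise`, the numeric value attached to each KGC label in `KGC_VALUE`, and taking N equal to the delay-in-reward so that the oldest buffer entry is sample i − N. The scaling by SFpf, SFif and SFdf in the PID fuzzifier is omitted, and all names are hypothetical.

```python
from collections import deque

# Linguistic label of each quantised level -4..+4 (header of Table 2).
LABELS = ["NL", "NL", "NS", "NS", "ZE", "PS", "PS", "PL", "PL"]

# Assumption: numeric gain correction assigned to each linguistic label.
KGC_VALUE = {"NL": -2.0, "NS": -1.0, "ZE": 0.0, "PS": 1.0, "PL": 2.0}


def quantise(value, scaling_factor):
    """Scale a signal and clip it to the levels -4..+4 (assumed rounding)."""
    return max(-4, min(4, round(value * scaling_factor)))


def gain_correction_label(e_level, ec_level):
    """Look up K_GC from the nine rules of Figure 5 (equivalently Table 2)."""
    e, ec = LABELS[e_level + 4], LABELS[ec_level + 4]
    if e == "ZE":
        return "ZE"
    if e in ("NL", "NS"):
        return "ZE" if ec in ("NL", "NS") else e
    return "ZE" if ec in ("PS", "PL") else e


def reinforce(past_gains, k_gc):
    """Equations (9)-(11): add K_GC to the gains stored N samples earlier."""
    kp, ki, kd = past_gains[0]  # oldest entry, taken here as sample i - N
    return (kp + k_gc, ki + k_gc, kd + k_gc)


if __name__ == "__main__":
    buffer = deque(maxlen=6)  # delay-in-reward = 6, as in Section 4
    for _ in range(6):
        buffer.append((50.0, 0.55, 1.0))
    label = gain_correction_label(quantise(10.0, 0.3), quantise(0.0, 4.0))
    print(label, reinforce(buffer, KGC_VALUE[label]))
```

Encoding the nine rules as conditions rather than as the full 9 × 9 grid keeps the sketch short; a direct table lookup would give the same labels.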

3 Kinematics and Dynamics of the Robot-Arm

The mathematical model of a revolute-joint robot-arm is taken as a non-linear dynamic
system and employed as a tool to study the behaviour of the self-organising fuzzy PID
controller, the fuzzy PID controller and the PID controller. The mathematical model outlines the
robot-arm by its rotational characteristics and comprises three sections: the structure
of the arm, the inverse-arm and the forward-arm. The structure of the robot-arm
consists of two sections, the kinematics and the dynamics. The kinematics describes
the relative positions between the links of the arm and gives the axes of rotation for
each of the joints [32]. The dynamics comprise the moment of inertia, the centre of
mass and the mass of each of the links [33]. The inverse-arm is a set of equations
which, when evaluated, yield the motor voltages required to produce particular accelerations.
This is the inverse of a real arm, which produces accelerations given the
voltages. The forward-arm is the process of applying voltages to each of the motors
and calculating the movements of the joints in the robot-arm.
The robot-arm model can accommodate up to seven links and six joints. The
seven links comprise link 0 to link 6, and the six joints consist of joint 1 to joint 6.
In the computer simulation, link 1 alone is treated as a single-input single-output system. A 2-link
or a 3-link arm represents a multi-input multi-output system, and link 0 is the static base. In
Figure 6, the Denavit–Hartenberg (D–H) [34] convention describes the kinematics

[Figure: joints i and i + 1, links i − 1, i and i + 1, axes xi−1, zi−1, xi and zi, and the parameters θi, θi+1, αi, ai and di]

Fig. 6 Robot-arm link coordination


of the links and joints such that link i rotates around the Zi−1 axis of link i − 1
when joint i turns. Similarly, link i + 1 rotates around Zi at joint i + 1, etc. Xi is
related to link i and points along the common normal of Zi and Zi−1. The D–H
representation of a link is based on four geometric parameters:
• θi is the angle between links, measuring the joint angle from the Xi−1 axis to the
Xi axis about the Zi−1 axis.
• αi is the twist of the link, the angle between axes Zi−1 and Zi about the Xi axis.
• ai is the length of the link, the shortest distance between the Zi−1 and Zi axes.
• di is the distance between the links, from the link i − 1 to the link i along the Zi−1
axis.
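The four parameters above are commonly combined into one homogeneous transformation per link. The chapter does not print this matrix, so the following Python sketch uses the standard textbook form of the classic D–H convention, which matches the parameter definitions given here; the function names and the two-link example values are hypothetical.

```python
import math


def dh_transform(theta, alpha, a, d):
    """Homogeneous transform from frame i-1 to frame i (classic D-H convention)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(m1, m2):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(m1[r][k] * m2[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


if __name__ == "__main__":
    # Hypothetical planar two-link arm: unit link lengths, no twist or offset.
    shoulder = dh_transform(math.radians(90), 0.0, 1.0, 0.0)
    arm = dh_transform(math.radians(-90), 0.0, 1.0, 0.0)
    tip = matmul(shoulder, arm)
    print([row[3] for row in tip[:3]])  # x, y, z of the arm tip in the base frame
```

Chaining one transform per joint in this way gives the position of each link relative to the static base, link 0.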
The driving force for each link is an armature controlled DC motor. The voltage is
applied at the input of the armature terminals and speed of rotation is produced at
the output. A second order differential equation is used to represent the dynamics of
a DC motor and load.
$$\frac{d^2 y}{dt^2} + f * \frac{dy}{dt} + r(t) = r(t)\,u. \tag{12}$$
where u is the input to the process, y is the output from the process, f is the friction,
and r(t) represents the small friction values, which vary with time. The non-linearities in a
revolute-joint robot-arm are caused by backlash, friction and motor characteristics.
In the robot-arm, the moment of inertia varies with time due to the movements of the
links. The DC motor dynamics form a time-variant system, which can represent small
friction values and changes in the moment of inertia of the motor and load [35].
Varying the term r(t), which stands for small friction values, changes the moment
of inertia of the motor and load. A sharp decrease or increase in
the moment of inertia makes the system more difficult to control. The third-order
Runge–Kutta method [36] is used to integrate the second-order dynamic equation.
To simulate the noise, a random number generator program is used to produce 5,000
different numbers. This is based on a linear congruential random number generator,
which gives a distribution close to rectangular. In accordance with observations
made with a practical system, the output is scaled to give a deviation of ±0.8 units,
which is added to the process output.
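The integration and noise steps described above can be sketched in Python as follows. The chapter names neither the third-order Runge–Kutta variant nor the generator constants, so Kutta's classical third-order scheme and a commonly used set of linear congruential constants are assumed here. `motor_rhs` follows equation (12) as printed, and the input, friction and r(t) values in the example are hypothetical.

```python
def rk3_step(f, t, state, h):
    """One step of Kutta's third-order method for state' = f(t, state)."""
    k1 = f(t, state)
    k2 = f(t + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
    k3 = f(t + h, [s - h * a + 2 * h * b for s, a, b in zip(state, k1, k2)])
    return [s + h / 6 * (a + 4 * b + c) for s, a, b, c in zip(state, k1, k2, k3)]


def motor_rhs(u, friction, r):
    """Right-hand side of equation (12) as printed, with state = [y, dy/dt]."""
    def f(t, state):
        y, v = state
        return [v, r(t) * u - friction * v - r(t)]
    return f


def lcg_noise(count=5000, amplitude=0.8, seed=1):
    """Rectangular noise in [-amplitude, +amplitude) from a linear congruential generator."""
    a, c, m = 1664525, 1013904223, 2 ** 32  # assumed constants, not from the chapter
    state, samples = seed, []
    for _ in range(count):
        state = (a * state + c) % m
        samples.append((2 * state / m - 1) * amplitude)
    return samples


if __name__ == "__main__":
    rhs = motor_rhs(u=5.0, friction=2.0, r=lambda t: 1.0)  # hypothetical values
    state, h = [0.0, 0.0], 0.012
    noise = lcg_noise()
    for k in range(100):
        state = rk3_step(rhs, k * h, state, h)
    print(state[0] + noise[0])  # noisy process output after 100 samples
```

Passing r as a function of time lets the same integrator represent the time-variant case, for example a step change in r(t) part-way through a run.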

4 Computer Simulation Results

The self-organising fuzzy PID controller is applied to a revolute-joint robot-arm
for step input and trapezium waveform tracking experiments. The results are
compared with a fuzzy PID controller and a conventional PID controller. For the
step input two different methods are utilised. In method 1 (Section 4, part A),
the fuzzy PID controller and the self-organising fuzzy PID controller readjust
the three PID gains, KP, KI and KD, while the system is in operation. In contrast,
in method 2 (Section 4, part B), the fuzzy PID controller and the self-organising
fuzzy PID controller only readjust the KP gain, and the Ziegler–Nichols [37] tuning method
calculates the corresponding values of KI and KD. Method 1 is only suitable for a step
input, since the values of KP, KI and KD are readjusted during the rise time, the steady
state error period, and the interval between the rise time and the steady state error period,
respectively. For a multi-input multi-output path tracking experiment such as the trapezium
waveform, the rise time, the steady state error and the overshoot do not apply.

A) Step input experiments using method 1

The step input experiments are intended to produce some initial results for the fuzzy PID
controller, the self-organising fuzzy PID controller and the conventional PID controller.
For a revolute-joint robot-arm with pick and place in mind, the parameters for the
three controllers are tuned to obtain an appropriate damping around the setpoint,
minimise the overshoot and reduce the steady state error. The PID gains KP, KI
and KD are initially tuned off line, without the fuzzy controllers. The experimental
results presented in this article are based on the following off-line tuning method.
Firstly, a large value of KP is chosen and the KP value is gradually reduced until
the process output overshoot is minimised. Subsequently, KI and KD are tuned;
and finally KP is re-tuned to obtain the best possible output response. Once the PID
gains KP, KI and KD are tuned, the fuzzy controllers readjust the PID gains during
the system operation. For the step input experiments, only one of the PID gains is
readjusted at a time. From the start of the signal to the point near to the setpoint, KP is
readjusted to improve the rise time; from this point until approaching the steady state
error region, KD is readjusted to dampen the overshoot; and lastly, KI is readjusted
to reduce the steady state error.
One experiment for the fuzzy PID controller and one experiment for the self-organising
fuzzy PID controller are outlined with the following parameter values: the initial
PID gains are KP = 50, KI = 0.55, KD = 1.0; ESF = 0.3, CESFL for the linguistic
rule table = 4, CESFF for the fuzzy control = 12; delay-in-reward = 6; the descaling factor
coefficients for the defuzzifier block are KCP = 0.5, KCI = 0.05, KCD = 0.1; the
scaling factors for the PID fuzzifier block are SFpf = 0.12, SFif = 12 and SFdf = 6;
and the linguistic rule table of Table 2 is used. Figures 7 and 8 demonstrate examples

[Figure: process output (0 to 80 degrees) against sample number (0 to 240)]

Fig. 7 Step response, method 1: using the fuzzy PID controller. Scaling: X-axis: 12 ms/sample;
Y-axis: output in degrees
[Figure: process output (0 to 80 degrees) against sample number (0 to 240)]

Fig. 8 Step response, method 1: using the self-organising fuzzy PID controller, run number 5.
Scaling: X-axis: 12 ms/sample; Y-axis: output in degrees

of process output for a step input using the fuzzy PID controller (method 1) and the
self-organising fuzzy PID controller (method 1), respectively. The Y-axis is the
process output in degrees and the X-axis is the sample number, at 12 ms
per sample. As the figures indicate, the process output of the self-organising
fuzzy PID controller is an improvement on that of the fuzzy PID controller. Due to
less computation in the simulation, the rise time is slightly faster for the fuzzy PID
controller than for the self-organising fuzzy PID controller. The overshoot is virtually
non-existent for both the self-organising fuzzy PID controller and the fuzzy PID
controller. The steady state error is considerably smaller for the self-organising fuzzy
PID controller than for the fuzzy PID controller. This is because the self-organising
fuzzy PID controller continuously changes the contents of the rule buffer block during
the system operation. In contrast, the contents of the rule buffer block for the fuzzy
PID controller are predetermined, prior to the experiments being carried out.

B) Step input experiments using method 2


The PID gains are initially tuned off line, using the same tuning procedure explained
in Section 4, part A. The fuzzy PID controller and the self-organising fuzzy
PID controller readjust the values of KP. The corresponding values of KI and KD
are calculated using the Ziegler–Nichols tuning method. The method assumes that the
proportional PID gain KP is 60% of the gain KOSC at the time of oscillation, the
integral time constant TI is 50% of the oscillation period TOSC and the derivative
time constant TD is 12.5% of the oscillation period TOSC.

$$K_P = 0.6 * K_{OSC}. \tag{13}$$

$$T_I = 0.5 * T_{OSC}. \tag{14}$$

$$T_D = 0.125 * T_{OSC}. \tag{15}$$
The Ziegler–Nichols tuning method is based on continuous systems and can also
be used on discrete systems with a fast sampling time. By using the PID controller
equations, KI = KP/TI and KD = KP TD, and the Ziegler–Nichols equations (13), (14)
and (15), equations (16) and (17) are obtained:

$$K_I = 2 K_P / T_{OSC}. \tag{16}$$

$$K_D = 0.125 * K_P * T_{OSC}. \tag{17}$$


By using the universes of discourse UPi, UIi and UDi in place of KP, KI and KD
respectively, equations (18) and (19) are obtained:

$$U_{Ii} = 2 U_{Pi} / T_{OSC}. \tag{18}$$

$$U_{Di} = 0.125 * U_{Pi} * T_{OSC}. \tag{19}$$


The equivalent values of UIi and UDi from equations (18) and (19) are substituted into
equations (6) and (7), and equations (20) and (21) are obtained. Equations (20) and (21)
constitute the mathematical calculation of the values of KI and KD using the
Ziegler–Nichols method. Equation (5) remains the same.

$$K_{I(\text{Fuzzy-apps})} = K_I + (2 U_{Pi} / T_{OSC}) * K_{CI}. \tag{20}$$

$$K_{D(\text{Fuzzy-apps})} = K_D + (0.125 * U_{Pi} * T_{OSC}) * K_{CD}. \tag{21}$$
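The method 2 update can be sketched in a few lines of Python. The function names are hypothetical, and the oscillation period and defuzzified output used in the example are invented for illustration; the initial gains and descaling factor coefficients are those quoted in Section 4, part A.

```python
def zn_from_kp(kp, t_osc):
    """Equations (16)-(17): K_I and K_D implied by K_P and the oscillation period."""
    return (2.0 * kp / t_osc, 0.125 * kp * t_osc)


def method2_gains(kp, ki, kd, u_p, t_osc, kcp, kci, kcd):
    """Readjust K_P by equation (5) and derive K_I, K_D by equations (20)-(21)."""
    u_i = 2.0 * u_p / t_osc     # equation (18)
    u_d = 0.125 * u_p * t_osc   # equation (19)
    return (kp + u_p * kcp, ki + u_i * kci, kd + u_d * kcd)


if __name__ == "__main__":
    # u_p and t_osc are hypothetical values for one sampling instant.
    print(method2_gains(kp=50.0, ki=0.55, kd=1.0, u_p=2.0, t_osc=0.5,
                        kcp=0.5, kci=0.05, kcd=0.1))
```

Only the proportional fuzzy output is supplied, which reflects the point of method 2: the fuzzy section decides one quantity and the other two follow from the Ziegler–Nichols relations.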


One experiment out of the many carried out is outlined here, using the
same parameter values as Section 4, part A. Method 2 reduces the number of scaling factors
in the PID fuzzifier block from three to one, SFpf = 0.12, as only the proportional gain
is used in the rule production and modification section. Comparing Figures 9
and 10, the self-organising fuzzy PID controller shows an improvement over the
fuzzy PID controller using method 2. As explained in Section 4, part A, the rise time
for the fuzzy PID controller is slightly faster than for the self-organising fuzzy PID
controller, and the overshoot is non-existent for both controllers. The steady state
error is notably smaller for the self-organising fuzzy PID controller than for the fuzzy

[Figure: process output (0 to 80 degrees) against sample number (0 to 240)]

Fig. 9 Step response, method 2: using the fuzzy PID controller. Scaling: X-axis: 12 ms/sample;
Y-axis: output in degrees
[Figure: process output (0 to 80 degrees) against sample number (0 to 240)]

Fig. 10 Step response, method 2: using the self-organising fuzzy PID controller, run number 4.
Scaling: X-axis: 12 ms/sample; Y-axis: output in degrees

PID controller, as the self-organising fuzzy PID controller continuously changes the
contents of the rule buffer during the system operation. Finally, comparing the fuzzy
PID controller and the self-organising fuzzy PID controller using method 1 (Figures
7 and 8) and method 2 (Figures 9 and 10), the two methods produce very similar
results in computer simulation. However, method 1 and method 2 might produce
different results in practical applications. For a step input experiment, after initial
tuning of the PID gains using conventional methods, it is possible to predict which
of the three PID gains should be readjusted by the rule production and modification
section to further improve the process output response. However, it should be noted
that readjusting the gains KP, KI and KD improves some parts of the output response
and degrades others. For instance, for a step input, the proportional gain KP
has a direct effect on the rise time and oscillation, the integral gain KI reduces the
steady state error but increases the possibility of instability, and the derivative gain
KD reduces the overshoot but may cause major fluctuations in the process output
in the presence of high rates of change such as noise. In contrast, for the path tracking
experiments, with continuous changes at the setpoint and from the process itself
during the system operation, one cannot instantaneously decide which PID gains
should be readjusted in order to obtain an optimum path. Therefore, it is better to
apply method 2 to the path tracking experiments, as only KP needs readjusting by
the rule production and modification section.
Figure 11 compares the fuzzy PID controller (method 2) and the self-organising
fuzzy PID controller (method 2) with the PID controller. In Figure 11, the steady
state errors are about 1.6% for the fuzzy PID controller, 1.1% for the self-organising
fuzzy PID controller and 2.3% for the PID controller. The overshoot is negligible
for the fuzzy PID controller and the self-organising fuzzy PID controller. However,
the overshoot is high for the PID controller. This is because the derivative
part of the PID controller causes the process output to fluctuate in the presence of
high rates of change such as noise.

[Figure 11: process output (degrees) against sample number (0 to 240) for the fuzzy PID
controller, the self-organising fuzzy PID controller and the PID controller.]

Fig. 11 Step response, method 2: using the fuzzy PID Controller, the self-organising fuzzy PID
controller run number 4, and the PID controller. Scaling: X-axis: 12 ms/sample Y -axis: outputs -
degrees

C) Trapezium waveform experiments


The fuzzy PID, the self-organising fuzzy PID and the conventional PID are all
single-input single-output controllers. As a result, for the two-input two-output
robot-arm, the experiments bring together the simultaneous operation of two individual
controllers, one controlling the shoulder movement and the other controlling the arm
movement. Each controller considers its joint as a single-input single-output system,
learning its rules in the face of cross-coupling effects experienced from the other
system. To trace the trapezium waveform of varying amplitudes, the two joint angles are
moved using kinematic transformations and joint angle manipulations. The sampling time
is chosen to be 6 ms and the step size for Runge-Kutta integration is 8 ms. The rule
buffer block in the self-organising fuzzy PID controller initially has no rules. The
controller learns the appropriate control rule strategy by itself by going through a
series of training runs (RUNS), during which it produces and modifies its database. If
a stable control rule strategy is reached, then no rules will be produced or
modified in the subsequent RUNS. The maximum number of RUNS in the experiments
carried out was 6. For the path tracking experiments, 6 RUNS were carried
out for the fuzzy PID controller, the self-organising fuzzy PID controller and the
PID controller, in order to obtain measurements of the performances provided by
the Integral of the Absolute magnitude of the Error (IAE) criterion.
For comparison purposes, the experiments outlined here start from sample number
200. This allows the self-organising fuzzy PID controller about 1.2 seconds
(200 samples × 6 ms = 1.2 s) to build its database in the rule buffer
block. In Figures 12 and 14, a path tracking experiment for a trapezium waveform
with the following parameters is outlined using two fuzzy PID controllers:
ESF = 0.4, CESF = 10, KCPS = 0.35, KCIS = 0.04, KCDS = 0.1, KCPA = 0.3,
KCIA = 0.04, KCDA = 0.09. The Min implication function with the Mean of Maxima
is used in Figure 12 and the Max-product implication function with the Centre

Fig. 12 (a,b). Tracking a trapezium waveform: using two fuzzy PID controllers, Min implica-
tion function with Mean-of-Maxima defuzzification method, run number 4. Scaling: X-axis: 6 ms/
sample Y -axis: outputs - degrees

Fig. 13 (a,b). Tracking a trapezium waveform: using two self-organising fuzzy PID controllers,
Min implication function with Mean-of-Maxima defuzzification method, run number 5. Scaling:
X-axis: 6 ms/sample Y -axis: outputs - degrees

Fig. 14 (a,b). Tracking a trapezium waveform: using two fuzzy PID controllers, Max-product
implication function with Centre-of-Gravity defuzzification method, run number 5. Scaling:
X-axis: 6 ms/sample Y -axis: outputs - degrees

of Gravity is used in Figure 14. In Figures 13 and 15, a path tracking experiment
for a trapezium waveform with the following parameter values is shown, using two
self-organising fuzzy PID controllers: ESF = 0.45, the change-of-error scaling fac-
tor for the fuzzy control block CESFF = 12, the change-of-error scaling factor for
the linguistic rule table block CESFL = 6, SFp f = 1.1, KCPS = 0.4, KCIS = 0.05,
KCDS = 0.11, KCPA = 0.35, KCIA = 0.05, KCDA = 0.1 and delay-in-reward = 6.
The Min implication function with the Mean of Maxima is used in Figure 13 and the
Max - product implication function with the Centre of Gravity is used in Figure 15.

Fig. 15 (a, b). Tracking a trapezium waveform: using two self-organising fuzzy PID controllers,
Max-product implication function with Centre-of-Gravity defuzzification method, run number 3.
Scaling: X-axis: 6 ms/sample Y -axis: outputs - degrees

For the purpose of comparison, the gains in the PID gains block for the fuzzy PID
controller and the self-organising fuzzy PID controller are chosen to be the same:
KP[S] = 4.5, KI[S] = 1.6, KD[S] = 1.15 for the shoulder, and KP[A] = 4, KI[A] = 1.3,
KD[A] = 1.1 for the arm. For the path tracking experiments, two self-organising
fuzzy PID controllers trace the trapezium waveform more closely and smoothly than two
fuzzy PID controllers (see Figures 12–15). Increasing the amplitude and frequency
of the trapezium waveform affects the fuzzy PID controller more than the
self-organising fuzzy PID controller. This indicates that the self-organising fuzzy PID
controller can react quickly to the changes experienced both at the setpoint and from
the process. Numerous experiments have been carried out with different implication
functions and defuzzification methods using different fuzzy controllers.
Yamazaki [23] used the Max-product implication function in conjunction with
the Centre-of-Gravity defuzzification method and concluded that the process output
is smoother. Lembessis [38] combined the Min implication function [20] with
the Mean-of-Maxima defuzzification method [21] and argued that this combination
produces a faster convergence to the setpoint. Some initial experiments were
carried out in this research to apply the fuzzy PID controller and the self-organising
fuzzy PID controller to a revolute-joint robot-arm using the Max-product implication
function with the Centre-of-Gravity defuzzification method, as well as the Min
implication function with the Mean-of-Maxima defuzzification method. The experimental
results of Figures 12 and 13 show that the Min implication function with the
Mean of Maxima produces a faster convergence to the setpoint. The experimental
results of Figures 14 and 15 also reveal that the Max-product implication function with
the Centre of Gravity produces a smoother transient response. Compared with two PID
controllers, the process output response is much better for two fuzzy PID controllers
and two self-organising fuzzy PID controllers (see Figures 12–16). An introduction
of noise to the system produces fewer disturbances in the process output response for
the fuzzy PID controller and the self-organising fuzzy PID controller than for the PID
controller.
In many cases, the number of rules that define different output conditions is
limited. Consequently, it often happens that no rule satisfies a particular output.
This is one of the biggest drawbacks of fuzzy controllers, as it undermines the

[Figure 16: (a) shoulder output and (b) arm output, in degrees, against sample number
(200 to 600).]

Fig. 16 (a, b). Tracking a trapezium waveform: using two PID controllers, run number 4. Scaling:
X-axis: 6 ms/sample Y -axis: outputs - degrees

Table 3 Trapezium path tracking experiments (MM: Min implication function with Mean of
Maxima; COG: Max-product implication function with Centre of Gravity; "-" marks a run
with no corresponding figure)

Fuzzy PID (MM)      Self-organising     Fuzzy PID (COG)     Self-organising     PID
                    PID (MM)                                PID (COG)
IAE    run  Fig.    IAE    run  Fig.    IAE    run  Fig.    IAE    run  Fig.    IAE    run  Fig.
51.32  2    -       46.23  3    -       50.12  3    -       45.77  4    -       53.15  4    15
50.01  3    -       47.91  2    -       48.54  5    13      44.31  6    -       54.26  3    -
48.76  5    -       45.65  4    -       47.33  6    -       47.12  2    -       55.98  3    -
49.36  4    11      46.73  3    -       51.72  4    -       46.64  3    -       52.23  6    -
47.11  6    -       44.85  5    12      49.11  5    -       45.39  5    -       51.76  5    -
50.68  5    -       45.89  4    -       49.89  4    -       44.87  5    -       53.19  2    -
49.71  6    -       44.18  6    -       50.79  2    -       46.34  3    14      54.38  4    -

efficiency of such controllers. To overcome this, neighbouring control outputs are
used. In other words, for a given output, the control algorithms will check if there
is a corresponding rule. If there is not, then the rules in the closest neighbourhood
will be used. The extent of the neighbouring control output distance is determined
by the user; a distance of 1 unit is used in this work.
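The neighbourhood fallback described above can be sketched as a table lookup. This is an illustration under stated assumptions: the rule buffer is modelled as a dictionary keyed by a hypothetical pair of quantised indices (error, change of error), the name `lookup_rule` is invented here, and averaging the neighbouring outputs is one possible way of combining them that the text does not specify.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]  # assumed key: (quantised error, quantised change of error)


def lookup_rule(rules: Dict[Cell, float], cell: Cell,
                max_distance: int = 1) -> Optional[float]:
    """Return the rule output for `cell`, falling back to neighbouring cells."""
    if cell in rules:                       # an exact rule exists
        return rules[cell]
    e, ce = cell
    neighbours = [
        rules[(e + de, ce + dce)]
        for de in range(-max_distance, max_distance + 1)
        for dce in range(-max_distance, max_distance + 1)
        if (e + de, ce + dce) in rules
    ]
    if not neighbours:                      # no rule within the allowed distance
        return None
    return sum(neighbours) / len(neighbours)  # assumption: average the neighbours


rule_buffer = {(0, 0): 0.0, (1, 0): 2.0, (1, 1): 4.0}
exact = lookup_rule(rule_buffer, (1, 0))      # exact match
borrowed = lookup_rule(rule_buffer, (2, 1))   # uses neighbours (1, 0) and (1, 1)
missing = lookup_rule(rule_buffer, (5, 5))    # nothing within one unit
print(exact, borrowed, missing)
```

With `max_distance=1` the search covers the eight cells surrounding the requested one, which corresponds to the distance of 1 unit used in this work.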
A system is considered an optimum control system, when the system parameters
are adjusted so that the performance index reaches a minimum positive value or
zero. The Integral of the Absolute magnitude of the Error (IAE) criterion is a suitable
performance index.

\[
\mathrm{IAE} = \int_{0}^{T} |e(t)| \, dt \tag{22}
\]

In Equation (22), T is a finite time chosen so that the integral approaches a steady
state value. In the trapezium waveform experiments, the IAE criterion provides useful
information in the analysis of the path tracking ability of the system. Table 3
presents the IAE performance for the fuzzy PID controller, the self-organising fuzzy
PID controller and the PID controller. As already explained in this section, the Min
implication function with the Mean-of-Maxima defuzzification method as well as
the Max-product implication function with the Centre-of-Gravity defuzzification
method are used for the fuzzy PID controller and the self-organising fuzzy PID
controller. The performances of the three controllers can be evaluated by comparing
the IAE values in Table 3; lower values are usually indicative of better
performance. However, for the trapezium waveform experiments, the lowest
value of the IAE criterion does not always produce the best tracking performance.
For instance, in some experiments with very close path tracking, the process output
response had an unexpected initial overshoot.
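For sampled data, the IAE criterion of Equation (22) can be approximated by a sum. The sketch below assumes a rectangular approximation with a fixed sampling period; the function name `iae` and the error samples are hypothetical and are not measurements from the experiments.

```python
from typing import Sequence


def iae(errors: Sequence[float], ts: float) -> float:
    """Approximate the integral of |e(t)| by a rectangular sum over the samples."""
    return sum(abs(e) for e in errors) * ts


# Hypothetical error samples in degrees, taken every 6 ms.
errors = [2.0, -1.0, 0.5, -0.5, 0.0]
iae_value = iae(errors, ts=0.006)
print(iae_value)
```

Because the absolute value is taken before summing, positive and negative tracking errors cannot cancel each other. A single scalar of this kind does not show where along the path the error occurs, which is consistent with the observation above that a low IAE value can coexist with an initial overshoot.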

5 Conclusion

For the step input experiments, the fuzzy PID controller and the self-organising
fuzzy PID controller are applied to a non-linear robot-arm using computer simulation.
The results of the computer simulation for the fuzzy PID controller and the
self-organising fuzzy PID controller are compared with those of a conventional PID
controller subject to the same data provided at the setpoint, in order to analyse the
results and also obtain some information about the tuning procedure. The results of the
step input experiments for the fuzzy PID controller and the self-organising fuzzy PID
controller demonstrate that the first method, readjusting the three PID gains
individually, produces virtually the same results as the second method, readjusting
the proportional gain first and applying the Ziegler–Nichols method to
calculate the corresponding values of the integral and derivative gains. In general,
the rise time for the fuzzy PID controller is shorter than for the self-organising fuzzy
PID controller. The steady state error is smaller for the self-organising fuzzy PID
controller than for the fuzzy PID controller. The overshoot for the fuzzy PID controller
and the self-organising fuzzy PID controller is virtually non-existent. It is concluded
that for the step input experiments, the novel self-organising fuzzy PID controller
is capable of producing a better process output than the fuzzy PID controller and
the PID controller in controlling a non-linear robot-arm. An introduction of noise to
the system creates fewer disturbances in the process output response for the fuzzy PID
controller and the self-organising fuzzy PID controller than for the PID controller.
The fuzzy PID controller and the self-organising fuzzy PID controller are both
also applied to a non-linear revolute-joint robot-arm for a path tracking experiment
to track a trapezium waveform. The new self-organising fuzzy PID controller
traces the trapezium better than the fuzzy PID controller. This is because
the rules in the rule buffer are updated and changed constantly during the application
of the self-organising fuzzy PID controller to the process. The results of the
experiments for the fuzzy PID controller and the self-organising fuzzy PID controller
provide a smoother process output response using the Max-product implication
function with the Centre-of-Gravity defuzzification method. The experimental
results for the fuzzy PID controller and the self-organising fuzzy PID controller
present a swifter convergence to the setpoint using the Min implication function with
the Mean-of-Maxima defuzzification method. For the path-tracking experiments,
the fuzzy PID controller and the self-organising fuzzy PID controller both produce
a better process output response than the PID controller, in the presence of noise
and time-variant dynamics.

References

1. M.M. Zavarei and M. Jamshidi. Time-delay systems — analysis, optimisation and applica-
tions. Amsterdam: North-Holland Systems and Control Series, vol. 9, 1987
2. D.P. Atherton. PID controller tuning. IEE Computing & Control Engineering journal,
pp. 44–50, April 1999
3. P. Airikka. PID controller: algorithm and implementation. IEE Computing & Control Engi-
neering journal, pp. 6–11, Dec/Jan 2003/2004
4. M.S. Fodil, P. Siarry, F. Guely and J.L. Tyran. A fuzzy rule base for the improved control
of a pressurised water nuclear reactor. IEEE Transactions on Fuzzy Systems, vol. 8, no. 1,
pp. 1–10, February 2000
5. J.S. Won and R. Langari. Fuzzy torque distribution control for a parallel hybrid vehicle. Expert
Systems, Int. J. of Knowledge Engineering and Neural Networks, vol. 19, no. 1, pp. 4–10,
February 2002
6. S.X. Yang, H. Li, M.Q.-H. Meng and P.X. Liu. An embedded fuzzy controller for a behaviour-
based mobile robot with guaranteed performance. IEEE Transactions on Fuzzy Systems,
vol. 12, no. 4, pp. 436–446, August 2004
7. W. Li. Design of a hybrid fuzzy logic proportional plus conventional integral-derivative con-
troller. IEEE Trans. Fuzzy Systems, vol. 6, no. 4, pp. 449–463, 1998
8. R.K. Mudi and N.R. Pal. A robust self-tuning scheme for PI- and PD-type fuzzy controllers.
IEEE Trans. on Fuzzy Systems, vol. 7, no. 1, pp. 2–16, 1999
9. G.K.I. Mann, B.G. Hu and R.G. Gosine. Two level tuning of fuzzy PID controllers. IEEE
Transactions on Systems, Man and Cybernetics, Part B, vol. 31, no. 5, pp. 263–269, April
2001
10. K.S. Tang, K.F. Man, G. Chen and S. Kwong. An optimal fuzzy PID controller. IEEE Trans-
actions on Industrial Electronics, vol. 48, no. 4, pp. 757–765, August 2001
11. B.G. Hu, G.K.I. Mann and R.G. Gosine. A systematic study of fuzzy PID controllers-function-
based evaluation approach. IEEE Transactions on Fuzzy Systems, vol. 9, no. 5, pp. 699–712,
October 2001
12. R.S. Ranganathan, H.A. Malki and G. Chen. Fuzzy predictive PI control for processes with
large time delays. Expert Systems, Int. J. of Knowledge Engineering and Neural Networks,
vol. 19, no. 1, pp. 21–33, February 2002
13. G.K.I. Mann and R.G. Gosine. Adaptive hierarchical tuning of fuzzy controllers. Expert
Systems, Int. J. of Knowledge Engineering and Neural Networks, vol. 19, no. 1, pp. 34–45,
February 2002
14. Y. Zhao and E.G. Collins Jr. Fuzzy PI control design for an industrial weigh belt feeder. IEEE
Trans. Fuzzy Systems, vol. 11, no. 3, pp. 311–319, June 2003
15. E. Yesil, M. Guzelkaya and I. Eksin. Self tuning fuzzy PID type load and frequency controller.
Energy Conversion and Management Journal, vol. 45, no. 3, pp. 377–390, ISSN. 0196-8904,
2004
16. B. Moshiri and F. Rashidi. Self-tuning based fuzzy PID controllers: application to control
of nonlinear HVAC systems. Intelligent Data Engineering and Automated Learning - IDEAL
2004, vol. 3177, pp. 437–442, ISBN. 978-3-540-22881-3, October 2004
17. O. Karasakal, E. Yesil, M. Guzelkaya and I. Eksin. Implementation of a new self-tuning fuzzy
PID controller on PLC. Turk Journal of Elec. Eng., vol. 13, no. 2, pp. 277–286, 2005
18. S. Assilian. Artificial Intelligence in the control of real dynamic systems. PhD. Thesis, Queen
Mary University of London, 1974
19. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision
processes. IEEE Trans. Syst., Man and Cybern., vol. 3, no. 1, pp. 28–44, 1973
20. E.H. Mamdani. Advances in linguistic synthesis of fuzzy controllers. Int. J. Man-Machine
Studies, vol. 8, pp. 669–678, 1976
21. W. Pedrycz. Fuzzy control and fuzzy systems, Second Extended Edition. Research Studies
Press LTD, Taunton, Somerset, England TA1 1HD, 1993
22. I.P. Holmblad and J.J. Ostergaard. Fuzzy logic control: operator experience applied in auto-
matic process control. FLS Review, F.L. Smidth & Co., 77 Vigerslev Alle, DK-2500, Valby,
Copenhagen, Denmark, vol. 45, pp. 11–16, 1981
23. T. Yamazaki. An improved algorithm for a self-organising controller. PhD. Thesis, Queen Mary
University of London, 1982
24. Y.F. Li and C.C. Lau. Development of Fuzzy Algorithms for Servo Systems. IEEE Control
Systems Magazine, pp. 65–72, April 1989
25. E.H. Mamdani and N. Baaklini. Prescriptive method for deriving control policy in a fuzzy
logic controller. Electronics Letters, vol. 1, pp. 625–626, 1975
26. T.J. Procyk and E.H. Mamdani. A Linguistic self-organising process controller. Automatica,
vol. 15, pp. 15–30, 1979
27. H.B. Kazemian and E.M. Scharf. An application of multi-input multi-output self organising
fuzzy controller for a robot-arm. IEEE Int. Journal Neural Network World, vol. 6, no. 4,
pp. 631–641, 1996
28. H.B. Kazemian. Study of learning fuzzy controllers. Expert Systems: The Int. Journal of
Knowledge Engineering and Neural Networks. Blackwell publishers Ltd., vol. 18, no. 4,
pp. 186–193, September 2001
29. H.B. Kazemian. Comparative study of a learning fuzzy PID controller and a self-tuning con-
troller. ISA Transactions the Int. Journal of Science and Engineering of Measurement and
Automation. Elsevier Science Ltd., vol. 40, no. 3, pp. 245–253, July 2001
30. H.B. Kazemian. The SOF-PID controller for the control of a MIMO robot-arm. IEEE Trans-
actions on Fuzzy Systems, vol. 10, no. 4, pp. 523–532, August 2002
31. H.B. Kazemian. Developments of fuzzy PID controllers. Expert Systems: The Int. Journal
of Knowledge Engineering and Neural Networks. Blackwell publishers Ltd., vol. 22, no. 5,
pp. 254–264, November 2005
32. J. Denavit and R.S. Hartenberg. A kinematic notation for lower-pair mechanisms based on
matrices. J. Applied Mechanics, pp. 215–221, 1955
33. M.W. Walker and D.E. Orin. Efficient dynamic computer simulation of robotics mechanisms.
J. Dyn. Sys., Meas., and Control, vol. 104, pp. 205–211, 1982
34. K.S. Fu, R.C. Gonzalez and C.S.G. Lee. Robotics: control, sensing, vision, and intelligence.
McGraw-Hill Int. Eds., Industrial Engineering Series, 1988
35. R.C. Dorf and R.H. Bishop. Modern control systems. Addison-Wesley Publishing Company,
10th Ed., 2004
36. W. Bolton. Essential mathematics for engineering. Butterworth Heinemann Publishing Com-
pany, 1st Ed., 1997
37. J.G. Ziegler and N.B. Nichols. Optimum settings for automatic controllers. Transaction of
ASME, vol. 65, pp. 433–444, 1943
38. E. Lembessis. Dynamic learning behaviour of a rule-based self organizing controller. Ph.D.
Thesis, Queen Mary University of London, UK, 1984
Stability Analysis and Performance Design
for Fuzzy Model-based Control Systems
using a BMI-based Approach

H.K. Lam, Member, IEEE and F.H.F. Leung, Senior Member, IEEE

Abstract This chapter presents the stability analysis and performance design for
nonlinear systems. To facilitate the stability analysis, the T-S fuzzy model is
employed to represent the nonlinear plant. A fuzzy controller with enhanced
stabilization ability is proposed to close the feedback loop. Membership functions
different from those of the fuzzy model are used by the fuzzy controller to
simplify its structure. However, in such a case, an imperfect premise-matching
condition results, which leads to conservative stability conditions. To reduce
the conservativeness, the information of the membership functions of the fuzzy
model and controller is employed. The enhanced stabilization ability of the fuzzy
controller is able to further relax the stability conditions. However, the stability
conditions derived using the Lyapunov-based approach are in the form of bilinear
matrix inequalities (BMIs), whose solution is difficult to find. A
genetic-algorithm-based convex programming technique is proposed to solve the
BMIs. BMI-based performance conditions subject to a scalar performance
index are derived to guarantee the system performance. Simulation examples are
given to illustrate that the proposed approach provides a systematic and
effective way to help design stable and well-performing fuzzy model-based control
systems.

Keywords: Fuzzy control; Lyapunov stability; Genetic algorithm; Stability analysis

H.K. Lam, Member, IEEE
Department of Electronic Engineering, Division of Engineering, The King’s College London,
WC2R 2LS, United Kingdom
F.H.F. Leung, Senior Member, IEEE
Centre for Multimedia Signal Processing, Department of Electronic and Information Engineering,
The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 261–281.
© 2008 Springer.
262 H.K. Lam and F.H.F. Leung

1 Introduction

The T-S fuzzy modelling approach [1, 2] provides a systematic framework to rep-
resent nonlinear plants and facilitates the stability analysis and controller synthesis.
Using the Lyapunov-based method, various stability conditions [3–11] have been
derived to guarantee the system stability. Furthermore, stability conditions can be
expressed in terms of linear matrix inequalities (LMIs) [12] of which the solution
can be found by using some convex programming techniques.
In general, two cases of fuzzy model-based control systems have been investigated.
In the first case, the fuzzy controller is designed under the imperfect
premise-matching condition, in which the fuzzy model and the fuzzy controller do not
share the same premises. In [3, 4], LMI-based stability conditions were derived to
guarantee the system stability of this class of fuzzy model-based control systems.
Under the imperfect premise-matching condition, the fuzzy controller exhibits two
favourable features. First, the premise membership functions can be freely designed
so that the design flexibility for the fuzzy controllers is enhanced. Some simple
and commonly used membership functions can be employed to lower the structural
complexity, computational demand and implementation cost of the fuzzy controller.
Second, the fuzzy controller displays an inherent robustness property to handle
parameter uncertainties of the nonlinear plant. In [3, 4], it can be seen that the
stability conditions are not related to the membership functions of the nonlinear plant.
Consequently, the fuzzy controller designed under the imperfect premise-matching
condition is able to stabilize a nonlinear plant whose fuzzy model is subject to
uncertain grades of membership due to the presence of parameter uncertainties. However,
the imperfect premise-matching condition will lead to conservative stability conditions
as the membership functions of the fuzzy model are not considered during the stability
analysis. This problem is partially answered by the second case of fuzzy model-based
control system design. In this case, the fuzzy controller is designed under the
perfect premise-matching condition. Unlike the imperfect premise-matching condition,
the fuzzy model and the fuzzy controller share the same premises during the
design of the fuzzy controller. As the membership functions of the fuzzy model are
considered during stability analysis, the stability conditions can be relaxed [4–11].
However, as the grades of membership must be known, the fuzzy
model considered in [4–11] must be uncertainty free. Hence, under the perfect
premise-matching condition, the stability conditions are relaxed by sacrificing the
inherent robustness property of the fuzzy controller. It can be seen that the fuzzy
controllers designed under the imperfect and perfect premise-matching conditions
cannot replace each other; each has its own advantages in various applications.
In this chapter, the stability of fuzzy model-based control systems under the
imperfect premise-matching conditions is investigated. As revealed by the stability
analysis results of fuzzy model-based systems under perfect premise-matching con-
ditions [4–11] and the preliminary stability analysis result in [13] published by the
same authors, the information of the fuzzy model is important to relax the stability
conditions. The knowledge on the membership functions of the fuzzy model is
employed to design the membership functions of the fuzzy controller. Furthermore,
in order to further relax the stability conditions under imperfect premise-matching
condition, a fuzzy controller with enhanced stabilization ability is proposed. Re-
ferring to the traditional fuzzy controller [3–11], the proposed fuzzy controller can
be regarded as the traditional one but with time-varying state-feedback gains. The
nonlinearity of the time-varying state-feedback gains offers a potential relaxation
to the stability conditions. The merits of the time-varying state-feedback gains have
been illustrated in the discrete-time fuzzy model-based control systems [14, 15]
under perfect premise-matching condition. Based on the Lyapunov-based approach
and the knowledge on the membership functions of the fuzzy model, bilinear-matrix-
inequality (BMI)-based stability conditions under imperfect premise-matching con-
dition are derived to guarantee the system stability. As the stability conditions are in
terms of BMIs, convex programming techniques cannot be applied directly to find
the solution. Taking advantage of the powerful global searching ability of the genetic
algorithm (GA) [16], a GA-based convex programming technique is proposed to
obtain the solution of the BMI-based stability conditions.
System performance is another important issue to be considered for fuzzy model-
based control systems. In this chapter, a scalar performance index [17] is employed
to measure quantitatively the system performance. BMI-based performance condi-
tions are derived to reduce the value of the scalar performance index to a prescribed
level. The BMI-based performance conditions are additional constraints to the BMI-
based stability conditions, and confine the searching domain of the parameters of the
fuzzy controller. Inside the constrained searching domain, any parameter values sat-
isfy the system performance requirement as described by the scalar performance
index. The BMI-based stability and performance conditions provide a systematic
and effective way to help design stable and well-performed fuzzy model-based con-
trol systems.
This chapter is organized as follows. In Section 2, the fuzzy model and fuzzy con-
troller are introduced. In Section 3, the stability analysis and system performance
of the fuzzy model-based control systems are investigated. BMI-based stability and
performance conditions are derived based on the Lyapunov stability theory. In Sections 4
and 5, the GA-based convex programming technique is proposed to find the solution
of the BMI-based stability and performance conditions. In Section 6, simulation ex-
amples are given to show the effectiveness of the proposed approach. In Section 7,
a conclusion is drawn.

2 Fuzzy Model and Fuzzy Controller

A multivariable fuzzy model-based control system comprising a non-linear plant
represented by a fuzzy model and a fuzzy controller connected in a closed loop will
be considered.

2.1 Fuzzy Model

Let p be the number of fuzzy rules describing the non-linear plant. The i-th rule is
of the following format:

\[
\text{Rule } i:\ \text{IF } f_1(x(t)) \text{ is } M_1^i \text{ AND } \dots \text{ AND } f_\Psi(x(t)) \text{ is } M_\Psi^i \text{ THEN } \dot{x}(t) = A_i x(t) + B_i u(t) \tag{1}
\]

where $M_\alpha^i$ is a fuzzy term of rule $i$ corresponding to the known function $f_\alpha(x(t))$,
$\alpha = 1, 2, \dots, \Psi$; $i = 1, 2, \dots, p$; $\Psi$ is a positive integer; $A_i \in \mathbb{R}^{n \times n}$ and
$B_i \in \mathbb{R}^{n \times m}$ are known constant system and input matrices respectively;
$x(t) \in \mathbb{R}^{n \times 1}$ is the system state vector and $u(t) \in \mathbb{R}^{m \times 1}$ is the input vector.
The system dynamics are described by

\[
\dot{x}(t) = \sum_{i=1}^{p} w_i(x(t)) \left( A_i x(t) + B_i u(t) \right) \tag{2}
\]

where

\[
\sum_{i=1}^{p} w_i(x(t)) = 1, \quad w_i(x(t)) \in [0, 1] \text{ for all } i \tag{3}
\]

\[
w_i(x(t)) = \frac{\mu_{M_1^i}(f_1(x(t))) \times \mu_{M_2^i}(f_2(x(t))) \times \dots \times \mu_{M_\Psi^i}(f_\Psi(x(t)))}{\sum_{k=1}^{p} \left( \mu_{M_1^k}(f_1(x(t))) \times \mu_{M_2^k}(f_2(x(t))) \times \dots \times \mu_{M_\Psi^k}(f_\Psi(x(t))) \right)} \tag{4}
\]

is a non-linear function of $x(t)$ and $\mu_{M_\alpha^i}(f_\alpha(x(t)))$ is the grade of membership
corresponding to the fuzzy term $M_\alpha^i$. The grade of membership is affected by any plant
parameter uncertainty.
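Equations (2) to (4) can be evaluated numerically once the grades of membership are known. The following sketch uses plain Python lists; the helper names (`mat_vec`, `normalised_weights`, `ts_dynamics`) and all numerical values are hypothetical and serve only to illustrate the normalisation in (4) and the blending in (2) for a two-rule model.

```python
from math import prod
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(a: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for nested lists."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def normalised_weights(grades: Sequence[Sequence[float]]) -> List[float]:
    """Equation (4): grades[i][alpha] is the grade of the fuzzy term M_alpha^i."""
    products = [prod(rule_grades) for rule_grades in grades]
    total = sum(products)  # assumed nonzero, i.e. at least one rule fires
    return [p / total for p in products]


def ts_dynamics(w: Sequence[float], a_mats: Sequence[Matrix],
                b_mats: Sequence[Matrix], x: Vector, u: Vector) -> Vector:
    """Equation (2): x_dot = sum_i w_i (A_i x + B_i u)."""
    x_dot = [0.0] * len(x)
    for wi, a, b in zip(w, a_mats, b_mats):
        ax, bu = mat_vec(a, x), mat_vec(b, u)
        x_dot = [acc + wi * (p + q) for acc, p, q in zip(x_dot, ax, bu)]
    return x_dot


# Hypothetical two-rule model with two premise variables per rule.
grades = [[0.9, 0.5], [0.3, 0.5]]
w = normalised_weights(grades)            # approximately [0.75, 0.25]
A = [[[0.0, 1.0], [-1.0, -1.0]], [[0.0, 1.0], [-2.0, -1.0]]]
B = [[[0.0], [1.0]], [[0.0], [1.0]]]
x_dot = ts_dynamics(w, A, B, x=[1.0, 0.0], u=[0.5])
print(w, x_dot)
```

The normalisation guarantees property (3): the weights are non-negative and sum to one, so the state derivative is a convex combination of the local linear models.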

2.2 Fuzzy Controller

A fuzzy controller with p fuzzy rules is to be designed for the non-linear plant. The
j-th rule of the fuzzy controller is of the following format:

\[
\text{Rule } j:\ \text{IF } g_1(x(t)) \text{ is } N_1^j \text{ AND } \dots \text{ AND } g_\Omega(x(t)) \text{ is } N_\Omega^j \text{ THEN } u(t) = F_j x(t) \tag{5}
\]

where $N_\beta^j$ is a fuzzy term of rule $j$ corresponding to the known function $g_\beta(x(t))$,
$\beta = 1, 2, \dots, \Omega$; $j = 1, 2, \dots, p$; $\Omega$ is a positive integer; $F_j \in \mathbb{R}^{m \times n}$ is the feedback
gain of rule $j$ to be designed. The inferred output of the fuzzy controller is given by

\[
u(t) = \sum_{j=1}^{p} m_j(x(t)) F_j x(t) \tag{6}
\]

where

\[
\sum_{j=1}^{p} m_j(x(t)) = 1, \quad m_j(x(t)) \in [0, 1] \text{ for all } j \tag{7}
\]

\[
m_j(x(t)) = \frac{\mu_{N_1^j}(g_1(x(t))) \times \mu_{N_2^j}(g_2(x(t))) \times \dots \times \mu_{N_\Omega^j}(g_\Omega(x(t)))}{\sum_{k=1}^{p} \left( \mu_{N_1^k}(g_1(x(t))) \times \mu_{N_2^k}(g_2(x(t))) \times \dots \times \mu_{N_\Omega^k}(g_\Omega(x(t))) \right)} \tag{8}
\]

is a non-linear function of $x(t)$ and $\mu_{N_\beta^j}(g_\beta(x(t)))$ is the grade of membership
corresponding to the fuzzy term $N_\beta^j$.
In order to improve the stabilization ability of the fuzzy controller, the feedback
gains are chosen to be $F_j = \dfrac{G_j}{\sum_{k=1}^{p} m_k(x(t)) a_k}$ to enhance the non-linearity for
compensating the non-linear plant dynamics. From (6), we have,

\[
u(t) = \frac{\sum_{j=1}^{p} m_j(x(t)) G_j x(t)}{\sum_{k=1}^{p} m_k(x(t)) a_k} \tag{9}
\]

where $G_j \in \mathbb{R}^{m \times n}$, $j = 1, 2, \dots, p$, are constant feedback gains and $a_k$, $k = 1, 2, \dots, p$,
are nonzero positive scalars so designed that we have $\sum_{k=1}^{p} m_k(x(t)) a_k > 0$.

Remark 1: It should be noted that the fuzzy controller of (9) is equivalent to
that in [18] when $m_i(x(t)) = w_i(x(t))$ for all $i$. It is reduced to the traditional fuzzy
controller [3, 4] when $a_k = 1$ for all $k$.
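The control law of (9) can be sketched in a few lines. The function name `fuzzy_control` and the numerical gains below are hypothetical; the example only illustrates how the scalars $a_k$ rescale the blended state feedback, and how the choice $a_k = 1$ for all $k$ recovers the form of (6) with $F_j = G_j$.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def fuzzy_control(m: Sequence[float], g_mats: Sequence[Matrix],
                  a: Sequence[float], x: Vector) -> Vector:
    """Equation (9): u = (sum_j m_j G_j x) / (sum_k m_k a_k)."""
    denom = sum(mk * ak for mk, ak in zip(m, a))
    if denom <= 0.0:
        raise ValueError("sum_k m_k a_k must be positive")
    u = [0.0] * len(g_mats[0])
    for mj, g in zip(m, g_mats):
        gx = [sum(gij * xj for gij, xj in zip(row, x)) for row in g]
        u = [acc + mj * v for acc, v in zip(u, gx)]
    return [v / denom for v in u]


# Hypothetical two-rule controller with a single input (all values illustrative).
m = [0.5, 0.5]
G = [[[-2.0, -1.0]], [[-4.0, -3.0]]]
x = [1.0, 1.0]
u_enhanced = fuzzy_control(m, G, a=[1.0, 3.0], x=x)     # sum_k m_k a_k = 2
u_traditional = fuzzy_control(m, G, a=[1.0, 1.0], x=x)  # a_k = 1 for all k
print(u_enhanced, u_traditional)
```

Since the grades $m_k(x(t))$ vary with the state, the denominator varies as well, so the effective gains $F_j$ are state dependent even though the matrices $G_j$ are constant.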

3 Stability Analysis

The fuzzy model-based control system is formed by the fuzzy model of (2) and the
fuzzy controller of (9) connected in a closed loop. From (2) and (9), we have,

\[
\begin{aligned}
\dot{x}(t) &= \sum_{i=1}^{p} w_i(x(t)) \left( A_i x(t) + B_i \frac{\sum_{j=1}^{p} m_j(x(t)) G_j x(t)}{\sum_{k=1}^{p} m_k(x(t)) a_k} \right) \\
&= \frac{1}{\sum_{k=1}^{p} m_k(x(t)) a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i(x(t)) m_j(x(t)) \left( a_j A_i + B_i G_j \right) x(t)
\end{aligned}
\tag{10}
\]
The system stability of (10) is investigated using the Lyapunov-based approach. In
the following analysis, the property of

\[
\sum_{i=1}^{p} w_i(x(t)) = \sum_{j=1}^{p} m_j(x(t)) = \sum_{j=1}^{p} \sum_{k=1}^{p} m_j(x(t)) m_k(x(t)) = 1
\]

is used. For simplicity, $w_i(x(t))$ and $m_j(x(t))$ are written as $w_i$ and $m_j$. To investigate
the stability of the system of (10), the following Lyapunov function candidate is
considered.

\[
V(t) = x(t)^T P x(t) \tag{11}
\]

where $P = P^T \in \mathbb{R}^{n \times n} > 0$. From (10) and (11), we have,

\[
\begin{aligned}
\dot{V}(t) &= \dot{x}(t)^T P x(t) + x(t)^T P \dot{x}(t) \\
&= \left( \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i m_j \left( a_j A_i + B_i G_j \right) x(t) \right)^T P x(t) \\
&\quad + x(t)^T P \left( \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i m_j \left( a_j A_i + B_i G_j \right) x(t) \right) \\
&= \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i m_j x(t)^T \left( \left( a_j A_i + B_i G_j \right)^T P + P \left( a_j A_i + B_i G_j \right) \right) x(t)
\end{aligned}
\tag{12}
\]

It can be seen that $\dot{V}(t) < 0$, which implies the asymptotic stability of the fuzzy
model-based control system, is satisfied when

\[
\left( a_j A_i + B_i G_j \right)^T P + P \left( a_j A_i + B_i G_j \right) < 0
\]

for all $i$ and $j$. In order to relax the conservativeness of the stability conditions, the
membership functions of the fuzzy controller are designed such that $m_i - \rho w_i > 0$
for all $i$ and $x(t)$, where $0 < \rho < 1$ is a constant scalar to be determined. Let
$X = X^T = P^{-1}$ and $z(t) = X^{-1} x(t)$, from (12), we have,
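$$
\begin{aligned}
\dot{V}(t) &= \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i m_j\, z(t)^T \left( a_j X A_i^T + X G_j^T B_i^T + a_j A_i X + B_i G_j X \right) z(t) \\
&= \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j + \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i w_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( \Lambda_i - \Lambda_i \right) z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i w_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \Lambda_i z(t) - \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} w_i \frac{1 - \rho}{\rho}\, z(t)^T \Lambda_i z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i w_j\, z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{1 - \rho}{\rho} \Lambda_i \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t) \qquad (13)
\end{aligned}
$$

where $\Lambda_i = \Lambda_i^T \in \mathbb{R}^{n \times n}$, $i = 1, 2, \ldots, p$, are arbitrary matrices. The fourth equality uses $\sum_{j=1}^{p} (m_j - \rho w_j) = 1 - \rho$, and the last one uses $\sum_{j=1}^{p} w_j = 1$. It can be seen that the matrices $\Lambda_i$ are able to transfer the unstable elements between the two terms on the right-hand side of (13) in order to alleviate the conservativeness of the stability conditions. From (13), we have,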
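$$
\begin{aligned}
\dot{V}(t) &= \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} w_i^2\, z(t)^T \left( a_i X A_i^T + a_i A_i X + X G_i^T B_i^T + B_i G_i X - \frac{1 - \rho}{\rho} \Lambda_i \right) z(t) \\
&\quad + \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{j=1}^{p} \sum_{i<j} w_i w_j\, z(t)^T \Big( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{1 - \rho}{\rho} \Lambda_i \\
&\qquad\qquad + a_i X A_j^T + a_i A_j X + X G_i^T B_j^T + B_j G_i X - \frac{1 - \rho}{\rho} \Lambda_j \Big) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t) \qquad (14)
\end{aligned}
$$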

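Let $S_{ij} = S_{ji}^T$, $i, j = 1, 2, \ldots, p$, be matrices such that

$$S_{ii} > a_i X A_i^T + a_i A_i X + X G_i^T B_i^T + B_i G_i X - \frac{1 - \rho}{\rho} \Lambda_i, \quad i = 1, 2, \ldots, p \qquad (15)$$

$$
\begin{aligned}
S_{ij} + S_{ij}^T \ge{}& a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{1 - \rho}{\rho} \Lambda_i \\
&+ a_i X A_j^T + a_i A_j X + X G_i^T B_j^T + B_j G_i X - \frac{1 - \rho}{\rho} \Lambda_j, \quad i, j = 1, 2, \ldots, p;\ i < j \qquad (16)
\end{aligned}
$$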

From (14) to (16), we have,

$$
\begin{aligned}
\dot{V}(t) &< \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} w_i^2\, z(t)^T S_{ii}\, z(t) + \frac{\rho}{\sum_{k=1}^{p} m_k a_k} \sum_{j=1}^{p} \sum_{i<j} w_i w_j\, z(t)^T \left( S_{ij} + S_{ij}^T \right) z(t) \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t) \\
&= \frac{\rho}{\sum_{k=1}^{p} m_k a_k}
\begin{bmatrix} w_1 z(t) \\ w_2 z(t) \\ \vdots \\ w_p z(t) \end{bmatrix}^T
S
\begin{bmatrix} w_1 z(t) \\ w_2 z(t) \\ \vdots \\ w_p z(t) \end{bmatrix} \\
&\quad + \frac{1}{\sum_{k=1}^{p} m_k a_k} \sum_{i=1}^{p} \sum_{j=1}^{p} w_i \left( m_j - \rho w_j \right) z(t)^T \left( a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i \right) z(t) \qquad (17)
\end{aligned}
$$

where

$$S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix}.$$

It can be seen from (17) that $\dot{V}(t) < 0$, which implies the asymptotic stability of the fuzzy model-based control system, if $S < 0$ and

$$a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i < 0$$

for all i and j. The stability analysis result is summarized in the following theorem.
Theorem 1: The fuzzy model-based control system of (10) formed by the non-linear plant in the form of (2) and the fuzzy controller of (9) is asymptotically stable if the membership functions of the fuzzy controller are designed such that $m_i(x(t)) - \rho w_i(x(t)) > 0$ for all i and x(t), where $0 < \rho < 1$, and there exist non-zero positive scalars $a_i$ and matrices $P = P^T \in \mathbb{R}^{n \times n}$, $S_{ij} = S_{ji}^T \in \mathbb{R}^{n \times n}$, $G_i \in \mathbb{R}^{m \times n}$, and $\Lambda_i = \Lambda_i^T \in \mathbb{R}^{n \times n}$ such that the following BMIs are satisfied, where $X = P^{-1}$:
• $P > 0$;
• $S_{ii} > a_i X A_i^T + a_i A_i X + X G_i^T B_i^T + B_i G_i X - \frac{1 - \rho}{\rho} \Lambda_i$, $i = 1, 2, \ldots, p$;
• $S_{ij} + S_{ij}^T \ge a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X - \frac{1 - \rho}{\rho} \Lambda_i + a_i X A_j^T + a_i A_j X + X G_i^T B_j^T + B_j G_i X - \frac{1 - \rho}{\rho} \Lambda_j$, $i, j = 1, 2, \ldots, p$; $i < j$;
• $S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix} < 0$;
• $a_j X A_i^T + a_j A_i X + X G_j^T B_i^T + B_i G_j X + \Lambda_i < 0$, $i, j = 1, 2, \ldots, p$.

4 Design of $G_j$ and $a_j$ for the Fuzzy Controller

In the following, the feedback gains $G_j$ and the scalars $a_j$ for the fuzzy controller are determined using the BMI-based approach.

4.1 Design of Feedback Gains

The design of the feedback gains $G_j$, $j = 1, 2, \ldots, p$, is formulated as BMI-based conditions. Let $G_j = N_j X^{-1}$, where $N_j \in \mathbb{R}^{m \times n}$, so that $G_j X = N_j$ and $X G_j^T = N_j^T$. Theorem 1 can then be modified to the following theorem.

Theorem 2: The fuzzy model-based control system of (10) formed by the non-linear plant in the form of (2) and the fuzzy controller of (9) is asymptotically stable if the membership functions of the fuzzy controller are designed such that $m_i(x(t)) - \rho w_i(x(t)) > 0$ for all i and x(t), where $0 < \rho < 1$, and there exist non-zero positive scalars $a_i$ and matrices $X = X^T \in \mathbb{R}^{n \times n}$, $S_{ij} = S_{ji}^T \in \mathbb{R}^{n \times n}$, $N_i \in \mathbb{R}^{m \times n}$, and $\Lambda_i = \Lambda_i^T \in \mathbb{R}^{n \times n}$ such that the following BMIs are satisfied:
• $X > 0$;
• $S_{ii} > a_i X A_i^T + a_i A_i X + N_i^T B_i^T + B_i N_i - \frac{1 - \rho}{\rho} \Lambda_i$, $i = 1, 2, \ldots, p$;
• $S_{ij} + S_{ij}^T \ge a_j X A_i^T + a_j A_i X + N_j^T B_i^T + B_i N_j - \frac{1 - \rho}{\rho} \Lambda_i + a_i X A_j^T + a_i A_j X + N_i^T B_j^T + B_j N_i - \frac{1 - \rho}{\rho} \Lambda_j$, $i, j = 1, 2, \ldots, p$; $i < j$;
• $S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix} < 0$;
• $a_j X A_i^T + a_j A_i X + N_j^T B_i^T + B_i N_j + \Lambda_i < 0$, $i, j = 1, 2, \ldots, p$;

and the feedback gains are designed as $G_j = N_j X^{-1}$, $j = 1, 2, \ldots, p$.

Fig. 1 Procedure of the combined GA-based and convex programming technique (flowchart: Start; the genetic algorithm generates $P_s$; the LMI $L(P_m, P_s) + zI > 0$ is solved; fitness = z; if the stop criterion is not reached, return to the genetic algorithm, otherwise End)

4.2 Solution Solving

Based on Theorem 1 and Theorem 2, the fuzzy model-based control system of (10) is guaranteed to be asymptotically stable if there exist scalars $a_j$, $j = 1, 2, \ldots, p$, such that the stability conditions are satisfied. It should be noted that the stability conditions in Theorem 1 and Theorem 2 are not LMIs if the scalars $a_j$ are variables, because $a_j$ multiplies the matrix variable X. To deal with this problem, the GA-based convex programming technique is proposed to find a solution. The procedure is illustrated in Figure 1 and is summarized as follows:
Step 1) The GA generates a potential solution $P_s = [a_1, a_2, \ldots, a_p]$, which is kept constant and fed to an LMI solver in the subsequent stage. When the value of $P_s$ is kept constant, the BMI-based stability conditions become LMIs, which can be solved using convex programming techniques. In general, the initial value of $P_s$ is randomly generated.
Step 2) The LMI solver finds the solution $P_m$ to the LMI conditions for the fixed value of $P_s$ generated by the GA in Step 1. The LMI problem is generally denoted by $L(P_m, P_s) + zI > 0$, where

$$P_m = [X, N_1, N_2, \ldots, N_p, S_{11}, S_{12}, \ldots, S_{pp}, \Lambda_1, \Lambda_2, \ldots, \Lambda_p]$$

denotes the potential solution of the LMI problem and z is a scalar. The initial value of $P_m$ is randomly generated or determined by the LMI solver.
Step 3) If there exists a negative z such that $L(P_m, P_s) + zI > 0$, both $P_m$ and $P_s$ satisfy the stability conditions and a solution has been found. In the GA-based convex programming process, z is taken as the fitness function indicating the degree to which $P_m$ and $P_s$ satisfy the inequalities. A more negative value of z indicates a better solution. Consequently, finding a solution is posed as a minimization problem (minimizing the value of z). A stopping criterion should be set to terminate the search, e.g., a predefined number of iterations has been reached.
Step 4) If the stopping criterion is not met, return to Step 1).
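
The outer loop of Steps 1 to 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the inner LMI solver is replaced by a hypothetical stand-in `toy_lmi_margin` that simply returns a number playing the role of z, the elitism and the truncation selection are simplifying assumptions, and all function names are invented for this sketch. In a real design, `inner_solver` would call a semidefinite programming solver on the LMIs obtained for the fixed $P_s$.

```python
import random


def ga_minimise(inner_solver, lower, upper, dim, pop_size=40, generations=100,
                p_cross=0.8, p_mut=0.5, shape=1.0, seed=0):
    """Real-coded GA over Ps = [a_1, ..., a_dim]; fitness is z = inner_solver(Ps)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(pop_size)]
    best_ps, best_z = None, float("inf")
    for gen in range(generations):
        # Step 2: evaluate every candidate Ps with the (stand-in) convex solver.
        scored = sorted(((inner_solver(ps), ps) for ps in pop), key=lambda t: t[0])
        if scored[0][0] < best_z:
            best_z, best_ps = scored[0][0], list(scored[0][1])
        parents = [ps for _, ps in scored[:pop_size // 2]]
        children = [list(best_ps)]  # elitism: keep the best candidate so far
        while len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = list(p)
            if rng.random() < p_cross:  # arithmetic crossover
                lam = rng.random()
                child = [lam * pi + (1.0 - lam) * qi for pi, qi in zip(p, q)]
            if rng.random() < p_mut:  # non-uniform mutation of one gene
                k = rng.randrange(dim)
                step = 1.0 - rng.random() ** ((1.0 - gen / generations) ** shape)
                if rng.random() < 0.5:
                    child[k] += (upper - child[k]) * step
                else:
                    child[k] -= (child[k] - lower) * step
            children.append([min(max(c, lower), upper) for c in child])
        pop = children  # Step 4: next iteration
    return best_ps, best_z


def toy_lmi_margin(ps):
    """Hypothetical stand-in for the LMI stage: returns a scalar z (negative = feasible)."""
    return sum((a - 0.5) ** 2 for a in ps) - 0.5


best_ps, best_z = ga_minimise(toy_lmi_margin, 1e-3, 2.0, dim=2, seed=1)
```

A negative `best_z` corresponds to the situation of Step 3, where a feasible pair ($P_m$, $P_s$) has been found.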

5 BMI-Based Performance Design of Fuzzy Model-Based Control System

In this section, BMI-based performance conditions are derived to guarantee the system performance under the consideration of system stability. The performance conditions are extra constraints added to the stability conditions in Theorem 2, which confine the search domain of $N_j$ for all j. Any value of $N_j$ inside that search domain satisfies a pre-defined scalar performance index [17]. The performance index, which quantitatively measures the system performance, is defined as follows.

$$J = \int_0^\infty \sum_{\gamma=1}^{p} \sum_{\lambda=1}^{p} m_\gamma m_\lambda a_\gamma a_\lambda \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}^T \begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt \qquad (18)$$
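where $J_1 = J_1^T \in \mathbb{R}^{n \times n} > 0$ and $J_2 = J_2^T \in \mathbb{R}^{m \times m} > 0$ are constant weighting matrices determined by designers. The weighting matrices set the relative importance of each system state and control signal in the performance index of (18). It can be seen that the performance index of (18) reflects the integral of the energy of the system states and control signals. A smaller scalar value of J indicates better system performance. From (9) and (18), and with the property that $\sum_{j=1}^{p} m_j = \sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k = 1$, we have,

$$
\begin{aligned}
J &= \int_0^\infty \sum_{\gamma=1}^{p} \sum_{\lambda=1}^{p} m_\gamma m_\lambda a_\gamma a_\lambda
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} I & 0 \\ 0 & \dfrac{\sum_{j=1}^{p} m_j G_j}{\sum_{\xi=1}^{p} m_\xi a_\xi} \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & \dfrac{\sum_{k=1}^{p} m_k G_k}{\sum_{\varphi=1}^{p} m_\varphi a_\varphi} \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt \\
&= \int_0^\infty
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} \sum_{j=1}^{p} m_j a_j I & 0 \\ 0 & \sum_{j=1}^{p} m_j G_j \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} \sum_{k=1}^{p} m_k a_k I & 0 \\ 0 & \sum_{k=1}^{p} m_k G_k \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt \\
&= \int_0^\infty \sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} a_j I & 0 \\ 0 & G_j \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} a_k I & 0 \\ 0 & G_k \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt \qquad (19)
\end{aligned}
$$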

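Let the performance index J satisfy the following condition,

$$J < \eta \int_0^\infty \begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T \begin{bmatrix} X^{-1} & 0 \\ 0 & X^{-1} \end{bmatrix} \begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt \qquad (20)$$

where $\eta$ is a non-zero positive scalar. It can be seen that if the condition of (20) is satisfied, the scalar performance index J is attenuated to a prescribed level governed by the value of $\eta$. From (19) and (20), recalling that $G_j = N_j X^{-1}$, $j = 1, 2, \ldots, p$, we have,

$$
\begin{aligned}
&\int_0^\infty \sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\left(
\begin{bmatrix} a_j I & 0 \\ 0 & G_j \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} a_k I & 0 \\ 0 & G_k \end{bmatrix}
- \eta \begin{bmatrix} X^{-1} & 0 \\ 0 & X^{-1} \end{bmatrix}
\right)
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt < 0 \\
\Leftrightarrow\ &\int_0^\infty \sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix}^T
\begin{bmatrix} X^{-1} & 0 \\ 0 & X^{-1} \end{bmatrix}
\left(
\begin{bmatrix} a_j X & 0 \\ 0 & N_j \end{bmatrix}^T
\begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix}
\begin{bmatrix} a_k X & 0 \\ 0 & N_k \end{bmatrix}
- \eta \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix}
\right)
\begin{bmatrix} X^{-1} & 0 \\ 0 & X^{-1} \end{bmatrix}
\begin{bmatrix} x(t) \\ x(t) \end{bmatrix} dt < 0 \qquad (21)
\end{aligned}
$$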
The inequality of (21) is satisfied if the following condition holds.

$$\sum_{j=1}^{p} \sum_{k=1}^{p} m_j m_k \left( \begin{bmatrix} a_j X & 0 \\ 0 & N_j \end{bmatrix}^T \begin{bmatrix} J_1 & 0 \\ 0 & J_2 \end{bmatrix} \begin{bmatrix} a_k X & 0 \\ 0 & N_k \end{bmatrix} - \eta \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} \right) < 0 \qquad (22)$$

By the Schur complement [12], the inequality of (22) is equivalent to the following inequality.

$$\sum_{j=1}^{p} m_j W_j < 0 \qquad (23)$$

where

$$W_j = \begin{bmatrix} -\eta X & 0 & a_j X & 0 \\ 0 & -\eta X & 0 & N_j^T \\ a_j X & 0 & -J_1^{-1} & 0 \\ 0 & N_j & 0 & -J_2^{-1} \end{bmatrix}, \quad j = 1, 2, \ldots, p.$$

The inequality of (23) is satisfied when $W_j < 0$ for all j, which are the BMI-based performance conditions. The analysis result is summarized in the following theorem.
Theorem 3: The scalar performance index of (18), which quantitatively measures the system performance of the fuzzy model-based control system of (10), is attenuated to a prescribed level governed by the non-zero positive scalar $\eta$ if there exist non-zero positive scalars $a_j$, $j = 1, 2, \ldots, p$, and matrices $X = X^T \in \mathbb{R}^{n \times n}$, $J_1 = J_1^T \in \mathbb{R}^{n \times n} > 0$, $J_2 = J_2^T \in \mathbb{R}^{m \times m} > 0$, and $N_j \in \mathbb{R}^{m \times n}$ such that the following BMIs are satisfied:
• $X > 0$;
• $W_j = \begin{bmatrix} -\eta X & 0 & a_j X & 0 \\ 0 & -\eta X & 0 & N_j^T \\ a_j X & 0 & -J_1^{-1} & 0 \\ 0 & N_j & 0 & -J_2^{-1} \end{bmatrix} < 0$, $j = 1, 2, \ldots, p$.
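
As a quick sanity check of the performance condition, the scalar case n = m = 1 can be tested by hand. The sketch below is an illustration only: the helper names and the numerical values are hypothetical, and negative definiteness is tested with a plain Cholesky factorisation of $-W_j$ rather than with an LMI solver. For scalars, the Schur complement reduces $W_j < 0$ to $a_j^2 x J_1 < \eta$ and $N_j^2 J_2 < \eta x$.

```python
def is_negative_definite(W):
    """Return True if the symmetric matrix W is negative definite.

    Uses a Cholesky factorisation of -W, which succeeds exactly when
    -W is positive definite.
    """
    n = len(W)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = -W[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True


def w_scalar(eta, x, a_j, n_j, j1, j2):
    """W_j of Theorem 3 for n = m = 1, i.e. X = x, N_j = n_j, J1 = j1, J2 = j2."""
    return [[-eta * x, 0.0, a_j * x, 0.0],
            [0.0, -eta * x, 0.0, n_j],
            [a_j * x, 0.0, -1.0 / j1, 0.0],
            [0.0, n_j, 0.0, -1.0 / j2]]


# a_j^2 * x * J1 = 0.25 < eta = 1 and N_j^2 * J2 = 0.25 < eta * x = 1: feasible.
feasible = is_negative_definite(w_scalar(1.0, 1.0, 0.5, 0.5, 1.0, 1.0))
# a_j^2 * x * J1 = 4 > eta = 1: the performance condition is violated.
violated = is_negative_definite(w_scalar(1.0, 1.0, 2.0, 0.5, 1.0, 1.0))
```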

The BMI-performance conditions in Theorem 3 are added to Theorem 2 to guarantee the system performance subject to the system stability. It should be noted that the weighting matrices $J_1$ and $J_2$ have to be determined prior to applying Theorem 3. Furthermore, the conditions in Theorem 3 only govern the system performance and do not by themselves guarantee stability. Asymptotic stability is guaranteed when the stability conditions in Theorem 2 are satisfied, regardless of whether the performance conditions in Theorem 3 are satisfied.
In the following, the procedure to obtain the non-linear controller is summarized.
Step I) Obtain the fuzzy model of the non-linear plant by: 1) performing identification methods through the use of the input–output data of the plant [1, 2, 19], or 2) deriving it directly from the mathematical model of the non-linear plant [4, 5].
Step II) Determine $m_j(x(t))$ for the fuzzy controller and obtain the value of $0 < \rho < 1$ subject to the conditions $m_j(x(t)) - \rho w_j(x(t)) > 0$ for all j and x(t). Determine the ranges of $a_j$ for the GA-based convex programming technique.
Step III) Solve the stability conditions in Theorem 1 (if the values of $G_j$ are pre-determined) or Theorem 2 (if the values of $G_j$ are determined automatically) using the GA-based convex programming technique as shown in Figure 1. If the system performance is considered, the BMI-based performance conditions in Theorem 3 must be added to the conditions in Theorem 2. $J_1$ and $J_2$ have to be determined beforehand.
Step IV) Implement the fuzzy controller of (9) according to the values of $G_j$ and $a_j$.
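
Step II asks for a value of $\rho$ with $m_j(x(t)) - \rho w_j(x(t)) > 0$. One simple way to estimate an admissible value, sketched below, is to take the smallest ratio $m_j / w_j$ over a grid of the operating range and back off by a safety margin. Everything specific here is an assumption made for illustration: the bell-shaped model grades, the clipped trapezoidal controller grades with their breakpoints, the grid, and the function names are hypothetical and are not the values used in the chapter.

```python
import math


def w1(x1):
    """Bell-shaped model grade built from two sigmoids (illustrative)."""
    left = 1.0 - 1.0 / (1.0 + math.exp(-7.0 * (x1 - math.pi / 6.0)))
    right = 1.0 / (1.0 + math.exp(-7.0 * (x1 + math.pi / 6.0)))
    return left * right


def w2(x1):
    return 1.0 - w1(x1)


def m1(x1):
    """Hypothetical trapezoidal controller grade, clipped to [0.05, 0.95]."""
    t = (1.0 - abs(x1)) / 0.6  # equals 1 at |x1| = 0.4 and 0 at |x1| = 1.0
    return min(0.95, max(0.05, t))


def m2(x1):
    return 1.0 - m1(x1)


def largest_rho(pairs, grid, margin=0.99):
    """Grid estimate of a rho in (0, 1) with m_j(x) - rho * w_j(x) > 0."""
    ratio = min(m(x) / w(x) for m, w in pairs for x in grid if w(x) > 0.0)
    return min(margin * ratio, margin)


grid = [-2.0 + 4.0 * k / 400 for k in range(401)]
pairs = [(m1, w1), (m2, w2)]
rho = largest_rho(pairs, grid)
```

A grid check is only evidence, not a proof, that the inequality holds for every x(t); a finer grid or an analytical bound would be needed for a guarantee.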

6 Simulation Examples

Two simulation examples will be given to illustrate the merits of the proposed
approach.

6.1 Simulation Example 1

Consider the following fuzzy model,

Rule i: IF $x_1(t)$ is $M_1^i$ THEN $\dot{x}(t) = A_i x(t) + B_i u(t)$, i = 1, 2 (24)

where

$$A_1 = \begin{bmatrix} 2 & -10 \\ 1 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} a & -10 \\ 1 & 3 \end{bmatrix}, \quad B_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \text{and} \quad B_2 = \begin{bmatrix} b \\ 0 \end{bmatrix};$$

$1 \le a \le 3$ and $1 \le b \le 2.8$. It is assumed that the membership functions of the fuzzy model and controller are different. Considering $\rho = 0.75$ and employing the design criterion in [5], the feedback gains of the fuzzy controller are designed such that the eigenvalues of $A_1 + B_1 G_1$ and $A_2 + B_2 G_2$ are all located at −2. It should be noted that the proposed fuzzy controller is reduced to that in [3, 4] in this case. It can be shown that the published stability conditions in [3, 4] cannot provide feasible solutions. Furthermore, the stability conditions in [4–11], which require the fuzzy model and the fuzzy controller to share the same membership functions, cannot be applied to verify the system stability.
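
For a second-order subsystem with input matrix of the form $[\beta\ 0]^T$, placing both closed-loop eigenvalues at the same location can be done by matching the trace and determinant of $A + BG$. The sketch below is an illustration of that arithmetic only; the helper names are hypothetical and the method is ordinary coefficient matching, not necessarily the procedure used in [5].

```python
def place_double_pole(A, beta, pole):
    """Gain G = [g1, g2] putting both eigenvalues of A + [beta, 0]^T G at `pole`.

    Matches trace = 2 * pole and determinant = pole ** 2 of the closed-loop matrix.
    Requires A[1][0] != 0 and beta != 0.
    """
    (a11, a12), (a21, a22) = A
    if a21 == 0 or beta == 0:
        raise ValueError("the pair is not controllable in this simple form")
    g1 = (2.0 * pole - a11 - a22) / beta
    g2 = ((a11 + beta * g1) * a22 - a12 * a21 - pole ** 2) / (a21 * beta)
    return g1, g2


def closed_loop(A, beta, gain):
    """Closed-loop matrix A + [beta, 0]^T [g1, g2]."""
    (a11, a12), (a21, a22) = A
    return [[a11 + beta * gain[0], a12 + beta * gain[1]], [a21, a22]]


# Rule 1 of the example: A1 = [[2, -10], [1, 0]], B1 = [1, 0]^T, poles at -2.
G1 = place_double_pole([[2.0, -10.0], [1.0, 0.0]], 1.0, -2.0)
M1 = closed_loop([[2.0, -10.0], [1.0, 0.0]], 1.0, G1)
```

For rule 1 this gives $G_1 = [-6\ \ 6]$, so that $A_1 + B_1 G_1$ has characteristic polynomial $s^2 + 4s + 4 = (s + 2)^2$.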
Figure 2 shows the stability region (as indicated by the small circles) with $a_j = 1$ for j = 1, 2 under $\rho = 0.75$ and 0.9 respectively. It can be seen that the stability region depends on the value of $\rho$. A larger value of $\rho$ offers a larger stability region. To show the effectiveness of $a_j$, the proposed GA-based convex programming technique is employed to solve the stability conditions in Theorem 1. The lower and upper bounds of $a_j$ are chosen to be $10^{-3}$ and 2 respectively. The real-coded GA with arithmetic crossover and non-uniform mutation [16] is employed as the GA in the GA-based convex programming technique. The parameters $a_j$, j = 1, 2, form the chromosomes of the GA process. Their initial values are randomly generated. The control parameters of the real-coded GA are as follows. The probability of crossover is 0.8; the probability of mutation is 0.5; the shape parameter is 1; the population size is 40 and the number of training iterations is 500. The stability regions under $\rho = 0.75$ and 0.9 are shown in Figure 3. It can be seen that each stability region is larger than the corresponding one obtained with $a_j = 1$ for all j.

Fig. 2 Stability region based on Theorem 1 with $a_j = 1$ for all j for Simulation Example 1: (a) $\rho = 0.75$; (b) $\rho = 0.9$ (horizontal axis: a, from 1 to 3; vertical axis: b, from 1 to 2.8)

Fig. 3 Stability region based on Theorem 1 for Simulation Example 1: (a) $\rho = 0.75$; (b) $\rho = 0.9$ (horizontal axis: a, from 1 to 3; vertical axis: b, from 1 to 2.8)

6.2 Simulation Example 2

An example of stabilizing a cart-pole type inverted pendulum [20] using the proposed non-linear controller is given below.

Fig. 4 Cart-pole typed inverted pendulum system

Step I) Figure 4 shows a diagram of the cart-pole type inverted pendulum. The dynamic equations of the inverted pendulum on the cart [20] are given by,

$$\dot{x}_1(t) = x_2(t) \qquad (25)$$

$$\dot{x}_2(t) = \frac{\left( \begin{aligned} &-F_1 (M + m) x_2(t) - m^2 l^2 (x_2(t))^2 \sin x_1(t) \cos x_1(t) + F_0\, m l\, x_4(t) \cos x_1(t) \\ &+ (M + m) m g l \sin x_1(t) - m l \cos x_1(t)\, u(t) \end{aligned} \right)}{(M + m)(J + m l^2) - m^2 l^2 (\cos x_1(t))^2} \qquad (26)$$

$$\dot{x}_3(t) = x_4(t) \qquad (27)$$

$$\dot{x}_4(t) = \frac{\left( \begin{aligned} &F_1\, m l\, x_2(t) \cos x_1(t) + (J + m l^2) m l (x_2(t))^2 \sin x_1(t) - F_0 (J + m l^2) x_4(t) \\ &- m^2 g l^2 \sin x_1(t) \cos x_1(t) + (J + m l^2) u(t) \end{aligned} \right)}{(M + m)(J + m l^2) - m^2 l^2 (\cos x_1(t))^2} \qquad (28)$$

where $x_1(t)$ and $x_2(t)$ denote the angular displacement (rad) and the angular velocity (rad/s) of the pendulum from vertical respectively, $x_3(t)$ and $x_4(t)$ denote the displacement (m) and the velocity (m/s) of the cart respectively, $g = 9.8$ m/s² is the acceleration due to gravity, $m = 0.22$ kg is the mass of the pendulum, $M = 1.3282$ kg is the mass of the cart, $l = 0.304$ m is the length from the centre of mass of the pendulum to the shaft axis, $J = m l^2 / 3$ kg·m² is the moment of inertia of the pendulum around the centre of mass, $F_0 = 22.915$ N/(m/s) and $F_1 = 0.007056$ N/(rad/s) are the friction factors of the cart and the pendulum respectively, and u(t) is the force (N) applied to the cart. The non-linear plant can be represented by a fuzzy model with two fuzzy rules [20]. The i-th rule is given by,
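Rule i: IF $x_1(t)$ is $M_1^i$ THEN $\dot{x}(t) = A_i x(t) + B_i u(t)$ for i = 1, 2 (29)

The system dynamics are described by,

$$\dot{x}(t) = \sum_{i=1}^{2} w_i \left( A_i x(t) + B_i u(t) \right) \qquad (30)$$

where
• $x(t) = [x_1(t)\ \ x_2(t)\ \ x_3(t)\ \ x_4(t)]^T$;
• $A_1 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ (M + m) m g l / a_1 & -F_1 (M + m) / a_1 & 0 & F_0\, m l / a_1 \\ 0 & 0 & 0 & 1 \\ -m^2 g l^2 / a_1 & F_1\, m l / a_1 & 0 & -F_0 (J + m l^2) / a_1 \end{bmatrix}$;
• $B_1 = \begin{bmatrix} 0 \\ -m l / a_1 \\ 0 \\ (J + m l^2) / a_1 \end{bmatrix}$;
• $A_2 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \frac{3\sqrt{3}}{2\pi} (M + m) m g l / a_2 & -F_1 (M + m) / a_2 & 0 & F_0\, m l \cos(\pi/3) / a_2 \\ 0 & 0 & 0 & 1 \\ -\frac{3\sqrt{3}}{2\pi} m^2 g l^2 \cos(\pi/3) / a_2 & F_1\, m l \cos(\pi/3) / a_2 & 0 & -F_0 (J + m l^2) / a_2 \end{bmatrix}$;
• $B_2 = \begin{bmatrix} 0 \\ -m l \cos(\pi/3) / a_2 \\ 0 \\ (J + m l^2) / a_2 \end{bmatrix}$;
• $a_1 = (M + m)(J + m l^2) - m^2 l^2$;
• $a_2 = (M + m)(J + m l^2) - m^2 l^2 (\cos(\pi/3))^2$.
Here the third rows of $A_1$ and $A_2$ are written as $[0\ 0\ 0\ 1]$ in accordance with (27), and the (4, 2) entry of $A_1$ is written with the pendulum mass m in accordance with (28).
The membership functions are defined as

$$w_1(x_1(t)) = \mu_{M_1^1}(x_1(t)) = \left( 1 - \frac{1}{1 + e^{-7 (x_1(t) - \pi/6)}} \right) \frac{1}{1 + e^{-7 (x_1(t) + \pi/6)}}$$

and

$$w_2(x_1(t)) = \mu_{M_1^2}(x_1(t)) = 1 - \mu_{M_1^1}(x_1(t))$$

which are shown in Figure 5.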
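
The local models of (29) can be evaluated numerically from the plant parameters. The sketch below is a minimal illustration: the function name `local_model` and its two arguments (the operating angle and the factor $\sin\theta / \theta$) are hypothetical conveniences, and the third row and the (4, 2) entry follow the dynamic equations (27) and (28).

```python
import math

# Plant parameters as listed in the text.
M, m, l, g = 1.3282, 0.22, 0.304, 9.8
F0, F1 = 22.915, 0.007056
J = m * l ** 2 / 3.0


def local_model(theta, sinc_factor):
    """Local linear model (A, B, a) around pendulum angle `theta`.

    sinc_factor approximates sin(theta) / theta (1 at theta = 0).
    """
    c = math.cos(theta)
    inertia = J + m * l ** 2
    a = (M + m) * inertia - (m * l * c) ** 2
    A = [
        [0.0, 1.0, 0.0, 0.0],
        [sinc_factor * (M + m) * m * g * l / a, -F1 * (M + m) / a, 0.0, F0 * m * l * c / a],
        [0.0, 0.0, 0.0, 1.0],
        [-sinc_factor * m ** 2 * g * l ** 2 * c / a, F1 * m * l * c / a, 0.0, -F0 * inertia / a],
    ]
    B = [0.0, -m * l * c / a, 0.0, inertia / a]
    return A, B, a


# Rule 1: linearisation around x1 = 0; rule 2: around x1 = pi/3.
A1, B1, a1 = local_model(0.0, 1.0)
A2, B2, a2 = local_model(math.pi / 3.0, 3.0 * math.sqrt(3.0) / (2.0 * math.pi))
```

With the listed parameters, $a_1 \approx 0.0375$ and $a_2 \approx 0.0409$, so both denominators are positive.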

Step II) A two-rule fuzzy controller is proposed to control the non-linear plant. The j-th rule is given by,

Rule j: IF $x_1(t)$ is $N_1^j$ THEN $u(t) = F_j x(t)$, j = 1, 2 (31)

From (9), the fuzzy controller is defined as,

$$u(t) = \sum_{j=1}^{2} m_j(x(t))\, F_j\, x(t) = \frac{\sum_{j=1}^{2} m_j(x(t))\, G_j\, x(t)}{\sum_{k=1}^{2} m_k(x(t))\, a_k} \qquad (32)$$

The membership functions of the fuzzy controller are shown in Figure 5. A simple, commonly used trapezoidal membership function is employed to implement the fuzzy controller. Based on the membership information of the fuzzy model and fuzzy controller, we have $\rho = 0.8$ such that the conditions $m_j(x_1(t)) - \rho w_j(x_1(t)) > 0$ hold for all j and $x_1(t)$.

Fig. 5 Membership functions of fuzzy model and fuzzy controller in Simulation Example 2. $\rho \mu_{M_1^1}(x_1(t))$ (bell in solid line) and $\mu_{N_1^1}(x_1(t))$ (trapezoid in solid line), $\rho \mu_{M_1^2}(x_1(t))$ (bell in dotted line) and $\mu_{N_1^2}(x_1(t))$ (trapezoid in dotted line) with $\rho = 0.8$ (horizontal axis: $x_1(t)$ (rad), from −2 to 2; vertical axis: grade of membership, from 0 to 1)
Step III) Theorem 2 is employed to help design a stable fuzzy controller for the inverted pendulum. BMI-performance conditions in Theorem 3 are added to Theorem 2 to govern the system performance. To measure the system performance, the scalar performance index of (18), with $\eta = 0.01$ and weighting matrices

$$J_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad J_2 = 0.1,$$

is used. The proposed GA-based convex programming technique is employed to solve the BMI-based stability and performance conditions. The lower and upper bounds of $a_j$, j = 1, 2, are chosen empirically to be $10^{-3}$ and 2 respectively. The real-coded GA with arithmetic crossover and non-uniform mutation [16] is used as the GA in the GA-based convex programming technique in this application example. The parameters $a_j$, j = 1, 2, form the chromosomes of the GA process. Their initial values are randomly generated. The control parameters of the real-coded GA are as follows. The probability of crossover is 0.8; the probability of mutation is 0.5; the shape parameter is 1; the population size is 40 and the number of training iterations is 500.
After the process, we obtain $a_1 = 0.1467$ and $a_2 = 0.1752$, and the feedback gains as

$$G_1 = [79.7443\ \ 6.8117\ \ 0.1839\ \ 6.6751], \quad G_2 = [106.1353\ \ 8.1176\ \ 0.1482\ \ 6.6387]$$

such that the BMI-based stability and performance conditions in Theorem 2 and Theorem 3 are satisfied. In the following, the fuzzy controller with these feedback gains is referred to as fuzzy controller 1. For comparison purposes and to show the effectiveness of the performance conditions, another set of feedback gains is obtained for fuzzy controller 2, for which every parameter is kept unchanged except

$$J_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 100 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

On solving the stability and performance conditions, the feedback gains obtained for fuzzy controller 2 are

$$G_1 = [139.3695\ \ 11.8497\ \ 3.7243\ \ 12.0212], \quad G_2 = [165.7844\ \ 13.4360\ \ 3.7465\ \ 12.8725].$$

Both fuzzy controllers 1 and 2 in the form of (32) are employed to stabilize the inverted pendulum described in (25) to (28). Figure 6 shows the system state responses under the initial system state $x(0) = \left[ \tfrac{5}{12}\pi\ \ 0\ \ 0\ \ 0 \right]^T$. Referring to this figure, it can be seen that the inverted pendulum can be stabilized by both fuzzy controllers. For fuzzy controller 2, the weighting matrix $J_1$ has the entry 100 in its third diagonal position, so a heavier weight is put on $x_3(t)$ in the performance index. Consequently, the response of $x_3(t)$ of the controlled inverted pendulum with fuzzy controller 2 offers better system performance than that with fuzzy controller 1 in terms of transient response and settling time.
In this example, it can be seen that simple membership functions can be used by the fuzzy controller instead of the complicated membership functions of the fuzzy model that would be required under the perfect premise-matching condition. Moreover, under the imperfect premise-matching condition, the stability conditions in [4–10], which assume perfect premise matching, cannot be applied to aid the design of the fuzzy controller. Under the imperfect premise-matching condition, the proposed BMI-based stability and performance conditions offer a systematic way to realize a stable and well-performing fuzzy controller for the non-linear system.

Fig. 6 System responses of the inverted pendulum with fuzzy controller 1 (solid lines) and fuzzy controller 2 (dotted lines); (a) $x_1(t)$ (rad); (b) $x_2(t)$ (rad/s); (c) $x_3(t)$ (m); (d) $x_4(t)$ (m/s); horizontal axes: time (sec), from 0 to 20

7 Conclusion

System stability of fuzzy model-based control systems under the imperfect premise-matching condition has been investigated. A fuzzy controller with enhanced stabilization ability has been proposed to deal with non-linear systems. The information of the membership functions of the fuzzy model and fuzzy controller has been used to facilitate the system analysis. Relaxed BMI-based stability conditions have been derived using the Lyapunov-based approach to guarantee the system stability. Under the imperfect premise-matching condition, simple membership functions can be employed to lower the structural complexity of the fuzzy controller. BMI-performance conditions have been derived subject to a scalar performance index to guarantee the system performance. The GA-based convex programming technique has been proposed to solve the BMI-based stability and performance conditions so as to aid the design of stable and well-performing fuzzy model-based control systems. Simulation examples have been given to illustrate the effectiveness of the proposed approach.

Acknowledgment The work described in this paper was supported by grants from King’s College
London and The Hong Kong Polytechnic University (Project No. G-YE92).

References

1. T. Takagi and M. Sugeno. Fuzzy identification of systems and its applications to modeling and
control. IEEE Trans. Sys., Man., Cybern., vol. smc-15 no. 1, pp. 116–132, Jan 1985
2. M. Sugeno and G.T. Kang, Structure identification of fuzzy model. Fuzzy sets and systems,
vol. 28, pp. 15–33, 1988
3. C.L. Chen, P.C. Chen and C.K. Chen. Analysis and design of fuzzy control system. Fuzzy Sets
and Systems, vol. 57, no 2, 26, pp. 125–140, Jul 1993
4. H.O. Wang, K. Tanaka and M.F. Griffin. An approach to fuzzy control of nonlinear systems:
stability and the design issues. IEEE Trans. Fuzzy Syst., vol. 4, no. 1, pp. 14–23, Feb 1996
5. K. Tanaka, T. Ikeda and H.O. Wang. Fuzzy regulator and fuzzy observer: Relaxed stability
conditions and LMI-based designs. IEEE Trans. Fuzzy Syst., vol. 6, no. 2, pp. 250–265, 1998
6. W.J. Wang, S.F. Yan and C.H. Chiu. Flexible stability criteria for a linguistic fuzzy dynamic
system. Fuzzy Sets and Systems, vol. 105, no. 1, pp. 63–80, Jul 1999
7. E. Kim and H. Lee. New approaches to relaxed quadratic stability conditions of fuzzy control
systems. IEEE Trans. Fuzzy Syst., vol. 8, no. 5, pp. 523–534, 2000
8. X. Liu and Q. Zhang. New approaches to H∞ -controller designs based on fuzzy observers for
T-S fuzzy systems via LMI. Automatica, vol. 39, no. 9, pp. 1571–1582, Sep 2003
9. X. Liu and Q. Zhang. Approaches to quadratic stability conditions and H∞ -control designs for
T-S fuzzy systems. IEEE Trans. Fuzzy Syst., vol. 11, no. 6, pp. 830–839, 2003
10. M.C.M. Teixeira, E. Assunção and R.G. Avellar. On relaxed LMI-based designs for fuzzy
regulators and fuzzy observers. IEEE Trans. on Fuzzy Systems, vol. 11, no. 5, pp. 613–623,
Oct 2003
11. C.H. Fang, Y.S. Liu, S.W. Kau, L. Hong and C.H. Lee. A new LMI-based approach to relaxed
quadratic stabilization of T-S fuzzy control systems. IEEE Trans. on Fuzzy Systems, vol. 14,
no. 3, pp. 386–397, Jun 2006
12. S. Boyd, L. El Ghaoui, E. Feron and V. Balakrishnan. Linear Matrix Inequalities in Systems
and Control Theory. ser. SIAM studies in Applied Mathematics, Philadelphia, PA: SIAM,
1994
13. H.K. Lam and F.H.F. Leung. Stability analysis and synthesis of fuzzy control systems subject
to uncertain grades of membership. IEEE Trans. Syst., Man and Cybern, Part B: Cybernetics,
vol. 35, no. 6, pp. 1322–1325, Dec 2005
14. T.M. Guerra and L. Vermeiren. LMI-based relaxed nonquadratic stabilization conditions for
nonlinear systems in the Takagi-Sugeno’s form. Automatica, vol. 40, pp. 823–829, 2004
15. B.C. Ding, H.X. Sun and P. Yang. Further study on LMI-based relaxed nonquadratic stabi-
lization conditions for nonlinear systems in the Takagi-Sugeno’s form. Automatica, vol. 42,
pp. 503–508, 2006
16. Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. 2nd ed. Springer-Verlag, 1994
17. B.D.O. Anderson and J.B. Moore. Optimal Control: Linear Quadratic Methods. Prentice-Hall,
1990
18. T.M. Guerra, F. Delmotte, L. Vermeiren and H. Tirmant. Compensation and division control
law for fuzzy models. Fuzzy IEEE 2001, Australia, December, pp. 521–524, 2001
19. E. Kim, M. Park, S. Ji and M. Park. A new approach to fuzzy modeling. IEEE Trans. Fuzzy
Syst., vol. 7, no. 2, pp. 236–240, 1999
20. X.J. Ma and Z.Q. Sun. Analysis and design of fuzzy reduced-dimensional observer and fuzzy
functional observer. Fuzzy Sets and Systems, vol. 120, pp. 35–63, 2001
Two-Level Tuning of Fuzzy PID Controllers
for Multivariable Process Systems

George K.I. Mann and Eranda Harinath

Abstract This paper presents a novel design and tuning technique of fuzzy PID (FPID) controllers for multivariable process systems. The inference mechanism of the FPID system follows the Standard Additive Model (SAM)-based fuzzy rule structure. The proposed design method can be used for any n × n dimensional multi-input–multi-output (MIMO) process system and guarantees closed-loop stability. In general, the design of FPID controllers for MIMO systems is challenging, mainly due to the existence of loop interactions. To address this issue, a static decoupler is implemented, which has the capacity to remove steady-state loop interactions. Each control loop is assigned an FPID system. Two types of FPID configurations are considered. The first FPID system follows the Mamdani-type rule structure, where error and error rates are directly used in the input space to derive fuzzy rules. The second FPID configuration consists of decoupled fuzzy rules, where three decoupled rule bases are assigned to follow individual PID actions. The tuning is achieved using the two-level tuning principle described in [1]. The low-level tuning is dedicated to devising the linear gain parameters in the FPID system, whereas the high-level tuning is dedicated to adjusting the fuzzy rule base parameters. The low-level tuning method adopts a novel linear tuning scheme for general decoupled PID controllers, and the high-level tuning adopts a heuristic-based method to change the nonlinearity in the fuzzy output. For robust implementation, a stability analysis is performed using the Nyquist array and Gershgorin bands. The stability properties provide the hard limits allowed for the fuzzy rule parameters and also guarantee operation within given gain and phase margin limits. The performance and the design criterion are finally evaluated using several control simulations.

Keywords: Multivariable control, Fuzzy PID control, Standard additive model, Linear PID tuning, Nonlinear fuzzy tuning, Stability

George K.I. Mann and Eranda Harinath


Faculty of Engineering and Applied Science, Memorial University of Newfoundland, St. John’s, NL, Canada, A1B 3X5
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization, Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 283–312.
© 2008 Springer.

List of abbreviations

SAM = Standard Additive Model
FPID = Fuzzy Proportional Integral Derivative
MIMO = Multi-Input–Multi-Output
SISO = Single-Input–Single-Output
TITO = Two-Input–Two-Output
ZN = Ziegler and Nichols
MPC = Model Predictive Control
FLC = Fuzzy Logic Control
FPD = Fuzzy Proportional Derivative
FPI = Fuzzy Proportional Integral
RNA = Rosenbrock–Nyquist Array
BLT = Biggest Log-modulus Tuning
DNA = Direct Nyquist Array
ALG = Apparent Linear Gains
ANG = Apparent Nonlinear Gains

1 Introduction

Among the various techniques available for controlling multi-input–multi-output (MIMO) process systems, Proportional Integral Derivative (PID) controllers have received the most popularity. PID systems have been extensively applied in industrial control [2], [3], mainly because of their versatility across many applications and their inherent robustness. Although many advanced controllers are available, PID systems offer satisfactory control with the least effort. However, for optimum operation the most challenging design task is the tuning of the PID gains. The popular Ziegler and Nichols (ZN) tuning rule [4], which was originally designed for single-input single-output (SISO) systems, is widely used in many MIMO applications as well. Each loop in the MIMO system is tuned using the ZN rules [5], [6], [7]. In that case, each individual loop is considered as an open-loop stable system. The controller design for an n × n MIMO system can first be considered as the task of designing n individual PID controllers, each dedicated to a single loop in the overall system. The individual loops can then be tuned using SISO-based PID tuning rules, such as ZN. This is sometimes called decentralized PID control. However, decentralized control becomes insufficient, or sometimes fails to provide adequate control, in the presence of loop interactions. Loop interaction refers to the case where an input of one loop affects other loops in the multivariable system. Detuning the ZN parameters sometimes helps to achieve stable control. To address the effect of loop interactions, some researchers have attempted to use an interaction measure as an input parameter to formulate PID parameters for MIMO processes [8], [9], [10]. Such work, however, is limited to two-input–two-output (TITO) process systems. The complexity of the design method in general does not allow one to extend those methods to higher-dimensional MIMO processes.
Recently, advanced control techniques such as optimal control, H∞ and model-based
predictive control (MPC) have been formulated to find equivalent PID terms
[11], [12], [13]. Although these are classified as PID designs, their true
functionality is similar to that of the related advanced control systems, and the PID
terms represent only an equivalent form of the preferred advanced control method.
In other words, the control structure constitutes a non-PID form and requires
additional computing blocks, such as model identification and prediction, for
real-time control.
It is known that most industrial processes are nonlinear [14], and in some cases PID
controllers have been found unsuitable for highly nonlinear plants [15]. During the
last three decades fuzzy logic control (FLC) has been widely used and has gained much
interest in many branches of engineering. The pioneering work of Mamdani and Assilian
in 1974 [16], [17] inspired many researchers to adopt FLC inference for control. The
conventional FLC attempts to replace the linear PID system with a linguistically
defined fuzzy PID (FPID) system. FPID systems have often shown superior performance
compared with their linear counterparts [18], [19], [20]. The FPID has also been
shown to be effective for controlling nonlinear processes [21]. The nonlinear mapping
of the fuzzy logic generally allows FPID systems to perform better than the linear
PID system. In addition, the heuristic nature of the rule formulation allows complex
processes to be modelled using fuzzy rules [22], [23].
There is a large volume of FPID applications in the literature in which control has
been performed for a variety of processes, including nonlinear systems. Almost all of
these applications concern SISO process systems; MIMO systems have been considered in
only a very few. Chieh and Pey [24] used a pre-compensator to decouple the MIMO
process, with a design based on the Rosenbrock–Nyquist Array (RNA) method. However,
the FPID parameters were chosen arbitrarily. Gamero and Medrano [23] used a
Mamdani-based FPID to control a biotechnology process. They used a dynamic decoupler
in order to reduce loop interactions, and the controller is based on a
two-dimensional Mamdani-type fuzzy rule base. A dynamic decoupler for a multivariable
process is sometimes not physically realizable [25]. The dynamic decoupler has also
been shown to be more sensitive to plant and process mismatch and is therefore less
popular in process control. In another application, Rahmati et al. [18] used a fuzzy
PID controller for an HVAC plant. They presented the similarity between the
conventional digital PID control algorithm and Takagi–Sugeno-based fuzzy PID control.
Recently, Shaoyuan et al. [26] presented a coordinated control strategy for
boiler-turbine control using fuzzy reasoning and auto-tuning techniques. A
self-organizing FPID controller for a robot arm is presented by Hassan et al. [27].
In these applications fuzzy logic controllers are used at the supervisory level for
self-tuning of conventional PID gains at the lower level. In all the aforementioned
methods the design of the FPID has been arbitrary and the gain parameters were chosen
using trial and error methods.
The literature review revealed that no systematic procedure is available to design
and tune FPID controllers for MIMO process systems. It is clear that the available
SISO-based FPID design techniques are difficult to extend to general MIMO systems.
This paper therefore proposes a generalized tuning scheme for both linear PID and
FPID controllers. The FPID controller follows the fuzzy inference based on the
standard additive model (SAM) proposed in [28]. The proposed tuning scheme follows
two levels of tuning, namely low-level tuning followed by high-level tuning [1]. By
considering an interaction measure among the loops, a generalized tuning technique is
developed for low-level tuning of the MIMO process. In SAM-based fuzzy inference the
consequent fuzzy sets are weighted using the centroids and volumes of the membership
functions, which can be calculated in advance using the SAM theorem. In the proposed
design the high-level tuning is dedicated to determining these centroids and volumes
with a view to achieving the desired nonlinearity of the fuzzy output.
This paper is organized as follows. The system description is presented in
Section 2. In Section 3, the two-level tuning technique is described. Low-level
tuning and a generalized linear PID controller design technique are described in
Section 4, where a new interaction measure is derived via an interaction index and
the PID controllers are tuned for the MIMO process based on this index. In Section 5,
high-level tuning is performed using the SAM-based fuzzy system. Two types of FPID
configurations are considered in Section 6, and SAM-based fuzzy controllers are
designed for each. In Section 7, the stability analysis is performed using the direct
Nyquist array (DNA) theorem, from which hard limits on the high-level tuning
parameters are found. In Section 8, the proposed tuning algorithms for FPID type I
and FPID type II are simulated for two examples and the results are compared with
the linear PID controller system. Sections 9 and 10 deal with performance analysis
and conclusions.

2 System Description

The conventional feedback strategy for an n-input, n-output multivariable system with
a static decoupler and an FPID controller is shown in Figure 1, where the
multivariable system is assumed to be linear and open-loop stable. The transfer
function of this MIMO process system is denoted by

\[
G(s) =
\begin{bmatrix}
g_{11}(s) & g_{12}(s) & \cdots & g_{1n}(s) \\
g_{21}(s) & g_{22}(s) & \cdots & g_{2n}(s) \\
\vdots & \vdots & \ddots & \vdots \\
g_{n1}(s) & g_{n2}(s) & \cdots & g_{nn}(s)
\end{bmatrix}.
\tag{1}
\]

Fig. 1 Statically decoupled multivariable control (block diagram: the error r − y drives the
decentralized fuzzy or linear PID controller Gc, followed by the static decoupler D and the
MIMO process G, whose output y is fed back)

The open-loop SISO transfer function between the ith output and the jth input, when
all other inputs are zero, is denoted by $g_{ij}$, where i, j = 1, 2, . . . , n. The
static decoupler D for the above system is defined as

\[ D = G^{-1}(0), \tag{2} \]

where it is assumed that G(0) is nonsingular.
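
As an illustration of (2), the following minimal sketch computes a static decoupler with NumPy. The steady-state gain matrix is taken from the numerators of the Example 1 transfer functions in (63), which equal G(0) because every element has a unit-gain denominator and a pure delay. The function name `static_decoupler` and the rank check are assumptions made for this example, not part of the original design procedure.

```python
import numpy as np

# Steady-state gains G(0), read from the numerators of the Example 1 model (63).
G0 = np.array([
    [0.0288, 0.0119, 0.00028],
    [0.0141, 0.0295, 0.0035],
    [0.0015, 0.0143, 0.0282],
])


def static_decoupler(G0):
    """Return D = G(0)^-1 as in (2); G(0) must be nonsingular."""
    G0 = np.asarray(G0, dtype=float)
    if np.linalg.matrix_rank(G0) < G0.shape[0]:
        raise ValueError("G(0) is singular; no static decoupler exists")
    return np.linalg.inv(G0)


D = static_decoupler(G0)
# Compensated steady-state gain L(0) = G(0) D, which should be the identity.
L0 = G0 @ D
```

With this choice the compensated plant has unit steady-state gain on the diagonal and zero steady-state coupling, which is the property the low-level tuning relies on.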

3 Two-Level Tuning

The main challenge in fuzzy control design is the tuning, particularly the choice of
the correct fuzzy system and its associated fuzzy parameters. The curse of
dimensionality caused by rule explosion [28] has been the main drawback in FLC
designs. In a typical tuning problem the parameters include the linear scaling
parameters of the control variables, the fuzzy membership parameters, the rules, and
other associated fuzzy variables in the rule base, such as the number of rules, the
membership distribution and the rule composition. The mathematical complexity of
nonlinear fuzzy control makes the formulation of a tuning mechanism an extremely
complex problem. However, the recent increase in computing power has enabled most
designers to adopt numerical optimization techniques, such as genetic algorithms and
neural networks, to generate optimum or near-optimum solutions for fuzzy systems;
these techniques have the capacity to determine a large number of unknown parameters
in fuzzy systems [29], [30]. Those applications are, however, somewhat specific and
cannot be generalized to wider process specifications. Most of those designs adopt
off-line optimization methods and cannot be implemented for online control. Moreover,
the optimization requires an accurate process model, and any process mismatch during
operation can result in poor stability and affect the overall performance.
The FPID design can be classified as a two-level tuning problem [1] in which the
tuning process is decomposed into two tuning levels. While low-level tuning addresses
the linear gains and overall stability, high-level tuning provides the nonlinear
control that enables superior performance. In a rule-coupled fuzzy system, such as a
Mamdani–Zadeh-based system, the inputs (error and its derivative) are coupled to
produce a combined fuzzy PI output [1]. The coupled nature of the inputs generally
makes the nonlinear output a complex function. As a result, it is difficult to
isolate the linear gains from the nonlinear output. In order to facilitate the
two-level tuning, we define apparent linear gains (ALG) and apparent nonlinear gains
(ANG). While the ALG terms are related to the overall performance and stability of
the system, the ANG terms provide the nonlinearity that is necessary in the fuzzy
output. In the past, some authors have attempted to provide tuning rules for the
linear gains of SISO systems [31], [32], [33]. However, the nonlinear tuning was not
sufficiently or explicitly described. In [34], the design of a conventional FPID is
identified as a two-level tuning problem and a way of obtaining ALG terms for
conventional FPID-type controllers is described. Again, the nonlinearity tuning was
not described sufficiently or explicitly enough to implement a two-level tuning. In
this section a systematic procedure is developed to devise a two-level tuning
methodology for general FPID controllers for MIMO systems.

4 Low-Level Tuning: Linear PID Controller Tuning

The PID controller matrix for an n × n MIMO process is expressed as

\[ G_c(s) = \mathrm{diag}\{c_1(s), \ldots, c_n(s)\}, \tag{3} \]

where

\[ c_i(s) = K_{Pi} + \frac{K_{Ii}}{s} + K_{Di}\,s \]

and $K_{Pi}$, $K_{Ii}$ and $K_{Di}$ are the proportional, integral and derivative
gains of the ith PID controller. For the system shown in Figure 1, the overall
compensated system, i.e. the process model together with the static decoupler, can be
written as

L(s) = G(s)D(s). (4)


Here G is the MIMO process, modeled as an open-loop stable first-order plus dead time
model, and D is the static decoupler. Using a truncated Taylor series expansion, the
transfer function L is approximated by a first-order model. Since the higher order
terms of the Taylor series expansion are set to zero, this approximation is valid
only at low frequencies. The approximated system is thus given by

\[
L(s) \approx
\begin{bmatrix}
\dfrac{1}{T_{11}s+1} & K_{12}s & \cdots & K_{1n}s \\
K_{21}s & \dfrac{1}{T_{22}s+1} & \cdots & K_{2n}s \\
\vdots & \vdots & \ddots & \vdots \\
K_{n1}s & K_{n2}s & \cdots & \dfrac{1}{T_{nn}s+1}
\end{bmatrix},
\tag{5}
\]

where $T_{ii}$ represents the time constant of the ith SISO loop and $K_{ij}$,
$i \neq j$, are the off-diagonal parameters, which represent the different loop
interactions. At steady state (s = 0) the off-diagonal terms vanish, and at low
frequencies they are proportional to the frequency (s). Hence the system can be
approximately decoupled if the bandwidth of the decentralized PID controllers is low
enough.
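
The parameters of (5) can be estimated numerically. The sketch below is one possible reading of the truncated Taylor expansion, under stated assumptions: each element of G is first-order plus dead time, $g_{ij}(s) = k_{ij}e^{-\theta_{ij}s}/(\tau_{ij}s+1) \approx k_{ij}[1-(\tau_{ij}+\theta_{ij})s]$, so that $L(s) \approx I + G'(0)D\,s$, and the diagonal is matched through $1/(T_{ii}s+1) \approx 1 - T_{ii}s$. The function name and the 2 × 2 plant data are hypothetical.

```python
import numpy as np


def low_frequency_model(k, tau, theta):
    """First-order Taylor model of L(s) = G(s) D for an FOPDT plant.

    k, tau, theta: n x n arrays of gains, time constants and dead times.
    Returns the static decoupler D, the loop time constants T_ii and the
    off-diagonal interaction parameters K_ij of (5).
    """
    G0 = np.asarray(k, dtype=float)
    # dG/ds at s = 0 for k * exp(-theta s) / (tau s + 1)
    dG0 = -G0 * (np.asarray(tau, dtype=float) + np.asarray(theta, dtype=float))
    D = np.linalg.inv(G0)           # static decoupler, eq. (2)
    L1 = dG0 @ D                    # coefficient of s in L(s) ~ I + L1 s
    T = -np.diag(L1)                # 1/(T s + 1) ~ 1 - T s on the diagonal
    K = L1 - np.diag(np.diag(L1))   # K_ij s off the diagonal
    return D, T, K


# Hypothetical 2 x 2 plant data, for illustration only.
k = [[1.0, 0.2], [0.1, 2.0]]
tau = [[4.0, 6.0], [5.0, 3.0]]
theta = [[1.0, 2.0], [2.0, 1.0]]
D, T, K = low_frequency_model(k, tau, theta)
```

For a plant with no coupling the sketch returns $T_{ii} = \tau_{ii} + \theta_{ii}$ and $K_{ij} = 0$, which is the usual low-frequency equivalent of a delayed first-order lag.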
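The open-loop transfer function of the system shown in Figure 1 is written as

\[ Q(s) = G(s)\,D\,G_c(s) = L(s)\,G_c(s). \]

Let

\[
Q(s) =
\begin{bmatrix}
q_{11}(s) & q_{12}(s) & \cdots & q_{1n}(s) \\
q_{21}(s) & q_{22}(s) & \cdots & q_{2n}(s) \\
\vdots & \vdots & \ddots & \vdots \\
q_{n1}(s) & q_{n2}(s) & \cdots & q_{nn}(s)
\end{bmatrix},
\tag{6}
\]

where, since $G_c(s)$ is diagonal, the jth column of L(s) is multiplied by $c_j(s)$:

\[
q_{ij}(s) =
\begin{cases}
K_{ij}\,s\left(K_{Pj} + \dfrac{K_{Ij}}{s} + K_{Dj}s\right), & i \neq j \\[2ex]
\dfrac{K_{Pi} + \dfrac{K_{Ii}}{s} + K_{Di}s}{T_{ii}s+1}, & i = j.
\end{cases}
\tag{7}
\]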
The closed-loop relation for this system is expressed as

\[ y = [I + Q(s)]^{-1} Q(s)\, r, \tag{8} \]

where r and y are the input and output vectors, respectively, and Q(s) already
contains the controller $G_c(s)$. The closed-loop transfer matrix H(s) between y and
r can then be written as

\[ H(s) = [I + Q(s)]^{-1} Q(s). \]

Let

\[
H(s) =
\begin{bmatrix}
h_{11}(s) & h_{12}(s) & \cdots & h_{1n}(s) \\
h_{21}(s) & h_{22}(s) & \cdots & h_{2n}(s) \\
\vdots & \vdots & \ddots & \vdots \\
h_{n1}(s) & h_{n2}(s) & \cdots & h_{nn}(s)
\end{bmatrix}.
\tag{9}
\]

4.1 Tuning First Loop

When all other loops are open, the elements in the first column of H(s) can be
written as

\[ h_{i1}(s) = \frac{q_{i1}(s)}{1 + q_{11}(s)} = q_{i1}(s)\,S_1, \]

where $S_1 = (1 + q_{11}(s))^{-1}$ is defined as the sensitivity function of the
first loop [35]. Thus, for a step input change in the first loop, the interactions
with the other loops at low frequencies can be computed as

\[
\begin{aligned}
h_{i1}(s) &= \lim_{s \to 0} q_{i1}(s)\,S_1 \\
&= \lim_{s \to 0} K_{i1}\,s\left(K_{P1} + \frac{K_{I1}}{s} + K_{D1}s\right) S_1 \\
&= K_{i1} K_{I1} S_1 .
\end{aligned}
\tag{10}
\]

Then the upper bound of the interaction is given by

\[ |h_{i1}(s)| \le \max_{i \neq 1}(|K_{i1}|)\,|K_{I1}|\,(S_1)_{\max}, \tag{11} \]

where $(S_1)_{\max}$ is the maximum value of $S_1$ and $\max(|K_{i1}|)$ is the
maximum absolute value of $K_{i1}$, $i \neq 1$. Hence we can introduce the
interaction index of the first loop as

\[ I_1 = \max_{i \neq 1}(|K_{i1}|)\,|K_{I1}|\,(S_1)_{\max}. \tag{12} \]

The value of $K_{I1}$ can be calculated for a particular value of $(S_1)_{\max}$ so
that the interaction index $I_1$ is kept as low as possible. The remaining
interactions are then also reduced according to inequality (11). The proportional
gain $K_{P1}$ of the PID controller is computed using the time constant of the
first-order approximated process and the designed integral gain. The derivative gain
$K_{D1}$ is chosen from the ZN formula

\[ T_{D1} = \tfrac{1}{4}\,T_{I1}, \tag{13} \]

where $T_{D1}$ and $T_{I1}$ are the derivative and integral time constants of the
PID controller in the first loop. With $K_{I1} = K_{P1}/T_{I1}$ and
$K_{D1} = K_{P1}T_{D1}$, this gives

\[ K_{D1} = \frac{K_{P1}^2}{4K_{I1}}. \tag{14} \]
In order to find $K_{P1}$, the direct pole placement method [35] is used as follows.
The closed-loop transfer function of the first loop with the reduced first-order
model and the PID controller is given by

\[
h_{11}(s) =
\frac{\dfrac{K_{D1}s^2 + K_{P1}s + K_{I1}}{T_{11} + K_{D1}}}
{s^2 + \left(\dfrac{1 + K_{P1}}{T_{11} + K_{D1}}\right)s + \dfrac{K_{I1}}{T_{11} + K_{D1}}}.
\tag{15}
\]

Considering the second-order dynamics of the denominator in (15), the crossover
frequency of the first loop can be written as

\[ \omega_{o1} = \sqrt{K_{I1}/(T_{11} + K_{D1})} \]

and the proportional gain is given by

\[ K_{P1} = 2\zeta_1\,\omega_{o1}\,(T_{11} + K_{D1}) - 1, \tag{16} \]

where $\zeta_1$ is the damping constant of a second-order system. From (14) and (16),

\[
K_{P1} = \frac{1 \pm \zeta_1\sqrt{K_{I1}^2 - 4\zeta_1^2 K_{I1}^3 T_{11} + 4K_{I1}T_{11}^2}}
{4K_{I1}^2\,\zeta_1^2 - 1}.
\tag{17}
\]

The same procedure is repeated for the other loops, each of which is tuned while
keeping its interaction index at a minimum.

4.2 Tuning ith loop

This section introduces the generalized interaction index for an n × n MIMO process
system as

\[ I_i = \max_{i \neq j}(|K_{ij}|)\,(|K_{Ii}|)\,(S_i)_{\max}, \tag{18} \]

where

\[ (S_i)_{\max} = \max\,(1 + q_{ii}(s))^{-1} \]

is the maximum value of the ith loop sensitivity function, for which a reasonable
range is 1.3–2 [35], and $\max(|K_{ij}|)$ is the maximum absolute value of $K_{ij}$,
$i \neq j$. The integral and proportional gains of each loop can be evaluated as

\[ K_{Ii} = \frac{I_i}{\max_{i \neq j}(|K_{ij}|)\,(S_i)_{\max}} \tag{19} \]

and

\[
K_{Pi} = \frac{1 \pm \zeta_i\sqrt{K_{Ii}^2 - 4\zeta_i^2 K_{Ii}^3 T_{ii} + 4K_{Ii}T_{ii}^2}}
{4K_{Ii}^2\,\zeta_i^2 - 1}.
\tag{20}
\]

By selecting a suitable value for $\zeta_i$, $K_{Pi}$ can be calculated. Then

\[ K_{Di} = \frac{K_{Pi}^2}{4K_{Ii}}, \tag{21} \]

and we can define $K_{Pi}$, $K_{Ii}$ and $K_{Di}$ as the ALG terms for the FLC.
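
The low-level tuning of one loop can be sketched numerically as follows. The integral gain follows (19). Rather than evaluating the closed form (20), this sketch solves the pole-placement condition (16) and the ZN relation (21) simultaneously: substituting (21) into the squared form of (16) gives the quadratic $(1-\zeta_i^2)K_{Pi}^2 + 2K_{Pi} + 1 - 4\zeta_i^2 K_{Ii}T_{ii} = 0$, which is a rearrangement made for this example and not a formula from the text. The function name and all numerical values are hypothetical.

```python
import math


def low_level_gains(I_i, K_off_max, S_max, T_ii, zeta):
    """Return (K_P, K_I, K_D) for one loop from (19), (16) and (21)."""
    K_I = I_i / (K_off_max * S_max)                      # eq. (19)
    # Quadratic a*K_P^2 + 2*K_P + c = 0 obtained from (16) and (21).
    a = 1.0 - zeta ** 2
    c = 1.0 - 4.0 * zeta ** 2 * K_I * T_ii
    if abs(a) < 1e-12:                                   # zeta = 1: linear case
        candidates = [-c / 2.0]
    else:
        disc = 4.0 - 4.0 * a * c
        if disc < 0.0:
            raise ValueError("no real proportional gain for these settings")
        root = math.sqrt(disc)
        candidates = [(-2.0 + root) / (2.0 * a), (-2.0 - root) / (2.0 * a)]
    # Squaring (16) can add spurious roots; keep positive gains only.
    positive = [kp for kp in candidates if kp > 0.0]
    if not positive:
        raise ValueError("no positive proportional gain for these settings")
    K_P = min(positive)
    K_D = K_P ** 2 / (4.0 * K_I)                         # eq. (21)
    return K_P, K_I, K_D


# Hypothetical design values: index 0.6, max |K_ij| = 2, (S_i)max = 1.5.
K_P, K_I, K_D = low_level_gains(I_i=0.6, K_off_max=2.0, S_max=1.5,
                                T_ii=4.35, zeta=0.7)
```

A quick consistency check is to recompute $\omega_{oi} = \sqrt{K_{Ii}/(T_{ii}+K_{Di})}$ and confirm that $2\zeta_i\omega_{oi}(T_{ii}+K_{Di}) - 1$ reproduces the returned $K_{Pi}$.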

5 High-Level Tuning: Nonlinearity Tuning

The high-level tuning is dedicated to determining the fuzzy rule base parameters that
have direct relevance to the nonlinearity of the FLC output. The nonlinearity
generated through the fuzzy mapping is adjusted using the high-level tuning
parameters. In general the nonlinearity can be adjusted either by changing the rules
or by changing the knowledge base parameters, such as the membership shapes and their
distribution over the universe of discourse of the variables. An effective
nonlinearity tuning mechanism should be flexible enough to change the nonlinearity of
the fuzzy output over a wide range. A proper selection of the fuzzy inference
mechanism is important in achieving efficient high-level tuning [38]. It is found
that SAM-based fuzzy inference provides a convenient way to obtain the desired
nonlinearity by changing membership parameters.

5.1 Standard Additive Model (SAM)

In an additive fuzzy system (controller), the rules are fired in parallel to some
degree. The system then weights and averages the then-part or consequent fuzzy sets
to infer the output fuzzy set [36], [37]. Finally, the system defuzzifies the output
fuzzy set using the centroid of the membership functions to generate the fuzzy
output. An additive fuzzy system is a function approximator, and SAM is the simplest
form of an additive fuzzy system [28]. According to Kosko, an additive FLC divides
the global conditional mean into a convex sum of local conditional means, while the
conventional centroid-type FLC computes the conditional mean as output. Each
then-part or consequent fuzzy set of the SAM is characterized by a centroid and an
area or volume. The SAM theorem [28], which is described in Section 5.2, allows these
volumes and centroids to be computed in advance, and this feature allows fast
implementation of the FLC for real-time control. Consider fuzzy rules of the form

IF X = Aα THEN Y = Bβ

where X and Y are nonempty sets and λ and ζ are nonempty index sets. Then
Aα : α ∈ λ and Bβ : β ∈ ζ represent the input fuzzy sets of X and the output fuzzy
sets of Y, respectively. An additive fuzzy system stores m such fuzzy rules. These
rules describe fuzzy subsets or fuzzy patches in the Cartesian product space X × Y,
as shown in Figure 2. Hence an additive fuzzy system (a collection of IF-THEN rules)
approximates a function F : X → Y. The general framework for a feedforward additive
fuzzy system is shown in Figure 3. An input x fires the if-part of all m rules to
some degree in parallel. The system then weights the then-parts (using the rule
weights $w_\beta$) to produce the new fuzzy sets $B'_\beta$. The weighted sum of the
inferred fuzzy sets forms the output set B:

\[ B = \sum_{\beta=1}^{m} w_\beta\,B'_\beta(x). \tag{22} \]

The weights $w_\beta$ reflect rule credibility or frequency and provide an extra term
for a learning system to tune. In practice the rule weights are often set equal to
unity: $w_1 = \ldots = w_m = 1$. SAM is a special case of the additive model
framework, with the following special properties.
1. The fired then-part set $B'_\beta$ is the fit product $a_\beta(x)B_\beta$, where
   the fit value $a_\beta(x)$ ($a_\beta$ is called the membership function) expresses
   the membership grade of the input x in the if-part fuzzy set Aα. The output set
   can then be expressed as

   \[ B = \sum_{\beta=1}^{m} w_\beta\,a_\beta(x)\,B_\beta(x). \tag{23} \]

2. The system output F(x) is computed as the centroid of the output set B(x), which
   defuzzifies it to a scalar or a vector:

   \[ F(x) = \mathrm{Centroid}\left(\sum_{\beta=1}^{m} w_\beta\,a_\beta(x)\,B_\beta(x)\right). \tag{24} \]

The centroid gives the fuzzy system F the structure of a conditional expectation, and
F acts as an optimal nonlinear approximator in the mean-squared sense.

Fig. 2 Function approximator: Additive fuzzy system (fuzzy patches formed by the if-part sets
A1, ..., A5 and the then-part sets B1, ..., B5)

Fig. 3 General framework of additive fuzzy system (the input x fires the rules "If A0 then B0",
..., "If Am then Bm" in parallel; the fired sets B′0, ..., B′m are weighted by w0, ..., wm,
summed into B and passed to a centroidal defuzzifier giving y = F(x))

5.2 SAM Theorem

The SAM theorem, proposed by Kosko [28], allows us to compute the then-part
parameters in advance. Suppose the fuzzy system $F : R^n \to R^p$ is a standard
additive model as shown in (24). Then F(x) is a convex sum of the m then-part set
centroids:

\begin{align}
F(x) &= \frac{\sum_{\beta=1}^{m} w_\beta\,a_\beta(x)\,V_\beta\,C_\beta}
             {\sum_{\beta=1}^{m} w_\beta\,a_\beta(x)\,V_\beta} \tag{25} \\
     &= \sum_{\beta=1}^{m} p_\beta(x)\,C_\beta . \tag{26}
\end{align}

The convex coefficients or discrete probability weights $p_1(x), \ldots, p_m(x)$
depend on the input x through the ratios

\[ p_\beta(x) = \frac{w_\beta\,a_\beta(x)\,V_\beta}{\sum_{k=1}^{m} w_k\,a_k(x)\,V_k}. \tag{27} \]

$V_\beta$ is the finite positive volume (or area if p = 1 in the range space $R^p$)
and $C_\beta$ is the centroid of the then-part set $B_\beta$:

\[ V_\beta = \int_{R^p} b_\beta(y_1, \ldots, y_p)\,dy_1 \ldots dy_p > 0, \tag{28} \]

\[
C_\beta = \frac{\int_{R^p} y\,b_\beta(y_1, \ldots, y_p)\,dy_1 \ldots dy_p}
               {\int_{R^p} b_\beta(y_1, \ldots, y_p)\,dy_1 \ldots dy_p}. \tag{29}
\]

The popular scalar case p = 1 reduces (28) and (29) to

\[ V_\beta = \int_{-\infty}^{\infty} b_\beta(y)\,dy, \tag{30} \]

\[
C_\beta = \frac{\int_{-\infty}^{\infty} y\,b_\beta(y)\,dy}
               {\int_{-\infty}^{\infty} b_\beta(y)\,dy}. \tag{31}
\]

The SAM theorem thus allows us to calculate these volumes and centroids (or local
conditional means) in advance. They can also be made adaptive in real-time control.
For each input x we need to compute only the m fit values $a_\beta(x)$ and then
update the ratio in (25). The consequent then-part fuzzy sets $B_\beta$ can take the
form of a symmetrical triangle, trapezoid or bell curve, so that the area and
centroid are easy to calculate. The SAM structure (25) even allows all then-part
fuzzy sets $B_\beta$ to be replaced by rectangles or other non-singleton sets
$R_\beta$ having the same volume $V_\beta$ and centroid $C_\beta$. This would not
change the output value F(x).
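
The scalar SAM computation in (25), (30) and (31) can be illustrated with a short sketch. The volumes and centroids are obtained once by trapezoidal integration of the then-part sets, after which each output needs only the fit values. The helper names, the triangular then-part sets and the fit values are assumptions chosen for illustration.

```python
import numpy as np


def volume_and_centroid(b, y):
    """Evaluate (30) and (31) by trapezoidal integration on the grid y."""
    by = b(y)
    dy = np.diff(y)
    V = np.sum(0.5 * (by[1:] + by[:-1]) * dy)
    yb = y * by
    C = np.sum(0.5 * (yb[1:] + yb[:-1]) * dy) / V
    return V, C


def sam_output(a, V, C, w=None):
    """SAM output (25): convex sum of the then-part centroids."""
    a = np.asarray(a, dtype=float)
    w = np.ones_like(a) if w is None else np.asarray(w, dtype=float)
    weights = w * a * V
    total = weights.sum()
    if total <= 0.0:
        raise ValueError("no rule fired for this input")
    return float(np.dot(weights, C) / total)


def triangle(center, half_width):
    """Symmetric triangular then-part set with unit height."""
    return lambda y: np.clip(1.0 - np.abs(y - center) / half_width, 0.0, None)


y = np.linspace(-3.0, 3.0, 6001)
sets = [triangle(-1.0, 0.5), triangle(0.0, 1.0), triangle(1.0, 0.5)]
# Volumes and centroids are computed once, in advance.
V, C = map(np.array, zip(*(volume_and_centroid(b, y) for b in sets)))
# Per-input work: only the fit values a_beta(x) change.
F = sam_output([0.0, 0.25, 0.75], V, C)
```

Because only V and C enter (25), replacing each triangle by any other set with the same area and centroid would leave F unchanged, which is the property noted at the end of this section.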

6 Fuzzy PID (FPID) Configurations

Two types of fuzzy PID configurations are considered:

1. Rule-coupled FPID (type I)
2. Rule-decoupled FPID (type II)
Figure 4 shows the two types of FPID configuration for the ith loop. Type I is a
conventional Mamdani-type FPID; it has three inputs and produces an incremental FPID
signal. Type II uses SISO rule inference to provide decoupled and independent tuning
of the three actions in the PID signal [38].

Fig. 4 FPID configurations. (a) Type I: rule-coupled FPID, (b) Type II: rule decoupled FPID

Using suitable scale factors $S_{wi}$, where w = 1, 2, 3, the feedback error terms
($e_i$) and the corresponding normalized error variables ($\hat{e}_i$) at the nth
sampling instance can be expressed as

\[
\begin{aligned}
\hat{e}_i(n) &= S_{1i}\,e_i(n) \\
\Delta\hat{e}_i(n) &= S_{2i}\,\Delta e_i(n) \\
\Delta^2\hat{e}_i(n) &= S_{3i}\,\Delta^2 e_i(n).
\end{aligned}
\tag{32}
\]

For convenience define

\[
\begin{aligned}
e_{1i} &= e_i \\
e_{2i} &= \Delta e_i \\
e_{3i} &= \Delta^2 e_i .
\end{aligned}
\tag{33}
\]

All FLC input variables are normalized to the compact region [−1, 1]. The error
variables are normalized using the condition
$\hat{e}_{wi} = \max(-1, \min(1, S_{wi}e_{wi}))$. The defuzzified controller output
after the fuzzy inference is denoted by $\hat{u}$. Similarly, the FLC output is
normalized using the condition $\hat{u} \equiv u/u_{\max}$.
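
The scaling and saturation of the three error variables can be sketched as below. The backward differences and their zero initial values are assumptions made for this example, since the text does not state how the first samples are handled; the function name and scale factors are hypothetical.

```python
import numpy as np


def normalized_errors(e, S):
    """Return the 3 x N array of normalized error variables.

    e: error samples e_i(n); S: scale factors (S_1i, S_2i, S_3i).
    Rows are e_hat, delta e_hat and delta^2 e_hat, each saturated to [-1, 1].
    """
    e = np.asarray(e, dtype=float)
    de = np.diff(e, prepend=e[0])      # e(n) - e(n-1), first value set to 0
    d2e = np.diff(de, prepend=de[0])   # de(n) - de(n-1), first value set to 0
    raw = np.vstack([e, de, d2e])
    scaled = np.asarray(S, dtype=float)[:, None] * raw
    return np.clip(scaled, -1.0, 1.0)  # max(-1, min(1, S_wi * e_wi))


# Hypothetical error sequence and scale factors.
e_hat = normalized_errors([2.0, 1.0, 0.5, 0.25], S=(0.25, 2.0, 1.0))
```

The saturation matters for the second row here: a change in error of −1 scaled by 2 would give −2, which is clipped to −1 before it reaches the fuzzy inference.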

6.1 High-Level Nonlinear Tuning Variables

The nonlinear tuning variables are selected to affect the ANG terms at any given
local control point on the control surface. As PID gains are proportional to the
slopes of the control surface, the slope angles of the tangents drawn at a given
point on the nonlinear control surface are taken as the nonlinear tuning variables.
In order to isolate them from their associated outputs of the type I controller, the
slopes are measured in the planes of the individual error axes. The measurement of
these angles with respect to a two-dimensional control surface is shown in
Figure 5(a). Figure 5(b) shows a control curve that has been projected onto a chosen
error variable.

Fig. 5 Nonlinear tuning variables measured at local control points of SAM. (a) Control surface
û_f over the error axes ê_1i and ê_2i, (b) control curve û_f against ê with the angles (θ0)wi,
(α0)wi, (α1)wi and (θ1)wi

In general, for a three-input coupled rule base the slope angles can be described by

\[
\begin{aligned}
(\theta_0)_{wi} &= \left.\frac{\partial \hat{u}_f}{\partial \hat{e}_{wi}}\right|_{\hat{e}_{wi} = -1} \\
(\theta_1)_{wi} &= \left.\frac{\partial \hat{u}_f}{\partial \hat{e}_{wi}}\right|_{\hat{e}_{wi} = 1},
\end{aligned}
\tag{34}
\]

where $\hat{u}_f = \hat{u}(\hat{e}_p = 0)$, p = 1, 2, 3, $p \neq w$.

The fuzzy system designed for PID control should allow independent variations of
$\theta_0$ and $\theta_1$ within the range [0°, 90°]. This allows the nonlinearity to
be adjusted during the high-level tuning for optimum performance.

6.2 Design of SAM

Consider two control regions in the controller output space. The first region is
where the normalized error variable satisfies $-1 \le \hat{e}_i < 0$. The local
control in this region affects the steady state, load disturbance and overshoot
properties. The second region is where $0 \le \hat{e}_i \le 1$. The control in this
region affects the speed of response during the transient, the undershoot and the
steady state properties. The objective is to realize independent adjustment of the
FLC parameters with a view to changing the ANG terms at the chosen control points.
The membership functions ($a_i$) for the if-part in the SAM are chosen as triangular
functions, as shown in Figure 6. The slope angle θ for type II (see Figure 5(b)) can
be described by

\[
\theta =
\begin{cases}
\arctan\left[\dfrac{-V_{0i}C_{0i} + V_{1i}C_{1i}}{-\hat{e}_i V_{0i} + (\hat{e}_i + 1)V_{1i}}
 - \dfrac{\left(-\hat{e}_i V_{0i}C_{0i} + (\hat{e}_i + 1)V_{1i}C_{1i}\right)\left(-V_{0i} + V_{1i}\right)}
         {\left(-\hat{e}_i V_{0i} + (\hat{e}_i + 1)V_{1i}\right)^2}\right],
 & -1 \le \hat{e}_i < 0 \\[3ex]
\arctan\left[\dfrac{-V_{1i}C_{1i} + V_{2i}C_{2i}}{-(\hat{e}_i - 1)V_{1i} + \hat{e}_i V_{2i}}
 - \dfrac{\left(-(\hat{e}_i - 1)V_{1i}C_{1i} + \hat{e}_i V_{2i}C_{2i}\right)\left(-V_{1i} + V_{2i}\right)}
         {\left(-(\hat{e}_i - 1)V_{1i} + \hat{e}_i V_{2i}\right)^2}\right],
 & 0 \le \hat{e}_i \le 1 .
\end{cases}
\tag{35}
\]
In this analysis, the then-part centroids $C_{wi}$ are selected as

\[ C_{0i} = -1, \quad C_{1i} = 0 \quad \text{and} \quad C_{2i} = 1. \tag{36} \]

Fig. 6 Membership functions for if part in SAM (triangular sets A0i, A1i and A2i, with then-part
volumes V0i, V1i and V2i, peaking at êw = −1, 0 and 1)


The stability properties are determined by the extreme values of the equivalent PID
gains. Therefore, to guarantee stability, the maximum and minimum ANG terms are
considered in an equivalent linear PID system. In the SAM inference the maximum or
minimum of ANG occurs at $\hat{e}_i = -1$, $\hat{e}_i = 0$ and $\hat{e}_i = 1$. The
slope angles at the four selected points (see Figure 5(b)) are then

\begin{align}
(\theta_0)_{wi} &= \arctan(V_{1i}/V_{0i}) \tag{37} \\
(\alpha_0)_{wi} &= \arctan(V_{0i}/V_{1i}) \tag{38} \\
(\theta_1)_{wi} &= \arctan(V_{1i}/V_{2i}) \tag{39} \\
(\alpha_1)_{wi} &= \arctan(V_{2i}/V_{1i}) \tag{40}
\end{align}

Clearly, the angles in each of the pairs $\{(\theta_0)_{wi}, (\alpha_0)_{wi}\}$ and
$\{(\theta_1)_{wi}, (\alpha_1)_{wi}\}$ are complementary, i.e. each pair sums to a
right angle. Hence only two independent slope angles can be defined over the control
surface of the SAM, corresponding to the two regions $-1 \le \hat{e}_i < 0$ and
$0 \le \hat{e}_i \le 1$. We therefore select $(\theta_0)_{wi}$ and $(\theta_1)_{wi}$
as the two independent slope angles to be adjusted within the range [0°, 90°] for
high-level tuning. In order to obtain two independent angles, the then-part volume of
the second membership function is set to unity: $V_{1i} = 1$. Then

\begin{align}
\theta_0 &= \arctan(1/V_0) \tag{41} \\
\theta_1 &= \arctan(1/V_2) \tag{42}
\end{align}

Hence the terms $V_0$ and $V_2$ are the nonlinear tuning variables for the SAM.
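
The relation between the then-part volumes and the slopes of the control curve can be checked numerically. The sketch below evaluates the three-rule SAM of (25) with triangular if-part sets peaking at −1, 0 and 1, the centroids of (36) and $V_1 = 1$, and estimates the slopes at the four control points by one-sided finite differences. Function names and the values $V_0 = 2$, $V_2 = 0.5$ are illustrative assumptions.

```python
import math


def sam_pid_term(e_hat, V0, V2, V1=1.0, C=(-1.0, 0.0, 1.0)):
    """Three-rule SAM output for one normalized error variable."""
    e_hat = max(-1.0, min(1.0, e_hat))
    if e_hat < 0.0:
        a = (-e_hat, e_hat + 1.0, 0.0)   # triangles A0 and A1 are active
    else:
        a = (0.0, 1.0 - e_hat, e_hat)    # triangles A1 and A2 are active
    V = (V0, V1, V2)
    num = sum(ai * Vi * Ci for ai, Vi, Ci in zip(a, V, C))
    den = sum(ai * Vi for ai, Vi in zip(a, V))
    return num / den


def one_sided_slope(f, x, side, h=1e-6):
    """Finite-difference slope of f at x, taken towards side (+1 or -1)."""
    return (f(x + side * h) - f(x)) / (side * h)


V0, V2 = 2.0, 0.5
f = lambda e: sam_pid_term(e, V0, V2)

slope_theta0 = one_sided_slope(f, -1.0, +1)  # compare with V1/V0, eq. (37)
slope_alpha0 = one_sided_slope(f, 0.0, -1)   # compare with V0/V1, eq. (38)
slope_alpha1 = one_sided_slope(f, 0.0, +1)   # compare with V2/V1, eq. (40)
slope_theta1 = one_sided_slope(f, 1.0, -1)   # compare with V1/V2, eq. (39)

theta0_deg = math.degrees(math.atan(1.0 / V0))  # eq. (41)
theta1_deg = math.degrees(math.atan(1.0 / V2))  # eq. (42)
```

Setting $V_0 = V_2 = 1$ makes the control curve the straight line $\hat{u} = \hat{e}$, so the SAM term then reduces to its linear counterpart and both angles equal 45°.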

7 Stability Analysis

7.1 Direct Nyquist Array (DNA) Stability Theorem

An analytical expression for the ith Gershgorin band of Q(s) is given by

\[ q_{ii}(j\omega) + R_i(\omega)\,e^{j\theta}, \quad \theta \in [0, 2\pi],\ \forall \omega, \]

where

\[ R_i(\omega) = \sum_{j \neq i} |q_{ij}(j\omega)| \quad \text{for } i = 1, 2, \ldots, n \tag{43} \]

is the radius of the ith Gershgorin circle. The DNA stability theorem [39], [40],
[41] is then expressed as follows. When the Gershgorin bands based on the diagonal
elements $q_{ii}(s)$ of Q(s) exclude the point (−1 + j0) and the ith Gershgorin band
encircles the point (−1 + j0) $N_i$ times anticlockwise, the closed-loop system is
stable if, and only if,

\[ \sum_{i=1}^{n} N_i = p_0, \]

where $p_0$ is the number of unstable poles of Q(s). In this work Q(s) is assumed to
be open-loop stable, since most industrial processes are open-loop stable systems
[42]. Then $p_0 = 0$ for this stability analysis. Hence, if the Gershgorin bands
neither encircle nor include the critical point (−1 + j0) for all i, the closed-loop
system is stable.
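
The inclusion part of this test is easy to check on a frequency grid. The following sketch evaluates the Gershgorin radii (43) and verifies that the critical point lies outside every circle, i.e. $|1 + q_{ii}(j\omega)| > R_i(\omega)$. It does not count encirclements, which must be checked separately (for example from a Nyquist plot of each $q_{ii}$). The 2 × 2 open-loop matrix, built from the low-frequency form (7) with PI control, and the frequency range are hypothetical.

```python
import numpy as np


def bands_exclude_critical_point(Q_of_s, omegas):
    """True if no Gershgorin circle of Q(jw) contains -1 + j0 on the grid."""
    for w in omegas:
        Q = np.asarray(Q_of_s(1j * w))
        diag = np.diag(Q)
        radii = np.sum(np.abs(Q), axis=1) - np.abs(diag)   # eq. (43)
        # Distance from the circle centre q_ii to the critical point -1.
        if np.any(np.abs(1.0 + diag) <= radii):
            return False
    return True


def make_Q(K_off, T=4.0, K_P=1.0, K_I=0.2):
    """Hypothetical 2 x 2 open-loop matrix in the form of (7), PI control."""
    def Q(s):
        c = K_P + K_I / s
        return np.array([[c / (T * s + 1.0), K_off * s * c],
                         [K_off * s * c, c / (T * s + 1.0)]])
    return Q


# The model (5) is a low-frequency approximation, so the grid stops at 1 rad/s.
omegas = np.logspace(-3, 0, 400)
weak_coupling_ok = bands_exclude_critical_point(make_Q(0.1), omegas)
strong_coupling_ok = bands_exclude_critical_point(make_Q(20.0), omegas)
```

With weak coupling the circles stay small and the critical point is excluded; with the deliberately large coupling the circles swell until one of them covers −1 + j0 and the check fails.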

7.2 Maximum Values of PID Parameters

Ho et al. [41] have defined the gain and phase margins of MIMO systems as follows.
Figure 7 shows a Nyquist diagram with the Gershgorin circle at the gain crossover
frequency (defined as $\omega_{gi}$) of the ith loop. The Gershgorin circle
intersects the unit circle at A. At the phase crossover frequency (defined as
$\omega_{pi}$), the Gershgorin circle intersects the negative real axis at C, as
shown in Figure 8. The phase and gain margins for the MIMO system are then defined as

\[ \phi'_i = \pi + \arg(AOB) \quad \text{and} \tag{44} \]

\[ \alpha'_i = \frac{1}{|OC|}. \tag{45} \]
In order to guarantee stability according to the DNA theorem, the Gershgorin bands
should be shaped, based on predefined values of $\phi'_i$ and $\alpha'_i$, so that
they exclude and do not encircle the point (−1 + j0). As a rule of thumb, $\phi'_i$
and $\alpha'_i$ should satisfy the following conditions:

\[ 30^\circ \le \phi'_i \le 60^\circ \quad \text{and} \tag{46} \]

\[ 2 \le \alpha'_i \le 5. \tag{47} \]

Fig. 7 Nyquist diagram with the Gershgorin circle at the gain crossover frequency ωg (the circle
of radius Ri(ωg) centred on qii(jωg) intersects the unit circle at A; the angles φi and φ′i are
marked)

Fig. 8 Nyquist diagram with the Gershgorin circle at the phase crossover frequency ωp (the circle
intersects the negative real axis at C; |qii(jωp)| = 1/αi and Ri(ωp) + |qii(jωp)| = 1/α′i)

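The angle $\phi_i$ in Figure 7 and the ratio $\alpha_i$ in Figure 8 are the phase and
gain margins of the SISO system, respectively. The following expression can be
derived for $\phi_i$ (see Figure 7):

\[
\begin{aligned}
\phi_i &= \phi'_i + 2\arcsin\left(\frac{\sum_{j \neq i} |q_{ij}(j\omega_{gi})|}{2\,|q_{ii}(j\omega_{gi})|}\right) \\
       &= \phi'_i + 2\arcsin\left(\frac{\sum_{j \neq i} |g_{ij}(j\omega_{gi})|}{2\,|g_{ii}(j\omega_{gi})|}\right).
\end{aligned}
\tag{48}
\]

From Figure 8, where $|q_{ii}(j\omega_{pi})| = 1/\alpha_i$ and
$R_i(\omega_{pi}) + |q_{ii}(j\omega_{pi})| = 1/\alpha'_i$, $\alpha_i$ can be derived
as follows:

\[
\begin{aligned}
\alpha_i &= \alpha'_i\left(1 + \frac{\sum_{j \neq i} |q_{ij}(j\omega_{pi})|}{|q_{ii}(j\omega_{pi})|}\right) \\
         &= \alpha'_i\left(1 + \frac{\sum_{j \neq i} |g_{ij}(j\omega_{pi})|}{|g_{ii}(j\omega_{pi})|}\right).
\end{aligned}
\tag{49}
\]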

In order to guarantee stability, the gain margin $\alpha'_i$ and phase margin
$\phi'_i$ of the MIMO process can be predefined while satisfying (46) and (47). The
limits of the PID parameters can then be calculated for the ith loop. The following
four equations can be used to calculate the four unknowns $\omega_{pi}$,
$\omega_{gi}$, $K_{Pi}$ and $K_{Ii}$ of the ith loop:

\begin{align}
\alpha_i &= \frac{1}{|g_{ii}(j\omega_{pi})\,c_i(j\omega_{pi})|} \tag{50} \\
\arg[g_{ii}(j\omega_{pi})\,c_i(j\omega_{pi})] &= -\pi \tag{51} \\
\phi_i &= \pi + \arg[g_{ii}(j\omega_{gi})\,c_i(j\omega_{gi})] \tag{52} \\
|g_{ii}(j\omega_{gi})\,c_i(j\omega_{gi})| &= 1 \tag{53}
\end{align}

Substituting from (48) and (49) in (50)–(53),

\begin{align}
f_{1,i} &= \alpha'_i\,|c_i(j\omega_{pi})|\left\{|g_{ii}(j\omega_{pi})| + \sum_{j \neq i} |g_{ij}(j\omega_{pi})|\right\} - 1 = 0 \tag{54} \\
f_{2,i} &= \arg[g_{ii}(j\omega_{pi})\,c_i(j\omega_{pi})] + \pi = 0 \tag{55} \\
f_{3,i} &= \pi + \arg[g_{ii}(j\omega_{gi})\,c_i(j\omega_{gi})] - \phi'_i
          - 2\arcsin\left(\frac{\sum_{j \neq i} |g_{ij}(j\omega_{gi})|}{2\,|g_{ii}(j\omega_{gi})|}\right) = 0 \tag{56} \\
f_{4,i} &= |g_{ii}(j\omega_{gi})\,c_i(j\omega_{gi})|^2 - 1 = 0 \tag{57}
\end{align}

Then we can define

\[ K_{Pi\,\max} \quad \text{and} \quad K_{Ii\,\max} \tag{58} \]

as the maximum values of the PI parameters at a given $\phi'_i$ and $\alpha'_i$.
From (21),

\[ K_{Di\,\max} = \frac{K_{Pi\,\max}^2}{4K_{Ii\,\max}}. \tag{59} \]

Since PID gains are proportional to the slopes of the control surface shown in
Figure 5, we can find the maximum slope angles corresponding to $K_{Pi\,\max}$,
$K_{Ii\,\max}$ and $K_{Di\,\max}$. For instance, let the proportional SAM-based fuzzy
controller for the ith loop have the high-level tuning parameters $V_0$ and $V_2$.
From (37)–(42) the following expressions can be derived:

\[
\begin{aligned}
V_{0\,\min} &= V_{2\,\min} = K_{Pi}/K_{Pi\,\max} \\
V_{0\,\max} &= V_{2\,\max} = K_{Pi\,\max}/K_{Pi}.
\end{aligned}
\tag{60}
\]

The limiting angles for $\theta_0$, $\alpha_0$, $\theta_1$ and $\alpha_1$ can then be
expressed as

\[ \theta_{0\,\max} = \alpha_{0\,\max} = \theta_{1\,\max} = \alpha_{1\,\max} = \arctan(K_{Pi\,\max}/K_{Pi}) \quad \text{and} \tag{61} \]

\[ \theta_{0\,\min} = \alpha_{0\,\min} = \theta_{1\,\min} = \alpha_{1\,\min} = \arctan(K_{Pi}/K_{Pi\,\max}). \tag{62} \]

If $K_{Pi\,\max}/K_{Pi} \ge 1.571$, the fuzzy controller allows independent
variations of $\theta_0$ and $\theta_1$ within the range [0°, 90°]. Otherwise, it has
the feasible stability region shown in Figure 9.

Fig. 9 Stability region for θ0 and α0 (axes θ0 and α0 in degrees, from 0 to 90°; the region is
bounded by θ0 min, θ0 max, α0 min and α0 max). It is the same for θ1 and α1
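
The limits (60)–(62) are simple to evaluate once the low-level gain and its stability limit are known. The sketch below does so for one proportional SAM term; the function name and the gain values are hypothetical.

```python
import math


def high_level_limits(K_P, K_P_max):
    """Return (V_min, V_max, angle_min_deg, angle_max_deg) from (60)-(62)."""
    if not 0.0 < K_P <= K_P_max:
        raise ValueError("expected 0 < K_P <= K_P_max")
    V_min = K_P / K_P_max                                   # eq. (60)
    V_max = K_P_max / K_P
    angle_max = math.degrees(math.atan(K_P_max / K_P))      # eq. (61)
    angle_min = math.degrees(math.atan(K_P / K_P_max))      # eq. (62)
    return V_min, V_max, angle_min, angle_max


# Hypothetical low-level gain and stability limit for one loop.
V_min, V_max, angle_min, angle_max = high_level_limits(K_P=0.5, K_P_max=1.5)
```

The two limiting angles are complementary, so the admissible interval is symmetric about 45°, which is the line θ0 = α0 in Figure 9.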

8 Control Simulation

The proposed FPID controller tuning techniques are applied to a multivariable process described by the Finite Element (FE)-based model of a 3 × 3 soil cell [43]. Two transfer functions are derived. The first is obtained directly from FE analysis of the soil model, and the second is obtained by doubling the time delays of the FE-based transfer function. This is done to test the robustness of the proposed controllers across different processes. In addition, equivalent delayed first-order models for all the higher-order subprocesses are obtained by analyzing the response using plant reaction curve methods [44]. These equivalent first-order models with dead time are then used to design the linear PID controllers. Since the models and the processes are mismatched, the simulations also exercise the controllers' robustness to model uncertainty. The linear PID tuning method is also simulated so that the FPID controller techniques can be compared against it. The following steps summarize the design of the FPID controller.

1. Equivalent first-order delayed models are derived for all higher-order subprocesses by analyzing the response using plant reaction curve methods.
2. The static decoupler is obtained for the first-order plus dead time model.
3. An equivalent first-order model for the overall compensated system (first-order model with static decoupler) is obtained using a truncated Taylor series approximation at low frequencies.
4. A measure of interaction is developed, and integral gains are calculated for each loop at particular values of the interaction indices.
5. The proportional and derivative gains of the linear PID controllers are calculated using the direct pole placement method and the Ziegler–Nichols tuning formula.
6. The nonlinear tuning parameters (volumes of the then-part fuzzy sets of the SAM) are designed so that the overall system has specific gain and phase margins.
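Step 1 can be sketched numerically. The chapter does not say which reaction-curve variant was used, so the example below assumes the common two-point method: read the times at which the unit step response reaches 28.3% and 63.2% of its final value, then set T = 1.5 (t63 − t28) and L = t63 − T. The subprocess is g11 of Equation (63), whose denominator 6.605s² + 5.14s + 1 is approximately (2.57s + 1)²; all function names are hypothetical.

```python
import math


def step_response(t, tau, delay):
    """Unit-gain step response of exp(-delay*s) / (tau*s + 1)^2."""
    x = (t - delay) / tau
    return 0.0 if x <= 0 else 1.0 - (1.0 + x) * math.exp(-x)


def time_to_reach(level, tau, delay):
    """Bisection for the time at which the monotone response reaches `level`."""
    lo, hi = delay, delay + 50.0 * tau
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if step_response(mid, tau, delay) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def two_point_fit(tau, delay):
    """Two-point first-order-plus-dead-time fit (assumed variant)."""
    t28 = time_to_reach(0.283, tau, delay)
    t63 = time_to_reach(0.632, tau, delay)
    time_constant = 1.5 * (t63 - t28)
    dead_time = t63 - time_constant
    return time_constant, dead_time


# g11 of Equation (63): double pole with tau = 2.57 and a delay of 0.6.
time_constant, dead_time = two_point_fit(tau=2.57, delay=0.6)
print(time_constant, dead_time)
```

Under these assumptions the fit lands close to, but not exactly on, the time constant 4.35 and dead time 1.85 reported for this element in Equation (64), so the authors may have used a different reading of the reaction curve.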

8.1 Example 1

The dynamics between heat input (W) and temperature output (°C) are described by the transfer function matrix

\[
\begin{bmatrix}
\dfrac{0.0288\,e^{-0.6s}}{6.605s^{2} + 5.14s + 1} & \dfrac{0.0119\,e^{-1.2s}}{97.02s^{2} + 19.7s + 1} & \dfrac{0.00028\,e^{-3.6s}}{23.52s^{2} + 9.7s + 1} \\[2ex]
\dfrac{0.0141\,e^{-1.2s}}{10.11s^{2} + 6.36s + 1} & \dfrac{0.0295\,e^{-0.6s}}{5.523s^{2} + 4.7s + 1} & \dfrac{0.0035\,e^{-1.8s}}{23.52s^{2} + 9.7s + 1} \\[2ex]
\dfrac{0.0015\,e^{-3.6s}}{17.56s^{2} + 8.38s + 1} & \dfrac{0.0143\,e^{-1.8s}}{6.605s^{2} + 5.14s + 1} & \dfrac{0.0282\,e^{-0.6s}}{7.29s^{2} + 5.4s + 1}
\end{bmatrix}. \tag{63}
\]

The equivalent first-order model from plant reaction curve is given by

\[
\begin{bmatrix}
\dfrac{0.0288\,e^{-1.85s}}{4.35s + 1} & \dfrac{0.0119\,e^{-6s}}{16.05s + 1} & \dfrac{0.00028\,e^{-5.95s}}{8.1s + 1} \\[2ex]
\dfrac{0.0141\,e^{-2.85s}}{6.6s + 1} & \dfrac{0.0295\,e^{-1.85s}}{3.9s + 1} & \dfrac{0.0035\,e^{-4.25s}}{7.95s + 1} \\[2ex]
\dfrac{0.0015\,e^{-5.65s}}{6.6s + 1} & \dfrac{0.0143\,e^{-3.05s}}{4.2s + 1} & \dfrac{0.0282\,e^{-1.9s}}{4.5s + 1}
\end{bmatrix}. \tag{64}
\]
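Step 2 of the design procedure calls for a static decoupler. A common choice, assumed here because this section does not give the formula, is the inverse of the steady-state gain matrix, D = G(0)⁻¹. The sketch below inverts the gain matrix of Equation (64) with the adjugate formula; the helper names are hypothetical.

```python
STEADY_STATE_GAINS = [
    [0.0288, 0.0119, 0.00028],
    [0.0141, 0.0295, 0.0035],
    [0.0015, 0.0143, 0.0282],
]


def inverse_3x3(m):
    """Invert a 3x3 matrix via the adjugate (transposed cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-15:
        raise ValueError("gain matrix is singular")
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adjugate]


def matmul(p, q):
    """Plain 3x3 matrix product."""
    return [[sum(p[r][k] * q[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


# Assumed static decoupler: inverse of the steady-state gains.
decoupler = inverse_3x3(STEADY_STATE_GAINS)
compensated_gain = matmul(STEADY_STATE_GAINS, decoupler)
```

With this choice the compensated plant has an identity steady-state gain, so the loops are decoupled at zero frequency only; interaction at higher frequencies remains and is what the interaction indices in step 4 quantify.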
The initial set points are 50 °C, 55 °C and 60 °C. Once steady conditions have been reached, at 70 minutes, the set points of all loops are changed to 100 °C. To measure load disturbance rejection capability, a step load disturbance is added to the third loop (y3) of the process. Figure 10 summarizes the response behavior. The system with the FPID type II controller shows less overshoot and better load disturbance rejection than the type I controller. Tables 1 and 2 summarize the performance indices of the experiments. The controller tuning parameters for each loop are shown in Table 3.
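The indices reported in the tables (rise time, overshoot, settling time) can be extracted from a sampled step response along the following lines. The chapter does not state its exact definitions, so this sketch assumes a 10% to 90% rise time and a ±2% settling band; the function name and the sample data are hypothetical.

```python
def step_metrics(times, outputs, y_start, y_final, band=0.02):
    """Return (rise time, overshoot in %, settling time) for a set-point step.

    Assumed definitions: 10%-90% rise time, settling into +/- `band` of the step.
    """
    span = y_final - y_start
    # Normalised response: 0 at the old set point, 1 at the new one.
    r = [(y - y_start) / span for y in outputs]
    t10 = next(t for t, v in zip(times, r) if v >= 0.1)
    t90 = next(t for t, v in zip(times, r) if v >= 0.9)
    overshoot = max(0.0, (max(r) - 1.0) * 100.0)
    outside = [k for k, v in enumerate(r) if abs(v - 1.0) > band]
    if not outside:
        settling = times[0]
    elif outside[-1] + 1 < len(times):
        settling = times[outside[-1] + 1]
    else:
        settling = None  # never settles within the record
    return t90 - t10, overshoot, settling


# Hypothetical samples (minutes, degrees C) for a 0 -> 100 set-point change.
times = [0, 1, 2, 3, 4, 5, 6]
outputs = [0, 50, 110, 104, 99, 100, 100]
rise, overshoot, settling = step_metrics(times, outputs, y_start=0, y_final=100)
print(rise, round(overshoot, 6), settling)  # 1 10.0 4
```

Different conventions (for example a 5% band, or rise time measured from 0%) change the numbers, so any comparison with the tables should use one definition throughout.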

8.2 Example 2

In this example the transfer function of Example 1 is modified so that the time delay in each subsystem is doubled. The dynamics of this process are therefore described by Equation (65).

Fig. 10 Example 1, Simulation of closed-loop system with PID and FPID controllers

Table 1 Performance indices of the proposed FPID methods and the PID method for set-point tracking in Example 1

Output   Rise time (minute)      Overshoot (%)           Settling time (minute)
         PID   FPID1   FPID2     PID   FPID1   FPID2     PID   FPID1   FPID2
y1       5     11      5         19    11      17        15    26      11
y2       5     11      5         25    6       16        14    20      9
y3       5     11      6         33    8       25        19    14      20

Table 2 Performance indices of the proposed FPID methods and the PID method for load disturbance in Example 1

Output   Overshoot (%)           Settling time (minute)
         PID   FPID1   FPID2     PID   FPID1   FPID2
y1       2.6   3.6     1.6       0     0       0
y2       5     5       1         0     0       0
y3       28    32      75        10    13      7

Table 3 Tuning parameters of Example 1 for PID and FPID controllers

Loop No   PID                    FPID1         FPID2
                                               P             I             D
          P      I      D        v1    v3      v1    v3      v1    v3      v1    v3
(1)       2.61   0.61   2.79     0.7   0.9     2.2   1.3     2.3   1.5     1.8   1.1
(2)       2.03   0.57   1.82     0.9   1.1     4.0   1.4     3.5   1.6     3.8   1.4
(3)       1.7    0.52   1.40     0.9   1.0     2.0   0.6     1.8   0.8     2.3   1.3

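\[
\begin{bmatrix}
\dfrac{0.0288\,e^{-1.2s}}{6.605s^{2} + 5.14s + 1} & \dfrac{0.0119\,e^{-2.4s}}{97.02s^{2} + 19.7s + 1} & \dfrac{0.00028\,e^{-7.2s}}{23.52s^{2} + 9.7s + 1} \\[2ex]
\dfrac{0.0141\,e^{-2.4s}}{10.11s^{2} + 6.36s + 1} & \dfrac{0.0295\,e^{-1.2s}}{5.523s^{2} + 4.7s + 1} & \dfrac{0.0035\,e^{-3.6s}}{23.52s^{2} + 9.7s + 1} \\[2ex]
\dfrac{0.0015\,e^{-7.2s}}{17.56s^{2} + 8.38s + 1} & \dfrac{0.0143\,e^{-3.6s}}{6.605s^{2} + 5.14s + 1} & \dfrac{0.0282\,e^{-1.2s}}{7.29s^{2} + 5.4s + 1}
\end{bmatrix}. \tag{65}
\]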
The equivalent first-order model obtained from plant reaction curve is given by

\[
\begin{bmatrix}
\dfrac{0.0288\,e^{-2.45s}}{4.35s + 1} & \dfrac{0.0119\,e^{-7.45s}}{16.05s + 1} & \dfrac{0.00028\,e^{-9.7s}}{8.1s + 1} \\[2ex]
\dfrac{0.0015\,e^{-3.5s}}{6.6s + 1} & \dfrac{0.0295\,e^{-2.4s}}{3.9s + 1} & \dfrac{0.0035\,e^{-6.05s}}{7.95s + 1} \\[2ex]
\dfrac{0.0015\,e^{-9.5s}}{6.6s + 1} & \dfrac{0.0143\,e^{-4.9s}}{4.2s + 1} & \dfrac{0.0282\,e^{-2.5s}}{4.5s + 1}
\end{bmatrix}. \tag{66}
\]

Table 4 Performance indices of the proposed FPID methods and the PID method for set-point tracking in Example 2

Output   Rise time (minute)      Overshoot (%)           Settling time (minute)
         PID   FPID1   FPID2     PID   FPID1   FPID2     PID   FPID1   FPID2
y1       10    17      15        37    10      7         30    30      35
y2       9     20      11        8     0       6         22    23      22
y3       5     15      6         20    0       6         6     22      7

Table 5 Performance indices of the proposed FPID methods and the PID method for load disturbance in Example 2

Output   Overshoot (%)           Settling time (minute)
         PID   FPID1   FPID2     PID   FPID1   FPID2
y1       34    38      17        14    16      23
y2       12    19      16        13    20      10
y3       1     2       3         0     0       0

Table 6 Tuning parameters of Example 2 for PID and FPID controllers

Loop No   PID                    FPID1         FPID2
                                               P              I             D
          P      I      D        v1    v3      v1    v3       v1    v3      v1    v3
(1)       0.56   0.25   0.32     1.1   0.8     5.5   0.35     4.2   0.5     3.5   0.2
(2)       1.45   0.23   2.29     0.8   1.0     1.1   0.80     1.3   0.9     1.1   1.8
(3)       2.34   0.39   3.54     0.9   1.2     1.5   0.50     1.8   0.2     0.9   0.6

The initial set points are 40 °C, 50 °C and 40 °C. Once steady conditions have been reached, the set points of all three outputs are changed to 100 °C at 80 minutes. To measure load disturbance rejection capability, a step load disturbance is applied to the first loop (y1) of the process. Figure 12 summarizes the output behavior in the experiments. The system with the FPID type II controller again shows less overshoot, although its response is slower than that of the linear PID system. However, all the systems show the same capability of load disturbance rejection. Tables 4 and 5 summarize the comparison of performance indices. Table 6 provides all the tuning parameters.

9 Performance Analysis

The proposed algorithm is developed by minimizing the loop interactions at low frequencies, which leads to a first-order model reduction. To justify the operation of the controller at other frequencies, a stability analysis has been performed. Nyquist arrays and Gershgorin bands have been constructed for both examples assuming a linear PID system (Figures 11 and 13). For the simulations, the second-order plant has been modeled using the plant reaction curve, so model/plant mismatch has already been taken into account. The results support the robustness of the proposed method. The gain and phase margins of the individual loops are shown in Tables 7 and 8 for the linear PID systems. The results reveal that the gain and phase margins for both examples are within the specified limits proposed in Ho et al. [41]. Therefore both examples conform to the DNA stability theorem. Overall, the FPID type II system is able to provide improved control compared with the linear PID and type I FPID systems.

Fig. 11 Example 1, Nyquist array and Gershgorin bands (panels q11 to q33) of the system with the linear PID controller
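A Gershgorin band is built by drawing, at each frequency, a circle around the diagonal element whose radius is the sum of the magnitudes of the off-diagonal elements in the same row. The sketch below evaluates this for the first-order model of Equation (64) on its own, without decoupler or controller, so it illustrates the construction only and does not reproduce Figures 11 and 13. The function names are hypothetical.

```python
import cmath

GAINS = [[0.0288, 0.0119, 0.00028], [0.0141, 0.0295, 0.0035], [0.0015, 0.0143, 0.0282]]
TIME_CONSTANTS = [[4.35, 16.05, 8.1], [6.6, 3.9, 7.95], [6.6, 4.2, 4.5]]
DELAYS = [[1.85, 6.0, 5.95], [2.85, 1.85, 4.25], [5.65, 3.05, 1.9]]


def fopdt(gain, time_constant, delay, omega):
    """Frequency response of gain * exp(-delay*s) / (time_constant*s + 1)."""
    s = 1j * omega
    return gain * cmath.exp(-delay * s) / (time_constant * s + 1)


def gershgorin_rows(omega):
    """Return (centre, radius) of the row Gershgorin circle for each loop."""
    rows = []
    for i in range(3):
        g = [fopdt(GAINS[i][j], TIME_CONSTANTS[i][j], DELAYS[i][j], omega)
             for j in range(3)]
        radius = sum(abs(g[j]) for j in range(3) if j != i)
        rows.append((g[i], radius))
    return rows


def is_row_dominant(omega):
    """True when every circle excludes the origin at this frequency."""
    return all(abs(centre) > radius for centre, radius in gershgorin_rows(omega))


print(is_row_dominant(0.0))  # True
```

Sweeping omega over a frequency grid and plotting the circles along each diagonal Nyquist locus gives the band; diagonal dominance over the sweep is the condition that direct Nyquist array arguments rely on.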
Fig. 12 Example 2, Simulation of closed-loop system with PID and FPID controllers

Table 7 Gain and phase margins of each loop of the system with linear PID controller, Example 1

Loop No   Gain margin   Phase margin
1         2.2           32°
2         2.6           31°
3         8.1           41°

Table 8 Gain and phase margins of each loop of the system with linear PID controller, Example 2

Loop No   Gain margin   Phase margin
1         3.2           24°
2         3.0           40°
3         8.0           56°

Fig. 13 Example 2, Nyquist array and Gershgorin bands (panels q11 to q33) of the system with the linear PID controller

10 Conclusions

The design and tuning of decoupled SAM-based FPID controllers for a general n × n MIMO process system has been presented. The design of an FPID is treated here as a two-level tuning problem, performed in two stages: low-level tuning followed by high-level tuning. For low-level tuning, equivalent linear gains (or ANG) are selected, whereas for high-level tuning the fuzzy parameters are adjusted to achieve improved performance. In the proposed fuzzy tuning, the linear PID becomes a special case of the FPID when the nonlinearity is adjusted to provide a linear output surface. The linear PID tuning parameters are calculated using a technique developed for MIMO PID tuning. Here the system is considered with a static decoupler, and a measure of interaction is developed through an interaction index for each loop. The PID parameters are calculated for each loop using the interaction index and pole placement methods. The nonlinear fuzzy inference is achieved using the SAM-based FLC. Two FPID configurations have been considered, and the general design technique has been formulated for each. The performance of the controllers has been compared for two 3 × 3 MIMO multiheating process systems.
This work makes several contributions. First, a generic FPID design and tuning technique has been formulated for a general n × n multivariable process system. Second, a novel linear and nonlinear tuning methodology has been formulated based on the two-level tuning method. Third, as opposed to the general Mamdani–Zadeh type configuration, SAM-based fuzzy inference is implemented to achieve better nonlinearity in the fuzzy output. Stability is supported by the stability analysis, and the results show improved performance of the proposed FPID system over the linear PID system.

Acknowledgment This work was undertaken as part of the PPSC (Pan-Atlantic Petroleum Systems Consortium) project funded by the Atlantic Innovation Fund. Financial assistance from the Natural Sciences and Engineering Research Council (NSERC) of Canada is gratefully acknowledged.

References

1. George K.I. Mann, Bao-Gang Hu and Raymond G. Gosine. Two-Level Tuning of Fuzzy PID
Controllers. IEEE Transactions on Systems, Man and Cybernetics, Part B, 31(2), pp. 263–269,
Apr 2001
2. Jiawen Dong and Coleman B. Brosilow. Design of Robust Multivariable PID controllers via
IMC. Proceedings of the American Control Conference, 5, pp. 3380–3384, June 4–6, 1997
3. S. Yamamoto and I. Hashimoto. Present status and future needs: The view from Japanese industry. Proceedings of the 4th International Conference on Chemical Process Control, in I. Arkun and I. Ray, Eds., New York: AIChE, 1991
4. J.G. Ziegler and N.B. Nichols. Optimum settings for automatic controllers. Trans. ASME, 64,
pp. 759–768, 1942
5. A. Niederlinski. A Heuristic Approach to the Design of Linear Multivariable Interacting Control Systems. Automatica, 7, pp. 691–701, 1971
6. William L. Luyben. Simple Method for Tuning SISO Controllers in Multivariable Systems.
Ind. Eng. Chem. Process Des. Dev, 25(3), pp. 654–660, July 1986
7. D. Chen and D.E. Seborg. Multiloop PI/PID controller design based on Gershgorin bands.
Proceedings of the American Control Conference, 5, pp. 4122–4127, June 25-27, 2001
8. M. Witcher and T.J. McAvoy. Interacting Control Systems: Steady State and Dynamic Mea-
surement of Interaction. ISA Transactions, 16(3), pp. 35–41, 1977
9. Karl Johan Astrom, Karl Henrik Johansson and Qing-Guo Wang. Design of decoupled PID
controllers for MIMO systems. Proceedings of the American Control Conference, 3, pp. 2015–
2020, June 2001
10. Jietae Lee and Thomas F. Edgar. Interaction measure for decentralized control of multivari-
able processes. Proceedings of the American Control Conference, Anchorage, AK, United
States, 1, pp. 454–458, May 2002
11. M.H. Moradi, M.R. Katebi and M.A. Johnson. The MIMO Predictive PID Controller Design.
Asian Journal of Control, 4(4), pp. 452–463, Dec 2002
12. D.E. Rivera, S.M. Morari and S. Skogestad. Internal Model Control 4. PID Controller Design.
Ind. Eng. Chem. Proc. Des. Dev., 25(1), pp. 252–265, Jan 1986
13. J. Lieslehto, J.T. Tanttu and H.N. Koivo. An Expert System for Multivariable Controller
Design. Automatica, 29(4), pp. 953–968, 1993
14. R. Sehab, M. Remy and C. Renotte. An approach to design fuzzy PI supervisor for a nonlinear
system. IFSA World Congress and 20th NAFIPS International Conference, 2001. Joint 9th, 2,
pp. 894–899, July 25–28, 2001
15. A. Selk Ghafari and A. Alasty. Design and real-time experimental implementation of gain
scheduling PID fuzzy controller for hybrid stepper motor in micro-step operation. Proceedings
of the IEEE International Conference on Mechatronics ICM ’04., pp. 421–426, June 2004
16. E.H. Mamdani. Application of fuzzy algorithms for control of simple dynamic plant. Proceedings of the Institution of Electrical Engineers, 121(12), pp. 1585–1588, 1974
17. F.L. Lewis and Kai Liu. Towards a paradigm for fuzzy logic control. Automatica, 32(2),
pp. 167–181, Feb 1996
18. A. Rahmati, F. Rashidi and M. Rashidi. A hybrid fuzzy logic and PID controller for control of nonlinear HVAC systems. IEEE Transactions on Systems, Man and Cybernetics, 3, pp. 2249–2254, Oct 2003
19. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Academic Press,
1980
20. M. Sugeno. Industrial Application of Fuzzy Control. North-Holland, Amsterdam, The
Netherlands. 1985
21. Han Xiong Li and Shaocheng Tong. A hybrid adaptive fuzzy control for a class of nonlinear
MIMO systems. Fuzzy Systems, IEEE Transactions on, 11(1), pp. 24–34, Feb 2003
22. Timothy J. Ross. Fuzzy Logic with Engineering Applications. Wiley, Chichester, UK, 2nd
edition, 2004
23. G.I. Eduardo and M.R. Hiram. Fuzzy multivariable control of a class of a biotechnology
process. Proceedings of the IEEE International Symposium on Industrial Electronics, 1,
pp. 419–424, July 1999
24. Chieh-Li Chen and Pey-Chung Chen. Application of fuzzy logic controllers in single-loop
tuning of multivariable system design. Computers in Industry, 17(1), pp. 33–41, 1991
25. B. Wayne Bequette. Process Control Modeling, Design and Simulation. Prentice-Hall of India,
2003
26. Shaoyuan Li, Hongbo Liu, Wen-Jian Cai, Yeng-Chai Soh and Li-Hua Xie. A new coordinated
control strategy for boiler-turbine system of coal-fired power plant. IEEE Transactions on
Control Systems Technology, 13(6), pp. 943–954, Nov 2005
27. Hassan B. Kazemian. The SOF-PID controller for the control of a MIMO robot arm. IEEE
Transactions on Fuzzy Systems, 10(4), pp. 523–532, Aug 2002
28. Bart Kosko. Fuzzy Engineering. Prentice-Hall, Simon & Schuster/A Viacom Company Upper
Saddle River, New Jersey, 1997
29. H.A. Malki and D. Misir. Determination of the control gains of a fuzzy PID controller using
neural networks. Fuzzy Systems, Proceedings of the Fifth IEEE International Conference on,
2, pp. 1303–1307, Sept 1996
30. Yu Yongquan, Huang Ying, Wang Minghui, Zeng Bi and Zhong Guokun. Fuzzy neural PID controller and tuning its weight factors using genetic algorithm based on different location crossover. Systems, Man and Cybernetics, 2004 IEEE International Conference on, 4, pp. 3709–3713, Oct 10–13 2004
31. J.-X. Xu, C. Liu and C.C. Hang. Tuning of fuzzy PI controllers based on gain/phase margin
specifications and ITAE index. ISA Transactions, 35(1), pp. 59–91, May 1996
32. J.-X. Xu, C. Liu and C.C. Hang. Designing a stable fuzzy PI control system using extended
circle criterion. Int. J. of Intelligent Control and Systems, 1, pp. 355–366, 1996
33. S. Hayashi. Auto-tuning fuzzy PI Controller. Proceedings of the Intn’l Fuzzy Systems Associ-
ation Conference, pp. 41–44, 1991
34. H.-X. Li and H.B. Gatland. A new methodology for designing a fuzzy logic controller. IEEE
Transactions on Systems, Man and Cybernetics, 25(3), pp. 505–512, Mar 1995
35. K.J. Astrom and T. Hagglund. PID Controllers: Theory, Design and Tuning. Instrument Soci-
ety of America, Research Triangle Park, 2nd edition, NC, 1995
36. C.W. Reynolds. Flocks, Herds, and Schools: A Distributed Behavioral Model. Computer
Graphics, 21(4), pp. 25–45, July 1987
37. J.P. Martino. Technological Forecasting for Decisionmaking. Elsevier, 8(1), 1972
38. Baogang Hu, George K.I. Mann and Raymond G. Gosine. New methodology for analytical
and optimal design of fuzzy PID controllers. IEEE Transactions on Fuzzy Systems, 7(5), pp.
521–539, Oct 1999
39. H.H. Rosenbrock. State-Space and Multivariable Theory. Nelson, London, 1970
40. J.M. Maciejowski. Multivariable Feedback Design. Addison-Wesley, 1989
41. Weng K. Ho, Tong H. Lee and Oon P. Gan. Tuning of Multiloop Proportional-Integral-Derivative Controllers Based on Gain and Phase Margin Specifications. Ind. Eng. Chem. Res., 36, pp. 2231–2238, 1997
42. Weng Khuen Ho, Tong Heng Lee, Wen Xu, Jinrong R. Zhou and Ee Beng Tay. Direct Nyquist array design of PID controllers. IEEE Transactions on Industrial Electronics, 47(1), pp. 175–185, Feb 2000
43. P.K. Roy, G. Mann and B.C. Hawlader. Fuzzy rule-adaptive model predictive control for a multi-variable heating system. IEEE Conference on Control Applications, pp. 260–265, Aug 2005
44. E.F. Camacho and C. Bordons. Model Predictive Control. Springer-Verlag, London, 1999
Evaluation of Fuzzy Implications
and Intuitive Criteria of GMP and GMT
using MATLAB GUI

Sudesh K. Kashyap, J.R. Raol, and Ambalal V. Patel

Abstract Events or conditions with inherent uncertainties can be efficiently modeled using the fuzzy logic (FL) approach. The approximate reasoning feature of FL makes it a very powerful tool for developing a variety of applications that require logical reasoning or inferencing. The performance of each such application depends upon the various ingredients of FL, such as membership functions, a rule base consisting of different IF-THEN rules, implication methods for rule interpretation, aggregation methods, and defuzzification methods. However, any new or existing implication method must satisfy the intuitive criteria of Generalized Modus Ponens (GMP) and Generalized Modus Tollens (GMT) to fit into FL. In this chapter we present a systematic approach to studying existing implication methods against a given set of intuitive criteria of GMP and GMT. To do so, we use MATLAB and related graphics tools to develop a user-interactive package that evaluates the implication methods with respect to these criteria. The results are provided in terms of tables and figures.

Keywords: fuzzy logic, implication, intuitive criteria, GMP, GMT

1 Introduction

FL is a multivalued logic used to model events or conditions that are not precisely defined or known. The inherent approximate reasoning capabilities of FL make it an ideal tool for developing applications that require logical reasoning about imprecisely defined events. To use FL to its full potential, we need to look into the ingredients of FL and use them in a manner appropriate to the nature of the application. The FL ingredients are: (i) Membership function (fuzzification): converts the input/output crisp values to corresponding membership grades indicating their degree of membership in the respective fuzzy sets; (ii) Rule base: consists of IF-THEN rules provided by an expert in the relevant field; (iii) Fuzzy implications: used to map the fuzzified inputs to an appropriate fuzzified output; (iv) Aggregation: used to combine the output fuzzy sets (a single output fuzzy set for every rule fired) into a single fuzzy set; and (v) Defuzzification: converts an aggregated output fuzzy set from its fuzzified values to equivalent crisp values.

Sudesh K. Kashyap and J.R. Raol
Scientists, Flight Mechanics and Control Division, National Aerospace Laboratories, Bangalore - 560 017, India, e-mail: [email protected], [email protected]
Ambalal V. Patel
Scientist, IFCS Directorate, Aeronautical Development Agency, P.B. 1718, Vimanapura Post, Bangalore - 560 017, India, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization, Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 313–385. © 2008 Springer.

The core part of any FL-based system is a fuzzy inference engine where the
rules are processed using fuzzy implication methods to get the output fuzzy sets.
It will not be wrong to say that the implication method plays a critical role to get
the desired response from the system. Hence, it becomes necessary to select an
appropriate implication method from the existing methods. However, if any new
implication method is found then it should satisfy some of the intuitive criteria of
Generalized Modus Ponens and Generalized Modus Tollens, so that it can be fitted
into the process of system development using FL.
In this chapter, we describe a procedure/methodology that helps to find out if
any of the existing implication methods match with a given set of intuitive criteria
of GMP and GMT. In order to realize the scheme, we have used MATLAB and
graphics to develop a user interactive package to evaluate the implication methods
with respect to these criteria.
The chapter is organized as follows: Section 2 covers the intuitive criteria of GMP and GMT. In Section 3, we derive and explain the various fuzzy implication methods. Section 4 provides the steps required to interpret the implication methods with respect to the intuitive criteria of GMP and GMT. Our approach of using MATLAB and graphics to evaluate the implication methods against the intuitive criteria is presented in Section 5.

2 Intuitive Criteria of GMP and GMT

The two important fuzzy rules used in FL for approximate reasoning or inferencing are GMP and GMT [2]. The basic definitions of these intuitive rules are as follows:
Generalized Modus Ponens: GMP is known as the direct reasoning or forward-driven inferencing rule. It is defined by the following inference procedure:

Premise 1 : u is A′
Premise 2 : IF u is A THEN v is B
Consequence : v is B′

where A and A′ are input fuzzy sets, B and B′ are output fuzzy sets, and u and v are the linguistic variables corresponding to the input and output fuzzy sets, respectively. The fuzzy set A′ of premise 1 can take the values A, very A, more or less A, and not A. The linguistic values very and more or less are known as hedges and can be defined in terms of their membership grades as μ(·)² and μ(·)^(1/2), respectively. Here, (·) denotes the fuzzy set A or B. Figure 1 shows the profiles of these hedges. The various intuitive criteria of GMP, relating premise 1 and the consequence for any given premise 2, are listed in Table 1 [2]. The table contains seven criteria in total, each of which can be related to everyday reasoning. If the fundamental relation between “u is A” and “v is B” is not strong in premise 2, then satisfaction of criteria C2-2 and C3-2 is allowed.
Generalized Modus Tollens: GMT, known as the indirect or backward goal-driven inferencing rule, is defined by the following inference procedure:

Premise 1 : v is B′
Premise 2 : IF u is A THEN v is B
Consequence : u is A′

Fig. 1 Linguistic variables (hedges) “very” and “more or less”

Table 1 Intuitive criteria of GMP: a direct reasoning or forward goal-driven inference rule

GMP criterion   u is A′ (premise 1)      v is B′ (consequence)
C1              u is A                   v is B
C2-1            u is very A              v is very B
C2-2            u is very A              v is B
C3-1            u is more or less A      v is more or less B
C3-2            u is more or less A      v is B
C4-1            u is not A               v is unknown
C4-2            u is not A               v is not B
C4-2 u is not A v is not B
Fig. 2 Linguistic variables (hedges) “not very” and “not more or less”

Table 2 Intuitive criteria of GMT: an indirect reasoning or backward goal-driven inference rule

GMT criterion   v is B′ (premise 1)          u is A′ (consequence)
C5              v is not B                   u is not A
C6              v is not very B              u is not very A
C7              v is not more or less B      u is not more or less A
C8-1            v is B                       u is unknown
C8-2            v is B                       u is A

The fuzzy set B′ of premise 1 can take the values not B, not very B, not more or less B, and B. The linguistic values not very and not more or less are known as hedges and can be defined in terms of their membership grades as 1 − μ(·)² and 1 − μ(·)^(1/2), respectively. Figure 2 shows the profiles of these hedges. The various intuitive criteria of GMT, relating premise 1 and its consequence for any given premise 2, are listed in Table 2.
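The four hedges defined in this section act pointwise on a membership grade, so they are easy to sketch in code. The function names below are hypothetical; the formulas are the ones given in the text.

```python
import math


def very(mu):
    """Hedge 'very': membership grade squared."""
    return mu ** 2


def more_or_less(mu):
    """Hedge 'more or less': square root of the membership grade."""
    return math.sqrt(mu)


def not_very(mu):
    """Hedge 'not very': complement of 'very'."""
    return 1.0 - mu ** 2


def not_more_or_less(mu):
    """Hedge 'not more or less': complement of 'more or less'."""
    return 1.0 - math.sqrt(mu)


grade = 0.25
print(very(grade), more_or_less(grade), not_very(grade), not_more_or_less(grade))
# 0.0625 0.5 0.9375 0.5
```

For any grade in [0, 1], very(μ) ≤ μ ≤ more_or_less(μ), which is why "very" narrows a fuzzy set and "more or less" widens it.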

3 Fuzzy Implication Methods

Development of any FL-based system requires an appropriate selection of fuzzy implication methods so that fuzzified inputs can be mapped to desirable output fuzzy sets. In general, fuzzy implication is used to interpret the IF–THEN rules provided by a domain expert. Throughout this report, we consider the following fuzzy rule consisting of a single input fuzzy set and a single output fuzzy set:

IF u is A, THEN v is B (1)

The rule has two parts, known as the antecedent or premise for “IF u is A” and the consequent for “THEN v is B”. Here, the crisp variable u, fuzzified by set A in a universe of discourse U, is an input to the inference engine, whereas the crisp variable v, represented by the set B in a universe of discourse V, is an output from the inference engine. The formula used to compute the fuzzified output is given by

\[ B = R \circ A \tag{2} \]

where ∘ is the compositional operator, represented using sup-star with “sup” as supremum and “star” as a T-norm operator [1], and R is a fuzzy relation in the 2D product space U × V. Equation (2) in terms of membership functions is given by

\[ \mu_B(v) = \mu_R(u, v) \circ \mu_A(u) \tag{3} \]

In Equation (3), μR(u, v) can be replaced by μA→B(u, v), because a fuzzy implication is also a kind of relation that provides a mapping between input and output. Hence, Equation (3) can be rewritten as

\[ \mu_B(v) = \mu_{A \to B}(u, v) \circ \mu_A(u) \tag{4} \]

The following seven standard interpretations of the fuzzy IF-THEN rule exist, based on intuitive criteria or classical logic, to define the fuzzy implication.
• Fuzzy conjunction (FC):

\[ \mu_{A \to B}(u, v) = \mu_A(u) * \mu_B(v) \tag{5} \]

where ∗ represents a T-norm operator;
• Fuzzy disjunction (FD):

\[ \mu_{A \to B}(u, v) = \mu_A(u) \,\dot{+}\, \mu_B(v) \tag{6} \]

where +̇ represents an S-norm/T-conorm operator;
• Material implication (MI):

\[ \mu_{A \to B}(u, v) = \mu_{\bar{A}}(u) \,\dot{+}\, \mu_B(v) \tag{7} \]

where μĀ(u) is the fuzzy complement of μA(u);
• Propositional calculus (PC):

\[ \mu_{A \to B}(u, v) = \mu_{\bar{A}}(u) \,\dot{+}\, \mu_A(u) * \mu_B(v) \tag{8} \]

• Extended propositional calculus (EPC):

\[ \mu_{A \to B}(u, v) = \mu_{\bar{A}}(u) \times \mu_{\bar{B}}(v) \,\dot{+}\, \mu_B(v) \tag{9} \]

• Generalization of modus ponens (GMP):

\[ \mu_{A \to B}(u, v) = \sup \{ c \in [0, 1] : \mu_A(u) * c \le \mu_B(v) \} \tag{10} \]

• Generalization of modus tollens (GMT):

\[ \mu_{A \to B}(u, v) = \inf \{ c \in [0, 1] : \mu_B(v) \,\dot{+}\, c \le \mu_A(u) \} \tag{11} \]
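Equation (10) defines the implication as a supremum over c, which can be approximated by a brute-force search on a grid. The sketch below does this for a pluggable T-norm and uses the algebraic product as the example; the function names and grid size are assumptions made for illustration.

```python
def product_t_norm(a, c):
    """Algebraic product T-norm."""
    return a * c


def gmp_implication(mu_a, mu_b, t_norm, steps=1000):
    """Approximate sup{c in [0, 1] : t_norm(mu_a, c) <= mu_b} on a uniform grid."""
    feasible = [k / steps for k in range(steps + 1)
                if t_norm(mu_a, k / steps) <= mu_b]
    # c = 0 is always feasible because t_norm(a, 0) = 0 for any T-norm.
    return max(feasible)


high = gmp_implication(0.3, 0.6, product_t_norm)  # mu_A <= mu_B
low = gmp_implication(0.8, 0.4, product_t_norm)   # mu_A > mu_B
print(high, low)
```

For the product T-norm the search returns 1 when μA ≤ μB and approximately μB/μA otherwise, up to the grid resolution of 1/steps.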

The above family of fuzzy implications uses T-norm and S-norm operators, so different combinations of these norms give numerous ways to interpret the fuzzy IF-THEN rules. In other words, we can derive different fuzzy implications by using these operators. However, not all of these fuzzy implications or interpretations completely fit the intuitive criteria [2] of Generalized Modus Ponens and Generalized Modus Tollens. The most commonly used fuzzy implications that fit the intuitive criteria are the following:
• Mini-operation rule of fuzzy implication (MORFI) – Mamdani: The equation of MORFI is derived by applying the “standard intersection” operator of T-norms in Equation (5):

\[ R_{MORFI} = \mu_{A \to B}(u, v) = \min(\mu_A(u), \mu_B(v)) \tag{12} \]

• Product-operation rule of fuzzy implication (PORFI) – Larsen: The equation of PORFI is derived by applying the “algebraic product” operator of T-norms in Equation (5):

\[ R_{PORFI} = \mu_{A \to B}(u, v) = \mu_A(u)\, \mu_B(v) \tag{13} \]

• Arithmetic rule of fuzzy implication (ARFI) – Zadeh/Lukasiewicz: The equation of ARFI is derived by applying the “bounded sum” operator of S-norms and the “complement” operator in Equation (7):

\[
\begin{aligned}
R_{ARFI} = \mu_{A \to B}(u, v) &= \mu_{\bar{A}}(u) \,\dot{+}\, \mu_B(v) \\
&= \min(1, \mu_{\bar{A}}(u) + \mu_B(v)) \\
&= \min(1, 1 - \mu_A(u) + \mu_B(v))
\end{aligned}
\tag{14}
\]

• Max-min rule of fuzzy implication (MRFI) – Zadeh: The equation of MRFI is derived by applying the “standard intersection” operator of T-norms, the “standard union” operator of S-norms and the fuzzy “complement” operator in Equation (8):

\[
\begin{aligned}
R_{MRFI} = \mu_{A \to B}(u, v) &= \mu_{\bar{A}}(u) \,\dot{+}\, \mu_A(u) * \mu_B(v) \\
&= \max(\mu_{\bar{A}}(u), \mu_A(u) * \mu_B(v)) \\
&= \max(1 - \mu_A(u), \mu_A(u) * \mu_B(v)) \\
&= \max(1 - \mu_A(u), \min(\mu_A(u), \mu_B(v)))
\end{aligned}
\tag{15}
\]

• Standard sequence of fuzzy implication (SSFI): The equation of SSFI is derived by applying the “bounded difference or product” operator of T-norms in Equation (10):

\[
\begin{aligned}
R_{SSFI} = \mu_{A \to B}(u, v) &= \sup \{ c \in [0, 1] : \mu_A(u) * c \le \mu_B(v) \} \\
&= \sup \{ c \in [0, 1] : \max(0, \mu_A(u) + c - 1) \le \mu_B(v) \} \\
&= \begin{cases} 1 & \text{if } \mu_A(u) \le \mu_B(v) \\ 0 & \text{if } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{16}
\]

• Boolean rule of fuzzy implication (BRFI): The equation of BRFI is derived by applying the “standard union” operator of S-norms and the fuzzy “complement” operator in Equation (7):

\[
\begin{aligned}
R_{BRFI} = \mu_{A \to B}(u, v) &= \mu_{\bar{A}}(u) \,\dot{+}\, \mu_B(v) \\
&= \max(\mu_{\bar{A}}(u), \mu_B(v)) \\
&= \max(1 - \mu_A(u), \mu_B(v))
\end{aligned}
\tag{17}
\]

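• Goguen’s rule of fuzzy implication (GRFI): The equation of GRFI is derived by applying the “algebraic product” operator of T-norms in Equation (10):

\[
\begin{aligned}
R_{GRFI} = \mu_{A \to B}(u, v) &= \sup \{ c \in [0, 1] : \mu_A(u) * c \le \mu_B(v) \} \\
&= \sup \{ c \in [0, 1] : \mu_A(u)\, c \le \mu_B(v) \} \\
&= \begin{cases} 1 & \text{if } \mu_A(u) \le \mu_B(v) \\ \dfrac{\mu_B(v)}{\mu_A(u)} & \text{if } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{18}
\]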
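The closed forms in Equations (12)–(18) translate directly into small functions of the two membership grades a = μA(u) and b = μB(v). This is a minimal sketch with hypothetical function names that follow the acronyms used above; it is not taken from the authors' MATLAB package.

```python
def morfi(a, b):
    """Mamdani min rule, Equation (12)."""
    return min(a, b)


def porfi(a, b):
    """Larsen product rule, Equation (13)."""
    return a * b


def arfi(a, b):
    """Arithmetic rule, Equation (14)."""
    return min(1.0, 1.0 - a + b)


def mrfi(a, b):
    """Max-min rule, Equation (15)."""
    return max(1.0 - a, min(a, b))


def ssfi(a, b):
    """Standard sequence rule, final form of Equation (16)."""
    return 1.0 if a <= b else 0.0


def brfi(a, b):
    """Boolean rule, Equation (17)."""
    return max(1.0 - a, b)


def grfi(a, b):
    """Goguen's rule, Equation (18); a > b >= 0 guarantees a > 0 in the division."""
    return 1.0 if a <= b else b / a


rules = [morfi, porfi, arfi, mrfi, ssfi, brfi, grfi]
values = {rule.__name__: rule(0.8, 0.4) for rule in rules}
print(values)
```

Evaluating each rule over a grid of (a, b) pairs gives the kind of 2D surface that can then be compared across implication methods.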

4 Properties of Interpretations of Fuzzy IF-THEN Rules

In this report, a procedure is evolved for investigating the consequences when fuzzy
implication methods mentioned by Equations (12)–(18) (except Equation (16)) are
applied in the fuzzy inference process, and then visualized whether these conse-
quences match with any of the intuitive criteria (see Tables) of GMP and GMT,
the two ideal inference rules in our day-to-day reasoning or thought processes. In
order to do so, we have used MATLAB and GUI to speed up the process of inves-
tigation by seeing the output through plots or numerical results. In this section, we
basically establish the formulas which will be the backbone for evaluating the fuzzy
implication methods against the intuitive criteria of GMP and GMT.
The following formula is required to compute the consequences (so that they can
be compared with consequences of GMP) when the fuzzy implication methods are
applied in the fuzzy inference process:

B = R ◦ A (19)

where ◦ is known as the compositional operator represented using sup-star with


“sup” as supremum and “star” as T-norm operator. In the present case, we use the
“standard union” or “max” operator for “sup” and the “standard intersection” or
320 S.K. Kashyap et al.

min operator for “star”. Thus, Equation (19), in terms of its membership functions, is given by

$$\mu_{B'}(v) = \sup_{u \in U}\{\mu_{A\to B}(u,v) * \mu_{A'}(u)\}$$

or

$$y = \mu_{B'}(v) = \sup_{u \in U}\{\min[\mu_{A\to B}(u,v),\ \mu_{A'}(u)]\} \tag{20}$$

where µA→B(u, v) is a fuzzy implication method from Equations (12)–(18) and x = µA′(u) is premise 1 of GMP (see Table 1), which takes any one of the following forms: µA′(u) = µA(u), µA′(u) = µA²(u), µA′(u) = √µA(u), or µA′(u) = 1 − µA(u). It is assumed that the fuzzy sets A and B are normalized, i.e. their membership grades fall between 0 and 1.
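Equation (20) can be approximated numerically by replacing the supremum with a maximum over a fine grid. The sketch below is illustrative only: it assumes that µA(u) sweeps the whole interval [0, 1] as u ranges over U, it uses the Boolean rule of Equation (17) as the example implication, and all names (`PREMISES`, `gmp_consequence`) are hypothetical.

```python
import math

# Premise 1 of GMP, written as a function of a = mu_A(u).
PREMISES = {
    "C1": lambda a: a,             # A' = A
    "C2": lambda a: a * a,         # A' = A^2
    "C3": lambda a: math.sqrt(a),  # A' = sqrt(A)
    "C4": lambda a: 1.0 - a,       # A' = 1 - A
}


def boolean_rule(a, b):
    """Equation (17): max(1 - a, b)."""
    return max(1.0 - a, b)


def gmp_consequence(implication, premise, b, n=1000):
    """Grid approximation of Equation (20).

    Returns max over a in {0, 1/n, ..., 1} of min(implication(a, b), premise(a)),
    where b = mu_B(v) is held fixed.
    """
    return max(
        min(implication(a, b), premise(a))
        for a in (i / n for i in range(n + 1))
    )


if __name__ == "__main__":
    for name, premise in PREMISES.items():
        print(name, round(gmp_consequence(boolean_rule, premise, 0.35), 4))
```

Any other implication method with the same two-argument signature can be passed in place of `boolean_rule`; the grid resolution `n` controls how closely the maximum approaches the supremum.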
Similarly in GMT, the following formula is required to compute the conse-
quences (so that they can be compared with consequences of GMT) when the fuzzy
implication methods are applied in the fuzzy inference process:

$$A' = R \circ B' \tag{21}$$

Equation (21) in terms of membership function is given by

$$\mu_{A'}(u) = \sup_{v \in V}\{\min[\mu_{A\to B}(u,v),\ \mu_{B'}(v)]\} \tag{22}$$

where µB′(v) is premise 1 of GMT (see Table 2), which takes any one of the following forms: µB′(v) = 1 − µB(v), µB′(v) = 1 − µB²(v), µB′(v) = 1 − √µB(v), or µB′(v) = µB(v).
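The GMT composition of Equation (22) has the same shape as Equation (20), with the supremum now taken over v for a fixed u. The sketch below is again only illustrative: it assumes µB(v) sweeps [0, 1], reads the third premise as 1 − √µB(v), and reuses the Boolean rule of Equation (17) as the example implication. The names are hypothetical.

```python
import math

# Premise 1 of GMT, written as a function of b = mu_B(v).
GMT_PREMISES = {
    "not B": lambda b: 1.0 - b,
    "not B^2": lambda b: 1.0 - b * b,
    "not sqrt(B)": lambda b: 1.0 - math.sqrt(b),  # assumed reading
    "B": lambda b: b,
}


def boolean_rule(a, b):
    """Equation (17): max(1 - a, b)."""
    return max(1.0 - a, b)


def gmt_consequence(implication, premise, a, n=1000):
    """Grid approximation of Equation (22).

    Returns max over b in {0, 1/n, ..., 1} of min(implication(a, b), premise(b)),
    where a = mu_A(u) is held fixed.
    """
    return max(
        min(implication(a, b), premise(b))
        for b in (j / n for j in range(n + 1))
    )


if __name__ == "__main__":
    for name, premise in GMT_PREMISES.items():
        print(name, round(gmt_consequence(boolean_rule, premise, 0.35), 4))
```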

5 Study of Satisfaction of Criteria using MATLAB/Graphics

In order to compare the consequences of the implication methods against those of GMP and GMT, we have developed a MATLAB/Graphics-based tool that helps to visualize the results through plots as well as numerically. Figure 3 illustrates the front menu that lets the user select the implication method to be investigated

Fig. 3 Panel for selection of fuzzy implication methods

Fig. 4 Panel for selection of premise 1 of GMP criteria

Fig. 5 Panel for selection of premise 1 of GMT criteria

and Figures 4 and 5 show the panels for selecting premise 1 of the GMP and GMT criteria to be applied to the selected implication method.
The following steps are used to check the satisfaction of the criteria using MATLAB/Graphics:
Step 1: Generation of 2D plots of selected implication method
Consider a fuzzy input set A and output set B with following membership grades:

$$\mu_A(u) = [0,\ 0.05,\ 0.1,\ 0.15,\ \ldots,\ 1] \tag{23}$$

$$\mu_B(v) = [0,\ 0.05,\ 0.1,\ 0.15,\ \ldots,\ 1] \tag{24}$$

2D plots are generated by taking one value of Equation (24) at a time, together with the entire set of values µA(u) of Equation (23), and applying them to the selected implication method of Equations (12)–(18). In these plots the X-axis is µA(u) and the Y-axis is µA→B(u, v) for each value of µB(v). Figures 6–11 show the 2D plots of the various implication methods. In these figures, the symbol coding indicates the values of the fuzzy implication methods computed by varying µB between 0 and 1 with a fixed interval of 0.05. Infinitely many such curves are possible if the interval of µB is reduced to a very small value. However, as a proof of concept, the values shown in the aforementioned figures are sufficient and easy to visualize.
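Step 1 amounts to tabulating one curve per value of µB(v). The Python sketch below builds that family of curves on the 0.05 grid of Equations (23) and (24); it uses min(µA(u), µB(v)) as the example implication, which is the form MORFI takes in Equation (25), and the names `GRID` and `curve_family` are hypothetical. Plotting is left out.

```python
STEPS = 20  # 1 / 0.05 intervals, as in Equations (23) and (24)
GRID = [i / STEPS for i in range(STEPS + 1)]  # 0, 0.05, ..., 1


def morfi(a, b):
    """min(a, b), the form of MORFI used in Equation (25)."""
    return min(a, b)


def curve_family(implication, grid=GRID):
    """Map each mu_B value to the list of implication values over all mu_A."""
    return {b: [implication(a, b) for a in grid] for b in grid}


if __name__ == "__main__":
    curves = curve_family(morfi)
    b = GRID[7]  # the grid point for mu_B = 0.35
    print(b, curves[b])
```

Each dictionary entry corresponds to one curve of the 2D plot: the list holds the Y-values, and `GRID` holds the matching X-values.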

Fig. 6 2D plots of MORFI

Fig. 7 2D plots of PORFI

Fig. 8 2D plots of ARFI



Fig. 9 2D plots of MRFI

Fig. 10 2D plots of BRFI

Fig. 11 2D plots of GRFI



In Steps 2 and 3, the GMP and GMT criteria are applied to the implication methods, and the consequences are examined visually as well as analytically.
Step 2: Premise 1 of each GMP criterion, C1 to C4-2, is applied in turn to the implication methods. Let us first take the implication method “MORFI” for investigation under the heading “MORFI: C#”, where # is a criterion index such as 1, 2-1, 2-2, 3-1, 3-2, 4-1, and 4-2.
MORFI: C1: µA′(u) = µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). We first interpret the “min” operation of Equation (20) by superimposing the 2D view of the implication method µA→B(u, v) (Figure 6) and premise 1, µA′(u) (Table 1). Figures 12 and 13 (the latter for only one value of µB(v)) show this superimposition. It is observed that µA′(u) is always larger than or equal to µA→B(u, v) for any value of µA(u), which means that the outcome of the “min” operation is µA→B(u, v) itself, i.e. Figure 6. It is also observed from Figures 12 and 13 that µA→B(u, v) = min(µA(u), µB(v)) reaches µB(v) (which is also its maximal value) for µA(u) ≥ µB(v), and hence the supremum of µA→B(u, v) is µB(v), i.e. µB′(v) = µB(v). Therefore, it is concluded that MORFI satisfies the intuitive

Fig. 12 Superimposed plots of MORFI and premise 1 of C1

Fig. 13 plots of MORFI and premise 1 of C1 for µB = 0.35



criterion C1 of GMP (refer Table 1). We also prove this by an analytical method as
given below:
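$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min\{\mu_A(u), \mu_B(v)\},\ \mu_A(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min\{\mu_A(u), \mu_A(u)\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v), \mu_A(u)\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \mu_B(v); & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{25}
$$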

It is observed from the above equations that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1 and y2. The outcome starts with y1, which increases to a maximum value of µB(v) as µA(u) increases from zero to µB(v). The y2 starts from the maximum value of y1 and remains constant at that value in spite of any further increase in µA(u). Hence, the supremum is y2 only, i.e. µB′(v) = µB(v).
MORFI: C2-1/C2-2: µA′(u) = µA²(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). Figures 14 and 15 illustrate the superimposed plots

Fig. 14 Superimposed plots of MORFI and premise 1 of C2-1/C2-2

Fig. 15 plots of MORFI and premise 1 of C2-1/C2-2 for µB = 0.35



of µA→B(u, v) and µA′(u). It is observed that the area below the intersection point of µA→B(u, v) and µA′(u) corresponds to the “min” operation of Equation (20). It is also noticed that the supremum of the resultant area is given by those intersection points, which have values equal to µB(v). Therefore, it is concluded that MORFI satisfies the intuitive criterion C2-2 (not C2-1) of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min\{\mu_A(u), \mu_B(v)\},\ \mu_A^2(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min\{\mu_A(u), \mu_A^2(u)\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v), \mu_A^2(u)\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A^2(u);\ \text{since } \mu_A^2(u) \le \mu_A(u) & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\ y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)} \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{26}
$$

It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1, y21, and y22. Since √µB(v) ≥ µB(v), y1 and y21 can be treated as one, having the value µA²(u) for µA(u) ≤ √µB(v). The outcome starts with y1/y21, which increases to a maximum value of µB(v) as µA(u) increases from zero to √µB(v). The y22 starts from the maximum value of y1/y21 and remains constant at that value in spite of any further increase in µA(u). Hence, the supremum is y22 only, i.e. µB′(v) = µB(v).
MORFI: C3-1/C3-2: µA′(u) = √µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). Figure 16 illustrates the superimposed plots of µA→B(u, v) and µA′(u). It is observed from the figure that µA′(u) is always greater than or equal to the implication µA→B(u, v) for any value of µB(v); therefore the outcome of the “min” operation is µA→B(u, v) itself. It is also observed from Figure 16 that µA→B(u, v) = min(µA(u), µB(v)) reaches µB(v) for µA(u) ≥ µB(v), and hence the supremum of µA→B(u, v) is µB(v), i.e. µB′(v) = µB(v). Therefore, it is concluded

Fig. 16 Superimposed plots of MORFI and premise 1 of C3-1/C3-2



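that MORFI satisfies the intuitive criterion C3-2 (not C3-1) of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min\{\mu_A(u), \mu_B(v)\},\ \sqrt{\mu_A(u)}]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min\{\mu_A(u), \sqrt{\mu_A(u)}\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v), \sqrt{\mu_A(u)}\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A(u);\ \text{since } \mu_A(u) \le \sqrt{\mu_A(u)} & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \mu_B(v);\ \text{since } \mu_A(u) > \mu_B(v) \text{ and } \mu_A(u) < \sqrt{\mu_A(u)}, \text{ hence } \sqrt{\mu_A(u)} > \mu_B(v) & \end{cases}
\end{aligned}
\tag{27}
$$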

It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1 and y2. The outcome starts with y1, which increases to a maximum value of µB(v) as µA(u) increases from zero to µB(v). The y2 starts from the maximum value of y1 and remains constant at that value in spite of any further increase in µA(u). Hence, the supremum is y2 only, i.e. µB′(v) = µB(v).
MORFI: C4-1/C4-2: µA′(u) = 1 − µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). Figure 17 illustrates the superimposed plots of µA→B(u, v) and µA′(u). It is noticed from the figure that µA′(u) first intersects µA→B(u, v) at µA(u) = µB(v) = 0.5, a point at which the outcome of the “min” operation is at its peak value, which is also the maximal value that the consequence µB′(v) can achieve. The other values of µB′(v) are the subsequent intersection points (below 0.5) of µA′(u) and µA→B(u, v). Hence it can be concluded that µB′(v) equals µB(v) up to a ceiling of 0.5, or in other words, µB′(v) = min(0.5, µB(v)); that is, µB′(v) = 0.5 ∩ µB(v). The point to be noted here is that MORFI does not satisfy the intuitive criteria C4-1/C4-2 of GMP. The analytical proof is given below:

Fig. 17 Superimposed plots of MORFI and premise 1 of C4-1/C4-2



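$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min\{\mu_A(u), \mu_B(v)\},\ 1 - \mu_A(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min\{\mu_A(u),\ 1 - \mu_A(u)\}; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min\{\mu_B(v),\ 1 - \mu_A(u)\}; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \mu_A(u); & \text{for } \mu_A(u) \le 0.5 \\ y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \mu_B(v); & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{28}
$$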

It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y11, y12, y21 and y22, which can be divided into two regions: the first region consists of y11, y21, and y22 when µB(v) < 0.5, and the second region consists of y11 and y12 when µB(v) ≥ 0.5. The outcome in the first region starts with y11, which increases to a maximum value of µB(v) as µA(u) increases; y22 begins with that maximum value and remains the same as long as µA(u) < 1 − µB(v); after that, y21 takes over and decreases from its maximum value of µB(v) to zero. Hence, the supremum in this region is µB(v), which has a value less than 0.5. In the second region, the outcome of the “min” operation begins with y11, which increases to a maximum value of 0.5, and from there y12 takes over and decreases to zero with any further increase of µA(u). Hence, the supremum of this region is 0.5, which is also the maximum value that the consequence µB′(v) can achieve; otherwise it is µB(v) < 0.5. In other words, µB′(v) = min(0.5, µB(v)); that is, µB′(v) = 0.5 ∩ µB(v).
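The four MORFI results can be cross-checked numerically. The sketch below assumes that µA(u) sweeps [0, 1], takes MORFI as min(µA(u), µB(v)) as in Equation (25), and compares a grid maximum of Equation (20) with the closed forms derived above (µB(v) for C1 to C3 and min(0.5, µB(v)) for C4). All names are hypothetical.

```python
import math


def morfi(a, b):
    """min(a, b), the form of MORFI used in Equation (25)."""
    return min(a, b)


def sup_min(implication, premise, b, n=1000):
    """Grid approximation of Equation (20) for a fixed b = mu_B(v)."""
    return max(
        min(implication(a, b), premise(a))
        for a in (i / n for i in range(n + 1))
    )


# criterion -> (premise 1 as a function of a, closed form derived in the text)
MORFI_CASES = {
    "C1": (lambda a: a, lambda b: b),
    "C2": (lambda a: a * a, lambda b: b),
    "C3": (math.sqrt, lambda b: b),
    "C4": (lambda a: 1.0 - a, lambda b: min(0.5, b)),
}


def max_deviation(steps=20):
    """Largest gap between grid result and closed form over mu_B = 0, 0.05, ..., 1."""
    worst = 0.0
    for premise, closed_form in MORFI_CASES.values():
        for k in range(steps + 1):
            b = k / steps
            worst = max(worst, abs(sup_min(morfi, premise, b) - closed_form(b)))
    return worst


if __name__ == "__main__":
    print("largest deviation:", max_deviation())
```

A deviation at the level of floating-point rounding is consistent with the analytical results; it does not replace the proofs, since the grid only samples the supremum.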
Similar to MORFI, the implication method PORFI is investigated by applying the intuitive criteria of GMP.
PORFI: C1: µA′(u) = µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figure 18 that µA′(u) is always equal to or greater than the implication µA→B(u, v) for any value of µB(v); therefore the outcome of the “min” operation is µA→B(u, v) itself. It is also noticed that

Fig. 18 Superimposed plots of PORFI and premise 1 of C1



the supremum of µA→B(u, v) turns out to be µB(v). Therefore, it is concluded that PORFI satisfies the intuitive criterion C1 of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\mu_A(u)\mu_B(v),\ \mu_A(u)]\} \\
&= \sup_{u \in U}\{\mu_A(u)\mu_B(v)\} \quad \text{(since the product of two normalized membership grades never exceeds either of them)} \\
&= \mu_B(v) \quad (\mu_A(u)\mu_B(v) \text{ tends to } \mu_B(v) \text{ as } \mu_A(u) \to 1)
\end{aligned}
\tag{29}
$$

PORFI: C2-1/C2-2: µA′(u) = µA²(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figures 19 and 20 that the point of intersection of µA→B(u, v) and µA²(u) is at µA(u)µB(v) = µA²(u), or µB(v) = µA(u). Hence the outcome of the “min” operation is equal to µA²(u) for µB(v) ≥ µA(u) and to µA(u)µB(v) for µB(v) < µA(u). It is also noticed that as µA(u) tends to unity, the outcome of the “min” operation converges to µB(v), which is also its largest value.

Fig. 19 Superimposed plots of PORFI and premise 1 of C2-1/C2-2

Fig. 20 plots of PORFI and premise 1 of C2-1/C2-2 for µB = 0.35



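Hence, the supremum of the “min” operation becomes µB(v), i.e. µB′(v) = µB(v). Therefore, PORFI satisfies the C2-2 (not the C2-1) criterion of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\mu_A(u)\mu_B(v),\ \mu_A^2(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A^2(u); & \text{for } \mu_A^2(u) \le \mu_A(u)\mu_B(v) \text{ or } \mu_A(u) \le \mu_B(v) \\ y_2 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \mu_B(v) \quad (\text{since } \mu_A^2(u) \le \mu_A(u)\mu_B(v) \text{ on the first branch and } \mu_A(u)\mu_B(v) \text{ tends to } \mu_B(v) \text{ as } \mu_A(u) \to 1)
\end{aligned}
\tag{30}
$$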

It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1 and y2. The y1 increases to a maximum value of µB²(v) with an increase of µA(u); then y2 starts with that value and increases further to a new maximum value of µB(v) as µA(u) → 1. Since µB²(v) ≤ µB(v), the supremum of the outcome of the “min” operation is µB(v), i.e. µB′(v) = µB(v).
PORFI: C3-1/C3-2: µA′(u) = √µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figure 21 that µA′(u) is always greater than or equal to the implication µA→B(u, v) for any value of µB(v); therefore the outcome of the “min” operation is µA→B(u, v) itself. The supremum of µA→B(u, v) is therefore µB(v), i.e. µB′(v) = µB(v). Therefore, PORFI satisfies the C3-2 (not C3-1) criterion of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\mu_A(u)\mu_B(v),\ \sqrt{\mu_A(u)}]\} \\
&= \sup_{u \in U}\{\mu_A(u)\mu_B(v)\} \quad (\text{since } \sqrt{\mu_A(u)} > \mu_A(u) \text{ and } \mu_A(u)\mu_B(v) < \mu_A(u)) \\
&= \mu_B(v) \quad (\mu_A(u)\mu_B(v) \text{ tends to } \mu_B(v) \text{ as } \mu_A(u) \to 1)
\end{aligned}
\tag{31}
$$

Fig. 21 Superimposed plots of PORFI and premise 1 of C3-1/C3-2



PORFI: C4-1/C4-2: µA′(u) = 1 − µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figures 22 and 23 that the supremum of the “min” operation, i.e. µB′(v), is given by the intersection points of µA′(u) and µA→B(u, v), at which µA′(u) = µA→B(u, v). This gives:

$$\mu_{B'}(v) = \sup_{u \in U}\{\min[\mu_A(u)\mu_B(v),\ 1 - \mu_A(u)]\} = \frac{\mu_B(v)}{1 + \mu_B(v)}$$

Since µB′(v) = µA(u)µB(v) = 1 − µA(u) at the intersection, solving µA(u)µB(v) = 1 − µA(u) gives

$$\mu_A(u) = \frac{1}{1 + \mu_B(v)}, \qquad \text{and therefore} \qquad \mu_{B'}(v) = \frac{1}{1 + \mu_B(v)}\,\mu_B(v) = \frac{\mu_B(v)}{1 + \mu_B(v)}$$

Therefore, PORFI does not satisfy the C4-1/C4-2 criteria of GMP. The analytical proof is given below:

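$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\mu_A(u)\mu_B(v),\ 1 - \mu_A(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u) \le \dfrac{1}{1 + \mu_B(v)} \\ y_2 = 1 - \mu_A(u); & \text{for } \mu_A(u) > \dfrac{1}{1 + \mu_B(v)} \end{cases}
\end{aligned}
\tag{32}
$$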

Fig. 22 Superimposed plots of PORFI and premise 1 of C4-1/C4-2

Fig. 23 plots of PORFI and premise 1 of C4-1/C4-2 for µB = 0.35



It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1 and y2. The y1 increases to a maximum value of µB(v)/(1 + µB(v)) with an increase of µA(u) from zero to 1/(1 + µB(v)); then y2 starts from that maximum value and decreases with any further increase of µA(u). Hence, the supremum of the outcome of the “min” operation is µB′(v) = µB(v)/(1 + µB(v)).
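As a worked instance of Equation (32), take µB(v) = 0.35, the value used in Figure 23. The intersection point and the resulting consequence are then

$$
\mu_A(u) = \frac{1}{1 + 0.35} \approx 0.7407, \qquad
\mu_{B'}(v) = \frac{0.35}{1 + 0.35} \approx 0.2593 .
$$

Both branches agree at this point: y1 = 0.7407 × 0.35 ≈ 0.2593 and y2 = 1 − 0.7407 ≈ 0.2593. The consequence 0.2593 equals neither µB(v) = 0.35 nor unity, which illustrates numerically why the criteria are not met for this value.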
ARFI: C1: µA′(u) = µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figures 24 and 25 that µA→B(u, v) equals unity as long as µA(u) does not exceed µB(v); otherwise µA→B(u, v) is equal to 1 − µA(u) + µB(v). We also notice that the curve µA′(u) intersects only the branch 1 − µA(u) + µB(v); therefore the suprema of the outcome of the “min” operation between the curves µA→B(u, v) and µA′(u) are their intersection points, obtained by solving the following equality:

$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_{A'}(u) = \mu_A(u) = \frac{1 + \mu_B(v)}{2}$$

Fig. 24 Superimposed plots of ARFI and premise 1 of C1

Fig. 25 plots of ARFI and premise 1 of C1 for µB = 0.35



Solving 1 − µA(u) + µB(v) = µA(u) gives µA(u) = (1 + µB(v))/2. Based on the consequence µB′(v) obtained, it is concluded that ARFI does not satisfy the C1 criterion of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min\{1,\ 1 - \mu_A(u) + \mu_B(v)\},\ \mu_A(u)]\} \\
&= \sup_{u \in U}\{\min[1 - \mu_A(u) + \mu_B(v),\ \mu_A(u)]\} \quad (\text{for } \mu_A(u) > \mu_B(v)) \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_{A'}(u) = \mu_A(u); & \text{for } \mu_A(u) \le \dfrac{1 + \mu_B(v)}{2} \\ y_2 = 1 - \mu_A(u) + \mu_B(v); & \text{for } \mu_A(u) > \dfrac{1 + \mu_B(v)}{2} \end{cases}
\end{aligned}
\tag{33}
$$
It is observed from the above equation that the outcome of the “min” operation between µA→B(u, v) and µA′(u) is either y2 = 1 − µA(u) + µB(v) or y1 = µA(u). Also, we see from the nature of the equations that µA′(u) increases with an increase in µA(u), whereas 1 − µA(u) + µB(v) decreases; hence, the supremum of the “min” operation is the point of intersection of y1 and y2, i.e. µB′(v) = 1 − µA(u) + µB(v) = µA(u), or µB′(v) = (1 + µB(v))/2.

ARFI: C2-1/C2-2: µA′(u) = µA²(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figure 26 that the suprema of the outcome of the “min” operation between the curves µA→B(u, v) and µA′(u) are the intersection points of these curves for any given value of µB(v). These intersection points are obtained by solving the following equality:

$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \mu_A^2(u)$$

Since 1 − µA(u) + µB(v) = µA²(u), or µA²(u) + µA(u) − 1 − µB(v) = 0, is of the form ax² + bx + c = 0, whose roots are given by x = (−b ± √(b² − 4ac))/(2a), the values

Fig. 26 Superimposed plots of ARFI and premise 1 of C2-1/C2-2



of µA(u) are computed as

$$
\mu_A(u) = \frac{-1 \pm \sqrt{1 + 4(1 + \mu_B(v))}}{2} = \frac{-1 \pm \sqrt{1 + 4 + 4\mu_B(v)}}{2} = \frac{-1 \pm \sqrt{5 + 4\mu_B(v)}}{2}
$$

Since the value of µA(u) cannot lie outside the limits 0 and 1,

$$
\mu_A(u) = \frac{-1 + \sqrt{5 + 4\mu_B(v)}}{2} = \frac{\sqrt{5 + 4\mu_B(v)} - 1}{2}
$$

Thus µB′(v) is given by

$$
\begin{aligned}
\mu_{B'}(v) &= 1 - \mu_A(u) + \mu_B(v) \\
&= 1 - \frac{\sqrt{5 + 4\mu_B(v)} - 1}{2} + \mu_B(v) = \frac{2 - \sqrt{5 + 4\mu_B(v)} + 1}{2} + \mu_B(v) \\
&= \frac{2 - \sqrt{5 + 4\mu_B(v)} + 1 + 2\mu_B(v)}{2} = \frac{3 + 2\mu_B(v) - \sqrt{5 + 4\mu_B(v)}}{2}
\end{aligned}
$$

Based on the consequence µB′(v) obtained, it is concluded that ARFI does not satisfy the C2-1/C2-2 criteria of GMP. The analytical proof is given below:

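$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min(1,\ 1 - \mu_A(u) + \mu_B(v)),\ \mu_{A'}(u)]\} \\
&= \sup_{u \in U}\{\min[1 - \mu_A(u) + \mu_B(v),\ \mu_A^2(u)]\}; \quad \text{for } \mu_A(u) > \mu_B(v) \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_{A\min}(u) \\ y_2 = 1 - \mu_A(u) + \mu_B(v); & \text{for } \mu_A(u) > \mu_{A\min}(u) \end{cases}
\end{aligned}
\tag{34}
$$

where $\mu_{A\min}(u) = \dfrac{\sqrt{5 + 4\mu_B(v)} - 1}{2}$ is obtained by solving $1 - \mu_A(u) + \mu_B(v) = \mu_A^2(u)$.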

It is observed from the above equation that the outcome of the “min” operation between µA→B(u, v) and µA′(u) is either 1 − µA(u) + µB(v) or µA′(u). Also, we see from the nature of the equations that µA′(u) increases with an increase in µA(u), whereas 1 − µA(u) + µB(v) decreases; hence, the supremum of the “min” operation is the point of intersection of y1 and y2, i.e. µB′(v) = 1 − µA(u) + µB(v) = µA′(u), or µB′(v) = (3 + 2µB(v) − √(5 + 4µB(v)))/2.
ARFI: C3-1/C3-2: µA′(u) = √µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figure 27 that the suprema of the outcome of the “min” operation between the curves µA→B(u, v) and µA′(u) are the intersection points of these curves for any given value of µB(v). These intersection points are obtained by solving the following equality:

Fig. 27 Superimposed plots of ARFI and premise 1 of C3-1/C3-2


$$\mu_{B'}(v) = 1 - \mu_A(u) + \mu_B(v) = \sqrt{\mu_A(u)}$$

Since

$$\mu_A(u) = [1 - \mu_A(u) + \mu_B(v)]^2 \quad \text{or} \quad \mu_A^2(u) - (3 + 2\mu_B(v))\mu_A(u) + \mu_B^2(v) + 2\mu_B(v) + 1 = 0,$$

by solving the above equation, we get

$$
\begin{aligned}
\mu_A(u) &= \frac{3 + 2\mu_B(v) \pm \sqrt{(3 + 2\mu_B(v))^2 - 4\mu_B^2(v) - 8\mu_B(v) - 4}}{2} \\
&= \frac{3 + 2\mu_B(v) \pm \sqrt{9 + 4\mu_B^2(v) + 12\mu_B(v) - 4\mu_B^2(v) - 8\mu_B(v) - 4}}{2} \\
&= \frac{3 + 2\mu_B(v) - \sqrt{5 + 4\mu_B(v)}}{2} \quad \text{since } \mu_A(u) \in [0, 1]
\end{aligned}
$$

Thus µB′(v) is given by

$$
\begin{aligned}
\mu_{B'}(v) &= 1 - \mu_A(u) + \mu_B(v) \\
&= 1 - \frac{3 + 2\mu_B(v) - \sqrt{5 + 4\mu_B(v)}}{2} + \mu_B(v) \\
&= \frac{2 - 3 - 2\mu_B(v) + \sqrt{5 + 4\mu_B(v)}}{2} + \mu_B(v) \\
&= \frac{-1 - 2\mu_B(v) + \sqrt{5 + 4\mu_B(v)} + 2\mu_B(v)}{2} \\
&= \frac{\sqrt{5 + 4\mu_B(v)} - 1}{2}
\end{aligned}
$$

Based on the consequence µB′(v) obtained, it is concluded that ARFI does not satisfy the C3-1/C3-2 criteria of GMP. The analytical proof is given below:

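$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min(1,\ 1 - \mu_A(u) + \mu_B(v)),\ \mu_{A'}(u)]\} \\
&= \sup_{u \in U}\{\min[1 - \mu_A(u) + \mu_B(v),\ \sqrt{\mu_A(u)}]\}; \quad \text{for } \mu_A(u) > \mu_B(v) \\
&= \sup_{u \in U}\begin{cases} y_1 = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_{A\min}(u) \\ y_2 = 1 - \mu_A(u) + \mu_B(v); & \text{for } \mu_A(u) > \mu_{A\min}(u) \end{cases}
\end{aligned}
\tag{35}
$$

where $\mu_{A\min}(u) = \dfrac{3 + 2\mu_B(v) - \sqrt{5 + 4\mu_B(v)}}{2}$ is obtained by solving $1 - \mu_A(u) + \mu_B(v) = \sqrt{\mu_A(u)}$.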
It is observed from the above equation that the outcome of the “min” operation between µA→B(u, v) and µA′(u) is either 1 − µA(u) + µB(v) or µA′(u). Also, we see from the nature of the equations that µA′(u) increases with an increase in µA(u), whereas 1 − µA(u) + µB(v) decreases; hence, the supremum of the “min” operation is the point of intersection of y1 and y2, i.e. µB′(v) = 1 − µA(u) + µB(v) = √µA(u), or µB′(v) = (√(5 + 4µB(v)) − 1)/2.
ARFI: C4-1/C4-2: µA′(u) = 1 − µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). We see from Figure 28 that µA→B(u, v) is always greater than or equal to µA′(u); thus the outcome of the “min” operation is always µA′(u), i.e. 1 − µA(u), for any value of µB(v). Since the supremum of 1 − µA(u) is always unity, µB′(v) = 1. Therefore, ARFI satisfies the C4-1 (not C4-2) criterion of GMP. The analytical proof is given below:
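$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\min(1,\ 1 - \mu_A(u) + \mu_B(v)),\ \mu_{A'}(u)]\} \\
&= \sup_{u \in U}\{\min[1 - \mu_A(u) + \mu_B(v),\ \mu_{A'}(u)]\}; \quad \text{for } \mu_A(u) > \mu_B(v) \\
&= \sup_{u \in U}\begin{cases} y_1 = 1 - \mu_A(u) + \mu_B(v); & \text{for } 1 - \mu_A(u) + \mu_B(v) < 1 - \mu_A(u), \text{ i.e. } \mu_B(v) < 0 \\ y_2 = \mu_{A'}(u) = 1 - \mu_A(u); & \text{for } 1 - \mu_A(u) + \mu_B(v) > 1 - \mu_A(u), \text{ i.e. } \mu_B(v) > 0 \end{cases}
\end{aligned}
\tag{36}
$$

We observe from the above equation that y1 is not valid, as µB(v) < 0 is not possible. Hence, µB′(v) is given by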

Fig. 28 Superimposed plots of ARFI and premise 1 of C4-1/C4-2



$$\mu_{B'}(v) = \sup_{u \in U}\{y_2 = \mu_{A'}(u) = 1 - \mu_A(u);\ \text{for } \mu_B(v) > 0\}.$$

Since the supremum of 1 − µA(u) is always unity, µB′(v) = 1.
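The four ARFI closed forms derived above can be compared against a grid evaluation of Equation (20). The sketch below assumes that µA(u) sweeps [0, 1] and takes ARFI as min(1, 1 − µA(u) + µB(v)), as in Equation (33). Because the suprema sit at intersection points that generally fall between grid nodes, the comparison is only approximate. All names are hypothetical.

```python
import math


def arfi(a, b):
    """min(1, 1 - a + b), the form of ARFI used in Equation (33)."""
    return min(1.0, 1.0 - a + b)


def sup_min(implication, premise, b, n=10000):
    """Grid approximation of Equation (20) for a fixed b = mu_B(v)."""
    return max(
        min(implication(a, b), premise(a))
        for a in (i / n for i in range(n + 1))
    )


PREMISES = {
    "C1": lambda a: a,
    "C2": lambda a: a * a,
    "C3": math.sqrt,
    "C4": lambda a: 1.0 - a,
}


def closed_forms(b):
    """Consequences derived in the text for ARFI, as functions of b."""
    root = math.sqrt(5.0 + 4.0 * b)
    return {
        "C1": (1.0 + b) / 2.0,
        "C2": (3.0 + 2.0 * b - root) / 2.0,
        "C3": (root - 1.0) / 2.0,
        "C4": 1.0,
    }


if __name__ == "__main__":
    b = 0.35
    expected = closed_forms(b)
    for name, premise in PREMISES.items():
        print(name, round(sup_min(arfi, premise, b), 4), round(expected[name], 4))
```

Increasing `n` tightens the agreement, since the grid maximum can only undershoot the true supremum by at most the local slope times the grid spacing.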
MRFI: C1: µA′(u) = µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figure 29 that the suprema of the outcome of the “min” operation between µA→B(u, v) and µA′(u) are their intersection points. We observe that 0.5, the minimum value of the supremum, is the first point of intersection, and the other values of the supremum are equal to µB(v) for values greater than 0.5; in other words, µB′(v) = max(0.5, µB(v)) or µB′(v) = 0.5 ∪ µB(v). Based on the consequence µB′(v), it is concluded that MRFI does not satisfy the C1 criterion of GMP in general: the consequence equals µB(v) only for values of µB(v) of 0.5 or more. The analytical proof is given below:

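$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\max[\min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u)],\ \mu_A(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min[\max[\mu_A(u),\ 1 - \mu_A(u)],\ \mu_A(u)]; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min[\max[\mu_B(v),\ 1 - \mu_A(u)],\ \mu_A(u)]; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \min[\mu_A(u),\ \mu_A(u)]; & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \min[1 - \mu_A(u),\ \mu_A(u)]; & \text{for } \mu_A(u) < 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = \min[\mu_B(v),\ \mu_A(u)]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min[1 - \mu_A(u),\ \mu_A(u)]; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \mu_A(u); & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \mu_A(u); & \text{for } \mu_A(u) < 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = \min[\mu_B(v),\ \mu_A(u)]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min[1 - \mu_A(u),\ \mu_A(u)]; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = \mu_B(v); & \text{for } \mu_A(u) \ge 1 - \mu_B(v), \text{ i.e. } \mu_B(v) \ge 0.5 \\ \begin{cases} y_{221} = \mu_A(u); & \text{for } \mu_A(u) \le 0.5 \\ y_{222} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 0.5 \end{cases} & \text{for } \mu_A(u) < 1 - \mu_B(v), \text{ i.e. } \mu_B(v) < 0.5 \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{37}
$$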

Fig. 29 Superimposed plots of MRFI and premise 1 of C1



It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1, y21, y221, and y222, which are divided into two regions depending on the value of µB(v). The first region contains y1 and y21 for µB(v) ≥ 0.5, and the second region contains y221 and y222 for µB(v) < 0.5. In the first region, the outcome of the “min” operation starts with y1, which increases to a maximal value of µB(v) with an increase in µA(u); then y21 starts with that maximal value and remains there in spite of any further increase in µA(u). Clearly, the supremum of this region is µB(v), with a value of at least 0.5.
In the second region, the outcome of the “min” operation starts with y221, which increases to a maximal value of 0.5 with an increase in µA(u); then y222 starts with that maximal value and decreases with any further increase in µA(u). Hence, the supremum of this region is 0.5. It is observed that the supremum of the first region is equal to or greater than the supremum of the second region; therefore µB′(v) = max(0.5, µB(v)), or µB′(v) = 0.5 ∪ µB(v).
MRFI: C2-1/C2-2: µA′(u) = µA²(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). It is observed from Figures 30 and 31 that the first supremum of the outcome of the “min” operation between µA→B(u, v) and µA′(u)

Fig. 30 Superimposed plots of MRFI and premise 1 of C2-1/C2-2

Fig. 31 plots of ARFI and premise 1 of C2-1/C2-2 for µB = 0.35



is the intersection point of the curve 1 − µA(u) and µA′(u) = µA²(u), computed by solving the following equality:

$$\mu_{B'}(v) = 1 - \mu_A(u) = \mu_A^2(u)$$

Since 1 − µA(u) = µA²(u), or µA²(u) + µA(u) − 1 = 0, solving the above equation gives

$$\mu_A(u) = \frac{-1 \pm \sqrt{1 + 4}}{2} = \frac{-1 \pm \sqrt{5}}{2}$$

Since the membership grade cannot lie outside the limits 0 and 1, µA(u) takes the value (√5 − 1)/2, and thus the consequence µB′(v) is given by

$$\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{\sqrt{5} - 1}{2} = \frac{2 - \sqrt{5} + 1}{2} = \frac{3 - \sqrt{5}}{2}$$

It is also observed that (3 − √5)/2 is the lowest value of µB′(v), and the other values of the supremum are equal to µB(v) for values greater than (3 − √5)/2; in other words, µB′(v) = max((3 − √5)/2, µB(v)) or µB′(v) = (3 − √5)/2 ∪ µB(v). Based on the consequence µB′(v), it is concluded that MRFI does not satisfy the C2-2 and C2-1 criteria of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\max[\min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u)],\ \mu_A^2(u)]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min[\max[\mu_A(u),\ 1 - \mu_A(u)],\ \mu_A^2(u)]; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min[\max[\mu_B(v),\ 1 - \mu_A(u)],\ \mu_A^2(u)]; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \min[\mu_A(u),\ \mu_A^2(u)]; & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \min[1 - \mu_A(u),\ \mu_A^2(u)]; & \text{for } \mu_A(u) < 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = \min[\mu_B(v),\ \mu_A^2(u)]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min[1 - \mu_A(u),\ \mu_A^2(u)]; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \mu_A^2(u); & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \mu_A^2(u); & \text{for } \mu_A(u) < 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} \begin{cases} y_{211} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\ y_{212} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)} \end{cases} & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ \begin{cases} y_{221} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge \mu_{A\min}(u) \\ y_{222} = \mu_A^2(u); & \text{for } \mu_A(u) < \mu_{A\min}(u) \end{cases} & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} y_1 = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} \begin{cases} y_{211} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)} \\ y_{212} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)} \end{cases} & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ \begin{cases} y_{221} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge \mu_{A\min}(u) \\ y_{222} = \mu_A^2(u); & \text{for } \mu_A(u) < \mu_{A\min}(u) \end{cases} & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{38}
$$

where $\mu_{A\min}(u) = \dfrac{\sqrt{5} - 1}{2}$ is obtained by solving $1 - \mu_A(u) = \mu_A^2(u)$.

It is observed that the outcome of the “min” operation between µA→B(u, v) and µA′(u) consists of y1, y211, y212, y221, and y222, which are divided into two regions depending on the value of µB(v). The first region contains y1, y211, and y212 for µB(v) ≥ 0.5, and the second region contains y221 and y222 for µB(v) < 0.5. In the first region, the outcome of the “min” operation starts with y1 or y211, which increases to a maximal value of µB(v) with an increase in µA(u); then y212 starts with that maximal value and remains there in spite of any further increase in µA(u). Clearly, the supremum of this region is µB(v), with a value of at least 0.5.
In the second region, the outcome of the “min” operation starts with y222, which increases to a maximal value of µAmax(u) with an increase in µA(u); then y221 starts with that maximal value and decreases with any further increase in µA(u). Hence, the supremum of this region is µAmax(u). It is observed that the supremum of the first region is greater than the supremum of the second region; therefore µB′(v) = max(µAmax(u), µB(v)), or µB′(v) = µAmax(u) ∪ µB(v), where µAmax(u) = (3 − √5)/2.
MRFI: C3-1/C3-2: µA′(u) = √µA(u) is applied to the RHS of Equation (20) to get the consequence µB′(v). Similar observations are made as for MRFI: C2-1/C2-2, the only difference being the minimum supremum point, at which µA→B(u, v) and µA′(u) intersect for the first time. The minimum supremum point is computed by solving the following equality:

$$\mu_{B'}(v) = 1 - \mu_A(u) = \sqrt{\mu_A(u)}$$

Since 1 − µA(u) = √µA(u), or 1 + µA²(u) − 2µA(u) = µA(u), i.e. µA²(u) − 3µA(u) + 1 = 0, solving the above equation gives

$$\mu_A(u) = \frac{3 \pm \sqrt{9 - 4}}{2} = \frac{3 \pm \sqrt{5}}{2}; \qquad \mu_A(u) = \frac{3 - \sqrt{5}}{2}$$

Thus µB′(v) is given by

$$\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{3 - \sqrt{5}}{2} = \frac{2 - 3 + \sqrt{5}}{2} = \frac{\sqrt{5} - 1}{2}$$

It is observed from Figure 32 that other values of the supremum are

Fig. 32 Superimposed plots of MRFI and premise 1 of C3-1/C3-2



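equal to µB(v) for values greater than (√5 − 1)/2; in other words, µB′(v) = max((√5 − 1)/2, µB(v)) or µB′(v) = (√5 − 1)/2 ∪ µB(v). Based on the consequence µB′(v), it is concluded that MRFI does not satisfy the C3-2 and C3-1 criteria of GMP. The analytical proof is given below:

$$
\begin{aligned}
\mu_{B'}(v) &= \sup_{u \in U}\{\min[\max[\min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u)],\ \sqrt{\mu_A(u)}]\} \\
&= \sup_{u \in U}\begin{cases} y_1 = \min[\max[\mu_A(u),\ 1 - \mu_A(u)],\ \sqrt{\mu_A(u)}]; & \text{for } \mu_A(u) \le \mu_B(v) \\ y_2 = \min[\max[\mu_B(v),\ 1 - \mu_A(u)],\ \sqrt{\mu_A(u)}]; & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \min[\mu_A(u),\ \sqrt{\mu_A(u)}]; & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = \min[1 - \mu_A(u),\ \sqrt{\mu_A(u)}]; & \text{for } \mu_A(u) < 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} y_{21} = \min[\mu_B(v),\ \sqrt{\mu_A(u)}]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ y_{22} = \min[1 - \mu_A(u),\ \sqrt{\mu_A(u)}]; & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases} \\
&= \sup_{u \in U}\begin{cases} \begin{cases} y_{11} = \mu_A(u); & \text{for } \mu_A(u) \ge 0.5 \\ y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) < 0.5 \end{cases} & \text{for } \mu_A(u) \le \mu_B(v) \\ \begin{cases} \begin{cases} y_{211} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B^2(v) \\ y_{212} = \mu_B(v); & \text{for } \mu_A(u) > \mu_B^2(v) \end{cases} & \text{for } \mu_A(u) \ge 1 - \mu_B(v) \\ \begin{cases} y_{221} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge \mu_{A\min}(u) \\ y_{222} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) < \mu_{A\min}(u) \end{cases} & \text{for } \mu_A(u) < 1 - \mu_B(v) \end{cases} & \text{for } \mu_A(u) > \mu_B(v) \end{cases}
\end{aligned}
\tag{39}
$$

where $\mu_{A\min}(u) = \dfrac{3 - \sqrt{5}}{2}$ is obtained by solving $1 - \mu_A(u) = \sqrt{\mu_A(u)}$.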
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y211 , y212 , y221 , and y222 , which are divided into two re-
gions depending on the values of µB (v). The first region contains y11 , y12 , y211 , and
y212 for µB (v) ≥ 0.5 and the second region having y221 and y222 for µB (v) < 0.5. In
the first region, the outcome of the “min” operation starts with y211 which increases
to some maximal value with an increase in µA (u), then y12 which starts with that
maximal value and decreases to some value, then y11 which increases to a maxi-
mum value of µB (v) and y212 remains equal to that maximum value in spite of any
further increase in µA (u). Clearly, the supremum of this region is µB (v) with value
greater than 0.5.
In the second region, the outcome of the “min” operation starts with y222 which
increases to a maximal value of µAmax (u) with an increase in µA (u) and then y221
starts with that maximal value and decreases with any further increase in µA (u).
Hence, the supremum of this region is $\mu_A^{\max}(u)$ only. It is observed that the supremum of the first region is greater than the supremum of the second region, therefore $\mu_{B'}(v) = \max(\mu_A^{\max}(u), \mu_B(v))$, or $\mu_{B'}(v) = \mu_A^{\max}(u) \cup \mu_B(v)$, where $\mu_A^{\max}(u) = \frac{\sqrt{5} - 1}{2}$.
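The closed form reached for MRFI under premise 1 of C3-1/C3-2 can be checked numerically. The sketch below is only an illustration: it assumes that $\mu_A(u)$ sweeps the whole interval [0, 1] as $u$ ranges over $U$, and it approximates the supremum on a uniform grid. The names `mrfi`, `gmp_sup` and `THRESHOLD` are hypothetical helpers, not part of the chapter.

```python
import math

def mrfi(a, b):
    # MRFI as it appears in Equation (39): max(min(mu_A, mu_B), 1 - mu_A)
    return max(min(a, b), 1.0 - a)

def gmp_sup(implication, premise, b, n=20001):
    # Approximate the sup over u by sweeping mu_A(u) = a over a uniform grid on [0, 1]
    # (assumption: mu_A takes every value in [0, 1])
    return max(min(implication(k / (n - 1), b), premise(k / (n - 1)))
               for k in range(n))

THRESHOLD = (math.sqrt(5) - 1) / 2  # (sqrt(5) - 1) / 2

for b in (0.1, 0.35, 0.5, 0.7, 0.9, 1.0):
    numeric = gmp_sup(mrfi, math.sqrt, b)   # premise mu_A'(u) = sqrt(mu_A(u))
    closed = max(THRESHOLD, b)              # closed form stated in the text
    print(f"mu_B = {b:.2f}: grid sup = {numeric:.4f}, closed form = {closed:.4f}")
```

Because the crossing point $\frac{3 - \sqrt{5}}{2}$ is irrational, the grid value can fall slightly below the closed form; agreement is expected only up to the grid resolution.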
342 S.K. Kashyap et al.

Fig. 33 Superimposed plots of MRFI and premise 1 of C4-1/C4-2

MRFI: C4-1/C4-2: $\mu_{A'}(u) = 1 - \mu_A(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 33 that the implication $\mu_{A \to B}(u, v)$ is always greater than or equal to $\mu_{A'}(u)$ for any values of $\mu_A(u)$ and $\mu_B(v)$. Hence, the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ is $\mu_{A'}(u)$, i.e. $1 - \mu_A(u)$ itself, and the supremum of the outcome is unity, i.e. $\mu_{B'}(v) = 1$. Therefore, MRFI satisfies the C4-1 (not C4-2) criterion of GMP. The analytical proof is given below:
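$$
\mu_{B'}(v) = \sup_{u \in U} \left\{ \min\left[ \max\left[ \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_A(u) \right],\ 1 - \mu_A(u) \right] \right\} \qquad (40)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ \max\left[ \mu_A(u),\ 1 - \mu_A(u) \right],\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[ \max\left[ \mu_B(v),\ 1 - \mu_A(u) \right],\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
$$

$$
= \sup_{u \in U}
\begin{cases}
y_{11} = \min\left[ \mu_A(u),\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) \ge 0.5,\ \mu_A(u) \le \mu_B(v) \\
y_{12} = \min\left[ 1 - \mu_A(u),\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) < 0.5,\ \mu_A(u) \le \mu_B(v) \\
y_{21} = \min\left[ \mu_B(v),\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) \ge 1 - \mu_B(v),\ \mu_A(u) > \mu_B(v) \\
y_{22} = \min\left[ 1 - \mu_A(u),\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) < 1 - \mu_B(v),\ \mu_A(u) > \mu_B(v)
\end{cases}
$$

$$
= \sup_{u \in U}
\begin{cases}
y_{11} = 1 - \mu_A(u); & \text{for } \mu_A(u) \ge 0.5,\ \mu_A(u) \le \mu_B(v) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) < 0.5,\ \mu_A(u) \le \mu_B(v) \\
y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) \le \mu_B^2(v),\ \mu_A(u) \ge 1 - \mu_B(v),\ \mu_A(u) > \mu_B(v) \\
y_{22} = 1 - \mu_A(u); & \text{for } \mu_A(u) > \mu_B^2(v),\ \mu_A(u) \ge 1 - \mu_B(v),\ \mu_A(u) > \mu_B(v)
\end{cases}
$$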

It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) is 1 − µA (u) only and hence the supremum of that would be the unity, i.e.
µB (v) = 1.
BRFI: C1: $\mu_{A'}(u) = \mu_A(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figures 34 and 35 that the first supremum of the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ is the intersection point of the curve $1 - \mu_A(u)$ and the curve $\mu_{A'}(u) = \mu_A(u)$, and is computed as follows:

Fig. 34 Superimposed plots of BRFI and premise 1 of C1

Fig. 35 plots of BRFI and premise 1 of C1 for µB = 0.35

$$
\mu_{B'}(v) = 1 - \mu_A(u) = \mu_A(u), \quad \text{or } 2\mu_A(u) = 1, \text{ i.e. } \mu_A(u) = 0.5,
$$

hence $\mu_{B'}(v) = 0.5$. It is also observed from the figure that this first supremum is the lowest value of $\mu_{B'}(v)$; the other values are equal to $\mu_B(v)$, or in other words, $\mu_{B'}(v) = \max(0.5, \mu_B(v)) = 0.5 \cup \mu_B(v)$. Based on the consequence $\mu_{B'}(v)$ it is concluded that BRFI does not satisfy the C1 criterion of GMP. The analytical proof is given below:

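$$
\mu_{B'}(v) = \sup_{u \in U} \left\{ \min\left[ \max(1 - \mu_A(u), \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \qquad (41)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1 - \mu_A(u),\ \mu_A(u) \right]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min\left[ \mu_B(v),\ \mu_A(u) \right]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

$$
= \sup_{u \in U}
\begin{cases}
y_{11} = \mu_A(u); & \text{for } \mu_A(u) \le 0.5,\ \mu_A(u) \le 1 - \mu_B(v) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 0.5,\ \mu_A(u) \le 1 - \mu_B(v) \\
y_{21} = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v),\ \mu_A(u) > 1 - \mu_B(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \mu_B(v),\ \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$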

It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y21 and y22 which are divided into two regions depending
on the values of µB (v). The first region contains y11 and y12 for µB (v) ≤ 0.5 and the
second region having y21 and y22 for µB (v) > 0.5. In the first region, the outcome
of the “min” operation starts with y11 which increases to a maximal value of 0.5
with an increase in µA (u) and then y12 which starts with that maximal value and
decreases with any further increase in µA (u). Clearly, the supremum of this region
is 0.5.
In the second region, the outcome of the “min” operation starts with y21 which
increases to a maximal value of µB (v) with an increase in µA (u) and then y22 starts
with that maximal value and remains there in spite of any further increase in µA (u).
Hence, the supremum of this region is $\mu_B(v)$ only. It is observed that the supremum of the second region is either equal to or greater than the supremum of the first region, therefore $\mu_{B'}(v) = \max(0.5, \mu_B(v))$ or $\mu_{B'}(v) = 0.5 \cup \mu_B(v)$.
BRFI: C2-1/C2-2: $\mu_{A'}(u) = \mu_A^2(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 36 that the first supremum of the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ is the intersection point of the curve $1 - \mu_A(u)$ and the curve $\mu_{A'}(u) = \mu_A^2(u)$, and is computed as follows:

$$
\mu_{B'}(v) = 1 - \mu_A(u) = \mu_A^2(u)
$$

Since $1 - \mu_A(u) = \mu_A^2(u)$, or $\mu_A^2(u) + \mu_A(u) - 1 = 0$, solving the above equation, we get $\mu_A(u) = \frac{-1 \pm \sqrt{1 + 4}}{2} = \frac{-1 \pm \sqrt{5}}{2}$. Since the membership grade cannot exceed the limits 0 and 1, $\mu_A(u)$ has the value $\mu_A(u) = \frac{\sqrt{5} - 1}{2}$; thus the consequence $\mu_{B'}(v)$ is given by $\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{\sqrt{5} - 1}{2} = \frac{2 - \sqrt{5} + 1}{2} = \frac{3 - \sqrt{5}}{2}$. It is also observed that $\frac{3 - \sqrt{5}}{2}$ is the lowest value of $\mu_{B'}(v)$, and other values of the supremum are equal to $\mu_B(v)$ with values greater than $\frac{3 - \sqrt{5}}{2}$, or, in

Fig. 36 Superimposed plots of BRFI and premise 1 of C2-1/C2-2


other words, $\mu_{B'}(v) = \max\left(\frac{3 - \sqrt{5}}{2}, \mu_B(v)\right)$ or $\mu_{B'}(v) = \frac{3 - \sqrt{5}}{2} \cup \mu_B(v)$. Based on the consequence $\mu_{B'}(v)$, it is concluded that BRFI does not satisfy the C2-2 and C2-1 criteria of GMP. The analytical proof is given below:

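$$
\mu_{B'}(v) = \sup_{u \in U} \left\{ \min\left[ \max(1 - \mu_A(u), \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \qquad (42)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1 - \mu_A(u),\ \mu_A^2(u) \right]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min\left[ \mu_B(v),\ \mu_A^2(u) \right]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

$$
= \sup_{u \in U}
\begin{cases}
y_{11} = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_A^{\min}(u),\ \mu_A(u) \le 1 - \mu_B(v) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > \mu_A^{\min}(u),\ \mu_A(u) \le 1 - \mu_B(v) \\
y_{21} = \mu_A^2(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)},\ \mu_A(u) > 1 - \mu_B(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \sqrt{\mu_B(v)},\ \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

where $\mu_A^{\min}(u) = \frac{\sqrt{5} - 1}{2}$ is obtained by solving $1 - \mu_A(u) = \mu_A^2(u)$.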
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y21 , and y22 which are divided into two regions depend-
ing on the values of µB (v). The first region contains y11 and y12 for µB (v) ≤ 0.5
and the second region having y21 and y22 for µB (v) > 0.5. In the first region, the
outcome of the “min” operation starts with y11 which increases to a maximal value
of µAmax (u) with an increase in µA (u) and then y12 which starts with that maximal
value and decreases with any further increase in µA (u). Clearly, the supremum of
this region is µAmax (u). In the second region, the outcome of the “min” operation
starts with y21 which increases to a maximal value of µB (v) with an increase in
µA (u) and then y22 starts with that maximal value and remains there in spite of any
further increase in µA (u). Hence, the supremum of this region is µB (v) only. It is
observed that the supremum of the second region is greater than the supremum of the first region, therefore $\mu_{B'}(v) = \max(\mu_A^{\max}(u), \mu_B(v))$ or $\mu_{B'}(v) = \mu_A^{\max}(u) \cup \mu_B(v)$, where $\mu_A^{\max}(u) = \frac{3 - \sqrt{5}}{2}$.
BRFI: C3-1/C3-2: $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ is applied to RHS of Equation (20) to get
the consequence µB (v). Similar observations are made as for BRFI: C2-1/C2-2
with the only difference in the minimum supremum point at which µA→B (u, v) and
µA (u) intersect a first time. The minimum supremum point is computed by solving
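the following equality:

$$
\mu_{B'}(v) = 1 - \mu_A(u) = \sqrt{\mu_A(u)}
$$

Since $1 - \mu_A(u) = \sqrt{\mu_A(u)}$, or $1 + \mu_A^2(u) - 2\mu_A(u) = \mu_A(u)$; $\mu_A^2(u) - 3\mu_A(u) + 1 = 0$, solving the above equation, we get $\mu_A(u) = \frac{3 \pm \sqrt{9 - 4}}{2} = \frac{3 \pm \sqrt{5}}{2}$; $\mu_A(u) = \frac{3 - \sqrt{5}}{2}$. Thus $\mu_{B'}(v)$ is given by

$$
\mu_{B'}(v) = 1 - \mu_A(u) = 1 - \frac{3 - \sqrt{5}}{2} = \frac{2 - 3 + \sqrt{5}}{2} = \frac{\sqrt{5} - 1}{2}
$$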

Fig. 37 Superimposed plots of BRFI and premise 1 of C3-1/C3-2

It is observed from Figure 37 that other values of the supremum are equal to $\mu_B(v)$ with values greater than $\frac{\sqrt{5} - 1}{2}$, or, in other words, $\mu_{B'}(v) = \max\left(\frac{\sqrt{5} - 1}{2}, \mu_B(v)\right)$ or $\mu_{B'}(v) = \frac{\sqrt{5} - 1}{2} \cup \mu_B(v)$. Based on the consequence $\mu_{B'}(v)$, it is concluded that BRFI does not satisfy the C3-2 and C3-1 criteria of GMP. The analytical proof is given below:

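$$
\mu_{B'}(v) = \sup_{u \in U} \left\{ \min\left[ \max(1 - \mu_A(u), \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \qquad (43)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1 - \mu_A(u),\ \sqrt{\mu_A(u)} \right]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min\left[ \mu_B(v),\ \sqrt{\mu_A(u)} \right]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

$$
= \sup_{u \in U}
\begin{cases}
y_{11} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_A^{\min}(u),\ \mu_A(u) \le 1 - \mu_B(v) \\
y_{12} = 1 - \mu_A(u); & \text{for } \mu_A(u) > \mu_A^{\min}(u),\ \mu_A(u) \le 1 - \mu_B(v) \\
y_{21} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B^2(v),\ \mu_A(u) > 1 - \mu_B(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) > \mu_B^2(v),\ \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

where $\mu_A^{\min}(u) = \frac{3 - \sqrt{5}}{2}$ is obtained by solving $1 - \mu_A(u) = \sqrt{\mu_A(u)}$.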
It is observed that the outcome of the “min” operation between µA→B (u, v) and
µA (u) consists of y11 , y12 , y21 and y22 , which are divided into two regions depending
on the values of µB (v). The first region contains y11 and y12 for µB (v) ≤ 0.5 and the
second region having y21 and y22 for µB (v) > 0.5. In the first region, the outcome of
the “min” operation starts with y11 which increases to a maximal value of µAmax (u)
with an increase in µA (u) and then y12 which starts with that maximal value and
decreases with any further increase in µA (u). Clearly, the supremum of this region
is µAmax (u).
In the second region, the outcome of the “min” operation starts with y21 which
increases to a maximal value of µB (v) with an increase in µA (u) and then y22

Fig. 38 Superimposed plots of BRFI and premise 1 of C4-1/C4-2

starts with that maximal value and remains there in spite of any further increase in $\mu_A(u)$. Hence, the supremum of this region is $\mu_B(v)$ only. It is observed that the supremum of the second region is greater than the supremum of the first region, therefore $\mu_{B'}(v) = \max(\mu_A^{\max}(u), \mu_B(v))$ or $\mu_{B'}(v) = \mu_A^{\max}(u) \cup \mu_B(v)$, where $\mu_A^{\max}(u) = \frac{\sqrt{5} - 1}{2}$.
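The three BRFI results obtained so far (C1, C2-1/C2-2 and C3-1/C3-2) share the form $\max(\text{floor}, \mu_B(v))$ with floors $0.5$, $\frac{3 - \sqrt{5}}{2}$ and $\frac{\sqrt{5} - 1}{2}$. The following sketch compares them with a grid approximation of the supremum. It assumes $\mu_A(u)$ covers [0, 1]; `brfi`, `gmp_sup`, `criteria` and `check` are illustrative names only.

```python
import math

def brfi(a, b):
    # BRFI as it appears in Equations (41)-(43): max(1 - mu_A, mu_B)
    return max(1.0 - a, b)

def gmp_sup(implication, premise, b, n=20001):
    # Grid approximation of sup over u of min(implication, premise)
    return max(min(implication(k / (n - 1), b), premise(k / (n - 1)))
               for k in range(n))

S5 = math.sqrt(5)
criteria = [
    ("C1", lambda a: a, 0.5),                # premise mu_A
    ("C2", lambda a: a * a, (3 - S5) / 2),   # premise mu_A squared
    ("C3", math.sqrt, (S5 - 1) / 2),         # premise sqrt(mu_A)
]

def check(b):
    # Per criterion: (grid supremum, closed form max(floor, mu_B))
    return {name: (gmp_sup(brfi, premise, b), max(floor, b))
            for name, premise, floor in criteria}

for name, (numeric, closed) in check(0.35).items():
    print(name, round(numeric, 4), round(closed, 4))
```

For $\mu_B(v)$ below the floor the consequence stays at the floor rather than following the value the criterion expects, which is how the failure of C1 to C3 shows up numerically.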


BRFI: C4-1/C4-2: $\mu_{A'}(u) = 1 - \mu_A(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 38 that the implication $\mu_{A \to B}(u, v)$ is always greater than or equal to $\mu_{A'}(u)$ for any values of $\mu_A(u)$ and $\mu_B(v)$. Hence, the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ is $\mu_{A'}(u)$, i.e. $1 - \mu_A(u)$ itself, and the supremum of the outcome is unity, i.e. $\mu_{B'}(v) = 1$. Therefore, BRFI satisfies the C4-1 (not C4-2) criterion of GMP. The analytical proof is given below:

$$
\mu_{B'}(v) = \sup_{u \in U} \left\{ \min\left[ \max(1 - \mu_A(u), \mu_B(v)),\ \mu_{A'}(u) \right] \right\} \qquad (44)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1 - \mu_A(u),\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_2 = \min\left[ \mu_B(v),\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = 1 - \mu_A(u); & \text{for } \mu_A(u) \le 1 - \mu_B(v) \\
y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) > 1 - \mu_B(v),\ \mu_A(u) > 1 - \mu_B(v) \\
y_{22} = \mu_B(v); & \text{for } \mu_A(u) \le 1 - \mu_B(v),\ \mu_A(u) > 1 - \mu_B(v)
\end{cases}
$$

It is observed that y22 is not valid, hence, not considered. Therefore, the outcome
of the “min” operation between µA→B (u, v) and µA (u) is 1 − µA (u) only and the
supremum of the outcome will be the unity only, i.e. µB (v) = 1.
GRFI: C1: $\mu_{A'}(u) = \mu_A(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 39 that $\mu_{B'}(v)$ or the supremum of the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ are the intersection

Fig. 39 Superimposed plots of GRFI and premise 1 of C1

points of $\mu_{A'}(u) = \mu_A(u)$ and the curve $\frac{\mu_B(v)}{\mu_A(u)}$ (selected because $\mu_{A'}(u)$ intersects this curve only) at various values of $\mu_B(v)$. These intersection points are computed by solving the following equality:

$$
\mu_{B'}(v) = \frac{\mu_B(v)}{\mu_A(u)} = \mu_A(u)
$$

Since $\frac{\mu_B(v)}{\mu_A(u)} = \mu_A(u)$, solving this equation we get $\mu_A(u) = \sqrt{\mu_B(v)}$, hence $\mu_{B'}(v) = \sqrt{\mu_B(v)}$. Therefore, it is concluded that GRFI does not satisfy the C1 criterion of GMP. The analytical proof is given below:

$$
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1,\ \mu_A(u) \right]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[ \frac{\mu_B(v)}{\mu_A(u)},\ \mu_A(u) \right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\qquad (45)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\
y_{21} = \mu_A(u); & \text{for } \mu_A(u) \le \sqrt{\mu_B(v)},\ \mu_A(u) > \mu_B(v) \\
y_{22} = \frac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > \sqrt{\mu_B(v)},\ \mu_A(u) > \mu_B(v)
\end{cases}
$$

It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{21}$ and $y_{22}$. It is noticed that $y_1$ and $y_{21}$ can be treated as the same up to $\mu_A(u) \le \sqrt{\mu_B(v)}$; therefore, the outcome starts with $y_1$/$y_{21}$, which increases to a maximal value of $\sqrt{\mu_B(v)}$ with an increase of $\mu_A(u)$, and then $y_{22}$, which starts with that maximal value and decreases with any further increase of $\mu_A(u)$. So clearly the supremum of the outcome is $\sqrt{\mu_B(v)}$ only, i.e. $\mu_{B'}(v) = \sqrt{\mu_B(v)}$.
GRFI: C2-1/C2-2: $\mu_{A'}(u) = \mu_A^2(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 40 that $\mu_{B'}(v)$ or the supremum

Fig. 40 Superimposed plots of GRFI and premise 1 of C2-1/C2-2

of the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ are the intersection points of $\mu_{A'}(u) = \mu_A^2(u)$ and the curve $\frac{\mu_B(v)}{\mu_A(u)}$ (selected because $\mu_{A'}(u)$ intersects this curve only) at various values of $\mu_B(v)$. These intersection points are computed by solving the following equality:

$$
\mu_{B'}(v) = \frac{\mu_B(v)}{\mu_A(u)} = \mu_A^2(u)
$$

Since $\frac{\mu_B(v)}{\mu_A(u)} = \mu_A^2(u)$, solving this equation we get $\mu_A(u) = (\mu_B(v))^{1/3}$, hence $\mu_{B'}(v) = (\mu_B(v))^{2/3}$. Therefore, it is concluded that GRFI does not satisfy the C2-1/C2-2 criteria of GMP. The analytical proof is given below:

$$
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1,\ \mu_A^2(u) \right]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[ \frac{\mu_B(v)}{\mu_A(u)},\ \mu_A^2(u) \right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\qquad (46)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \mu_A^2(u); & \text{for } \mu_A(u) \le \mu_B(v) \\
y_{21} = \mu_A^2(u); & \text{for } \mu_A(u) \le (\mu_B(v))^{1/3},\ \mu_A(u) > \mu_B(v) \\
y_{22} = \frac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > (\mu_B(v))^{1/3},\ \mu_A(u) > \mu_B(v)
\end{cases}
$$

It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{21}$ and $y_{22}$. It is noticed that $y_1$ and $y_{21}$ can be treated as the same up to $\mu_A(u) \le (\mu_B(v))^{1/3}$; therefore, the outcome starts with $y_1$/$y_{21}$, which increases to a maximal value of $(\mu_B(v))^{2/3}$ with an increase of $\mu_A(u)$, and then $y_{22}$, which starts with that maximal value and decreases with any further increase of $\mu_A(u)$. So clearly the supremum of the outcome is $(\mu_B(v))^{2/3}$ only, i.e. $\mu_{B'}(v) = (\mu_B(v))^{2/3}$.

Fig. 41 Superimposed plots of GRFI and premise 1 of C3-1/C3-2

GRFI: C3-1/C3-2: $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figure 41 that $\mu_{B'}(v)$ or the supremum of the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ are the intersection points of $\mu_{A'}(u) = \sqrt{\mu_A(u)}$ and the curve $\frac{\mu_B(v)}{\mu_A(u)}$ (selected because $\mu_{A'}(u)$ intersects this curve only) at various values of $\mu_B(v)$. These intersection points are computed by solving the following equality:

$$
\mu_{B'}(v) = \frac{\mu_B(v)}{\mu_A(u)} = \sqrt{\mu_A(u)}
$$

Since $\frac{\mu_B(v)}{\mu_A(u)} = \sqrt{\mu_A(u)}$, solving this equation we get $\mu_A(u) = (\mu_B(v))^{2/3}$, hence $\mu_{B'}(v) = (\mu_B(v))^{1/3}$. Therefore, it is concluded that GRFI does not satisfy the C3-1/C3-2 criteria of GMP. The analytical proof is given below:

$$
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1,\ \sqrt{\mu_A(u)} \right]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[ \frac{\mu_B(v)}{\mu_A(u)},\ \sqrt{\mu_A(u)} \right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\qquad (47)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_{21} = \sqrt{\mu_A(u)}; & \text{for } \mu_A(u) \le (\mu_B(v))^{2/3},\ \mu_A(u) > \mu_B(v) \\
y_{22} = \frac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > (\mu_B(v))^{2/3},\ \mu_A(u) > \mu_B(v)
\end{cases}
$$

It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{21}$ and $y_{22}$. It is noticed that $y_1$ and $y_{21}$ can be treated as the same up to $\mu_A(u) \le (\mu_B(v))^{2/3}$; therefore, the outcome starts with $y_1$/$y_{21}$, which increases to a maximal value of $(\mu_B(v))^{1/3}$ with an increase of $\mu_A(u)$, and then $y_{22}$, which starts with that maximal value and decreases with any further increase of
$\mu_A(u)$. So clearly the supremum of the outcome is $(\mu_B(v))^{1/3}$ only, i.e. $\mu_{B'}(v) = (\mu_B(v))^{1/3}$.
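The three GRFI consequences derived above, $\sqrt{\mu_B(v)}$, $(\mu_B(v))^{2/3}$ and $(\mu_B(v))^{1/3}$, can be compared against a direct grid evaluation of the sup-min composition. This is a sketch under the assumption that $\mu_A(u)$ covers [0, 1]; the helper names `grfi`, `gmp_sup` and `grfi_cases` are hypothetical.

```python
import math

def grfi(a, b):
    # GRFI as it appears in Equations (45)-(47):
    # 1 when mu_A <= mu_B, otherwise mu_B / mu_A (a > b >= 0, so a is non-zero here)
    return 1.0 if a <= b else b / a

def gmp_sup(implication, premise, b, n=20001):
    # Grid approximation of sup over u of min(implication, premise)
    return max(min(implication(k / (n - 1), b), premise(k / (n - 1)))
               for k in range(n))

grfi_cases = [
    ("C1", lambda a: a, lambda b: b ** 0.5),            # premise mu_A
    ("C2", lambda a: a * a, lambda b: b ** (2 / 3)),    # premise mu_A squared
    ("C3", math.sqrt, lambda b: b ** (1 / 3)),          # premise sqrt(mu_A)
]

def deviations(b):
    # Absolute gap between the grid supremum and the closed form, per criterion
    return {name: abs(gmp_sup(grfi, premise, b) - closed(b))
            for name, premise, closed in grfi_cases}

print(deviations(0.35))
```

The gaps should shrink as the grid is refined; for very small $\mu_B(v)$ the curve $\mu_B(v)/\mu_A(u)$ is steep near the crossing point, so a finer grid may be needed there.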
GRFI: C4-1/C4-2: $\mu_{A'}(u) = 1 - \mu_A(u)$ is applied to RHS of Equation (20) to get the consequence $\mu_{B'}(v)$. It is observed from Figures 42 and 43 that the outcome of the “min” operation between the curves $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ always starts from the unity value and follows the curve $\mu_{A'}(u) = 1 - \mu_A(u)$ for the value of $\mu_A(u)$ and then $\mu_{A \to B}(u, v)$ and back to $\mu_{A'}(u)$. So we observe that the maximum value that comes out of the “min” operation is unity, therefore $\mu_{B'}(v) = 1$. Hence, GRFI satisfies the C4-1 (not C4-2) criterion of GMP. The analytical proof is given below:

$$
\mu_{B'}(v) = \sup_{u \in U}
\begin{cases}
y_1 = \min\left[ 1,\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) \le \mu_B(v) \\
y_2 = \min\left[ \frac{\mu_B(v)}{\mu_A(u)},\ 1 - \mu_A(u) \right]; & \text{for } \mu_A(u) > \mu_B(v)
\end{cases}
\qquad (48)
$$

$$
= \sup_{u \in U}
\begin{cases}
y_1 = 1 - \mu_A(u); & \text{for } \mu_A(u) \le \mu_B(v) \\
y_{21} = 1 - \mu_A(u); & \text{for } \mu_A(u) \le \mu_A^{\min}(u),\ \mu_A(u) > \mu_B(v) \\
y_{22} = \frac{\mu_B(v)}{\mu_A(u)}; & \text{for } \mu_A(u) > \mu_A^{\min}(u),\ \mu_A(u) > \mu_B(v)
\end{cases}
$$

where $\mu_A^{\min}(u) = \frac{1 - \sqrt{1 - 4\mu_B(v)}}{2}$ is obtained by solving $1 - \mu_A(u) = \frac{\mu_B(v)}{\mu_A(u)}$.

Fig. 42 Superimposed plots of GRFI and premise 1 of C4-1/C4-2

Fig. 43 plots of GRFI and premise 1 of C4-1/C4-2 for µB = 0.35



It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{A'}(u)$ consists of $y_1$, $y_{21}$ and $y_{22}$. It is important to note that $\mu_A^{\min}(u)$ is real-valued, and hence $y_{21}$/$y_{22}$ are defined, only when $\mu_B(v) \le 0.25$. Therefore, for $\mu_B(v) \le 0.25$, the outcome starts with $y_1$, which decreases from its maximal value of unity to some value with an increase of $\mu_A(u)$, is then handed over to $y_{22}$, which also decreases, and finally ends with $y_{21}$. Hence we see that the supremum of this region is unity. Similarly, for $\mu_B(v) > 0.25$ there is only $y_1$ and hence its supremum is again unity. Therefore, for any value of $\mu_B(v)$, the consequence $\mu_{B'}(v)$ is unity.
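The restriction $\mu_B(v) \le 0.25$ follows from the discriminant of the quadratic behind $\mu_A^{\min}(u)$. The short derivation below is written out as a reader's check and is not quoted from the chapter:

```latex
\begin{align*}
1 - \mu_A(u) = \frac{\mu_B(v)}{\mu_A(u)}
  &\;\Longrightarrow\; \mu_A^2(u) - \mu_A(u) + \mu_B(v) = 0 \\
  &\;\Longrightarrow\; \mu_A(u) = \frac{1 \pm \sqrt{1 - 4\mu_B(v)}}{2} .
\end{align*}
```

The roots are real only when $1 - 4\mu_B(v) \ge 0$, i.e. $\mu_B(v) \le 0.25$, and $\mu_A^{\min}(u)$ is the smaller of the two roots.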
Step 3: One by one, premise 1 of each GMT criterion, i.e. C5 to C8-2, is applied to the implication methods. Before that, it is essential to realize that the relational matrices (R) of the implication methods, shown in Figures 6–11, should be transposed before being investigated against the criteria of GMT. By doing so, the X-axis now represents the fuzzy set µB (v) (in the case of GMP it is µA (u)) and the Y-axis represents the implication µA→B (u, v) computed for each value of the fuzzy set µA (u). The transpose of (R) is needed because the inference rule of GMT is backward goal-driven, whereas GMP is a forward goal-driven inference rule. Figures 44–49 show the transposes of the relational matrices of the various implication methods that are investigated against the intuitive criteria of GMT. Let us first take the implication method “MORFI” for investigation under the heading “MORFI: C#”, where # is a criterion index such as 5, 6, 7, 8-1 and 8-2.
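The role of the transpose can be illustrated with a small discretized example. The sketch below assumes hypothetical sampled membership grades (`mu_a`, `mu_b`) and uses the “min” implication of MORFI; the same sup-min routine serves GMP when composed with R and GMT when composed with the transpose of R.

```python
def relational_matrix(implication, mu_a, mu_b):
    # R[i][j] = mu_{A->B}(u_i, v_j); rows follow U, columns follow V
    return [[implication(a, b) for b in mu_b] for a in mu_a]

def transpose(matrix):
    return [list(row) for row in zip(*matrix)]

def sup_min(premise, matrix):
    # result[k] = max over idx of min(premise[idx], matrix[idx][k])
    return [max(min(p, row[k]) for p, row in zip(premise, matrix))
            for k in range(len(matrix[0]))]

# Hypothetical sampled membership grades (illustration only)
mu_a = [0.0, 0.3, 0.7, 1.0]
mu_b = [0.0, 0.25, 0.5, 0.75, 1.0]

R = relational_matrix(min, mu_a, mu_b)   # MORFI: min(mu_A, mu_B)

# GMP (forward): the premise lives on U and is composed with R
b_prime = sup_min(mu_a, R)               # premise A' = A

# GMT (backward): the premise lives on V and is composed with the transpose of R
not_b = [1 - b for b in mu_b]            # premise B' = not B, as in criterion C5
a_prime = sup_min(not_b, transpose(R))

print(b_prime)
print(a_prime)
```

With these sample grades the GMT result equals $\min(0.5, \mu_A(u))$ at every sampled point rather than $1 - \mu_A(u)$.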
MORFI: C5: $\mu_{B'}(v) = 1 - \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. Figure 50 illustrates the superimposed plots of $\mu_{A \to B}(u, v)$ and $\mu_{B'}(v)$. It is noticed from the figure that $\mu_{B'}(v)$ intersects $\mu_{A \to B}(u, v)$ a first time at $\mu_{A'}(u) = \mu_B(v) = 0.5$, a point at which the outcome of the “min” operation is at its peak value and also equal to the maximal value that the consequence $\mu_{A'}(u)$ can achieve. Similarly the other values of $\mu_{A'}(u)$ are the next intersection points (below 0.5) of $\mu_{B'}(v)$ and $\mu_{A \to B}(u, v)$. Hence it can be concluded that $\mu_{A'}(u)$ falls between $\mu_A(u)$ and 0.5, or, in other words, $\mu_{A'}(u) = \min(0.5, \mu_A(u))$, that is,

Fig. 44 2D plots of MORFI



Fig. 45 2D plots of PORFI

Fig. 46 2D plots of ARFI

Fig. 47 2D plots of MRFI



Fig. 48 2D plots of BRFI

Fig. 49 2D plots of GRFI

Fig. 50 Superimposed plots of MORFI and premise 1 of C5



$\mu_{A'}(u) = 0.5 \cap \mu_A(u)$. The point to be noted here is that MORFI does not satisfy the intuitive criterion C5 of GMT. The analytical proof is given below:

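$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_B(v) \right] \right\}
$$

$$
= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \mu_B(v),\ 1 - \mu_B(v) \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \mu_A(u),\ 1 - \mu_B(v) \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\qquad (49)
$$

$$
= \sup_{v \in V}
\begin{cases}
y_{11} = \mu_B(v); & \text{for } \mu_B(v) < 0.5,\ \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_B(v); & \text{for } \mu_B(v) \ge 0.5 \text{ (i.e. when } \mu_A(u) \ge 0.5\text{)},\ \mu_B(v) \le \mu_A(u) \\
y_{21} = \mu_A(u); & \text{for } \mu_B(v) < 1 - \mu_A(u),\ \mu_B(v) > \mu_A(u) \\
y_{22} = 1 - \mu_B(v); & \text{for } \mu_B(v) > 1 - \mu_A(u),\ \mu_B(v) > \mu_A(u)
\end{cases}
$$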

It is observed that the outcome of the “min” operation between µA→B (u, v) and
µB (v) results in y11 , y12 , y21 and y22 depending on the values of µA (u) and µB (v).
We see that the outcome starts with y11 when the value of µB (v) is less than 0.5
and µB (v) ≤ µA (u), then y21 for µA (u) < µB (v) < 1 − µA (u) and finally y22 (when
µB (v) > 1 − µA (u) as well as µB (v) > µA (u), which is only possible when µA (u) is
less than 0.5). In this case, the supremum will be y21 , i.e. µA (u) only. It is also ob-
served that the outcome, for µA (u) greater than 0.5, will be y11 (when µB (v) ≤ 0.5)
and then y12 (when µB (v) > 0.5). Thus we see that the supremum of the outcome of
the “min” operation between µA→B (u, v) and µB (v) will be the intersection point of
y11 and y12 and that always happens to be 0.5. We also see that 0.5 is the maximum value that the supremum can achieve; otherwise it is just µA (u). In other words, $\mu_{A'}(u) = \min(0.5, \mu_A(u))$, that is $\mu_{A'}(u) = 0.5 \cap \mu_A(u)$.
MORFI: C6: $\mu_{B'}(v) = 1 - \mu_B^2(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 51 and 52 that $\mu_{B'}(v)$ intersects $\mu_{A \to B}(u, v)$ a first time in a point at which the curve $\mu_{B'}(v)$ is equal to $\mu_B(v)$ for

Fig. 51 Superimposed plots of MORFI and premise 1 of C6



Fig. 52 plots of MORFI and premise 1 of C6 for µB = 0.35

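some value of $\mu_A(u)$. It is also observed that this point, computed below, is also the maximum value that the consequence $\mu_{A'}(u)$ can achieve.
Since $\mu_B(v) = 1 - \mu_B^2(v)$, or $\mu_B^2(v) + \mu_B(v) - 1 = 0$, by solving the above equation, we get $\mu_B(v) = \frac{-1 \pm \sqrt{5}}{2}$, i.e. $\mu_B(v) = \frac{\sqrt{5} - 1}{2}$, hence the maximal value of $\mu_{A'}(u) = \mu_B(v) = \frac{\sqrt{5} - 1}{2}$. Similarly, the other values of $\mu_{A'}(u)$ are the next intersection points (below $\frac{\sqrt{5} - 1}{2}$) of $\mu_{B'}(v)$ and $\mu_{A \to B}(u, v)$. Hence it can be concluded that $\mu_{A'}(u)$ falls between $\mu_A(u)$ and $\frac{\sqrt{5} - 1}{2}$, or, in other words, $\mu_{A'}(u) = \min\left(\frac{\sqrt{5} - 1}{2}, \mu_A(u)\right)$, that is $\mu_{A'}(u) = \frac{\sqrt{5} - 1}{2} \cap \mu_A(u)$. The point to be noted here is that MORFI does not satisfy the intuitive criterion C6 of GMT. The analytical proof is given below:

$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \min(\mu_A(u), \mu_B(v)),\ 1 - \mu_B^2(v) \right] \right\} \qquad (50)
$$

$$
= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \mu_B(v),\ 1 - \mu_B^2(v) \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \mu_A(u),\ 1 - \mu_B^2(v) \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
$$

$$
= \sup_{v \in V}
\begin{cases}
y_{11} = \mu_B(v); & \text{for } \mu_B(v) < \mu_B^{\min}(v),\ \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_B^2(v); & \text{for } \mu_B(v) \ge \mu_B^{\min}(v) \text{ (i.e. when } \mu_A(u) \ge \mu_B^{\min}(v)\text{)},\ \mu_B(v) \le \mu_A(u) \\
y_{21} = \mu_A(u); & \text{for } \mu_B(v) < \sqrt{1 - \mu_A(u)},\ \mu_B(v) > \mu_A(u) \\
y_{22} = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \sqrt{1 - \mu_A(u)},\ \mu_B(v) > \mu_A(u)
\end{cases}
$$

where $\mu_B^{\min}(v) = \frac{\sqrt{5} - 1}{2}$.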

It is observed that the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{B'}(v)$ results in $y_{11}$, $y_{12}$, $y_{21}$ and $y_{22}$ depending on the values of $\mu_A(u)$ and $\mu_B(v)$. We see that the outcome starts with $y_{11}$ when the value of $\mu_B(v)$ is less than $\mu_B^{\min}(v)$ and $\mu_B(v) \le \mu_A(u)$, then $y_{21}$ for $\mu_A(u) < \mu_B(v) < \sqrt{1 - \mu_A(u)}$ and finally $y_{22}$ (when $\mu_B(v) > \sqrt{1 - \mu_A(u)}$ as well as $\mu_B(v) > \mu_A(u)$, which is only possible when $\mu_A(u)$ is less than $\mu_B^{\min}(v)$). In this case, the supremum will be $y_{21}$, i.e. $\mu_A(u)$ only. It is also observed that the outcome, for $\mu_A(u)$ greater than $\mu_B^{\min}(v)$, will be

Fig. 53 Superimposed plots of MORFI and premise 1 of C7

$y_{11}$ (when $\mu_B(v) \le \mu_B^{\min}(v)$) and then $y_{12}$ (when $\mu_B(v) > \mu_B^{\min}(v)$). Thus we see that the supremum of the outcome of the “min” operation between $\mu_{A \to B}(u, v)$ and $\mu_{B'}(v)$ will be the intersection point of $y_{11}$ and $y_{12}$, which always turns out to be $\mu_B^{\min}(v)$. We also see that $\mu_B^{\min}(v)$ is the maximum value that the supremum can achieve; otherwise it is just $\mu_A(u)$. In other words, $\mu_{A'}(u) = \min(\mu_B^{\min}(v), \mu_A(u))$, that is $\mu_{A'}(u) = \mu_B^{\min}(v) \cap \mu_A(u)$.
MORFI: C7: $\mu_{B'}(v) = 1 - \sqrt{\mu_B(v)}$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 53 that $\mu_{B'}(v)$ intersects $\mu_{A \to B}(u, v)$ a first time in a point at which the curve $\mu_{B'}(v)$ is equal to $\mu_B(v)$ for some value of $\mu_A(u)$. It is also observed that this point, computed below, is also the maximum value that the consequence $\mu_{A'}(u)$ can achieve.
Since $\mu_B(v) = 1 - \sqrt{\mu_B(v)}$, or $\mu_B^2(v) - 3\mu_B(v) + 1 = 0$, by solving the above equation, we get $\mu_B(v) = \frac{3 \pm \sqrt{5}}{2}$, i.e. $\mu_B(v) = \frac{3 - \sqrt{5}}{2}$, hence the maximal value of $\mu_{A'}(u) = \mu_B(v) = \frac{3 - \sqrt{5}}{2}$. Similarly, the other values of $\mu_{A'}(u)$ are the next intersection points (below $\frac{3 - \sqrt{5}}{2}$) of $\mu_{B'}(v)$ and $\mu_{A \to B}(u, v)$. Hence it can be concluded that $\mu_{A'}(u)$ falls between $\mu_A(u)$ and $\frac{3 - \sqrt{5}}{2}$, or, in other words, $\mu_{A'}(u) = \min\left(\frac{3 - \sqrt{5}}{2}, \mu_A(u)\right)$, that is $\mu_{A'}(u) = \frac{3 - \sqrt{5}}{2} \cap \mu_A(u)$. The point to be noted here is that MORFI does not satisfy the intuitive criterion C7 of GMT. The analytical proof is given below:

$$
\mu_{A'}(u) = \sup_{v \in V} \min\left\{ \min(\mu_A(u), \mu_B(v)),\ 1 - \sqrt{\mu_B(v)} \right\} \qquad (51)
$$

$$
= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \mu_B(v),\ 1 - \sqrt{\mu_B(v)} \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \mu_A(u),\ 1 - \sqrt{\mu_B(v)} \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
$$

$$
= \sup_{v \in V}
\begin{cases}
y_{11} = \mu_B(v); & \text{for } \mu_B(v) < \mu_B^{\min}(v),\ \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) \ge \mu_B^{\min}(v) \text{ (i.e. when } \mu_A(u) \ge \mu_B^{\min}(v)\text{)},\ \mu_B(v) \le \mu_A(u) \\
y_{21} = \mu_A(u); & \text{for } \mu_B(v) < (1 - \mu_A(u))^2,\ \mu_B(v) > \mu_A(u) \\
y_{22} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > (1 - \mu_A(u))^2,\ \mu_B(v) > \mu_A(u)
\end{cases}
$$

where $\mu_B^{\min}(v) = \frac{3 - \sqrt{5}}{2}$.

It is observed that the outcome of the “min” operation between µA→B (u, v) and
µB (v) results in y11 , y12 , y21 and y22 , depending on the values of µA (u) and µB (v).
We see that the outcome starts with y11 when the value of µB (v) is less than µBmin (v)
and µB (v) ≤ µA (u), then y21 for µA (u) < µB (v) < (1− µA (u))2 , and finally y22 (when
µB (v) > (1 − µA (u))2 as well as µB (v) > µA (u), which is only possible when µA (u)
is less than µBmin (v)). In this case, the supremum will be y21 , i.e. µA (u) only. It is
also observed that the outcome, for µA (u) greater than µBmin (v), will be y11 (when
µB (v) ≤ µBmin (v)) and then y12 (when µB (v) > µBmin (v)). Thus we see that the supre-
mum of the outcome of the “min” operation between µA→B (u, v) and µB (v) will
be the intersection point of y11 and y12 and that turns always out to be µBmin (v).
We also see that $\mu_B^{\min}(v)$ is the maximum value that the supremum can achieve; otherwise it is just $\mu_A(u)$. In other words, $\mu_{A'}(u) = \min(\mu_B^{\min}(v), \mu_A(u))$, that is $\mu_{A'}(u) = \mu_B^{\min}(v) \cap \mu_A(u)$.
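The three MORFI results for C5, C6 and C7 all have the form $\min(\text{cap}, \mu_A(u))$ with caps $0.5$, $\frac{\sqrt{5} - 1}{2}$ and $\frac{3 - \sqrt{5}}{2}$. A grid-based sketch of the GMT composition, assuming $\mu_B(v)$ covers [0, 1], can be used to cross-check them; `morfi`, `gmt_sup`, `cases` and `max_deviation` are illustrative names.

```python
import math

def morfi(a, b):
    # MORFI as used in Equations (49)-(51): min(mu_A, mu_B)
    return min(a, b)

def gmt_sup(implication, premise, a, n=20001):
    # Approximate sup over v by sweeping mu_B(v) = b over a uniform grid on [0, 1]
    best = 0.0
    for k in range(n):
        b = k / (n - 1)
        best = max(best, min(implication(a, b), premise(b)))
    return best

S5 = math.sqrt(5)
cases = {
    "C5": (lambda b: 1 - b, 0.5),                      # premise 1 - mu_B
    "C6": (lambda b: 1 - b * b, (S5 - 1) / 2),         # premise 1 - mu_B squared
    "C7": (lambda b: 1 - math.sqrt(b), (3 - S5) / 2),  # premise 1 - sqrt(mu_B)
}

def max_deviation(a_values=(0.1, 0.3, 0.5, 0.8, 1.0)):
    # Largest gap between the grid supremum and min(cap, mu_A) over all cases
    worst = 0.0
    for premise, cap in cases.values():
        for a in a_values:
            worst = max(worst, abs(gmt_sup(morfi, premise, a) - min(cap, a)))
    return worst

print(max_deviation())
```

None of the three closed forms equals the consequence the corresponding criterion expects, which matches the conclusion that MORFI fails C5 to C7.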
MORFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 54 that $\mu_{B'}(v)$ is always larger than or equal to $\mu_{A \to B}(u, v)$ for any value of $\mu_B(v)$, which means that the outcome of the “min” operation is $\mu_{A \to B}(u, v)$ itself, i.e. Figure 44. It is also observed from Figure 54 that $\mu_{A \to B}(u, v) = \min(\mu_A(u), \mu_B(v))$ converges to $\mu_A(u)$ (which is also the maximal value of $\mu_{A \to B}(u, v)$) for $\mu_B(v) \ge \mu_A(u)$ and hence the supremum of $\mu_{A \to B}(u, v)$ is $\mu_A(u)$, i.e. $\mu_{A'}(u) = \mu_A(u)$. Therefore, it is concluded that MORFI satisfies the

Fig. 54 Superimposed plots of MORFI and premise 1 of C8-1/C8-2



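intuitive criterion C8-2 (not C8-1) of GMT (refer to Table 2). The analytical proof is given below:

$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \min(\mu_A(u), \mu_B(v)),\ \mu_{B'}(v) \right] \right\}
$$

$$
= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[ \mu_B(v),\ \mu_B(v) \right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[ \mu_A(u),\ \mu_B(v) \right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\qquad (52)
$$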

It is observed from the above formula that the outcome of the “min” operation
between µA→B (u, v) and µB (v) (for some fixed value of µA (u)) consists of y1 having
value µB (v) when µB (v) ≤ µA (u) and which increases up to a value of µA (u), then y2
is equal to that fixed value of $\mu_A(u)$ when $\mu_B(v) > \mu_A(u)$. Therefore, it can be concluded that the supremum of $y_1$ and $y_2$ will be the curve $\mu_A(u)$, i.e. $\mu_{A'}(u) = \mu_A(u)$.
PORFI: C5: $\mu_{B'}(v) = 1 - \mu_B(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 55 and 56 that the supremum of the “min” operation, i.e. $\mu_{A'}(u)$, is nothing but the intersection point of $\mu_{B'}(v)$ and $\mu_{A \to B}(u, v)$ at which $\mu_{B'}(v) = \mu_{A \to B}(u, v)$.

$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \mu_A(u)\mu_B(v),\ 1 - \mu_B(v) \right] \right\} = \frac{\mu_A(u)}{1 + \mu_A(u)}
$$

Fig. 55 Superimposed plots of PORFI and premise 1 of C5

Fig. 56 plots of PORFI and premise 1 of C5 for µB = 0.35



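Since $\mu_{A'}(u) = \mu_A(u)\mu_B(v) = 1 - \mu_B(v)$, by solving $\mu_A(u)\mu_B(v) = 1 - \mu_B(v)$ we get $\mu_B(v) = \frac{1}{1 + \mu_A(u)}$. Therefore, $\mu_{A'}(u) = \mu_A(u) \cdot \frac{1}{1 + \mu_A(u)}$ or $\frac{\mu_A(u)}{1 + \mu_A(u)}$. Hence, PORFI does not satisfy the C5 criterion of GMT. The analytical proof is given below:

$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \mu_A(u)\mu_B(v),\ 1 - \mu_B(v) \right] \right\} \qquad (53)
$$

$$
= \sup_{v \in V}
\begin{cases}
y_1 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u)\mu_B(v) \le 1 - \mu_B(v) \text{ or } \mu_B(v) \le \frac{1}{1 + \mu_A(u)} \\
y_2 = 1 - \mu_B(v); & \text{for } \mu_B(v) > \frac{1}{1 + \mu_A(u)}
\end{cases}
$$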

It is observed that $y_1$ and $y_2$, computed for some fixed value of $\mu_A(u)$, are the outcome of the “min” operation between the implication $\mu_{A \to B}(u, v)$ and $\mu_{B'}(v)$. $y_1$ increases with an increase in $\mu_B(v)$ to a maximum value equalling $\frac{\mu_A(u)}{1 + \mu_A(u)}$ for $\mu_B(v) \le \frac{1}{1 + \mu_A(u)}$, whereas $y_2$ starts from its maximum value of $\frac{\mu_A(u)}{1 + \mu_A(u)}$ and then decreases with any further increase of $\mu_B(v)$. Therefore, the supremum of these curves will be $\frac{\mu_A(u)}{1 + \mu_A(u)}$ only.
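Because the PORFI C5 crossing point $\frac{1}{1 + \mu_A(u)}$ is rational whenever $\mu_A(u)$ is, the result can be reproduced exactly with rational arithmetic. The sketch below fixes one hypothetical grade, $\mu_A(u) = \tfrac{1}{2}$, and a grid of 301 values for $\mu_B(v)$ chosen so that the crossing point lies on it; the function and variable names are illustrative.

```python
from fractions import Fraction

def porfi_c5_outcome(a, b):
    # min of the product implication mu_A * mu_B and the premise 1 - mu_B
    return min(a * b, 1 - b)

a = Fraction(1, 2)                      # hypothetical fixed value of mu_A(u)
b_star = 1 / (1 + a)                    # crossing point 1 / (1 + mu_A) = 2/3
peak = porfi_c5_outcome(a, b_star)      # value of both branches at the crossing

# Exact maximum over a grid of mu_B values that contains the crossing point
grid_max = max(porfi_c5_outcome(a, Fraction(k, 300)) for k in range(301))

print(b_star, peak, grid_max, a / (1 + a))
```

Both the value at the crossing point and the grid maximum coincide with $\frac{\mu_A(u)}{1 + \mu_A(u)} = \tfrac{1}{3}$, which differs from $1 - \mu_A(u) = \tfrac{1}{2}$.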
PORFI: C6: $\mu_{B'}(v) = 1 - \mu_B^2(v)$ is applied to RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 57 that the supremum of the “min” operation, i.e. $\mu_{A'}(u)$, is nothing but the intersection point of $\mu_{B'}(v)$ and $\mu_{A \to B}(u, v)$ at which $\mu_{B'}(v) = \mu_{A \to B}(u, v)$.

$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \mu_A(u)\mu_B(v),\ 1 - \mu_B^2(v) \right] \right\} = \frac{\mu_A(u)\sqrt{\mu_A^2(u) + 4} - \mu_A^2(u)}{2}
$$

Fig. 57 Superimposed plots of PORFI and premise 1 of C6



Since $\mu_{A'}(u) = \mu_A(u)\mu_B(v) = 1 - \mu_B^2(v)$, or $\mu_B^2(v) + \mu_A(u)\mu_B(v) - 1 = 0$, by solving the above equation, we get $\mu_B(v) = \frac{-\mu_A(u) \pm \sqrt{\mu_A^2(u) + 4}}{2}$. Since the value of $\mu_B(v)$ should lie between 0 and 1, $\mu_B(v) = \frac{\sqrt{\mu_A^2(u) + 4} - \mu_A(u)}{2}$ and $\mu_{A'}(u) = \frac{\mu_A(u)\sqrt{\mu_A^2(u) + 4} - \mu_A^2(u)}{2}$. Therefore, PORFI does not satisfy the C6 criterion of GMT. The analytical proof is given below:

$$
\mu_{A'}(u) = \sup_{v \in V} \left\{ \min\left[ \mu_A(u)\mu_B(v),\ 1 - \mu_B^2(v) \right] \right\} \qquad (54)
$$

$$
= \sup_{v \in V}
\begin{cases}
y_1 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u)\mu_B(v) \le 1 - \mu_B^2(v) \text{ or } \mu_B(v) \le \mu_B^{\min}(v) \\
y_2 = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{cases}
$$

where $\mu_B^{\min}(v) = \frac{\sqrt{\mu_A^2(u) + 4} - \mu_A(u)}{2}$ is obtained by solving the equation $\mu_A(u)\mu_B(v) = 1 - \mu_B^2(v)$.

It is observed that $y_1$ and $y_2$, computed for some fixed value of $\mu_A(u)$, are the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$. The $y_1$ increases with an increase in $\mu_B(v)$ to a maximum value equal to $\mu_A(u)\mu_B^{\min}(v)$ for $\mu_B(v) \le \mu_B^{\min}(v)$, whereas $y_2$ starts from its maximum value of $\mu_A(u)\mu_B^{\min}(v)$ and then decreases with any further increase of $\mu_B(v)$. Therefore, the supremum of these curves will be $\mu_A(u)\mu_B^{\min}(v) = \frac{\mu_A(u)\sqrt{\mu_A^2(u)+4}-\mu_A^2(u)}{2}$ only.
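The closed form for PORFI under C6 can be checked numerically. The sketch below is not part of the original derivation: it assumes that $\mu_B(v)$ takes every value in [0, 1] as $v$ ranges over $V$, so the supremum over $v$ can be approximated by a grid over $\mu_B$. The function names are hypothetical.

```python
import math


def porfi_c6_sup(mu_a, steps=20001):
    """Grid estimate of sup over mu_B in [0, 1] of min(mu_A*mu_B, 1 - mu_B^2).

    Assumption: mu_B(v) sweeps the whole interval [0, 1] as v ranges over V.
    """
    best = 0.0
    for i in range(steps):
        b = i / (steps - 1)
        best = max(best, min(mu_a * b, 1.0 - b * b))
    return best


def porfi_c6_closed_form(mu_a):
    """Closed form stated in the text for the consequence under C6."""
    return (mu_a * math.sqrt(mu_a * mu_a + 4.0) - mu_a * mu_a) / 2.0


for mu_a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(mu_a, round(porfi_c6_sup(mu_a), 4), round(porfi_c6_closed_form(mu_a), 4))
```

The two columns should agree up to the grid resolution for every sampled value of $\mu_A(u)$.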
PORFI: C7: $\mu_{B'}(v) = 1-\sqrt{\mu_B(v)}$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 58 that the supremum of the “min” operation, i.e. $\mu_{A'}(u)$, is nothing but the intersection point of $\mu_{B'}(v)$ and $\mu_{A\to B}(u,v)$, at which $\mu_{B'}(v) = \mu_{A\to B}(u,v)$. These intersection points are computed as follows:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\mu_A(u)\mu_B(v),\ 1-\sqrt{\mu_B(v)}\right]\right\} = \frac{2\mu_A(u)+1-\sqrt{4\mu_A(u)+1}}{2\mu_A(u)}
\]

Fig. 58 Superimposed plots of PORFI and premise 1 of C7

Since $\mu_{A'}(u) = \mu_A(u)\mu_B(v) = 1-\sqrt{\mu_B(v)}$, or $\mu_A^2(u)\mu_B^2(v)-(2\mu_A(u)+1)\mu_B(v)+1 = 0$, by solving the above equation,

\[
\mu_B(v) = \frac{2\mu_A(u)+1\pm\sqrt{(2\mu_A(u)+1)^2-4\mu_A^2(u)}}{2\mu_A^2(u)} = \frac{2\mu_A(u)+1\pm\sqrt{4\mu_A(u)+1}}{2\mu_A^2(u)}
\]

or $\mu_B(v) = \frac{2\mu_A(u)+1-\sqrt{4\mu_A(u)+1}}{2\mu_A^2(u)}$ and

\[
\mu_{A'}(u) = \mu_A(u)\mu_B(v) = \frac{2\mu_A(u)+1-\sqrt{4\mu_A(u)+1}}{2\mu_A(u)}.
\]

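Therefore, PORFI does not satisfy the C7 criterion of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\mu_A(u)\mu_B(v),\ 1-\sqrt{\mu_B(v)}\right]\right\} \tag{55}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \mu_A(u)\mu_B(v); & \text{for } \mu_A(u)\mu_B(v) \le 1-\sqrt{\mu_B(v)} \text{ or } \mu_B(v) \le \mu_B^{\min}(v) \\
y_2 = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{2\mu_A(u)+1-\sqrt{4\mu_A(u)+1}}{2\mu_A^2(u)}$ is obtained by solving the equation $\mu_A(u)\mu_B(v) = 1-\sqrt{\mu_B(v)}$.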
It is observed that $y_1$ and $y_2$, computed for some fixed value of $\mu_A(u)$, are the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$. The $y_1$ increases with an increase in $\mu_B(v)$ to a maximum value equal to $\mu_A(u)\mu_B^{\min}(v)$ for $\mu_B(v) \le \mu_B^{\min}(v)$, whereas $y_2$ starts from its maximum value of $\mu_A(u)\mu_B^{\min}(v)$ and then decreases with any further increase of $\mu_B(v)$. Therefore, the supremum of these curves will be $\mu_A(u)\mu_B^{\min}(v)$, or $\frac{2\mu_A(u)+1-\sqrt{4\mu_A(u)+1}}{2\mu_A(u)}$, only.
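As a quick numerical illustration of the C7 result for PORFI (a worked example added here, not part of the original derivation), take the assumed value $\mu_A(u) = 0.75$, for which $\sqrt{4\mu_A(u)+1} = 2$:

```latex
\begin{align*}
\mu_B^{\min}(v) &= \frac{2(0.75)+1-2}{2(0.75)^2} = \frac{0.5}{1.125} = \frac{4}{9}, \\
\mu_{A'}(u) &= \mu_A(u)\,\mu_B^{\min}(v) = 0.75\cdot\frac{4}{9} = \frac{1}{3}, \\
1-\sqrt{\mu_B^{\min}(v)} &= 1-\frac{2}{3} = \frac{1}{3}.
\end{align*}
```

Both branches of the “min” meet at $1/3$, which agrees with $\frac{2\mu_A(u)+1-\sqrt{4\mu_A(u)+1}}{2\mu_A(u)} = \frac{0.5}{1.5}$. If C7 is read as requiring $\mu_{A'}(u) = 1-\sqrt{\mu_A(u)} \approx 0.134$ (an assumption about the statement of the criterion, which is given elsewhere in the chapter), the mismatch is evident.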
PORFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 59 that $\mu_{B'}(v)$ is always equal to or greater than the implication $\mu_{A\to B}(u,v)$ for any value of $\mu_A(u)$; therefore, the outcome of the “min” operation is $\mu_{A\to B}(u,v)$ itself. It is also noticed that the supremum of $\mu_{A\to B}(u,v)$ turns out to be $\mu_A(u)$. Therefore, it is concluded that PORFI satisfies the intuitive criterion C8-2 (not C8-1) of GMT. The analytical proof is given below:

Fig. 59 Superimposed plots of PORFI and premise 1 of C8-1/C8-2

Fig. 60 Superimposed plots of ARFI and premise 1 of C5

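\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v\in V}\left\{\min\left[\mu_A(u)\mu_B(v),\ \mu_B(v)\right]\right\} \\
&= \sup_{v\in V}\left\{\mu_A(u)\mu_B(v)\right\} \quad \text{(since the product of two normalized fuzzy numbers never exceeds either of the numbers)} \\
&= \mu_A(u) \quad \left(\mu_A(u)\mu_B(v) \text{ tends to } \mu_A(u) \text{ as } \mu_B(v) \to 1\right)
\end{aligned}
\tag{56}
\]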

Fig. 61 Plots of ARFI and premise 1 of C5 for µB = 0.35

ARFI: C5: $\mu_{B'}(v) = 1-\mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 60 and 61 that the supremum of the outcome of the “min” operation between the curves $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is their intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\mu_A(u)+\mu_B(v) = 1-\mu_B(v)
\]

Since $1-\mu_A(u)+\mu_B(v) = 1-\mu_B(v)$, or $2\mu_B(v) = \mu_A(u)$, or $\mu_B(v) = \frac{\mu_A(u)}{2}$, the consequence $\mu_{A'}(u)$ is given by $\mu_{A'}(u) = 1-\mu_B(v) = 1-\frac{\mu_A(u)}{2}$. Therefore it is concluded that ARFI does not satisfy the criterion C5 of GMT. The analytical proof is given below:

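\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\min\left(1,\ 1-\mu_A(u)+\mu_B(v)\right),\ 1-\mu_B(v)\right]\right\} \tag{57}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u)+\mu_B(v),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u)+\mu_B(v); & \text{for } 0 < \mu_B(v) \le \mu_A(u)/2 \\
y_{12} = 1-\mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)/2
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = 1-\mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]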

It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$, and $y_2$. It is noticed that $y_{12}$ and $y_2$ represent the same equation, that is $1-\mu_B(v)$, for the range $\mu_A(u)/2 < \mu_B(v) \le 1$. So it will not be wrong to say that the outcome basically consists of $c_1 = 1-\mu_A(u)+\mu_B(v)$ and then $c_2 = 1-\mu_B(v)$. The $c_1$ increases up to a maximum value of $1-\frac{\mu_A(u)}{2}$ with an increase in $\mu_B(v)$ from zero to $\frac{\mu_A(u)}{2}$, whereas $c_2$ starts from its maximum value of $1-\frac{\mu_A(u)}{2}$ and then decreases with any further increase in $\mu_B(v)$ from the value of $\frac{\mu_A(u)}{2}$. Hence, it can be concluded that the supremum over $c_1$ and $c_2$ is $1-\frac{\mu_A(u)}{2}$ only.
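The value $1-\mu_A(u)/2$ can be confirmed by brute force. The following is a minimal sketch under the assumption that $\mu_B(v)$ covers all of [0, 1] as $v$ ranges over $V$; the implication is coded as it appears in Equation (57), and the function names are hypothetical.

```python
def arfi(mu_a, mu_b):
    """Implication as written in Equation (57): min(1, 1 - mu_A + mu_B)."""
    return min(1.0, 1.0 - mu_a + mu_b)


def arfi_c5_consequence(mu_a, steps=20001):
    """Grid estimate of sup over mu_B in [0, 1] of min(ARFI, 1 - mu_B).

    Assumption: mu_B(v) sweeps the whole interval [0, 1] as v ranges over V.
    """
    grid = (i / (steps - 1) for i in range(steps))
    return max(min(arfi(mu_a, b), 1.0 - b) for b in grid)


for mu_a in (0.0, 0.2, 0.5, 0.8, 1.0):
    numeric = arfi_c5_consequence(mu_a)
    closed_form = 1.0 - mu_a / 2.0
    print(f"mu_A={mu_a:.1f}  grid={numeric:.4f}  1 - mu_A/2={closed_form:.4f}")
```

The grid value tracks $1-\mu_A(u)/2$ rather than $1-\mu_A(u)$, which is consistent with the conclusion that ARFI does not reproduce the C5 consequence.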

ARFI: C6: $\mu_{B'}(v) = 1-\mu_B^2(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 62 that the supremum of the outcome of the “min” operation between the curves $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is their intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\mu_A(u)+\mu_B(v) = 1-\mu_B^2(v)
\]

Fig. 62 Superimposed plots of ARFI and premise 1 of C6

Since $1-\mu_A(u)+\mu_B(v) = 1-\mu_B^2(v)$, or $\mu_B^2(v)+\mu_B(v)-\mu_A(u) = 0$, by solving the above equation we get $\mu_B(v) = \frac{-1\pm\sqrt{1+4\mu_A(u)}}{2} = \frac{\sqrt{1+4\mu_A(u)}-1}{2}$ and

\[
\mu_{A'}(u) = 1-\mu_A(u)+\mu_B(v) = 1-\mu_A(u)+\frac{\sqrt{1+4\mu_A(u)}-1}{2} = \frac{2-2\mu_A(u)+\sqrt{1+4\mu_A(u)}-1}{2}
\]

or $\mu_{A'}(u) = \frac{1-2\mu_A(u)+\sqrt{1+4\mu_A(u)}}{2}$. Therefore it is concluded that ARFI does not satisfy the criterion C6 of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\min\left(1,\ 1-\mu_A(u)+\mu_B(v)\right),\ 1-\mu_B^2(v)\right]\right\} \tag{58}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u)+\mu_B(v),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u)+\mu_B(v); & \text{for } 0 < \mu_B(v) \le \mu_B^{\min}(v) \\
y_{12} = 1-\mu_B^2(v); & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = 1-\mu_B^2(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{\sqrt{1+4\mu_A(u)}-1}{2}$ is obtained by solving the equation $1-\mu_A(u)+\mu_B(v) = 1-\mu_B^2(v)$.
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. It is noticed that $y_{12}$ and $y_2$ represent the same equation, that is $1-\mu_B^2(v)$, for the range $\mu_B^{\min}(v) < \mu_B(v) \le 1$. So it will not be wrong to say that the outcome basically consists of $c_1 = 1-\mu_A(u)+\mu_B(v)$ and then $c_2 = 1-\mu_B^2(v)$. The $c_1$ increases up to a maximum value of $\frac{1-2\mu_A(u)+\sqrt{1+4\mu_A(u)}}{2}$ with an increase in $\mu_B(v)$ from zero to $\mu_B^{\min}(v)$, whereas $c_2$ starts from its maximum value of $\frac{1-2\mu_A(u)+\sqrt{1+4\mu_A(u)}}{2}$ and then decreases with any further increase in $\mu_B(v)$ from the value of $\mu_B^{\min}(v)$. Hence, it is finally concluded that the supremum over $c_1$ and $c_2$ is $\frac{1-2\mu_A(u)+\sqrt{1+4\mu_A(u)}}{2}$ only.

Fig. 63 Superimposed plots of ARFI and premise 1 of C7
ARFI: C7: $\mu_{B'}(v) = 1-\sqrt{\mu_B(v)}$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 63 that the supremum of the outcome of the “min” operation between the curves $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is their intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\mu_A(u)+\mu_B(v) = 1-\sqrt{\mu_B(v)}
\]

Since $1-\mu_A(u)+\mu_B(v) = 1-\sqrt{\mu_B(v)}$, or $\mu_B^2(v)-(2\mu_A(u)+1)\mu_B(v)+\mu_A^2(u) = 0$, by solving the above equation we get

\[
\begin{aligned}
\mu_B(v) &= \frac{1+2\mu_A(u)\pm\sqrt{(1+2\mu_A(u))^2-4\mu_A^2(u)}}{2} \\
&= \frac{1+2\mu_A(u)\pm\sqrt{1+4\mu_A^2(u)+4\mu_A(u)-4\mu_A^2(u)}}{2} \\
&= \frac{1+2\mu_A(u)\pm\sqrt{1+4\mu_A(u)}}{2}
\end{aligned}
\]

Since the value of $\mu_B(v)$ should lie between 0 and 1, the negative sign is taken, and

\[
\mu_{A'}(u) = 1-\mu_A(u)+\mu_B(v) = 1-\mu_A(u)+\frac{1+2\mu_A(u)-\sqrt{1+4\mu_A(u)}}{2}
\]

or $\mu_{A'}(u) = \frac{2-2\mu_A(u)+1+2\mu_A(u)-\sqrt{1+4\mu_A(u)}}{2} = \frac{3-\sqrt{1+4\mu_A(u)}}{2}$. Therefore it is concluded that ARFI does not satisfy criterion C7 of GMT. The analytical proof is given below:
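\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\min\left(1,\ 1-\mu_A(u)+\mu_B(v)\right),\ 1-\sqrt{\mu_B(v)}\right]\right\} \tag{59}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u)+\mu_B(v),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u)+\mu_B(v); & \text{for } 0 < \mu_B(v) \le \mu_B^{\min}(v) \\
y_{12} = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{1+2\mu_A(u)-\sqrt{1+4\mu_A(u)}}{2}$ is obtained by solving the equation $1-\mu_A(u)+\mu_B(v) = 1-\sqrt{\mu_B(v)}$.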
It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. It is noticed that $y_{12}$ and $y_2$ represent the same equation, that is $1-\sqrt{\mu_B(v)}$, for the range $\mu_B^{\min}(v) < \mu_B(v) \le 1$. So it will not be wrong to say that the outcome basically consists of $c_1 = 1-\mu_A(u)+\mu_B(v)$ and then $c_2 = 1-\sqrt{\mu_B(v)}$. The $c_1$ increases up to a maximum value of $\frac{3-\sqrt{1+4\mu_A(u)}}{2}$ with an increase in $\mu_B(v)$ from zero to $\mu_B^{\min}(v)$, whereas $c_2$ starts from its maximum value of $\frac{3-\sqrt{1+4\mu_A(u)}}{2}$ and then decreases with any further increase in $\mu_B(v)$ from the value of $\mu_B^{\min}(v)$. Hence, it is finally concluded that the supremum over $c_1$ and $c_2$ is $\frac{3-\sqrt{1+4\mu_A(u)}}{2}$ only.
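A worked instance of the ARFI C7 result may help in checking the choice of root (an illustration added here, with the assumed value $\mu_A(u) = 0.75$, so that $\sqrt{1+4\mu_A(u)} = 2$):

```latex
\begin{align*}
\mu_B^{\min}(v) &= \frac{1+2(0.75)-2}{2} = 0.25, \\
\mu_{A'}(u) &= \frac{3-2}{2} = 0.5, \\
1-\mu_A(u)+\mu_B^{\min}(v) &= 1-0.75+0.25 = 0.5 = 1-\sqrt{0.25}.
\end{align*}
```

The other root of the quadratic, $\frac{1+2(0.75)+2}{2} = 2.25$, lies outside [0, 1] and is therefore discarded.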
ARFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 64 that $\mu_{B'}(v)$ is always equal to or less than the implication $\mu_{A\to B}(u,v)$ for any value of $\mu_A(u)$; therefore, the outcome of the “min” operation is $\mu_{B'}(v)$ itself. Hence, the supremum of $\mu_{B'}(v)$ turns out to be unity (since the maximum value of $\mu_{B'}(v) = \mu_B(v)$ is unity).

Fig. 64 Superimposed plots of ARFI and premise 1 of C8-1/C8-2



Therefore, it is concluded that ARFI satisfies the intuitive criterion C8-1 (not C8-2)
of GMT. The analytical proof is given below:

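\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\min\left(1,\ 1-\mu_A(u)+\mu_B(v)\right),\ \mu_B(v)\right]\right\} \tag{60}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u)+\mu_B(v),\ \mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\ \mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u)+\mu_B(v); & \text{for } 1-\mu_A(u)+\mu_B(v) < \mu_B(v), \text{ i.e. } \mu_A(u) > 1 \\
y_{12} = \mu_B(v); & \text{for } \mu_A(u) \le 1
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]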

It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of $y_{11}$, $y_{12}$ and $y_2$. Out of these, $y_{11}$ is not valid, as $\mu_A(u) > 1$ is not possible. Hence, the outcome is $\mu_B(v)$ only, for any fixed value of $\mu_A(u)$. Therefore the supremum of $\mu_B(v)$, i.e. $\mu_{A'}(u)$, will be unity.
Fig. 65 Superimposed plots of MRFI and premise 1 of C5

Fig. 66 Plots of MRFI and premise 1 of C5 for µB = 0.35

MRFI: C5: $\mu_{B'}(v) = 1-\mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 65 and 66 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, for values of $\mu_A(u)$ greater than 0.5, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1-\mu_B(v)$. This intersection point is obtained by solving the equality $\mu_{A'}(u) = 1-\mu_B(v) = \mu_B(v) = 0.5$. It is also observed from Figure 65 that $\mu_{A'}(u) = 0.5$ is the minimum value that the consequence $\mu_{A'}(u)$ can achieve. For values of $\mu_A(u)$ less than 0.5, another point of the supremum happens to be $1-\mu_A(u)$. Therefore, $\mu_{A'}(u)$ takes whichever of 0.5 and $1-\mu_A(u)$ is larger, i.e. $\mu_{A'}(u) = \max(0.5,\ 1-\mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup 1-\mu_A(u)$. It is noticed that MRFI does not satisfy the intuitive criterion C5 of GMT. The analytical proof is given below:
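\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(\min\left(\mu_A(u),\mu_B(v)\right),\ 1-\mu_A(u)\right),\ 1-\mu_B(v)\right]\right\}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[\max\left(\mu_B(v),\ 1-\mu_A(u)\right),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[\max\left(\mu_A(u),\ 1-\mu_A(u)\right),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\tag{61}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = \min\left[\mu_B(v),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = \min\left[1-\mu_A(u),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \min\left[\mu_A(u),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = \min\left[1-\mu_A(u),\ 1-\mu_B(v)\right]; & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
\left.\begin{array}{ll}
y_{111} = \mu_B(v); & \text{for } \mu_B(v) \le 0.5 \\
y_{112} = 1-\mu_B(v); & \text{for } \mu_B(v) > 0.5
\end{array}\right\} & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = 1-\mu_A(u); & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = 1-\mu_B(v); & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = 1-\mu_B(v); & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]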

It is observed that $y_{111} = \mu_B(v)$ is only possible when $\mu_B(v) \le 0.5$ and $\mu_B(v) \ge 1-\mu_A(u)$, i.e. $1-\mu_A(u) \le 0.5$ or $\mu_A(u) \ge 0.5$. In a similar way, $y_{112} = 1-\mu_B(v)$ is only possible when $\mu_B(v) > 0.5$ and hence $\mu_A(u) \ge 0.5$. On careful observation, $y_{21}$ is the same as $y_{112}$. Now we have two equations for $\mu_A(u) < 0.5$: the first is $y_{12} = 1-\mu_A(u)$ when $\mu_B(v) \le \mu_A(u)$, and then $y_{22} = 1-\mu_B(v)$ when $\mu_B(v) > \mu_A(u)$. Hence, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ can be divided into two regions. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < 0.5$, and the second region consists of $y_{111}$ and $y_{21}$ when $\mu_A(u) \ge 0.5$. We observe that the supremum of the first region will be $y_{12}$, i.e. $1-\mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{111}$ and $y_{21}$. This intersection point is computed by solving the equality $1-\mu_B(v) = \mu_B(v)$, and that turns out to be 0.5. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is 0.5; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(0.5,\ 1-\mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup 1-\mu_A(u)$.
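The two-region result can be reproduced with a small grid search. This is a sketch only: it assumes $\mu_B(v)$ covers [0, 1] as $v$ ranges over $V$, codes the implication as it appears in Equation (61), and uses hypothetical function names.

```python
def mrfi(mu_a, mu_b):
    """Implication as written in Equation (61): max(min(mu_A, mu_B), 1 - mu_A)."""
    return max(min(mu_a, mu_b), 1.0 - mu_a)


def mrfi_c5_consequence(mu_a, steps=20001):
    """Grid estimate of sup over mu_B in [0, 1] of min(MRFI, 1 - mu_B).

    Assumption: mu_B(v) sweeps the whole interval [0, 1] as v ranges over V.
    """
    best = 0.0
    for i in range(steps):
        b = i / (steps - 1)
        best = max(best, min(mrfi(mu_a, b), 1.0 - b))
    return best


for mu_a in (0.1, 0.3, 0.5, 0.7, 0.9):
    expected = max(0.5, 1.0 - mu_a)  # closed form stated in the text
    print(f"mu_A={mu_a:.1f}  grid={mrfi_c5_consequence(mu_a):.4f}  expected={expected:.4f}")
```

For $\mu_A(u) < 0.5$ the printed value follows $1-\mu_A(u)$; from 0.5 upward it stays at 0.5, matching the two regions described above.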

Fig. 67 Superimposed plots of MRFI and premise 1 of C6

MRFI: C6: $\mu_{B'}(v) = 1-\mu_B^2(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 67 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, is the intersection point of the curve of $\mu_B(v)$ and the curve $\mu_{B'}(v) = 1-\mu_B^2(v)$. This intersection point is obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\mu_B^2(v) = \mu_B(v) \quad \text{or} \quad \mu_B^2(v)+\mu_B(v)-1 = 0
\]

By solving the above equation, we get $\mu_B(v) = \frac{-1\pm\sqrt{5}}{2} = \frac{\sqrt{5}-1}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{\sqrt{5}-1}{2}$. It is also observed from Figure 67 that $\mu_{A'}(u) = \frac{\sqrt{5}-1}{2}$ is the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and the other point of the supremum happens to be $1-\mu_A(u)$. Therefore, $\mu_{A'}(u)$ takes whichever of $\frac{\sqrt{5}-1}{2}$ and $1-\mu_A(u)$ is larger, i.e. $\mu_{A'}(u) = \max\left(\frac{\sqrt{5}-1}{2},\ 1-\mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{\sqrt{5}-1}{2} \cup 1-\mu_A(u)$. It is noticed that MRFI does not satisfy the intuitive criterion C6 of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(\min\left(\mu_A(u),\mu_B(v)\right),\ 1-\mu_A(u)\right),\ 1-\mu_B^2(v)\right]\right\} \tag{62}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[\max\left(\mu_B(v),\ 1-\mu_A(u)\right),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[\max\left(\mu_A(u),\ 1-\mu_A(u)\right),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = \min\left[\mu_B(v),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = \min\left[1-\mu_A(u),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \min\left[\mu_A(u),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = \min\left[1-\mu_A(u),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
\left.\begin{array}{ll}
y_{111} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v) \\
y_{112} = 1-\mu_B^2(v); & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{array}\right\} & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = 1-\mu_A(u); & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = 1-\mu_B^2(v); & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = 1-\mu_B^2(v); & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{\sqrt{5}-1}{2}$ is obtained by solving the equation $\mu_B(v) = 1-\mu_B^2(v)$.
It is observed that $y_{111} = \mu_B(v)$ is only possible when $\mu_B(v) \le \mu_B^{\min}(v)$ and $\mu_B(v) \ge 1-\mu_A(u)$, i.e. $1-\mu_A(u) \le \mu_B^{\min}(v)$ or $\mu_A(u) \ge 1-\mu_B^{\min}(v)$. In a similar way, $y_{112} = 1-\mu_B^2(v)$ is only possible when $\mu_B(v) > \mu_B^{\min}(v)$ and hence $\mu_A(u) \ge \mu_B^{\min}(v)$. On careful observation, $y_{21}$ is the same as $y_{112}$. Now we have two equations for $\mu_A(u) < \mu_B^{\min}(v)$: the first is $y_{12} = 1-\mu_A(u)$ when $\mu_B(v) \le \mu_A(u)$, and then $y_{22} = 1-\mu_B^2(v)$ when $\mu_B(v) > \mu_A(u)$. Hence, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ can be divided into two regions. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < \mu_B^{\min}(v)$, and the second region consists of $y_{111}$ and $y_{21}$ when $\mu_A(u) \ge \mu_B^{\min}(v)$. We observe that the supremum of the first region will be $y_{12}$, i.e. $1-\mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{111}$ and $y_{21}$. This intersection point is computed by solving the equality $1-\mu_B^2(v) = \mu_B(v)$, and that turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1-\mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup 1-\mu_A(u)$.
MRFI: C7: $\mu_{B'}(v) = 1-\sqrt{\mu_B(v)}$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 68 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, is the intersection point of the curve of $\mu_B(v)$ and the curve $\mu_{B'}(v) = 1-\sqrt{\mu_B(v)}$. This intersection point is obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\sqrt{\mu_B(v)} = \mu_B(v) \quad \text{or} \quad \mu_B^2(v)-3\mu_B(v)+1 = 0
\]

Fig. 68 Superimposed plots of MRFI and premise 1 of C7

By solving the above equation, we get $\mu_B(v) = \frac{3\pm\sqrt{5}}{2} = \frac{3-\sqrt{5}}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{3-\sqrt{5}}{2}$. It is also observed from Figure 68 that $\mu_{A'}(u) = \frac{3-\sqrt{5}}{2}$ is the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and another point of the supremum turns out to be $1-\mu_A(u)$. Therefore, $\mu_{A'}(u)$ takes whichever of $\frac{3-\sqrt{5}}{2}$ and $1-\mu_A(u)$ is larger, i.e. $\mu_{A'}(u) = \max\left(\frac{3-\sqrt{5}}{2},\ 1-\mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{3-\sqrt{5}}{2} \cup 1-\mu_A(u)$. It is noticed that MRFI does not satisfy the intuitive criterion C7 of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(\min\left(\mu_A(u),\mu_B(v)\right),\ 1-\mu_A(u)\right),\ 1-\sqrt{\mu_B(v)}\right]\right\}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[\max\left(\mu_B(v),\ 1-\mu_A(u)\right),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[\max\left(\mu_A(u),\ 1-\mu_A(u)\right),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\tag{63}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = \min\left[\mu_B(v),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = \min\left[1-\mu_A(u),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \min\left[\mu_A(u),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = \min\left[1-\mu_A(u),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
\left.\begin{array}{ll}
y_{111} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v) \\
y_{112} = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{array}\right\} & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = 1-\mu_A(u); & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{3-\sqrt{5}}{2}$ is obtained by solving the equation $\mu_B(v) = 1-\sqrt{\mu_B(v)}$.
It is observed that $y_{111} = \mu_B(v)$ is only possible when $\mu_B(v) \le \mu_B^{\min}(v)$ and $\mu_B(v) \ge 1-\mu_A(u)$, i.e. $1-\mu_A(u) \le \mu_B^{\min}(v)$ or $\mu_A(u) \ge 1-\mu_B^{\min}(v)$. In a similar way, $y_{112} = 1-\sqrt{\mu_B(v)}$ is only possible when $\mu_B(v) > \mu_B^{\min}(v)$ and hence $\mu_A(u) \ge \mu_B^{\min}(v)$. On careful observation, $y_{21}$ is the same as $y_{112}$. Now we have two equations for $\mu_A(u) < \mu_B^{\min}(v)$: the first is $y_{12} = 1-\mu_A(u)$ when $\mu_B(v) \le \mu_A(u)$, and then $y_{22} = 1-\sqrt{\mu_B(v)}$ when $\mu_B(v) > \mu_A(u)$. Hence, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ can be divided into two regions. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < \mu_B^{\min}(v)$, and the second region consists of $y_{111}$ and $y_{21}$ when $\mu_A(u) \ge \mu_B^{\min}(v)$. We observe that the supremum of the first region will be $y_{12}$, i.e. $1-\mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{111}$ and $y_{21}$. This intersection point is computed by solving the equality $1-\sqrt{\mu_B(v)} = \mu_B(v)$, and that turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1-\mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup 1-\mu_A(u)$.
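The C7 result for MRFI can be compared against a brute-force supremum over a sweep of $\mu_A(u)$ values. The sketch below assumes $\mu_B(v)$ covers [0, 1] as $v$ ranges over $V$ and codes the implication as written in Equation (63); the constant `H` and the function names are hypothetical.

```python
import math

# Root in [0, 1] of mu_B = 1 - sqrt(mu_B), i.e. of mu_B^2 - 3*mu_B + 1 = 0.
H = (3.0 - math.sqrt(5.0)) / 2.0


def mrfi(mu_a, mu_b):
    """Implication as written in Equation (63): max(min(mu_A, mu_B), 1 - mu_A)."""
    return max(min(mu_a, mu_b), 1.0 - mu_a)


def mrfi_c7_consequence(mu_a, steps=20001):
    """Grid estimate of sup over mu_B in [0, 1] of min(MRFI, 1 - sqrt(mu_B)).

    Assumption: mu_B(v) sweeps the whole interval [0, 1] as v ranges over V.
    """
    best = 0.0
    for i in range(steps):
        b = i / (steps - 1)
        best = max(best, min(mrfi(mu_a, b), 1.0 - math.sqrt(b)))
    return best


# Largest gap between the grid estimate and max(H, 1 - mu_A) over 21 values of mu_A.
worst = max(
    abs(mrfi_c7_consequence(k / 20) - max(H, 1.0 - k / 20)) for k in range(21)
)
print(f"largest deviation: {worst:.2e}")
```

The deviation should be of the order of the grid spacing, which supports the closed form $\max\left(\frac{3-\sqrt{5}}{2},\ 1-\mu_A(u)\right)$ at the sampled points.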
MRFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 69 that for lower values of $\mu_A(u)$, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of two curves, starting with the curve $\mu_B(v)$ and then $1-\mu_A(u)$. It is noticed that the supremum of these curves is $1-\mu_A(u)$ only. Similarly, for higher values of $\mu_A(u)$, the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ consists of two curves, starting with the curve $\mu_B(v)$ and then $\mu_A(u)$. In this case, it is noticed that the supremum is $\mu_A(u)$ only. Therefore, the consequence $\mu_{A'}(u)$ is either $1-\mu_A(u)$ or $\mu_A(u)$, whichever is larger, i.e. $\mu_{A'}(u) = \max(\mu_A(u),\ 1-\mu_A(u))$ or $\mu_{A'}(u) = \mu_A(u) \cup 1-\mu_A(u)$. It is noticed that MRFI does not satisfy the intuitive criterion C8-1/C8-2 of GMT. The analytical proof is given below:

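\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(\min\left(\mu_A(u),\mu_B(v)\right),\ 1-\mu_A(u)\right),\ \mu_B(v)\right]\right\} \tag{64}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[\max\left(\mu_B(v),\ 1-\mu_A(u)\right),\ \mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[\max\left(\mu_A(u),\ 1-\mu_A(u)\right),\ \mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = \min\left[\mu_B(v),\ \mu_B(v)\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = \min\left[1-\mu_A(u),\ \mu_B(v)\right]; & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \min\left[\mu_A(u),\ \mu_B(v)\right]; & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = \min\left[1-\mu_A(u),\ \mu_B(v)\right]; & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = \mu_B(v); & \text{for } \mu_B(v) \ge 1-\mu_A(u) \\
y_{12} = \mu_B(v); & \text{for } \mu_B(v) < 1-\mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le \mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \mu_A(u); & \text{for } \mu_B(v) \ge 1-\mu_A(u) \text{ or } \mu_A(u) \ge 0.5 \\
y_{22} = 1-\mu_A(u); & \text{for } \mu_A(u) < 0.5
\end{array}\right\} & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\]

Fig. 69 Superimposed plots of MRFI and premise 1 of C8-1/C8-2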

It is observed from $y_{11}$, $y_{12}$, $y_{21}$ and $y_{22}$ that they can be divided into two regions, based on the value of $\mu_A(u)$. The first region consists of $y_{12}$ and $y_{22}$ when $\mu_A(u) < 0.5$, and the second region consists of $y_{11}$ and $y_{21}$ when $\mu_A(u) \ge 0.5$. We also observe that the supremum of the first region is $y_{22} = 1-\mu_A(u)$, and the supremum of the second region is $y_{21} = \mu_A(u)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_A(u)$; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_A(u),\ 1-\mu_A(u))$ or $\mu_{A'}(u) = \mu_A(u) \cup 1-\mu_A(u)$.
BRFI: C5: $\mu_{B'}(v) = 1-\mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figures 70 and 71 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, for values of $\mu_A(u)$ greater than 0.5, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1-\mu_B(v)$. This intersection point is obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\mu_B(v) = \mu_B(v) = 0.5
\]

It is also observed from Figure 70 that $\mu_{A'}(u) = 0.5$ is the minimum value that the consequence $\mu_{A'}(u)$ can achieve. For values of $\mu_A(u)$ less than 0.5, another point of the supremum happens to be $1-\mu_A(u)$. Therefore, $\mu_{A'}(u)$ takes whichever of 0.5 and $1-\mu_A(u)$ is larger, i.e. $\mu_{A'}(u) = \max(0.5,\ 1-\mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup 1-\mu_A(u)$. It is noticed that BRFI does not satisfy the intuitive criterion C5 of GMT. The analytical proof is given below:

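\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(1-\mu_A(u),\ \mu_B(v)\right),\ 1-\mu_B(v)\right]\right\} \tag{65}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
y_2 = \min\left[\mu_B(v),\ 1-\mu_B(v)\right]; & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u); & \text{for } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1-\mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{array}\right\} & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \mu_B(v); & \text{for } \mu_B(v) \le 0.5 \\
y_{22} = 1-\mu_B(v); & \text{for } \mu_B(v) > 0.5
\end{array}\right\} & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]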

Fig. 70 Superimposed plots of BRFI and premise 1 of C5

Fig. 71 Plots of BRFI and premise 1 of C5 for µB = 0.35

It is observed that the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, can be divided into two regions based on the value of $\mu_A(u)$. The first region consists of $y_{11}$ and $y_{12}$ for $\mu_A(u) \le 0.5$, and the second region consists of $y_{21}$ and $y_{22}$ for $\mu_A(u) > 0.5$. We observe that the supremum of the first region will be $y_{11}$, i.e. $1-\mu_A(u)$ only, and for the second region the supremum is the intersection point of $y_{21}$ and $y_{22}$. This intersection point is computed by solving the equality $1-\mu_B(v) = \mu_B(v)$, and that turns out to be 0.5. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is 0.5; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(0.5,\ 1-\mu_A(u))$ or $\mu_{A'}(u) = 0.5 \cup 1-\mu_A(u)$.
BRFI: C6: $\mu_{B'}(v) = 1-\mu_B^2(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 72 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1-\mu_B^2(v)$. This intersection point is obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\mu_B^2(v) = \mu_B(v) \quad \text{or} \quad \mu_B^2(v)+\mu_B(v)-1 = 0
\]

Fig. 72 Superimposed plots of BRFI and premise 1 of C6

By solving the above equation, we get $\mu_B(v) = \frac{-1\pm\sqrt{5}}{2} = \frac{\sqrt{5}-1}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{\sqrt{5}-1}{2}$. It is also observed from Figure 72 that $\mu_{A'}(u) = \frac{\sqrt{5}-1}{2}$ is the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and another point of the supremum turns out to be $1-\mu_A(u)$. Therefore, $\mu_{A'}(u)$ takes whichever of $\frac{\sqrt{5}-1}{2}$ and $1-\mu_A(u)$ is larger, i.e. $\mu_{A'}(u) = \max\left(\frac{\sqrt{5}-1}{2},\ 1-\mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{\sqrt{5}-1}{2} \cup 1-\mu_A(u)$. It is noticed that BRFI does not satisfy the intuitive criterion C6 of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(1-\mu_A(u),\ \mu_B(v)\right),\ 1-\mu_B^2(v)\right]\right\} \tag{66}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
y_2 = \min\left[\mu_B(v),\ 1-\mu_B^2(v)\right]; & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u); & \text{for } \mu_B(v) \le \sqrt{\mu_A(u)} \\
y_{12} = 1-\mu_B^2(v); & \text{for } \mu_B(v) > \sqrt{\mu_A(u)}
\end{array}\right\} & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v) \\
y_{22} = 1-\mu_B^2(v); & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{array}\right\} & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{\sqrt{5}-1}{2}$ is obtained by solving the equation $\mu_B(v) = 1-\mu_B^2(v)$.
It is observed that the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, can be divided into two regions, based on the value of $\mu_A(u)$. The first region consists of $y_{11}$ and $y_{12}$ for $\mu_A(u) \le 0.5$, and the second region consists of $y_{21}$ and $y_{22}$ for $\mu_A(u) > 0.5$. We observe that the supremum of the first region will be $y_{11}$, i.e. $1-\mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{21}$ and $y_{22}$. This intersection point is computed by solving the equality $1-\mu_B^2(v) = \mu_B(v)$, and that turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1-\mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup 1-\mu_A(u)$.

Fig. 73 Superimposed plots of BRFI and premise 1 of C7
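A numerical check of the BRFI result under C6 is sketched below. It assumes $\mu_B(v)$ covers [0, 1] as $v$ ranges over $V$ and codes the implication as it appears in Equation (66); the constant `G` and the function names are hypothetical.

```python
import math

# Root in [0, 1] of mu_B = 1 - mu_B^2, i.e. of mu_B^2 + mu_B - 1 = 0.
G = (math.sqrt(5.0) - 1.0) / 2.0


def brfi(mu_a, mu_b):
    """Implication as written in Equation (66): max(1 - mu_A, mu_B)."""
    return max(1.0 - mu_a, mu_b)


def brfi_c6_consequence(mu_a, steps=20001):
    """Grid estimate of sup over mu_B in [0, 1] of min(BRFI, 1 - mu_B^2).

    Assumption: mu_B(v) sweeps the whole interval [0, 1] as v ranges over V.
    """
    grid = (i / (steps - 1) for i in range(steps))
    return max(min(brfi(mu_a, b), 1.0 - b * b) for b in grid)


for mu_a in (0.1, 0.3, 0.5, 0.7, 0.9):
    expected = max(G, 1.0 - mu_a)  # closed form stated in the text
    print(f"mu_A={mu_a:.1f}  grid={brfi_c6_consequence(mu_a):.4f}  expected={expected:.4f}")
```

For small $\mu_A(u)$ the output follows $1-\mu_A(u)$, and once $1-\mu_A(u)$ drops below $\frac{\sqrt{5}-1}{2} \approx 0.618$ it stays at that constant.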
BRFI: C7: $\mu_{B'}(v) = 1-\sqrt{\mu_B(v)}$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 73 that the first supremum of the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, is the intersection point of the curve $\mu_B(v)$ of $\mu_{A\to B}(u,v)$ and the curve $\mu_{B'}(v) = 1-\sqrt{\mu_B(v)}$. This intersection point is obtained by solving the following equality:

\[
\mu_{A'}(u) = 1-\sqrt{\mu_B(v)} = \mu_B(v) \quad \text{or} \quad \mu_B^2(v)-3\mu_B(v)+1 = 0
\]

By solving the above equation, we get $\mu_B(v) = \frac{3\pm\sqrt{5}}{2} = \frac{3-\sqrt{5}}{2}$, hence $\mu_{A'}(u) = \mu_B(v) = \frac{3-\sqrt{5}}{2}$. It is also observed from Figure 73 that $\mu_{A'}(u) = \frac{3-\sqrt{5}}{2}$ is the minimum value that the consequence $\mu_{A'}(u)$ can achieve, and another point of the supremum turns out to be $1-\mu_A(u)$. Therefore, $\mu_{A'}(u)$ takes whichever of $\frac{3-\sqrt{5}}{2}$ and $1-\mu_A(u)$ is larger, i.e. $\mu_{A'}(u) = \max\left(\frac{3-\sqrt{5}}{2},\ 1-\mu_A(u)\right)$ or $\mu_{A'}(u) = \frac{3-\sqrt{5}}{2} \cup 1-\mu_A(u)$. It is noticed that BRFI does not satisfy the intuitive criterion C7 of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(1-\mu_A(u),\ \mu_B(v)\right),\ 1-\sqrt{\mu_B(v)}\right]\right\} \tag{67}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
y_2 = \min\left[\mu_B(v),\ 1-\sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
\left.\begin{array}{ll}
y_{11} = 1-\mu_A(u); & \text{for } \mu_B(v) \le \mu_A^2(u) \\
y_{12} = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_A^2(u)
\end{array}\right\} & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
\left.\begin{array}{ll}
y_{21} = \mu_B(v); & \text{for } \mu_B(v) \le \mu_B^{\min}(v) \\
y_{22} = 1-\sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_B^{\min}(v)
\end{array}\right\} & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]

where $\mu_B^{\min}(v) = \frac{3-\sqrt{5}}{2}$ is obtained by solving the equation $\mu_B(v) = 1-\sqrt{\mu_B(v)}$.
It is observed that the outcome of the “min” operation between the implication $\mu_{A\to B}(u,v)$ and premise 1, $\mu_{B'}(v)$, can be divided into two regions based on the value of $\mu_A(u)$. The first region consists of $y_{11}$ and $y_{12}$ for $\mu_A(u) \le 0.5$, and the second region consists of $y_{21}$ and $y_{22}$ for $\mu_A(u) > 0.5$. We observe that the supremum of the first region will be $y_{11}$, i.e. $1-\mu_A(u)$ only, and for the second region, the supremum is the intersection point of $y_{21}$ and $y_{22}$. This intersection point is computed by solving the equality $1-\sqrt{\mu_B(v)} = \mu_B(v)$, and that turns out to be $\mu_B^{\min}(v)$. It is also noticed that the supremum of the first region will always be equal to or greater than the supremum of the second region. In other words, the minimum value that the consequence $\mu_{A'}(u)$ can have is $\mu_B^{\min}(v)$; otherwise it is whichever is the maximum, i.e. $\mu_{A'}(u) = \max(\mu_B^{\min}(v),\ 1-\mu_A(u))$ or $\mu_{A'}(u) = \mu_B^{\min}(v) \cup 1-\mu_A(u)$.
BRFI: C8-1/C8-2: $\mu_{B'}(v) = \mu_B(v)$ is applied to the RHS of Equation (22) to get the consequence $\mu_{A'}(u)$. It is observed from Figure 74 that $\mu_{B'}(v)$ is always equal to or less than the implication $\mu_{A\to B}(u,v)$ for any value of $\mu_A(u)$; therefore, the outcome of the “min” operation is $\mu_{B'}(v)$ itself. Hence, the supremum of $\mu_{B'}(v)$ turns out to be unity (since the maximum value of $\mu_{B'}(v) = \mu_B(v)$ is unity). Therefore, it is concluded that BRFI satisfies the intuitive criterion C8-1 (not C8-2) of GMT. The analytical proof is given below:

\[
\mu_{A'}(u) = \sup_{v\in V}\left\{\min\left[\max\left(1-\mu_A(u),\ \mu_B(v)\right),\ \mu_B(v)\right]\right\} \tag{68}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \min\left[1-\mu_A(u),\ \mu_B(v)\right]; & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
y_2 = \min\left[\mu_B(v),\ \mu_B(v)\right]; & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}
\begin{cases}
y_1 = \mu_B(v); & \text{for } \mu_B(v) \le 1-\mu_A(u) \\
y_2 = \mu_B(v); & \text{for } \mu_B(v) > 1-\mu_A(u)
\end{cases}
\]
\[
= \sup_{v\in V}\left\{\mu_B(v)\right\}
\]

Fig. 74 Superimposed plots of BRFI and premise 1 of C8-1/C8-2

It is observed that the outcome of the “min” operation between $\mu_{A\to B}(u,v)$ and $\mu_{B'}(v)$ is always $\mu_B(v)$; thus $\mu_{A'}(u) = 1$.
GRFI: C5: µB′(v) = 1 − µB(v) is applied to the RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figures 75 and 76 that the supremum of
the outcome of the “min” operation between the curves µA→B(u, v) and µB′(v) is
their intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B(v),
\]

since

\[
\frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B(v), \quad \text{or} \quad (1 + \mu_A(u))\,\mu_B(v) = \mu_A(u), \quad \text{or} \quad \mu_B(v) = \frac{\mu_A(u)}{1 + \mu_A(u)},
\]

so that

\[
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = \frac{1}{1 + \mu_A(u)}.
\]

Fig. 75 Superimposed plots of GRFI and premise 1 of C5

Fig. 76 Plots of GRFI and premise 1 of C5 for µB = 0.35

Therefore it is concluded that GRFI does not satisfy criterion C5 of GMT. The
analytical proof is given below:
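\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\, 1 - \mu_B(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le 1 - \mu_B(v) \text{ or } \mu_B(v) \le \dfrac{\mu_A(u)}{1 + \mu_A(u)}, \text{ with } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_B(v); & \text{for } \mu_B(v) > \dfrac{\mu_A(u)}{1 + \mu_A(u)}, \text{ with } \mu_B(v) \le \mu_A(u) \\
y_2 = 1 - \mu_B(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\tag{69}
\]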

It is observed that the outcome of the “min” operation between µA→B(u, v) and µB′(v)
consists of y11, y12 and y2. The outcome starts with y11, which increases to a maximum
value of 1/(1 + µA(u)) as µB(v) increases from 0 to µA(u)/(1 + µA(u)). It is observed
that y12 and y2 are the same; therefore, we take y12 or y2, which starts from its maximum
value of 1/(1 + µA(u)) and decreases with any further increase of µB(v). Hence, the
supremum of the outcome of the “min” operation is 1/(1 + µA(u)) only.
GRFI: C6: µB′(v) = 1 − µB²(v) is applied to the RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figure 77 that the supremum of the outcome of
the “min” operation between the curves µA→B(u, v) and µB′(v) is their intersection
point, obtained by solving the following equality:

\[
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B^2(v),
\]

since

\[
\frac{\mu_B(v)}{\mu_A(u)} = 1 - \mu_B^2(v), \quad \text{or} \quad \mu_A(u)\,\mu_B^2(v) + \mu_B(v) - \mu_A(u) = 0.
\]

By solving the above equation, we get

\[
\mu_B(v) = \frac{-1 \pm \sqrt{1 + 4\mu_A^2(u)}}{2\mu_A(u)}
\quad \text{and} \quad
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = \frac{-1 \pm \sqrt{1 + 4\mu_A^2(u)}}{2\mu_A^2(u)} = \frac{\sqrt{1 + 4\mu_A^2(u)} - 1}{2\mu_A^2(u)}.
\]

Therefore it is concluded that GRFI does not satisfy criterion C6 of GMT. The
analytical proof is given below:
\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, 1 - \mu_B^2(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\, 1 - \mu_B^2(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le 1 - \mu_B^2(v) \text{ or } \mu_B(v) \le \mu_{B\min}(v), \text{ with } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \mu_{B\min}(v), \text{ with } \mu_B(v) \le \mu_A(u) \\
y_2 = 1 - \mu_B^2(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\tag{70}
\]

where

\[
\mu_{B\min}(v) = \frac{\sqrt{1 + 4\mu_A^2(u)} - 1}{2\mu_A(u)}
\]

is obtained by solving the equation µB(v)/µA(u) = 1 − µB²(v).

Fig. 77 Superimposed plots of GRFI and premise 1 of C6

It is observed that the outcome of the “min” operation between µA→B(u, v) and µB′(v)
consists of y11, y12 and y2. The outcome starts with y11, which increases to a maximum
value of µBmax(v) as µB(v) increases from 0 to µBmin(v). It is observed that y12 and y2
are the same; therefore, we take y12 or y2, which starts from its maximum value of
µBmax(v) and decreases with any further increase of µB(v). Hence, the supremum of the
outcome of the “min” operation is µBmax(v) only, where
µBmax(v) = (√(1 + 4µA²(u)) − 1)/(2µA²(u)).
GRFI: C7: µB′(v) = 1 − √µB(v) is applied to the RHS of Equation (22) to get
the consequence µA′(u). It is observed from Figure 78 that the supremum of the
outcome of the “min” operation between the curves µA→B(u, v) and µB′(v) is their
intersection point, obtained by solving the following equality:

\[
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = 1 - \sqrt{\mu_B(v)},
\]

since

\[
\frac{\mu_B(v)}{\mu_A(u)} = 1 - \sqrt{\mu_B(v)}, \quad \text{or} \quad \mu_B^2(v) - \left(2\mu_A(u) + \mu_A^2(u)\right)\mu_B(v) + \mu_A^2(u) = 0.
\]

By solving the above equation, we get

\[
\mu_B(v) = \frac{2\mu_A(u) + \mu_A^2(u) - \mu_A(u)\sqrt{\mu_A^2(u) + 4\mu_A(u)}}{2}
\quad \text{and} \quad
\mu_{A'}(u) = \frac{\mu_B(v)}{\mu_A(u)} = \frac{2 + \mu_A(u) - \sqrt{\mu_A^2(u) + 4\mu_A(u)}}{2}.
\]

Fig. 78 Superimposed plots of GRFI and premise 1 of C7

Therefore, it is concluded that GRFI does not satisfy criterion C7 of GMT. The
analytical proof is given below:
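\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\, 1 - \sqrt{\mu_B(v)}\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le 1 - \sqrt{\mu_B(v)} \text{ or } \mu_B(v) \le \mu_{B\min}(v), \text{ with } \mu_B(v) \le \mu_A(u) \\
y_{12} = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_{B\min}(v), \text{ with } \mu_B(v) \le \mu_A(u) \\
y_2 = 1 - \sqrt{\mu_B(v)}; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\tag{71}
\]

where

\[
\mu_{B\min}(v) = \frac{2\mu_A(u) + \mu_A^2(u) - \mu_A(u)\sqrt{\mu_A^2(u) + 4\mu_A(u)}}{2}
\]

is obtained by solving the equation µB(v)/µA(u) = 1 − √µB(v).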

Fig. 79 Superimposed plots of GRFI and premise 1 of C8-1/C8-2

It is observed that the outcome of the “min” operation between µA→B(u, v) and µB′(v)
consists of y11, y12 and y2. The outcome starts with y11, which increases to a maximum
value of µBmax(v) as µB(v) increases from 0 to µBmin(v). It is observed that y12 and y2
are the same; therefore, we take y12 or y2, which starts from its maximum value of
µBmax(v) and decreases with any further increase of µB(v). Hence, the supremum of the
outcome of the “min” operation is µBmax(v) only, where
µBmax(v) = (2 + µA(u) − √(µA²(u) + 4µA(u)))/2.
GRFI: C8-1/C8-2: µB′(v) = µB(v) is applied to the RHS of Equation (22) to get the
consequence µA′(u). It is observed from Figure 79 that µB′(v) is always equal to or less
than the implication µA→B(u, v) for any value of µA(u); therefore the outcome of the “min”
operation is µB′(v) itself. Hence, the supremum of µB′(v) turns out to be unity (since the
maximum value of µB′(v) = µB(v) is unity). Therefore, it is concluded that GRFI satisfies
the intuitive criterion C8-1 (not C8-2) of GMT. The analytical proof is given below:
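\[
\begin{aligned}
\mu_{A'}(u) &= \sup_{v \in V}
\begin{cases}
y_1 = \min\left[\dfrac{\mu_B(v)}{\mu_A(u)},\, \mu_{B'}(v)\right]; & \text{for } \mu_B(v) \le \mu_A(u) \\
y_2 = \min\left[1,\, \mu_{B'}(v)\right]; & \text{for } \mu_B(v) > \mu_A(u)
\end{cases} \\
&= \sup_{v \in V}
\begin{cases}
y_{11} = \dfrac{\mu_B(v)}{\mu_A(u)}; & \text{for } \dfrac{\mu_B(v)}{\mu_A(u)} \le \mu_{B'}(v) \text{ or } \mu_A(u) > 1, \text{ with } \mu_B(v) \le \mu_A(u) \\
y_{12} = \mu_{B'}(v); & \text{for } \mu_A(u) < 1, \text{ with } \mu_B(v) \le \mu_A(u) \\
y_2 = \mu_{B'}(v); & \text{for } \mu_B(v) > \mu_A(u)
\end{cases}
\end{aligned}
\tag{72}
\]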

It is observed that y11 is not valid, as µA(u) > 1 is not possible. Hence, the outcome
of the “min” operation between µA→B(u, v) and µB′(v) is always µB′(v), thus
µA′(u) = 1.
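
The closed-form consequences derived above can be cross-checked numerically. The following Python sketch is not part of the authors' MATLAB package. It assumes that µB(v) sweeps the whole interval [0, 1] as v ranges over V, approximates the supremum on a grid, and uses illustrative function names (goguen, boolean_impl, gmt_consequence) that do not appear in the chapter.

```python
import math


def goguen(a, b):
    """Goguen implication (GRFI): 1 if a <= b, otherwise b / a."""
    return 1.0 if a <= b else b / a


def boolean_impl(a, b):
    """Boolean implication (BRFI): max(1 - a, b)."""
    return max(1.0 - a, b)


def gmt_consequence(impl, premise, mu_a, steps=20001):
    """Grid approximation of sup over mu_B in [0, 1] of min(impl(mu_a, mu_B), premise(mu_B))."""
    best = 0.0
    for i in range(steps):
        mu_b = i / (steps - 1)
        best = max(best, min(impl(mu_a, mu_b), premise(mu_b)))
    return best


# Premise 1 of each GMT criterion and the closed form derived in the text for GRFI.
GRFI_CASES = {
    "C5": (lambda b: 1.0 - b, lambda a: 1.0 / (1.0 + a)),
    "C6": (lambda b: 1.0 - b * b, lambda a: (math.sqrt(1.0 + 4.0 * a * a) - 1.0) / (2.0 * a * a)),
    "C7": (lambda b: 1.0 - math.sqrt(b), lambda a: (2.0 + a - math.sqrt(a * a + 4.0 * a)) / 2.0),
    "C8": (lambda b: b, lambda a: 1.0),
}

if __name__ == "__main__":
    for mu_a in (0.2, 0.5, 0.9):
        for name, (premise, closed_form) in GRFI_CASES.items():
            numeric = gmt_consequence(goguen, premise, mu_a)
            print(name, mu_a, round(numeric, 4), round(closed_form(mu_a), 4))
```

The grid value and the closed form should agree to within the grid resolution for any µA(u) in (0, 1]; µA(u) = 0 is excluded because the C6 expression divides by µA²(u).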
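Table 3 Comparison of true and computed consequences of GMP

| Criteria | T | MORFI | PORFI | ARFI | MRFI | BRFI | GRFI |
|---|---|---|---|---|---|---|---|
| C1 | µB | µB (Y) | µB (Y) | (1 + µB)/2 (N) | 0.5 ∪ µB (N) | 0.5 ∪ µB (N) | √µB (N) |
| C2-1 | µB² | µB (N) | µB (N) | (3 + 2µB − √(5 + 4µB))/2 (N) | (3 − √5)/2 ∪ µB (N) | (3 − √5)/2 ∪ µB (N) | (µB)^(2/3) (N) |
| C2-2 | µB | µB (Y) | µB (Y) | (3 + 2µB − √(5 + 4µB))/2 (N) | (3 − √5)/2 ∪ µB (N) | (3 − √5)/2 ∪ µB (N) | (µB)^(2/3) (N) |
| C3-1 | √µB | µB (N) | µB (N) | (√(5 + 4µB) − 1)/2 (N) | (√5 − 1)/2 ∪ µB (N) | (√5 − 1)/2 ∪ µB (N) | (µB)^(1/3) (N) |
| C3-2 | µB | µB (Y) | µB (Y) | (√(5 + 4µB) − 1)/2 (N) | (√5 − 1)/2 ∪ µB (N) | (√5 − 1)/2 ∪ µB (N) | (µB)^(1/3) (N) |
| C4-1 | 1 | 0.5 ∩ µB (N) | µB/(1 + µB) (N) | 1 (Y) | 1 (Y) | 1 (Y) | 1 (Y) |
| C4-2 | 1 − µB | 0.5 ∩ µB (N) | µB/(1 + µB) (N) | 1 (N) | 1 (N) | 1 (N) | 1 (N) |

T = true consequence of GMP. Each method column gives the computed consequence followed by the satisfaction flag (SF) in parentheses: “Y” if the computed consequence matches the true one, “N” if it does not.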

Table 4 Comparison of true and computed consequences of GMT

| Criteria | T | MORFI | PORFI | ARFI | MRFI | BRFI | GRFI |
|---|---|---|---|---|---|---|---|
| C5 | 1 − µA | 0.5 ∩ µA (N) | µA/(1 + µA) (N) | 1 − µA/2 (N) | 0.5 ∪ (1 − µA) (N) | 0.5 ∪ (1 − µA) (N) | 1/(1 + µA) (N) |
| C6 | 1 − µA² | (√5 − 1)/2 ∩ µA (N) | (µA√(µA² + 4) − µA²)/2 (N) | (1 − 2µA + √(1 + 4µA))/2 (N) | (√5 − 1)/2 ∪ (1 − µA) (N) | (√5 − 1)/2 ∪ (1 − µA) (N) | (√(1 + 4µA²) − 1)/(2µA²) (N) |
| C7 | 1 − √µA | (3 − √5)/2 ∩ µA (N) | (2µA + 1 − √(4µA + 1))/2 (N) | (3 − √(1 + 4µA))/2 (N) | (3 − √5)/2 ∪ (1 − µA) (N) | (3 − √5)/2 ∪ (1 − µA) (N) | (2 + µA − √(µA² + 4µA))/2 (N) |
| C8-1 | 1 | µA (N) | µA (N) | 1 (Y) | µA ∪ (1 − µA) (N) | 1 (Y) | 1 (Y) |
| C8-2 | µA | µA (Y) | µA (Y) | 1 (N) | µA ∪ (1 − µA) (N) | 1 (N) | 1 (N) |

T = true consequence of GMT. Each method column gives the computed consequence followed by the satisfaction flag (SF) in parentheses: “Y” if the computed consequence matches the true one, “N” if it does not.

6 Discussions

Tables 3 and 4 summarize the results of the investigation [2] of the various implication
methods against the intuitive criteria of GMP and GMT. It is observed from the tables that
MORFI and PORFI satisfy exactly the same intuitive criteria of GMP and GMT, four in total
(C1, C2-2, C3-2 and C8-2). A similar observation is made for ARFI, BRFI and GRFI, which
satisfy only two intuitive criteria of GMP and GMT (C4-1 and C8-1). It is also observed
that MRFI has the minimum number of satisfactions, namely one (C4-1). A plausible
explanation of these observations is found by referring to Figures 6 and 7: the curve
profiles (as far as the shape of their envelope is concerned) of MORFI and PORFI are
similar, both starting from the origin and ending at the membership grade µB(v).
Similarly, Figures 8, 10 and 11 for ARFI, BRFI and GRFI, respectively, show that these
methods also have similar curve profiles, each starting at unity and finally converging
to µB(v). MRFI has a unique curve profile that does not match that of any other method,
which makes MRFI a separate member among the existing implication methods. Finally, it
can be concluded that the similarity in the curve profiles of these methods (MORFI and
PORFI in one group; ARFI, BRFI and GRFI in another) probably leads to an equal number
of satisfactions of the intuitive criteria of GMP and GMT.

7 Conclusions

A systematic approach has been followed to find out whether any of the existing
implication methods match a given set of intuitive criteria of GMP and GMT. For that
purpose, MATLAB with graphics was used to develop a user-interactive package that
evaluates the implication methods with respect to those criteria. The results are
provided in terms of tables and figures. It is found that the graphical method of
investigation is much quicker and requires less effort from the user than the analytical
method. Moreover, the analytical method requires a diagnosis of the various curves (i.e.
of the nature of these curves with respect to variation of the fuzzy sets) involved in
finding the consequences when the intuitive criteria of GMP and GMT are applied to the
various implication methods.

References

1. S.K. Kashyap and J.R. Raol. Unification and Interpretation of Fuzzy Set Operations.
CCECE/CCGEI, IEEE Electrical and Computer Engineering Conference, Ottawa, Canada,
May 7–10, 2006.
2. Li-Xin Wang. Adaptive Fuzzy Systems and Control, Design and Stability Analysis. Prentice-
Hall, Englewood Cliffs, NJ, 1994.
FzController: A Development Environment
for Fuzzy Controllers

I. Alvarez-López, O. Llanes-Santiago, and J.L. Verdegay

Abstract This chapter presents a general-purpose development environment that allows
easy and user-friendly specification, verification and synthesis of fuzzy controllers.
This CAD tool also allows real-time control of systems with suitable time constants, and
it contains the necessary tools for signal processing. Among the distinctive
characteristics of this system are the possibility for users to define their own
operators and to synthesize the designed controller for a PLC.

Keywords: Signal processing; Fuzzy control

1 Introduction

As is well known, the theory of fuzzy sets, and hence fuzzy logic, was introduced
by L.A. Zadeh in the mid-1960s as a way to describe the mechanisms of approximate
inference performed in the human brain [1]. Since then, automatic process control has
been the field where applications of fuzzy logic have gained most importance, as
demonstrated by the diversity of registered patents based on fuzzy logic and by the very
large number of papers presented at congresses and published in specialized journals
over the last three decades. In spite of this, it is still necessary to develop CAD
tools that help users with the specification, verification and synthesis of fuzzy
controllers.
In past years many tools have been developed for building systems based on fuzzy
logic. Without doubt the tool with the largest number of users is the MATLAB package
for fuzzy logic. This tool, which has all the potential that this powerful package
offers, has some basic drawbacks: it allows neither a hardware implementation of the
designed controller nor the direct synthesis of controllers in industrial devices such
as Programmable Logic Controllers (PLCs). Another tool is XFuzzy [3–5], which is very
useful for those who want a hardware implementation of the designed controller, but it
does not allow direct synthesis of the controller in industrial devices like PLCs, nor
does it offer identification of the process to be controlled.

I. Alvarez-López and J.L. Verdegay
Dept. of Computer Science and A.I., University of Granada, 18071 Granada (Spain)
O. Llanes-Santiago
Dept. of Automatic and Computation, Electric Faculty, ISPJAE, Havana (Cuba)
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 387–401.
© 2008 Springer.

Many other tools have been developed that specify the system through either a graphic
interface or a description language, but in general most of them have serious
limitations in the fuzzy operators they implement, and/or are closed systems that do not
allow users to implement their own operators, or are development systems for
non-specific technologies [6].
This chapter presents FzController, a prototype tool that, although already used in
practice, is still under development. The tool is characterized by a friendly and clear
graphic interface which, besides avoiding some of the above-mentioned drawbacks, allows
users to perform the following main actions:
1. Process identification
2. Fuzzy controller design using graphic design tools
3. Real-time control
4. Automatic generation of code for PLCs and high-level languages
Each action is associated with a module in the system.
The chapter is organized as follows. Section 2 presents the general diagram of the
system to give an overall idea. Section 3 presents the basic characteristics of the
modules for identification, fuzzy controller design, real-time control and automatic
code generation, illustrating the main distinctive features of this development
environment with regard to existing ones. Finally, some conclusions are pointed out.

2 General Conception of the FzController System

As is well known [2], there are two basic methods for implementing fuzzy systems,
the exact one and the approximate one, each with its pros and cons.

2.1 Exact Method

This method is based on studying the shape that the fuzzy sets adopt under each
implication operator. On this basis, a parametric representation of the inferred fuzzy
sets is built. Its drawback is that the parametric expressions of the fuzzy sets have to
be computed before the controller is implemented [7].

2.2 Approximated Method

Its basic characteristic is that no previous computation is necessary, since the
universe of discourse of each consequent variable is defined as a finite discrete set.
From a computational point of view this makes the method more time consuming, and its
accuracy is determined by the number of points in the universe of discourse. Hence it is
necessary to strike a balance between computational speed and accuracy. The method has
the advantage of being able to work with a larger number of implication operators, since
it does not require a previous study of the parameterized expressions of the fuzzy sets
[7]. The implication, aggregation and defuzzification operators thus act on each element
of the vectors obtained from the discretization process. The implementation of the
FzController system uses the Approximated Method.
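
The trade-off between speed and accuracy can be illustrated with a short Python sketch of the approximate method. It is an illustration under stated assumptions, not FzController code: the rule consequents, firing degrees and function names are hypothetical, implication defaults to min, aggregation to max, and defuzzification is the centre of gravity over the sampled points.

```python
def linspace(lo, hi, n):
    """Return n equally spaced points covering [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def triangle(a, b, c):
    """Build a triangular membership function with feet a, c and peak b (a < b < c)."""
    def mf(y):
        if y <= a or y >= c:
            return 0.0
        return (y - a) / (b - a) if y <= b else (c - y) / (c - b)
    return mf


def infer(firing, consequents, universe, implication=min, aggregation=max):
    """Apply implication and aggregation point by point on the discretized universe.

    firing: activation degree of each rule.
    consequents: membership function of each rule's consequent.
    The start value 0.0 is the neutral element of an s-norm aggregation such as max.
    """
    aggregated = [0.0] * len(universe)
    for degree, mf in zip(firing, consequents):
        for i, y in enumerate(universe):
            aggregated[i] = aggregation(aggregated[i], implication(degree, mf(y)))
    return aggregated


def centroid(universe, membership):
    """Centre-of-gravity defuzzification over the sampled points (needs a non-empty set)."""
    total = sum(membership)
    return sum(y * m for y, m in zip(universe, membership)) / total


if __name__ == "__main__":
    # Two hypothetical rules firing at 0.3 and 0.7 on an output universe [0, 10].
    rules = [triangle(0.0, 2.5, 5.0), triangle(2.5, 6.0, 10.0)]
    for points in (11, 101, 1001):
        universe = linspace(0.0, 10.0, points)
        crisp = centroid(universe, infer([0.3, 0.7], rules, universe))
        print(points, round(crisp, 4))
```

Running the loop with more points costs proportionally more evaluations per inference step, while the defuzzified value settles as the grid is refined.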
The FzController system is developed for the Windows environment; for efficient
operation it needs, as a minimum configuration, any Pentium computer with 128 MB of RAM
and 40 MB of hard-disk space.
Figure 1 presents the structure of the FzController system and the flow of infor-
mation among the different modules.

[Figure 1: block diagram. Blocks shown: FzController Kernel; Module for design of fuzzy
controllers (graphic editor of systems: controller and variables; variables editor;
rules editor; properties editor; editor of operators defined by the user with a code
editor for VBScript, JavaScript and DelphiScript; graphic visualization of rules and
inference process; graphic representation of system response and control surface);
Identification Module; Real Time Control Module; Interface; Industrial plant; Module of
automatic generation of codes for PLC and high-level programming languages.]

Fig. 1 FzController system diagram


3 Modules in FzController

As said above, there are four main modules in FzController. Each of them is described
in the following.

3.1 Identification Module

For every development system it is very important to incorporate an identification
module that finds a mathematical model of the system, so that the first tests of the
designed controller can be carried out by computer simulation. In [8, 9] it was shown
that a fuzzy logic system behaves as a universal approximator, so such systems are
widely used in nonlinear modelling problems, with a great number of applications in
engineering. The two basic schemes used to carry out the identification are the parallel
one (Figure 2) and the series-parallel one (Figure 3). In [10] it was shown that the
series-parallel scheme is the better of the two; hence it is the one used in the
identification block of FzController.
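
The difference between the two schemes lies in what is fed back into the model: the series-parallel scheme uses the measured plant output, while the parallel scheme uses the model's own previous output. The Python sketch below illustrates this under loud assumptions: the plant and the slightly mismatched model are hypothetical first-order maps invented for the example (the model merely stands in for the fuzzy logic system), and the result shows only one typical case, not the argument of [10].

```python
def plant(y, u):
    """Hypothetical nonlinear first-order plant (illustration only)."""
    return 0.8 * y + 0.4 * u - 0.1 * y * y


def model(y, u):
    """Hypothetical, slightly mismatched model standing in for the fuzzy logic system."""
    return 0.78 * y + 0.4 * u - 0.1 * y * y


def identification_errors(inputs, y0=0.0):
    """Return the absolute output errors of the series-parallel and parallel schemes."""
    y = y_parallel = y0
    err_series_parallel, err_parallel = [], []
    for u in inputs:
        y_next = plant(y, u)
        y_hat_sp = model(y, u)             # series-parallel: fed with the measured plant output
        y_parallel = model(y_parallel, u)  # parallel: fed with the model's own past output
        err_series_parallel.append(abs(y_next - y_hat_sp))
        err_parallel.append(abs(y_next - y_parallel))
        y = y_next
    return err_series_parallel, err_parallel


if __name__ == "__main__":
    sp, par = identification_errors([1.0] * 50)
    print(f"final error, series-parallel: {sp[-1]:.4f}; parallel: {par[-1]:.4f}")
```

In this example the parallel error accumulates the model mismatch over time, whereas the series-parallel error is only the one-step prediction error.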
There are several training algorithms (back propagation, orthogonal least squares,
clustering, . . .) for identification with a fuzzy system, each with advantages and
disadvantages depending on the system to be identified. For FzController it was decided
to implement first the algorithm based on clustering, because it is computationally
simple and behaves well, being based on an optimal fuzzy logic system. In any case, the
system allows other methods to be programmed and added to the identification module in
a very simple way.
To carry out the identification, this module can take the input and output data from
the real-time control module or from a file provided by the user.
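
The chapter does not spell out the clustering algorithm. One well-known scheme of this kind is the nearest-neighbourhood clustering used with optimal fuzzy logic systems in the adaptive fuzzy systems literature; whether FzController uses exactly this variant is an assumption. The Python sketch below shows that scheme for a scalar input, with hypothetical names (ClusteringIdentifier, radius, sigma): each sample either joins the nearest cluster or founds a new one, and the model output is a Gaussian-weighted average of the accumulated outputs.

```python
import math


class ClusteringIdentifier:
    """Nearest-neighbourhood clustering identifier for a scalar input (illustrative sketch)."""

    def __init__(self, radius, sigma):
        self.radius = radius   # maximum distance at which a sample joins an existing cluster
        self.sigma = sigma     # width of the Gaussian membership functions
        self.centers = []      # cluster centres
        self.sum_y = []        # accumulated outputs per cluster
        self.count = []        # number of samples per cluster

    def train(self, x, y):
        """Assign the sample to the nearest cluster, or create a new cluster."""
        if self.centers:
            dists = [abs(x - c) for c in self.centers]
            j = min(range(len(dists)), key=dists.__getitem__)
            if dists[j] <= self.radius:
                self.sum_y[j] += y
                self.count[j] += 1
                return
        self.centers.append(x)
        self.sum_y.append(y)
        self.count.append(1)

    def predict(self, x):
        """Gaussian-weighted average of the cluster outputs (requires at least one cluster)."""
        weights = [math.exp(-((x - c) ** 2) / self.sigma ** 2) for c in self.centers]
        num = sum(s * w for s, w in zip(self.sum_y, weights))
        den = sum(n * w for n, w in zip(self.count, weights))
        return num / den


if __name__ == "__main__":
    ident = ClusteringIdentifier(radius=0.05, sigma=0.1)
    for i in range(11):
        ident.train(i / 10, 2.0 * i / 10)   # samples of the hypothetical map y = 2x
    print(len(ident.centers), round(ident.predict(0.5), 4))
```

A smaller radius creates more clusters and a finer model at the cost of more computation per prediction, which is the kind of simplicity the text attributes to the clustering approach.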

[Figure 2: block diagram with the blocks “Plant” and “fuzzy logic system” and the
signals u, y, ŷ and e.]

Fig. 2 Basic outline of the pattern of parallel identification

[Figure 3: block diagram with the blocks “Plant” and “fuzzy logic system” and the
signals u, y, ŷ and e.]

Fig. 3 Basic outline of the series-parallel identification

3.2 Design of Fuzzy Controllers Module

The module for design of fuzzy controllers is composed of:

a) Graphic editor
b) Properties editor
c) Variables editor
d) Rules editor
e) Editor of operators defined by the user
f) Graphic visualization of the rules and the inference process
g) Graphic representation of the system response, control surface

The first five elements allow the controller to be specified, and the last two allow
its operation to be verified.

a) Graphic editor
When a fuzzy controller is designed, the first step is to select the structure of the
controller to be implemented. FzController supports Sugeno- and Mamdani-type
controllers (the classic structures for a fuzzy controller).
The system editor (Figure 4) allows the type of controller (Sugeno or Mamdani) to be
selected and its linguistic variables to be defined. Once the controller has been
selected, its logical operators are defined. In addition, the controller’s linguistic
variables are added in the system editor, and their universes of discourse and
linguistic labels are described. The editor’s main characteristic is that it simplifies
the specification of the system.

Fig. 4 System editor screen

Fig. 5 Properties and system parameters editor

b) Properties editor
By means of the properties editor (Figure 5) the controller’s fuzzy operators are
edited. Here the user selects the operators used for the AND connective, the OR
connective, the implication, the aggregation, the addition operator to conjunction and
disjunction, and the defuzzification method, as well as the controller’s name.
Table 1 shows the operators that FzController implements by default.
One of the characteristics that enhance the system is that it is a general-purpose
system allowing the implementation of any operator defined by the user; this
characteristic is discussed in more detail later on.
The properties editor also allows each of the parameters of the membership functions,
or fuzzy sets, defined for each variable to be modified in a very simple way.

c) Variables editor
It allows the membership functions of each linguistic variable defined in the system
to be edited.
By default, the FzController system implements membership functions of trapezoid type
(which includes triangular functions as a particular case), S-function, Z-function,
Pi-function, Gauss bell and singleton type (for a controller with a Sugeno structure).
In addition, the system allows users to define membership functions by means of
mathematical expressions or by a vector (Fig. 6).
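
Some of these shapes can be sketched in Python as follows. The chapter does not give the formulas FzController uses, so the definitions below are common textbook forms chosen as assumptions (in particular the quadratic-spline S-function and the parameterization of the Gauss bell), and all function names are illustrative.

```python
import math


def trapezoid(x, a, b, c, d):
    """Trapezoid with feet a, d and shoulders b, c (a <= b <= c <= d); b == c gives a triangle."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def s_function(x, a, b):
    """S-shaped function rising from 0 at a to 1 at b (quadratic spline form, a < b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    mid = (a + b) / 2.0
    if x <= mid:
        return 2.0 * ((x - a) / (b - a)) ** 2
    return 1.0 - 2.0 * ((x - b) / (b - a)) ** 2


def z_function(x, a, b):
    """Z-shaped function, the mirror image of the S-function."""
    return 1.0 - s_function(x, a, b)


def gauss_bell(x, centre, sigma):
    """Gaussian bell centred at `centre` with width `sigma`."""
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))


def singleton(x, position):
    """Singleton, as used for the consequents of a Sugeno-type controller."""
    return 1.0 if x == position else 0.0


if __name__ == "__main__":
    print(trapezoid(2.5, 0.0, 5.0, 5.0, 10.0), s_function(0.5, 0.0, 1.0), gauss_bell(3.0, 3.0, 1.5))
```

Setting b equal to c in trapezoid reproduces the triangular case mentioned in the text.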

Table 1 Default operators implemented with the FzController system

| Operator type | Default operators |
|---|---|
| Conjunction operators (connective AND) | t-norms: Minimum, Hamacher product, Product, Einstein product, Bounded difference, Drastic product |
| Disjunction operators (connective OR) | s-norms: Maximum, Hamacher sum, Bounded sum, Einstein sum, Algebraic sum, Drastic sum |
| Implication operators | Diene, Dubois and Prade, Mizumoto, Goguen, Golden, Lukasiewicz, besides the above t-norms and s-norms |
| Aggregation operators | t-norms: Minimum, Hamacher product, Product, Einstein product, Bounded difference, Drastic product; s-norms: Maximum, Hamacher sum, Bounded sum, Einstein sum, Algebraic sum, Drastic sum |
| Defuzzification methods | Center of Gravity, Bisector, Middle of Maxima, Last of Maxima, First of Maxima, Height |
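
For reference, the t-norms and s-norms named in Table 1 can be written in a few lines of Python. These are the standard textbook definitions; the chapter does not list the formulas, so treating them as the ones implemented in FzController is an assumption, and the function names are illustrative.

```python
def hamacher_product(a, b):
    return 0.0 if a == b == 0.0 else a * b / (a + b - a * b)


def einstein_product(a, b):
    return a * b / (2.0 - (a + b - a * b))


def bounded_difference(a, b):
    return max(0.0, a + b - 1.0)


def drastic_product(a, b):
    if b == 1.0:
        return a
    if a == 1.0:
        return b
    return 0.0


def hamacher_sum(a, b):
    return 1.0 if a == b == 1.0 else (a + b - 2.0 * a * b) / (1.0 - a * b)


def einstein_sum(a, b):
    return (a + b) / (1.0 + a * b)


def algebraic_sum(a, b):
    return a + b - a * b


def bounded_sum(a, b):
    return min(1.0, a + b)


def drastic_sum(a, b):
    if b == 0.0:
        return a
    if a == 0.0:
        return b
    return 1.0


def product(a, b):
    return a * b


# Ordered from the smallest to the largest t-norm, and likewise for the s-norms.
T_NORMS = [drastic_product, bounded_difference, einstein_product, product, hamacher_product, min]
S_NORMS = [max, hamacher_sum, algebraic_sum, einstein_sum, bounded_sum, drastic_sum]

if __name__ == "__main__":
    for op in T_NORMS + S_NORMS:
        print(op.__name__, round(op(0.5, 0.5), 4))
```

Every t-norm satisfies T(a, 1) = a and every s-norm S(a, 0) = a, which is a quick sanity check when a new operator is added to such a list.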

Fig. 6 Variables editor of FzController

d) Rules editor
The rule base of a fuzzy controller contains the information or logical connection
between the input and output linguistic variables of the system. The FzController
system works with MISO (Multiple Inputs, Single Output) rules and allows the
controller’s rules to be added in a simple way, without syntax errors.
Figure 7 shows the window of the rule base editor.

Fig. 7 Rule base editor of the system

e) Editor of operators defined by the user
Another distinctive feature of the FzController system is that users can define their
own fuzzy operators. The user can work with the most commonly used operators, implement
new ones, check the response of the system and make the necessary corrections. Figure 8
shows the window of the editor of user-defined operator code.
The editor has a graphic interface that allows the user to implement operators in the
programming languages Delphi Script, Visual Basic Script or Java Script, as well as to
check the syntax of the code and evaluate the operator. It is recommended that
operators be implemented in Delphi Script or Java Script to obtain faster computation.
It is important to emphasize that the user should have some minimum knowledge of
programming in order to implement operators.
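
As an illustration of what a user-defined operator and its evaluation might look like, the sketch below defines the Yager t-norm, a parametric conjunction that is not among the defaults of Table 1, and checks the t-norm axioms on a grid. It is written in Python for consistency with the other sketches; FzController itself expects Delphi Script, Visual Basic Script or Java Script, its scripting interface is not described in the chapter, and the names yager_tnorm and check_tnorm are hypothetical.

```python
def yager_tnorm(a, b, p=2.0):
    """Yager t-norm: 1 - min(1, ((1 - a)^p + (1 - b)^p)^(1/p)), with p > 0."""
    return 1.0 - min(1.0, ((1.0 - a) ** p + (1.0 - b) ** p) ** (1.0 / p))


def check_tnorm(op, grid=11, tol=1e-9):
    """Check boundary condition, commutativity, monotonicity and associativity on a grid."""
    pts = [i / (grid - 1) for i in range(grid)]
    for a in pts:
        if abs(op(a, 1.0) - a) > tol:                      # boundary condition T(a, 1) = a
            return False
        for b in pts:
            if abs(op(a, b) - op(b, a)) > tol:             # commutativity
                return False
            for c in pts:
                if b <= c and op(a, b) > op(a, c) + tol:   # monotonicity
                    return False
                if abs(op(op(a, b), c) - op(a, op(b, c))) > tol:   # associativity
                    return False
    return True


if __name__ == "__main__":
    print(check_tnorm(yager_tnorm))
    print(check_tnorm(lambda a, b: (a + b) / 2.0))   # the arithmetic mean is not a t-norm
```

A grid check of this kind cannot prove that an operator is a t-norm, but it catches typical programming mistakes before the operator is used in a controller.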

f) Graphic visualization of the rules and inference process
Once the controller has been designed, it is important to study its behavior. The
visualization of the activation degree of each rule for a given input vector is a very
important tool for analyzing the behavior of the designed controller (some systems,
like MATLAB, have implemented this tool, but others, like XFuzzy, do not yet have it).
Figure 9 shows the window of graphic visualization of the inference process.

Fig. 8 Editor of operators defined by the user

When users implement their own operators, they can analyze not only the behavior of an
operator by means of the editor of operators, but also its effect on the controller’s
response, by studying the inference process graphically. Through an exhaustive analysis
of the inference process, the designed operator can be corrected until the system gives
the desired response.
As can be seen in Figure 9, this screen allows not only the visualization of the
activated or implied fuzzy set of each rule, but also the display of the global fuzzy
set that results from the aggregation process.

g) Graphic representation of the system response, control surface
FzController can display the response of the controlled system graphically, by means
of the control surface or the input/output curve (Fig. 10).
Like the graphic analysis of the inference process, the analysis of the control surface
is an important tool for studying the behavior of the system. The control surface shows
all the possible values of the system response for any combination of the inputs. For
example, a very smooth control surface, without abrupt changes of the normal vector to
the surface, would indicate that the control signal will not be very oscillatory and
that small changes in the input values produce small changes in the output of the
system.

Fig. 9 Graphic visualization of the Rules and inference process

Through analysis of the control surface one can find the combinations of inputs for
which the behavior of the system is not the desired one, and fix it by changing the
rules, the operators or the membership functions of the linguistic variables.
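
A control surface of this kind can be computed by evaluating the controller on a grid of input combinations. The Python sketch below does so for a small zero-order Sugeno controller with two inputs, three triangular membership functions per input and nine rules. The labels, rule consequents and input ranges are hypothetical values chosen for the example, the AND connective is the product t-norm, and the largest jump between neighbouring grid points is used as a crude smoothness indicator.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical linguistic labels (Negative, Zero, Positive) on the input range [-1, 1].
LABELS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}
PEAK = {"N": -1.0, "Z": 0.0, "P": 1.0}
# Nine rules: the singleton consequent is the mean of the two label peaks (an assumption).
RULES = {(le, ld): (PEAK[le] + PEAK[ld]) / 2.0 for le in LABELS for ld in LABELS}


def sugeno(e, de):
    """Zero-order Sugeno output for inputs e and de, using the product as AND connective."""
    num = den = 0.0
    for (le, ld), out in RULES.items():
        w = tri(e, *LABELS[le]) * tri(de, *LABELS[ld])
        num += w * out
        den += w
    return num / den if den > 0.0 else 0.0


def control_surface(n=21):
    """Evaluate the controller on an n-by-n grid over [-1, 1] x [-1, 1]."""
    axis = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [[sugeno(e, de) for de in axis] for e in axis]


def largest_step(surface):
    """Largest output change between neighbouring grid points."""
    steps = []
    for i, row in enumerate(surface):
        for j, value in enumerate(row):
            if i + 1 < len(surface):
                steps.append(abs(surface[i + 1][j] - value))
            if j + 1 < len(row):
                steps.append(abs(row[j + 1] - value))
    return max(steps)


if __name__ == "__main__":
    print(round(largest_step(control_surface()), 4))
```

A large value of largest_step at some grid cell points to the combination of inputs where the rules, operators or membership functions may need to be revised.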

3.3 Real-Time Control Module

When a controller has been designed for an industrial plant or process, it is important
to check the results obtained not only in simulation but also with the real plant. The
purpose of the real-time control module is to control a real plant experimentally, with
the objective of improving the designed controller before synthesizing it for a PLC.
A data acquisition card is used to interact with the plant or process.

Fig. 10 Control surface of the designed system

The general-purpose character of the system is also evident here, since it can work
with any data acquisition device. The data acquisition card is accessed by means of a
dynamic link library (DLL) that may be programmed and added by the final user of the
system.
The real-time control module also carries out signal conditioning, which consists of
filtering, scale adjustment, or obtaining new signals as the result of a mathematical
transformation or operation on the signals read directly from the process (Fig. 11).
FzController can apply filtering algorithms, scale adjustment, and derivative, error
and integration functions to the signals read from the physical process or sent to the
process by means of the data acquisition device.
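
The conditioning chain described above can be sketched as follows. This Python example is an assumption-laden illustration rather than FzController code: the class name SignalConditioner, the first-order low-pass filter, the linear scaling and the rectangular integration rule are all choices made for the example.

```python
class SignalConditioner:
    """Scale, filter and derive the error, its derivative and its integral (illustrative)."""

    def __init__(self, dt, alpha=1.0, gain=1.0, offset=0.0):
        self.dt = dt            # sampling period in seconds
        self.alpha = alpha      # low-pass coefficient in (0, 1]; 1.0 disables filtering
        self.gain = gain        # scale adjustment: engineering value = gain * raw + offset
        self.offset = offset
        self.filtered = None
        self.prev_error = None
        self.integral = 0.0

    def update(self, raw, setpoint):
        """Process one raw sample and return (error, derivative of error, integral of error)."""
        scaled = self.gain * raw + self.offset
        if self.filtered is None:
            self.filtered = scaled
        else:
            self.filtered = self.alpha * scaled + (1.0 - self.alpha) * self.filtered
        error = setpoint - self.filtered
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.integral += error * self.dt
        self.prev_error = error
        return error, derivative, self.integral


if __name__ == "__main__":
    conditioner = SignalConditioner(dt=0.1, alpha=0.5, gain=2.0)
    for raw in (1.0, 1.5, 2.0, 2.2):
        print(conditioner.update(raw, setpoint=5.0))
```

The error and its derivative are the typical inputs of a two-input fuzzy controller, which is why they are derived here from a single measured signal.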

Fig. 11 Real-time control module

3.4 Automatic Generation of Codes Module

Nowadays most industrial applications developed using fuzzy logic or fuzzy systems are
implemented using PLCs with fuzzy processing modules or customized hardware [11, 12].
With the module of automatic generation of code for PLCs it is possible to implement
the designed controller in the PLC’s own code. This feature allows as many fuzzy
controllers as desired to be implemented without the need to incorporate an additional
fuzzy processing module. To this end, the FzController system is able to generate the
controller’s code in structured text, using functions of the standard IEC 61131-3.
The standard IEC 61131-3 [14] has had a great impact on the world of industrial
control, and this is not restricted to the conventional PLC market. Its adoption
provides many benefits for users and programmers, depending on the application area:
process control, system integration, education, programming, maintenance and system
installation, among others. IEC 61131-3 is the result of the great effort carried out by
seven multinational companies with many years of experience in the field of industrial
automation. The standard specifies the syntax and semantics of programming languages
for PLCs, among them Structured Text, including the software model and the structure of
the language. Structured Text (ST) is a high-level language with origins in Ada, Pascal
and C. It may be used to code complex expressions and nested instructions. The language
has structures for loops (REPEAT-UNTIL; WHILE-DO), conditional execution
(IF-THEN-ELSE; CASE) and functions (SQRT, SIN, etc.). The generated code makes use of a
library of functions that implements all the operators and membership functions the
system works with. This library of functions is freely distributed. Initially it was
developed for “Panasonic” PLCs (formerly NAIS). Being standard code, it is in principle
valid for any PLC whose development environment incorporates the standard IEC 61131-3.
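
To give an idea of what generated Structured Text can look like, the Python sketch below emits an ST function for a triangular membership function. It does not reproduce the code or the function library that FzController actually generates, which the chapter shows only in Figure 13; the generator emit_triangle_st, the function name MF_LOW and the parameter values are hypothetical.

```python
def emit_triangle_st(name, a, b, c):
    """Return IEC 61131-3 Structured Text for a triangular membership function (a < b < c)."""
    if not a < b < c:
        raise ValueError("expected a < b < c")
    return "\n".join([
        f"FUNCTION {name} : REAL",
        "VAR_INPUT",
        "    x : REAL;",
        "END_VAR",
        f"IF (x <= {a:.3f}) OR (x >= {c:.3f}) THEN",
        f"    {name} := 0.0;",
        f"ELSIF x <= {b:.3f} THEN",
        f"    {name} := (x - ({a:.3f})) / {b - a:.3f};",
        "ELSE",
        f"    {name} := (({c:.3f}) - x) / {c - b:.3f};",
        "END_IF;",
        "END_FUNCTION",
    ])


if __name__ == "__main__":
    print(emit_triangle_st("MF_LOW", 0.0, 5.0, 10.0))
```

Because the membership function parameters are written into the code as constants, the emitted function needs no run-time data structures, which suits the limited memory of a small PLC.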
Table 2 Data of the test carried out

| Characteristics of the implemented system | |
|---|---|
| PLC type | Panasonic FP0 C14 |
| Implemented controller | Sugeno type |
| Number of input variables | 2 |
| Number of output variables | 1 |
| Membership functions | 3 for each variable |
| Number of rules | 9 |
| Results | |
| Time of SCAN cycle | 25.4 ms |
| Length of the program | 100 CPU instructions |

Fig. 12 Module of code generation for PLC

Table 2 shows the results obtained in the implementation of a system using a PLC of
the FP0 series from Panasonic (NAIS) [14].
It is important to keep in mind that, until now, implementing fuzzy control systems
with a PLC required incorporating a fuzzy processing module. These modules are usually
expensive, and they are limited in the number of membership functions and operators
available. The fuzzy processing modules marketed so far carry out very quick processing
and do not consume time of the CPU SCAN cycle, because they are independent processing
units. The code implemented with FzController does consume SCAN cycle time, because it
is a software implementation, and this should be taken into account by the programmer
of the system.
Figures 12 and 13 show the system’s code generation option and the generated code.

Fig. 13 Module of code generation for PLC. Generated code

The introduction of this module, which is the fundamental distinctive characteristic of
the system with regard to other well-known systems, offers the following advantages for
implementing control systems that apply fuzzy logic in PLCs:
• An unlimited number of operators and fuzzy sets that can be used
• Versatility of the generated code
• The possibility of implementing as many controllers as desired, if the limitations of
the CPU and the SCAN cycle allow it
• Lower cost than other existing solutions
• The possibility of implementing fuzzy control systems in industrial plants that are
already operative, with no new investment
• Shorter development times for final applications

4 Conclusions

This chapter has presented a prototype system, called FzController, which
constitutes an important tool for the development, implementation and real-time
control of a plant using a fuzzy controller. The system offers many tools that
cover, to a large extent, the stages of specification, verification and synthesis
in the design of a fuzzy control system.
At present FzController is distributed free of charge (upon request to the authors)
and runs on any Windows OS. The authors are working on a version for Linux, as
well as on the incorporation of learning methods and automatic adjustment of the
controller.
FzController can generate code for PLCs. The generated code complies with the
IEC 61131-3 standard. The initial work targeted Panasonic PLCs; current work
targets Siemens PLCs. Because the code follows a standard, it should be valid for
any PLC that complies with that standard. Work is also under way on code
generation for high-level languages.

References

1. L.A. Zadeh. Fuzzy Sets. Information and Control 8 pp. 338–358, 1965
2. P. Bonissone. Fuzzy Logic and Soft Computing. Technology Development and Applications.
GE Technical Report, 1997
3. http://www.imse.cnm.es/Xfuzzy/
4. F.J. Moreno, I. Baturone, S. Sánchez and A. Barriga. Rapid design of fuzzy systems with
XFUZZY. IEEE International Conference on Fuzzy Systems FUZZ-IEEE, pp. 342–347, 2003
5. D.R. López, S. Sánchez-Solano and A. Barriga. XFL: a fuzzy logic systems language. Sixth
IEEE International Conference on Fuzzy Systems 3, pp. 1585–1591, 1997
6. J. Schwarz. Motorola microcontroller as the platform for fuzzy applications. In Scientific Inter-
national Conference on Communications, Signal and Systems CSS’96, Brno, Czech Republic,
AMSE, Sept. 1996
7. O. Cordón, F. Herrera and A. Peregrı́n. A Practical Study on the Implementation of Fuzzy
Logic Controllers. The International Journal of Intelligent Control and Systems 3, pp. 49–91,
1991
8. J.L. Castro. Fuzzy Logic Controllers are Universal Approximators. IEEE transactions on Sys-
tems, Man and Cybernetics 25, pp. 629–635, 1995
9. J.L. Castro and M. Delgado. Fuzzy Systems with Defuzzification are Universal Approximators.
IEEE transactions on Systems, Man and Cybernetics. 26, pp. 149–152, 1996
10. S.K. Narendra and K. Parthasarathy. Identification and control of dynamical systems using
neural networks. IEEE Transaction on Neural Networks 1 (1), pp. 4–27, 1990
11. J. Balcells and J.L. Romeral. Programable Automata. Marcombo, Madrid (In Spanish), 1997
12. U. Michel. Industry Programable Automata. Marcombo, Madrid (In Spanish), 1990
13. AENOR: “UNE-EN-61131-1,2,3”, 1994
14. http://www.nais-e.com/plc/uacs/plc dl manual.html
A Consistency Criterion for Optimizing
Defuzzification in Fuzzy Control

Hyei Kyung Lee, Eric Paillet, and Werner Peeters

Abstract Throughout the literature on fuzzy control, various defuzzification
methods have been proposed and classified according to their properties, such as
continuity, scale-invariance, core consistency and many more. However, the choice
of a suitable defuzzification operator still remains an arbitrary one. We do not
claim to have found the “perfect defuzzifier”, but in this article we add one
particular new criterion, which we call the Consistency Criterion. It can be used
to measure the suitability of a given defuzzification process, or at least to
compare several defuzzification operators, including parametric classes with
infinitely many members that contain some very commonly used ones. The class used
most throughout this text is the BADD-defuzzification of D.P. Filev and R.R. Yager
([3]). A surprising result is that the minima of a measure of non-consistency,
taken with respect to the parameters occurring in the rule base or other
parameters of the problem, are not reached at the most “natural” parameter values,
but at some surprising, transcendental numbers.

Keywords: fuzzy control, defuzzification, consistency, antecedent rule base

1 Introduction

Fuzzy control ([16]) is used in a wide range of applied sciences, including physics,
electronics and economics. It is based on the concept of fuzzy sets as introduced
by L.A. Zadeh ([14] and [15]), which extends the notion of membership function
from a two-valued logic to one in which the values vary continuously within
I = [0, 1]. The reason for its vast success is its fairly simple computational
behaviour; its obvious weakness, as is well known, is the inherently heuristic
nature of the design of such a fuzzy controller. The wide choice of shapes and
parameters for the control variables shows the need for a solid mathematical
foundation, next to some obvious heuristic constraints which the controlled system
has to satisfy.

Hyei Kyung Lee, Eric Paillet, and Werner Peeters
University of Antwerp, Dept. of Mathematics and Computer Science, Middelheimlaan 1, B-2020
Antwerp, Belgium, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization,
Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 403–431.
© 2008 Springer.

As anyone familiar with the basic concept of fuzzy control knows, three key
issues in the design of a fuzzy control system are:
• The choice of a suitable set of fuzzy variables, being functions defined on the
space in which control measurements are performed. Mostly these will be functions
$\alpha$ from $\mathbb{R}$ (or a part thereof) to $I$. Either the space
$\mathbb{R}$ or an interval $[a, b]$ will be denoted by $X$.
• The choice of an implication function, or, equivalently, a set of linguistic rules,
each of the type

IF $(X_1 = A_1)$ and ... and $(X_n = A_n)$ THEN $(Y = B)$

where the variables $X_i$ are linguistic and linked to the fuzzy membership
sets $\alpha_i$, coupled with an aggregation function to combine the consequences
of these assertions, and
• The choice of a suitable defuzzification method, assigning one crisp value to
the aggregated consequence function.
A fuzzy controller is then designed by choosing a suitable combination of the
three above. As for the fuzzy variables, every input variable of the fuzzy
controller has a finite collection of rule antecedents consisting of fuzzy functions,
which we will denote by $\{\alpha_i : X \to I\}_{i=1}^n$, and a consequence rule
$\beta : X \to I$.
Definition 1. For one such function $\alpha : X \to I$, the support will be defined as

\[ \operatorname{supp} \alpha := \{x \in X : \alpha(x) > 0\}. \]

Definition 2. A collection of fuzzy sets $\{\alpha_i : X \to I\}_{i=1}^n$ will be called disjoint if

\[ \forall i \neq j \in \{1, \dots, n\} : \operatorname{supp} \alpha_i \cap \operatorname{supp} \alpha_j = \emptyset. \]

If two rule antecedents $\alpha_i$ and $\alpha_j$ are not disjoint, they will be called overlapping.
Definition 3. On the other hand, the core of a fuzzy set $\alpha : X \to I$ will be defined
as

\[ \operatorname{core} \alpha := \Big\{ x \in X : \alpha(x) = \sup_{y \in X} \alpha(y) \Big\}. \]

Definition 4. Given two non-disjoint fuzzy sets $\alpha_i$ and $\alpha_j$, we will say that
$\alpha_i$ supercentrally overlaps $\alpha_j$ if and only if
$\operatorname{core} \alpha_i \subseteq \operatorname{supp} \alpha_j$, and $\alpha_i$
subcentrally overlaps $\alpha_j$ if and only if
$\operatorname{core} \alpha_i \cap \operatorname{supp} \alpha_j = \emptyset$. If $\alpha_i$
overlaps $\alpha_j$ both subcentrally and supercentrally, we say that $\alpha_i$
centrally overlaps $\alpha_j$.
Fig. 1 α1 subcentrally overlaps α2, but α2 supercentrally overlaps α1

As can be seen in Figure 1, these notions of overlapping need not be symmetric.
The set of all such collections of rule antecedents $\Xi = \{\alpha_i : X \to I\}_{i=1}^n$ shall be
denoted by $\mathcal{P}^*(F(X))$, being the collection of all finite subsets of $F(X)$, the fuzzy
sets on $X$.
Definition 5. A collection of rule antecedents $\Xi$ will be called a partition of unity if
and only if

\[ \forall x \in X : \sum_{i=1}^{n} \alpha_i(x) = 1. \]

The consequence functions can be considered as members of the same set. As
for the implication, following E.H. Mamdani et al. in [6], suppose each rule is of the
type

\[ r : \text{IF } (X_1 = A_1^{j_1}) \text{ and } \dots \text{ and } (X_n = A_n^{j_n}) \text{ THEN } (Y = B_j), \]

where $A_i^{j_i}$ is the value of the $j_i$-th term of the linguistic variable $i$ corresponding to
the antecedent membership function $\alpha_i^{j_i}$, and $B_j$ is the value of the $j$-th term of the
linguistic variable corresponding to the consequence membership function $\beta_j$. Then
the aggregation of the rules is made by calculating

\[ k_r(x) := \bigwedge_{i=1}^{n} \alpha_i^{j_i}(x_i) \]

for each of the input vectors $x = (x_1, \dots, x_n)$, and determining the consequence fuzzy
set as

\[ \mu_x(y) := \rho(x, y) = \bigvee_{r} \left( \beta_j(y) \wedge k_r(x) \right). \]

While this operation is commonly referred to as a fuzzy implication, we would
like to stress that we are in fact taking the cartesian product, as seen in Figure 2.
Hence this should be interpreted as a fuzzy relation rather than as a fuzzy logical
implication.
Of course, many more implication operators can be considered, as described by
D. Dubois et al. in [1] and [2] and by D. Ruan et al. in [8]. An interesting
kind of controller in which the defuzzification process is incorporated in the rule
base, being an alteration of the method proposed by E.H. Mamdani et al. in [6], is
given by M. Sugeno in [11].
Fig. 2 The cartesian product

The most crucial step in the construction of a fuzzy controller, however, is the
defuzzification method. Since in technical applications it will not always be possible
to make a machine-based decision based on a fuzzy function, a method has to be
chosen to select a suitable “representative” crisp value, assigned to the fuzzy output.
At some stage in the adaptive process a decision has to be taken on how to adjust the
system, which requires one output value. Generally, a method that associates
a value D(µ) ∈ X with any fuzzy function µ ∈ F(X) is called a defuzzification
method. Several defuzzification techniques have been studied extensively; for
a good overview we refer to the articles of T.A. Runkler et al. [10] and W. Van
Leekwijck et al. [12]. However, in order to be able to apply various scaling
arguments that reduce our calculations, it is advisable to take into account only
defuzzification methods that are, in a certain way, compatible with linear
transformations on R. More concretely, the following conditions should hold:

1 The defuzzification value should be independent of any positive affine
transformation applied to the values in the range space $I$. Stated differently, for all
$\mu \in F(X)$, for all $a \in \mathbb{R}_0^+$ and $b \in \mathbb{R}$, define

\[ a\mu + b : X \to I : x \mapsto a\mu(x) + b \]

(of course on condition that this is well defined). Then the defuzzification value
should not change, or, in other words,

\[ \forall \mu \in F(X) \text{ such that } a\mu + b \in F(X) : D(a\mu + b) = D(\mu). \]

A defuzzifier $D$ that satisfies this property will be called ordinal scale-invariant
([7], [9], nicely summarized in [12]).
2 Any positive affine transformation on $X$ should induce the inverse affine
transformation on the defuzzification value. Stated differently, for all $\mu \in F(X)$, for
all $a \in \mathbb{R}_0$ and $b \in \mathbb{R}$, define

\[ \mu^{a,b} : X \to I : x \mapsto \mu\left( \frac{x - b}{a} \right) \]

(of course again on condition that this is well defined). Then the defuzzification
value should be

\[ D\left( \mu^{a,b} \right) = a D(\mu) + b. \]

A defuzzifier $D$ that satisfies this property will be called universe scale-invariant
([9]).
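
As a brief illustration of the second condition, the following sketch checks it for the center of gravity defuzzifier $D^{COG}$ from the list that follows. It assumes $X = \mathbb{R}$, $a > 0$, and that $\int \mu$ is finite and nonzero; these assumptions are added here for the sake of the computation and are not stated in the text. Substituting $y = at + b$ gives

\[
D^{COG}\left(\mu^{a,b}\right)
= \frac{\int_{\mathbb{R}} y\, \mu\left(\frac{y-b}{a}\right) dy}{\int_{\mathbb{R}} \mu\left(\frac{y-b}{a}\right) dy}
= \frac{\int_{\mathbb{R}} (at + b)\, \mu(t)\, a\, dt}{\int_{\mathbb{R}} \mu(t)\, a\, dt}
= a\, \frac{\int_{\mathbb{R}} t\, \mu(t)\, dt}{\int_{\mathbb{R}} \mu(t)\, dt} + b
= a\, D^{COG}(\mu) + b.
\]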

Some of the most commonly used defuzzification methods satisfying these conditions include:

• The first of maxima defuzzification $D^{FOM}$ is a function that maps $\mu \in F(X)$ to

\[ D^{FOM}(\mu) = \inf \Big\{ y \in X : \mu(y) = \sup_{z \in X} \mu(z) \Big\} \]

• The last of maxima defuzzification $D^{LOM}$ is a function that maps $\mu \in F(X)$ to

\[ D^{LOM}(\mu) = \sup \Big\{ y \in X : \mu(y) = \sup_{z \in X} \mu(z) \Big\} \]

• The middle of maxima defuzzification $D^{MOM}$ is a function that maps $\mu \in F(X)$ to

\[ D^{MOM}(\mu) = \frac{D^{LOM}(\mu) + D^{FOM}(\mu)}{2} \]

• The middle of support defuzzification $D^{MOS}$ is a function that maps $\mu \in F(X)$ to

\[ D^{MOS}(\mu) = \frac{\inf \{y \in X : \mu(y) > 0\} + \sup \{y \in X : \mu(y) > 0\}}{2} \]

• The center of gravity defuzzification $D^{COG}$ is a function that maps $\mu \in F(X)$ to

\[ D^{COG}(\mu) = \frac{\int_X y\, \mu(y)\, dy}{\int_X \mu(y)\, dy} \]

$D^{COG}$ is perhaps the most commonly used defuzzification method, although it
relies heavily on interpreting the membership function as a probability
distribution, which, strictly theoretically speaking, is not so evident. In [3],
D.P. Filev and R.R. Yager considered this as one particular case of a more general
parametric family of probability distributions.
• The basic defuzzification distributions $D^{BADD}(-, \gamma)$ are a parametric family,
with a parameter $\gamma \in \mathbb{R}^+$, of functions that map $\mu \in F(X)$ to

\[ D^{BADD}(\mu, \gamma) = \frac{\int_X y\, \mu^{\gamma}(y)\, dy}{\int_X \mu^{\gamma}(y)\, dy} \]
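
The definitions above can be tried out numerically. The following is a minimal sketch, not taken from the text: it assumes that µ is given by samples on an increasing, equally spaced grid, so that plain sums stand in for the integrals, and the function name `defuzzify` and the tolerance `tol` are hypothetical. A fuzzy set that is identically zero is not handled (the text calls that case undetermined).

```python
def defuzzify(ys, mu, gamma=1.0, tol=1e-12):
    """Approximate the listed defuzzifiers from samples mu[i] = mu(ys[i]).

    Assumption: ys is an increasing, equally spaced grid, so sums replace the
    integrals. gamma = 1 turns the BADD value into the COG value.
    """
    top = max(mu)
    maxima = [y for y, m in zip(ys, mu) if m >= top - tol]   # sampled argmax set
    support = [y for y, m in zip(ys, mu) if m > 0]           # sampled support
    fom, lom = maxima[0], maxima[-1]
    weights = [m ** gamma for m in mu]
    badd = sum(y * w for y, w in zip(ys, weights)) / sum(weights)
    return {
        "FOM": fom,
        "LOM": lom,
        "MOM": (fom + lom) / 2,
        "MOS": (support[0] + support[-1]) / 2,
        "BADD": badd,
    }


# Illustrative input: symmetric triangle on [2, 4], clipped at height 0.3.
n = 2000
ys = [2 + 2 * i / n for i in range(n + 1)]
mu = [min(0.3, max(1 - abs(1 - (y - 2)), 0.0)) for y in ys]
result = defuzzify(ys, mu)              # gamma = 1, i.e. COG
sharper = defuzzify(ys, mu, gamma=5.0)  # larger gamma emphasises the plateau
print(result)
```

For this symmetric input all five values cluster around the midpoint 3, except FOM and LOM, which sit at the two ends of the plateau.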

Some interesting parametric classes of defuzzifiers and new combinations, apart
from the aforementioned ones, are given by D.P. Filev et al. in [3]; some other
interesting classes can be found in R.R. Yager et al. [13].
One of the other obvious criteria a defuzzifier has to satisfy, which we have not
mentioned yet, is continuity: when the rule antecedents are modified only slightly,
this should not drastically affect the output of the defuzzifier. It has been studied
extensively that, for instance, the MOM defuzzifier is not continuous, unlike the
COG defuzzifier. However, in doing so, one has to assume that F(X) carries some
topology to describe the distance between two fuzzy sets. Another, much simpler
criterion to evaluate the effectiveness of a defuzzification method is the following.
Suppose one has a finite collection of fuzzy sets as rule antecedents. Each rule
antecedent consists of only one fuzzy set α : X −→ I. Suppose that each consequence
function β : X −→ I is obtained as the image of the fuzzy set through a mapping
f : X −→ X. Following the definition of the image of a fuzzy set in [14],

β (y) = sup {α (x) : f (x) = y}

Ideally, the following assumptions should hold for all applicable f : X −→ X, but
one can easily see that this demand is far too strict; an appropriate choice of such
f has to be made. In order to achieve a certain degree of regularity, we choose for
f : X −→ X the most obvious function. A study of all possible such functions is
well beyond the scope of this article, although it is an interesting topic for further
research; in the remainder of this article we will suppose that f is the identity
mapping. So with each rule antecedent α : X −→ I, we associate the very same
fuzzy set as fuzzy consequence, and this for each rule.
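The function $\theta : X \times \mathcal{P}^*(F(X)) \to X$ associates with each input value $x$ and
each antecedent (and consequent) fuzzy set collection $\Xi = \{\alpha_i\}_{i=1}^n$ in
$\mathcal{P}^*(F(X))$ an output $\theta(x, \Xi) = D^{***}(\mu_x)$ for some fixed choice of a
defuzzifier $***$, where $\mu_x : X \to I$ is derived from the rule base
$\Xi = \{\alpha_i : X \to I\}_{i=1}^n$ as described above.
Schematically, $\theta$ factors through $F(X)$:

\[ \theta : X \times \mathcal{P}^*(F(X)) \xrightarrow{\ \theta^*\ } F(X) \xrightarrow{\ D^{***}\ } X, \qquad (x, \Xi) \mapsto \mu_x \mapsto D^{***}(\mu_x). \]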

Ideally, we look for a Ξ and a D such that θ(·, Ξ) : X −→ X is identical to the
function f we started with; in this case, θ(·, Ξ) should ideally be the identity function.
We find however that this is almost never the case, apart perhaps from some
degenerate situations which never occur in practice. While this is understandable in
the case of a discontinuous defuzzifier such as MOM, it is surprising to see that a
continuous defuzzifier such as COG does not satisfy this property either. The
continuity of the restriction of the function θ to some fixed rule base in P∗(F(X)),
as a function X −→ X, does not require a topology on F(X), but depends strongly
on the choice of the element of P∗(F(X)). Of course, it is only useful to consider as
elements of P∗(F(X)) fuzzy rule bases which are effectively used for fuzzy control
purposes.

2 MOM- and COG-defuzzification

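The triangular-shaped fuzzy set passing through the points $(a, 0)$, $(c, 1)$ and $(b, 0)$
with $c \in ]a, b[$ is defined by

\[
\alpha(x) = \begin{cases}
\dfrac{x - a}{c - a} & \text{if } x \in [a, c] \\[1ex]
\dfrac{x - b}{c - b} & \text{if } x \in [c, b] \\[1ex]
0 & \text{otherwise}
\end{cases}
\]

The fuzzy set is called symmetric if and only if $c = \frac{a+b}{2}$. In that case it can also be
written in either of the following two forms:

\[
\alpha(x) = \left\{ \begin{array}{ll}
2\,\dfrac{x - a}{b - a} & \text{if } x \in \left[a, \frac{a+b}{2}\right] \\[1ex]
2\,\dfrac{x - b}{a - b} & \text{if } x \in \left[\frac{a+b}{2}, b\right] \\[1ex]
0 & \text{otherwise}
\end{array} \right\}
= \left( 1 - \left| 1 - 2\,\frac{x - a}{b - a} \right| \right) \vee 0
\]

In this section, we will study the defuzzification behaviour of some common
fuzzy controllers.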
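
The two forms of the symmetric triangular fuzzy set can be compared numerically. This is a minimal sketch; the function names `triangle` and `symmetric_triangle` and the sample values a = 2, b = 4 are illustrative choices, not part of the text.

```python
def triangle(x, a, c, b):
    """Piecewise definition: triangle through (a, 0), (c, 1) and (b, 0)."""
    if a <= x <= c:
        return (x - a) / (c - a)
    if c < x <= b:
        return (x - b) / (c - b)
    return 0.0


def symmetric_triangle(x, a, b):
    """Closed form (1 - |1 - 2 (x - a) / (b - a)|) v 0 for c = (a + b) / 2."""
    return max(1 - abs(1 - 2 * (x - a) / (b - a)), 0.0)


a, b = 2.0, 4.0
c = (a + b) / 2
# Largest deviation between the two forms on a grid covering [1, 5].
worst = max(
    abs(triangle(x, a, c, b) - symmetric_triangle(x, a, b))
    for x in (1 + 4 * i / 400 for i in range(401))
)
print(worst)
```

The closed form is convenient in the calculations of this section because it needs no case distinction.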

2.1 Single Controller

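Let $\Xi \in \mathcal{P}^*(F([a, b]))$ be the fuzzy controller whose antecedent rule base consists
only of the fuzzy set $\alpha : [a, b] \to I$ passing through the points $(a, 0)$,
$\left(\frac{a+b}{2}, 1\right)$ and $(b, 0)$, as seen in Figure 3. In that case, it is easy to see that

\[
\mu_x(y) = \alpha(x) \wedge \alpha(y)
= \left( 1 - \left[ \left| 1 - 2\,\frac{x - a}{b - a} \right| \vee \left| 1 - 2\,\frac{y - a}{b - a} \right| \right] \right) \vee 0
\]

If $x \in \left[a, \frac{a+b}{2}\right]$, then $\alpha(x) = 2\,\frac{x - a}{b - a}$ and hence

\[
\mu_x(y) = \begin{cases}
\dfrac{2}{b - a} \big( (x - a) \wedge (y - a) \big) & \text{if } y \in \left[a, \frac{a+b}{2}\right] \\[1ex]
\dfrac{2}{b - a} \big( (x - a) \wedge (b - y) \big) & \text{if } y \in \left[\frac{a+b}{2}, b\right]
\end{cases}
\]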
Fig. 3 Single controller

Fig. 4 Output of a single controller

If on the other hand $x \in \left[\frac{a+b}{2}, b\right]$, then analogously

\[
\mu_x(y) = \begin{cases}
\dfrac{2}{b - a} \big( (b - x) \wedge (y - a) \big) & \text{if } y \in \left[a, \frac{a+b}{2}\right] \\[1ex]
\dfrac{2}{b - a} \big( (b - x) \wedge (b - y) \big) & \text{if } y \in \left[\frac{a+b}{2}, b\right]
\end{cases}
\]

An example of the consequence of such a controller with a = 2, b = 4 and x = 2.3
can be seen in Figure 4. As for the First Of Maxima defuzzification, it is easy to see
that

\[
D^{FOM}(\mu_x) = \begin{cases}
x & \text{if } x \in \left[a, \frac{a+b}{2}\right] \\[0.5ex]
b + a - x & \text{if } x \in \left[\frac{a+b}{2}, b\right]
\end{cases}
\]

and analogously,

\[
D^{LOM}(\mu_x) = \begin{cases}
b + a - x & \text{if } x \in \left[a, \frac{a+b}{2}\right] \\[0.5ex]
x & \text{if } x \in \left[\frac{a+b}{2}, b\right]
\end{cases}
\]

Hence in both cases, $D^{MOM}(\mu_x) = \frac{a+b}{2}$. The MOM-defuzzification is a constant
function, hence certainly not the identity. As far as the COG-defuzzification is
concerned, fixing an $x \in \left]a, \frac{a+b}{2}\right]$, we obtain through a simple calculation that

\[
\int_a^b \mu_x(y)\, dy
= \frac{2}{b - a} \left( \int_a^x (y - a)\, dy + \int_x^{b+a-x} (x - a)\, dy + \int_{b+a-x}^b (b - y)\, dy \right)
= \frac{2 (x - a)(b - x)}{b - a},
\]

while on the other hand,

\[
\int_a^b y\, \mu_x(y)\, dy = \int_a^b y \big( \alpha(x) \wedge \alpha(y) \big)\, dy = \frac{(a + b)(x - a)(b - x)}{b - a}
\]

and hence $D^{COG}(\mu_x) = \frac{a+b}{2}$, regardless of $x$. For symmetry reasons, the same
holds for $x \in \left[\frac{a+b}{2}, b\right[$; in $a$ and $b$, $D^{COG}(\mu_x)$ is undetermined
$\left(\frac{0}{0}\right)$. A single controller thus has a constant function as its defuzzification
function, regardless of which of the two above defuzzification methods is used.
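
The closed forms above can be checked numerically. The sketch below is an illustration only: it approximates the integrals with the midpoint rule, and the names `alpha` and `area_and_cog`, the grid size and the sample inputs are assumptions made here, not part of the text.

```python
def alpha(x, a, b):
    """Symmetric triangular membership function on [a, b]."""
    return max(1 - abs(1 - 2 * (x - a) / (b - a)), 0.0)


def area_and_cog(x, a, b, n=20000):
    """Midpoint-rule estimates of the area under mu_x and of D^COG(mu_x)."""
    h = (b - a) / n
    height = alpha(x, a, b)
    area = moment = 0.0
    for i in range(n):
        y = a + (i + 0.5) * h
        m = min(height, alpha(y, a, b))   # mu_x(y) = alpha(x) ^ alpha(y)
        area += m * h
        moment += y * m * h
    return area, moment / area


a, b = 2.0, 4.0
for x in (2.3, 2.9, 3.4):
    area, cog = area_and_cog(x, a, b)
    closed_form = 2 * (x - a) * (b - x) / (b - a)
    print(x, round(area, 6), round(closed_form, 6), round(cog, 6))
```

For every input strictly between a and b the estimated centre of gravity stays at (a + b)/2, which is the constant behaviour derived above.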

2.2 Two Single Disjoint Controllers

Let $\Xi \in \mathcal{P}^*(F([a, c]))$ be the fuzzy controller with as antecedent rule base the
fuzzy sets
• $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\left(\frac{a+b}{2}, 1\right)$ and $(b, 0)$;
• $\alpha_2 : [b, c] \to I$ passing through the points $(b, 0)$, $\left(\frac{b+c}{2}, 1\right)$ and $(c, 0)$;
with $a < b < c$, as seen in Figure 5. The consequence is then given by

\[ \mu_x(y) = \max_{i=1}^{2} \big( \alpha_i(x) \wedge \alpha_i(y) \big). \]

It is easy to verify that if the rule antecedents are disjoint, the MOM- and
COG-defuzzifications both behave as a single controller on their respective
domains; we omit these calculations, as they are straightforward.

Fig. 5 Two disjoint controllers

Fig. 6 Defuzzification of two disjoint single controllers

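As a result, the MOM-defuzzification is given by

\[
D^{MOM}(\mu_x) = \begin{cases}
\frac{a+c}{2} & \text{if } x \in \{a, b, c\} \\[0.5ex]
\frac{a+b}{2} & \text{if } x \in ]a, b[ \\[0.5ex]
\frac{b+c}{2} & \text{if } x \in ]b, c[
\end{cases}
\]

as seen in Figure 6, and the COG-defuzzification by

\[
D^{COG}(\mu_x) = \begin{cases}
\text{undetermined} & \text{if } x \in \{a, b, c\} \\[0.5ex]
\frac{a+b}{2} & \text{if } x \in ]a, b[ \\[0.5ex]
\frac{b+c}{2} & \text{if } x \in ]b, c[
\end{cases}
\]

The two are hence identical except on a negligible set. It is easy to extend this result
to a finite number of single disjoint controllers.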

2.3 Two Subcentrally Overlapping Controllers

Let $\Xi \in \mathcal{P}^*(F([a, d]))$ be the fuzzy controller with as antecedent rule base the
overlapping fuzzy sets
• $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\left(\frac{a+b}{2}, 1\right)$ and $(b, 0)$;
• $\alpha_2 : [c, d] \to I$ passing through the points $(c, 0)$, $\left(\frac{c+d}{2}, 1\right)$ and $(d, 0)$.

Initially, we have to distinguish between two cases: $\frac{a+b}{2} < c$ (subcentrally
overlapping) and $c < \frac{a+b}{2}$ (supercentrally overlapping), although we will prove later on
that those yield the same result. Put $\frac{a+b}{2} < c$, as seen in Figure 7. The consequence
then depends on $x$ as follows:
A Consistency Criterion for Optimizing Defuzzification in Fuzzy Control 413

x∈ α1 (x) α2 (x) µx (y)


⎧  


2((x−a)∧(y−a))
if y ∈ a, a+b
 a+b  ' x−a ( b−a  2 
2((x−a)∧(b−y))
a, 2 2 b−a 0
⎪ if y ∈ a+b 2 ,b
⎩0 b−a
if y ∈ [b, d]
⎧  
⎪ 2((b−x)∧(y−a)) if y ∈ a, a+b

 a+b  ' b−x ( b−a  2 
2((b−x)∧(b−y))
2 , c 2 b−a 0 ⎪ if y ∈ a+b 2 ,b
⎩0 b−a
if y ∈ [b, d]
⎧  


2((b−x)∧(y−a))
if y ∈ a, a+b

⎪ b−a  2 

⎪ 2((b−x)∧(b−y)) if y ∈ a+b
' b−x ( ' x−b ( ⎨ 2((b−x)∧(b−y))
b−a
2((x−c)∧(y−c))
2 ,c
[c, b] 2 b−a 2 c−b ∨ if y ∈ [c, b]


b−a d−c  


2((x−c)∧(y−c))
if y ∈ b, c+d

⎪ d−c  2 
⎩ 2((x−c)∧(d−y))
⎧ d−c if y ∈ c+d2 ,d
⎪0 if y ∈ [a, c]
 c+d  ' x−b ( ⎨ 2((x−c)∧(y−c))  
b, 2 0 2 c−b if y ∈ c, c+d
⎩ 2((x−c)∧(d−y)) if y ∈ c+d , d 
⎪ 
d−c 2

⎧ d−c 2
⎪ 0 if y ∈ [a, c]
 c+d  ' c−x ( ⎨ 2((d−x)∧(y−c))  c+d 
, d 0 2 if y ∈ c,
⎩ 2((d−x)∧(d−y)) if y ∈  c+d , d 
2 c−b ⎪ d−c 2
d−c 2

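As a graphical example, let (a, b, c, d) = (2, 4, 3.5, 5.5). Then we obtain the rule
consequence functions given in Figure 9.
Calculating the FOM- and LOM-defuzzification then gives

\[
D^{FOM}(\mu_x) = \begin{cases}
x & \text{if } x \in \left]a, \frac{a+b}{2}\right] \\[0.5ex]
b + a - x & \text{if } x \in \left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right[ \\[0.5ex]
\frac{a^2 - b^2 + bc - ad}{a-b+c-d} & \text{if } x = \frac{ac-bd}{a-b+c-d} \\[0.5ex]
x & \text{if } x \in \left]\frac{ac-bd}{a-b+c-d}, \frac{c+d}{2}\right] \\[0.5ex]
d + c - x & \text{if } x \in \left[\frac{c+d}{2}, d\right[ \\[0.5ex]
a & \text{if } x \in \{a, d\}
\end{cases}
\]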

Fig. 7 Two subcentrally overlapping controllers

Fig. 8 Two supercentrally overlapping controllers

Fig. 9 Possible rule consequences (panels: x = 2.3, x = 3.6, x = 5.1)

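Namely, if $x \in [c, b]$, then the two fuzzy sets intersect in the point $x$ for which
$2\,\frac{b-x}{b-a} = 2\,\frac{x-c}{d-c}$, i.e. $x = \frac{ac-bd}{a-b+c-d}$. One can easily verify
that this is always a point in the interval $[c, b]$. Analogously,

\[
D^{LOM}(\mu_x) = \begin{cases}
b + a - x & \text{if } x \in \left]a, \frac{a+b}{2}\right] \\[0.5ex]
x & \text{if } x \in \left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right[ \\[0.5ex]
\frac{c^2 - d^2 + ad - bc}{a-b+c-d} & \text{if } x = \frac{ac-bd}{a-b+c-d} \\[0.5ex]
d + c - x & \text{if } x \in \left]\frac{ac-bd}{a-b+c-d}, \frac{c+d}{2}\right] \\[0.5ex]
x & \text{if } x \in \left[\frac{c+d}{2}, d\right[ \\[0.5ex]
d & \text{if } x \in \{a, d\}
\end{cases}
\]

Thus, since $D^{MOM}$ is the mean of $D^{FOM}$ and $D^{LOM}$,

\[
D^{MOM}(\mu_x) = \begin{cases}
\frac{a+b}{2} & \text{if } x \in \left]a, \frac{ac-bd}{a-b+c-d}\right[ \\[0.5ex]
\frac{a^2 - b^2 + c^2 - d^2}{2(a-b+c-d)} & \text{if } x = \frac{ac-bd}{a-b+c-d} \\[0.5ex]
\frac{c+d}{2} & \text{if } x \in \left]\frac{ac-bd}{a-b+c-d}, d\right[ \\[0.5ex]
\frac{a+d}{2} & \text{if } x \in \{a, d\}
\end{cases}
\]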
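
For reference, the crossover point used in these case distinctions can be derived in a few lines. The following is a worked sketch, evaluated for the example values (a, b, c, d) = (2, 4, 3.5, 5.5) of Figure 9:

\[
\frac{b - x}{b - a} = \frac{x - c}{d - c}
\iff (b - x)(d - c) = (x - c)(b - a)
\iff bd - ac = x\,(b - a + d - c)
\iff x = \frac{ac - bd}{a - b + c - d}.
\]

With the example values this gives

\[
x = \frac{7 - 22}{2 - 4 + 3.5 - 5.5} = \frac{-15}{-4} = 3.75 \in [3.5, 4],
\]

and at this point the first and last maxima are $a + b - x = 2.25$ and $c + d - x = 5.25$, so that their mean is $3.75$.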
The MOM-defuzzification is, obviously, not continuous. For the calculation of the
COG-defuzzification, note that we can obtain the following results:

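Table 1 COG-defuzzification for subcentrally overlapping rule antecedents.

$x \in$                                        $D^{COG}(\mu_x) =$

$]a, c]$                                       $\frac{a+b}{2}$

$\left[c, \frac{ac-bd}{a-b+c-d}\right]$        $\left( \frac{I_1}{b-a} + \frac{I_2}{d-c} \right) / \left( \frac{I_3}{b-a} + \frac{I_4}{d-c} \right)$

$\left[\frac{ac-bd}{a-b+c-d}, b\right]$        $\left( \frac{J_1}{b-a} + \frac{J_2}{d-c} \right) / \left( \frac{J_3}{b-a} + \frac{J_4}{d-c} \right)$

$[b, d[$                                       $\frac{c+d}{2}$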

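with

\[
\begin{cases}
I_1 = \int_a^{b+a-x} y(y - a)\, dy + \int_{b+a-x}^{x} y(b - x)\, dy + \int_x^{t_1} y(b - y)\, dy \\[1ex]
I_2 = \int_{t_1}^{d+c-x} y(x - c)\, dy + \int_{d+c-x}^{d} y(d - y)\, dy \\[1ex]
I_3 = \int_a^{b+a-x} (y - a)\, dy + \int_{b+a-x}^{x} (b - x)\, dy + \int_x^{t_1} (b - y)\, dy \\[1ex]
I_4 = \int_{t_1}^{d+c-x} (x - c)\, dy + \int_{d+c-x}^{d} (d - y)\, dy
\end{cases}
\]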

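and

\[
\begin{cases}
J_1 = \int_a^{b+a-x} y(y - a)\, dy + \int_{b+a-x}^{t_2} y(b - x)\, dy \\[1ex]
J_2 = \int_{t_2}^{x} y(y - c)\, dy + \int_x^{d+c-x} y(x - c)\, dy + \int_{d+c-x}^{d} y(d - y)\, dy \\[1ex]
J_3 = \int_a^{b+a-x} (y - a)\, dy + \int_{b+a-x}^{t_2} (b - x)\, dy \\[1ex]
J_4 = \int_{t_2}^{x} (y - c)\, dy + \int_x^{d+c-x} (x - c)\, dy + \int_{d+c-x}^{d} (d - y)\, dy
\end{cases}
\]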
where $t_1 = \frac{bd - xb + xa - ca}{d - c}$ is the intersection point of the lines
$y = 2\,\frac{b - t}{b - a}$ and $y = 2\,\frac{x - c}{d - c}$, and
$t_2 = \frac{bd - xd + xc - ca}{b - a}$ is the intersection point of the lines
$y = 2\,\frac{b - x}{b - a}$ and $y = 2\,\frac{t - c}{d - c}$, where in either case $x$ is a
constant value.
Note that writing out an explicit form for the integrals above is possible, but
unless values for a, b, c and d are chosen, the resulting function is extremely
complicated and of little practical use. More interesting, however, is that this form
permits us to prove the continuity of the defuzzification $D^{COG}(\mu_x)$:

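Lemma 6. The COG-defuzzification $D^{COG}(\mu_x)$ is a continuous function.

Proof: Putting $I_i$ and $J_i$, $i = 1, 2, 3, 4$ as above, it is easy to verify that

\[
\lim_{\substack{x \to c \\ x > c}} D^{COG}(\mu_x)
= \lim_{\substack{x \to c \\ x > c}} \frac{\frac{I_1}{b-a} + \frac{I_2}{d-c}}{\frac{I_3}{b-a} + \frac{I_4}{d-c}}
= \frac{a+b}{2}
\]

and

\[
\lim_{\substack{x \to b \\ x < b}} D^{COG}(\mu_x)
= \lim_{\substack{x \to b \\ x < b}} \frac{\frac{J_1}{b-a} + \frac{J_2}{d-c}}{\frac{J_3}{b-a} + \frac{J_4}{d-c}}
= \frac{c+d}{2}
\]

and furthermore that, with $x_0 = \frac{ac-bd}{a-b+c-d}$,

\[
\lim_{\substack{x \to x_0 \\ x < x_0}} \frac{\frac{I_1}{b-a} + \frac{I_2}{d-c}}{\frac{I_3}{b-a} + \frac{I_4}{d-c}}
\quad \text{and} \quad
\lim_{\substack{x \to x_0 \\ x > x_0}} \frac{\frac{J_1}{b-a} + \frac{J_2}{d-c}}{\frac{J_3}{b-a} + \frac{J_4}{d-c}}
\]

both are equal to the same rational expression of a, b, c and d, which
proves the assertion. QED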
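
The continuity statement can also be observed numerically without expanding the integrals. The sketch below is an illustration under stated assumptions: it evaluates $D^{COG}(\mu_x)$ with the midpoint rule for the example values (a, b, c, d) = (2, 4, 3.5, 5.5), and the names `tri` and `cog` as well as the grid size are choices made here, not part of the text.

```python
def tri(x, lo, hi):
    """Symmetric triangular membership function on [lo, hi]."""
    return max(1 - abs(1 - 2 * (x - lo) / (hi - lo)), 0.0)


def cog(x, a, b, c, d, n=20000):
    """Midpoint-rule estimate of D^COG(mu_x) for two triangular antecedents."""
    h = (d - a) / n
    h1, h2 = tri(x, a, b), tri(x, c, d)        # alpha_1(x), alpha_2(x)
    num = den = 0.0
    for i in range(n):
        y = a + (i + 0.5) * h
        m = max(min(h1, tri(y, a, b)), min(h2, tri(y, c, d)))   # mu_x(y)
        num += y * m
        den += m
    return num / den


a, b, c, d = 2.0, 4.0, 3.5, 5.5
x0 = (a * c - b * d) / (a - b + c - d)         # crossover point, 3.75 here
eps = 1e-4
for label, left, right in (
    ("at c ", cog(c - eps, a, b, c, d), cog(c + eps, a, b, c, d)),
    ("at x0", cog(x0 - eps, a, b, c, d), cog(x0 + eps, a, b, c, d)),
    ("at b ", cog(b - eps, a, b, c, d), cog(b + eps, a, b, c, d)),
):
    print(label, round(left, 4), round(right, 4))
```

On both sides of each breakpoint the estimates agree to within the step used, in line with Lemma 6, whereas the MOM value jumps from (a + b)/2 to (c + d)/2 at the crossover point.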


2.4 Two Supercentrally Overlapping Controllers

Let $\Xi \in \mathcal{P}^*(F([a, d]))$ again be a fuzzy controller with as antecedent rule base
the overlapping fuzzy sets
• $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\left(\frac{a+b}{2}, 1\right)$ and $(b, 0)$;
• $\alpha_2 : [c, d] \to I$ passing through the points $(c, 0)$, $\left(\frac{c+d}{2}, 1\right)$ and $(d, 0)$.

This time however, put $c < \frac{a+b}{2}$ (supercentrally overlapping) as in Figure 8. The
consequence then again depends on $x$ as follows:
A Consistency Criterion for Optimizing Defuzzification in Fuzzy Control 417

x∈ α1 (x) α2 (x) µx (y)


⎧  


2((x−a)∧(y−a))
if y ∈ a, a+b
' x−a ( b−a  2 
2((x−a)∧(b−y))
[a, c] 2 b−a 0
⎪ if y ∈ a+b 2 ,b
⎩0 b−a
if y ∈ [b, d]



2((x−a)∧(y−a))
if y ∈ [a, c]

⎪ b−a  

⎪ 2((x−a)∧(y−a)) ∨ 2((x−c)∧(y−c)) if y ∈ c, a+b
 a+b  ' x−a ( ' x−c ( ⎨ b−a d−c  a+b2 c+d 
2((x−a)∧(b−y))
c, 2 2 2 ∨ 2((x−c)∧(y−c)) if y ∈ 2 , 2
b−a d−c ⎪

b−a d−c  


2((x−a)∧(b−y))
∨ 2((x−c)∧(d−y))
if y ∈ c+d

⎪ b−a d−c 2 ,b
⎩ 2((x−c)∧(d−y))
⎧ d−c if y ∈ [b, d]


2((b−x)∧(y−a))
if y ∈ [a, c]

⎪ b−a  

⎪ 2((b−x)∧(y−a))
∨ 2((x−c)∧(y−c)) if y ∈ c, a+b
 a+b  ' b−x ( ' x−c ( ⎨ b−a d−c  a+b2 c+d 
2((b−x)∧(b−y))
, c+d 2 2 ∨ 2((x−c)∧(y−c)) if y ∈ 2 , 2
2 2 b−a d−c ⎪

b−a d−c  


2((b−x)∧(b−y))
∨ 2((x−c)∧(d−y))
if y ∈ c+d

⎪ b−a d−c 2 ,b
⎩ 2((x−c)∧(d−y))
⎧ d−c if y ∈ [b, d]


2((b−x)∧(y−a))
if y ∈ [a, c]
⎪ 2((b−x)∧(y−a))
⎪ b−a  

⎪ ∨ 2((d−x)∧(y−c)) if y ∈ c, a+b
 c+d  ' b−x ( ' d−x ( ⎨ b−a d−c  2 
2((b−x)∧(b−y)) 2((d−x)∧(y−c))
,b 2 2 ∨ if y ∈ a+b2 , 2
c+d
2 b−a d−c ⎪

b−a d−c  c+d


2((b−x)∧(b−y))
∨ 2((d−x)∧(d−y)) if y ∈ 2 , b

⎪ b−a d−c
⎩ 2((d−x)∧(d−y))
⎧ d−c if y ∈ [b, d]

⎨0 if y ∈ [a, c]
' d−x ( 2((d−x)∧(y−c))  
[b, d] 0 2 if y ∈ c, c+d
d−c ⎪ d−c 
⎩ 2((d−x)∧(d−y)) if y ∈ c+d , d
2 
d−c 2

As an example, let (a, b, c, d) = (2, 4, 2.5, 4.5). Then we obtain the rule
consequence functions seen in Figure 10. Calculating the FOM- and LOM-defuzzification
then yields exactly the same result as for two subcentrally overlapping controllers.
For the COG-defuzzification, we obtain the following results:

Fig. 10 Possible rule consequence functions for various values of x (panels: x = 2.3, 2.7, 3.1, 3.25, 3.3, 3.8, 4.2)

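Table 2 COG-defuzzification for supercentrally overlapping rule antecedents.

$x \in$                                                    $D^{COG}(\mu_x) =$

$]a, c]$                                                   $\frac{a+b}{2}$

$\left[c, \frac{a+b}{2}\right]$                            $\left( \frac{K_1}{b-a} + \frac{K_2}{d-c} \right) / \left( \frac{K_3}{b-a} + \frac{K_4}{d-c} \right)$

$\left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right]$        $\left( \frac{L_1}{b-a} + \frac{L_2}{d-c} \right) / \left( \frac{L_3}{b-a} + \frac{L_4}{d-c} \right)$

$\left[\frac{ac-bd}{a-b+c-d}, \frac{c+d}{2}\right]$        $\left( \frac{M_1}{b-a} + \frac{M_2}{d-c} \right) / \left( \frac{M_3}{b-a} + \frac{M_4}{d-c} \right)$

$\left[\frac{c+d}{2}, b\right]$                            $\left( \frac{N_1}{b-a} + \frac{N_2}{d-c} \right) / \left( \frac{N_3}{b-a} + \frac{N_4}{d-c} \right)$

$[b, d[$                                                   $\frac{c+d}{2}$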

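with

\[
\begin{cases}
K_1 = \int_a^{x} y(y - a)\, dy + \int_x^{b+a-x} y(x - a)\, dy + \int_{b+a-x}^{t_1} y(b - y)\, dy \\[1ex]
K_2 = \int_{t_1}^{d+c-x} y(x - c)\, dy + \int_{d+c-x}^{d} y(d - y)\, dy \\[1ex]
K_3 = \int_a^{x} (y - a)\, dy + \int_x^{b+a-x} (x - a)\, dy + \int_{b+a-x}^{t_1} (b - y)\, dy \\[1ex]
K_4 = \int_{t_1}^{d+c-x} (x - c)\, dy + \int_{d+c-x}^{d} (d - y)\, dy,
\end{cases}
\]

\[
\begin{cases}
L_1 = \int_a^{b+a-x} y(y - a)\, dy + \int_{b+a-x}^{x} y(b - x)\, dy + \int_x^{t_1} y(b - y)\, dy \\[1ex]
L_2 = \int_{t_1}^{d+c-x} y(x - c)\, dy + \int_{d+c-x}^{d} y(d - y)\, dy \\[1ex]
L_3 = \int_a^{b+a-x} (y - a)\, dy + \int_{b+a-x}^{x} (b - x)\, dy + \int_x^{t_1} (b - y)\, dy \\[1ex]
L_4 = \int_{t_1}^{d+c-x} (x - c)\, dy + \int_{d+c-x}^{d} (d - y)\, dy,
\end{cases}
\]

\[
\begin{cases}
M_1 = \int_a^{b+a-x} y(y - a)\, dy + \int_{b+a-x}^{t_2} y(b - x)\, dy \\[1ex]
M_2 = \int_{t_2}^{x} y(y - c)\, dy + \int_x^{d+c-x} y(x - c)\, dy + \int_{d+c-x}^{d} y(d - y)\, dy \\[1ex]
M_3 = \int_a^{b+a-x} (y - a)\, dy + \int_{b+a-x}^{t_2} (b - x)\, dy \\[1ex]
M_4 = \int_{t_2}^{x} (y - c)\, dy + \int_x^{d+c-x} (x - c)\, dy + \int_{d+c-x}^{d} (d - y)\, dy,
\end{cases}
\]

and

\[
\begin{cases}
N_1 = \int_a^{b+a-x} y(y - a)\, dy + \int_{b+a-x}^{t_2} y(b - x)\, dy \\[1ex]
N_2 = \int_{t_2}^{d+c-x} y(y - c)\, dy + \int_{d+c-x}^{x} y(d - x)\, dy + \int_x^{d} y(d - y)\, dy \\[1ex]
N_3 = \int_a^{b+a-x} (y - a)\, dy + \int_{b+a-x}^{t_2} (b - x)\, dy \\[1ex]
N_4 = \int_{t_2}^{d+c-x} (y - c)\, dy + \int_{d+c-x}^{x} (d - x)\, dy + \int_x^{d} (d - y)\, dy,
\end{cases}
\]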

where $t_1 = \frac{bd - xb + xa - ca}{d - c}$ is the intersection point of the lines
$y = 2\,\frac{b - t}{b - a}$ and $y = 2\,\frac{x - c}{d - c}$, and
$t_2 = \frac{bd - xd + xc - ca}{b - a}$ is the intersection point of the lines
$y = 2\,\frac{b - x}{b - a}$ and $y = 2\,\frac{t - c}{d - c}$, in both cases $x$ being a
constant value. Again, writing out an explicit form for these integrals is tedious as
well as pointless without prior knowledge of the values a, b, c and d. While this
expression looks more complicated than in the subcentrally overlapping case, we will
prove in the next lemma that the expressions are in fact the same:

Lemma 7. The formulas for the COG-defuzzification $D^{COG}(\mu_x)$ in the subcentrally
overlapping case as given in Table 1 and those for the supercentrally overlapping
case as given in Table 2 are identical.

Proof: First of all, considering Table 2 only, it is easy to see by simple calculation
that $K_3 = L_3$ and $K_1 = L_1$. Since furthermore $K_2 = L_2$ and $K_4 = L_4$, being literally
the same, the expressions for $D^{COG}(\mu_x)$ on $\left[c, \frac{a+b}{2}\right]$ and on
$\left[\frac{a+b}{2}, \frac{ac-bd}{a-b+c-d}\right]$ are identical. In exactly the same way, it can be
proved that $M_4 = N_4$ and $M_2 = N_2$, and obviously $M_1 = N_1$ and $M_3 = N_3$. Thus also
the expressions for $D^{COG}(\mu_x)$ on $\left[\frac{ac-bd}{a-b+c-d}, \frac{c+d}{2}\right]$ and on
$\left[\frac{c+d}{2}, b\right]$ are identical. Therefore, Table 2 can be simplified as follows:

Table 2b COG-defuzzification for supercentrally overlapping rule antecedents.

$x \in$                                        $D^{COG}(\mu_x) =$

$]a, c]$                                       $\frac{a+b}{2}$

$\left[c, \frac{ac-bd}{a-b+c-d}\right]$        $\left( \frac{K_1}{b-a} + \frac{K_2}{d-c} \right) / \left( \frac{K_3}{b-a} + \frac{K_4}{d-c} \right)$

$\left[\frac{ac-bd}{a-b+c-d}, b\right]$        $\left( \frac{N_1}{b-a} + \frac{N_2}{d-c} \right) / \left( \frac{N_3}{b-a} + \frac{N_4}{d-c} \right)$

$[b, d[$                                       $\frac{c+d}{2}$

Now it is easy to see that the formulas in Table 1 for the subcentrally overlapping
case and in Table 2b for the supercentrally overlapping case are in fact identical.
This statement is trivial for $x \in ]a, c]$ and $x \in [b, d[$. We will now prove that this
is also the case for $x \in \left[c, \frac{ac-bd}{a-b+c-d}\right]$ and
$x \in \left[\frac{ac-bd}{a-b+c-d}, b\right]$. If $x \in \left[c, \frac{ac-bd}{a-b+c-d}\right]$, then
an easy calculation verifies that $I_3 = K_3$ and $I_1 = K_1$, while $I_2 = K_2$ and
$I_4 = K_4$ are literally identical. If on the other hand
$x \in \left[\frac{ac-bd}{a-b+c-d}, b\right]$, then it is equally easy to verify that $J_4 = N_4$ and
$J_2 = N_2$, while $J_1 = N_1$ and $J_3 = N_3$ are literally identical. Hence, in both cases
the expressions for $D^{COG}(\mu_x)$ are identical.
QED

Consequently, without having to apply any limit theorem, also in this case the
COG-defuzzification is a continuous function. Furthermore, we will from now on
omit the second set of formulas.

2.5 Overlapping Controllers with Border Conditions

In order to keep as much as possible from the calculations above, it is
advisable, when putting half triangles at the edges of the controller, to consider
these as full triangular controllers as above, and then take the restriction to the
appropriate domain.
Let $\Xi \in \mathcal{P}^*\left(F\left(\left[\frac{a+b}{2}, \frac{g+f}{2}\right]\right)\right)$ therefore be a
fuzzy controller with as antecedent rule base the overlapping fuzzy sets
• $\alpha_1 : \left[\frac{a+b}{2}, b\right] \to I$ passing through the points $\left(\frac{a+b}{2}, 1\right)$ and $(b, 0)$;
• $\alpha_2 : [c, d] \to I$ passing through the points $(c, 0)$, $\left(\frac{c+d}{2}, 1\right)$ and $(d, 0)$;
• $\alpha_3 : \left[g, \frac{g+f}{2}\right] \to I$ passing through the points $(g, 0)$ and $\left(\frac{g+f}{2}, 1\right)$
with $\frac{a+b}{2} < c < b$ and $g < d < \frac{g+f}{2}$. Then take the restriction to the domain
$\left[\frac{a+b}{2}, \frac{g+f}{2}\right]$. These functions look like Figure 11. The consequence is
then given by

\[ \mu_x(y) = \max_{i=1}^{3} \big( \alpha_i(x) \wedge \alpha_i(y) \big). \]

A tedious but similar verification shows that the MOM-defuzzification is not
continuous, and that, analogously, a limit argument similar to Lemma 6 proves that
the COG-defuzzification is a continuous function. Although we will leave the
necessary calculations to the interested reader, some typical output functions of
overlapping controllers with border constraints can be seen in Figure 12.

Fig. 11 Overlapping controller with borders (horizontal axis: x)

Fig. 12 Various output functions, panels (a) to (h) (horizontal axis: x)

3 The Consistency Criterion

Our goal now is to find out how the fuzzy controllers should be positioned with respect to each other, such that the difference between the input value $x$ and the defuzzification value $D(\mu_x)$ is minimal. Ideally, this difference should be zero, but even in simple cases this does not hold. There are various ways to measure this difference, but as a trade-off between computational complexity and intuitive correctness, it seems reasonable to take the $L^1$-distance, defined on $X$ by

$$\forall \mu, \nu \in F(X) : d_1(\mu, \nu) = \int_X |\mu(x) - \nu(x)| \, dx$$

Ansatz (Consistency Criterion) A rule base $\Xi \in \mathcal{P}^*(F(X))$ and a defuzzification operator $D^{***}$ are more suited for fuzzy control, the smaller the value $d_1(\mathrm{id}, D^{***}(\mu))$ is, with

$$D^{***}(\mu) : X \to X, \qquad x \mapsto D^{***}(\mu_x) = \theta(x, \Xi).$$
We will investigate this claim on some concrete examples.
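The criterion can also be evaluated numerically. The following is a minimal Python sketch, not the authors' implementation: it builds $\mu_x(y) = \max_i(\alpha_i(x) \wedge \alpha_i(y))$ for triangular antecedents, applies the COG-defuzzification, and approximates $d_1$ with the midpoint rule. The helper names (`triangle`, `consequence`, `cog`, `d1_to_identity`), the step counts and the sample rule base are assumptions made for illustration only.

```python
def triangle(left, peak, right):
    """Triangular fuzzy set through (left, 0), (peak, 1) and (right, 0)."""
    def alpha(t):
        if left < t <= peak:
            return (t - left) / (peak - left)
        if peak < t < right:
            return (right - t) / (right - peak)
        return 0.0
    return alpha


def consequence(rule_base, x):
    """mu_x(y) = max_i (alpha_i(x) AND alpha_i(y)), with AND taken as min."""
    heights = [alpha(x) for alpha in rule_base]

    def mu_x(y):
        return max(min(h, alpha(y)) for h, alpha in zip(heights, rule_base))
    return mu_x


def cog(mu, lo, hi, steps=1000):
    """Center of gravity of mu on [lo, hi], midpoint rule (mu must not vanish everywhere)."""
    h = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        m = mu(y)
        num += y * m
        den += m
    return num / den


def d1_to_identity(rule_base, lo, hi, steps=200):
    """Approximate the L1 distance between x -> D_COG(mu_x) and the identity on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += abs(cog(consequence(rule_base, x), lo, hi) - x) * h
    return total


# Illustrative rule base (an assumption): two equally wide, overlapping triangles.
rule_base = [triangle(0.0, 1.0, 2.0), triangle(1.0, 2.0, 3.0)]
print(cog(consequence(rule_base, 1.25), 0.0, 3.0))
print(d1_to_identity(rule_base, 0.0, 3.0))
```

The midpoint rule is chosen only because it avoids evaluating $\mu_x$ at the interval end points; any other quadrature rule would serve equally well.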

Proposition 8. Let $\Xi \in \mathcal{P}^*(F([a, d]))$ be the fuzzy controller with as an antecedent rule base the overlapping fuzzy sets $\alpha_1 : [a, b] \to I$ passing through the points $(a, 0)$, $\left(\frac{a+b}{2}, 1\right)$ and $(b, 0)$, and $\alpha_2 : [c, d] \to I$ passing through the points $(c, 0)$, $\left(\frac{c+d}{2}, 1\right)$ and $(d, 0)$.
• If $c >$, $=$ or $< \frac{a+b}{2}$, then $D^{\mathrm{COG}}(\mu_c) <$, $=$ or $> c$ respectively.
• If $b >$, $=$ or $< \frac{c+d}{2}$, then $D^{\mathrm{COG}}(\mu_b) <$, $=$ or $> b$ respectively.

Proof: This is a direct consequence of the fact that $D^{\mathrm{COG}}(\mu_c) = \frac{a+b}{2}$ and $D^{\mathrm{COG}}(\mu_b) = \frac{c+d}{2}$. QED

Definition 9. A point $\gamma$ will be called a fixpoint for the $***$-defuzzification if $D^{***}(\mu_\gamma) = \gamma$.

COG-defuzzification functions for which, e.g., $c = \frac{a+b}{2}$ at least have a fixpoint there. If two fuzzy sets $\alpha_1$ and $\alpha_2$ of a rule base are such that $\alpha_1$ and $\alpha_2$ either both subcentrally overlap or both supercentrally overlap, then the COG-defuzzification must have at least one fixpoint somewhere in between.

3.1 Example

Because of the scale invariance demands stated in the introduction, only the relative position of the controllers with respect to each other is considered to be important. All other defuzzification values can be calculated by applying the appropriate affine transformations. Therefore, fix $\alpha_1(x)$ to be the triangular fuzzy set through the points $(0, 0)$, $(1, 1)$ and $(2, 0)$, and let $\alpha_2(x)$ be a variable fuzzy set through the points $(\lambda, 0)$, $\left(\frac{\lambda+\mu}{2}, 1\right)$ and $(\mu, 0)$. This yields the following result:

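$x \in$    $D^{\mathrm{COG}}(\mu_x) =$
$[0, \lambda]$    $1$
$\left[\lambda, \frac{2\mu}{\mu-\lambda+2}\right]$    $D^{\mathrm{COG}}(\mu_x)_l = \frac{A_1 + A_2}{A_3 + A_4}$
$\left[\frac{2\mu}{\mu-\lambda+2}, 2\right]$    $D^{\mathrm{COG}}(\mu_x)_r = \frac{B_1 + B_2}{B_3 + B_4}$
$[2, \mu[$    $\frac{\lambda+\mu}{2}$

with

$$A_1 = \frac{1}{2}\left(\int_0^{x} y^2 \, dy + \int_x^{2-x} xy \, dy + \int_{2-x}^{\frac{2\mu-2x}{\mu-\lambda}} y(2-y) \, dy\right)$$

$$A_2 = \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-2x}{\mu-\lambda}}^{\mu+\lambda-x} y(x-\lambda) \, dy + \int_{\mu+\lambda-x}^{\mu} y(\mu-y) \, dy\right)$$

$$A_3 = \frac{1}{2}\left(\int_0^{x} y \, dy + \int_x^{2-x} x \, dy + \int_{2-x}^{\frac{2\mu-2x}{\mu-\lambda}} (2-y) \, dy\right)$$

$$A_4 = \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-2x}{\mu-\lambda}}^{\mu+\lambda-x} (x-\lambda) \, dy + \int_{\mu+\lambda-x}^{\mu} (\mu-y) \, dy\right)$$

and

$$B_1 = \frac{1}{2}\left(\int_0^{2-x} y^2 \, dy + \int_{2-x}^{\frac{2\mu-x\mu+x\lambda}{2}} y(2-x) \, dy\right)$$

$$B_2 = \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-x\mu+x\lambda}{2}}^{x} y(y-\lambda) \, dy + \int_x^{\mu+\lambda-x} y(x-\lambda) \, dy + \int_{\mu+\lambda-x}^{\mu} y(\mu-y) \, dy\right)$$

$$B_3 = \frac{1}{2}\left(\int_0^{2-x} y \, dy + \int_{2-x}^{\frac{2\mu-x\mu+x\lambda}{2}} (2-x) \, dy\right)$$

$$B_4 = \frac{1}{\mu-\lambda}\left(\int_{\frac{2\mu-x\mu+x\lambda}{2}}^{x} (y-\lambda) \, dy + \int_x^{\mu+\lambda-x} (x-\lambda) \, dy + \int_{\mu+\lambda-x}^{\mu} (\mu-y) \, dy\right)$$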

Note that if $(\lambda, \mu) = (0, 2)$, then the function is the constant 1-mapping. Writing this out explicitly is long and cumbersome work. However, when trying to achieve that $\left|D^{\mathrm{COG}}(\mu_x) - x\right|$ is the constant zero function, it can be calculated that the only solution of the non-linear equation system equals $(\lambda, \mu) = (0, 2)$, which is trivial. On the other hand, it might be interesting to note that this function is not only continuous in $x = \frac{2\mu}{\mu-\lambda+2}$, but also continuously differentiable. Indeed,

$$\lim_{x \to \left(\frac{2\mu}{\mu-\lambda+2}\right)^{-}} \frac{d}{dx}\left(D^{\mathrm{COG}}(\mu_x)\right)_l = \lim_{x \to \left(\frac{2\mu}{\mu-\lambda+2}\right)^{+}} \frac{d}{dx}\left(D^{\mathrm{COG}}(\mu_x)\right)_r,$$

which allows extremal analysis by calculating the derivative.


Another, better approach is the following: suppose we assume the extra condition that the controllers have the same width, i.e. $\mu - \lambda = 2$. Then the equation for $D^{\mathrm{COG}}(\mu_x)$ can be simplified as follows:

$x \in$    $D^{\mathrm{COG}}(\mu_x) =$
$]0, \lambda]$    $1$
$[\lambda, 2]$    $\frac{-4x + 2x^2 + 2\lambda^3 - 3\lambda^2 x + \lambda x^2 - 4\lambda x + 4\lambda^2}{2\left(x^2 - 2x + \lambda^2 - \lambda x\right)}$
$[2, \mu[$    $\lambda + 1$

Let us determine the collection of points for which the middle expression equals $x$. The solutions of this cubic equation equal

$$\frac{\lambda}{2} + 1, \qquad 1 + \frac{\lambda}{2} \pm \frac{1}{2}\sqrt{4 + 4\lambda - 7\lambda^2}.$$

In that case, the point $\frac{\lambda}{2} + 1$ is always a fixpoint. The existence of other fixpoints depends on the sign of $\Delta(\lambda) = 4 + 4\lambda - 7\lambda^2$:

$\lambda$              $\frac{2 - 4\sqrt{2}}{7}$         $\frac{2 + 4\sqrt{2}}{7}$
$\Delta(\lambda)$    $-$    $0$    $+$    $0$    $-$

So there are two more fixpoints if and only if $\lambda \in \left[\frac{2}{7} - \frac{4}{7}\sqrt{2}, \frac{2}{7} + \frac{4}{7}\sqrt{2}\right]$, approximately equalling $[-0.52, 1.09]$. Three is at once the maximal number of fixpoints there can be, since the required equation is of degree 3. To illustrate this, we will sketch some of the graphs we obtain, where we plot the defuzzification value $D^{\mathrm{COG}}(\mu_x)$ against the value $x$. The description of the functions is left to the reader to write out; the actual graphs can be found in Figure 13. Notice the change in the number of fixpoints.
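The fixpoint formulas can be checked directly against the middle branch of the simplified table. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the helper `fixpoints_in_branch`, which keeps only those roots that fall inside $[\lambda, 2]$ where the middle branch applies, is an addition of this sketch and not part of the statement above.

```python
import math


def cog_equal_width(x, lam):
    """Middle branch of D_COG(mu_x) on [lam, 2] when mu - lam = 2."""
    num = (-4 * x + 2 * x**2 + 2 * lam**3 - 3 * lam**2 * x
           + lam * x**2 - 4 * lam * x + 4 * lam**2)
    den = 2 * (x**2 - 2 * x + lam**2 - lam * x)
    return num / den


def cubic_roots(lam):
    """Real solutions of cog_equal_width(x, lam) = x, from the closed formula."""
    centre = lam / 2 + 1
    delta = 4 + 4 * lam - 7 * lam**2
    if delta <= 0:
        return [centre]
    r = 0.5 * math.sqrt(delta)
    return [centre - r, centre, centre + r]


def fixpoints_in_branch(lam):
    """Roots that actually lie in [lam, 2] (extra filter, an assumption of this sketch)."""
    return [x for x in cubic_roots(lam) if lam <= x <= 2]


# Zeros of Delta(lambda) = 4 + 4*lambda - 7*lambda^2.
delta_zeros = ((2 - 4 * math.sqrt(2)) / 7, (2 + 4 * math.sqrt(2)) / 7)

for lam in (0.5, 1.0, 1.05, 1.2):
    print(lam, cubic_roots(lam), fixpoints_in_branch(lam))
```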

Fig. 13 Plot of the defuzzification value against x for various λ (panels for λ = 0.1, 0.3, 0.5, 0.7, 0.9, 1, 1.05, 1.1, 1.3, 1.5, 1.7 and 1.9; the panels for λ = 0.9, 1, 1.05 and 1.1 are accompanied by a detail view on [1, 2])

Fig. 14 Extremal analysis of d1

For reasons of symmetry, we would have expected the minimal difference $d_1(D^{\mathrm{COG}}(\mu_x), x)$ to occur at $\lambda = 1$. Surprisingly, this is not the case. A tediously long but otherwise straightforward verification shows that

$$d_1(D^{\mathrm{COG}}(\mu_x), x) = \begin{cases} -\frac{3}{4}\lambda^2 + \lambda + 1 + \lambda^2 \ln\frac{4\lambda}{3\lambda+2} & \text{if } 0 \le \lambda \le 1 \\ -\frac{11}{4}\lambda^2 + \lambda + 3 + \lambda^2 \ln\frac{4\lambda^3}{(\lambda-2)^2(3\lambda+2)} & \text{if } 1 \le \lambda \le \frac{2+4\sqrt{2}}{7} \\ \frac{3}{4}\lambda^2 - \lambda + 1 + \lambda^2 \ln\frac{3\lambda+2}{4\lambda} & \text{if } \frac{2+4\sqrt{2}}{7} \le \lambda \le 2 \end{cases}$$

which has as a graphical plot the left graph in Figure 14. Moreover, the union of the supports of the fuzzy sets in the different rule bases, namely $[0, \lambda+2]$, is not of equal width. If we want to find the value for $\lambda$ with, relatively speaking, the smallest difference, we therefore have to examine the extreme values of $\frac{d_1(D^{\mathrm{COG}}(\mu_x), x)}{\lambda+2}$ on the interval $[0, 2]$. The graphical plot of this function is the right graph in Figure 14. It is easy to verify that $\frac{d}{d\lambda}\frac{d_1(D^{\mathrm{COG}}(\mu_x), x)}{\lambda+2} = 0$ for $\lambda \approx 1.071791$. Hence, the optimal value for $\lambda$ is not equal to 1, which is, given the symmetric nature of the problem, a remarkable result.
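The location of this minimum can be reproduced from the closed form with a simple grid search. The following sketch is illustrative only: the function names, the grid range $[0.5, 2]$ and the step $10^{-4}$ are assumptions, and the grid search stands in for the derivative-based analysis described above.

```python
import math

# Upper end of the middle branch of the piecewise formula.
LAMBDA_SPLIT = (2 + 4 * math.sqrt(2)) / 7


def d1_closed(lam):
    """Piecewise closed form of d1(D_COG(mu_x), x) for 0 < lam <= 2."""
    if lam <= 1:
        return -0.75 * lam**2 + lam + 1 + lam**2 * math.log(4 * lam / (3 * lam + 2))
    if lam <= LAMBDA_SPLIT:
        return (-2.75 * lam**2 + lam + 3
                + lam**2 * math.log(4 * lam**3 / ((lam - 2)**2 * (3 * lam + 2))))
    return 0.75 * lam**2 - lam + 1 + lam**2 * math.log((3 * lam + 2) / (4 * lam))


def relative_d1(lam):
    """d1 divided by the width lam + 2 of the union of supports."""
    return d1_closed(lam) / (lam + 2)


# Plain grid search (an assumption of this sketch, not the authors' method).
grid = [0.5 + k * 1e-4 for k in range(15001)]
best_lambda = min(grid, key=relative_d1)
print(best_lambda, relative_d1(best_lambda))
```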
One possible drawback of the method described above is the absence of suitable fuzzy sets in the antecedent rule base that “round off the borders”. Adding border constraints, however, does not seem to fix the problem of asymmetry in the search for a minimal value for $d_1(D^{\mathrm{COG}}(\mu_x), x)$: even then, asymmetries in the results still occur. We have checked this on an example with two semi-triangular fuzzy sets, passing through the points $\left\{\left(-\frac{3}{2}, 1\right), (\lambda - 1, 0)\right\}$ and $\left\{(-\lambda + 1, 0), \left(\frac{3}{2}, 1\right)\right\}$, and two triangular fuzzy sets, passing through the points $\{(-\lambda - 1, 0), (-\lambda, 1), (-\lambda + 1, 0)\}$ and $\{(\lambda - 1, 0), (\lambda, 1), (\lambda + 1, 0)\}$. One would expect the optimal value for $\lambda$, for which the aforementioned difference is minimal, to occur at $\lambda = \frac{1}{2}$. A closed yet very tedious set of continuous functions, parametrically dependent on $\lambda$, can be deduced. As an illustration, some of the graphs of these functions can be found in Figure 15. These formulas tend to be much more complicated, however, except for some easy values of $\lambda$. Instead of tracking down the optimum by calculating the derivative, we plotted this value against the values of $\lambda$ by means of Simpson’s integration rule, dividing the X-interval into 100 equal intervals, using Matlab. All calculations have been carried out with 64-bit precision, and we found the minimum occurring for $\lambda \approx 0.5377$, with error margin $10^{-4}$, which is most certainly not equal to $\frac{1}{2}$.

Fig. 15 Plot of the defuzzification value against x for various λ in case of border constraints (panels for λ = 0.1, 0.3, 0.5, 0.7 and 0.9)

4 BADD-defuzzification

The previous result leads us to believe that better defuzzifiers with respect to the consistency criterion must exist, because it seems only fair that the optimum should depend not only on the appropriate choice of a rule base, but also on the defuzzifier. While it is virtually impossible to study all defuzzifiers that have been mentioned in the literature, the fact that the COG-defuzzifier is not necessarily the best one can be established by studying a few simple examples. An interesting parametric family that incorporates the aforementioned defuzzifiers as well as many others is formed by the so-called basic defuzzification distributions $D^{\mathrm{BADD}}(-, \gamma)$

$$D^{\mathrm{BADD}}(\mu, \gamma) = \frac{\int_X y \, \mu^\gamma(y) \, dy}{\int_X \mu^\gamma(y) \, dy}$$

as introduced by D.P. Filev and R.R. Yager in [3]. It is generally known (see also [12]) that
• $D^{\mathrm{BADD}}(\mu, 0) = D^{\mathrm{MOS}}(\mu)$
• $D^{\mathrm{BADD}}(\mu, 1) = D^{\mathrm{COG}}(\mu)$
• $\lim_{\gamma \to \infty} D^{\mathrm{BADD}}(\mu, \gamma) = D^{\mathrm{MOM}}(\mu)$
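The behaviour of the family in $\gamma$ can be illustrated numerically. The sketch below is a minimal illustration under stated assumptions: the name `badd`, the midpoint quadrature and the sample fuzzy set `example_mu` (the consequence set obtained for the input 1.25 from two unit-height triangles on $[0, 2]$ and $[1, 3]$) are chosen here for demonstration and do not come from the text.

```python
def badd(mu, gamma, lo, hi, steps=20000):
    """BADD defuzzification of mu on [lo, hi] for gamma > 0, midpoint rule."""
    h = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        w = mu(y) ** gamma
        num += y * w
        den += w
    return num / den


def example_mu(y):
    """Illustrative fuzzy set: triangles on [0, 2] and [1, 3], clipped at 0.75 and 0.25."""
    a1 = max(0.0, 1.0 - abs(y - 1.0))
    a2 = max(0.0, 1.0 - abs(y - 2.0))
    return max(min(0.75, a1), min(0.25, a2))


# gamma = 1 is the center of gravity; a large gamma approaches the middle
# of the maximal plateau [0.75, 1.25] of this particular set.
for gamma in (1.0, 2.0, 200.0):
    print(gamma, badd(example_mu, gamma, 0.0, 3.0))
```

For this particular set the values decrease from the center of gravity towards the midpoint of the plateau as $\gamma$ grows, which is the interpolation between $D^{\mathrm{COG}}$ and $D^{\mathrm{MOM}}$ listed above.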

The BADD-defuzzification parametric class hence comprises the discontinuous case $D^{\mathrm{MOM}}$ as well as the continuous case $D^{\mathrm{COG}}$. Moreover, it would be interesting to see if an adjustment of the parameter $\gamma$ is useful to improve the degree of fulfilment of the consistency criterion. On the other hand, also allowing for flexibility in the shape of the controllers in that case may make the optimization problem underdetermined. Although we have learned from the previous section that the partition of unity may not be the optimal choice for an antecedent rule base $\Xi \in \mathcal{P}^*(F(X))$, we have examined this one nevertheless because of its symmetric nature. Using a long and cumbersome calculation (which we can of course provide to the reader upon simple request), we found the results of the BADD-defuzzification to be valid extensions of the three defuzzification operators which are given as a limit case by the expressions above, at least in the case of single controllers and of overlapping controllers, with and without border conditions. One striking result was the continuity of the BADD-defuzzification for any $\gamma \in \mathbb{R}_0^+$, while it is explicitly not continuous for $\gamma = 0$ or $\gamma \to \infty$, which can be considered as hybrid cases.

4.1 Results with No Border Constraints

Again, to give the reader an idea, we sketch some of the graphs we obtain, plotting the defuzzification value $D^{\mathrm{BADD}}(\mu_x, \gamma)$ in the case of no border constraints against the value $x$, for different values of $\gamma$, in Figure 16.

Fig. 16 Input–output functions for different values of γ

Since the only difference occurs on $[1, 2]$, we tried to minimize

$$d_1\left(D^{\mathrm{BADD}}(\mu_x, \gamma), x\right) = \int_1^2 \left|D^{\mathrm{BADD}}(\mu_x, \gamma) - x\right| dx$$

with respect to $\gamma \in \mathbb{R}_0^+$. Contrary to expectation, the optimum was not found for $\gamma = 1$. Calculating the derivative $\frac{d}{d\gamma} d_1\left(D^{\mathrm{BADD}}(\mu_x, \gamma), x\right)$ is even more difficult than the problem in Example 3.1, so we again plotted this value, this time against the values of $\gamma$, by means of Simpson’s integration rule, dividing the X-interval into 100 equal intervals, using Matlab. All calculations have been carried out with 64-bit precision. As a verification, we could moreover calculate the precise results for $\gamma = 1$ and $\gamma = 2$, being

$$d_1\left(D^{\mathrm{BADD}}(\mu_x, 1), x\right) = \frac{1}{4} + \ln\frac{4}{5} \approx 0.02685644868579$$

$$d_1\left(D^{\mathrm{BADD}}(\mu_x, 2), x\right) \approx 0.0764924981457362$$

respectively. The two graphs shown in Figure 17 sketch the result with a precision of $\gamma$ taken every $10^{-3}$, the second graph being a close-up of the first one. With a minimal step for $\gamma$ of $10^{-5}$ used in the calculations, the minimum occurs for $\gamma \approx 1.2041$, with error margin $10^{-4}$, which is most certainly not equal to 1. Moreover, the minimum itself is almost, but not quite, zero.
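A coarse version of this scan is easy to sketch. The code below is an illustration, not the Matlab computation described above: it assumes the rule base consists of the two triangles through $(0,0), (1,1), (2,0)$ and $(1,0), (2,1), (3,0)$, uses the midpoint rule instead of Simpson's rule, and scans only a handful of $\gamma$ values. All function names and step counts are assumptions.

```python
import math


def tri(y, centre):
    """Unit-height triangle with half-width 1 around centre."""
    return max(0.0, 1.0 - abs(y - centre))


def mu_x(x):
    """Consequence fuzzy set for input x (assumed rule base: triangles around 1 and 2)."""
    h1, h2 = tri(x, 1.0), tri(x, 2.0)
    return lambda y: max(min(h1, tri(y, 1.0)), min(h2, tri(y, 2.0)))


def badd(mu, gamma, lo=0.0, hi=3.0, steps=600):
    """BADD defuzzification by the midpoint rule."""
    h = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        w = mu(y) ** gamma
        num += y * w
        den += w
    return num / den


def d1_badd(gamma, steps=100):
    """Midpoint approximation of the integral of |D_BADD(mu_x, gamma) - x| over [1, 2]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = 1.0 + (k + 0.5) * h
        total += abs(badd(mu_x(x), gamma) - x) * h
    return total


# Coarse scan; the text reports a much finer step for gamma.
for gamma in (1.0, 1.1, 1.2, 1.3, 1.4):
    print(gamma, d1_badd(gamma))
```

For $\gamma = 1$ the result can be compared with the exact value $\frac{1}{4} + \ln\frac{4}{5}$ quoted above.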

4.2 Results with Border Constraints

In a similar case with border constraints, to give the reader an idea again, we sketched some of the graphs in Figure 18. Again, we tried to minimize

$$d_1\left(D^{\mathrm{BADD}}(\mu_x, \gamma), x\right) = \int_0^2 \left|D^{\mathrm{BADD}}(\mu_x, \gamma) - x\right| dx$$

Fig. 17 Minimum for γ

Fig. 18 Different values of γ in case of constrained borders

with respect to $\gamma \in \mathbb{R}_0^+$. Yet again, the optimum was not found for $\gamma = 1$. As the graphs make clear, adding border values to the fuzzy controller does not improve the result. We plotted the value of $d_1\left(D^{\mathrm{BADD}}(\mu_x, \gamma), x\right)$ against the values of $\gamma$ in a similar Matlab environment as before, with only the values for $\gamma = 1$ and $\gamma = 2$ being known exactly (available as a means of verification):

$$d_1\left(D^{\mathrm{BADD}}(\mu_x, 1), x\right) = \frac{8}{3} + \frac{2}{3}\ln 2 - 2\sqrt{2}\,\operatorname{arctanh}\frac{\sqrt{2}}{2} \approx 0.63586382647903$$

$$d_1\left(D^{\mathrm{BADD}}(\mu_x, 2), x\right) \approx 0.459352869168308$$

The two graphs shown in Figure 19 sketch the result with a precision of $\gamma$ taken every $10^{-3}$, the second graph being a close-up of the first one. With a minimal step for $\gamma$ of $10^{-8}$ (convergence is very slow in this case!) used in the calculations, the minimum occurs for $\gamma \approx 5.24478$, with error margin $10^{-5}$, which is most certainly not even anywhere near $\gamma = 1$. Unlike the example with no border constraints, the minimum itself is not even close to zero.

Fig. 19 Minimum for γ

5 Conclusions

While the consistency criterion of Section 3 seems a very reasonable demand, it is very easy to debunk it: even for relatively simple functions, such as the identity, simple rule bases, such as triangular-shaped fuzzy sets forming a partition of unity, and simple defuzzification methods, such as COG-defuzzification, it is not hard to find either rule bases or defuzzification methods that just yield better results. Does this mean that the whole study has been pointless? Absolutely not: the consistency criterion of Section 3 is a good way to quantify the quality of a defuzzification, by measuring the difference between inputting the identity and obtaining the same identity as a result of a defuzzification process. While this is not an absolute measure, and is strongly dependent on the chosen antecedent rule base, it can be used, e.g., to compare two different defuzzification methods on the same rule base. Further investigation still has to be carried out, such as into the influence of an increase in the number of rules in the antecedent rule base. This will be the topic of a sequel article.

References

1. D. Dubois, J. Lang and H. Prade. Fuzzy sets in approximate reasoning part 2: Logical approaches. Fuzzy Sets and Systems 40, pp. 203–244, 1991
2. D. Dubois and H. Prade. Fuzzy sets in approximate reasoning part 1: Inference with possibility distributions. Fuzzy Sets and Systems 40, pp. 143–202, 1991
3. D.P. Filev and R.R. Yager. A generalized defuzzification method via BADD distributions. Internat. J. Intelligent Systems 6, pp. 687–697, 1991
4. E.E. Kerre. A comparative study of the behaviour of some popular fuzzy implication operators. In: L.A. Zadeh and J. Kacprzyk, eds., Fuzzy Logic for the Management of Uncertainty. Wiley, New York, 1992
5. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer Academic, Dordrecht, 1996
6. E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic controller. Int. Journal of Man-Machine Studies 7, pp. 1–13, 1975
7. A.M. Norwich and I.B. Turksen. A model for the measurement of membership and the consequences of its empirical implementation. Fuzzy Sets and Systems 12, pp. 1–25, 1985
8. D. Ruan, E.E. Kerre, G. De Cooman, B. Cappelle and F. Vanmassenhove. Influence of the fuzzy implication operator on the method-of-cases inference rule. Internat. J. Approx. Reasoning 4, pp. 307–318, 1990
9. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies – towards a theory of rational defuzzification operators. Second IEEE International Conference on Fuzzy Systems, San Francisco, pp. 1161–1166, 1994
10. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies – towards a theory of rational defuzzification operators. Second IEEE International Conference on Fuzzy Systems, San Francisco, pp. 1161–1166, 1993
11. M. Sugeno. An introductory survey of fuzzy control. Inform. Sci. 36, pp. 59–83, 1985
12. W. Van Leekwijck and E.E. Kerre. Defuzzification: criteria and classification. Fuzzy Sets and Systems 108, pp. 159–178, 1999
13. R.R. Yager and D.P. Filev. SLIDE: A simple adaptive defuzzification method. IEEE Trans. Fuzzy Systems 1(1), pp. 69–78, 1993
14. L.A. Zadeh. Fuzzy sets. Inform. Control 8, pp. 338–353, 1965
15. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man. Cybernet. 3, pp. 28–44, 1973
16. H.J. Zimmermann. Fuzzy Set Theory and Its Applications. Kluwer Academic, Boston/Dordrecht/London, 1996
An Asymptotic Consistency Criterion for
Optimizing Defuzzification in Fuzzy Control

Hyei Kyung Lee, Eric Paillet, and Werner Peeters

Abstract In [6], we already pointed out that in a fuzzy control process, the choice of a good defuzzification method is essential. Throughout the literature, various defuzzification methods have been proposed, classified according to the properties they fulfil, such as continuity, scale invariance, core consistency and so forth. In [6] we added a new criterion, by demanding that the defuzzification of the fuzzy image of a basic function, such as the identity, should still yield the identity, and we immediately found that this is almost never the case. However, the numerical deviation of this result can be established as a measure of fitness for the fuzzy controller in the particular problem. Moreover, given a parametric family of such defuzzification operators, such as D.P. Filev and R.R. Yager’s BADD-defuzzification ([3]), we were able to optimize the problem with respect to the arbitrary parameter. In this chapter, we will weaken our Consistency Criterion posed in [6] to a version that only needs to hold in an asymptotic case, namely with an infinite refinement of the width of the fuzzy antecedent rules. We will show that what ensues is a nice numerical description of the fitness of certain (families of) fuzzy defuzzification operators.

Keywords: fuzzy control, defuzzification, consistency, antecedent rule base

University of Antwerp, Dept. of Mathematics and Computer Science, Middelheimlaan 1, B-2020 Antwerp, Belgium, e-mail: [email protected]
R. Lowen and A. Verschoren (eds.), Foundations of Generic Optimization, Volume 2: Applications of Fuzzy Control, Genetic Algorithms and Neural Networks, 433–456. © 2008 Springer.

1 Introduction

Fuzzy control ([18]) is used in a wide scope of applied sciences, including physics, electronics and economics. It is based on the concept of fuzzy sets as introduced by L.A. Zadeh ([16] and [17]), extending the notion of a membership function from a two-valued logic to one in which the range values continuously vary within I = [0, 1]. One major step is the defuzzification process, in which the fuzzy data is again sampled into a single output value which is asserted to be a good representation value of the (fuzzy) outcome of the control process. Depending on the circumstances, several properties of defuzzification techniques have been studied extensively, such as continuity, core representation and scaling invariance, and for a good overview we would like to refer to the excellent articles of T.A. Runkler et al. [12] and W. Van Leekwijck et al. [14]. In [6], we defined a consistency criterion (CC) which should be a measure of the effectiveness of a fuzzy controller, by calculating how much a given function (usually the identity) differs in $L^1$-measure from its image through a fuzzy controller and defuzzification process. For a more detailed description of the $L^1$-measure between functions, we refer to [4]. As we would expect, the result turns out to be dependent on the fuzzy rule base as well as on the chosen defuzzification process, which allows for a quantitative comparative study. One major drawback though is that the identity function is rarely ever mapped onto the identity function, even with the most obvious and usually well-behaved defuzzification operators. And so far we have not even asked ourselves the question whether other “basic” functions are mapped onto themselves through the (fuzzy) identity operator.
In this chapter however, we will study the influence of increasing the number of controller functions in the rule base on this $L^1$-measure, and use it as a means to compare the quality of two different fuzzy controller sets. Logically, an increase of the number of controllers represents a fine-tuning of the way information is handled by the antecedent rule base. We therefore expect the results to improve with the number of controllers, and this indeed turns out to be the case when some selected additional assumptions are made, regarding scale-invariance for instance. However, rather than merely establishing this convergence, we would like to render the information obtained in a numerical way, in order to compare the different defuzzification methods and their consistency, or rather a concept we will call asymptotic consistency, as described in [6]. The closed formulas are still manageable in the case of the most common defuzzifiers, such as center-of-gravity defuzzification (a canonical continuous example) or Mean-of-Maximum defuzzification (a canonical discontinuous example). When we investigate this asymptotic consistency on a parametric family of defuzzification operators such as the BADD-defuzzification presented in [3] (the latter chosen because it incorporates both former examples), the formulas become far too complicated, and we hence have to rely on numerical techniques to draw some sensible conclusions.

2 Rule Antecedent Bases

In this section, we will establish the notations that will be used throughout this article. Many of the already established results can be traced back to [6]. The set X will denote the domain of the fuzzy sets, and can either be considered as $\mathbb{R}$ or any (closed) interval thereof.

Definition 1. A fuzzy rule antecedent base will be defined as a finite collection of rule antecedents consisting of fuzzy functions, which we will denote by $\{\alpha_i : X \to I\}_{i=1}^n$, and a consequence rule $\beta : X \to I$.
For one such function $\alpha : X \to I$, the support will be defined as

$$\operatorname{supp} \alpha := \{x \in X : \alpha(x) > 0\}.$$

On the other hand, the core of a fuzzy set $\alpha : X \to I$ will be defined as

$$\operatorname{core} \alpha := \left\{x \in X : \alpha(x) = \sup_{y \in X} \alpha(y)\right\} = \{x \in X : \forall y \in X : \alpha(y) \le \alpha(x)\}.$$

The set of all such collections of rule antecedents $\Xi = \{\alpha_i : X \to I\}_{i=1}^n$ shall be denoted as $\mathcal{P}^*(F(X))$, being the collection of all finite subsets of $F(X)$, the fuzzy sets on X. A collection of rule antecedents $\Xi$ will be called a partition of unity if and only if

$$\forall x \in X : \sum_{i=1}^{n} \alpha_i(x) = 1.$$

The consequence functions can be considered as members of the same set. As for the implication, following E.H. Mamdani et al. in [8], given that each rule is of the type

$$r : \text{IF } \left(X_1 = A_1^{j_1} \text{ and } \ldots \text{ and } X_n = A_n^{j_n}\right) \text{ THEN } \left(Y = B_j\right),$$

where $A_i^{j_i}$ is the value of the j-th term of the linguistic variable i corresponding to the antecedent membership function $\alpha_i^{j_i}$, and $B_j$ is the value of the j-th term of the linguistic variable corresponding to the consequence membership function $\beta_j$, the aggregation of the rules is made by calculating

$$k_r(x) := \bigwedge_{i=1}^{n} \alpha_i^{j_i}(x_i)$$

for each of the input vectors $x = (x_1, \ldots, x_n)$, and determining the consequence fuzzy set as

$$\mu_x(y) := \rho(x, y) = \bigvee_r \left(\beta_j(y) \wedge k_r(x)\right).$$

As stated in [6], this operation, commonly referred to as a fuzzy implication, is in fact the cartesian product. Other possible implication operators can be considered, as described by D. Dubois et al. in [1] and [2], by D. Ruan et al. in [10], by E.H. Mamdani et al. in [8] and M. Sugeno in [13].
Definition 2. A defuzzifier will be any mapping that associates a value D(µ ) ∈ X
with any fuzzy function µ ∈ F(X).
Generally, defuzzifiers should fulfil a number of constraints that make them semantically correct, in order that they are able to select a suitable “representative” crisp value, assigned to the fuzzy output. Most of these properties are described in T.A. Runkler et al. [12] and W. Van Leekwijck et al. [14]. In what follows, we will need to apply various scaling arguments; therefore the conditions of ordinal scale-invariance ([9], [11], [14]) and universal scale-invariance ([11]) should hold. The descriptions are re-formulated in [6]. As also mentioned in that article, the most commonly used defuzzifiers satisfying these conditions include: the first, last and middle of maxima defuzzification $D^{\mathrm{FOM}}$, $D^{\mathrm{LOM}}$ and $D^{\mathrm{MOM}}$, the middle of support defuzzification $D^{\mathrm{MOS}}$, the center-of-gravity defuzzification

$$D^{\mathrm{COG}}(\mu) = \frac{\int_X y \, \mu(y) \, dy}{\int_X \mu(y) \, dy}$$

and the basic defuzzification distributions $D^{\mathrm{BADD}}$

$$\forall \gamma \in \mathbb{R}^+ : D^{\mathrm{BADD}}(\mu, \gamma) = \frac{\int_X y \, \mu^\gamma(y) \, dy}{\int_X \mu^\gamma(y) \, dy}$$

One other obvious criterion a defuzzifier has to satisfy is continuity, implying that $F(X)$ carries some topology to describe the distance between two fuzzy sets. In [6], we described an alternative approach that does not rely on any structure on $F(X)$, as follows: suppose one has a finite collection of fuzzy sets as rule antecedents. Each rule antecedent consists of only one fuzzy set $\alpha : X \to I$. Suppose that each consequence function $\beta : X \to I$ is obtained as the image of the fuzzy set through the mapping $\mathrm{id} : X \to X$. The function $\theta : X \times \mathcal{P}^*(F(X)) \to X$ associates with each input value $x$ and each antecedent (and consequent) fuzzy set collection $\{\alpha_i\}_{i=1}^n$ in $\mathcal{P}^*(F(X))$ an output $\theta(x, \Xi) = D^{***}(\mu_x)$ for some fixed choice of a defuzzifier $***$, where $\mu_x : X \to I$ is derived from the rule base $\Xi = \{\alpha_i : X \to I\}_{i=1}^n$ as described above. Schematically, $\theta$ factors as $\theta = D^{***} \circ \theta^*$:

$$X \times \mathcal{P}^*(F(X)) \xrightarrow{\ \theta^*\ } F(X) \xrightarrow{\ D^{***}\ } X, \qquad (x, \Xi) \mapsto \mu_x \mapsto D^{***}(\mu_x).$$

Ideally, we look for a $\Xi$ and a $D$ such that the input–output function

$$\theta(\cdot, \Xi) : X \to X$$

is the identity function again, yet we found this is almost never the case. Nevertheless, we have assumed the following condition to hold:
Ansatz (Consistency Criterion) A rule base $\Xi \in \mathcal{P}^*(F(X))$ and a defuzzification operator $D^{***}$ are more suited for fuzzy control, the smaller the value $d_1(\mathrm{id}, D^{***}(\mu))$ is, with

$$D^{***}(\mu) : X \to X, \qquad x \mapsto D^{***}(\mu_x) = \theta(x, \Xi).$$

3 Rule Base Sequences

Definition 3. If $X = (X, d)$ is a metric space, then the width of a fuzzy set $\alpha \in F(X)$ will be defined as

$$\operatorname{width}(\alpha) = \sup_{x, y \in \operatorname{supp} \alpha} d(x, y).$$

By extension, for any rule base $\Xi = \{\alpha_i : X \to I\}_{i=1}^n \in \mathcal{P}^*(F(X))$, we can define the width of $\Xi$ as

$$\operatorname{width}(\Xi) = \max_{i=1}^{n} \operatorname{width}(\alpha_i).$$

Definition 4. A rule base sequence is any mapping $\mathbb{N} \to \mathcal{P}^*(F(X))$. The sequence will also be denoted as $(\Xi_n)_n$. The collection of all such rule base sequences will be denoted $\mathcal{R}(X)$.

Of course, a finite number of elements of $\mathbb{N}$ can be omitted without affecting the possible convergence behaviour of such sequences. Among all rule base sequences, we are particularly interested in those for which the maximal width of the supports of the fuzzy sets in the antecedent rule base tends to zero.

Definition 5. A rule base sequence $(\Xi_n)_n$ will be called a zero rule base sequence if and only if

$$\lim_{n \to \infty} \operatorname{width}(\Xi_n) = 0.$$

The subcollection of all such zero rule base sequences will be denoted $\mathcal{R}_0(X) \subseteq \mathcal{R}(X)$.

Example 6

Consider the following rule bases, depicted in Figure 1:

I. $\Xi_1$ on [0, 2], consisting of the antecedent rules
• $\alpha_1(x)$, passing through the points (0, 1) and (1, 0);
• $\alpha_2(x)$, passing through the points (0, 0), (1, 1) and (2, 0);
• $\alpha_3(x)$, passing through the points (1, 0) and (2, 1).

II. $\Xi_2$ on [0, 3], consisting of the antecedent rules
• $\alpha_1(x)$, passing through the points (0, 1) and (1, 0);
• $\alpha_2(x)$, passing through the points (0, 0), (1, 1) and (2, 0);
• $\alpha_3(x)$, passing through the points (1, 0), (2, 1) and (3, 0);
• $\alpha_4(x)$, passing through the points (2, 0) and (3, 1).

Fig. 1 Ξ1 and Ξ2

All of these are special cases of the antecedent rules $\Xi_n$ on $[0, n+1]$, consisting of the antecedent rules
• $\alpha_1(x)$, passing through the points (0, 1) and (1, 0);
• $\alpha_2(x)$, passing through the points (0, 0), (1, 1) and (2, 0);
• ...
• $\alpha_{n+1}(x)$, passing through the points $(n-1, 0)$, $(n, 1)$ and $(n+1, 0)$;
• $\alpha_{n+2}(x)$, passing through the points $(n, 0)$ and $(n+1, 1)$.

Obviously, in any of the examples above, $\operatorname{width}(\Xi_n) = 2$. Note that these are all partitions of unity. We would like to redefine all these onto the same base space $X = [0, 1]$, since the defuzzifications that will be used will all be universe scale-invariant. Therefore, let us define

$$\forall n \in \mathbb{N}_0, \forall x \in [0, 1] : \Theta_n(x) = \Xi_n\left(\frac{x}{n+1}\right)$$

Then $\operatorname{width}(\Theta_n) = \frac{2}{n+1}$, so $\lim_{n \to \infty} \operatorname{width}(\Theta_n) = 0$. Therefore $(\Theta_n)_n$ is a zero rule base sequence, which moreover consists of partitions of unity.
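The rescaled rule bases can be written down explicitly. The sketch below is an illustration under an assumption about the rescaling: it reads the members of $\Theta_n$ as the members of $\Xi_n$ compressed from $[0, n+1]$ onto $[0, 1]$, so that the k-th member peaks at $k/(n+1)$ with half-width $1/(n+1)$. The function names are hypothetical.

```python
def theta_n(n):
    """Members of the rescaled rule base on [0, 1] (n + 2 fuzzy sets, peaks at k/(n+1))."""
    scale = n + 1

    def member(k):
        def alpha(x):
            if not 0.0 <= x <= 1.0:
                return 0.0
            return max(0.0, 1.0 - abs(scale * x - k))
        return alpha
    return [member(k) for k in range(n + 2)]


def width(n):
    """Largest support length among the members of theta_n(n), restricted to [0, 1]."""
    lengths = []
    for k in range(n + 2):
        lo = max(0.0, (k - 1) / (n + 1))
        hi = min(1.0, (k + 1) / (n + 1))
        lengths.append(hi - lo)
    return max(lengths)


for n in (1, 2, 4, 9, 99):
    total = sum(alpha(0.37) for alpha in theta_n(n))  # should equal 1 (partition of unity)
    print(n, width(n), total)
```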
Under a rescaling as described above, and for a defuzzification which is universe scale-invariant, it is also useful to know what influence such a scaling has on the $L^1$-distance between, say, the defuzzified function $D^{***}(\mu)$ and the identity.

Proposition 7. Let a ∈ R+0 and b ∈ R define a positive affine transformation τ (x) =
ax + b on X. For µ ∈ F(X), define

\[
\mu^{a,b} : X \longrightarrow I, \qquad x \mapsto [\tau(\mu)](x) = \mu\big(\tau^{-1}(x)\big) = \mu\!\left(\frac{x-b}{a}\right),
\]

on condition that this is well–defined. Let furthermore D∗∗∗ (µ ) be a universe
scale-invariant defuzzification operator. Then

\[
d_1\big(D^{***}(\mu^{a,b}),\ \mathrm{id}_X\big) = a^2\, d_1\big(D^{***}(\mu),\ \mathrm{id}_{\tau^{-1}(X)}\big)
\]

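Proof: For universe scale-invariant defuzzifiers, we know that ([6])

\[
D\big(\mu^{a,b}\big) = a\,D(\mu) + b.
\]

Therefore, if we denote by Y := τ −1 (X) the pre-image of X under the affine
transformation τ (x) = ax + b, it is easy to see that idX ◦τ = τ ◦ idY and hence
idX = τ ◦ idY ◦τ −1 . So

\[
d_1\big(D^{***}(\mu^{a,b}),\ \mathrm{id}_X\big)
= \int_X \big| D^{***}(\mu^{a,b}) - \mathrm{id}_X(x) \big|\, dx
= \int_X \big| D^{***}(\mu^{a,b}) - \tau \circ \mathrm{id}_Y \circ \tau^{-1}(x) \big|\, dx
\]

Putting x = τ (y) = ay + b, we obtain that dx = a dy and, more importantly,
D(µ a,b ) = aD(µ ) + b. Therefore

\[
\begin{aligned}
d_1\big(D^{***}(\mu^{a,b}),\ \mathrm{id}_X\big)
&= \int_Y \big| a D^{***}(\mu) + b - \tau \circ \mathrm{id}_Y(y) \big| \cdot a\, dy \\
&= a \int_Y \big| a D^{***}(\mu) + b - \tau(y) \big|\, dy \\
&= a \int_Y \big| a D^{***}(\mu) + b - (ay + b) \big|\, dy \\
&= a^2 \int_Y \big| D^{***}(\mu) - \mathrm{id}_Y(y) \big|\, dy
 = a^2\, d_1\big(D^{***}(\mu),\ \mathrm{id}_{\tau^{-1}(X)}\big)
\end{aligned}
\]

QED.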
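The scaling law of Proposition 7 can be checked numerically. The sketch below is only an illustration: the curve `curve_on_Y` is a hypothetical stand-in for a defuzzified input-output curve (it is not taken from the text), and the values of `a` and `b` are arbitrary assumptions.

```python
def d1_to_identity(f, lo, hi, steps=20_000):
    """Midpoint-rule estimate of the integral of |f(t) - t| over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        total += abs(f(t) - t)
    return total * h


def curve_on_Y(y):
    """Hypothetical input-output curve on Y = [0, 2] (an assumption for illustration)."""
    return 0.5 * y * y


a, b = 0.5, 0.3  # assumed positive affine map tau(x) = a*x + b


def curve_on_X(x):
    """Scale-invariance D(mu^{a,b}) = a*D(mu) + b, evaluated at y = tau^{-1}(x)."""
    return a * curve_on_Y((x - b) / a) + b


dist_Y = d1_to_identity(curve_on_Y, 0.0, 2.0)          # on Y = tau^{-1}(X)
dist_X = d1_to_identity(curve_on_X, b, 2.0 * a + b)    # on X = tau(Y)
print(dist_X, a * a * dist_Y)  # the two numbers should agree
```

For this particular curve the distance on Y is 2/3 in closed form, so the scaled distance is a² · 2/3.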

440 H.K. Lee et al.

4 The Asymptotic Consistency Criterion

In this section, we will study the different defuzzification operators on the standard
zero rule base sequence (Θn )n as given in Section 3, even though we learned from
[6] that the partition of unity is by no means the optimal antecedent rule base. Instead
of refining the control by increasing the number of controllers on the same unit
interval, we can also use the scale invariance and, along with the increase of the
number n of controllers, extend the interval on which we are working. To make a
fair comparison, we afterwards divide the d1 -distance by a factor (n + 1)2 , following
Proposition 7. Using the defuzzification formulas we obtained in the aforementioned
article, we find the following results.
A first question that arises is whether the L1 –difference between D∗∗∗ (µx ) and
the identity function decreases with an increase in the number of antecedent rules.
Roughly put, would it be true that

\[
\limsup_{n\to\infty}\ d_1\big(D^{***}(\theta^*(x,\Theta_n)),\ \mathrm{id}_X\big) = 0\ ?
\]

In this section we will prove that for some important defuzzification operators
D∗∗∗ such as DMOM and DCOG not only $(D^{***}(\theta^*(x,\Theta_n)))_n \xrightarrow{L^1} \mathrm{id}_{[0,1]}$, but even
$(D^{***}(\theta^*(x,\Theta_n)))_n \xrightarrow{u} \mathrm{id}_{[0,1]}$, which is a much stronger assertion. The purpose of
calculating the L1 –distance nevertheless is that it permits us to compare different
defuzzification methods with each other in terms of how many more or fewer
antecedent rules are needed to achieve the same degree of accuracy.
As we have seen, the consistency criterion mostly does not hold, so we will now
first weaken it to an asymptotic form:
Ansatz (Asymptotic Consistency Criterion) A zero rule base sequence

\[
(\Xi_n)_n \subseteq R_0(X)
\]

and a defuzzification operator D∗∗∗ should be such that

\[
\limsup_{n\to\infty}\ d_1\big(D^{***}(\theta^*(x,\Xi_n)),\ \mathrm{id}_{[0,1]}\big) = 0
\]

It is trivial to see that if an antecedent rule base Ξ fulfills the consistency criterion,
then the constant rule base sequence (Ξn = Ξ)n , even though it is not a zero rule base
sequence, fulfills

\[
\big(D^{***}(\theta^*(x,\Xi_n)) = \mathrm{id}_X\big)_n \xrightarrow{u} \mathrm{id}_X
\]

and hence trivially also

\[
\big(D^{***}(\theta^*(x,\Xi_n)) = \mathrm{id}_X\big)_n \xrightarrow{L^1} \mathrm{id}_X .
\]

On the other hand, this condition is far too strong to be fulfilled, certainly for the
most common types of defuzzifiers. The consistency criterion CC is therefore
stronger than the asymptotic consistency criterion ACC.

4.1 MOM-defuzzification

Before stating and proving the general theorems, we would like to provide the reader
with some basic examples, in order to show the techniques involved.

Examples 8

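I. Considering Ξ1 on [0, 2] and putting µx := θ ∗ (x, Ξ1 ), following [6], we have that

\[
D^{MOM}(\mu_x) =
\begin{cases}
0 & \text{if } x = 0 \\
x/2 & \text{if } x \in\, ]0, 1/2[ \\
3/4 & \text{if } x = 1/2 \\
1 & \text{if } x \in\, ]1/2, 3/2[ \\
5/4 & \text{if } x = 3/2 \\
(2 + x)/2 & \text{if } x \in\, ]3/2, 2[ \\
2 & \text{if } x = 2
\end{cases}
\]

which looks like the left graph in Figure 2. Consequently,

\[
\begin{aligned}
d_1\big(D^{MOM}(\mu),\ \mathrm{id}_{[0,2]}\big)
&= \int_0^2 \big| D^{MOM}(\mu_x) - x \big|\, dx \\
&= \int_0^{1/2} \Big(x - \frac{x}{2}\Big) dx + \int_{1/2}^{1} (1 - x)\, dx + \int_{1}^{3/2} (x - 1)\, dx
 + \int_{3/2}^{2} \Big(\frac{2+x}{2} - x\Big) dx = \frac{3}{8}
\end{aligned}
\]

Scaling to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^*(x,\Theta_1)$, we
therefore obtain

\[
D_1 = d_1\big(D^{MOM}(\mu^{a,b}),\ \mathrm{id}_{[0,1]}\big) = \frac{3/8}{2^2} = \frac{3}{32}
\]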

II. Considering Ξ2 on [0, 3] and putting µx := θ ∗ (x, Ξ2 ), following [6], we have
that

Fig. 2 The input–output curve for the rule bases Ξ1 and Ξ2 with MOM-defuzzification



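\[
D^{MOM}(\mu_x) =
\begin{cases}
0 & \text{if } x = 0 \\
x/2 & \text{if } x \in\, ]0, 1/2[ \\
3/4 & \text{if } x = 1/2 \\
1 & \text{if } x \in\, ]1/2, 3/2[ \\
3/2 & \text{if } x = 3/2 \\
2 & \text{if } x \in\, ]3/2, 5/2[ \\
9/4 & \text{if } x = 5/2 \\
(3 + x)/2 & \text{if } x \in\, ]5/2, 3[ \\
3 & \text{if } x = 3
\end{cases}
\]

which looks like the right graph in Figure 2. Consequently, analogously to the
previous example,

\[
d_1\big(D^{MOM}(\mu),\ \mathrm{id}_{[0,3]}\big) = \int_0^3 \big| D^{MOM}(\mu_x) - x \big|\, dx = \frac{5}{8}
\]

Scaling to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^*(x,\Theta_2)$, we
therefore obtain

\[
D_2 = d_1\big(D^{MOM}(\mu^{a,b}),\ \mathrm{id}_{[0,1]}\big) = \frac{5/8}{3^2} = \frac{5}{72}
\]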
More generally, the following theorem holds:

Theorem 9. $\big(D^{MOM}(\theta^*(x,\Theta_n))\big)_n \xrightarrow{L^1} \mathrm{id}_{[0,1]}$

Proof: Considering Ξn on [0, n + 1] and putting µx := θ ∗ (x, Ξn ), following [6], we
have that

\[
D^{MOM}(\mu_x) =
\begin{cases}
0 & \text{if } x = 0 \\
x/2 & \text{if } x \in\, ]0, 1/2[ \\
3/4 & \text{if } x = 1/2 \\
k & \text{if } x \in\, ]k - 1/2, k + 1/2[ \text{ with } k \in \{1, 2, \ldots, n\} \\
k + 1/2 & \text{if } x = k + 1/2 \text{ with } k \in \{1, 2, \ldots, n-1\} \\
n + 1/4 & \text{if } x = n + 1/2 \\
(n + 1 + x)/2 & \text{if } x \in\, ]n + 1/2, n + 1[ \\
n + 1 & \text{if } x = n + 1
\end{cases}
\]

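Consequently,

\[
\begin{aligned}
d_1\big(D^{MOM}(\mu),\ \mathrm{id}_{[0,n+1]}\big)
&= \int_0^{n+1} \big| D^{MOM}(\mu_x) - x \big|\, dx \\
&= \int_0^{1/2} \Big(x - \frac{x}{2}\Big) dx + 2n \int_{1/2}^{1} (1 - x)\, dx
 + \int_{n+1/2}^{n+1} \Big(\frac{n+1+x}{2} - x\Big) dx \\
&= \frac{1}{16} + \frac{1}{4} n + \frac{1}{16} = \frac{1}{4} n + \frac{1}{8}
\end{aligned}
\]

Scaling to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^*(x,\Theta_n)$, we
therefore obtain

\[
D_n = d_1\big(D^{MOM}(\mu^{a,b}),\ \mathrm{id}_{[0,1]}\big) = \frac{\frac{1}{8} + \frac{1}{4} n}{(n+1)^2} = \frac{1 + 2n}{8(n+1)^2}
\]

As a result, $\limsup_{n\to\infty} D_n = \lim_{n\to\infty} \frac{1+2n}{8(n+1)^2} = 0$, which proves the theorem. QED.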
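The closed form for Dn can be cross-checked numerically. The following sketch integrates the piecewise MOM curve from the proof with a midpoint rule; the function names are illustrative assumptions. Because the curve is piecewise linear and every breakpoint falls on a cell boundary, the midpoint rule is exact here up to rounding.

```python
def mom_curve(x, n):
    """MOM input-output curve for Xi_n on [0, n + 1], as in the proof above.
    The isolated points x = k + 1/2 have measure zero and are ignored."""
    if x < 0.5:
        return x / 2.0
    if x > n + 0.5:
        return (n + 1 + x) / 2.0
    return float(int(x + 0.5))  # the integer k with x in ]k - 1/2, k + 1/2[


def scaled_mom_distance(n, cells_per_unit=100):
    """d1 distance to the identity on [0, n + 1], divided by (n + 1)^2."""
    h = 1.0 / cells_per_unit
    total = 0.0
    for i in range((n + 1) * cells_per_unit):
        x = (i + 0.5) * h  # cell midpoint, never a breakpoint
        total += abs(mom_curve(x, n) - x)
    return total * h / (n + 1) ** 2


for n in (1, 2, 5, 20):
    print(n, scaled_mom_distance(n), (1 + 2 * n) / (8 * (n + 1) ** 2))
```

For n = 1 and n = 2 this reproduces the values 3/32 and 5/72 of Examples 8.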
Although this is a nice result, we can do even better. When all these functions
are scaled back to [0, 1], we claim that we even have uniform convergence to the
identity function.
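Theorem 10. $\big(D^{MOM}(\theta^*(x,\Theta_n))\big)_n \xrightarrow{u} \mathrm{id}_{[0,1]}$

Proof: Considering Ξn on [0, n + 1] and putting $\nu_x^n := \theta^*(x,\Xi_n)$, following [6], we
have that

\[
D^{MOM}(\nu_x^n) =
\begin{cases}
0 & \text{if } x = 0 \\
\frac{x}{2} & \text{if } x \in\, ]0, \frac{1}{2}[ \\
\frac{3}{4} & \text{if } x = \frac{1}{2} \\
k & \text{if } x \in\, ]k - \frac{1}{2}, k + \frac{1}{2}[ \text{ with } k \in \{1, 2, \ldots, n\} \\
k + \frac{1}{2} & \text{if } x = k + \frac{1}{2} \text{ with } k \in \{1, 2, \ldots, n-1\} \\
n + \frac{1}{4} & \text{if } x = n + \frac{1}{2} \\
\frac{n+1+x}{2} & \text{if } x \in\, ]n + \frac{1}{2}, n + 1[ \\
n + 1 & \text{if } x = n + 1
\end{cases}
\]

Scaled to the unit interval, and considering Θn on [0, 1] and putting $\mu_x^n :=
\theta^*(x,\Theta_n)$, we have that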


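\[
D^{MOM}(\mu_x^n) =
\begin{cases}
0 & \text{if } x = 0 \\
\frac{x}{2} & \text{if } x \in\, ]0, \frac{1}{2n+2}[ \\
\frac{3}{4n+4} & \text{if } x = \frac{1}{2n+2} \\
\frac{k}{n+1} & \text{if } x \in\, ]\frac{2k-1}{2n+2}, \frac{2k+1}{2n+2}[ \text{ with } k \in \{1, 2, \ldots, n\} \\
\frac{2k+1}{2n+2} & \text{if } x = \frac{2k+1}{2n+2} \text{ with } k \in \{1, 2, \ldots, n-1\} \\
\frac{4n+1}{4n+4} & \text{if } x = \frac{2n+1}{2n+2} \\
\frac{1+x}{2} & \text{if } x \in\, ]\frac{2n+1}{2n+2}, 1[ \\
1 & \text{if } x = 1
\end{cases}
\]

Therefore,

\[
\begin{aligned}
\big\| D^{MOM}(\mu_x^n) - \mathrm{id}_{[0,1]} \big\|_\infty
&= \Big( \sup_{t \in ]0, \frac{1}{2n+2}[} \Big| t - \frac{t}{2} \Big| \Big)
 \vee \Big| \frac{3}{4n+4} - \frac{1}{2n+2} \Big|
 \vee \sup_{k=1}^{n} \Big( \sup_{t \in ]\frac{2k-1}{2n+2}, \frac{2k+1}{2n+2}[} \Big| t - \frac{k}{n+1} \Big| \Big) \\
&\quad \vee \Big| \frac{2n+1}{2n+2} - \frac{4n+1}{4n+4} \Big|
 \vee \Big( \sup_{t \in ]\frac{2n+1}{2n+2}, 1[} \Big| t - \frac{1+t}{2} \Big| \Big) \\
&= \frac{1}{4|n+1|} \vee \frac{1}{4|n+1|} \vee \frac{1}{2|n+1|} \vee \frac{1}{4|n+1|} \vee \frac{1}{4|n+1|} = \frac{1}{2|n+1|}
\end{aligned}
\]

Hence

\[
\lim_{n\to\infty} \big\| D^{MOM}(\mu_x^n) - \mathrm{id}_{[0,1]} \big\|_\infty = \lim_{n\to\infty} \frac{1}{2n+2} = 0;
\]

therefore $\big(D^{MOM}(\mu_x^n)\big)_n \xrightarrow{u} \mathrm{id}$. QED.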
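The supremum norm 1/(2(n + 1)) can be approximated by sampling the rescaled curve on a fine grid. This is only a sketch with assumed helper names; a grid can approach, but in general not attain, a supremum taken over open intervals, so the comparison is made up to the grid resolution.

```python
import math


def mom_curve_unit(t, n):
    """MOM curve for Theta_n on [0, 1]: the curve for Xi_n rescaled by 1/(n + 1)."""
    x = (n + 1) * t
    if x < 0.5:
        y = x / 2.0
    elif x == 0.5:
        y = 0.75
    elif x > n + 0.5:
        y = (n + 1 + x) / 2.0
    elif x == n + 0.5:
        y = n + 0.25
    elif (x + 0.5).is_integer():
        y = x  # x = k + 1/2 with k in {1, ..., n - 1}
    else:
        y = math.floor(x + 0.5)  # plateau value k
    return y / (n + 1)


def sup_distance(n, samples=20_001):
    """Largest sampled value of |D(t) - t| on an equidistant grid of [0, 1]."""
    grid = (i / (samples - 1) for i in range(samples))
    return max(abs(mom_curve_unit(t, n) - t) for t in grid)


for n in (1, 3, 9):
    print(n, sup_distance(n), 1.0 / (2 * (n + 1)))
```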


4.2 COG-defuzzification

Unlike the MOM-defuzzification, COG-defuzzification is a continuous operation
with respect to the standard Euclidean topology on the base space. Again, before
stating and proving the general theorems, we would like to provide the reader with
some examples.

Examples 11

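I. Considering Ξ1 on [0, 2] and putting µx := θ ∗ (x, Ξ1 ), following [6], we have that

\[
D^{COG}(\mu_x) =
\begin{cases}
\dfrac{x^3 + 3x^2 - 9x - 1}{3(x^2 - 2x - 1)} & \text{if } 0 \le x \le 1 \\[2ex]
\dfrac{x^3 - 3x^2 + 3x - 7}{3(x^2 - 2x - 1)} & \text{if } 1 \le x \le 2
\end{cases}
\]

which looks like the left graph in Figure 3. Consequently,

\[
d_1\big(D^{COG}(\mu_x),\ \mathrm{id}_{[0,2]}\big) = \int_0^2 \big| D^{COG}(\mu_x) - x \big|\, dx
= \frac{8}{3} + \frac{2}{3} \ln 2 - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big|
\]

Scaling to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^*(x,\Theta_1)$, we
therefore obtain

\[
D_1 = \frac{\frac{8}{3} + \frac{2}{3} \ln 2 - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big|}{2^2}
= \frac{2}{3} + \frac{1}{6} \ln 2 - \frac{\sqrt{2}}{2} \ln\big(\sqrt{2} + 1\big)
\]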

II. Considering Ξ2 on [0, 3] and putting µx := θ ∗ (x, Ξ2 ), following [6], we have that

\[
D^{COG}(\mu_x) =
\begin{cases}
\dfrac{x^3 + 3x^2 - 9x - 1}{3(x^2 - 2x - 1)} & \text{if } 0 \le x \le 1 \\[2ex]
\dfrac{3x^2 - 11x + 6}{2(x^2 - 3x + 1)} & \text{if } 1 \le x \le 2 \\[2ex]
\dfrac{x^3 - 3x^2 - 8}{3(x^2 - 4x + 2)} & \text{if } 2 \le x \le 3
\end{cases}
\]

which looks like the right graph in Figure 3. Consequently, analogously to the
first example,

Fig. 3 The input–output curve for the rule bases Ξ1 and Ξ2 with COG-defuzzification

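\[
d_1\big(D^{COG}(\mu_x),\ \mathrm{id}_{[0,3]}\big) = \int_0^3 \big| D^{COG}(\mu_x) - x \big|\, dx
= \frac{35}{12} + \frac{8}{3} \ln 2 - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big| - \ln 5
\]

Scaling to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^*(x,\Theta_2)$, we
therefore obtain

\[
D_2 = \frac{\frac{35}{12} + \frac{8}{3} \ln 2 - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big| - \ln 5}{3^2}
= \frac{35}{108} + \frac{8}{27} \ln 2 - \frac{2}{9} \sqrt{2}\, \ln\big(\sqrt{2} + 1\big) - \frac{1}{9} \ln 5
\]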
32 108 27 9 9
More generally, we are able to establish the following theorem:

Theorem 12. $\big(D^{COG}(\theta^*(x,\Theta_n))\big)_n \xrightarrow{L^1} \mathrm{id}_{[0,1]}$

Proof: Considering Ξn on [0, n + 1] and putting µx := θ ∗ (x, Ξn ), following [6], we
have that

\[
D^{COG}(\mu_x) =
\begin{cases}
\dfrac{x^3 + 3x^2 - 9x - 1}{3(x^2 - 2x - 1)} & \text{if } 0 \le x \le 1 \\[2ex]
\dfrac{2x^2 k + x^2 - 4k^2 x - 4kx - 3x + 2k^3 + 3k^2 + k}{2(x^2 - 2kx - x + k^2 + k - 1)} & \text{if } k \le x \le k+1,\ k \in \{1, \ldots, n-1\} \\[2ex]
\dfrac{x^3 - 3x^2 - 3n^2 x + 6nx + 2n^3 - 3n^2 - 6n}{3(x^2 - 2nx + n^2 - 2)} & \text{if } n \le x \le n + 1
\end{cases}
\]

Consequently, using symmetry considerations, one can easily verify that

\[
\begin{aligned}
d_1\big(D^{COG}(\mu_x),\ \mathrm{id}_{[0,n+1]}\big)
&= \int_0^{n+1} \big| D^{COG}(\mu_x) - x \big|\, dx \\
&= \frac{8}{3} - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big| + \frac{2}{3} \ln 2
 + 2(n-1) \Big( -\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8} \Big)
\end{aligned}
\]

Scaling to the unit interval following Proposition 7, with $\mu_x^{a,b} := \theta^*(x,\Theta_n)$, we
therefore obtain

\[
D_n = d_1\big(D^{COG}(\mu_x^{a,b}),\ \mathrm{id}_{[0,1]}\big)
= \frac{\frac{8}{3} - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big| + \frac{2}{3} \ln 2 + 2(n-1)\big( -\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8} \big)}{(n+1)^2}
\]

Here, obviously, $\lim_{n\to\infty} D_n = 0$, which proves the theorem. QED.
Also in this case, we can prove that the convergence is uniform.

Theorem 13. $\big(D^{COG}(\theta^*(x,\Theta_n))\big)_n \xrightarrow{u} \mathrm{id}_{[0,1]}$

Proof: Considering once again Ξn on [0, n + 1] and putting $\mu_x^n := \theta^*(x,\Xi_n)$, following
[6], we have that
\[
D^{COG}(\mu_x^n) =
\begin{cases}
\dfrac{x^3 + 3x^2 - 9x - 1}{3(x^2 - 2x - 1)} & \text{if } 0 \le x \le 1 \\[2ex]
\dfrac{2x^2 k + x^2 - 4k^2 x - 4kx - 3x + 2k^3 + 3k^2 + k}{2(x^2 - 2kx - x + k^2 + k - 1)} & \text{if } k \le x \le k+1,\ k \in \{1, \ldots, n-1\} \\[2ex]
\dfrac{x^3 - 3x^2 - 3n^2 x + 6nx + 2n^3 - 3n^2 - 6n}{3(x^2 - 2nx + n^2 - 2)} & \text{if } n \le x \le n + 1
\end{cases}
\]

We might as well apply scaling and consider Θn on [0, 1] right away. Then, putting
$\nu_x^n := \theta^*(x,\Theta_n)$, we obtain that on $\big[0, \frac{1}{n+1}\big]$

\[
\lim_{n\to\infty} \big| D^{COG}(\nu_x^n) - x \big| \le \lim_{n\to\infty} \frac{|(n-2)x| + |3(n+3)x| + |3(3n+2)x| + 1}{3\, |(n+1)^2 x^2 - 2(n+1) - 1|} = 0,
\]

and that on $\big[\frac{1}{n+1}, \frac{2}{n+1}\big]$

\[
\lim_{n\to\infty} \big| D^{COG}(\nu_x^n) - x \big| \le \lim_{n\to\infty} \frac{|8x| + |6(n+3)x| + |(11n+13)x| + 6}{2\, |(n^2 + 2n + 1)x^2 + (-3n - 3)x + 1|} = 0,
\]

for which we leave the calculations involved to the interested reader. For
symmetry reasons, therefore, $\lim_{n\to\infty} \big\| D^{COG}(\nu_x^n) - \mathrm{id} \big\|_\infty = 0$; therefore
$\big(D^{COG}(\nu_x^n)\big)_n \xrightarrow{u} \mathrm{id}$. QED.


4.3 BADD-defuzzification

Just as in [6], it would be advantageous to also compare asymptotically the behaviour
of a parametric class of defuzzifiers that incorporates the most important ones,
such as COG or MOM. Therefore, we will again study the so-called basic defuzzification
distributions DBADD (−, γ ) introduced by D.P. Filev and R.R. Yager in [3]:

\[
D^{BADD}(\mu, \gamma) = \frac{\int_X y\, \mu^\gamma(y)\, dy}{\int_X \mu^\gamma(y)\, dy}
\]

As also stated in [6], it is generally known (see also [14]) that


• DBADD (µ , 0) = DMOS (µ );
• DBADD (µ , 1) = DCOG (µ );
• lim DBADD (µ , γ ) = DMOM (µ ).
γ →∞
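The three special cases listed above can be illustrated on a single membership function. The sketch below is an assumption-laden example, not the authors' implementation: `triangle` is a hypothetical asymmetric triangular membership function on X = [0, 3] with peak at y = 1, and `badd` approximates the two integrals by the midpoint rule. For this triangle the quotient works out to 1 + 1/(γ + 2), so γ = 0 gives the mean of the support 3/2, γ = 1 gives the centre of gravity 4/3, and large γ approaches the maximum at 1.

```python
def badd(mu, lo, hi, gamma, steps=60_000):
    """Midpoint-rule estimate of the BADD defuzzification of mu on [lo, hi]."""
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        w = mu(y) ** gamma  # membership raised to the power gamma
        num += y * w
        den += w
    return num / den


def triangle(y):
    """Hypothetical membership function: triangle on [0, 3] with peak at y = 1."""
    return y if y <= 1.0 else (3.0 - y) / 2.0


for gamma in (0.0, 1.0, 10.0, 200.0):
    print(gamma, badd(triangle, 0.0, 3.0, gamma), 1.0 + 1.0 / (gamma + 2.0))
```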

The BADD-defuzzification parametric class hence comprises the discontinuous case
DMOM as well as the continuous case DCOG . Moreover, it would be interesting to see
whether an adjustment of the parameter γ is useful to improve the degree of fulfilment of
the asymptotic consistency criterion, and if so, whether any comparison can be made with
the result we obtained in [6], namely that the optimal γ –value should be γ ≈ 1.2041
when not considering borders, or γ ≈ 5.24478 when we do consider these. It is
intuitively assumed that the value of γ will diminish with an increasing number of
controllers, as the influence of the borders is reduced. Again, we will only study
zero rule base sequences that consist only of partitions of unity. Using the universe
scale-invariance and the ordinal scale-variance, and making use of the calculations
in [6], we can derive the following theorem:
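Theorem 14. $\forall \gamma > 0,\ \big(D^{BADD}(\theta^*(x,\Theta_n), \gamma)\big)_n \xrightarrow{L^1} \mathrm{id}_{[0,1]}$

Proof: Note first of all that we consider

\[
I_1^\gamma = \int_0^{1/2} \left|
\frac{\dfrac{x^{\gamma+1} + x(1-x)^{\gamma+1}}{\gamma+1} + \dfrac{x^2 (1-x)^\gamma + x^\gamma (3 - 2x)}{2} + \dfrac{(1-x)^{\gamma+2}}{(\gamma+1)(\gamma+2)}}
{(1-x)^\gamma x + \dfrac{(1-x)^{\gamma+1}}{\gamma+1} + x^\gamma} - x \right| dx,
\]

\[
I_2^\gamma = \int_{1/2}^{1} \left|
\frac{\dfrac{\gamma (1-x)^{\gamma+2}}{2(\gamma+2)} + 2 x^\gamma (1-x) + \dfrac{(2-x) x^{\gamma+1} + x^{\gamma+2}}{\gamma+1}}
{\dfrac{\gamma (1-x)^{\gamma+1}}{\gamma+1} + 2 \Big( \dfrac{x^{\gamma+1}}{\gamma+1} + (1-x) x^\gamma \Big)} - x \right| dx, \text{ and}
\]

\[
I_3^\gamma = \int_{1}^{3/2} \left|
\frac{\dfrac{(2-x)^{\gamma+2} + (x-1)^{\gamma+1} + x (2-x)^{\gamma+1}}{\gamma+1} + 2 (2-x)^\gamma (x-1) + \dfrac{(x-1)^\gamma (7 - 2x)}{2}}
{\dfrac{2}{\gamma+1} (2-x)^{\gamma+1} + 2 (2-x)^\gamma (x-1) + (x-1)^\gamma} - x \right| dx
\]

which are calculated by combining the results in [6] for the appropriate domains,
which means that we consider the defuzzification DBADD (θ ∗ (x, Ξn ), γ ) first, and
then scale it to the unit interval by use of Proposition 7. Therefore we can derive
that

\[
d_1\big(D^{BADD}(\theta^*(x,\Theta_n), \gamma),\ \mathrm{id}_{[0,1]}\big) = 2\, \frac{I_1^\gamma + I_2^\gamma + (n-1) I_3^\gamma}{(n+1)^2}
\]

It is then a triviality to see that

\[
\lim_{n\to\infty} d_1\big(D^{BADD}(\theta^*(x,\Theta_n), \gamma),\ \mathrm{id}_{[0,1]}\big) = 0,
\]

which proves the theorem. QED.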
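The interior integral of this proof, taken over [1, 3/2], lends itself to a quick numerical experiment. The sketch below assumes the integrand has the form [2(2−x)^{γ+1} + (x−1)^{γ+1}]/(γ+1) + 2(2−x)^γ(x−1) + (x−1)^γ(7−2x)/2 over 2(2−x)^{γ+1}/(γ+1) + 2(2−x)^γ(x−1) + (x−1)^γ, which for γ = 1 reduces to the middle COG branch (3x² − 11x + 6)/(2(x² − 3x + 1)) of Section 4.2. Function names and the number of Simpson panels are assumptions; this is not the authors' C program.

```python
import math


def badd_interior(x, gamma):
    """Assumed BADD input-output curve for x in [1, 3/2] (interior rules)."""
    p, q = 2.0 - x, x - 1.0
    num = ((2.0 * p ** (gamma + 1) + q ** (gamma + 1)) / (gamma + 1)
           + 2.0 * p ** gamma * q
           + q ** gamma * (7.0 - 2.0 * x) / 2.0)
    den = 2.0 * p ** (gamma + 1) / (gamma + 1) + 2.0 * p ** gamma * q + q ** gamma
    return num / den


def i3(gamma, panels=2000):
    """Composite Simpson estimate of the integral of |D - x| over [1, 3/2]."""
    lo, hi = 1.0, 1.5
    h = (hi - lo) / panels
    f = lambda x: abs(badd_interior(x, gamma) - x)
    s = f(lo) + f(hi)
    for i in range(1, panels):
        s += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return s * h / 3.0


cog_value = 1.0 / 8.0 + math.log(2.0 / math.sqrt(5.0))  # closed form for gamma = 1
print(i3(1.0), cog_value)
for gamma in (1.0, 1.1, 1.2, 1.3, 2.0, 50.0):
    print(gamma, i3(gamma))
```

Scanning γ on a finer grid around 1.2 is a simple way to look for the minimiser of this integral.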




5 Defuzzification Fitness Comparison

There are two standard ways to compare two different defuzzification methods and
the extent to which they fulfill the asymptotic consistency criterion. It is possible to
compare the quotient of the L1 –distances obtained by putting the same antecedent
rule base sequences through two different defuzzification processes, and take the
limit for an increasing number of antecedent rules. Alternatively, one may compare
this distance of one particular defuzzification operator on the elements of a fixed
antecedent rule base sequence with a parameter that is characteristic for this sequence.
As an immediate candidate, the width of the antecedent rule base springs to mind.

Definition 6. For any rule base sequence (Ξn )n ∈ R(X), we will define the relative
fitness of the defuzzification operators D∗∗∗ and D··· as

\[
RF^{***,\cdots}[(\Xi_n)] := \limsup_{n\to\infty} \frac{d_1\big(D^{***}(\theta^*(x,\Xi_n)),\ \mathrm{id}_X\big)}{d_1\big(D^{\cdots}(\theta^*(x,\Xi_n)),\ \mathrm{id}_X\big)}
\]

It is easy to see that if both defuzzification operators D∗∗∗ and D··· satisfy the
asymptotic consistency criterion, this limit is of the form 0/0, and hence initially
indeterminate. One can calculate this value in particular cases, though.
Example 16
For instance, let us compare the MOM-defuzzification (thin line) with the COG-
defuzzification (thick line) on the rule base sequence (Θn )n mentioned in Section 3.
The graphs are sketched in Figure 4. We obtain asymptotically

\[
\begin{aligned}
RF^{COG,MOM}[(\Theta_n)_n]
&= \limsup_{n\to\infty} \frac{d_1\big(D^{COG}(\theta^*(x,\Theta_n)),\ \mathrm{id}_X\big)}{d_1\big(D^{MOM}(\theta^*(x,\Theta_n)),\ \mathrm{id}_X\big)} \\
&= \limsup_{n\to\infty} \frac{\dfrac{\frac{8}{3} - 2\sqrt{2}\, \ln|\sqrt{2}+1| + \frac{2}{3} \ln 2 + 2(n-1)\big( -\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8} \big)}{(n+1)^2}}{\dfrac{1+2n}{8(n+1)^2}} \\
&= 1 + 4 \ln \frac{4}{5} \approx 0.1074257947
\end{aligned}
\]

which indicates that, with an increasing number of controllers, eventually the COG–
defuzzification becomes about 10 times better.
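The limit above can be reproduced from the two closed forms of Theorems 9 and 12. The following sketch (function names are illustrative) evaluates the quotient for growing n and compares it with 1 + 4 ln(4/5).

```python
import math


def d_mom(n):
    """Scaled L1 distance for MOM-defuzzification (Theorem 9)."""
    return (1 + 2 * n) / (8 * (n + 1) ** 2)


def d_cog(n):
    """Scaled L1 distance for COG-defuzzification (Theorem 12)."""
    border = (8.0 / 3.0 - 2.0 * math.sqrt(2.0) * math.log(math.sqrt(2.0) + 1.0)
              + (2.0 / 3.0) * math.log(2.0))
    interior = -0.5 * math.log(5.0) + math.log(2.0) + 1.0 / 8.0
    return (border + 2 * (n - 1) * interior) / (n + 1) ** 2


limit = 1.0 + 4.0 * math.log(4.0 / 5.0)
for n in (1, 10, 100, 10_000, 10_000_000):
    print(n, d_cog(n) / d_mom(n))
print("limit:", limit)
```

The quotient decreases slowly towards the limit, because the border terms only vanish at rate 1/n.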
For an absolute measure of fitness, we suggest the following definition:

Definition 7. For any rule base sequence (Ξn )n ∈ R(X), we will define the fitness
of the defuzzification D∗∗∗ as

\[
F^{***}[(\Xi_n)_n] := \limsup_{n\to\infty} \frac{d_1\big(D^{***}(\theta^*(x,\Xi_n)),\ \mathrm{id}_X\big)}{\mathrm{width}[(\Xi_n)_n]}
\]

Fig. 4 Fitness comparison between MOM- and COG-defuzzification (horizontal axis: n)



The smaller this value is, the better the defuzzification D∗∗∗ performs as a defuzzifier for
the rule base sequence (Ξn )n . If (Ξn )n ∈ R0 (X) is a zero rule base sequence and if the
defuzzification operator D∗∗∗ satisfies the asymptotic consistency criterion on (Ξn )n ,
then this limit is obviously again of the form 0/0. Note that this fitness depends on the
antecedent rule base sequence (Ξn )n as well as on the chosen defuzzification operator D∗∗∗ ;
however, for a fixed rule base sequence, it is a means of comparing the speed with which
a defuzzification operator tends to fulfill the asymptotic consistency criterion. It is
furthermore trivial to see that any two defuzzification operators D∗∗∗ and D··· on
the same rule base sequence (Ξn )n fulfill

\[
\frac{F^{***}[(\Xi_n)_n]}{F^{\cdots}[(\Xi_n)_n]} = RF^{***,\cdots}[(\Xi_n)_n]
\]

on condition that all of these values exist.


Examples 17
I. For the MOM-defuzzification on the rule base sequence (Θn )n mentioned in
Section 3, we obtain that

\[
F^{MOM}[(\Theta_n)_n] = \limsup_{n\to\infty} \frac{\dfrac{1+2n}{8(n+1)^2}}{\dfrac{2}{n+1}} = \frac{1}{8} = 0.125
\]

II. For the COG-defuzzification on the rule base sequence (Θn )n mentioned in
Section 3, we obtain that

\[
F^{COG}[(\Theta_n)_n] = \limsup_{n\to\infty} \frac{\dfrac{\frac{8}{3} - 2\sqrt{2}\, \ln|\sqrt{2}+1| + \frac{2}{3} \ln 2 + 2(n-1)\big( -\frac{1}{2} \ln 5 + \ln 2 + \frac{1}{8} \big)}{(n+1)^2}}{\dfrac{2}{n+1}}
= \frac{1}{8} + \ln \frac{2}{\sqrt{5}} \approx 0.013428
\]

Therefore

\[
RF^{COG,MOM}[(\Theta_n)_n] = \frac{F^{COG}[(\Theta_n)_n]}{F^{MOM}[(\Theta_n)_n]} = \frac{\frac{1}{8} + \ln \frac{2}{\sqrt{5}}}{\frac{1}{8}} = 1 + 8 \ln \frac{2}{\sqrt{5}} \approx 0.107426
\]

which is in accordance with the previous results.


III. The BADD-defuzzification is a tougher nut to crack. Because writing down a
closed form for these integrals is almost impossible, certainly with the absolute
value signs, which require an investigation of the sign of the integrand, we
prefer to calculate these values by computer. The values

\[
D_n^\gamma := d_1\big(D^{BADD,\gamma}(\theta^*(x,\Theta_n)),\ \mathrm{id}_X\big) \quad \text{and} \quad E_n^\gamma := \frac{D_n^\gamma}{\frac{2}{n+1}}
\]

then still vary with different values of γ and n. Note that for all γ > 0 obviously
$D_1^\gamma = E_1^\gamma$. If one tries for instance to find an optimal γ –value for a fixed n,
this requires finding the derivative $\frac{dD_n^\gamma}{d\gamma}$, which is another good reason to seek
refuge in numerical techniques, such as Simpson’s integration rule with steps
of order 10−5 . We programmed this in C to optimize the speed, and all
calculations have been carried out with 80-bit precision. From Section 4.2 we know
that

\[
E_1^1 = D_1^1 = \frac{8}{3} + \frac{2}{3} \ln 2 - 2\sqrt{2}\, \ln\big|\sqrt{2} + 1\big| \approx 0.635864
\]

\[
E_2^1 = \frac{D_2^1}{\frac{2}{3}} = \frac{\frac{35}{108} + \frac{8}{27} \ln 2 - \frac{2}{9} \sqrt{2}\, \ln\big|\sqrt{2} + 1\big| - \frac{1}{9} \ln 5}{\frac{2}{3}} \approx 0.110453
\]
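As a small illustration of the numerical approach, composite Simpson integration can be applied to the COG curve for Ξ1 from Examples 11.I and compared with the closed-form value quoted above. This is a Python sketch with assumed function names and an assumed panel count, not the 80-bit C program mentioned in the text. The number of panels is chosen divisible by four so that the kink of the integrand at x = 1 falls on a panel boundary.

```python
import math


def cog_xi1(x):
    """D_COG for the rule base Xi_1 on [0, 2], as given in Examples 11.I."""
    den = 3.0 * (x * x - 2.0 * x - 1.0)
    if x <= 1.0:
        return (x ** 3 + 3.0 * x * x - 9.0 * x - 1.0) / den
    return (x ** 3 - 3.0 * x * x + 3.0 * x - 7.0) / den


def simpson(f, lo, hi, panels):
    """Composite Simpson rule; panels must be even."""
    if panels % 2:
        raise ValueError("panels must be even")
    h = (hi - lo) / panels
    s = f(lo) + f(hi)
    for i in range(1, panels):
        s += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return s * h / 3.0


numeric = simpson(lambda x: abs(cog_xi1(x) - x), 0.0, 2.0, 4000)
closed = (8.0 / 3.0 + (2.0 / 3.0) * math.log(2.0)
          - 2.0 * math.sqrt(2.0) * math.log(math.sqrt(2.0) + 1.0))
print(numeric, closed)
```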

We want to give the reader an impression of the results that can be obtained
with this rule. We use the simplified formula

\[
E_n^\gamma := \frac{d_1\big(D^{BADD,\gamma}(\theta^*(x,\Theta_n)),\ \mathrm{id}_X\big)}{\mathrm{width}[(\Xi_n)_n]}
= \frac{2\, \dfrac{I_1^\gamma + I_2^\gamma + (n-1) I_3^\gamma}{(n+1)^2}}{\dfrac{2}{n+1}}
= \frac{I_1^\gamma + I_2^\gamma + (n-1) I_3^\gamma}{n+1}
\]

We carried out the following numerical computations:

(a). Optimize $E_1^\gamma$ with respect to γ .
Plotting γ against $E_1^\gamma$, we find an optimum for γ ≈ 5.24476 with error
margin 10−5 . We obtain the graph given in Figure 5. Compared with the result
we obtained in [6], this is correct within the boundaries of two times the
aforementioned error margin, which could be due to the different algorithms
we used.
(b). Does $\limsup_{\gamma\to\infty} E_1^\gamma$ converge to $E_1^{MOM}$?
According to R.R. Yager and D.P. Filev in [3], it does indeed. We have verified

Fig. 5 Plotting γ against $E_1^\gamma = D_1^\gamma$

Fig. 6 $\limsup_{\gamma\to\infty} E_1^\gamma$ converges to $E_1^{MOM}$

γ     $I_1^\gamma$     $I_2^\gamma$     $I_3^\gamma$


1000 0,062192099 0,124924061 0,124826612
2000 0,062346227 0,124962009 0,124913328
3000 0,06239754 0,124974668 0,124942221
4000 0,062423179 0,124980999 0,124956664
5000 0,062438557 0,124984798 0,124965329
6000 0,062448805 0,124987332 0,124971104
7000 0,062456124 0,124989141 0,124975229
8000 0,062461612 0,124990498 0,124978322
9000 0,06246588 0,124991554 0,124980727
10000 0,062469293 0,124992398 0,12498265
INF 1/16 1/8 1/8

Fig. 7 Extension for other n

this for example for n = 1, for which the graph is given in Figure 6. Noticeably,
quite soon a relaxation to $\limsup_{\gamma\to\infty} D_1^\gamma = 0.375 = \frac{3}{8}$ occurs, which can equally
be found by considering Section 4.1.
(c). Can this result be extended such that for all other n, $\limsup_{\gamma\to\infty} E_n^\gamma$ converges to
$E_n^{MOM}$?
Absolutely. In the table in Figure 7, we see the results for n ∈ {1, 2, 3} when
γ increases with steps of 1,000. One consequence is that apparently
$\limsup_{\gamma\to\infty} I_1^\gamma = \frac{1}{16}$ and that both $\limsup_{\gamma\to\infty} I_2^\gamma$ and $\limsup_{\gamma\to\infty} I_3^\gamma$ equal $\frac{1}{8}$, so
we can find a closed formula

\[
\limsup_{\gamma\to\infty} E_n^\gamma = \limsup_{\gamma\to\infty} \frac{I_1^\gamma + I_2^\gamma + (n-1) I_3^\gamma}{n+1}
= \frac{\frac{1}{16} + \frac{1}{8} + \frac{n-1}{8}}{n+1} = \frac{2n+1}{16(n+1)}
\]
On the other hand, following Theorem 9, we find that

\[
E_n^{MOM} = \frac{D_n^{MOM}}{\frac{2}{n+1}} = \frac{\dfrac{1+2n}{8(n+1)^2}}{\dfrac{2}{n+1}} = \frac{2n+1}{16(n+1)}
\]

which confirms the correctness of the result. Obviously this remains true
when we take the limit for n → ∞.
(d). What about the fitness of DBADD,γ ?
If we fix a γ , we consequently obtain that

\[
F^{BADD,\gamma}[(\Theta_n)_n] = \limsup_{n\to\infty} E_n^\gamma = \lim_{n\to\infty} \frac{I_1^\gamma + I_2^\gamma + (n-1) I_3^\gamma}{n+1} = I_3^\gamma
\]

Therefore, a study of the values of $I_3^\gamma$ with respect to γ is required. We
already know that $\lim_{\gamma\to\infty} I_3^\gamma = \frac{1}{8}$. Investigating $I_3^\gamma$ for γ ∈ {1, 2, 3, ..., 50} and
$\gamma \in \{1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{50}\}$, we obtain the results in Table 4 in Figure 8. One would

γ     $I_3^\gamma$     k     γ = 1/k     $I_3^\gamma$
1 0,013428224 1 1 0,013428224
2 0,038246249 2 0,5 0,059122185
3 0,064139561 3 0,333333 0,078830338
4 0,078842882 4 0,25 0,089601578
5 0,08807597 5 0,2 0,096345332
6 0,094331906 6 0,166667 0,100950403
7 0,098820545 7 0,142857 0,104289498
8 0,102185503 8 0,125 0,106819211
9 0,104796014 9 0,111111 0,108800879
10 0,106877353 10 0,1 0,110394597
11 0,108574081 11 0,090909 0,111703802
12 0,109982924 12 0,083333 0,112798246
13 0,111170909 13 0,076923 0,113726648
14 0,112185881 14 0,071429 0,114524057
15 0,113062866 15 0,066667 0,11521632
16 0,11382808 16 0,0625 0,115822915
17 0,114501521 17 0,058824 0,116358795
18 0,115098701 18 0,055556 0,116835627
19 0,115631837 19 0,052632 0,117262652
20 0,116110674 20 0,05 0,117647276
21 0,116543082 21 0,047619 0,11799551
22 0,116935485 22 0,045455 0,118312281
23 0,117293175 23 0,043478 0,118601664
24 0,117620554 24 0,041667 0,118867062
25 0,117921311 25 0,04 0,119111336
26 0,118198558 26 0,038462 0,119336909
27 0,118454945 27 0,037037 0,119545849
28 0,118692737 28 0,035714 0,119739929
29 0,118913883 29 0,034483 0,119920678
30 0,119120069 30 0,033333 0,120089424
31 0,119312761 31 0,032258 0,120247324
32 0,11949324 32 0,03125 0,12039539
33 0,119662631 33 0,030303 0,120534513
34 0,119821924 34 0,029412 0,120665479
35 0,119971994 35 0,028571 0,120788984
36 0,12011362 36 0,027778 0,120905649
37 0,120247493 37 0,027027 0,121016025
38 0,120374233 38 0,026316 0,121120609
39 0,120494395 39 0,025641 0,121219844
40 0,120608477 40 0,025 0,121314129
41 0,120716929 41 0,02439 0,121403827
42 0,120820158 42 0,02381 0,121489264
43 0,120918532 43 0,023256 0,121570737
44 0,121012384 44 0,022727 0,121648515
45 0,121102021 45 0,022222 0,121722843
46 0,121187718 46 0,021739 0,121793947
47 0,121269731 47 0,021277 0,121862032
48 0,121348292 48 0,020833 0,121927285
49 0,121423613 49 0,020408 0,12198988
50 0,121495892 50 0,02 0,122049976

Fig. 8 Extensions for other γ



be tempted to consider γ = 1 as a minimum, but upon closer study, we find
the optimal value to be γ ≈ 1.2041 with error 10−4 , for which the minimum is
$I_3^\gamma$ ≈ 0.001 282 1, a fitness about 10 times better than COG-defuzzification.
Not coincidentally, we found this same optimal γ –value in [6].
(e). As a last result, we made a quite precise calculation of the γ –value for which

\[
E_n^\gamma = \frac{I_1^\gamma + I_2^\gamma + (n-1) I_3^\gamma}{n+1}
\]

is optimal for various values of n. These results can be found in Table 5 in
Figure 9.

n γopt Enγ n γopt Enγ n γopt Enγ n γopt E nγ


1 5,244771 0,074236356 51 1,205119 0,00689763 100 1,204603 0,00417352 510 1,204172 0,001853611
2 4,210323 0,077761177 52 1,205098 0,006791696 110 1,204554 0,003913052 520 1,20417 0,001842641
3 1,222680 0,074022226 53 1,205079 0,006689686 120 1,204513 0,003695633 530 1,204168 0,001832084
4 1,21796 0,059536127 54 1,20506 0,006591383 130 1,204479 0,003511405 540 1,204166 0,001821917
5 1,215106 0,049856191 55 1,205042 0,006496592 140 1,20445 0,003353307 550 1,204165 0,00181212
6 1,213219 0,042932984 56 1,205025 0,006405125 150 1,204424 0,003216147 560 1,204163 0,001802672
7 1,211882 0,037736319 57 1,205008 0,006316812 160 1,204402 0,003096025 570 1,204161 0,001793554
8 1,210885 0,033692186 58 1,204991 0,006231492 170 1,204382 0,002989952 580 1,20416 0,001784751
9 1,210114 0,030455546 59 1,204976 0,006149015 180 1,204365 0,002895599 590 1,204158 0,001776245
10 1,209499 0,027806555 60 1,204961 0,006069243 190 1,204349 0,002811125 600 1,204157 0,001768023
11 1,208998 0,025598516 61 1,204946 0,005992043 200 1,204335 0,002735056 610 1,204155 0,001760069
12 1,208581 0,023729802 62 1,204932 0,005917293 210 1,204322 0,002666197 620 1,204154 0,001752372
13 1,20823 0,022127784 63 1,204918 0,005844879 220 1,204311 0,002603569 630 1,204152 0,001744919
14 1,207929 0,020739177 64 1,204905 0,005774693 230 1,2043 0,002546364 640 1,204151 0,001737698
15 1,207669 0,019524003 65 1,204892 0,005706634 240 1,20429 0,002493905 650 1,20415 0,001730699
16 1,207441 0,018451682 66 1,204879 0,005640605 250 1,204281 0,002445627 660 1,204148 0,001723912
17 1,207241 0,017498424 67 1,204867 0,005576519 260 1,204273 0,002401048 670 1,204147 0,001717327
18 1,207063 0,016645444 68 1,204855 0,005514289 270 1,204266 0,002359758 680 1,204146 0,001710936
19 1,206904 0,015877709 69 1,204844 0,005453838 280 1,204258 0,002321407 690 1,204145 0,00170473
20 1,206761 0,015183049 70 1,204833 0,005395089 290 1,204252 0,002285692 700 1,204144 0,0016987
21 1,206632 0,014551506 71 1,204822 0,005337971 300 1,204246 0,00225235 710 1,204143 0,00169284
22 1,206514 0,013974852 72 1,204811 0,005282419 310 1,20424 0,002221153 720 1,204142 0,001687143
23 1,206407 0,013446228 73 1,204801 0,005228367 320 1,204235 0,002191898 730 1,204141 0,001681602
24 1,206309 0,012959874 74 1,204791 0,005175757 330 1,20423 0,002164412 740 1,20414 0,00167621
25 1,206219 0,012510916 75 1,204782 0,005124531 340 1,204225 0,002138537 750 1,204139 0,001670962
26 1,206136 0,012095199 76 1,204772 0,005074635 350 1,20422 0,002114137 760 1,204138 0,001665852
27 1,206059 0,011709165 77 1,204763 0,005026019 360 1,204216 0,002091089 770 1,204137 0,001660874
28 1,205987 0,011349743 78 1,204754 0,004978633 370 1,204212 0,002069283 780 1,204136 0,001656024
29 1,20592 0,011014274 79 1,204745 0,004932432 380 1,204208 0,002048621 790 1,204135 0,001651296
30 1,205858 0,01070044 80 1,204737 0,004887371 390 1,204205 0,002029017 800 1,204134 0,001646686
31 1,2058 0,010406213 81 1,204729 0,004843409 400 1,204201 0,00201039 810 1,204133 0,001642191
32 1,205746 0,010129813 82 1,20472 0,004800507 410 1,204198 0,001992669 820 1,204133 0,001637804
33 1,205695 0,009869666 83 1,204713 0,004758626 420 1,204195 0,001975791 830 1,204132 0,001633523
34 1,205647 0,00962438 84 1,204705 0,00471773 430 1,204192 0,001959696 840 1,204131 0,001629344
35 1,205601 0,009392717 85 1,204697 0,004677785 440 1,204189 0,00194433 850 1,20413 0,001625264
36 1,205559 0,009173572 86 1,20469 0,004638758 450 1,204186 0,001929646 860 1,20413 0,001621278
37 1,205518 0,008965958 87 1,204683 0,004600618 460 1,204184 0,001915599 870 1,204129 0,001617383
38 1,20548 0,008768988 88 1,204676 0,004563335 470 1,204181 0,001902149 880 1,204128 0,001613577
39 1,205443 0,008581864 89 1,204669 0,004526881 480 1,204179 0,001889257 890 1,204128 0,001609856
40 1,205409 0,008403865 90 1,204662 0,004491227 490 1,204177 0,001876891 900 1,204127 0,001606218
41 1,205376 0,008234341 91 1,204656 0,004456349 500 1,204174 0,001865018 910 1,204126 0,00160266
42 1,205345 0,008072699 92 1,204649 0,00442222 920 1,204126 0,001599179
43 1,205315 0,007918403 93 1,204643 0,004388818 930 1,204125 0,001595773
44 1,205286 0,007770962 94 1,204637 0,004356119 940 1,204124 0,001592439
45 1,205259 0,007629931 95 1,204631 0,004324101 950 1,204124 0,001589176
46 1,205233 0,0074949 96 1,204625 0,004292743 960 1,204123 0,00158598
47 1,205208 0,007365494 97 1,204619 0,004262025 970 1,204123 0,00158285
48 1,205185 0,007241368 98 1,204614 0,004231927 980 1,204122 0,001579784
49 1,205162 0,007122207 99 1,204608 0,004202431 990 1,204121 0,00157678
50 1,20514 0,007007717 100 1,204603 0,00417352 1000 1,204121 0,001573836

Fig. 9 Optimal values for γ



6 Conclusions

Although we are fully aware of the limited number of cases we investigated in this
article, we would nevertheless like to point out that a consistency criterion as
formulated in [6], or an asymptotic consistency criterion as formulated in Section 4,
turns out to be a key notion in understanding the consistency of a fuzzy controller.
When investigating a family of defuzzification operators, such as the BADD–
defuzzification operators introduced in [3], the optimal value for the adjustable
parameter γ turns out to be anything but the obvious one. We therefore think that a
much deeper investigation is needed to establish a link between the rule bases, the
defuzzification operators, their width, and the consistency of functions other than the
identity that are mapped through the fuzzy controller, not to mention the
computational complexity involved.

References

1. D. Dubois, J. Lang and H. Prade. Fuzzy sets in approximate reasoning, Part 2: Logical ap-
proaches. Fuzzy Sets and Systems 40, pp. 203–244, 1991
2. D. Dubois and H. Prade. Fuzzy sets in approximate reasoning, Part 1: Inference with possibility
distributions. Fuzzy Sets and Systems 40, pp. 143–202, 1991
3. D.P. Filev and R.R. Yager. A generalized defuzzification method via BADD distributions. In-
ternat. J. Intelligent Systems 6, pp. 687–697, 1991
4. A.N. Kolmogorov and S.V. Fomin. Measure, Lebesgue Integrals, and Hilbert Space. Acad-
emic Press, New York, 1961
5. E.E. Kerre. A comparative study of the behaviour of some popular fuzzy implication operators.
in: L.A. Zadeh and J. Kacprzyk, eds., Fuzzy Logic For The Mamagement of Uncertainty.
Wiley, New York, 1992
6. H. Lee, E. Paillet and W. Peeters A Consistency Criterion for Optimizing Defuzzification in
Fuzzy Control. In: Foundations of Generic Optimization Vol II: Applications of Fuzzy Control,
Genetic Algorithms and Neural Networks, Editors R. Lowen and A. Verschoren. Mathematical
Modelling: Theory and Applications, Springer Verlag, 2007
7. R. Lowen. Fuzzy Set Theory: Basic Concepts, Techniques and Bibliography. Kluwer Acad-
emic Dordrecht, 1996
8. E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic con-
troller. Int. Journal of Man–Machine Studies 7, pp. 1–13, 1975
9. A.M. Norwich and I.B. Turksen. A model for the measurement of membership and the conse-
quences of its empirical implementation. Fuzzy Sets and Systems 12, pp. 1–25, 1985
10. D. Ruan, E.E. Kerre, G. De Cooman, B. Cappelle and F. Vanmassenhove. Influence of the fuzzy
implication operator on the method-of-cases inference rule. Internat. J. Approx. Reasoning, 4,
pp. 307–318, 1990
11. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies — towards a the-
ory of rational defuzzification operators. Second IEEE International Conference on Fuzzy
Systems, San Francisco, pp. 1161–1166, 1994
12. T.A. Runkler and M. Glesner. A set of axioms for defuzzification strategies — towards a the-
ory of rational defuzzification operators. Second IEEE International Conference on Fuzzy
Systems, San Francisco, pp. 1161–1166, 1993
13. M. Sugeno. An introductory survey of fuzzy control. Inform. Sci 36, pp. 59–83, 1985
14. W. Van Leekwijck and E.E. Kerre. Defuzzification: criteria and classification. Fuzzy Sets and
Systems 108, pp. 159–178, 1999
15. R.R. Yager and D.P. Filev. SLIDE: A simple adaptive defuzzification method. IEEE Trans.
Fuzzy Systems 1(1), pp. 69–78, 1993
16. L.A. Zadeh. Fuzzy sets. Inform. Control 8, pp. 338–353, 1965
17. L.A. Zadeh. Outline of a new approach to the analysis of complex systems and decision
processes. IEEE Trans. Syst. Man. Cybernet., 3, pp. 28–44, 1973
18. H.J. Zimmermann. Fuzzy Set Theory And Its Applications. Kluwer Academic,
Boston/Dordrecht/London, 1996
MATHEMATICAL MODELLING:
Theory and Applications

1. M. Křížek and P. Neittaanmäki: Mathematical and Numerical Modelling in
Electrical Engineering. Theory and Applications. 1996 ISBN 0-7923-4249-6
2. M.A. van Wyk and W.-H. Steeb: Chaos in Electronics. 1997
ISBN 0-7923-4576-2
3. A. Halanay and J. Samuel: Differential Equations, Discrete Systems and
Control. Economic Models. 1997 ISBN 0-7923-4675-0
4. N. Meskens and M. Roubens (eds.): Advances in Decision Analysis. 1999
ISBN 0-7923-5563-6
5. R.J.M.M. Does, K.C.B. Roes and A. Trip: Statistical Process Control in
Industry. Implementation and Assurance of SPC. 1999 ISBN 0-7923-5570-9
6. J. Caldwell and Y.M. Ram: Mathematical Modelling. Concepts and Case Studies.
1999 ISBN 0-7923-5820-1
7.1. R. Haber and L. Keviczky: Nonlinear System Identification - Input-Output
Modeling Approach. Volume 1: Nonlinear System Parameter Identification.
1999 ISBN 0-7923-5856-2; ISBN 0-7923-5858-9 Set
7.2. R. Haber and L. Keviczky: Nonlinear System Identification - Input-Output
Modeling Approach. Volume 2: Nonlinear System Structure Identification.
1999 ISBN 0-7923-5857-0; ISBN 0-7923-5858-9 Set
8. M.C. Bustos, F. Concha, R. Bürger and E.M. Tory: Sedimentation and
Thickening. Phenomenological Foundation and Mathematical Theory. 1999
ISBN 0-7923-5960-7
9. A.P. Wierzbicki, M. Makowski and J. Wessels (eds.): Model-Based Decision
Support Methodology with Environmental Applications. 2000 ISBN 0-7923-6327-2
10. C. Rocşoreanu, A. Georgescu and N. Giurgiţeanu: The FitzHugh-Nagumo Model.
Bifurcation and Dynamics. 2000 ISBN 0-7923-6427-9
11. S. Aniţa: Analysis and Control of Age-Dependent Population Dynamics. 2000
ISBN 0-7923-6639-5
12. S. Dominich: Mathematical Foundations of Information Retrieval. 2001
ISBN 0-7923-6861-4
13. H.A.K. Mastebroek and J.E. Vos (eds.): Plausible Neural Networks for Biological
Modelling. 2001 ISBN 0-7923-7192-5
14. A.K. Gupta and T. Varga: An Introduction to Actuarial Mathematics. 2002
ISBN 1-4020-0460-5
15. H. Sedaghat: Nonlinear Difference Equations. Theory with Applications to
Social Science Models. 2003 ISBN 1-4020-1116-4
16. A. Slavova: Cellular Neural Networks: Dynamics and Modelling. 2003
ISBN 1-4020-1192-X
17. J.L. Bueso, J. Gómez-Torrecillas and A. Verschoren: Algorithmic Methods in
Non-Commutative Algebra. Applications to Quantum Groups. 2003
ISBN 1-4020-1402-3
18. A. Swishchuk and J. Wu: Evolution of Biological Systems in Random Media:
Limit Theorems and Stability. 2003 ISBN 1-4020-1554-2
19. K. van Montfort, J. Oud and A. Satorra (eds.): Recent Developments on
Structural Equation Models. Theory and Applications. 2004 ISBN 1-4020-1957-2
20. M. Iglesias, B. Naudts, A. Verschoren and C. Vidal: Foundations of Generic
Optimization. Volume 1: A Combinatorial Approach to Epistasis. 2005
ISBN 1-4020-3666-3
21. G. Marinoschi: Functional Approach to Nonlinear Models of Water Flow in Soils.
2006 ISBN 978-1-4020-4879-1
22. E. Allen: Modeling with Itô Stochastic Differential Equations. 2007
ISBN 978-1-4020-5952-0
23. Not yet published
24. R. Lowen and A. Verschoren (eds.): Foundations of Generic Optimization. Volume 2:
Applications of Fuzzy Control, Genetic Algorithms and Neural Networks. 2008
ISBN 978-1-4020-6667-2

www.springer.com
