A Course in Fuzzy Systems and Control - Part 1
Li-Xin Wang
Contents
Preface 1 Introduction 1.1 Why Fuzzy Systems? 1.2 What Are Fuzzy Systems? 1.3 Where Are Fuzzy Systems Used and How? 1.3.1 Fuzzy Washing Machines 1.3.2 Digital Image Stabilizer 1.3.3 Fuzzy Systems in Cars 1.3.4 Fuzzy Control of a Cement Kiln 1.3.5 Fuzzy Control of Subway Train 1.4 What Are the Major Research Fields in Fuzzy Theory? 1.5 A Brief History of Fuzzy Theory and Applications 1.5.1 The 1960s: The Beginning of Fuzzy Theory 1.5.2 The 1970s: Theory Continued to Grow and Real Applications Appeared 1.5.3 The 1980s: Massive Applications Made a Difference 1.5.4 The 1990s: More Challenges Remain 1.6 Summary and Further Readings 1.7 Exercises
I
2
2.3 Operations on Fuzzy Sets 2.4 Summary and Further Readings 2.5 Exercises
3 Further Operations on Fuzzy Sets 3.1 Fuzzy Complement 3.2 Fuzzy Union-The S-Norms 3.3 Fuzzy Intersection-The T-Norms 3.4 Averaging Operators 3.5 Summary and Further Readings 3.6 Exercises
4 Fuzzy Relations and the Extension Principle 4.1 From Classical Relations to Fuzzy Relations 4.1.1 Relations 4.1.2 Projections and Cylindric Extensions 4.2 Compositions of Fuzzy Relations 4.3 The Extension Principle 4.4 Summary and Further Readings 4.5 Exercises
5 Linguistic Variables and Fuzzy IF-THEN Rules 5.1 From Numerical Variables to Linguistic Variables 5.2 Linguistic Hedges 5.3 Fuzzy IF-THEN Rules 5.3.1 Fuzzy Propositions 5.3.2 Interpretations of Fuzzy IF-THEN Rules 5.4 Summary and Further Readings 5.5 Exercises
6 Fuzzy Logic and Approximate Reasoning 6.1 From Classical Logic to Fuzzy Logic 6.1.1 Short Primer on Classical Logic 6.1.2 Basic Principles in Fuzzy Logic 6.2 The Compositional Rule of Inference 6.3 Properties of the Implication Rules 6.3.1 Generalized Modus Ponens 6.3.2 Generalized Modus Tollens
6.3.3 Generalized Hypothetical Syllogism 6.4 Summary and Further Readings 6.5 Exercises
II
7 Fuzzy Rule Base and Fuzzy Inference Engine 7.1 Fuzzy Rule Base 7.1.1 Structure of Fuzzy Rule Base 7.1.2 Properties of Set of Rules 7.2 Fuzzy Inference Engine 7.2.1 Composition Based Inference 7.2.2 Individual-Rule Based Inference 7.2.3 The Details of Some Inference Engines 7.3 Summary and Further Readings 7.4 Exercises
8 Fuzzifiers and Defuzzifiers 8.1 Fuzzifiers 8.2 Defuzzifiers 8.2.1 Center of Gravity Defuzzifier 8.2.2 Center Average Defuzzifier 8.2.3 Maximum Defuzzifier 8.2.4 Comparison of the Defuzzifiers 8.3 Summary and Further Readings 8.4 Exercises
9 Fuzzy Systems as Nonlinear Mappings 9.1 The Formulas of Some Classes of Fuzzy Systems 9.1.1 Fuzzy Systems with Center Average Defuzzifier 9.1.2 Fuzzy Systems with Maximum Defuzzifier 9.2 Fuzzy Systems As Universal Approximators 9.3 Summary and Further Readings 9.4 Exercises
10 Approximation Properties of Fuzzy Systems I 10.1 Preliminary Concepts 10.2 Design of Fuzzy System
10.3 Approximation Accuracy of the Fuzzy System 10.4 Summary and Further Readings 10.5 Exercises
11 Approximation Properties of Fuzzy Systems II 11.1 Fuzzy Systems with Second-Order Approximation Accuracy 11.2 Approximation Accuracy of Fuzzy Systems with Maximum Defuzzifier 11.3 Summary and Further Readings 11.4 Exercises
III
12 Design of Fuzzy Systems Using A Table Look-Up Scheme 12.1 A Table Look-Up Scheme for Designing Fuzzy Systems from Input-Output Pairs 12.2 Application to Truck Backer-Upper Control 12.3 Application to Time Series Prediction 12.4 Summary and Further Readings 12.5 Exercises and Projects
13 Design of Fuzzy Systems Using Gradient Descent Training 13.1 Choosing the Structure of Fuzzy Systems 13.2 Designing the Parameters by Gradient Descent 13.3 Application to Nonlinear Dynamic System Identification 13.3.1 Design of the Identifier 13.3.2 Initial Parameter Choosing 13.3.3 Simulations 13.4 Summary and Further Readings 13.5 Exercises and Projects
14 Design of Fuzzy Systems Using Recursive Least Squares 14.1 Design of the Fuzzy System 14.2 Derivation of the Recursive Least Squares Algorithm 14.3 Application to Equalization of Nonlinear Communication Channels 14.3.1 The Equalization Problem and Its Geometric Formulation 14.3.2 Application of the Fuzzy System to the Equalization Problem 14.4 Summary and Further Readings 14.5 Exercises and Projects
15 Design of Fuzzy Systems Using Clustering 15.1 An Optimal Fuzzy System 15.2 Design of Fuzzy Systems By Clustering 15.3 Application to Adaptive Control of Nonlinear Systems 15.4 Summary and Further Readings 15.5 Exercises and Projects
IV
17 Fuzzy Control of Linear Systems I: Stable Controllers 17.1 Stable Fuzzy Control of Single-Input-Single-Output Systems 17.1.1 Exponential Stability of Fuzzy Control Systems 17.1.2 Input-Output Stability of Fuzzy Control Systems 17.2 Stable Fuzzy Control of Multi-Input-Multi-Output Systems 17.2.1 Exponential Stability 17.2.2 Input-Output Stability 17.3 Summary and Further Readings 17.4 Exercises
18 Fuzzy Control of Linear Systems II: Optimal and Robust Controllers 18.1 Optimal Fuzzy Control 18.1.1 The Pontryagin Minimum Principle 18.1.2 Design of Optimal Fuzzy Controller 18.1.3 Application to the Ball-and-Beam System
18.2 Robust Fuzzy Control 18.3 Summary and Further Readings 18.4 Exercises
19 Fuzzy Control of Nonlinear Systems I: Sliding Control 19.1 Fuzzy Control As Sliding Control: Analysis 19.1.1 Basic Principles of Sliding Control 19.1.2 Analysis of Fuzzy Controllers Based on Sliding Control Principle 19.2 Fuzzy Control As Sliding Control: Design 19.2.1 Continuous Approximation of Sliding Control Law 19.2.2 Design of Fuzzy Controller Based on the Smooth Sliding Control Law 19.3 Summary and Further Readings 19.4 Exercises
20 Fuzzy Control of Nonlinear Systems II: Supervisory Control 20.1 Multi-level Control Involving Fuzzy Systems 20.2 Stable Fuzzy Control Using Nonfuzzy Supervisor 20.2.1 Design of the Supervisory Controller 20.2.2 Application to Inverted Pendulum Balancing 20.3 Gain Scheduling of PID Controller Using Fuzzy Systems 20.3.1 The PID Controller 20.3.2 A Fuzzy System for Tuning the PID Gains 20.4 Summary and Further Readings 20.5 Exercises
21 Fuzzy Control of Fuzzy System Models 21.1 The Takagi-Sugeno-Kang Fuzzy System 21.2 Closed-Loop Dynamics of Fuzzy Model with Fuzzy Controller 21.3 Stability Analysis of the Dynamic TSK Fuzzy System 21.4 Design of Stable Fuzzy Controllers for the Fuzzy Model 21.5 Summary and Further Readings 21.6 Exercises
22 Qualitative Analysis of Fuzzy Control and Hierarchical Fuzzy Systems 22.1 Phase Plane Analysis of Fuzzy Control Systems 22.2 Robustness Indices for Stability 22.2.1 The One-Dimensional Case
22.2.2 The n-Dimensional Case 22.3 Hierarchical Fuzzy Control 22.3.1 The Curse of Dimensionality 22.3.2 Construction of the Hierarchical Fuzzy System 22.3.3 Properties of the Hierarchical Fuzzy System 22.4 Summary and Further Readings 22.5 Exercises
23 Basic Adaptive Fuzzy Controllers I 23.1 Classification of Adaptive Fuzzy Controllers 23.2 Design of the Indirect Adaptive Fuzzy Controller 23.2.1 Problem Specification 23.2.2 Design of the Fuzzy Controller 23.2.3 Design of Adaptation Law 23.3 Application to Inverted Pendulum Tracking Control 23.4 Summary and Further Readings 23.5 Exercises
24 Basic Adaptive Fuzzy Controllers II 24.1 Design of the Direct Adaptive Fuzzy Controller 24.1.1 Problem Specification 24.1.2 Design of the Fuzzy Controller 24.1.3 Design of Adaptation Law 24.1.4 Simulations 24.2 Design of the Combined Direct/Indirect Adaptive Fuzzy Controller 24.2.1 Problem Specification 24.2.2 Design of the Fuzzy Controller 24.2.3 Design of Adaptation Law 24.2.4 Convergence Analysis 24.3 Summary and Further Readings 24.4 Exercises
25 Advanced Adaptive Fuzzy Controllers I 25.1 State Boundedness By Supervisory Control 25.1.1 For Indirect Adaptive Fuzzy Control System
25.1.2 For Direct Adaptive Fuzzy Control System 25.2 Parameter Boundedness By Projection 25.2.1 For Indirect Adaptive Fuzzy Control System 25.2.2 For Direct Adaptive Fuzzy Control System 25.3 Stable Direct Adaptive Fuzzy Control System 25.3.1 Stability and Convergence Analysis 25.3.2 Simulations 25.4 Summary and Further Readings 25.5 Exercises
26 Advanced Adaptive Fuzzy Controllers II 26.1 Stable Indirect Adaptive Fuzzy Control System 26.1.1 Stability and Convergence Analysis 26.1.2 Nonlinear Parameterization 26.2 Adaptive Fuzzy Control of General Nonlinear Systems 26.2.1 Intuitive Concepts of Input-Output Linearization 26.2.2 Design of Adaptive Fuzzy Controllers Based on Input-Output Linearization 26.2.3 Application to the Ball-and-Beam System 26.3 Summary and Further Readings 26.4 Exercises
VI
Miscellaneous Topics
27 The Fuzzy C-Means Algorithm 27.1 Why Fuzzy Models for Pattern Recognition? 27.2 Hard and Fuzzy c-Partitions 27.3 Hard and Fuzzy c-Means Algorithms 27.3.1 Objective Function Clustering and Hard c-Means Algorithm 27.3.2 The Fuzzy c-Means Algorithm 27.4 Convergence of the Fuzzy c-Means Algorithm 27.5 Summary and Further Readings 27.6 Exercises 28 Fuzzy Relation Equations 28.1 Introduction 28.2 Solving the Fuzzy Relation Equations 28.3 Solvability Indices of the Fuzzy Relation Equations
28.3.1 Equality Indices of Two Fuzzy Sets 28.3.2 The Solvability Indices 28.4 Approximate Solution-A Neural Network Approach 28.5 Summary and Further Readings 28.6 Exercises
29 Fuzzy Arithmetic 29.1 Fuzzy Numbers and the Decomposition Theorem 29.2 Addition and Subtraction of Fuzzy Numbers 29.2.1 The a-Cut Method 29.2.2 The Extension Principle Method 29.3 Multiplication and Division of Fuzzy Numbers 29.3.1 The a-Cut Method 29.3.2 The Extension Principle Method 29.4 Fuzzy Equations 29.5 Fuzzy Ranking 29.6 Summary and Further Readings 29.7 Exercises
30 Fuzzy Linear Programming 30.1 Classification of Fuzzy Linear Programming Problems 30.2 Linear Programming with Fuzzy Resources 30.3 Linear Programming with Fuzzy Objective Coefficients 30.4 Linear Programming with Fuzzy Constraint Coefficients 30.5 Comparison of Stochastic and Fuzzy Linear Programming 30.6 Summary and Further Readings 30.7 Exercises 31 Possibility Theory 31.1 Introduction 31.2 The Intuitive Approach to Possibility 31.2.1 Possibility Distributions and Possibility Measures 31.2.2 Marginal Possibility Distribution and Noninteractiveness 31.2.3 Conditional Possibility Distribution 31.3 The Axiomatic Approach to Possibility 31.3.1 Plausibility and Belief Measures 31.3.2 Possibility and Necessity Measures 31.4 Possibility versus Probability
31.4.1 The Endless Debate 31.4.2 Major Differences between the Two Theories 31.4.3 How to View the Debate from an Engineer's Perspective 31.5 Summary and Further Readings 31.6 Exercises
Bibliography
Index
Preface
The field of fuzzy systems and control has been making rapid progress in recent years. Motivated by the practical success of fuzzy control in consumer products and industrial process control, there has been an increasing amount of work on the rigorous theoretical studies of fuzzy systems and fuzzy control. Researchers are trying to explain why the practical results are good, systematize the existing approaches, and develop more powerful ones. As a result of these efforts, the whole picture of fuzzy systems and fuzzy control theory is becoming clearer. Although there are many books on fuzzy theory, most of them are either research monographs that concentrate on special topics, or collections of papers, or books on fuzzy mathematics. We desperately need a real textbook on fuzzy systems and control that provides the skeleton of the field and summarizes the fundamentals. This book, which is based on a course developed at the Hong Kong University of Science and Technology, is intended as a textbook for graduate and senior students, and as a self-study book for practicing engineers. When writing this book, we required that it be:
Well-Structured: This book is not intended as a collection of existing results on fuzzy systems and fuzzy control; rather, we first establish the structure that a reasonable theory of fuzzy systems and fuzzy control should follow, and then fill in the details. For example, when studying fuzzy control systems, we should consider the stability, optimality, and robustness of the systems, and classify the approaches according to whether the plant is linear, nonlinear, or modeled by fuzzy systems. Fortunately, the major existing results fit very well into this structure and therefore are covered in detail in this book. Because the field is not mature, as compared with other mainstream fields, there are holes in the structure for which no results exist. For these topics, we either provide our preliminary approaches, or point out that the problems are open.
Clear and Precise: Clear and logical presentation is crucial for any book, especially for a book associated with the word "fuzzy." Fuzzy theory itself is precise; the "fuzziness" appears in the phenomena that fuzzy theory tries
to study. Once a fuzzy description (for example, "hot day") is formulated in terms of fuzzy theory, nothing will be fuzzy anymore. We pay special attention to the use of precise language to introduce the concepts, to develop the approaches, and to justify the conclusions.
Practical: We recall that the driving force for fuzzy systems and control is practical applications. Most approaches in this book are tested on problems that have practical significance. In fact, a main objective of the book is to teach students and practicing engineers how to use the fuzzy systems approach to solve engineering problems in control, signal processing, and communications.
Rich and Rigorous: This book should be intellectually challenging for students. In addition to the emphasis on practicality, many theoretical results are given (which, of course, have practical relevance and importance). All the theorems and lemmas are proven in a mathematically rigorous fashion, and an average student may need some effort to comprehend the details.
Easy to Use as Textbook: To facilitate its use as a textbook, this book is written in such a style that each chapter is designed for a one and one-half hour lecture. Sometimes, three chapters may be covered by two lectures, or vice versa, depending upon the emphasis of the instructor and the background of the students. Each chapter contains some exercises and mini-projects that form an integrated part of the text.
The book is divided into six parts. Part I (Chapters 2-6) introduces the fundamental concepts and principles in the general field of fuzzy theory that are particularly useful in fuzzy systems and fuzzy control. Part II (Chapters 7-11) studies the fuzzy systems in detail. The operations inside the fuzzy systems are carefully analyzed and certain properties of the fuzzy systems (for example, approximation capability and accuracy) are studied. Part III (Chapters 12-15) introduces four methods for designing fuzzy systems from sensory measurements, and all these methods are tested on a number of control, signal processing, or communication problems. Part IV (Chapters 16-22) and Part V (Chapters 23-26) concentrate on fuzzy control, where Part IV studies nonadaptive fuzzy control and Part V studies adaptive fuzzy control. Finally, Part VI (Chapters 27-31) reviews a number of topics that are not included in the main structure of the book, but are important and strongly relevant to fuzzy systems and fuzzy control.
The book can be studied in many ways, according to the particular interests of the instructor or the reader. Chapters 1-15 cover the general materials that can be applied to a variety of engineering problems. Chapters 16-26 are more specialized in control problems. If the course is not intended as a control course, then some materials in Chapters 16-26 may be omitted, and the time saved may be used for a more detailed coverage of Chapters 1-15 and 27-31. On the other hand, if it is a control course, then Chapters 16-26 should be studied in detail. The book also can be used, together with a book on neural networks, for a course on neural networks and fuzzy systems. In this case, Chapters 1-15 and selected topics from Chapters 16-31 may be used for the fuzzy system half of the course. If a practicing engineer wants to learn fuzzy systems and fuzzy control quickly, then the proofs of the theorems and lemmas may be skipped.
This book has benefited from the review of many colleagues, students, and friends. First of all, I would like to thank my advisors, Lotfi Zadeh and Jerry Mendel, for their continued encouragement. I would like to thank Karl Åström for sending his student, Mikael Johansson, to help me prepare the manuscript during the summer of 1995. Discussions with Kevin Passino, Frank Lewis, Jyh-Shing Jang, Hua Wang, Hideyuki Takagi, and other researchers in fuzzy theory have helped the organization of the materials. The book also benefited from the input of the students who took the course at HKUST. Support for the author from the Hong Kong Research Grants Council was greatly appreciated. Finally, I would like to express my gratitude to my department at HKUST for providing the excellent research and teaching environment. Especially, I would like to thank my colleagues Xiren Cao, Zexiang Li, Li Qiu, Erwei Bai, Justin Chuang, Philip Chan, and Kwan-Fai Cheung for their collaboration and critical remarks on various topics in fuzzy theory.
Li-Xin Wang
The Hong Kong University of Science and Technology
Chapter 1
Introduction
engineering. As a general principle, a good engineering theory should be capable of making use of all available information effectively. For many practical systems, important information comes from two sources: one source is human experts who describe their knowledge about the system in natural languages; the other is sensory measurements and mathematical models that are derived according to physical laws. An important task, therefore, is to combine these two types of information into system designs. To achieve this combination, a key question is how to formulate human knowledge into a similar framework used to formulate sensory measurements and mathematical models. In other words, the key question is how to transform a human knowledge base into a mathematical formula. Essentially, what a fuzzy system does is to perform this transformation. In order to understand how this transformation is done, we must first know what fuzzy systems are.
1.2 What Are Fuzzy Systems?
Fuzzy systems are knowledge-based or rule-based systems. The heart of a fuzzy system is a knowledge base consisting of the so-called fuzzy IF-THEN rules. A fuzzy IF-THEN rule is an IF-THEN statement in which some words are characterized by continuous membership functions. For example, the following is a fuzzy IF-THEN rule:
IF the speed of a car is high, THEN apply less force to the accelerator    (1.1)
where the words "high" and "less" are characterized by the membership functions shown in Figs. 1.1 and 1.2, respectively.¹ A fuzzy system is constructed from a collection of fuzzy IF-THEN rules. Let us consider two examples.
Example 1.1. Suppose we want to design a controller to automatically control the speed of a car. Conceptually, there are two approaches to designing such a controller: the first approach is to use conventional control theory, for example, designing a PID controller; the second approach is to emulate human drivers, that is, converting the rules used by human drivers into an automatic controller. We now consider the second approach. Roughly speaking, human drivers use the following three types of rules to drive a car in normal situations:
IF speed is low, THEN apply more force to the accelerator    (1.2)
IF speed is medium, THEN apply normal force to the accelerator    (1.3)
IF speed is high, THEN apply less force to the accelerator    (1.4)
where the words "low," "more," "medium," "normal," "high," and "less" are characterized by membership functions similar to those in Figs. 1.1-1.2. Of course, more rules are needed in real situations. We can construct a fuzzy system based on these
¹ A detailed definition and analysis of membership functions will be given in Chapter 2. At this point, an intuitive understanding of the membership functions in Figs. 1.1 and 1.2 is sufficient.
Figure 1.1. Membership function for "high," where the horizontal axis represents the speed of the car (mph) and the vertical axis represents the membership value for "high."
rules. Because the fuzzy system is used as a controller, it also is called a fuzzy controller.
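The three rules of Example 1.1 can be made concrete with a small sketch. The piecewise-linear shapes and the breakpoints (35, 55, and 75 mph) are assumptions chosen for illustration, not values read from Figs. 1.1 and 1.2, and the function names are hypothetical. The code only evaluates how strongly the IF part of each rule holds at a given speed; combining the rules into one output is discussed in the rest of this section.

```python
def ramp_up(x, lo, hi):
    """Membership that is 0 below lo, 1 above hi, and linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def triangle(x, left, peak, right):
    """Membership that rises from left to peak and falls from peak to right."""
    return min(ramp_up(x, left, peak), 1.0 - ramp_up(x, peak, right))


# Hypothetical membership functions for the speed of the car (mph).
def low(speed):
    return 1.0 - ramp_up(speed, 35.0, 55.0)


def medium(speed):
    return triangle(speed, 35.0, 55.0, 75.0)


def high(speed):
    return ramp_up(speed, 55.0, 75.0)


def rule_strengths(speed):
    """Degree to which the IF part of each rule holds, keyed by the THEN action."""
    return {
        "more force": low(speed),      # IF speed is low
        "normal force": medium(speed), # IF speed is medium
        "less force": high(speed),     # IF speed is high
    }


print(rule_strengths(65.0))
```

At a speed between the assumed "medium" and "high" breakpoints, two rules hold to a partial degree at once, which is what distinguishes these rules from crisp IF-THEN statements.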
Example 1.2. In Example 1.1, the rules are control instructions, that is, they represent what a human driver does in typical situations. Another type of human knowledge is descriptions about the system. Suppose a person pumping up a balloon wishes to know how much air he can add before it bursts; the relationship among some key variables would then be very useful. With the balloon there are three key variables: the air inside the balloon, the amount it increases, and the surface tension. We can describe the relationship among these variables in the following fuzzy IF-THEN rules:
IF the amount of air is small and it is increased slightly, THEN the surface tension will increase slightly    (1.5)
IF the amount of air is small and it is increased substantially, THEN the surface tension will increase substantially    (1.6)
IF the amount of air is large and it is increased slightly, THEN the surface tension will increase moderately    (1.7)
IF the amount of air is large and it is increased substantially, THEN the surface tension will increase very substantially    (1.8)
where the words "small," "slightly," "substantially," etc., are characterized by membership functions similar to those in Figs. 1.1 and 1.2. Combining these rules into a fuzzy system, we obtain a model for the balloon.
Figure 1.2. Membership function for "less," where the horizontal axis represents the force applied to the accelerator and the vertical axis represents the membership value for "less."
In summary, the starting point of constructing a fuzzy system is to obtain a collection of fuzzy IF-THEN rules from human experts or based on domain knowledge. The next step is to combine these rules into a single system. Different fuzzy systems use different principles for this combination. So the question is: what are the commonly used fuzzy systems?
There are three types of fuzzy systems that are commonly used in the literature: (i) pure fuzzy systems, (ii) Takagi-Sugeno-Kang (TSK) fuzzy systems, and (iii) fuzzy systems with fuzzifier and defuzzifier. We now briefly describe these three types of fuzzy systems.
The basic configuration of a pure fuzzy system is shown in Fig. 1.3. The fuzzy rule base represents the collection of fuzzy IF-THEN rules. For example, for the car controller in Example 1.1, the fuzzy rule base consists of the three rules (1.2)-(1.4), and for the balloon model of Example 1.2, the fuzzy rule base consists of the four rules (1.5)-(1.8). The fuzzy inference engine combines these fuzzy IF-THEN rules into a mapping from fuzzy sets² in the input space U ⊂ R^n to fuzzy sets in the output space V ⊂ R based on fuzzy logic principles. If the dashed feedback line in Fig. 1.3 exists, the system becomes the so-called fuzzy dynamic system. The main problem with the pure fuzzy system is that its inputs and outputs are
² The precise definition of fuzzy set is given in Chapter 2. At this point, it is sufficient to view a fuzzy set as a word like, for example, "high," which is characterized by the membership function shown in Fig. 1.1.
Figure 1.3. Basic configuration of pure fuzzy systems.
fuzzy sets (that is, words in natural languages), whereas in engineering systems the inputs and outputs are real-valued variables. To solve this problem, Takagi, Sugeno, and Kang (Takagi and Sugeno [1985] and Sugeno and Kang [1988]) proposed another fuzzy system whose inputs and outputs are real-valued variables. Instead of considering the fuzzy IF-THEN rules in the form of (1.1), the Takagi-Sugeno-Kang (TSK) system uses rules in the following form:
IF the speed x of a car is high, THEN the force to the accelerator is y = cx    (1.9)
where the word "high" has the same meaning as in (1.1), and c is a constant. Comparing (1.9) and (1.1) we see that the THEN part of the rule changes from a description using words in natural languages into a simple mathematical formula. This change makes it easier to combine the rules. In fact, the Takagi-Sugeno-Kang fuzzy system is a weighted average of the values in the THEN parts of the rules. The basic configuration of the Takagi-Sugeno-Kang fuzzy system is shown in Fig. 1.4. The main problems with the Takagi-Sugeno-Kang fuzzy system are: (i) its THEN part is a mathematical formula and therefore may not provide a natural framework to represent human knowledge, and (ii) there is not much freedom left to apply different principles in fuzzy logic, so that the versatility of fuzzy systems is not well represented in this framework. To solve these problems, we use the third type of fuzzy system: fuzzy systems with fuzzifier and defuzzifier.
Figure 1.4. Basic configuration of Takagi-Sugeno-Kang fuzzy system (blocks: x in U, Fuzzy Rule Base, Weighted Average, y in V).
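The weighted average performed by the Takagi-Sugeno-Kang system can be sketched in a few lines. This is only an illustration under stated assumptions: each rule is assumed to have a THEN part of the form y = cx with a constant c, the weight of a rule is taken to be the membership degree of its IF part (the usual choice, not spelled out in the text above), and the membership breakpoints and the constants 1.0, 0.6, and 0.2 are hypothetical.

```python
def ramp_up(x, lo, hi):
    """Membership that is 0 below lo, 1 above hi, and linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def triangle(x, left, peak, right):
    """Membership that rises from left to peak and falls from peak to right."""
    return min(ramp_up(x, left, peak), 1.0 - ramp_up(x, peak, right))


# Each rule: (membership function of the IF part, constant c in THEN y = c * x).
# All numbers are hypothetical.
RULES = [
    (lambda x: 1.0 - ramp_up(x, 35.0, 55.0), 1.0),   # IF speed is low
    (lambda x: triangle(x, 35.0, 55.0, 75.0), 0.6),  # IF speed is medium
    (lambda x: ramp_up(x, 55.0, 75.0), 0.2),         # IF speed is high
]


def tsk_output(x, rules=RULES):
    """Weighted average of the THEN-part values, weighted by IF-part memberships."""
    weights = [mu(x) for mu, _ in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule applies to this input")
    return sum(w * c * x for w, (_, c) in zip(weights, rules)) / total


print(tsk_output(65.0))
```

Both the input and the output are real numbers, which is the property that motivated this type of system; the price is that the THEN parts are formulas rather than words.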
In order to use pure fuzzy systems in engineering systems, a simple method is to add a fuzzifier, which transforms a real-valued variable into a fuzzy set, to the input, and a defuzzifier, which transforms a fuzzy set into a real-valued variable, to the output. The result is the fuzzy system with fuzzifier and defuzzifier, shown in Fig. 1.5. This fuzzy system overcomes the disadvantages of the pure fuzzy systems and the Takagi-Sugeno-Kang fuzzy systems. Unless otherwise specified, from now on when we refer to fuzzy systems we mean fuzzy systems with fuzzifier and defuzzifier.
To conclude this section, we would like to emphasize a distinguishing feature of fuzzy systems: on one hand, fuzzy systems are multi-input-single-output mappings from a real-valued vector to a real-valued scalar (a multi-output mapping can be decomposed into a collection of single-output mappings), and the precise mathematical formulas of these mappings can be obtained (see Chapter 9 for details); on the other hand, fuzzy systems are knowledge-based systems constructed from human knowledge in the form of fuzzy IF-THEN rules. An important contribution of fuzzy systems theory is that it provides a systematic procedure for transforming a knowledge base into a nonlinear mapping. Because of this transformation, we are able to use knowledge-based systems (fuzzy systems) in engineering applications (control, signal processing, or communications systems, etc.) in the same manner as we use mathematical models and sensory measurements. Consequently, the analysis and design of the resulting combined systems can be performed in a mathematically rigorous fashion. The goal of this text is to show how this transformation is done, and how the analysis and design are performed.
Figure 1.5. Basic configuration of fuzzy systems with fuzzifier and defuzzifier (blocks: x in U, Fuzzifier, fuzzy sets in U, Fuzzy Inference Engine, fuzzy sets in V, Defuzzifier, y in V).
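The chain from a real-valued input to a real-valued output can be sketched for the car controller of Example 1.1. The choices below are assumptions made for illustration and are only one of many possible designs: a singleton fuzzifier, clipping each THEN set by the strength of its rule (minimum) and merging the results by maximum, and a center of gravity defuzzifier evaluated on a grid. The membership breakpoints and the normalized force range from 0 to 1 are hypothetical.

```python
def ramp_up(x, lo, hi):
    """Membership that is 0 below lo, 1 above hi, and linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def triangle(x, left, peak, right):
    """Membership that rises from left to peak and falls from peak to right."""
    return min(ramp_up(x, left, peak), 1.0 - ramp_up(x, peak, right))


# Hypothetical fuzzy sets over the input space U (speed in mph).
SPEED_SETS = {
    "low": lambda x: 1.0 - ramp_up(x, 35.0, 55.0),
    "medium": lambda x: triangle(x, 35.0, 55.0, 75.0),
    "high": lambda x: ramp_up(x, 55.0, 75.0),
}

# Hypothetical fuzzy sets over the output space V (normalized force, 0 to 1).
FORCE_SETS = {
    "less": lambda y: 1.0 - ramp_up(y, 0.0, 0.5),
    "normal": lambda y: triangle(y, 0.0, 0.5, 1.0),
    "more": lambda y: ramp_up(y, 0.5, 1.0),
}

# Fuzzy rule base: rules (1.2)-(1.4) as (IF set, THEN set) pairs.
RULE_BASE = [("low", "more"), ("medium", "normal"), ("high", "less")]


def fuzzy_system(speed, steps=200):
    """Map a real-valued speed to a real-valued force."""
    # Fuzzifier (singleton): the strength of each rule is the membership
    # of the measured speed in the rule's IF set.
    strengths = [(SPEED_SETS[a](speed), b) for a, b in RULE_BASE]

    # Inference: clip each THEN set at its rule strength, then take the maximum.
    ys = [i / steps for i in range(steps + 1)]
    aggregated = [max(min(w, FORCE_SETS[b](y)) for w, b in strengths) for y in ys]

    # Defuzzifier: center of gravity of the aggregated output set on the grid.
    total = sum(aggregated)
    if total == 0.0:
        raise ValueError("no rule applies to this input")
    return sum(y * m for y, m in zip(ys, aggregated)) / total


for speed in (30.0, 55.0, 80.0):
    print(speed, fuzzy_system(speed))
```

Seen from the outside, `fuzzy_system` is an ordinary nonlinear function of the speed, even though it was built entirely from the three linguistic rules.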
1.3.1 Fuzzy Washing Machines
The fuzzy washing machines were the first major consumer products to use fuzzy systems. They were produced by Matsushita Electric Industrial Company in Japan around 1990. They use a fuzzy system to automatically set the proper cycle according to the kind and amount of dirt and the size of the load. More specifically, the fuzzy system used is a three-input-one-output system, where the three inputs are measurements of dirtiness, type of dirt, and load size, and the output is the correct cycle. Sensors supply the fuzzy system with the inputs. The optical sensor sends a beam of light through the water and measures how much of it reaches the other side. The dirtier the water, the less light crosses. The optical sensor also can tell whether the dirt is muddy or oily. Muddy dirt dissolves faster. So, if the light readings reach minimum quickly, the dirt is muddy. If the downswing is slower, it is oily. And if the curve slopes somewhere in between, the dirt is mixed. The machine also has a load sensor that registers the volume of clothes. Clearly, the larger the volume of clothes, the more washing time is needed. The heuristics above were summarized in a number of fuzzy IF-THEN rules that were then used to construct the fuzzy system.
1.3.2 Digital Image Stabilizer
Anyone who has ever used a camcorder realizes that it is very difficult for a human hand to hold the camcorder without shaking it slightly and imparting an irksome quiver to the tape. Smoothing out this jitter would produce a new generation of camcorders and would have tremendous commercial value. Matsushita introduced what it calls a digital image stabilizer, based on fuzzy systems, which stabilizes the picture when the hand is shaking. The digital image stabilizer is a fuzzy system that is constructed based on the following heuristics:
IF all the points in the picture are moving in the same direction, THEN the hand is shaking    (1.10)
IF only some points in the picture are moving, THEN the hand is not shaking    (1.11)
More specifically, the stabilizer compares each current frame with the previous images in memory. If the whole image appears to have shifted, then according to (1.10) the hand is shaking and the fuzzy system adjusts the frame to compensate. Otherwise, it leaves it alone. Thus, if a car crosses the field, only a portion of the image will change, so the camcorder does not try to compensate. In this way the picture remains steady, although the hand is shaking.
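The two heuristics (1.10) and (1.11) can be turned into a toy computation. This is not Matsushita's design, which the text does not describe; it is a hypothetical sketch that assumes per-point motion vectors between two frames are already available. The degree to which "all the points are moving in the same direction" is taken to be the fraction of points that move along the mean shift; the thresholds `min_move` and `min_cosine` are invented for illustration.

```python
import math


def mean_shift(vectors):
    """Average displacement (dx, dy) over all tracked points."""
    n = len(vectors)
    return (sum(dx for dx, _ in vectors) / n, sum(dy for _, dy in vectors) / n)


def shake_degree(vectors, min_move=0.5, min_cosine=0.9):
    """Degree in [0, 1] to which all points move in the same direction."""
    if not vectors:
        return 0.0
    mean_dx, mean_dy = mean_shift(vectors)
    mean_norm = math.hypot(mean_dx, mean_dy)
    if mean_norm < min_move:
        return 0.0  # the picture as a whole has not shifted
    agreeing = 0
    for dx, dy in vectors:
        norm = math.hypot(dx, dy)
        if norm < min_move:
            continue  # this point is not moving
        cosine = (dx * mean_dx + dy * mean_dy) / (norm * mean_norm)
        if cosine > min_cosine:
            agreeing += 1
    return agreeing / len(vectors)


def compensation(vectors):
    """Shift to apply to the frame: opposite to the mean motion, scaled by the degree."""
    degree = shake_degree(vectors)
    mean_dx, mean_dy = mean_shift(vectors)
    return (-degree * mean_dx, -degree * mean_dy)


whole_frame_shift = [(2.0, 0.0)] * 4               # every point moves right
car_crossing = [(0.0, 0.0)] * 3 + [(3.0, 0.0)]     # only one point moves
print(shake_degree(whole_frame_shift), shake_degree(car_crossing))
```

In this sketch a partly moving scene still produces a small nonzero degree, so a practical design would need a further rule or threshold to leave such frames alone, as the text describes.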
1.3.3 Fuzzy Systems in Cars
An automobile is a collection of many systems (engine, transmission, brake, suspension, steering, and more), and fuzzy systems have been applied to almost all of them. For example, Nissan has patented a fuzzy automatic transmission that saves fuel by 12 to 17 percent. It is based on the following observation. A normal transmission shifts whenever the car passes a certain speed; it therefore changes quite often, and each shift consumes gas. However, human drivers not only shift less frequently, but also consider nonspeed factors. For example, if accelerating up
a hill, they may delay the shift. Nissan's fuzzy automatic transmission device summarized these heuristics into a collection of fuzzy IF-THEN rules that were then used to construct a fuzzy system to guide the changes of gears. Nissan also developed a fuzzy antilock braking system. The challenge here is to apply the greatest amount of pressure to the brake without causing it to lock. The Nissan system considers a number of heuristics, for example,
IF the car slows down very rapidly, THEN the system assumes brake-lock and eases up on pressure    (1.12)
In April 1992, Mitsubishi announced a fuzzy omnibus system that controls a car's automatic transmission, suspension, traction, four-wheel steering, four-wheel drive, and air conditioner. The fuzzy transmission downshifts on curves and also keeps the car from upshifting inappropriately on bends or when the driver releases the accelerator. The fuzzy suspension contains sensors in the front of the car that register vibration and height changes in the road and adjusts the suspension for a smoother ride. Fuzzy traction prevents excess speed on corners and improves the grip on slick roads by deciding whether they are level or sloped. Finally, fuzzy steering adjusts the response angle of the rear wheels according to road conditions and the car's speed, and fuzzy air conditioning monitors sunlight, temperature, and humidity to enhance the environment inside the car.
1.3.4
Fuzzy Control of a Cement Kiln
Cement is manufactured by fine grinding of cement clinker. The clinkers are produced in the cement kiln by heating a mixture of limestone, clay, and sand components. Because cement kilns exhibit time-varying nonlinear behavior and relatively few measurements are available, they are difficult to control using conventional control theory. In the late 1970s, Holmblad and Østergaard of Denmark developed a fuzzy system to control the cement kiln. The fuzzy system (fuzzy controller) had four inputs and two outputs (which can be viewed as two fuzzy systems in the form of Fig. 1.5, which share the same inputs). The four inputs are: (i) oxygen percentage in exhaust gases, (ii) temperature of exhaust gases, (iii) kiln drive torque, and (iv) litre weight of clinker (indicating temperature level in the burning zone and quality of clinker). The two outputs are: (i) coal feed rate and (ii) air flow. A collection of fuzzy IF-THEN rules was constructed that describes how the outputs should be related to the inputs. For example, the following two rules were used:
IF the oxygen percentage is high and the temperature is low, THEN increase air flow (1.13)
IF the oxygen percentage is high and the temperature is high, THEN reduce the coal feed rate slightly (1.14)
The fuzzy controller was constructed by combining these rules into fuzzy systems. In June 1978, the fuzzy controller ran for six days in the cement kiln of F.L. Smidth & Company in Denmark, the first successful test of fuzzy control on a full-scale industrial process. The fuzzy controller showed a slight improvement over the results of the human operator and also cut fuel consumption. We will show more details about this system in Chapter 16.
For safety; IF the speed of train is approaching the limit speed, THEN select the maximum brake notch (1.15)
For riding comfort; IF the speed is in the allowed range, THEN do not change the control notch (1.16)
More rules were used in the real system for traceability and other factors. The automatic stopping controller was constructed from the rules like:
For riding comfort; IF the train will stop in the allowed zone, THEN do not change the control notch (1.17)
For riding comfort and safety; IF the train is in the allowed zone, THEN change the control notch from acceleration to slight braking (1.18)
Again, more rules were used in the real system to take care of the accuracy of stopping gap and other factors. By 1991, the Sendai subway had carried passengers for four years and was still one of the most advanced subway systems.
[Figure 1.8: classification of fuzzy theory. Labels recoverable from the diagram: fuzzy mathematics (fuzzy sets, fuzzy measures, fuzzy analysis, fuzzy relations, fuzzy topology); fuzzy systems (fuzzy control: controller design, stability analysis; processing; equalization, channel assignment); measures of uncertainty.]
are introduced and expert systems are developed based on fuzzy information and approximate reasoning; (iii) fuzzy systems, which include fuzzy control and fuzzy approaches in signal processing and communications; (iv) uncertainty and information, where different kinds of uncertainties are analyzed; and (v) fuzzy decision making, which considers optimization problems with soft constraints.
Of course, these five branches are not independent and there are strong interconnections among them. For example, fuzzy control uses concepts from fuzzy mathematics and fuzzy logic.
From a practical point of view, the majority of applications of fuzzy theory have concentrated on fuzzy systems, especially fuzzy control, as we could see from the examples in Section 1.3. There also are some fuzzy expert systems that perform
medical diagnoses and decision support (Terano, Asai and Sugeno [1994]). Because fuzzy theory is still in its infancy from both theoretical and practical points of view, we expect that more solid practical applications will appear as the field matures. From Fig. 1.8 we see that fuzzy theory is a huge field that comprises a variety of research topics. In this text, we concentrate on fuzzy systems and fuzzy control. We first will study the basic concepts in fuzzy mathematics and fuzzy logic that are useful in fuzzy systems and control (Chapters 2-6), then we will study fuzzy systems and control in great detail (Chapters 7-26), and finally we will briefly review some topics in other fields of fuzzy theory (Chapters 27-31).
1.5 A Brief History of Fuzzy Theory and Applications
1.5.1 The 1960s: The Beginning of Fuzzy Theory
Fuzzy theory was initiated by Lotfi A. Zadeh in 1965 with his seminal paper "Fuzzy Sets" (Zadeh [1965]). Before working on fuzzy theory, Zadeh was a well-respected scholar in control theory. He developed the concept of "state," which forms the basis for modern control theory. In the early '60s, he thought that classical control theory had put too much emphasis on precision and therefore could not handle complex systems. As early as 1962, he wrote that to handle biological systems "we need a radically different kind of mathematics, the mathematics of fuzzy or cloudy quantities which are not describable in terms of probability distributions" (Zadeh [1962]). Later, he formalized the ideas into the paper "Fuzzy Sets."
Since its birth, fuzzy theory has been sparking controversy. Some scholars, like Richard Bellman, endorsed the idea and began to work in this new field. Other scholars objected to the idea and viewed "fuzzification" as against basic scientific principles. The biggest challenge, however, came from mathematicians in statistics and probability who claimed that probability is sufficient to characterize uncertainty and that any problem fuzzy theory can solve can be solved equally well or better by probability theory (see Chapter 31). Because there were no real practical applications of fuzzy theory in the beginning, it was difficult to defend the field from a purely philosophical point of view. Almost all major research institutes in the world failed to view fuzzy theory as a serious research field.
Although fuzzy theory did not fall into the mainstream, there were still many researchers around the world dedicating themselves to this new field. In the late 1960s, many new fuzzy methods like fuzzy algorithms, fuzzy decision making, etc., were proposed.
1.5.2 The 1970s: Theory Continued to Grow and Real Applications Appeared
It is fair to say that the establishment of fuzzy theory as an independent field is largely due to the dedication and outstanding work of Zadeh. Most of the fundamental concepts in fuzzy theory were proposed by Zadeh in the late '60s and early '70s. After the introduction of fuzzy sets in 1965, he proposed the concepts of fuzzy algorithms in 1968 (Zadeh [1968]), fuzzy decision making in 1970 (Bellman and Zadeh [1970]), and fuzzy ordering in 1971 (Zadeh [1971b]). In 1973, he published another seminal paper, "Outline of a new approach to the analysis of complex systems and decision processes" (Zadeh [1973]), which established the foundation for fuzzy control. In this paper, he introduced the concept of linguistic variables and proposed to use fuzzy IF-THEN rules to formulate human knowledge.
A big event in the '70s was the birth of fuzzy controllers for real systems. In 1975, Mamdani and Assilian established the basic framework of the fuzzy controller (which is essentially the fuzzy system in Fig. 1.5) and applied the fuzzy controller to control a steam engine. Their results were published in another seminal paper in fuzzy theory, "An experiment in linguistic synthesis with a fuzzy logic controller" (Mamdani and Assilian [1975]). They found that the fuzzy controller was very easy to construct and worked remarkably well. Later in 1978, Holmblad and Østergaard developed the first fuzzy controller for a full-scale industrial process, the fuzzy cement kiln controller (see Section 1.3).
Generally speaking, the foundations of fuzzy theory were established in the 1970s. With the introduction of many new concepts, the picture of fuzzy theory as a new field was becoming clear. Initial applications like the fuzzy steam engine controller and the fuzzy cement kiln controller also showed that the field was promising. At this stage, one would expect a new field to be funded by major resources and major research institutes to put some manpower on the topic. Unfortunately, this never happened. On the contrary, in the late '70s and early '80s, many researchers in fuzzy theory had to change their field because they could not find support to continue their work.
This was especially true in the United States.
1.5.3 The 1980s: Massive Applications Made a Difference
In the early '80s, this field, from a theoretical point of view, progressed very slowly. Few new concepts and approaches were proposed during this period, simply because very few people were still working in the field. It was the application of fuzzy control that saved the field. Japanese engineers, with their sensitivity to new technology, quickly found that fuzzy controllers were very easy to design and worked very well for many problems. Because fuzzy control does not require a mathematical model of the process, it could be applied to many systems where conventional control theory could not be used due to a lack of mathematical models. In 1980, Sugeno began to create Japan's first fuzzy application: control of a Fuji Electric water purification plant. In 1983, he began the pioneering work on a fuzzy robot, a self-parking car that was controlled by calling out commands (Sugeno and Nishida [1985]). In the early 1980s, Yasunobu and Miyamoto from Hitachi began to develop a fuzzy control system for the Sendai
subway. They finished the project in 1987 and created the most advanced subway system on earth. This very impressive application of fuzzy control made a very big difference. In July 1987, the Second Annual International Fuzzy Systems Association Conference was held in Tokyo. The conference began three days after the Sendai subway began operation, and attendees were amused with its dreamy ride. Also, in the conference Hirota displayed a fuzzy robot arm that played two-dimensional Ping-Pong in real time (Hirota, Arai and Hachisu [1989]), and Yamakawa demonstrated a fuzzy system that balanced an inverted pendulum (Yamakawa [1989]). Prior to this event, fuzzy theory was not well-known in Japan. After it, a wave of pro-fuzzy sentiment swept through the engineering, government, and business communities. By the early 1990s, a large number of fuzzy consumer products appeared in the market (see Section 1.3 for examples).
1.5.4
The 1990s: More Challenges Remain
The success of fuzzy systems in Japan surprised the mainstream researchers in the United States and in Europe. Some still criticize fuzzy theory, but many others have been changing their minds and giving fuzzy theory a chance to be taken seriously. In February 1992, the first IEEE International Conference on Fuzzy Systems was held in San Diego. This event symbolized the acceptance of fuzzy theory by the largest engineering organization, the IEEE. In 1993, the IEEE Transactions on Fuzzy Systems was inaugurated.
From a theoretical point of view, fuzzy systems and control advanced rapidly in the late 1980s and early 1990s. Although it is hard to say there is any breakthrough, solid progress has been made on some fundamental problems in fuzzy systems and control. For example, neural network techniques have been used to determine membership functions in a systematic manner, and rigorous stability analysis of fuzzy control systems has appeared. Although the whole picture of fuzzy systems and control theory is becoming clearer, much work remains to be done. Most approaches and analyses are preliminary in nature. We believe that only when the top research institutes begin to put some serious manpower on the research of fuzzy theory can the field make major progress.
1.6 Summary and Further Readings
In this chapter we have demonstrated the following:
The goal of using fuzzy systems is to put human knowledge into engineering systems in a systematic, efficient, and analyzable order.
The basic architectures of the commonly used fuzzy systems.
The fuzzy IF-THEN rules used in certain industrial processes and consumer products.
Classification and brief history of fuzzy theory and applications.
A very good non-technical introduction to fuzzy theory and applications is McNeill and Freiberger [1993]. It contains many interviews and describes the major events. Some historical remarks were made in Kruse, Gebhardt, and Klawonn [1994]. Klir and Yuan [1995] is perhaps the most comprehensive book on fuzzy sets and fuzzy logic. Earlier applications of fuzzy control were collected in Sugeno [1985] and more recent applications (mainly in Japan) were summarized in Terano, Asai, and Sugeno [1994].
1.7
Exercises
Exercise 1.1. Is the fuzzy washing machine an open-loop control system or a closed-loop control system? What about the fuzzy cement kiln control system? Explain your answer.
Exercise 1.2. List four to six applications of fuzzy theory to practical problems other than those in Section 1.3. Point out the references where you find these applications.
Exercise 1.3. Suppose we want to design a fuzzy system to balance the inverted pendulum shown in Fig. 1.9. Let the angle $\theta$ and its derivative $\dot{\theta}$ be the inputs to the fuzzy system and the force $u$ applied to the cart be its output.
(a) Determine three to five fuzzy IF-THEN rules based on the common sense of how to balance the inverted pendulum.
(b) Suppose that the rules in (a) can successfully control a particular inverted pendulum system. Now if we want to use the rules to control another inverted pendulum system with different values of $m_c$, $m$, and $l$, what parts of the rules should change and what parts may remain the same?
Chapter 2
(2.1)
There is yet a third method to define a set A, the membership method, which introduces a zero-one membership function (also called characteristic function, discrimination function, or indicator function) for A, denoted by $\mu_A(x)$, such that
$\mu_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$ (2.2)
The set A is mathematically equivalent to its membership function $\mu_A(x)$ in the sense that knowing $\mu_A(x)$ is the same as knowing A itself.
Example 2.1. Consider the set of all cars in Berkeley; this is the universe of discourse U. We can define different sets in U according to the properties of cars. Fig. 2.1 shows two types of properties that can be used to define sets in U: (a) US cars or non-US cars, and (b) number of cylinders. For example, we can define a set A as all cars in U that have 4 cylinders, that is,
$A = \{x \in U \mid x \text{ has 4 cylinders}\}$ (2.3)
Figure 2.1. Partitioning of the set of all cars in Berkeley into subsets by: (a) US cars or non-US cars, and (b) number of cylinders.
or
$\mu_A(x) = \begin{cases} 1 & \text{if } x \in U \text{ and } x \text{ has 4 cylinders} \\ 0 & \text{if } x \in U \text{ and } x \text{ does not have 4 cylinders} \end{cases}$ (2.4)
If we want to define a set in U according to whether the car is a US car or a non-US car, we face a difficulty. One perspective is that a car is a US car if it carries the name of a USA auto manufacturer; otherwise it is a non-US car. However, many people feel that the distinction between a US car and a non-US car is not as crisp as it once was, because many of the components for what we consider to be US cars (for example, Fords, GMs, Chryslers) are produced outside of the United States. Additionally, some "non-US" cars are manufactured in the USA. How do we deal with this kind of problem?
Essentially, the difficulty in Example 2.1 shows that some sets do not have clear boundaries. Classical set theory requires that a set must have a well-defined property; therefore it is unable to define a set like "all US cars in Berkeley." To overcome this limitation of classical set theory, the concept of fuzzy set was introduced. It turns out that this limitation is fundamental and a new theory is needed: fuzzy set theory.
Definition 2.1. A fuzzy set in a universe of discourse U is characterized by a membership function $\mu_A(x)$ that takes values in the interval [0, 1].
Therefore, a fuzzy set is a generalization of a classical set by allowing the membership function to take any values in the interval [0, 1]. In other words, the membership function of a classical set can only take two values, zero and one, whereas the membership function of a fuzzy set is a continuous function with range [0, 1]. We see from the definition that there is nothing "fuzzy" about a fuzzy set; it is simply a set with a continuous membership function.
A fuzzy set A in U may be represented as a set of ordered pairs of a generic element x and its membership value, that is,
$A = \{(x, \mu_A(x)) \mid x \in U\}$ (2.5)
When U is continuous (for example, $U = R$), A is commonly written as
$A = \int_U \mu_A(x)/x$ (2.6)
where the integral sign does not denote integration; it denotes the collection of all points $x \in U$ with the associated membership function $\mu_A(x)$. When U is discrete, A is commonly written as
$A = \sum_U \mu_A(x)/x$ (2.7)
where the summation sign does not represent arithmetic addition; it denotes the collection of all points $x \in U$ with the associated membership function $\mu_A(x)$. We now return to Example 2.1 and see how to use the concept of fuzzy set to define US and non-US cars.
Example 2.1 (Cont'd). We can define the set "US cars in Berkeley," denoted by D, as a fuzzy set according to the percentage of the car's parts made in the USA. Specifically, D is defined by the membership function
$\mu_D(x) = p(x)$ (2.8)
where $p(x)$ is the percentage of the parts of car x made in the USA and it takes values from 0% to 100%. For example, if a particular car $x_0$ has 60% of its parts made in the USA, then we say that the car $x_0$ belongs to the fuzzy set D to the degree of 0.6. Similarly, we can define the set "non-US cars in Berkeley," denoted by F, as a fuzzy set with the membership function
$\mu_F(x) = 1 - p(x)$ (2.9)
where $p(x)$ is the same as in (2.8). Thus, if a particular car $x_0$ has 60% of its parts made in the USA, then we say the car $x_0$ belongs to the fuzzy set F to the degree of 1 - 0.6 = 0.4. Fig. 2.2 shows (2.8) and (2.9). Clearly, an element can belong to different fuzzy sets to the same or different degrees. We now consider another example of fuzzy sets and from it draw some remarks.
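The continued Example 2.1 can be sketched in a few lines of Python. This is only an illustration: the function names `mu_D` and `mu_F` are our own, and the percentage of US-made parts is assumed to be given as a fraction between 0 and 1 rather than as a percentage.

```python
def mu_D(p: float) -> float:
    """Degree of membership in "US cars", given the fraction p of US-made parts."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    return p


def mu_F(p: float) -> float:
    """Degree of membership in "non-US cars" for the same fraction p."""
    return 1.0 - mu_D(p)


# The car x0 of the example, with 60% of its parts made in the USA.
p_x0 = 0.60
print(f"mu_D = {mu_D(p_x0):.1f}, mu_F = {mu_F(p_x0):.1f}")
```

The same car belongs to both fuzzy sets at once, to degrees that here happen to sum to one; that is a property of this particular pair of definitions, not of fuzzy sets in general.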
Figure 2.2. Membership functions for US ($\mu_D$) and non-US ($\mu_F$) cars based on the percentage of parts of the car made in the USA ($p(x)$).
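Example 2.2. Let Z be a fuzzy set named "numbers close to zero." Then a possible membership function for Z is
$\mu_Z(x) = e^{-x^2}$ (2.10)
where $x \in R$. This is a Gaussian-shaped function centered at zero (written in the form $e^{-x^2/(2\sigma^2)}$, it has $\sigma = 1/\sqrt{2}$). According to this membership function, the numbers 0 and 2 belong to the fuzzy set Z to the degrees of $e^0 = 1$ and $e^{-4}$, respectively. We also may define the membership function for Z as
$\mu_Z(x) = \begin{cases} 0 & \text{if } x < -1 \\ x + 1 & \text{if } -1 \le x < 0 \\ 1 - x & \text{if } 0 \le x < 1 \\ 0 & \text{if } 1 \le x \end{cases}$ (2.11)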
According to this membership function, the numbers 0 and 2 belong to the fuzzy set Z to the degrees of 1 and 0, respectively. (2.10) and (2.11) are plotted graphically in Figs. 2.3 and 2.4, respectively. We can choose many other membership functions to characterize "numbers close to zero."
From Example 2.2 we can draw three important remarks on fuzzy sets:
The properties that a fuzzy set is used to characterize are usually fuzzy; for example, "numbers close to zero" is not a precise description. Therefore, we
may use different membership functions to characterize the same description. However, the membership functions themselves are not fuzzy; they are precise mathematical functions. Once a fuzzy property is represented by a membership function, for example, once "numbers close to zero" is represented by the membership function (2.10) or (2.11), nothing will be fuzzy anymore. Thus, by characterizing a fuzzy description with a membership function, we essentially defuzzify the fuzzy description. A common misunderstanding of fuzzy set theory is that fuzzy set theory tries to fuzzify the world. We see, on the contrary, that fuzzy sets are used to defuzzify the world.
Following the previous remark is an important question: how do we determine the membership functions? Because there are a variety of choices of membership functions, how do we choose one from these alternatives? Conceptually, there are two approaches to determining a membership function. The first approach is to use the knowledge of human experts, that is, ask the domain experts to specify the membership functions. Because fuzzy sets are often used to formulate human knowledge, the membership functions represent a part of human knowledge. Usually, this approach can only give a rough formula of the membership function; fine-tuning is required. In the second approach, we use data collected from various sensors to determine the membership functions. Specifically, we first specify the structures of the membership functions and then fine-tune the parameters of the membership functions based on the data. Both approaches, especially the second approach, will be studied in detail in
Figure 2.4. Another possible membership function to characterize "numbers close to zero."
later chapters.
Finally, it should be emphasized that although (2.10) and (2.11) are used to characterize the same description "numbers close to zero," they are different fuzzy sets. Hence, rigorously speaking, we should use different labels to represent the fuzzy sets (2.10) and (2.11); for example, we should use $\mu_{Z_1}(x)$ in (2.10) and $\mu_{Z_2}(x)$ in (2.11). A fuzzy set has a one-to-one correspondence with its membership function. That is, when we say a fuzzy set, there must be a unique membership function associated with it; conversely, when we give a membership function, it represents a fuzzy set. Fuzzy sets and their membership functions are equivalent in this sense.
Let us consider two more examples of fuzzy sets, one in continuous domain and the other in discrete domain; they are classical examples from Zadeh's seminal paper (Zadeh [1965]).
Example 2.3. Let U be the interval [0, 100] representing the age of ordinary humans. Then we may define fuzzy sets "young" and "old" as (using the integral notation (2.6))
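Example 2.4. Let U be the integers from 1 to 10, that is, U = {1, 2, ..., 10}. Then the fuzzy set "several" may be defined as (using the summation notation (2.7))
several = 0.5/3 + 0.8/4 + 1/5 + 1/6 + 0.8/7 + 0.5/8 (2.14)
That is, 5 and 6 belong to the fuzzy set "several" with degree 1, 4 and 7 with degree 0.8, 3 and 8 with degree 0.5, and 1, 2, 9 and 10 with degree 0. See Fig. 2.6.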
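A discrete fuzzy set such as "several" maps naturally onto a dictionary from elements to membership degrees. The sketch below is one possible representation, not a prescribed one; the names `several`, `membership` and `summation_notation` are illustrative, and elements missing from the dictionary are assumed to have degree 0.

```python
# Universe of discourse: the integers 1..10.
U = range(1, 11)

# Membership degrees of "several" as stated in the text; omitted integers have degree 0.
several = {3: 0.5, 4: 0.8, 5: 1.0, 6: 1.0, 7: 0.8, 8: 0.5}


def membership(fuzzy_set: dict, x: int) -> float:
    """Return the membership degree of x, defaulting to 0 for unlisted elements."""
    return fuzzy_set.get(x, 0.0)


def summation_notation(fuzzy_set: dict) -> str:
    """Render the set in the 'degree/element' summation notation used for discrete U."""
    terms = (f"{mu:g}/{x}" for x, mu in sorted(fuzzy_set.items()) if mu > 0)
    return " + ".join(terms)


for x in U:
    print(x, membership(several, x))
print(summation_notation(several))
```

The "+" signs in the printed string only separate the pairs; as noted for the summation notation, they do not denote arithmetic addition.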
2.2
We now introduce some basic concepts and terminology associated with a fuzzy set. Many of them are extensions of the basic concepts of a classical (crisp) set, but some are unique to the fuzzy set framework.
Definition 2.2. The concepts of support, fuzzy singleton, center, crossover point, height, normal fuzzy set, α-cut, convex fuzzy set, and projections are defined as follows.
The support of a fuzzy set A in the universe of discourse U is a crisp set that contains all the elements of U that have nonzero membership values in A, that is,
$\text{supp}(A) = \{x \in U \mid \mu_A(x) > 0\}$ (2.15)
Figure 2.6. Membership function for fuzzy set "several."
where supp(A) denotes the support of fuzzy set A. For example, the support of fuzzy set "several" in Fig. 2.6 is the set of integers {3, 4, 5, 6, 7, 8}. If the support of a fuzzy set is empty, it is called an empty fuzzy set. A fuzzy singleton is a fuzzy set whose support is a single point in U.
The center of a fuzzy set is defined as follows: if the mean value of all points at which the membership function of the fuzzy set achieves its maximum value is finite, then define this mean value as the center of the fuzzy set; if the mean value equals positive (negative) infinity, then the center is defined as the smallest (largest) among all points that achieve the maximum membership value. Fig. 2.7 shows the centers of some typical fuzzy sets.
The crossover point of a fuzzy set A is the point in U whose membership value in A equals 0.5.
The height of a fuzzy set is the largest membership value attained by any point. For example, the heights of all the fuzzy sets in Figs. 2.2-2.4 equal one. If the height of a fuzzy set equals one, it is called a normal fuzzy set. All the fuzzy sets in Figs. 2.2-2.4 are therefore normal fuzzy sets.
An α-cut of a fuzzy set A is a crisp set $A_\alpha$ that contains all the elements in U that have membership values in A greater than or equal to α, that is,
$A_\alpha = \{x \in U \mid \mu_A(x) \ge \alpha\}$ (2.16)
For example, for α = 0.3, the α-cut of the fuzzy set (2.11) (Fig. 2.4) is the crisp set [-0.7, 0.7], and for α = 0.9, it is [-0.1, 0.1].
[Figure 2.7: centers of some typical fuzzy sets $A_1$, $A_2$, $A_3$, $A_4$.]
When the universe of discourse U is the n-dimensional Euclidean space $R^n$, the concept of set convexity can be generalized to fuzzy sets. A fuzzy set A is said to be convex if and only if its α-cut $A_\alpha$ is a convex set for any α in the interval (0, 1]. The following lemma gives an equivalent definition of a convex fuzzy set.
Lemma 2.1. A fuzzy set A in $R^n$ is convex if and only if
$\mu_A[\lambda x_1 + (1-\lambda)x_2] \ge \min[\mu_A(x_1), \mu_A(x_2)]$ (2.17)
for all $x_1, x_2 \in R^n$ and all $\lambda \in [0, 1]$.
Proof: First, suppose that A is convex and we prove the truth of (2.17). Let $x_1$ and $x_2$ be arbitrary points in $R^n$ and without loss of generality we assume $\mu_A(x_1) \le \mu_A(x_2)$. If $\mu_A(x_1) = 0$, then (2.17) is trivially true, so we let $\mu_A(x_1) = \alpha > 0$. Since by assumption the α-cut $A_\alpha$ is convex and $x_1, x_2 \in A_\alpha$ (since $\mu_A(x_2) \ge \mu_A(x_1) = \alpha$), we have $\lambda x_1 + (1-\lambda)x_2 \in A_\alpha$ for all $\lambda \in [0, 1]$. Hence, $\mu_A[\lambda x_1 + (1-\lambda)x_2] \ge \alpha = \mu_A(x_1) = \min[\mu_A(x_1), \mu_A(x_2)]$.
Conversely, suppose (2.17) is true and we prove that A is convex. Let α be an arbitrary point in (0, 1]. If $A_\alpha$ is empty, then it is convex (empty sets are convex by definition). If $A_\alpha$ is nonempty, let $x_1$ and $x_2$ be arbitrary elements of $A_\alpha$, so that $\mu_A(x_1) \ge \alpha$ and $\mu_A(x_2) \ge \alpha$ (by the definition of $A_\alpha$). Since (2.17) is true by assumption, we have $\mu_A[\lambda x_1 + (1-\lambda)x_2] \ge \min[\mu_A(x_1), \mu_A(x_2)] \ge \alpha$ for all $\lambda \in [0, 1]$, which means that $\lambda x_1 + (1-\lambda)x_2 \in A_\alpha$. So $A_\alpha$ is a convex set. Since α is an arbitrary point in (0, 1], the convexity of $A_\alpha$ implies the convexity of A.
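Several of the concepts in Definition 2.2, and the inequality of Lemma 2.1, can be checked numerically. The following sketch makes some assumptions of its own: the membership functions are evaluated on a finite grid over [-3, 3], the triangular function `mu_tri` stands for the shape 1 - |x| on [-1, 1] whose α-cuts are quoted in the text, and `mu_two_peaks` is a hypothetical non-convex fuzzy set added only for contrast. A grid test can refute convexity but cannot prove it.

```python
import math


def mu_gauss(x):
    """Gaussian-shaped membership for "numbers close to zero"."""
    return math.exp(-x * x)


def mu_tri(x):
    """Triangular membership: 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))


def mu_two_peaks(x):
    """Hypothetical non-convex fuzzy set with peaks at -1.5 and 1.5."""
    return max(mu_tri(x + 1.5), mu_tri(x - 1.5))


def alpha_cut(mu, grid, alpha, tol=1e-12):
    """Grid points whose membership is at least alpha (tol absorbs rounding)."""
    return [x for x in grid if mu(x) >= alpha - tol]


def satisfies_convexity_inequality(mu, grid, lambdas, tol=1e-12):
    """Test mu(l*x1 + (1-l)*x2) >= min(mu(x1), mu(x2)) on all grid pairs."""
    for x1 in grid:
        for x2 in grid:
            bound = min(mu(x1), mu(x2))
            for lam in lambdas:
                if mu(lam * x1 + (1 - lam) * x2) < bound - tol:
                    return False
    return True


grid = [i / 10 for i in range(-30, 31)]
lambdas = [j / 10 for j in range(11)]

cut = alpha_cut(mu_tri, grid, 0.3)
print("0.3-cut endpoints:", min(cut), max(cut))  # -0.7 0.7, as in the text
print("Gaussian passes:", satisfies_convexity_inequality(mu_gauss, grid, lambdas))
print("Two peaks passes:", satisfies_convexity_inequality(mu_two_peaks, grid, lambdas))

# Support and height of the discrete fuzzy set "several".
several = {3: 0.5, 4: 0.8, 5: 1.0, 6: 1.0, 7: 0.8, 8: 0.5}
support = sorted(x for x, m in several.items() if m > 0)
height = max(several.values())
print("support:", support, "height:", height)
```

The two-peaked set fails the test because the midpoint of its two peaks has membership 0, which is exactly the situation in which an α-cut splits into two disjoint intervals.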
Let A be a fuzzy set in $R^n$ with membership function $\mu_A(x) = \mu_A(x_1, ..., x_n)$ and H be a hyperplane in $R^n$ defined by $H = \{x \in R^n \mid x_1 = 0\}$ (for notational simplicity, we consider this special case of hyperplane; generalization to general hyperplanes is straightforward). The projection of A on H is a fuzzy set $A_H$ in $R^{n-1}$ defined by
$\mu_{A_H}(x_2, ..., x_n) = \sup_{x_1 \in R} \mu_A(x_1, ..., x_n)$ (2.18)
where $\sup_{x_1 \in R} \mu_A(x_1, ..., x_n)$ denotes the maximum value of the function $\mu_A(x_1, ..., x_n)$ when $x_1$ takes values in R.
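The projection can be approximated numerically by maximizing over a grid of $x_1$ values. In the sketch below, the two-dimensional membership function `mu_A` is a hypothetical example chosen because its projection has the closed form $e^{-x_2^2/2}$; the grid range and spacing are likewise assumptions, and a grid maximum only approximates the supremum from below.

```python
import math


def mu_A(x1: float, x2: float) -> float:
    """Hypothetical fuzzy set in R^2, used only for illustration."""
    return math.exp(-(x1 ** 2 + (x2 - x1) ** 2))


def project_on_H(mu, x2: float, x1_grid) -> float:
    """Approximate sup over x1 of mu(x1, x2) by the maximum over a finite grid."""
    return max(mu(x1, x2) for x1 in x1_grid)


# Grid for x1 on [-5, 5] with spacing 0.01.
x1_grid = [i / 100 for i in range(-500, 501)]

for x2 in (0.0, 1.0, 2.0):
    approx = project_on_H(mu_A, x2, x1_grid)
    exact = math.exp(-x2 ** 2 / 2)  # maximizer is x1 = x2 / 2 for this mu_A
    print(f"x2 = {x2}: grid = {approx:.6f}, closed form = {exact:.6f}")
```

For each fixed $x_2$ the projection keeps only the best membership value achievable along the $x_1$ direction, which is why the projected set is never smaller than any slice of A.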
2.3 Operations on Fuzzy Sets
The basic concepts introduced in Sections 2.1 and 2.2 concern only a single fuzzy set. In this section, we study the basic operations on fuzzy sets. In the sequel, we assume that A and B are fuzzy sets defined in the same universe of discourse U.
Definition 2.3. The equality, containment, complement, union, and intersection of two fuzzy sets A and B are defined as follows.
We say A and B are equal if and only if $\mu_A(x) = \mu_B(x)$ for all $x \in U$. We say B contains A, denoted by $A \subset B$, if and only if $\mu_A(x) \le \mu_B(x)$ for all $x \in U$. The complement of A is a fuzzy set $\bar{A}$ in U whose membership function is defined as
$\mu_{\bar{A}}(x) = 1 - \mu_A(x)$ (2.19)
The union of A and B is a fuzzy set in U, denoted by $A \cup B$, whose membership function is defined as
$\mu_{A \cup B}(x) = \max[\mu_A(x), \mu_B(x)]$ (2.20)
The intersection of A and B is a fuzzy set $A \cap B$ in U with membership function
$\mu_{A \cap B}(x) = \min[\mu_A(x), \mu_B(x)]$ (2.21)
The reader may wonder why we use "max" for union and "min" for intersection; we now give an intuitive explanation. An intuitively appealing way of defining the union is the following: the union of A and B is the smallest fuzzy set containing both A and B. More precisely, if C is any fuzzy set that contains both A and B, then it also contains the union of A and B. To show that this intuitively appealing definition is equivalent to (2.20), we note, first, that $A \cup B$ as defined by (2.20) contains both A and B because $\max[\mu_A, \mu_B] \ge \mu_A$ and $\max[\mu_A, \mu_B] \ge \mu_B$. Furthermore, if C is any fuzzy set containing both A and B, then $\mu_C \ge \mu_A$ and $\mu_C \ge \mu_B$. Therefore, $\mu_C \ge \max[\mu_A, \mu_B] = \mu_{A \cup B}$, which means that $A \cup B$ as defined by (2.20) is the smallest fuzzy set containing both A and B. The intersection as defined by (2.21) can be justified in the same manner.
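The operations of Definition 2.3 act pointwise on membership functions, so they can be sketched as higher-order functions. The code below is a minimal illustration under our own naming (`complement`, `union`, `intersection`, `contains`); it reuses the US and non-US car sets of Example 2.1 with the parts percentage expressed as a fraction, and it checks containment only on a finite grid.

```python
def complement(mu):
    """Pointwise complement: 1 - mu(x)."""
    return lambda x: 1.0 - mu(x)


def union(mu_a, mu_b):
    """Pointwise maximum of two membership functions."""
    return lambda x: max(mu_a(x), mu_b(x))


def intersection(mu_a, mu_b):
    """Pointwise minimum of two membership functions."""
    return lambda x: min(mu_a(x), mu_b(x))


def contains(mu_big, mu_small, grid, tol=1e-12):
    """True if mu_small(x) <= mu_big(x) at every grid point."""
    return all(mu_small(x) <= mu_big(x) + tol for x in grid)


# US cars (D) and non-US cars (F), as functions of the fraction p of US-made parts.
mu_D = lambda p: p
mu_F = lambda p: 1.0 - p

grid = [i / 100 for i in range(101)]
mu_union = union(mu_F, mu_D)
mu_inter = intersection(mu_F, mu_D)

print(mu_union(0.6), mu_inter(0.6))

# The union contains both sets, and any other set C containing both also contains the union.
mu_C = lambda p: 1.0  # a trivially large fuzzy set containing D and F
print(contains(mu_union, mu_D, grid), contains(mu_union, mu_F, grid))
print(contains(mu_C, mu_union, grid))
```

Returning functions rather than tables keeps the sketch independent of any particular universe of discourse; a grid is needed only when a statement such as containment must be checked for all x.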
[Figure 2.8: the membership functions for $\bar{F}$ and $F$.]
Example 2.5. Consider the two fuzzy sets D and F defined by (2.8) and (2.9) (see also Fig. 2.2). The complement of F, $\bar{F}$, is the fuzzy set defined by
$\mu_{\bar{F}}(x) = 1 - \mu_F(x) = 1 - (1 - p(x)) = p(x)$ (2.22)
which is shown in Fig. 2.8. Comparing (2.22) with (2.8) we see that $\bar{F} = D$. This makes sense because if a car is not a non-US car (which is what the complement of F means intuitively), then it should be a US car; or more accurately, the less a car is a non-US car, the more the car is a US car. The union of F and D is the fuzzy set $F \cup D$ defined by
$\mu_{F \cup D}(x) = \max[\mu_F(x), \mu_D(x)] = \begin{cases} \mu_F(x) & \text{if } 0 \le p(x) \le 0.5 \\ \mu_D(x) & \text{if } 0.5 \le p(x) \le 1 \end{cases}$ (2.23)
which is plotted in Fig. 2.9. The intersection of F and D is the fuzzy set $F \cap D$ defined by
$\mu_{F \cap D}(x) = \min[\mu_F(x), \mu_D(x)] = \begin{cases} \mu_D(x) & \text{if } 0 \le p(x) \le 0.5 \\ \mu_F(x) & \text{if } 0.5 \le p(x) \le 1 \end{cases}$ (2.24)
which is plotted in Fig. 2.10.
With the operations of complement, union and intersection defined as in (2.19), (2.20) and (2.21), many of the basic identities (not all!) which hold for classical sets can be extended to fuzzy sets. As an example, let us consider the following lemma.
Figure 2.9. The membership function for $F \cup D$, where F and D are defined in Fig. 2.2.
Figure 2.10. The membership function for $F \cap D$, where F and D are defined in Fig. 2.2.
Lemma 2.2. De Morgan's Laws are true for fuzzy sets. That is, suppose A and B are fuzzy sets; then
$\overline{A \cup B} = \bar{A} \cap \bar{B}$ (2.25)
and
$\overline{A \cap B} = \bar{A} \cup \bar{B}$ (2.26)
Proof: We only prove (2.25); (2.26) can be proven in the same way and is left as an exercise. First, we show that the following identity is true:
$1 - \max[\mu_A, \mu_B] = \min[1 - \mu_A, 1 - \mu_B]$ (2.27)
To show this we consider the two possible cases: $\mu_A \ge \mu_B$ and $\mu_A < \mu_B$. If $\mu_A \ge \mu_B$, then $1 - \mu_A \le 1 - \mu_B$ and $1 - \max[\mu_A, \mu_B] = 1 - \mu_A = \min[1 - \mu_A, 1 - \mu_B]$, which is (2.27). If $\mu_A < \mu_B$, then $1 - \mu_A > 1 - \mu_B$ and $1 - \max[\mu_A, \mu_B] = 1 - \mu_B = \min[1 - \mu_A, 1 - \mu_B]$, which is again (2.27). Hence, (2.27) is true. From the definitions (2.19)-(2.21) and the definition of the equality of two fuzzy sets, we see that (2.27) implies (2.25).
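The identity used in the proof of Lemma 2.2 can also be spot-checked numerically. The sketch below draws random membership values in [0, 1] and compares both sides of each De Morgan law; the function name, the number of trials and the seed are arbitrary choices, and a passing check is a sanity test rather than a substitute for the proof.

```python
import math
import random


def de_morgan_holds(trials: int = 10_000, seed: int = 0) -> bool:
    """Compare both sides of the two De Morgan laws on random membership values."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.random(), rng.random()
        # Complement of the union equals intersection of the complements.
        if not math.isclose(1 - max(a, b), min(1 - a, 1 - b), abs_tol=1e-12):
            return False
        # Complement of the intersection equals union of the complements.
        if not math.isclose(1 - min(a, b), max(1 - a, 1 - b), abs_tol=1e-12):
            return False
    return True


print(de_morgan_holds())
```

Because the operations are pointwise, checking the identity on pairs of numbers in [0, 1] covers every possible pair of membership values that two fuzzy sets can take at a given point.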
2.4 Summary and Further Readings
In this chapter we have demonstrated the following:
The definitions of fuzzy set, basic concepts associated with a fuzzy set (support, α-cut, convexity, etc.) and basic operations (complement, union, intersection, etc.) of fuzzy sets.
The intuitive meaning of membership functions and how to determine intuitively appealing membership functions for specific fuzzy descriptions.
Performing operations on specific examples of fuzzy sets and proving basic properties concerning fuzzy sets and their operations.
Zadeh's original paper (Zadeh [1965]) is still the best source to learn fuzzy sets and related concepts. The paper was extremely well-written and the reader is encouraged to read it. The basic operations and concepts associated with a fuzzy set were also introduced in Zadeh [1965].
2.5
Exercises
Exercise 2.1. Determine reasonable membership functions for "short persons," "tall persons," and "heavy persons."
Exercise 2.2. Model the following expressions as fuzzy sets: (a) hard-working students, (b) top students, and (c) smart students.
Exercise 2.3. Consider the fuzzy sets F, G and H defined in the interval U = [0, 10] by the membership functions
Determine the mathematical formulas and graphs of membership functions of each of the following fuzzy sets: (a) $F$, $G$, $H$ (b) $F \cup G$, $F \cup H$, $G \cup H$ (c) $F \cap G$, $F \cap H$, $G \cap H$ (d) $F \cup G \cup H$, $F \cap G \cap H$
(e)
F~H,W,-
Exercise 2.4. Determine the α-cuts of the fuzzy sets F, G and H in Exercise 2.3 for: (a) α = 0.2, (b) α = 0.5, (c) α = 0.9, and (d) α = 1.
Exercise 2.5. Let fuzzy set A be defined in the closed plane $U = [-1, 1] \times [-3, 3]$ with membership function
Determine the projections of A on the hyperplanes $H_1 = \{x \in U \mid x_1 = 0\}$ and $H_2 = \{x \in U \mid x_2 = 0\}$, respectively.
Exercise 2.6. Show that the law of the excluded middle, $F \cup \bar{F} = U$, is not true if F is a fuzzy set.
Exercise 2.7. Prove the identity (2.26) in Lemma 2.2.
Exercise 2.8. Show that the intersection of two convex fuzzy sets is also a convex fuzzy set. What about the union?
Chapter 3
Further Operations on Fuzzy Sets
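In Chapter 2 we introduced the following basic operators for complement, union, and intersection of fuzzy sets:
$\mu_{\bar{A}}(x) = 1 - \mu_A(x)$ (3.1)
$\mu_{A \cup B}(x) = \max[\mu_A(x), \mu_B(x)]$ (3.2)
$\mu_{A \cap B}(x) = \min[\mu_A(x), \mu_B(x)]$ (3.3)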
We explained that the fuzzy set $A \cup B$ defined by (3.2) is the smallest fuzzy set containing both A and B, and the fuzzy set $A \cap B$ defined by (3.3) is the largest fuzzy set contained by both A and B. Therefore, (3.1)-(3.3) define only one type of operations on fuzzy sets. Other possibilities exist. For example, we may define $A \cup B$ as any fuzzy set containing both A and B (not necessarily the smallest fuzzy set). In this chapter, we study other types of operators for complement, union, and intersection of fuzzy sets.
Why do we need other types of operators? The main reason is that the operators (3.1)-(3.3) may not be satisfactory in some situations. For example, when we take the intersection of two fuzzy sets, we may want the larger fuzzy set to have an impact on the result. But if we use the min operator of (3.3), the larger fuzzy set will have no impact. Another reason is that from a theoretical point of view it is interesting to explore what types of operators are possible for fuzzy sets. We know that for nonfuzzy sets only one type of operation is possible for complement, union, or intersection. For fuzzy sets there are other possibilities. But what are they? What are the properties of these new operators? These are the questions we will try to answer in this chapter.
The new operators will be proposed on axiomatic bases. That is, we will start with a few axioms that complement, union, or intersection should satisfy in order to be qualified as these operations. Then, we will list some particular formulas that satisfy these axioms.
3.1 Fuzzy Complement
Let $c : [0,1] \to [0,1]$ be a mapping that transforms the membership function of fuzzy set A into the membership function of the complement of A, that is,
$$c[\mu_A(x)] = \mu_{\bar{A}}(x) \tag{3.4}$$
In the case of (3.1), $c[\mu_A(x)] = 1 - \mu_A(x)$. In order for the function c to qualify as a complement, it should satisfy at least the following two requirements:
Axiom c1. c(0) = 1 and c(1) = 0 (boundary condition).
Axiom c2. For all $a, b \in [0,1]$, if $a < b$, then $c(a) \ge c(b)$ (nonincreasing condition), where (and throughout this chapter) a and b denote membership values of some fuzzy sets, say, $a = \mu_A(x)$ and $b = \mu_B(x)$.
Axiom c1 shows that if an element belongs to a fuzzy set to degree zero (one), then it should belong to the complement of this fuzzy set to degree one (zero). Axiom c2 requires that an increase in membership value must result in a decrease or no change in membership value for the complement. Clearly, any violation of these two requirements will result in an operator that is unacceptable as a complement.
Definition 3.1. Any function $c : [0,1] \to [0,1]$ that satisfies Axioms c1 and c2 is called a fuzzy complement.
One class of fuzzy complements is the Sugeno class (Sugeno [1977]) defined by
$$c_\lambda(a) = \frac{1 - a}{1 + \lambda a} \tag{3.5}$$
where $\lambda \in (-1, \infty)$. For each value of the parameter $\lambda$, we obtain a particular fuzzy complement. It is a simple matter to check that the complement defined by (3.5) satisfies Axioms c1 and c2. Fig. 3.1 illustrates this class of fuzzy complements for different values of $\lambda$. Note that when $\lambda = 0$ it becomes the basic fuzzy complement (3.1).
Another type of fuzzy complement is the Yager class (Yager [1980]) defined by
$$c_w(a) = (1 - a^w)^{1/w} \tag{3.6}$$
where $w \in (0, \infty)$. For each value of w, we obtain a particular fuzzy complement. It is easy to verify that (3.6) satisfies Axioms c1 and c2. Fig. 3.2 illustrates the Yager class of fuzzy complements for different values of w. When w = 1, (3.6) becomes (3.1).
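The two classes above can be checked numerically. The following is a minimal sketch, assuming the usual forms of the Sugeno complement, (1 - a)/(1 + λa), and the Yager complement, (1 - a^w)^(1/w); the function names and the grid resolution are illustrative choices, not part of the text.

```python
def sugeno_complement(a, lam):
    """Sugeno class (3.5); lam is assumed to lie in (-1, inf)."""
    return (1.0 - a) / (1.0 + lam * a)


def yager_complement(a, w):
    """Yager class (3.6); w is assumed to lie in (0, inf)."""
    return (1.0 - a ** w) ** (1.0 / w)


GRID = [i / 100 for i in range(101)]


def satisfies_axioms(c, tol=1e-12):
    """Spot-check Axioms c1 and c2 on a finite grid (not a proof)."""
    boundary = abs(c(0.0) - 1.0) < tol and abs(c(1.0)) < tol
    nonincreasing = all(c(x) >= c(y) - tol for x, y in zip(GRID, GRID[1:]))
    return boundary and nonincreasing


print(satisfies_axioms(lambda a: sugeno_complement(a, 2.0)))
print(satisfies_axioms(lambda a: yager_complement(a, 2.0)))
```

Setting lam = 0 or w = 1 in these functions reproduces the basic complement 1 - a, as noted in the text.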
3.2 Fuzzy Union-The S-Norms
Figure 3.1. Sugeno class of fuzzy complements $c_\lambda(a)$ for different values of $\lambda$.
Figure 3.2. Yager class of fuzzy complements $c_w(a)$ for different values of w.
Let $s : [0,1] \times [0,1] \to [0,1]$ be a mapping that transforms the membership functions of fuzzy sets A and B into the membership function of the union of A and B, that is,
$$s[\mu_A(x), \mu_B(x)] = \mu_{A \cup B}(x) \tag{3.7}$$
In the case of (3.2), $s[\mu_A(x), \mu_B(x)] = \max[\mu_A(x), \mu_B(x)]$. In order for the function s to qualify as a union, it must satisfy at least the following four requirements:
Axiom s1. s(1, 1) = 1, s(0, a) = s(a, 0) = a (boundary condition).
Axiom s2. s(a, b) = s(b, a) (commutative condition).
Axiom s3. If $a \le a'$ and $b \le b'$, then $s(a, b) \le s(a', b')$ (nondecreasing condition).
Axiom s4. s(s(a, b), c) = s(a, s(b, c)) (associative condition).
Axiom s1 indicates what a union function should be in extreme cases. Axiom s2 ensures that the order in which the fuzzy sets are combined has no influence on the result. Axiom s3 shows a natural requirement for union: an increase in membership values in the two fuzzy sets should result in an increase or no change in membership value in the union of the two fuzzy sets. Axiom s4 allows us to extend the union operations to more than two fuzzy sets.
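Definition 3.2. Any function $s : [0,1] \times [0,1] \to [0,1]$ that satisfies Axioms s1-s4 is called an s-norm.
It is a simple matter to prove that the basic fuzzy union max of (3.2) is an s-norm. We now list three particular classes of s-norms:
Dombi class (Dombi [1982]):
$$s_\lambda(a, b) = \frac{1}{1 + \left[\left(\frac{1}{a} - 1\right)^{-\lambda} + \left(\frac{1}{b} - 1\right)^{-\lambda}\right]^{-1/\lambda}} \tag{3.8}$$
where the parameter $\lambda \in (0, \infty)$.
Dubois-Prade class:
$$s_\alpha(a, b) = \frac{a + b - ab - \min(a, b, 1 - \alpha)}{\max(1 - a, 1 - b, \alpha)} \tag{3.9}$$
where the parameter $\alpha \in [0, 1]$.
Yager class (Yager [1980]):
$$s_w(a, b) = \min[1, (a^w + b^w)^{1/w}] \tag{3.10}$$
where the parameter $w \in (0, \infty)$. With a particular choice of the parameters, (3.8)-(3.10) each defines a particular s-norm. It is straightforward to verify that (3.8)-(3.10) satisfy Axioms s1-s4. These s-norms were obtained by generalizing the union operation for classical sets from different perspectives.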
Many other s-norms were proposed in the literature. We now list some of them below:
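Drastic sum:
$$s_{ds}(a, b) = \begin{cases} a & \text{if } b = 0 \\ b & \text{if } a = 0 \\ 1 & \text{otherwise} \end{cases} \tag{3.11}$$
Einstein sum:
$$s_{es}(a, b) = \frac{a + b}{1 + ab} \tag{3.12}$$
Algebraic sum:
$$s_{as}(a, b) = a + b - ab \tag{3.13}$$
Maximum: (3.2)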
Why were so many s-norms proposed in the literature? The theoretical reason is that they become identical when the membership values are restricted to zero or one; that is, they are all extensions of nonfuzzy set union. The practical reason is that some s-norms may be more meaningful than others in some applications.
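The theoretical point above, that all s-norms coincide with classical union on crisp membership values, can be spot-checked in a few lines. This is a sketch only: the formulas coded below are the standard forms of the maximum, algebraic sum, Einstein sum, drastic sum, and Yager s-norm, and the function names and the choice w = 3 are illustrative assumptions.

```python
def s_max(a, b):
    return max(a, b)                                   # basic union (3.2)


def s_algebraic(a, b):
    return a + b - a * b                               # algebraic sum (3.13)


def s_einstein(a, b):
    return (a + b) / (1.0 + a * b)                     # Einstein sum (3.12)


def s_drastic(a, b):
    if b == 0:
        return a
    if a == 0:
        return b
    return 1.0                                         # drastic sum (3.11)


def s_yager(a, b, w=3.0):
    return min(1.0, (a ** w + b ** w) ** (1.0 / w))    # Yager class (3.10)


S_NORMS = [s_max, s_algebraic, s_einstein, s_drastic, s_yager]


def crisp_table(s):
    """Values of s on the four crisp input pairs (0,0), (0,1), (1,0), (1,1)."""
    return [s(a, b) for a in (0, 1) for b in (0, 1)]


for s in S_NORMS:
    print(s.__name__, crisp_table(s))
```

Every row reproduces the truth table of classical union; the operators differ only for membership values strictly between zero and one.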
Example 3.1: Consider the fuzzy sets D and F defined in Example 2.1 of Chapter 2 ((2.8) and (2.9)), with $\mu_D(x) = p(x)$ and $\mu_F(x) = 1 - p(x)$. If we use the Yager s-norm (3.10) for fuzzy union, then the fuzzy set $D \cup F$ is computed as
$$\mu_{D \cup F}(x) = s_w[\mu_D(x), \mu_F(x)] = \min[1, (p(x)^w + (1 - p(x))^w)^{1/w}] \tag{3.14}$$
Fig. 3.3 illustrates this $\mu_{D \cup F}(x)$ for w = 3. If we use the algebraic sum (3.13) for the fuzzy union, the fuzzy set $D \cup F$ becomes
$$\mu_{D \cup F}(x) = s_{as}[\mu_D(x), \mu_F(x)] = p(x) + (1 - p(x)) - p(x)(1 - p(x)) \tag{3.15}$$
which is plotted in Fig. 3.4. Comparing Figs. 3.3 and 3.4 with Fig. 2.9, we see that the Yager s-norm and algebraic sum are larger than the maximum operator. In general, we can show that maximum (3.2) is the smallest s-norm and drastic sum (3.11) is the largest s-norm.
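The ordering claimed at the end of Example 3.1 can be tabulated directly. The sketch below assumes, as in Example 2.1, that the two membership functions are p and 1 - p for some p in [0, 1]; the helper names and the sample values of p are hypothetical.

```python
def yager_s(a, b, w):
    return min(1.0, (a ** w + b ** w) ** (1.0 / w))


def algebraic_sum(a, b):
    return a + b - a * b


def drastic_sum(a, b):
    if b == 0:
        return a
    if a == 0:
        return b
    return 1.0


def union_values(p, w=3.0):
    """Membership of D union F at a point where mu_D = p and mu_F = 1 - p."""
    d, f = p, 1.0 - p
    return {
        "max": max(d, f),
        "yager": yager_s(d, f, w),
        "algebraic": algebraic_sum(d, f),
        "drastic": drastic_sum(d, f),
    }


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, union_values(p))
```

At every sampled point the maximum gives the smallest value and the drastic sum the largest, with the Yager and algebraic-sum unions in between.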
Figure 3.3. Membership function of $D \cup F$ using the Yager s-norm (3.10) with w = 3.
Theorem 3.1: For any s-norm s, that is, for any function $s : [0,1] \times [0,1] \to [0,1]$ that satisfies Axioms s1-s4, the following inequality holds:
$$\max(a, b) \le s(a, b) \le s_{ds}(a, b) \tag{3.16}$$
for any $a, b \in [0,1]$.
Proof: We first prove $\max(a, b) \le s(a, b)$. From the nondecreasing condition Axiom s3 and the boundary condition Axiom s1, we obtain
$$s(a, b) \ge s(a, 0) = a \tag{3.17}$$
Furthermore, the commutative condition Axiom s2 gives
$$s(a, b) = s(b, a) \ge s(b, 0) = b \tag{3.18}$$
Combining (3.17) and (3.18) we have $s(a, b) \ge \max(a, b)$. Next we prove $s(a, b) \le s_{ds}(a, b)$. If b = 0, then from Axiom s1 we have s(a, b) = s(a, 0) = a, thus $s(a, b) = s_{ds}(a, b)$. By the commutative condition Axiom s2 we have $s(a, b) = s_{ds}(a, b)$ if a = 0. If $a \ne 0$ and $b \ne 0$, we have
$$s_{ds}(a, b) = 1 \ge s(a, b) \tag{3.19}$$
Thus $s(a, b) \le s_{ds}(a, b)$ for all $a, b \in [0,1]$.
Finally, we prove an interesting property of the Dombi s-norm $s_\lambda(a, b)$ (3.8): $s_\lambda(a, b)$ converges to the basic fuzzy union max(a, b) as the parameter $\lambda$ goes to infinity and converges to the drastic sum $s_{ds}(a, b)$ as $\lambda$ goes to zero. Therefore, the Dombi s-norm covers the whole spectrum of s-norms.
Lemma 3.1: Let $s_\lambda(a, b)$ be defined as in (3.8) and $s_{ds}(a, b)$ be defined as in (3.11); then
$$\lim_{\lambda \to \infty} s_\lambda(a, b) = \max(a, b) \tag{3.20}$$
$$\lim_{\lambda \to 0} s_\lambda(a, b) = s_{ds}(a, b) \tag{3.21}$$
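Proof: We first prove (3.20). If $a = b \ne 0$, then from (3.8) we have $\lim_{\lambda \to \infty} s_\lambda(a, b) = \lim_{\lambda \to \infty} 1/[1 + 2^{-1/\lambda}(\frac{1}{a} - 1)] = a = \max(a, b)$. If a = b = 0, then $\lim_{\lambda \to \infty} s_\lambda(a, b) = \lim_{\lambda \to \infty} 1/(1 + 0^{-1/\lambda}) = 0 = \max(a, b)$. If $a \ne b$, then without loss of generality (due to Axiom s2) we assume a < b. Let $z = [(\frac{1}{a} - 1)^{-\lambda} + (\frac{1}{b} - 1)^{-\lambda}]^{-1/\lambda}$; then, using l'Hôpital's rule, we have
$$\lim_{\lambda \to \infty} \ln(z) = \lim_{\lambda \to \infty} \frac{-\ln\left[(\frac{1}{a} - 1)^{-\lambda} + (\frac{1}{b} - 1)^{-\lambda}\right]}{\lambda} = \lim_{\lambda \to \infty} \frac{(\frac{1}{a} - 1)^{-\lambda} \ln(\frac{1}{a} - 1) + (\frac{1}{b} - 1)^{-\lambda} \ln(\frac{1}{b} - 1)}{(\frac{1}{a} - 1)^{-\lambda} + (\frac{1}{b} - 1)^{-\lambda}}$$
$$= \lim_{\lambda \to \infty} \frac{\left[(\frac{1}{a} - 1)/(\frac{1}{b} - 1)\right]^{-\lambda} \ln(\frac{1}{a} - 1) + \ln(\frac{1}{b} - 1)}{\left[(\frac{1}{a} - 1)/(\frac{1}{b} - 1)\right]^{-\lambda} + 1} = \ln\left(\frac{1}{b} - 1\right) \tag{3.22}$$
Hence, $\lim_{\lambda \to \infty} z = \frac{1}{b} - 1$, and
$$\lim_{\lambda \to \infty} s_\lambda(a, b) = \lim_{\lambda \to \infty} \frac{1}{1 + z} = b = \max(a, b) \tag{3.23}$$
Next we prove (3.21). If a = 0 and $b \ne 0$, we have $s_\lambda(a, b) = 1/[1 + ((\frac{1}{b} - 1)^{-\lambda})^{-1/\lambda}] = b = s_{ds}(a, b)$. By commutativity, we have $s_\lambda(a, b) = a = s_{ds}(a, b)$ if b = 0 and $a \ne 0$. If $a \ne 0$ and $b \ne 0$, we have $\lim_{\lambda \to 0} s_\lambda(a, b) = \lim_{\lambda \to 0} 1/[1 + 2^{-1/\lambda}] = 1 = s_{ds}(a, b)$. Finally, if a = b = 0, we have $\lim_{\lambda \to 0} s_\lambda(a, b) = \lim_{\lambda \to 0} 1/[1 + 0^{-1/\lambda}] = 0 = s_{ds}(a, b)$.
Similarly, it can be shown that the Yager s-norm (3.10) converges to the basic fuzzy union max(a, b) as w goes to infinity and converges to the drastic sum $s_{ds}(a, b)$ as w goes to zero; the proof is left as an exercise.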
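The two limits in Lemma 3.1 can be observed numerically. The sketch below assumes the standard form of the Dombi s-norm and restricts a and b to the open interval (0, 1) so that no division by zero occurs; the sample point (0.3, 0.6) and the list of parameter values are arbitrary choices.

```python
def dombi_s(a, b, lam):
    """Dombi s-norm (3.8) for a, b strictly inside (0, 1) and lam > 0."""
    z = ((1.0 / a - 1.0) ** (-lam) + (1.0 / b - 1.0) ** (-lam)) ** (-1.0 / lam)
    return 1.0 / (1.0 + z)


a, b = 0.3, 0.6
for lam in (0.01, 0.1, 1.0, 10.0, 100.0):
    # Small lam approaches the drastic sum (here 1); large lam approaches max(a, b).
    print(lam, dombi_s(a, b, lam))
```

The printed values decrease from nearly 1 toward max(a, b) = 0.6 as the parameter grows, which is the behavior the lemma describes.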
3.3 Fuzzy Intersection-The T-Norms
Let $t : [0,1] \times [0,1] \to [0,1]$ be a function that transforms the membership functions of fuzzy sets A and B into the membership function of the intersection of A and B, that is,
$$t[\mu_A(x), \mu_B(x)] = \mu_{A \cap B}(x) \tag{3.24}$$
In the case of (3.3), $t[\mu_A(x), \mu_B(x)] = \min[\mu_A(x), \mu_B(x)]$. In order for the function t to qualify as an intersection, it must satisfy at least the following four requirements:
Axiom t1: t(0, 0) = 0; t(a, 1) = t(1, a) = a (boundary condition).
Axiom t2: t(a, b) = t(b, a) (commutativity).
Axiom t3: If $a \le a'$ and $b \le b'$, then $t(a, b) \le t(a', b')$ (nondecreasing).
Axiom t4: t[t(a, b), c] = t[a, t(b, c)] (associativity).
These axioms can be justified in the same way as for Axioms s1-s4.
Definition 3.3. Any function $t : [0,1] \times [0,1] \to [0,1]$ that satisfies Axioms t1-t4 is called a t-norm.
We can verify that the basic fuzzy intersection min of (3.3) is a t-norm. For any t-norm, there is an s-norm associated with it and vice versa. Hence, associated with the s-norms of Dombi, Dubois-Prade and Yager classes ((3.8)-(3.10)), there are t-norms of Dombi, Dubois-Prade and Yager classes, which are defined as follows:
Dombi class (Dombi [1982]):
$$t_\lambda(a, b) = \frac{1}{1 + \left[\left(\frac{1}{a} - 1\right)^{\lambda} + \left(\frac{1}{b} - 1\right)^{\lambda}\right]^{1/\lambda}} \tag{3.25}$$
where $\lambda \in (0, \infty)$.
Dubois-Prade class:
$$t_\alpha(a, b) = \frac{ab}{\max(a, b, \alpha)} \tag{3.26}$$
where $\alpha \in [0, 1]$.
Yager class (Yager [1980]):
$$t_w(a, b) = 1 - \min[1, ((1 - a)^w + (1 - b)^w)^{1/w}] \tag{3.27}$$
where $w \in (0, \infty)$. With a particular choice of the parameters, (3.25)-(3.27) each defines a particular t-norm. We can verify that (3.25)-(3.27) satisfy Axioms t1-t4. Associated with the particular s-norms (3.11)-(3.13) and (3.2), there are t-norms that are listed below:
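Drastic product:
$$t_{dp}(a, b) = \begin{cases} a & \text{if } b = 1 \\ b & \text{if } a = 1 \\ 0 & \text{otherwise} \end{cases} \tag{3.28}$$
Einstein product:
$$t_{ep}(a, b) = \frac{ab}{2 - (a + b - ab)} \tag{3.29}$$
Algebraic product:
$$t_{ap}(a, b) = ab \tag{3.30}$$
Minimum: (3.3)
Example 3.2: Consider the fuzzy sets D and F defined in Example 2.1 of Chapter 2, with $\mu_D(x) = p(x)$ and $\mu_F(x) = 1 - p(x)$. If we use the Yager t-norm (3.27) for fuzzy intersection, then $D \cap F$ is obtained as
$$\mu_{D \cap F}(x) = t_w[\mu_D(x), \mu_F(x)] = 1 - \min[1, ((1 - p(x))^w + p(x)^w)^{1/w}] \tag{3.31}$$
Fig. 3.5 shows this $\mu_{D \cap F}(x)$ for w = 3. If we use the algebraic product (3.30) for fuzzy intersection, the fuzzy set $D \cap F$ becomes
$$\mu_{D \cap F}(x) = t_{ap}[\mu_D(x), \mu_F(x)] = p(x)(1 - p(x)) \tag{3.32}$$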
which is plotted in Fig. 3.6. Comparing Figs. 3.5 and 3.6 with Fig. 2.10, we see that the Yager t-norm and algebraic product are smaller than the minimum operator. In general, we can show that minimum is the largest t-norm and drastic product is the smallest t-norm.
Figure 3.5. Membership function of $D \cap F$ using the Yager t-norm (3.27) with w = 3.
Theorem 3.2: For any t-norm t, that is, for any function $t : [0,1] \times [0,1] \to [0,1]$ that satisfies Axioms t1-t4, the following inequality holds:
$$t_{dp}(a, b) \le t(a, b) \le \min(a, b) \tag{3.33}$$
for any $a, b \in [0,1]$.
The proof of this theorem is very similar to that of Theorem 3.1 and is left as an exercise. Similar to Lemma 3.1, we can show that the Dombi t-norm $t_\lambda(a, b)$ of (3.25) converges to the basic fuzzy intersection min(a, b) as $\lambda$ goes to infinity and converges to the drastic product $t_{dp}(a, b)$ as $\lambda$ goes to zero. Hence, the Dombi t-norm covers the whole range of t-norms.
Lemma 3.2: Let $t_\lambda(a, b)$ be defined as in (3.25), then
$$\lim_{\lambda \to \infty} t_\lambda(a, b) = \min(a, b) \tag{3.34}$$
and
$$\lim_{\lambda \to 0} t_\lambda(a, b) = t_{dp}(a, b) \tag{3.35}$$
This lemma can be proven in a similar way as for Lemma 3.1.
Comparing (3.8)-(3.13) with (3.25)-(3.30), respectively, we see that for each s-norm there is a t-norm associated with it. But what does "associated" mean? It means that there exists a fuzzy complement such that the three together satisfy DeMorgan's Law. Specifically, we say the s-norm s(a, b), t-norm t(a, b) and fuzzy complement c(a) form an associated class if
$$c[s(a, b)] = t[c(a), c(b)] \tag{3.36}$$
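Example 3.3: The Yager s-norm $s_w(a, b)$ of (3.10), Yager t-norm $t_w(a, b)$ of (3.27), and the basic fuzzy complement (3.1) form an associated class. To show this, we have from (3.1) and (3.10) that
$$c[s_w(a, b)] = 1 - \min[1, (a^w + b^w)^{1/w}] \tag{3.37}$$
where c(a) denotes the basic fuzzy complement (3.1). On the other hand, we have from (3.1) and (3.27) that
$$t_w[c(a), c(b)] = 1 - \min[1, (a^w + b^w)^{1/w}] \tag{3.38}$$
Hence, they satisfy DeMorgan's Law (3.36).
Example 3.4: The algebraic sum (3.13), algebraic product (3.30), and the basic fuzzy complement (3.1) form an associated class. To show this, we have from (3.1) and (3.13) that
$$c[s_{as}(a, b)] = 1 - a - b + ab \tag{3.39}$$
and from (3.1) and (3.30) that
$$t_{ap}[c(a), c(b)] = (1 - a)(1 - b) = 1 - a - b + ab \tag{3.40}$$
Hence, they satisfy DeMorgan's Law (3.36).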
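Examples 3.3 and 3.4 can be confirmed on a grid. The following sketch tests the DeMorgan condition c[s(a, b)] = t[c(a), c(b)] numerically; the helper `is_associated`, the grid size, and the tolerance are illustrative assumptions, and a grid check is evidence rather than a proof.

```python
def c_basic(a):
    return 1.0 - a


def s_yager(a, b, w):
    return min(1.0, (a ** w + b ** w) ** (1.0 / w))


def t_yager(a, b, w):
    return 1.0 - min(1.0, ((1.0 - a) ** w + (1.0 - b) ** w) ** (1.0 / w))


def s_alg(a, b):
    return a + b - a * b


def t_alg(a, b):
    return a * b


def is_associated(s, t, c, n=20, tol=1e-9):
    """Check c(s(a, b)) == t(c(a), c(b)) at every point of an (n+1) x (n+1) grid."""
    grid = [i / n for i in range(n + 1)]
    return all(abs(c(s(a, b)) - t(c(a), c(b))) <= tol for a in grid for b in grid)


print(is_associated(lambda a, b: s_yager(a, b, 2.0),
                    lambda a, b: t_yager(a, b, 2.0), c_basic))
print(is_associated(s_alg, t_alg, c_basic))
print(is_associated(s_alg, min, c_basic))   # mismatched pair, expected to fail
```

The last call pairs the algebraic sum with the minimum t-norm, which do not satisfy the condition, so it serves as a negative control.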
3.4 Averaging Operators
From Theorem 3.1 we see that for any membership values $a = \mu_A(x)$ and $b = \mu_B(x)$ of arbitrary fuzzy sets A and B, the membership value of their union $A \cup B$ (defined by any s-norm) lies in the interval $[\max(a, b), s_{ds}(a, b)]$. Similarly, from Theorem 3.2 we have that the membership value of the intersection $A \cap B$ (defined by any t-norm) lies in the interval $[t_{dp}(a, b), \min(a, b)]$. See Fig. 3.7. Therefore, the union and intersection operators cannot cover the interval between min(a, b) and max(a, b). The operators that cover the interval [min(a, b), max(a, b)] are called averaging operators. Similar to the s-norms and t-norms, an averaging operator, denoted by v, is a function from $[0,1] \times [0,1]$ to [0, 1]. Many averaging operators were proposed in the literature. Here we list four of them:
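[Figure 3.7: diagram of the ranges covered by the intersection operators (from $t_{dp}(a, b)$ to min(a, b)), the averaging operators (from min(a, b) to max(a, b)), and the union operators (from max(a, b) to $s_{ds}(a, b)$); the Dombi and Yager t-norms, "fuzzy and," "fuzzy or," max-min averages, generalized means, and the Yager s-norm are marked on it.]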
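Max-min averages:
$$v_\lambda(a, b) = \lambda \max(a, b) + (1 - \lambda) \min(a, b) \tag{3.41}$$
where $\lambda \in [0, 1]$.
Generalized means:
$$v_\alpha(a, b) = \left(\frac{a^\alpha + b^\alpha}{2}\right)^{1/\alpha} \tag{3.42}$$
where $\alpha \in R$ ($\alpha \ne 0$).
"Fuzzy and":
$$v_p(a, b) = p \min(a, b) + \frac{(1 - p)(a + b)}{2} \tag{3.43}$$
where $p \in [0, 1]$.
"Fuzzy or":
$$v_\gamma(a, b) = \gamma \max(a, b) + \frac{(1 - \gamma)(a + b)}{2} \tag{3.44}$$
where $\gamma \in [0, 1]$.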
Clearly, the max-min averages cover the whole interval [min(a, b), max(a, b)] as the parameter $\lambda$ changes from 0 to 1. The "fuzzy and" covers the range from min(a, b) to (a+b)/2, and the "fuzzy or" covers the range from (a+b)/2 to max(a, b). It also can be shown that the generalized means cover the whole range from min(a, b) to max(a, b) as $\alpha$ changes from $-\infty$ to $\infty$.
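The coverage claims above can be illustrated with a short sketch. It assumes the standard forms of the four operators (a convex combination of max and min, the power mean, and convex combinations of min or max with the arithmetic mean); the function and parameter names are hypothetical, and the generalized mean is evaluated only for strictly positive a and b.

```python
def max_min_average(a, b, lam):
    return lam * max(a, b) + (1.0 - lam) * min(a, b)           # lam in [0, 1]


def generalized_mean(a, b, alpha):
    return ((a ** alpha + b ** alpha) / 2.0) ** (1.0 / alpha)  # alpha != 0, a, b > 0


def fuzzy_and(a, b, p):
    return p * min(a, b) + (1.0 - p) * (a + b) / 2.0           # p in [0, 1]


def fuzzy_or(a, b, gamma):
    return gamma * max(a, b) + (1.0 - gamma) * (a + b) / 2.0   # gamma in [0, 1]


a, b = 0.2, 0.8
print([max_min_average(a, b, lam) for lam in (0.0, 0.5, 1.0)])
print([fuzzy_and(a, b, p) for p in (0.0, 1.0)], [fuzzy_or(a, b, g) for g in (0.0, 1.0)])
print([generalized_mean(a, b, alpha) for alpha in (-50.0, -1.0, 1.0, 50.0)])
```

For large negative or positive alpha the generalized mean approaches min(a, b) or max(a, b), while the other three operators reach their endpoints exactly at parameter values 0 and 1.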
3.5 Summary and Further Readings
In this chapter we have demonstrated the following:
The axiomatic definitions of fuzzy complements, s-norms (fuzzy unions), and t-norms (fuzzy intersections).
Some specific classes of fuzzy complements, s-norms, t-norms, and averaging operators, and their properties.
How to prove various properties of some particular fuzzy complements, s-norms, t-norms, and averaging operators.
The materials in this chapter were extracted from Klir and Yuan [1995], where more details on the operators can be found. Dubois and Prade [1985] provided a very good review of fuzzy union, fuzzy intersection, and averaging operators.
3.6 Exercises
Exercise 3.1. The equilibrium of a fuzzy complement c is defined as $a \in [0,1]$ such that c(a) = a. (a) Determine the equilibrium of the Yager fuzzy complement (3.6). (b) Prove that every fuzzy complement has at most one equilibrium. (c) Prove that a continuous fuzzy complement has a unique equilibrium.
Exercise 3.2. Show that the Yager s-norm (3.10) converges to the basic fuzzy union (3.2) as w goes to infinity and converges to the drastic sum (3.11) as w goes to zero.
Exercise 3.3. Let the fuzzy sets F and G be defined as in Exercise 2.3. (a) Determine the membership functions for F U G and F n G using the Yager s-norm (3.10) and t-norm (3.27) with w = 2. (b) Using (3.1) as fuzzy complement, algebraic sum (3.13) as fuzzy union, and algebraic product (3.30) as fuzzy intersection, compute the membership functions for F n G, E n G, and
m.
Exercise 3.5. A fuzzy complement c is said to be involutive if c[c(a)] = a for all $a \in [0,1]$. (a) Show that the Sugeno fuzzy complement (3.5) and the Yager fuzzy complement (3.6) are involutive. (b) Let c be an involutive fuzzy complement and t be any t-norm. Show that the operator $u : [0,1] \times [0,1] \to [0,1]$ defined by
$$u(a, b) = c[t(c(a), c(b))]$$
is an s-norm. (c) Prove that the c, t, and u in (b) form an associated class.
Exercise 3.6. Determine the s-norm s(a, b) such that s(a, b), the minimum t-norm (3.3), and the Yager complement (3.6) with w = 2 form an associated class.
Exercise 3.7. Prove that the following triples form an associated class with respect to any fuzzy complement c: (a) (min, max, c), and (b) ($t_{dp}$, $s_{ds}$, c).
Exercise 3.8. Prove that the generalized means (3.42) become min and max operators as $\alpha \to -\infty$ and $\alpha \to \infty$, respectively.
Chapter 4
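Note that the order in which U and V appear is important; that is, if $U \ne V$, then $U \times V \ne V \times U$. In general, the Cartesian product of arbitrary n nonfuzzy sets $U_1, U_2, ..., U_n$, denoted by $U_1 \times U_2 \times \cdots \times U_n$, is the nonfuzzy set of all n-tuples $(u_1, u_2, ..., u_n)$ such that $u_i \in U_i$ for $i \in \{1, 2, ..., n\}$; that is,
$$U_1 \times U_2 \times \cdots \times U_n = \{(u_1, u_2, ..., u_n) \mid u_1 \in U_1, u_2 \in U_2, ..., u_n \in U_n\} \tag{4.2}$$
A (nonfuzzy) relation among (nonfuzzy) sets $U_1, U_2, ..., U_n$ is a subset of the Cartesian product $U_1 \times U_2 \times \cdots \times U_n$; that is, if we use $Q(U_1, U_2, ..., U_n)$ to denote a relation among $U_1, U_2, ..., U_n$, then
$$Q(U_1, U_2, ..., U_n) \subset U_1 \times U_2 \times \cdots \times U_n \tag{4.3}$$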
As a special case, a binary relation between (nonfuzzy) sets U and V is a subset of the Cartesian product U x V.
Example 4.1. Let U = {1, 2, 3} and V = {2, 3, 4}. Then the Cartesian product of U and V is the set $U \times V$ = {(1,2), (1,3), (1,4), (2,2), (2,3), (2,4), (3,2), (3,3), (3,4)}. A relation between U and V is a subset of $U \times V$. For example, let Q(U, V) be a relation named "the first element is no smaller than the second element," then
$$Q(U, V) = \{(2,2), (3,2), (3,3)\} \tag{4.4}$$
Because a relation is itself a set, all of the basic set operations can be applied to it without modification. Also, we can use the following membership function to represent a relation:
$$\mu_Q(u_1, u_2, ..., u_n) = \begin{cases} 1 & \text{if } (u_1, u_2, ..., u_n) \in Q(U_1, U_2, ..., U_n) \\ 0 & \text{otherwise} \end{cases} \tag{4.5}$$
For a binary relation Q(U, V) defined over $U \times V$ which contains finite elements, we often collect the values of the membership function $\mu_Q$ into a relational matrix; see the following example.
Example 4.1 (Cont'd). The relation Q(U, V) of (4.4) can be represented by the following relational matrix:
$$\begin{array}{c|ccc} U \backslash V & 2 & 3 & 4 \\ \hline 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 1 & 1 & 0 \end{array} \tag{4.6}$$
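The relational matrix of Example 4.1 can be generated from the relation's defining condition. This is a small sketch; the variable names are illustrative.

```python
U = [1, 2, 3]
V = [2, 3, 4]

# Crisp relation "the first element is no smaller than the second element".
Q = {(u, v) for u in U for v in V if u >= v}


def membership(u, v):
    """Membership function in the sense of (4.5): 1 if the pair is in Q, else 0."""
    return 1 if (u, v) in Q else 0


# Rows follow the order of U, columns follow the order of V.
relational_matrix = [[membership(u, v) for v in V] for u in U]

for row in relational_matrix:
    print(row)
```

Each row of the printed matrix corresponds to one element of U and each column to one element of V.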
A classical relation represents a crisp relationship among sets, that is, either there is such a relationship or not. For certain relationships, however, it is difficult to give a zero-one assessment; see the following example.
Example 4.2. Let U = {San Francisco, Hong Kong, Tokyo} and V = {Boston, Hong Kong}. We want to define the relational concept "very far" between these two sets of cities. Clearly, classical relations are not useful because the concept "very far" is not well-defined in the framework of classical sets and relations. However, "very far" does mean something and we should find a numerical system to characterize it. If we use a number in the interval [0, 1] to represent the degree of "very far," then the concept "very far" may be represented by the following (fuzzy) relational matrix:
$$\begin{array}{c|cc} U \backslash V & \text{Boston} & \text{HK} \\ \hline \text{SF} & 0.3 & 0.9 \\ \text{HK} & 1 & 0 \\ \text{Tokyo} & 0.95 & 0.1 \end{array} \tag{4.7}$$
Example 4.2 shows that we need to generalize the concept of classical relation in order to formulate more relationships in the real world. The concept of fuzzy relation was thus introduced.
Definition 4.1. A fuzzy relation is a fuzzy set defined in the Cartesian product of crisp sets $U_1, U_2, ..., U_n$. With the representation scheme (2.5), a fuzzy relation Q in $U_1 \times U_2 \times \cdots \times U_n$ is defined as the fuzzy set
$$Q = \{((u_1, u_2, ..., u_n), \mu_Q(u_1, u_2, ..., u_n)) \mid (u_1, u_2, ..., u_n) \in U_1 \times U_2 \times \cdots \times U_n\} \tag{4.8}$$
where $\mu_Q : U_1 \times U_2 \times \cdots \times U_n \to [0, 1]$. As a special case, a binary fuzzy relation is a fuzzy set defined in the Cartesian product of two crisp sets. A binary fuzzy relation on a finite Cartesian product is usually represented by a fuzzy relational matrix, that is, a matrix whose elements are the membership values of the corresponding pairs belonging to the fuzzy relation. For example, (4.7) is a fuzzy relational matrix representing the fuzzy relation named "very far" between the two groups of cities.
Example 4.3. Let U and V be the set of real numbers, that is, U = V = R. A fuzzy relation "x is approximately equal to y," denoted by AE, may be defined by the membership function
$$\mu_{AE}(x, y) = e^{-(x - y)^2} \tag{4.9}$$
Similarly, a fuzzy relation "x is much larger than y," denoted by ML, may be defined by the membership function
$$\mu_{ML}(x, y) = \frac{1}{1 + e^{-(x - y)}} \tag{4.10}$$
Of course, other membership functions may be used to represent these fuzzy relations.
4.1.2 Projections and Cylindric Extensions
Because a crisp relation is defined in the product space of two or more sets, the concepts of projection and cylindric extension were proposed. For example, consider the set $A = \{(x, y) \in R^2 \mid (x - 1)^2 + (y - 1)^2 \le 1\}$, which is a relation in $U \times V = R^2$. Then the projection of A on U is $A_1 = [0, 2] \subset U$, and the projection of A on V is $A_2 = [0, 2] \subset V$; see Fig. 4.1. The cylindric extension of $A_1$ to $U \times V = R^2$ is $A_{1E} = [0, 2] \times (-\infty, \infty) \subset R^2$. These concepts can be extended to fuzzy relations.
Definition 4.2. Let Q be a fuzzy relation in $U_1 \times \cdots \times U_n$ and $\{i_1, ..., i_k\}$ be a subsequence of {1, 2, ..., n}, then the projection of Q on $U_{i_1} \times \cdots \times U_{i_k}$ is a fuzzy relation $Q_P$ in $U_{i_1} \times \cdots \times U_{i_k}$ defined by the membership function
$$\mu_{Q_P}(u_{i_1}, ..., u_{i_k}) = \max_{u_{j_1} \in U_{j_1}, ..., u_{j_{(n-k)}} \in U_{j_{(n-k)}}} \mu_Q(u_1, ..., u_n) \tag{4.11}$$
where $\{u_{j_1}, ..., u_{j_{(n-k)}}\}$ is the complement of $\{u_{i_1}, ..., u_{i_k}\}$ with respect to $\{u_1, ..., u_n\}$. As a special case, if Q is a binary fuzzy relation in $U \times V$, then the projection of Q on U, denoted by $Q_1$, is a fuzzy set in U defined by
$$\mu_{Q_1}(x) = \max_{y \in V} \mu_Q(x, y) \tag{4.12}$$
Note that (4.12) is still valid if Q is a crisp relation. For example, if Q is the crisp relation A in Fig. 4.1, then its projection $Q_1$ defined by (4.12) is equal to the $A_1$ in Fig. 4.1. Hence, the projection of a fuzzy relation defined by (4.11) is a natural extension of the projection of a crisp relation.
Example 4.4. According to (4.12), the projections of the fuzzy relation (4.7) on U and V are the fuzzy sets
$$Q_1 = 0.9/SF + 1/HK + 0.95/Tokyo \tag{4.13}$$
and
$$Q_2 = 1/Boston + 0.9/HK \tag{4.14}$$
respectively. Similarly, the projections of AE defined by (4.9) on U and V are the fuzzy sets
$$AE_1 = \int_U \max_{y \in V} e^{-(x - y)^2} / x = \int_U 1/x \tag{4.15}$$
and
$$AE_2 = \int_V \max_{x \in U} e^{-(x - y)^2} / y = \int_V 1/y \tag{4.16}$$
respectively. Note that $AE_1$ equals the crisp set U and $AE_2$ equals the crisp set V.
The projection constrains a fuzzy relation to a subspace; conversely, the cylindric extension extends a fuzzy relation (or fuzzy set) from a subspace to the whole space. Formally, we have the following definition.
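Definition 4.3. Let $Q_P$ be a fuzzy relation in $U_{i_1} \times \cdots \times U_{i_k}$ and $\{i_1, ..., i_k\}$ be a subsequence of {1, 2, ..., n}, then the cylindric extension of $Q_P$ to $U_1 \times \cdots \times U_n$ is a fuzzy relation $Q_{PE}$ in $U_1 \times \cdots \times U_n$ defined by
$$\mu_{Q_{PE}}(u_1, ..., u_n) = \mu_{Q_P}(u_{i_1}, ..., u_{i_k}) \tag{4.17}$$
As a special case, if $Q_1$ is a fuzzy set in U, then the cylindric extension of $Q_1$ to $U \times V$ is a fuzzy relation $Q_{1E}$ in $U \times V$ defined by
$$\mu_{Q_{1E}}(x, y) = \mu_{Q_1}(x) \tag{4.18}$$
The definition (4.17) is also valid for crisp relations; check Fig. 4.1 for an example.
Example 4.5. Consider the projections $Q_1$ and $Q_2$ in Example 4.4 ((4.13) and (4.14)). According to (4.18), their cylindric extensions to $U \times V$ are
$$Q_{1E} = 0.9/(SF, Boston) + 0.9/(SF, HK) + 1/(HK, Boston) + 1/(HK, HK) + 0.95/(Tokyo, Boston) + 0.95/(Tokyo, HK) \tag{4.19}$$
and
$$Q_{2E} = 1/(SF, Boston) + 1/(HK, Boston) + 1/(Tokyo, Boston) + 0.9/(SF, HK) + 0.9/(HK, HK) + 0.9/(Tokyo, HK) \tag{4.20}$$
Similarly, the cylindric extensions of $AE_1$ and $AE_2$ in (4.15) and (4.16) to $U \times V$ are
$$AE_{1E} = \int_{U \times V} 1/(x, y) = U \times V \tag{4.21}$$
and
$$AE_{2E} = \int_{U \times V} 1/(x, y) = U \times V \tag{4.22}$$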
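Projection and cylindric extension for a finite binary fuzzy relation reduce to a row or column maximum followed by a broadcast. The sketch below uses the "very far" relation with the membership values listed in (4.32); the dictionary representation and the function names are illustrative choices.

```python
U = ["SF", "HK", "Tokyo"]
V = ["Boston", "HK"]

# Fuzzy relation "very far", stored as {(u, v): membership value}.
Q = {("SF", "Boston"): 0.3, ("SF", "HK"): 0.9,
     ("HK", "Boston"): 1.0, ("HK", "HK"): 0.0,
     ("Tokyo", "Boston"): 0.95, ("Tokyo", "HK"): 0.1}


def project_on_U(rel):
    """Projection in the sense of (4.12): maximize over the V coordinate."""
    return {u: max(rel[(u, v)] for v in V) for u in U}


def project_on_V(rel):
    return {v: max(rel[(u, v)] for u in U) for v in V}


def extend_from_U(fuzzy_set):
    """Cylindric extension in the sense of (4.18): copy mu(u) to every pair (u, v)."""
    return {(u, v): fuzzy_set[u] for u in U for v in V}


def extend_from_V(fuzzy_set):
    return {(u, v): fuzzy_set[v] for u in U for v in V}


Q1, Q2 = project_on_U(Q), project_on_V(Q)
Q1E, Q2E = extend_from_U(Q1), extend_from_V(Q2)
print(Q1, Q2)
```

Comparing Q with Q1E or Q2E entry by entry shows that projecting and then extending never decreases a membership value.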
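From Examples 4.4 and 4.5 we see that when we take the projection of a fuzzy relation and then cylindrically extend it, we obtain a fuzzy relation that is larger than the original one. To characterize this property formally, we first introduce the concept of Cartesian product of fuzzy sets. Let $A_1, ..., A_n$ be fuzzy sets in $U_1, ..., U_n$, respectively. The Cartesian product of $A_1, ..., A_n$, denoted by $A_1 \times \cdots \times A_n$, is a fuzzy relation in $U_1 \times \cdots \times U_n$ whose membership function is defined as
$$\mu_{A_1 \times \cdots \times A_n}(u_1, ..., u_n) = \mu_{A_1}(u_1) * \cdots * \mu_{A_n}(u_n) \tag{4.23}$$
where * represents any t-norm operator.
Lemma 4.1. If Q is a fuzzy relation in $U_1 \times \cdots \times U_n$ and $Q_1, ..., Q_n$ are its projections on $U_1, ..., U_n$, respectively, then (see Fig. 4.2 for illustration)
$$Q \subset Q_1 \times \cdots \times Q_n \tag{4.24}$$
where we use min for the t-norm in the definition (4.23) of $Q_1 \times \cdots \times Q_n$.
Figure 4.2. Relation between the Cartesian product and intersection of cylindric sets.
Proof: Substituting (4.11) into (4.17), we have
$$\mu_{Q_{iE}}(u_1, ..., u_n) = \mu_{Q_i}(u_i) = \max_{u_j \in U_j,\, j \ne i} \mu_Q(u_1, ..., u_n) \ge \mu_Q(u_1, ..., u_n) \tag{4.25}$$
Hence,
$$Q \subset Q_{iE} \tag{4.26}$$
for all i = 1, 2, ..., n, where $Q_{iE}$ is the cylindric extension of $Q_i$ to $U_1 \times \cdots \times U_n$. Therefore, if we use min for intersection, we have
$$Q \subset Q_{1E} \cap \cdots \cap Q_{nE} = Q_1 \times \cdots \times Q_n \tag{4.27}$$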
4.2 Compositions of Fuzzy Relations
Let P(U, V) and Q(V, W) be two crisp binary relations that share a common set V. The composition of P and Q, denoted by P o Q, is defined as a relation in U x W such that (x, z) E P o Q if and only if there exists at least one y E V such that (x, y) E P and (y, z) E Q. Using the membership function representation of relations (see (4.5)), we have an equivalent definition for composition that is given in the following lemma.
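Lemma 4.2. $P \circ Q$ is the composition of P(U, V) and Q(V, W) if and only if
$$\mu_{P \circ Q}(x, z) = \max_{y \in V} t[\mu_P(x, y), \mu_Q(y, z)] \tag{4.28}$$
for any $(x, z) \in U \times W$, where t is any t-norm.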
Proof: We first show that if $P \circ Q$ is the composition according to the definition, then (4.28) is true. If $P \circ Q$ is the composition, then $(x, z) \in P \circ Q$ implies that there exists $y \in V$ such that $\mu_P(x, y) = 1$ and $\mu_Q(y, z) = 1$. Hence, $\mu_{P \circ Q}(x, z) = 1 = \max_{y \in V} t[\mu_P(x, y), \mu_Q(y, z)]$, that is, (4.28) is true. If $(x, z) \notin P \circ Q$, then for any $y \in V$ either $\mu_P(x, y) = 0$ or $\mu_Q(y, z) = 0$. Hence, $\mu_{P \circ Q}(x, z) = 0 = \max_{y \in V} t[\mu_P(x, y), \mu_Q(y, z)]$. Therefore, (4.28) is true for any $(x, z) \in U \times W$.
Conversely, if (4.28) is true, then $(x, z) \in P \circ Q$ implies $\max_{y \in V} t[\mu_P(x, y), \mu_Q(y, z)] = 1$, which means that there exists at least one $y \in V$ such that $\mu_P(x, y) = \mu_Q(y, z) = 1$ (see Axiom t1 in Section 3.3); this is the definition. For $(x, z) \notin P \circ Q$, we have from (4.28) that $\max_{y \in V} t[\mu_P(x, y), \mu_Q(y, z)] = 0$, which means that there is no $y \in V$ such that $\mu_P(x, y) = \mu_Q(y, z) = 1$. Therefore, (4.28) implies that $P \circ Q$ is the composition according to the definition.
Now we generalize the concept of composition to fuzzy relations. From Lemma 4.2 we see that if we use (4.28) to define composition of fuzzy relations (suppose P and Q are fuzzy relations), then the definition is valid for the special case where P and Q are crisp relations. Therefore, we give the following definition.
Definition 4.4. The composition of fuzzy relations P(U, V) and Q(V, W), denoted by P o Q, is defined as a fuzzy relation in U x W whose membership function is given by (4.28).
Because the t-norm in (4.28) can take a variety of formulas, for each t-norm we obtain a particular composition. The two most commonly used compositions in the literature are the so-called max-min composition and max-product composition, which are defined as follows:
The max-min composition of fuzzy relations P(U, V) and Q(V, W) is a fuzzy relation $P \circ Q$ in $U \times W$ defined by the membership function
$$\mu_{P \circ Q}(x, z) = \max_{y \in V} \min[\mu_P(x, y), \mu_Q(y, z)] \tag{4.29}$$
where $(x, z) \in U \times W$.
The max-product composition of fuzzy relations P(U, V) and Q(V, W) is a fuzzy relation $P \circ Q$ in $U \times W$ defined by the membership function
$$\mu_{P \circ Q}(x, z) = \max_{y \in V} [\mu_P(x, y) \mu_Q(y, z)] \tag{4.30}$$
where $(x, z) \in U \times W$.
We see that the max-min and max-product compositions use minimum and algebraic product for the t-norm in the definition (4.28), respectively. We now consider two examples for how to compute the compositions.
Example 4.6. Let U and V be defined as in Example 4.2 and W = {New York City, Beijing}. Let P(U, V) denote the fuzzy relation "very far" defined by (4.7). Define the fuzzy relation "very near" in $V \times W$, denoted by Q(V, W), by the relational matrix
$$\begin{array}{c|cc} V \backslash W & \text{NYC} & \text{Beijing} \\ \hline \text{Boston} & 0.95 & 0.1 \\ \text{HK} & 0.1 & 0.9 \end{array} \tag{4.31}$$
Using the notation (2.7), we can write P and Q as
$$P = 0.3/(SF, Boston) + 0.9/(SF, HK) + 1/(HK, Boston) + 0/(HK, HK) + 0.95/(Tokyo, Boston) + 0.1/(Tokyo, HK) \tag{4.32}$$
$$Q = 0.95/(Boston, NYC) + 0.1/(Boston, Beijing) + 0.1/(HK, NYC) + 0.9/(HK, Beijing) \tag{4.33}$$
We now compute the max-min and max-product compositions of P and Q. First, we note that $U \times W$ contains six elements: (SF, NYC), (SF, Beijing), (HK, NYC), (HK, Beijing), (Tokyo, NYC) and (Tokyo, Beijing). Thus, our task is to determine the membership values of $\mu_{P \circ Q}$ at these six elements. Using the max-min composition (4.29), we have
$$\mu_{P \circ Q}(SF, NYC) = \max\{\min[\mu_P(SF, Boston), \mu_Q(Boston, NYC)], \min[\mu_P(SF, HK), \mu_Q(HK, NYC)]\} = \max[\min(0.3, 0.95), \min(0.9, 0.1)] = 0.3 \tag{4.34}$$
Similarly, we have
$$\mu_{P \circ Q}(SF, Beijing) = \max\{\min[\mu_P(SF, Boston), \mu_Q(Boston, Beijing)], \min[\mu_P(SF, HK), \mu_Q(HK, Beijing)]\} = \max[\min(0.3, 0.1), \min(0.9, 0.9)] = 0.9 \tag{4.35}$$
The final $P \circ Q$ is
$$P \circ Q = 0.3/(SF, NYC) + 0.9/(SF, Beijing) + 0.95/(HK, NYC) + 0.1/(HK, Beijing) + 0.95/(Tokyo, NYC) + 0.1/(Tokyo, Beijing) \tag{4.36}$$
If we use the max-product composition (4.30), then following the same procedure as above (replacing min by product), we obtain
$$P \circ Q = 0.285/(SF, NYC) + 0.81/(SF, Beijing) + 0.95/(HK, NYC) + 0.1/(HK, Beijing) + 0.9025/(Tokyo, NYC) + 0.095/(Tokyo, Beijing) \tag{4.37}$$
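The computations of Example 4.6 can be reproduced with nested lists standing in for the relational matrices. The sketch below implements (4.28) for finite relations with the t-norm passed in as a function; the name `compose` and the list-of-rows layout are illustrative assumptions.

```python
# Rows: SF, HK, Tokyo; columns: Boston, HK (relation "very far").
P = [[0.3, 0.9],
     [1.0, 0.0],
     [0.95, 0.1]]

# Rows: Boston, HK; columns: NYC, Beijing (relation "very near").
Q = [[0.95, 0.1],
     [0.1, 0.9]]


def compose(P, Q, t):
    """Composition (4.28) of finite relations: max over the shared index of t(P, Q)."""
    inner = len(Q)
    return [[max(t(P[i][k], Q[k][j]) for k in range(inner))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


max_min = compose(P, Q, min)
max_product = compose(P, Q, lambda a, b: a * b)

for row in max_min:
    print(row)
for row in max_product:
    print([round(x, 4) for x in row])
```

Passing a different t-norm as the third argument gives the corresponding composition without changing the rest of the code.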
From (4.36), (4.37) and the relational matrices (4.7) and (4.31), we see that a simpler way to compute $P \circ Q$ is to use relational matrices and matrix product. Specifically, let P and Q be the relational matrices for the fuzzy relations P(U, V) and Q(V, W), respectively. Then, the relational matrix for the fuzzy composition $P \circ Q$ can be computed according to the following method:
For max-min composition, write out each element in the matrix product PQ, but treat each multiplication as a min operation and each addition as a max operation.
For max-product composition, write out each element in the matrix product PQ, but treat each addition as a max operation.
We now check that (4.36) and (4.37) can be obtained by this method. Specifically, we have
$$\begin{pmatrix} 0.3 & 0.9 \\ 1 & 0 \\ 0.95 & 0.1 \end{pmatrix} \circ \begin{pmatrix} 0.95 & 0.1 \\ 0.1 & 0.9 \end{pmatrix} = \begin{pmatrix} 0.3 & 0.9 \\ 0.95 & 0.1 \\ 0.95 & 0.1 \end{pmatrix} \tag{4.38}$$
for max-min composition, and
$$\begin{pmatrix} 0.3 & 0.9 \\ 1 & 0 \\ 0.95 & 0.1 \end{pmatrix} \circ \begin{pmatrix} 0.95 & 0.1 \\ 0.1 & 0.9 \end{pmatrix} = \begin{pmatrix} 0.285 & 0.81 \\ 0.95 & 0.1 \\ 0.9025 & 0.095 \end{pmatrix} \tag{4.39}$$
for max-product composition.
In Example 4.6, the universal sets U, V and W contain a finite number of elements. In most engineering applications, however, U, V and W are real-valued spaces that contain an infinite number of elements. We now consider an example for computing the composition of fuzzy relations in continuous domains.
Example 4.7: Let U = V = W = R. Consider the fuzzy relations AE (approximately equal) and ML (much larger) defined by (4.9) and (4.10) in Example 4.3. We now want to determine the composition $AE \circ ML$. Using the max-product composition, we have
$$\mu_{AE \circ ML}(x, z) = \max_{y \in R} \left[\frac{e^{-(x - y)^2}}{1 + e^{-(y - z)}}\right] \tag{4.40}$$
To compute the right hand side of (4.40), we must determine the $y \in R$ at which $e^{-(x - y)^2} / (1 + e^{-(y - z)})$ achieves its maximum value, where x and z are considered to be fixed values in R. The necessary condition for such y is
$$\frac{\partial}{\partial y} \left[\frac{e^{-(x - y)^2}}{1 + e^{-(y - z)}}\right] = 0 \tag{4.41}$$
Because it is impossible to obtain a closed form solution for (4.41), we cannot further simplify (4.40). In practice, for given values of x and z we can first determine the numerical solution of (4.41) and then substitute it into (4.40). Comparing this example with Example 4.6, we see that fuzzy compositions in continuous domains are much more difficult to compute than those in discrete domains.
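One simple way to carry out the numerical step described above is a grid search over y. The sketch below assumes the membership functions of Example 4.3 in the forms exp(-(x - y)^2) and 1/(1 + exp(-(y - z))); the search window, the grid resolution, and the sample point (x, z) = (2, 0) are arbitrary choices, and a root finder applied to the stationarity condition would be an alternative.

```python
import math


def mu_AE(x, y):
    return math.exp(-(x - y) ** 2)                 # "approximately equal"


def mu_ML(y, z):
    return 1.0 / (1.0 + math.exp(-(y - z)))        # "much larger"


def max_product_AE_ML(x, z, half_width=10.0, steps=20001):
    """Approximate max over y of mu_AE(x, y) * mu_ML(y, z) on a grid centred at x.

    Returns the maximizing grid point and the value found there.
    """
    best_y, best_val = x, mu_AE(x, x) * mu_ML(x, z)
    for i in range(steps):
        y = x - half_width + 2.0 * half_width * i / (steps - 1)
        val = mu_AE(x, y) * mu_ML(y, z)
        if val > best_val:
            best_y, best_val = y, val
    return best_y, best_val


y_star, value = max_product_AE_ML(2.0, 0.0)
print(y_star, value)
```

Because the first factor decays quickly away from y = x, a window of a few units around x is enough for the search; the maximizer sits slightly above x, pulled there by the increasing second factor.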
4.3 The Extension Principle
The extension principle is a basic identity that allows the domain of a function to be extended from crisp points in U to fuzzy sets in U. More specifically, let $f : U \to V$ be a function from crisp set U to crisp set V. Suppose that a fuzzy set A in U is given and we want to determine a fuzzy set B = f(A) in V that is induced by f. If f is a one-to-one mapping, then we can define
$$\mu_B(y) = \mu_A[f^{-1}(y)], \quad y \in V \tag{4.42}$$
where $f^{-1}(y)$ is the inverse of f, that is, $f[f^{-1}(y)] = y$. If f is not one-to-one, then an ambiguity arises when two or more distinct points in U with different membership values in A are mapped into the same point in V. For example, we may have $f(x_1) = f(x_2) = y$ but $x_1 \ne x_2$ and $\mu_A(x_1) \ne \mu_A(x_2)$, so the right hand side of (4.42) may take two different values $\mu_A(x_1 = f^{-1}(y))$ or $\mu_A(x_2 = f^{-1}(y))$. To resolve this ambiguity, we assign the larger one of the two membership values to $\mu_B(y)$. More generally, the membership function for B is defined as
$$\mu_B(y) = \max_{x \in f^{-1}(y)} \mu_A(x), \quad y \in V \tag{4.43}$$
where $f^{-1}(y)$ denotes the set of all points $x \in U$ such that f(x) = y. The identity (4.43) is called the extension principle.
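For a fuzzy set with finite support, the rule just stated amounts to grouping points by their image and keeping the largest membership value in each group. The sketch below uses a dictionary from points to membership values; the fuzzy set A and the choice f = abs are hypothetical and serve only to show a case where f is not one-to-one.

```python
def extend(f, fuzzy_set):
    """Image of a finite fuzzy set under f: mu_B(y) is the max of mu_A(x) over f(x) = y."""
    image = {}
    for x, mu in fuzzy_set.items():
        y = f(x)
        image[y] = max(mu, image.get(y, 0.0))
    return image


# Hypothetical fuzzy set on {-2, ..., 2}; the membership values are illustrative only.
A = {-2: 0.1, -1: 0.6, 0: 1.0, 1: 0.9, 2: 0.3}
B = extend(abs, A)
print(B)
```

Here -1 and 1 both map to 1, so the image keeps the larger of their two membership values; the same happens for -2 and 2.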
Example 4.8. Let U = {1, 2, ..., 10} and $f(x) = x^2$. Let small be a fuzzy set in U defined by
(4.44)
4.4 Summary and Further Readings
In this chapter we have demonstrated the following: The concepts of fuzzy relations, projections, and cylindric extensions. The max-min and max-product compositions of fuzzy relations. The extension principle and its applications. The basic ideas of fuzzy relations, projections, cylindric extensions, compositions of fuzzy relations, and the extension principle were proposed in Zadeh [1971b] and Zadeh [1975]. These original papers of Zadeh were very clearly written and are still the best sources to learn these fundamental concepts.
4.5 Exercises
Exercise 4.1. Given an n-ary relation, how many different projections of the relation can be taken?
Exercise 4.2. Consider the fuzzy relation Q defined in $U_1 \times \cdots \times U_4$ where $U_1$ = {a, b, c}, $U_2$ = {s, t}, $U_3$ = {x, y} and $U_4$ = {i, j}:
(a) Compute the projections of Q on $U_1 \times U_2 \times U_4$, $U_1 \times U_3$ and $U_4$. (b) Compute the cylindric extensions of the projections in (a) to $U_1 \times U_2 \times U_3 \times U_4$.
Exercise 4.3. Consider the three binary fuzzy relations defined by the relational matrices:
0 3
0 1 (4.46) Compute the max-min and max-product compositions $Q_1 \circ Q_2$, $Q_1 \circ Q_3$ and $Q_1 \circ Q_2 \circ Q_3$.
Exercise 4.4. Consider the fuzzy set A = 0.5/-1 + 0.8/0 + 1/1 + 0.4/2 and the function $f(x) = x^2$. Determine the fuzzy set f(A) using the extension principle.
Exercise 4.5. Compute the $\mu_{AE \circ ML}(x, z)$ in Example 4.7 for (x, z) = (0, 0), (0, 1), (1, 0), (1, 1).
0 0.7
Q 3 = ( 0.7 0 0 1
Chapter 5
5.1 From Numerical Variables to Linguistic Variables
In our daily life, words are often used to describe variables. For example, when we say "today is hot," or equivalently, "today's temperature is high," we use the word "high" to describe the variable "today's temperature." That is, the variable "today's temperature" takes the word "high" as its value. Clearly, the variable "today's temperature" also can take numbers like 25°C, 19°C, etc., as its values. When a variable takes numbers as its values, we have a well-established mathematical framework to formulate it. But when a variable takes words as its values, we do not have a formal framework to formulate it in classical mathematical theory. In order to provide such a formal framework, the concept of linguistic variables was introduced. Roughly speaking, if a variable can take words in natural languages as its values, it is called a linguistic variable. Now the question is how to formulate the words in mathematical terms. Here we use fuzzy sets to characterize the words. Thus, we have the following definition.
Definition 5.1. If a variable can take words in natural languages as its values, it is called a linguistic variable, where the words are characterized by fuzzy sets defined in the universe of discourse in which the variable is defined.
Example 5.1. The speed of a car is a variable x that takes values in the interval $[0, V_{max}]$, where $V_{max}$ is the maximum speed of the car. We now define three fuzzy sets "slow," "medium," and "fast" in $[0, V_{max}]$ as shown in Fig. 5.1. If we view x as a linguistic variable, then it can take "slow," "medium" and "fast" as its values. That is, we can say "x is slow," "x is medium," and "x is fast." Of course, x also can take numbers in the interval $[0, V_{max}]$ as its values, for example, x = 50mph, 35mph, etc.
Definition 5.1 gives a simple and intuitive definition for linguistic variables. In the fuzzy theory literature, a more formal definition of linguistic variables was usually employed (Zadeh [1973] and [1975]). This definition is given below.
Figure 5.1. The speed of a car as a linguistic variable x that can take fuzzy sets "slow," "medium" and "fast" as its values.
Definition 5.2. A linguistic variable is characterized by (X, T, U, M), where:
X is the name of the linguistic variable; in Example 5.1, X is the speed of the car.
T is the set of linguistic values that X can take; in Example 5.1, T = {slow, medium, fast}.
U is the actual physical domain in which the linguistic variable X takes its quantitative (crisp) values; in Example 5.1, $U = [0, V_{\max}]$.
M is a semantic rule that relates each linguistic value in T with a fuzzy set in U; in Example 5.1, M relates "slow," "medium," and "fast" with the membership functions shown in Fig. 5.1.
Comparing Definition 5.1 with Definition 5.2, we see that they are essentially equivalent. Definition 5.1 is more intuitive, whereas Definition 5.2 looks more formal. From these definitions we see that linguistic variables are extensions of numerical variables in the sense that they are allowed to take fuzzy sets as their values; see Fig. 5.2.
[Figure 5.2. Panels: numerical variable; linguistic variable.]
Why is the concept of linguistic variable important? Because linguistic variables are the most fundamental elements in human knowledge representation. When we use sensors to measure a variable, they give us numbers; when we ask human experts to evaluate a variable, they give us words. For example, when we use a radar gun to measure the speed of a car, it gives us numbers like 39 mph, 42 mph, etc.; when we ask a human to tell us about the speed of the car, he/she often tells us in words like "it's slow," "it's fast," etc. Hence, by introducing the concept of linguistic variables, we are able to formulate vague descriptions in natural languages in precise mathematical terms. This is the first step to incorporate human knowledge into engineering systems in a systematic and efficient manner.
5.2
Linguistic Hedges
With the concept of linguistic variables, we are able to take words as values of (linguistic) variables. In our daily life, we often use more than one word to describe a variable. For example, if we view the speed of a car as a linguistic variable, then its values might be "not slow," "very slow," "slightly fast," "more or less medium," etc. In general, the value of a linguistic variable is a composite term $x = x_1 x_2 \cdots x_n$ that is a concatenation of atomic terms $x_1, x_2, \ldots, x_n$. These atomic terms may be classified into three groups:
Primary terms, which are labels of fuzzy sets; in Example 5.1, they are "slow," "medium," and "fast."
Complement "not" and connections "and" and "or."
Hedges, such as "very," "slightly," "more or less," etc.
The terms "not," "and," and "or" were studied in Chapters 2 and 3. Our task now is to characterize hedges.
Although in its everyday use the hedge very does not have a well-defined meaning, in essence it acts as an intensifier. In this spirit, we have the following definition for the two most commonly used hedges: very and more or less.
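Definition 5.3. Let A be a fuzzy set in U, then very A is defined as a fuzzy set in U with the membership function
$\mu_{very\,A}(x) = [\mu_A(x)]^2$ (5.1)
and more or less A is a fuzzy set in U with the membership function
$\mu_{more\ or\ less\,A}(x) = [\mu_A(x)]^{1/2}$ (5.2)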
Example 5.2. Let U = {1, 2, ..., 5} and the fuzzy set small be defined as
small = 1/1 + 0.8/2 + 0.6/3 + 0.4/4 + 0.2/5 (5.3)
Then, according to (5.1) and (5.2), we have
very small = 1/1 + 0.64/2 + 0.36/3 + 0.16/4 + 0.04/5 (5.4)
very very small = very (very small) = 1/1 + 0.4096/2 + 0.1296/3 + 0.0256/4 + 0.0016/5 (5.5)
more or less small = 1/1 + 0.8944/2 + 0.7746/3 + 0.6325/4 + 0.4472/5 (5.6)
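The hedge computations of Example 5.2 can be reproduced with a few lines of code. The following is a minimal sketch, assuming that "very" squares each membership value and "more or less" takes its square root (which is what the numbers in (5.4)-(5.6) imply); the dictionary representation and the function names `very` and `more_or_less` are illustrative choices, not notation from the text.

```python
import math

# Fuzzy set "small" on U = {1, ..., 5} from (5.3), stored as {element: membership}.
small = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4, 5: 0.2}

def very(fuzzy_set):
    """Hedge 'very': square each membership value (assumed form of (5.1))."""
    return {u: mu ** 2 for u, mu in fuzzy_set.items()}

def more_or_less(fuzzy_set):
    """Hedge 'more or less': square root of each membership value (assumed form of (5.2))."""
    return {u: math.sqrt(mu) for u, mu in fuzzy_set.items()}

very_small = very(small)
very_very_small = very(very_small)          # hedges compose: very (very small)
more_or_less_small = more_or_less(small)

for name, fs in [("very small", very_small),
                 ("very very small", very_very_small),
                 ("more or less small", more_or_less_small)]:
    # Round only for display, to match the four decimals used in the example.
    print(name, {u: round(mu, 4) for u, mu in fs.items()})
```

Because hedges are plain functions on membership values, they can be chained in any order, which is how "very very small" is obtained here.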
5.3
Fuzzy IF-THEN Rules
In Chapter 1 we mentioned that in fuzzy systems and control, human knowledge is represented in terms of fuzzy IF-THEN rules. A fuzzy IF-THEN rule is a conditional statement expressed as
IF <fuzzy proposition>, THEN <fuzzy proposition> (5.7)
Therefore, in order to understand fuzzy IF-THEN rules, we first must know what fuzzy propositions are.
5.3.1
Fuzzy Propositions
There are two types of fuzzy propositions: atomic fuzzy propositions and compound fuzzy propositions. An atomic fuzzy proposition is a single statement
x is A (5.8)
where x is a linguistic variable, and A is a linguistic value of x (that is, A is a fuzzy set defined in the physical domain of x). A compound fuzzy proposition is a composition of atomic fuzzy propositions using the connectives "and," "or," and "not" which represent fuzzy intersection, fuzzy union, and fuzzy complement, respectively. For example, if x represents the speed of the car in Example 5.1, then the following are fuzzy propositions (the first three are atomic fuzzy propositions and the last three are compound fuzzy propositions):
x is S (5.9)
x is M (5.10)
x is F (5.11)
x is S or x is not M (5.12)
x is not S and x is not F (5.13)
(x is S and x is not F) or x is M (5.14)
where S, M and F denote the fuzzy sets "slow," "medium," and "fast," respectively. Note that in a compound fuzzy proposition, the atomic fuzzy propositions are independent, that is, the x's in the same proposition of (5.12)-(5.14) can be different variables. Actually, the linguistic variables in a compound fuzzy proposition are in general not the same. For example, let x be the speed of a car and $y = \dot{x}$ be the acceleration of the car, then if we define fuzzy set large (L) for the acceleration, the following is a compound fuzzy proposition:
x is F and y is L
Therefore, compound fuzzy propositions should be understood as fuzzy relations. How do we determine the membership functions of these fuzzy relations?
For connective "and" use fuzzy intersections. Specifically, let x and y be linguistic variables in the physical domains U and V, and A and B be fuzzy sets in U and V, respectively, then the compound fuzzy proposition
x is A and y is B (5.15)
is interpreted as the fuzzy relation $A \cap B$ in $U \times V$ with the membership function
$\mu_{A \cap B}(x, y) = t[\mu_A(x), \mu_B(y)]$ (5.16)
where t is any t-norm.¹
For connective "or" use fuzzy unions. Specifically, the compound fuzzy proposition
x is A or y is B (5.17)
is interpreted as the fuzzy relation $A \cup B$ in $U \times V$ with the membership function
$\mu_{A \cup B}(x, y) = s[\mu_A(x), \mu_B(y)]$ (5.18)
where s is any s-norm.
¹Note that in Chapters 2 and 3, A and B are fuzzy sets defined in the same universal set U and $A \cup B$ and $A \cap B$ are fuzzy sets in U; here, $A \cup B$ and $A \cap B$ are fuzzy relations in $U \times V$, where U may or may not equal V.
For connective "not" use fuzzy complements. That is, replace not A by $\bar{A}$, which is defined according to the complement operators in Chapter 3.
Example 5.3. The fuzzy proposition (5.14), that is,
FP = (x is S and x is not F) or x is M (5.19)
is a fuzzy relation in the product space $[0, V_{\max}]^3$ with the membership function
$\mu_{FP}(x_1, x_2, x_3) = s\{t[\mu_S(x_1), c(\mu_F(x_2))], \mu_M(x_3)\}$ (5.20)
where s, t and c are s-norm, t-norm and fuzzy complement operators, respectively, the fuzzy sets S = slow, M = medium, and F = fast are defined in Fig. 5.1, and $x_1 = x_2 = x_3 = x$.
We are now ready to interpret the fuzzy IF-THEN rules in the form of (5.7).
5.3.2
Interpretations of Fuzzy IF-THEN Rules
Because the fuzzy propositions are interpreted as fuzzy relations, the key question remaining is how to interpret the IF-THEN operation. In classical propositional calculus, the expression IF p THEN q is written as $p \to q$ with the implication $\to$ regarded as a connective defined by Table 5.1, where p and q are propositional variables whose values are either true (T) or false (F). From Table 5.1 we see that if both p and q are true or both are false, then $p \to q$ is true; if p is true and q is false, then $p \to q$ is false; and, if p is false and q is true, then $p \to q$ is true. Hence, $p \to q$ is equivalent to
$\bar{p} \vee q$ (5.21)
and
$(p \wedge q) \vee \bar{p}$ (5.22)
in the sense that they share the same truth table (Table 5.1) as $p \to q$, where the overbar, $\vee$ and $\wedge$ represent (classical) logic operations "not," "or," and "and," respectively. Because fuzzy IF-THEN rules can be viewed as replacing the p and q with fuzzy propositions, we can interpret the fuzzy IF-THEN rules by replacing the overbar, $\vee$ and $\wedge$ operators in (5.21) and (5.22) with fuzzy complement, fuzzy union, and fuzzy intersection, respectively. Since there are a wide variety of fuzzy complement, fuzzy union, and fuzzy intersection operators, a number of different interpretations of fuzzy IF-THEN rules were proposed in the literature. We list some of them below.
Table 5.1. Truth table for $p \to q$.
p | q | $p \to q$
T | T | T
T | F | F
F | T | T
F | F | T
In the following, we rewrite (5.7) as IF <FP1> THEN <FP2> and replace the p and q in (5.21) and (5.22) by FP1 and FP2, respectively, where FP1 and FP2 are fuzzy propositions. We assume that FP1 is a fuzzy relation defined in $U = U_1 \times \cdots \times U_n$, FP2 is a fuzzy relation defined in $V = V_1 \times \cdots \times V_m$, and x and y are linguistic variables (vectors) in U and V, respectively.
Dienes-Rescher Implication: If we replace the logic operators overbar and $\vee$ in (5.21) by the basic fuzzy complement (3.1) and the basic fuzzy union (3.2), respectively, then we obtain the so-called Dienes-Rescher implication. Specifically, the fuzzy IF-THEN rule IF <FP1> THEN <FP2> is interpreted as a fuzzy relation $Q_D$ in $U \times V$ with the membership function
$\mu_{Q_D}(x, y) = \max[1 - \mu_{FP_1}(x), \mu_{FP_2}(y)]$ (5.23)
Lukasiewicz Implication: If we use the Yager s-norm (3.10) with w = 1 for the $\vee$ and basic fuzzy complement (3.1) for the overbar in (5.21), we obtain the Lukasiewicz implication. Specifically, the fuzzy IF-THEN rule IF <FP1> THEN <FP2> is interpreted as a fuzzy relation $Q_L$ in $U \times V$ with the membership function
$\mu_{Q_L}(x, y) = \min[1, 1 - \mu_{FP_1}(x) + \mu_{FP_2}(y)]$ (5.24)
Zadeh Implication: Here the fuzzy IF-THEN rule IF <FP1> THEN <FP2> is interpreted as a fuzzy relation $Q_Z$ in $U \times V$ with the membership function
$\mu_{Q_Z}(x, y) = \max[\min(\mu_{FP_1}(x), \mu_{FP_2}(y)), 1 - \mu_{FP_1}(x)]$ (5.25)
Clearly, (5.25) is obtained from (5.22) by using basic fuzzy complement (3.1), basic fuzzy union (3.2), and basic fuzzy intersection (3.3) for the overbar, $\vee$ and $\wedge$, respectively.
Gödel Implication: The Gödel implication is a well-known implication formula in classical logic. By generalizing it to fuzzy propositions, we obtain the following: the fuzzy IF-THEN rule IF <FP1> THEN <FP2> is interpreted as a fuzzy relation $Q_G$ in $U \times V$ with the membership function
$\mu_{Q_G}(x, y) = \begin{cases} 1 & \text{if } \mu_{FP_1}(x) \le \mu_{FP_2}(y) \\ \mu_{FP_2}(y) & \text{otherwise} \end{cases}$ (5.26)
It is interesting to explore the relationship among these implications. The following lemma shows that the Zadeh implication is smaller than the Dienes-Rescher implication, which is smaller than the Lukasiewicz implication.
Lemma 5.1. For all $(x, y) \in U \times V$, the following is true:
$\mu_{Q_Z}(x, y) \le \mu_{Q_D}(x, y) \le \mu_{Q_L}(x, y)$ (5.27)
Proof: Since $0 \le 1 - \mu_{FP_1}(x) \le 1$ and $0 \le \mu_{FP_2}(y) \le 1$, we have $\max[1 - \mu_{FP_1}(x), \mu_{FP_2}(y)] \le 1 - \mu_{FP_1}(x) + \mu_{FP_2}(y)$ and $\max[1 - \mu_{FP_1}(x), \mu_{FP_2}(y)] \le 1$. Hence, $\mu_{Q_D}(x, y) = \max[1 - \mu_{FP_1}(x), \mu_{FP_2}(y)] \le \min[1, 1 - \mu_{FP_1}(x) + \mu_{FP_2}(y)] = \mu_{Q_L}(x, y)$. Since $\min[\mu_{FP_1}(x), \mu_{FP_2}(y)] \le \mu_{FP_2}(y)$, we have $\max[\min(\mu_{FP_1}(x), \mu_{FP_2}(y)), 1 - \mu_{FP_1}(x)] \le \max[\mu_{FP_2}(y), 1 - \mu_{FP_1}(x)]$, which is $\mu_{Q_Z}(x, y) \le \mu_{Q_D}(x, y)$.
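The ordering in Lemma 5.1 can also be checked numerically. The sketch below writes the implications as functions of the two membership values a = μ_FP1(x) and b = μ_FP2(y), using the expressions that appear in the proof above, and tests the ordering on a grid of values. The function names, the grid resolution and the small tolerance for floating-point rounding are illustrative assumptions; a grid check is a sanity test, not a proof.

```python
def dienes_rescher(a, b):
    """max[1 - a, b], the Dienes-Rescher form used in the proof."""
    return max(1.0 - a, b)

def lukasiewicz(a, b):
    """min[1, 1 - a + b], the Lukasiewicz form used in the proof."""
    return min(1.0, 1.0 - a + b)

def zadeh(a, b):
    """max[min(a, b), 1 - a], the Zadeh form used in the proof."""
    return max(min(a, b), 1.0 - a)

# Membership values on a coarse grid in [0, 1].
grid = [i / 20 for i in range(21)]
tol = 1e-12  # guards against floating-point rounding only

lemma_holds = all(
    zadeh(a, b) <= dienes_rescher(a, b) + tol
    and dienes_rescher(a, b) <= lukasiewicz(a, b) + tol
    for a in grid
    for b in grid
)
print(lemma_holds)
```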
Conceptually, we can replace the overbar, $\vee$ and $\wedge$ in (5.21) and (5.22) by any fuzzy complement, s-norm and t-norm, respectively, to obtain a particular interpretation. So a question arises: Based on what criteria do we choose the combination of fuzzy complements, s-norms, and t-norms? This is an important question and we will discuss it in Chapters 7-10. Another question is: Are (5.21) and (5.22) still "equivalent" to $p \to q$ when p and q are fuzzy propositions and what does this "equivalent" mean? We now try to answer this question. When p and q are crisp propositions (that is, p and q are either true or false), $p \to q$ is a global implication in the sense that Table 5.1 covers all the possible cases. However, when p and q are fuzzy propositions, $p \to q$ may only be a local implication in the sense that $p \to q$ has large truth value only when both p and q have large truth values. For example, when we say "IF speed is high, THEN resistance is high," we are concerned only with a local situation in the sense that this rule tells us nothing about the situations when "speed is slow," "speed is medium," etc. Therefore, the fuzzy IF-THEN rule
IF <FP1> THEN <FP2> (5.28)
should be interpreted as
IF <FP1> THEN <FP2> ELSE <NOTHING> (5.29)
where NOTHING means that this rule does not exist. In logic terms, it becomes
$p \to q = p \wedge q$ (5.30)
Using min or algebraic product for the $\wedge$ in (5.30), we obtain the Mamdani implications.
Mamdani Implications: The fuzzy IF-THEN rule (5.28) is interpreted as a fuzzy relation $Q_{MM}$ or $Q_{MP}$ in $U \times V$ with the membership function
$\mu_{Q_{MM}}(x, y) = \min[\mu_{FP_1}(x), \mu_{FP_2}(y)]$ (5.31)
or
$\mu_{Q_{MP}}(x, y) = \mu_{FP_1}(x)\,\mu_{FP_2}(y)$ (5.32)
Mamdani implications are the most widely used implications in fuzzy systems and fuzzy control. They are supported by the argument that fuzzy IF-THEN rules are local. However, one may not agree with this argument. For example, one may argue that when we say "IF speed is high, THEN resistance is high," we implicitly indicate that "IF speed is slow, THEN resistance is low." In this sense, fuzzy IF-THEN rules are nonlocal. This kind of debate indicates that when we represent human knowledge in terms of fuzzy IF-THEN rules, different people have different interpretations. Consequently, different implications are needed to cope with the diversity of interpretations. For example, if the human experts think that their rules are local, then the Mamdani implications should be used; otherwise, the global implications (5.23)-(5.26) should be considered. We now consider some examples for the computation of Q D , QL,Qz, QMM and QMP. Example 5.4. Let xl be the speed of a car, x2 be the acceleration, and y be the force applied to the accelerator. Consider the following fuzzy IF-THEN rule:
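IF $x_1$ is slow and $x_2$ is small, THEN y is large (5.33)
where "slow" is the fuzzy set defined in Fig. 5.1, that is,
$\mu_{slow}(x_1) = \begin{cases} 1 & \text{if } x_1 \le 35 \\ (55 - x_1)/20 & \text{if } 35 < x_1 \le 55 \\ 0 & \text{if } x_1 > 55 \end{cases}$ (5.34)
"small" is a fuzzy set in the domain of acceleration with the membership function
$\mu_{small}(x_2) = \begin{cases} (10 - x_2)/10 & \text{if } 0 \le x_2 \le 10 \\ 0 & \text{if } x_2 > 10 \end{cases}$ (5.35)
and "large" is a fuzzy set in the domain of force applied to the accelerator with the membership function
$\mu_{large}(y) = \begin{cases} 0 & \text{if } y \le 1 \\ y - 1 & \text{if } 1 < y \le 2 \\ 1 & \text{if } y > 2 \end{cases}$ (5.36)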
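Let the domains of $x_1$, $x_2$ and y be $U_1 = [0, 100]$, $U_2 = [0, 30]$, and $V = [0, 3]$, respectively. If we use algebraic product for the t-norm in (5.16), then the fuzzy proposition
$FP_1$ = $x_1$ is slow and $x_2$ is small (5.37)
is a fuzzy relation in $U_1 \times U_2$ with the membership function
$\mu_{FP_1}(x_1, x_2) = \mu_{slow}(x_1)\,\mu_{small}(x_2) = \begin{cases} 0 & \text{if } x_1 \ge 55 \text{ or } x_2 > 10 \\ (10 - x_2)/10 & \text{if } x_1 \le 35 \text{ and } x_2 \le 10 \\ (55 - x_1)(10 - x_2)/200 & \text{if } 35 < x_1 \le 55 \text{ and } x_2 \le 10 \end{cases}$ (5.38)
Fig. 5.3 illustrates how to compute $\mu_{FP_1}(x_1, x_2)$.
Figure 5.3. Illustration for how to compute $\mu_{slow}(x_1)\mu_{small}(x_2)$ in Example 5.4.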
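If we use the Dienes-Rescher implication (5.23), then the fuzzy IF-THEN rule (5.33) is interpreted as a fuzzy relation $Q_D(x_1, x_2, y)$ in $U_1 \times U_2 \times V$ with the membership function
$\mu_{Q_D}(x_1, x_2, y) = \max[1 - \mu_{FP_1}(x_1, x_2), \mu_{large}(y)]$ (5.39)
From (5.38) we have
$1 - \mu_{FP_1}(x_1, x_2) = \begin{cases} 1 & \text{if } x_1 \ge 55 \text{ or } x_2 > 10 \\ x_2/10 & \text{if } x_1 \le 35 \text{ and } x_2 \le 10 \\ 1 - (55 - x_1)(10 - x_2)/200 & \text{if } 35 < x_1 \le 55 \text{ and } x_2 \le 10 \end{cases}$ (5.40)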
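To help us combine $1 - \mu_{FP_1}(x_1, x_2)$ of (5.40) with $\mu_{large}(y)$ of (5.36) using the max operator, we illustrate in Fig. 5.4 the division of the domains of $1 - \mu_{FP_1}(x_1, x_2)$ and $\mu_{large}(y)$ and their combinations. From Fig. 5.4, we obtain
$\mu_{Q_D}(x_1, x_2, y) = \begin{cases} 1 & \text{if } x_1 \ge 55 \text{ or } x_2 > 10 \text{ or } y > 2 \\ x_2/10 & \text{if } x_1 \le 35 \text{ and } x_2 \le 10 \text{ and } y \le 1 \\ 1 - (55 - x_1)(10 - x_2)/200 & \text{if } 35 < x_1 \le 55 \text{ and } x_2 \le 10 \text{ and } y \le 1 \\ \max[y - 1, x_2/10] & \text{if } x_1 \le 35 \text{ and } x_2 \le 10 \text{ and } 1 < y \le 2 \\ \max[y - 1, 1 - (55 - x_1)(10 - x_2)/200] & \text{if } 35 < x_1 \le 55 \text{ and } x_2 \le 10 \text{ and } 1 < y \le 2 \end{cases}$ (5.41)
Figure 5.4. Division of the domains of $1 - \mu_{FP_1}(x_1, x_2)$ and $\mu_{large}(y)$ and their combinations for Example 5.4.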
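Rather than enumerating regions by hand, the Dienes-Rescher relation of Example 5.4 can be evaluated pointwise. The sketch below is a hedged illustration: the piecewise-linear membership functions are assumptions chosen to be consistent with the pieces quoted in the example, such as (55 − x1)(10 − x2)/200 and max[y − 1, x2/10], and the function names are hypothetical.

```python
def mu_slow(x1):
    """Assumed 'slow' membership: 1 up to 35, linear down to 0 at 55."""
    if x1 <= 35:
        return 1.0
    if x1 <= 55:
        return (55 - x1) / 20
    return 0.0

def mu_small(x2):
    """Assumed 'small' membership: linear from 1 at 0 down to 0 at 10."""
    return (10 - x2) / 10 if x2 <= 10 else 0.0

def mu_large(y):
    """Assumed 'large' membership: 0 up to 1, linear up to 1 at 2."""
    if y <= 1:
        return 0.0
    if y <= 2:
        return y - 1
    return 1.0

def mu_fp1(x1, x2):
    """'x1 is slow and x2 is small' with algebraic product as the t-norm."""
    return mu_slow(x1) * mu_small(x2)

def mu_qd(x1, x2, y):
    """Dienes-Rescher implication: max[1 - mu_FP1(x1, x2), mu_large(y)]."""
    return max(1 - mu_fp1(x1, x2), mu_large(y))

# One sample point from each of three regions of the domain division.
print(mu_qd(30, 5, 1.5))   # region x1 <= 35, x2 <= 10, 1 < y <= 2
print(mu_qd(45, 4, 0.5))   # region 35 < x1 <= 55, x2 <= 10, y <= 1
print(mu_qd(60, 2, 0.0))   # region x1 >= 55
```

Evaluating the max pointwise avoids the case analysis entirely; the case analysis is only needed when a closed-form expression is wanted.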
For Lukasiewicz, Zadeh and Mamdani implications, we can use the same procedure to determine the membership functions. □
From Example 5.4 we see that if the membership functions in the atomic fuzzy propositions are not smooth functions (for example, (5.34)-(5.36)), the computation of the final membership functions $\mu_{Q_D}$, $\mu_{Q_L}$, etc., is cumbersome, although it is straightforward. A way to resolve this complexity is to use a single smooth function to approximate the nonsmooth functions; see the following example.
of (5.35), and
to approximate the $\mu_{large}(y)$ of (5.36). Now if we use Mamdani product implication (5.32) and algebraic product for the t-norm in (5.16), then the membership function $\mu_{Q_{MP}}(x_1, x_2, y)$ can be easily computed as
Example 5.5. Let U = {1, 2, 3, 4} and V = {1, 2, 3}. Suppose we know that $x \in U$ is somewhat inversely proportional to $y \in V$. To formulate this knowledge, we may use the following fuzzy IF-THEN rule:
IF x is large, THEN y is small (5.46)
where the fuzzy sets "large" and "small" are defined as
large = 0/1 + 0.1/2 + 0.5/3 + 1/4 (5.47)
small = 1/1 + 0.5/2 + 0.1/3 (5.48)
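If we use the Dienes-Rescher implication (5.23), then the fuzzy IF-THEN rule (5.46) is interpreted as the following fuzzy relation $Q_D$ in $U \times V$:
$Q_D$ = 1/(1,1) + 1/(1,2) + 1/(1,3) + 1/(2,1) + 0.9/(2,2) + 0.9/(2,3) + 1/(3,1) + 0.5/(3,2) + 0.5/(3,3) + 1/(4,1) + 0.5/(4,2) + 0.1/(4,3) (5.49)
If we use the Lukasiewicz implication (5.24), the rule (5.46) becomes
$Q_L$ = 1/(1,1) + 1/(1,2) + 1/(1,3) + 1/(2,1) + 1/(2,2) + 1/(2,3) + 1/(3,1) + 1/(3,2) + 0.6/(3,3) + 1/(4,1) + 0.5/(4,2) + 0.1/(4,3) (5.50)
For the Zadeh implication (5.25) and the Gödel implication (5.26), we have
$Q_Z$ = 1/(1,1) + 1/(1,2) + 1/(1,3) + 0.9/(2,1) + 0.9/(2,2) + 0.9/(2,3) + 0.5/(3,1) + 0.5/(3,2) + 0.5/(3,3) + 1/(4,1) + 0.5/(4,2) + 0.1/(4,3) (5.51)
and
$Q_G$ = 1/(1,1) + 1/(1,2) + 1/(1,3) + 1/(2,1) + 1/(2,2) + 1/(2,3) + 1/(3,1) + 1/(3,2) + 0.1/(3,3) + 1/(4,1) + 0.5/(4,2) + 0.1/(4,3) (5.52)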
Finally, if we use the Mamdani implications (5.31) and (5.32), then the fuzzy IF-THEN rule (5.46) becomes
$Q_{MM}$ = 0/(1,1) + 0/(1,2) + 0/(1,3) + 0.1/(2,1) + 0.1/(2,2) + 0.1/(2,3) + 0.5/(3,1) + 0.5/(3,2) + 0.1/(3,3) + 1/(4,1) + 0.5/(4,2) + 0.1/(4,3) (5.53)
and
$Q_{MP}$ = 0/(1,1) + 0/(1,2) + 0/(1,3) + 0.1/(2,1) + 0.05/(2,2) + 0.01/(2,3) + 0.5/(3,1) + 0.25/(3,2) + 0.05/(3,3) + 1/(4,1) + 0.5/(4,2) + 0.1/(4,3) (5.54)
From (5.49)-(5.54) we see that for the combinations not covered by the rule (5.46), that is, the pairs (1,1), (1,2) and (1,3) (because $\mu_{large}(1) = 0$), $Q_D$, $Q_L$, $Q_Z$ and $Q_G$ give full membership value to them, but $Q_{MM}$ and $Q_{MP}$ give zero membership value. This is consistent with our earlier discussion that Dienes-Rescher, Lukasiewicz, Zadeh and Gödel implications are global, whereas Mamdani implications are local.
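The six relations of Example 5.5 are small enough to tabulate directly. The sketch below assumes the fuzzy sets large = 0/1 + 0.1/2 + 0.5/3 + 1/4 and small = 1/1 + 0.5/2 + 0.1/3 (the values implied by the Mamdani relations listed in the example) and the standard forms of the six implications; the dictionary layout and the helper name `relation` are illustrative.

```python
# Fuzzy sets of Example 5.5 (assumed values, see lead-in).
large = {1: 0.0, 2: 0.1, 3: 0.5, 4: 1.0}
small = {1: 1.0, 2: 0.5, 3: 0.1}

# Each implication maps a = mu_large(x), b = mu_small(y) to a membership value.
implications = {
    "Q_D": lambda a, b: max(1 - a, b),          # Dienes-Rescher
    "Q_L": lambda a, b: min(1, 1 - a + b),      # Lukasiewicz
    "Q_Z": lambda a, b: max(min(a, b), 1 - a),  # Zadeh
    "Q_G": lambda a, b: 1.0 if a <= b else b,   # Goedel
    "Q_MM": lambda a, b: min(a, b),             # Mamdani minimum
    "Q_MP": lambda a, b: a * b,                 # Mamdani product
}

def relation(implication):
    """Tabulate the fuzzy relation on U x V as {(x, y): membership}."""
    return {
        (x, y): round(implication(a, b), 4)  # rounding hides float noise only
        for x, a in large.items()
        for y, b in small.items()
    }

relations = {name: relation(imp) for name, imp in implications.items()}

# Pairs with x = 1 are not covered by the rule because mu_large(1) = 0.
for name, rel in relations.items():
    print(name, [rel[(1, y)] for y in small])
```

The printed rows make the global versus local distinction visible: the first four relations assign 1 to the uncovered pairs, while the two Mamdani relations assign 0.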
5.4
Summary and Further Readings
In this chapter we have demonstrated the following:
The concept of linguistic variables and the characterization of hedges.
The concept of fuzzy propositions and fuzzy IF-THEN rules.
Different interpretations of fuzzy IF-THEN rules, including Dienes-Rescher, Lukasiewicz, Zadeh, Gödel and Mamdani implications.
Properties and computation of these implications.
Linguistic variables were introduced in Zadeh's seminal paper Zadeh [1973]. This paper is another piece of art and the reader is highly recommended to study it. The comprehensive three-part paper Zadeh [1975] summarized many concepts and principles associated with linguistic variables.
5.5
Exercises
Exercise 5.1. Give three examples of linguistic variables. Combine these linguistic variables into a compound fuzzy proposition and determine its membership function.
Exercise 5.2. Consider some other linguistic hedges than those in Section 5.2 and propose reasonable operations that represent them.
Exercise 5.3. Let $Q_L$, $Q_G$, $Q_{MM}$ and $Q_{MP}$ be the fuzzy relations defined in (5.24), (5.26), (5.31), and (5.32), respectively. Show that
Exercise 5.4. Use basic fuzzy operators (3.1)-(3.3) for "not," "or," and "and," respectively, and determine the membership functions for the fuzzy propositions (5.12) and (5.13). Plot these membership functions.
Exercise 5.5. Consider the fuzzy IF-THEN rule (5.33) with the fuzzy sets "slow," "small" and "large" defined by (5.42), (5.43) and (5.44), respectively. Use min for the t-norm in (5.16) and compute the fuzzy relations $Q_D$, $Q_L$, $Q_Z$, $Q_G$, $Q_{MM}$ and $Q_{MP}$.
Exercise 5.6. Let Q be a fuzzy relation in $U \times U$. Q is called reflexive if $\mu_Q(u, u) = 1$ for all $u \in U$. Show that if Q is reflexive, then: (a) $Q \circ Q$ is also reflexive, and (b) $Q \subseteq Q \circ Q$, where $\circ$ denotes max-min composition.
Chapter 6
Fuzzy Logic and Approximate Reasoning
Table 6.1. Truth table for five operations that are frequently applied to propositions.
other logic function. Logic formulas are defined recursively as follows:
(a) The truth values 0 and 1 are logic formulas.
(b) If p is a proposition, then $p$ and $\bar{p}$ are logic formulas.
(c) If p and q are logic formulas, then $p \vee q$ and $p \wedge q$ are also logic formulas.
(d) The only logic formulas are those defined by (a)-(c).
When the proposition represented by a logic formula is always true regardless of the truth values of the basic propositions participating in the formula, it is called a tautology; when it is always false, it is called a contradiction.
To prove (6.1) and (6.2), we use the truth table method, that is, we list all the possible values of (6.1) and (6.2) and see whether they are all true. Table 6.2 shows the results, which indicates that (6.1) and (6.2) are tautologies.
Various forms of tautologies can be used for making deductive inferences. They are referred to as inference rules. The three most commonly used inference rules are:
Modus Ponens: This inference rule states that given two propositions p and $p \to q$ (called the premises), the truth of the proposition q (called the conclusion) should be inferred. Symbolically, it is represented as
$(p \wedge (p \to q)) \to q$ (6.3)
Modus Tollens: This inference rule states that given two propositions $\bar{q}$ and $p \to q$, the truth of the proposition $\bar{p}$ should be inferred. Symbolically, it becomes
$(\bar{q} \wedge (p \to q)) \to \bar{p}$ (6.4)
A more intuitive representation of modus tollens is
Premise 1: y is not B
Premise 2: IF x is A THEN y is B
Conclusion: x is not A
Hypothetical Syllogism: This inference rule states that given two propositions $p \to q$ and $q \to r$, the truth of the proposition $p \to r$ should be inferred. Symbolically, we have
$((p \to q) \wedge (q \to r)) \to (p \to r)$ (6.5)
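The truth table method used for tautologies can be mechanized by enumerating every assignment of truth values. The sketch below checks the three classical inference rules in their usual symbolic forms; the helper names `implies` and `is_tautology` are hypothetical, and the last formula (affirming the consequent) is included only as a contrasting non-tautology.

```python
from itertools import product

def implies(p, q):
    """Classical implication: p -> q is (not p) or q."""
    return (not p) or q

def is_tautology(formula, n_vars):
    """True if the formula holds for every assignment of truth values."""
    return all(formula(*values) for values in product([False, True], repeat=n_vars))

# (p and (p -> q)) -> q
modus_ponens = lambda p, q: implies(p and implies(p, q), q)
# (not q and (p -> q)) -> not p
modus_tollens = lambda p, q: implies((not q) and implies(p, q), not p)
# ((p -> q) and (q -> r)) -> (p -> r)
hypothetical_syllogism = lambda p, q, r: implies(
    implies(p, q) and implies(q, r), implies(p, r)
)
# (q and (p -> q)) -> p is NOT a valid rule.
affirming_consequent = lambda p, q: implies(q and implies(p, q), p)

print(is_tautology(modus_ponens, 2))
print(is_tautology(modus_tollens, 2))
print(is_tautology(hypothetical_syllogism, 3))
print(is_tautology(affirming_consequent, 2))
```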
Generalized Modus Ponens: This inference rule states that given two fuzzy propositions x is A' and IF x is A THEN y is B, we should infer a new fuzzy proposition y is B' such that the closer the A' to A, the closer the B' to B, where A, A', B and B' are fuzzy sets; that is,
Premise 1: x is A'
Premise 2: IF x is A THEN y is B
Conclusion: y is B'
Table 6.3 shows the intuitive criteria relating Premise 1 and the Conclusion in generalized modus ponens. We note that if a causal relation between "x is A" and "y is B" is not strong in Premise 2, the satisfaction of criterion p3 and criterion p5 is allowed. Criterion p7 is interpreted as: "IF x is A THEN y is B, ELSE y is not B." Although this relation is not valid in classical logic, we often make such an interpretation in everyday reasoning.
Table 6.3. Intuitive criteria relating Premise 1 and the Conclusion for given Premise 2 in generalized modus ponens.
Generalized Modus Tollens: This inference rule states that given two fuzzy propositions y is B' and IF x is A THEN y is B, we should infer a new fuzzy proposition x is A' such that the more difference between B' and B, the more difference between A' and A, where A', A, B' and B are fuzzy sets; that is,
Premise 1: y is B'
Premise 2: IF x is A THEN y is B
Conclusion: x is A'
Table 6.4 shows some intuitive criteria relating Premise 1 and the Conclusion in generalized modus tollens. Similar to the criteria in Table 6.3, some criteria in Table 6.4 are not true in classical logic, but we use them approximately in our daily life.
Table 6.4. Intuitive criteria relating Premise 1 and the Conclusion for given Premise 2 in generalized modus tollens.
criterion t2 criterion t4
y is not very B
x is not very A
x is unknown
y is B
Generalized Hypothetical Syllogism: This inference rule states that given two fuzzy propositions IF x is A THEN y is B and IF y is B' THEN z is C, we could infer a new fuzzy proposition IF x is A THEN z is C' such that the closer the B to B', the closer the C' to C, where A, B, B', C and C' are fuzzy sets; that is,
Premise 1: IF x is A THEN y is B
Premise 2: IF y is B' THEN z is C
Conclusion: IF x is A THEN z is C'
Table 6.5 shows some intuitive criteria relating y is B' with z is C' in the generalized hypothetical syllogism. Criterion s2 is obtained from the following intuition: To match the B in Premise 1 with the B' = very B in Premise 2, we may change Premise 1 to IF x is very A THEN y is very B, so we have IF x is very A THEN z is C. By applying the hedge more or less to cancel the very, we have IF x is A THEN z is more or less C, which is criterion s2. Other criteria can be justified in a similar manner.
Table 6.5. Intuitive criteria relating y is B' in Premise 2 and z is C' in the Conclusion in generalized hypothetical syllogism.
criterion | y is B' (Premise 2) | z is C' (Conclusion)
criterion s1 | y is B |
criterion s2 | y is very B | z is more or less C
criterion s3 | y is very B |
criterion s4 | y is more or less B |
criterion s5 | y is more or less B |
criterion s6 | y is not B |
criterion s7 | y is not B |
We call the criteria in Tables 6.3-6.5 intuitive criteria because they are not necessarily true for a particular choice of fuzzy sets; this is what approximate reasoning means. Although these criteria are not absolutely correct, they do make some sense. They should be viewed as guidelines (or soft constraints) in designing specific inferences. We have now shown the basic ideas of three fundamental principles in fuzzy logic: generalized modus ponens, generalized modus tollens, and generalized hypothetical syllogism. The next question is how to determine the membership functions of the fuzzy propositions in the conclusions given those in the premises. The compositional rule of inference was proposed to answer this question.
6.2
The Compositional Rule of Inference
The compositional rule of inference is a generalization of the following procedure (referring to Fig. 6.1): suppose we have a curve y = f(x) from $x \in U$ to $y \in V$ and are given x = a, then from x = a and y = f(x) we can infer y = b = f(a). Let us generalize the above procedure by assuming that a is an interval and f(x) is an interval-valued function as shown in Fig. 6.2. To find the interval b which is inferred from a and f(x), we first construct a cylindrical set $a_E$ with base a and find its intersection I with the interval-valued curve. Then we project I on V yielding the interval b.
Figure 6.2. Inferring interval b from interval a and interval-valued function f(x).
Figure 6.3. Inferring fuzzy set B' from fuzzy set A' and fuzzy relation Q.
Going one step further in our chain of generalization, assume that A' is a fuzzy set in U and Q is a fuzzy relation in $U \times V$. Again, forming a cylindrical extension $A'_E$ of A' and intersecting it with the fuzzy relation Q (see Fig. 6.3), we obtain a fuzzy set $A'_E \cap Q$ which is the analog of the intersection I in Fig. 6.2. Then, projecting $A'_E \cap Q$ on the y-axis, we obtain the fuzzy set B'. More specifically, given $\mu_{A'}(x)$ and $\mu_Q(x, y)$, we have
$\mu_{A'_E}(x, y) = \mu_{A'}(x)$ (6.6)
and consequently
$\mu_{A'_E \cap Q}(x, y) = t[\mu_{A'_E}(x, y), \mu_Q(x, y)] = t[\mu_{A'}(x), \mu_Q(x, y)]$ (6.7)
Finally, we obtain B', the projection of $A'_E \cap Q$ on V, as
$\mu_{B'}(y) = \sup_{x \in U} t[\mu_{A'}(x), \mu_Q(x, y)]$ (6.8)
(6.8) is called the compositional rule of inference. In the literature, the symbol "*" is often used for the t-norm operator, so (6.8) is also written as
$\mu_{B'}(y) = \sup_{x \in U} [\mu_{A'}(x) * \mu_Q(x, y)]$ (6.9)
The compositional rule of inference is also called the sup-star composition. In Chapter 5, we learned that a fuzzy IF-THEN rule, for example, IF x is A THEN y is B, is interpreted as a fuzzy relation in the Cartesian product of the domains of x and y. Different implication principles give different fuzzy relations; see (5.23)-(5.26), (5.31), and (5.32). Therefore, the Premise 2s in the generalized modus ponens and generalized modus tollens can be viewed as the fuzzy relation Q in (6.9). For generalized hypothetical syllogism, we see that it is simply the composition of two fuzzy relations, so we can use the composition (4.28) to determine the conclusion. In summary, we obtain the detailed formulas for computing the conclusions in generalized modus ponens, generalized modus tollens, and generalized hypothetical syllogism, as follows:
Generalized Modus Ponens: Given fuzzy set A' (which represents the premise x is A') and fuzzy relation $A \to B$ in $U \times V$ (which represents the premise IF x is A THEN y is B), a fuzzy set B' in V (which represents the conclusion y is B') is inferred as
$\mu_{B'}(y) = \sup_{x \in U} t[\mu_{A'}(x), \mu_{A \to B}(x, y)]$ (6.10)
Generalized Modus Tollens: Given fuzzy set B' (which represents the premise y is B') and fuzzy relation $A \to B$ in $U \times V$ (which represents the premise IF x is A THEN y is B), a fuzzy set A' in U (which represents the conclusion x is A') is inferred as
$\mu_{A'}(x) = \sup_{y \in V} t[\mu_{B'}(y), \mu_{A \to B}(x, y)]$ (6.11)
Generalized Hypothetical Syllogism: Given fuzzy relation $A \to B$ in $U \times V$ (which represents the premise IF x is A THEN y is B) and fuzzy relation $B' \to C$ in $V \times W$ (which represents the premise IF y is B' THEN z is C), a fuzzy relation $A \to C'$ in $U \times W$ (which represents the conclusion IF x is A THEN z is C') is inferred as
$\mu_{A \to C'}(x, z) = \sup_{y \in V} t[\mu_{A \to B}(x, y), \mu_{B' \to C}(y, z)]$ (6.12)
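On finite universes the supremum becomes a maximum, so the three inference formulas reduce to short loops. The following is a minimal sketch under stated assumptions: fuzzy sets and relations are stored as dictionaries, the default t-norm is min, the rule is encoded with the Mamdani minimum implication, and the sets A, B and C as well as all function names are illustrative rather than taken from the text.

```python
def generalized_modus_ponens(a_prime, q, t_norm=min):
    """B'(y) = max over x of t[A'(x), Q(x, y)] on a finite universe."""
    result = {}
    for (x, y), mu_q in q.items():
        result[y] = max(result.get(y, 0.0), t_norm(a_prime[x], mu_q))
    return result

def generalized_modus_tollens(b_prime, q, t_norm=min):
    """A'(x) = max over y of t[B'(y), Q(x, y)] on a finite universe."""
    result = {}
    for (x, y), mu_q in q.items():
        result[x] = max(result.get(x, 0.0), t_norm(b_prime[y], mu_q))
    return result

def generalized_hypothetical_syllogism(q_ab, q_bc, t_norm=min):
    """(A -> C')(x, z) = max over y of t[Q_ab(x, y), Q_bc(y, z)]."""
    result = {}
    for (x, y), mu_ab in q_ab.items():
        for (y2, z), mu_bc in q_bc.items():
            if y == y2:
                key = (x, z)
                result[key] = max(result.get(key, 0.0), t_norm(mu_ab, mu_bc))
    return result

# Illustrative normal fuzzy sets (hypothetical data).
A = {1: 0.0, 2: 0.1, 3: 0.5, 4: 1.0}
B = {1: 1.0, 2: 0.5, 3: 0.1}
C = {"low": 0.3, "high": 1.0}

# Rules encoded with the Mamdani minimum implication.
Q_ab = {(x, y): min(a, b) for x, a in A.items() for y, b in B.items()}
Q_bc = {(y, z): min(b, c) for y, b in B.items() for z, c in C.items()}

b_prime = generalized_modus_ponens(A, Q_ab)    # premise "x is A"
a_prime = generalized_modus_tollens(B, Q_ab)   # premise "y is B"
Q_ac = generalized_hypothetical_syllogism(Q_ab, Q_bc)
print(b_prime, a_prime, Q_ac[(3, "low")])
```

With normal fuzzy sets and the Mamdani minimum implication, feeding the rule's own antecedent back in returns the consequent unchanged, which is a convenient sanity check for an implementation.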
Using different t-norms in (6.10)-(6.12) and different implication rules (5.23)-(5.26), (5.31) and (5.32), we obtain a diversity of results. These results show the properties of the implication rules. We now study some of these properties.
6.3
Properties of the Implication Rules
6.3.1
Generalized Modus Ponens
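Example 6.2. Suppose we use min for the t-norm and Mamdani's product implication (5.32) for the $\mu_{A \to B}(x, y)$ in the generalized modus ponens (6.10). Consider four cases of A': (a) A' = A, (b) A' = very A, (c) A' = more or less A, and (d) $A' = \bar{A}$. Our task is to determine the corresponding B'. We assume that $\sup_{x \in U}[\mu_A(x)] = 1$ (the fuzzy set A is normal). If A' = A, we have
$\mu_{B'}(y) = \sup_{x \in U}\{\min[\mu_A(x), \mu_A(x)\mu_B(y)]\} = \sup_{x \in U}[\mu_A(x)\mu_B(y)] = \mu_B(y)$ (6.13)
If A' = very A, we have
$\mu_{B'}(y) = \sup_{x \in U}\{\min[\mu_A^2(x), \mu_A(x)\mu_B(y)]\}$ (6.14)
Since $\sup_{x \in U}[\mu_A(x)] = 1$ and x can take any values in U, for any $y \in V$ there exists $x \in U$ such that $\mu_A(x) \ge \mu_B(y)$. Thus (6.14) can be simplified to
$\mu_{B'}(y) = \sup_{x \in U}[\mu_A(x)\mu_B(y)] = \mu_B(y)$ (6.15)
If A' = more or less A, then from $\mu_A^{1/2}(x) \ge \mu_A(x) \ge \mu_A(x)\mu_B(y)$, we have
$\mu_{B'}(y) = \sup_{x \in U}\{\min[\mu_A^{1/2}(x), \mu_A(x)\mu_B(y)]\} = \sup_{x \in U}[\mu_A(x)\mu_B(y)] = \mu_B(y)$ (6.16)
Finally, for $A' = \bar{A}$, we have
$\mu_{B'}(y) = \sup_{x \in U}\{\min[1 - \mu_A(x), \mu_A(x)\mu_B(y)]\}$ (6.17)
Since for fixed $y \in V$, $\mu_A(x)\mu_B(y)$ is an increasing function of $\mu_A(x)$ while $1 - \mu_A(x)$ is a decreasing function of $\mu_A(x)$, the $\sup_{x \in U} \min$ in (6.17) is achieved when $1 - \mu_A(x) = \mu_A(x)\mu_B(y)$, that is, when $\mu_A(x) = \frac{1}{1 + \mu_B(y)}$. Hence,
$\mu_{B'}(y) = \frac{\mu_B(y)}{1 + \mu_B(y)}$ (6.18)
From (6.13), (6.15), (6.16), (6.18), and Table 6.3 we see that the particular generalized modus ponens considered in this example satisfies criteria p1, p3 and p5, but does not satisfy criteria p2, p4, p6, and p7.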
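The closed-form results of Example 6.2 can be cross-checked by brute force. The sketch below assumes that $\mu_A(x)$ sweeps every level in [0, 1] as x ranges over U (true, for instance, for a continuous normal fuzzy set that reaches 0), so the supremum over x can be replaced by a maximum over a fine grid of levels a; the grid size and the names `b_prime` and `cases` are illustrative.

```python
import math

# Levels a = mu_A(x); assumes mu_A takes every value in [0, 1].
grid = [i / 10000 for i in range(10001)]

def b_prime(premise, b):
    """sup-min inference with Mamdani product: max over a of min[A'(a), a * b]."""
    return max(min(premise(a), a * b) for a in grid)

# Membership of A' expressed as a function of the level a = mu_A(x).
cases = {
    "A": lambda a: a,
    "very A": lambda a: a ** 2,
    "more or less A": lambda a: math.sqrt(a),
    "not A": lambda a: 1 - a,
}

for b in (0.2, 0.5, 0.9):          # b = mu_B(y) for some fixed y
    row = {name: round(b_prime(f, b), 3) for name, f in cases.items()}
    print(b, row, "b/(1+b) =", round(b / (1 + b), 3))
```

For the first three premises the numerical supremum equals $\mu_B(y)$, and for "not A" it approaches $\mu_B(y)/(1 + \mu_B(y))$ up to the grid resolution.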
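Example 6.3. In this example, we still use min for the t-norm but use Zadeh implication (5.25) for the $\mu_{A \to B}(x, y)$ in the generalized modus ponens (6.10). Again, we consider the four typical cases of A' in Example 6.2 and assume that $\sup_{x \in U}[\mu_A(x)] = 1$.
(a) For A' = A, we have
$\mu_{B'}(y) = \sup_{x \in U} \min\{\mu_A(x), \max[\min(\mu_A(x), \mu_B(y)), 1 - \mu_A(x)]\}$ (6.19)
Since $\sup_{x \in U} \mu_A(x) = 1$, the $\sup_{x \in U} \min$ in (6.19) is achieved at the particular $x_0 \in U$ when
$\mu_A(x_0) = \max[\min(\mu_A(x_0), \mu_B(y)), 1 - \mu_A(x_0)]$ (6.20)
If $\mu_A(x_0) < \mu_B(y)$, then (6.20) becomes
$\mu_A(x_0) = \max[\mu_A(x_0), 1 - \mu_A(x_0)]$ (6.21)
which is true when $\mu_A(x_0) \ge 0.5$; thus from (6.19) and (6.20) we have $\mu_{B'}(y) = \mu_A(x_0)$. Since $\sup_{x \in U}[\mu_A(x)] = 1$, it must be true that $\mu_A(x_0) = 1$, but this leads to $\mu_B(y) > \mu_A(x_0) = 1$, which is impossible. Thus, we cannot have $\mu_A(x_0) < \mu_B(y)$. Now consider the only possible case $\mu_A(x_0) \ge \mu_B(y)$. In this case, (6.20) becomes
$\mu_A(x_0) = \max[\mu_B(y), 1 - \mu_A(x_0)]$ (6.22)
If $\mu_B(y) < 1 - \mu_A(x_0)$, then $\mu_A(x_0) = 1 - \mu_A(x_0)$, which is true when $\mu_A(x_0) = 0.5$. If $\mu_B(y) \ge 1 - \mu_A(x_0)$, then from (6.22) we have $\mu_A(x_0) = \mu_B(y) \ge 0.5$. Hence, $\mu_A(x_0) = \max[0.5, \mu_B(y)]$ and we obtain
$\mu_{B'}(y) = \mu_A(x_0) = \max[0.5, \mu_B(y)]$ (6.23)
(b) For A' = very A, we have
$\mu_{B'}(y) = \sup_{x \in U} \min\{\mu_A^2(x), \max[\min(\mu_A(x), \mu_B(y)), 1 - \mu_A(x)]\}$ (6.24)
Similar to the A' = A case, the $\sup_{x \in U} \min$ is achieved at $x_0 \in U$ when
$\mu_A^2(x_0) = \max[\min(\mu_A(x_0), \mu_B(y)), 1 - \mu_A(x_0)]$ (6.25)
If $\mu_A(x_0) < \mu_B(y)$, then (6.25) becomes
$\mu_A^2(x_0) = \max[\mu_A(x_0), 1 - \mu_A(x_0)]$ (6.26)
which is true only when $\mu_A(x_0) = 1$, but this leads to the contradiction $\mu_B(y) > 1$. Thus $\mu_A(x_0) \ge \mu_B(y)$ is the only possible case. If $\mu_A(x_0) \ge \mu_B(y)$, then (6.25) becomes
$\mu_A^2(x_0) = \max[\mu_B(y), 1 - \mu_A(x_0)]$ (6.27)
If $\mu_B(y) < 1 - \mu_A(x_0)$, then $\mu_A^2(x_0) = 1 - \mu_A(x_0)$, which is true when $\mu_A(x_0) = \frac{\sqrt{5} - 1}{2}$. Hence, if $\mu_B(y) < 1 - \mu_A(x_0) = \frac{3 - \sqrt{5}}{2}$, we have $\mu_{B'}(y) = \mu_A^2(x_0) = \frac{3 - \sqrt{5}}{2}$. If $\mu_B(y) \ge 1 - \mu_A(x_0)$, we have $\mu_{B'}(y) = \mu_A^2(x_0) = \mu_B(y) \ge \frac{3 - \sqrt{5}}{2}$. In summary, we obtain
$\mu_{B'}(y) = \max\left[\frac{3 - \sqrt{5}}{2}, \mu_B(y)\right]$ (6.28)
(c) For A' = more or less A, we have
$\mu_{B'}(y) = \sup_{x \in U} \min\{\mu_A^{1/2}(x), \max[\min(\mu_A(x), \mu_B(y)), 1 - \mu_A(x)]\}$ (6.29)
where the $\sup_{x \in U} \min$ is achieved at $x_0 \in U$ when
$\mu_A^{1/2}(x_0) = \max[\min(\mu_A(x_0), \mu_B(y)), 1 - \mu_A(x_0)]$ (6.30)
Similar to the A' = very A case, we can show that $\mu_A(x_0) < \mu_B(y)$ is impossible. For $\mu_A(x_0) \ge \mu_B(y)$, we have
$\mu_A^{1/2}(x_0) = \max[\mu_B(y), 1 - \mu_A(x_0)]$ (6.31)
If $\mu_B(y) < 1 - \mu_A(x_0)$, then $\mu_A^{1/2}(x_0) = 1 - \mu_A(x_0)$, which is true when $\mu_A(x_0) = \frac{3 - \sqrt{5}}{2}$. Thus, if $\mu_B(y) < 1 - \mu_A(x_0) = \frac{\sqrt{5} - 1}{2}$, we have $\mu_{B'}(y) = \mu_A^{1/2}(x_0) = \frac{\sqrt{5} - 1}{2}$. If $\mu_B(y) \ge 1 - \mu_A(x_0)$, we have $\mu_{B'}(y) = \mu_A^{1/2}(x_0) = \mu_B(y) \ge \frac{\sqrt{5} - 1}{2}$. To summarize,
$\mu_{B'}(y) = \mu_A^{1/2}(x_0) = \max\left[\frac{\sqrt{5} - 1}{2}, \mu_B(y)\right]$ (6.32)
(d) Finally, for $A' = \bar{A}$, we have
$\mu_{B'}(y) = \sup_{x \in U} \min\{1 - \mu_A(x), \max[\min(\mu_A(x), \mu_B(y)), 1 - \mu_A(x)]\}$ (6.33)
By inspecting (6.33) we see that if we choose $x_0 \in U$ such that $\mu_A(x_0) = 0$, then $1 - \mu_A(x_0) = 1$ and $\max[\min(\mu_A(x_0), \mu_B(y)), 1 - \mu_A(x_0)] = 1$, thus the $\sup_{x \in U} \min$ is achieved at $x = x_0$. Hence, in this case we have
$\mu_{B'}(y) = 1$ (6.34)
From (6.23), (6.28), (6.32), and (6.34), we see that for all the criteria in Table 6.3, only criterion p6 is satisfied. (This approximate reasoning is truly approximate!)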
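The case analysis of Example 6.3 is easy to get wrong, so a numerical cross-check is useful. The sketch below makes the same assumption as the example plus one more: $\mu_A(x)$ is taken to sweep every level in [0, 1] (including 0, which case (d) requires), so the supremum over x becomes a maximum over a grid of levels a. The names `b_prime_zadeh`, `cases` and `expected` are illustrative, and agreement is only up to the grid resolution.

```python
import math

# Levels a = mu_A(x); assumes mu_A takes every value in [0, 1].
grid = [i / 10000 for i in range(10001)]

def b_prime_zadeh(premise, b):
    """sup-min inference with Zadeh implication max[min(a, b), 1 - a]."""
    return max(min(premise(a), max(min(a, b), 1 - a)) for a in grid)

cases = {
    "A": lambda a: a,
    "very A": lambda a: a ** 2,
    "more or less A": lambda a: math.sqrt(a),
    "not A": lambda a: 1 - a,
}

# Closed forms stated in the example, as functions of b = mu_B(y).
expected = {
    "A": lambda b: max(0.5, b),
    "very A": lambda b: max((3 - math.sqrt(5)) / 2, b),
    "more or less A": lambda b: max((math.sqrt(5) - 1) / 2, b),
    "not A": lambda b: 1.0,
}

max_error = max(
    abs(b_prime_zadeh(cases[name], b) - expected[name](b))
    for name in cases
    for b in (0.1, 0.3, 0.5, 0.7, 0.9)
)
print(max_error)
```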
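6.3.2
Generalized Modus Tollens
Example 6.4. Similar to Example 6.2, we use min for the t-norm and Mamdani's product implication (5.32) for the $\mu_{A \to B}(x, y)$ in the generalized modus tollens (6.11). Consider four cases of B': (a) $B' = \bar{B}$, (b) B' = not very B, (c) B' = not more or less B, and (d) B' = B. We assume that $\sup_{y \in V}[\mu_B(y)] = 1$. If $B' = \bar{B}$, we have from (6.11) that
$\mu_{A'}(x) = \sup_{y \in V} \min[1 - \mu_B(y), \mu_A(x)\mu_B(y)]$ (6.35)
The $\sup_{y \in V} \min$ is achieved at $y_0 \in V$ when $1 - \mu_B(y_0) = \mu_A(x)\mu_B(y_0)$, which implies $\mu_B(y_0) = \frac{1}{1 + \mu_A(x)}$; hence,
$\mu_{A'}(x) = 1 - \mu_B(y_0) = \frac{\mu_A(x)}{1 + \mu_A(x)}$ (6.36)
If B' = not very B, then
$\mu_{A'}(x) = \sup_{y \in V} \min[1 - \mu_B^2(y), \mu_A(x)\mu_B(y)]$ (6.37)
Again, the $\sup_{y \in V} \min$ is achieved at $y_0 \in V$ when $1 - \mu_B^2(y_0) = \mu_A(x)\mu_B(y_0)$, which gives $\mu_B(y_0) = \frac{\sqrt{\mu_A^2(x) + 4} - \mu_A(x)}{2}$. Hence,
$\mu_{A'}(x) = \mu_A(x)\mu_B(y_0) = \frac{\mu_A(x)\left[\sqrt{\mu_A^2(x) + 4} - \mu_A(x)\right]}{2}$ (6.38)
If B' = not more or less B, we have
$\mu_{A'}(x) = \sup_{y \in V} \min[1 - \mu_B^{1/2}(y), \mu_A(x)\mu_B(y)]$ (6.39)
which is achieved at $y_0 \in V$ when $1 - \mu_B^{1/2}(y_0) = \mu_A(x)\mu_B(y_0)$, that is, when $\mu_B^{1/2}(y_0) = \frac{\sqrt{1 + 4\mu_A(x)} - 1}{2\mu_A(x)}$ (for $\mu_A(x) > 0$). Hence,
$\mu_{A'}(x) = 1 - \mu_B^{1/2}(y_0) = \frac{1 + 2\mu_A(x) - \sqrt{1 + 4\mu_A(x)}}{2\mu_A(x)}$ (6.40)
Finally, when B' = B, we have
$\mu_{A'}(x) = \sup_{y \in V} \min[\mu_B(y), \mu_A(x)\mu_B(y)] = \sup_{y \in V}[\mu_A(x)\mu_B(y)] = \mu_A(x)$ (6.41)
From (6.36), (6.38), (6.40) and (6.41) we see that among the seven intuitive criteria in Table 6.4 only criterion t5 is satisfied. □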
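6.3.3
Generalized Hypothetical Syllogism

Example 6.5. Similar to Examples 6.2 and 6.4, we use min for the t-norm and Mamdani product implication for the $\mu_{A \to B}(x, y)$ and $\mu_{B' \to C}(y, z)$ in the generalized hypothetical syllogism (6.12). We assume $\sup_{y \in V}[\mu_B(y)] = 1$ and consider four typical cases of $B'$: (a) $B' = B$, (b) $B'$ = very $B$, (c) $B'$ = more or less $B$, and (d) $B' = \bar{B}$. If $B' = B$, we have from (6.12) that

$$\mu_{A \to C'}(x, z) = \sup_{y \in V} \min[\mu_A(x)\mu_B(y),\ \mu_B(y)\mu_C(z)] = \min[\mu_A(x),\ \mu_C(z)]$$

If $B'$ = very $B$, we have from (6.12) that

$$\mu_{A \to C'}(x, z) = \sup_{y \in V} \min[\mu_A(x)\mu_B(y),\ \mu_B^2(y)\mu_C(z)]$$

If $\mu_A(x) > \mu_C(z)$, then it is always true that $\mu_A(x)\mu_B(y) \ge \mu_B^2(y)\mu_C(z)$; thus, $\mu_{A \to C'}(x, z) = \sup_{y \in V} \mu_B^2(y)\mu_C(z) = \mu_C(z)$. If $\mu_A(x) \le \mu_C(z)$, then the $\sup_{y \in V} \min$ is achieved at $y_0 \in V$ when $\mu_A(x)\mu_B(y_0) = \mu_B^2(y_0)\mu_C(z)$, which gives $\mu_B(y_0) = \frac{\mu_A(x)}{\mu_C(z)}$; hence, in this case $\mu_{A \to C'}(x, z) = \mu_A(x)\mu_B(y_0) = \frac{\mu_A^2(x)}{\mu_C(z)}$. In summary, we obtain

$$\mu_{A \to C'}(x, z) = \begin{cases} \mu_C(z) & \text{if } \mu_A(x) > \mu_C(z) \\ \frac{\mu_A^2(x)}{\mu_C(z)} & \text{if } \mu_A(x) \le \mu_C(z) \end{cases}$$

If $B'$ = more or less $B$, then using the same method as for the $B'$ = very $B$ case, we have

$$\mu_{A \to C'}(x, z) = \begin{cases} \mu_A(x) & \text{if } \mu_A(x) < \mu_C(z) \\ \frac{\mu_C^2(z)}{\mu_A(x)} & \text{if } \mu_A(x) \ge \mu_C(z) \end{cases}$$

Finally, when $B' = \bar{B}$, we have

$$\mu_{A \to C'}(x, z) = \sup_{y \in V} \min[\mu_A(x)\mu_B(y),\ (1 - \mu_B(y))\mu_C(z)]$$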
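where the $\sup_{y \in V} \min$ is achieved at $y_0 \in V$ when $\mu_A(x)\mu_B(y_0) = (1 - \mu_B(y_0))\mu_C(z)$, that is, when $\mu_B(y_0) = \frac{\mu_C(z)}{\mu_A(x) + \mu_C(z)}$. Hence,

$$\mu_{A \to C'}(x, z) = \frac{\mu_A(x)\mu_C(z)}{\mu_A(x) + \mu_C(z)}$$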
6.4
Summary and Further Readings

In this chapter we have demonstrated the following:

• Using truth tables to prove the equivalence of propositions.
• Basic inference rules (Modus Ponens, Modus Tollens, and Hypothetical Syllogism) and their generalizations to fuzzy propositions (Generalized Modus Ponens, Generalized Modus Tollens, and Generalized Hypothetical Syllogism).
• The idea and applications of the compositional rule of inference.
• Determining the resulting membership functions from different implication rules and typical cases of premises.

A comprehensive treatment of many-valued logic was prepared by Rescher [1969]. The generalizations of classical logic principles to fuzzy logic were proposed in Zadeh [1973], Zadeh [1975] and other papers of Zadeh in the 1970s. The compositional rule of inference also can be found in these papers of Zadeh.
6.5
Exercises
Exercise 6.1. Use the truth table method to prove that the following are tautologies: (a) modus ponens (6.3), (b) modus tollens (6.4), and (c) hypothetical syllogism (6.5).

Exercise 6.2. Let $U = \{x_1, x_2, x_3\}$ and $V = \{y_1, y_2\}$, and assume that a fuzzy IF-THEN rule "IF x is A, THEN y is B" is given, where $A = .5/x_1 + 1/x_2 + .6/x_3$ and $B = 1/y_1 + .4/y_2$. Then, given a fact "x is A'," where $A' = .6/x_1 + .9/x_2 + .7/x_3$, use the generalized modus ponens (6.10) to derive a conclusion in the form "y is B'," where the fuzzy relation $A \to B$ is interpreted using:

Exercise 6.3. Repeat Exercise 6.2 with $A = .6/x_1 + \cdots$, $B = \cdots + 1/y_2$, and $A' = .5/x_1 + .9/x_2 + 1/x_3$.

Exercise 6.4. Let $U$, $V$, $A$, and $B$ be the same as in Exercise 6.2. Now given a fact "y is B'," where $B' = .9/y_1 + .7/y_2$, use the generalized modus tollens (6.11) to derive a conclusion "x is A'," where the fuzzy relation $A \to B$ is interpreted using: (a) Lukasiewicz implication (5.24), and (b) Mamdani Product implication (5.32).

Exercise 6.5. Use min for the t-norm and Lukasiewicz implication (5.24) for the $\mu_{A \to B}(x, y)$ in the generalized modus ponens (6.10), and determine the membership function $\mu_{B'}(y)$ in terms of $\mu_B(y)$ for: (a) $A' = A$, (b) $A'$ = very $A$, (c) $A'$ = more or less $A$, and (d) $A' = \bar{A}$.

Exercise 6.6. Use min for the t-norm and Dienes-Rescher implication (5.23) for the $\mu_{A \to B}(x, y)$ in the generalized modus ponens (6.10), and determine the membership function $\mu_{B'}(y)$ in terms of $\mu_B(y)$ for: (a) $A' = A$, (b) $A'$ = very $A$, (c) $A'$ = more or less $A$, and (d) $A' = \bar{A}$.

Exercise 6.7. With min as the t-norm and Mamdani minimum implication (5.31) for the $\mu_{A \to B}(x, y)$ in the generalized modus tollens (6.11), determine the membership function $\mu_{A'}(x)$ in terms of $\mu_A(x)$ for: (a) $B' = \bar{B}$, (b) $B'$ = not very $B$, (c) $B'$ = not more or less $B$, and (d) $B' = B$.

Exercise 6.8. Consider a fuzzy logic based on the standard operations (min, max, $1 - a$). For any two arbitrary propositions, $A$ and $B$, in the logic, assume that we require that the equality

$$A \wedge B = B \vee (A \wedge B) \tag{6.48}$$

holds. Imposing such a requirement means that pairs of truth values of $A$ and $B$ become restricted to a subset of $[0, 1]^2$. Show exactly how they are restricted.
Chapter 7
Fuzzy Rule Base and Fuzzy Inference Engine

Consider the fuzzy system shown in Fig. 1.5, where $U = U_1 \times U_2 \times \cdots \times U_n \subset R^n$ and $V \subset R$. We consider only the multi-input-single-output case, because a multi-output system can always be decomposed into a collection of single-output systems. For example, if we are asked to design a 4-input-3-output fuzzy system, we can first design three 4-input-1-output fuzzy systems separately and then put them together as in Fig. 7.1.

Figure 7.1. A multi-input-multi-output fuzzy system can be decomposed into a collection of multi-input-single-output fuzzy systems.

In this chapter, we will study the details inside the fuzzy rule base and fuzzy inference engine; fuzzifiers and defuzzifiers will be studied in the next chapter.
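7.1
Fuzzy Rule Base
7.1.1
Structure of Fuzzy Rule Base

A fuzzy rule base consists of a set of fuzzy IF-THEN rules of the following form:

$$Ru^{(l)}:\ \text{IF } x_1 \text{ is } A_1^l \text{ and } \cdots \text{ and } x_n \text{ is } A_n^l,\ \text{THEN } y \text{ is } B^l \tag{7.1}$$

where $A_i^l$ and $B^l$ are fuzzy sets in $U_i \subset R$ and $V \subset R$, respectively, and $x = (x_1, x_2, \ldots, x_n)^T \in U$ and $y \in V$ are the input and output (linguistic) variables of the fuzzy system, respectively. Let $M$ be the number of rules in the fuzzy rule base; that is, $l = 1, 2, \ldots, M$ in (7.1). We call the rules in the form of (7.1) canonical fuzzy IF-THEN rules because they include many other types of fuzzy rules and fuzzy propositions as special cases, as shown in the following lemma.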
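Lemma 7.1. The canonical fuzzy IF-THEN rules in the form of (7.1) include the following as special cases:

(a) "Partial rules":

$$\text{IF } x_1 \text{ is } A_1^l \text{ and } \cdots \text{ and } x_m \text{ is } A_m^l,\ \text{THEN } y \text{ is } B^l \tag{7.2}$$

where $m < n$.

(b) "Or rules":

$$\text{IF } x_1 \text{ is } A_1^l \text{ and } \cdots \text{ and } x_m \text{ is } A_m^l \text{ or } x_{m+1} \text{ is } A_{m+1}^l \text{ and } \cdots \text{ and } x_n \text{ is } A_n^l,\ \text{THEN } y \text{ is } B^l \tag{7.3}$$

(c) Single fuzzy statement:

$$y \text{ is } B^l \tag{7.4}$$

(d) "Gradual rules," for example:

$$\text{The smaller the } x, \text{ the bigger the } y \tag{7.5}$$

(e) Non-fuzzy rules.

Proof: The partial rule (7.2) is equivalent to

$$\text{IF } x_1 \text{ is } A_1^l \text{ and } \cdots \text{ and } x_m \text{ is } A_m^l \text{ and } x_{m+1} \text{ is } I \text{ and } \cdots \text{ and } x_n \text{ is } I,\ \text{THEN } y \text{ is } B^l \tag{7.6}$$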
where $I$ is a fuzzy set in $R$ with $\mu_I(x) = 1$ for all $x \in R$. The preceding rule is in the form of (7.1); this proves (a). Based on the intuitive meaning of the logic operator "or," the "Or rule" (7.3) is equivalent to the following two rules:

$$\text{IF } x_1 \text{ is } A_1^l \text{ and } \cdots \text{ and } x_m \text{ is } A_m^l,\ \text{THEN } y \text{ is } B^l \tag{7.7}$$

$$\text{IF } x_{m+1} \text{ is } A_{m+1}^l \text{ and } \cdots \text{ and } x_n \text{ is } A_n^l,\ \text{THEN } y \text{ is } B^l \tag{7.8}$$

From (a) we have that the two rules (7.7) and (7.8) are special cases of (7.1); this proves (b). The fuzzy statement (7.4) is equivalent to

$$\text{IF } x_1 \text{ is } I \text{ and } \cdots \text{ and } x_n \text{ is } I,\ \text{THEN } y \text{ is } B^l \tag{7.9}$$

which is in the form of (7.1); this proves (c). For (d), let $S$ be a fuzzy set representing "smaller," for example, $\mu_S(x) = 1/(1 + \exp(5(x + 2)))$, and $B$ be a fuzzy set representing "bigger," for example, $\mu_B(y) = 1/(1 + \exp(-5(y - 2)))$; then the "Gradual rule" (7.5) is equivalent to

$$\text{IF } x \text{ is } S,\ \text{THEN } y \text{ is } B \tag{7.10}$$

which is a special case of (7.1); this proves (d). Finally, if the membership functions of $A_i^l$ and $B^l$ can only take values 1 or 0, then the rules (7.1) become non-fuzzy rules.

In our fuzzy system framework, human knowledge has to be represented in the form of the fuzzy IF-THEN rules (7.1). That is, we can only utilize human knowledge that can be formulated in terms of the fuzzy IF-THEN rules. Fortunately, Lemma 7.1 ensures that these rules provide a quite general knowledge representation scheme.

7.1.2
Properties of Set of Rules
Because the fuzzy rule base consists of a set of rules, the relationships among these rules, and the rules taken as a whole, pose interesting questions. For example, do the rules cover all the possible situations that the fuzzy system may face? Are there any conflicts among these rules? To answer these sorts of questions, we introduce the following concepts.

Definition 7.1. A set of fuzzy IF-THEN rules is complete if for any $x \in U$, there exists at least one rule in the fuzzy rule base, say rule $Ru^{(l)}$ (in the form of (7.1)), such that

$$\mu_{A_i^l}(x_i) \ne 0 \tag{7.11}$$

for all $i = 1, 2, \ldots, n$.

Intuitively, the completeness of a set of rules means that at any point in the input space there is at least one rule that "fires"; that is, the membership value of the IF part of the rule at this point is non-zero.
Example 7.1. Consider a 2-input-1-output fuzzy system with $U = U_1 \times U_2 = [0, 1] \times [0, 1]$ and $V = [0, 1]$. Define three fuzzy sets $S_1$, $M_1$ and $L_1$ in $U_1$, and two fuzzy sets $S_2$ and $L_2$ in $U_2$, as shown in Fig. 7.2. In order for a fuzzy rule base to be complete, it must contain the following six rules whose IF parts constitute all the possible combinations of $S_1$, $M_1$, $L_1$ with $S_2$, $L_2$:

IF $x_1$ is $S_1$ and $x_2$ is $S_2$, THEN $y$ is $B^1$
IF $x_1$ is $S_1$ and $x_2$ is $L_2$, THEN $y$ is $B^2$
IF $x_1$ is $M_1$ and $x_2$ is $S_2$, THEN $y$ is $B^3$
IF $x_1$ is $M_1$ and $x_2$ is $L_2$, THEN $y$ is $B^4$
IF $x_1$ is $L_1$ and $x_2$ is $S_2$, THEN $y$ is $B^5$
IF $x_1$ is $L_1$ and $x_2$ is $L_2$, THEN $y$ is $B^6$    (7.12)

where $B^l$ ($l = 1, 2, \ldots, 6$) are fuzzy sets in $V$. If any rule in this group is missing, then we can find a point $x^* \in U$ at which the IF part propositions of all the remaining rules have zero membership value. For example, if the second rule in (7.12) is missing, then this $x^* = (0, 1)$ (Why?).
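Completeness in the sense of Definition 7.1 can be checked mechanically on a grid of input points. The sketch below is illustrative only: the triangular shapes chosen for $S_1$, $M_1$, $L_1$, $S_2$, $L_2$ are assumptions standing in for Fig. 7.2, and the helper names `tri` and `uncovered_points` are hypothetical.

```python
from itertools import product

def tri(a, b, c):
    """Triangular membership function with feet a and c and peak b.

    a == b or b == c gives a one-sided (shoulder) triangle.
    """
    def mu(x):
        if x == b:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mu

# Assumed shapes (stand-ins for Fig. 7.2): each set is zero at its neighbour's peak.
U1_SETS = {"S1": tri(0.0, 0.0, 0.5), "M1": tri(0.0, 0.5, 1.0), "L1": tri(0.5, 1.0, 1.0)}
U2_SETS = {"S2": tri(0.0, 0.0, 1.0), "L2": tri(0.0, 1.0, 1.0)}

# IF parts of the six rules in (7.12), in the order listed there.
FULL_RULE_BASE = [(a, b) for a in ("S1", "M1", "L1") for b in ("S2", "L2")]

def uncovered_points(rule_base, steps=20):
    """Grid points of U at which no rule has an all-nonzero IF part (Definition 7.1)."""
    grid = [k / steps for k in range(steps + 1)]
    missing = []
    for x1, x2 in product(grid, grid):
        fires = any(U1_SETS[a](x1) > 0 and U2_SETS[b](x2) > 0 for a, b in rule_base)
        if not fires:
            missing.append((x1, x2))
    return missing
```

Under these assumed shapes, the full rule base leaves no grid point uncovered, while dropping the second rule leaves exactly the point $(0, 1)$ uncovered, which is the point named in the example. A grid check can only detect gaps at sampled points, so it complements rather than replaces an analytical argument.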
From Example 7.1 we see that if we use the triangular membership functions as in Fig. 7.2, the number of rules in a complete fuzzy rule base increases exponentially with the dimension of the input space U. This problem is called the curse of dimensionality and will be further discussed in Chapter 22.
Definition 7.2. A set of fuzzy IF-THEN rules is consistent if there are no rules with the same IF parts but different THEN parts.
For nonfuzzy production rules, consistency is an important requirement because it is difficult to continue the search if there are conflicting rules. For fuzzy rules, however, consistency is not critical because we will see later that if there are conflicting rules, the fuzzy inference engine and the defuzzifier will automatically average them out to produce a compromise result. Of course, it is better to have a consistent fuzzy rule base in the first place.
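Definition 7.2 translates directly into a lookup over the rule base. The following is a small sketch under an assumed representation: each rule is a pair of an IF part (a tuple of fuzzy-set labels) and a THEN-part label. The function name `find_conflicts` and the labels are hypothetical.

```python
def find_conflicts(rule_base):
    """Return the IF parts that occur with more than one distinct THEN part."""
    then_parts = {}
    for if_part, then_part in rule_base:
        then_parts.setdefault(tuple(if_part), set()).add(then_part)
    return {k: v for k, v in then_parts.items() if len(v) > 1}

# Illustrative rule base: the first and third rules share an IF part.
EXAMPLE_RULES = [
    (("S1", "S2"), "B1"),
    (("S1", "L2"), "B2"),
    (("S1", "S2"), "B3"),
]
```

An empty result means the rule base is consistent in the sense of Definition 7.2; here `find_conflicts(EXAMPLE_RULES)` reports the shared IF part together with its two THEN parts.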
Definition 7.3. A set of fuzzy IF-THEN rules is continuous if there do not exist such neighboring rules whose THEN part fuzzy sets have empty intersection.
Intuitively, continuity means that the input-output behavior of the fuzzy system should be smooth. It is difficult to explain this concept in more detail at this point, because we have not yet derived the complete formulas of the fuzzy systems, but it will become clear as we move into Chapter 9.
7.2
Fuzzy Inference Engine

In a fuzzy inference engine, fuzzy logic principles are used to combine the fuzzy IF-THEN rules in the fuzzy rule base into a mapping from a fuzzy set $A'$ in $U$ to a fuzzy set $B'$ in $V$. In Chapter 6 we learned that a fuzzy IF-THEN rule is interpreted as a fuzzy relation in the input-output product space $U \times V$, and we proposed a number of implications that specify the fuzzy relation. If the fuzzy rule base consists of only a single rule, then the generalized modus ponens (6.10) specifies the mapping from fuzzy set $A'$ in $U$ to fuzzy set $B'$ in $V$. Because any practical fuzzy rule base consists of more than one rule, the key question here is how to infer with a set of rules. There are two ways to infer with a set of rules: composition based inference and individual-rule based inference, which we will discuss next.

7.2.1
Composition Based Inference
In composition based inference, all rules in the fuzzy rule base are combined into a single fuzzy relation in $U \times V$, which is then viewed as a single fuzzy IF-THEN rule. So the key question is how to perform this combination. We should first understand what a set of rules means intuitively, and then we can use appropriate logic operators to combine them.

There are two opposite arguments for what a set of rules should mean. The first one views the rules as independent conditional statements. If we accept this point of view, then a reasonable operator for combining the rules is union. The second one views the rules as strongly coupled conditional statements such that the conditions of all the rules must be satisfied in order for the whole set of rules to have an impact. If we adopt this view, then we should use the operator intersection to combine the rules. The second view may look strange, but for some implications, for example, the Godel implication (5.26), it makes sense, as we will see very soon in this section. We now show the details of these two schemes.

Let $Ru^{(l)}$ be a fuzzy relation in $U \times V$, which represents the fuzzy IF-THEN rule (7.1); that is, $Ru^{(l)} = A_1^l \times \cdots \times A_n^l \to B^l$. From Chapter 5 we know that $A_1^l \times \cdots \times A_n^l$ is a fuzzy relation in $U = U_1 \times \cdots \times U_n$ defined by

$$\mu_{A_1^l \times \cdots \times A_n^l}(x_1, \ldots, x_n) = \mu_{A_1^l}(x_1) * \cdots * \mu_{A_n^l}(x_n) \tag{7.13}$$

where $*$ represents any t-norm operator. The implication $\to$ in $Ru^{(l)}$ is defined according to various implications (5.23)-(5.26), (5.31), and (5.32). If we accept the first view of a set of rules, then the $M$ rules in the form of (7.1) are interpreted as a single fuzzy relation $Q_M$ in $U \times V$ defined by

$$Q_M = \bigcup_{l=1}^{M} Ru^{(l)} \tag{7.14}$$

This combination is called the Mamdani combination. If we use the symbol $\dot{+}$ to represent the s-norm, then (7.14) can be rewritten as

$$\mu_{Q_M}(x, y) = \mu_{Ru^{(1)}}(x, y) \dot{+} \cdots \dot{+} \mu_{Ru^{(M)}}(x, y) \tag{7.15}$$

For the second view of a set of rules, the $M$ fuzzy IF-THEN rules of (7.1) are interpreted as a fuzzy relation $Q_G$ in $U \times V$, which is defined as

$$Q_G = \bigcap_{l=1}^{M} Ru^{(l)} \tag{7.16}$$

or equivalently,

$$\mu_{Q_G}(x, y) = \mu_{Ru^{(1)}}(x, y) * \cdots * \mu_{Ru^{(M)}}(x, y) \tag{7.17}$$

where $*$ denotes a t-norm. This combination is called the Godel combination.

Let $A'$ be an arbitrary fuzzy set in $U$ and be the input to the fuzzy inference engine. Then, by viewing $Q_M$ or $Q_G$ as a single fuzzy IF-THEN rule and using the generalized modus ponens (6.10), we obtain the output of the fuzzy inference engine as

$$\mu_{B'}(y) = \sup_{x \in U} t[\mu_{A'}(x),\ \mu_{Q_M}(x, y)] \tag{7.18}$$

if we use the Mamdani combination, or as

$$\mu_{B'}(y) = \sup_{x \in U} t[\mu_{A'}(x),\ \mu_{Q_G}(x, y)] \tag{7.19}$$

if we use the Godel combination. In summary, the computational procedure of the composition based inference is given as follows:
Step 1: For the $M$ fuzzy IF-THEN rules in the form of (7.1), determine the membership functions $\mu_{A_1^l \times \cdots \times A_n^l}(x_1, \ldots, x_n)$ for $l = 1, 2, \ldots, M$ according to (7.13).

Step 2: View $A_1^l \times \cdots \times A_n^l$ as the $FP_1$ and $B^l$ as the $FP_2$ in the implications (5.23)-(5.26), (5.31) and (5.32), and determine $\mu_{Ru^{(l)}}(x_1, \ldots, x_n, y) = \mu_{A_1^l \times \cdots \times A_n^l \to B^l}(x_1, \ldots, x_n, y)$ for $l = 1, 2, \ldots, M$ according to any one of these implications.

Step 3: Determine $\mu_{Q_M}(x, y)$ or $\mu_{Q_G}(x, y)$ according to (7.15) or (7.17).

Step 4: For given input $A'$, the fuzzy inference engine gives output $B'$ according to (7.18) or (7.19).
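The four steps can be sketched on sampled universes. The code below is one possible instance, not the general procedure: it assumes min for every t-norm, max for the s-norm, Mamdani's minimum implication (5.31), and finite grids in place of $U$ and $V$ (so the sup becomes a max over grid points). The names `rule_relation`, `composition_inference`, `small` and `large` are hypothetical.

```python
def rule_relation(rule, x, y):
    """Steps 1-2: min over the IF part as in (7.13), then Mamdani minimum implication."""
    if_parts, then_part = rule
    firing = min(mu(xi) for mu, xi in zip(if_parts, x))
    return min(firing, then_part(y))

def composition_inference(rules, mu_a_prime, xs, ys, combination="mamdani"):
    """Steps 3-4 on grids xs (tuples in U) and ys (points in V)."""
    # Step 3: union with max as in (7.15), or intersection with min as in (7.17).
    combine = max if combination == "mamdani" else min
    output = []
    for y in ys:
        # Step 4: sup over sampled x of min[mu_A'(x), mu_Q(x, y)], as in (7.18)/(7.19).
        output.append(max(
            min(mu_a_prime(x), combine(rule_relation(r, x, y) for r in rules))
            for x in xs
        ))
    return output

def small(v):
    return 1.0 - v

def large(v):
    return v

# Two illustrative single-input rules: "IF x is small THEN y is small" and
# "IF x is large THEN y is large".
RULES = [([small], small), ([large], large)]
GRID = [k / 10 for k in range(11)]
XS = [(v,) for v in GRID]

def singleton_at_03(x):
    return 1.0 if x == (0.3,) else 0.0

B_PRIME = composition_inference(RULES, singleton_at_03, XS, GRID)
```

With the singleton input at $x = 0.3$, only that grid point contributes to the sup, so `B_PRIME` is the pointwise max of the two clipped THEN parts. Passing `combination="godel"` switches Step 3 to the intersection (7.17).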
7.2.2
Individual-Rule Based Inference
In individual-rule based inference, each rule in the fuzzy rule base determines an output fuzzy set and the output of the whole fuzzy inference engine is the combination of the M individual fuzzy sets. The combination can be taken either by union or by intersection. The computational procedure of the individual-rule based inference is summarized as follows:
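Steps 1 and 2: Same as Steps 1 and 2 for the composition based inference.

Step 3: For given input fuzzy set $A'$ in $U$, compute the output fuzzy set $B_l'$ in $V$ for each individual rule $Ru^{(l)}$ according to the generalized modus ponens (6.10), that is,

$$\mu_{B_l'}(y) = \sup_{x \in U} t[\mu_{A'}(x),\ \mu_{Ru^{(l)}}(x, y)] \tag{7.20}$$

for $l = 1, 2, \ldots, M$.

Step 4: The output of the fuzzy inference engine is the combination of the $M$ fuzzy sets $\{B_1', \ldots, B_M'\}$ either by union, that is,

$$\mu_{B'}(y) = \mu_{B_1'}(y) \dot{+} \cdots \dot{+} \mu_{B_M'}(y) \tag{7.21}$$

or by intersection, that is,

$$\mu_{B'}(y) = \mu_{B_1'}(y) * \cdots * \mu_{B_M'}(y) \tag{7.22}$$

where $\dot{+}$ and $*$ denote s-norm and t-norm operators, respectively.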
7.2.3
The Details of Some Inference Engines

From the previous two subsections we see that there are a variety of choices in the fuzzy inference engine. Specifically, we have the following alternatives: (i) composition based inference or individual-rule based inference, and within the composition based inference, Mamdani inference or Godel inference, (ii) Dienes-Rescher implication (5.23), Lukasiewicz implication (5.24), Zadeh implication (5.25), Godel implication (5.26), or Mamdani implications (5.31)-(5.32), and (iii) different operations for the t-norms and s-norms in the various formulas. So a natural question is: how do we select from these alternatives? In general, the following three criteria should be considered:

• Intuitive appeal: The choice should make sense from an intuitive point of view. For example, if a set of rules is given by a human expert who believes that these rules are independent of each other, then they should be combined by union.
• Computational efficiency: The choice should result in a formula relating $B'$ with $A'$ that is simple to compute.
• Special properties: Some choices may result in an inference engine that has special properties. If these properties are desirable, then we should make this choice.

We now show the detailed formulas of a number of fuzzy inference engines that are commonly used in fuzzy systems and fuzzy control.
• Product Inference Engine: In the product inference engine, we use: (i) individual-rule based inference with union combination (7.21), (ii) Mamdani's product implication (5.32), and (iii) algebraic product for all the t-norm operators and max for all the s-norm operators. Specifically, from (7.20), (7.21), (5.32), and (7.13), we obtain the product inference engine as

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\sup_{x \in U}\left(\mu_{A'}(x) \prod_{i=1}^{n} \mu_{A_i^l}(x_i)\, \mu_{B^l}(y)\right)\right] \tag{7.23}$$

That is, given fuzzy set $A'$ in $U$, the product inference engine gives the fuzzy set $B'$ in $V$ according to (7.23).

• Minimum Inference Engine: In the minimum inference engine, we use: (i) individual-rule based inference with union combination (7.21), (ii) Mamdani's minimum implication (5.31), and (iii) min for all the t-norm operators and max for all the s-norm operators. Specifically, from (7.20), (7.21), (5.31), and (7.13) we have

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\sup_{x \in U} \min\left(\mu_{A'}(x),\ \mu_{A_1^l}(x_1), \ldots, \mu_{A_n^l}(x_n),\ \mu_{B^l}(y)\right)\right] \tag{7.24}$$
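That is, given fuzzy set $A'$ in $U$, the minimum inference engine gives the fuzzy set $B'$ in $V$ according to (7.24).

The product inference engine and the minimum inference engine are the most commonly used fuzzy inference engines in fuzzy systems and fuzzy control. Their main advantage is computational simplicity; this is especially true for the product inference engine (7.23). Also, they are intuitively appealing for many practical problems, especially for fuzzy control. We now show some properties of the product and minimum inference engines.

Lemma 7.2. The product inference engine is unchanged if we replace "individual-rule based inference with union combination (7.21)" by "composition based inference with Mamdani combination (7.15)."

Proof: From (7.15) and (7.18) we have

$$\mu_{B'}(y) = \sup_{x \in U} t\left[\mu_{A'}(x),\ \max_{l=1}^{M} \mu_{Ru^{(l)}}(x, y)\right] \tag{7.25}$$

Using (5.32) and (7.13) with the algebraic product for all the t-norms, we can rewrite (7.25) as

$$\mu_{B'}(y) = \sup_{x \in U} \max_{l=1}^{M}\left[\mu_{A'}(x) \prod_{i=1}^{n} \mu_{A_i^l}(x_i)\, \mu_{B^l}(y)\right] \tag{7.26}$$

Because the $\max_{l=1}^{M}$ and $\sup_{x \in U}$ are interchangeable, (7.26) is equivalent to (7.23).

Lemma 7.3. If the fuzzy set $A'$ is a fuzzy singleton, that is, if

$$\mu_{A'}(x) = \begin{cases} 1 & \text{if } x = x^* \\ 0 & \text{otherwise} \end{cases} \tag{7.27}$$

where $x^*$ is some point in $U$, then the product inference engine is simplified to

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\prod_{i=1}^{n} \mu_{A_i^l}(x_i^*)\, \mu_{B^l}(y)\right] \tag{7.28}$$

and the minimum inference engine is simplified to

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\min\left(\mu_{A_1^l}(x_1^*), \ldots, \mu_{A_n^l}(x_n^*),\ \mu_{B^l}(y)\right)\right] \tag{7.29}$$

Proof: Substituting (7.27) into (7.23) and (7.24), we see that the $\sup_{x \in U}$ is achieved at $x = x^*$. Hence, (7.23) reduces to (7.28) and (7.24) reduces to (7.29).

Lemma 7.2 shows that although the individual-rule based and composition based inferences are conceptually different, they produce the same fuzzy inference engine in certain important cases. Lemma 7.3 indicates that the computation within the fuzzy inference engine can be greatly simplified if the input is a fuzzy singleton (the most difficult computation in (7.23) and (7.24) is the $\sup_{x \in U}$, which disappears in (7.28) and (7.29)).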
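Both lemmas can be checked numerically on a sampled input space. The sketch below is illustrative: the two rules, the grid standing in for $U$, and the non-singleton input `bump` are all assumptions, and the function names are hypothetical. The three functions mirror (7.23), (7.26) and (7.28), with the sup replaced by a max over grid points.

```python
import math

def firing_product(if_parts, x):
    """Algebraic-product t-norm over the IF part, as in (7.13)."""
    return math.prod(mu(xi) for mu, xi in zip(if_parts, x))

def product_engine_individual(rules, mu_a_prime, xs, y):
    """(7.23): max over rules of the sup over sampled x."""
    return max(
        max(mu_a_prime(x) * firing_product(ifs, x) * then(y) for x in xs)
        for ifs, then in rules
    )

def product_engine_composition(rules, mu_a_prime, xs, y):
    """(7.26): sup over sampled x of the max over rules."""
    return max(
        max(mu_a_prime(x) * firing_product(ifs, x) * then(y) for ifs, then in rules)
        for x in xs
    )

def product_engine_singleton(rules, x_star, y):
    """(7.28): no sup is needed when A' is a singleton at x_star."""
    return max(firing_product(ifs, x_star) * then(y) for ifs, then in rules)

def small(v):
    return 1.0 - v

def large(v):
    return v

RULES = [([small, small], small), ([large, large], large)]
GRID = [k / 4 for k in range(5)]
XS = [(u, v) for u in GRID for v in GRID]
X_STAR = (0.25, 0.75)

def bump(x):
    """An assumed non-singleton input fuzzy set centred at (0.5, 0.5)."""
    return max(0.0, 1.0 - abs(x[0] - 0.5) - abs(x[1] - 0.5))

def singleton(x):
    return 1.0 if x == X_STAR else 0.0
```

For any `y`, the individual-rule and composition forms agree for the input `bump` (Lemma 7.2), and for the input `singleton` the full formula agrees with the shortcut `product_engine_singleton`, which needs one evaluation per rule instead of a search over the whole grid (Lemma 7.3).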
A disadvantage of the product and minimum inference engines is that if at some $x \in U$ the $\mu_{A_i^l}(x_i)$'s are very small, then the $\mu_{B'}(y)$ obtained from (7.23) or (7.24) will be very small. This may cause problems in implementation. The following three fuzzy inference engines overcome this disadvantage.

• Lukasiewicz Inference Engine: In the Lukasiewicz inference engine, we use: (i) individual-rule based inference with intersection combination (7.22), (ii) Lukasiewicz implication (5.24), and (iii) min for all the t-norm operators. Specifically, from (7.22), (7.20), (5.24) and (7.13) we obtain

$$\mu_{B'}(y) = \min_{l=1}^{M}\left\{\sup_{x \in U} \min\left[\mu_{A'}(x),\ 1 - \min_{i=1}^{n}\left(\mu_{A_i^l}(x_i)\right) + \mu_{B^l}(y)\right]\right\} \tag{7.30}$$

That is, for given fuzzy set $A'$ in $U$, the Lukasiewicz inference engine gives the fuzzy set $B'$ in $V$ according to (7.30).

• Zadeh Inference Engine: In the Zadeh inference engine, we use: (i) individual-rule based inference with intersection combination (7.22), (ii) Zadeh implication (5.25), and (iii) min for all the t-norm operators. Specifically, from (7.22), (7.20), (5.25), and (7.13) we obtain

$$\mu_{B'}(y) = \min_{l=1}^{M}\left\{\sup_{x \in U} \min\left[\mu_{A'}(x),\ \max\left(\min\left(\mu_{A_1^l}(x_1), \ldots, \mu_{A_n^l}(x_n),\ \mu_{B^l}(y)\right),\ 1 - \min_{i=1}^{n}\left(\mu_{A_i^l}(x_i)\right)\right)\right]\right\} \tag{7.31}$$

• Dienes-Rescher Inference Engine: In the Dienes-Rescher inference engine, we use the same operations as in the Zadeh inference engine, except that we replace the Zadeh implication (5.25) with the Dienes-Rescher implication (5.23). Specifically, we obtain from (7.22), (7.20), (5.23), and (7.13) that

$$\mu_{B'}(y) = \min_{l=1}^{M}\left\{\sup_{x \in U} \min\left[\mu_{A'}(x),\ \max\left(1 - \min_{i=1}^{n}\left(\mu_{A_i^l}(x_i)\right),\ \mu_{B^l}(y)\right)\right]\right\} \tag{7.32}$$
Similar to Lemma 7.3, we have the following results for the Lukasiewicz, Zadeh and Dienes-Rescher inference engines.
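Lemma 7.4: If $A'$ is a fuzzy singleton as defined by (7.27), then the Lukasiewicz, Zadeh and Dienes-Rescher inference engines are simplified to

$$\mu_{B'}(y) = \min_{l=1}^{M}\left\{\min\left[1,\ 1 - \min_{i=1}^{n}\left(\mu_{A_i^l}(x_i^*)\right) + \mu_{B^l}(y)\right]\right\} \tag{7.33}$$

$$\mu_{B'}(y) = \min_{l=1}^{M}\left\{\max\left[\min\left(\mu_{A_1^l}(x_1^*), \ldots, \mu_{A_n^l}(x_n^*),\ \mu_{B^l}(y)\right),\ 1 - \min_{i=1}^{n}\left(\mu_{A_i^l}(x_i^*)\right)\right]\right\} \tag{7.34}$$

$$\mu_{B'}(y) = \min_{l=1}^{M}\left\{\max\left[1 - \min_{i=1}^{n}\left(\mu_{A_i^l}(x_i^*)\right),\ \mu_{B^l}(y)\right]\right\} \tag{7.35}$$

respectively.

Proof: Using the same arguments as in the proof of Lemma 7.3, we can prove this lemma.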
We have now proposed five fuzzy inference engines. Next, we compare them through two examples.

Example 7.2: Suppose that a fuzzy rule base consists of only one rule

$$\text{IF } x_1 \text{ is } A_1 \text{ and } \cdots \text{ and } x_n \text{ is } A_n,\ \text{THEN } y \text{ is } B \tag{7.36}$$

Assume that $A'$ is a fuzzy singleton defined by (7.27). We would like to plot the $\mu_{B'}(y)$ obtained from the five fuzzy inference engines. Let $B_P'$, $B_M'$, $B_L'$, $B_Z'$ and $B_D'$ be the fuzzy set $B'$ using the product, minimum, Lukasiewicz, Zadeh and Dienes-Rescher inference engines, respectively, and let $\min[\mu_{A_1}(x_1^*), \ldots, \mu_{A_n}(x_n^*)] = \mu_{A_p}(x_p^*)$ and $\prod_{i=1}^{n} \mu_{A_i}(x_i^*) = \mu_A(x^*)$. Then from (7.28), (7.29), and (7.33)-(7.35) we have

$$\mu_{B_P'}(y) = \mu_A(x^*)\,\mu_B(y) \tag{7.38}$$

$$\mu_{B_M'}(y) = \min[\mu_{A_p}(x_p^*),\ \mu_B(y)] \tag{7.39}$$

$$\mu_{B_L'}(y) = \min[1,\ 1 - \mu_{A_p}(x_p^*) + \mu_B(y)] \tag{7.40}$$

$$\mu_{B_Z'}(y) = \max[\min(\mu_{A_p}(x_p^*),\ \mu_B(y)),\ 1 - \mu_{A_p}(x_p^*)] \tag{7.41}$$

$$\mu_{B_D'}(y) = \max[1 - \mu_{A_p}(x_p^*),\ \mu_B(y)] \tag{7.42}$$

For the case of $\mu_{A_p}(x_p^*) \ge 0.5$, $\mu_{B_L'}(y)$, $\mu_{B_Z'}(y)$ and $\mu_{B_D'}(y)$ are plotted in Fig. 7.3, and $\mu_{B_P'}(y)$ and $\mu_{B_M'}(y)$ are plotted in Fig. 7.4. For the case of $\mu_{A_p}(x_p^*) < 0.5$, $\mu_{B_P'}(y)$ and $\mu_{B_M'}(y)$ are plotted in Fig. 7.5, and $\mu_{B_L'}(y)$, $\mu_{B_Z'}(y)$ and $\mu_{B_D'}(y)$ are plotted in Fig. 7.6. From Figs. 7.3-7.6 we have the following observations: (i) if the membership value of the IF part at point $x^*$ is small (say, $\mu_{A_p}(x_p^*) < 0.5$), then the product and minimum inference engines give very small membership values, whereas the Lukasiewicz, Zadeh and Dienes-Rescher inference engines give very large membership values; (ii) the product and minimum inference engines are similar, while the Lukasiewicz, Zadeh and Dienes-Rescher inference engines are similar, but there are big differences between these two groups; and (iii) the Lukasiewicz inference engine gives the largest output membership function, while the product inference engine gives the smallest output membership function in all the cases; the other three inference engines are in between.
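Observation (iii) can be checked pointwise from the singleton formulas (7.28), (7.29) and (7.33)-(7.35) for a single rule. The sketch below is an illustration rather than part of the example: `a_min` stands for $\mu_{A_p}(x_p^*)$, `a_prod` for $\mu_A(x^*)$, `b` for $\mu_B(y)$, and the function name `five_engines` is hypothetical.

```python
def five_engines(a_min, a_prod, b):
    """Single-rule, singleton-input outputs of the five inference engines.

    a_min  -- min over i of the IF-part memberships at x*
    a_prod -- product over i of the IF-part memberships at x*
    b      -- membership of y in the THEN-part fuzzy set B
    """
    return {
        "product": a_prod * b,
        "minimum": min(a_min, b),
        "lukasiewicz": min(1.0, 1.0 - a_min + b),
        "zadeh": max(min(a_min, b), 1.0 - a_min),
        "dienes_rescher": max(1.0 - a_min, b),
    }

def ordering_holds(a1, a2, b, eps=1e-12):
    """True if Lukasiewicz is the largest and product the smallest output."""
    out = five_engines(min(a1, a2), a1 * a2, b)
    largest = all(out["lukasiewicz"] >= v - eps for v in out.values())
    smallest = all(out["product"] <= v + eps for v in out.values())
    return largest and smallest
```

Sweeping `ordering_holds` over a grid of membership values for two inputs gives `True` everywhere, which agrees with the observation. For example, `five_engines(0.3, 0.09, 0.5)` shows the product and minimum outputs well below 0.5 while the other three are at or above 0.7, matching observation (i).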
Figure 7.3. Output membership functions using the Lukasiewicz, Zadeh and Dienes-Rescher inference engines for the $\mu_{A_p}(x_p^*) \ge 0.5$ case.
Example 7.3: In this example, we consider that the fuzzy system contains two rules: one is the same as (7.36), and the other is

(7.43)

Assume again that $A'$ is the fuzzy singleton defined by (7.27). We would like to plot the $\mu_{B'}(y)$ using the product inference engine, that is, $\mu_{B_P'}(y)$. Fig. 7.7 shows the $\mu_{B_P'}(y)$, where $\mu_A(x^*) = \prod_{i=1}^{n} \mu_{A_i}(x_i^*)$ and $\mu_C(x^*) = \prod_{i=1}^{n} \mu_{C_i}(x_i^*)$.
Figure 7.4. Output membership functions using the product and minimum inference engines for the $\mu_{A_p}(x_p^*) \ge 0.5$ case.
Figure 7.5. Output membership functions using the product and minimum inference engines for the $\mu_{A_p}(x_p^*) < 0.5$ case.
Figure 7.6. Output membership functions using the Lukasiewicz, Zadeh and Dienes-Rescher inference engines for the $\mu_{A_p}(x_p^*) < 0.5$ case.
Figure 7.7. Output membership function using the product inference engine for the case of two rules.
7.3
Summary and Further Readings

In this chapter we have demonstrated the following:

• The structure of the canonical fuzzy IF-THEN rules and the criteria for evaluating a set of rules.
• The computational procedures for the composition based and individual-rule based inferences.
• The detailed formulas of the five specific fuzzy inference engines: product, minimum, Lukasiewicz, Zadeh and Dienes-Rescher inference engines, and their computations for particular examples.

Lee [1990] provided a very good survey on fuzzy rule bases and fuzzy inference engines. This paper gives intuitive analyses for various issues associated with fuzzy rule bases and fuzzy inference engines. A mathematical analysis of fuzzy inference engines, similar to the approach in this chapter, was prepared by Driankov, Hellendoorn and Reinfrank [1993].
7.4
Exercises
Exercise 7.1. If the third and sixth rules in (7.12) (Example 7.1) are missing, at what points do the IF part propositions of all the remaining rules have zero membership values?

Exercise 7.2. Give an example of fuzzy sets $B^1, \ldots, B^6$ such that the set of the six rules (7.12) is continuous.

Exercise 7.3. Suppose that a fuzzy rule base consists of only one rule (7.36) with

$$\mu_B(y) = \exp(-y^2) \tag{7.45}$$

Let the input $A'$ to the fuzzy inference engine be the fuzzy singleton (7.27). Plot the output membership functions $\mu_{B'}(y)$ using: (a) product, (b) minimum, (c) Lukasiewicz, (d) Zadeh, and (e) Dienes-Rescher inference engines.

Exercise 7.4. Consider Example 7.3 and plot the $\mu_{B'}(y)$ using: (a) Lukasiewicz inference engine, and (b) Zadeh inference engine.

Exercise 7.5. Use the Godel implication to propose a so-called Godel inference engine.
Chapter 8
We learned from Chapter 7 that the fuzzy inference engine combines the rules in the fuzzy rule base into a mapping from fuzzy set $A'$ in $U$ to fuzzy set $B'$ in $V$. Because in most applications the input and output of the fuzzy system are real-valued numbers, we must construct interfaces between the fuzzy inference engine and the environment. The interfaces are the fuzzifier and defuzzifier in Fig. 1.5.
8.1
Fuzzifiers
The fuzzifier is defined as a mapping from a real-valued point $x^* \in U \subset R^n$ to a fuzzy set $A'$ in $U$. What are the criteria in designing the fuzzifier? First, the fuzzifier should consider the fact that the input is at the crisp point $x^*$, that is, the fuzzy set $A'$ should have large membership value at $x^*$. Second, if the input to the fuzzy system is corrupted by noise, then it is desirable that the fuzzifier should help to suppress the noise. Third, the fuzzifier should help to simplify the computations involved in the fuzzy inference engine. From (7.23), (7.24) and (7.30)-(7.32) we see that the most complicated computation in the fuzzy inference engine is the $\sup_{x \in U}$; therefore our objective is to simplify the computations involving $\sup_{x \in U}$. We now propose three fuzzifiers:

• Singleton fuzzifier: The singleton fuzzifier maps a real-valued point $x^* \in U$ into a fuzzy singleton $A'$ in $U$, which has membership value 1 at $x^*$ and 0 at all other points in $U$; that is,

$$\mu_{A'}(x) = \begin{cases} 1 & \text{if } x = x^* \\ 0 & \text{otherwise} \end{cases} \tag{8.1}$$
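• Gaussian fuzzifier: The Gaussian fuzzifier maps $x^* \in U$ into fuzzy set $A'$ in $U$, which has the following Gaussian membership function:

$$\mu_{A'}(x) = e^{-\left(\frac{x_1 - x_1^*}{a_1}\right)^2} * \cdots * e^{-\left(\frac{x_n - x_n^*}{a_n}\right)^2} \tag{8.2}$$

where $a_i$ are positive parameters and the t-norm $*$ is usually chosen as algebraic product or min.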
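• Triangular fuzzifier: The triangular fuzzifier maps $x^* \in U$ into fuzzy set $A'$ in $U$, which has the following triangular membership function

$$\mu_{A'}(x) = \begin{cases} \left(1 - \frac{|x_1 - x_1^*|}{b_1}\right) * \cdots * \left(1 - \frac{|x_n - x_n^*|}{b_n}\right) & \text{if } |x_i - x_i^*| \le b_i,\ i = 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases} \tag{8.3}$$

where $b_i$ are positive parameters and the t-norm $*$ is usually chosen as algebraic product or min.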
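The three fuzzifiers can be written as functions that take the crisp point $x^*$ and return the membership function of $A'$. The sketch below assumes the algebraic product for the t-norm $*$ in (8.2) and (8.3); the function names are hypothetical.

```python
import math

def singleton_fuzzifier(x_star):
    """(8.1): membership 1 at x_star and 0 elsewhere."""
    target = tuple(x_star)
    def mu(x):
        return 1.0 if tuple(x) == target else 0.0
    return mu

def gaussian_fuzzifier(x_star, a):
    """(8.2) with algebraic product; a[i] > 0 sets the spread in coordinate i."""
    def mu(x):
        return math.prod(
            math.exp(-((xi - xs) / ai) ** 2) for xi, xs, ai in zip(x, x_star, a)
        )
    return mu

def triangular_fuzzifier(x_star, b):
    """(8.3) with algebraic product; b[i] > 0 is the half-width in coordinate i."""
    def mu(x):
        if any(abs(xi - xs) > bi for xi, xs, bi in zip(x, x_star, b)):
            return 0.0
        return math.prod(1.0 - abs(xi - xs) / bi for xi, xs, bi in zip(x, x_star, b))
    return mu
```

For example, `gaussian_fuzzifier((1.0, 5.0), (2.0, 1.0))` returns a function that equals 1 at `(1.0, 5.0)` and decays smoothly away from it, whereas the singleton version drops to 0 at any other point.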
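From (8.1)-(8.3) we see that all three fuzzifiers satisfy $\mu_{A'}(x^*) = 1$; that is, they satisfy the first criterion mentioned before. From Lemmas 7.3 and 7.4 we see that the singleton fuzzifier greatly simplifies the computations involved in the fuzzy inference engine. Next, we show that if the fuzzy sets $A_i^l$ in the rules (7.1) have Gaussian or triangular membership functions, then the Gaussian or triangular fuzzifier also will simplify some fuzzy inference engines.

Lemma 8.1. Suppose that the fuzzy rule base consists of $M$ rules in the form of (7.1) and that

$$\mu_{A_i^l}(x_i) = e^{-\left(\frac{x_i - \bar{x}_i^l}{\sigma_i^l}\right)^2} \tag{8.4}$$

where $\bar{x}_i^l$ and $\sigma_i^l$ are constant parameters, $i = 1, 2, \ldots, n$ and $l = 1, 2, \ldots, M$. If we use the Gaussian fuzzifier (8.2), then:

(a) If we choose algebraic product for the t-norm $*$ in (8.2), the product inference engine (7.23) is simplified to

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\prod_{i=1}^{n} e^{-\left(\frac{x_{iP}^l - \bar{x}_i^l}{\sigma_i^l}\right)^2} e^{-\left(\frac{x_{iP}^l - x_i^*}{a_i}\right)^2} \mu_{B^l}(y)\right] \tag{8.5}$$

where

$$x_{iP}^l = \frac{a_i^2\, \bar{x}_i^l + (\sigma_i^l)^2\, x_i^*}{a_i^2 + (\sigma_i^l)^2} \tag{8.6}$$

(b) If we choose min for the t-norm $*$ in (8.2), the minimum inference engine (7.24) is simplified to

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\min\left(e^{-\left(\frac{x_{1M}^l - \bar{x}_1^l}{\sigma_1^l}\right)^2}, \ldots, e^{-\left(\frac{x_{nM}^l - \bar{x}_n^l}{\sigma_n^l}\right)^2},\ \mu_{B^l}(y)\right)\right] \tag{8.7}$$

where

$$x_{iM}^l = \frac{a_i\, \bar{x}_i^l + \sigma_i^l\, x_i^*}{a_i + \sigma_i^l} \tag{8.8}$$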
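Proof: (a) Substituting (8.2) and (8.4) into (7.23) and noticing that the $*$ is an algebraic product, we obtain

$$\mu_{B'}(y) = \max_{l=1}^{M}\left[\sup_{x \in U} \prod_{i=1}^{n} e^{-\left(\frac{x_i - \bar{x}_i^l}{\sigma_i^l}\right)^2} e^{-\left(\frac{x_i - x_i^*}{a_i}\right)^2} \mu_{B^l}(y)\right] = \max_{l=1}^{M}\left[\prod_{i=1}^{n} \sup_{x_i \in U_i} e^{-\left(\frac{x_i - \bar{x}_i^l}{\sigma_i^l}\right)^2 - \left(\frac{x_i - x_i^*}{a_i}\right)^2} \mu_{B^l}(y)\right] \tag{8.9}$$

Since

$$-\left(\frac{x_i - \bar{x}_i^l}{\sigma_i^l}\right)^2 - \left(\frac{x_i - x_i^*}{a_i}\right)^2 = -k_1\left(x_i - \frac{a_i^2\, \bar{x}_i^l + (\sigma_i^l)^2\, x_i^*}{a_i^2 + (\sigma_i^l)^2}\right)^2 + k_2 \tag{8.10}$$

where $k_1$ and $k_2$ are not functions of $x_i$, the $\sup_{x \in U}$ in (8.9) is achieved at $x_{iP}^l \in U$ ($i = 1, 2, \ldots, n$), which is exactly (8.6).

(b) Substituting (8.2) and (8.4) into (7.24) and noticing that the $*$ is min, we obtain

$$\mu_{B'}(y) = \max_{l=1}^{M}\left\{\min\left[\sup_{x_1 \in U_1} \min\left(e^{-\left(\frac{x_1 - x_1^*}{a_1}\right)^2},\ e^{-\left(\frac{x_1 - \bar{x}_1^l}{\sigma_1^l}\right)^2}\right), \ldots, \sup_{x_n \in U_n} \min\left(e^{-\left(\frac{x_n - x_n^*}{a_n}\right)^2},\ e^{-\left(\frac{x_n - \bar{x}_n^l}{\sigma_n^l}\right)^2}\right),\ \mu_{B^l}(y)\right]\right\} \tag{8.11}$$

Clearly, each $\sup \min$ in (8.11) is achieved at the point $x_i = x_{iM}^l$ where the two Gaussian functions are equal, that is, when

$$\frac{x_{iM}^l - \bar{x}_i^l}{\sigma_i^l} = -\frac{x_{iM}^l - x_i^*}{a_i} \tag{8.12}$$

which gives (8.8). Substituting $x_i = x_{iM}^l$ into (8.11) gives (8.7).

We can obtain similar results as in Lemma 8.1 when the triangular fuzzifier is used. If $a_i = 0$, then from (8.6) and (8.8) we have $x_{iP}^l = x_{iM}^l = x_i^*$; that is, in this case the Gaussian fuzzifier becomes the singleton fuzzifier. If $a_i$ is much larger than $\sigma_i^l$, then from (8.6) and (8.8) we see that $x_{iP}^l$ and $x_{iM}^l$ will be very close to $\bar{x}_i^l$; that is, $x_{iP}^l$ and $x_{iM}^l$ will be insensitive to the changes in the input $x_i^*$. Therefore, by choosing large $a_i$, the Gaussian fuzzifier can suppress the noise in the input $x_i^*$. More specifically, suppose that the input $x_i^*$ is corrupted by noise, that is,

$$x_i^* = x_{i0}^* + n_i^* \tag{8.13}$$

where $x_{i0}^*$ is the useful signal and $n_i^*$ is noise. Substituting (8.13) into (8.6), we have

$$x_{iP}^l = \frac{a_i^2\, \bar{x}_i^l + (\sigma_i^l)^2\, x_{i0}^*}{a_i^2 + (\sigma_i^l)^2} + \frac{(\sigma_i^l)^2}{a_i^2 + (\sigma_i^l)^2}\, n_i^* \tag{8.14}$$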
From (8.14) we see that after passing through the Gaussian fuzzifier, the noise is suppressed by the factor $\frac{(\sigma_i^l)^2}{a_i^2 + (\sigma_i^l)^2}$. When $a_i$ is much larger than $\sigma_i^l$, the noise will be greatly suppressed. Similarly, we can show that the triangular fuzzifier also has this kind of noise suppressing capability.
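The maximizer (8.6) and the noise factor in (8.14) are easy to confirm numerically for a single coordinate. The following sketch is illustrative; the parameter values used in the check and the function names `x_p`, `grid_argmax` and `noise_gain` are assumptions.

```python
import math

def x_p(x_bar, sigma, a, x_star):
    """(8.6): maximizer of the product of the rule Gaussian and the fuzzifier Gaussian."""
    return (a**2 * x_bar + sigma**2 * x_star) / (a**2 + sigma**2)

def grid_argmax(x_bar, sigma, a, x_star, lo=-5.0, hi=5.0, steps=100000):
    """Brute-force maximizer of exp(-((x-x_bar)/sigma)^2) * exp(-((x-x_star)/a)^2)."""
    best_x, best_v = lo, -1.0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        v = math.exp(-((x - x_bar) / sigma) ** 2) * math.exp(-((x - x_star) / a) ** 2)
        if v > best_v:
            best_x, best_v = x, v
    return best_x

def noise_gain(sigma, a):
    """Factor multiplying the input noise in (8.14)."""
    return sigma**2 / (a**2 + sigma**2)
```

With, say, `sigma = 0.5` and `a = 2.0`, `noise_gain` is about 0.059, so an input disturbance of 0.3 moves the maximizer by less than 0.02. With `a` close to 0 the gain approaches 1, which corresponds to the singleton fuzzifier passing the noise through unchanged.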
In summary, we have the following remarks about the three fuzzifiers:

• The singleton fuzzifier greatly simplifies the computation involved in the fuzzy inference engine for any type of membership functions the fuzzy IF-THEN rules may take.
• The Gaussian or triangular fuzzifiers also simplify the computation in the fuzzy inference engine, if the membership functions in the fuzzy IF-THEN rules are Gaussian or triangular, respectively.
• The Gaussian and triangular fuzzifiers can suppress noise in the input, but the singleton fuzzifier cannot.
8.2
Defuzzifiers
The defuzzifier is defined as a mapping from fuzzy set $B'$ in $V \subset R$ (which is the output of the fuzzy inference engine) to crisp point $y^* \in V$. Conceptually, the task of the defuzzifier is to specify a point in $V$ that best represents the fuzzy set $B'$. This is similar to the mean value of a random variable. However, since the $B'$ is constructed in some special ways (see Chapter 7), we have a number of choices in determining this representing point. The following three criteria should be considered in choosing a defuzzification scheme:

• Plausibility: The point $y^*$ should represent $B'$ from an intuitive point of view; for example, it may lie approximately in the middle of the support of $B'$ or have a high degree of membership in $B'$.
• Computational simplicity: This criterion is particularly important for fuzzy control because fuzzy controllers operate in real-time.
• Continuity: A small change in $B'$ should not result in a large change in $y^*$.

We now propose three types of defuzzifiers. For all the defuzzifiers, we assume that the fuzzy set $B'$ is obtained from one of the five fuzzy inference engines in Chapter 7, that is, $B'$ is given by (7.23), (7.24), (7.30), (7.31), or (7.32). From these equations we see that $B'$ is the union or intersection of $M$ individual fuzzy sets.
8.2.1
Center of Gravity Defuzzifier

The center of gravity defuzzifier specifies the $y^*$ as the center of the area covered by the membership function of $B'$, that is,

$$y^* = \frac{\int_V y\, \mu_{B'}(y)\, dy}{\int_V \mu_{B'}(y)\, dy} \tag{8.15}$$

where $\int_V$ is the conventional integral. Fig. 8.1 shows this operation graphically.

If we view $\mu_{B'}(y)$ as the probability density function of a random variable, then the center of gravity defuzzifier gives the mean value of the random variable. Sometimes it is desirable to eliminate the $y \in V$ whose membership values in $B'$ are too small; this results in the indexed center of gravity defuzzifier, which gives

$$y^* = \frac{\int_{V_\alpha} y\, \mu_{B'}(y)\, dy}{\int_{V_\alpha} \mu_{B'}(y)\, dy} \tag{8.16}$$

where $V_\alpha$ is defined as

$$V_\alpha = \{y \in V \mid \mu_{B'}(y) \ge \alpha\} \tag{8.17}$$

and $\alpha$ is a constant.

The advantage of the center of gravity defuzzifier lies in its intuitive plausibility. The disadvantage is that it is computationally intensive. In fact, the membership function $\mu_{B'}(y)$ is usually irregular and therefore the integrations in (8.15) and (8.16) are difficult to compute. The next defuzzifier tries to overcome this disadvantage by approximating (8.15) with a simpler formula.

8.2.2
Center Average Defuzzifier
Because the fuzzy set B' is the union or intersection of M fuzzy sets, a good approximation of (8.15) is the weighted average of the centers of the M fuzzy sets, with the weights equal the heights of the corresponding fuzzy sets. Specifically, let gjl be l be its height, the center average defuzzifier the center of the l'th fuzzy set and w determines y* as
Fig. 8.2 illustrates this operation graphically for a simple example with M = 2.
The center average defuzzifier is the most commonly used defuzzifier in fuzzy systems and fuzzy control. It is computationally simple and intuitively plausible. Also, small changes in $\bar{y}^l$ and $w_l$ result in small changes in $y^*$. We now compare the center of gravity and center average defuzzifiers for a simple example.

Example 8.1. Suppose that the fuzzy set $B'$ is the union of the two fuzzy sets shown in Fig. 8.2 with $\bar{y}^1 = 0$ and $\bar{y}^2 = 1$. Then the center average defuzzifier gives

\[ y^* = \frac{w_2}{w_1 + w_2} \tag{8.19} \]
Table 8.1. Comparison of the center of gravity and center average defuzzifiers for Example 8.1.

w_1    w_2    y* (center of gravity)    y* (center average)    relative error
0.9    0.7    0.4258                    0.4375                 0.0275
0.6    0.7    0.5457                    0.5385                 0.0133
0.3    0.7    0.7313                    0.7000                 0.0428
0.9    0.5    0.3324                    0.3571                 0.0743
0.6    0.5    0.4460                    0.4545                 0.0192
0.3    0.5    0.6471                    0.6250                 0.0342
0.9    0.2    0.1477                    0.1818                 0.2308
0.6    0.2    0.2155                    0.2500                 0.1600
0.3    0.2    0.3818                    0.4000                 0.0476
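We now compute the $y^*$ resulting from the center of gravity defuzzifier. For the triangular fuzzy sets of Fig. 8.2 (base width 2, heights $w_1$ and $w_2$), we first notice that the two fuzzy sets intersect at $y = \frac{w_1}{w_1 + w_2}$; hence,

\[ \int_V \mu_{B'}(y)\,dy = \text{area of the first fuzzy set} + \text{area of the second fuzzy set} - \text{area of their intersection} = w_1 + w_2 - \frac{w_1 w_2}{2(w_1 + w_2)} \tag{8.20} \]

Splitting the numerator of (8.15) at the break points of $\mu_{B'}$, we obtain

\[ \int_V y\,\mu_{B'}(y)\,dy = \int_{-1}^{0} y\,w_1(1+y)\,dy + \int_{0}^{\frac{w_1}{w_1+w_2}} y\,w_1(1-y)\,dy + \int_{\frac{w_1}{w_1+w_2}}^{1} y\,w_2\,y\,dy + \int_{1}^{2} y\,w_2(2-y)\,dy = w_2 - \frac{w_1}{6}\left[1 - \frac{w_1^2}{(w_1+w_2)^2}\right] \tag{8.21} \]

Dividing (8.21) by (8.20) we obtain the $y^*$ of the center of gravity defuzzifier. Table 8.1 shows the values of $y^*$ using these two defuzzifiers for certain values of $w_1$ and $w_2$. We see that the computation of the center of gravity defuzzifier is much more intensive than that of the center average defuzzifier.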
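The comparison in Example 8.1 can be checked numerically. The following is a minimal sketch, not the book's procedure: it assumes the two fuzzy sets are unit-half-width triangles centered at 0 and 1 and scaled to heights $w_1$ and $w_2$, approximates (8.15) with a midpoint rule, and uses the illustrative pair $(w_1, w_2) = (0.9, 0.7)$, chosen so that (8.19) gives 0.4375. All function names are hypothetical.

```python
def tri(y, center):
    """Triangular membership function with unit height and base width 2."""
    return max(0.0, 1.0 - abs(y - center))


def center_average(centers, heights):
    """Center average defuzzifier (8.18)."""
    return sum(c * w for c, w in zip(centers, heights)) / sum(heights)


def center_of_gravity(mu, lo, hi, n=30000):
    """Center of gravity defuzzifier (8.15), midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    num = den = 0.0
    for k in range(n):
        y = lo + (k + 0.5) * step
        m = mu(y)
        num += y * m
        den += m
    return num / den


def union_of_scaled_triangles(w1, w2):
    """B' as the union (max) of the two scaled triangles (assumed shapes)."""
    return lambda y: max(w1 * tri(y, 0.0), w2 * tri(y, 1.0))


w1, w2 = 0.9, 0.7  # illustrative heights (assumption)
y_ca = center_average([0.0, 1.0], [w1, w2])
y_cog = center_of_gravity(union_of_scaled_triangles(w1, w2), -1.0, 2.0)
rel_err = abs(y_cog - y_ca) / y_cog
print(round(y_cog, 4), round(y_ca, 4), round(rel_err, 4))
```

With these heights the three printed numbers can be compared against the first row of Table 8.1. The center average needs two multiplications and a division, whereas the center of gravity needs a full numerical integration, which is the point the example makes.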
8.2.3
Maximum Defuzzifier
Conceptually, the maximum defuzzifier chooses $y^*$ as the point in $V$ at which $\mu_{B'}(y)$ achieves its maximum value. Define the set

\[ \mathrm{hgt}(B') = \{\, y \in V \mid \mu_{B'}(y) = \sup_{y' \in V} \mu_{B'}(y') \,\} \tag{8.22} \]

that is, $\mathrm{hgt}(B')$ is the set of all points in $V$ at which $\mu_{B'}(y)$ achieves its maximum value. The maximum defuzzifier defines $y^*$ as an arbitrary element in $\mathrm{hgt}(B')$, that is,

\[ y^* = \text{any point in } \mathrm{hgt}(B') \tag{8.23} \]

If $\mathrm{hgt}(B')$ contains a single point, then $y^*$ is uniquely defined. If $\mathrm{hgt}(B')$ contains more than one point, then we may still use (8.23) or use the smallest of maxima, largest of maxima, or mean of maxima defuzzifiers. Specifically, the smallest of maxima defuzzifier gives

\[ y^* = \inf\{\, y \in \mathrm{hgt}(B') \,\} \tag{8.24} \]

the largest of maxima defuzzifier gives

\[ y^* = \sup\{\, y \in \mathrm{hgt}(B') \,\} \tag{8.25} \]

and the mean of maxima defuzzifier is defined as

\[ y^* = \frac{\int_{\mathrm{hgt}(B')} y\,dy}{\int_{\mathrm{hgt}(B')} dy} \tag{8.26} \]

where $\int_{\mathrm{hgt}(B')}$ is the usual integration for the continuous part of $\mathrm{hgt}(B')$ and is summation for the discrete part of $\mathrm{hgt}(B')$. We feel that the mean of maxima defuzzifier may give results which are contradictory to the intuition of maximum membership. For example, the $y^*$ from the mean of maxima defuzzifier may have a very small membership value in $B'$; see Fig. 8.3 for an example. This problem is due to the nonconvex nature of the membership function $\mu_{B'}(y)$.

The maximum defuzzifiers are intuitively plausible and computationally simple. But small changes in $B'$ may result in large changes in $y^*$; see Fig. 8.4 for an example. If the situation in Fig. 8.4 is unlikely to happen, then the maximum defuzzifiers are good choices.
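The three maximum-type defuzzifiers can be sketched on a sampled membership function. This is a hedged illustration under stated assumptions: $V$ is sampled on a uniform grid, $\mathrm{hgt}(B')$ is taken as all grid points within a small tolerance of the peak, and the two-plateau membership function below is a hypothetical nonconvex $B'$ in the spirit of Fig. 8.3, not the figure itself.

```python
def plateau(y, a, b):
    """Equals 1 on [a, b] and falls linearly to 0 within 0.5 on each side."""
    if a <= y <= b:
        return 1.0
    dist = (a - y) if y < a else (y - b)
    return max(0.0, 1.0 - 2.0 * dist)


def mu_nonconvex(y):
    """Hypothetical nonconvex B': two separated plateaus of height 1."""
    return max(plateau(y, 0.0, 1.0), plateau(y, 3.0, 4.0))


def maxima_defuzzifiers(mu, lo, hi, n=6000, tol=1e-9):
    """Return (smallest, largest, mean) of maxima on a uniform grid."""
    ys = [lo + (hi - lo) * k / n for k in range(n + 1)]
    values = [mu(y) for y in ys]
    peak = max(values)
    hgt = [y for y, v in zip(ys, values) if v >= peak - tol]
    return min(hgt), max(hgt), sum(hgt) / len(hgt)


smallest, largest, mean = maxima_defuzzifiers(mu_nonconvex, -1.0, 5.0)
print(smallest, largest, mean, mu_nonconvex(mean))
```

On this example the mean of maxima falls in the gap between the two plateaus, where the membership value is zero, which is the contradiction with the maximum-membership intuition described above. Averaging grid points only approximates (8.26) when $\mathrm{hgt}(B')$ is a union of intervals sampled uniformly.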
8.2.4
Comparison of the Defuzzifiers
Table 8.2 compares the three types of defuzzifiers according to the three criteria: plausibility, computational simplicity, and continuity. From Table 8.2 we see that the center average defuzzifier is the best. Finally, we consider an example for the computation of the defuzzifiers with some particular membership functions.
Figure 8.3. A graphical representation of the maximum defuzzifiers (the figure marks the smallest of maxima and the mean of maxima). In this example, the mean of maxima defuzzifier gives a result that is contradictory to the maximum-membership intuition.
Table 8.2. Comparison of the center of gravity, center average, and maximum defuzzifiers with respect to plausibility, computational simplicity, and continuity.

                            center of gravity    center average    maximum
plausibility                yes                  yes               yes
computational simplicity    no                   yes               yes
continuity                  yes                  yes               no
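Example 8.2. Consider a two-input-one-output fuzzy system that is constructed from the following two rules:

IF $x_1$ is $A_1$ and $x_2$ is $A_2$, THEN $y$ is $A_1$    (8.27)
IF $x_1$ is $A_2$ and $x_2$ is $A_1$, THEN $y$ is $A_2$    (8.28)

where $A_1$ and $A_2$ are fuzzy sets in $R$ with membership functions

\[ \mu_{A_1}(u) = \begin{cases} 1 - |u| & \text{if } -1 \le u \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{8.29} \]

\[ \mu_{A_2}(u) = \begin{cases} 1 - |u - 1| & \text{if } 0 \le u \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{8.30} \]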
Figure 8.4. An example of maximum defuzzifier where small change in B' results in large change in y*.
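Suppose that the input to the fuzzy system is $(x_1^*, x_2^*) = (0.3, 0.6)$ and we use the singleton fuzzifier. Determine the output of the fuzzy system $y^*$ in the following situations: (a) product inference engine (7.23) and center average defuzzifier (8.18); (b) product inference engine (7.23) and center of gravity defuzzifier (8.15); (c) Lukasiewicz inference engine (7.30) and mean of maxima defuzzifier (8.26); and (d) Lukasiewicz inference engine (7.30) and center average defuzzifier (8.18).

(a) Since we use the singleton fuzzifier, from Lemma 7.3 ((7.28)) and (8.29)-(8.30) we have

\[ \mu_{B'}(y) = \max[\mu_{A_1}(0.3)\mu_{A_2}(0.6)\mu_{A_1}(y),\ \mu_{A_2}(0.3)\mu_{A_1}(0.6)\mu_{A_2}(y)] = \max[0.42\,\mu_{A_1}(y),\ 0.12\,\mu_{A_2}(y)] \tag{8.31} \]

which is shown in Fig. 8.2 with $\bar{y}^1 = 0$, $\bar{y}^2 = 1$, $w_1 = 0.42$, and $w_2 = 0.12$. Hence, from (8.19) we obtain that the center average defuzzifier gives

\[ y^* = \frac{0.12}{0.42 + 0.12} = 0.2222 \tag{8.32} \]

(b) The fuzzy set $B'$ is the same as in (a), so (8.20) and (8.21) with $w_1 = 0.42$ and $w_2 = 0.12$ give $\int_V \mu_{B'}(y)\,dy \approx 0.4933$ and $\int_V y\,\mu_{B'}(y)\,dy \approx 0.0923$; hence, the center of gravity defuzzifier gives $y^* \approx 0.1872$.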
(c) If we use the Lukasiewicz inference engine, then from Lemma 7.4 ((7.33)) we have

\[ \mu_{B'}(y) = \min\{1,\ 1 - \min[\mu_{A_1}(0.3), \mu_{A_2}(0.6)] + \mu_{A_1}(y),\ 1 - \min[\mu_{A_2}(0.3), \mu_{A_1}(0.6)] + \mu_{A_2}(y)\} = \min[1,\ 0.4 + \mu_{A_1}(y),\ 0.7 + \mu_{A_2}(y)] \tag{8.36} \]

which is plotted in Fig. 8.5. From Fig. 8.5 we see that $\sup_{y \in V} \mu_{B'}(y)$ is achieved in the interval $[0.3, 0.4]$, so the mean of maxima defuzzifier gives

\[ y^* = 0.35 \]

(d) From Fig. 8.5 we see that in this case $\bar{y}^1 = 0$, $\bar{y}^2 = 1$, $w_1 = 1$ and $w_2 = 1$. So the center average defuzzifier (8.18) gives

\[ y^* = \frac{0 \cdot 1 + 1 \cdot 1}{1 + 1} = 0.5 \]
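The four cases of Example 8.2 can be reproduced with a short script. This is a sketch under stated assumptions: $V$ is taken as $[-1, 2]$ and sampled on a uniform grid, the integrals in (8.15) and (8.26) are replaced by sums over that grid, and the helper names are hypothetical.

```python
def mu_a1(u):
    """Triangle centered at 0, cf. (8.29)."""
    return max(0.0, 1.0 - abs(u))


def mu_a2(u):
    """Triangle centered at 1, cf. (8.30)."""
    return max(0.0, 1.0 - abs(u - 1.0))


X1, X2 = 0.3, 0.6                                      # crisp input
GRID = [-1.0 + 3.0 * k / 3000 for k in range(3001)]    # assumed V = [-1, 2]
CENTERS = [0.0, 1.0]                                   # centers of A1 and A2

# IF-part membership values of rules (8.27) and (8.28) at the input.
IF1 = (mu_a1(X1), mu_a2(X2))
IF2 = (mu_a2(X1), mu_a1(X2))

# Heights of the individual output sets for each engine.
w_prod = [IF1[0] * IF1[1], IF2[0] * IF2[1]]            # 0.42 and 0.12
w_luka = [min(1.0, 1.0 - min(s) + 1.0) for s in (IF1, IF2)]  # both 1


def b_product(y):
    """B' for the product inference engine, cf. (8.31)."""
    return max(w_prod[0] * mu_a1(y), w_prod[1] * mu_a2(y))


def b_lukasiewicz(y):
    """B' for the Lukasiewicz inference engine, cf. (8.36)."""
    return min(1.0, 1.0 - min(IF1) + mu_a1(y), 1.0 - min(IF2) + mu_a2(y))


def center_average(centers, heights):
    return sum(c * w for c, w in zip(centers, heights)) / sum(heights)


def center_of_gravity(mu):
    return sum(y * mu(y) for y in GRID) / sum(mu(y) for y in GRID)


def mean_of_maxima(mu, tol=1e-9):
    values = [mu(y) for y in GRID]
    peak = max(values)
    maxima = [y for y, v in zip(GRID, values) if v >= peak - tol]
    return sum(maxima) / len(maxima)


y_a = center_average(CENTERS, w_prod)        # case (a)
y_b = center_of_gravity(b_product)           # case (b)
y_c = mean_of_maxima(b_lukasiewicz)          # case (c)
y_d = center_average(CENTERS, w_luka)        # case (d)
print(round(y_a, 4), round(y_b, 4), round(y_c, 4), round(y_d, 4))
```

The same rule base yields four different crisp outputs, which shows how strongly the choice of inference engine and defuzzifier shapes the fuzzy system.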
8.3
Summary and Further Readings
In this chapter we have demonstrated the following:

• The definitions and intuitive meanings of the singleton, Gaussian and triangular fuzzifiers, and the center of gravity, center average and maximum defuzzifiers.
• Computing the outputs of the fuzzy systems for different combinations of the fuzzifiers, defuzzifiers, and fuzzy inference engines for specific examples.

Different defuzzification schemes were studied in detail in the books Driankov, Hellendoorn and Reinfrank [1993] and Yager and Filev [1994].
8.4
Exercises
Exercise 8.1. Suppose that a fuzzy rule base consists of the M rules (7.1) with
and that we use the triangular fuzzifier (8.3). Determine the output of the fuzzy inference engine $\mu_{B'}(y)$ for:

(a) product inference engine (7.23) with all * = algebraic product, and
(b) minimum inference engine (7.24) with all * = min.
Exercise 8.2. Consider Example 8.1 and determine the $y^*$ using the indexed center of gravity defuzzifier with $\alpha = 0.1$. Compute the $y^*$ for the specific values of $w_1$ and $w_2$ in Table 8.1.

Exercise 8.3. Consider Example 8.2 and determine the fuzzy system output $y^*$ (with input $(x_1^*, x_2^*) = (0.3, 0.6)$) for:

(a) Zadeh inference engine (7.31) and maximum (or mean of maxima) defuzzifier, and
(b) Dienes-Rescher inference engine (7.32) with maximum (or mean of maxima) defuzzifier.
Exercise 8.4. Use a practical example, such as the mobile robot path planning problem, to show that the center of gravity and center average defuzzifiers may create problems when the fuzzy set $B'$ is non-convex.

Exercise 8.5. When the fuzzy set $B'$ is non-convex, the so-called center of largest area defuzzifier is proposed. This method determines the convex part of $B'$ that has the largest area and defines the crisp output $y^*$ to be the center of gravity of this convex part. Create a specific non-convex $B'$ and use the center of largest area defuzzifier to determine the defuzzified output $y^*$.
Chapter 9
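Lemma 9.1. Suppose that the fuzzy set $B^l$ in (7.1) is normal with center $\bar{y}^l$. Then the fuzzy systems with fuzzy rule base (7.1), product inference engine (7.23), singleton fuzzifier (8.1), and center average defuzzifier (8.18) are of the following form:

\[ f(x) = \frac{\sum_{l=1}^{M} \bar{y}^l \left( \prod_{i=1}^{n} \mu_{A_i^l}(x_i) \right)}{\sum_{l=1}^{M} \left( \prod_{i=1}^{n} \mu_{A_i^l}(x_i) \right)} \tag{9.1} \]

where $x \in U \subset R^n$ is the input to the fuzzy system, and $f(x) \in V \subset R$ is the output of the fuzzy system.

Proof: Substituting (8.1) into (7.23), we have

\[ \mu_{B'}(y) = \max_{l=1}^{M} \left[ \prod_{i=1}^{n} \mu_{A_i^l}(x_i^*)\, \mu_{B^l}(y) \right] \tag{9.2} \]

Since for a given input $x_i^*$, the center of the $l$'th fuzzy set in (9.2) (that is, the fuzzy set with membership function $\prod_{i=1}^{n} \mu_{A_i^l}(x_i^*)\,\mu_{B^l}(y)$) is the center of $B^l$, we see that the $\bar{y}^l$ in (8.18) is the same $\bar{y}^l$ in this lemma. Additionally, the height of the $l$'th fuzzy set in (9.2), denoted by $w_l$ in (8.18), is $\prod_{i=1}^{n} \mu_{A_i^l}(x_i^*)\,\mu_{B^l}(\bar{y}^l) = \prod_{i=1}^{n} \mu_{A_i^l}(x_i^*)$ (since $B^l$ is normal). Hence, using the center average defuzzifier (8.18) for (9.2), we obtain

\[ y^* = \frac{\sum_{l=1}^{M} \bar{y}^l \left( \prod_{i=1}^{n} \mu_{A_i^l}(x_i^*) \right)}{\sum_{l=1}^{M} \left( \prod_{i=1}^{n} \mu_{A_i^l}(x_i^*) \right)} \tag{9.3} \]

Using the notation of this lemma, we have $x^* = x$ and $y^* = f(x)$; thus, (9.3) becomes (9.1).

From Lemma 9.1 we see that the fuzzy system is a nonlinear mapping from $x \in U \subset R^n$ to $f(x) \in V \subset R$, and (9.1) gives the detailed formula of this mapping. The fuzzy systems in the form of (9.1) are the most commonly used fuzzy systems in the literature. They are computationally simple and intuitively appealing. From (9.1) we see that the output of the fuzzy system is a weighted average of the centers of the fuzzy sets in the THEN parts of the rules, where the weights equal the membership values of the fuzzy sets in the IF parts of the rules at the input point. Consequently, the more the input point agrees with the IF part of a rule, the larger weight this rule is given; this makes sense intuitively.

Lemma 9.1 also reveals an important contribution of fuzzy systems theory that is summarized as follows: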
The dual role of fuzzy systems: On one hand, fuzzy systems are rule-based systems that are constructed from a collection of linguistic rules; on the other hand, fuzzy systems are nonlinear mappings that in many cases can be represented by precise and compact formulas such as (9.1). An important contribution of fuzzy systems theory is to provide a systematic procedure for transforming a set of linguistic rules into a nonlinear mapping. Because nonlinear mappings are easy to implement, fuzzy systems have found their way into a variety of engineering applications.
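The passage from rules to a nonlinear mapping can be made concrete in a few lines. The sketch below is one possible reading of the weighted-average formula of Lemma 9.1, with each rule stored as a tuple of IF-part membership functions plus the center of its THEN-part set; the function name `fuzzy_system` and the `t_norm` parameter are illustrative assumptions, and the two rules are those of Example 8.2.

```python
from math import prod


def mu_a1(u):
    """Triangle centered at 0 (Example 8.2)."""
    return max(0.0, 1.0 - abs(u))


def mu_a2(u):
    """Triangle centered at 1 (Example 8.2)."""
    return max(0.0, 1.0 - abs(u - 1.0))


# Each rule: (IF-part membership functions, center of the THEN-part set).
RULES = [
    ((mu_a1, mu_a2), 0.0),   # IF x1 is A1 and x2 is A2, THEN y is A1
    ((mu_a2, mu_a1), 1.0),   # IF x1 is A2 and x2 is A1, THEN y is A2
]


def fuzzy_system(x, rules=RULES, t_norm=prod):
    """Weighted average of THEN-part centers; weights are rule firing strengths."""
    num = den = 0.0
    for if_part, center in rules:
        weight = t_norm([mu(xi) for mu, xi in zip(if_part, x)])
        num += center * weight
        den += weight
    if den == 0.0:
        raise ValueError("no rule fires at this input")
    return num / den


y_product = fuzzy_system((0.3, 0.6))             # product combines the IF part
y_minimum = fuzzy_system((0.3, 0.6), t_norm=min)  # min combines the IF part
print(round(y_product, 4), round(y_minimum, 4))
```

Passing `t_norm=min` replaces the product of membership values by their minimum, which corresponds to swapping the product inference engine for the minimum inference engine while keeping the same defuzzifier.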
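By choosing different forms of membership functions for $\mu_{A_i^l}$ and $\mu_{B^l}$, we obtain different subclasses of fuzzy systems. One choice of $\mu_{A_i^l}$ and $\mu_{B^l}$ is the Gaussian membership function. Specifically, if we choose the following Gaussian membership functions for $\mu_{A_i^l}$ and $\mu_{B^l}$:

\[ \mu_{A_i^l}(x_i) = a_i^l \exp\left[ -\left( \frac{x_i - \bar{x}_i^l}{\sigma_i^l} \right)^2 \right] \tag{9.4} \]

\[ \mu_{B^l}(y) = \exp\left[ -(y - \bar{y}^l)^2 \right] \tag{9.5} \]

where $a_i^l \in (0, 1]$, $\sigma_i^l \in (0, \infty)$ and $\bar{x}_i^l, \bar{y}^l \in R$ are real-valued parameters, then the fuzzy systems in Lemma 9.1 become

\[ f(x) = \frac{\sum_{l=1}^{M} \bar{y}^l \left[ \prod_{i=1}^{n} a_i^l \exp\left( -\left( \frac{x_i - \bar{x}_i^l}{\sigma_i^l} \right)^2 \right) \right]}{\sum_{l=1}^{M} \left[ \prod_{i=1}^{n} a_i^l \exp\left( -\left( \frac{x_i - \bar{x}_i^l}{\sigma_i^l} \right)^2 \right) \right]} \tag{9.6} \]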
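A fuzzy system with Gaussian membership functions is fully described by four parameter arrays, which makes it easy to evaluate. The following sketch assumes IF-part membership functions of the form $a \exp(-((x - \bar{x})/\sigma)^2)$ and stores the parameters as nested lists indexed by rule and input; the function name and the three-rule example are hypothetical.

```python
from math import exp


def gaussian_fuzzy_system(x, y_centers, x_centers, sigmas, amplitudes=None):
    """Evaluate a Gaussian fuzzy system at input x (a sequence of n numbers).

    y_centers[l]     : center of the THEN-part set of rule l
    x_centers[l][i]  : center of the IF-part set of rule l for input i
    sigmas[l][i]     : width of that set (must be positive)
    amplitudes[l][i] : height of that set in (0, 1]; defaults to 1
    """
    num = den = 0.0
    for l, y_bar in enumerate(y_centers):
        weight = 1.0
        for i, xi in enumerate(x):
            a = 1.0 if amplitudes is None else amplitudes[l][i]
            z = (xi - x_centers[l][i]) / sigmas[l][i]
            weight *= a * exp(-z * z)
        num += y_bar * weight
        den += weight
    return num / den  # den can underflow to 0 far away from all rule centers


# Hypothetical one-input system with three rules centered at -1, 0 and 1.
Y = [1.0, 0.0, 1.0]
XC = [[-1.0], [0.0], [1.0]]
SG = [[1.0], [1.0], [1.0]]

f0 = gaussian_fuzzy_system([0.0], Y, XC, SG)
f_left = gaussian_fuzzy_system([-0.5], Y, XC, SG)
f_right = gaussian_fuzzy_system([0.5], Y, XC, SG)
print(f0, f_left, f_right)
```

Because the output is a convex combination of the THEN-part centers, it always lies between the smallest and largest center, here between 0 and 1.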
We call the fuzzy systems in the form of (9.6) fuzzy systems with product inference engine, singleton fuzzifier, center average defuzzifier, and Gaussian membership functions. Other popular choices of $\mu_{A_i^l}$ and $\mu_{B^l}$ are triangular and trapezoid membership functions. We will study the fuzzy systems with these types of membership functions in detail in Chapters 10 and 11.

Another class of commonly used fuzzy systems is obtained by replacing the product inference engine in Lemma 9.1 by the minimum inference engine. Using the same procedure as in the proof of Lemma 9.1, we obtain that the fuzzy systems with fuzzy rule base (7.1), minimum inference engine (7.24), singleton fuzzifier (8.1), and center average defuzzifier (8.18) are of the following form:

\[ f(x) = \frac{\sum_{l=1}^{M} \bar{y}^l \left( \min_{i=1}^{n} \mu_{A_i^l}(x_i) \right)}{\sum_{l=1}^{M} \left( \min_{i=1}^{n} \mu_{A_i^l}(x_i) \right)} \tag{9.7} \]
where the variables have the same meaning as in (9.1).

We showed in Chapter 8 (Lemma 8.1) that if the membership functions for $A_i^l$ are Gaussian, then the Gaussian fuzzifier also significantly simplifies the fuzzy inference engine. What do the fuzzy systems look like in this case?

Lemma 9.2. The fuzzy systems with fuzzy rule base (7.1), product inference engine (7.23), Gaussian fuzzifier (8.2) with * = product, center average defuzzifier (8.18), and Gaussian membership functions (9.4) and (9.5) (with $a_i^l = 1$) are of the following form:
If we replace the product inference engine (7.23) with the minimum inference engine
Proof: Substituting (8.6) into (8.5) and using $x$ for $x^*$, we have
Using the same arguments as in the proof of Lemma 9.1 and applying the center average defuzzifier (8.18) to (9.10), we obtain (9.8). Similarly, substituting (8.8) into (8.7), we have
Applying the center average defuzzifier (8.18) to (9.11), we obtain (9.9). In Chapter 7 we saw that the product and minimum inference engines are quite different from the Lukasiewicz, Zadeh and Dienes-Rescher inference engines. What do the fuzzy systems with these inference engines look like?
Lemma 9.3. If the fuzzy sets $B^l$ in (7.1) are normal with center $\bar{y}^l$, then the fuzzy systems with fuzzy rule base (7.1), Lukasiewicz inference engine (7.30) or Dienes-Rescher inference engine (7.32), singleton fuzzifier (8.1) or Gaussian fuzzifier (8.2) or triangular fuzzifier (8.3), and center average defuzzifier (8.18) are of the following form:

\[ f(x) = \frac{1}{M} \sum_{l=1}^{M} \bar{y}^l \tag{9.12} \]
Proof: Since $\mu_{B^l}(\bar{y}^l) = 1$, we have $1 - \min_{i=1}^{n}(\mu_{A_i^l}(x_i)) + \mu_{B^l}(\bar{y}^l) \ge 1$; therefore, the height of the $l$'th fuzzy set in (7.30) is

\[ w_l = \sup_{x \in U} \mu_{A'}(x) = 1 \tag{9.13} \]

where we use the fact that for all the three fuzzifiers (8.1)-(8.3) we have $\sup_{x \in U} \mu_{A'}(x) = 1$. Similarly, the height of the $l$'th fuzzy set in (7.32) is also equal to one. Hence, with the center average defuzzifier (8.18) we obtain (9.12).

The fuzzy systems in the form of (9.12) do not make a lot of sense because they give a constant output no matter what the input is. Therefore, the combinations of fuzzy inference engine, fuzzifier, and defuzzifier in Lemma 9.3 do not result in useful fuzzy systems.
where $l^*$ is defined according to (9.15). Since $\mu_{B^l}(\bar{y}^{l^*}) \le 1$ when $l \ne l^*$ and $\mu_{B^{l^*}}(\bar{y}^{l^*}) = 1$, we have
Hence, the $\sup_{y \in V}$ in (9.16) is achieved at $\bar{y}^{l^*}$. Using the maximum defuzzifier (8.23) we obtain (9.14).

From Lemma 9.4 we see that the fuzzy systems in this case are simple functions, that is, they are piece-wise constant functions, and these constants are the centers of the membership functions in the THEN parts of the rules. From (9.15) we see that as long as the product of membership values of the IF-part fuzzy sets of the rule is greater than or equal to those of the other rules, the output of the fuzzy system remains unchanged. Therefore, these kinds of fuzzy systems are robust to small disturbances in the input and in the membership functions $\mu_{A_i^l}(x_i)$. However, these fuzzy systems are not continuous, that is, when $l^*$ changes from one number to the other, $f(x)$ changes in a discrete fashion. If the fuzzy systems are used in decision making or other open-loop applications, this kind of abrupt change may be tolerated, but it is usually unacceptable in closed-loop control. The next lemma shows that we can obtain a similar result if we use the minimum inference engine.
Lemma 9.5. If we change the product inference engine in Lemma 9.4 to the minimum inference engine (7.24) and keep the others unchanged, then the fuzzy systems are of the same form as (9.14) with $l^*$ determined by

\[ \min_{i=1}^{n} [\mu_{A_i^{l^*}}(x_i)] \ge \min_{i=1}^{n} [\mu_{A_i^l}(x_i)] \tag{9.18} \]

where $l = 1, 2, \ldots, M$.

Proof: From (7.29) (Lemma 7.3) and using the facts that $\sup_{y \in V}$ and $\max_{l=1}^{M}$ are interchangeable and that $B^l$ are normal, we have

\[ \sup_{y \in V} \mu_{B'}(y) = \max_{l=1}^{M} \left\{ \sup_{y \in V} \min\left[ \min_{i=1}^{n}(\mu_{A_i^l}(x_i)),\ \mu_{B^l}(y) \right] \right\} \tag{9.19} \]

\[ = \max_{l=1}^{M} \left[ \min_{i=1}^{n}(\mu_{A_i^l}(x_i)) \right] = \min_{i=1}^{n}(\mu_{A_i^{l^*}}(x_i)) \tag{9.20} \]

Also from (7.29) we have that $\mu_{B'}(\bar{y}^{l^*}) = \min_{i=1}^{n}(\mu_{A_i^{l^*}}(x_i))$, thus the $\sup_{y \in V}$ in (9.20) is achieved at $\bar{y}^{l^*}$. Hence, the maximum defuzzifier (8.23) gives (9.14).

Again, we obtain a class of fuzzy systems that are simple functions.

It is difficult to obtain closed-form formulas for fuzzy systems with maximum defuzzifier and Lukasiewicz, Zadeh, or Dienes-Rescher inference engines. The difficulty comes from the fact that the $\sup_{y \in V}$ and min operators are not interchangeable in general; therefore, from (7.30)-(7.32) we see that the maximum defuzzification becomes an optimization problem for a non-smooth function. In these cases, for a given input $x$, the output of the fuzzy system has to be computed in a step-by-step fashion, that is, computing the outputs of fuzzifier, fuzzy inference engine, and defuzzifier in sequence. Note that the output of the fuzzy inference engine is a function, not a single value, so the computation is very complex. We will not use this type of fuzzy systems (maximum defuzzifier with Lukasiewicz, Zadeh, or Dienes-Rescher inference engine) in the rest of this book.
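The piece-wise constant behavior described for Lemmas 9.4 and 9.5 is easy to observe numerically. The sketch below assumes the output is the THEN-part center of the rule with the largest firing strength (ties broken by the first such rule, one admissible choice of "any point" in the maximum defuzzifier) and reuses the two rules of Example 8.2; the names are hypothetical.

```python
from math import prod


def mu_a1(u):
    return max(0.0, 1.0 - abs(u))


def mu_a2(u):
    return max(0.0, 1.0 - abs(u - 1.0))


RULES = [
    ((mu_a1, mu_a2), 0.0),   # rule (8.27), THEN-part center 0
    ((mu_a2, mu_a1), 1.0),   # rule (8.28), THEN-part center 1
]


def max_defuzzifier_system(x, rules=RULES, t_norm=prod):
    """Return the THEN-part center of the rule that fires most strongly."""
    strengths = [t_norm([mu(xi) for mu, xi in zip(if_part, x)])
                 for if_part, _ in rules]
    l_star = max(range(len(rules)), key=strengths.__getitem__)
    return rules[l_star][1]


out_a = max_defuzzifier_system((0.30, 0.6))              # rule 1 dominates
out_b = max_defuzzifier_system((0.31, 0.6))              # small disturbance
out_c = max_defuzzifier_system((0.70, 0.4))              # rule 2 dominates
out_d = max_defuzzifier_system((0.30, 0.6), t_norm=min)  # minimum engine
print(out_a, out_b, out_c, out_d)
```

A small disturbance of the input leaves the output unchanged, while crossing the boundary where the dominant rule changes makes the output jump from one center to the other.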
9.2
In the last section we showed that certain types of fuzzy systems can be written as compact nonlinear formulas. On one hand, these compact formulas simplify the computation of the fuzzy systems; on the other hand, they give us a chance to analyze the fuzzy systems in more detail. We see that the fuzzy systems are particular types of nonlinear functions, so no matter whether the fuzzy systems are used as controllers or decision makers or signal processors or any others, it is interesting to know the capability of the fuzzy systems from a function approximation point of view. For example, what types of nonlinear functions can the fuzzy systems represent or approximate and to what degree of accuracy? If the fuzzy systems can approximate only certain types of nonlinear functions to a limited degree of accuracy, then the fuzzy systems would not be very useful in general applications. But if the fuzzy systems can approximate any nonlinear function to arbitrary accuracy, then they would be very useful in a wide variety of applications. In this section, we prove that certain classes of fuzzy systems that we studied in the last section have this universal approximation capability. Specifically, we have the following main theorem.
Theorem 9.1 (Universal Approximation Theorem). Suppose that the input universe of discourse $U$ is a compact set in $R^n$. Then, for any given real continuous function $g(x)$ on $U$ and arbitrary $\epsilon > 0$, there exists a fuzzy system $f(x)$ in the form of (9.6) such that

\[ \sup_{x \in U} |f(x) - g(x)| < \epsilon \tag{9.21} \]
That is, the fuzzy systems with product inference engine, singleton fuzzifier, center average defuzzifier, and Gaussian membership functions are universal approximators. One proof of this theorem is based on the following Stone-Weierstrass Theorem, which is well known in analysis.
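Stone-Weierstrass Theorem (Rudin [1976]). Let $Z$ be a set of real continuous functions on a compact set $U$. If (i) $Z$ is an algebra, that is, the set $Z$ is closed under addition, multiplication, and scalar multiplication; (ii) $Z$ separates points on $U$, that is, for every $x, y \in U$, $x \ne y$, there exists $f \in Z$ such that $f(x) \ne f(y)$; and (iii) $Z$ vanishes at no point of $U$, that is, for each $x \in U$ there exists $f \in Z$ such that $f(x) \ne 0$; then for any real continuous function $g(x)$ on $U$ and arbitrary $\epsilon > 0$, there exists $f \in Z$ such that $\sup_{x \in U} |f(x) - g(x)| < \epsilon$.

Proof of Theorem 9.1: Let $Y$ be the set of all fuzzy systems in the form of (9.6). We now show that $Y$ is an algebra, $Y$ separates points on $U$, and $Y$ vanishes at no point of $U$. Let $f_1, f_2 \in Y$, so that we can write them as

\[ f_1(x) = \frac{\sum_{l_1=1}^{M_1} \bar{y}_1^{l_1} \left[ \prod_{i=1}^{n} a_{1i}^{l_1} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 \right) \right]}{\sum_{l_1=1}^{M_1} \left[ \prod_{i=1}^{n} a_{1i}^{l_1} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 \right) \right]} \tag{9.22} \]

\[ f_2(x) = \frac{\sum_{l_2=1}^{M_2} \bar{y}_2^{l_2} \left[ \prod_{i=1}^{n} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right) \right]}{\sum_{l_2=1}^{M_2} \left[ \prod_{i=1}^{n} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right) \right]} \tag{9.23} \]

Hence,

\[ f_1(x) + f_2(x) = \frac{\sum_{l_1=1}^{M_1} \sum_{l_2=1}^{M_2} (\bar{y}_1^{l_1} + \bar{y}_2^{l_2}) \left[ \prod_{i=1}^{n} a_{1i}^{l_1} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 - \left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right) \right]}{\sum_{l_1=1}^{M_1} \sum_{l_2=1}^{M_2} \left[ \prod_{i=1}^{n} a_{1i}^{l_1} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 - \left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right) \right]} \tag{9.24} \]

Since $a_{1i}^{l_1} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 - \left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right)$ can be represented in the form of (9.4) (the product of two Gaussians is again a Gaussian with height in $(0, 1]$) and $\bar{y}_1^{l_1} + \bar{y}_2^{l_2}$ can be viewed as the center of a fuzzy set in the form of (9.5), $f_1(x) + f_2(x)$ is in the form of (9.6); that is, $f_1 + f_2 \in Y$. Similarly,

\[ f_1(x) f_2(x) = \frac{\sum_{l_1=1}^{M_1} \sum_{l_2=1}^{M_2} \bar{y}_1^{l_1} \bar{y}_2^{l_2} \left[ \prod_{i=1}^{n} a_{1i}^{l_1} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 - \left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right) \right]}{\sum_{l_1=1}^{M_1} \sum_{l_2=1}^{M_2} \left[ \prod_{i=1}^{n} a_{1i}^{l_1} a_{2i}^{l_2} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 - \left( \frac{x_i - \bar{x}_{2i}^{l_2}}{\sigma_{2i}^{l_2}} \right)^2 \right) \right]} \tag{9.25} \]

which is also in the form of (9.6); hence, $f_1 f_2 \in Y$. Finally, for arbitrary $c \in R$,

\[ c f_1(x) = \frac{\sum_{l_1=1}^{M_1} (c\,\bar{y}_1^{l_1}) \left[ \prod_{i=1}^{n} a_{1i}^{l_1} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 \right) \right]}{\sum_{l_1=1}^{M_1} \left[ \prod_{i=1}^{n} a_{1i}^{l_1} \exp\left( -\left( \frac{x_i - \bar{x}_{1i}^{l_1}}{\sigma_{1i}^{l_1}} \right)^2 \right) \right]} \tag{9.26} \]

which is again in the form of (9.6), so $c f_1 \in Y$. Hence, $Y$ is an algebra.

We show that $Y$ separates points on $U$ by constructing a required fuzzy system $f(x)$. Let $x^0, z^0 \in U$ be two arbitrary points and $x^0 \ne z^0$. We choose the parameters of the $f(x)$ in the form of (9.6) as follows: $M = 2$, $\bar{y}^1 = 0$, $\bar{y}^2 = 1$, $a_i^l = 1$, $\sigma_i^l = 1$, $\bar{x}_i^1 = x_i^0$ and $\bar{x}_i^2 = z_i^0$, where $i = 1, 2, \ldots, n$ and $l = 1, 2$. This specific fuzzy system is

\[ f(x) = \frac{\exp(-\|x - z^0\|_2^2)}{\exp(-\|x - x^0\|_2^2) + \exp(-\|x - z^0\|_2^2)} \tag{9.27} \]

from which we have

\[ f(x^0) = \frac{\exp(-\|x^0 - z^0\|_2^2)}{1 + \exp(-\|x^0 - z^0\|_2^2)} \tag{9.28} \]

and

\[ f(z^0) = \frac{1}{1 + \exp(-\|x^0 - z^0\|_2^2)} \tag{9.29} \]

Since $x^0 \ne z^0$, we have $\exp(-\|x^0 - z^0\|_2^2) \ne 1$ which, from (9.28) and (9.29), gives $f(x^0) \ne f(z^0)$. Hence, $Y$ separates points on $U$.

To show that $Y$ vanishes at no point of $U$, we simply observe that any fuzzy system $f(x)$ in the form of (9.6) with all $\bar{y}^l > 0$ has the property that $f(x) > 0$, $\forall x \in U$. Hence, $Y$ vanishes at no point of $U$.

In summary of the above and the Stone-Weierstrass Theorem, we obtain the conclusion of this theorem.

Theorem 9.1 shows that fuzzy systems can approximate continuous functions to arbitrary accuracy; the following corollary extends the result to discrete functions.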
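Corollary 9.1. For any square-integrable function $g(x)$ on the compact set $U \subset R^n$, that is, for any $g \in L_2(U) = \{ g : U \to R \mid \int_U |g(x)|^2 dx < \infty \}$, and arbitrary $\epsilon > 0$, there exists a fuzzy system $f(x)$ in the form of (9.6) such that

\[ \left( \int_U |f(x) - g(x)|^2 dx \right)^{1/2} < \epsilon \tag{9.30} \]

Proof: Since $U$ is compact, $\int_U dx = E < \infty$. Since continuous functions on $U$ form a dense subset of $L_2(U)$ (Rudin [1976]), for any $g \in L_2(U)$ there exists a continuous function $\bar{g}$ on $U$ such that $\left( \int_U |g(x) - \bar{g}(x)|^2 dx \right)^{1/2} < \epsilon/2$. By Theorem 9.1, there exists $f \in Y$ such that $\sup_{x \in U} |f(x) - \bar{g}(x)| < \epsilon / (2E^{1/2})$. Hence, we have

\[ \left( \int_U |f(x) - g(x)|^2 dx \right)^{1/2} \le \left( \int_U |f(x) - \bar{g}(x)|^2 dx \right)^{1/2} + \left( \int_U |\bar{g}(x) - g(x)|^2 dx \right)^{1/2} \le \left( \sup_{x \in U} |f(x) - \bar{g}(x)|^2\, E \right)^{1/2} + \left( \int_U |\bar{g}(x) - g(x)|^2 dx \right)^{1/2} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon \tag{9.31} \]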
Theorem 9.1 and Corollary 9.1 provide a justification for using fuzzy systems in a variety of applications. Specifically, they show that for any kind of nonlinear operations the problem may require, it is always possible to design a fuzzy system that performs the required operation with any degree of accuracy. They also provide a theoretical explanation for the success of fuzzy systems in practical applications. However, Theorem 9.1 and Corollary 9.1 give only an existence result; that is, they show that there exists a fuzzy system in the form of (9.6) that can approximate any function to arbitrary accuracy. They do not show how to find such a fuzzy system. For engineering applications, knowing the existence of an ideal fuzzy system is not enough; we must develop approaches that can find good fuzzy systems for the particular applications. Depending upon the information provided, we may or may not find the ideal fuzzy system. In the next few chapters, we will develop a variety of approaches to designing the fuzzy systems.
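Returning to the first step in the proof of Theorem 9.1, the closure of the Gaussian fuzzy systems under addition can be checked numerically. The sketch below is an illustration only: it assumes one-input systems whose rules are given as `(a, x_bar, sigma)` triples, builds the combined system with $M_1 M_2$ rules whose centers are the pairwise sums of the original centers and whose weights are the pairwise products of the original weights, and compares it with $f_1 + f_2$ at a few points. All parameter values are hypothetical.

```python
from math import exp


def weights(x, params):
    """Firing strengths a * exp(-((x - x_bar) / sigma)^2) for each rule."""
    return [a * exp(-((x - x_bar) / sigma) ** 2) for a, x_bar, sigma in params]


def system(x, y_centers, params):
    """One-input Gaussian fuzzy system as a weighted average of centers."""
    w = weights(x, params)
    return sum(y * wi for y, wi in zip(y_centers, w)) / sum(w)


def sum_system(x, y1, p1, y2, p2):
    """Combined system with M1*M2 rules: centers add, weights multiply."""
    w1, w2 = weights(x, p1), weights(x, p2)
    num = sum((ya + yb) * wa * wb
              for ya, wa in zip(y1, w1) for yb, wb in zip(y2, w2))
    den = sum(wa * wb for wa in w1 for wb in w2)
    return num / den


# Two hypothetical systems with 2 and 3 rules.
Y1, P1 = [0.5, -1.0], [(1.0, -0.5, 1.0), (0.8, 1.0, 0.7)]
Y2, P2 = [2.0, 0.0, 1.5], [(0.9, 0.0, 0.5), (1.0, 0.5, 1.2), (0.6, -1.0, 2.0)]

points = [-1.0, -0.3, 0.0, 0.4, 1.2]
gaps = [abs(system(x, Y1, P1) + system(x, Y2, P2) - sum_system(x, Y1, P1, Y2, P2))
        for x in points]
print(max(gaps))
```

The largest gap is at the level of floating-point rounding, consistent with the identity used in the proof. Replacing `ya + yb` by `ya * yb` gives the corresponding check for closure under multiplication.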
9.3
Summary and Further Readings
In this chapter we have demonstrated the following:

• The compact formulas of some useful classes of fuzzy systems.
• How to derive compact formulas for any classes of fuzzy systems if such compact formulas exist.
• How to use the Stone-Weierstrass Theorem.

The derivations of the mathematical formulas of the fuzzy systems are new. A related reference is Wang [1994]. The Universal Approximation Theorem and its proof are taken from Wang [1992]. Other approaches to this problem can be found in Buckley [1992b] and Zeng and Singh [1994].
9.4
Exercises
Exercise 9.1. Derive the compact formula for the fuzzy systems with fuzzy rule base (7.1), Zadeh inference engine (7.31), singleton fuzzifier (8.1), and center average defuzzifier (8.18).

Exercise 9.2. Repeat Exercise 9.1 using Lukasiewicz inference engine rather than Zadeh inference engine.

Exercise 9.3. Show that the fuzzy systems in the form of (9.1) have the universal approximation property in Theorem 9.1.

Exercise 9.4. Can you use the Stone-Weierstrass Theorem to prove that fuzzy systems in the form of (9.7) or (9.6) with $a_i^l = 1$ are universal approximators? Explain your answer.

Exercise 9.5. Use the Stone-Weierstrass Theorem to prove that polynomials are universal approximators.

Exercise 9.6. Plot the fuzzy systems $f_1(x)$ and $f_2(x)$ for $x \in U = [-1, 2] \times [-1, 2]$, where $f_1(x)$ is the fuzzy system with the two rules (8.27) and (8.28), product inference engine (7.23), singleton fuzzifier (8.1), and maximum defuzzifier (8.23), and $f_2(x)$ is the same as $f_1(x)$ except that the maximum defuzzifier is replaced by the center average defuzzifier (8.18).
Chapter 10
In Chapter 9 we proved that fuzzy systems are universal approximators; that is, they can approximate any function on a compact set to arbitrary accuracy. However, this result showed only the existence of the optimal fuzzy system and did not provide methods to find it. In fact, finding the optimal fuzzy system is much more difficult than proving its existence. Depending upon the information provided, we may or may not find the optimal fuzzy system. To answer the question of how to find the optimal fuzzy system, we must first see what information is available for the nonlinear function $g(x) : U \subset R^n \to R$, which we are asked to approximate. Generally speaking, we may encounter the following three situations:

• The analytic formula of $g(x)$ is known.
• The analytic formula of $g(x)$ is unknown, but for any $x \in U$ we can determine the corresponding $g(x)$. That is, $g(x)$ is a black box: we know the input-output behavior of $g(x)$ but do not know the details inside it.
• The analytic formula of $g(x)$ is unknown and we are provided only a limited number of input-output pairs $(x^j, g(x^j))$, where $x^j \in U$ cannot be arbitrarily chosen.

The first case is not very interesting because if the analytic formula of $g(x)$ is known, we can use it for whatever purpose the fuzzy system tries to achieve. In the rare case where we want to replace $g(x)$ by a fuzzy system, we can use the methods for the second case because the first case is a special case of the second one. Therefore, we will not consider the first case separately. The second case is more realistic. We will study it in detail in this and the following chapters.
The third case is the most general one in practice. This is especially true for fuzzy control because stability requirements for control systems may prevent us from choosing the input values arbitrarily. We will study this case in detail in Chapters 12-15.
So, in this chapter we assume that the analytic formula of $g(x)$ is unknown but we can determine the input-output pairs $(x, g(x))$ for any $x \in U$. Our task is to design a fuzzy system that can approximate $g(x)$ in some optimal manner.
Definition 10.1: Pseudo-Trapezoid Membership Function. Let $[a, d] \subset R$. The pseudo-trapezoid membership function of fuzzy set $A$ is a continuous function in $R$ given by

\[ \mu_A(x; a, b, c, d, H) = \begin{cases} I(x), & x \in [a, b) \\ H, & x \in [b, c] \\ D(x), & x \in (c, d] \\ 0, & x \in R - (a, d) \end{cases} \tag{10.1} \]

where $a \le b \le c \le d$, $a < d$, $0 < H \le 1$, $0 \le I(x) \le 1$ is a nondecreasing function in $[a, b)$ and $0 \le D(x) \le 1$ is a nonincreasing function in $(c, d]$. When the fuzzy set $A$ is normal (that is, $H = 1$), its membership function is simply written as $\mu_A(x; a, b, c, d)$.

Fig. 10.1 shows some examples of pseudo-trapezoid membership functions. If the universe of discourse is bounded, then $a, b, c, d$ are finite numbers. Pseudo-trapezoid membership functions include many commonly used membership functions as special cases. For example, if we choose
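\[ a = -\infty, \quad b = c = \bar{x}, \quad d = \infty, \quad I(x) = D(x) = \exp\left( -\left( \frac{x - \bar{x}}{\sigma} \right)^2 \right) \]

then the pseudo-trapezoid membership functions become the Gaussian membership functions. Therefore, the pseudo-trapezoid membership functions constitute a very rich family of membership functions.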
Definition 10.2: Completeness of Fuzzy Sets. Fuzzy sets $A^1, A^2, \ldots, A^N$ in $W \subset R$ are said to be complete on $W$ if for any $x \in W$, there exists $A^j$ such that $\mu_{A^j}(x) > 0$.

Definition 10.3: Consistency of Fuzzy Sets. Fuzzy sets $A^1, A^2, \ldots, A^N$ in $W \subset R$ are said to be consistent on $W$ if $\mu_{A^j}(x) = 1$ for some $x \in W$ implies that $\mu_{A^i}(x) = 0$ for all $i \ne j$.

Definition 10.4: High Set of Fuzzy Set. The high set of a fuzzy set $A$ in $W \subset R$ is a subset in $W$ defined by

\[ \mathrm{hgh}(A) = \{\, x \in W \mid \mu_A(x) = \sup_{x' \in W} \mu_A(x') \,\} \tag{10.4} \]

If $A$ is a normal fuzzy set with pseudo-trapezoid membership function $\mu_A(x; a, b, c, d)$, then $\mathrm{hgh}(A) = [b, c]$.

Definition 10.5: Order Between Fuzzy Sets. For two fuzzy sets $A$ and $B$ in $W \subset R$, we say $A > B$ if $\mathrm{hgh}(A) > \mathrm{hgh}(B)$ (that is, if $x \in \mathrm{hgh}(A)$ and $x' \in \mathrm{hgh}(B)$, then $x > x'$).
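Definitions 10.1 to 10.3 translate directly into small checks. The sketch below is a simplified illustration: it assumes linear $I(x)$ and $D(x)$ (so the sets are ordinary trapezoids and triangles), tests completeness and consistency only on a finite grid of points, and uses hypothetical function names and example sets.

```python
def pseudo_trapezoid(a, b, c, d, height=1.0):
    """Pseudo-trapezoid membership function with linear rise and fall (assumed)."""
    def mu(x):
        if x < a or x > d:
            return 0.0
        if x < b:                      # rising part I(x) on [a, b)
            return height * (x - a) / (b - a)
        if x <= c:                     # plateau on [b, c]
            return height
        return height * (d - x) / (d - c)   # falling part D(x) on (c, d]
    return mu


def is_complete(sets, xs):
    """Every sampled point has positive membership in at least one set."""
    return all(any(mu(x) > 0.0 for mu in sets) for x in xs)


def is_consistent(sets, xs, eps=1e-12):
    """Wherever one set has membership 1, all other sets have membership 0."""
    for x in xs:
        values = [mu(x) for mu in sets]
        for j, v in enumerate(values):
            others = [u for i, u in enumerate(values) if i != j]
            if v >= 1.0 - eps and any(u > eps for u in others):
                return False
    return True


xs = [k / 100 for k in range(101)]     # sample points of W = [0, 1]
good = [pseudo_trapezoid(0.0, 0.0, 0.0, 0.5),
        pseudo_trapezoid(0.0, 0.5, 0.5, 1.0),
        pseudo_trapezoid(0.5, 1.0, 1.0, 1.0)]
bad = [pseudo_trapezoid(0.0, 0.0, 0.6, 1.0),
       pseudo_trapezoid(0.4, 0.5, 1.0, 1.0)]   # plateaus overlap on [0.5, 0.6]
print(is_complete(good, xs), is_consistent(good, xs), is_consistent(bad, xs))
```

The three triangular sets are complete and consistent on $[0, 1]$, while the two overlapping trapezoids are complete but not consistent because both reach membership 1 on a common interval.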
We now show some properties of fuzzy sets with pseudo-trapezoid membership functions.
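Lemma 10.1. If $A^1, A^2, \ldots, A^N$ are consistent and normal fuzzy sets in $W \subset R$ with pseudo-trapezoid membership functions $\mu_{A^i}(x; a_i, b_i, c_i, d_i)$ ($i = 1, 2, \ldots, N$), then there exists a rearrangement $\{i_1, i_2, \ldots, i_N\}$ of $\{1, 2, \ldots, N\}$ such that

\[ A^{i_1} < A^{i_2} < \cdots < A^{i_N} \tag{10.5} \]

Proof: For arbitrary $i, j \in \{1, 2, \ldots, N\}$ with $i \ne j$, it must be true that $[b_i, c_i] \cap [b_j, c_j] = \emptyset$, since otherwise the fuzzy sets $A^1, \ldots, A^N$ would not be consistent. Thus, there exists a rearrangement $\{i_1, i_2, \ldots, i_N\}$ of $\{1, 2, \ldots, N\}$ such that

\[ [b_{i_1}, c_{i_1}] < [b_{i_2}, c_{i_2}] < \cdots < [b_{i_N}, c_{i_N}] \tag{10.6} \]

which implies (10.5).

Lemma 10.1 shows that we can always assume $A^1 < A^2 < \cdots < A^N$ without loss of generality.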
Lemma 10.2. Let the fuzzy sets $A^1, A^2, \ldots, A^N$ in $W \subset R$ be normal, consistent and complete with pseudo-trapezoid membership functions $\mu_{A^i}(x; a_i, b_i, c_i, d_i)$. If $A^1 < A^2 < \cdots < A^N$, then

\[ c_i \le a_{i+1} < d_i \le b_{i+1} \tag{10.7} \]

for $i = 1, 2, \ldots, N - 1$.

Fig. 10.2 illustrates the assertion of Lemma 10.2. The proof of Lemma 10.2 is straightforward and is left as an exercise.
two-input fuzzy systems; however, the approach and results are all valid for general n-input fuzzy systems. That is, we can use exactly the same procedure to design n-input fuzzy systems. We first specify the problem.
The Problem: Let $g(x)$ be a function on the compact set $U = [\alpha_1, \beta_1] \times [\alpha_2, \beta_2] \subset R^2$ and the analytic formula of $g(x)$ be unknown. Suppose that for any $x \in U$, we can obtain $g(x)$. Our task is to design a fuzzy system that approximates $g(x)$. We now design such a fuzzy system in a step-by-step manner.
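Design of a Fuzzy System:

Step 1. Define $N_i$ ($i = 1, 2$) fuzzy sets $A_i^1, A_i^2, \ldots, A_i^{N_i}$ in $[\alpha_i, \beta_i]$ which are normal, consistent, and complete with pseudo-trapezoid membership functions $\mu_{A_i^1}(x_i; a_i^1, b_i^1, c_i^1, d_i^1), \ldots, \mu_{A_i^{N_i}}(x_i; a_i^{N_i}, b_i^{N_i}, c_i^{N_i}, d_i^{N_i})$, and $A_i^1 < A_i^2 < \cdots < A_i^{N_i}$ with $a_i^1 = b_i^1 = \alpha_i$ and $c_i^{N_i} = d_i^{N_i} = \beta_i$. Define $e_1^1 = \alpha_1$, $e_1^{N_1} = \beta_1$, and $e_1^j = \frac{1}{2}(b_1^j + c_1^j)$ for $j = 2, 3, \ldots, N_1 - 1$. Similarly, define $e_2^1 = \alpha_2$, $e_2^{N_2} = \beta_2$, and $e_2^j = \frac{1}{2}(b_2^j + c_2^j)$ for $j = 2, 3, \ldots, N_2 - 1$. Fig. 10.3 shows an example with $N_1 = 3$, $N_2 = 4$, $\alpha_1 = \alpha_2 = 0$ and $\beta_1 = \beta_2 = 1$.

Figure 10.3. An example of the fuzzy sets defined in Step 1 of the design procedure.

Step 2. Construct $M = N_1 \times N_2$ fuzzy IF-THEN rules in the following form:

IF $x_1$ is $A_1^{i_1}$ and $x_2$ is $A_2^{i_2}$, THEN $y$ is $B^{i_1 i_2}$    (10.8)

where $i_1 = 1, 2, \ldots, N_1$, $i_2 = 1, 2, \ldots, N_2$, and the center of the fuzzy set $B^{i_1 i_2}$, denoted by $\bar{y}^{i_1 i_2}$, is chosen as

\[ \bar{y}^{i_1 i_2} = g(e_1^{i_1}, e_2^{i_2}) \tag{10.9} \]

For the example in Fig. 10.3, we have $3 \times 4 = 12$ rules, and the centers of $B^{i_1 i_2}$ are equal to the $g(x)$ evaluated at the 12 dark points shown in the figure.

Step 3. Construct the fuzzy system $f(x)$ from the $N_1 \times N_2$ rules of (10.8) using product inference engine (7.23), singleton fuzzifier (8.1), and center average defuzzifier (8.18) (see Lemma 9.1):

\[ f(x) = \frac{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \bar{y}^{i_1 i_2} \left( \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2) \right)}{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \left( \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2) \right)} \tag{10.10} \]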
Since the fuzzy sets $A_i^1, \ldots, A_i^{N_i}$ are complete, at every $x \in U$ there exist $i_1$ and $i_2$ such that $\mu_{A_1^{i_1}}(x_1)\,\mu_{A_2^{i_2}}(x_2) \ne 0$. Hence, the fuzzy system (10.10) is well defined, that is, its denominator is always nonzero.

From Step 2 we see that the IF parts of the rules (10.8) constitute all the possible combinations of the fuzzy sets defined for each input variable. So, if we generalize the design procedure to $n$-input fuzzy systems and define $N$ fuzzy sets for each input variable, then the total number of rules is $N^n$; that is, by using this design method, the number of rules increases exponentially with the dimension of the input space. This is called the curse of dimensionality and is a general problem for all high-dimensional approximation problems. We will address this issue again in Chapter 22.

The final observation of the design procedure is that we must know the values of $g(x)$ at $x = (e_1^{i_1}, e_2^{i_2})$ for $i_1 = 1, 2, \ldots, N_1$ and $i_2 = 1, 2, \ldots, N_2$. Since $(e_1^{i_1}, e_2^{i_2})$ can be arbitrary points in $U$, this is equivalent to saying that we need to know the values of $g(x)$ at any $x \in U$.

Next, we study the approximation accuracy of the $f(x)$ designed above to the unknown function $g(x)$.

Theorem 10.1. Let $f(x)$ be the fuzzy system in (10.10) and $g(x)$ be the unknown function in the Problem above. If $g(x)$ is continuously differentiable on $U = [\alpha_1, \beta_1] \times [\alpha_2, \beta_2]$, then

\[ \|g - f\|_\infty \le \left\| \frac{\partial g}{\partial x_1} \right\|_\infty h_1 + \left\| \frac{\partial g}{\partial x_2} \right\|_\infty h_2 \tag{10.11} \]

where the infinite norm $\|\cdot\|_\infty$ is defined as $\|d(x)\|_\infty = \sup_{x \in U} |d(x)|$, and $h_i = \max_{1 \le j \le N_i - 1} |e_i^{j+1} - e_i^j|$ ($i = 1, 2$).
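Proof: Let $U^{i_1 i_2} = [e_1^{i_1}, e_1^{i_1+1}] \times [e_2^{i_2}, e_2^{i_2+1}]$, where $i_1 = 1, 2, \ldots, N_1 - 1$ and $i_2 = 1, 2, \ldots, N_2 - 1$. Since $[\alpha_i, \beta_i] = [e_i^1, e_i^2] \cup [e_i^2, e_i^3] \cup \cdots \cup [e_i^{N_i - 1}, e_i^{N_i}]$, $i = 1, 2$, we have

\[ U = [\alpha_1, \beta_1] \times [\alpha_2, \beta_2] = \bigcup_{i_1=1}^{N_1-1} \bigcup_{i_2=1}^{N_2-1} U^{i_1 i_2} \tag{10.12} \]

which implies that for any $x \in U$, there exists $U^{i_1 i_2}$ such that $x \in U^{i_1 i_2}$. Now suppose $x \in U^{i_1 i_2}$, that is, $x_1 \in [e_1^{i_1}, e_1^{i_1+1}]$ and $x_2 \in [e_2^{i_2}, e_2^{i_2+1}]$ (since $x$ is fixed, $i_1$ and $i_2$ are also fixed in the sequel). Since the fuzzy sets $A_1^1, \ldots, A_1^{N_1}$ are normal, consistent and complete, at least one and at most two $\mu_{A_1^{j_1}}(x_1)$ are nonzero for $j_1 = 1, 2, \ldots, N_1$. From the definition of $e_1^{j_1}$ ($j_1 = 1, 2, \ldots, N_1 - 1$), these two possible nonzero $\mu_{A_1^{j_1}}(x_1)$'s are $\mu_{A_1^{i_1}}(x_1)$ and $\mu_{A_1^{i_1+1}}(x_1)$. Similarly, the two possible nonzero $\mu_{A_2^{j_2}}(x_2)$'s (for $j_2 = 1, 2, \ldots, N_2$) are $\mu_{A_2^{i_2}}(x_2)$ and $\mu_{A_2^{i_2+1}}(x_2)$. Hence, the fuzzy system $f(x)$ of (10.10) is simplified to

\[ f(x) = \frac{\sum_{j_1=i_1}^{i_1+1} \sum_{j_2=i_2}^{i_2+1} \bar{y}^{j_1 j_2}\, \mu_{A_1^{j_1}}(x_1)\, \mu_{A_2^{j_2}}(x_2)}{\sum_{j_1=i_1}^{i_1+1} \sum_{j_2=i_2}^{i_2+1} \mu_{A_1^{j_1}}(x_1)\, \mu_{A_2^{j_2}}(x_2)} = \sum_{j_1=i_1}^{i_1+1} \sum_{j_2=i_2}^{i_2+1} \left[ \frac{\mu_{A_1^{j_1}}(x_1)\, \mu_{A_2^{j_2}}(x_2)}{\sum_{k_1=i_1}^{i_1+1} \sum_{k_2=i_2}^{i_2+1} \mu_{A_1^{k_1}}(x_1)\, \mu_{A_2^{k_2}}(x_2)} \right] g(e_1^{j_1}, e_2^{j_2}) \]

Since the weights in the brackets are nonnegative and sum to one, we have

\[ |g(x) - f(x)| \le \max_{j_1 = i_1, i_1+1;\ j_2 = i_2, i_2+1} |g(x) - g(e_1^{j_1}, e_2^{j_2})| \]

Using the Mean Value Theorem, we have

\[ |g(x) - f(x)| \le \max_{j_1 = i_1, i_1+1;\ j_2 = i_2, i_2+1} \left( \left\| \frac{\partial g}{\partial x_1} \right\|_\infty |x_1 - e_1^{j_1}| + \left\| \frac{\partial g}{\partial x_2} \right\|_\infty |x_2 - e_2^{j_2}| \right) \tag{10.16} \]

Since $x \in U^{i_1 i_2}$, which means that $x_1 \in [e_1^{i_1}, e_1^{i_1+1}]$ and $x_2 \in [e_2^{i_2}, e_2^{i_2+1}]$, we have that $|x_1 - e_1^{j_1}| \le |e_1^{i_1+1} - e_1^{i_1}|$ and $|x_2 - e_2^{j_2}| \le |e_2^{i_2+1} - e_2^{i_2}|$ for $j_1 = i_1, i_1 + 1$ and $j_2 = i_2, i_2 + 1$. Thus, (10.16) becomes

\[ |g(x) - f(x)| \le \left\| \frac{\partial g}{\partial x_1} \right\|_\infty |e_1^{i_1+1} - e_1^{i_1}| + \left\| \frac{\partial g}{\partial x_2} \right\|_\infty |e_2^{i_2+1} - e_2^{i_2}| \]

from which we have

\[ \|g - f\|_\infty = \sup_{x \in U} |g(x) - f(x)| \le \left\| \frac{\partial g}{\partial x_1} \right\|_\infty \max_{1 \le i_1 \le N_1 - 1} |e_1^{i_1+1} - e_1^{i_1}| + \left\| \frac{\partial g}{\partial x_2} \right\|_\infty \max_{1 \le i_2 \le N_2 - 1} |e_2^{i_2+1} - e_2^{i_2}| = \left\| \frac{\partial g}{\partial x_1} \right\|_\infty h_1 + \left\| \frac{\partial g}{\partial x_2} \right\|_\infty h_2 \]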
Theorem 10.1 is an important theorem. We can draw a number of conclusions from it, as follows:

• From (10.11) we can conclude that fuzzy systems in the form of (10.10) are universal approximators. Specifically, since $\|\partial g / \partial x_1\|_\infty$ and $\|\partial g / \partial x_2\|_\infty$ are finite numbers (a continuous function on a compact set is bounded by a finite number), for any given $\epsilon > 0$ we can choose $h_1$ and $h_2$ small enough such that $\|\partial g / \partial x_1\|_\infty h_1 + \|\partial g / \partial x_2\|_\infty h_2 < \epsilon$. Hence from (10.11) we have $\sup_{x \in U} |g(x) - f(x)| = \|g - f\|_\infty < \epsilon$.

• From (10.11) and the definition of $h_1$ and $h_2$ we see that more accurate approximation can be obtained by defining more fuzzy sets for each $x_i$. This confirms the intuition that more rules result in more powerful fuzzy systems.

• From (10.11) we see that in order to design a fuzzy system with a prespecified accuracy, we must know the bounds of the derivatives of $g(x)$ with respect to $x_1$ and $x_2$, that is, $\|\partial g / \partial x_1\|_\infty$ and $\|\partial g / \partial x_2\|_\infty$. In the design process, we need to know the value of $g(x)$ at $x = (e_1^{i_1}, e_2^{i_2})$ for $i_1 = 1, 2, \ldots, N_1$ and $i_2 = 1, 2, \ldots, N_2$. Therefore, this approach requires these two pieces of information in order for the designed fuzzy system to achieve any prespecified degree of accuracy.

• From the proof of Theorem 10.1 we see that if we change $\mu_{A_1^{i_1}}(x_1)\,\mu_{A_2^{i_2}}(x_2)$ to $\min[\mu_{A_1^{i_1}}(x_1), \mu_{A_2^{i_2}}(x_2)]$, the proof is still valid. Therefore, if we use the minimum inference engine in the design procedure and keep the others unchanged, the designed fuzzy system still has the approximation property in Theorem 10.1. Consequently, the fuzzy systems with minimum inference engine, singleton fuzzifier, center average defuzzifier and pseudo-trapezoid membership functions are universal approximators.
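The three design steps and the bound of Theorem 10.1 can be sketched for a two-input case. The code below makes several simplifying assumptions that are not part of the theorem: the fuzzy sets are triangular with equally spaced peaks, the target $g(x_1, x_2) = \sin(x_1)\cos(x_2)$ on $[-1, 1] \times [-1, 1]$ is a hypothetical test function, and the sup norm is estimated on a finite grid of test points.

```python
from math import sin, cos


def triangle(x, left, peak, right):
    """Triangular membership function; left == peak or peak == right is allowed."""
    if x < left or x > right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    if x > peak:
        return (right - x) / (right - peak)
    return 1.0


def design_fuzzy_system(g, bounds, n_sets):
    # Step 1: equally spaced points e_i^j serve as the peaks of triangular sets.
    nodes = [[lo + (hi - lo) * j / (n - 1) for j in range(n)]
             for (lo, hi), n in zip(bounds, n_sets)]

    def membership(i, j, x):
        e = nodes[i]
        return triangle(x, e[max(j - 1, 0)], e[j], e[min(j + 1, len(e) - 1)])

    # Step 2: the center of each THEN-part set is g evaluated at the node pair.
    centers = [[g(e1, e2) for e2 in nodes[1]] for e1 in nodes[0]]

    # Step 3: product inference, singleton fuzzifier, center average defuzzifier.
    def f(x1, x2):
        num = den = 0.0
        for j1 in range(n_sets[0]):
            for j2 in range(n_sets[1]):
                w = membership(0, j1, x1) * membership(1, j2, x2)
                num += centers[j1][j2] * w
                den += w
        return num / den

    return f, nodes


def g(x1, x2):
    return sin(x1) * cos(x2)            # hypothetical target function


f, nodes = design_fuzzy_system(g, [(-1.0, 1.0), (-1.0, 1.0)], [11, 11])
h = 0.2                                  # spacing of the 11 nodes on [-1, 1]
bound = 1.0 * h + sin(1.0) ** 2 * h      # sup|dg/dx1| = 1, sup|dg/dx2| = sin(1)^2
test_points = [-1.0 + 2.0 * k / 40 for k in range(41)]
max_err = max(abs(g(a, b) - f(a, b)) for a in test_points for b in test_points)
print(max_err, bound)
```

In this sketch the observed error is far below the bound, which is consistent with the bound being a worst-case first-order estimate. With 11 sets per input the rule base already has 121 rules, illustrating how the rule count grows with the number of sets and inputs.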
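Theorem 10.1 gives the accuracy of $f(x)$ as an approximator to $g(x)$. The next lemma shows at what points $f(x)$ and $g(x)$ are exactly equal.
Lemma 10.3. Let $f(x)$ be the fuzzy system (10.10) and $e_1^{i_1}$ and $e_2^{i_2}$ be the points defined in the design procedure for $f(x)$. Then,
$f(e_1^{i_1}, e_2^{i_2}) = g(e_1^{i_1}, e_2^{i_2})$
for $i_1 = 1, 2, \ldots, N_1$ and $i_2 = 1, 2, \ldots, N_2$.
Proof: Since the fuzzy sets $A_i^j$ are normal and consistent, $\mu_{A_1^{i_1}}(e_1^{i_1}) = \mu_{A_2^{i_2}}(e_2^{i_2}) = 1$, while $\mu_{A_1^{j_1}}(e_1^{i_1}) = 0$ for $j_1 \ne i_1$ and $\mu_{A_2^{j_2}}(e_2^{i_2}) = 0$ for $j_2 \ne i_2$. Hence only the term with center $g(e_1^{i_1}, e_2^{i_2})$ remains in (10.10).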
Lemma 10.3 shows that the fuzzy system (10.10) can be viewed as an interpolation of the function $g(x)$ at some regular points $(e_1^{i_1}, e_2^{i_2})$ ($i_1 = 1, 2, \ldots, N_1$, $i_2 = 1, 2, \ldots, N_2$) in the universe of discourse $U$. This is intuitively appealing. Finally, we show two examples of how to use Theorem 10.1 to design the required fuzzy system.
Example 10.1. Design a fuzzy system $f(x)$ to uniformly approximate the continuous function $g(x) = \sin(x)$ defined on $U = [-3, 3]$ with a required accuracy of $\epsilon = 0.2$; that is, $\sup_{x \in U} |g(x) - f(x)| < \epsilon$.
Since $\|\partial g/\partial x\|_\infty = \|\cos(x)\|_\infty = 1$, from (10.11) we see that the fuzzy system with $h = 0.2$ meets our requirement. Therefore, we define the following 31 fuzzy sets $A^j$ in $U = [-3, 3]$ with the triangular membership functions
$\mu_{A^1}(x) = \mu_{A^1}(x; -3, -3, -2.8)$    (10.20)
$\mu_{A^{31}}(x) = \mu_{A^{31}}(x; 2.8, 3, 3)$    (10.21)
and
$\mu_{A^j}(x) = \mu_{A^j}(x; e^{j-1}, e^j, e^{j+1})$    (10.22)
where $j = 2, 3, \ldots, 30$, and $e^j = -3 + 0.2(j - 1)$. These membership functions are shown in Fig. 10.4. According to (10.10), the designed fuzzy system is
$f(x) = \frac{\sum_{j=1}^{31} \sin(e^j)\, \mu_{A^j}(x)}{\sum_{j=1}^{31} \mu_{A^j}(x)}$    (10.23)
which is plotted in Fig. 10.5 against $g(x) = \sin(x)$. We see from Fig. 10.5 that $f(x)$ and $g(x)$ are almost identical.
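The design in Example 10.1 can be checked numerically. The following is a minimal sketch, not the book's code: the helper names `tri` and `fuzzy_system` and the evaluation grid are assumptions made for illustration. It builds the 31 triangular sets with centers $e^j = -3 + 0.2(j-1)$, evaluates the center average form of (10.10) with centers $\sin(e^j)$, and measures the largest error on a fine grid.

```python
import math

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b; a == b or b == c gives a shoulder."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return 1.0 if b == a else (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_system(x, centers, g):
    """Center average of g(e^j) weighted by the triangular memberships, as in (10.10)."""
    num = den = 0.0
    last = len(centers) - 1
    for j, e in enumerate(centers):
        left = centers[j - 1] if j > 0 else e      # first set has a left shoulder
        right = centers[j + 1] if j < last else e  # last set has a right shoulder
        w = tri(x, left, e, right)
        num += g(e) * w
        den += w
    return num / den

N = 31                                                  # number of fuzzy sets for h = 0.2
centers = [-3.0 + 6.0 * j / (N - 1) for j in range(N)]  # e^1 = -3, ..., e^31 = 3
grid = [-3.0 + 6.0 * k / 3000 for k in range(3001)]     # evaluation points (assumed)
max_err = max(abs(math.sin(x) - fuzzy_system(x, centers, math.sin)) for x in grid)
print(len(centers), max_err < 0.2)
```

The measured error is only a grid estimate of the supremum, so it supports the bound rather than proving it.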
Example 10.2. Design a fuzzy system to uniformly approximate the function $g(x) = 0.52 + 0.1x_1 + 0.28x_2 - 0.06x_1x_2$ defined on $U = [-1, 1] \times [-1, 1]$ with a required accuracy of $\epsilon = 0.1$.
Since $\|\partial g/\partial x_1\|_\infty = \sup_{x \in U} |0.1 - 0.06x_2| = 0.16$ and $\|\partial g/\partial x_2\|_\infty = \sup_{x \in U} |0.28 - 0.06x_1| = 0.34$, from (10.11) we see that $h_1 = h_2 = 0.2$ results in $\|g - f\|_\infty \le 0.16 \cdot 0.2 + 0.34 \cdot 0.2 = 0.1$. Therefore, we define 11 fuzzy sets $A^j$ ($j = 1, 2, \ldots, 11$) in $[-1, 1]$ with the following triangular membership functions:
$\mu_{A^1}(x) = \mu_{A^1}(x; -1, -1, -0.8)$    (10.24)
$\mu_{A^{11}}(x) = \mu_{A^{11}}(x; 0.8, 1, 1)$    (10.25)
Figure 10.5. The designed fuzzy system $f(x)$ and the function $g(x) = \sin(x)$ (they are almost identical).
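and
$\mu_{A^j}(x) = \mu_{A^j}(x; e^{j-1}, e^j, e^{j+1})$    (10.26)
for $j = 2, 3, \ldots, 10$, where $e^j = -1 + 0.2(j - 1)$. The fuzzy system is constructed from the following $11 \times 11 = 121$ rules:
IF $x_1$ is $A^{i_1}$ and $x_2$ is $A^{i_2}$, THEN $y$ is $B^{i_1 i_2}$    (10.27)
where $i_1, i_2 = 1, 2, \ldots, 11$, and the center of $B^{i_1 i_2}$ is $\bar{y}^{i_1 i_2} = g(e^{i_1}, e^{i_2})$. The final fuzzy system is
$f(x) = \frac{\sum_{i_1=1}^{11} \sum_{i_2=1}^{11} g(e^{i_1}, e^{i_2})\, \mu_{A^{i_1}}(x_1)\, \mu_{A^{i_2}}(x_2)}{\sum_{i_1=1}^{11} \sum_{i_2=1}^{11} \mu_{A^{i_1}}(x_1)\, \mu_{A^{i_2}}(x_2)}$    (10.28)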
From Example 10.2 we see that we need 121 rules to approximate the function $g(x)$. Are so many rules really necessary? In other words, can we improve the bound in (10.11) so that we can use fewer rules to approximate the same function to the same accuracy? The answer is yes and we will study the details in the next chapter.
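The arithmetic behind the rule count in Example 10.2 can be reproduced in a few lines. This is only a sketch: the helper name `sup_abs_linear` is an assumption, and it relies on the fact that the absolute value of a linear function on an interval attains its supremum at an endpoint.

```python
def sup_abs_linear(c0, c1, lo, hi):
    """sup of |c0 + c1 * t| over t in [lo, hi]; the maximum sits at an endpoint."""
    return max(abs(c0 + c1 * lo), abs(c0 + c1 * hi))

d1 = sup_abs_linear(0.1, -0.06, -1.0, 1.0)    # ||dg/dx1|| = sup |0.1 - 0.06 x2|
d2 = sup_abs_linear(0.28, -0.06, -1.0, 1.0)   # ||dg/dx2|| = sup |0.28 - 0.06 x1|
h1 = h2 = 0.2
bound = d1 * h1 + d2 * h2                      # right-hand side of (10.11)
sets_per_input = round(2.0 / h1) + 1           # the interval [-1, 1] has length 2
rules = sets_per_input ** 2                    # one rule per pair of input fuzzy sets
print(d1, d2, bound, sets_per_input, rules)
```

Halving $h_1$ and $h_2$ would halve the bound but roughly quadruple the number of rules, which is the cost that motivates the sharper bound of the next chapter.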
10.4 Summary and Further Readings
In this chapter we have demonstrated the following: The three types of approximation problems (classified according to the information available). The concepts of completeness, consistency, and order of fuzzy sets, and their application to fuzzy sets with pseudo-trapezoid membership functions. For a given accuracy requirement, how to design a fuzzy system that can approximate a given function to the required accuracy. The idea of proving the approximation bound (10.11). Approximation accuracies of fuzzy systems were analyzed in Ying [1994] and Zeng and Singh [1995]. This is a relatively new topic and very few references are available.
10.5 Exercises
Exercise 10.1. Let fuzzy sets $A^j$ in $U = [a, b]$ ($j = 1, 2, \ldots, N$) be normal, consistent, and complete with pseudo-trapezoid membership functions $\mu_{A^j}(x) = \mu_{A^j}(x; a_j, b_j, c_j, d_j)$. Suppose that $A^1 < A^2 < \cdots < A^N$. Define fuzzy sets $B^j$ whose membership functions are given as
Show that: (a) $\mu_{B^j}(x)$ ($j = 1, 2, \ldots, N$) are also pseudo-trapezoid membership functions. (b) The fuzzy sets $B^j$ ($j = 1, 2, \ldots, N$) are also normal, consistent, and complete. (d) If $\mu_{A^j}(x) = \mu_{A^j}(x; a_j, b_j, c_j, d_j)$ are trapezoid membership functions with $c_i = a_{i+1}$, $d_i = b_{i+1}$ for $i = 1, 2, \ldots, N - 1$, then $\mu_{B^j}(x) = \mu_{A^j}(x)$ for $j = 1, 2, \ldots, N - 1$.
Exercise 10.2. Design a fuzzy system to uniformly approximate the function $g(x) = \sin(x\pi) + \cos(x\pi) + \sin(x\pi)\cos(x\pi)$ on $U = [-1, 1]$ with a required accuracy of $\epsilon = 0.1$.
Exercise 10.3. Design a fuzzy system to uniformly approximate the function $g(x) = \sin(x_1\pi) + \cos(x_2\pi) + \sin(x_1\pi)\cos(x_2\pi)$ on $U = [-1, 1] \times [-1, 1]$ with a required accuracy of $\epsilon = 0.1$.
Exercise 10.4. Extend the design method in Section 10.2 to n-input fuzzy systems.
Exercise 10.5. Let the function $g(x)$ on $U = [0, 1]^3$ be given by
where $K = \{k_1 k_2 k_3 \mid k_i = 0, 1;\ i = 1, 2, 3 \text{ and } k_1 + k_2 + k_3 > 0\}$. Design a fuzzy system to uniformly approximate $g(x)$ with a required accuracy of $\epsilon = 0.05$.
Exercise 10.6. Show that if the fuzzy sets $A_i^1, A_i^2, \ldots, A_i^{N_i}$ in the design procedure of Section 10.2 are not complete, then the fuzzy system (10.10) is not well defined. If these fuzzy sets are not normal or not consistent, is the fuzzy system (10.10) well defined?
Exercise 10.7. Plot the fuzzy system (10.28) on $U = [-1, 1] \times [-1, 1]$ and compare it with $g(x) = 0.52 + 0.1x_1 + 0.28x_2 - 0.06x_1x_2$.
Chapter 11
In Chapter 10 we saw that by using the bound in Theorem 10.1 a large number of rules are usually required to approximate some simple functions. For example, Example 10.2 showed that we need 121 rules to approximate a two-dimensional quadratic function. Observing (10.11) we note that the bound is a linear function of $h_i$. Since $h_i$ are usually small, if a bound could be a linear function of $h_i^2$, then this bound would be much smaller than the bound in (10.11). That is, if we can obtain a tighter bound than that used in (10.11), we may use fewer rules to approximate the same function with the same accuracy.
In approximation theory (Powell [1981]), if $g(x)$ is a given function on $U$ and $U^{i_1 i_2}$ ($i_1 = 1, 2, \ldots, N_1$, $i_2 = 1, 2, \ldots, N_2$) is a partition of $U$ as in the proof of Theorem 10.1, then $f(x)$ is said to be the k'th order accurate approximator for $g(x)$ if $\|g - f\|_\infty \le M_g h^k$, where $M_g$ is a constant that depends on the function $g$, and $h$ is the module of the partition that in our case is $\max(h_1, h_2)$. In this chapter, we first design a fuzzy system that is a second-order accurate approximator.
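Step 1. Define $N_i$ ($i = 1, 2$) fuzzy sets $A_i^1, A_i^2, \ldots, A_i^{N_i}$ in $[\alpha_i, \beta_i]$, which are normal, consistent, and complete with the triangular membership functions
$\mu_{A_i^1}(x_i) = \mu_{A_i^1}(x_i; e_i^1, e_i^1, e_i^2)$    (11.1)
$\mu_{A_i^j}(x_i) = \mu_{A_i^j}(x_i; e_i^{j-1}, e_i^j, e_i^{j+1})$ for $j = 2, 3, \ldots, N_i - 1$    (11.2)
$\mu_{A_i^{N_i}}(x_i) = \mu_{A_i^{N_i}}(x_i; e_i^{N_i - 1}, e_i^{N_i}, e_i^{N_i})$    (11.3)
where $i = 1, 2$, and $\alpha_i = e_i^1 < e_i^2 < \cdots < e_i^{N_i} = \beta_i$. Fig. 11.1 shows an example with $N_1 = 4$, $N_2 = 5$, $\alpha_1 = \alpha_2 = 0$ and $\beta_1 = \beta_2 = 1$.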
Steps 2 and 3. The same as Steps 2 and 3 of the design procedure in Section 10.2. That is, the constructed fuzzy system is given by (10.10), where $\bar{y}^{i_1 i_2}$ are given by (10.9) and the $A_1^{i_1}$ and $A_2^{i_2}$ are given by (11.1)-(11.3).
Since the fuzzy system designed from the above steps is a special case of the fuzzy system designed in Section 10.2, Theorem 10.1 is still valid for this fuzzy system. The following theorem gives a stronger result.
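Theorem 11.1. Let $f(x)$ be the fuzzy system designed through the above three steps. If $g(x)$ is twice continuously differentiable on $U$, then
$\|g - f\|_\infty \le \frac{1}{8}\left[\left\|\frac{\partial^2 g}{\partial x_1^2}\right\|_\infty h_1^2 + \left\|\frac{\partial^2 g}{\partial x_2^2}\right\|_\infty h_2^2\right]$    (11.4)
where $h_i = \max_{1 \le j \le N_i - 1} |e_i^{j+1} - e_i^j|$ ($i = 1, 2$).
Proof: As in the proof of Theorem 10.1, we partition $U$ into $U = \bigcup_{i_1=1}^{N_1-1} \bigcup_{i_2=1}^{N_2-1} U^{i_1 i_2}$, where $U^{i_1 i_2} = [e_1^{i_1}, e_1^{i_1+1}] \times [e_2^{i_2}, e_2^{i_2+1}]$. So, for any $x \in U$, there exists $U^{i_1 i_2}$ such that $x \in U^{i_1 i_2}$. Now suppose $x \in U^{i_1 i_2}$; then by the consistency and completeness of the fuzzy sets $A_i^1, A_i^2, \ldots, A_i^{N_i}$ ($i = 1, 2$), the fuzzy system can be simplified to (same as (10.13))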
Since $\mu_{A_i^j}(x_i)$ are the special triangular membership functions given by (11.1)-(11.3), we have
$\mu_{A_i^{i_i}}(x_i) + \mu_{A_i^{i_i+1}}(x_i) = 1$    (11.6)
for $i = 1, 2$. Hence,
Let $C^2(U^{i_1 i_2})$ be the set of all twice continuously differentiable functions on $U^{i_1 i_2}$ and define linear operators $L_1$ and $L_2$ on $C^2(U^{i_1 i_2})$ as follows:
Since $\mu_{A_1^{j_1}}(x_1)$ and $\mu_{A_2^{j_2}}(x_2)$ are linear functions in $U^{i_1 i_2}$, they are twice continuously differentiable. Hence, $g \in C^2(U^{i_1 i_2})$ implies $L_1 g \in C^2(U^{i_1 i_2})$ and $L_2 g \in C^2(U^{i_1 i_2})$. From (11.9) and (11.6) we have
Similarly, we have
Substituting (11.14) and (11.15) into (11.13) and noticing the definition of hi, we obtain (11.4). From Theorem 11.1 we see that if we choose the particular triangular membership functions, a second-order accurate approximator can be obtained. We now design fuzzy systems to approximate the same functions g(x) in Examples 10.1 and 10.2 using the new bound (11.4). We will see that we can achieve the same accuracy with fewer rules.
Example 11.1. Same as Example 10.1 except that we now use the bound (11.4). Since $\|\partial^2 g/\partial x^2\|_\infty = 1$, we have from (11.4) that if we choose $h = 1$, then we have $\|g - f\|_\infty \le \frac{1}{8} < \epsilon$. Therefore, we define 7 fuzzy sets $A^j$ in the form of (11.1)-(11.3) with $e^j = -3 + (j - 1)$ for $j = 1, 2, \ldots, 7$. The designed fuzzy system is
$f(x) = \frac{\sum_{j=1}^{7} \sin(e^j)\, \mu_{A^j}(x)}{\sum_{j=1}^{7} \mu_{A^j}(x)}$    (11.16)
Comparing (11.16) with (10.23) we see that the number of rules is reduced from 31 to 7, but the accuracy remains the same. The $f(x)$ of (11.16) is plotted in Fig. 11.2 against $g(x) = \sin(x)$.
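Because the two active memberships sum to one by (11.6), the denominator of (11.16) equals 1 and the system reduces to piecewise-linear interpolation between neighbouring centers. The sketch below uses that observation to evaluate the 7-rule design; the function name `interp_system` and the evaluation grid are assumptions, and inputs are assumed to lie in $[-3, 3]$.

```python
import bisect
import math

def interp_system(x, centers, g):
    """Evaluate the center average system when neighbouring memberships sum to 1."""
    # Index of the cell [e^i, e^{i+1}] containing x (the last cell also takes x = e^N).
    i = min(bisect.bisect_right(centers, x) - 1, len(centers) - 2)
    e0, e1 = centers[i], centers[i + 1]
    w1 = (x - e0) / (e1 - e0)   # membership of x in the set centered at e1
    w0 = 1.0 - w1               # membership of x in the set centered at e0, by (11.6)
    return w0 * g(e0) + w1 * g(e1)

centers = [-3.0 + j for j in range(7)]               # e^j = -3 + (j - 1), h = 1
grid = [-3.0 + 6.0 * k / 6000 for k in range(6001)]  # evaluation points (assumed)
max_err = max(abs(math.sin(x) - interp_system(x, centers, math.sin)) for x in grid)
print(len(centers), max_err <= 1.0 / 8.0)
```

The grid error stays within the value $1/8$ predicted by (11.4), which is itself below the required $\epsilon = 0.2$.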
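Example 11.2. Same as Example 10.2 except that we now use the bound (11.4). Since $\partial^2 g/\partial x_i^2 = 0$ ($i = 1, 2$), we know from (11.4) that $f(x) = g(x)$ for all $x \in U$. In fact, choosing $h_i = 2$, $e_i^1 = -1$ and $e_i^2 = 1$ for $i = 1, 2$ (that is, $N_1 = N_2 = 2$), we obtain the designed fuzzy system as
$f(x) = \frac{\sum_{i_1=1}^{2} \sum_{i_2=1}^{2} g(e_1^{i_1}, e_2^{i_2})\, \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2)}{\sum_{i_1=1}^{2} \sum_{i_2=1}^{2} \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2)}$    (11.17)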
Figure 11.2. The designed fuzzy system $f(x)$ of (11.16) and the function $g(x) = \sin(x)$ to be approximated.
where the membership functions are given by (11.1)-(11.3). For this particular case, we have for $i = 1, 2$ and $x \in U$ that
$\mu_{A_i^1}(x_i) = \frac{1}{2}(1 - x_i)$    (11.18)
$\mu_{A_i^2}(x_i) = \frac{1}{2}(1 + x_i)$    (11.19)
and $g(e_1^1, e_2^1) = g(-1, -1) = 0.08$, $g(e_1^1, e_2^2) = g(-1, 1) = 0.76$, $g(e_1^2, e_2^1) = g(1, -1) = 0.4$, and $g(e_1^2, e_2^2) = g(1, 1) = 0.84$. Substituting these results into (11.17), we obtain
$f(x) = \frac{\frac{0.08}{4}(1 - x_1)(1 - x_2) + \frac{0.76}{4}(1 - x_1)(1 + x_2) + \frac{0.4}{4}(1 + x_1)(1 - x_2) + \frac{0.84}{4}(1 + x_1)(1 + x_2)}{\frac{1}{4}(1 - x_1)(1 - x_2) + \frac{1}{4}(1 - x_1)(1 + x_2) + \frac{1}{4}(1 + x_1)(1 - x_2) + \frac{1}{4}(1 + x_1)(1 + x_2)}$
$= 0.52 + 0.1x_1 + 0.28x_2 - 0.06x_1x_2$    (11.20)
which is exactly the same as $g(x)$. This confirms our conclusion that $f(x)$ exactly reproduces $g(x)$. In Example 10.2 we used 121 rules to construct the fuzzy system, whereas in this example we only use 4 rules to achieve zero approximation error. To generalize Example 11.2, we observe from (11.4) that for any function with $\|\partial^2 g/\partial x_i^2\|_\infty = 0$ ($i = 1, 2$), our fuzzy system $f(x)$ designed through the three steps reproduces $g(x)$, that is, $f(x) = g(x)$ for all $x \in U$. This gives the following corollary of Theorem 11.1.
Corollary 11.1. Let f (x) be the fuzzy system designed through the three steps in this section. If the function g(x) is of the following form:
Proof: Since $\partial^2 g/\partial x_1^2 = \partial^2 g/\partial x_2^2 = 0$ for this class of $g(x)$, the conclusion follows immediately from (11.4).
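The exact reproduction claimed in Example 11.2 is easy to confirm numerically. The following sketch (function names are illustrative assumptions) builds the 4-rule system from the two triangular sets per input, whose memberships on $[-1, 1]$ are $(1 - x_i)/2$ and $(1 + x_i)/2$, and compares it with $g$ at a few points.

```python
def g(x1, x2):
    """The function of Examples 10.2 and 11.2."""
    return 0.52 + 0.1 * x1 + 0.28 * x2 - 0.06 * x1 * x2

def mu(x):
    """Memberships of x in the two triangular sets centered at -1 and 1."""
    return [(1.0 - x) / 2.0, (1.0 + x) / 2.0]

def f(x1, x2):
    """Four-rule fuzzy system: product inference with center average defuzzifier."""
    corners = [-1.0, 1.0]
    num = den = 0.0
    for e1, w1 in zip(corners, mu(x1)):
        for e2, w2 in zip(corners, mu(x2)):
            num += g(e1, e2) * w1 * w2   # rule center is g at the grid corner
            den += w1 * w2
    return num / den

points = [(-1.0, -1.0), (0.3, -0.7), (0.0, 0.0), (0.9, 0.25), (1.0, 1.0)]
worst = max(abs(f(a, b) - g(a, b)) for a, b in points)
print(worst < 1e-12)
```

Any difference that shows up is floating-point rounding, since $g$ has no $x_1^2$ or $x_2^2$ terms and the bound (11.4) is zero.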
Design of Fuzzy System with Maximum Defuzzifier:
Step 1. Same as Step 1 of the design procedure in Section 11.1.
Step 2. Same as Step 2 of the design procedure in Section 10.2.
Step 3. Construct a fuzzy system $f(x)$ from the $N_1 \times N_2$ rules in the form of (10.8) using product inference engine (7.23), singleton fuzzifier (8.1), and maximum defuzzifier (8.23). According to Lemma 9.4, this fuzzy system is
The following theorem shows that the fuzzy system designed above is a first-order accurate approximator of $g(x)$.
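Theorem 11.2. Let $f(x)$ be the fuzzy system (11.22) designed from the three steps above. If $g(x)$ is continuously differentiable on $U = [\alpha_1, \beta_1] \times [\alpha_2, \beta_2]$, then
Proof: As in the proof of Theorem 11.1, we partition $U$ into $U = \bigcup_{i_1=1}^{N_1-1} \bigcup_{i_2=1}^{N_2-1} U^{i_1 i_2}$, where $U^{i_1 i_2} = [e_1^{i_1}, e_1^{i_1+1}] \times [e_2^{i_2}, e_2^{i_2+1}]$. We now further partition $U^{i_1 i_2}$ into $U^{i_1 i_2} = U_{00}^{i_1 i_2} \cup U_{01}^{i_1 i_2} \cup U_{10}^{i_1 i_2} \cup U_{11}^{i_1 i_2}$, where $U_{00}^{i_1 i_2} = [e_1^{i_1}, \frac{1}{2}(e_1^{i_1} + e_1^{i_1+1})] \times [e_2^{i_2}, \frac{1}{2}(e_2^{i_2} + e_2^{i_2+1})]$, $U_{01}^{i_1 i_2} = [e_1^{i_1}, \frac{1}{2}(e_1^{i_1} + e_1^{i_1+1})] \times [\frac{1}{2}(e_2^{i_2} + e_2^{i_2+1}), e_2^{i_2+1}]$, $U_{10}^{i_1 i_2} = [\frac{1}{2}(e_1^{i_1} + e_1^{i_1+1}), e_1^{i_1+1}] \times [e_2^{i_2}, \frac{1}{2}(e_2^{i_2} + e_2^{i_2+1})]$, and $U_{11}^{i_1 i_2} = [\frac{1}{2}(e_1^{i_1} + e_1^{i_1+1}), e_1^{i_1+1}] \times [\frac{1}{2}(e_2^{i_2} + e_2^{i_2+1}), e_2^{i_2+1}]$; see Fig. 11.3 for an illustration. So for any $x \in U$, there exists $U_{pq}^{i_1 i_2}$ ($p, q = 0$ or $1$) such that $x \in U_{pq}^{i_1 i_2}$. If $x$ is in the interior of $U_{pq}^{i_1 i_2}$, then with the help of Fig. 11.3 we see that $\mu_{A_1^{i_1+p}}(x_1) > 0.5$, $\mu_{A_2^{i_2+q}}(x_2) > 0.5$, and all other membership values are less than 0.5. Hence, from (11.22) and (11.23) we obtain
$f(x) = g(e_1^{i_1+p}, e_2^{i_2+q})$    (11.25)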
Using the Mean Value Theorem and the fact that $x \in U^{i_1 i_2}$, we have
If $x$ is on the boundary of $U_{pq}^{i_1 i_2}$, then with the help of Fig. 11.3 we see that $f(x)$ may take any value from a set of at most the four elements $\{g(e_1^{i_1}, e_2^{i_2}), g(e_1^{i_1}, e_2^{i_2+1}), g(e_1^{i_1+1}, e_2^{i_2}), g(e_1^{i_1+1}, e_2^{i_2+1})\}$; so (11.26) is still true in this case. Finally, (11.24) follows from (11.26). From Theorem 11.2 we immediately see that the fuzzy systems with product inference engine, singleton fuzzifier, and maximum defuzzifier are universal approximators. In fact, by choosing $h_1$ and $h_2$ sufficiently small, we can make $\|g - f\|_\infty < \epsilon$ for arbitrary $\epsilon > 0$ according to (11.24). We now approximate the functions $g(x)$ in Examples 11.1 and 11.2 using the fuzzy system (11.22).
Example 11.3. Same as Example 10.1 except that we use the fuzzy system (11.22). Since $\|\partial g/\partial x\|_\infty = 1$, we choose $h = 0.2$ and define 31 fuzzy sets $A^j$ in the form of (10.20)-(10.22) (Fig. 10.4). Let $e^j = -3 + 0.2(j - 1)$, then the fuzzy system (11.22) becomes
[Figure 11.3: partition of $U^{i_1 i_2}$ into subsets]
Example 11.4. Same as Example 10.2 except that we use the fuzzy system (11.22). As in Example 10.2, we choose $h_1 = h_2 = 0.2$ and define 11 fuzzy sets $A^j$ on $[-1, 1]$ given by (10.24)-(10.26). We construct the fuzzy system using the 121 rules in the form of (10.27). For this example, $e^j = -1 + 0.2(j - 1)$ ($j = 1, 2, \ldots, 11$) and $U^{i_1 i_2} = [e^{i_1}, e^{i_1+1}] \times [e^{i_2}, e^{i_2+1}]$ ($i_1, i_2 = 1, 2, \ldots, 10$). As shown in Fig. 11.3, we further decompose $U^{i_1 i_2}$ into $U^{i_1 i_2} = \bigcup_{p=0}^{1} \bigcup_{q=0}^{1} U_{pq}^{i_1 i_2}$, where $U_{00}^{i_1 i_2} = [e^{i_1}, \frac{1}{2}(e^{i_1} + e^{i_1+1})] \times [e^{i_2}, \frac{1}{2}(e^{i_2} + e^{i_2+1})]$, $U_{01}^{i_1 i_2} = [e^{i_1}, \frac{1}{2}(e^{i_1} + e^{i_1+1})] \times [\frac{1}{2}(e^{i_2} + e^{i_2+1}), e^{i_2+1}]$, $U_{10}^{i_1 i_2} = [\frac{1}{2}(e^{i_1} + e^{i_1+1}), e^{i_1+1}] \times [e^{i_2}, \frac{1}{2}(e^{i_2} + e^{i_2+1})]$, and $U_{11}^{i_1 i_2} = [\frac{1}{2}(e^{i_1} + e^{i_1+1}), e^{i_1+1}] \times [\frac{1}{2}(e^{i_2} + e^{i_2+1}), e^{i_2+1}]$. Then the fuzzy system (11.22) becomes
$f(x) = g(e^{i_1+p}, e^{i_2+q})$ for $x \in U_{pq}^{i_1 i_2}$    (11.28)
which is computed through the following two steps: (i) for given $x \in U$, determine $i_1, i_2, p, q$ such that $x \in U_{pq}^{i_1 i_2}$, and (ii) $f(x)$ equals $g(e^{i_1+p}, e^{i_2+q})$.
Figure 11.4. The designed fuzzy system (11.22) and the function $g(x) = \sin(x)$ to be approximated in Example 11.3.
In Section 11.1 we showed that the fuzzy systems with center average defuzzifier are second-order accurate approximators. Theorem 11.2 shows that the fuzzy systems with maximum defuzzifier are first-order accurate approximators. So it is natural to ask whether they are second-order accurate approximators also? The following example shows, unfortunately, that they cannot be second-order accurate approximators.
Example 11.5. Let $g(x) = x$ on $U = [0, 1]$ and $\mu_{A^i}(x) = \mu_{A^i}(x; e^{i-1}, e^i, e^{i+1})$, where $e^0 = 0$, $e^{N+1} = 1$, $e^i = \frac{i-1}{N-1}$, $i = 1, 2, \ldots, N$, and $N$ can be any integer greater than 1. So, in this case we have $N$ rules and $h = \frac{1}{N-1}$. Let $U^i = [e^i, e^{i+1}]$ ($i = 1, 2, \ldots, N - 1$) and $f(x)$ be the fuzzy system (11.22). If $x \in U^i$, then
$\max_{x \in U} |g(x) - f(x)| = \max_{x \in [e^i, e^{i+1}]} |x - g(e^i) \text{ or } g(e^{i+1})| = \frac{1}{2}(e^{i+1} - e^i) = \frac{1}{2}h$    (11.29)
Since $h\,(= \frac{1}{N-1}) \ge h^2$ for any integer $N \ge 2$, the fuzzy system (11.22) cannot approximate the simple function $g(x) = x$ to second-order accuracy. Because of this counter-example, we conclude that fuzzy systems with maximum defuzzifier cannot be second-order accurate approximators. Therefore, fuzzy systems with center average defuzzifier are better approximators than fuzzy systems with maximum defuzzifier.
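The counter-example can be reproduced numerically. The sketch below is an illustration under stated assumptions: for triangular sets on a one-dimensional grid the set with the largest membership is the one whose center is nearest to $x$, so the maximum defuzzifier system is modeled as "return $g$ at the nearest center". The function names and the sampling density are hypothetical.

```python
def max_defuzz_system(x, centers, g):
    """Output the center of the rule whose triangular set has the largest membership."""
    nearest = min(centers, key=lambda e: abs(x - e))
    return g(nearest)

def worst_error(N, samples_per_cell=200):
    """Largest |g(x) - f(x)| for g(x) = x on [0, 1] with N equally spaced sets."""
    h = 1.0 / (N - 1)
    centers = [j / (N - 1) for j in range(N)]
    M = samples_per_cell * (N - 1)          # even sampling, so cell midpoints are hit
    grid = [k / M for k in range(M + 1)]
    err = max(abs(x - max_defuzz_system(x, centers, lambda t: t)) for x in grid)
    return err, h

results = {N: worst_error(N) for N in (3, 5, 11)}
for N, (err, h) in results.items():
    print(N, err, h / 2.0)
```

The worst error tracks $h/2$ as $N$ grows, so it shrinks linearly in $h$ and not quadratically, in line with (11.29).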
11.3 Summary and Further Readings
In this chapter we have demonstrated the following: Using the second-order bound (11.4) to design fuzzy systems with required accuracy. Designing fuzzy systems with maximum defuzzifier to approximate functions with required accuracy. The ideas of proving the second-order bound for fuzzy systems with center average defuzzifier (Theorem 11.1) and the first-order bound for fuzzy systems with maximum defuzzifier (Theorem 11.2). Again, very few references are available on the topic of this chapter. The most relevant papers are Ying [1994] and Zeng and Singh [1995].
11.4 Exercises
Exercise 11.1. Use the first-order bound (10.11) and the second-order bound (11.4) to design two fuzzy systems with center average defuzzifier to uniformly approximate the function g(x1, x2) = on $U = [-1, 1] \times [-1, 1]$ to the accuracy of $\epsilon = 0.1$. Plot the designed fuzzy systems and compare them.
Exercise 11.2. Repeat Exercise 11.1 with g(x1, x2) = l+ri+xi.
Exercise 11.3. Design a fuzzy system with maximum defuzzifier to uniformly approximate the $g(x_1, x_2)$ in Exercise 11.1 on the same $U$ to the accuracy of $\epsilon = 0.1$. Plot the designed fuzzy system.
Exercise 11.4. Repeat Exercise 11.3 with the $g(x_1, x_2)$ in Exercise 11.2.
Exercise 11.5. Generalize the design procedure in Section 11.1 to n-input fuzzy systems and prove that the designed fuzzy system $f(x)$ satisfies
[Figure: Expert knowledge; Subconscious knowledge; Input-output pairs; Fuzzy systems]
In this part (Chapters 12-15), we will develop a number of methods for constructing fuzzy systems from input-output pairs. Because in many practical situations we are only provided with a limited number of input-output pairs and cannot obtain the outputs for arbitrary inputs, the design methods in Chapters 10 and 11 are not applicable (recall that the methods in Chapters 10 and 11 require that we can determine the output g(x) for arbitrary input x E U). That is, we are now considering the third case discussed in the beginning of Chapter 10. Our task is to design a fuzzy system that characterizes the input-output behavior represented by the input-output pairs. In Chapter 12, we will develop a simple heuristic method for designing fuzzy systems from input-output pairs and apply the method to the truck backer-upper control and time series prediction problems. In Chapter 13, we will design the fuzzy system by first specifying its structure and then adjusting its parameters using a gradient descent training algorithm; we will use the designed fuzzy systems to identify nonlinear dynamic systems. In Chapter 14, the recursive least squares algorithm will be used to design the parameters of the fuzzy system and the designed fuzzy system will be used as an equalizer for nonlinear communication channels. Finally, Chapter 15 will show how to use clustering ideas to design fuzzy systems.
Chapter 12
where $x_0^p \in U = [\alpha_1, \beta_1] \times \cdots \times [\alpha_n, \beta_n] \subset R^n$ and $y_0^p \in V = [\alpha_y, \beta_y] \subset R$. Our objective is to design a fuzzy system $f(x)$ based on these $N$ input-output pairs. We now propose the following five-step table look-up scheme to design the fuzzy system:
Step 1. Define fuzzy sets to cover the input and output spaces.
Specifically, for each $[\alpha_i, \beta_i]$, $i = 1, 2, \ldots, n$, define $N_i$ fuzzy sets $A_i^j$ ($j = 1, 2, \ldots, N_i$), which are required to be complete in $[\alpha_i, \beta_i]$; that is, for any $x_i \in [\alpha_i, \beta_i]$, there exists $A_i^j$ such that $\mu_{A_i^j}(x_i) \ne 0$. For example, we may choose $\mu_{A_i^j}(x_i)$ to be the pseudo-trapezoid membership functions: $\mu_{A_i^j}(x_i) = \mu_{A_i^j}(x_i; a_i^j, b_i^j, c_i^j, d_i^j)$, where $a_i^1 = b_i^1 = \alpha_i$, $c_i^j = a_i^{j+1} < b_i^{j+1} = d_i^j$ ($j = 1, 2, \ldots, N_i - 1$), and $c_i^{N_i} = d_i^{N_i} = \beta_i$. Similarly, define $N_y$ fuzzy sets $B^j$, $j = 1, 2, \ldots, N_y$, which are complete in $[\alpha_y, \beta_y]$. We also may choose $\mu_{B^j}(y)$ to be the pseudo-trapezoid membership functions: $\mu_{B^j}(y) = \mu_{B^j}(y; a^j, b^j, c^j, d^j)$, where $a^1 = b^1 = \alpha_y$, $c^j = a^{j+1} < b^{j+1} = d^j$ ($j = 1, 2, \ldots, N_y - 1$), and $c^{N_y} = d^{N_y} = \beta_y$. Fig. 12.2 shows an example for the $n = 2$ case, where $N_1 = 5$, $N_2 = 7$, $N_y = 5$, and the membership functions are all triangular.
Step 2. Generate one rule from one input-output pair.
First, for each input-output pair $(x_{01}^p, \ldots, x_{0n}^p; y_0^p)$, determine the membership values of $x_{0i}^p$ ($i = 1, 2, \ldots, n$) in fuzzy sets $A_i^j$ ($j = 1, 2, \ldots, N_i$) and the membership values of $y_0^p$ in fuzzy sets $B^l$ ($l = 1, 2, \ldots, N_y$). That is, compute the following:
Figure 12.2. An example of membership functions and input-output pairs for the two-input case.
$\mu_{A_i^j}(x_{0i}^p)$ for $j = 1, 2, \ldots, N_i$, $i = 1, 2, \ldots, n$, and $\mu_{B^l}(y_0^p)$ for $l = 1, 2, \ldots, N_y$. For the example in Fig. 12.2, we have approximately that: $x_{01}^1$ has membership value 0.8 in B1, 0.2 in B2, and zero in other fuzzy sets; $x_{02}^1$ has membership value 0.6 in S1, 0.4 in S2, and zero in other fuzzy sets; and $y_0^1$ has membership value 0.8 in CE, 0.2 in B1, and zero in other fuzzy sets.
Then, for each input variable $x_i$ ($i = 1, 2, \ldots, n$), determine the fuzzy set in which $x_{0i}^p$ has the largest membership value, that is, determine $A_i^{j*}$ such that $\mu_{A_i^{j*}}(x_{0i}^p) \ge \mu_{A_i^j}(x_{0i}^p)$ for $j = 1, 2, \ldots, N_i$. Similarly, determine $B^{l*}$ such that $\mu_{B^{l*}}(y_0^p) \ge \mu_{B^l}(y_0^p)$ for $l = 1, 2, \ldots, N_y$. For the example in Fig. 12.2, the input-output pair $(x_{01}^1, x_{02}^1; y_0^1)$ gives $A_1^{j*} = B1$, $A_2^{j*} = S1$ and $B^{l*} = CE$, and the pair $(x_{01}^2, x_{02}^2; y_0^2)$ gives $A_1^{j*} = B1$, $A_2^{j*} = CE$ and $B^{l*} = B1$. Finally, obtain a fuzzy IF-THEN rule as
IF $x_1$ is $A_1^{j*}$ and $\cdots$ and $x_n$ is $A_n^{j*}$, THEN $y$ is $B^{l*}$    (12.2)
For the example in Fig. 12.2, the pair $(x_{01}^1, x_{02}^1; y_0^1)$ gives the rule: IF $x_1$ is B1 and $x_2$ is S1, THEN $y$ is CE; and the pair $(x_{01}^2, x_{02}^2; y_0^2)$ gives the rule: IF $x_1$ is B1 and $x_2$ is CE, THEN $y$ is B1.
For the example in Fig. 12.2, the rule generated by $(x_{01}^1, x_{02}^1; y_0^1)$ has degree
If the input-output pairs have different reliability and we can determine a number to assess it, we may incorporate this information into the degrees of the rules. Specifically, suppose the input-output pair $(x_0^p; y_0^p)$ has reliability degree $\mu^p$ ($\in [0, 1]$), then the degree of the rule generated by $(x_0^p; y_0^p)$ is redefined as
In practice, we may ask an expert to check the data (if the number of input-output pairs is small) and estimate the degree $\mu^p$. Or, if we know the characteristics of the noise in the data pair, we may choose $\mu^p$ to reflect the strength of the noise. If we cannot tell the difference among the input-output pairs, we simply choose all $\mu^p = 1$ so that (12.6) reduces to (12.3).
The rule from a conflicting group that has the maximum degree, where a group of conflicting rules consists of rules with the same IF parts. Linguistic rules from human experts (due to conscious knowledge). Since the first two sets of rules are obtained from subconscious knowledge, the final fuzzy rule base combines both conscious and subconscious knowledge. Intuitively, we can illustrate a fuzzy rule base as a look-up table in the two-input case. For example, Fig. 12.3 demonstrates a table look-up representation of the fuzzy rule base corresponding to the fuzzy sets in Fig. 12.2. Each box represents a combination of fuzzy sets in $[\alpha_1, \beta_1]$ and fuzzy sets in $[\alpha_2, \beta_2]$ and thus a possible rule. A conflicting group consists of rules in the same box. This method can be viewed as filling up the boxes with appropriate rules; this is why we call this method a table look-up scheme.
Step 5. Construct the fuzzy system based on the fuzzy rule base.
We can use any scheme in Chapter 9 to construct the fuzzy system based on the fuzzy rule base created in Step 4. For example, we may choose fuzzy systems with product inference engine, singleton fuzzifier, and center average defuzzifier (Lemma 9.1).
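The five steps can be sketched end to end as follows. This is a minimal illustration, not the book's implementation: it assumes equally spaced triangular fuzzy sets, takes the degree of a rule to be the product of the membership values of its pair (the form used in the Wang-Mendel scheme; under that assumption the first pair in Fig. 12.2 would receive degree $0.8 \times 0.6 \times 0.8 = 0.384$), and omits reliability factors and linguistic rules from experts. All function names are hypothetical.

```python
def tri_sets(lo, hi, n):
    """Step 1: centers of n equally spaced triangular fuzzy sets covering [lo, hi]."""
    return [lo + (hi - lo) * j / (n - 1) for j in range(n)]

def memberships(x, centers):
    """Membership of x in each triangular set; neighbouring centers are the feet."""
    h = centers[1] - centers[0]
    return [max(0.0, 1.0 - abs(x - e) / h) for e in centers]

def build_rule_base(pairs, in_centers, out_centers):
    """Steps 2-4: one rule per pair, a degree per rule, best rule per IF part."""
    rules = {}  # IF part (tuple of input set indices) -> (degree, output set index)
    for xs, y in pairs:
        degree, if_part = 1.0, []
        for x, centers in zip(xs, in_centers):
            mu = memberships(x, centers)
            j = max(range(len(mu)), key=mu.__getitem__)   # set with largest membership
            if_part.append(j)
            degree *= mu[j]
        mu_y = memberships(y, out_centers)
        l = max(range(len(mu_y)), key=mu_y.__getitem__)
        degree *= mu_y[l]                                 # assumed product-form degree
        key = tuple(if_part)
        if key not in rules or degree > rules[key][0]:    # resolve conflicting rules
            rules[key] = (degree, l)
    return rules

def fuzzy_system(xs, rules, in_centers, out_centers):
    """Step 5: product inference, singleton fuzzifier, center average defuzzifier."""
    num = den = 0.0
    for if_part, (_, l) in rules.items():
        w = 1.0
        for x, centers, j in zip(xs, in_centers, if_part):
            w *= memberships(x, centers)[j]
        num += out_centers[l] * w
        den += w
    return num / den if den > 0 else None  # None: no rule fires (incomplete rule base)

# Toy data (assumed): samples of y = x1 + x2 on a 5 x 5 grid in [0, 1] x [0, 1].
in_centers = [tri_sets(0.0, 1.0, 5), tri_sets(0.0, 1.0, 5)]
out_centers = tri_sets(0.0, 2.0, 9)
pairs = [((a, b), a + b) for a in in_centers[0] for b in in_centers[1]]
rules = build_rule_base(pairs, in_centers, out_centers)
print(len(rules), fuzzy_system((0.3, 0.6), rules, in_centers, out_centers))
```

Storing the rule base as a dictionary keyed by the IF part mirrors the look-up table of Fig. 12.3: each key is one box, and a conflicting group collapses to the entry with the largest degree.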
We now make a few remarks on this five-step procedure of designing the fuzzy system from the input-output pairs. A fundamental difference between this method and the methods in Chapters 10 and 11 is that the methods in Chapters 10 and 11 require that we are able to determine the exact output $g(x)$ for any input $x \in U$, whereas in this method we cannot freely choose the input points in the given input-output pairs. Also, in order to design a fuzzy system with the required accuracy, the methods in Chapters 10 and 11 need to know the bounds of the first or second order derivatives of the function to be approximated, whereas this method does not require this information. If the input-output pairs happen to be the $(e_1^{i_1}, e_2^{i_2}; \bar{y}^{i_1 i_2})$ in (10.9), then it is easy to verify that the fuzzy system designed through these five steps is the same fuzzy system as designed in Section 10.2. Therefore, this method can be viewed as a generalization of the design method in Section 10.2 to the case where the input-output pairs cannot be arbitrarily chosen.
The number of rules in the final fuzzy rule base is bounded by two numbers: $N$, the number of input-output pairs, and $\prod_{i=1}^{n} N_i$, the number of all possible combinations of the fuzzy sets defined for the input variables. If the dimension of the input space $n$ is large, $\prod_{i=1}^{n} N_i$ will be a huge number and may be larger than $N$. Also, some input-output pairs may correspond to the same box in Fig. 12.3 and thus can contribute only one rule to the fuzzy rule base. Therefore, the number of rules in the fuzzy rule base may be much less than both $\prod_{i=1}^{n} N_i$ and $N$. Consequently, the fuzzy rule base generated by this method may not be complete. To make the fuzzy rule base complete, we may fill the empty boxes in the fuzzy rule base by interpolating the existing rules; we leave the details to the reader to think about.
Next, we apply this method to a control problem and a time series prediction problem.
look-up scheme in the last section, and replace the human driver by the designed fuzzy system. The simulated truck and loading zone are shown in Fig. 12.4. The truck position is determined by three state variables $\phi$, $x$ and $y$, where $\phi$ is the angle of the truck with respect to the horizontal line as shown in Fig. 12.4. Control to the truck is the steering angle $\theta$. Only backing up is permitted. The truck moves backward by a fixed unit distance every stage. For simplicity, we assume enough clearance between the truck and the loading dock such that $y$ does not have to be considered as a state variable. The task is to design a controller whose inputs are $(x, \phi)$ and whose output is $\theta$, such that the final state will be $(x_f, \phi_f) = (10, 90)$. We assume that $x \in [0, 20]$, $\phi \in [-90, 270]$ and $\theta \in [-40, 40]$; that is, $U = [0, 20] \times [-90, 270]$ and $V = [-40, 40]$.
[Figure 12.4: the simulated truck and loading zone, with $x$ ranging from $x = 0$ to $x = 20$]
First, we generate the input-output pairs $(x^p, \phi^p; \theta^p)$. We do this by trial and error: at every stage (given $x$ and $\phi$) starting from an initial state, we determine a control $\theta$ based on common sense (that is, our own experience of how to control the steering angle in the situation); after some trials, we choose the input-output pairs corresponding to the smoothest successful trajectory. The following 14 initial states were used to generate the desired input-output pairs: $(x_0, \phi_0) = (1, 0), (1, 90), (1, 270); (7, 0), (7, 90), (7, 180), (7, 270); (13, 0), (13, 90), (13, 180), (13, 270); (19, 90), (19, 180), (19, 270)$. Table 12.1 shows the input-output pairs starting from the initial state $(x_0, \phi_0) = (1, 0^\circ)$. The input-output pairs starting from the other 13 initial states can be obtained in a similar manner. In total, we have about 250 input-output pairs. We now design a fuzzy system based on these input-output pairs using the table look-up scheme developed in the last section.
Table 12.1. Ideal trajectory $(x_t, \phi_t)$ and the corresponding control $\theta_t$ starting from $(x_0, \phi_0) = (1, 0^\circ)$.
In Step 1, we define 7 fuzzy sets in $[-90^\circ, 270^\circ]$, 5 fuzzy sets in $[0, 20]$ and 7 fuzzy sets in $[-40^\circ, 40^\circ]$, where the membership functions are shown in Fig. 12.5. In Steps 2 and 3, we generate one rule from one input-output pair and compute the corresponding degrees of the rules. Table 12.2 shows the rules and their degrees generated by the corresponding input-output pairs in Table 12.1. The final fuzzy rule base generated in Step 4 is shown in Fig. 12.6 (we see that some boxes are empty, so the input-output pairs do not cover all the state space; however, we will see that the rules in Fig. 12.6 are sufficient for controlling the truck to the desired position starting from a wide range of initial positions). Finally, in Step 5 we use the fuzzy system with product inference engine, singleton fuzzifier, and center average defuzzifier; that is, the designed fuzzy system is in the form of (9.1) with the rules in Fig. 12.6. We now use the fuzzy system designed above as a controller for the truck. To simulate the control system, we need a mathematical model of the truck. We use
Figure 12.5. Membership functions for the truck backer-upper control problem.
Figure 12.6. The final fuzzy rule base for the truck backer-upper control problem.
Table 12.2. Fuzzy IF-THEN rules generated from the input-output pairs in Table 12.1 and their degrees.
x is    $\phi$ is    $\theta$ is    degree
S2      S2      S2      1.00
S2      S2      S2      0.92
S2      S2      S2      0.35
S2      S2      S2      0.12
S2      S2      S2      0.07
S1      S2      S1      0.08
S1      S1      S1      0.18
S1      S1      S1      0.53
S1      S1      S1      0.56
S1      S1      S1      0.60
CE      S1      S1      0.35
CE      S1      S1      0.21
CE      S1      CE      0.16
CE      CE      CE      0.32
CE      CE      CE      0.45
CE      CE      CE      0.54
CE      CE      CE      0.88
CE      CE      CE      0.92
(12.7) (12.8)
where $b$ is the length of the truck and we assume $b = 4$ in our simulations. Fig. 12.7 shows the truck trajectory using the designed fuzzy system as the controller for two initial conditions: $(x_0, \phi_0) = (3, -30)$ and $(13, 30)$. We see that the fuzzy controller can successfully control the truck to the desired position.
Figure 12.7. Truck trajectories using the fuzzy controller.
When r
Let $x(k)$ ($k = 1, 2, 3, \ldots$) be the time series generated by (12.10) (sampling the continuous curve $x(t)$ generated by (12.10) with an interval of 1 sec.). Fig. 12.8 shows 600 points of $x(k)$. The problem of time series prediction can be formulated as follows: given $x(k - n + 1), x(k - n + 2), \ldots, x(k)$, estimate $x(k + 1)$, where $n$ is a positive integer. That is, the task is to determine a mapping from $[x(k - n + 1), x(k - n + 2), \ldots, x(k)] \in R^n$ to $[x(k + 1)] \in R$, and this mapping in our case is the designed fuzzy system based on the input-output pairs. Assuming that $x(1), x(2), \ldots, x(k)$ are given with $k > n$, we can form $k - n$ input-output pairs as follows:
$[x(k - n), \ldots, x(k - 1); x(k)]$
$[x(k - n - 1), \ldots, x(k - 2); x(k - 1)]$
$\cdots$
$[x(1), \ldots, x(n); x(n + 1)]$    (12.11)
These input-output pairs are used to design a fuzzy system $f(x)$ using the table look-up scheme in Section 12.1, and this $f(x)$ is then used to predict $x(k + l)$ for $l = 1, 2, \ldots$, where the input to $f(x)$ is $[x(k - n + l), \ldots, x(k - 1 + l)]$ when predicting $x(k + l)$.
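Forming the pairs in (12.11) amounts to sliding a window of length $n$ over the series. A minimal sketch follows; the function name `make_pairs` and the toy series are assumptions for illustration, and Python's zero-based lists stand in for $x(1), \ldots, x(k)$.

```python
def make_pairs(series, n):
    """Return [([x(j), ..., x(j+n-1)], x(j+n))] for every window of length n."""
    return [(series[j:j + n], series[j + n]) for j in range(len(series) - n)]

series = list(range(1, 11))      # stand-in for x(1), ..., x(10)
pairs = make_pairs(series, 4)    # k = 10 points and n = 4 give k - n = 6 pairs
for inputs, target in pairs:
    print(inputs, target)
```

The pairs come out in increasing time order, the reverse of the listing in (12.11); the order does not matter for the table look-up scheme, since each pair generates its rule independently.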
We now use the first 300 points in Fig. 12.8 to construct the input-output pairs and the designed fuzzy system is then used to predict the remaining 300 points. We consider two cases: (i) n = 4 and the 7 fuzzy sets in Fig.12.9 are defined for each input variable, and (ii) n = 4 and the 15 fuzzy sets in Fig.12.10 are used. The prediction results for these two cases are shown in Figs.12.11 and 12.12, respectively. Comparing Figs. 12.11 and 12.12, we see that the prediction accuracy is improved by defining more fuzzy sets for each input variable.
Figure 12.9. The first choice of membership functions for each input variable.
Figure 12.10. The second choice of membership functions for each input variable.
Figure 12.11. Prediction and the true values of the time series using the membership functions in Fig. 12.9.
Figure 12.12. Prediction and the true values of the time series using the membership functions in Fig. 12.10.
12.4 Summary and Further Readings
In this chapter we have demonstrated the following: The details of the table look-up method for designing fuzzy systems from input-output pairs. How to apply the method to the truck backer-upper control and the time series prediction problems. How to combine conscious and subconscious knowledge into fuzzy systems using this table look-up scheme. This table look-up scheme is taken from Wang and Mendel [1992b] and Wang [1994], which discussed more details about this method and gave more examples. Application of the method to financial data prediction can be found in Cox [1994].
12.5 Exercises
Exercise 12.1. Consider the design of a 2-input-1-output fuzzy system using the table look-up scheme. Suppose that in Step 1 we define the fuzzy sets as shown in Fig. 12.2, where $\alpha_1 = \alpha_2 = \alpha_y = 0$ and $\beta_1 = \beta_2 = \beta_y = 1$, and the membership functions are triangular and equally spaced.
(a) What is the minimum number of input-output pairs such that every fuzzy set in Fig. 12.2 will appear at least once in the generated rules? Give an example of this minimum set of input-output pairs.
(b) What is the minimum number of input-output pairs such that the generated fuzzy rule base is complete? Give an example of this minimum set of input-output pairs.
Exercise 12.2. Consider the truck backer-upper control problem in Section 12.3.
(a) Generate a set of input-output pairs by driving the truck from the initial state $(x_0, \phi_0) = (1, 90)$ to the final state $(x_f, \phi_f) = (10, 90)$ using common sense.
(b) Use the table look-up scheme to create a fuzzy rule base from the input-output pairs generated in (a), where the membership functions in Step 1 are given in Fig. 12.5.
(c) Construct a fuzzy system based on the fuzzy rule base in (b) and use it to control the truck from $(x_0, \phi_0) = (0, 90)$ and $(x_0, \phi_0) = (-3, 90)$. Comment on the simulation results.
Exercise 12.3. Let f (x) be the fuzzy system designed using the table look-up scheme. Can you determine an error bound for If (xg) - ygl? Explain your answer.
Exercise 12.4. Propose a method to fill up the empty boxes in the fuzzy rule base generated by the table look-up scheme. Justify your method and test it through examples.
Exercise 12.5. We are given 10 points $x(1), x(2), ..., x(10)$ of a time series and we want to predict $x(12)$.
(a) If we use the fuzzy system $f[x(10), x(8)]$ to predict $x(12)$, list all the input-output pairs for constructing this fuzzy system.
(b) If we use the fuzzy system $f[x(10), x(9), x(8)]$ to predict $x(12)$, list all the input-output pairs for constructing this fuzzy system.
12.6 (Project). Write a computer program to implement the table look-up scheme and apply your program to the time series prediction problem in Section 12.4. To make your code generally applicable, you may have to include a method to fill up the empty boxes.
Chapter 13
where $M$ is fixed, and $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$ are free parameters (we choose $a_i^l = 1$). Although the structure of the fuzzy system is chosen as (13.1), the fuzzy system has not been designed because the parameters $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$ are not specified. Once we specify the parameters $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$, we obtain the designed fuzzy system; that is, designing the fuzzy system is now equivalent to determining the parameters $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$.

To determine these parameters in some optimal fashion, it is helpful to represent the fuzzy system $f(x)$ of (13.1) as a feedforward network. Specifically, the mapping from the input $x \in U \subset R^n$ to the output $f(x) \in V \subset R$ can be implemented according to the following operations: first, the input $x$ is passed through a product Gaussian operator to become $z^l = \prod_{i=1}^{n} \exp(-(\frac{x_i - \bar x_i^l}{\sigma_i^l})^2)$; then, the $z^l$ are passed through a summation operator and a weighted summation operator to obtain $b = \sum_{l=1}^{M} z^l$ and $a = \sum_{l=1}^{M} \bar y^l z^l$; finally, the output of the fuzzy system is computed as $f(x) = a/b$. This three-stage operation is shown in Fig. 13.1 as a three-layer feedforward network.
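For reference, composing the three stages described above gives the closed form that the text cites as (13.1). The display below is a reconstruction from the definitions of $z^l$, $a$ and $b$ (a product of Gaussian membership functions followed by a center-average quotient), so it should be read as a sketch consistent with this section rather than a verbatim copy of the original equation:

```latex
f(x) = \frac{a}{b}
     = \frac{\sum_{l=1}^{M} \bar{y}^{l}
             \left[\prod_{i=1}^{n} \exp\left(-\left(\frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}}\right)^{2}\right)\right]}
            {\sum_{l=1}^{M}
             \left[\prod_{i=1}^{n} \exp\left(-\left(\frac{x_i - \bar{x}_i^{l}}{\sigma_i^{l}}\right)^{2}\right)\right]}
```

Each of the $M$ terms corresponds to one fuzzy IF-THEN rule, so the parameter count is $M(2n + 1)$: one $\bar y^l$ per rule plus one center and one width per rule and input.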
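Given an input-output pair $(x_0^p; y_0^p)$, $x_0^p \in U \subset R^n$, $y_0^p \in V \subset R$, the task is to design a fuzzy system $f(x)$ in the form of (13.1) such that the matching error

$$e^p = \frac{1}{2}[f(x_0^p) - y_0^p]^2 \tag{13.2}$$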
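is minimized. That is, the task is to determine the parameters $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$ such that $e^p$ of (13.2) is minimized. In the sequel, we use $e$, $f$ and $y$ to denote $e^p$, $f(x_0^p)$ and $y_0^p$, respectively.

We use the gradient descent algorithm to determine the parameters. Specifically, to determine $\bar y^l$, we use the algorithm

$$\bar y^l(q+1) = \bar y^l(q) - \alpha \left.\frac{\partial e}{\partial \bar y^l}\right|_q \tag{13.3}$$

where $l = 1, 2, ..., M$, $q = 0, 1, 2, ...$, and $\alpha$ is a constant stepsize. If $\bar y^l(q)$ converges as $q$ goes to infinity, then from (13.3) we have $\partial e / \partial \bar y^l = 0$ at the converged $\bar y^l$, which means that the converged $\bar y^l$ is a stationary point, and typically a local minimum, of $e$. From Fig. 13.1 we see that $f$ (and hence $e$) depends on $\bar y^l$ only through $a$, where $f = a/b$, $a = \sum_{l=1}^{M} \bar y^l z^l$, $b = \sum_{l=1}^{M} z^l$, and $z^l = \prod_{i=1}^{n} \exp(-(\frac{x_{0i}^p - \bar x_i^l}{\sigma_i^l})^2)$; hence, using the Chain Rule, we have

$$\frac{\partial e}{\partial \bar y^l} = (f - y)\frac{\partial f}{\partial a}\frac{\partial a}{\partial \bar y^l} = (f - y)\frac{1}{b} z^l \tag{13.4}$$

Substituting (13.4) into (13.3), we obtain the training algorithm for $\bar y^l$:

$$\bar y^l(q+1) = \bar y^l(q) - \alpha \frac{f - y}{b} z^l \tag{13.5}$$

where $l = 1, 2, ..., M$, and $q = 0, 1, 2, ...$.

To determine $\bar x_i^l$, we use

$$\bar x_i^l(q+1) = \bar x_i^l(q) - \alpha \left.\frac{\partial e}{\partial \bar x_i^l}\right|_q \tag{13.6}$$

where $i = 1, 2, ..., n$, $l = 1, 2, ..., M$, and $q = 0, 1, 2, ...$. We see from Fig. 13.1 that $f$ (and hence $e$) depends on $\bar x_i^l$ only through $z^l$; hence, using the Chain Rule, we have

$$\frac{\partial e}{\partial \bar x_i^l} = (f - y)\frac{\partial f}{\partial z^l}\frac{\partial z^l}{\partial \bar x_i^l} = (f - y)\frac{\bar y^l - f}{b} z^l \frac{2(x_{0i}^p - \bar x_i^l)}{(\sigma_i^l)^2} \tag{13.7}$$

Substituting (13.7) into (13.6), we obtain the training algorithm for $\bar x_i^l$:

$$\bar x_i^l(q+1) = \bar x_i^l(q) - \alpha \frac{f - y}{b}(\bar y^l(q) - f) z^l \frac{2(x_{0i}^p - \bar x_i^l(q))}{(\sigma_i^l(q))^2} \tag{13.8}$$

where $i = 1, 2, ..., n$, $l = 1, 2, ..., M$, and $q = 0, 1, 2, ...$.

Using the same procedure, we obtain the training algorithm for $\sigma_i^l$:

$$\sigma_i^l(q+1) = \sigma_i^l(q) - \alpha \left.\frac{\partial e}{\partial \sigma_i^l}\right|_q = \sigma_i^l(q) - \alpha \frac{f - y}{b}(\bar y^l(q) - f) z^l \frac{2(x_{0i}^p - \bar x_i^l(q))^2}{(\sigma_i^l(q))^3} \tag{13.9}$$

where $i = 1, 2, ..., n$, $l = 1, 2, ..., M$, and $q = 0, 1, 2, ...$.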
The training algorithm (13.5), (13.8), and (13.9) performs an error back-propagation procedure. To train $\bar y^l$, the "normalized" error $(f - y)/b$ is back-propagated to the layer of $\bar y^l$; then $\bar y^l$ is updated using (13.5), in which $z^l$ is the input to $\bar y^l$ (see Fig. 13.1). To train $\bar x_i^l$ and $\sigma_i^l$, the "normalized" error $(f - y)/b$ times $(\bar y^l - f)$ and $z^l$ is back-propagated to the processing unit of Layer 1 whose output is $z^l$; then $\bar x_i^l$ and $\sigma_i^l$ are updated using (13.8) and (13.9), respectively, in which the remaining variables $\bar x_i^l$, $x_{0i}^p$ and $\sigma_i^l$ (that is, the variables on the right-hand sides of (13.8) and (13.9), except the back-propagated error $\frac{f - y}{b}(\bar y^l - f) z^l$) can be obtained locally. Therefore, this algorithm is also called the error back-propagation training algorithm. We now summarize this design method.
Design of Fuzzy Systems Using Gradient Descent Training:

Step 1. Structure determination and initial parameter setting. Choose the fuzzy system in the form of (13.1) and determine $M$. Larger $M$ results in more parameters and more computation, but gives better approximation accuracy. Specify the initial parameters $\bar y^l(0)$, $\bar x_i^l(0)$ and $\sigma_i^l(0)$. These initial parameters may be determined according to the linguistic rules from human experts, or be chosen in such a way that the corresponding membership functions uniformly cover the input and output spaces. For particular applications, we may use special methods; see Section 13.3 for an example.

Step 2. Present input and calculate the output of the fuzzy system. For a given input-output pair $(x_0^p; y_0^p)$, $p = 1, 2, ...$, and at the $q$'th stage of training, $q = 0, 1, 2, ...$, present $x_0^p$ to the input layer of the fuzzy system in Fig. 13.1 and compute the outputs of Layers 1-3. That is, compute

$$z^l = \prod_{i=1}^{n} \exp\left(-\left(\frac{x_{0i}^p - \bar x_i^l(q)}{\sigma_i^l(q)}\right)^2\right) \tag{13.10}$$

$$b = \sum_{l=1}^{M} z^l \tag{13.11}$$

$$a = \sum_{l=1}^{M} \bar y^l(q) z^l \tag{13.12}$$

$$f = a/b \tag{13.13}$$

Step 3. Update the parameters. Use the training algorithm (13.5), (13.8) and (13.9) to compute the updated parameters $\bar y^l(q+1)$, $\bar x_i^l(q+1)$ and $\sigma_i^l(q+1)$, where $y = y_0^p$, and $z^l$, $b$, $a$ and $f$ equal those computed in Step 2.

Step 4. Repeat by going to Step 2 with $q = q + 1$, until the error $|f - y_0^p|$ is less than a prespecified number $\epsilon$, or until $q$ equals a prespecified number.

Step 5. Repeat by going to Step 2 with $p = p + 1$; that is, update the parameters using the next input-output pair $(x_0^{p+1}; y_0^{p+1})$.

Step 6. If desirable and feasible, set $p = 1$ and do Steps 2-5 again until the designed fuzzy system is satisfactory. For on-line control and dynamic system identification, this step is not feasible because the input-output pairs are provided one-by-one in a real-time fashion. For pattern recognition problems where the input-output pairs are provided off-line, this step is usually desirable.
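The six steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function names (`forward`, `train_step`, `train`), the array layout (`y_bar` of shape `(M,)`, `x_bar` and `sigma` of shape `(M, n)`) and the default values of `alpha`, `max_q`, `eps` and `epochs` are all choices made here for illustration. The update formulas are the gradient-descent rules cited as (13.5), (13.8) and (13.9), written for Gaussian membership functions of the form $\exp(-((x_i - \bar x_i^l)/\sigma_i^l)^2)$.

```python
import numpy as np

def forward(x, y_bar, x_bar, sigma):
    """Step 2: outputs of Layers 1-3 (z of shape (M,), b, a and f = a / b)."""
    z = np.exp(-np.sum(((x - x_bar) / sigma) ** 2, axis=1))
    b = z.sum()
    a = y_bar @ z
    return z, b, a, a / b

def train_step(x, y, y_bar, x_bar, sigma, alpha):
    """Step 3: one gradient-descent update for the pair (x, y)."""
    z, b, a, f = forward(x, y_bar, x_bar, sigma)
    err = (f - y) / b                          # "normalized" error
    back = (err * (y_bar - f) * z)[:, None]    # error reaching Layer 1, shape (M, 1)
    diff = x - x_bar                           # x_i - x_bar_i^l, shape (M, n)
    new_y_bar = y_bar - alpha * err * z
    new_x_bar = x_bar - alpha * back * 2.0 * diff / sigma ** 2
    new_sigma = sigma - alpha * back * 2.0 * diff ** 2 / sigma ** 3
    return new_y_bar, new_x_bar, new_sigma

def train(X, Y, y_bar, x_bar, sigma, alpha=0.1, max_q=1, eps=1e-3, epochs=1):
    """Steps 4-6: loop over stages q, pairs p, and (optionally) repeated passes."""
    for _ in range(epochs):                    # Step 6: repeat over the data set
        for x, y in zip(X, Y):                 # Step 5: next input-output pair
            for _ in range(max_q):             # Step 4: at most max_q updates per pair
                if abs(forward(x, y_bar, x_bar, sigma)[3] - y) < eps:
                    break
                y_bar, x_bar, sigma = train_step(x, y, y_bar, x_bar, sigma, alpha)
    return y_bar, x_bar, sigma
```

For on-line identification one would call `train_step` once per time point (`max_q = 1`, `epochs = 1`); for off-line pattern recognition, `epochs > 1` corresponds to Step 6. Note that nothing in this sketch keeps `sigma` away from zero, so a practical implementation may need a lower bound on the widths.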
Because the training algorithm (13.5), (13.8) and (13.9) is a gradient descent algorithm, the choice of the initial parameters is crucial to the success of the algorithm. If the initial parameters are close to the optimal parameters, the algorithm has a good chance to converge to the optimal solution; otherwise, the algorithm may converge to a nonoptimal solution or even diverge. The advantage of using the fuzzy system is that the parameters $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$ have clear physical meanings and we have methods to choose good initial values for them. Keep in mind that the parameters $\bar y^l$ are the centers of the fuzzy sets in the THEN parts of the rules, and the parameters $\bar x_i^l$ and $\sigma_i^l$ are the centers and widths of the Gaussian fuzzy sets in the IF parts of the rules. Therefore, given a designed fuzzy system in the form of (13.1), we can recover the fuzzy IF-THEN rules that constitute the fuzzy system. These recovered fuzzy IF-THEN rules may help to explain the designed fuzzy system in a user-friendly manner. Next, we apply this method to the problem of nonlinear dynamic system identification.
We use the fuzzy system (13.1) equipped with the training algorithm (13.5), (13.8) and (13.9) to approximate unknown nonlinear components in dynamic systems. Consider the discrete time nonlinear dynamic system

$$y(k+1) = f(y(k), ..., y(k-n+1); u(k), ..., u(k-m+1)) \tag{13.14}$$

where $f$ is an unknown function we want to identify, $u$ and $y$ are the input and output of the system, respectively, and $n$ and $m$ are positive integers. Our task is to identify the unknown function $f$ based on fuzzy systems. Let $\hat f(x)$ be the fuzzy system in the form of (13.1). We replace the $f(x)$ in (13.14) by $\hat f(x)$ and obtain the following identification model:

$$\hat y(k+1) = \hat f(y(k), ..., y(k-n+1); u(k), ..., u(k-m+1)) \tag{13.15}$$

Our task is to adjust the parameters in $\hat f(x)$ such that the output of the identification model $\hat y(k+1)$ converges to the output of the true system $y(k+1)$ as $k$ goes to infinity. Fig. 13.2 shows this identification scheme.
Figure 13.2. Basic scheme of identification model for the nonlinear dynamic system using the fuzzy system.
The input-output pairs in this problem are $(x_0^{k+1}; y_0^{k+1})$, where $x_0^{k+1} = (y(k), ..., y(k-n+1); u(k), ..., u(k-m+1))$, $y_0^{k+1} = y(k+1)$, and $k = 0, 1, 2, ...$. Because the system is dynamic, these input-output pairs are collected one at a time. The operation of the identification process is the same as Steps 1-5 in Section 13.2. Note that the $p$ there is the $k$ in (13.14) and (13.15), and the $n$ in (13.1) equals $n + m$.
13.3.2
As we discussed in Section 13.2, a good initial $\hat f$ is crucial for the success of the approach. For this particular identification problem, we propose the following method for on-line initial parameter choosing and provide a theoretical justification (Lemma 13.1) to explain why this is a good method.

An on-line initial parameter choosing method: Collect the input-output pairs $(x_0^{k+1}; y_0^{k+1}) = (y(k), ..., y(k-n+1), u(k), ..., u(k-m+1); y(k+1))$ for the first $M$ time points $k = 0, 1, ..., M-1$, and do not start the training algorithm until $k = M-1$ (that is, set $q = k - M$ in (13.5), (13.8) and (13.9)). The initial parameters are chosen as: $\bar y^l(0) = y_0^l$, $\bar x_i^l(0) = x_{0i}^l$, and $\sigma_i^l(0)$ equals small numbers (see Lemma 13.1) or $\sigma_i^l(0) = [\max(x_{0i}^l : l = 1, 2, ..., M) - \min(x_{0i}^l : l = 1, 2, ..., M)]/M$, where $l = 1, 2, ..., M$ and $i = 1, 2, ..., n+m$. The second choice of $\sigma_i^l(0)$ makes the membership functions uniformly cover the range of $x_{0i}^l$ from $l = 1$ to $l = M$.

We now show that by choosing the $\sigma_i^l(0)$ sufficiently small, the fuzzy system with the preceding initial parameters can match all the $M$ input-output pairs $(x_0^{k+1}; y_0^{k+1})$, $k = 0, 1, ..., M-1$, to arbitrary accuracy.
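Lemma 13.1. For arbitrary $\epsilon > 0$, there exists $\sigma^* > 0$ such that the fuzzy system $\hat f(x)$ of (13.1) with the preceding initial parameters $\bar y^l$ and $\bar x_i^l$ and $\sigma_i^l = \sigma^*$ has the property that

$$|\hat f(x_0^{k+1}) - y_0^{k+1}| < \epsilon \tag{13.16}$$

for $k = 0, 1, ..., M-1$.

Proof: Substituting the initial parameters $\bar y^l(0)$ and $\bar x_i^l(0)$ into (13.1) and setting $\sigma_i^l = \sigma^*$, we have

$$\hat f(x) = \frac{\sum_{l=1}^{M} y_0^l \prod_{i=1}^{n+m} \exp\left(-\left(\frac{x_i - x_{0i}^l}{\sigma^*}\right)^2\right)}{\sum_{l=1}^{M} \prod_{i=1}^{n+m} \exp\left(-\left(\frac{x_i - x_{0i}^l}{\sigma^*}\right)^2\right)} \tag{13.17}$$

Setting $x = x_0^{k+1}$ in (13.17) and noticing that $1 \le k+1 \le M$, we have

$$\hat f(x_0^{k+1}) = \frac{y_0^{k+1} + \sum_{l \ne k+1} y_0^l \prod_{i=1}^{n+m} \exp\left(-\left(\frac{x_{0i}^{k+1} - x_{0i}^l}{\sigma^*}\right)^2\right)}{1 + \sum_{l \ne k+1} \prod_{i=1}^{n+m} \exp\left(-\left(\frac{x_{0i}^{k+1} - x_{0i}^l}{\sigma^*}\right)^2\right)} \tag{13.18}$$

Hence,

$$|\hat f(x_0^{k+1}) - y_0^{k+1}| = \frac{\left|\sum_{l \ne k+1} (y_0^l - y_0^{k+1}) \prod_{i=1}^{n+m} \exp\left(-\left(\frac{x_{0i}^{k+1} - x_{0i}^l}{\sigma^*}\right)^2\right)\right|}{1 + \sum_{l \ne k+1} \prod_{i=1}^{n+m} \exp\left(-\left(\frac{x_{0i}^{k+1} - x_{0i}^l}{\sigma^*}\right)^2\right)} \tag{13.19}$$

If $x_0^{l_1} \ne x_0^{l_2}$ for $l_1 \ne l_2$, then we have $\prod_{i=1}^{n+m} \exp(-(\frac{x_{0i}^{k+1} - x_{0i}^l}{\sigma^*})^2) \to 0$ as $\sigma^* \to 0$ for $l \ne k+1$. Therefore, by choosing $\sigma^*$ sufficiently small, we can make $|\hat f(x_0^{k+1}) - y_0^{k+1}| < \epsilon$. Similarly, we can show that (13.16) is true in the case where $x_0^l = x_0^{k+1}$ for some $l \ne k+1$; this is left as an exercise.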
Based on Lemma 13.1, we can say that the initial parameter choosing method is a good one because the fuzzy system with these initial parameters can at least match the first $M$ input-output pairs arbitrarily well. If these first $M$ input-output pairs contain important features of the unknown nonlinear function $f(\cdot)$, we can hope that after the training starts from time point $M$, the fuzzy identifier will converge to the unknown nonlinear system very quickly. In fact, based on our simulation results in the next subsection, this is indeed true. However, we cannot choose $\sigma_i^l$ too small because, although a fuzzy system with small $\sigma_i^l$ matches the first $M$ pairs quite well, it may give large approximation errors for other input-output pairs. Therefore, in our following simulations we will use the second choice of $\sigma_i^l$ described in the on-line initial parameter choosing method.
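The trade-off described above can be checked numerically. The following sketch sets the initial parameters from the first $M$ pairs in both ways (a small common width, or the range-based width) and compares how well each matches those pairs. The function names, the `small_sigma` argument and the toy data are assumptions made for illustration; the sketch also assumes that every input coordinate actually varies over the first $M$ points, since otherwise the range-based width would be zero.

```python
import numpy as np

def fuzzy_system(x, y_bar, x_bar, sigma):
    """Fuzzy system with product inference, Gaussian sets, center-average defuzzifier."""
    z = np.exp(-np.sum(((x - x_bar) / sigma) ** 2, axis=1))
    return (y_bar @ z) / z.sum()

def initial_parameters(X0, y0, small_sigma=None):
    """X0: (M, n) first M inputs; y0: (M,) first M outputs.

    Returns (y_bar, x_bar, sigma) with y_bar^l(0) = y_0^l and x_bar^l(0) = x_0^l.
    If small_sigma is given, every width equals it (first choice);
    otherwise the widths are (max - min) / M per input coordinate (second choice).
    """
    X0 = np.asarray(X0, dtype=float)
    y0 = np.asarray(y0, dtype=float)
    M = len(y0)
    if small_sigma is not None:
        sigma = np.full_like(X0, small_sigma)
    else:
        spread = (X0.max(axis=0) - X0.min(axis=0)) / M
        sigma = np.tile(spread, (M, 1))
    return y0.copy(), X0.copy(), sigma

# Toy data (hypothetical): four scalar inputs with distinct values.
X0 = np.array([[0.0], [1.0], [2.0], [3.0]])
y0 = np.array([1.0, -1.0, 2.0, 0.0])
for label, params in (("small widths", initial_parameters(X0, y0, small_sigma=0.2)),
                      ("range-based widths", initial_parameters(X0, y0))):
    worst = max(abs(fuzzy_system(x, *params) - y) for x, y in zip(X0, y0))
    print(label, "-> largest matching error on the first M pairs:", worst)
```

With small widths the matching error on the stored pairs is negligible, as Lemma 13.1 predicts, while the range-based widths trade some of that accuracy for membership functions that overlap. Very small widths also make the quotient numerically fragile far from every stored input, where all Gaussian terms underflow.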
13.3.3 Simulations
Example 13.1. The plant to be identified is governed by the difference equation

$$y(k+1) = 0.3y(k) + 0.6y(k-1) + g[u(k)] \tag{13.20}$$

where the unknown function has the form $g(u) = 0.6\sin(\pi u) + 0.3\sin(3\pi u) + 0.1\sin(5\pi u)$. From (13.15), the identification model is governed by the difference equation

$$\hat y(k+1) = 0.3y(k) + 0.6y(k-1) + \hat f[u(k)] \tag{13.21}$$

where $\hat f[\cdot]$ is in the form of (13.1) with $M = 10$. We choose $\alpha = 0.5$ in the training algorithm (13.5), (13.8) and (13.9) and use the on-line initial parameter choosing method. We start the training from time point $k = 10$, and adjust the parameters $\bar y^l$, $\bar x_i^l$ and $\sigma_i^l$ for one cycle at each time point. That is, we use (13.5), (13.8) and (13.9) once at each time point. Fig. 13.3 shows the outputs of the plant and the identification model when the training stops at $k = 200$, where the input $u(k) = \sin(2\pi k/200)$. We see from Fig. 13.3 that the output of the identification model follows the output of the plant almost immediately and still does so when the training stops at $k = 200$.
Figure 13.3. Outputs of the plant and the identification model for Example 13.1 when the training stops at k = 200.

Example 13.2. In this example, we show how the fuzzy identifier works for a multi-input-multi-output plant that is described by the equations (13.22). The identification model uses two fuzzy systems, $\hat f_1$ and $\hat f_2$, and is described by the equations

$$\begin{bmatrix} \hat y_1(k+1) \\ \hat y_2(k+1) \end{bmatrix} = \begin{bmatrix} \hat f_1(y_1(k), y_2(k)) \\ \hat f_2(y_1(k), y_2(k)) \end{bmatrix} + \begin{bmatrix} u_1(k) \\ u_2(k) \end{bmatrix} \tag{13.23}$$

Both $\hat f_1$ and $\hat f_2$ are in the form of (13.1) with $M = 121$. The identification procedure is carried out for 5,000 time steps using random inputs $u_1(k)$ and $u_2(k)$ whose magnitudes are uniformly distributed over $[-1, 1]$, where we choose $\alpha = 0.5$, use the on-line initial parameter choosing method, and train the parameters for one cycle at each time point. The responses of the plant and the trained identification model for a vector input $[u_1(k), u_2(k)] = [\sin(2\pi k/25), \cos(2\pi k/25)]$ are shown in Figs. 13.4 and 13.5 for the pairs $y_1(k)$, $\hat y_1(k)$ and $y_2(k)$, $\hat y_2(k)$, respectively. We see that the fuzzy identifier follows the true system almost perfectly.
Figure 13.4. Outputs of the plant $y_1(k)$ and the identification model $\hat y_1(k)$ for Example 13.2 after 5,000 steps of training.
Figure 13.5. Outputs of the plant $y_2(k)$ and the identification model $\hat y_2(k)$ for Example 13.2 after 5,000 steps of training.

13.4 Summary and Further Readings
In this chapter we have demonstrated the following:
- The derivation of the gradient descent training algorithm.
- The method for choosing the initial parameters of the fuzzy identification model and its justification (Lemma 13.1).
- Design of fuzzy systems based on the gradient descent training algorithm and the initial parameter choosing method.
- Application of this approach to nonlinear dynamic system identification and other problems.

The method in this chapter was inspired by the error back-propagation algorithm for neural networks (Werbos [1974]). Many similar methods were proposed in the literature; see Jang [1993] and Lin [1994]. More simulation results can be found in Wang [1994]. Narendra and Parthasarathy [1990] gave extensive simulation results of neural network identifiers for nonlinear systems such as those in Examples 13.1 and 13.2.
13.5 Exercises
Exercise 13.1. Suppose that the parameters $\bar x_i^l$ and $\sigma_i^l$ are fixed and only the $\bar y^l$ are free to change.
(a) Show that if the training algorithm (13.5) converges, then it converges to the global minimum of $e^p$ of (13.2).
(b) Let $e^p(\bar y^l)$ be the $e^p$ of (13.2). Find the optimal stepsize $\alpha$ by minimizing $e^p[\bar y^l(q) - \alpha \frac{f - y}{b} z^l]$ with respect to $\alpha$.
Exercise 13.2. Why is the proof of Lemma 13.1 not valid if $x_0^l = x_0^{k+1}$ for some $l \ne k+1$? Prove Lemma 13.1 for this case.
Exercise 13.3. Suppose that we are given $K$ input-output pairs $(x_0^p; y_0^p)$, $p = 1, 2, ..., K$, and we want to design a fuzzy system $f(x)$ in the form of (13.1) such that the summation of squared errors

$$J = \sum_{p=1}^{K} [f(x_0^p) - y_0^p]^2$$

is minimized. Let the parameters $\bar x_i^l$ and $\sigma_i^l$ be fixed and only the $\bar y^l$ be free to change. Determine the optimal $\bar y^l$ ($l = 1, 2, ..., M$) such that $J$ is minimized (write the optimal $\bar y^l$ in a closed-form expression).
Exercise 13.4. Let $[x(k)]$ be a sequence of real-valued vectors generated by the gradient descent algorithm
where $e: R^n \to R$ is a cost function and $e \in C^2$ (i.e., $e$ has continuous second derivative). Assume that all $x(k) \in D \subset R^n$ for some compact $D$, then there exist $\epsilon > 0$ and $L > 0$ such that if
Exercise 13.5. It is a common perception that the gradient descent algorithm in Section 13.2 converges to local minima. Create a set of training samples that numerically shows this phenomenon; that is, show that a different initial condition may lead to a different final convergent solution.
Exercise 13.6. Explain how conscious and subconscious knowledge are combined into the fuzzy system using the design method in Section 13.2. Is it possible that conscious knowledge disappears in the final designed fuzzy system? Why? If we want to preserve conscious knowledge in the final fuzzy system, how should the design procedure in Section 13.2 be modified?
13.7 (Project). Write a computer program to implement the training algorithm (13.5), (13.8), and (13.9), and apply your code to the time series prediction problem in Chapter 12.
Chapter 14
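Given the input-output pairs $(x_0^j; y_0^j)$, $j = 1, 2, ..., p$, the goal is to design a fuzzy system $f(x)$ such that

$$J_p = \sum_{j=1}^{p} [f(x_0^j) - y_0^j]^2 \tag{14.1}$$

is minimized. Additionally, we want to design the fuzzy system recursively; that is, if $f_p$ is the fuzzy system designed to minimize $J_p$, then $f_p$ should be represented as a function of $f_{p-1}$. We now use the recursive least squares algorithm to design the fuzzy system.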
Design of the Fuzzy System by Recursive Least Squares:

Step 1. Suppose that $U = [\alpha_1, \beta_1] \times \cdots \times [\alpha_n, \beta_n] \subset R^n$. For each $[\alpha_i, \beta_i]$ ($i = 1, 2, ..., n$), define $N_i$ fuzzy sets $A_i^{l_i}$ ($l_i = 1, 2, ..., N_i$), which are complete in $[\alpha_i, \beta_i]$. For example, we may choose $A_i^{l_i}$ to be the pseudo-trapezoid fuzzy sets: $\mu_{A_i^{l_i}}(x_i) = \mu_{A_i^{l_i}}(x_i; a_i^{l_i}, b_i^{l_i}, c_i^{l_i}, d_i^{l_i})$, where $a_i^1 = b_i^1 = \alpha_i$ and $c_i^j \le a_i^{j+1} < d_i^j \le b_i^{j+1}$ for $j = 1, 2, ..., N_i - 1$.

Step 2. Construct the fuzzy system from the following $\prod_{i=1}^{n} N_i$ fuzzy IF-THEN rules:

IF $x_1$ is $A_1^{l_1}$ and $\cdots$ and $x_n$ is $A_n^{l_n}$, THEN $y$ is $B^{l_1 \cdots l_n}$ (14.2)

where $l_i = 1, 2, ..., N_i$, $i = 1, 2, ..., n$, and $B^{l_1 \cdots l_n}$ is any fuzzy set with center at $\bar y^{l_1 \cdots l_n}$, which is free to change. Specifically, we choose the fuzzy system with product inference engine, singleton fuzzifier, and center average defuzzifier; that is, the designed fuzzy system is

$$f(x) = \frac{\sum_{l_1=1}^{N_1} \cdots \sum_{l_n=1}^{N_n} \bar y^{l_1 \cdots l_n} \left[\prod_{i=1}^{n} \mu_{A_i^{l_i}}(x_i)\right]}{\sum_{l_1=1}^{N_1} \cdots \sum_{l_n=1}^{N_n} \left[\prod_{i=1}^{n} \mu_{A_i^{l_i}}(x_i)\right]} \tag{14.3}$$

where $\bar y^{l_1 \cdots l_n}$ are free parameters to be designed, and $A_i^{l_i}$ are designed in Step 1. Collect the free parameters $\bar y^{l_1 \cdots l_n}$ into the $\prod_{i=1}^{n} N_i$-dimensional vector

$$\theta = (\bar y^{1 \cdots 1}, ..., \bar y^{N_1 1 \cdots 1}, \bar y^{1 2 1 \cdots 1}, ..., \bar y^{N_1 2 1 \cdots 1}, ..., \bar y^{1 N_2 \cdots N_n}, ..., \bar y^{N_1 N_2 \cdots N_n})^T \tag{14.4}$$

and rewrite (14.3) as

$$f(x) = b^T(x)\theta \tag{14.5}$$

where

$$b(x) = (b^{1 \cdots 1}(x), ..., b^{N_1 1 \cdots 1}(x), b^{1 2 1 \cdots 1}(x), ..., b^{N_1 2 1 \cdots 1}(x), ..., b^{1 N_2 \cdots N_n}(x), ..., b^{N_1 N_2 \cdots N_n}(x))^T \tag{14.6}$$

$$b^{l_1 \cdots l_n}(x) = \frac{\prod_{i=1}^{n} \mu_{A_i^{l_i}}(x_i)}{\sum_{l_1=1}^{N_1} \cdots \sum_{l_n=1}^{N_n} \left[\prod_{i=1}^{n} \mu_{A_i^{l_i}}(x_i)\right]} \tag{14.7}$$

Step 3. Choose the initial parameters $\theta(0)$ as follows: if there are linguistic rules from human experts (conscious knowledge) whose IF parts agree with the IF parts of (14.2), then choose $\bar y^{l_1 \cdots l_n}(0)$ to be the centers of the THEN part fuzzy sets in these linguistic rules; otherwise, choose $\theta(0)$ arbitrarily in the output space $V \subset R$ (for example, choose $\theta(0) = 0$ or the elements of $\theta(0)$ uniformly distributed over $V$). In this way, we can say that the initial fuzzy system is constructed from conscious human knowledge.

Step 4. For $p = 1, 2, ...$, compute the parameters $\theta$ using the following recursive least squares algorithm:

$$\theta(p) = \theta(p-1) + K(p)[y_0^p - b^T(x_0^p)\theta(p-1)] \tag{14.8}$$

$$K(p) = P(p-1)b(x_0^p)[b^T(x_0^p)P(p-1)b(x_0^p) + 1]^{-1} \tag{14.9}$$

$$P(p) = P(p-1) - P(p-1)b(x_0^p)[b^T(x_0^p)P(p-1)b(x_0^p) + 1]^{-1}b^T(x_0^p)P(p-1) \tag{14.10}$$

where $\theta(0)$ is chosen as in Step 3, and $P(0) = \sigma I$ where $\sigma$ is a large constant. The designed fuzzy system is in the form of (14.3) with the parameters $\bar y^{l_1 \cdots l_n}$ equal to the corresponding elements in $\theta(p)$.

The recursive least squares algorithm (14.8)-(14.10) is obtained by minimizing $J_p$ of (14.1) with $f(x_0^j)$ in the form of (14.3); its derivation is given next.
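A compact sketch of Steps 2-4 follows. It assumes Gaussian IF-part membership functions of a common width on a grid of centers (one choice of complete fuzzy sets; the pseudo-trapezoid sets of Step 1 would work the same way), and the names `basis` and `rls_update` as well as the `width` argument are introduced here only for illustration. The vector returned by `basis` plays the role of $b(x)$ in (14.5), with the first index varying fastest as in (14.4), and `rls_update` performs one pass of the recursion (14.8)-(14.10).

```python
import numpy as np

def basis(x, centers, width):
    """Normalized rule activations b(x) for a grid of Gaussian IF-part sets.

    centers: list of 1-D arrays, one per input, holding the N_i set centers.
    Returns a vector of length N_1 * ... * N_n whose entries sum to one.
    """
    mu = [np.exp(-((xi - c) / width) ** 2) for xi, c in zip(x, centers)]
    prod = mu[0]
    for m in mu[1:]:
        prod = np.multiply.outer(prod, m)      # product over inputs, shape (N_1, ..., N_n)
    b = prod.ravel(order="F")                  # l_1 varies fastest
    return b / b.sum()

def rls_update(theta, P, b, y):
    """One recursive least squares step for the new pair (b(x), y)."""
    Pb = P @ b
    K = Pb / (b @ Pb + 1.0)                    # gain vector
    theta = theta + K * (y - b @ theta)        # parameter update
    P = P - np.outer(K, b @ P)                 # covariance-like matrix update
    return theta, P

# Usage sketch on hypothetical data: two inputs, 9 sets per input on [-2, 2].
centers = [-2 + 0.5 * np.arange(9)] * 2
theta = np.zeros(81)                           # Step 3: theta(0) = 0
P = 100.0 * np.eye(81)                         # P(0) = sigma * I with a large sigma
for x, y in [(np.array([0.5, -1.0]), 1.0), (np.array([-1.5, 0.2]), -1.0)]:
    theta, P = rls_update(theta, P, basis(x, centers, 0.5), y)
```

The number of parameters grows as $\prod_i N_i$, so `P` becomes large quickly as the number of inputs increases; this sketch stores it densely, which is only reasonable for small grids.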
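Using (14.5), the criterion $J_{p-1}$ can be written as

$$J_{p-1} = \sum_{j=1}^{p-1} [b^T(x_0^j)\theta - y_0^j]^2 = (B_{p-1}\theta - Y_0^{p-1})^T(B_{p-1}\theta - Y_0^{p-1}) \tag{14.11}$$

where $B_{p-1} = (b(x_0^1), ..., b(x_0^{p-1}))^T$ and $Y_0^{p-1} = (y_0^1, ..., y_0^{p-1})^T$. Since $J_{p-1}$ is a quadratic function of $\theta$, the optimal $\theta$ that minimizes $J_{p-1}$, denoted by $\theta(p-1)$, is

$$\theta(p-1) = (B_{p-1}^T B_{p-1})^{-1} B_{p-1}^T Y_0^{p-1} \tag{14.12}$$

When the input-output pair $(x_0^p; y_0^p)$ becomes available, the criterion changes to $J_p$ of (14.1), which can be rewritten as

$$J_p = (B_{p-1}\theta - Y_0^{p-1})^T(B_{p-1}\theta - Y_0^{p-1}) + [b^T(x_0^p)\theta - y_0^p]^2 \tag{14.13}$$

Similar to (14.12), the optimal $\theta$ which minimizes $J_p$, denoted by $\theta(p)$, is obtained as

$$\theta(p) = [B_{p-1}^T B_{p-1} + b(x_0^p)b^T(x_0^p)]^{-1}[B_{p-1}^T Y_0^{p-1} + b(x_0^p)y_0^p] \tag{14.14}$$

To simplify (14.14), we use the matrix identity

$$(A + BCD)^{-1} = A^{-1} - A^{-1}B(DA^{-1}B + C^{-1})^{-1}DA^{-1} \tag{14.15}$$

Defining $P(p-1) = (B_{p-1}^T B_{p-1})^{-1}$ and using (14.15), we can rewrite (14.14) as

$$\theta(p) = \{P(p-1) - P(p-1)b(x_0^p)[b^T(x_0^p)P(p-1)b(x_0^p) + 1]^{-1}b^T(x_0^p)P(p-1)\}[B_{p-1}^T Y_0^{p-1} + b(x_0^p)y_0^p] \tag{14.16}$$

Since $P(p-1)B_{p-1}^T Y_0^{p-1} = \theta(p-1)$ (see (14.12)), (14.16) simplifies to

$$\theta(p) = \theta(p-1) + P(p-1)b(x_0^p)[b^T(x_0^p)P(p-1)b(x_0^p) + 1]^{-1}[y_0^p - b^T(x_0^p)\theta(p-1)] \tag{14.17}$$

Defining $K(p) = P(p-1)b(x_0^p)[b^T(x_0^p)P(p-1)b(x_0^p) + 1]^{-1}$, we obtain (14.8) and (14.9). Finally, we derive (14.10). By definition, we have

$$P(p) = [B_{p-1}^T B_{p-1} + b(x_0^p)b^T(x_0^p)]^{-1} \tag{14.18}$$

Using the matrix identity (14.15) and the fact that $B_{p-1}^T B_{p-1} = P^{-1}(p-1)$, we obtain (14.10) from (14.18).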
14.3
Nonlinear distortion over a communication channel is now a significant factor hindering further increase in the attainable data rate in high-speed data transmission (Biglieri, E., A. Gersho, R.D. Gitlin, and T.L. Lim [1984]). Because the received signal over a nonlinear channel is a nonlinear function of the past values of the transmitted symbols and the nonlinear distortion varies with time and from place to place, effective equalizers for nonlinear channels should be nonlinear and adaptive. In this section, we use the fuzzy system designed from the recursive least squares algorithm as an equalizer for nonlinear channels.

The digital communication system considered here is shown in Fig. 14.1, where the channel includes the effects of the transmitter filter, the transmission medium, the receiver matched filter, and other components. The transmitted data sequence $s(k)$ is assumed to be an independent sequence taking values from $\{-1, 1\}$ with equal probability. The inputs to the equalizer, $x(k), x(k-1), ..., x(k-n+1)$, are the channel outputs corrupted by an additive noise $e(k)$. The task of the equalizer at the sampling instant $k$ is to produce an estimate of the transmitted symbol $s(k-d)$ using the information contained in $x(k), x(k-1), ..., x(k-n+1)$, where the integers $n$ and $d$ are known as the order and the lag of the equalizer, respectively.
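We use the geometric formulation of the equalization problem due to Chen, S., G.J. Gibson, C.F.N. Cowan and P.M. Grant [1990]. Define

$$P_{n,d}(1) = \{\hat{\mathbf{x}}(k) \in R^n \mid s(k-d) = 1\} \tag{14.19}$$

$$P_{n,d}(-1) = \{\hat{\mathbf{x}}(k) \in R^n \mid s(k-d) = -1\} \tag{14.20}$$

where

$$\hat{\mathbf{x}}(k) = [\hat x(k), \hat x(k-1), ..., \hat x(k-n+1)]^T \tag{14.21}$$

$\hat x(k)$ is the noise-free output of the channel (see Fig. 14.1), and $P_{n,d}(1)$ and $P_{n,d}(-1)$ represent the two sets of possible channel noise-free output vectors $\hat{\mathbf{x}}(k)$ that can be produced from sequences of the channel inputs containing $s(k-d) = 1$ and $s(k-d) = -1$, respectively. The equalizer can be characterized by the function

$$g_k: R^n \to \{-1, 1\} \tag{14.22}$$

with

$$\hat s(k-d) = g_k(\mathbf{x}(k)) \tag{14.23}$$

where

$$\mathbf{x}(k) = [x(k), x(k-1), ..., x(k-n+1)]^T \tag{14.24}$$

is the observed channel output vector. Let $p_1[\mathbf{x}(k) \mid \hat{\mathbf{x}}(k) \in P_{n,d}(1)]$ and $p_{-1}[\mathbf{x}(k) \mid \hat{\mathbf{x}}(k) \in P_{n,d}(-1)]$ be the conditional probability density functions of $\mathbf{x}(k)$ given $\hat{\mathbf{x}}(k) \in P_{n,d}(1)$ and $\hat{\mathbf{x}}(k) \in P_{n,d}(-1)$, respectively. It was shown in Chen, S., G.J. Gibson, C.F.N. Cowan and P.M. Grant [1990] that the equalizer that is defined by

$$f_{opt}(\mathbf{x}(k)) = \mathrm{sgn}\left[p_1[\mathbf{x}(k) \mid \hat{\mathbf{x}}(k) \in P_{n,d}(1)] - p_{-1}[\mathbf{x}(k) \mid \hat{\mathbf{x}}(k) \in P_{n,d}(-1)]\right] \tag{14.25}$$

achieves the minimum bit error rate for the given order $n$ and lag $d$, where $\mathrm{sgn}(y) = 1$ ($-1$) if $y \ge 0$ ($y < 0$). If the noise $e(k)$ is zero-mean and Gaussian with covariance matrix

$$Q = E[(e(k), ..., e(k-n+1))^T (e(k), ..., e(k-n+1))] \tag{14.26}$$

then, since $\mathbf{x}(k)$ is the noise-free vector $\hat{\mathbf{x}}(k)$ plus the noise vector, the difference $p_1 - p_{-1}$ in (14.25) is, up to a positive constant factor,

$$\sum \exp\left[-\tfrac{1}{2}(\mathbf{x}(k) - \hat{\mathbf{x}}_+)^T Q^{-1} (\mathbf{x}(k) - \hat{\mathbf{x}}_+)\right] - \sum \exp\left[-\tfrac{1}{2}(\mathbf{x}(k) - \hat{\mathbf{x}}_-)^T Q^{-1} (\mathbf{x}(k) - \hat{\mathbf{x}}_-)\right] \tag{14.27}$$

where the first (second) sum is over all the points $\hat{\mathbf{x}}_+ \in P_{n,d}(1)$ ($\hat{\mathbf{x}}_- \in P_{n,d}(-1)$).

Now consider the nonlinear channel (14.28) and white Gaussian noise $e(k)$ with $E[e^2(k)] = 0.2$. For this case, the optimal decision region for $n = 2$ and $d = 0$,

$$\{\mathbf{x}(k) \in R^2 \mid p_1[\mathbf{x}(k) \mid \hat{\mathbf{x}}(k) \in P_{2,0}(1)] - p_{-1}[\mathbf{x}(k) \mid \hat{\mathbf{x}}(k) \in P_{2,0}(-1)] \ge 0\} \tag{14.29}$$

is shown in Fig. 14.2 as the shaded area. The elements of the sets $P_{2,0}(1)$ and $P_{2,0}(-1)$ are illustrated in Fig. 14.2 by the "o" and "*", respectively. From Fig. 14.2 we see that the optimal decision boundary for this case is severely nonlinear.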
Figure 14.2. Optimal decision region for the channel (14.28), Gaussian white noise with variance $\sigma_e^2 = 0.2$, and equalizer order n = 2 and lag d = 0, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).

Application Phase: In this phase, the transmitted signal $s(k)$ is unknown and the designed equalizer (the fuzzy system (14.3)) is used to estimate $s(k-d)$. Specifically, if the output of the fuzzy system is greater than or equal to zero, the estimate $\hat s(k-d) = 1$; otherwise, $\hat s(k-d) = -1$.

Example 14.1. Consider the nonlinear channel (14.28). Suppose that $n = 2$ and $d = 0$, so that the optimal decision region is shown in Fig. 14.2. Our task is to design a fuzzy system whose input-output behavior approximates that in Fig. 14.2, where the output of the fuzzy system is quantized as in the Application Phase. We use the design procedure in Section 14.1. In Step 1, we choose $N_1 = N_2 = 9$ and Gaussian membership functions $\mu_{A_i^l}(x_i)$ centered at $\bar x_i^l = -2 + 0.5(l - 1)$, where $i = 1, 2$ and $l = 1, 2, ..., 9$. In Step 3, we choose the initial parameters $\theta(0)$ randomly in the interval $[-0.3, 0.3]$. In Step 4, we choose $\sigma = 0.1$. Figs. 14.3-14.5 show the decision regions resulting from the designed fuzzy system when the training in Step 4 stops at $k = 30$, 50 and 100 (that is, when the $p$ in (14.8)-(14.10) equals 30, 50 and 100), respectively. From Figs. 14.3-14.5 we see that the decision regions tend to converge to the optimal decision region as more training is performed.
Example 14.2. In this example, we consider the same situation as in Example 14.1 except that we choose d = 1 rather than d = 0. The optimal decision region for this case is shown in Fig.14.6. Figs.14.7 and 14.8 show the decision regions resulting from the fuzzy system equalizer when the training in Step 4 stops at k = 20 and k = 50, respectively. We see, again, that the decision regions tend to converge to the optimal decision region.
Figure 14.3. Decision region of the fuzzy system equalizer when the training stops at k = 30, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).
Figure 14.4. Decision region of the fuzzy system equalizer when the training stops at k = 50, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).
Figure 14.5. Decision region of the fuzzy system equalizer when the training stops at k = 100, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).
Figure 14.6. Optimal decision region for the channel (14.28), Gaussian white noise with variance $\sigma_e^2 = 0.2$, and equalizer order n = 2 and lag d = 1, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).
Figure 14.7. Decision region of the fuzzy system equalizer in Example 14.2 when the training stops at k = 20, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).
Figure 14.8. Decision region of the fuzzy system equalizer in Example 14.2 when the training stops at k = 50, where the horizontal axis denotes x(k) and the vertical axis denotes x(k - 1).
14.4 Summary and Further Readings
In this chapter we have demonstrated the following:
- Using the recursive least squares algorithm to design the parameters of the fuzzy system.
- The techniques used in the derivation of the recursive least squares algorithm.
- Formulation of the channel equalization problem as a pattern recognition problem and the application of the fuzzy system with the recursive least squares algorithm to the equalization and similar pattern recognition problems.

The recursive least squares algorithm is studied in detail in many standard textbooks on estimation theory and adaptive filters; for example, Mendel [1994] and Cowan and Grant [1985]. The method in this chapter is taken from Wang [1994a] and Wang and Mendel [1993], where more simulation results can be found. For a similar approach using neural networks, see Chen, S., G.J. Gibson, C.F.N. Cowan and P.M. Grant [1990].
14.5 Exercises
Exercise 14.1.
(a) Design a fuzzy system $f(x_1, x_2)$ in the form of (14.3) such that $\mathrm{sgn}[f(x_1, x_2)]$ implements the XOR function, where $\mathrm{sgn}(f) = 1$ if $f \ge 0$ and $\mathrm{sgn}(f) = -1$ if $f < 0$.
(b) Plot the decision region $\{x \in U \mid \mathrm{sgn}[f(x)] \ge 0\}$, where $U = [-2, 2] \times [-2, 2]$ and $f(x)$ is the fuzzy system you designed in (a).
Exercise 14.2. Discuss the physical meaning of each of (14.8)-(14.10). Explain why the initial $P(0) = \sigma I$ should be large.
Exercise 14.3. Prove the matrix identity (14.15).
Exercise 14.4. Suppose that we change the criterion $J_p$ of (14.1) to

$$J'_p = \sum_{j=1}^{p} \lambda^{p-j} [f(x_0^j) - y_0^j]^2 \tag{14.30}$$

where $\lambda \in (0, 1]$ is a forgetting factor, and that we still use the fuzzy system in the form of (14.5). Derive the recursive least squares algorithm similar to (14.8)-(14.10) for this new criterion $J'_p$.
Exercise 14.5. The objective of using the $J'_p$ of (14.30) is to discount old data by putting smaller weights on them. Another way is to consider only the most recent $N$ data pairs, that is, the criterion now is

$$J''_p = \sum_{j=p-N+1}^{p} [f(x_0^j) - y_0^j]^2$$

Let $f(x_0^j)$ be the fuzzy system in the form of (14.5). Derive the recursive least squares algorithm similar to (14.8)-(14.10) which minimizes $J''_p$.
Exercise 14.6. Determine the exact locations of the sets of points: (a) $P_{2,0}(1)$ and $P_{2,0}(-1)$ in Fig. 14.2, and (b) $P_{2,1}(1)$ and $P_{2,1}(-1)$ in Fig. 14.6.
14.7 (Project). Write a computer program to implement the design method in Section 14.1 and apply your program to the nonlinear system identification problems in Chapter 13.
Chapter 15
In Chapters 12-14, we proposed three methods for designing fuzzy systems. In all these methods, we did not propose a systematic procedure for determining the number of rules in the fuzzy systems. More specifically, the gradient descent method of Chapter 13 fixes the number of rules before training, while the table look-up scheme of Chapter 12 and the recursive least squares method of Chapter 14 fix the IF-part fuzzy sets, which in turn sets a bound for the number of rules. Choosing an appropriate number of rules is important in designing the fuzzy systems, because too many rules result in a complex fuzzy system that may be unnecessary for the problem, whereas too few rules produce a less powerful fuzzy system that may be insufficient to achieve the objective. In this chapter, we view the number of rules in the fuzzy system as a design parameter and determine it based on the input-output pairs. The basic idea is to group the input-output pairs into clusters and use one rule for one cluster; that is, the number of rules equals the number of clusters. We first construct a fuzzy system that is optimal in the sense that it can match all the input-output pairs to arbitrary accuracy; this optimal fuzzy system is useful if the number of input-output pairs is small. Then, we determine clusters of the input-output pairs using the nearest neighborhood clustering algorithm, view the clusters as input-output pairs, and use the optimal fuzzy system to match them.
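Given $N$ input-output pairs $(x_0^l; y_0^l)$, $l = 1, 2, ..., N$, the optimal fuzzy system is

$$f(x) = \frac{\sum_{l=1}^{N} y_0^l \exp\left(-\frac{|x - x_0^l|^2}{\sigma^2}\right)}{\sum_{l=1}^{N} \exp\left(-\frac{|x - x_0^l|^2}{\sigma^2}\right)} \tag{15.1}$$

Clearly, the fuzzy system (15.1) is constructed from the $N$ rules in the form of (7.1) with $\mu_{A_i^l}(x_i) = \exp(-\frac{|x_i - x_{0i}^l|^2}{\sigma^2})$ and the center of $B^l$ equal to $y_0^l$, and using the product inference engine, singleton fuzzifier, and center average defuzzifier. The following theorem shows that by properly choosing the parameter $\sigma$, the fuzzy system (15.1) can match all the $N$ input-output pairs to any given accuracy.

Theorem 15.1: For arbitrary $\epsilon > 0$, there exists $\sigma^* > 0$ such that the fuzzy system (15.1) with $\sigma = \sigma^*$ has the property that

$$|f(x_0^l) - y_0^l| < \epsilon \tag{15.2}$$

for all $l = 1, 2, ..., N$.

Proof: Viewing $x_0^l$ and $y_0^l$ as the $x_0^{k+1}$ and $y_0^{k+1}$ in Lemma 13.1 and using exactly the same method as in the proof of Lemma 13.1, we can prove this theorem.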
The $\sigma$ is a smoothing parameter: the smaller the $\sigma$, the smaller the matching error $|f(x_0^l) - y_0^l|$, but the less smooth the $f(x)$ becomes. We know that if $f(x)$ is not smooth, it may not generalize well for the data points not in the training set. Thus, the $\sigma$ should be properly chosen to provide a balance between matching and generalization. Because the $\sigma$ is a one-dimensional parameter, it is usually not difficult to determine an appropriate $\sigma$ for a practical problem. Sometimes, a few trial-and-error procedures may determine a good $\sigma$. As a general rule, large $\sigma$ can smooth out noisy data, while small $\sigma$ can make $f(x)$ as nonlinear as is required to approximate closely the training data. The $f(x)$ is a general nonlinear regression that provides a smooth interpolation between the observed points $(x_0^l; y_0^l)$. It is well behaved even for very small $\sigma$.

Example 15.1. In this example, we would like to see the influence of the parameter $\sigma$ on the smoothness and the matching accuracy of the optimal fuzzy system. We consider a simple single-input case. Suppose that we are given five input-output pairs: $(-2, 1)$, $(-1, 0)$, $(0, 2)$, $(1, 2)$ and $(2, 1)$. The optimal fuzzy system $f(x)$ is in the form of (15.1) with $(x_0^l; y_0^l) = (-2, 1), (-1, 0), (0, 2), (1, 2), (2, 1)$ for $l = 1, 2, ..., 5$, respectively. Figs. 15.1-15.3 plot the $f(x)$ for $\sigma = 0.1$, 0.3 and 0.5, respectively. These plots confirm our earlier comment that smaller $\sigma$ gives smaller matching errors but less smooth functions.
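The effect of the smoothing parameter in Example 15.1 can be reproduced with a short script. The sketch below evaluates the optimal fuzzy system as a Gaussian-weighted average of the stored outputs, $f(x) = \sum_l y_0^l w_l(x) / \sum_l w_l(x)$ with $w_l(x) = \exp(-|x - x_0^l|^2/\sigma^2)$, and reports the largest matching error at the five data points. The function names are illustrative, not from the text.

```python
import numpy as np

def optimal_fuzzy_system(x, X0, y0, sigma):
    """Weighted average of the y_0^l with Gaussian weights centered at the x_0^l."""
    d2 = np.sum((np.asarray(x, dtype=float) - X0) ** 2, axis=1)
    w = np.exp(-d2 / sigma ** 2)
    return (w @ y0) / w.sum()

def matching_error(X0, y0, sigma):
    """Largest |f(x_0^l) - y_0^l| over the stored input-output pairs."""
    return max(abs(optimal_fuzzy_system(x, X0, y0, sigma) - y)
               for x, y in zip(X0, y0))

# The five pairs of Example 15.1.
X0 = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
y0 = np.array([1.0, 0.0, 2.0, 2.0, 1.0])
for sigma in (0.1, 0.3, 0.5):
    print("sigma =", sigma, "largest matching error =", matching_error(X0, y0, sigma))
```

The reported error grows with $\sigma$, in line with the discussion above. Evaluating the function on a fine grid over $[-2, 2]$ shows the other side of the trade-off: small $\sigma$ gives a nearly piecewise-constant curve, while larger $\sigma$ gives a smoother one. For inputs far from every stored point and very small $\sigma$, all weights can underflow to zero, so a practical implementation may want to guard the division.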
15.2
The optimal fuzzy system (15.1) uses one rule for one input-output pair, thus it is no longer a practical system if the number of input-output pairs is large. For
194
Ch. 15
Figure 16.1. The optimal fuzzy system in Example 15.1 with u = 0.1.
Figure 15.2. The optimal fuzzy system in Example 15.1 with u = 0.3.
these large sample problems, various clustering techniques can be used to group the input-output pairs so that a group can be represented by one rule. From a general conceptual point of view, clustering means partitioning of a collection of data into disjoint subsets or clusters, with the data in a cluster having
Figure 15.3. The optimal fuzzy system in Example 15.1 with $\sigma = 0.5$.
some properties that distinguish them from the data in the other clusters. For our problem, we first group the input-output pairs into clusters according to the distribution of the input points, and then use one rule for one cluster. Fig. 15.4 illustrates an example where six input-output pairs are grouped into two clusters and the two rules in the figure are used to construct the fuzzy system. The detailed algorithm is given next. One of the simplest clustering algorithms is the nearest neighborhood clustering algorithm. In this algorithm, we first put the first datum as the center of the first cluster. Then, if the distances of a datum to the cluster centers are less than a prespecified value, put this datum into the cluster whose center is the closest to this datum; otherwise, set this datum as a new cluster center. The details are given as follows.
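Design of the Fuzzy System Using Nearest Neighborhood Clustering:

Step 1. Starting with the first input-output pair $(x_0^1; y_0^1)$, establish a cluster center $x_c^1$ at $x_0^1$, and set $A^1(1) = y_0^1$, $B^1(1) = 1$. Select a radius $r$.

Step 2. Suppose that when we consider the $k$'th input-output pair $(x_0^k; y_0^k)$, $k = 2, 3, \ldots$, there are $M$ clusters with centers at $x_c^1, x_c^2, \ldots, x_c^M$. Compute the distances of $x_0^k$ to these $M$ cluster centers, $|x_0^k - x_c^l|$, $l = 1, 2, \ldots, M$, and let the smallest distance be $|x_0^k - x_c^{l_k}|$; that is, the nearest cluster to $x_0^k$ is $x_c^{l_k}$. Then:

a) If $|x_0^k - x_c^{l_k}| > r$, establish $x_0^k$ as a new cluster center $x_c^{M+1} = x_0^k$, set $A^{M+1}(k) = y_0^k$, $B^{M+1}(k) = 1$, and keep $A^l(k) = A^l(k-1)$, $B^l(k) = B^l(k-1)$ for $l = 1, 2, \ldots, M$.

b) If $|x_0^k - x_c^{l_k}| \le r$, do the following:

$$A^{l_k}(k) = A^{l_k}(k-1) + y_0^k \tag{15.3}$$

$$B^{l_k}(k) = B^{l_k}(k-1) + 1 \tag{15.4}$$

and set

$$A^l(k) = A^l(k-1) \tag{15.5}$$

$$B^l(k) = B^l(k-1) \tag{15.6}$$

for $l = 1, 2, \ldots, M$ with $l \ne l_k$.

Step 3. If $x_0^k$ does not establish a new cluster, then the designed fuzzy system based on the $k$ input-output pairs $(x_0^j; y_0^j)$, $j = 1, 2, \ldots, k$, is

$$f_k(x) = \frac{\sum_{l=1}^{M} A^l(k) \exp\left(-\frac{|x - x_c^l|^2}{\sigma^2}\right)}{\sum_{l=1}^{M} B^l(k) \exp\left(-\frac{|x - x_c^l|^2}{\sigma^2}\right)} \tag{15.7}$$

If $x_0^k$ establishes a new cluster, then the designed fuzzy system is

$$f_k(x) = \frac{\sum_{l=1}^{M+1} A^l(k) \exp\left(-\frac{|x - x_c^l|^2}{\sigma^2}\right)}{\sum_{l=1}^{M+1} B^l(k) \exp\left(-\frac{|x - x_c^l|^2}{\sigma^2}\right)} \tag{15.8}$$

Step 4. Repeat by going to Step 2 with $k = k + 1$.

Figure 15.4. An example of constructing fuzzy IF-THEN rules from input-output pairs, where six input-output pairs $(x_0^1; y_0^1), \ldots, (x_0^6; y_0^6)$ are grouped into two clusters from which the two rules are generated.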
From (15.3)-(15.6) we see that the variable $B^l(k)$ equals the number of input-output pairs in the $l$'th cluster after $k$ input-output pairs have been used, and $A^l(k)$ equals the summation of the output values of the input-output pairs in the $l$'th cluster. Therefore, if each input-output pair establishes a cluster center, then the designed fuzzy system (15.8) becomes the optimal fuzzy system (15.1). Because the optimal fuzzy system (15.1) can be viewed as using one rule to match one input-output pair, the fuzzy system (15.7) or (15.8) can be viewed as using one rule to match one cluster of input-output pairs. Since a new cluster may be introduced whenever a new input-output pair is used, the number of rules in the designed fuzzy system also changes during the design process. The number of clusters (or rules) depends on the distribution of the input points in the input-output pairs and the radius $r$. The radius $r$ determines the complexity of the designed fuzzy system. For smaller $r$, we have more clusters, which result in a more sophisticated fuzzy system. For larger $r$, the designed fuzzy system is simpler but less powerful. In practice, a good radius $r$ may be obtained by trial and error.
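The design steps and the role of the radius $r$ can be sketched in a few lines of code. This is an illustrative sketch for scalar inputs only; the function names are hypothetical, and the Gaussian weight $\exp(-|x - x_c^l|^2/\sigma^2)$ is an assumption about the form of (15.7) and (15.8).

```python
import math

# Five sample pairs (the data of Example 15.1), used only for illustration.
PAIRS = [(-2.0, 1.0), (-1.0, 0.0), (0.0, 2.0), (1.0, 2.0), (2.0, 1.0)]

def nearest_neighborhood_clustering(pairs, r):
    """Process scalar-input pairs in order; return (centers, A, B)."""
    centers, A, B = [], [], []
    for x, y in pairs:
        if centers:
            # Index l_k of the nearest existing cluster center.
            lk = min(range(len(centers)), key=lambda l: abs(x - centers[l]))
        if not centers or abs(x - centers[lk]) > r:
            # Step 1 or case a): the datum becomes a new cluster center.
            centers.append(x)
            A.append(y)
            B.append(1)
        else:
            # Case b): update the output sum and the count of cluster l_k.
            A[lk] += y
            B[lk] += 1
    return centers, A, B

def clustered_fuzzy_system(x, centers, A, B, sigma):
    """Evaluate the designed system (assumed Gaussian form of (15.7)/(15.8))."""
    d2 = [(x - c) ** 2 for c in centers]
    d2_min = min(d2)  # shift exponents so the denominator cannot underflow
    w = [math.exp(-(d - d2_min) / sigma ** 2) for d in d2]
    return (sum(a * wl for a, wl in zip(A, w))
            / sum(b * wl for b, wl in zip(B, w)))

for r in (0.5, 1.5, 5.0):
    centers, A, B = nearest_neighborhood_clustering(PAIRS, r)
    print(f"r = {r}: {len(centers)} rule(s), centers = {centers}")
```

On this data the sketch yields five rules for $r = 0.5$, three for $r = 1.5$, and a single rule for $r = 5$, in line with the remark that smaller $r$ produces more clusters.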
Example 15.2. Consider the five input-output pairs in Example 15.1. Our task now is to design a fuzzy system using the design procedure in this section. If $r < 1$, then each of the five input-output pairs establishes a cluster center and the designed fuzzy system $f_5(x)$ is the same as in Example 15.1. We now design the fuzzy system with $r = 1.5$.
In Step 1, we establish the center of the first cluster $x_c^1 = x_0^1 = -2$ and set $A^1(1) = y_0^1 = 1$, $B^1(1) = 1$.

In Step 2 with $k = 2$, since $|x_0^2 - x_c^{l_2}| = |x_0^2 - x_c^1| = |-1 - (-2)| = 1 < r = 1.5$, we have $A^1(2) = A^1(1) + y_0^2 = 1 + 0 = 1$ and $B^1(2) = B^1(1) + 1 = 2$.

For $k = 3$, since $|x_0^3 - x_c^{l_3}| = |x_0^3 - x_c^1| = |0 - (-2)| = 2 > r$, we establish a new cluster center $x_c^2 = x_0^3 = 0$ together with $A^2(3) = y_0^3 = 2$ and $B^2(3) = 1$. The $A^1$ and $B^1$ remain the same, that is, $A^1(3) = A^1(2) = 1$ and $B^1(3) = B^1(2) = 2$.

For $k = 4$, since $|x_0^4 - x_c^{l_4}| = |x_0^4 - x_c^2| = |1 - 0| = 1 < r$, we have $A^2(4) = A^2(3) + y_0^4 = 2 + 2 = 4$, $B^2(4) = B^2(3) + 1 = 2$, $A^1(4) = A^1(3) = 1$ and $B^1(4) = B^1(3) = 2$.

Finally, for $k = 5$, since $|x_0^5 - x_c^{l_5}| = |x_0^5 - x_c^2| = |2 - 0| = 2 > r$, a new cluster center $x_c^3 = x_0^5 = 2$ is established with $A^3(5) = y_0^5 = 1$ and $B^3(5) = 1$. The other variables remain the same, that is, $A^1(5) = A^1(4) = 1$, $B^1(5) = B^1(4) = 2$, $A^2(5) = A^2(4) = 4$ and $B^2(5) = B^2(4) = 2$. The final fuzzy system is

$$f_5(x) = \frac{\exp\left(-\frac{(x+2)^2}{\sigma^2}\right) + 4\exp\left(-\frac{x^2}{\sigma^2}\right) + \exp\left(-\frac{(x-2)^2}{\sigma^2}\right)}{2\exp\left(-\frac{(x+2)^2}{\sigma^2}\right) + 2\exp\left(-\frac{x^2}{\sigma^2}\right) + \exp\left(-\frac{(x-2)^2}{\sigma^2}\right)} \tag{15.9}$$
which is plotted in Fig. 15.5 with $\sigma = 0.3$. Comparing Fig. 15.5 with Fig. 15.2 we see, as expected, that the matching errors of the fuzzy system (15.9) at the five input-output pairs are larger than those of the optimal fuzzy system.
Figure 15.5. The designed fuzzy system $f_5(x)$ (15.9) in Example 15.2 with $\sigma = 0.3$.
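The comparison between the clustered system of Example 15.2 and the optimal fuzzy system can also be made numerically. The sketch below hard-codes the cluster centers $-2, 0, 2$ and the coefficients $A = (1, 4, 1)$, $B = (2, 2, 1)$ obtained in the example, and assumes Gaussian weights of the form $\exp(-(x - c)^2/\sigma^2)$ for both systems; it evaluates the functions only at the five training inputs.

```python
import math

SIGMA = 0.3
PAIRS = [(-2.0, 1.0), (-1.0, 0.0), (0.0, 2.0), (1.0, 2.0), (2.0, 1.0)]

def gauss(x, c, sigma=SIGMA):
    """Gaussian weight centered at c (assumed form)."""
    return math.exp(-((x - c) ** 2) / sigma ** 2)

def f_optimal(x):
    """One rule per input-output pair."""
    return (sum(y * gauss(x, c) for c, y in PAIRS)
            / sum(gauss(x, c) for c, _ in PAIRS))

def f5(x):
    """Three-rule system from Example 15.2 with r = 1.5."""
    num = gauss(x, -2.0) + 4.0 * gauss(x, 0.0) + gauss(x, 2.0)
    den = 2.0 * gauss(x, -2.0) + 2.0 * gauss(x, 0.0) + gauss(x, 2.0)
    return num / den

for x0, y0 in PAIRS:
    print(f"x = {x0:+.0f}: y = {y0:.0f}, "
          f"optimal = {f_optimal(x0):.4f}, clustered = {f5(x0):.4f}")

max_err_opt = max(abs(f_optimal(x0) - y0) for x0, y0 in PAIRS)
max_err_f5 = max(abs(f5(x0) - y0) for x0, y0 in PAIRS)
```

With these assumptions the clustered system returns roughly the cluster average near each center (for instance about 0.5 at $x = -2$, where the cluster holds the outputs 1 and 0), so its largest matching error is much greater than that of the one-rule-per-pair system.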
Since the $A^l(k)$ and $B^l(k)$ coefficients in (15.7) and (15.8) are determined using the recursive equations (15.3)-(15.6), it is easy to add a forgetting factor to (15.3)-(15.6). This is desirable if the fuzzy system is being used to model systems with changing characteristics. For these cases, we replace (15.3) and (15.4) with

$$A^{l_k}(k) = \frac{\tau - 1}{\tau} A^{l_k}(k-1) + \frac{1}{\tau} y_0^k \tag{15.10}$$

$$B^{l_k}(k) = \frac{\tau - 1}{\tau} B^{l_k}(k-1) + \frac{1}{\tau} \tag{15.11}$$

and replace (15.5) and (15.6) with

$$A^l(k) = \frac{\tau - 1}{\tau} A^l(k-1) \tag{15.12}$$

$$B^l(k) = \frac{\tau - 1}{\tau} B^l(k-1) \tag{15.13}$$

where $\tau$ can be considered as a time constant of an exponential decay function. For practical considerations, there should be a lower threshold for $B^l(k)$ so that when sufficient time has elapsed without update for a particular cluster (so that its $B^l(k)$ falls below the threshold), that cluster is eliminated.
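A possible realization of the forgetting-factor update with cluster elimination is sketched below for scalar inputs. The function name, the `threshold` argument, and the choice to let existing clusters decay also on steps that create a new cluster are assumptions made for illustration; the text does not specify them.

```python
def forgetting_update(centers, A, B, x, y, r, tau, threshold):
    """One update with forgetting factor; returns pruned (centers, A, B)."""
    decay = (tau - 1.0) / tau
    # Every existing cluster decays by (tau - 1) / tau.
    A = [decay * a for a in A]
    B = [decay * b for b in B]
    lk = None
    if centers:
        lk = min(range(len(centers)), key=lambda l: abs(x - centers[l]))
    if lk is None or abs(x - centers[lk]) > r:
        # New cluster, initialized as in the design without forgetting
        # (assumption: the initialization is unchanged).
        centers = centers + [x]
        A.append(y)
        B.append(1.0)
    else:
        # The nearest cluster additionally receives the new datum.
        A[lk] += y / tau
        B[lk] += 1.0 / tau
    # Eliminate clusters whose B has decayed below the threshold.
    keep = [l for l, b in enumerate(B) if b >= threshold]
    return ([centers[l] for l in keep],
            [A[l] for l in keep],
            [B[l] for l in keep])

# Usage: a cluster at 0 that stops receiving data is eventually removed.
centers, A, B = [0.0], [2.0], [1.0]
for _ in range(40):
    centers, A, B = forgetting_update(centers, A, B, x=5.0, y=1.0,
                                      r=1.0, tau=10.0, threshold=0.05)
print(centers)
```

In the usage example the data move to $x = 5$, so the $B$ value of the old cluster at 0 shrinks by the factor $0.9$ at every step until it drops below the threshold and the cluster is removed.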
is assumed to be unknown. The objective here is to design a controller $u(k)$ (based on the fuzzy system (15.7) or (15.8)) such that the output $y(k)$ of the closed-loop system follows the output $y_m(k)$ of the reference model
where $r(k) = \sin(2\pi k/25)$. That is, we want $e(k) = y(k) - y_m(k)$ to converge to zero as $k$ goes to infinity.
from which it follows that $\lim_{k \to \infty} e(k) = 0$. However, since $g[y(k), y(k-1)]$ is unknown, the controller (15.17) cannot be implemented. To solve this problem, we replace the $g[y(k), y(k-1)]$ in (15.17) by the fuzzy system (15.7) or (15.8); that is, we use the following controller
where $f_k[y(k), y(k-1)]$ is in the form of (15.7) or (15.8) with $x = (y(k), y(k-1))^T$. This results in the nonlinear difference equation
Figure 15.6. Overall adaptive fuzzy control system for Example 15.3.
governing the behavior of the closed-loop system. The overall control system is shown in Fig. 15.6. From Fig. 15.6 we see that the controller consists of two parts: an identifier and a controller. The identifier uses the fuzzy system $f_k$ to approximate the unknown nonlinear function $g$, and this $f_k$ is then copied to the controller. We simulated the following two cases for this example:
Case 1: The controller in Fig. 15.6 was first disconnected and only the identifier was operating to identify the unknown plant. In this identification phase, we chose the input $u(k)$ to be an i.i.d. random signal uniformly distributed in the interval $[-3, 3]$. After the identification procedure was terminated, (15.20) was used to generate the control input; that is, the controller in Fig. 15.6 began operating with $f_k$ copied from the final $f_k$ in the identifier. Figs. 15.7 and 15.8 show the output $y(k)$ of the closed-loop system with this controller together with the reference model output $y_m(k)$ for the cases where the identification procedure was terminated at $k = 100$ and $k = 500$, respectively. In these simulations, we chose $\sigma = 0.3$ and $r = 0.3$. From these simulation results we see that: (i) with only 100 steps of training the identifier could produce an accurate model that resulted in a good tracking performance, and (ii) with more steps of training the control performance was improved. Case
Figure 15.7. The output $y(k)$ (solid line) of the closed-loop system and the reference trajectory $y_m(k)$ (dashed line) for Case 1 in Example 15.3 when the identification procedure was terminated at $k = 100$.
Figure 15.8. The output $y(k)$ (solid line) of the closed-loop system and the reference trajectory $y_m(k)$ (dashed line) for Case 1 in Example 15.3 when the identification procedure was terminated at $k = 500$.
in Fig. 15.6) from $k = 0$. We still chose $\sigma = 0.3$ and $r = 0.3$. Fig. 15.9 shows $y(k)$ and $y_m(k)$ for this simulation.
Figure 15.9. The output $y(k)$ (solid line) of the closed-loop system and the reference trajectory $y_m(k)$ (dashed line) for Case 2 in Example 15.3.
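The identifier-plus-controller arrangement of Fig. 15.6, with both parts running from $k = 0$, might be sketched as follows. Everything specific here is an assumption made for illustration: the plant form $y(k+1) = g[y(k), y(k-1)] + u(k)$, the bounded stand-in nonlinearity `g`, the reference-model coefficients `A1` and `A2`, and the Gaussian weights with $\sigma^2$ in the exponent. The sketch shows the structure of the loop and makes no claim about reproducing the plotted results.

```python
import math

SIGMA, R = 0.3, 0.3   # sigma and radius r as chosen in Example 15.3
A1, A2 = 0.6, 0.2     # hypothetical stable reference-model coefficients

def g(y1, y2):
    """Hypothetical bounded plant nonlinearity, unknown to the controller."""
    return y1 / (1.0 + y1 * y1 + y2 * y2)

def ref_input(k):
    return math.sin(2.0 * math.pi * k / 25.0)

class Identifier:
    """Nearest neighborhood clustering on x = (y(k), y(k-1))."""
    def __init__(self):
        self.centers, self.A, self.B = [], [], []

    def update(self, x, target):
        d = [math.dist(x, c) for c in self.centers]
        if not d or min(d) > R:
            self.centers.append(x)
            self.A.append(target)
            self.B.append(1)
        else:
            l = d.index(min(d))
            self.A[l] += target
            self.B[l] += 1

    def predict(self, x):
        if not self.centers:
            return 0.0  # no rule yet: use zero as the estimate of g
        d2 = [math.dist(x, c) ** 2 for c in self.centers]
        m = min(d2)     # shift exponents to avoid a zero denominator
        w = [math.exp(-(v - m) / SIGMA ** 2) for v in d2]
        return (sum(a * wl for a, wl in zip(self.A, w))
                / sum(b * wl for b, wl in zip(self.B, w)))

def simulate(steps, use_true_g=False):
    """Return (largest |y(k) - y_m(k)|, number of rules)."""
    ident = Identifier()
    y = y_prev = ym = ym_prev = 0.0
    worst = 0.0
    for k in range(steps):
        x = (y, y_prev)
        f = g(*x) if use_true_g else ident.predict(x)
        u = -f + A1 * y + A2 * y_prev + ref_input(k)   # controller
        y_next = g(*x) + u                             # assumed plant
        ident.update(x, y_next - u)                    # observed value of g
        ym_next = A1 * ym + A2 * ym_prev + ref_input(k)
        y_prev, y = y, y_next
        ym_prev, ym = ym, ym_next
        worst = max(worst, abs(y - ym))
    return worst, len(ident.centers)

print(simulate(200, use_true_g=True))  # ideal controller with known g
print(simulate(200))                   # fuzzy system in place of g
```

With the true `g` in the controller, the plant output coincides with the reference output up to rounding, which corresponds to the ideal controller that cannot be implemented in practice. Replacing `g` by the identifier output gives the implementable scheme, whose tracking quality depends on how well the clustered fuzzy system has learned `g` along the trajectory.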
where the nonlinear function is assumed to be unknown. The aim is to design a controller u(k) such that y(k) will follow the reference model
where $f_k[y(k), y(k-1), y(k-2)]$ is in the form of (15.7) or (15.8). The basic configuration of the overall control scheme is the same as Fig. 15.6. Fig. 15.10 shows $y(k)$ and $y_m(k)$ when both the identifier and the controller began operating from $k = 0$. We chose $\sigma = 0.3$ and $r = 0.3$ in this simulation.
Figure 15.10. The output $y(k)$ (solid line) of the closed-loop system and the reference trajectory $y_m(k)$ (dashed line) for Example 15.4.
15.4 Summary and Further Readings
In this chapter we have demonstrated the following:
- The idea and construction of the optimal fuzzy system.
- The detailed steps of using the nearest neighborhood clustering algorithm to design the fuzzy systems from input-output pairs.
- Applications of the designed fuzzy system to the adaptive control of discrete-time dynamic systems and other problems.

Various clustering algorithms can be found in the textbooks on pattern recognition, among which Duda and Hart [1973] is still one of the best. The method in this chapter is taken from Wang [1994a], where more examples can be found.
15.5 Exercises
Exercise 15.1. Repeat Example 15.2 with $r = 2.2$.

Exercise 15.2. Modify the design method in Section 15.2 such that a cluster center is the average of inputs of the points in the cluster, the $A^l(k)$ parameter is the average of outputs of the points in the cluster, and the $B^l(k)$ parameter is deleted.
Exercise 15.3. Create an example to show that even with the same set of input-output pairs, the clustering method in Section 15.2 may create different fuzzy systems if the ordering of the input-output pairs used is different.

Exercise 15.4. The basic idea of hierarchical clustering is illustrated in Fig. 15.11. Propose a method to design fuzzy systems using the hierarchical clustering idea. Show your method in detail in a step-by-step manner and demonstrate it through a simple example.
where the nonlinear functions are assumed to be unknown. (a) Design an identifier for the system using the fuzzy system (15.7) or (15.8) as basic block. Explain the working procedure of the identifier. (b) Design a controller for the system such that the closed-loop system outputs follow the reference model
where $r_1(k)$ and $r_2(k)$ are known reference signals. Under what conditions will the tracking error converge to zero?
Exercise 15.6 (Project). Write a computer program to implement the design method in Section 15.2 and apply your program to the time series prediction and nonlinear system identification problems in Chapters 12 and 13.