
FUZZY LOGIC

COMPILED BY :
ER. RAVI PRAKASH SHAHI
[email protected]
Uncertainty Studies

Uncertainty Study
• Probability Based
  – Probabilistic Reasoning & Graphical Models (e.g. the Bayesian Belief Network)
  – Markov Processes
• Information Theory Based
  – Entropy-Centric Algorithms
• Fuzzy Logic Based
• Qualitative Reasoning

04/06/24 Reasoning under Uncertainty
Outlook (O) | Temp (T) | Humidity (H) | Windy (W) | Decision to play (D)
Sunny       | High     | High         | F         | N
Sunny       | High     | High         | T         | N
Cloudy      | High     | High         | F         | Y
Rain        | Med      | High         | F         | Y
Rain        | Cold     | Low          | F         | Y
Rain        | Cold     | Low          | T         | N
Cloudy      | Cold     | Low          | T         | Y

To-play-or-not-to-play-tennis data vs. Climatic-Condition


Weather (O) | Temp (T) | Humidity (H) | Windy (W) | Decision (D)
Sunny       | Med      | High         | F         | N
Sunny       | Cold     | Low          | F         | Y
Rain        | Med      | Low          | F         | Y
Sunny       | Med      | Low          | T         | Y
Cloudy      | Med      | High         | T         | Y
Cloudy      | High     | Low          | F         | Y
Rain        | High     | High         | T         | N
Decision tree:

Outlook
  Sunny  → Humidity: High → No, Low → Yes
  Cloudy → Yes
  Rain   → Windy: T → No, F → Yes
Rule Base

R1: If outlook is sunny and humidity is high then Decision is No.
R2: If outlook is sunny and humidity is low then Decision is Yes.
R3: If outlook is cloudy then Decision is Yes.


Fuzzy Logic
Fuzzy Logic tries to capture the
human ability of reasoning with
imprecise information

• Models Human Reasoning


• Works with imprecise statements such as:
In a process control situation, “If the
temperature is moderate and the pressure is high,
then turn the knob slightly right”
• The rules have “Linguistic Variables”, typically
adjectives qualified by adverbs (adverbs are
hedges).
Underlying Theory: Theory of Fuzzy Sets
• There is an intimate connection between logic and set theory.
• Given any set S and an element e, there is a very natural predicate μs(e), called the belongingness predicate.
• The predicate is such that
  μs(e) = 1, iff e ∈ S
        = 0, otherwise
• For example, if S = {1, 2, 3, 4}, then μs(1) = 1 and μs(5) = 0.
• A predicate P(x) also naturally defines a set: S = {x | P(x) is true}.
  For example, even(x) defines S = {x | x is even}.
Fuzzy Set Theory (contd.)

• Fuzzy set theory starts by questioning a fundamental assumption of set theory, viz. that the belongingness predicate μ takes only the values 0 or 1.
• Instead, in fuzzy theory it is assumed that μs(e) ∈ [0, 1].
• Fuzzy set theory is a generalization of classical set theory, also called crisp set theory.
• In real life, belongingness is a fuzzy concept.
  Example: Let T = set of "tall" people.
  μT(Ram) = 1.0
  μT(Shyam) = 0.2
  Shyam belongs to T with degree 0.2.
Linguistic Variables
• Fuzzy sets are named by Linguistic Variables (LVs), typically adjectives.
• Underlying the LV is a numerical quantity.
  E.g., for 'tall' (LV), 'height' is the numerical quantity.
• The profile of an LV is the plot of membership against that quantity.
(Figure: μtall(h) plotted against height h from 0 to 6; e.g. μtall(4.5) = 0.4.)
Example Profiles

(Figures: μrich(w) and μpoor(w) plotted against wealth w.)
(Figures: a profile representing 'moderate' (e.g. moderately rich) and a profile representing 'extreme'.)
Concept of Hedge
• A hedge is an intensifier.
• Example: LV = tall, LV1 = very tall, LV2 = somewhat tall
• 'very' operation: μvery tall(x) = (μtall(x))²
• 'somewhat' operation: μsomewhat tall(x) = √(μtall(x))
(Figure: profiles of 'somewhat tall', 'tall', and 'very tall' against height h.)
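The two hedge operations can be sketched in Python; the linear 'tall' profile below is an illustrative assumption (the slides only show its shape):

```python
# Hedges as operations on membership functions:
# 'very' squares the grade (concentration), 'somewhat' takes the
# square root (dilation). The linear profile for 'tall' is assumed.

def mu_tall(h):
    """Membership in 'tall', rising linearly between heights 4 and 6."""
    return min(1.0, max(0.0, (h - 4.0) / 2.0))

def very(mu):
    return lambda x: mu(x) ** 2

def somewhat(mu):
    return lambda x: mu(x) ** 0.5

h = 5.0
print(mu_tall(h), very(mu_tall)(h), somewhat(mu_tall)(h))
# 0.5 0.25 0.7071067811865476
```

Note that 'very tall' grades lie below 'tall', and 'somewhat tall' grades above it, matching the figure.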
Representation of Fuzzy Sets
Let U = {x1, x2, ....., xn}, |U| = n.
The various sets composed of elements from U are represented as points on and inside the n-dimensional hypercube. The crisp sets are the corners of the hypercube.
(Figure: for U = {x1, x2}, the unit square with corners Φ = (0,0), {x1} = (1,0), {x2} = (0,1), and U = (1,1); the fuzzy set A with μA(x1) = 0.3 and μA(x2) = 0.4 is the interior point (0.3, 0.4).)

A fuzzy set A is represented by the point (μA(x1), μA(x2), ....., μA(xn)) in the n-dimensional space.
Degree of Fuzziness
The centre of the hypercube is the "most fuzzy" set. Fuzziness decreases as one nears the corners.

Measure of Fuzziness
Called the entropy of a fuzzy set:

E(S) = d(S, nearest corner) / d(S, farthest corner)

(Figure: the unit square for U = {x1, x2}, with A = (0.5, 0.5) at the centre and d(A, nearest) and d(A, farthest) marked.)
Definition
Distance between two fuzzy sets (L1 norm):

d(S1, S2) = Σ_{i=1..n} |μS1(xi) − μS2(xi)|

Let C = the fuzzy set represented by the centre point:
d(C, nearest) = |0.5 − 1.0| + |0.5 − 0.0| = 1
             = d(C, farthest)
⇒ E(C) = 1
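The entropy computation can be checked mechanically; this sketch enumerates the corners of the hypercube for a small universe:

```python
from itertools import product

def l1(a, b):
    """L1 distance between two fuzzy sets given as membership tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def entropy(s):
    """E(S) = d(S, nearest corner) / d(S, farthest corner)."""
    corners = list(product([0.0, 1.0], repeat=len(s)))
    dists = [l1(s, c) for c in corners]
    return min(dists) / max(dists)

print(entropy((0.5, 0.5)))   # centre of the square: most fuzzy, E = 1.0
print(entropy((0.0, 1.0)))   # crisp corner: E = 0.0
print(entropy((0.3, 0.4)))   # the interior point A from the earlier figure
```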
Definition
Cardinality of a fuzzy set (a generalization of the cardinality of classical sets):

m(S) = Σ_{i=1..n} μS(xi)

Union, intersection, complementation, subsethood:

μ_{S1 ∪ S2}(x) = max[μS1(x), μS2(x)] ∀x ∈ U
μ_{S1 ∩ S2}(x) = min[μS1(x), μS2(x)] ∀x ∈ U
μ_{Sᶜ}(x) = 1 − μS(x)
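These definitions translate directly into code; the two sets below are illustrative:

```python
# max/min/complement operations over a small universe, with fuzzy
# sets stored as dicts from element to membership grade.

A = {"x1": 0.3, "x2": 0.7, "x3": 1.0}
B = {"x1": 0.5, "x2": 0.2, "x3": 0.0}

def f_union(a, b):
    return {x: max(a[x], b[x]) for x in a}

def f_intersection(a, b):
    return {x: min(a[x], b[x]) for x in a}

def f_complement(a):
    return {x: 1.0 - a[x] for x in a}

def cardinality(a):
    """m(A): sum of the membership grades."""
    return sum(a.values())

print(f_union(A, B))          # {'x1': 0.5, 'x2': 0.7, 'x3': 1.0}
print(f_intersection(A, B))   # {'x1': 0.3, 'x2': 0.2, 'x3': 0.0}
print(cardinality(A))
```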
Note on definition by extension and intension:
S1 = {x | x mod 2 = 0} – intension
S2 = {0, 2, 4, 6, 8, 10, ...} – extension

How do we define subsethood?
Suppose, following classical set theory we say
A B

if
 A ( x)   B ( x)x

Consider the n-hyperspace representation of A and B

(0,1) (1,1)

A
x2 . B1 Region where  A ( x )   B ( x )
.B2
.B3
(0,0) (1,0)
x1
This effectively means
B  P ( A) CRISPLY
P(A) = Power set of A
Eg: Suppose
A = {0,1,0,1,0,1,…………….,0,1} – 104 elements
B = {0,0,0,1,0,1,……………….,0,1} – 104 elements
Isn’t B  A with a degree? (only differs in the 2nd element)
Fuzzy definition of subset

Measured in terms of "fit violation", i.e. violation of the condition μA(x) ≤ μB(x).

Degree of subsethood = 1 − degree of superset-hood
  = 1 − (1/m(A)) Σ_x max(0, μA(x) − μB(x))

where m(A) = cardinality of A = Σ_x μA(x)
We can show that E(A) = S(A ∪ Aᶜ, A ∩ Aᶜ).
Exercise 1:
Show the relationship between entropy and subsethood.
Exercise 2:
Prove that S(B, A) = m(A ∩ B) / m(A), the subsethood of B in A.
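The fit-violation formula and the cardinality ratio m(A ∩ B)/m(A) can be checked against each other numerically; the sets here are illustrative:

```python
def subsethood(a, b):
    """Degree of subsethood via fit violation:
    1 - (1/m(A)) * sum over x of max(0, muA(x) - muB(x))."""
    m_a = sum(a.values())
    violation = sum(max(0.0, a[x] - b[x]) for x in a)
    return 1.0 - violation / m_a

A = {"x1": 0.5, "x2": 1.0}
B = {"x1": 0.5, "x2": 0.8}

# The same quantity expressed as m(A ∩ B) / m(A):
m_inter = sum(min(A[x], B[x]) for x in A)
print(subsethood(A, B), m_inter / sum(A.values()))   # both ~0.8667
```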
Fuzzy sets to fuzzy logic
Forms the foundation of fuzzy rule based systems or fuzzy expert systems.
Expert System
Rules are of the form
If C1 ∧ C2 ∧ ... ∧ Cn then Ai
where the Ci are conditions.
Eg: C1 = colour of the eye is yellow
C2 = has fever
C3 = high bilirubin
In fuzzy logic we have fuzzy predicates.
Classical logic: P(x1, x2, x3, ..., xn) ∈ {0, 1}
Fuzzy logic: P(x1, x2, x3, ..., xn) ∈ [0, 1]
Fuzzy OR
P(x) ∨ Q(y) = max(P(x), Q(y))

Fuzzy AND
P(x) ∧ Q(y) = min(P(x), Q(y))

Fuzzy NOT
~P(x) = 1 − P(x)
Fuzzy Implication
• Many theories have been advanced and many expressions exist.
• The most used is the Lukasiewicz formula.
• t(P) = truth value of a proposition/predicate; in fuzzy logic t(P) ∈ [0, 1].
• Lukasiewicz definition of implication:
  t(P → Q) = min[1, 1 − t(P) + t(Q)]
• Also, t(P ∧ Q) = min(t(P), t(Q)).

Eg: If pressure is high then volume is low:

t(high(pressure) → low(volume))

(Figure: profile of 'high' against pressure.)
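The connectives and the Lukasiewicz implication are one-liners; the truth values for 'pressure is high' and 'volume is low' below are illustrative assumptions:

```python
def f_or(tp, tq):
    return max(tp, tq)

def f_and(tp, tq):
    return min(tp, tq)

def f_not(tp):
    return 1.0 - tp

def implies(tp, tq):
    """Lukasiewicz implication: t(P -> Q) = min(1, 1 - t(P) + t(Q))."""
    return min(1.0, 1.0 - tp + tq)

t_high_pressure = 0.8    # assumed truth value of high(pressure)
t_low_volume = 0.6       # assumed truth value of low(volume)
print(implies(t_high_pressure, t_low_volume))   # min(1, 0.8) ≈ 0.8
```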
Fuzzy Inferencing
Fuzzy inferencing: illustrated through the inverted pendulum control problem.
Core
The Lukasiewicz rule:
t(P → Q) = min[1, 1 − t(P) + t(Q)]
An example
Controlling an inverted pendulum

(Figure: pendulum at angle θ from the vertical, driven by a motor with current i.)
θ̇ = dθ/dt = angular velocity

The goal: to keep the pendulum in the vertical position (θ = 0) in dynamic equilibrium. Whenever the pendulum departs from vertical, a torque is produced by sending a current i.

Controlling factors for the appropriate current: angle θ and angular velocity θ̇.
Some intuitive rules
• If θ is +ve small and θ̇ is −ve small, then current is zero.
• If θ is +ve small and θ̇ is +ve small, then current is −ve medium (per the control matrix).
Control Matrix
(rows: angular velocity θ̇; columns: angle θ; entries: current i)

             | θ −ve med | −ve small | Zero      | +ve small | +ve med
θ̇ −ve med   |           |           |           |           |
θ̇ −ve small |           | +ve med   | +ve small | Zero      |
θ̇ Zero      |           | +ve small | Zero      | −ve small |
θ̇ +ve small |           | Zero      | −ve small | −ve med   |
θ̇ +ve med   |           |           |           |           |

The central 3×3 band is the "region of interest"; the remaining cells are not specified on the slide.
Each cell is a rule of the form:
If θ is <> and θ̇ is <> then i is <>.

5 "Centre rules":
1. if θ == Zero and θ̇ == Zero then i = Zero
2. if θ is +ve small and θ̇ == Zero then i is −ve small
3. if θ is −ve small and θ̇ == Zero then i is +ve small
4. if θ == Zero and θ̇ is +ve small then i is −ve small
5. if θ == Zero and θ̇ is −ve small then i is +ve small
Linguistic variables
1. Zero
2. +ve small
3. −ve small

Profiles
(Figure: triangular profiles for '−ve small' and '+ve small', with break-points −ε3, −ε2, ε2, ε3 around ±ε, drawn for each of the quantities θ, θ̇, and i.)
Inference procedure
1. Read the actual numerical values of θ and θ̇.
2. Get the corresponding μ values μZero, μ(+ve small), μ(−ve small). This is called FUZZIFICATION.
3. For the different rules, get the fuzzy i-values from the R.H.S. of the rules.
4. "Collate" by some method and get ONE current value. This is called DEFUZZIFICATION.
5. The result is one numerical value of ‘i’.
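Steps 1–4 can be sketched end to end. The triangular profiles on [-1, 1], the singleton output currents, and the weighted-average defuzzification are all assumptions for illustration; the slides give only the ε break-points and the five centre rules:

```python
# Minimal fuzzify -> infer -> defuzzify loop for the pendulum.

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(v):
    """Step 2: grades for -ve small, Zero, +ve small (assumed profiles)."""
    return {"neg": tri(v, -1.0, -0.5, 0.0),
            "zero": tri(v, -0.5, 0.0, 0.5),
            "pos": tri(v, 0.0, 0.5, 1.0)}

# The five centre rules: (theta label, theta-dot label, output current).
RULES = [
    ("zero", "zero", 0.0),
    ("pos", "zero", -0.5),
    ("neg", "zero", 0.5),
    ("zero", "pos", -0.5),
    ("zero", "neg", 0.5),
]

def control(theta, theta_dot):
    mu_t, mu_td = fuzzify(theta), fuzzify(theta_dot)
    num = den = 0.0
    for lt, ltd, current in RULES:
        w = min(mu_t[lt], mu_td[ltd])   # fuzzy AND of the antecedents
        num += w * current              # step 3: fuzzy i-values
        den += w
    return num / den if den else 0.0    # step 4: defuzzification

print(control(0.25, 0.0))   # tilted +ve -> corrective -ve current: -0.25
```

A positive tilt with zero angular velocity yields a negative current, as rule 2 demands.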
WHAT IS FUZZY LOGIC?

• Fuzzy logic is an approach to computing based on "degrees of truth" rather than the usual "true or false" (1 or 0) Boolean logic on which the modern computer is based.
• The idea of fuzzy logic was first advanced by Lotfi Zadeh of the University of California at Berkeley in the 1960s. Zadeh was working on the problem of computer understanding of natural language.
• Natural language -- like most other activities in life and indeed the universe -- is not easily translated into the absolute terms of 0 and 1. Whether everything is ultimately describable in binary terms is a philosophical question worth pursuing, but in practice, much data we might want to feed a computer is in some state in between, and so, frequently, are the results of computing. It may help to see fuzzy logic as the way reasoning really works, with binary (Boolean) logic simply a special case of it.
FUZZY DATA – CRISP DATA

• The reasoning in fuzzy logic is similar to human reasoning.
• It allows for approximate values and inferences as well as incomplete or ambiguous data, rather than forcing binary (YES/NO) choices.
• Fuzzy logic is able to process incomplete data and provide approximate solutions, using degrees of truth, to problems which other methods are unable to solve.
FUZZY LOGIC – OVERVIEW
• The approach of FL imitates the way of decision making in
humans that involves all intermediate possibilities between
digital values YES and NO.
• The conventional logic block that a computer can understand
takes precise input and produces a definite output as TRUE or
FALSE, which is equivalent to human’s YES or NO.
• The inventor of fuzzy logic, Lotfi Zadeh, observed that unlike
computers, human decision making includes a range of
possibilities between YES and NO, such as CERTAINLY YES,
POSSIBLY YES, CANNOT SAY, POSSIBLY NO, and CERTAINLY NO.

• Fuzzy logic works on these levels of possibilities of the input to achieve the
definite output.
FL IMPLEMENTATION

• It can be implemented in systems of various sizes and
capabilities, ranging from small micro-controllers to
large, networked, workstation-based control systems.
• It can be implemented in hardware, software, or a
combination of both.
FL SYSTEM ARCHITECTURE
• It has four main parts :
1.Fuzzification Module − It transforms the system inputs, which
are crisp numbers, into fuzzy sets. It splits the input signal
into five steps such as LP (Large Positive), MP (Medium Positive),
S (Small), MN (Medium Negative), and LN (Large Negative).
2.Knowledge Base − It stores IF-THEN rules provided by experts.


3.Inference Engine − It simulates the human reasoning process
by making fuzzy inference on the inputs and IF-THEN rules.
4.Defuzzification Module − It transforms the fuzzy set obtained by
the inference engine into a crisp value.
(Image-only slides; only their titles survive: FL System Architecture (2); Membership Function; Uncertainty (2)–(4); Example of Dental Diagnosis; Probability (over Finite Sets); Probability in General; Conditional Probabilities (1)–(2); Properties and Sets.)
REASONING UNDER UNCERTAINTY -
STRATEGIES

• Various types of strategies are used to handle uncertainty:


1.PROBABILISTIC MODELS
2.BAYESIAN NETWORKS
3.MONTE CARLO METHODS
4.DECISION THEORY
5.FUZZY LOGIC
6.QUALITATIVE REASONING
PROBABILISTIC REASONING
• Probabilistic reasoning is a way of knowledge representation where we apply
the concept of probability to indicate the uncertainty in knowledge. In
probabilistic reasoning, we combine probability theory with logic to handle the
uncertainty.
• In the real world, there are lots of scenarios, where the certainty of something is
not confirmed, such as "It will rain today," "behavior of someone for some
situations," "A match between two teams or two players."
• Need for PR:
a) When there are unpredictable outcomes.
b) When the specification or the set of possible predicates becomes too large to handle.
c) When an unknown error occurs during an experiment.
• In probabilistic reasoning, there are two ways to solve problems with uncertain
knowledge: 1. BAYES’ RULE 2. BAYESIAN STATISTICS
CONDITIONAL PROBABILITY
PROBABILISTIC REASONING - BAYES THEOREM

Bayes' theorem: P(A|B) = P(B|A) · P(A) / P(B)

P(A|B) represents the posterior probability, i.e. the probability of hypothesis A when evidence B is true.
P(B|A) is known as the probability of the evidence, or the likelihood.
P(A) is called the prior probability, i.e. the probability of hypothesis A before considering the evidence.
P(B) is known as the marginal probability of the evidence.
(Here A is the hypothesis, and B is the evidence.)
Applying Bayes' rule:
 Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B),
P(B), and P(A). This is very useful in cases where we have good probability
estimates for these three terms and want to determine the fourth one.
 Suppose we want to perceive the effect of some unknown cause
and want to compute that cause; then Bayes' rule becomes:
P(cause | effect) = P(effect | cause) · P(cause) / P(effect)
EXAMPLE 1
A doctor is aware that the disease meningitis causes a patient to have a stiff
neck 80% of the time. He is also aware of some more facts, which are given as follows:
• The known probability that a patient has meningitis is 1/30,000.
• The known probability that a patient has a stiff neck is 2%.
What is the probability that a patient with a stiff neck has meningitis?
SOLUTION:
Let A be the proposition that the patient has a stiff neck and
B be the proposition that the patient has meningitis.
P(A|B) = 0.8, P(B) = 1/30000, P(A) = 0.02
P(B|A) = P(A|B) · P(B) / P(A)
       = (0.8 × 1/30000) / 0.02 = 0.8/600 = 1/750 ≈ 0.00133
Hence, we can expect that about 1 patient in 750 with a stiff neck
has meningitis.
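The calculation follows directly from Bayes' rule:

```python
def bayes(p_a_given_b, p_b, p_a):
    """P(B | A) = P(A | B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a

# Stiff neck (A) given meningitis (B): 0.8; prior 1/30000; P(stiff neck) = 0.02.
p = bayes(0.8, 1 / 30000, 0.02)
print(p, 1 / p)   # ~0.001333, i.e. about 1 patient in 750
```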



EXAMPLE 2

From a standard deck of playing cards, a single card is drawn. The
probability that the card is a king is 4/52. Calculate the posterior
probability P(King|Face), i.e. the probability that a drawn face card
is a king.
SOLUTION:
P(King): probability that the card is a king = 4/52 = 1/13
P(Face): probability that the card is a face card = 12/52 = 3/13
P(Face|King): probability of a face card when we know it is a king = 1
P(King|Face) = P(Face|King) · P(King) / P(Face) = (1 × 1/13) / (3/13) = 1/3
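Plugging the three quantities into Bayes' rule gives the answer, 1/3:

```python
p_king = 4 / 52            # = 1/13
p_face = 12 / 52           # = 3/13
p_face_given_king = 1.0    # every king is a face card

p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)   # ~0.3333 = 1/3
```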
Bayesian Belief Network in AI

• A Bayesian belief network deals with probabilistic events
and is used to solve problems that involve uncertainty. It is also
called a Bayes network, belief network, decision
network, or Bayesian model. Bayesian networks are
probabilistic because these networks are built from
a probability distribution.
• Bayesian Belief N/w defines probabilistic independencies
and dependencies among the variables in the network.
• "A Bayesian network is a probabilistic graphical
model which represents a set of variables and their
conditional dependencies using a directed acyclic
graph(DAG)“.
• Built from a probability distribution.
• Consists of two parts :
a) Directed Acyclic Graph
b) Table of conditional probabilities.
• The generalized form of a Bayesian network that
represents and solves decision problems under
uncertain knowledge is known as an Influence
Diagram.
Bayesian Network graph
• A Bayesian network is made up of nodes and Arcs
(directed links):
• Node corresponds to a random variable, and it can
be continuous or discrete.
• Arc or directed arrows represent the causal relationship
or conditional probabilities between random variables.
These directed links or arrows connect the pair of nodes
in the graph.
• These links represent that one node directly influences
the other node; if there is no directed link, the
nodes are independent of each other.
 In the diagram, A, B, C, and D are
random variables represented by the
nodes of the network graph.
 If we are considering node B, which
is connected with node A by a
directed arrow, then node A is called
the parent of Node B and Node D.
 Node C is independent of node A.
 There is a direct influence of A on B
(Conditional Prob). A table is created
which contains the conditional prob
values of all the nodes with effect of
their parent nodes (BAYESIAN BELIEF
NETWORK).

Note: The Bayesian network graph does not contain any cyclic
graph. Hence, it is known as a directed acyclic graph or DAG.
Bayesian Network graph (2)
• The Bayesian network has mainly two components – a) Causal
component b) Actual numbers.
• Each node in Bayesian network has conditional prob distribution P(Xi |Parent(Xi) ),
which determines the effect of the parent on that node.
• Bayesian network is based on Joint probability distribution and
conditional probability.
• JOINT PROBABILITY DISTRIBUTION
• If we have variables x1, x2, x3, ....., xn, then the probabilities of the different
combinations of x1, x2, x3, ..., xn are known as the joint probability distribution.
• P[x1, x2, x3, ....., xn] can be written as follows in terms of conditional
probabilities:
= P[x1 | x2, x3, ....., xn] . P[x2, x3, ....., xn]
= P[x1 | x2, x3, ....., xn] . P[x2 | x3, ....., xn] .... P[xn-1 | xn] . P[xn]
Creating a Directed Acyclic Graph(DAG)
• Harry installed a new burglar alarm at his home to detect burglary. The
alarm responds reliably to a burglary but also responds to minor
earthquakes. Harry has two neighbors, David and Sophia, who have taken
responsibility for informing Harry at work when they hear the alarm.
• David always calls Harry when he hears the alarm, but sometimes he gets
confused with the phone ringing and calls then too.
• On the other hand, Sophia likes to listen to loud music, so she sometimes
misses hearing the alarm. Here we would like to compute the probability of
the Burglary Alarm.

• Calculate the probability that alarm has sounded, but there


is neither a burglary, nor an earthquake occurred, and
David and Sophia both called the Harry.
• SOLUTION : List of all events occurring in this
network:
• Burglary (B)
• Earthquake(E)
• Alarm(A)
• David Calls(D)
• Sophia calls(S)
• We can write the events of problem statement in the
form of probability: P[D, S, A, B, E], can rewrite the
above probability statement using joint probability
distribution:



SOLUTION :
•The Bayesian network for the above problem is given below. The network structure
shows that burglary and earthquake are the parent nodes of the alarm and directly affect
the probability of the alarm going off, while David's and Sophia's calls depend on the alarm
probability.
•The network represents that our assumptions do not directly perceive the burglary, do
not notice the minor earthquake, and the neighbors do not confer before calling.
•The conditional distributions for each node are given as a conditional probabilities table (CPT).
•Each row in the CPT must sum to 1 because the entries in the table represent an
exhaustive set of cases for the variable.
•In a CPT, a boolean variable with k boolean parents contains 2^k probabilities. Hence, if there
are two parents, the CPT will contain 4 probability values.
•Events occurring in this network:
Burglary (B)
Earthquake(E)
Alarm(A)
David Calls(D)
Sophia calls(S)
• P[D, S, A, B, E]= P[D | S, A, B, E]. P[S, A, B, E]
• =P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]
• = P [D| A]. P [ S| A, B, E]. P[ A, B, E]
• = P[D | A]. P[ S | A]. P[A| B, E]. P[B, E]
• = P[D | A ]. P[S | A]. P[A| B, E]. P[B |E]. P[E]
• Let's take the observed probability for the Burglary and earthquake component:
• P(B= True) = 0.002, which is the probability of burglary.
• P(B= False)= 0.998, which is the probability of no burglary.
• P(E= True)= 0.001, which is the probability of a minor earthquake
• P(E= False)= 0.999, Which is the probability that an earthquake not occurred.
• We can provide the conditional probabilities as per the below tables:
• Conditional probability table for Alarm A:
• The Conditional probability of Alarm A depends on Burglar and earthquake:
• Conditional probability table for David Calls:
• The Conditional probability of David that he will call depends on the probability of
Alarm.

• Conditional probability table for Sophia Calls:


• The Conditional probability of Sophia that she calls is depending on its Parent Node
"Alarm."
From the formula of joint distribution, we can write the problem statement in
the form of probability distribution:
P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using
Joint distribution.
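Multiplying the five factors from the CPTs reproduces the quoted value:

```python
# P(S, D, A, ~B, ~E) = P(S|A) * P(D|A) * P(A|~B,~E) * P(~B) * P(~E),
# using the CPT entries quoted above.
p_s_given_a = 0.75
p_d_given_a = 0.91
p_a_given_not_b_not_e = 0.001
p_not_b = 0.998
p_not_e = 0.999

p = p_s_given_a * p_d_given_a * p_a_given_not_b_not_e * p_not_b * p_not_e
print(round(p, 8))   # 0.00068045
```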
The semantics of Bayesian Network:
There are two ways to understand the semantics of the Bayesian network, which is
given below:
1. To understand the network as the representation of the Joint probability
distribution.
It is helpful to understand how to construct the network.
2. To understand the network as an encoding of a collection of conditional
independence statements.
It is helpful in designing inference procedure.
Bayesian Network applications
• There are many cases where precise answers and numbers are required to make a
decision, especially in the financial world. It is the time when technology comes
in handy to make the right decision.
• ML is one of the technologies that help make the right decision at such times, and
the Bayes Theorem helps make those conditional probability decisions better.
• These events have occurred, and the decision then predicted acts as a cross-
checking answer. It helps immensely in getting a more accurate result. Therefore,
whenever there is a conditional probability problem, the Bayes Theorem in AI/
ML is used. The direct conclusion of this process is that the more data you have,
the more accurate the result can be determined.
• Thus, it makes conditional probability a must to determine or predict more
accurate chances of an event from happening in Machine Learning.
• Real world applications are probabilistic in nature, and to represent the
relationship between multiple events, we need a Bayesian network. It can also be
used in various tasks including prediction, anomaly detection, diagnostics,
automated insight, reasoning, time series prediction, and decision making
under uncertainty.
PRACTICE PROBLEM 1

There are three urns containing 3 white and 2 black
balls; 2 white and 3 black balls; and 4 white and 1 black
ball respectively. There is an equal probability of
each urn being chosen. An urn is chosen and one ball
is drawn from it at random. What is the probability that a
white ball is drawn?
Let E1, E2, and E3 be the events of choosing the first, second,
and third urn respectively. Then,
P(E1) = P(E2) = P(E3) =1/3
Let E be the event that a white ball is drawn. Then,
P(E/E1) = 3/5, P(E/E2) = 2/5, P(E/E3) = 4/5
By theorem of total probability, we have
P(E) = P(E/E1) . P(E1) + P(E/E2) . P(E2) + P(E/E3) . P(E3)
= (3/5 × 1/3) + (2/5 × 1/3) + (4/5 × 1/3)
= 9/15 = 3/5
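The total-probability sum is easy to verify:

```python
priors = [1 / 3, 1 / 3, 1 / 3]   # P(E1), P(E2), P(E3): equal chance of each urn
p_white = [3 / 5, 2 / 5, 4 / 5]  # P(E | Ei): white-ball probability per urn

p = sum(pe * pw for pe, pw in zip(priors, p_white))
print(p)   # ≈ 0.6 = 3/5
```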
PRACTICE PROBLEM 2
A person has undertaken a job. The probabilities of completion of the job
on time with and without rain are 0.44 and 0.95 respectively. If the
probability that it will rain is 0.45, then determine the probability that the
job will be completed on time?
Hill Climbing Algorithm

• We will assume we are trying to maximize a function. That is, we are trying to
find a point in the search space that is better than all the others. And by "better"
we mean that the evaluation is higher. We might also say that the solution is of
better quality than all the others.
• The idea behind hill climbing is as follows:
1. Pick a random point in the search space.
2. Consider all the neighbors of the current state.
3. Choose the neighbor with the best quality and move to that state.
4. Repeat steps 2–3 until all the neighboring states are of lower quality.
5. Return the current state as the solution state.
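The loop above can be sketched on an assumed toy objective (maximise f(x) = −(x − 7)² over the integers, with neighbours x ± 1):

```python
def hill_climb(start, f, neighbours):
    """Greedy ascent: move to the best neighbour until none improves."""
    current = start
    while True:
        best = max(neighbours(current), key=f)
        if f(best) <= f(current):   # all neighbours are of lower quality
            return current          # current state is the solution state
        current = best

f = lambda x: -(x - 7) ** 2
neighbours = lambda x: [x - 1, x + 1]
print(hill_climb(0, f, neighbours))   # climbs to the peak at x = 7
```

On a multimodal objective the same loop would stop at a local maximum, which is the classic weakness of hill climbing.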
How do Expert Systems Deal with Uncertainty?
Humans & Reasoning !!!

•We take pride in the way we reason !!!


•What exactly is reasoning?

• A ‘process’ of thinking/arguing ‘logically’.

• Verifications or Adaptation.
• New deductions.

Reasoning under Uncertainty


Predicate Logic?

• Symbolic representation of facts.


• Deduction of new facts.
• Certainty.



Logic Based Expert Systems

• In diagnosis of diseases, the system decides the
disease, given the symptoms.
What if:
• No information for given set of symptoms.
• Facts are not enough.
• Multiple diseases.
• A new case in medical history.
• In such cases, the reasoning by expert systems using
Predicate Logic fails.



Uncertainty

• Predicate logic can be used only if there is no
uncertainty.
• But uncertainty is omnipresent.
• The sources of uncertainty:
• Data or Expert Knowledge
• Knowledge Representation
• Rules or Inference Process



Uncertainty in Knowledge

• Prior Knowledge.
• Imprecise representation.
• Data derived from defaults/assumptions.
• Inconsistency between knowledge from different
experts.
• “Best Guesses”.



Representation and Reasoning

• Knowledge Representation
• Restricted model of the real system.
• Limited expressiveness of the representation
mechanism.
• Rules or Inference Process
• Conflict Resolution
• Subsumption
• Derivation of the result may take very long.



Solution

• Intelligence in Reasoning
• Adaptability.
• Capability of adding and retracting beliefs as new
information is available.
• This requires non-monotonic reasoning.



Non-monotonic Reasoning

• In a non-monotonic system:
• We make assumptions about unknown facts.
• The addition of new facts can reduce the set of logical
conclusions.
• S is a conclusion of D, but is not necessarily a
conclusion of D + {new fact}.
• Humans use non-monotonic reasoning constantly!



Knowledge Base

• Conflicting consequences of a set of facts:


• Rank all the assumptions and use rank to determine
which to believe.
• Tag given (and some other) facts as protected, these
cannot be removed or changed.
• When a new fact is given:
• Get the explanation (list of contradicting facts).
• Maintain consistency.



Probability in Reasoning

• Probabilities to determine when contradiction


arises.
• Label each fact with a probability of being true.
• Change the probabilities of existing facts to reflect new
facts.
• Use certainty instead of probability to label facts.



Default Reasoning

Construction of sensible guesses when some useful


information is lacking and no contradictory evidence
is present.



How it does so?

• It tries to reason with the given knowledge and


generates the most likely result.
• Use ‘ Whatever is available '.



A Classic Example

Can Tweety fly???

With the initial knowledge:
• Birds typically fly.
• Tweety is a bird.
⇒ Tweety flies.

After adding more knowledge:
• Birds typically fly.
• Penguins are birds.
• Penguins typically do not fly.
• Tweety is a Penguin.
⇒ Tweety does not fly.



Nonmonotonic Logic (NML)

Can Tweety fly???

Bird(x) ∧ M fly(x) → fly(x)
Bird(Tweety)
penguin(x) → bird(x)
penguin(x) → ~fly(x)
penguin(Tweety)

M is known as the MODAL operator.
Read it as: 'If it is consistent to assume'.



IDEA

• If there is no reason to believe otherwise, assume
that fly(x) is TRUE.

• The default is that everything is normal.

• Now we only need to supply additional
information for the exceptions.



Problem with NML

Russian Roulette Example



Would you take the bet ?

• A revolver is loaded with 1 bullet (it has 5 empty


chambers), and the cylinder is spun.
• With these stakes:
• If correct, the system wins $1.
• If wrong, the system loses $1.



Another Scenario..

• Again the revolver is loaded with exactly 1 bullet


and the cylinder is spun.
• With these new stakes:
• If correct, the system wins $1.
• If wrong, the system loses its life.



So, where does the problem lie?

In these two scenarios the uncertainty is the


same, but it is not rational to draw the same
conclusion.



Rational Default Reasoning

• Assign a degree of belief.
• Define an acceptance rule:
  • if P(S | e) > b then accept the bet.
  • Here, b is calculated using the payoff.
• A tentative conclusion is an assertion about the
desirability of a bet, not a direct assertion about a
sentence.



Solution of Russian Roulette

• Decision theory gives the answer: compare the probability of the
sentence to the breakeven probability b determined by the payoff.
• First scenario ($1 against $1): b = 1/(1 + 1) = 0.5, and
P(gun_will_not_fire | 1_bullet_and_spun) = 5/6 > 0.5, so the
system should accept the bet.
• Second scenario ($1 against the system's life): the huge potential
loss drives b to nearly 1, and 5/6 < b, so the system should ignore
a better-than-even probability and refuse to bet.



What have we observed?

• Using Default Reasoning, we are able to reach
a conclusion.
• It works well with a non-monotonic
knowledge base.
• Its basic idea is that common-sense reasoning
applies regularity assumptions as long as these
are not explicitly ruled out.
• So, we need to quantify exceptions explicitly.



Dempster Shafer (D-S) Theory

• Provides a numerical method to represent and


reason about uncertainty.
• “Absence of evidence is not an evidence of
absence”.
• Provides a way to combine evidence from two
or more sources and to draw conclusions from
them.



Basics

• Frame of Discernment – the sample space of DS theory,
denoted by Θ.
• Propositions – subsets of the frame of discernment.
• Probability values are assigned to the propositions.
• Basic Probability Assignment (BPA) – the probability values
assigned to the propositions, denoted by m.
• Focal Elements – propositions with non-zero
probability assignment.
• Core – the union of the focal elements.


Basic Probability Assignment
Properties of a BPA:
1. m(∅) = 0
2. Σ_{A ⊆ Θ} m(A) = 1

Ex.: I am not sure if the coin is fair or biased. Θ = {H, T}:
m({H}) = 0
m({T}) = 0
m({H, T}) = 1
Belief function

• Function to express the extent to which we are
confident about the occurrence of a proposition.
• Bel(A) is the total belief committed to A.

Bel(∅) = 0
Bel(Θ) = 1
Bel(A ∪ B) ≥ Bel(A) + Bel(B) − Bel(A ∩ B)
Bel(A) = Σ_{B ⊆ A} m(B)



Plausibility

• Function to express the extent to which a proposition is
credible or plausible.

Pl(A) = Σ_{B ∩ A ≠ ∅} m(B)
Pl(A) ≥ Bel(A)

• [Bel(A), Pl(A)] represents the credibility status of A.
• Pl(A) − Bel(A) represents the uncertainty in the
occurrence of A.



Dempster’s rule of Combination

• Allows us to combine Basic Probability
Assignment (m) values from two sources of evidence and
draw conclusions.
• Let m1 and m2 be two independent BPAs. Then

(m1 ⊕ m2)(C) = [ Σ_{A ∩ B = C} m1(A) · m2(B) ] / [ 1 − Σ_{A ∩ B = ∅} m1(A) · m2(B) ]



Example

m1({H}) = 0.3, m1({T}) = 0, m1({H,T}) = 0.7
m2({H}) = 0.5, m2({T}) = 0.5, m2({H,T}) = 0

Combination table (cell = intersection, mass = m1 · m2):

            | m1{H} = 0.3 | m1{T} = 0 | m1{H,T} = 0.7
m2{H} = 0.5 | {H}, 0.15   | ∅, 0      | {H}, 0.35
m2{T} = 0.5 | ∅, 0.15     | {T}, 0    | {T}, 0.35
m2{H,T} = 0 | {H}, 0      | {T}, 0    | {H,T}, 0

(m1 ⊕ m2)({H}) = (0.15 + 0.35) / (1 − (0.15 + 0)) ≈ 0.588
(m1 ⊕ m2)({T}) = 0.35 / (1 − (0.15 + 0)) ≈ 0.412
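Dempster's rule on this coin example can be sketched with frozensets as propositions:

```python
H, T = frozenset({"H"}), frozenset({"T"})
HT = H | T

m1 = {H: 0.3, T: 0.0, HT: 0.7}
m2 = {H: 0.5, T: 0.5, HT: 0.0}

def combine(m1, m2):
    """Dempster's rule: intersect focal elements, renormalise by 1 - conflict."""
    raw, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            c = a & b
            if c:
                raw[c] = raw.get(c, 0.0) + pa * pb
            else:
                conflict += pa * pb   # mass that falls on the empty set
    return {c: v / (1.0 - conflict) for c, v in raw.items()}

m = combine(m1, m2)
print(round(m[H], 3), round(m[T], 3))   # 0.588 0.412
```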


Advantages & Disadvantages

Advantages
• Uncertainty and ignorance can be expressed.
• Dempster’s rule can be used to combine evidences.

Disadvantages
• Computational complexity of applying Dempster’s rule is
high.



Conclusions

• Uncertainty is omnipresent.
• We can use symbolic and statistical methods like Default
Reasoning and Dempster – Shafer theory to handle
uncertainty to some extent.
• Default Reasoning guarantees a conclusion for the given
knowledge base and desired fact, although for handling
exceptions they have to be explicitly quantified.
• Dempster–Shafer theory combines evidence from
different sources to draw conclusions.



References

• Russell and Norvig (1993), “Artificial Intelligence – A modern


approach”. Pearson Education, Inc. Second Edition
• Pelletier F.J., and R. Elio (1997). What should default reasoning be,
by default? Computational Intelligence 13:165-188
• Glenn Shafer(1976), “A mathematical theory of evidence”. Princeton :
Princeton University Press.
• Carl M. Kadie, "Rational Non-Monotonic Reasoning." in Proceedings of
the Fourth Workshop on Uncertainty in Artificial Intelligence,
Minneapolis, August 1988
http://research.microsoft.com/~carlk/papers/uncert.ps



Project

• We propose to build a Medical Diagnosis System


and we will try to use non-monotonic and
probabilistic reasoning for the diagnosis.



Thank you

