Module-5 complete notes-Quantifying Uncertainty 20th February 2024

Uploaded by

sandeshssanshi7

Fundamentals of Artificial Intelligence

Chapter 13: Quantifying Uncertainty

Dr AJ KAMESWARA PRASAD
PROFESSOR
Department of Acharya Institute of Technology
Bangalore
Copyright notice: Most examples and images displayed in the slides of this course are taken from [Russell & Norvig, “Artificial Intelligence, a Modern Approach”, 3rd ed., Pearson], including explicitly figures from the above-mentioned book, so their copyright is held by the authors. The copyright on the few other materials included is held by AJK.
These slides cannot be displayed in public without the permission of the author. Following the fair use doctrine, these slides shall be shared with students.

1 / 44
Fundamentals of Artificial Intelligence
Chapter 13: Quantifying Uncertainty

Module-5
Uncertain Knowledge and Reasoning: Quantifying Uncertainty: Acting under Uncertainty, Basic Probability Notation, Inference using Full Joint Distributions, Independence, Bayes’ Rule and its use, Wumpus World Revisited.
Text Book 1: Chapter 13 (13.1, 13.2, 13.3, 13.4, 13.5, 13.6)

2 / 44
Outline

1 Acting Under Uncertainty

2 Basics on Probability

3 Probabilistic Inference via Enumeration

4 Independence and Conditional Independence

5 Applying Bayes’ Rule

6 An Example: The Wumpus World Revisited

4 / 44
Acting Under Uncertainty
Agents often make decisions based on incomplete information:
  partial observability
  nondeterministic actions
Partial solution (see previous chapters): maintain belief states
  represent the set of all possible world states the agent might be in
  generate a contingency plan handling every possible eventuality
Several drawbacks:
  must consider every possible explanation for the observation (even very unlikely ones)
  =⇒ impossibly complex belief states
  contingent plans handling every eventuality grow arbitrarily large
  sometimes there is no plan that is guaranteed to achieve the goal
The agent’s knowledge cannot guarantee a successful outcome ...
... but can provide some degree of belief (likelihood) in it
A rational decision depends on both the relative importance of (sub)goals and the likelihood that they will be achieved
6 / 44
Acting Under Uncertainty: Example
Automated taxi to the airport
Goal: deliver a passenger to the airport on time
Action A_t: leave for the airport t minutes before the flight
How can we be sure that A90 will succeed?
Too many sources of uncertainty:
  partial observability (ex: road state, other drivers’ plans, etc.)
  uncertainty in action outcome (ex: flat tire, etc.)
  noisy sensors (ex: unreliable traffic reports)
  complexity of modelling and predicting traffic
=⇒ With a purely logical approach it is difficult to anticipate everything that can go wrong. Such an approach either
  risks falsehood: “A25 will get me there on time”, or
  leads to conclusions that are too weak for decision making:
  “A25 will get me there on time if there’s no accident on the bridge, and it doesn’t rain, and my tires remain intact, and ...”
7 / 44
Acting Under Uncertainty: Example (2)

A medical diagnosis
Given the symptoms (toothache), infer the cause (cavity)
How to encode this relation in logic?
  diagnostic rules:
    Toothache → Cavity (wrong)
    Toothache → (Cavity ∨ GumProblem ∨ Abscess ∨ ...) (too many possible causes, some very unlikely)
  causal rules:
    Cavity → Toothache (wrong)
    (Cavity ∧ ...) → Toothache (many possible (con)causes)
Problems in specifying the correct logical rules:
  Complexity: too many possible antecedents or consequents
  Theoretical ignorance: no complete theory for the domain
  Practical ignorance: no complete knowledge of the patient
8 / 44
Summarizing Uncertainty

Probability allows us to summarize the uncertainty that arises from
  laziness: failure to enumerate exceptions, qualifications, etc.
  ignorance: lack of relevant facts, initial conditions, etc.
Probability can be derived from
  statistical data (ex: 80% of toothache patients so far had cavities)
  some knowledge (ex: 80% of toothache patients have cavities)
  a combination thereof
Probability statements are made with respect to a state of knowledge (aka evidence), not with respect to the real world
  e.g., “The probability that the patient has a cavity, given that she has a toothache, is 0.8”:
  P(HasCavity(patient) | hasToothAche(patient)) = 0.8
Probabilities of propositions change with new evidence:
  “The probability that the patient has a cavity, given that she has a toothache and a history of gum disease, is 0.4”:
  P(HasCavity(patient) | hasToothAche(patient) ∧ HistoryOfGum(patient)) = 0.4

9 / 44
Making Decisions Under Uncertainty

Ex: Suppose I believe:
  P(A25 gets me there on time | ...) = 0.04
  P(A90 gets me there on time | ...) = 0.70
  P(A120 gets me there on time | ...) = 0.95
  P(A1440 gets me there on time | ...) = 0.9999
Which action to choose?
=⇒ Depends on tradeoffs among preferences:
  missing the flight vs. costs (airport cuisine, sleeping overnight in the airport)
When there are conflicting goals, the agent may express preferences among them by means of a utility function.
Utilities are combined with probabilities in the general theory of rational decisions, aka decision theory:
  Decision theory = Probability theory + Utility theory
Maximum Expected Utility (MEU): an agent is rational if and only if it chooses the action that yields the maximum expected utility, averaged over all the possible outcomes of the action.
10 / 44
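The MEU principle can be illustrated with a minimal sketch. The on-time probabilities are the ones listed on the slide; the utility values (`U_CATCH`, `U_MISS`, `COST_PER_MIN`) are hypothetical numbers chosen only for illustration and do not come from the slides or the textbook.

```python
# Probabilities of arriving on time for each action A_t, as given on the slide.
p_on_time = {25: 0.04, 90: 0.70, 120: 0.95, 1440: 0.9999}

# Hypothetical utilities (assumptions for illustration, not from the slides).
U_CATCH = 100.0      # utility of catching the flight
U_MISS = -500.0      # utility of missing the flight
COST_PER_MIN = 0.5   # utility lost per minute of lead time (waiting at the airport)


def expected_utility(t: int) -> float:
    """Expected utility of action A_t, averaged over its two outcomes."""
    p = p_on_time[t]
    return p * U_CATCH + (1 - p) * U_MISS - COST_PER_MIN * t


# The MEU action is the one with the highest expected utility.
best = max(p_on_time, key=expected_utility)
for t in p_on_time:
    print(f"EU(A{t}) = {expected_utility(t):.2f}")
print(f"MEU action: A{best}")
```

With these assumed utilities, A1440 is almost certain to succeed but is penalized by the long wait, so a shorter lead time wins. Different utilities would give a different choice, which is exactly the tradeoff the slide describes.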
Probability Basics: an AI-ish Introduction
Probabilistic assertions: state how likely possible worlds are
Sample space Ω: the set of all possible worlds
  ω ∈ Ω is a possible world (aka sample point or atomic event)
    ex: the dice roll (1,4)
  the possible worlds are mutually exclusive and exhaustive
    ex: the 36 possible outcomes of rolling two dice: (1,1), (1,2), ...
A probability model (aka probability space) is a sample space with an assignment P(ω) for every ω ∈ Ω s.t.
  0 ≤ P(ω) ≤ 1, for every ω ∈ Ω
  $\sum_{\omega \in \Omega} P(\omega) = 1$
  ex: 1-die roll: P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
An event A is any subset of Ω, s.t. $P(A) = \sum_{\omega \in A} P(\omega)$
  events can be described by propositions in some formal language
12 / 44
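The definitions above can be checked on the two-dice sample space. The sketch below is illustrative only: the helper name `prob` is an assumption, and exact fractions are used so that the results can be compared with the values quoted later in the slides (P(Total = 11) = 1/18, P(double) = 1/6).

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes of rolling two dice.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}


def prob(event) -> Fraction:
    """P(A) = sum of P(w) over the possible worlds w in which the event holds."""
    return sum((P[w] for w in omega if event(w)), Fraction(0))


p_total_11 = prob(lambda w: w[0] + w[1] == 11)
p_double = prob(lambda w: w[0] == w[1])

print("sum over all worlds:", sum(P.values()))
print("P(Total = 11) =", p_total_11)
print("P(double) =", p_double)
```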
Random Variables

Factored representation of possible worlds: sets of ⟨variable, value⟩ pairs

Variables in probability theory: random variables
  domain: the set of possible values a variable can take on
    ex: Die: {1, 2, 3, 4, 5, 6}, Weather: {sunny, rain, cloudy, snow}, Odd: {true, false}
  a r.v. can be seen as a function from sample points to the domain
    ex: Die(ω), Weather(ω), ... (“(ω)” typically omitted)
A probability distribution gives the probabilities of all the possible values of a random variable X:
  $P(X = x_i) \overset{\text{def}}{=} \sum_{\omega :\, X(\omega) = x_i} P(\omega)$
  ex: P(Odd = true) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2

13 / 44
Propositions and Probabilities

We think of a proposition a as the event A (set of sample points) where the proposition is true
  Odd is a propositional random variable with range {true, false}
  notation: a ⇐⇒ “A = true”
Given Boolean random variables A and B:
  a: set of sample points where A(ω) = true
  ¬a: set of sample points where A(ω) = false
  a ∧ b: set of sample points where A(ω) = true, B(ω) = true
=⇒ with Boolean random variables, sample points are propositional-logic (PL) models
Proposition: disjunction of the sample points in which it is true
  ex: (a ∨ b) ≡ (¬a ∧ b) ∨ (a ∧ ¬b) ∨ (a ∧ b)
  =⇒ P(a ∨ b) = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b)
Some derived facts:
  P(¬a) = 1 − P(a)
14 / 44
Probability Distributions

A probability distribution gives the probabilities of all the possible values of a random variable
ex: Weather: {sunny, rain, cloudy, snow}
  P(Weather = sunny) = 0.6
  P(Weather = rain) = 0.1
  P(Weather = cloudy) = 0.29
  P(Weather = snow) = 0.01
  ⇐⇒ P(Weather) = ⟨0.6, 0.1, 0.29, 0.01⟩
  normalized: their sum is 1
A joint probability distribution for multiple variables gives the probability of every sample point
ex: P(Weather, Cavity) =
  Weather =         sunny   rain   cloudy   snow
  Cavity = true     0.144   0.02   0.016    0.02
  Cavity = false    0.576   0.08   0.064    0.08
Every event is a sum of sample points,
=⇒ its probability is determined by the joint distribution

15 / 44
Probability for Continuous Variables
Express continuous probability distributions with density functions $f(x) \ge 0$ s.t.
$$\int_{-\infty}^{+\infty} f(x)\,dx = 1$$
$$P(x \in [a, b]) = \int_a^b f(x)\,dx$$
=⇒ P(x ∈ [val, val]) = 0, P(x ∈ [−∞, +∞]) = 1
ex: $P(x \in [20, 22]) = \int_{20}^{22} 0.125\,dx = 0.25$
Density: $P(x) = P(X = x) \overset{\text{def}}{=} \lim_{dx \to 0} P(X \in [x, x + dx])/dx$
ex: $P(20.1) = \lim_{dx \to 0} P(X \in [20.1, 20.1 + dx])/dx = 0.125$
note: $P(v) \ne P(x \in [v, v]) = 0$

(© S. Russell & P. Norvig, AIMA)

16 / 44
Conditional Probabilities
Unconditional or prior probabilities refer to degrees of belief in propositions in the absence of any other information (evidence)
  ex: P(cavity) = 0.2, P(Total = 11) = 1/18, P(double) = 1/6
Conditional or posterior probabilities refer to degrees of belief in a proposition a given some evidence b: P(a|b)
  evidence: information already revealed
  ex: P(cavity|toothache) = 0.6: probability of a cavity given a toothache (assuming no other information is provided!)
  ex: P(Total = 11|die1 = 5) = 1/6: probability of total 11 given that the first die is 5
  =⇒ restricts the set of possible worlds to those where the first die is 5
Note: P(a|... ∧ a) = 1, P(a|... ∧ ¬a) = 0
  ex: P(cavity|toothache ∧ cavity) = 1, P(cavity|toothache ∧ ¬cavity) = 0
A less specific belief is still valid after more evidence arrives
  ex: P(cavity) = 0.2 holds even if P(cavity|toothache) = 0.6
New evidence may be irrelevant, allowing for simplification
  ex: P(cavity|toothache, 49ersWin) = P(cavity|toothache) = 0.6
17 / 44
Conditional Probabilities [cont.]
Conditional probability: $P(a \mid b) \overset{\text{def}}{=} \frac{P(a \wedge b)}{P(b)}$, s.t. P(b) > 0
  ex: $P(\mathit{Total} = 11 \mid \mathit{die}_1 = 5) = \frac{P(\mathit{Total} = 11 \,\wedge\, \mathit{die}_1 = 5)}{P(\mathit{die}_1 = 5)} = \frac{1/6 \cdot 1/6}{1/6} = 1/6$
  observing b restricts the possible worlds to those where b is true
Product rule: P(a ∧ b) = P(a|b) · P(b) = P(b|a) · P(a)
Product rule for whole distributions: P(X, Y) = P(X|Y) · P(Y)
  ex: P(Weather, Cavity) = P(Weather|Cavity) P(Cavity), that is:
    P(sunny, cavity) = P(sunny|cavity) P(cavity)
    ...
    P(snow, ¬cavity) = P(snow|¬cavity) P(¬cavity)
  a 4 × 2 set of equations, not matrix multiplication!
The chain rule is derived by successive application of the product rule:
$$\begin{aligned}
P(X_1, \ldots, X_n) &= P(X_1, \ldots, X_{n-1})\, P(X_n \mid X_1, \ldots, X_{n-1}) \\
&= P(X_1, \ldots, X_{n-2})\, P(X_{n-1} \mid X_1, \ldots, X_{n-2})\, P(X_n \mid X_1, \ldots, X_{n-1}) \\
&= \ldots \\
&= \prod_{i=1}^{n} P(X_i \mid X_1, \ldots, X_{i-1})
\end{aligned}$$
18 / 44
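As a small illustration of the definition of conditional probability and of the product rule, the sketch below recomputes the two-dice example by restricting the possible worlds to those where the evidence holds. The helper names `prob` and `cond_prob` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely possible worlds of a two-dice roll.
omega = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def cond_prob(a, b) -> Fraction:
    """P(a|b) = P(a and b) / P(b), defined only when P(b) > 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(b) must be positive")
    return prob(lambda w: a(w) and b(w)) / p_b


def total_11(w):
    return w[0] + w[1] == 11


def die1_is_5(w):
    return w[0] == 5


p_given = cond_prob(total_11, die1_is_5)
p_joint = prob(lambda w: total_11(w) and die1_is_5(w))

print("P(Total = 11 | die1 = 5) =", p_given)
# Product rule, in both directions.
print(p_joint == cond_prob(total_11, die1_is_5) * prob(die1_is_5))
print(p_joint == cond_prob(die1_is_5, total_11) * prob(total_11))
```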
Logic vs. Probability

19 / 44
Probabilistic Inference via Enumeration

Basic Ideas
Start with the joint distribution P(Toothache, Catch, Cavity )
For any proposition ϕ, sum the atomic events where ϕ is true: $P(\phi) = \sum_{\omega :\, \omega \models \phi} P(\omega)$

21 / 44
Probabilistic Inference via Enumeration: Example

Example: Generic Inference

Ex: P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28

(© S. Russell & P. Norvig, AIMA)

20 / 44
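The enumeration idea can be sketched in a few lines. The joint table itself was a figure in the slides and is missing from the text, so the eight entries below are taken from the AIMA dentist example that the slides credit; treat them, and the helper name `prob`, as assumptions of this sketch.

```python
# Full joint distribution P(Toothache, Catch, Cavity).
# Keys are (toothache, catch, cavity); values follow the AIMA dentist table
# (assumed here, since the table image is not part of the extracted text).
JOINT = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (False, True, False): 0.144, (False, False, False): 0.576,
}


def prob(phi) -> float:
    """P(phi): sum the atomic events (table rows) in which phi is true."""
    return sum(p for world, p in JOINT.items() if phi(*world))


p_cavity_or_toothache = prob(lambda toothache, catch, cavity: cavity or toothache)
print(f"P(cavity or toothache) = {p_cavity_or_toothache:.2f}")
```

Any proposition over the three variables can be passed as `phi`, which is what makes the full joint distribution a complete (if expensive) knowledge base.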
Marginalization

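Start with the joint distribution P(Toothache, Catch, Cavity)

Marginalization (aka summing out):
  sum up the probabilities for each possible value of the other variables:
  $P(\mathbf{Y}) = \sum_{\mathbf{z} \in \mathbf{Z}} P(\mathbf{Y}, \mathbf{z})$
  Ex: $P(\mathit{Toothache}) = \sum_{\mathbf{z} \in \{\mathit{Catch}, \mathit{Cavity}\}} P(\mathit{Toothache}, \mathbf{z})$
Conditioning: variant of marginalization, involving conditional probabilities instead of joint probabilities (using the product rule)
  $P(\mathbf{Y}) = \sum_{\mathbf{z} \in \mathbf{Z}} P(\mathbf{Y} \mid \mathbf{z})\, P(\mathbf{z})$
  Ex: $P(\mathit{Toothache}) = \sum_{\mathbf{z} \in \{\mathit{Catch}, \mathit{Cavity}\}} P(\mathit{Toothache} \mid \mathbf{z})\, P(\mathbf{z})$

21 / 44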
Marginalization: Example
Summing out Catch and Cavity from the joint distribution P(Toothache, Catch, Cavity):
  P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
  P(¬toothache) = 1 − P(toothache) = 1 − 0.2 = 0.8
  =⇒ P(Toothache) = ⟨0.2, 0.8⟩

(© S. Russell & P. Norvig, AIMA)

22 / 44
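Marginalization can be written as a generic "sum out everything else" routine. This is a minimal sketch: the table entries are assumed from the AIMA dentist example credited in the slides, and the names `VARS` and `marginal` are illustrative.

```python
from collections import defaultdict

VARS = ("Toothache", "Catch", "Cavity")

# Joint table keyed by (toothache, catch, cavity); values assumed from AIMA.
JOINT = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (False, True, False): 0.144, (False, False, False): 0.576,
}


def marginal(keep):
    """P(keep): sum the joint over every variable not listed in `keep`."""
    idx = [VARS.index(v) for v in keep]
    out = defaultdict(float)
    for world, p in JOINT.items():
        out[tuple(world[i] for i in idx)] += p
    return dict(out)


p_toothache = marginal(["Toothache"])
print({k: round(v, 3) for k, v in p_toothache.items()})
```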
Conditional Probability via Enumeration: Example

Start with the joint distribution P(Toothache, Catch, Cavity)
Conditional probability:
  Ex: $P(\neg \mathit{cavity} \mid \mathit{toothache}) = \frac{P(\neg \mathit{cavity} \wedge \mathit{toothache})}{P(\mathit{toothache})} = \frac{0.016 + 0.064}{0.108 + 0.012 + 0.016 + 0.064} = 0.4$
  Ex: $P(\mathit{cavity} \mid \mathit{toothache}) = \frac{P(\mathit{cavity} \wedge \mathit{toothache})}{P(\mathit{toothache})} = \ldots = 0.6$

(© S. Russell & P. Norvig, AIMA)

23 / 44
Normalization

Let X be all the variables. Typically, we want P(Y|E = e):
  the conditional joint distribution of the query variables Y given specific values e for the evidence variables E
  let the hidden variables be $\mathbf{H} \overset{\text{def}}{=} \mathbf{X} \setminus (\mathbf{Y} \cup \mathbf{E})$
The summation of joint entries is done by summing out the hidden variables:
  $P(\mathbf{Y} \mid \mathbf{E} = \mathbf{e}) = \alpha P(\mathbf{Y}, \mathbf{E} = \mathbf{e}) = \alpha \sum_{\mathbf{h}} P(\mathbf{Y}, \mathbf{E} = \mathbf{e}, \mathbf{H} = \mathbf{h})$
  where $\alpha \overset{\text{def}}{=} 1/P(\mathbf{E} = \mathbf{e})$ (different α’s for different values of e)
  =⇒ it is easy to compute α by normalization
  note: the terms in the summation are joint entries, because Y, E, H together exhaust the set of random variables X
Idea: compute the whole distribution on the query variable by:
  fixing evidence variables and summing over hidden variables
  normalizing the final distribution, so that Σ ... = 1
Complexity: $O(2^n)$, with n the number of propositions =⇒ impractical for large n

24 / 44
Normalization: Example
$\alpha \overset{\text{def}}{=} 1/P(\mathit{toothache})$ can be viewed as a normalization constant
Idea: compute the whole distribution on the query variable by:
  fixing evidence variables and summing over hidden variables
  normalizing the final distribution, so that Σ ... = 1
Ex:
  P(Cavity|toothache) = αP(Cavity, toothache)
    = α[P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
    = α[⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩]
    = α⟨0.12, 0.08⟩ = (normalization) = ⟨0.6, 0.4⟩ [α = 5]
  P(Cavity|¬toothache) = ... = α⟨0.08, 0.72⟩ = ⟨0.1, 0.9⟩ [α = 1.25]

(© S. Russell & P. Norvig, AIMA)

25 / 44
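The "fix the evidence, sum out the hidden variables, then normalize" recipe can be sketched as follows. The joint entries are assumed from the AIMA dentist table credited in the slides, and the function name `query` is hypothetical.

```python
VARS = ("Toothache", "Catch", "Cavity")

# Joint table keyed by (toothache, catch, cavity); values assumed from AIMA.
JOINT = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (False, True, False): 0.144, (False, False, False): 0.576,
}


def query(var, evidence):
    """Return (P(var | evidence), alpha) by enumeration and normalization."""
    unnorm = {True: 0.0, False: 0.0}
    for world, p in JOINT.items():
        assignment = dict(zip(VARS, world))
        # Keep only the rows consistent with the evidence; hidden variables
        # are summed out implicitly by accumulating over all remaining rows.
        if all(assignment[e] == v for e, v in evidence.items()):
            unnorm[assignment[var]] += p
    alpha = 1.0 / sum(unnorm.values())  # alpha = 1 / P(evidence)
    return {val: alpha * p for val, p in unnorm.items()}, alpha


dist, alpha = query("Cavity", {"Toothache": True})
print({k: round(v, 3) for k, v in dist.items()}, "alpha =", round(alpha, 3))

dist_neg, alpha_neg = query("Cavity", {"Toothache": False})
print({k: round(v, 3) for k, v in dist_neg.items()}, "alpha =", round(alpha_neg, 3))
```

Note that α is never looked up separately: it falls out of the unnormalized entries, which is why the slides say it is "easy to compute by normalization".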
Independence
Variables X and Y are independent iff P(X, Y ) = P(X )P(Y )
(or equivalently, iff P(X|Y ) = P(X ) or P(Y|X ) = P(Y ))
ex: P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
=⇒ e.g. P(toothache, catch, cavity, cloudy) = P(toothache, catch, cavity) P(cloudy)
typically based on domain knowledge
May drastically reduce the number of entries and computation
=⇒ ex: 32-element table decomposed into one 8-element and one 4-element table
Unfortunately, absolute independence is quite rare

(© S. Russell & P. Norvig, AIMA)

27 / 44
Conditional Independence

28 / 44
Conditional Independence [cont.]

In many cases, the use of conditional independence reduces the size of the representation of the joint distribution dramatically, even from exponential to linear!
Ex:
  P(Toothache, Catch, Cavity)
    = P(Toothache|Catch, Cavity) P(Catch, Cavity)
    = P(Toothache|Catch, Cavity) P(Catch|Cavity) P(Cavity)
    = P(Toothache|Cavity) P(Catch|Cavity) P(Cavity)
  =⇒ Passes from 7 to 2+2+1=5 independent numbers
    P(Toothache, Catch, Cavity) contains 7 independent entries (the 8th can be obtained as 1 − Σ ...)
    P(Toothache|Cavity) and P(Catch|Cavity) each contain 2 independent entries (2 × 2 matrix, each row sums to 1)
    P(Cavity) contains 1 independent entry
General case: if one cause has n effects that are conditionally independent given the cause:
  $P(\mathit{Cause}, \mathit{Effect}_1, \ldots, \mathit{Effect}_n) = P(\mathit{Cause}) \prod_i P(\mathit{Effect}_i \mid \mathit{Cause})$
  =⇒ reduces from $2^{n+1} - 1$ to $2n + 1$ independent entries

29 / 44
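To make the "exponential to linear" claim concrete, the following sketch counts independent entries for one Boolean cause with n Boolean effects. The function names are illustrative, and the Boolean-variable assumption is the one implicit in the slide's formulas.

```python
def full_joint_entries(n_effects: int) -> int:
    """Independent entries of the full joint over 1 cause + n Boolean effects."""
    return 2 ** (n_effects + 1) - 1


def factored_entries(n_effects: int) -> int:
    """Independent entries of P(Cause) * prod_i P(Effect_i | Cause):
    1 for P(Cause) plus 2 per effect (one per value of the cause)."""
    return 2 * n_effects + 1


for n in (2, 5, 10, 20):
    print(f"n = {n:2d}: full joint {full_joint_entries(n):>8d}, "
          f"factored {factored_entries(n):>3d}")
```

For n = 2 this reproduces the 7 versus 5 count of the dentist example.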
Exercise

Consider the joint probability distribution described in the table in the previous section (slide 20 onwards): P(Toothache, Catch, Cavity)
Consider the example in the previous slide:
  P(Toothache, Catch, Cavity)
    = P(Toothache|Catch, Cavity) P(Catch, Cavity)
    = P(Toothache|Catch, Cavity) P(Catch|Cavity) P(Cavity)
    = P(Toothache|Cavity) P(Catch|Cavity) P(Cavity)
Compute separately the distributions P(Toothache|Catch, Cavity), P(Catch|Cavity), P(Cavity), P(Toothache|Cavity).
Recompute P(Toothache, Catch, Cavity) in two ways:
  P(Toothache|Catch, Cavity) P(Catch|Cavity) P(Cavity)
  P(Toothache|Cavity) P(Catch|Cavity) P(Cavity)
and compare the result with P(Toothache, Catch, Cavity)

30 / 44
Bayes’ Rule

Bayes’ Rule/Theorem/Law
Bayes’ rule: $P(a \mid b) = \frac{P(a \wedge b)}{P(b)} = \frac{P(b \mid a)\, P(a)}{P(b)}$
In distribution form: $P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)} = \alpha P(X \mid Y)\, P(Y)$
  $\alpha \overset{\text{def}}{=} 1/P(X)$: normalization constant to make the P(Y|X) entries sum to 1 (different α’s for different values of X)
A version conditionalized on some background evidence e:
  $P(Y \mid X, \mathbf{e}) = \frac{P(X \mid Y, \mathbf{e})\, P(Y \mid \mathbf{e})}{P(X \mid \mathbf{e})}$

32 / 44
Using Bayes’ Rule: The Simple Case
Used to assess diagnostic probability from causal probability:
  $P(\mathit{cause} \mid \mathit{effect}) = \frac{P(\mathit{effect} \mid \mathit{cause})\, P(\mathit{cause})}{P(\mathit{effect})}$
  P(cause|effect) goes from effect to cause (diagnostic direction)
  P(effect|cause) goes from cause to effect (causal direction)

Example
An expert doctor is likely to have causal knowledge P(symptoms|disease) (i.e., P(effect|cause)) ...
... and needs to produce diagnostic knowledge P(disease|symptoms) (i.e., P(cause|effect))
Ex: let m be meningitis, s be stiff neck
  P(m) = 1/50000, P(s) = 0.01 (prior knowledge, from statistics)
  “meningitis causes the patient a stiff neck in 70% of cases”: P(s|m) = 0.7 (doctor’s experience)
  =⇒ $P(m \mid s) = \frac{P(s \mid m)\, P(m)}{P(s)} = \frac{0.7 \cdot 1/50000}{0.01} = 0.0014$
33 / 44
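The meningitis calculation is a direct application of the rule and can be reproduced in a few lines. The function name `bayes` is an illustrative choice; the three input numbers are the ones stated on the slide.

```python
def bayes(p_effect_given_cause: float, p_cause: float, p_effect: float) -> float:
    """P(cause|effect) = P(effect|cause) * P(cause) / P(effect)."""
    if p_effect <= 0:
        raise ValueError("P(effect) must be positive")
    return p_effect_given_cause * p_cause / p_effect


# Values from the slide: P(s|m) = 0.7, P(m) = 1/50000, P(s) = 0.01.
p_m_given_s = bayes(0.7, 1 / 50000, 0.01)
print(f"P(m|s) = {p_m_given_s:.4f}")
```

Even though a stiff neck is a strong indicator under meningitis (70%), the posterior stays small because the prior P(m) is tiny compared with P(s).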
Using Bayes’ Rule: Combining Evidence
A naive Bayes model is a probability model that assumes the effects are conditionally independent, given the cause
  =⇒ $P(\mathit{Cause}, \mathit{Effect}_1, \ldots, \mathit{Effect}_n) = P(\mathit{Cause}) \prod_i P(\mathit{Effect}_i \mid \mathit{Cause})$
  total number of parameters is linear in n
  ex: P(Cavity, Toothache, Catch) = P(Cavity) P(Toothache|Cavity) P(Catch|Cavity)
Q: How can we compute P(Cause|Effect1, ..., Effectk)?
  ex: P(Cavity|toothache ∧ catch)?

(© S. Russell & P. Norvig, AIMA)

34 / 44
Using Bayes’ Rule: Combining Evidence [cont.]

Q: How can we compute P(Cause|Effect1, ..., Effectk)?
  ex: P(Cavity|toothache ∧ catch)?
A: Apply Bayes’ rule:
  P(Cavity|toothache ∧ catch)
    = P(toothache ∧ catch|Cavity) P(Cavity) / P(toothache ∧ catch)
    = α P(toothache ∧ catch|Cavity) P(Cavity)
    = α P(toothache|Cavity) P(catch|Cavity) P(Cavity)
  $\alpha \overset{\text{def}}{=} 1/P(\mathit{toothache} \wedge \mathit{catch})$ not computed explicitly
General case: $P(\mathit{Cause} \mid \mathit{Effect}_1, \ldots, \mathit{Effect}_n) = \alpha P(\mathit{Cause}) \prod_i P(\mathit{Effect}_i \mid \mathit{Cause})$
  $\alpha \overset{\text{def}}{=} 1/P(\mathit{Effect}_1, \ldots, \mathit{Effect}_n)$ not computed explicitly (one α value for every value of Effect1, ..., Effectn)
  =⇒ reduces from $2^{n+1} - 1$ to $2n + 1$ independent entries

35 / 44
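A minimal sketch of the naive Bayes computation for P(Cavity | toothache ∧ catch) follows. The slides do not list the conditional tables, so they are derived here from the joint distribution, whose entries are assumed from the AIMA dentist example credited in the slides. The helper name `p` is illustrative.

```python
# Joint table keyed by (toothache, catch, cavity); values assumed from AIMA.
JOINT = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (False, True, False): 0.144, (False, False, False): 0.576,
}


def p(phi) -> float:
    """Probability of a proposition, by summing matching joint entries."""
    return sum(pr for world, pr in JOINT.items() if phi(*world))


unnorm = {}
for cavity in (True, False):
    prior = p(lambda t, c, cav: cav == cavity)                     # P(Cavity)
    p_tooth = p(lambda t, c, cav: t and cav == cavity) / prior      # P(toothache|Cavity)
    p_catch = p(lambda t, c, cav: c and cav == cavity) / prior      # P(catch|Cavity)
    # Naive Bayes: multiply the prior by one factor per observed effect.
    unnorm[cavity] = prior * p_tooth * p_catch

alpha = 1.0 / sum(unnorm.values())  # never needs P(toothache and catch) directly
posterior = {cav: alpha * v for cav, v in unnorm.items()}
print({k: round(v, 3) for k, v in posterior.items()})
```

For this table the factored result agrees with direct enumeration, 0.108 / (0.108 + 0.016), because Toothache and Catch are exactly conditionally independent given Cavity in the example.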
An Example: The Wumpus World
A probability model of the Wumpus World
Consider again the Wumpus World (restricted to pit detection)
Evidence: no pit in (1,1), (1,2), (2,1); breezy in (1,2), (2,1)
Q. Given the evidence, what is the probability of having a pit in (1,3), (2,2) or (3,1)?
Two groups of variables:
  $P_{i,j}$ = true iff [i, j] contains a pit (“causes”)
  $B_{i,j}$ = true iff [i, j] is breezy (“effects”; consider only $B_{1,1}, B_{1,2}, B_{2,1}$)
Joint distribution: $P(P_{1,1}, \ldots, P_{4,4}, B_{1,1}, B_{1,2}, B_{2,1})$
Known facts (evidence):
  $b^* \overset{\text{def}}{=} \neg b_{1,1} \wedge b_{1,2} \wedge b_{2,1}$
  $p^* \overset{\text{def}}{=} \neg p_{1,1} \wedge \neg p_{1,2} \wedge \neg p_{2,1}$
Queries: $P(P_{1,3} \mid b^*, p^*)$? $P(P_{2,2} \mid b^*, p^*)$? ($P(P_{3,1} \mid b^*, p^*)$ is symmetric)

(© S. Russell & P. Norvig, AIMA)

37 / 44
An Example: The Wumpus World [cont.]

Specifying the probability model

Apply the product rule to the joint distribution:
  $P(P_{1,1}, \ldots, P_{4,4}, B_{1,1}, B_{1,2}, B_{2,1}) = P(B_{1,1}, B_{1,2}, B_{2,1} \mid P_{1,1}, \ldots, P_{4,4})\, P(P_{1,1}, \ldots, P_{4,4})$
$P(B_{1,1}, B_{1,2}, B_{2,1} \mid P_{1,1}, \ldots, P_{4,4})$: 1 if the breeze values are consistent with the pit configuration (a square is breezy iff it is adjacent to a pit), 0 otherwise
$P(P_{1,1}, \ldots, P_{4,4})$: pits are placed randomly except in (1,1):
  $P(P_{1,1}, \ldots, P_{4,4}) = \prod_{i=1}^{4} \prod_{j=1}^{4} P(P_{i,j})$
  $P(P_{i,j} = \mathit{true}) = 0.2$ if $(i, j) \ne (1, 1)$, 0 otherwise
  ex: $P(P_{1,1}, \ldots, P_{4,4}) = 0.2^3 \cdot 0.8^{15-3} \approx 0.00055$ if there are 3 pits

38 / 44
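The prior over pit configurations depends only on how many of the 15 free squares contain a pit. A short sketch, with the function name `pit_prior` chosen for illustration:

```python
from math import comb


def pit_prior(n_pits: int, p_pit: float = 0.2, n_squares: int = 15) -> float:
    """Prior probability of one specific configuration with n_pits pits
    among the n_squares squares other than (1,1), pits being independent."""
    return p_pit ** n_pits * (1 - p_pit) ** (n_squares - n_pits)


print(f"P(one specific 3-pit configuration) = {pit_prior(3):.5f}")

# Sanity check: the probabilities of all 2^15 configurations sum to 1.
total = sum(comb(15, k) * pit_prior(k) for k in range(16))
print(f"sum over all configurations = {total:.6f}")
```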
An Example: The Wumpus World [cont.]
Inference by enumeration
Case P1,3:
General form of query: $P(\mathbf{Y} \mid \mathbf{E} = \mathbf{e}) = \alpha P(\mathbf{Y}, \mathbf{E} = \mathbf{e}) = \alpha \sum_{\mathbf{h}} P(\mathbf{Y}, \mathbf{E} = \mathbf{e}, \mathbf{H} = \mathbf{h})$
  Y: query vars; E, e: evidence vars/values; H, h: hidden vars/values
Our case: $P(P_{1,3} \mid p^*, b^*)$, s.t. the evidence is
  $b^* \overset{\text{def}}{=} \neg b_{1,1} \wedge b_{1,2} \wedge b_{2,1}$
  $p^* \overset{\text{def}}{=} \neg p_{1,1} \wedge \neg p_{1,2} \wedge \neg p_{2,1}$
Sum over hidden variables:
  $P(P_{1,3} \mid p^*, b^*) = \alpha \sum_{\mathit{unknown}} P(P_{1,3}, p^*, b^*, \mathit{unknown})$
  the unknown variables are all $P_{i,j}$’s s.t. $(i, j) \notin \{(1,1), (1,2), (2,1), (1,3)\}$ (“∉” is read as “not an element of”)
  =⇒ $2^{16-4} = 4096$ terms in the sum!
Grows exponentially in the number of hidden variables H!
=⇒ Inefficient

(© S. Russell & P. Norvig, AIMA)

39 / 44
An Example: The Wumpus World [cont.]
Inference by enumeration
Case P1,3:
General form of query: $P(\mathbf{Y} \mid \mathbf{E} = \mathbf{e}) = \alpha P(\mathbf{Y}, \mathbf{E} = \mathbf{e}) = \alpha \sum_{\mathbf{h}} P(\mathbf{Y}, \mathbf{E} = \mathbf{e}, \mathbf{H} = \mathbf{h})$
Here's how to read the equation:
• P(Y|E = e): the posterior distribution of the query variables Y, given that the evidence variables E have been observed to take the values e. It is read as "the probability of Y given E = e".
• =: this symbol represents "equals".
• α: the normalization constant, α = 1/P(E = e). It does not depend on the value of Y, so it need not be computed up front: it is obtained at the end by making the entries of the resulting distribution sum to 1.
• P(Y, E = e): the joint probability of Y together with the evidence E = e. Unlike the left-hand side, it is not conditioned on the evidence.
• $\sum_{\mathbf{h}} P(\mathbf{Y}, \mathbf{E} = \mathbf{e}, \mathbf{H} = \mathbf{h})$: the same joint probability, obtained by summing the full joint distribution over all values h of the hidden variables H.

(© S. Russell & P. Norvig, AIMA)

39 / 44
An Example: The Wumpus World [cont.]
Using conditional independence
Basic insight: Given the fringe squares (see below), b∗ is conditionally independent of the other hidden squares
  $\mathit{Unknown} \overset{\text{def}}{=} \mathit{Fringe} \cup \mathit{Other}$
  =⇒ $P(b^* \mid p^*, P_{1,3}, \mathit{Unknown}) = P(b^* \mid p^*, P_{1,3}, \mathit{Fringe}, \mathit{Other}) = P(b^* \mid p^*, P_{1,3}, \mathit{Fringe})$
Next: manipulate the query into a form where this equation can be used

40 / 44
An Example: The Wumpus World [cont.]
Using conditional independence

Fringe in the Wumpus World:

In this example the fringe is the set of unknown squares adjacent to the visited squares, excluding the query square. For the query P1,3 these are [2,2] and [3,1]; all remaining unknown squares are "other".
The observed breezes b∗ depend only on pits in squares adjacent to the visited squares, so once p∗, P1,3 and the fringe are fixed, the other squares carry no further information about b∗.

Fringe in Quantifying Uncertainty more generally:

In probabilistic or statistical contexts, the word fringe is also used informally for the regions of a distribution where the likelihood of occurrence is relatively low but not zero, that is, the tails or outskirts of the distribution.
For example, in a normal distribution, the fringe would refer to the regions far away from the mean where the probability density is low. Even though events in the fringe are less likely to occur, they are still possible according to the distribution.
Understanding the fringe in this sense matters in risk assessment and decision-making, as it helps account for extreme or rare events that may have significant consequences.

40 / 44
An Example: The Wumpus World [cont.]
Using conditional independence

Fringe in Normalization:
In the context of data normalization, which involves scaling data to fit within a certain range or distribution, the fringe refers to the extreme values of the original data.
For instance, if you're normalizing data to a range between 0 and 1, the fringe would consist of the original data points that are closest to the minimum and maximum values. Techniques such as min-max scaling or z-score normalization determine how these extreme values are represented in the normalized data.
Handling fringe values properly during normalization is important to prevent them from unduly skewing the analysis or the performance of machine learning algorithms trained on the data.
Note: this data-scaling sense of normalization is different from the normalization constant α used elsewhere in this chapter, which rescales a vector of probabilities so that its entries sum to 1.
In both cases, understanding and appropriately dealing with the fringe is essential for accurate modeling, analysis, and decision-making in uncertain or data-driven contexts.

(© S. Russell & P. Norvig, AIMA)

40 / 44
An Example: The Wumpus World [cont.]
Using conditional independence

Min-max scaling, also known as min-max normalization, transforms the original data into a new range, typically between 0 and 1. It does this by linearly scaling each feature in the dataset based on the minimum and maximum values observed in that feature.
The formula for min-max scaling is as follows:
$$x_{\text{scaled}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
Where:
• $x$ is an original data point.
• $x_{\min}$ is the minimum value of the feature.
• $x_{\max}$ is the maximum value of the feature.
• $x_{\text{scaled}}$ is the scaled value of $x$ within the range [0, 1].

40 / 44
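An Example: The Wumpus World [cont.]
Derivation of the query, step by step (step numbers refer to the notes below):
$$\begin{aligned}
P(P_{1,3} \mid p^*, b^*)
&= \frac{P(P_{1,3}, p^*, b^*)}{P(p^*, b^*)} = \alpha\, P(P_{1,3}, p^*, b^*) && (1) \\
&= \alpha \sum_{\mathit{unknown}} P(P_{1,3}, p^*, b^*, \mathit{unknown}) && (2) \\
&= \alpha \sum_{\mathit{unknown}} P(b^* \mid p^*, P_{1,3}, \mathit{unknown})\, P(P_{1,3}, p^*, \mathit{unknown}) && (3) \\
&= \alpha \sum_{\mathit{fringe}} \sum_{\mathit{other}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe}, \mathit{other})\, P(P_{1,3}, p^*, \mathit{fringe}, \mathit{other}) && (4) \\
&= \alpha \sum_{\mathit{fringe}} \sum_{\mathit{other}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(P_{1,3}, p^*, \mathit{fringe}, \mathit{other}) && (5) \\
&= \alpha \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe}) \sum_{\mathit{other}} P(P_{1,3}, p^*, \mathit{fringe}, \mathit{other}) && (6) \\
&= \alpha \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe}) \sum_{\mathit{other}} P(P_{1,3})\, P(p^*)\, P(\mathit{fringe})\, P(\mathit{other}) && (7) \\
&= \alpha\, P(p^*)\, P(P_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(\mathit{fringe}) \sum_{\mathit{other}} P(\mathit{other}) && (8) \\
&= \alpha\, P(p^*)\, P(P_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(\mathit{fringe}) && (9) \\
&= \alpha'\, P(P_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(\mathit{fringe}) && (10)
\end{aligned}$$
(1) P(p∗, b∗) is scalar; use it as a normalization constant
(2) Sum over the unknowns
(3) Use the product rule
(4) Separate unknown into fringe and other
(5) b∗ is conditionally independent of other given fringe
(6) Move P(b∗|p∗, P1,3, fringe) outward
(7) All of the pit locations are independent
(8) Move P(p∗), P(P1,3), and P(fringe) outward
(9) Remove $\sum_{\mathit{other}} P(\mathit{other})$ because it equals 1
(10) P(p∗) is scalar, so make it part of the normalization constant

41 / 44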
An Example: The Wumpus World [cont.]

We have obtained: $P(P_{1,3} \mid p^*, b^*) = \alpha' P(P_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(\mathit{fringe})$
We know that P(P1,3) = ⟨0.2, 0.8⟩ (see slide 38)
We can compute the normalization coefficient α′ afterwards
$\sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(\mathit{fringe})$: only 4 possible fringes
Start by rewriting as two separate equations:
  $P(p_{1,3} \mid p^*, b^*) = \alpha' P(p_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, p_{1,3}, \mathit{fringe})\, P(\mathit{fringe})$
  $P(\neg p_{1,3} \mid p^*, b^*) = \alpha' P(\neg p_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, \neg p_{1,3}, \mathit{fringe})\, P(\mathit{fringe})$

42 / 44
An Example: The Wumpus World [cont.]
For each of the two equations (query value p1,3 and query value ¬p1,3), P(b∗|...) is 1 if the breezes occur, 0 otherwise:
  $\sum_{\mathit{fringe}} P(b^* \mid p^*, p_{1,3}, \mathit{fringe})\, P(\mathit{fringe}) = 1 \cdot 0.04 + 1 \cdot 0.16 + 1 \cdot 0.16 + 0 \cdot 0.64 = 0.36$
  $\sum_{\mathit{fringe}} P(b^* \mid p^*, \neg p_{1,3}, \mathit{fringe})\, P(\mathit{fringe}) = 1 \cdot 0.04 + 1 \cdot 0.16 + 0 \cdot 0.16 + 0 \cdot 0.64 = 0.2$
=⇒ $P(P_{1,3} \mid p^*, b^*) = \alpha' P(P_{1,3}) \sum_{\mathit{fringe}} P(b^* \mid p^*, P_{1,3}, \mathit{fringe})\, P(\mathit{fringe})$
  = α′⟨0.2, 0.8⟩⟨0.36, 0.2⟩ = α′⟨0.072, 0.16⟩ = (normalization, s.t. α′ ≈ 4.31) ≈ ⟨0.31, 0.69⟩

43 / 44
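The fringe computation can be reproduced by enumerating the 4 fringe configurations. This is a sketch under stated assumptions: the fringe for the query P1,3 is taken to be the squares [2,2] and [3,1] (the unknown squares adjacent to visited ones), breezes are deterministic given the pits, and the names `neighbours`, `consistent` and `posterior_p13` are illustrative.

```python
from itertools import product

P_PIT = 0.2
FRINGE = [(2, 2), (3, 1)]   # assumed fringe squares for the query P1,3
BREEZY = [(1, 2), (2, 1)]   # visited squares where a breeze was observed


def neighbours(square):
    """The four orthogonally adjacent squares (off-board ones are harmless)."""
    i, j = square
    return {(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)}


def consistent(pits) -> bool:
    """P(b*|p*, P13, fringe): 1 iff every breezy square has an adjacent pit.
    The non-breezy square (1,1) only neighbours known pit-free squares."""
    return all(neighbours(b) & pits for b in BREEZY)


def posterior_p13():
    unnorm = {}
    for p13 in (True, False):
        total = 0.0
        for config in product((True, False), repeat=len(FRINGE)):
            pits = {sq for sq, has_pit in zip(FRINGE, config) if has_pit}
            if p13:
                pits.add((1, 3))
            # P(fringe): pits are placed independently with probability 0.2.
            p_fringe = 1.0
            for has_pit in config:
                p_fringe *= P_PIT if has_pit else 1 - P_PIT
            if consistent(pits):
                total += p_fringe
        prior = P_PIT if p13 else 1 - P_PIT
        unnorm[p13] = prior * total
    alpha = 1.0 / sum(unnorm.values())
    return {k: alpha * v for k, v in unnorm.items()}, alpha


posterior, alpha = posterior_p13()
print({k: round(v, 2) for k, v in posterior.items()}, "alpha' ~", round(alpha, 2))
```

Only 2 × 4 terms are evaluated here, compared with the 4096 terms of plain enumeration over all unknown squares.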
Exercise

Compute P(P2,2|p∗, b∗) in the same way.

44 / 44


