
CS 480

Introduction to Artificial Intelligence

October 26, 2023


Announcements / Reminders
 Please follow the Week 10 To Do List instructions (if you
haven't already)
 Written Assignment #03 posted
 Programming Assignment #01 due on Sunday
(10/29/23) at 11:59 PM CST

 Final Exam date:


– Thursday 11/30/2023 (last week of classes!)
 Ignore the date provided by the Registrar

2
Plan for Today
 Decision Networks

3
Playing Minesweeper with Bayes’ Rule
[Figure: Minesweeper boards with a cell marked X, showing the prior probability / belief of a mine (left) and the posterior probability / belief after new evidence (right).]

4
Naive Bayes Spam Filter
[Bayes net: class node Email = Spam with word nodes Word 1, Word 2, ..., Word N as children]

P(Email = spam | Word1) = 0.09
P(Email = spam | Word2) = 0.01
...
P(Email = spam | WordN) = 0.03

5
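To make the idea concrete, here is a minimal naive Bayes scoring sketch (not the lecture's exact model): it assumes hypothetical per-word likelihoods P(word | class) and a class prior, and combines them under the conditional-independence assumption.

```python
# Minimal naive Bayes spam score (sketch). The word likelihoods and prior below
# are made-up numbers; a real filter estimates them from labeled email counts.
p_word_given_spam = {"free": 0.30, "meeting": 0.05, "winner": 0.20}
p_word_given_ham  = {"free": 0.05, "meeting": 0.25, "winner": 0.01}
p_spam = 0.40  # prior P(Email = spam)

def p_spam_given_words(words):
    """P(spam | words) under the naive (conditional independence) assumption."""
    p_s, p_h = p_spam, 1.0 - p_spam
    for w in words:
        p_s *= p_word_given_spam.get(w, 1.0)   # unseen words contribute nothing
        p_h *= p_word_given_ham.get(w, 1.0)
    return p_s / (p_s + p_h)                   # normalize over the two hypotheses

print(p_spam_given_words(["free", "winner"]))  # high  -> likely spam
print(p_spam_given_words(["meeting"]))         # low   -> likely ham
```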
Agents and Belief State
[Diagram: the environment (true state A = 4, B = 7, C = 0) is partially observable; the agent's sensor reads only A = 4 and B = 7.]

Agent's model of the world. The environment could be in one of those states:
S1: A = 4, B = 7, C = 0 → Plan X
S2: A = 4, B = 7, C = 1 → Plan Y
S3: A = 4, B = 7, C = 2 → Plan X
S4: A = 4, B = 7, C = 3 → Plan Z

The agent can consult its internal representation of the world / environment to choose an action. Plans are sequences of actions.

Actuator: ACTION(S)

Assume: DC = {0, 1, 2, 3}

6
Decision Theory
 Decisions: every plan (sequence of actions) leads to an outcome (state)
 Agents have preferences (preferred outcomes)
 Preferences → outcome utilities
 Agents have degrees of belief (probabilities) about action outcomes

Decision theory = probability theory + utility theory

7
Decision Theory
 Decisions: every plan (sequence of actions) leads to an outcome (state)
 Agents have preferences (preferred outcomes)
 Preferences → outcome utilities
 Agents have degrees of belief (probabilities) about action outcomes

Decision theory = probability theory + utility theory
                      (BELIEFS)            (DESIRES)

8
Maximum Expected (Average) Utility

Outcome S10: P(S10) = 0.1 U(S10) = 2

Action M Outcome S15: P(S15) = 0.4 U(S15) = 3

Environment could be in one of


those states: Outcome S55: P(S55) = 0.5 U(S55) = 1
A = 4, B = 7, C = 0  Plan X Pick highest
A = 4, B = 7, C = 1  Plan Y MEU action
A = 4, B = 7, C = 2  Plan X Outcome S20: P(S20) = 0.7 U(S20) = 8
A = 4, B = 7, C = 3  Plan Z
Action N Outcome S15: P(S15) = 0.2 U(S15) = 4

Outcome S12: P(S12) = 0.1 U(S12) = 5

9
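A quick check of the numbers above in Python (probabilities and utilities taken from the slide):

```python
# Expected utility of each action, weighted by outcome probability.
outcomes = {
    "M": [(0.1, 2), (0.4, 3), (0.5, 1)],   # (P(outcome), U(outcome)) for S10, S15, S55
    "N": [(0.7, 8), (0.2, 4), (0.1, 5)],   # for S20, S15, S12
}

def expected_utility(action):
    return sum(p * u for p, u in outcomes[action])

for a in outcomes:
    print(a, round(expected_utility(a), 2))        # M 1.9, N 6.9

print("MEU action:", max(outcomes, key=expected_utility))   # -> N
```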
Agent's Decisions
Recall that agent ACTIONS change the state:
 if we are in state s
 action a is expected to
 lead to another state s’ (outcome)

Given uncertainty about the current state s and the action outcome s', we need to define the following:
 probability (belief) of being in state s: P(s)
 probability (belief) of action a leading to outcome s': P(s' | s, a)

Now:
P(Result(a) = s') = Σs P(s) · P(s' | s, a)

10
State Utility Function
Agent’s preferences (desires) are captured by
the Utility function U(s).

The utility function assigns a value to each state s to express how desirable this state is to the agent.

11
Expected Action Utility
The expected utility of an action a given the
evidence is the average utility value of all
possible outcomes s’ of action a, weighted by
their probability (belief) of occurrence:

EU(a) = Σs' P(Result(a) = s') · U(s')

A rational agent should choose an action that maximizes the expected utility:

a* = argmax_a EU(a)
12
How Did We Get Here?
Let’s start with relationships (and related notation)
between agent’s preferences:
 agent prefers A over B:
A ≻ B
 agent is indifferent between A and B:
A ~ B
 agent prefers A over B or is indifferent between A and B (weak preference):
A ≽ B
13
The Concept of Lottery
Let’s assume the following:
 an action a is a lottery ticket
 the set of outcomes (resulting states) is a lottery
A lottery L with possible outcomes S1, ..., Sn that
occur with probabilities p1, ..., pn is written as:

L = [p1, S1; p2, S2; ... ; pn, Sn]

Lottery outcome Si: atomic state or another lottery.


14
Lottery Constraints: Orderability
Given two lotteries A and B, a rational agent must
either prefer one or else rate them as equally
preferable:

Exactly one of (A ≻ B), (B ≻ A), or (A ~ B) holds

15
Lottery Constraints: Transitivity
Given three lotteries A, B, and C, if an agent
prefers A to B AND prefers B to C, then the agent
must prefer A to C:

(A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)

16
Lottery Constraints: Continuity
If some lottery B is between A and C in preference,
then there is some probability p for which the
rational agent will be indifferent between getting B
for sure or some other lottery that yields A with
probability p and C with probability 1 - p:

(A ≻ B ≻ C) ⇒ ∃p  [p, A; 1−p, C] ~ B

17
Lottery Constraints: Substitutability
If an agent is indifferent between two lotteries A
and B, then the agent is indifferent between two
more complex lotteries that are the same, except
that B is substituted for A in one of them:

(A ~ B) ⇒ [p, A; 1−p, C] ~ [p, B; 1−p, C]

18
Lottery Constraints: Monotonicity
Suppose two lotteries have the same two possible
outcomes, A and B. If an agent prefers A to B, then
the agent must prefer the lottery that has a higher
probability for A:

(A ≻ B) ⇒ (p > q ⇔ [p, A; 1−p, B] ≻ [q, A; 1−q, B])

19
Lottery Constraints: Decomposability
Compound lotteries can be reduced to smaller ones
using the laws of probability:

[p, A; 1−p, [q, B; 1−q, C]] ~ [p, A; (1−p)·q, B; (1−p)·(1−q), C]

20
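A tiny numeric illustration of decomposability (a sketch with made-up probabilities):

```python
# Flatten a compound lottery using the laws of probability.
# A lottery is a list of (probability, outcome) pairs; an outcome may itself be a lottery.
def flatten(lottery):
    flat = []
    for p, outcome in lottery:
        if isinstance(outcome, list):                        # nested lottery: distribute p over it
            flat.extend((p * q, o) for q, o in flatten(outcome))
        else:
            flat.append((p, outcome))
    return flat

compound = [(0.5, "A"), (0.5, [(0.4, "B"), (0.6, "C")])]
print(flatten(compound))   # [(0.5, 'A'), (0.2, 'B'), (0.3, 'C')]
```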
Preferences and Utility Function
An agent whose preferences between lotteries
follow the set of axioms (of utility theory) below:
 Orderability
 Transitivity
 Continuity
 Substitutability
 Monotonicity
 Decomposability
can be described as possessing a utility function and
maximizing it.
21
Preferences and Utility Function
If an agent's preferences obey the axioms of utility
theory, then there exists a function U such that:

U(A) = U(B) if and only if (A ~ B)

and

U(A) > U(B) if and only if (A ≻ B)

22
Multiattribute Outcomes
Outcomes can be characterized by more than one
attribute. Decisions in such cases are handled by
Multiattribute Utility Theory.
Attributes: X = X1, ..., Xn
Assigned values: x = <x1, ..., xn>

23
Strict Dominance: Deterministic

B strictly dominates A
B is better than A
for both X1 and X2

24
Strict Dominance: Deterministic

D doesn't strictly dominate A; D is better than A only for X1.

25
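A strict-dominance check for the deterministic case (a sketch; the attribute values are hypothetical and higher is assumed better, as in the slide's figures):

```python
# Deterministic strict dominance: B dominates A when B is better on every attribute
# (the slide's reading: better for both X1 and X2).
def strictly_dominates(b, a):
    return all(bv > av for bv, av in zip(b, a))

A = (3, 4)   # hypothetical (X1, X2) values
B = (5, 6)   # better than A on both attributes
D = (6, 2)   # better than A on X1 only

print(strictly_dominates(B, A))   # True  -> B strictly dominates A
print(strictly_dominates(D, A))   # False -> D does not strictly dominate A
```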
Strict Dominance: Uncertain
B strictly dominates A
B is better than A
for both X1 and X2

26
Decision Network (Influence Diagram)
Decision networks (also called influence diagrams)
are structures / mechanisms for making rational
decisions.

Decision networks are based on Bayesian


networks, but include additional nodes that
represent actions and utilities.

27
Decision Networks
The most basic decision network needs to include:
 information about current state s
 possible actions
 resulting state s’ (after applying chosen action a)
 utility of the resulting state U(s’)

28
Decision Network Nodes
Decision networks are built using the following
nodes:
 chance nodes:
X

 decision nodes:
Y

 utility (or value) nodes


Z

29
Decision Network: Example

[Figure: decision network for the airport-siting example]
30
Decision Network: Example
Nodes describing
current state

31
Decision Network: Example
Evidence

32
Decision Network: Example
The Airport Site decision changes the conditional distributions of the Safety, Quietness, and Frugality nodes.

33
Decision Network: Example

Parents of the
utility node

34
Decision Network: Example

Outcome nodes

35
Decision Network: Example

Nodes directly
influencing utility

36
Decision Network: Evaluation
The algorithm for decision network evaluation is as
follows:
1. Set the evidence variables for the current state
2. For each possible value a of decision node:
a. Set the decision node to that value
b. Calculate the posterior probabilities for the parent
nodes of the utility node
c. Calculate the expected utility for the action (decision value) a
3. Return the action with the highest expected utility

37
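A minimal sketch of this evaluation loop on a toy single-decision network (the node names, posterior, and utilities are made up for illustration):

```python
# Toy decision-network evaluation: enumerate decision values and pick the
# one with the highest expected utility (steps 2-3 of the algorithm above).
posterior = {"good": 0.6, "bad": 0.4}   # step 2b: P(parent of U | evidence), assumed fixed here
utility = {("act1", "good"): 50, ("act1", "bad"): 10,
           ("act2", "good"): 30, ("act2", "bad"): 20}

def evaluate(decision_values):
    best_action, best_eu = None, float("-inf")
    for a in decision_values:                                        # step 2: each decision value
        eu = sum(p * utility[(a, s)] for s, p in posterior.items())  # step 2c
        if eu > best_eu:
            best_action, best_eu = a, eu
    return best_action, best_eu                                      # step 3

print(evaluate(["act1", "act2"]))   # ('act1', 34.0): 0.6*50 + 0.4*10 beats 0.6*30 + 0.4*20
```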
Decision Network: Example

The utility table is used to get the utility.

38
Decision Network: Simplified Form

The action-utility table is used to get the expected utility directly.

39
(Single-Stage) Decision Networks
[Diagram: two single-stage decision networks — the General Structure (left) and the Simplified Structure (right). Each consists of a Bayes network with a Decision node and a Utility node attached.]

40
(Single-Stage) Decision Networks
General Structure — Utility Table:

S: low  low  low   low   high  high  high  high
Q: low  low  high  high  low   low   high  high
F: low  high low   high  low   high  low   high
U: 10   20   5     50    70    150   100   200

Simplified Structure — Action-Utility Table (not all columns shown):

AT: low  low  low   ---  ---  high  high  high
L:  low  low  high  ---  ---  low   high  high
C:  low  high low   ---  ---  high  low   high
AS: A    A    A     ---  ---  B     B     B
U:  10   20   5     ---  ---  150   100   200

41
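One way these tables can be represented in code (a sketch; the tuples follow the column order shown above, and the reading S = Safety, Q = Quietness, F = Frugality comes from the earlier example slides):

```python
# General structure: utility indexed by the chance-node values (S, Q, F).
utility_table = {
    ("low", "low", "low"): 10,    ("low", "low", "high"): 20,
    ("low", "high", "low"): 5,    ("low", "high", "high"): 50,
    ("high", "low", "low"): 70,   ("high", "low", "high"): 150,
    ("high", "high", "low"): 100, ("high", "high", "high"): 200,
}
# Simplified structure: expected utility indexed by (AT, L, C) plus the chosen action AS.
action_utility_table = {
    ("low", "low", "low", "A"): 10,    ("low", "low", "high", "A"): 20,
    ("low", "high", "low", "A"): 5,    ("high", "low", "high", "B"): 150,
    ("high", "high", "low", "B"): 100, ("high", "high", "high", "B"): 200,
}
print(utility_table[("high", "low", "high")])               # 150
print(action_utility_table[("high", "high", "high", "B")])  # 200
```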
Decision Network: Evaluation
The algorithm for decision network evaluation is as
follows:
1. Set the evidence variables for the current state
2. For each possible value a of decision node:
a. Set the decision node to that value
b. Calculate the posterior probabilities for the parent
nodes of the utility node
c. Calculate the expected utility for the action (decision value) a
3. Return the action with the highest expected utility

42
Agent’s Decisions
Recall that agent ACTIONS change the state:
 if we are in state s
 action a is expected to
 lead to another state s’ (outcome)

Given uncertainty about the current state s and the action outcome s', we need to define the following:
 probability (belief) of being in state s: P(s)
 probability (belief) of action a leading to outcome s': P(s' | s, a)

Now:
P(Result(a) = s') = Σs P(s) · P(s' | s, a)

43
Expected Action Utility
The expected utility of an action a given the
evidence is the average utility value of all
possible outcomes s’ of action a, weighted by
their probability (belief) of occurrence:

EU(a) = Σs' P(Result(a) = s') · U(s')

A rational agent should choose an action that maximizes the expected utility:

a* = argmax_a EU(a)
44
Decision Networks: Example
Decision: take umbrella        Decision: leave umbrella

[Diagram: two copies of the umbrella decision network, one per decision. Each has a Weather chance node and the Decision node feeding the utility node U.]

Prior: P(W = rain) = 0.30, P(W = sun) = 0.70

Utility table:
D      W     U
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

45
Decision Networks: Example
Decision: take umbrella        Decision: leave umbrella
EU(take) = Σs' P(Result(take) = s') · U(s')        EU(leave) = Σs' P(Result(leave) = s') · U(s')

Prior: P(W = rain) = 0.30, P(W = sun) = 0.70

Utility table:
D      W     U
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

46
Decision Networks: Example
Decision: take umbrella
Outcomes: S1' (D = take, W = sun), S2' (D = take, W = rain)
EU(take) = P(Result(take) = S1')·U(S1') + P(Result(take) = S2')·U(S2')
         = 0.70 · 20 + 0.30 · 70 = 35

Decision: leave umbrella
Outcomes: S3' (D = leave, W = sun), S4' (D = leave, W = rain)
EU(leave) = P(Result(leave) = S3')·U(S3') + P(Result(leave) = S4')·U(S4')
          = 0.70 · 100 + 0.30 · 0 = 70

Prior: P(W = rain) = 0.30, P(W = sun) = 0.70
Utility table: (leave, sun) = 100, (leave, rain) = 0, (take, sun) = 20, (take, rain) = 70

47
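The same computation in Python (prior and utility table as above):

```python
# Expected utility of each umbrella decision under the weather prior.
p_weather = {"sun": 0.70, "rain": 0.30}
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take", "sun"): 20, ("take", "rain"): 70}

def eu(decision):
    return sum(p * utility[(decision, w)] for w, p in p_weather.items())

print(round(eu("take"), 1), round(eu("leave"), 1))   # 35.0 70.0
print(max(("take", "leave"), key=eu))                # 'leave' -> the MEU action
```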
Decision Networks: Example
Which action to choose: take or leave the umbrella?

EU(take) = P(Result(take) = S1')·U(S1') + P(Result(take) = S2')·U(S2') = 0.70 · 20 + 0.30 · 70 = 35
EU(leave) = P(Result(leave) = S3')·U(S3') + P(Result(leave) = S4')·U(S4') = 0.70 · 100 + 0.30 · 0 = 70

EU(leave) = 70 > EU(take) = 35, so the MEU choice is to leave the umbrella.

Prior: P(W = rain) = 0.30, P(W = sun) = 0.70
Utility table: (leave, sun) = 100, (leave, rain) = 0, (take, sun) = 20, (take, rain) = 70

48
Decision Networks: Example
Decision: take umbrella        Decision: leave umbrella

[Diagram: the umbrella decision network extended with a Forecast chance node (a child of Weather); one copy per decision.]

Priors: P(W = rain) = 0.30, P(W = sun) = 0.70; P(F = sun) = 0.59, P(F = rain) = 0.41

Utility table:
D      W     U
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

49
Decision Networks: Example
Decision: take umbrella given e        Decision: leave umbrella given e

With a forecast observed as evidence e, the utility now depends on the posterior over Weather:
P(rain | F) = ???, P(sun | F) = ???, with P(F = sun) = 0.59 and P(F = rain) = 0.41.

Utility table:
D      W     U
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

50
Decision Networks: Example
Decision: take umbrella given e — conditional probabilities

Assume that we are given:
Priors: P(W = rain) = 0.30, P(W = sun) = 0.70; P(F = sun) = 0.59, P(F = rain) = 0.41

F     W     P(F | W)
sun   sun   0.80
rain  sun   0.20
sun   rain  0.10
rain  rain  0.90

By Bayes' theorem:
P(W = sun | F = sun) = P(F = sun | W = sun) · P(W = sun) / P(F = sun) = 0.80 · 0.70 / 0.59 ≈ 0.95
P(W = sun | F = rain) = P(F = rain | W = sun) · P(W = sun) / P(F = rain) = 0.20 · 0.70 / 0.41 ≈ 0.34
P(W = rain | F = sun) = P(F = sun | W = rain) · P(W = rain) / P(F = sun) = 0.10 · 0.30 / 0.59 ≈ 0.05
P(W = rain | F = rain) = P(F = rain | W = rain) · P(W = rain) / P(F = rain) = 0.90 · 0.30 / 0.41 ≈ 0.66

Utility table (unchanged): (leave, sun) = 100, (leave, rain) = 0, (take, sun) = 20, (take, rain) = 70

51
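The same posteriors in Python (the forecast marginals 0.59 and 0.41 also fall out of the prior and the CPT by marginalization):

```python
# Posterior weather beliefs given the forecast, via Bayes' theorem.
p_w = {"sun": 0.70, "rain": 0.30}                          # prior P(W)
p_f_given_w = {("sun", "sun"): 0.80, ("rain", "sun"): 0.20,   # P(F | W), keyed (F, W)
               ("sun", "rain"): 0.10, ("rain", "rain"): 0.90}

def posterior_weather(forecast):
    p_f = sum(p_f_given_w[(forecast, w)] * p_w[w] for w in p_w)          # P(F = forecast)
    return {w: p_f_given_w[(forecast, w)] * p_w[w] / p_f for w in p_w}   # Bayes' rule

print(posterior_weather("sun"))    # {'sun': ~0.95, 'rain': ~0.05}
print(posterior_weather("rain"))   # {'sun': ~0.34, 'rain': ~0.66}
```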
Decision Networks: Example
Decision: take umbrella given Forecast = sun        Decision: leave umbrella given Forecast = sun

Posterior: P(rain | F = sun) = 0.05, P(sun | F = sun) = 0.95 (with P(F = sun) = 0.59, P(F = rain) = 0.41)

Utility table:
D      W     U
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

52
Decision Networks: Example
Decision: take umbrella given Forecast = sun
Outcomes: S1' (D = take, W = sun), S2' (D = take, W = rain)
EU(take | F = sun) = P(Result(take) = S1' | e)·U(S1') + P(Result(take) = S2' | e)·U(S2')
                   = 0.95 · 20 + 0.05 · 70 = 22.5

Decision: leave umbrella given Forecast = sun
Outcomes: S3' (D = leave, W = sun), S4' (D = leave, W = rain)
EU(leave | F = sun) = P(Result(leave) = S3' | e)·U(S3') + P(Result(leave) = S4' | e)·U(S4')
                    = 0.95 · 100 + 0.05 · 0 = 95

Posterior: P(rain | F = sun) = 0.05, P(sun | F = sun) = 0.95
Utility table: (leave, sun) = 100, (leave, rain) = 0, (take, sun) = 20, (take, rain) = 70

53
Decision Networks: Example
Decision: take umbrella given Forecast = rain        Decision: leave umbrella given Forecast = rain

Posterior: P(rain | F = rain) = 0.66, P(sun | F = rain) = 0.34 (with P(F = sun) = 0.59, P(F = rain) = 0.41)

Utility table:
D      W     U
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

54
Decision Networks: Example
Decision: take umbrella given Forecast = rain
Outcomes: S1' (D = take, W = sun), S2' (D = take, W = rain)
EU(take | F = rain) = P(Result(take) = S1' | e)·U(S1') + P(Result(take) = S2' | e)·U(S2')
                    = 0.34 · 20 + 0.66 · 70 = 53

Decision: leave umbrella given Forecast = rain
Outcomes: S3' (D = leave, W = sun), S4' (D = leave, W = rain)
EU(leave | F = rain) = P(Result(leave) = S3' | e)·U(S3') + P(Result(leave) = S4' | e)·U(S4')
                     = 0.34 · 100 + 0.66 · 0 = 34

Posterior: P(rain | F = rain) = 0.66, P(sun | F = rain) = 0.34
Utility table: (leave, sun) = 100, (leave, rain) = 0, (take, sun) = 20, (take, rain) = 70

55
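Putting the conditioned cases together in code (posteriors and utility table as above):

```python
# Expected utility of each decision conditioned on the forecast.
posterior = {"sun": {"sun": 0.95, "rain": 0.05},    # P(W | F = sun)
             "rain": {"sun": 0.34, "rain": 0.66}}   # P(W | F = rain)
utility = {("leave", "sun"): 100, ("leave", "rain"): 0,
           ("take", "sun"): 20, ("take", "rain"): 70}

def eu(decision, forecast):
    return sum(p * utility[(decision, w)] for w, p in posterior[forecast].items())

for f in ("sun", "rain"):
    scores = {d: round(eu(d, f), 2) for d in ("take", "leave")}
    print(f, scores, "->", max(scores, key=scores.get))
# sun  {'take': 22.5, 'leave': 95.0} -> leave
# rain {'take': 53.0, 'leave': 34.0} -> take
```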
Decision Networks: Example
[Diagram: the four conditioned umbrella networks side by side — take umbrella given a rain forecast, leave umbrella given a rain forecast, take umbrella given a sun forecast, leave umbrella given a sun forecast.]

56
Decision Networks: Example
[Diagram: the four conditioned umbrella networks side by side — take umbrella given a rain forecast, leave umbrella given a rain forecast, take umbrella given a sun forecast, leave umbrella given a sun forecast.]

57
Value of Perfect Information
The value/utility of the best action a without additional evidence (information) is:

MEU(a) = max_a EU(a)

If we include new evidence/information ej given by some variable Ej, the value/utility of the best action becomes:

MEU(a | Ej = ej) = max_a EU(a | Ej = ej)

The value of additional evidence/information from Ej is:

VPI(Ej) = ( Σej P(Ej = ej) · MEU(a | Ej = ej) ) − MEU(a)

using our current beliefs about the world.

58
Decision Network: Example
Decision network and outcome tree:

[Diagram: the decision network (Weather, Forecast, Decision, U) next to its outcome tree — branches take / leave, then Weather | e = sun / rain, leading to utilities U(t,s), U(t,r), U(l,s), U(l,r).]

The value of the best action a without additional evidence:
MEU(a) = MEU(leave) = 70

With evidence/information (E = e) given by Forecast:
MEU(a | e) = MEU(take | F = rain) = 53
MEU(a | e) = MEU(leave | F = sun) = 95

The value of additional evidence/information from F is:
VPI(E) = Σe P(E = e) · MEU(a | E = e) − MEU(a)
VPI(F) = (P(F = rain) · MEU(take | F = rain) + P(F = sun) · MEU(leave | F = sun)) − MEU(leave)
       = (0.41 · 53 + 0.59 · 95) − 70 = 7.78

59
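The same VPI computation in Python (numbers from the example):

```python
# Value of perfect information for the Forecast variable.
p_forecast = {"rain": 0.41, "sun": 0.59}
meu_given_forecast = {"rain": 53, "sun": 95}   # best action's EU after seeing each forecast
meu_without = 70                               # MEU(leave), the best action with no forecast

vpi = sum(p_forecast[f] * meu_given_forecast[f] for f in p_forecast) - meu_without
print(round(vpi, 2))   # 7.78
```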
Decision Networks: Example
Decision: leave umbrella        Decision: take umbrella given a rain forecast        Decision: leave umbrella given a sun forecast

[Diagram: the networks above, corresponding to the best decision in each case.]

The value of the best action a without additional evidence:
MEU(a) = MEU(leave) = 70
With evidence/information (E = e) given by Forecast:
MEU(a | e) = MEU(take | F = rain) = 53
MEU(a | e) = MEU(leave | F = sun) = 95
The value of additional evidence/information from F is:
VPI(E) = Σe P(E = e) · MEU(a | E = e) − MEU(a)
VPI(F) = (P(F = rain) · MEU(take | F = rain) + P(F = sun) · MEU(leave | F = sun)) − MEU(leave)
       = (0.41 · 53 + 0.59 · 95) − 70 = 7.78

60
Utility & Value of Perfect Information
[Figure: three cases comparing the estimated utilities of actions a1 and a2. In the first case new information will not help; in the second it may help a lot; in the third it may help a bit.]

61
VPI Properties
Given a decision network with possible observations Ej
(sources of new information / evidence):

 The expected value of information is nonnegative:
VPI(Ej) ≥ 0 for every Ej

 VPI is not additive (in general):
VPI(Ej, Ek) ≠ VPI(Ej) + VPI(Ek)

 VPI is order-independent:
VPI(Ej, Ek) = VPI(Ej) + VPI(Ek | ej) = VPI(Ek) + VPI(Ej | ek), where VPI(· | ej) is the VPI computed after observing Ej = ej
62
Information Gathering Agent

63
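A minimal sketch of a myopic information-gathering agent (hypothetical VPI estimates and observation costs; it requests the observation whose value of information most exceeds its cost, otherwise acts on its current beliefs):

```python
# Myopic information-gathering agent (a sketch, not the lecture's exact pseudocode).
def choose(observable_vars, vpi, cost, best_action):
    best_var = max(observable_vars, key=lambda Ej: vpi[Ej] - cost[Ej])
    if vpi[best_var] > cost[best_var]:
        return ("request", best_var)    # gathering this evidence is worth its cost
    return ("act", best_action)         # no observation pays for itself: act now

vpi = {"Forecast": 7.78, "Barometer": 2.0}    # hypothetical VPI estimates
cost = {"Forecast": 1.0, "Barometer": 5.0}    # hypothetical observation costs
print(choose(["Forecast", "Barometer"], vpi, cost, best_action="leave"))
# ('request', 'Forecast')
```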
