0% found this document useful (0 votes)
100 views

Bayesian Networks Bucket Elimination Algorithm

The document discusses Bayesian networks and the bucket elimination algorithm. It introduces basic concepts such as belief updating using variable elimination. The goal is to calculate P(X=x|E=e) by computing P(X=x,E=e) and normalizing. An example Bayesian network is provided and the process of variable elimination is demonstrated by defining elimination functions for each variable.

Uploaded by

goransuksi
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
100 views

Bayesian Networks Bucket Elimination Algorithm

The document discusses Bayesian networks and the bucket elimination algorithm. It introduces basic concepts such as belief updating using variable elimination. The goal is to calculate P(X=x|E=e) by computing P(X=x,E=e) and normalizing. An example Bayesian network is provided and the process of variable elimination is demonstrated by defining elimination functions for each variable.

Uploaded by

goransuksi
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 57

Bayesian Networks

Bucket Elimination Algorithm

主講人:虞台文

大同大學資工所
智慧型多媒體研究室
Content
 Basic Concept
 Belief Updating
 Most Probable Explanation (MPE)
 Maximum A Posteriori (MAP)
Bayesian Networks
Bucket Elimination Algorithm

Basic Concept

大同大學資工所
智慧型多媒體研究室
Satisfiability
Given a statement of clauses (in disjunction normal form), t
he satisfiability problem is to determine whether there exist
s a truth assignment to make the statement true.

Examples:
1. ( A  B  C )  (C  D)  ( A  B  D)  (A  D)

A=True, B=True, C=False, D=False Satisfiable


2. ( A  B  C )  ( B  C  D)  ( B  C )  (A  C )  (D)

Satisfiable?
Resolution
(  q )  (   q) can be true if and only if
       can be true.
 q      TURE ?

  q

   unsatisfiable
( A  B  C )  ( B  C  D)  ( B  C )  (A  C )  (D)

Direct Resolution
Example:  A  B  C 
 BC D 
 
Given a set of clauses  B  C  and an order d=ABCD
 A  C 
 
 D 
Set initial buckets as follows:
A  B  C
B  C BC  D
A  C D
BucketA BucketB BucketC BucketD
( A  B  C )  ( B  C  D)  ( B  C )  (A  C )  (D)

Direct Resolution
Because no empty clause () is resulted, the statem
ent is satisfiable.

How to get a truth assignment?


BC
A B A  B  C
B B  C BC  D
A  B A  C D
BucketA BucketB BucketC BucketD
( A  B  C )  ( B  C  D)  ( B  C )  (A  C )  (D)

Direct Resolution

True if A  True
C
True/False if A  False
A  True or False B  True D  False
BC
A B A  B  C
B B  C BC  D
A  B A  C D
BucketA BucketB BucketC BucketD
Direct Resolution
Queries on Bayesian Networks

 Belief updating
P ( X | E)  ?
 Finding the most probable explanation (mpe)
– Given evidence, finding a maximum probability assignment to t
he rest of variables.
 Maximizing a posteriori hypothesis (map)
– Given evidence, finding an assignment to a subset of hypothesi
s variables that maximize their probability.
 Maximizing the expected utility of the problem (meu)
– Given evidence and utility function, finding a subset of decision
variables that maximize the expected utility.
Bucket Elimination
 The algorithm will be used as a framework for
various probabilistic inferences on Bayesian
Networks.
Preliminary – Elimination Functions

Given a function h defined over subset of


variables S, where X  S,

min X h
Eliminate parameter X from h
max X h
mean X h Defined over U = S – {X}.
X h
Preliminary – Elimination Functions

Given a function h defined over subset of


variables S, where X  S,

min X h
 min X h  (u)  min X h( x, u)
max X h
 max X h  (u)  max X h( x, u)
1
mean X h  mean X h  (u) 
|X|
 X h ( x, u )
X h
  X h  (u)   X h( x, u)
Preliminary – Elimination Functions

Given function h1,…, hn defined over subset o


f variables S1,…, Sn, respectively,

 j hj
Defined over U   S j
j

 j hj
Preliminary – Elimination Functions

Given function h1,…, hn defined over subset o


f variables S1,…, Sn, respectively,

 j hj   h  (u)  
j j j
h j (u S j
)

 j hj   h  (u)  
j j j
h j (u S j
)
Bayesian Networks
Bucket Elimination Algorithm

Belief Updating

大同大學資工所
智慧型多媒體研究室
Goal
P ( X  x | E  e)  ?

P ( X  x, E  e)
 Normalization
P ( E  e) Factor
P(a | g  1)  ? P(a, g  1)  ?

Basic Concept of Variable Elimination

Example:
P( g , f , d , c, b, a )  P ( g | f ) P( f | b, c) P(d | a, b) P(b | a) P(c | a ) P (a)
A

B C

G
P(a | g  1)  ? P(a, g  1)  ?

Basic Concept of Variable Elimination

Example:
P( g , f , d , c, b, a )  P ( g | f ) P( f | b, c) P(d | a, b) P(b | a) P(c | a ) P (a)

P (a, g  1)  
b , c , d , f , g 1
P ( a , b, c , d , f , g )

 P (a ) 
b , c , d , f , g 1
P( g | f ) P( f | b, c) P (d | a, b) P(b | a) P(c | a)

 P (a) P (c | a ) P(b | a) P ( f | b, c) P(d | a, b) P( g | f )


c b f d g 1
P(a | g  1)  ? P(a, g  1)  ?

Basic Concept of Variable Elimination

P (a, g  1)
 P(a ) P(c | a) P (b | a ) P ( f | b, c) P(d | a, b) P ( g | f )
c b f d g 1
(f)
G


 P( a) P(c | a ) P(b | a) P( f | b, c )G ( f ) P (d | a, b)
c b f d
(a, b)
D


 P (a ) P(c | a) P (b | a )D (a, b) P ( f | b, c)G ( f )
c b f
(b, c)
F


 P(a) P(c | a ) P(b | a)D (a, b)F (b, c)  P(a) P(c | a)B (a, c)
c b c
B (a, c) (a)
C


 P(a )C (a )
P (a, g  1)
 P(a ) P (c | a) P (b | a ) P ( f | b, c) P (d | a, b) P ( g | f )
c b f d g 1

Basic Concept of Variable Elimination

BucketG G ( f )   P( g | f )
g 1

BucketD D (a, b)   P(d | a, b)


d

BucketF F (b, c)   P( f | b, c)G ( f )


f

BucketB B (a, c)   P (b | a)D (a, b)F (b, c)


b

BucketC C (a)   P(c | a)B (a, c)


c

BucketA P (a, g  1)  P(a )C (a)


P (a, g  1)
 P(a ) P (c | a) P (b | a ) P ( f | b, c) P (d | a, b) P ( g | f )
c b f d g 1

Basic Concept of Variable Elimination

BucketG  P( g | f )
g 1
 G ( f )

BucketD  P ( d | a, b)  
d
D ( a, b)

BucketF  P ( f | b, c ) 
f
G ( f )  F (b, c )

BucketB  P(b | a)


b
D (a, b) F (b, c)  B (a, c)

BucketC  P (c | a ) 
c
B (a, c)  C (a )

BucketA P (a ) C (a )  P (a, g  1)
Basic Concept of Variable Elimination

G ( f )   P( g | f )
g 1

f G(f )
+ 0.1
 0.7
f G(f )
+ 0.1
 0.7

Basic Concept of Variable Elimination

D ( a , b )   P ( d | a , b )
d

a b D(a,
b)
0 0 1
0 1 1
1 0 1
1 1 1
a b D(a,
b) f G(f )
0 0 1 + 0.1
0 1 1  0.7
1 0 1
1 1 1

Basic Concept of Variable Elimination

F (b, c)   P( f | b, c)G ( f )
f

b c F(b, c)
0 0 0.701
0 1 0.610
1 0 0.400
1 1 0.340

0.7
0.1
0.7
0.1
0.7
0.1
0.7
0.1
b c F(b, c) a b D(a,
b) f G(f )
0 0 0.701
0 0 1 + 0.1
0 1 0.610
0 1 1  0.7
1 0 0.400
1 0 1
1 1 0.340
1 1 1

Basic Concept of Variable Elimination

B (a, c)   P(b | a )D (a, b)F (b, c)


b
a c B(a, c)
0 0 0.90.701+0.1 0.400=0.6709
0 1 0.90.610+0.1 0.340=0.5830
1 0 0.60.701+0.4 0.400=0.5806
1 1 0.60.610+0.4 0.340=0.5020
a c B(a, c) b c F(b, c) a b D(a, f G(f )
b)
0 0 0.6709 0 0 0.701 + 0.1
0 0 1
0 1 0.5830 0 1 0.610  0.7
0 1 1
1 0 0.5806 1 0 0.400
1 0 1
1 1 0.5020 1 1 0.340
1 1 1

Basic Concept of Variable Elimination

C (a )   P (c | a )B (a, c)
c

a C(a )
1 0.67 0.5806+0.33 0.5020=0.554662
0 0.75 0.6709+0.25 0.5830=0.648925
a C(a ) a c B(a, c) b c F(b, c) a b D(a, f G(f )
b)
1 0.554662 0 0 0.6709 0 0 0.701 + 0.1
0 0 1
0 0.648925 0 1 0.5830 0 1 0.610  0.7
0 1 1
1 0 0.5806 1 0 0.400
1 0 1
1 1 0.5020 1 1 0.340
1 1 1

Basic Concept of Variable Elimination

P (a, g  1)  P(a )C (a )


a P(a, g=1)
1 0.30.554662=0.1663986
0 0.70.648925=0.4542475

P(a, g  1)
P(a | g  1) 
P ( g  1)
P ( g  1)  0.1663986  0.4542475
 0.6206461
a P(a | g=1)
1 0.1663986/0.6206461=0.26811
0 0.4542475/0.6206461=0.73189
Bucket Elimination Algorithm
Complexity
 The BuckElim Algorithm can be applied to any ordering.
 The arity of the function recorded in a bucket
– the numbers of variables appearing in the processed bucked, excl
uding the bucket’s variable.
 Time and Space complexity is exponentially grow with a f
unction of arity r.
 The arity is dependent on the ordering.
 How many possible orderings for BN’s variables?
Consider the ordering AFDCBG.
A

Determination of the Arity B C

F
D

BucketG  P( g | f )  G ( f ) G
g 1

BucketB  P(b | a)P(d | a, b) P( f | b, c)  B (a, c, d , f ) G


1
b

4
BucketC P(c | a) B (a, c, d , f )  C (a, d , f ) B
c
1 ,3

BucketD 
C
C (a, d , f )  D ( a , f )
d 0 ,2
D

BucketF  G ( f )D (a, f )  F (a) F


0 ,1
f

BucketA P (a ) F (a)  P (a, g  1) A


0
Given the ordering, e.g., AFDCBG.
d
Determination of the Arity
The
Thewidth
widthof
ofaagraph
graphisisthe
the
maximum
maximumwidth
widthofofits
itsnodes.
nodes. w(d) = 4 w*(d) = 4
w(d):
w(d):width
widthof
ofinitial
initialgraph
graph Width of Width of
for node node
forordering
orderingd.d. 1 1
w*(d):
w*(d):width
widthof
ofinduced
inducedgraph
graph G G
for
forordering
orderingd.d. 4 4
B B

A 1 3
C C
B C
0 2
D D
Initial Induced
F Graph
Graph 0 1
D F F

G 0 0
A A
Definition of Tree-Width

Goal: Finding an ordering with smallest induced width.


Greedy heuristic and Approximation methods
NP-Hard
Are available.
Summary
 The complexity of BuckElim algorithm is domi
nated by the time and space needed to proce
ss a bucket.
 It is time and space is exponential in number
of bucket variables.
 Induced width bounds the arity of bucket func
tions.
Exercises
 Use BuckElim to evaluate P(a|b=1) with the foll
owing two ordering:
A

B C
1. d1=ACBFDG
2. d2=AFDCBG F

Give the details and make some conclusion. G


How to improve the algorithm?
Bayesian Networks
Bucket Elimination Algorithm

Most Probable Exp


lanation (MPE)

大同大學資工所
智慧型多媒體研究室
MPE
Goal:
x*  arg max P( X  x | E  e)  ?
x
evidence

x  ( x1 , , xn )
MPE
Goal:
x*  arg max P( X  x | E  e)  ?
x

x*  arg max P(x, e)  ?


x
x*  arg max P (x, e)
x
Notations

i
xi

Fi i
x*  arg max P (x, e)
x
MPE
PX (x)  P(x, e)   i 1 P( xi , e |  i )
n

Let x n  ( x1 , , xn )
P(x*)  max PX (x)  max PX (x n )
x xn

 max  i 1 P( xi , e |  i )
n

xn

 max  i 1 P( xi , e |  i )
n

x n1 , xn
x*  arg max P (x, e)
x
MPE
P (x*)  max  i 1 P( xi , e |  i )
n

x n1 , xn

Some terms involve xn,


some terms not.

n Xn is conditioned by its parents.


P ( xn , e |  n )
Xn
Xn conditions its children.
n P( xk , e | xn ,), xk  n
x*  arg max P (x, e)
x
MPE
P (x*)  max  i 1 P( xi , e |  i )
n

x n1 , xn

 max  X X F P( xi , e |  i )  max P( xn , e |  n ) X  P( xi , e |  i )


x n1 i n xn i n

Not conditioned by x n Itself Conditioned by x n

n
x nappears in these CPT’s
Xn
Fn
n
x*  arg max P (x, e)
x
MPE
P (x*)  max  i 1 P( xi , e |  i )
n

x n1 , xn

 max  X X F P( xi , e |  i )  max P( xn , e |  n ) X  P( xi , e |  i )


x n1 i n xn i n

hn ( xUn )
Eliminate variable xn at Bucketn.
 max  X X  F P( xi , e |  i )  hn ( xUn )
xn1 i n

Process the next bucket recursively.


max P(a, b, c, d , f , g  1)  ?
a ,b , c , d , f

Example
P(a, b, c, d , f , g  1)  P( g  1 | f ) P( f | b, c) P( d | a, b) P(b | a) P(c | a ) P(a)

B C

G
max P(a, b, c, d , f , g  1)  ?
a ,b , c , d , f

Example Consider ordering ACBFDG


A

BucketG max P ( g | f )  hG ( f )
g 1 B C

BucketD max P (d | a, b)  hD (a, b)


d
F
D

BucketF max P ( f | b, c) h ( f )  h (b, c) G


f G F

BucketB max P (b | a )h (a, b) h (b, c)  h (a, c)


b D F B

BucketC max P (c | a )  (a, c)   (a )


c B C

BucketA max P (a ) hC (a)  max P (a, b, c, d , f , g  1)


a a ,b , c , d , f
Bucket Elimination Algorithm
max P(a, b, c, d , f , g  1)  ?
a ,b , c , d , f

Exercise Consider ordering ACBFDG


Bayesian Networks
Bucket Elimination Algorithm

Maximum
A Posteriori (MAP)

大同大學資工所
智慧型多媒體研究室
MAP
Given a belief network, a subset of hypothesized
variables A=(A1, …, Ak), and evidence E=e, the
goal is to determine

a*  arg max P( A  a | E  e)  ?
a

a*  arg max P(a, e)  ?


a
Example
A
Hypothesis (Decision)
Variables B C

G
g=1

(b*, c*)  arg max P(b, c | g  1)  ?


b ,c
MAP
Ordering

d  X 1 , X k , X k 1 , , X n
 A1 , Ak , X k 1 , , X n
 Ak , X n
k 1

Some of them may be observed


d  Ak , X n
k 1

MAP
a  arg max P(a k | e)
*
k
ak

P (a k , e)
 arg max
ak P (e)
 arg max P(a k , e)
ak
d  Ak , X n
k 1

MAP
a  arg max P(a k , e)
*
k
ak

P(a k , e)   xn P(a k , x n
k 1 , e)
k 1

  xn 
n
i 1
P ( xi , e |  i )
k 1

 
n
P (a )  max ak
*
k x nk1 i 1
P ( xi , e |  i )
d  Ak , X n
k 1

MAP
a  arg max P(a k , e)
*
k
ak

Bucket
BucketElimination
Elimination Bucket
BucketElimination
Elimination
for
forMPE
MPE for
forbelief
beliefupdating
updating

 
n
P (a )  max ak
*
k x nk1 i 1
P ( xi , e |  i )
Bucket Elimination Algorithm
Consider ordering CBAFDG
(b*, c*)  arg max P(b, c | g  1)  ?
Example b ,c

BucketG  P( g | f )  
g 1
G (f) A

 P ( d | a, b)  
B C
BucketD D ( a, b )
d
F

BucketF  P ( f | b, c ) 
f
G ( f )   F (b, c )
D

G g=1
BucketA  P(c | a) P(b | a) P(a) 
a
D (a, b)   A (b, c )

BucketB max  F (b, c)  A (b, c )   b (c)


b

BucketC max b (c)  max P (b, c | g  1)


c b ,c
Consider ordering CBAFDG
(b*, c*)  arg max P(b, c | g  1)  ?
Exercise b ,c

BucketG  P( g | f )  
g 1
G (f) A

 P ( d | a, b)  
B C
BucketD D ( a, b )
d
F

BucketF  P ( f | b, c ) 
f
G ( f )   F (b, c )
D

G g=1
BucketA  P(c | a) P(b | a) P(a) 
a
D (a, b)   A (b, c )

BucketB Give
Give the
max  F (b, c)  A (b, c )   b (c)
b
the
max  (c)  max P (b, c | g  1)detail
BucketC c b
b ,c detail

You might also like