CC511 Week 4
• Alpha-Beta Pruning
– The presence of an adversary leads to an advantage in search!
Types of Games
[Table: classification of games; surviving entries: Battleship, Kriegspiel]
• Utility values for each agent are the opposite of the other
– This creates the adversarial situation
• Search – no adversary
– Solution is (heuristic) method for finding goal
– Heuristics and CSP techniques can find optimal solution
– Evaluation function: estimate of cost from start to goal through given node
– Examples: path planning, scheduling activities
• Games – adversary
– Solution is strategy
• strategy specifies move for every possible opponent reply.
– Time limits force an approximate solution
– Evaluation function: evaluate “goodness” of game position
– Examples: chess, checkers, Othello, backgammon
Games as Search
• MAX moves first and they take turns until the game is over
– Winner gets reward, loser gets penalty.
– “Zero sum” means the sum of the reward and the penalty is a constant.
Minimax is designed to find the optimal strategy for MAX and the best move:
– Minimax maximizes MAX's utility under the worst-case outcome for MAX
• Complete?
– Yes (if tree is finite).
• Optimal?
– Yes (against an optimal opponent).
– Can it be beaten by an opponent playing sub-optimally?
• No.
• Time complexity?
– O(b^m), where b is the branching factor and m is the maximum depth of the tree
• Space complexity?
– O(bm) (depth-first search, generate all actions at once)
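The minimax idea above can be sketched in a few lines. This is a minimal illustration, assuming the game tree is given as nested Python lists in which a number is a terminal utility for MAX. The function names `minimax` and `best_move` and the three-branch example tree are hypothetical, chosen only for illustration.

```python
def minimax(node, maximizing):
    """Return the minimax value of `node`.

    Assumed representation: a number is a terminal utility for MAX,
    a list is an internal node holding its children.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(root):
    """Index of the child of a MAX root with the highest minimax value."""
    values = [minimax(child, False) for child in root]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical tree: MAX root, three MIN children, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3
print(best_move(tree))      # 0
```

The recursion visits every leaf once, which is where the O(b^m) time bound comes from; only the current path is held in memory at any moment.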
Game Tree Size
• Tic-Tac-Toe
– b ≈ 5 legal actions per state on average, total of 9 plies in game.
• “ply” = one action by one player, “move” = two plies.
– 5^9 = 1,953,125
– 9! = 362,880 (Computer goes first)
– 8! = 40,320 (Computer goes second)
exact solution quite reasonable
• Chess
– b ≈ 35 (approximate average branching factor)
– d ≈ 100 (depth of game tree for “typical” game)
– b^d ≈ 35^100 ≈ 10^154 nodes!!
exact solution completely infeasible
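The figures above can be checked with a short calculation. This sketch only reproduces the arithmetic from the slide; the variable names are illustrative.

```python
import math

# Tic-Tac-Toe: rough bound b^d with b ≈ 5 and 9 plies.
tic_tac_toe_bound = 5 ** 9
# Tic-Tac-Toe: orderings of moves when the computer goes first / second.
computer_first = math.factorial(9)
computer_second = math.factorial(8)

# Chess: b ≈ 35, d ≈ 100, expressed as a power of ten.
chess_exponent = 100 * math.log10(35)

print(tic_tac_toe_bound)      # 1953125
print(computer_first)         # 362880
print(computer_second)        # 40320
print(round(chess_exponent))  # 154
```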
• An Evaluation Function:
– Estimates how good the current board configuration is for a player.
– Typically, evaluate how good it is for the player, how good it is for
the opponent, then subtract the opponent’s score from the player’s.
– Often called “static” because it is called on a static board position.
– Othello: Number of white pieces - Number of black pieces
– Chess: Value of all white pieces - Value of all black pieces
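A minimal sketch of the two static evaluation functions above. The piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9) are a common convention and are an assumption here, since the slides do not give them. The board encodings (uppercase for White and lowercase for Black in chess, "W"/"B" characters for Othello) and the function names are also hypothetical.

```python
# Assumed conventional piece values; the king is not counted as material.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_eval(pieces):
    """Value of all white pieces minus value of all black pieces.

    `pieces` is a string of piece letters: uppercase = White, lowercase = Black.
    """
    score = 0
    for piece in pieces:
        value = PIECE_VALUES[piece.lower()]
        score += value if piece.isupper() else -value
    return score


def othello_eval(board):
    """Number of white pieces minus number of black pieces."""
    return board.count("W") - board.count("B")


# White: K Q R R B N + 4 pawns (29); Black: k q r b n + 3 pawns (23).
print(material_eval("KQRRBNPPPP" + "kqrbnppp"))  # 6
print(othello_eval("WWBW.B.."))                  # 1
```

Both functions score the position from White's point of view; a positive value favours White and a negative value favours Black.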
Backup Values
Another Alpha-Beta Example
[Figure sequence: game tree with each node annotated by its current range of possible values]
• Start: root (−∞, +∞); first MIN node (−∞, +∞)
• First MIN node narrows to (−∞, 3]
• First MIN node settles at [3, 3]; root becomes [3, +∞)
• Second MIN node starts at [3, +∞) and reaches [3, 2]: this node is worse for MAX
• Prune whenever α ≥ β.
– Prune below a Max node whose alpha value becomes greater than or equal to the beta value of its ancestors.
• Max nodes update alpha based on children’s returned values.
– Prune below a Min node whose beta value becomes less than or equal to the alpha value of its ancestors.
• Min nodes update beta based on children’s returned values.
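The pruning rules above can be sketched as follows. This is a minimal illustration and not the slides' own pseudocode. It assumes a tree of nested lists with numeric leaves. The `visited` list is a hypothetical addition used only to show which leaves get evaluated. The example tree is also an assumption: a MAX root over three MIN nodes, with leaf values chosen for illustration.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of `node`, skipping branches where alpha >= beta."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)  # record each leaf that is actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # Max nodes update alpha
            if alpha >= beta:
                break  # prune the remaining children
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)  # Min nodes update beta
        if alpha >= beta:
            break  # prune the remaining children
    return value


# Hypothetical tree: MAX root, three MIN children, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(alphabeta(tree, True, visited=visited))  # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

In this run the second MIN node stops after its first leaf: its beta drops to 2 while alpha is already 3, so the leaves 4 and 6 are never evaluated.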
Pseudocode for Alpha-Beta Algorithm
[Figure: pseudocode and annotated game tree not recoverable; the surviving annotations give the trace below]
Alpha-Beta Example
• Root (MAX): α = −∞, β = +∞. α, β passed to kids.
• First MIN node: starts with α = −∞, β = +∞. MIN updates β based on kids: β = 3.
• Remaining kids of the first MIN node: no change. 3 is returned as node value.
• Root: α = 3, β = +∞. α, β passed to kids.
• Second MIN node: starts with α = 3, β = +∞. MIN updates β based on kids: β = 2.
• Now α ≥ β (3 ≥ 2), so prune.
• Third MIN node: starts with α = 3, β = +∞. MIN updates β based on kids: β = 14, then β = 5.
• 2 is returned as node value.
Effectiveness of Alpha-Beta Search
• Worst-Case
– branches are ordered so that no pruning takes place. In this case
alpha-beta gives no improvement over exhaustive search
• Best-Case
– each player’s best move is the left-most child (i.e., evaluated first)
• In practice
– performance is closer to the best case than to the worst case
– E.g., sort moves before expanding them
– E.g., run Iterative Deepening search and sort by the values from the last iteration.
Example
[Figure: game tree with Max and Min levels; leaf values as extracted: 5 6 / 3 4 1 2 7 8]
Answer to Example
Answer: NONE! No nodes are pruned, because the most favorable nodes for both players are explored last (i.e., in the diagram, they are on the right-hand side).
Second Example
(the exact mirror image of the first example)
[Figure: game tree with Max and Min levels; leaf values as extracted: 3 4 / 6 5 8 7 2 1]
Answer to Second Example
Answer: LOTS! Many nodes are pruned, because the most favorable nodes for both players are explored first (i.e., in the diagram, they are on the left-hand side).
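The effect of move ordering in the two examples can be checked with a small experiment. This is a sketch under stated assumptions: the original figures did not survive, so the tree shape (a Max root over two Min nodes, each over two Max nodes with two leaves) and the left-to-right leaf orders 3 4 1 2 7 8 5 6 and 6 5 8 7 2 1 3 4 are reconstructions, not confirmed by the slides. The function name and the `seen` list are illustrative.

```python
import math


def alphabeta(node, maximizing, alpha, beta, seen):
    """Alpha-beta value of `node`; appends each evaluated leaf to `seen`."""
    if isinstance(node, int):
        seen.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta(child, not maximizing, alpha, beta, seen)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # remaining children cannot affect the result
    return value


# Assumed reconstruction of the two example trees (Max root, Min, Max, leaves).
first = [[[3, 4], [1, 2]], [[7, 8], [5, 6]]]   # best moves explored last
second = [[[6, 5], [8, 7]], [[2, 1], [3, 4]]]  # best moves explored first

for name, tree in (("first", first), ("second", second)):
    seen = []
    value = alphabeta(tree, True, -math.inf, math.inf, seen)
    print(name, "value:", value, "leaves evaluated:", len(seen), "of 8")
# first value: 6 leaves evaluated: 8 of 8
# second value: 6 leaves evaluated: 5 of 8
```

Under these assumptions both trees have the same value, but the first ordering evaluates every leaf while the second skips three of them.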
Summary
• Game playing is best modeled as a search problem