Adversarial Search
Game Playing
Chapter 6
Outline
• Games
• Perfect Play
– Minimax decisions
– α-β pruning
• Resource Limits and Approximate Evaluation
• Games of chance
Games
• Environments with very many agents are best viewed as economies rather than
games
Deterministic Games
• Many possible formalizations, one is:
– States: S (start at s0)
– Players: P={1...N} (usually take turns)
– Actions: A (may depend on player / state)
– Transition Function: S × A → S
– Terminal Test: S → {t, f}
– Terminal Utilities: S × P → R
• Plan of attack:
– Computer considers possible lines of play (Babbage, 1846)
– Algorithm for perfect play (Zermelo, 1912; von Neumann, 1944)
– Finite horizon, approximate evaluation (Zuse, 1945; Wiener,
1948; Shannon, 1950)
– First chess program (Turing, 1951)
– Machine learning to improve evaluation accuracy (Samuel,
1952-57)
– Pruning to allow deeper search (McCarthy, 1956)
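The formalization listed above (states, players, actions, transition function, terminal test, terminal utilities) can be made concrete with a small sketch. The game used here is a hypothetical two-player stick game, chosen only for illustration: players alternately take 1 or 2 sticks, and whoever takes the last stick wins. All names (State, actions, result, is_terminal, utility) are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class State:
    sticks: int    # sticks left on the table
    to_move: int   # index of the player to move (0 or 1)


PLAYERS = (0, 1)                     # P = {1...N}, here N = 2 (0-indexed)
START = State(sticks=5, to_move=0)   # s0 (hypothetical starting position)


def actions(s: State) -> List[int]:
    """A: legal actions in state s, here 'take 1 or 2 sticks'."""
    return [n for n in (1, 2) if n <= s.sticks]


def result(s: State, a: int) -> State:
    """Transition function S x A -> S; the turn passes to the other player."""
    return State(s.sticks - a, 1 - s.to_move)


def is_terminal(s: State) -> bool:
    """Terminal test S -> {t, f}: the game ends when no sticks remain."""
    return s.sticks == 0


def utility(s: State, p: int) -> float:
    """Terminal utilities S x P -> R: +1 for the winner, -1 for the loser."""
    winner = 1 - s.to_move   # the player who just took the last stick
    return 1.0 if p == winner else -1.0
```

Any search routine only needs these four functions and the start state, so the same search code could be reused for a different game by swapping them out.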
Deterministic Single-Player?
• Deterministic, single player,
perfect information:
– Know the rules
– Know what actions do
– Know when you win
– E.g. Freecell, 8-Puzzle, Rubik’s
cube
• … it’s just search!
• Slight reinterpretation:
– Each node stores a value: the
best outcome it can reach
– This is the maximal outcome of
its children (the max value)
– Note that we don’t have path
sums as before (utilities at end)
• After search, can pick move that
leads to best node
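The single-player reinterpretation above can be sketched in a few lines. This is a minimal illustration under assumed conventions: the tree is a nested dict mapping move names to children, a leaf is a plain number (the utility at the end), and the names max_value and best_move are hypothetical.

```python
def max_value(node):
    """Value of a node: the best outcome reachable from it."""
    if isinstance(node, (int, float)):   # leaf: utility at the end of the game
        return node
    # Internal node: the maximal value among its children.
    return max(max_value(child) for child in node.values())


def best_move(node):
    """After search, pick the move that leads to the best child node."""
    return max(node, key=lambda move: max_value(node[move]))


# Hypothetical tree: move name -> subtree or terminal utility.
tree = {
    "left": {"a": 3, "b": 5},
    "right": {"c": 4, "d": {"e": 9, "f": 1}},
}
```

Here max_value(tree) is 9 and best_move(tree) is "right", because the best reachable outcome lies under that move. No path costs are summed; only the terminal utilities matter.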
• Algorithm:
1. Generate game tree completely
2. Determine utility of each terminal state
3. Propagate the utility values upward in the tree by applying
MIN and MAX operators on the nodes in the current level
4. At the root node use minimax decision to select the move with
the max (of the min) utility value
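The four steps above can be sketched as follows. This is a minimal illustration, assuming the game tree has already been generated as nested lists whose leaves are terminal utilities, with MAX to move at the root and the players alternating by level. The names minimax_value and minimax_decision are assumptions.

```python
def minimax_value(node, maximizing):
    """Propagate terminal utilities upward with alternating MAX and MIN."""
    if isinstance(node, (int, float)):   # terminal state: return its utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def minimax_decision(root):
    """At the root, select the move (child index) with the max of the min values."""
    return max(range(len(root)), key=lambda i: minimax_value(root[i], False))


# Two-ply example: three MAX moves, each answered by three MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

For this tree the MIN values of the three moves are 3, 2 and 2, so minimax_value(tree, True) is 3 and minimax_decision(tree) returns 0, the first move.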
The α-β algorithm
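A minimal sketch of α-β pruning, under assumed conventions: the game tree is given as nested lists with numeric leaves, MAX moves at the root, and the optional visited list is a hypothetical addition used only to show which leaves are actually examined.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of node, skipping branches that cannot change the result.

    alpha: best value found so far for MAX along the path to the root
    beta:  best value found so far for MIN along the path to the root
    """
    if isinstance(node, (int, float)):   # terminal state
        if visited is not None:
            visited.append(node)         # record the leaf for illustration
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:            # MIN above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:                # MAX above already has something better
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
root_value = alphabeta(tree, True, visited=seen)
```

On this tree the result is 3, the same as plain minimax, but only 7 of the 9 leaves are examined: once the second MIN node sees the leaf 2, its value can be at most 2, which is worse for MAX than the 3 already guaranteed, so the leaves 4 and 6 are never visited.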
Imperfect Real-Time Decisions
Suppose we have 100 secs per move and explore 10⁴ nodes/sec:
100 × 10⁴ = 10⁶ nodes per move
Standard approach:
• cutoff test:
e.g., depth limit (perhaps add quiescence search)
• evaluation function
= estimated desirability of position
• Replace the utility function by a heuristic evaluation
function EVAL, which gives an estimate of the
position’s utility
Evaluation Functions
• First proposed by Shannon in 1950
• The evaluation function should order the
terminal states in the same way as the true utility
function
• The computation must not take too long
• For non-terminal states, the evaluation function
should be strongly correlated with the actual
chances of winning
– Uncertainty introduced by computational limits
Evaluation Functions
• Material value for each piece in chess
– Pawn: 1
– Knight: 3
– Bishop: 3
– Rook: 5
– Queen: 9
These values can be used as weights, with the number of pieces of each kind as
the features
• Other features
– Good pawn structure
– King safety
• These features and weights are not part of the rules of chess; they
come from playing experience
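The material values above can be combined into a weighted sum of features. The sketch below is one possible reading, with two labeled assumptions: each feature is taken as White's piece count minus Black's (so the score is from White's point of view), and positions are represented as simple piece-count dicts. The names MATERIAL_WEIGHTS and material_eval are hypothetical, and other features such as pawn structure or king safety are left out.

```python
# Weights from the material values listed above.
MATERIAL_WEIGHTS = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_eval(white_counts, black_counts):
    """Weighted sum of features: weight * (White's count - Black's count)."""
    return sum(
        weight * (white_counts.get(piece, 0) - black_counts.get(piece, 0))
        for piece, weight in MATERIAL_WEIGHTS.items()
    )


full_set = {"P": 8, "N": 2, "B": 2, "R": 2, "Q": 1}
down_a_rook = {"P": 8, "N": 2, "B": 2, "R": 1, "Q": 1}
```

With these inputs, material_eval(full_set, down_a_rook) is 5: White is ahead by exactly one rook. A knight traded for a bishop scores 0, since both carry weight 3.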
Cutting off search
MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
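The two substitutions above can be sketched directly. This is a minimal illustration, assuming the game is supplied through three callbacks (successors, cutoff, evaluate) whose names are hypothetical. The toy game is also an assumption made up for the example: a state is an integer, a move either adds 1 or doubles it, and the game ends once the number reaches 100.

```python
def minimax_cutoff(state, depth, maximizing, successors, cutoff, evaluate):
    """Minimax with Terminal? replaced by Cutoff? and Utility replaced by Eval."""
    if cutoff(state, depth):             # replaces the terminal test
        return evaluate(state)           # replaces the true utility
    values = [
        minimax_cutoff(s, depth + 1, not maximizing, successors, cutoff, evaluate)
        for s in successors(state)
    ]
    return max(values) if maximizing else min(values)


# Hypothetical toy game, for illustration only.
def successors(n):
    return [n + 1, n * 2]


def evaluate(n):
    return n                             # heuristic: bigger numbers favour MAX


def make_cutoff(limit):
    # Cut off at the depth limit, or earlier when the game is over.
    return lambda n, depth: depth >= limit or n >= 100


two_ply = minimax_cutoff(3, 0, True, successors, make_cutoff(2), evaluate)
one_ply = minimax_cutoff(3, 0, True, successors, make_cutoff(1), evaluate)
```

Starting from 3, a one-ply search returns 6 (MAX simply doubles), while a two-ply search returns 7: after MAX doubles to 6, MIN picks the smaller of 7 and 12. The cutoff test still has to return true at terminal states, which is why the toy cutoff checks both conditions.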