Lesson 3 Gaming

1. The document provides an overview of game theory and game playing concepts including terminology like utility, outcomes, strategies and game trees. 2. It describes the min-max algorithm which is used by players to determine their optimal move, with the first player maximizing their utility and the second minimizing the first player's utility at each turn. 3. An example game tree is presented to illustrate how the min-max algorithm recursively evaluates the game from the bottom up to determine the best opening move.

Uploaded by

thomas mumo
Copyright
© © All Rights Reserved

LESSON 3

GAME PLAYING
Terminology
• A game is a type of conflict in which n individuals or groups, known as players, participate.
• Game theory is a set of ideas and techniques for analyzing conflict situations between two or more parties, where the outcomes are determined by the parties' decisions.
• Rules: the conditions for playing the game.
• Perfect information: all moves are known to every player involved.
• Strategy: a plan of action for whatever situation might arise.
• Move: a choice made by a player.
Terminology used
• Utility: a number indicating the motivation of a player. It is also known as the payoff.
• Outcome: the set of moves or strategies taken by the players, or the utility that results from the actions or strategies taken by all players.
• Utility function: a function for a given player that assigns a number to every possible outcome of the game, where a higher number indicates a more preferred outcome.
Benefits of games

1. Games facilitate learning.
2. Games are used as a form of entertainment.
4. Absorption in a game distracts the mind from pain and discomfort.
5. Games help players improve hand-eye coordination and gain many other skills.
6. Games can improve working memory.
7. Games encourage cooperation and teamwork.
Applications of game theory
1. Military strategy: used to study conflicts of interest resolved through battles, where the outcome or payoff of a war game is either victory or defeat.
2. Social sciences: used in studying the distribution of power in legislative procedures, problems of majority rule, and decision making.
Relation of Games to Search
Search
• There is no adversary
• The solution is a (heuristic) method (path) for reaching the goal
• The evaluation function estimates the cost (path cost) from start to goal through a given node
• Search techniques aim at finding an optimal solution (cheapest path)

Games
• There is an adversary
• The solution is a set of moves leading to a certain outcome
• The utility (evaluation) function evaluates the utility (“goodness”) of a game position
• The objective is to find an approximate solution
Types of board Games
There are various types of board games:
1. Perfect information games
2. Imperfect information games
3. Non-deterministic games
4. Deterministic games
5. Zero-sum games


Types of Games

1. Perfect information games
• Each player has complete knowledge of the game state
• Usually only two players, who take alternate turns
• Examples include Chess, Checkers, Awari, Connect-Four, Go, Othello
Types of Games

2. Imperfect information games
• Some of the game state is hidden
• Examples include: Cards, Poker, Bridge
Types of Games

3. Non-deterministic games
• Games with an element of chance
• The game moves have some stochastic (random) element
• Unpredictable
• For example: tossing a coin, Backgammon, dice rolls
Types of Games

4. Deterministic Game
• Games with no element of chance
• The outcome can be predicted
• For example, Chess, draughts/scramble

5. Zero sum game


• The winnings of one player (or players) are exactly balanced by the losses of the other(s)
• E.g. chess: when one player wins, the other loses
• The win (+1) added to the loss (-1) equals zero
• Betting is a zero-sum game
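The zero-sum property can be checked mechanically. The following is a minimal sketch, assuming a hypothetical payoff table for a two-player game; the outcome names and the `is_zero_sum` helper are illustrative and not part of the lesson.

```python
# Hypothetical payoff table: outcome -> (utility of player 1, utility of player 2).
CHESS_PAYOFFS = {
    "player 1 wins": (1, -1),
    "player 2 wins": (-1, 1),
    "draw": (0, 0),
}


def is_zero_sum(payoffs):
    """Return True if the utilities of all players sum to zero in every outcome."""
    return all(sum(utilities) == 0 for utilities in payoffs.values())


print(is_zero_sum(CHESS_PAYOFFS))
```

A table in which both players can gain at once, such as `{"both win": (1, 1)}`, fails the check.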
Types of Games
Most games fall into more than one category.
Example 1: Draughts
Is a computer more intelligent if it beats you in a game of draughts?
Chess
1997: IBM Deep Blue defeats Kasparov.
There is still debate about whether computers are really better.
Example 2: RoboCup
Representing games
• Game play is represented on a rooted graph.
• Example:
Representing games
• The game starts in some initial state at the root of the game tree.
• To get to the next level, player one chooses a move: A, B, C, or D.
• To get to the next level, player two makes a move, etc.
• Each level of the tree is called a ply.
Representing games
• The objective of player one is to find which move to take to ensure he/she reaches one of the “W” states.

NB: we cannot just work out a strategy and fix it beforehand, because our opponent can do whatever it wants and mess up our plan.
Terminology in graphs
• Branching factor (b): the number of outgoing edges from a single node.
• In a game graph, this corresponds to the number of possible moves a player can make.
• Ply: a level of the game tree. When a player makes a move, the game moves to the next ply.
• Depth (d): how many plies we need to go down the game tree, or how many moves the game takes to complete.
• In chess this is around 40.
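To get a feel for how quickly a game tree grows with b and d, the sketch below counts the nodes of a complete tree. It assumes every node has exactly b children down to depth d, which real games only approximate; `count_nodes` and `count_leaves` are illustrative names.

```python
def count_leaves(b, d):
    """Number of positions at depth d when every node has b children."""
    return b ** d


def count_nodes(b, d):
    """Total number of nodes from the root (depth 0) down to depth d."""
    return sum(b ** level for level in range(d + 1))


# A tiny tree: branching factor 3, depth 2 -> 1 + 3 + 9 nodes.
print(count_nodes(3, 2))
# The same branching factor at depth 10 is already far larger.
print(count_nodes(3, 10))
```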


Terminology in graphs
• Nodes: represent game “states” (e.g. game position, score, etc.)
• Edges: represent a move by a player that takes the game from one state to another.
Game search
• Game search involves finding a winning move.

• Most searches build the tree to some preset depth and then use a utility (evaluation) function to estimate the value of the game positions at that depth.
• A heuristic value is then assigned to each of the remaining positions (states) by propagating the evaluated values upwards.
• Values at the leaves indicate the final values to be gained for each path at the end of the game.
Game search vs search problem
• Search problem: the aim is to reach a goal state in a graph and to find the path from the start state to the goal state.
• Game search: the aim is to find a winning move. The path taken might change, since we cannot control what the opponent does.
Game Search algorithms
• Min-Max algorithm
• Alpha-Beta Pruning algorithm
Mini-max algorithm
• The min-max algorithm is used in games with two players, where the objective of player 1 is to maximize his or her utility.
• The objective of player 2 is to minimize the utility of player 1.
• The player who intends to maximize utility is called ‘max’.
• The player who intends to minimize the utility of ‘max’ is called ‘min’.
• Because the players alternate moves, the algorithm alternates between minimizing and maximizing levels of the recursive search tree.
Min-Max algorithm
• The algorithm is called Min-Max because the computer makes the moves that give it the maximum gain, while assuming that the opponent (user) makes the moves that give the computer the minimum gain.
• To implement the algorithm we need a method of measuring how good a position is.
• The goodness of a position is called the utility of the position
• e.g. outcome of a game: win 1, loss -1, draw 0
• The function for measuring utility is called a utility function.
Mini-max algorithm Example:

• In this example Helen acts as player 1 and Stavros acts as player 2.
• Helen (player 1) has a certain number of coins (utility) at each state.
• The opponent (Stavros) is poor and never gets any money, but he doesn’t want Helen (player 1) to get any richer, since she will keep ordering him around.
Mini-max algorithm
• The aim of player 1 (Helen) is to maximize the number of coins she has, while the aim of the opponent (Stavros) is to minimize the wealth of player 1 (Helen).
• The two players use the mini-max algorithm to play the game, which requires that at each level player 1 selects the move leading to the greatest value (more wealth).
• The opponent moves to the minimum-valued state, which minimizes the wealth of player 1 (Helen).
Algorithm
The algorithm makes a tree of all possible moves for both
players.
Algorithm
• The values at the leaves are the utilities of the game corresponding to the paths leading to those nodes.
• Let's say Helen is the first player to move.
• So she wants to take the option (A, B, C) that will maximize her score.
• But she knows that at the next level down Stavros will try to minimize the score, and so on.
• So we must fill in the values of the tree recursively, from the bottom up, to find the best move.
Algorithm
Algorithm
• Stavros minimizes
Algorithm
• Helen maximizes
Algorithm
• This game tree assumes that each player is rational (will always make the optimal, or best, move).
• For instance, Helen is doing the best she can given that Stavros is doing the best he can.
• If the opponent (Stavros) doesn’t do the best he can, then player 1 (Helen) will be even better off!
Min-max: Example
• Restrictions:
– 2 players: MAX (computer) and MIN (opponent)
– deterministic, perfect information
• Select a depth-bound (say: 2) and an evaluation function
• Construct the tree up to the depth-bound (e.g. 2 levels)
• Compute the evaluation function for the leaves
• Propagate the evaluation function upwards:
– taking minima in MIN
– taking maxima in MAX
[Figure: MAX root = 3 (select this move); MIN level = 2, 1, 3; leaf evaluations = 2 5 3 1 4 4 3]
Game Playing - Minimax example
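[Figure: game tree with three levels above the terminal positions]
MAX: A = 1 (select the move towards B, whose value is 1)
MIN: B = 1, C = -3
MAX: D = 4, E = 1, F = 2, G = -3
Terminal positions: 4 -5 | -5 1 | -7 2 | -3 -8
Legend: terminal position; agent; opponent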
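The values in this example can be reproduced with a few lines of code. This is a minimal sketch, assuming the tree is stored as nested Python lists with numbers at the terminal positions; the list encoding and the `minimax` helper are illustrative choices, not the lesson's own implementation.

```python
def minimax(node, maximizing):
    """Return the minimax value of a node.

    A node is either a number (terminal position) or a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Tree from the example: A (MAX) -> B, C (MIN) -> D, E, F, G (MAX) -> terminals.
B = [[4, -5], [-5, 1]]    # D, E
C = [[-7, 2], [-3, -8]]   # F, G
A = [B, C]

print(minimax(A, True))   # value of the root A
print(minimax(B, False))  # value of MIN node B
print(minimax(C, False))  # value of MIN node C

# Index of the best first move for MAX (0 = towards B, 1 = towards C).
best_move = max(range(len(A)), key=lambda i: minimax(A[i], False))
print(best_move)
```

The root value equals the value of B, so under these assumptions MAX moves towards B.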


GAME TREE
[Figures: nine slides showing a game tree with alternating MAX, MIN, MAX, MIN levels]
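MINIMAX ALGORITHM
import math

# end_state(node), value(node) and node.children() are supplied by the game.

def max_value(node):
    # MAX to move: take the largest value among the children.
    if end_state(node):
        return value(node)
    v = -math.inf
    for child in node.children():
        v = max(v, min_value(child))
    return v

def min_value(node):
    # MIN to move: take the smallest value among the children.
    if end_state(node):
        return value(node)
    v = math.inf
    for child in node.children():
        v = min(v, max_value(child))
    return v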
MINIMAX ALGORITHM

function minimax(node, depth, maximizingPlayer) is
    if depth = 0 or node is a terminal node then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, minimax(child, depth − 1, FALSE))
        return value
    else (* minimizing player *)
        value := +∞
        for each child of node do
            value := min(value, minimax(child, depth − 1, TRUE))
        return value
HEURISTIC EVALUATION FUNCTIONS
[Figure: game tree with alternating MAX and MIN levels, labelled "estimates of the value of the position"]
HEURISTIC EVALUATION FUNCTIONS
• The quality (accuracy) of the heuristic will affect the outcome:
– the better the heuristic => the better the outcome
• Consequently, you can measure the quality of the heuristic by looking at the outcome in a number of games:
– the better the outcomes => the better the heuristic
• Sometimes even a good player loses to a bad player, so comparing heuristics is not easy
• A common technique for player ranking: Elo rating
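The lesson only names Elo rating. As a rough illustration, the sketch below uses the standard Elo formulas: the expected score of a player is 1 / (1 + 10^((opponent rating - own rating) / 400)), and the rating moves by K times the difference between the actual and the expected score. The value K = 32 and the function names are assumptions made for this example.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating, expected, actual, k=32):
    """New rating after one game; actual is 1 for a win, 0.5 draw, 0 loss.

    k = 32 is an assumed value; different rating systems use different k.
    """
    return rating + k * (actual - expected)


# Two equally rated heuristics: each is expected to score 0.5.
e = expected_score(1500, 1500)
print(e)
# The winner gains rating, the loser gives up the same amount.
print(update_rating(1500, e, 1.0), update_rating(1500, e, 0.0))
```

Because a single game moves the rating only a little, an occasional loss by the better heuristic does not overturn the ranking.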


ANOTHER APPROACH
“Sum to 2” game
• Player 1 moves, then player 2, finally player 1 again
• Move = 0 or 1
• Player 1 wins if and only if all moves together sum to 2
[Figure: game tree. Player 1 chooses 0 or 1, then Player 2 chooses 0 or 1, then Player 1 chooses 0 or 1.]
Leaf utilities, left to right (moves 000, 001, 010, 011, 100, 101, 110, 111): -1 -1 -1 1 -1 1 1 -1
Player 1’s utility is in the leaves; player 2’s utility is the negative of this
Backward induction (aka. minimax)
• From leaves upward, analyze best decision for player at node, give
node a value
• Once we know values, easy to find optimal action (choose best value)

[Figure: the same tree with backed-up values. Player 1 nodes on the third level: -1, 1, 1, 1; Player 2 nodes: -1, 1; root (Player 1): 1.]
Leaf utilities, left to right: -1 -1 -1 1 -1 1 1 -1
Modified game
• From leaves upward, analyze best decision for player at node, give node a
value

[Figure: the same tree with modified leaf utilities. Player 1 nodes on the third level: -1, 4, 6, 7; Player 2 nodes: -1, 6; root (Player 1): 6.]
Leaf utilities, left to right: -1 -2 -3 4 -5 6 7 -8
A recursive implementation
• Value(state)
    If state is terminal, return its value
    If (player(state) = player 1)
        v := -infinity
        For each action
            v := max(v, Value(successor(state, action)))
        Return v
    Else
        v := infinity
        For each action
            v := min(v, Value(successor(state, action)))
        Return v
• Space? Time?
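The recursive scheme above can be tried on the “Sum to 2” game. This is a minimal sketch, assuming a state is represented as the tuple of moves made so far; the helper names (`is_terminal`, `utility`, `player`, `successor`) mirror the pseudocode but their bodies are assumptions made for this example.

```python
import math

MOVES = (0, 1)
TOTAL_MOVES = 3  # player 1, then player 2, then player 1 again


def is_terminal(state):
    return len(state) == TOTAL_MOVES


def utility(state):
    """Player 1's utility: 1 if all moves sum to 2, otherwise -1."""
    return 1 if sum(state) == 2 else -1


def player(state):
    """Player 1 moves after an even number of moves, player 2 otherwise."""
    return 1 if len(state) % 2 == 0 else 2


def successor(state, action):
    return state + (action,)


def value(state):
    if is_terminal(state):
        return utility(state)
    if player(state) == 1:
        v = -math.inf
        for action in MOVES:
            v = max(v, value(successor(state, action)))
        return v
    v = math.inf
    for action in MOVES:
        v = min(v, value(successor(state, action)))
    return v


print(value(()))                   # value of the game for player 1
print(value((0,)), value((1,)))    # values after player 1 opens with 0 or 1
```

Opening with 1 guarantees player 1 a win in this game, while opening with 0 lets player 2 force a loss.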
Do we need to see all the leaves?
• Do we need to see the value of the question mark here?

[Figure: the same game tree, with only some leaf values shown]
Leaf utilities shown, left to right: -1 -2 ? 4
Do we need to see all the leaves?
• Do we need to see the values of the question marks here?

[Figure: the same game tree, with two leaf values hidden]
Leaf utilities, left to right: -1 -2 ? ? -5 6 7 -8
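One way to answer the question is to try many values for the two hidden leaves and see whether the value at the root ever changes. The sketch below does this by brute force for the modified game; the nested-list encoding, the helper names and the candidate range of -10 to 10 are assumptions made for this example.

```python
from itertools import product


def minimax(node, maximizing):
    """Minimax value of a node given as a number or a list of children."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def modified_game(q1, q2):
    """Modified game tree with the two question-mark leaves set to q1, q2."""
    return [[[-1, -2], [q1, q2]], [[-5, 6], [7, -8]]]


candidates = range(-10, 11)
root_values = {
    minimax(modified_game(q1, q2), True)
    for q1, q2 in product(candidates, repeat=2)
}
print(root_values)
```

The root value is the same for every candidate pair: the left Player 2 node can never be worth more than -1, while the right one is worth 6.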
ALPHA-BETA PRUNING
Alpha-Beta Pruning
• The mini-max algorithm involves a lot of computation if the tree is large.
• For instance, a tree representing a game with 3640 nodes.
• This calls for a way to reduce the computational work when searching for the best move,
• for example an algorithm that avoids having to look at the entire tree (game).
• Such an algorithm is called alpha-beta pruning.


Alpha-Beta Pruning

• A generally applied optimization of Mini-max.
• Instead of:
  • first creating the entire tree (up to depth-level)
  • then doing all propagation
• interleave the generation of the tree and the propagation of values.
• Point:
  • some of the obtained values in the tree will provide information so that other (non-generated) parts are redundant and do not need to be generated.
Alpha-Beta idea:
• Principles:
– generate the tree depth-first, left-to-right
– propagate final values of nodes as initial estimates for their parent node.
- The MIN-value (1) is already smaller than the MAX-value of the parent (2)
- The MIN-value can only decrease further,
- The MAX-value is only allowed to increase,
- No point in computing further below this node
[Figure: MAX root ≥ 2; MIN children = 2 and ≤ 1; leaves 2 5 under the first MIN node and 1 under the second]
Terminology:
- The (temporary) values at MAX-nodes are ALPHA-values
- The (temporary) values at MIN-nodes are BETA-values

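[Figure: the same tree. The value ≥ 2 at the MAX node is marked as the Alpha-value; the value ≤ 1 at the second MIN node is marked as the Beta-value; leaves 2 5 and 1.]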
The Alpha-Beta principles (1):
- If an ALPHA-value is larger than or equal to the Beta-value of a descendant node:
stop generation of the children of the descendant (prune)
[Figure: MAX root ≥ 2 (Alpha-value); MIN children = 2 and ≤ 1 (Beta-value); leaves 2 5 and 1]
The Alpha-Beta principles (2):
- If a Beta-value is smaller than or equal to the Alpha-value of a descendant node:
stop generation of the children of the descendant (prune)

MAX

MIN

MAX 4
2 5 3
Alpha-Beta Pruning example

MAX A

MIN <=6 B C

MAX 6 D >=8 E

H I J K
6 5 8
= opponent
= agent
Alpha-Beta Pruning example
MAX >=6 A

MIN 6 B <=2 C

MAX 6 D >=8 E 2 F G

H I J K L M
6 5 8 2 1

= agent = opponent
Alpha-Beta Pruning example
>=6 A

MAX

6 B 2 C
MIN

6 D >=8 E 2 F G

MAX

H I J K L M
6 5 8 2 1

= agent = opponent
Alpha-Beta Pruning example
MAX 6 A

MIN 6 B 2 C beta
cutoff

MAX 6 D >=8 E alpha 2 F G


cutoff

H I J K L M
6 5 8 2 1

= agent = opponent
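The example above can be replayed in code. This is a minimal sketch, assuming the tree A to M is stored as nested lists of `(name, value)` leaves; the values of K and of everything below G are never shown on the slides, so they are marked as `None` here and the sketch raises an error if the search ever tries to evaluate them.

```python
import math


def leaf_value(leaf, visited):
    """Return the value of a (name, value) leaf and record the visit."""
    name, val = leaf
    if val is None:
        # Hidden on the slides: reaching it would mean pruning did not happen.
        raise RuntimeError(f"{name} was evaluated but should have been pruned")
    visited.append(name)
    return val


def max_value(node, alpha, beta, visited):
    if isinstance(node, tuple):
        return leaf_value(node, visited)
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta, visited))
        alpha = max(alpha, v)
        if alpha >= beta:
            return v  # remaining children cannot change the result
    return v


def min_value(node, alpha, beta, visited):
    if isinstance(node, tuple):
        return leaf_value(node, visited)
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta, visited))
        beta = min(beta, v)
        if alpha >= beta:
            return v  # remaining children cannot change the result
    return v


D = [("H", 6), ("I", 5)]
E = [("J", 8), ("K", None)]   # K is never shown
F = [("L", 2), ("M", 1)]
G = ("G", None)               # nothing below G is ever shown
A = [[D, E], [F, G]]          # A -> B (D, E) and C (F, G)

visited = []
root = max_value(A, -math.inf, math.inf, visited)
print(root, visited)
```

The run finishes without touching K or G, which matches the two cutoffs drawn on the slides.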
Mini-Max with α-β at work:
[Figure: game tree with MAX, MIN and MAX levels above the leaves, three children per node, annotated with the order of the 39 search steps. Root value = 5; MIN nodes = 4, = 5 and ≤ 3.]
Leaf values, left to right: 8 7 3 | 9 1 6 | 2 4 1 | 1 3 5 | 3 9 2 | 6 5 2 | 1 2 3 | 9 7 2 | 8 6 4
11 static evaluations saved !!
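The count of saved evaluations can be checked by running alpha-beta on the 27 leaf values of this slide. This is a minimal sketch, assuming the leaves are grouped three per node on every level (the grouping is inferred from the step numbers on the slide); the `alphabeta` function and the `evaluated` list are illustrative names.

```python
import math


def alphabeta(node, alpha, beta, maximizing, evaluated):
    """Alpha-beta search that records every leaf it statically evaluates."""
    if not isinstance(node, list):
        evaluated.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, evaluated))
            alpha = max(alpha, v)
            if alpha >= beta:
                break  # prune the remaining children
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, evaluated))
        beta = min(beta, v)
        if alpha >= beta:
            break  # prune the remaining children
    return v


leaves = [8, 7, 3, 9, 1, 6, 2, 4, 1,
          1, 3, 5, 3, 9, 2, 6, 5, 2,
          1, 2, 3, 9, 7, 2, 8, 6, 4]
# Assumed shape: root (MAX) -> 3 MIN nodes -> 3 MAX nodes each -> 3 leaves each.
groups = [leaves[i:i + 3] for i in range(0, len(leaves), 3)]
tree = [groups[i:i + 3] for i in range(0, len(groups), 3)]

evaluated = []
root = alphabeta(tree, -math.inf, math.inf, True, evaluated)
saved = len(leaves) - len(evaluated)
print(root, len(evaluated), saved)
```

Under this grouping the search evaluates 16 leaves, so 11 of the 27 static evaluations are skipped, in line with the slide.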
ALPHA-BETA PRUNING
[Figures: four slides showing a game tree with MAX, MIN, MAX, MIN levels, annotated "MIN-VALUE ≤ 1 < MAX-VALUE = 3"]
ALPHA-BETA PRUNING
import math

# alpha: lower bound of the value of the game
# beta: upper bound of the value of the game
# end_state(node), value(node) and node.children() are supplied by the game.

def max_value(node, alpha, beta):
    if end_state(node):
        return value(node)
    v = -math.inf
    for child in node.children():
        v = max(v, min_value(child, alpha, beta))
        alpha = max(alpha, v)
        if alpha >= beta:
            return v  # prune the remaining children
    return v

def min_value(node, alpha, beta):
    if end_state(node):
        return value(node)
    v = math.inf
    for child in node.children():
        v = min(v, max_value(child, alpha, beta))
        beta = min(beta, v)
        if alpha >= beta:
            return v  # prune the remaining children
    return v
ALPHA-BETA PRUNING
[Figures: step-by-step trace of alpha-beta pruning on a game tree with MAX, MIN, MAX, MIN levels. Values shown across the slides include α = 0, α = 1 and α = 6 at a MAX level, then β = 6 with α = 3, and later β = 3 and α = 3. Several slides repeat the min_value(node, alpha, beta) pseudocode next to the tree.]
Summary
• The alpha-beta algorithm computes the same result as minimax, but is more efficient since it prunes irrelevant branches.
• To perform a good and efficient search, do the following:
• select a good search method/technique
• provide information/heuristics if possible
• prune irrelevant branches
Exercise
• Consider the following tree representing a game between Helen and Stavros
Exercise
i) Apply the alpha-beta pruning algorithm to it and show the
search tree that would be built by this algorithm.

ii) How many nodes did you not have to visit with alpha-beta
pruning when compared to the full MIN-MAX search?
