Game Theory: Zhixin Liu

This document provides a summary of a lecture on game theory. It begins with an introduction to game theory and its analysis of strategic interactions between rational agents. It then discusses some key concepts in game theory, including Nash equilibrium, mixed strategies, dominance, and iterations of games. It provides examples like Rock-Paper-Scissors and the prisoner's dilemma to illustrate various game theory solutions and strategies.
Lecture V: Game Theory

Zhixin Liu
Complex Systems Research Center,
Academy of Mathematics and Systems Sciences, CAS

In the last two lectures, we talked about

Multi-Agent Systems
Analysis
Intervention

In this lecture, we will talk about

Game theory: complex interactions between people

Start With A Game


Rock-paper-scissors

Payoffs are (A, B).

                       B
              rock     paper    scissors
A  rock       0,0      -1,1     1,-1
   paper      1,-1     0,0      -1,1
   scissors   -1,1     1,-1     0,0

Other games: poker, go, chess, bridge, basketball, football, ...

From Games To Game Theory

Some hints from the games:

Rules
Results (payoff)
Strategies
Interactions between strategies and payoff

Games are everywhere.

Economic systems: oligopoly, monopoly, market, trade
Political systems: voting, presidential election, international relations
Military systems: war, negotiation, ...

Game theory: the study of the strategic interactions among rational agents.

Rationality implies that each player tries to maximize his or her payoff, not to beat the other players.

History of Game Theory


1928, John von Neumann proved the minimax theorem
1944, John von Neumann & Oskar Morgenstern, Theory of Games and Economic Behavior
1950s, John Nash, Nash Equilibrium
1970s, John Maynard Smith, Evolutionarily stable strategy

Eight game theorists have won Nobel prizes in economics

Elements of A Game
Player: Who is interacting? $N = \{1, 2, \dots, n\}$

Actions/Moves: What can the players do?
Action set: $A_i = \{a_{i1}, a_{i2}, \dots, a_{il_i}\}$

Payoff: What can the players get from the game?
$u_i : \prod_{j=1}^{n} A_j \to \mathbb{R}$

Strategy

Strategy: a complete plan of actions

Mixed strategy: a probability distribution over the pure strategies
(A pure strategy is a special kind of mixed strategy.)

$S_i = \{\, s_i : s_i = (s_{i1}, s_{i2}, \dots, s_{il_i}),\ s_{ij} \ge 0,\ \sum_{j=1}^{l_i} s_{ij} = 1 \,\}$

Payoff:
$u_k(s_1, s_2) = \sum_{i} \sum_{j} s_{1i}\, s_{2j}\, u_k(a_{1i}, a_{2j}), \quad k = 1, 2.$

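An Example: Rock-paper-scissors

Players: A and B
Actions/Moves: {rock, scissors, paper}
Payoff:
$u_1(\text{rock}, \text{scissors}) = 1$
$u_2(\text{scissors}, \text{paper}) = -1$

                       B
              rock     paper    scissors
A  rock       0,0      -1,1     1,-1
   paper      1,-1     0,0      -1,1
   scissors   -1,1     1,-1     0,0

Mixed strategies:
$s_1 = (1/3, 1/3, 1/3)$
$s_2 = (0, 1/2, 1/2)$
$u_1(s_1, s_2) = \tfrac{1}{3}\big(0 \cdot 0 + \tfrac{1}{2} \cdot (-1) + \tfrac{1}{2} \cdot 1\big) + \tfrac{1}{3}\big(0 \cdot 1 + \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot (-1)\big) + \tfrac{1}{3}\big(0 \cdot (-1) + \tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 0\big) = 0$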
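The expected-payoff sum for mixed strategies can be checked numerically. The following is a minimal sketch, assuming the rock-paper-scissors payoffs of player A given above; the names `U1` and `expected_payoff` are illustrative and not part of the lecture.

```python
from fractions import Fraction

# Payoffs to player A (rows) in rock-paper-scissors; B receives the negative.
# Row and column order: rock, paper, scissors.
U1 = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def expected_payoff(payoff, s1, s2):
    """Sum over i, j of s1[i] * s2[j] * payoff[i][j]."""
    return sum(
        s1[i] * s2[j] * payoff[i][j]
        for i in range(len(s1))
        for j in range(len(s2))
    )

third, half = Fraction(1, 3), Fraction(1, 2)
s1 = [third, third, third]
s2 = [Fraction(0), half, half]
print(expected_payoff(U1, s1, s2))  # prints 0
```

Exact fractions are used instead of floats so that the result is exactly 0 rather than a rounding residue.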

Classifications of Games

Cooperative and non-cooperative games
Cooperative game: players are able to form binding commitments.
Non-cooperative game: the players make decisions independently.

Zero-sum and non-zero-sum games
Zero-sum game: the total payoff to all players is zero. E.g., poker, go, ...
Non-zero-sum game: e.g., the prisoner's dilemma

Finite and infinite games
Finite game: the players and the actions are finite.

Simultaneous and sequential (dynamic) games
Simultaneous game: players move simultaneously, or if they do not move simultaneously, the later players are unaware of the earlier players' actions.
Sequential game: later players have some knowledge about earlier actions.

Perfect information and imperfect information games
Perfect information game: all players know the moves previously made by all other players. E.g., chess, go, ...
Perfect information ≠ complete information. Complete information: every player knows the strategies and payoffs of the other players, but not necessarily the actions.

We will first focus on games that are:

Simultaneous
Complete information
Non-cooperative
Finite

What is the solution of the game?

Assumption

Assume that each player:
knows the structure of the game;
attempts to maximize his or her payoff;
attempts to predict the moves of his or her opponents;
knows that all of this is common knowledge among the players.

Dominated Strategy
A strategy is dominated if, regardless of what the other players do, it earns a player a smaller payoff than some other strategy.

$S_{-i}$: the set of strategy profiles of all players except player $i$

Strategy $s'$ of player $i$ is called a strictly dominated strategy if there exists a strategy $s^*$ such that
$u_i(s^*, s_{-i}) > u_i(s', s_{-i}), \quad \forall s_{-i} \in S_{-i}$

Elimination of Dominated Strategies


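Example (payoffs are row player, column player):

           L       middle   R
U          4,3     5,1      6,2
middle     2,1     8,4      3,6
bottom     3,0     9,6      2,8

The middle column is strictly dominated by R for the column player, so it is removed:

           L       R
U          4,3     6,2
middle     2,1     3,6
bottom     3,0     2,8

The middle and bottom rows are now strictly dominated by U for the row player:

           L       R
U          4,3     6,2

R is now strictly dominated by L:

           L
U          4,3

(U,L) is the solution of the game.

A dominant strategy may not exist!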
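The elimination procedure in this example can be sketched in code. This is a minimal illustration, assuming the 3x3 payoff table reconstructed from the slide (rows for one player, columns for the other) and considering only domination by pure strategies; `GAME` and `eliminate` are hypothetical names.

```python
# Each cell is (payoff to the row player, payoff to the column player).
GAME = [
    [(4, 3), (5, 1), (6, 2)],
    [(2, 1), (8, 4), (3, 6)],
    [(3, 0), (9, 6), (2, 8)],
]

def eliminate(game):
    """Repeatedly remove pure strategies strictly dominated by another pure strategy."""
    rows = list(range(len(game)))
    cols = list(range(len(game[0])))
    changed = True
    while changed:
        changed = False
        for r in rows:
            # r is dominated if another row is strictly better in every remaining column
            if any(all(game[o][c][0] > game[r][c][0] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
                break
        for c in cols:
            # c is dominated if another column is strictly better in every remaining row
            if any(all(game[r][o][1] > game[r][c][1] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols

print(eliminate(GAME))  # ([0], [0]), i.e. first row and first column: (U, L)
```

The loop stops as soon as a full pass removes nothing, so for games without dominated strategies it simply returns all rows and columns.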

Definition of Nash Equilibrium


Nash Equilibrium (NE): a solution concept of a game

$(N, S, u)$: a game
$S_i$: strategy set for player $i$
$S = S_1 \times S_2 \times \dots \times S_n$: set of strategy profiles
$u = (u_1, u_2, \dots, u_n)$: payoff function
$s_{-i}$: strategy profile of all players except player $i$

A strategy profile $s^*$ is called a Nash equilibrium if
$u_i(s_i^*, s_{-i}^*) \ge u_i(\sigma_i, s_{-i}^*), \quad \forall i,$
where $\sigma_i$ is any pure strategy of player $i$.

Remarks on Nash Equilibrium

A set of strategies, one for each player, such that each player's strategy is a best response to the others' strategies.

Best response: the strategy that maximizes a player's payoff given the others' strategies.

No player can do better by unilaterally changing his or her strategy.

A profile in which every player uses a dominant strategy is a NE.

Example

Players: Smith and Louis
Actions: {Advertise, Do Not Advertise}
Payoffs: the companies' profits

Each firm earns $50 million from its customers.
Advertising costs a firm $20 million.
Advertising captures $30 million from the competitor.

How to represent this game?

Strategic Interactions

Payoffs are (Louis, Smith), in millions of dollars.

                       Smith
                  No Ad       Ad
Louis   No Ad     (50,50)     (20,60)
        Ad        (60,20)     (30,30)

Best Responses

Best response for Louis:
If Smith advertises: advertise
If Smith does not advertise: advertise

The best response for Smith is the same.

Ad is a dominant strategy for both players!
(Ad, Ad) is a NE!
This is another Prisoner's Dilemma!

                       Smith
                  No Ad       Ad
Louis   No Ad     (50,50)     (20,60)
        Ad        (60,20)     (30,30)
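The best-response reasoning can be mechanized by testing every cell of the table for profitable unilateral deviations. A minimal sketch, assuming the advertising payoffs above with Louis choosing the row and Smith the column; `pure_nash` is an illustrative helper, not a library function.

```python
# Each cell is (Louis's profit, Smith's profit) in millions of dollars.
ACTIONS = ["No Ad", "Ad"]
PAYOFF = [
    [(50, 50), (20, 60)],
    [(60, 20), (30, 30)],
]

def pure_nash(payoff):
    """Return all cells where neither player gains by deviating alone."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            # the row player cannot improve by switching rows within column c
            row_best = all(payoff[r][c][0] >= payoff[o][c][0] for o in range(n_rows))
            # the column player cannot improve by switching columns within row r
            col_best = all(payoff[r][c][1] >= payoff[r][o][1] for o in range(n_cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria

for r, c in pure_nash(PAYOFF):
    print(ACTIONS[r], ACTIONS[c], PAYOFF[r][c])  # Ad Ad (30, 30)
```

This exhaustive check only finds pure-strategy equilibria; it returns an empty list for games whose only equilibria are mixed.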

Nash Equilibrium
A NE may be a pair of mixed strategies.

Example: Matching Pennies

                   B
              Head       Tail
A    Head     (1,-1)     (-1,1)
     Tail     (-1,1)     (1,-1)

Each player choosing Head and Tail with probabilities (1/2, 1/2) is the Nash equilibrium.
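The claim can be tested directly against the definition of a Nash equilibrium, which only requires comparing with pure-strategy deviations. A minimal sketch, assuming the matching-pennies payoffs above with A as the row player; `payoff_a` and `is_nash` are illustrative names.

```python
from fractions import Fraction

# Payoffs to the row player A; B receives the negative (zero-sum).
U_A = [[1, -1], [-1, 1]]

def payoff_a(p, q):
    """Expected payoff to A when A plays mixed strategy p and B plays q."""
    return sum(p[i] * q[j] * U_A[i][j] for i in range(2) for j in range(2))

def is_nash(p, q):
    """Check the NE definition against every pure-strategy deviation."""
    pure = [[1, 0], [0, 1]]
    base = payoff_a(p, q)
    a_ok = all(payoff_a(d, q) <= base for d in pure)    # A cannot gain
    b_ok = all(-payoff_a(p, d) <= -base for d in pure)  # B cannot gain
    return a_ok and b_ok

half = Fraction(1, 2)
print(is_nash([half, half], [half, half]))  # True
print(is_nash([1, 0], [1, 0]))              # False
```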

Existence of NE
Theorem (J. Nash, 1950s)

For a finite game, there exists at least one Nash equilibrium (in pure or mixed strategies).

Nash Equilibrium

A NE may not be a good solution of the game; it can differ from the optimal solution.

e.g., in the advertising game the NE (Ad, Ad) gives (30,30), while (No Ad, No Ad) would give (50,50):

                       Smith
                  No Ad       Ad
Louis   No Ad     (50,50)     (20,60)
        Ad        (60,20)     (30,30)

Nash Equilibrium
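A game may have more than one NE.

e.g., the Battle of the Sexes, with payoffs (Wife, Husband):

                       Husband
                  opera      football
Wife   opera      (2,1)      (0,0)
       football   (0,0)      (1,2)

NE: (opera, opera), (football, football), and the mixed profile ((2/3,1/3),(1/3,2/3)), where each pair gives the probabilities of (opera, football) for the Wife and the Husband respectively.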
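The mixed equilibrium of this example follows from the indifference conditions: each player mixes so that the other is indifferent between the two actions. A minimal sketch, assuming a 2x2 game that has an interior mixed equilibrium; `mixed_equilibrium_2x2` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction

# Rows are the Wife's actions, columns the Husband's, in the order
# (opera, football). Each cell is (Wife's payoff, Husband's payoff).
G = [
    [(2, 1), (0, 0)],
    [(0, 0), (1, 2)],
]

def mixed_equilibrium_2x2(g):
    """Interior mixed NE of a 2x2 game via the indifference conditions.

    Assumes such an equilibrium exists, so the denominators are non-zero.
    """
    # p = P(row 0), chosen so that the column player is indifferent.
    b00, b01, b10, b11 = g[0][0][1], g[0][1][1], g[1][0][1], g[1][1][1]
    p = Fraction(b11 - b10, b00 - b01 - b10 + b11)
    # q = P(column 0), chosen so that the row player is indifferent.
    a00, a01, a10, a11 = g[0][0][0], g[0][1][0], g[1][0][0], g[1][1][0]
    q = Fraction(a11 - a01, a00 - a01 - a10 + a11)
    return (p, 1 - p), (q, 1 - q)

wife, husband = mixed_equilibrium_2x2(G)
print(wife, husband)  # Wife (2/3, 1/3), Husband (1/3, 2/3)
```

The formula for `p` comes from solving p*b00 + (1-p)*b10 = p*b01 + (1-p)*b11, and the one for `q` from the analogous equation for the row player.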

Nash Equilibrium
Zero-sum games (two-person):

A saddle point is a solution:
$u(s_1^*, s_2^*) = \max_{s_1 \in S_1} \min_{s_2 \in S_2} u(s_1, s_2) = \min_{s_2 \in S_2} \max_{s_1 \in S_1} u(s_1, s_2)$

$s_1^* = \arg\max_{s_1 \in S_1} \min_{s_2 \in S_2} u(s_1, s_2)$

$s_2^* = \arg\min_{s_2 \in S_2} \max_{s_1 \in S_1} u(s_1, s_2)$
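The max-min and min-max values can be computed for a small zero-sum game. The sketch below uses matching pennies as the example (an assumption made for illustration): in pure strategies the two values differ, so there is no pure saddle point, while a coarse grid search over the row player's mixing probability recovers the value 0 at p = 1/2. Function names are illustrative.

```python
# Payoffs to player 1 in matching pennies; player 2 receives the negative.
U = [[1, -1], [-1, 1]]

def pure_security_levels(u):
    """Max-min for the row player and min-max for the column player, pure strategies only."""
    maximin = max(min(row) for row in u)
    minimax = min(max(u[i][j] for i in range(len(u))) for j in range(len(u[0])))
    return maximin, minimax

def mixed_maximin_2x2(u, steps=100):
    """Grid search over p = P(row 0); the inner min is attained at a pure column."""
    best_p, best_val = None, float("-inf")
    for k in range(steps + 1):
        p = k / steps
        worst = min(p * u[0][j] + (1 - p) * u[1][j] for j in range(2))
        if worst > best_val:
            best_p, best_val = p, worst
    return best_p, best_val

print(pure_security_levels(U))  # (-1, 1): no saddle point in pure strategies
print(mixed_maximin_2x2(U))     # p = 0.5 with value 0
```

A grid search is only a rough stand-in here; larger zero-sum games are usually solved by linear programming.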

Nash Equilibrium
Many varieties of NE: Refined NE, Bayesian NE, Sub-game Perfect NE, Perfect Bayesian NE

Finding NEs is very difficult.

NE can only tell us that if the game reaches such a state, then no player has an incentive to change strategy unilaterally. But NE cannot tell us how to reach such a state.

Iterated Prisoner's Dilemma

Cooperation

Groups of organisms:
Mutual cooperation is of benefit to all agents.
Lack of cooperation is harmful to them.

Another type of cooperation:
Cooperating agents do well.
Any one agent will do better by failing to cooperate.
The Prisoner's Dilemma is an elegant embodiment.

Prisoner's Dilemma

The story of the prisoner's dilemma:
Players: two prisoners
Actions: {Cooperate (C), Defect (D)}
Payoff matrix, with payoffs (Prisoner A, Prisoner B):

                    Prisoner B
                    C         D
Prisoner A    C     (3,3)     (0,5)
              D     (5,0)     (1,1)

Prisoner's Dilemma

No matter what the other does, the best choice is D.
(D,D) is a Nash Equilibrium.
But if both choose D, both will do worse than if both select C.

                    Prisoner B
                    C         D
Prisoner A    C     (3,3)     (0,5)
              D     (5,0)     (1,1)

Iterated Prisoner's Dilemma

The individuals:
Meet many times
Can recognize a previous interactant
Remember the prior outcome

Strategy: specifies the probabilities of cooperation and defection based on the history
$P(C) = f_1(\text{History})$
$P(D) = f_2(\text{History})$

Strategies

Tit For Tat: cooperate on the first move, then repeat the opponent's last choice.

Player A (Tit For Tat):  C D D C C C C C D D D D C
Player B:                D D C C C C C D D D D C
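Read as Player A answering Player B with Tit For Tat, the two move sequences above can be reproduced in a few lines. This is a minimal sketch; the function name and the history representation (a list of the opponent's past moves) are assumptions made for illustration.

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

b_moves = "D D C C C C C D D D D C".split()

a_moves = []
for t in range(len(b_moves) + 1):
    # In round t, A has seen only B's moves from the earlier rounds.
    a_moves.append(tit_for_tat(b_moves[:t]))

print(" ".join(a_moves))  # C D D C C C C C D D D D C
```

A's sequence is one move longer than B's because A's reply to B's last listed move is also generated.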

Strategies

Tit For Tat - Cooperate on the first move, then repeat opponent's last choice.

Tit For Tat and Random - Repeat opponent's last choice skewed by random setting.*

Tit For Two Tats and Random - Like Tit For Tat except that opponent must make the same choice twice in a row before it is reciprocated. Choice is skewed by random setting.*

Tit For Two Tats - Like Tit For Tat except that opponent must make the same choice twice in a row before it is reciprocated.

Naive Prober (Tit For Tat with Random Defection) - Repeat opponent's last choice (i.e., Tit For Tat), but sometimes probe by defecting in lieu of cooperating.*

Remorseful Prober (Tit For Tat with Random Defection) - Repeat opponent's last choice (i.e., Tit For Tat), but sometimes probe by defecting in lieu of cooperating. If the opponent defects in response to probing, show remorse by cooperating once.*

Naive Peace Maker (Tit For Tat with Random Co-operation) - Repeat opponent's last choice (i.e., Tit For Tat), but sometimes make peace by cooperating in lieu of defecting.*

True Peace Maker (hybrid of Tit For Tat and Tit For Two Tats with Random Cooperation) - Cooperate unless opponent defects twice in a row, then defect once, but sometimes make peace by cooperating in lieu of defecting.*

Random - Always set at 50% probability.

Strategies

Always Defect

Always Cooperate

Grudger (Co-operate, but only be a sucker once) - Cooperate until the opponent defects. Then always defect unforgivingly.

Pavlov (repeat last choice if good outcome) - If 5 or 3 points scored in the last round then repeat last choice.

Pavlov / Random (repeat last choice if good outcome and Random) - If 5 or 3 points scored in the last round then repeat last choice, but sometimes make random choices.*

Adaptive - Starts with c,c,c,c,c,c,d,d,d,d,d and then takes choices which have given the best average score re-calculated after every move.

Gradual - Cooperates until the opponent defects, in such case defects the total number of times the opponent has defected during the game. Followed up by two co-operations.

Suspicious Tit For Tat - As for Tit For Tat except begins by defecting.

Soft Grudger - Cooperates until the opponent defects, in such case opponent is punished with d,d,d,d,c,c.

Customised strategy 1 - Default setting is T=1, P=1, R=1, S=0, B=1, always cooperate unless sucker (i.e., 0 points scored).

Customised strategy 2 - Default setting is T=1, P=1, R=0, S=0, B=0, always play alternating defect/cooperate.

Iterated Prisoner's Dilemma

The same players repeat the prisoner's dilemma many times.

After ten rounds:
The best possible income for one player is 50 (defecting every round against a cooperator).
A realistic case is 30 for each player (both always cooperate).
An extreme case is that both players always defect, so each gets 10.
The most likely case is that each player plays a mixture of defection and cooperation.

                    Prisoner B
                    C         D
Prisoner A    C     (3,3)     (0,5)
              D     (5,0)     (1,1)
Iterated Prisoner's Dilemma

Which strategy can thrive? What is a good strategy?

Robert Axelrod, 1980s: a computer round-robin tournament

AXELROD R. 1987. The evolution of strategies in the iterated Prisoner's Dilemma. In L. Davis, editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, Los Altos, CA.

The first round

Strategies: 14 entries + a random strategy
Including Markov process + Bayesian inference

Every pair of strategies meets, for 15*15 runs in total; each pair plays the game 200 times.

Payoff of a strategy $S$: its average score over all 15 opponents, $\sum_{S'} U(S, S') / 15$

Tit For Tat wins (cooperation based on reciprocity)
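The round-robin scoring rule can be illustrated with a toy tournament. The sketch below is not Axelrod's tournament: it assumes only four deterministic strategies from the lists in this lecture, 200 rounds per pairing, self-play included, and the average-score rule described above. All function and variable names are illustrative.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each strategy maps (own history, opponent history) to "C" or "D".
def tit_for_tat(own, opp):
    return "C" if not opp else opp[-1]

def always_defect(own, opp):
    return "D"

def always_cooperate(own, opp):
    return "C"

def grudger(own, opp):
    return "D" if "D" in opp else "C"

def play(strat_a, strat_b, rounds=200):
    """Return A's total score against B over the given number of rounds."""
    hist_a, hist_b, score_a = [], [], 0
    for _ in range(rounds):
        a, b = strat_a(hist_a, hist_b), strat_b(hist_b, hist_a)
        score_a += PAYOFF[a, b][0]
        hist_a.append(a)
        hist_b.append(b)
    return score_a

def tournament(entries, rounds=200):
    """Average score of each entry over all opponents, itself included."""
    return {
        name: sum(play(s, other, rounds) for other in entries.values()) / len(entries)
        for name, s in entries.items()
    }

ENTRIES = {
    "Tit For Tat": tit_for_tat,
    "Always Defect": always_defect,
    "Always Cooperate": always_cooperate,
    "Grudger": grudger,
}
scores = tournament(ENTRIES)
for name, avg in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {avg}")
```

In this small field Tit For Tat and Grudger tie for the highest average, which already hints that the ranking depends on which other strategies are entered.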

The first round


Characteristics of good strategies

Goodness: never defect first
TFT vs. Naive Prober
(Naive Prober: repeat the opponent's last choice, but sometimes probe by defecting in lieu of cooperating.)

Forgiveness: may take revenge, but the memory is short.
TFT vs. Grudger
(Grudger: cooperate until the opponent defects, then always defect unforgivingly.)

Winning Vs. High Scores


This is not a zero-sum game; there is a banker.
TFT never wins a single game. The best result for it is to get the same score as its opponent.
Trying to win each game is a kind of jealousy, and it does not work well.
Cooperation can arise in a selfish group.

The second round

Strategies: 62 entries + a random strategy
Goodness strategies
Wiliness strategies

Tit For Tat wins again.
Winning or losing depends on the circumstances.

Characteristics of good strategies

Goodness: never defect first
First round: the top eight strategies all have goodness.
Second round: fourteen of the top fifteen strategies have goodness.

Forgiveness: may take revenge, but the memory is short.
Grudger is not a strategy with forgiveness.

Goodness and forgiveness are a kind of collective behavior.
For a single agent, defection is the best strategy.

Evolution of the Strategies

Evolve good strategies by genetic algorithm (GA)

What is a good strategy?
Is TFT a good strategy?
Tit For Two Tats may be the best strategy in the first round, but it is not a good strategy in the second round.
(Tit For Two Tats: like Tit For Tat, except that the opponent must make the same choice twice in a row before it is reciprocated.)
A good strategy depends on the environment.

Evolutionarily stable strategy

Evolutionarily stable strategy (ESS)

Introduced by John Maynard Smith and George R. Price in 1973.

ESS means evolutionarily stable strategy, that is, a strategy such that, if all members of the population adopt it, then no mutant strategy could invade the population under the influence of natural selection.
(John Maynard Smith, Evolution and the Theory of Games)

An ESS is robust under evolution: it cannot be invaded by mutation.

Definition of ESS

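A strategy $x$ is an ESS if, for all $y \ne x$,
$x^{T} U\big((1-\varepsilon)x + \varepsilon y\big) > y^{T} U\big((1-\varepsilon)x + \varepsilon y\big)$
holds for all sufficiently small positive $\varepsilon$.

Equivalently, with $u(x, y) = x^{T} U y$:
(1) $u(x, x) \ge u(y, x), \quad \forall y$
(2) $u(x, y) > u(y, y)$ if $u(x, x) = u(y, x), \quad \forall y \ne x$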
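The link between the inequality in $\varepsilon$ and the two conditions can be made explicit by expanding both sides. The derivation below is a sketch under the assumption that payoffs are bilinear, $u(x, y) = x^{T} U y$, with $U$ the payoff matrix; that notation is an interpretation of the slide rather than something it states.

```latex
\begin{align*}
x^{T} U\big((1-\varepsilon)x + \varepsilon y\big) &= (1-\varepsilon)\,u(x,x) + \varepsilon\,u(x,y), \\
y^{T} U\big((1-\varepsilon)x + \varepsilon y\big) &= (1-\varepsilon)\,u(y,x) + \varepsilon\,u(y,y), \\
\text{so the ESS inequality reads}\quad
0 &< (1-\varepsilon)\big[u(x,x) - u(y,x)\big] + \varepsilon\big[u(x,y) - u(y,y)\big].
\end{align*}
```

For small $\varepsilon$ the first bracket dominates, so the inequality holds when $u(x,x) > u(y,x)$. When $u(x,x) = u(y,x)$ the first bracket vanishes and the second bracket must be positive, that is, $u(x,y) > u(y,y)$.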

ESS

ESS is defined in a population with a large number of individuals.

The individuals cannot control the strategy, and may not be aware of the game they are playing.

ESS is the result of natural selection.

Like NE, ESS can only tell us that a state is robust under evolution; it cannot tell us how the population reaches such a state.

ESS in IPD

Tit For Tat cannot be invaded by wiliness strategies, such as Always Defect.
TFT can be invaded by goodness strategies, such as Always Cooperate, Tit For Two Tats and Suspicious Tit For Tat.
Tit For Tat is not a strict ESS.
Always Cooperate can be invaded by Always Defect.
Always Defect is an ESS.
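Some of these statements can be checked by comparing a rare mutant's score with the resident's, following the two ESS conditions. This is a minimal sketch under loudly stated assumptions: a fixed game length of 200 rounds, the payoff matrix of this lecture, and only three strategies. The function names and the label "not strictly stable" are illustrative.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each strategy sees only the list of the opponent's past moves.
def tit_for_tat(opp):
    return "C" if not opp else opp[-1]

def always_defect(opp):
    return "D"

def always_cooperate(opp):
    return "C"

def score(strat_a, strat_b, rounds=200):
    """Total payoff to strat_a when it meets strat_b for the given number of rounds."""
    seen_by_a, seen_by_b, total = [], [], 0
    for _ in range(rounds):
        a, b = strat_a(seen_by_a), strat_b(seen_by_b)
        total += PAYOFF[a, b][0]
        seen_by_a.append(b)  # A remembers B's moves
        seen_by_b.append(a)  # B remembers A's moves
    return total

def status(resident, mutant, rounds=200):
    """Compare a rare mutant with the resident using ESS conditions (1) and (2)."""
    rr, mr = score(resident, resident, rounds), score(mutant, resident, rounds)
    if mr < rr:
        return "resists"   # condition (1) holds strictly
    if mr > rr:
        return "invaded"   # condition (1) fails
    # Tie against the resident: condition (2) decides.
    rm, mm = score(resident, mutant, rounds), score(mutant, mutant, rounds)
    return "resists" if rm > mm else "not strictly stable"

print(status(tit_for_tat, always_defect))       # resists
print(status(tit_for_tat, always_cooperate))    # not strictly stable
print(status(always_cooperate, always_defect))  # invaded
print(status(always_defect, tit_for_tat))       # resists
```

With a known, fixed number of rounds other mutants (for example one that defects only in the last round) could behave differently, so this check covers only the three strategies listed.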

References

Drew Fudenberg, Jean Tirole, Game Theory, The MIT Press, 1991.

R. Axelrod, The evolution of strategies in the iterated Prisoner's Dilemma. In L. Davis, editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, Los Altos, CA, 1987.

Richard Dawkins, The Selfish Gene, Oxford University Press.

Concluding Remarks
Tip of game theory:

Basic concepts
Nash Equilibrium
Iterated Prisoner's Dilemma
Evolutionarily Stable Strategy

Concluding Remarks
Many interesting topics deserve to be studied and further investigated:
Cooperative games
Incomplete information games
Dynamic games
Combinatorial games
Learning in games
...

Thank you!
