Game Theory: Zhixin Liu
Zhixin Liu
Complex Systems Research Center,
Academy of Mathematics and Systems Sciences, CAS
Multi-Agent Systems
Analysis
Intervention
Game theory
complex interactions between people
Payoff table (row player's payoff, column player's payoff):
            rock      paper     scissor
rock        0,0       -1,1      1,-1
paper       1,-1      0,0       -1,1
scissor     -1,1      1,-1      0,0
Strategies
Interactions between strategies and payoff
Game theory
Elements of A Game
Player:
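Strategy: a mixed strategy $s_i = (s_{i1}, \dots, s_{il_i})$ of player $i$ assigns probability $s_{ij}$ to action $a_{ij}$, with $\sum_{j=1}^{l_i} s_{ij} = 1$
Payoff:
$u_k(s_1, s_2) = \sum_{i} \sum_{j} s_{1i}\, s_{2j}\, u_k(a_{1i}, a_{2j}), \quad k = 1, 2.$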
An Example: Rock-paper-scissor
Players: A and B
Actions/ Moves:
{rock, scissor, paper}
Payoff:
u1(rock,scissor)=1
u2(scissor, paper)=-1
Mixed strategies
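Payoff table (row player's payoff, column player's payoff):
            rock      paper     scissor
rock        0,0       -1,1      1,-1
paper       1,-1      0,0       -1,1
scissor     -1,1      1,-1      0,0
s1 = (1/3, 1/3, 1/3)
s2 = (0, 1/2, 1/2)
$u_1(s_1, s_2) = \tfrac{1}{3}\left(0 \cdot 0 + \tfrac{1}{2} \cdot (-1) + \tfrac{1}{2} \cdot 1\right) + \tfrac{1}{3}\left(0 \cdot 1 + \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot (-1)\right) + \tfrac{1}{3}\left(0 \cdot (-1) + \tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 0\right) = 0$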
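The expected payoff of the mixed strategies above can be checked numerically. The following is a minimal sketch, assuming the action order (rock, paper, scissor); the names `U1` and `expected_payoff` are illustrative and not part of the slides.

```python
from fractions import Fraction

# Row player's payoffs u1 in rock-paper-scissor; rows and columns are
# ordered (rock, paper, scissor). The game is zero-sum, so u2 = -u1.
U1 = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_payoff(payoff, s1, s2):
    """Return sum_i sum_j s1[i] * s2[j] * payoff[i][j]."""
    return sum(
        s1[i] * s2[j] * payoff[i][j]
        for i in range(len(s1))
        for j in range(len(s2))
    )


s1 = [Fraction(1, 3)] * 3
s2 = [Fraction(0), Fraction(1, 2), Fraction(1, 2)]
print(expected_payoff(U1, s1, s2))  # prints 0
```

Exact fractions are used instead of floats so that the result is exactly 0 rather than a rounding residue.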
Classifications of Games
Assumption
Assume
Dominated Strategy
A strategy is dominated if, regardless of what the other players do, it earns the player a smaller payoff than some other strategy.
$S_{-i}$: the set of strategy profiles of all players other than player $i$
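Iterated elimination of dominated strategies (row player's payoff, column player's payoff):
Step 1: the original game
          L       col 2   R
U         4,3     5,1     6,2
row 2     2,1     8,4     3,6
row 3     3,0     9,6     2,8
Step 2: column 2 is dominated by R for the column player
          L       R
U         4,3     6,2
row 2     2,1     3,6
row 3     3,0     2,8
Step 3: rows 2 and 3 are dominated by U for the row player
          L       R
U         4,3     6,2
Step 4: R is dominated by L for the column player
          L
U         4,3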
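The elimination procedure can be sketched in a few lines. This is an illustrative sketch only: it considers domination by pure strategies, and the labels `row2`, `row3` and `col2` are placeholder names for the strategies whose labels do not appear in the slides.

```python
# Payoffs (row player, column player) of the 3x3 example.
# "row2", "row3" and "col2" are placeholder labels (assumption).
ROWS = ["U", "row2", "row3"]
COLS = ["L", "col2", "R"]
GAME = {
    ("U", "L"): (4, 3), ("U", "col2"): (5, 1), ("U", "R"): (6, 2),
    ("row2", "L"): (2, 1), ("row2", "col2"): (8, 4), ("row2", "R"): (3, 6),
    ("row3", "L"): (3, 0), ("row3", "col2"): (9, 6), ("row3", "R"): (2, 8),
}


def iterated_elimination(payoffs, rows, cols):
    """Repeatedly delete strictly dominated pure strategies."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # A row is dominated if another row is strictly better in every column.
        for r in rows:
            if any(all(payoffs[(o, c)][0] > payoffs[(r, c)][0] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
                break
        # A column is dominated if another column is strictly better in every row.
        for c in cols:
            if any(all(payoffs[(r, o)][1] > payoffs[(r, c)][1] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols


print(iterated_elimination(GAME, ROWS, COLS))  # (['U'], ['L'])
```

The surviving profile (U, L) with payoffs 4,3 is the one left at the end of the elimination sequence above.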
$(N, S, u)$: a game
$S_i$: strategy set for player $i$
$S = S_1 \times \cdots \times S_n$: set of strategy profiles
$u = (u_1, \dots, u_n)$: payoff function
$s_{-i}$: strategy profile of all players except player $i$
A strategy profile $s^*$ is called a Nash equilibrium if
$u_i(s_i^*, s_{-i}^*) \ge u_i(\sigma_i, s_{-i}^*), \quad \forall i,$
where $\sigma_i$ is any pure strategy of player $i$.
Best Response:
The strategy that maximizes a player's payoff given the other players' strategies.
In a Nash equilibrium, no player can do better by unilaterally changing his or her strategy.
A profile of dominant strategies is an NE.
Example
Strategic Interactions
Players: Smith and Louis. Each cell lists (row player's payoff, column player's payoff).
            No Ad      Ad
No Ad       (50,50)    (20,60)
Ad          (60,20)    (30,30)
Best Responses
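            No Ad      Ad
No Ad       (50,50)    (20,60)
Ad          (60,20)    (30,30)
Whatever the other player chooses, Ad is the best response: 60 > 50 against No Ad, and 30 > 20 against Ad.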
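Best responses in the advertising game can be enumerated mechanically, and a pure-strategy Nash equilibrium is a profile in which each action is a best response to the other. The sketch below assumes payoffs are ordered (row player, column player); the function names are illustrative.

```python
ACTIONS = ["No Ad", "Ad"]
# Payoffs (row player, column player) of the advertising game.
PAYOFF = {
    ("No Ad", "No Ad"): (50, 50),
    ("No Ad", "Ad"): (20, 60),
    ("Ad", "No Ad"): (60, 20),
    ("Ad", "Ad"): (30, 30),
}


def best_responses(player, other_action):
    """Actions of `player` (0 = row, 1 = column) that maximize its payoff."""
    def payoff(action):
        profile = (action, other_action) if player == 0 else (other_action, action)
        return PAYOFF[profile][player]

    best = max(payoff(a) for a in ACTIONS)
    return [a for a in ACTIONS if payoff(a) == best]


def pure_nash_equilibria():
    """Profiles in which both actions are mutual best responses."""
    return [
        (r, c)
        for r in ACTIONS
        for c in ACTIONS
        if r in best_responses(0, c) and c in best_responses(1, r)
    ]


print(best_responses(0, "No Ad"))  # ['Ad']
print(best_responses(0, "Ad"))     # ['Ad']
print(pure_nash_equilibria())      # [('Ad', 'Ad')]
```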
Nash Equilibrium
NE
Matching Pennies (row player's payoff, column player's payoff):
            Head       Tail
Head        (1,-1)     (-1,1)
Tail        (-1,1)     (1,-1)
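Matching Pennies has no Nash equilibrium in pure strategies, but it has one in mixed strategies. The sketch below checks both claims; it assumes a 2x2 game with an interior mixed equilibrium (nonzero denominators) and uses the indifference condition, in which each player mixes so that the opponent is indifferent between its two actions. The function names are illustrative.

```python
from fractions import Fraction

# Payoff matrices, rows and columns ordered (Head, Tail).
A = [[1, -1], [-1, 1]]   # row player
B = [[-1, 1], [1, -1]]   # column player


def pure_nash(A, B):
    """Cells (i, j) where neither player gains by deviating alone."""
    equilibria = []
    for i in range(2):
        for j in range(2):
            row_ok = A[i][j] >= A[1 - i][j]
            col_ok = B[i][j] >= B[i][1 - j]
            if row_ok and col_ok:
                equilibria.append((i, j))
    return equilibria


def mixed_nash_2x2(A, B):
    """Return (p, q): probabilities of the first row and the first column.

    p makes the column player indifferent between its columns;
    q makes the row player indifferent between its rows.
    """
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q


print(pure_nash(A, B))        # []
print(*mixed_nash_2x2(A, B))  # 1/2 1/2
```

Both players randomize evenly between Head and Tail, which is the only way to leave the opponent nothing to exploit in this game.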
Existence of NE
Theorem
Nash Equilibrium
In the advertising game between Smith and Louis, (Ad, Ad) with payoffs (30,30) is the unique Nash equilibrium.
Nash Equilibrium
A game
            opera      football
opera       (2,1)      (0,0)
football    (0,0)      (1,2)
Wife
Nash Equilibrium
Zero
$s_2 \in S_2, \ s_1 \in S_1$
Nash Equilibrium
Many
Finding
NE
Cooperation
Groups
of organisms:
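Prisoner's Dilemma
Players: Prisoner A and Prisoner B. C = cooperate, D = defect; each cell lists (row player's payoff, column player's payoff).
        C        D
C       (3,3)    (0,5)
D       (5,0)    (1,1)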
Prisoner's Dilemma
No
individuals:
Strategy:
P(C)=f1(History)
P(D)=f2(History)
Strategies
Tit For Tat - Cooperate on the first move, then repeat the opponent's last choice.
Player A C D D C C C C C D D D D C
Player B D D C C C C C D D D D C
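The move sequences above are consistent with Player A playing Tit For Tat against Player B's moves. This reading is an inference from the two sequences, not something the slide states. A minimal sketch that replays it, with illustrative names:

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


# Player B's moves as listed above.
b_moves = "DDCCCCCDDDDC"

# Player A answers each prefix of B's history, including one final reply.
a_moves = [tit_for_tat(b_moves[:t]) for t in range(len(b_moves) + 1)]
print(" ".join(a_moves))  # C D D C C C C C D D D D C
```

Player A's line is simply a C followed by a copy of Player B's line, shifted by one round.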
Strategies
Tit For Tat - Cooperate on the first move, then repeat the opponent's last choice.
Tit For Tat and Random - Repeat the opponent's last choice, skewed by a random setting.*
Tit For Two Tats and Random - Like Tit For Tat, except that the opponent must make the same choice twice in a row before it is reciprocated. The choice is skewed by a random setting.*
Tit For Two Tats - Like Tit For Tat, except that the opponent must make the same choice twice in a row before it is reciprocated.
Naive Prober (Tit For Tat with Random Defection) - Repeat the opponent's last choice (i.e. Tit For Tat), but sometimes probe by defecting instead of cooperating.*
Remorseful Prober (Tit For Tat with Random Defection) - Repeat the opponent's last choice (i.e. Tit For Tat), but sometimes probe by defecting instead of cooperating. If the opponent defects in response to probing, show remorse by cooperating once.*
Naive Peace Maker (Tit For Tat with Random Co-operation) - Repeat the opponent's last choice (i.e. Tit For Tat), but sometimes make peace by cooperating instead of defecting.*
True Peace Maker (hybrid of Tit For Tat and Tit For Two Tats with Random Co-operation) - Cooperate unless the opponent defects twice in a row, then defect once, but sometimes make peace by cooperating instead of defecting.*
Random - Choose at random; the probability is always set at 50%.
Strategies
Always Defect
Always Cooperate
Grudger (Co-operate, but only be a sucker once) - Cooperate until the opponent defects, then always defect unforgivingly.
Pavlov (repeat last choice if good outcome) - If 5 or 3 points were scored in the last round, repeat the last choice.
Pavlov / Random (repeat last choice if good outcome and Random) - If 5 or 3 points were scored in the last round, repeat the last choice, but sometimes make random choices.*
Adaptive - Start with c,c,c,c,c,c,d,d,d,d,d, then take the choice that has given the best average score, recalculated after every move.
Gradual - Cooperate until the opponent defects; then defect as many times as the opponent has defected so far in the game, followed by two co-operations.
Suspicious Tit For Tat - As for Tit For Tat, except that it begins by defecting.
Soft Grudger - Cooperate until the opponent defects; the opponent is then punished with d,d,d,d,c,c.
Customised strategy 1 - Default setting is T=1, P=1, R=1, S=0, B=1: always cooperate unless sucker (i.e. 0 points scored).
Customised strategy 2 - Default setting is T=1, P=1, R=0, S=0, B=0: always play alternating defect/cooperate.
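A few of the deterministic strategies listed above can be played against each other in a small round-robin. The following is a sketch under stated assumptions, not the tournament software the slides refer to: payoffs are 3/0/5/1 as in the Prisoner's Dilemma table, each match lasts 10 rounds (an arbitrary choice), and Pavlov is assumed to open with cooperation and to switch its choice after a bad outcome. All function names are illustrative.

```python
# Payoffs (my score, opponent's score) indexed by (my move, opponent's move).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def suspicious_tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "D"


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def grudger(mine, theirs):
    return "D" if "D" in theirs else "C"


def pavlov(mine, theirs):
    if not mine:
        return "C"  # assumption: open by cooperating
    last_score = PAYOFF[(mine[-1], theirs[-1])][0]
    if last_score in (5, 3):
        return mine[-1]  # good outcome: repeat the last choice
    return "D" if mine[-1] == "C" else "C"  # assumption: otherwise switch


def play_match(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin(strategies, rounds=10):
    """Every strategy meets every other strategy once; return total scores."""
    names = list(strategies)
    totals = {name: 0 for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score_a, score_b = play_match(strategies[a], strategies[b], rounds)
            totals[a] += score_a
            totals[b] += score_b
    return totals


STRATEGIES = {
    "Tit For Tat": tit_for_tat,
    "Suspicious Tit For Tat": suspicious_tit_for_tat,
    "Always Defect": always_defect,
    "Always Cooperate": always_cooperate,
    "Grudger": grudger,
    "Pavlov": pavlov,
}

print(play_match(tit_for_tat, always_defect))  # (9, 14)
for name, total in sorted(round_robin(STRATEGIES).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total}")
```

The ranking this prints depends on which strategies are entered and on the number of rounds, so it should not be read as a general verdict on any strategy.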
Payoffs for Prisoner A and Prisoner B (C = cooperate, D = defect; row player's payoff first):
        C        D
C       (3,3)    (0,5)
D       (5,0)    (1,1)
Characters
goodness strategies
wiliness: strategies
short.
What is a good strategy?
Tit For Two Tats may be the best strategy in the first round, but it is not a good strategy in the second round.
A good strategy depends on the environment.
Tit
Definition of ESS
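A strategy $x$ is an ESS if, for every $y \ne x$ and all sufficiently small $\epsilon > 0$,
$x^{\top} U ((1-\epsilon)x + \epsilon y) > y^{\top} U ((1-\epsilon)x + \epsilon y).$
Equivalently, with $u(a, b) = a^{\top} U b$:
(1) $u(x, x) \ge u(y, x), \ \forall y$
(2) $u(x, y) > u(y, y)$ if $u(x, x) = u(y, x), \ \forall y \ne x$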
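The two ESS conditions can be tested directly for a given incumbent and mutant. The sketch below reads $u(a, b)$ as the payoff $a^{\top} U b$ of strategy $a$ against strategy $b$, uses the Prisoner's Dilemma payoffs 3/0/5/1 as the matrix $U$, and only tests the mutants passed to it, so it is a check against specific invaders rather than a full proof of evolutionary stability. The function names are illustrative.

```python
def u(U, a, b):
    """Expected payoff a^T U b of strategy a against strategy b."""
    return sum(a[i] * U[i][j] * b[j]
               for i in range(len(a))
               for j in range(len(b)))


def resists_invasion(U, x, y):
    """Check ESS conditions (1) and (2) for incumbent x and mutant y."""
    if u(U, x, x) > u(U, y, x):
        return True  # x is a strictly better reply to itself than y
    if u(U, x, x) == u(U, y, x):
        return u(U, x, y) > u(U, y, y)  # tie: x must beat y against y
    return False  # y earns more against x than x does


# Row strategy's payoff against the column strategy, ordered (C, D).
U = [[3, 0],
     [5, 1]]
C, D = [1, 0], [0, 1]

print(resists_invasion(U, D, C))  # True: defection resists cooperators
print(resists_invasion(U, C, D))  # False: cooperation is invaded by defectors
```

In the one-shot game, defection is therefore stable against cooperating mutants, while cooperation is not stable against defecting ones.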
ESS
ESS in IPD
references
Concluding Remarks
Tip of game theory
Basic concepts
Nash Equilibrium
Iterated Prisoner's Dilemma
Evolutionarily Stable Strategy
Concluding Remarks
Many
Thank you!