Vdoc - Pub Computer Gamesmanship The Complete Guide To Creating and Structuring Intelligent Games Programs
Elements of Intelligent
Game Design
David Levy
Computer
Gamesmanship
THE COMPLETE GUIDE TO
CREATING AND STRUCTURING
INTELLIGENT GAMES PROGRAMS
by
DAVID LEVY
All rights reserved
including the right of reproduction
in whole or in part in any form
Contents
INTRODUCTION
Deduced probabilities
Shuffling
Deducing information from the play of the cards
Bayes' theorem
What have we learned?
Deducing information from the bidding
How to use deduced information
Expected values in backed-up trees
Task 6
CHAPTER 12: Draw Poker
The rules of the game
The basis of the algorithm
How the algorithm operates
What happens during the draw
Estimating how hands improve during the draw
After the draw is over
Bluffing
Making draw poker a many-player game
Introduction
DAVID LEVY
London, December 1982
CHAPTER 1
One-Person Games
The 8-puzzle
FIGURE 1 (two 8-puzzle configurations: rows 3 8 / 2 5 7 / 1 4 6 and rows 1 2 3 / 4 5 / 6 7 8, each with one empty square)
                                binary
Pile A: |||||||  = 7 matches =   111
Pile B: |||||    = 5 matches =   101
Pile C: |||      = 3 matches =    11
Pile D: |        = 1 match   =     1
                       totals:   224
FIGURE 2
FIGURE 4 (root position $P_0$ with score $S_0 = 16$; moves $m_1$ (3 tile), $m_2$ (5 tile) and $m_3$ (8 tile) lead to positions $P_1$, $P_2$ and $P_3$ with scores $S_1 = 16$, $S_2 = 16$ and $S_3 = 18$)
FIGURE 5 (position with rows 2 8 3 / 5 7 / 1 4 6, one square empty)
The best position now on the tree, i.e. the position closest
to the target configuration, is $P_{11}$, since its score of 14 is lower
than the scores of all the other nodes. So remembering not to
allow the retrograde move of the 2 tile, the program now
expands position $P_{11}$, and the choice is to move the 1 tile or
the 5 tile, giving rise to the positions shown in Figure 6.
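The expansion rule described above (always expand the lowest-scoring position on the tree, and never go back to where you have just been) can be sketched in Python. This is only an illustration under stated assumptions, not the book's program: the target layout `GOAL`, the scoring function `score` (the sum of each tile's row and column distance from its target square) and all function names are invented for the example, and retrograde moves are avoided by skipping every position already seen rather than only the immediately preceding one.

```python
import heapq

# Assumed target layout, read row by row; 0 marks the empty square.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def score(board):
    """Assumed scoring function: total row + column distance of tiles from home."""
    total = 0
    for idx, tile in enumerate(board):
        if tile == 0:
            continue
        goal_idx = GOAL.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total


def successors(board):
    """Yield (tile moved, new board) for every tile that can slide into the gap."""
    blank = board.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            j = r * 3 + c
            new = list(board)
            new[blank], new[j] = new[j], new[blank]
            yield board[j], tuple(new)


def best_first(start):
    """Expand the lowest-scoring position first; return the list of tiles moved."""
    frontier = [(score(start), start, [])]
    seen = {start}
    while frontier:
        _, board, path = heapq.heappop(frontier)
        if board == GOAL:
            return path
        for tile, nxt in successors(board):
            if nxt in seen:          # covers the retrograde move and any repeats
                continue
            seen.add(nxt)
            heapq.heappush(frontier, (score(nxt), nxt, path + [tile]))
    return None                      # no solution reachable from this start
```

For instance, `best_first((1, 2, 3, 4, 5, 6, 0, 7, 8))` moves the 7 tile and then the 8 tile. The sketch keeps every generated position in memory, so it ignores the memory-overflow handling that the text goes on to discuss.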
FIGURE 6 (positions $P_{111}$ and $P_{112}$, with scores $S_{111} = 14$ and $S_{112} = 14$)
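FIGURE 7 (tree from $P_0$ through $P_1$ and $P_{11}$ to $P_{111}$ (14) and $P_{112}$ (14))
FIGURE 8 (tree extended to $P_{1111}$ and $P_{1112}$)
FIGURE 9 (tree after renaming, with $P_{11}$ and $P_{12}$)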
The new $P_0$ is the old $P_{11}$
The new $P_1$ is the old $P_{111}$
The new $P_2$ is the old $P_{112}$
The new $P_{11}$ is the old $P_{1111}$
The new $P_{12}$ is the old $P_{1112}$
And thus the search for a solution continues.
Flow chart
FIGURE 10 (flow chart of the search. Box labels: "Will creation of the next position cause a memory overflow?"; "Can more memory be created by spring cleaning?"; "Delete worst node on the tree"; "Generate next successor node to the best node found so far ('next' will sometimes be 'first')"; "Is this node a solution?"; "Output currently best path"; "Make currently best node into root of new tree"; "Output all moves on solution path not yet output, followed by number of moves")
Task 1
Sources
CHAPTER 2
Two-Person Games
FIGURE 11 (tic-tac-toe board with lettered squares)
How should we set about designing our evaluation
function? This is one of the fundamental problems in
game-playing programming because a good evaluation function
will help the program to make good judgements, and hence
to play well, even though the depth of look-ahead may be
shallow. A poor function, on the other hand, might well
result in poor play even with a deep and time consuming
search of the game tree. It is therefore very much worthwhile
putting some careful thought into the design of the evaluation
function, and the following example should illustrate the type
of thinking that is necessary.
The object of the game is to create a row of three of your
own symbols. We shall call this a '3-row'. The next most
important thing is to prevent your opponent from making a
3-row, which means that he should not have a 2-row after you
move (a 2-row has two symbols of one player and one empty
space). Next most important is the creation of your own
2-rows; then it is important not to leave your opponent with
1-rows (one of his symbols and two empty spaces); and
finally you should try to create your own 1-rows. All of these
features could well be incorporated into a tic-tac-toe
evaluation function.
If we denote the number of X's 3-rows by $c_3$, the number
of O's 2-rows by $n_2$, the number of X's 2-rows by $c_2$,
the number of O's 1-rows by $n_1$, and the number of
X's 1-rows by $c_1$, then one measure of the merit of a
position from X's point of view would be:
$$c_3 - n_2 + c_2 - n_1 + c_1$$
but this measure has one obvious drawback. It does not allow
for the fact that the term $c_3$ is more important than $n_2$, which
is more important than $c_2$, and so on. This can be done by
multiplying each of the terms in the evaluation function by
some numerical weighting, in such a way that the weightings
reflect the relative importance of each feature. The
evaluation function then becomes
$$(k_3 \times c_3) - (k_2' \times n_2) + (k_2 \times c_2) - (k_1' \times n_1) + (k_1 \times c_1)$$
where $k_3$, $k_2'$, $k_2$, $k_1'$ and $k_1$ are the numerical weightings.
Since one $c_3$ is worth more than all the $n_2$s in the world, i.e. a
winning row is more important than any number of 2-rows,
we can set $k_3$ to be some arbitrarily high number, say 128. By
studying the game for a few minutes it is possible to see that if
one side has a 3-row, the other side may have at most two
2-rows, so to reflect the relative importance of one's own
3-rows and enemy 2-rows it is necessary to ensure that
$k_3 > 2 \times k_2'$. We can therefore try $k_2' = 63$. (If one side has a
3-row and his opponent two 2-rows, the opponent will not
have any 1-rows to upset this scoring mechanism.)
If there are no 3-rows, but one side only has a 2-row, his
opponent cannot have more than three 1-rows, as in Figure
12.
FIGURE 12 (position in which one side has a 2-row and the opponent has three 1-rows)
FIGURE 13
values is -9, the program will play move $m_1$, and the
backed-up score at the root of the tree will be -9. This represents the
score that will be achieved with best play from both sides.
This procedure of choosing the maximum of the minimums
... etc. is known, not surprisingly, as the minimax method
of tree searching. It is an algorithm that finds the move which
will be best, assuming correct play for both sides, provided
that the evaluation function is reasonably accurate.
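The minimax procedure just described can be sketched in a few lines of Python. The tree representation (nested lists whose leaves are evaluation scores) and the leaf values are invented for this illustration; they are not the values from the book's figure, although the example is arranged so that the backed-up score is -9.

```python
def minimax(node, maximising=True):
    """Back up scores through a tree given as nested lists of leaf scores."""
    if not isinstance(node, list):
        return node                      # terminal node: its evaluation score
    scores = [minimax(child, not maximising) for child in node]
    return max(scores) if maximising else min(scores)


def best_move(root):
    """Index of the root move with the highest backed-up score."""
    scores = [minimax(child, maximising=False) for child in root]
    return scores.index(max(scores))


# Two root moves; the opponent replies so as to minimise the program's score.
example_tree = [[-9, 4], [-12, 7]]
```

Here the opponent's best replies give -9 and -12, so `best_move(example_tree)` selects the first move and `minimax(example_tree)` returns -9: the maximum of the minimums.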
Task 2
CHAPTER 3
Games with Big Trees
FIGURE 14 (root position with moves $M_1$, $M_2$, $M_3$, ... and another thousand moves)
FIGURE 15 (game tree with $\alpha$ initially $-\infty$ and $\beta$ initially $+\infty$)
be greater, so $\alpha$ at $S_0$ is set to 8.
4 The left hand side of the tree has now been examined and
the search proceeds to the comparison of the best score
achieved so far (8) with whatever can be reached, assuming
best play by both sides, if the program should choose $m_2$.
This part of the search commences with an examination of
$P_{211}$, which is found to have a score of 3. This is compared
with $\alpha$ at $S_{21}$ and found to be greater, and since it is intended
to maximise $\alpha$ the program will set this value of $\alpha$ to 3.
FIGURE 16 (table of Newborn's results)
It will be seen that as the branching factor increases, so the
proportion of nodes that can be ignored thanks to the alpha-
beta algorithm also increases. And as the depth of search
increases the effect of the algorithm is again increased. So the
bigger the tree becomes, the greater will be the savings using
the alpha-beta method.
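The pruning that produces these savings can be illustrated with a compact Python sketch of the alpha-beta algorithm. As before, the nested-list tree and its leaf values are assumptions made for the example, and the optional `counter` argument is a hypothetical device added here only to show how many terminal nodes are actually examined.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximising=True, counter=None):
    """Alpha-beta search over nested lists; counter[0] tallies leaves examined."""
    if not isinstance(node, list):
        if counter is not None:
            counter[0] += 1
        return node
    if maximising:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:        # opponent already has something better
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:            # program already has something better
            break
    return best


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves = [0]
value = alphabeta(tree, counter=leaves)
```

On this small tree the search returns 3 after examining 7 of the 9 terminal nodes: once the second move is seen to allow a reply worth only 2, its remaining replies are skipped.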
The savings become even more dramatic when the branches
of the tree are examined in an intelligent order. In general it is
true to say that within any group of moves the best one
should be examined first, so that if the best one is not good
enough we need not waste time in examining the second best,
third best and inferior moves. If the tree is searched in such a
way that the moves are examined in their optimal order, then
the number of terminal nodes examined will be approximately
$2 \times \sqrt{N}$, where N is the total number of terminal
nodes on the tree. Thus, for a game of chess in which the
branching factor is typically 36, the number of terminal
nodes on the tree is $36^4$ for a 4-ply tree. Yet by using the
alpha-beta algorithm, if the tree is optimally ordered we need
examine only $2 \times 36^2$ terminal nodes before we find the best
move from the root of the tree, a saving of well over 99%
when compared with the simple minimax method.
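Written out, the arithmetic behind this claim for the 4-ply chess tree is:

$$N = 36^4 = 1\,679\,616, \qquad 2 \times \sqrt{N} = 2 \times 36^2 = 2592$$

$$\frac{2592}{1\,679\,616} = \frac{1}{648} \approx 0.0015, \qquad \text{saving} \approx 1 - 0.0015 = 99.85\%$$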
Taking the figures from Newborn's results quoted above,
we can compare the expected number of nodes examined with
random ordering and the number of nodes examined with
FIGURE 17 (nodes examined with random ordering compared with optimal ordering)
I hope that the reader is now convinced that for all two-
person game trees, except the smallest of the small, alpha-
beta is a must. The most important implication of these
results is that if it is at all possible, you should generate
and/or examine the moves within any group or family in such
a way as to take maximum advantage of the savings that can
be achieved, and this means ordering the search in some way.
We shall discuss various techniques for speeding up the
alpha-beta search in the following chapter, but one obvious
method can be mentioned here. First, generate all the moves
at the root of the tree, $m_1$, $m_2$, ... etc., and evaluate the
resulting positions with the evaluation function. Sort the
moves so that the move with the highest score will be
examined first, then the move with the next highest, and so
on.
Next, look at the first position on the list and generate its
successor positions. These are assigned scores using the
evaluation function and they are then sorted, this time with
the lowest scored position coming at the top of the list and
the highest scored position at the bottom. (This is because the
program's opponent is trying to minimise the score.)
This process is repeated all the way down the tree, except
for the terminal nodes, which are not sorted. Now, when
searching the tree with the alpha-beta algorithm, the tree will
be found to be much nearer an optimally sorted tree than if
this process had not been applied. One disadvantage of this
method, however, is that it requires us to keep in memory all
the successor nodes to each node on the principal variation,
apart from the terminal nodes. So in a search of a chess tree,
with 36 moves at each node, this method would require us to
keep in memory:
(a) the root node
(b) 36 nodes at each level of look-ahead apart from the
terminal node.
In order to combat this problem we might try to find an
extremely compact method of representing a position, but if
this compactness results in a slowing down of the search
process while each position is unravelled or created, much of
the effect of the fast alpha-beta algorithm will be lost. Such
problems require careful thought and it is often necessary to
experiment before the best balance is achieved between
representation and optimality of search.
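The sorting scheme described earlier (highest score first when the program is to move, lowest score first when its opponent is to move) can be sketched as a single helper. The callables `legal_moves`, `make_move` and `evaluate` are hypothetical stand-ins for whatever a particular game program provides; only the sort direction is taken from the text.

```python
def ordered_moves(position, legal_moves, make_move, evaluate, program_to_move):
    """Return the legal moves sorted by the static score of the resulting positions.

    Highest score first when the program is to move, lowest score first when
    the opponent (who is trying to minimise the score) is to move.
    """
    scored = [(evaluate(make_move(position, move)), move)
              for move in legal_moves(position)]
    scored.sort(key=lambda pair: pair[0], reverse=program_to_move)
    return [move for _, move in scored]


# Toy game used only to exercise the helper: positions are integers,
# a move adds 1, 2 or 3, and the "evaluation" is the remainder modulo 3.
def toy_moves(position):
    return [1, 2, 3]


def toy_make(position, move):
    return position + move


def toy_eval(position):
    return position % 3
```

From position 0 the toy successors score 1, 2 and 0, so the program's ordering is `[2, 1, 3]` and the opponent's ordering is `[3, 1, 2]`. Every successor is generated and scored before any is searched, which is the memory cost noted above.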
Other useful techniques for examining the moves in a
sensible order can often be found by thinking a little about
the nature of the game. Let us consider once again the game
of tic-tac-toe. The elements of the 3 x 3 array might be
numbered as in Figure 18a.
1 2 3
4 5 6
7 8 9
FIGURE 18a
A simple way to generate all the legal moves from any
position is to look at the elements, starting with 1 and
working up to 9, and putting any empty space on the move
list. But with a basic knowledge of the strategy of the game
we can speed up the search process by looking first at element
5, then 1,3,7 and 9, and finally at 2, 4, 6 and 8. This method
of move generation takes no longer than 1, 2, 3, 4, ... 9, yet
it enables the alpha-beta algorithm to examine the moves in a
more sensible order, thereby taking us closer to an optimal
search process.
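A move generator using this centre, corners, edges order might look like the following Python sketch. The board representation (a dictionary from element number to 'X', 'O' or `None`) is an assumption made for the example; the element numbering and the order of examination are those given in the text.

```python
# Centre first, then the four corners, then the middles of the edges.
SEARCH_ORDER = (5, 1, 3, 7, 9, 2, 4, 6, 8)


def generate_moves(board):
    """Return the empty elements in the order in which they should be searched."""
    return [square for square in SEARCH_ORDER if board[square] is None]


board = {square: None for square in range(1, 10)}
board[5] = 'X'
board[1] = 'O'
moves = generate_moves(board)
```

With the centre and one corner occupied, `moves` is `[3, 7, 9, 2, 4, 6, 8]`. The loop still visits each of the nine elements exactly once, so it costs no more than scanning them in numerical order.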
Task 3
Sources
CHAPTER 4
Speeding Up the Search
FIGURE 18b (the Kalah board)
all belong to one player, pits 'b' and Kalah 'B' belong to his
opponent. At the start of the game each pit contains an equal
number of stones, say 5, and each Kalah is empty.
The players move alternately. To make a move a player
picks up all the stones in one of his pits and, moving his hand
in an anti-clockwise direction, drops one stone into each pit
and into his own Kalah, but not into his opponent's Kalah.
When his hand holds no more stones the player has had his
turn, and it is then his opponent's turn to play; but if the last
stone lands in a player's Kalah he has another turn, so it is
advantageous to plan the game so that you will have two or
more turns in succession. The other important rule is that if a
player's last stone lands in an empty pit on his own side, he
captures all of the stones in the opposite pit and places them,
together with the stone making the capture, in his own Kalah.
At the end of the game the player with the most stones in
his Kalah is the winner.
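The rules for making a move can be captured in a short Python sketch. The layout is an assumption made for the example: six pits per side, stored in a 14-element list in anti-clockwise order, with player 0 owning pits 0 to 5 and Kalah 6, and player 1 owning pits 7 to 12 and Kalah 13. The capture is applied exactly as the rule is worded above, whatever the opposite pit contains.

```python
def sow(board, pit, player):
    """Play one move; return (new board, True if the player moves again)."""
    board = board[:]                                  # leave the caller's list alone
    own_kalah, enemy_kalah = (6, 13) if player == 0 else (13, 6)
    own_pits = range(0, 6) if player == 0 else range(7, 13)
    if pit not in own_pits or board[pit] == 0:
        raise ValueError("a player must pick up stones from one of his own pits")

    stones, board[pit] = board[pit], 0
    pos = pit
    while stones:
        pos = (pos + 1) % 14                          # anti-clockwise
        if pos == enemy_kalah:                        # never sow into the enemy Kalah
            continue
        board[pos] += 1
        stones -= 1

    if pos == own_kalah:                              # last stone in own Kalah
        return board, True
    if pos in own_pits and board[pos] == 1:           # last stone in an empty own pit
        opposite = 12 - pos
        board[own_kalah] += board[opposite] + 1       # captured stones plus the capturer
        board[opposite] = 0
        board[pos] = 0
    return board, False


start = [5] * 6 + [0] + [5] * 6 + [0]
after, again = sow(start, 1, 0)
```

From the starting position with five stones per pit, emptying pit 1 drops the last stone into the player's own Kalah, so `again` is `True` and the player has another turn.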
Russell experimented with preliminary searches of various
depths. With a full look-ahead of 10-ply he discovered that
the program consumed the minimum CPU time when 90% of
its total search time was spent in the short look-ahead of
5-ply. He then found a method for improving the search
speed still further. Rather than begin a new 5-ply search at
each ply, he used the fact that the short look-ahead searches
overlap - the 5-ply search conducted at one position in the
tree could be used as a 4-ply search of a position at the next
level down in the tree. This means that a short look-ahead of
5-ply would have its own short look-ahead ordered: to a
depth of 4-ply the first move, 3-ply on the next move, 2-ply
on the third move and 1-ply on the fourth. So when the
program is executing the short look-ahead routine it can take
advantage of this partial ordering within the short look-
ahead, and the short look-ahead itself is speeded up. In the
case of Russell's Kalah program this technique produced a
reduction in total search time of approximately 65%.
One of the problems of implementing this short look-ahead
method on a personal computer is the need to store the whole
of the short look-ahead tree. For most games this will be
impossible without a floppy disk system, and even then there
will be games for which there is insufficient memory to cope
with anything more than a 2-ply or 3-ply short look-ahead
search. Nevertheless, the idea is worth remembering, either
for games with relatively small branching factors, or for the
day when you upgrade your micro by adding a hard disk. But
with even the smallest memory configuration you can utilise
this method to some extent, simply by restricting your short
look-ahead to a 1-ply search! Let us see how this might work
in practice, using the game of tic-tac-toe, as in our example
(Figure 19).
The program generates the three, essentially different first
moves: the central move (location 5), a corner move (location
1) and a move in the middle of an edge (location 2). Those of
you who have followed the earlier chapters will know that the
moves may actually be generated in that order by the appli-
cation of an elementary understanding of the game.
The program evaluates the resulting positions, i.e. the
positions it has found from a 1-ply search, and sorts them so
that the best move is examined first. We shall assume that our
evaluation function retains the order in which the moves were
generated, in which case the program next produces the
moves from position $P_1$, the position arising after making the
central move (location 5). In reply to this move there are two
essentially different moves, a corner (location 1) and the
FIGURE 19 (short look-ahead in tic-tac-toe)
the move $m_1$, it is quite possible, even likely, that ZAP will
ruin you after you make the move $m_2$. In chess and many
other games there is the concept of the threat, and ZAP
moves often fall into this category. If your queen is
threatened and you play a random move, the chances are that
your opponent will be able to capture your queen on his next
turn. Each time you think of a move you should first look to
see if it loses your queen in the same way, and if it does so
then you will have pruned off large chunks of the game tree
simply by finding the refutation move (sometimes called the
'killer' move) early in the search.
The implementation of the killer heuristic is not difficult,
but it does require the use of extra RAM. At each level in the
tree, keep a note of which move produced the last cutoff (this
is the killer move) and try that move first when examining the
next group of positions at the same level. This method
becomes clearer from an examination of the example in
Figure 20.
FIGURE 20 (game tree for the killer heuristic; each reply is compared with the score at the root)
The program has already looked at the first move from the
root of the tree, and returned a score to the root position. It
now examines move $m_2$, leading to position $P_2$, and soon
discovers that in reply to $m_2$, if its opponent chooses $m_{21}$ then
the opponent will have improved on his score which is
currently at the root of the tree. In other words, move $m_{21}$
refutes move $m_2$, and the program need not look at $m_{22}$, $m_{23}$,
..., etc.
Next the program examines move $m_3$. It knows that $m_{21}$
refuted $m_2$ so it first looks at its list of legal moves from
position $P_3$ to see if the same move as $m_{21}$ can be found in this
list; if so it examines that move first, in the hope of finding
that here too the same move provides a refutation, thereby
terminating the search from $m_3$ after examining the minimum
number of branches. If it turns out that $m_3$ is refuted by a
different move, then this new killer move replaces the original
one and it is this new killer which is looked for first when
examining the successors to $m_4$.
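The simplest form of the killer heuristic, one remembered move per level, can be sketched on top of an alpha-beta search. The tree format (dictionaries mapping move names to subtrees or leaf scores), the move names and the leaf values are all invented for this illustration; passing `killers=None` switches the heuristic off so that the two searches can be compared.

```python
import math


def search(node, depth, alpha, beta, maximising, killers, stats):
    """Alpha-beta over a dict tree, trying the last cutoff move at each depth first."""
    if not isinstance(node, dict):
        stats["leaves"] += 1
        return node
    moves = list(node)
    if killers is not None and killers.get(depth) in moves:
        moves.remove(killers[depth])
        moves.insert(0, killers[depth])              # try the killer first
    best = -math.inf if maximising else math.inf
    for move in moves:
        value = search(node[move], depth + 1, alpha, beta,
                       not maximising, killers, stats)
        if maximising:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:                            # cutoff: remember this move
            if killers is not None:
                killers[depth] = move
            break
    return best


# Three root moves; the reply "zap" is the refutation of both m2 and m3.
tree = {
    "m1": {"a": 5, "b": 6, "zap": 4},
    "m2": {"a": 9, "b": 8, "zap": 1},
    "m3": {"a": 7, "b": 9, "zap": 0},
}

plain = {"leaves": 0}
plain_value = search(tree, 0, -math.inf, math.inf, True, None, plain)
with_killer = {"leaves": 0}
killer_value = search(tree, 0, -math.inf, math.inf, True, {}, with_killer)
```

Both searches return 4, but the plain search examines all 9 terminal nodes while the killer version examines 7, because "zap" is tried first in reply to the third move and refutes it immediately.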
There are various ways in which this heuristic may be
refined and expanded, but each of them requires still more
RAM. Instead of storing just one killer move at each level,
the program could store (say) the first five killer moves that it
encountered at each level and keep a note of how often each
killer was used as a refutation move at that level. Each time
the count for one of the killers was updated, all five killers
could be ordered so that the next time the program reached
this level of look-ahead it examined the most frequently used
killer first, then the second most frequently used, and so on.
Another idea is to store killer moves linked to the moves
that they refute, and then use this information at different
depths of search. For example, if it was discovered that in a
chess position the move e2-e4 by White was refuted by the
reply c7-c5, then wherever the move e2-e4 was found in the
tree, whether it was at 3-ply, 5-ply, 7-ply or deeper, the first
move to be examined for Black would be c7-c5. Again the
logic behind this use of the heuristic is easy to understand - a
decision which is bad today will probably be bad in a similar
situation tomorrow.
When a program has finished its search of the game tree, and
has decided on its move, it will have in its memory the path
through the tree which it considers to represent the best play
by both sides. Its own best move will be at the top of the tree,
then the move which it expects its opponent to make in reply,
then the move which it thinks is the most likely reply to its
opponent's expected move, and so on. It seems a pity to
waste this information when so much effort has been put into
its acquisition, and no more memory is required to take
advantage of the information than one needs for the killer
heuristic. Simply use the 3rd ply move from the current
search as the first move to be examined when the program
next begins to compute a move. The 4th ply move in the
current search can serve as the 'killer' at ply-2 in the next
search; the 5th ply move now can be the first killer at ply-3
next time, and so on. Very little computation time will be
taken up with this method, and it is as well to start your
search looking at vaguely sensible moves.
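Carrying the principal continuation over to the next search amounts to shifting it up by two plies. The sketch below assumes the continuation is held as a list of moves, with the program's own move first; the function name and the optional `actual_reply` check (which discards the hints if the opponent did not play the expected reply) are additions made for this example and are not part of the text.

```python
def seed_from_principal_continuation(continuation, actual_reply=None):
    """Map ply number in the next search to the move that should be tried first.

    continuation[0] is the move just played (ply 1), continuation[1] the expected
    reply (ply 2), continuation[2] the ply-3 move, and so on.
    """
    if actual_reply is not None and len(continuation) > 1 \
            and actual_reply != continuation[1]:
        return {}                      # assumed policy: hints no longer apply
    # The old 3rd-ply move becomes the first move examined at ply 1, the old
    # 4th-ply move the killer at ply 2, the old 5th-ply move the killer at ply 3.
    return {ply: move for ply, move in enumerate(continuation[2:], start=1)}


hints = seed_from_principal_continuation(
    ["e2-e4", "c7-c5", "g1-f3", "d7-d6", "d2-d4"])
```

Here `hints` is `{1: "g1-f3", 2: "d7-d6", 3: "d2-d4"}`, which can be used to fill the per-level killer table before the next search begins.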
FIGURE 21 (flow chart; decision boxes include "Is there a cut off? [E(i) worse than E(j)]" and "i = 0? (root of tree)", ending with "Best move = m(0)" and "Exit")
Task 4
(If you have finished task 3 you will find this one much
shorter.) Write a program to play tic-tac-toe, taking
advantage of symmetry and employing the alpha-beta
algorithm. Search the whole of the game tree using the
primitive evaluation function: program win = +1, opponent
win = -1, draw = 0. Generate moves in the order: centre,
corners, middle of edges. (Thus far as in task 3.)
Add, in turn, routines to use the killer heuristic in its
simplest form, and a modification to set the alpha-beta
window to -0.9 and +0.9. Note the effect that each of these
changes has on the time taken to search the whole of the
game tree from the initial position. Add a routine to make use
of the principal continuation, and test this by timing the
program's computation, with and without this routine, after
one move has been made by each side (remember to use
symmetry here also).
The results should bear out the assertions contained in this
chapter.
Sources
CHAPTER 5
More Complex Evaluation
Functions
feature would almost certainly be sufficient to enable the
program to play better than Bobby Fischer. But such is the
nature of the game that a 20-ply search is not yet realistic, let
alone 200-ply, so our evaluation function must have more
features.
In order to discover which features of the game are
important, you may do one or both of two things. You may
read some books on the subject, in the search for general
advice (heuristics), and you may ask someone who is expert at
the game. In answer to your question 'What else is important
in chess, apart from material?', you may well receive the
reply 'Control of the central squares'. On investigating
further you discover that pieces in the centre can move to, or
attack, more squares than pieces on an edge or in a corner.
And pieces that attack central squares may eventually be able
to move to a central square, so attacking central squares is a
useful thing to do.
Further questioning, and/or reading, will reveal that if
your pieces are getting in each other's way they will not be
able to do very much, whereas if they have plenty of scope to
move they will be more likely to help you improve your
position; thus it is important for your pieces to have as many
moves as possible.
Everyone knows that the king is the most important piece
in chess, so obviously one should look after one's own king.
Expert advice will tell you to keep it away from the centre of
the board until the final stage of the game has been reached;
castle during the opening stage so as to put your king nearer a
corner, where it will be safer than on its original square; and
don't rashly advance the pawns in front of it once you have
castled. You can learn all this from any decent book on the
game.
A fifth feature, whose importance is often underestimated,
is pawn-structure. Good chess players know that 'isolated
pawns', that is pawns which do not have any supporting
pawns on adjacent columns, are weak, because if the
opponent attacks them they can be defended only with
something more valuable than a pawn, and it is always best to
use your less valuable pieces for defence. Also, it is usually a
disadvantage to have 'doubled' pawns, i.e. two of your own
pawns, one in front of the other, since they will not be able to
defend each other and the front one will block its colleague's
path.
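Both pawn-structure features can be measured from nothing more than the columns on which a side's pawns stand. The Python sketch below assumes the pawns are supplied as a sequence of column letters 'a' to 'h', one per pawn; the function name and the convention of counting each extra pawn on a column as one doubled pawn are choices made for this example.

```python
from collections import Counter


def pawn_structure(pawn_columns):
    """Return (isolated, doubled) pawn counts for one side.

    pawn_columns holds one column letter ('a'..'h') for each pawn.
    """
    counts = Counter(pawn_columns)
    # Every pawn beyond the first on a column stands in front of or behind another.
    doubled = sum(n - 1 for n in counts.values() if n > 1)
    # A pawn is isolated when neither adjacent column holds a friendly pawn.
    isolated = sum(
        n for column, n in counts.items()
        if chr(ord(column) - 1) not in counts and chr(ord(column) + 1) not in counts
    )
    return isolated, doubled


isolated, doubled = pawn_structure("aacdh")
```

With pawns on a, a, c, d and h, the two a-pawns and the h-pawn have no neighbours, so `isolated` is 3 and `doubled` is 1. An evaluation function could then subtract a weighted penalty for each count.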
To summarise this stage of function building: read some
good books on the game and try to get advice from a strong
player. You need to know which features in a position are
important, and you need to understand why they are important
so that you can measure roughly how much of each
feature is present in a position.
FIGURE 22
Boxes
Task 5
Sources
CHAPTER 6
Card Games-Guessing the Odds
Deduced probabilities
Shuffling
            A    K    Q    J    10   9    8    7    6    5    4    3    2
SPADES:    0.0  0.0  0.5  0.5  0.5  0.5  0.5  0.0  0.5  0.5  0.0  0.5  0.0
HEARTS:    0.5  0.5  0.0  0.5  0.0  0.5  0.5  0.0  0.5  0.0  0.5  0.5  0.5
DIAMONDS:  0.5  0.0  0.5  0.5  0.0  0.0  0.5  0.5  0.0  0.5  0.5  0.5  0.0
CLUBS:     0.5  0.5  0.5  0.0  0.5  0.5  0.0  0.5  0.0  0.5  0.0  0.5  0.5
FIGURE 23
Bayes' theorem
Let us suppose that there are two bags, each containing five
balls. Bag A contains 1 white and 4 black balls, bag B
contains 3 white and 2 black balls. I take a ball at random
from one of the bags, and the ball is white. What is the
probability that I took the ball from bag A?
The probability that a ball selected at random from bag A
will be white is 1/5.
The probability that a ball selected at random from bag B
will be white is 3/5.
Bayes' theorem shows that the probability that a randomly
selected white ball actually came from:
$$\text{bag A} = \frac{1/5}{1/5 + 3/5} = \frac{1}{4}$$
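The same calculation can be checked with exact fractions. The sketch below assumes that each bag is equally likely to be chosen, which is why the prior probabilities cancel in the formula above; the function and variable names are invented for the example.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' theorem: probability of each hypothesis given the observation."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


# Assumed prior: either bag is chosen with probability 1/2.
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
# Probability of drawing a white ball from each bag.
white = {"A": Fraction(1, 5), "B": Fraction(3, 5)}

result = posterior(priors, white)
```

`result["A"]` is `Fraction(1, 4)` and `result["B"]` is `Fraction(3, 4)`. In a card game the hypotheses would be the possible holders of an unseen card, and the likelihoods would come from how probable the observed play is under each hypothesis.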
that we are two tricks from the end of a hand of our three-
player card game.
We hold: A of Spades, 10 of Diamonds.
Bill holds: J of Diamonds, 5 of Clubs.
John holds: 3 and 2 of Clubs.
It is our turn to lead (remember that Spades are trumps).
The program now constructs a game tree, of depth 6-ply.
Part of the tree will look like Figure 24. We assign to the
terminal nodes of the tree, scores corresponding to the
number of tricks won by each player, and we back-up
FIGURE 24 (part of the game tree, showing the plays available to Bill and John)
through the tree until we can determine which card should be
played next. In this example the situation is simple because if
we lead the A of Spades first we may take two tricks, whereas
if we lead the 10 of Diamonds we can only make one trick.
Note the use of the word 'may'. In order to make two tricks
we need some help from Bill, who must make a mistake and
discard the J of Diamonds in the hope that our second card is
the 2 or 3 of Clubs and he will make his 5. But since we lose
nothing by playing the A of Spades first, that is clearly the
best way to continue. How can we modify our traditional
methods of tree-searching to cater for situations such as this
one, in which we wish to allow for the possibility that our
opponent will make a mistake? Fortunately the problem has
been solved for us, by the ubiquitous Donald Michie, whose
name crops up time and again in interesting research reports
on various topics within the science of Artificial Intelligence.
Task 6
Find or invent a simple card game in which information may
be deduced from the play of the cards. (Avoid bidding games,
unless you are extremely confident and have many free
hours.) Write a program to play this game, modifying the
probability estimates of the unseen cards in the light of the
user's play. Experiment with various methods of adjusting
these estimates until the program plays at least moderately
sensibly. At the point in the game where exhaustive search
will not be too time consuming, set up a probabilistic game
tree a la Michie to search to the end of the game.
Sources
Michie's work
FIGURE 25
game will inevitably be a draw. So the expected result from
making move $M_1$ is 0.5.
If he makes move $M_2$ the player sees that his opponent can
defeat him, but only by finding a 15-ply deep continuation
that is very difficult to spot. Otherwise, our player will win.
He assesses the probability of his opponent finding this
15-ply win as being 0.1. The expected result from making
move $M_2$ is therefore
$$(0.1 \times 0) + (0.9 \times 1) = 0.9$$
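The comparison between the two moves is an expected-value calculation, which the following Python sketch reproduces. Each move is described by a list of (probability, result) pairs, with a win scored as 1, a draw as 0.5 and a loss as 0 as in the text; the data layout and function names are assumptions made for the example.

```python
def expected_result(outcomes):
    """Sum of probability * result over the possible outcomes of a move."""
    return sum(probability * result for probability, result in outcomes)


def best_move(moves):
    """Name of the move with the highest expected result."""
    return max(moves, key=lambda name: expected_result(moves[name]))


moves = {
    "M1": [(1.0, 0.5)],               # a certain draw
    "M2": [(0.1, 0.0), (0.9, 1.0)],   # opponent finds the win with probability 0.1
}
choice = best_move(moves)
```

The expected results are 0.5 for M1 and 0.9 for M2, so `choice` is `"M2"`, even though a conventional minimax search, which assumes a perfect opponent, would reject M2 as losing.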
Box 1.
FIGURE 26
Discernibility
Bluffing
Rummy
Poker
FIGURE 27
FIGURE 28
For Ben the probability of his calling the bet is 0.3, so the
expected income is 70-30=40. For Joe the probability is
0.6, so the expected income is 40 - 60 = -20. So the program
would not try to bluff in this situation against Joe, but it
would against Ben, and it would determine whether or not to
try a bluff against the other players in a similar manner. If
you are writing a poker program and your computer system
supports a cassette or disk, it will be possible for you to retain
the information learned during one playing session for use in
the next. Of course it is quite possible for one or more of the
program's opponents to change his style from one session to
another, but it is always useful to have some reference point
at the start of a game. For players with unknown character-
istics, the program will employ fixed estimates, stored in a
table, which can be updated during a playing session as the
program learns how each player acts at the poker table. It will
also be possible for an intelligent program to make certain
generalisations: for example, Joe is quite likely to call a
possible bluff in a parallel but as yet unrecorded situation
(the probability of the bettor having a cast iron cinch being
roughly the same as his having a flush when showing four
cards of the same suit). Again this is largely a matter of
learning.
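The bluffing decision for Ben and Joe can be sketched as follows. The figures of 100 won when the bluff is not called and 100 lost when it is called are inferred from the arithmetic quoted above (70 - 30 and 40 - 60) and should be treated as assumptions, as should the function names and the small table of calling probabilities.

```python
def bluff_expected_income(p_call, win_if_not_called=100, loss_if_called=100):
    """Expected income from a bluff against a player who calls with probability p_call."""
    return (1 - p_call) * win_if_not_called - p_call * loss_if_called


def should_bluff(p_call, win_if_not_called=100, loss_if_called=100):
    """Bluff only when the expected income is positive."""
    return bluff_expected_income(p_call, win_if_not_called, loss_if_called) > 0


# Estimated calling probabilities; such a table could be saved between sessions
# and updated as the program learns how each player behaves.
call_probability = {"Ben": 0.3, "Joe": 0.6}
decisions = {name: should_bluff(p) for name, p in call_probability.items()}
```

The expected incomes are 40 against Ben and -20 against Joe, so `decisions` is `{"Ben": True, "Joe": False}`.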
Task 7
CHAPTER 8
Checkers
At the next ply the program ignored all moves for which
the previous move was not a capture, and at the sixth ply and
deeper levels in the tree only capture moves were examined.
By the time the program reached this depth the number of
moves being examined from any position was small, but it
was still possible for the program to find itself getting
involved in ridiculous capture sequences, and so at a depth of
11-ply the search would terminate if either side was more than
two kings ahead (an overwhelming advantage). At 20-ply the
search terminated under all circumstances so that the program
did not run out of memory for storing the tree.
Samuel's criteria for pruning the tree were chosen in such a
way as to encourage the evaluation of positions that were
quiescent, and to discourage evaluation in turbulent
positions. The concepts of quiescence and turbulence are
perhaps better understood when related to the two different
aspects of game playing: strategy and tactics. Strategy
involves planning and manoeuvring. Tactics (e.g. capturing)
are used to punish blunders and to convert a strategic
advantage into something more concrete, such as material.
The argument in favour of Samuel's approach is that the
three-ply of exhaustive search gives the program some
strategic grasp of what is happening, while the deeper tactical
search ensures that it does not perform erroneous evaluations
in turbulent positions. The necessity to restrict the deeper
search in this way is clearly dependent on the nature of the
game and the number of branches at each node of the tree
(the branching factor). The number of positions evaluated in
a minimax is roughly proportional to $b^d$ where b is the
average branching factor and d is the depth of search, and
anything that can be done to reduce the 'b' will produce a
combinatorial improvement in playing speed.
The evaluation function used in the early version of
Samuel's program employed 39 terms or features, only 17 of
which were in use at any one time. The features were
temporarily suspended from duty if and when it was found
that they did not contribute significantly to the evaluation
process. Correlation measurements indicated which of the 17
features currently in use were the least effective, and once the
effectiveness dropped below some threshold value they were
replaced by the features at the top of the reserve list, while the
rejects were added to the bottom of the reserve list. Material
was the dominant feature, and Samuel recognised the need to
encourage the program to trade off pieces when it was ahead
but to avoid exchanges when behind. This may be accomplished
in various ways but the most reliable is probably to
determine the value of:
$$(\text{program material} - \text{opponent's material}) \times \frac{\text{greater side's material}}{\text{lesser side's material}}$$
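A short Python sketch shows why this expression rewards exchanges when ahead and penalises them when behind. The function name and the material counts used in the example are invented for illustration, and the treatment of a side with no material at all (raising an error) is an assumption, since the text does not cover that case.

```python
def material_term(program_material, opponent_material):
    """(program - opponent) * (greater side's material / lesser side's material)."""
    greater = max(program_material, opponent_material)
    lesser = min(program_material, opponent_material)
    if lesser == 0:
        # Assumed handling: the ratio is undefined when one side has nothing left.
        raise ValueError("ratio undefined when one side has no material")
    return (program_material - opponent_material) * greater / lesser


ahead_before = material_term(10, 8)    # program is two units ahead
ahead_after = material_term(9, 7)      # same lead after an even exchange
behind_before = material_term(8, 10)   # program is two units behind
behind_after = material_term(7, 9)     # same deficit after an even exchange
```

The difference in material is unchanged by an even exchange, but the ratio grows as material comes off the board: the term rises from 2.5 to about 2.57 when the program is ahead, and falls from -2.5 to about -2.57 when it is behind.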
A full list of the other features in the linear part of the
evaluation function is given below. There were, in addition,
some non-linear terms in the function. In the following list
the board notation is as used in the checkers literature
(see Figure 29):
FIGURE 29
ADV (Advancement)
The parameter is credited with 1 for each passive man in the
5th and 6th rows (counting in passive's direction) and debited
with 1 for each passive man in the 3rd and 4th rows.
APEX (Apex)
The parameter is debited with 1 if there are no kings on the
board, if either square 7 or 26 is occupied by an active man,
and if neither of these squares is occupied by a passive man.
CRAMP (Cramp)
The parameter is credited with 2 if the passive side occupies
the cramping square (13 for Black, and 20 for White) and at
least one other nearby square (9 or 14 for Black, and 19 or 20
for White), while certain squares (17, 21, 22 and 25 for Black,
and 8, 11, 12 and 16 for White) are occupied by the active
side.
DYKE (Dyke)
The parameter is credited with 1 for each string of passive
pieces that occupy three adjacent diagonal squares.
EXCH (Exchange)
The parameter is credited with 1 for each square to which the
active side may advance a piece and, in so doing, force an
exchange.
EXPOS (Exposure)
The parameter is credited with 1 for each passive piece that is
flanked along one or the other diagonal by two empty
squares.
GAP (Gap)
The parameter is credited with 1 for each single empty square
that separates two passive pieces along a diagonal, or that
separates a passive piece from the edge of the board.
HOLE (Hole)
The parameter is credited with 1 for each empty square that is
surrounded by three or more passive pieces.
MOVE (Move)
The parameter is credited with 1 if pieces are even with a total
piece count (2 for men, and 3 for kings) of less than 24, and if
an odd number of pieces are in the move system, defined as
those vertical files starting with squares 1, 2, 3 and 4.
NODE (Node)
The parameter is credited with 1 for each passive piece that is
surrounded by at least three empty squares.
POLE (Pole)
The parameter is credited with 1 for each passive man that is
completely surrounded by empty squares.
RECAP (Recapture)
This parameter is identical with Exchange, as defined above.
(It was introduced to test the effects produced by the random
times at which parameters are introduced and deleted from
the evaluation polynomial.)
THRET (Threat)
The parameter is credited with 1 for each square to which an
active piece may be moved and in so doing threaten the
capture of a passive piece on a subsequent move.
Different sets of weightings were tried in the evaluation
function and an initial set was chosen by playing through a
series of checker games from a book and computing the
correlation coefficient of the moves chosen by the program
and those chosen by the original (human) player.
Rote learning
Move-phase tables
FIGURE 30
Sources
CHAPTER 9
Chess
In the beginning
your program should look further afield if the first move that
it comes up with is seen to be bad. In this case, after
examining the 7 chosen moves from the root of the tree, the
program could see that it was losing material to 22 ...
d8-d1, but was powerless to stop it. Had it been permitted to
continue its search it would have found a 'better' move
before too long. There is a parallel here between the drastic
forward pruning method employed by Bernstein, and the
iterative deepening approach used by many of today's
programs. With iterative deepening, a program finds the best
move it can after a 1-ply search, then it increases the depth to
2-ply and looks for a better move, then to 3-ply, and so on,
until it runs out of time. Similarly, a forward pruning
program should be permitted to continue its search by
relaxing the pruning requirements, if it cannot find a
satisfactory move early on in its search. Instead of searching 7
moves at each level, Bernstein could have examined (say) 5
moves at each level in less than one-third of the time, then
when the program discovered that 21 h2-h4 and its four
brothers were all dreadful moves, it could have examined all
the other moves from the root of the tree, and the best five
successors to each of them. This would have resulted in only a
slight increase in total computation time for the move, but it
would have enabled the program to see the immediate tactical
consequence overlooked by the 'best seven' approach.
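The iterative deepening loop described above can be sketched as follows. The Node class, the negamax formulation and the out_of_time callback are assumptions made for illustration; none of them is taken from Bernstein's program or from any other program named here.

```python
class Node:
    """Toy position: a static score (from the side to move) and its successors."""
    def __init__(self, score, children=()):
        self.score = score
        self.children = list(children)

def negamax(node, depth):
    """Depth-limited search; the value is from the side to move at `node`."""
    if depth == 0 or not node.children:
        return node.score
    return max(-negamax(child, depth - 1) for child in node.children)

def iterative_deepening(root, max_depth, out_of_time):
    """Return the index of the best root move found before time runs out."""
    best = None
    for depth in range(1, max_depth + 1):
        if out_of_time():
            break  # keep the answer from the last completed depth
        best = max(range(len(root.children)),
                   key=lambda i: -negamax(root.children[i], depth - 1))
    return best

# Move 0 looks best after 1 ply but loses to the reply; move 1 is safer.
root = Node(0, [Node(-5, [Node(-9)]), Node(-1, [Node(1)])])
print(iterative_deepening(root, 1, lambda: False))
print(iterative_deepening(root, 2, lambda: False))
```

In this sketch the clock is consulted only between depths; a practical program would also have to check it inside the search. A forward pruning program could relax its width limit in the same outer loop instead of, or as well as, increasing the depth.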
piece in its old and new locations and a strategic gain was
taken as an indication that the move should be on the
plausible move list. In other words, if a move appears to put a
piece on a better square, that move is worth further exami-
nation.
The program encouraged certain types of attack on squares
that were considered possible weak points, for example weak
pawns, pinned pieces, and pieces defending other pieces.
Moves which fell into these categories were also added to the
plausible move list.
MacHack performed an alpha-beta search, with forward
pruning. The plausible move generator would select a number
of moves at each level of look-ahead, and add to this number
any moves which satisfied certain conditions: all safe checks
were examined; at the first and second plies all captures were
investigated; the moves of a certain number of distinct pieces
were examined, so that the program would not ignore most of
the board if all of the moves of a single piece were highly
plausible. The minimum number of moves selected by the
plausible move generator was normally six at each level of
look-ahead, but in tournament mode, i.e. when playing at a
rate of 2-3 minutes per move, the program would examine a
minimum of 15 moves at the first two ply, nine moves at the
next two ply, and seven moves at each subsequent level. Only
when the minimum number did not exist (for example when
one side was in check or had only its king on the board)
would the search be narrower, though of course the alpha-
beta algorithm would often prune away branches on which
there were plausible moves.
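The width limits quoted above might be coded along the following lines. This is a sketch under stated assumptions: the Candidate record, its plausibility score and both function names are hypothetical, and the rule about examining the moves of a number of distinct pieces is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    move: str
    plausibility: int          # higher means more plausible (assumed scale)
    safe_check: bool = False
    capture: bool = False

def minimum_width(ply: int, tournament: bool) -> int:
    """Minimum number of moves examined at a given ply (1 = root moves)."""
    if not tournament:
        return 6
    if ply <= 2:
        return 15
    if ply <= 4:
        return 9
    return 7

def select_moves(candidates, ply, tournament=False):
    """Keep the most plausible moves, then add safe checks and early captures."""
    ranked = sorted(candidates, key=lambda c: c.plausibility, reverse=True)
    chosen = ranked[:minimum_width(ply, tournament)]
    for c in ranked[len(chosen):]:
        if c.safe_check or (c.capture and ply <= 2):
            chosen.append(c)
    return chosen

moves = [Candidate(f"m{i}", plausibility=20 - i) for i in range(20)]
moves[19].safe_check = True    # the least plausible move is a safe check
moves[18].capture = True       # the next least plausible is a capture
print(len(select_moves(moves, 1)), len(select_moves(moves, 3)))
```

If fewer candidates exist than the minimum, the slice simply returns them all, which matches the narrower search described for positions such as checks.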
One of the few advantages that mainframe programmers
have over those writing for a micro, is the availability of
an enormous backing store. This enables a program to employ
transposition tables, which are advantageous in preventing
the program from evaluating the same position more than
once. In chess, as in many other games, it is frequently
possible to reach the same position by different routes, and
we call this phenomenon transposition. As a simple example,
if White makes move A, Black makes move B, and White
then makes move C, we can reach the same position as if
White had made his moves A and C in the reverse order.
MacHack produced a hash value for every position evaluated
in the tree search, and together with this value the program
stored the score for the position and a note of the depth of
search at which the evaluation took place. If the position was
created again during the search, the program would not
recompute the score for the position but would take it from
the value stored together with the hash for that position.
Even though MacHack stored only 32,000 positions in
hashed form, it was able to save considerable computation
time and as a side benefit, it was quickly able to detect draws
by repetition.
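A transposition table of this kind can be sketched in a few lines. The layout below (one slot per hash value modulo the table size, newest entry overwriting the old) is an assumption for illustration and not a description of MacHack's actual table; position_key is a toy stand-in for a real position hash.

```python
TABLE_SIZE = 32_000   # the figure quoted for MacHack; the layout is assumed

class TranspositionTable:
    def __init__(self, size=TABLE_SIZE):
        self.size = size
        self.slots = [None] * size

    def store(self, key, score, depth):
        """Remember the score and the search depth at which it was computed."""
        self.slots[key % self.size] = (key, score, depth)

    def lookup(self, key):
        """Return (score, depth) if this exact position was stored, else None."""
        entry = self.slots[key % self.size]
        if entry is not None and entry[0] == key:
            return entry[1], entry[2]
        return None

def position_key(moves_played):
    """Toy hash: a position is identified by the set of moves that led to it,
    so playing A and C in either order gives the same key."""
    return hash(frozenset(moves_played))

table = TranspositionTable()
table.store(position_key(["A", "B", "C"]), score=37, depth=3)
print(table.lookup(position_key(["C", "B", "A"])))
```

The stored depth lets the caller decide whether a score found by a shallower search is good enough to reuse.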
The MacHack program represents the first really signifi-
cant milestone after Shannon's paper because it was the first
program to make good use of the Shannon-B strategy. The
strength of the program in 1967 was extremely impressive and
created considerable publicity for computer chess among the
computing and chess fraternities. This publicity served as the
impetus for many of the groups which started programming
around 1967 or 1968, for example the Slate/Atkin/Gorlen
group at Northwestern University and Newborn at Columbia
University. In fact Greenblatt and his colleagues probably did
as much for computer chess in 1967 as Shannon had done
almost 20 years earlier.
I should like to offer you two examples of the playing
strength of the Greenblatt program. The first is a position
which was shown to several strong American chess players,
including Masters, and defeated a number of them.
The position in Figure 31 is a win for Black, who has an
extra knight for a pawn. But the task is to find a quick win. If
White is allowed to survive he might conjure up counterplay
based on the exposed position of the Black king and the
weakness of Black's pawns on g6 and a7. How can Black
force a quick win?
FIGURE 31
2 ... f2-e2
3 g7-h8+ d8-c7
4 h8-f6 e2-e1+
5 Resigns
To show that a computer program is a good chess player, it
is not enough to give an example of its tactical prowess. The
very best programs are extremely adept at tactical combina-
tions but are often let down by their poor strategic under-
standing. So the proof of the whole pudding must lie in
the examination of complete games. The following is the first
game ever won by a computer program in a chess tournament.
Its opponent was rated 1510 on the USA rating scale,
equivalent to a weak club player. The game was played
in the Massachusetts State Championship, 1967.
WHITE: MacHack VI
BLACK: Human
1 e2-e4 c7-c5
2 d2-d4 c5-d4
3 d1-d4
MacHack knew no openings at that time and plays very
much as many of today's commercially available machines.
This type of opening is bad for White because it allows Black
to bring out his pieces 'free of charge', by using developing
moves to harass the White queen.
3 ... b8-c6
4 d4-d3 g8-f6
5 b1-c3 g7-g6
6 g1-f3 d7-d6
7 c1-f4 e7-e5
A dubious decision. The human was obviously worried
about the possible advance of the White pawn from e4 to e5,
but Black should have continued 7 ... f8-g7, and if e4-e5,
then f6-h5, attacking White's bishop.
8 f4-g3 a7-a6
9 e1-c1 b7-b5
10 a2-a4 f8-h6+?
An ineffective move that weakens an important central
pawn. One gets the impression that the human felt he could
take risks against MacHack.
11 c1-b1 b5-b4
12 d3-d6
Black, when making his tenth move, almost certainly
overlooked the fact that on the d6 square, White's queen or
rook will fork the two Black knights on f6 and c6, thereby
rendering harmless Black's threat to the White knight on c3.
12 ... c8-d7
13 g3-h4 h6-g7
14 c3-d5 f6-e4
15 d5-c7+
Black may have overlooked this response, but in any event
his position was hopeless.
15 ... d8-c7
16 d6-c7 e4-c5
17 c7-d6 g7-f8
18 d6-d5 a8-c8
19 f3-e5 d7-e6
20 d5-c6+
MacHack spots a simple queen sacrifice that forces mate.
20 ... c8-c6
21 d1-d8 mate
shall offer you the following game, which was its first ever
win over a human Grandmaster. The game was played at blitz
speed, which requires each player to make all of his moves
within five minutes. In fact the rules were slightly different
for the two participants - Stean was playing in real time but
the program was permitted a total of five minutes for CPU
time and satellite transmission time, with no penalty for the
time taken by its human operator to move the pieces.
WHITE: CHESS 4.6
BLACK: Stean
1 e2-e4 b7-b6
2 d2-d4 c8-b7
3 b1-c3 c7-c5
4 d4-c5 b6-c5
5 c1-e3 d7-d6
6 f1-b5+ b8-d7
7 g1-f3 e7-e6
8 e1-g1 a7-a6
9 b5-d7+ d8-d7
10 d1-d3 g8-e7
11 a1-d1 a8-d8
12 d3-c4 e7-g6
13 f1-e1 f8-e7
14 c4-b3 d7-c6
15 g1-h1
It is peculiar moves such as this one which make it possible
to recognise the play of a computer. A strong human player
would never move his king onto a diagonal occupied by his
opponent's queen and bishop, unless forced.
15 ... e8-g8
16 e3-g5 b7-a8
17 g5-e7 g6-e7
18 a2-a4 d8-b8
19 b3-a2 b8-b4
20 b2-b3
If we sum up what has happened so far, it is clear that
Black has a dominating position. His pawns control the
centre while White's e4 pawn attacks only one central square.
Black's pieces are active, White's are passive. But the
program has one important advantage - its opponent thinks
that to all intents and purposes the game is over and he tries
to take the program's position by storm. This is exactly the
opposite of the way one should play against a strong pro-
gram - the tactical search will reveal tricks that the human
misses, especially at this breakneck speed.
20 ... f7-f5?
A mistaken attempt to open up the diagonal to the White
king.
21 f3-g5 f5-e4
22 c3-e4 f8-f2
This move appears, at first glance, to be very strong. If
now 23 e4-f2, Black's queen immediately gives mate on g2.
But the program had seen further in the crucial variation than
its opponent.
23 dl-d6!
When he saw this move Stean exclaimed, 'Bloody iron
monster'. The point is that Black's queen is needed to prevent
d6-d8 mate, and the queen is attacked. If the queen moves to
a square that protects d8, White can then capture the rook on
f2. So White must win material.
23 ... c6-d6
The best try.
24 e4-d6 f2-g2
Threatening to move the rook to g5, c2 or e2, with check
from the bishop on b7. Any of these moves would win for
Black, but ...
25 g5-e4
Blocking the crucial diagonal.
25 ... g2-g4
26 c2-c4
Blocking off another line of attack.
26 ... e7-f5
27 h2-h3
Stean had hoped for 27 d6-f5 e6-f5, when Black wins the
other knight which is pinned against the White king. When
the computer played h2-h3 Stean cried out, 'This computer is
a genius'.
27 ... f5-g3+
28 hl-h2 g4-e4
29 a2-f2!
Yet another tactical blow. Black had only expected 29
d6-e4 g3-e4, when he has sufficient material to make the
program's task quite difficult. But this latest move,
threatening mate by f2-f7+ and then f7-f8 mate, forces an
even greater material advantage.
29 ... h7-h6
30 d6-e4 g3-e4
31 f2-f3 b4-b8
32 e1-e4 b8-f8
33 f3-g4 a8-e4
34 g4-e6+ g8-h8
35 e6-e4 f8-f6
36 e4-e5 f6-b6
37 e5-c5 b6-b3
38 c5-c8+ h8-h7
39 c8-a6 Black resigns
There was once a time when leading experts in computer
science would say that 'Computers can't play chess'.
Up to now I have discussed the most important milestones
in computer chess since the time Claude Shannon's famous
paper was first published in 1950. Next I shall survey the
current state of the art in microcomputer chess programming,
as typified by the play in the first World Microprocessor
Chess Championship which was held as part of the 3rd
Personal Computer World Show in London, in September
1980.
We shall be examining some of the critical moments of the
games, and I shall give a deep analysis of a key game between
Boris Experimental and Chess Challenger. I hope that from
these episodes the reader will learn of a number of important
pitfalls that should be avoided in writing chess programs.
Remember - it is not necessary to be a strong chess player
yourself, but you should get some advice from someone who
is at least club strength.
By now the reader should be thoroughly familiar with
standard computer chess notation, so from now on I shall
employ a more complete form of notation, for ease of under-
standing.
Checkmate!
The endgame
83 ... Qd5-e5+
84 Ke7-d7 Qe5-d4+
85 Kd7-<:6 Qd4-c4+
This last move produces a position which has now occurred
three times. Under the rules of chess the game is drawn under
such circumstances.
Drawing by threefold repetition in this way is a frequently
overlooked problem in microcomputer chess programs. In
order to detect a repetition it is necessary for the program to
store the move sequence going back as far as the last pawn
move or the last capture, and so a 50 move, or 100-ply
sequence must be allowed for. (Another rule is that a game is
drawn if 50 moves are played by each side without a pawn
being moved or a piece being captured.) The program may
then examine the move selected by the tree search (or any
move within the tree search) to see if it produces a position
that has already occurred. If so, the value of this move is set
to a draw, and the move is only made if there is no alternative
move which keeps the advantage. Here the program would
reject the draw because it would eventually give up its queen
for the White pawn, leaving itself with an extra pawn.
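The bookkeeping described above can be sketched as follows. The class and function names are hypothetical, and the position key is assumed to be some value (for example a hash) that also encodes the side to move.

```python
DRAW = 0   # assumed score for a drawn position

class RepetitionHistory:
    """Positions reached since the last pawn move or capture."""
    def __init__(self):
        self.keys = []

    def record(self, key, irreversible):
        """Add the position after a move; a pawn move or capture means
        no earlier position can ever recur, so the list starts afresh."""
        if irreversible:
            self.keys.clear()
        self.keys.append(key)

    def would_repeat(self, key):
        return key in self.keys

def adjusted_score(search_score, key, history):
    """Score a move as a draw if it recreates an earlier position."""
    return DRAW if history.would_repeat(key) else search_score

history = RepetitionHistory()
for key, irreversible in [("P1", True), ("P2", False), ("P3", False)]:
    history.record(key, irreversible)
print(adjusted_score(900, "P2", history), adjusted_score(900, "P4", history))
```

Because the list is cleared at every pawn move or capture, the 50-move rule keeps it to at most 100 entries. Scoring the first recurrence as a draw is a cautious choice: the rules require three occurrences, but a program that is winning loses nothing by avoiding the second.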
Another method for avoiding this problem in the game
given above is to have a simple routine which measures
whether or not a passed pawn can be caught by the enemy
king, before it reaches the promotion square. If it cannot be
caught, and if there is nothing else on the board except the
kings, the program can set the value of this pawn to the value
of a queen. In the final position Black has the advantage of
queen and pawn vs. pawn, or ten pawns to one (for a differ-
ence of nine), but if Black gives up its queen for the c-pawn it
will be left with an effective advantage of a queen for
nothing - still a nine-pawn difference but 9--0 is better than
10-1. Simple ideas like this can often make the difference
between a win and a draw.
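A routine of the kind suggested above might look like this. It is a minimal sketch, assuming a White pawn promoting on the 8th rank, files and ranks numbered 1 to 8, and nothing else on the board that matters; it ignores the pawn's own king, which may block or support the pawn. The function names and the values of 9 and 1 are my own labels for the queen and pawn values used in the text.

```python
QUEEN, PAWN = 9, 1

def moves_to_promote(rank: int) -> int:
    """Moves a White pawn on `rank` (2 to 7) needs to reach the 8th rank."""
    return 5 if rank == 2 else 8 - rank   # the initial double step saves a move

def king_catches_pawn(pawn, king, defender_to_move: bool) -> bool:
    """pawn and king are (file, rank) pairs; king is the defending king."""
    pawn_file, pawn_rank = pawn
    king_file, king_rank = king
    # King moves needed to reach the promotion square.
    distance = max(abs(king_file - pawn_file), abs(king_rank - 8))
    if defender_to_move:
        distance -= 1
    return distance <= moves_to_promote(pawn_rank)

def pawn_value(pawn, king, defender_to_move: bool) -> int:
    """Value the pawn as a queen if the enemy king cannot catch it."""
    return PAWN if king_catches_pawn(pawn, king, defender_to_move) else QUEEN

# Pawn on a6, enemy king on d8: the pawn queens only if it moves first.
print(pawn_value((1, 6), (4, 8), defender_to_move=False))
print(pawn_value((1, 6), (4, 8), defender_to_move=True))
```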
Mobility
FIGURE 34
11 Qh6-f8+ Rh8-f8
12 Bf1-d3 Bc8-d7
13 Ke1-g1
Everyone knows that it is important to castle in chess, so as
to unite the rooks and get the king into safety, but in some
positions it is much better not to castle, so that the king will
be nearer the centre of the board in readiness for the end-
game. This is often true when queens have already been
exchanged, as in the present position, but few (if any) chess
programs utilise this heuristic.
13 ... Ra8-d8
14 Ra1-b1 Bd7-c8
15 Bg5-h6 Rf8-e8
16 Rf1-e1 Ne7-g8
A passive and unnecessary move, even though it drives
away the bishop (which was no longer doing anything useful
on h6). I noticed more than once at the World
Championships that when programs get into a passive
position they do not understand how to play to improve the
freedom of their pieces. Here, for example, Black should try
for counterplay on the queen side by means of ... Nc6-a5,
... b7-b6 and ... c7-c5.
17 Nf3-g5+ Kf7-e7
18 Bh6-g7
Threatening the pawn on h7.
18 ... h7-h6
19 Ng5-h7?!
A highly dubious plan which deserves to lose material. If
Black played carefully it could probably trap one of the
White pieces (the knight at h7 or the bishop at g7).
FIGURE 35
19 ... Ke7-f7!
A good move, but Black misses the main point of the idea.
20 Bg7-f6 Ng8-f6??
A blunder. After 20 ... Rd8-d7, White would have to go
through great contortions even to try to save his knight from
permanent incarceration. The immediate threat would be 21
... g6-g5, followed by 22 ... Kf7-g6. If White played 20
h2-h4, to prevent the advance of the Black g-pawn, Black
could respond 20 ... b7-b6, followed by ... Bc8-b7, ...
Re8-c8 and ... Kf7-e8, leaving the knight with no place to
go. Such plans are not difficult for human players, who
realise that entombed pieces are liable to be trapped, but
planning is one of the most difficult aspects of computer
chess.
21 Nh7-f6 Re8-e7
22 h2-h4 b7-b6
23 h4-h5 g6-g5
24 g2-g3 a7-a6
Although White's knight still cannot get out of the Black
camp, it will never be in any danger so long as White can
maintain a pawn on e5. For this reason, among others, Black
should still be striving to play ... c7-c5, with the idea of
undermining White's pawn structure. But again this plan is
far too long term for an innocent computer program.
Nc6-a5!
At last, Black begins to do something positive.
26 g3-g4!
But it is too late. White crashes in on the king side. If now
26 ... f5-f4, 27 Bd3-g6+ Kf7-g7, and White will soon
extricate its knight via e8, once Black moves one of his rooks
away from its defence of that square.
26 ... b6-b5
27 Kgl-g2 Na5-c4?
Not understanding the position, Black makes an obvious-
looking move which is strategically wrong. In positions of
this type ... c7-c5 is just about the only way to create satis-
factory counterplay.
28 Bd3-c4 d5-c4
28 ... b5-c4 was best, hoping to keep the position closed.
29 g4-f5 e6-f5
30 d4-d5!
Natural and strong. White dominates the centre (see Figure
36).
30 ... Bc8-b7!
The best way to achieve counterplay. Black attacks the
centre.
31 Rbl-dl Bb7-c8?
FIGURE 36
After its previous fine move, this is inexplicable. More
logical would have been 31 ... Kf7-g7, defending the
h-pawn, so that Black could continue with 32 ... g5-g4,
opening up the king side in the hope of creating counterplay.
If White's knight moves off f6 the pawn on d5 can be
captured.
32 Kg2-f2 a6-a5
33 Rdl-bl
With the Black bishop on b7 this would not have been
possible because 33 ... b5-b4 34 c3-b4 a5-b4 35 Rbl-b4
Bb7-d5 is probably satisfactory for Black.
33 ... c7-c6!
34 d5-c6 Rd8-d2+
35 Kf2-g1 Bc8-a6
Black's 33rd move might also be justified by 35 ...
Rd2-c2 36 Rbl-b5 Rc2-c3, when Black has a passed pawn.
36 Nf6-d7 Rd2-c2
37 e5-e6+! Kf7-e8
Not 37 ... Re7-e6 38 Re1-e6 Kf7-e6 39 Nd7-c5+
winning the bishop on a6 (another good reason for preferring
35 ... Rd2-c2).
38 Nd7-f6+ Ke8-f8
39 Nf6-d5
Now Black is helpless.
39 ... Re7-a7
40 e6-e7+ Kf8-e8
41 Nd5-f6+ Ke8-f7
42 e7-e8(Q)+ Kf7-g7
42 ... Kf7-f6 is answered by 43 Qe8-g6 mate.
43 Qe8-g6+ Kg7-f8
44 Re1-e8 mate
FIGURE 37
these problem areas, so that the reader can gain some grasp
of the magnitude of the problem.
The first subject that I wish to discuss in this section is what
we call zugzwang. It is a German word, whose meaning on
the chessboard is that the player whose turn it is to move is at
a disadvantage because, and only because, he must make the
next move. In such a situation any evaluation function,
examining the position as a terminal node on the game tree,
would not realise that the very right to move is, in itself, a
serious or even fatal disadvantage. The position in Figure 37
will give an idea of what I am talking about.
Imagine that this position is the terminal node on a tree,
and the program must evaluate the position. White is a pawn
up for nothing, and Black has no pieces on the board other
than his king, so the program would indicate a very good,
possibly winning score for White. Yet if it is White's turn to
move (usually an added advantage!), the game is a draw.
Why? Because if White makes the only king move that pro-
tects the pawn, Kd6-e6, Black has been stalemated and the
game is drawn; while if White makes any other king move,
Black can reply by capturing the White pawn and again the
game is drawn. So with Black to move, the above position is a
win for White, but with White having the extra plus of the
right to move, the game is only a draw. How can a computer
program know, when evaluating a terminal position, that
zugzwang is a possibility and that it should search deeper just
to see what will happen after another ply or two?
It may seem to the uninitiated, that zugzwang is not really
very important in chess, but in fact the whole of endgame
theory and technique is based on the fact that sooner or later
one player is normally forced into a zugzwang situation. Even
the endgame of king and rook against king involves zug-
zwang: if the defending player were allowed to pass whenever
he wished, and miss his move, then the player with the extra
rook would never be able to win.
The next problem that I wish to discuss involves the
FIGURE 38
FIGURE 39
Sources
CHAPTER 10
Backgammon
Probabilistic trees
Once the search extends beyond one ply, the trees become
probabilistic. We have already encountered such trees in
another form, earlier in this book. The tree in Figure 40 will
enable the reader to understand the problem.
FIGURE 40
FIGURE 41
the problem. Since none of the scores at any ply can be
known until all successor moves from that node have been
examined, there is no way that alpha-beta can be employed.
Of course forward pruning is always possible, using either the
apparent merit of a node measured by the evaluation
function or the backed-up merit as determined from an extra
ply of look-ahead (which would be slow) but this is all that
one can do.
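The reason every successor must be examined becomes clearer when the backing-up step at a dice node is written out. In the sketch below score_for_roll is a hypothetical callback standing for the best backed-up score the player can reach with a given roll; the rest is plain arithmetic on the 21 distinct rolls of two dice.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def dice_rolls():
    """Yield each distinct roll with its probability: 1/36 for a double,
    2/36 for any other combination."""
    for a, b in combinations_with_replacement(range(1, 7), 2):
        yield (a, b), Fraction(1 if a == b else 2, 36)

def expected_value(score_for_roll):
    """Back up a dice node: the probability-weighted mean over all rolls."""
    return sum(p * score_for_roll(roll) for roll, p in dice_rolls())

# Sanity check with a trivial 'score': the mean total of two dice.
print(expected_value(lambda roll: roll[0] + roll[1]))
```

Since the result is an average, no single roll can settle the value of the node, which is why a cut-off in the alpha-beta sense is not available here.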
his opponent has a blot. Clearly the blot danger feature must
use such a calculation in order to arrive at an accurate
estimate of danger, which in turn will discourage the program
from making moves which leave vulnerable blots. In any
game where chance plays a part it is impossible to be sure that
a particular strategy will be foolproof, but it makes good
sense to play with the odds.
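One way such a danger calculation might go is to count how many of the 36 possible rolls allow a man to land on the blot. The sketch below makes a simplifying assumption that is mine, not the text's: it ignores any blocked points lying between the attacker and the blot, so it gives an upper bound on the danger.

```python
def hitting_rolls(distance: int) -> int:
    """Number of the 36 ordered rolls that can hit a blot `distance`
    points away, assuming no intermediate point is blocked."""
    hits = 0
    for a in range(1, 7):
        for b in range(1, 7):
            if a == distance or b == distance or a + b == distance:
                hits += 1      # either die alone, or both dice with one man
            elif a == b and distance in (3 * a, 4 * a):
                hits += 1      # a double played three or four times
    return hits

for d in (1, 6, 7, 12):
    print(d, hitting_rolls(d), round(hitting_rolls(d) / 36, 3))
```

A fuller version would subtract the rolls whose intermediate landing points are blocked, and would combine the counts for several attackers.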
The notion of a blockade is very important in impeding the
progress of your opponent's men, since an enemy man
cannot land on a blocked point (i.e. a point with two or more
of your own men on it). Setting up a succession of adjacent
blockades is a particularly powerful strategy if it can be
successfully adopted because it prevents the opponent from
moving unless he is lucky enough to roll high numbers from a
point just on one side of the blockade. Berliner's program
considers every combination of from zero to seven block-
ading points (seven is the maximum number possible, since
each side has only 15 men), at a distance of from one to 12
points in front of each man. It employs a table of these
blockading patterns to store the number of rolls of the dice
that could legally be played by the side which is trying to pass
the blockade. This number indicates the extent to which each
man is blockaded.
When the two sides' men become disengaged, the 'running
game' begins, so called because each player's men run as fast
as possible towards the inner (or home) table. At this stage of
the game it is possible to estimate fairly accurately the
probability that a particular player will win the game by
bearing off all his men before his opponent is able to do so.
One method of doing this is to 'count' the position - simply
add up the number of steps each man must take before he can
bear off, assuming no wasted motion. This count can be
employed in a simple table to determine the odds of winning,
and such tables are found in most backgammon books. For
example, the books will tell you that if your count is 60 and
your lead over your opponent is four when it is your turn to
roll, the odds are eight to five in your favour. Until the last
few moves of the game, when special heuristics apply, it is
relatively simple for the program to decide which men to
move and in many situations it will make no difference. But it
is just in this stage of the game that the complication of the
doubling cube becomes of paramount importance.
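Counting a position is simple to program. The sketch below takes the count in the conventional way, a man on point n needing n steps to leave the board; the two positions are invented to reproduce the count of 60 and the lead of four mentioned above, and the function names are hypothetical.

```python
def pip_count(men_on_point):
    """men_on_point maps a point number (counted from the bearing-off end)
    to the number of men standing on it."""
    return sum(point * men for point, men in men_on_point.items())

def probability_from_odds(in_favour, against):
    """Convert odds such as 8 to 5 into a winning probability."""
    return in_favour / (in_favour + against)

mine = {6: 5, 4: 3, 3: 4, 2: 3}                  # 15 men
theirs = {6: 5, 5: 2, 4: 3, 3: 3, 2: 1, 1: 1}    # 15 men
lead = pip_count(theirs) - pip_count(mine)
print(pip_count(mine), lead, round(probability_from_odds(8, 5), 3))
```

The table lookup itself is omitted because the odds tables are not reproduced here; odds of eight to five correspond to a winning probability of 8/13.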
Backgammon is traditionally played for stakes. One of the
essential elements of a backgammon set is a cube with the
numbers 2, 4, 8, 16, 32 and 64 on the faces. If a player feels
that he has a good chance of winning he may put the cube
with the 2 face uppermost, at which point his opponent must
either resign or agree to play the game for double the usual
stake. Having accepted a double a player may, later in the
game, double again, by turning the face 4 uppermost. This
process may continue until the players are wagering 64 times
the original stake - people have won and lost fortunes
through the doubling cube.
Not surprisingly, statisticians have calculated formulae
which indicate when a player should double and when a
double should be accepted and these formulae are obviously
easier for a program to apply than for a human. There is,
however, an important psychological aspect to doubling. If
most players double when their probability of winning is
around 0.6, it will be better to double at 0.7 and keep your
opponent in the game (if he assumes that he still has some
chance he will be less likely to resign, and you will win twice
as much).
Backgammon books give quite a lot of useful information
on when a player should double and when a double should be
accepted. This makes the programmer's task easier, and helps
to reduce the element of skill in the game below its normally
tiny amount. My own view of the game is that it is rather
shallow, with virtually no scope for brilliant or imaginative
play but with features that allow a fast mind to score a steady
though slight advantage against a player with a lesser facility
for calculation. It can be a fun game to program, with plenty
of scope for neat graphics work, and the problem of coping
with the enormous trees certainly makes it a challenge to the
serious games programmer.
Sources
165
CHAPTER 11
Stud Poker
Five-card stud
Briefly, each player is dealt one card face down and one card
face up, and may look at his own down card. A round of
betting takes place, and all those who put in the necessary
amount of money on this round will stay in the game and
receive a second face-up card (the others drop out of the
hand). After receiving the second up card, the players indulge
in another round of betting and, once again, those who put in
the necessary amount remain for a further round, while the
others drop out. The third up card is followed by another
round of betting, and then comes the fourth and final card up
and the fourth and final round of betting. When the last
round of betting is over, those remaining in the hand turn
over their one down card, and the player with the best five
cards wins the money. In order to determine whose cards are
the best, the following ranking applies to the hands:
Straight flush: This is the best type of hand to have, and most
regular poker players will only have such a hand a few times
in their life. A straight flush is five cards of the same suit
which are in an unbroken sequence, for example the 6, 7, 8, 9,
and 10 of Hearts.
Four of a kind: As its name suggests, this type of hand has
four cards of the same denomination.
Full house: Three cards of one denomination and two of
another, for example three 6s and two Aces.
Flush: All five cards of the same suit but not in any unbroken
sequence.
Straight: All five cards in an unbroken sequence, though not
all of the same suit.
Three of a kind: Three of the cards are of one denomination,
the other two are not of the same denomination as each
other.
Two pairs: For example two Aces and two 7s - the fifth card
is of no importance unless two players have the same two
pairs, in which case the fifth card breaks the tie.
One pair: Aces is the highest pair, then Kings, Queens, and so
on down to 2s.
High card: If a player has none of the above hands, then his
holding is valued in accordance with the highest
denomination card in his hand (Ace is high) and then if two
players have the same high card their second highest cards are
compared and so on.
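The ranking above translates directly into a classification routine. The card encoding (rank character followed by suit character, with T for 10) and the function name are assumptions of this sketch, and it does not treat Ace-2-3-4-5 as a straight, a point on which the description above is silent.

```python
from collections import Counter

RANKS = "23456789TJQKA"

def classify(hand):
    """hand: five strings such as 'AS' or 'TH' (rank, then suit)."""
    values = sorted(RANKS.index(card[0]) for card in hand)
    counts = sorted(Counter(values).values(), reverse=True)
    flush = len({card[1] for card in hand}) == 1
    straight = len(set(values)) == 5 and values[4] - values[0] == 4
    if straight and flush:
        return "straight flush"
    if counts[0] == 4:
        return "four of a kind"
    if counts == [3, 2]:
        return "full house"
    if flush:
        return "flush"
    if straight:
        return "straight"
    if counts[0] == 3:
        return "three of a kind"
    if counts == [2, 2, 1]:
        return "two pairs"
    if counts[0] == 2:
        return "one pair"
    return "high card"

print(classify(["6H", "7H", "8H", "9H", "TH"]))
print(classify(["6S", "6H", "6D", "AC", "AS"]))
```

Breaking ties between two hands of the same type needs a further comparison of the denominations, which is left out here.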
So much for the procedure and the ranking of the hands.
Various betting options exist in most forms of poker, the
most common ones being:
Bet: At the start of a round of betting, one player is first to
speak. There are various methods for deciding who is first to
speak and in stud poker it is usually the player with the
highest face-up cards. He has two options: he may bet or he
may 'check'. If he wishes to add to the money in the pot, the
player bets, by putting into the pot any amount of money
that is in accordance with the house rules. We shall assume
that we are playing 'pot limit', which means that the size of
the bet may be anything from one unit up to the total amount
of money already in the pot. So if the pot stands at $10 and
we are playing in $1 units the first person to speak may, if he
wishes to bet, put in any amount from $1 to $10.
Check: If the person whose turn it is to speak does not wish
to bet and no one else has put money in on that round of
betting, he may say 'check', which means that he does not
wish to put money in at this stage but he may decide to do so
when it is next his turn. If, at any time in a round of betting,
all the players check in succession, then the round of betting
is over.
Call: Once someone has put some money into the pot during
a round of betting, the next player must put in at least the
same amount if he wishes to remain in the game. Putting in
the same amount as the others is known as calling. When all
the players have put money into a particular betting round,
that round may only end when all of the players bar one have
called - at that point everyone has put in the same amount.
Raise: It is possible to put in more than the previous bettor
and this is known as raising. If the first player puts in $1 and
the second player wants to put in an extra $1, he will say
something like 'your $1 raise $1', and put $2 into the pot.
Once there has been a raise it is necessary for all the players
after the last raiser to call the bet before the round is at an
end, so that everyone will have contributed the same amount
to the pot. The maximum that can be raised is the amount in
the pot before the raise takes place. So if the pot stands at $1,
and the player bets $1, making the pot $2, the second player
can put in the $1 to meet the bet and then raise $3 (the current
size of the pot).
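As a rough illustration, the pot-limit rule for raising can be sketched in Python. The function name max_raise and its two arguments are invented for this example and are not taken from any program described in the text.

```python
def max_raise(pot_before_bet: int, bet_to_call: int) -> int:
    """Largest legal raise under pot limit, in betting units.

    The raiser first calls the outstanding bet; the raise may then be
    anything up to the size of the pot at that moment.
    """
    pot_after_bet = pot_before_bet + bet_to_call    # pot once the bet is in
    pot_after_call = pot_after_bet + bet_to_call    # pot once the raiser has called
    return pot_after_call


# The example from the text: the pot stands at $1, the first player bets $1,
# the second player calls $1 and may then raise by up to $3.
print(max_raise(pot_before_bet=1, bet_to_call=1))   # prints 3
```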
Pass: Sometimes known as 'fold'. This is what happens when
a player decides that he no longer wishes to take part in this
particular hand - he turns his cards face down and relin-
quishes all claim to the money. Beginners often think that
passing is cowardly but in fact more hands are passed by
good players than by bad ones.
DENOMINATION PROBABILITY
Ace 0.061
King 0.082
Queen 0.082
Jack 0.082
10 0.082
9 0.061
8 0.061
7 0.082
6 0.082
5 0.082
4 0.082
3 0.082
2 0.082
FIGURE 42: Probabilities for opponent's down card before
first round of betting (correct to three decimal places).
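The figures in this table are what one obtains by counting unseen cards, assuming the program can see three cards at this point: its own Ace and 9, and the opponent's face-up 8 (the cards named later in the example). The sketch below makes that assumption explicit; the function name hole_card_priors is hypothetical.

```python
from collections import Counter

RANKS = ["A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2"]


def hole_card_priors(seen_ranks):
    """Probability of each denomination for a single unseen card.

    seen_ranks lists the denominations of every card the program can see
    (its own cards plus all face-up cards on the table).
    """
    seen = Counter(seen_ranks)
    unseen_total = 52 - len(seen_ranks)
    return {rank: (4 - seen[rank]) / unseen_total for rank in RANKS}


# Assumed situation: the program holds an Ace and a 9, the opponent shows an 8.
priors = hole_card_priors(["A", "9", "8"])
print(round(priors["A"], 3), round(priors["K"], 3))   # prints 0.061 0.082
```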
that we had prior to the first round of betting was all a priori
information, whereas we now have some a posteriori
information, I would give the new information something like
four times as much weight as the older information.
Furthermore, I would suggest that we assume it to be twice as
likely that the opponent's hole card was an A, K, Q, J or 10
than another 8. So from the assumptions made on the basis
of the one called bet we can estimate the probabilities of the
various denominations being the opponent's down card as in
Figure 43.
The Queen, Jack and 10 have the same old estimates and the
DENOMINATION PROBABILITY
Ace 0.789/5.007 = 0.158
King 0.810/5.007 = 0.162
Queen 0.162
Jack 0.162
10 0.162
9 0.061/5.007 = 0.012
8 0.425/5.007 = 0.085
7 0.082/5.007 = 0.016
6 0.016
5 0.016
4 0.016
3 0.016
2 0.016
FIGURE 44: Probabilities for opponent's down card after the
first round of betting.
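The division by 5.007 in the table is simply a normalisation: the adjusted weights are summed and each is divided by the total so that the estimates again add up to 1. A minimal sketch follows. The numerators for Queen, Jack and 10 are not printed in the table; they are assumed here to equal the King's 0.810, which is consistent with the total of 5.007.

```python
# Unnormalised weights, copied from the numerators shown in Figure 44.
# Q, J and 10 are assumed to share the King's numerator (not shown in the table).
weights = {
    "A": 0.789, "K": 0.810, "Q": 0.810, "J": 0.810, "10": 0.810,
    "9": 0.061, "8": 0.425,
    "7": 0.082, "6": 0.082, "5": 0.082, "4": 0.082, "3": 0.082, "2": 0.082,
}

total = sum(weights.values())
posterior = {rank: w / total for rank, w in weights.items()}

print(round(total, 3))             # prints 5.007
print(round(posterior["A"], 3))    # prints 0.158
print(round(posterior["8"], 3))    # prints 0.085
```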
The first round of betting is now over, and the dealer gives
each of the players one more card. The program receives the 7
of Spades while its opponent gets the 10 of Clubs, so the
situation on the table now looks like this:
PROGRAM: (A C) 9 H, 7 S
OPPONENT: (??) 8 D, 10 C
and there is $3 in the pot. The opponent is now 'high', i.e. he
has the highest cards shown on the table, since 10, 8 is better
than 9, 7, and so it is the opponent who is to open the betting
on this round. He may check, or he may bet anything from
$1 to $3. Let us assume that he bets the maximum of $3.
The first thing that the program must do is to determine
whether or not, on the basis of the probability estimates that
it had before his $3 bet, the opponent is likely to have the
winning hand and if so, by what margin of probability. In
order to be winning at this stage, the opponent must hold, as
his down card, an Ace, an 8 or a 10. An Ace would give him
A, 10, 8 against A, 9, 7, while a 10 or an 8 as the down card
would give him a pair. From Figure 44 the program can
determine that the probability of its opponent's down card being
an A, 8 or 10 is:
0.158 + 0.085 + 0.162 = 0.405
So the probability that he does not hold the winning hand is
1 - 0.405 = 0.595, and the odds against the program having
the winning hand are 0.405:0.595, or 1:1.47. If the program
calls the $3 bet, since the pot now stands at $6 the program
will be getting 2:1 money odds, so the program definitely has
enough equity to call the bet because 2:1 is better than 1.47:1.
From this calculation the program may determine that it is
safe to call the bet. The algorithm ought to have some
randomly-based adjustment in its calculations to determine
when to raise rather than call- possibly this might be a
probability function whose input parameters are the actual
odds against the opponent having the better hand, and some
measure of how the opponent sees the situation. It is clearly
better for the program, when raising the pot, to have its
strength hidden in the down card if it wants the opponent to
stay in the hand, while it is better to have all its strength on
the table (with the 'threat' of more strength in the down card)
if it is trying to bluff its opponent out of the pot.
Having made the above calculations, the program has de-
termined that it is safe to call the $3 bet, but since the odds
against the opponent having the best hand at this stage are
only 1.47:1, it would be a little imprudent to raise at this
stage. What the odds should be is not an easy question to
answer but I would recommend not raising unless the odds
are at least 2:1. (In fact I would recommend an over-riding
heuristic, under which the program would never raise when
the opponent could have a cast-iron cinch, as here: if he has
another 10, the opponent knows for sure that he is winning.)
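The reasoning of the last few paragraphs can be put into a small decision routine. This is only a sketch under stated assumptions: the function name, the 'fold' branch when the money odds are insufficient, and the way the cinch heuristic is passed in as a flag are all choices made for this example, and the randomly-based adjustment suggested above is left out.

```python
def betting_action(p_opponent_ahead, pot, bet, raise_threshold=2.0,
                   opponent_may_hold_cinch=False):
    """Return 'fold', 'call' or 'raise' for a bet the program is facing.

    p_opponent_ahead: estimated probability (strictly between 0 and 1) that
                      the opponent holds the better hand right now.
    pot:              money in the pot, including the bet just made.
    bet:              amount the program must put in to call.
    """
    p_program_ahead = 1.0 - p_opponent_ahead
    money_odds = pot / bet                                   # e.g. 6/3 = 2 to 1
    odds_against_program = p_opponent_ahead / p_program_ahead
    if money_odds < odds_against_program:
        return "fold"                                        # assumed behaviour
    odds_in_favour = p_program_ahead / p_opponent_ahead
    if odds_in_favour >= raise_threshold and not opponent_may_hold_cinch:
        return "raise"
    return "call"


# The situation in the text: the opponent bets $3 into a $3 pot and could
# hold a second 10, so the cinch heuristic applies.
print(betting_action(0.405, pot=6, bet=3, opponent_may_hold_cinch=True))   # prints call
```

With the same numbers but no possible cinch the routine still calls, because the odds in the program's favour (about 1.47:1) fall short of the 2:1 threshold recommended for raising.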
The program therefore calls the $3, making the total in the
pot $9 and the dealer gives out another card to each player;
this time the program gets the 6 of Diamonds and its
opponent the Jack of Spades, so the situation on the table is
now this:
PROGRAM: (A C) 9 H, 7 S, 6 D
OPPONENT: (??) 8 D, 10 C, J S
and there is $9 in the pot. The opponent is still high, since J,
10,8 is a better holding than 9, 7, 6, but the program's hidden
Ace is still an important card, because unless the opponent
already has a pair or an Ace, the program is still winning. The
situation has now been made even more complicated because
the latest cards to be dealt give each player, in theory at least,
the chance for a straight if the fifth card is exactly right. For
example, if the opponent's hole card is a 9, 7 or Q, he can
make a straight on card five by hitting a 7 or Q (if he holds a
9), or a 9 (if he already holds a 7 or Q).
The opponent's betting situation has improved somewhat
since his highest face-up card is better than the program's
highest face-up card, the opponent's second highest up card
is better than the program's, and so is his third highest up
card. So the opponent happily tosses in $9 with a smile on his
face that the poor microcomputer cannot see. What should
the program do now? Answer: stay calm and calculate the
odds. In order to be winning at this stage, the program's
opponent must hold an Ace, 8, 10 or J as his hole card. The
probability of this, from Figure 44, is:
0.158 + 0.085 + 0.162 + 0.162 = 0.567
This means that the program probably doesn't hold the
winning hand at the moment, but the odds against it holding
the winning hand are only 0.567:0.433, or 1.31:1, whereas if
it calls the $9 bet it is getting 2: 1 money odds, since the $9 bet
has made the pot up to a total of $18. Therefore, the program
should still call this bet, even though the odds indicate that at
this stage it is probably not holding the best cards. So the
program calls the bet, the pot stands at $27, and the fifth and
final card is dealt. The program gets an Ace while its
opponent gets another Jack, so the players have the following
cards showing:
PROGRAM: (A C) 9 H, 7 S, 6 D, A D
OPPONENT: (??) 8 D, 10 C, J S, J H
and there is $27 in the pot. The human opponent now feels
very smug, with a pair of Jacks showing, and says, 'I suppose
I ought to bet something - here is $20.'
The principles apply here, just as they did on the previous
rounds of betting, except for the fact that this is the final
round, after which whoever has the best cards will take the
money. The program calculates that to beat it the opponent
must have a Jack (for three Jacks) or a 10 or 8 in the hole (for
two pairs). The probability estimates indicate that the total
probability of the opponent having the winning hand is:
0.162 + 0.162 + 0.085 = 0.409
therefore the odds against the program are 0.409:(1 - 0.409)
= 0.692:1, well below the money odds, so there is every
reason to call the final bet.
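The three calculations in this example follow one pattern: add up the Figure 44 estimates for every down card that would beat the program, convert the total to odds, and compare with the money odds. The sketch below replays them; the names FIGURE_44 and p_opponent_ahead are illustrative only, and the final pot of $47 is the $27 already in the pot plus the $20 bet.

```python
# Figure 44 estimates for the opponent's down card.
FIGURE_44 = {"A": 0.158, "K": 0.162, "Q": 0.162, "J": 0.162, "10": 0.162,
             "9": 0.012, "8": 0.085, "7": 0.016, "6": 0.016, "5": 0.016,
             "4": 0.016, "3": 0.016, "2": 0.016}


def p_opponent_ahead(winning_hole_cards):
    """Total probability that the down card is one of the given denominations."""
    return sum(FIGURE_44[rank] for rank in winning_hole_cards)


# (down cards that beat the program, pot after the bet, bet to call)
rounds = [(["A", "8", "10"], 6, 3),
          (["A", "8", "10", "J"], 18, 9),
          (["J", "10", "8"], 47, 20)]

for cards, pot, bet in rounds:
    p = p_opponent_ahead(cards)
    # probability, odds against the program (to 1), money odds (to 1)
    print(round(p, 3), round(p / (1 - p), 2), round(pot / bet, 2))
# prints:
# 0.405 0.68 2.0
# 0.567 1.31 2.0
# 0.409 0.69 2.35
```

In every round the money odds exceed the odds against the program, which is why calling is correct each time.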
CHAPTER 12
Draw Poker
FIGURE 45
Full House (4s)              74
... etc.
Full House (Ks)              83
Full House (Ks over Aces)    84
Full House (Aces)            85
Four 2s                      86
Four 3s                      87
Four 4s                      88
... etc.
Four Ks with an Ace          98
Four Aces                    99
Straight Flush 5 high       100
Straight Flush 6 high       101
Straight Flush 7 high       102
... etc.
Straight Flush A high       109
pair, so set all of his pair probabilities to zero and adjust the
other probabilities accordingly. (If his pointer is already at 26
or higher you need take no action, since it is already assumed
that he does not hold less than two pairs.)
A player who discards four or five cards (five is prohibited
in some schools) should be assumed to have designation 4 (if
he discards four cards - assume that he has kept an Ace), or
designation 1 (if he discards five cards).
A player who stands pat, i.e. takes no cards at all, should be
assumed to have at least designation 52, though when bluffing
is added to your program you should allow for the 'no
card bluff' in a certain proportion of hands, and assume a
lower minimum designation.
All good books on poker give tables to show the odds against
making various types of improvement to your hand during
the draw. For example, Irwin Steig's Poker for Fun and
Profit teaches that when holding a pair and discarding three
cards, the probability of making a full house is 0.0102, of
making four of a kind 0.00278, of making three of a kind
0.1149, and of making two pairs 0.1587. We can use this
information to adjust the probabilities still further.
Let us assume that, after the first round of betting, the
pointers are all on 24 (a pair of Kings). A player discards
three cards, so we assume that he does indeed have a pair,
and the designation probabilities are adjusted accordingly.
We must then assume, after the draw, that the probabilities
of his holding four of a kind, a full house, three of a kind and
two pairs, are given by the above figures, and that the balance
(0.7134) is the probability of his holding a pair after the draw.
Having determined the probabilities for each of the feasible
categories of hand, we can divide them up to indicate the
probability of his holding each of the feasible types of hand
(remember that some types, such as straights and flushes, are
no longer feasible after the three card draw).
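The adjustment for a three-card draw to a pair amounts to taking the four quoted improvement probabilities and assigning the remainder to 'still one pair'. A minimal sketch, with a hypothetical function name and a plain dictionary standing in for whatever table your program uses:

```python
# Improvement probabilities quoted in the text for drawing three cards to a pair.
IMPROVE_FROM_PAIR = {
    "four of a kind": 0.00278,
    "full house": 0.0102,
    "three of a kind": 0.1149,
    "two pairs": 0.1587,
}


def after_three_card_draw(improve=IMPROVE_FROM_PAIR):
    """Category probabilities after the draw, assuming the player held a pair."""
    result = dict(improve)
    result["one pair"] = 1.0 - sum(improve.values())   # the balance: no improvement
    return result


dist = after_three_card_draw()
print(round(dist["one pair"], 4))   # prints 0.7134
```

Each category probability would then be shared out among the individual designations within that category, as described above.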
Bluffing
with it for two or three hands at the most, but then the pro-
gram would 'suspect' and BF would soar to nearly 1.
Sources
Findler, N. V.: Computer Poker. Scientific American, Vol.
239, No. 1, July 1978, pp. 112-119.
It will also be essential for a poker programmer to find a
book on the game which gives tables showing the odds
against making certain improvements when drawing cards.
The Steig book mentioned in the text is one such volume, but
there are very many others.
CHAPTER 13
Othello
squares of the board, d4, e4, d5 and e5, and herein lies the
one and only difference between Reversi and 'Othello'. In
Reversi, the two players may choose where they play within
these four central squares. Thus, the player who moves
second may either force his opponent to make the first two
moves in a horizontal or vertical line or offer his opponent
the choice between that and a diagonal line. Black moves first
and if he decides to put a disc on (say) d4, White could force
him to play in a horizontal or vertical line by himself playing
on the only diagonal spot, e5. Or White could leave the
choice open by playing on e4 or d5.
In Othello, which was 'invented' in Japan during the early
1970s, Black starts the game with discs on d4 and e5, White
with discs on e4 and d5. If this really is a new game then I
have just invented a wonderful game called David Chess, in
which the rules are exactly the same as in normal chess except
that White must make his first move on the King's side.
(Incidentally, Kevin O'Connell has invented another game,
almost as interesting as my own, called Kevin Chess, in which
White must make his first move on the Queen's side, and we
are both going to patent our games and try to make as much
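8 . . . . . . . .
7 . . . . . . . .
6 . . . . . . . .
5 . . . W B . . .
4 . . . B W . . .
3 . . . . . . . .
2 . . . . . . . .
1 . . . . . . . .
  a b c d e f g h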
The nature of the game changes as more and more discs are
added to the board. In the early stages (the opening) and the
middle game, structure and mobility are all important, but in
the final analysis it is the player with the most discs on the
board who wins the game. It is therefore clear that up until a
certain point in the game, structure and mobility should be
the most heavily-weighted features in the evaluation
function, while during the last few moves the evaluation
should become more and more oriented towards the number
of Black and White discs actually on the board. One way in
which this might be accomplished is to have an evaluation
function of the form:
$$W_1 \times (\text{MOBILITY} + k \times \text{STRUCTURE}) + W_2 \times \text{MATERIAL}$$
where $W_1 = e^{-nz}$ and $W_2 = 1 - e^{-nz}$,
n = number of discs on the board,
k and z are constants.
When the number of discs on the board is low, i.e. during the
early stages of the game, $W_1$ might be just below 1, while near
the end of the game, when n approaches 64, $W_1$ approaches
0.
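To see how the two weights trade places as the board fills up, the evaluation function can be sketched as follows. The values of k and z used here are arbitrary placeholders chosen for illustration; suitable values would have to be found by experiment, and the three feature scores are assumed to be computed elsewhere.

```python
import math


def phase_weights(n_discs, z=0.05):
    """Return (W1, W2) for a board holding n_discs discs.

    z is a tuning constant; 0.05 is an arbitrary illustrative value.
    """
    w1 = math.exp(-n_discs * z)
    return w1, 1.0 - w1


def evaluate(mobility, structure, material, n_discs, k=1.0, z=0.05):
    """W1 * (MOBILITY + k * STRUCTURE) + W2 * MATERIAL."""
    w1, w2 = phase_weights(n_discs, z)
    return w1 * (mobility + k * structure) + w2 * material


# W1 shrinks and W2 grows as discs are added to the board.
for n in (4, 32, 64):
    w1, w2 = phase_weights(n)
    print(n, round(w1, 3), round(w2, 3))
```

A larger z makes the switch from mobility and structure to material happen earlier in the game.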
8   16  -4   4   2   2   4  -4  16
7   -4 -12  -2  -2  -2  -2 -12  -4
6    4  -2   4   2   2   4  -2   4
5    2  -2   2   0   0   2  -2   2
4    2  -2   2   0   0   2  -2   2
3    4  -2   4   2   2   4  -2   4
2   -4 -12  -2  -2  -2  -2 -12  -4
1   16  -4   4   2   2   4  -4  16
     a   b   c   d   e   f   g   h
All things being equal, which they never are, the above
map represents an acceptable valuation of individual squares,
but the problem is made more complex by the fact that
occupation of one square may well change the desirability of
occupying some other square, and this change might have an
effect of fatal proportions. A simple example is the question
of the b2 square. It is very bad to occupy it, because
occupation of b2 might lead to the loss of a1, but if you
already occupy a1 then b2 can do you no harm. A map of
square values must therefore change dynamically as the game
progresses, and your program should be able to allow for
these changes.
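One simple way of making the map dynamic is to look up the static value and then override it when the circumstances that made the square dangerous no longer apply. The sketch below handles only the corner example just given; the function name, the set-of-squares representation and the choice of 0 as the 'harmless' value are assumptions made for illustration.

```python
# Each corner paired with the diagonally adjacent square that endangers it.
CORNER_NEIGHBOUR = {"a1": "b2", "h1": "g2", "a8": "b7", "h8": "g7"}


def adjusted_value(square, static_value, my_squares, neutral_value=0):
    """Value of playing on `square`, given the squares the mover already owns."""
    for corner, neighbour in CORNER_NEIGHBOUR.items():
        if square == neighbour and corner in my_squares:
            return neutral_value      # the corner is already ours: no penalty
    return static_value


print(adjusted_value("b2", -12, my_squares={"a1", "c3"}))   # prints 0
print(adjusted_value("b2", -12, my_squares={"c3"}))         # prints -12
```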
The openings
your program might only be able to perform a 3-ply or 4-ply
search during the game, it could play the first few moves on
the basis of the exhaustive 12-ply search.
I should perhaps add that it is not yet known the extent to
which the 'Sweet 16' strategy is likely to be successful, but
that, combined with a mobility feature, should enable your
program to write a strong openings book.
The endgame
Since the total number of discs on the board is the final and
absolute criterion for determining the winner, it is clear that
your program should, during the last few moves, search the
game tree to its very end, and apply only material as its
evaluation feature. How far from the end of the game an
exhaustive search is possible will depend upon the speed of
your processor and the efficiency of your program. For this
reason it is doubly important to have an efficient move
generation routine. The advantages of being able to search the
whole of the game tree from six to eight moves prior to the
end of the game are rather obvious.
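An exhaustive endgame search with material as the only evaluation feature might be sketched as below. This is a bare illustration, not a tuned routine: the board is assumed to be a list of eight lists holding 'B', 'W' or '.', the names flips and solve are invented here, and alpha-beta pruning is omitted for brevity.

```python
DIRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def flips(board, row, col, player):
    """Discs that `player` would turn over by playing on (row, col)."""
    if board[row][col] != ".":
        return []
    other = "W" if player == "B" else "B"
    turned = []
    for dr, dc in DIRS:
        line, r, c = [], row + dr, col + dc
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == other:
            line.append((r, c))
            r, c = r + dr, c + dc
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            turned.extend(line)
    return turned


def solve(board, player, passed=False):
    """Best final disc difference (player minus opponent) with perfect play."""
    other = "W" if player == "B" else "B"
    best = None
    for row in range(8):
        for col in range(8):
            turned = flips(board, row, col, player)
            if not turned:
                continue                      # not a legal move
            for r, c in turned + [(row, col)]:
                board[r][c] = player          # make the move
            score = -solve(board, other)
            board[row][col] = "."             # take the move back
            for r, c in turned:
                board[r][c] = other
            if best is None or score > best:
                best = score
    if best is not None:
        return best
    if passed:                                # neither side can move: count discs
        mine = sum(line.count(player) for line in board)
        theirs = sum(line.count(other) for line in board)
        return mine - theirs
    return -solve(board, other, passed=True)  # forced pass


# Tiny test position: Black on a8, White on b8, everything else empty.
board = [["."] * 8 for _ in range(8)]
board[0][0], board[0][1] = "B", "W"
print(solve(board, "B"))   # prints 3
```

Because moves are made and unmade on a single board, no copying is needed, which matters when the search runs to the very end of the game.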
Game one
8 B B B B B B B
7 B B W W W B B B
6 B B W W B B B B
5 B B W B B B W B
4 B W B W B B B B
3 B B B B W B B B
2 B B B B B
1 B B B B W
abcdefgh
FIGURE 49
Game two
Black: Neil Cogle (1980 British Othello Champion - for
humans!)
White: The Moor (4-ply look-ahead)
1 B c5 2 W e6 3 B f5
4 W c4 5 B c3 6 W d3
7 B f4 8 W b3 9 B b4
10 W c6 11 B d6 12 W a4
So The Moor has gained the first disc on the edge of the
board, and to redress the balance Black takes the dangerous
square a2.
13 B a2 14 W f6 15 B e7
16 W f8 17 B b5 18 W e3
19 B f7 20 W a5 21 B a6
Black was already in a bad way, with a disc on a2 and a
deficit in mobility, but this move is a fatal mistake which puts
his position beyond repair. See if you can spot The Moor's
killing reply (Figure 50).
22 W a3
Now you can see the danger of playing on a2. Black must
lose the a1 corner.
23 B d8 24 W b6 25 B c7
26 W a1
Now that The Moor has a corner, it uses it as an
impregnable base from which to expand its control of the
board.
8 W
7 W B
6 B W W B B
5 W B W W W B
4 W W B W W B
3 B W W W
2 B
a b c d e f g h
FIGURE 50
27 B f3 28 W g3 29 B f2
30 W g4 31 B h5 32 W e2
33 B e1 34 W d2 35 B h4
36 W d7 37 B c8
White can afford to concede virtually every edge square at
this stage of the game, in the knowledge that his corner
anchor on a1 will eventually allow a clean sweep of the edges.
38 W g1 39 B d1 40 W g6
41 B h6 42 W g5 43 B c2
44 W b1 45 B b2
Now that a1 is already occupied, putting a disc on b2 is
unimportant.
46 W a7 47 B g2
There is no way that White can be kept out of h1. If Black
plays on f1, White replies on c1 and then Black is forced to
play on b7 and g2 within the next few moves.
48 W h1 49 B h2 50 W f1
51 B b7 52 W c1 53 B PASS
Black has no moves, and White continues its march around
the edge of the board.
54 W h3 55 B PASS 56 W h7
57 B PASS 58 W g8 59 B g7
Black's problems are aggravated by the fact that by now
The Moor is examining the whole of the game tree exhaus-
tively, and is always making the very best move.
61 B PASS 62 W e8 63 B PASS
64 W b8
Neither side may move to a8, so the game comes to an end
with The Moor winning by 61 discs to 2, which is rather like
being several queens up at the end of a game of chess.
Game three
Finally, I shall give without comment the game won by The
Moor against World Champion Hiroshi Inouie of Japan, on
June 19 1980. The final score in this game was 36-28 in favour
of the program, and not 34-30 as reported in the tournament
bulletin.
Black: The Moor
White: Hiroshi Inouie
1 B d6 2 W c6 3 B c5
4 W c4 5 B b3 6 W e6
7 B c7 8 W b5 9 B a6
10 W c3 11 B c2 12 W b4
13 B f4 14 W f5 15 B f3
16 W e3 17 B a3 18 W d7
19 B d3 20 W g4 21 B f6
22 W a4 23 B d8 24 W b6
25 B a5 26 W e7 27 B h3
28 W e8 29 B f8 30 W f7
31 B c8 32 W g5 33 B h6
34 W h5 35 B h4 36 W g6
37 B h7 38 W c1 39 B d2
40 W b2 41 B d1 42 W e1
43 B e2 44 W f1 45 B f2
46 W b1 47 B g8 48 W g1
49 B b7 50 W a7 51 B g2
52 W g3 53 B h1 54 W h2
55 B a1 56 W h8 57 B g7
58 W b8 59 B a8 60 W a2
Black wins by 36-28.
To the best of my knowledge, this is the first time that a
computer program has ever defeated a human World
Champion in a game of pure skill.
CHAPTER 14
Go-Moku (and Renju)
Although it is possible to counter the threat of making one
completely open row of three into an open row of four, it is
obviously impossible to counter two such threats if they exist
simultaneously. Thus, the most fundamental winning tactic
in Go-Moku is to try to force a position in which you have,
simultaneously, two completely open rows of three stones.
The simple examples of Figure 51 will help to illustrate these
principles.
[Board diagram: columns lettered from A, rows numbered from 1, showing the example position discussed in the text]
FIGURE 51
Black's turn and he plays a stone on some useless point,
White may place a stone at B6 or F6, in either case creating an
open row of four which next move will become a winning row
of five.
(c) If we now add to the board two more white stones, on F5
and G4, we can see that unless Black has a win on some other
part of the board, White will win by making one or other of
these rows of three into an open row of four on his next
move. Black may stop the horizontal row by placing a stone
at B6 or F6, or he may stop the diagonal row by placing a
stone at H3 or D7, but he cannot do both simultaneously;
and whichever row he does not stop immediately will grow on
the next move into an open row of four and then into a
winning row of five.
Because this winning threat, created by simultaneous rows
of three, is absolutely decisive, the game loses much of its
interest if no restriction is placed on the players. Try for
yourself, playing Go-Moku against a friend and you will both
soon discover that it is not terribly difficult to force a double
threat situation early in the game.
For this reason, the Japanese have made the game more
difficult and more interesting, by creating a version called
Renju, in which the player of the black pieces has certain
restrictions placed on his moves. These restrictions are:
(a) A move which simultaneously creates a double or triple
line of 3 is illegal for Black, but legal for White.
(b) A move which creates a double or triple line of 4 is illegal
for Black, but legal for White.
(c) A move which creates an overline (more than 5 in a row) is
illegal for Black, but legal for White (though it does not win
for White). Note that in Go-Moku, overlines are legal but do
not win.
Renju is only played on a 15 x 15 board, whereas Go-Moku
can be enjoyed on a board that is 19 x 19 (the traditional Go
board), or any size down to 9 x 9.
Program design
Evaluation
shallow tree, so we ought to ensure that our evaluation
mechanism is wise rather than foolish.
Let us start by considering what features might usefully be
incorporated in our evaluation function - we shall expand
their scope a little further on in this chapter.
The key to a successful strategy is obtaining some of your
own stones, in an unbroken row, in such a way that they
could conceivably be extended into a row of five. Let us first
define some variables.
W1 = the number of single white stones which are in a row,
column or diagonal in such a way as to allow the stone to be
extended into a row, column or diagonal of five stones.
Using the notation of Figure 51, imagine a white stone on
D1 and black stones on A1, F1 and D5. There is no way that
the stone on D1 can ever form part of a row or column of five
stones, because the horizontal and vertical directions are
sufficiently well blocked off by Black, but it is conceivable
that the stone on D1 could form part of a diagonal of five
stones, if White were to be able to place stones on E2, F3, G4
and H5. So in this case W1 would be 1, because this is the
number of possible 5-rows that can be made using D1. If
there were no black stone on A1 then the value of W1 would
be 2 because D1 could be part of a horizontal or diagonal
5-row, and if there were no stone on D5 either the value of
W1 would be 3, since 5-rows could be constructed
horizontally, vertically and in one diagonal direction.
Similarly, B1 = the number of single black stones which are
in a row, column or diagonal in such a way as to allow the
stone to be extended into a row, column or diagonal of five
stones (which we call a 5-row).
And W2, B2, W3, B3, W4, B4, W5 and B5 are the
corresponding variables for situations in which White or
Black has a row, column or diagonal with 2, 3, 4 or 5 of his
own stones in an unbroken row.
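The D1 example can be checked mechanically. The sketch below counts, for one stone, the directions in which a 5-row containing it is still possible, which is how the example arrives at 1, 2 and 3. It assumes a 19 x 19 board, plain A, B, C... column lettering, and a dictionary mapping (column, row) to 'W' or 'B'; the names pt and five_row_directions are invented for the illustration.

```python
SIZE = 19                                    # assumed board size
LINES = [(1, 0), (0, 1), (1, 1), (1, -1)]    # horizontal, vertical, two diagonals


def pt(name):
    """Convert a name such as 'D1' to zero-based (column, row)."""
    return ord(name[0]) - ord("A"), int(name[1:]) - 1


def five_row_directions(stones, col, row, colour):
    """Count the directions in which the stone on (col, row) could still be
    part of five in a row: some window of five intersections contains it,
    lies wholly on the board and holds no enemy stone."""
    count = 0
    for dc, dr in LINES:
        for start in range(-4, 1):           # window offset relative to the stone
            window = [(col + (start + i) * dc, row + (start + i) * dr)
                      for i in range(5)]
            if all(0 <= c < SIZE and 0 <= r < SIZE
                   and stones.get((c, r), colour) == colour
                   for c, r in window):
                count += 1
                break                        # one open window is enough
    return count


stones = {pt("D1"): "W", pt("A1"): "B", pt("F1"): "B", pt("D5"): "B"}
print(five_row_directions(stones, *pt("D1"), "W"))   # prints 1
del stones[pt("A1")]
print(five_row_directions(stones, *pt("D1"), "W"))   # prints 2
del stones[pt("D5")]
print(five_row_directions(stones, *pt("D1"), "W"))   # prints 3
```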
Let us assume for the sake of simplicity that all terminal
nodes are at even depth, that is to say we only evaluate a
position in which it is the program's turn to move. We shall
further assume that the program is White. It is now necessary
to assign weights to the features of the evaluation function in
such a way as to reflect the worth of a 1-row, a 2-row, a
3-row, a 4-row and a 5-row. Of course a 5-row has infinite
value, in the sense that if you make a 5-row you have won the
game, so the weighting assigned to W5 and B5 should reflect
this fact, in the same way that a chess-playing program would
have an infinite value assigned to the kings.
Let the weightings which we assign to these variables be as
follows:
AW1 is the weighting assigned to W1;
AB1 is the weighting assigned to B1.
White has then not accomplished anything in the horizontal
direction because his play has now been blocked and if he
puts a stone on F1 Black can counter on G1, and if he plays
on B1 Black can counter on A1. But the placing of the third
white stone on C1 might well have much deeper
implications - it might be part of a plan to create a strong formation
over on the left hand side of the board, with a view to
extending this formation into a winning threat later in the
game.
Now we come to the important difference between having
a single stone on E1 and having it on J1. If the planned future
activity is in the area of the A-column, B-column and
C-column, it is less likely to be successful than if it is in the
E-column, F-column and G-column, simply because in the
former case this activity is bounded by the left hand edge of
the board. If your area of activity is bounded in some way,
either by an edge of the board or by a strong (or even
impregnable) formation of your opponent's stones, you will
be less likely to win than if your area of activity is not
bounded. In the latter case you have more opportunity to use
the area of activity to create further threats.
What does all this mean in relation to our evaluation
function? The obvious implication is that the weighting
should vary in some way that reflects the number of vacant
intersections to each side of a 1-row, 2-row or 3-row. (The
number of vacant intersections to each side of a 4-row is not
important, since the 4-row itself will determine the outcome
of the game at once.) In the above example it might appear as
though the small number of intersections to the left of E1
might be compensated for by the larger number of vacant
intersections to the right of E1, and that therefore, E1 and J1
are of equal value. But if we think about the nature of the
game it is clear that having a formation near the centre of a
row, column or diagonal, gives greater flexibility than having
that same formation near one or more edges of the board. We
should therefore adjust our weightings in some suitable
manner, to reflect the desire to have useful formations nearer
the centre than the edges. One possible way of doing this is to
subtract from a weighting AWi (or ABi), an amount Ci,
where Ci is inversely proportional to (1 + number of vacant
intersections between the end of a formation and its nearest
edge of the board (or enemy stone) in the same direction).
Thus, for a single black stone on the D1 intersection of an
otherwise empty board, the weighting AB1 would actually be
AB1 - (1/3), for the component of the score that is related to
the horizontal 1-row. This is because in a horizontal direction
the nearest edge intersection to the 1-row on D1 is the
intersection on A1, which is two vacant intersections away
from D1. The weighting of AB1 in the diagonal direction
towards the left hand edge would be AB1 - 1/1; the weighting
in the diagonal direction towards the right hand edge would
be AB1 - 1/1; and finally the weighting towards the top edge
would be AB1 - 1/1 (these last three values are due to D1
being on the edge of the board).
The suggestion to subtract a value that is inversely
proportional to the 'freedom of movement' of a formation is
given here as an indication of the shape that this part of the
evaluation function should take. You might find it more
satisfactory to subtract the square of that number, or some
other function.
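The edge correction is easy to experiment with if it is isolated in one small routine. In the sketch below the constant of proportionality is taken as 1, as in the D1 example; the base weighting of 10 is purely hypothetical, and the power argument is an assumed way of trying the squared variant mentioned above.

```python
from fractions import Fraction


def adjusted_weight(base_weight, vacant_to_boundary, power=1):
    """Subtract C = 1 / (1 + vacant)**power from a formation's weighting.

    vacant_to_boundary: vacant intersections between the end of the formation
    and the nearest edge intersection (or enemy stone) in that direction.
    """
    penalty = Fraction(1, (1 + vacant_to_boundary) ** power)
    return base_weight - penalty


AB1 = Fraction(10)                 # hypothetical base weighting for a black 1-row
print(adjusted_weight(AB1, 2))     # horizontal 1-row on D1: AB1 - 1/3, prints 29/3
print(adjusted_weight(AB1, 0))     # D1 lies on the bottom edge: AB1 - 1, prints 9
```

Exact fractions are used only to make the arithmetic easy to compare with the text; ordinary floating-point values would serve equally well in a playing program.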
Another important refinement of the evaluation function is
needed to take care of those situations in which a stone of one
colour may have a nearby neighbour of its own colour. For
example, white stones on E1 and G1 with no other stones on
the first horizontal row. The value of these two stones is
clearly more than the value of two individual 1-rows, because
the two stones can easily combine into a 3-row if White is
permitted to play on F1. On the other hand, two white stones
with one vacant intersection between them are worth slightly
less than a 2-row because with a 2-row there are four distinct
ways of creating a 5-row, whereas with two separated 1-rows
there are only three distinct ways (since the vacant
intersection between them must be occupied). This leads me to
suggest that in a situation of this type we employ a weighting
mid-way between that of a 2-row and the sum of two 1-rows.
If there are two vacant intersections between the two 1-rows,
use a weighting one quarter of the way between that of two
1-rows and that of one 2-row, and if there are three vacant
intersections take a weighting one-eighth of the way between
them. Similar logic can be used to suggest weightings for (say)
a 1-row separated from a 2-row (in the same horizontal,
vertical or diagonal) by one or two vacant intersections,
though here as usual, your first guesstimate as to the size of
the weighting will almost certainly need to be changed in the
light of experience.
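The halving rule for separated stones can be written as a single interpolation: start from the value of two 1-rows and move 1/2, 1/4 or 1/8 of the way towards the value of a 2-row, according to the gap. The function name and the sample weightings below are hypothetical.

```python
def separated_pair_weight(w_one_row, w_two_row, gap):
    """Weighting for two same-coloured stones with `gap` vacant points between.

    gap = 1 gives the midpoint between two 1-rows and one 2-row,
    gap = 2 a quarter of the way, gap = 3 an eighth of the way.
    """
    two_singles = 2 * w_one_row
    return two_singles + (w_two_row - two_singles) / 2 ** gap


# Hypothetical weightings: a 1-row worth 1, a 2-row worth 6.
for gap in (1, 2, 3):
    print(gap, separated_pair_weight(1, 6, gap))
# prints:
# 1 4.0
# 2 3.0
# 3 2.5
```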
The two refinements discussed here are probably necessary
for a very strong program, but those of you who wish to keep
things simple will, I'm sure, get an entertaining game from a
program which employs only the most primitive form of the
evaluation function.
evaluations. The worst n per cent of the moves on the list may
then be discarded (n can be chosen to suit the execution speed
of your program - I would suggest that you start with n = 90).
You will now have a list of some 36 moves (at the start of the
game) and from each of the 36 positions you again generate
and evaluate, discarding the worst (say) 92 per cent of the
moves at the next ply. The percentage of moves discarded
goes up as the tree gets deeper and deeper, and this parameter
can be adjusted, dynamically if necessary, so that the
program is made to respond in any desired time frame.
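The pruning step itself is short. In the sketch below, prune sorts the candidate moves by their evaluation and keeps the best (100 - n) per cent. The evaluator used in the example is only a stand-in that prefers central points; your own evaluation function would take its place.

```python
def prune(moves, evaluate, discard_percent):
    """Keep the best (100 - discard_percent) per cent of moves, best first."""
    ranked = sorted(moves, key=evaluate, reverse=True)
    keep = max(1, len(ranked) * (100 - discard_percent) // 100)
    return ranked[:keep]


def centre_bias(point):
    """Stand-in evaluator: points nearer the centre of a 19 x 19 board score higher."""
    return -(abs(point[0] - 9) + abs(point[1] - 9))


# 361 empty intersections at the start of a game on a 19 x 19 board.
points = [(c, r) for c in range(19) for r in range(19)]
print(len(prune(points, centre_bias, 90)))   # prints 36
```

Returning the survivors in sorted order also gives the alpha-beta search its best candidates first.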
Your tree will now be no larger than the tree for a chess
program, and move generation will be faster than for chess,
so provided you code the evaluation routine in an efficient
manner, you ought to be able to perform a search of 4-6 ply
within a minute or two, if your program is written in
assembler.
We have already encountered the concept of the alpha-beta
window and the killer heuristic, both of which should be
employed in your Go-Moku program. In a large tree the killer
heuristic is particularly useful and the fact that you sorted the
moves prior to generation of the replies at each level will help
considerably in the optimisation of the alpha-beta routine
itself. One other method of speeding the search is to avoid the
need to re-evaluate those parts of the board that are not
affected by a move in the game tree. You might, for example,
keep several different components of the evaluation, and
update only those affected by a move. For example, let us
assume that the evaluation function has separate components
for each horizontal, each vertical and each diagonal. If the
program considers a move on the intersection A1, this move
will in no way affect the evaluation of a formation in the J
column, so part of the evaluation process need not be
repeated - it is known to be unchanged. The more you speed
up the evaluation process, the deeper the tree can grow, so
any technique which updates the evaluation function in an
incremental way is certain to be useful.
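A minimal sketch of such incremental updating is given below. It keeps one score per horizontal, vertical and diagonal line and, after a move, recomputes only the four lines passing through the new stone. The per-line scoring used here (white stones minus black stones) is a deliberately trivial stand-in for a real formation evaluator, and all names are assumptions made for the example.

```python
SIZE = 19


def lines_through(col, row):
    """Keys of the four lines affected by a move on (col, row)."""
    return [("row", row), ("col", col),
            ("diag", col - row), ("anti", col + row)]


def line_score(stones, key):
    """Stand-in component score: white stones minus black stones on one line."""
    kind, index = key
    score = 0
    for (c, r), colour in stones.items():
        if {"row": r, "col": c, "diag": c - r, "anti": c + r}[kind] == index:
            score += 1 if colour == "W" else -1
    return score


def all_keys():
    """Every horizontal, vertical and diagonal line on the board."""
    keys = [("row", i) for i in range(SIZE)] + [("col", i) for i in range(SIZE)]
    keys += [("diag", d) for d in range(-(SIZE - 1), SIZE)]
    keys += [("anti", a) for a in range(2 * SIZE - 1)]
    return keys


def play(stones, components, total, col, row, colour):
    """Place a stone, refresh only the four components it touches and
    return the updated total evaluation."""
    stones[(col, row)] = colour
    for key in lines_through(col, row):
        new = line_score(stones, key)
        total += new - components[key]
        components[key] = new
    return total


stones, total = {}, 0
components = {key: 0 for key in all_keys()}
total = play(stones, components, total, 0, 0, "W")   # a move on A1
print(total, components[("col", 9)])                 # prints 4 0
```

The component for the tenth column is untouched by the move on A1, which is exactly the saving described above.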
Tactical search
Sources
CHAPTER 15
Bridge Bidding
                             N
Spades:   10                          Spades:
Hearts:   Q J 10 7 6                  Hearts:   K 9 4
Diamonds: Q 8 2          W       E    Diamonds: K 10 7
Clubs:    Q J 4 3                     Clubs:    A 9 8 6 5
                             S
               Spades:   A Q J 6 3
               Hearts:   A 8 5 3
               Diamonds: 6 4
               Clubs:    7 2
FIGURE 52
For the sake of convenience we usually refer to the four
hands by the four points of the compass: North, South, East
and West. We shall assume that West was the dealer, and that
the bidding goes like this (players' thought processes in
brackets):
West: Pass (I have a weak hand);
North: One Diamond (I have a stronger-than-average hand
with two good suits. I shall bid the lower-ranking suit first to
give my partner a chance of bidding Hearts at the one level);
East: Pass (I also have a hand that is no better than average,
and since my partner is weak we will not have enough
combined strength to make any contract);
South: One Spade (I have two biddable suits, but I have more
Spades than Hearts so I shall bid Spades first);
West: Pass;
North: Two Spades (My partner has at least four Spades in
his hand so we have at least nine Spades out of 13 between us.
Obviously Spades will be a good suit for us to play a contract
in);
East: Pass;
South: Three Hearts (I must show my partner that I have
another biddable suit);
West: Pass;
North: Three Spades (My first Spade bid indicated only that I
had reasonable Spade support for my partner. Now I should
tell him that I have more than minimal Spade support and
that I do not have enough strong cards in the unbid suits to
make a No-Trump contract possible);
East: Pass;
South: Four Spades (My partner has at least four Spades and
probably holds the King of Spades. He also has four or five
Diamonds so he does not have many Clubs and Hearts. I
have the Ace of Hearts so we are unlikely to lose more than
one Heart trick, and I only have two Clubs, so we cannot lose
more than two Club tricks before I can trump any further
Clubs that are led. So we ought to be able to avoid losing any
more than three tricks, and four Spades seems quite
possible);
West: Pass;
North: Pass (Enough is enough);
East: Pass.
The above bidding and thought processes represent an
over-simplification of what was going on in the minds of the
players. But it does serve to explain the type of thought
processes that one goes through when bidding in a simple
fashion. I ought perhaps to mention at this stage that by
reaching certain contracts a partnership may qualify for a
'game bonus' if the contract is made. These game contracts
are: three No-Trumps; four Hearts or four Spades; five Clubs
or five Diamonds. Making a lesser contract allows you to
score the game bonus later on if you can make another
contract that counts, together with the earlier contract, for
enough points to make a game. I will not go into the scoring
system in this chapter, but you should study an elementary
book on bridge before writing your program, so that the
scoring will be correct.
In order to make the bidding phase easier and to ensure
that information is conveyed economically, various bidding
systems have been invented. In a bidding system, each bid has
a fairly precisely defined meaning, and by correctly interpreting
a bid, a player will understand more about his partner's
hand. One useful tool employed in many bidding systems is
what are known as 'high-card points'. This points method
usually counts 4 points for holding an Ace, 3 for a King, 2 for
a Queen, 1 for a Jack or singleton (a suit with only one card,
other than the Ace), 4 for a void (a suit with no cards), 1 for
each card after the first five in a suit. Using this point count
method, various rules of thumb have been developed,
including:
(a) Do not open the bidding with fewer than 12 points;
(b) If you hold 12-15 points you should open one of your
best suit;
(c) If you hold 16-18 points you should open one No-Trump;
(d) In order to make a three No-Trump contract the
combined hands should have no less than 24 points,
preferably 25 or more.
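The point count and the first three rules of thumb can be sketched in a few lines. This is a minimal illustration, not a bidding system: the function names and the hand representation are hypothetical, and I have assumed that honour points and shape points simply add together (so a singleton Jack scores 1 for the Jack and 1 for the singleton). The rules quoted above say nothing about hands of more than 18 points, so the sketch returns None for those rather than guess.

```python
HONOUR_POINTS = {"A": 4, "K": 3, "Q": 2, "J": 1}


def count_points(hand):
    """hand maps a suit name to a list of ranks, e.g. {"Spades": ["A", "Q"]}."""
    total = 0
    for cards in hand.values():
        total += sum(HONOUR_POINTS.get(rank, 0) for rank in cards)
        if len(cards) == 0:
            total += 4                      # void
        elif len(cards) == 1 and cards[0] != "A":
            total += 1                      # singleton other than the Ace
        if len(cards) > 5:
            total += len(cards) - 5         # each card after the first five
    return total


def opening_bid(points):
    """Apply rules (a) to (c); returns None where the rules are silent."""
    if points < 12:
        return "Pass"
    if points <= 15:
        return "One of best suit"
    if points <= 18:
        return "One No-Trump"
    return None  # assumption: stronger hands are left to the bidding system


# South's hand from Figure 52
south = {
    "Spades": ["A", "Q", "J", "6", "3"],
    "Hearts": ["A", "8", "5", "3"],
    "Diamonds": ["6", "4"],
    "Clubs": ["7", "2"],
}
print(count_points(south), opening_bid(count_points(south)))
```

South's hand counts 11 points on this method, which is consistent with a hand that responds to partner's opening bid rather than opening the bidding itself.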
The above rules can all be broken, under the correct
circumstances and, in fact, the same bid can mean many different
things in the same situation, depending on which
system of bidding the partnership is employing. The most
important thing to remember about bidding is that bridge is a
partnership game, and you should be trying to help your
partner during the bidding by making meaningful bids that he
will understand. There is no point in making a brilliant bid on
one bidding system if your partner is using a different
CHAPTER 16
Bridge Play
(Flow chart fragment. Legible boxes: '... plan [that] makes
most tricks' and 'Execute that plan'.)
Spades: -
Hearts: K J
Diamonds: 4 3 2
Clubs: -
Let us assume that South has the lead, and that North's
Spades are the last three Spades (and therefore they will all
win), while South's Diamonds are not the last three
Diamonds (and will therefore be losers). The contract is being
played in No-Trumps. South leads the J of Hearts, North
plays the A, then North plays the Q of Hearts and suddenly
notices what he has done wrong. South must now take his
trick with the K and then he is forced to lead a Diamond,
losing the last three tricks instead of winning with the
remaining Spades. Of course, there are various ways that the
cards could be played in a different order, so that North-
South would make all five tricks, but this example shows you
how a careless mistake can cost several tricks and turn a good
contract into a bad one.
Endplay situations
(Flow chart of the defender's choice of card:)
Order suits according to desirability.
Am I leading to this trick?
NO: If void in the suit and trumps, discard lowest card in
least desirable suit, otherwise use defensive heuristics
to choose card.
YES: Use defensive heuristics to decide which card to lead
from most desirable suit.
FIGURE 54
in the correct direction, a wrong guess as to the location of
the card will result in the finesse going wrong, and at least one
trick being lost which should not have been lost. So the missing
Queen will be an 'important' one.
As a general rule, I would suggest that the program should
not try an exhaustive tree search until the number of
'important' cards remaining is no more than one.
Having written your tree search routine, you will have an
opportunity to try it out after different numbers of tricks
have been played. This will enable you to time the execution
of the routine where there are 3, 4, 5 ... tricks remaining to
be played. You can then decide what sort of time delay is
acceptable to you, and set the 'endplay parameter' so that the
exhaustive search does not begin until such time as the
computer's delay in calculating its optimal play is acceptable.
Remember: the computer can determine the optimal
endplay strategy for defending players as well as for declarer,
and since you will be playing one of the hands and the
computer will be playing the two unseen hands, you must
allow for a delay by both of the other players. Unfortunately,
there can be a substantial difference between the times
needed to search trees with the same numbers of cards (or
tricks). This is because one tree search might be performed in
a near optimal ordering (i.e. the search heuristics provide a
good ordering of the 'moves') while another tree search
might be highly non-optimal in its ordering. Another reason
for a large disparity in the search times is that even with the
same number of tricks to be played, there can be vastly
differing numbers of nodes on the search tree because of the
way that the suits are distributed. If all the suits are evenly
distributed among the players, the branching factor at each
node will be small. If the suits are unevenly distributed, the
branching factor at some nodes will be small and at others it
will be large. The combinatorial effects of these differences
might result in two trees having the same number of tricks but
widely differing numbers of terminal nodes.
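The setting of the 'endplay parameter' from your timing experiments might be sketched like this. The function names are hypothetical, and two assumptions are built in: that you record the worst time you observed for each number of tricks (since, as explained above, the times vary widely), and that the acceptable delay must cover a search by each of the computer's two hands.

```python
def choose_endplay_parameter(worst_times, acceptable_delay, computer_hands=2):
    """worst_times maps tricks remaining -> worst measured search time (seconds).

    Returns the largest number of tricks remaining at which the exhaustive
    search is still fast enough, or 0 if it never is.
    """
    parameter = 0
    for tricks in sorted(worst_times):
        if worst_times[tricks] * computer_hands > acceptable_delay:
            break
        parameter = tricks
    return parameter


def use_exhaustive_search(tricks_remaining, important_cards, endplay_parameter):
    """Search exhaustively only late in the hand and with at most one
    'important' card still unlocated."""
    return tricks_remaining <= endplay_parameter and important_cards <= 1


# Illustrative figures only, not measurements from any real program.
timings = {3: 0.2, 4: 1.5, 5: 12.0, 6: 90.0}
parameter = choose_endplay_parameter(timings, acceptable_delay=30.0)
print(parameter)
```

Using the worst observed time rather than the average is the cautious choice here, because an unlucky move ordering or an uneven suit distribution can make one search far slower than another with the same number of tricks.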
Sources
CHAPTER 17
Shogi
FIGURE 55: The starting position in shogi.
Promoted pieces
One of the most interesting aspects of shogi, as compared to
chess, is the fact that whereas in chess only the pawns can
promote to a piece of higher value, in shogi some of the other
pieces can also promote. A promotion move is made by
moving a piece partly or wholly within your promotion zone
(the last three ranks or rows furthest from you). Promotion
takes place at the conclusion of the promoting move, and it is
important to remember that in shogi it is not always compulsory
to promote, as we shall see. The following pieces
have the ability to promote:
SILVER: The promoted silver moves exactly like a gold. On
your shogi set the silver can be turned over and on the reverse
side you will see the symbol for a promoted silver.
KNIGHT: The promoted knight also moves exactly like a
gold.
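A test for whether a move gives the option of promoting might look like the sketch below. The names are hypothetical, and I have assumed that ranks are numbered 1 to 9 from the mover's own side, so that the promotion zone is ranks 7 to 9. The set of promotable pieces contains only the two described so far and would need to be extended to cover every piece that can promote.

```python
PROMOTION_ZONE = {7, 8, 9}            # assumed: ranks counted from the mover's side
PROMOTABLE = {"silver", "knight"}     # only the pieces described so far


def may_promote(piece, from_rank, to_rank):
    """True if the move lies partly or wholly within the promotion zone."""
    in_zone = from_rank in PROMOTION_ZONE or to_rank in PROMOTION_ZONE
    return piece in PROMOTABLE and in_zone


print(may_promote("silver", 6, 7))   # moving into the zone
print(may_promote("silver", 7, 6))   # moving out of the zone
print(may_promote("silver", 5, 6))   # wholly outside the zone
```

Because promotion is not always compulsory, a move generator built on this test would normally produce two moves whenever it returns True: one with promotion and one without.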
Capturing
Shogi openings
The exact order in which the opening moves are played does
not appear to be so critical in shogi as in chess. The most
important aspect of opening play in shogi seems to be the
squares on which one places one's pieces, and not the exact
order in which they are moved there. The only source of shogi
openings that I can find in any language other than Japanese
is, once again, the publication of the Shogi Association.
Since it is not necessary for your shogi program to have
access to large tables of opening variations, you need only
devise some method of encouraging the program to make
moves that will lead to its pieces being on the right squares. A
simple method of accomplishing this is to examine each of the
pieces in a desired formation and determine how many moves
away from its target square it is at the moment. The 'opening'
feature in the evaluation function can then be penalised by
(say) 1 point for each piece that is one move away from its
target square, 2 points for each piece that is two moves away,
and so on. This method, or any similar pattern-matching
process, will provide a useful measure as to the degree to
which a desired opening formation has been achieved.
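The penalty described above amounts to summing, over the pieces of the desired formation, the number of moves each is away from its target square. A minimal sketch follows; the names, the (rank, file) coordinates and the sample formation are all hypothetical. In particular, crude_moves_away is only a stand-in that treats every piece as if it moved one square in any direction; a real program would substitute a routine that respects how each shogi piece actually moves.

```python
def crude_moves_away(piece, square, target):
    """Stand-in distance (an assumption): one square in any direction per move."""
    return max(abs(square[0] - target[0]), abs(square[1] - target[1]))


def opening_penalty(formation, position, moves_away=crude_moves_away):
    """formation and position both map a piece name to a (rank, file) square.

    Each piece costs 1 point per move it still needs to reach its target.
    """
    return sum(
        moves_away(piece, position[piece], target)
        for piece, target in formation.items()
    )


# Hypothetical target formation and current position
formation = {"silver": (7, 3), "gold": (8, 4)}
position = {"silver": (9, 3), "gold": (8, 4)}
print(opening_penalty(formation, position))
```

The silver is two moves from its target and the gold is already in place, so the penalty is 2. Passing the distance routine in as a parameter makes it easy to replace the stand-in later without touching the rest of the evaluation.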
Sources
CHAPTER 18
Dominoes
Countless games may be played with the set of dominoes.
Here I shall describe a very simple game which I used to play
as a child.
All dominoes are turned face down and shuffled, and each
player picks seven dominoes at random, which he then looks
at. The game may be played with two, three or four players,
but I always found the game with two players was the most
challenging and the most enjoyable. There is some method
for deciding who goes first - this may be done by the toss of a
coin, or it may alternate from one game to the next, or it can
be the player who holds the highest double (in which case this
double must be played on the first move). Once a domino has
been placed on the table, face up, the players take it in turns
to move.
In order to make a move a player must put down a domino
which has, as one of its numbers, the same number as one of
the ends of the chain of dominoes already on the table. The
new domino is put on the table in such a way that the
matching parts of the two dominoes are next to each other.
The other end of the new domino then forms a new end to the
chain. Whenever a double domino is placed on the table it is
put at right-angles to the end of the chain whose number
matches the double. The example given in Figure 56 illustrates
the first few moves of a game.
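The matching rule can be sketched as follows. The representation is an assumption made for illustration: a domino is a pair of numbers, and only the two open ends of the chain are stored, since those are all that matter when deciding what may be played. The function names are hypothetical.

```python
def legal_plays(hand, ends):
    """Return (domino, side) pairs that may legally be played.

    ends is (left, right): the two open numbers of the chain.
    """
    left, right = ends
    plays = []
    for domino in hand:
        if left in domino:
            plays.append((domino, "left"))
        if right in domino and right != left:
            plays.append((domino, "right"))
    return plays


def play(ends, domino, side):
    """Return the new open ends after placing domino on the given side."""
    a, b = domino
    left, right = ends
    if side == "left":
        if a == left:
            return (b, right)
        if b == left:
            return (a, right)
    else:
        if a == right:
            return (left, b)
        if b == right:
            return (left, a)
    raise ValueError("domino does not match that end of the chain")


ends = (5, 4)                       # after the 5-4 has been put down
hand = [(4, 4), (6, 5), (3, 2)]
print(legal_plays(hand, ends))
print(play(ends, (6, 5), "left"))
```

Note that a double leaves the open number at its end of the chain unchanged, which is why it needs no special case here; placing it at right-angles is purely a matter of how the chain is displayed.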
Thus the game progresses, until the player whose turn it is
to move cannot put a domino from his own hand at either
end of the chain. He must then pick up dominoes from the
shuffled set one at a time until he gets one which may legally
be played at one end of the chain. The first player to get rid of
all his dominoes wins the hand, and his opponent is debited
by the number of points showing on all the dominoes
remaining in his hand. It is customary to play until one
player's total reaches a certain threshold, say 101, and he
loses the game.
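Drawing from the shuffled set and scoring the end of a hand might be sketched like this. The names are hypothetical, and one case is an assumption of mine: the rules above do not say what happens if the shuffled set runs out before a playable domino appears, so the sketch simply returns None and leaves that decision to the caller.

```python
def draw_until_playable(hand, boneyard, ends):
    """Draw one domino at a time until one fits either end of the chain.

    hand and boneyard are lists of (a, b) pairs and are modified in place.
    Returns the playable domino, or None if the boneyard is exhausted
    (an assumption: the text does not cover that case).
    """
    while boneyard:
        domino = boneyard.pop()
        hand.append(domino)
        if ends[0] in domino or ends[1] in domino:
            return domino
    return None


def hand_penalty(hand):
    """Points debited to the loser of a hand: all spots left in his hand."""
    return sum(a + b for a, b in hand)


def has_lost_game(total, threshold=101):
    """A player loses the game when his total reaches the threshold."""
    return total >= threshold


hand = []
boneyard = [(2, 5), (3, 3), (1, 0)]   # drawn from the end of the list
drawn = draw_until_playable(hand, boneyard, ends=(5, 4))
print(drawn, hand, hand_penalty(hand))
```

Every domino drawn stays in the player's hand, including the unplayable ones, so a long run of bad draws is costly if the opponent then goes out.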
The first player (who won the toss) puts down the 5-4:
(Domino diagrams for the opening moves.)
FIGURE 56
Playing strategy
About the Author
PRINTED IN USA