
Genetic Programming Produced

Competitive Soccer Softbot Teams for RoboCup97


Sean Luke
[email protected]
http://www.cs.umd.edu/~seanl/
Department of Computer Science
University of Maryland
College Park, MD 20742

ABSTRACT

At RoboCup, teams of autonomous robots or software softbots compete in simulated soccer matches to demonstrate cooperative robotics techniques in a very difficult, real-time, noisy environment. At the IJCAI/RoboCup97 softbot competition, all entries but ours used human-crafted cooperative decision-making behaviors. We instead entered a softbot team whose high-level decision-making behaviors had been entirely evolved using genetic programming. Our team won its first two games against human-crafted opponent teams, and received the RoboCup Scientific Challenge Award. This report discusses the issues we faced and the approach we took to use GP to evolve our robot soccer team for this difficult environment.

1 Introduction

RoboCup is a competition which pits teams of robots against each other in a robotic soccer tournament [Kitano et al 1995]. To be successful at RoboCup, a team of robotic soccer players must be able to cooperate in real time in a noisy, highly-dynamic environment against an opposing team. In addition to the two "real-robot" leagues at RoboCup, there is a softbot league which competes inside a provided soccer simulator, the RoboCup Soccer Server [Itsuki 1995]. The simulator enforces extremely limited and noisy sensor information, complex physics, real-time dynamics, and limited intercommunication among softbots. The result is a rather challenging real-time domain.

Practically all entrants in the RoboCup simulator league used hand-coded team strategy algorithms, though some fine-tuned a few low-level functions (like ball interception) with backpropagation or decision trees. In contrast, at the University of Maryland a group of undergraduates and I entered a softbot team whose high-level strategies were entirely learned through genetic programming [Luke et al 1997].

Unlike other teams, who had refined well-understood robotics techniques in order to win the competition, we saw the RoboCup simulator as a very difficult environment in which to push the bounds of what was possible with existing evolutionary computation techniques. For a variety of reasons detailed later, the soccer simulator is very difficult to evolve for. Hence, our goal was relatively modest: to produce a team which played at all. As it turned out, we were pleasantly surprised with the results. Our evolved teams learned to disperse throughout the field, pass, kick to the goal, defend the goal, and coordinate with and defer to other teammates. At the IJCAI/RoboCup97 competition our team managed to win its first two matches against human-coded opponents, and took home the RoboCup97 Scientific Challenge award.

2 The RoboCup Soccer Server Domain

Genetic programming has been successfully applied to multiagent coordination before. [Andre 1995] evolved communication between agents with different skills. [Qureshi 1996] evolved agent-based communication in a cooperative avoidance domain. [Raik and Durnota 1994] used GP to evolve cooperative sporting strategies, and [Luke and Spector 1996], [Haynes et al 1995] used GP to develop cooperation in predator-prey environments. [Iba 1996] applied a similar approach to cooperative behavior in the TileWorld domain. Even so, evolutionary computation is rarely applied to a problem domain of this difficulty. The RoboCup Soccer Server (http://ci.etl.go.jp/~noda/soccer/server.html) was not designed with GP in mind. To realize why the Soccer Server presents such a challenge for evolutionary computation (and GP in particular), it's important to understand how the domain works.

In a full match, the Soccer Server admits eleven separate player programs per team, each controlling a different virtual soccer player in its simulation model. By regulation rules, these player programs must be separate processes which are not permitted to communicate with each other except through
the limited facilities provided by the Soccer Server. Each player on the team makes a separate socket connection to the simulator. Once connected, a player program receives UDP datagrams once every 300 milliseconds, providing it with sensor information and messages "yelled" by other players on the field. The player issues commands to the server by sending it UDP datagram messages no faster than once every 100 milliseconds. Commands are not queued: if the player issues commands faster than this, they are simply ignored by the server. The server updates its internal world model every ten milliseconds; this places a real-time restriction on the speed of play.

The simulator maintains a virtual soccer field 105 units long by 68 units wide, with goals 14 units wide. Sensor information relays game status and the relative positions of viewable objects on the field. The only useful sensor option (and the one used by all players in the competition) gives both the direction and distance of objects the player can "see". Players may choose narrow (45 degrees), medium (90 degrees; we and most competitors chose this), or wide (180 degrees) fields of view, but the wider the range, the more slowly sensor updates are received. Sensor information includes only:

• The ball position and relative movement.

• Positions and relative movement of other players. If players are far enough away (over 20 units), their jersey numbers cannot be ascertained. If players are very far away (over 40 units), the team they're on cannot be determined.

• Goal positions.

• The position of flags placed at corners of the field, and on the edges of the field at its midpoint.

• The distance to and perceived angle of the soccer field boundary line crossing the player's field of vision.

• The state of the ball in play, including free, goal, side, and corner kicks, out-of-bounds, pre-game and mid-game setup, kickoffs, etc.

Sensing is further complicated in three ways. First, the coordinate positions of objects are given relative to the player's position and the direction he is facing, but the server does not tell the player any information about his own whereabouts. Second, the server gives information only about objects closer than 3 units, or within the player's field of view (the widest is only 180 degrees). Third, the simulator adds to the sensor positional information a heavy dose of gaussian noise, and noise increases as the distance to an object increases.

Player movement is nonholonomic (players can either turn or dash but not both at the same time), which makes things messy. And like sensor information, movement is also subject to a great deal of noise. Each movement cycle, a player may issue one of several commands:

• Rotate n degrees from his current facing direction.

• Dash in the direction he is facing with n power. A player must repeatedly dash to keep up forward movement. Dashing also decreases stamina; players that dash with high power will soon start running much slower than they realize. A player is not told his current stamina, nor how fast he is currently running. Players may also dash backwards, but at most a third of maximum power.

• Kick the ball in a certain direction (relative to the direction the player is facing) and with a certain power. Players can only kick a ball when it is under 1.8 units away.

• Yell a message up to 512 bytes long. There are strong limitations on this. Messages may be yelled only very infrequently, and fellow players can hear only so many messages each sensor cycle. Further, players aren't told where a yell came from; hence yells can be (and often are) faked by opponents.

• Move the player to a specific (X,Y) coordinate position and facing a specific angle. This is only permitted while the ball is out of play.

Play happens in real time. If a player cannot process sensor information or make moves fast enough, he will fall behind. The Soccer Server also maintains complex dynamics among moving objects. Balls and players have acceleration and momentum, and cannot immediately stop, change direction, or move at a certain velocity on command. Players and the ball have different mass, hence can move at different rates (the ball can move much faster). Players take up space and collide with the ball and other players inelastically. When a team is given possession of the ball (for a goal kick, perhaps), the server "bumps" opposing players from the general ball area. Though not used in the competition, the simulator can also provide wind and other complicating conditions.

The simulator enforces standard soccer rules with one very large exception: as the robot players have no hands, there is no goalie, and no goalie area. When a ball is kicked out-of-bounds, ball control is transferred to opponents and the ball is moved to the appropriate kick-in position per regulation rules. Goals are scored when the ball passes through the goal line. Finally, the server allows a human referee to make foul calls for ungentlemanly play (for example, the entire team lining up to block the goal).

3 The Challenge for Evolutionary Computation

As should be obvious from the above description, the Soccer Server domain is very complex, with a large number of options and controls, and a correspondingly large number of boundary conditions and special cases that must be accounted for. This alone makes it a tough problem to tackle with GP.
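The timing rules described above (sensor updates every 300 ms, commands accepted no faster than every 100 ms and silently dropped otherwise, and turn-or-dash nonholonomic movement) can be sketched as client-side logic. This is an illustrative sketch, not the paper's actual library: the s-expression command syntax is an assumption about the server protocol, and all class and function names here are ours.

```python
import time

class CommandThrottle:
    """Client-side guard mirroring the server rule that commands arriving
    faster than once per 100 ms are ignored, not queued. The 0.1 s interval
    follows the paper's description of the Soccer Server."""
    MIN_INTERVAL = 0.100  # seconds between accepted commands

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, eases testing
        self._last_sent = None
        self.sent = []           # commands the server would accept
        self.dropped = []        # commands the server would ignore

    def send(self, command: str) -> bool:
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self.MIN_INTERVAL:
            self.dropped.append(command)   # too fast: ignored, not queued
            return False
        self._last_sent = now
        self.sent.append(command)
        return True

def movement_command(turn_degrees: float, dash_power: float) -> str:
    """Players are nonholonomic: each cycle they may turn OR dash, never both.
    The command strings are illustrative s-expressions, not a protocol spec."""
    if abs(turn_degrees) > 1e-6:
        return f"(turn {turn_degrees:.1f})"
    return f"(dash {dash_power:.1f})"
```

An injected clock makes the 100 ms dropping rule easy to exercise without a live server connection.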
As if the Soccer Server's complex dynamics didn't make evolving a robot team hard enough, the server also adds one enormously problematic issue: time. As provided, the Soccer Server runs in real time, and all twenty-two players connect to it via separate UDP sockets. Because of the enforced ten-millisecond delay between world model updates, a full game takes ten minutes to play. Game play can be sped up by hacking the server and players into a unified program (removing UDP) and eliminating the ten-millisecond delay. However, we found that for many reasons this does not increase speed as dramatically as might be imagined, and if not carefully done, runs the risk of changing the game dynamics (and hence "changing" the game to optimize over).

The reason all this is such a problem is that evolving a computer program to work successfully in this domain would likely require a very large number of evaluations, and each new evaluation is another soccer simulator trial. In previous experiments with considerably simpler cooperation domains [Luke and Spector 1996], we have found that genetic programming could require on the order of 100,000 evaluations to find a reasonable solution. We suspected the soccer domain could be much worse. Consider that just 100,000 5-minute-long evaluations in serial in the Soccer Server could require up to a full year of evolution time.

Our challenge was to cut this down from years to a few weeks or months, but still produce a relatively good-playing soccer team from only a few evolutionary runs. We accomplished this in several ways:

• Brute force. We sped up play by performing up to 32 full-team game evaluations in parallel on a supercomputer cluster. We also cut down game time from 10 minutes to between 20 seconds and one minute.

• We attempted to cut down the population size and number of generations necessary to produce a reasonable team.

• We developed an additional layer of software which simplified and orthogonalized the domain, eliminating many of the boundary conditions the GP programs would have to account for. We also spent much time designing a function set and evaluation criteria to promote better evolution in the domain.

• We performed parallel runs with different genome structures to give us more options as competition time neared.

4 Using Genetic Programming to Evolve Soccer Behaviors

As the Soccer Simulator dynamics were quite complex, we began by hand-coding a multithreaded socket library which abstracted away some of the oddities of the domain. The library received all incoming sensor information and boiled it down into the absolute position of all visible teammates and opponents (and the player himself), the ball, and the goals. We included a simple state-estimation mechanism that interpolated teammate and opponent positions in-between sensor cycles and maintained estimates of the player's stamina. The boiled-down domain provided information about whose ball it was during free-kicks, goal-kicks, etc., but this information was largely unused as the simulator would keep players out of the kick area anyway. As we decided to ignore the complexities of intercommunication, our domain also eliminated the ability to yell or listen.

We also made some significant changes to the traditional GP genome. Instead of a player algorithm consisting of a single tree, our players consisted of two algorithm trees. The first tree was responsible for moving the player, and when evaluated would output a vector which gave the direction and speed with which to turn and dash. The second tree was responsible for making kicks, and when evaluated would output a vector which gave the direction and power with which to kick the ball. At evaluation time, the program executing the player's moves would follow the instructions of one or the other tree based on the following simplifying state-rules:

• If the player can see the ball and is close enough to kick the ball, call the kick tree. Kick the ball as told, moving the player slightly out of the way if necessary. Turn in the direction the ball was kicked.

• If the player can see the ball but isn't close enough to kick it, call the move tree. Turn and dash as told; if the player can continue to watch the ball by doing so, dash instead by moving in reverse.

• If the player cannot see the ball, turn in the direction last turned until the player can see it.

This state mechanism eliminated a great many troublesome boundary conditions. First, by combining "movement" into turn-dash pairs, we allowed the GP tree function set to assume its player had holonomic movement, that is, the ability to move immediately in any direction. Second, by doing everything reasonable to keep the ball in view, we were able to eliminate many of the boundary conditions which occur when the ball suddenly disappears due to arbitrary player movement (a big problem in our early tests).

Before evolving a team, we had to create the set of low-level "basic" behavior functions to be used by its players. This required some compromise. Ideally, we would have liked to produce soccer players out of a variety of very low-level, generic vector functions. This would have allowed us to say that in no way did we bias the function set to produce certain kinds of strategies. But our early tests suggested that the domain was so complex that there was little hope of evolving a team with this kind of function set. Instead, we designed a large set of functions (some generic, some specialized) we thought would have particular utility in the soccer domain. In doing so, we tried to stay as general as possible but still come up with a function set that we thought stood a chance of evolving successfully.
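The three state-rules above amount to a small dispatcher over the two evolved trees. The sketch below is a hypothetical reconstruction, not the authors' code: the State fields, the function names, and the (direction, power) vector convention are our assumptions.

```python
from dataclasses import dataclass

KICKABLE_DIST = 1.8  # units; the server's kickable distance, per Section 2

@dataclass
class State:
    sees_ball: bool
    dist_to_ball: float
    reverse_keeps_ball_in_view: bool  # would a backwards dash keep the ball visible?
    last_turn: float                  # degrees of the most recent turn

def choose_action(state, kick_tree, move_tree):
    """Dispatch per the three state-rules. kick_tree and move_tree stand in
    for the two evolved GP trees; each returns a (direction_degrees, power)
    pair. Returns an (action_name, vector) tuple for illustration."""
    # Rule 1: ball visible and within kicking range -> consult the kick tree.
    if state.sees_ball and state.dist_to_ball <= KICKABLE_DIST:
        return ("kick", kick_tree(state))
    # Rule 2: ball visible but out of range -> consult the move tree,
    # preferring a reverse dash when that keeps the ball in sight.
    if state.sees_ball:
        vec = move_tree(state)
        if state.reverse_keeps_ball_in_view:
            return ("reverse-move", vec)
        return ("move", vec)
    # Rule 3: ball not visible -> keep turning in the last direction turned.
    return ("search", (state.last_turn, 0.0))
```

Keeping this arbitration outside the evolved trees is what lets the trees assume holonomic movement and a visible ball.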
Function Syntax | Returns | Description

(home) | v | A vector to my home (my starting position).
(ball) | v | A vector to the ball.
(findball) | v | A zero-length vector to the ball.
(block-goal) | v | A vector to the closest point on the line segment between the ball and the goal I defend.
(away-mates) | v | A vector away from known teammates, computed as the inverse of sum over m in {vectors to teammates} of ((max − ||m||) / ||m||) m.
(away-opps) | v | A vector away from known opponents, computed as the inverse of sum over o in {vectors to opponents} of ((max − ||o||) / ||o||) o.
(squad1) | b | t if I am first in my squad, else nil.
(opp-closer) | b | t if an opponent is closer to the ball than I am, else nil.
(mate-closer) | b | t if a teammate is closer to the ball than I am, else nil.
(home-of i) | v | A vector to the home of teammate i.
(block-near-opp v) | v | A vector to the closest point on the line segment between the ball and the nearest known opponent to me. If there is no known opponent, return v.
(mate i v) | v | A vector to teammate i. If I can't see him, return v.
(inv v) | v | v rotated 180 degrees.
(if-v b v1 v2) | v | If b is t, return v1, else return v2.
(sight v) | v | Rotate v just enough to keep the ball in sight.
(ofme i) | b | Return t if the ball is within max/i units of me, else nil.
(ofhome i) | b | Return t if the ball is within max/i units of my home, else nil.
(ofgoal i) | b | Return t if the ball is within max/i units of the goal, else nil.
(weight-+ i v1 v2) | v | Return (v1 · i + v2 · (9 − i)) / 9.
(far-mate i k) | k | A vector to the most offensive-positioned teammate who can receive the ball with at least (i+1)/10 probability. If none, return k.
(mate-m i1 i2 k) | k | A vector to teammate i1 if his position is known and he can receive the ball with at least (i2+1)/10 probability. If not, return k.
(kick-goal i k) | k | A vector to the goal if the kick will be successful with at least (i+1)/10 probability. If not, return k.
(dribble i k) | k | A "dribble" kick of size max/(2.0)^i in the direction of k.
(kick-goal!) | k | Kick to the goal.
(far-mate!) | k | Kick to the most offensive-positioned teammate. If there is none, kick to the goal.
(kick-clear) | k | Kick out of the goal area. Namely, kick away from opponents as computed with (away-opps), but adjust the direction so that it is at least 135 degrees from the goal I defend.
(kick-if b k1 k2) | k | If b is t, return k1, else return k2.
(opponent-close i) | b | Return t if an opponent is within max/(1.5)^i of me.
0,1,2,3,4,5,6,7,8,9 | i | Constant integer values.

Table 1: GP functions used in the soccer evaluation runs. Other functions tried (but not used in the final runs) included internal state, magnitude and cross-product comparison, angle rotation, boolean operators, move history, etc. max is the approximate maximum distance of kicking, set to 35. k is a kick-vector, v is a move-vector, i is an integer, and b is a boolean. Vectors (for either kicking or moving) are a pair of floating-point values.
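A few of the vector functions in Table 1 translate directly into code. The sketch below implements (away-mates) and (weight-+) as we read their formulas; the (x, y) tuple representation and the function names are our assumptions, not the paper's implementation.

```python
import math

MAX_DIST = 35.0  # "max", the approximate maximum kicking distance from Table 1

def away_mates(teammate_vectors):
    """(away-mates): the inverse (180-degree rotation) of
    sum over m of ((max - ||m||) / ||m||) * m,
    so nearby teammates repel strongly and distant ones barely at all.
    Each vector is an (x, y) pair relative to the player."""
    sx = sy = 0.0
    for (x, y) in teammate_vectors:
        norm = math.hypot(x, y)
        if norm == 0.0:
            continue  # skip a degenerate zero-length vector
        w = (MAX_DIST - norm) / norm
        sx += w * x
        sy += w * y
    return (-sx, -sy)  # "inverse": rotate the weighted sum by 180 degrees

def weight_plus(i, v1, v2):
    """(weight-+ i v1 v2): the blend (v1 * i + v2 * (9 - i)) / 9,
    so i selects a mix between the two argument vectors."""
    return tuple((a * i + b * (9 - i)) / 9 for a, b in zip(v1, v2))
```

With i ranging over the integer constants 0 through 9, (weight-+) interpolates from all-v2 at i = 0 to all-v1 at i = 9.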
To achieve this, we used Strongly-Typed GP [Montana 1995] to provide for a variety of different types of data (booleans, vectors, etc.) accepted and returned by GP functions, restricting tree formation to conform to these type rules. This allowed us to include a large, rich set of GP functions (allowing for many more player options), but still constrain the possible permutations of function combinations.

Table 1 gives the basic functions we provided our GP system with which to build GP trees. We decided early on to enrich a basic function set of vector operators and if-then control constructs with some relatively domain-specific behavior functions. Some of these behaviors could be derived directly from the Soccer Server's sensor information. This included vector functions like (kick-goal!), or (home). Other behaviors were important to include but were hand-coded because we found evolving them unsuccessful, at least within our limited time constraints. These included good ball interception (a surprisingly complex task), which was formed into (ball), or moving optimally to a point between two objects (forming (block-near-opp), for example).

We used genetic programming to evolve other low-level behaviors. Most notably, Charles Hohn used symbolic regression to evolve functions determining the probability of a successful goal-kick or pass to a teammate, given opponents in various positions [Hohn 1997]. Our symbolic regression data points were generated by playing actual trials in the Soccer Server (with a kicker, receiver, and opponent for teammate-pass trials, or just a kicker and opponent for goal-kick trials). We used these evolved algorithms as the basic probabilistic mechanism behind the decision-making functions (kick-goal ...), (mate-m ...), and (far-mate ...).

Given a basic function set, there are a variety of ways to use genetic programming to "evolve" a soccer team. An obvious approach is to form teams from populations of individual players. The difficulty with this approach is that it introduces the credit assignment problem: when a team wins (or loses), how should the blame or credit be spread among the various teammates? We took a different approach, widely used in GP, which we had tried before in [Luke and Spector 1996]: the genetic programming genome is an entire team; all the players in a team stay together through evaluations and breeding.

This raises the question of a homogeneous or heterogeneous team approach. With a homogeneous team approach, each soccer player would follow effectively the same algorithm, and so a GP genome would be a single kick-move tree pair used by all teammates during play. With a heterogeneous approach, each soccer player would develop and follow his own unique algorithm, so a GP genome would be not just a kick-move tree pair, but a forest of such pairs, one pair per player. In a domain where heterogeneity is useful, the heterogeneous approach provides considerably more flexibility and the promise of specialized behaviors and coordination. However, homogeneous approaches can take far less time to evolve, since they require evolving only a single algorithm rather than (in this case) eleven algorithms.

Figure 1: Homogeneous and Pseudo-Heterogeneous (Squad-Based) genome encodings.

To implement a fully heterogeneous approach in the soccer domain would necessitate evolving a genome consisting of twenty-two separate GP trees, far more than we felt could reasonably evolve in the time available. Instead, we ran separate runs for homogeneous teams and for hybrid pseudo-heterogeneous teams (see Figure 1). The hybrid teams were divided into six squads of one or two players each; each squad evolved a separate algorithm used by all players in the squad. This way, pseudo-heterogeneous teams had genomes of twelve trees. Each player could still develop his own unique behavior, because the function set included functions which let each player distinguish himself from his squad-mates. We ran separate runs with these two different approaches to increase our chance of having something to show at RoboCup (as discussed later, our concern was justified).

Because our genomes consisted of forests of trees, we adapted the GP crossover and mutation operators to accommodate this. In our runs, crossover and mutation would apply only to a single tree in the genome. For both homogeneous and pseudo-heterogeneous approaches, we disallowed crossover between a kick tree and a move tree. For pseudo-heterogeneous approaches, we allowed trees to cross over only if they were from the same squad: this "restricted breeding" has in previous experience proven useful in promoting specialization [Luke and Spector 1996]. We also introduced a special crossover operator, root crossover, which swapped whole trees at the root instead of swapping subtrees. This let teams effectively "trade players", which we hoped would spread good strategies through the population more rapidly.

To reduce run time, we used population sizes between 100 and 400 (in the final run, 128). We felt these small populations (given the problem complexity) necessitated a somewhat unusual mix of breeding operators. As a consequence of findings in [Luke and Spector 1997], we decided to use a large dose of mutation (30% in the final run) to produce higher overall fitness with the small population, and to stave off premature convergence. The rest consisted of 70% subtree crossover (choosing internal-nodes 30% of the time,
leaf-nodes 10% of the time, and performing root crossover 60% of the time). Finally, we used tournament selection with a tournament size of 7.

Figure 2: A competition between two initial random (and randomly-moving) teams.

Figure 3: "Kiddie-Soccer", a problematic early suboptimal strategy, where everyone on the team would go after the ball and try to kick it into the goal. Without careful tuning, many populations would not escape this suboptimum.
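The selection and breeding mix just described can be sketched as follows. This is an illustrative sketch under our reading of the percentages (30% mutation; within the remaining 70% crossover, 60% root crossover, 30% internal-node, 10% leaf-node subtree crossover); the function names are ours, not lil-gp's API.

```python
import random

def tournament_select(population, fitness, size=7, rng=random):
    """Tournament selection as in the final run: sample `size` individuals
    uniformly with replacement and keep the fittest of the sample."""
    contenders = [rng.randrange(len(population)) for _ in range(size)]
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]

def pick_operator(rng=random):
    """Choose a breeding operator per the final-run mix: 30% mutation,
    70% crossover; within crossover, 60% root crossover (whole-tree swap),
    30% internal-node subtree crossover, 10% leaf-node subtree crossover."""
    if rng.random() < 0.30:
        return "mutation"
    r = rng.random()
    if r < 0.60:
        return "root-crossover"
    if r < 0.90:
        return "internal-node-crossover"
    return "leaf-node-crossover"
```

Passing the random source in explicitly keeps both helpers deterministic under test.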
Another issue was the evaluation function needed to assess a genome's fitness. One way to assess a team would be to play the team against one or more hand-created opponent teams of known difficulty. There are two problems with this approach. First, from our experience, evolutionary computation strategies often work more efficiently when the difficulty of the problem ramps up as evolution progresses, that is, as the population gets better, the problem gets harder. A good ramping with a suite of pre-created opponents is difficult to gauge. Second, unless there are several opponents at any particular difficulty level, one runs the common risk of evolving a team optimized to beat that particular set of hand-made opponents, instead of generalizing to play "good" soccer.

We opted instead for evolving our teams in a competitive fitness environment [1]: teams' fitnesses were assessed based on competition with peers in the population (for a survey of such environments, see [Angeline and Pollack 1996]). There are a variety of approaches to creating such competitions. One approach is to create a "round-robin" tournament where every team in the population squares off at least once against every other team. This approach is very expensive: (n^2 − n)/2 evaluations for a population of size n. Another approach is to use a traditional single- or double-elimination tournament (n − 1 evaluations at best). Because of the extreme cost of evaluations, we opted instead to randomly pair up teams, basing team fitness on the single game each pair played (n/2 evaluations).

[1] This might all fit under the "co-evolution" umbrella. In theoretical biology, the definition of "co-evolution" (along with the term "species") has become rather fuzzy of late. But while I originally used "co-evolution" to describe this environment, I think the term carries just too much inter-species (or multi-population) baggage. "Competitive fitness" is more precise.

Competitive fitness functions are chaotic and so can occasionally have undesirable effects on the population, but we found them a useful fit for a naturally competitive domain such as robotic soccer. Competitive fitness functions also naturally ramp problem difficulty because teams in the population play against peers of approximately similar ability. Such functions can also promote generalization, because the set of possible "opponents" an individual might face is the population itself.

We initially based fitness on a variety of game factors including the number of goals, time in possession of the ball, average position of the ball, number of successful passes, etc. However, we found that in our early runs, the entire population would converge to very poor solutions. Ultimately, we found that by simply basing fitness on goal difference alone, the population avoided such convergence. At first glance, such a simplistic fitness assessment would seem an overly crude measure, as many early games might end with 0–0 scores. Luckily, this turned out to be untrue. We discovered that initial game scores were in fact very high and quite variable: vectors to the ball and to the goal were fundamental parts of the function set, so teams did simple offense well, but defense poorly. Only later, as teams evolved better defensive strategies, would scores come down to more reasonable levels.

Figure 4: Some players begin to hang back and protect the goal, while others chase after the ball.

Figure 5: Teams eventually learn to disperse themselves throughout the field.

We performed our GP runs in parallel using a custom strongly-typed, multithreaded version of lil-gp 1.1 [Zongker and Punch 1995], running on a 40-node DEC Alpha supercomputer. At evaluation time, the system paired off competitors, formed the pairs into groups, and assigned each group to a separate evaluation thread. In parallel, these evaluation threads would work with the socket communication library to pair off teams and play competitions in separate Soccer Server processes.

At some point I felt we would need the population to stop global searches and start narrowly tweaking its best strategies to date in preparation for the competition. As such, we ran the final runs for forty generations, at which time we re-introduced into the population high-fitness individuals from past generations, and added 10% reproduction (30% mutation, 60% crossover). The intent of this unusual step was to force the population to rapidly converge to a narrow set of suboptima. We then continued runs up to the time of the RoboCup-97 competition (for twelve generations).

Just prior to the competition, we held a "tournament of champions" among the twenty highest-performing teams at that point, and submitted the winner. While I feel that, given enough evolution time, the learned strategies of the pseudo-heterogeneous teams might ultimately outperform the homogeneous teams, the best teams at competition time (including the one we submitted) were homogeneous.

5 A History of Evolution

One of the fun parts of working with this domain is watching the population learn. In a typical GP experiment one would conduct a large number of runs, which provides a statistically meaningful analysis of population growth and change. For obvious reasons, this was not possible for us to do. Given the one-shot nature of our RoboCup runs (the final run took several months' time), our observations of population development are therefore admittedly anecdotal. Still, we observed some very interesting trends.

Our initial random teams consisted primarily of players which wandered aimlessly, spun in place, stared at the ball, or chased after teammates. Because (ball) and (kick-goal!) were basic functions, there were occasional players which would go to the ball and kick it to the goal. These players helped their teams rack up stratospheric scores against helpless opponents. Figure 2 shows two random teams playing.

Early populations produced all sorts of bizarre strategies. One particular favorite was a (homogeneous) competition of one team programmed to move away from the ball, against another team programmed to move away from the first team. Thankfully, such strategies didn't last for many generations.

One suboptimal strategy, however, was particularly troublesome: "everyone chase after the ball and kick it into the goal", otherwise known as "kiddie-soccer", shown in Figure 3. This strategy gained dominance because early teams had effectively no defensive ability. Kiddie-soccer proved to be a major obstacle to evolving better strategies. The overwhelming tendency to converge to kiddie-soccer and similar
strategies was the chief reason behind our simplification of In the past, genetic programming has been surprisingly
the evaluation function (to be based only on goals scored). successful in a variety of areas, but real-time robotics is not
After we simplified the evaluation function, the population one of them. The complex dynamics of the field, plus the long
eventually found its way out of the kiddie-soccer suboptima time necessary to perform evaluations, makes robotics (and
and on to better strategies. realistic simulator robotics) a difficult problem for evolution-
After a number of generations, the population as a whole ary computation to crack. But not an impossible problem. I
began to develop rudimentary defensive ability. One com- hope this paper puts to rest the myth that genetic programming
mon approach we noted was to have a few players hang can compare favorably to human-crafted solutions only in toy
back near the goal when not close to the ball (Figure 4). domains, or for problems tailor-made for the constraints of
Most teams still had many players which clumped around the evolution.
ball, kiddie-soccer-style, but such simple defensive moves
effectively eliminated the long-distance goal shots which had 7 Acknowledgements
created such high scores in the past.
Eventually teams began to disperse players throughout the This research is supported in part by grants to Dr. James
field and to pass to teammates when appropriate instead of Hendler from ONR (N00014-J-91-1451), AFOSR (F49620-
kicking straight to the goal, as shown in Figure 5. Ho- 93-1-0065), ARL (DAAH049610297), and ARPA contract
mogeneous teams did this usually by using players’ home DAST-95-C0037.
positions and information about nearby teammates and ball My thanks to the Maryland RoboCup team for their help
position. But some pseudo-heterogeneous teams appeared in the development of this project: Charles Hohn, Jonathan
to be forming separate offensive and defensive squad algo- Farris, Gary Jackson, Daniel Wigglesworth, John Peterson,
rithms. Although the pseudo-heterogeneous teams were not Shaun Gittens, Shu Chiun Cheah, and Tanveer Choudhury.
sufficiently fit by the time RoboCup arrived, we suspect that Thanks also to Jim Hendler, Lee Spector, Kilian Stoffel, Bob
given more time, this approach could have ultimately yielded Kohout, Hiroaki Kitano, Minoru Asada, and to the UMIACS
better strategies. system staff for turning their heads while we played soccer
games on their supercomputers.
6 Conclusions and Future Work
References
This project was begun to see if it was even possible to suc-
cessfully evolve a team for such a challenging domain as the Andre, D. 1995. The Automatic Programming of Agents
RoboCup Soccer Server. Given the richness of the Soccer that Learn Mental Models and Create Simple Plans of
Server environment and the very long run time required to Action. In Proceedings of the Fourteenth International
get reasonable results, we are very pleased with the outcome. Joint Conference on Artificial Intelligence, C. S. Mellish,
Our evolved softbots learned to play rather well given the ed. 741–747. Morgan Kaufmann, San Mateo CA.
constraints we had to place on their evolution, and they beat
Angeline, P. and J. Pollack. Competitive Environments Evolve
teams crafted by hand by real human experts (who weren’t
Better Solutionf for Complex Tasks. In Proceedings of the
us). As such, I think the experiment was a success.
Fifth International Conference on Genetic Algorithms, S.
Still, we had to compromise in order to make the project Forrest, ed. 264–270. Morgan Kaufmann, San Mateo
a reality. The function set we provided was heavy on if-then CA.
statements and functional operation, with little internal state.
In the future I hope to try more computationally sophisticated Haynes, T., S. Sen, D. Schoenefeld and R. Wainwright. 1995.
algorithms. The competition mechanism (only one game Evolving a Team. In Working Notes of the AAAI-95 Fall
per individual) was also very restrictive: I think a single- Symposium on Genetic Programming. E. V. Siegel and J.
elimination tournament might yield better results. Finally, R. Koza, editors. 23–30. AAAI Press.
the population sizes were very small; enlarging these could
make a significant impact on evolutionary progress. Hohn, C. 1997. Evolving Predictive Functions from Observed
It is unfortunate that the our pseudo-heterogeneous teams Data for Simulated Robots. Senior Honor’s Thesis. De-
could not outperform our homogeneous teams by RoboCup partment of Computer Science, University of Maryland
competition-time. Much of this is due to the excessive size of at College Park.
the pseudo-heterogeneous genomes (which were still much Holland, J. H. 1996. Adaption in Natural and Artificial Sys-
smaller than full-heterogeneous genomes). Hindsight is 20- tems. University of Michigan Press.
20. In the future I would definitely pick larger squad sizes,
perhaps three squads of three or four each, or two squads of Iba, H. 1996. Emergent Cooperation for Multiple Agents
five or six each. This would bring the total number of trees using Genetic Programming. In Late Breaking Papers of
in the genome down to six or four, which might evolve more the Genetic Programming 1996 Conference, J. R. Koza,
rapidly. ed. 66–74. Stanford University Bookstore, Stanford CA.
Itsuki, N. 1995. Soccer Server: a simulator for RoboCup. In JSAI AI-Symposium 95: Special Session on RoboCup.

Kitano, H., M. Asada, Y. Kuniyoshi, I. Noda, and E. Osawa. 1995. RoboCup: The Robot World Cup Initiative. In Proceedings of the IJCAI-95 Workshop on Entertainment and AI/ALife.

Koza, J. R. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT Press, Cambridge MA.

Luke, S. and L. Spector. 1996. Evolving Teamwork and Coordination with Genetic Programming. In Proceedings of the First Annual Conference on Genetic Programming (GP-96), J. R. Koza et al., eds. 150–156. The MIT Press, Cambridge MA.

Luke, S. et al. 1997. Co-evolving Soccer Softbot Team Coordination with Genetic Programming. In Proceedings of the RoboCup-97 Workshop at the 15th International Joint Conference on Artificial Intelligence (IJCAI97), H. Kitano, ed. 115–118. IJCAI.

Luke, S. and L. Spector. 1997. A Comparison of Crossover and Mutation in Genetic Programming. In Genetic Programming 1997: Proceedings of the Second Annual Conference (GP97), J. Koza et al., eds. 240–248. Morgan Kaufmann, San Francisco CA.

Montana, D. J. 1995. Strongly Typed Genetic Programming. Evolutionary Computation 3:2, 199–230. The MIT Press, Cambridge MA.

Qureshi, A. 1996. Evolving Agents. In Proceedings of the First Annual Conference on Genetic Programming (GP-96), J. R. Koza et al., eds. 369–374. The MIT Press, Cambridge MA.

Raik, S. and B. Durnota. 1994. The Evolution of Sporting Strategies. In Complex Systems: Mechanisms of Adaptation, R. J. Stonier and X. H. Yu, eds. 85–92. IOS Press, Amsterdam.

Zongker, D. and B. Punch. 1995. lil-gp 1.0 User's Manual. Available at http://isl.cps.msu.edu/GA/software/lil-gp