Machine Learning: Algorithms and Applications
Floriano Zini
Evolutionary computing
These slides are mainly taken from A.E. Eiben and J.E. Smith,
Introduction to Evolutionary Computing
19/03/12
Environment <-> Problem
Individual <-> Candidate solution
Fitness <-> Quality
Motivations for EC
Developing, analyzing, and applying problem-solving methods (a.k.a. algorithms) is a central theme in mathematics and computer science
A British bank evolved a creditability model to predict the loan-paying behavior of new applicants
Fitness: model accuracy on historical data
Classify mushrooms as edible or not edible
Fitness: classification accuracy on a training set of edible and not edible mushrooms
EC metaphor
[Figure labels: Termination, Offspring, Survivor selection]
EA scheme in pseudo-code
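A minimal Python sketch of a generic EA loop may help make the scheme concrete. Everything specific here is an assumption chosen for illustration and is not taken from the slides: the toy `one_max` fitness (count of 1s), binary tournament parent selection, one-point crossover, bit-flip mutation, keeping the best of parents plus offspring, and a fixed generation budget as the termination condition.

```python
import random


def one_max(bits):
    """Toy fitness (illustrative assumption): the number of 1s in the string."""
    return sum(bits)


def run_ea(length=20, pop_size=20, generations=60, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    # INITIALISE the population with random candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # TERMINATION: fixed generation budget
        # SELECT parents: binary tournament (fitter of two random individuals).
        parents = [max(rng.sample(population, 2), key=one_max)
                   for _ in range(pop_size)]
        offspring = []
        for mum, dad in zip(parents[::2], parents[1::2]):
            # RECOMBINE: one-point crossover at a random cut.
            cut = rng.randrange(1, length)
            for child in (mum[:cut] + dad[cut:], dad[:cut] + mum[cut:]):
                # MUTATE: flip each bit independently with probability p_mut.
                offspring.append([b ^ 1 if rng.random() < p_mut else b
                                  for b in child])
        # SELECT survivors: keep the best pop_size of parents + offspring.
        population = sorted(population + offspring,
                            key=one_max, reverse=True)[:pop_size]
    return max(population, key=one_max)


best = run_ea()
print(best, one_max(best))
```

Swapping any of the components (selection, variation, survivor selection) leaves the overall loop unchanged, which is the point of the general scheme.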
Main EA components
Representation
Population
Evaluation
Representation
Phenotype space -> Genotype space: encoding (representation)
Genotype space -> Phenotype space: decoding (inverse representation)
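As an illustration of encoding and decoding, the sketch below assumes a phenotype space of non-negative integers and a genotype space of fixed-length bit strings. This pairing and the function names `encode` and `decode` are assumptions for the example, not something the slide specifies.

```python
def encode(x, n_bits):
    """Phenotype (non-negative integer) -> genotype (list of bits, MSB first)."""
    if not 0 <= x < 2 ** n_bits:
        raise ValueError("x does not fit in n_bits")
    return [(x >> i) & 1 for i in reversed(range(n_bits))]


def decode(bits):
    """Genotype (list of bits, MSB first) -> phenotype (integer)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


genotype = encode(13, 5)
print(genotype, decode(genotype))
```

Decoding should undo encoding for every phenotype, so `decode(encode(x, n)) == x` is a useful sanity check for any representation.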
Population
Role: holds the candidate solutions of the problem as individuals (genotypes)
Formally, a population is a multiset of individuals, i.e., repetitions are possible
The population is the basic unit of evolution, i.e., the population is evolving, not the individuals
Selection operators act on the population level
Variation operators act on the individual level
Selection
Role:
  Identifies individuals to become parents and individuals to survive
  Pushes the population towards higher fitness
Usually probabilistic:
  high-quality solutions are more likely to be selected than low-quality ones, but this is not guaranteed
  even the worst individual in the current population usually has a non-zero probability of being selected
This stochastic nature can aid escape from local optima
Example:
fitness(A) = 3, selection probability 3/6 = 50%
fitness(B) = 1, selection probability 1/6 = 17%
fitness(C) = 2, selection probability 2/6 = 33%
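The three-individual example can be reproduced with a short sketch of fitness-proportionate selection. The function names are hypothetical, and the code assumes a non-empty population with strictly positive fitness values.

```python
import random


def selection_probabilities(fitness):
    """Map each individual to fitness / total fitness."""
    total = sum(fitness.values())
    return {name: f / total for name, f in fitness.items()}


def roulette_select(fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    spin = rng.uniform(0, sum(fitness.values()))
    running = 0.0
    for name, f in fitness.items():
        running += f
        if spin <= running:
            return name
    return name  # guard against floating-point round-off


fitness = {"A": 3, "B": 1, "C": 2}
print(selection_probabilities(fitness))
print(roulette_select(fitness, random.Random()))
```

Note that B, the worst individual, still has a non-zero chance of being picked on every spin.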
Survivor selection
A.k.a. replacement
Most EAs use a fixed population size, so they need a way of going from (parents + offspring) to the next generation
Often deterministic (while parent selection is usually stochastic)
  Fitness based: e.g., rank parents + offspring and take the best
  Age based: make as many offspring as parents and delete all parents
Sometimes a combination of stochastic and deterministic (elitism)
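The two deterministic replacement schemes can be sketched in a few lines. The function names are hypothetical, and the elitist variant shows only one common reading of elitism (the best parent is kept if no offspring matches it), which is an assumption rather than the slide's definition.

```python
def fitness_based(parents, offspring, fitness, size):
    """Rank parents + offspring and keep the best `size` individuals."""
    return sorted(parents + offspring, key=fitness, reverse=True)[:size]


def age_based(parents, offspring):
    """Delete all parents; the offspring form the next generation."""
    return list(offspring)


def age_based_with_elitism(parents, offspring, fitness):
    """Age-based replacement, except the best parent is never lost (assumed variant)."""
    best_parent = max(parents, key=fitness)
    survivors = list(offspring)
    if fitness(best_parent) > max(fitness(child) for child in survivors):
        # Overwrite the worst offspring with the best parent.
        worst = min(range(len(survivors)), key=lambda i: fitness(survivors[i]))
        survivors[worst] = best_parent
    return survivors


identity = lambda x: x  # individuals are plain numbers in this toy example
print(fitness_based([5, 1, 3], [4, 2, 6], identity, size=3))
print(age_based([5, 1, 3], [4, 2, 6]))
print(age_based_with_elitism([9, 1, 3], [4, 2, 6], identity))
```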
Variation operators
Role: to generate new candidate solutions
Usually divided into two types according to their arity (number of inputs):
  Arity 1: mutation operators
  Arity > 1: recombination operators
    Arity = 2 is typically called crossover
    Arity > 2 is formally possible but seldom used in EC
There has been much debate about the relative importance of recombination and mutation
  Nowadays most EAs use both
  Variation operators must match the given representation
Mutation
Role: causes small, random variation
Acts on one genotype and delivers another
The element of randomness is essential and differentiates it from other unary heuristic operators
before: 1 1 1 1 1 1 1
after:  1 1 1 0 1 1 1
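A minimal sketch of mutation on a bit-string genotype follows. The per-bit mutation rate `p_mut` and both function names are assumptions for illustration; `flip_at` reproduces the before/after example deterministically, while `bit_flip_mutation` adds the random element.

```python
import random


def flip_at(genotype, index):
    """Return a copy of the genotype with the bit at `index` flipped."""
    child = list(genotype)
    child[index] ^= 1
    return child


def bit_flip_mutation(genotype, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [bit ^ 1 if rng.random() < p_mut else bit for bit in genotype]


before = [1, 1, 1, 1, 1, 1, 1]
print(flip_at(before, 3))
print(bit_flip_mutation(before, 1 / len(before), random.Random()))
```

A rate of about one expected flip per genotype (`1 / len(genotype)`) is a common starting point, though the slides do not prescribe a value.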
Recombination
Role: merges information from parents into offspring
Choice of what information to merge is stochastic
Most offspring may be worse than, or the same as, the parents
Hope is that some are better by combining elements of genotypes that lead to good traits
Parents (cut after the third gene):  1 1 1 | 1 1 1 1    0 0 0 | 0 0 0 0
Offspring:                            1 1 1 | 0 0 0 0    0 0 0 | 1 1 1 1
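The parents/offspring example corresponds to a single cut with the tails exchanged. A small sketch, with hypothetical function names and the assumption that both parents have the same length:

```python
import random


def one_point_crossover(parent1, parent2, cut):
    """Exchange the tails of two equal-length parents after position `cut`."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


def random_one_point_crossover(parent1, parent2, rng):
    """Choose the cut at random so that each child inherits from both parents."""
    cut = rng.randrange(1, len(parent1))
    return one_point_crossover(parent1, parent2, cut)


ones, zeros = [1] * 7, [0] * 7
print(one_point_crossover(ones, zeros, 3))
print(random_one_point_crossover(ones, zeros, random.Random()))
```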
Initialisation / Termination
Initialisation is usually done at random
  Need to ensure an even spread and mixture of possible allele values
  Can include existing solutions, or use problem-specific heuristics, to “seed” the population
Phenotype: a board configuration
Penalty of one queen: the number of queens she can check
Penalty of a configuration: the sum of the penalties of all queens
Fitness of a configuration: inverse penalty, to be maximized
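The penalty and fitness definitions can be sketched as follows. Several choices are assumptions for illustration: the board is stored as a list where `board[c]` is the row of the queen in column `c`, blocking by intermediate queens is ignored (every queen on a shared row or diagonal counts), and "inverse penalty" is taken as `1 / (1 + penalty)` to avoid dividing by zero.

```python
def queen_penalty(board, col):
    """Number of other queens sharing a row or diagonal with the queen in `col`."""
    row = board[col]
    return sum(
        1
        for c, r in enumerate(board)
        if c != col and (r == row or abs(r - row) == abs(c - col))
    )


def configuration_penalty(board):
    """Sum of the penalties of all queens (each attacking pair is counted twice)."""
    return sum(queen_penalty(board, c) for c in range(len(board)))


def fitness(board):
    """Assumed form of 'inverse penalty': 1 for a solution, smaller otherwise."""
    return 1.0 / (1.0 + configuration_penalty(board))


solution = [1, 5, 8, 6, 3, 7, 2, 4]  # a known 8-queens solution
print(configuration_penalty(solution), fitness(solution))
```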
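Mutation (swap two positions): 1 3 5 2 6 4 7 8 -> 1 3 7 2 6 4 5 8
Recombination (cut after the third position, fill the rest from the other parent):
Parents:   1 3 5 | 2 6 4 7 8    8 7 6 | 5 4 3 2 1
Offspring: 1 3 5 | 4 2 8 7 6    8 7 6 | 2 4 1 3 5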
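The permutation examples above are consistent with a swap mutation and a "cut-and-crossfill" recombination, where each child keeps the head of one parent and is completed with the missing values in the order they appear in the other parent, scanning from just after the cut and wrapping around. The sketch below is written under that reading; the function names are hypothetical.

```python
def swap_mutation(perm, i, j):
    """Return a copy of the permutation with positions i and j exchanged."""
    child = list(perm)
    child[i], child[j] = child[j], child[i]
    return child


def cut_and_crossfill(parent1, parent2, cut):
    """Keep each parent's head, then fill with unused values from the other parent."""

    def make_child(head_parent, fill_parent):
        child = list(head_parent[:cut])
        n = len(fill_parent)
        # Scan the other parent from just after the cut, wrapping around.
        for k in range(n):
            gene = fill_parent[(cut + k) % n]
            if gene not in child:
                child.append(gene)
        return child

    return make_child(parent1, parent2), make_child(parent2, parent1)


p1 = [1, 3, 5, 2, 6, 4, 7, 8]
p2 = [8, 7, 6, 5, 4, 3, 2, 1]
print(swap_mutation(p1, 2, 6))
print(cut_and_crossfill(p1, p2, 3))
```

Both operators return valid permutations, so the offspring never place two queens in the same row or column.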
Typical behavior of an EA
Stages in optimizing on a 1-dimensional fitness landscape
Early stage: quasi-random population distribution
Mid-stage: population arranged around/on hills
Late stage: population concentrated on high hills
[Figure axis: Time (number of generations)]
• Answer: it depends.
• Possibly good, if good solutions/methods exist
• Care is needed
[Figure: Performance of methods on problems. Labels: Evolutionary algorithm, Random search, EA 1, EA 2, EA 3, EA 4]
Genetic Algorithms
GA Quick Overview
Representation
Example: Fitness(A) = 3 (3/6 = 50%), Fitness(B) = 2 (2/6 = 33%), Fitness(C) = 1 (1/6 = 17%)
x^2 example: selection
x^2 example: crossover
x^2 example: mutation
The simple GA
Has been the subject of many (early) studies
  still often used as a benchmark for novel GAs
Shows many shortcomings, e.g.,
  Representation is too restrictive
  Mutation & crossover operators are only applicable to bit-string & integer representations
  Selection mechanism is sensitive to converging populations with close fitness values
  Generational population model (step 5 in the SGA reproduction cycle) can be improved with explicit survivor selection
n-point crossover
Choose n random crossover points
Split along those points
Glue parts, alternating between parents
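The three steps (choose points, split, glue alternating segments) can be sketched as below. The function names are hypothetical, and the code assumes equal-length parents and cut points given as sorted positions strictly inside the string.

```python
import random


def n_point_crossover(parent1, parent2, points):
    """Split both parents at the sorted cut `points` and glue alternating segments."""
    child1, child2 = [], []
    source1, source2 = parent1, parent2
    previous = 0
    for cut in list(points) + [len(parent1)]:
        child1.extend(source1[previous:cut])
        child2.extend(source2[previous:cut])
        source1, source2 = source2, source1  # alternate between parents
        previous = cut
    return child1, child2


def random_n_point_crossover(parent1, parent2, n, rng):
    """Choose n distinct random crossover points, then recombine."""
    points = sorted(rng.sample(range(1, len(parent1)), n))
    return n_point_crossover(parent1, parent2, points)


ones, zeros = [1] * 8, [0] * 8
print(n_point_crossover(ones, zeros, [2, 5]))
print(random_n_point_crossover(ones, zeros, 3, random.Random()))
```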
Uniform crossover
Assign 'heads' to one parent, 'tails' to the other
Flip a coin for each gene of the first child
Make an inverse copy of the gene for the second child
Inheritance is independent of position
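A short sketch of uniform crossover under the coin-flip description above. The function name is hypothetical, and a fair coin (probability 0.5) is assumed.

```python
import random


def uniform_crossover(parent1, parent2, rng):
    """Flip a fair coin per gene; the second child gets the inverse choice."""
    child1, child2 = [], []
    for gene1, gene2 in zip(parent1, parent2):
        if rng.random() < 0.5:  # heads: first child inherits from parent1
            child1.append(gene1)
            child2.append(gene2)
        else:                   # tails: first child inherits from parent2
            child1.append(gene2)
            child2.append(gene1)
    return child1, child2


ones, zeros = [1] * 8, [0] * 8
print(uniform_crossover(ones, zeros, random.Random()))
```

Because every gene is decided by its own coin flip, whether two genes are inherited together does not depend on how far apart they sit in the string, unlike n-point crossover.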
Crossover OR mutation?
Decade-long debate: which one is better, which one is necessary, and which one is the main operator as opposed to a background one