UNIT 4 Notes
GENETIC ALGORITHMS
Genetic Algorithms (GAs) are adaptive heuristic search algorithms that belong
to the larger class of evolutionary algorithms and mimic the process of
natural selection and genetics. They are an intelligent exploitation of random
search, using historical data to direct the search into regions of better
performance in the solution space. They are commonly used to generate high-
quality solutions to optimization and search problems.
Genetic algorithms simulate the process of natural selection: species that can
adapt to changes in their environment are able to survive, reproduce, and pass
on to the next generation. In simple words, they simulate "survival of the
fittest" among individuals of consecutive generations to solve a problem.
Each generation consists of a population of individuals, and each individual
represents a point in the search space and a possible solution. Each individual
is represented as a string of characters/integers/floats/bits. This string is
analogous to the chromosome.
Key terms
Herbert Spencer first used the phrase, after reading Charles Darwin's On the
Origin of Species, in his Principles of Biology (1864), in which he drew parallels
between his own economic theories and Darwin's biological ones. In biology,
the definition of survival of the fittest is this, “a natural process resulting in
the evolution of organisms best adapted to the environment”.
FITNESS COMPUTATIONS
In most cases the fitness function and the objective function are the same, as
the objective is either to maximize or to minimize the given objective
function. However, for more complex problems with multiple objectives and
constraints, an algorithm designer might choose a different fitness
function.
In some cases, calculating the fitness function directly might not be possible
due to the inherent complexities of the problem at hand. In such cases, we
do fitness approximation to suit our needs.
As an example, consider the fitness calculation for a solution of the 0/1
Knapsack problem. A simple fitness function just sums the profit values of
the items being picked (those with a 1), scanning the elements from left to
right till the knapsack is full.
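The fitness function described above can be sketched as follows. This is a minimal illustration, not the only possible choice; in particular, stopping the scan once an item would overflow the knapsack is the simple repair strategy the text describes.

```python
def knapsack_fitness(chromosome, profits, weights, capacity):
    """0/1 Knapsack fitness: sum the profits of picked items (genes equal
    to 1), scanning left to right and stopping once the knapsack is full."""
    total_profit = 0
    total_weight = 0
    for gene, profit, weight in zip(chromosome, profits, weights):
        if gene == 1:
            if total_weight + weight > capacity:
                break  # knapsack is full; remaining items are ignored
            total_profit += profit
            total_weight += weight
    return total_profit
```

For instance, with profits [10, 5, 20, 15], weights [2, 3, 4, 5], and capacity 7, the chromosome [1, 0, 1, 1] picks the first and third items (total weight 6) and then stops, giving a fitness of 30.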
PARENT SELECTION (REPRODUCTION)
However, care should be taken to prevent one extremely fit solution from
taking over the entire population in a few generations, as this leads to the
solutions being close to one another in the solution space thereby leading to
a loss of diversity. Maintaining good diversity in the population is extremely
crucial for the success of a GA. This taking up of the entire population by one
extremely fit solution is known as premature convergence and is an
undesirable condition in a GA.
Once the initial generation is created, the algorithm evolves the generation
using the following operators –
SELECTION OPERATOR:
The idea is to give preference to the individuals with good fitness scores and
allow them to pass their genes to successive generations.
In Fitness Proportionate (Roulette Wheel) Selection, every individual gets a
slice of a circular wheel proportional to its fitness. A fitter individual has a
greater slice of the wheel and therefore a greater chance of landing in front
of the fixed point when the wheel is rotated. Therefore, the probability of
choosing an individual depends directly on its fitness.
Implementation-wise, the steps are as follows −
Calculate S = the sum of all fitnesses.
Generate a random number r between 0 and S.
Starting from the top of the population, keep adding the fitnesses to a
partial sum P, till P < r.
The individual for which P exceeds r is the chosen individual.
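The steps above can be sketched as a small function. This is one straightforward implementation of roulette wheel selection; the final fallback return simply guards against floating-point rounding.

```python
import random

def roulette_wheel_select(population, fitnesses):
    """Fitness proportionate (roulette wheel) selection: draw r in [0, S]
    and walk the population, accumulating fitness into a partial sum P
    until P exceeds r; that individual is the one chosen."""
    total = sum(fitnesses)            # S = the sum of all fitnesses
    r = random.uniform(0, total)      # random number between 0 and S
    partial = 0.0
    for individual, fitness in zip(population, fitnesses):
        partial += fitness
        if partial >= r:
            return individual
    return population[-1]             # guard against rounding error
```

Note that this assumes non-negative fitness values; with negative fitnesses the slice sizes lose their meaning, which is one reason rank selection (below) exists.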
Rank Selection also works with negative fitness values and is mostly used when
the individuals in the population have very close fitness values (this usually
happens at the end of the run). In that situation, fitness proportionate
selection gives each individual an almost equal share of the pie, and hence
every individual, regardless of how fit it is relative to the others, has
approximately the same probability of being selected as a parent. This in turn
leads to a loss of selection pressure towards fitter individuals, causing the
GA to make poor parent selections in such situations.
Chromosome   Fitness   Rank
A            8.1       1
B            8.0       4
C            8.05      2
D            7.95      6
E            8.02      3
F            7.99      5
Random Selection
In this strategy we randomly select parents from the existing population. There
is no selection pressure towards fitter individuals and therefore this strategy is
usually avoided.
CROSSOVER OPERATOR
The crossover operator is analogous to reproduction and biological crossover.
In this, more than one parent is selected and one or more offspring are
produced using the genetic material of the parents. Crossover is usually
applied in a GA with a high probability – pc.
This represents mating between individuals. Two individuals are selected using
selection operator and crossover sites are chosen randomly. Then the genes at
these crossover sites are exchanged thus creating a completely new individual
(offspring).
• A crossover operator is used to recombine two strings to get a better
string.
• In reproduction, good strings in a population are probabilistically assigned
a larger number of copies
The two strings participating in the crossover operation are known as parent
strings and the resulting strings are known as children strings. The crossover
operator is mainly responsible for the search for new strings.
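The single-point variant described above (one randomly chosen crossover site, segments exchanged beyond it) can be sketched as:

```python
import random

def single_point_crossover(parent1, parent2):
    """Single-point crossover: pick a random crossover site, then exchange
    the gene segments beyond it to produce two offspring."""
    assert len(parent1) == len(parent2)
    site = random.randint(1, len(parent1) - 1)  # crossover site
    child1 = parent1[:site] + parent2[site:]
    child2 = parent2[:site] + parent1[site:]
    return child1, child2
```

At every position, the two children together carry exactly the genes the two parents carried at that position; only their pairing changes.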
Uniform Crossover
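In uniform crossover, instead of exchanging whole segments, each gene is treated independently: with some probability (typically 0.5) the gene is swapped between the two parents. A minimal sketch, assuming a swap probability parameter:

```python
import random

def uniform_crossover(parent1, parent2, swap_prob=0.5):
    """Uniform crossover: each gene is swapped between the parents
    independently with probability swap_prob (0.5 is the usual choice)."""
    child1, child2 = list(parent1), list(parent2)
    for i in range(len(child1)):
        if random.random() < swap_prob:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2
```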
MUTATION OPERATOR
In simple terms, mutation may be defined as a small random tweak in the
chromosome, to get a new solution. It is used to maintain and introduce
diversity in the genetic population and is usually applied with a low probability
– pm. If the probability is very high, the GA gets reduced to a random search.
Mutation is the part of the GA which is related to the “exploration” of the search
space. It has been observed that mutation is essential to the convergence of the
GA while crossover is not.
The key idea is to insert random genes in offspring to maintain the diversity in
the population to avoid premature convergence.
For example, in bit flip mutation for binary encodings, one or more random
bits are selected and flipped.
Random Resetting
Random Resetting is an extension of the bit flip for the integer representation.
In this, a random value from the set of permissible values is assigned to a
randomly chosen gene.
Swap Mutation
In swap mutation, commonly used for permutation encodings, two positions on
the chromosome are selected at random and their values are interchanged.
Inversion Mutation
In inversion mutation, a subset of consecutive genes is selected and the
entire string in that subset is inverted (reversed).
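The mutation operators named above can be sketched as follows; the pm values are illustrative defaults, not prescribed by the notes.

```python
import random

def bit_flip_mutation(chromosome, pm=0.01):
    """Bit flip: each bit is flipped independently with probability pm."""
    return [1 - g if random.random() < pm else g for g in chromosome]

def random_resetting(chromosome, allele_set, pm=0.01):
    """Random resetting: a gene is replaced, with probability pm, by a
    random value from the set of permissible values."""
    return [random.choice(allele_set) if random.random() < pm else g
            for g in chromosome]

def swap_mutation(chromosome):
    """Swap: interchange the values at two randomly chosen positions."""
    c = list(chromosome)
    i, j = random.sample(range(len(c)), 2)
    c[i], c[j] = c[j], c[i]
    return c

def inversion_mutation(chromosome):
    """Inversion: reverse the genes between two randomly chosen points."""
    c = list(chromosome)
    i, j = sorted(random.sample(range(len(c)), 2))
    c[i:j + 1] = reversed(c[i:j + 1])
    return c
```

Swap and inversion preserve the multiset of genes, which is why they suit permutation encodings (e.g. tours in the travelling salesman problem), whereas bit flip and random resetting suit binary and integer encodings respectively.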
SURVIVOR SELECTION
The Survivor Selection Policy determines which individuals are to be kicked out
and which are to be kept in the next generation. It is crucial as it should ensure
that the fitter individuals are not kicked out of the population, while at the same
time diversity should be maintained in the population.
Some GAs employ Elitism. In simple terms, it means the current fittest member
of the population is always propagated to the next generation. Therefore, under
no circumstance can the fittest member of the current population be replaced.
The easiest policy is to kick random members out of the population, but such an
approach frequently has convergence issues, therefore the following strategies
are widely used.
In Age-Based Selection, fitness is not considered; each individual is allowed
to stay in the population for a fixed number of generations. For instance,
suppose the age is the number of generations for which an individual has been
in the population. The oldest members of the population are kicked out, and
the ages of the rest of the members are incremented by one.
In Fitness-Based Selection, the children tend to replace the least fit
individuals in the population. The selection of the least fit individuals may
be done using a variation of any of the selection policies described before –
tournament selection, fitness proportionate selection, etc. Note that when two
individuals have the same fitness value, the decision of which one to remove
from the population is arbitrary.
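A simple form of fitness-based survivor selection, where the children directly displace the least fit members, can be sketched as (function and variable names are illustrative):

```python
def replace_least_fit(population, fitnesses, children, child_fitnesses):
    """Fitness-based survivor selection: children replace the least fit
    members of the population (ties are broken arbitrarily by sort order)."""
    # pair individuals with their fitness and sort fittest first
    ranked = sorted(zip(population, fitnesses),
                    key=lambda pair: pair[1], reverse=True)
    # drop as many of the least fit as there are children
    survivors = ranked[:len(ranked) - len(children)]
    new_pop = [ind for ind, _ in survivors] + list(children)
    new_fit = [f for _, f in survivors] + list(child_fitnesses)
    return new_pop, new_fit
```

Adding elitism on top of this is trivial: the fittest member is always in the survivor set, so it can never be displaced.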
TERMINATION CONDITION
The termination condition of a Genetic Algorithm is important in determining
when a GA run will end. It has been observed that initially, the GA progresses
very fast with better solutions coming in every few iterations, but this tends to
saturate in the later stages where the improvements are very small. We usually
want a termination condition such that our solution is close to the optimal, at
the end of the run.
Usually, we keep one of the following termination conditions −
• When there has been no improvement in the population for X iterations.
• When we reach an absolute number of generations.
• When the objective function value has reached a certain pre-defined value.
For example, in a genetic algorithm we keep a counter which keeps track of the
generations for which there has been no improvement in the population.
Initially, we set this counter to zero. Each time we don’t generate off-springs
which are better than the individuals in the population, we increment the
counter.
However, if the fitness of any of the off-springs is better, then we reset the
counter to zero. The algorithm terminates when the counter reaches a
predetermined value.
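The stagnation-counter logic described above can be sketched as a GA main loop. The names evaluate and make_offspring are assumed problem-specific callbacks, not part of any particular library:

```python
def run_ga(init_population, evaluate, make_offspring, max_stagnation=50):
    """GA main loop that terminates when no improvement has been seen for
    max_stagnation consecutive generations (illustrative sketch)."""
    population = init_population
    best = max(evaluate(ind) for ind in population)
    stagnation = 0                       # generations without improvement
    while stagnation < max_stagnation:
        population = make_offspring(population)
        current_best = max(evaluate(ind) for ind in population)
        if current_best > best:          # an offspring improved on the best
            best = current_best
            stagnation = 0               # reset the counter
        else:
            stagnation += 1              # no improvement this generation
    return best
```

In practice this condition is often combined with an absolute generation cap, so a run cannot continue indefinitely even if small improvements keep trickling in.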
The parameters pc and pm have a big impact on the performance of the
genetic algorithm. Although a direct relationship to a gradient-based optimizer
cannot be drawn, the crossover operation results in a local exploration of the
target function, since only a combination of existing individuals is created;
the parameter values are reused. The mutation operation, on the other hand,
introduces (totally new) parameter values into the genome which have never
been used before. Thereby, the algorithm effectively 'escapes' local
extrema. There is no general rule on how to set these parameters.
Rank Method
Rank Selection sorts the population first according to fitness value and ranks
them. Then every chromosome is allocated selection probability with respect to
its rank. Individuals are selected as per their selection probability. Rank selection
is an explorative technique of selection.
Rank selection first ranks the population and then every chromosome receives
a new fitness based on this ranking: the worst chromosome has fitness 1, the
second-worst 2, and so on, with the best having fitness N (the number of
chromosomes in the population). After this, all the chromosomes have a chance
to be selected.
But this method can point to slower convergence because the best
chromosomes do not differ so much from other ones.
Rank selection not only offers a way of controlling the bias toward the best
chromosome, but also eliminates implicit biases, introduced by unfortunate
choices of the measurement scale, that might otherwise do harm.
To select a candidate by rank method,
o sort the n individuals by quality
o let the probability of selecting the i-th candidate, given that the first
i-1 candidates have not been selected, be p, except for the final
candidate, which is selected if no previous candidate has been
selected.
o Select a candidate using the computed probabilities.
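The linear rank-based scheme described earlier (worst gets rank 1, best gets rank N, selection probability proportional to rank) can be sketched as follows; this is one common variant, not the only way to assign rank-based probabilities:

```python
import random

def rank_select(population, fitnesses):
    """Rank selection: sort by fitness, assign ranks 1 (worst) to N (best),
    then select proportionally to rank rather than to raw fitness."""
    ranked = sorted(zip(population, fitnesses), key=lambda p: p[1])
    n = len(ranked)
    ranks = range(1, n + 1)              # 1 = worst, n = best
    total = n * (n + 1) // 2             # sum of all ranks
    r = random.uniform(0, total)
    partial = 0
    for (individual, _), rank in zip(ranked, ranks):
        partial += rank
        if partial >= r:
            return individual
    return ranked[-1][0]                 # guard against rounding error
```

Because only ranks matter, this works even with negative or tightly clustered fitness values, which is exactly the situation where fitness proportionate selection loses its selection pressure.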
Advantages of GA
• Genetic algorithms can handle integer programming or mixed-integer
programming problems effectively.
• Genetic algorithms can also optimize discontinuous objective functions,
without requiring gradient information about the objective
function.
• It is suitable for parallel implementations.
• A genetic algorithm starts with a population of initial solutions created at
random. Out of all these initial solutions, if at least one solution falls in the
global basin, there is a good chance for the GA to reach the global optimal
solution. It is important to mention that initial population may not contain
any solution lying in the global basin. In that case, if the GA-operators
(namely crossover, mutation) can push at least one solution into the
global basin, there is a chance that the GA will find the global optimal
solution. Thus, the chance of its solutions for being trapped into the local
minima is less.
• The same GA with a little bit of modification in the string can solve a
variety of problems. Thus, it is a versatile optimization tool.
Disadvantages of GA