
UNIT-IV

GENETIC ALGORITHMS

Survival of the Fittest – Fitness Computations – Crossover – Mutation – Reproduction – Rank Method – Rank Space Method.

Genetic Algorithms (GAs) are adaptive heuristic search algorithms that belong to the larger class of evolutionary algorithms and mimic the process of natural selection and genetics. They are an intelligent exploitation of random search, provided with historical data, to direct the search into regions of better performance in the solution space. They are commonly used to generate high-quality solutions for optimization and search problems.

Genetic algorithms simulate the process of natural selection, which means that those species that can adapt to changes in their environment are able to survive, reproduce and pass on to the next generation. In simple words, they simulate "survival of the fittest" among individuals of consecutive generations to solve a problem. Each generation consists of a population of individuals, and each individual represents a point in the search space and a possible solution. Each individual is represented as a string of characters/integers/floats/bits. This string is analogous to the chromosome.

Search space

The population of individuals is maintained within the search space. Each individual represents a solution in the search space for the given problem. Each individual is coded as a finite-length vector (analogous to a chromosome) of components. These variable components are analogous to genes. Thus, a chromosome (individual) is composed of several genes (variable components).

Key terms

• Individual - Any possible solution


• Population - Group of all individuals
• Search Space - All possible solutions to the problem
• Chromosome - Blueprint for an individual
• Trait - Possible aspect (features) of an individual
• Allele - Possible settings of trait (black, blond, etc.)
• Locus - The position of a gene on the chromosome
• Genome - Collection of all chromosomes for an individual
Representations:

• Representation – Binary strings
• Recombination – N-point or uniform crossover
• Mutation – Bitwise bit-flipping with fixed probability
• Parent selection – Fitness-proportionate
• Survivor selection – All children replace parents
• Speciality – Emphasis on crossover

"Survival of the fittest" is a phrase that originated from Darwinian


evolutionary theory as a way of describing the mechanism of natural
selection. The biological concept of fitness is defined as reproductive success.
In Darwinian terms, the phrase is best understood as "Survival of the form
that will leave the most copies of itself in successive generations."

Herbert Spencer first used the phrase, after reading Charles Darwin's On the
Origin of Species, in his Principles of Biology (1864), in which he drew parallels
between his own economic theories and Darwin's biological ones. In biology,
the definition of survival of the fittest is this, “a natural process resulting in
the evolution of organisms best adapted to the environment”.

FITNESS COMPUTATIONS

The fitness function, simply defined, is a function which takes a candidate solution to the problem as input and produces as output how "fit" or how "good" the solution is with respect to the problem under consideration.

In most cases the fitness function and the objective function are the same as
the objective is to either maximize or minimize the given objective function.
However, for more complex problems with multiple objectives and
constraints, an Algorithm Designer might choose to have a different fitness
function.

A fitness function should possess the following characteristics −

• The fitness function should be sufficiently fast to compute.

• It must quantitatively measure how fit a given solution is, or how fit individuals can be produced from the given solution.

In some cases, calculating the fitness function directly might not be possible
due to the inherent complexities of the problem at hand. In such cases, we
do fitness approximation to suit our needs.

As an example, consider the fitness calculation for a solution of the 0/1 Knapsack problem. A simple fitness function just sums the profit values of the items being picked (those whose gene is 1), scanning the elements from left to right till the knapsack is full.
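
A minimal Python sketch of such a fitness function, assuming a binary-encoded chromosome and illustrative profit, weight and capacity inputs:

def knapsack_fitness(chromosome, profits, weights, capacity):
    # Sum the profits of picked items (genes equal to 1), scanning left to
    # right, and stop once adding an item would exceed the knapsack capacity.
    total_profit, total_weight = 0, 0
    for gene, profit, weight in zip(chromosome, profits, weights):
        if gene == 1:
            if total_weight + weight > capacity:
                break                      # knapsack is full
            total_profit += profit
            total_weight += weight
    return total_profit

# Example: knapsack_fitness([1, 0, 1, 1, 0], [6, 5, 3, 7, 2], [4, 3, 2, 5, 1], 10)
# picks items 0 and 2 (item 3 would overflow the capacity), giving a fitness of 9.
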
Reproduction:

The reproduction operation is also referred to as the selection operator. This operator is basically used to select the good strings from the population based on their fitness information. Reproduction is an artificial version of natural selection, a Darwinian survival of the fittest among strings competing with each other.
It is a process in which individual strings are copied according to their objective
function values, f (fitness function). Intuitively a fitness function may be a
measure of profit, utility or goodness that is desired to be maximized.
Copying strings according to their fitness values means that strings with a
higher value have a higher probability of contributing one or more offspring in
the next generation.
Considering the principle of natural selection, fitness is determined by the creature's ability to survive predators, pestilence and other obstacles to adulthood and subsequent reproduction. There are a number of reproduction operators in the GA literature, but the essential idea in all of them is that above-average chromosomes are picked from the current population and multiple copies of them are inserted in the mating pool in a probabilistic manner. The various schemes for selecting chromosomes for the mating pool are:
• Roulette-wheel selection or Proportionate selection
• Rank selection
• Tournament selection
• Elitism

PARENT SELECTION

Parent Selection is the process of selecting the parents which mate and recombine to create off-springs for the next generation. Parent selection is very crucial to the convergence rate of the GA, as good parents drive the population towards better and fitter solutions.

However, care should be taken to prevent one extremely fit solution from
taking over the entire population in a few generations, as this leads to the
solutions being close to one another in the solution space thereby leading to
a loss of diversity. Maintaining good diversity in the population is extremely
crucial for the success of a GA. This taking up of the entire population by one
extremely fit solution is known as premature convergence and is an
undesirable condition in a GA.

OPERATORS OF GENETIC ALGORITHMS

Once the initial generation is created, the algorithm evolves the generation using the following operators:

SELECTION OPERATOR:
The idea is to give preference to the individuals with good fitness scores and
allow them to pass their genes to successive generations.

• A string is selected with a probability proportional to its fitness.
• Thus, the i-th string in the population is selected with a probability proportional to Fi.
• The probability of selecting the i-th string is given as:

  pi = Fi / (F1 + F2 + ... + Fn)

where n is the population size.
The Roulette wheel selection

• The roulette wheel is spun n times.
• Each time, an instance of the string chosen by the roulette-wheel pointer is selected.
• This roulette-wheel mechanism is expected to place Fi / F̄ copies of the i-th string in the mating pool.

The average fitness of the population is

  F̄ = (F1 + F2 + ... + Fn) / n

so the expected count of the i-th string in the mating pool is Fi / F̄.

In a roulette wheel selection, the circular wheel is divided as described. A fixed


point is chosen on the wheel circumference as shown and the wheel is rotated.
The region of the wheel which comes in front of the fixed point is chosen as the
parent. For the second parent, the same process is repeated.
Fig: Roulette Wheel Selection

Roulette wheel selection, written as a runnable Python sketch (assuming non-negative fitness values; the helper names are illustrative):

import random

def roulette_wheel_select(population, fitness_values):
    # The total fitness defines the circumference of the wheel.
    total = sum(fitness_values)
    # Spin the wheel: pick a random point on the circumference.
    pick = random.uniform(0, total)
    running_sum = 0.0
    for individual, fitness in zip(population, fitness_values):
        running_sum += fitness           # width of this individual's slice
        if running_sum >= pick:          # the pointer landed in this slice
            return individual
    return population[-1]                # guard against floating-point round-off

def build_mating_pool(population, fitness_values):
    # The wheel is spun n times, once for each slot in the mating pool.
    return [roulette_wheel_select(population, fitness_values)
            for _ in range(len(population))]

It is clear that a fitter individual has a greater pie on the wheel and therefore a
greater chance of landing in front of the fixed point when the wheel is rotated.
Therefore, the probability of choosing an individual depends directly on its
fitness.
Implementation-wise, the steps are:
 Calculate S = the sum of all fitnesses.
 Generate a random number R between 0 and S.
 Starting from the top of the population, keep adding the fitnesses to the partial sum P, till P ≥ R.
 The individual for which P first reaches R is the chosen individual.

The Stochastic Universal Sampling (SUS)

Stochastic Universal Sampling is quite similar to roulette wheel selection; however, instead of having just one fixed point, we have multiple equally spaced pointers on the wheel. Therefore, all the parents are chosen in just one spin of the wheel. Such a setup also encourages the highly fit individuals to be chosen at least once.
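
A minimal Python sketch of SUS, under the same assumptions as the roulette wheel sketch above (non-negative fitness values; num_parents pointers spaced evenly around the wheel):

import random

def stochastic_universal_sampling(population, fitness_values, num_parents):
    total = sum(fitness_values)
    spacing = total / num_parents             # distance between pointers
    start = random.uniform(0, spacing)        # a single random spin
    pointers = [start + i * spacing for i in range(num_parents)]
    parents, running_sum, index = [], 0.0, 0
    for pointer in pointers:
        # Advance around the wheel until we reach the slice holding this pointer.
        while running_sum + fitness_values[index] < pointer:
            running_sum += fitness_values[index]
            index += 1
        parents.append(population[index])
    return parents
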

The Tournament Selection

In K-Way tournament selection, we select K individuals from the population at random and select the best out of these to become a parent. The same process is repeated for selecting the next parent. Tournament Selection is also extremely popular in the literature as it can even work with negative fitness values.
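
A Python sketch of K-way tournament selection (assuming higher fitness is better; K is a tunable parameter and the default of 3 below is only illustrative):

import random

def tournament_select(population, fitness_values, k=3):
    # Pick k distinct individuals at random and return the fittest of them.
    contenders = random.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness_values[i])
    return population[best]
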
The Rank Selection

Rank Selection also works with negative fitness values and is mostly used when the individuals in the population have very close fitness values (this usually happens at the end of the run). Close fitness values lead to each individual having an almost equal share of the pie (as in fitness-proportionate selection), and hence each individual, no matter how fit relative to the others, has approximately the same probability of being selected as a parent. This in turn leads to a loss of selection pressure towards fitter individuals, making the GA make poor parent selections in such situations.

In this, we remove the concept of a fitness value while selecting a parent.


However, every individual in the population is ranked according to their fitness.
The selection of the parents depends on the rank of each individual and not the
fitness. The higher ranked individuals are preferred more than the lower ranked
ones.

Chromosome    Fitness Value    Rank
A             8.1              1
B             8.0              4
C             8.05             2
D             7.95             6
E             8.02             3
F             7.99             5
The Random Selection

In this strategy we randomly select parents from the existing population. There
is no selection pressure towards fitter individuals and therefore this strategy is
usually avoided.

CROSSOVER OPERATOR
The crossover operator is analogous to reproduction and biological crossover. In
this more than one parent is selected and one or more off-springs are produced
using the genetic material of the parents. Crossover is usually applied in a GA
with a high probability – pc .
This represents mating between individuals. Two individuals are selected using
selection operator and crossover sites are chosen randomly. Then the genes at
these crossover sites are exchanged thus creating a completely new individual
(offspring).
• A crossover operator is used to recombine two strings to get a better
string.
• In reproduction, good strings in a population are probabilistically assigned
a larger number of copies

The two strings participating in the crossover operation are known as parent
strings and the resulting strings are known as children strings. A crossover
operator is mainly responsible for the search of new strings

Types of cross over

• 1-pt cross over

• 2-pt cross over

• N-pt cross over

1-point crossover with crossover point 3:

             Before crossover       After crossover
String 1#    101|11                 101|01
String 2#    100|01                 100|11

2-point crossover with crossover points 3 and 6:

             Before crossover       After crossover
String 1#    101|101|000            101|010|000
String 2#    100|010|011            100|101|011
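
A Python sketch of single-point and two-point crossover on list-encoded chromosomes; here the crossover points are drawn at random rather than fixed at 3 and 6:

import random

def one_point_crossover(parent1, parent2):
    point = random.randint(1, len(parent1) - 1)     # cut between two genes
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def two_point_crossover(parent1, parent2):
    a, b = sorted(random.sample(range(1, len(parent1)), 2))
    # Swap the middle segment between the two cut points.
    child1 = parent1[:a] + parent2[a:b] + parent1[b:]
    child2 = parent2[:a] + parent1[a:b] + parent2[b:]
    return child1, child2
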

Uniform Crossover

In a uniform crossover, we don't divide the chromosome into segments; rather, we treat each gene separately. In this, we essentially flip a coin for each gene to decide whether it will be inherited from the first or the second parent. We can also bias the coin towards one parent, to have more genetic material in the child from that parent.
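
A sketch of uniform crossover with an optional bias towards the first parent (bias = 0.5 gives the unbiased coin flip):

import random

def uniform_crossover(parent1, parent2, bias=0.5):
    child = []
    for g1, g2 in zip(parent1, parent2):
        # Flip a (possibly biased) coin for each gene.
        child.append(g1 if random.random() < bias else g2)
    return child
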

MUTATION OPERATOR
In simple terms, mutation may be defined as a small random tweak in the
chromosome, to get a new solution. It is used to maintain and introduce
diversity in the genetic population and is usually applied with a low probability
– pm. If the probability is very high, the GA gets reduced to a random search.
Mutation is the part of the GA which is related to the “exploration” of the search
space. It has been observed that mutation is essential to the convergence of the
GA while crossover is not.
The key idea is to insert random genes in offspring to maintain the diversity in
the population to avoid premature convergence.
For example –

• Mutation adds new information in a random way to the genetic search


• Helps to avoid getting trapped at local optima
• It is a process of randomly disturbing genetic information
• Operates at the bit level
• Each bit may become mutated with a probability pm

Bit Flip Mutation


In this bit flip mutation, we select one or more random bits and flip them. This
is used for binary encoded GAs.
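
A sketch of bit-flip mutation for a binary-encoded chromosome, flipping each bit independently with probability pm:

import random

def bit_flip_mutation(chromosome, pm=0.01):
    # Flip each bit (0 -> 1, 1 -> 0) independently with probability pm.
    return [1 - bit if random.random() < pm else bit for bit in chromosome]
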

Random Resetting

Random Resetting is an extension of the bit flip for the integer representation.
In this, a random value from the set of permissible values is assigned to a
randomly chosen gene.

Swap Mutation

In swap mutation, we select two positions on the chromosome at random and interchange the values. This is common in permutation-based encodings.
Scramble Mutation

Scramble mutation is also popular with permutation representations. In this, from the entire chromosome, a subset of genes is chosen and their values are scrambled or shuffled randomly.

Inversion Mutation

In inversion mutation, we select a subset of genes as in scramble mutation, but instead of shuffling the subset, we merely invert the entire string in the subset.
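
Sketches of the three permutation-oriented mutations just described (swap, scramble and inversion), operating on list-encoded chromosomes:

import random

def swap_mutation(chromosome):
    i, j = random.sample(range(len(chromosome)), 2)
    chromosome[i], chromosome[j] = chromosome[j], chromosome[i]
    return chromosome

def scramble_mutation(chromosome):
    i, j = sorted(random.sample(range(len(chromosome)), 2))
    segment = chromosome[i:j + 1]
    random.shuffle(segment)                 # shuffle the chosen subset
    return chromosome[:i] + segment + chromosome[j + 1:]

def inversion_mutation(chromosome):
    i, j = sorted(random.sample(range(len(chromosome)), 2))
    # Reverse the chosen subset instead of shuffling it.
    return chromosome[:i] + chromosome[i:j + 1][::-1] + chromosome[j + 1:]
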

SURVIVOR SELECTION
The Survivor Selection Policy determines which individuals are to be kicked out
and which are to be kept in the next generation. It is crucial as it should ensure
that the fitter individuals are not kicked out of the population, while at the same
time diversity should be maintained in the population.

Some GAs employ Elitism. In simple terms, it means the current fittest member
of the population is always propagated to the next generation. Therefore, under
no circumstance can the fittest member of the current population be replaced.

The easiest policy is to kick random members out of the population, but such an
approach frequently has convergence issues, therefore the following strategies
are widely used.

Age Based Selection

In Age-Based Selection, we don't have a notion of fitness. It is based on the premise that each individual is allowed in the population for a finite number of generations, during which it is allowed to reproduce; after that, it is kicked out of the population no matter how good its fitness is.

For instance, suppose the age of an individual is the number of generations for which it has been in the population. The oldest members of the population, say P4 and P7, are kicked out of the population and the ages of the rest of the members are incremented by one.

Fitness Based Selection

In this fitness-based selection, the children tend to replace the least fit
individuals in the population. The selection of the least fit individuals may be
done using a variation of any of the selection policies described before –
tournament selection, fitness proportionate selection, etc.

For example, suppose the children replace the least fit individuals P1 and P10 of the population. Note that if two individuals, say P1 and P9, have the same fitness value, the decision about which of them to remove from the population is arbitrary.
TERMINATION CONDITION
The termination condition of a Genetic Algorithm is important in determining
when a GA run will end. It has been observed that initially, the GA progresses
very fast with better solutions coming in every few iterations, but this tends to
saturate in the later stages where the improvements are very small. We usually
want a termination condition such that our solution is close to the optimal, at
the end of the run.
Usually, we keep one of the following termination conditions −
• When there has been no improvement in the population for X iterations.
• When we reach an absolute number of generations.
• When the objective function value has reached a certain pre-defined value.

For example, in a genetic algorithm we keep a counter which keeps track of the
generations for which there has been no improvement in the population.
Initially, we set this counter to zero. Each time we don’t generate off-springs
which are better than the individuals in the population, we increment the
counter.

However, if the fitness of any of the off-springs is better, then we reset the counter to zero. The algorithm terminates when the counter reaches a predetermined value.
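
A schematic Python loop showing this no-improvement counter (the evolve_one_generation and best_fitness helpers are hypothetical placeholders, not part of these notes):

def run_ga(population, max_stall_generations=50):
    stall_counter = 0
    best_so_far = best_fitness(population)                 # hypothetical helper
    while stall_counter < max_stall_generations:
        population = evolve_one_generation(population)     # hypothetical helper
        current_best = best_fitness(population)
        if current_best > best_so_far:
            best_so_far = current_best
            stall_counter = 0        # improvement: reset the counter
        else:
            stall_counter += 1       # no improvement this generation
    return population
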

REPRODUCTION IN GENETIC ALGORITHMS


The reproduction process allows the genetic information stored in strings with good fitness to survive into the next generation of artificial strings. Each string in the population is assigned a value reflecting its aptitude on the objective function, and this value determines the probability of the string being chosen as a parent when a new generation is produced.

During reproduction, changes in the chromosome are expected naturally due to the process called "crossover", in which chromosomes from the parents get exchanged randomly. The chromosomes developed in the offspring will show some traits of both parents. In rare cases the chromosomes get miscopied during replication, resulting in an offspring with little resemblance to the parents. To make the "mutation" process clear, consider a case from genetics in which a parent chromosome A-C-G-C-T produces an offspring A-C-T-C-T due to a natural copying mistake.

Reproduction is controlled by mutation and crossover operators. Crossover defines the procedure for generating a child from two parents. Before the actual crossover is performed, the parents need to be selected.

Several selection schemes are possible:


1. The best individuals of every generation are selected.
2. An individual is selected based on its fitness relative to the other
individuals. The better the fitness, the more likely this individual gets
chosen. The probability p for an individual with fitness S to get chosen
among N individuals is p = S / (S1 + S2 + ... + SN). This is called the
ROULETTE-WHEEL selection scheme.
3. Two individuals are selected using (2), the one with the better fitness is
finally chosen.
4. Individuals are chosen randomly. Each individual has the same chance of
being chosen independent of its fitness.
5. Individuals are chosen based on a two-stage stochastic selection. In the
first stage a temporary population is computed in which an individual
from the base population may occur several times corresponding to the
integer part of its expected fitness (i.e. the absolute fitness divided by
the average fitness). The fraction of the expected fitness is used to give
the individual another chance for being represented in the temporary
population.
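
A sketch of the two-stage scheme in (5), sometimes called remainder stochastic sampling; the fractional part of each expected count is treated here as the probability of one extra copy, which is an assumption consistent with the description above:

import random

def stochastic_remainder_pool(population, fitness_values):
    n = len(population)
    average = sum(fitness_values) / n
    pool = []
    for individual, f in zip(population, fitness_values):
        expected = f / average                    # expected number of copies
        copies = int(expected)                    # integer part: guaranteed copies
        pool.extend([individual] * copies)
        if random.random() < expected - copies:   # fractional part: one more chance
            pool.append(individual)
    return pool
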
After individuals were chosen for mating, one of the crossovers takes place.
Finally, mutation is performed on the newly introduced off-springs (right after
the crossover). It introduces new genetic material into a population by replacing
one parameter in a genome by a random value within the allowed range. The
parameter Pmut of the genetic algorithm thereby controls the mutation
probability.

The parameters PCROSS and Pmut have a big impact on the performance of the
genetic algorithm. Although a direct relationship to a gradient based optimizer
cannot be drawn, the crossover operation results in a local exploration of the
target function, since only a combination of other individuals is created. The
parameter values are reused. The mutation operation on the other hand
introduces (totally new) parameter values into the genome which have never
been used before. Thereby, the algorithm can effectively 'escape' local extrema. There is no general rule on how to set those parameters.
Rank Method
Rank Selection sorts the population first according to fitness value and ranks
them. Then every chromosome is allocated selection probability with respect to
its rank. Individuals are selected as per their selection probability. Rank selection
is an explorative technique of selection.

Rank selection first ranks the population, and then every chromosome receives a fitness from this ranking. The worst individual gets fitness 1, the second worst gets 2, and so on, with the best getting fitness N (the number of chromosomes in the population). After this, all the chromosomes have a chance of being selected.

 Rank-based selection schemes can avoid premature convergence.

 But it can be computationally expensive, because the population has to be sorted by fitness.

 But this method can lead to slower convergence, because the best chromosomes do not differ so much from the other ones.

Rank selection not only offers a way of controlling the bias toward the best chromosome, but also eliminates implicit biases, introduced by unfortunate choices of the measurement scale, that might otherwise do harm.
To select a candidate by the rank method:
o sort the n individuals by quality
o let the probability of selecting the i-th candidate, given that the first
i-1 candidates have not been selected, be p, except for the final
candidate, which is selected if no previous candidate has been
selected.
o Select a candidate using the computed probabilities.
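
A Python sketch of this procedure, assuming a constant per-candidate probability p (the value 0.66 below is only an illustrative choice) and candidates already sorted from best to worst quality:

import random

def rank_method_select(candidates_sorted_by_quality, p=0.66):
    # Walk down the quality-sorted list: take each candidate with probability p;
    # the final candidate is taken if no earlier candidate was selected.
    for candidate in candidates_sorted_by_quality[:-1]:
        if random.random() < p:
            return candidate
    return candidates_sorted_by_quality[-1]
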

RANK SPACE METHOD


To select a candidate by the rank-space method:
 Sort the n individuals by quality.

 Sort the n individuals by the sum of their inverse squared distances to the already selected candidates.

 Use the rank method, but sort on the sum of the quality rank and the diversity rank, rather than on the quality rank alone.
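
A hedged Python sketch of these steps; the distance function is user-supplied (it is not specified in the notes), and p = 0.66 is again only an illustrative per-candidate probability:

import random

def diversity_score(candidate, already_selected, distance):
    # Sum of inverse squared distances to previously selected candidates;
    # a small value means the candidate is far from what we already picked.
    return sum(1.0 / (distance(candidate, s) ** 2 + 1e-12) for s in already_selected)

def rank_space_select(candidates, quality, already_selected, distance, p=0.66):
    n = len(candidates)
    # Quality rank: the best quality gets rank 0.
    by_quality = sorted(range(n), key=lambda i: quality(candidates[i]), reverse=True)
    quality_rank = {i: r for r, i in enumerate(by_quality)}
    # Diversity rank: the smallest inverse-squared-distance sum gets rank 0.
    by_diversity = sorted(range(n),
                          key=lambda i: diversity_score(candidates[i], already_selected, distance))
    diversity_rank = {i: r for r, i in enumerate(by_diversity)}
    # Order candidates by the sum of the two ranks, then apply the rank method:
    # take each candidate in turn with probability p, the last one by default.
    order = sorted(range(n), key=lambda i: quality_rank[i] + diversity_rank[i])
    for i in order[:-1]:
        if random.random() < p:
            return candidates[i]
    return candidates[order[-1]]
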

Advantages of GA
• Genetic algorithm can handle the integer programming or mixed-integer
programming problems effectively.
• Genetic algorithm can optimize discontinuous objective functions also
without the requirement of the gradient information of objective
function.
• It is suitable for parallel implementations.
• A genetic algorithm starts with a population of initial solutions created at
random. Out of all these initial solutions, if at least one solution falls in the
global basin, there is a good chance for the GA to reach the global optimal
solution. It is important to mention that initial population may not contain
any solution lying in the global basin. In that case, if the GA-operators
(namely crossover, mutation) can push at least one solution into the
global basin, there is a chance that the GA will find the global optimal
solution. Thus, the chance of its solutions for being trapped into the local
minima is less.
• The same GA with a little bit of modification in the string can solve a
variety of problems. Thus, it is a versatile optimization tool.

Disadvantages of GA

• Genetic algorithms are computationally expensive and consequently have a slow convergence rate.
• A genetic algorithm works like a black-box optimization tool.
• There is no mathematical convergence proof of the GA till today.
• A user must have proper knowledge of how to select an appropriate set of GA parameters.

Example: selection for the function x²

Fitness: fi = f(xi) = xi³ + 1
Selection probability: Probi = fi / ∑ fj
Expected count: fi / f̄ (where f̄ is the average fitness)

(The worked selection, crossover and mutation tables of this example are not reproduced here.)
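
A small sketch computing these selection probabilities and expected counts for an illustrative population (the x values below are made up for demonstration):

xs = [2, 5, 9, 12]                                # illustrative decoded x values
fitness = [x ** 3 + 1 for x in xs]                # fi = f(x) = x^3 + 1
total = sum(fitness)
average = total / len(fitness)
probs = [f / total for f in fitness]              # Probi = fi / sum(fj)
expected_counts = [f / average for f in fitness]  # expected count = fi / average fitness
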

Application of Genetic Algorithms


i. Optimization.
ii. Automatic Programming.
iii. Machine and Robot Learning.
iv. Economic Models.
v. Immune System Models.
vi. Ecological Models.
vii. Population Genetics Models.
viii. Models of Social Systems.
ix. Filtering and Signal Processing.
x. Recurrent Neural Network.
