
SWARM AND EVOLUTIONARY COMPUTING
Lecture Notes
Valluri Daneesha

1. Contents
Chapter 1 - Introduction to Evolutionary Computing
 1. Background
 2. Optimization
 3. Global Optimization
 4. Evolutionary Computing
 5. Evolutionary Algorithm
 6. Components of evolutionary algorithms
 7. General Evolutionary Algorithm
 8. Evolution Strategies
 9. Learning Classifier Systems
 10. Parameter Control
 11. Multimodal Problems
Chapter 2 - Swarm Intelligence
 1. Introduction to Swarm Intelligence
 2. Applications of Swarm Intelligence in Optimization Problems
 3. Particle swarm optimization (PSO)


Chapter 1 - Introduction to Evolutionary Computing


1. Background
Black-box Model
 The classification of problems is based on a black box model of computer systems.
 Informally, we can think of any computer-based system as follows: The system
initially sits, awaiting some input from either a person, a sensor, or another computer.
When input is provided, the system processes that input through some computational
model, whose details are not specified in general (hence the name black box).
 The purpose of this model is to represent some aspects of the world relevant to the
particular application.
For instance, the model could be a formula that calculates the total route length from a
list of consecutive locations, a statistical tool estimating the likelihood of rain given
some meteorological input data, or a mapping from real-time data regarding a car’s speed
to the level of acceleration necessary to approach some prespecified target speed.
1.1. Simulation Problem
We know the system model and some inputs, and need to compute the outputs
corresponding to these inputs

An example is that of a weather forecast system. In this case, the inputs are the
meteorological data regarding temperature, wind, humidity, etc. The output can be a
prediction of rainfall.
1.2. Modelling or system identification problem
 Corresponding sets of inputs and outputs are known, and a model of the system is
sought that delivers the correct output for each known input.
 These problems occur frequently in data mining and machine learning.

1.3. Optimization problem


 The model is known, together with the desired output (or a description of the desired
output), and the task is to find the input(s) leading to this output.


 Let us consider the travelling salesman problem. In the abstract version, we are given a
set of cities and have to find the shortest tour which visits each city exactly once. For
a given instance of this problem, we have a formula (the model) that for each given
sequence of cities (the inputs) will compute the length of the tour (the output). The
problem is to find an input with a desired output, that is, a sequence of cities with
optimal (minimal) length. Note that in this example the desired output is defined
implicitly. That is, rather than specifying the exact length, it is required that the tour should
be shorter than all others, and we are looking for inputs realising this.

2. Optimization
 Optimization involves finding the best solution from all possible solutions for a given
problem.
 In mathematical terms, an optimization problem aims to maximize or minimize an
objective function subject to certain constraints.
Minimize (or Maximize) f(x)
Subject to: g_i(x) ≤ 0, i = 1, 2, …, m
            h_j(x) = 0, j = 1, 2, …, p
Where:
f(x) is the objective function.
g_i(x) are the inequality constraints.
h_j(x) are the equality constraints.
x is the vector of decision variables.
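To make this form concrete, here is a minimal Python sketch (assuming numpy and scipy are available; the toy objective and constraint are invented for illustration):

import numpy as np
from scipy.optimize import minimize

# Toy problem: minimize f(x) = x1^2 + x2^2 subject to g(x) = 1 - x1 - x2 <= 0.
f = lambda x: x[0] ** 2 + x[1] ** 2

# SciPy's 'ineq' constraints require fun(x) >= 0, so g(x) <= 0 is rewritten as -g(x) >= 0.
cons = [{"type": "ineq", "fun": lambda x: x[0] + x[1] - 1.0}]

res = minimize(f, x0=np.array([0.0, 0.0]), method="SLSQP", constraints=cons)
print(res.x)  # approaches [0.5, 0.5], the constrained minimizer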

2.1. Optimization Versus Constraint Satisfaction


 “objective functions to be optimized and constraints to be satisfied”
 In general, we can consider an objective function to be some way of assigning a value
to a possible solution that reflects its quality on a scale, whereas a constraint
represents a binary evaluation telling us whether a given requirement holds or not.
 Eg: In the Travelling Salesman Problem -
o Objective Function: the length of a tour visiting each city in a given set
exactly once (to be minimised).
o Constraints:
 Each city is to be visited exactly once.
 City X is visited after city Y.

2.2. Applications of optimization methods


 Machine Learning: Optimization is used to minimize the error function in training
algorithms, such as in neural networks.


 Robotics: Path planning and motion control for robots.


 Operations Research: Resource allocation and scheduling problems.
 Data Science: Parameter tuning in algorithms, feature selection.
 Computer Vision: Image and video analysis, object detection, and recognition.
 Natural Language Processing: Model training, sequence alignment, and machine
translation.

3. Global Optimization
3.1.Local Maximum and Minimum

3.2. Global Maximum and Minimum

3.3. Local Optimization


Local optimization focuses on finding the best solution within a small region of the feasible
space, but it may not be the global best.
x∗=argmin f(x)
Subject to: x∈N(x0)
Where:
N(x0) is a neighborhood around the point x0.
x∗ is the local minimizer.


Methods for local optimization:


 Gradient Descent: An iterative method that moves in the direction of the steepest
descent of the objective function.
 Newton’s Method: Uses the second-order derivative (Hessian) to find the minimum
of a function.
 Conjugate Gradient Method: Combines the ideas of steepest descent and Newton’s
method for optimizing large-scale problems.
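As a minimal sketch of the first of these methods, gradient descent (the toy objective, learning rate, and step count are arbitrary choices):

# Gradient descent on f(x) = (x - 3)^2; its gradient is 2 * (x - 3).
def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # step in the direction of steepest descent
    return x

print(gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0))  # converges toward 3.0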

3.4. Global Optimization


Global optimization seeks the absolute best solution across the entire feasible region of the
problem. It deals with the optimization of one or more, possibly conflicting,
criteria. These criteria are expressed as a set of mathematical functions F = {f1, f2, . . . , fn},
the so-called objective functions. The result of the optimization process is the set of inputs for
which these objective functions return optimal values.
x∗=argmin f(x)
Subject to: x∈S
Where:
x∗ is the global minimizer.
S is the feasible set.
Methods for global optimization:
 Simulated Annealing: A probabilistic technique that explores the solution space by
accepting worse solutions with a probability that decreases over time.
 Genetic Algorithms: Based on natural selection, these algorithms evolve a
population of solutions over time.
 Particle Swarm Optimization: A population-based stochastic optimization technique
inspired by social behavior of birds flocking or fish schooling.
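A hedged sketch of the first of these, simulated annealing, on an invented one-dimensional function (the neighbourhood width, temperature, and cooling rate are illustrative):

import math, random

def simulated_annealing(f, x0, temp=1.0, cooling=0.995, steps=5000):
    x, fx = x0, f(x0)
    for _ in range(steps):
        cand = x + random.gauss(0.0, 0.5)  # random neighbour of the current point
        fc = f(cand)
        # Always accept improvements; accept worse moves with prob. exp(-delta/T).
        if fc < fx or random.random() < math.exp((fx - fc) / temp):
            x, fx = cand, fc
        temp *= cooling  # acceptance of worse moves becomes rarer over time
    return x

# Toy multimodal objective; the global minimum lies near the origin.
print(simulated_annealing(lambda x: x * x + 3.0 * math.sin(5.0 * x), x0=4.0))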


3.5. Categories of optimization methods

4. Evolutionary Computing
 Evolutionary Computing (EC) is a branch of artificial intelligence that draws inspiration
from the process of natural evolution to solve complex optimization problems.
 The motivation behind EC is rooted in the limitations of traditional optimization
techniques, especially when dealing with complex, high-dimensional, and nonlinear
problems.

4.1. Positioning of EC
 EC is part of computer science.
 It is not part of life science/biology; it only derives inspiration and terminology from biology.
 Path: Computer Science → Artificial Intelligence → Computational Intelligence → EC


4.2. Darwin’s theory of evolution


Part 1 : Natural Selection and Survival of the Fittest
 In a world with limited resources and stable populations, each individual competes
with others for survival. Those individuals with the “best” characteristics (traits) are
more likely to survive and to reproduce, and those characteristics will be passed on to
their offspring. These desirable characteristics are inherited by the following
generations, and (over time) become dominant among the population.
 Natural selection favours those individuals that compete for the given resources most
effectively, in other words, those that are adapted or fit to the environmental
conditions best. This phenomenon is also known as survival of the fittest.
Part 2 : Diversity drives change
 A second part of Darwin’s theory states that, during production of a child organism,
random events cause random changes to the child organism’s characteristics. If these
new characteristics are a benefit to the organism, then the chances of survival for that
organism are increased.


Darwin Evolution: SUMMARY

Evolutionary computation (EC) refers to computer-based problem solving systems that use
computational models of evolutionary processes, such as natural selection, survival of the
fittest and reproduction, as the fundamental components of such computational systems.

4.3. Evolutionary Computing Metaphor


4.4. Historical Background


 The origins of evolutionary computing can be traced back to the 1950s and 1960s.

5. Evolutionary Algorithm
 Evolution via natural selection of a randomly chosen population of individuals can be
thought of as a search through the space of possible chromosome values.
 In that sense, an evolutionary algorithm (EA) is a stochastic search for an optimal
solution to a given problem.

5.1. Processes in Evolution Theory and Their Use in Evolutionary Computing


 The processes of natural evolution provide the foundational mechanisms for
evolutionary computing algorithms. Key processes include:
 Selection:
o Natural Evolution: In nature, individuals that are better adapted to their
environment have a higher chance of surviving and reproducing. This is known
as "survival of the fittest."
o Evolutionary Computing: In algorithms, selection mechanisms choose the
most promising individuals (solutions) from the current population to create
offspring for the next generation. This ensures that better solutions are
propagated.
 Crossover (Recombination):
o Natural Evolution: Genetic recombination occurs during reproduction, where
offspring inherit genetic material from both parents, leading to new
combinations of traits.
o Evolutionary Computing: Crossover operators combine the genetic material
of two parent solutions to produce new offspring. This process helps explore
new regions of the solution space.
 Mutation:
o Natural Evolution: Mutation introduces random changes to an individual's
genetic material, providing genetic diversity and the potential for new traits.


o Evolutionary Computing: Mutation operators introduce random changes to
individual solutions, ensuring diversity in the population and enabling the
exploration of new potential solutions.
 Fitness Evaluation:
o Natural Evolution: The fitness of an individual is determined by its ability to
survive and reproduce in its environment.
o Evolutionary Computing: A fitness function evaluates the quality of each
solution, guiding the selection process by ranking individuals based on their
performance.

6. Components of evolutionary algorithms


The evolutionary search process is influenced by the following main components of an EA:
 an encoding of solutions to the problem as a chromosome;
 a function to evaluate the fitness, or survival strength of individuals;
 initialization of the population;
 selection operators; and
 reproduction operators.

6.1. Representation of Solutions - The Chromosome

An evolutionary algorithm utilizes a population of individuals, where each individual
represents a candidate solution to the problem. The characteristics of an individual are
represented by a chromosome, or genome. The concept is inspired by biological genetics,
where a chromosome is a string of DNA that contains the information necessary to create an
organism. Similarly, in EAs, a chromosome contains information to generate a potential
solution to the problem.
The characteristics represented by a chromosome can be divided into two classes of
evolutionary information: genotypes and phenotypes.

 Genotype vs. Phenotype: A genotype describes the genetic composition of an
individual as inherited from its parents. Genotypes provide a mechanism to store
experiential evidence as gathered by parents. A phenotype is the expressed behavioural
traits of an individual in a specific environment.

Explanation with example of Genotype vs. Phenotype

“In nature, every creature has a body and a set of behaviours. They are known as
its phenotype, which is the way a creature appears and behaves. Hair, eyes and skin
colour are all part of your phenotype. The look of a creature is mostly determined by a
set of instructions written in its own cells. This is called the genotype, and it refers to the
information that is carried, mostly in our DNA. While the phenotype is how a house
looks once it is built, the genotype is the original blueprint that drove
its construction. The first reason why we keep phenotype and genotype separate is
simple: the environment plays a huge role in the final look of a creature. The same
house, built with a different budget, could look very different.

The environment is, in most cases, what determines the success of a genotype.
In biological terms, surviving long enough is succeeding. Successful creatures have a
higher chance of reproducing, transmitting their genotype to their offspring. The
information to design the body and the behaviours of each creature is stored in its DNA.
Every time a creature reproduces, its DNA is copied and transmitted to its offspring.
Random mutations can occur during the duplication process, introducing changes in the
DNA. These changes can potentially result in phenotypical changes.”


So, in a problem of optimizing a function, the genotype may be a binary string
representing potential inputs, and the phenotype is the actual input after decoding.

 A complex relationship can exist between the genotype and phenotype. Two such
relationships are:

Pleiotropy and Polygeny:

o Pleiotropy, where random modification of genes causes unexpected variations
in the phenotypic traits.
One gene affects multiple traits. Example: Changing wing length in an aircraft
design affects lift, drag, and stability.
In optimization, this could imply that a small change in the chromosome leads
to changes in multiple aspects of the solution.
o Polygeny (or Polygenic Inheritance), where several genes interact to produce
a specific phenotypic trait. To change this behavioural characteristic, all the
associated genes need to change.
Multiple genes influence a single trait. Example: Multiple parameters in a
machine learning model collectively determine prediction accuracy.
In optimization, it means several parts of the chromosome influence one aspect
of the solution.

Each chromosome represents a point in search space. A chromosome consists of a number of
genes, where the gene is the functional unit of inheritance. Each gene represents one
characteristic of the individual, with the value of each gene referred to as an allele. In terms of
optimization, a gene represents one parameter of the optimization problem.

A very important step in the design of an EA is to find an appropriate chromosome
representation. The efficiency and complexity of a search algorithm greatly depend on the
representation scheme. While classical optimization techniques usually use vectors of real
numbers, different EAs use different representation schemes. For example, genetic algorithms
(GAs) mostly use a binary string representation, where the binary values may represent Boolean
values, integers or even discretized real numbers; genetic programming (GP) makes use of a
tree representation to represent programs; and evolutionary programming (EP) uses real-valued
variables.

Examples:

 In genetic algorithms, the chromosome could be a binary string like 101010, where
each bit represents a decision variable.


 In genetic programming, the chromosome is a tree structure representing a
mathematical expression.
 Evolutionary programming (EP) uses real-valued variables to represent the
chromosome.
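A small sketch of this genotype-to-phenotype decoding for a binary chromosome (the 6-bit string and the target interval [0, 1] are illustrative choices):

# Genotype: a binary string; phenotype: the real-valued input it decodes to.
genotype = "101010"
as_int = int(genotype, 2)             # 42
max_int = 2 ** len(genotype) - 1      # 63
phenotype = as_int / max_int          # decoded input in [0, 1]
print(phenotype)                      # about 0.667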

6.2. Fitness Function

The fitness function is crucial because it evaluates how good a potential solution is. It's akin to
the concept of "survival of the fittest" in nature, where only the fittest organisms reproduce.

 Fitness Calculation: The fitness function maps the chromosome to a scalar value that
quantifies its quality as a solution. This value is then used for selection, reproduction,
and other operations.

 Constraints: Fitness functions often include penalties for violating constraints. For
example, in a scheduling problem, a solution that violates time constraints might have
its fitness reduced.

Examples:

 Maximization Problem: In a function maximization problem, the fitness function
could simply be the value of the function at the point represented by the chromosome.
 Minimization with Constraints: In a resource allocation problem, the fitness function
might involve minimizing costs while penalizing solutions that exceed budget limits.
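A hedged sketch of the second example (the cost model, budget, and penalty weight are invented):

# Fitness for a minimization problem: total cost plus a penalty for overspending.
def fitness(costs, budget=100.0, penalty_weight=10.0):
    total = sum(costs)
    violation = max(0.0, total - budget)       # amount the budget is exceeded by
    return total + penalty_weight * violation  # lower values are better

print(fitness([30.0, 40.0, 50.0]))  # 120 + 10 * 20 = 320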

6.3. Initial Population

The initial population is the starting point for the evolutionary process. Its diversity affects the
algorithm's ability to explore the search space.

 Random Initialization: Usually, the initial population is generated randomly to cover
a wide range of possible solutions. The goal of random selection is to ensure that the
initial population is a uniform representation of the entire search space. If prior
knowledge about the search space and problem is available, heuristics can be used to
bias the initial population toward potentially good solutions. However, this approach to
population initialization leads to opportunistic EAs. Not all of the elements of the search
space have a chance to be selected, which may result in premature convergence of the
population to a local optimum.
 Population Size: The size of the population influences both the exploration of the
search space and the computational cost. A larger population provides better coverage
but requires more computational resources.

Small vs. Large Population: In a traveling salesman problem, a small population
might lead to faster but less accurate solutions, while a large population might provide
more accurate solutions at a higher computational cost.


6.4. Selection Operators

Selection operators determine which individuals get to reproduce. Each generation of an EA
produces a new generation of individuals, representing a set of new potential solutions to the
optimization problem. The new generation is formed through application of three operators:
cross-over, mutation and elitism. The aim of the selection operator is to emphasize better
solutions in a population.

 In the case of cross-over, "superior" individuals should have more opportunities to
reproduce. In doing so, the offspring contains combinations of the genetic material of
the best individuals. The next generation is therefore strongly influenced by the genes
of the fitter individuals.
 In the case of mutation, fitness values can be used to select only those individuals with
the lowest fitness values to be mutated. The idea is that the most fit individuals should
not be distorted through application of mutation - thereby ensuring that the good
characteristics of the fit individuals persevere.
 Elitism is an operator that copies a set of the best individuals to the next generation,
hence ensuring that the maximum fitness value does not decrease from one generation
to the next. Selection operators are used to select these elitist individuals.

Several selection techniques exist, divided into two classes:


 Explicit fitness remapping, where the fitness value of each individual is mapped
into a new range, e.g. normalization to the range [0, 1]. The mapped value is then
used for selection.
 Implicit fitness remapping, where the actual fitness values of individuals are used
for selection.

A summary of the most frequently used selection operators is given in the subsections below.

 Random Selection: Individuals are selected randomly with no reference to fitness at
all. All the individuals, good or bad, have an equal chance of being selected.
 Proportional Selection: The chance of an individual being selected is proportional to its
fitness value. A probability distribution proportional to fitness is created, and
individuals are selected through sampling of the distribution.

That is, the probability of an individual being selected, e.g. to produce offspring, is
directly proportional to the fitness value of that individual. This may cause an individual
to dominate the production of offspring, thereby limiting diversity in the new
population. This can of course be prevented by limiting the number of offspring that a
single individual may produce.

In roulette wheel sampling the fitness values are normalized, usually by dividing each
fitness value by the maximum fitness value. The probability distribution can then be
thought of as a roulette wheel, where each slice has a width corresponding to the
selection probability of an individual. Selection can be visualized as the spinning of the
wheel and testing which slice ends up at the top.
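A minimal Python sketch of this sampling procedure (assuming strictly positive fitness values):

import random

def roulette_wheel(population, fitnesses):
    total = sum(fitnesses)
    r = random.uniform(0.0, total)  # spin the wheel
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit           # each slice's width equals the fitness
        if r <= cumulative:
            return individual
    return population[-1]           # guard against floating-point rounding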

 Tournament Selection: A subset of the population is chosen randomly, and the fittest
individual in this subset is selected. This method avoids some of the pitfalls of
proportional selection, such as premature convergence.

In tournament selection, a group of k individuals is randomly selected. These k
individuals then take part in a tournament, i.e. the individual with the best fitness is
selected. For cross-over, two tournaments are held: one to select each of the two parents.
It is therefore possible that (1) a parent can be selected to reproduce more than once,
and (2) that one individual can combine with itself to reproduce offspring.

The advantage of tournament selection is that the worse individuals of the population
will not be selected, and will therefore not contribute to the genetic construction of the
next generation, and the best individual will not dominate in the reproduction process.
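A minimal sketch of the procedure (the tournament size k = 3 is an arbitrary choice):

import random

def tournament_select(population, fitnesses, k=3):
    # Pick k entrants at random and return the fittest of them.
    entrants = random.sample(range(len(population)), k)
    best = max(entrants, key=lambda i: fitnesses[i])
    return population[best]

# Two independent tournaments select the two parents for crossover, so the
# same individual can win both and recombine with itself.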

 Rank-Based Selection: Instead of using fitness values directly, individuals are ranked,
and selection probabilities are assigned based on rank.

Rank-based selection uses the rank ordering of the fitness values to determine the
probability of selection, and not the fitness values themselves. This means that the selection
probability is independent of the actual fitness value. Ranking therefore has the
advantage that a highly fit individual will not dominate in the selection process as a
function of the magnitude of its fitness. One example of rank-based selection is non-
deterministic linear sampling, where individuals are sorted in decreasing order of fitness
(the first individual being the most fit) and the selection probability decreases with rank.


Nonlinear rank-based selection operators bias selection toward the best individuals, at the
cost of possible premature convergence.

 Elitism: Elitism involves the selection of a set of individuals from the current
generation to survive to the next generation. The number of individuals to survive to
the next generation, without being mutated, is referred to as the generation gap. If the
generation gap is zero, the new generation will consist entirely of new individuals. For
positive generation gaps, say k, k individuals survive to the next generation. These can
be
 the k best individuals, which will ensure that the maximum fitness value does
not decrease, or
 k individuals, selected using any of the previously discussed selection operators.

6.5. Reproduction Operators

Reproduction operators generate new solutions by combining or mutating existing ones,
simulating the natural processes of crossover and mutation in biological reproduction.

 Crossover: Combines genetic material from two parents to produce offspring. Various
methods exist, such as single-point crossover (where the chromosome is split at a
random point) and uniform crossover (where genes are randomly swapped between
parents).
 Mutation: Introduces random changes to the chromosome to explore new areas of the
search space. The aim of mutation is to introduce new genetic material into an existing
individual, thereby enlarging the search-space. Mutation usually occurs at a low
probability. A large mutation probability distorts the genetic structure of a chromosome
- the disadvantage being a loss of good genetic material in the case of highly fit
individuals.

Examples:

 Single-Point Crossover: In a binary string chromosome, a crossover might swap genes
between two parents at a single point, creating two new offspring.
 Mutation in Real-Valued Chromosomes: In a real-valued optimization problem,
mutation might involve adding a small random value to one of the genes.
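Minimal sketches of both examples (the mutation rate and step size are illustrative):

import random

def single_point_crossover(p1, p2):
    # Split both parents at one random point and swap the tails.
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def gaussian_mutation(x, sigma=0.1, rate=0.05):
    # With low probability, add small Gaussian noise to each real-valued gene.
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in x]

print(single_point_crossover("101010", "010101"))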

Reproduction operators are usually applied to produce the individuals for the next generation.
Reproduction can, however, be applied with replacement. That is, newly generated individuals
replace parents if the fitness of the offspring is better than the parents; if not, the offspring do
not survive to the next generation.


7. General Evolutionary Algorithm


The following pseudocode represents a general evolutionary algorithm. While this algorithm
includes all operator types, different EC paradigms use different operators.
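In hedged form, one common arrangement of that loop in Python (the select, crossover, and mutate callbacks are placeholders for the operators of a concrete EA):

def evolve(population, fitness, select, crossover, mutate,
           elite_count=1, generations=100):
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:elite_count]            # elitism: best survive intact
        while len(next_gen) < len(population):
            parent1, parent2 = select(ranked), select(ranked)
            child1, child2 = crossover(parent1, parent2)
            next_gen.extend([mutate(child1), mutate(child2)])
        population = next_gen[:len(population)]    # keep the population size fixed
    return max(population, key=fitness)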

Convergence is reached when, for example,

 the maximum number of generations is exceeded,
 an acceptable best-fit individual has evolved, or
 the average and/or maximum fitness value does not change significantly over the
past g generations.

8. Evolution Strategies
Evolution Strategies (ES) are a class of evolutionary algorithms that focus on optimizing real-
valued parameters. Originating in the 1960s, they were developed by Ingo Rechenberg and
Hans-Paul Schwefel at the Technical University of Berlin. The primary goal of ES is to
optimize continuous functions, especially for real-world engineering problems.
The core idea is to generate a population of solutions, apply random variations, and
then select the best solutions for the next generation.
Evolution Strategies involve several core concepts, including chromosome
representation, selection, mutation, and crossover operators. These components work together
to drive the evolutionary process and produce optimized solutions.

 Representation: Real-valued vectors.


 Mutation: Gaussian perturbation, where each gene (parameter) in the individual
(solution vector) is modified by adding a value drawn from a normal distribution.
 Selection: Deterministic, where the best individuals are chosen for the next generation.

In ES, each individual is represented as a vector x = (x₁, x₂, …, xₙ) in ℝⁿ. Mutation is applied
to each element of the vector using x′ᵢ = xᵢ + σ · N(0,1), where σ is the mutation step size and
N(0,1) is a normally distributed random variable with mean 0 and variance 1.


The fitness function f(x) evaluates the quality of the solution x. The goal is to maximize or
minimize f(x).

8.1. Chromosome Representation

In ES, each individual (solution) is represented by a chromosome that consists of two parts:
genetic material and strategy parameters. The genetic material represents the actual solution
to the problem, while the strategy parameters control the evolutionary process, particularly
mutation.

The general chromosome representation is: 𝐶𝑛 = (𝐺𝑛 , 𝑆𝑛 )

 𝐺𝑛 represents the genetic material (the solution itself).


 𝑆𝑛 represents the strategy parameters (e.g., mutation step sizes).

Types of Chromosome Representations:


1. Single Standard Deviation: Each individual is associated with a single standard
deviation σₙ, which controls the mutation step size: Cₙ = (Gₙ, σₙ). Here, Gₙ ∈ ℝˡ
(genetic vector) and σₙ ∈ ℝ⁺ (a positive scalar representing the mutation step size).
2. Multiple Standard Deviations with Rotational Angles: In this more advanced
representation, each genetic variable in the chromosome has its own standard deviation
σₙ, and rotation angles are used to define correlations among the genetic
variables: Cₙ = (Gₙ, σₙ, uₙ)
o Gₙ ∈ ℝˡ: Genetic variables.
o σₙ ∈ ℝˡ₊: Standard deviations for each genetic variable.
o uₙ ∈ ℝ^(l(l−1)/2): Rotation angles representing the covariance matrix.

Including rotation angles enables the algorithm to adjust the mutation steps more effectively,
improving the search trajectory alignment with the problem's landscape.

8.2. EVOLUTION STRATEGY OPERATORS

Evolution strategies use the three main operators of EC, namely selection, crossover and
mutation.

8.2.1. Selection operators

Selection operators in ES are responsible for choosing the individuals that will pass their
genes to the next generation. Two primary strategies are used:

 (𝜇 + 𝜆) Selection:
o 𝜇 parents produce 𝜆 offspring.
o The best 𝜇 individuals are selected from both the parent and offspring
populations.
o This strategy implements elitism, ensuring that the best solutions are always
carried forward.
 (𝜇, 𝜆) Selection:
o 𝜇 parents generate 𝜆 offspring.


o Only the best 𝜇 individuals are selected from the offspring, with no guarantee
that parents survive.
o This strategy promotes diversity by preventing the dominance of existing
solutions.
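The two schemes differ only in the pool from which survivors are drawn, as this minimal sketch illustrates (fitness is a supplied callback; a comma strategy requires λ ≥ μ):

def mu_plus_lambda(parents, offspring, fitness, mu):
    # Survivors are drawn from parents AND offspring (elitist).
    return sorted(parents + offspring, key=fitness, reverse=True)[:mu]

def mu_comma_lambda(parents, offspring, fitness, mu):
    # Survivors are drawn from the offspring only; parents always die out.
    return sorted(offspring, key=fitness, reverse=True)[:mu]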

8.2.2. Crossover (Recombination) operators

Crossover operators differ in the number of parents used to produce a single offspring.

 Local Crossover:
o Offspring are generated by combining randomly selected components from
two parents.
o This method maintains the local structure of parent genes while introducing
new combinations.
 Global Crossover:
o The entire population contributes to the creation of offspring.
o Components are randomly selected from different individuals, resulting in
greater diversity.

In both local and global crossover, recombination is done in one of two ways:

 Discrete Recombination:
o The offspring directly inherit genes from either parent.
o This method keeps the genes intact and straightforward.
 Intermediate Recombination:
o The offspring's genes are the average of the parents' genes, e.g., xᵢ = (xᵢ¹ + xᵢ²) / 2.
o This method is useful for real-valued optimization problems where
intermediate values often provide better solutions.

8.2.3. Mutation operators:


Mutation introduces random changes to both genetic material and strategy parameters,
allowing the algorithm to explore new areas of the search space. The mutation process in ES
is divided into two types based on whether or not correlation information is used.

Mutation without Correlation Information:

 Mutating Standard Deviation:

The mutation step size σ is adjusted based on the success rate of previous
mutations. The 1/5th success rule is often applied, where the step size increases if
more than 1/5 of the mutations are successful and decreases otherwise:

σ_{g+1} = σ_g · c_d   if s_g < 1/5
σ_{g+1} = σ_g · c_i   if s_g > 1/5
σ_{g+1} = σ_g         if s_g = 1/5

Here, s_g is the success rate, and c_d < 1, c_i > 1 are constants.


 Mutating Genetic Material:
o Genetic variables are altered by adding Gaussian noise: xᵢ → xᵢ + σ · N(0,1)

Mutation with Correlation Information:

When correlation information (e.g., rotational angles) is included, the mutation process
involves:

1. Rotational Angles Mutation:


o Adjusts angles that influence the direction of mutation, improving search
efficiency.
2. Standard Deviation Mutation:
o Similar to mutation without correlation but takes into account the rotated
search direction.
3. Genetic Material Mutation:
o Genetic variables are mutated using a rotation matrix and adjusted standard
deviations.

Each of these operators plays a critical role in ensuring that the population evolves effectively
towards optimal solutions.

8.3. Types of Evolution Strategies

i. (1+1) ES: A simple two-membered ES where one parent generates one offspring. The
offspring is accepted if it has better fitness than the parent. Otherwise, the parent is
retained. This is an elitist strategy.

 Selection: Deterministic; only the best individual is selected.


 Mutation: Gaussian perturbation with a fixed step size.
 Use Case: This strategy is suitable for simple, unimodal optimization
problems where a single optimal solution exists.

ii. (1,1) ES: Similar to (1+1) ES but without elitism. The offspring always replaces the
parent, even if it is worse.
iii. (𝝁 + 𝝀) ES: A multi-membered ES where μ parents generate λ offspring. The best μ
individuals from both parents and offspring are selected for the next generation.
 Selection: Elitist; the best individuals, including parents and offspring, are
retained.
 Mutation: Gaussian perturbation with adaptive step sizes.
 Advantage:
o This strategy ensures that the best solutions are always preserved, leading
to steady progress towards the optimal solution.
o Suitable for static optimization problems where retaining good solutions is
crucial.

iv. (μ, λ) ES: Similar to (μ + λ) ES, but only the λ offspring are considered for selection,
discarding the parents. This is beneficial for escaping local optima.
 Selection: Non-elitist; only offspring contribute to the next generation.


 Mutation: Gaussian perturbation with adaptive step sizes.

 Advantage:
o This strategy is more exploratory and can escape local optima by not
retaining older solutions.
o Suitable for dynamic optimization problems and multimodal functions
where diversity is important.

Type      | Population Size (μ) | Offspring (λ) | Selection Scheme | Mutation
(1+1) ES  | 1                   | 1             | Elitist          | Gaussian
(1,1) ES  | 1                   | 1             | Non-Elitist      | Gaussian
(μ+λ) ES  | μ                   | λ             | Elitist          | Gaussian
(μ,λ) ES  | μ                   | λ             | Non-Elitist      | Gaussian

v. Self-Adaptive Evolution Strategies: In this variant, the strategy parameters (e.g.,
mutation step size) evolve along with the genetic variables. These parameters are
encoded in the chromosomes and undergo mutation and recombination like other
genes.

Self-Adaptation Mechanism: Parameters such as mutation step sizes are adjusted
automatically based on their performance. For example, Rechenberg's 1/5th rule is
used to adjust the mutation rate.
 Advantage:
o This allows the algorithm to fine-tune its behavior over time, adapting to
the problem's landscape.
o Self-adaptive ES are particularly effective in complex, real-valued
optimization problems.

vi. Covariance Matrix Adaptation Evolution Strategy (CMA-ES): CMA-ES is a more
advanced form of ES that adapts the covariance matrix of the mutation distribution.
This allows the algorithm to learn the shape of the objective function landscape and
adapt the search accordingly.

 Covariance Matrix Adaptation:


o The covariance matrix represents correlations between different variables,
allowing the algorithm to make more informed search steps.
o The matrix is updated over time based on successful mutations, leading to
a more efficient search in complex landscapes.
 Advantage:
o CMA-ES is considered one of the most powerful optimization algorithms
for continuous, high-dimensional problems.
o It performs well in both convex and non-convex optimization scenarios.


8.4. Evolutionary Strategy Algorithm
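A minimal (μ + λ)-ES with self-adaptive step sizes, as a hedged Python sketch (the sphere objective, population sizes, and learning rate τ are illustrative choices):

import math, random

def es_mu_plus_lambda(f, n, mu=5, lam=20, generations=200):
    # Each individual is a pair (x, sigma): solution vector and mutation step size.
    pop = [([random.uniform(-5, 5) for _ in range(n)], 1.0) for _ in range(mu)]
    tau = 1.0 / math.sqrt(n)
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            x, sigma = random.choice(pop)
            new_sigma = sigma * math.exp(tau * random.gauss(0, 1))     # mutate sigma first
            new_x = [xi + new_sigma * random.gauss(0, 1) for xi in x]  # then the genes
            offspring.append((new_x, new_sigma))
        # (mu + lambda) selection: the best mu of parents and offspring survive.
        pop = sorted(pop + offspring, key=lambda ind: f(ind[0]))[:mu]
    return pop[0]

best, _ = es_mu_plus_lambda(lambda x: sum(v * v for v in x), n=3)
print(best)  # should approach the origin for the sphere function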

9. Learning Classifier Systems


Learning Classifier Systems (LCS) represent an alternative evolutionary approach to model
building based on the use of rule sets, rather than parse trees. LCS are used primarily in
applications where the objective is to evolve a system that will respond to the current state of
its environment (i.e., the inputs to the system) by suggesting a response that in some way
maximises future reward from the environment.

An LCS is therefore a combination of a classifier system and a learning algorithm. The
classifier system component is typically a set of rules, each mapping certain inputs to actions.
The whole rule set therefore constitutes a model that covers the space of possible inputs and
suggests the most appropriate actions for each. The learning algorithm component of an LCS
is implemented by an evolutionary algorithm, whose population members either represent
individual rules, or complete rule sets, known respectively as the Michigan and Pittsburgh
approaches. The fitness driving the evolutionary process may be driven by many different
forms of learning; here we restrict ourselves to ‘supervised’ learning, where at each stage the
system receives a training signal (reward) from the environment in response to the output it
proposes. This helps emphasise the difference between the Michigan and Pittsburgh
approaches. In the former, data items are presented to the system one-by-one and individual
rules are rewarded according to their predictions. By contrast, in a Pittsburgh approach each
individual represents a complete model, so the fitness would normally be calculated by
presenting the entire data set and calculating the mean accuracy of the predictions.


9.1. Michigan-style learning classifier system


The Michigan-style LCS was first described by Holland in 1976 as a framework for studying
learning in condition/action rule-based systems, using genetic algorithms as the principal
method for the discovery of new rules and the reinforcement of successful ones. Typically each
member of the population was a single rule representing a partial model – that is to say it might
only cover a region of the decision space. Thus it is the entire population that together
represents the learned model. Each rule is a tuple {condition:action:payoff}. The condition
specifies a region of the space of possible inputs in which the rule applies. The condition parts
of rules may contain wildcard, or ‘don’t-care’ characters for certain variables, or may describe
a set of values that a given variable may take – for example, a range of values for a continuous
variable. Rules may be distinguished by the number of wildcards they contain, and one rule is
said to be more specific than another if it contains fewer wildcards, or if the ranges for certain
variables are smaller — in other words if it covers a smaller region of the input space. Given
this flexibility, it is common for the condition parts of rules to overlap, so a given input may
match a number of rules. In the terminology of LCS, the subset of rules whose condition
matches the current inputs from the environment is known as the match set. These rules may
prescribe different actions, of which one is chosen. The action specifies either the action to be
taken (for example, if controlling robots or on-line trading agents) or the system’s prediction
(such as a class label or a numerical value). The subset of the match set advocating the chosen
action is known as the action set. Holland’s original framework maintained lists of which rules
have been used, and when a reward was received from the environment a portion was passed
back to recently used rules to provide information for the selection mechanism. The intended
effect is that the strength of a rule predicts the value of the reward that the system will gain for
undertaking the action. However, the framework proved unwieldy and difficult to make work
well in practice.

LCS research was reinvigorated in the mid-1990s by Wilson, who removed the concept of
memory and stripped out all but the essential components in his minimalist ZCS algorithm. At
the same time several authors were noting the conceptual similarity between LCS and
reinforcement learning algorithms which attempt to learn, for each input state, an accurate
mapping from possible actions to expected rewards. The XCS algorithm firmly established this
link by extending rule-tuples to {condition:action:payoff,accuracy}, where the accuracy value
reflects the system’s experience of how well the predicted payoff matches the reward received.
Unlike ZCS, the EA is restricted at each cycle — originally to the match set, latterly to the
action set, which increases the pressure to discover generalised conditions for each action. As
per ZCS, a credit assignment mechanism is triggered by the receipt of rewards from the
environment to update the predicted pay-offs for rules in the previous action set. However, the
major difference is that these are not used directly to drive selection in the evolution process.
Instead selection operates on the basis of accuracy, so the algorithm can in principle evolve a


complete mapping from input space to actions. Table below gives a simple overview of the
major features of Michigan-style classifiers for a problem with a binary input and output space.

The list below summarizes the main workflow of the algorithm.

1. A new set of inputs is received from the environment.
2. The rule base is examined to find the match-set of rules.
 If the match set is empty, a ‘cover operator’ is invoked to generate one or more
new matching rules with a random action.
3. The rules in the match-set are grouped according to their actions.
4. For each of these groups the mean accuracy of the rules is calculated.
5. An action is chosen, and its corresponding group noted as the action set.
 If the system is in an ‘exploit’ cycle, the action with the highest mean accuracy is
chosen.
 If the system is in an ‘explore’ cycle, an action is chosen randomly or via fitness-
proportionate selection, acting on the mean accuracies.
6. The action is carried out and a reward is received from the environment.
7. The estimated accuracies and predicted payoffs are then updated for the rules in the
current and previous action sets, based on the rewards received and the predicted pay-
offs, using a Widrow–Hoff style update mechanism.
8. If the system is in an ‘explore’ cycle, an EA is run within the action-set, creating new
rules (with pay-off and accuracies set to the mean of their parents), and deleting others.
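A toy sketch of steps 2–5 for a binary input space (the rule tuples, wildcard syntax, and values are invented for illustration):

# Each rule is (condition, action, predicted_payoff, accuracy); '#' is a wildcard.
def matches(condition, inputs):
    return all(c in ('#', i) for c, i in zip(condition, inputs))

rules = [("1#0", 1, 90.0, 0.9), ("1##", 0, 50.0, 0.6), ("0##", 1, 70.0, 0.8)]

inputs = "100"
match_set = [r for r in rules if matches(r[0], inputs)]  # rules whose condition fits

# Group by action; on an 'exploit' cycle, pick the action with highest mean accuracy.
def mean_accuracy(action):
    accs = [acc for (_, a, _, acc) in match_set if a == action]
    return sum(accs) / len(accs)

actions = {a for (_, a, _, _) in match_set}
best_action = max(actions, key=mean_accuracy)
action_set = [r for r in match_set if r[1] == best_action]
print(best_action, action_set)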

9.2. Pittsburgh-style Learning classifier system

The Pittsburgh-style LCS predates, but is similar to the better-known GP: each member of the
evolutionary algorithm’s population represents a complete model of the mapping from input to
output spaces. Each gene in an individual typically represents a rule, and again a new input
item may match more than one rule, in which case typically the first match is taken. This means
that the representation should be viewed as an ordered list, and two individuals which contain
the same rules, but in a different order on the genome, are effectively different models.
Learning of appropriately complex models is typically facilitated by using a variable-length
representation so that new rules can be added at any stage. This approach has several conceptual
advantages — in particular, since fitness is awarded to complete rule sets, models can be
learned for complex multi-step problems. The downside of this flexibility is that, like GP,
Pittsburgh-style LCS suffers from bloat and the search space becomes potentially infinite.
Nevertheless, given sufficient computational resources, and effective methods of parsimony to
counteract bloat, Pittsburgh-style LCS has demonstrated state-of-the-art performance in several
machine learning domains, especially for applications such as bioinformatics and medicine,
where human-interpretability of the evolved models is vital and large data-sets are available so
that the system can evolve off-line to minimise prediction error. Two recent examples winning
Humies Awards for better-than-human performance are in the realms of prostate cancer
detection and protein structure prediction.


10. Parameter Control


Parameter control in evolutionary algorithms (EAs) is the process of adjusting key parameters
during the execution of the algorithm. The goal is to enhance the algorithm's performance by
adapting it to different stages of the search process. Unlike static parameter settings, where
parameters remain constant, dynamic parameter control adjusts parameters based on pre-
defined rules or feedback from the search process.

10.1. Types of Parameter Control

There are three primary categories of parameter control:

1. Deterministic Parameter Control: Parameter values are altered based on a predefined
rule without feedback from the search process.
2. Adaptive Parameter Control: Parameter values are modified based on feedback from
the search, such as the quality of solutions or population diversity.
3. Self-adaptive Parameter Control: Parameters are encoded into the chromosomes and
evolve along with the solutions themselves.

Deterministic Parameter Control

In deterministic control, parameters are adjusted according to a predetermined schedule or rule
that doesn't rely on any feedback from the search. The change is typically time-based or
generation-based.

Example: For a mutation step size in an EA:

 Initially set the mutation step size to 1.
 Reduce the step size over generations according to the rule:

σ(t) = 1 − 0.9 · (t / T)

where t is the current generation, and T is the total number of generations.

This approach enables finer search control as the algorithm progresses, aiding in convergence.


Adaptive Parameter Control

Adaptive control uses feedback from the algorithm's performance to adjust parameters. This
feedback can be based on various aspects, such as the success of certain operations or the
diversity of the population.

Example: Rechenberg's 1/5 success rule for mutation step size:

 If more than 1/5 of mutations are successful, increase the step size.
 If fewer than 1/5 are successful, decrease the step size.

σ′ = σ / c   if p_s > 1/5
σ′ = σ · c   if p_s < 1/5
σ′ = σ       if p_s = 1/5

 where c < 1 is a constant and p_s is the success ratio.

Adaptive control can be more responsive and improve algorithm efficiency during execution.
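A minimal rendering of the rule (c = 0.9 is an illustrative constant):

def adapt_step_size(sigma, success_ratio, c=0.9):
    # Rechenberg's 1/5 rule: expand the step when succeeding often, shrink otherwise.
    if success_ratio > 0.2:
        return sigma / c  # increase the step size (c < 1)
    if success_ratio < 0.2:
        return sigma * c  # decrease the step size
    return sigma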

Self-adaptive Parameter Control

In self-adaptive control, the parameters themselves are treated as variables within the algorithm
and evolve along with the solutions. Parameters such as mutation rates or crossover
probabilities are encoded in the chromosomes and are subject to the same evolutionary
operators.

Example: Self-adapting mutation step size:

 Each individual in the population includes both the solution and the mutation step size.

σ′ = σ · e^(τ·N(0,1))

x′ᵢ = xᵢ + σ′ · N(0,1)

This approach allows the algorithm to naturally adapt the parameters based on their impact on
the search process.
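A direct, hedged rendering of these two updates (τ = 0.1 is an arbitrary choice):

import math, random

def self_adaptive_mutate(x, sigma, tau=0.1):
    # The step size mutates log-normally first, then perturbs every gene.
    new_sigma = sigma * math.exp(tau * random.gauss(0.0, 1.0))
    new_x = [xi + new_sigma * random.gauss(0.0, 1.0) for xi in x]
    return new_x, new_sigma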

10.2. Examples of Changing Parameters

10.2.1. Changing Mutation Step Size

 Deterministic: The mutation step size can decrease over generations, such as:
σ(t) = 1 − 0.9 · (t / T)
 Adaptive: Using Rechenberg's 1/5 success rule.
 Self-adaptive: Each individual carries its own mutation step size, which evolves.


10.2.2. Changing Penalty Coefficients

 Deterministic: Adjust penalty coefficients over time in constrained optimization
problems, e.g. W(t) = (C · t)^α.
 Adaptive: Adjust penalties based on the feasibility of the best individuals over recent
generations.

10.2.3. Evaluation Function Adaptation

 Deterministic: Gradually change the importance of certain constraints in the evaluation
function over time.
 Self-adaptive: Allow penalties for constraints to evolve alongside the solutions.

10.3. Classification of Control Techniques

The classification of parameter control techniques in evolutionary algorithms can
be considered from different perspectives:

What is Changed?

 Representation of Individuals: How solutions are encoded.


 Evaluation Function: How solutions are evaluated.
 Variation Operators: Mutation, crossover, etc.
 Selection Operators: Parent or survival selection.
 Population Parameters: Size, topology, etc.

How Changes Are Made?


1. Deterministic: Predefined rules without feedback.
2. Adaptive: Feedback-based adjustments.
3. Self-adaptive: Parameters evolve with the population.

What Evidence Informs the Change?


1. Absolute Evidence: Based on predefined events, such as population diversity falling
below a threshold.
2. Relative Evidence: Based on comparing the performance of different parameter values
during the search.

Scope of Changes
 Gene-Level: Changes affect individual genes.
 Individual-Level: Changes affect entire individuals.
 Population-Level: Changes affect the entire population.

Parameter control in evolutionary algorithms is crucial for balancing exploration and
exploitation. By choosing appropriate techniques (deterministic, adaptive, or self-adaptive),
you can significantly improve the performance of the algorithm.


11. Multimodal Problems


Multimodal problems refer to optimization tasks where the search landscape contains multiple
local optima. These local optima represent points in the search space that are better than their
surrounding points but not necessarily the best possible solution (global optimum). Effective
search strategies in Evolutionary Algorithms (EAs) must balance the exploitation of known
high-fitness regions and the exploration of new areas to uncover potentially better solutions.

In EAs, multimodality is a common characteristic, and they are often employed to either:

1. Locate the global optimum, especially in cases where local optima are misleading.
2. Identify multiple high-fitness solutions, which correspond to different local optima.
This scenario is useful in practical problems where multiple viable solutions are
required.

Example: Consider the design of a new product where different design options need to be
evaluated. The fitness function might evolve as decisions about materials, functionality, and
aesthetics are made. In such a dynamic scenario, it is beneficial to examine several design
options to allow flexibility in future stages.

However, multimodal problems present challenges in preserving diversity within the
population. The finite population size and the effects of genetic drift—where random
variations cause the population to converge prematurely around one optimum—can reduce the
ability of EAs to maintain multiple optima.

Genetic Drift Example: Imagine a population with two equally fit niches, each initially
containing 50% of the population. Due to random selection effects, one niche may eventually
dominate. For instance, if one niche slightly outnumbers the other (51% vs. 49%), the EA will
increasingly favor the more populous niche until only one niche remains.

11.1. Preserving Diversity in Multimodal Problems

To effectively handle multimodal problems, EAs employ strategies to maintain diversity.
These strategies fall into two main categories:

 Explicit Approaches: Modifications are made directly to operators in the algorithm to
encourage diversity.
 Implicit Approaches: Frameworks are used that permit diversity but do not explicitly
enforce it.

Types of Diversity Spaces:

1. Genotype Space: Represents the set of all possible solutions. Diversity in this space
can be maintained using distance metrics (e.g., Manhattan distance).
2. Phenotype Space: The actual solution space, where diversity is based on the
differences between solutions.
3. Algorithmic Space: Represents how the EA operates, such as population structure or
distribution across different processors.


11.2. Fitness Sharing


Fitness Sharing is an explicit approach to maintaining diversity by controlling the number of
individuals within each niche. This technique adjusts the fitness of individuals based on their
proximity to others, ensuring that niches are populated proportionally to their fitness levels.

Process:

1. Calculate the distance between each pair of individuals.
2. Adjust each individual's fitness based on the number of individuals within a certain
distance (σ_share).

Effect: Individuals within densely populated niches will have lower fitness, reducing their
chances of selection and encouraging the population to spread across multiple niches.
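A hedged sketch using a triangular sharing function (the distance callback and σ_share value are illustrative):

def shared_fitness(population, fitness, distance, sigma_share=1.0):
    shared = []
    for i in population:
        # Niche count: each neighbour within sigma_share contributes to crowding.
        niche = sum(max(0.0, 1.0 - distance(i, j) / sigma_share)
                    for j in population)
        shared.append(fitness(i) / niche)  # crowded niches get their fitness reduced
    return shared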

11.3. Crowding

Crowding is another explicit approach where new individuals replace similar ones in the
population, ensuring that diversity is maintained.

Crowding Process:

1. New offspring are compared to a subset of the population.
2. The offspring replace the most similar individuals, preserving diversity by preventing
overpopulation of any single niche.

Deterministic Crowding: A variant where offspring compete with their parents for survival,
ensuring that similar solutions don't crowd out diverse ones.
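A minimal sketch of the deterministic crowding replacement rule (fitness and distance are supplied callbacks):

def deterministic_crowding(p1, p2, c1, c2, fitness, distance):
    # Pair each child with the more similar parent; the fitter of each pair survives.
    if distance(p1, c1) + distance(p2, c2) <= distance(p1, c2) + distance(p2, c1):
        pairs = [(p1, c1), (p2, c2)]
    else:
        pairs = [(p1, c2), (p2, c1)]
    return [max(parent, child, key=fitness) for parent, child in pairs]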

11.4. Speciation
Automatic Speciation introduces mating restrictions to promote diversity. In this approach,
individuals are grouped into species based on their characteristics, and they only mate within
their species. This approach mimics biological evolution and maintains diversity by preventing
the mixing of very different solutions.

Implementation:

 Individuals are assigned to species based on their genotype or phenotype.


 Mating is restricted to within-species pairs, ensuring that different niches or species are
preserved within the population.

11.5. Implicit Approaches: Island Model EAs and Cellular EAs


1. Island Model EAs: The population is divided into subpopulations (islands), each
evolving independently. Occasionally, individuals migrate between islands,
introducing new genetic material and maintaining diversity.


2. Cellular EAs: The population is structured in a grid-like manner, where individuals
only interact with their neighbors. This structure helps maintain diversity by limiting
the spread of dominant genes across the entire population.

Multimodal problems present unique challenges for Evolutionary Algorithms due to their complex landscapes with multiple local optima. Preserving diversity is key to solving these problems effectively. Techniques like fitness sharing, crowding, speciation, and implicit models help maintain diverse populations, allowing EAs to explore multiple optima simultaneously and increasing the chances of finding global solutions or multiple high-quality solutions in multimodal search spaces.


Chapter 2 - Swarm Intelligence


1. Introduction to Swarm Intelligence
1.1. Definition

 Definition: Swarm Intelligence (SI) refers to the collective, decentralized, and self-
organized behavior of a group of individuals or agents. These individuals interact
locally with one another and their environment, resulting in the emergence of complex,
coordinated behaviors.

 Origin of the Term: The term Swarm Intelligence was introduced by Beni and Wang
in connection with cellular robotic systems. They developed algorithms for controlling
robotic swarms. An earlier example of utilizing the concept of flocking behavior was
in 1987 when Reynolds developed a program to simulate the motion of bird flocks.

 Scope of Swarm Intelligence: It encompasses both natural and artificial systems, where individuals coordinate their behaviours to achieve tasks collectively. Examples of natural systems include:
o Bird Flocking: Birds adjust their flight path based on the movement of nearby
birds.
o Ant Colony Foraging: Ants find optimal paths between their nest and food sources
using pheromones.
o Fish Schooling: Fish in a school move in a coordinated fashion to avoid predators
and optimize movement.

 The swarm consists of simple, dynamic, and active agents with little inherent
knowledge of the environment. This cooperative behavior leads to the emergence of
intelligent search strategies that are more efficient than random searches.

 Definition by Bonabeau et al.: Swarm Intelligence is defined as the emergent collective intelligence of groups of simple agents.

1.2. Development of Swarm Intelligence Algorithms

o 1987: Reynolds developed a simulation of flocking behaviour in birds.
o 1992: Ant Colony Optimization (ACO) was introduced, inspired by ant foraging behaviour.
o 1995: Particle Swarm Optimization (PSO) was introduced, inspired by the social behaviour of bird flocking and fish schooling.

 Initially, Swarm Intelligence approaches were considered part of Evolutionary Computation (EC) due to similarities in their population-based and stochastic approaches. However, over time, SI became distinct from EC because of inherent differences in the underlying philosophy:

o SI mimics the collective and synergistic behaviour of simple agents.
o EC is inspired by biological evolution.


1.3. Key Phases in Swarm Intelligence Algorithms


Swarm Intelligence algorithms generally consist of two key phases:
 Variation Phase: Explores various areas of the search space.
 Selection Phase: Exploits past experiences to improve the search process.

These phases help maintain the balance between exploration (discovering new solutions) and exploitation (refining known good solutions) within the swarm.

1.4. Characteristics of Swarm Intelligence Systems


A group of homogeneous agents exhibits swarm intelligence if and only if it satisfies two conditions: self-organization and division of labour. Flexibility and robustness, described below, are further characteristic properties.

i. Self-Organization
Swarm intelligence systems are self-organizing, meaning ordered patterns and behaviours
arise from local interactions between individuals without centralized control. According to
Bonabeau et al., self-organization involves:
 Positive Feedback: Reinforces the formation of favourable structures. It promotes
diversity and exploration in swarm intelligence.
 Negative Feedback: Stabilizes the collective behaviour and helps exploit known
solutions.
 Fluctuations: Introduces randomness, helping the system escape from local optima.
 Multiple Interactions: Individuals learn from interactions with others, improving the
overall intelligence of the swarm.

ii. Division of Labour


Division of labour enables different individuals to specialize in different tasks, allowing for
simultaneous completion of multiple tasks. This makes the swarm adaptable to changes in the
environment or the problem space.

iii. Flexibility
SI systems can adapt to changes in the environment.

iv. Robustness
They can withstand individual failures without collapse.

1.5. Popular Swarm Intelligence Algorithms


Swarm Intelligence algorithms are inspired by various natural systems, and some popular
algorithms include:
 Particle Swarm Optimization (PSO): Inspired by bird flocking and fish schooling
behaviour.
 Ant Colony Optimization (ACO): Inspired by the foraging behaviour of ants.
 Artificial Bee Colony (ABC): Motivated by foraging behaviour of honey bees.
 Bacterial Foraging Optimization (BFO): Inspired by bacterial foraging behaviour
(e.g., E. coli).
 Firefly Algorithm (FFA): Inspired by the flashing behaviour of fireflies.


 Spider Monkey Optimization (SMO): Based on the foraging behaviour of spider monkeys.

Particle swarm optimization and Ant colony optimization are the most popular ones.

Particle Swarm Optimization (PSO)

PSO was developed by Kennedy and Eberhart in 1995, inspired by the social behaviour of bird
flocks and fish schools.

Typically, a flock of birds has no leader; the birds find food by collaborative trial and error. They follow the member of the group whose position is closest to the food source, while the others update their positions by communicating with members that already occupy better positions. This is repeated until the best food source is found. Particle swarm optimization consists of a population of individuals, called a swarm, in which each individual, called a particle, represents a location or candidate solution in a multidimensional search space.

Over time, the swarm converges toward the best solution. PSO has been widely used in
continuous optimization problems, such as parameter tuning and function optimization.

Ant Colony Optimization (ACO)

ACO is inspired by the foraging behaviour of ants. Ants are able to find the shortest
path between their nest and a food source using pheromones.

Ant Colony Optimization was the first successful example of swarm intelligence. The algorithm was introduced to find optimal paths in a graph. A group of ants starts foraging randomly. Once an ant finds a food source, it exploits it and returns to the nest, leaving pheromone on its path. The pheromone concentration guides other ants searching for food: when they encounter pheromone, they follow that path with a probability proportional to the concentration. If these ants also reach the food source, they deposit pheromone on their return trip as well, so the trail grows stronger as more ants find the path. Pheromone also evaporates over time; longer paths are reinforced less frequently, so evaporation weakens them relative to shorter paths.

2. Applications of Swarm Intelligence in Optimization Problems

2.1. Combinatorial Optimization


Ant Colony Optimization (ACO): This algorithm is excellent for solving problems such as
vehicle routing, job scheduling, and network routing, where the solution involves finding the
best sequence or combination.
Example:
In the Traveling Salesman Problem (TSP), ants deposit pheromones on the paths they traverse,
and over time, the pheromone concentrations guide other ants to follow the shortest path, thus
finding an optimal or near-optimal solution.
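
A sketch of the pheromone bookkeeping behind this behaviour, in the style of the basic ant system (the parameters rho and Q follow common ACO notation; the data layout is our assumption):

```python
import numpy as np

def pheromone_update(tau, tours, lengths, rho=0.5, Q=1.0):
    """One evaporation-plus-deposit step for a symmetric TSP.

    tau:     (n, n) pheromone matrix
    tours:   list of tours, each a list of city indices
    lengths: total length of each ant's tour
    """
    tau *= (1.0 - rho)                               # uniform evaporation on all edges
    for tour, length in zip(tours, lengths):
        for a, b in zip(tour, tour[1:] + tour[:1]):  # edges of the closed tour
            tau[a, b] += Q / length                  # shorter tours deposit more
            tau[b, a] += Q / length                  # symmetric problem
    return tau
```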

2.2. Continuous Optimization:


Particle Swarm Optimization (PSO): PSO is used to solve problems where variables take
continuous values, such as parameter tuning in machine learning models and function
optimization.
Example:
PSO optimizes a multi-dimensional function by adjusting the positions of particles (potential
solutions) in the search space, where each particle represents a candidate solution and its
velocity is influenced by both its own best-known position and the best-known positions of
other particles.

2.3. Multi-modal Optimization:


SI algorithms are also applied to problems with multiple optima (peaks or valleys). For
instance, PSO and ACO can find several solutions in problems where there are multiple
objectives or trade-offs to consider.
Example:
In the design of engineering systems, multi-modal optimization can be used to explore
multiple solutions simultaneously, rather than converging to a single optimum.

2.4. Dynamic Optimization Problems:


SI algorithms can adapt to changing environments, making them suitable for problems where
the optimization landscape changes over time.
Example:
Dynamic resource allocation in communication networks can be handled using swarm
intelligence, where network conditions (such as bandwidth or user traffic) change
dynamically, requiring real-time adjustments.

3. Particle swarm optimization (PSO)

3.1. Introduction to PSO

 Particle Swarm Optimization (PSO) is a swarm intelligence algorithm, inspired by bird flocking and fish schooling, for solving the nonlinear, nonconvex, or combinatorial optimization problems that arise in many science and engineering domains.
 It was developed by James Kennedy and Russell Eberhart in 1995.
 PSO simulates the natural group behaviour where no leader exists, and individuals
communicate and cooperate to achieve a common goal, such as finding food.
 Key Idea: PSO mimics the social interactions of individuals (particles) within a
population (swarm) to explore the search space and find the optimal solution. Each
particle adjusts its position based on its own best-known position and the best-known
position of the swarm.
 Key Concepts of PSO
 Swarm: A population of potential solutions to the optimization problem, called
particles.
 Particle: Each particle represents a candidate solution in a multi-dimensional search
space.
 Position: The location of each particle in the search space.
 Velocity: The rate at which the particle moves through the search space.


 Fitness: A measure of how good a particle's current position (solution) is, based on an objective function.

Particles adjust their positions by considering two factors:

1. Personal best (pBest): The best position a particle has encountered so far.
2. Global best (gBest): The best position found by any particle in the entire swarm.

3.2. Introduction to Bird Flocking Behaviour


Many bird species are social and form flocks for various reasons. Flocks may be of different
sizes, occur in different seasons and may even be composed of different species that can work
well together in a group.

 More Eyes and Ears: Better opportunities for finding food and detecting predators.
 Survival Advantage: Collective behavior aids in increasing the survival chances of
individual birds.

3.2.1. Key Benefits of Flocking

i. Foraging
 E.O. Wilson's Theory: Individuals in a group can benefit from the discoveries and
experiences of others. When birds forage together, they can share information about
food sources.
 Non-Competing Flocks: Some species form flocks in a non-competitive way, where
all members benefit from shared knowledge about food locations.

ii. Protection Against Predators


 Increased Vigilance: More ears and eyes mean more chances of spotting a predator or any other potential threat.
 Confusion and Defense: A group of birds may be able to confuse or overwhelm a
predator through mobbing or agile flights.
 Safety in Numbers: The larger the group, the smaller the chance of an individual bird
being targeted.

iii. Aerodynamics
When birds fly in flocks, they often arrange themselves in specific shapes or formations. These formations take advantage of changing wind patterns, which depend on the number of birds in the flock and on how each bird's wings create different air currents. This allows flying birds to use the surrounding air in the most energy-efficient way.

3.2.2. Risks of Flocking


 Noise and Motion: More birds create more noise and movement, which can attract predators and pose a constant threat to the flock.
 Competition for Resources: Larger flocks require more food, leading to competition,
which can result in the death of weaker individuals.


3.2.3. Connection to Particle Swarm Optimization

PSO, inspired by bird flocking, simulates the advantages of collective behaviour but not its disadvantages.
 No Elimination of Weaker Individuals: Unlike Genetic Algorithms, where weaker individuals die out, PSO does not kill off any individual during the search; all individuals remain alive and try to make themselves stronger throughout the search process.
 Cooperation Over Competition: In PSO, potential solutions improve through cooperation, whereas in evolutionary algorithms they improve through competition.
This concept distinguishes swarm intelligence from evolutionary algorithms.

Bird Flocking Rules and Their Influence on PSO


Mataric outlined the following rules for bird flocking:
i. Safe Wandering: When birds fly, they must not collide with each other or with obstacles.
ii. Dispersion: Each bird maintains a minimum distance from any other.
iii. Aggregation: Each bird also maintains a maximum distance from any other.
iv. Homing: All birds have the potential to find a food source or the nest.

The basic PSO model developed by Kennedy and Eberhart incorporates some, but not all, of these flocking rules.

 Safe Wandering and Dispersion: These rules are not enforced in PSO. Particles (agents) may come as close to each other as possible during movement.
 Aggregation and Homing: These rules are valid in the PSO model. In PSO, agents have to fly within a particular region, so they maintain a maximum distance from any other agent. Homing means that any agent in the group may reach the global optimum.

3.3. Fundamental Principles of PSO


Kennedy and Eberhart proposed five principles which determine whether a group of agents is
a swarm or not.

i. Proximity Principle: the population should be able to carry out simple space and time
computations.
ii. Quality Principle: the population should be able to respond to quality factors in the
environment.
iii. Diverse Response Principle: the population should not commit its activity along
excessively narrow channels.
iv. Stability Principle: the population should not change its mode of behaviour every time
the environment changes.
v. Adaptability Principle: the population should be able to change its behaviour mode
when it is worth the computational price.


3.4. PSO Model for Function Optimization


Particle Swarm Optimization (PSO) is a swarm intelligence-based algorithm used for function
optimization. It performs a random search guided by swarm intelligence. The search is carried
out by a set of potential solutions, referred to as a swarm, where each potential solution is
called a particle.

3.4.1. Learning in PSO


 Cognitive Learning: Particles learn from their own experience. Each particle stores in its memory the best solution it has visited so far, called pbest.
 Social Learning: Particles learn from others within the swarm. Each particle stores in its memory the best solution visited by any particle of the swarm, called gbest.

3.4.2. Velocity and Position Update in PSO

The change in a particle's direction and magnitude of movement is governed by a factor called velocity: the rate of change of position with respect to time. In PSO, time corresponds to the iteration counter, so velocity may be defined as the rate of change of position with respect to the iteration. Since the iteration counter increases by unity, the velocity v and the position x have the same dimension.

Velocity Update Equation

$$v_{i,d}^{t+1} = v_{i,d}^{t} + c_1 r_1 \left(p_{i,d}^{t} - x_{i,d}^{t}\right) + c_2 r_2 \left(p_{g,d}^{t} - x_{i,d}^{t}\right) \qquad (1)$$

Where,
$v_{i,d}^{t+1}$: updated velocity of particle i in dimension d
$c_1, c_2$: cognitive and social scaling parameters
$r_1, r_2$: random numbers in [0, 1]
$p_{i,d}^{t}$: personal best position of particle i
$p_{g,d}^{t}$: global best position (best in the swarm)
$x_{i,d}^{t}$: current position of particle i

Position Update Equation


After updating the velocity, the position of the particle is updated:

$$x_{i,d}^{t+1} = x_{i,d}^{t} + v_{i,d}^{t+1} \qquad (2)$$

Where $x_{i,d}^{t+1}$ is the updated position of particle i in dimension d.


The only link between the dimensions of the problem space is introduced via the objective function, i.e., through the locations of the best positions found so far, gbest and pbest.

Flowchart of PSO
1. Initialize the swarm with random positions and velocities.
2. Evaluate the fitness of each particle.
3. Update the velocity of each particle based on pBest and gBest.
4. Update the position of each particle.
5. Update pBest and gBest.
6. If the termination condition is met, stop. Otherwise, return to step 3.
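
The steps above translate almost line for line into code. Below is a minimal sketch for minimization using update equations (1) and (2); the function name, bounds, and default parameter values are illustrative choices:

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=200, c1=2.0, c2=2.0,
        lower=-10.0, upper=10.0, seed=0):
    """Basic PSO (no inertia weight) minimizing f over [lower, upper]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lower, upper, (n_particles, dim))   # step 1: random positions
    v = np.zeros((n_particles, dim))                    # ... and velocities
    pbest = x.copy()                                    # personal best positions
    pbest_f = np.apply_along_axis(f, 1, x)              # step 2: evaluate fitness
    gbest = pbest[pbest_f.argmin()].copy()              # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # step 3, Eq. (1)
        x = x + v                                               # step 4, Eq. (2)
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pbest_f                                 # step 5: update pbest
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        gbest = pbest[pbest_f.argmin()].copy()                  # ... and gbest
    return gbest, pbest_f.min()                                 # step 6: budget reached
```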

Understanding Update Equations

 Momentum Term: The previous velocity v can be thought of as a momentum term; it serves as a memory of the previous direction of movement and prevents the particle from drastically changing direction.
 Cognitive Component: The term $c_1 r_1 (p_{i,d}^{t} - x_{i,d}^{t})$ pulls the particle towards its personal best position. It ensures that the particle remembers and moves closer to the best position it has encountered.
 Social Component: The term $c_2 r_2 (p_{g,d}^{t} - x_{i,d}^{t})$ attracts the particle towards the global best position, allowing it to learn from the swarm's overall best performance.

The constants $c_1$ and $c_2$ regulate the influence of the personal best and the global best, respectively. Typically, $c_1$ and $c_2$ are set to 1.5–2.0. In the widely used inertia-weight variant of PSO, the previous velocity in Eq. (1) is additionally multiplied by an inertia weight ω, which is decreased linearly over time to encourage convergence; see below.
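
For reference, the inertia-weight velocity update (a widely used refinement of Eq. (1), due to Shi and Eberhart; the rest of the notation is unchanged) reads:

$$v_{i,d}^{t+1} = \omega\, v_{i,d}^{t} + c_1 r_1 \left(p_{i,d}^{t} - x_{i,d}^{t}\right) + c_2 r_2 \left(p_{g,d}^{t} - x_{i,d}^{t}\right)$$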


3.5. Particle Swarm Optimization Parameters

The convergence speed and the ability of any population-based algorithm to find optimal solutions are greatly influenced by the choice of its parameters. A general recommendation for parameter settings is usually not possible, as the best values are highly problem-dependent. However, theoretical and/or experimental studies have been carried out to recommend generic ranges for parameter values. As with other population-based search algorithms, tuning the parameters of a generic PSO has always been a challenging task due to the presence of the stochastic factors r1 and r2 in the search procedure.

Key PSO Parameters

3.5.1. Swarm Size

 Definition: The swarm size refers to the number of particles (potential solutions) used in
the optimization process.
 Importance: The choice of swarm size influences the algorithm's balance between
exploration (searching broadly across the solution space) and exploitation (refining the
search around promising areas).
 Recommended Swarm Size:
o Empirically chosen based on the number of decision variables and the complexity of
the problem.
o General guidelines suggest using 20–50 particles for many problems.

3.5.2. Scaling Factors c1 and c2

 Role: These scaling factors, also known as the cognitive (c1) and social (c2) coefficients, determine the step size of the particles as they move through the solution space.
 Influence on Search:
o The values of c1 and c2 decide how much weight is given to the personal experience of a particle (via pbest) and to the experience of the swarm (via gbest).
o They regulate the speed of particles in reaching better solutions.
 Common Values:
o In the basic PSO model, 𝑐1 = 𝑐2 = 2 is a typical setting.
o With this setting, particles can converge quickly, but uncontrolled speed may reduce
the exploitation of the search space, potentially leading to premature convergence.
 Effect of Different Configurations:
o If 𝑐1 = 𝑐2 > 0 , particles are attracted toward the average of their personal best
position and the swarm’s best position.
o If 𝑐1 > 𝑐2 , the algorithm is more useful for multimodal problems (with multiple
optimal solutions).
o If 𝑐2 > 𝑐1 , the algorithm is better suited for unimodal problems (with a single optimal
solution).
 Small vs. Large Values:
o Small values of 𝑐1 and 𝑐2 result in smoother particle movements, which improves the
stability of the search process.
o Larger values can cause abrupt, aggressive movements, leading to faster convergence
but a higher chance of missing the global optimum.


 Adaptive Scaling: Researchers have also proposed adaptive versions of 𝑐1 and 𝑐2 , where
these parameters change dynamically during the search process to balance exploration and
exploitation.

3.5.3. Stopping Criteria

 Role: The stopping criterion is a parameter not only of PSO but of any population-based meta-heuristic algorithm.
 Popular Stopping Criteria:
o Maximum number of iterations: This is a common choice. The algorithm stops after
a fixed number of iterations, regardless of the quality of the solution.
o Maximum number of function evaluations: Another common criterion, where the
algorithm stops after evaluating the objective function a certain number of times.
 More Efficient Stopping Criteria:
o Instead of relying only on iteration or evaluation counts, a more sophisticated approach
is to monitor the improvement in the solution.
o Convergence-based criterion: If the best solution doesn’t improve significantly after
a certain number of iterations, the search should stop. This method saves computational
time by halting the search once the algorithm’s progress slows down.
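
A sketch of such a convergence-based criterion (the window size and tolerance are illustrative):

```python
def should_stop(history, window=20, tol=1e-8, max_iters=1000):
    """Stop when the best-so-far value (assumed minimized, so non-increasing)
    has improved by less than tol over the last `window` iterations,
    or when the iteration budget is exhausted."""
    if len(history) >= max_iters:
        return True
    if len(history) > window and history[-window - 1] - history[-1] < tol:
        return True
    return False
```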

Table: Summary of Key Parameter Effects


| Parameter | Description | Effect | Recommended Range |
|---|---|---|---|
| Swarm Size (S) | Number of particles in the swarm | Larger size leads to better exploration, but higher computational cost. | 20–50 particles |
| Cognitive Factor (c1) | Weight for the particle's own experience (pbest) | Higher values increase individual exploration. Useful in multimodal problems. | Typically 1–2 |
| Social Factor (c2) | Weight for the swarm's experience (gbest) | Higher values increase global cooperation. Useful in unimodal problems. | Typically 1–2 |
| Random Factors (r1, r2) | Stochastic terms controlling randomness | Introduce randomness for diversity. Can help avoid local optima. | Range [0, 1] |
| Stopping Criteria | Condition to terminate the algorithm | Can be based on iterations, function evaluations, or lack of improvement. | Problem-dependent |


3.6. Advantages, Disadvantages and Applications

3.6.1. Advantages of PSO


 Simple and Easy to Implement: Fewer parameters compared to other evolutionary
algorithms like Genetic Algorithms (GA).
 No Gradient Information Needed: PSO works well even when the fitness function
is non-differentiable, noisy, or discontinuous.
 Fast Convergence: PSO can converge quickly, especially for simpler optimization
problems.

3.6.2. Disadvantages of PSO


 May Get Stuck in Local Optima: Especially for highly multimodal optimization
problems.
 Parameter Sensitivity: Performance can be sensitive to parameters like inertia
weight, and social and cognitive coefficients.

3.6.3. Applications of PSO


PSO has been widely applied in a variety of fields, including:

 Engineering Optimization: Structural design, control system design, power systems


optimization.
 Machine Learning: Neural network training, hyperparameter tuning, clustering.
 Robotics: Path planning, multi-robot coordination.
 Scheduling and Planning: Job-shop scheduling, resource allocation.
 Economics and Finance: Portfolio optimization, option pricing.

3.6.4. Example Problem: Sphere Function

A common example used to test PSO is optimizing the Sphere Function:

$$f(x) = \sum_{i=1}^{n} x_i^2$$

 The global minimum is at x = 0, where f(x) = 0.
 PSO minimizes this function by moving particles through the multi-dimensional search space.
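
Reusing the pso() sketch from Section 3.4, the sphere function can be minimized as follows (dimension, bounds, and iteration budget are arbitrary illustrative values):

```python
import numpy as np

def sphere(x):
    return float(np.sum(x ** 2))

best_x, best_f = pso(sphere, dim=5, n_particles=30, iters=300,
                     lower=-5.12, upper=5.12)
print(best_x, best_f)   # best_x should end up close to the zero vector
```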

References:

 "Computational Intelligence: An Introduction", Second Edition, Andries P. Engelbrecht, John Wiley & Sons, ISBN 978-0-470-03561-0, 2008.
 "Introduction to Evolutionary Computing", A.E. Eiben and J.E. Smith, Springer, Second Edition, 2007.

