
UNIT 5

Association Rules Learning


Artificial Neural Network
Genetic Algorithms

Machine Learning / BTCS 618‐18


Dr. Vandana Mohindru
Topics to be discussed
• Association Rules Learning: Need and Application of Association Rules Learning
• Basic concepts of Association Rule Mining
• Naïve algorithm
• Apriori algorithm
• Artificial Neural Network: Need and Application of Artificial Neural Network
• Neural network representation and working
• Activation Functions
• Genetic Algorithms: Basic concepts
• Gene Representation and Fitness Function
• Selection, Recombination
• Mutation and Elitism.
Association Rules Learning
• Association rule learning is an unsupervised learning technique that
  checks for the dependency of one data item on another and maps these
  dependencies so that they can be exploited profitably.
• It tries to find interesting relations or associations among the
  variables of a dataset, using different rules to discover those
  relations in the database.
• Association rule learning is one of the very important concepts
  of machine learning, and it is employed in market basket analysis, web
  usage mining, continuous production, etc.
• Market basket analysis is a technique used by big retailers
  to discover the associations between items. We can understand it by
  taking the example of a supermarket, where all products that are
  frequently purchased together are placed together.
Association Rules Learning
• For example, if a customer buys bread, they are also likely to buy
  butter, eggs, or milk, so these products are stored on the same shelf or
  mostly nearby. Consider the below diagram:
Association Rules Learning
Association rule learning can be divided into three types of algorithms:
• Apriori
• Eclat
• F‐P Growth Algorithm
Association Rules Learning
How does Association Rule Learning work?
• Association rule learning works on the concept of an If-Then statement, such as
  if A then B.
• Here the If element is called the antecedent, and the Then statement is called
  the consequent.
• A relationship in which we find an association between two single items
  is known as single cardinality. Rule creation is the goal, and as the number of
  items increases, the cardinality also increases accordingly. So, to measure the
  associations between thousands of data items, there are several metrics.
  These metrics are given below:
Support Confidence Lift
Association Rules Learning
How does Association Rule Learning work?
• Support: Support is the frequency of an itemset, i.e., how frequently it appears
  in the dataset. It is defined as the fraction of the transactions T that
  contain the itemset X. For an itemset X and a total of T transactions, it
  can be written as:

  Support(X) = (Number of transactions containing X) / T

• Confidence: Confidence indicates how often the rule has been found to
  be true, i.e., how often the items X and Y occur together in the dataset
  given that X occurs. It is the ratio of the number of transactions
  that contain both X and Y to the number of transactions that contain X:

  Confidence(X → Y) = Support(X ∪ Y) / Support(X)
Association Rules Learning
How does Association Rule Learning work?
• Lift: Lift measures the strength of a rule and is defined by the formula:

  Lift(X → Y) = Support(X ∪ Y) / (Support(X) × Support(Y))

It is the ratio of the observed support to the support expected if X
and Y were independent of each other. It has three possible ranges of values:
• Lift = 1: The occurrences of the antecedent and the consequent are
  independent of each other.
• Lift > 1: The two itemsets are positively dependent on each other, and
  the value indicates the degree of dependence.
• Lift < 1: One item is a substitute for the other, which
  means one item has a negative effect on the occurrence of the other.
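The three metrics above can be computed directly from a transaction list. Below is a minimal Python sketch (the transaction data is made up for illustration) implementing Support, Confidence, and Lift exactly as defined:

```python
# Toy transaction database (illustrative data, not from the slides).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "eggs"},
    {"butter", "milk"},
    {"bread", "butter", "eggs"},
]

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(X, Y):
    """How often Y appears in transactions that already contain X."""
    return support(set(X) | set(Y)) / support(X)

def lift(X, Y):
    """Observed support of X and Y relative to independence."""
    return confidence(X, Y) / support(Y)

print(support({"bread", "butter"}))       # 3/5 = 0.6
print(confidence({"bread"}, {"butter"}))  # 0.6 / 0.8 = 0.75
print(lift({"bread"}, {"butter"}))        # 0.75 / 0.8 = 0.9375
```

On this toy data the lift of bread → butter comes out below 1, so here bread would read as a weak negative signal for butter.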
Association Rules Learning Algorithms
• Apriori Algorithm: This algorithm uses frequent itemsets to generate
  association rules. It is designed to work on databases that contain
  transactions. The algorithm uses a breadth‐first search and a hash tree to
  count itemsets efficiently. It is mainly used for market basket analysis
  and helps to understand the products that can be bought together. It can
  also be used in the healthcare field to find adverse drug reactions for patients.
• Eclat Algorithm: Eclat stands for Equivalence Class
  Transformation. This algorithm uses a depth‐first search technique to find
  frequent itemsets in a transaction database. It generally executes faster
  than the Apriori algorithm.
• F‐P Growth Algorithm: F‐P growth stands for Frequent
  Pattern growth, and it is an improved version of the Apriori algorithm. It represents
  the database in the form of a tree structure known as a frequent
  pattern tree (FP-tree). The purpose of this tree is to extract the most
  frequent patterns.
Association Rules Learning
Applications of Association Rule Learning:
• Market Basket Analysis: This is one of the most popular applications of
  association rule mining. The technique is commonly used
  by big retailers to determine the associations between items.
• Medical Diagnosis: With the help of association rules, patients can be
  treated more easily, as the rules help in identifying the probability of
  illness for a particular disease.
• Protein Sequence: Association rules help in determining the synthesis
  of artificial proteins.
• It is also used for catalog design, loss‐leader analysis, and many
  other applications.
Apriori Algorithm
• The Apriori algorithm uses frequent itemsets to generate association
  rules, and it is designed to work on databases that contain
  transactions. With the help of these association rules, it determines how
  strongly or how weakly two objects are connected. The algorithm uses
  a breadth‐first search and a hash tree to count itemsets
  efficiently. It is an iterative process for finding the frequent itemsets in
  a large dataset.
• The algorithm was given by R. Agrawal and R. Srikant in 1994.
  It is mainly used for market basket analysis and helps to find
  products that can be bought together. It can also be used in the
  healthcare field to find adverse drug reactions for patients.
Apriori Algorithm
What is Frequent Itemset?
• Frequent itemsets are those itemsets whose support is greater than the
  threshold value, i.e., the user‐specified minimum support. The Apriori
  property says that if {A, B} is a frequent itemset, then A and B must
  individually also be frequent itemsets.
• Suppose there are two transactions: A = {1,2,3,4,5} and B = {2,3,7}. In
  these two transactions, 2 and 3 are the items frequent in both.
Apriori Algorithm
Steps for Apriori Algorithm
Step‐1: Determine the support of each itemset in the transactional database,
and select the minimum support and minimum confidence.
Step‐2: Select all itemsets whose support value is higher than the
minimum (selected) support value.
Step‐3: Find all the rules over these subsets that have a higher confidence value
than the threshold (minimum confidence).
Step‐4: Sort the rules in decreasing order of lift.
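The steps above can be sketched as a level-wise search in Python. This is a simplified illustration (supports are counted by a linear scan, without the hash tree), not the full algorithm:

```python
def apriori(transactions, min_support):
    """Level-wise (breadth-first) frequent-itemset search.
    `transactions` is a list of frozensets; returns {itemset: support count}."""
    def count(candidates):
        return {c: sum(c <= t for t in transactions) for c in candidates}

    # Step 1: count single items, keep those meeting min_support (L1)
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {c: n for c, n in count(items).items() if n >= min_support}
    all_frequent = dict(frequent)
    k = 2
    while frequent:
        # Step 2 (repeated): join frequent (k-1)-itemsets into k-candidates,
        # then count and prune against min_support
        keys = list(frequent)
        candidates = {a | b for a in keys for b in keys if len(a | b) == k}
        frequent = {c: n for c, n in count(candidates).items()
                    if n >= min_support}
        all_frequent.update(frequent)
        k += 1
    return all_frequent

# Toy run: four transactions over items A, B, C
transactions = [frozenset(s) for s in ("AB", "AC", "ABC", "BC")]
print(apriori(transactions, min_support=2))
```

Rule generation (Step 3) would then compute confidences over these frequent itemsets, as in the worked example that follows.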
Apriori Algorithm
Apriori Algorithm Working
Example: Suppose we have the following dataset that has various
transactions, and from this dataset, we need to find the frequent itemsets
and generate the association rules using the Apriori algorithm:
Apriori Algorithm
Apriori Algorithm Working
Solution:
Step‐1: Calculating C1 and L1: In the first step, we will create a table that
contains support count (The frequency of each itemset individually in the
dataset) of each itemset in the given dataset. This table is called
the Candidate set or C1.
Apriori Algorithm
Apriori Algorithm Working
Solution:
Now, we will take all the itemsets that have a support count greater than or
equal to the Minimum Support (2). This gives us the table for the frequent
itemset L1. All the itemsets have a support count greater than or equal to
the minimum support except E, so the E itemset will be removed.
Apriori Algorithm
Apriori Algorithm Working
Solution:
Step‐2: Candidate Generation C2, and
L2:
• In this step, we will generate C2 with
the help of L1. In C2, we will create
the pair of the itemsets of L1 in the
form of subsets.
• After creating the subsets, we will
again find the support count from
the main transaction table of
datasets, i.e., how many times these
pairs have occurred together in the
given dataset. So, we will get the
below table for C2:
Apriori Algorithm
Apriori Algorithm Working
Solution:
• Again, we need to compare the C2 Support count with the minimum
support count, and after comparing, the itemset with less support count
will be eliminated from the table C2. It will give us the below table for L2
Apriori Algorithm
Apriori Algorithm Working
Solution:
Step‐3: Candidate generation C3, and L3: For C3, we will repeat the same
two processes, but now we will form the C3 table with subsets of three
itemsets together, and will calculate the support count from the dataset. It
will give the below table:
Apriori Algorithm
Apriori Algorithm Working
Solution:
• Now we will create the L3 table. As we can see from the above C3 table,
there is only one combination of itemset that has support count equal to
the minimum support count. So, the L3 will have only one combination,
i.e., {A, B, C}.
Step‐4: Finding the association rules for the subsets:
• To generate the association rules, first, we will create a new table with the
  possible rules from the discovered combination {A, B, C}. For each rule
  X → Y, we will calculate the Confidence using the formula sup(X ∪ Y)/sup(X).
  After calculating the confidence value for all rules, we will exclude the
  rules that have a confidence lower than the minimum threshold (50%).
• Consider the below table:
Apriori Algorithm
Apriori Algorithm Working
Solution:

Rules     | Support | Confidence
A^B → C   | 2       | sup{(A^B)^C}/sup(A^B) = 2/4 = 0.5 = 50%
B^C → A   | 2       | sup{(B^C)^A}/sup(B^C) = 2/4 = 0.5 = 50%
A^C → B   | 2       | sup{(A^C)^B}/sup(A^C) = 2/4 = 0.5 = 50%
C → A^B   | 2       | sup{C^(A^B)}/sup(C) = 2/5 = 0.4 = 40%
A → B^C   | 2       | sup{A^(B^C)}/sup(A) = 2/6 = 0.33 = 33.33%
B → A^C   | 2       | sup{B^(A^C)}/sup(B) = 2/7 = 0.29 = 28.57%

As the given threshold (minimum confidence) is 50%, the first three
rules, A^B → C, B^C → A, and A^C → B, can be considered strong
association rules for the given problem.
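The confidence column can be reproduced from the support counts implied by the table (sup(A)=6, sup(B)=7, sup(C)=5, each pair 4, and the triple {A, B, C} 2 — read off from the numerators and denominators shown). A small sketch to check the arithmetic:

```python
# Support counts read off the worked example's table.
sup = {
    frozenset("A"): 6, frozenset("B"): 7, frozenset("C"): 5,
    frozenset("AB"): 4, frozenset("BC"): 4, frozenset("AC"): 4,
    frozenset("ABC"): 2,
}

def conf(antecedent, consequent):
    """Confidence(X -> Y) = sup(X union Y) / sup(X)."""
    union = frozenset(antecedent) | frozenset(consequent)
    return sup[union] / sup[frozenset(antecedent)]

for lhs, rhs in [("AB", "C"), ("BC", "A"), ("AC", "B"),
                 ("C", "AB"), ("A", "BC"), ("B", "AC")]:
    print(f"{lhs} -> {rhs}: {conf(lhs, rhs):.2%}")
# Prints 50.00%, 50.00%, 50.00%, 40.00%, 33.33%, 28.57%
```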
Apriori Algorithm
Advantages of Apriori Algorithm
• The algorithm is easy to understand.
• The join and prune steps of the algorithm can be easily implemented on
  large datasets.

Disadvantages of Apriori Algorithm
• The Apriori algorithm is slow compared to other algorithms.
• The overall performance can be reduced because it scans the database
  multiple times.
• The time and space complexity of the Apriori algorithm is
  O(2^D), which is very high. Here D represents the horizontal width
  (number of distinct items) present in the database.
Artificial Neural Network
The term "Artificial Neural Network" is derived from Biological neural
networks that develop the structure of a human brain. Similar to the
human brain that has neurons interconnected to one another, artificial
neural networks also have neurons that are interconnected to one another
in various layers of the networks. These neurons are known as nodes.
Artificial Neural Network
The typical Artificial Neural Network looks something like the given
figure.

Dendrites from the biological neural network represent inputs in artificial
neural networks, the cell nucleus represents the nodes, synapses represent
the weights, and the axon represents the output.
Artificial Neural Network
Relationship between a biological neural network and an artificial neural
network:

Biological Neural Network | Artificial Neural Network
Dendrites                 | Inputs
Cell nucleus              | Nodes
Synapse                   | Weights
Axon                      | Output

• An artificial neural network is an attempt, in the field of artificial
  intelligence, to mimic the network of neurons that makes up the human brain,
  so that computers have an option to understand things and make
  decisions in a human‐like manner. The artificial neural network is
  designed by programming computers to behave simply like
  interconnected brain cells.
Artificial Neural Network
• There are around 100 billion neurons in the human brain.
• Each neuron has somewhere between 1,000 and 100,000 connection points.
  In the human brain, data is stored in a distributed manner, and we can
  extract more than one piece of this data from our memory in parallel
  when necessary.
• We can say that the human brain is made up of incredibly amazing parallel
  processors.
• We can understand the artificial neural network with an example. Consider a
  digital logic gate that takes an input and gives an output, such as an "OR"
  gate, which takes two inputs. If one or both of the inputs are "On," we get
  "On" in the output.
• If both inputs are "Off," we get "Off" in the output. Here the output
  depends only on the input. Our brain does not perform the same task: the
  output‐to‐input relationship keeps changing because the neurons in our brain
  are "learning."
Architecture of Artificial Neural Network
To understand the architecture of an artificial neural network, we have to
understand what a neural network consists of: a large number of artificial
neurons, termed units, arranged in a sequence of layers. Let us look at the
various types of layers available in an artificial neural network.
• An Artificial Neural Network primarily consists of three layers:
Architecture of Artificial Neural Network
• Input Layer: As the name suggests, it accepts inputs in several different
formats provided by the programmer.
• Hidden Layer: The hidden layer presents in‐between input and output
layers. It performs all the calculations to find hidden features and
patterns.
• Output Layer: The input goes through a series of transformations using
the hidden layer, which finally results in output that is conveyed using this
layer. The artificial neural network takes input and computes the weighted
sum of the inputs and includes a bias. This computation is represented in
the form of a transfer function.
Architecture of Artificial Neural Network
The weighted total is then passed as an input to an activation function
to produce the output. Activation functions choose whether a node should
fire or not. Only the nodes that fire reach the output layer. There are
distinct activation functions available that can be applied depending on
the sort of task we are performing.

Artificial neural networks, which entered science in the mid‐20th century,
are developing exponentially. In the present time, we have investigated the
pros of artificial neural networks and the issues encountered in the course
of their utilization. It should not be overlooked that the cons of ANNs,
a flourishing branch of science, are being eliminated one by one while
their pros increase day by day. This means that artificial neural networks
will progressively become an irreplaceable part of our lives.
Architecture of Artificial Neural Network
What is an activation function and why to use them?
• Definition of activation function: An activation function decides whether a
  neuron should be activated or not by calculating the weighted sum and
  further adding a bias to it. The purpose of the activation function is
  to introduce non‐linearity into the output of a neuron.
• Explanation:
  We know that a neural network has neurons that work in correspondence
  with their weights, biases, and respective activation functions. In a neural
  network, we update the weights and biases of the neurons on the
  basis of the error at the output. This process is known as back‐
  propagation. Activation functions make back‐propagation possible
  since the gradients are supplied along with the error to update the
  weights and biases.
Advantages of Artificial Neural Network (ANN)
• Parallel processing capability: Artificial neural networks can perform more
  than one task simultaneously.
• Storing data on the entire network: The learned information is stored on
  the whole network, not in a database, so the disappearance of a couple of
  pieces of data in one place doesn't prevent the network from working.
• Capability to work with incomplete knowledge: After training, an ANN may
  produce output even with inadequate data. The loss of performance here
  depends on the significance of the missing data.
• Having a memory distribution: For an ANN to be able to adapt, it is important
  to determine the examples and to train the network by demonstrating these
  examples to it. The success of the network is directly proportional to the
  chosen instances, and if the problem cannot be shown to the network in all
  its aspects, the network can produce false output.
• Having fault tolerance: Corruption of one or more cells of an ANN does not
  prevent it from generating output, and this feature makes the network
  fault‐tolerant.
Disadvantages of Artificial Neural Network (ANN)
• Assurance of proper network structure: There is no particular guideline for
  determining the structure of an artificial neural network. The appropriate
  network structure is found through experience and trial and error.
• Unrecognized behavior of the network: This is the most significant issue with
  ANNs. When an ANN produces a solution, it does not provide insight into
  why and how, which decreases trust in the network.
• Hardware dependence: Artificial neural networks need processors with parallel
  processing power, as per their structure. Their realization therefore depends
  on suitable hardware.
• Difficulty of showing the issue to the network: ANNs can work only with
  numerical data, so problems must be converted into numerical values before
  being introduced to the ANN. The representation mechanism chosen here
  directly impacts the performance of the network and relies on the user's
  abilities.
• The duration of training is unknown: Training is stopped when the error is
  reduced to a specific value, but this value does not guarantee optimum results.
How do Artificial Neural Networks work?
An Artificial Neural Network can be best
represented as a weighted directed
graph, where the artificial neurons
form the nodes. The associations
between neuron outputs and
neuron inputs can be viewed as
directed edges with weights. The
Artificial Neural Network receives
the input signal from an external
source in the form of a pattern or
image, represented as a vector. These
inputs are then mathematically
denoted by the notation x(n) for
each of the n inputs.
How do Artificial Neural Networks work?
• Afterward, each input is multiplied by its corresponding weight (these
  weights are the details the artificial neural network uses to solve a
  specific problem).
• In general terms, these weights represent the strength of the
  interconnections between neurons inside the artificial neural network. All
  the weighted inputs are summed inside the computing unit.
• A bias is added so that the output can be non‐zero even if the weighted
  sum is zero, or to scale up the system's response. The bias can be modeled
  as an extra input fixed at 1 with its own weight.
• The total of the weighted inputs can range from 0 to positive infinity.
  To keep the response within the limits of the desired value, a certain
  maximum value is benchmarked, and the total of the weighted inputs is
  passed through the activation function.
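The computation described above for a single neuron (weighted sum, bias, activation) can be sketched as follows; the sigmoid is one common choice of activation, and the input and weight values are arbitrary illustration values:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus bias,
    passed through a sigmoid activation to bound the response."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

# Example with two inputs: z = 1.0*0.4 + 0.5*(-0.2) + 0.1 = 0.4
print(neuron([1.0, 0.5], [0.4, -0.2], 0.1))  # sigmoid(0.4) ≈ 0.599
```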
How do Artificial Neural Networks work?
• The activation function refers to the set of transfer functions used to achieve
  the desired output. There are different kinds of activation functions, but
  primarily linear or non‐linear sets of functions. Some of the commonly
  used activation functions are the binary, linear, and tan hyperbolic
  sigmoidal activation functions. Let us take a look at each of them in detail:
• Binary: In a binary activation function, the output is either a one or a 0.
  To accomplish this, a threshold value is set up. If the net weighted input of
  the neuron is more than the threshold, the final output of the activation
  function is returned as 1, or else the output is returned as 0.
• Sigmoidal Hyperbolic: The sigmoidal hyperbolic function is generally seen as
  an "S"-shaped curve. Here the tan hyperbolic function is used to approximate
  the output from the actual net input. The function is defined as:
  F(x) = 1 / (1 + exp(−λx))
  where λ is the steepness parameter.
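The two activation functions described above can be written directly; `steepness` here plays the role of the λ parameter in the formula:

```python
import math

def binary(x, threshold=0.0):
    """Binary (step) activation: fire (1) if net input exceeds threshold."""
    return 1 if x > threshold else 0

def sigmoid(x, steepness=1.0):
    """Sigmoidal activation F(x) = 1 / (1 + exp(-lambda * x))."""
    return 1.0 / (1.0 + math.exp(-steepness * x))

print(binary(0.7))        # 1
print(sigmoid(0.0))       # 0.5
print(sigmoid(2.0, 2.0))  # 1 / (1 + e^-4) ≈ 0.982
```

A larger steepness value makes the "S" curve sharper, approaching the binary step function in the limit.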
Types of Artificial Neural Network
There are various types of Artificial Neural Networks (ANNs), modeled on the
neurons and network functions of the human brain, which an artificial neural
network imitates to perform its tasks. The majority of artificial neural
networks have some similarities with their more complex biological counterpart
and are very effective at their expected tasks, for example, segmentation or
classification.
• Feedback ANN: In this type of ANN, the output returns into the network to
  internally achieve the best‐evolved results. As per the University of
  Massachusetts Lowell Centre for Atmospheric Research, feedback networks feed
  information back into themselves and are well suited to solving optimization
  problems. Internal system error corrections utilize feedback ANNs.
• Feed‐Forward ANN: A feed‐forward network is a basic neural network
  comprising an input layer, an output layer, and at least one hidden layer of
  neurons. Through assessment of its output by reviewing its input, the
  intensity of the network can be noticed based on the group behavior of the
  associated neurons, and the output is decided. The primary advantage of this
  network is that it learns to evaluate and recognize input patterns.
Genetic Algorithms ‐ Introduction
• Genetic Algorithm (GA) is a search‐based optimization technique based
on the principles of Genetics and Natural Selection. It is frequently used
to find optimal or near‐optimal solutions to difficult problems which
otherwise would take a lifetime to solve. It is frequently used to solve
optimization problems, in research, and in machine learning.
Introduction to Optimization
• Optimization is the process of making something better. In any process,
we have a set of inputs and a set of outputs as shown in the following
figure.
Genetic Algorithms ‐ Introduction
• Optimization refers to finding the values of inputs in such a way that we
get the “best” output values. The definition of “best” varies from
problem to problem, but in mathematical terms, it refers to maximizing
or minimizing one or more objective functions, by varying the input
parameters.

• The set of all possible solutions or values which the inputs can take make
up the search space. In this search space, lies a point or a set of points
which gives the optimal solution. The aim of optimization is to find that
point or set of points in the search space.
Genetic Algorithms ‐ Introduction
What are Genetic Algorithms?
• Nature has always been a great source of inspiration to all mankind.
Genetic Algorithms (GAs) are search based algorithms based on the
concepts of natural selection and genetics. GAs are a subset of a much
larger branch of computation known as Evolutionary Computation.
• GAs were developed by John Holland and his students and colleagues at
the University of Michigan, most notably David E. Goldberg and has since
been tried on various optimization problems with a high degree of
success.
• In GAs, we have a pool or a population of possible solutions to the given
problem. These solutions then undergo recombination and mutation (like
in natural genetics), producing new children, and the process is repeated
over various generations.
Genetic Algorithms ‐ Introduction
• Each individual (or candidate solution) is assigned a fitness value (based
on its objective function value) and the fitter individuals are given a
higher chance to mate and yield more “fitter” individuals. This is in line
with the Darwinian Theory of “Survival of the Fittest”.
• In this way we keep “evolving” better individuals or solutions over
generations, till we reach a stopping criterion.
• Genetic Algorithms are sufficiently randomized in nature, but they
perform much better than random local search (in which we just try
various random solutions, keeping track of the best so far), as they exploit
historical information as well.
Genetic Algorithms ‐ Introduction
Advantages of GAs
• Does not require any derivative information (which may not be available
for many real‐world problems).
• Is faster and more efficient as compared to the traditional methods.
• Has very good parallel capabilities.
• Optimizes both continuous and discrete functions and also multi‐
objective problems.
• Provides a list of “good” solutions and not just a single solution.
• Always gets an answer to the problem, which gets better over time.
• Useful when the search space is very large and there are a large number
of parameters involved.
Genetic Algorithms ‐ Introduction
Limitations of GAs
• GAs are not suited for all problems, especially problems which are simple
and for which derivative information is available.
• Fitness value is calculated repeatedly which might be computationally
expensive for some problems.
• Being stochastic, there are no guarantees on the optimality or the quality
of the solution.
• If not implemented properly, the GA may not converge to the optimal
solution.
Genetic Algorithms ‐ Fundamentals
Basic Terminology
• Population − It is a subset of all the possible (encoded) solutions to the
given problem. The population for a GA is analogous to the population for
human beings except that instead of human beings, we have Candidate
Solutions representing human beings.
• Chromosomes − A chromosome is one such solution to the given
problem.
• Gene − A gene is one element position of a chromosome.
• Allele − It is the value a gene takes for a particular chromosome.
Genetic Algorithms ‐ Fundamentals
• Genotype − Genotype is the population in the computation space. In the computation
space, the solutions are represented in a way which can be easily understood and
manipulated using a computing system.
• Phenotype − Phenotype is the population in the actual real world solution space in
which solutions are represented in a way they are represented in real world
situations.
Genetic Algorithms ‐ Fundamentals
• Decoding and Encoding − For simple problems, the phenotype and
genotype spaces are the same. However, in most of the cases, the
phenotype and genotype spaces are different. Decoding is a process of
transforming a solution from the genotype to the phenotype space, while
encoding is a process of transforming from the phenotype to genotype
space. Decoding should be fast as it is carried out repeatedly in a GA
during the fitness value calculation.
• For example, consider the 0/1 Knapsack Problem. The phenotype space
  consists of solutions which just contain the item numbers of the items to
  be picked.
• However, in the genotype space it can be represented as a binary string
  of length n (where n is the number of items). A 1 at position x represents
  that the xth item is picked, while a 0 represents that it is not. This is a
  case where the genotype and phenotype spaces are different.
Genetic Algorithms ‐ Fundamentals
• Fitness Function − A fitness function simply defined is a function which takes the
solution as input and produces the suitability of the solution as the output. In some
cases, the fitness function and the objective function may be the same, while in
others it might be different based on the problem.
• Genetic Operators − These alter the genetic composition of the offspring. These
include crossover, mutation, selection, etc.
Basic Structure ‐ Genetic Algorithms
Basic Structure ‐ Genetic Algorithms
• We start with an initial population (which may be generated at random or seeded by
  other heuristics) and select parents from this population for mating. We apply
  crossover and mutation operators on the parents to generate new offspring. Finally,
  these offspring replace the existing individuals in the population, and the process
  repeats. In this way genetic algorithms actually try to mimic natural evolution to
  some extent.
• A generalized pseudo‐code for a GA is given in the following program −

GA()
   initialize population
   find fitness of population
   while (termination criteria is not reached) do
      parent selection
      crossover with probability pc
      mutation with probability pm
      decode and fitness calculation
      survivor selection
      find best
   return best
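A runnable version of this pseudo-code can be sketched on the OneMax problem (maximize the number of 1s in a bit string), using tournament selection for parents, one-point crossover, bit-flip mutation, and generational replacement; these operator choices are illustrative, not prescribed by the text:

```python
import random

def one_max_ga(length=20, pop_size=30, pc=0.9, pm=0.02, generations=100):
    """Minimal GA following the pseudo-code above.
    Fitness = number of 1s in the chromosome (OneMax)."""
    fitness = sum
    # initialize population
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # parent selection: binary tournament
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = select()[:], select()[:]
            if random.random() < pc:
                # crossover with probability pc (one-point)
                cut = random.randrange(1, length)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                # mutation with probability pm per gene (bit flip)
                nxt.append([1 - g if random.random() < pm else g for g in child])
        pop = nxt[:pop_size]  # survivor selection: generational replacement
    return max(pop, key=fitness)  # find best

best = one_max_ga()
print(sum(best), "ones out of 20")
```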
Genetic Algorithms ‐ Population
• A population is a subset of solutions in the current generation. It can also
  be defined as a set of chromosomes. There are several things to be kept
  in mind when dealing with a GA population −
• The diversity of the population should be maintained, otherwise it might lead to
  premature convergence.
• The population size should not be kept very large as it can cause the GA to slow
  down, while a smaller population might not be enough for a good mating pool.
  Therefore, an optimal population size needs to be decided by trial and error.
• The population is usually defined as a two‐dimensional array of size
  population_size × chromosome_size.
Genetic Algorithms ‐ Population
Population Initialization:
• Random Initialization − Populate the initial population with completely
random solutions.
• Heuristic initialization − Populate the initial population using a known heuristic
for the problem.

Population Models
• Steady State: In steady state GA, we generate one or two off‐springs in each
iteration and they replace one or two individuals from the population. A steady
state GA is also known as Incremental GA.
• Generational: In a generational model, we generate ‘n’ off‐springs, where n is
the population size, and the entire population is replaced by the new one at
the end of the iteration.
Genetic Algorithms ‐ Fitness Function
• The fitness function, simply defined, is a function which takes a candidate
  solution to the problem as input and produces as output how "fit" or
  how "good" the solution is with respect to the problem in consideration.
• Calculation of fitness value is done repeatedly in a GA and therefore it
should be sufficiently fast. A slow computation of the fitness value can
adversely affect a GA and make it exceptionally slow.
• In most cases the fitness function and the objective function are the
same as the objective is to either maximize or minimize the given
objective function. However, for more complex problems with multiple
objectives and constraints, an Algorithm Designer might choose to have a
different fitness function.
Genetic Algorithms ‐ Fitness Function
• A fitness function should possess the following characteristics −
• The fitness function should be sufficiently fast to compute.
• It must quantitatively measure how fit a given solution is or how fit individuals can
be produced from the given solution.
• In some cases, calculating the fitness function directly might not be
possible due to the inherent complexities of the problem at hand. In such
cases, we do fitness approximation to suit our needs.
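As a toy illustration (the OneMax objective below is an assumed example, not part of the slides), a fitness function can be as simple as:

```python
def fitness(chromosome):
    # OneMax: fitness is the number of 1-bits in the chromosome.
    # It is trivially fast to compute, as a fitness function should be.
    return sum(chromosome)

# fitness([1, 0, 1, 1]) -> 3
```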
Genetic Algorithms ‐ Parent Selection
• Parent Selection is the process of selecting parents which mate and
recombine to create off‐springs for the next generation. Parent selection
is very crucial to the convergence rate of the GA, as good parents drive the
population towards better and fitter solutions.
• Maintaining good diversity in the population is extremely crucial for the
success of a GA.
• When one extremely fit solution takes over the entire population, the
condition is known as premature convergence, which is undesirable in a
GA.
Genetic Algorithms ‐ Parent Selection
Fitness Proportionate Selection:
• Fitness Proportionate Selection is one of the most popular ways of parent
selection. In this every individual can become a parent with a probability
which is proportional to its fitness. Therefore, fitter individuals have a
higher chance of mating and propagating their features to the next
generation. Therefore, such a selection strategy applies a selection
pressure to the more fit individuals in the population, evolving better
individuals over time.
• Consider a circular wheel. The wheel is divided into n pies, where n is the
number of individuals in the population. Each individual gets a portion of
the circle which is proportional to its fitness value.
Genetic Algorithms ‐ Parent Selection
Two implementations of fitness proportionate selection are possible −
1. Roulette Wheel Selection: In this the circular wheel is divided as
described before. A fixed point is chosen on the wheel circumference and
the wheel is rotated. The region of the wheel which comes in front of the
fixed point is chosen as the parent. For the second parent, the same
process is repeated.
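A minimal Python sketch of roulette wheel selection (the list-based population and fitness representation is an assumption for the example):

```python
import random

def roulette_wheel_select(population, fitnesses):
    # Spin the wheel: pick a random point on the circumference; the
    # individual whose slice contains that point becomes a parent.
    pick = random.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point round-off
```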
Genetic Algorithms ‐ Parent Selection
2. Stochastic Universal Sampling (SUS): Stochastic Universal Sampling is
quite similar to roulette wheel selection; however, instead of having just
one fixed point, we have multiple equally spaced fixed points. Therefore, all
the parents are chosen in just one spin of the wheel. Such a setup also
encourages the highly fit individuals to be chosen at least once.
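SUS can be sketched as follows: one random spin fixes the first pointer, and the remaining pointers are spaced evenly around the wheel (the interface mirrors the roulette sketch above and is an assumption for the example):

```python
import random

def sus_select(population, fitnesses, n_parents):
    # One spin, n_parents equally spaced pointers on the wheel.
    total = sum(fitnesses)
    step = total / n_parents
    start = random.uniform(0, step)
    pointers = [start + i * step for i in range(n_parents)]
    parents = []
    running, idx = fitnesses[0], 0
    for p in pointers:
        while running < p:           # advance to the slice holding p
            idx += 1
            running += fitnesses[idx]
        parents.append(population[idx])
    return parents
```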
Genetic Algorithms ‐ Parent Selection
Tournament Selection:
• In K‐Way tournament selection, we select K individuals from the
population at random and select the best out of these to become a
parent. The same process is repeated for selecting the next parent.
Tournament Selection is also extremely popular in literature as it can
even work with negative fitness values.
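A short sketch of K-way tournament selection (note that only fitness comparisons are needed, which is why negative fitness values pose no problem):

```python
import random

def tournament_select(population, fitnesses, k=3):
    # Pick k contestants at random; the fittest of them becomes a
    # parent. Only comparisons are used, so negative fitness is fine.
    contestants = random.sample(range(len(population)), k)
    best = max(contestants, key=lambda i: fitnesses[i])
    return population[best]
```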
Genetic Algorithms ‐ Parent Selection
Rank Selection:
• Rank Selection also works with negative fitness values and is mostly used
when the individuals in the population have very close fitness values (this
usually happens at the end of the run). Under fitness proportionate
selection, such close fitness values lead to each individual having an
almost equal share of the pie, and hence each individual, no matter how fit
relative to the others, has approximately the same probability of getting
selected as a parent.
Genetic Algorithms ‐ Parent Selection
• This in turn leads to a loss of selection pressure towards fitter
individuals, making the GA make poor parent selections in such situations.
Rank selection avoids this by ranking every individual according to its
fitness and basing parent selection on rank rather than on raw fitness
values.
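A sketch of how rank-based selection probabilities can be computed (the linear ranking scheme below is one common choice, assumed here for illustration):

```python
def rank_probabilities(fitnesses):
    # Selection probability depends only on rank, not on the raw
    # fitness magnitude: the worst individual gets rank 1, the best
    # gets rank n; probabilities are proportional to rank.
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    ranks = {idx: r + 1 for r, idx in enumerate(order)}
    total = sum(ranks.values())
    return [ranks[i] / total for i in range(len(fitnesses))]
```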

Random Selection
• In this strategy we randomly select parents from the existing population.
There is no selection pressure towards fitter individuals and therefore this
strategy is usually avoided.
Genetic Algorithms ‐ Crossover
Introduction to Crossover
• The crossover operator is analogous to reproduction and biological
crossover. In this more than one parent is selected and one or more off‐
springs are produced using the genetic material of the parents. Crossover
is usually applied in a GA with a high probability – pc .
Crossover Operators
• It is to be noted that these crossover operators are very generic and the
GA Designer might choose to implement a problem‐specific crossover
operator as well.
Genetic Algorithms ‐ Crossover
Crossover Operators
1. One Point Crossover: In one-point crossover, a random crossover
point is selected and the tails of the two parents are swapped to get new
off‐springs.

2. Multi Point Crossover: Multi point crossover is a generalization of the
one‐point crossover wherein alternating segments are swapped to get new
off‐springs.
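A minimal Python sketch of one-point crossover (binary list chromosomes are assumed for the example):

```python
import random

def one_point_crossover(p1, p2):
    # Choose a random cut point and swap the tails of the two parents.
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]
```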
Genetic Algorithms ‐ Crossover
Crossover Operators
3. Uniform Crossover: In a uniform crossover, we don’t divide the
chromosome into segments; rather, we treat each gene separately. In this,
we essentially flip a coin for each gene to decide whether or not
it'll be included in the off‐spring. We can also bias the coin towards one parent,
to have more genetic material in the child from that parent.
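The gene-by-gene coin flip can be sketched as follows (the `bias` parameter is an assumption for the example: values above 0.5 favour genes from the first parent):

```python
import random

def uniform_crossover(p1, p2, bias=0.5):
    # For each gene, flip a (possibly biased) coin to decide which
    # parent it comes from; the second child gets the other gene.
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if random.random() < bias:
            c1.append(g1)
            c2.append(g2)
        else:
            c1.append(g2)
            c2.append(g1)
    return c1, c2
```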
Genetic Algorithms ‐ Crossover
Crossover Operators
4. Whole Arithmetic Recombination: This is commonly used for integer
representations and works by taking the weighted average of the two
parents by using the following formulae −
Child1 = α.x + (1‐α).y
Child2 = α.y + (1‐α).x
• Obviously, if α = 0.5, then both the children will be identical.
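The formulae above can be applied gene by gene, as in this sketch (real-valued gene lists are assumed):

```python
def arithmetic_recombination(x, y, alpha=0.5):
    # Each child is a weighted average of the two parents,
    # computed gene by gene.
    c1 = [alpha * a + (1 - alpha) * b for a, b in zip(x, y)]
    c2 = [alpha * b + (1 - alpha) * a for a, b in zip(x, y)]
    return c1, c2
```

With alpha = 0.5 both children come out identical, matching the observation above.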
Genetic Algorithms ‐ Crossover
Crossover Operators
5. Davis’ Order Crossover (OX1): OX1 is used for permutation based
crossovers with the intention of transmitting information about relative
ordering to the off‐springs. It works as follows −
• Create two random crossover points in the parent and copy the segment
between them from the first parent to the first offspring.
• Now, starting from the second crossover point in the second parent, copy
the remaining unused numbers from the second parent to the first child,
wrapping around the list.
• Repeat for the second child with the parent’s role reversed.
Genetic Algorithms ‐ Crossover
Crossover Operators
There exist a lot of other crossovers like Partially Mapped Crossover (PMX),
Order based crossover (OX2), Shuffle Crossover, Ring Crossover, etc.
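The OX1 steps above can be sketched in Python as follows (producing the first child; the second child is obtained by reversing the parents' roles):

```python
import random

def davis_order_crossover(p1, p2):
    # OX1 for permutations: copy a random segment from p1, then fill
    # the remaining slots with p2's unused genes, starting after the
    # segment and wrapping around the list.
    n = len(p1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    used = set(p1[a:b + 1])
    genes = [p2[(b + 1 + i) % n] for i in range(n)
             if p2[(b + 1 + i) % n] not in used]
    slots = [(b + 1 + i) % n for i in range(n)
             if child[(b + 1 + i) % n] is None]
    for pos, g in zip(slots, genes):
        child[pos] = g
    return child
```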
Genetic Algorithms ‐ Mutation
• In simple terms, mutation may be defined as a small random tweak in the
chromosome, to get a new solution. It is used to maintain and introduce
diversity in the genetic population and is usually applied with a low
probability – pm. If the probability is very high, the GA gets reduced to a
random search.
• Mutation is the part of the GA which is related to the “exploration” of the
search space. It has been observed that mutation is essential to the
convergence of the GA while crossover is not.
Mutation Operators
• Like the crossover operators, this is not an exhaustive list and the GA
designer might find a combination of these approaches or a problem‐
specific mutation operator more useful.
Genetic Algorithms ‐ Mutation
Mutation Operators
1. Bit Flip Mutation: In this bit flip mutation, we select one or more
random bits and flip them. This is used for binary encoded GAs.

2. Random Resetting: Random Resetting is an extension of the bit flip for
the integer representation. In this, a random value from the set of
permissible values is assigned to a randomly chosen gene.
3. Swap Mutation: In swap mutation, we select two positions on the
chromosome at random, and interchange the values. This is common in
permutation based encodings.
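Bit flip and swap mutation can be sketched as follows (the per-gene mutation probability `pm` is an assumption for the example):

```python
import random

def bit_flip_mutation(chrom, pm=0.1):
    # Flip each bit independently with small probability pm.
    return [1 - g if random.random() < pm else g for g in chrom]

def swap_mutation(perm):
    # Interchange the values at two random positions.
    i, j = random.sample(range(len(perm)), 2)
    perm = perm[:]  # copy so the parent is left untouched
    perm[i], perm[j] = perm[j], perm[i]
    return perm
```

Note that swap mutation preserves the multiset of genes, which is why it suits permutation encodings.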
Genetic Algorithms ‐ Mutation
Mutation Operators
4. Scramble Mutation: Scramble mutation is also popular with
permutation representations. In this, from the entire chromosome, a
subset of genes is chosen and their values are scrambled or shuffled
randomly.

5. Inversion Mutation: In inversion mutation, we select a subset of genes
like in scramble mutation, but instead of shuffling the subset, we merely
invert the entire string in the subset.
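Both operators can be sketched side by side; the only difference is whether the chosen segment is shuffled or reversed:

```python
import random

def scramble_mutation(perm):
    # Shuffle the genes between two random cut points.
    i, j = sorted(random.sample(range(len(perm)), 2))
    segment = perm[i:j + 1]
    random.shuffle(segment)
    return perm[:i] + segment + perm[j + 1:]

def inversion_mutation(perm):
    # Reverse (rather than shuffle) the chosen segment.
    i, j = sorted(random.sample(range(len(perm)), 2))
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]
```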
Genetic Algorithms ‐ Survivor Selection
• The Survivor Selection Policy determines which individuals are to be
kicked out and which are to be kept in the next generation. It is crucial as
it should ensure that the fitter individuals are not kicked out of the
population, while at the same time diversity should be maintained in the
population.
• Some GAs employ Elitism. In simple terms, it means the current fittest
member of the population is always propagated to the next generation.
Therefore, under no circumstance can the fittest member of the current
population be replaced.
• The easiest policy is to kick random members out of the population, but
such an approach frequently has convergence issues, therefore the
following strategies are widely used.
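A minimal sketch of a generational replacement step with elitism (the interface is an assumption for the example: `fitness` is a callable and individuals are stored in plain lists):

```python
def next_generation(population, offspring, fitness, elite=1):
    # Elitism: the 'elite' fittest current members always survive;
    # the remaining slots are filled from the offspring.
    ranked = sorted(population, key=fitness, reverse=True)
    return (ranked[:elite] + offspring)[:len(population)]
```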
Genetic Algorithms ‐ Survivor Selection
Age Based Selection
• In Age-Based Selection, fitness plays no role. It is based on the premise
that each individual is allowed in the population for a finite number of
generations, during which it is allowed to reproduce; after that, it is
kicked out of the population no matter how good its fitness is.

Fitness Based Selection
• In this fitness based selection, the children tend to replace the least fit
individuals in the population. The selection of the least fit individuals may
be done using a variation of any of the selection policies described before
– tournament selection, fitness proportionate selection, etc.
Genetic Algorithms ‐ Termination Condition
The termination condition of a Genetic Algorithm is important in
determining when a GA run will end. It has been observed that initially, the
GA progresses very fast with better solutions coming in every few
iterations, but this tends to saturate in the later stages where the
improvements are very small. We usually want a termination condition
such that our solution is close to the optimal, at the end of the run.
Usually, we keep one of the following termination conditions −
• When there has been no improvement in the population for X iterations.
• When we reach an absolute number of generations.
• When the objective function value has reached a certain pre‐defined
value.
Genetic Algorithms ‐ Termination Condition
• For example, in a genetic algorithm we keep a counter which keeps track
of the generations for which there has been no improvement in the
population. Initially, we set this counter to zero. Each time we don’t
generate off‐springs which are better than the individuals in the
population, we increment the counter.
• However, if the fitness of any of the off‐springs is better, then we reset
the counter to zero. The algorithm terminates when the counter reaches a
predetermined value.
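The stall counter described above can be sketched as follows (the `step` callable, which advances one generation and returns the best fitness seen, is an assumed interface for the example):

```python
def run_ga(step, max_stall=20, max_generations=500):
    # Run until no improvement for max_stall generations, or until
    # an absolute number of generations is reached.
    best, stall = float("-inf"), 0
    for _ in range(max_generations):
        current = step()
        if current > best:
            best, stall = current, 0   # improvement: reset the counter
        else:
            stall += 1                 # no improvement: increment it
        if stall >= max_stall:
            break                      # stagnated: terminate the run
    return best
```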
• Like other parameters of a GA, the termination condition is also highly
problem specific, and the GA designer should try out various options to
see what suits the particular problem best.
