A Tutorial On Differential Evolution With Python - Pablo R. Mier
I have to admit that I'm a great fan of the Differential Evolution (DE) algorithm. This algorithm, invented by R. Storn and K. Price (https://link.springer.com/article/10.1023%2FA%3A1008202821328?LI=true) in 1997, is a very powerful algorithm for black-box optimization (also called derivative-free optimization). Black-box optimization is about finding the minimum of a function $f(x): \mathbb{R}^n \rightarrow \mathbb{R}$, where we don't know its analytical form, and therefore no derivatives can be computed to minimize it (or they are hard to approximate). The figure below shows how the DE algorithm approximates the minimum of a function in successive steps:

The optimization of black-box functions is very common in real-world problems, where the function to be optimized is very complex (and may involve the use of simulators or external software for the computations). For this kind of problem, DE works pretty well, and that's
why it's very popular for solving problems in many different fields, including Astronomy, Chemistry, Biology, and many more. For example, the European Space Agency (ESA) uses DE to design optimal trajectories (http://www.esa.int/gsp/ACT/doc/INF/pub/ACT-RPR-INF-2014-(PPSN)CstrsOptJupiterCapture.pdf) in order to reach the orbit of a planet using as little fuel as possible. Sounds awesome, right? Best of all, the algorithm is very simple to understand and to implement. In this tutorial, we will see how to implement it, how to use it to solve some problems, and we will build intuition about how DE works.
Let’s start!
Before getting into more technical details, let’s get our hands dirty. One thing that
fascinates me about DE is not only its power but its simplicity, since it can be implemented
in just a few lines. Here is the code for the DE algorithm using the rand/1/bin schema (we
will talk about what this means later). It only took me 27 lines of code using Python with
Numpy:
```python
 1 import numpy as np
 2
 3 def de(fobj, bounds, mut=0.8, crossp=0.7, popsize=20, its=1000):
 4     dimensions = len(bounds)
 5     pop = np.random.rand(popsize, dimensions)
 6     min_b, max_b = np.asarray(bounds).T
 7     diff = np.fabs(min_b - max_b)
 8     pop_denorm = min_b + pop * diff
 9     fitness = np.asarray([fobj(ind) for ind in pop_denorm])
10     best_idx = np.argmin(fitness)
11     best = pop_denorm[best_idx]
12     for i in range(its):
13         for j in range(popsize):
14             idxs = [idx for idx in range(popsize) if idx != j]
15             a, b, c = pop[np.random.choice(idxs, 3, replace=False)]
16             mutant = np.clip(a + mut * (b - c), 0, 1)
17             cross_points = np.random.rand(dimensions) < crossp
18             if not np.any(cross_points):
19                 cross_points[np.random.randint(0, dimensions)] = True
20             trial = np.where(cross_points, mutant, pop[j])
21             trial_denorm = min_b + trial * diff
22             f = fobj(trial_denorm)
23             if f < fitness[j]:
24                 fitness[j] = f
25                 pop[j] = trial
26                 if f < fitness[best_idx]:
27                     best_idx = j
28                     best = trial_denorm
29         yield best, fitness[best_idx]
```
differential_evolution.py (https://gist.github.com/pablormier/0caff10a5f76e87857b44f63757729b0)
This code is completely functional: you can paste it into a Python terminal and start playing with it (you need numpy >= 1.7.0). Don't worry if you don't understand everything yet; we will see later what each line of this code means. The good thing is that we can start playing with it right now without knowing how it works. The only two mandatory parameters that we need to provide are fobj and bounds:
fobj: the function $f(x)$ to optimize. It can be a function defined with a def or a lambda expression. For example, suppose we want to minimize the function $f(x) = \sum_{i=1}^{n} x_i^2 / n$. If x is a numpy array, our fobj can be defined as:

```python
def fobj(x):
    value = 0
    for i in range(len(x)):
        value += x[i]**2
    return value / len(x)
```

bounds: a list with the lower and upper bound for each parameter of the function. For example, bounds = [(-5, 5), (-5, 5), (-5, 5), (-5, 5)] means that each variable $x_i$, $i \in \{1, \dots, 4\}$, is bound to the interval [-5, 5].
For example, let's find the value of x that minimizes the function $f(x) = x^2$, looking for values of x between -100 and 100:
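A minimal call that does this (a sketch):

```python
>>> it = list(de(lambda x: x**2, bounds=[(-100, 100)]))
>>> print(it[-1])
```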
The first value returned (array([ 0.])) represents the best value found for x (in this case it is just a single number, since the function is 1-D), and the value of f(x) for that x is returned as the second value (array([ 0.])).
Note: for convenience, I defined the de function as a generator function that yields the best solution x and its corresponding value of $f(x)$ at each iteration. In order to obtain the last solution, we only need to consume the iterator, or convert it to a list and obtain the last value with list(de(...))[-1].
Yeah I know, this is too easy. Now, let's try the same example in a multi-dimensional setting, with the function now defined as $f(x) = \sum_{i=1}^{n} x_i^2 / n$, for n=32 dimensions. This is what it looks like in 2D:
```python
# https://github.com/pablormier/yabox
# pip install yabox
>>> from yabox.problems import problem
>>> problem(lambda x: sum(x**2)/len(x), bounds=[(-5, 5)] * 2).plot3d()
```
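A sketch of such a run (assuming the same [-5, 5] bounds and the default 1,000 iterations):

```python
>>> result = list(de(lambda x: sum(x**2)/len(x), bounds=[(-5, 5)] * 32))
>>> best_x, best_f = result[-1]
```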
This time the best value for f(x) was 6.346, so we didn't obtain the optimal solution $f(0, \dots, 0) = 0$. Why? DE doesn't guarantee to obtain the global minimum of a function. What it does is approach the global minimum in successive steps, as shown in Fig. 1. So, in general, the more complex the function, the more iterations are needed. This raises a new question: how does the dimensionality of a function affect the convergence of the algorithm? In general terms, the difficulty of finding the optimal solution increases exponentially with the number of dimensions (parameters). This effect is called the "curse of dimensionality" (https://en.wikipedia.org/wiki/Curse_of_dimensionality). For example, suppose we want to find the minimum of a 2D function whose input values are binary. Since they are binary and there are only two possible values for each one, we would need to evaluate in the worst case $2^2 = 4$ combinations of values: $f(0,0)$, $f(0,1)$, $f(1,0)$ and $f(1,1)$. But if we have 32 parameters, we would need to evaluate the function for a total of $2^{32}$ = 4,294,967,296 possible combinations in the worst case (the size of the search space grows exponentially). This makes the problem much, much more difficult, and any metaheuristic algorithm like DE would need many more iterations to find a good approximation. Knowing this, let's run the algorithm again, but for 3,000 iterations instead of just 1,000:
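A sketch of the same run with its=3000:

```python
>>> result = list(de(lambda x: sum(x**2)/len(x), bounds=[(-5, 5)] * 32, its=3000))
>>> best_x, best_f = result[-1]
```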
Now we obtained a much better solution, with a value very close to 0. In this case we only needed a few thousand iterations to obtain a good approximation, but with complex functions we would need many more iterations, and yet the algorithm could get trapped in a local minimum. We can plot the convergence of the algorithm very easily (this is where the implementation using a generator function comes in handy):
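A minimal plotting sketch:

```python
>>> import matplotlib.pyplot as plt
>>> solutions, fitnesses = zip(*result)  # unzip the (best, fitness) pairs
>>> plt.plot(fitnesses)
>>> plt.xlabel('Iteration')
>>> plt.ylabel('f(x)')
```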
Fig. 2 shows how the best solution found by the algorithm gets closer and closer to the global minimum as more iterations are executed. Now we can represent in a single plot how the complexity of the function affects the number of iterations needed to obtain a good approximation:
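One way to produce such a plot (a sketch; the compared dimensionalities are an assumption):

```python
>>> for d in [8, 16, 32, 64]:
...     it = list(de(lambda x: sum(x**2)/len(x), bounds=[(-5, 5)] * d, its=3000))
...     plt.plot([f for _, f in it], label='d = {}'.format(d))
>>> plt.legend()
```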
The plot makes it clear that when the number of dimensions grows, the number of iterations
required by the algorithm to find a good solution grows as well.
How does it work?
Now it's time to talk about how these 27 lines of code work. Differential Evolution, as the name suggests, is a type of evolutionary algorithm. An evolutionary algorithm is an algorithm
that uses mechanisms inspired by the theory of evolution, where the fittest individuals of a
population (the ones that have the traits that allow them to survive longer) are the ones that
produce more offspring, which in turn inherit the good traits of the parents. This makes the
new generation more likely to survive in the future as well, and so the population improves
over time, generation after generation. This is possible thanks to different mechanisms
present in nature, such as mutation, recombination and selection, among others.
Evolutionary algorithms apply some of these principles to evolve a solution to a problem.
Components
The algorithm only needs the two mandatory components we already introduced:
fobj: the objective function to minimize. It takes a candidate solution (a vector of parameters) and returns a value measuring how good it is.
bounds: a list with the lower and upper bound for each parameter, which defines the search space.
Initialization
The first step in every evolutionary algorithm is the creation of a population with popsize
individuals. An individual is just an instantiation of the parameters of the function fobj. At
the beginning, the algorithm initializes the individuals by generating random values for each
parameter within the given bounds. For convenience, I generate uniform random numbers
between 0 and 1, and then I scale the parameters (denormalization) to obtain the
corresponding values. This is done in lines 4-8 of the algorithm.
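An equivalent sketch for a 4-parameter problem with 10 individuals:

```python
>>> popsize, dimensions = 10, 4
>>> pop = np.random.rand(popsize, dimensions)  # uniform random values in [0, 1)
```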
This generates our initial population of 10 random vectors. Each component x[i] is normalized to the interval [0, 1]. We will use the bounds to denormalize each component only when evaluating them with fobj.
Evaluation
The next step is to apply a linear transformation to convert each component from [0, 1] to
[min, max]. This is only required to evaluate each vector with the function fobj:
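A sketch of that transformation, assuming the bounds = [(-5, 5)] * 4 from before:

```python
>>> bounds = [(-5, 5)] * 4
>>> min_b, max_b = np.asarray(bounds).T
>>> diff = np.fabs(min_b - max_b)
>>> pop_denorm = min_b + pop * diff  # scale each component from [0, 1] to [-5, 5]
```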
At this point we have our initial population of 10 vectors, and now we can evaluate them using our fobj. Although these vectors are random points of the function space, some of them are better than others (they have a lower $f(x)$). Let's evaluate them:
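A sketch mirroring lines 9-11 of the algorithm:

```python
>>> fitness = np.asarray([fobj(ind) for ind in pop_denorm])
>>> best_idx = np.argmin(fitness)
>>> pop_denorm[best_idx], fitness[best_idx]
```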
After evaluating these random vectors, we can see that the vector x = [3., -0.68, -4.43, -0.57] is the best of the population, with $f(x) = 7.34$, so these values should be closer to the ones we're looking for. The evaluation of this initial population is done in line 9 of the algorithm and stored in the variable fitness.
Mutation
The next step is mutation. For each vector pop[j] in the population, we randomly select three other vectors a, b and c, all distinct from each other and from pop[j] (lines 14-15 of the algorithm). For the first vector (j=0):

```python
>>> idxs = [idx for idx in range(popsize) if idx != 0]
>>> idxs
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> selected = np.random.choice(idxs, 3, replace=False)
>>> selected
array([1, 4, 7])
>>> a, b, c = pop[selected]
```
Now, we create a mutant vector by combining a, b and c. How? By computing the difference (now you know why it's called differential evolution) between b and c and adding it to a, after multiplying it by a constant called the mutation factor (the parameter mut). A larger mutation factor increases the search radius but may slow down the convergence of the algorithm. Values for mut are usually chosen from the interval [0.5, 2.0]. For this example, we will use the default value of mut = 0.8:
```python
>>> mut = 0.8
>>> mutant = a + mut * (b - c)
```
Note that after this operation, we can end up with a vector that is not normalized (the second value is greater than 1 and the third one is smaller than 0). The next step is to fix those situations. There are two common methods: generating a new random value in the interval [0, 1], or clipping the values to the interval, so that values greater than 1 become 1 and values smaller than 0 become 0. I chose the second option just because it can be done in one line of code using numpy.clip:

```python
>>> mutant = np.clip(mutant, 0, 1)
```
Now that we have our mutant vector, the next step is called recombination. Recombination is about mixing the information of the mutant with the information of the current (target) vector to create a trial vector. This is done by replacing the numbers at some positions in the current vector with the ones in the mutant vector. For each position, we decide (with a probability defined by crossp) whether that number will be replaced by the one in the mutant at the same position. To generate the crossover points, we just need to generate uniform random values in [0, 1] and check if they are less than crossp. This method is called binomial crossover, since the number of selected locations follows a binomial distribution.
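A sketch of this step (line 17 of the algorithm), consistent with the result described next:

```python
>>> crossp = 0.7
>>> cross_points = np.random.rand(dimensions) < crossp
>>> cross_points
array([False,  True, False,  True])
```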
In this case we obtained two True values, at positions 1 and 3, which means that the values at positions 1 and 3 of the current vector will be taken from the mutant. This can be done in one line again, using the numpy function where:
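For our example with j=0 (line 20 of the algorithm):

```python
>>> trial = np.where(cross_points, mutant, pop[0])
```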
Replacement
After generating our new trial vector, we need to denormalize it and evaluate it to measure how good it is. If the trial vector is better than the current vector (pop[0]), then we replace the current vector with the new one:
```python
>>> trial_denorm = min_b + trial * diff  # denormalize (line 21 of the algorithm)
>>> fobj(trial_denorm)
13.425000000000001
```
In this case, the trial vector is worse than the target vector (13.425 > 12.398), so the target vector is preserved and the trial vector discarded. All these steps are then repeated for the remaining individuals (pop[j] for j=1 to j=9), which completes the first iteration of the algorithm. After this process, some of the original vectors of the population will have been replaced by better ones, and after many iterations the whole population will eventually converge towards the solution (it's a kind of magic, uh?).
Let's now see the algorithm in action with another concrete example. Given a set of points (x, y), the goal of the curve fitting problem is to find the polynomial that best fits the given points by minimizing, for example, the sum of the distances between each point and the curve. For this purpose, we are going to generate our set of observations (x, y) using the function $f(x) = \cos(x)$, and adding a small amount of gaussian noise:
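A sketch of the data generation (the sample size, range and noise level are assumptions):

```python
>>> x = np.linspace(0, 10, 500)
>>> y = np.cos(x) + np.random.normal(0, 0.2, 500)  # cos(x) plus gaussian noise
```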
Figure 5. Dataset of 2D points (x, y) generated using the function y = cos(x) with
gaussian noise.
Our goal is to fit a curve (defined by a polynomial) to the set of points that we generated before. This curve should be close to the original $f(x) = \cos(x)$ used to generate the points. We need a polynomial with enough degrees of freedom to produce at least the 4 bends present in the curve. For this purpose, a polynomial of degree 5 should be enough (you can try with more/fewer degrees to see what happens):
$f_{model}(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + w_3 x^3 + w_4 x^4 + w_5 x^5$
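A minimal definition of this model, consistent with the calls used below (fmodel(x, w)):

```python
def fmodel(x, w):
    # evaluate the degree-5 polynomial with coefficients w at the points x
    return sum(wi * x**i for i, wi in enumerate(w))
```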
Using this expression, we can generate an infinite set of possible curves. For example:
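For instance, we can plot a few polynomials with random coefficients (a sketch; the coefficient range is an assumption):

```python
>>> for _ in range(10):
...     w = np.random.rand(6) * 10 - 5  # random coefficients in [-5, 5)
...     plt.plot(x, fmodel(x, w))
>>> plt.ylim(-2, 2)  # keep the view around the data points
```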
Among this infinite set of curves, we want the one that best approximates the original function $f(x) = \cos(x)$. For this purpose, we need a function that measures how good a polynomial is. We can use, for example, the Root Mean Square Error (RMSE) (https://en.wikipedia.org/wiki/Root-mean-square_deviation) function:
```python
def rmse(w):
    y_pred = fmodel(x, w)
    return np.sqrt(sum((y - y_pred)**2) / len(y))
```
Now we have a clear description of our problem: we need to find the parameters $\mathbf{w} = (w_0, w_1, w_2, w_3, w_4, w_5)$ of our degree-5 polynomial that minimize the rmse function. Let's evolve a population of 20 random polynomials for 2,000 iterations with DE:
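A sketch of this run (the coefficient bounds are an assumption):

```python
>>> result = list(de(rmse, bounds=[(-5, 5)] * 6, its=2000))
>>> best_w, best_rmse = result[-1]
```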
We obtained a solution with an RMSE of ~0.215. We can plot this polynomial to see how good our approximation is:
```python
>>> plt.scatter(x, y)
>>> plt.plot(x, np.cos(x), label='cos(x)')
>>> plt.plot(x, fmodel(x, [0.99677643, 0.47572443, -1.39088333,
...                        0.50950016, -0.06498931, 0.00273167]), label='result')
>>> plt.legend()
```
Figure 7. Approximation of the original function f (x) = cos(x) used to generate the data
points, after 2000 iterations with DE.
Not bad at all! Now let's see the algorithm in action, and how it evolves the population of random vectors until all of them converge towards the solution. It is very easy to create an animation with matplotlib, using a slight modification of our original DE implementation that yields the entire population after each iteration instead of just the best vector:
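A sketch of that modification (here named de_all; the name and exact form are assumptions), yielding the denormalized population, the fitness array and the index of the best individual, which is what the animation code below expects:

```python
def de_all(fobj, bounds, mut=0.8, crossp=0.7, popsize=20, its=1000):
    # same as de(), but yields the whole population at every iteration
    dimensions = len(bounds)
    pop = np.random.rand(popsize, dimensions)
    min_b, max_b = np.asarray(bounds).T
    diff = np.fabs(min_b - max_b)
    fitness = np.asarray([fobj(ind) for ind in (min_b + pop * diff)])
    best_idx = np.argmin(fitness)
    for i in range(its):
        for j in range(popsize):
            idxs = [idx for idx in range(popsize) if idx != j]
            a, b, c = pop[np.random.choice(idxs, 3, replace=False)]
            mutant = np.clip(a + mut * (b - c), 0, 1)
            cross_points = np.random.rand(dimensions) < crossp
            if not np.any(cross_points):
                cross_points[np.random.randint(0, dimensions)] = True
            trial = np.where(cross_points, mutant, pop[j])
            f = fobj(min_b + trial * diff)
            if f < fitness[j]:
                fitness[j] = f
                pop[j] = trial
                if f < fitness[best_idx]:
                    best_idx = j
        # yield the denormalized population, fitness values and best index
        yield min_b + pop * diff, fitness, best_idx
```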
```python
from matplotlib.animation import FuncAnimation

# evolve the polynomial, keeping the whole population at each iteration
# (de_all is the modified generator sketched above)
result = list(de_all(rmse, [(-5, 5)] * 6, its=2000))

fig, ax = plt.subplots()

def animate(i):
    ax.clear()
    ax.set_ylim([-2, 2])
    ax.scatter(x, y)
    pop, fit, idx = result[i]
    for ind in pop:
        data = fmodel(x, ind)
        ax.plot(x, data, alpha=0.3)

anim = FuncAnimation(fig, animate, frames=len(result))
```
The animation shows how the different vectors in the population (each one corresponding
to a different curve) converge towards the solution after a few iterations.
DE variations
The schema used in this version of the algorithm is called rand/1/bin because the vectors
are randomly chosen (rand), we only used 1 vector difference and the crossover strategy
used to mix the information of the trial and the target vectors was a binomial crossover. But
there are other variants:
Mutation schemas
Rand/1: $x_{mut} = x_{r_1} + F (x_{r_2} - x_{r_3})$
Best/1: $x_{mut} = x_{best} + F (x_{r_1} - x_{r_2})$
Rand/2: $x_{mut} = x_{r_1} + F (x_{r_2} - x_{r_3}) + F (x_{r_4} - x_{r_5})$
Crossover schemas
Binomial (bin): crossover due to independent binomial experiments. Each component of the target vector has a probability p (given by crossp) of being replaced by the corresponding component of the mutant vector.
Exponential (exp): a two-point crossover operator, where two locations of the vector are randomly chosen, so that the consecutive components between the two locations are taken from the mutant vector.
Final words
Differential Evolution (DE) is a very simple but powerful algorithm for optimization of
complex functions that works pretty well in those problems where other techniques (such
as Gradient Descent) cannot be used. In this post, we’ve seen how to implement it in just 27
lines of Python with Numpy, and we’ve seen how the algorithm works step by step.
If you are looking for a Python library for black-box optimization that includes the
Differential Evolution algorithm, here are some:
Scipy (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html). The well-known scientific library for Python includes a fast implementation of the Differential Evolution algorithm.
Yabox (https://github.com/pablormier/yabox). My own library for black-box optimization (used for the 3D plot earlier in this tutorial), which also includes an implementation of DE.
Categories: Tutorials