Donofrio 2015
Contents

List of Figures
1 Introduction
   1.1 State of the Art
   1.2 Motivation and goals
   1.3 Structure of the thesis
9 Conclusions
   9.1 Lessons Learned
   9.2 Future Developments
Bibliography
List of Figures

5.1 Jacobian Matrix Sparsity Patterns for the Space Shuttle Problem.
5.2 Jacobian Matrix Sparsity Patterns for the Orbit Raising Problem.
5.3 Jacobian Matrix Sparsity Patterns for the Hang Glider Problem.
6.1 Maximum error in the Jacobian Matrix for the Space Shuttle Reentry Problem.
6.2 Maximum error in the Jacobian Matrix for the Orbit Raising Problem.
6.3 Maximum error in the Jacobian Matrix for the Hang Glider Problem.
6.4 Maximum error in the Jacobian Matrix for the Space Shuttle Reentry Problem.
6.5 Maximum error in the Jacobian Matrix for the Orbit Raising Problem.
6.6 Maximum error in the Jacobian Matrix for the Hang Glider Problem.
6.7 Maximum error in the Jacobian Matrix for the Space Shuttle Reentry Problem.
6.8 Maximum error in the Jacobian Matrix for the Orbit Raising Problem.
6.9 Maximum error in the Jacobian Matrix for the Hang Glider Problem.
6.10 CPU Time Required for the Space Shuttle Reentry Problem.
6.11 CPU Time Required for the Orbit Raising Problem.
6.12 CPU Time Required for the Hang Glider Problem.
6.13 CPU Time Required for the Space Shuttle Reentry Problem.
6.14 CPU Time Required for the Orbit Raising Problem.
6.15 CPU Time Required for the Hang Glider Problem.
List of Tables

6.1 Accuracy and step size comparison for the Space Shuttle Problem.
6.2 Accuracy and step size comparison for the Orbit Raising Problem.
6.3 Accuracy and step size comparison for the Hang Glider Problem.
7.1 Accuracy and CPU Time comparison for the Space Shuttle Problem (SNOPT).
7.2 Accuracy and CPU Time comparison for the Space Shuttle Problem (IPOPT).
7.3 Accuracy and CPU Time comparison for the Orbit Raising Problem (SNOPT).
7.4 Accuracy and CPU Time comparison for the Orbit Raising Problem (IPOPT).
7.5 Accuracy and CPU Time comparison for the Hang Glider Problem (SNOPT).
7.6 Accuracy and CPU Time comparison for the Hang Glider Problem (IPOPT).
Chapter 1
Introduction
are used to compare the exact analytical Jacobian with the numerical one,
presented in Chapter 6, and computed using the numerical differentiations
previously defined.
In Chapter 7, the differentiation schemes are used to solve optimal control
problems. A general formulation of an OCP is shown, then the differentiation
schemes are implemented in SPARTAN. Three numerical examples of OCPs
are studied: the maximization of the final crossrange in the space shuttle
reentry trajectory, the maximization of the final specific energy in an orbit
raising problem, and the maximization of the final range of a hang glider in
the presence of a specific updraft. Each of the three examples is solved using
two different off-the-shelf, well-known NLP solvers: SNOPT and IPOPT. The
results obtained using the different differentiation schemes are thoroughly
inspected in terms of accuracy and CPU time.
In Chapter 8, we deal with the problem of the differentiation of signals available
in real time, with the aim of designing a robust differentiator based on the
sliding mode technique. Two sliding mode robust differentiators are examined
on tutorial examples and simulated in Simulink.
Chapter 2
Finite Difference Traditional
Schemes
Overview
Direct methods for optimal control use gradient-based techniques for solving
the NLP (Nonlinear Programming problem). Gradient methods for solving
NLPs require the computation of the derivatives of the objective function
and constraints, and the accuracy of these derivatives has a strong impact on
the computational efficiency and reliability of the solutions.
The most obvious way to compute the derivatives of the objective function
and constraints is by analytical differentiation. This approach is appealing
because analytical derivatives are exact and generally result in faster
optimization but, in many cases, it is impractical to compute them. For this
reason, the aim of the following discussion is to employ alternative means
to obtain the necessary gradients.
The most basic way to estimate derivatives is by finite difference
approximation. In this chapter the most common methods for the finite
difference approximation of a derivative are discussed.
The principle of the finite difference methods consists in approximating the
differential operator with a discrete differential operator. Considering a
generic one-dimensional function f(x) defined in the domain D = [0, X], it is
possible to identify N grid points

x_i = ih,   i = 0, 1, ..., N − 1,

where h is the mesh size. The finite difference schemes calculate derivatives
by approximating them with linear combinations of function values at the grid
points. The simplest schemes using this approach are the backward and
forward schemes.
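As a concrete illustration, the following MATLAB sketch (not part of the original text; the function and the values of x and h are illustrative assumptions) evaluates the forward and backward approximations of f'(x) for f(x) = sin(x):

% Forward and backward difference approximations of f'(x).
f = @(x) sin(x);   % test function (illustrative)
x = 1.0;           % evaluation point
h = 1e-4;          % mesh size (assumed value)

d_fwd = (f(x + h) - f(x)) / h;    % forward scheme, O(h)
d_bwd = (f(x) - f(x - h)) / h;    % backward scheme, O(h)
fprintf('forward: %.10f  backward: %.10f  exact: %.10f\n', ...
        d_fwd, d_bwd, cos(x));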
Errors between exact analytic derivatives and numerical ones will be
calculated considering different values of the perturbation h.
f'(x_i) = [f(x_i + h/2) − f(x_i − h/2)] / h    (2.8)

ε_T = h^2 |f'''(x)| / 24    (2.9)

ε_R = 2ε / h    (2.10)

where ε_T is the truncation error, and ε_R is the round-off error associated
with the finite-precision evaluation of f.
The 3-points stencil central difference scheme error is O(h^2), which means
that it is a second-order approximation and provides more accurate results
than the backward and forward traditional schemes.
It can be seen that the generic expression of the central difference scheme
has an anti-symmetric structure and that, in general, a difference of N-th
order can be written as [3]

f'(x_i) ≈ (1/h) Σ_{k=1}^{(N−1)/2} a_k (f_k − f_{−k})    (2.17)

Starting from (2.17), it is possible to derive the formula for N = 7, which
represents the 7-points stencil central difference formula [3].
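A minimal MATLAB sketch of the 3-, 5- and 7-points stencil central difference schemes follows; the coefficients are the standard central difference weights (the specific a_k values of [3] are not reproduced in this excerpt), and the test function is illustrative:

% 3-, 5- and 7-points stencil central difference approximations of f'(x).
f  = @(x) sin(x);
x  = 1.0;  h = 1e-2;
fk = @(k) f(x + k*h);                                  % f(x_i + k*h)

d3 = (fk(1) - fk(-1)) / (2*h);                         % O(h^2)
d5 = (-fk(2) + 8*fk(1) - 8*fk(-1) + fk(-2)) / (12*h);  % O(h^4)
d7 = (fk(3) - 9*fk(2) + 45*fk(1) - 45*fk(-1) ...
      + 9*fk(-2) - fk(-3)) / (60*h);                   % O(h^6)
fprintf('errors: %.2e  %.2e  %.2e\n', abs([d3 d5 d7] - cos(x)));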
The following figures illustrate the function trend (Figure 2.1), the
analytical and numerical derivatives of the function (Figure 2.2) and
the comparison between the errors related to the three different schemes,
at first considering a constant h (Figure 2.3), and then varying h
(Figure 2.4). The errors are computed using the analytical result as the
reference: ε = |f' − f'_ref| / |f'_ref|.
As shown in Figure 2.2, considering h = 1·10^-2, the more points the
stencil is composed of, the more accurate the numerical derivative is,
so it is convenient to use a 7-points stencil central difference scheme.
This fact can also be observed in Figure 2.3, in which the error decreases
as the number of points of the stencil increases.
However, this is not generally true. Indeed, Figure 2.4 shows that, if we
reduce the size of the perturbation h, the accuracy of the central
difference schemes which involve more points increases until h is such
that the error |ε| is minimum (meaning ε_T = ε_R). If h is further
reduced, the accuracy of the denser stencil schemes decreases because the
round-off error becomes dominant.
Figure 2.2: Analytical and numerical derivatives of the function f(x) = sin(x), h = 1·10^-2.
Figure 2.5: Errors comparison, 3-points stencil scheme, function f (x) = sin(x).
Figure 2.6: Errors comparison, 5-points stencil scheme, function f (x) = sin(x).
Figure 2.7: Errors comparison, 7-points stencil scheme, function f (x) = sin(x).
Let f(x) = 1/x.
The following figures illustrate the function trend (Figure 2.8), the
analytical and numerical derivatives of the function (Figure 2.9) and
the comparison between the errors related to the three different schemes,
at first considering a constant h (Figure 2.10) and then varying h
(Figure 2.11).
For this case as well, considering h = 1·10^-2, Figures 2.9 and 2.10
show that the 7-points stencil scheme is more accurate than the 5-points
stencil scheme, and the 5-points stencil scheme appears to be more
accurate than the 3-points stencil scheme.
Figure 2.11 illustrates that, if we reduce the size of the perturbation
h, the accuracy of the central difference schemes which involve more
points increases until h*, defined as the value of h such that
|ε_T| = |ε_R|. If h is further reduced, the accuracy of the dense stencil
schemes decreases because the round-off error becomes dominant. As in the
previous case, their computational load will be heavier due to the
increase of the number of points where the function must be evaluated.
So, in this case, it is not convenient to use a dense stencil for h < h*.
Figure 2.11: Errors comparison, f(x) = 1/x and varying h.
Let f(x) = 1/x^2.
In the following figures the function trend (Figure 2.12), the analytical
and numerical derivatives of the function (Figure 2.13) and the comparison
between the errors related to the three different schemes, at first
considering a constant h (Figure 2.14) and then varying h (Figure 2.15),
are shown.
Here again, considering h = 1·10^-2, Figure 2.13 and Figure 2.14 show
that the more complex the stencil is, the more accurate the numerical
derivative is. Figure 2.15 illustrates that, if we reduce the size of the
perturbation h, the accuracy of the central difference schemes which
involve more points increases until h*, defined as the value of h such
that |ε_T| = |ε_R|. If h is further reduced, the accuracy of the dense
stencil schemes decreases because the round-off error becomes dominant. In
addition, their computational load will be heavier due to the increase
of the number of points where the function must be evaluated. So, in
this case, it is not convenient to use a dense stencil for h < h*.
Figure 2.12: Function f(x) = 1/x^2.
Figure 2.13: Analytical and numerical derivatives of the function f(x) = 1/x^2, h = 1·10^-2.
Figure 2.15: Errors comparison, f(x) = 1/x^2 and varying h.
The following figures show the function trend (Figure 2.16), the
analytical and numerical derivatives of the function (Figure 2.17) and the
comparison between the errors related to the three different schemes, at
first considering a constant h (Figure 2.18) and then varying h
(Figure 2.19).
In addition, their computational load will be heavier due to the increase
of the number of points where the function must be evaluated. So, in this
case, it is not convenient to use a more complex stencil for h < h*.
In the following figures the function trend (Figure 2.20), the analytical
and numerical derivatives of the function (Figure 2.21) and the comparison
between the errors related to the three different schemes, at first
considering a constant h (Figure 2.22) and then varying h (Figure 2.23),
are illustrated.
Considering h = 5·10^-4, Figure 2.21 shows that the denser the stencil
is, the more accurate the central difference scheme is. Furthermore,
Figure 2.22 illustrates that, here again, if we reduce the size of the
perturbation h, the accuracy of the central difference schemes which
involve more points increases until h*, defined as the value of h such
that |ε_T| = |ε_R|. If h is further reduced, the accuracy of the dense
stencil schemes decreases because the round-off error becomes dominant. In
addition, their computational load will be heavier due to the increase
of the number of points where the function must be evaluated.
Figure 2.22: Errors comparison, f(x) = A sin^2(t) cos(t^2) and h = 5·10^-4.
Figure 2.23: Errors comparison, f(x) = A sin^2(t) cos(t^2) and varying h.
The last example has been selected from the literature [4], and confirms the
results of the analysis performed so far.
Let f(x) = e^x / (sin^3(x) + cos^3(x)).
The following figures show the function trend (Figure 2.24), the
analytical and numerical derivatives of the function (Figure 2.25) and the
comparison between the errors related to the three different schemes, at
first considering a constant h (Figure 2.26) and then varying h
(Figure 2.27).
Figure 2.24: Function f(x) = e^x / (sin^3(x) + cos^3(x)).
As seen in the previous cases, here again, considering h = 1·10^-2,
Figures 2.25 - 2.26 show that the more points the stencil is composed
of, the more accurate the numerical derivative is, so it is convenient to
use a 7-points stencil central difference scheme, meaning that the error
between the analytical and numerical derivative decreases as the number
of points of the stencil increases. However, this is not generally
true. Indeed, Figure 2.27 shows that, if we reduce the size of the
perturbation h, the accuracy of the central difference schemes which
involve more points increases until h*, defined as the value of h such
that |ε_T| = |ε_R|. If h is further reduced, the accuracy of the more
complex stencil schemes decreases because the round-off error becomes
dominant.
Figure 2.26: Errors comparison, f(x) = e^x / (sin^3(x) + cos^3(x)) and h = 1·10^-3.
Figure 2.27: Errors comparison, f(x) = e^x / (sin^3(x) + cos^3(x)) and varying h.
2.3 Conclusions
In this chapter the traditional finite difference schemes have been analysed.
We focused on the central difference schemes, which appear to be more
accurate than the backward and forward difference schemes. Numerical examples
of the different stencils (3-points, 5-points and 7-points) on six different
functions of increasing complexity have been discussed, showing the effects
of selecting different values of the perturbation h.
The fundamental result is the following: in order to improve the accuracy of
the central difference schemes it is necessary to reduce the truncation error,
due to the higher-order terms in the Taylor series, by reducing h. However,
making h too small can lead to subtraction errors due to the finite precision
used by computers to store numbers. Indeed, it is not desirable to choose h
too small, otherwise the round-off error becomes dominant.
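A minimal sketch of the step-size sweep used to produce the varying-h figures in this chapter (the test function and the range of h are illustrative assumptions):

% Relative error of the 3-points stencil scheme over a sweep of h.
f    = @(x) sin(x);  x = 1.0;  dref = cos(x);
hs   = logspace(-16, 0, 200);
err  = abs((f(x + hs) - f(x - hs)) ./ (2*hs) - dref) / abs(dref);
loglog(hs, err); grid on; xlabel('h'); ylabel('relative error');
% Truncation error dominates for large h, round-off error for small h,
% producing the characteristic V-shaped error curve.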
Chapter 3
Advanced Differentiation
Scheme: the Complex-Step
Derivative Approach
Overview
In this chapter the complex-step derivative approximation and its application
to six test functions are presented.
As seen in Chapter 2, the easiest way to estimate numerical derivatives is
by finite difference approximation and, in particular, the central difference
schemes appear to be the most accurate ones. These schemes can be derived
by truncating a Taylor series expanded about a point x.
When estimating derivatives using finite difference formulas we are faced with
the problem of selecting the size of the perturbation h so that it minimizes
the error between the analytical and numerical derivative. Indeed, it is
necessary to choose a small h to minimize the truncation error, while avoiding
a perturbation so small that the round-off error becomes dominant.
In order to improve the accuracy of the numerical derivatives, the complex-step
derivative approximation is defined so that it is not subject to subtractive
cancellation errors. This is a great advantage over the finite difference
operations, as will be seen in the next sections.
In the following sections the complex-step method is defined and then tested
on six different functions. The results are compared with the ones achieved
in Chapter 2 using the finite difference approximation.
Taking the imaginary parts of both sides of the Taylor expansion (3.6) and
dividing by h yields [4]

f'(x) = Im[f(x + ih)] / h + O(h^2).    (3.7)
As a consequence, the approximation is of order O(h^2). The second-order
errors can be reduced by ensuring that h is sufficiently small and, since the
complex-step approximation does not involve a difference operation, it is
possible to choose an extremely small perturbation size h without losing
accuracy. The only drawback is the need to have an analytical function: the
method cannot be applied, for instance, to look-up tables.
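A minimal MATLAB sketch of the complex-step approximation (3.7); the test function is the one from [4], and the step size is an illustrative assumption:

% Complex-step derivative approximation, eq. (3.7).
f   = @(x) exp(x) ./ (sin(x).^3 + cos(x).^3);
x   = 1.0;
h   = 1e-20;                   % h can be made extremely small
dcs = imag(f(x + 1i*h)) / h;   % no difference operation, no cancellation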
Figure 3.1 shows the relative error in the estimates given by the central
difference and the complex-step methods, using the analytical result as the
reference: ε = |f' − f'_ref| / |f'_ref|. It illustrates that the central
difference estimates initially converge quadratically to the exact result
but, when the step h is reduced below a value of about 10^-2 for the
7-points stencil scheme, 10^-3 for the 5-points stencil scheme and 10^-5
for the 3-points stencil scheme, round-off error becomes dominant and the
resulting estimates are not reliable. Indeed, ε diverges or tends to 1,
meaning that the finite difference estimates yield zero because h is so
small that no difference exists in the output.
The complex-step derivative approximation converges quadratically with
decreasing step size because of the decrease of the truncation error. The
estimate is practically insensitive to small sizes of the perturbation h
and, for any h below a value of about 10^-8, it achieves the accuracy of
the function evaluation.
Figure 3.2 illustrates the comparison of the minimum-error of the central
difference derivative estimate and the minimum-error of the complex-step
derivative approximation. The complex-step approximation appears to be,
approximately, two orders of magnitude more accurate than the central
difference scheme.
Figure 3.1: Relative error in the sensitivity estimates, function f (x) = sin(x).
Let f(x) = 1/x.
Figure 3.3 shows the relative error in the estimates given by the central
difference and the complex-step methods. It illustrates that, here again,
the central difference estimates initially converge quadratically to the
exact result but, when the step h is reduced below a value of about 10^-2
for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme and
10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. The complex-step
derivative approximation converges quadratically with decreasing step size
because of the decrease of the truncation error. The estimate is
practically insensitive to small sizes of the perturbation h and, for any
h below a value of about 10^-8, it achieves the accuracy of the function
evaluation.
Figure 3.4 illustrates the comparison of the minimum-error of the central
difference derivative estimate and the minimum-error of the complex-step
derivative approximation. Here too, the complex-step approximation appears
to be, approximately, two orders of magnitude more accurate than the
central difference scheme.
Let f(x) = 1/x^2.
Figure 3.5 shows the relative error in the estimates given by the central
difference and the complex-step methods. It illustrates that, here again,
the central difference estimates initially converge quadratically to the
exact result but, when the step h is reduced below a value of about 10^-2
for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme and
10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. The complex-step
derivative approximation converges quadratically with decreasing step size
because of the decrease of the truncation error. The estimate is
practically insensitive to small sizes of the perturbation h and, for any
h below a value of about 10^-8, it achieves the accuracy of the function
evaluation.
Figure 3.6 illustrates the comparison of the minimum-error of the central
difference derivative estimate and the minimum-error of the complex-step
derivative approximation. Here too, the complex-step approximation appears
to be, approximately, two orders of magnitude more accurate than the
central difference scheme.
Figure 3.5: Relative error in the first derivative, function f(x) = 1/x^2.
Figure 3.6: Minimum-Errors comparison, function f(x) = 1/x^2.
Figure 3.7 shows the relative error in the estimates given by the central
difference and the complex-step methods. It illustrates that, here again,
the central difference estimates initially converge quadratically to the
exact result but, when the step h is reduced below a value of about 10^-2
for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme and
10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. The complex-step
derivative approximation converges quadratically with decreasing step size
because of the decrease of the truncation error. The estimate is
practically insensitive to small sizes of the perturbation h and, for any
h below a value of about 10^-8, it achieves the accuracy of the function
evaluation.
Figure 3.8 illustrates the comparison of the minimum-error of the central
difference derivative estimate and the minimum-error of the complex-step
derivative approximation. Here too, the complex-step approximation appears
to be, approximately, two orders of magnitude more accurate than the
central difference scheme.
Figure 3.9 shows the relative error in the estimates given by the central
difference and the complex-step methods. Here again, the central
difference estimates initially converge quadratically to the exact result
but, when the step h is reduced below a value of about 10^-2 for the
7-points stencil scheme, 10^-3 for the 5-points stencil scheme and 10^-5
for the 3-points stencil scheme, round-off error becomes dominant and the
resulting estimates are not reliable. Indeed, ε diverges or tends to 1,
meaning that the finite difference estimates yield zero because h is so
small that no difference exists in the output. The complex-step derivative
approximation converges quadratically with decreasing step size because of
the decrease of the truncation error. The estimate is practically
insensitive to small sizes of the perturbation h and, for any h below a
value of about 10^-8, it achieves the accuracy of the function evaluation.
Figure 3.10 illustrates the comparison of the minimum-error of the central
difference derivative estimate and the minimum-error of the complex-step
derivative approximation. The complex-step approximation appears to be,
approximately, three orders of magnitude more accurate than the central
difference scheme.
Figure 3.9: Relative error in the first derivative, f(x) = A sin^2(t) cos(t^2).
Let f(x) = e^x / (sin^3(x) + cos^3(x)).
Figure 3.11 shows the relative error in the estimates given by the central
difference and the complex-step methods. Here again, the central
difference estimates initially converge quadratically to the exact result
but, when the step h is reduced below a value of about 10^-2 for the
7-points stencil scheme, 10^-3 for the 5-points stencil scheme and 10^-5
for the 3-points stencil scheme, round-off error becomes dominant and the
resulting estimates are not reliable.
Since this example is taken from [4], we can compare the results: the
comparison of Figures 3.11 and 3.12 shows their consistency. Indeed, ε
diverges or tends to 1, meaning that the finite difference estimates yield
zero because h is so small that no difference exists in the output. The
complex-step derivative approximation converges quadratically with
decreasing step size because of the decrease of the truncation error. The
estimate is practically insensitive to small sizes of the perturbation h
and, for any h below a value of about 10^-8, it achieves the accuracy of
the function evaluation.
Figure 3.13 illustrates the comparison of the minimum-error of the central
difference derivative estimate and the minimum-error of the complex-step
derivative approximation. The complex-step approximation appears to be,
approximately, three orders of magnitude more accurate than the central
difference scheme.
3.3 Conclusions
In this chapter the complex-step derivative approximation has been analysed.
The complex-step approximation provides greater accuracy than the finite
difference formulas, for first derivatives, by eliminating the subtraction
error. Indeed, like the finite differences, the complex-step derivative
approximation is subject to truncation error, but it does not suffer from
the problem of round-off error: f'(x) is the leading term of
Im[f(x + ih)]/h, so h can be made small enough that the truncation error
is effectively zero, without worrying about the round-off error. The only
disadvantage is the need to have an analytical function, meaning that, for
instance, it is not possible to apply this approximation to look-up tables.
The complex-step derivative approximation has been tested on six different
functions and the results have been compared with the ones shown in
Chapter 2. The value of h that minimizes the error between the analytical
and numerical derivatives is, in the case of the complex-step approximation,
Figure 3.11: Relative error in the first derivative, f(x) = e^x / (sin^3(x) + cos^3(x)).
Figure 3.13: Minimum-Errors comparison, f(x) = e^x / (sin^3(x) + cos^3(x)).
less than the machine epsilon (ε = 2.2204·10^-16). For this reason, in order
to be sure to get reliable results, a value of h equal to twice the machine
epsilon has been selected.
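In MATLAB this choice reads, for instance (a sketch; the test function is illustrative):

% Step size selected for the complex-step results: twice machine epsilon.
f   = @(x) sin(x);  x = 1.0;
h   = 2 * eps;                 % eps = 2.2204e-16 in MATLAB
dcs = imag(f(x + 1i*h)) / h;   % matches cos(x) to machine precision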
Chapter 4
Advanced Differentiation
Scheme: the Dual-Step
Derivative Approach
Overview
In this chapter the dual-step derivative approach and its application to six
test functions are presented.
As said in the previous chapters, derivatives are often approximated using
finite difference schemes. These approximations are subject to truncation
error, associated with the higher-order terms of the Taylor series that are
ignored when forming the approximation, and to round-off error, which is a
result of performing these calculations on a computer with finite precision.
The complex-step derivative approximation is more accurate than the finite
difference schemes; the greater accuracy is obtained by eliminating the
subtractive cancellation error.
In order to improve the accuracy of the numerical derivatives further, the
dual-step approach uses dual numbers (see Appendix A); the derivatives
calculated using these numbers are exact, subject neither to truncation error
nor to subtraction errors. This is a great advantage, in terms of the step
size, over the complex-step derivative approximation.
In the following sections the dual-step method for the first derivative
calculation is defined and tested on six different functions. The error in
the first derivative calculation is then compared with the error for the
central difference schemes and the complex-step approximation.
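The following classdef is a minimal sketch of such a dual-number type with operator overloading, written here as a hypothetical illustration (it is not the implementation used in this work, and only a few overloaded operations are shown). A dual number a + b·d, with d^2 = 0, propagates exact first derivatives:

% dual.m -- minimal sketch of a dual-number class (illustrative only).
classdef dual
    properties
        a  % real part (function value)
        b  % dual part (carries the derivative)
    end
    methods
        function obj = dual(a, b)
            obj.a = a;
            obj.b = b;
        end
        function r = plus(p, q)
            [p, q] = dual.lift(p, q);
            r = dual(p.a + q.a, p.b + q.b);
        end
        function r = minus(p, q)
            [p, q] = dual.lift(p, q);
            r = dual(p.a - q.a, p.b - q.b);
        end
        function r = mtimes(p, q)            % product rule
            [p, q] = dual.lift(p, q);
            r = dual(p.a * q.a, p.a * q.b + p.b * q.a);
        end
        function r = mrdivide(p, q)          % quotient rule
            [p, q] = dual.lift(p, q);
            r = dual(p.a / q.a, (p.b * q.a - p.a * q.b) / q.a^2);
        end
        function r = sin(p)
            r = dual(sin(p.a), cos(p.a) * p.b);
        end
        function r = cos(p)
            r = dual(cos(p.a), -sin(p.a) * p.b);
        end
        function r = exp(p)
            r = dual(exp(p.a), exp(p.a) * p.b);
        end
    end
    methods (Static)
        function [p, q] = lift(p, q)         % promote plain numbers
            if ~isa(p, 'dual'), p = dual(p, 0); end
            if ~isa(q, 'dual'), q = dual(q, 0); end
        end
    end
end

With such a class, y = sin(dual(1.0, 1.0)) returns y.b equal to cos(1.0) exactly, with no step size involved.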
Figure 4.1 illustrates, as a function of the step size h, the relative error
in the estimates given by the central difference, the complex-step and the
dual-step methods, using the analytical result as the reference. The
relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for
the central difference approximations begins to grow, while the error for
the complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first derivative complex-step approximation, as
seen in Chapters 2 and 3.
The error of the dual-step approximation, which is subject neither to
truncation error nor to round-off error, is machine zero regardless of the
selected step size.
Figure 4.1: Relative error in the first derivative, function f (x) = sin(x).
Let f(x) = 1/x.
Figure 4.2 illustrates, as a function of the step size h, the relative error
in the estimates given by the central difference, the complex-step and the
dual-step methods, using the analytical result as the reference. The
relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for
the central difference approximations begins to grow, while the error for
the complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first derivative complex-step approximation.
Here again, the error of the dual-step approximation, which is subject
neither to truncation error nor to round-off error, is machine zero
regardless of the selected step size.
Let f(x) = 1/x^2.
Figure 4.3 illustrates, as a function of the step size h, the relative error
in the estimates given by the central difference, the complex-step and the
dual-step methods, using the analytical result as the reference. The
relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for
the central difference approximations begins to grow, while the error for
the complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
subject neither to truncation error nor to round-off error, is machine zero
regardless of the selected step size.
Figure 4.3: Relative error in the first derivative, function f(x) = 1/x^2.
Figure 4.4 illustrates, as a function of the step size h, the relative error
in the estimates given by the central difference, the complex-step and the
dual-step methods, using the analytical result as the reference. The
relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for
the central difference approximations begins to grow, while the error for
the complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
subject neither to truncation error nor to round-off error, is machine zero
regardless of the selected step size.
Figure 4.5 illustrates, as a function of the step size h, the relative error
in the estimates given by the central difference, the complex-step and the
dual-step methods, using the analytical result as the reference. The
relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for
the central difference approximations begins to grow, while the error for
the complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
subject neither to truncation error nor to round-off error, is machine zero
regardless of the selected step size.
Figure 4.5: Relative error in the first derivative, f(x) = A sin^2(t) cos(t^2).
Let f(x) = e^x / (sin^3(x) + cos^3(x)).
Figure 4.6 illustrates, as a function of the step size h, the relative error
in the estimates given by the central difference, the complex-step and the
dual-step methods, using the analytical result as the reference. As the
step size decreases, the error decreases according to the order of the
truncation error of the method. However, after a certain value of h, the
error for the central difference approximations begins to grow, while the
error for the complex-step approximation continues to decrease until it
reaches, and remains at, machine zero (the machine epsilon). This shows the
effect of subtractive cancellation errors, which affect the finite
difference approximations but not the first derivative complex-step
approximation.
For this case as well, the error of the dual-step approximation, which is
subject neither to truncation error nor to round-off error, is machine zero
regardless of the selected step size. Since this example is taken from
[13], we can compare the results: the comparison of Figures 4.6 and 3.12
shows their consistency.
Figure 4.6: Relative error in the first derivative, f(x) = e^x / (sin^3(x) + cos^3(x)).
4.3 Conclusions
In this chapter the dual-step approach for the computation of first
derivatives has been introduced. This approach provides greater accuracy than
the finite difference formulas and the complex-step derivative approximation.
Indeed, the dual-step approach is subject neither to truncation error nor
to round-off error and, as a consequence, the error in the first derivative
estimate is machine zero regardless of the selected step size. This is a great
advantage over the complex-step derivative approximation because, using the
dual-step approach, there is no need to select the step size as small as
possible.
The disadvantage is the computational cost: working with dual numbers
requires additional computational work. In addition, like the complex-step
approximation, the dual-step approach requires an analytical function.
The dual-step approach has been tested on six different functions. The
relative errors have been calculated using the analytical derivative as
reference and then compared with the ones computed in Chapters 2 and 3
using, respectively, the finite difference schemes and the complex-step
approximation.
Chapter 5
Generation of Reference Data
Overview
In this chapter the analytical structure of the Gradient vector and the
Jacobian matrix is analysed in order to generate a set of reference data for
each of the following problems: the Space Shuttle Reentry Problem [1], the
Orbit Raising Problem [5] and the Hang Glider Problem [1]. These reference
data make it possible to compare the exact analytical Jacobian with the
numerical one, computed using the different numerical differentiations, as we
will see in the next chapters.
In the first section of this chapter the analytical definitions of the
Gradient vector and the Jacobian matrix of a generic function are given.
Then, the formulation of the three aforementioned problems is described and
the analytical Jacobian is generated for each of them. In the last section
the Jacobian matrix sparsity pattern for each of the three problems is shown.
dim(J) = [m × (1 + n_s + n_c)]    (5.3)

      | ∂F_1/∂t   ∂F_1/∂x_1  ...  ∂F_1/∂x_ns   ∂F_1/∂u_1  ...  ∂F_1/∂u_nc |
J  =  |   ...        ...     ...      ...          ...     ...      ...    |    (5.4)
      | ∂F_m/∂t   ∂F_m/∂x_1  ...  ∂F_m/∂x_ns   ∂F_m/∂u_1  ...  ∂F_m/∂u_nc |
Considering the three problems we will analyse in the next sections, the
number m will be equal to the number of state variables n_s. However, if we
consider a generic optimal control problem, we have to account for additional
equations, namely the cost function and the n_c constraint equations, so
that m will be equal to n_s + n_c + 1.
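As an illustration of how such a Jacobian can be assembled numerically, the following MATLAB sketch builds it column by column with the complex-step approximation; the interface of F (a vector function of t, x and u, returning an m-by-1 result) is an assumption made for the example:

% jacobian_cs.m -- numerical Jacobian of F(t, x, u), one column per
% perturbed variable, laid out as in (5.4). Sketch only.
function J = jacobian_cs(F, t, x, u)
    ns = numel(x);
    z  = [t; x(:); u(:)];            % stack [t, x_1..x_ns, u_1..u_nc]
    m  = numel(F(t, x, u));
    J  = zeros(m, numel(z));
    h  = 1e-100;                     % complex step
    for k = 1:numel(z)
        zp    = complex(z);
        zp(k) = zp(k) + 1i*h;        % perturb the k-th variable
        Fk    = F(zp(1), zp(2:1+ns), zp(2+ns:end));
        J(:, k) = imag(Fk) / h;
    end
end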
dh/dt = v sin(γ),    (5.5)
dφ/dt = v sin(ψ) cos(γ) / (r cos(θ)),    (5.6)
dθ/dt = (v/r) cos(γ) cos(ψ),    (5.7)
dv/dt = −D/m − g sin(γ),    (5.8)
dγ/dt = (L/(m v)) cos(σ) + cos(γ) (v/r − g/v),    (5.9)
dψ/dt = (L/(m v cos(γ))) sin(σ) + (v/(r cos(θ))) cos(γ) sin(ψ) sin(θ),    (5.10)

where the aerodynamic and atmospheric forces on the vehicle are specified
by the following quantities (English units) [1]:

L = (1/2) c_L ρ S v^2,    D = (1/2) c_D ρ S v^2,
g = μ/r^2 (μ = 0.1407·10^17),    r = R_e + h (R_e = 20902900),
ρ = ρ_0 e^(−h/h_r) (ρ_0 = 0.002378),    h_r = 23800,
c_L = a_0 + a_1 α,    c_D = b_0 + b_1 α + b_2 α^2,
a_0 = −0.20704,    b_0 = 0.07854,
a_1 = 0.029244,    b_1 = −0.6159·10^-2,
S = 2690,    b_2 = 0.6214·10^-3.
dx/dt = v_x,    (5.17)
dy/dt = v_y,    (5.18)
dv_x/dt = (1/m) (−L sin(η) − D cos(η)),    (5.19)
dv_y/dt = (1/m) (L cos(η) − D sin(η) − W),    (5.20)

where

L = (1/2) c_L ρ S v_r^2,    D = (1/2) c_D ρ S v_r^2,
sin(η) = V_y / v_r,    cos(η) = v_x / v_r,
v_r = sqrt(v_x^2 + V_y^2),    V_y = v_y − u_a(x),
u_a(x) = u_M (1 − X) e^(−X),    X = (x/R − 2.5)^2,
c_D(c_L) = c_0 + k c_L^2.
The nonzero entries of the corresponding Jacobian are

J_14 = 1,
J_25 = 1,

together with the entries J_32, J_34, J_35, J_36, J_42, J_44, J_45 and
J_46, which follow by differentiating (5.19) and (5.20) with respect to x,
v_x, v_y and c_L; their lengthy expressions involve L, D, v_r and the
updraft derivative du_a/dx.
is calculated:

dim(J_i) = [m × (1 + n_s + n_c)]    (5.24)

        | ∂F_1/∂t   ∂F_1/∂x_1  ...  ∂F_1/∂x_ns   ∂F_1/∂u_1  ...  ∂F_1/∂u_nc |
J_i  =  |   ...        ...     ...      ...          ...     ...      ...    |    (5.25)
        | ∂F_m/∂t   ∂F_m/∂x_1  ...  ∂F_m/∂x_ns   ∂F_m/∂u_1  ...  ∂F_m/∂u_nc |

with all the partial derivatives evaluated at the node t_i.
So the reference Jacobian matrix is a sparse matrix whose pattern depends
on the problem we are analysing.
In the following figures the Jacobian matrix sparsity patterns for the Space
Shuttle Reentry problem (Figure 5.1), the Orbit Raising problem (Figure
5.2) and the Hang Glider problem (Figure 5.3) are illustrated.
In order to have a better visualization of the pattern, all the figures
showing the Jacobian structure are associated with the solutions obtained
using 3 nodes, meaning n_t = 3.
Figure 5.1: Jacobian Matrix Sparsity Patterns for the Space Shuttle Problem.
Figure 5.2: Jacobian Matrix Sparsity Patterns for the Orbit Raising Problem.
Figure 5.3: Jacobian Matrix Sparsity Patterns for the Hang Glider Problem.
5.4 Conclusions
In this chapter the Gradient vector and the Jacobian matrix have been
defined. Three different problems (the Space Shuttle Reentry Problem, the
Orbit Raising Problem and the Hang Glider Problem) have been formulated
in order to generate a reference Jacobian matrix for each of them, meaning
a Jacobian matrix formulated using analytical derivatives.
The reference Jacobian matrices have been generated using reference
solutions calculated with SPARTAN, a tool developed by DLR based on the use
of the FRPM.
The resulting reference Jacobian matrices are characterized by a sparsity
pattern whose structure depends on the problem being analysed, as is
reasonable to expect. These reference data will be useful to validate the
numerical differentiation approaches introduced in the next chapters.
Chapter 6
Jacobian Matrix Generation
with Numerical Differentiations
- Analysis of Accuracy and
CPU Time
Overview
In this chapter the numerical differentiation schemes analysed in Chapters
2, 3 and 4 are employed to generate the numerical Jacobian matrix for three
different problems: the Space Shuttle Reentry Problem [1], the Orbit Raising
Problem [5] and the Hang Glider Problem [1]. These numerical Jacobian
matrices are then compared with the analytical ones computed in the previous
chapter, and the results in terms of accuracy and CPU time are illustrated.
In the first section of this chapter we focus our attention on the
traditional central difference schemes. For each of the three aforementioned
problems the Jacobian matrix is generated using the 3-points, 5-points and
7-points stencil central difference schemes. The accuracy of these schemes
is then analysed using the analytical Jacobian matrix as reference.
In the second section the Jacobian matrix for each of the three problems is
generated using the complex-step derivative approximation. The accuracy of
this scheme is then analysed and the results are compared with the ones
achieved using the traditional central difference schemes.
In the third section the same analysis is repeated for each of the three
problems, considering the Jacobian matrix generated with the dual-step
derivative approach.
In the last sections we focus our attention on the CPU time, to have a
measure of the different computational power required by each technique.
Figure 6.1: Maximum error in the Jacobian Matrix for the Space Shuttle Reentry
Problem.
Table 6.1: Accuracy and step size comparison for the Space Shuttle Problem.
Figure 6.2: Maximum error in the Jacobian Matrix for the Orbit Raising Prob-
lem.
points and, at the same time, to select a value of the perturbation which is
not too small.
Table 6.2: Accuracy and step size comparison for the Orbit Raising Problem.
accuracy of each central difference scheme is, and the value of the step
size which minimizes the error in the Jacobian matrix increases. These
results are consistent with the ones achieved in Chapter 2.
Figure 6.3: Maximum error in the Jacobian Matrix for the Hang Glider Problem.
Table 6.3: Accuracy and step size comparison for the Hang Glider Problem.
- in case of use of the 3-points stencil central difference scheme, the step
  size should be selected in the range 1.8·10^-6 < h < 7.4·10^-5;
- in case of use of the 5-points stencil central difference scheme, the step
  size should be selected in the range 3.7·10^-4 < h < 7.7·10^-3;
- in case of use of the 7-points stencil central difference scheme, the step
  size should be selected in the range 2.7·10^-3 < h < 4.9·10^-2.
Figure 6.4: Maximum error in the Jacobian Matrix for the Space Shuttle Reentry
Problem.
Figure 6.5: Maximum error in the Jacobian Matrix for the Orbit Raising Prob-
lem.
Figure 6.6: Maximum error in the Jacobian Matrix for the Hang Glider Problem.
complex-step, regardless of the selected step size. This means that it is
convenient to select a perturbation size h = 1, so that we avoid performing
the division in (4.4) to compute each term of the Jacobian matrix while, at
the same time, not losing accuracy.
The computation of the Jacobian matrix with the dual-step approach appears
to be more accurate than the one with either the central difference schemes
or with the complex-step approximation. Indeed, even if the minimum value
of the error in the Jacobian matrix, ε* = 7.4·10^-12, is comparable with
the one obtained with the use of the complex-step approximation, with the
dual-step approach the number of exact derivatives in the Jacobian matrix
is increased. In addition, the value of the perturbation h* which minimizes
the error can be selected equal to 1.
Figure 6.7: Maximum error in the Jacobian Matrix for the Space Shuttle Reentry
Problem.
However, after a certain value of h, the error for the central difference
approximations begins to grow, while the error for the complex-step
approximation continues to decrease until it reaches, and remains at, a
minimum value. The error of the dual-step approach, which is subject
neither to truncation error nor to round-off error (see Chapter 4), is
around the minimum value of the error of the complex-step, regardless of
the selected step size. This means that, here too, it is convenient to
select a perturbation size h = 1.
The computation of the Jacobian matrix with the dual-step approach appears
to be more accurate than the one with either the central difference schemes
or with the complex-step approximation. Indeed, in this case the minimum
value of the error in the Jacobian matrix is ε* = 3.3·10^-16. In addition,
with the use of the dual-step, the number of exact derivatives in the
Jacobian matrix is increased and the value of the perturbation h* which
minimizes the error can be selected equal to 1.
Figure 6.8: Maximum error in the Jacobian Matrix for the Orbit Raising Prob-
lem.
Figure 6.9: Maximum error in the Jacobian Matrix for the Hang Glider Problem.
The computation of the Jacobian matrix with the dual-step approach appears
to be more accurate than the one with either the central difference schemes
or with the complex-step approximation. Indeed, even if the minimum value
of the error in the Jacobian matrix, ε* = 4.0·10^-15, is comparable with
the one obtained with the use of the complex-step approximation, with the
dual-step approach the number of exact derivatives in the Jacobian matrix
is increased.
Figure 6.10: CPU Time Required for the Space Shuttle Reentry Problem.
The CPU time required to compute the Jacobian matrix with the dual-step
approach is higher than that associated with the complex-step approximation,
because of the additional computational work associated with the use of
the dual numbers. However, the values are comparable and the dual-step
approach is preferred due to its better accuracy.
The CPU time associated with the use of either the 7-points or the 5-points
stencil central difference schemes is higher than that required when the
dual-step approach is employed. This is caused by the complexity of the
equations which describe the problem. Indeed, in this case, the multiple
function evaluations associated with the central difference schemes are, for
this problem, the major contribution to the CPU load, and their cost is
higher than the effort associated with the use of the dual-step class.
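The timings reported in this section can be reproduced, in principle, with a simple tic/toc pattern; the sketch below uses a toy dynamics function and the hypothetical jacobian_cs routine sketched in Chapter 5:

% Measuring the CPU time of a Jacobian evaluation (illustrative only).
F = @(t, x, u) [x(2); u(1) - x(1)];    % toy dynamics (assumed)
t = 0;  x = [1; 0];  u = 0.5;
tic;
for k = 1:1000                          % repeat for a measurable time
    J = jacobian_cs(F, t, x, u);
end
cpu_time = toc / 1000;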
Figure 6.11: CPU Time Required for the Orbit Raising Problem.
The CPU time required to compute the Jacobian matrix with the dual-step
approach is higher than the one associated to the use of the other approx-
imations. This is due to the fact that the use of the dual-step approach
Figure 6.12: CPU Time Required for the Hang Glider Problem.
Figure 6.13: CPU Time Required for the Space Shuttle Reentry Problem.
Figure 6.14: CPU Time Required for the Orbit Raising Problem.
Figure 6.15: CPU Time Required for the Hang Glider Problem.
6.6 Conclusions
In this chapter the Jacobian matrices for three different problems (the Space
Shuttle Reentry Problem, the Orbit Raising Problem and the Hang Glider
Problem) have been generated using different numerical differentiations. The
results have been compared in terms of accuracy and CPU time.
The dual-step approach has proved to be the most accurate differentiation
method for the computation of the Jacobian matrix. Each term of the Jacobian
matrix calculated using this approach is subject neither to truncation
error nor to round-off error. In addition, using the dual-step approach there
is no need to make the step size small, because the best accuracy is achieved
regardless of the selected step size; the simplest choice is h = 1, which
eliminates the need to divide by the step size. This is an advantage over the
use of the central difference schemes and the complex-step approximation.
Indeed, the use of either the central difference schemes or the complex-step
approximation has proved to be less accurate and, in addition, their accuracy
is strongly influenced by the selection of the optimal step size. The optimal
step size for these methods is not known a priori and it requires a trade-off
between the truncation and round-off errors, as well as a substantial effort
and knowledge of the analytical derivative.
In terms of CPU time, the time required for the computation of the Jacobian
matrix with the dual-step approach is not the smallest one and it depends
on the non-linearity and complexity of the equations which describe the
dynamics of the problem. Indeed, the use of the dual-step approach requires
additional computational work associated with the use of the dual numbers,
which are implemented in MATLAB as a new class of numbers.
The CPU time is also influenced by the number of nodes: for each of the
three problems analysed, an increase of the number of nodes causes an
increase of the CPU time as well, as expected.
To conclude, the trade-off between accuracy and CPU power suggests that
the dual-step differentiation is a valid alternative to the other well-known
numerical differentiation methods, in case the problem is analytically
defined (i.e., no look-up tables are part of the data).
Chapter 7
Use of the Advanced
Differentiation Schemes for
Optimal Control Problems
Overview
In this chapter the differentiation schemes defined in the previous chapters
are used to solve optimal control problems.
In the first section we formulate a general optimal control problem and we
focus our attention on the numerical approaches which can be used for solving
it. In the next sections SPARTAN, an algorithm developed by DLR based on
the use of the Flipped Radau Pseudospectral Method, is presented, focusing
on the computation of the Jacobian matrix associated with the NLP problem.
Three numerical examples of optimal control problems are studied: the
maximization of the final crossrange in the space shuttle reentry trajectory,
the maximization of the final specific energy in an orbit raising problem,
and the maximization of the final range of a hang glider in the presence of
a specific updraft. Each of the three examples is solved using five different
differentiation schemes (the 3-points stencil central difference scheme, the
5-points stencil central difference scheme, the 7-points stencil central
difference scheme, the complex-step approach and the dual-step method), and
two different off-the-shelf, well-known NLP solvers (SNOPT and IPOPT). The
results are compared in terms of accuracy and CPU time, in order to highlight
the main advantages and drawbacks related to the use of the pseudospectral
methods in combination with the dual numbers and with the other
differentiation schemes.
where Φ is called the Mayer term and L is called the Lagrange integrand.
The differential equations (7.2) describe the dynamics of the system, while
the objective (7.1) is the performance index, which can be considered as a
measure of the quality of the trajectory. When it is desired to minimize
the performance index, a lower value of J is preferred; conversely, when it
is desired to maximize the performance index, a higher value of J is
preferred.
With the exception of simple problems (i.e., some special weakly nonlinear
c_L ≤ c(x) ≤ c_U    (7.7)
Once the optimal control problem is transcribed to an NLP problem, the NLP
will be solved using well-known optimization techniques [1]. In conclusion,
in a direct method the optimal solution is found by transcribing the
infinite-dimensional (continuous) optimization problem to a
finite-dimensional optimization problem. In particular, one of the most
promising techniques is represented by direct collocation methods and, among
these, pseudospectral methods are gaining popularity for their
straightforward implementation and some interesting properties associated
with their use [5].
7.2 SPARTAN
SPARTAN (Shefex-3 Pseudospectral Algorithm for Reentry Trajectory ANalysis) is an optimal control package developed by DLR. It has already been used in the literature [11, 12]; it is the reference tool for the development of the entry guidance for the SHEFEX-3 (Sharp Edge Flight Experiment) mission and has been validated with several well-known literature examples. SPARTAN implements the global Flipped Radau Pseudospectral Method (FRPM) to solve constrained or unconstrained optimal control problems, which can have a fixed or variable final time. It belongs to the class of direct methods; it exploits the structure of the Jacobian matrix and provides routines for automatic linear/nonlinear scaling and auto-validation using the Runge-Kutta 45 scheme.
The basic idea of the FRPM is, as in the other direct methods, to collocate
the differential equations, the cost function and the constraints in a finite
number of points in order to treat them as a set of nonlinear algebraic equa-
tions. In this way, the continuous OCP is reduced to a discrete NLP problem
which can be solved with one of the well-known available software packages,
e.g. SNOPT, IPOPT.
In detail, considering the structure of the classical Bolza optimal control problem (7.1)-(7.5) we want to solve, SPARTAN proposes a transcription of the OCP as an NLP based on the choice of some trial functions to represent the continuous variables

$x(t_i) \approx X_i, \quad i \in [0, N]$    (7.9)

$u(t_j) \approx U_j, \quad j \in [1, N].$    (7.10)
In other words, the continuous states and controls can be substituted with polynomials which interpolate the values in the nodes

$x(t) \approx \sum_{i=0}^{N} X_i\, P_i(t)$    (7.11)

$u(t) \approx \sum_{i=1}^{N} U_i\, P_i(t)$    (7.12)

where

$P_i(t) = \prod_{j=0,\, j \neq i}^{N} \frac{t - t_j}{t_i - t_j}$    (7.13)

and $t_j$ are the roots of linear combinations of the Legendre polynomials $P_n(t)$ and $P_{n-1}(t)$ (Figure 7.1).
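As an illustrative sketch of how (7.11) and (7.13) can be evaluated (the function name and its inputs are hypothetical, not part of SPARTAN):

% Evaluate x(t) ~ sum_i X_i P_i(t) from nodal values (sketch, not SPARTAN code).
function xt = interp_states(t, t_nodes, X)
% t       : scalar query time
% t_nodes : (N+1)-vector of discretization nodes t_0..t_N
% X       : (N+1)-by-n matrix of nodal state values X_i
    N1 = numel(t_nodes);
    P  = ones(N1,1);                          % Lagrange basis P_i(t), eq. (7.13)
    for i = 1:N1
        for j = [1:i-1, i+1:N1]               % product over j ~= i
            P(i) = P(i)*(t - t_nodes(j))/(t_nodes(i) - t_nodes(j));
        end
    end
    xt = P.' * X;                             % eq. (7.11)
end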
The difference in the indexing in (7.9) and (7.10) and in their discrete representations is due to the distinction between discretization and collocation. While the discretization includes (in the FRPM) the initial point, the collocation does not. Hence, the controls will be approximated with a polynomial of lower order, and the NLP problem will not provide the initial values for the controls. These can be, in some cases, part of the initial set of known inputs; otherwise they can be extrapolated from the polynomial interpolating the $N$ values of the controls in the collocation nodes [5].
In this way the entire information related to the states and the controls is enclosed in their nodal values. Of course, the boundaries valid for the continuous form will also be applied to the discrete representation of the functions. In particular, it has been shown that for SPARTAN, as well as for all the pseudospectral methods, the following properties are valid:

- spectral convergence in the case of a smooth problem;
- the Runge phenomenon is avoided;
- sparse structure of the associated NLP problem;
- the differential equations become algebraic constraints evaluated in the collocation points.
In the next section we will focus our attention on the structure of the Jacobian associated with the NLP problem deriving from the transcription implemented by SPARTAN. Indeed, experience shows that, while for simple systems a more detailed analysis of the Jacobian can be avoided, in complex problems like atmospheric reentry a solid knowledge of its structure is very helpful and significantly increases the speed of computation and, in some cases, the quality of the results.
The final point on the reentry trajectory occurs at the unknown final time $t_f$. The goal is to choose the control variables, the angle of attack $\alpha(t)$ and the bank angle $\beta(t)$, so that the final crossrange is maximized, which is equivalent to maximizing the final latitude $\theta(t_f)$. So, the cost function can be defined as follows:

$J = \theta(t_f).$    (7.16)
In this case the Jacobian has all three components and its pattern is shown in [5]. The OCP has been implemented and solved with SPARTAN using different differentiation schemes: the 3-point stencil central difference scheme, the 5-point stencil central difference scheme, the complex-step derivative approach and the dual-step derivative method. Furthermore, the solution is computed with an upper bound on the aerodynamic heating of 70 BTU/ft^2/s.
Figures 7.2, 7.3 and 7.4 illustrate the time histories of the states, the controls and the constraints associated with the solution obtained using the dual-step derivative method and a number of nodes equal to 100. Figure 7.5 shows the discrepancy between the SPARTAN solution and the one propagated with the Runge-Kutta 45 scheme. Since this example is taken from [1], we can compare the results. In Figure 7.6 the time histories of the states and the controls are shown as a solid line for the unconstrained solution and as a dotted line for the constrained solution (the one implemented and solved by SPARTAN). The comparison of Figures 7.2-7.4 and 7.6 shows the consistency of the results.
Tables 7.1 and 7.2 summarize the results obtained with SPARTAN using the five different differentiation schemes. In the first table SNOPT (Sparse Nonlinear OPTimizer) is used to solve the NLP problem, while in the second table IPOPT (Interior Point OPTimizer) is employed in SPARTAN.
Figure 7.2: States Evolution for the Space Shuttle Reentry Problem.
Figure 7.3: Controls Evolution for the Space Shuttle Reentry Problem.
Figure 7.4: Heat Rate Evolution for the Space Shuttle Reentry Problem.
Figure 7.5: Discrepancy between optimized and propagated solutions for the Space
Shuttle Reentry Problem.
Table 7.1: Accuracy and CPU Time comparison for the Space Shuttle Problem
(SNOPT).
CD3          6.287   62.78   483    472.1     7.402   26.52   3094   1.2·10^4    13.917   48.33   1706   2.8·10^4
CD5          6.306   62.80   767    675.8     7.547   27.01   6635   2.7·10^4    13.911   48.32   1168   1.9·10^4
CD7          6.280   62.87   687    628.85    6.867   24.63   1971   8.1·10^3    13.920   48.31   1448   2.5·10^4
Compl.-Step  6.526   62.84   439    313.6     6.983   25.24   3044   1.2·10^4    13.910   48.33   4177   7.2·10^4
Dual-Step    6.285   62.76   574    518.7     6.932   21.93   3493   1.3·10^4    13.857   47.14   2425   4.1·10^4
Table 7.2: Accuracy and CPU Time comparison for the Space Shuttle Problem
(IPOPT).
Figure 7.8 shows the trend of the mean error (in logarithmic scale) between the solutions computed with SPARTAN and the propagated solutions as a function of the number of nodes. For each of the five differentiation schemes we observe the spectral (exponential) convergence of the solution. The dual-step related plot appears to be much smoother than the complex-step one.
Figure 7.8: Spectral Convergence for the Space Shuttle Reentry Problem.
Figure 7.11: Orbit Raising Problem - Trajectory Optimizing Final Orbit Energy
(LU=Unitary Length).
Figure 7.12: Discrepancy between optimized and propagated solutions for the Or-
bit Raising Problem.
Tables 7.3 and 7.4 summarize the results obtained with SPARTAN using the five different differentiation schemes and the two different NLP solvers.
Table 7.3: Accuracy and CPU Time comparison for the Orbit Raising Problem
(SNOPT).
CD3          0.0019   0.0118   87    47.83    0.0037   0.0225   90    333.5    0.0054   0.0330   184   1.61·10^3
CD5          0.0019   0.0118   72    40.65    0.0037   0.0225   87    317.7    0.0054   0.0330   103   934.9
CD7          0.0019   0.0118   86    46.99    0.0037   0.0225   108   388.8    0.0054   0.0330   126   1.09·10^3
Compl.-Step  0.0019   0.0118   110   58.13    0.0037   0.0225   101   360.8    0.0054   0.0330   125   1.13·10^3
Dual-Step    0.0019   0.0118   75    42.0     0.0037   0.0225   114   394.4    0.0054   0.0330   175   1.56·10^3
Table 7.4: Accuracy and CPU Time comparison for the Orbit Raising Problem
(IPOPT).
Figure 7.14 shows the trend of the mean error (in logarithmic scale) between the solutions computed with SPARTAN and the propagated solutions as a function of the number of nodes.
$0 \leq c_L \leq 1.4$    (7.19)
Figures 7.15, 7.16 and 7.17 illustrate the states, the controls and the discrepancy between the optimized and propagated solutions. These results are obtained using the dual-step derivative method, with a number of nodes equal to 100. Since this example is solved in [1], we can compare the results. The comparison of Figures 7.15, 7.16 and 7.18 shows the consistency of the results. In more detail, in Figure 7.18 the dashed line is the initial guess, which has been computed using linear interpolation between the boundary conditions, with $x(t_f) = 1250$ and $c_L(0) = c_L(t_f) = 1$.
Tables 7.5 and 7.6 summarize the results obtained with SPARTAN using the five different differentiation schemes. In the first table SNOPT is used to solve the NLP problem, while in the second table IPOPT is employed in SPARTAN as NLP solver. Furthermore, Figure 7.19 shows the trend of the mean error (in logarithmic scale) between the optimized and propagated solutions as a function of the number of nodes. For each of the five differentiation schemes we observe the spectral (exponential) convergence of the solution, as expected from the use of pseudospectral methods.
Figure 7.17: Discrepancy between optimized and propagated solutions for the
Hang Glider Problem.
Table 7.5: Accuracy and CPU Time comparison for the Hang Glider Problem
(SNOPT).
CD3          0.0070   0.0817   80   42.32    0.0015   0.0521   133   344.5    0.0011   0.0099   136   1.59·10^3
CD5          0.0070   0.0817   89   39.65    0.0015   0.0521   115   332.6    0.0011   0.0099   125   1.65·10^3
CD7          0.0070   0.0817   89   38.14    0.0015   0.0521   91    359.23   0.0011   0.0099   144   1.32·10^3
Compl.-Step  0.0070   0.0817   89   40.42    0.0015   0.0521   116   343.6    0.0011   0.0099   178   1.31·10^3
Dual-Step    0.0070   0.0817   89   38.47    0.0015   0.0521   117   338.1    0.0011   0.0099   137   578.36
Table 7.6: Accuracy and CPU Time comparison for the Hang Glider Problem
(IPOPT).
7.5 Conclusions
In this chapter the general formulation of an optimal control problem and the main numerical approaches used to solve it have been briefly summarized. We have focused on SPARTAN and on the structure of the Jacobian matrix which describes the discrete, transcribed OCP, that is, the resulting NLP. Then, the advanced differentiation schemes defined in the previous chapters have been implemented in SPARTAN in order to solve three different examples of optimal control problems. Each of the three OCPs has been solved using two different NLP solvers (SNOPT and IPOPT).
The results in terms of accuracy and CPU time show that the effects of the use of the dual-step derivative method, as well as of the other schemes, in combination with the pseudospectral methods are strongly influenced by the nonlinear behaviour of the equations which describe the problem, by the number of nodes used to discretize the problem under analysis, and by the NLP solver which has been selected.
Overall, among the NLP solvers, SNOPT has been demonstrated to provide better accuracy than IPOPT and, in addition, the computational power required for the greater accuracy is far less than the one required by IPOPT. Furthermore, considering SNOPT, the dual-step method provides better accuracy than the other differentiation schemes as the number of nodes increases, and the improvements in accuracy are paid for with an increased CPU time. On the other hand, considering IPOPT, there are no significant differences in terms of accuracy when different differentiation schemes are implemented in SPARTAN, but these schemes have different effects in terms of CPU time. In some cases, the dual-step method in combination with IPOPT provides significant savings in CPU time, which lead to a faster optimization.
In addition, the trend of the mean error between the SPARTAN solutions and the propagated ones (using the Runge-Kutta 45 scheme) as a function of the number of nodes has been analysed for each of the three problems in order to verify the spectral convergence of the solution.
In conclusion, it is not possible to define a priori the most convenient differentiation method to be implemented in SPARTAN. Therefore, a trade-off between the desired quality of the results (e.g., hypersensitive problems) and the CPU time (e.g., trajectory database generation) can be found according to the specific number of nodes, to the NLP solver used and to the behaviour of the equations which describe the problem under analysis. However, the dual-step method has been demonstrated to be a valid alternative to the other traditional, well-known differentiation schemes, and it is worth being considered as a valid method to solve the OCP with SPARTAN.
Chapter 8
Further Tools: Robust Differentiation via Sliding Mode Technique
Overview
In the previous chapters, different differentiation schemes have been anal-
ysed in order to define the method that provides the best accuracy in the
computation of the derivatives and the Jacobian matrix. This analysis is of
great importance considering that gradient methods for solving NLPs require
the computation of the derivatives of the objective function and constraints.
Therefore, the accuracy of the derivatives has a strong impact on the com-
putational efficiency and reliability of the solution.
In this chapter we deal with the problem of the differentiation of signals given in real time, with the aim of designing a robust differentiator. Real-time differentiation is an old and well-studied problem; the main difficulty is the inherent sensitivity of differentiation to input noise. Given an input signal consisting of noise and an unknown base signal, the goal of a robust differentiator is to find real-time robust estimations of the derivatives of the base signal which are exact in the absence of measurement noise. Combining differentiation exactness with robustness with respect to possible measurement errors and input noise is a challenging task, and one particular approach to robust differentiator design is the so-called robust differentiation via sliding mode technique.
In the first section the main concepts of the theory of sliding mode control are briefly summarized in order to underline the potential advantages of using it for this purpose.
Consider a single-input dynamic system of the form

$x^{(n)} = f(x, t) + b(x, t)\,u + d(t).$    (8.1)

The control problem is to get the state $x$ to track a specific reference state $x_{ref} = [x_d, \dot{x}_d, \ldots, x_d^{(n-1)}]^T$ in the presence of model imprecision on $f(x, t)$ and $b(x, t)$ and of disturbances $d(t)$. Defining the tracking error vector $\tilde{x} := x - x_{ref} = [\tilde{x}, \dot{\tilde{x}}, \ldots, \tilde{x}^{(n-1)}]$, we must assume

$\tilde{x}\,|_{t=0} = 0.$    (8.2)

A time-varying sliding surface $s(\tilde{x}; t) = 0$ is then defined by a scalar function $s$ of the tracking error (8.3), and the control is required to satisfy

$\frac{1}{2}\frac{d}{dt}\, s^2(\tilde{x}; t) \leq -\eta\,|s|$    (8.4)

where $\eta$ is a positive constant. Relation (8.4) is equivalent to

$\dot{s}\,\mathrm{sign}(s) \leq -\eta.$
Indeed, the sign function has the important property that $s\,\mathrm{sign}(s) = |s|$. Inequality (8.4), often termed either the sliding condition or the reachability condition, constrains trajectories to point towards the sliding surface $s(t)$ and to remain on it thereafter. The idea behind conditions (8.3) and (8.4) is to pick a well-behaved function of the tracking error, $s$, according to (8.3), and then select the feedback control law $u$ in (8.1) such that $s^2$ satisfies equation (8.4) despite the presence of model imprecision and of disturbances [6].
The control $u$ that drives the state variables to the sliding surface $s$ in a finite time, and keeps them on the surface thereafter in the presence of bounded disturbances, is called a sliding mode controller, and an ideal sliding mode is said to be taking place in the system. Control laws that satisfy (8.4) have to be discontinuous across the sliding surface. In more detail, sliding mode control is usually a high-frequency switching control, with a switching frequency which is finite due to the discrete-time nature of the computer simulation. In practice, this high-frequency switching causes the so-called control chattering, i.e., a finite-frequency zig-zag motion in the sliding mode. In an ideal sliding mode the switching frequency is supposed to approach infinity while the amplitude of the zig-zag motion tends to zero.
8.1.2 Example
The main advantages of the sliding mode control, including robustness, finite-time convergence, and reduced-order compensated dynamics, are demonstrated on a tutorial example taken from [7]. In the example, a single-dimensional motion of a unit mass is considered. If we introduce variables for the position and the velocity, $x_1 = x$ and $x_2 = \dot{x}_1$, a state-variable description is the following

$\dot{x}_1 = x_2$
$\dot{x}_2 = u + f(x_1, x_2, t)$    (8.6)

where $u$ is the control force, and the disturbance term $f(x_1, x_2, t)$, which may include dry and viscous friction as well as any other unknown resistance forces, is assumed to be bounded, i.e., $|f(x_1, x_2, t)| \leq L$ with $L > 0$. The problem is to design a feedback control law $u = u(x_1, x_2)$ that drives the mass to the origin asymptotically ($x_1, x_2 \to 0$ as $t \to \infty$). First, we introduce a new variable, called the sliding variable, in the state space of the system (8.6):
the convergence of the state variables to zero and the state trajectory, in the presence of the external bounded disturbance $f(x_1, x_2, t) = \sin(2t)$ and of the sliding mode control $u(x_1, x_2) = -c\,x_2 - \rho\,\mathrm{sign}(\sigma)$. It is possible to identify a reaching phase, when the state trajectory is driven towards the sliding surface, and a sliding phase, when the state trajectory is moving along the sliding surface towards the origin.
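The behaviour just described can be reproduced with the following MATLAB sketch; the sliding variable $\sigma = c\,x_1 + x_2$ and the numerical values of $c$ and $\rho$ are illustrative choices consistent with the text, not the exact settings of [7].

% Sliding mode control of the unit mass (8.6) - illustrative simulation.
c = 1.5;  rho = 2.0;                 % surface slope and switching gain (rho > L)
dt = 1e-3;  N = round(10/dt);        % fixed step: the switching frequency stays finite
x = [1; -2];  X = zeros(N,2);
for k = 1:N
    t = k*dt;
    sigma = c*x(1) + x(2);           % sliding variable
    u = -c*x(2) - rho*sign(sigma);   % sliding mode control law
    f = sin(2*t);                    % bounded disturbance, |f| <= L = 1
    x = x + dt*[x(2); u + f];        % Euler step of (8.6)
    X(k,:) = x.';
end
plot(X(:,1), X(:,2));  grid on;  xlabel('x_1');  ylabel('x_2')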
So far, we have assumed that all the state variables are measured (available). In many cases, the full state is not available, and then a sliding surface definition similar to (8.3) is not adequate. By means of some modifications of the algorithm, it is possible to define a sliding mode observer, which can be treated as a differentiator, since the variable it estimates is a derivative of the measured variable.
As a test case, consider the true system and the sliding mode observer

$\dot{x}_1 = x_2$
$\dot{x}_2 = -x_1^3 - f(x_1, x_2) + u$
$z = x_1 + \nu$

$\dot{\hat{x}}_1 = -\alpha_1 \tilde{z} + \hat{x}_2 - k_1\,\mathrm{sign}(\tilde{z})$
$\dot{\hat{x}}_2 = -\alpha_2 \tilde{z} - \hat{x}_1^3 - \hat{f}(\hat{x}_1, \hat{x}_2) + u - k_2\,\mathrm{sign}(\tilde{z})$
$\tilde{z} = \hat{x}_1 - z$

with the true parameter set ($1.0$, $F_s = 1.0$, $F_d = 0.75$), the assumed model set ($1.0$, $F_s = 1.25$, $F_d = 1.00$), and the observer gains $\alpha_1 = 3.8$, $\alpha_2 = 7.2$.    (8.9)
The true system is excited by a sinusoidal input $u = \sin(t)$ and the initial conditions are $x_1(0) = 0.0$ and $x_2(0) = 0.5$, with the estimated initial conditions $\hat{x}_1(0) = 0.0$ and $\hat{x}_2(0) = 0.2$.
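A minimal simulation sketch of this test case follows; the friction terms f and f_hat are illustrative stand-ins built from the Fs and Fd values above, and the switching gains k1 and k2 are assumed values, since the original ones are not reported here.

% Sliding mode observer of (8.9) - illustrative fixed-step simulation.
a1 = 3.8;  a2 = 7.2;                          % observer gains alpha_1, alpha_2
k1 = 0.5;  k2 = 1.0;                          % switching gains (assumed values)
f_true = @(x1,x2) 1.00*sign(x2) + 0.75*x2;    % stand-in friction (Fs = 1.00, Fd = 0.75)
f_hat  = @(x1,x2) 1.25*sign(x2) + 1.00*x2;    % mismatched model (Fs = 1.25, Fd = 1.00)
dt = 1e-4;  N = round(10/dt);
x  = [0.0; 0.5];                              % true initial state
xh = [0.0; 0.2];                              % estimated initial state
for k = 1:N
    t  = k*dt;   u = sin(t);
    z  = x(1);                                % measured output (noise-free here)
    zt = xh(1) - z;                           % output error z_tilde
    x  = x  + dt*[x(2); -x(1)^3 - f_true(x(1),x(2)) + u];
    xh = xh + dt*[xh(2) - a1*zt - k1*sign(zt); ...
                  -xh(1)^3 - f_hat(xh(1),xh(2)) + u - a2*zt - k2*sign(zt)];
end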
8.3 Conclusions
In this chapter the sliding mode technique and the robust differentiation via sliding mode have been analysed. The main concepts of the sliding mode control for tracking problems, as well as state and input observation, have been briefly summarized. The
main advantages of the sliding mode control, such as the robustness, the finite-time convergence, and the reduced-order compensated dynamics, have been demonstrated on a tutorial example. The need to employ the sliding mode technique to construct a robust observer/differentiator has been justified, and two robust differentiators based on the sliding mode technique have been analysed.
Robust differentiation via sliding mode control has been proved to be able to provide accurate, real-time estimations of the derivatives of a base signal when nothing is known about its structure except some differential inequalities, in spite of the presence of measurement noise.
Chapter 9
Conclusions
- Extension of the dual numbers class into a dual quaternion class, to obtain a stable and efficient representation of the 6-DOF motion.
Appendix A
Dual and Hyper-Dual Numbers
A.1 Introduction
This appendix provides the necessary background that is required for the
computation of the derivatives of a generic function using either the dual
numbers or the hyper-dual numbers. For further details see [16].
$z = a + b\,\varepsilon$    (A.1)

Dual numbers extend the real numbers in a similar way to the complex numbers. Indeed, like the dual numbers, the complex numbers adjoin a new element $i$, for which $i^2 = -1$, and every complex number has the form $z = a + b\,i$, where $a$ and $b$ are real numbers. The above definition (A.1) relies on the idea that $\varepsilon^2 = 0$ with $\varepsilon \neq 0$.
A.2.2 Properties
In order to implement the dual numbers, the operations on these numbers should be properly defined. Indeed, given three dual numbers $a$, $b$ and $c$, it is possible to demonstrate that the following properties hold:

- Additive associativity: $(a + b) + c = a + (b + c)$.
- Additive commutativity: $a + b = b + a$.
- Additive identity: $0 + a = a + 0 = a$.
- Multiplicative identity: $1 \cdot a = a \cdot 1 = a$.
- Multiplicative inverse: $a \cdot a^{-1} = a^{-1} \cdot a = 1$.
- Multiplicative inverse: $\frac{1}{a} = \frac{1}{a_1} - \frac{a_2}{a_1^2}\,\varepsilon$.

The multiplicative inverse is therefore defined for all numbers of this type with a non-zero real part $a_1$. Division can be defined as follows.

- Division: $\frac{a}{b} = \frac{a_1}{b_1} + \frac{a_2 b_1 - a_1 b_2}{b_1^2}\,\varepsilon$.
- Conjugate: $a_{conj} = a_1 - a_2\,\varepsilon$.
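As a quick worked check of the division rule (with illustrative numbers): for $a = 3 + 2\varepsilon$ and $b = 1 + 4\varepsilon$, $a/b = \frac{3}{1} + \frac{2 \cdot 1 - 3 \cdot 4}{1^2}\,\varepsilon = 3 - 10\,\varepsilon$, and indeed $(3 - 10\varepsilon)(1 + 4\varepsilon) = 3 + (12 - 10)\,\varepsilon = 3 + 2\varepsilon$.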
sin(x) Example.
The Maclaurin series of the function $\sin(x)$ for $x \in \mathbb{R}$ is the following

$\sin(x) = x - \frac{x^3}{6} + \frac{x^5}{120} - \frac{x^7}{5040} + \ldots + \frac{(-1)^n}{(2n+1)!}\,x^{2n+1}.$    (A.2)

For $x = x_1 + x_2\,\varepsilon$, the odd powers reduce to $x^3 = x_1^3 + 3 x_1^2 x_2\,\varepsilon$, $x^5 = x_1^5 + 5 x_1^4 x_2\,\varepsilon$, and so on. These results are then added according to the Maclaurin series so that
Recognizing that

$\cos(x) = 1 - \frac{x^2}{2} + \frac{x^4}{24} - \frac{x^6}{720} + \ldots + \frac{(-1)^n}{(2n)!}\,x^{2n}$    (A.6)

the expression (A.5) can be simplified to give [13]

$\sin(x) = \sin(x_1) + x_2 \cos(x_1)\,\varepsilon.$    (A.7)
ln(x) Example.
The Maclaurin series of the function $\ln(1+x)$ for $x \in \mathbb{R}$ is the following

$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots + \frac{(-1)^{n+1}}{n}\,x^n.$    (A.8)

For $x = x_1 + x_2\,\varepsilon$, using the above expressions for the powers of $x$ yields

$\ln(1+x) = \left(x_1 - \frac{x_1^2}{2} + \frac{x_1^3}{3} - \frac{x_1^4}{4} + \ldots\right) + x_2\left(1 - x_1 + x_1^2 - x_1^3 + \ldots\right)\varepsilon$    (A.9)

where $\frac{1}{1+x_1} = 1 - x_1 + x_1^2 - x_1^3 + \ldots$ and $\frac{1}{(1+x_1)^2} = 1 - 2x_1 + 3x_1^2 - 4x_1^3 + \ldots$, so [13]

$\ln(1 + x) = \ln(1 + x_1) + \frac{x_2}{1 + x_1}\,\varepsilon$    (A.10)

$\ln(x) = \ln(x_1) + \frac{x_2}{x_1}\,\varepsilon.$    (A.11)
Function f(x).
The above results can also be derived from the Taylor series for a general function $f(x)$

$f(x + z) = f(x) + z\,f'(x) + \frac{z^2 f''(x)}{2!} + \frac{z^3 f'''(x)}{3!} + \ldots$    (A.12)

where $z$ is a dual number so that

$z = b\,\varepsilon$    (A.13)
$z^2 = 0$    (A.14)
$z^3 = 0.$    (A.15)

These terms are then added according to the Taylor series [13], so that

$f(x + b\,\varepsilon) = f(x) + b\,f'(x)\,\varepsilon.$    (A.16)

It is implicit that each function extended in the dual plane hides its derivative in its dual part. For this reason it is possible to state that the dual-step approach can be considered as belonging to the class of the automatic differentiation methods as well.
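As a quick numeric illustration of (A.16): for $f(x) = x^2$, $f(3 + \varepsilon) = (3 + \varepsilon)^2 = 9 + 6\varepsilon + \varepsilon^2 = 9 + 6\varepsilon$, so the real part returns $f(3) = 9$ and the dual part returns $f'(3) = 6$ exactly.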
A.2.5 Implementation
The dual numbers have been implemented in MATLAB as a new class of numbers, using operator overloading. This section contains their implementation, which is based on Ref. [17], with some minor modifications. The new class includes definitions for the standard algebraic operations, logical comparison operations, and other more general functions such as the exponential, the sine and so on. This class definition file allows a real-valued analysis code to be easily converted to operate on dual numbers by just changing the variable type declarations; the structure of the code remains unchanged.
classdef Dual
properties
x = 0;
d = 0;
end
methods
% Constructor
function obj = Dual(x,d)
if size(x,1) ~= size(d,1) || size(x,2) ~= size(d,2)
error('DUAL:constructor','X and D are different size')
else
obj.x = x;
obj.d = d;
end
end
% Getters
function v = getvalue(a)
v = a.x;
end
function d = getderiv(a)
d = a.d;
end
% Indexing
function B = subsref(A,S)
switch S.type
case '()'
idx = S.subs;
switch length(idx)
case 1
B = Dual(A.x(idx{1}),A.d(idx{1}));
case 2
B = Dual(A.x(idx{1},idx{2}), A.d(idx{1},idx{2}));
otherwise
error('Dual:subsref','Arrays with more than 2 dims not supported')
ds{k} = tmp.d;
end
A = Dual(vertcat(xs{:}), vertcat(ds{:}));
end
% Plotting functions
function plot(X,varargin)
if length(varargin) < 1
Y = X;
X = 1:length(X.x);
elseif isdual(X) && isdual(varargin{1})
Y = varargin{1};
varargin = varargin(2:end);
elseif isdual(X)
Y = X;
X = 1:length(X);
elseif isdual(varargin{1})
Y = varargin{1};
varargin = varargin(2:end);
end
if isdual(X)
plot(X.x,[Y.x(:) Y.d(:)],varargin{:})
else
plot(X,[Y.x(:) Y.d(:)],varargin{:})
end
grid on
legend({'Function','Derivative'})
end
% Comparison operators
function res = eq(a,b)
if isdual(a) && isdual(b)
res = a.x == b.x;
elseif isdual(a)
res = a.x == b;
elseif isdual(b)
res = a == b.x;
end
end
function res = ne(a,b) % MATLAB maps ~= to ne, not neq
if isdual(a) && isdual(b)
res = a.x ~= b.x;
elseif isdual(a)
res = a.x ~= b;
elseif isdual(b)
res = a ~= b.x;
end
end
function res = lt(a,b)
if isdual(a) && isdual(b)
res = a.x < b.x;
elseif isdual(a)
res = a.x < b;
elseif isdual(b)
res = a < b.x;
end
end
function res = le(a,b)
if isdual(a) && isdual(b)
res = a.x <= b.x;
elseif isdual(a)
res = a.x <= b;
elseif isdual(b)
res = a <= b.x;
end
end
function res = gt(a,b)
if isdual(a) && isdual(b)
res = a.x > b.x;
elseif isdual(a)
res = a.x > b;
elseif isdual(b)
res = a > b.x;
end
end
function res = ge(a,b)
if isdual(a) && isdual(b)
res = a.x >= b.x;
elseif isdual(a)
res = a.x >= b;
elseif isdual(b)
res = a >= b.x;
end
end
function res = isnan(a)
res = isnan(a.x);
end
function res = isinf(a)
res = isinf(a.x);
end
function res = isfinite(a)
res = isfinite(a.x);
end
% Unary operators
function obj = uplus(a)
obj = a;
end
function obj = uminus(a)
obj = Dual(-a.x, -a.d);
end
function b = isdual(a)
% ISDUAL Return true if a is of class Dual, else return false
b = strcmp(class(a),'Dual');
end
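As a usage sketch (assuming the arithmetic and transcendental overloads defined on the pages omitted above, such as times and exp), a derivative is obtained by seeding the dual part with 1:

% Dual-step derivative of f(x) = exp(x)*sin(x) at x0 = 0.8 (sketch).
f  = @(x) exp(x) .* sin(x);      % relies on the overloaded exp, sin and times
x0 = 0.8;
fd = f(Dual(x0, 1));             % seed: dx/dx = 1 in the dual part
val = getvalue(fd);              % f(x0)
der = getderiv(fd);              % f'(x0) = exp(x0)*(sin(x0) + cos(x0)), exact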
An extension of the dual numbers is represented by the hyper-dual numbers, which have the form

$x = x_1 + x_2\,\varepsilon_1 + x_3\,\varepsilon_2 + x_4\,\varepsilon_1\varepsilon_2.$    (A.17)

A hyper-dual number has one real part and three non-real parts, with the properties

$\varepsilon_1^2 = \varepsilon_2^2 = (\varepsilon_1\varepsilon_2)^2 = 0$    (A.18)

where

$\varepsilon_1 \neq \varepsilon_2 \neq \varepsilon_1\varepsilon_2 \neq 0$    (A.19)

or, in other words,

$\varepsilon_1^2 = 0, \quad \varepsilon_1 \neq 0$    (A.20)
$\varepsilon_2^2 = 0, \quad \varepsilon_2 \neq 0$    (A.21)
$(\varepsilon_1\varepsilon_2)^2 = 0, \quad \varepsilon_1\varepsilon_2 \neq 0.$    (A.22)

The properties of these numbers are exactly the same as the ones concerning the dual numbers (see section A.2.2).

- Addition: $a + b = (a_1 + b_1) + (a_2 + b_2)\varepsilon_1 + (a_3 + b_3)\varepsilon_2 + (a_4 + b_4)\varepsilon_1\varepsilon_2.$
- Multiplication: $a\,b = (a_1 b_1) + (a_1 b_2 + a_2 b_1)\varepsilon_1 + (a_1 b_3 + a_3 b_1)\varepsilon_2 + (a_1 b_4 + a_2 b_3 + a_3 b_2 + a_4 b_1)\varepsilon_1\varepsilon_2.$

Using the definition of multiplication, the multiplicative inverse can be defined as

- Multiplicative inverse: $\frac{1}{a} = \frac{1}{a_1} - \frac{a_2}{a_1^2}\,\varepsilon_1 - \frac{a_3}{a_1^2}\,\varepsilon_2 + \left(\frac{2 a_2 a_3}{a_1^3} - \frac{a_4}{a_1^2}\right)\varepsilon_1\varepsilon_2.$

The multiplicative inverse is therefore defined for all numbers of this type with a non-zero real part $a_1$. Division can be defined as follows.

- Division: $\frac{a}{b} = \frac{a_1}{b_1} + \left(\frac{a_2}{b_1} - \frac{a_1 b_2}{b_1^2}\right)\varepsilon_1 + \left(\frac{a_3}{b_1} - \frac{a_1 b_3}{b_1^2}\right)\varepsilon_2 + \left(\frac{a_4}{b_1} - \frac{a_2 b_3}{b_1^2} - \frac{a_3 b_2}{b_1^2} - \frac{a_1 b_4}{b_1^2} + \frac{2 a_1 b_2 b_3}{b_1^3}\right)\varepsilon_1\varepsilon_2.$
The higher-order terms are all zero by the definition $\varepsilon_1^2 = \varepsilon_2^2 = (\varepsilon_1\varepsilon_2)^2 = 0$, so there is no truncation error. The first and second derivatives are the leading terms of the non-real parts, meaning that if $f'(x)$ is desired one simply looks at the $\varepsilon_1$ or $\varepsilon_2$ part and divides by the appropriate step, and if $f''(x)$ is desired one looks at the $\varepsilon_1\varepsilon_2$ part:

$f'(x) = \frac{\varepsilon_1\mathrm{part}[f(x + h_1\varepsilon_1 + h_2\varepsilon_2 + 0\,\varepsilon_1\varepsilon_2)]}{h_1}$    (A.24)

$f'(x) = \frac{\varepsilon_2\mathrm{part}[f(x + h_1\varepsilon_1 + h_2\varepsilon_2 + 0\,\varepsilon_1\varepsilon_2)]}{h_2}$    (A.25)

$f''(x) = \frac{\varepsilon_1\varepsilon_2\mathrm{part}[f(x + h_1\varepsilon_1 + h_2\varepsilon_2 + 0\,\varepsilon_1\varepsilon_2)]}{h_1 h_2}.$    (A.26)

The derivative calculations are not even subject to subtractive cancellation error, so the use of hyper-dual numbers results in first and second derivative calculations that are exact, regardless of the step size.
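As a quick worked illustration: for $f(x) = x^2$ with $h_1 = h_2 = 1$, $f(3 + \varepsilon_1 + \varepsilon_2) = 9 + 6\varepsilon_1 + 6\varepsilon_2 + 2\varepsilon_1\varepsilon_2$, so the $\varepsilon_1$ and $\varepsilon_2$ parts give $f'(3) = 6$ and the $\varepsilon_1\varepsilon_2$ part gives $f''(3) = 2$, both exact.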
The real part returns the original function evaluated at the real argument $\mathrm{Re}(x)$, and it is mathematically impossible for the derivative calculations to affect the real part. Indeed, the use of this new number system to compute the first and second derivatives involves converting a real-valued function evaluation to operate on these alternative types of numbers. Then, the derivatives are computed by adding a perturbation to the non-real parts and evaluating the modified function. The mathematics of these alternative types of numbers is such that, when operations are carried out on the real part of the number, derivative information for those operations is formed and stored in the non-real parts of the number. At every stage of the function evaluation the non-real parts of the number contain derivative information with respect to the input. This process must be repeated for every input variable for which derivative information is desired [14]. Following the above discussion, methods for computing exact higher derivatives can be created by using more non-real parts. For instance, to produce $n$th derivatives, $n$th-order hyper-dual numbers would be used. These $n$th-order hyper-dual numbers have $n$ components $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ and all of their combinations. If only the first derivatives are needed, first-order hyper-dual numbers would be used: the dual numbers.
A.3.4 Implementation
The hyper-dual numbers have been implemented in MATLAB as a new class of numbers, using operator overloading. This section contains their implementation, which is based on the dual numbers class (available in section A.2.5). The new class includes definitions for the standard algebraic operations, logical comparison operations, and other more general functions such as the exponential, the sine and so on. This class definition file allows a real-valued analysis code to be easily converted to operate on hyper-dual numbers by just changing the variable type declarations; the structure of the code remains unchanged.
classdef HyperDual
properties
x = 0;
d1 = 0;
d2 = 0;
d3 = 0;
end
methods
% Constructor
function obj = HyperDual(x,d1,d2,d3)
if size(x,1) ~= size(d1,1) || size(x,2) ~= size(d1,2) ...
|| size(x,1) ~= size(d2,1) || size(x,2) ~= size(d2,2) ...
|| size(x,1) ~= size(d3,1) || size(x,2) ~= size(d3,2)
error('DUAL:constructor','X and D are different size')
else
obj.x = x;
obj.d1 = d1;
obj.d2 = d2;
obj.d3 = d3;
end
end
% Getters
function v = getvalue_h(a)
v = a.x;
end
function d1 = getderiv_1(a)
d1 = a.d1;
end
function d2 = getderiv_2(a)
d2 = a.d2;
end
A.d2(idx{1}) = B.d2;
A.d3(idx{1}) = B.d3;
case 2
A.x(idx{1},idx{2}) = B.x;
A.d1(idx{1},idx{2}) = B.d1;
A.d2(idx{1},idx{2}) = B.d2;
A.d3(idx{1},idx{2}) = B.d3;
otherwise
error('Dual:subsref','Arrays with more than 2 dims not supported')
end
end
% Comparison operators
function res = eq(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x == b.x;
elseif ishyperdual(a)
res = a.x == b;
elseif ishyperdual(b)
res = a == b.x;
end
end
function res = ne(a,b) % MATLAB maps ~= to ne, not neq
if ishyperdual(a) && ishyperdual(b)
res = a.x ~= b.x;
elseif ishyperdual(a)
res = a.x ~= b;
elseif ishyperdual(b)
res = a ~= b.x;
end
end
function res = lt(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x < b.x;
elseif ishyperdual(a)
res = a.x < b;
elseif ishyperdual(b)
res = a < b.x;
end
end
function res = le(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x <= b.x;
elseif ishyperdual(a)
res = a.x <= b;
elseif ishyperdual(b)
res = a <= b.x;
end
end
function res = gt(a,b)
elseif ishyperdual(a)
obj = HyperDual(a.x + b, a.d1, a.d2, a.d3);
elseif ishyperdual(b)
obj = HyperDual(a + b.x, b.d1, b.d2, b.d3);
end
end
function obj = minus(a,b)
if ishyperdual(a) && ishyperdual(b)
obj = HyperDual(a.x - b.x, a.d1 - b.d1, a.d2 - b.d2, a.d3 - b.d3);
elseif ishyperdual(a)
obj = HyperDual(a.x - b, a.d1, a.d2, a.d3);
elseif ishyperdual(b)
obj = HyperDual(a - b.x, -b.d1, -b.d2, -b.d3);
end
end
end
function obj = cos(a)
obj = HyperDual(cos(a.x), -a.d1 .* sin(a.x), -a.d2 .* sin(a.x),...
(-a.d3 .* sin(a.x))-(a.d1 .* a.d2 .* cos(a.x)));
end
function obj = tan(a)
obj = HyperDual(tan(a.x), a.d1 .* sec(a.x).^2, a.d2 .* sec(a.x).^2,...
a.d3 .* sec(a.x).^2 + a.d1 .* a.d2 .* (2 .* tan(a.x) .* sec(a.x).^2));
end
function obj = asin(a)
obj = HyperDual(asin(a.x), a.d1 ./ sqrt(1-(a.x.^2)),...
a.d2 ./ sqrt(1-(a.x.^2)), a.d3 ./ sqrt(1-(a.x.^2))...
+ a.d1 .* a.d2 .* (a.x .* (1-(a.x.^2)).^(-3/2)));  % second-derivative term: x*(1-x^2)^(-3/2)
end
function obj = atan(a)
obj = HyperDual(atan(a.x), a.d1 ./ (1+(a.x).^2), a.d2 ./ (1+(a.x).^2),...
a.d3 ./ (1+(a.x).^2) + a.d1 .* a.d2 .*...
(-2 .* a.x ./ (1+a.x.^2).^2));
end
end
% Hyperbolic trig functions
function obj = sinh(a)
obj = (exp(a)-exp(-a)) ./ 2;
end
function obj = cosh(a)
obj = (exp(a)+exp(-a)) ./ 2;
end
function obj = tanh(a)
obj = (exp(a)-exp(-a)) ./ (exp(a)+exp(-a)) ;
end
function obj = asinh(a)
obj = log(a + sqrt(a.^2+1));
end
function obj = acosh(a)
obj = log(a + sqrt(a.^2-1));
end
function obj = atanh(a)
obj = log((sqrt(1-a.^2)) ./ (1-a));
end
end
end
function b = ishyperdual(a)
% ISHYPERDUAL Return true if a is of class HyperDual,
% else return false
b = strcmp(class(a),'HyperDual');
end
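As a usage sketch of (A.24)-(A.26) with this class, the first and second derivatives of the cosine follow from a single evaluation; the getter for the eps_1*eps_2 part, here called getderiv_3, is assumed to be defined analogously to getderiv_1 on the omitted pages.

% Hyper-dual first and second derivatives of cos at x0 (sketch).
x0 = 0.7;  h1 = 1;  h2 = 1;          % unit steps; the result is step-independent
xh = HyperDual(x0, h1, h2, 0);       % perturb the eps_1 and eps_2 parts
fh = cos(xh);                        % uses the overloaded cos listed above
f1 = getderiv_1(fh) / h1;            % f'(x0)  = -sin(x0), from the eps_1 part
f2 = getderiv_3(fh) / (h1*h2);       % f''(x0) = -cos(x0), from the eps_1*eps_2 part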
Bibliography
[1] John T. Betts: Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, SIAM - Society for Industrial and Applied Mathematics, Philadelphia, Second Edition, 2010.
[2] John H. Mathews and Kurtis K. Fink: Numerical Methods Using Matlab, Pearson, New Jersey, Fourth Edition, 2004.
[3] http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/central-differences/
[4] Joaquim R. R. A. Martins, Peter Sturdza and Juan J. Alonso: The Complex-Step Derivative Approximation, ACM Transactions on Mathematical Software, Vol. 29, No. 3, September 2003.
[6] J.-J. E. Slotine, J. K. Hedrick and E. A. Misawa: On Sliding Observers for Nonlinear Systems, Journal of Dynamic Systems, Measurement, and Control, Vol. 109, p. 245, September 1987.
[11] Sagliano, M., Samaan M., Theil S., Mooij E.: SHEFEX-3 Optimal
Feedback Entry Guidance, AIAA SPACE 2014 Conference and Exposi-
tion, AIAA 2014-4208, San Diego, CA, 2014, doi:10.2514/6.2014-4208.
[12] Arslantas Y. E., Oehlschlagel T., Sagliano M., Theil S., Braxmaier C.: Safe Landing Area Determination for a Moon Lander by Reachability Analysis, 17th International Conference on Hybrid Systems: Computation and Control (HSCC), Berlin, Germany, 2014.
[14] Jeffrey A. Fike, S. Jongsma, Juan J. Alonso, Edwin van der Weide:
Optimization with Gradient and Hessian Information Calculated Using
Hyper-Dual Numbers, AIAA Applied Aerodynamics Conference, 27-30
June 2011, Honolulu, Hawaii.
[17] https://gist.github.com/chris-taylor/2005955