
UNIVERSITÀ DEGLI STUDI DI NAPOLI FEDERICO II

SCUOLA POLITECNICA E DELLE SCIENZE DI BASE

DIPARTIMENTO DI INGEGNERIA INDUSTRIALE

MASTER'S THESIS IN AEROSPACE ENGINEERING

IMPLEMENTATION OF ADVANCED DIFFERENTIATION METHODS
FOR OPTIMAL TRAJECTORY COMPUTATION

SUPERVISORS: PROF. MICHELE GRASSI, ING. MARCO SAGLIANO
CANDIDATE: VINCENZO DONOFRIO (M53/363)

ACADEMIC YEAR 2013/2014


Abstract

Nowadays, the increased capabilities of CPUs have constantly encouraged researchers and engineers to investigate numerical optimization as an analysis and synthesis tool for generating optimal trajectories and the controls to track them. In particular, one of the most promising techniques is represented by direct methods. Among these, Pseudospectral Methods are gaining widespread acceptance for their straightforward implementation and for some useful properties, like the spectral (exponential) convergence observable in the case of smooth problems.
Direct methods use gradient-based techniques, and require the computation of the derivatives of the objective function and of the constraints of the problem under analysis. The accuracy of these derivatives has a strong impact on the computational efficiency and reliability of the solutions. Therefore, the quality of the results and the computation time are strongly affected by the Jacobian matrix describing the discrete, transcribed Optimal Control Problem (OCP), that is, the resulting Nonlinear Programming Problem (NLP).
From this perspective, the core of this thesis provides the reader with a thorough knowledge of several differentiation methods, starting from the analysis of the most basic approaches, which estimate derivatives by means of finite difference approximation, up to the analysis of advanced differentiation schemes, such as the complex-step derivative approach and the dual-step derivative method. These methods are here implemented in SPARTAN (Shefex-3 Pseudospectral Algorithm for Reentry Trajectory ANalysis), a tool developed by the DLR (Deutsches Zentrum für Luft- und Raumfahrt) which implements the global Flipped Radau Pseudospectral Method (FRPM), in order to solve several well-known literature examples of OCPs. Results in terms of accuracy and CPU time are thoroughly inspected.
Furthermore, the problem of the differentiation of signals given in real time is discussed, with the aim of examining, on tutorial examples, robust differentiators/observers based on the sliding mode control technique.
Sommario

Nowadays, the new and increased capabilities of CPUs encourage researchers and engineers to study numerical optimization techniques as analysis and synthesis tools, in order to compute optimal trajectories and the controls needed to generate them. One of the most promising techniques is represented by direct methods. Among these, pseudospectral methods are gaining widespread acceptance thanks to their clear implementation and to some useful properties, such as the spectral (exponential) convergence observable in the case of smooth problems.
Direct methods use gradient-based techniques, and therefore require the computation of the derivatives of the cost function and of the constraints describing the problem under analysis. The accuracy of these derivatives has a strong impact on the computational efficiency and on the reliability of the solutions. For this reason, the quality of the results and the computational effort needed to generate them are strongly affected by the Jacobian matrix, which describes the Nonlinear Programming Problem (NLP) obtained from the discretization and transcription of the Optimal Control Problem (OCP).
From this perspective, the central part of the thesis provides the reader with a detailed knowledge of a wide range of differentiation methods, starting from the analysis of the basic approaches, which compute derivatives through finite difference approximation, up to the analysis of advanced differentiation schemes, such as the complex-step approach and the dual-step derivative method. In the thesis, these methods are implemented in SPARTAN (Shefex-3 Pseudospectral Algorithm for Reentry Trajectory ANalysis), an algorithm developed at the DLR (Deutsches Zentrum für Luft- und Raumfahrt) which implements the Flipped Radau Pseudospectral Method (FRPM), in order to solve some well-documented literature examples of OCPs. The results in terms of accuracy and computational cost are examined in detail.
Furthermore, the thesis analyses the problem of the differentiation of signals given in real time, with the aim of examining, through some examples, robust differentiators based on the sliding mode control technique.
Acknowledgements

...
Contents

List of Figures iv

List of Tables viii

1 Introduction 1
1.1 State of the Art . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation and goals . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Structure of the thesis . . . . . . . . . . . . . . . . . . . . . . 3

2 Finite Difference Traditional Schemes 5


2.1 Backward and Forward Differences Traditional Schemes . . . . 6
2.2 Central Difference Traditional Schemes . . . . . . . . . . . . . 7
2.2.1 3-points stencil central difference scheme . . . . . . . . 7
2.2.2 5-points stencil central difference scheme . . . . . . . . 8
2.2.3 7-points and K-points stencil central difference schemes 8
2.2.4 Numerical Examples . . . . . . . . . . . . . . . . . . . 9
2.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3 Advanced Differentiation Scheme: the Complex-Step Derivative Approach 28
3.1 The Complex-Step Derivative Approximation . . . . . . . . . 29
3.2 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4 Advanced Differentiation Scheme: the Dual-Step Derivative Approach 41
4.1 The Dual-Step Derivative Approach . . . . . . . . . . . . . . . 42
4.2 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


5 Generation of Reference Data 50


5.1 Definition of Gradient and Jacobian . . . . . . . . . . . . . . . 50
5.2 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . 52
5.2.1 Space Shuttle Reentry Problem . . . . . . . . . . . . . 52
5.2.2 Orbit Raising Problem . . . . . . . . . . . . . . . . . . 54
5.2.3 Hang Glider Problem . . . . . . . . . . . . . . . . . . . 55
5.3 Generation of Reference Jacobians . . . . . . . . . . . . . . . . 56
5.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

6 Jacobian Matrix Generation with Numerical Differentiations - Analysis of Accuracy and CPU Time 60
6.1 Jacobian Matrix Generation with Central Difference Tradi-
tional Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.1.1 Space Shuttle Reentry Problem . . . . . . . . . . . . . 61
6.1.2 Orbit Raising Problem . . . . . . . . . . . . . . . . . . 63
6.1.3 Hang Glider Problem . . . . . . . . . . . . . . . . . . . 64
6.2 Jacobian Matrix Generation with Complex-Step Derivative
Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.2.1 Space Shuttle Reentry Problem . . . . . . . . . . . . . 66
6.2.2 Orbit Raising Problem . . . . . . . . . . . . . . . . . . 67
6.2.3 Hang Glider Problem . . . . . . . . . . . . . . . . . . . 68
6.3 Jacobian Matrix Generation with Dual-Step Derivative Ap-
proach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.3.1 Space Shuttle Reentry Problem . . . . . . . . . . . . . 69
6.3.2 Orbit Raising Problem . . . . . . . . . . . . . . . . . . 70
6.3.3 Hang Glider Problem . . . . . . . . . . . . . . . . . . . 71
6.4 CPU Time Analysis . . . . . . . . . . . . . . . . . . . . . . . . 73
6.4.1 Space Shuttle Reentry Problem . . . . . . . . . . . . . 73
6.4.2 Orbit Raising Problem . . . . . . . . . . . . . . . . . . 74
6.4.3 Hang Glider Problem . . . . . . . . . . . . . . . . . . . 75
6.5 Analysis of CPU Time vs. Increasing Size of the Problem . . . 76
6.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

7 Use of the Advanced Differentiation Schemes for Optimal Control Problems 79
7.1 General Formulation of an Optimal Control Problem . . . . . 80
7.2 SPARTAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
7.3 Hybrid Jacobian Computation . . . . . . . . . . . . . . . . . . 84
7.4 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . 85
7.4.1 Space Shuttle Reentry Problem . . . . . . . . . . . . . 85
7.4.2 Orbit Raising Problem . . . . . . . . . . . . . . . . . . 91


7.4.3 Hang Glider Problem . . . . . . . . . . . . . . . . . . . 96


7.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

8 Further Tools: Robust Differentiation via Sliding Mode Technique 102
8.1 Sliding Mode Technique . . . . . . . . . . . . . . . . . . . . . 103
8.1.1 Theory of Sliding Mode Control . . . . . . . . . . . . . 103
8.1.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.2 Sliding Mode Robust Differentiators . . . . . . . . . . . . . . . 107
8.2.1 Fifth-Order Differentiator . . . . . . . . . . . . . . . . 108
8.2.2 Second Order Nonlinear System Observer . . . . . . . . 111
8.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

9 Conclusions 115
9.1 Lesson Learned . . . . . . . . . . . . . . . . . . . . . . . . . . 115
9.2 Future Developments . . . . . . . . . . . . . . . . . . . . . . . 117

A Dual and Hyper-Dual Numbers 118


A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
A.2 Dual Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
A.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . 118
A.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . 119
A.2.3 Algebraic operations . . . . . . . . . . . . . . . . . . . 119
A.2.4 Defining functions . . . . . . . . . . . . . . . . . . . . . 120
A.2.5 Implementation . . . . . . . . . . . . . . . . . . . . . . 122
A.3 Hyper-Dual Numbers . . . . . . . . . . . . . . . . . . . . . . . 129
A.3.1 Defining Algebraic Operations . . . . . . . . . . . . . . 129
A.3.2 Hyper-Dual Numbers for Exact Derivative Calculations 130
A.3.3 Numerical Examples . . . . . . . . . . . . . . . . . . . 132
A.3.4 Implementation . . . . . . . . . . . . . . . . . . . . . . 134

Bibliography 141

List of Figures

2.1 Function $f(x) = \sin(x)$ . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Analytical and numerical derivatives of the function $f(x) = \sin(x)$, $h = 1\times10^{-2}$ . . . 11
2.3 Errors comparison, $f(x) = \sin(x)$ and $h = 1\times10^{-2}$ . . . . . . . 11
2.4 Errors comparison, $f(x) = \sin(x)$ and varying $h$ . . . . . . . . . 12
2.5 Errors comparison, 3-points stencil scheme, function $f(x) = \sin(x)$ . . . 12
2.6 Errors comparison, 5-points stencil scheme, function $f(x) = \sin(x)$ . . . 13
2.7 Errors comparison, 7-points stencil scheme, function $f(x) = \sin(x)$ . . . 13
2.8 Function $f(x) = 1/x$ . . . . . . . . . . . . . . . . . . . . . . . . 14
2.9 Analytical and numerical derivatives of the function $f(x) = 1/x$, $h = 1\times10^{-2}$ . . . 15
2.10 Errors comparison, $f(x) = 1/x$ and $h = 1\times10^{-2}$ . . . . . . . . 15
2.11 Errors comparison, $f(x) = 1/x$ and varying $h$ . . . . . . . . . . 16
2.12 Function $f(x) = 1/x^2$ . . . . . . . . . . . . . . . . . . . . . . . 17
2.13 Analytical and numerical derivatives of the function $f(x) = 1/x^2$, $h = 1\times10^{-2}$ . . . 17
2.14 Errors comparison, $f(x) = 1/x^2$ and $h = 1\times10^{-2}$ . . . . . . . 18
2.15 Errors comparison, $f(x) = 1/x^2$ and varying $h$ . . . . . . . . . 18
2.16 Function $f(x) = A\sin(t)e^{t}$ . . . . . . . . . . . . . . . . . . . . 19
2.17 Analytical and numerical derivatives of the function $f(x) = A\sin(t)e^{t}$, $h = 1\times10^{-2}$ . . . 20
2.18 Errors comparison, $f(x) = A\sin(t)e^{t}$ and $h = 1\times10^{-2}$ . . . . 20
2.19 Errors comparison, $f(x) = A\sin(t)e^{t}$ and varying $h$ . . . . . . 21
2.20 Function $f(x) = A\sin^2(t)\cos(t^2)$ . . . . . . . . . . . . . . . . 22


2.21 Analytical and numerical derivatives of the function $f(x) = A\sin^2(t)\cos(t^2)$, $h = 1\times10^{-2}$ . . . 22
2.22 Errors comparison, $f(x) = A\sin^2(t)\cos(t^2)$ and $h = 5\times10^{-4}$ . . 23
2.23 Errors comparison, $f(x) = A\sin^2(t)\cos(t^2)$ and varying $h$ . . . 23
2.24 Function $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ . . . . . . . . . . . . . . 24
2.25 Analytical and numerical derivatives of the function $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$, $h = 1\times10^{-2}$ . . . 25
2.26 Errors comparison, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ and $h = 1\times10^{-3}$ . . . 25
2.27 Errors comparison, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ and varying $h$ . . . 26

3.1 Relative error in the sensitivity estimates, function $f(x) = \sin(x)$ . . . 31
3.2 Minimum-Errors comparison, function $f(x) = \sin(x)$ . . . . . . . 31
3.3 Relative error in the sensitivity estimates, function $f(x) = 1/x$ . . 32
3.4 Minimum-Errors comparison, function $f(x) = 1/x$ . . . . . . . . 33
3.5 Relative error in the first derivative, function $f(x) = 1/x^2$ . . . . 34
3.6 Minimum-Errors comparison, function $f(x) = 1/x^2$ . . . . . . . 34
3.7 Relative error in the first derivative, $f(x) = A\sin(t)e^{t}$ . . . . . 35
3.8 Minimum-Errors comparison, $f(x) = A\sin(t)e^{t}$ . . . . . . . . . 36
3.9 Relative error in the first derivative, $f(x) = A\sin^2(t)\cos(t^2)$ . . 37
3.10 Minimum-Errors comparison, $f(x) = A\sin^2(t)\cos(t^2)$ . . . . . 37
3.11 Relative error in the first derivative, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ . . . 39
3.12 Relative error in the first derivative [4] . . . . . . . . . . . . . . 39
3.13 Minimum-Errors comparison, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ . . . . . . 40

4.1 Relative error in the first derivative, function $f(x) = \sin(x)$ . . . 43
4.2 Relative error in the first derivative, function $f(x) = 1/x$ . . . . 44
4.3 Relative error in the first derivative, function $f(x) = 1/x^2$ . . . . 45
4.4 Relative error in the first derivative, $f(x) = A\sin(t)e^{t}$ . . . . . 46
4.5 Relative error in the first derivative, $f(x) = A\sin^2(t)\cos(t^2)$ . . 47
4.6 Relative error in the first derivative, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ . . . 48
4.7 Relative error in the first derivative [13] . . . . . . . . . . . . . . 49

5.1 Jacobian Matrix Sparsity Patterns for the Space Shuttle Prob-
lem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2 Jacobian Matrix Sparsity Patterns for the Orbit Raising Prob-
lem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3 Jacobian Matrix Sparsity Patterns for the Hang Glider Problem. 59


6.1 Maximum error in the Jacobian Matrix for the Space Shuttle
Reentry Problem. . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2 Maximum error in the Jacobian Matrix for the Orbit Raising
Problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.3 Maximum error in the Jacobian Matrix for the Hang Glider
Problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.4 Maximum error in the Jacobian Matrix for the Space Shuttle
Reentry Problem. . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.5 Maximum error in the Jacobian Matrix for the Orbit Raising
Problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.6 Maximum error in the Jacobian Matrix for the Hang Glider
Problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.7 Maximum error in the Jacobian Matrix for the Space Shuttle
Reentry Problem. . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.8 Maximum error in the Jacobian Matrix for the Orbit Raising
Problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.9 Maximum error in the Jacobian Matrix for the Orbit Raising
Problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.10 CPU Time Required for the Space Shuttle Reentry Problem. . 73
6.11 CPU Time Required for the Orbit Raising Problem. . . . . . . 74
6.12 CPU Time Required for the Hang Glider Problem. . . . . . . 75
6.13 CPU Time Required for the Space Shuttle Reentry Problem. . 76
6.14 CPU Time Required for the Orbit Raising Problem. . . . . . . 77
6.15 CPU Time Required for the Hang Glider Problem. . . . . . . 77

7.1 Legendre Polynomials of order 5. . . . . . . . . . . . . . . . . 84


7.2 States Evolution for the Space Shuttle Reentry Problem. . . . 86
7.3 Controls Evolution for the Space Shuttle Reentry Problem. . . 86
7.4 Heat Rate Evolution for the Space Shuttle Reentry Problem. . 87
7.5 Discrepancy between optimized and propagated solutions for
the Space Shuttle Reentry Problem. . . . . . . . . . . . . . . . 87
7.6 Shuttle reentry - state and control variables, [1] . . . . . . . . 88
7.7 Space Shuttle Reentry Problem - Groundtrack of Trajectory
Optimizing Final Crossrange. . . . . . . . . . . . . . . . . . . 88
7.8 Spectral Convergence for the Space Shuttle Reentry Problem. 90
7.9 States Evolution for the Orbit Raising Problem. . . . . . . . . 91
7.10 Control Evolution for the Orbit Raising Problem. . . . . . . . 92
7.11 Orbit Raising Problem - Trajectory Optimizing Final Orbit
Energy (LU=Unitary Length). . . . . . . . . . . . . . . . . . . 92
7.12 Discrepancy between optimized and propagated solutions for
the Orbit Raising Problem. . . . . . . . . . . . . . . . . . . . 93


7.13 Orbit Raising - state and control variables, [9] . . . . . . . . . 93


7.14 Spectral Convergence for the Orbit Raising Problem. . . . . . 95
7.15 States Evolution for the Hang Glider Problem. . . . . . . . . . 97
7.16 Control Evolution for the Hang Glider Problem. . . . . . . . . 97
7.17 Discrepancy between optimized and propagated solutions for
the Hang Glider Problem. . . . . . . . . . . . . . . . . . . . . 98
7.18 Hang Glider - state and control variables, [1] . . . . . . . . . . 98
7.19 Spectral Convergence for the Hang Glider Problem. . . . . . . 100

8.1 Sliding Variable and Sliding Mode Control . . . . . . . . . . . 106


8.2 Asymptotic Convergence and State Trajectory for f (x1 , x2 , t) =
sin(2t) and u(x1 , x2 ) = cx2 sign(). . . . . . . . . . . . . 107
8.3 Fifth-Order Differentiator, without noise. . . . . . . . . . . . . 109
8.4 Fifth-Order Differentiator Errors, without noise. . . . . . . . . 109
8.5 Fifth-Order Differentiator, with noise. . . . . . . . . . . . . . . 110
8.6 Fifth-Order Differentiator Errors, with noise. . . . . . . . . . . 110
8.7 True and Estimated State Variables, without noise. . . . . . . 112
8.8 True and Estimated State Variables, with noise. . . . . . . . . 113
8.9 Comparison between True, Estimated and Measured Position. 113

A.1 Accuracies of several derivative calculation methods as a function of the step size for the function $f(x) = A\sin(t)e^{t}$ . . . . 133
A.2 Accuracies of several derivative calculation methods as a function of the step size for the function $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ . . . 133
A.3 Accuracies of several derivative calculation methods as a function of the step size for the function $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$, [15] . . . 133

List of Tables

6.1 Accuracy and step size comparison for the Space Shuttle Prob-
lem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2 Accuracy and step size comparison for the Orbit Raising Prob-
lem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.3 Accuracy and step size comparison for the Hang Glider Problem. 65

7.1 Accuracy and CPU Time comparison for the Space Shuttle
Problem (SNOPT). . . . . . . . . . . . . . . . . . . . . . . . . 89
7.2 Accuracy and CPU Time comparison for the Space Shuttle
Problem (IPOPT). . . . . . . . . . . . . . . . . . . . . . . . . 89
7.3 Accuracy and CPU Time comparison for the Orbit Raising
Problem (SNOPT). . . . . . . . . . . . . . . . . . . . . . . . . 94
7.4 Accuracy and CPU Time comparison for the Orbit Raising
Problem (IPOPT). . . . . . . . . . . . . . . . . . . . . . . . . 94
7.5 Accuracy and CPU Time comparison for the Hang Glider
Problem (SNOPT). . . . . . . . . . . . . . . . . . . . . . . . . 99
7.6 Accuracy and CPU Time comparison for the Hang Glider
Problem (IPOPT). . . . . . . . . . . . . . . . . . . . . . . . . 99

Chapter 1
Introduction

1.1 State of the Art


Nowadays the new, increased capabilities of CPUs have constantly encour-
aged researchers and engineers towards the investigation of numerical op-
timization as an analysis and synthesis tool in order to generate optimal
trajectories and the controls to track them.
Optimal control is defined in [8] as the subject where it is desired to de-
termine the inputs to a dynamical system that optimize (i.e., minimize or
maximize) a specified performance index while satisfying any constraints on
the motion of the system. Because of the complexity of most applications,
Optimal Control Problems (OCPs) can no longer be solved analytically and,
consequently, they are most often solved numerically. [1] concentrates on
practical numerical methods for solving the OCP.
Numerical methods for solving OCPs are divided into two major classes: indi-
rect methods and direct methods. Indirect methods are based on the Pon-
tryagin Maximum Principle, which leads to a multiple-point boundary-value
problem. Direct methods, instead, consist in the discretization of the OCP,
transcribing it to a nonlinear optimization problem or NonLinear Program-
ming Problem (NLP).
It is seen [8] that indirect methods and direct methods emanate from two
different philosophies. Indeed, the indirect approach solves the problem indi-
rectly by converting the OCP to a boundary-value problem and, as a result,
the optimal solution is found by solving a system of differential equations
that satisfies endpoint and/or interior point conditions. On the other hand,
using a direct approach, the optimal solution is found by transcribing an
infinite-dimensional optimization problem to a finite-dimensional optimiza-
tion problem. As a consequence, researchers who focus on indirect methods


are interested more in differential equations theory, while researchers who


focus on direct methods are occupied largely in optimization techniques.
From this perspective, SPARTAN (Shefex-3 Pseudospectral Algorithm for
Reentry Trajectory ANalysis) is an optimal control package developed by the
DLR (Deutsches Zentrum für Luft- und Raumfahrt). It is the reference tool
for the development of the entry guidance for the SHEFEX-3 (Sharp Edge
Flight Experiment) mission and has been validated with several well-known
literature examples. SPARTAN implements the global Flipped Radau Pseu-
dospectral Methods (FRPMs) to solve constrained or unconstrained OCPs,
which can have a fixed or variable final time.
Pseudospectral Methods [5] represent a particular area of interest in the frame
of the wider class of direct methods. The basic idea behind these methods
is, as in the other direct methods, to collocate the differential equations, the
cost function and the constraints (if any) in a finite number of points in order
to treat them as a set of nonlinear algebraic constraints. In this way, the con-
tinuous OCP is reduced to a discrete NLP problem having finite dimensions,
which can be solved with one of the well-known available software packages,
e.g. SNOPT or IPOPT.
SPARTAN has a highly exploited Jacobian structure, as well as routines
for automatic linear/nonlinear scaling and auto-validation using the Runge-
Kutta 45 scheme. In more detail, SPARTAN exploits the general structure
of the Jacobian associated to the NLP problem deriving from the application
of the FRPM, which results in a hybrid computation. Indeed, the Jacobian
matrix is expressed as a sum of three different contributions. Two of them
(the Pseudospectral and the Theoretical contributions) are exact; the third
term (the Numerical contribution), instead, is not exact and is numerically
computed using the complex-step derivative technique, which is subject only to
truncation errors [4, 8].

1.2 Motivation and goals


Four computational issues that arise in the numerical solution of OCPs are:

consistent approximations for the solution of differential equations;

the scaling of the OCP;

exploitation of sparsity in the NLP;

computation of derivatives of the objective and constraint functions.


Indeed, an inconsistent approximation of the differential equation can lead to


either nonconvergence or convergence of the optimization problem to a poor
solution. Scaling and exploitation of sparsity in the NLP are discussed for
SPARTAN in [5, 10] and they greatly affect both computational efficiency
and convergence of the NLP. Finally, the manner in which derivatives are
computed is of great importance because accurate derivatives can strongly
improve both computational efficiency and reliability. Therefore, differential
information is an important ingredient in all optimization algorithms, and
consequently it is well worth analysing thoroughly the methods for comput-
ing these quantities.
From this perspective, the aim of this thesis is to provide the reader with a thorough knowledge of several differentiation methods, starting from the study of the most basic approaches, which estimate derivatives by means of finite difference approximation, up to the analysis of advanced differentiation schemes, such as the complex-step derivative approach and the dual-step derivative method. These methods will be compared in terms of accuracy and computation time, and they will be implemented in SPARTAN in order to assess the effects of the use of the pseudospectral methods in combination with each of these differentiation methods.

1.3 Structure of the thesis


The work presented in this thesis is organized as follows.
In Chapter 2 the most common methods for the finite difference approxima-
tion of a first-order derivative are discussed. The central difference schemes
are treated in detail, and are implemented to compute the first-order deriva-
tive of six different test functions having increasing complexity. The effects,
in terms of accuracy, of selecting different values of the perturbation h are
discussed.
Chapter 3 presents the complex-step derivative approach and its application
to the six test functions already defined. The results are compared with re-
spect to the ones achieved using the central difference traditional schemes.
Chapter 4 describes the dual-step derivative method and its implementation.
The error in the first derivative calculation is compared with the error for
the central difference schemes and complex-step approximation.
In Chapter 5, the analytical structure of the Gradient vector and the Jaco-
bian matrix are analysed in order to generate a set of reference data for each
of the following problems: the Space Shuttle Reentry Problem [1], the Orbit
Raising Problem [5], and the Hang Glider Problem [1]. These reference data


are used to compare the exact analytical Jacobian with the numerical one,
presented in Chapter 6, and computed using the numerical differentiations
previously defined.
In Chapter 7, the differentiation schemes are used to solve optimal control
problems. A general formulation of an OCP is shown, then the differentiation
schemes are implemented in SPARTAN. Three numerical examples of OCPs
are studied: the maximization of the final crossrange in the space shuttle
reentry trajectory, the maximization of the final specific energy in an orbit
raising problem, and the maximization of the final range of a hang glider in
the presence of a specific updraft. Each of the three examples is solved using two
different off-the-shelf, well-known NLP solvers: SNOPT and IPOPT. The
results obtained using the different differentiation schemes are thoroughly
inspected in terms of accuracy and CPU time.
In Chapter 8, we deal with the problem of the differentiation of signals given
in real time with the aim to design a robust differentiator based on sliding
mode technique. Two sliding mode robust differentiators are examined on
tutorial examples, and simulated in Simulink.

Chapter 2
Finite Difference Traditional
Schemes

Overview
Direct methods for optimal control use gradient-based techniques for solving the NLP (Nonlinear optimization problem or Nonlinear Programming Problem). Gradient methods for solving NLPs require the computation of the derivatives of the objective function and constraints, and the accuracy of the derivatives has a strong impact on the computational efficiency and reliability of the solutions.
The most obvious way to compute the derivatives of the objective function and constraints is by analytical differentiation. This approach is appealing because analytical derivatives are exact and generally result in faster optimization but, in many cases, it is impractical to compute them. For this reason, the aim of the following discussion is to employ alternative means to obtain the necessary gradients.
The most basic way to estimate derivatives is by finite difference approxi-
mation. In this chapter the most common methods for the finite difference
approximation of a derivative are discussed.
The principle of the finite difference methods consists in approximating the differential operator by a discrete differential operator. Considering a generic one-dimensional function $f(x)$ defined in the domain $D = [0, X]$, it is possible to identify $N$ grid points
$$x_i = ih \qquad (i = 0, 1, \dots, N-1)$$
where $h$ is the mesh size. The finite difference schemes calculate derivatives by approximating them by linear combinations of function values at the grid points. The simplest schemes using this approach are the backward and
forward schemes.

2.1 Backward and Forward Differences Traditional Schemes
The two basic methods for the finite difference approximation of a derivative are backward differencing and forward differencing. They are obtained considering Taylor series expansions about the point $x_i \in D$:
$$f(x_i - h) = f(x_i) - f'(x_i)h + \frac{1}{2}f''(x_i)h^2 - \dots \qquad (2.1)$$
$$f(x_i + h) = f(x_i) + f'(x_i)h + \frac{1}{2}f''(x_i)h^2 + \dots \qquad (2.2)$$
Focusing on the forward difference scheme, dividing equation (2.2) through by $h$ yields [1]
$$f'(x_i) = \frac{f(x_i + h) - f(x_i)}{h} - \frac{h}{2}f''(x_i) - \dots = \frac{f(x_i + h) - f(x_i)}{h} + O(h) \qquad (2.3)$$
If the terms of order $O(h)$ are ignored, one obtains the forward difference approximation (2.4). Applying the same procedure to equation (2.1) one obtains the backward difference approximation (2.5) [1]
$$f'(x_i) = \frac{f(x_i + h) - f(x_i)}{h} + O(h) \qquad \text{(Forward Difference)} \qquad (2.4)$$
$$f'(x_i) = \frac{f(x_i) - f(x_i - h)}{h} + O(h) \qquad \text{(Backward Difference)} \qquad (2.5)$$
where $h$ is the perturbation around $x_i$. The choice of $h$ directly affects the accuracy of the backward and forward difference schemes.
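As a minimal illustration (a Python sketch under the notation above, not code from SPARTAN or the thesis; the helper names are hypothetical), the first-order formulas (2.4) and (2.5) can be implemented and checked against the analytical derivative of $\sin(x)$:

import math

def forward_diff(f, x, h=1e-6):
    # Forward difference, eq. (2.4): first-order accurate, truncation error O(h)
    return (f(x + h) - f(x)) / h

def backward_diff(f, x, h=1e-6):
    # Backward difference, eq. (2.5): first-order accurate, truncation error O(h)
    return (f(x) - f(x - h)) / h

x0 = 1.0
exact = math.cos(x0)  # analytical derivative of sin(x) at x0
print("forward error :", abs(forward_diff(math.sin, x0) - exact))
print("backward error:", abs(backward_diff(math.sin, x0) - exact))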
The error between the numerical solution and the exact analytical one is
called truncation error because it reflects the fact that only a finite part of a
Taylor series is used in the approximation. In both cases, using either a for-
ward or a backward scheme, the truncation error is O(h) and for this reason
we will refer to these schemes as first order approximations.
In addition, due to the fact that, on a digital computer, the difference ap-
proximation must be evaluated using finite precision arithmetic, there is a
second source of error, the round-off error, which depends on the accuracy
of the evaluation of the function we are dealing with.
In order to compute more accurate numerical derivatives, it is worth analysing
the central difference schemes.


2.2 Central Difference Traditional Schemes


If the function $f(x)$ can be evaluated at values that lie to the left and right of $x_i$, then the central difference schemes will involve abscissas that are chosen symmetrically on both sides of $x_i \in D$. In the following discussion, six different functions will be examined using different central difference schemes:

3-points stencil central difference scheme;

5-points stencil central difference scheme;

7-points stencil central difference scheme;

K-points stencil central difference scheme.

Errors between exact analytic derivatives and numerical ones will be calcu-
lated considering different values of the perturbation h.

2.2.1 3-points stencil central difference scheme


Assuming that $f \in C^3$ in $D$ and that $(x_i + \frac{h}{2})$ and $(x_i - \frac{h}{2}) \in D$, then
$$f\left(x_i - \frac{h}{2}\right) = f(x_i) - f'(x_i)\frac{h}{2} + \frac{1}{2}f''(x_i)\left(\frac{h}{2}\right)^2 - \dots \qquad (2.6)$$
$$f\left(x_i + \frac{h}{2}\right) = f(x_i) + f'(x_i)\frac{h}{2} + \frac{1}{2}f''(x_i)\left(\frac{h}{2}\right)^2 + \dots \qquad (2.7)$$
and if we combine equations (2.7) and (2.6) we get the 3-points stencil central difference formula (2.8) [2], the truncation error (2.9) [2] and the round-off error (2.10) [1]
$$f'(x_i) = \frac{f\left(x_i + \frac{h}{2}\right) - f\left(x_i - \frac{h}{2}\right)}{h} \qquad (2.8)$$
$$\epsilon_T = \frac{h^2\,|f'''(x)|}{24} \qquad (2.9)$$
$$\epsilon_R = \frac{2\epsilon}{h} \qquad (2.10)$$

The 3-points stencil central difference scheme error is $O(h^2)$, which means that it is a second-order approximation, and it provides more accurate results than the backward and forward traditional schemes.
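A corresponding Python sketch of the 3-points stencil formula (2.8), using the half-step evaluations of the derivation above (a hypothetical helper, shown only to make the formula concrete):

import math

def central_diff_3(f, x, h=1e-5):
    # 3-points stencil central difference, eq. (2.8): second-order accurate, O(h^2)
    return (f(x + h / 2) - f(x - h / 2)) / h

# Example: error roughly of the order of 1e-11 for f(x) = sin(x) at x = 1, h = 1e-5
print(abs(central_diff_3(math.sin, 1.0) - math.cos(1.0)))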

2.2.2 5-points stencil central difference scheme


Assuming that $f \in C^5$ in $D$ and that $(x_i + h)$, $(x_i - h)$, $(x_i + \frac{h}{2})$ and $(x_i - \frac{h}{2}) \in D$, it is possible to derive the 5-points stencil central difference formula (2.11) [2], the truncation error (2.12) [2] and the round-off error (2.13) [1]
$$f'(x_i) = \frac{-f(x_i + h) + 8f\left(x_i + \frac{h}{2}\right) - 8f\left(x_i - \frac{h}{2}\right) + f(x_i - h)}{6h} \qquad (2.11)$$
$$\epsilon_T = \frac{\left(\frac{h}{2}\right)^4 |f^{(5)}(x)|}{30} \qquad (2.12)$$
$$\epsilon_R = \frac{3\epsilon}{h} \qquad (2.13)$$
It is a fourth-order approximation, meaning that the truncation error term is of the order $O(h^4)$.
Comparing the formulas (2.8) and (2.11), it is possible to observe that the truncation error for the fourth-order formula is $O(h^4)$ and will go to zero faster than the truncation error $O(h^2)$ for the second-order formula. This will have a strong impact on the choice of $h$, as will be seen in the next sections.
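Similarly, a short Python sketch of the 5-points stencil formula (2.11) (again hypothetical illustrative code, not taken from the thesis):

import math

def central_diff_5(f, x, h=1e-3):
    # 5-points stencil central difference, eq. (2.11): fourth-order accurate, O(h^4)
    return (-f(x + h) + 8 * f(x + h / 2) - 8 * f(x - h / 2) + f(x - h)) / (6 * h)

print(abs(central_diff_5(math.sin, 1.0) - math.cos(1.0)))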

2.2.3 7-points and K-points stencil central difference schemes
In order to derive the 7-points stencil central difference formula it is conve-
nient to derive a formula for a generic number of points K.
Up to now the numerical derivative of the function $f(x)$ at any point $x_i$ has been computed approximating $f(x)$ by a polynomial in the neighbourhood of $x_i$. Considering $N$ equidistant points around $x_i$, where $N$ is an odd number, and
$$f(x_k) = f_k; \qquad x_k = x_i + kh; \qquad k = -\frac{N-1}{2}, \dots, \frac{N-1}{2}, \qquad (2.14)$$
and assuming that the points $(x_k, f_k)$ are interpolated by a polynomial of $(N-1)$-th degree
$$P_{N-1}(x) = \sum_{j=0}^{N-1} a_j x^j \qquad (2.15)$$
where the coefficients $a_j$ are found as the solution of the system of linear equations $\{P_{N-1}(x_k) = f_k\}$, the derivative $f'(x_i)$ can be approximated by the derivative of the constructed interpolating polynomial
$$f'(x_i) \approx P'_{N-1}(x_i) \qquad (2.16)$$
It can be seen that the generic expression of the central difference scheme has an anti-symmetric structure, and in general a difference of $N$-th order can be written as [3]
$$f'(x_i) \approx \frac{1}{h}\sum_{k=1}^{(N-1)/2} a_k \left(f_k - f_{-k}\right) \qquad (2.17)$$
Starting from (2.17), it is possible to derive the formula for $N = 7$, which represents the 7-points stencil central difference formula [3]
$$f'(x) = \frac{-f(x_i - 3h) + 9f(x_i - 2h) - 45f(x_i - h) + 45f(x_i + h) - 9f(x_i + 2h) + f(x_i + 3h)}{60h} \qquad (2.18)$$
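The anti-symmetric form (2.17) suggests a simple table-driven implementation. The Python sketch below is a hypothetical illustration (the coefficient values are read off from eq. (2.18) for $N = 7$, with full-step nodes $x_i \pm h, \pm 2h, \pm 3h$):

import math

# Coefficients a_k of eq. (2.17) for the 7-points stencil, taken from eq. (2.18)
A7 = {1: 45.0 / 60.0, 2: -9.0 / 60.0, 3: 1.0 / 60.0}

def central_diff_k(f, x, h, coeffs):
    # Generic anti-symmetric central difference, eq. (2.17)
    return sum(a * (f(x + k * h) - f(x - k * h)) for k, a in coeffs.items()) / h

# 7-points stencil central difference, eq. (2.18)
print(abs(central_diff_k(math.sin, 1.0, 1e-2, A7) - math.cos(1.0)))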

2.2.4 Numerical Examples


In the following subsection the schemes described above are applied in order
to compute the first derivative of some test functions.

Let f (x) = sin(x).

The following figures illustrate the function trend (Figure 2.1), the analytical and numerical derivatives of the function (Figure 2.2) and the comparison between errors related to the three different schemes, at first considering a constant $h$ (Figure 2.3), and then varying $h$ (Figure 2.4). The errors are computed using the analytical result as the reference: $\varepsilon = (f' - f'_{\mathrm{ref}})/f'_{\mathrm{ref}}$.
As shown in Figure 2.2, considering $h = 1\times10^{-2}$, the more points the stencil is composed of, the more accurate the numerical derivative is, so it is convenient to use a 7-points stencil central difference scheme. This fact can also be pointed out by observing Figure 2.3, in which the error decreases as the number of points of the stencil increases.
However, this is not generally true. Indeed, Figure 2.4 shows that, if we reduce the size of the perturbation $h$, the accuracy of the central difference schemes which involve more points increases until $h$ is such that the error $|\varepsilon|$ is minimum (meaning $\epsilon_T = \epsilon_R$). If $h$ is further reduced, the


accuracy of the more complex stencil schemes decreases because of the


increase of the relative round-off error, which becomes dominant. In
addition, their computational load will be heavier due to the increase
of the number of points where the function must be evaluated. So, in
this case, below some values of h, it is not convenient to use a more
complex stencil.

Figure 2.1: Function f (x) = sin(x).

Considering the 3-points stencil scheme, it is interesting to compare the trend of the error $|\varepsilon|$ between the analytical and numerical derivative of the function with the trend of the truncation error $\epsilon_T$ and the round-off error $\epsilon_R$. Figure 2.5 illustrates that the minimum of $|\varepsilon|$ occurs, approximately, at the intersection between $\epsilon_T$ and $\epsilon_R$ (meaning at the value of $h$ which corresponds to the minimum value of the sum $\epsilon_R + \epsilon_T$), as it is reasonable to expect.
The same analysis is repeated considering the 5-points stencil scheme (Figure 2.6) and the 7-points stencil scheme (Figure 2.7). From the qualitative point of view, the results are the same.
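As a rough worked estimate (a sketch based on expressions (2.9) and (2.10), treating $|f'''(x)|$ and the function-evaluation precision $\epsilon$ as constants; it is not taken from the thesis), the step minimizing the total error of the 3-points stencil can be obtained by balancing the two error sources:
$$\epsilon_T(h^*) = \epsilon_R(h^*) \;\Longrightarrow\; \frac{(h^*)^2\,|f'''(x)|}{24} = \frac{2\epsilon}{h^*} \;\Longrightarrow\; h^* = \left(\frac{48\,\epsilon}{|f'''(x)|}\right)^{1/3}.$$
With double precision ($\epsilon \approx 10^{-16}$) and $|f'''(x)|$ of order one, this gives $h^*$ of the order of $10^{-5}$, consistent with the location of the minima observed for the 3-points stencil.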

Figure 2.2: Analytical and numerical derivatives of the function $f(x) = \sin(x)$, $h = 1\times10^{-2}$.

Figure 2.3: Errors comparison, $f(x) = \sin(x)$ and $h = 1\times10^{-2}$.

Figure 2.4: Errors comparison, $f(x) = \sin(x)$ and varying $h$.

Figure 2.5: Errors comparison, 3-points stencil scheme, function $f(x) = \sin(x)$.

Figure 2.6: Errors comparison, 5-points stencil scheme, function $f(x) = \sin(x)$.

Figure 2.7: Errors comparison, 7-points stencil scheme, function $f(x) = \sin(x)$.

Let $f(x) = \frac{1}{x}$.

The following figures illustrate the function trend (Figure 2.8), the analytical and numerical derivatives of the function (Figure 2.9) and the comparison between errors related to the three different schemes, at first considering a constant $h$ (Figure 2.10) and then varying $h$ (Figure 2.11).

Figure 2.8: Function $f(x) = \frac{1}{x}$.

For this case as well, considering $h = 1\times10^{-2}$, Figures 2.9 and 2.10 show that the 7-points stencil scheme is more accurate than the 5-points stencil scheme, and the 5-points stencil scheme appears to be more accurate than the 3-points stencil scheme.
Figure 2.11 illustrates that, if we reduce the size of the perturbation $h$, the accuracy of the central difference schemes which involve more points increases until $h^*$, defined as the value of $h$ such that $|\epsilon_T| = |\epsilon_R|$. If $h$ is further reduced, the accuracy of the dense stencil schemes decreases because the round-off error becomes dominant. As in the previous case, their computational load will be heavier due to the increase of the number of points where the function must be evaluated. So, in this case, it is not convenient to use a dense stencil for $h < h^*$.

Figure 2.9: Analytical and numerical derivatives of the function $f(x) = \frac{1}{x}$, $h = 1\times10^{-2}$.

Figure 2.10: Errors comparison, $f(x) = \frac{1}{x}$ and $h = 1\times10^{-2}$.

Figure 2.11: Errors comparison, $f(x) = \frac{1}{x}$ and varying $h$.

Let $f(x) = \frac{1}{x^2}$.

In the following figures the function trend (Figure 2.12), the analytical and numerical derivatives of the function (Figure 2.13) and the comparison between errors related to the three different schemes, at first considering a constant $h$ (Figure 2.14) and then varying $h$ (Figure 2.15), are shown.
Here again, considering $h = 1\times10^{-2}$, Figure 2.13 and Figure 2.14 show that the more complex the stencil is, the more accurate the numerical derivative is. Figure 2.15 illustrates that, if we reduce the size of the perturbation $h$, the accuracy of the central difference schemes which involve more points increases until $h^*$, defined as the value of $h$ such that $|\epsilon_T| = |\epsilon_R|$. If $h$ is further reduced, the accuracy of the dense stencil schemes decreases because the round-off error becomes dominant. In addition, their computational load will be heavier due to the increase of the number of points where the function must be evaluated. So, in this case, it is not convenient to use a dense stencil for $h < h^*$.

Figure 2.12: Function $f(x) = \frac{1}{x^2}$.

Figure 2.13: Analytical and numerical derivatives of the function $f(x) = \frac{1}{x^2}$, $h = 1\times10^{-2}$.

Figure 2.14: Errors comparison, $f(x) = \frac{1}{x^2}$ and $h = 1\times10^{-2}$.

Figure 2.15: Errors comparison, $f(x) = \frac{1}{x^2}$ and varying $h$.

Now it is interesting to analyze how the central difference schemes work in the presence of more complicated functions:

Let $f(x) = A\sin(t)e^{t}$.

The following figures show the function trend (Figure 2.16), the an-
alytical and numerical derivatives of the function (Figure 2.17) and the
comparison between errors related to the three different schemes at
first considering a constant h (Figure 2.18) and then varying h (Figure
2.19).

Figure 2.16: Function $f(x) = A\sin(t)e^{t}$.

From the qualitative point of view, even in the presence of a more complicated function, the results of the sensitivity analysis are the same (Figures 2.17, 2.18 and 2.19). Indeed, Figure 2.19 illustrates that, here again, if we reduce the size of the perturbation $h$, the accuracy of the central difference schemes which involve more points increases until $h^*$, defined as the value of $h$ such that $|\epsilon_T| = |\epsilon_R|$. If $h$ is further reduced, the accuracy of the more complex stencil schemes decreases because of the increase of the round-off error and, in addition, their computational load will be heavier due to the increase of the number of points where the function must be evaluated. So, in this case, it is not convenient to use a more complex stencil for $h < h^*$.

Figure 2.17: Analytical and numerical derivatives of the function $f(x) = A\sin(t)e^{t}$, $h = 1\times10^{-2}$.

Figure 2.18: Errors comparison, $f(x) = A\sin(t)e^{t}$ and $h = 1\times10^{-2}$.

Figure 2.19: Errors comparison, $f(x) = A\sin(t)e^{t}$ and varying $h$.

Let $f(x) = A\sin^2(t)\cos(t^2)$.

In the following figures the function trend (Figure 2.20), the analytical and numerical derivatives of the function (Figure 2.21) and the comparison between errors related to the three different schemes, at first considering a constant $h$ (Figure 2.22) and then varying $h$ (Figure 2.23), are illustrated.
Considering $h = 5\times10^{-4}$, Figure 2.21 shows that the denser the stencil is, the more accurate the central difference scheme is. Furthermore, Figure 2.22 illustrates that, here again, if we reduce the size of the perturbation $h$, the accuracy of the central difference schemes which involve more points increases until $h^*$, defined as the value of $h$ such that $|\epsilon_T| = |\epsilon_R|$. If $h$ is further reduced, the accuracy of the dense stencil schemes decreases because the round-off error becomes dominant. In addition, their computational load will be heavier due to the increase of the number of points where the function must be evaluated.

Figure 2.20: Function $f(x) = A\sin^2(t)\cos(t^2)$.

Figure 2.21: Analytical and numerical derivatives of the function $f(x) = A\sin^2(t)\cos(t^2)$, $h = 1\times10^{-2}$.

Figure 2.22: Errors comparison, $f(x) = A\sin^2(t)\cos(t^2)$ and $h = 5\times10^{-4}$.

Figure 2.23: Errors comparison, $f(x) = A\sin^2(t)\cos(t^2)$ and varying $h$.

The last example has been selected from the literature [4], and confirms the results of the analysis performed so far.

Let $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$.

The following figures show the function trend (Figure 2.24), the analytical and numerical derivatives of the function (Figure 2.25) and the comparison between errors related to the three different schemes, at first considering a constant $h$ (Figure 2.26) and then varying $h$ (Figure 2.27).

Figure 2.24: Function $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$.

As seen in the previous cases, here again, considering $h = 1\times10^{-2}$, Figures 2.25 and 2.26 show that the more points the stencil is composed of, the more accurate the numerical derivative is, so it is convenient to use a 7-points stencil central difference scheme, meaning that the error between the analytical and numerical derivative decreases as the number of points of the stencil increases. However, this is not generally true. Indeed, Figure 2.27 shows that, if we reduce the size of the perturbation $h$, the accuracy of the central difference schemes which involve more points increases until $h^*$, defined as the value of $h$ such that $|\epsilon_T| = |\epsilon_R|$. If $h$ is further reduced, the accuracy of the more complex

Figure 2.25: Analytical and numerical derivatives of the function $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$, $h = 1\times10^{-2}$.

Figure 2.26: Errors comparison, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ and $h = 1\times10^{-3}$.

Figure 2.27: Errors comparison, $f(x) = \frac{e^x}{\sin^3(x)+\cos^3(x)}$ and varying $h$.

stencil schemes decreases because of the increase of the round-off error, which becomes dominant. In addition, their computational load will be heavier due to the increase of the number of points where the function must be evaluated. So, in this case, it is not convenient to use a more complex stencil for $h < h^*$.

2.3 Conclusions
In this chapter the traditional finite difference schemes have been analysed. We focused on the central difference schemes, which appear to be more accurate than the backward and forward difference schemes. Numerical examples of the different stencils (3-points, 5-points and 7-points) applied to six different functions of increasing complexity have been discussed, showing the effects of selecting different values of the perturbation $h$.
The fundamental result is the following: in order to improve the accuracy of the central difference schemes it is necessary to reduce the truncation error, due to the higher-order terms in the Taylor series, by reducing $h$. However, making $h$ too small can lead to subtraction errors due to the finite precision used by computers to store numbers. Indeed, it is not desirable to choose $h$ too small, otherwise the round-off error becomes dominant.


In addition, the last three examples also show that, as it is reasonable to expect, the error between the analytical and the numerical derivatives becomes higher when more complex functions need to be differentiated. This is a consequence of the combination of several errors that of course reduce the overall accuracy of the numerical schemes we have investigated so far.
In conclusion, considering a specific function, it is possible to study how the error between analytical and numerical derivatives varies with respect to $h$, in order to choose the central difference scheme which minimizes the error for a given value of the perturbation $h$.

Chapter 3
Advanced Differentiation
Scheme: the Complex-Step
Derivative Approach

Overview
In this chapter the complex-step derivative approximation and its applica-
tion to six test functions are presented.
As seen from Chapter 2, the easiest way to estimate numerical derivatives is
by finite difference approximation and, in particular, the central difference schemes appear to be the most accurate ones. These schemes can be derived
by truncating a Taylor series expanded about a point x.
When estimating derivatives using finite difference formulas we are faced with
the problem of selecting the size of the perturbation h so that it minimizes
the error between the analytic and numerical derivative. Indeed, it is nec-
essary to choose a small h to minimize the truncation error while avoiding
the use of a perturbation too small because, in this case, the round-off error
becomes dominant.
In order to improve the accuracy of the numerical derivatives, the complex-
step derivative approximation is defined so that it is not subject to subtrac-
tive cancellation errors. This is a great advantage over the finite difference
operations as it will be seen in the next sections.
In the following sections the complex-step method is defined and then tested
in presence of six different functions. The results are compared with respect
to the ones achieved in Chapter 2 using the finite difference approximation.


3.1 The Complex-Step Derivative Approximation
In this section it is shown that a first derivative estimate for real functions can be obtained using complex calculus.
Consider a function, $f = u + iv$, of the complex variable $z = x + iy$. If $f$ is analytic, that is, if it is differentiable in the complex plane, the Cauchy-Riemann equations apply and
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad (3.1)$$
$$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \qquad (3.2)$$
Using the definition of derivative in the right-hand side of the first Cauchy-Riemann equation it is possible to write
$$\frac{\partial u}{\partial x} = \lim_{h \to 0} \frac{v(x + i(y + h)) - v(x + iy)}{h}, \qquad (3.3)$$
where $h$ is a real number. Since the functions we are interested in are originally real functions of real variables, $y = 0$, $u(x) = f(x)$ and $v(x) = 0$. Equation (3.3) can be rewritten as
$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{\mathrm{Im}[f(x + ih)]}{h}. \qquad (3.4)$$
For a small discrete $h$, this can be approximated by
$$\frac{\partial f}{\partial x} \approx \frac{\mathrm{Im}[f(x + ih)]}{h}. \qquad (3.5)$$
Equation (3.5) is called the complex-step derivative approximation [4]. As can be seen, this estimate is not subject to subtractive cancellation errors, because it does not involve a difference operation, as happens, instead, in the finite difference approximations. In other words, the complex-step derivative approximation is not affected by this source of round-off error, and this constitutes a huge advantage over the finite difference approximations, allowing us to choose $h$ as small as possible in order to reduce the truncation error without worrying about the round-off error.
In order to determine the error involved in this approximation it is possible to expand $f$ as a Taylor series about a real point $x$ but now, rather than using a real step $h$, a pure imaginary step $ih$ is used:
$$f(x + ih) = f(x) + ihf'(x) - h^2\frac{f''(x)}{2!} - ih^3\frac{f'''(x)}{3!} + \dots \qquad (3.6)$$
Taking the imaginary parts of both sides of this Taylor expansion (3.6) and dividing by $h$ yields [4]
$$f'(x) = \frac{\mathrm{Im}[f(x + ih)]}{h} + O(h^2). \qquad (3.7)$$
As a consequence, the approximation is of order $O(h^2)$. The second-order errors can be reduced by ensuring that $h$ is sufficiently small and, since the complex-step approximation does not involve a difference operation, it is possible to choose an extremely small perturbation $h$ without losing accuracy. The only drawback is the need to have an analytical function; the method cannot be applied, for instance, to look-up tables.
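As a minimal Python sketch of approximation (3.5) (hypothetical illustrative code relying on the language's built-in complex arithmetic, not the SPARTAN implementation):

import cmath

def complex_step_diff(f, x, h=1e-20):
    # Complex-step derivative approximation, eq. (3.5).
    # No difference operation is involved, so h can be taken extremely small.
    return f(x + 1j * h).imag / h

# The function must accept complex arguments, e.g. through the cmath module
def f(x):
    return cmath.exp(x) / (cmath.sin(x) ** 3 + cmath.cos(x) ** 3)

print(complex_step_diff(f, 1.5))

The practical counterpart of the analyticity requirement stated above is that the implementation of $f$ must be evaluated with a complex argument; functions defined by look-up tables or by non-analytic operations (e.g. the absolute value) would break the estimate.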

3.2 Numerical examples


In the following section the complex-step method is tested in the presence of six different functions of increasing complexity, and the results are compared with the ones obtained in Chapter 2 using the finite difference schemes.
Let f (x) = sin(x).

Figure 3.1 shows the relative error in the sensitivity estimates given by the central difference and the complex-step methods, using the analytical result as the reference: $\varepsilon = |f' - f'_{\mathrm{ref}}|/|f'_{\mathrm{ref}}|$. It illustrates that the central difference estimates initially converge quadratically to the exact result but, when the step $h$ is reduced below a value of about $10^{-2}$ for the 7-points stencil scheme, $10^{-3}$ for the 5-points stencil scheme and $10^{-5}$ for the 3-points stencil scheme, round-off error becomes dominant and the resulting estimates are not reliable. Indeed, $\varepsilon$ diverges or tends to 1, meaning that the finite difference estimates yield zero because $h$ is so small that no difference exists in the output.
The complex-step derivative approximation converges quadratically with decreasing step size because of the decrease of the truncation error. The estimate is practically insensitive to small sizes of the perturbation $h$, and for any $h$ below a value of about $10^{-8}$ it achieves the accuracy of the function evaluation.
Figure 3.2 illustrates the comparison of the minimum error of the central difference derivative estimate and the minimum error of the complex-step derivative approximation. The complex-step approximation appears to be, approximately, two orders of magnitude more accurate than the central difference scheme.


Figure 3.1: Relative error in the sensitivity estimates, function f (x) = sin(x).

Figure 3.2: Minimum-Errors comparison, function f (x) = sin(x).


Let f (x) = x1 .

Figure 3.3 shows the relative error in the derivative estimates given by
the central difference and the complex-step methods. It illustrates that,
here again, the central difference estimates initially converge quadratically
to the exact result but, when the step h is reduced below a value of about
10^-2 for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme
and 10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. The complex-step derivative
approximation converges quadratically with decreasing step size because of
the decrease of the truncation error. The estimate is practically insensitive
to small perturbation sizes h and, for any h below a value of about 10^-8,
it achieves the accuracy of the function evaluation.
Figure 3.4 compares the minimum error of the central difference derivative
estimates with the minimum error of the complex-step derivative
approximation. Here too, the complex-step approximation appears to be,
approximately, two orders of magnitude more accurate than the central
difference scheme.

Figure 3.3: Relative error in the sensitivity estimates, function f(x) = 1/x.


Figure 3.4: Minimum-Errors comparison, function f(x) = 1/x.

Let f(x) = 1/x².
Figure 3.5 shows the relative error in the derivative estimates given by
the central difference and the complex-step methods. It illustrates that,
here again, the central difference estimates initially converge quadratically
to the exact result but, when the step h is reduced below a value of about
10^-2 for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme
and 10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. The complex-step derivative
approximation converges quadratically with decreasing step size because of
the decrease of the truncation error. The estimate is practically insensitive
to small perturbation sizes h and, for any h below a value of about 10^-8,
it achieves the accuracy of the function evaluation.
Figure 3.6 compares the minimum error of the central difference derivative
estimates with the minimum error of the complex-step derivative
approximation. Here too, the complex-step approximation appears to be,
approximately, two orders of magnitude more accurate than the central
difference scheme.

Figure 3.5: Relative error in the first derivative, function f(x) = 1/x².

Figure 3.6: Minimum-Errors comparison, function f(x) = 1/x².

Let f(x) = A sin(t)e^t.
Figure 3.7 shows the relative error in the derivative estimates given by
the central difference and the complex-step methods. It illustrates that,
here again, the central difference estimates initially converge quadratically
to the exact result but, when the step h is reduced below a value of about
10^-2 for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme
and 10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. The complex-step derivative
approximation converges quadratically with decreasing step size because of
the decrease of the truncation error. The estimate is practically insensitive
to small perturbation sizes h and, for any h below a value of about 10^-8,
it achieves the accuracy of the function evaluation.
Figure 3.8 compares the minimum error of the central difference derivative
estimates with the minimum error of the complex-step derivative
approximation. Here too, the complex-step approximation appears to be,
approximately, two orders of magnitude more accurate than the central
difference scheme.

Figure 3.7: Relative error in the first derivative, f(x) = A sin(t)e^t.

Figure 3.8: Minimum-Errors comparison, f(x) = A sin(t)e^t.

Let f(x) = A sin²(t) cos(t²).
Figure 3.9 shows the relative error in the derivative estimates given by
the central difference and the complex-step methods. Here again, the
central difference estimates initially converge quadratically to the exact
result but, when the step h is reduced below a value of about 10^-2 for
the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme and
10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h
is so small that no difference exists in the output. The complex-step
derivative approximation converges quadratically with decreasing step
size because of the decrease of the truncation error. The estimate is
practically insensitive to small perturbation sizes h and, for any h below
a value of about 10^-8, it achieves the accuracy of the function evaluation.
Figure 3.10 compares the minimum error of the central difference derivative
estimates with the minimum error of the complex-step derivative
approximation. The complex-step approximation appears to be, approximately,
three orders of magnitude more accurate than the central difference scheme.

Figure 3.9: Relative error in the first derivative, f(x) = A sin²(t) cos(t²).

Figure 3.10: Minimum-Errors comparison, f(x) = A sin²(t) cos(t²).

Let f(x) = e^x / (sin³(x) + cos³(x)).
Figure 3.11 shows the relative error in the derivative estimates given
by the central difference and the complex-step methods. Here again,
the central difference estimates initially converge quadratically to the
exact result but, when the step h is reduced below a value of about 10^-2
for the 7-points stencil scheme, 10^-3 for the 5-points stencil scheme and
10^-5 for the 3-points stencil scheme, round-off error becomes dominant
and the resulting estimates are not reliable. Indeed, ε diverges or tends
to 1, meaning that the finite difference estimates yield zero because h is
so small that no difference exists in the output. Since this example is
taken from [4], the results can be compared directly: Figures 3.11 and 3.12
show that they are consistent. The complex-step derivative approximation
converges quadratically with decreasing step size because of the decrease
of the truncation error. The estimate is practically insensitive to small
perturbation sizes h and, for any h below a value of about 10^-8, it
achieves the accuracy of the function evaluation.
Figure 3.13 compares the minimum error of the central difference derivative
estimates with the minimum error of the complex-step derivative
approximation. The complex-step approximation appears to be, approximately,
three orders of magnitude more accurate than the central difference scheme.

3.3 Conclusions

In this chapter the complex-step derivative approximation has been analysed.
The complex-step approximation provides greater accuracy than the finite
difference formulas, for first derivatives, by eliminating the subtraction error.
Indeed, like the finite differences, the complex-step derivative approximation
is subject to truncation error, but it does not suffer from the problem of
round-off error: f'(x) is the leading term of the imaginary part of f(x + ih),
so h can be made small enough that the truncation error is effectively zero
without worrying about the round-off error. The only disadvantage is the need
for an analytical function, meaning that, for instance, it is not possible to
apply this approximation to look-up tables.
The complex-step derivative approximation has been tested in the presence of
six different functions and the results have been compared with the ones shown
in Chapter 2.
Figure 3.11: Relative error in the first derivative, f(x) = e^x / (sin³(x) + cos³(x)).
Figure 3.12: Relative error in the first derivative [4].

Figure 3.13: Minimum-Errors comparison, f(x) = e^x / (sin³(x) + cos³(x)).
The value of h that minimizes the error between the analytical and numerical
derivatives is, in the case of the complex-step approximation, smaller than
the machine epsilon (ε = 2.2204 × 10^-16). For this reason, in order to be
sure to get reliable results, a value of h equal to twice the machine epsilon
has been selected.

Chapter 4
Advanced Differentiation Scheme: the Dual-Step Derivative Approach

Overview
In this chapter the dual-step derivative approach and its application to six
test functions are presented.
As discussed in the previous chapters, derivatives are often approximated using
finite difference schemes. These approximations are subject to truncation
error, associated with the higher-order terms of the Taylor series that are
ignored when forming the approximation, and to round-off error, which is a
result of performing these calculations on a computer with finite precision.
The complex-step derivative approximation is more accurate than the finite
difference schemes, and the greater accuracy is obtained by eliminating the
subtractive cancellation error.
In order to further improve the accuracy of the numerical derivatives, the
dual-step approach uses dual numbers, see Appendix [A]; the derivatives
calculated using these numbers are exact, subject neither to truncation error
nor to subtraction errors. This is a great advantage, in terms of the step
size, over the complex-step derivative approximation.
In the following sections the dual-step method for the first derivative
calculation is defined and tested in the presence of six different functions.
The error in the first derivative calculation is then compared with the error
for the central difference schemes and the complex-step approximation.

4.1 The Dual-Step Derivative Approach
In this section the dual-step formula for the first derivative calculation is
shown.
Consider the Taylor series of a function f(x) for x ∈ R:
\[ f(x + a) = f(x) + a f'(x) + \frac{a^2 f''(x)}{2!} + \frac{a^3 f'''(x)}{3!} + \dots \tag{4.1} \]
If we assume that the perturbation a is the dual part of the dual number
(x + a_1 ε),
\[ a = a_1 \epsilon \quad \text{with} \quad \epsilon^2 = 0 \ \text{and} \ \epsilon \neq 0, \tag{4.2} \]
so that a² = 0, a³ = 0, ..., the Taylor series (4.1) truncates exactly at the
first-derivative term, yielding the property of the approximation that we
are seeking:
\[ f(x + a) = f(x) + a_1 f'(x)\,\epsilon. \tag{4.3} \]
So, to get f'(x) it is sufficient to read off the ε component and divide
by a_1, yielding the dual-step first-derivative formula:
\[ f'(x) = \frac{\operatorname{Dual}[f(x + a)]}{a_1}. \tag{4.4} \]
Since the dual-step derivative approximation does not involve a difference
operation and no terms of the Taylor series are ignored, this formula is sub-
ject neither to truncation error, nor to round-off error. There is no need to
make the step size small and the simplest choice is a1 = 1, which eliminates
the need to divide by the step size.
The main disadvantage of the dual-step approach is related to the computa-
tional cost. Working with dual numbers requires additional computational
work. Indeed, adding two dual numbers is equivalent to 2 real additions.
Multiplying two dual numbers is equivalent to 3 real multiplications and 2
real additions, as shown in the Appendix [A]. Therefore a dual-function
evaluation should take about 2 to 5 times the runtime of a real-function eval-
uation.
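A minimal sketch of the idea, using a tiny Python dual-number type in place of the MATLAB class mentioned in later chapters (the class, the operators implemented and the test function below are illustrative assumptions, not the thesis implementation):

```python
class Dual:
    """Dual number a + b*eps with eps**2 = 0; only what is needed for first derivatives."""
    def __init__(self, re, du=0.0):
        self.re, self.du = re, du

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.re + other.re, self.du + other.du)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps) * (c + d*eps) = a*c + (a*d + b*c)*eps, since eps**2 = 0
        return Dual(self.re * other.re, self.re * other.du + self.du * other.re)
    __rmul__ = __mul__

def dual_step_derivative(f, x, a1=1.0):
    # Seed the dual part with a1 (Eq. 4.4); a1 = 1 removes the final division entirely
    return f(Dual(x, a1)).du / a1

f = lambda x: x * x * x + 2.0 * x          # exact derivative: 3x^2 + 2
print(dual_step_derivative(f, 2.0))        # 14.0, exact to machine precision
```

Transcendental functions (sin, exp, ...) would need dedicated overloads that propagate the dual part with the analytical derivative of each elementary operation, which is exactly the extra work that makes a dual-function evaluation a few times more expensive than a real one.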

4.2 Numerical Examples
In this section the dual-step method is tested in the presence of six
different functions of increasing complexity, and the results are compared
with the ones obtained using the finite difference schemes and the
complex-step method.

Let f (x) = sin(x).

Figure 4.1 illustrates, as a function of the step size h, the relative error
in the derivative estimates given by the central difference, the complex-step
and the dual-step methods, using the analytical result as the reference.
The relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for the
central difference approximations begins to grow, while the error for the
complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first-derivative complex-step approximation, as
seen in Chapters 2 and 3.
The error of the dual-step approximation, which is not subject to truncation
error or round-off error, is machine zero regardless of the selected step
size.

Figure 4.1: Relative error in the first derivative, function f (x) = sin(x).

Let f(x) = 1/x.
Figure 4.2 illustrates, as a function of the step size h, the relative error
in the derivative estimates given by the central difference, the complex-step
and the dual-step methods, using the analytical result as the reference.
The relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for the
central difference approximations begins to grow, while the error for the
complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first-derivative complex-step approximation.
Here again, the error of the dual-step approximation, which is not subject
to truncation error or round-off error, is machine zero regardless of the
selected step size.

Figure 4.2: Relative error in the first derivative, function f(x) = 1/x.

Let f(x) = 1/x².
Figure 4.3 illustrates, as a function of the step size h, the relative error
in the derivative estimates given by the central difference, the complex-step
and the dual-step methods, using the analytical result as the reference.
The relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for the
central difference approximations begins to grow, while the error for the
complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first-derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
not subject to truncation error or round-off error, is machine zero
regardless of the selected step size.

Figure 4.3: Relative error in the first derivative, function f(x) = 1/x².

Let f(x) = A sin(t)e^t.
Figure 4.4 illustrates, as a function of the step size h, the relative error
in the derivative estimates given by the central difference, the complex-step
and the dual-step methods, using the analytical result as the reference.
The relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for the
central difference approximations begins to grow, while the error for the
complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first-derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
not subject to truncation error or round-off error, is machine zero
regardless of the selected step size.

Figure 4.4: Relative error in the first derivative, f(x) = A sin(t)e^t.

Let f(x) = A sin²(t) cos(t²).
Figure 4.5 illustrates, as a function of the step size h, the relative error
in the derivative estimates given by the central difference, the complex-step
and the dual-step methods, using the analytical result as the reference.
The relative error is defined as ε = |f' − f'_ref| / |f'_ref|. As the step size
decreases, the error decreases according to the order of the truncation
error of the method. However, after a certain value of h, the error for the
central difference approximations begins to grow, while the error for the
complex-step approximation continues to decrease until it reaches, and
remains at, machine zero (the machine epsilon). This shows the effect of
subtractive cancellation errors, which affect the finite difference
approximations but not the first-derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
not subject to truncation error or round-off error, is machine zero
regardless of the selected step size.

Figure 4.5: Relative error in the first derivative, f(x) = A sin²(t) cos(t²).

Let f(x) = e^x / (sin³(x) + cos³(x)).
Figure 4.6 illustrates, as a function of the step size h, the relative error
in the derivative estimates given by the central difference, the complex-step
and the dual-step methods, using the analytical result as the reference. As
the step size decreases, the error decreases according to the order of the
truncation error of the method. However, after a certain value of h, the
error for the central difference approximations begins to grow, while the
error for the complex-step approximation continues to decrease until it
reaches, and remains at, machine zero (the machine epsilon). This shows the
effect of subtractive cancellation errors, which affect the finite difference
approximations but not the first-derivative complex-step approximation.
For this case as well, the error of the dual-step approximation, which is
not subject to truncation error or round-off error, is machine zero
regardless of the selected step size. Since this example is taken from [13],
the results can be compared directly: Figures 4.6 and 4.7 show that they are
consistent.

Figure 4.6: Relative error in the first derivative, f(x) = e^x / (sin³(x) + cos³(x)).

Figure 4.7: Relative error in the first derivative [13].

4.3 Conclusions

In this chapter the dual-step approach for the computation of first
derivatives has been introduced. This approach provides greater accuracy than
the finite difference formulas and the complex-step derivative approximation.
Indeed, the dual-step approach is subject neither to truncation error nor to
round-off error and, as a consequence, the error in the first derivative
estimate is machine zero regardless of the selected step size. This is a
great advantage over the complex-step derivative approximation because, using
the dual-step approach, there is no need to select the step size as small as
possible.
The disadvantage is the computational cost, due to the fact that working with
dual numbers requires additional computational work. In addition, like the
complex-step approximation, the dual-step approach requires an analytical
function.
The dual-step approach has been tested in the presence of six different
functions. The relative errors have been calculated using the analytical
derivative as reference and then compared with the ones computed in Chapters
2 and 3 using, respectively, the finite difference schemes and the
complex-step approximation.

Chapter 5
Generation of Reference Data

Overview
In this chapter the analytical structure of the Gradient vector and of the
Jacobian matrix is analysed in order to generate a set of reference data for
each of the following problems: the Space Shuttle Reentry Problem [1], the
Orbit Raising Problem [5] and the Hang Glider Problem [1]. These reference
data are useful to compare the exact analytical Jacobian with the numerical
one, computed using different numerical differentiations, as we will see in
the next chapters.
In the first section of this chapter the analytical definitions of the
Gradient vector and of the Jacobian matrix of a generic function are given.
Then, the formulation of the three aforementioned problems is described and
the analytical Jacobian is generated for each of them. In the last section
the Jacobian matrix sparsity pattern for each of the three problems is shown.

5.1 Definition of Gradient and Jacobian
Given a generic scalar-valued function of n variables, f(x) : x ∈ R^n, if
f(x_1, x_2, ..., x_n) is differentiable, its gradient is the vector whose
components are the n partial derivatives of f with respect to the n variables
x_i. The gradient of the function f is denoted by ∇f, where ∇ denotes the
vector differential operator:
\[ \nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n} \right). \tag{5.1} \]
The gradient can be considered as the generalization to n dimensions of the
usual concept of the derivative of a function of a single variable.

If we now consider a generic function F(x) : x ∈ R^n → R^m, the Jacobian
matrix of this vector-valued function is the matrix of all the n first-order
partial derivatives of the m real-valued functions F_1(x_1, ..., x_n), ...,
F_m(x_1, ..., x_n), which can be organized in an m-by-n matrix as follows:
\[ J = \begin{bmatrix}
\frac{\partial F_1}{\partial x_1} & \frac{\partial F_1}{\partial x_2} & \dots & \frac{\partial F_1}{\partial x_n} \\
\frac{\partial F_2}{\partial x_1} & \frac{\partial F_2}{\partial x_2} & \dots & \frac{\partial F_2}{\partial x_n} \\
\vdots & \vdots & \dots & \vdots \\
\frac{\partial F_m}{\partial x_1} & \frac{\partial F_m}{\partial x_2} & \dots & \frac{\partial F_m}{\partial x_n}
\end{bmatrix} \tag{5.2} \]
The Jacobian can be considered as the generalization of the gradient to
vector-valued functions of several variables. Indeed, in the case m = 1 the
Jacobian matrix has a single row, and may be identified with a vector, which
is the gradient.
In the case m = n the Jacobian matrix is a square matrix and its determinant
is a function of (x_1, x_2, ..., x_n) called the Jacobian determinant of F.
Let us now consider the general structure of the Jacobian matrix associated
with a dynamical system of m equations. In the most general case, considering
n_s states x = (x_1, ..., x_{n_s})^T, n_c controls u = (u_1, ..., u_{n_c})^T
and a generic time t, the Jacobian matrix will have the following dimension
and structure:
\[ \dim(J) = [m \times (1 + n_s + n_c)] \tag{5.3} \]
\[ J = \begin{bmatrix}
\frac{\partial F_1}{\partial t} & \frac{\partial F_1}{\partial x_1} & \dots & \frac{\partial F_1}{\partial x_{n_s}} & \frac{\partial F_1}{\partial u_1} & \dots & \frac{\partial F_1}{\partial u_{n_c}} \\
\frac{\partial F_2}{\partial t} & \frac{\partial F_2}{\partial x_1} & \dots & \frac{\partial F_2}{\partial x_{n_s}} & \frac{\partial F_2}{\partial u_1} & \dots & \frac{\partial F_2}{\partial u_{n_c}} \\
\vdots & \vdots & \dots & \vdots & \vdots & \dots & \vdots \\
\frac{\partial F_m}{\partial t} & \frac{\partial F_m}{\partial x_1} & \dots & \frac{\partial F_m}{\partial x_{n_s}} & \frac{\partial F_m}{\partial u_1} & \dots & \frac{\partial F_m}{\partial u_{n_c}}
\end{bmatrix} \tag{5.4} \]
Considering the three problems we will analyse in the next sections, the
number m will be equal to the number of state variables n_s; however, for a
generic optimal control problem we have to account for additional equations,
namely the cost function and the n_c constraint equations, so that m will be
equal to n_s + n_c + 1.
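The structure of (5.4) can also be obtained numerically, one column at a time, by perturbing t, each state and each control in turn. The sketch below does this with the complex-step formula (3.7) in Python; the dynamics used (those of the Orbit Raising Problem of Section 5.2.2), the sample point and the normalized constants μ = 1 and T = 0.01 are illustrative assumptions meant only to show the column layout, not the SPARTAN implementation.

```python
import numpy as np

def jacobian_complex_step(F, t, x, u, h=1e-20):
    """Jacobian of F(t, x, u) -> R^m, columns ordered as in Eq. (5.4): [t, states, controls]."""
    z = np.concatenate(([t], x, u)).astype(complex)
    ns, m = len(x), len(F(t, x, u))
    J = np.zeros((m, z.size))
    for j in range(z.size):
        zp = z.copy()
        zp[j] += 1j * h                          # imaginary perturbation of one variable
        Fj = F(zp[0], zp[1:1 + ns], zp[1 + ns:])
        J[:, j] = np.imag(Fj) / h                # complex-step formula applied column-wise
    return J

def orbit_raising_rhs(t, x, u, mu=1.0, T=0.01):
    # dynamics (5.12)-(5.15) in canonical units, used here only as a convenient test case
    r, theta, Vr, Vt = x
    phi = u[0]
    return np.array([Vr,
                     Vt / r,
                     Vt**2 / r - mu / r**2 + T * np.sin(phi),
                     -Vr * Vt / r + T * np.cos(phi)])

J = jacobian_complex_step(orbit_raising_rhs, 0.0,
                          np.array([1.2, 0.3, 0.05, 0.9]), np.array([0.1]))
print(J.shape)   # (4, 6): m = n_s = 4 rows and 1 + n_s + n_c = 6 columns
```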

5.2 Numerical Examples
In this section, three problems are formulated and, for each of them, we will
generate the analytical Jacobian matrix based on the definition given above.

5.2.1 Space Shuttle Reentry Problem
The problem of the construction of the reentry trajectory for the space shuttle
is a classic example of an optimal control problem and it is of significant
practical interest. The motion of the vehicle is defined by the following set
of differential algebraic equations [1]:

\[ \dot{h} = v \sin\gamma, \tag{5.5} \]
\[ \dot{\theta} = \frac{v \sin\psi \cos\gamma}{r \cos\phi}, \tag{5.6} \]
\[ \dot{\phi} = \frac{v}{r} \cos\gamma \cos\psi, \tag{5.7} \]
\[ \dot{v} = -\frac{D}{m} - g \sin\gamma, \tag{5.8} \]
\[ \dot{\gamma} = \frac{L}{m v} \cos\sigma + \cos\gamma \left( \frac{v}{r} - \frac{g}{v} \right), \tag{5.9} \]
\[ \dot{\psi} = \frac{L \sin\sigma}{m v \cos\gamma} + \frac{v}{r \cos\phi} \cos\gamma \sin\psi \sin\phi, \tag{5.10} \]

where the aerodynamic and atmospheric forces on the vehicle are specified
by the following quantities (English units) [1]:

\[
\begin{aligned}
L &= \tfrac{1}{2} c_L S \rho v^2, & D &= \tfrac{1}{2} c_D S \rho v^2, \\
g &= \mu / r^2 \ (\mu = 0.1407 \times 10^{17}), & r &= R_e + h \ (R_e = 20902900), \\
\rho &= \rho_0 \, e^{-h/h_r} \ (\rho_0 = 0.002378), & h_r &= 23800, \\
c_L &= a_0 + a_1 \alpha, & c_D &= b_0 + b_1 \alpha + b_2 \alpha^2, \\
a_0 &= -0.20704, & b_0 &= 0.07854, \\
a_1 &= 0.029244, & b_1 &= -0.6159 \times 10^{-2}, \\
S &= 2690, & b_2 &= 0.6214 \times 10^{-3}.
\end{aligned}
\]

The state variables are x = (h, θ, φ, v, γ, ψ)^T, where h is the altitude
(ft), θ is the longitude (rad), φ is the latitude (rad), v is the velocity
(ft/sec), γ is the flight path angle (rad) and ψ is the azimuth (rad). The
control variables are u = (α, σ)^T, where α is the angle of attack (rad) and
σ is the bank angle (rad). So, considering the formula (5.4), the analytical
Jacobian

matrix associated to the Space Shuttle Reentry problem is the following:



\[ J = \begin{bmatrix}
0 & 0 & 0 & 0 & J_{15} & J_{16} & 0 & 0 & 0 \\
0 & J_{22} & 0 & J_{24} & J_{25} & J_{26} & J_{27} & 0 & 0 \\
0 & J_{32} & 0 & 0 & J_{35} & J_{36} & J_{37} & 0 & 0 \\
0 & J_{42} & 0 & 0 & J_{45} & J_{46} & 0 & J_{48} & 0 \\
0 & J_{52} & 0 & 0 & J_{55} & J_{56} & 0 & J_{58} & J_{59} \\
0 & J_{62} & 0 & J_{64} & J_{65} & J_{66} & J_{67} & J_{68} & J_{69}
\end{bmatrix} \tag{5.11} \]
The non-zero elements of J are the following:


\[
\begin{aligned}
J_{15} &= \sin\gamma, & J_{48} &= -\frac{\rho S v^2 (b_1 + 2 b_2 \alpha)}{2m}, \\
J_{16} &= v \cos\gamma, & J_{52} &= -\frac{\rho S v c_L \cos\sigma}{2 m h_r} + \cos\gamma \, \frac{2\mu - v^2 r}{v r^3}, \\
J_{22} &= -\frac{v \cos\gamma \sin\psi}{r^2 \cos\phi}, & J_{55} &= \frac{\rho S c_L \cos\sigma}{2m} + \cos\gamma \, \frac{v^2 r + \mu}{r^2 v^2}, \\
J_{24} &= \frac{v \cos\gamma \sin\psi \sin\phi}{r \cos^2\phi}, & J_{56} &= -\sin\gamma \, \frac{v^2 r - \mu}{r^2 v}, \\
J_{25} &= \frac{\cos\gamma \sin\psi}{r \cos\phi}, & J_{58} &= \frac{\rho S v a_1 \cos\sigma}{2m}, \\
J_{26} &= -\frac{v \sin\gamma \sin\psi}{r \cos\phi}, & J_{59} &= -\frac{\rho S v c_L \sin\sigma}{2m}, \\
J_{27} &= \frac{v \cos\gamma \cos\psi}{r \cos\phi}, & J_{62} &= -\frac{\rho S v c_L \sin\sigma}{2 m h_r \cos\gamma} - \frac{v \cos\gamma \sin\psi \tan\phi}{r^2}, \\
J_{32} &= -\frac{v \cos\gamma \cos\psi}{r^2}, & J_{64} &= \frac{v \cos\gamma \sin\psi}{r \cos^2\phi}, \\
J_{35} &= \frac{\cos\gamma \cos\psi}{r}, & J_{65} &= \frac{\rho S c_L \sin\sigma}{2 m \cos\gamma} + \frac{\cos\gamma \sin\psi \tan\phi}{r}, \\
J_{36} &= -\frac{v \sin\gamma \cos\psi}{r}, & J_{66} &= \frac{\rho S v c_L \sin\sigma \sin\gamma}{2 m \cos^2\gamma} - \frac{v \sin\gamma \sin\psi \tan\phi}{r}, \\
J_{37} &= -\frac{v \cos\gamma \sin\psi}{r}, & J_{67} &= \frac{v \cos\gamma \cos\psi \tan\phi}{r}, \\
J_{42} &= \frac{\rho S c_D v^2}{2 m h_r} + \frac{2 \mu \sin\gamma}{r^3}, & J_{68} &= \frac{\rho S v a_1 \sin\sigma}{2 m \cos\gamma}, \\
J_{45} &= -\frac{\rho S c_D v}{m}, & J_{69} &= \frac{\rho S v c_L \cos\sigma}{2 m \cos\gamma}. \\
J_{46} &= -\frac{\mu \cos\gamma}{r^2}, & &
\end{aligned}
\]

5.2.2 Orbit Raising Problem
This problem has been proposed more than once in the literature [5] and deals
with the maximization of the specific energy of a low-thrust spacecraft orbit
transfer in a given, fixed time. It can be expressed considering an orbit
subject to the following dynamics (expressed in canonical units) [5]:
\[ \frac{dr}{dt} = V_r, \tag{5.12} \]
\[ \frac{d\theta}{dt} = \frac{V_t}{r}, \tag{5.13} \]
\[ \frac{dV_r}{dt} = \frac{V_t^2}{r} - \frac{\mu}{r^2} + T \sin\phi, \tag{5.14} \]
\[ \frac{dV_t}{dt} = -\frac{V_r V_t}{r} + T \cos\phi, \tag{5.15} \]
where V_r and V_t are, respectively, the radial and the tangential speed, r
is the radius, θ is the true anomaly, T is the specific force (the thrust
acceleration), assumed to be constant and equal to 0.01, μ is the normalized
gravitational parameter and φ is the thrust angle, i.e. the angle between the
direction of the thrust and the tangential velocity.
The state variables are x = (r, θ, V_r, V_t)^T, whereas the control variable
is u = φ. So, here again, considering the formula (5.4), the analytical
Jacobian matrix concerning the Orbit Raising problem is the following:

\[ J = \begin{bmatrix}
0 & 0 & 0 & J_{14} & 0 & 0 \\
0 & J_{22} & 0 & 0 & J_{25} & 0 \\
0 & J_{32} & 0 & 0 & J_{35} & J_{36} \\
0 & J_{42} & 0 & J_{44} & J_{45} & J_{46}
\end{bmatrix} \tag{5.16} \]

The non-zero elements of J are the following:

\[
\begin{aligned}
J_{14} &= 1, & J_{36} &= T \cos\phi, \\
J_{22} &= -\frac{V_t}{r^2}, & J_{42} &= \frac{V_r V_t}{r^2}, \\
J_{25} &= \frac{1}{r}, & J_{44} &= -\frac{V_t}{r}, \\
J_{32} &= -\frac{V_t^2}{r^2} + \frac{2\mu}{r^3}, & J_{45} &= -\frac{V_r}{r}, \\
J_{35} &= \frac{2 V_t}{r}, & J_{46} &= -T \sin\phi.
\end{aligned}
\]
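For completeness, a small sketch of how these entries can be coded and cross-checked against the numerical Jacobian of the earlier complex-step example; the indexing convention (columns ordered as [t, r, θ, V_r, V_t, φ]) and the values μ = 1, T = 0.01 are illustrative assumptions.

```python
import numpy as np

def orbit_raising_jacobian(t, x, u, mu=1.0, T=0.01):
    """Analytical Jacobian (5.16); rows follow (5.12)-(5.15), columns [t, r, theta, Vr, Vt, phi]."""
    r, theta, Vr, Vt = x
    phi = u[0]
    J = np.zeros((4, 6))
    J[0, 3] = 1.0                                    # J14
    J[1, 1] = -Vt / r**2                             # J22
    J[1, 4] = 1.0 / r                                # J25
    J[2, 1] = -Vt**2 / r**2 + 2.0 * mu / r**3        # J32
    J[2, 4] = 2.0 * Vt / r                           # J35
    J[2, 5] = T * np.cos(phi)                        # J36
    J[3, 1] = Vr * Vt / r**2                         # J42
    J[3, 3] = -Vt / r                                # J44
    J[3, 4] = -Vr / r                                # J45
    J[3, 5] = -T * np.sin(phi)                       # J46
    return J

# e.g. np.max(np.abs(orbit_raising_jacobian(0.0, x0, u0) - jacobian_complex_step(...)))
# should be at machine-precision level for any reasonable point (x0, u0).
```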

5.2.3 Hang Glider Problem
This problem deals with the range maximization of a hang glider in the
presence of a specified thermal updraft. The state equations which describe
the planar motion of the hang glider are [1]

\[ \dot{x} = v_x, \tag{5.17} \]
\[ \dot{y} = v_y, \tag{5.18} \]
\[ \dot{v}_x = \frac{1}{m} \left( -L \sin\eta - D \cos\eta \right), \tag{5.19} \]
\[ \dot{v}_y = \frac{1}{m} \left( L \cos\eta - D \sin\eta - W \right), \tag{5.20} \]
where
\[
\begin{aligned}
L &= \tfrac{1}{2} c_L \rho S v_r^2, & D &= \tfrac{1}{2} c_D \rho S v_r^2, \\
\sin\eta &= \frac{V_y}{v_r}, & \cos\eta &= \frac{v_x}{v_r}, \\
v_r &= \sqrt{v_x^2 + V_y^2}, & V_y &= v_y - u_a(x), \\
u_a(x) &= u_M (1 - X) e^{-X}, & X &= \left( \frac{x}{R} - 2.5 \right)^2,
\end{aligned}
\]
and with the quadratic drag polar
\[ c_D(c_L) = c_0 + k c_L^2. \]

The state variables are x = (x, y, v_x, v_y)^T, where x is the horizontal
distance, y is the altitude, and v_x and v_y are, respectively, the horizontal
and the vertical velocity. The control variable is u = c_L, the aerodynamic
lift coefficient. Considering the formula (5.4), the analytical Jacobian
matrix representing the Hang Glider problem is the following:

\[ J = \begin{bmatrix}
0 & 0 & 0 & J_{14} & 0 & 0 \\
0 & 0 & 0 & 0 & J_{25} & 0 \\
0 & J_{32} & 0 & J_{34} & J_{35} & J_{36} \\
0 & J_{42} & 0 & J_{44} & J_{45} & J_{46}
\end{bmatrix} \tag{5.21} \]

The non-zero elements of J are the following:


\[
\begin{aligned}
J_{14} &= 1, \qquad J_{25} = 1, \\
J_{32} &= -\frac{\rho S}{2m} \, \frac{\partial V_y}{\partial x} \left[ \frac{V_y}{v_r} \left( c_L V_y + c_D v_x \right) + c_L v_r \right], \\
J_{34} &= -\frac{\rho S}{2m} \left[ \frac{v_x}{v_r} \left( c_L V_y + c_D v_x \right) + c_D v_r \right], \\
J_{35} &= -\frac{\rho S}{2m} \left[ \frac{V_y}{v_r} \left( c_L V_y + c_D v_x \right) + c_L v_r \right], \\
J_{36} &= -\frac{\rho S v_r}{2m} \left( V_y + 2 k c_L v_x \right), \\
J_{42} &= \frac{\rho S}{2m} \, \frac{\partial V_y}{\partial x} \left[ \frac{V_y}{v_r} \left( c_L v_x - c_D V_y \right) - c_D v_r \right], \\
J_{44} &= \frac{\rho S}{2m} \left[ \frac{v_x}{v_r} \left( c_L v_x - c_D V_y \right) + c_L v_r \right], \\
J_{45} &= \frac{\rho S}{2m} \left[ \frac{V_y}{v_r} \left( c_L v_x - c_D V_y \right) - c_D v_r \right], \\
J_{46} &= \frac{\rho S v_r}{2m} \left( v_x - 2 k c_L V_y \right),
\end{aligned}
\]
where
\[ \frac{\partial V_y}{\partial x} = -\frac{d u_a}{d x} = \frac{2}{R} \left( \frac{x}{R} - 2.5 \right) \left[ u_a(x) + u_M e^{-X} \right]. \]

5.3 Generation of Reference Jacobians
In this section we will generate a reference Jacobian matrix for each of the
three problems which have been formulated in the previous sections. In order
to do that, it is necessary to define a reference solution for each problem.
The reference solution is computed using SPARTAN, a tool developed by DLR
based on the use of the Flipped Radau Pseudospectral Method, which gives,
for each of the aforementioned problems, the following outputs:

- a vector t with n_t components, which are the n_t times (t_1, ..., t_i, ..., t_{n_t})
  when the solution is calculated;

- a matrix X, whose dimension is equal to [n_s × n_t], containing the values
  of the n_s state variables evaluated at the n_t times when the solution is
  calculated;

- a matrix U, whose dimension is equal to [n_c × n_t], containing the values
  of the n_c control variables evaluated at the n_t times when the solution
  is calculated.

As a consequence, the reference Jacobian matrix corresponding to this
reference solution will have the following dimension and structure:
\[ \dim(J) = [n_s n_t] \times [n_t (1 + n_s + n_c)] \tag{5.22} \]
\[ J = \begin{bmatrix}
J_1 & O & \dots & \dots & O \\
O & J_2 & \dots & \dots & O \\
O & \dots & J_3 & \dots & O \\
\vdots & \dots & \dots & \dots & \vdots \\
O & \dots & \dots & O & J_{n_t}
\end{bmatrix} \tag{5.23} \]
where each O denotes a zero block of dimension [m × (1 + n_s + n_c)].
The generic submatrix J_i has the dimension and structure shown in formula
(5.4), but now each component of the matrix is calculated at time t = t_i:
\[ \dim(J_i) = [m \times (1 + n_s + n_c)] \tag{5.24} \]
\[ J_i = \begin{bmatrix}
\left.\frac{\partial F_1}{\partial t}\right|_{t_i} & \left.\frac{\partial F_1}{\partial x_1}\right|_{t_i} & \dots & \left.\frac{\partial F_1}{\partial x_{n_s}}\right|_{t_i} & \left.\frac{\partial F_1}{\partial u_1}\right|_{t_i} & \dots & \left.\frac{\partial F_1}{\partial u_{n_c}}\right|_{t_i} \\
\left.\frac{\partial F_2}{\partial t}\right|_{t_i} & \left.\frac{\partial F_2}{\partial x_1}\right|_{t_i} & \dots & \left.\frac{\partial F_2}{\partial x_{n_s}}\right|_{t_i} & \left.\frac{\partial F_2}{\partial u_1}\right|_{t_i} & \dots & \left.\frac{\partial F_2}{\partial u_{n_c}}\right|_{t_i} \\
\vdots & \vdots & \dots & \vdots & \vdots & \dots & \vdots \\
\left.\frac{\partial F_m}{\partial t}\right|_{t_i} & \left.\frac{\partial F_m}{\partial x_1}\right|_{t_i} & \dots & \left.\frac{\partial F_m}{\partial x_{n_s}}\right|_{t_i} & \left.\frac{\partial F_m}{\partial u_1}\right|_{t_i} & \dots & \left.\frac{\partial F_m}{\partial u_{n_c}}\right|_{t_i}
\end{bmatrix} \tag{5.25} \]
So the reference Jacobian matrix is a sparse matrix and its pattern depends
on the problem we are analysing.
In the following figures the Jacobian matrix sparsity patterns for the Space
Shuttle Reentry problem (Figure 5.1), the Orbit Raising problem (Figure 5.2)
and the Hang Glider problem (Figure 5.3) are illustrated.
In order to have a better visualization of the pattern, all the figures
showing the Jacobian structure correspond to solutions obtained using 3
nodes, meaning n_t = 3.
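A compact way to reproduce this block-diagonal structure (and sparsity plots like the ones below) is sketched here; the per-node blocks are filled with ones only to mark the non-zero pattern, and the block sizes correspond to the Orbit Raising case as an arbitrary example.

```python
import numpy as np
from scipy.sparse import block_diag

def reference_jacobian_pattern(nt, m, ns, nc):
    """Block-diagonal pattern of Eq. (5.23): nt blocks, each of size m x (1 + ns + nc)."""
    blocks = [np.ones((m, 1 + ns + nc)) for _ in range(nt)]   # placeholder J_i blocks
    return block_diag(blocks)

J = reference_jacobian_pattern(nt=3, m=4, ns=4, nc=1)          # Orbit Raising sizes, nt = 3
print(J.shape, J.nnz)    # (12, 18) with 72 structurally non-zero entries
# matplotlib's plt.spy(J) would then produce a sparsity plot analogous to Figures 5.1-5.3
```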

Figure 5.1: Jacobian Matrix Sparsity Patterns for the Space Shuttle Problem.

Figure 5.2: Jacobian Matrix Sparsity Patterns for the Orbit Raising Problem.

Figure 5.3: Jacobian Matrix Sparsity Patterns for the Hang Glider Problem.

5.4 Conclusions
In this chapter the Gradient vector and the Jacobian matrix have been de-
fined. Three different problems (the Space Shuttle Reentry Problem, the
Orbit Raising Problem and the Hang Glider Problem) have been formulated
in order to generate a reference Jacobian matrix for each of them, meaning
a Jacobian matrix formulated using analytical derivatives.
The reference Jacobian matrices have been generated using reference solu-
tions which have been calculated with SPARTAN, a tool developed by DLR
based on the use of the FRPM.
The resulting reference Jacobian matrices are characterized by a sparsity
pattern whose structure depends on the problem that we are analysing, as it
is reasonable to expect. These reference data will be useful in order to val-
idate the numerical differentiation approaches we will introduce in the next
chapters.

Chapter 6
Jacobian Matrix Generation with Numerical Differentiations - Analysis of Accuracy and CPU Time

Overview
In this chapter the numerical differentiation schemes analysed in Chapters 2,
3 and 4 are employed to generate the numerical Jacobian matrix for three
different problems: the Space Shuttle Reentry Problem [1], the Orbit Raising
Problem [5] and the Hang Glider Problem [1]. Then, these numerical Jacobian
matrices are compared with the analytical ones computed in the previous
chapter, and the results in terms of accuracy and CPU time are illustrated.
In the first section of this chapter we focus our attention on the traditional
central difference schemes. For each of the three aforementioned problems the
Jacobian matrix is generated using the 3-points stencil central difference
scheme, the 5-points stencil central difference scheme and the 7-points
stencil central difference scheme. Then the accuracy of these schemes is
analysed using the analytical Jacobian matrix as reference.
In the second section the Jacobian matrix, for each of the three problems, is
generated using the complex-step derivative approximation. Then the accuracy
of this scheme is analysed and the results are compared with the ones achieved
using the traditional central difference schemes.
In the third section the same analysis is repeated, for each of the three
problems, considering the Jacobian matrix generated with the dual-step
derivative approach.
In the last sections we focus our attention on the CPU time, to have a measure
of the different computational power required for each technique.

6.1 Jacobian Matrix Generation with Central Difference Traditional Schemes
In this section the 3-points stencil central difference scheme, the 5-points
stencil central difference scheme and the 7-points central difference scheme
are employed to generate the Jacobian matrix for three different problems:
the Space Shuttle Reentry Problem, the Orbit Raising Problem and the Hang
Glider Problem.

6.1.1 Space Shuttle Reentry Problem
Figure 6.1 illustrates, as a function of the step size h, the error in the
Jacobian matrix given by the 3-points stencil central difference scheme, the
5-points stencil central difference scheme and the 7-points stencil central
difference scheme. The error is computed using the exact analytical Jacobian
as reference: ε = ‖J_num − J_an‖_∞.
As shown in Figure 6.1, as the step size decreases, the maximum error
decreases according to the order of the truncation error. However, this is
not generally true. Indeed, if we reduce the size of the perturbation h, the
accuracy of the central difference schemes which involve more points increases
until h is equal to a given h* at which ε is minimum. If h is further reduced,
the accuracy of the more complex stencil schemes decreases because of the
increase of the relative round-off error, which becomes dominant. In this
case, below a certain value of h, it is not convenient to compute the Jacobian
matrix using dense stencil schemes.
The minimum value of the error in the computation of the Jacobian matrix, ε*,
and the corresponding value of the perturbation, h*, are summarized, for each
central difference scheme, in Table 6.1. In terms of accuracy, in order to
reduce ε* it is convenient to use a central difference scheme which involves
more points and, at the same time, to select a value of the perturbation
which is not too small.
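The error curves of Figure 6.1 come from a straightforward step-size sweep. A stripped-down Python version of such a sweep is sketched below; the toy function F and its analytical Jacobian stand in for the shuttle dynamics and the reference Jacobian of Chapter 5, and the step-size grid is arbitrary.

```python
import numpy as np

def central_diff_jacobian(F, z, h):
    """3-points stencil central-difference Jacobian: two evaluations per perturbed variable."""
    m = F(z).size
    J = np.zeros((m, z.size))
    for j in range(z.size):
        e = np.zeros(z.size)
        e[j] = h
        J[:, j] = (F(z + e) - F(z - e)) / (2.0 * h)
    return J

# toy stand-in system with a known analytical Jacobian
F = lambda z: np.array([np.sin(z[0]) * z[1], z[0] ** 2 + np.cos(z[1])])
J_an = lambda z: np.array([[np.cos(z[0]) * z[1], np.sin(z[0])],
                           [2.0 * z[0], -np.sin(z[1])]])

z0 = np.array([0.8, 1.3])
for h in np.logspace(-1, -11, 6):
    eps = np.max(np.abs(central_diff_jacobian(F, z0, h) - J_an(z0)))
    print(f"h = {h:9.1e}   error = {eps:9.2e}")
```

Sweeping h in this way for the 3-, 5- and 7-points stencils, and for each of the three problems, yields the ε*/h* pairs collected in Tables 6.1-6.3.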

Figure 6.1: Maximum error in the Jacobian Matrix for the Space Shuttle Reentry
Problem.

Numerical Differentiation Scheme    |     ε*       |     h*
3-points stencil CD scheme          | 3.1 × 10^-8  | 1.8 × 10^-6
5-points stencil CD scheme          | 3.9 × 10^-10 | 3.7 × 10^-4
7-points stencil CD scheme          | 3.1 × 10^-11 | 4.8 × 10^-3

Table 6.1: Accuracy and step size comparison for the Space Shuttle Problem.

6.1.2 Orbit Raising Problem
The analysis is now repeated for the Orbit Raising Problem. Figure 6.2
illustrates, as a function of the step size h, the error in the Jacobian
matrix ε = ‖J_num − J_an‖_∞ given by the 3-points stencil central difference
scheme, the 5-points stencil central difference scheme and the 7-points
stencil central difference scheme.
Here again, as the step size decreases, the error decreases according to the
order of the truncation error. However, this is not generally true. Indeed,
if we reduce the size of the perturbation h, the accuracy of the central
difference schemes which involve more points increases until h is equal to a
given h* at which ε is minimum. If h is further reduced, the accuracy of the
more complex stencil schemes decreases because of the increase of the relative
round-off error, which becomes dominant. In this case, below a certain value
of h, it is not convenient to compute the Jacobian matrix using dense stencil
schemes.

Figure 6.2: Maximum error in the Jacobian Matrix for the Orbit Raising Problem.

The minimum value of the error in the Jacobian matrix, ε*, and the
corresponding value of the perturbation, h*, are summarized, for each central
difference scheme, in Table 6.2. In terms of accuracy, in order to reduce ε*
it is convenient to use a central difference scheme which involves more points
and, at the same time, to select a value of the perturbation which is not too
small.

Numerical Differentiation Scheme    |     ε*       |     h*
3-points stencil CD scheme          | 3.8 × 10^-11 | 3.6 × 10^-6
5-points stencil CD scheme          | 7.0 × 10^-13 | 4.4 × 10^-4
7-points stencil CD scheme          | 1.1 × 10^-13 | 2.7 × 10^-3

Table 6.2: Accuracy and step size comparison for the Orbit Raising Problem.

6.1.3 Hang Glider Problem
Here the analysis is repeated for the Hang Glider Problem. Figure 6.3
illustrates, as a function of the step size h, the error in the Jacobian
matrix ε = ‖J_num − J_an‖_∞ given by the 3-points stencil central difference
scheme, the 5-points stencil central difference scheme and the 7-points
stencil central difference scheme.
Also for the Hang Glider Problem, as the step size decreases, the error
decreases according to the order of the truncation error. However, this is
not generally true. Indeed, if we reduce the size of the perturbation h, the
accuracy of the central difference schemes which involve more points increases
until h is equal to a given h* at which ε is minimum. If h is further reduced,
the accuracy of the more complex stencil schemes decreases because of the
increase of the relative round-off error, which becomes dominant. In this
case, below a certain value of h, it is not convenient to compute the Jacobian
matrix using dense stencil schemes.
The minimum value of the error in the Jacobian matrix, ε*, and the
corresponding value of the perturbation, h*, are summarized, for each central
difference scheme, in Table 6.3. In terms of accuracy, in order to reduce ε*
it is convenient to use a central difference scheme which involves more points
and, at the same time, to select a value of the perturbation which is not too
small.
It is interesting to underline this result: if we compare the results
summarized in Tables 6.1, 6.2 and 6.3 we can point out that the simpler the
equations which describe the dynamics of the problem are, the better the
accuracy of each central difference scheme is, and the larger the step size
which minimizes the error in the Jacobian matrix becomes. These results are
consistent with the ones achieved in Chapter 2.

Figure 6.3: Maximum error in the Jacobian Matrix for the Hang Glider Problem.

Numerical Differentiation Scheme    |     ε*       |     h*
3-points stencil CD scheme          | 5.9 × 10^-11 | 7.4 × 10^-5
5-points stencil CD scheme          | 7.8 × 10^-13 | 7.7 × 10^-3
7-points stencil CD scheme          | 1.4 × 10^-13 | 4.9 × 10^-2

Table 6.3: Accuracy and step size comparison for the Hang Glider Problem.

In addition, the results suggest that:

- in case of use of the 3-points stencil central difference scheme, the step
  size should be selected in the range 1.8 × 10^-6 < h < 7.4 × 10^-5;

- in case of use of the 5-points stencil central difference scheme, the step
  size should be selected in the range 3.7 × 10^-4 < h < 7.7 × 10^-3;

- in case of use of the 7-points stencil central difference scheme, the step
  size should be selected in the range 2.7 × 10^-3 < h < 4.9 × 10^-2.

6.2 Jacobian Matrix Generation with Complex-Step Derivative Approach
In this section the complex-step derivative approximation is applied in or-
der to generate the Jacobian matrix for three different problems: the Space
Shuttle Reentry Problem, the Orbit Raising Problem and the Hang Glider
Problem.

6.2.1 Space Shuttle Reentry Problem
Figure 6.4 shows, as a function of the step size h, the error in the Jacobian
matrix ε = ‖J_num − J_an‖_∞ given by the central difference schemes and the
complex-step method.
The figure illustrates that the central difference estimates initially
converge to the exact result but, when the size of the perturbation h is
reduced below a value which is specific for each stencil, round-off error
becomes dominant and the resulting estimates are not reliable. The
complex-step derivative approximation, instead, converges with decreasing step
size and the estimate is practically insensitive to small perturbation sizes
h. This trend is due to the fact that, as shown in Chapter 3, the complex-step
approximation is subject to truncation error but does not suffer from the
problem of round-off error. This is a great advantage over the central
difference schemes, which are subject to both truncation and round-off errors.
The computation of the Jacobian matrix with the complex-step approximation
appears to be more accurate than the one with the central difference schemes.
Indeed, in this case, the minimum value of the error in the Jacobian matrix is
ε* = 7.4 × 10^-12 and the corresponding value of the perturbation is
h* = 7 × 10^-9. These results can be compared with the ones summarized in
Table 6.1 to assess the better accuracy of the Jacobian matrix generated with
the complex-step approximation.

Figure 6.4: Maximum error in the Jacobian Matrix for the Space Shuttle Reentry
Problem.

6.2.2 Orbit Raising Problem
The analysis is now repeated for the Orbit Raising Problem. Figure 6.5
shows, as a function of the step size h, the error in the Jacobian matrix
ε = ‖J_num − J_an‖_∞ given by the central difference schemes and the
complex-step method.
The figure shows that, here again, the central difference estimates initially
converge to the exact result but, when the size of the perturbation h is
reduced below a value which is specific for each stencil, round-off error
becomes dominant and the resulting estimates are not reliable. The
complex-step derivative approximation, instead, converges with decreasing step
size and the estimate is practically insensitive to small perturbation sizes
h.
The computation of the Jacobian matrix with the complex-step approximation
appears to be more accurate than the one with the central difference schemes.
Indeed, in this case, the minimum value of the maximum error in the Jacobian
matrix is ε* = 4.4 × 10^-16 and the corresponding value of the perturbation is
h* = 1.9 × 10^-9. These results can be compared with the ones summarized in
Table 6.2 to assess the better accuracy of the Jacobian matrix generated with
the complex-step approximation.

Figure 6.5: Maximum error in the Jacobian Matrix for the Orbit Raising Problem.

6.2.3 Hang Glider Problem
Here the analysis is repeated for the Hang Glider Problem. Figure 6.6
illustrates, as a function of the step size h, the error in the Jacobian
matrix ε = ‖J_num − J_an‖_∞ given by the central difference schemes and the
complex-step method. The figure shows that, here too, the central difference
estimates initially converge to the exact result but, when the size of the
perturbation h is reduced below a value which is specific for each stencil,
round-off error becomes dominant and the resulting estimates are not reliable.
The complex-step derivative approximation, instead, converges with decreasing
step size and the estimate is practically insensitive to small perturbation
sizes h.
The computation of the Jacobian matrix with the complex-step approximation
appears to be more accurate than the one with the central difference schemes.
Indeed, in this case, the minimum value of the maximum error in the Jacobian
matrix is ε* = 4.5 × 10^-15 and the corresponding value of the perturbation is
h* = 2.1 × 10^-9. These results can be compared with the ones summarized in
Table 6.3 to assess the better accuracy of the Jacobian matrix generated with
the complex-step approximation.

Figure 6.6: Maximum error in the Jacobian Matrix for the Hang Glider Problem.

6.3 Jacobian Matrix Generation with Dual-Step Derivative Approach
In this section the dual-step derivative approach is employed to generate
the Jacobian matrix for three different problems: the Space Shuttle Reentry
Problem, the Orbit Raising Problem and the Hang Glider Problem.

6.3.1 Space Shuttle Reentry Problem
Figure 6.7 shows, as a function of the step size h, the error in the Jacobian
matrix given by the central difference schemes, the complex-step approximation
and the dual-step approach. The error is computed using the exact analytical
Jacobian as reference: ε = ‖J_num − J_an‖_∞.
The figure illustrates that, as the step size decreases, the error decreases
according to the order of the truncation error of the method. However, after
a certain value of h, the error for the central difference approximations
begins to grow, while the error for the complex-step approximation continues
to decrease until it reaches, and remains at, a minimum value. The error of
the dual-step approach, which is not subject to truncation error or round-off
error (see Chapter 4), is around the minimum value of the error of the
complex-step, regardless of the selected step size. This means that it is
convenient to select a perturbation size h = 1, so that the division in (4.4)
is avoided when computing each term of the Jacobian matrix while, at the same
time, no accuracy is lost.
The computation of the Jacobian matrix with the dual-step approach appears to
be more accurate than the one with either the central difference schemes or
the complex-step approximation. Indeed, even if the minimum value of the error
in the Jacobian matrix, ε* = 7.4 × 10^-12, is comparable with the one obtained
with the complex-step approximation, with the dual-step approach the number of
exact derivatives in the Jacobian matrix is increased. In addition, the value
of the perturbation h* which minimizes the error can be selected equal to 1.

Figure 6.7: Maximum error in the Jacobian Matrix for the Space Shuttle Reentry
Problem.

6.3.2 Orbit Raising Problem
Figure 6.8 shows, as a function of the step size h, the error in the Jacobian
matrix ε = ‖J_num − J_an‖_∞ given by the central difference schemes, the
complex-step method and the dual-step approach.
Here again, the figure illustrates that, as the step size decreases, the error
decreases according to the order of the truncation error of the method.
However, after a certain value of h, the error for the central difference
approximations begins to grow, while the error for the complex-step
approximation continues to decrease until it reaches, and remains at, a
minimum value. The error of the dual-step approach, which is not subject to
truncation error or round-off error (see Chapter 4), is around the minimum
value of the error of the complex-step, regardless of the selected step size.
This means that, here too, it is convenient to select a perturbation size
h = 1.
The computation of the Jacobian matrix with the dual-step approach appears to
be more accurate than the one with either the central difference schemes or
the complex-step approximation. Indeed, in this case the minimum value of the
error in the Jacobian matrix is ε* = 3.3 × 10^-16. In addition, with the use
of the dual-step approach, the number of exact derivatives in the Jacobian
matrix is increased and the value of the perturbation h* which minimizes the
error can be selected equal to 1.

Figure 6.8: Maximum error in the Jacobian Matrix for the Orbit Raising Problem.

6.3.3 Hang Glider Problem
The analysis is now repeated for the Hang Glider Problem. Figure 6.9
shows, as a function of the step size h, the error in the Jacobian matrix
ε = ‖J_num − J_an‖_∞ given by the central difference schemes, the complex-step
method and the dual-step approach.
Here too, the figure illustrates that, as the step size decreases, the error
decreases according to the order of the truncation error of the method.
However, after a certain value of h, the error for the central difference
approximations begins to grow, while the error for the complex-step
approximation continues to decrease until it reaches, and remains at, a
minimum value. The error of the dual-step approach, which is not subject to
truncation error or round-off error (see Chapter 4), is around the minimum
value of the error of the complex-step, regardless of the selected step size.
This means that it is convenient to select a perturbation size h = 1, so that
the division in (4.4) is avoided when computing each term of the Jacobian
matrix while, at the same time, no accuracy is lost.
Figure 6.9: Maximum error in the Jacobian Matrix for the Hang Glider Problem.

The computation of the Jacobian matrix with the dual-step approach appears to
be more accurate than the one with either the central difference schemes or
the complex-step approximation. Indeed, even if the minimum value of the error
in the Jacobian matrix, ε* = 4.0 × 10^-15, is comparable with the one obtained
with the complex-step approximation, with the dual-step approach the number of
exact derivatives in the Jacobian matrix is increased. In addition, the value
of the perturbation h* which minimizes the error in the Jacobian matrix can be
selected equal to 1.

6.4 CPU Time Analysis

In this section we will consider the aforementioned problems and, for each of
them, we will compare the CPU time required to compute the Jacobian matrix
using the proposed numerical differentiation methods.

6.4.1 Space Shuttle Reentry Problem
Figure 6.10 shows, as a function of the step size h, the CPU time required for
the computation of the Jacobian matrix with the central difference schemes,
the complex-step approximation and the dual-step approach.
The figure illustrates that, concerning the central difference scheme, the more
complex the stencil is, the higher the time required to compute the Jacobian
matrix is, as it is reasonable to expect.

Figure 6.10: CPU Time Required for the Space Shuttle Reentry Problem.

The CPU time required to compute the Jacobian matrix with the dual-step
approach is higher than the one associated with the complex-step
approximation, because of the additional computational work associated with
the use of the dual numbers. However, the values are comparable and the
dual-step approach is preferred due to its better accuracy.
The CPU time associated with the use of either the 7-points or the 5-points
stencil central difference schemes is higher than the one required when the
dual-step approach is employed. This is caused by the complexity of the
equations which describe the problem. Indeed, in this case, the multiple
function evaluations associated with the central difference schemes are, for
this problem, the major contribution to the CPU load, and their cost is higher
than the effort associated with the use of the dual-step class.

6.4.2 Orbit Raising Problem
Figure 6.11 shows, as a function of the step size h, the CPU time required for
the computation of the Jacobian matrix with the central difference schemes,
the complex-step approximation and the dual-step approach.
The figure shows that, here again, concerning the central difference scheme,
the more complex the stencil is, the higher the time required to compute the
Jacobian matrix is.

Figure 6.11: CPU Time Required for the Orbit Raising Problem.

The CPU time required to compute the Jacobian matrix with the dual-step
approach is higher than the one associated with the use of the other
approximations. This is due to the fact that the use of the dual-step approach
requires the implementation of a new MATLAB class which allows a real-valued
function to be converted to operate on dual numbers. When the equations which
describe the dynamics of the problem are not complicated, the need to call the
MATLAB class and the additional computational work associated with the use of
the dual numbers require more CPU power than the other differentiation
methods.

6.4.3 Hang Glider Problem
Figure 6.12 shows, as a function of the step size h, the CPU time required for
the computation of the Jacobian matrix with the central difference schemes,
the complex-step approximation and the dual-step approach.
The figure illustrates that, here too, concerning the central difference scheme,
the more complex the stencil is, the higher the time required to compute the
Jacobian matrix is.
Here again, the CPU time required to compute the Jacobian matrix with the
dual-step approach is higher than the one associated with the use of the other
methods. Indeed, also in this case, the need to call the new MATLAB class and
the additional computational work associated with the use of the dual numbers
require more CPU time than the implementation of either the central difference
schemes or the complex-step approximation.

Figure 6.12: CPU Time Required for the Hang Glider Problem.

6.5 Analysis of CPU Time vs. Increasing Size of the Problem
In this section a comparison between the CPU time required to compute the
Jacobian matrix is performed varying the size of the problem. In particular,
in the following figures, for each of the three problems considered, the CPU
time is calculated increasing the number of the nodes n.
Figures 6.13, 6.14 and 6.15 show that, for each of the three problems, if n
increases from 20 up to 120, the CPU time increases as well, as expected.
The CPU time has been calculated selecting, for each of the five methods
applied to compute the Jacobian matrix, the corresponding value of the step
size h* which minimizes the error. In particular, for the dual-step approach
a value of the step size equal to one has been chosen.
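A rough sketch of how such a timing comparison can be set up is given below; the build_jacobian callable, the node range and the number of repetitions are assumptions made only to illustrate the measurement, not the actual SPARTAN timing code.

```python
import time

def time_jacobian(build_jacobian, nodes_list, repeats=5):
    """Average wall-clock time of a Jacobian-construction routine vs. number of nodes n."""
    timings = []
    for n in nodes_list:
        t0 = time.perf_counter()
        for _ in range(repeats):
            build_jacobian(n)     # assumed to assemble the full [ns*n] x [n*(1+ns+nc)] matrix
        timings.append((n, (time.perf_counter() - t0) / repeats))
    return timings

# usage sketch: times = time_jacobian(my_dual_step_builder, range(20, 121, 20))
```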

Figure 6.13: CPU Time Required for the Space Shuttle Reentry Problem.

Figure 6.14: CPU Time Required for the Orbit Raising Problem.

Figure 6.15: CPU Time Required for the Hang Glider Problem.


6.6 Conclusions
In this chapter the Jacobian matrices for three different problems (the Space
Shuttle Reentry Problem, the Orbit Raising Problem and the Hang Glider
Problem) have been generated using different numerical differentiations. The
results are compared in terms of accuracy and CPU time.
The dual-step approach has proved to be the most accurate differentiation
method for the computation of the Jacobian matrix. Each term of the Jaco-
bian matrix calculated using this approach is subject neither to truncation
error, nor to round-off error. In addition, using the dual-step approach there
is no need to make the step size small because the best accuracy is achieved
regardless of the selected step size and the simplest choice is h = 1 in order to
eliminate the need to divide by the step size. This is an advantage over the
use of the central difference schemes and the complex-step approximation.
Indeed, the use of either the central difference schemes or the complex-step
approximation has proved to be less accurate and, in addition, their accuracy
is strongly influenced by the selection of the optimal step size. The optimal
step size for these methods is not known a priori, and finding it requires a
trade-off between the truncation and round-off errors, as well as a substantial
effort and some knowledge of the analytical derivative.
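As an illustration of this difference, the following MATLAB sketch computes one
column of a Jacobian with the dual-step at h = 1 and with a 3-points stencil
central difference. It is only a minimal sketch: it assumes the Dual class of
Appendix A is on the path and that it overloads the elementwise operations and
elementary functions used here; the test function f and the point x0 are purely
illustrative.

f  = @(x) [x(1).*sin(x(2)); x(1).*x(1) + exp(x(2))];   % illustrative test function
x0 = [1.2; 0.7];                                       % nominal point
j  = 2;                                                % variable to differentiate

% Dual-step: seed the j-th variable with dual part 1 (step size h = 1)
xd       = Dual(x0, zeros(size(x0)));
xd(j)    = Dual(x0(j), 1);
col_dual = getderiv(f(xd));          % free of truncation and round-off error

% 3-points stencil central difference: accuracy depends on the choice of h
h  = 1e-6;
e  = zeros(size(x0));  e(j) = 1;
col_cd3 = (f(x0 + h*e) - f(x0 - h*e)) / (2*h);

disp([col_dual, col_cd3])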
In terms of CPU time, the time required for the computation of the Jacobian
matrix with the dual-step approach is not the smallest one and it depends
on the non-linearity and complexity of the equations which describe the dy-
namics of the problem. Indeed, the use of the dual-step approach requires
additional computational work associated to the use of the dual numbers
which are implemented in MATLAB as a new class of numbers.
The CPU time is also influenced by the number of the nodes and, for each of
the three problems analysed, an increase of the number of the nodes causes
an increase of the CPU time as well, as expected.
To conclude, the trade-off between accuracy and CPU power suggests that
the dual-step differentiation is a valid alternative to the other well-known nu-
merical differentiation methods, in case the problem is analytically defined
(i.e., no look-up tables are part of the data).

Chapter 7
Use of the Advanced
Differentiation Schemes for
Optimal Control Problems

Overview
In this chapter the differentiation schemes defined in the previous chapters
are used to solve optimal control problems.
In the first section we formulate a general optimal control problem and we fo-
cus our attention on the numerical approaches which can be used for solving
it. In the next sections SPARTAN, an algorithm developed by DLR based on
the use of the Flipped Radau Pseudospectral Method, is presented focusing
on the computation of the Jacobian matrix associated to the NLP problem.
Three numerical examples of optimal control problem are studied: the max-
imization of the final crossrange in the space shuttle reentry trajectory, the
maximization of the final specific energy in an orbit raising problem, and the
maximization of the final range of a hang glider in presence of a specific up-
draft. Each of the three examples is solved using five different differentiation
schemes (the 3-points stencil central difference scheme, the 5-points sten-
cil central difference scheme, the 7-points stencil central difference scheme,
the complex-step approach and the dual-step method), and two different off-
the-shelf, well-known NLP solvers (SNOPT and IPOPT). The results are
compared in terms of accuracy and CPU time in order to realize the main
advantages and drawbacks related to the use of the pseudospectral methods
in combination with the dual numbers, and with the other differentiation
schemes.


7.1 General Formulation of an Optimal Control Problem
Optimal control is a subject where it is desired to determine the inputs to
a dynamical system that optimize (i.e., minimize or maximize) a specific
performance index while satisfying any constraints on the motion. Indeed,
the term optimal control problem refers to a problem where the inputs to
the system are themselves functions or static parameters and it is desired to
determine the particular input function and trajectory that optimize a given
performance index or objective function.
An optimal control problem is posed formally as follows, [8]. Determine the
state (equivalently, the trajectory or path), x(t) ∈ R^n, the control u(t) ∈ R^m,
the vector of static parameters p ∈ R^q, the initial time, t_0 ∈ R, and the
terminal time, t_f ∈ R (where t ∈ [t_0, t_f] is the independent variable) that
optimizes the cost function

J[x(t), u(t), t; p] (7.1)

subject to the dynamic constraints (i.e., differential equation constraints),

\dot{x}(t) = f[x(t), u(t), t; p],    (7.2)

the path constraints

C_min ≤ C[x(t), u(t), t; p] ≤ C_max,    (7.3)

and the boundary conditions

φ_min ≤ φ[x(t), u(t), t; p] ≤ φ_max.    (7.4)

The objective function (7.1), in a Bolza formulation of the OCP, can be
expressed as follows

J = Φ[x(t_0), t_0, x(t_f), t_f; p] + \int_{t_0}^{t_f} L[x(t), u(t), t; p] dt    (7.5)

where Φ is called the Mayer term and L is called the Lagrange integrand.
The differential equations (7.2) describe the dynamics of the system while
the objective (7.1) is the performance index which can be considered as a
measure of the quality of the trajectory. When it is desired to minimize
the performance index, a lower value of J is preferred; conversely, when it is
desired to maximize the performance index, a higher value of J is preferred.
With the exception of simple problems (i.e., some special weakly nonlinear
low-dimensional systems), optimal control problems must be solved numerically.
The need for solving optimal control problems numerically has given
rise to a wide range of numerical approaches. These numerical approaches
are divided into two major categories: indirect methods and direct methods,
[8].
The indirect methods are based on the calculus of variations which is used to
determine the first-order optimality conditions of the original optimal control
problem given in equations (7.1)-(7.4). Unlike ordinary calculus (where the
objective is to determine points that optimize a function), the calculus of
variations is the subject of determining functions that optimize a functional.
A functional is a function from a vector space into its underlying scalar field
and, commonly, the vector space is a space of functions, thus the functional
is sometimes considered as a function of a function. The indirect approach
leads to a multiple-point boundary-value problem that is solved to deter-
mine candidate optimal trajectories called extremals. Each of the computed
extremals is then examined to see if it is a local minimum, maximum, or
a saddle point. Of the locally optimizing solutions, the particular extremal
with the lowest cost is chosen. So, the indirect approach solves the problem
indirectly by converting the optimal control problem to a boundary-value
problem and, as a result, the optimal solution is found by solving a system
of differential equations that satisfies endpoint and/or interior point condi-
tions.
Direct methods are fundamentally different from indirect methods. In a
direct method, the state and/or control of the optimal control problem is
discretized in some manner and the problem is transcribed to a nonlinear
optimization problem or nonlinear programming problem (NLP). Indeed, a
NLP problem is characterized by a finite set of state and control variables,
while optimal control problems can involve continuous functions. Therefore,
it is convenient to view the optimal control problem as an infinite-dimensional
extension of an NLP problem.
The general nonlinear programming problem can be stated as follows [1]:
Find the n-vector x^T = (x_1, . . . , x_n) to minimize the scalar objective func-
tion
F (x) (7.6)
subject to the m constraints

c_L ≤ c(x) ≤ c_U    (7.7)

(equality constraints can be imposed by setting c_L = c_U) and the simple
bounds

x_L ≤ x ≤ x_U.    (7.8)


Once the optimal control problem is transcribed to a NLP problem, the NLP
will be solved using well known optimization techniques, [1]. In conclusion,
in a direct method the optimal solution is found by transcribing the infinite-
dimensional (continuous) optimization problem to a finite-dimensional op-
timization problem. In particular, one of the most promising techniques is
represented by direct collocation methods and, among these, pseudospectral
methods are gaining popularity for their straightforward implementation and
some interesting properties which are associated to their use [5].

7.2 SPARTAN
SPARTAN (Shefex-3 Pseudospectral Algorithm for Reentry Trajectory ANal-
ysis) is an optimal control package developed by the DLR. It has already been
used in literature [11, 12], and it is the reference tool for the development
of the entry guidance for the SHEFEX-3 (Sharp Edge Flight Experiment)
mission and has been validated with several well-known literature examples.
SPARTAN implements the global Flipped Radau Pseudospectral Methods
(FRPMs) to solve constrained or unconstrained optimal control problems,
which can have a fixed or variable final time. It belongs to the class of direct
methods and has a highly exploited Jacobian structure, as well as routines
for automatic linear/nonlinear scaling and auto-validation using the Runge-
Kutta 45 scheme.
The basic idea of the FRPM is, as in the other direct methods, to collocate
the differential equations, the cost function and the constraints in a finite
number of points in order to treat them as a set of nonlinear algebraic equa-
tions. In this way, the continuous OCP is reduced to a discrete NLP problem
which can be solved with one of the well-known available software packages,
e.g. SNOPT, IPOPT.
In details, considering the structure of the classical Bolza Optimal Control
Problem (7.1)-(7.5) we want to solve, SPARTAN proposes a transcription of
the OCP as NLP based on the choice of some trial functions to represent
the continuous variables

x(t_i) ≈ X_i,   i ∈ [0, N]    (7.9)
u(t_j) ≈ U_j,   j ∈ [1, N].    (7.10)


In other words, the continuous states and controls can be substituted with
polynomials which interpolate the values in the nodes
x(t) ≈ \sum_{i=0}^{N} X_i P_i(t)    (7.11)

u(t) ≈ \sum_{i=1}^{N} U_i P_i(t)    (7.12)

where

P_i(t) = \prod_{j=0, j≠i}^{N} \frac{t − t_j}{t_i − t_j}    (7.13)

and t_j are the roots of linear combinations of the Legendre polynomials (Figure
7.1) P_n(t) and P_{n−1}(t).
The difference in the indexing in (7.9) and (7.10) and in their discrete repre-
sentations is due to the distinction between discretization and collocation.
While the discretization includes (in the FRPM) the initial point, the collo-
cation does not. Hence, the controls will be approximated with a polynomial
having a lower order and the NLP problem will not provide the initial values
for the controls. These can be, in some cases, part of the initial set of known
inputs, otherwise they can be extrapolated from the generated polynomial
interpolating the N values of controls in the collocation nodes, [5].
In this way the entire information related to the states and the controls is
enclosed in their nodal values. Of course, the boundaries valid for the contin-
uous form will also be applied to the discrete representation of the functions.
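As a small illustration of (7.11)-(7.13), the following MATLAB sketch reconstructs
a continuous signal from its nodal values through the Lagrange basis. The nodes
and nodal values used here are simple placeholders chosen only for the example;
SPARTAN employs the flipped Radau points instead.

tj = [0; 0.2; 0.5; 0.8; 1.0];          % discretization nodes (illustrative only)
Xj = sin(pi*tj);                        % nodal values X_i (illustrative only)
t  = linspace(0, 1, 201);               % evaluation times
N  = numel(tj) - 1;
x  = zeros(size(t));
for i = 0:N
    Pi = ones(size(t));                 % Lagrange basis P_i(t), eq. (7.13)
    for j = 0:N
        if j ~= i
            Pi = Pi .* (t - tj(j+1)) / (tj(i+1) - tj(j+1));
        end
    end
    x = x + Xj(i+1) * Pi;               % interpolant, eq. (7.11)
end
plot(t, x, tj, Xj, 'o'); grid on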
In particular, it has been shown that for SPARTAN, as well as for all the
pseudospectral methods, the following properties are valid:
- Spectral convergence in the case of smooth problems;
- The Runge phenomenon is avoided;
- Sparse structure of the associated NLP problem;
- Differential equations become algebraic constraints evaluated in the
  collocation points.
In the next section we will focus our attention on the structure of Jacobian
associated to the NLP problem deriving from the transcription implemented
by SPARTAN. Indeed, experience shows that, while for simple systems a
more detailed analysis of Jacobian can be avoided, in complex problems
like atmospheric reentry a solid knowledge of its structure is very helpful
and significantly increases the speed of computation and in some cases the
quality of the results.


Figure 7.1: Legendre Polynomials of order 5.

7.3 Hybrid Jacobian Computation


In the most general case, considering n_s states, n_c controls, n_g constraints,
n collocation points and unknown final time, the Jacobian associated to the
transcription of an autonomous system of equations will be expressed as a
matrix having the following dimension, [5]:

dim(J) = [n · (n_s + n_g) + 1] × [(n + 1) · n_s + n · n_c + 1].    (7.14)
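As a purely illustrative instance of (7.14), taking n_s = 6 states, n_c = 2 controls
and n_g = 1 path constraint with n = 100 collocation points (the size of the Space
Shuttle reentry case treated later), the Jacobian has dimension
[100 · (6 + 1) + 1] × [101 · 6 + 100 · 2 + 1] = 701 × 807, most of whose entries are
structurally zero.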
SPARTAN exploits the general structure of the Jacobian associated to the
NLP problem deriving from the application of the FRPM. Indeed, in order
to take full advantage from the intrinsic sparsity associated to the use of
pseudospectral methods and from the theoretical knowledge contained in the
definition of the discrete operator D, the Jacobian is expressed as sum of
three different contributions
J = J_PseudoSpectral + J_Numerical + J_Theoretical.    (7.15)
For a deeper analysis on the structure of each term of the Jacobian see [5].
The PseudoSpectral and the Theoretical terms are exact. So far, the Nu-
merical Jacobian has been computed in SPARTAN using the complex-step
derivative method, so it is not exact because the complex-step derivative
approach is affected by truncation errors. Now, by means of the use of the
dual-step derivative method it is possible to obtain an exact Numerical Jacobian
as well, because the dual-step differentiation scheme is subject neither to truncation
error, nor to round-off error, as demonstrated in the previous chapters.
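A conceptual MATLAB sketch of the assembly in (7.15) is reported below. It is
not SPARTAN's actual code: the sizes, the constraint function g and the two
exact contributions are placeholders invented for the example. The point is only
that the numerical contribution is the sole term requiring a differentiation
scheme, and that seeding dual numbers column by column (h = 1) makes it exact
as well (assuming, as above, that the Dual class overloads the operations used).

n  = 3;                                   % number of NLP variables (illustrative)
m  = 2;                                   % number of constraints (illustrative)
g  = @(z) [z(1).*z(2) + sin(z(3)); exp(z(1)) - z(3)];   % illustrative constraints
z0 = [0.5; -1.0; 2.0];                    % current NLP iterate
J_ps = sparse(m, n);                      % exact pseudospectral contribution (given)
J_th = sparse(m, n);                      % exact theoretical contribution (given)
J_num = sparse(m, n);                     % numerical contribution via dual-step
for k = 1:n
    zd      = Dual(z0, zeros(size(z0)));
    zd(k)   = Dual(z0(k), 1);             % seed k-th column, step size h = 1
    J_num(:,k) = getderiv(g(zd));
end
J = J_ps + J_num + J_th;                  % hybrid Jacobian, eq. (7.15)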


7.4 Numerical Example


7.4.1 Space Shuttle Reentry Problem
The construction of reentry trajectory for the space shuttle is a classic ex-
ample of an OCP. The motion of the vehicle is described by the set of dif-
ferential algebraic equations defined in Chapter 5.2.1, equations (4.5)-(4.10).
The reentry trajectory begins at an altitude where the aerodynamic forces
are quite small with the following initial conditions:
h_0 = 260000 ft,   v_0 = 25600 ft/s,
φ_0 = 0°,   γ_0 = −1°,
θ_0 = 0°,   ψ_0 = 90°.

The final point on the reentry trajectory occurs at the unknown final time
t_f. The goal is to choose the control variables α(t) and β(t) so that the final
cross-range is maximized, which is equivalent to maximizing the final latitude
θ(t_f). So, the cost function can be defined as follows:

J = θ(t_f).    (7.16)
In this case the Jacobian has all the three components and its pattern is
shown in [5]. The OCP has been implemented and solved with SPARTAN
using different differentiation schemes: the 3-points stencil central differ-
ence scheme, the 5-points stencil central difference scheme, the complex-step
derivative approach and the dual-step derivative method. Furthermore, the
solution is computed with an upper bound on the aerodynamic heating of
70 BTU/ft²/s.
Figures 7.2, 7.3 and 7.4 illustrate the time histories of the states, the controls
and the constraints which are associated to the solution obtained using the
dual-step derivative method and a number of nodes equal to 100. Figure
7.5 shows the discrepancy between SPARTAN and propagated (using the
Runge-Kutta 45 scheme) solutions.
Since this example is taken from [1], we can compare the results. In Figure
7.6 the time histories for the states and the controls are shown as a solid
line for the unconstrained solution and as a dotted line for the constrained
solution (the one implemented and solved by SPARTAN). The comparison
of the Figures 7.2-7.4 and 7.6 shows the consistency of the results.
Tables 7.1 and 7.2 summarize the results obtained with SPARTAN using the
five different differentiation schemes. In the first table SNOPT (Sparse Non-
linear OPTimizer) is used to solve the NLP problem, instead in the second
table IPOPT (Interior Point OPTimizer) is employed in SPARTAN.


Figure 7.2: States Evolution for the Space Shuttle Reentry Problem.

Figure 7.3: Controls Evolution for the Space Shuttle Reentry Problem.


Figure 7.4: Heat Rate Evolution for the Space Shuttle Reentry Problem.

Figure 7.5: Discrepancy between optimized and propagated solutions for the Space
Shuttle Reentry Problem.


Figure 7.6: Shuttle reentry - state and control variables, [1]

Figure 7.7: Space Shuttle Reentry Problem - Groundtrack of Trajectory Optimiz-


ing Final Crossrange.


SNOPT        |        100 Nodes         |        200 Nodes         |        300 Nodes
             | Mean  Max   Iter. CPU    | Mean  Max   Iter. CPU    | Mean   Max   Iter. CPU
             | Error Error       (sec)  | Error Error       (sec)  | Error  Error       (sec)
CD3          | 8.647 60.47  13   21.38  | 1.367 13.23   2   62.18  | 0.4963 4.378   3   248.29
CD5          | 8.558 60.35  13   16.21  | 1.363 13.44  10   83.79  | 0.5106 4.380   3   267.60
CD7          | 8.571 60.45  13   17.39  | 1.357 13.28  10   82.27  | 0.5171 4.408   3   268.38
Compl.-Step  | 8.612 60.44  13   15.20  | 1.383 13.15   3   63.40  | 0.5337 4.441   3   249.23
Dual-Step    | 8.533 60.45  13   15.37  | 1.348 13.05  10   80.87  | 0.4902 4.350   3   270.06

Table 7.1: Accuracy and CPU Time comparison for the Space Shuttle Problem
(SNOPT).

IPOPT        |        100 Nodes          |         200 Nodes             |         300 Nodes
             | Mean  Max   Iter. CPU     | Mean  Max   Iter.  CPU        | Mean   Max   Iter.  CPU
             | Error Error       (sec)   | Error Error        (sec)      | Error  Error        (sec)
CD3          | 6.287 62.78  483  472.1   | 7.402 26.52  3094  1.2·10^4   | 13.917 48.33  1706  2.8·10^4
CD5          | 6.306 62.80  767  675.8   | 7.547 27.01  6635  2.7·10^4   | 13.911 48.32  1168  1.9·10^4
CD7          | 6.280 62.87  687  628.85  | 6.867 24.63  1971  8.1·10^3   | 13.920 48.31  1448  2.5·10^4
Compl.-Step  | 6.526 62.84  439  313.6   | 6.983 25.24  3044  1.2·10^4   | 13.910 48.33  4177  7.2·10^4
Dual-Step    | 6.285 62.76  574  518.7   | 6.932 21.93  3493  1.3·10^4   | 13.857 47.14  2425  4.1·10^4

Table 7.2: Accuracy and CPU Time comparison for the Space Shuttle Problem
(IPOPT).

Figure 7.8 shows the trend of the mean error (in logarithmic scale) between
the solutions computed with SPARTAN and the propagated solutions as a
function of the number of the nodes. For each of the five differentiation
schemes we observe a spectral (exponential) convergence of the solution. The
dual-step curve appears to be much smoother than the complex-step one.


Figure 7.8: Spectral Convergence for the Space Shuttle Reentry Problem.


7.4.2 Orbit Raising Problem


This problem is taken from [5, 9] and deals with the maximization of the
total specific energy of a low-thrust spacecraft orbit transfer, in a given fixed
time. The orbit is subject to the dynamics expressed by the equations defined
in Chapter 5.2.2. The goal is to maximize the total specific energy at the
final time, considering that the rocket engine provides a constant thrust
acceleration to the spacecraft. Thus, the cost function can be defined as
follows:

J = −\frac{1}{r(t_f)} + \frac{1}{2}\left(V_r^2(t_f) + V_t^2(t_f)\right).    (7.17)
Since the final time is known, the Jacobian here will only consist of the
pseudospectral and numerical contributions [5].
Figures 7.9, 7.10 and 7.12 illustrate states, controls and the discrepancy
between optimized and propagated solutions. These results are obtained
using the dual-step derivative approach, with a number of nodes equal to
110. Figure 7.11 shows the trajectory optimizing the final orbit energy. As
can be seen, the optimal trajectory is a multi-revolution spiral away from the
attracting body [9], which has its center of mass located at the origin.
Since this example is solved in [9], we can compare the results in the Figures
7.9, 7.10 and 7.13.

Figure 7.9: States Evolution for the Orbit Raising Problem.


Figure 7.10: Control Evolution for the Orbit Raising Problem.

Figure 7.11: Orbit Raising Problem - Trajectory Optimizing Final Orbit Energy
(LU=Unitary Length).


Figure 7.12: Discrepancy between optimized and propagated solutions for the Or-
bit Raising Problem.

Figure 7.13: Orbit Raising - state and control variables, [9]


Tables 7.3 and 7.4 summarize the results obtained with SPARTAN using five
different differentiation schemes, and two different NLP solvers.

SNOPT        |          100 Nodes            |          200 Nodes            |          300 Nodes
             | Mean    Max     Iter. CPU     | Mean    Max     Iter. CPU     | Mean    Max     Iter. CPU
             | Error   Error         (sec)   | Error   Error         (sec)   | Error   Error         (sec)
             | (10^-5) (10^-4)               | (10^-5) (10^-4)               | (10^-5) (10^-4)
CD3          | 1.028   3.392    35   12.94   | 5.081   3.233    28   71.54   | 3.283   3.621    10   18.88
CD5          | 1.481   3.333    35   12.11   | 1.791   1.131    28   64.99   | 1.306   3.401    10   16.15
CD7          | 1.481   3.334    35   12.16   | 1.791   1.314    28   64.71   | 1.307   3.406    10   16.36
Compl.-Step  | 1.482   3.333    35   12.19   | 1.792   1.132    28   62.70   | 1.307   3.406    10   15.76
Dual-Step    | 1.480   3.333    35   15.37   | 1.791   1.132    28   67.29   | 1.021   3.422    10   21.65

Table 7.3: Accuracy and CPU Time comparison for the Orbit Raising Problem
(SNOPT).

IPOPT        |         100 Nodes           |         200 Nodes           |         300 Nodes
             | Mean   Max    Iter. CPU     | Mean   Max    Iter. CPU     | Mean   Max    Iter. CPU
             | Error  Error        (sec)   | Error  Error        (sec)   | Error  Error        (sec)
CD3          | 0.0019 0.0118   87  47.83   | 0.0037 0.0225   90  333.5   | 0.0054 0.0330  184  1.61·10^3
CD5          | 0.0019 0.0118   72  40.65   | 0.0037 0.0225   87  317.7   | 0.0054 0.0330  103  934.9
CD7          | 0.0019 0.0118   86  46.99   | 0.0037 0.0225  108  388.8   | 0.0054 0.0330  126  1.09·10^3
Compl.-Step  | 0.0019 0.0118  110  58.13   | 0.0037 0.0225  101  360.8   | 0.0054 0.0330  125  1.13·10^3
Dual-Step    | 0.0019 0.0118   75  42.0    | 0.0037 0.0225  114  394.4   | 0.0054 0.0330  175  1.56·10^3

Table 7.4: Accuracy and CPU Time comparison for the Orbit Raising Problem
(IPOPT).

Figure 7.14 shows the trend of the mean error (in logarithmic scale) between
the solutions computed with SPARTAN and the propagated solutions as
function of the number of the nodes.


Figure 7.14: Spectral Convergence for the Orbit Raising Problem.


7.4.3 Hang Glider Problem


This problem is taken from [1]; it describes the optimal control of a hang
glider in the presence of a specified thermal updraft.
The state equations which describe the planar motion for the hang glider are
the ones defined in Chapter 5.2.3. The final time tf is free and the final range
x(tf ) has to be maximized. Therefore, the cost function can be defined as
follows
J = x(tf ). (7.18)
The lift coefficient is bounded

0 ≤ c_L ≤ 1.4    (7.19)

and the following boundary conditions are imposed:

x(0) = 0 m,                        x(t_f): free,
y(0) = 1000 m,                     y(t_f) = 900 m,
v_x(0) = 13.227567 m/s,            v_x(t_f) = 13.227567 m/s,
v_y(0) = −1.28750052 m/s,          v_y(t_f) = −1.28750052 m/s.

Figures 7.15, 7.16 and 7.17 illustrate states, controls and the discrepancy be-
tween optimized and propagated solutions. These results are obtained using
the dual-step derivative method, with a number of nodes equal to 100.
Since this example is solved in [1], we can compare the results. The compar-
ison of the Figures 7.15, 7.16 and 7.18 shows the consistency of the results.
In more detail, in Figure 7.18, the dashed line is the initial guess which has
been computed using linear interpolation between the boundary conditions,
with x(tf ) = 1250, and cL (0) = cL (tf ) = 1.
Tables 7.5 and 7.6 summarize the results obtained with SPARTAN using five
different differentiation schemes. In the first table SNOPT is used to solve
the NLP problem, while in the second table IPOPT is employed in SPAR-
TAN as NLP solver.
Furthermore, Figure 7.19 shows the trend of the mean error (in logarithmic
scale) between the optimized and propagated solutions as function of the
number of the nodes. For each of the five differentiation schemes we have a
spectral (exponential) convergence of the solution, as expected from the use
of pseudospectral methods.


Figure 7.15: States Evolution for the Hang Glider Problem.

Figure 7.16: Control Evolution for the Hang Glider Problem.


Figure 7.17: Discrepancy between optimized and propagated solutions for the
Hang Glider Problem.

Figure 7.18: Hang Glider - state and control variables, [1]


SNOPT        |         100 Nodes           |         200 Nodes           |          300 Nodes
             | Mean   Max    Iter. CPU     | Mean   Max    Iter. CPU     | Mean    Max    Iter. CPU
             | Error  Error        (sec)   | Error  Error        (sec)   | Error   Error        (sec)
             |                             |                             | (10^-4)
CD3          | 0.0068 0.0821   28  22.65   | 0.0010 0.0524   34  70.59   | 5.267   0.0102   23  279.87
CD5          | 0.0068 0.0816   29  21.87   | 0.0010 0.0526   34  85.61   | 5.127   0.0095   29  266.83
CD7          | 0.0068 0.0812   26  21.68   | 0.0012 0.0518   23  58.71   | 5.862   0.0103   29  257.05
Compl.-Step  | 0.0068 0.0823   26  20.64   | 0.0010 0.0524   31  63.87   | 4.519   0.0100   29  271.11
Dual-Step    | 0.0068 0.0823   26  20.57   | 0.0010 0.0524   31  65.20   | 4.519   0.0100   29  262.21

Table 7.5: Accuracy and CPU Time comparison for the Hang Glider Problem
(SNOPT).

IPOPT        |         100 Nodes           |         200 Nodes           |         300 Nodes
             | Mean   Max    Iter. CPU     | Mean   Max    Iter. CPU     | Mean   Max    Iter. CPU
             | Error  Error        (sec)   | Error  Error        (sec)   | Error  Error        (sec)
CD3          | 0.0070 0.0817   80  42.32   | 0.0015 0.0521  133  344.5   | 0.0011 0.0099  136  1.59·10^3
CD5          | 0.0070 0.0817   89  39.65   | 0.0015 0.0521  115  332.6   | 0.0011 0.0099  125  1.65·10^3
CD7          | 0.0070 0.0817   89  38.14   | 0.0015 0.0521   91  359.23  | 0.0011 0.0099  144  1.32·10^3
Compl.-Step  | 0.0070 0.0817   89  40.42   | 0.0015 0.0521  116  343.6   | 0.0011 0.0099  178  1.31·10^3
Dual-Step    | 0.0070 0.0817   89  38.47   | 0.0015 0.0521  117  338.1   | 0.0011 0.0099  137  578.36

Table 7.6: Accuracy and CPU Time comparison for the Hang Glider Problem
(IPOPT).


Figure 7.19: Spectral Convergence for the Hang Glider Problem.


7.5 Conclusions
In this chapter the general formulation of an optimal control problem, and
the main numerical approaches used to solve it have been briefly summa-
rized. We have focused on SPARTAN and on the structure of the Jacobian
matrix which describes the discrete, transcribed OCP, that is, the resulting
NLP. Therefore, the advanced differentiation schemes defined in the previous
chapters have been implemented in SPARTAN in order to solve three differ-
ent examples of optimal control problem. Each of the three OCPs has been
solved using two different NLP solvers (SNOPT and IPOPT).
The results in terms of accuracy and CPU time show that the effects of the
use of the dual-step derivative method, as well as of the other schemes, in
combination with the pseudospectral methods are strongly influenced by the
nonlinear behaviour of the equations which describe the problem, by the
number of the nodes used to discretize the problem under analysis, and by
the NLP solver which has been selected.
Overall, among the NLP solvers, SNOPT has been demonstrated to provide
better accuracy than IPOPT and, in addition, the computational power re-
quired for the greater accuracy is far less than the one required by IPOPT.
Furthermore, considering SNOPT, the dual-step method provides better ac-
curacy than the other differentiation schemes as the number of the nodes
increases, and the improvements of the accuracy are paid in terms of an
increasing CPU time. On the other hand, considering IPOPT, there are
no significant differences in terms of accuracy when different differentiation
schemes are implemented in SPARTAN, but these schemes have different
effects in terms of CPU time. In some cases, the dual-step method in combi-
nation with IPOPT provides a significant save in CPU time which leads to
faster optimization.
In addition, the trend of the mean error between SPARTAN and propagated
(using the Runge-Kutta 45 scheme) solutions as function of the number of
nodes has been analysed for each of the three problems in order to verify the
spectral convergence of the solution.
In conclusion, it is not possible to define a priori the most convenient dif-
ferentiation methods to be implemented in SPARTAN. Therefore a trade-off
between the desired quality of the results (e.g., hypersensitive problems), and
the CPU time (e.g., trajectory database generation) can be found according
to the specific number of nodes, to the NLP solver used and to the behaviour
of the equations which describe the problem under analysis. However, the
dual-step method has been demonstrated to be a valid alternative to the
other traditional, well-known differentiation schemes, and it is worth being
considered as valid method to solve the OCP with SPARTAN.

Chapter 8
Further Tools: Robust
Differentiation via Sliding
Mode Technique

Overview
In the previous chapters, different differentiation schemes have been anal-
ysed in order to define the method that provides the best accuracy in the
computation of the derivatives and the Jacobian matrix. This analysis is of
great importance considering that gradient methods for solving NLPs require
the computation of the derivatives of the objective function and constraints.
Therefore, the accuracy of the derivatives has a strong impact on the com-
putational efficiency and reliability of the solution.
In this chapter we will deal with the problem of the differentiation of signals
given in real time with the aim to design a robust differentiator.
Real-time differentiation is an old and well-studied problem, and the main
difficulty is the obvious sensitivity of differentiation to input noises. Given an
input signal consisting of a noise and an unknown base signal, the goal of a
robust differentiator is to find real-time robust estimations of the derivatives
of the base signal which are exact in the absence of measurement noise.
Combining differentiation exactness with robustness with respect to possible
measurement errors and input noises is a challenging task, and one particular
approach to robust differentiator design is the so-called robust differentiation
via sliding mode technique.
In the first section the main concepts of the theory of sliding mode control
are briefly summarized in order to underline the potential advantages of using
discontinuous, switching control laws. Furthermore, the use of sliding
surfaces for control design is examined at a tutorial level.
In the second section we focus our attention on the design of a robust differ-
entiator based on sliding mode technique. Two sliding mode robust differen-
tiators are examined on tutorial examples with simulation plots.

8.1 Sliding Mode Technique


8.1.1 Theory of Sliding Mode Control
In the formulation of any control problem, there will typically be discrepan-
cies between the actual plant and the mathematical model adopted to design
the controller. These discrepancies usually derive from plant parameters and
unmodelled dynamics. Moreover, known or unknown external disturbances
can deteriorate the performances of the system. The need to design control
systems which are able to provide the required performance levels in a real
environment despite the presence of these plant mismatches has led to an
intense interest in the development of the so-called robust control methods.
The goal of robust design is to retain assurance of system performance in
spite of model inaccuracies and disturbances. Indeed, when we design a
control system, the mathematical model, based on numerous assumptions,
incorporates two important problems that are often encountered: a distur-
bance signal which is added to the control input to the plant, and noise that
is added to the sensor output. A robust control system exhibits the desired
performances despite these plant uncertainties.
One particular approach to robust controller design is the so-called sliding
mode control technique. Let us briefly summarize the basic idea of a sliding
mode.
Sliding mode control is a particular type of variable structure control system
(VSC). These systems are characterized by a set of feedback control laws and
a decision rule, known as switching function.
Consider the dynamic system, [6]:

x^{(n)} = f(x, t) + b(x, t) u(t) + d(t)    (8.1)

where:

- u(t) is the scalar control input;

- x is the scalar output of interest;

- x = [x, \dot{x}, . . . , x^{(n−1)}]^T is the state;

- f(x, t) is not exactly known; it is assumed to be a continuous function of x,
  in general nonlinear, but the extent of the imprecision on f(x, t) is upper
  bounded by a known continuous function of x and t;

- b(x, t) is a control gain, not exactly known, but of known sign, bounded by
  known continuous functions of x and t, and assumed to be a continuous
  function of x;

- the disturbance d(t) is unknown but bounded in absolute value by a
  known continuous function of time.

The control problem is to get the state x to track a specific state x_ref =
[x_d, \dot{x}_d, . . . , x_d^{(n−1)}]^T in the presence of model imprecision on f(x, t) and
b(x, t) and of disturbances d(t). Defining the tracking error vector
\tilde{x} := x − x_ref = [\tilde{x}, \dot{\tilde{x}}, . . . , \tilde{x}^{(n−1)}]^T, we must assume

\tilde{x}|_{t=0} = 0    (8.2)

in order to guarantee that the control problem provides the aforementioned
performances using a finite control u.
It is possible to define a time-varying sliding surface s(t) in the state-space
R^n as s(\tilde{x}; t) = 0 with

s(\tilde{x}; t) := \left( \frac{d}{dt} + λ \right)^{n−1} \tilde{x}    (8.3)

where λ is a positive constant. For instance, if n = 2 we have s = \dot{\tilde{x}} + λ\tilde{x}.


Given initial condition (8.2), the problem of tracking x ≡ x_ref is equivalent to
that of remaining on the surface s(t) for all t > 0. Indeed, s ≡ 0 represents
a linear differential equation whose unique solution is \tilde{x} ≡ 0, given initial
condition (8.2). Therefore, the problem of reaching the n-dimensional vector
x_ref can be reduced to the problem of driving the scalar sliding variable s to
zero. This is an advantage because, comparing equations (8.1) and (8.3),
we obtain a reduced-order compensated dynamics.
A sufficient condition for such a positive invariance of s(t) is to choose the
control law u so that outside of s(t)

\frac{1}{2} \frac{d}{dt} s^2(\tilde{x}; t) ≤ −η |s|    (8.4)

where η is a positive constant. Relation (8.4) is equivalent to

s \dot{s} ≤ −η |s| = −η s · sign(s).    (8.5)


Indeed, the sign function has the important property that s · sign(s) = |s|.
Inequality (8.4), often termed either sliding condition or reachability condi-
tion, constrains trajectories to point towards the sliding surface s(t) and to
remain on it thereafter.
The idea behind conditions (8.3) and (8.4) is to pick a well-behaved func-
tion of the tracking error, s, according to (8.3), and then select the feedback
control law u in (8.1) such that s² satisfies equation (8.4) despite the pres-
ence of model imprecision and of disturbances, [6].
The control u that drives the state variables to the sliding surface s in a
finite time, and keeps them on the surface thereafter in presence of bounded
disturbances, is called a sliding mode controller, and an ideal sliding mode is
said to be taking place in the system.
Control laws that satisfy (8.4) have to be discontinuous across the sliding sur-
face. In more detail, sliding mode control is usually a high-frequency switching
control whose switching frequency is finite due to the discrete-time
nature of the computer simulation. In practice, this high-frequency switching
control causes the so-called control chattering, i.e. a finite-frequency zig-
zag motion in the sliding mode. In an ideal sliding mode the switching
frequency is supposed to approach infinity and the amplitude of the zig-zag
motion tends to zero.

8.1.2 Example
The main advantages of the sliding mode control, including robustness, finite-
time convergence, and reduced-order compensated dynamics, are demon-
strated on a tutorial example taken from [7]. In the example, a single-
dimensional motion of a unit mass is considered. If we introduce variables for
the position and the velocity, x1 = x and x2 = x1 , a state-variable description
is the following (
x1 = x2
(8.6)
x2 = u + f (x1 , x2 , t),
where u is the control force, and the disturbance term f (x1 , x2 , t), which
may include dry and viscous friction as well as any other unknown resistance
forces, is assumed to be bounded, i.e., |f (x1 , x2 , t)| L > 0. The problem
is to design a feedback control law u = u(x1 , x2 ) that drives the mass to
the origin asymptotically (t x1 , x2 = 0). First, we introduce a new
variable, called sliding variable, in the state space of the system (8.6):

σ = σ(x_1, x_2) = x_2 + c x_1,   c > 0.    (8.7)


Figure 8.1: Sliding Variable and Sliding Mode Control

In order to achieve asymptotic convergence of the state variables x_1, x_2 to zero
in the presence of bounded disturbances, we have to drive the variable σ to zero
in a finite time by means of the control u. The equation σ = x_2 + c x_1 = 0, c > 0,
corresponds to a straight line in the state space of the system (8.6) and is
the so-called sliding surface.
The sliding mode control u = u(x_1, x_2) that drives the state variables x_1, x_2
to the sliding surface in finite time, and keeps them on the surface thereafter
in the presence of the bounded disturbance f(x_1, x_2, t), is suggested in [7] as

u = u(x_1, x_2) = −c x_2 − ρ sign(σ),    (8.8)

where ρ is a given control gain. The results of the simulation of system (8.6)
with the sliding mode control law (8.8), the initial conditions x_1(0) = 1,
x_2(0) = 2, the control gain ρ = 2, the parameter c = 1.5, and the disturbance
f(x_1, x_2, t) = sin(2t) are presented.
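A minimal MATLAB sketch of this simulation, assuming the control law (8.8) as
reconstructed above and a simple explicit Euler integration, is the following:

dt = 1e-3;  T = 10;  t = 0:dt:T;
c = 1.5;  rho = 2;                               % surface slope and control gain
x = zeros(2, numel(t));  x(:,1) = [1; 2];        % initial conditions
for k = 1:numel(t)-1
    sigma = x(2,k) + c*x(1,k);                   % sliding variable (8.7)
    u     = -c*x(2,k) - rho*sign(sigma);         % sliding mode control (8.8)
    f     = sin(2*t(k));                         % bounded disturbance
    x(:,k+1) = x(:,k) + dt*[x(2,k); u + f];      % system (8.6)
end
plot(t, x); grid on; legend('x_1','x_2')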
Figure 8.1 illustrates the finite-time convergence of the sliding variable σ to zero
and the sliding mode control. As previously said, the sliding mode
control is a high-frequency switching control, and the zoom in the figure
shows a finite-amplitude, finite-frequency zig-zag motion in the sliding
mode due to the discrete-time nature of the computer simulation when the sign
function is implemented. Figure 8.2 shows the asymptotic convergence of


Figure 8.2: Asymptotic Convergence and State Trajectory for f(x_1, x_2, t) =
sin(2t) and u(x_1, x_2) = −c x_2 − ρ sign(σ).

the state variables to zero and the state trajectory, in the presence of the
external bounded disturbance f(x_1, x_2, t) = sin(2t) and of the sliding mode
control u(x_1, x_2) = −c x_2 − ρ sign(σ). It is possible to identify a reaching
phase, when the state trajectory is driven towards the sliding surface, and a
sliding phase, when the state trajectory is moving along the sliding surface
towards the origin.
So far, we have assumed that all the state variables are measured (available).
In many cases, the full state is not available, and thus a sliding surface
definition similar to (8.3) is not adequate.
By means of some modifications in the algorithm, it is possible to define
a sliding mode observer which can be treated as a differentiator, since the
variable it estimates is a derivative of the measured variable.

8.2 Sliding Mode Robust Differentiators


Construction of a special differentiator may often be avoided. For example,
if the signal satisfies a certain differential equation or is an output of some
known dynamic system, the derivative of the given signal may be calculated
as a derivative with respect to some known dynamic system. Thus, the problem
is reduced to the well-known observation and filtering problems. In other
cases the construction of a differentiator is inevitable. However, an ideal
differentiator cannot be realized. Indeed, together with the basic signal it
would also have to differentiate any small noise which always exists and may
have a large derivative.
If nothing is known on the structure of the signal except some differential
inequalities, the sliding mode technique is used. In this section two robust
differentiators based on the sliding mode technique are examined at a tutorial
level.

8.2.1 Fifth-Order Differentiator


Let the input signal f(t) be a function defined on [0, ∞) consisting of a
bounded Lebesgue-measurable noise with unknown features and of an un-
known base signal f_0(t), whose fifth derivative has a known Lipschitz constant
L > 0. The problem of finding real-time robust estimations of f_0(t), \dot{f}_0(t),
. . ., f_0^{(5)}(t) which are exact in the absence of measurement noises is solved by
the fifth-order differentiator defined in [7]:

\dot{z}_0 = v_0,   v_0 = −8 L^{1/6} |z_0 − f(t)|^{5/6} sign(z_0 − f(t)) + z_1
\dot{z}_1 = v_1,   v_1 = −5 L^{1/5} |z_1 − v_0|^{4/5} sign(z_1 − v_0) + z_2
\dot{z}_2 = v_2,   v_2 = −3 L^{1/4} |z_2 − v_1|^{3/4} sign(z_2 − v_1) + z_3
\dot{z}_3 = v_3,   v_3 = −2 L^{1/3} |z_3 − v_2|^{2/3} sign(z_3 − v_2) + z_4
\dot{z}_4 = v_4,   v_4 = −1.5 L^{1/2} |z_4 − v_3|^{1/2} sign(z_4 − v_3) + z_5
\dot{z}_5 = −1.1 L sign(z_5 − v_4),   with |f_0^{(6)}(t)| ≤ L.

The fifth-order differentiator is applied, with L = 1, in order to differentiate
the function f(t) = sin(0.5t) + cos(0.5t). The initial values of the differen-
tiator are set to zero.
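A minimal MATLAB sketch of this test, integrating the differentiator equations
as reconstructed above with an explicit Euler scheme, is reported below; the
comparison is made against the analytical first derivative of the base signal.

L  = 1;  dt = 1e-4;  T = 20;  t = 0:dt:T;
f  = sin(0.5*t) + cos(0.5*t);                    % base signal (noise-free case)
z  = zeros(6, numel(t));                         % states z0..z5, initialised to zero
for k = 1:numel(t)-1
    e0 = z(1,k) - f(k);
    v0 = -8  *L^(1/6)*abs(e0)^(5/6)*sign(e0) + z(2,k);
    e1 = z(2,k) - v0;
    v1 = -5  *L^(1/5)*abs(e1)^(4/5)*sign(e1) + z(3,k);
    e2 = z(3,k) - v1;
    v2 = -3  *L^(1/4)*abs(e2)^(3/4)*sign(e2) + z(4,k);
    e3 = z(4,k) - v2;
    v3 = -2  *L^(1/3)*abs(e3)^(2/3)*sign(e3) + z(5,k);
    e4 = z(5,k) - v3;
    v4 = -1.5*L^(1/2)*abs(e4)^(1/2)*sign(e4) + z(6,k);
    v5 = -1.1*L*sign(z(6,k) - v4);
    z(:,k+1) = z(:,k) + dt*[v0; v1; v2; v3; v4; v5];
end
plot(t, z(2,:), t, 0.5*cos(0.5*t) - 0.5*sin(0.5*t)); grid on
legend('estimated first derivative','analytical first derivative')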
Convergence of the differentiator is demonstrated in Figures 8.3, 8.4, 8.5, and
8.6. Figure 8.3 illustrates the comparison between the analytical derivatives
of the base signal f (t) = sin(0.5t) + cos(0.5t) and the outputs of the fifth-
order differentiator, in the absence of measurement noises. Figure 8.4 shows
the errors which are evaluated using the analytical derivatives as reference.
Figures 8.5 and 8.6 illustrate the results of the fifth-order differentiator and
the relative errors in the presence of a normally distributed random noise
with mean 0.01 and standard deviation 0.1.
In practice, the fifth-order differentiator provides accurate estimations of the
derivatives of the base signal f(t) = sin(0.5t) + cos(0.5t) after a reasonably
short transient, even in the presence of measurement noises.


Figure 8.3: Fifth-Order Differentiator, without noise.

Figure 8.4: Fifth-Order Differentiator Errors, without noise.


Figure 8.5: Fifth-Order Differentiator, with noise.

Figure 8.6: Fifth-Order Differentiator Errors, with noise.


The transient time is a function of the gain L, and it can be shortened
according to the performance requirements.

8.2.2 Second-Order Nonlinear System Observer


This numerical example is taken from [6], and it is proposed here again with
some modifications.
Let us consider a second-order nonlinear system, consisting of a mass con-
nected to a nonlinear spring in the presence of dynamic and static friction:

\dot{x}_1 = x_2
\dot{x}_2 = −α x_1^3 − f(x_1, x_2) + u
z = x_1 + ν

where ν is the measurement noise, α is a constant nonlinear spring coefficient,
and f(x_1, x_2) = F_s x_1 + F_d x_2 represents dynamic and static friction. For this
system the sliding mode observer suggested in [6] is:

\dot{\hat{x}}_1 = λ_1 \tilde{z} + \hat{x}_2 + k_1 sign(\tilde{z})
\dot{\hat{x}}_2 = λ_2 \tilde{z} − \hat{α} \hat{x}_1^3 − \hat{f}(\hat{x}_1, \hat{x}_2) + u + k_2 sign(\tilde{z})
\tilde{z} = z − \hat{x}_1

The numerical values in the simulations are:

α = 1.0,   F_s = 1.0,   F_d = 0.75    (8.9)

while the estimated values used in the observer are:

\hat{α} = 1.0,   \hat{F}_s = 1.25,   \hat{F}_d = 1.00,   λ_1 = 3.8,   λ_2 = 7.2

The true system is excited by a sinusoidal input u = sin(t) and the initial
conditions are x_1(0) = 0.0 and x_2(0) = 0.5, with the estimated initial condi-
tions \hat{x}_1(0) = 0.0 and \hat{x}_2(0) = 0.2.
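A simulation sketch of the plant and of the observer as reconstructed above
(noise-free case, ν = 0, explicit Euler integration) could look as follows; the
symbol names mirror the reconstruction and are therefore only indicative.

dt = 1e-3;  T = 10;  t = 0:dt:T;
alpha = 1.0;  Fs = 1.0;   Fd = 0.75;           % true plant parameters, eq. (8.9)
alh   = 1.0;  Fsh = 1.25; Fdh = 1.00;          % estimated parameters in the observer
l1 = 3.8;  l2 = 7.2;  k1 = 0.2;  k2 = 0.5;     % observer gains (noise-free case)
x  = [0; 0.5];  xh = [0; 0.2];                 % true and estimated initial states
X  = zeros(2, numel(t));  Xh = X;
for k = 1:numel(t)-1
    X(:,k) = x;  Xh(:,k) = xh;
    u    = sin(t(k));                          % sinusoidal excitation
    ztil = x(1) - xh(1);                       % measurement residual (nu = 0)
    dx   = [x(2);  -alpha*x(1)^3 - (Fs*x(1) + Fd*x(2)) + u];
    dxh  = [l1*ztil + xh(2) + k1*sign(ztil);
            l2*ztil - alh*xh(1)^3 - (Fsh*xh(1) + Fdh*xh(2)) + u + k2*sign(ztil)];
    x  = x  + dt*dx;
    xh = xh + dt*dxh;
end
X(:,end) = x;  Xh(:,end) = xh;
plot(t, X, t, Xh, '--'); grid on
legend('x_1','x_2','x_1 estimated','x_2 estimated')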


Figure 8.7: True and Estimated State Variables, without noise.

The simulation results in the absence of measurement noise (ν = 0) are
shown in Figure 8.7. In this case values of k_1 and k_2 equal to 0.2 and 0.5,
respectively, have been used. The figure shows that the state variables estimated
by the observer converge to the true states, after a short transient phase.
Figures 8.8 and 8.9 illustrate the results of the simulation in the presence of
a normally distributed random noise with mean 0.01 and standard deviation
0.1. In this case values of k_1 and k_2 equal to 0.25 and 0.65, respectively, have
been used. The figures show the convergence of the estimated states to the true
states and, in particular, Figure 8.9 illustrates that the position estimated by
the observer is much closer to the true position than the measured
one.
These simulations show that, in spite of the parameter mismatch, the sliding
mode observer provides adequate performances.

8.3 Conclusions
In this chapter the sliding mode technique and the robust differentiation via
sliding mode have been analysed.
The main concepts of the sliding mode control for tracking problems, as
well as state and input observation have been briefly summarized. The


Figure 8.8: True and Estimated State Variables, with noise.

Figure 8.9: Comparison between True, Estimated and Measured Position.


main advantages of the sliding mode control such as the robustness, the
finite-time convergence, and the reduced-order compensated dynamics, are
demonstrated on a tutorial example.
The need of the employment of the sliding mode technique to construct a ro-
bust observer/differentiator has been justified and two robust differentiators
based on the sliding mode technique have been analysed.
Robust differentiation via sliding mode control has been proved to be able
to provide accurate, real-time estimations of the derivatives of a base signal
when nothing is known about its structure except some differential inequal-
ities, even in the presence of measurement noises.

Chapter 9
Conclusions

9.1 Lesson Learned


This thesis has focused on analysing advanced differentiation schemes in
order to implement them in SPARTAN, an algorithm developed by DLR to
solve OCPs. One of the most important computational issues that arise in
the numerical solution of OCP is the computation of derivatives of the objec-
tive and constraint functions. Therefore, the aim of this thesis has been to
define and test advanced differentiation schemes in order to realize how they
work, in terms of accuracy and CPU time, when implemented in combination
with pseudospectral methods.
The central difference schemes have been proved to be more accurate than
the backward and forward difference schemes. In order to improve the accu-
racy of the central difference schemes it is necessary to reduce the truncation
error by reducing the value of the step h. However, making h too small is not
desirable, otherwise the round-off error becomes dominant. Therefore, the
optimal step size for these methods is not known a priori and it requires a
trade off between the truncation and round-off errors as well as a substantial
effort and knowledge of the analytical derivative.
The complex-step derivative approach has been proved to provide greater
accuracy than the finite difference formulas, for the first derivatives, by elim-
inating the round-off error. Therefore, the step size h can be made small
enough that the truncation error is effectively zero without worrying about
the subtraction error.
The dual numbers have been presented as a tool for differentiation, not only
for optimal control, but also as tool that can be used with existing codes.
Indeed, the dual numbers have been implemented as a new class of numbers,
using operator overloading, in MATLAB. The new class allows a real-valued
analysis code to be easily converted to operate on dual numbers by just
changing the variable type declarations; the structure of the code remains
unchanged.
The dual-step approach has been proved to be able to provide exact first-
order derivatives. Indeed, the dual-step method is subject neither to the
truncation error, nor to the round-off error and, as a consequence, the error
in the first derivative estimate is machine zero regardless of the selected step
size. The disadvantage is the computational cost due to the fact that working
with the dual numbers requires additional computational work. In addition,
as for the complex-step approximation, the dual-step method requires the
function to be analytically defined.
The dual-step approach is the most accurate method for the computation of
the Jacobian matrix even if the improvements in the accuracy are paid in
terms of the computational power. However, the trade-off between accuracy
and CPU power suggests that the dual-step differentiation is a valid alter-
native to the other well-known numerical differentiation schemes, in case the
problem is analytically defined (i.e., no look-up tables are part of the data).
Hyper-dual numbers have been presented as a higher dimensional extension
of dual numbers. The hyper-dual class, in MATLAB, can be used to compute
exact first and second derivatives in order to form Gradients and Hessians
which are exact.
Focusing on SPARTAN, two NLP solvers (SNOPT and IPOPT) have been
tested in combination with the aforementioned differentiation schemes. Over-
all, SNOPT is preferable for the greater accuracy and for the lower CPU time
required with respect to IPOPT while, among the differentiation schemes, a
trade-off between the desired quality of the results and the CPU time will
suggest the more suitable scheme. Indeed, the results in terms of accuracy
and CPU time show that the effects of the use of the dual-step method, as
well as of the other schemes, in combination with the pseudospectral methods
are strongly influenced by the nonlinear behaviour of the equations which
describe the problem and by the number of the nodes used to discretize
the problem under analysis. In addition, the use of the dual differentiation
method has given the possibility to prove the spectral convergence for prob-
lems which are very different from each other.
Furthermore, the thesis focused on the sliding mode technique and the ro-
bust differentiation via sliding mode. The main advantages of the sliding
mode control such as the robustness and the finite-time convergence have
been taken into account in order to design and test a robust differentia-
tor/observer. Robust differentiation via sliding mode control provides accu-
rate, real-time estimations of the derivatives of a base signal when nothing is
known about its structure except some differential inequalities, and in spite
of the presence of measurement noises.

9.2 Future Developments


The future developments can be summarized in three points:

- Extension of the dual numbers class into a dual quaternion class to have
  a stable and efficient representation of the 6-DOF motion.

- Extension of the hyper-dual class to work with the Hessian matrix.

- Development of nonlinear adaptive controllers based on robust differentiation
  via the sliding mode technique.

Appendix A
Dual and Hyper-Dual Numbers

A.1 Introduction
This appendix provides the necessary background that is required for the
computation of the derivatives of a generic function using either the dual
numbers or the hyper-dual numbers. For further details see [16].

A.2 Dual Numbers


A.2.1 Definition
In linear algebra, the dual numbers extend the real numbers by adjoining
one new element ε with the property ε² = 0 (ε is nilpotent). The collection
of dual numbers forms a particular two-dimensional commutative associative
algebra over the real numbers. Every dual number has the form

z = a + b ε    (A.1)

with a and b uniquely determined real numbers and, in particular,

a = real(z) → Real Part
b = dual(z) → Dual Part

Dual numbers extend the real numbers in a similar way to the complex
numbers. Indeed, as the dual numbers do, the complex numbers adjoin a new
element i, for which i² = −1, and every complex number has the form
z = a + bi where a and b are real numbers.
The above definition (A.1) relies on the idea that ε² = 0 with ε ≠ 0. This
may not be mathematically possible; ε² = 0 may require ε = 0. For this
reason, these numbers are also called fake numbers as a reference to
their similarity with imaginary numbers and to acknowledge that this type
of number may not formally exist.
Using matrices, dual numbers can be represented also as

ε = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix};   z = a + b ε = \begin{pmatrix} a & b \\ 0 & a \end{pmatrix}.
It is easy to see that the matrix form satisfies all the properties of the dual
numbers.

A.2.2 Properties
In order to implement the dual numbers, operations on these numbers should
be properly defined. Indeed, given three dual numbers a, b and c, it is possible
to demonstrate that the following properties hold:
- Additive associativity: (a + b) + c = a + (b + c).
- Additive commutativity: a + b = b + a.
- Additive identity: 0 + a = a + 0 = a.
- Additive inverse: a + (−a) = (−a) + a = 0.
- Multiplicative associativity: (a · b) · c = a · (b · c).
- Multiplicative identity: 1 · a = a · 1 = a.
- Multiplicative inverse: a · a⁻¹ = a⁻¹ · a = 1.
- Left and right distributivity: a · (b + c) = (a · b) + (a · c) and
  (b + c) · a = (b · a) + (c · a).

A.2.3 Algebraic operations


Given two numbers of this type, a = a_1 + a_2 ε and b = b_1 + b_2 ε, addition and
multiplication can be defined as follows.

- Addition: a + b = (a_1 + b_1) + (a_2 + b_2) ε.
- Multiplication: a · b = (a_1 b_1) + (a_1 b_2 + a_2 b_1) ε.


Using the definition for multiplication, the multiplicative inverse can be de-
fined as

- Multiplicative inverse: 1/a = 1/a_1 − (a_2 / a_1²) ε.

The multiplicative inverse is therefore defined for all numbers of this type
with a non-zero real part, a_1. Division can be defined as follows.

- Division: a/b = a · (1/b) = a_1/b_1 + (a_2/b_1 − a_1 b_2 / b_1²) ε.

An important property of the dual numbers is related to the definition of their
norm, norm(a). The norm should be related only to the real part of these
numbers, norm(a) = \sqrt{a_1²}. This is a useful property in order to compare
numbers of this type using inequalities. For instance, consider a < b: this is
equivalent to norm(a) < norm(b), or a_1 < b_1.
This definition of the norm has the property that norm(a · b) = norm(a) ·
norm(b).
It is moreover possible to define the conjugate of the dual number. Indeed,
given the norm, it is possible to write norm(a) = \sqrt{a_1²} = \sqrt{a · a_conj}, so that

- Conjugate: a_conj = a_1 − a_2 ε.

A.2.4 Defining functions


With addition and multiplication defined in a consistent way, functions can
be defined using Maclaurin series.

sin(x) Example.
The Maclaurin series of the function sin(x) for x ∈ R is the following

sin(x) = x − \frac{x^3}{6} + \frac{x^5}{120} − \frac{x^7}{5040} + ... + \frac{(−1)^n}{(2n+1)!} x^{2n+1}.    (A.2)

For x = x_1 + x_2 ε, the terms x³, x⁵, etc. can be calculated using the rule
for multiplication given in Section A.2.3.

x^3 = x_1^3 + (3 x_1^2 x_2) ε = x_1^3 \left(1 + \frac{3 x_2}{x_1} ε\right)    (A.3)

x^5 = x_1^5 + (5 x_1^4 x_2) ε = x_1^5 \left(1 + \frac{5 x_2}{x_1} ε\right)    (A.4)

These results are then added according to the Maclaurin series so that

sin(x) = \left(x_1 − \frac{x_1^3}{6} + \frac{x_1^5}{120} − \frac{x_1^7}{5040} + ...\right) + x_2 \left(1 − \frac{x_1^2}{2} + \frac{x_1^4}{24} − \frac{x_1^6}{720} + ...\right) ε.    (A.5)


Recognizing that
cos(x) = 1 − \frac{x^2}{2} + \frac{x^4}{24} − \frac{x^6}{720} + ... + \frac{(−1)^n}{(2n)!} x^{2n}    (A.6)

the expression (A.5) can be simplified to give [13]

sin(x) = sin(x_1) + x_2 cos(x_1) ε    (A.7)

ln(x) Example.
The Maclaurin series of the function ln(1 + x) for x ∈ R is the following

ln(1 + x) = x − \frac{x^2}{2} + \frac{x^3}{3} − \frac{x^4}{4} + ... + \frac{(−1)^{n+1}}{n} x^n.    (A.8)

For x = x_1 + x_2 ε, using the above expressions for the terms x², x³, etc.
yields

ln(1 + x) = \left(x_1 − \frac{x_1^2}{2} + \frac{x_1^3}{3} − \frac{x_1^4}{4} + ...\right) + x_2 \left(1 − x_1 + x_1^2 − x_1^3 + ...\right) ε    (A.9)

where \frac{1}{1+x_1} = 1 − x_1 + x_1^2 − x_1^3 + ... and \frac{1}{(1+x_1)^2} = 1 − 2x_1 + 3x_1^2 − 4x_1^3 + ..., so [13]

ln(1 + x) = ln(1 + x_1) + \frac{x_2}{1 + x_1} ε    (A.10)

ln(x) = ln(x_1) + \frac{x_2}{x_1} ε    (A.11)
Function f (x).
The above results can also be derived from the Taylor series for a
general function f (x).
f(x + z) = f(x) + z f'(x) + \frac{z^2 f''(x)}{2!} + \frac{z^3 f'''(x)}{3!} + ...    (A.12)

where z is a dual number so that

z = b ε    (A.13)
z^2 = 0    (A.14)
z^3 = 0.    (A.15)

These terms are then added according to the Taylor series [13], so that

f(x + b ε) = f(x) + b f'(x) ε.    (A.16)
It is implicit that each function extended in the dual plane hides its
derivative in its dual part. For this reason it is possible to state that
the dual-step approach can be considered as belonging to the class of
the Automatic Differentiation Methods as well.
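For instance, for f(x) = x^2 a dual step gives f(3 + ε) = (3 + ε)^2 = 9 + 6ε,
so the dual part immediately returns f'(3) = 6.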


A.2.5 Implementation

The dual numbers have been implemented as a new class of numbers in MATLAB,
using operator overloading. This section contains their implementation, which
is based on Ref. [17], with some minor modifications.
The new class includes definitions for the standard algebraic operations,
logical comparison operations, and other more general functions such as the
exponential, the sine, and so on. This class definition file allows a
real-valued analysis code to be easily converted to operate on dual numbers by
simply changing the variable type declarations; the structure of the code
remains unchanged.

classdef Dual

properties
x = 0;
d = 0;
end

methods
% Constructor
function obj = Dual(x,d)
if size(x,1) ~= size(d,1) || size(x,2) ~= size(d,2)
error('DUAL:constructor','X and D are different size')
else
obj.x = x;
obj.d = d;
end
end
% Getters
function v = getvalue(a)
v = a.x;
end
function d = getderiv(a)
d = a.d;
end
% Indexing
function B = subsref(A,S)
switch S.type
case '()'
idx = S.subs;
switch length(idx)
case 1
B = Dual(A.x(idx{1}),A.d(idx{1}));
case 2
B = Dual(A.x(idx{1},idx{2}), A.d(idx{1},idx{2}));
otherwise


error('Dual:subsref','Arrays with more than 2 dims not supported')


end
case '.'
switch S.subs
case 'x'
B = A.x;
case 'd'
B = A.d;
otherwise
error('Dual:subsref','Field %s does not exist',S.subs)
end
otherwise
error('Dual:subsref','Indexing with {} is not supported')
end
end
function A = subsasgn(A,S,B)
switch S.type
case '()'
idx = S.subs;
otherwise
error('Dual:subsasgn','Assignment with {} and . not supported')
end
if ~isdual(B)
B = mkdual(B);
end
switch length(idx)
case 1
A.x(idx{1}) = B.x;
A.d(idx{1}) = B.d;
case 2
A.x(idx{1},idx{2}) = B.x;
A.d(idx{1},idx{2}) = B.d;
otherwise
error('Dual:subsref','Arrays with more than 2 dims not supported')
end
end
% Concatenation operators
function A = horzcat(varargin)
for k = 1:length(varargin)
tmp = varargin{k};
xs{k} = tmp.x;
ds{k} = tmp.d;
end
A = Dual(horzcat(xs{:}), horzcat(ds{:}));
end
function A = vertcat(varargin)
for k = 1:length(varargin)
tmp = varargin{k};
xs{k} = tmp.x;


ds{k} = tmp.d;
end
A = Dual(vertcat(xs{:}), vertcat(ds{:}));
end
% Plotting functions
function plot(X,varargin)
if length(varargin) < 1
Y = X;
X = 1:length(X.x);
elseif isdual(X) && isdual(varargin{1})
Y = varargin{1};
varargin = varargin(2:end);
elseif isdual(X)
Y = X;
X = 1:length(X);
elseif isdual(varargin{1})
Y = varargin{1};
varargin = varargin(2:end);
end
if isdual(X)
plot(X.x,[Y.x(:) Y.d(:)],varargin{:})
else
plot(X,[Y.x(:) Y.d(:)],varargin{:})
end
grid on
legend({'Function','Derivative'})
end
% Comparison operators
function res = eq(a,b)
if isdual(a) && isdual(b)
res = a.x == b.x;
elseif isdual(a)
res = a.x == b;
elseif isdual(b)
res = a == b.x;
end
end
function res = neq(a,b)
if isdual(a) && isdual(b)
res = a.x ~= b.x;
elseif isdual(a)
res = a.x ~= b;
elseif isdual(b)
res = a ~= b.x;
end
end
function res = lt(a,b)
if isdual(a) && isdual(b)
res = a.x < b.x;


elseif isdual(a)
res = a.x < b;
elseif isdual(b)
res = a < b.x;
end
end
function res = le(a,b)
if isdual(a) && isdual(b)
res = a.x <= b.x;
elseif isdual(a)
res = a.x <= b;
elseif isdual(b)
res = a <= b.x;
end
end
function res = gt(a,b)
if isdual(a) && isdual(b)
res = a.x > b.x;
elseif isdual(a)
res = a.x > b;
elseif isdual(b)
res = a > b.x;
end
end
function res = ge(a,b)
if isdual(a) && isdual(b)
res = a.x >= b.x;
elseif isdual(a)
res = a.x >= b;
elseif isdual(b)
res = a >= b.x;
end
end
function res = isnan(a)
res = isnan(a.x);
end
function res = isinf(a)
res = isinf(a.x);
end
function res = isfinite(a)
res = isfinite(a.x);
end
% Unary operators
function obj = uplus(a)
obj = a;
end
function obj = uminus(a)
obj = Dual(-a.x, -a.d);
end


function obj = transpose(a)


obj = Dual(transpose(a.x), transpose(a.d));
end
function obj = ctranspose(a)
obj = Dual(ctranspose(a.x), ctranspose(a.d));
end
function obj = reshape(a,ns)
obj = Dual(reshape(a.x,ns), reshape(a.d,ns));
end
% Binary arithmetic operators
function obj = plus(a,b)
if isdual(a) && isdual(b)
obj = Dual(a.x + b.x, a.d + b.d);
elseif isdual(a)
obj = Dual(a.x + b, a.d);
elseif isdual(b)
obj = Dual(a + b.x, b.d);
end
end
function obj = minus(a,b)
if isdual(a) && isdual(b)
obj = Dual(a.x - b.x, a.d - b.d);
elseif isdual(a)
obj = Dual(a.x - b, a.d);
elseif isdual(b)
obj = Dual(a - b.x, -b.d);
end
end
function obj = times(a,b)
if isdual(a) && isdual(b)
obj = Dual(a.x .* b.x, a.x .* b.d + a.d .* b.x);
elseif isdual(a)
obj = Dual(a.x .* b, a.d .* b);
elseif isdual(b)
obj = Dual(a .* b.x, a .* b.d);
end
end
function obj = mtimes(a,b)
% Matrix multiplication for dual numbers is elementwise
obj = times(a,b);
end
function obj = rdivide(a,b)
if isdual(a) && isdual(b)
xpart = a.x ./ b.x;
dpart = (a.d .* b.x - a.x .* b.d) ./ (b.x .* b.x);
obj = Dual(xpart,dpart);
elseif isdual(a)
obj = Dual(a.x ./ b, a.d ./ b);
elseif isdual(b)


obj = Dual(a ./ b.x, -(a .* b.d) ./ (b.x .* b.x));


end
end
function obj = mrdivide(a,b)
% All division is elementwise
obj = rdivide(a,b);
end
function obj = power(a,b)
% n is assumed to be a real value (not a dual)
if isdual(a) && isdual(b)
error('Dual:power','Power is not defined for a and b both dual')
elseif isdual(a)
obj = Dual(power(a.x,b), b .* a.d .* power(a.x,b-1));
elseif isdual(b)
ab = power(a,b.x);
obj = Dual(ab, b.d .* log(a) .* ab);
end
end
function obj = mpower(a,n)
% Elementwise power
obj = power(a,n);
end
% Miscellaneous math functions
function obj = sqrt(a)
rr=a.x;
rr(rr==0)=eps;
obj = Dual(sqrt(a.x), a.d ./ (2 * sqrt(rr)));
end

function obj = abs(a)


obj = Dual(abs(a.x), a.d .* sign(a.x));
end
function obj = sign(a)
z = a.x == 0;
x = sign(a.x);
d = a.d .* ones(size(a.d)); d(z) = NaN;
obj = Dual(x,d);
end
function obj = pow2(a)
obj = Dual(pow2(a.x), a.d .* log(2) .* pow2(a.x));
end
function obj = erf(a)
ds = 2/sqrt(pi) * exp(-(a.x).^2);
obj = Dual(erf(a.x), a.d .* ds);
end
function obj = erfc(a)
ds = -2/sqrt(pi) * exp(-(a.x).^2);


obj = Dual(erfc(a.x), a.d .* ds);


end
function obj = erfcx(a)
ds = 2 * a.x .* exp((a.x).^2) .* erfc(a.x) - 2/sqrt(pi);
obj = Dual(erfcx(a.x), a.d .* ds);
end
% Exponential and logarithm
function obj = exp(a)
obj = Dual(exp(a.x), a.d .* exp(a.x));
end
function obj = log(a)
obj = Dual(log(a.x), a.d ./ a.x);
end
% Trigonometric functions
function obj = sin(a)
obj = Dual(sin(a.x), a.d .* cos(a.x));
end
function obj = cos(a)
obj = Dual(cos(a.x), -a.d .* sin(a.x));
end
function obj = tan(a)
obj = Dual(tan(a.x), a.d .* sec(a.x).^2);
end
function obj = asin(a)
obj = Dual(asin(a.x), a.d ./ sqrt(1-(a.x).^2));
end
function obj = acos(a)
obj = Dual(acos(a.x), -a.d ./ sqrt(1-(a.x).^2));
end
function obj = atan(a)
obj = Dual(atan(a.x), a.d ./ (1 + (a.x).^2));
end
% Hyperbolic trig functions
function obj = sinh(a)
obj = Dual(sinh(a.x), a.d .* cosh(a.x));
end
function obj = cosh(a)
obj = Dual(cosh(a.x), a.d .* sinh(a.x));
end
function obj = tanh(a)
obj = Dual(tanh(a.x), a.d .* sech(a.x).^2);
end
function obj = asinh(a)
obj = Dual(asinh(a.x), a.d ./ sqrt((a.x).^2 + 1));
end
function obj = acosh(a)
obj = Dual(acosh(a.x), a.d ./ sqrt((a.x).^2 - 1));
end
function obj = atanh(a)
obj = Dual(atanh(a.x), a.d ./ (1 - (a.x).^2));
end
end
end

The function isdual.m is the following:

function b = isdual(a)
b = strcmp(class(a),'Dual');
end
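As a minimal usage sketch (the function and variable names below are
illustrative, not part of the original listing), the class can be exercised as
follows to obtain the value and the first derivative of a scalar function in a
single evaluation:

f  = @(x) exp(x) .* sin(x);   % real-valued code, left unchanged
xd = Dual(0.7, 1.0);          % seed the dual part with 1
fd = f(xd);                   % dual-step evaluation
val = fd.x;                   % f(0.7)
der = fd.d;                   % f'(0.7) = exp(0.7)*(sin(0.7) + cos(0.7))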

A.3 Hyper-Dual Numbers

Hyper-dual numbers are a higher dimensional extension of dual numbers, in a
similar way that the quaternions are a higher dimensional extension of
ordinary complex numbers. A hyper-dual number is of the form

x = x1 + x2·ε1 + x3·ε2 + x4·ε1ε2.   (A.17)

It has one real part and three non-real parts with the following properties

ε1^2 = ε2^2 = (ε1ε2)^2 = 0,   (A.18)

where

ε1 ≠ ε2 ≠ ε1ε2 ≠ 0,   (A.19)

or, in other words,

ε1^2 = 0,   ε1 ≠ 0,   (A.20)
ε2^2 = 0,   ε2 ≠ 0,   (A.21)
(ε1ε2)^2 = 0,   ε1ε2 ≠ 0.   (A.22)

The properties of these numbers are exactly the same as the ones concerning
the dual numbers (see section A.2.2).

A.3.1 Defining Algebraic Operations

Given two numbers of this type, a = a1 + a2·ε1 + a3·ε2 + a4·ε1ε2 and
b = b1 + b2·ε1 + b3·ε2 + b4·ε1ε2, addition and multiplication can be defined
as follows.

Addition:
a + b = (a1 + b1) + (a2 + b2)ε1 + (a3 + b3)ε2 + (a4 + b4)ε1ε2.

Multiplication:
a · b = (a1·b1) + (a1·b2 + a2·b1)ε1 + (a1·b3 + a3·b1)ε2
        + (a1·b4 + a2·b3 + a3·b2 + a4·b1)ε1ε2.

Using the definition for multiplication, the multiplicative inverse can be
defined as

Multiplicative inverse:
1/a = 1/a1 - (a2/a1^2)ε1 - (a3/a1^2)ε2 + (2·a2·a3/a1^3 - a4/a1^2)ε1ε2.

The multiplicative inverse is therefore defined for all numbers of this type
with a non-zero real part, a1. Division can be defined as follows.

Division:
a/b = a · (1/b) = a1/b1 + (a2/b1 - a1·b2/b1^2)ε1 + (a3/b1 - a1·b3/b1^2)ε2
      + (a4/b1 - a2·b3/b1^2 - a3·b2/b1^2 - a1·b4/b1^2 + 2·a1·b2·b3/b1^3)ε1ε2.

An important property of the hyper-dual numbers is related to the definition
of their norm, norm(a). The norm is defined just like for the dual numbers and
it should be related only to the real part of these numbers,
norm(a) = sqrt(a1^2). This is a useful property in order to compare numbers of
this type using inequalities. For instance, consider a < b; this is equivalent
to norm(a) < norm(b), or a1 < b1.
This definition of the norm has the property that norm(a · b) = norm(a) · norm(b).
It is moreover possible to define the conjugate of the hyper-dual number.
Indeed, given the norm, it is possible to write
norm(a) = sqrt(a1^2) = sqrt(a · a_conj), so that

Conjugate: a_conj = a1 - a2·ε1 - a3·ε2 + (2·a2·a3/a1 - a4)ε1ε2.

With the addition and the multiplication operators defined in a consistent
way, functions can be defined as already done for the dual numbers (see
section A.2.4).

A.3.2 Hyper-Dual Numbers for Exact Derivative Calculations

Hyper-dual numbers can be used to compute exact first and second derivatives
in order to form gradients and Hessians for optimization methods [14].
Considering the form of a hyper-dual number (A.17), the definition (A.18)
of ε1 and ε2 implies that the Taylor series for a function with a hyper-dual
step truncates exactly at the second-derivative term:

f(x + h1·ε1 + h2·ε2 + 0·ε1ε2) = f(x) + h1 f'(x)ε1 + h2 f'(x)ε2 + h1 h2 f''(x)ε1ε2.   (A.23)

The higher order terms are all zero by the definition ε1^2 = ε2^2 = (ε1ε2)^2 = 0,
so there is no truncation error. The first and second derivatives are the
leading terms of the non-real parts: if f'(x) is desired, simply take the ε1 or
ε2 part and divide by the appropriate step; if f''(x) is desired, take the
ε1ε2 part:

f'(x) = ε1part[f(x + h1·ε1 + h2·ε2 + 0·ε1ε2)] / h1,   (A.24)

f'(x) = ε2part[f(x + h1·ε1 + h2·ε2 + 0·ε1ε2)] / h2,   (A.25)

f''(x) = ε1ε2part[f(x + h1·ε1 + h2·ε2 + 0·ε1ε2)] / (h1 h2).   (A.26)
The derivative calculations are not even subject to subtractive cancellation
error, so the use of hyper-dual numbers results in first and second derivative
calculations that are exact, regardless of the step size.
The real part returns the original function evaluated at the real argument
Re(x), and it is mathematically impossible for the derivative calculations to
affect the real part. Indeed, the use of this new number system to com-
pute the first and second derivative involves converting a real-valued func-
tion evaluation to operate on these alternative types of numbers. Then, the
derivatives are computed by adding a perturbation to the non-real parts and
evaluating the modified function.
The mathematics of these alternative types of numbers are such that when
operations are carried out on the real part of the number, derivative infor-
mation for those operations is formed and stored in the non-real parts of
the number. At every stage of the function evaluation the non-real parts of
the number contain derivative information with respect to the input. This
process must be repeated for every input variable for which derivative infor-
mation is desired [14].
Following the above discussion, methods for computing exact higher derivatives
can be created by using more non-real parts. For instance, to produce nth
derivatives, nth order hyper-dual numbers would be used. These nth order
hyper-dual numbers have n components ε1, ε2, ..., εn and all of their
combinations.
If only the first derivatives are needed, first order hyper-dual numbers would
be used: the dual numbers.
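As a minimal MATLAB sketch of equations (A.24)-(A.26), assuming the HyperDual
class of section A.3.4 is available on the path (the function and step names
are illustrative):

f  = @(x) x.^3 .* sin(x);            % any smooth, real-valued function
x0 = 1.2;  h1 = 1;  h2 = 1;          % the step sizes do not affect the accuracy
fh = f(HyperDual(x0, h1, h2, 0));    % single hyper-dual evaluation
f1 = fh.d1 / h1;                     % first derivative, from the eps1 part
f2 = fh.d3 / (h1*h2);                % second derivative, from the eps1*eps2 part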


A.3.3 Numerical Examples

The behavior of the second-derivative calculation methods is demonstrated
using a simple analytic function: f(t) = A sin(t) e^t.
Figure A.1(b) shows the relative error of the second-derivative calculation
methods as a function of the step size, h. The relative error is defined as
ε = |f'' - f''_ref| / |f''_ref|.
As the step size is initially decreased, the error of the central difference
and complex-step approximations decreases according to the order of the
truncation error of the method. For the second derivatives, the complex-step
approximation is subject to subtractive cancellation error, as are the central
difference approximations (the formula for the second-derivative complex-step
approximation is available in Ref. [15]). The subtractive cancellation error
begins to dominate the overall error as the step size is further reduced.
The error of the hyper-dual number calculations is machine zero, regardless of
the step size, because the hyper-dual number approach is subject neither to
truncation error nor to round-off error.
If we compare the error in the first-derivative calculations, Figure A.1(a),
with the one in the second-derivative calculations, Figure A.1(b), we can
point out that, unfortunately, the optimal step size for accurate
second-derivative calculations with central difference and complex-step
approximations is usually not the same as the optimal step size for first
derivatives. The optimal step size for these methods requires a trade-off
between the truncation and round-off errors. The optimal step size is not
known a priori, it may require knowledge of the true derivative value, and it
will change depending on the function, the point where the derivative is
desired, and the independent variable we are considering.
These problems do not affect the hyper-dual number calculations of the first
and second derivatives, as shown in Figure A.1.
Figure A.2 illustrates the accuracies of several derivative calculation
methods as a function of the step size for the function
f(x) = e^x / sqrt(sin^3(x) + cos^3(x)). Since this example is taken from [15],
we can compare the results: the comparison of Figures A.2 and A.3 shows the
consistency of the results.
In conclusion, comparing the behavior of the first- and second-derivative
approximations, the figures show the difficulty of computing accurate second
derivatives using traditional methods. In order to compute exact second
derivatives it is necessary to employ the hyper-dual numbers.
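A compact sketch of this kind of step-size study is reported below; the test
function and its analytic second derivative are illustrative choices (not the
functions used in the figures), and the HyperDual class of section A.3.4 is
assumed to be on the path:

f     = @(x) exp(x) .* sin(x);                 % test function
d2ref = @(x) 2*exp(x).*cos(x);                 % analytic second derivative
x0    = 1.5;
hs    = 10.^(-(1:15));
errCD = zeros(size(hs));  errHD = zeros(size(hs));
for k = 1:numel(hs)
    h    = hs(k);
    d2cd = (f(x0+h) - 2*f(x0) + f(x0-h)) / h^2;  % central difference, O(h^2)
    fh   = f(HyperDual(x0, h, h, 0));            % hyper-dual step
    d2hd = fh.d3 / h^2;                          % eps1*eps2 part
    errCD(k) = abs(d2cd - d2ref(x0)) / abs(d2ref(x0));
    errHD(k) = abs(d2hd - d2ref(x0)) / abs(d2ref(x0));
end
loglog(hs, max(errCD,eps), 'o-', hs, max(errHD,eps), 's-'), grid on
legend('Central difference','Hyper-dual'), xlabel('step size h'), ylabel('relative error')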

[Figure A.1: Accuracies of several derivative calculation methods as a function
of the step size for the function f(t) = A sin(t) e^t; panel (a) first
derivatives, panel (b) second derivatives.]

[Figure A.2: Accuracies of several derivative calculation methods as a function
of the step size for the function f(x) = e^x / sqrt(sin^3(x) + cos^3(x));
panels (a) and (b).]

[Figure A.3: Accuracies of several derivative calculation methods as a function
of the step size for the function f(x) = e^x / sqrt(sin^3(x) + cos^3(x)), from [15].]

A.3.4 Implementation

The hyper-dual numbers have been implemented as a new class of numbers in
MATLAB, using operator overloading. This section contains their
implementation, which is based on the dual numbers class (available in
section A.2.5).
The new class includes definitions for the standard algebraic operations,
logical comparison operations, and other more general functions such as the
exponential, the sine, and so on. This class definition file allows a
real-valued analysis code to be easily converted to operate on hyper-dual
numbers by simply changing the variable type declarations; the structure of
the code remains unchanged.

classdef HyperDual

properties
x = 0;
d1 = 0;
d2 = 0;
d3 = 0;
end

methods
% Constructor
function obj = HyperDual(x,d1,d2,d3)
if size(x,1) ~= size(d1,1) || size(x,2) ~= size(d1,2) ...
|| size(x,1) ~= size(d2,1) || size(x,2) ~= size(d2,2) ...
|| size(x,1) ~= size(d3,1) || size(x,2) ~= size(d3,2)
error('DUAL:constructor','X and D are different size')
else
obj.x = x;
obj.d1 = d1;
obj.d2 = d2;
obj.d3 = d3;
end
end
% Getters
function v = getvalue_h(a)
v = a.x;
end
function d1 = getderiv_1(a)
d1 = a.d1;
end
function d2 = getderiv_2(a)
d2 = a.d2;
end


function d3 = getderiv_3(a)


d3 = a.d3;
end
% Indexing
function B = subsref(A,S)
switch S.type
case '()'
idx = S.subs;
switch length(idx)
case 1
B = HyperDual(A.x(idx{1}),A.d1(idx{1}),...
A.d2(idx{1}),A.d3(idx{1}));
case 2
B = HyperDual(A.x(idx{1},idx{2}), A.d1(idx{1},idx{2}),...
A.d2(idx{1},idx{2}),A.d3(idx{1},idx{2}));
otherwise
error('HyperD:subsref','Arrays with more than 2 dims not supported')
end
case '.'
switch S.subs
case 'x'
B = A.x;
case 'd1'
B = A.d1;
case 'd2'
B = A.d2;
case 'd3'
B = A.d3;
otherwise
error('Dual:subsref','Field %s does not exist',S.subs)
end
otherwise
error('Dual:subsref','Indexing with {} is not supported')
end
end
function A = subsasgn(A,S,B)
switch S.type
case '()'
idx = S.subs;
otherwise
error('Dual:subsasgn','Assignment with {} and . not supported')
end
if ~ishyperdual(B)
% promote a real array to a hyper-dual with zero non-real parts
B = HyperDual(B, zeros(size(B)), zeros(size(B)), zeros(size(B)));
end
switch length(idx)
case 1
A.x(idx{1}) = B.x;
A.d1(idx{1}) = B.d1;


A.d2(idx{1}) = B.d2;
A.d3(idx{1}) = B.d3;
case 2
A.x(idx{1},idx{2}) = B.x;
A.d1(idx{1},idx{2}) = B.d1;
A.d2(idx{1},idx{2}) = B.d2;
A.d3(idx{1},idx{2}) = B.d3;
otherwise
error('Dual:subsref','Arrays with more than 2 dims not supported')
end
end
% Comparison operators
function res = eq(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x == b.x;
elseif ishyperdual(a)
res = a.x == b;
elseif ishyperdual(b)
res = a == b.x;
end
end
function res = neq(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x ~= b.x;
elseif ishyperdual(a)
res = a.x ~= b;
elseif ishyperdual(b)
res = a ~= b.x;
end
end
function res = lt(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x < b.x;
elseif ishyperdual(a)
res = a.x < b;
elseif ishyperdual(b)
res = a < b.x;
end
end
function res = le(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x <= b.x;
elseif ishyperdual(a)
res = a.x <= b;
elseif ishyperdual(b)
res = a <= b.x;
end
end
function res = gt(a,b)


if ishyperdual(a) && ishyperdual(b)


res = a.x > b.x;
elseif ishyperdual(a)
res = a.x > b;
elseif ishyperdual(b)
res = a > b.x;
end
end
function res = ge(a,b)
if ishyperdual(a) && ishyperdual(b)
res = a.x >= b.x;
elseif ishyperdual(a)
res = a.x >= b;
elseif ishyperdual(b)
res = a >= b.x;
end
end
function res = isnan(a)
res = isnan(a.x);
end
function res = isinf(a)
res = isinf(a.x);
end
function res = isfinite(a)
res = isfinite(a.x);
end
% Unary operators
function obj = uplus(a)
obj = a;
end
function obj = uminus(a)
obj = HyperDual(-a.x, -a.d1, -a.d2, -a.d3);
end
function obj = transpose(a)
obj = HyperDual(transpose(a.x), transpose(a.d1),...
transpose(a.d2), transpose(a.d3));
end
function obj = ctranspose(a)
obj = HyperDual(ctranspose(a.x), ctranspose(a.d1),...
ctranspose(a.d2), ctranspose(a.d3));
end
function obj = reshape(a,ns)
obj = HyperDual(reshape(a.x,ns), reshape(a.d1,ns),...
reshape(a.d2,ns), reshape(a.d3,ns));
end
% Binary arithmetic operators
function obj = plus(a,b)
if ishyperdual(a) && ishyperdual(b)
obj = HyperDual(a.x + b.x, a.d1 + b.d1, a.d2 + b.d2, a.d3 + b.d3);


elseif ishyperdual(a)
obj = HyperDual(a.x + b, a.d1, a.d2, a.d3);
elseif ishyperdual(b)
obj = HyperDual(a + b.x, b.d1, b.d2, b.d3);
end
end
function obj = minus(a,b)
if ishyperdual(a) && ishyperdual(b)
obj = HyperDual(a.x - b.x, a.d1 - b.d1, a.d2 - b.d2, a.d3 - b.d3);
elseif ishyperdual(a)
obj = HyperDual(a.x - b, a.d1, a.d2, a.d3);
elseif ishyperdual(b)
obj = HyperDual(a - b.x, -b.d1, -b.d2, -b.d3);
end
end

function obj = times(a,b)


if ishyperdual(a) && ishyperdual(b)
obj = HyperDual(a.x .* b.x, (a.x .* b.d1)+(a.d1 .* b.x),...
(a.x.*b.d2)+(a.d2.*b.x), (a.x.*b.d3)+(a.d1.*b.d2)+...
(a.d2.*b.d1)+(a.d3.*b.x));
elseif ishyperdual(a)
obj = HyperDual(a.x .* b, a.d1 .* b, a.d2 .* b, a.d3 .* b);
elseif ishyperdual(b)
obj = HyperDual(a .* b.x, a .* b.d1, a .* b.d2, a .* b.d3);
end
end
function obj = mtimes(a,b)
% Matrix multiplication for dual numbers is elementwise
obj = times(a,b);
end
function obj = rdivide(a,b)
if ishyperdual(a) && ishyperdual(b)
xpart = a.x ./ b.x;
d1part = (a.d1 ./ b.x) - ((a.x .* b.d1) ./ (b.x .* b.x));
d2part = (a.d2 ./ b.x)-((a.x .* b.d2) ./ (b.x .* b.x));
d3part = ((a.d3 ./ b.x)-((a.d1 .* b.d2)./(b.x .* b.x))-...
((a.d2 .* b.d1)./(b.x .* b.x))+(a.x .*(-(b.d3...
./ (b.x .* b.x))+((2 .* b.d1 .* b.d2)./(b.x .* b.x .*b.x)))));
obj = HyperDual(xpart,d1part, d2part, d3part);
elseif ishyperdual(a)
xpart = a.x ./ b;
d1part = a.d1 ./ b;
d2part = a.d2 ./ b;
d3part = a.d3 ./ b;
obj = HyperDual(xpart, d1part, d2part, d3part);
elseif ishyperdual(b)
xpart = a ./ b.x;
d1part = -((a .* b.d1) ./ (b.x .* b.x));


d2part = -((a .* b.d2) ./ (b.x .* b.x));


d3part = a .*(-(b.d3 ./ (b.x .* b.x)) + ((2.* b.d1 .* b.d2)...
./(b.x .* b.x .*b.x)));
obj = HyperDual(xpart, d1part, d2part, d3part);
end
end
function obj = mrdivide(a,b)
% All division is elementwise
obj = rdivide(a,b);
end
function obj = power(a,b)
% n is assumed to be a real value (not a dual)
tol=1e-15;
if ishyperdual(a) && ishyperdual(b)
error('Dual:power','Power is not defined for a and b both dual')
elseif ishyperdual(a)
obj = HyperDual(power(a.x,b), b .* a.d1 .* power(a.x-tol,b-1),...
b .* a.d2 .* power(a.x+tol,b-1),...
(b .* a.d3 .* power(a.x+tol, b-1))+...
(b .* (b-1) .* a.d1 .* a.d2 .*power(a.x+tol, b-2)));
elseif ishyperdual(b)
error('Power with dual index, not implemented yet')
end
end
function obj = mpower(a,n)
% Elementwise power
obj = power(a,n);
end
% Miscellaneous math functions
function obj = sqrt(a)
obj = power(a,0.5);
end
function obj = abs(a)
obj = HyperDual(abs(a.x), a.d1 .* sign(a.x), a.d2 .* sign(a.x),...
a.d3 .* sign(a.x));
end
% Exponential and logarithm
function obj = exp(a)
obj = exp(a.x) .* HyperDual(ones(size(a.x)), a.d1,...
a.d2, a.d3 + a.d1 .* a.d2);
end
function obj = log(a)
obj = HyperDual(log(a.x), a.d1 ./ a.x,...
a.d2 ./ a.x, a.d3 ./ a.x - (a.d1 .* a.d2)./(a.x.^2));
end
% Trigonometric functions
function obj = sin(a)
obj = HyperDual(sin(a.x), a.d1 .*cos(a.x), a.d2 .* cos(a.x),...
((a.d3 .* cos(a.x)) - (a.d1 .* a.d2 .* sin(a.x))));


end
function obj = cos(a)
obj = HyperDual(cos(a.x), -a.d1 .* sin(a.x), -a.d2 .* sin(a.x),...
(-a.d3 .* sin(a.x))-(a.d1 .* a.d2 .* cos(a.x)));
end
function obj = tan(a)
obj = HyperDual(tan(a.x), a.d1 .* sec(a.x).^2, a.d2 .* sec(a.x).^2,...
a.d3 .* sec(a.x).^2 + a.d1 .* a.d2 .* (2.*tan(a.x).*sec(a.x).^2));
end
function obj = asin(a)
obj = HyperDual(asin(a.x), a.d1 ./ sqrt(1-(a.x).^2),...
a.d2 ./ sqrt(1-(a.x).^2), a.d3 ./ sqrt(1-(a.x).^2)...
+ a.d1 .* a.d2 .* (a.x ./ (1-(a.x).^2).^(3/2)));
end
function obj = atan(a)
obj = HyperDual(atan(a.x), a.d1 ./ (1+(a.x).^2), a.d2 ./ (1+(a.x).^2),...
a.d3 ./ (1+(a.x).^2) + a.d1 .* a.d2 .*...
(-2 .* a.x ./ (1+(a.x).^2).^2));
end
% Hyperbolic trig functions
function obj = sinh(a)
obj = (exp(a)-exp(-a)) ./ 2;
end
function obj = cosh(a)
obj = (exp(a)+exp(-a)) ./ 2;
end
function obj = tanh(a)
obj = (exp(a)-exp(-a)) ./ (exp(a)+exp(-a)) ;
end
function obj = asinh(a)
obj = log(a + sqrt(a.^2 + 1));
end
function obj = acosh(a)
obj = log(a + sqrt(a.^2 - 1));
end
function obj = atanh(a)
obj = log((sqrt(1 - a.^2)) ./ (1 - a));
end
end
end

The function ishyperdual.m is the following:

function b = ishyperdual(a)
% ISHYPERDUAL Return true if a is of class HyperDual,
% else return false
b = strcmp(class(a),'HyperDual');
end
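As with the dual numbers, converting an existing real-valued routine only
requires changing the type of its inputs. The following sketch (illustrative
function and variable names, not part of the original listing) builds the
gradient and Hessian of a scalar function of n variables, with one hyper-dual
evaluation per (i,j) pair of inputs:

f  = @(v) exp(v(1)) .* sin(v(2)) + v(1).^2 .* v(2);   % unchanged real-valued code
x0 = [0.8; -0.4];   n = numel(x0);
g  = zeros(n,1);    H = zeros(n,n);
for i = 1:n
    for j = 1:n
        e1 = zeros(n,1); e1(i) = 1;                   % seed eps1 along direction i
        e2 = zeros(n,1); e2(j) = 1;                   % seed eps2 along direction j
        fh = f(HyperDual(x0, e1, e2, zeros(n,1)));
        g(i)   = fh.d1;                               % df/dx_i (same value for every j)
        H(i,j) = fh.d3;                               % d2f/dx_i dx_j
    end
end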

Bibliography

[1] John T. Betts: Practical Methods for Optimal Control and Estimation Using
Nonlinear Programming, SIAM - Society for Industrial and Applied Mathematics,
Philadelphia, Second Edition, 2010.

[2] John H. Mathews and Kurtis K. Fink: Numerical Methods Using Matlab,
Pearson, New Jersey, Fourth Edition, 2004.

[3] http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/central-differences/

[4] Joaquim R. R. A. Martins, Peter Sturdza and Juan J. Alonso: The
Complex-Step Derivative Approximation, ACM Transactions on Mathematical
Software, Vol. 29, No. 3, September 2003.

[5] M. Sagliano, S. Theil: Hybrid Jacobian Computation for Fast Optimal
Trajectories Generation, AIAA Guidance, Navigation, and Control (GNC)
Conference, August 19-22, 2013, Boston, MA.

[6] J.-J. E. Slotine, J. K. Hedrick, E. A. Misawa: On Sliding Observers for
Nonlinear Systems, Journal of Dynamic Systems, Measurement, and Control,
September 1987, Vol. 109/245.

[7] Y. Shtessel, C. Edwards, L. Fridman, A. Levant: Sliding Mode Control and
Observation, Springer, New York Heidelberg Dordrecht London, 2014.

[8] A. V. Rao: A Survey of Numerical Methods for Optimal Control, AAS/AIAA
Astrodynamics Specialist Conference, AAS Paper 09-334.

[9] A. L. Herman and B. A. Conway: Direct Optimization Using Collocation Based
on High-Order Gauss-Lobatto Quadrature Rules, Journal of Guidance, Control,
and Dynamics, Vol. 19, No. 3, May-June 1996.

[10] M. Sagliano: Performance analysis of linear and nonlinear techniques for
automatic scaling of discretized control problems, Operations Research
Letters, Volume 42, Issue 3, May 2014, pp. 213-216.

[11] Sagliano M., Samaan M., Theil S., Mooij E.: SHEFEX-3 Optimal Feedback
Entry Guidance, AIAA SPACE 2014 Conference and Exposition, AIAA 2014-4208,
San Diego, CA, 2014, doi:10.2514/6.2014-4208.

[12] Arslantas Y. E., Oehlschlagel T., Sagliano M., Theil S., Braxmaier C.:
Safe Landing Area Determination for a Moon Lander by Reachability Analysis,
17th International Conference on Hybrid Systems: Computation and Control
(HSCC), Berlin, Germany, 2014.

[13] Jeffrey A. Fike: Numerically Exact Derivative Calculation Using Fake
Numbers, Stanford University, Department of Aeronautics and Astronautics,
April 9, 2008.

[14] Jeffrey A. Fike, S. Jongsma, Juan J. Alonso, Edwin van der Weide:
Optimization with Gradient and Hessian Information Calculated Using
Hyper-Dual Numbers, AIAA Applied Aerodynamics Conference, 27-30 June 2011,
Honolulu, Hawaii.

[15] Jeffrey A. Fike, Juan J. Alonso: The Development of Hyper-Dual Numbers
for Exact Second-Derivative Calculations, 49th AIAA Aerospace Sciences
Meeting, 4-7 January 2011, Orlando, Florida.

[16] W. B. Vasantha Kandasamy, Florentin Smarandache: Dual Numbers, Zip
Publishing, Ohio, 2012.

[17] https://gist.github.com/chris-taylor/2005955