ACER Method in Extreme Value Statistics

This document is a preface to a book by Arvid Naess on applied extreme value statistics, focusing particularly on the ACER method. It discusses the challenges of applying asymptotic extreme value distributions to real-life data and presents the ACER method as a rational alternative for extreme value analysis. The book aims to provide a practical diagnostic tool for extreme value analysis while also introducing classical theories and methods relevant to the field.

Applied Extreme Value Statistics
With a Special Focus on the ACER Method

Arvid Naess
Faculty of Information Technology,
Mathematics and Electrical Engineering
Norwegian University of Science and
Technology
Trondheim, Norway

ISBN 978-3-031-60768-4 ISBN 978-3-031-60769-1 (eBook)

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

If disposing of this product, please recycle the paper.

To Dorothea and Vemund
Preface

This book grew out of many years of involvement with the practical applications
of extreme value analysis to measured or simulated data. This is a fascinating area
of research because of the fundamental dichotomy inherent in this problem area.
On the one hand, you have beautiful mathematical results for asymptotic extreme
value distributions. On the other hand, you have real-life data, which are hardly
asymptotic. So, the unavoidable question becomes: To what extent can you use
the asymptotic distributions to analyze real-life data? Personally, I have always felt
uncomfortable with the use of the parametric classes of asymptotic extreme value
distributions in applications. This was largely due to the fact that the justification
for applying them generally seemed dubious, and amazingly enough, the problem
of justification is rarely discussed at all in papers using asymptotic distributions
on real-life data. The problems of justification and other issues related to the
fundamental dichotomy are discussed in Chap. 1.
Of course, I was not the only one who disliked asymptotics for use on real-
life data. A consequence of this situation was that alternative procedures for
extreme value analysis were developed in several engineering disciplines. Some
of these alternative procedures were based on ideas similar to those developed in
Chap. 4. This chapter contains what was largely my world view on applied extreme
value statistics for quite some time, and to some extent, it still is. However, the
development of the ACER method, which is a central theme in this book, cf. Chap. 5,
allows for a much wider perspective. Its use in practice basically involves two
separate steps. The first step is based exclusively on the data and ends up with
a nonparametric representation of the extreme value distribution inherent in the
data. This is the crucial element of the ACER method. The second step consists
of an optimization procedure to fit a parametric function to the nonparametric
distribution. This step is necessary in order to be able to predict extremes larger
than those contained in the data, which is typically demanded by applications. To
develop a rational method of optimized fitting, the asymptotic extreme value theory
by necessity becomes an essential ingredient which guides the construction of the
parametric functions used in the optimization procedure. It is therefore necessary to
identify the correct asymptotic extreme value distribution, since the fitted extreme
value distribution by necessity must approach the relevant asymptotic form in the
limit.
Even if my own work in developing methods for use in applied extreme
value analysis more or less avoided direct use of the asymptotic extreme value
distributions, I have always clearly understood their importance as an unavoidable
foundation. The publication in 1983 of the important book Extremes and Related
Properties of Random Sequences and Processes by Leadbetter, Lindgren, and
Rootzén happened when my own interest in extreme value analysis more or less
started. I, therefore, read this book very carefully, and it gave me a very good grip
on the asymptotic extreme value theory. Of course, I also read parts of the seminal
book by E. J. Gumbel, published in 1958, which also has a focus on asymptotic
results. However, by the mid-1980s, that book, which was written in a pre-computer
era, appeared as more or less obsolete when compared to the book by Leadbetter
et al.
I have written this book, not because I want to convince people to abandon
the classical asymptotic approaches, which, unfortunately, too often in practice are
reduced to blindfolded curve fitting exercises to asymptotic parametric distributions
with no real analysis to back it up. No, I have written the book because I would
also like to show that it is now possible to make a more rationally based extreme
value analysis of observed data. I want to show that the ACER method very often
provides a unique practical diagnostic tool for a rational extreme value analysis. If,
as a result, asymptotic distributions turn out to be more or less acceptable, then their
use would at least have a reasonable justification.
It is also important to emphasize that the book is not a comprehensive treatment
of methods for applied extreme value analyses, but to a large extent a collection
of methods that I have personally worked with on and off over a period of three
decades, and which I have found to be relevant and useful. I have made an effort to
write the book as much as possible like an introduction to extreme value statistics
with emphasis on applications. Therefore, the book also contains introductory
chapters to the classical asymptotic theories and the threshold exceedance models,
as well as many illustrative examples. The mathematical level is elementary, and
detailed mathematical proofs have been avoided in favor of heuristic arguments to
increase readability. Hopefully, this makes the book useful and appealing to a large
audience of people representing a wide range of diverse applications.
Since the topic of this book is applied extreme value statistics, an inevitable
component to go along with it is access to computer programs for carrying out
the analysis of available data. For the methods based on the asymptotic results
described in Chaps. 2 and 3, there are several excellent programs easily available.
Specific recommendations are not given here. Whichever program is chosen, good
results can be obtained within the framework of asymptotic distributions. On the
other hand, the ACER method has not yet attained a comparable level of software
development. References to computer programs for univariate and bivariate analyses
by the ACER method have therefore been given in this book. These programs can
be freely downloaded.
Writing on the technical level necessary for this book requires a lot of attention to
details. It is in practice impossible to avoid errors and mistakes, poorly formulated
explanations, or misprints in initial versions of such a book. Fortunately, I have
some very good friends and colleagues who have helped me identify and correct
many such shortcomings, and for this, I am forever thankful. Any mistakes that
may still remain are entirely my own responsibility. The first group of people that
I would like to mention for their important contributions to improving the book
are Professors Bernt J. Leira, Bo H. Lindqvist, and Sverre Haver and Dr. Karl
W. Breitung. Previous collaborators and PhD students that have been important
in helping me in various ways are Professors Torgeir Moan, Oleg Gaidai, Sjur
Westgaard, Marc Maes, Nilanjan Saha, Wei Chai, and Arild Brandrud Næss; Drs.
Oleksandr Batsevich, Oleh Karpa, Ali Cetin, Hans K. Karlsen, and Kai Erik Dahlen;
and Morten Skjong. I am also very grateful to my many good students that I have
had the pleasure of working with over the years, who have also inevitably been part
of my own never-ending education as a researcher.

Trondheim, Norway Arvid Naess

March, 2023
Contents

1 Challenges of Applied Extreme Value Statistics
1.1 Introduction
1.2 A Brief Summary of Status, Problems and Challenges
2 Classical Extreme Value Theory
2.1 Introduction
2.2 The Asymptotic Limits of Extreme Value Distributions
2.3 The Block Maxima Method
2.4 Outline Proof of the Extremal Types Theorem
2.5 Domains of Attraction for the Extreme Value Distributions
2.6 Parameter Estimation for the GEV Distributions
2.6.1 Estimation by the Method of Moments
2.6.2 Maximum Likelihood Estimation
2.7 Model Validation
2.8 Estimating Confidence Intervals by Bootstrapping
2.9 The Asymptotic Extreme Value Distributions for Dependent Sequences
3 The Peaks-Over-Threshold Method
3.1 Introduction
3.2 The Peaks-Over-Threshold Method
3.3 Threshold Selection
3.4 Return Periods
3.5 Parameter Estimation for the GP Distributions
3.5.1 Dekkers–Einmahl–de Haan Estimators
3.5.2 Moment Estimators
3.5.3 Maximum Likelihood Estimators
3.6 Model Validation
3.7 Estimating Confidence Intervals by Bootstrapping
4 A Point Process Approach to Extreme Value Statistics
4.1 Introduction
4.2 Average Rate of Level Crossings
4.3 Distribution of Peaks of a Narrow Banded Process
4.4 Average Upcrossing Rate and Distribution of Peaks of a Gaussian Process
4.5 Extreme Value Distributions by the Upcrossing Rate Method
4.6 Extreme Values of Gaussian Processes
4.7 The Crossing Rate of Transformed Processes
4.8 Hermite Moment Models
4.9 Return Period
4.10 Long-Term Extreme Value Distributions
4.10.1 All Peak Values
4.10.2 All Short-Term Extremes
4.10.3 The Long-Term Extreme Value
4.10.4 Simplified Methods
Appendix
5 The ACER Method
5.1 Introduction
5.2 A Sequence of Conditioning Approximations
5.3 Empirical Estimation of the Average Conditional Exceedance Rates
5.4 Long-Term Extreme Value Analysis by the ACER Method
5.5 Estimation of Extremes for the Asymptotic Gumbel Case
5.6 Estimation of Extremes for the General Case
6 Some Practical Aspects of Extreme Value Analyses
6.1 Introduction
6.2 Extreme Value Prediction for Synthetic Data
6.3 Measured Wind Speed Data
6.4 Extreme Value Prediction for a Narrow-Band Process
7 Estimation of Extreme Values for Financial Risk Assessment
7.1 Introduction
7.2 Value-at-Risk
7.3 Application to Simulated Time Series of Electricity Prices
7.4 Electricity Price Data
7.5 Conditional Approach
7.6 Results
7.6.1 Unconditional Approach
7.6.2 Conditional Approach
8 The Upcrossing Rate via the Characteristic Function
8.1 Introduction
8.2 The Response Process
8.3 The Average Crossing Rate
8.4 Numerical Calculation
8.5 Numerical Examples
8.5.1 Slow-Drift Response
8.5.2 Moored Deep Floater
8.5.3 Wind Excited Structure
Appendix 1: The Average Crossing Rate
Appendix 2: The Characteristic Function
9 Monte Carlo Methods and Extreme Value Estimation
9.1 Introduction
9.2 Simulation of Stationary Stochastic Processes
9.2.1 Realizations of Stochastic Processes
9.2.2 Variance Spectra
9.2.3 Units of Variance Spectra
9.2.4 Example: A Realization of a Wave Process
9.2.5 The Variance Spectrum Directly from the Realizations
9.3 Monte Carlo Simulation of Load and Response
9.4 Sample Statistics of Simulated Response
9.5 Latin Hypercube Sampling
9.6 Estimation of Extreme Response
9.6.1 The Gumbel Method
9.6.2 The Point Process Method
9.6.3 A Comparison of Methods
9.6.4 Combination of Multiple Stochastic Load Effects
9.6.5 Total Surge Response of a TLP
10 Bivariate Extreme Value Distributions
10.1 Introduction
10.2 Componentwise Extremes
10.3 Bivariate ACER Functions
10.4 Functional Representation of the Empirically Estimated Bivariate ACER Functions
10.5 Numerical Examples
10.5.1 Wind Speed Measured at Two Adjacent Weather Stations
10.5.2 Wind Speed and Wave Height Measured at a North Sea Weather Station
Appendix 1: The Sequence of Conditioning Approximations
Appendix 2: Empirical Estimation of the Bivariate ACER Functions
11 Space–Time Extremes of Random Fields
11.1 Introduction
11.2 Spatial–Temporal Extremes for Gaussian Random Fields
11.3 A Simplified Approach
11.4 Spatial–Temporal Extremes for Non-Gaussian Random Fields
11.5 Empirical Estimation of the Mean Upcrossing Rate
11.6 Numerical Examples for Gaussian Random Fields
11.6.1 1+1-Dimensional Gaussian Field
11.6.2 1+1-Dimensional Gaussian Sea
11.6.3 A Short-Crested Gaussian Sea
11.7 Numerical Examples for Non-Gaussian Random Fields
11.7.1 A Second-Order Wave Field
11.7.2 A Student’s t Random Field
11.8 Comments
12 A Case Study—Extreme Water Levels
12.1 Introduction
12.2 Data Sets
12.2.1 Oslo
12.2.2 Heimsjø
12.2.3 Honningsvåg
12.3 Annual Maxima Method
12.3.1 Application to Water Level Measurements
12.4 The Peaks-Over-Threshold Method
12.4.1 Application to Water Level Measurements
12.5 Revised Joint Probabilities Method
12.5.1 Estimating Return Levels with the RJP Method
12.5.2 Application to Water Level Measurements
12.6 The ACER Method
12.6.1 Application to Water Level Measurements
12.7 Discussion of Results
12.7.1 Oslo
12.7.2 Heimsjø
12.7.3 Honningsvåg
12.7.4 Comments

References
Index

Chapter 1
Challenges of Applied Extreme Value Statistics

1.1 Introduction

This book provides an introduction to the calculation of extreme value statistics
for measured or simulated data. “Extreme” here means “the largest”, interpreted in
a way that follows from the context. As opposed to books on asymptotic extreme
value statistics, the focus is also on methods specifically developed to work for
real-life data. A consequence of this is that the book contains far fewer theoretical
issues about the asymptotic properties of extreme value statistics than is usual.
However, the most important elements of asymptotic extreme value statistics
will be discussed, since they are still widely used in practical applications.
Although two of the asymptotic methods described in this book have been
used extensively over several decades for prediction of the extreme value statistics
of many natural phenomena, the prerequisites for their application are often not
satisfied, and in some cases, not even approximately. Under such circumstances,
there would appear to be a problem. It is this situation that will be highlighted in
this chapter.

1.2 A Brief Summary of Status, Problems and Challenges

Statistical distributions of the extreme values of large samples of data were derived
almost one hundred years ago by Fisher and Tippett (1928), cf. also Fréchet (1927);
Gnedenko (1943); de Haan (1970). The main prerequisite for the existence of the
derived results was that the data could be considered as outcomes of independent
and identically distributed random variables. As it turned out, in non-degenerate
cases there are only three possible types of limiting extreme value distributions
with increasing amounts of data. This means that the results are asymptotic, as the
technical term goes. On the positive side, the fact that we know explicitly what the
possible distributions look like, even if only in the limit of large samples increasing
indefinitely, is very satisfactory. And there are criteria that can tell us which type of
distribution applies if the underlying distribution of the data is known (Leadbetter
et al. 1983). However, on the negative side, it is not possible to know to what extent
one of the three types of limiting distributions actually applies to a real-life case with
only a limited amount of data, even though there may be reasons to expect that the
true extreme value distribution should not deviate too much from one of the limiting
forms. Unfortunately, there are no useful convergence results that are precise enough
to really help us decide quantitatively on this issue. Still, the common practice has
been to assume an appropriate limiting form as the extreme value distribution to use.
This can easily be understood from the simple fact that the limiting distributions are
known explicitly, while the exact extreme value distributions inherent in the data
are largely unknown. The procedure to identify the appropriate limiting distribution
is to optimize the fit of the extreme values derived from the observed data to the
asymptotic forms. Typically, the extreme values from the data are taken as the
maxima of specified blocks of data, e.g. annual maxima.
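As a minimal illustration of the block maxima extraction just described, the following Python sketch splits a series into consecutive blocks of equal length and keeps the largest value of each block. The function name block_maxima, the block length of 365 values, and the synthetic Gaussian data are assumptions made for this illustration only; they are not taken from the book.

```python
import random


def block_maxima(series, block_size):
    """Return the maximum of each complete block of `block_size` values."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_blocks = len(series) // block_size  # an incomplete trailing block is dropped
    return [
        max(series[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical data: 25 "years" of 365 daily values each.
    daily = [rng.gauss(0.0, 1.0) for _ in range(25 * 365)]
    annual_max = block_maxima(daily, 365)
    print(len(annual_max), max(annual_max))
```

With 25 years of data this yields only 25 extreme values, which is the sample that would then be fitted to an asymptotic form.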
The three asymptotic types of extreme value distributions are essentially characterized
by the value of one parameter, γ say, called the shape parameter. As will be
seen later, the most important case for us in this book is when γ = 0. This is called
a Type I, or Gumbel, distribution. For positive values of γ, Type II, or Fréchet,
distributions are obtained, while for negative values of γ, the distributions are of
Type III, or Weibull (for maxima). As it turns out, all three distribution types may
be expressed in terms of one parametric form called the generalized extreme value
(GEV) distribution. A standard recommendation is then to use the GEV parametric
form for optimized fitting of the obtained extreme value sample. There
is, however, one serious flaw with this procedure. The extreme value sample, being
extracted from limited amounts of data, is hardly a sample from an asymptotic
distribution. Hence, one cannot expect that the estimated parameters will point to
the correct asymptotic distribution. This is an issue of importance for extrapolation
to out-of-sample long return period levels. For example, a practical task may be
to say something about a 100-year return period level on the basis of 25 years of
measured data. Then the correct asymptotic distribution is of paramount importance
because the different types of extreme value distribution may lead to quite different
extrapolation results. An additional issue is, of course, that limited amounts
of data entail considerable uncertainty in the estimated quantities. It may, in fact,
happen that the estimated value of γ is slightly negative, pointing to a Type III
distribution, but with the confidence interval accounted for, γ = 0 or even
γ > 0 are also possible candidates for the value of γ. Hence, all three types of extreme
value distributions seem to be possible alternatives in such a case. Since these
asymptotic distributions have very different behaviour when extrapolated to high
quantile values, a more careful analysis of the situation is often needed to decide
which asymptotic distribution to apply.
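To make the sensitivity of the extrapolation to the shape parameter concrete, the sketch below evaluates the return level of a GEV model for a slightly negative, a zero, and a slightly positive value of γ. It assumes the standard GEV parametrization with a location parameter mu and a scale parameter sigma, symbols that are not introduced in this section, and it assumes annual maxima, so that the T-year return level is the level with non-exceedance probability 1 - 1/T. The parameter values are hypothetical.

```python
import math


def gev_return_level(T, mu, sigma, gamma):
    """Level exceeded on average once per T blocks under a GEV(mu, sigma, gamma) model.

    Sign convention as in the text: gamma > 0 is Type II (Frechet),
    gamma = 0 is Type I (Gumbel), gamma < 0 is Type III (Weibull).
    Assumed GEV form: G(x) = exp(-(1 + gamma*(x - mu)/sigma)**(-1/gamma)).
    """
    if T <= 1 or sigma <= 0:
        raise ValueError("need T > 1 and sigma > 0")
    y = -math.log(1.0 - 1.0 / T)  # y = -ln(p), p = non-exceedance probability
    if gamma == 0.0:
        return mu - sigma * math.log(y)  # Gumbel case
    # (y**(-gamma) - 1) / gamma, written with expm1 for accuracy near gamma = 0
    return mu + sigma * math.expm1(-gamma * math.log(y)) / gamma


if __name__ == "__main__":
    # Hypothetical location and scale; only the shape parameter is varied.
    for gamma in (-0.1, 0.0, 0.1):
        print(gamma, round(gev_return_level(100.0, 0.0, 1.0, gamma), 3))
```

Even for shape values this close to zero, the 100-year levels are clearly separated, which is why an estimate of γ whose confidence interval covers both signs leaves the extrapolation poorly determined.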
The peaks-over-threshold (POT) method for extreme value analysis will be
discussed to some extent in this book. This method is also based on asymptotics.
The data extracted for its use are the exceedances above high thresholds. Asymptotically,
these data are assumed to follow a generalized Pareto (GP) distribution, which
is then the equivalent of the GEV for the block maxima method. The POT method
also has three classes of distributions, again characterized by the γ parameter. For
example, the singular case γ = 0 corresponds to the exponential distribution. The
POT method is rather popular, mainly because it uses more of the data for inference.
Unfortunately, it has certain deficiencies, which will be highlighted in this book.
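The two ingredients of the POT method mentioned above can be sketched as follows: the extraction of exceedances above a threshold, and the GP model for their size. The code assumes the standard GP survival function with a scale parameter sigma, a symbol not introduced in this section, and it deliberately ignores any declustering of dependent exceedances. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def exceedances(series, threshold):
    """Return the excesses x - threshold for all values strictly above the threshold."""
    return [x - threshold for x in series if x > threshold]


def gp_survival(y, sigma, gamma):
    """P(excess > y) under a generalized Pareto model with scale sigma and shape gamma."""
    if y < 0 or sigma <= 0:
        raise ValueError("need y >= 0 and sigma > 0")
    if gamma == 0.0:
        return math.exp(-y / sigma)  # the exponential case gamma = 0
    base = 1.0 + gamma * y / sigma
    if base <= 0.0:
        return 0.0  # beyond the finite upper end point that exists when gamma < 0
    return base ** (-1.0 / gamma)


if __name__ == "__main__":
    data = [1.0, 3.5, 2.0, 4.0, 0.5]
    print(exceedances(data, 2.0))
    for gamma in (-0.1, 0.0, 0.1):
        print(gamma, gp_survival(3.0, 1.0, gamma))
```

Lowering the threshold increases the number of exceedances available for inference, but it also moves the data further from the regime in which the GP form is expected to hold.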
There is an important and interesting observation to be made at this initial stage
of our exposition of extreme value statistics. As has already been stated, for all negative
values of the shape parameter $\gamma$, the Type III class of extreme value distributions
applies, while for all positive values of $\gamma$, it is the Type II class of extreme value
distributions that is obtained. This would seem to indicate that there are two huge
classes of extreme value distributions that would tend to make the singular case
$\gamma = 0$ a rather special and maybe uninteresting case. The fact of the matter is
quite the opposite. For almost all environmental processes that will be dealt with in
this book, it is the Gumbel distribution that has prevailed as the correct asymptotic
extreme value distribution. Over the years there have been some suggestions in favour of the
other types as well, but these have almost all been finally rejected in the face of
overwhelming evidence for Type I distributions. Of course, it is impossible to fully
answer the fundamental question: To what extent do our statistical models apply to
real-life data? So far, it seems that these statistical methods work rather well
on such data, although being overconfident in them is perhaps an unwarranted
position to take.
One important reason that the singular asymptotic Gumbel case is so important
in practice, is that from the perspective of a sub-asymptotic world, the picture
of the size of the extreme value distribution classes looks very different. When
only limited amounts of data are available, the asymptotic limiting distributions,
strictly speaking, do not apply, except in very special cases. Hence, we are in a sub-
asymptotic situation. As will be seen in large parts of this book, there is a huge class
of extreme value distributions that apply to a range of different problems, which all
end up at the Gumbel distribution asymptotically. So, the apparent singularity of the
Gumbel case is an artefact of the asymptotic limiting process, and does not reflect
the situation in the sub-asymptotic regime.
In an effort to resolve the inconsistency between real-life data and asymptotic
distributions, a new method has been developed that is based on the concept of the
average conditional exceedance rate (ACER). The method proceeds by establishing
a cascade of empirical, non-parametric distribution functions that converge to the
extreme value distribution inherent in the data. An advantage of the method
is that no assumptions about independent and stationary data have to be made.
For example, seasonal variations of the data do not require special modelling.
The method also has a unique diagnostic feature in how it displays the effect of
dependence between the data on the extreme value distribution. This may be of
significance for the choice of which data can be included in the analysis. The ACER
method will be discussed in detail in this book.

Whatever method of extreme value statistics is chosen for the analysis of the
available data, the goal is almost always to predict extreme values with return
periods larger, and often much larger, than the period of data collection. This
inevitably requires extrapolation techniques to be used. The seemingly stochastic
mechanism generating the sampled data is often sufficiently well understood to
support the assumption of the validity of extrapolation. Unfortunately, this may not
always be the situation. Ideally, in such cases, the predicted extreme values obtained
by extrapolation should then be accompanied by a cautionary note. However, this
is rarely done, simply because more credible alternatives for the prediction process
are not available.
The extrapolation procedure is, in general, based on obtaining estimates of the
parameters that determine the extreme value distribution type adopted for the data at
hand. If an asymptotic approach is used, the GEV distribution is often preferred for
parameter estimation in the case of the block maxima method, or the GP distribution
for the POT method. Since these are parametrized forms covering all three types of
asymptotic extreme value distributions, it is often recommended to use these forms,
allowing the data to determine which type of extreme value distribution to use. As
already mentioned, such a procedure may not always be a good idea. Also for the
ACER method, a parametrized family of functions is proposed for the purpose of
extrapolation, which is tailored to reflect the sub-asymptotic character of the data.
The parameter estimates calculated for the examples in this book are based on
either the method of moments or the maximum likelihood method in the case of the
GEV or the GP distributions. For the ACER method, the optimized fitting is obtained
by using a Levenberg-Marquardt method on an objective function expressed as a
weighted mean square deviation measure between the empirical and the proposed
parametric ACER functions on the log level. Uncertainty quantification is also a very
important aspect of any statistical inference. In this book, the use of bootstrapping
will serve to illustrate this issue, since it has some attractive properties.
Chapter 2
Classical Extreme Value Theory

2.1 Introduction

Classical extreme value statistics is concerned with the distributional properties of

the maximum of a number of independent and identically distributed (iid) random
variables when the number of variables becomes large. A partial result was obtained
by Fréchet (1927), while Fisher and Tippett (1928) discovered that there are three
types of possible limiting or asymptotic distributions, which are now contained
in the Extremal Types Theorem, which is discussed in the next section. These
three asymptotic distributions are typically referred to as the Gumbel, Fréchet, and
Weibull distributions. It is also common practice to refer to them as Type I, Type
II, and Type III, in the same order. Important contributions to this theory were later
made by Gnedenko (1943), Gumbel (1958), and de Haan (1970).

2.2 The Asymptotic Limits of Extreme Value Distributions

The classical extreme value theory starts by looking at a sequence of independent

and identically distributed (iid) random variables $X_1, X_2, \ldots$ with common distribution
function $F_X(x)$. The extreme value of a finite number $X_1, \ldots, X_n$ is then
$M_n = \max\{X_1, \ldots, X_n\}$. The distribution of $M_n$ can be easily derived as

$$
F_{M_n}(x) = \mathrm{Prob}(M_n \le x) = \mathrm{Prob}(X_1 \le x, \ldots, X_n \le x)
= \mathrm{Prob}(X_1 \le x) \cdots \mathrm{Prob}(X_n \le x) = \big(F_X(x)\big)^n . \tag{2.1}
$$
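As a small numerical illustration of Eq. (2.1), the following sketch compares $\big(F_X(x)\big)^n$ with a Monte Carlo estimate for unit exponential variables, and shows how a small change in $F_X(x)$ is amplified by the power n. The exponential parent, the sample sizes, and all function names are assumptions made for this example only.

```python
import math
import random


def max_cdf(f_x: float, n: int) -> float:
    """Eq. (2.1): Prob(M_n <= x) = F_X(x)**n for n iid variables."""
    return f_x ** n


def empirical_max_cdf(x: float, n: int, trials: int = 10000, seed: int = 1) -> float:
    """Monte Carlo estimate of Prob(M_n <= x) for unit exponential variables."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if max(rng.expovariate(1.0) for _ in range(n)) <= x:
            hits += 1
    return hits / trials


n, x = 50, 4.0
exact = max_cdf(1.0 - math.exp(-x), n)   # exponential cdf raised to the power n
simulated = empirical_max_cdf(x, n)
print(f"exact = {exact:.4f}, simulated = {simulated:.4f}")

# Sensitivity: a change of 0.001 in F_X(x) changes F_X(x)**1000 substantially.
print(max_cdf(0.999, 1000), max_cdf(0.998, 1000))
```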

This relation is not very helpful in practice, because in most cases the distribution
function $F_X(x)$ is not known exactly. Therefore, it would have to be estimated from
recorded data. However, small discrepancies in the estimates of $F_X(x)$ can lead


to substantial discrepancies, in a relative sense, in the values of $\big(F_X(x)\big)^n$ for large
values of n. In classical extreme value theory, one proceeds by studying the behavior
of $\big(F_X(x)\big)^n$ as $n \to \infty$, but with a twist. Obviously, for any x such that $F_X(x) < 1$,
$\big(F_X(x)\big)^n \to 0$ as $n \to \infty$. This necessitates a rescaling. Specifically, instead of
studying $M_n$, one introduces a renormalized version of $M_n$:

$$
M_n^* = \frac{M_n - b_n}{a_n} \tag{2.2}
$$

for suitable sequences of constants $a_n > 0$ and $b_n$ that are chosen to stabilize the
location and scale of $M_n^*$ as $n \to \infty$. It is then proven that there are, in fact,
only three types of limiting distributions for this renormalized $M_n^*$. This is the
famous Extremal Types Theorem (Leadbetter et al. 1983), which can be expressed
as follows.
If there exist sequences of constants $a_n > 0$ and $b_n$ such that

$$
\mathrm{Prob}\left( \frac{M_n - b_n}{a_n} \le x \right) \to G(x) , \quad n \to \infty, \tag{2.3}
$$

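where $G(x)$ is a nondegenerate distribution function, then $G(x)$ belongs to one of
the following three families:

$$
\text{I:} \quad G(x) = \exp\left\{ -\exp\left[ -\left( \frac{x - b}{a} \right) \right] \right\}, \quad -\infty < x < \infty ; \tag{2.4}
$$

$$
\text{II:} \quad G(x) = \begin{cases} 0 , & x \le b, \\ \exp\left\{ -\left( \dfrac{x - b}{a} \right)^{-c} \right\} , & x > b ; \end{cases} \tag{2.5}
$$

$$
\text{III:} \quad G(x) = \begin{cases} \exp\left\{ -\left( \dfrac{b - x}{a} \right)^{c} \right\} , & x < b, \\ 1 , & x \ge b ; \end{cases} \tag{2.6}
$$

for parameters $a > 0$, $b$ and, for families II and III, $c > 0$.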

These three types of extreme value distributions are also commonly referred to
as Gumbel, Fréchet, and Weibull, respectively. Note that the Weibull distribution
given here is not the same as the commonly known Weibull distribution, which
corresponds to the type III extreme value distribution for minima. Also, carefully
note that even if the Weibull distribution is the only type of extreme value
distribution with a finite upper limit on its values, this does not mean that extremes of
bounded data must follow this distribution. For such data, it may very well happen that
the rescaling constant $a_n \to 0$ as n increases. Hence, even the Gumbel distribution
may be the appropriate asymptotic limit for the extreme values of bounded data.
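A concrete instance of the limit (2.3) may help here. For unit exponential variables, the choice $a_n = 1$, $b_n = \log n$ gives $\mathrm{Prob}(M_n - \log n \le x) = (1 - e^{-x}/n)^n$, which tends to the Type I form $\exp(-e^{-x})$. The sketch below, in which the parent distribution, the grid of x values, and the function names are illustrative assumptions, evaluates the largest deviation from the Gumbel limit for increasing n.

```python
import math


def renormalized_max_cdf(x: float, n: int) -> float:
    """Prob((M_n - b_n)/a_n <= x) for unit exponentials with a_n = 1, b_n = log n."""
    y = x + math.log(n)
    if y <= 0.0:
        return 0.0
    return (1.0 - math.exp(-y)) ** n


def gumbel_cdf(x: float) -> float:
    """Standard Type I limit, Eq. (2.4) with a = 1, b = 0."""
    return math.exp(-math.exp(-x))


def max_abs_error(n: int) -> float:
    """Largest deviation from the Gumbel limit on a grid over [-3, 6]."""
    grid = [i / 10.0 for i in range(-30, 61)]
    return max(abs(renormalized_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)


for n in (10, 100, 10000):
    print(n, max_abs_error(n))
```

The deviation shrinks as n grows, but only at a rate of roughly 1/n in this example, which is in line with the remark that real data are never truly asymptotic.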

It may be verified that it is, in fact, possible to express all three types of extreme
value distributions in a common form, which is known as the generalized extreme
value (GEV) distribution. This is achieved as follows:

$$
G(x) = G(x; \mu, \sigma, \gamma) = \exp\left\{ -\left[ 1 + \gamma \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\gamma} \right\}, \tag{2.7}
$$

defined on the set $\{x : 1 + \gamma (x - \mu)/\sigma > 0\}$, where the parameters satisfy
$-\infty < \mu < \infty$, $\sigma > 0$, $-\infty < \gamma < \infty$. This distribution has three parameters:
a location parameter $\mu$, a scale parameter $\sigma$, and a shape parameter $\gamma$. The type II
distributions correspond to $\gamma > 0$, while type III corresponds to $\gamma < 0$. The case
$\gamma = 0$ must be interpreted as a limiting case when $\gamma \to 0$, which leads to the
Gumbel distribution:

$$
G(x) = \exp\left\{ -\exp\left[ -\left( \frac{x - \mu}{\sigma} \right) \right] \right\}, \quad -\infty < x < \infty. \tag{2.8}
$$
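The unified form (2.7) together with its limiting case (2.8) can be coded as a single function. The following is a minimal sketch; the function name, the tolerance used to switch to the Gumbel branch, and the handling of points outside the support are choices made for this example.

```python
import math


def gev_cdf(x: float, mu: float, sigma: float, gamma: float, tol: float = 1e-9) -> float:
    """GEV distribution function, Eq. (2.7), with the Gumbel limit (2.8) for gamma near 0."""
    s = (x - mu) / sigma
    if abs(gamma) < tol:
        return math.exp(-math.exp(-s))          # Gumbel case, Eq. (2.8)
    t = 1.0 + gamma * s
    if t <= 0.0:
        # Outside the support: below the lower bound (gamma > 0) the cdf is 0,
        # above the upper bound (gamma < 0) it is 1.
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


# The three types evaluated at the same point, with hypothetical parameters.
for shape in (-0.2, 0.0, 0.2):
    print(shape, gev_cdf(2.0, mu=0.0, sigma=1.0, gamma=shape))
```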

The statistical moments of the GEV distributions can now be calculated based on the
explicit formulas of Eqs. (2.7) and (2.8). Denoting the random variable determined
by a GEV distribution by M, its first two moments are

$$
E(M) = \mu + \frac{\sigma}{\gamma} (e_1 - 1) , \quad \gamma \ne 0, \ \gamma < 1 \tag{2.9}
$$

and

$$
\mathrm{Var}(M) = \frac{\sigma^2}{\gamma^2} \left( e_2 - e_1^2 \right) , \quad \gamma \ne 0, \ \gamma < 1/2, \tag{2.10}
$$

where $e_k = \Gamma(1 - k\gamma)$, $k = 1, 2$, and $\Gamma(\cdot)$ is the gamma function. For $\gamma \ge 1$,
$E(M) = \infty$, while $E(M) = \mu + \lambda_E \sigma$ when $\gamma = 0$, that is, for the Gumbel case.
Here, $\lambda_E = 0.5772\ldots$ denotes Euler's constant. For $\gamma \ge 1/2$, $\mathrm{Var}(M) = \infty$, while
$\mathrm{Var}(M) = \sigma^2 \pi^2 / 6$ when $\gamma = 0$.
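The moment formulas (2.9) and (2.10), including the Gumbel and infinite-moment cases, translate directly into code. The sketch below uses only `math.gamma` from the Python standard library; the function names are hypothetical.

```python
import math

EULER_CONSTANT = 0.5772156649015329  # lambda_E in the text


def gev_mean(mu: float, sigma: float, gamma: float) -> float:
    """E(M) from Eq. (2.9), with the Gumbel case and the case gamma >= 1."""
    if gamma >= 1.0:
        return math.inf
    if gamma == 0.0:
        return mu + EULER_CONSTANT * sigma
    e1 = math.gamma(1.0 - gamma)
    return mu + (sigma / gamma) * (e1 - 1.0)


def gev_variance(sigma: float, gamma: float) -> float:
    """Var(M) from Eq. (2.10), with the Gumbel case and the case gamma >= 1/2."""
    if gamma >= 0.5:
        return math.inf
    if gamma == 0.0:
        return sigma ** 2 * math.pi ** 2 / 6.0
    e1 = math.gamma(1.0 - gamma)
    e2 = math.gamma(1.0 - 2.0 * gamma)
    return (sigma ** 2 / gamma ** 2) * (e2 - e1 ** 2)


# For small gamma the values approach the Gumbel results.
print(gev_mean(0.0, 1.0, 1e-4), gev_mean(0.0, 1.0, 0.0))
print(gev_variance(1.0, 1e-3), gev_variance(1.0, 0.0))
```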

For statistical inference on experimental data, the unified form expressed by

Eq. (2.7) has the advantage that the data themselves determine which type of
distribution is appropriate, thereby avoiding a prior subjective judgment about any
specific tail behavior. The uncertainty in the estimated value of .γ is also a reflection
of the uncertainty about the correct distribution for the data. Unfortunately, in
practice, it may very well happen that the uncertainty in .γ may cover all three
types of extreme value distribution, which would necessitate a more careful analysis
of the data. Also note that the data used for estimation purposes are never truly
asymptotic, thereby introducing additional uncertainty when trying to identify the
correct asymptotic distribution. Since the results of extrapolation to determine long
return period design values may depend very much on the asymptotic extreme value
distribution used, identifying the correct one is clearly important in such cases.

2.3 The Block Maxima Method

In practical application of the GEV distributions to a long time series of observed

data, it is assumed that the maximum observation of a reasonably large chunk of the
time series follows a GEV distribution. This is recognized by observing that from
(2.3) we would assume that for large n,
$$
\mathrm{Prob}\left( \frac{M_n - b_n}{a_n} \le x \right) \approx G(x) . \tag{2.11}
$$

But this may be rewritten as ($y = a_n x + b_n$)

$$
\mathrm{Prob}\left( M_n \le y \right) \approx G\left( \frac{y - b_n}{a_n} \right) = G^*(y) , \tag{2.12}
$$

where $G^*$ is also a member of the GEV family of distributions. Hence, if the main
theorem applies, that is, by (2.3), $M_n^* = (M_n - b_n)/a_n$ approximately follows a
GEV distribution, then $M_n$ itself will approximately follow a GEV distribution, but
with different parameters. Anyway, in practice, it is the parameters of $G^*$ that would
be of most interest.
This leads to the following approach, which is often referred to as the block
maxima method. Assume that a sequence of independent observations $x_1, x_2, \ldots$
from a stationary time series is long enough to allow segmenting it into blocks
of data of length n, for some large value of n, generating a series of m block
maxima, $M_{n,1}, \ldots, M_{n,m}$, say, to which a GEV distribution is tentatively fitted. A
typical application of the block maxima method would be to yearly extreme value
observations of an environmental parameter, e.g., wind speed. In such a case, it is
also often referred to as an annual maxima method. There is a practical argument
behind extracting the maximum over the period of 1 year, because by choosing
shorter periods, the assumption that the sampled maxima are outcomes of a common
distribution would easily be violated due to seasonal variations. Still, of course, the
underlying assumption that the block maxima are extracted from a set of iid random
variables is clearly violated. Fortunately, by experience, this does not seem to pose
a serious obstacle to the practical use of the block maxima method.
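The segmentation step of the block maxima method can be sketched as follows. The function name is hypothetical, and dropping an incomplete final block is an assumption of this example, not a rule stated in the text.

```python
def block_maxima(series, block_length):
    """Split a series into consecutive blocks of equal length and return each block's maximum.

    An incomplete trailing block is discarded (an assumption of this sketch).
    """
    if block_length <= 0:
        raise ValueError("block_length must be positive")
    n_blocks = len(series) // block_length
    return [
        max(series[i * block_length:(i + 1) * block_length])
        for i in range(n_blocks)
    ]


# Two complete blocks of length 3; the trailing value 9 is dropped.
print(block_maxima([1, 5, 2, 7, 3, 4, 9], 3))  # [5, 7]
```

For the annual maxima method, `block_length` would be the number of observations per year, so that each block covers one full seasonal cycle.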
A quantity of specific interest in applications is the return period level $x_p$, where
$G(x_p) = 1 - p$. For the annual maxima method, $x_p$ has a return period of $1/p$ years.
That is, $x_p$ would be exceeded on the average every $1/p$ years. Inverting (2.7), it is
found that for $\gamma \ne 0$,

$$
x_p = \mu - \frac{\sigma}{\gamma} \left[ 1 - \left\{ -\log(1 - p) \right\}^{-\gamma} \right] , \tag{2.13}
$$

while for $\gamma = 0$,

$$
x_p = \mu - \sigma \log\left\{ -\log(1 - p) \right\} . \tag{2.14}
$$

Coles (2001) discusses how to estimate confidence intervals on $x_p$ using profile
likelihood methods, which seem to provide reasonable accuracy. In this book the
focus is on the bootstrap method, cf. Sect. 2.8.
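The return level formulas (2.13) and (2.14) are easy to evaluate numerically. The sketch below computes the 100 year level ($p = 0.01$) for three values of the shape parameter to show how strongly the extrapolated value depends on the distribution type. The location and scale values are hypothetical and chosen only for illustration.

```python
import math


def gev_return_level(p: float, mu: float, sigma: float, gamma: float, tol: float = 1e-9) -> float:
    """Return period level x_p with G(x_p) = 1 - p, Eqs. (2.13) and (2.14)."""
    y = -math.log(1.0 - p)
    if abs(gamma) < tol:
        return mu - sigma * math.log(y)                    # Gumbel case, Eq. (2.14)
    return mu - (sigma / gamma) * (1.0 - y ** (-gamma))    # Eq. (2.13)


# Hypothetical annual maxima model: mu = 30, sigma = 4 (arbitrary units).
for shape in (-0.1, 0.0, 0.1):
    x_100 = gev_return_level(0.01, mu=30.0, sigma=4.0, gamma=shape)
    print(f"gamma = {shape:+.1f}: 100 year level = {x_100:.2f}")
```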

2.4 Outline Proof of the Extremal Types Theorem

The proof of the Extremal Types Theorem is not a very complicated proof, but it
is rather lengthy and technical (Leadbetter et al. 1983). Since it is not central to the
focus of this book, only a sketch will be given here to illustrate the main ingredients.
The concept of max-stability is needed. It is defined as follows:
A distribution G is called max-stable if, for every $m = 2, 3, \ldots$, there are
constants $\alpha_m > 0$ and $\beta_m$ such that

$$
G^m(\alpha_m x + \beta_m) = G(x) . \tag{2.15}
$$

$G^m$ is the distribution function of $M_m = \max\{Z_1, \ldots, Z_m\}$, where the $Z_i$ are iid
random variables with distribution function G. Therefore, max-stability is a property
satisfied by distributions that are invariant under the operation of taking sample
maxima, except for a change of scale and location. The following result brings
forward the connection between max-stability and extreme value distributions
(Leadbetter et al. 1983):
A distribution is max-stable if, and only if, it is a GEV distribution.
To check that a GEV distribution is max-stable is a straightforward exercise in
algebra. The converse is much harder. Anyway, this result can now be used to prove
the Extremal Types Theorem. Consider first $M_{nk} = \max\{X_1, \ldots, X_{nk}\}$ of a sample
of nk iid random variables $X_i$, for some large value of n. This large sample can be
divided into k subsamples of n variables in each. Hence, there will be k iid random
variables like $M_n = \max\{X_1, \ldots, X_n\}$. n is chosen large enough to claim that

$$
\mathrm{Prob}\left( \frac{M_n - b_n}{a_n} \le x \right) \approx G(x) , \tag{2.16}
$$

for suitable constants $a_n$ and $b_n$ and for the limiting distribution G. Hence, for any
integer $k \ge 2$, since $nk > n$,

$$
\mathrm{Prob}\left( \frac{M_{nk} - b_{nk}}{a_{nk}} \le x \right) \approx G(x) . \tag{2.17}
$$

Eq. (2.16) leads to $\mathrm{Prob}(M_n \le z) \approx G\big((z - b_n)/a_n\big)$, while Eq. (2.17) gives
$\mathrm{Prob}(M_{nk} \le z) \approx G\big((z - b_{nk})/a_{nk}\big)$. However, $M_{nk}$ is obviously the maximum
of k variables having the same distribution as $M_n$. But then,

$$
\mathrm{Prob}\left( M_{nk} \le z \right) = \big( \mathrm{Prob}\left( M_n \le z \right) \big)^k . \tag{2.18}
$$

From this, it is deduced that (in the limit)

$$
G\left( \frac{z - b_{nk}}{a_{nk}} \right) = G^k\left( \frac{z - b_n}{a_n} \right) . \tag{2.19}
$$

From this it follows that G and $G^k$ are identical apart from location and scale
parameters. Hence, G is max-stable, and by the result above, it is a member of
the GEV family of distributions.
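The max-stability property (2.15) can be checked numerically for the Gumbel case. For $G(x) = \exp\{-\exp[-(x - \mu)/\sigma]\}$ one finds $G^m(x) = G(x - \sigma \log m)$, so that (2.15) holds with $\alpha_m = 1$ and $\beta_m = \sigma \log m$. The sketch below verifies this on a grid; the parameter values and function names are assumptions of the example.

```python
import math


def gumbel_cdf(x: float, mu: float, sigma: float) -> float:
    """Gumbel distribution function, Eq. (2.8)."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def max_stability_gap(x: float, m: int, mu: float, sigma: float) -> float:
    """|G^m(alpha_m x + beta_m) - G(x)| with alpha_m = 1 and beta_m = sigma * log(m)."""
    alpha_m = 1.0
    beta_m = sigma * math.log(m)
    lhs = gumbel_cdf(alpha_m * x + beta_m, mu, sigma) ** m
    return abs(lhs - gumbel_cdf(x, mu, sigma))


# Hypothetical parameters mu = 30, sigma = 4; x runs from 20 to 60 in steps of 0.5.
worst_gap = max(
    max_stability_gap(i / 2.0, m, mu=30.0, sigma=4.0)
    for m in range(2, 11)
    for i in range(40, 121)
)
print(worst_gap)  # zero up to floating-point rounding
```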

2.5 Domains of Attraction for the Extreme Value Distributions

In practice, the exact statistical distribution of the data being analyzed is rarely
known. However, in many cases there may be rather strong evidence as to what
type of distribution to expect. For instance, average wind speeds over periods of 10
minutes in northern Europe have been found to follow a Weibull type distribution.
Then it would be useful to know what kind of extreme value distribution to
expect for such data. The answer to such questions is the subject of the theory of
domains of attraction for extreme value distributions. It is beyond the scope of our
treatment of this topic here to go into much detail, but some useful results seem
worthwhile presenting. A more thorough discussion is given by Gnedenko (1943)
and Leadbetter et al. (1983).
A time series $X_1, X_2, \ldots$ of iid random variables with distribution function F,
and with a density function f, is considered. $x_F$ is defined to be the right endpoint of
F by $x_F = \sup\{x ; F(x) < 1\}$ ($x_F \le \infty$). Then the following sufficient conditions
due to von Mises apply:
Suppose that F is absolutely continuous with density f. Then sufficient conditions
for F belonging to each of the three possible domains of attraction are as
follows:

Type I: f has a negative derivative $f'$ for all x in some interval $(x_0, x_F)$, ($x_F \le \infty$), and

$$
\lim_{x \nearrow x_F} \frac{f'(x) \, (1 - F(x))}{f^2(x)} = -1.
$$

Type II: $f(x) > 0$ for $x \ge x_0$ finite, and for some constant $\alpha > 0$,

$$
\lim_{x \to \infty} \frac{x f(x)}{1 - F(x)} = \alpha.
$$

Type III: $f(x) > 0$ for all x in some finite interval $(x_0, x_F)$, $f(x) = 0$ for $x > x_F$,
and for some constant $\alpha > 0$,

$$
\lim_{x \nearrow x_F} \frac{(x_F - x) f(x)}{1 - F(x)} = \alpha.
$$

Using these results, it is straightforward to verify that the following list of

distributions belongs to the domain of attraction of the Type I (Gumbel) case, just to
mention a few well-known cases: normal, lognormal, exponential, Weibull, gamma,
and, of course, the Gumbel distribution itself.
Distributions belonging to the domain of attraction of Type II are, e.g., the Pareto,
the generalized Pareto for positive shape parameter, and the Type II extreme value
distribution itself. For Type III may be mentioned, e.g., the uniform distributions,
distributions truncated on the upper side (provided a smooth density function), and
the Type III extreme value distribution itself.
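The von Mises conditions can be explored numerically. As a sketch, the code below evaluates the Type I ratio $f'(x)(1 - F(x))/f^2(x)$ for the standard normal distribution, where it approaches $-1$ only slowly as x grows, and the Type II ratio $x f(x)/(1 - F(x))$ for a Pareto distribution $F(x) = 1 - x^{-\alpha}$, $x \ge 1$, where it equals $\alpha$ for every x. The choice of these two distributions and the function names are assumptions of the example.

```python
import math


def type1_ratio_normal(x: float) -> float:
    """f'(x) (1 - F(x)) / f(x)^2 for the standard normal distribution."""
    f = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    f_prime = -x * f
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))   # 1 - F(x)
    return f_prime * tail / (f * f)


def type2_ratio_pareto(x: float, alpha: float) -> float:
    """x f(x) / (1 - F(x)) for the Pareto distribution F(x) = 1 - x**(-alpha), x >= 1."""
    f = alpha * x ** (-alpha - 1.0)
    tail = x ** (-alpha)
    return x * f / tail


for x in (2.0, 5.0, 10.0):
    print(x, type1_ratio_normal(x))       # tends to -1 as x increases

print(type2_ratio_pareto(50.0, 3.0))      # equals alpha up to rounding
```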

2.6 Parameter Estimation for the GEV Distributions

The practical application of the block maxima method involves the need to decide
on how to divide the observed data into blocks. Obviously, there will be two
conflicting issues that have to be dealt with. The desire to have large blocks so
that the distribution of the block maxima will approximate a GEV distribution may
easily lead to a sample of few block maxima. Statistical inference on small samples
may entail large uncertainties. On the other hand, increasing the sample of block
maxima by choosing smaller blocks may violate the asymptotic approximation by
assuming a GEV distribution for the block maxima. These issues may be further
complicated by the issues of independence and stationarity, which were discussed
in Sect. 2.3. While establishing general rules for the choice of block size relative
to the amount of data available is hardly feasible, for some practical cases the
accumulated experience has led to what may be called a consensus. For example, in
wind engineering, the choice of 1 year as a block size has become very close to a
standard procedure. An important consideration for this choice is that the data may
then reasonably be assumed to belong to the same population since seasonal effects
have effectively been removed.
When the sample of block maxima has been determined, the next step would be
to estimate the parameters of the GEV model, or one of the three types, if that can
be ascertained a priori. In this book the focus is on two rather popular estimation
methods, which are the method of moments (primarily for the Gumbel model) and
the maximum likelihood method. The probability weighted moment method has
also been used to some extent. For this method, see Hosking et al. (1985).
To simplify notation, the block maxima are denoted by $Z_1, \ldots, Z_k$, assuming
k blocks. These random variables are assumed to be iid with a common GEV
distribution, the parameters of which are to be estimated from the outcomes of the
block maxima, that is, the observed data.

2.6.1 Estimation by the Method of Moments

The exposition of the method of moments for parameter estimation is limited to the
Gumbel model. It is for this case that it seems to be most popular, maybe due to its
simplicity in this case. The general Gumbel model has two parameters. Since the
first two statistical moments $m_1 = E(Z)$ and $m_2 = E(Z^2)$ of a Gumbel distributed
variable Z can be expressed in terms of these two parameters, for estimation the
following two empirical moments are calculated:

$$
\hat{m}_j = \frac{1}{k} \sum_{i=1}^{k} z_i^j , \quad j = 1, 2, \tag{2.20}
$$

where $z_1, \ldots, z_k$ are the observed data.
Assuming that Z has the general Gumbel distribution $G(z) = \exp\{-\exp[-(z - \mu)/\sigma]\}$,
then $m_1 = E(Z) = \mu + 0.5772\sigma$ and $m_2 = E(Z^2) = m_1^2 + \pi^2 \sigma^2 / 6$, cf.
Sect. 2.2. Denote by $\mu_k$ and $\sigma_k$ the estimated values of the parameters based on the
k observations of block maxima. It is then obtained that

$$
\mu_k = \hat{m}_1 - 0.5772\, \sigma_k \tag{2.21}
$$

and

$$
\sigma_k = \frac{\sqrt{6}}{\pi} \sqrt{\hat{m}_2 - \hat{m}_1^2} . \tag{2.22}
$$
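Equations (2.20) to (2.22) amount to only a few lines of code. The sketch below implements the moment estimators and applies them to a simulated Gumbel sample. The true parameter values, the sample size, and the simulation device (if E is unit exponential, then $-\log E$ is standard Gumbel) are assumptions of this example.

```python
import math
import random

EULER_CONSTANT = 0.5772156649015329


def fit_gumbel_moments(z):
    """Method of moments estimates (mu_k, sigma_k) from Eqs. (2.20)-(2.22)."""
    k = len(z)
    m1 = sum(z) / k                       # first empirical moment
    m2 = sum(v * v for v in z) / k        # second empirical moment
    sigma_k = (math.sqrt(6.0) / math.pi) * math.sqrt(m2 - m1 * m1)
    mu_k = m1 - EULER_CONSTANT * sigma_k
    return mu_k, sigma_k


def sample_gumbel(n, mu, sigma, rng):
    """Draw n Gumbel variates: -log(E) is standard Gumbel when E is unit exponential."""
    return [mu - sigma * math.log(rng.expovariate(1.0)) for _ in range(n)]


rng = random.Random(0)
data = sample_gumbel(5000, mu=30.0, sigma=4.0, rng=rng)   # hypothetical true values
mu_k, sigma_k = fit_gumbel_moments(data)
print(mu_k, sigma_k)
```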

2.6.2 Maximum Likelihood Estimation

The maximum likelihood (ML) method is very popular and has widespread use in
almost every branch of statistics. It turns out that the application of the ML method
for estimation on GEV models requires some caution. Fortunately, it seems that
for the applications relevant for this book, the restrictions that need to be observed
are rarely an issue. Specifically, for values of the shape parameter $\gamma > -0.5$, the
ML estimators behave regularly. The only thing to note is that there are some small
sample issues related to the use of ML estimators also for GEV models, cf. Coles
and Dixon (1999).

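Based on the assumption that $Z_1, \ldots, Z_k$ are iid random variables having a
common GEV distribution, the log-likelihood function for the GEV parameters
when $\gamma \ne 0$ has the following expression:

$$
\ell(\mu, \sigma, \gamma) = -k \log \sigma - (1 + 1/\gamma) \sum_{i=1}^{k} \log\left[ 1 + \gamma \left( \frac{z_i - \mu}{\sigma} \right) \right]
- \sum_{i=1}^{k} \left[ 1 + \gamma \left( \frac{z_i - \mu}{\sigma} \right) \right]^{-1/\gamma} , \tag{2.23}
$$

provided that

$$
1 + \gamma \left( \frac{z_i - \mu}{\sigma} \right) > 0, \quad \text{for } i = 1, \ldots, k. \tag{2.24}
$$

If the last condition is violated, the likelihood becomes zero and the log-likelihood
therefore $-\infty$. The case $\gamma = 0$ needs to be considered separately, using the Gumbel
model. In this case the log-likelihood becomes

$$
\ell(\mu, \sigma) = -k \log \sigma - \sum_{i=1}^{k} \left( \frac{z_i - \mu}{\sigma} \right) - \sum_{i=1}^{k} \exp\left\{ -\left( \frac{z_i - \mu}{\sigma} \right) \right\} . \tag{2.25}
$$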

To obtain the numerical maximum likelihood estimates from the observed data
by using (2.23) and (2.25), standard numerical optimization programs may be
utilized. If (2.23) is used, care must be exercised to avoid numerical problems in
cases where the optimization algorithms tend to parameter estimates in the close
vicinity of $\gamma = 0$. Then it is strongly advisable to use (2.25).
Confidence intervals on the estimated parameter values can be calculated exploiting
that the approximate distribution of the estimators $(\hat{\mu}, \hat{\sigma}, \hat{\gamma})$ is multivariate
normal with mean value $(\mu, \sigma, \gamma)$. This is discussed by Coles (2001).
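A sketch of the log-likelihood that could be passed to a numerical optimizer is given below. It returns $-\infty$ when condition (2.24) is violated and switches to the Gumbel form (2.25) when $\gamma$ is close to zero. The switching tolerance, the function name, and the data values are assumptions of this example; the optimizer itself is not shown.

```python
import math


def gev_log_likelihood(z, mu, sigma, gamma, tol=1e-6):
    """Log-likelihood of Eqs. (2.23) and (2.25) for block maxima z."""
    if sigma <= 0.0:
        return -math.inf
    k = len(z)
    s = [(v - mu) / sigma for v in z]
    if abs(gamma) < tol:
        # Gumbel case, Eq. (2.25); avoids the numerically delicate 1/gamma terms.
        return -k * math.log(sigma) - sum(s) - sum(math.exp(-v) for v in s)
    t = [1.0 + gamma * v for v in s]
    if min(t) <= 0.0:
        return -math.inf                  # condition (2.24) violated
    return (
        -k * math.log(sigma)
        - (1.0 + 1.0 / gamma) * sum(math.log(v) for v in t)
        - sum(v ** (-1.0 / gamma) for v in t)
    )


# Hypothetical block maxima, for illustration only.
maxima = [28.1, 31.4, 30.2, 35.0, 29.3]
print(gev_log_likelihood(maxima, mu=30.0, sigma=2.0, gamma=0.0))
print(gev_log_likelihood(maxima, mu=30.0, sigma=2.0, gamma=1e-4))
```

Maximizing this function is equivalent to minimizing its negative, which is the form most optimization routines expect.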

2.7 Model Validation

As is well known from basic courses in statistics, the use of probability (or PP)
plots and quantile (or QQ) plots may reveal very useful information about the extent
of agreement between an assumed or estimated probability distribution and the
empirical distribution of the data. These are also highly useful tools for a visual
check of fitted GEV models in particular cases. For a thorough discussion of the use
of these plots, cf. Beirlant et al. (2004).

A probability or a PP plot is a direct comparison of the fitted distribution model
to the empirical distribution. Assume that the sample of block maxima has been
ordered by increasing value: $z_{(1)} \le z_{(2)} \le \ldots \le z_{(k)}$. The empirical distribution
function, $\tilde{G}$ say, evaluated at $z_{(i)}$ is given by

$$
\tilde{G}(z_{(i)}) = i/(k + 1). \tag{2.26}
$$

The proposed GEV model distribution is obtained by substituting the parameter
estimates into (2.7)

$$
\hat{G}(z_{(i)}) = \exp\left\{ -\left[ 1 + \hat{\gamma} \left( \frac{z_{(i)} - \hat{\mu}}{\hat{\sigma}} \right) \right]^{-1/\hat{\gamma}} \right\} , \tag{2.27}
$$

provided $\hat{\gamma} \ne 0$. If $\hat{\gamma} = 0$, the plot is constructed using the Gumbel distribution. If
the GEV model is a good approximation, then

$$
\hat{G}(z_{(i)}) \approx \tilde{G}(z_{(i)}) \tag{2.28}
$$

for each index i, so that the PP plot consisting of the points

$$
\left( \hat{G}(z_{(i)}), \tilde{G}(z_{(i)}) \right) , \quad i = 1, \ldots, k \tag{2.29}
$$

should follow approximately the unit diagonal.

For the case of extreme value distributions, a quantile or QQ plot is usually
considered to be more informative than a PP plot because it shows more clearly the
agreement at high values of the observed data, which is of primary concern when
fitting extreme value models. Assuming again that $\hat{\gamma} \ne 0$, the QQ plot is traced out
by the point graph

$$
\left( \hat{G}^{-1}(i/(k + 1)), z_{(i)} \right) , \quad i = 1, \ldots, k, \tag{2.30}
$$

where

$$
\hat{G}^{-1}(i/(k + 1)) = \hat{\mu} - \frac{\hat{\sigma}}{\hat{\gamma}} \left[ 1 - \left\{ -\log\left( i/(k + 1) \right) \right\}^{-\hat{\gamma}} \right] . \tag{2.31}
$$

This graph should also approximately follow a straight line. These procedures are
discussed at greater length in Chap. 9.
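The point sets (2.29) and (2.30) can be generated as in the following sketch, which returns the coordinates only and leaves the actual plotting to whatever graphics tool is at hand. The function names and the tolerance for the Gumbel branch are assumptions of this example.

```python
import math


def gev_cdf(x, mu, sigma, gamma, tol=1e-9):
    """Fitted GEV distribution function, Eq. (2.27), Gumbel form for gamma near 0."""
    s = (x - mu) / sigma
    if abs(gamma) < tol:
        return math.exp(-math.exp(-s))
    t = 1.0 + gamma * s
    if t <= 0.0:
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def gev_quantile(prob, mu, sigma, gamma, tol=1e-9):
    """Inverse of the fitted GEV distribution function, Eq. (2.31)."""
    y = -math.log(prob)
    if abs(gamma) < tol:
        return mu - sigma * math.log(y)
    return mu - (sigma / gamma) * (1.0 - y ** (-gamma))


def pp_qq_points(z, mu, sigma, gamma):
    """Return the PP points (2.29) and the QQ points (2.30) for block maxima z."""
    ordered = sorted(z)
    k = len(ordered)
    pp, qq = [], []
    for i, z_i in enumerate(ordered, start=1):
        p_emp = i / (k + 1)                                    # Eq. (2.26)
        pp.append((gev_cdf(z_i, mu, sigma, gamma), p_emp))
        qq.append((gev_quantile(p_emp, mu, sigma, gamma), z_i))
    return pp, qq
```

If the data were exactly equal to the model quantiles at the plotting positions $i/(k+1)$, both point sets would fall on the unit diagonal; deviations in the upper right corner of the QQ plot indicate a poor fit in the tail.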

2.8 Estimating Confidence Intervals by Bootstrapping

The bootstrapping method is a statistical technique of fairly recent origin that can
be used for estimating confidence intervals on quantities derived from a statistical
distribution on the basis of a limited sample generated by that same distribution
(Efron and Tibshirani 1993; Davison and Hinkley 1997). It is based on resampling
from a distribution determined by the available sample of data. Despite the fact that
the name of the method alludes to lifting oneself up by the bootstraps (Baron von
Munchausen), the method appears to be reasonably effective for the specific purpose
of estimating confidence bands. For convenience, a brief discussion of some basic
features of the bootstrapping method is provided here.
Assume that $\mathbf{z} = (z_1, z_2, \ldots, z_n)$ is a sample or a vector consisting of n
independent observations of a random variable Z and that this is the only empirical
information available about Z. Confidence intervals for a statistical quantity require
the estimation of quantiles from the distribution of a relevant estimator. There
are in principle two available options for obtaining bootstrap estimates of such
quantiles. One is the nonparametric approach, where a purely empirical distribution
function is established for Z on the basis of the observed data by allocating a
probability of $1/n$ to each of the observed data points. The other is the parametric
bootstrap, which is obtained by assuming that Z has a specified distribution function
$F_Z(z; \theta) = \mathrm{Prob}(Z \le z)$, where $\theta$ denotes a vector of unknown parameters, which
determine the distribution. These parameters are then estimated from the observed
data $\mathbf{z}$, giving $\hat{\theta}$, and $F_Z(z; \hat{\theta})$ is adopted as the distribution of Z.
In this section on the block maxima method using GEV models, only the
parametric bootstrap is used. The goal is to estimate some statistical quantity V, e.g.,
a high quantile like $100(1 - \alpha)\%$ ($0 < \alpha \ll 1$), given by the unknown distribution.
Let $\hat{V}$ denote the estimate of V obtained from the fitted model distribution $F_Z(z; \hat{\theta})$,
which is a GEV distribution. The parametric bootstrapping technique for estimating
confidence intervals on V is based on resampling from the GEV model obtained.
This is done as follows: Let $Z^*$ denote the random variable with distribution
function $F_Z(z; \hat{\theta})$. $\ell$ bootstrap samples $\mathbf{z}_j^*$, $j = 1, \ldots, \ell$, with n independent
observations of $Z^*$ in each sample are now generated. Each sample $\mathbf{z}_j^*$ is used to
fit a new GEV model from which an estimate $V_j^*$ of V is obtained.
Simple estimates for confidence intervals on V are derived by calculating the
sample standard deviation $s_V^*$:

$$
s_V^* = \sqrt{ \frac{1}{\ell - 1} \sum_{j=1}^{\ell} \left( V_j^* - \bar{V}^* \right)^2 } , \tag{2.32}
$$

where $\bar{V}^* = (1/\ell) \sum_{j=1}^{\ell} V_j^*$. An approximate confidence interval at level $1 - q$ is
then obtained as

$$
\left( \hat{V} - w_{q/2}\, s_V^* , \ \hat{V} + w_{q/2}\, s_V^* \right), \tag{2.33}
$$

where $w_{q/2}$ denotes the $100(1 - q/2)\%$ standard normal fractile. To get stable
results for this method, usually 20–30 bootstrap samples are sufficient. However,
to avoid making the assumption that the bootstrap estimates $V_j^*$ are generated by a
normal distribution, which is the basis for Eq. (2.33), the true distribution may be
approximated by generating a large number of bootstrap samples; usually several
thousand are needed, especially for small values of q. If $\ell$ samples were generated,
the $V_j^*$ are rearranged in increasing order $V_{(1)}^* \le V_{(2)}^* \le \ldots \le V_{(\ell)}^*$. A $100(1 - q)\%$
confidence interval for V is then

$$
\left( V_{(L)}^* , \ V_{(M)}^* \right), \tag{2.34}
$$

where $(L) = [q\ell/2]$ and $(M) = [(1 - q/2)\ell]$ ($[a]$ means the integer part of a). Such
estimates may be further improved as described by Davison and Hinkley (1997).
However, such details will not be discussed here.
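The parametric bootstrap described in this section can be sketched in a few lines. To keep the example short and self-contained, it uses a Gumbel model fitted by the method of moments instead of a full GEV maximum likelihood fit, and takes V to be the return level $x_p$ of Eq. (2.14). These simplifications, the function names, and all numerical settings are assumptions of the example.

```python
import math
import random
import statistics

EULER_CONSTANT = 0.5772156649015329


def fit_gumbel_moments(z):
    """Moment estimates (mu, sigma) of a Gumbel model, Eqs. (2.21)-(2.22)."""
    k = len(z)
    m1 = sum(z) / k
    m2 = sum(v * v for v in z) / k
    sigma = (math.sqrt(6.0) / math.pi) * math.sqrt(m2 - m1 * m1)
    return m1 - EULER_CONSTANT * sigma, sigma


def gumbel_return_level(p, mu, sigma):
    """Return level x_p for the Gumbel model, Eq. (2.14)."""
    return mu - sigma * math.log(-math.log(1.0 - p))


def sample_gumbel(n, mu, sigma, rng):
    """Draw n Gumbel variates via -log of unit exponential variates."""
    return [mu - sigma * math.log(rng.expovariate(1.0)) for _ in range(n)]


def bootstrap_intervals(z, p=0.01, q=0.10, n_boot=2000, seed=0):
    """Return V_hat, the normal interval (2.33) and the percentile interval (2.34)."""
    rng = random.Random(seed)
    mu_hat, sigma_hat = fit_gumbel_moments(z)
    v_hat = gumbel_return_level(p, mu_hat, sigma_hat)
    v_star = []
    for _ in range(n_boot):
        resample = sample_gumbel(len(z), mu_hat, sigma_hat, rng)
        v_star.append(gumbel_return_level(p, *fit_gumbel_moments(resample)))
    s_v = statistics.stdev(v_star)                         # Eq. (2.32)
    w = statistics.NormalDist().inv_cdf(1.0 - q / 2.0)     # normal fractile
    normal_ci = (v_hat - w * s_v, v_hat + w * s_v)
    v_star.sort()
    lower = max(int(q * n_boot / 2.0 + 1e-9), 1)           # (L), 1-based
    upper = max(int((1.0 - q / 2.0) * n_boot + 1e-9), 1)   # (M), 1-based
    percentile_ci = (v_star[lower - 1], v_star[upper - 1])
    return v_hat, normal_ci, percentile_ci


# 25 hypothetical annual maxima, used to estimate the 100 year level.
observed = sample_gumbel(25, mu=30.0, sigma=4.0, rng=random.Random(42))
print(bootstrap_intervals(observed))
```

The small offset `1e-9` only guards the integer-part operation against floating-point rounding; it is not part of Eq. (2.34).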

2.9 The Asymptotic Extreme Value Distributions for Dependent Sequences

The assumed sequence of iid random variables underlying the classical approach
to extreme value distributions is obviously not a very practical model for many
physical phenomena where dependence effects are obvious. Fortunately, it turns out
that the extremal types theorem still applies, provided some conditions are satisfied.
These conditions relate to the long range dependence structure of the sequence
of random variables. The typical model adopted is one of a stationary time series
$X_1, X_2, \ldots$. Stationarity means that the joint probability law of a group of random
variables from the sequence is invariant with respect to time shifts. That is, e.g.,
$X_1, X_2$ has the same joint distribution as $X_{51}, X_{52}$.

The condition that has to be satisfied by the long range dependence of the
stationary time series to allow for an extremal types theorem, can be formulated
as follows:
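A stationary time series $X_1, X_2, \ldots$ is said to satisfy the $D(u_n)$ condition if, for
all $i_1 < \ldots < i_p < j_1 < \ldots < j_q$ with $j_1 - i_p > l$,

$$
\Big| \mathrm{Prob}\left( X_{i_1} \le u_n, \ldots, X_{i_p} \le u_n, X_{j_1} \le u_n, \ldots, X_{j_q} \le u_n \right)
- \mathrm{Prob}\left( X_{i_1} \le u_n, \ldots, X_{i_p} \le u_n \right) \mathrm{Prob}\left( X_{j_1} \le u_n, \ldots, X_{j_q} \le u_n \right) \Big| \le \alpha(n, l), \tag{2.35}
$$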

where $\alpha(n, l_n) \to 0$ for some sequence $l_n$ such that $l_n/n \to 0$ as $n \to \infty$.

A scrutiny of this condition conveys the understanding that it is required that
block maxima tend to become independent random variables if the blocks are
sufficiently far apart. Provided that this condition is satisfied, the following theorem
applies (Leadbetter et al. 1983):
Let $X_1, X_2, \ldots$ be a stationary time series, and define $M_n = \max\{X_1, \ldots, X_n\}$.
If there exist sequences of constants $a_n > 0$ and $b_n$ such that

$$
\mathrm{Prob}\left( \frac{M_n - b_n}{a_n} \le x \right) \to G(x) , \quad n \to \infty, \tag{2.36}
$$

where $G(x)$ is a nondegenerate distribution function, and the $D(u_n)$ condition is
satisfied with $u_n = a_n z + b_n$ for every real z, and $G(z) > 0$, then $G(x)$ belongs to
the class of generalized extreme value distributions.
The practical significance of this theorem is, in fact, substantial since very few
time series met in practice would consist of independent data. Hence, without it,
the application of GEV distributions to practical problems would be very hard to
justify. Its application, of course, presupposes that the .D(un ) condition is satisfied,
which may seem like a highly nontrivial criterion to check for a given stationary
time series. Fortunately, it turns out that for most applications it may be routinely
assumed to be satisfied. For example, a stationary Gaussian time series .X1 , X2 , . . .
with an autocovariance function .ρn = E [(Xi − μ)(Xi+n − μ)], where .μ = EXi ,
will satisfy $D(u_n)$ if $\rho_n \log n \to 0$ when $n \to \infty$. A time series of measured data in
engineering applications where $\rho_n$ decays more slowly than $1/\log n$ is actually very hard
to imagine. However, it should be clearly understood that the convergence to the
appropriate asymptotic limit may depend very much on the dependence structure of
the time series. For instance, if there is strong dependence between consecutive data
points of the time series, the convergence to the asymptotic limit may be very slow.
From the original time series $X_1, X_2, \ldots$, a time series of independent variables
$\tilde{X}_1, \tilde{X}_2, \ldots$ with the same marginal distribution may be constructed. Let $\tilde{M}_n = \max\{\tilde{X}_1, \ldots, \tilde{X}_n\}$. Then the following
theorem has been proved by Leadbetter (1983):

If there exist sequences of constants $a_n > 0$ and $b_n$ and a nondegenerate
distribution function $\tilde{G}(x)$ such that

\[
\mathrm{Prob}\Bigl(\frac{\tilde{M}_n - b_n}{a_n} \le x\Bigr) \to \tilde{G}(x), \quad n \to \infty, \tag{2.37}
\]

if the $D(u_n)$ condition is satisfied with $u_n = a_n z + b_n$ for every real z, and $G(z) > 0$,
and if $\mathrm{Prob}\bigl((M_n - b_n)/a_n \le x\bigr)$ converges for some x, then

\[
\mathrm{Prob}\Bigl(\frac{M_n - b_n}{a_n} \le x\Bigr) \to G(x) = \tilde{G}(x)^{\theta}, \quad n \to \infty, \tag{2.38}
\]

for some constant $\theta \in [0, 1]$.


The constant .θ is called the extremal index, and unless it is equal to one, the
limiting distributions for the independent and the original stationary sequences are
not the same. If .θ > 0, then .G(x) is an extreme value distribution, but with different
parameters than .G̃(x). If .(μ, σ, γ ) are the parameters of .G(x) and .(μ̃, σ̃ , γ̃ ) are the
parameters of .G̃(x), then their relationship is

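\[
\gamma = \tilde{\gamma}, \quad \mu = \tilde{\mu} - \frac{\tilde{\sigma}}{\gamma}\,(1 - \theta^{\gamma}), \quad \sigma = \tilde{\sigma}\,\theta^{\gamma}, \tag{2.39}
\]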

or, if .γ = 0, taking limits, it is obtained that .μ = μ̃ + σ log θ and .σ = σ̃ . Note that

the shape parameter .γ remains the same.
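The parameter mapping in Eq. (2.39) can be checked numerically. The following is a minimal sketch, not code from the book: the function names `gev_cdf` and `dependent_gev_params` and the numerical values are illustrative assumptions. It converts the parameters of $\tilde{G}$ into those of $G$ for a given extremal index and compares $G(x)$ with $\tilde{G}(x)^{\theta}$.

```python
import math

def gev_cdf(x, mu, sigma, gamma):
    """GEV distribution function with location mu, scale sigma, shape gamma."""
    if gamma == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + gamma * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: 0 below the lower end point, 1 above the upper one.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

def dependent_gev_params(mu_t, sigma_t, gamma_t, theta):
    """Map the parameters of G~ to those of G = G~**theta, following Eq. (2.39)."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    if gamma_t == 0.0:
        # Limiting case: only the location is shifted.
        return mu_t + sigma_t * math.log(theta), sigma_t, 0.0
    scale = theta ** gamma_t
    mu = mu_t - sigma_t / gamma_t * (1.0 - scale)
    return mu, sigma_t * scale, gamma_t

# Hypothetical parameter values, chosen only for illustration.
mu, sigma, gamma = dependent_gev_params(10.0, 2.0, 0.1, theta=0.5)
x = 14.0
print(gev_cdf(x, mu, sigma, gamma), gev_cdf(x, 10.0, 2.0, 0.1) ** 0.5)
```

The two printed values should agree to rounding error, which illustrates that a smaller extremal index shifts and rescales the distribution without changing its shape parameter.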
The results cited previously are derived under the assumption of a sequence
of random variables. For many applications in this book, a sequence of random
variables .X1 , X2 , . . . is initially not considered, but rather a stochastic process in
continuous time .X(t). To fit into the framework discussed in this section, one could
envisage sampling the considered process at discrete time points and obtaining a
sequence .Xj = X(tj ), .j = 1, 2, . . .. In practice, this is often done by extracting
local peak values from an observed realization of the process. As has been seen, if
the obtained time series is stationary, then, under suitable conditions, the extremal
types theorem still holds true. Unfortunately, no satisfactory general theory of
extremes is available for the case of continuous time stochastic processes, but some
results have been proven (Leadbetter et al. 1983). Of course, in order for a realization
of a stochastic process to be stored in computer memory, it has to be sampled. In
that sense, what is available for further analysis is, in fact, a time series of outcomes
of random variables. Hence, the sampling frequency relative to the characteristics of
the stochastic process determines to what extent one may consider the stored time
series to be a good replica of the realization of the stochastic process, which may
then be used for further processing, like extraction of peak values.
Chapter 3
The Peaks-Over-Threshold Method

3.1 Introduction

A common approach to practical extreme value analysis is to use the GEV form
of the asymptotic extreme value distributions to fit to the observed extreme values.
The typical data that are used in this process are the extreme values observed over
specified periods of time, e.g., over one year periods. Such a procedure, extracting
only extremes over blocks of data, would immediately appear to be wasteful, since
potentially very useful data might be discarded. The Peaks-Over-Threshold method,
or simply the POT method, represents an approach to extreme value analysis that
tries to avoid this waste of data by considering all data that exceed a prescribed high
threshold. It will be shown that also for the POT method there are three limiting
forms of the exceedance probability distributions corresponding to The Extremal
Types Theorem. The Generalized Pareto (GP) distribution will take the place of the
GEV distribution.

3.2 The Peaks-Over-Threshold Method

The basic assumption of the POT method is that the observed time series .x1 , x2 , . . .
are the outcomes of a sequence of independent and identically distributed (iid)
random variables .X1 , X2 , . . . with common distribution function .FX (x), which
belongs to the domain of attraction of one of the extreme value distributions. Instead
of extracting the extreme observations over blocks of data, the focus of the POT
method is on all data that exceed a given high threshold. If the chosen threshold u is
high enough, it would be natural to consider also the data exceeding u as extremes.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 19

A. Naess, Applied Extreme Value Statistics,

Denoting an arbitrary term of the .Xi sequence by X, the exceedance probability

essential to the POT method is the following conditional probability:

\[
\mathrm{Prob}(X > u + y \mid X > u) = \frac{1 - F(u + y)}{1 - F(u)}, \quad y > 0, \tag{3.1}
\]

where y denotes the size of the exceedance above the threshold u.

If the parent distribution F were fully known, the distribution of the threshold
exceedances in Eq. (3.1) would also be known. However, in practical applications
this is rarely the case. Typically, only estimated approximations to the parent
distribution based on the observed data would be available. A consequence of this
is that the estimates of $F(x)$ for $x \ge u$ in general become very uncertain for large
to extreme values of u. This makes the direct application of Eq. (3.1) useless for
practical purposes. Hence, limiting forms that would parallel the GEV distributions
for extremes are sought.
The main result needed is expressed by the following theorem:
Let .X1 , X2 , . . . be a sequence of iid random variables with common distribution
function F , and let .Mn = max{X1 , . . . , Xn }. Assume that the conditions of the
Extremal Types Theorem are satisfied, so that for large n,

.Prob(Mn ≤ z) ≈ G(z),

where

\[
G(x) = \exp\Bigl\{-\Bigl[1 + \gamma\,\frac{x - \mu}{\sigma}\Bigr]^{-1/\gamma}\Bigr\}, \tag{3.2}
\]

defined on the set .{x : 1 + γ (x − μ)/σ > 0}, where the parameters satisfy
.−∞ < μ < ∞, .σ > 0, .−∞ < γ < ∞. Then, for large enough u,

\[
H(y) = \mathrm{Prob}(X \le u + y \mid X > u) \approx 1 - \Bigl(1 + \gamma\,\frac{y}{\tilde{\sigma}}\Bigr)^{-1/\gamma}, \tag{3.3}
\]

defined on .{y : y > 0 and (1 + γ y/σ̃ ) > 0}, where .σ̃ = σ + γ (u − μ).
Note that the special case .γ = 0 has to be interpreted as a limit, in analogy with
the GEV case .γ = 0, viz.
\[
H(y) = 1 - \exp\Bigl(-\frac{y}{\tilde{\sigma}}\Bigr), \quad y > 0, \tag{3.4}
\]

which corresponds to an exponential distribution with parameter .1/σ̃ .

The family of distribution functions defined by Eq. (3.3) is called the Generalized
Pareto (GP) distributions. The theorem above implies that if the block extremes
have the limiting distribution G, then threshold excesses have a corresponding
limiting distribution within the GP family of distributions. And it follows from

this theorem that the parameters of the GP distribution of threshold excesses are
uniquely determined by those of the associated GEV distribution of block extremes.
In particular, the shape parameter .γ of the GP distribution is identical to that of the
corresponding GEV distribution. It is important to note, however, that this statement
is valid only asymptotically. The practical significance of this is that, in general, the
.γ -parameter estimated on the basis of a finite set of data rarely equals that of the

correct asymptotic extreme value distribution. Also relevant in this context is the
observation made by Fisher and Tippett (1928) that the best fit to a finite set of
extreme value data from a stationary Gaussian process is provided by a Type III
distribution and not by the asymptotically correct Type I distribution.
A full proof of the theorem above will not be given here, but a sketch of the
proof may serve to illustrate the main ideas. A more precise argument is provided
by Leadbetter et al. (1983).
For large n and for suitable values of the argument z,

\[
\mathrm{Prob}(M_n \le z) = F^n(z) \approx \exp\Bigl\{-\Bigl[1 + \gamma\,\frac{z - \mu}{\sigma}\Bigr]^{-1/\gamma}\Bigr\}.
\]

Hence,
\[
n \log F(z) \approx -\Bigl[1 + \gamma\,\frac{z - \mu}{\sigma}\Bigr]^{-1/\gamma}. \tag{3.5}
\]

By a Taylor expansion, it is found that .log(1 − ε) ≈ −ε for small positive .ε.

Consequently, for large values of z, it follows that .log F (z) ≈ −(1 − F (z)).
Substituting into Eq. (3.5), followed by a rearrangement, leads to
\[
1 - F(u) \approx \frac{1}{n}\Bigl[1 + \gamma\,\frac{u - \mu}{\sigma}\Bigr]^{-1/\gamma},
\]

for large values of u. Similarly, for .y > 0,

\[
1 - F(u + y) \approx \frac{1}{n}\Bigl[1 + \gamma\,\frac{u + y - \mu}{\sigma}\Bigr]^{-1/\gamma}.
\]

Hence, it follows that

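\[
\mathrm{Prob}(X > u + y \mid X > u) \approx \Bigl[\frac{1 + \gamma (u + y - \mu)/\sigma}{1 + \gamma (u - \mu)/\sigma}\Bigr]^{-1/\gamma}
= \Bigl(1 + \gamma\,\frac{y}{\tilde{\sigma}}\Bigr)^{-1/\gamma}, \tag{3.6}
\]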

where .σ̃ = σ + γ (u − μ), as required by Eq. (3.3).
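The last equality in Eq. (3.6) rests on a short algebraic step that may be worth writing out. Multiplying numerator and denominator of the ratio by $\sigma$ gives

\[
\frac{1 + \gamma (u + y - \mu)/\sigma}{1 + \gamma (u - \mu)/\sigma}
= \frac{\sigma + \gamma (u - \mu) + \gamma y}{\sigma + \gamma (u - \mu)}
= 1 + \frac{\gamma y}{\sigma + \gamma (u - \mu)}
= 1 + \gamma\,\frac{y}{\tilde{\sigma}},
\]

so the factor $1/n$ cancels and the threshold u enters only through the modified scale $\tilde{\sigma}$.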


3.3 Threshold Selection

Selection of an appropriate threshold to use when applying the POT method on

measured data is an essential ingredient of this method. Its importance is reflected
by the fact that the predicted extreme values will in many cases show a significant
dependence on this choice. Unfortunately, in practice, there are no fully reliable
methods to guide the selection of an appropriate threshold. In spite of this, a couple
of methods have seen extensive use. These two methods will be discussed in this
section.
The practical application of the POT method would typically proceed as follows.
The observed data .x1 , x2 , . . . , xn are assumed to be outcomes of independent and
identically distributed random variables .X1 , X2 , . . . , Xn , which, of course, would
require some degree of justification. The extreme events to be used are the observed
exceedances above a high threshold u, that is, the data .xi : xi > u. Denote these
exceedances as .x(1) , x(2) , . . . , x(k) , and define the threshold excesses by .yj = x(j ) −
u for .j = 1, . . . , k. According to our main result in the previous section, the excess
data .yj may be regarded as independent realizations of a random variable whose
distribution function is approximately of the GP type. By fitting the excess data to
the GP distribution, an approximate distribution of the excess variable is obtained.
In the choice of threshold, there are clearly two conflicting issues that have to
be dealt with. Choosing a high threshold is desirable from the point of view of not
violating too much the asymptotic basis for the theory. However, if too few data
are retained, there will be high uncertainty in every estimate. On the other hand, if
the threshold is too low, the apparent uncertainty of estimates may be reduced, but
the assumption that the excess values follow a GP distribution may be seriously in
error. Hence, in practice, there is a need to balance the two issues. There seem to
be two procedures that are used for this purpose. One is carried out prior to model
estimation by investigating how a diagnostic follows an expected pattern. The other
is an assessment of the stability of parameter estimates based on the fitting of models
for a range of different thresholds.
The first of the two methods uses the expected value of a GP variable Y with
distribution function
\[
H(y) = 1 - \bigl(1 + \gamma y/\sigma\bigr)^{-1/\gamma}. \tag{3.7}
\]

The expected value is then

\[
E[Y] = \frac{\sigma}{1 - \gamma}, \tag{3.8}
\]

provided .γ < 1; otherwise the mean value is infinite. Assume that the GP
distribution is a valid model for the excesses of a fixed threshold .u0 . Let X denote
an arbitrary term among .X1 , X2 , . . . , Xn . Then, by (3.8),
\[
E[X - u_0 \mid X > u_0] = \frac{\sigma_{u_0}}{1 - \gamma}, \tag{3.9}
\]

provided .γ < 1, where the convention of using .σu to denote the scale parameter
corresponding to excesses of the threshold u has been adopted. If the GP model
holds for the threshold .u0 , by necessity it also holds for any .u > u0 . Hence, for
.u > u0 , it is obtained that

\[
E[X - u \mid X > u] = \frac{\sigma_u}{1 - \gamma} = \frac{\sigma_{u_0} + \gamma (u - u_0)}{1 - \gamma}, \tag{3.10}
\]

where the last equation follows from the main result of the previous section. This
tells us that for .u > u0 , .E[X − u | X > u] is a linear function of u. This can then
be verified by calculating the mean value of the excesses for a range of thresholds.
Thus, a plot of the following empirical point estimates,

\[
\Bigl\{ \Bigl(u, \frac{1}{n_u} \sum_{i=1}^{n_u} (x_{(i)} - u)\Bigr) : u < x_{\max} \Bigr\}, \tag{3.11}
\]

where .x(1) , x(2) , . . . , x(nu ) consist of the .nu observations that exceed u, and .xmax
denotes the largest observation, should be an approximately straight line for a range
of u-values where the GP model is applicable. This empirical graph is commonly
called the mean residual life plot. Confidence intervals can be added to the graph
by using the approximate normality of the sample means. We shall have occasion to
illustrate this diagnostic for the applicability of the POT method for real-life data,
and it will be clear that it is not always of much help.
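The point estimates in Eq. (3.11) are simple to compute. The sketch below is an illustration under stated assumptions: the function name `mean_residual_life` and the sample values are hypothetical, and confidence bands are omitted.

```python
def mean_residual_life(data, thresholds):
    """Return (u, mean excess over u, number of exceedances) for each threshold u."""
    points = []
    for u in thresholds:
        excesses = [x - u for x in data if x > u]
        if excesses:  # thresholds at or above the largest observation are skipped
            points.append((u, sum(excesses) / len(excesses), len(excesses)))
    return points

# Hypothetical data, used only to show the call.
sample = [1.0, 2.0, 3.0, 4.0, 6.0, 9.0]
for u, mean_excess, n_u in mean_residual_life(sample, [0.0, 2.5, 5.0, 9.0]):
    print(u, mean_excess, n_u)
```

Plotting the mean excess against u and looking for an approximately linear stretch gives the mean residual life plot. The returned exceedance counts indicate how few observations support the estimates at the highest thresholds.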

3.4 Return Periods

The return period R of a given wind speed, in years, is defined as the inverse of
the probability that the specified wind speed will be exceeded in any 1 year. If .λ
denotes the mean crossing rate of the threshold u per year (i.e., the average number
of data points above the threshold u per year), the return period R of the value of X
corresponding to the level .xR = u + yR is given by the relation

\[
R = \frac{1}{\lambda\, \mathrm{Prob}(X > x_R)} = \frac{1}{\lambda\, \mathrm{Prob}(Y > y_R)}. \tag{3.12}
\]

Hence, it follows that

\[
\mathrm{Prob}(Y \le y_R) = 1 - 1/(\lambda R). \tag{3.13}
\]

Invoking Eq. (3.7) for $\gamma \ne 0$ leads to the result

\[
x_R = u + \sigma\,[(\lambda R)^{\gamma} - 1]/\gamma. \tag{3.14}
\]

Similarly, for $\gamma = 0$, it is found that

\[
x_R = u + \sigma \ln(\lambda R), \tag{3.15}
\]

where u is the threshold used in the estimation of .γ and .σ . A discussion of how to

estimate confidence intervals on .xR using profile likelihood methods is provided by
Coles (2001). In this book the focus is on the bootstrap method, cf. Sect. 3.7.
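As an illustration of Eqs. (3.13) to (3.15), the following sketch computes a return level and then recovers the return period from it as a consistency check. The function names and all numerical values are assumptions made for the example only.

```python
import math

def return_level(u, sigma, gamma, lam, R):
    """R-year return level x_R from Eqs. (3.14) and (3.15).

    u: threshold; sigma, gamma: GP scale and shape; lam: mean number of
    threshold exceedances per year; R: return period in years.
    """
    if lam * R <= 1.0:
        raise ValueError("lam * R must exceed 1 for x_R to lie above u")
    if gamma == 0.0:
        return u + sigma * math.log(lam * R)
    return u + sigma * ((lam * R) ** gamma - 1.0) / gamma

def gp_exceedance(y, sigma, gamma):
    """Prob(Y > y) for the GP distribution of Eq. (3.7)."""
    if gamma == 0.0:
        return math.exp(-y / sigma)
    return (1.0 + gamma * y / sigma) ** (-1.0 / gamma)

# Hypothetical threshold, GP parameters, exceedance rate and return period.
u, sigma, gamma, lam, R = 20.0, 3.0, -0.1, 5.0, 50.0
x_R = return_level(u, sigma, gamma, lam, R)
print(x_R, 1.0 / (lam * gp_exceedance(x_R - u, sigma, gamma)))
```

The second printed number should reproduce R, since inserting $x_R$ into Eq. (3.12) inverts the calculation.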

3.5 Parameter Estimation for the GP Distributions

There is a range of possible estimation methods available for the parameters of the
GP distribution. In our own work, we have mostly used three methods that serve the
purpose reasonably well when the shape parameter does not deviate too much from
zero (.γ < 0.5): The Dekkers–Einmahl–de Haan estimators, the moment estimators,
and the maximum likelihood estimators. These three methods will be discussed
below. Alternative methods include the Hill estimators and the probability weighted
moment (PWM) estimators, cf. Beirlant et al. (2004). PWM estimators have been
popular in flood frequency analysis (Hosking et al. 1985). Further development
is provided by the L-moment estimators (Hosking and Wallis 1997). It has been
shown that L-moments have some issues related to their lack of sensitivity to the tail
behavior of the underlying statistical distribution. This is clearly of importance for
properly representing extreme value distributions, cf. Winterstein and MacKenzie
(2013).

3.5.1 Dekkers–Einmahl–de Haan Estimators

Let n denote the total number of data points, while the number of observations above
the threshold value u is denoted by k. The threshold u then represents the .(k + 1)th
highest data point(s). An estimate for .λ is .λ̂ = k/nyrs , where .nyrs denotes the length
of the record in years. The highest, second highest, ..., $k$th highest, and $(k + 1)$th highest
variates are denoted by $X^*_n$, $X^*_{n-1}$, ..., $X^*_{n-k+1}$, and $X^*_{n-k} = u$, respectively.

The parameter estimators proposed by Dekkers et al. (1989) and de Haan (1994)
are based on the following two quantities:

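\[
H_{k,n} = \frac{1}{k} \sum_{i=0}^{k-1} \bigl\{\ln(X^*_{n-i}) - \ln(X^*_{n-k})\bigr\} \tag{3.16}
\]

and

\[
H^{(2)}_{k,n} = \frac{1}{k} \sum_{i=0}^{k-1} \bigl\{\ln(X^*_{n-i}) - \ln(X^*_{n-k})\bigr\}^2. \tag{3.17}
\]

Estimators for $\sigma$ and $\gamma$ are then given by the relations

\[
\hat{\sigma} = \rho\, X^*_{n-k}\, H_{k,n} = \rho\, u\, H_{k,n} \tag{3.18}
\]

and

\[
\hat{\gamma} = H_{k,n} + 1 - \frac{1}{2} \Bigl\{1 - \frac{(H_{k,n})^2}{H^{(2)}_{k,n}}\Bigr\}^{-1}, \tag{3.19}
\]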

where .ρ = 1 if .γ̂ ≥ 0, while .ρ = 1 − γ̂ if .γ̂ < 0. These two estimators will be

referred to as the Dekkers–Einmahl–de Haan estimators.
Subject to general conditions on the underlying probability law, Dekkers et al.
(1989) and de Haan (1994) showed that .γ̂ → γ and .σ̂ → σ as .n → ∞ (in
probability).
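A minimal sketch of Eqs. (3.16) to (3.19) is given below. The function name `dekkers_einmahl_de_haan` and the simulated Pareto sample are illustrative assumptions. The data must be positive because logarithms are taken, and the estimator is undefined when all k largest values coincide.

```python
import math
import random

def dekkers_einmahl_de_haan(data, k):
    """Estimate (sigma, gamma) from the k largest values above the (k+1)th largest."""
    x = sorted(data, reverse=True)
    if not 0 < k < len(x):
        raise ValueError("k must satisfy 0 < k < n")
    u = x[k]  # X*_{n-k}, the (k+1)th highest value, used as threshold
    if u <= 0.0:
        raise ValueError("threshold must be positive")
    logs = [math.log(xi) - math.log(u) for xi in x[:k]]
    h1 = sum(logs) / k                                # Eq. (3.16)
    h2 = sum(v * v for v in logs) / k                 # Eq. (3.17)
    gamma = h1 + 1.0 - 0.5 / (1.0 - h1 * h1 / h2)     # Eq. (3.19)
    rho = 1.0 if gamma >= 0.0 else 1.0 - gamma
    sigma = rho * u * h1                              # Eq. (3.18)
    return sigma, gamma

# Hypothetical test case: a Pareto sample whose tail corresponds to gamma = 0.5.
random.seed(12345)
sample = [random.paretovariate(2.0) for _ in range(20000)]
print(dekkers_einmahl_de_haan(sample, k=1000))
```

For this simulated sample the shape estimate should land in the neighbourhood of 0.5. Repeating the call for a range of k values shows how sensitive the estimates are to the choice of threshold.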
Closely related to the Dekkers–Einmahl–de Haan estimators are the Hill estimators.
Their application to the problem of estimating extreme wind speeds was
investigated by Naess and Clausen (1999). Their conclusion was that the Hill
estimators lead to results that are quite similar to those provided by the Dekkers–
Einmahl–de Haan estimators. Because the Hill estimators require considerably
higher numerical efforts than the Dekkers–Einmahl–de Haan estimators and rarely
provide significantly better results, the Hill estimators were excluded from the
present discussion. The interested reader is referred to Naess and Clausen (1999),
Beirlant et al. (2004) for details.

3.5.2 Moment Estimators

In terms of the mean value .E(Y ) and the standard deviation .s(Y ) of the exceedance
variate Y , it can be shown that (Hosking and Wallis 1987)

\[
\sigma = \frac{1}{2}\, E(Y)\,\bigl\{1 + [E(Y)/s(Y)]^2\bigr\} \tag{3.20}
\]

and

\[
\gamma = \frac{1}{2}\,\bigl\{1 - [E(Y)/s(Y)]^2\bigr\}. \tag{3.21}
\]

Hence, empirical estimates of the first two moments of Y provide estimates of
$\sigma$ and $\gamma$. The resulting estimators are referred to as the moment estimators for $\sigma$
and $\gamma$.
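The moment estimators amount to two lines of arithmetic, as the following sketch shows. The function name `gp_moment_estimators` is hypothetical, and the use of the sample standard deviation with divisor n − 1 is an assumption, since the text does not specify the divisor.

```python
import statistics

def gp_moment_estimators(excesses):
    """Moment estimators (sigma, gamma) of Eqs. (3.20)-(3.21) from threshold excesses."""
    mean = statistics.mean(excesses)
    std = statistics.stdev(excesses)  # sample standard deviation, n - 1 divisor
    ratio2 = (mean / std) ** 2
    sigma = 0.5 * mean * (1.0 + ratio2)
    gamma = 0.5 * (1.0 - ratio2)
    return sigma, gamma

# Hypothetical excesses, used only to show the call.
print(gp_moment_estimators([0.4, 1.1, 0.2, 2.7, 0.9, 0.6]))
```

When the mean and standard deviation of the excesses are equal, the shape estimate is zero and the scale estimate equals the mean, which corresponds to the exponential case.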

3.5.3 Maximum Likelihood Estimators

The maximum likelihood estimators (MLEs) are often preferred due to their
asymptotic efficiency. Let .y1 , . . . , yk denote the observed sample of exceedances
above the threshold u from an observed sample of peak values .x1 , . . . , xn . The log-
likelihood function .𝓁(σ, γ |y1 , . . . , yk ) for the sample .y1 , . . . , yk is given by
\[
\ell(\sigma, \gamma \mid y_1, \ldots, y_k) = -k \ln \sigma - \Bigl(\frac{1}{\gamma} + 1\Bigr) \sum_{i=1}^{k} \ln\Bigl(1 + \gamma\,\frac{y_i}{\sigma}\Bigr), \tag{3.22}
\]

provided .(1 + γ yi /σ ) > 0 for .i = 1, . . . , k. If .γ = 0, the log-likelihood assumes

the form
\[
\ell(\sigma, 0 \mid y_1, \ldots, y_k) = -k \ln \sigma - \frac{1}{\sigma} \sum_{i=1}^{k} y_i. \tag{3.23}
\]

The MLE .σ̂ and .γ̂ are obtained by maximizing .𝓁(σ, γ |y1 , . . . , yk ) with respect
to .σ and .γ . These values are found by numerical methods, except for the special
case $\gamma = 0$, for which a simple, closed-form solution exists for $\hat{\sigma}$. It is given as
$\hat{\sigma} = \sum_{i=1}^{k} y_i / k$.
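The log-likelihood of Eqs. (3.22) and (3.23) can be coded directly. The sketch below is illustrative only: the function names are hypothetical, and the grid search stands in for the numerical optimizer that would normally be used; it is not the procedure used in the book.

```python
import math

def gp_log_likelihood(sigma, gamma, y):
    """Log-likelihood of Eqs. (3.22)-(3.23); -inf outside the admissible domain."""
    if sigma <= 0.0:
        return -math.inf
    k = len(y)
    if gamma == 0.0:
        return -k * math.log(sigma) - sum(y) / sigma
    ratios = [gamma * yi / sigma for yi in y]
    if min(ratios) <= -1.0:  # requires 1 + gamma * y_i / sigma > 0 for all i
        return -math.inf
    return -k * math.log(sigma) - (1.0 / gamma + 1.0) * sum(math.log1p(r) for r in ratios)

def gp_mle_grid(y, gammas, sigmas):
    """Crude likelihood maximization over user-supplied grids; returns (sigma, gamma)."""
    best = max((gp_log_likelihood(s, g, y), s, g) for g in gammas for s in sigmas)
    return best[1], best[2]

# Hypothetical excesses and grids, used only to show the call.
y = [0.5, 1.0, 1.5, 3.0]
gammas = [i / 20.0 for i in range(-10, 11)]
sigmas = [0.25 * i for i in range(1, 21)]
print(gp_mle_grid(y, gammas, sigmas))
```

Restricting the grid to $\gamma = 0$ recovers the closed-form solution, the sample mean of the excesses, provided that value is on the grid.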

3.6 Model Validation

Probability (PP) and quantile (QQ) plots may be useful tools to get a grip on the
suitability of the GP model for the excesses beyond a chosen threshold u, cf. Beirlant
et al. (2004). Denote the threshold excesses by .y1 , y2 , . . . , yk and the estimated
GP model by .Ĥ . Ordering the threshold excesses by increasing magnitude: .y(1) ≤
y(2) ≤ . . . ≤ y(k) , the PP plot consists of the point graph

\[
\bigl( i/(k + 1),\ \hat{H}(y_{(i)}) \bigr), \quad i = 1, \ldots, k, \tag{3.24}
\]

where

\[
\hat{H}(y) = 1 - \bigl(1 + \hat{\gamma}\, y/\hat{\sigma}\bigr)^{-1/\hat{\gamma}}, \tag{3.25}
\]

provided $\hat{\gamma} \ne 0$. If $\hat{\gamma} = 0$, the plot is constructed using the exponential distribution.
Assuming again that $\hat{\gamma} \ne 0$, the QQ plot is traced out by the point graph

\[
\bigl( \hat{H}^{-1}(i/(k + 1)),\ y_{(i)} \bigr), \quad i = 1, \ldots, k, \tag{3.26}
\]

where

\[
\hat{H}^{-1}(y) = u + \frac{\hat{\sigma}}{\hat{\gamma}} \bigl[ y^{-\hat{\gamma}} - 1 \bigr]. \tag{3.27}
\]

If the GP model is a reasonable model for the distribution of the excesses of

u, both the PP and the QQ plot should follow approximately a straight line. These
procedures are discussed more thoroughly in Chap. 9.

3.7 Estimating Confidence Intervals by Bootstrapping

The principle of the bootstrapping method was briefly explained in the previous
chapter, cf. Sect. 2.8. In this section on the POT method, only the nonparametric
bootstrap is used. The goal of this method is to estimate some statistical quantity
V given by the unknown distribution function on the basis of an observed sample
.y = (y1 , y2 , . . . , yn ), which is a sample or vector consisting of n independent

observations of a random variable Y . Let .V̂ denote the estimate of V based on the
given sample. The nonparametric bootstrapping technique for estimating confidence
intervals on V is based on resampling (with replacement) from the empirical
distribution function (EDF) provided by the observed sample .y, cf. Sect. 2.8.
This is done as follows: The EDF gives rise to an empirical random variable $Y^*$.
$\ell$ bootstrap samples $\mathbf{y}^*_j$, $j = 1, \ldots, \ell$, with n independent observations of $Y^*$ in
each sample are now generated. Each sample $\mathbf{y}^*_j$ gives rise to an estimate $V^*_j$ of V.
Simple estimates for confidence intervals on V are derived by calculating the
sample standard deviation .sV∗ :

\[
s^*_V = \sqrt{\frac{1}{\ell - 1} \sum_{j=1}^{\ell} (V^*_j - \bar{V}^*)^2}, \tag{3.28}
\]

where $\bar{V}^* = (1/\ell) \sum_{j=1}^{\ell} V^*_j$. An approximate confidence interval at level $1 - q$ is
then obtained as

\[
\bigl( \hat{V} - w_{q/2}\, s^*_V,\ \hat{V} + w_{q/2}\, s^*_V \bigr), \tag{3.29}
\]

where .wq/2 denotes the .100(1 − q/2)% standard normal fractile. To get stable
results for this procedure, usually 20–30 bootstrap samples are sufficient. To avoid
making the assumption that the bootstrap estimates $V^*_j$ are normally distributed, the
true distribution may be approximated by generating a large number of bootstrap
samples; usually several thousand are needed, especially for small values of q. If
$\ell$ samples were generated, the $V^*_j$ are rearranged in increasing order $V^*_{(1)} \le V^*_{(2)} \le \ldots \le V^*_{(\ell)}$.
A $100(1 - q)\%$ confidence interval for V is then

\[
\bigl( V^*_{(L)},\ V^*_{(M)} \bigr), \tag{3.30}
\]

where .(L) = [q𝓁/2] and .(M) = [(1 − q/2)𝓁] (.[a] means the integer part of a).
Davison and Hinkley (1997) describe possible improvements of such estimates.
However, they are not discussed here as they are considered to be of less practical
interest.
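Both interval types can be sketched in a few lines. The code below is an illustration under stated assumptions: the function name `bootstrap_intervals`, its default arguments and the example data are hypothetical, and the estimator passed in can be any function of a sample, for instance a return level estimate.

```python
import random
import statistics

def bootstrap_intervals(sample, estimator, n_boot=2000, q=0.10, seed=0):
    """Nonparametric bootstrap confidence intervals at level 1 - q.

    Returns (normal_interval, percentile_interval), cf. Eqs. (3.28)-(3.30).
    """
    rng = random.Random(seed)
    n = len(sample)
    v_hat = estimator(sample)
    # Resample with replacement from the empirical distribution of the sample.
    v_star = sorted(estimator(rng.choices(sample, k=n)) for _ in range(n_boot))
    s_v = statistics.stdev(v_star)                        # Eq. (3.28)
    w = statistics.NormalDist().inv_cdf(1.0 - q / 2.0)    # standard normal fractile
    normal = (v_hat - w * s_v, v_hat + w * s_v)           # Eq. (3.29)
    lo = max(int(q * n_boot / 2.0), 1)                    # (L), 1-based index
    hi = max(int((1.0 - q / 2.0) * n_boot), 1)            # (M), 1-based index
    percentile = (v_star[lo - 1], v_star[hi - 1])         # Eq. (3.30)
    return normal, percentile

# Hypothetical data; the sample mean plays the role of the quantity V.
data = [float(i) for i in range(1, 51)]
print(bootstrap_intervals(data, statistics.mean, n_boot=500, q=0.10, seed=1))
```

The fixed seed makes the resampling reproducible. For a skewed estimator such as a high return level, the percentile interval is typically asymmetric about the point estimate, which the normal interval cannot represent.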
Chapter 4
A Point Process Approach to Extreme
Value Statistics

4.1 Introduction

In Chap. 1, some challenges to the problem of estimating extreme value distributions

from limited amounts of data were discussed. In the current chapter, this problem
will be approached by exploiting the concept of the mean upcrossing rate. It
will be shown that this opens the door to very robust and reasonably accurate
approximations to the extreme value distributions of stochastic processes, provided
some reasonable conditions are satisfied. In the majority of books on extreme value
statistics, which generally focus on asymptotic results for sequences of data, this
approach is usually discussed under a heading that typically goes like the title of the
current chapter. By stopping short of the asymptotic limits, this approach offers a
uniquely applicable methodology for approximate extreme value analysis of a host
of engineering problems.
In all sections of this chapter, except the last, it is assumed that the stochastic
process model is stationary. This is typically referred to as a short-term condition,
which is highlighted because the environmental processes causing loads and
motions of structures of interest to us, are changing their characteristics with time.
For example, in offshore engineering, the practical time window for sea states to
be considered stationary, is typically chosen to be three hours. Hence, in order to
properly handle the estimation of extreme values over the design life of a structure,
it is necessary to derive methods that allow us to obtain extreme value distributions
for the long-term condition. This problem is discussed in Sect. 4.10.

4.2 Average Rate of Level Crossings

As an example, let us assume that the forces at a given location in a structure

due to wind loads can be modeled as a stationary stochastic process .X(t) with


Fig. 4.1 A realization of a narrow banded process with three upcrossings of the level a during
time T

smooth realizations, cf. Chap. 9. In fact, in practice, it may be assumed that the
variance spectrum, see Chap. 9, of the process has compact support, implying that
the realizations are infinitely smooth. It is now desirable to calculate how often an
arbitrary realization of .X(t) can be expected to exceed a given force level a. An
equivalent formulation of the same problem is the following: what is the average
number of a-upcrossings per unit time by .X(t)? An a-upcrossing means that the
level a is exceeded with positive slope. Figure 4.1 shows part of a realization of a
narrow banded process, where there are three upcrossings of the indicated level a.
Each upcrossing is marked with a small circle in Fig. 4.1.
Let .N + (a, Δt) denote the random number of times that .X(t) upcrosses the level
a during the time interval .(t, t + Δt). It has been assumed that .X(t) has smooth
realizations, which means that they are differentiable and that the differentiated
realizations are continuous. This means that .x(t +δ) ≈ x(t)+ ẋ(t) δ for .0 ≤ δ ≤ Δt
(.Δt small) for any realization .x(t) of .X(t). In other words, any realization .x(t) can
be approximated by a straight line in the interval .(t, t + Δt). This implies that .x(t)
crosses the level a at most once in this interval, see Fig. 4.2. The conditions for one
upcrossing of the level a in the interval .(t, t + Δt) then become

\[
x(t) \le a \tag{4.1}
\]

and

\[
x(t + \Delta t) \approx x(t) + \dot{x}(t)\, \Delta t > a. \tag{4.2}
\]

To satisfy Eqs. (4.1) and (4.2), it is clearly necessary that .ẋ(t) > 0. The
conditions for one upcrossing can therefore be written in the following way:

\[
a - \dot{x}(t)\, \Delta t < x(t) \le a \tag{4.3}
\]

and

\[
\dot{x}(t) > 0. \tag{4.4}
\]

Fig. 4.2 Local approximation by a straight line at an upcrossing

It is seen that to calculate the probability for an upcrossing, the joint density
$f_{X(t)\dot{X}(t)}(x, \dot{x})$ of $X(t)$ and $\dot{X}(t)$ is needed. With the way stationarity has been defined,
$f_{X(t)\dot{X}(t)}(x, \dot{x})$ is automatically independent of t. Hence, we simply write $f_{X\dot{X}}(x, \dot{x})$
to indicate this independence. From Eqs. (4.3) and (4.4), it is obtained that, for
sufficiently small $\Delta t$,

\[
\mathrm{Prob}\{N^+(a, \Delta t) = 1\} = \int_0^{\infty} \int_{a - \dot{x}\Delta t}^{a} f_{X\dot{X}}(x, \dot{x})\, dx\, d\dot{x}. \tag{4.5}
\]

Hence, for small .Δt,

\[
\int_{a - \dot{x}\Delta t}^{a} f_{X\dot{X}}(x, \dot{x})\, dx = \dot{x}\, \Delta t\, f_{X\dot{X}}(a, \dot{x}), \tag{4.6}
\]

provided that .fXẊ (x, ẋ) is continuous. This implies that

\[
\mathrm{Prob}\{N^+(a, \Delta t) = 1\} = \Delta t \int_0^{\infty} \dot{x}\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x}. \tag{4.7}
\]

Since it is assumed that the realizations can be approximated locally by a straight
line, $p_n = \mathrm{Prob}\{N^+(a, \Delta t) = n\}$ is negligible for $n = 2, 3, \ldots$ compared to
$p_1$ for sufficiently small $\Delta t$. It follows that

\[
E[N^+(a, \Delta t)] = \sum_{n=0}^{\infty} n\, p_n \approx 0 \cdot p_0 + 1 \cdot p_1 + 2 \cdot 0 + 3 \cdot 0 + \ldots
= p_1 = \Delta t \int_0^{\infty} \dot{x}\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x}. \tag{4.8}
\]

The expected (or average) number of a-upcrossings per unit of time, which is
denoted by $\nu_X^+(a)$, is then given by the following expression:

\[
\nu_X^+(a) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\, E[N^+(a, \Delta t)] = \int_0^{\infty} \dot{x}\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x}. \tag{4.9}
\]

$\nu_X^+(a)$ is referred to by several names. In this book we use mostly average (or
mean) (a-)upcrossing rate and average (or mean) (a-)upcrossing frequency. It is
seen that $\nu_X^+(a)$ depends only on the level a. Because $\nu_X^+(a)$ is independent of t,
$E[N^+(a, T)] = \nu_X^+(a)\, T$ for any value of T, cf. Eq. (4.8), which was derived under
the assumption that $\Delta t$ is small.
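For a sampled record, the relation $E[N^+(a, T)] = \nu_X^+(a)\, T$ suggests estimating the mean upcrossing rate by counting upcrossings and dividing by the record length. The sketch below applies the conditions (4.1) and (4.2) between consecutive samples; the function names and the sine-wave record are illustrative assumptions.

```python
import math

def count_upcrossings(x, a):
    """Count indices i with x[i] <= a < x[i+1], the sampled form of Eqs. (4.1)-(4.2)."""
    return sum(1 for x0, x1 in zip(x, x[1:]) if x0 <= a < x1)

def empirical_upcrossing_rate(x, a, dt):
    """Number of a-upcrossings divided by the record length T = (len(x) - 1) * dt."""
    return count_upcrossings(x, a) / ((len(x) - 1) * dt)

# Hypothetical record: three cycles of a unit sine wave with period 1, sampled at dt = 0.01.
dt = 0.01
x = [math.sin(2.0 * math.pi * (i * dt - 0.1)) for i in range(301)]
print(count_upcrossings(x, 0.5), empirical_upcrossing_rate(x, 0.5, dt))
```

The count is only reliable when the sampling interval is short enough that the record cannot cross the level twice between two samples, which mirrors the straight-line assumption made above.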
Equation (4.9) is a useful formula. It is often referred to as the Rice formula
after its creator S. O. Rice (1954). To get a feeling for the physical content of the
Rice formula, one may note that the right hand side of Eq. (4.9) expresses a sort
of expectation value of the slope at upcrossing of the level a coupled with the
probability of being at that level. The greater the average slope at a given level,
the more often an arbitrary realization will upcross that level. In other words, large
average positive slope of the time histories implies shorter cycles and thereby more
frequent level crossings. At the same time one must expect that the number of
upcrossings of a given level is coupled to the probability of reaching that level. The
Rice formula therefore appears to have a fairly plausible form when it is subjected
to closer scrutiny.
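The Rice formula can be evaluated numerically for any joint density. As a hedged illustration, the sketch below integrates Eq. (4.9) by the trapezoidal rule for an assumed joint density in which $X$ and $\dot{X}$ are independent zero-mean Gaussian variables, and compares the result with the closed form that follows from that assumption. The function names and parameter values are hypothetical.

```python
import math

def rice_upcrossing_rate(joint_density, a, xdot_max, n_steps=20000):
    """Evaluate Eq. (4.9) by the trapezoidal rule on [0, xdot_max]."""
    h = xdot_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        xdot = i * h
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * xdot * joint_density(a, xdot)
    return total * h

def gaussian_joint_density(sigma_x, sigma_xdot):
    """Joint density of independent zero-mean Gaussian X and Xdot (assumed for illustration)."""
    def f(x, xdot):
        fx = math.exp(-0.5 * (x / sigma_x) ** 2) / (math.sqrt(2.0 * math.pi) * sigma_x)
        fv = math.exp(-0.5 * (xdot / sigma_xdot) ** 2) / (math.sqrt(2.0 * math.pi) * sigma_xdot)
        return fx * fv
    return f

sigma_x, sigma_xdot, a = 1.0, 2.0, 1.5
numeric = rice_upcrossing_rate(gaussian_joint_density(sigma_x, sigma_xdot), a, 10.0 * sigma_xdot)
closed = sigma_xdot / (2.0 * math.pi * sigma_x) * math.exp(-0.5 * (a / sigma_x) ** 2)
print(numeric, closed)
```

The upper integration limit of ten standard deviations of the slope is a pragmatic truncation; the neglected tail is far below the rounding level for this density.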
An a-downcrossing can be defined in the same way as an a-upcrossing.
An a-downcrossing implies that the level a is passed
with negative slope. The expected number of a-downcrossings per unit of time of a
stationary process $X(t)$ is denoted by $\nu_X^-(a)$. To derive the formula for $\nu_X^-(a)$, the
following observation is made: it is obvious that an a-downcrossing for the process
$X(t)$ is equivalent to a $(-a)$-upcrossing for the process $Y(t) = -X(t)$. Hence
$\nu_X^-(a) = \nu_Y^+(-a)$. From the relation $Y(t) = -X(t)$ it follows that $f_{Y\dot{Y}}(y, \dot{y}) =
f_{X\dot{X}}(-y, -\dot{y})$. This, together with Eq. (4.9), gives

\[
\begin{aligned}
\nu_X^-(a) = \nu_Y^+(-a) &= \int_0^{\infty} \dot{y}\, f_{Y\dot{Y}}(-a, \dot{y})\, d\dot{y} = \int_0^{\infty} \dot{y}\, f_{X\dot{X}}(a, -\dot{y})\, d\dot{y} \\
&= \int_0^{-\infty} \dot{x}\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x} = -\int_{-\infty}^{0} \dot{x}\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x}.
\end{aligned} \tag{4.10}
\]

A common way of rewriting Eq. (4.10) is

\[
\nu_X^-(a) = \int_{-\infty}^{0} |\dot{x}|\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x}. \tag{4.11}
\]

An a-crossing is either an a-upcrossing or an a-downcrossing. The expected
number of a-crossings per unit of time, denoted by $\nu_X(a)$, must satisfy the equation
$\nu_X(a) = \nu_X^+(a) + \nu_X^-(a)$. From Eqs. (4.9) and (4.11), it is then obtained that

\[
\nu_X(a) = \int_{-\infty}^{\infty} |\dot{x}|\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x}. \tag{4.12}
\]

For a stationary process, any a-upcrossing must by necessity be followed by an

a-downcrossing, and conversely. If that was not the case, the realizations would have
a mean drift in the positive or negative direction; that is, the mean value would not
be constant. For a stationary process, the following relations apply:

\[
\nu_X^-(a) = \nu_X^+(a) = \frac{1}{2}\, \nu_X(a). \tag{4.13}
\]

4.3 Distribution of Peaks of a Narrow Banded Process

Assume that .X(t) is a stationary process with zero mean value, which is also narrow
banded. What characterizes a realization of a narrow banded process is that the
amplitude and length (period) of subsequent cycles vary slowly, as illustrated in
Fig. 4.1. This implies that almost invariably, there is only one maximum or peak
value between an upcrossing and a subsequent downcrossing of any level a, see
Fig. 4.3. Because a zero mean value was assumed, the mean number of peaks
per unit of time will therefore be approximately equal to the mean rate of
zero-upcrossings, that is, $\nu_X^+(0)$.
Let .Xp denote the size or height of an arbitrary peak of .X(t). .Xp becomes a
random variable. The probability distribution of .Xp for a narrow banded process
.X(t) with zero mean value is now defined as

\[
\mathrm{Prob}\{X_p > a\} = \frac{\nu_X^+(a)}{\nu_X^+(0)}, \quad (a \ge 0). \tag{4.14}
\]

Fig. 4.3 (a) Peak between an upcrossing and a subsequent downcrossing of the level a. (b) Two
peaks between an upcrossing and a subsequent downcrossing

The distribution $F_{X_p}(a)$ is then given as

\[
F_{X_p}(a) = 1 - \frac{\nu_X^+(a)}{\nu_X^+(0)}, \quad a \ge 0, \tag{4.15}
\]

while .FXp (a) = 0 for .a < 0.

If $m_X \ne 0$, the following definition applies: $F_{X_p}(a) = 1 - \nu_X^+(a)/\nu_X^+(m_X)$ for
$a \ge m_X$ and $F_{X_p}(a) = 0$ for $a < m_X$. Because $\nu_X^+(a)$ is assumed to equal the
mean number of peaks per unit of time above the level a, then clearly $\nu_X^+(a)$ will
decrease with increasing a. For all processes of interest to us, it may be assumed
that $\nu_X^+(a) \to 0$ when $a \to \infty$. This implies that $F_{X_p}(a)$ gets the properties that a
distribution must have; that is, $F_{X_p}(a)$ is a nondecreasing function for increasing a,
$F_{X_p}(a) \to 0$ when $a \to -\infty$ and $F_{X_p}(a) \to 1$ when $a \to \infty$. It is tacitly assumed
that $\nu_X^+(a)$ gets its largest value when $a = m_X$, which is usually the case. As
observed, it is true for a Gaussian process. Note that it must also apply to processes
that are characterized by having only one maximum between an upcrossing and
a subsequent downcrossing of the mean value level, that is, for infinitely narrow
banded processes. The assumption made is therefore quite reasonable.
If $\nu_X^+(a)$ can be differentiated with respect to a, the density for peaks is obtained,
assuming that $m_X = 0$,

\[
f_{X_p}(a) = -\frac{1}{\nu_X^+(0)}\, \frac{d\nu_X^+(a)}{da}, \quad a \ge 0, \tag{4.16}
\]

while .fXp (a) = 0 for .a < 0. Equations (4.15) and (4.16) are sometimes called the
“peak formulas.” It is emphasized that they only apply to narrow banded processes.
For the sake of completeness, it should be mentioned that the peak distributions
can also be defined in the general case. However, this would require that the exact

mean number of peaks per unit of time were used in definitions. The expressions
then obtained would be of limited practical use because they are difficult, if not
impossible, to calculate. The simplifications that are sometimes introduced to make
the expressions amenable to calculations are often, in fact, ill defined.

4.4 Average Upcrossing Rate and Distribution of Peaks of a Gaussian Process

A random variable X is normally distributed if the density of X is given as

\[
f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X} \exp\left\{ -\frac{1}{2} \left( \frac{x - m_X}{\sigma_X} \right)^2 \right\} \tag{4.17}
\]

for $\sigma_X > 0$. If $\sigma_X = 0$, then $X = m_X$. In this case, X may be considered as a degenerate normal variable.
Two random variables X and Y are called jointly normally distributed if the joint density of X and Y is given by the equation

\[
f_{XY}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho_{XY}^2}} \exp\left\{ -\frac{1}{2(1 - \rho_{XY}^2)} \left[ \left( \frac{x - m_X}{\sigma_X} \right)^2 - 2\rho_{XY} \frac{x - m_X}{\sigma_X} \frac{y - m_Y}{\sigma_Y} + \left( \frac{y - m_Y}{\sigma_Y} \right)^2 \right] \right\}, \tag{4.18}
\]

where $\rho_{XY} = E[(X - m_X)(Y - m_Y)]/(\sigma_X \sigma_Y)$ is the correlation coefficient for X and Y. Invariably, $|\rho_{XY}| \le 1$, but to be precise, in Eq. (4.18) it is assumed that $|\rho_{XY}| < 1$ and that $\sigma_X > 0$, $\sigma_Y > 0$.

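If X and Y are uncorrelated, $\rho_{XY} = 0$ by definition. For this case, Eq. (4.18) assumes the form

\[
\begin{aligned}
f_{XY}(x, y) &= \frac{1}{2\pi \sigma_X \sigma_Y} \exp\left\{ -\frac{1}{2} \left[ \left( \frac{x - m_X}{\sigma_X} \right)^2 + \left( \frac{y - m_Y}{\sigma_Y} \right)^2 \right] \right\} \\
&= \frac{1}{\sqrt{2\pi}\,\sigma_X} \exp\left\{ -\frac{1}{2} \left( \frac{x - m_X}{\sigma_X} \right)^2 \right\} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_Y} \exp\left\{ -\frac{1}{2} \left( \frac{y - m_Y}{\sigma_Y} \right)^2 \right\} \\
&= f_X(x) \cdot f_Y(y).
\end{aligned} \tag{4.19}
\]

According to Eq. (4.19), X and Y are (statistically) independent variables. We have therefore shown that two uncorrelated, jointly normally distributed (real) variables are automatically independent. However, one should make a note of the fact that this does not apply to other types of random variables.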
A stochastic process $X(t)$ is called Gaussian or normally distributed if the random variable $Z = \sum_{j=1}^{n} c_j X(t_j)$ is normally distributed for any (arbitrary) choice of $n$ $(= 1, 2, \ldots)$, constants $c_1, \ldots, c_n$, and times $t_1, \ldots, t_n$. If $Y(t)$ is the response of a linear, time-invariant system where the input process $F(t)$ is Gaussian, $F(t)$ and $Y(t)$ are jointly normally distributed variables for any time t, and $Y(t)$ also becomes a Gaussian process.

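If $X(t)$ is a stationary and differentiable Gaussian process, $X(t)$ and $\dot{X}(t)$ are jointly normally distributed for any t. It can be shown that $\rho_{X(t)\dot{X}(t)} = 0$, cf. Wong and Hajek (1985), Naess and Moan (2013); hence, $X(t)$ and $\dot{X}(t)$ are independent variables, and it follows that

\[
f_{X(t)\dot{X}(t)}(x, \dot{x}) = f_{X(t)}(x) \cdot f_{\dot{X}(t)}(\dot{x}) = \frac{1}{2\pi \sigma_X \sigma_{\dot{X}}} \exp\left\{ -\frac{1}{2} \left[ \left( \frac{x - m_X}{\sigma_X} \right)^2 + \left( \frac{\dot{x}}{\sigma_{\dot{X}}} \right)^2 \right] \right\}, \tag{4.20}
\]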

where the fact that $m_{\dot{X}} = 0$ for the derivative $\dot{X}(t)$ of any stationary and differentiable process $X(t)$ has been used. This can be seen as follows:

\[
E[\dot{X}(t)] = \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} \dot{x}_j(t) = \frac{d}{dt} \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} x_j(t) = \frac{d}{dt} E[X(t)]. \tag{4.21}
\]

Because $m_X = E[X(t)] = \text{constant}$ for a stationary process, it follows immediately that $m_{\dot{X}} = 0$. Note also that $f_{X(t)\dot{X}(t)}(x, \dot{x})$ is independent of t.
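Let us calculate the mean upcrossing rate $\nu_X^+(a)$. Substituting from Eq. (4.20) into Eq. (4.9) gives

\[
\begin{aligned}
\nu_X^+(a) &= \int_0^\infty \frac{\dot{x}}{2\pi \sigma_X \sigma_{\dot{X}}} \exp\left\{ -\frac{1}{2} \left[ \left( \frac{a - m_X}{\sigma_X} \right)^2 + \left( \frac{\dot{x}}{\sigma_{\dot{X}}} \right)^2 \right] \right\} d\dot{x} \\
&= \frac{1}{2\pi} \frac{\sigma_{\dot{X}}}{\sigma_X} \exp\left\{ -\frac{1}{2} \left( \frac{a - m_X}{\sigma_X} \right)^2 \right\} \int_0^\infty \frac{\dot{x}}{\sigma_{\dot{X}}} \exp\left\{ -\frac{1}{2} \left( \frac{\dot{x}}{\sigma_{\dot{X}}} \right)^2 \right\} d\!\left( \frac{\dot{x}}{\sigma_{\dot{X}}} \right) \\
&= \frac{1}{2\pi} \frac{\sigma_{\dot{X}}}{\sigma_X} \exp\left\{ -\frac{1}{2} \left( \frac{a - m_X}{\sigma_X} \right)^2 \right\}.
\end{aligned} \tag{4.22}
\]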
It is seen that $\nu_X^+(a)$ decreases rapidly (with $\sigma_X$ as reference scale) at each side of the mean value, where it assumes its largest value

\[
\nu_X^+(m_X) = \frac{1}{2\pi} \frac{\sigma_{\dot{X}}}{\sigma_X}. \tag{4.23}
\]

In many situations one would prefer to define the origin so that $m_X = 0$. The expression on the rhs of Eq. (4.23) is therefore often referred to as the mean zero-upcrossing rate, under the tacit assumption that $m_X = 0$. Another corresponding parameter that is often met in the literature is the mean zero-crossing period $T_z$, which is defined by

\[
T_z = \left( \nu_X^+(0) \right)^{-1} = 2\pi \frac{\sigma_X}{\sigma_{\dot{X}}}. \tag{4.24}
\]

For a stationary Gaussian process $X(t)$ with mean value zero, $\nu_X^+(a)$ is completely determined by the two standard deviations $\sigma_X$ and $\sigma_{\dot{X}}$. If the variance spectrum $S_X(\omega)$ of $X(t)$ is known, $\sigma_X$ and $\sigma_{\dot{X}}$ can be calculated by using the formulas

\[
\sigma_X^2 = \int_{-\infty}^{\infty} S_X(\omega)\, d\omega \tag{4.25}
\]

and

\[
\sigma_{\dot{X}}^2 = \int_{-\infty}^{\infty} \omega^2 S_X(\omega)\, d\omega. \tag{4.26}
\]
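The chain from Eqs. (4.25) and (4.26) to the rate in Eq. (4.22) and the period in Eq. (4.24) can be sketched numerically. The following is a minimal illustration only: the Gaussian-shaped spectrum `example_spectrum`, the cut-off `omega_max`, and the trapezoidal rule are assumptions made for the example and do not come from the text.

```python
import math

def spectral_std_devs(spectrum, omega_max, n=20000):
    """Trapezoidal estimates of sigma_X and sigma_Xdot, Eqs. (4.25)-(4.26).

    `spectrum` is a two-sided variance spectrum, assumed even in omega, so the
    integrals over (-inf, inf) are taken as twice those over (0, omega_max).
    """
    d = omega_max / n
    m0 = 0.0
    m2 = 0.0
    for k in range(n + 1):
        w = k * d
        weight = 0.5 if k in (0, n) else 1.0
        s = spectrum(w)
        m0 += weight * s * d
        m2 += weight * w * w * s * d
    return math.sqrt(2.0 * m0), math.sqrt(2.0 * m2)

def upcrossing_rate(a, sigma_x, sigma_xdot, m_x=0.0):
    """Mean a-upcrossing rate of a stationary Gaussian process, Eq. (4.22)."""
    z = (a - m_x) / sigma_x
    return sigma_xdot / (2.0 * math.pi * sigma_x) * math.exp(-0.5 * z * z)

# Hypothetical narrow-band spectrum centred at 1 rad/s (illustration only).
def example_spectrum(w):
    return math.exp(-0.5 * ((abs(w) - 1.0) / 0.1) ** 2)

sigma_x, sigma_xdot = spectral_std_devs(example_spectrum, omega_max=3.0)
nu0 = upcrossing_rate(0.0, sigma_x, sigma_xdot)  # Eq. (4.23) with m_X = 0
t_z = 1.0 / nu0                                  # Eq. (4.24)
print(f"sigma_X = {sigma_x:.4f}, T_z = {t_z:.4f} s")
```

For this particular spectrum the ratio $\sigma_{\dot{X}}/\sigma_X$ is close to the centre frequency, so $T_z$ comes out close to $2\pi$ seconds, as one would expect for a narrow banded process.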

If $X(t)$ is also assumed to be narrow banded, the density $f_{X_p}(a)$ of the peaks of $X(t)$ may be calculated. From Eqs. (4.16) and (4.22), it is found that ($m_X = 0$)

\[
f_{X_p}(a) =
\begin{cases}
\dfrac{a}{\sigma_X^2} \exp\left\{ -\dfrac{a^2}{2\sigma_X^2} \right\}, & a \ge 0, \\
0, & a < 0.
\end{cases} \tag{4.27}
\]

A density of this type is called a Rayleigh density, and $X_p$ becomes a Rayleigh distributed variable. An example of $f_{X_p}(a)$ is shown in Fig. 4.4.
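As a worked intermediate step (a sketch using only Eqs. (4.15), (4.16) and (4.22) with $m_X = 0$), differentiating the upcrossing rate gives the Rayleigh density and the corresponding distribution function directly:

```latex
\[
\frac{d\nu_X^+(a)}{da}
  = -\frac{a}{\sigma_X^2}\,\nu_X^+(0)\exp\left\{-\frac{a^2}{2\sigma_X^2}\right\}
\;\Longrightarrow\;
f_{X_p}(a) = -\frac{1}{\nu_X^+(0)}\frac{d\nu_X^+(a)}{da}
  = \frac{a}{\sigma_X^2}\exp\left\{-\frac{a^2}{2\sigma_X^2}\right\},
\qquad
F_{X_p}(a) = 1 - \exp\left\{-\frac{a^2}{2\sigma_X^2}\right\},
\quad a \ge 0 .
\]
```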
Gaussian processes have great practical significance. This is primarily due to
the following two reasons. In many cases, important physical phenomena that give
rise to loads on structures can be modeled as Gaussian processes with a reasonable
degree of accuracy. Moreover, weakly damped structures usually make the response
more Gaussian than the load. In addition comes the fact that a Gaussian process is
particularly amenable to analytical treatment.
From Eq. (4.22), it is seen that $\nu_X^+(a)$ is proportional to $f_X(a)$, which is a consequence of the fact that $X(t)$ and $\dot{X}(t)$ are independent random variables for any t. Of course, in general this is not the case. Returning to Eq. (4.9), it may be rewritten as

\[
\nu_X^+(a) = \int_0^\infty \dot{x}\, f_{X\dot{X}}(a, \dot{x})\, d\dot{x} = \int_0^\infty \dot{x}\, f_{\dot{X}|X}(\dot{x} \mid a)\, f_X(a)\, d\dot{x} = E[\dot{X}^+ \mid X = a]\, f_X(a). \tag{4.28}
\]

Here $E[\dot{X}^+ \mid X = a]$ denotes the average positive slope of $X(t)$ at the level $X(t) = a$, which in general depends on the level a. However, as pointed out by Naess and

Fig. 4.4 The PDF $f_{X_p}(a)$ with $\sigma_X = 1.0$ of Rayleigh distributed peaks

Gaidai (2008), for a wide range of response processes, this dependence appears to
be surprisingly weak.

4.5 Extreme Value Distributions by the Upcrossing Rate Method

The starting point is that a response quantity has been modeled as a stationary
stochastic process. The goal now is to calculate the probability distribution of the
largest value of the response process $X(t)$ during a specified time period. It is also
a goal to determine the probability distribution of the time to the first exceedance
of a given response level. These problems are very difficult to solve exactly, but
by simplifying somewhat, one may often find reasonably accurate approximate
solutions.
Let us denote the largest value that $X(t)$ assumes during the time T by $M(T)$. That is, $M(T) = \max\{X(t); 0 \le t \le T\}$. Also, let $\Theta(a)$ denote the time to the first exceedance of the level a. $M(T)$ and $\Theta(a)$ are random variables. If it is convenient to emphasize that $M(T)$ and $\Theta(a)$ refer to the process $X(t)$, the notation $M_X(T)$ and $\Theta_X(a)$ is used. Clearly,

\[
\mathrm{Prob}\{M(T) \le a\} = \mathrm{Prob}\{\Theta(a) > T\} \tag{4.29}
\]

because both events $\{M(T) \le a\}$ and $\{\Theta(a) > T\}$ express the same, namely, that there are no exceedances of the level a during time T. Let us call this event $\mathcal{E}$. Then $\mathcal{E} = \{X(t) \le a \text{ for all } t \in (0, T)\}$, but this event can also be expressed as $\mathcal{E} = \{X(0) \le a \text{ and } N^+(a, T) = 0\}$. This is so, because if $X(0) \le a$ and there are no subsequent upcrossings of a, there can be no exceedances. Hence, $\mathrm{Prob}\{\mathcal{E}\} = \mathrm{Prob}\{X(0) \le a \text{ and } N^+(a, T) = 0\} \to \mathrm{Prob}\{N^+(a, T) = 0\}$ when $a \to \infty$ because of the law of marginal probability, since $\mathrm{Prob}\{X(0) \le a\} \to 1$. When it is written that $a \to \infty$ here, it means that a assumes values that are large compared to the typical values for the process considered; it should not be strictly interpreted as meaning that a grows beyond all limits. In the chapter on asymptotic extreme value distributions, this will be different. Because the extreme values in most cases are much larger than the typical values, the approximation $\mathrm{Prob}\{\mathcal{E}\} = \mathrm{Prob}\{N^+(a, T) = 0\}$ is introduced.
To determine $\mathrm{Prob}\{N^+(a, T) = 0\}$, the following simplifying assumption is introduced: upcrossings of high levels are statistically independent events. If the process $X(t)$ is not too narrow banded (for a narrow banded process, neighboring peaks tend to be of similar size), this is a reasonable approximation. This simplification implies that the random number of upcrossings in an arbitrary time interval of length T is Poisson distributed with parameter $E[N^+(a, T)] = \nu_X^+(a)\, T$. In particular, this leads to the result

\[
\mathrm{Prob}\{N^+(a, T) = 0\} = \exp\{-\nu_X^+(a)\, T\}. \tag{4.30}
\]

A derivation of Eq. (4.30) is given in the appendix at the end of this chapter.
From Eq. (4.30), it is then obtained that, for large values of a,

\[
F_{M(T)}(a) = \mathrm{Prob}\{M(T) \le a\} = \exp\{-\nu_X^+(a)\, T\}, \tag{4.31}
\]

and

\[
F_{\Theta(a)}(\theta) = 1 - \mathrm{Prob}\{\Theta(a) > \theta\} = 1 - \exp\{-\nu_X^+(a)\, \theta\}. \tag{4.32}
\]

Often, $\nu_X^+(a)\, T \ll 1$ for a relevant level a and time interval $(0, T)$, such that the probability of exceedance of a during the time T can be approximated as

\[
\mathrm{Prob}\{\text{Exceedance}\} = F_{\Theta(a)}(T) \approx \nu_X^+(a)\, T \tag{4.33}
\]

because $e^x \approx 1 + x$ for $|x| \ll 1$.
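Equations (4.31) to (4.33) are simple to evaluate once the upcrossing rate is known. The sketch below assumes a zero mean Gaussian process, so that the rate follows Eq. (4.22); the numerical values of `nu0`, `sigma_x`, `duration` and `level` are hypothetical and chosen only for illustration.

```python
import math

def upcrossing_rate_gaussian(a, nu0, sigma_x):
    """Eq. (4.22) for m_X = 0, written as nu0 * exp(-a^2 / (2 sigma_X^2))."""
    return nu0 * math.exp(-0.5 * (a / sigma_x) ** 2)

def extreme_value_cdf(nu_a, duration):
    """Prob{M(T) <= a} under the Poisson assumption, Eq. (4.31)."""
    return math.exp(-nu_a * duration)

def first_passage_cdf(nu_a, theta):
    """Prob{Theta(a) <= theta}, Eq. (4.32), computed as 1 - exp(-nu_a * theta)."""
    return -math.expm1(-nu_a * theta)

# Hypothetical example: zero-upcrossing rate 0.1 per second, unit standard
# deviation, a 3 hour duration, and a level of five standard deviations.
nu0, sigma_x = 0.1, 1.0
duration = 3 * 3600.0
level = 5.0

nu_a = upcrossing_rate_gaussian(level, nu0, sigma_x)
p_exact = first_passage_cdf(nu_a, duration)   # Eq. (4.32) with theta = T
p_linear = nu_a * duration                    # Eq. (4.33)
print(f"exceedance probability: {p_exact:.6e} (linearized: {p_linear:.6e})")
```

The linearized value of Eq. (4.33) slightly overestimates the exceedance probability, with an error of order $(\nu_X^+(a)\,T)^2/2$.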

The derivations above yield approximate expressions for the distribution of the extreme value $M(T)$ and for the time to first passage $\Theta(a)$ for large values of a, and it was seen that these distributions are determined by the mean level-upcrossing rate $\nu_X^+(a)$. The only significant
simplification that has been adopted is the assumption that upcrossings of high levels
are independent. Regarding the response of a lightly damped structure, the response
maxima will have a tendency to occur in clumps. In particular, large peaks will tend
to occur simultaneously as illustrated in Fig. 4.1. The assumption about independent
upcrossings will then tend to be less valid. In such cases, Eqs. (4.31) and (4.32) may
give significant deviations from the correct values, but always on the safe side in the

sense that Eq. (4.31), for instance, leads to larger extreme value estimates than the
correct ones. It will be seen in the next chapter that the ACER method provides a
practical and elegant solution to this specific problem.
When the upcrossing rate is estimated from time series of limited length, there
will be uncertainty due to sample variability. Hence, uncertainty quantification is an
important issue in extreme value analysis, which is sensitive to small model changes.
Since the upcrossing rate is, in fact, largely equivalent to one of the ACER functions,
this problem is discussed in the next chapter.
In some situations, the relevant extreme values will be connected to the smallest or minimum values of a process. Such a case can, however, be easily recast to a study of maximum values by observing that $\min\{X(t); 0 \le t \le T\} = -\max\{-X(t); 0 \le t \le T\}$.

4.6 Extreme Values of Gaussian Processes

The particular case of a Gaussian process warrants special attention. Hence, let $X(t)$ be a stationary Gaussian process with a mean level upcrossing rate given by Eq. (4.22). For simplicity, it is assumed that $m_X = 0$. In any case, since changing the mean value is equivalent to a constant shift of all realizations, nothing is lost by this assumption.
The distribution of $M(T)$ ($\ge 0$) for large values of a is then, according to Eqs. (4.22) and (4.31), given by the expression

\[
F_{M(T)}(a) = \exp\left\{ -\nu_X^+(0)\, T \exp\left( -\frac{a^2}{2\sigma_X^2} \right) \right\}, \tag{4.34}
\]

where the mean zero-upcrossing rate enters, that is,

\[
\nu_X^+(0) = \frac{1}{2\pi} \frac{\sigma_{\dot{X}}}{\sigma_X}. \tag{4.35}
\]

The density of $M(T)$, $f_{M(T)}(a)$, can be calculated from Eq. (4.34) by $f_{M(T)}(a) = dF_{M(T)}(a)/da$, and it is given as follows (for large values of a):

\[
\begin{aligned}
f_{M(T)}(a) &= \frac{a}{\sigma_X^2}\, \nu_X^+(0)\, T \exp\left( -\frac{a^2}{2\sigma_X^2} \right) \exp\left\{ -\nu_X^+(0)\, T \exp\left( -\frac{a^2}{2\sigma_X^2} \right) \right\} \\
&= \frac{a}{\sigma_X^2}\, \nu_X^+(0)\, T \exp\left( -\frac{a^2}{2\sigma_X^2} \right) F_{M(T)}(a).
\end{aligned} \tag{4.36}
\]

Assuming that Eq. (4.36) is valid for all values of a, Fig. 4.5 shows the density of $M(T)$ for various values of $\nu_X^+(0)\, T$.

Fig. 4.5 Various densities for a stationary Gaussian process $X(t)$. A: Density of $X(t)$. B: Density of the peaks of $X(t)$ (narrow banded case). C: Density of $M(T)$ for various values of $N = \nu_X^+(0)\, T$ (curves for $N = 10^2, 10^3, 10^4, 10^5$; horizontal axis: number of standard deviations; vertical axis: PDF)

A quantity of particular interest in connection with design of structures is the level $a = \xi_p = \xi_p(T)$, which with probability p is not exceeded during the time T, that is,

\[
F_{M(T)}(\xi_p) = p. \tag{4.37}
\]

Because $p = \exp(\ln p)$, it follows from Eq. (4.34) that

\[
\exp\left( -\frac{\xi_p^2}{2\sigma_X^2} \right) = -\frac{\ln p}{\nu_X^+(0)\, T}. \tag{4.38}
\]

This leads to the formula

\[
\xi_p(T) = \sigma_X \sqrt{ 2 \ln\left( \frac{\nu_X^+(0)\, T}{\ln(1/p)} \right) }. \tag{4.39}
\]

This formula can then be used to find the response level that has a probability of 1%, say, of being exceeded ($p = 0.99$) during time T.
Table 4.1 Table of $\xi_p/\sigma_X$-values

$\nu_X^+(0)\, T$    10        100       1000      10,000
p = 0.37            2.1460    3.0349    3.7169    4.2919
p = 0.50            2.3105    3.1533    3.8143    4.3765
p = 0.90            3.0176    3.7028    4.2797    4.7876
p = 0.95            3.2474    3.8924    4.4448    4.9357
p = 0.99            3.7156    4.2908    4.7975    5.2556

The most probable extreme value, denoted by $\hat{\xi} = \hat{\xi}(T)$, is given to good approximation by the formula

\[
\hat{\xi} = \sigma_X \sqrt{2 \ln\left( \nu_X^+(0)\, T \right)}. \tag{4.40}
\]

$\hat{\xi}$ is the value where the PDF of the extreme value $M(T)$ attains its maximum. It follows that $\hat{\xi} \approx \xi_{0.37}$ because $\ln(1/0.37) \approx 1.0$ ($e^{-1} \approx 0.37$).

Specific values of the quantiles $\xi_p/\sigma_X$ for various numbers of zero-upcrossings
are listed in Table 4.1. Apart from providing a set of useful reference values,
Table 4.1 also clearly illustrates the slow increase in typical extreme values with
increasing time for a stationary Gaussian process.
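The entries of Table 4.1 follow directly from Eqs. (4.39) and (4.40). A minimal sketch (the function names are chosen here for illustration) that evaluates the two formulas:

```python
import math

def quantile_ratio(n_upcrossings, p):
    """xi_p / sigma_X from Eq. (4.39), with n_upcrossings = nu_X^+(0) * T."""
    return math.sqrt(2.0 * math.log(n_upcrossings / math.log(1.0 / p)))

def most_probable_ratio(n_upcrossings):
    """Most probable extreme value over sigma_X, Eq. (4.40)."""
    return math.sqrt(2.0 * math.log(n_upcrossings))

for p in (0.50, 0.90, 0.95, 0.99):
    row = [quantile_ratio(n, p) for n in (10, 100, 1000, 10000)]
    print(f"p = {p:.2f}: " + "  ".join(f"{value:.4f}" for value in row))
```

A hundredfold increase in the number of zero-upcrossings raises the quantile by well under one standard deviation, which is the slow growth the table is meant to show.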
Another quantity that is often used as a measure of extreme values is the expected largest value during a given time T, that is, $E[M(T)]$. This expected value can be calculated as

\[
E[M(T)] = \int_0^\infty a\, \frac{dF_{M(T)}(a)}{da}\, da. \tag{4.41}
\]

It is now convenient to introduce a new integration variable $\eta$ defined by the equation $F_{M(T)}(a) = e^{-\eta}$, which implicitly defines a as a function of $\eta$. Note that $a = \infty$ corresponds to $\eta = 0$, while $a = 0$ corresponds to $\eta = \nu_X^+(0)\, T$, which is effectively $\infty$ for the large values of $\nu_X^+(0)\, T$ of interest. It is obtained that

\[
\frac{dF_{M(T)}(a)}{da}\, da = \frac{dF_{M(T)}(a)}{d\eta} \frac{d\eta}{da}\, da = \frac{dF_{M(T)}(a)}{d\eta}\, d\eta = -e^{-\eta}\, d\eta. \tag{4.42}
\]

Substituted into Eq. (4.41), this leads to the equation

\[
E[M(T)] = \int_0^\infty a(\eta)\, e^{-\eta}\, d\eta, \tag{4.43}
\]
where $a = a(\eta)$ is a function of $\eta$. The way $\eta$ is defined, $\eta = \nu_X^+(0)\, T \exp\{-a^2/(2\sigma_X^2)\}$. Our focus is on large a-values, that is, small $\eta$-values, and it is obtained by solving with respect to a:

\[
\begin{aligned}
a &= \sigma_X \sqrt{2 \ln(\nu_X^+(0)\, T) - 2 \ln \eta} \\
&= \sigma_X \sqrt{2 \ln(\nu_X^+(0)\, T)} \left\{ 1 - \frac{\ln \eta}{\ln(\nu_X^+(0)\, T)} \right\}^{1/2} \\
&= \sigma_X \sqrt{2 \ln(\nu_X^+(0)\, T)} \left\{ 1 - \frac{\ln \eta}{2 \ln(\nu_X^+(0)\, T)} - \frac{(\ln \eta)^2}{8 (\ln(\nu_X^+(0)\, T))^2} + \ldots \right\}.
\end{aligned} \tag{4.44}
\]

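It can be shown that the main contribution to the integral in Eq. (4.43) comes from small $\eta$-values, and it is found that

\[
\begin{aligned}
E[M(T)] &\approx \sigma_X \sqrt{2 \ln(\nu_X^+(0)\, T)} \left\{ 1 - \frac{\int_0^\infty \ln \eta\; e^{-\eta}\, d\eta}{2 \ln(\nu_X^+(0)\, T)} - \frac{\int_0^\infty (\ln \eta)^2 e^{-\eta}\, d\eta}{8 (\ln(\nu_X^+(0)\, T))^2} + \ldots \right\} \\
&= \sigma_X \sqrt{2 \ln(\nu_X^+(0)\, T)} \left\{ 1 + \frac{\lambda_E}{2 \ln(\nu_X^+(0)\, T)} - \frac{\pi^2/6 + \lambda_E^2}{8 (\ln(\nu_X^+(0)\, T))^2} + \ldots \right\}
\end{aligned} \tag{4.45}
\]

because $\int_0^\infty \ln \eta\; e^{-\eta}\, d\eta = -\lambda_E$ and $\int_0^\infty (\ln \eta)^2 e^{-\eta}\, d\eta = \pi^2/6 + \lambda_E^2$, where $\lambda_E = 0.5772\ldots$ denotes Euler's constant. Usually, $\ln(\nu_X^+(0)\, T)$ is sufficiently large to warrant the following approximation

\[
E[M(T)] \approx \sigma_X \left\{ \sqrt{2 \ln(\nu_X^+(0)\, T)} + \frac{\lambda_E}{\sqrt{2 \ln(\nu_X^+(0)\, T)}} \right\}. \tag{4.46}
\]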

Similarly, it is found that

\[
E[M(T)^2] = \int_0^\infty a(\eta)^2 e^{-\eta}\, d\eta \approx 2\sigma_X^2 \left\{ \ln(\nu_X^+(0)\, T) + \lambda_E \right\}. \tag{4.47}
\]

From Eqs. (4.47) and (4.46), retaining also the second-order term of Eq. (4.45), the expression for the variance of $M(T)$ may be derived. It is obtained that

\[
\sigma_{M(T)}^2 = E[M(T)^2] - E[M(T)]^2 \approx \frac{\pi^2 \sigma_X^2}{12 \ln(\nu_X^+(0)\, T)}. \tag{4.48}
\]
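The quality of the approximations in Eqs. (4.46) and (4.48) can be checked against a direct numerical evaluation of the mean of the distribution in Eq. (4.34). The sketch below is only an illustration: it assumes $M(T) \ge 0$, so that $E[M(T)] = \int_0^\infty (1 - F_{M(T)}(a))\, da$, and the integration limit `a_max` and step count are arbitrary choices.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, lambda_E in the text

def mean_extreme_approx(n, sigma_x=1.0):
    """E[M(T)] from Eq. (4.46), with n = nu_X^+(0) * T."""
    root = math.sqrt(2.0 * math.log(n))
    return sigma_x * (root + EULER_GAMMA / root)

def std_extreme_approx(n, sigma_x=1.0):
    """Standard deviation of M(T) from Eq. (4.48)."""
    return math.pi * sigma_x / math.sqrt(12.0 * math.log(n))

def mean_extreme_numeric(n, sigma_x=1.0, a_max=12.0, steps=24000):
    """Trapezoidal integral of 1 - F_M(T)(a) over 0 <= a <= a_max * sigma_x,
    with F_M(T) taken from Eq. (4.34) and assumed valid for all a >= 0."""
    d = a_max * sigma_x / steps
    total = 0.0
    for k in range(steps + 1):
        a = k * d
        cdf = math.exp(-n * math.exp(-0.5 * (a / sigma_x) ** 2))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (1.0 - cdf) * d
    return total

for n in (100, 1000, 10000):
    print(n, round(mean_extreme_approx(n), 4),
          round(mean_extreme_numeric(n), 4),
          round(std_extreme_approx(n), 4))
```

In this sketch the two-term approximation of Eq. (4.46) lies slightly above the numerical mean, consistent with the negative second-order term in Eq. (4.45).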
One may note that $E[M(T)] \to \infty$, while $\sigma_{M(T)} \to 0$ when $T \to \infty$. In Fig. 4.5 it is clearly seen how the mean value of $M(T)$ increases, while the standard deviation decreases with increasing values of T.
Let us conclude this discussion of the extreme value distribution of a stationary
Gaussian process by showing that it approaches one of the three asymptotic
extreme value distributions discussed in Chap. 2, viz. the Gumbel distribution. The
expression for $F_{M(T)}(a)$ given by Eq. (4.34) can be written as

\[
F_{M(T)}(a) = \exp\left\{ -\exp\left[ -h(a) \right] \right\}, \quad (a \to \infty), \tag{4.49}
\]

where $h(a) = \dfrac{a^2}{2\sigma_X^2} - \ln(\nu_X^+(0)\, T)$. Let $a_0$ denote the solution of the equation $h(a) = 0$. This gives $a_0 = \sigma_X \sqrt{2 \ln(\nu_X^+(0)\, T)}$. It can be verified that the range of values of a where most of the extreme value distribution "lives," cf. Fig. 4.5, satisfies $|a - a_0| \ll a_0$ for $a_0 \to \infty$. Then $h(a) = h(a) - h(a_0) \approx h'(a_0)\,(a - a_0)$ for large values of $a_0$. Hence, it follows that asymptotically,

\[
F_{M(T)}(a) \approx \exp\left\{ -\exp\left[ -h'(a_0)\,(a - a_0) \right] \right\} = \exp\left\{ -\exp\left[ -\frac{a - a_0}{\sigma_0} \right] \right\}, \quad (a_0 \to \infty), \tag{4.50}
\]

where $\sigma_0 = \sigma_X / \sqrt{2 \ln(\nu_X^+(0)\, T)}$. This is clearly an extreme value distribution of the Gumbel type. It may be noted that the mean value of this Gumbel distribution is $a_0 + 0.5772\,\sigma_0$, cf. Sect. 2.2, which agrees with the mean value given in Eq. (4.46). Similarly, the variance of this Gumbel distribution is $\pi^2 \sigma_0^2/6$, which coincides with the variance derived in Eq. (4.48).
The Gumbel distribution limit will follow for a large class of extreme value models of the form given in Eq. (4.49) with a differentiable function h with the properties that $h(a) \to \infty$ when $a \to \infty$, and $h'(a) > 0$ for $a > a_l$ for some value $a_l$.
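How close the Gumbel form of Eq. (4.50) is to the distribution in Eq. (4.34) can be inspected numerically. The following sketch evaluates both at a few levels around $a_0$; the choice $\nu_X^+(0)\,T = 10^4$ and the evaluation points are illustrative assumptions.

```python
import math

def exact_cdf(a, n, sigma_x=1.0):
    """F_M(T)(a) from Eq. (4.34), with n = nu_X^+(0) * T."""
    return math.exp(-n * math.exp(-0.5 * (a / sigma_x) ** 2))

def gumbel_parameters(n, sigma_x=1.0):
    """Location a_0 and scale sigma_0 of the Gumbel limit in Eq. (4.50)."""
    root = math.sqrt(2.0 * math.log(n))
    return sigma_x * root, sigma_x / root

def gumbel_cdf(a, n, sigma_x=1.0):
    """Gumbel approximation of F_M(T)(a), Eq. (4.50)."""
    a0, s0 = gumbel_parameters(n, sigma_x)
    return math.exp(-math.exp(-(a - a0) / s0))

n = 1.0e4
a0, s0 = gumbel_parameters(n)
for z in (-1.0, 0.0, 1.0, 2.0):
    a = a0 + z * s0
    print(f"z = {z:+.0f}: exact = {exact_cdf(a, n):.4f}, "
          f"Gumbel = {gumbel_cdf(a, n):.4f}")
```

Both expressions equal $e^{-1}$ at $a = a_0$; away from $a_0$ they differ through the quadratic term $(a - a_0)^2/(2\sigma_X^2)$ that the linearization of $h$ neglects.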

4.7 The Crossing Rate of Transformed Processes

Assume that two stationary and differentiable processes $X(t)$ and $Y(t)$ satisfy the equation

\[
Y(t) = h\left( X(t) \right), \tag{4.51}
\]

where $h(\cdot)$ is a given differentiable function. The following useful result can then be shown: the upcrossing rate $\nu_Y^+(b)$ of $Y(t)$ is determined by the upcrossing rate $\nu_X^+(a)$ of $X(t)$ by the relation (Naess, 1983; Grigoriu, 1984),

\[
\nu_Y^+(b) = \sum_{j=1}^{n} \nu_X^+(a_j), \tag{4.52}
\]

where $a_1, \ldots, a_n$ denote all possible x-solutions of the equation $b = h(x)$.

Let us apply this result on the example $Y(t) = X(t)^2$, where $X(t)$ is a stationary Gaussian process with mean value zero. It can be shown that $Y(t)$ is also stationary. In this particular case, $h(x) = x^2$ so that the equation $b = h(x)$ has the solutions $a_1 = \sqrt{b}$ and $a_2 = -\sqrt{b}$ ($b \ge 0$). According to Eqs. (4.52) and (4.22), it is obtained that

\[
\nu_Y^+(b) = \frac{1}{\pi} \frac{\sigma_{\dot{X}}}{\sigma_X} \exp\left( -\frac{b}{2\sigma_X^2} \right), \quad b \ge 0. \tag{4.53}
\]

The extreme value distribution for the $Y(t)$ process then becomes

\[
F_{M_Y(T)}(b) = \exp\{-\nu_Y^+(b)\, T\} = \exp\left\{ -\nu_Y^+(0)\, T \exp\left( -\frac{b}{2\sigma_X^2} \right) \right\}, \tag{4.54}
\]

where

\[
\nu_Y^+(0) = \frac{1}{\pi} \frac{\sigma_{\dot{X}}}{\sigma_X} \quad (= 2\, \nu_X^+(0)). \tag{4.55}
\]

Note that $F_{M_Y(T)}(b)$ is in fact a Gumbel distribution. Analogously to the Gaussian case, one will find that, for example,

\[
E[M_Y(T)] = 2\sigma_X^2 \left\{ \ln(\nu_Y^+(0)\, T) + \lambda_E \right\}. \tag{4.56}
\]

For the similar case of $Z(t) = X(t)\,|X(t)|$, then $\nu_Z^+(b) = \nu_Y^+(|b|)/2$ because in this case the equation $b = h(x) = x|x|$ has only one solution, viz. $a = \mathrm{sign}(b) \sqrt{|b|}$.
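The relation in Eq. (4.52) translates into a short routine that sums the upcrossing rate of $X(t)$ over the roots of $b = h(x)$. The sketch below treats the two examples from the text; the helper names and the numerical values of $\sigma_X$ and $\sigma_{\dot{X}}$ are hypothetical.

```python
import math

def nu_x(a, sigma_x, sigma_xdot):
    """Upcrossing rate of a zero mean stationary Gaussian process, Eq. (4.22)."""
    return sigma_xdot / (2.0 * math.pi * sigma_x) * math.exp(-0.5 * (a / sigma_x) ** 2)

def nu_transformed(b, roots_of_h, sigma_x, sigma_xdot):
    """Eq. (4.52): sum of nu_X^+ over all solutions a_j of b = h(x)."""
    return sum(nu_x(a, sigma_x, sigma_xdot) for a in roots_of_h(b))

def roots_square(b):
    """Solutions of b = x^2 (none for b < 0)."""
    if b < 0.0:
        return []
    root = math.sqrt(b)
    return [root, -root]

def roots_signed_square(b):
    """Single solution of b = x|x|, namely sign(b) * sqrt(|b|)."""
    return [math.copysign(math.sqrt(abs(b)), b)]

sigma_x, sigma_xdot = 1.5, 0.9   # hypothetical values
b = 2.0
nu_y = nu_transformed(b, roots_square, sigma_x, sigma_xdot)          # Eq. (4.53)
nu_z = nu_transformed(-b, roots_signed_square, sigma_x, sigma_xdot)  # Z = X|X|
print(nu_y, nu_z)
```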

4.8 Hermite Moment Models

When only statistical moments of a response process are available, it has been
proposed to use Hermite moments to capture non-Gaussian behavior and its effect
on extreme response statistics (Winterstein, 1985, 1988). Assuming that a stationary
response process $X(t)$ can be related to a stationary standard Gaussian process $U(t)$ by a strictly increasing function g by $X = g(U)$, an approximation to g is sought in terms of Hermite polynomials. Specifically,

\[
\frac{X - \mu_X}{\sigma_X} = X_0 = g(U) \approx \kappa \left[ U + \sum_{n=3}^{N} g_n\, He_{n-1}(U) \right] = \kappa \left[ U + g_3 (U^2 - 1) + g_4 (U^3 - 3U) + \ldots \right]. \tag{4.57}
\]

The expansion coefficients $g_n$ control the shape of the standardized distribution, while the $\kappa$ parameter is a scaling factor ensuring that $X_0(t)$ has unit variance. For $N = 4$, the $g_n$ can be expressed in terms of the central moments $\alpha_n = E(X_0^n)$, assuming that $\alpha_4 > 3$ (Winterstein, 1988),

\[
g_3 = \frac{\alpha_3}{4 + 2\sqrt{1 + 1.5(\alpha_4 - 3)}}, \tag{4.58}
\]

\[
g_4 = \frac{\sqrt{1 + 1.5(\alpha_4 - 3)} - 1}{18}, \tag{4.59}
\]

and

\[
\kappa = \left( 1 + 2 g_3^2 + 6 g_4^2 \right)^{-1/2}. \tag{4.60}
\]

The condition $\alpha_4 > 3$ corresponds to a "softening" response, signifying that the tails of the distribution are wider than the Gaussian. The opposite case, $\alpha_4 < 3$, is discussed by Winterstein (1988). For some recent work on this method, cf. Zhang
et al. (2019). Having obtained an approximation of the function g in terms of
statistical moments estimated from the recorded or simulated response time series,
the result from the previous section can now be applied to determine the crossing
rate of the response process to estimate extreme value statistics by the point process
method.
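The moment-based coefficients in Eqs. (4.58) to (4.60) and the cubic transformation in Eq. (4.57) with $N = 4$ can be sketched as follows. The skewness and kurtosis values in the example are hypothetical, and treating $\alpha_4 = 3$ as an admissible Gaussian limit is a choice made here for convenience rather than something stated in the text.

```python
import math

def hermite_coefficients(alpha3, alpha4):
    """g_3, g_4 and kappa from Eqs. (4.58)-(4.60).

    Intended for the softening case alpha4 > 3; alpha4 = 3 is accepted here
    as the Gaussian limit (an assumption of this sketch).
    """
    if alpha4 < 3.0:
        raise ValueError("Eqs. (4.58)-(4.60) assume alpha4 >= 3")
    root = math.sqrt(1.0 + 1.5 * (alpha4 - 3.0))
    g3 = alpha3 / (4.0 + 2.0 * root)
    g4 = (root - 1.0) / 18.0
    kappa = 1.0 / math.sqrt(1.0 + 2.0 * g3 ** 2 + 6.0 * g4 ** 2)
    return g3, g4, kappa

def hermite_transform(u, g3, g4, kappa):
    """Standardized response X_0 = g(U), Eq. (4.57) truncated at N = 4."""
    return kappa * (u + g3 * (u * u - 1.0) + g4 * (u ** 3 - 3.0 * u))

# Hypothetical moments of a mildly non-Gaussian response.
g3, g4, kappa = hermite_coefficients(alpha3=0.5, alpha4=4.0)
for u in (0.0, 2.0, 4.0):
    print(f"u = {u:.1f} -> x0 = {hermite_transform(u, g3, g4, kappa):.4f}")
```

Because g is taken as strictly increasing, the equation $x_0 = g(u)$ has a single root, so by Eq. (4.52) the upcrossing rate of $X_0(t)$ at the level $g(u)$ equals that of $U(t)$ at the level u.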
Previously, a common model adopted in ocean engineering was that the ocean
surface elevation could be regarded as a Gaussian random field. However, in
recent years there has been a development toward implementing also non-Gaussian,
second-order wave field models. This topic is briefly discussed in Sect. 11.7 on
extremes of non-Gaussian random fields. With a Gaussian model of the random
wave field, the dynamic response of marine structures to the ocean waves would
then appear to be represented as a transformation of a Gaussian input process.
There will be several examples of such modeling in this book, and it will be clear
that, in general, a simple, marginal transformation between the input and output
processes will not be adequate. However, it is a rather interesting observation that
the upcrossing rate for a wide range of nonlinear dynamical models is largely
determined by the probability density (Naess and Gaidai, 2008). This indicates that
in such cases a marginal transformation may give reasonably good results also for
the upcrossing rate.

4.9 Return Period

Let Z be a random variable, and let

\[
p = \mathrm{Prob}\{Z > z\} = 1 - F_Z(z). \tag{4.61}
\]

Assume that a series of independent observations of Z can be made. The mean number of observations to the first time the observed (measured) value of Z exceeds z is called the return period for exceedance of z, and it is denoted by $\tilde{R}(z)$. It can be shown that

\[
\tilde{R}(z) = \frac{1}{p} = \frac{1}{1 - F_Z(z)}. \tag{4.62}
\]

This equation can be explained by recognizing that, on average, .1/p trials must be
conducted before an event of probability p occurs.
Note that $\tilde{R}(z)$ refers to the number of observations and that these are assumed to be statistically independent. To express the return period in terms of time, knowledge about the time interval between the observations is needed. If the observation interval is $\Delta t$, the return period specified in terms of time, which is denoted by $R(z)$, will be given as

\[
R(z) = \Delta t\, \tilde{R}(z). \tag{4.63}
\]

The observation interval $\Delta t$ must be chosen sufficiently long such that the individual observations become approximately independent. Note that $R(\xi_p(T)) = T/(1 - p)$, where $\xi_p(T)$ is given by Eq. (4.39) and T is the "observation" interval.
A design load with a probability of $10^{-2}$ of being exceeded during one year is often used in connection with the design of offshore structures. If $X(t)$ denotes a relevant load process considered for such a design provision, and $\xi$ denotes the corresponding load level, then $\mathrm{Prob}\{Z > \xi\} = 0.01$, where $Z = \max(X(t); 0 \le t \le 1\ \text{year})$. The return period for exceedance of $\xi$ then becomes

\[
\tilde{R}(\xi) = \frac{1}{\mathrm{Prob}\{Z > \xi\}} = \frac{1}{0.01} = 100. \tag{4.64}
\]

The reference period in this case is one year; therefore, $R(\xi) = 100$ years.
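Equation (4.62) is the mean of a geometric distribution, which can be confirmed by summing the series directly. The sketch below does this for the 100-year example; the function names and the truncation of the series at `n_terms` are illustrative choices.

```python
def return_period_observations(p_exceed):
    """Return period in number of observations, Eq. (4.62)."""
    if not 0.0 < p_exceed <= 1.0:
        raise ValueError("p_exceed must lie in (0, 1]")
    return 1.0 / p_exceed

def return_period_time(p_exceed, observation_interval):
    """Return period in units of time, Eq. (4.63)."""
    return observation_interval * return_period_observations(p_exceed)

def mean_trials_to_first_exceedance(p_exceed, n_terms=20000):
    """Truncated series sum of k * p * (1 - p)^(k - 1) over k = 1..n_terms,
    i.e. the mean number of independent trials to the first exceedance."""
    total = 0.0
    no_exceedance_so_far = 1.0
    for k in range(1, n_terms + 1):
        total += k * p_exceed * no_exceedance_so_far
        no_exceedance_so_far *= 1.0 - p_exceed
    return total

p = 0.01  # annual exceedance probability of the design load
print(return_period_time(p, observation_interval=1.0))  # in years
print(mean_trials_to_first_exceedance(p))
```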
It should be mentioned that time varying loads caused by, for example, ocean
waves cannot generally be considered as stationary over an extended period of time.
This implies that quantities such as yearly maxima must be calculated by using so-
called long-term statistics. This is discussed in the next section.

4.10 Long-Term Extreme Value Distributions

Clearly, the estimation of the extreme loads or load effects on, e.g., a marine
structure subjected to the ocean environment over the design life of the structure
must take into account the changing weather conditions. This is done in a consistent
manner by invoking an appropriate long-term statistical method.
There are basically three different approaches to estimating characteristic long-
term extreme values. These methods are based on (1) all peak values, (2) all
short-term extremes, or (3) the long-term extreme value. A more detailed description
follows, where $X(t)$ denotes a zero mean stochastic process, for example the wave elevation or a corresponding load effect, that reflects the changing environmental conditions. Therefore, $X(t)$ is a non-stationary process. Let T denote the long-term time duration, e.g., 1 year, or a service life of, e.g., 30 years, and let $\tilde{T}$ denote the duration of each short-term weather condition, assuming that $T = K \tilde{T}$, where K is a large integer. The long-term situation is considered to be a sequence of K short-term conditions, where each short-term condition is assumed to be stationary.
Significantly, around the world there are different kinds of weather conditions, also
at sea. A coarse characterization of sea states:
– Extratropical, with slowly varying wave conditions
– Tropical, with rare hurricanes that represent very rapidly changing weather
conditions
In this section, the discussions and derivations are limited to extratropical condi-
tions.
Let W denote the vector of parameters that describes the short-term environmental condition. W can be considered as a random vector variable. For simplicity, let us assume that $W = (H_s, T_s)$, where $H_s$ is the significant wave height and $T_s$ a suitable spectral period (generic notation). For example, $T_s$ may represent the spectral peak period $T_p$ or the mean zero-crossing period $T_z$. In principle, the analysis is entirely
similar if W contains more parameters, e.g., dominant wave direction, wind speed,
etc.

4.10.1 All Peak Values

A peak value of $X(t)$, denoted generically by $X_p$, is defined here as the maximum value of $X(t)$ between two consecutive zero-upcrossings. For each short-term condition, let $F_{X_p|H_sT_s}(\xi|h_s, t_s)$ denote the conditional distribution of the peak value. Battjes (1970) showed that the long-term distribution $F_{X_p}(\xi)$ of the peak value $X_p$ is given as follows:

\[
F_{X_p}(\xi) = \frac{1}{\overline{\nu_X^+(0)}} \int_{h_s} \int_{t_s} \nu_X^+(0|h_s, t_s)\, F_{X_p|H_sT_s}(\xi|h_s, t_s)\, f_{H_sT_s}(h_s, t_s)\, dh_s\, dt_s, \tag{4.65}
\]

where $\overline{\nu_X^+(0)}$ denotes the long-term average zero-upcrossing rate given by

\[
\overline{\nu_X^+(0)} = \int_{h_s} \int_{t_s} \nu_X^+(0|h_s, t_s)\, f_{H_sT_s}(h_s, t_s)\, dh_s\, dt_s. \tag{4.66}
\]

Here, $\nu_X^+(0|h_s, t_s)$ denotes the average zero-upcrossing rate for the short-term stationary condition characterized by $H_s = h_s$ and $T_s = t_s$.
In practical applications, a commonly adopted statistical distribution for the peak values in a short-term condition is the Rayleigh distribution, that is,

\[
F_{X_p|H_sT_s}(\xi|h_s, t_s) = 1 - \exp\left\{ -\frac{\xi^2}{2\, \sigma_X(h_s, t_s)^2} \right\}. \tag{4.67}
\]

In the modeling of ocean waves, it is sometimes appropriate to use a more

accurate distribution of the peak values, or wave crest heights. A distribution that
is frequently used, is one proposed by Forristall (2000).
Under the assumption that all peak values can be considered as statistically independent, which may not always be very accurate, the peak value $\xi_q$ with a probability q of being exceeded per year is found by solving the following equation:

\[
F_{X_p}(\xi_q) = 1 - \frac{q}{S^{(1y)} \cdot \overline{\nu_X^+(0)}}, \tag{4.68}
\]

where $S^{(1y)} = 365 \cdot 24 \cdot 3600$ denotes the number of seconds in a year. The short-term duration $\tilde{T}$ does not enter into this analysis. In the long run, the relative frequency of the various sea states is reflected in the joint density $f_{H_sT_s}(h_s, t_s)$, which can be approximated by using an appropriate scatter diagram, if that is available. An example of such a scatter diagram is shown in Table 4.2.
Let the scatter diagram be divided into m intervals for the $h_s$-values and n intervals for the $t_s$-values. It may often be an acceptable approximation to assume that $\nu_X^+(0|h_s, t_s) = T_z^{-1} \approx c\, T_s^{-1}$ for a fixed constant c, cf. Eq. (4.24). Equation (4.65) with the Rayleigh approximation for $F_{X_p|H_sT_s}(\xi|h_s, t_s)$ can then be approximately expressed in the following form:

\[
F_{X_p}(\xi) \approx \frac{1}{\overline{t_s^{-1}}} \sum_{i=1}^{m} \sum_{j=1}^{n} \left\{ 1 - \exp\left( -\frac{\xi^2}{2\, \sigma_X(h_i, t_j)^2} \right) \right\} \frac{K_{ij}}{t_j\, K}, \tag{4.69}
\]

where

\[
\overline{t_s^{-1}} = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{K_{ij}}{t_j\, K}. \tag{4.70}
\]
Table 4.2 Scatter diagram northern North Sea, 1973–2001. Values given for $h_s$ and $t_p$ are upper-class limits. Rows: $h_s$ (m); columns: $t_p$ (s)

h_s (m) \ t_p (s) 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 >20

0.5 18 15 123 113 110 390 260 91 38 42 32 3 19 13 9 1 3 2 7

1.0 16 49 675 433 589 1442 1802 959 273 344 125 33 64 29 13 1 7 1 6
1.5 5 32 417 893 1107 1486 2757 1786 636 731 299 121 92 43 18 10 5 2 13
2.0 1 0 102 741 1290 1496 2575 1968 780 868 492 200 116 51 31 8 4 4 8
2.5 0 0 9 256 969 1303 2045 1892 803 941 484 181 157 58 23 19 5 1 8
3.0 0 0 1 45 438 1029 1702 1898 705 957 560 218 196 92 40 11 4 2 5
3.5 0 0 1 4 124 650 1169 1701 647 865 456 237 162 100 36 12 6 1 5
4.0 0 0 2 0 33 270 780 1369 573 868 427 193 157 91 51 13 3 0 1
4.5 0 0 0 0 3 90 459 1017 466 761 380 127 137 86 31 23 6 5 0
5.0 0 0 0 0 0 15 228 647 408 737 354 119 96 50 32 18 2 4 1
5.5 0 0 0 0 0 2 68 337 363 580 283 94 92 31 24 10 6 2 0
6.0 0 0 0 0 0 1 20 166 221 418 307 63 76 24 13 9 4 0 0
6.5 0 0 0 0 0 0 5 50 140 260 257 59 49 20 12 4 2 2 2
7.0 0 0 0 0 0 0 0 23 90 180 193 41 53 20 5 3 3 0 0
7.5 0 0 0 0 0 0 0 6 25 93 121 45 46 17 5 5 0 1 0
8.0 0 0 0 0 0 0 0 3 14 50 84 26 47 11 6 0 1 0 0
8.5 0 0 0 0 0 0 0 0 7 25 45 23 25 20 8 0 0 0 0
9.0 0 0 0 0 0 0 0 1 2 12 30 22 20 19 0 0 0 0 0
9.5 0 0 0 0 0 0 0 0 1 2 20 21 14 7 1 1 0 1 0
10.0 0 0 0 0 0 0 0 0 0 2 5 4 21 6 2 0 0 0 0
10.5 0 0 0 0 0 0 0 0 0 3 4 8 9 12 2 0 0 0 0
11.0 0 0 0 0 0 0 0 0 0 0 2 0 4 3 1 0 1 0 0
11.5 0 0 0 0 0 0 0 0 0 0 2 1 2 3 0 0 0 0 0
12.0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0
12.5 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
13.0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0

Here, $K_{ij}$ equals the number of observations in condition $(i, j)$, that is, in the $h_s$-interval $(h_i - \Delta h/2, h_i + \Delta h/2)$ and the $t_s$-interval $(t_j - \Delta t/2, t_j + \Delta t/2)$; $i = 1, \ldots, m$, $j = 1, \ldots, n$. $K = \sum_{i=1}^{m} \sum_{j=1}^{n} K_{ij}$ is the total number of observations, or sea states. Also note that the values for $h_s$ and $t_s$ included on the scatter diagram in Table 4.2 are upper-class limits, that is, $h_i + \Delta h/2$ and $t_j + \Delta t/2$.
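The discretized long-term peak distribution of Eqs. (4.69) and (4.70) can be sketched as a weighted sum over scatter diagram cells. In the example below the three cells, their counts, and the response model `sigma_x = h_s / 4` are hypothetical placeholders, not values taken from Table 4.2 or from the text.

```python
import math

def long_term_peak_exceedance(xi, cells, sigma_x):
    """1 - F_Xp(xi) from Eq. (4.69) with Rayleigh distributed short-term peaks.

    cells   : iterable of (h_i, t_j, K_ij) tuples from a scatter diagram
    sigma_x : function (h, t) -> short-term standard deviation of X(t)
    """
    cells = list(cells)
    k_total = sum(k for _, _, k in cells)
    mean_inv_ts = sum(k / (t * k_total) for _, t, k in cells)  # Eq. (4.70)
    weighted = 0.0
    for h, t, k in cells:
        rayleigh_exceedance = math.exp(-xi ** 2 / (2.0 * sigma_x(h, t) ** 2))
        weighted += rayleigh_exceedance * k / (t * k_total)
    return weighted / mean_inv_ts

def long_term_peak_cdf(xi, cells, sigma_x):
    """F_Xp(xi), Eq. (4.69)."""
    return 1.0 - long_term_peak_exceedance(xi, cells, sigma_x)

# Hypothetical scatter diagram cells (h_s in m, t_s in s, number of sea states).
example_cells = [(2.0, 8.0, 600), (4.0, 10.0, 300), (8.0, 12.0, 100)]

def example_sigma(h, t):
    # Illustrative assumption: X(t) is the wave elevation, sigma_X = H_s / 4.
    return h / 4.0

for xi in (1.0, 3.0, 6.0):
    print(xi, long_term_peak_exceedance(xi, example_cells, example_sigma))
```

Working with the exceedance probability rather than with $1 - F$ avoids loss of precision when Eq. (4.68) is solved for very small right-hand sides.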

4.10.2 All Short-Term Extremes

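Let $\tilde{X}$ denote the largest peak value of $X(t)$ during a short-term condition of duration $\tilde{T}$. For a given short-term condition, the conditional distribution of $\tilde{X}$ is

\[
F_{\tilde{X}|H_sT_s}(\xi|h_s, t_s) = \left[ F_{X_p|H_sT_s}(\xi|h_s, t_s) \right]^{k^{(st)}}, \tag{4.71}
\]

where $k^{(st)} = \nu_X^+(0|h_s, t_s)\, \tilde{T}$ is the number of peak values during the short-term condition specified by $H_s = h_s$ and $T_s = t_s$. The validity of Eq. (4.71) is again based on the assumption that all peak values are independent.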
The long-term distribution of the short-term extreme peak values is often approximated by the expression

\[
F_{\tilde{X}}(\xi) = \int_{h_s} \int_{t_s} F_{\tilde{X}|H_sT_s}(\xi|h_s, t_s)\, f_{H_sT_s}(h_s, t_s)\, dh_s\, dt_s. \tag{4.72}
\]

Although the error is usually not significant, the averaging done in Eq. (4.72) is not quite correct in the sense that it is not a so-called ergodic average (Naess, 1984), which would be the correct approach. To achieve this, Eq. (4.72) has to be modified to read (Krogstad, 1985),

\[
F_{\tilde{X}}(\xi) = \exp\left\{ \int_{h_s} \int_{t_s} \ln\left[ F_{\tilde{X}|H_sT_s}(\xi|h_s, t_s) \right] f_{H_sT_s}(h_s, t_s)\, dh_s\, dt_s \right\}. \tag{4.73}
\]

The root of this problem of averaging resides in the very notion of long-term
statistics, and how it has to be interpreted. By its very definition, it is a notion built
on evolution in time. And all information that is extracted about its properties is
obtained by taking time averages along the observed time histories. That is precisely
what is wrong with Eq. (4.72), because it applies a simple ensemble average. On
the other hand, Eq. (4.73), which is an ergodic average, results from taking the
appropriate time averages.
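The difference between the two averages can be illustrated numerically. The following sketch is only an illustration under assumed inputs: the three sea states, their weights, the response standard deviations, and the factor `C = 1.3` are hypothetical, and the short-term extreme value distribution is taken as a Rayleigh peak distribution raised to the number of peaks in 3 hours. Under these assumptions, Eq. (4.72) becomes a weighted arithmetic mean of the conditional distributions, while Eq. (4.73) becomes a weighted geometric mean.

```python
import math

# Hypothetical sea states: (probability weight, sigma_X in m, t_s in s).
SEA_STATES = [(0.90, 1.0, 8.0), (0.09, 2.0, 10.0), (0.01, 3.0, 12.0)]
C = 1.3                # assumed factor in 1/T_z ~ c/T_s
T_SHORT = 3 * 3600.0   # short-term duration of 3 h, in seconds


def log_short_term_cdf(xi, sigma, t_s):
    """ln of the 3-hour extreme value CDF for independent Rayleigh peaks (xi > 0)."""
    n_peaks = T_SHORT * C / t_s
    return n_peaks * math.log1p(-math.exp(-xi ** 2 / (2.0 * sigma ** 2)))


def ensemble_average(xi):
    """Discrete analogue of Eq. (4.72): weighted arithmetic mean of the CDFs."""
    return sum(w * math.exp(log_short_term_cdf(xi, s, t)) for w, s, t in SEA_STATES)


def ergodic_average(xi):
    """Discrete analogue of Eq. (4.73): exponential of the weighted mean of ln F."""
    return math.exp(sum(w * log_short_term_cdf(xi, s, t) for w, s, t in SEA_STATES))


for xi in (8.0, 10.0, 12.0):
    print(xi, ensemble_average(xi), ergodic_average(xi))
```

Because a weighted geometric mean never exceeds the corresponding arithmetic mean, the ergodic average gives the smaller distribution value, that is, the larger exceedance probability, at every level.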
Assuming for illustration that $\tilde{T} = 3$ h, an estimate of the value $\xi_q$, which has a probability $q$ of being exceeded per year, is in this case determined by the equation

$$
F_{\tilde{X}}(\xi_q) = 1 - \frac{q}{365 \cdot 8} . \tag{4.74}
$$

If Eq. (4.72) is used, a relation analogous to Eq. (4.69) would be (with $\tilde{T} = 3$ h and $T_z^{-1} \approx c\, T_s^{-1}$)

$$
F_{\tilde{X}}(\xi) \approx \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ 1 - \exp\left( -\frac{\xi^2}{2\, \sigma_X(h_i, t_j)^2} \right) \right]^{\frac{60^2 \times 3 \times c}{t_j}} \frac{K_{ij}}{K} . \tag{4.75}
$$

4.10.3 The Long-Term Extreme Value

The distribution of the extreme value $\hat{X} = \hat{X}(T)$, that is, the global extreme value over a long-term period $T$, can be expressed as follows (Naess, 1984),

$$
F_{\hat{X}}(\xi) = \exp\left\{ -T \int_{h_s} \int_{t_s} \nu_X^+(\xi \mid h_s, t_s)\, f_{H_s T_s}(h_s, t_s)\, dh_s\, dt_s \right\} , \tag{4.76}
$$

where $\nu_X^+(\xi \mid h_s, t_s)$ denotes the average $\xi$-upcrossing rate for the short-term stationary situation characterized by $H_s = h_s$ and $T_s = t_s$.
From Eqs. (4.34) and (4.35), it follows that for the case of a zero mean Gaussian process, Eq. (4.76) would read,

$$
F_{\hat{X}}(\xi) = \exp\left\{ -T \int_{h_s} \int_{t_s} \frac{\sigma_{\dot{X}}(h_s, t_s)}{2 \pi\, \sigma_X(h_s, t_s)} \exp\left( -\frac{\xi^2}{2\, \sigma_X(h_s, t_s)^2} \right) f_{H_s T_s}(h_s, t_s)\, dh_s\, dt_s \right\} , \tag{4.77}
$$

where the standard deviations $\sigma_X$ and $\sigma_{\dot{X}}$ in the long-term situation become functions of the environmental parameters $h_s$ and $t_s$, as indicated.
With $T = 1$ year $= S^{(1y)}$ seconds, the value $\xi_q$, which has a probability $q$ of being exceeded per year, is now calculated from the equation,

$$
F_{\hat{X}}(\xi_q) = 1 - q . \tag{4.78}
$$

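With reference to Table 4.2, Eq. (4.77) can then be expressed as a relation analogous to Eq. (4.69) in the following way (with $T = 1$ year and $T_z^{-1} \approx c\, T_s^{-1}$):

$$
F_{\hat{X}}(\xi) \approx \exp\left\{ - \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{S^{(1y)} \times c}{t_j} \exp\left( -\frac{\xi^2}{2\, \sigma_X(h_i, t_j)^2} \right) \frac{K_{ij}}{K} \right\} . \tag{4.79}
$$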

For the purpose of estimating extreme load effects, the use of scatter diagrams
calls for a certain amount of caution. If the scatter diagram is too coarse, leading
to poor resolution in the tail regions, the long-term extreme value estimates may
become inaccurate. In such cases, it is recommended to use a properly adapted
smooth joint density of the parameters characterizing the short-term sea states.
For our purposes, the joint density of $W = (H_s, T_s)$ is needed. For North Sea
applications, the spectral period $T_s$ is often the spectral peak period $T_p$ due to the
fact that a commonly adopted spectral model is the JONSWAP spectrum, which is
usually parameterized by the significant wave height and the spectral peak period.
The marginal distribution of $H_s$ is now often modeled as one of the following two
alternatives:
– A three-parameter Weibull distribution
– A combination of a lognormal and a Weibull distribution
The following probabilistic model given by Haver (1980) and Haver and Nyhus
(1986) has been frequently adopted as the latter model. Expressed in terms of
probability densities, it assumes the form,

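$$
f_{H_s}(h_s) = \frac{1}{\sqrt{2\pi}\, \alpha\, h_s} \exp\left\{ -\frac{(\ln h_s - \theta)^2}{2\, \alpha^2} \right\} , \quad h_s \le \eta , \tag{4.80}
$$

and

$$
f_{H_s}(h_s) = \frac{\beta}{\rho} \left( \frac{h_s}{\rho} \right)^{\beta - 1} \exp\left\{ -\left( \frac{h_s}{\rho} \right)^{\beta} \right\} , \quad h_s > \eta , \tag{4.81}
$$

where the value of the transition parameter $\eta$, separating the lognormal model for the smaller values of $H_s$ from the Weibull model for the larger values, will depend on the geographic location. A requirement is that $\lim_{h_s \uparrow \eta} f_{H_s}(h_s) = \lim_{h_s \downarrow \eta} f_{H_s}(h_s)$, that is, $f_{H_s}(h_s)$ is continuous at $\eta$.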
This marginal density for the significant wave height is complemented by the conditional density of the spectral peak period $T_p$ given the value of $H_s$ using a lognormal model:

$$
f_{T_p \mid H_s}(t_p \mid h_s) = \frac{1}{\sqrt{2\pi}\, \sigma\, t_p} \exp\left\{ -\frac{(\ln t_p - \mu)^2}{2\, \sigma^2} \right\} , \tag{4.82}
$$

where the parameters $\mu$ and $\sigma$ are assumed to depend on the significant wave height $h_s$ in the following manner:

$$
\mu = a_1 + a_2\, h_s^{a_3} , \tag{4.83}
$$

$$
\sigma^2 = b_1 + b_2 \exp(-b_3 h_s) , \tag{4.84}
$$

for suitably chosen constants $a_i$ and $b_i$, $i = 1, 2, 3$.

The joint density for the environmental parameters is then obtained by multiplying the marginal density for the significant wave height with the conditional density for the spectral peak period, that is,

$$
f_W(w) = f_{H_s T_p}(h_s, t_p) = f_{H_s}(h_s)\, f_{T_p \mid H_s}(t_p \mid h_s) . \tag{4.85}
$$

The following set of parameter values was cited by Haver (2002) for locations in the northern North Sea (Statfjord area): $\alpha = 0.6565$, $\theta = 0.77$, $\eta = 2.90$, $\beta = 1.503$, $\rho = 2.691$, $a_1 = 1.134$, $a_2 = 0.892$, $a_3 = 0.225$, $b_1 = 0.005$, $b_2 = 0.120$, $b_3 = 0.455$.
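The joint model of Eqs. (4.80)–(4.85) is straightforward to code. The sketch below is one possible implementation, not a reference one. The function names are chosen here for illustration, and the Weibull shape and scale are taken as 1.503 and 2.691, respectively, which is the assignment for which the lognormal and Weibull branches meet continuously at $\eta$, as required of the model.

```python
import math

# Parameter values for the northern North Sea (Statfjord area).
ALPHA, THETA, ETA = 0.6565, 0.77, 2.90
BETA, RHO = 1.503, 2.691          # Weibull shape and scale (see the note above)
A1, A2, A3 = 1.134, 0.892, 0.225
B1, B2, B3 = 0.005, 0.120, 0.455


def f_hs(hs):
    """Marginal density of H_s, Eqs. (4.80) and (4.81)."""
    if hs <= 0.0:
        return 0.0
    if hs <= ETA:
        z = (math.log(hs) - THETA) / ALPHA
        return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * ALPHA * hs)
    return (BETA / RHO) * (hs / RHO) ** (BETA - 1.0) * math.exp(-((hs / RHO) ** BETA))


def f_tp_given_hs(tp, hs):
    """Conditional lognormal density of T_p given H_s, Eqs. (4.82)-(4.84)."""
    if tp <= 0.0:
        return 0.0
    mu = A1 + A2 * hs ** A3
    sigma = math.sqrt(B1 + B2 * math.exp(-B3 * hs))
    z = (math.log(tp) - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma * tp)


def f_joint(hs, tp):
    """Joint density of (H_s, T_p), Eq. (4.85)."""
    return f_hs(hs) * f_tp_given_hs(tp, hs)


# The two branches of the marginal density should agree at the transition point.
print(f_hs(ETA), f_hs(ETA + 1e-9))
print(f_joint(5.0, 10.0))
```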

4.10.4 Simplified Methods

In this chapter three alternative long-term statistical approaches to estimate the
extreme wave-induced response for ultimate limit state (ULS) design checks at a given annual probability
of exceedance (or return period) have been outlined. It is emphasized that a long-
term approach is the most accurate approach, if it can be achieved at all. If computer
models can be used, a long-term analysis is possible with the computational power
that is accessible today. However, such analyses may still become a challenge in
terms of required computer time. Therefore, simplified design methods are still very
popular, but such methods would always need to be validated against the full long-
term approach.
The applicability of simplified approaches depends on the character of the
response, especially whether it can be considered quasistatic or dynamic, which
response values are relevant and also which accuracy is required; that is, whether
the analysis is carried out in pre-engineering or the detailed design phase. In this
connection, the fact that nonlinear hydrodynamic effects might be present and cause
sum- or difference-frequency excitation, respectively, should be considered.
A bottom fixed structure with a natural period below 3 seconds, say, could be
considered to have a quasistatic behavior under steady wave loading and a design
approach with appropriately chosen wave height and period would be relevant. For
structures with natural periods above 3 seconds, the quasistatic approach might
still be used in combination with the use of a dynamic amplification factor (DAF)
determined by a stochastic analysis for relevant sea states, if the DAF is limited, say,
to less than 1.5. The simplified methods would be relevant for early design phases
while a stochastic dynamic approach should be used in the detailed design phase.

Method of Equivalent Storms

Based on earlier work by Jahns and Wheeler (1972) and Haring and Heideman
(1978), Tromans and Vanderschuren (1995) proposed an alternative approach to the
calculation of the long-term extreme load or load effect. In their approach, the focus
is on storm events, similar to what is done in a peaks-over-threshold analysis. This
approach is particularly relevant for tropical areas with rare hurricanes. Hence, the
long-term situation is considered as a sequence of storm events. The method is based
on the assumption that the distribution of the storm extreme response value can
be approximated by a Gumbel extreme value distribution conditional on the most
probable extreme response for that storm. The distribution of the most probable
extreme value itself is assumed to follow a generalized Pareto distribution, which is
determined by fitting to data. By invoking the rule of total probability, as exemplified
by Eq. (4.72), the long-term extreme response value distribution can be calculated.

Contour Line Method

In recent years, the environmental contour line approach (Winterstein et al., 1993;
Haver and Winterstein, 2008) has been advocated as a rational basis for choosing
the appropriate short-term design storms leading to load and response extremes
corresponding to a prescribed return period, e.g., 100 years, or equivalently, a
prescribed annual probability of exceedance, which otherwise has to be obtained
from a long-term analysis.
Environmental contour line plots are convenient tools for complicated structural
dynamic systems where a full long-term response analysis is extremely time
consuming. Environmental contour lines make it possible to obtain reasonable long-
term extremes by concentrating the short-term considerations to a few sea states in
the scatter diagram.
The contour line approach can be applied for an offshore site if the joint
probability density for the significant wave height and the spectral peak period is
available in the form of a joint model as described by Eqs. (4.80)–(4.82). This joint
model must be fitted to the available data given in the form of a scatter diagram
such as the one in Table 4.2. As demonstrated by Moan et al. (2005), prediction
of extreme values is very sensitive to the amount of environmental data available
to represent the long-term variability of the sea states. The fitting of appropriate
analytical densities ensures a smoothing and can facilitate a reasonably accurate
representation of the long-term extreme response.
Contour lines corresponding to a constant annual exceedance probability can be obtained by transforming the joint model to a space consisting of independent, standard Gaussian variables and then using the inverse first-order reliability method (IFORM) (see, e.g., Winterstein et al. 1993). In the standard Gaussian space, the contour lines corresponding to an annual exceedance probability of $q$ are circles with radius $r = \Phi^{-1}(1 - q/2920)$, where $\Phi$ denotes the distribution function of a standard Gaussian variable, and $2920 = 365 \times 8$ is the number of 3-hour sea states per year. Transforming these circles back to the physical parameter space provides the $q$-probability contour lines. Approximate contour lines can be obtained by determining the probability density for the point defined by the marginal $q$-probability significant wave height and the conditional median spectral peak period, and then estimating the $q$-probability contour line by the line of constant probability density. Contour lines based on the joint model discussed in Sect. 4.10.3 are plotted in Fig. 4.6, cf. Haver (2002). Even for the most complicated systems, simple methods may often be used to identify the most critical range of the $q$-probability contour line regarding a prediction of the $q$-probability response extreme.

Fig. 4.6 Environmental contour line plot for the wave conditions in the Statfjord area (horizontal axis: significant wave height in m; vertical axis: spectral peak period in s; contour lines for $q = 0.63$, $q = 10^{-2}$, and $q = 10^{-4}$)
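The IFORM construction can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a definitive implementation: the function names are invented here, the Weibull shape and scale are taken as 1.503 and 2.691 (the assignment that makes the marginal density continuous at $\eta$), and the marginal distribution function of $H_s$ is assumed to be the lognormal distribution function up to $\eta$ and the Weibull distribution function $1 - \exp\{-(h_s/\rho)^\beta\}$ above it, which are very nearly equal at $\eta$ for these parameter values.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

ALPHA, THETA, ETA = 0.6565, 0.77, 2.90
BETA, RHO = 1.503, 2.691          # Weibull shape and scale (assumed assignment)
A1, A2, A3 = 1.134, 0.892, 0.225
B1, B2, B3 = 0.005, 0.120, 0.455


def hs_from_probability(p):
    """Inverse of the assumed marginal distribution function of H_s."""
    p_eta = STD_NORMAL.cdf((math.log(ETA) - THETA) / ALPHA)
    if p <= p_eta:
        return math.exp(THETA + ALPHA * STD_NORMAL.inv_cdf(p))
    return RHO * (-math.log1p(-p)) ** (1.0 / BETA)


def tp_from_u(u2, hs):
    """Map a standard Gaussian value to T_p given H_s (lognormal model)."""
    mu = A1 + A2 * hs ** A3
    sigma = math.sqrt(B1 + B2 * math.exp(-B3 * hs))
    return math.exp(mu + sigma * u2)


def contour(q, n_points=72):
    """Return the radius r and a list of (h_s, t_p) points on the q-contour."""
    r = STD_NORMAL.inv_cdf(1.0 - q / 2920.0)
    points = []
    for i in range(n_points):
        angle = 2.0 * math.pi * i / n_points
        u1, u2 = r * math.cos(angle), r * math.sin(angle)
        hs = hs_from_probability(STD_NORMAL.cdf(u1))
        points.append((hs, tp_from_u(u2, hs)))
    return r, points


radius, pts = contour(1e-2)
print(radius, max(h for h, _ in pts))
```

The largest significant wave height on the contour is reached at `u1 = r`, `u2 = 0`, that is, at the marginal $q$-probability value of $H_s$ combined with the conditional median of $T_p$.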
The advantage of this method is that analyses of only a few sea states are required. Once the most unfavorable sea state along the $q$-probability contour line has been identified, a proper estimate for the $q$-probability response is taken as the $p$-fractile of the distribution of the 3-hour extreme response value. It is important to note that the median 3-hour extreme value for this sea state, i.e., $p = 0.50$, will not represent a proper estimate for the $q$-probability extreme value because this characteristic value will not account for the inherent randomness of the 3-hour extreme value. The fractile level, $p$, will depend on the target exceedance probability, $q$, and the degree of nonlinearity of the system. For most practical systems, $p = 0.90$ seems reasonable for $q = 10^{-2}$, while $p = 0.95$ may be more adequate for $q = 10^{-4}$. As an alternative to using a $p$-fractile above 0.5, the desired load effect may be obtained by multiplying the median or expected maximum value with a factor of 1.2–1.3. In any case, this simplified contour line method ideally needs to be validated by a full long-term analysis for the relevant type of environmental conditions and load effects.

Appendix

In this appendix the validity of Eq. (4.30) is demonstrated. The derivation of the formula for the mean rate of upcrossings of a given level, as it is expressed in Eq. (4.9), was based on assumptions about the probability of the number of upcrossings during short time intervals. Using similar notation as in Sect. 4.2, that is, $p_n = p_n(\Delta t) = \mathrm{Prob}\{N^+(a, \Delta t) = n\}$, it was assumed in particular that $p_n/p_1 \to 0$ when $\Delta t \to 0$ for $n \ge 2$. In other words, the probability of occurrence of two or more upcrossings during a short time interval can be neglected compared with the probability of one upcrossing during the same time interval. To simplify the notation somewhat, $\lambda = \nu_X^+(a)$ is used. According to Eq. (4.9), it can therefore be assumed that for sufficiently small $\Delta t$,

$$
p_1(\Delta t) = \lambda \Delta t \tag{4.86}
$$

and

$$
p_n(\Delta t) = 0 , \quad n \ge 2 . \tag{4.87}
$$

Because $\sum_{n=0}^{\infty} p_n = 1$, it follows that,

$$
p_0(\Delta t) = 1 - \lambda \Delta t . \tag{4.88}
$$

To proceed, it is now required to use the assumption that the upcrossings of the level $a$ are independent events. This implies that the number of upcrossings in a given time interval is statistically independent of the number of upcrossings in another, nonoverlapping time interval. This may be used to derive the following equation:

$$
\begin{aligned}
p_0(t + \Delta t) &= \mathrm{Prob}\{\text{No upcrossings in } (0, t + \Delta t)\} \\
&= \mathrm{Prob}\{[\text{No upcrossings in } (0, t)] \text{ and } [\text{No upcrossings in } (t, t + \Delta t)]\} \\
&= p_0(t)\, p_0(\Delta t) .
\end{aligned} \tag{4.89}
$$

Equation (4.89) together with Eq. (4.88) gives,

$$
\frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = -\lambda\, p_0(t) . \tag{4.90}
$$

Strictly speaking, this equation is only approximately correct. However, on the basis of the assumptions made, it is realized that the approximation becomes better the smaller $\Delta t$ becomes. This leads to the differential equation,

$$
\frac{dp_0(t)}{dt} = -\lambda\, p_0(t) , \tag{4.91}
$$

which has the solution $p_0(t) = C \exp(-\lambda t)$, where $C$ is a constant. Clearly $p_0(0) = 1$. This gives $C = 1$. The solution is therefore,

$$
p_0(t) = e^{-\lambda t} , \tag{4.92}
$$

which corresponds to Eq. (4.30).

While in the process, the expression for $p_n(t)$ will also be derived. It is realized that $n \ge 1$ upcrossings in the interval $(0, t + \Delta t)$ can occur as follows: {$n$ upcrossings in $(0, t)$ and 0 upcrossings in $(t, t + \Delta t)$} or {$n - 1$ upcrossings in $(0, t)$ and 1 upcrossing in $(t, t + \Delta t)$}, etc. This can be expressed by the equation,

$$
p_n(t + \Delta t) = \sum_{i=0}^{n} p_{n-i}(t)\, p_i(\Delta t) = p_n(t)\, p_0(\Delta t) + p_{n-1}(t)\, p_1(\Delta t) . \tag{4.93}
$$

The last equality follows from Eq. (4.87).

Similarly to the preceding derivation, this leads to the differential equations,

$$
\frac{dp_n(t)}{dt} = -\lambda\, p_n(t) + \lambda\, p_{n-1}(t) , \quad n = 1, 2, \ldots \tag{4.94}
$$

with the initial conditions $p_n(0) = 0$ ($n \ge 1$). These equations can be solved in several ways. One way is to introduce the auxiliary functions $u_n(t)$, $n = 0, 1, \ldots$ defined by $p_n(t) = e^{-\lambda t} u_n(t)$. When this is substituted into Eq. (4.94), it leads to the equations,

$$
\frac{du_n(t)}{dt} = \lambda\, u_{n-1}(t) , \quad n = 1, 2, \ldots , \tag{4.95}
$$

with initial conditions $u_n(0) = 0$, $n = 1, 2, \ldots$. In particular,

$$
\frac{du_1(t)}{dt} = \lambda\, u_0(t) = \lambda , \tag{4.96}
$$

which gives $u_1(t) = \lambda t$, because $u_1(0) = 0$. Successive solution of Eq. (4.95) gives,

$$
u_n(t) = \frac{(\lambda t)^n}{n!} , \tag{4.97}
$$

and thereby,

$$
p_n(t) = \frac{(\lambda t)^n}{n!}\, e^{-\lambda t} , \tag{4.98}
$$

which holds for $n = 0, 1, 2, \ldots$ ($0! = 1$).

The Poisson process is a frequently used model for phenomena characterized
by events that occur approximately independently of each other. Subject to certain
conditions, the Poisson process can be used to model the stream of telephone calls
through a telephone exchange or the stream of cars passing through a road crossing.
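The derivation in this appendix can be checked numerically. The sketch below steps the recursion of Eq. (4.93), with the small-interval probabilities of Eqs. (4.86)–(4.88), forward in time and compares the result with the closed form of Eq. (4.98). The values of $\lambda$, the end time, and the step size are arbitrary choices made for the illustration.

```python
import math


def poisson_by_recursion(lam, t_end, n_max, dt=1e-4):
    """Step p_n(t + dt) = p_n(t)(1 - lam*dt) + p_{n-1}(t)*lam*dt from t = 0."""
    steps = int(round(t_end / dt))
    p = [1.0] + [0.0] * n_max          # p_0(0) = 1, p_n(0) = 0 for n >= 1
    for _ in range(steps):
        new_p = [p[0] * (1.0 - lam * dt)]
        for n in range(1, n_max + 1):
            new_p.append(p[n] * (1.0 - lam * dt) + p[n - 1] * lam * dt)
        p = new_p
    return p


def poisson_pmf(lam, t, n):
    """Closed-form solution, Eq. (4.98)."""
    return (lam * t) ** n / math.factorial(n) * math.exp(-lam * t)


numerical = poisson_by_recursion(lam=1.5, t_end=2.0, n_max=8)
for n, value in enumerate(numerical):
    print(n, value, poisson_pmf(1.5, 2.0, n))
```

The agreement improves as the step size is reduced, in line with the remark that Eq. (4.90) becomes more accurate the smaller $\Delta t$ becomes.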
Chapter 5
The ACER Method

5.1 Introduction

Extreme value statistics, even in applications, is generally based on asymptotic
results. This is done either by assuming that the epochal extremes, for example,
yearly extreme wind speeds at a given location, are distributed according to
the so-called generalized (asymptotic) extreme value distribution with unknown
parameters to be estimated on the basis of the observed data, cf. Chap. 2 or Coles
(2001) and Beirlant et al. (2004), or by assuming that the exceedances above high
thresholds follow a generalized (asymptotic) Pareto distribution with parameters
that are estimated from the data, cf. Chap. 3 or Coles (2001), Beirlant et al. (2004),
Davison and Smith (1990), and Reiss and Thomas (2007). As was discussed in
Chap. 1, the major problem with both of these approaches is that the asymptotic
extreme value theory itself cannot be used in practice to decide to what extent it
is applicable for the observed data. And since statistical tests to decide this issue
are rarely precise enough to settle this problem, the assumption that a specific
asymptotic extreme value distribution is the appropriate distribution for the observed
data is based more or less on faith or convenience.
On the other hand, one can reasonably assume that in many cases long time
series obtained from practical measurements do contain values that are large enough
to provide useful information about extreme events that are truly asymptotic. This
cannot be strictly proved in general, of course, but the accumulated experience
indicates that asymptotic extreme value distributions do provide reasonable, if not
always very accurate, predictions when based on measured data. This is amply
documented in the vast literature on the subject, and good references to this literature
are Beirlant et al. (2004), Embrechts et al. (1997), and Falk et al. (2004).
However, even if the situation might be tolerable, it is clearly not satisfactory.
In an effort to improve on the described state of affairs, an approach to the extreme
value prediction problem has been developed that is less restrictive and more flexible
than the ones using only asymptotic theory (Naess and Gaidai 2009; Naess et al.
2013). The approach is based on two separate components that are designed to
improve on two important aspects of extreme value prediction based on observed
data.
⇒ The first component provides a nonparametric empirical representation of the
extreme value distribution inherent in the data. It also has the capability to
accurately capture and display the effect of statistical dependence in the data
on the extreme value distribution, which opens the opportunity of using all
the available data in the analysis.
⇒ The second component is then constructed based on the realization that all, or
at least almost all, the available data are subasymptotic. Hence, it is important
to set up the method of analysis to also incorporate subasymptotic data into the
estimation of extreme value distributions, which will be shown to have some
importance for accurate prediction.
The proposed method has been used on a wide variety of estimation problems,
and the experience is that it represents a very powerful addition to the toolbox of
methods for extreme value estimation and prediction. Several such examples will
be presented in this book, cf. Chaps. 6, 9, and 12. A recent and interesting example
of its practical use on the distribution of defects in metallic materials is given by
Cetin and Naess (2012). Fatigue failures in such materials are intimately related
to the distribution of defects, cf. Murakami (2002). Another recent example is its
application to sea ice dynamics proposed by Sinsabvarodom et al. (2022). Needless
to say, what is presented in this chapter is by no means considered the end of the
development of the ACER method. It is a novel method, and it is to be expected that
several aspects of the proposed approach will see improvements.

5.2 A Sequence of Conditioning Approximations

In this section a sequence of nonparametric distribution functions will be constructed that converges to the exact extreme value distribution for the time series considered. This constitutes the core of the proposed approach.
Consider a stochastic process $Z(t)$, which has been observed over a time interval, $(0, T)$ say. Assume that values $X_1, \ldots, X_N$, which have been derived from the observed process, are allocated to the discrete times $t_1, \ldots, t_N$ in $(0, T)$. This could be simply the observed values of $Z(t)$ at each $t_j$, $j = 1, \ldots, N$, or it could be average values or peak values over smaller time intervals centered at the $t_j$'s. Our goal is to accurately determine the distribution function of the extreme value $M_N = \max\{X_j;\ j = 1, \ldots, N\}$. Specifically, we want to estimate $P(\eta) = \mathrm{Prob}(M_N \le \eta)$ accurately for large values of $\eta$ without asymptotics. Clearly, $P(\eta) = \mathrm{Prob}(X_1 \le \eta, X_2 \le \eta, \ldots, X_N \le \eta)$. Since $N$ will typically be a large number, direct estimation of this joint distribution function is not a practical option. Hence, we need to develop another approach.
An underlying premise for the development in this chapter is that a rational
approach to the study of the extreme values of the sampled time series is to
consider exceedances of the individual random variables .Xj above given thresholds,
as in classical extreme value theory. The alternative approach of considering the
exceedances by upcrossing of given thresholds by a continuous stochastic process
has already been discussed in Chap. 4, see also Naess and Gaidai (2008) and Naess
et al. (2007). The approach taken in the present chapter seems to be the appropriate
way to deal with the recorded data time series of, for example, the hourly or daily
largest wind speeds observed at a given location, just to cite a concrete example.
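The following basic rule from probability theory, $\mathrm{Prob}(\mathcal{A} \cap \mathcal{B}) = \mathrm{Prob}(\mathcal{A} \mid \mathcal{B})\, \mathrm{Prob}(\mathcal{B})$ for two events $\mathcal{A}$ and $\mathcal{B}$, turns out to be an important key to estimating $P(\eta)$. Using this basic rule repeatedly, it is obtained that

$$
\begin{aligned}
P(\eta) &= \mathrm{Prob}(M_N \le \eta) = \mathrm{Prob}\{X_N \le \eta, \ldots, X_1 \le \eta\} \\
&= \mathrm{Prob}\{X_N \le \eta \mid X_{N-1} \le \eta, \ldots, X_1 \le \eta\}\, \mathrm{Prob}\{X_{N-1} \le \eta, \ldots, X_1 \le \eta\} \\
&= \mathrm{Prob}\{X_N \le \eta \mid X_{N-1} \le \eta, \ldots, X_1 \le \eta\} \\
&\quad \cdot \mathrm{Prob}\{X_{N-1} \le \eta \mid X_{N-2} \le \eta, \ldots, X_1 \le \eta\}\, \mathrm{Prob}\{X_{N-2} \le \eta, \ldots, X_1 \le \eta\} \\
&\;\;\vdots \\
&= \prod_{j=2}^{N} \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, \ldots, X_1 \le \eta\} \cdot \mathrm{Prob}(X_1 \le \eta) .
\end{aligned} \tag{5.1}
$$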

In general, the variables $X_j$ are statistically dependent. Hence, instead of assuming that all the $X_j$ are statistically independent, which leads to the classical approximation,

$$
P(\eta) \approx P_1(\eta) := \prod_{j=1}^{N} \mathrm{Prob}(X_j \le \eta) , \tag{5.2}
$$

where $:=$ means “by definition,” the following one-step memory approximation will to some extent account for the dependence between the $X_j$'s

$$
\mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, \ldots, X_1 \le \eta\} \approx \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta\} , \tag{5.3}
$$

for $2 \le j \le N$. With this approximation, it is obtained that

$$
P(\eta) \approx P_2(\eta) := \prod_{j=2}^{N} \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta\}\, \mathrm{Prob}(X_1 \le \eta) . \tag{5.4}
$$
By conditioning on one more data point, the one-step memory approximation is extended to

$$
\mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, \ldots, X_1 \le \eta\} \approx \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, X_{j-2} \le \eta\} , \tag{5.5}
$$

where $3 \le j \le N$, which leads to the approximation,

$$
P(\eta) \approx P_3(\eta) := \prod_{j=3}^{N} \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, X_{j-2} \le \eta\} \cdot \mathrm{Prob}\{X_2 \le \eta \mid X_1 \le \eta\}\, \mathrm{Prob}(X_1 \le \eta) . \tag{5.6}
$$

For a general $k$, $2 \le k \le N$, it is obtained that

$$
\begin{aligned}
P(\eta) \approx P_k(\eta) := {} & \prod_{j=k}^{N} \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, \ldots, X_{j-k+1} \le \eta\} \\
& \cdot \prod_{j=2}^{k-1} \mathrm{Prob}\{X_j \le \eta \mid X_{j-1} \le \eta, \ldots, X_1 \le \eta\} \cdot \mathrm{Prob}(X_1 \le \eta) ,
\end{aligned} \tag{5.7}
$$

where $P(\eta) = P_N(\eta)$. It follows that the sequence of approximations $P_1(\eta), P_2(\eta), \ldots$ constitutes a sequence of increasingly accurate representations of the exact extreme value distribution $P(\eta)$.

It should be noted that the one-step memory approximation adopted above is not
a Markov chain approximation, as discussed in Smith (1992), Coles (1994), and
Smith et al. (1997), nor do the k-step memory approximations lead to kth-order
Markov chains, which are proposed in Yun (1998, 2000). An effort to relinquish
the Markov chain assumption to obtain an approximate distribution of clusters of
extremes is reported by Segers (2005).
It is now necessary to have a closer look at the values for $P(\eta)$ obtained by using Eq. (5.7) as compared to Eq. (5.2). Equation (5.2) can be rewritten in the form

$$
P(\eta) \approx P_1(\eta) = \prod_{j=1}^{N} \left( 1 - \alpha_{1j}(\eta) \right) , \tag{5.8}
$$

where $\alpha_{1j}(\eta) = \mathrm{Prob}\{X_j > \eta\}$, $j = 1, \ldots, N$. Then the approximation based on assuming independent data can be written as

$$
P(\eta) \approx F_1(\eta) := \exp\left( - \sum_{j=1}^{N} \alpha_{1j}(\eta) \right) , \tag{5.9}
$$

for large values of $\eta$, noting that $1 - x \approx \exp(-x)$ with high accuracy for small $x$. The relative error of this approximation is roughly $x^2/2$, that is, about 0.5% at $|x| = 0.1$, and it decreases rapidly for decreasing values of $|x|$.
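The accuracy of the approximation $1 - x \approx \exp(-x)$ is easy to check numerically. The short sketch below, with a helper name chosen here for illustration, evaluates the relative error at a few values of $x$; it is of the order of 0.5% at $|x| = 0.1$ and falls off roughly as $x^2/2$.

```python
import math


def relative_error(x):
    """Relative error of exp(-x) as an approximation of 1 - x."""
    return abs(math.exp(-x) - (1.0 - x)) / abs(1.0 - x)


for x in (0.1, 0.05, 0.01, 0.001):
    print(f"x = {x:6.3f}  relative error = {relative_error(x):.2e}")
```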
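Similarly, Eq. (5.7) can be expressed as

$$
P(\eta) \approx P_k(\eta) = \prod_{j=k}^{N} \left( 1 - \alpha_{kj}(\eta) \right) \prod_{j=1}^{k-1} \left( 1 - \alpha_{jj}(\eta) \right) , \tag{5.10}
$$

where $\alpha_{kj}(\eta) = \mathrm{Prob}\{X_j > \eta \mid X_{j-1} \le \eta, \ldots, X_{j-k+1} \le \eta\}$, for $j \ge k \ge 2$, denotes the exceedance probability conditional on $k - 1$ previous non-exceedances. From Eq. (5.10), it is obtained that, for large values of $\eta$,

$$
P(\eta) \approx F_k(\eta) := \exp\left( - \sum_{j=k}^{N} \alpha_{kj}(\eta) - \sum_{j=1}^{k-1} \alpha_{jj}(\eta) \right) , \quad \eta \to \infty , \tag{5.11}
$$

and $F_k(\eta) \to P(\eta)$ as $k \to N$ with $F_N(\eta) = P(\eta)$.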

For the sequence of approximations $F_k(\eta)$ to have practical significance, it is implicitly assumed that there is a cutoff value $k_c$ satisfying $k_c \ll N$ such that effectively $F_{k_c}(\eta) = F_N(\eta)$. It may be noted that for $k$-dependent stationary data sequences, that is, for data where $X_i$ and $X_j$ are independent whenever $|j - i| > k$, then $P(\eta) = P_{k+1}(\eta)$ exactly, and, under rather mild conditions on the joint distributions of the data, $\lim_{N \to \infty} P_1(\eta) = \lim_{N \to \infty} P(\eta)$ (Watson 1954). In fact, it can be shown that $\lim_{N \to \infty} P_1(\eta) = \lim_{N \to \infty} P(\eta)$ is true for weaker conditions than $k$-dependence (Leadbetter et al. 1983). However, for finite values of $N$, the picture is much more complex, and purely asymptotic results should be used with some caution.
Returning to Eq. (5.11), extreme value prediction by the conditioning approach described above reduces to estimation of (combinations of) the $\alpha_{kj}(\eta)$ functions. In accordance with the previous assumption about a cutoff value $k_c$, for all $k$-values of interest, $k \ll N$, so that $\sum_{j=1}^{k-1} \alpha_{jj}(\eta)$ is effectively negligible compared to $\sum_{j=k}^{N} \alpha_{kj}(\eta)$. Hence, for simplicity, the following approximation is adopted, which is applicable to both stationary and nonstationary data

$$
F_k(\eta) = \exp\left( - \sum_{j=k}^{N} \alpha_{kj}(\eta) \right) , \quad k \ge 1 . \tag{5.12}
$$

Going back to the definition of $\alpha_{1j}(\eta)$, it follows that $\sum_{j=1}^{N} \alpha_{1j}(\eta)$ is equal to the expected number of exceedances of the threshold $\eta$ during the time interval $(0, T)$. Equation (5.9) therefore expresses the approximation that the stream of exceedance events constitutes a (nonstationary) Poisson process. This opens for an understanding of Eq. (5.12) by interpreting the expressions $\sum_{j=k}^{N} \alpha_{kj}(\eta)$ as the expected effective number of (assumed) independent exceedance events provided by conditioning on $k - 1$ previous observations.

5.3 Empirical Estimation of the Average Conditional Exceedance Rates

The concept of average conditional exceedance rate (ACER) of order $k$ is now introduced as follows:

$$
\varepsilon_k(\eta) = \frac{1}{N - k + 1} \sum_{j=k}^{N} \alpha_{kj}(\eta) , \quad k = 1, 2, \ldots \tag{5.13}
$$

It is noted that this empirical ACER function also depends on the number of
data points N . In contrast to the average upcrossing rate of Chap. 4, which would
typically express the average number of upcrossings per time unit, the ACER
functions are exceedance rates per data point.
The behavior and diagnostic power of the ACER functions will be demonstrated for several examples in Chap. 6. In terms of the ACER function, we may now write

$$
F_k(\eta) = \exp\left( - (N - k + 1)\, \varepsilon_k(\eta) \right) , \quad k \ge 1 . \tag{5.14}
$$

In practice, there are typically two scenarios for the underlying process $Z(t)$. We may consider it to be either a stationary process or, in fact, even an ergodic process, which allows the replacement of ensemble averages with time averages, cf. Doob (1953), Cramer and Leadbetter (1967), and Wong and Hajek (1985). The alternative is to view $Z(t)$ as a process that depends on certain parameters whose variation in time may be modelled as an ergodic process in its own right. For each set of values of the parameters, the premise is that $Z(t)$ can then be modelled as an ergodic process. This would be the scenario that can be used to model long-term statistics (Naess 1984; Schall et al. 1991).
For both these scenarios, the empirical estimation of the ACER function $\varepsilon_k(\eta)$ proceeds in a completely analogous way by counting the total number of favorable incidents, that is, exceedances combined with the requisite number of preceding non-exceedances, for the total data time series and then finally dividing by $N - k + 1 \approx N$. This can be shown to apply for the long-term situation, as briefly discussed below.
A few more details on the numerical estimation of $\varepsilon_k(\eta)$ for $k \ge 2$ may be appropriate. Initially, the following random functions are introduced:

$$
A_{kj}(\eta) = \mathbf{1}\{X_j > \eta, X_{j-1} \le \eta, \ldots, X_{j-k+1} \le \eta\} , \quad j = k, \ldots, N, \ k = 2, 3, \ldots \tag{5.15}
$$

and

$$
B_{kj}(\eta) = \mathbf{1}\{X_{j-1} \le \eta, \ldots, X_{j-k+1} \le \eta\} , \quad j = k, \ldots, N, \ k = 2, \ldots , \tag{5.16}
$$

where $\mathbf{1}\{\mathcal{A}\}$ denotes the indicator function of some event $\mathcal{A}$, that is, $\mathbf{1}\{\mathcal{A}\} = 1$ if the event occurs, and $\mathbf{1}\{\mathcal{A}\} = 0$ if not. Then, since $E[\mathbf{1}\{\mathcal{A}\}] = \mathrm{Prob}\{\mathcal{A}\}$, where $E[\cdot]$ denotes the expectation operator, it follows that

$$
\alpha_{kj}(\eta) = \frac{\mathrm{Prob}\{X_j > \eta, X_{j-1} \le \eta, \ldots, X_{j-k+1} \le \eta\}}{\mathrm{Prob}\{X_{j-1} \le \eta, \ldots, X_{j-k+1} \le \eta\}} = \frac{E[A_{kj}(\eta)]}{E[B_{kj}(\eta)]} , \quad j = k, \ldots, N, \ k = 2, \ldots . \tag{5.17}
$$

Assuming an ergodic process, which implies stationarity, then obviously $\varepsilon_k(\eta) = \alpha_{kk}(\eta) = \ldots = \alpha_{kN}(\eta)$, and by ergodicity, replacing ensemble means with corresponding time averages, it may be assumed that for the time series at hand

$$
\varepsilon_k(\eta) = \lim_{N \to \infty} \frac{\sum_{j=k}^{N} a_{kj}(\eta)}{\sum_{j=k}^{N} b_{kj}(\eta)} , \tag{5.18}
$$

where $a_{kj}(\eta)$ and $b_{kj}(\eta)$ are the realized values of $A_{kj}(\eta)$ and $B_{kj}(\eta)$, respectively, for the observed time series. Denoting a realization of the time series $X_1, X_2, \ldots$ by $x_1, x_2, \ldots$, then $a_{kj}(\eta) = 1$ if $x_j > \eta, x_{j-1} \le \eta, \ldots, x_{j-k+1} \le \eta$; otherwise $a_{kj}(\eta) = 0$. Similarly, $b_{kj}(\eta) = 1$ if $x_{j-1} \le \eta, \ldots, x_{j-k+1} \le \eta$; otherwise $b_{kj}(\eta) = 0$.

Clearly, $\lim_{\eta \to \infty} E[B_{kj}(\eta)] = 1$. Hence, $\lim_{\eta \to \infty} \tilde{\varepsilon}_k(\eta) / \varepsilon_k(\eta) = 1$, where

$$
\tilde{\varepsilon}_k(\eta) = \frac{\sum_{j=k}^{N} E[A_{kj}(\eta)]}{N - k + 1} . \tag{5.19}
$$

The advantage of using the modified ACER function $\tilde{\varepsilon}_k(\eta)$ for $k \ge 2$ is that it is easier to use for nonstationary or long-term statistics than $\varepsilon_k(\eta)$. Since our focus is on the values of the ACER functions at the extreme levels, we may use any function that provides correct predictions of the appropriate ACER function at these extreme levels.
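To see why Eq. (5.19) may be applicable for nonstationary time series, it is recognized that, for large values of $\eta$,

$$
P(\eta) \approx \exp\left( - \sum_{j=k}^{N} \alpha_{kj}(\eta) \right) = \exp\left( - \sum_{j=k}^{N} \frac{E[A_{kj}(\eta)]}{E[B_{kj}(\eta)]} \right) \approx \exp\left( - \sum_{j=k}^{N} E[A_{kj}(\eta)] \right) . \tag{5.20}
$$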
If the time series can be segmented into $K$ blocks such that $E[A_{kj}(\eta)]$ remains approximately constant within each block and such that $\sum_{j \in C_i} E[A_{kj}(\eta)] \approx \sum_{j \in C_i} a_{kj}(\eta)$ for a sufficient range of $\eta$-values, where $C_i$ denotes the set of indices for block no. $i$, $i = 1, \ldots, K$, then $\sum_{j=k}^{N} E[A_{kj}(\eta)] \approx \sum_{j=k}^{N} a_{kj}(\eta)$. Hence,

$$
P(\eta) \approx \exp\left( - (N - k + 1)\, \hat{\varepsilon}_k(\eta) \right) , \quad \eta \to \infty , \tag{5.21}
$$

where, in this case,

$$
\hat{\varepsilon}_k(\eta) = \frac{1}{N - k + 1} \sum_{j=k}^{N} a_{kj}(\eta) , \tag{5.22}
$$

which is the empirical counterpart of Eq. (5.19).
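The counting behind Eq. (5.22) can be sketched as follows. This is a minimal illustration, not the implementation used for the examples in this book. The function names and the short data series are invented for the example, and the case $k = 1$ is handled by counting plain exceedances.

```python
import math


def acer_hat(x, eta, k):
    """Empirical ACER of order k: count of a_kj, j = k..N, divided by N - k + 1."""
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    count = 0
    for j in range(k - 1, n):  # 0-based position of X_j for j = k, ..., N
        preceded_by_non_exceedances = all(x[i] <= eta for i in range(j - k + 1, j))
        if x[j] > eta and preceded_by_non_exceedances:
            count += 1
    return count / (n - k + 1)


def extreme_cdf_estimate(x, eta, k):
    """Estimate of P(eta) according to Eq. (5.21)."""
    return math.exp(-(len(x) - k + 1) * acer_hat(x, eta, k))


# Hypothetical data; exceedances of eta = 1.0 occur at positions 2, 3, 5, and 8.
series = [0.2, 1.5, 1.7, 0.3, 1.2, 0.1, 0.4, 2.0]
for order in (1, 2, 3):
    print(order, acer_hat(series, 1.0, order), extreme_cdf_estimate(series, 1.0, order))
```

For real applications the series is long and the level $\eta$ is high, so that the counts are small relative to $N$; the toy series above only serves to make the counting rule visible.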

It is of interest to note what events are actually counted for the estimation of the various $\varepsilon_k(\eta)$, $k \ge 2$. Let us start with $\varepsilon_2(\eta)$. It follows from the definition of $\varepsilon_2(\eta)$ that $\varepsilon_2(\eta)\,(N - 1)$ can be interpreted as the expected number of exceedances above the level $\eta$ satisfying the condition that an exceedance is counted only if it is immediately preceded by a non-exceedance. Hence, if a realization of a stochastic process is sampled with a sufficiently high frequency to approximate the realization with good accuracy, then $\varepsilon_2(\eta)\,(N - 1)$ is approximately the expected number of upcrossings of $\eta$ among the $N - 1$ data points. This implies that $\varepsilon_2(\eta)$ is approximately the average upcrossing rate per data point. Therefore, if the $N - 1$ data points correspond to a time interval of length $T$, then $\varepsilon_2(\eta)\,(N - 1)/T$ expresses the average upcrossing rate per time unit.
Assuming that we are studying a narrow-band stochastic response process,
sampling a realization by extracting the local peak values would be a typical
procedure for studying the extremes. A reinterpretation of .ε2 (η) (N − 1) for this
case would lead us to the conclusion that it equals the average number of clumps of
exceedances above .η for the realizations considered, where a clump of exceedances
is defined as a maximum number of consecutive exceedances of the peak values
above .η. Figure 5.1 illustrates how clumps of exceedances would be counted for
the estimation of .ε2 (η). A scrutiny of the figure shows that four clumps can be
identified.
In general, .εk (η) (N − k + 1) then equals the average number of clumps of
exceedances above .η separated by at least .k − 1 non-exceedances. Hence, for the
illustrating example of Fig. 5.1, it follows that for the estimation of .ε3 (η), the two
arrows show that what were three clumps for the estimation of .ε2 (η) now become
one clump for the estimation of .ε3 (η). Hence, in Fig. 5.1, we can now identify only
two clumps for the estimation of .ε3 (η).
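
The clump interpretation can be checked on a toy sequence of peak values. The sketch below is only an illustration: the name `count_clumps` and the data are invented here, and the gap counter is one simple way to express the rule that an exceedance is counted only when at least $k-1$ non-exceedances precede it.

```python
def count_clumps(peaks, eta, k):
    """Count exceedances of eta preceded by at least k-1 consecutive non-exceedances."""
    clumps = 0
    gap = 0  # number of consecutive non-exceedances immediately before the current point
    for value in peaks:
        if value > eta:
            if gap >= k - 1:
                clumps += 1
            gap = 0
        else:
            gap += 1
    return clumps


# Hypothetical peak values: 3 marks an exceedance of eta = 2, 0 a non-exceedance
peaks = [0, 0, 3, 3, 0, 3, 0, 0, 3, 0, 0, 0, 3]
for k in (1, 2, 3):
    print(k, count_clumps(peaks, eta=2, k=k))
```

With these numbers, $k = 1$ counts all five exceedances, $k = 2$ counts four clumps, and $k = 3$ merges the two clumps separated by a single non-exceedance, leaving three.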

Fig. 5.1 An illustration of clumps of exceedances counted for the estimation of $\varepsilon_2(\eta)$. [Plot: time series versus $t$ from 0 to 150, vertical range from −6 to 6, with the level $\eta$ marked.]

If the time series analyzed is obtained by extracting local peak values from
a narrow-band response process, it is interesting to note that there is a certain
similarity between the ACER approximations and the envelope approximations for
extreme value prediction (Naess and Gaidai 2008; Vanmarcke 1975). For alternative
statistical approaches to account for the effect of clustering on the extreme value
distribution, the reader may consult Leadbetter (1983, 1995), Hsing (1987, 1991),
Ferro and Segers (2003), and Robert (2009). In these works the emphasis is on
the notion of the extremal index, which characterizes the clumping or clustering
tendency of the data and its effect on the extreme value distribution, cf. Sect. 2.9. In
the ACER functions, these effects are automatically accounted for.
Now, let us look at the problem of estimating a confidence interval for $\varepsilon_k(\eta)$,
assuming a stationary time series. If $R$ realizations of the requisite length of the time
series are available or if one long realization can be segmented into $R$ subseries, then
the sample standard deviation $\hat{s}_k(\eta)$ can be estimated by the standard formula

$$\hat{s}_k(\eta)^2 = \frac{1}{R-1} \sum_{r=1}^{R} \bigl(\hat{\varepsilon}_k^{(r)}(\eta) - \hat{\varepsilon}_k(\eta)\bigr)^2, \tag{5.23}$$

where $\hat{\varepsilon}_k^{(r)}(\eta)$ denotes the ACER function estimate from realization no. $r$, and
$\hat{\varepsilon}_k(\eta) = \sum_{r=1}^{R} \hat{\varepsilon}_k^{(r)}(\eta)/R$.
Assuming that realizations are independent, for a suitable number $R$, e.g., $R \ge 20$,
Eq. (5.23) leads to a good approximation of the 95% confidence interval
CI $= \bigl(C^-(\eta), C^+(\eta)\bigr)$ for the value $\varepsilon_k(\eta)$, where

$$C^{\pm}(\eta) = \hat{\varepsilon}_k(\eta) \pm 1.96\, \hat{s}_k(\eta)/\sqrt{R}. \tag{5.24}$$
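
A minimal sketch of Eqs. (5.23) and (5.24) at a single level $\eta$ is given below. The function name and the five sample estimates are hypothetical; `statistics.stdev` is used because it divides by $R-1$, as Eq. (5.23) does.

```python
import math
import statistics


def acer_ci_from_realizations(estimates, z=1.96):
    """Return (C-, mean, C+) from R per-realization ACER estimates at one level eta."""
    r = len(estimates)
    mean = statistics.fmean(estimates)      # pooled estimate, average over realizations
    s = statistics.stdev(estimates)         # sample standard deviation, Eq. (5.23)
    half_width = z * s / math.sqrt(r)       # Eq. (5.24)
    return mean - half_width, mean, mean + half_width


# Hypothetical estimates from R = 5 realizations (the text suggests R >= 20 in practice)
estimates = [0.010, 0.012, 0.008, 0.011, 0.009]
lo, mid, hi = acer_ci_from_realizations(estimates)
print(lo, mid, hi)
```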

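Alternatively, and this also applies to the non-stationary case, it is consistent
with the adopted approach to assume that the stream of conditional exceedances
over a threshold $\eta$ constitutes a Poisson process, possibly non-homogeneous. Hence,
the variance of the estimator $\hat{E}_k(\eta)$ of $\tilde{\varepsilon}_k(\eta)$, where

$$\hat{E}_k(\eta) = \frac{\sum_{j=k}^{N} A_{kj}(\eta)}{N-k+1}, \tag{5.25}$$

is $\mathrm{Var}[\hat{E}_k(\eta)] = \tilde{\varepsilon}_k(\eta)/(N-k+1)$, since the Poisson count in the numerator has
variance equal to its mean $(N-k+1)\,\tilde{\varepsilon}_k(\eta)$. Therefore, for high levels $\eta$, the approximate limits of a 95%
confidence interval of $\varepsilon_k(\eta)$ can be written as

$$C^{\pm}(\eta) = \hat{\varepsilon}_k(\eta)\left(1 \pm \frac{1.96}{\sqrt{(N-k+1)\,\hat{\varepsilon}_k(\eta)}}\right). \tag{5.26}$$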
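
The Poisson-based limits of Eq. (5.26) need only the point estimate and the number of data. The following sketch uses a made-up function name and made-up numbers.

```python
import math


def acer_ci_poisson(eps_hat, n, k, z=1.96):
    """Approximate 95% limits of Eq. (5.26) for one ACER estimate eps_hat > 0."""
    if eps_hat <= 0.0:
        raise ValueError("eps_hat must be positive")
    m = n - k + 1                       # number of terms in the sum of Eq. (5.25)
    rel = z / math.sqrt(m * eps_hat)    # relative half-width
    return eps_hat * (1.0 - rel), eps_hat * (1.0 + rel)


# Hypothetical case: 100 counted events among N - k + 1 = 10000 points
lo, hi = acer_ci_poisson(eps_hat=1.0e-2, n=10001, k=2)
print(lo, hi)
```

The lower limit turns negative once $(N-k+1)\,\hat{\varepsilon}_k(\eta) < 1.96^2 \approx 3.84$, that is, when fewer than about four events are counted at the level in question.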

5.4 Long-Term Extreme Value Analysis by the ACER Method

In Sect. 4.10 we studied long-term extreme value distributions for the point process
method. This was related to a scatter diagram of short-term environmental con-
ditions. If the whole time series over a long-term scenario is available, we have
shown in the previous section that the long-term statistics using ACER functions
may be estimated directly from the time series, cf. Eq. 5.21. However, in many
cases it would be more practical to analyze each short-term condition separately
and combine the obtained ACER functions after that. This would, e.g., be the typical
approach in a simulation based long-term statistical analysis where the short-term
response time series would be simulated and the resulting time series subjected to
an ACER analysis. The obtained ACER functions would then be combined to form
the long-term ACER function. One advantage of doing a long-term analysis this
way would be the opportunity to apply the ACER function definition of Eq. (5.18)
to each short-term condition.
Adapting Eq. (4.76) to the discrete scenario for the ACER analysis over the
scatter diagram of Table 4.2, an estimated long-term ACER function would be
expressed as follows:

$$\hat{\varepsilon}_k(\eta) = \sum_{h_i} \sum_{t_j} \hat{\varepsilon}_k(\eta \mid h_i, t_j)\, f_{H_s T_s}(h_i, t_j)\, \Delta h_i\, \Delta t_j, \tag{5.27}$$

where the ACER functions $\hat{\varepsilon}_k(\eta \mid h_i, t_j)$ have been estimated for each separate short-
term condition $(h_i, t_j)$, or $(i, j)$ for short.
An alternative equivalent formulation is obtained as in Sect. 4.10.3. Assume that
the number of sea states in condition $(i, j)$ is $K_{ij}$, $i = 1, \ldots, m$ and $j = 1, \ldots, n$,
and $K = \sum_{i=1}^{m} \sum_{j=1}^{n} K_{ij}$, cf. Sect. 4.10.1. We may then write

$$\hat{\varepsilon}_k(\eta) = \sum_{i=1}^{m} \sum_{j=1}^{n} \hat{\varepsilon}_k^{(ij)}(\eta)\, \frac{K_{ij}}{K}, \tag{5.28}$$

where the ACER function $\hat{\varepsilon}_k^{(ij)}(\eta)$ is estimated for condition $(i, j)$. So, again we
obtain the long-term extreme value distribution as

$$P(\eta) \approx \exp\bigl(-(N-k+1)\,\hat{\varepsilon}_k(\eta)\bigr), \tag{5.29}$$

using either Eq. (5.27) or (5.28), where $N$ is the number of data over the long-
term period. However, please note the cautionary comments on the use of a scatter
diagram for long-term analysis in Sect. 4.10.3.
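
The combination in Eqs. (5.28) and (5.29) is a weighted average followed by an exponential. A small sketch, with hypothetical names and numbers and with the short-term ACER values at one level $\eta$ assumed to be already available:

```python
import math


def long_term_acer(acer_by_condition, counts):
    """Eq. (5.28): average of short-term ACER values with weights K_ij / K."""
    total = sum(counts.values())
    return sum(acer_by_condition[c] * counts[c] / total for c in counts)


def long_term_cdf(eps_long, n, k):
    """Eq. (5.29): long-term extreme value distribution at the same level eta."""
    return math.exp(-(n - k + 1) * eps_long)


# Hypothetical short-term results at one level eta for two conditions (i, j)
acer = {(1, 1): 1.0e-4, (1, 2): 4.0e-4}
counts = {(1, 1): 3, (1, 2): 1}          # K_ij, number of sea states per condition
eps = long_term_acer(acer, counts)
print(eps, long_term_cdf(eps, n=1001, k=2))
```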

5.5 Estimation of Extremes for the Asymptotic Gumbel Case

The second component of the approach to extreme value estimation presented in

this chapter was originally derived for a time series with an asymptotic extreme
value distribution of the Gumbel type, cf. Naess and Gaidai (2009). This case is
therefore presented first, also because the extension of the asymptotic distribution to
a parametric class of extreme value distribution tails that are capable of capturing to
some extent subasymptotic behavior is more transparent, and perhaps more obvious,
for the Gumbel case. The reason behind the efforts to extend the extreme value
distributions to the subasymptotic range is the fact that the ACER functions allow
us to use not only asymptotic data, which is clearly an advantage since proving that
observed extremes are asymptotic is not possible.
The effect of the asymptotic distribution being of the Gumbel type on the possible
subasymptotic functional forms of $\varepsilon_k(\eta)$ cannot easily be decided in any detail.
However, using the asymptotic form as a guide, it is assumed that the behavior
of the ACER in the tail is dominated by a function of the form $\exp\{-a(\eta - b)^c\}$
($\eta \ge \eta_1 \ge b$) where $a$, $b$, and $c$ are suitable constants, and $\eta_1$ is an appropriately
chosen tail marker. Hence, it will be assumed that

$$\varepsilon_k(\eta) = q_k(\eta) \exp\{-a_k(\eta - b_k)^{c_k}\}, \quad \eta \ge \eta_1, \tag{5.30}$$

where the function $q_k(\eta)$ is slowly varying compared with the exponential function
$\exp\{-a_k(\eta - b_k)^{c_k}\}$ and $a_k$, $b_k$, and $c_k$ are suitable constants, which in general will be
dependent on $k$. Note that the values $c_k = q_k(\eta) = 1$ correspond to the asymptotic
Gumbel distribution, which is then a special case of the assumed tail behavior. And,
of course, any extreme value distribution with an ACER function of the form given
by Eq. (5.30) is asymptotically Gumbel, cf. comments at the end of Sect. 4.6.
Note that under the assumptions made, a plot of $-\log\lvert\log(\varepsilon_k(\eta)/q_k(\eta))\rvert$ versus
$\log(\eta - b_k)$ will exhibit a perfectly linear tail behavior. This will be illustrated in
Chap. 9.
It is realized that if the function $q_k(\eta)$ could be replaced by a constant value, $q_k$
say, one would immediately be in a position to apply a linear extrapolation strategy
for deep tail prediction problems. In general, $q_k(\eta)$ is not constant, but its variation
in the tail region is often sufficiently slow to allow for its replacement by a constant,
possibly by adjusting the tail marker $\eta_1$. The proposed statistical approach to the
prediction of extreme values is therefore based on the assumption that we can write

$$\varepsilon_k(\eta) = q_k \exp\{-a_k(\eta - b_k)^{c_k}\}, \quad \eta \ge \eta_1, \tag{5.31}$$

where $a_k$, $b_k$, $c_k$, and $q_k$ are appropriately chosen constants. In a certain sense, this
is a minimal class of parametric functions that can be used for this purpose, which
makes it possible to achieve three important goals. Firstly, the parametric class
contains the asymptotic form given by $c_k = q_k = 1$ as a special case. Secondly, the
class is flexible enough to capture to a certain extent subasymptotic behavior of any
extreme value distribution that is asymptotically Gumbel. Thirdly, the parametric
functions agree with a wide range of known special cases, of which a very important
example is the extreme value distribution for a regular stationary Gaussian process,
which has $c_k = 2$.
The viability of this approach has been successfully demonstrated for extreme
value statistics of the response processes related to a wide range of different
dynamical systems, cf. Naess and Gaidai (2008) and Naess et al. (2007). In Chap. 6,
the ACER method will be applied to the problem of predicting extreme wind
speeds. The performance of the annual maxima method and the POT method for
this purpose will also be discussed.
As to the question of finding the parameters $a, b, c, q$ (the subscript $k$, if it
applies, is suppressed), the adopted approach is to determine these parameters
by minimizing the following mean square error function with respect to all four
arguments:

$$F(a, b, c, q) = \sum_{j=1}^{J} w_j \bigl[\log \hat{\varepsilon}(\eta_j) - \log q + a(\eta_j - b)^c\bigr]^2, \tag{5.32}$$

where $\eta_1 < \ldots < \eta_J$ denotes the levels where the ACER function has been
estimated, and $w_j$ denotes a weight factor that puts more emphasis on the more
reliably estimated $\hat{\varepsilon}(\eta_j)$, cf. the discussion about the BLUE below. However, the
choice of weight factor is to some extent arbitrary. With $\bigl(C^-(\eta), C^+(\eta)\bigr)$ denoting
the 95% confidence interval for the value $\varepsilon_k(\eta)$, we have previously used
$w_j = \bigl(\log C^+(\eta_j) - \log C^-(\eta_j)\bigr)^{-\theta}$ with $\theta = 1$ and 2, combined with a Levenberg–
Marquardt least squares optimization method (Gill et al. 1981). This has usually
worked well, provided reasonable initial values for the parameters were chosen.
Note that the form of $w_j$ puts some restriction on the use of the data. Usually, there is
a level $\eta_j$ beyond which $w_j$ is no longer defined, that is, $C^-(\eta_j)$ becomes negative.
Hence, the summation in Eq. (5.32) has to stop before that happens. Also, the data
should be preconditioned by establishing the tail marker $\eta_1$ based on inspection of
the empirical ACER functions.
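
To make the ingredients of Eq. (5.32) concrete, the weight factor and the objective function can be written out as below. This is only a sketch with invented names (`weight`, `objective`); it evaluates $F$ for given parameters and does not perform the Levenberg–Marquardt minimization.

```python
import math


def weight(c_minus, c_plus, theta=2.0):
    """w_j = (log C+ - log C-)^(-theta); undefined once C- is not positive."""
    if c_minus <= 0.0:
        raise ValueError("lower confidence limit must be positive")
    return (math.log(c_plus) - math.log(c_minus)) ** (-theta)


def objective(params, levels, eps_hat, weights):
    """Weighted squared error F(a, b, c, q) of Eq. (5.32) on the log scale."""
    a, b, c, q = params
    total = 0.0
    for eta, eps, w in zip(levels, eps_hat, weights):
        resid = math.log(eps) - math.log(q) + a * (eta - b) ** c
        total += w * resid ** 2
    return total


# Hypothetical check: data lying exactly on the curve exp(-0.5 * eta^2)
levels = [2.5, 3.0, 3.5]
eps_hat = [math.exp(-0.5 * eta ** 2) for eta in levels]
weights = [1.0, 1.0, 1.0]
print(objective((0.5, 0.0, 2.0, 1.0), levels, eps_hat, weights))  # essentially zero
print(objective((0.6, 0.0, 2.0, 1.0), levels, eps_hat, weights))  # clearly positive
```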

In general, to improve robustness of results, it is recommended to apply
a nonlinearly constrained optimization (Forst and Hoffmann 2010). The set of
constraints is written as

$$\begin{cases} \log q - a(\eta_i - b)^c < 0, \\ 0 < q < +\infty, \\ \min_j X_j < b \le \eta_1, \\ 0 < a < +\infty, \\ 0 < c < 5. \end{cases} \tag{5.33}$$

Here, the first nonlinear inequality constraint is evident, since under our assumption
we have $\hat{\varepsilon}(\eta_i) = q \exp\{-a(\eta_i - b)^c\}$, and $\hat{\varepsilon}(\eta_i) < 1$ by definition.
A note of caution: When the parameter $c$ is equal to 1.0 or close to it, that is, the
distribution is close to the Gumbel distribution, the optimization problem becomes
ill-defined or close to ill-defined. When $c = 1.0$, there are infinitely many
$(b, q)$ pairs that give exactly the same value of $F(a, b, c, q)$. Hence, there is no
well-defined optimum in parameter space. There are simply too many parameters.
This problem is alleviated by fixing the $q$-value, and the obvious choice is $q = 1$.
The restriction $c < 5$ is a practical one and has no real significance beyond
limiting the range of $c$-values. A $c$-value larger than 5 should prompt an inspection of
the choice of tail marker. For $c$-values larger than 1.0, the $c$-value would typically
decrease with increasing tail marker, indicating a deep tail approach to the Gumbel
distribution, which is the asymptotic limit.
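
The degeneracy at $c = 1$ can be made explicit with one line of algebra. The following is a short check using only the tail form assumed in Eq. (5.31); the symbols $b'$ and $q'$ are introduced here for illustration.

$$q \exp\{-a(\eta - b)\} = q\,e^{a(b - b')} \exp\{-a(\eta - b')\} = q' \exp\{-a(\eta - b')\}, \qquad q' = q\,e^{a(b - b')}.$$

So for any shift $b'$ the pair $(b', q')$ reproduces the same curve and hence the same value of $F$; fixing $q = 1$ removes this one-parameter freedom.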
Although the Levenberg–Marquardt method generally works well with four or,
when appropriate, three parameters, a more direct and transparent optimization
method has also been developed for the problem at hand. It is realized by
scrutinizing Eq. (5.32) that if $b$ and $c$ are fixed, the optimization problem reduces to
a standard weighted linear regression problem. That is, with both $b$ and $c$ fixed,
the optimal values of $a$ and $\log q$ are found using closed-form weighted linear
regression formulas in terms of $w_j$, $y_j = \log \hat{\varepsilon}(\eta_j)$ and $x_j = (\eta_j - b)^c$. In that
light, it can also be concluded that the best linear unbiased estimators (BLUEs) are
obtained for $w_j = \sigma_{y_j}^{-2}$, where $\sigma_{y_j}^2 = \mathrm{Var}[y_j]$ (empirical) (Draper and Smith 1998;
Montgomery et al. 2002). Unfortunately, this is not a very practical weight factor
for the kind of problem we have here because the summation in Eq. (5.32) then
typically would have to stop at undesirably small values of $\eta_j$.
It is obtained that the optimal values of $a$ and $q$ are given by the relations

$$a^*(b, c) = -\frac{\sum_{j=1}^{J} w_j (x_j - \bar{x})(y_j - \bar{y})}{\sum_{j=1}^{J} w_j (x_j - \bar{x})^2}, \tag{5.34}$$

and

$$\log q^*(b, c) = \bar{y} + a^*(b, c)\,\bar{x}, \tag{5.35}$$

where $\bar{x} = \sum_{j=1}^{J} w_j x_j / \sum_{j=1}^{J} w_j$, with a similar definition of $\bar{y}$, and the sums run
over the $J$ levels of Eq. (5.32).
To calculate the final optimal set of parameters, one may use the Levenberg–
Marquardt method on the function $\tilde{F}(b, c) = F(a^*(b, c), b, c, q^*(b, c))$ to find
the optimal values $b^*$ and $c^*$ and then use Eqs. (5.34) and (5.35) to calculate the
corresponding $a^*$ and $q^*$.
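
The two-stage idea can be sketched as follows. The inner step implements the closed-form regression of Eqs. (5.34) and (5.35); for the outer step, a coarse grid search over $(b, c)$ is used here as a simple stand-in for the Levenberg–Marquardt method, and the constraints of Eq. (5.33) are not enforced. All function names, grids, and data are assumptions made for this example.

```python
import math


def regress_a_logq(levels, eps_hat, weights, b, c):
    """Closed-form optimal a and log q for fixed b and c, Eqs. (5.34)-(5.35)."""
    xs = [(eta - b) ** c for eta in levels]
    ys = [math.log(e) for e in eps_hat]
    sw = sum(weights)
    xbar = sum(w * x for w, x in zip(weights, xs)) / sw
    ybar = sum(w * y for w, y in zip(weights, ys)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, xs))
    a = -sxy / sxx
    return a, ybar + a * xbar


def reduced_objective(levels, eps_hat, weights, b, c):
    """F~(b, c): Eq. (5.32) evaluated at the optimal a and q for this (b, c)."""
    a, log_q = regress_a_logq(levels, eps_hat, weights, b, c)
    return sum(w * (math.log(e) - log_q + a * (eta - b) ** c) ** 2
               for eta, e, w in zip(levels, eps_hat, weights))


def grid_search(levels, eps_hat, weights, b_grid, c_grid):
    """Pick the grid point (b, c) with the smallest reduced objective."""
    _, b, c = min((reduced_objective(levels, eps_hat, weights, b, c), b, c)
                  for b in b_grid for c in c_grid)
    a, log_q = regress_a_logq(levels, eps_hat, weights, b, c)
    return a, b, c, math.exp(log_q)


# Hypothetical data lying exactly on exp(-0.5 * eta^2), i.e. a=0.5, b=0, c=2, q=1
levels = [2.0 + 0.25 * i for i in range(9)]
eps_hat = [math.exp(-0.5 * eta ** 2) for eta in levels]
weights = [1.0] * len(levels)
a, b, c, q = grid_search(levels, eps_hat, weights,
                         b_grid=[-0.5, 0.0, 0.5], c_grid=[1.5, 2.0, 2.5])
print(a, b, c, q)
```

The grid deliberately avoids $c = 1$, where the fit is degenerate in $(b, q)$.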
For a simple construction of a confidence interval for the target deep tail extreme
value provided by a particular ACER function as given by the fitted parametric
curve, the empirical confidence band is reanchored to the fitted curve by centering
the individual confidence intervals CI$_{0.95}$ for the point estimates of the ACER
function on the fitted curve. Under the premise that the specified class of parametric
curves fully describes the behavior of the ACER functions in the tail, parametric
curves are fitted as described above to the boundaries of the reanchored confidence
band. These curves are used to determine a first estimate of a 95% confidence
interval for the target extreme value. To obtain a more precise estimate of the
confidence interval, a bootstrapping method would be recommended. A comparison
of estimated confidence intervals by both these methods will be presented in
Sect. 6.2 on extreme value prediction for synthetic data.
As a final point, it has been observed that the predicted value is not very sensitive
to the choice of $\eta_1$, provided it is chosen with some care. This property is easily
recognized by looking at the way the optimized fitting is done. If the tail marker is
in the appropriate domain of the ACER function, the optimal fitted curve does not
change appreciably by moving the tail marker.

5.6 Estimation of Extremes for the General Case

For independent data in the general case, the ACER function $\varepsilon_1(\eta)$ can be expressed
asymptotically as

$$\varepsilon_1(\eta) \underset{\eta\to\infty}{\simeq} \bigl[1 + \gamma\, a(\eta - b)\bigr]^{-1/\gamma}, \tag{5.36}$$

where $a > 0$, $b$, $\gamma$ are constants. This follows from the explicit form of the
generalized extreme value (GEV) distribution, which has been discussed in Chap. 2.
Again, the implication of this assumption on the possible subasymptotic func-
tional forms of $\varepsilon_k(\eta)$ in the general case is not a trivial matter. The approach we
have chosen is to assume that the class of parametric functions needed for the
prediction of extreme values for the general case can be modelled on the relation
between the Gumbel distribution and the GEV distribution. While the extension of
the asymptotic Gumbel case to the proposed class of subasymptotic distributions
was fairly transparent, this is not equally so for the general case. However, using
a similar kind of approximation, the behavior of the mean exceedance rate in the
subasymptotic part of the tail is assumed to follow a function largely of the form
$\bigl[1 + \gamma a(\eta - b)^c\bigr]^{-1/\gamma}$ ($\eta \ge \eta_1 \ge b$) where $a > 0$, $b$, $c > 0$, and $\gamma > 0$ are suitable
constants, and $\eta_1$ is an appropriately chosen tail level. Hence, it will be assumed
that (Naess 2010)

$$\varepsilon_k(\eta) = q_k(\eta)\bigl[1 + \gamma_k a_k(\eta - b_k)^{c_k}\bigr]^{-1/\gamma_k}, \quad \eta \ge \eta_1, \tag{5.37}$$

where the function $q_k(\eta)$ is weakly varying compared with the function
$\bigl[1 + \gamma_k a_k(\eta - b_k)^{c_k}\bigr]^{-1/\gamma_k}$ and $a_k > 0$, $b_k$, $c_k > 0$, and $\gamma_k > 0$ are suitable
constants that in general will be dependent on $k$. Note that the values $c_k = 1$
and $q_k(\eta) = 1$ correspond to the asymptotic limit, which is then a special case
of the general expression given in Eq. (5.37). Another method to account for
subasymptotic effects has been proposed by Eastoe and Tawn (2012), building on
ideas developed by Tawn (1990), Ledford and Tawn (1996), and Heffernan and
Tawn (2004). In this approach, the asymptotic form of the marginal distribution of
exceedances is kept, but it is modified by a multiplicative factor accounting for the
dependence structure of exceedances within a cluster.
An alternative form to Eq. (5.37) would be to assume that

$$\varepsilon_k(\eta) = \Bigl[1 + \gamma_k\bigl(a_k(\eta - b_k)^{c_k} + d_k(\eta)\bigr)\Bigr]^{-1/\gamma_k}, \quad \eta \ge \eta_1, \tag{5.38}$$

where the function $d_k(\eta)$ is weakly varying compared with the function
$a_k(\eta - b_k)^{c_k}$. However, for estimation purposes, the form given by Eq. (5.37) appears to be
preferable as it leads to somewhat simpler estimation procedures.
For practical identification of the ACER functions given by Eq. (5.37), it is
expedient to assume that the unknown function $q_k(\eta)$ varies sufficiently slowly in
the tail region to be replaced by a constant. In general, $q_k(\eta)$ is not constant, so, as
in the Gumbel case, this amounts to assuming that $q_k(\eta)$ can be replaced by
a constant for $\eta \ge \eta_1$, for an appropriate choice of tail marker $\eta_1$. For simplicity
of notation, in the following we shall suppress the index $k$ on the ACER functions,
which will then be written as

$$\varepsilon(\eta) = q\bigl[1 + \tilde{a}(\eta - b)^c\bigr]^{-\xi}, \quad \eta \ge \eta_1, \tag{5.39}$$

where $\xi = 1/\gamma$, $\tilde{a} = a\gamma$.

For the analysis of data, first the tail marker $\eta_1$ is provisionally identified from
visual inspection of the log plot $(\eta, \ln \hat{\varepsilon}(\eta))$. The value chosen for $\eta_1$ corresponds to
the beginning of regular tail behavior in a sense to be discussed below.

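The optimization process to estimate the parameters is done relative to the log
plot, as for the Gumbel case. The mean square error function to be minimized is in
the general case written as

$$F(\tilde{a}, b, c, q, \xi) = \sum_{j=1}^{J} w_j \bigl[\log \hat{\varepsilon}(\eta_j) - \log q + \xi \log\bigl(1 + \tilde{a}(\eta_j - b)^c\bigr)\bigr]^2, \tag{5.40}$$

where $w_j$ is a weight factor as defined previously and the sum runs over the $J$ levels
$\eta_1 < \ldots < \eta_J$ at which the ACER function has been estimated.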

An option for estimating the five parameters $\tilde{a}, b, c, q, \xi$ is again to use the
Levenberg–Marquardt least squares optimization method, which can be simplified
also in this case by observing that if $\tilde{a}$, $b$, and $c$ are fixed in Eq. (5.40), the
optimization problem reduces to a standard weighted linear regression problem.
That is, with $\tilde{a}$, $b$, and $c$ fixed, the optimal values of $\xi$ and $\log q$ are found using
closed-form weighted linear regression formulas in terms of $w_j$, $y_j = \log \hat{\varepsilon}(\eta_j)$
and $x_j = \log(1 + \tilde{a}(\eta_j - b)^c)$.
It is obtained that the optimal values of $\xi$ and $\log q$ are given by relations
similar to Eqs. (5.34) and (5.35). To calculate the final optimal set of parameters,
the Levenberg–Marquardt method may then be used on the function
$\tilde{F}(\tilde{a}, b, c) = F(\tilde{a}, b, c, q^*(\tilde{a}, b, c), \xi^*(\tilde{a}, b, c))$ to find the optimal values $\tilde{a}^*$, $b^*$, and $c^*$, and
then the corresponding $\xi^*$ and $q^*$ can be calculated. The optimal values of the
parameters may, e.g., also be found by a sequential quadratic programming (SQP)
method (Numerical Algorithms Group 2010). The application of this general case
formulation of the ACER method to financial risk issues, which often require heavy
tailed distributions, is discussed in Chap. 7.
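
For reference, the relations "similar to Eqs. (5.34) and (5.35)" can be written out by repeating the weighted regression argument with the regressor of the general case. The following is a reconstruction under that assumption, not a quotation: with $x_j = \log\bigl(1 + \tilde{a}(\eta_j - b)^c\bigr)$ and $y_j = \log \hat{\varepsilon}(\eta_j)$, the model in Eq. (5.40) reads $y_j \approx \log q - \xi x_j$, so that

$$\xi^*(\tilde{a}, b, c) = -\frac{\sum_j w_j (x_j - \bar{x})(y_j - \bar{y})}{\sum_j w_j (x_j - \bar{x})^2}, \qquad \log q^*(\tilde{a}, b, c) = \bar{y} + \xi^*(\tilde{a}, b, c)\,\bar{x},$$

where $\bar{x}$ and $\bar{y}$ are the $w_j$-weighted means of $x_j$ and $y_j$, and the sums run over the levels used in Eq. (5.40).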
For the construction of confidence intervals, the procedure discussed at the end
of the previous section can easily be adapted to the present case.
Chapter 6
Some Practical Aspects of Extreme Value Analyses

6.1 Introduction

As stated in Chaps. 1 and 5, extreme value statistics, even in applications, is

generally based on asymptotic results. This is done either by assuming that the
epochal extremes, for example, yearly extreme wind speeds at a given location,
are distributed according to the generalized (asymptotic) extreme value distribution
with unknown parameters to be estimated on the basis of the observed data, cf.
Chap. 2, or by assuming that the exceedances above high thresholds follow a
generalized (asymptotic) Pareto distribution with parameters that are estimated from
the data, cf. Chap. 3. With the ACER method now available, the performance of
these three methods on simulated or measured data may be compared. Note that all
calculations of the empirical ACER functions in this chapter were performed using
the ACER program package for Matlab (Karpa 2012). This package also allows
for optimized fitting of parametric functions for prediction of long return period
extreme values for the case of asymptotic Gumbel distributions, cf. Sect. 5.5, which
totally dominates engineering applications.
The first example in this chapter deals with synthetic data, which allows us to
control the exact result to be predicted by the different methods. The second example
looks at measured wind speed data at three locations along the coast of Norway.
The measurement periods range from 12 to 16 years. The goal would then,
e.g., be to predict a 100-year return period value for the wind speed on the basis of
these data. This represents a classical problem in wind engineering. For other
engineering problems of this kind where the ACER method has been applied,
cf., e.g., Gaidai et al. (2016, 2018), Yu et al. (2020), and Sinsabvarodom
et al. (2022).


6.2 Extreme Value Prediction for Synthetic Data

In this section the performance of the ACER method and also the 95% CI estimation
will be illustrated. Twenty years of synthetic wind speed data are analyzed,
amounting to 2000 data points, which is not much for detailed statistics.
However, this case may represent a real situation when nothing but a limited data
sample is available. In this case it would appear crucial to provide extreme value
estimates utilizing all data available. As will be demonstrated, the tail extrapolation
technique proposed for the ACER method performs better on average than the
asymptotic POT or Gumbel methods.
The extreme value statistics will be analyzed by application to synthetic data for
which the exact extreme value statistics can be calculated (Naess and Clausen 2001).
In particular, it is assumed that the underlying (normalized) stochastic process $Z(t)$
is stationary and Gaussian with mean value zero and standard deviation equal to
one. It is also assumed that the mean zero upcrossing rate $\nu^+(0)$ is such that the
product $\nu^+(0)T = 10^3$ where $T = 1$ year, which seems to be typical for the wind
speed process. Using the Poisson assumption, the distribution of the yearly extreme
value of $Z(t)$ is then calculated by the formula

$$F^{1\mathrm{yr}}(\eta) = \exp\bigl\{-\nu^+(\eta)T\bigr\} = \exp\Bigl\{-10^3 \exp\Bigl(-\frac{\eta^2}{2}\Bigr)\Bigr\}, \tag{6.1}$$

where $T = 1$ year and $\nu^+(\eta)$ is the mean upcrossing rate per year, and $\eta$ is the
scaled wind speed. The 100-year return period value $\eta^{100\mathrm{yr}}$ is then calculated from
the relation $F^{1\mathrm{yr}}(\eta^{100\mathrm{yr}}) = 1 - 1/100$, which gives $\eta^{100\mathrm{yr}} = 4.80$.
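
The value 4.80 follows from inverting Eq. (6.1) in closed form. A short check, with a function name chosen only for this example:

```python
import math


def return_level(years, nu0_T=1.0e3):
    """Solve exp(-nu0_T * exp(-eta^2 / 2)) = 1 - 1/years for eta, cf. Eq. (6.1)."""
    p = 1.0 - 1.0 / years
    return math.sqrt(2.0 * math.log(nu0_T / -math.log(p)))


print(round(return_level(100), 2))  # 4.8
```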
The Monte Carlo simulated data to be used for the synthetic example are gen-
erated based on the observation that the peak events extracted from measurements
of the wind speed process are usually separated by 3–4 days. This is done to obtain
approximately independent data, as required by the POT method. In accordance
with this, peak event data are generated from the extreme value distribution

$$F^{3\mathrm{d}}(\eta) = \exp\Bigl\{-q \exp\Bigl(-\frac{\eta^2}{2}\Bigr)\Bigr\}, \tag{6.2}$$

where $q = \nu^+(0)T = 10$, which corresponds to $T = 3.65$ days, and
$F^{1\mathrm{yr}}(\eta) = \bigl(F^{3\mathrm{d}}(\eta)\bigr)^{100}$.
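
One way to generate such peak event data is inverse-transform sampling from Eq. (6.2). The sketch below is an assumption about how this could be done, not a description of the procedure actually used for the book: the function names and the seed are arbitrary, and the small probability mass $e^{-q}$ that Eq. (6.2) leaves at $\eta \le 0$ is simply placed at zero.

```python
import math
import random

Q = 10.0  # q = nu+(0) * T for T = 3.65 days, as stated in the text


def f3d(eta, q=Q):
    """Eq. (6.2): distribution function of the 3.65-day maximum."""
    return math.exp(-q * math.exp(-0.5 * eta * eta))


def sample_peak(rng, q=Q):
    """Draw one peak value from Eq. (6.2) by inverting the distribution function."""
    u = rng.random()
    if u <= math.exp(-q):
        # Assumption: the mass exp(-q) not covered by eta > 0 is lumped at 0
        return 0.0
    t = -math.log(u) / q                      # t = exp(-eta^2 / 2), between 0 and 1
    return math.sqrt(max(0.0, -2.0 * math.log(t)))


rng = random.Random(1)                        # fixed seed, arbitrary choice
peaks = [sample_peak(rng) for _ in range(20 * 100)]     # 20 years, 100 peaks per year
annual_maxima = [max(peaks[100 * y:100 * (y + 1)]) for y in range(20)]
print(len(peaks), max(annual_maxima))
```

The relation $F^{1\mathrm{yr}} = (F^{3\mathrm{d}})^{100}$ holds because $100\,q = 10^3$.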
Since the data points (i.e., the $T = 3.65$ days maxima) are independent, $\varepsilon_k(\eta)$ is
independent of $k$. Therefore, $k = 1$ is chosen. Since there are 100 data points from each
year, the data amount to 2000 data points. For estimation of a 95% confidence interval
for each value of the ACER function $\varepsilon_1(\eta)$ for the chosen range of $\eta$-values, the
required standard deviation in Eq. (11.37) was based on 20 estimates of the ACER
function using the yearly data. This provided a 95% confidence band on the ACER
function based on 2000 data. From these data, the predicted 100 year return level is
obtained from $\hat{\varepsilon}_1(\eta^{100\mathrm{yr}}) = 10^{-4}$. A nonparametric bootstrapping method was also
used to estimate a 95% confidence interval based on 1000 resamples of size 2000.
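
The target level $10^{-4}$ can be traced back to Eq. (5.21) with $k = 1$ and the $N = 100$ data points of one year. As a short check under these assumptions:

$$\exp\bigl\{-100\,\varepsilon_1(\eta^{100\mathrm{yr}})\bigr\} = 1 - \frac{1}{100} \quad\Longrightarrow\quad \varepsilon_1(\eta^{100\mathrm{yr}}) = -\frac{\ln 0.99}{100} \approx 1.005 \times 10^{-4} \approx 10^{-4}.$$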
The POT prediction of the 100 year return level was based on using maximum
likelihood estimates (MLEs) of the parameters in Eq. (3.3) for a specific choice
of threshold. The 95% confidence interval was obtained from the parametrically
bootstrapped density of the POT estimate for the given threshold. A sample of 1000
data sets was used. One of the unfortunate features of the POT method is that the
estimated 100 year value may vary significantly with the choice of threshold, so also
for the synthetic data. We have followed the standard recommended procedures for
identifying a suitable threshold (Coles 2001).
Note that in spite of the fact that the true asymptotic distribution of exceedances
is the exponential distribution Eq. (3.4), the POT method used here is based on
adopting Eq. (3.3). The reason is simply that this is the recommended procedure
(Coles 2001). This is somewhat unfortunate but understandable, since
the GP distribution provides greater flexibility in terms of curve fitting. If the
correct asymptotic distribution of exceedances had been used in this example, poor
results for the estimated return period values would have been obtained. The price to pay
for using the GP distribution is that the estimated parameters may easily lead to an
asymptotically inconsistent distribution.
The 100 year return level predicted by the Gumbel method was based on
using the method of moments for parameter estimation on the sample of 20
yearly extremes. This choice of estimation method is due to the small sample of
extreme values. The 95% confidence interval was obtained from the parametrically
bootstrapped density of the Gumbel prediction. This was based on a sample of
10,000 data sets of 20 yearly extremes. The results obtained by the method
of moments were compared with the corresponding results obtained by using the
maximum likelihood method. While there were individual differences, the overall
picture was one of very good agreement.
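
For readers who want to reproduce the Gumbel step, the method of moments can be sketched as follows. The standard moment relations for the Gumbel distribution are used (standard deviation $= \pi\beta/\sqrt{6}$, mean $= \mu + \gamma_E \beta$); the function names and the ten sample maxima are invented, and the use of the $n-1$ sample standard deviation is an assumption of this example.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_mom(maxima):
    """Method-of-moments estimates (location mu, scale beta) of a Gumbel distribution."""
    mean = statistics.fmean(maxima)
    std = statistics.stdev(maxima)            # assumption: n-1 denominator
    scale = std * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_return_level(loc, scale, years):
    """Level exceeded on average once per `years` yearly maxima."""
    p = 1.0 - 1.0 / years
    return loc - scale * math.log(-math.log(p))


# Hypothetical yearly maxima, for illustration only
maxima = [3.9, 4.1, 3.7, 4.4, 4.0, 3.8, 4.2, 4.6, 3.6, 4.0]
loc, scale = gumbel_mom(maxima)
print(loc, scale, gumbel_return_level(loc, scale, 100))
```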
In order to get an idea about the performance of the ACER, POT, and Gumbel
methods, 100 independent 20-year MC simulations as discussed above were done.
Table 6.1 compares predicted values and confidence intervals for a selection of 10
cases together with average values over the 100 simulated cases. It is seen that the
average of the 100 predicted 100 year return levels is slightly better for the ACER
method than for both the POT and the Gumbel methods. But more significantly, the
range of predicted 100 year return levels by the ACER method is 4.34–5.36, while
the same for the POT method is 4.19–5.87 and for the Gumbel method 4.41–5.71.
Hence, in this case the ACER method performs consistently better than both these
methods. It is also observed from the estimated 95% confidence intervals that the
ACER method, as implemented in this book, provides slightly higher accuracy than
the other two methods. Lastly, it is pointed out that the confidence intervals of the
100 year return level values by the ACER method obtained by either the simplified
extrapolated confidence band approach or nonparametric bootstrapping are very
similar except for a slight mean shift. As a final comparison, the 100 bootstrapped
confidence intervals obtained for the ACER and Gumbel methods missed the target
value three times, while for the POT method this number was 18.

Table 6.1 100-year return level estimates $\hat{\eta}^{100\mathrm{yr}}$ (columns "Est.") and 95% CI (BCI = CI by bootstrap) for A = ACER,
G = Gumbel, and P = POT. Exact value = 4.80

Sim. no.  A Est.  A CI          A BCI         G Est.  G BCI         P Est.  P BCI
1         5.07    (4.67, 5.21)  (4.69, 5.42)  4.41    (4.14, 4.73)  4.29    (4.13, 4.52)
10        4.65    (4.27, 4.94)  (4.37, 5.03)  4.92    (4.40, 5.58)  4.88    (4.42, 5.40)
20        4.86    (4.49, 5.06)  (4.44, 5.19)  5.04    (4.54, 5.63)  5.04    (4.48, 5.74)
30        4.75    (4.22, 5.01)  (4.33, 5.02)  4.75    (4.27, 5.32)  4.69    (4.24, 5.26)
40        4.54    (4.20, 4.74)  (4.27, 4.88)  4.80    (4.31, 5.39)  4.73    (4.19, 5.31)
50        4.80    (4.35, 5.05)  (4.42, 5.14)  4.91    (4.41, 5.50)  4.79    (4.31, 5.34)
60        4.84    (4.36, 5.20)  (4.48, 5.19)  4.85    (4.36, 5.43)  4.71    (4.32, 5.23)
70        5.02    (4.47, 5.31)  (4.62, 5.36)  4.96    (4.47, 5.53)  4.97    (4.47, 5.71)
80        4.59    (4.33, 4.81)  (4.38, 4.98)  4.76    (4.31, 5.31)  4.68    (4.15, 5.27)
90        4.84    (4.49, 5.11)  (4.60, 5.30)  4.77    (4.34, 5.32)  4.41    (4.23, 4.64)
100       4.62    (4.29, 5.05)  (4.45, 5.09)  4.79    (4.31, 5.41)  4.53    (4.05, 4.88)
Av. 100   4.82    (4.41, 5.09)  (4.48, 5.18)  4.84    (4.37, 5.40)  4.72    (4.27, 5.23)

Fig. 6.1 Synthetic data ACER $\hat{\varepsilon}_1$: Monte Carlo simulation (*); optimized curve fit (solid line); empirical 95% confidence band (dashed lines); and optimized confidence band (dotted lines). Tail marker $\eta_1 = 2.3$. [Plot: ACER$_1(\eta)$ on a logarithmic axis from $10^{-5}$ to $10^{-1}$ versus $\eta$ from 2.5 to 5.5; the predicted value 4.85 is marked.]

An example of the ACER plot and results obtained for one set of data is
presented in Fig. 6.1. The predicted 100-year value is 4.85 with a predicted 95%
confidence interval (4.45, 5.09). Figure 6.2 presents POT predictions based on
MLE for different thresholds in terms of the number n of data points above the
threshold. The predicted value is 4.7 at .n = 204, while the 95% confidence interval
is (4.25, 5.28). The same data set as in Fig. 6.1 was used. This was also used for the
Gumbel plot shown in Fig. 6.3. In this case the predicted value based on the method
of moments (MM) is $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{MM}} = 4.75$ with a parametric bootstrapped 95% confidence
interval of (4.34, 5.27). Prediction based on the Gumbel–Lieblein BLUE method
(GL), cf., e.g., Cook (1985), is $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{GL}} = 4.73$ with a parametric bootstrapped 95%
confidence interval equal to (4.35, 5.14).

Fig. 6.2 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value based on 20-year synthetic data as a function of the number $n$ of data points above the threshold. The return level estimate = 4.7 at $n = 204$. [Plot: $\hat{\eta}^{100\mathrm{yr}}$ from 4.6 to 4.76 versus $n$ from 100 to 250.]

Fig. 6.3 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value based on 20-year synthetic data. Lines are fitted by the method of moments (solid line) and the Gumbel–Lieblein BLUE method (dash-dotted line). The return level estimate by the method of moments is 4.75 and by the Gumbel–Lieblein BLUE method is 4.73. [Plot: $-\ln(\ln((N+1)/k))$ from −1 to 5 versus $\eta$ from 3.6 to 4.8.]

6.3 Measured Wind Speed Data

In this section we analyze real wind speed data, measured at three weather stations
off the coast of Norway: at Torsvåg Fyr weather station (station number 90800), Sula
weather station (station number 65940), and Obrestad Fyr weather station (station
number 44080), cf. Karpa and Naess (2013). Figure 6.4 shows the geographical
position of each station. The hourly maximum of the 3-second wind gust (10 meters
above the ground) was recorded during 13 years (1997–2010) at the first station,
12 years (1998–2010) at the second, and 16 years (1994–2010) at the third station
(Norwegian Meteorological Institute 2012).

Fig. 6.4 Map of Norway with marked weather stations

Extreme wind speed prediction is an important issue for design of structures
exposed to the weather variations. Significant efforts have been devoted to the
problem of predicting extreme wind speeds on the basis of measured data by various
authors over several decades; see, e.g., Cook (1982), Naess (1998a), Palutikof et al.
(1999), and Perrin et al. (2006) for extensive references to previous work.
The objective is to estimate a 100-year wind speed for each of these locations.
Variation in the wind speed caused by seasonal variations in the wind climate during
the year makes the wind speed a nonstationary process on the scale of months.
Moreover, due to global climate change, yearly statistics may vary on the scale
of years. The latter is, however, a slow process, and for the purpose of long-term
prediction, it is assumed here that within a time span of 100 years, a quasi-stationary
model of the wind speeds applies. This may not be entirely true, of course, but for
the three stations under study, no apparent trend in the wind speed was detected over
the period of registration.
Figures 6.5, 6.6, and 6.7 show the time series observed at
each station. A conspicuous feature of the time series is the clear seasonal variation
of the wind speeds. Note the paucity of data at certain times at the Sula station.
The practical consequence of this is to shorten the effective length of the time
series. It is of some importance to note that the samples from Torsvåg Fyr and
Obrestad Fyr stations contain outlying observations, such as 45.3 m/s on June 6,
1997, 43.7 m/s on May 10, 2001, and 60.8 m/s on September 9, 2008 for Obrestad
Fyr station, and 45.3 m/s on July 12, 1998 and July 31, 1999 for Torsvåg Fyr
station. Such wind speeds are most likely spurious for the corresponding time
periods and latitudes. Moreover, observations from the weather stations in the close
neighborhood of Obrestad Fyr confirm that no heavy storm occurred during the
period in question, while no information from the stations in the neighborhood of
Torsvåg Fyr is available. Therefore, the outliers from Obrestad Fyr station have to
be rejected, while the outliers from Torsvåg Fyr are kept, mainly in order to show
that the ACER method is largely insensitive to observations of this kind.

Fig. 6.5 Observations from Torsvåg Fyr station, with outliers marked. [Plot: wind speed in m/s versus time, 1998–2010]

Fig. 6.6 Observations from Sula station. [Plot: wind speed in m/s versus time, 1999–2010]

Fig. 6.7 Observations from Obrestad Fyr station. [Plot: wind speed in m/s versus time, 1996–2010]

Fig. 6.8 ACER estimates for different degrees of conditioning ($k = 1, 2, 4, 24, 48, 72, 96$). Torsvåg Fyr wind gust statistics based on 13 years of hourly data with outliers included, $\sigma = 5.30$. [Plot: $\mathrm{ACER}_k(\eta)$ on a logarithmic scale versus $\eta/\sigma$]

In Figs. 6.8, 6.9, and 6.10, $\hat{\varepsilon}_k(\eta)$ is plotted versus the normalized wind speed $\eta/\sigma$ for different
values of k for the three stations. The figures reveal that there is significant
dependence between the data, but that this is largely accounted for by $k = 48$
since there is a marked degree of convergence of the $\hat{\varepsilon}_k(\eta)$ for $k \ge 48$. Note
that, for hourly observations, $k = 48$ corresponds to exceedances separated by
2 days. For $k \ge 96$, that is, data declustered over 4 days, full convergence
has been achieved for all practical purposes. However, Figs. 6.8, 6.9, and 6.10 also
reveal that for extreme value estimation $\hat{\varepsilon}_1(\eta)$ can be used since the ACER functions
all converge in the far tail. This clearly demonstrates the power of an ACER function
plot as a diagnostic tool to decide on the value of k needed for extreme value
estimation in a particular case. In spite of significant dependence effects for the
lower wind speeds, for the extreme wind speeds this is largely absent. This makes
it possible to choose $k = 1$, which makes much more data available for estimation,
with a possible reduction of uncertainty in estimation as a result.
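
The empirical ACER values plotted in Figs. 6.8, 6.9, and 6.10 can be illustrated with a short Python sketch. It is not the program used for the analyses reported here. It assumes a simple counting estimator in which an exceedance of the level $\eta$ at time j is counted only when the preceding $k - 1$ observations are all at or below $\eta$, and it assumes normalization by $N - k + 1$; the precise definitions are those of Chap. 5. The function name `acer_estimate` is hypothetical.

```python
def acer_estimate(x, k, eta):
    """Empirical ACER value for conditioning order k at level eta.

    Counts indices j where x[j] > eta while the k-1 preceding values
    are all <= eta, and divides by the number of admissible indices.
    (The normalization by N - k + 1 is an assumption of this sketch.)
    """
    n = len(x)
    if k < 1 or k > n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    count = 0
    for j in range(k - 1, n):
        # For k = 1 the slice is empty, so every exceedance is counted.
        if x[j] > eta and all(v <= eta for v in x[j - k + 1:j]):
            count += 1
    return count / (n - k + 1)


# Tiny hand-checkable series (illustrative numbers, not station data).
series = [1, 3, 1, 3, 3, 1, 1, 3]
for k in (1, 2, 3):
    print(k, acer_estimate(series, k, eta=2))
```

With hourly data, calling such a function with k = 48 conditions on the previous 2 days of observations; evaluating it on a grid of levels for several k gives curves of the kind shown in the figures.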

Fig. 6.9 ACER estimates for different degrees of conditioning ($k = 1, 2, 4, 24, 48, 72, 96$). Sula wind gust statistics based on 12 years of hourly data, $\sigma = 5.49$. [Plot: $\mathrm{ACER}_k(\eta)$ on a logarithmic scale versus $\eta/\sigma$]

Fig. 6.10 ACER estimates for different degrees of conditioning ($k = 1, 2, 4, 24, 48, 72, 96$). Obrestad Fyr wind gust statistics based on 16 years of hourly data, $\sigma = 5.47$. [Plot: $\mathrm{ACER}_k(\eta)$ on a logarithmic scale versus $\eta/\sigma$]

Figures 6.11, 6.12, and 6.13 show the plots of the optimized fit to the data
for $\hat{\varepsilon}_1(\eta)$ for each station. The 100-year return level value $\eta^{100\mathrm{yr}}$ and its 95% CI
are estimated parametrically. For the data with outliers from Torsvåg Fyr weather
station, $\hat{\eta}^{100\mathrm{yr}} = 47.46$ m/s and the 95% CI = (42.11, 50.71) with parameters of the
optimal curve: q = 0.44, b = 9.02, a = 0.1, c = 1.33. The predicted 100-year return
wind speed and 95% CI for the data without outliers are $\hat{\eta}^{100\mathrm{yr}} = 47.21$ m/s; CI =
(39.94, 50.60) with optimal parameters q = 0.47, b = 8.49, a = 0.09, c = 1.36. In the
case of the Sula wind station, $\hat{\eta}^{100\mathrm{yr}} = 46.33$ m/s; CI = (43.41, 47.77), where the
parameters of the optimal curve are q = 0.58, b = 0, a = 0.005, c = 2.07. Finally, for
the Obrestad Fyr station, the 100-year return level value is $\hat{\eta}^{100\mathrm{yr}} = 48.38$ m/s with
confidence interval (43.18, 50.74) and optimal parameters q = 0.29, b = 12.34, a =
0.13, c = 1.27.
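
As a rough plausibility check of how the fitted parameters relate to the predicted return level, the following Python sketch inverts a tail model of the form $\varepsilon_1(\eta) = q \exp\{-a(\eta - b)^c\}$. Both this functional form (taken here from the general ACER literature, cf. Chap. 5) and the convention that the return level is the $\eta$ at which one exceedance is expected in 100 years of hourly data are assumptions of the sketch; the function name is hypothetical. Because the quoted parameters are rounded, the reported values can be reproduced only approximately.

```python
import math

HOURS_PER_YEAR = 365.25 * 24  # hourly observations, k = 1


def acer_return_level(q, a, b, c, years=100.0, obs_per_year=HOURS_PER_YEAR):
    """Invert the assumed tail form eps(eta) = q * exp(-a * (eta - b)**c).

    The return level is taken as the eta for which the expected number of
    exceedances in `years` equals one: eps(eta) = 1 / (years * obs_per_year).
    """
    n_obs = years * obs_per_year
    return b + (math.log(q * n_obs) / a) ** (1.0 / c)


# Rounded parameters quoted above for Torsvåg Fyr (outliers included).
eta_100 = acer_return_level(q=0.44, a=0.1, b=9.02, c=1.33)
print(f"{eta_100:.2f} m/s")
```

For the Torsvåg Fyr parameters the result lies within a few tenths of a m/s of the reported 47.46 m/s, which is the level of agreement one can expect from parameters given to two significant digits.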

Fig. 6.11 Plot for Torsvåg Fyr of $\hat{\varepsilon}_1(\eta)$ against $\eta/\sigma$ on a logarithmic scale with $\eta_1 = 2.07\sigma$ for the optimized parameter values with the 95% confidence band, $\sigma = 5.30$. [Plot: $\mathrm{ACER}_1(\eta)$ versus $\eta/\sigma$; marked value 8.95]

Fig. 6.12 Plot for Sula of $\hat{\varepsilon}_1(\eta)$ against $\eta/\sigma$ on a logarithmic scale with $\eta_1 = 2.36\sigma$ for the optimized parameter values with the 95% confidence band, $\sigma = 5.49$. [Plot: $\mathrm{ACER}_1(\eta)$ versus $\eta/\sigma$; marked value 8.43]

The annual maxima method is applied to the wind gust data to compare the
estimated 100-year return level values. The application of this method to the wind
data is based on the premise that the time series of the yearly maxima can be
modeled as a sequence of iid random variables, which would seem to be a reasonable
assumption to make as a first approximation. The Gumbel estimate $\hat{\eta}^{100\mathrm{yr}}$ is based
on the method of moments (MM) and the Gumbel–Lieblein BLUE method (GL),
cf., e.g., Cook (1985). A computer program has been written in the Matlab language
to implement both methods. Figures 6.14, 6.15, and 6.16 present observed yearly
extremes extracted from the hourly data together with fitted straight lines on Gumbel
probability paper. The resulting 100-year return level values for the first station with
outliers included are $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{MM}} = 51.33$ m/s and $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{GL}} = 51.57$ m/s, while in the
case of rejected outlying observations, $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{MM}} = 44.31$ m/s and $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{GL}} = 45.84$ m/s.
For Sula and Obrestad Fyr stations, the 100-year return level values are $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{MM}} = 48.66$ m/s
with $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{GL}} = 52.9$ m/s and $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{MM}} = 48.59$ m/s with $\hat{\eta}^{100\mathrm{yr}}_{\mathrm{GL}} = 53.79$ m/s,
respectively.
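
The method-of-moments part of the annual maxima analysis is simple enough to sketch. The Python snippet below is an illustration only, not the Matlab program mentioned above. It uses the standard moment relations for the Gumbel distribution (scale $\beta = s\sqrt{6}/\pi$, location $\mu = \bar{x} - 0.5772\,\beta$) and defines the 100-year level as the value with annual exceedance probability 1/100. The yearly maxima in the example are made-up numbers, and the function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_mm_fit(annual_maxima):
    """Method-of-moments estimates (location mu, scale beta) of a Gumbel law."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)  # sample standard deviation
    beta = std * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_return_level(mu, beta, years=100.0):
    """Level whose annual exceedance probability is 1/years."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / years))


# Hypothetical yearly maxima in m/s (illustrative numbers, not station data).
maxima = [31.2, 35.8, 29.4, 38.1, 33.0, 36.5, 30.7,
          34.2, 41.3, 32.6, 37.4, 33.9, 35.1]
mu_hat, beta_hat = gumbel_mm_fit(maxima)
print(gumbel_return_level(mu_hat, beta_hat))
```

The Gumbel–Lieblein BLUE estimate is not sketched here, since it relies on tabulated coefficients that are not reproduced in this section.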

Fig. 6.13 Plot for Obrestad Fyr of $\hat{\varepsilon}_1(\eta)$ against $\eta/\sigma$ on a logarithmic scale with $\eta_1 = 2.5\sigma$ for the optimized parameter values with the 95% confidence band, $\sigma = 5.47$. [Plot: $\mathrm{ACER}_1(\eta)$ versus $\eta/\sigma$; marked value 8.85]

Fig. 6.14 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value by the annual maxima method. Lines are fitted by the method of moments (solid line) and the Gumbel–Lieblein BLUE method (dash-dotted line). Torsvåg Fyr wind speed statistics, 13 years of hourly data, $\sigma = 5.30$. [Plot: $-\ln(\ln((N+1)/k))$ versus $\eta/\sigma$; marked value 9.68]

Although the Gumbel–Lieblein BLUE method is considered one of
the best available conventional Gumbel methods, its application
requires tables of the BLUE coefficients, which are not easily available for annual
data with sample size $N > 24$. The observed results reveal that this method is sensitive
to outliers, which also applies to the method of moments. It is also noted that the
Gumbel–Lieblein BLUE method seems to have a tendency to overestimate predicted
return level values, while the method of moments seems to be reasonably stable for
the studied sets of data.

Fig. 6.15 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value by the annual maxima method. Lines are fitted by the method of moments (solid line) and the Gumbel–Lieblein BLUE method (dash-dotted line). Sula wind speed statistics, 12 years of hourly data, $\sigma = 5.49$. [Plot: $-\ln(\ln((N+1)/k))$ versus $\eta/\sigma$; marked values 8.85 and 9.63]

Fig. 6.16 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value by the annual maxima method. Lines are fitted by the method of moments (solid line) and the Gumbel–Lieblein BLUE method (dash-dotted line). Obrestad Fyr wind speed statistics, 16 years of hourly data, $\sigma = 5.47$. [Plot: $-\ln(\ln((N+1)/k))$ versus $\eta/\sigma$; marked values 8.88 and 9.84]

The POT method is also applied to the wind gust time series. At first sight, this
would seem to be an unwarranted approach since the time series of wind speeds
is conspicuously nonstationary. Efforts have been made to account for the seasonal
variations by explicit modelling of the parameters of the GP distribution, cf. Coles
(2001). However, this does not seem to be a widely adopted procedure. Instead, the
POT method is applied directly to the full time series, recognizing that the extracted
relevant data comes from a period of 3 or 4 months, which may be considered to
represent a more or less stationary period. By manipulating the time frames, this
period is considered to represent 1 year. This trick approximately circumvents the
problem of nonstationarity, and it is the approach adopted here.
Following the WAFO-group (2000), data were declustered beforehand. It was
done in such a way that peak events separated by 3.5 days or more were extracted
from the measured data and selected for the analysis to achieve approximate
independence of the exceedances (Naess 1998b). Figures 6.17, 6.18, and 6.19
present POT estimates of $\eta^{100\mathrm{yr}}$ for different threshold numbers based on maximum
likelihood estimation, cf. Chap. 3. Estimates were obtained by using the MATLAB
(2009) Statistics Toolbox routine gpfit. It is interesting to observe the unstable
characteristics of the estimates over a range of threshold values. While it is clearly of
interest to discuss methods for stabilizing the POT estimates, this issue is considered
to be outside the scope of this book.
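
The two steps that surround the maximum likelihood fit, declustering and conversion of fitted GP parameters into a return level, can be sketched as follows. This is a hedged illustration in Python, not the WAFO or MATLAB code used here. The declustering rule is a simple runs-type rule (exceedances closer than a minimum separation are merged into one cluster), which is an assumption about how "separated by 3.5 days or more" is implemented; the return level formula is the standard one for GP exceedances occurring at a given yearly rate. All function names are hypothetical, and the parameter fit itself (e.g. by a routine such as gpfit) is not shown.

```python
import math


def decluster_peaks(x, threshold, min_separation):
    """Return cluster maxima of exceedances over `threshold`.

    Exceedances closer than `min_separation` samples to the previous
    exceedance belong to the same cluster; only the largest value of
    each cluster is kept. For hourly data, 3.5 days is 84 samples.
    """
    peaks = []
    last_index = None
    for i, value in enumerate(x):
        if value <= threshold:
            continue
        if last_index is not None and i - last_index < min_separation:
            peaks[-1] = max(peaks[-1], value)  # still in the same cluster
        else:
            peaks.append(value)                # a new cluster starts
        last_index = i
    return peaks


def gp_return_level(u, sigma, gamma, rate_per_year, years=100.0):
    """T-year level for GP exceedances of u occurring rate_per_year times a year."""
    m = rate_per_year * years
    if abs(gamma) < 1e-12:
        return u + sigma * math.log(m)  # exponential limit of the GP law
    return u + sigma / gamma * (m ** gamma - 1.0)


toy = [0, 5, 6, 0, 0, 7, 0, 0, 0, 0, 0, 0, 8, 0]
print(decluster_peaks(toy, threshold=4, min_separation=3))
print(gp_return_level(u=0.0, sigma=1.0, gamma=0.1, rate_per_year=10.0))
```

Repeating the fit and the return level computation over a range of thresholds gives curves of the kind shown in Figs. 6.17, 6.18, and 6.19.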

Fig. 6.17 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value based on 13 years of wind data for Torsvåg Fyr station as a function of the number n of data points above the threshold. The return level estimate = 49.41 at n = 131, $\sigma = 5.30$. [Plot: $\eta^{100\mathrm{yr}}/\sigma$ versus n; marked value 9.32]

Fig. 6.18 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value based on 12 years of wind data for Sula station as a function of the number n of data points above the threshold. The return level estimate = 43.42 at n = 120, $\sigma = 5.49$. [Plot: $\eta^{100\mathrm{yr}}/\sigma$ versus n; marked value 7.9]

In Tables 6.2, 6.3, 6.4, and 6.5, the 100-year return period values are listed
together with the predicted 95% confidence intervals for all methods and each
station. In case of the annual maxima method, the 95% confidence intervals are
estimated from a parametric bootstrapping of the Gumbel estimates based on a
sample of size 10,000 data sets of 13, 12, and 16 yearly extremes. For the POT
method, the bootstrapped 95% confidence intervals were estimated by using the
MATLAB (2009) Statistics Toolbox routine bootstrp. A total of 10,000 samples
are generated by sampling with replacement from the observed exceedances above
high thresholds.
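
A parametric bootstrap of the Gumbel estimate can be sketched as below. The Python code is a minimal illustration under stated assumptions: the method-of-moments fit, a percentile interval, and inverse-transform sampling from the fitted Gumbel law. The exact interval construction used for the tables is not specified in this section, and the yearly maxima are made-up numbers. The example uses 2,000 replicates to keep the run short, whereas 10,000 are used in the text.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_and_predict(sample, years=100.0):
    """Gumbel method-of-moments fit; returns (mu, beta, return level)."""
    beta = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    mu = statistics.fmean(sample) - EULER_GAMMA * beta
    level = mu - beta * math.log(-math.log(1.0 - 1.0 / years))
    return mu, beta, level


def gumbel_draw(rng, mu, beta):
    """One Gumbel variate by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() lies in [0, 1)
        u = rng.random()
    return mu - beta * math.log(-math.log(u))


def parametric_bootstrap_ci(sample, n_boot=10_000, conf=0.95, seed=0):
    """Percentile CI for the return level from a parametric bootstrap."""
    rng = random.Random(seed)
    mu, beta, _ = fit_and_predict(sample)
    levels = []
    for _ in range(n_boot):
        # Synthetic sample of the same size from the fitted Gumbel law.
        synthetic = [gumbel_draw(rng, mu, beta) for _ in sample]
        levels.append(fit_and_predict(synthetic)[2])
    levels.sort()
    lo = levels[int((1.0 - conf) / 2.0 * n_boot)]
    hi = levels[int((1.0 + conf) / 2.0 * n_boot) - 1]
    return lo, hi


# Hypothetical yearly maxima in m/s (illustrative numbers, not station data).
maxima = [31.2, 35.8, 29.4, 38.1, 33.0, 36.5, 30.7,
          34.2, 41.3, 32.6, 37.4, 33.9, 35.1]
point = fit_and_predict(maxima)[2]
lo, hi = parametric_bootstrap_ci(maxima, n_boot=2000)
print(point, (lo, hi))
```

The nonparametric bootstrap used for the POT method differs only in the resampling step: the synthetic samples are drawn with replacement from the observed exceedances instead of from a fitted distribution.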

Fig. 6.19 The point estimate $\hat{\eta}^{100\mathrm{yr}}$ of the 100-year return period value based on 12 years of wind data for Obrestad Fyr station as a function of the number n of data points above the threshold. The return level estimate = 46.1 at n = 151, $\sigma = 5.47$. [Plot: $\eta^{100\mathrm{yr}}/\sigma$ versus n; marked value 8.43]

Table 6.2 Predicted 100-year return period levels for Torsvåg Fyr weather station by the ACER
method for different degrees of conditioning, annual maxima, and POT methods, respectively;
outliers are considered true observations

Method             Spec   $\hat{\eta}^{100\mathrm{yr}}$ (m/s)   95% CI ($\eta^{100\mathrm{yr}}$) (m/s)
ACER, various k    1      47.46      (42.11, 50.71)
                   2      48.18      (41.48, 51.31)
                   4      46.96      (42.25, 49.63)
                   24     48.36      (43.44, 51.63)
                   48     47.54      (43.46, 49.75)
                   72     47.44      (44.39, 48.79)
                   96     48.78      (44.53, 51.61)
Annual maxima      MM     51.33      (43.08, 61.57)
                   GL     51.57      (44.24, 60.67)
POT                –      49.41      (40.95, 59.42)

6.4 Extreme Value Prediction for a Narrow-Band Process

In engineering mechanics, a classical extreme response prediction problem is the
case of a lightly damped mechanical oscillator subjected to random forces. To
illustrate this prediction problem, the response process of a linear mechanical
oscillator driven by a Gaussian white noise will be investigated. Let $X(t)$ denote
the displacement response. The dynamic model can then be expressed as
$\ddot{X}(t) + 2\zeta\omega_e\dot{X}(t) + \omega_e^2 X(t) = W(t)$, where $\zeta$ = relative damping, $\omega_e$ = undamped
eigenfrequency, and $W(t)$ = a stationary Gaussian white noise (of suitable intensity).
By choosing a small value for $\zeta$, the response time series will exhibit narrow-band
characteristics, that is, the spectral density of the response process $X(t)$ will assume
significant values only over a narrow range of frequencies. This manifests itself by
producing a strong beating of the response time series, which means that the size
of the response peaks will change slowly in time, see Fig. 6.20. A consequence of
this is that neighbouring peaks are strongly correlated, and there is a conspicuous
clumping of the peak values. This gives rise to the problem of accurate prediction,
since the usual assumption of independent peak values is then violated.

Table 6.3 Predicted 100-year return period levels for Torsvåg Fyr weather station by the ACER
method for different degrees of conditioning, annual maxima, and POT methods, respectively;
outliers are rejected

Method             Spec   $\hat{\eta}^{100\mathrm{yr}}$ (m/s)   95% CI ($\eta^{100\mathrm{yr}}$) (m/s)
ACER, various k    1      47.21      (39.94, 50.60)
                   2      47.79      (41.13, 50.93)
                   4      46.32      (42.00, 49.04)
                   24     47.22      (43.26, 50.04)
                   48     46.38      (43.60, 48.19)
                   72     46.32      (44.24, 47.37)
                   96     47.80      (44.45, 49.95)
Annual maxima      MM     44.31      (39.36, 50.39)
                   GL     45.84      (40.72, 52.41)
POT                –      42.62      (39.01, 47.31)

Table 6.4 Predicted 100-year return period levels for Sula weather station by the ACER method
for different degrees of conditioning, annual maxima, and POT methods, respectively

Method             Spec   $\hat{\eta}^{100\mathrm{yr}}$ (m/s)   95% CI ($\eta^{100\mathrm{yr}}$) (m/s)
ACER, various k    1      46.33      (43.41, 47.77)
                   2      46.81      (44.08, 49.04)
                   4      47.99      (44.80, 50.57)
                   24     46.65      (44.10, 48.07)
                   48     46.83      (44.28, 48.03)
                   72     45.80      (43.01, 46.96)
                   96     45.69      (42.32, 47.01)
Annual maxima      MM     48.66      (41.58, 57.58)
                   GL     52.90      (44.29, 63.39)
POT                –      43.42      (39.07, 47.80)

Table 6.5 Predicted 100-year return period levels for Obrestad Fyr weather station by the ACER
method for different degrees of conditioning, annual maxima, and POT methods, respectively

Method             Spec   $\hat{\eta}^{100\mathrm{yr}}$ (m/s)   95% CI ($\eta^{100\mathrm{yr}}$) (m/s)
ACER, various k    1      48.38      (43.18, 50.74)
                   2      48.11      (42.38, 50.69)
                   4      48.81      (42.34, 51.59)
                   24     47.90      (42.87, 50.53)
                   48     48.90      (43.82, 50.72)
                   72     49.47      (44.06, 51.52)
                   96     48.55      (43.46, 49.96)
Annual maxima      MM     48.59      (42.10, 56.84)
                   GL     53.79      (46.16, 63.53)
POT                –      46.10      (41.00, 55.00)

Fig. 6.20 Part of the narrow-band response time series of the linear oscillator with fully sampled
and peak values indicated. [Plot: x(t) versus time in s, 60–120 s]

Many approximations have been proposed to deal with this correlation problem,
but no completely satisfactory solution has been presented. In this section it will
be shown that the ACER method solves this problem efficiently and elegantly in
a statistical sense. Figure 6.21 shows some of the ACER functions for the
example time series. It may be verified from Fig. 6.20 that there are approximately
32 sample points between two neighbouring peaks in the time series. To illustrate a
point, the time series consisting of all sample points has been analyzed. Usually, in
practice, only the time series obtained by extracting the peak values would be used
for the ACER analysis of a narrow-band process. In the present case, the first ACER
function is then based on assuming that all the sampled data points are independent,
which is obviously completely wrong. The second ACER function, which is based
on counting each exceedance with an immediately preceding non-exceedance, is
nothing but an upcrossing rate. Using this ACER function is largely equivalent to
assuming independent peak values. It is now interesting to observe that, e.g., the
25th ACER function can hardly be distinguished from the second ACER function.
In fact, the ACER functions after the second do not change appreciably until one
starts to approach the 32nd, which corresponds to hitting the previous peak value in
the conditioning process. So, the important information concerning the dependence
structure in the present time series seems to reside in the peak values, which may
not be very surprising. It is seen that the ACER functions show a significant change
in value as a result of accounting for the correlation effects in the time series. To
verify the full dependence structure in the time series, it is necessary to continue the
conditioning process down to at least the 64th ACER function. In the present case
there is virtually no difference between the 32nd and the 64th, which shows that the
dependence structure in this particular time series is captured almost completely by
conditioning on the previous peak value. It is interesting to contrast the method of
dealing with the effect of sampling frequency discussed here with that of Robinson
and Tawn (2000).

Fig. 6.21 Comparison between ACER estimates for different degrees of conditioning ($k = 1, 2, 25, 32, 64$) for the
narrow-band time series. [Plot: $\mathrm{ACER}_k(\eta)$ on a logarithmic scale versus $\eta/\sigma$]

Fig. 6.22 Comparison between ACER estimates for different degrees of conditioning ($k = 1, \ldots, 5$) based on
the time series of the peak values, cf. Fig. 6.20. [Plot: $\mathrm{ACER}_k(\eta)$ on a logarithmic scale versus $\eta/\sigma$]
To illustrate the results obtained by extracting only the peak values from the time
series, which would be the approach typically chosen in an engineering analysis,
the ACER plots for this case are shown in Fig. 6.22. Since there is an
almost one-to-one correspondence between upcrossings and peaks, the second ACER
function in Fig. 6.21 corresponds to the first ACER function in Fig. 6.22. Comparing
the two figures shows that the results are in very close agreement, once it is noted
that there is a factor of approximately 32 between corresponding ACER functions in
the two figures. This is because the time series of peak values contains about 32
times fewer data points than the original time series.
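
The factor of approximately 32 can be reproduced with a small deterministic experiment. The Python sketch below does not simulate the oscillator; as a stand-in it uses a sinusoid with 32 samples per cycle and a slowly varying amplitude, which is an assumption made purely for illustration. The counting estimator and its normalization by $N - k + 1$ are likewise assumptions of the sketch (the definitions are those of Chap. 5), and all names are hypothetical.

```python
import math


def acer(x, k, eta):
    """Counting estimate: exceedances of eta preceded by k-1 non-exceedances."""
    n = len(x)
    hits = sum(1 for j in range(k - 1, n)
               if x[j] > eta and all(v <= eta for v in x[j - k + 1:j]))
    return hits / (n - k + 1)


SAMPLES_PER_CYCLE = 32
N_CYCLES = 200
n = SAMPLES_PER_CYCLE * N_CYCLES

# Deterministic stand-in for a narrow-band response: a sinusoid whose
# amplitude varies slowly (over 20 cycles) between 0.5 and 1.5.
x = [(1.0 + 0.5 * math.sin(2 * math.pi * j / (20 * SAMPLES_PER_CYCLE)))
     * math.sin(2 * math.pi * j / SAMPLES_PER_CYCLE) for j in range(n)]

# Peak series: positive local maxima of the sampled signal.
peaks = [x[j] for j in range(1, n - 1)
         if x[j] > 0 and x[j - 1] < x[j] >= x[j + 1]]

eta = 1.2
full_k2 = acer(x, 2, eta)       # upcrossing-type count on all samples
peaks_k1 = acer(peaks, 1, eta)  # plain exceedance frequency of the peaks
ratio = peaks_k1 / full_k2
print(len(peaks), ratio)
```

Each cycle whose peak exceeds the level contributes exactly one upcrossing in the full series and one exceedance in the peak series, so the two counts agree and the ratio of the estimates reduces to the ratio of the series lengths.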
Chapter 7
Estimation of Extreme Values
for Financial Risk Assessment

7.1 Introduction

In 1991 the Norwegian government decided that the power market should be
deregulated allowing for power trading. Following this, the Nord Pool market
was established in 1996 as a common electricity market for Norway and Sweden.
Finland followed into the Nord Pool market area in 1998, and western and eastern
Denmark joined in 1999 and 2000, respectively. In 2002 Nord Pool spot was
established and today runs the spot (1 day ahead) market for electricity in Norway,
Sweden, Denmark, Finland, and Estonia. Today, a large part of the consumption of
electricity in this market is traded through Nord Pool spot, and the spot market is
the most important.
Due to the difficulties and costs of storing electricity (basically it cannot be
stored), the observed spot price is highly volatile and the changes of these spot
prices are often very large. Some stylized facts for the price changes in these spot
data are that they display very heavy tails, significant serial correlation, seasonality,
and volatility clustering (Weron 2006). The seasonality comes from the fact that the
electricity cannot be stored, so the price during the high demand periods (during
the day, weekdays, and the winter) tends to be higher. Most of the electricity in the
market is produced by hydropower plants, so prices are also highly influenced by
precipitation, giving higher electricity prices in dry years. Extreme price changes or
spikes are observed in electricity markets around the world (Escribano et al. 2002)
and have been studied extensively over the last years. Earlier work on modelling
these spikes as an error process (Contreras et al. 2003; Garcia et al. 2005; Swider
and Weber 2007) applying extreme value theory will be used here (Byström 2005;
Chan and Gray 2006). This chapter largely follows the work reported by Dahlen
et al. (2015).
In the examples presented, a conditional extreme value approach will be used,
as suggested by McNeil and Frey (2000), to estimate tail quantiles for the return
distribution and thus get a measure of the Value-at-Risk (VaR). They suggested
that a Generalized Pareto (GP) distribution is to be fitted, with the use of the POT
method, to a data set of residuals from an AR-GARCH (Brockwell and Davis 2002)
filtering process. Here, the ACER method will also be used to estimate the tails of
these residuals, and the obtained results will be compared to those obtained with
the use of the POT method. The methods will be compared in terms of both in-sample and
out-of-sample performance. A reason for using the ACER method over the POT
method is that there are essentially no asymptotic arguments and no requirements
about independent data in the derivation of the ACER method, which in turn leads
to better use of the data.

7.2 Value-at-Risk

To quantify the risk associated with the models applied, it was decided to use the
Value-at-Risk (VaR) metric. The VaR metric, for the next day, is defined for a
probability $\alpha$ as the value the loss will not exceed with probability $\alpha$. For a series of
stochastic variables $X_t$, which would here be the future price of electricity, the VaR
will be defined as

$$\mathrm{Prob}(X_t \le \mathrm{VaR}_\alpha) = \alpha. \tag{7.1}$$

So, the VaR is directly related to the quantiles of the distribution of electricity prices.
For the models used in this chapter, the VaR is straightforward to calculate. For the
POT method, the VaR is simply obtained by inverting the GP distribution for a given
probability. This gives the VaR as, cf. Eq. (3.3),

$$\mathrm{VaR}_\alpha = u + \frac{\sigma}{\gamma}\left[(1 - \alpha)^{-\gamma} - 1\right], \tag{7.2}$$

where u denotes the chosen threshold value, and $\sigma$ and $\gamma$ are the scale and shape
parameters of the GP distribution of exceedances of u, respectively. For the ACER
method, the VaR is found just as easily by inverting the estimated ACER function
for the desired exceedance level $\alpha$, cf. Eq. (5.39). The VaR then becomes

$$\mathrm{VaR}_\alpha = b + \left[\frac{1}{\tilde{a}}\left(\left(\frac{q}{1-\alpha}\right)^{\xi} - 1\right)\right]^{1/c}, \tag{7.3}$$

where $\tilde{a}$, b, c, q, and $\xi$ are the parameters estimated by fitting the assumed
parametric ACER function to the observed data. In the derivation of Eq. (7.3), the
approximation $\exp(-x) \approx 1 - x$ for small x is used. It should be noted that when
using the conditional approach, the VaR estimates need to be inserted into the
equation for the conditional quantiles. As a last reminder, it should be noted that for
the VaR to be interpreted as the unlikely loss, these two methods need to be fitted to
the left tail of the return distribution. So, using the VaR, a simple and straightforward
method to quantify the risk in the Nordic Power Exchange spot market is at hand.
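
The two VaR expressions are easy to evaluate numerically. The following Python sketch implements Eq. (7.2) and Eq. (7.3) as read here, and checks each against the tail model it is supposed to invert. The heavy-tailed ACER form $q[1 + \tilde{a}(\eta - b)^c]^{-1/\xi}$ used in the check is an assumption inferred from the shape of Eq. (7.3), not a quotation of Eq. (5.39). All parameter values are hypothetical and serve only to exercise the formulas.

```python
import math


def var_pot(alpha, u, sigma, gamma):
    """Eq. (7.2): quantile of the GP model for exceedances of u."""
    return u + sigma / gamma * ((1.0 - alpha) ** (-gamma) - 1.0)


def gp_survival(x, u, sigma, gamma):
    """GP survival function for exceedances of u (used as a check)."""
    return (1.0 + gamma * (x - u) / sigma) ** (-1.0 / gamma)


def var_acer(alpha, q, a_tilde, b, c, xi):
    """Eq. (7.3): inverse of the assumed heavy-tailed ACER form."""
    return b + (((q / (1.0 - alpha)) ** xi - 1.0) / a_tilde) ** (1.0 / c)


def acer_heavy_tail(eta, q, a_tilde, b, c, xi):
    """Assumed form q * [1 + a_tilde * (eta - b)**c] ** (-1/xi)."""
    return q * (1.0 + a_tilde * (eta - b) ** c) ** (-1.0 / xi)


alpha = 0.99
pot_params = dict(u=1.0, sigma=0.5, gamma=0.2)                 # hypothetical
acer_params = dict(q=0.5, a_tilde=0.8, b=0.2, c=1.1, xi=0.3)   # hypothetical

v_pot = var_pot(alpha, **pot_params)
v_acer = var_acer(alpha, **acer_params)
print(v_pot, gp_survival(v_pot, **pot_params))
print(v_acer, acer_heavy_tail(v_acer, **acer_params))
```

In both cases the tail function evaluated at the computed VaR returns $1 - \alpha$, which is the consistency one expects from an inversion.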

7.3 Application to Simulated Time Series of Electricity Prices

Before using the method developed on real time series, it was decided to try to
motivate the use of the ACER method instead of the POT method by applying the
two to simulated time series with known properties. By this approach, it should be
possible to get an idea of the difference in performance for the two methods, and
if the ACER method performs better, argue that it should be used when estimating
extreme quantiles. The initial step was simply simulating iid innovations from a
Student’s t distribution and comparing the estimated quantiles with the ones from
the actual distribution. Data sets were simulated consisting of 10 time series with
3650 independent outcomes of the t variable in each, which then provides us with
data sets of 36,500 observations. This was repeated five times, resulting in five data
sets for comparing the performance of the ACER and POT methods. For the data sets
simulated, a t variable with $\nu = 4$ degrees of freedom was used. Applying the ACER
and POT methods to the simulated data sets and estimating the 100 time series return
level, which is the level that is expected to be exceeded once per 100 time series, the
results presented in Table 7.1 were obtained. In this table, the estimated 100 time
series return level is presented along with the percentage deviation from the real
return level obtained from the Student’s t distribution and 95% confidence intervals
for the estimated return level. From this table it is seen that the ACER method
estimates a return level closer to the real return level four out of five times, and that
the confidence intervals are far wider when using the POT method. The
difference in the confidence intervals is likely due to the fact that the ACER
method uses approximately 48% of all observations, compared to approximately
10% of all observations used by the POT method.
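
The real return level against which the estimates are compared can be computed directly. The Python sketch below assumes that "exceeded once per 100 time series" means an exceedance probability of $1/(100 \cdot 3650)$ per observation, and it uses the closed-form distribution function of the Student's t distribution with 4 degrees of freedom, inverted by bisection. The function names are hypothetical.

```python
import math


def t4_cdf(t):
    """Closed-form CDF of Student's t with 4 degrees of freedom."""
    u = t / math.sqrt(t * t + 4.0)
    return 0.5 + 0.75 * u - 0.25 * u ** 3


def t4_quantile(p, lo=0.0, hi=1.0e4, tol=1e-10):
    """Quantile for p >= 0.5 by bisection on the CDF."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t4_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_per_series = 3650
n_series = 100
p = 1.0 - 1.0 / (n_series * n_per_series)
true_level = t4_quantile(p)

# Percentage deviation of the first ACER estimate in Table 7.1.
deviation = abs(32.99 - true_level) / true_level * 100.0
print(round(true_level, 2), round(deviation, 1))
```

Under this reading the real return level comes out close to 32.3, and the deviation of the first ACER estimate agrees with the 2.1% listed in Table 7.1.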

Table 7.1 The 100 time series return level estimated with ACER and POT for the t model with
$\nu = 4$; percentage deviations from the real return level are given in parentheses

$\eta^{100\mathrm{ts}}_{\mathrm{ACER}}$    $\mathrm{CI}_{\mathrm{ACER}}$     $\eta^{100\mathrm{ts}}_{\mathrm{POT}}$     $\mathrm{CI}_{\mathrm{POT}}$
32.99 (2.1%)     (29.32, 37.80)    33.82 (4.7%)     (23.75, 43.88)
27.78 (14.0%)    (24.17, 31.59)    24.56 (24.0%)    (18.35, 31.05)
31.68 (1.9%)     (28.43, 34.72)    30.24 (6.4%)     (21.87, 38.61)
29.76 (7.7%)     (26.88, 33.31)    31.83 (1.5%)     (22.46, 41.20)
31.80 (1.5%)     (26.29, 34.25)    29.05 (10.1%)    (21.15, 36.96)

7.4 Electricity Price Data

In this chapter, the daily spot price at the Nordic Power Exchange, Nord Pool, is
studied. It will be the price at 9 a.m. for a 10-year period, from January 01, 2000 to
December 31, 2009. The price region considered is the middle of Norway. It should
be noted that this price region was established on November 20, 2006. Before that
time, it was integrated with the rest of Norway. It is also worth mentioning that the
spot price in this market is actually a 1-day futures price, where the price for the
following day is set by an auction at noon.
Plotted in Fig. 7.1 are the daily prices observed over this 10-year period and in
Fig. 7.2 are the daily price changes, presented as logarithmic returns. Observable
from these figures are large spikes in both the price and price change processes.
There also seems to be some volatility clustering in the return series. In Table 7.2,
some descriptive statistics are presented along with test values for the Ljung–Box test
(Brockwell and Davis 2002) for different lags on both the return and squared return
series. $Q(h)$ denotes the test statistic at lag h for the return series, while $Q^2(h)$
is the same for the squared series. A large excess kurtosis and positive skewness
are observed, meaning that both very small and very large price changes occur
often compared to a normal distribution, and large positive price changes are more
common than the large negative price changes. From the Ljung–Box test, significant
serial correlation for all lags considered here is observed, which is expected. For the
squared returns, there is significant serial correlation for both seven and ten lags,
meaning that there is significant volatility clustering or a GARCH effect.

Fig. 7.1 Daily electricity spot prices on Nord Pool for the 10-year period. [Plot: price in NOK/kWh versus day]

Fig. 7.2 Daily electricity price changes on Nord Pool for the 10-year period. [Plot: price changes versus day]

Table 7.2 Descriptive statistics for the return series

Mean                   Skew.   Ex. kurt.   $Q(1)$    $Q(2)$    $Q(7)$     $Q^2(1)$   $Q^2(2)$   $Q^2(7)$
$3.00 \cdot 10^{-4}$   0.97    23.27       182.20    430.86    1505.67    265.50     606.49     863.32
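
For readers who want to reproduce test values of this kind, the Ljung–Box statistic can be computed as sketched below in Python. The formula $Q(h) = n(n+2)\sum_{k=1}^{h}\hat{\rho}_k^2/(n-k)$ is the standard one; the two short series are made-up numbers used only to exercise the function, which is named hypothetically.

```python
def ljung_box(x, h):
    """Ljung-Box statistic Q(h) = n(n+2) * sum_{k=1..h} rho_k^2 / (n - k)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, h + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        rho = ck / c0  # sample autocorrelation at lag k
        q += rho * rho / (n - k)
    return n * (n + 2) * q


# A perfectly alternating series has lag-1 autocorrelation -0.9 for n = 10.
alternating = [1.0, -1.0] * 5
print(ljung_box(alternating, 1))

# The same function applied to squared values probes volatility clustering.
toy_returns = [0.1, -0.2, 0.15, -1.0, 1.2, -0.9, 0.1, 0.05, -0.1, 0.2]
print(ljung_box([r * r for r in toy_returns], 2))
```

Under the hypothesis of no serial correlation, $Q(h)$ is approximately chi-square distributed with h degrees of freedom, so values of the size reported in Table 7.2 are far beyond any usual critical value.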

7.5 Conditional Approach

Following the conditional approach of McNeil and Frey (2000) and the use of a
similar model for the electricity spot prices as given by Byström (2005), the data
were modelled with the use of an AR-GARCH model, and the ACER method
was then applied for estimation of the residual tail quantiles. To accommodate the
heavy tailed data, which are typically observed in finance, the proposed procedure
of Sect. 5.6 was used. For these returns, the goal was to fit an AR-GARCH model
where the seasonality in the process was modeled by an AR process. The terms
included in this AR process were the AR(1) and AR(7) terms because of the
clear seasonality over the day and over the week. For modeling the volatility, a
GARCH(1,1) process (Bollerslev 1986) was chosen. This provides a model that
should be able to capture the serial correlation over the week and the observed
heteroscedasticity. The model then assumes the form

$$R_t = a_0 + a_1 R_{t-1} + a_7 R_{t-7} + \sigma_t Z_t, \tag{7.4}$$

$$\sigma_t^2 = \alpha_0 + \alpha_1 \sigma_{t-1}^2 Z_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \tag{7.5}$$

where $R_t$ denotes daily log return rates of spot prices $X_t$, that is, $R_t =
\log(X_t/X_{t-1})$. The $Z_t$ are iid random variables of mean value zero and variance
equal to 1.0, while $a_0, a_1, a_7, \alpha_0, \alpha_1, \beta_1$ are all nonnegative constants. In this
chapter, it will be assumed that $Z_t$ is Normal or Student’s t distributed, scaled
to unit variance. The conditional quantiles, for these two models, can then be
calculated as

$$q^*_{t,\alpha} = a_0 + a_1 R_{t-1} + a_7 R_{t-7} + \sigma_t q_\alpha, \tag{7.6}$$

where $q_\alpha$ is the standard $\alpha$-quantile of the Normal or Student’s t-distribution.
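
The recursion in Eqs. (7.4) and (7.5) and the conditional quantile of Eq. (7.6) can be sketched in Python as follows. The sketch assumes that the model parameters are already known (their estimation is not shown) and that the variance recursion is started at the unconditional variance $\alpha_0/(1 - \alpha_1 - \beta_1)$, which is a choice made here for illustration. The function names and the parameter values in the example are hypothetical.

```python
import math


def log_returns(prices):
    """R_t = log(X_t / X_{t-1}) for consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def ar_garch_filter(r, a0, a1, a7, alpha0, alpha1, beta1):
    """Run Eqs. (7.4)-(7.5) through the return series r (len(r) > 7).

    Returns the standardized residuals together with the conditional
    mean and variance predicted for the day after the last observation.
    """
    sigma2 = alpha0 / (1.0 - alpha1 - beta1)  # assumed start value
    z = []
    for t in range(7, len(r)):
        mean_t = a0 + a1 * r[t - 1] + a7 * r[t - 7]
        eps = r[t] - mean_t                   # eps_t = sigma_t * Z_t
        z.append(eps / math.sqrt(sigma2))
        sigma2 = alpha0 + alpha1 * eps ** 2 + beta1 * sigma2
    next_mean = a0 + a1 * r[-1] + a7 * r[-7]
    return z, next_mean, sigma2


def conditional_quantile(next_mean, next_sigma2, q_alpha):
    """Eq. (7.6): conditional mean plus sigma_t times the residual quantile."""
    return next_mean + math.sqrt(next_sigma2) * q_alpha


# Hand-checkable toy input: nine zero returns and hypothetical parameters.
r = [0.0] * 9
z, m, s2 = ar_garch_filter(r, a0=0.0, a1=0.5, a7=0.1,
                           alpha0=0.1, alpha1=0.2, beta1=0.7)
print(z, m, s2, conditional_quantile(m, s2, q_alpha=-1.645))
```

The standardized residuals returned by such a filter are the data to which a tail model is fitted; replacing `q_alpha` by a tail quantile estimated from them gives the conditional quantile used in the extreme value approach.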

For the heavy tails observed in this type of data, this standard AR-GARCH
process will not be sufficient to model the tails of the return distribution accurately.
98 7 Estimation of Extreme Values for Financial Risk Assessment

The error distribution simply cannot be assumed to be Normal or Student’s t, as will

be observed later. Introduced by McNeil and Frey (2000) and applied to Nord Pool
spot prices by Byström (2005), the extreme value theory approach has proven itself
superior to the use of a standard AR-GARCH approach. This approach was used
here, but instead of using only a POT fitted GP distribution for estimation of the
tail quantiles for the standardized residual distribution, the ACER method was also
applied. The performance of this method was then compared to the performance of
the conditional model where the POT method was used. After using the ACER
and POT methods to estimate the tail quantiles of the residual distribution, the
conditional quantiles for these models were calculated as
$$q^*_{t,\alpha} = a_0 + a_1 R_{t-1} + a_7 R_{t-7} + \sigma_t \hat{q}_\alpha, \tag{7.7}$$

where $\hat{q}_\alpha$ is the quantile of the residual distribution, associated with probability $\alpha$,
estimated by the POT or ACER method. The standardized residuals obtained by
the Normal AR-GARCH filter were still slightly serially correlated, but most of the
heteroscedasticity was removed. These conditional quantiles may also be regarded as
the $\mathrm{VaR}(\alpha)$ estimate when considering the distribution of the lower tail. For the
in sample performance test, the conditional tail quantiles were estimated for all
observations, and then the number of empirical exceedances over these quantiles
was compared to what was to be expected. When dealing with the out of sample
performance, the initial step was estimating a model using the first 5 years, and then
this model was used to predict the tail quantiles for the next day. The model was
then re-estimated with the last 5 years of the data, or a rolling window of length 5
years, for each day, and the tail quantiles for the next day were predicted. As for
the in sample performance test, the empirical exceedances over the predicted tail
quantiles are compared to what is expected.
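The comparison of empirical and expected exceedances described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the expected count is simply the number of observations multiplied by the tail probability.

```python
def count_exceedances(observations, predicted_quantiles):
    """Number of observations that exceed their predicted quantile."""
    return sum(1 for r, q in zip(observations, predicted_quantiles) if r > q)


def expected_exceedances(n_observations, probability):
    """Expected number of exceedances over a quantile of the given probability."""
    return n_observations * (1.0 - probability)
```

For a well-calibrated method, `count_exceedances` applied to the returns and their day-ahead quantiles should be close to `expected_exceedances` at each probability level.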
The reason for the prefiltering of the data set was the desire to use a GARCH
process to model the volatility and to accommodate sudden changes in the
volatility. The use of an AR process for the autocorrelation is mainly because of
the POT method’s need for iid observations. There are no such iid requirements for
the ACER method, so the use of an AR process is not strictly needed in this case.

7.6 Results

Using the models introduced, the goal is to analyze the 10 years of daily spot price
data from the Nord Pool. As an introduction, let us start off with an unconditional
tail fitting, that is, apply the POT and ACER methods directly to the return series.
Then, the conditional approach will be introduced, as detailed in McNeil and Frey
(2000), where both the ACER and POT methods will be used to fit the residuals
produced by the AR-GARCH filtering.

7.6.1 Unconditional Approach

As mentioned, the unconditional approach will be to apply the POT and ACER
methods directly to the return series. To compare the performance of these two
methods, both in sample and out of sample fit will be considered. For the in
sample fit, the methods are simply used to estimate the tail quantiles of the return
distribution, and these quantiles will be compared to what is actually observed. For
the out of sample test, the first 5 years of the data will be used to estimate the model
and then predict the tail quantiles for the following day. The rolling window is then
moved to the next day, the model is re-estimated, and the quantiles for the next day
are calculated from the estimated parameters. This is repeated for the last 5 years of
the data set.
For the POT method, it is necessary for the data used to be iid, which from
the Ljung–Box test results is clearly not the case. There are both significant serial
correlation and volatility clustering. To deal with this problem, the data will be
declustered by extracting peaks with enough lags in between that it is reasonable
to assume independence between the observations. For the ACER method, there is
no need to decluster the data as the correlation between lags is accounted for in the
choice of ACER function.
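The declustering step for the POT method can be illustrated by a simple runs-declustering scheme. The text does not specify the exact scheme used, so the rule below (a cluster ends after `min_gap` consecutive observations at or below the threshold, and only the cluster maximum is kept) is an assumption made for illustration, as are the function and parameter names.

```python
def decluster_peaks(series, threshold, min_gap):
    """Return one peak (the cluster maximum) per cluster of threshold exceedances.

    A cluster is closed once `min_gap` consecutive observations at or below
    the threshold have been seen (runs declustering; an assumed scheme).
    """
    peaks = []
    current_max = None  # largest exceedance in the open cluster, if any
    gap = 0             # consecutive non-exceedances since the last exceedance
    for x in series:
        if x > threshold:
            current_max = x if current_max is None else max(current_max, x)
            gap = 0
        elif current_max is not None:
            gap += 1
            if gap >= min_gap:
                peaks.append(current_max)
                current_max = None
                gap = 0
    if current_max is not None:
        peaks.append(current_max)  # close a cluster still open at the end
    return peaks
```

A larger `min_gap` gives fewer, more nearly independent peaks at the cost of discarding data, which is the trade-off the ACER method avoids.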
In Table 7.3, the number of empirical exceedances over the estimated in sample
quantiles is presented for both methods, along with the number of expected
exceedances over these quantiles. It is observed from this table that there is not a
great difference between the number of exceedances over the estimated quantiles for
the two methods. For the out of sample performance, the number of exceedances
over the estimated quantiles can be observed in Table 7.4. Again, there is no
great difference between the results of the two methods, but the ACER method is
again slightly more accurate than the POT method. A problem with doing such
out of sample prediction is that the last 5 years of the data are used with no
more emphasis on what happened yesterday than 5 years ago. This means that the
estimated quantiles, and thus risk measures, will need a long time to be able to
incorporate a rise or fall in volatility. This again leads to the effect that most of
the exceedances over the predicted quantiles will be observations from the periods
with high volatility, while one would ideally like the exceedances to be uniformly
distributed over the period in question.

Table 7.3 In sample performance of the unconditional methods
Probability Expected POT ACER
0.95 183 175 179
0.99 37 38 38
0.995 18 19 19
0.999 4 3 4
0.9995 2 3 2
0.9999 0 1 0

Table 7.4 Out of sample performance of the unconditional methods
Probability Expected POT ACER
0.95 91 99 99
0.99 18 29 24
0.995 9 17 11
0.999 2 2 2
0.9995 1 1 1
0.9999 0 0 0

Table 7.5 Estimated AR-GARCH parameter values
Parameter   Value-N   Value-t
$\mu$   $2.997 \cdot 10^{-3}$ $(1.69 \cdot 10^{-3})$   $-1.628 \cdot 10^{-3}$ $(1.14 \cdot 10^{-3})$
$a_1$   $-0.338$ $(2.95 \cdot 10^{-2})$   $-0.283$ $(1.57 \cdot 10^{-2})$
$a_7$   $0.388$ $(2.33 \cdot 10^{-2})$   $0.425$ $(1.45 \cdot 10^{-2})$

.α0
−4
.2.985 · 10 (1.62 · 10 )
−5 −4
.8.518 · 10 (2.16 · 10 )
−4

$\alpha_1$   $6.362 \cdot 10^{-2}$ $(1.43 \cdot 10^{-2})$   $0.455$ $(8.93 \cdot 10^{-2})$
$\beta_1$   $0.931$ $(1.70 \cdot 10^{-2})$   $0.740$ $(2.58 \cdot 10^{-2})$
$\nu$   –   $2.582$ $(0.14)$

Table 7.6 Descriptive statistics for residual series
Mean   Skew.   [Link].   $Q(1)$   $Q(2)$   $Q(7)$   $Q^2(1)$   $Q^2(2)$   $Q^2(7)$
$2.90 \cdot 10^{-3}$   4.44   97.72   6.70   102.91   141.82   $7.20 \cdot 10^{-3}$   $8.60 \cdot 10^{-3}$   $3.67 \cdot 10^{-1}$

7.6.2 Conditional Approach

To be able to accommodate sudden changes in volatility, at least to some extent,
the conditional approach is used. The first step is to filter the data with an
AR-GARCH process. For this, a GARCH(1,1) process with AR parameters for the first
and seventh lag is applied before using the POT and ACER methods for fitting to
the standardized residuals. The residuals are standardized with the current volatility.
The parameters for the AR-GARCH model with both normal and t distributed errors
are presented in Table 7.5, with the standard errors in brackets. For the model with
normally distributed errors, all parameters are significant at a $0.01$ significance
level, except $\alpha_0$, which is significant at a $0.1$ significance level, and $\mu$, which is
nonsignificant. For the model with t distributed errors, all parameters are significant
at a $0.01$ significance level, except $\mu$, which is nonsignificant.
Descriptive statistics and Ljung–Box test results for the residual series, that is,
the residual series after pre-filtering with the normal AR-GARCH model, can be
found in Table 7.6. For the residual series, there are still positive skewness and high
excess kurtosis. It is observed that while there is still significant serial correlation,
it has been greatly reduced, and there are no significant GARCH effects in the
residual series. It was observed that it is possible to remove slightly more of the
serial correlation by including more AR-terms, but the difference is minimal, so the
model with fewer parameters is preferred.

Table 7.7 Generalized Pareto distribution parameters from POT
Parameter Value
$\sigma$ 0.5318
$\xi$ 0.3118
$\lambda$ $9.32 \cdot 10^{-2}$
$u$ 1

Table 7.8 Parameters estimated for the ACER method
Parameter Value
$\tilde{a}$ 0.254
$b$ 0.010
$c$ 1.181
$q$ 0.46
$\xi$ 0.334

Table 7.9 Predicted in sample right quantiles for the different methods
Probability Expected AR-GARCH-N AR-GARCH-t C-POT C-ACER
0.95 182 128 32 187 182
0.99 37 57 9 34 35
0.995 18 43 4 19 18
0.999 4 28 2 4 4
0.9995 2 23 1 3 1
0.9999 0 17 0 1 0

After filtering the return series, the POT and ACER methods are applied to the
series of standardized residuals. As this series is much closer to iid than the return
series and observations over the chosen threshold seem to be independent of each
other, there is no need to decluster the data in the same way that was done with the
unconditional method. Nevertheless, it should be noted that only the observations
over the chosen threshold are used, which in this case will be less than $10\%$ of the
data. Using the POT method to fit a GP distribution to the data, with the threshold
$u$ selected from inspection (Coles 2001), the parameters presented in Table 7.7 are
obtained. Here $\lambda$ is the empirical estimate of $P(X > u)$. Inverting Eq. (3.3) for the
desired probabilities gives us the POT estimated quantiles, which in turn are inserted
into Eq. (7.7) to get the conditional quantiles. For the ACER method, the same
procedure is used, and the parameters for the extrapolated ACER function can be
found in Table 7.8.
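The inversion for the POT quantiles can be sketched as follows. Equation (3.3) is not reproduced in this section, so the sketch assumes the standard POT tail model $P(X > x) = \lambda\,(1 + \xi (x - u)/\sigma)^{-1/\xi}$ for $x \ge u$; the function names are illustrative. The parameter values are those of Table 7.7.

```python
def gp_tail_probability(x, u, sigma, xi, lam):
    """P(X > x) for x >= u under the assumed POT tail model (xi != 0)."""
    return lam * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


def pot_quantile(p, u, sigma, xi, lam):
    """Level exceeded with probability 1 - p; valid for 1 - p <= lam, xi != 0."""
    return u + sigma / xi * (((1.0 - p) / lam) ** (-xi) - 1.0)


# Parameters from Table 7.7 (standardized residuals).
U, SIGMA, XI, LAM = 1.0, 0.5318, 0.3118, 9.32e-2
Q_99 = pot_quantile(0.99, U, SIGMA, XI, LAM)  # residual quantile for Eq. (7.7)
```

The value `Q_99` plays the role of $\hat{q}_\alpha$ in Eq. (7.7); the conditional quantile then follows by scaling with the current volatility and adding the AR terms.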
For the in sample performance of these two methods, the same procedure as
for the unconditional approach is used. In Table 7.9, the number of exceedances
over a given quantile for the different methods is presented. It is seen that for the
standard AR-GARCH model with standard normally distributed errors, the quantiles
are severely underestimated for all quantile levels, and for the same model with t
distributed errors the quantiles for the lower levels are severely overestimated. It is
also observed that the AR-GARCH model, where the POT method has been applied
to the standardized residuals, is clearly able to estimate the extreme quantiles much

Table 7.10 Exceedances over predicted out of sample quantiles. Right tail
Probability Expected AR-GARCH-N AR-GARCH-t C-POT C-ACER
0.95 91 117 32 94 92
0.99 18 42 9 17 17
0.995 9 27 4 11 10
0.999 2 12 2 4 2
0.9995 1 11 1 0 1
0.9999 0 4 0 0 0

Table 7.11 Yearly distribution of exceedances for the conditional ACER and POT method. Out of sample
Year C-POT C-ACER
1 16 16
2 17 17
3 19 20
4 24 21
5 18 18

better than just a standard AR-GARCH model. This is the same as was found by
Byström (2005).
To compare the performance of the POT and ACER methods, it is important to
assess the out of sample fit for the two methods. To do this, a model is estimated
using the first 5 years of the data, and then the conditional quantiles for the next
day are estimated. The model is then re-estimated using what is now the last 5
years of the data, and again the next day conditional quantiles are predicted. This
gives us a period of 5 years for the out of sample prediction. In Table 7.10, the
number of exceedances over the predicted out of sample quantiles is presented. It
is seen from this that the performance of the conditional POT and the conditional
ACER method is quite similar, with the conditional ACER method giving a slightly
better out of sample fit for the more extreme quantiles. In Table 7.11, the counted
exceedances over the predicted $95\%$ day ahead quantile are presented. It is seen that
the distribution of the exceedances is fairly even over the years. The exceedances
are well distributed and do not become more frequent in the high volatility periods.
Chapter 8
The Upcrossing Rate
via the Characteristic Function

8.1 Introduction

As was detailed in Chap. 4, a key function for a practical assessment of the extreme
value distribution of stochastic response processes is the average rate of upcrossings
of high levels by the response. An important class of such response processes
can be expressed as a second-order stochastic Volterra series, that is, a stochastic
Volterra series that has been truncated after the second-order term (Schetzen 1980).
A substantial amount of work has been done to derive methods for efficient analysis
of this model, starting with the seminal paper by Kac and Siegert (1947). Later, with
the development of the offshore industry, this paper had an impact on investigations
of the response statistics of large floating structures. Early contributions in this field
of research were made, among many others, by Neal (1974), Vinje (1983), Langley
(1984), Naess (1985b, 1990b), and Donley and Spanos (1990).
The type of stochastic Volterra series models that will be studied here can be
expressed as a sum of a linear and a nonlinear, quadratic transformation of a
Gaussian process. Such a representation of the response process would apply to
the standard model for expressing the total wave forces or horizontal excursion
responses of, e.g., a tension leg platform in a random sea way. It also applies
to the response of a linear structure to a quadratic wind loading where the wind
speed is modeled as a Gaussian process. It would also apply to the representation
of large deformations of flexible structures subjected to Gaussian loads in which
the strains/stresses exhibit a strong quadratic effect. The problem of determining
the marginal probability distribution function of such response processes has been
solved in the sense that computer programs are available that allow very accurate
numerical calculations of these functions, cf. Naess and Johnsen (1992).
The focus of this chapter is to show that it is also possible to accurately
calculate the average upcrossing rate of second-order stochastic Volterra models.
This allows the formulation of approximate extreme value distributions for such
response processes. An interesting aspect of the development in this chapter is the

surprising structural complexity of the problem of calculating the upcrossing rate
of a second-order Volterra model. It is also hoped that this chapter may serve to
illustrate the power of the point process approach to practical extreme value analysis
via the upcrossing rate function. The performance of the numerical method that has
been developed will be illustrated by application to three specific examples.

8.2 The Response Process

As already stated, the object of study in this chapter is a stochastic response process
$Z(t)$ modeled as a second-order stochastic Volterra series. Specifically, it is assumed
that $Z(t)$ can be written as a sum of a linear, first-order response component $Z_1(t)$
and a nonlinear, second-order component $Z_2(t)$. That is, cf. Naess (1990b),

$$Z(t) = Z_1(t) + Z_2(t), \tag{8.1}$$

where

$$Z_1(t) = \int_0^\infty h_1(\tau)\, X(t - \tau)\, d\tau, \tag{8.2}$$

and

$$Z_2(t) = \int_0^\infty \int_0^\infty h_2(\tau_1, \tau_2)\, X(t - \tau_1)\, X(t - \tau_2)\, d\tau_1\, d\tau_2. \tag{8.3}$$

In Eqs. (8.2) and (8.3), $X(t)$ denotes a stationary, real Gaussian process. $X(t)$
could represent a random wave elevation process or a stochastic wind velocity
field. It was chosen here to limit the exposition to the case of a unidirectional
situation. How to deal with the multidirectional case is explained in detail in Naess
(1990b). Based on the results from this reference, it will be recognized that all results
obtained in this chapter apply equally well to the multidirectional case.
The functions $h_1(\tau)$ and $h_2(\tau_1, \tau_2)$ characterize the physical system that is
modeled. $h_1(\tau)$ is an ordinary impulse response function defining a linear dynamical
system. $h_2(\tau_1, \tau_2)$, which is referred to as the quadratic impulse response function,
characterizes the second-order properties of the physical system, but in contrast to
the linear impulse response function, it does not have a direct physical interpretation.
To derive expressions suitable for practical numerical calculations, the input
process $X(t)$ is represented as follows:

$$X(t) = \sum_{j=-N}^{N} \bigl[S_X(\omega_j)\,\Delta\omega\bigr]^{1/2} B_j\, e^{\mathrm{i}\omega_j t}, \tag{8.4}$$

where $S_X(\omega)$ denotes the two-sided spectral density of $X(t)$. Throughout this
chapter, when the summation index runs from negative to positive values, it
invariably omits zero. $0 < \omega_1 < \cdots < \omega_N$ is an equidistant discretization of
the pertinent part of the positive frequency axis, with $\omega_{-i} = -\omega_i$ and $\Delta\omega = \omega_{i+1} - \omega_i$. The
assumption of an equidistant discretization is adopted for simplicity of presentation
and is not necessary. In fact, often a non-equidistant version is used to avoid
having too many frequencies, which is sometimes convenient. The formulas are
easily adapted to cover the situation of non-equidistant discretization. $\{B_i\}$ is a set
of independent, complex Gaussian $N(0, 1)$-variables with independent, identically
distributed real and imaginary parts. These variables can be assumed to satisfy the
relation $B_{-i} = B_i^*$, where $*$ signifies complex conjugation and $\mathrm{i}^2 = -1$. Hence, $X(t)$
becomes a Gaussian process with zero mean value. The case of a nonzero mean
value does not create any difficulties for the analysis, as is easily recognized.
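The representation in Eq. (8.4) can be turned into a simple simulation routine, sketched below in Python. Two assumptions are made that the text does not state explicitly: the variables $B_j$ are normalised so that $E|B_j|^2 = 1$ (real and imaginary parts each of variance 1/2), and the spectral density is even, $S_X(-\omega) = S_X(\omega)$. The function and argument names are illustrative.

```python
import cmath
import math
import random


def simulate_gaussian_process(spectrum, omegas, times, rng):
    """One realization of X(t) from Eq. (8.4) on the given time points.

    spectrum: callable returning the two-sided spectral density S_X(omega)
    omegas:   equidistant positive frequencies omega_1 < ... < omega_N
    rng:      random.Random instance
    """
    d_omega = omegas[1] - omegas[0]
    # B_j for j > 0, with E|B_j|^2 = 1 (assumed normalisation).
    scale = math.sqrt(0.5)
    b = [complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale)) for _ in omegas]
    values = []
    for t in times:
        total = 0.0
        for omega, b_j in zip(omegas, b):
            amplitude = math.sqrt(spectrum(omega) * d_omega)
            # The terms j and -j are complex conjugates (B_{-j} = B_j^*),
            # so together they contribute twice the real part.
            total += 2.0 * (amplitude * b_j * cmath.exp(1j * omega * t)).real
        values.append(total)
    return values
```

Under these assumptions, the variance of the simulated process equals $2 \sum_{j=1}^{N} S_X(\omega_j)\Delta\omega$, the discretized integral of the two-sided spectral density.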
By substituting Eq. (8.4) into Eqs. (8.2) and (8.3), the following expressions are
obtained:

$$Z_1(t) = \sum_{i=-N}^{N} q_i B_i\, e^{\mathrm{i}\omega_i t}, \tag{8.5}$$

where

$$q_i = \hat{H}_1(\omega_i)\bigl[S_X(\omega_i)\,\Delta\omega\bigr]^{1/2}, \tag{8.6}$$

and

$$Z_2(t) = \sum_{i=-N}^{N} \sum_{j=-N}^{N} Q_{ij} B_i B_j^*\, e^{\mathrm{i}(\omega_i - \omega_j)t}, \tag{8.7}$$

where

$$Q_{ij} = \hat{H}_2(\omega_i, -\omega_j)\,\bigl[S_X(\omega_i) S_X(\omega_j)\bigr]^{1/2} \Delta\omega. \tag{8.8}$$

The function $\hat{H}_1(\omega)$ denotes the linear transfer function, corresponding to $h_1(\tau)$.
Typically, the second-order Volterra series is used for modeling the stochastic
loading process. For the response process to be of the same type, it is necessary
that the equations of motion of the dynamic system considered are linear and time
invariant. Such a system is characterized by a linear transfer function, $\hat{L}(\omega)$ say. The
implication of this is that $\hat{H}_1(\omega)$ can be expressed as follows:

$$\hat{H}_1(\omega) = \hat{L}(\omega)\hat{K}_1(\omega). \tag{8.9}$$

$\hat{L}(\omega)$ is assumed to be given as follows:

$$\hat{L}(\omega) = (-\omega^2 M + \mathrm{i}\omega C + K)^{-1}, \tag{8.10}$$

where $M = M(\omega)$, $C = C(\omega)$, and $K$ are appropriate mass, damping, and stiffness
parameters. The function $\hat{K}_1(\omega)$ is a linear transfer function associated with the
loading process.
The function $\hat{H}_2(\omega, \omega')$ denotes the quadratic transfer function (QTF), which
depends on two frequencies. It is obtained as a Fourier transform of $h_2(\tau_1, \tau_2)$.
$\hat{H}_2(\omega, \omega')$ can be expressed as follows:

$$\hat{H}_2(\omega, \omega') = \hat{L}(\omega + \omega')\hat{K}_2(\omega, \omega'). \tag{8.11}$$

The function $\hat{K}_2(\omega, \omega')$ denotes a QTF for the quadratic forces on the structure.
It is shown by Naess (1987), see also Johnson and Kotz (1970), that by solving
the eigenvalue problem (assumed nonsingular),

$$Q v_j = \lambda_j v_j, \tag{8.12}$$

to find the eigenvalues $\lambda_j$ and orthonormal eigenvectors $v_j$, $j = -N, \ldots, -1,
1, \ldots, N$, the quadratic response process can be represented as

$$Z_2(t) = \sum_{j=-N}^{N} \lambda_j W_j(t)^2. \tag{8.13}$$

Here $W_j(t)$, $j = -N, \ldots, -1, 1, \ldots, N$, are real stationary Gaussian $N(0, 1)$-processes
which can be represented as follows:

$$W_j(t) = \sum_{k=-N}^{N} v_j(\omega_k) B_k\, e^{\mathrm{i}\omega_k t}, \tag{8.14}$$

where $v_j(\omega_k)$ denotes the $k$th component of $v_j$. Note that the property $v_j(\omega_{-k}) =
v_j(\omega_k)^*$ can be assumed, cf. Naess (1990b). For each fixed $t$, $\{W_j(t)\}$ becomes a set
of independent Gaussian variables.
Having achieved the desired representation of the quadratic response $Z_2(t)$, it
can then be shown that the first-order response can be expressed as

$$Z_1(t) = \sum_{j=-N}^{N} \beta_j W_j(t). \tag{8.15}$$

The (real) parameters $\beta_j$ are given by the relations

$$\beta_j = \sum_{k=-N}^{N} \hat{H}_1^*(\omega_k)\bigl[S_X(\omega_k)\,\Delta\omega\bigr]^{1/2} v_j(\omega_k). \tag{8.16}$$

This gives us the following representation of the total response process (Vinje
1983; Naess 1985b):

$$Z(t) = \sum_{j=-N}^{N} \bigl\{\lambda_j W_j(t)^2 + \beta_j W_j(t)\bigr\}. \tag{8.17}$$

Based on this representation, Naess (1987) describes how to calculate the
statistical moments of the response process. Asymptotic upper and lower bounds
on the mean level upcrossing rate are also given, but these bounds are valid only if
specific conditions on the ratios $\beta_j/\lambda_j$ are satisfied. It turns out that these conditions
are very often violated in practical applications. Our goal is therefore to find a
numerical method for accurate calculation of the mean level upcrossing rate of the
total response process $Z(t)$. This is the topic of the next section.
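As a small worked example of how the representation in Eq. (8.17) gives access to the moments, the mean and variance of $Z(t)$ follow directly from the independence of the standard Gaussian variables $W_j(t)$ at fixed $t$: since $E[W_j^2] = 1$, $\mathrm{Var}[W_j^2] = 2$, and $\mathrm{Cov}[W_j^2, W_j] = 0$, one obtains $E[Z] = \sum_j \lambda_j$ and $\mathrm{Var}[Z] = \sum_j (2\lambda_j^2 + \beta_j^2)$. The Python sketch below evaluates these two sums; the function name is illustrative, and the sketch is not the moment calculation procedure of Naess (1987).

```python
def response_mean_and_variance(lambdas, betas):
    """Mean and variance of Z = sum_j (lambda_j * W_j**2 + beta_j * W_j).

    The W_j are independent standard Gaussian variables (fixed t).
    """
    mean = sum(lambdas)
    variance = sum(2.0 * lam ** 2 + beta ** 2 for lam, beta in zip(lambdas, betas))
    return mean, variance
```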

8.3 The Average Crossing Rate

The calculation of the mean upcrossing rate $\nu_Z^+(\cdot\,; t)$ at time $t$ of a stochastic process
$Z(t)$ is usually based on the Rice formula. It states that

$$\nu_Z^+(\zeta; t) = \int_0^\infty \dot{z}\, f_{Z(t)\dot{Z}(t)}(\zeta, \dot{z})\, d\dot{z}, \tag{8.18}$$

where $f_{Z(t)\dot{Z}(t)}(\cdot, \cdot)$ denotes the joint probability density function of $Z(t)$ and
$\dot{Z}(t) = dZ(t)/dt$. A direct application of Eq. (8.18) requires the calculation of
$f_{Z(t)\dot{Z}(t)}$. An alternative approach, which turns out to be ideally suited for the
problem at hand, is to express the mean crossing rate in terms of the characteristic
function $M_{Z\dot{Z}}(u, v) = E[\exp\{\mathrm{i}(u Z(t) + v \dot{Z}(t))\}]$. It has been shown by Naess
(2000b) that in general

$$\nu_Z^+(\zeta; t) = -\frac{1}{(2\pi)^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{1}{v}\, \frac{\partial M_{Z\dot{Z}}(u, v)}{\partial v}\, e^{-\mathrm{i}\zeta u}\, du\, dv - \frac{\mathrm{i}}{4\pi} \int_{-\infty}^{+\infty} \left[\frac{\partial M_{Z\dot{Z}}(u, v)}{\partial v}\right]_{v=0} e^{-\mathrm{i}\zeta u}\, du, \tag{8.19}$$

where the integral w.r.t. $v$ is interpreted as a principal value integral in the following
sense: $\int_{-\infty}^{\infty} = \lim_{\varepsilon \to 0} \bigl( \int_{-\infty}^{-\varepsilon} + \int_{\varepsilon}^{\infty} \bigr)$. A heuristic derivation of this formula is given
in Sect. 8.5.3.
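To make the Rice formula, Eq. (8.18), concrete, the sketch below evaluates it numerically for the simplest case, a stationary zero-mean Gaussian process, for which $Z(t)$ and $\dot{Z}(t)$ are independent at a fixed time and the well-known closed form $\nu_Z^+(\zeta) = \frac{\sigma_{\dot{Z}}}{2\pi \sigma_Z} \exp\{-\zeta^2/(2\sigma_Z^2)\}$ holds. This Gaussian case is chosen only as a check and is not one of the second-order models treated in this chapter; the function names, the truncation of the integral at ten standard deviations, and the trapezoidal rule are illustrative choices.

```python
import math


def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian variable with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def rice_rate_numeric(zeta, sigma_z, sigma_zdot, n_steps=20000):
    """Trapezoidal evaluation of Eq. (8.18) for independent Gaussian Z and Zdot."""
    upper = 10.0 * sigma_zdot  # the integrand is negligible beyond this point
    h = upper / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        zdot = k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * zdot * gaussian_pdf(zeta, sigma_z) * gaussian_pdf(zdot, sigma_zdot)
    return total * h


def rice_rate_closed_form(zeta, sigma_z, sigma_zdot):
    """Closed-form upcrossing rate of a stationary zero-mean Gaussian process."""
    return sigma_zdot / (2.0 * math.pi * sigma_z) * math.exp(-0.5 * (zeta / sigma_z) ** 2)
```

For the second-order Volterra model, the joint density is not available in such a simple form, which is the reason for working with the characteristic function instead.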

For the stationary case, it can be shown that the last integral on the right hand
side of Eq. (8.19) vanishes. Hence, for a stationary process, the following formula
is obtained (Vinje 1983; Naess 2000b):

$$\nu_Z^+(\zeta) = -\frac{1}{(2\pi)^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{1}{v}\, \frac{\partial M_{Z\dot{Z}}(u, v)}{\partial v}\, e^{-\mathrm{i}\zeta u}\, du\, dv, \tag{8.20}$$

where the upcrossing rate is now independent of $t$, and again, the integral with
respect to $v$ is to be interpreted as a principal value integral as described above.
An alternative expression useful for numerical calculations of the upcrossing rate
$\nu_Z^+(\cdot)$, whether the process is stationary or not, can be obtained by considering the
characteristic function as a function of two complex variables. It can then often
be shown that this new function becomes holomorphic in suitable regions of $\mathbb{C}^2$,
where $\mathbb{C}$ denotes the complex plane. Under suitable conditions, the use of complex
function theory allows the derivation of the following alternative expression for the
crossing rate, cf. Naess and Karlsen (2004):

$$\nu_Z^+(\zeta) = -\frac{1}{(2\pi)^2} \int_{-\infty-\mathrm{i}a}^{\infty-\mathrm{i}a} \int_{-\infty-\mathrm{i}b}^{\infty-\mathrm{i}b} \frac{1}{w^2}\, M_{Z\dot{Z}}(z, w)\, e^{-\mathrm{i}z\zeta}\, dz\, dw, \tag{8.21}$$

where $0 < a < a_1$ for some positive constant $a_1$, and $b_0 < b < b_1$ for some
constants $b_0 < 0$ and $b_1 > 0$.
The calculation of the characteristic function $M_{Z\dot{Z}}$ is discussed in Grigoriu
(1995), but no explicit expression is derived. Here, a different approach will be
followed, which leads to a convenient explicit representation of the characteristic
function suitable for calculation of the integrals appearing in Eq. (8.19), (8.20),
or (8.21). To this end, consider the multidimensional Gaussian vectors $W =
(W_{-N}, \ldots, W_N)'$ and $\dot{W} = (\dot{W}_{-N}, \ldots, \dot{W}_N)'$. It is obtained that the covariance
matrix of $(W', \dot{W}')'$ is given by

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \tag{8.22}$$

where $\Sigma_{11} = I$ = the $2N \times 2N$ identity matrix, $\Sigma_{12} = (r_{ij}) = (E[W_i \dot{W}_j])$, $\Sigma_{21} =
(E[\dot{W}_i W_j])$, and $\Sigma_{22} = (s_{ij}) = (E[\dot{W}_i \dot{W}_j])$; $i, j = -N, \ldots, -1, 1, \ldots, N$. $r_{ij} =
-r_{ji}$ and $\Sigma_{12} = \Sigma_{21}'$. It follows from Eq. (8.14) that the entries of the covariance
matrix $\Sigma$ can be expressed in terms of the eigenvectors $v_j$, cf. Naess (1987). Let
$\Lambda = \mathrm{diag}(\lambda_{-N}, \ldots, \lambda_N)$ be the diagonal matrix with the parameters $\lambda_j$ on the
diagonal, and let $\beta = (\beta_{-N}, \ldots, \beta_N)'$, cf. Eq. (8.16). It will be shown in Sect. 8.5.3
that (Naess 2000a)

$$M_{Z\dot{Z}}(u, v) = \frac{1}{\sqrt{\det(A)}} \exp\Bigl\{-\frac{1}{2} v^2 \beta' C \beta + \frac{1}{2} d' A^{-1} d\Bigr\}, \tag{8.23}$$

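where

$$A = A(u, v) = I - 2\mathrm{i}u\Lambda - 2\mathrm{i}v\bigl(\Lambda\Sigma_{21} + \Sigma_{12}\Lambda\bigr) + 4v^2 \Lambda C \Lambda, \tag{8.24}$$

$$C = \Sigma_{22} - \Sigma_{21}\Sigma_{12}, \tag{8.25}$$

$$d = d(u, v) = \bigl(\mathrm{i}uI + \mathrm{i}v\Sigma_{12} - 2v^2 \Lambda C\bigr)\beta. \tag{8.26}$$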

8.4 Numerical Calculation

Early efforts to carry out numerical calculation of the mean crossing rate using
Eq. (8.21) have been reported in Naess and Karlsen (2004). These initial investigations
indicated that the method had the potential to provide very accurate numerical
results. Equation (8.21) will be rewritten as follows:

$$\nu_Z^+(\zeta) = -\frac{1}{(2\pi)^2} \int_{-\infty-\mathrm{i}a}^{\infty-\mathrm{i}a} \frac{1}{w^2}\, I(\zeta, w)\, dw, \tag{8.27}$$

where

$$I = I(\zeta, w) = \int_{-\infty-\mathrm{i}b}^{\infty-\mathrm{i}b} M(z, w)\, e^{-\mathrm{i}z\zeta}\, dz = \int_{-\infty-\mathrm{i}b}^{\infty-\mathrm{i}b} \exp\{-\mathrm{i}z\zeta + \ln M(z, w)\}\, dz. \tag{8.28}$$

A numerical calculation of the mean upcrossing rate can start by calculating
the function $I(\zeta, w)$ for specified values of $\zeta$ and $w$. However, a direct numerical
integration of Eq. (8.28) is made difficult by the oscillatory term $\exp\{-\mathrm{i}\Re(z)\zeta\}$,
where $\Re(z)$ denotes the real part of the complex number $z$. This problem can be
avoided by invoking the method of steepest descent, also called the saddle point
method. For this purpose, write

$$g(z) = g(z; w) = -\mathrm{i}z\zeta + \ln M(z, w) = \varphi(x, y) + \mathrm{i}\,\psi(x, y), \tag{8.29}$$

where $z = x + \mathrm{i}y$. $\varphi(x, y)$ and $\psi(x, y)$ become real harmonic functions when $g(z)$
is holomorphic. The idea is to identify the saddle point of the surface $(x, y) \to
\varphi(x, y)$ closest to the integration line from $-\infty - \mathrm{i}b$ to $\infty - \mathrm{i}b$. By shifting this
integration line to a new integration contour that passes through the saddle point
and then follows the path of steepest descent away from the saddle point, it can be
shown that the function $\psi(x, y)$ stays constant, and therefore the oscillatory term
in the integral degenerates to a constant. This is a main advantage of the method of
steepest descent for numerical calculations. It can be shown that the integral does
not change its value as long as the function $g(z)$ is a holomorphic function in the
region bounded by the two integration contours and if the integrals vanish along the
contour segments required to close the region.
If $z_s$ denotes the identified saddle point, where $g'(z_s) = 0$, the steepest descent
path away from the saddle point will follow the direction given by $-g'(z)^*$, for
$z \neq z_s$, cf. Henrici (1977). Typically, the singular points of the function $g$ will be
around the imaginary axis, which indicates that the direction of the paths of steepest
descent emanating from the saddle point will typically not deviate substantially from
a direction orthogonal to the imaginary axis. This provides a guide for setting up
a numerical integration procedure based on the path of steepest descent. First the
saddle point $z_s$ is identified. Then the path of steepest descent starting at $z_s$ and going
“right” is approximated by the sequence of points $\{z_j\}_{j=0}^{\infty}$ calculated as follows:

$$z_0 = z_s, \qquad z_1 = z_s + h, \tag{8.30}$$

$$\Delta z_j = -\frac{g'(z_j)^*}{|g'(z_j)|}\, h, \qquad j = 1, 2, \ldots, \tag{8.31}$$

$$z_{j+1} = z_j + \Delta z_j, \qquad j = 1, 2, \ldots, \tag{8.32}$$

where $h$ is a small positive constant.

Similarly, the path of steepest descent going “left” is approximated by the
sequence $\{z_j\}_{j=0}^{-\infty}$ calculated by

$$z_{-1} = z_s - h, \tag{8.33}$$

$$\Delta z_j = -\frac{g'(z_j)^*}{|g'(z_j)|}\, h, \qquad j = -1, -2, \ldots, \tag{8.34}$$

$$z_{j-1} = z_j + \Delta z_j, \qquad j = -1, -2, \ldots \tag{8.35}$$

A numerical estimate $\hat{I}$ of $I$ can be obtained as follows:

$$\hat{I} = \hat{I}^+ + \hat{I}^-, \tag{8.36}$$

where

$$\hat{I}^+ = \frac{h}{2} \exp\{g(z_s)\} + \sum_{j=1}^{K} \Delta z_j \exp\{g(z_j)\}, \tag{8.37}$$

and

$$\hat{I}^- = \frac{h}{2} \exp\{g(z_s)\} - \sum_{j=-1}^{-K} \Delta z_j \exp\{g(z_j)\}, \tag{8.38}$$

for a suitably large integer $K$.
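The path-following scheme of Eqs. (8.30) to (8.38) can be sketched in Python as follows. The routine takes $g$ and its derivative as callables and assumes that the saddle point is already known; the function names and default step sizes are illustrative. As a check that is not taken from the book, the routine is applied to the toy case $M(z) = \exp(-z^2/2)$, the characteristic function of a standard Gaussian variable, for which the saddle point is $z_s = -\mathrm{i}\zeta$ and the exact value of the integral is $\sqrt{2\pi}\exp(-\zeta^2/2)$.

```python
import cmath
import math


def steepest_descent_integral(g, dg, z_s, h=0.01, n_steps=1000):
    """Estimate of I following Eqs. (8.30)-(8.38).

    g, dg: callables for g(z) and g'(z); z_s: saddle point with g'(z_s) = 0.
    """
    # Each half-path contributes (h/2) * exp(g(z_s)), Eqs. (8.37) and (8.38).
    total = h * cmath.exp(g(z_s))
    for side in (+1, -1):           # +1: path going "right", -1: going "left"
        z = z_s + side * h          # Eqs. (8.30) and (8.33)
        for _ in range(n_steps):
            slope = dg(z)
            dz = -slope.conjugate() / abs(slope) * h   # Eqs. (8.31), (8.34)
            # The sum enters with a minus sign on the left path, Eq. (8.38).
            total += side * dz * cmath.exp(g(z))
            z += dz                                    # Eqs. (8.32), (8.35)
    return total


# Toy check (not from the book): standard Gaussian characteristic function.
ZETA = 1.5


def g_toy(z):
    return -1j * z * ZETA - 0.5 * z * z


def dg_toy(z):
    return -1j * ZETA - z


I_TOY = steepest_descent_integral(g_toy, dg_toy, complex(0.0, -ZETA))
```

In this toy case the steepest descent path is the horizontal line through the saddle point, the imaginary part of $g$ vanishes along it, and the integrand is a non-oscillatory Gaussian bell, which illustrates why the method is well suited for numerical integration.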

A numerical estimate $\hat{\nu}_Z^+(\zeta)$ of the mean crossing rate can now be calculated by
the sum

$$\hat{\nu}_Z^+(\zeta) = -\frac{1}{(2\pi)^2}\, \Re\Bigl\{ \sum_{j=-L}^{L} \frac{1}{w_j^2}\, \hat{I}(\zeta, w_j)\, \Delta w_j \Bigr\}, \tag{8.39}$$

where the discretization points $w_j$ are chosen to follow the negative real axis from
a suitably large negative number up to a point at $-\varepsilon$, where $0 < \varepsilon \le a$, then follow
a semicircle in the lower half plane to $\varepsilon$ on the positive real axis, and finally follow
this axis to a suitably large positive number. Since the numerical estimate does not
necessarily have an imaginary part that is exactly equal to zero, the real part operator
$\Re$ has been applied.

Generally, the CPU time required to carry out the computations above can be
quite long, depending on the size of the problem, which is related to the number N of
eigenvalues. It is therefore of interest to see if approximating formulas are accurate
enough. The first such approximation to have a look at is the Laplace approximation
for the inner integral over the saddle point (Henrici 1977). The simplest version of
this approximation, adapted to the situation at hand, leads to the result

$$I = I(\zeta, w) \approx \sqrt{\frac{2\pi}{-\dfrac{\partial^2 g(z_s; w)}{\partial x^2}}}\; \exp\{g(z_s; w)\}, \tag{8.40}$$

which can be substituted directly into Eq. (8.39), leading to an approximation of
$\nu_Z^+(\zeta)$, which is denoted by $\tilde{\nu}_Z^+(\zeta)$.
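The Laplace approximation of Eq. (8.40) needs only the value of $g$ and its second derivative at the saddle point. The Python sketch below estimates the second derivative by central differences along the real direction; the function name and the step `dx` are illustrative choices. As a toy check that is not taken from the book, it is applied to $M(z) = 1/(1 - \mathrm{i}z)$, the characteristic function of a unit exponential variable, for which the saddle point is $z_s = -\mathrm{i}(1 - 1/\zeta)$ for $\zeta > 1$ and the approximation evaluates to $\sqrt{2\pi}\, e^{1-\zeta}$, whereas the exact integral is $2\pi e^{-\zeta}$.

```python
import cmath
import math


def laplace_inner_integral(g, z_s, dx=1e-4):
    """Laplace approximation of I, Eq. (8.40), at the saddle point z_s."""
    # Second derivative of g in the x-direction by central differences.
    d2 = (g(z_s + dx) - 2.0 * g(z_s) + g(z_s - dx)) / dx ** 2
    return math.sqrt(2.0 * math.pi / (-d2.real)) * cmath.exp(g(z_s))


# Toy check (not from the book): unit exponential characteristic function.
ZETA = 3.0


def g_toy(z):
    return -1j * z * ZETA - cmath.log(1.0 - 1j * z)


Z_SADDLE = complex(0.0, -(1.0 - 1.0 / ZETA))
I_LAPLACE = laplace_inner_integral(g_toy, Z_SADDLE)
```

In this toy case the ratio of the approximate to the exact value is $e/\sqrt{2\pi} \approx 1.084$ for every $\zeta > 1$, which shows both the usefulness and the limited accuracy of the simplest version of the approximation.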

This approximation can also be exploited in the following way: (1) The full
method is used for an inner interval of w-values, which contribute significantly to
the integral in Eq. (8.27). (2) A Laplace approximation is then used in an outer
interval of w-values where the contribution is less than significant. Of course,
the level of significance is chosen according to some suitable criterion. By this
procedure, the CPU time was reduced by a factor of about 3. This method will be
referred to as the hybrid method, and the corresponding approximation of $\nu_Z^+(\zeta)$ is
denoted by $\check{\nu}_Z^+(\zeta)$.
A simple approximation proposed in Teigen and Naess (1999, 2003) is worth
closer scrutiny. It is based on the widely adopted simplifying assumption that
the displacement process is independent of the velocity process. This leads to an
alternative approximation of $\nu_Z^+(\zeta)$, which is denoted by $\bar{\nu}_Z^+(\zeta)$. It is given by the
formula

$$\bar{\nu}_Z^+(\zeta) = \nu_Z^+(\zeta_{\mathrm{ref}})\, \frac{f_Z(\zeta)}{f_Z(\zeta_{\mathrm{ref}})}, \tag{8.41}$$

where $f_Z$ denotes the marginal probability density of the surge response, and $\zeta_{\mathrm{ref}}$
denotes a suitable reference level, typically the mean response. Here, $\zeta_{\mathrm{ref}}$ has been
chosen as the point where $f_Z$ assumes its maximum, which corresponds well with
the mean response level. A general approximation for $\nu_Z^+(\zeta_{\mathrm{ref}})$ is given in Teigen
and Naess (2003). If only slow-drift response is considered, a good approximation
is obtained by putting $\nu_Z^+(\zeta_{\mathrm{ref}}) \approx 1/T_0$, where $T_0 = 2\pi/\omega_0$ is the slow-drift period.
The advantage of Eq. (8.41) is that the rhs is much faster to calculate than the exact
formula.
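Equation (8.41) is straightforward to evaluate once the marginal density is available. The following Python sketch takes the density as a callable; the function names are illustrative, and the reference rate for slow-drift response uses the approximation $1/T_0$ mentioned above.

```python
import math


def simplified_upcrossing_rate(zeta, pdf, zeta_ref, rate_ref):
    """Approximate upcrossing rate from Eq. (8.41).

    pdf: callable returning the marginal density f_Z; rate_ref: nu_Z^+(zeta_ref).
    """
    return rate_ref * pdf(zeta) / pdf(zeta_ref)


def slow_drift_reference_rate(omega0):
    """Reference rate 1/T0 with T0 = 2*pi/omega0 (slow-drift response only)."""
    return omega0 / (2.0 * math.pi)
```

Only ratios of the density enter, so `pdf` does not need to be normalised.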
A few comments on why the mean upcrossing rate has practical significance
may seem appropriate. As is already known, the extreme value distribution is not
completely determined by the mean upcrossing rate. This is true only when the
upcrossing events of the high response levels can be assumed to be statistically
independent. Usually that is a good approximation except when the total damping
is very small. For such cases, Naess (1999) has developed a simple, but effective,
method to account for the effect of low damping on the extreme value distribution.
This method is based on the mean upcrossing rate and the appropriate damping
parameter. This is often advantageous since no time series of response is required.
However, when those are available, the key to an accurate estimation of the extreme
value distribution would be the application of the ACER method.

8.5 Numerical Examples

8.5.1 Slow-Drift Response

To illustrate the accuracy of the numerical method, the first example concerns a
simple model for the slow-drift response of a moored offshore structure, cf. Naess
and Machado (2000). Specifically, the response process for this case may be written
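as ($\lambda = \lambda_1 = \lambda_2$)

$$Z_2(t) = \lambda \bigl( W_1(t)^2 + W_2(t)^2 \bigr). \tag{8.42}$$

For this process, the upcrossing rate $\nu_{Z_2}^+(\zeta)$ is given by the relation

$$\nu_{Z_2}^+(\zeta) = \frac{\hat{\sigma}_1}{\sqrt{2\pi}} \exp\Bigl\{ -\frac{\zeta}{2\lambda} + \frac{1}{2} \ln\frac{\zeta}{\lambda} \Bigr\}, \tag{8.43}$$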

Table 8.1 Comparison of analytical and numerical mean upcrossing rate
$\zeta$   $\nu_{Z_2}^+(\zeta)$   $\tilde{\nu}_{Z_2}^+(\zeta)$
0.5   $1.37 \cdot 10^{-2}$   $1.37 \cdot 10^{-2}$
1.0   $4.31 \cdot 10^{-3}$   $4.30 \cdot 10^{-3}$
2.0   $2.995 \cdot 10^{-4}$   $2.995 \cdot 10^{-4}$
3.0   $1.802 \cdot 10^{-5}$   $1.793 \cdot 10^{-5}$
5.0   $5.618 \cdot 10^{-8}$   $5.618 \cdot 10^{-8}$
7.0   $1.605 \cdot 10^{-10}$   $1.592 \cdot 10^{-10}$

Table 8.2 Main particulars Draught (m) 80.0

of the moored deep floater
(MDF) Column diameter (m) 10.0
Natural period surge/sway (s) 133.5
Natural period yaw (s) 121

where .σ̂1 = s11 − (r12 )2 . This special case provides a suitable test for the accuracy
of the numerical method.
Let .ν̃Z+2 (ζ ) denote the mean upcrossing rate of .Z2 (t) calculated by the numerical
method. Table 8.1 compares the analytical with the numerical mean upcrossing rate
for different levels, and it is seen that the agreement is very good indeed.
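
As a small illustration, Eq. (8.43) can be evaluated directly. The sketch below is not the numerical method of the chapter; it only codes the closed-form expression, and the parameter values used in the usage lines ($\lambda = 1$, $\hat\sigma_1 = 1$) are hypothetical placeholders, since the values behind Table 8.1 are not stated here.

```python
import math


def upcrossing_rate_eq_8_43(zeta, lam, sigma_hat_1):
    """Closed-form mean upcrossing rate of Z2(t) = lam*(W1^2 + W2^2), Eq. (8.43).

    zeta        : response level (must be positive)
    lam         : common eigenvalue lambda = lambda_1 = lambda_2 (must be positive)
    sigma_hat_1 : the parameter sigma-hat_1 of Eq. (8.43), supplied by the caller
    """
    if zeta <= 0.0 or lam <= 0.0:
        raise ValueError("zeta and lam must be positive")
    exponent = -zeta / (2.0 * lam) + 0.5 * math.log(zeta / lam)
    return sigma_hat_1 / math.sqrt(2.0 * math.pi) * math.exp(exponent)


def upcrossing_rate_sqrt_form(zeta, lam, sigma_hat_1):
    """Same quantity with exp(0.5*ln(zeta/lam)) written as sqrt(zeta/lam)."""
    return (sigma_hat_1 / math.sqrt(2.0 * math.pi)
            * math.sqrt(zeta / lam) * math.exp(-zeta / (2.0 * lam)))


# Hypothetical parameter values, for illustration only.
for level in (0.5, 1.0, 2.0):
    print(level, upcrossing_rate_eq_8_43(level, lam=1.0, sigma_hat_1=1.0))
```

Writing the logarithmic term as a square root shows that the rate behaves like $\sqrt{\zeta/\lambda}\,\exp(-\zeta/2\lambda)$, which is a convenient check on any numerical implementation.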

8.5.2 Moored Deep Floater

The numerical results presented in this example are based on a specific model
structure (Naess et al. 2006). It is a moored deep floater (MDF), which is also
called a spar-buoy, with main particulars as detailed in Table 8.2. Figure 8.1
shows the submerged part of the floater in the form of a computer mesh, which
is used for the calculation of the hydrodynamic transfer functions. The total mass
(including added mass) of the MDF is $M = 12.5 \cdot 10^6$ kg. The damping ratio is
set equal to $\xi = 0.06$, and the natural frequency in surge or sway is $\omega_0 = 0.047$
rad/s. Note that the second-order theory is based on the assumption that the QTF
$\hat H_2(\omega_i, -\omega_j) = \hat L(\omega_i - \omega_j)\, \hat K_2(\omega_i, -\omega_j)$, where $\hat K_2(\cdot, \cdot)$ is a QTF characterizing
the slowly varying surge forces on the MDF, and $\hat L(\cdot)$ is a linear transfer function
for the surge motion of the MDF, that is,

\[
\hat L(\omega) = \frac{1}{M \left[ -\omega^2 + 2 i \xi \omega_0 \omega + \omega_0^2 \right]}. \tag{8.44}
\]

An example of the quadratic transfer function for a floating offshore structure is
presented in Sect. 9.6.5.
The deep floater studied in this example could represent the supporting structure
of a floating wind turbine. The Hywind turbine is a particular case of such a
structure, where the concept of a moored deep floater is used as a supporting
structure. A sketch of the Hywind turbine is presented in Fig. 8.2.

Fig. 8.1 Computer mesh of the submerged part of the moored deep floater and the near field of the sea surface

Fig. 8.2 A sketch of the Hywind floating wind turbine (©Equinor)

The random stationary sea state is specified by a JONSWAP spectrum, which is
given as follows:

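\[
S_X(\omega) = \frac{\alpha g^2}{\omega^5} \exp\left\{ -\frac{5}{4} \left( \frac{\omega_p}{\omega} \right)^4 + \ln\gamma\, \exp\left[ -\frac{1}{2\sigma^2} \left( \frac{\omega}{\omega_p} - 1 \right)^2 \right] \right\}, \tag{8.45}
\]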

where $g = 9.81$ m s$^{-2}$, $\omega_p$ denotes the peak frequency in rad/s, and $\alpha$, $\gamma$, and $\sigma$ are
parameters related to the spectral shape. $\sigma = 0.07$ when $\omega \le \omega_p$, and $\sigma = 0.09$
when $\omega > \omega_p$. The parameter $\gamma$ is chosen to be equal to 3.0. The parameter $\alpha$ is
determined from the following empirical relationship:

\[
\alpha = 5.06 \left( \frac{H_s}{T_p^2} \right)^2 \left( 1 - 0.287 \ln\gamma \right). \tag{8.46}
\]

Here $H_s$ denotes the significant wave height and $T_p = 2\pi/\omega_p$ the spectral peak wave period. For
the subsequent calculations, $H_s = 10.0$ m and $T_p = 12$ s. The natural frequency in
surge is 0.047 rad/s, which is well below the range where the waves have noticeable
energy. This is why the second-order, nonlinear term in the Volterra expansion is
needed to capture the resonant motions in surge of the MDF.
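
The spectrum of Eqs. (8.45) and (8.46) is easy to code. The following is a minimal sketch using the parameter values quoted above ($H_s = 10.0$ m, $T_p = 12$ s, $\gamma = 3.0$); the function names are illustrative and not taken from the book.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m/s^2]


def jonswap_alpha(hs, tp, gamma):
    """Empirical alpha parameter, Eq. (8.46)."""
    return 5.06 * (hs / tp**2) ** 2 * (1.0 - 0.287 * np.log(gamma))


def jonswap_spectrum(omega, hs=10.0, tp=12.0, gamma=3.0):
    """JONSWAP spectrum S_X(omega) of Eq. (8.45) for omega > 0 [rad/s]."""
    omega = np.asarray(omega, dtype=float)
    omega_p = 2.0 * np.pi / tp
    # sigma switches value at the spectral peak, as stated in the text.
    sigma = np.where(omega <= omega_p, 0.07, 0.09)
    alpha = jonswap_alpha(hs, tp, gamma)
    peak_term = np.exp(-0.5 * ((omega / omega_p - 1.0) / sigma) ** 2)
    exponent = -1.25 * (omega_p / omega) ** 4 + np.log(gamma) * peak_term
    return alpha * G**2 / omega**5 * np.exp(exponent)


omega_p = 2.0 * np.pi / 12.0
print("alpha            :", jonswap_alpha(10.0, 12.0, 3.0))
print("S_X at peak      :", float(jonswap_spectrum(omega_p)))
print("S_X at 0.047 rad/s:", float(jonswap_spectrum(0.047)))
```

Evaluating the spectrum at the surge natural frequency of 0.047 rad/s shows that the wave energy there is numerically zero compared with the peak, which is the point made in the text about the need for the second-order term.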
To get an accurate representation of the response process, there is a specific
requirement that must be observed. Since the damping ratio is only 6%, the
resonance peak of the linear transfer function for the dynamics is quite narrow.
Hence, to capture the dynamics correctly, the frequency resolution must secure a
sufficient number of frequency values over the resonance peak. This usually leads
to an eigenvalue problem with the Q-matrix of size of the order of magnitude
$100 \times 100$. Using the full representation of this size in calculating the mean crossing
rate by the general method described here would lead to very heavy calculations. In
order to reduce this, the effect of restricting the calculations by retaining only some
of the terms in Eq. (8.13) has been investigated.
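
To make the resolution requirement concrete, the sketch below evaluates the transfer function of Eq. (8.44) with the stated values of $M$, $\xi$ and $\omega_0$, and counts how many grid frequencies fall inside the resonance peak. Two things are assumptions here: the peak width is taken as the usual half-power bandwidth $2\xi\omega_0$ of a lightly damped oscillator, and both frequency grids are hypothetical, since the grid actually used is not specified in the text.

```python
import numpy as np

M = 12.5e6      # total mass including added mass [kg]
XI = 0.06       # damping ratio
OMEGA0 = 0.047  # natural frequency in surge [rad/s]


def linear_transfer(omega):
    """Linear transfer function of Eq. (8.44)."""
    return 1.0 / (M * (-omega**2 + 2j * XI * OMEGA0 * omega + OMEGA0**2))


def points_over_peak(omega_grid):
    """Number of grid points within the half-power band |omega - omega0| <= xi*omega0."""
    return int(np.sum(np.abs(np.asarray(omega_grid) - OMEGA0) <= XI * OMEGA0))


# Hypothetical grids: 100 and 1000 equidistant frequencies on (0, 0.2] rad/s.
coarse = np.linspace(0.0, 0.2, 101)[1:]
fine = np.linspace(0.0, 0.2, 1001)[1:]

print("peak gain |L(omega0)|:", abs(linear_transfer(OMEGA0)))
print("points over peak, coarse grid:", points_over_peak(coarse))
print("points over peak, fine grid  :", points_over_peak(fine))
```

A check of this kind is cheap to run before setting up the eigenvalue problem, and it makes the trade-off between resolution and matrix size explicit.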
For the specific example considered, where exactly 100 (positive) frequencies
have been used, the values of the obtained eigenvalues $\lambda_j$ have been plotted in
Fig. 8.3. It is seen that a substantial portion of the response variance, which is
given by $\mathrm{Var}[Z_2(t)] = 4 \sum_{j=1}^{N} \lambda_j^2$, would be lost if only 10 or 20 eigenvalues were
retained. This is also a factor to consider when deciding on the number of terms to
retain.
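
Since the variance is proportional to $\sum_j \lambda_j^2$, the share of variance kept by a truncation follows directly from the eigenvalues. The sketch below computes that share; the eigenvalue sequence is made up for illustration (alternating sign, geometric decay) and is not the one plotted in Fig. 8.3. The eigenvalues are assumed to be retained in the order in which they are listed.

```python
import numpy as np


def retained_variance_fraction(eigvals, k):
    """Fraction of sum(lambda_j^2) kept when only the first k eigenvalues are retained."""
    lam = np.asarray(eigvals, dtype=float)
    if not 0 <= k <= lam.size:
        raise ValueError("k must lie between 0 and the number of eigenvalues")
    return float(np.sum(lam[:k] ** 2) / np.sum(lam**2))


# Hypothetical eigenvalues: NOT the values of the MDF example.
j = np.arange(100)
lam_demo = (-1.0) ** j * 0.97**j

for k in (10, 20, 50, 100):
    print(k, retained_variance_fraction(lam_demo, k))
```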

Fig. 8.3 The 100 normalized eigenvalues $\lambda_j/\lambda_1$ (normalized eigenvalue plotted against eigenvalue number)

Table 8.3 Comparison of calculated mean upcrossing rate $\hat\nu_{Z_2}^+(\zeta)$ for different step lengths

$\eta = \zeta/\lambda_1$    $h = 1.0 \cdot 10^{-3}$    $h = 1.0 \cdot 10^{-2}$
2.0     8.38 · 10^-3    8.38 · 10^-3
5.0     3.93 · 10^-3    3.93 · 10^-3
10.0    5.53 · 10^-4    5.50 · 10^-4
15.0    5.70 · 10^-5    5.65 · 10^-5
20.0    5.34 · 10^-6    5.26 · 10^-6
25.0    4.81 · 10^-7    4.71 · 10^-7

Table 8.4 Calculated mean upcrossing rates for ten eigenvalues

$\eta = \zeta/\lambda_1$    $\hat\nu_{Z_2}^+(\zeta)$    $\check\nu_{Z_2}^+(\zeta)$    $\tilde\nu_{Z_2}^+(\zeta)$
2.0     8.38 · 10^-3    8.38 · 10^-3    7.41 · 10^-3
5.0     3.93 · 10^-3    3.93 · 10^-3    3.59 · 10^-3
10.0    5.50 · 10^-4    5.50 · 10^-4    5.23 · 10^-4
15.0    5.65 · 10^-5    5.65 · 10^-5    5.59 · 10^-5
20.0    5.26 · 10^-6    5.26 · 10^-6    5.36 · 10^-6
25.0    4.71 · 10^-7    4.71 · 10^-7    4.92 · 10^-7

Table 8.5 Calculated mean upcrossing rates for 50 eigenvalues

$\eta = \zeta/\lambda_1$    $\hat\nu_{Z_2}^+(\zeta)$    $\check\nu_{Z_2}^+(\zeta)$    $\tilde\nu_{Z_2}^+(\zeta)$
2.0     6.55 · 10^-3    6.55 · 10^-3    5.93 · 10^-3
5.0     3.25 · 10^-3    3.25 · 10^-3    2.98 · 10^-3
10.0    5.04 · 10^-4    5.04 · 10^-4    4.70 · 10^-4
15.0    5.44 · 10^-5    5.44 · 10^-5    5.28 · 10^-5
20.0    5.19 · 10^-6    5.19 · 10^-6    5.20 · 10^-6
25.0    4.71 · 10^-7    4.71 · 10^-7    4.86 · 10^-7


In this example, the focus is on the slow-drift response. Hence, only results for
$Z_2(t)$ will be presented. In the tables, $\hat\nu_{Z_2}^+(\zeta)$, $\check\nu_{Z_2}^+(\zeta)$, $\tilde\nu_{Z_2}^+(\zeta)$, and $\nu_{Z_2}^+(\zeta)$ denote
the mean upcrossing rate of $Z_2(t)$ calculated by the full numerical method, the
hybrid method, the Laplace approximation, and the simplified method of Eq. (8.41),
respectively.
To highlight the effect of the increment parameter h, Table 8.3 compares the
results obtained by the full numerical method for two values of h for ten eigenvalues,
that is, for a response representation retaining the first ten terms. The CPU time
differs by a factor of roughly 10 between the two choices of a value for h. Since the
differences between the calculated crossing rates are fairly small, the larger value
was chosen to save CPU time.
Tables 8.4 and 8.5 present the results obtained for 10 and 50 eigenvalues,
respectively. It is apparent that there is some variability of the calculated mean
upcrossing rates depending on the number of eigenvalues included in the analysis.
Ideally, it would therefore be desirable to carry out the calculations with at least 50
eigenvalues.
To get a more detailed picture of how the crossing rate varies with the number of
eigenvalues retained, the mean upcrossing rate was calculated for the level $\eta = 20$ as
a function of the number of eigenvalues. The result is shown in Fig. 8.4.

Fig. 8.4 The mean upcrossing rate of the level $\eta = 20$ as a function of the number of retained eigenvalues (upcrossing frequency plotted against number of eigenvalues; curves: original system and updated system)

It was also
decided to investigate the effect of updating the truncated response representation

so that it had the correct variance. This was achieved by multiplying the retained
eigenvalues by a suitable factor. The effect of this updating on the calculated
upcrossing rate is also shown in Fig. 8.4. The figure indicates a couple of interesting
conclusions. Updating for variance can lead to inaccurate results for the crossing
rate for small to moderate number of eigenvalues retained. Comparing Figs. 8.3
and 8.4, it is seen that surprisingly accurate results are obtained for even a small
number of retained eigenvalues when the truncation is done exactly where negative
eigenvalues are followed by positive eigenvalues. This seems to provide the right
balance between the terms in the response representation, and it indicates a useful
criterion for truncating the response representation for crossing rate calculations.
It is also of great interest to observe that the simple Laplace approximation in fact
provides quite accurate estimates of the mean upcrossing rates, and for this method
the number of eigenvalues has practically no effect on the computational burden.
Hence, from a practical point of view, this is an extremely appealing method. In
Table 8.6, the results obtained by the hybrid method, the Laplace approximation, and
also the simple approximation of Eq. (8.41) for 100 eigenvalues have been listed.
It is seen that while there is excellent agreement between the hybrid method and
the Laplace approximation, the simple approximation leads to significantly lower
values. In terms of extreme value predictions, for the example structure at hand the
Laplace approximation is within about 1% of the hybrid method, while the simple
approximations would lead to an underestimation of typically 5–10% compared
with the two more accurate methods.

Table 8.6 Calculated mean upcrossing rates for 100 eigenvalues

$\eta = \zeta/\lambda_1$    $\check\nu_{Z_2}^+(\zeta)$    $\tilde\nu_{Z_2}^+(\zeta)$    $\nu_{Z_2}^+(\zeta)$
2.0     6.17 · 10^-3    5.59 · 10^-3    6.03 · 10^-3
5.0     3.03 · 10^-3    2.78 · 10^-3    2.74 · 10^-3
10.0    4.71 · 10^-4    4.40 · 10^-4    3.65 · 10^-4
15.0    5.11 · 10^-5    4.96 · 10^-5    3.44 · 10^-5
20.0    4.88 · 10^-6    4.90 · 10^-6    2.94 · 10^-6
25.0    4.44 · 10^-7    4.63 · 10^-7    2.44 · 10^-7

8.5.3 Wind Excited Structure

The third example is a simple model of a wind excited structure. It could easily
be adapted to the study of the response to wind loading of a high slender mast
supporting a large light panel for illumination at a sports stadium. An example of
such a structure is shown in Fig. 8.5.

Fig. 8.5 A light-mast at a stadium

The theory can be extended to cover the general case of an MDOF system as
discussed by Benfratello et al. (1998). However, for the purpose of illustration, an
SDOF model of structural response has been chosen.
Let the wind load on the structure be given as $F(t) = c\, U(t)^2$, where $c$ is
some constant, and $U(t)$ denotes a stationary Gaussian process representing the
wind speed. Writing $U(t) = \bar U + V(t)$, where $\bar U$ is the mean wind speed, and
$V(t)$ denotes a zero mean Gaussian process, the structural response to the loading
process $\tilde F(t) = 2 c \bar U V(t) + c V(t)^2$ will be considered. That is, the constant force
term $c \bar U^2$ is neglected. For a linear dynamic model, its effect can be added at the
end.

The structural response to $\tilde F(t)$ is assumed to be determined by the equation

\[
\ddot Z(t) + 2 \xi \omega_0 \dot Z(t) + \omega_0^2 Z(t) = M^{-1} \tilde F(t) = \frac{2 c \bar U}{M} V(t) + \frac{c}{M} V(t)^2, \tag{8.47}
\]

where .ω0 is the undamped natural frequency of the system, .ξ is the relative damping,
and M is the total mass.
The linear transfer function of this system is clearly given by $\hat L(\omega) = [-\omega^2 + 2 i \xi \omega_0 \omega + \omega_0^2]^{-1}$.
For this particular force model, the Volterra series is of a
degenerate kind. It can be shown that the QTF for the quadratic response component
of Eq. (8.47) is given by the expression $c\, \hat L(\omega_i - \omega_j)/M$ (Naess 1987).
What is now needed to fully specify the Q-matrix of Eq. (8.12) is the wind
velocity spectrum .SV (ω). Here, a Davenport spectrum is adopted

\[
\frac{\omega\, S_V(\omega)}{\kappa\, \bar U^2} = \frac{4\, \theta^2}{(1 + \theta^2)^{4/3}}, \tag{8.48}
\]

where $\theta = \omega L/(2\pi \bar U)$, $\kappa$ is the roughness/drag coefficient, and $L$ is a length scale.

For the present example, turbulence levels of $\sigma/\bar U = 0.1$ and $0.3$ are assumed. The
corresponding drag coefficient $\kappa$ is given by the relation $\sigma/\bar U = \sqrt{6 \kappa}$. For the
other parameters, $\bar U = 50$ [m/s] and $L = 1200$ [m] are selected. The damping
ratio is $\xi = 0.1$ and the natural frequency is $\omega_0 = \pi$ [rad/s]. Also, $c/M = 0.5 \cdot 10^{-3}$.
The frequency discretization of the problem ranges from $-5$ to $5$ [rad/s] with an
increment of $0.1$ [rad/s].
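
A minimal sketch of the wind spectrum of Eq. (8.48) with the parameter values just listed follows. One point is an assumption of this sketch rather than a statement from the text: integrating the right-hand side of Eq. (8.48) over $\omega \in (0, \infty)$ gives $6\kappa\bar U^2$, which agrees with $\sigma/\bar U = \sqrt{6\kappa}$ if the spectrum is read as one-sided. The code evaluates the expression with $|\omega|$ and sets the value at $\omega = 0$ to zero, its limiting value.

```python
import numpy as np

U_MEAN = 50.0     # mean wind speed [m/s]
L_SCALE = 1200.0  # length scale [m]


def kappa_from_turbulence(turbulence_level):
    """Drag coefficient kappa from sigma/U = sqrt(6*kappa)."""
    return turbulence_level**2 / 6.0


def davenport_spectrum(omega, turbulence_level):
    """S_V(omega) from Eq. (8.48); evaluated with |omega|, and 0 at omega = 0."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    kappa = kappa_from_turbulence(turbulence_level)
    theta = np.abs(omega) * L_SCALE / (2.0 * np.pi * U_MEAN)
    numerator = kappa * U_MEAN**2 * 4.0 * theta**2 / (1.0 + theta**2) ** (4.0 / 3.0)
    out = np.zeros_like(omega)
    nonzero = omega != 0.0
    out[nonzero] = numerator[nonzero] / np.abs(omega[nonzero])
    return out


# Frequency grid of the example: -5 to 5 rad/s in steps of 0.1 rad/s.
omega_grid = np.linspace(-5.0, 5.0, 101)
s_v = davenport_spectrum(omega_grid, turbulence_level=0.3)
print("kappa:", kappa_from_turbulence(0.3))
print("max S_V on grid:", s_v.max())
```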
Table 8.7 lists calculated values for the mean upcrossing rates $\nu_Z^+(\zeta)$ of
the response process $Z(t)$ with no restrictions imposed on the covariance structure
for the case $\sigma/\bar U = 0.3$. To get an idea about the influence of the correlation
between $W$ and $\dot W$, the upcrossing rate under the assumption that $\Sigma_{12} = 0$ has
also been calculated. This crossing rate is denoted by $\tilde\nu_Z^+(\zeta)$. From Table 8.7, it is
seen that $\nu_Z^+(\zeta)$ and $\tilde\nu_Z^+(\zeta)$ are almost equal, supporting the often assumed negligible
influence on the crossing rate of the dependence between $W$ and $\dot W$.
Both estimates $\tilde\nu_Z^+(\zeta)$ and $\nu_Z^+(\zeta)$ are then compared with the upcrossing rate
$\nu_g^+(\zeta)$ of the linear part of the response, which is a Gaussian process. The spectrum
of the linear part of the response is

\[
S_Z(\omega) = \left( \frac{2 c \bar U}{M} \right)^2 |\hat L(\omega)|^2\, S_V(\omega). \tag{8.49}
\]

Comparing $\nu_g^+(\zeta)$ with $\nu_Z^+(\zeta)$, it is observed that the relative difference between
them increases as the level $\zeta$ becomes higher, which is to be expected.

Table 8.7 Mean upcrossing rates against different levels. $\tilde\nu_Z^+(\zeta)$: numerical method assuming $\Sigma_{12} = 0$; $\nu_Z^+(\zeta)$: numerical method; and $\nu_g^+(\zeta)$: Gaussian approximation

$\zeta$ [meters]    $\tilde\nu_Z^+(\zeta)$    $\nu_Z^+(\zeta)$    $\nu_g^+(\zeta)$
0.1     0.29891         0.28714         0.22118
0.15    0.17232         0.16506         0.12021
0.2     0.08484         0.08092         0.05119
0.3     0.01454         0.01368         0.00446
0.4     0.00181         0.00167         1.46 · 10^-4
0.5     1.83 · 10^-4    1.66 · 10^-4    1.82 · 10^-6

From a practical point of view, it is interesting to observe the effect on the
predicted extreme responses of the full quadratic model for the wind loading as
opposed to a Gaussian approximation. For this purpose, the following extreme value
distribution is adopted:

\[
F_{\mathrm{ext}}(\zeta) = \exp\left\{ -\nu_Z^+(\zeta)\, T \right\}, \tag{8.50}
\]

where $F_{\mathrm{ext}}(\zeta)$ is the cumulative distribution function of the extreme response during
a time interval of length $T$, that is, $\max\{Z(t);\ 0 \le t \le T\}$.
Taking into account also the contributing constant forcing term $c \bar U^2$ that was left
out and which adds a displacement of 0.13 m, it can be verified that the extreme
values at the exceedance probability level $10^{-2}$ are about 6–7% higher for the full
quadratic model as compared to the Gaussian approximation when $\sigma/\bar U = 0.1$.
The corresponding number is about 30% when $\sigma/\bar U = 0.3$. Clearly, these numbers
depend on the specific examples considered, but they serve to indicate what the
effect will be of neglecting the quadratic part of the wind loading.
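
The way Eq. (8.50) turns upcrossing rates into extreme value levels can be sketched with the tabulated rates of Table 8.7 (case $\sigma/\bar U = 0.3$). Several choices below are assumptions made only for illustration: the duration $T = 60$ s is hypothetical (the duration used in the text is not stated here), the rates are interpolated linearly in $\ln \nu$ between the tabulated levels, and the constant displacement of 0.13 m is not added.

```python
import numpy as np

# Levels and rates copied from Table 8.7 (sigma/U = 0.3).
ZETA = np.array([0.1, 0.15, 0.2, 0.3, 0.4, 0.5])
NU_Z = np.array([0.28714, 0.16506, 0.08092, 0.01368, 0.00167, 1.66e-4])  # quadratic model
NU_G = np.array([0.22118, 0.12021, 0.05119, 0.00446, 1.46e-4, 1.82e-6])  # Gaussian approx.


def extreme_cdf(nu, duration):
    """Extreme value distribution of Eq. (8.50) for a given upcrossing rate."""
    return np.exp(-np.asarray(nu) * duration)


def level_for_exceedance(zeta, nu, duration, p_exceed):
    """Level with exceedance probability p_exceed, by log-linear interpolation of nu."""
    target = np.log(-np.log(1.0 - p_exceed) / duration)
    log_nu = np.log(nu)
    if not log_nu.min() <= target <= log_nu.max():
        raise ValueError("target rate lies outside the tabulated range")
    # np.interp needs increasing abscissae; the rates decrease with the level.
    return float(np.interp(target, log_nu[::-1], zeta[::-1]))


T_HYPOTHETICAL = 60.0  # seconds; illustrative value only
z_quad = level_for_exceedance(ZETA, NU_Z, T_HYPOTHETICAL, 1e-2)
z_gauss = level_for_exceedance(ZETA, NU_G, T_HYPOTHETICAL, 1e-2)
print("quadratic model level:", z_quad)
print("Gaussian approx level:", z_gauss)
```

Because the tabulated rates span only a limited range, the function refuses to extrapolate; for longer durations the rates would have to be computed at higher levels.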

Appendix 1: The Average Crossing Rate

A heuristic proof of Eqs. (8.19) and (8.20) will be given. For a rigorous derivation,
the reader is referred to Naess (2000b). The derivation will be based on Parseval’s
formula, which relates pairs of Fourier transforms. First, equation (8.18) will be
rewritten. For this purpose the Heaviside function $H(x)$ is introduced. It is defined
as follows: $H(x) = 0$ for $x < 0$, $H(0) = 1/2$, and $H(x) = 1$ for $x > 0$. For
simplicity, also write $p(x) = f_{Z \dot Z}(\zeta, x)$, where $\zeta$ will be fixed. Then,

\[
\nu_Z^+(\zeta) = \int_{-\infty}^{\infty} x\, p(x)\, H(x)\, dx. \tag{8.51}
\]

The Fourier transform $\mathcal{F}$ and its inverse $\hat{\mathcal{F}}$ are (formally) defined as follows:

\[
\hat f(t) = \mathcal{F}[f(x)] = \int_{-\infty}^{\infty} f(x)\, e^{i x t}\, dx, \tag{8.52}
\]

and

\[
f(x) = \hat{\mathcal{F}}[\hat f(t)] = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat f(t)\, e^{-i x t}\, dt. \tag{8.53}
\]

Let $G(x) = x\, p(x)$. Parseval's formula provides us with the following equation
($*$ denotes complex conjugation):

\[
\int_{-\infty}^{\infty} G(x)\, H(x)\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat G(t)\, \hat H(t)^*\, dt. \tag{8.54}
\]

The Fourier transform $\hat H(t)$ of the Heaviside function can be shown to be given by
the relation (Bracewell 1986)

\[
\hat H(t) = \frac{i}{t} + \pi\, \delta(t), \tag{8.55}
\]

where $\delta(\cdot)$ denotes Dirac's delta function. The Fourier transform pair $H(x)$, $\hat H(t)$
can only be interpreted as such in a formal manner. A certain caution should
therefore be exercised when they are used in calculations. That is, one should
understand their limitations and proper use, which requires knowledge about how
the relation given in Eq. (8.55) was established (Bracewell 1986).
If $\hat p(t)$ denotes the Fourier transform of $p(x)$, then $\hat G(t) = -i\, d\hat p(t)/dt$. From
the definition of the characteristic function, it is obtained that

\[
\hat p(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} M_{Z \dot Z}(s, t)\, e^{-i \zeta s}\, ds, \tag{8.56}
\]

leading to

\[
\frac{d \hat p(t)}{dt} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{\partial M_{Z \dot Z}(s, t)}{\partial t}\, e^{-i \zeta s}\, ds. \tag{8.57}
\]

Hence it is found that

\[
\begin{aligned}
\nu_Z^+(\zeta) &= \frac{-i}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\partial M_{Z \dot Z}(s, t)}{\partial t}\, e^{-i \zeta s} \left( -\frac{i}{t} + \pi\, \delta(t) \right) dt\, ds \\
&= -\frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{1}{t}\, \frac{\partial M_{Z \dot Z}(s, t)}{\partial t}\, e^{-i \zeta s}\, dt\, ds
 - \frac{i}{4\pi} \int_{-\infty}^{\infty} \left. \frac{\partial M_{Z \dot Z}(s, t)}{\partial t} \right|_{t=0} e^{-i \zeta s}\, ds,
\end{aligned} \tag{8.58}
\]

which is seen to agree with Eq. (8.19). A scrutiny of the derivation of the (formal)
Fourier transform of the Heaviside function $H(\cdot)$ reveals that the integral with
respect to $t$ in the above expression should be understood as a principal value
integral as described after Eq. (8.19).
In the case of a stationary process, $\nu_Z^+(\zeta) = \nu_Z(\zeta)/2$, where $\nu_Z(\zeta)$ denotes the
average crossing rate of the level $\zeta$. The corresponding Rice formula for the crossing
rate is given as

\[
\nu_Z(\zeta) = \int_{-\infty}^{\infty} x\, p(x)\, \mathrm{sign}(x)\, dx, \tag{8.59}
\]

where $\mathrm{sign}(x) = -1$ for $x < 0$, $\mathrm{sign}(0) = 0$, and $\mathrm{sign}(x) = 1$ for $x > 0$. Since
$\mathcal{F}[\mathrm{sign}(x)] = 2 i/t$, it is seen from the derivation above that indeed

\[
\begin{aligned}
\nu_Z^+(\zeta) &= \frac{-i}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\partial M_{Z \dot Z}(s, t)}{\partial t}\, e^{-i \zeta s} \left( -\frac{i}{t} \right) dt\, ds \\
&= -\frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{1}{t}\, \frac{\partial M_{Z \dot Z}(s, t)}{\partial t}\, e^{-i \zeta s}\, dt\, ds.
\end{aligned} \tag{8.60}
\]

Appendix 2: The Characteristic Function

The derivation of the formula for the characteristic function $M_{Z \dot Z}$ will be based on
an integral equality which is cited, but not proved, in a less general form by Cramer
(1946). The prime (') will be used to denote transposition of a matrix or vector, and
$\det(A)$ denotes the determinant of a square matrix $A$.

Theorem 8.1 Let $A$ be a complex, symmetric, and nonsingular $n \times n$ matrix, and
assume that the real part of $A$ is positive definite. Let $d$ denote a complex $n \times 1$
vector. Then the following integral equality holds:

\[
\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\left\{ d' x - \frac{1}{2}\, x' A\, x \right\} dx_1 \ldots dx_n = \frac{(2\pi)^{n/2}}{\sqrt{\det(A)}} \exp\left\{ \frac{1}{2}\, d' A^{-1} d \right\}. \tag{8.61}
\]

Proof Let $\mu = A^{-1} d$. It is then easily verified that $2\, d' x - x' A\, x = -(x - \mu)' A\, (x - \mu) + d' A^{-1} d$.
Hence, it follows that the desired result is obtained if the
following equality is proved ($z = x - \mu$):

\[
\int_{-\infty - i\Im(\mu_1)}^{\infty - i\Im(\mu_1)} \cdots \int_{-\infty - i\Im(\mu_n)}^{\infty - i\Im(\mu_n)} \exp\left\{ -\frac{1}{2}\, z' A\, z \right\} dz = \frac{(2\pi)^{n/2}}{\sqrt{\det(A)}}, \tag{8.62}
\]

where $\mu = (\mu_1, \ldots, \mu_n)'$, $\Im(\mu_j)$ denotes the imaginary part of $\mu_j$, and $dz = dz_1 \ldots dz_n$.
By assumption, $A = P + iQ$, where $P$ and $Q$ are real symmetric matrices
with $P$ positive definite. Then there exists a real unitary matrix $U$ such that
$U' P\, U = D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, $\lambda_j > 0$ for $j = 1, \ldots, n$. $\mathrm{diag}(\cdot)$ denotes a
diagonal matrix with the indicated arguments on the diagonal. The variable shift
$z \to y$ is made, where $y = U' z$. Then, $z' A\, z = y' D y + i\, y' \tilde Q y$, where $\tilde Q = U' Q U$
is a real symmetric matrix. A scaling of the coordinate variables is introduced by
the variable shift $y \to u$, where $u = \sqrt{D}\, y$, and $\sqrt{D} = \mathrm{diag}(\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n})$. Then,
$z' A\, z = u' u + i\, u' \hat Q u$, where $\hat Q = \sqrt{D}^{-1} \tilde Q \sqrt{D}^{-1}$ is again a real symmetric matrix.
Hence, a real unitary matrix $V$ exists such that $V' \hat Q\, V = \Omega = \mathrm{diag}(\omega_1, \ldots, \omega_n)$.
Let $E = I + i\Omega = \mathrm{diag}(1 + i\omega_1, \ldots, 1 + i\omega_n)$. By the variable shift $u \to v$,
where $v = V' u$, it follows that $z' A\, z = v' E\, v = \sum_{j=1}^{n} (1 + i\omega_j)\, v_j^2$, $v = (v_1, \ldots, v_n)'$.
Since $U$ and $V$ are unitary matrices, it is observed that $dz = dy = (\sqrt{\det(D)})^{-1} du = (\sqrt{\det(D)})^{-1} dv$.
Denoting the integral on the left hand side of
Eq. (8.62) by $I$, it is obtained that

\[
I = \frac{1}{\sqrt{\det(D)}} \int_{-\infty - i c_1}^{\infty - i c_1} \cdots \int_{-\infty - i c_n}^{\infty - i c_n} \exp\left\{ -\frac{1}{2} \sum_{j=1}^{n} (1 + i\omega_j)\, v_j^2 \right\} dv_1 \ldots dv_n, \tag{8.63}
\]

where $c_j$, $j = 1, \ldots, n$, are suitable constants. Since the functions $f_j(z) = \exp\{-(1/2)\,(1 + i\omega_j)\, z^2\}$
are entire functions in the complex variable $z = x + iy$
and since $\lim_{|x| \to \infty} f_j(z) = 0$ uniformly in $y$ for $|y| \le \mathrm{const.}$, it follows that the
path of integration from $-\infty - i c_j$ to $\infty - i c_j$ in Eq. (8.63) can be replaced
by the path from $-\infty$ to $\infty$ by Cauchy's theorem. From the standard result that
$\int_{-\infty}^{\infty} \exp\{-a x^2\}\, dx = \sqrt{\pi/a}$ for a complex constant $a$ provided that $\Re(a) > 0$, it
now follows that

\[
I = \frac{1}{\sqrt{\det(D)}} \prod_{j=1}^{n} \frac{\sqrt{2\pi}}{\sqrt{1 + i\omega_j}} = \frac{(2\pi)^{n/2}}{\sqrt{\det(D)}\, \sqrt{\det(E)}}. \tag{8.64}
\]

To prove Eq. (8.62), it only remains to show that $\det(D) \cdot \det(E) = \det(A)$.
Invoking the fact that $U$ and $V$ are unitary, it is obtained that

\[
\begin{aligned}
\det(E) &= \det(I + i\Omega) = \det(I + i\hat Q) = \det\left( I + i \sqrt{D}^{-1} \tilde Q \sqrt{D}^{-1} \right) \\
&= \det\left( \sqrt{D}^{-1} D \sqrt{D}^{-1} + i \sqrt{D}^{-1} \tilde Q \sqrt{D}^{-1} \right) \\
&= (\det(D))^{-1} \det(D + i\tilde Q) = (\det(D))^{-1} \det(P + iQ) \\
&= \frac{\det(A)}{\det(D)},
\end{aligned} \tag{8.65}
\]

which is what remained to be proved.
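
A quick numerical sanity check of Theorem 8.1 is possible in the scalar case $n = 1$, where Eq. (8.61) reduces to $\int \exp\{d x - a x^2/2\}\, dx = \sqrt{2\pi/a}\, \exp\{d^2/(2a)\}$ with complex $a$, $\Re(a) > 0$. The sketch below compares a plain trapezoidal quadrature with the closed form. The values of $a$ and $d$ are arbitrary test values, and the principal branch of the square root is used, which is the appropriate branch when $\Re(a) > 0$.

```python
import numpy as np


def gaussian_integral_numeric(a, d, half_width=40.0, n=400001):
    """Trapezoidal approximation of the integral of exp(d*x - a*x^2/2) over the real line."""
    x = np.linspace(-half_width, half_width, n)
    f = np.exp(d * x - 0.5 * a * x**2)
    dx = x[1] - x[0]
    return np.sum(0.5 * (f[1:] + f[:-1])) * dx


def gaussian_integral_closed_form(a, d):
    """Right-hand side of Eq. (8.61) for n = 1 (principal square root)."""
    return np.sqrt(2.0 * np.pi / a) * np.exp(0.5 * d**2 / a)


a_test = 1.0 + 0.5j   # complex 'matrix' with positive real part
d_test = 0.3 + 0.2j   # complex 'vector'
print(gaussian_integral_numeric(a_test, d_test))
print(gaussian_integral_closed_form(a_test, d_test))
```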

Our goal is to calculate the characteristic function

\[
M_{Z \dot Z}(u, v) = E\left[ \exp\{ i u Z + i v \dot Z \} \right]. \tag{8.66}
\]

Using conditional probabilities, one may write

\[
M_{Z \dot Z}(u, v) = E\left[ e^{i u Z}\, E\left[ e^{i v \dot Z} \,|\, W \right] \right] = E\left[ e^{i u Z}\, E\left[ e^{i v (\dot Z | W)} \right] \right]. \tag{8.67}
\]

From Eq. (8.17) it follows that $\dot Z = Y' \dot W$, where $Y = \beta + 2 D W$, $\beta = (\beta_1, \ldots, \beta_n)'$,
and $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Hence, $(\dot Z | W) = Y' (\dot W | W)$. $(\dot W | W = w)$, where
$w = (w_1, \ldots, w_n)' \in \mathbb{R}^n$, is a Gaussian vector with mean value $\mu = \Sigma_{21} w$
and covariance matrix $C = \Sigma_{22} - \Sigma_{21} \Sigma_{12}$, cf. Anderson (1958). This means that
$(\dot Z | W = w)$ is a scalar Gaussian variable with mean value $m = y' \mu = y' \Sigma_{21} w$
and variance $s^2 = y' C\, y$, where $y = \beta + 2 D w$. Invoking the expression for the
characteristic function of a Gaussian variable, it follows that
$E[\exp\{ i v (\dot Z | W) \}] = \exp\{ i v\, Y' \Sigma_{21} W - \frac{1}{2} v^2\, Y' C\, Y \}$. It is now obtained that

\[
M_{Z \dot Z}(u, v) = E\left[ \exp\left\{ i u\, W' D W + i u\, \beta' W + i v\, Y' \Sigma_{21} W - \frac{1}{2} v^2\, Y' C\, Y \right\} \right]. \tag{8.68}
\]

The calculation of the expected value of this equation amounts to the calculation of
the following integral:

\[
M_{Z \dot Z}(u, v) = \frac{1}{(2\pi)^{n/2}} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\left\{ -\frac{1}{2} v^2\, \beta' C \beta + d' w - \frac{1}{2}\, w' B\, w \right\} dw, \tag{8.69}
\]

where

\[
d = \left( i u\, I + i v\, \Sigma_{12} - 2 v^2 D C \right) \beta, \tag{8.70}
\]

and

\[
B = I - 2 i u\, D - 4 i v\, D \Sigma_{21} + 4 v^2 D C D. \tag{8.71}
\]

It is recognized by inspection that the matrix $B$ is not in general symmetric.
However, the expression $w' B w$ is a scalar quantity, which implies that $w' B w = (w' B w)' = w' B' w$.
Hence $w' B w = w' A w$, where $A$ is the symmetrized version of
$B$, that is, $A = (B + B')/2$. By invoking Theorem 8.1, it is therefore obtained:

Theorem 8.2 Let the stochastic process $Z(t)$ be represented as given by Eq. (8.17).
Then the characteristic function $M_{Z \dot Z}(u, v) = E[\exp\{ i u Z + i v \dot Z \}]$ of the joint
variable $(Z, \dot Z)$ is given by the expression

\[
M_{Z \dot Z}(u, v) = \frac{1}{\sqrt{\det(A)}} \exp\left\{ -\frac{1}{2} v^2\, \beta' C \beta + \frac{1}{2}\, d' A^{-1} d \right\}, \tag{8.72}
\]

where

\[
d = \left( i u\, I + i v\, \Sigma_{12} - 2 v^2 D C \right) \beta, \tag{8.73}
\]

and

\[
A = I - 2 i u\, D - 2 i v \left( D \Sigma_{21} + \Sigma_{12} D \right) + 4 v^2 D C D. \tag{8.74}
\]

Note that in the absence of a linear component, that is, when $Z(t) = \sum_{j=1}^{n} \lambda_j W_j(t)^2$, then

\[
M_{Z \dot Z}(u, v) = \frac{1}{\sqrt{\det(A)}}. \tag{8.75}
\]

The marginal characteristic functions $M_Z(u) = E[\exp\{ i u Z \}]$ and $M_{\dot Z}(v) = E[\exp\{ i v \dot Z \}]$
are now easily obtained by the relations $M_Z(u) = M_{Z \dot Z}(u, 0)$ and
$M_{\dot Z}(v) = M_{Z \dot Z}(0, v)$. In particular,

\[
M_Z(u) = \frac{1}{\sqrt{\prod_{j=1}^{n} (1 - 2 i u \lambda_j)}} \exp\left\{ -\frac{1}{2} \sum_{j=1}^{n} \frac{u^2 \beta_j^2}{1 - 2 i u \lambda_j} \right\}, \tag{8.76}
\]

which is in agreement with previously derived results (Kac and Siegert 1947; Naess
1986). It is also obtained that

\[
M_{\dot Z}(v) = \frac{1}{\sqrt{\det(\tilde A)}} \exp\left\{ -\frac{1}{2} v^2\, \beta' C \beta + \frac{1}{2}\, \tilde d' \tilde A^{-1} \tilde d \right\}, \tag{8.77}
\]

where

\[
\tilde d = \left( i v\, \Sigma_{12} - 2 v^2 D C \right) \beta, \tag{8.78}
\]

and

\[
\tilde A = I - 2 i v \left( D \Sigma_{21} + \Sigma_{12} D \right) + 4 v^2 D C D. \tag{8.79}
\]

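
The marginal characteristic function of Eq. (8.76) lends itself to a simple numerical check. Reading Eq. (8.76) as the characteristic function of $Z = \sum_j (\lambda_j W_j^2 + \beta_j W_j)$ with independent standard Gaussian $W_j$ (an interpretation assumed here, since Eq. (8.17) is not reproduced in this section), the mean is $\sum_j \lambda_j$ and the variance is $2 \sum_j \lambda_j^2 + \sum_j \beta_j^2$. The sketch recovers both by finite differences of $M_Z(u)$ at $u = 0$; the values of $\lambda_j$ and $\beta_j$ are hypothetical.

```python
import numpy as np


def char_fn_z(u, lam, beta):
    """Marginal characteristic function M_Z(u) of Eq. (8.76)."""
    lam = np.asarray(lam, dtype=float)
    beta = np.asarray(beta, dtype=float)
    denom = 1.0 - 2j * u * lam
    # Each factor has positive real part, so the principal square root is continuous in u.
    return np.prod(denom**-0.5) * np.exp(-0.5 * np.sum(u**2 * beta**2 / denom))


def mean_and_variance_from_cf(lam, beta, h=1e-3):
    """Mean and variance from central differences of M_Z and ln M_Z at u = 0."""
    m_plus, m_minus = char_fn_z(h, lam, beta), char_fn_z(-h, lam, beta)
    mean = ((m_plus - m_minus) / (2j * h)).real
    variance = (-(np.log(m_plus) + np.log(m_minus)) / h**2).real  # ln M_Z(0) = 0
    return mean, variance


lam_demo = [1.0, -0.5, 0.25]   # hypothetical eigenvalues
beta_demo = [0.3, 0.1, -0.2]   # hypothetical linear coefficients
print(mean_and_variance_from_cf(lam_demo, beta_demo))
```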
Chapter 9
Monte Carlo Methods and Extreme
Value Estimation

9.1 Introduction

The last decade has seen a dramatic increase in the use of Monte Carlo methods
for solving stochastic engineering problems. There are primarily two reasons for
this increase. First, the computational power available today, even for a laptop
computer, is formidable and steadily increasing. Second, the versatility of Monte
Carlo methods makes them very attractive as a way of obtaining solutions to
stochastic problems. The drawback of Monte Carlo methods for a range of problems
has been that the required numerical calculations may take days, weeks, or even
months. But this situation is changing: some numerical problems that required
several days of computer time just a few years ago can now be
solved in minutes or hours. This has really opened the door for the use of Monte
Carlo-based methods for solving a wide array of stochastic engineering problems. In
this chapter the focus is on adapting Monte Carlo methods for estimation of extreme
values of stochastic processes encountered in various engineering disciplines.

9.2 Simulation of Stationary Stochastic Processes

The approach to the simulation of stationary stochastic processes favored in this
book is the spectral representation method (Shinozuka and Deodatis 1991). The
main reason for this is its simplicity and transparency for practical applications.
The procedure is described in Sect. 9.2.2, where Eq. (9.15) is a key result. Useful
background information for this is contained in Sect. 9.2.1. Combining this with
Example 9.2.4 provides a hands-on guide of how to simulate realizations of a
stationary Gaussian process.


The use of the fast Fourier transform (FFT) technique can substantially speed
up the production of realizations of stationary Gaussian processes, cf. Shinozuka
(1974) and Newland (1991).

9.2.1 Realizations of Stochastic Processes

Monte Carlo methods can be used for any input-output system subjected to a
stochastic process such that for every realization of the input stochastic process
a corresponding realization of the output process can be calculated. The basic idea
underlying the Monte Carlo method is that of producing a sample of output/response
time histories from a sample of input/loading time histories. This makes it possible
to estimate various statistics of the output/response process based on the available
sample of realizations. A key element in the Monte Carlo method is therefore
the generation of realizations of a stochastic process. Because of their simplicity, the focus here is on a
couple of widely used ways of representing stationary stochastic processes, and a
discussion is given of how realizations of such processes are generated, see also
Shinozuka and Jan (1972) and Shinozuka and Deodatis (1991).
A simple, but useful representation of a stationary stochastic process $X(t)$ can be
obtained as follows,

\[
X(t) = \sum_{j=1}^{N} \left\{ A_j \cos \omega_j t + B_j \sin \omega_j t \right\}, \quad -\infty < t < \infty, \tag{9.1}
\]

where, for $j = 1, \ldots, N$, the $\omega_j$ are positive constants, the $A_j$ and the $B_j$ are
random variables with $E[A_j] = E[B_j] = 0$, $E[A_j^2] = E[B_j^2] = \sigma_j^2$, and $E[A_j B_j] = 0$
for $j = 1, \ldots, N$. Also, $E[A_j A_k] = E[B_j B_k] = E[A_j B_k] = 0$ for $j \ne k$. When
these conditions are satisfied, it can be shown that $X(t)$ is a (weakly) stationary
process with,

\[
m_X = E[X(t)] = 0, \tag{9.2}
\]

and

\[
C_X(\tau) = E[(X(t) - m_X)(X(t + \tau) - m_X)] = \sum_{j=1}^{N} \sigma_j^2 \cos \omega_j \tau. \tag{9.3}
\]

The first step of the procedure is to specify what kind of random variables .Aj
and .Bj are assumed to be. A common choice in many cases is to assume that
these variables are independent and normally distributed with zero mean value and
known standard deviations .σj . A realization of .X(t) is then obtained when a set of
outcomes of the random variables $A_j$ and $B_j$ has been generated. In this connection,
it is useful to make the following observation: if $\tilde A_j$ is normally distributed with zero
mean and standard deviation equal to 1.0, then $A_j = \sigma_j \tilde A_j$ is normally distributed
with mean zero and standard deviation equal to $\sigma_j$. Similarly for $B_j$. Equation (9.1)
may therefore be written in the form,

\[
X(t) = \sum_{j=1}^{N} \sigma_j \left\{ \tilde A_j \cos(\omega_j t) + \tilde B_j \sin(\omega_j t) \right\}, \tag{9.4}
\]

where $\tilde A_j$ and $\tilde B_j$, $j = 1, \ldots, N$, is now a set of independent, standard normally
distributed variables. There are computer programs that may be used to generate
independent outcomes of a standard, normally distributed variable. It is seen that
$2N$ outcomes are needed for this example. Specifically, assume that $\tilde a_j$ and $\tilde b_j$,
$j = 1, \ldots, N$ are the obtained outcomes from such a program. The corresponding
realization $x(t)$ is then,

\[
x(t) = \sum_{j=1}^{N} \sigma_j \left\{ \tilde a_j \cos(\omega_j t) + \tilde b_j \sin(\omega_j t) \right\}. \tag{9.5}
\]


This procedure can be repeated as many times as needed to produce the requested
sample size of realizations. It is, of course, understood here that each realization
is generated independently of all others. This procedure is illustrated with an
application to ocean waves in Example 9.2.4.
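
The procedure leading to Eq. (9.5) can be sketched in a few lines. The frequencies and standard deviations in the usage lines are hypothetical placeholders, and NumPy's `default_rng` stands in for the "computer programs" mentioned above.

```python
import numpy as np


def simulate_gaussian_amplitudes(t, omega, sigma, rng):
    """One realization x(t) according to Eq. (9.5).

    t     : array of time instants
    omega : array of N positive frequencies omega_j
    sigma : array of N standard deviations sigma_j
    rng   : numpy.random.Generator supplying the 2N standard normal outcomes
    """
    a = rng.standard_normal(omega.size)   # outcomes a~_j
    b = rng.standard_normal(omega.size)   # outcomes b~_j
    phase = np.outer(t, omega)            # matrix of omega_j * t_k
    return (np.cos(phase) * (sigma * a) + np.sin(phase) * (sigma * b)).sum(axis=1)


# Hypothetical three-component example.
rng = np.random.default_rng(seed=1)
omega_demo = np.array([0.5, 0.9, 1.4])
sigma_demo = np.array([1.0, 0.5, 0.25])
t_demo = np.linspace(0.0, 100.0, 2001)

x = simulate_gaussian_amplitudes(t_demo, omega_demo, sigma_demo, rng)
print(x[:5])
```

Calling the function repeatedly with the same generator yields independent realizations, and the ensemble variance at any fixed time should approach $\sum_j \sigma_j^2$, in line with Eq. (9.3) for $\tau = 0$.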
The following representation is also frequently used, especially when ergodic
properties are desirable,

\[
X(t) = \sum_{j=1}^{N} a_j \cos(\omega_j t + \Phi_j), \tag{9.6}
\]

where $a_j$ and $\omega_j$, $j = 1, \ldots, N$ are positive constants, and $\{\Phi_j\}_{j=1}^{N}$ is a set of
independent random variables that are uniformly distributed over $(0, 2\pi)$. It may be
shown that $X(t)$ is ergodic with respect to the mean value and the autocorrelation.
Generating realizations of this process is now quite straightforward. It is seen
that, in fact, only outcomes of the uniformly distributed phase angles $\Phi_j$, $j = 1, \ldots, N$
are needed. If $R_j$, $j = 1, \ldots, N$, denote independent random variables
uniformly distributed on $(0, 1)$, usually referred to as random numbers, then clearly
one may put $\Phi_j = 2\pi R_j$. Hence, by invoking a random number generator, usually
available on any computer, outcomes of $\Phi_j$, $j = 1, \ldots, N$ can be easily generated.
Let $r_j$, $j = 1, \ldots, N$, denote a set of independent random number outcomes. A
realization $x(t)$ of $X(t)$ is then,

\[
x(t) = \sum_{j=1}^{N} a_j \cos(\omega_j t + 2\pi r_j). \tag{9.7}
\]
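
A sketch of Eq. (9.7) follows, again with hypothetical amplitudes and frequencies. Because each term has a uniformly distributed phase, the variance of $X(t)$ is $\sum_j a_j^2/2$, and the ergodicity property means that the time average of $x(t)^2$ over one long realization should come close to that value.

```python
import numpy as np


def simulate_random_phase(t, omega, amp, rng):
    """One realization x(t) according to Eq. (9.7)."""
    r = rng.random(omega.size)  # random numbers r_j, uniform on [0, 1)
    return (amp * np.cos(np.outer(t, omega) + 2.0 * np.pi * r)).sum(axis=1)


# Hypothetical three-component example.
rng = np.random.default_rng(seed=2)
omega_demo = np.array([0.5, 0.9, 1.4])
amp_demo = np.array([1.0, 0.5, 0.25])
t_long = np.arange(0.0, 5000.0, 0.05)

x_long = simulate_random_phase(t_long, omega_demo, amp_demo, rng)
print("time-averaged x^2 :", np.mean(x_long**2))
print("sum of a_j^2 / 2  :", np.sum(amp_demo**2) / 2.0)
```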

9.2.2 Variance Spectra

Fourier analysis is routinely used to decompose the time histories of load and
response as sums or integrals of .cos(·) and .sin(·) terms over the frequency domain. It
is known how the transfer function gives a direct connection between the amplitudes
at each frequency for the load and response, provided the system is linear and time
invariant. A time history that is periodic can be decomposed as a sum over a finite
or countably infinite number of frequencies. If it is not periodic, the decomposition
must be expressed as an integral. It can be shown that this is possible only if the
time history dies out with time. Regarding the realizations of stationary stochastic
processes, they will not generally be periodic. Nor will they decrease with time
because the variance is constant, and this is a measure of the fluctuations around
the mean value. A direct frequency decomposition of the realizations of a stationary
process is therefore not directly feasible. This difficulty is circumvented by using
the autocovariance function of a stationary process. This function will generally
approach zero when its time argument increases.
Let $X(t)$ be a stationary process with autocovariance function $C_X(\tau)$. Assume
that $C_X(\tau) \to 0$ when $\tau \to \infty$ sufficiently rapidly so that $\int_{-\infty}^{\infty} |C_X(\tau)|\, d\tau$ has a
finite value, that is, $C_X(\tau)$ is assumed to be an integrable function.
The variance spectrum $S_X(\omega)$ of $X(t)$ is defined as the Fourier transform of
$C_X(\tau)$ as follows,

\[
S_X(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} C_X(\tau)\, e^{-i\omega\tau}\, d\tau. \tag{9.8}
\]

It was shown already in the 1930s that $C_X(\tau)$ and $S_X(\omega)$ constitute a Fourier
transform pair; that is, $C_X(\tau)$ is given by the inverse Fourier transform as,

\[
C_X(\tau) = \int_{-\infty}^{\infty} S_X(\omega)\, e^{i\omega\tau}\, d\omega. \tag{9.9}
\]

Equations (9.8) and (9.9) are often called the Wiener-Khintchine relations after
the originators, cf. Chatfield (1989).
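
The transform pair can be checked numerically on a case with a known answer. The example below is not from the book: for the autocovariance $C_X(\tau) = \sigma^2 e^{-a|\tau|}$, Eq. (9.8) gives $S_X(\omega) = \sigma^2 a / [\pi (a^2 + \omega^2)]$. The sketch evaluates Eq. (9.8) by the trapezoidal rule on a truncated, symmetric $\tau$ grid and compares with this closed form; the values of $\sigma$ and $a$ are arbitrary.

```python
import numpy as np


def spectrum_from_autocovariance(omega, tau, c_tau):
    """Trapezoidal evaluation of Eq. (9.8) on a symmetric tau grid."""
    integrand = c_tau * np.exp(-1j * omega * tau)
    dtau = tau[1] - tau[0]
    value = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dtau / (2.0 * np.pi)
    return value.real  # the imaginary part vanishes because C_X is symmetric


SIGMA, A = 1.5, 1.0                       # arbitrary test values
tau_grid = np.linspace(-60.0, 60.0, 240001)
c_grid = SIGMA**2 * np.exp(-A * np.abs(tau_grid))

for w in (0.0, 0.5, 2.0):
    exact = SIGMA**2 * A / (np.pi * (A**2 + w**2))
    print(w, spectrum_from_autocovariance(w, tau_grid, c_grid), exact)
```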
The quantity $S_X(\omega)$ is known by many names. A few that seem appropriate
to mention here are energy spectrum, power spectral density, spectral density, or
just spectrum. The name variance spectrum ties in directly with the interpretation
of $S_X(\omega)$. Putting $\tau = 0$ in Eq. (9.9) gives,

$$ \sigma_X^2 = C_X(0) = \int_{-\infty}^{\infty} S_X(\omega)\, d\omega. \tag{9.10} $$

This equation shows that $S_X(\omega)$ can be interpreted as a distribution of variance along
the frequency axis, provided that $S_X(\omega) \geq 0$, and in Sect. 9.2.5, it is shown that this
is always the case. Negative frequencies are a mathematical convenience and have
no real physical content. If $S_X(\omega)$ is to be interpreted as a distribution of variance
along the frequency axis, one would therefore expect that $S_X(-\omega) = S_X(\omega)$. This
symmetry property of $S_X(\omega)$ can be shown directly from Eq. (9.8). It follows also
from a rewriting of the Wiener-Khintchine relations to real form, which is of interest
in itself. Because $C_X(\tau)$ is symmetric, it follows from Eq. (9.8) that,

$$
\begin{aligned}
S_X(\omega) &= \frac{1}{2\pi} \int_0^{\infty} C_X(\tau)\, e^{-i\omega\tau}\, d\tau
             + \frac{1}{2\pi} \int_0^{\infty} C_X(-\tau)\, e^{i\omega\tau}\, d\tau \\
            &= \frac{1}{\pi} \int_0^{\infty} C_X(\tau) \cos \omega\tau \, d\tau ,
\end{aligned} \tag{9.11}
$$

where Euler’s relation $e^{ix} = \cos x + i \sin x$ is used. The last integral in the upper line
is obtained by the change of variable $\tau \to -\tau$, while the lower line follows from
the symmetry property $C_X(-\tau) = C_X(\tau)$.
From Eq. (9.11), it follows that,

$$ S_X(-\omega) = S_X(\omega), \tag{9.12} $$

because $\cos(-\omega\tau) = \cos \omega\tau$. Equation (9.9) can then be rewritten as,

$$ C_X(\tau) = 2 \int_0^{\infty} S_X(\omega) \cos \omega\tau \, d\omega. \tag{9.13} $$

If $S_X(\omega)$ is a reasonably nice function, the integral on the rhs of Eq. (9.13) can
be approximated by a finite sum, viz.

$$ C_X(\tau) = 2 \int_0^{\infty} S_X(\omega) \cos \omega\tau \, d\omega \approx \sum_{j=1}^{N} 2 S_X(\omega_j)\, \Delta\omega \cos \omega_j \tau, \tag{9.14} $$

for a suitable choice of $\omega_1 < \ldots < \omega_N$, and sufficiently small
$\Delta\omega = (\omega_N - \omega_1)/(N - 1)$. Let us now recollect previous results and define a stationary process,

$$ \tilde{X}(t) = \sum_{j=1}^{N} \left( A_j \cos \omega_j t + B_j \sin \omega_j t \right), \tag{9.15} $$

where the random variables $A_j$ and $B_j$ satisfy the necessary conditions for
stationarity, cf. Eq. (9.1). In addition, $\sigma_j^2 = E[A_j^2] = E[B_j^2] = 2 S_X(\omega_j) \Delta\omega$. Then,
according to Eq. (9.10), the autocovariance of $\tilde{X}(t)$ is given as,

$$ C_{\tilde{X}}(\tau) = \sum_{j=1}^{N} 2 S_X(\omega_j)\, \Delta\omega \cos \omega_j \tau. \tag{9.16} $$

A stationary process $\tilde{X}(t)$ is thus constructed with approximately the same variance
distribution, that is, variance spectrum, as $X(t)$. In a certain sense, $\tilde{X}(t)$ can be said
to represent $X(t)$. What has just been described is a variant from a class of methods
that is extensively used in practice to generate realizations of a given stationary
process. To get a concrete realization of a process represented by Eq. (9.15), one
has to generate outcomes of the random variables that enter the sum.
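
The quality of the finite-sum approximation in Eq. (9.14) can be checked numerically for a case where both members of the Wiener-Khintchine pair are known in closed form. The sketch below is only an illustration: the exponential autocovariance, the parameter values, the cutoff frequency `omega_max`, and the use of midpoint frequencies are all assumptions made for the example and are not taken from the text.

```python
import math

SIGMA2 = 1.0   # variance of the illustrative process (assumed value)
ALPHA = 1.0    # decay rate of the autocovariance in 1/s (assumed value)


def autocovariance(tau):
    """Assumed test case: C_X(tau) = sigma^2 * exp(-alpha * |tau|)."""
    return SIGMA2 * math.exp(-ALPHA * abs(tau))


def spectrum(omega):
    """Two-sided variance spectrum paired with the autocovariance above via Eq. (9.8)."""
    return SIGMA2 * ALPHA / (math.pi * (ALPHA ** 2 + omega ** 2))


def autocovariance_from_sum(tau, omega_max=200.0, n=20000):
    """Finite sum of Eq. (9.14) using n midpoint frequencies on (0, omega_max)."""
    d_omega = omega_max / n
    total = 0.0
    for j in range(n):
        omega_j = (j + 0.5) * d_omega
        total += 2.0 * spectrum(omega_j) * d_omega * math.cos(omega_j * tau)
    return total


if __name__ == "__main__":
    for tau in (0.0, 0.5, 2.0):
        print(tau, autocovariance(tau), autocovariance_from_sum(tau))
```

The small remaining difference at $\tau = 0$ comes mainly from truncating the frequency axis at `omega_max`, which illustrates why the frequency range must cover the part of the spectrum that carries significant variance.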
On the basis of Eq. (9.12) and the fact that negative frequencies do not really
have any physical meaning, it is common practice in engineering to use the one-sided
variance spectrum, which is denoted by $S_X^+(\omega)$ in this chapter, and defined
as,

$$ S_X^+(\omega) = \begin{cases} 2 S_X(\omega), & \omega \geq 0, \\ 0, & \omega < 0. \end{cases} \tag{9.17} $$

The distribution of variance is thereby concentrated to positive (physically
realizable) frequencies. The variance expressed in terms of the one-sided variance
spectrum is clearly,

$$ \sigma_X^2 = \int_0^{\infty} S_X^+(\omega)\, d\omega. \tag{9.18} $$

A typical variance spectrum produced from measured data for the wave elevation
at a point on the sea surface at a location in the North Sea is shown in Fig. 9.1. The
somewhat jagged look is mostly due to low numerical resolution.

Fig. 9.1 Typical “wave spectrum” from the North Sea


9.2.3 Units of Variance Spectra

In Eq. (9.10), the units of $S_X(\omega)$ are the square of the units of $X(t)$ divided by
radians per second. If $X(t)$ models the wave elevation at a location on the ocean
surface and is measured in meters, then the units of $S_X(\omega)$ are m² s/rad.
So far, the variance spectrum has only been considered as a function of circular
frequency $\omega$ with units rad/s. It is also quite common to give the variance spectrum
as a function of frequency $f$ with units in Hertz (oscillations per second). The
relation $\omega = 2\pi f$ implies a rescaling of the frequency axis when $\omega$ is replaced by
$f$. It is therefore not correct to believe that $S_X(2\pi f)$ would represent the variance
spectrum as a function of $f$; this would easily lead to the wrong variance. If $G_X^+(f)$
denotes the one-sided variance spectrum as a function of $f$ in Hz, then to conserve
the variance, it is necessary to require that $G_X^+(f)\, df = S_X^+(\omega)\, d\omega$. This leads to the
relation,

$$ G_X^+(f) = 2\pi\, S_X^+(\omega). \tag{9.19} $$

Exactly the same relation applies to two-sided spectra. A factor of $2\pi \approx 6.28$
occurring erroneously in the variance can have a significant impact on the results
in some cases. Thus, care should be exercised when adopting values of spectral
moments by checking which kind of frequency is used in the variance spectrum.
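
The effect of the rescaling in Eq. (9.19) is easy to verify numerically. The following sketch uses a made-up one-sided spectrum `s_plus` (an assumption chosen only because its integral is known to be 2) and compares the variance obtained from the correctly converted spectrum with the one obtained by naively reusing $S_X^+(2\pi f)$ as a function of $f$.

```python
import math


def s_plus(omega):
    """Assumed one-sided spectrum in m^2 s/rad; its integral over omega equals 2 m^2."""
    return omega ** 2 * math.exp(-omega) if omega >= 0.0 else 0.0


def g_plus(f):
    """One-sided spectrum in m^2/Hz obtained from s_plus through Eq. (9.19)."""
    return 2.0 * math.pi * s_plus(2.0 * math.pi * f)


def integrate(func, upper, n=4000):
    """Midpoint rule on the interval [0, upper]."""
    h = upper / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


OMEGA_MAX = 40.0                      # rad/s, beyond which s_plus is negligible
F_MAX = OMEGA_MAX / (2.0 * math.pi)   # the same cutoff expressed in Hz

var_omega = integrate(s_plus, OMEGA_MAX)                               # Eq. (9.18)
var_f = integrate(g_plus, F_MAX)                                       # same variance
var_wrong = integrate(lambda f: s_plus(2.0 * math.pi * f), F_MAX)      # too small by 2*pi

if __name__ == "__main__":
    print(var_omega, var_f, var_wrong)
```

With these assumed inputs, `var_omega` and `var_f` agree, while `var_wrong` is smaller by the factor $2\pi$ discussed above.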

9.2.4 Example: A Realization of a Wave Process

In this example, the procedure for producing a realization of a stationary stochastic
process described in Sect. 9.2.1 is illustrated. In particular, it will be shown how a
realization of a stochastic process with a given variance spectrum can be generated.
To be even more specific, it will be assumed that the task is to produce an arbitrary
realization of the wave elevation $X(t)$ at a given location in the North Sea with a
recorded wave spectrum, as depicted in Fig. 9.1. Thus, it is assumed that this wave
elevation can be represented as a stationary stochastic process $X(t)$ with a one-sided
variance spectrum $G_X^+(f)$, as shown in Fig. 9.1. As mentioned in Sect. 9.6.5,
there are several alternative methods that can be used to generate realizations of a
stochastic process. Our choice here is to use the method described in Sect. 9.6.5,
which amounts to approximating the process $X(t)$ with $\tilde{X}(t)$, and then generating
realizations of $\tilde{X}(t)$ instead.
To proceed, it is necessary to specify what kind of random variables $A_j$ and $B_j$
to use in Eq. (9.15). The common choice in the case of wave processes on the open
ocean is to assume that these variables are independent and normally distributed
with zero mean value. To determine the standard deviation, it is necessary to decide
on a suitable discretization of the frequency axis. For many practical purposes, it
would be desirable to have a discretization that would give of the order of $10^3$
frequencies in the frequency range where the waves have significant energy, e.g.,
from 0.27 to 0.99 Hz in this case, cf. the discussion at the end of the example.
Because the objective here is to illustrate the procedure, a rather coarse frequency
increment $\Delta f = 0.09$ Hz will do. That gives 8 frequencies denoted by $f_1, \ldots, f_8$
($f_1 = 0.315$ Hz, $f_{j+1} = f_j + 0.09$ Hz) in the specified frequency range. The process
$\tilde{X}(t)$ can then be written as,

$$ \tilde{X}(t) = \sum_{j=1}^{8} \left( A_j \cos(2\pi f_j t) + B_j \sin(2\pi f_j t) \right). \tag{9.20} $$

Fig. 9.2 Wave spectrum from the North Sea

Figure 9.2 shows the relevant part of the wave spectrum in Fig. 9.1 magnified along
the frequency axis to clarify how the standard deviation of $A_j$ and $B_j$ is determined.
What remains to get an approximate realization of $X(t)$ is to generate outcomes
of the random variables that enter $\tilde{X}(t)$. For this, an observation made in Sect. 9.2.1
is invoked: if $\tilde{A}_j$ is normally distributed with zero mean and standard deviation
equal to 1.0, then $A_j = \sigma_j \tilde{A}_j$ is normally distributed with zero mean and standard
deviation equal to $\sigma_j$. Similarly for $B_j$. Because $\sigma_j = \sqrt{G_X^+(f_j)\, \Delta f}$, Eq. (9.20)
may be written in the form,

$$ \tilde{X}(t) = \sum_{j=1}^{8} \sqrt{G_X^+(f_j)\, \Delta f}\, \left( \tilde{A}_j \cos(2\pi f_j t) + \tilde{B}_j \sin(2\pi f_j t) \right), \tag{9.21} $$

where $\tilde{A}_j$ and $\tilde{B}_j$, $j = 1, \ldots, 8$, now constitute a set of independent, standard
normally distributed variables. Computer programs are easily available for generating
independent outcomes of a standard, normally distributed variable. It is seen that 16
such outcomes are needed for the example.

Table 9.1 A table of the numerical information needed to produce a realization of the stochastic
process given by Eq. (9.21)

j   $f_j$ [Hz]   $\sqrt{G_X^+(f_j)\,\Delta f}$ [cm]   $r_{2j-1}$   $r_{2j}$   $\tilde{a}_j$   $\tilde{b}_j$
1   0.315        1.90                                  0.10097      0.32533    −1.2760         −0.4529
2   0.405        5.45                                  0.76520      0.13586     0.7232         −1.0991
3   0.495        6.57                                  0.34673      0.54876    −0.3942          0.1225
4   0.585        3.15                                  0.80959      0.09117     0.8764         −1.3335
5   0.675        2.23                                  0.39292      0.74945    −0.2717          0.6728
6   0.765        1.56                                  0.37542      0.04805    −0.3175         −1.6641
7   0.855        1.44                                  0.64894      0.74296     0.3825          0.6525
8   0.945        1.41                                  0.24805      0.24037    −0.6807         −0.7051

Fig. 9.3 A realization of the wave elevation (vertical axis: wave elevation in cm; horizontal axis: time in seconds)

An alternative procedure is to use a table or computer program for generating
(pseudo-)random numbers, which are uniformly distributed between 0 and 1. This
can also be used by invoking the following result. If $\Phi(\cdot)$ denotes the distribution
function of a standard, normally distributed variable, and $R$ denotes a random variable that
is uniformly distributed between 0 and 1, then the random variable $Z = \Phi^{-1}(R)$
is standard normally distributed. By generating 16 independent outcomes of
$R$: $r_1, \ldots, r_{16}$, then $z_1 = \Phi^{-1}(r_1), \ldots, z_{16} = \Phi^{-1}(r_{16})$ will be 16 independent
outcomes of a standard, normally distributed variable. This procedure is used here,
and the results are shown in Table 9.1, where $\tilde{a}_j = \Phi^{-1}(r_{2j-1})$ is an outcome of
$\tilde{A}_j$ and $\tilde{b}_j = \Phi^{-1}(r_{2j})$ is an outcome of $\tilde{B}_j$.
A piece of the corresponding realization is shown in Fig. 9.3, and one may get a
similar impression as when observing irregular seas out on the oceans. In practice,
there is often a need to generate many realizations to perform statistical analyses.
It is then necessary to repeat the procedure just described the requisite number of
times, and for each realization, a new set of outcomes independent of the previous
ones is chosen.
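
The steps of the example can be retraced with a few lines of Python. The sketch below takes the frequencies, the standard deviations, and the uniform random numbers from Table 9.1, applies $\Phi^{-1}$ through `statistics.NormalDist` from the standard library, and evaluates Eq. (9.21). The function and variable names are illustrative choices, not part of the original example.

```python
import math
from statistics import NormalDist

DELTA_F = 0.09                                               # frequency increment in Hz
FREQS = [0.315 + DELTA_F * j for j in range(8)]              # f_1, ..., f_8 in Hz
SIGMAS = [1.90, 5.45, 6.57, 3.15, 2.23, 1.56, 1.44, 1.41]    # sqrt(G * delta f) in cm

# Uniform random numbers r_1, ..., r_16 as listed in Table 9.1.
R = [0.10097, 0.32533, 0.76520, 0.13586, 0.34673, 0.54876, 0.80959, 0.09117,
     0.39292, 0.74945, 0.37542, 0.04805, 0.64894, 0.74296, 0.24805, 0.24037]

inv_phi = NormalDist().inv_cdf                   # inverse standard normal distribution
A = [inv_phi(R[2 * j]) for j in range(8)]        # a_j = inv_phi(r_{2j-1}), 1-based j
B = [inv_phi(R[2 * j + 1]) for j in range(8)]    # b_j = inv_phi(r_{2j}), 1-based j


def wave_elevation(t):
    """Evaluate the realization of Eq. (9.21) at time t (seconds); result in cm."""
    return sum(
        s * (a * math.cos(2.0 * math.pi * f * t) + b * math.sin(2.0 * math.pi * f * t))
        for f, s, a, b in zip(FREQS, SIGMAS, A, B)
    )


if __name__ == "__main__":
    for step in range(0, 21, 2):
        print(step, round(wave_elevation(float(step)), 2))
```

Evaluating `wave_elevation` on a fine time grid between 0 and 20 s gives a curve of the kind shown in Fig. 9.3. A new realization is obtained by replacing the list `R` with 16 fresh uniform random numbers.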
If the generated realization had been shown over a duration twice as long, it would
have become clear that the wave pattern repeats itself. This is due to the way it was
constructed, which indeed makes it periodic. The period is determined by the greatest
common divisor of the frequency increment and the initial frequency of the
discretization of the frequency range that is chosen. In this case, the greatest common
divisor of 0.09 Hz and 0.315 Hz is 0.045 Hz, so the period becomes 1/0.045 = 22.2 s.
The practical consequence of this is that one must choose a discretization that is in
correspondence with the required length of a realization. It may be worth mentioning
that for most practically relevant discretizations, one may say that a corresponding
realization will have a period that can be assumed to be given as $1/\Delta f$. One does
not fully avoid the problem related to periodicity by choosing an (almost) irrational
ratio between the starting frequency and the frequency increment, or other “smart”
tricks such as choosing the frequencies randomly within each subinterval of the
discretization.
Another point worth noting is that the process $\tilde{X}(t)$ is not ergodic. If it is desirable
to ensure this property, one may use the method described in Sect. 9.2.1. However,
the difference in the practical results obtained by using this method versus the one
described here is usually rather small if the discretization is properly done.

9.2.5 The Variance Spectrum Directly from the Realizations

When the variance spectrum was defined, it was mentioned that a realization $x(t)$
of a stationary process $X(t)$ does not have a Fourier transform because it does not
decrease toward zero for large $t$. Intuitively, one would nevertheless expect that most
of the information regarding the frequency content of $x(t)$ should be contained in a
finite section of $x(t)$ if this section is large enough. A section of $x(t)$ can be defined
as follows,

$$ x_T(t) = \begin{cases} x(t), & 0 \leq t \leq T, \\ 0, & \text{elsewhere.} \end{cases} \tag{9.22} $$

Because $x_T(t)$ is zero outside a finite interval, it has a Fourier transform, viz.

$$ X_T(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} x_T(t)\, e^{-i\omega t}\, dt = \frac{1}{2\pi} \int_0^{T} x(t)\, e^{-i\omega t}\, dt. \tag{9.23} $$

It would seem natural to expect that there is a connection between $X_T(\omega)$ and the
variance spectrum $S_X(\omega)$. And it turns out that the connection is, in fact, quite
simple. It is written in the following way: assume that $x_j(t)$, $j = 1, 2, \ldots$ are
realizations of a stationary process. Then the following equation applies,

$$ S_X(\omega) = \lim_{T \to \infty} \lim_{N \to \infty} \frac{2\pi}{T N} \sum_{j=1}^{N} |X_{j,T}(\omega)|^2 , \tag{9.24} $$

where $X_{j,T}(\omega)$ denotes the Fourier transform of $x_{j,T}(t)$, which equals $x_j(t)$ for
$0 \leq t \leq T$ and zero elsewhere. From Eq. (9.24), it is also immediately seen that
$S_X(\omega) \geq 0$.

Equation (9.24) is based on the availability of an ensemble of realizations.
As previously discussed, there are many situations where only one realization is
available. By assuming that the process is ergodic, it can be shown that the variance
spectrum can be determined in the following way. Let $x(t)$ denote a realization
of the assumed ergodic process $X(t)$. Define a set of truncated Fourier transforms
$X_{(j),T}(\omega)$ over the intervals $\big((j-1)T, jT\big)$, $j = 1, 2, \ldots$, as follows,

$$ X_{(j),T}(\omega) = \frac{1}{2\pi} \int_{(j-1)T}^{jT} x(t)\, e^{-i\omega t}\, dt. \tag{9.25} $$

It is then obtained that,

$$ S_X(\omega) = \lim_{T \to \infty} \lim_{N \to \infty} \frac{2\pi}{T N} \sum_{j=1}^{N} |X_{(j),T}(\omega)|^2 . \tag{9.26} $$

Equation (9.24) or (9.26) can be regarded as the basis for the fast Fourier transform
(FFT) algorithm for calculating spectra. This method, developed around 1965, has
assumed a dominating position among numerical methods for calculating Fourier
transforms. This is primarily due to the fact that the method is much faster than
traditional methods. An extensive discussion of the FFT method for calculating
Fourier transforms is given by Newland (1991).
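
A finite-$T$, finite-$N$ version of Eq. (9.26) can be sketched as follows. The code is an illustration under stated assumptions: the integral in Eq. (9.25) is replaced by a Riemann sum over samples spaced `dt` apart, each segment uses its own local time origin (which changes only the phase of the transform, not its modulus), and a plain discrete Fourier sum is used instead of an FFT to keep the example dependency-free. All function names are hypothetical.

```python
import cmath
import math


def truncated_transform(segment, dt, omega):
    """Riemann-sum approximation of the truncated transform in Eq. (9.25)."""
    total = sum(x * cmath.exp(-1j * omega * n * dt) for n, x in enumerate(segment))
    return total * dt / (2.0 * math.pi)


def spectrum_estimate(x, dt, n_segments):
    """Average (2*pi/T) * |X_(j),T(omega)|^2 over the segments, cf. Eq. (9.26).

    Returns the two-sided frequency grid omega_k = 2*pi*k/T and the estimates.
    """
    m = len(x) // n_segments                 # samples per segment
    seg_length = m * dt                      # segment duration T
    d_omega = 2.0 * math.pi / seg_length
    omegas = [(k - m // 2) * d_omega for k in range(m)]
    spec = []
    for omega in omegas:
        acc = 0.0
        for j in range(n_segments):
            segment = x[j * m:(j + 1) * m]
            acc += abs(truncated_transform(segment, dt, omega)) ** 2
        spec.append(2.0 * math.pi * acc / (seg_length * n_segments))
    return omegas, spec


if __name__ == "__main__":
    dt = 0.1
    signal = [math.cos(2.0 * math.pi * n / 8) for n in range(128)]
    omegas, spec = spectrum_estimate(signal, dt, n_segments=4)
    d_omega = omegas[1] - omegas[0]
    print(sum(spec) * d_omega, sum(v * v for v in signal) / len(signal))
```

On this frequency grid, the sum of the estimates multiplied by the frequency spacing reproduces the mean square of the samples, which is the discrete counterpart of Eq. (9.10) for a zero-mean record. The estimates are non-negative by construction.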

9.3 Monte Carlo Simulation of Load and Response

The linear or linearized equations of motion for marine structures considered in
this book and their solutions are discussed in Naess and Moan (2013). For such
equations, the comments of the previous section apply. An example of a nonlinear
dynamic model that is often adopted for an offshore structure can be written in the
following form,

$$ M \ddot{X}(t) + C \dot{X}(t) + f\big(X(t), \dot{X}(t), t\big) = P(t) , \tag{9.27} $$

where $M$ and $C$ are suitable mass and damping matrices, respectively;
$f(\cdot, \cdot) = \big(f_{kl}(\cdot, \cdot)\big)$, $k, l = 1, \ldots, n$, is a nonlinear matrix function;
$P(t) = (P_1(t), \ldots, P_n(t))'$ denotes a stochastic loading process; and
$X(t) = (X_1(t), \ldots, X_n(t))'$ is the corresponding response process.

The Monte Carlo method applied to such a system would consist of generating
a statistical sample of specified size N, say, of response time histories by first
generating a sample of time histories of the same size of the loading process, and
then solving Eq. (9.27) for each of the load time histories in the sample using methods
of numerical integration of dynamical system equations, cf. Argyris and Mlejnek
(1991) and Naess and Moan (2013). When the desired sample of response time
histories is produced, the statistical analysis of the response may then proceed as
discussed in the remaining part of this chapter.

9.4 Sample Statistics of Simulated Response

An important element for graphic representation of sampled data from a statistical
population for which the underlying distribution function is unknown is the so-called
plotting position formula. It is based on the notion of order statistics.
Let us start by assuming that $X$ is a continuous random variable with a
distribution $F(x)$ and a probability density $f(x)$. The given sample of independent
observations $x_1, x_2, \ldots, x_n$ is now ordered in an increasing sequence
$x_{(1)} \leq x_{(2)} \leq \ldots \leq x_{(n)}$. The random variable $X_{(m)}$ corresponding to $x_{(m)}$ is called the $m$th order
statistic, $m = 1, \ldots, n$. The probability density $f_m(x)$ of $X_{(m)}$ follows from the
observation that $X_{(m)} = x$ implies the event that there are $m - 1$ outcomes of
$X$ with values less than (or equal to) $x$, and $n - m$ outcomes with values greater
than $x$. According to the binomial distribution, the probability of this event equals
$\frac{n!}{(m-1)!\,(n-m)!}\, F(x)^{m-1} \big[1 - F(x)\big]^{n-m}$. Hence, it is obtained that (Casella
and Berger 2002),

$$ f_m(x) = \frac{n!}{(m-1)!\,(n-m)!}\, F(x)^{m-1} \big[1 - F(x)\big]^{n-m} f(x). \tag{9.28} $$

It is now required to calculate $E[F(X_{(m)})]$. This is given by,

$$
\begin{aligned}
E[F(X_{(m)})] &= \int_x F(x) f_m(x)\, dx = m \binom{n}{m} \int_x F(x)^m \big[1 - F(x)\big]^{n-m} f(x)\, dx \\
&= m \binom{n}{m} \int_0^1 F^m (1 - F)^{n-m}\, dF = m \binom{n}{m} \frac{m!\,(n-m)!}{(n+1)!} = \frac{m}{n+1}.
\end{aligned} \tag{9.29}
$$

Similarly, the variance of $F(X_{(m)})$ is calculated to be,

$$
\begin{aligned}
\mathrm{Var}[F(X_{(m)})] &= E[F(X_{(m)})^2] - E[F(X_{(m)})]^2 \\
&= \frac{m(m+1)}{(n+1)(n+2)} - \frac{m^2}{(n+1)^2} = \frac{m(n+1-m)}{(n+1)^2 (n+2)}.
\end{aligned} \tag{9.30}
$$

The results of Eqs. (9.29) and (9.30) are useful because they provide a means
of plotting the sample of observations $x_1, x_2, \ldots, x_n$ in order to estimate the
distribution $F(x)$ empirically. Equation (9.29) states that the expected value of the
distribution function evaluated at the observation of order $m$ is equal to $m/(n + 1)$.
This result suggests that an optimal plotting strategy is obtained by plotting the
points $\big(x_{(m)}, m/(n+1)\big)$, $m = 1, \ldots, n$. Equation (9.30) provides information on the
variance of $F(X_{(m)})$, that is, the ordinate of the plotting point. Due to the symmetry
of the expression, it attains its maximum $1/[4(n + 2)]$ at the median and decreases
symmetrically to $n/[(n + 1)^2 (n + 2)]$ toward the ends of the interval $[0, 1]$. Because
the distribution function itself is not known, these results are called distribution-free
results.
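
As a small illustration of the plotting position formula, the following sketch orders a sample and pairs each order statistic with $m/(n+1)$ from Eq. (9.29). It also evaluates the variance of the ordinate from Eq. (9.30). The function names are illustrative only.

```python
def plotting_positions(sample):
    """Return the points (x_(m), m/(n+1)), m = 1, ..., n, for an observed sample."""
    n = len(sample)
    return [(x, m / (n + 1)) for m, x in enumerate(sorted(sample), start=1)]


def ordinate_variance(m, n):
    """Variance of F(X_(m)) according to Eq. (9.30)."""
    return m * (n + 1 - m) / ((n + 1) ** 2 * (n + 2))


if __name__ == "__main__":
    data = [3.1, 0.4, 2.2, 1.7, 5.0]
    for x, p in plotting_positions(data):
        print(x, p)
    print([ordinate_variance(m, len(data)) for m in range(1, len(data) + 1)])
```

For an odd sample size, the middle order statistic has the largest ordinate variance, $1/[4(n+2)]$, in agreement with the remark above.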
The preceding results are complemented by calculating the probability that the
$m$th observation is not exceeded by a future observation. Now, the conditional
probability that a single observation will not exceed $X_{(m)}$ given that $X_{(m)} = x$
is equal to $F(x)$. The corresponding unconditional probability $p_m$, say, is then
obtained by using the law of total probability, which gives exactly the same result
as given by Eq. (9.29), that is,

$$ p_m = \int_x F(x) f_m(x)\, dx = \frac{m}{n+1}. \tag{9.31} $$

This result shows that a new observation of the continuous random variable $X$ has
equal probability of assuming a value in any of the $n + 1$ intervals defined by the
previous $n$ observations. This lends further support to the optimality of the plotting
position formula expressed by Eq. (9.29).
It may be pointed out that several alternative plotting position formulas for
specific classes of distributions have been suggested over the years, which primarily
aim at correcting for sample bias. A discussion of this topic is not pursued here,
but the reader is rather referred to the literature. A discussion of particular interest
related to the estimation of return periods can be found in Makkonen (2006, 2008),
where it is argued for the use of Eq. (9.29).
A useful diagnostic tool to check the accuracy of an assumed statistical distribution
$F$ for the observed data is obtained by comparing the fitted distribution $\hat{F}$ with
the data on a quantile plot (QQ-plot) or a probability plot (PP-plot). Assuming that
$\hat{F}$ is strictly increasing and continuous, the QQ-plot is obtained by comparing the
ordered data with the corresponding quantiles of the fitted distribution by plotting

$$ \left( \hat{F}^{-1}\!\left( \frac{m}{n+1} \right),\; x_{(m)} \right). \tag{9.32} $$

The name QQ-plot derives from the fact that both $\hat{F}^{-1}\big(\frac{m}{n+1}\big)$ and $x_{(m)}$ are estimates
of the $m/(n + 1)$th quantile of $F$. If $F$ is a good choice for the distribution of the
data, the QQ-plot should be close to the straight line of slope 1 passing through the
origin. Alternatively, the fit of $\hat{F}$ to the data can be checked by the PP-plot, which is
obtained by plotting the points

$$ \left( \hat{F}\big(x_{(m)}\big),\; \frac{m}{n+1} \right). \tag{9.33} $$

A good fit is again demonstrated if the plotted graph is approximately a straight line.
The main difference between the two plots is that the QQ-plot gives a clearer
impression of the fit of the tail data, which may be of particular significance for
extreme value statistics.
It is worth noting that for a range of distribution functions QQ-plots can
be constructed without having to estimate distribution parameters. This typically
applies to distributions characterized by a scale and a location parameter, e.g., the
normal distribution. In such cases the intercept of the line fitted to the QQ-plot
would represent location, while the slope represents scale. For example, a QQ-plot
for a normal distribution can be achieved by plotting the points $\big(\Phi^{-1}(m/(n + 1)),\, x_{(m)}\big)$.
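
The parameter-free normal QQ-plot can be sketched in a few lines. The code below builds the points $\big(\Phi^{-1}(m/(n+1)), x_{(m)}\big)$ and fits a straight line to them. Ordinary least squares is used for the line fit, which is an assumption of this sketch since the text does not prescribe a fitting method, and the function names are illustrative.

```python
from statistics import NormalDist


def normal_qq_points(sample):
    """Points (inv_phi(m/(n+1)), x_(m)) of a normal QQ-plot."""
    n = len(sample)
    inv_phi = NormalDist().inv_cdf
    return [(inv_phi(m / (n + 1)), x) for m, x in enumerate(sorted(sample), start=1)]


def fit_line(points):
    """Ordinary least-squares line through the points; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    data = [4.1, 6.3, 5.2, 3.0, 7.4, 5.9, 4.8, 5.5]
    location, scale = fit_line(normal_qq_points(data))
    print(location, scale)
```

The intercept then serves as an estimate of the location (mean) and the slope as an estimate of the scale (standard deviation), as described above.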

9.5 Latin Hypercube Sampling

Latin hypercube sampling (LHS) (McKay et al. 1979) is a method for effectively
reducing the sample size for Monte Carlo simulations of stochastic response
processes that depend on many random parameters. In cases where the parameters
of a dynamic model, such as mass, damping, and stiffness are modeled as random
variables, the response process will then also depend on these random variables.
Specifically, let us assume that the model depends on the random parameters
$Y_1, \ldots, Y_m$. To highlight the dependence on these parameters, a random response
process $X(t)$ of this model may then be written as $X(t) = X(t; Y_1, \ldots, Y_m)$.
Because the external loading is often modeled as a stochastic process, the response
$X(t; y_1, \ldots, y_m)$ becomes a stochastic process for each sample $y_1, \ldots, y_m$ of the
random parameters. If $m$ is not small, then the number of samples needed to provide
good sample statistics for the response process may become huge if no consideration
is made on how to effectively represent the statistical variability of the random
parameters. LHS is a very good and simple method for this purpose.
LHS starts by selecting $k$ different values from each of the $m$ random variables
in the following manner. The interval $(0, 1)$ is divided into $k$ equally long intervals
$I_j = ((j - 1)/k, j/k)$, $j = 1, \ldots, k$. Let $F_i(y)$ denote the distribution function
of $Y_i$, $i = 1, \ldots, m$. For each $i$, $k$ independent outcomes $u_1, \ldots, u_k$ of the random
number $U$, which is uniformly distributed on $(0, 1)$, are produced. The resulting
ordered sample for $Y_i$ is $y_{i,1} < \ldots < y_{i,k}$ where $y_{i,j} = F_i^{-1}((j - 1)/k + u_j/k)$.
Note that $(j - 1)/k + u_j/k$ is nothing but an outcome of a random number uniformly
distributed on the interval $((j - 1)/k, j/k)$. A Latin hypercube (LH) presample for
$Y_i$ is then obtained as $y_{i,r_1}, \ldots, y_{i,r_k}$, where $r_1, \ldots, r_k$ is a random reordering of
$1, \ldots, k$. Finally, an LH sample for $Y_1, \ldots, Y_m$, which will also be of size $k$, is now
represented by the $m \times k$ array or matrix $(y_{i,r_j})$, where each column is an element
in the LH sample.
It is tacitly assumed that the random parameters are independent. If this is not
the case, LHS can also deal with correlated parameters. Standard statistical software
packages usually offer LHS as an optional sampling technique.
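
The construction described above, for independent parameters, can be sketched as follows. The function name, its arguments, and the convention of passing the inverse distribution functions $F_i^{-1}$ as Python callables are assumptions of this illustration. A production analysis would normally rely on the LHS routine of an established statistical package.

```python
import random


def latin_hypercube(inverse_cdfs, k, rng=None):
    """Draw an LH sample of size k for independent parameters Y_1, ..., Y_m.

    inverse_cdfs: list of m callables, the i-th one mapping u in (0, 1) to F_i^{-1}(u).
    Returns a list of k tuples, each holding one value per parameter.
    """
    rng = rng or random.Random()
    rows = []
    for inv_cdf in inverse_cdfs:
        # One uniform draw inside each of the k equally long strata of (0, 1).
        # (A draw of exactly 0 in the first stratum has negligible probability.)
        stratified = [inv_cdf((j + rng.random()) / k) for j in range(k)]
        rng.shuffle(stratified)          # random reordering r_1, ..., r_k
        rows.append(stratified)
    # Column j of the m x k array is the j-th element of the LH sample.
    return [tuple(row[j] for row in rows) for j in range(k)]


if __name__ == "__main__":
    from statistics import NormalDist

    mass = NormalDist(mu=100.0, sigma=5.0).inv_cdf      # illustrative parameter
    damping = NormalDist(mu=0.05, sigma=0.01).inv_cdf   # illustrative parameter
    for element in latin_hypercube([mass, damping], k=5, rng=random.Random(1)):
        print(element)
```

Each parameter is represented by exactly one value from each of its $k$ equal-probability strata, while the random reordering decouples the pairing between parameters.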

9.6 Estimation of Extreme Response

The view that estimation of extreme values by Monte Carlo methods is generally
prohibitive in terms of computer time, is a truth in contention. In fact, with present-
day computational power, it is possible to perform simulations on a scale that allows
estimation of extreme structural response for a range of problems. Admittedly, it is
not difficult to describe a dynamic model for which the required simulation time
for direct estimation of extreme response would be beyond any acceptable bounds,
but the systems for which this is the case are steadily diminishing in step with the
development of increasingly powerful computers. This being the case, there is good
reason to believe that Monte Carlo simulation-based methods will become much
more frequently used, even for extreme value estimation, than is the case today.
It is therefore considered appropriate to discuss such methods here. Note that the
methods described in this section are equally applicable to measured response time
histories, obtained, for instance, from real life experiments. In this section, only the
Gumbel and the point process methods are applied. For simplicity, only the case of
positive extreme values will be discussed. The necessary modifications to deal with
other situations are usually obvious.

9.6.1 The Gumbel Method

It was pointed out in Chap. 5 that for response processes relevant for many
engineering structures, the appropriate extreme value distribution would almost
always be the Gumbel distribution. Therefore, let us assume that this is indeed the
case for the response process .X(t), which can be simulated by a suitable procedure.
Now, assume that N independent response time histories, each of duration T , have
been simulated for a given environmental condition. For the Gumbel method, the
extreme response is then identified for each time series or block of data. These
extreme value data are assumed to be Gumbel distributed, and plotting the obtained
data set of extreme values using a Gumbel probability plot should then ideally result
in a straight line. In practice, one cannot expect this to happen, but on the premise
that the data follow a Gumbel distribution, a straight line can be fitted to the data.
Due to its simplicity, a popular method for fitting this straight line is the method of
moments, cf. Sect. 2.6. That is, writing the Gumbel distribution of the extreme value
$M(T)$ as,

$$ \mathrm{Prob}(M(T) \leq \xi) = \exp\big\{ -\exp\big[ -a(\xi - b) \big] \big\}, \tag{9.34} $$

it was shown that the parameters $a > 0$ and $b$ are related to the mean value $m_M$
and standard deviation $\sigma_M$ of $M(T)$ as follows: $b = m_M - 0.5772\, a^{-1}$ and
$a = 1.2826/\sigma_M$ (Bury 1975). The estimates of $m_M$ and $\sigma_M$ obtained from the available
sample therefore provide estimates of $a$ and $b$, which leads to the fitted Gumbel
distribution by the method of moments.
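
The method of moments fit amounts to two lines of arithmetic. The sketch below uses the sample mean and the sample standard deviation (with divisor $n - 1$, an assumption of this sketch) as estimates of $m_M$ and $\sigma_M$, and inverts Eq. (9.34) to obtain a fractile. The function names and the example data are illustrative only.

```python
import math
from statistics import mean, stdev


def fit_gumbel_moments(extremes):
    """Method of moments estimates (a, b) for the Gumbel distribution of Eq. (9.34)."""
    a = 1.2826 / stdev(extremes)
    b = mean(extremes) - 0.5772 / a
    return a, b


def gumbel_cdf(xi, a, b):
    """Prob(M(T) <= xi) according to Eq. (9.34)."""
    return math.exp(-math.exp(-a * (xi - b)))


def gumbel_fractile(p, a, b):
    """Level xi for which the fitted Gumbel distribution equals p, 0 < p < 1."""
    return b - math.log(-math.log(p)) / a


if __name__ == "__main__":
    extremes = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 10.4, 13.0]   # made-up block maxima
    a_hat, b_hat = fit_gumbel_moments(extremes)
    print(a_hat, b_hat, gumbel_fractile(0.9, a_hat, b_hat))
```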
Typically, a specified fractile value of the fitted Gumbel distribution is then
extracted and used in a design consideration. To be specific, let us assume that
the requested fractile value is the $100(1 - \alpha)\%$ fractile, where $\alpha$ is usually a
small number, for example $\alpha = 0.1$. To quantify the uncertainty associated with
the obtained $100(1 - \alpha)\%$ fractile value based on a sample of size $N$, the 95%
confidence interval of this value is often used. A good estimate of this confidence
interval can be obtained by using a parametric bootstrapping method (Efron and
Tibshirani 1993; Davison and Hinkley 1997), cf. Sect. 2.8. In our context, this
simply means that the initial sample of $N$ extreme values is assumed to have been
generated from an underlying Gumbel distribution, whose parameters are, of course,
unknown. If this Gumbel distribution had been known, it could have been used
to generate a large number of (independent) samples of size $N$. For each sample,
a new Gumbel distribution would be fitted and the corresponding $100(1 - \alpha)\%$
fractile value identified. If the number of samples had been large enough, an accurate
estimate of the 95% confidence interval on the $100(1 - \alpha)\%$ fractile value based
on a sample of size $N$ could be found. Because the true parameter values of the
underlying Gumbel distribution are unknown, they are replaced by the estimated
values obtained from the initial sample. This fitted Gumbel distribution is then used
as previously described to provide an approximate 95% confidence interval. Note
that the assumption that the initial N extreme values are actually generated with
good approximation from a Gumbel distribution cannot be easily verified, which is a
drawback of this method. As has been pointed out, compared with the POT method,
the Gumbel method would also seem to use much less of the information available
in the data. This may explain why the POT method has become increasingly popular
over the past years, but the Gumbel method is still widely used in practice.
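
The parametric bootstrap described above can be sketched as follows. The percentile rule used to read off the interval, the number of bootstrap samples, and all function names are assumptions of this illustration. The fit is the method of moments of Sect. 9.6.1.

```python
import math
import random
from statistics import mean, stdev


def fit_gumbel(sample):
    """Method of moments estimates (a, b), cf. Eq. (9.34)."""
    a = 1.2826 / stdev(sample)
    return a, mean(sample) - 0.5772 / a


def fractile(p, a, b):
    """Level at which the Gumbel distribution with parameters (a, b) equals p."""
    return b - math.log(-math.log(p)) / a


def draw_gumbel(a, b, rng):
    """One outcome from the Gumbel distribution by inversion of Eq. (9.34)."""
    u = rng.random()
    while u <= 0.0:                      # guard against log(0)
        u = rng.random()
    return b - math.log(-math.log(u)) / a


def bootstrap_ci(extremes, alpha=0.1, n_boot=2000, level=0.95, seed=0):
    """Point estimate and percentile interval for the 100(1-alpha)% fractile."""
    rng = random.Random(seed)
    n = len(extremes)
    a_hat, b_hat = fit_gumbel(extremes)
    values = []
    for _ in range(n_boot):
        resample = [draw_gumbel(a_hat, b_hat, rng) for _ in range(n)]
        values.append(fractile(1.0 - alpha, *fit_gumbel(resample)))
    values.sort()
    lower = values[int((1.0 - level) / 2.0 * n_boot)]
    upper = values[int((1.0 + level) / 2.0 * n_boot) - 1]
    return fractile(1.0 - alpha, a_hat, b_hat), (lower, upper)


if __name__ == "__main__":
    rng = random.Random(1)
    simulated_extremes = [draw_gumbel(0.5, 10.0, rng) for _ in range(30)]   # stand-in data
    print(bootstrap_ci(simulated_extremes))
```

The width of the resulting interval shrinks as the number $N$ of extreme values grows, which gives a direct way of judging how many simulated time histories are needed.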

9.6.2 The Point Process Method

It is known from Eq. (4.31) that an approximation of the distribution of the extreme
value $M(T)$ is obtained from the formula,

$$ \mathrm{Prob}(M(T) \leq \xi) = \exp\{ -\nu^+(\xi)\, T \}, \tag{9.35} $$

where $\nu^+(\xi)$ denotes the mean upcrossing rate of a stationary process $X(t)$.
The method to be discussed here relies on this particular approximation. This
implies that the mean upcrossing rate needs to be estimated from the simulated time
series. Assuming the so-called ergodic mean value property, it is obtained that

$$ \nu^+(\xi) = \lim_{T \to \infty} \frac{1}{T}\, n^+(\xi; T) , \tag{9.36} $$

where $n^+(\xi; T)$ denotes a realization of $N^+(\xi; T)$; that is, $n^+(\xi; T)$ denotes the
counted number of upcrossings during time $T$ from a particular simulated time
history. In practice, $k$ time histories of a suitable length $T_0$, say, are provided by
simulation. The appropriate ergodic mean value estimate of $\nu^+(\xi)$ is then,

$$ \hat{\nu}^+(\xi) = \frac{1}{k\, T_0} \sum_{j=1}^{k} n_j^+(\xi; T_0) , \tag{9.37} $$

where $n_j^+(\xi; T_0)$ denotes the counted number of upcrossings of the level $\xi$ from
time history no. $j$. This will often be the chosen approach to the estimation of the
mean upcrossing rate.
For a suitable number $k$, e.g., $k \geq 20$ to 30, and provided that $T_0$ is sufficiently
large, a fair approximation of the 95% confidence interval ($\mathrm{CI}_{0.95}$) for the value
$\nu^+(\xi)$ can be obtained as,

$$ \mathrm{CI}_{0.95}(\xi) = \left( \hat{\nu}^+(\xi) - 1.96\, \frac{\hat{s}(\xi)}{\sqrt{k}} ,\; \hat{\nu}^+(\xi) + 1.96\, \frac{\hat{s}(\xi)}{\sqrt{k}} \right), \tag{9.38} $$

where the empirical standard deviation $\hat{s}(\xi)$ is given as,

$$ \hat{s}(\xi)^2 = \frac{1}{k-1} \sum_{j=1}^{k} \left( \frac{n_j^+(\xi; T_0)}{T_0} - \hat{\nu}^+(\xi) \right)^2 . \tag{9.39} $$

Note that $k$ and $T_0$ may not necessarily be the number and length of the actually
simulated response time series. Rather, they were chosen to optimize the estimate
of Eq. (9.39). If, initially, $\tilde{k}$ time series of length $\tilde{T}$ are simulated, then $k = \tilde{k}\, k_0$
and $\tilde{T} = k_0 T_0$. That is, each initial time series of length $\tilde{T}$ was divided into $k_0$ time
series of length $T_0$, assuming, of course, that $\tilde{T}$ is large enough to allow for this in
an acceptable way. The consistency of the estimates obtained by Eq. (9.39) can be
checked by the observation that $\mathrm{Var}[N^+(\xi; t)] = \nu^+(\xi)\, t$ when $N^+(\xi; t)$ becomes
a Poisson random variable, which by assumption occurs for large values of $\xi$. This
leads to the equation,

$$ s(\xi)^2 = \frac{1}{k} \sum_{j=1}^{k} \mathrm{Var}\!\left[ \frac{N_j^+(\xi; T_0)}{T_0} \right] = \frac{\nu^+(\xi)}{T_0} , \tag{9.40} $$

where $\{N_1^+(\xi; T_0), \ldots, N_k^+(\xi; T_0)\}$ denotes a random sample with a possible
outcome $\{n_1^+(\xi; T_0), \ldots, n_k^+(\xi; T_0)\}$. Hence, $\hat{s}(\xi)^2/k \approx \hat{\nu}^+(\xi)/(k\, T_0)$. Because
this last relation is consistent with the adopted assumptions (for large $\xi$), it could
have been used as the empirical estimate of the variance in the first place. It is
also insensitive to the blocking of data discussed previously because $k\, T_0 = \tilde{k}\, \tilde{T}$.
However, the accuracy of this approach may be poor for small to moderate values of
$\xi$, where the Poisson assumption about the upcrossing events may fail. In contrast,
the advantage of Eq. (9.39) is that it does not rely on any specific assumptions about
the statistical distributions involved.
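
The estimates of Eqs. (9.37) to (9.39) are straightforward to compute from sampled time histories. In the sketch below, an upcrossing is counted whenever two consecutive samples satisfy $x_i \leq \xi < x_{i+1}$. This sampled-data definition, like the function names, is an assumption of the illustration, and it presupposes a time step small enough to resolve individual crossings.

```python
import math


def count_upcrossings(x, level):
    """Number of upcrossings of `level` in the sampled time history x."""
    return sum(1 for x0, x1 in zip(x, x[1:]) if x0 <= level < x1)


def upcrossing_rate_ci(histories, level, t0):
    """Estimate of the mean upcrossing rate and its approximate 95% confidence interval.

    histories: k sampled time histories, each of duration t0.
    """
    k = len(histories)
    rates = [count_upcrossings(x, level) / t0 for x in histories]
    nu_hat = sum(rates) / k                                      # Eq. (9.37)
    s2 = sum((r - nu_hat) ** 2 for r in rates) / (k - 1)         # Eq. (9.39)
    half_width = 1.96 * math.sqrt(s2 / k)                        # Eq. (9.38)
    return nu_hat, (nu_hat - half_width, nu_hat + half_width)


if __name__ == "__main__":
    dt, n = 0.05, 200                                            # each history lasts 10 s
    fast = [math.sin(2.0 * math.pi * i / 20) for i in range(n)]  # 1 Hz sine wave
    slow = [math.sin(2.0 * math.pi * i / 40) for i in range(n)]  # 0.5 Hz sine wave
    print(upcrossing_rate_ci([fast, slow], level=0.5, t0=n * dt))
```

Repeating the calculation over a grid of levels $\xi$ gives the empirical upcrossing rate function together with its confidence band.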
The idea underlying the development of the approach described here is based on
the observation that for dynamic models relevant for most engineering structures,
the mean $\xi$-upcrossing rate as a function of the level $\xi$ is highly regular in a
particular way (Naess and Gaidai 2008). As is shown later, the mean upcrossing
rate tail, say, for $\xi \ge \xi_0$, behaves similarly to $\exp\{-a(\xi - b)^c\}$ ($\xi \ge \xi_0$), where
$a > 0$, $b \le \xi_0$, and $c > 0$ are suitable constants. Hence, as discussed in detail by
Naess and Gaidai (2008), it may be assumed that,

$$\nu^+(\xi) \approx q(\xi)\,\exp\{-a(\xi - b)^c\}, \quad \xi \ge \xi_0, \tag{9.41}$$

where the function $q(\xi)$ is slowly varying compared with the exponential function
$\exp\{-a(\xi - b)^c\}$. Equation (9.41) can be rewritten as,

$$\ln\left|\ln\left(\nu^+(\xi)/q(\xi)\right)\right| \approx c\,\ln(\xi - b) + \ln a, \quad \xi \ge \xi_0. \tag{9.42}$$

It follows that by plotting $\ln|\ln(\nu^+(\xi)/q(\xi))|$ versus $\ln(\xi - b)$, it is expected
that an almost perfectly linear tail behavior will be obtained. Now, as it turns out, the
function $q(\xi)$ can be largely considered as a constant $q$, say, for tail values of $\xi$. This
suggests using a method for identifying the parameters $q$ and $b$ by optimizing the
linear fit in the tail. When this is achieved, the corresponding values of $a$ and $c$ can
then be extracted from the plot. This is discussed at some length in Naess and Gaidai
(2008). A plot of $\ln|\ln(\nu^+(\xi)/q)|$ versus $\ln(\xi - b)$ for optimal parameters $b$ and
q will be referred to as an optimal transformed plot. Examples of such plots will be
given for some of the examples to follow, mostly for the purpose of demonstrating
the validity of assumptions. An alternative, more extensive method for optimizing
the fit to the data is described in Chap. 5.
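
The transformation behind the optimal transformed plot can be checked numerically. The sketch below is a minimal illustration, not the optimization procedure of Naess and Gaidai (2008): it generates a synthetic tail from Eq. (9.41) with constant $q$ and hypothetical parameter values, maps it to the coordinates of Eq. (9.42), and reads $c$ and $a$ off a straight-line fit. The function name is an assumption made here.

```python
import numpy as np

def transformed_coordinates(xi, nu, q, b):
    """Map (xi, nu) to the coordinates of the transformed plot, Eq. (9.42)."""
    x = np.log(xi - b)
    y = np.log(np.abs(np.log(nu / q)))
    return x, y

# Synthetic tail generated from Eq. (9.41) with constant q (hypothetical values)
q, a, b, c = 0.8, 0.5, 1.0, 1.6
xi = np.linspace(2.0, 6.0, 50)
nu = q * np.exp(-a * (xi - b) ** c)

x, y = transformed_coordinates(xi, nu, q, b)
slope, intercept = np.polyfit(x, y, 1)   # straight-line fit in the tail
c_hat, a_hat = slope, np.exp(intercept)  # slope gives c, intercept gives ln a
```

With real data, $q$ and $b$ are not known in advance; they would be varied until the transformed points are as close to a straight line as possible.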
In engineering applications, it is quite common to assume that the observed
extreme value response data do follow a Gumbel distribution, cf. Sect. 9.6.1. The
problem with this approach is that classical extreme value theory cannot be used
to decide to what extent the asymptotic distribution is actually valid for a given
set of extreme value data. Note that the asymptotic Gumbel distribution given by
Eq. (9.34) corresponds to an asymptotic upcrossing rate that is purely exponential,
that is, with $c = 1$ in Eq. (9.41). Hence, by adopting a much more general class
of functions, with the purely exponential functions as a subclass, to represent the
upcrossing rate, as done here, the ability to capture subasymptotic behavior is
greatly enhanced. By this, the necessity to adopt a strictly asymptotic extreme value
distribution of questionable validity is avoided. However, note that for any $c > 0$,
the corresponding extreme value model will be asymptotically Gumbel.
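
The link between $c = 1$ and the Gumbel distribution can be made explicit. The short derivation below is a sketch under two assumptions that are not stated in this passage: that the approximation referred to as Eq. (9.35) has the usual Poisson form $\mathrm{Prob}\{M(T) \le \xi\} \approx \exp\{-\nu^+(\xi)\,T\}$, and that $q$ is constant in the tail. The symbol $\xi_T$ is introduced here only for illustration.

```latex
\begin{align*}
\mathrm{Prob}\{M(T) \le \xi\}
  &\approx \exp\{-\nu^+(\xi)\,T\}
   = \exp\bigl\{-q\,T\,\exp[-a(\xi - b)]\bigr\} \\
  &= \exp\bigl\{-\exp[-a(\xi - \xi_T)]\bigr\},
  \qquad \xi_T = b + \frac{\ln(q\,T)}{a}.
\end{align*}
```

The last expression is a Gumbel distribution with scale parameter $1/a$ and location parameter $\xi_T$, so a purely exponential upcrossing rate tail and a Gumbel extreme value distribution are two views of the same model.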
Note also that the so-called Weibull method for extreme value prediction, which
is based on the assumption that the local peak values follow a three-parameter
Weibull distribution, is in fact basically a prejudiced version of the point process
method, in the sense that the parameter $q$ is a priori given the value 1.
In cases where the approximation implied by Eq. (9.35) may be questioned, the
ACER method should be applied.

9.6.3 A Comparison of Methods

In this subsection, the performance of the point process method is compared with
that of the Gumbel method. This is done for the particular case of the horizontal
deck response of a jacket structure installed on the Kvitebjørn field in the North
Sea. For this kind of response process, the point process approach is very accurate
because a plot of the ACER functions shows that beyond the second ACER function,
which corresponds to the upcrossing rate for suitably sampled time series, there are
no dependence effects that need to be accounted for.
Figure 9.4 depicts the Kvitebjørn jacket platform with the superstructure
removed together with the corresponding three-dimensional computer model used
for the Monte Carlo simulations, see Naess et al. (2007) for details.
For the simulations discussed here, two long-crested sea states described by a
JONSWAP wave spectrum as listed in Table 9.2, were used. Twenty independent
response time histories, each of 3 hours’ duration, were simulated for each sea state.
For the Gumbel method, the extreme horizontal deck response in the wave direction
is identified for each time series. These extreme value data are then assumed to be
Gumbel distributed, and plotting each data set as a Gumbel probability plot results

Fig. 9.4 Left: Sketch of the Kvitebjørn platform with the superstructure removed. Right: Computer model of the Kvitebjørn platform (Karunakaran et al. 2001)

Table 9.2 Representative sea states

$H_s$ (m)    $T_p$ (s)
12.0         12.0
14.7         16.5
Fig. 9.5 Empirical Gumbel plot of the 20 simulated 3-hour extremes of the horizontal deck displacement for the sea state $H_s = 12$ m and $T_p = 12$ s together with the fitted Gumbel distribution (- - -). Axes: $M_k$ (horizontal) versus $-\ln(\ln((n+1)/k))$ (vertical); the level $L_G$ is marked.

in Figs. 9.5 and 9.6. Specifically, the observed 3-hour extremes $M_k$ are plotted
versus $-\ln(\ln(21/k))$, for $k = 1, \ldots, 20$. The fitted straight line in each figure,
which represents the fitted Gumbel distribution, is based on the moment estimation
method, cf. Sect. 9.6.1.
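
The construction of such a Gumbel plot and the moment fit can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the Gumbel distribution is parametrized here as $F(x) = \exp\{-\exp[-(x - \mu)/\beta]\}$, which may differ in notation from Eq. (9.34); the extremes are sorted in ascending order before plotting; and the sample is synthetic rather than the simulated platform response. All function names are hypothetical.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

def gumbel_plot_coordinates(maxima):
    """Sorted extremes M_k and ordinates -ln(ln((n+1)/k)), k = 1..n."""
    m = np.sort(np.asarray(maxima, dtype=float))
    n = m.size
    k = np.arange(1, n + 1)
    return m, -np.log(np.log((n + 1) / k))

def gumbel_moment_fit(maxima):
    """Moment estimates of location mu and scale beta."""
    m = np.asarray(maxima, dtype=float)
    beta = np.sqrt(6.0) * m.std(ddof=1) / np.pi   # std = pi * beta / sqrt(6)
    mu = m.mean() - EULER_GAMMA * beta            # mean = mu + gamma * beta
    return mu, beta

def gumbel_fractile(mu, beta, p):
    """Level not exceeded with probability p under the fitted Gumbel law."""
    return mu - beta * np.log(-np.log(p))

rng = np.random.default_rng(0)
sample = rng.gumbel(loc=0.40, scale=0.05, size=20)   # hypothetical 3-hour extremes
m_k, y_k = gumbel_plot_coordinates(sample)
mu, beta = gumbel_moment_fit(sample)
l_90 = gumbel_fractile(mu, beta, 0.90)
```

On the plot, the fitted distribution appears as the straight line $y = (x - \mu)/\beta$, and the 90% fractile is the abscissa at which this line reaches $-\ln(-\ln 0.9)$.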
The 90% fractile value $L_G$ of the fitted Gumbel distribution is identified and
shown in each figure. Table 9.3 lists the obtained 90% fractile values. To quantify
the uncertainty associated with the obtained 90% fractile value based on a sample of
size 20, the 95% confidence interval ($\mathrm{CI}_{0.95}$) of this value is used. A good estimate
of this confidence interval can be obtained by using a parametric bootstrapping
method (Efron and Tibshirani 1993; Davison and Hinkley 1997). In our context, this
simply means that the initial sample of 20 extreme values is assumed to have been
generated from an underlying Gumbel distribution, whose parameters are, of course,
unknown. If this Gumbel distribution had been known, it could have been used to
generate many samples of size 20. For each sample, a new Gumbel distribution
would be fitted, and the corresponding 90% fractile value identified. If the number of
samples had been large enough, an accurate estimate of the 95% confidence interval
on the 90% fractile value based on a sample of size 20 could be found. Because
the true parameter values of the underlying Gumbel distribution are unknown, they
are replaced by the estimated values obtained from the initial sample. This fitted
Gumbel distribution is then used as previously described to provide an approximate
95% confidence interval. Note that the assumption that the initial 20 extreme values
are actually generated from a Gumbel distribution is quite accurate in this case, as
discussed later.
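
The parametric bootstrap described above can be summarized as a short Python sketch. It is not the author's implementation: the moment fit uses the parametrization $F(x) = \exp\{-\exp[-(x - \mu)/\beta]\}$, the confidence limits are taken as simple percentiles of the bootstrap estimates (an assumption, since the text does not say how the interval is extracted), and the function names and default values are hypothetical.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

def fitted_fractile(samples, p):
    """Moment-fit a Gumbel law to each row of `samples`; return its p-fractile."""
    beta = np.sqrt(6.0) * samples.std(axis=1, ddof=1) / np.pi
    mu = samples.mean(axis=1) - EULER_GAMMA * beta
    return mu - beta * np.log(-np.log(p))

def bootstrap_ci(maxima, p=0.90, level=0.95, n_boot=10_000, seed=0):
    """Parametric-bootstrap confidence interval for the p-fractile."""
    maxima = np.asarray(maxima, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: fit the Gumbel law to the initial sample
    beta0 = np.sqrt(6.0) * maxima.std(ddof=1) / np.pi
    mu0 = maxima.mean() - EULER_GAMMA * beta0
    # Step 2: draw n_boot synthetic samples of the same size from the fitted law
    boot = rng.gumbel(mu0, beta0, size=(n_boot, maxima.size))
    # Step 3: refit each synthetic sample and collect the fractile estimates
    estimates = fitted_fractile(boot, p)
    # Step 4: percentile interval at the requested confidence level
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(estimates, [tail, 100.0 - tail])
    return lower, upper
```

A call such as `bootstrap_ci(extremes, n_boot=100_000)` with the 20 observed extremes would mirror the sample count used in the text; the array `estimates` inside the function is what an empirical density of the fractile would be built from.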
Fig. 9.6 Empirical Gumbel plot of the 20 simulated 3-hour extremes of the horizontal deck displacement for the sea state $H_s = 14.7$ m and $T_p = 16.5$ s together with the fitted Gumbel distribution (- - -). Axes: $M_k$ (horizontal) versus $-\ln(\ln((n+1)/k))$ (vertical); the level $L_G$ is marked.

Table 9.3 90% fractile values $L_G$ of the fitted Gumbel distributions

$H_s$ (m)    $T_p$ (s)    $L_G$ (m)    $\mathrm{CI}_{0.95}$
12.0         12.0         0.47         (0.40, 0.54)
14.7         16.5         0.63         (0.56, 0.73)

Invoking the parametric bootstrap, the 95% confidence interval is estimated for
each case based on 100,000 bootstrap samples. The obtained results are listed in
Table 9.3. This way of estimating the 90% fractile value of the 3-hour extreme value
distribution is referred to as the Gumbel method. The empirical densities obtained
for the predicted 90% fractile values with the $\mathrm{CI}_{0.95}$ indicated are shown in Figs. 9.7
and 9.8.
Let us now compare the results provided by the Gumbel method previously dis-
cussed, with the results obtained by the point process method. Using a Levenberg–
Marquardt least squares optimization method, cf. Chap. 5, leads to the results shown
in Figs. 9.9 and 9.10.
As shown in Figs. 9.11 and 9.12, when the mean upcrossing rate is plotted on a
logarithmic scale, the tails are closely linear. This means that the associated extreme
value distribution can be expected to be similar to a Gumbel distribution, which
would correspond to exactly linear tails, cf. Eq. (4.31). Although it may not be
obvious that the data plotted in Figs. 9.5 and 9.6 come from a distribution very
close to a Gumbel distribution, the approximately linear exponential decay of the
crossing rate strongly supports this assumption. This is yet another indication of the
usefulness of the mean upcrossing rate function.
Fig. 9.7 Empirical density of the predicted 90% fractile value based on a sample of size 20 for the sea state with $H_s = 12$ m, $T_p = 12$ s. The $*$ indicates the limits of $\mathrm{CI}_{0.95}$. Axes: level estimate (horizontal) versus bootstrapped PDF (vertical).

Fig. 9.8 Empirical density of the predicted 90% fractile value based on a sample of size 20 for the sea state with $H_s = 14.7$ m, $T_p = 16.5$ s. The $*$ indicates the limits of $\mathrm{CI}_{0.95}$. Axes: level estimate (horizontal) versus bootstrapped PDF (vertical).
Fig. 9.9 Log plot of the empirical and fitted upcrossing rate with the reanchored 95% empirical confidence band (- -) and fitted confidence band (– ·) for the sea state with $H_s = 12$ m, $T_p = 12$ s. 90% fractile estimate = 0.47 m, with $\mathrm{CI}_{0.95}$ = (0.45, 0.50). Axes: response $\eta$ (horizontal) versus $\log_{10}\nu^+(\eta)$ (vertical).

Fig. 9.10 Log plot of the empirical and fitted upcrossing rate with the reanchored 95% empirical confidence band (- -) and fitted confidence band (– ·) for the sea state with $H_s = 14.7$ m, $T_p = 16.5$ s. 90% fractile estimate = 0.62 m, with $\mathrm{CI}_{0.95}$ = (0.58, 0.65). Axes: response $\eta$ (horizontal) versus $\log_{10}\nu^+(\eta)$ (vertical).
Fig. 9.11 Mean upcrossing rate statistics along with 95% confidence bands (- -) for the sea state with $H_s = 12$ m, $T_p = 12$ s, and response standard deviation $\sigma = 0.047$ m. $*$: Monte Carlo; - - -: linear fit. Axes: $\xi/\sigma$ (horizontal) versus $\log_{10}\nu^+(\xi)$ (vertical).

Fig. 9.12 Mean upcrossing rate statistics along with 95% confidence bands (- -) for the sea state with $H_s = 14.7$ m, $T_p = 16.5$ s, and response standard deviation $\sigma = 0.068$ m. $*$: Monte Carlo; - - -: linear fit. Axes: $\xi/\sigma$ (horizontal) versus $\log_{10}\nu^+(\xi)$ (vertical).

Aiming at $T = 3$-hour extreme response prediction, one needs upcrossing
rates down to about $10^{-4}$–$10^{-6}$. Accurate estimates based on direct Monte Carlo
simulation down to this order are expensive in terms of CPU time for the considered
structure. It is therefore convenient when accurately estimated upcrossing rates
down to about $10^{-3}$ can be used as a basis for extrapolation down to the appropriate
response level $\xi$ (with $\nu^+(\xi) \approx 10^{-5}$), as illustrated in Figs. 9.9 and 9.10.
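
The order of magnitude of the required upcrossing rate follows from a one-line calculation. The sketch below assumes the Poisson form $\mathrm{Prob}\{M(T) \le \xi\} \approx \exp\{-\nu^+(\xi)\,T\}$ for the extreme value distribution and upcrossing rates expressed per second; both are assumptions made for this illustration, and the function name is hypothetical.

```python
import math

def required_upcrossing_rate(p, duration_s):
    """Rate nu that solves exp(-nu * T) = p for a duration T in seconds."""
    return -math.log(p) / duration_s

T = 3 * 3600.0                                 # 3 hours in seconds
nu_90 = required_upcrossing_rate(0.90, T)      # rate at the 90% fractile level
nu_99 = required_upcrossing_rate(0.99, T)      # rate at the 99% fractile level
```

Under these assumptions the 90% fractile corresponds to a rate of order $10^{-5}$ and the 99% fractile to a rate of order $10^{-6}$, which is why extrapolation from rates near $10^{-3}$ is needed.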
Returning now to the specific prediction of the 90% fractile of the 3-hour extreme
value distribution, $L_{CR}$, Figs. 9.9 and 9.10 lead to the estimates listed in Table 9.4.
The estimated 95% confidence intervals are given in Table 9.4, and indicated in
the figures. They are significantly smaller than those obtained by the Gumbel
method. The prediction accuracy is thus significantly higher for the proposed
method. However, it is also observed that there is good agreement between the $L_{CR}$-values
and the $L_G$-values, which is to be expected because the exact extreme value
distribution is very close to a Gumbel distribution.

Table 9.4 90% fractile values $L_{CR}$ of the fitted Gumbel distributions

$H_s$ (m)    $T_p$ (s)    $L_{CR}$ (m)    $\mathrm{CI}_{0.95}$
12.0         12.0         0.47            (0.45, 0.50)
14.7         16.5         0.62            (0.58, 0.65)

9.6.4 Combination of Multiple Stochastic Load Effects

A prominent problem in the design of structures subjected to random loads is to find
methods for the combination of resulting load effects at high and extreme response
levels. In codified design, this is usually implemented as linear combination rules
of specified characteristic values of the individual load effects (Madsen et al.
1986; Melchers 1999). For nonlinear dynamic structures, the precision level of
such procedures would seem highly questionable. One of the reasons for adopting
such simplified procedures is the complexity of the task to accurately predict the
extreme value statistics of the combined load effects, even in the case of linear
combinations. Over the years, several simplified procedures have been suggested
for the linear combination of load effects, most notably the Ferry Borges-Castanheta
method (Ditlevsen and Madsen 1996), Turkstra’s rule (Turkstra 1970; Madsen et al.
1986), the load coincidence method (Wen 1990; Melchers 1999), the SRSS method
(Wen and Pearce 1981; Wen 1990), and the point crossing approximation method
(Larrabee and Cornell 1981; Madsen et al. 1986). A main shortcoming of these
combination procedures is that they apply only to the case of independent load
effect components. A method for lifting this restriction from the point process
approximation is proposed by Toro (1984). An effort to extend Turkstra’s rule to
dependent processes is described by Naess and Royset (2000). In this section, the
use of the point process method for stochastic load effect combination problems is
illustrated. The illustrations in this section are from Naess and Gaidai (2009).
The general formulation of the load effect combination problem to be studied
here is the following:

$$H(t) = h[X_1(t), \ldots, X_N(t)], \tag{9.43}$$

where the stochastic load effect component processes $X_1(t), \ldots, X_N(t)$ are combined
according to a specified deterministic function $h$ to produce the load effect
combination process $H(t)$. The component processes may derive from a vector
solution process of a dynamic model for the structural response of an offshore
platform to random waves. They may often be modeled as stationary stochastic
processes, but that is not a requirement for the application of the point process
method.
The typical problem to answer concerning the load effect combination process
$H(t)$ is to determine the probability of exceeding a critical threshold $h_f$ during a
specified time interval $T$. Let us call this the failure probability and denote it by
$p_f = p_f(T)$. Defining $M(T) = \max\{H(t) : 0 \le t \le T\}$, it is realized that the goal
is to calculate,

$$p_f = 1 - \mathrm{Prob}(M(T) \le h_f). \tag{9.44}$$

In many practical applications, the structure of the process .H (t) is quite involved
and the dimension N can be high. This makes a direct analytical approach virtually
impossible. In such cases, Monte Carlo simulations of some sort would seem to be
the most attractive way to provide estimates of the failure probability.
Two different load effect combination examples are used for illustration pur-
poses: von Mises yielding stress and linear combination of non-Gaussian load
effects. In both cases, the load components are correlated stochastic processes. A
Newmark integration method was used to produce accurate response time series, cf.
Naess and Moan (2013).
The load effect components $X_i(t)$ are modeled as stationary processes, being
the response of Duffing-type systems to the same stationary Gaussian white noise
excitation $W(t)$ with $E[W(t)W(t + \tau)] = \delta(\tau)$, where $\delta(\cdot)$ denotes the Dirac delta
function. That is,

$$\ddot{X}_i + 2\zeta_i\omega_i\dot{X}_i + \omega_i^2\,(X_i + \varepsilon X_i^3) = W(t)/m_i, \tag{9.45}$$

with specific damping constants $\zeta_i$ and resonance frequencies $\omega_i = 2\pi/T_i$, where the $m_i$
represent masses, $i = 1, \ldots, N$. In this subsection $N$ is taken to be 3.
First, the linear case, i.e., $\varepsilon = 0$ in Eq. (9.45), is considered. In this case, the
(two-sided) PSD of the process $X_i(t)$ is given as $S_i(\omega) = |A_i(\omega)|^2$, where,

$$A_i(\omega) = \frac{1}{\sqrt{2\pi}\,m_i\,(-\omega^2 + 2\mathrm{i}\,\zeta_i\omega_i\omega + \omega_i^2)}, \tag{9.46}$$

with $i = 1, 2, 3$, and where $\mathrm{i}$ in the denominator denotes the imaginary unit. The
correlation coefficient $\rho_{ij} = E[X_i(t)X_j(t)]/(\sigma_i\sigma_j)$ between $X_i(t)$ and $X_j(t)$ is given by,

$$\rho_{ij} = \int_{-\infty}^{\infty} A_i(\omega)\,A_j^*(\omega)\,d\omega \,/\, (\sigma_i\sigma_j). \tag{9.47}$$

Here, $\sigma_i^2 = \mathrm{Var}[X_i(t)] = \int_{-\infty}^{\infty} S_i(\omega)\,d\omega$ is the variance of $X_i(t)$, $i = 1, 2, 3$.
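
Equations (9.46) and (9.47) are easy to evaluate numerically. The following Python sketch does so by trapezoidal integration on a truncated frequency grid, using the damping values, periods, and masses of Table 9.5. The grid limits and resolution, and all function names, are choices made here for illustration rather than part of the original analysis.

```python
import numpy as np

zeta = np.array([0.04, 0.04, 0.04])     # damping ratios (Table 9.5)
periods = np.array([1.8, 2.0, 2.2])     # resonance periods T_i in seconds
mass = np.array([1.0, 1.0, 1.0])
omega_n = 2.0 * np.pi / periods         # resonance frequencies omega_i

def transfer(i, w):
    """A_i(omega) of Eq. (9.46)."""
    denom = -w**2 + 2j * zeta[i] * omega_n[i] * w + omega_n[i]**2
    return 1.0 / (np.sqrt(2.0 * np.pi) * mass[i] * denom)

def integrate(f, w):
    """Trapezoidal rule on the grid w."""
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w))

# Truncated frequency grid; the integrands decay like omega**-4
w = np.linspace(-60.0, 60.0, 240_001)
A = [transfer(i, w) for i in range(3)]
sigma = np.array([np.sqrt(integrate(np.abs(a)**2, w)) for a in A])

def rho(i, j):
    """Correlation coefficient of Eq. (9.47); the imaginary part integrates to zero."""
    cross = integrate(A[i] * np.conj(A[j]), w)
    return cross.real / (sigma[i] * sigma[j])
```

For a linear oscillator under this white noise, the variance also has the closed form $\sigma_i^2 = 1/(4\zeta_i\omega_i^3 m_i^2)$, which gives a convenient check on the numerical integration.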
von Mises Stress Combination

Let $\sigma_x, \sigma_y, \sigma_z$ be axial stresses and $\tau_{xy}, \tau_{xz}, \tau_{yz}$ be shear stresses in a structural
element. According to the von Mises yield criterion (Madsen et al. 1986), yielding
occurs if,

$$(\sigma_x - \sigma_y)^2 + (\sigma_x - \sigma_z)^2 + (\sigma_y - \sigma_z)^2 + 6(\tau_{xy}^2 + \tau_{xz}^2 + \tau_{yz}^2) \ge 2\sigma_Y^2, \tag{9.48}$$

where $\sigma_Y$ is the yield stress. In many cases, in practice, several stress components
are zero. In the analysis carried out here, only xy-plane stresses are encountered,
meaning that $\sigma_z = \tau_{xz} = \tau_{yz} = 0$. According to the von Mises criterion, yielding
then occurs if,

$$(\sigma_x - \sigma_y)^2 + \sigma_x^2 + \sigma_y^2 + 6\tau_{xy}^2 \ge 2\sigma_Y^2. \tag{9.49}$$

The load effect vector in three-dimensional space is introduced as,

$$(X_1(t), X_2(t), X_3(t)) \propto (\sigma_x(t), \sigma_y(t), \tau_{xy}(t)), \tag{9.50}$$

where the components $X_k(t)$ are determined by Eq. (9.45).
The von Mises criterion (Eq. (9.49)) states that yielding (failure) occurs when,

$$H_{vM}(t) = \left[(X_1 - X_2)^2 + X_1^2 + X_2^2 + 6X_3^2\right]^{1/2} \ge h_f, \tag{9.51}$$

where $h_f \propto \sqrt{2}\,\sigma_Y$.
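
As a small illustration of Eq. (9.51), the combined load effect can be evaluated sample by sample from three component time series. The function names below are hypothetical, and the inputs are assumed to be already scaled as in Eq. (9.50).

```python
import numpy as np

def von_mises_load_effect(x1, x2, x3):
    """H_vM of Eq. (9.51) for plane stress: x1, x2 normal stresses, x3 shear."""
    x1, x2, x3 = (np.asarray(v, dtype=float) for v in (x1, x2, x3))
    return np.sqrt((x1 - x2)**2 + x1**2 + x2**2 + 6.0 * x3**2)

def exceeds_threshold(x1, x2, x3, h_f):
    """True wherever the combined load effect reaches the threshold h_f."""
    return von_mises_load_effect(x1, x2, x3) >= h_f
```

For a uniaxial stress state with unit stress the function returns $\sqrt{2}$, which is consistent with the threshold $h_f \propto \sqrt{2}\,\sigma_Y$: yielding is then predicted exactly when the single stress component reaches the yield stress.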
In the linear case ($\varepsilon = 0$ in Eq. (9.45)), the process $H_{vM}(t)^2$ becomes a quadratic
expression in correlated stationary Gaussian processes. Hence, the saddle point
method described by Naess and Karlsen (2004) is applicable. Because this method
is very accurate, it provides an opportunity to check the accuracy and efficiency of
the point process method. Table 9.5 lists the damping values and resonance periods
used in Eq. (9.45) for the load effect processes $X_i(t)$.
For the linear case ($\varepsilon = 0$), it is straightforward to calculate the standard
deviations and correlation coefficients for the $X_k$, cf. Eq. (9.47): $\sigma_1 = 0.38$,
$\sigma_2 = 0.45$, $\sigma_3 = 0.52$, $\rho_{12} = 0.36$, $\rho_{13} = 0.14$, $\rho_{23} = 0.41$. There is significant

correlation between some of the load effect components. For each example, 20
time series of length approximately 0.92 hours each were simulated. The total
computation time for each example was less than a minute, including simulation
time and optimization.

Table 9.5 Model parameters

$i$    $\zeta_i$    $T_i$ (s)    $m_i$
1      0.04         1.8          1
2      0.04         2.0          1
3      0.04         2.2          1
Fig. 9.13 Log plot of the mean upcrossing rate of $H_{vM}(t)$: Monte Carlo (•), reanchored empirical 95% confidence band (- -), fitted curve (– –), fitted confidence band (– ·). Linear case: $\varepsilon = 0$. $a = 0.795$, $b = 0.996$, $c = 1.607$, $\ln q = -0.249$, $\sigma = 0.682$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on a logarithmic scale (vertical).

Fig. 9.14 Optimal transformed plot of the mean upcrossing rate of $H_{vM}(t)$: Monte Carlo (•), empirical confidence band (- -), saddle point results (o), linear extrapolation (– –). Linear case: $\varepsilon = 0$. $b = 0.996$, $\ln q = -0.249$, $\sigma = 0.682$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on the transformed scale (vertical).

The log plot presented in Fig. 9.13 shows the optimized fitted parametric curve
for the case of linear dynamics, i.e., .ε = 0, together with the confidence band
generated by the fitted parametric curves to the borders of the reanchored empirical
95% confidence band. For illustration purposes, the predicted value given by
$\nu^+(\xi) = 10^{-6}$, which corresponds to the 99% fractile value of a 3-hour extreme
value distribution, is indicated. The predicted value is 6.84 = 10.03$\sigma$, with 95% CI =
(6.60, 7.12).
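
The predicted value can be reproduced from the fitted parameters by inverting Eq. (9.41) with constant $q$. The sketch below uses the parameter values reported in the caption of Fig. 9.13; the function name is hypothetical.

```python
import math

def level_for_rate(nu_target, q, a, b, c):
    """Solve q * exp(-a * (xi - b)**c) = nu_target for xi."""
    return b + (math.log(q / nu_target) / a) ** (1.0 / c)

# Fitted parameters reported in the caption of Fig. 9.13 (linear case)
a, b, c, ln_q, sigma = 0.795, 0.996, 1.607, -0.249, 0.682
xi = level_for_rate(1e-6, math.exp(ln_q), a, b, c)
ratio = xi / sigma   # predicted level in units of the standard deviation
```

With these inputs the level comes out at approximately 6.84, or about 10.03 standard deviations, in agreement with the values quoted in the text. The same function applied to the parameters in the other figure captions gives a quick consistency check on the reported fractiles.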
In Fig. 9.14, it is demonstrated that when the fitted parametric curve shown in
Fig. 9.13 is replotted as an optimal transformed plot, which yields a straight line, the
empirical curve is also largely indistinguishable from a straight line, verifying the
validity of our assumption about the representation of the upcrossing rate function.
The results from saddle point calculations, which are practically exact, are also
plotted in Fig. 9.14, and they are seen to agree very well with the extrapolated
straight line results.
Figure 9.15 shows the optimized fitted parametric curve for the case of nonlinear
dynamics, i.e., .ε = 1, together with the confidence band as generated previously.
Fig. 9.15 Log plot of the mean upcrossing rate of $H_{vM}(t)$: Monte Carlo (•), reanchored empirical 95% confidence band (- -), fitted curve (– –), fitted confidence band (– ·). Nonlinear case: $\varepsilon = 1.0$. $a = 0.082$, $b = -0.382$, $c = 3.334$, $\ln q = 0.124$, $\sigma = 0.534$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on a logarithmic scale (vertical).

Fig. 9.16 Optimal transformed plot of the mean upcrossing rate of $H_{vM}(t)$: Monte Carlo (•), empirical confidence band (- -), linear extrapolation (–). Nonlinear case: $\varepsilon = 1.0$. $b = -0.382$, $\ln q = 0.124$, $\sigma = 0.534$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on the transformed scale (vertical).

The predicted 99% fractile value of the 3-hour extreme is 4.29 = 8.04$\sigma$, with 95% confidence
interval (4.21, 4.36). Figure 9.16 presents the mean upcrossing rate function of
$H_{vM}(t)$ plotted on the optimal transformed scale. It is observed that the assumption
about the representation of the upcrossing rate function is again fully verified.

Linear Combination of Non-Gaussian Load Effects

To get a flexible model that also provides a convenient way of investigating the
effect of statistical dependence between load components, an example from Naess
and Royset (2000) is used. The input load components are again assumed to be given
by Eq. (9.45), with the same parameters as in the previous case study, cf. Table 9.5.
Fig. 9.17 Log plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), empirical confidence band (- -), fitted curve (– –), fitted confidence band (– ·). Linear case: $\varepsilon = 0$. $a = 0.147$, $b = 0.129$, $c = 3.742$, $\ln q = -1.103$, $\sigma = 1.026$, $\alpha = -0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on a logarithmic scale (vertical).

As an example of non-Gaussian load effect component processes, memoryless
transformations of the input processes $X_i(t)$ provided by Eq. (9.45) are considered.
In this section, it is assumed accordingly that,

$$H_{lc}(t) = \sum_{i=1}^{3} Z_i(t), \tag{9.52}$$

with

$$Z_i(t) = X_i(t)\,|X_i(t)|^{\alpha}, \quad -1 < \alpha < 1. \tag{9.53}$$

Two $\alpha$-values were chosen here, $-0.5$ and $0.5$. The number of terms in the sum
(9.52) is chosen to be three, but it can be any positive integer because it does not
matter much for the Monte Carlo simulation.
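
Equations (9.52) and (9.53) translate directly into array operations on simulated component time series. The following is a minimal sketch with hypothetical function names; the components are assumed to be stored as rows of a two-dimensional array.

```python
import numpy as np

def memoryless_transform(x, alpha):
    """Z = X |X|**alpha of Eq. (9.53), for -1 < alpha < 1."""
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (-1, 1)")
    x = np.asarray(x, dtype=float)
    # Equivalent form sign(X) |X|**(1 + alpha); avoids 0 raised to a negative power
    return np.sign(x) * np.abs(x) ** (1.0 + alpha)

def linear_combination(components, alpha):
    """H_lc of Eq. (9.52): sum of the transformed component series (rows)."""
    return memoryless_transform(components, alpha).sum(axis=0)
```

A negative $\alpha$ compresses large excursions and a positive $\alpha$ amplifies them, which is how the two chosen values produce lighter and heavier tails than the underlying processes $X_i(t)$.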
Figure 9.17 shows the optimized fitted parametric curve for .α = −0.5 and the
case of linear dynamics, i.e., .ε = 0, together with the confidence band generated by
the allowed parametric curves. The predicted 99% fractile value of the 3-hour extreme
is 3.42 = 3.34$\sigma$, with the 95% confidence interval (3.36, 3.49). In Fig. 9.18 it is
demonstrated that when the fitted parametric curve shown in Fig. 9.17 is replotted
as an optimal transformed plot, the empirical curve is largely indistinguishable from
a straight line, supporting the assumption about the representation of the upcrossing
rate function.
Figure 9.19 shows the optimized fitted parametric curve for .α = 0.5 and the case
of linear dynamics, together with the confidence band generated by the allowed
parametric curves. The predicted 99% fractile value of the 3-hour extreme is 5.15 = 7.69$\sigma$,
with the 95% confidence interval (4.95, 5.37). In Fig. 9.20 it is demonstrated
that when the fitted parametric curve shown in Fig. 9.19 is replotted as an optimal
transformed plot, the empirical curve is largely indistinguishable from a straight
Fig. 9.18 Optimal transformed plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), confidence band (- -), linear extrapolation (–). Linear case: $\varepsilon = 0$. $b = 0.129$, $\ln q = -1.103$, $\sigma = 1.026$, $\alpha = -0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on the transformed scale (vertical).

Fig. 9.19 Log plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), empirical confidence band (- -), fitted curve (– –), fitted confidence band (– ·). Linear case: $\varepsilon = 0$. $a = 1.771$, $b = 0.449$, $c = 1.280$, $\ln q = -0.980$, $\sigma = 0.670$, $\alpha = 0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on a logarithmic scale (vertical).

line, again supporting the assumption about the representation of the upcrossing
rate function.
Figures 9.21 and 9.22 show the log plot and the optimal transformed plot for
the case $\alpha = -0.5$, under nonlinear dynamics with $\varepsilon = 1$. The predicted 99%
fractile value of the 3-hour extreme is 3.04 = 3.18$\sigma$, with the 95% confidence
interval (3.00, 3.09). By way of comment on the rather high optimal value for
$c$ that was found in this particular case, it may be of interest to observe that the
function $F(q^*(c), a^*(c), b^*(c), c)$ is almost constant for a range of values from
about $c = 4$ to 7, where the calculations stopped. From the transformed optimal
plot, the assumption is also still verified for this case.
Figures 9.23 and 9.24 show the log plot and the optimal transformed plot for
the case $\alpha = 0.5$, under nonlinear dynamics with $\varepsilon = 1$. The predicted 99% fractile
value of the 3-hour extreme is 3.06 = 6.147$\sigma$, with the 95% confidence interval (2.97,
3.12). From the transformed optimal plot, it is again seen that the assumption is fully
verified also for this case.
Fig. 9.20 Optimal transformed plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), confidence band (- -), linear extrapolation (–). Linear case: $\varepsilon = 0$. $b = 0.449$, $c = 1.280$, $\ln q = -0.980$, $\sigma = 0.670$, $\alpha = 0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on the transformed scale (vertical).

Fig. 9.21 Log plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), empirical confidence band (- -), fitted curve (– –), fitted confidence band (– ·). Nonlinear case: $\varepsilon = 1$. $a = 0.014$, $b = -0.456$, $c = 5.423$, $\ln q = -1.048$, $\sigma = 0.958$, $\alpha = -0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on a logarithmic scale (vertical).

Fig. 9.22 Optimal transformed plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), confidence band (- -), linear extrapolation (–). Nonlinear case: $\varepsilon = 1$. $b = -0.456$, $\ln q = -1.048$, $\sigma = 0.958$, $\alpha = -0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on the transformed scale (vertical).
Fig. 9.23 Log plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), empirical confidence band (- -), fitted curve (– –), fitted confidence band (– ·). Nonlinear case: $\varepsilon = 1$. $a = 1.872$, $b = 0.030$, $c = 1.767$, $\ln q = -0.527$, $\sigma = 0.498$, $\alpha = 0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on a logarithmic scale (vertical).

Fig. 9.24 Optimal transformed plot of the mean upcrossing rate of $H_{lc}(t)$: Monte Carlo (•), confidence band (- -), linear extrapolation (–). Nonlinear case: $\varepsilon = 1$. $b = 0.030$, $\ln q = -0.527$, $\sigma = 0.498$, $\alpha = 0.5$. Axes: $\xi/\sigma$ (horizontal) versus $\nu^+(\xi)$ on the transformed scale (vertical).

20 maxima $M_j$, $j = 1, \ldots, 20$, are extracted, one from each realization, in order
to view them on a Gumbel plot. The latter is a plot of $M_j$ versus $-\ln\left(\ln\frac{20+1}{j}\right)$, $j =
1, \ldots, 20$.
A 95% confidence interval for the response level $L_{90}$ of the Gumbel distribution
not being exceeded during time $T = 500\max(T_1, T_2, T_3)$ (cf. Table 9.5) with
probability 90%, based on a sample of size 20, can be obtained by the MC technique
and with parametric bootstrapping from the fitted Gumbel distribution (Davison and
Hinkley 1997). One hundred thousand bootstrap samples were used to estimate the
density of the 90% fractile, and from this density, the desired confidence interval was
extracted. Figures 9.25 and 9.26 present the $L_{90}$ estimates by the point process and
Gumbel methods for the nonlinear system $\varepsilon = 1$, $\alpha = 0.5$. Figure 9.27 presents the
parametrically bootstrapped density of $L_{90}$ from the Gumbel distribution.
The estimate of $L_{90}$ from the point process method is 3.70 with 95% confidence
interval $(3.65, 3.75)$. For the Gumbel method, the estimate of $L_{90}$ is 3.55 with 95%
confidence interval $(3.23, 3.85)$.
Fig. 9.25 Distribution by the point process method; nonlinear system, $\varepsilon = 1$, $\alpha = 0.5$. Axes: response level (horizontal) versus CDF (vertical).

Fig. 9.26 Gumbel plot of 20 maxima; nonlinear system, $\varepsilon = 1$, $\alpha = 0.5$. Axes: $M_k$ (horizontal) versus $-\ln(\ln((n+1)/k))$ (vertical).

9.6.5 Total Surge Response of a TLP

The next example illustrates the problem of predicting the total surge response of
a tension leg platform (TLP) in random waves. The TLP concept was developed
for production of oil at offshore fields. A simple rendition of a TLP structure is
presented in Fig. 9.28. With the tools developed in this chapter, the problem of
response prediction may be solved in a rather satisfactory manner. The presented
material is largely taken from Naess et al. (2007). A Monte Carlo-based approach to
the investigation of the response statistics of an offshore structure is also presented
in Sagrilo et al. (2011).
Fig. 9.27 Bootstrapped density; nonlinear system, $\varepsilon = 1$, $\alpha = 0.5$. Axes: level estimate (horizontal) versus PDF (vertical).

Fig. 9.28 A sketch of a TLP structure

The equations of motion for a floating, rigid-body TLP structure subjected to
environmental forces such as wind, waves, and current would generally be written
as,

$$\mathbf{M}\,\ddot{\mathbf{Z}}(t) + \mathbf{H}(\mathbf{Z}(t), \dot{\mathbf{Z}}(t), t) = \mathbf{F}(t). \tag{9.54}$$
Here, $\mathbf{M}$ denotes a generalized $6 \times 6$ mass matrix, $\mathbf{Z} = (Z_1, \ldots, Z_6)^T$ is the
structure's response vector, while $\mathbf{H}$ denotes a nonlinear vector function. $\mathbf{F}(t)$
denotes a stochastic loading process, which in general also depends on the response
of the structure. This point is further discussed later.
Because the main purpose of this section is to illustrate the versatility and
accuracy of the proposed method, we chose to discuss a simplified SDOF model
for the surge response of the TLP in random waves. Except for interaction effects
between different motion modes, the SDOF model allows for the introduction of
most of the relevant nonlinear effects that may influence the response of the TLP.
Hence, the following SDOF equation of motion is studied,

$$M\ddot{Z}(t) + H(Z(t), \dot{Z}(t), t) = F(t), \tag{9.55}$$

where $Z = Z(t)$ denotes the surge response of the TLP; $M$ is the mass of the
platform, including added mass; and $H$ is a nonlinear function to be specified.
As discussed in Chap. 8, the hydrodynamic loading process $F(t)$ is assumed to
consist of two components: a linear, first-order (wave frequency) term $F_1(t)$, and
a nonlinear, slowly varying, second-order term $F_2(t)$.
To set up the proper dynamic model for the surge response $Z(t)$, it is necessary
to take into account the fact that hydrodynamic loading on a floating body depends
on the motions of the body. In the case of the slow-drift motions of the TLP, it is of
some importance to take into account the dependence of the slow-drift force $F_2(t)$
on the slowly varying surge velocity $\dot{Z}_2(t)$. Because the dynamic model is nonlinear,
a definition of the slow-drift response $Z_2(t)$ has to be introduced. A suitable definition
would seem to be the following: The slow-drift response $Z_2(t)$ is obtained from the
total response $Z(t)$ by a low-pass filter that removes all wave frequency components.
In practice, this may be achieved by a running mean operator used iteratively.
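
An iterated running mean of this kind can be sketched as follows. The window length, the number of passes, the edge handling, and the function names are all assumptions made for illustration; the text does not specify how the operator was configured.

```python
import numpy as np

def running_mean(z, window):
    """Centered moving average with edge padding; output has the length of z."""
    left = window // 2
    right = window - 1 - left
    padded = np.pad(np.asarray(z, dtype=float), (left, right), mode="edge")
    kernel = np.full(window, 1.0 / window)
    return np.convolve(padded, kernel, mode="valid")

def slow_drift(z, window, passes=3):
    """Apply the running mean `passes` times to suppress wave-frequency content."""
    out = np.asarray(z, dtype=float)
    for _ in range(passes):
        out = running_mean(out, window)
    return out
```

Choosing the window equal to a typical wave period (in samples) removes a sinusoid of that period completely away from the ends of the record, while components with much longer periods pass almost unchanged; repeating the operation sharpens the separation.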
To account for the dependence of the slow-drift force on the slowly varying
velocity, it is appropriate to write $F_2(t, \dot{Z}_2(t))$ rather than $F_2(t)$. However,
$F_2(t, \dot{Z}_2(t))$ is not directly available to us, but only $F_2(t) \equiv F_2(t, 0)$ given by an
equation entirely similar to Eq. (8.7). Since, in the context of slow-drift motions,
$\dot{Z}_2(t)$ is small, the following approximation is adopted,

$$F_2(t, \dot{Z}_2(t)) \approx F_2(t, 0) + \frac{\partial F_2(t, 0)}{\partial \dot{Z}_2}\,\dot{Z}_2(t). \tag{9.56}$$

It is shown by Naess and Johnsen (1993) that, for a TLP structure, $\partial F_2(t, 0)/\partial \dot{Z}_2 \approx
-c\,F_2(t, 0) \equiv -c\,F_2(t)$, for a suitable constant $c > 0$, may to some extent serve
as a useful approximation to capture qualitatively the time-variant damping effect,
which is the result of the expansion in Eq. (9.56).
The first dynamic model adopted for $Z(t)$ is now the following,

$$\tilde{M}\ddot{Z} + C\dot{Z} + K(Z + \varepsilon Z^3) = F_1(t) + F_2(t, \dot{Z}_2) \approx F(t) - c\,F_2(t)\,\dot{Z}_2. \tag{9.57}$$

Here, $\tilde{M} = M + \tilde{m}$, where $\tilde{m}$ is an appropriately chosen (constant) added mass.
$C$, $K$, and $\varepsilon$ are suitably chosen positive constants. This equation is rewritten in the
equivalent form,

$$\ddot{Z} + 2\omega_e\zeta\dot{Z} + 2\omega_e\tilde{c}\,F_2(t)\,\dot{Z}_2 + \omega_e^2\,(Z + \varepsilon Z^3) = \frac{F(t)}{\tilde{M}}, \tag{9.58}$$

where $\omega_e^2 = K/\tilde{M}$, $\zeta = C/(2\omega_e\tilde{M})$, and $\tilde{c} = c/(2\omega_e\tilde{M})$.

Thus, the dynamic system is nonlinear with a Duffing-type hardening stiffness
nonlinearity and time-varying damping. For the TLP the relative damping coefficient
$\zeta$ is usually small. As a consequence, the contribution from the time-varying
term is non-negligible, especially for severe seas for which the slow-drift response
is significant. The third-order term in the restoring force, generally referred to as
the set down effect, is caused by the fact that the tethers will induce the TLP to act
like an inverted pendulum. Note that the set down effect will also have an influence
on the hydrodynamic loading process, which depends not only on $\dot{Z}$, but also on
$Z$. Even if this dependence could have been taken into account, it was neglected here
because it is of minor importance.
For the numerical simulations, a particular model of a TLP is considered, and
the corresponding LTF and QTF are computed using the second-order diffraction
program (WAMIT 2008). For simplicity, unidirectional seas are used, meaning that
the directional argument $\beta$ is skipped. This simplification should have no effect
on the conclusions based on the comparison of accuracy. The combined first-order
and second-order slowly varying surge deck motion is studied applying the
single-degree-of-freedom model. The TLP particulars are detailed in Table 9.6, and the
subsurface part of the structure is shown in Fig. 9.29, but without the vertical tethers.
The values in Table 9.6 are used to obtain the second-order response. This means
that for the second-order response, a simplified version of Eq. (9.57) was used,
where mass $\tilde{M}$, stiffness $K$, and damping coefficient $C$ are frequency
independent, which is a good approximation for the slow-drift motion. The
time-invariant damping part $\zeta$ is considered to be 5%.
Two versions of Eq. (9.58) are used. The first version is a linear, time-invariant
model obtained by putting $\tilde{c} = \varepsilon = 0$. The second version is the
fully nonlinear model where the parameter $\tilde{c}$ in Eq. (9.58) is chosen such
that $\mathrm{Var}[2\omega_e \tilde{c} F_2(t) \dot{Z}_2(t)]$ is about 10% of
$\mathrm{Var}[2\omega_e \zeta \dot{Z}(t)]$. The parameter $\varepsilon$ is estimated
from the condition that $0.2\,Z(t) \ge \varepsilon Z^3(t)$ when $Z(t) \le 6\sigma_Z$,
i.e., even in the extreme response region, stiffness hardening contributes not more
than 20% relative to the linear part for a severe sea, which led to
$\varepsilon = 1.36 \cdot 10^{-4}$. Finally, the following approximate values were
found: $\tilde{c} = 30/(\tilde{M} g)$ for moderate seas, and
$\tilde{c} = 90/(\tilde{M} g)$ for severe seas, where $g = 9.81$ m/s$^2$, cf.
Eq. (9.60). The adopted parameter values are largely arbitrary, but the choices made
seem to provide a reasonable model for the chosen TLP structure.

Table 9.6 Particulars of the TLP
Column diameter $D$ (m)                           10.0
Eigenperiod surge $T_e$ (s)                       128.8
Relative damping $\zeta$                          0.05
Total mass (incl. added mass) $\tilde{M}$ (kg)    $1.5 \cdot 10^7$

Fig. 9.29 Sketch of the submerged part of the TLP. Units in meters
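
As a small worked illustration, the relations $\omega_e^2 = K/\tilde{M}$ and $\zeta = C/(2\omega_e \tilde{M})$ from Eq. (9.58) can be inverted to recover the stiffness and damping coefficients implied by Table 9.6. The function and variable names below are hypothetical, and the only assumption beyond the table is that the eigenfrequency follows from the eigenperiod as $\omega_e = 2\pi/T_e$.

```python
import math

T_E = 128.8      # surge eigenperiod T_e in seconds (Table 9.6)
ZETA = 0.05      # relative damping (Table 9.6)
M_TILDE = 1.5e7  # total mass including added mass in kg (Table 9.6)


def sdof_coefficients(t_e, zeta, m_tilde):
    """Return (omega_e, K, C) for the single-degree-of-freedom model."""
    omega_e = 2.0 * math.pi / t_e               # assumed: omega_e = 2*pi/T_e
    stiffness = m_tilde * omega_e ** 2          # from omega_e^2 = K / M~
    damping = 2.0 * zeta * omega_e * m_tilde    # from zeta = C / (2 omega_e M~)
    return omega_e, stiffness, damping


omega_e, K, C = sdof_coefficients(T_E, ZETA, M_TILDE)
print(f"omega_e = {omega_e:.5f} rad/s, K = {K:.4g} N/m, C = {C:.4g} N s/m")
```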
To get an accurate representation of the response process, there is a specific
requirement that must be observed. Because the damping ratio is only 5%, the
frequency resolution $\Delta\omega$ must secure a sufficient number of frequency values
over the resonance peak. This will ensure that the second-order, difference-frequency
response component captures the TLP surge dynamics with sufficient accuracy. It is
commonly required that there are at least 5 discrete frequencies over the frequency
range where $|\hat{L}(\omega)|^2$ is equal to or higher than half of the resonance peak
height $\max(|\hat{L}(\omega)|^2)$, where,

\[
\hat{L}(\omega) = \left( -\omega^2 + 2 i \zeta \omega_e \omega + \omega_e^2 \right)^{-1}. \tag{9.59}
\]
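
The frequency range referred to above can be located directly from Eq. (9.59). The sketch below is an illustration under stated assumptions: the closed-form band edges follow from setting $|\hat{L}(\omega)|^{-2}$ equal to twice its minimum, and the discrete frequencies are assumed to be integer multiples of an equidistant spacing. The helper names and the trial spacing are hypothetical.

```python
import math


def transfer_gain_sq(omega, omega_e, zeta):
    """|L(omega)|^2 with L(omega) = (-omega^2 + 2i*zeta*omega_e*omega + omega_e^2)^(-1)."""
    denominator = complex(omega_e ** 2 - omega ** 2, 2.0 * zeta * omega_e * omega)
    return 1.0 / abs(denominator) ** 2


def half_power_band(omega_e, zeta):
    """Frequencies where |L|^2 equals half of its peak value (light damping only)."""
    centre = 1.0 - 2.0 * zeta ** 2
    spread = 2.0 * zeta * math.sqrt(1.0 - zeta ** 2)
    if centre - spread <= 0.0:
        raise ValueError("damping too large for a two-sided half-power band")
    return omega_e * math.sqrt(centre - spread), omega_e * math.sqrt(centre + spread)


def count_grid_points(d_omega, lower, upper):
    """Number of integer multiples of d_omega inside [lower, upper]."""
    first = math.ceil(lower / d_omega)
    last = math.floor(upper / d_omega)
    return max(0, last - first + 1)


omega_e = 2.0 * math.pi / 128.8   # eigenfrequency from the eigenperiod in Table 9.6
zeta = 0.05
lower, upper = half_power_band(omega_e, zeta)
n_points = count_grid_points(0.0008, lower, upper)  # hypothetical trial spacing
```

For light damping the width of this band is close to $2\zeta\omega_e$, which gives a quick estimate of the spacing needed for a prescribed number of frequencies over the peak.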

For the surge force QTF $\hat{K}_2(\omega, \omega')$, a suitable initial frequency grid
must be chosen for which the values of the force QTF are calculated. The calculation of
the force QTF is generally the most time-consuming part of the numerical analysis.
Therefore, the initial grid is usually rather coarse to avoid excessive computer
time. For the calculations at hand, the discrete frequency range was the following:
$\omega_1 = 2\pi/30.0, \ldots, \omega_n = 2\pi/4.0$ (rad/s), $n = 30$. This necessitates
the use of an interpolation procedure to be able to provide values of the QTF on a much
finer grid than the initial one to comply with the requirement of sufficient frequency
resolution to capture the dynamics of slow-drift motion. In this chapter cubic
spline interpolation is used. In the particular case considered here, the resolution
requirement led to the choice $\Delta\omega = 0.0018$ rad/s and $L = 760$ interpolated
discrete frequencies.

The random stationary sea state is specified by a JONSWAP spectrum, which is
given as follows,

\[
S_\eta(\omega) = \frac{\alpha g^2}{\omega^5} \exp\left\{ -\frac{5}{4} \left( \frac{\omega_p}{\omega} \right)^4 + \ln \kappa \, \exp\left[ -\frac{1}{2\chi^2} \left( \frac{\omega}{\omega_p} - 1 \right)^2 \right] \right\}, \tag{9.60}
\]

where $g = 9.81$ m s$^{-2}$, $\omega_p$ denotes the peak frequency in rad/s, and $\kappa$
and $\chi$ are parameters affecting the spectral shape. $\chi = 0.07$ when
$\omega \le \omega_p$, and $\chi = 0.09$ when $\omega > \omega_p$. The parameter $\kappa$
is chosen to be equal to 3.3, which is a rather typical value. The parameter $\alpha$ is
determined from the following empirical relationship,

\[
\alpha = 5.06 \left( \frac{H_s}{T_p^2} \right)^2 \left( 1 - 0.287 \ln \kappa \right), \tag{9.61}
\]

where $H_s$ denotes the significant wave height and $T_p = 2\pi/\omega_p$ is the spectral
peak wave period. Table 9.7 presents the sea state parameters, along with the
corresponding response standard deviations. For more results, cf. Naess et al.
(2007).
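
Equations (9.60) and (9.61) translate directly into code. The sketch below evaluates the spectrum for the sea state of Table 9.7 and, as a plausibility check, recovers the significant wave height from the zeroth spectral moment via $H_s \approx 4\sqrt{m_0}$. That relation, the trapezoidal integration limits and the function names are assumptions made for this illustration.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def jonswap_alpha(hs, tp, kappa=3.3):
    """Empirical alpha parameter, Eq. (9.61)."""
    return 5.06 * (hs / tp ** 2) ** 2 * (1.0 - 0.287 * math.log(kappa))


def jonswap_spectrum(omega, hs, tp, kappa=3.3):
    """One-sided JONSWAP spectrum S_eta(omega), Eq. (9.60), omega in rad/s."""
    omega_p = 2.0 * math.pi / tp
    chi = 0.07 if omega <= omega_p else 0.09
    peak_term = math.exp(-((omega / omega_p - 1.0) ** 2) / (2.0 * chi ** 2))
    exponent = -1.25 * (omega_p / omega) ** 4 + math.log(kappa) * peak_term
    return jonswap_alpha(hs, tp, kappa) * G ** 2 / omega ** 5 * math.exp(exponent)


def significant_wave_height(hs, tp, kappa=3.3, d_omega=0.001, omega_min=0.1, omega_max=6.0):
    """Estimate 4*sqrt(m0) by trapezoidal integration of the spectrum."""
    steps = int(round((omega_max - omega_min) / d_omega))
    values = [jonswap_spectrum(omega_min + i * d_omega, hs, tp, kappa) for i in range(steps + 1)]
    m0 = d_omega * (sum(values) - 0.5 * (values[0] + values[-1]))
    return 4.0 * math.sqrt(m0)


# Sea state of Table 9.7: Hs = 10 m, Tp = 11 s.
alpha = jonswap_alpha(10.0, 11.0)
hs_check = significant_wave_height(10.0, 11.0)
```

The recovered value agrees with the nominal $H_s$ only approximately, since the factor $1 - 0.287 \ln \kappa$ is itself an empirical normalization.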
Figure 9.30 shows the LTF for the wave exciting force amplitude, while Fig. 9.31
depicts the spline interpolated QTF. Because the QTF is complex valued, only its
absolute value is plotted.

For the chosen sea state, Figs. 9.32 and 9.33 present the corresponding response
tail crossing rates obtained by Monte Carlo simulation for the linear system given
by putting $\varepsilon = 0$ and $\tilde{c} = 0$ in Eq. (9.58). Figure 9.32 shows the
results obtained from 1,000 realizations, which required less than 1 hour on a laptop
computer. The crossing rate plots are done on the transformed scale, see Eq. (9.42).
Extreme response prediction based on Eq. (9.35) will typically involve crossing rates
of the orders $10^{-6}$–$10^{-7}$, but to illustrate the achieved accuracy the
extrapolated results at the crossing rate level $10^{-10}$ are highlighted. Figure 9.32
also shows the highly accurate results obtained by using a saddle point integration
technique (Naess et al. 2006). These results cannot be distinguished from those
obtained by linear extrapolation of the mean upcrossing rate function provided by
Monte Carlo simulations. Hence, linear extrapolation can be done over several orders
of magnitude with high accuracy.

Table 9.7 Representative sea state, along with response standard deviations for the
linear and nonlinear TLP model
$H_s$ (m)    $T_p$ (s)    $\sigma_Z$ (m), lin.    $\sigma_Z$ (m), nonlin.
10.0         11           9.3                     8.2

Fig. 9.30 Wave exciting force amplitude, surge LTF
[Plot: surge wave exciting force, dimensional results for unit wave amplitude;
vertical axis in $10^7$ N/m, horizontal axis $\omega$]

Fig. 9.31 Wave exciting force amplitude, surge QTF
[Plot: force amplitude in $10^5$ N over the $(\omega_1, \omega_2)$ plane]

Fig. 9.32 Optimal transformed plot of the empirical crossing rates by Monte Carlo
simulation ($\ast$) with 95% confidence bands (dashed lines) based on 1,000 hours of
response time histories for the case of linear dynamics ($\tilde{c} = \varepsilon = 0$).
Saddle point integration results (solid line) coincide with the optimized linear fit
with $b = 0.75\,\sigma_Z$, $q = 0.0205$, where $\sigma_Z = 9.3$ m (see Table 9.7)
[Plot: Lin TLP, moderate sea, Hs = 10 m, Tp = 11 s; v+(ξ) versus ξ/σz, with x-axis
ticks at 2, 3, 5 and 8.68]
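
The idea of extrapolating along a straight line on a transformed scale can be sketched as follows. The sketch assumes that the tail of the mean upcrossing rate has the form $q \exp\{-a(\xi - b)^c\}$, the same form that is used for the univariate ACER function in Chap. 10, so that $\log|\log(\nu^+(\xi)/q)|$ is linear in $\log(\xi - b)$. Whether Eq. (9.42) is exactly this transformation is an assumption here. The parameters $q$ and $b$ are taken as given, although in the optimized fits of the figures they are part of the optimization, and the data are synthetic.

```python
import math


def fit_transformed_line(levels, rates, q, b):
    """Least-squares line through (log(xi - b), log|log(rate / q)|); returns (a, c)."""
    xs = [math.log(level - b) for level in levels]
    ys = [math.log(-math.log(rate / q)) for rate in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c = sxy / sxx                       # slope of the line
    a = math.exp(mean_y - c * mean_x)   # intercept is log(a)
    return a, c


def level_at_rate(rate, q, a, b, c):
    """Invert rate = q * exp(-a * (xi - b)^c) for the level xi."""
    return b + (-math.log(rate / q) / a) ** (1.0 / c)


# Synthetic crossing rates generated from hypothetical parameter values.
Q, B, A_TRUE, C_TRUE = 0.02, 7.0, 0.01, 2.0
levels = [20.0 + i for i in range(21)]
rates = [Q * math.exp(-A_TRUE * (x - B) ** C_TRUE) for x in levels]

a_fit, c_fit = fit_transformed_line(levels, rates, Q, B)
xi_target = level_at_rate(1e-10, Q, a_fit, B, c_fit)  # extrapolated level
```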

Fig. 9.33 Optimal transformed plot of the empirical crossing rates by Monte Carlo
simulation ($\ast$) with 95% confidence bands (dashed lines) based on 25 hours of
response time histories for the case of linear dynamics ($\tilde{c} = \varepsilon = 0$);
optimized linear fit (solid line) with $b = 0.75\,\sigma_Z$, $q = 0.020$, where
$\sigma_Z = 9.3$ m (see Table 9.7)
[Plot: Lin TLP, moderate sea, Hs = 10 m, Tp = 11 s; v+(ξ) versus ξ/σz, with x-axis
ticks at 2, 3, 5 and 8.51]

Fig. 9.34 Optimal transformed plot of the empirical crossing rates by Monte Carlo
simulation ($\ast$) with 95% confidence bands (dashed lines) based on 1,000 hours of
response time histories for the fully nonlinear model; optimized linear fit (solid
line) with $b = 0.54\,\sigma_Z$, $q = 0.0248$, where $\sigma_Z = 8.2$ m (see Table 9.7)
[Plot: Nonlin TLP, moderate sea, Hs = 10 m, Tp = 11 s; v+(ξ) versus ξ/σz, with x-axis
ticks at 1, 2, 3, 5 and 7.50]

To illustrate the fact that good accuracy can be obtained with much shorter
simulated time records, Fig. 9.33 shows the results of using only 25 hours of simulated
response time histories, which required less than 10 minutes on a standard laptop
PC.

Figures 9.34 and 9.35 present response tail crossing rates for the chosen sea
state obtained by Monte Carlo simulation for the fully nonlinear model given by
Eq. (9.58). In this case the Monte Carlo simulation results are the only results
available for verification of the extrapolation method for the nonlinear model, but the
experience from the linear case indicates that the results obtained from Fig. 9.34 are
very accurate. Again, the crossing rate plots are done on the transformed scale, see
Eq. (9.42), and for illustration purposes the extrapolated results at the crossing rate
level $10^{-10}$ are highlighted. These results can be compared with the corresponding
results obtained from only 25 hours of simulated response time histories shown in
Fig. 9.35. The agreement is again very good.

Fig. 9.35 Optimal transformed plot of the empirical crossing rates by Monte Carlo
simulation ($\ast$) with 95% confidence bands (dashed lines) based on 25 hours of
response time histories for the fully nonlinear model; optimized linear fit (solid
line) with $b = 0.54\,\sigma_Z$, $q = 0.0234$, where $\sigma_Z = 8.2$ m (see Table 9.7)
[Plot: Nonlin TLP, moderate sea, Hs = 10 m, Tp = 11 s; v+(ξ) versus ξ/σz, with x-axis
ticks at 2, 3, 5 and 7.22]
Chapter 10
Bivariate Extreme Value Distributions

10.1 Introduction

The title of this chapter is deliberately chosen to focus on the bivariate case instead
of the general multivariate case. The reason is mainly one of expediency, because the
general multivariate case would easily embroil us in the necessity to roll out a heavy
machinery of notation without contributing to a deeper understanding of the issues
involved. For a discussion of the general multivariate case, the reader may consult
the book by Beirlant et al. (2004). The extension of extreme value statistics from the
univariate to the multivariate case meets with several challenges. First of all, there
is no direct simple generalization of the univariate extreme value types theorem to
the multivariate case, and in particular, this also applies to the bivariate case.
Developed on the basis of Gumbel’s logistic and mixed models (Gumbel 1960a,b,
1961; Gumbel and Mustafi 1967), the later results on possible asymptotic bivariate
extreme value distributions became, in a sense, too general, which poses severe
problems for practical applications. Significant efforts have been made to model
and estimate a function which describes the dependence structure between extreme
components, cf. e.g., Coles and Tawn (1991, 1994). However, there are no precise
estimation tools that allow us to decide on the joint distribution of the bivariate
extremes from a given set of bivariate data. Of course, the marginal data sets can be
used to derive estimates of the marginal extreme value distributions, as in Zachary
et al. (1998) and de Haan and de Ronde (1998), but the joint distribution is still a
long way off.
A popular method of trying to cope with the problem of bivariate extremes
is to adopt a copula to represent the joint distribution structure. This copula is
then usually combined with asymptotic extreme value distributions to represent the
marginal distributions, typically of the GEV type (Coles 2001). For this purpose, a
range of different copulas have been proposed (Tawn 1988; Waal and van Gelder
2005), see also Castillo et al. (2005) for a compendious treatment of bivariate
copulas. Even in the case of the bivariate extreme value copula (Pickands 1981;
Balakrishnan and Lai 2009), due to the properties of the dependence function,
generally speaking, there are an infinite number of models. Therefore, the main
problem with this approach is that it is rather ad hoc. That is, there seems to be
no appropriate theoretical justification for choosing one particular copula over
another.
It is therefore of considerable interest to note that the ACER method can easily
be extended to several dimensions, in particular, to two (Naess 2011). By this fact, a
vehicle is obtained for providing a nonparametric statistical estimate of the bivariate
extreme value distribution inherent in a bivariate time series. It will be seen that
the bivariate ACER function is able to cover both spatial and temporal dependence
characteristics of the given time series. Thus, it covers all simultaneous and non-
simultaneous extreme events. From a practical point of view, this makes it possible
to investigate the true behavior of the bivariate extreme value distribution for a
particular case, and at the same time check the validity of the proposed copula
models for bivariate extremes.
As a first effort in investigating the functional representation of the empirically
estimated bivariate ACER surface, the bivariate extreme value copula approach will
be adopted. Specifically, the Asymmetric logistic and Gumbel logistic models are
used, combined with asymptotically consistent marginal extreme value distributions
based on the univariate ACER functions. Since the univariate ACER functions have
proved to portray accurately the marginal tail behavior, this will offer an opportunity
to verify to some extent the viability of copula models to capture the dependence
structure of bivariate extreme value distributions.
The performance of the bivariate ACER method will be illustrated by application
to two measured bivariate time series. The first consists of simultaneous wind
speed data measured at two stations off the coast of Norway. The second time
series consists of simultaneous wind speed and wave height data measured at an
offshore location in the North Sea. For some other examples on the application of
the bivariate ACER method, see, e.g., Gaidai et al. (2019) and Xu et al. (2022).

10.2 Componentwise Extremes

Assume that $(X_1, Y_1), (X_2, Y_2), \ldots$ is a sequence of iid bivariate random
variables having distribution function $F(x, y)$. Let
$M_{x,N} = \max_{1 \le i \le N} X_i$ and $M_{y,N} = \max_{1 \le i \le N} Y_i$. Then
$M_N = (M_{x,N}, M_{y,N})$ is a vector of componentwise maxima, where the index $i$ for
which the component time series $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ assume their
extreme may be different between the two series. Hence, $M_N$ does not necessarily
correspond to an observed vector of the original time series.
The search for the limiting forms of the bivariate extreme value distributions
follows more or less the pattern of the univariate case by studying $M_N$ as
$N \to \infty$. By way of a first observation, the marginal distributions of the
asymptotic bivariate extreme value distribution should by necessity be given by the
asymptotic univariate extreme value distributions. Since the marginals are known, the
representations may,
in fact, be simplified a little by assuming that both $X_i$ and $Y_i$ have the standard
Fréchet distribution $F(z) = \exp(-1/z)$, $z > 0$. By transformation of variables, any
other marginal distribution can be obtained. With this specific choice of marginal,
a special case of the GEV distribution with parameters $\mu = 1$, $\sigma = 1$ and
$\gamma = 1$ is at hand, since $1 + \gamma (z - \mu)/\sigma = z$ for these values. Now,
it follows that $\mathrm{Prob}(M_{x,N} \le z) = \exp(-N/z)$, or, equivalently,
$\mathrm{Prob}(M_{x,N}/N \le z) = \exp(-1/z)$, $z > 0$. The same result applies to
$M_{y,N}$. Hence, to obtain standard univariate results for each margin, the re-scaled
vector,

\[
M_N^* = (M_{x,N}/N, \; M_{y,N}/N), \tag{10.1}
\]

should be considered.
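
The effect of the re-scaling can be checked by a small simulation. The sketch below draws iid standard Fréchet variables by inverting $F(z) = \exp(-1/z)$ and compares the empirical probability that a re-scaled maximum does not exceed $z$ with $\exp(-1/z)$. The sample sizes, the seed and the function names are arbitrary choices for this illustration.

```python
import math
import random


def standard_frechet(rng):
    """Draw from F(z) = exp(-1/z) by inversion: z = -1 / log(u), 0 < u < 1."""
    u = rng.random()
    while u == 0.0:          # guard against log(0)
        u = rng.random()
    return -1.0 / math.log(u)


def rescaled_maximum(n, rng):
    """Maximum of n iid standard Frechet variables, divided by n."""
    return max(standard_frechet(rng) for _ in range(n)) / n


def empirical_cdf_at(z, n, replications, seed=1):
    """Fraction of replications in which the re-scaled maximum is at most z."""
    rng = random.Random(seed)
    hits = sum(rescaled_maximum(n, rng) <= z for _ in range(replications))
    return hits / replications


estimate = empirical_cdf_at(1.0, n=100, replications=2000)
exact = math.exp(-1.0)   # Prob(M/N <= 1) for standard Frechet margins
```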
The main result of this section is the following:
Let $M_N^* = (M_{x,N}^*, M_{y,N}^*)$ be defined by Eq. (10.1) where the $(X_i, Y_i)$ are
iid random vectors with standard Fréchet marginal distributions. Then if,

\[
\mathrm{Prob}(M_{x,N}^* \le x, \; M_{y,N}^* \le y) \to G(x, y), \quad \text{as } N \to \infty, \tag{10.2}
\]

where $G$ is a non-degenerate distribution function, $G$ has the form,

\[
G(x, y) = \exp\{-V(x, y)\}, \quad x > 0, \; y > 0, \tag{10.3}
\]

where

\[
V(x, y) = 2 \int_0^1 \max\left( \frac{w}{x}, \frac{1 - w}{y} \right) dH(w), \tag{10.4}
\]

where $H$ is a distribution function on $[0, 1]$ satisfying the mean value constraint

\[
\int_0^1 w \, dH(w) = 1/2. \tag{10.5}
\]

The distributions obtained as limits in Eq. (10.2) are called the class of bivariate
asymptotic extreme value distributions. According to this result, this class is in a
1-1 correspondence with the set of distributions $H$ on $[0, 1]$ satisfying Eq. (10.5).
So, this rather remarkable result tells us that for any such distribution function on
$[0, 1]$, a valid bivariate asymptotic extreme value distribution is obtained.

Two simple examples may serve as illustration. Denote by $W$ a random variable
with distribution function $H$. First, let $W$ have the two possible outcomes 0 and
1, and $\mathrm{Prob}(W = 0) = \mathrm{Prob}(W = 1) = 0.5$. Hence the distribution
function $H$ has jumps at 0 and 1, so that $dH(0) = 0.5$ and $dH(1) = 0.5$. The
condition given by Eq. (10.5) is satisfied, and $V(x, y) = x^{-1} + y^{-1}$ by
Eq. (10.4), so that the corresponding bivariate extreme value distribution becomes,

\[
G(x, y) = \exp\{-(x^{-1} + y^{-1})\} = \exp\{-x^{-1}\} \exp\{-y^{-1}\}, \quad x > 0, \; y > 0, \tag{10.6}
\]

which clearly illustrates the case of two independent variables with standard
Fréchet marginals. The opposite case of two fully dependent variables is obtained
by considering a degenerate random variable with unit mass at $W = 0.5$. The
corresponding distribution function then has one jump at $w = 0.5$, i.e.,
$dH(0.5) = 1$. Equation (10.5) is satisfied, and the corresponding bivariate extreme
value distribution is,

\[
G(x, y) = \exp\{-\max(x^{-1}, y^{-1})\}, \quad x > 0, \; y > 0, \tag{10.7}
\]

where the marginals are still standard Fréchet.
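
For a distribution $H$ that is concentrated on finitely many points, the integral in Eq. (10.4) reduces to a sum, which makes the two examples easy to verify numerically. The sketch below is an illustration only. The representation of $H$ as a list of (point, mass) pairs and the function names are choices made for this example.

```python
import math


def exponent_measure(x, y, atoms):
    """V(x, y) = 2 * sum_i p_i * max(w_i / x, (1 - w_i) / y) for a discrete H.

    atoms is a list of (w_i, p_i) pairs with w_i in [0, 1] and masses p_i.
    """
    total_mass = sum(p for _, p in atoms)
    mean = sum(w * p for w, p in atoms)
    if abs(total_mass - 1.0) > 1e-12 or abs(mean - 0.5) > 1e-12:
        raise ValueError("H must have unit mass and mean 1/2, cf. Eq. (10.5)")
    return 2.0 * sum(p * max(w / x, (1.0 - w) / y) for w, p in atoms)


def bivariate_ev_cdf(x, y, atoms):
    """G(x, y) = exp(-V(x, y)), Eq. (10.3), with standard Frechet margins."""
    return math.exp(-exponent_measure(x, y, atoms))


INDEPENDENT = [(0.0, 0.5), (1.0, 0.5)]   # jumps of size 0.5 at w = 0 and w = 1
FULLY_DEPENDENT = [(0.5, 1.0)]           # unit mass at w = 0.5

g_independent = bivariate_ev_cdf(2.0, 3.0, INDEPENDENT)      # Eq. (10.6)
g_dependent = bivariate_ev_cdf(2.0, 3.0, FULLY_DEPENDENT)    # Eq. (10.7)
```

Any other choice of points and masses that satisfies the mean constraint gives a further valid member of the class, which illustrates how large the class is.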

To obtain the general situation for any GEV marginal, it is now only necessary to
transform the marginals from standard Fréchet to the required members of the GEV
family. Specifically, by defining,

\[
\tilde{x} = \left[ 1 + \gamma_x \left( \frac{x - \mu_x}{\sigma_x} \right) \right]^{1/\gamma_x} \quad \text{and} \quad \tilde{y} = \left[ 1 + \gamma_y \left( \frac{y - \mu_y}{\sigma_y} \right) \right]^{1/\gamma_y}, \tag{10.8}
\]

it follows that the complete set of bivariate asymptotic extreme value distributions
is determined by distribution functions of the form,

\[
G(x, y) = \exp\{-V(\tilde{x}, \tilde{y})\}, \tag{10.9}
\]

provided $[1 + \gamma_x (x - \mu_x)/\sigma_x] > 0$ and
$[1 + \gamma_y (y - \mu_y)/\sigma_y] > 0$, and where the function $V$ satisfies
Eq. (10.4) for some distribution function $H$, satisfying Eq. (10.5). The marginal
distributions are GEV with parameters $(\mu_x, \sigma_x, \gamma_x)$ and
$(\mu_y, \sigma_y, \gamma_y)$, respectively.

10.3 Bivariate ACER Functions

Consider a bivariate stochastic process $Z(t) = (X(t), Y(t))$ with dependent
component processes, which has been observed over a time interval, $(0, T)$ say.
Assume that the sampled values $(X_1, Y_1), \ldots, (X_N, Y_N)$ are allocated to the
(usually equidistant) discrete times $t_1, \ldots, t_N$ in $(0, T)$. Our goal in this
section is to construct a methodology that allows us to accurately determine
empirically the joint distribution function of the extreme value vector
$M_N = (M_{x,N}, M_{y,N})$, where $M_{x,N} = \max\{X_j; \; j = 1, \ldots, N\}$, and with
a similar definition of $M_{y,N}$. Specifically, the goal is to estimate
$P(\xi, \eta) = \mathrm{Prob}(M_{x,N} \le \xi, M_{y,N} \le \eta)$ accurately for
large values of $\xi$ and $\eta$.

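For notational convenience, it is expedient to introduce the non-exceedance event
$C_{kj}(\xi, \eta) = \{X_{j-1} \le \xi, Y_{j-1} \le \eta, \ldots, X_{j-k+1} \le \xi, Y_{j-k+1} \le \eta\}$
for $2 \le k \le j \le N + 1$. Then, from the definition of $P(\xi, \eta)$ it follows
that,

\[
\begin{aligned}
P(\xi, \eta) &= \mathrm{Prob}\big( C_{N+1,N+1}(\xi, \eta) \big) \\
&= \mathrm{Prob}\big( X_N \le \xi, Y_N \le \eta \mid C_{NN}(\xi, \eta) \big) \cdot \mathrm{Prob}\big( C_{NN}(\xi, \eta) \big) \\
&\;\; \vdots \\
&= \prod_{j=2}^{N} \mathrm{Prob}\big( X_j \le \xi, Y_j \le \eta \mid C_{jj}(\xi, \eta) \big) \cdot \mathrm{Prob}\big( C_{22}(\xi, \eta) \big).
\end{aligned}
\tag{10.10}
\]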

Following the pattern of the derivations of the univariate ACER functions in
Chap. 5, it will be shown in Sect. 10.5.2, based on Eq. (10.10) and the properties of
conditional probability, that a sequence of approximations may be introduced which
converges to the target distribution $P(\xi, \eta)$. In practice, following the
derivations in Sect. 10.5.2, it is therefore assumed that the following representation
applies for a suitably chosen $k$, and for large values of $\xi$ and $\eta$, cf.
Eq. (10.34),

\[
P(\xi, \eta) \approx \exp\Big\{ -\sum_{j=k}^{N} \big( \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\xi, \eta) \big) \Big\}, \tag{10.11}
\]

where $\alpha_{kj}(\xi; \eta) = \mathrm{Prob}(X_j > \xi \mid C_{kj}(\xi, \eta))$,
$\beta_{kj}(\eta; \xi) = \mathrm{Prob}(Y_j > \eta \mid C_{kj}(\xi, \eta))$ and
$\gamma_{kj}(\xi, \eta) = \mathrm{Prob}(X_j > \xi, Y_j > \eta \mid C_{kj}(\xi, \eta))$.
Note that Eq. (10.11) applies equally well to stationary and non-stationary time
series. This is due to the fact that the possible time dependence of the conditional
exceedance probabilities $\alpha_{kj}(\xi; \eta)$, $\beta_{kj}(\eta; \xi)$ and
$\gamma_{kj}(\xi, \eta)$ has been retained, which is reflected in the presence of the
time parameter $j$.
From Eq. (10.11) it emerges that for the estimation of the bivariate extreme value
distribution, it is necessary and sufficient to estimate the sequence of functions
$\{\alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\xi, \eta)\}_{j=k}^{N}$.
To get a more compact representation, it is expedient to introduce the concept of a
$k$'th order bivariate average conditional exceedance rate (ACER) function as follows,
cf. Eq. (10.35),

\[
E_k(\xi, \eta) = \frac{1}{N - k + 1} \sum_{j=k}^{N} \big( \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\xi, \eta) \big); \quad k = 1, 2, \ldots \tag{10.12}
\]

Hence, when $N \gg k$, and for large values of $\xi$ and $\eta$,

\[
P(\xi, \eta) \approx \exp\{ -(N - k + 1)\, E_k(\xi, \eta) \}. \tag{10.13}
\]

The numerical estimation of the bivariate ACER function for the observed
stationary or non-stationary time series consists in counting the appropriate
exceedance events. The estimation procedure is derived in more detail in
Sect. 10.5.2.
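
To make the counting idea concrete, the following sketch implements one natural counting estimator for a stationary series. It is not necessarily the estimator derived in Sect. 10.5.2. It uses the fact that $\alpha_{kj} + \beta_{kj} - \gamma_{kj}$ is the conditional probability that at least one of $X_j > \xi$ and $Y_j > \eta$ occurs given $C_{kj}(\xi, \eta)$, and estimates it as the ratio of the number of such exceedances preceded by $k - 1$ joint non-exceedances to the number of times the conditioning event occurs. All function names are hypothetical.

```python
import math


def bivariate_acer(x, y, xi, eta, k):
    """Counting estimate of E_k(xi, eta) for a stationary bivariate series.

    An observation j counts as a conditioning event if the k - 1 preceding
    pairs all satisfy X <= xi and Y <= eta; it counts as an exceedance if,
    in addition, X_j > xi or Y_j > eta (or both).
    """
    if k < 1 or len(x) != len(y):
        raise ValueError("k must be >= 1 and the series must have equal length")
    exceedances = 0
    conditioning_events = 0
    run = 0  # number of consecutive joint non-exceedances just before j
    for xv, yv in zip(x, y):
        below = xv <= xi and yv <= eta
        if run >= k - 1:
            conditioning_events += 1
            if not below:
                exceedances += 1
        run = run + 1 if below else 0
    if conditioning_events == 0:
        return float("nan")
    return exceedances / conditioning_events


def extreme_value_probability(acer_value, n_obs, k):
    """P(xi, eta) from Eq. (10.13)."""
    return math.exp(-(n_obs - k + 1) * acer_value)


# Tiny hypothetical series, thresholds xi = eta = 2.
x_demo = [1, 5, 1, 1, 5, 1]
y_demo = [0, 0, 0, 0, 0, 0]
e1 = bivariate_acer(x_demo, y_demo, 2, 2, k=1)
e2 = bivariate_acer(x_demo, y_demo, 2, 2, k=2)
```

For $k = 1$ there is no conditioning, and the estimate is simply the fraction of observations with at least one component above its level.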

10.4 Functional Representation of the Empirically Estimated

Bivariate ACER Functions

From the definition of $E_k(\xi, \eta)$ it follows that the product
$E_k(\xi, \eta) \cdot (N - k + 1)$ represents the expected number of the bivariate
observations $Z_j = (X_j, Y_j)$ such that their components exceed the corresponding
levels $\xi$ and $\eta$ (both simultaneous and non-simultaneous) and follow after at
least $k - 1$ previous simultaneous non-exceedances. Therefore the bivariate ACER
function $E_k(\xi, \eta)$ is able to capture the temporal and spatial dependence
structure of the considered bivariate time series. Moreover, as is discussed in
Sect. 10.5.2, in practice the existence of an effective $k_e \ll N$ such that
$P(\xi, \eta) = P_{k_e}(\xi, \eta) = \exp\{ -(N - k_e + 1)\, E_{k_e}(\xi, \eta) \}$ can be
assumed.
This provides a means to obtain high quantiles of the bivariate extreme value
distribution. The joint $T$-year return period contour associated with the event that
at least one of the maxima $M_{x,N}$ and $M_{y,N}$ exceeds its level, that is,
$\{M_{x,N} > \xi^T\} \cup \{M_{y,N} > \eta^T\} \cup (\{M_{x,N} > \xi^T\} \cap \{M_{y,N} > \eta^T\})$,
is represented by,

\[
1 - F^{1yr}(\xi^T, \eta^T) = \frac{1}{T}, \tag{10.14}
\]

where $F^{1yr}(\xi, \eta)$ is the joint distribution function of the annual maxima.
Assuming that the duration of the observation period of the bivariate process $Z(t)$
is $n_y$ years, then, with $k = k_e$,

\[
F^{1yr}(\xi, \eta) = \exp\left\{ -\frac{N - k + 1}{n_y}\, E_k(\xi, \eta) \right\}. \tag{10.15}
\]

From Eqs. (10.14) and (10.15) it follows that the joint $T$-year return levels
$(\xi^T, \eta^T)$ are obtained as solutions of the implicit equation:

\[
E_k(\xi^T, \eta^T) = -\log\left( 1 - \frac{1}{T} \right) \frac{n_y}{N - k + 1}. \tag{10.16}
\]
It is evident that the empirically estimated $k$-th order bivariate ACER rarely
provides enough information for estimation of high quantiles of the joint extreme
value distribution. In addition, the exact behavior of the bivariate ACER as a
continuous function of two variables cannot be decided using available statistical
data. Therefore, the subasymptotic functional form of the ACER surface
$E_k(\xi, \eta)$ is sought, approximately, through the copula representation of a
bivariate extreme value distribution.
From the result by Sklar (1959), for any pair of random variables $(X, Y)$ with
marginal distribution functions $F_x(\xi)$ and $G_y(\eta)$, the joint distribution
function $H_{xy}(\xi, \eta) = \mathrm{Prob}(X \le \xi, Y \le \eta)$ can be presented by
a bivariate copula $C(u, v)$ as follows:
$H_{xy}(\xi, \eta) = C(F_x(\xi), G_y(\eta))$, cf. e.g., Nelsen (2006) or Balakrishnan
and Lai (2009). This result applies to any bivariate extreme value distribution as
well.
Considering the above and using the result of Pickands (1981), any bivariate
extreme value distribution $H_{xy}(\xi, \eta)$ with marginal univariate extreme value
distributions $F_x(\xi)$ and $G_y(\eta)$ is given by the formula,

\[
H_{xy}(\xi, \eta) = \exp\left\{ \log\big( F_x(\xi) G_y(\eta) \big) \cdot D\left( \frac{\log F_x(\xi)}{\log\big( F_x(\xi) G_y(\eta) \big)} \right) \right\}, \tag{10.17}
\]

where the Pickands dependence function $D(\cdot)$ is a convex function and satisfies
$D(x) : [0, 1] \mapsto [\max(x, 1 - x), 1]$, cf. Gudendorf and Segers (2010).
We assume that asymptotically consistent marginal extreme value distributions
$F_x(\xi)$ and $G_y(\eta)$ are represented by the corresponding univariate ACER
functions, that is,

\[
\begin{aligned}
F_x(\xi) &\approx \exp\{ -(N - k + 1)\, \varepsilon_k^x(\xi) \}, \quad \xi \ge \xi_1, \\
G_y(\eta) &\approx \exp\{ -(N - k + 1)\, \varepsilon_k^y(\eta) \}, \quad \eta \ge \eta_1,
\end{aligned}
\tag{10.18}
\]

where the subasymptotic functional form of the univariate ACER function is
represented as $\varepsilon_k^x(\xi) = q_k^x \exp\{ -a_k^x (\xi - b_k^x)^{c_k^x} \}$ with
a similar definition of $\varepsilon_k^y(\eta)$, cf. Chap. 5.
Now, substituting Eq. (10.18) into Eq. (10.17) the following representation of the
bivariate extreme value distribution applies:

\[
H_{xy}(\xi, \eta) = \exp\left\{ -(N - k + 1) \big( \varepsilon_k^x(\xi) + \varepsilon_k^y(\eta) \big) \cdot D\left( \frac{\varepsilon_k^x(\xi)}{\varepsilon_k^x(\xi) + \varepsilon_k^y(\eta)} \right) \right\}. \tag{10.19}
\]

On the other hand, as found in Eq. (10.13), the bivariate extreme value
distribution can be expressed through the bivariate ACER function:
$H_{xy}(\xi, \eta) = \exp\{ -(N - k + 1)\, E_k(\xi, \eta) \}$. Thereby, the functional
form of the bivariate ACER surface can possibly be obtained by:

\[
E_k(\xi, \eta) = \big( \varepsilon_k^x(\xi) + \varepsilon_k^y(\eta) \big)\, D\left( \frac{\varepsilon_k^x(\xi)}{\varepsilon_k^x(\xi) + \varepsilon_k^y(\eta)} \right). \tag{10.20}
\]

Consequently, our aim now is to find the dependence function $D(\cdot)$ that would
provide an optimal fit of the parametric surface defined by Eq. (10.20) to the
empirical bivariate ACER $\hat{E}_k(\xi, \eta)$.
Subject to the form of the dependence function $D(\cdot)$, different parametric
differentiable and non-differentiable models can be considered. By setting
$D(x) = \theta x^2 - \theta x + 1$, where $0 \le \theta \le 1$, the Type A bivariate
extreme value distribution, or Gumbel mixed model, is obtained, cf. e.g., Gumbel
(1960a,b), Gumbel and Mustafi (1967). Another differentiable model is acquired by
setting $D(x) = [x^m + (1 - x)^m]^{1/m}$ for $m \ge 1$. This is the Type B distribution
or Gumbel logistic model (Gumbel 1961; Hougaard 1986). The functional form of the
bivariate ACER surface in the Gumbel logistic case becomes,

\[
G_k(\xi, \eta) = \left[ \big( \varepsilon_k^x(\xi) \big)^m + \big( \varepsilon_k^y(\eta) \big)^m \right]^{1/m}. \tag{10.21}
\]

The Type C distribution, also known as the biextremal model (Tiago de Oliveira
1984), can be considered as an example of a non-differentiable model. The
dependence function in this case is $D(x) = \max(x, 1 - \theta x)$ for
$0 \le \theta \le 1$.
In the literature, differentiable models have usually been of most interest, and
they have been used to analyze bivariate environmental events. Yue et al. (1999)
and Yue (2000) apply the Gumbel mixed model to rainfall data in order to provide
storm frequency analysis. Yue (2001a,b) has also studied the Gumbel logistic model
with application to bivariate data consisting of flood peak and flood volume pairs.
In the work by Yue and Wang (2004), a comparison between the Gumbel mixed
and the Gumbel logistic models has been made. The authors argued that both models
are appropriate and give similar estimates of the joint distribution of two
Gumbel-distributed random variables whose Pearson product-moment correlation
coefficient satisfies $0 \le \rho \le 2/3$. When $\rho > 2/3$ the Gumbel mixed model
cannot be applied, see also Tiago de Oliveira (1982). For this reason, it was decided
to consider the Gumbel logistic model as the one that better fits the objectives of
the present work.
In the dependence function $D(\cdot)$ for the Gumbel logistic model, Tawn (1988)
added extra parameters $\phi$ and $\theta$ to get further flexibility. This leads to
the Asymmetric logistic model, which sets
$D(x) = [\phi^m x^m + \theta^m (1 - x)^m]^{1/m} + (\theta - \phi) x + 1 - \theta$
with $0 \le \theta \le 1$, $0 \le \phi \le 1$, $m \ge 1$. The functional form of the
bivariate ACER surface in the Asymmetric logistic case is obtained as,

\[
A_k(\xi, \eta) = \left[ \big( \phi\, \varepsilon_k^x(\xi) \big)^m + \big( \theta\, \varepsilon_k^y(\eta) \big)^m \right]^{1/m} + (1 - \phi)\, \varepsilon_k^x(\xi) + (1 - \theta)\, \varepsilon_k^y(\eta). \tag{10.22}
\]
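
The link between the dependence functions and the surfaces in Eqs. (10.21) and (10.22) can be checked numerically by inserting each $D(\cdot)$ into Eq. (10.20). The sketch below does this for arbitrary test values. The function names and the numerical values of the marginal ACER functions are hypothetical.

```python
import math


def gumbel_logistic_d(x, m):
    """Dependence function of the Gumbel logistic (Type B) model."""
    return (x ** m + (1.0 - x) ** m) ** (1.0 / m)


def asymmetric_logistic_d(x, m, theta, phi):
    """Dependence function of the Asymmetric logistic model."""
    core = ((phi * x) ** m + (theta * (1.0 - x)) ** m) ** (1.0 / m)
    return core + (theta - phi) * x + 1.0 - theta


def acer_surface(eps_x, eps_y, dependence, *params):
    """Eq. (10.20): (eps_x + eps_y) * D(eps_x / (eps_x + eps_y))."""
    total = eps_x + eps_y
    return total * dependence(eps_x / total, *params)


def gumbel_logistic_surface(eps_x, eps_y, m):
    """Eq. (10.21)."""
    return (eps_x ** m + eps_y ** m) ** (1.0 / m)


def asymmetric_logistic_surface(eps_x, eps_y, m, theta, phi):
    """Eq. (10.22)."""
    core = ((phi * eps_x) ** m + (theta * eps_y) ** m) ** (1.0 / m)
    return core + (1.0 - phi) * eps_x + (1.0 - theta) * eps_y


# Hypothetical marginal ACER values at some pair of levels.
EPS_X, EPS_Y = 2.0e-3, 5.0e-4
g_direct = gumbel_logistic_surface(EPS_X, EPS_Y, 2.01)
g_via_d = acer_surface(EPS_X, EPS_Y, gumbel_logistic_d, 2.01)
a_direct = asymmetric_logistic_surface(EPS_X, EPS_Y, 2.44, 0.86, 0.92)
a_via_d = acer_surface(EPS_X, EPS_Y, asymmetric_logistic_d, 2.44, 0.86, 0.92)
```

With $\theta = \phi = 1$ the Asymmetric logistic surface reduces to the Gumbel logistic one, and $m = 1$ gives $\varepsilon_k^x + \varepsilon_k^y$, the case of independent components.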

The optimal parameters $m^*$ for $G_k(\xi, \eta)$ and $\theta^*$, $\phi^*$ and $m^*$ for
$A_k(\xi, \eta)$ can be found by minimizing a mean square error function. This is
discussed only for the case of $A_k(\xi, \eta)$, since the case of $G_k(\xi, \eta)$ then
follows easily. Specifically, the mean square error function for $A_k(\xi, \eta)$ is
defined as,

\[
F(m, \theta, \phi) = \sum_{j=1}^{N_\eta} \sum_{i=1}^{N_\xi} w'_{ij} \left[ \log \hat{E}_k(\xi_i, \eta_j) - \log A_k(\xi_i, \eta_j) \right]^2, \tag{10.23}
\]

where $N_\xi$, $N_\eta$ are the numbers of levels $\xi$ and $\eta$, respectively, at
which the ACER function has been empirically estimated, and
$w'_{ij} = w_{ij} / \sum_{j} \sum_{i} w_{ij}$ with,

\[
w_{ij} = \left[ \log CI^+(\xi_i, \eta_j) - \log CI^-(\xi_i, \eta_j) \right]^{-2}, \tag{10.24}
\]

denoting normalized weight factors that put more emphasis on the more reliable
estimates.
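The constrained optimization problem with the objective function $F$ defined in
Eq. (10.23) is written as,

\[
\begin{aligned}
& F(m, \theta, \phi) \to \min, \\
& \{m, \theta, \phi\} \in S,
\end{aligned}
\tag{10.25}
\]

with a constraints domain,

\[
S = \left\{ \{m, \theta, \phi\} \in \mathbb{R}^3 \;\middle|\; \theta, \phi \in [0, 1]; \; m \in [1, +\infty) \right\}. \tag{10.26}
\]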
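
The objective function of Eq. (10.23) and the constraints of Eq. (10.26) can be sketched as follows. A coarse grid search over the constraint domain stands in for whatever optimizer is used in practice, so this is not the procedure behind the results reported below. The data are synthetic, the interval bounds are hypothetical, and all names are chosen for this illustration.

```python
import math


def asymmetric_logistic_surface(eps_x, eps_y, m, theta, phi):
    """A_k from Eq. (10.22), written in terms of the marginal ACER values."""
    core = ((phi * eps_x) ** m + (theta * eps_y) ** m) ** (1.0 / m)
    return core + (1.0 - phi) * eps_x + (1.0 - theta) * eps_y


def weighted_log_mse(params, grid):
    """Eq. (10.23); grid holds (eps_x, eps_y, e_hat, ci_minus, ci_plus) tuples."""
    m, theta, phi = params
    raw = [(math.log(hi) - math.log(lo)) ** -2 for (_, _, _, lo, hi) in grid]  # Eq. (10.24)
    norm = sum(raw)
    total = 0.0
    for w, (ex, ey, e_hat, _, _) in zip(raw, grid):
        model = asymmetric_logistic_surface(ex, ey, m, theta, phi)
        total += (w / norm) * (math.log(e_hat) - math.log(model)) ** 2
    return total


def grid_search(grid, m_values, unit_values):
    """Exhaustive search over m >= 1 and theta, phi in [0, 1], cf. Eq. (10.26)."""
    best = None
    for m in m_values:
        for theta in unit_values:
            for phi in unit_values:
                value = weighted_log_mse((m, theta, phi), grid)
                if best is None or value < best[0]:
                    best = (value, m, theta, phi)
    return best


# Synthetic "empirical" surface generated from known parameters.
TRUE = (2.5, 0.9, 0.8)
marginal_values = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]
data = []
for ex in marginal_values:
    for ey in marginal_values:
        e_hat = asymmetric_logistic_surface(ex, ey, *TRUE)
        data.append((ex, ey, e_hat, 0.8 * e_hat, 1.25 * e_hat))

best = grid_search(data, [1.0 + 0.25 * i for i in range(13)], [i / 10 for i in range(11)])
```

A gradient-based constrained optimizer would normally replace the grid search, but the objective function itself stays the same.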

10.5 Numerical Examples

10.5.1 Wind Speed Measured at Two Adjacent Weather Stations

In this section, the results obtained by Naess and Karpa (2015a) will be discussed.
The simultaneous wind speed data measured along the Norwegian coast at the Sula
and Nordøyan Fyr weather stations were analyzed to obtain numerical estimates
of bivariate extreme wind speeds. Figure 10.1 shows the geographical locations of
the measurement sites. The hourly maximum of the three-second wind gust (10 m
above the ground) was recorded during 13 years (1999–2012).
Figure 10.2 shows a plot of the observed data. This plot reveals a rather
strong dependence between the two time series.
The Pearson product-moment correlation coefficient is found to be $\rho = 0.73$.
Kendall's rank correlation coefficient is $\tau = 0.5$, while Spearman's $\rho$ equals
0.68, which also indicates a nonlinear relationship between the Sula and Nordøyan
wind speeds.

Fig. 10.1 Map of a part of Norway with the marked weather stations: A, Sula station;
B, Nordøyan Fyr station

Fig. 10.2 Coupled observations of wind speed data observed at the Sula station
($\xi$ axis) and at the Nordøyan Fyr station ($\eta$ axis)
[Plot: $\eta$, Nordøyan wind speed (m/s) versus $\xi$, Sula wind speed (m/s); both axes
from 0 to 40]

It was decided to divide the data series into 13 one-year records for the analysis.
By this, also the standard deviation of the ACER function estimates can be
calculated fairly accurately.
The univariate ACER functions were estimated first, using the Matlab-based standalone downloadable application (Karpa 2012). In Figs. 10.3 and 10.4, the sequence $\hat{\varepsilon}_1, \ldots, \hat{\varepsilon}_{96}$ is plotted versus different wind speed levels. Both figures reveal that there is significant temporal dependence between consecutive data. It is also seen that this dependence effect is largely accounted for by $k = 24$, since there is a marked degree of convergence in the tail of $\hat{\varepsilon}_k$ for $k \geq 24$ in both cases. Here, $\hat{\varepsilon}_{96}$, which corresponds to conditioning on data recorded up to 4 days earlier, is considered to represent the final converged result, since $\hat{\varepsilon}_{96} \approx \hat{\varepsilon}_k$ for $k > 96$ in the tail. Therefore, there is no need to consider conditioning of an even higher order than 96. So, effectively, $k_e = 96$ for the recorded data. Also note that 4 days is a typical separation of wind speed data adopted in the declustering process to achieve independence between the data used in, e.g., a peaks-over-threshold analysis. Figures 10.3 and 10.4 also demonstrate that for extreme value estimation, $\hat{\varepsilon}_1$ can be used, since the ACER functions all coalesce in the far tail. This makes it possible to choose $k = 1$, which makes much more data available for estimation, with a possible reduction of estimation uncertainty as a result.

Fig. 10.3 ACER estimates $\hat{\varepsilon}_k(\xi)$ for different degrees of conditioning ($k = 1, 2, 4, 24, 48, 72, 96$), plotted on a logarithmic scale versus the wind speed level $\xi$. Wind speed data from the Sula station, cf. Fig. 6.9

Fig. 10.4 ACER estimates $\hat{\varepsilon}_k(\eta)$ for different degrees of conditioning ($k = 1, 2, 4, 24, 48, 72, 96$), plotted on a logarithmic scale versus the wind speed level $\eta$. Wind speed data from the Nordøyan Fyr station
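The diagnostic use of the ACER functions for different $k$ can be illustrated with a short Python sketch. It is not the Matlab application of Karpa (2012). It assumes the univariate counterpart of the counting estimator described in Appendix 2, i.e., the fraction of exceedances among those observations that are preceded by $k - 1$ consecutive non-exceedances. The function name `acer_1d` and the synthetic series are hypothetical.

```python
import random


def acer_1d(x, level, k):
    """Empirical k-th order ACER of the series x at the given level.

    Counts exceedances x[j] > level among the time points j whose k-1
    immediate predecessors are all <= level (no condition when k = 1).
    """
    conditioned = 0   # time points preceded by k-1 non-exceedances
    exceedances = 0   # ... at which the level is also exceeded
    run = 0           # current run length of consecutive non-exceedances
    for value in x:
        if run >= k - 1:
            conditioned += 1
            if value > level:
                exceedances += 1
        run = run + 1 if value <= level else 0
    return exceedances / conditioned if conditioned else float("nan")


# Hypothetical dependent series: a moving maximum of independent noise,
# so that exceedances tend to occur in clusters.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
series = [max(noise[i:i + 3]) for i in range(len(noise) - 2)]

for k in (1, 2, 4, 8):
    print(k, acer_1d(series, level=2.0, k=k))
```

For a clustered series of this kind the estimate drops markedly from $k = 1$ to $k = 2$ at moderate levels, which is the behavior seen in the lower parts of Figs. 10.3 and 10.4.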
The sequence of estimated bivariate ACER surfaces $\hat{\mathcal{E}}_k(\xi, \eta)$ is shown in Fig. 10.5, where the surface for $k = 1$ is the uppermost. Matlab programs for ACER 2D analyses are available, cf. Karpa (2014). As seen from the figure, the cross-section of the surfaces at a high value of the wind speed level $\eta$ gives the univariate ACER functions of the wind speed data from the Sula station, while the cross-section at a high level of $\xi$ represents the univariate ACER functions of the time series from the Nordøyan Fyr station.
Due to the observed convergence, the ACER surface for $k = 96$ is very close to the surface obtained by taking the logarithm of the exact bivariate extreme value distribution, cf. Eq. (10.36).
Fig. 10.5 Bivariate ACER surface estimates for different degrees of conditioning. $\hat{\mathcal{E}}_k(\xi, \eta)$ surfaces are plotted on a logarithmic scale

Table 10.1 Optimal parameters of AL and GL fits

k     AL                                                    GL
1     $m_A = 2.44$, $\theta = 0.86$, $\varphi = 0.92$       $m_G = 2.01$
96    $m_A = 4.53$, $\theta = 0.97$, $\varphi = 0.97$       $m_G = 3.87$

Fig. 10.6 Contour plot of the empirically estimated $\hat{\mathcal{E}}_1$ surface, and the optimized Gumbel logistic $G_1$ and optimized Asymmetric logistic $A_1$ surfaces based on marginal univariate ACER. Boxes indicate levels on a logarithmic scale. Axes: $\xi$ (horizontal), $\eta$ (vertical)
Parameters of the optimal Asymmetric logistic (AL) and Gumbel logistic (GL) surfaces are presented in Table 10.1.
Figures 10.6 and 10.7 show the contour plots of the optimized Asymmetric logistic $A_k(\xi, \eta)$ and the optimized Gumbel logistic $G_k(\xi, \eta)$ fits to the empirical $\hat{\mathcal{E}}_k(\xi, \eta)$ surface for $k = 1$ and $k = 96$, respectively. The contour lines of the three surfaces are plotted for those levels of $\xi$ and $\eta$ where the bivariate ACER surface $\hat{\mathcal{E}}_k(\xi, \eta)$ has been empirically estimated.
The figures reveal that the empirical bivariate ACER surface $\hat{\mathcal{E}}_k$ captures the high correlation between the data, and so do the optimally fitted $G_k$ and $A_k$ surfaces. It is also seen that the behavior of the estimated ACER surface for $k = 96$ in Fig. 10.7 reflects high uncertainty due to scarcity of data. However, the optimal surfaces $G_{96}$ and $A_{96}$ capture the statistical properties of the bivariate observations.
The estimated bivariate ACER surface agrees equally well with the optimized Asymmetric logistic and the optimized Gumbel logistic surfaces. Yet, it is also important to keep in mind that the empirical bivariate ACER $\hat{\mathcal{E}}_k$ is the only discrete surface. The MATLAB (2009) built-in routine contourc, which was used to obtain the figures, calculates the contour lines on a regularly spaced grid determined by the dimensions of the surface. This introduces a certain spacing inaccuracy in the plotted level lines of the $\hat{\mathcal{E}}_k$ surface. In addition, the figures show that the optimized Asymmetric logistic and Gumbel logistic surfaces agree so closely that they practically coincide.
Thereby, the optimized Gumbel logistic model with asymptotically consistent marginals obtained from the optimized univariate ACER can be used as the parametric representative of the bivariate ACER surface estimated from the given data set.

Fig. 10.7 Contour plot of the empirically estimated $\hat{\mathcal{E}}_{96}$ surface, and the optimized Gumbel logistic $G_{96}$ and optimized Asymmetric logistic $A_{96}$ surfaces based on marginal univariate ACER. Boxes indicate levels on a logarithmic scale. Axes: $\xi$ (horizontal), $\eta$ (vertical)
Also, a comparison of the contour lines of $G_1$ and $G_{96}$ that correspond to the same return period levels shows fairly good agreement in the tail, considering the high uncertainty for the case $k = 96$. To highlight the results that would be obtained by adopting the common approach of assuming Gumbel marginal extreme value distributions combined with a suitable copula model, Fig. 10.8 shows the 50- and 100-year return period levels obtained by using the Asymmetric logistic model with asymptotically consistent marginals obtained from the optimized univariate ACER functions, together with the corresponding return levels obtained by using the Gumbel logistic model with Gumbel marginals fitted by the method of moments. It is clear that the discrepancy is significant, which is primarily caused by the use of asymptotic Gumbel marginals. Finally, it is noted that the bivariate ACER methodology has been studied in more detail in an initial study on synthetic data with a known extreme value distribution, and therefore a known 100-year return period level $\left(\xi^{100\,\mathrm{yr}}, \eta^{100\,\mathrm{yr}}\right)$. To get an idea about the performance of the ACER method and the existing Gumbel logistic model with Gumbel marginals fitted by the method of moments, Monte Carlo simulations were carried out to produce 100 bivariate data samples. It was observed that the predicted 100-year return period levels were consistently better for the ACER method.

Fig. 10.8 Contour plot of the return period levels for the optimized Asymmetric logistic $A_1$ surface (solid line) and the Gumbel logistic model with Gumbel marginals, $G_{MM}$ (dash-dot line). Boxes indicate return period levels in years. Axes: $\xi$ (horizontal), $\eta$ (vertical)

10.5.2 Wind Speed and Wave Height Measured at a North Sea Weather Station

This example presents results obtained by Naess and Karpa (2015b). Wind speed (WS, 3-hour mean [m/s]) and significant wave height (Hs, total sea [m]) data measured in the Norwegian Sea at location N 65.29, E 7.32 were analyzed to obtain numerical estimates of the bivariate extreme value distribution. Figure 10.9 shows the geographical position of the measurement site. The data were recorded during 54 years (1957–2011), 8 times per day (every three hours).
Figure 10.10 shows a plot of the observed data. As seen from the plot, there is a rather strong dependence between the two time series.
The Pearson product-moment correlation coefficient is found to be $\rho = 0.79$. Kendall's rank correlation coefficient is $\tau = 0.56$ and Spearman's $\rho$ is equal to $0.7$, which also indicates a nonlinear connection between WS and Hs.
It should be noted that the available bivariate observations have low accuracy. This especially concerns the significant wave height data, where the graduation mark is 0.1 m and, on average, there are 98 unique numerical values of the Hs data per year. Obviously, for fairly accurate estimation of the bivariate ACER functions, more data would be required. In order to get a good sample size of records (realizations), it was decided to divide the data series into 18 three-year records for the analysis. By this procedure, the standard deviation of the ACER function estimates can also be calculated fairly accurately.

Fig. 10.9 Map of a part of Norway with the marked location

Fig. 10.10 Coupled observations of wind speed data ($\xi$ axis, wind speed [m/s]) and significant wave height data (total sea, $\eta$ axis, wave height [m])
As in the previous example, the univariate ACER functions $\hat{\varepsilon}_k$ were estimated first, using the Matlab-based standalone downloadable application (Karpa 2012). In Figs. 10.11 and 10.12, $\hat{\varepsilon}_k$ is plotted versus different levels of wind speed and wave height, respectively, for different values of $k$. From both figures, it is clearly seen that there is significant time dependence between WS observations, as well as between Hs data. It is also seen that this dependence effect is largely accounted for by $k = 16$, since there is a marked degree of convergence in the tail of $\hat{\varepsilon}_k$ for $k \geq 16$ in both cases. For three-hourly observations, $k = 16$ corresponds to exceedances separated by at least two days of non-exceedances. For $k \geq 32$, which corresponds to data declustered with a four-day separation, full convergence has been achieved. Figures 10.11 and 10.12 also demonstrate that for extreme value estimation, $\hat{\varepsilon}_2$ can be used, since the ACER functions for $k \geq 2$ all converge in the far tail. Again, this clearly demonstrates the power of an ACER function plot as a diagnostic tool to decide on the value of $k$ needed for extreme value estimation in a particular case. In spite of significant dependence effects for the WS and Hs data of lower magnitude, for the extreme values this dependence is largely absent. This makes it possible to choose $k = 2$, which makes much more data available for estimation, with a possible reduction of estimation uncertainty as a result.

Fig. 10.11 ACER estimates $\hat{\varepsilon}_k(\xi)$ for different degrees of conditioning ($k = 1, 2, 3, 8, 16, 24, 32$), plotted on a logarithmic scale versus $\xi$. Wind speed data

Fig. 10.12 ACER estimates $\hat{\varepsilon}_k(\eta)$ for different degrees of conditioning ($k = 1, 2, 3, 8, 16, 24, 32$), plotted on a logarithmic scale versus $\eta$. Significant wave height (total sea)
Figures 10.13 and 10.14 show the plots of the optimized parametric fit to the data for $\hat{\varepsilon}_k$ with $k = 2$ for both time series. In particular, the 100-year return level and its 95% confidence interval (CI) are estimated parametrically and plotted. For the wind speed data, the optimal parameters are $q = 0.05$, $b = 0.1$, $a = 1.9 \cdot 10^{-4}$, $c = 3.14$, while for the Hs data, the parameters of the optimal curve are $q = 0.04$, $b = -2.27$, $a = 0.02$, $c = 2.23$.

Fig. 10.13 Plot of $\hat{\varepsilon}_2(\xi)$ versus wind speeds $\xi$ on a logarithmic scale for the optimized parameter values; $\xi_1 = 14.5$. Curves shown: empirical $\hat{\varepsilon}_2(\xi)$, the fitted curve, and the confidence bounds CI$_+$ and CI$_-$

Fig. 10.14 Plot of $\hat{\varepsilon}_2(\eta)$ against wave heights $\eta$ on a logarithmic scale for the optimized parameter values; $\eta_1 = 4.5$. Curves shown: empirical $\hat{\varepsilon}_2(\eta)$, the fitted curve, and the confidence bounds CI$_+$ and CI$_-$
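To make the role of the four parameters concrete, the sketch below evaluates a fitted tail for the wind speed data. It assumes the parametric ACER tail form used for the univariate fits in Chap. 5, $\varepsilon_k(\xi) \approx q \exp\{-a(\xi - b)^c\}$ for $\xi \geq \xi_1$. It further assumes that the $T$-year return level is the level whose yearly non-exceedance probability $\exp\{-N_{\mathrm{yr}}\,\varepsilon_k(\xi)\}$ equals $1 - 1/T$, with $N_{\mathrm{yr}} = 8 \cdot 365.25$ three-hourly observations per year. The return level convention, the value of $N_{\mathrm{yr}}$ and the function names are assumptions made for illustration only, and the rounded parameter values quoted above limit the accuracy of any number obtained in this way.

```python
import math


def acer_tail(level, q, a, b, c):
    """Assumed parametric ACER tail: q * exp(-a * (level - b)**c)."""
    return q * math.exp(-a * (level - b) ** c)


def return_level(years, n_per_year, q, a, b, c):
    """Level whose yearly non-exceedance probability is 1 - 1/years.

    Solves exp(-n_per_year * acer_tail(level)) = 1 - 1/years in closed form.
    """
    target = -math.log(1.0 - 1.0 / years) / n_per_year  # required ACER value
    return b + (math.log(q / target) / a) ** (1.0 / c)


# Rounded wind speed parameters quoted in the text (k = 2).
ws = dict(q=0.05, a=1.9e-4, b=0.1, c=3.14)
n_per_year = 8 * 365.25  # three-hourly observations per year (assumption)

level_100 = return_level(100, n_per_year, **ws)
print(f"100-year wind speed level: {level_100:.1f} m/s")
```

The closed-form inversion is only meaningful for levels above the tail marker $\xi_1$, where the parametric form is fitted.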

Figure 10.15 shows the empirically estimated bivariate ACER surfaces $\hat{\mathcal{E}}_k(\xi, \eta)$ for different values of $k$ on a logarithmic scale. The surface for $k = 1$ is the uppermost. As seen from the figure, the cross-section of the surfaces at a high level of the wave height $\eta$ gives the univariate ACER functions of the wind speed data, while the cross-section at a high wind speed level represents the univariate ACER functions of the Hs time series.
The same arguments as in the univariate case are applied to decide which bivariate ACER surface to use in the analyses. That is, since the surfaces for $k \geq 2$ all converge in the tail and the estimation of $\hat{\mathcal{E}}_2(\xi, \eta)$ is more accurate due to the availability of more data, the surface with degree of conditioning $k = 2$ is chosen.
Fig. 10.15 Bivariate ACER surface estimates for different degrees of conditioning ($k = 1, 2, 3, 8, 16, 24, 32$). $\hat{\mathcal{E}}_k(\xi, \eta)$ surfaces are plotted on a logarithmic scale

The optimal parameters of the Asymmetric logistic fit were found to be $m_{AL} = 7$, $\theta = 1$ and $\varphi = 0.91$. Since the additional parameters $\theta$ and $\varphi$ are close to one, it would seem reasonable to consider a Gumbel logistic model. The same optimization procedure as in the Asymmetric logistic case was applied to obtain the optimal dependence parameter of the Gumbel logistic copula, $m_{GL} = 4.78$.
Figures 10.16 and 10.17 show the contour plots of the optimized Asymmetric logistic fit $AL_2(\xi, \eta)$ to the data for the $\hat{\mathcal{E}}_2(\xi, \eta)$ surface and also the contour plots of the optimized Gumbel logistic surface $GL_2(\xi, \eta)$.
In Fig. 10.16, contour lines of the three surfaces are plotted for those levels of $\xi$ and $\eta$ where the bivariate ACER surface $\hat{\mathcal{E}}_2(\xi, \eta)$ has been empirically estimated. Contour lines that correspond to the return period levels are presented in Fig. 10.17.
The figures reveal that the empirical bivariate ACER surface $\hat{\mathcal{E}}_2$ captures the high correlation between the data, and so do the optimally fitted $GL_2$ and $AL_2$ surfaces. Note that the contour lines of the bivariate ACER surface of fully correlated data would show up as lines that consist of only horizontal and vertical line segments. This happens because for such data, $H_{xy}(\xi, \eta) = \min\{F_x(\xi), G_y(\eta)\}$, which implies that $\mathcal{E}_k(\xi, \eta) = \max\{\varepsilon_k^x(\xi), \varepsilon_k^y(\eta)\}$, cf. Eq. (10.18). It is seen that the estimated bivariate ACER surface agrees equally well with the optimized Asymmetric logistic and Gumbel logistic surfaces. Thereby, the optimized Gumbel logistic model with asymptotically consistent marginals obtained from the optimized univariate ACER can be used as the parametric representative of the bivariate ACER surface estimated from the given data set. The Matlab programs for ACER 2D analyses from Karpa (2014) were used to obtain the results presented in this section.
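The relation between the two logistic models and the limiting cases mentioned above can be illustrated numerically. The sketch below assumes that the Gumbel logistic surface combines the marginal ACER values as $\left[(\varepsilon_k^x)^m + (\varepsilon_k^y)^m\right]^{1/m}$ and that the Asymmetric logistic surface adds the terms $(1-\theta)\varepsilon_k^x + (1-\varphi)\varepsilon_k^y$ after scaling the marginals by $\theta$ and $\varphi$ inside the bracket. These functional forms, the assignment of $\theta$ and $\varphi$ to the two margins, and the numerical values are assumptions for illustration; the definitions given earlier in this chapter take precedence.

```python
def gumbel_logistic(eps_x, eps_y, m):
    """Assumed Gumbel logistic combination of two marginal ACER values."""
    return (eps_x ** m + eps_y ** m) ** (1.0 / m)


def asymmetric_logistic(eps_x, eps_y, m, theta, phi):
    """Assumed asymmetric logistic combination (theta, phi in [0, 1])."""
    joint = ((theta * eps_x) ** m + (phi * eps_y) ** m) ** (1.0 / m)
    return joint + (1.0 - theta) * eps_x + (1.0 - phi) * eps_y


# Hypothetical marginal ACER values at some level pair (xi, eta).
ex, ey = 2.0e-3, 5.0e-4

independent = gumbel_logistic(ex, ey, m=1.0)    # reduces to ex + ey
strong = gumbel_logistic(ex, ey, m=50.0)        # approaches max(ex, ey)
same_as_gl = asymmetric_logistic(ex, ey, 4.78, 1.0, 1.0)
print(independent, strong, same_as_gl)
```

With $\theta = \varphi = 1$ the assumed asymmetric form coincides with the Gumbel logistic form, which is consistent with the remark that parameter values close to one suggest the simpler model. For large $m$ the surface approaches the maximum of the two marginal values, i.e., the fully correlated case.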
Fig. 10.16 Contour plot of the empirically estimated $\hat{\mathcal{E}}_2(\xi, \eta)$ surface (•), optimized Gumbel logistic $GL_2(\xi, \eta)$ (◦) and optimized Asymmetric logistic $AL_2(\xi, \eta)$ (solid line) surfaces. Boxes indicate levels on a logarithmic scale. Axes: $\xi$ (horizontal), $\eta$ (vertical)

Fig. 10.17 Contour plot of the return period levels for the $\hat{\mathcal{E}}_2(\xi, \eta)$ surface (•), optimized Asymmetric logistic $AL_2(\xi, \eta)$ (solid line) and optimized Gumbel logistic $GL_2(\xi, \eta)$ (◦) surfaces. Boxes indicate return period levels in years. Axes: $\xi$ (horizontal), $\eta$ (vertical)

Appendix 1: The Sequence of Conditioning Approximations

Consider a bivariate stochastic process $Z(t) = \big(X(t), Y(t)\big)$ with dependent component processes, which has been observed over a time interval, $(0, T)$ say. Assume that the sampled values $Z_1 = (X_1, Y_1), \ldots, Z_N = (X_N, Y_N)$ are allocated to the (usually equidistant) discrete times $t_1, \ldots, t_N$ in $(0, T)$. The goal is to determine the joint distribution function of the extreme value vector $M_N = \big(M_{x,N}, M_{y,N}\big)$, where $M_{x,N} = \max\{X_j;\ j = 1, \ldots, N\}$, and with a similar definition of $M_{y,N}$. Specifically, the goal is to estimate $P(\xi, \eta) = \mathrm{Prob}\big(M_{x,N} \leq \xi,\ M_{y,N} \leq \eta\big)$ accurately for large values of $\xi$ and $\eta$.
Following the pattern of Chap. 5, the implementation of a sequence of approximations based on conditioning is now outlined, where the first is a one-step memory approximation. This approximation concept is described by Naess (1985a, 1990a).
Hereafter, whenever expedient to ease the notation, $\zeta = (\xi, \eta)$ is used with a componentwise ordering relationship for $Z_i$, e.g., $Z_i \leq \zeta$ means $X_i \leq \xi$ and $Y_i \leq \eta$. Also, the event $\mathcal{C}_{kj}(\zeta) = \mathcal{C}_{kj}(\xi, \eta) = \{Z_{j-1} \leq \zeta, \ldots, Z_{j-k+1} \leq \zeta\}$ of $k - 1$ consecutive componentwise non-exceedances ($k \geq 2$) is introduced. Then, from the definition of $P(\xi, \eta)$, it follows that

$$
\begin{aligned}
P(\xi, \eta) = P(\zeta) &= \mathrm{Prob}\big(\mathcal{C}_{N+1,N+1}(\zeta)\big) \\
&= \mathrm{Prob}\big(Z_N \leq \zeta \mid \mathcal{C}_{NN}(\zeta)\big) \cdot \mathrm{Prob}\big(\mathcal{C}_{NN}(\zeta)\big) \\
&= \prod_{j=2}^{N} \mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{jj}(\zeta)\big) \cdot \mathrm{Prob}\big(\mathcal{C}_{22}(\zeta)\big).
\end{aligned}
\tag{10.27}
$$

The first approximation of the sequence is obtained by assuming that the observed data pairs are independent, that is, data points $Z_i$ and $Z_j$ are statistically independent for all $i, j$, $i \neq j$, so that all conditioning in Eq. (10.27) can be neglected. In this special case, it is obtained that

$$
\begin{aligned}
P(\zeta) \approx P_1(\zeta) &= \prod_{j=1}^{N} \mathrm{Prob}\big(Z_j \leq \zeta\big) \\
&= \prod_{j=1}^{N} \Big[ 1 - \mathrm{Prob}(X_j > \xi) - \mathrm{Prob}(Y_j > \eta) + \mathrm{Prob}(X_j > \xi,\ Y_j > \eta) \Big].
\end{aligned}
\tag{10.28}
$$

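Now, the designations $\alpha_{1j}(\xi; \eta) = \mathrm{Prob}(X_j > \xi)$, $\beta_{1j}(\eta; \xi) = \mathrm{Prob}(Y_j > \eta)$ and $\gamma_{1j}(\xi, \eta) = \gamma_{1j}(\zeta) = \mathrm{Prob}(Z_j > \zeta)$ for $1 \leq j \leq N$ are introduced. It should be noted that although $\alpha_{1j}(\xi; \eta)$ does not depend on $\eta$ and $\beta_{1j}(\eta; \xi)$ does not depend on $\xi$, this notation is kept for consistency with the derivations that follow.
Equation (10.28) can now be rewritten as

$$
\begin{aligned}
P(\zeta) \approx P_1(\zeta) &= \prod_{j=1}^{N} \Big[ 1 - \alpha_{1j}(\xi; \eta) - \beta_{1j}(\eta; \xi) + \gamma_{1j}(\zeta) \Big] \\
&\approx \exp\Big\{ -\sum_{j=1}^{N} \Big[ \alpha_{1j}(\xi; \eta) + \beta_{1j}(\eta; \xi) - \gamma_{1j}(\zeta) \Big] \Big\},
\end{aligned}
\tag{10.29}
$$

for large values of the components of $\zeta$, where the approximation $1 - x \approx \exp(-x)$ for small values of $x$ has been applied, cf. Chap. 5.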
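The quality of the step from the product to the exponential form in Eq. (10.29) is easy to check numerically. The sketch below is a hypothetical illustration in which every factor has the same small probability $p = \alpha_{1j} + \beta_{1j} - \gamma_{1j}$; the function names and the chosen values of $p$ and $N$ are not taken from the text.

```python
import math


def product_form(probs):
    """Exact product of (1 - p_j), as in the first line of Eq. (10.29)."""
    result = 1.0
    for p in probs:
        result *= 1.0 - p
    return result


def exponential_form(probs):
    """Approximation exp(-sum p_j), as in the second line of Eq. (10.29)."""
    return math.exp(-sum(probs))


# Hypothetical setting: N independent pairs with a common probability p
# that at least one component exceeds its level.
n = 10000
for p in (1e-2, 1e-3, 1e-4):
    probs = [p] * n
    exact, approx = product_form(probs), exponential_form(probs)
    print(p, exact, approx, abs(approx - exact) / approx)
```

The relative error behaves roughly like $N p^2 / 2$, so the approximation is accurate exactly where it is needed, namely at high levels where the individual exceedance probabilities are small.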
In general, the variables $Z_j$ are statistically dependent in the componentwise sense. In this case, the first genuine conditioning approximation is obtained by neglecting all previous data except the immediate predecessor in Eq. (10.27). While this approximation sometimes captures the effect of dependence in the time series on the extreme value distribution, it is in general not sufficient. Consequently, the following $(k-1)$-step memory approximation is adopted,

$$
\begin{aligned}
P(\zeta) \approx P_k(\zeta) &= \prod_{j=k}^{N} \mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{kj}(\zeta)\big) \cdot \mathrm{Prob}\big(\mathcal{C}_{kk}(\zeta)\big) \\
&= \prod_{j=k}^{N} \Big[ 1 - \Big( 1 - \mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{kj}(\zeta)\big) \Big) \Big] \cdot \mathrm{Prob}\big(\mathcal{C}_{kk}(\zeta)\big) \\
&\approx \exp\Big\{ -\sum_{j=k}^{N} \Big( 1 - \mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{kj}(\zeta)\big) \Big) \Big\} \cdot \mathrm{Prob}\big(\mathcal{C}_{kk}(\zeta)\big),
\end{aligned}
\tag{10.30}
$$

for large values of the components of $\zeta$. Then, $k = 2$ means conditioning on only the previous observation.
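By introducing the notations $\alpha_{kj}(\xi; \eta) = \mathrm{Prob}\big(X_j > \xi \mid \mathcal{C}_{kj}(\zeta)\big)$, $\beta_{kj}(\eta; \xi) = \mathrm{Prob}\big(Y_j > \eta \mid \mathcal{C}_{kj}(\zeta)\big)$ and $\gamma_{kj}(\xi, \eta) = \gamma_{kj}(\zeta) = \mathrm{Prob}\big(Z_j > \zeta \mid \mathcal{C}_{kj}(\zeta)\big)$, for $k \leq j \leq N$, it can now be shown that

$$
\exp\Big\{ -\sum_{j=k}^{N} \Big( 1 - \mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{kj}(\zeta)\big) \Big) \Big\}
= \exp\Big\{ -\sum_{j=k}^{N} \Big[ \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\zeta) \Big] \Big\}.
\tag{10.31}
$$

Similarly, it is found that

$$
\mathrm{Prob}\big(\mathcal{C}_{kk}(\zeta)\big) \approx \exp\Big\{ -\sum_{j=1}^{k-1} \Big[ \alpha_{jj}(\xi; \eta) + \beta_{jj}(\eta; \xi) - \gamma_{jj}(\zeta) \Big] \Big\},
\tag{10.32}
$$

for large values of the components of $\zeta$. Under these conditions, this leads to the result

$$
P_k(\zeta) \approx \exp\Big\{ -\sum_{j=k}^{N} \Big[ \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\zeta) \Big]
- \sum_{j=1}^{k-1} \Big[ \alpha_{jj}(\xi; \eta) + \beta_{jj}(\eta; \xi) - \gamma_{jj}(\zeta) \Big] \Big\}.
\tag{10.33}
$$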
Thereby, based on the definition of the extreme value distribution $P(\zeta)$ and the properties of conditional probability, a set $\{P_k(\zeta)\}_{k=1}^{N}$ of conditional probability distributions has been constructed, which converges to the target distribution $P(\zeta)$ of the extreme value $M_N$ in the limit as $k$ increases for large values of the components of $\zeta$.
For most applications, and for practical significance, the following assumption on this sequence of approximations is made: there is an effective $k_e$ satisfying $k_e \ll N$ such that $P(\zeta) = P_{k_e}(\zeta)$. Then, $P_1(\zeta) \leq P_2(\zeta) \leq \ldots \leq P_{k_e}(\zeta) = P(\zeta)$. It may be noted that for a $k$-dependent stationary bivariate stochastic process $Z(t)$, that is, for data where $Z_i$ and $Z_j$ are independent componentwise whenever $|j - i| > k$, then $P(\zeta) = P_{k+1}(\zeta)$ exactly, as in the univariate case.

It will be verified that the property $k_e \ll N$ is indeed satisfied for the wind speed data analyzed in the present chapter. Also, under this assumption, $\sum_{j=1}^{k-1} \big[ \alpha_{jj}(\xi; \eta) + \beta_{jj}(\eta; \xi) - \gamma_{jj}(\zeta) \big]$ is generally negligible compared to $\sum_{j=k}^{N} \big[ \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\zeta) \big]$. This leads to the approximation ($k = k_e$),

$$
P(\zeta) \approx P_k(\zeta) \approx \exp\Big\{ -\sum_{j=k}^{N} \Big[ \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\zeta) \Big] \Big\},
\tag{10.34}
$$

for large values of $\xi$ and $\eta$, from which it emerges that for the estimation of the bivariate extreme value distribution, it is sufficient to estimate the sequence of functions $\big\{ \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\zeta) \big\}_{j=k}^{N}$.

Appendix 2: Empirical Estimation of the Bivariate ACER Functions

To get a more compact representation, it is expedient to introduce the concept of the $k$'th order bivariate average conditional exceedance rate (ACER) function as follows,

$$
\mathcal{E}_k(\zeta) = \frac{1}{N - k + 1} \sum_{j=k}^{N} \Big[ \alpha_{kj}(\xi; \eta) + \beta_{kj}(\eta; \xi) - \gamma_{kj}(\zeta) \Big]; \quad k = 1, 2, \ldots
\tag{10.35}
$$

Hence, when $N \gg k$, and for large values of the components of $\zeta$,

$$
P_k(\zeta) \approx \exp\big\{ -(N - k + 1)\, \mathcal{E}_k(\zeta) \big\}.
\tag{10.36}
$$
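Equation (10.36) turns an ACER value directly into a non-exceedance probability. As a small numerical illustration, the sketch below evaluates it for one year of data and converts the result into a return period. The return period convention $1/(1 - P)$ applied to the yearly extreme, the function names, and the numbers used are assumptions for illustration.

```python
import math


def non_exceedance_probability(acer_value, n, k):
    """Eq. (10.36): P_k is approximately exp(-(N - k + 1) * E_k)."""
    return math.exp(-(n - k + 1) * acer_value)


def return_period_years(acer_value, n_per_year, k):
    """Return period implied by the yearly non-exceedance probability."""
    p_year = non_exceedance_probability(acer_value, n_per_year, k)
    return 1.0 / (1.0 - p_year)


# Hypothetical example: hourly data (8760 values per year), k = 1,
# and an ACER value of 10**-4.2 at some level pair (xi, eta).
print(return_period_years(10 ** -4.2, n_per_year=8760, k=1))
```

Read in this way, each contour level of the ACER surface on a logarithmic scale corresponds to one return period contour for the pair of levels $(\xi, \eta)$.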
Note that for higher dimensional cases, it is expedient to use an alternative formulation of the ACER functions, which is related to the problem of numerical calculation of the ACER functions, viz.,

$$
\mathcal{E}_k(\zeta) = \frac{1}{N - k + 1} \sum_{j=k}^{N} \Big( 1 - \mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{kj}(\zeta)\big) \Big); \quad k = 2, 3, \ldots
\tag{10.37}
$$

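A few more details on the numerical estimation of the bivariate ACER functions are worth noting. It is useful to start by introducing a set of random functions. For $k = 2, \ldots, N$, and $k \leq j \leq N$, let

$$
\begin{aligned}
A_{kj}(\xi; \eta) &= \mathbf{1}\{X_j > \xi \,\cap\, \mathcal{C}_{kj}(\zeta)\}, \\
B_{kj}(\eta; \xi) &= \mathbf{1}\{Y_j > \eta \,\cap\, \mathcal{C}_{kj}(\zeta)\}, \\
G_{kj}(\xi, \eta) = G_{kj}(\zeta) &= \mathbf{1}\{Z_j > \zeta \,\cap\, \mathcal{C}_{kj}(\zeta)\}, \\
C_{kj}(\xi, \eta) = C_{kj}(\zeta) &= \mathbf{1}\{\mathcal{C}_{kj}(\zeta)\},
\end{aligned}
\tag{10.38}
$$

where $\mathbf{1}\{\mathcal{A}\}$ denotes the indicator function of some event $\mathcal{A}$.
From these definitions it follows that, for instance,

$$
\alpha_{kj}(\xi; \eta) = \frac{\mathrm{E}[A_{kj}(\xi; \eta)]}{\mathrm{E}[C_{kj}(\zeta)]},
\tag{10.39}
$$

where $\mathrm{E}[\cdot]$ denotes the expectation operator. A similar equation holds for $\beta_{kj}(\eta; \xi)$ with $B_{kj}(\eta; \xi)$ instead of $A_{kj}(\xi; \eta)$ in the numerator, and for $\gamma_{kj}(\zeta)$ with $G_{kj}(\zeta)$ instead of $A_{kj}(\xi; \eta)$, by analogy.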
Assuming ergodicity of the process $Z(t) = \big(X(t), Y(t)\big)$, then obviously $\mathcal{E}_k(\zeta) = \alpha_{kk}(\xi; \eta) + \beta_{kk}(\eta; \xi) - \gamma_{kk}(\zeta) = \ldots = \alpha_{kN}(\xi; \eta) + \beta_{kN}(\eta; \xi) - \gamma_{kN}(\zeta)$, and it may be assumed that for the bivariate time series at hand,

$$
\mathcal{E}_k(\zeta) = \lim_{N \to \infty} \frac{\sum_{j=k}^{N} \big[ a_{kj}(\xi; \eta) + b_{kj}(\eta; \xi) - g_{kj}(\zeta) \big]}{\sum_{j=k}^{N} c_{kj}(\zeta)},
\tag{10.40}
$$

where $a_{kj}(\xi; \eta)$, $b_{kj}(\eta; \xi)$, $g_{kj}(\zeta)$ and $c_{kj}(\zeta)$ are the realized values of $A_{kj}(\xi; \eta)$, $B_{kj}(\eta; \xi)$, $G_{kj}(\zeta)$ and $C_{kj}(\zeta)$, respectively, for the observed time series.
While Eq. (10.40) clearly displays how the conditional exceedances are counted for the estimation of the bivariate ACER functions, this approach becomes very cumbersome for higher dimensional problems. Fortunately, this can be circumvented by invoking Eq. (10.37). It is observed that

$$
\mathrm{Prob}\big(Z_j \leq \zeta \mid \mathcal{C}_{kj}(\zeta)\big)
= \frac{\mathrm{Prob}\big(\{Z_j \leq \zeta\} \cap \mathcal{C}_{kj}(\zeta)\big)}{\mathrm{Prob}\big(\mathcal{C}_{kj}(\zeta)\big)}
= \frac{\mathrm{E}[C_{k+1,j+1}(\zeta)]}{\mathrm{E}[C_{kj}(\zeta)]},
\tag{10.41}
$$

since clearly $\{Z_j \leq \zeta\} \cap \mathcal{C}_{kj}(\zeta) = \mathcal{C}_{k+1,j+1}(\zeta)$. A consequence of this is that Eq. (10.40) may be replaced by the following equation,

$$
\mathcal{E}_k(\zeta) = 1 - \lim_{N \to \infty} \frac{\sum_{j=k}^{N} c_{k+1,j+1}(\zeta)}{\sum_{j=k}^{N} c_{kj}(\zeta)},
\tag{10.42}
$$

which is a form particularly suitable for estimating $\mathcal{E}_k(\zeta)$ for high dimensional distributions.
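The two counting formulas can be sketched in Python for a finite record. This is not the Matlab code of Karpa (2014). It is a minimal illustration with hypothetical function names and toy data, in which the limits in Eqs. (10.40) and (10.42) are replaced by the sums over the available record.

```python
def bivariate_acer(x, y, xi, eta, k):
    """Finite-sample version of Eq. (10.42) for paired series x, y.

    A pair is a componentwise non-exceedance when x[j] <= xi and
    y[j] <= eta. Python indices start at 0, so index j here corresponds
    to time point j + 1 in the text.
    """
    ok = [a <= xi and b <= eta for a, b in zip(x, y)]
    c_k = 0       # sum over j of c_{kj}: the k-1 predecessors are all ok
    c_k_next = 0  # sum over j of c_{k+1,j+1}: pair j is ok as well
    run = 0       # consecutive non-exceedances immediately before j
    for flag in ok:
        if run >= k - 1:
            c_k += 1
            if flag:
                c_k_next += 1
        run = run + 1 if flag else 0
    return 1.0 - c_k_next / c_k


def bivariate_acer_counts(x, y, xi, eta, k):
    """Finite-sample version of Eq. (10.40), counting a + b - g directly."""
    num = den = 0
    for j in range(k - 1, len(x)):
        window = range(j - k + 1, j)  # the k-1 predecessors of j
        if all(x[i] <= xi and y[i] <= eta for i in window):
            den += 1                          # c_kj
            a, b = x[j] > xi, y[j] > eta
            num += a + b - (a and b)          # a_kj + b_kj - g_kj
    return num / den


# Hypothetical toy record of paired observations.
xs = [1, 5, 1, 1, 1, 5, 1, 1]
ys = [1, 1, 1, 5, 1, 5, 1, 1]
for k in (1, 2):
    print(k, bivariate_acer(xs, ys, 3, 3, k),
          bivariate_acer_counts(xs, ys, 3, 3, k))
```

For a finite record the two functions return the same value, since $a_{kj} + b_{kj} - g_{kj} = c_{kj} - c_{k+1,j+1}$ for every $j$. Only the first form extends without change to more than two components. Both functions raise a division error if no time point satisfies the conditioning event, which can happen at very low levels.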
Clearly, $\lim_{\xi, \eta \to \infty} \mathrm{E}[C_{kj}(\zeta)] = 1$. Hence,

$$
\lim_{\xi, \eta \to \infty} \tilde{\mathcal{E}}_k(\zeta) \,/\, \mathcal{E}_k(\zeta) = 1,
\tag{10.43}
$$

where

$$
\tilde{\mathcal{E}}_k(\zeta) = \lim_{N \to \infty} \frac{\sum_{j=k}^{N} \big( \mathrm{E}[A_{kj}(\xi; \eta)] + \mathrm{E}[B_{kj}(\eta; \xi)] - \mathrm{E}[G_{kj}(\zeta)] \big)}{N - k + 1}.
\tag{10.44}
$$

The advantage of using the modified bivariate ACER function $\tilde{\mathcal{E}}_k(\zeta)$ for $k \geq 2$ is that it is somewhat easier to use for non-stationary or long-term statistics than $\mathcal{E}_k(\zeta)$. This aspect is discussed in Chap. 5, see also Karpa and Naess (2013). Since the focus is on the values of the ACER functions at the extreme levels, any function may be used that provides correct predictions of the appropriate ACER function at these extreme levels.
Now, let us look at the problem of estimating confidence intervals for the bivariate ACER function. If several realizations of the time series $Z(t) = \big(X(t), Y(t)\big)$ are provided, or the time series can be appropriately sectioned into several records, e.g., several annual or other time span records, the sample estimate of $\mathcal{E}_k(\zeta)$ would be

$$
\hat{\mathcal{E}}_k(\zeta) = \frac{1}{R} \sum_{r=1}^{R} \hat{\mathcal{E}}_k^{(r)}(\zeta),
\tag{10.45}
$$

where $R$ is the number of realizations (samples). $\hat{\mathcal{E}}_k^{(r)}(\zeta)$ is estimated using the result from Eq. (10.40) for stationary time series, or using the result from Eq. (10.44) for non-stationary time series, where the index $(r)$ refers to realization no. $r$. The sample standard deviation $\hat{s}_k(\zeta)$ can then be estimated by the standard formula,

$$
\hat{s}_k(\zeta)^2 = \frac{1}{R - 1} \sum_{r=1}^{R} \Big( \hat{\mathcal{E}}_k^{(r)}(\zeta) - \hat{\mathcal{E}}_k(\zeta) \Big)^2.
\tag{10.46}
$$

Assuming that realizations are independent, Eq. (10.46) leads to a good approximation of the 95% confidence interval CI $= \big(\mathrm{CI}_-(\zeta), \mathrm{CI}_+(\zeta)\big)$ for the value $\mathcal{E}_k(\zeta)$, where

$$
\mathrm{CI}_\pm(\zeta) = \hat{\mathcal{E}}_k(\zeta) \pm \tau \cdot \frac{\hat{s}_k(\zeta)}{\sqrt{R}},
\tag{10.47}
$$

and $\tau = t^{-1}\big((1 - 0.95)/2,\ R - 1\big)$ is the corresponding quantile of the Student's t-distribution with $R - 1$ degrees of freedom.
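Equations (10.45) to (10.47) amount to a few lines of code once the per-record estimates are available. The sketch below uses only the Python standard library, which has no Student's t quantile function, so the magnitude of the quantile is passed in by the caller. The function name and the per-year values are hypothetical. The constant 2.179 is the tabulated 97.5% quantile for 12 degrees of freedom, which would apply to $R = 13$ records.

```python
import math
import statistics


def acer_confidence_interval(estimates, t_quantile):
    """Mean and 95% CI of Eqs. (10.45)-(10.47) from per-record estimates.

    estimates  : ACER estimates at one level, one value per realization
    t_quantile : magnitude of the Student t quantile for R - 1 degrees of
                 freedom (supplied by the caller; not computed here)
    """
    r = len(estimates)
    mean = statistics.fmean(estimates)          # Eq. (10.45)
    s = statistics.stdev(estimates)             # square root of Eq. (10.46)
    half_width = t_quantile * s / math.sqrt(r)  # Eq. (10.47)
    return mean, mean - half_width, mean + half_width


# Hypothetical per-year ACER estimates at one level (R = 13 records).
per_year = [2.1e-3, 1.7e-3, 2.6e-3, 1.9e-3, 2.3e-3, 2.0e-3, 1.5e-3,
            2.8e-3, 2.2e-3, 1.8e-3, 2.4e-3, 2.0e-3, 2.1e-3]
T_975_DF12 = 2.179  # tabulated Student t quantile, 12 degrees of freedom

mean, lower, upper = acer_confidence_interval(per_year, T_975_DF12)
print(mean, lower, upper)
```

In practice the calculation is repeated for every level $\zeta$ at which the surface is estimated, which gives the confidence bands plotted together with the ACER estimates.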
Chapter 11
Space–Time Extremes of Random Fields

11.1 Introduction

The initial motivation for developing the methods presented in this chapter was
the search for a practical solution to the air gap problem for offshore structures.
Over the years several reports had been filed detailing damage of the deck structure
of offshore platforms due to wave impacts. In fact, such damage seemed to occur
more frequently than could be expected from predictions based on standard theory.
These wave impacts are highly undesirable events as they may, in the worst case,
compromise the structural integrity of the platform deck. Hence, considerable
attention was being paid to this problem over some time, and now reasonably good
predictive tools have become available, as will be demonstrated in this chapter.
It was eventually realized that the standard procedure for predicting the extreme
wave crest height to be expected at a given platform location is largely based on
extreme value statistics for a single point (Forristall, 2006). It is clear that if this is
used as a basis for predicting the probability of wave impact on the deck structure,
then the fact that there is also a significant area effect on the extreme crest height
distribution over the deck area is neglected. Hence, for a proper solution of the air
gap problem, it is necessary to model the ocean surface as a random field so that
also the spatial aspect can be correctly dealt with (Forristall, 2006). For the case
of a homogeneous Gaussian random wave field, this aspect is also discussed by
Socquet-Juglard et al. (2005) and Baxevani and Rychlik (2006).
Recently, a new, simplified method for predicting the space–time extreme value
statistics of homogeneous Gaussian random fields has been proposed (Naess and
Batsevych, 2010). In the present chapter it is shown that this simplified method
can be extended to deal with homogeneous non-Gaussian random fields and, in
particular, with the special case of a second-order homogeneous ocean wave field
(Naess and Batsevych, 2012). Additionally, a new semi-parametric representation of
the space–time extreme value distribution for quite general homogeneous random
fields over rectangular domains is proposed. It is demonstrated that both the

simplified method and the new semi-parametric method yield good predictions of
the extreme crest heights over the deck area for jacket platforms,
which are assumed to be transparent for the big waves. Even if the second-order
ocean wave field is intrinsically non-Gaussian, its deviation from a Gaussian field
is not very significant. Therefore, to implement a more stringent test of the two
methods proposed, an example of a strongly non-Gaussian field has also been
included. Both methods will be shown to provide accurate extreme value predictions
for this case as well.
The proposed simplified approach, as opposed to the new semi-parametric
method, can be easily modified to also provide good solutions of the air gap problem
for a TLP or a semi-submersible platform, provided measured or simulated time
series of the ocean surface elevation or air gap at a sufficient number of points
below the deck structure are available, which properly account for the motions
of the platform and the effect of the structure on the wave field. Due to the
interaction effects between the wave field and the structure, the wave field is not
homogeneous in space under the platform deck, even if the incoming wave field is
of this type. Hence, a prediction method that can deal with a random field that is
nonhomogeneous in space while stationary in time will be required. The simplified
method proposed in this chapter can provide such a prediction tool.
Over the last 20 years or so, several approximations for the extreme value distributions of a real-valued random field $X(\mathbf{x}, t)$, where $\mathbf{x} = (x_1, \ldots, x_d) \in \mathcal{D} \subseteq \mathbb{R}^d$ and $t \in \mathcal{T} \subseteq \mathbb{R}$, have been proposed, of which a few are mentioned (Adler, 1981; Vanmarcke, 1983; Sun, 1993; Maes and Breitung, 1997; Ditlevsen, 2004). Of particular interest are two books which provide substantial theoretical results, mainly for Gaussian random fields (Piterbarg, 1996; Adler and Taylor, 2007).

11.2 Spatial–Temporal Extremes for Gaussian Random Fields

Let $X = X(\mathbf{x}, t)$ denote a zero mean random field defined on $\mathbb{R}^d \times \mathbb{R}$, and let

$$
\hat{X} = \hat{X}(\mathcal{D} \times \mathcal{T}) = \max\{X(\mathbf{x}, t);\ \mathbf{x} \in \mathcal{D},\ t \in \mathcal{T}\},
\tag{11.1}
$$

where $\mathcal{T}$ denotes a time interval, specifically, $\mathcal{T} = (0, T)$ for a given $T$.
In Chapter 14 of Adler and Taylor (2007), the following result is proved for a smooth Gaussian random field:

$$
\Big| \mathrm{Prob}(\hat{X} \geq \xi) - \mathrm{E}\big[\varphi\big(A_\xi(X, \mathcal{D} \times \mathcal{T})\big)\big] \Big| < O\Big( e^{-\alpha \xi^2 / (2\sigma^2)} \Big),
\tag{11.2}
$$

where $\varphi\big(A_\xi(X, \mathcal{D} \times \mathcal{T})\big)$ denotes the Euler characteristic of $A_\xi(X, \mathcal{D} \times \mathcal{T})$, which is the excursion set of $X$ over $\mathcal{D} \times \mathcal{T}$. That is, $A_\xi(X, \mathcal{D} \times \mathcal{T}) = \{(\mathbf{x}, t) \in \mathcal{D} \times \mathcal{T} : X(\mathbf{x}, t) \geq \xi\}$, $\sigma^2$ is the variance of $X$ (assumed constant), and $\alpha > 1$ is a constant that can be identified as discussed by Adler and Taylor (2007). Certain restrictions apply to the geometry of the domain $\mathcal{D}$. Instead of going into detail about this formula, a largely equivalent result will be given, which is due to Piterbarg (1996). This result applies to a homogeneous Gaussian field with a simple geometry for $\mathcal{D}$, typically a convex domain such as a rectangle or a sphere. Any numerical representation of the homogeneous random field $X$ by necessity leads to a truncation of the spectral density (assumed to exist). Hence, there is no practical limitation in assuming that the spectral density of $X$ has bounded support, which leads to smoothness of any order. Piterbarg's (asymptotic) formula, cf. Chapter 1 of Piterbarg (1996), can then be written as ($\xi = u\sigma$, $u \to \infty$)

$$
\mathrm{Prob}(\hat{X} \leq \sigma u) \simeq P_A(u) = \exp\Big\{ -\phi(u)\, (2\pi)^{(d+1)/2} H_d(u)\, V_d \Big\},
\tag{11.3}
$$

where $\simeq$ denotes asymptotically equal, $H_k(u) = (-1)^k \phi(u)^{-1} (d/du)^k \phi(u)$ is a Hermite polynomial of order $k$, $\phi$ is the standard normal probability density, that is, $\phi(u) = (\sqrt{2\pi})^{-1} e^{-u^2/2}$, $V_d = m(\mathcal{D} \times \mathcal{T}) / m(V)$, $m(V) = (2\pi\sigma)^{d+1} / \sqrt{\det \Lambda}$ ($\Lambda$ is defined by Eq. (11.5) below), and $m(\mathcal{A})$ denotes the Lebesgue measure (volume) of the domain $\mathcal{A}$.
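Before the calculation of $V_d$ is discussed any further, a slight detour is made and some basic results are presented as detailed by Krogstad et al. (2004). Denoting the spectral density of $X$ by $\Psi(\mathbf{k}, \omega)$, its covariance function $\gamma(\mathbf{x}, t)$ is given as

$$
\gamma(\mathbf{x}, t) = \int_{\mathbb{R}^{d+1}} e^{i(\mathbf{k} \cdot \mathbf{x} + \omega t)}\, \Psi(\mathbf{k}, \omega)\, d\mathbf{k}\, d\omega,
\tag{11.4}
$$

where $\mathbf{k}$ denotes the wave number vector.
Denote by $\Lambda = (\Lambda_{ij})$ the essentially nonsingular covariance matrix of the gradient $\nabla X(\mathbf{x}, t) = (\partial X/\partial x_1, \ldots, \partial X/\partial x_d, \partial X/\partial t)$ evaluated at the point $\mathbf{x} = \mathbf{0}$, $t = 0$,

$$
\Lambda_{ij} = \mathrm{Cov}\big( \nabla_i X(\mathbf{0}, 0),\ \nabla_j X(\mathbf{0}, 0) \big) = -\nabla_i \nabla_j \gamma(\mathbf{0}, 0) = \int_{\mathbb{R}^{d+1}} r_i r_j\, \Psi(\mathbf{r})\, d\mathbf{r},
\tag{11.5}
$$

where $\mathbf{r} = (r_1, \ldots, r_d, r_{d+1}) = (k_1, \ldots, k_d, \omega)$.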

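A linear space transformation $A : \mathcal{D} \times \mathcal{T} \to \tilde{\mathcal{D}} \times \tilde{\mathcal{T}}$ is now performed with the help of a nonsingular $(d + 1) \times (d + 1)$ matrix $A$,

$$
\tilde{\mathbf{y}} = A \tilde{\mathbf{x}},
\tag{11.6}
$$

where $\tilde{\mathbf{x}} = (\mathbf{x}, t) = (x_1, \ldots, x_d, t)$. Then a new Gaussian random field $\tilde{X}(\tilde{\mathbf{y}})$ defined as

$$
\tilde{X}(\tilde{\mathbf{y}}) = X(A^{-1} \tilde{\mathbf{y}}) / \sigma
\tag{11.7}
$$

has unit variance, $\tilde{\sigma} = 1$, and the covariance matrix of the gradient of $\tilde{X}$ is given by

$$
\tilde{\Lambda} = \frac{1}{\sigma^2} (A^{-1})^T \Lambda A^{-1}.
\tag{11.8}
$$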
√
Since .Λ is a symmetrical matrix, the transformation .A = d + 1σ −1 Λ1/2
is well defined and results in .Λ̃ = (d + 1)−1 I . For this special case of the
Gaussian random field .X̃(ỹ) corresponding to the mentioned covariance matrix .Λ̃,
a complementary extreme value distribution can be obtained for the particular case
of a rectangular domain .Δ with edges oriented along the coordinate axes, that is,
d+1
.Δ = ⊗
i=1 [0, Li ], where .⊗ denotes a Cartesian product. This probability can be
written as

d+1 ∞
ˆ
.Prob(X̃ > u) ϕ(u)
Hk−1 (u)
Σk (L) + ϕ(x)dx , (11.9)
[2π(d + 1)]k/2 u
k=1

where $L = (L_1, \ldots, L_{d+1})$ and $\Sigma_k(L)$ is the $k$th elementary symmetric polynomial, defined as

$$\Sigma_k(L) = \sum_{1 \le j_1 < j_2 < \cdots < j_k \le d+1} L_{j_1} \cdot L_{j_2} \cdots L_{j_k}. \tag{11.10}$$
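The elementary symmetric polynomial of Eq. (11.10) is simply the sum of the products over all $k$-element subsets of the edge lengths. A minimal sketch, with the function name `elementary_symmetric` chosen here for illustration:

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """Sum of the products of all k-element subsets of `values` (Eq. 11.10)."""
    return sum(prod(subset) for subset in combinations(values, k))
```

For three edge lengths $(L_1, L_2, L_3)$, $k = 1$ gives the sum of the edges, $k = 2$ the sum of the pairwise products (face areas), and $k = 3$ the volume.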

This result is obtained as a special case of Theorem 5.1 in the book by Piterbarg
(1996).
Now, let us rewrite Eq. (11.9) for the case of the Gaussian field $X(x, t)$ having variance $\sigma^2$ and diagonal covariance matrix of general form, $\Lambda_X = \mathrm{diag}(\lambda_1, \ldots, \lambda_{d+1})$. This case is important since an arbitrary Gaussian field $Z(z, t)$ with covariance matrix $\Lambda_Z$ can be brought into this form by performing a rotation $A$ ($A^T = A^{-1}$) of the coordinate axes to the principal directions. As a consequence, only rectangular domains with edges oriented along principal directions are considered.
It is now easy to check that for $X(x, t)$

$$\mathrm{Prob}(\hat{X} > u\sigma) \simeq \phi(u) \sum_{k=1}^{d+1} (2\pi)^{k/2} H_{k-1}(u)\, \Sigma_k(q) + \int_u^{\infty} \phi(t)\, dt, \tag{11.11}$$

where $q = (q_1, q_2, \ldots, q_{d+1})$, $q_i = L_i/\ell_i$, $\ell_i = 2\pi\sigma/\sqrt{\lambda_i}$, $i = 1, \ldots, d + 1$.

For high levels u and, correspondingly, small exceedance probabilities, one may
write

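$$\mathrm{Prob}(\hat{X} \le u\sigma) = 1 - \mathrm{Prob}(\hat{X} > u\sigma) \simeq \exp\left\{ -\mathrm{Prob}(\hat{X} > u\sigma) \right\} \simeq P_B(u) = \exp\left\{ -\phi(u) \sum_{k=1}^{d+1} (2\pi)^{k/2} H_{k-1}(u)\, \Sigma_k(q) \right\}. \tag{11.12}$$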

This formula (referred to in the following as PB) is in agreement with the "conventional" Piterbarg's asymptotic formula Eq. (11.3) (referred to as PA) if only the leading term of the sum in Eq. (11.12) is taken into account, as $\Sigma_{d+1}(q) = V_d$. The leading term in the exponent of PB (and the only one in PA) provides the probability of the exceedance from inside the volume of the domain $\mathcal{D} \times \mathcal{T}$, while the other terms of PB represent the probability of exceedance for the lower dimensional manifolds associated with the boundaries of the domain, like faces, edges, and vertices in the case of a three-dimensional $\Delta$. Therefore, if one wants to quantify the exceedance probability for a rectangle $\Delta$, one of whose spatial sizes, say $L_1$, tends to zero, PA must be rewritten for the lower dimensional case. This demands in particular using $V_{d-1}$ instead of $V_d$, as the latter tends to zero. The same formula PB, on the other hand, works for all ranges of $\Delta$, that is, there is no need to change it upon a "squeeze" of one of the dimensions.
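The "squeeze" property of PB can be checked numerically. The following is a minimal sketch of Eq. (11.12), assuming the normalized sizes $q_i = L_i/\ell_i$ are already available; the names `hermite`, `sigma_k`, and `piterbarg_pb` are illustrative and not taken from the text.

```python
import math
from itertools import combinations

import numpy as np
from numpy.polynomial import hermite_e


def hermite(k, u):
    """Probabilists' Hermite polynomial H_k(u)."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return float(hermite_e.hermeval(u, coeffs))


def sigma_k(q, k):
    """k-th elementary symmetric polynomial of the normalized sizes q."""
    return sum(math.prod(subset) for subset in combinations(q, k))


def piterbarg_pb(u, q):
    """P_B(u) of Eq. (11.12); len(q) equals d + 1."""
    phi_u = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    total = sum(
        (2.0 * math.pi) ** (k / 2.0) * hermite(k - 1, u) * sigma_k(q, k)
        for k in range(1, len(q) + 1)
    )
    return math.exp(-phi_u * total)
```

Setting one entry of `q` to zero gives the same value as dropping that entry altogether, which is the property described above: every term containing the vanishing size drops out of the elementary symmetric polynomials.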
To check the validity of both PA and PB, a comparison of them will be made
against numerical results for (1+1)- and (2+1)-dimensional Gaussian fields. For the
(2+1)-dimensional Gaussian field, the explicit expression for PB is

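$$P_B(u) = \exp\left\{ -\exp(-u^2/2) \left[ q_x + q_y + q_t + \sqrt{2\pi}\, u\, (q_x q_y + q_x q_t + q_y q_t) + 2\pi\, q_x q_y q_t\, (u^2 - 1) \right] \right\}. \tag{11.13}$$

Here $q_x = q_1 = L_1/\ell_1$, $q_y = q_2 = L_2/\ell_2$, $q_t = q_3 = L_3/\ell_3$, and $\ell_1 = 2\pi\sigma/\sqrt{\Lambda_{xx}}$, $\ell_2 = 2\pi\sigma/\sqrt{\Lambda_{yy}}$, $\ell_3 = 2\pi\sigma/\sqrt{\Lambda_{tt}} = T_z = 2\pi/\omega_z$, where

$$\begin{pmatrix} \Lambda_{xx} \\ \Lambda_{yy} \end{pmatrix} = \iint \begin{pmatrix} k_x^2 \\ k_y^2 \end{pmatrix} \Psi(k, \omega)\, d\omega\, dk, \tag{11.14}$$

$$\Lambda_{tt} = \omega_z^2 \sigma^2 = \iint \omega^2\, \Psi(k, \omega)\, d\omega\, dk. \tag{11.15}$$

The last equality in Eq. (11.15) defines the zero-upcrossing (circular) frequency $\omega_z$. A more general result corresponding to the special case of Eq. (11.13) with $q_t = 0$ was derived already by Ditlevsen (1971).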

It can be shown that the rhs of Eq. (11.3) has the following Gumbel distribution
as its max-stable asymptotic limit (Krogstad et al., 2004):

$$G(u\sigma) = \exp\left\{ -\exp\left[ -h_V (u - h_V) \right] \right\}, \tag{11.16}$$

where the parameter $h_V$ is obtained by solving the equation

$$V_d\, h_V^2\, e^{-h_V^2/2} = 1, \tag{11.17}$$

which has the approximate solution

$$h_V = \sqrt{2 \log V_d + 2 \log(2 \log V_d)}. \tag{11.18}$$
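To see how close the approximation (11.18) is to the root of Eq. (11.17), one may solve the latter numerically. The sketch below uses plain bisection on the logarithm of Eq. (11.17) and assumes $V_d$ is large enough (larger than about $e/2$) for the relevant root to lie above $\sqrt{2}$; the function names are illustrative.

```python
import math


def h_v_approx(v_d):
    """Approximate solution of Eq. (11.17), as given by Eq. (11.18)."""
    return math.sqrt(2.0 * math.log(v_d) + 2.0 * math.log(2.0 * math.log(v_d)))


def h_v_numeric(v_d, tol=1e-12):
    """Larger root of V_d h^2 exp(-h^2/2) = 1, found by bisection."""

    def g(h):
        # Logarithm of the left-hand side; decreasing for h > sqrt(2).
        return math.log(v_d) + 2.0 * math.log(h) - 0.5 * h * h

    lo, hi = math.sqrt(2.0), 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For moderate values of $V_d$ the two results differ at the level of a few hundredths, so the closed form is a convenient starting value rather than an exact root.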

11.3 A Simplified Approach

The first step is rewriting Eq. (11.1) as follows:

$$\hat{X} = \max_{0 \le t \le T} \left\{ \max_{x \in \mathcal{D}} X(x, t) \right\}. \tag{11.19}$$

A stochastic process consisting of spatial extremes is then introduced

$$\hat{X}_{\mathcal{D}}(t) = \max_{x \in \mathcal{D}} X(x, t). \tag{11.20}$$

It is now assumed that a finite set of points $\{x_i\}_{i=1}^{M}$ can be chosen so that

$$\hat{X}_{\mathcal{D}}(t) \approx \max_{1 \le i \le M} X(x_i, t), \tag{11.21}$$

for each $t \in (0, T)$, and that

$$\hat{X} \approx \max_{1 \le n \le N} \hat{X}_{\mathcal{D}}(t_n) \approx \max_{1 \le n \le N} \left\{ \max_{1 \le i \le M} X(x_i, t_n) \right\}, \tag{11.22}$$

for a suitable discretization of the time interval $(0, T)$, $0 \le t_1 < \ldots < t_N \le T$. For each $n$, $n = 1, \ldots, N$, denote by $x_{i_n}$ a point so that $\max_{1 \le i \le M} X(x_i, t_n) = X(x_{i_n}, t_n)$, and let $\tilde{X}_n = X(x_{i_n}, t_n)$. Then

$$\hat{X} \approx \max_{1 \le n \le N} \tilde{X}_n. \tag{11.23}$$

If the sampling times $t_n$ are sufficiently dense in $(0, T)$, then the time series $\tilde{X}_n$ can be considered as quasi-continuous in time and an accurate discrete representation of a smooth stochastic process $\tilde{X}(t)$ so that $\hat{X} \approx \max_{0 \le t \le T} \tilde{X}(t)$.
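The reduction in Eqs. (11.21)–(11.23) amounts to collapsing a sampled space–time field to a single time series of spatial maxima. A minimal sketch, assuming the field is stored as a NumPy array `field[i, n]` holding $X(x_i, t_n)$ (this storage layout and the function names are assumptions made for illustration):

```python
import numpy as np


def spatial_max_series(field):
    """Return the series X~_n = max_i X(x_i, t_n) from field[i, n]."""
    return np.max(np.asarray(field, dtype=float), axis=0)


def space_time_max(field):
    """Return the space-time extreme, i.e. the maximum over n of X~_n."""
    return float(np.max(spatial_max_series(field)))
```

The series returned by `spatial_max_series` is the object whose upcrossing rate is estimated in the remainder of the section.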
The derivations in this section apply also to the case of a nonhomogeneous field, and the extreme value distribution $F_{\hat{X}}(\xi)$ is then with good approximation given by the equation

$$F_{\hat{X}}(\xi) \approx \exp\left\{ -\int_0^T \nu^+_{\tilde{X}}(\xi; t)\, dt \right\}, \tag{11.24}$$

where $\nu^+_{\tilde{X}}(\xi; t)$ denotes the average upcrossing rate of the level $\xi$ by $\tilde{X}(t)$ at time $t$. This approximation is contingent on the assumption that the probability of initial exceedance can be neglected. For estimation purposes, Eq. (11.24) is rewritten as

$$F_{\hat{X}}(\xi) \approx \exp\left\{ -\bar{\nu}^+_{\tilde{X}}(\xi)\, T \right\}, \tag{11.25}$$

where the time average $\bar{\nu}^+_{\tilde{X}}(\xi)$ is given as

$$\bar{\nu}^+_{\tilde{X}}(\xi) = \frac{1}{T} \int_0^T \nu^+_{\tilde{X}}(\xi; t)\, dt, \tag{11.26}$$

which is in a form suitable for empirical estimation from time series. However,
it should be clear that in the case of a nonhomogeneous field, the extreme value
predictions can only refer to the initial space–time domain .D × T , if no additional
modelling or structure is introduced.
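Equation (11.25) links a time-averaged upcrossing rate to a non-exceedance probability over the duration $T$, and it can be inverted to find the rate that corresponds to a target quantile. A small sketch with illustrative function names:

```python
import math


def extreme_cdf(nu_bar, duration):
    """Non-exceedance probability exp(-nu_bar * T) of Eq. (11.25)."""
    return math.exp(-nu_bar * duration)


def target_rate(prob, duration):
    """Upcrossing rate whose level has non-exceedance probability `prob`."""
    return -math.log(prob) / duration
```

For a 3-hour duration ($T = 10800$ s), a rate of $10^{-6}$ per second corresponds to a non-exceedance probability of about 0.989, which is the sense in which the 99% fractile is associated with $\nu^+ = 10^{-6}$ later in the chapter.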
The comparison between Eqs. (11.3), (11.12), and (11.25) reveals that $-\log(P_B(\xi/\sigma))$ is not a multiplicative function with respect to $T$ (or, equivalently, $q_t$) in contrast to $-\log(P_A(\xi/\sigma))$ and $-\log(F_{\hat{X}}(\xi)) \approx \bar{\nu}^+_{\tilde{X}}(\xi)\, T$. More precisely,

$$P_A(\xi/\sigma) = \exp\{ -\nu_A^+(\xi)\, T \}, \tag{11.27}$$

$$P_B(\xi/\sigma) = \exp\{ -\nu_0^+(\xi) - \nu_B^+(\xi)\, T \}, \tag{11.28}$$

where, for the three-dimensional case,

$$\begin{aligned} \nu_A^+(u\sigma) &= 2\pi\, q_x q_y\, (u^2 - 1) \exp(-u^2/2)/T_z, \\ \nu_B^+(u\sigma) &= \nu_A^+(u\sigma) + \left[ 1 + \sqrt{2\pi}\, u\, (q_x + q_y) \right] \exp(-u^2/2)/T_z, \\ \nu_0^+(u\sigma) &= \left( q_x + q_y + \sqrt{2\pi}\, u\, q_x q_y \right) \exp(-u^2/2), \end{aligned} \tag{11.29}$$

as it immediately follows from Eq. (11.13) since $q_t = L_3/\ell_3 = T/T_z$. The term $\nu_0^+(\xi)$, the only one that survives in PB when $T \to 0$, gives the probability of an initial exceedance event on the "boundary" $\mathcal{D}$ of the $\mathcal{D} \times \mathcal{T}$ domain. Since the most interesting cases in practice would typically have domains where the time dimension is much greater than the spatial ones, which leads to $\max(q_x, q_y) \ll q_t$, $\nu_0^+(\xi)$ will be safely neglected for the numerical examples in this chapter. A similar approximation was also implemented in Eq. (11.24).
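The decomposition in Eqs. (11.28) and (11.29) can be verified against the explicit expression (11.13) by direct substitution of $q_t = T/T_z$. The following sketch does this numerically; the function names and the sample values in the usage note are illustrative only.

```python
import math


def pb_explicit(u, qx, qy, qt):
    """P_B(u) for the (2+1)-dimensional case, Eq. (11.13)."""
    bracket = (
        qx + qy + qt
        + math.sqrt(2.0 * math.pi) * u * (qx * qy + qx * qt + qy * qt)
        + 2.0 * math.pi * qx * qy * qt * (u * u - 1.0)
    )
    return math.exp(-math.exp(-0.5 * u * u) * bracket)


def rates(u, qx, qy, tz):
    """Return (nu_A, nu_B, nu_0) of Eq. (11.29) at the level u * sigma."""
    e = math.exp(-0.5 * u * u)
    nu_a = 2.0 * math.pi * qx * qy * (u * u - 1.0) * e / tz
    nu_b = nu_a + (1.0 + math.sqrt(2.0 * math.pi) * u * (qx + qy)) * e / tz
    nu_0 = (qx + qy + math.sqrt(2.0 * math.pi) * u * qx * qy) * e
    return nu_a, nu_b, nu_0
```

With, say, `u = 4.0`, `qx = 0.5`, `qy = 0.3`, `tz = 12.0`, and `T = 10800.0`, the value `exp(-nu_0 - nu_B * T)` agrees with `pb_explicit(u, qx, qy, T / tz)` to rounding error, and `nu_0` is small compared with `nu_B * T`.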
From Eqs. (11.3) and (11.16), it is seen that the mean upcrossing rate tail, say for $\xi \ge \xi_0$, behaves in a manner largely determined by a function of the form $\exp\{-a(\xi - b)^c\}$ ($\xi \ge \xi_0$), where $a$, $b$, and $c$ are suitable constants. It is therefore assumed that the mean upcrossing rate function of $\tilde{X}(t)$ can be represented as

$$\nu^+(\xi) \approx q(\xi) \exp\{-a(\xi - b)^c\}, \quad \xi \ge \xi_0, \tag{11.30}$$

for a suitable choice of $\xi_0$, where the function $q(\xi)$ is slowly varying compared with the exponential function $\exp\{-a(\xi - b)^c\}$ for tail values of $\xi$, cf. Naess and Gaidai (2008). Now, typically, the function $q(\xi)$ can be largely considered as a constant for tail values of $\xi$. This suggests an extrapolation strategy obtained by replacing $q(\xi)$ by a suitable constant value, $q$ say.
The adopted procedure for identifying appropriate values for the parameters $a, b, c, q$, assuming a constant $q$, follows closely the optimization method detailed in Chap. 5. It is based on minimizing the following mean square error function with respect to the four arguments

$$F(q, a, b, c) = \sum_{j=1}^{N} w_j \left| \log \hat{\nu}^+(\xi_j) - \log q + a(\xi_j - b)^c \right|^2, \tag{11.31}$$

where $\hat{\nu}^+(\xi_j)$ is the empirical estimate of the upcrossing rate at the level $\xi_j$ ($\xi_0 \le \xi_1 \le \ldots \le \xi_N$), and $w_j$ denotes a weight factor that puts more emphasis on the more reliable estimates of $\hat{\nu}^+(\xi_j)$. The choice of weight factor is to some extent arbitrary. Here, a weight factor will be used based on the confidence interval associated with the empirical estimate of the upcrossing rate $\hat{\nu}^+(\xi_j)$.
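One simple way to minimize Eq. (11.31) exploits the fact that, for fixed $b$ and $c$, the error function is a weighted linear least-squares problem in $\log q$ and $a$. The sketch below scans a user-supplied grid of $(b, c)$ values and solves the linear subproblem in closed form. This grid search is an assumption made here for illustration and is not necessarily the optimizer described in Chap. 5; the function name `fit_tail` and its arguments are hypothetical.

```python
import numpy as np


def fit_tail(xi, nu_hat, w, b_grid, c_grid):
    """Minimize Eq. (11.31) over (q, a, b, c); returns (F, q, a, b, c)."""
    xi = np.asarray(xi, dtype=float)
    y = np.log(np.asarray(nu_hat, dtype=float))
    w = np.asarray(w, dtype=float)
    sw = np.sqrt(w)
    best = None
    for b in b_grid:
        if np.any(xi <= b):
            continue  # (xi - b)^c requires xi > b
        for c in c_grid:
            z = (xi - b) ** c
            # Model: log nu = log q - a * z, linear in (log q, a).
            design = np.column_stack([np.ones_like(z), -z])
            coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
            err = float(np.sum(w * (y - design @ coef) ** 2))
            if best is None or err < best[0]:
                best = (err, float(np.exp(coef[0])), float(coef[1]), b, c)
    return best
```

A finer grid, or a local nonlinear optimizer started from the best grid point, would refine $b$ and $c$ further.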

11.4 Spatial–Temporal Extremes for Non-Gaussian Random Fields

In this section, the discussion is limited to random wave fields with two spatial dimensions. However, it should be quite obvious how to extend the methods discussed to higher dimensional spaces. Accordingly, let $X = X(x, t)$ denote a zero mean random field defined on $\mathbb{R}^2 \times \mathbb{R}$, and let

$$\hat{X} = \hat{X}(\mathcal{D} \times \mathcal{T}) = \max\{X(x, t);\; x \in \mathcal{D},\; t \in \mathcal{T}\}, \tag{11.32}$$

where $\mathcal{T}$ denotes a time interval, specifically, $\mathcal{T} = (0, T)$ for a given time $T$, and $\mathcal{D}$ is the area of interest.
Except for the case of homogeneous Gaussian fields and memoryless transfor-
mations of them, the asymptotic distribution function of .X̂ is unknown (Piterbarg,
1996; Adler and Taylor, 2007). In this section, the focus is on the case of second-
order homogeneous wave fields. Since they are non-Gaussian and cannot be
obtained by a simple transformation of a Gaussian field, the standard theory of
Gaussian fields does not apply. However, even if no simple formulas can be derived
for extremes of second-order wave fields, the simplified empirically based approach
described in the previous section may still be applied, cf. Naess and Batsevych
(2010).
The points $\{x_i\}_{i=1}^{M}$ of Eq. (11.21) for the case of a wave field are typically chosen as the nodal points of a rectangular grid with a mesh size determined by the smallest significant wavelength of the field. A mesh size equal to one-tenth of this smallest wavelength would be sufficient. Then, to ensure the approximation in Eq. (11.22), the time step $\Delta t = t_j - t_{j-1}$ of the (equidistant) discretization $0 = t_0 < t_1 < \ldots < t_N \le T$ of the time interval $(0, T)$ can in most practical cases be chosen as one-tenth of the spectral peak period. The robustness of the results with respect to discretization can be verified by refining the mesh size in space and time.
Although the main focus of this section is on second-order ocean waves, the applicability of the methods discussed is far more wide ranging. However, for more general cases the validity of the approximation in Eq. (11.22) may require a more careful consideration. For a homogeneous random field with a spectral density of bounded support, which can often be assumed in applications, the field is infinitely smooth and there will be uniform bounds on the rate of change of the field. Hence, a finite grid will always exist that secures the validity of the approximation in Eq. (11.22) within a given accuracy. For more general cases, assuming that the field has continuously differentiable realizations with uniformly bounded derivatives on bounded domains, the validity of Eq. (11.22) within a specified level of accuracy is again guaranteed for a suitable choice of the grid. Usually in practice, physical rather than mathematical arguments can be used to justify this approximation.
An alternative procedure is now introduced based on a detailed study of the extreme value statistics of Gaussian random fields described by Piterbarg (1996) and Adler and Taylor (2007). Consider a rectangular domain $\mathcal{D} = \mathcal{L}_x \times \mathcal{L}_y$, where $\mathcal{L}_x$ denotes an interval of length $L_x$, and similarly, $\mathcal{L}_y$ denotes an interval of length $L_y$. Neglecting the probability of initial exceedance, which is acceptable for long time intervals, it is then postulated that the distribution of $\hat{X}(\mathcal{D} \times \mathcal{T})$ for a homogeneous random field $X$ can be written as

$$F_{\hat{X}}(\xi) = \exp\left\{ -\left[ \nu_0^+(\xi) + \nu_x^+(\xi) L_x + \nu_y^+(\xi) L_y + \nu_{xy}^+(\xi) L_x L_y \right] T \right\}. \tag{11.33}$$

Here $\nu_0^+(\xi)$ denotes the one-point upcrossing rate of the level $\xi$. $\nu_x^+(\xi)$ ($\nu_y^+(\xi)$) represents corrections to the one-point upcrossing rate due to extensional effects in the $x$ ($y$) direction, while $\nu_{xy}^+(\xi)$ corrects for area effects. In particular, $\nu_0^+(\xi)$ is the mean upcrossing rate of the stochastic process $X(x, t)$ for an arbitrary, fixed point $x$. $\nu_x^+(\xi)$ is determined by introducing the stochastic process $\hat{X}_{\mathcal{L}_x}(t) = \max_{x \in \mathcal{L}_x \times \{y_0\}} X(x, t)$, for a suitably fixed value $y_0$, and writing the mean upcrossing rate of this process as $\nu_0^+(\xi) + \nu_x^+(\xi) L_x$. This defines the mean normalized upcrossing rate $\nu_x^+(\xi)$. Similarly, $\nu_y^+(\xi)$ is determined by introducing the stochastic process $\hat{X}_{\mathcal{L}_y}(t) = \max_{x \in \{x_0\} \times \mathcal{L}_y} X(x, t)$, for a suitably fixed value $x_0$, and writing the mean upcrossing rate of this process as $\nu_0^+(\xi) + \nu_y^+(\xi) L_y$. This, then, defines the mean normalized upcrossing rate $\nu_y^+(\xi)$. Finally, $\nu_{xy}^+(\xi)$ is determined by writing the mean upcrossing rate of the process $\hat{X}_{\mathcal{D}}(t) = \hat{X}_{\mathcal{L}_x \times \mathcal{L}_y}(t) = \max_{x \in \mathcal{L}_x \times \mathcal{L}_y} X(x, t)$ as $\nu_0^+(\xi) + \nu_x^+(\xi) L_x + \nu_y^+(\xi) L_y + \nu_{xy}^+(\xi) L_x L_y$. Combining this with the previous relations provides a way to obtain $\nu_{xy}^+(\xi)$. These functions can be estimated in a manner similar to the procedure to be described next. The advantage of the representation provided by Eq. (11.33) is that the calibration of the right hand side needs to be done for only one rectangular domain. After that, the formula can then be used for any other rectangular domain (with the same orientation).
While the procedure for estimating the upcrossing rate functions described above is one way to achieve this, a somewhat different approach is used here. Instead of describing the procedure in full generality, it will be detailed as applied to the specific example in the section on numerical examples. The edges of the chosen rectangular domain for the calibration, which was 300 $\times$ 300 m, were divided into subintervals of length 25 m. An aggregate of rectangular subdomains $\mathcal{D}_{mn}$, $m, n = 0, 1, \ldots, 12$, where $\mathcal{D}_{mn} = (0, 25 \cdot m) \times (0, 25 \cdot n)$, is then created. Note that this includes the degenerate domains $(0, 0)$, $(0, 25 \cdot m) \times (0, 0)$, and $(0, 0) \times (0, 25 \cdot n)$. A linear regression approach based on the upcrossing rate of the area extremes process for each of the resulting rectangular domains $\mathcal{D}_{mn}$ obtained for $m, n = 0, 1, \ldots, 12$ was used to estimate $\nu_0^+(\xi_j), \nu_x^+(\xi_j), \nu_y^+(\xi_j), \nu_{xy}^+(\xi_j)$ for each value of $\xi_j$, $j = 1, \ldots, J$, which denotes a preassigned range and number of $\xi$-levels leading to meaningful estimates of various upcrossing rates from the data.
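At a fixed level $\xi_j$, the regression described above can be sketched as an ordinary least-squares fit of the subdomain upcrossing rates on the regressors $1$, $L_x$, $L_y$, and $L_x L_y$ of Eq. (11.33). The function name `fit_rate_components` and the array layout are assumptions made for illustration; the text does not specify the regression details.

```python
import numpy as np


def fit_rate_components(lx, ly, rates):
    """Least-squares estimates (nu_0, nu_x, nu_y, nu_xy) at one xi-level.

    lx, ly: edge lengths of each subdomain D_mn (same length as rates).
    rates:  estimated upcrossing rate of the area-extremes process on D_mn.
    """
    lx = np.asarray(lx, dtype=float)
    ly = np.asarray(ly, dtype=float)
    rates = np.asarray(rates, dtype=float)
    design = np.column_stack([np.ones_like(lx), lx, ly, lx * ly])
    coef, *_ = np.linalg.lstsq(design, rates, rcond=None)
    return coef
```

For the 13 $\times$ 13 grid of subdomains in the example, `lx` and `ly` would hold the values $25m$ and $25n$ for $m, n = 0, \ldots, 12$, and the fit is repeated for every level $\xi_j$.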

In the following, $\nu^+(\xi)$ will be used as a generic notation denoting either the mean or the time averaged upcrossing rate, as the case may be. A key issue for prediction of extreme values is the estimation of $\nu^+(\xi)$. Also for the case considered in this section, it is assumed that the mean upcrossing rate tail, say for $\xi \ge \xi_0$, behaves in a manner largely determined by a function of the form $\exp\{-a(\xi - b)^c\}$ ($\xi \ge \xi_0$), where $a$, $b$, and $c$ are suitable constants. Consequently, it is assumed that the mean upcrossing rate function of $\tilde{X}(t)$ can be represented as

$$\nu^+(\xi) \approx q(\xi) \exp\{-a(\xi - b)^c\}, \quad \xi \ge \xi_0, \tag{11.34}$$

where, as in Eq. (11.30), the function $q(\xi)$ is slowly varying compared with the exponential function for tail values of $\xi$. This points to an extrapolation strategy based on replacing $q(\xi)$ by a suitable constant value, $q$ say.
The adopted procedure for identifying appropriate values for the parameters $a, b, c, q$, assuming a constant $q$, is largely identical to that of the previous section.

11.5 Empirical Estimation of the Mean Upcrossing Rate

In the previous section it was shown that the key to providing estimates of the extreme values of the response process $X(t)$ on the basis of simulated response time histories is the estimation of the mean upcrossing rate. By assuming the requisite ergodic properties of the response process for a short-term condition, the mean upcrossing rate is conveniently estimated from the ergodic mean value. That is, it may be assumed that

$$\nu^+(\xi) = \lim_{t \to \infty} \frac{1}{t}\, n^+(\xi; 0, t), \tag{11.35}$$

where $n^+(\xi; 0, t)$ denotes a realization of $N^+(\xi; 0, t)$, that is, $n^+(\xi; 0, t)$ denotes the counted number of upcrossings during time $t$ from a particular simulated time history for which the starting point $t = 0$ is suitably chosen. In practice, $k$ time histories of a specified length, $T_0$ say, are simulated. The appropriate ergodic mean value estimate of $\nu^+(\xi)$ is then

$$\hat{\nu}^+(\xi) = \frac{1}{k\, T_0} \sum_{j=1}^{k} n_j^+(\xi; 0, T_0), \tag{11.36}$$

where $n_j^+(\xi; 0, T_0)$ denotes the counted number of upcrossings of the level $\xi$ by time history no. $j$. This will be the approach to the estimation of the mean upcrossing rate adopted in this chapter.
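The estimate in Eq. (11.36) only requires counting upcrossings in each sampled time history. A minimal sketch, assuming equidistantly sampled histories of equal duration `t0` and defining an upcrossing as a sample below the level followed by a sample at or above it (this discrete definition and the function names are illustrative assumptions):

```python
import numpy as np


def count_upcrossings(series, level):
    """Number of upcrossings of `level` by a sampled time history."""
    x = np.asarray(series, dtype=float)
    return int(np.sum((x[:-1] < level) & (x[1:] >= level)))


def mean_upcrossing_rate(histories, level, t0):
    """Ergodic mean estimate of Eq. (11.36) from k histories of length t0."""
    counts = [count_upcrossings(h, level) for h in histories]
    return sum(counts) / (len(histories) * t0)
```

The sampling must be dense enough that no upcrossing is missed between two consecutive samples.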
For a suitable number $k$, e.g., $k \ge 20$, and provided that $T_0$ is sufficiently large, a fair approximation of the 95% confidence interval for the value $\nu^+(\xi)$ can be obtained as $\mathrm{CI}_{0.95}(\xi) = \left( C^-(\xi), C^+(\xi) \right)$, where

$$C^{\pm}(\xi) = \hat{\nu}^+(\xi) \pm 1.96\, \frac{\hat{s}(\xi)}{\sqrt{k}}, \tag{11.37}$$

and the empirical standard deviation $\hat{s}(\xi)$ is given as

$$\hat{s}(\xi)^2 = \frac{1}{k - 1} \sum_{j=1}^{k} \left( \frac{n_j^+(\xi; 0, T_0)}{T_0} - \hat{\nu}^+(\xi) \right)^2. \tag{11.38}$$
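Equations (11.36)–(11.38) can be sketched in a few lines once the upcrossing counts $n_j^+(\xi; 0, T_0)$ are available. The function name `upcrossing_rate_ci` is an illustrative choice.

```python
import math

import numpy as np


def upcrossing_rate_ci(counts, t0):
    """Return (nu_hat, C_minus, C_plus) from k upcrossing counts over length t0."""
    rates = np.asarray(counts, dtype=float) / t0
    k = rates.size
    nu_hat = float(rates.mean())          # Eq. (11.36)
    s_hat = float(rates.std(ddof=1))      # Eq. (11.38), divisor k - 1
    half_width = 1.96 * s_hat / math.sqrt(k)
    return nu_hat, nu_hat - half_width, nu_hat + half_width  # Eq. (11.37)
```

The interval relies on a normal approximation, which is why a reasonably large $k$ is recommended above.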

Note that $k$ and $T_0$ may not necessarily be the number and length of the actually simulated response time series. Rather, they may be chosen to optimize the estimate of Eq. (11.38). If initially $\tilde{k}$ time series of length $\tilde{T}$ are simulated, then $k = \tilde{k} k_0$ and $\tilde{T} = k_0 T_0$. That is, each initial time series of length $\tilde{T}$ has been divided into $k_0$ time series of length $T_0$, assuming, of course, that $\tilde{T}$ is large enough to allow for this in an acceptable way. The consistency of the estimates obtained by Eq. (11.38) can be checked for large values of $\xi$ by the observation that $\mathrm{Var}[N^+(\xi; 0, t)] = \nu^+(\xi)\, t$ since $N^+(\xi; 0, t)$ is then a Poisson random variable by assumption. This leads to the equation

$$\hat{s}(\xi)^2 = \frac{1}{k} \sum_{j=1}^{k} \mathrm{Var}\left[ \frac{N_j^+(\xi; 0, T_0)}{T_0} \right] = \frac{\nu^+(\xi)}{T_0}, \tag{11.39}$$

where $\{N_1^+(\xi; 0, T_0), \ldots, N_k^+(\xi; 0, T_0)\}$ denotes a random sample with a possible outcome $\{n_1^+(\xi; 0, T_0), \ldots, n_k^+(\xi; 0, T_0)\}$. Hence, $\hat{s}(\xi)^2/k \approx \nu^+(\xi)/(k T_0)$. Since this last relation is consistent with the adopted assumptions, it could have been used as the empirical estimate of the sample variance in the first place. It is also insensitive to the blocking of data discussed above since $k T_0 = \tilde{k} \tilde{T}$. However, the advantage of Eq. (11.38) is that it applies whatever the value of $\xi$, and it does not rely on any specific assumptions about the statistical distributions involved.

11.6 Numerical Examples for Gaussian Random Fields

Since a major motivation for developing the approach discussed in this chapter is application to random ocean wave fields, the numerical examples presented will be for zero mean, homogeneous Gaussian fields specified by spectral densities defined in terms of a so-called JONSWAP spectrum $S(\omega)$, which is a one-sided spectrum, and a directional spreading function $D(\theta)$ (Sarpkaya and Isaacson, 1981):

$$S(\omega) = \frac{\alpha g^2}{\omega^5} \exp\left\{ -\frac{5}{4} \left( \frac{\omega_p}{\omega} \right)^4 + \ln\gamma\, \exp\left[ -\frac{1}{2\sigma^2} \left( \frac{\omega}{\omega_p} - 1 \right)^2 \right] \right\}, \quad \omega > 0, \tag{11.40}$$

where $g = 9.81$ m s$^{-2}$, $\omega_p$ denotes the peak frequency in rad/s, and $\alpha$, $\gamma$, and $\sigma$ are parameters related to the spectral shape. $\sigma = 0.07$ when $\omega \le \omega_p$, and $\sigma = 0.09$ when $\omega > \omega_p$. The parameter $\gamma$ is chosen to be equal to 3.0. The parameter $\alpha$ is determined from the following empirical relationship (Naess et al., 2007):

$$\alpha = 5.06 \left( \frac{H_s}{T_p^2} \right)^2 \left( 1 - 0.287 \ln\gamma \right), \tag{11.41}$$

$H_s$ = significant wave height and $T_p = 2\pi/\omega_p$ = spectral peak wave period. $H_s = 14.0$ m and $T_p = 16$ s are chosen for all the following examples. The directional spreading function is expressed as

$$D(\theta) = \frac{2^{2s-1}\, \Gamma^2(s + 1)}{\pi\, \Gamma(2s + 1)} \cos^{2s}\left( \frac{\theta - \theta_0}{2} \right), \tag{11.42}$$

where $\theta_0$ is the main wave direction, chosen so that $\theta_0 = 0$, and then $-\pi < \theta \le \pi$. The choice of spreading parameter is $s = 8$, which is a typical value often used.
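Equations (11.40)–(11.42) translate directly into code. The sketch below evaluates the JONSWAP spectrum and the spreading function with the parameter values quoted above as defaults; the function names are illustrative, and the frequency argument is assumed to be strictly positive.

```python
import math

import numpy as np

G = 9.81  # acceleration of gravity, m/s^2


def jonswap_alpha(hs, tp, gamma=3.0):
    """Empirical alpha of Eq. (11.41)."""
    return 5.06 * (hs / tp**2) ** 2 * (1.0 - 0.287 * math.log(gamma))


def jonswap(omega, hs=14.0, tp=16.0, gamma=3.0):
    """One-sided JONSWAP spectrum S(omega) of Eq. (11.40), omega > 0 in rad/s."""
    omega = np.asarray(omega, dtype=float)
    wp = 2.0 * math.pi / tp
    sigma = np.where(omega <= wp, 0.07, 0.09)
    peak = np.exp(-0.5 * ((omega / wp - 1.0) / sigma) ** 2)
    shape = np.exp(-1.25 * (wp / omega) ** 4 + math.log(gamma) * peak)
    return jonswap_alpha(hs, tp, gamma) * G**2 / omega**5 * shape


def spreading(theta, s=8, theta0=0.0):
    """Directional spreading function D(theta) of Eq. (11.42)."""
    norm = 2.0 ** (2 * s - 1) * math.gamma(s + 1) ** 2 / (math.pi * math.gamma(2 * s + 1))
    return norm * np.cos(0.5 * (np.asarray(theta, dtype=float) - theta0)) ** (2 * s)
```

The normalization constant in Eq. (11.42) makes $D(\theta)$ integrate to one over $(-\pi, \pi]$, which is a convenient sanity check on any implementation.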
The extreme value distributions PA and PB may be considered as having been derived under the Poisson assumption, i.e., when independence of individual upcrossings or exceedances is assumed, supplemented by Rice-like formulas for the exceedance rates $\nu_A^+(\xi)$ and $\nu_B^+(\xi)$. Thus PA and PB are liable to work in case of a wideband spectrum $\Psi(k, \omega)$, as they do in the one-dimensional case of a random process. It is therefore to be expected that for the narrow-band case, there will be some discrepancy between the exact values and the predictions provided by PA and PB. Also the simplified approach proposed in this chapter is based on the use of the mean upcrossing rate function. However, this approach can easily be amended to cope with statistical dependence between the peak values in the extracted time series by using the ACER method, cf. Chap. 5.
Another aspect that comes into question in the case of a random ocean wave field is the effective dimension of the field. Due to the dispersion relation, which implies a strong coupling between frequency and wave number, the time and space dimensions of the wave field will likewise be strongly coupled, indicating a degeneracy. In fact, the dispersion relation implies that all the mass of the spectral density function $\Psi(k, \omega)$ will be localized on a surface in $(\omega, k)$-space.
Numerical results obtained for three examples will now be presented for $d \le 2$. The simulations of the example random fields were carried out on a standard desktop computer, and the required CPU time was of the order of 1 hour for each case. For each of the examples, predictions of the 99.9% quantile of a 3-hour extreme value distribution by the proposed method (marked by an asterisk in the relevant figures) and the analytical method are also provided.

11.6.1 1+1-Dimensional Gaussian Field

In the first example, a wideband 1+1-dimensional Gaussian field is considered. It has a spectral density given as

$$\Psi(\omega, k) = S(|\omega|)\, S(|k|), \tag{11.43}$$

where $S(\cdot)$ is defined by Eq. (11.40). This is clearly an artificial example, but it serves the purpose of securing no interaction between the time and space dimensions of the random field. It is therefore to be expected that the theoretical formulas with $d = 1$ should apply. A plot of the spectral density is shown in Fig. 11.1, while part of a realization of the associated Gaussian random field is given in Fig. 11.2. Plots of the empirical and analytical results are presented in Figs. 11.3 and 11.4 for one single point ($L_1 = 0$), which corresponds to an ordinary Gaussian temporal process, and for an interval ($L_1 = 100$ m), respectively. It is seen that the empirical and analytical results are in very good agreement in the tail for both cases and that the predictions of the 99.9% quantile in the 3-hour extreme value distributions are in good agreement.

Fig. 11.1 The spectral density $\Psi(\omega, k)$ of the 1+1-dimensional Gaussian field

Fig. 11.2 Part of a realization of the 1+1-dimensional Gaussian random field (axes: $x$ [m], $t$ [s], $X(x, t)$)

11.6.2 1+1-Dimensional Gaussian Sea

The second example is a 1+1-dimensional Gaussian sea, which corresponds to the limiting case of long-crested waves. In this case the power spectral density assumes the form

$$\Psi(\omega, k) = \frac{1}{4}\, S(|\omega|)\, \delta\left( k - \frac{\omega|\omega|}{g} \right), \tag{11.44}$$

where the dispersion relation for deep water waves is implemented in the form of a delta function. The resulting degeneracy would expectedly have some influence on the extreme value distribution. In the present case, the wave field can be represented as

$$X(t, x) = \mathrm{Re} \sum_{j=0}^{N} \sqrt{S(\omega_j)\, \Delta\omega} \cdot C_j\, e^{\, i\omega_j t - i \frac{\omega_j^2}{g} x}, \tag{11.45}$$

where $C_j$ are complex $N(0, 1)$-distributed random variables, that is, $C_j = R_j + iS_j$ with $R_j$ and $S_j$ two independent $N(0, 1/2)$-distributed variables.
Part of a realization of the Gaussian random wave field generated by Eq. (11.45) is plotted in Fig. 11.5. It is clearly seen how the strong coupling between frequency and wave number manifests itself.

Fig. 11.3 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for a point $L_1 = 0$ for Example 1, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.22 (analytical); 5.06 and 95% CI = (4.93, 5.18) (proposed method)

Fig. 11.4 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for an interval $L_1 = 100$ m for Example 1, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 6.09 (analytical); 6.11 and 95% CI = (6.03, 6.18) (proposed method)

Fig. 11.5 Part of a realization of the long-crested random wave field (axes: $x$ [m], $t$ [s], $X(x, t)$)

Fig. 11.6 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for a point $L_1 = 0$ for Example 2, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.22 (analytical); 5.27 and 95% CI = (5.14, 5.41) (proposed method)

Plots of the empirical and analytical results are presented in Figs. 11.6 and 11.7 for a point ($L_1 = 0$), which again corresponds to an ordinary Gaussian temporal process, and for an interval ($L_1 = 100$ m). It is seen that the empirical and analytical results are still in complete agreement in the tail for the one-point case, while there is a significant discrepancy between the results for $L_1 = 100$ m. In fact, the results for $L_1 = 100$ m are only slightly higher than the corresponding results for the one-point case, illustrating the effect of the degeneracy on the extreme values. It is interesting to observe that if the normalization constant $\ell_1$ is calibrated to the empirical results, very accurate predictions are obtained, and they remain accurate for other domains without recalibration. This will be illustrated in the next example.

Fig. 11.7 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for an interval $L_1 = 100$ m for Example 2, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.70 (analytical); 5.47 and 95% CI = (5.36, 5.58) (proposed method)

11.6.3 A Short-Crested Gaussian Sea

In this example a model of a short-crested Gaussian random seaway is simulated. It corresponds to a case where the main direction of propagation of the waves is the $x$-direction. Having assumed deep water, the relation between the wave number vector and the spreading angle will be $k = (k_x, k_y) = (\omega^2/g)\,(\cos\theta, \sin\theta)$. The spectral density of the random field $X$ can then be expressed as

$$\Psi(k, \omega) = \int S(\omega)\, D(\theta)\, \delta\left( k_x - \frac{\omega^2}{g} \cos\theta \right) \delta\left( k_y - \frac{\omega^2}{g} \sin\theta \right) d\theta. \tag{11.46}$$

This leads to the expressions

$$\begin{aligned}
\Lambda_{xx} &= \iint k_x^2\, \Psi(k, \omega)\, d\omega\, dk = D_c\, \omega_c^4\, \sigma^2/g^2, \\
D_c &= \int D(\theta) \cos^2(\theta)\, d\theta, \\
\omega_c^4 &= \frac{1}{\sigma^2} \int \omega^4 S(\omega)\, d\omega = (2\pi/t_c)^4, \\
\Lambda_{yy} &= \iint k_y^2\, \Psi(k, \omega)\, d\omega\, dk = (1 - D_c)\, \omega_c^4\, \sigma^2/g^2, \\
\Lambda_{tt} &= \iint \omega^2\, \Psi(k, \omega)\, d\omega\, dk = \omega_z^2 \sigma^2, \\
\ell_1 &= 2\pi\sigma/\sqrt{\Lambda_{xx}} = \frac{2\pi g}{\sqrt{D_c}\, \omega_c^2} = \frac{g\, t_c^2}{2\pi \sqrt{D_c}}, \\
\ell_2 &= 2\pi\sigma/\sqrt{\Lambda_{yy}} = \frac{2\pi g}{\sqrt{1 - D_c}\, \omega_c^2} = \frac{g\, t_c^2}{2\pi \sqrt{1 - D_c}}, \\
\ell_3 &= 2\pi\sigma/\sqrt{\Lambda_{tt}} = 2\pi/\omega_z = T_z, \\
q_x &= L_1/\ell_1, \quad q_y = L_2/\ell_2, \quad q_t = L_3/\ell_3 = T/T_z,
\end{aligned} \tag{11.47}$$

where $(L_1, L_2, T)$ is the size of the rectangular domain of interest.

Calculating the values of the parameters using the formulas above gives the values $\ell_1 = 135$ m and $\ell_2 = 280$ m. Plotting the empirical against the analytical results obtained for the calculated parameter values for $L_1 = L_2 = 0$, $L_1 \times L_2 = 100\ \mathrm{m} \times 0$, $L_1 \times L_2 = 0 \times 100\ \mathrm{m}$, and $L_1 \times L_2 = 100\ \mathrm{m} \times 100\ \mathrm{m}$ leads to Figs. 11.8, 11.9, 11.10, and 11.11.

Fig. 11.8 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for $L_1 = L_2 = 0$ for Example 3, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.22 (analytical); 5.13 and 95% CI = (5.01, 5.25) (proposed method)

Fig. 11.9 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (original: thick dashed line; tuned: thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for the interval $L_1 \times L_2 = 100\ \mathrm{m} \times 0$ for Example 3, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.47 (tuned analytical); 5.33 and 95% CI = (5.26, 5.40) (proposed method)

Fig. 11.10 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (original: thick dashed line; tuned: thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for the interval $L_1 \times L_2 = 0 \times 100\ \mathrm{m}$ for Example 3, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.47 (tuned analytical); 5.31 and 95% CI = (5.18, 5.41) (proposed method)

Fig. 11.11 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_B^+(\xi))$ (original: thick dashed line; tuned: thick solid line), optimal fitted curve ($- \cdot -$) with both empirical (dotted lines) and fitted (dashed lines) 95% confidence band for the square domain $L_1 \times L_2 = 100 \times 100$ m for Example 3, plotted against $\xi/\sigma$. Predictions of 99.9% quantile: 5.72 (tuned analytical); 5.74 and 95% CI = (5.56, 5.89) (proposed method)

It is seen from these figures that there is also in this case a significant effect of dimensional degeneracy. Since the main wave direction is along the $x$-axis, the strongest effect of dimensional degeneracy is to be expected for the cases $L_1 \times L_2 = 100\ \mathrm{m} \times 0$ and $L_1 \times L_2 = 100 \times 100$ m, which is also corroborated by the plots. As mentioned in the previous example, by tuning the parameters $\ell_1$ and $\ell_2$ to the empirical results, the tuned theoretical predictions become very accurate. The tuning can be done for each dimension separately. That is, $\ell_1$ is calibrated to the results for the case $L_1 \times L_2 = 100\ \mathrm{m} \times 0$, while $\ell_2$ is calibrated to the results for the case $L_1 \times L_2 = 0 \times 100\ \mathrm{m}$. It has been verified that the calibrated values apply also to other domain sizes. For the particular example at hand, it was found that the tuned parameters are $\ell_1 = 474$ m, $\ell_2 = 477$ m. The results obtained by using the tuned parameters have also been plotted in Figs. 11.8, 11.9, 11.10, and 11.11 together with the predictions of the 99.9% quantiles. It is seen that fairly good agreement is now achieved for all cases.
Fig. 11.12 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_A^+(\xi))$ (original: thick dash dot line; tuned: thick solid line) results for the square domain $L_1 \times L_2 = 100 \times 100$ m for Example 3, plotted against $\xi/\sigma$. Empirical 95% confidence band (thin dashed lines)

Fig. 11.13 Plot of the empirical $\log_{10}(\nu^+(\xi))$ ($\bullet\bullet\bullet$), analytical $\log_{10}(\nu_A^+(\xi))$ (original: thick dash dot line; tuned: thick solid line) results for the square domain $L_1 \times L_2 = 300 \times 300$ m for Example 3, plotted against $\xi/\sigma$. Empirical 95% confidence band (thin dashed lines)

It has been mentioned already that the tuned version of PB seems to give
accurate predictions for practically any size. This has been verified for a range
of domain sizes providing substantial support for this assertion. It is also of
interest to investigate to what extent PA can give accurate predictions. If tuned
to each particular size, it could perhaps provide reasonable predictions, but that
is impractical. Let us therefore illustrate how it performs using the same tuned
parameters as previously established, that is, .01 = 474 m, .02 = 477 m. The results
for the two square domains of size .100 × 100 m and .300 × 300 m have been plotted
in Figs. 11.12 and 11.13. It is seen that there is a significant discrepancy between
the tuned PA results and the empirical results for the smaller domain, while there is
good agreement for the large one, giving a good indication of the importance of the
neglected terms in PA. If the domain is too small, the leading term is not sufficient
to obtain accurate estimates.
The value of the 99% fractile of a 3-hour extreme value distribution, which
corresponds to ν^+ = 10^-6, as a function of the size of a square domain has
been plotted in Fig. 11.14. It is seen that the extreme values show a fairly strong

Fig. 11.14 The 99% fractile value of a 3-hour extreme value distribution as a function of the size of a square domain with x = L1 = L2 for the short-crested Gaussian sea

dependence on the area of the domain. For prediction of extremes over, e.g., the
deck area of an offshore structure, which would typically be about 100 × 100 m, it
is seen that the area effect amounts to 10–15% larger extremes than predicted by a
one-point estimate.

11.7 Numerical Examples for Non-Gaussian Random Fields

11.7.1 A Second-Order Wave Field

Since a major motivation for developing the approach discussed in this chapter is
application to random ocean wave fields, the numerical example presented will be
for a zero mean, homogeneous second-order short-crested random wave field. It has
been shown, cf., e.g., Toffoli et al. (2008), that such a wave field can be built up in
the following way. First, let .η(t, r, θ ) denote the wave elevation of a long-crested sea
propagating in the direction specified by the angle .θ , where .r = x cos θ + y sin θ .
Then .η(t, r, θ ) = η1 (t, r, θ ) + η2 (t, r, θ ). Here the linear, first-order part .η1 (t, r, θ )
is given by the relation

\[ \eta_1(t, r, \theta) = \sum_{k=1}^{N} |C_k| \cos(\chi_k), \tag{11.48} \]

where C_k = |C_k| e^{iε_k}, k = 1, . . . , N, is a set of independent complex random
variables, whose real and imaginary parts are independent and normally distributed
as N(0, S(ω_k)Δω). Here χ_k = r_k − ω_k t + ε_k, where r_k = ω_k^2 r/g. S(ω) is the JONSWAP

spectrum, which was introduced in the previous section. The nonlinear, second-order
part η_2(t, r, θ) has the following representation:

\[ \eta_2(t, r, \theta) = \sum_{k,l=1}^{N} \Big[ m_{kl} |C_k C_l| \cos(\chi_k) \cos(\chi_l) - M_{kl} |C_k C_l| \sin(\chi_k) \sin(\chi_l) \Big], \tag{11.49} \]

where m_kl = (2g)^-1 min(ω_k^2, ω_l^2) and M_kl = (2g)^-1 max(ω_k^2, ω_l^2).
The short-crested wave field .X(x, y, t) is then obtained by superposition of m
long-crested wave fields as follows:

\[ X(x, y, t) = \sum_{k=1}^{m} D(\theta_k) \Delta\theta \, \eta(t, x \cos\theta_k + y \sin\theta_k, \theta_k), \tag{11.50} \]

where D(θ) denotes the directional spreading function.

In the numerical example, a specific model of a short-crested, second-order
random seaway is simulated. It corresponds to a case where the main direction of
propagation of the waves is the x-direction. Having assumed deep water, the relation
between the wave number vector and the spreading angle will be
k = (k_x, k_y) = (ω^2/g)(cos θ, sin θ). The spectral density in terms of wave numbers
and frequency of the first-order part of the random field X can then be expressed as
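\[ \Psi(\mathbf{k}, \omega) = S(\omega) \int_{-\pi}^{\pi} D(\theta) \, \delta\Big(k_x - \frac{\omega^2}{g} \cos\theta\Big) \, \delta\Big(k_y - \frac{\omega^2}{g} \sin\theta\Big) \, d\theta . \tag{11.51} \]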

Part of a realization of the short-crested, second-order random wave field is

shown in Fig. 11.15. The standard deviation of the wave field was estimated to

Fig. 11.15 Part of a realization of the short-crested, second-order random wave field

Fig. 11.16 The one-point upcrossing rate ν_0^+(ξ), cf. Eq. (11.33)

Fig. 11.17 The upcrossing rate ν_x^+(ξ), cf. Eq. (11.33)

be σ = 3.53 m. From the simulated wave field, the four upcrossing rate functions
of Eq. (11.33) have been estimated by the linear regression approach described in
Sect. 11.4 on space–time extremes. The results are shown in Figs. 11.16, 11.17,
11.18, and 11.19 together with the optimally fitted curves obtained by the point
process procedure.
The prediction results obtained by the simplified empirical approach in combination
with the point process procedure have been plotted in Figs. 11.20, 11.21,
11.22, and 11.23. Similarly, the results obtained by the parametric approach of
Eq. (11.33) are also plotted. In the figures the predicted 99.9% fractile value of
the 3-hour extreme value distribution obtained by the simplified approach has been
indicated by an asterisk. This fractile value is achieved at ν^+ = 10^-7. It is seen that
the agreement between the two approaches proposed in this chapter is very good for
the cases studied.
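
The correspondence between a fractile of the 3-hour extreme value distribution and a level of the upcrossing rate can be checked numerically. The sketch below assumes the Poisson-type approximation P(M(T) ≤ ξ) = exp(−ν^+(ξ) T), with ν^+ interpreted as a rate per second and T = 3 hours = 10 800 s. The function names and the interpretation of the units are assumptions made for illustration only.

```python
import math

T_3H = 3 * 3600.0  # duration of a 3-hour sea state in seconds


def fractile_from_rate(nu_plus, duration=T_3H):
    """Non-exceedance probability exp(-nu_plus * duration) (Poisson assumption)."""
    return math.exp(-nu_plus * duration)


def rate_from_fractile(prob, duration=T_3H):
    """Upcrossing rate that gives the non-exceedance probability prob."""
    return -math.log(prob) / duration


p_1e6 = fractile_from_rate(1e-6)    # close to 0.99
p_1e7 = fractile_from_rate(1e-7)    # close to 0.999
nu_999 = rate_from_fractile(0.999)  # slightly below 1e-7
```

Under these assumptions the rates 10^-6 and 10^-7 correspond closely, though not exactly, to the 99% and 99.9% fractiles, respectively.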
The value of the 99.9% fractile of a 3-hour extreme value distribution as a
function of the size of a square domain has been plotted in Fig. 11.24. It is seen
that the extreme values show a fairly strong dependence on the area of the domain.
For prediction of extremes over, e.g., the deck area of an offshore structure, which

Fig. 11.18 The upcrossing rate ν_y^+(ξ), cf. Eq. (11.33)

Fig. 11.19 The upcrossing rate ν_xy^+(ξ), cf. Eq. (11.33)

Fig. 11.20 Case 1: Single point. Plot of the empirical upcrossing rate log10(ν^+(ξ)) (• • •), of the optimal fitted curve for the simplified approach (− − −), of log10(ν_P^+(ξ)) (solid line) for the parametric approach. Both empirical (dotted lines) and fitted (dash-dotted lines) 95% confidence bands are shown. Predictions of 99.9% quantile: 5.80 and 95% CI = (5.75, 5.84) (simplified method); 5.77 (calibrated parametric)

Fig. 11.21 Case 2: 100 × 100 m^2. Plot of the empirical upcrossing rate log10(ν^+(ξ)) (• • •), of
the optimal fitted curve for the simplified approach (− − −), of log10(ν_P^+(ξ)) (solid line) for the
parametric approach. Both empirical (dotted lines) and fitted (dash-dotted lines) 95% confidence
bands are shown. Predictions of 99.9% quantile: 6.98 and 95% CI = (6.73, 7.23) (simplified
method); 6.89 (calibrated parametric)

Fig. 11.22 Case 3: 300 × 300 m^2. Plot of the empirical upcrossing rate log10(ν^+(ξ)) (• • •), of
the optimal fitted curve for the simplified approach (− − −), of log10(ν_P^+(ξ)) (solid line) for the
parametric approach. Both empirical (dotted lines) and fitted (dash-dotted lines) 95% confidence
bands are shown. Predictions of 99.9% quantile: 7.35 and 95% CI = (7.11, 7.62) (simplified
method); 7.45 (calibrated parametric)

Fig. 11.23 Case 4: 500 × 500 m^2. Plot of the empirical upcrossing rate log10(ν^+(ξ)) (• • •), of
the optimal fitted curve for the simplified approach (− − −), of log10(ν_P^+(ξ)) (solid line) for the
parametric approach. Both empirical (dotted lines) and fitted (dash-dotted lines) 95% confidence
bands are shown. Predictions of 99.9% quantile: 7.79 and 95% CI = (7.50, 8.10) (simplified
method); 7.72 (calibrated parametric)

Fig. 11.24 The 99.9% fractile value of a 3-hour extreme value distribution as a function of the
size of a square domain with x = L_x = L_y

would typically be about 100 × 100 m, it is seen that the area effect may amount to as
much as 15–20% larger extreme crest heights than predicted by one-point estimates.

11.7.2 A Student’s t Random Field

To test the proposed methods on a strongly non-Gaussian random field, what shall
be referred to as a Student’s t field has been constructed. This was done in the

Fig. 11.25 Part of a realization of the Student’s t random field

Fig. 11.26 The one-point upcrossing rate ν_0^+(ξ), cf. Eq. (11.33)

following manner. Let X(x, t) = η_1(t, x, 0), where η_1(t, x, 0) is as defined in the
previous example. X(x, t) is then a homogeneous Gaussian random field of zero
mean with one spatial dimension. Let X_j(x, t), j = 1, . . . , 4, denote independent
copies of X(x, t). The random field Z(x, t) is now constructed as follows:

\[ Z(x, t) = \frac{X(x, t)}{\sqrt{\frac{1}{4} \sum_{j=1}^{4} X_j(x, t)^2} + \frac{\sigma}{20}}, \tag{11.52} \]

where σ^2 denotes the variance of X(x, t). The added term σ/20 in the denominator
is introduced to avoid near singularities in the generated fields and make sure that
it satisfies the conditions discussed after Eq. (11.22). Part of a realization of the
resulting random field is shown in Fig. 11.25.
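
As a pointwise illustration of Eq. (11.52), the sketch below evaluates Z from one sample of X and four samples of the independent copies X_j. It assumes that the denominator is the root mean square of the four copies plus σ/20, which is how a Student's t variable with four degrees of freedom is usually constructed. The function name and the sample values are hypothetical.

```python
import math


def student_t_field_value(x, copies, sigma):
    """Pointwise value of Z = X / (sqrt(mean(X_j^2)) + sigma/20).

    x      : sample of the Gaussian field X at one point (x, t)
    copies : samples of the independent copies X_j at the same point
    sigma  : standard deviation of X
    """
    rms = math.sqrt(sum(c * c for c in copies) / len(copies))
    return x / (rms + sigma / 20.0)


# Hypothetical values: the sigma/20 term keeps Z finite even if all copies vanish.
z_regular = student_t_field_value(1.0, [1.0, -1.0, 1.0, -1.0], sigma=1.0)
z_bounded = student_t_field_value(1.0, [0.0, 0.0, 0.0, 0.0], sigma=1.0)
```

The second call shows the role of the added term: without it the ratio would be undefined when the copies are all close to zero.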
Figures 11.26 and 11.27 show the results obtained for ν_0^+(ξ) and ν_x^+(ξ),
respectively. These figures clearly display the strongly non-Gaussian characteristics
of this random field.

Fig. 11.27 The upcrossing rate ν_x^+(ξ), cf. Eq. (11.33)

Fig. 11.28 Single point. Plot of the empirical upcrossing rate log10(ν^+(ξ)) (• • •), of the optimal
fitted curve for the simplified approach (− − −), of log10(ν_P^+(ξ)) (solid line) for the parametric
approach. Both empirical (dotted lines) and fitted (dash-dotted lines) 95% confidence bands are
shown. Predictions of 99.9% quantile: 32.33 and 95% CI = (30.25, 34.69) (simplified method);
32.33 (calibrated parametric)

Figure 11.28 shows the one-point prediction result, while Fig. 11.29 shows the
prediction results for an interval of size 100 m. It is seen that there is a strong
extensional effect of about 50% on the predicted extreme values. It is also clearly
demonstrated that the predictions obtained by the two proposed methods are in
excellent agreement.

Fig. 11.29 x = 100 m. Plot of the empirical upcrossing rate log10(ν^+(ξ)) (• • •), of the optimal
fitted curve for the simplified approach (− − −), of log10(ν_P^+(ξ)) (solid line) for the parametric
approach. Both empirical (dotted lines) and fitted (dash-dotted lines) 95% confidence bands are
shown. Predictions of 99.9% quantile: 48.92 and 95% CI = (47.73, 49.78) (simplified method);
47.70 (calibrated parametric)

11.8 Comments

Analytical formulas for the extreme value distribution of a homogeneous Gaussian

random field have been discussed at some length, and some properties of these
formulas have been highlighted. It has been shown that by proper calibration, the
analytical formulas provide good results also for the case of a Gaussian random sea.
Provided there is no degeneracy in the dimensionality of the Gaussian random
field, the extreme value distribution PA is asymptotically accurate with respect to
the level .ξ as well as to the size of the domain, while PB is asymptotically accurate
with respect to the level .ξ for any size.
The region of applicability of the approximation PB is suggested to be ν_B^+(ξ) ≤
10^-2. It is then formulated in terms of the value of ν_B^+ rather than for the level ξ,
and it appears to be valid for a domain of arbitrary size L1 × L2.

The accuracy of the proposed prediction procedure obtained by optimal fitting to
the simulated data has also been amply demonstrated. This leaves the door open for
accurate prediction of extremes of non-Gaussian random fields. While the analytical
formulas are restricted to homogeneous Gaussian fields, no such restriction applies
to the proposed prediction method.
A simplified, empirically based procedure for prediction of space–time extremes
of homogeneous non-Gaussian random fields has been presented. The advantage of
this method is its flexibility and general applicability, while it appears to be accurate
and robust. A drawback of the method is that if the spatial domain is changed, the
whole analysis has to be repeated. In an effort to ameliorate this situation, a quasi-
parametric representation of the extreme value distribution for rectangular domains

is proposed. When this approach applies, it has the advantage that it only needs to
be calibrated for one rectangular domain. After that, it applies to any rectangular
domain (with the same orientation). The case studies presented show that the two
methods proposed seem to provide very good practical tools for the prediction of
space–time extremes of random fields.
Chapter 12
A Case Study—Extreme Water Levels

12.1 Introduction

As part of the efforts to reduce the vulnerability to flooding, it is of paramount

importance to have available flooding charts that spell out the risk of flooding for a
given location. This risk is typically expressed as the flooding levels associated with
various return periods, e.g., the 100-year flooding level. In Norway the concerns
about flooding events are primarily connected with two types of flooding: flooding
in rivers and lakes, which is mainly related to melting snow, and flooding
along the coast, which is mainly due to a combination of offshore storm surges
and tides. In this chapter the attention is limited to coastal areas. Specifically, the
extreme value statistics of sea levels measured at three stations along the Norwegian
coastline will be investigated: Oslo, Heimsjø and Honningsvåg, see Fig. 12.1. This
provides an excellent opportunity to compare the performance of the following
four methods for extreme value estimation: The Annual Maxima (AM) method
(Chap. 2), the Peaks-Over-Threshold (POT) method (Chap. 3), the ACER method
(Chap. 5), and also the Revised Joint Probabilities (RJP) method (Tawn and Vassie
1989; Tawn 1992) (see also Batstone et al. (2009) for modifications of the RJP
method). Since the RJP method has not been discussed previously in this book,
it will be explained in some detail here. This chapter largely follows the work
presented by Skjong et al. (2013).

12.2 Data Sets

The data sets used in this chapter are water level measurements and tidal predictions
for three locations on the Norwegian coast. Water level measurements are collected
by automated equipment, while the tidal predictions are based on standard numerical
models.


Fig. 12.1 Map with locations of the sea level measuring stations considered in this paper: Honningsvåg, Heimsjø and Oslo (scale approx. 1:17,000,000)

There is one data set from the south of Norway, Oslo. Tidal variations have less
influence on the water level here than further north, and the location is
referred to as surge-dominant. The total height of the sea level is also lower here than
further north. There is one data set from the middle of Norway, Heimsjø, and one
from the far north, Honningsvåg. The sea levels are much more influenced by tides
at these two locations, which are tide-dominant. The data sets are not in the public domain, but
were provided by the Norwegian Hydrographic Service, a division of the Norwegian
Mapping Authority.
The sea level measurements are made either hourly or every 10 minutes. For
the 10-minute data, hourly measurements have been extracted for use with the
POT, RJP and ACER methods. There are several reasons for this. Firstly, the RJP
method uses tidal predictions, which were provided as hourly sea levels. Secondly,
many locations have hourly data for older periods and 10-minute measurements
for recent years. In order to use both data sets, hourly values are extracted
from the more frequent observations. Thirdly, the literature on which the theoretical
studies are based uses hourly measurements. Methodological comparisons are therefore made
simpler.
If not otherwise noted, all references to the height of the sea level are in
centimeters relative to a mean sea level. All return periods are, if not otherwise
noted, given in years.

12.2.1 Oslo

For the measurement station in Oslo, hourly measured sea levels are available
from December 10, 1914, to December 31, 1991, and 10-minute interval data from
October 1, 1991, to September 16, 2010, which was the cutoff date for our analyses.
These data are uncorrected for post-glacial rebound, which is an important factor
in the Oslo area. To correct for this, 4 mm (0.4 cm) per year has been added to data points for
years after 1988 and the same amount has been subtracted per year before 1988,
according to the formula currentValue + 0.4 · (currentYear − 1988). This correction was
found in a previous report (Hansen and Roald 2000), and is based on the fact that
1988 was the base year to calculate mean sea level (MSL) in Oslo.
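
The rebound correction is simple enough to state as a one-line function. The following is a minimal sketch of the formula quoted above, with levels in centimeters; the function and parameter names are illustrative and the sample values are hypothetical.

```python
def correct_for_rebound(level_cm, year, base_year=1988, rate_cm_per_year=0.4):
    """Shift a measured sea level (cm) to the base-year datum."""
    return level_cm + rate_cm_per_year * (year - base_year)


after = correct_for_rebound(100.0, 2008)   # 20 years after the base year
before = correct_for_rebound(100.0, 1968)  # 20 years before the base year
```

A reading from 2008 is raised by 8 cm and a reading from 1968 is lowered by 8 cm, while readings from 1988 are left unchanged.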
For the same periods, tidal sea level predictions are available. These are based on
a numerical model, the details of which are not publicly available.

12.2.2 Heimsjø

Heimsjø is located on the coast of Sør-Trøndelag, and the measurement station is
found at latitude 63°26′ N and longitude 09°07′ E. Data are available as hourly
measurements from November 1, 1928, to December 31, 1990, and as 10-minute
interval measurements from November 1, 1990, to September 16, 2010. Tide
predictions are also available for the same periods.

12.2.3 Honningsvåg

Honningsvåg is found at the very north of Norway, in Finnmark. The location of the
measurement station is latitude 70°59′ N and longitude 25°59′ E. Measured values
exist in hourly form from June 5, 1970, to December 31, 1988, and as 10-minute
data from June 1, 1988, to September 16, 2010. Corresponding tidal predictions are
also available.

12.3 Annual Maxima Method

As we know from Chap. 2, the annual maxima (AM) method is based on the
assumption that M_N = max{X_1, . . . , X_N}, where X_1, . . . , X_N are independent
observations using a block size of one year, is distributed according to the
generalized extreme value (GEV) distribution.
For the published work done on Norwegian sea levels, the shape parameter
of the GEV distribution is assumed to be fixed at zero. This means that the

Gumbel distribution is adopted, which can be ascertained by studying the underlying

statistics and their domain of attraction. However, results for the case of a nonzero
shape parameter will also be presented below. This is done because it has been
argued that the GEV distribution to be used should be decided on the basis of the
extreme value data (Coles 2001). It will be shown that this can be misguided advice.

12.3.1 Application to Water Level Measurements

Oslo

Oslo has data available from all years from 1914 to 2010—except 1939, from
which there are no available measurements. In addition, some of the years with
available data have important data missing. For instance, there are only 312 data
points in 1914, all from the month of December. This means that there is a very
real possibility that the real annual maximum is excluded. In 1915, there is much
missing data from the important autumn and winter months, where annual maxima
are often found. This year is therefore also excluded from consideration. In 1972,
there are no measurements in February, July, August, September and October, which
means that so much data are missing that the probability that the year’s maximum
is left out is large. 1974 lacks data for July, August and September, and 1991 only
has data for October, November and December. Both these are excluded. Finally,
measurements for 2010 only go to September 16, so its maximum is also dropped
from consideration.
For the Gumbel model, the parameter estimates are μ̂ = 167.31 (2.074) and σ̂ =
18.630 (1.497). For the GEV model, μ̂ = 168.21 (2.233), σ̂ = 19.030 (1.579) and
γ̂ = −0.089 (0.070). Standard errors are shown in parentheses.

With the data as specified above, estimates of return period levels together with
maximum likelihood 95% confidence intervals are given in Table 12.1 for the
Gumbel and GEV models. The obtained results agree fairly well with published
values collected from a report by the Norwegian Mapping Authority (Hansen and Roald
2000), which were also based on the AM method.
The return level plot in Fig. 12.2 shows that all points stay within the 95%
confidence intervals for both models. The two curves follow slightly different paths;
the Gumbel model in Fig. 12.2a fits fairly well to all points, while the GEV model

Table 12.1 Estimates of return levels and 95% confidence intervals for Oslo

             Gumbel model               GEV model
R = 1/p      ẑp      CI                 ẑp      CI
5            195.2   (188.4, 202.1)     194.9   (188.7, 201.1)
10           209.2   (200.4, 218.0)     207.0   (199.1, 214.9)
20           222.6   (211.9, 233.4)     217.9   (207.4, 228.3)
100          253.0   (237.7, 268.2)     240.1   (220.5, 259.7)
200          265.9   (248.7, 283.2)     248.8   (223.9, 273.8)

Fig. 12.2 Return level plots for the AM methods, Oslo. (a) Gumbel model. (b) GEV model. Axes: return period (horizontal), return level (vertical)

in Fig. 12.2b fits generally better to most points but poorly to the rightmost point.
Since the estimated shape parameter is negative, the return levels of the GEV model
are bounded at μ̂ − σ̂/γ̂ = 381.7 cm, while the Gumbel model return levels are
unbounded.
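
The tabulated return levels and the upper bound can be reproduced approximately from the quoted parameter estimates. The sketch below assumes the GEV parametrization G(z) = exp{−[1 + γ(z − μ)/σ]^(−1/γ)}, for which the R-year return level is the quantile with exceedance probability p = 1/R. The function names are illustrative, and because the parameters are entered in rounded form, small deviations from the values in the text are to be expected.

```python
import math


def gumbel_return_level(mu, sigma, R):
    """Level exceeded on average once every R years under a Gumbel model."""
    y = -math.log(1.0 - 1.0 / R)
    return mu - sigma * math.log(y)


def gev_return_level(mu, sigma, gamma, R):
    """Same quantity for a GEV model with nonzero shape parameter gamma."""
    y = -math.log(1.0 - 1.0 / R)
    return mu - (sigma / gamma) * (1.0 - y ** (-gamma))


def gev_upper_bound(mu, sigma, gamma):
    """Upper end point mu - sigma/gamma of a GEV distribution with gamma < 0."""
    return mu - sigma / gamma


# Rounded Oslo estimates quoted in the text
z100_gumbel = gumbel_return_level(167.31, 18.630, 100)
z100_gev = gev_return_level(168.21, 19.030, -0.089, 100)
bound_gev = gev_upper_bound(168.21, 19.030, -0.089)
```

With these inputs the 100-year levels come out close to the tabulated 253.0 and 240.1 cm, and the bound comes out within a fraction of a centimeter of the quoted 381.7 cm.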

Heimsjø

Heimsjø has data for all years from 1928 to 2010. As with Oslo, some years are
missing so much data that it is likely that the true annual maximum has been left
out. One example is 1934, where data for the first seven months are missing.
This year is left out of the analysis. 1938 has missing data for January, February
and December, important months in which the true annual maximum is likely to occur.
1943 lacks data for January to April. 1959 lacks any data from September
to December and has little from February and August. Finally, 2010 lacks data for
the last months of the year. All these years are therefore left out.
For the Gumbel model, the parameter estimates are μ̂ = 311.84 (1.399) and σ̂ =
11.685 (0.978). For the GEV model, μ̂ = 312.69 (1.480), σ̂ = 11.889 (1.022) and
γ̂ = −0.137 (0.066). Estimates of return levels are presented in Table 12.2.

The return level plots in Fig. 12.3 show the curves of the Gumbel and GEV
models, together with confidence intervals and observed data points. The Gumbel
model in Fig. 12.3a seems to slightly overestimate the points at high levels,
while the GEV model in Fig. 12.3b perhaps underestimates them. All points are
within confidence intervals, however. The GEV model return levels are bounded at
399.7 cm.

Table 12.2 Estimates of return levels and 95% confidence intervals for Heimsjø

             Gumbel model               GEV model
R = 1/p      ẑp      CI                 ẑp      CI
5            329.4   (324.8, 333.9)     328.8   (325.0, 332.7)
10           338.1   (332.3, 343.9)     335.7   (331.0, 340.4)
20           346.5   (339.5, 353.6)     341.7   (335.8, 347.6)
100          365.5   (355.5, 375.6)     353.3   (343.2, 363.3)
200          373.7   (362.4, 385.0)     357.5   (345.2, 369.9)

Fig. 12.3 Return level plots for the annual maxima methods, Heimsjø. (a) Gumbel model. (b) GEV model. Axes: return period (horizontal), return level (vertical)

Table 12.3 Estimates of return levels and 95% confidence intervals for Honningsvåg

             Gumbel model               GEV model
R = 1/p      ẑp      CI                 ẑp      CI
5            348.8   (342.0, 355.6)     348.0   (343.2, 352.8)
10           357.7   (349.0, 366.3)     353.5   (348.3, 358.7)
20           366.2   (355.6, 376.7)     357.8   (351.7, 363.9)
100          385.4   (370.4, 400.4)     365.0   (355.3, 374.7)
200          393.6   (376.7, 410.6)     367.2   (355.6, 378.7)

Honningsvåg

Honningsvåg has data for all years from 1970 to 2010, except 1985. Most of the
years have acceptable amounts of data, but four years are missing important data:
1970 has no data from January to May, 1988 nothing in February and very little in
March, 1989 nothing in August and September and 2010 nothing from October to
December. All these are excluded from the model fitting.
For the Gumbel model, the parameter estimates are μ̂ = 331.08 (2.087) and σ̂ =
11.812 (1.458). For the GEV model, μ̂ = 332.85 (2.285), σ̂ = 12.344 (1.626) and
γ̂ = −0.277 (0.115). Estimates of return levels are given in Table 12.3.

Fig. 12.4 Return level plots for the annual maxima methods, Honningsvåg. (a) Gumbel model. (b) GEV model. Axes: return period (horizontal), return level (vertical)

The probability and quantile plots show that there are fewer points available for
the model estimation than for Oslo and Heimsjø. The fit is still quite good overall,
and similar to the other two locations—even though only a few points are on the line,
the points seem to follow a sort of oscillating S shape around the optimal straight
line.
The return level plots in Fig. 12.4 show the trend of the Gumbel and GEV models.
Again it can be seen that the fit is not impressive, although the data points stay within
the confidence bounds. As for Oslo and Heimsjø, the estimated shape parameter is
negative, γ̂ = −0.2771, meaning that the Gumbel model gives higher estimates for
the return levels for high return periods. Thus, it would appear that the higher return
levels might be overestimated by the Gumbel model in Fig. 12.4a and somewhat
underestimated by the GEV in Fig. 12.4b. The GEV return levels are bounded at
377.4 cm, significantly below even the 100-year return level of the Gumbel model.
This is clearly not a sensible situation, and points to the hazard of using essentially
curve fitting to decide on which GEV to use for estimating long return period sea
levels.

12.4 The Peaks-Over-Threshold Method

For many physical processes, the assumption of temporally independent observations
is unrealistic. Stationarity is usually a more plausible assumption, and it says
that even though observations or data may be dependent, their stochastic properties
are temporally homogeneous. A GEV distribution remains an appropriate model
for block maxima of stationary series, and a GP distribution can also be shown to
remain appropriate for threshold excesses, see Beirlant et al. (2004).

The degree of dependence between data points is in some sense quantified in the
extremal index θ, which satisfies 0 < θ ≤ 1, cf. Sect. 2.9. This parameter can be
interpreted as a measure of the tendency of the process to cluster at extreme levels,
and one can informally say that the inverse of the extremal index is the limiting mean
cluster size. This means that if θ = 0.5, then extreme values would approximately
arrive in groups of two.
For the block maxima case, dependence is mostly absorbed in the parameters,
which have to be estimated anyway. However, some change is necessary for
threshold excess modeling. A common method used to overcome this issue is
declustering, where the generalized Pareto distribution is instead fitted to the
maxima of clusters. These clusters are identified by some empirical rule.
The easiest method of cluster identification is to take a threshold and require
that subsequent observations must be above this threshold to be part of the
current cluster. A modification can be made by allowing one or more subsequent
observations to be below the threshold before the current cluster is left. The
number of such observations allowed is set by selecting a value for r, the allowed
distance between observations above the
threshold. For instance, this means that if observation number 50 and observation
number 55 are both above the threshold u, they would be in the same cluster for
r ≥ 5 since the distance between the observations is 5. For r < 5, they would be
considered to be from different clusters.
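
The cluster rule just described can be sketched in a few lines. The following is an illustrative implementation, not the procedure used to produce the results in this chapter: exceedances of u whose positions are at most r apart are assigned to the same cluster, and the maximum of each cluster is returned. The function name and the test series are hypothetical.

```python
def decluster(values, u, r):
    """Group exceedances of u into clusters; positions at most r apart share a cluster.

    Returns the list of cluster maxima.
    """
    maxima = []
    last_index = None
    for i, v in enumerate(values):
        if v <= u:
            continue
        if last_index is not None and i - last_index <= r:
            maxima[-1] = max(maxima[-1], v)  # still in the current cluster
        else:
            maxima.append(v)                 # a new cluster starts here
        last_index = i
    return maxima


# Hypothetical series: exceedances of u = 1.0 at positions 50 and 55 only
series = [0.0] * 100
series[50], series[55] = 2.0, 3.0
same_cluster = decluster(series, u=1.0, r=5)       # one cluster
separate_clusters = decluster(series, u=1.0, r=4)  # two clusters
```

The ratio of the number of clusters to the number of exceedances then gives a simple estimate of the extremal index for the chosen u and r.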

To be able to use a POT model in practice, the threshold u must be selected,
as discussed in Chap. 3. In the mean residual life plots, cf. Eq. (3.11), for the
three data sets that are analyzed in this chapter, the choice of u_0 has been clearly
marked. Another method is available, based on fitting the model to a wide range
of thresholds. In Eq. (3.10) it was used that the scale parameter σ_u is a linear
function of the threshold u above a valid threshold u_0. Furthermore, the shape parameter γ_u
should be constant. By reparametrizing the scale parameter to σ_u^* = σ_u − γ_u u, one
obtains a parameter which should also be constant above u_0. The plot of (u, γ_u) and
(u, σ_u^*) is here called the stability plot. Stability plots are also presented for the
three available data sets.
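
The two diagnostic quantities are straightforward to compute for a given threshold. The sketch below assumes the usual definition of the mean excess as the average of x − u over the observations exceeding u; the function names and the data are hypothetical and serve only to show the calculation.

```python
def mean_excess(values, u):
    """Average of (x - u) over the observations x that exceed the threshold u."""
    excesses = [x - u for x in values if x > u]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)


def modified_scale(sigma_u, gamma_u, u):
    """Reparametrized scale sigma_u - gamma_u * u used in the stability plot."""
    return sigma_u - gamma_u * u


# Hypothetical data, only to show the calls
data = [1.0, 2.0, 3.0, 4.0, 5.0]
me_3 = mean_excess(data, 3.0)               # excesses are 1.0 and 2.0
sigma_star = modified_scale(10.0, 0.1, 50.0)
```

Evaluating `mean_excess` over a grid of thresholds gives the points of a mean residual life plot, while `modified_scale` would be applied to the fitted parameters at each threshold.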

Having chosen the reference level u_0, one would typically calculate the return
period levels for several values of u ≥ u_0. This is obtained by using the following
formulas:

\[ x_N = u + \sigma \big[ (N n_y \zeta_u \theta)^{\gamma} - 1 \big] / \gamma , \tag{12.1} \]

for the case of γ ≠ 0, and

\[ x_N = u + \sigma \ln(N n_y \zeta_u \theta), \tag{12.2} \]

for γ = 0. Here N = number of years, n_y = the average number of observations per
year, ζ_u = Prob(X > u), θ = the extremal index.
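
Equations (12.1) and (12.2) translate directly into a small function. The sketch below is illustrative only; apart from the threshold of 134 cm, which is the value chosen for Oslo later in this section, all parameter values are hypothetical and are not estimates from the data sets.

```python
import math


def pot_return_level(N, u, sigma, gamma, n_y, zeta_u, theta):
    """N-year return level from Eqs. (12.1) and (12.2)."""
    m = N * n_y * zeta_u * theta  # expected number of clusters above u in N years
    if gamma == 0.0:
        return u + sigma * math.log(m)
    return u + sigma * (m ** gamma - 1.0) / gamma


# Hypothetical parameter values; n_y = 8766 corresponds to hourly observations
args = dict(u=134.0, sigma=15.0, n_y=8766.0, zeta_u=0.001, theta=0.15)
x200_zero_shape = pot_return_level(200, gamma=0.0, **args)
x200_small_shape = pot_return_level(200, gamma=1e-6, **args)
x200_negative_shape = pot_return_level(200, gamma=-0.1, **args)
```

The near agreement of the first two values illustrates that Eq. (12.2) is the limit of Eq. (12.1) as γ tends to zero, and the third shows that a negative shape parameter lowers the return level for the same scale.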
The parameters of the Generalized Pareto distributions in this section are
estimated by maximum likelihood. This was done with the maximum likelihood

procedure in the fpot function in the evd library in R. Standard errors for the GP
parameters come from this R function as well, and are extracted from a numerical
approximation of the observed information. Confidence intervals for the parameters
are then calculated by using the approximate normality of the maximum likelihood
estimator. The uncertainty in .ζu is ignored since it is usually small compared to the
errors of the other parameters (Coles 2001).
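
An interval based on approximate normality is obtained as the estimate plus or minus a normal quantile times the standard error. The following minimal sketch uses the multiplier 1.96 for a 95% interval and, purely as an illustration, the annual-maxima shape estimate for Oslo quoted in Sect. 12.3, namely −0.089 with standard error 0.070. The function name is hypothetical.

```python
def normal_confidence_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval: estimate minus/plus z standard errors."""
    return (estimate - z * std_error, estimate + z * std_error)


# Illustration with the annual-maxima shape estimate for Oslo, -0.089 (0.070)
lower, upper = normal_confidence_interval(-0.089, 0.070)
```

For these numbers the interval runs from about −0.226 to 0.048.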

12.4.1 Application to Water Level Measurements

Oslo

Figure 12.5a shows the mean residual life plot for Oslo. To select an appropriate
threshold, one would search for approximate linearity within the confidence bounds.
The vertical line shows the choice made at 134 cm. The figure does not paint a
completely clear picture, as is the case with most real data sets, but from around the
indicated spot and up to about 200 cm there is a certain level of linearity.
Figure 12.5b shows the effect of threshold selection on the model parameters.
Ideally, the shape and modified scale parameters should be invariant to threshold
change as long as the data are above the minimum threshold. One can see that this
is basically the case from the beginning of the plot and up to about 170–175 cm. A
threshold of 134 cm is within this range and cannot be rejected based on Fig. 12.5.
Fig. 12.5 Threshold selection plots for the POT model for Oslo. (a) Mean of exceedances of increasing thresholds (cm). (b) Stability of parameters for increasing threshold (cm). [Plot omitted; axes: (a) mean excess against threshold u (cm); (b) modified scale and shape against threshold u (cm).]

Having selected a threshold, one would go on to select the allowed distance between points in a cluster, $r$. Figure 12.6a shows the behaviour of the extremal index as the allowed distance between observations in a cluster is increased. The extremal index declines quite sharply until about $r = 12$, then almost flattens out before declining faster again. A similar tapering in the steepness of the curve is seen at about $r = 24$, and possibly at about $r = 36$.

Fig. 12.6 Change in extremal index and 200 year return level as calculated by the POT model for Oslo. (a) Extremal index for r between 1 and 40. (b) 200 year return level for r between 1 and 40. [Plot omitted; axes: (a) extremal index estimate against value of r (hours); (b) 200 year return level estimate (cm) against value of r.]

With hourly observations, these changes in steepness coincide with the 12 hour cycle of the lunar tidal component. It is therefore safe to assume that extreme storms above $u = 134$ may last from one high tide to another. This suggests selecting a value of $r$ larger than 12 to encapsulate the full length of such storms. But it does not seem plausible to have $r$ much larger, since if the tide was the dampening factor, then the subsequent rising tide should bring the storm with it up to extreme levels again.
More insight into the choice of $r$ is granted by looking at Fig. 12.6b, which shows the development of the 200 year return level as $r$ is increased from 1 to 40. The return level stays approximately constant from $r = 1$ to $r = 5$ but increases relatively sharply from there and up to $r = 8$. It then decreases just as sharply, before staying relatively constant from $r = 12$ up to about $r = 22$. It then drops sharply, before staying constant up to $r = 36$. For $r \in (1, 5)$ and $r \in (12, 22)$, the return level is approximately the same. Having argued that $r$ should not be much larger than 12, a value of $r = 1$ is chosen since this gives approximately the same result while using more data. Note that $r = 1$ means that a new cluster starts as soon as one value is below the threshold.
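The clustering rule described above can be sketched as a simple runs-declustering routine in Python. This is a minimal illustration under the assumption that a cluster is closed once $r$ consecutive values at or below the threshold have been seen; the function names are hypothetical, and the code is not the routine used by the evd package.

```python
def runs_decluster(series, threshold, r):
    """Return cluster maxima; a cluster ends after r consecutive non-exceedances."""
    maxima = []
    current_max = None  # maximum of the open cluster, or None if no cluster is open
    below_run = 0       # consecutive non-exceedances since the last exceedance
    for x in series:
        if x > threshold:
            current_max = x if current_max is None else max(current_max, x)
            below_run = 0
        elif current_max is not None:
            below_run += 1
            if below_run >= r:
                maxima.append(current_max)
                current_max = None
    if current_max is not None:
        maxima.append(current_max)  # close a cluster still open at the end
    return maxima

def extremal_index(series, threshold, r):
    """Number of clusters divided by number of exceedances."""
    n_exceed = sum(1 for x in series if x > threshold)
    if n_exceed == 0:
        return float("nan")
    return len(runs_decluster(series, threshold, r)) / n_exceed

toy = [0, 5, 6, 0, 7, 0, 0, 8, 0]  # toy data, threshold 4
print(runs_decluster(toy, 4, 1), extremal_index(toy, 4, 1))
print(runs_decluster(toy, 4, 2), extremal_index(toy, 4, 2))
```

Increasing $r$ merges neighbouring clusters, so the estimated extremal index can only stay the same or decrease, in line with the declining curves in the extremal index plots.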
With the threshold at 134 cm, the model parameters were estimated to be $\hat{\gamma} = 0.019\,(0.025)$ and $\hat{\sigma} = 13.993\,(0.494)$, with standard errors in parentheses, by the fpot function in the evd package in R. Probability and quantile plots indicate that the model with estimated parameters fits well to the maxima of the identified clusters. From the 778,199 points in the data series, 7364 points are above the chosen threshold, giving an estimate of the exceedance probability $\zeta$ of $\hat{\zeta} = 0.0095$. From the 7364 points above the threshold, 1675 clusters were identified, giving an extremal index of $\hat{\theta} = 0.2275$. All this gives the return level estimates shown in Table 12.4.

Table 12.4 Return level estimates and 95% confidence intervals for the POT model

R = 1/p   Oslo                      Heimsjø                   Honningsvåg
(years)   ẑ_p (cm)  CI              ẑ_p (cm)  CI              ẑ_p (cm)  CI
5         200.5     (189.3, 213.9)  329.5     (321.7, 338.8)  351.6     (338.1, 368.3)
10        211.1     (197.0, 228.5)  335.3     (325.9, 347.0)  360.5     (344.2, 380.9)
20        221.9     (204.7, 243.8)  340.7     (329.5, 354.8)  369.2     (350.1, 393.8)
100       247.6     (221.7, 282.2)  351.7     (336.5, 372.0)  388.8     (362.5, 424.6)
200       258.9     (228.8, 300.0)  355.8     (338.9, 378.9)  397.0     (367.4, 438.3)

The return level plot is shown in Fig. 12.7. All data points are contained within the confidence bounds of the fitted model, and most of them stay on or very close to the line. Points above 185 cm behave somewhat more erratically than those below, but not to a dramatic extent. This level corresponds approximately to the level at which the stability of parameters, as displayed in Fig. 12.5b, begins to wear off.

Fig. 12.7 Return levels (cm) for increasing return periods (years) for Oslo. [Plot omitted; axes: return level (cm) against return period (years), logarithmic scale from 0.1 to 50.]

Heimsjø

Figure 12.8a shows the mean residual life plot for the Heimsjø data. Approximate linearity is found around the vertical line, which is placed at $u = 287$ cm. The stability plot in Fig. 12.8b shows the effect on the model parameters of threshold change around the 287 cm mark. Above approximately 280 cm and up to around 325–330 cm, the parameters are approximately constant.

Fig. 12.8 Threshold selection plots for the POT model for Heimsjø. (a) Mean of exceedances of increasing thresholds (cm). (b) Stability of parameters for increasing threshold (cm). [Plot omitted; axes: (a) mean excess against threshold u (cm); (b) modified scale and shape against threshold u (cm).]

An appropriate value of $r$ is sought for the identification of independent clusters and the estimation of the extremal index. Figure 12.9a shows how the extremal index estimate behaves. It is close to piecewise constant, with significant drops at around $r = 10$ to 12 and at $r = 24$. Figure 12.9b shows that there is rather little difference between the estimate for $r = 1$ and larger $r$. The difference is somewhat larger than was found for Oslo, however, and $r$ is chosen to include storms crossing from one tide to another. This leads to the choice $r = 13$.

Fig. 12.9 Change in extremal index and 200 year return level as calculated by the POT model for Heimsjø. (a) Extremal index for r between 1 and 40. (b) 200 year return level for r between 1 and 40. [Plot omitted; axes: (a) extremal index estimate against value of r (hours); (b) 200 year return level estimate (cm) against value of r.]

With the 287 cm threshold, model parameters are estimated as $\hat{\sigma} = 13.709\,(0.653)$ and $\hat{\gamma} = -0.116\,(0.030)$. Probability and quantile plots show a fair agreement with the fitted model. However, for the highest sea levels, there is an issue with the model fit, which will be discussed below. From the 686,867 points in the Heimsjø data set, 2363 points are above 287 cm, giving an estimate of the exceedance probability $\zeta$ of $\hat{\zeta} = 0.0034$. From the points above the threshold, 733 clusters were identified, giving an extremal index of $\hat{\theta} = 0.310$.
Return level estimates are presented in Table 12.4. They go from slightly overshooting to slightly undershooting the published return levels, but all are certainly within confidence bounds.

Fig. 12.10 Return levels (cm) for increasing return periods (years) for Heimsjø. [Plot omitted; axes: return level (cm) against return period (years), logarithmic scale from 0.1 to 50.]

As mentioned above, for the highest sea levels, there are some issues with the model fit. This is better seen by looking at the return level plot in Fig. 12.10, which shows irregularities after about 330 cm, corresponding to the level after which parameters in Fig. 12.8b are no longer near-constant. The five highest points are above the model line, and it seems possible that return levels for long return periods can be underestimated.

Honningsvåg

The mean residual life plot in Fig. 12.11a shows tendencies of linearity within confidence bounds slightly before the 280 cm mark. A threshold of 278 cm is selected, indicated by the vertical black line. The stability plot in Fig. 12.11b is far from constant up to around 275–280 cm, but from there it is reasonable to call the parameters near-constant up to perhaps 310 cm. The threshold $u = 278$ is therefore barely within the acceptable area as far as the stability plot is concerned.

Fig. 12.11 Threshold selection plots for the POT model for Honningsvåg. (a) Mean of exceedances of increasing thresholds (cm). (b) Stability of parameters for increasing threshold (cm). [Plot omitted; axes: (a) mean excess against threshold u (cm); (b) modified scale and shape against threshold u (cm).]

To find an appropriate level of $r$, Fig. 12.12a is inspected. Honningsvåg is tide-dominant like Heimsjø, and the same pattern is shown here as was observed there; the extremal index plot is approximately piecewise constant. Figure 12.12b shows a similar pattern as that for Heimsjø as well, but the differences in return levels are much larger. The values for $r \le 10$ and $r > 24$ are about the same, however. The difference is large enough to not be within the confidence intervals, and $r = 13$ is chosen since $r \ge 12$ was argued for earlier.

Fig. 12.12 Change in extremal index and 200 year return level as calculated by the POT model, for Honningsvåg. (a) Extremal index for r between 1 and 40. (b) 200 year return level for r between 1 and 40. [Plot omitted; axes: (a) extremal index estimate against value of r (hours); (b) 200 year return level estimate (cm) against value of r.]

A threshold of 278 cm gives estimated model parameters of $\hat{\sigma} = 14.692\,(0.530)$ and $\hat{\gamma} = -0.025\,(0.026)$. Probability and quantile plots show a fair agreement with the fitted model. However, as for Heimsjø, for the highest sea levels, there is an issue with the model fit. There are 330,883 sea level measurements in the Honningsvåg data, 9264 of which are above 278 cm. This makes for an estimate of the exceedance probability $\zeta$ of $\hat{\zeta} = 0.0280$. From the points above the threshold, 1590 clusters were identified, giving an extremal index of $\hat{\theta} = 0.172$.
Estimated return levels are presented in Table 12.4.

Fig. 12.13 Return levels (cm) for increasing return periods (years) for Honningsvåg. [Plot omitted; axes: return level (cm) against return period (years), logarithmic scale from 0.1 to 50.]

As indicated above, the quantile plot shows that a few of the most extreme data points are off the fitted model curve, with Fig. 12.13 giving a clearer impression. It looks likely that higher return levels may be overestimated, although all points are within confidence intervals.

12.5 Revised Joint Probabilities Method

The revised joint probabilities (RJP) method is an attempt by Tawn (1992) to improve the joint probabilities method employed by Pugh and Vassie (1980). Whereas they assume that hourly surge levels are independent, Tawn argues that this is clearly a false assumption. Instead, he uses that for a stationary sequence $Y_1, \ldots, Y_n, \ldots$,

$$\mathrm{Prob}(\max\{Y_1, \ldots, Y_n\} < y) \approx [\mathrm{Prob}(Y_1 < y)]^{n\theta} \tag{12.3}$$

for large $y$, and $0 < \theta \le 1$. Here $\theta$ is the extremal index, and $\theta^{-1}$ is defined as the limit of $\theta^{-1}(y)$ as $y$ tends toward the upper end point of the distribution of $Y$, where $\theta^{-1}(y)$ is defined as the mean of the distribution of cluster sizes. The extremal index can be equal to 1 for both dependent and independent sequences, when the sequence behaves like an independent sequence at high levels. Unfortunately, as will be demonstrated in the section on the ACER method, for the data analyzed in this chapter, $\theta(y)$ does not seem to be a robust parameter relative to its dependence on $y$. In fact, it will be seen that $\theta(y)$ may display significant dependence on $y$ while there is clear evidence that $\theta(y) \to 1$ when $y$ increases. However, in the present chapter, the extremal index is estimated as would be typically done in the RJP method.
The RJP method relies on a componentwise analysis of the sea level $Z$. It is divided into three components, $Z_t = M_t + X_t + Y_t$, where $M_t$ is the mean sea level, $X_t$ the tidal level and $Y_t$ the surge level. For the purpose of this discussion, $M_t = 0$ for all $t$. The tidal level $X_t$ is estimated by a numerical model.

If $N$ is the number of hours in a year, the surge levels $Y_1, \ldots, Y_N$ are taken to be a realization of a stationary sequence. It is then assumed that (Coles 2001) $\mathrm{Prob}(\max\{Y_1, \ldots, Y_N\} \le y) = F_s(y)^{N\theta_s} = G(y; \mu_s, \sigma_s, \gamma_s)$ for large values of $y$, say $y > u$. $F_s$ is the marginal distribution for surges, while $\theta_s$ ($0 \le \theta_s \le 1$) is the extremal index for surges. $G$ refers to the GEV distribution model. It follows that,

$$F_s(y) = \exp\left\{ -(N\theta_s)^{-1} \left[ 1 + \gamma_s\,\frac{y - \mu_s}{\sigma_s} \right]^{-1/\gamma_s} \right\} \quad \text{for } y > u, \tag{12.4}$$

on $\{y : 1 + \gamma_s (y - \mu_s)/\sigma_s > 0\}$, with $\sigma_s > 0$ and arbitrary $\gamma_s$ and $\mu_s$.
If $T$ is the tidal cycle length in hours, it follows that (Tawn 1992),

$$\mathrm{Prob}(\max\{Z_1, \ldots, Z_T\} \le z) = \mathrm{Prob}\Big(\bigcap_{t=1}^{T} \{Y_t \le z - X_t\}\Big) = \Big[\prod_{t=1}^{T} F_s(z - X_t)\Big]^{\theta}, \tag{12.5}$$

where $\theta$ is an hourly sea level extremal index ($\theta_s \le \theta \le 1$). Combining Eqs. (12.4) and (12.5) gives the RJP distribution for annual maximum sea levels,

$$G(z) = \exp\left\{ -\theta\,(T\theta_s)^{-1} \sum_{t=1}^{T} \left[ 1 + \gamma_s\,\frac{z - \mu_s - X_t}{\sigma_s} \right]^{-1/\gamma_s} \right\}, \tag{12.6}$$

where $z > u + \max\{X_1, \ldots, X_T\}$.

Equation (12.6) shows the case where the surge distribution $F_s(\cdot)$, and therefore its parameters $\mu_s$, $\sigma_s$ and $\gamma_s$, are independent of the concomitant tidal level. Unfortunately, the assumption that tide and surge are independent processes is poor in shallow water areas, where turbulent frictional processes on the sea bed cause the tide and surge components to interact. This causes effects such as surge values at high tides being damped and surges on the rising tide being amplified (Dixon and Tawn 1994). These effects vary from site to site, however, so an attempt is made to model the interaction on the residuals from the observed surges.
The tidal range from lowest observed tide (LAT) to highest observed tide (HAT) is split into $n_b$ equi-probable bands, i.e., each band contains an equal number of observations. For each of the tidal observations, there is a concurrent surge observation. This means that if there are $n_b$ tidal bands and $n_{obs}$ observations in total, there are $n_{obs}/n_b$ observations of both the tide and surge in each band.
If the tide and surge were independent, an equal number of points should also be expected to exceed a given level $u$ in each band. But if there is interaction, then the fewest points should be in the top band where surges are damped. Similarly, the largest number should be in the middle bands where surges are magnified. The amount of discrepancy between bands will therefore be a quantifiable measure of the tide-surge interaction.
The level $u$ is now chosen to be a high empirical quantile of the surge distribution, $z_q$ for $q = 0.9975$. For the independent case, there should then be $(1 - q) \cdot n_{obs}/n_b = v$ observations in each band. $n_b = 5$ is chosen as in Dixon and Tawn (1994), meaning $n_{obs} \cdot 0.0025/5 = v$ observations per tidal band. In actuality, since independence is a flawed assumption, $N_i$ surges per band are observed, for $i = 1, \ldots, n_b$. To put the interaction into a quantifiable setting, a standard $\chi^2$ test statistic is used, $\chi^2 = \sum_{i=1}^{5} (N_i - v)^2 / v$. If $N_i \approx v$, then $\chi^2$ will be small. Tide and surge are deemed to interact with 95% confidence if the test statistic is above the associated 4 degrees of freedom table value of $\chi^2_{4,0.95} = 9.488$.
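The banding and the test statistic described above can be sketched in Python as follows. The function names are hypothetical and the banding routine is a simplified illustration (equally populated bands obtained by sorting on tide level). The Oslo band counts and the expected count of 389 are those reported in Sect. 12.5.2; since the expected count is rounded, the statistic computed here need not reproduce the value quoted there to the last digit.

```python
def equiprobable_band_counts(tides, surges, n_bands, surge_level):
    """Count surges above surge_level in each of n_bands equally populated tidal bands."""
    # Indices ordered from the lowest to the highest tide level.
    order = sorted(range(len(tides)), key=lambda i: tides[i])
    counts = []
    for b in range(n_bands):
        start = b * len(order) // n_bands
        stop = (b + 1) * len(order) // n_bands
        counts.append(sum(1 for i in order[start:stop] if surges[i] > surge_level))
    return counts

def chi_square_statistic(counts, expected):
    """Sum over bands of (N_i - v)^2 / v."""
    return sum((n - expected) ** 2 / expected for n in counts)

CHI2_4_095 = 9.488  # table value for 4 degrees of freedom, as quoted in the text

oslo_counts = [346, 380, 398, 431, 386]  # lowest to highest tidal band
stat = chi_square_statistic(oslo_counts, expected=389)
print(round(stat, 2), stat > CHI2_4_095)
```

A statistic above the table value is read as significant tide-surge interaction at the 95% level.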
To account for the tide-surge interaction, the method used in Dixon and Tawn (1994) is adopted, where the surge series is location-scale normalized by $S_t^* = (Y_t - a(X_t))/b(X_t)$, where $a(X_t)$ and $b(X_t)$ are some tide-dependent functions. $\{S_t^*\}$ is then supposed to be stationary, and established methods can be used to estimate the associated model parameters $\mu_{s^*}$, $\sigma_{s^*}$ and $\gamma_{s^*}$. The parameter estimates for the original surge series are then given by $\mu_s(X) = \mu_{s^*} b(X) + a(X)$, $\sigma_s(X) = \sigma_{s^*} b(X)$, and $\gamma_s(X) = \gamma_{s^*}$.
Equation (12.6) is then modified to

$$G(z) = \exp\left\{ -\theta\,(T\theta_s)^{-1} \sum_{t=1}^{T} \left[ 1 + \gamma_{s^*}\,\frac{z - \mu_{s^*} b(X_t) - a(X_t) - X_t}{\sigma_{s^*} b(X_t)} \right]^{-1/\gamma_{s^*}} \right\}, \tag{12.7}$$

for $z > u + \max\{X_1, \ldots, X_T\}$. The estimation of $a(X)$ and $b(X)$ is discussed in Dixon and Tawn (1994).

12.5.1 Estimating Return Levels with the RJP Method

In the 1992 and 1994 reports (Tawn 1992; Dixon and Tawn 1994), the r-largest
method is applied to estimate the parameters of the surge distribution. This method
is a modification of the GEV annual maxima method, and uses the r largest values
per year to estimate model parameters.
The threshold applied in the estimation of $\theta_s$ is now adopted, and the POT method is then used to estimate model parameters. This method has the advantage of using much more data. Use is made of the fact that a POT model and the corresponding GEV model share the same shape parameter, $\gamma_{POT} = \gamma_{GEV} = \gamma$, with $\sigma_{POT} = \sigma_{GEV} + \gamma \cdot (u - \mu_{GEV})$ (Coles 2001).
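The relation between the POT and GEV scale parameters can be written as a pair of small helper functions. This Python sketch only restates the formula above; the function names and the numbers in the example are hypothetical and are not estimates from this chapter.

```python
def pot_scale_from_gev(mu_gev, sigma_gev, gamma, u):
    """sigma_POT = sigma_GEV + gamma * (u - mu_GEV)."""
    return sigma_gev + gamma * (u - mu_gev)

def gev_scale_from_pot(mu_gev, sigma_pot, gamma, u):
    """Inverse relation: sigma_GEV = sigma_POT - gamma * (u - mu_GEV)."""
    return sigma_pot - gamma * (u - mu_gev)

# Hypothetical values, for illustration only.
sigma_pot = pot_scale_from_gev(mu_gev=100.0, sigma_gev=15.0, gamma=-0.05, u=80.0)
print(sigma_pot)
```

For $\gamma = 0$ the two scale parameters coincide, whatever the threshold.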

In practice, the shape and scale are estimated with a maximum likelihood procedure by the function fpot in the R package evd, while the location parameter is calculated by the fgev function. This is also the case for the confidence intervals of model parameters. The model is fitted to the maxima of clusters, where the clusters are identified by the same rule that governs the estimation of $\theta$.
The extremal indices $\theta$ and $\theta_s$ require two choices each: a threshold above which clusters are counted, and an empirical clustering rule which says how many non-exceedances are allowed before the current cluster is terminated. For the first choice, the quantile that was found in the POT analysis is used. A thorough analysis of where the threshold should be placed was done there, and it makes sense to use this same threshold to estimate $\theta$. Applying the procedure of using the same quantiles for both extremal indices, as was done in Dixon and Tawn (1994), corresponding thresholds are obtained for the surge series and the estimation of $\theta_s$ as well.
For the second choice, it is necessary to decide what constitutes an independent storm. Dixon and Tawn (1994) found that $r = 30$ was a good choice, and that the ratio of extremal indices was not too dependent on this choice in any case. In the present analysis the results are more sensitive to this choice, since the POT method with clustering is used to estimate model parameters. Still, this $r$ value of 30 is adopted.
A practical challenge with the RJP method is that Eq. (12.7) is impossible to solve analytically for $z$. One may start out by defining $G(z_p) = 1 - p$ and get the relation,

$$-T\theta_s\theta^{-1} \ln(1 - p) = \sum_{t=1}^{T} \left[ 1 + \gamma_{s^*}\,\frac{z_p - \mu_{s^*} b(X_t) - a(X_t) - X_t}{\sigma_{s^*} b(X_t)} \right]^{-1/\gamma_{s^*}}, \tag{12.8}$$

but one cannot go further by algebraic manipulation. A numerical procedure is therefore developed where the right hand side of the equation is calculated for a range of relevant $z$ values and then matched to the left side of the equation.
The tidal cycle length is 18.61 years, meaning that $T = 18.61 \cdot 8766 = 163135.3 \approx 163135$. Any span of 163135 consecutive observations should therefore approximately contain a full tidal cycle, and in the numerical procedure the last 163135 observations in the tidal series are taken. The numerical analysis code is written in R.
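The numerical matching step can be illustrated with a short Python sketch. It is not the R code referred to above: it swaps the grid search for bisection, treats only the case without tide-surge correction, Eq. (12.6), and uses a synthetic sinusoidal tide in place of a real tidal series. The function names and the tide are hypothetical, the validity range $z > u + \max\{X_1, \ldots, X_T\}$ is not checked, and the surge parameters are the Oslo estimates reported in Sect. 12.5.2, so the output is not comparable with Table 12.5.

```python
import math

def rjp_log_cdf(z, tides, mu, sigma, gamma, theta, theta_s):
    """ln G(z) from Eq. (12.6), with T taken as the length of the tide series."""
    total = 0.0
    for x in tides:
        if gamma == 0.0:
            total += math.exp(-(z - mu - x) / sigma)  # Gumbel limit of the GEV term
            continue
        arg = 1.0 + gamma * (z - mu - x) / sigma
        if arg <= 0.0:
            if gamma > 0.0:
                return -math.inf  # z below the lower end point: G(z) = 0
            continue              # gamma < 0, z beyond the upper end point: term is 0
        total += arg ** (-1.0 / gamma)
    return -theta / (len(tides) * theta_s) * total

def rjp_return_level(p, tides, mu, sigma, gamma, theta, theta_s, tol=1e-6):
    """Solve G(z_p) = 1 - p by bisection; ln G is non-decreasing in z."""
    target = math.log(1.0 - p)

    def f(z):
        return rjp_log_cdf(z, tides, mu, sigma, gamma, theta, theta_s) - target

    lo = hi = mu + max(tides)
    step = sigma
    while f(lo) > 0.0:   # expand the bracket downwards if needed
        lo -= step
        step *= 2.0
    step = sigma
    while f(hi) < 0.0:   # expand the bracket upwards if needed
        hi += step
        step *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic tide (cm), hypothetical; surge parameters as reported for Oslo.
tides = [20.0 * math.sin(2.0 * math.pi * t / 12.42) for t in range(1000)]
params = dict(mu=95.655, sigma=17.576, gamma=-0.049, theta=0.079, theta_s=0.068)
z5 = rjp_return_level(1.0 / 5.0, tides, **params)
z200 = rjp_return_level(1.0 / 200.0, tides, **params)
print(z5 < z200)
```

Bisection is used here because the right hand side of Eq. (12.8) is monotone in $z$, so a bracketing root finder is sufficient; evaluating the sum on a grid of $z$ values, as described in the text, is an equally valid alternative.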

12.5.2 Application to Water Level Measurements

Oslo

A scrutiny of the sea level data for Oslo reveals that a large percentage of the observed sea level rises stem from surges. Locations with such characteristics are called surge-dominant, as opposed to tide-dominant where most of an observed sea level stems from the current tidal level. Areas where surge dominates usually do not have as much tide-surge interaction as tide-dominant areas, but the $\chi^2$ test is still used to quantify the amount.
By using 5 tidal bands and cutting the data at the 99.75% quantile, 389 surges per band are expected. Instead 346, 380, 398, 431, and 386 data points per band are found, in order from the lowest tide level to the highest. The deviation is not large, but there is still evidence of significant tide-surge interaction. The test statistic is $\chi^2 = 9.778$, compared to the table value of $\chi^2_{4,0.95} = 9.488$.
Since the $\chi^2$ test shows a significant level of interaction, it is modeled by using 5 tidal bands. Figure 12.14a shows the tide against surge data to the left, and tide against transformed surge data to the right, both showing only points above the 99.75% quantile. There seems to be very little difference between the left and the right plots except for the scales on the x axes.
Figure 12.14b shows how the ratio between the $\chi^2$ test statistic and the corresponding $\chi^2_{n_b-1,0.95}$ value develops after 2–30 bands have been used to transform the surge data. $\chi^2/\chi^2_{n_b-1,0.95} = 1$ is indicated by the horizontal line, and ratios above this correspond to $\chi^2$ tests showing significant interaction. It is seen that for $n_b = 2$ and $n_b = 15$ the interaction is insignificant, but for $n_b = 2$ the interaction is insignificant for the untransformed data as well, with a test statistic of 1.85 versus the corresponding table value of 3.84. Transforming the data with $n_b = 15$ seems needlessly complex for a model where there was hardly significant interaction in the original test. Furthermore, there is little actual difference in return levels. The same is true if they are compared to the return levels achieved using no tide-surge correction. For parsimony the model without correction is selected, based on Eq. (12.6) instead of Eq. (12.7). Estimated parameters are $\hat{\mu}_s = 95.655\,(2.313)$, $\hat{\sigma}_s = 17.576\,(1.053)$, $\hat{\gamma}_s = -0.049\,(0.039)$, $\hat{\theta} = 0.079$ and $\hat{\theta}_s = 0.068$.
Resulting return levels and confidence intervals for Oslo can be found in Table 12.5.

Fig. 12.14 Tide-surge interaction plots for Oslo. (a) Tide levels and concurrent surges above the 99.75% quantile. (b) Ratio between $\chi^2$ test statistic and table value $\chi^2_{n_b-1,0.95}$. [Plot omitted; panels in (a): "Surge data" and "Corrected surge data", tide level (cm) against surge level (cm); (b) ratio of chi-squared test statistic to table value against number of bands.]

Table 12.5 Return level estimates and 95% confidence intervals for the RJP model

R = 1/p   Oslo                      Heimsjø                   Honningsvåg
(years)   ẑ_p (cm)  CI              ẑ_p (cm)  CI              ẑ_p (cm)  CI
5         194.5     (184.1, 207.6)  318.2     (314.6, 323.5)  332.4     (331.3, 335.6)
10        206.7     (192.5, 225.2)  325.4     (321.1, 336.1)  341.9     (339.1, 347.9)
20        217.9     (200.0, 242.5)  333.0     (327.0, 354.0)  350.9     (346.1, 360.4)
100       242.0     (215.0, 282.9)  352.5     (339.3, 422.4)  370.6     (360.2, 390.7)
200       251.8     (220.7, 300.7)  362.1     (344.1, 466.5)  378.7     (365.6, 404.4)

Heimsjø

Heimsjø is tide-dominant, and it is observed from the data that there is quite a small distance between the maximum of its tidal series and the maximum of observed sea levels. This also holds for the mean of annual maxima from tide calculations and observations. Tide-dominance usually implies larger tide-surge interaction than surge-dominance, and the $\chi^2$ test for Heimsjø certainly lives up to this expectation. With 5 tidal bands and cutting at the 99.75% quantile, about 343 surge observations are expected per band. Instead, 790, 479, 288, 112 and 75 points are found. As expected for a tide-dominant location, the fewest points are in the highest tidal band, where the high tide dampens surges. By far the most points are in the bottom band.
The huge discrepancy between expected and observed number of surges in each band leads to a very large value for the $\chi^2$ test statistic, with $\chi^2 = 1007.3$ compared to the table value of $\chi^2_{4,0.95} = 9.488$. The results of transforming the data with 2 to 30 bands are shown in Fig. 12.15b. The ratio never drops below 1, as one would wish; instead $n_b = 12$ is chosen, which gives a relatively low ratio compared to other choices. It gives $\chi^2 = 89.865$ versus the quantile value of $\chi^2_{11,0.95} = 19.675$. Figure 12.15a shows that the number of points in each band is much more similar in the transformed case to the right of the figure compared to the left, uncorrected side.
Estimated parameters are $\hat{\mu}_{s^*} = 3.494\,(0.283)$, $\hat{\sigma}_{s^*} = 1.671\,(0.160)$, $\hat{\gamma}_{s^*} = 0.074\,(0.079)$, $\hat{\theta} = 0.106$ and $\hat{\theta}_{s^*} = 0.063$.

Fig. 12.15 Tide-surge interaction plots for Heimsjø. (a) Tide levels and concurrent surges above the 99.75% quantile. (b) Ratio between $\chi^2$ test statistic and table value $\chi^2_{n_b-1,0.95}$. [Plot omitted; panels in (a): "Surge data" and "Corrected surge data", tide level (cm) against surge level (cm); (b) ratio of chi-squared test statistic to table value against number of bands.]

Return levels and confidence intervals for Heimsjø are presented in Table 12.5. The low return period levels are quite different from their published counterparts, but the published 20-year value lies below the upper confidence bound of the corresponding return level estimate (Hansen and Roald 2000). This could perhaps be explained by the tide-surge interaction still present, but estimation of return levels without any tidal bands also shows this apparent underestimation of low return period levels. The standard error of the estimated $\gamma_s$ is larger than the estimate itself, and the estimate being positive means very large upper bounds for long return periods.

Honningsvåg

Honningsvåg is tide-dominant like Heimsjø, but the $\chi^2$ test shows much less interaction than for Heimsjø. It is still significant on a 95% confidence level, with $\chi^2 = 23.652$. Figure 12.16b shows how the $\chi^2$ ratio develops. After $n_b = 2$, it does not go below 1 until $n_b = 25$, but for $n_b = 16$ it is quite close and this is chosen. Here, $\chi^2 = 26.106$ against $\chi^2_{15,0.95} = 24.996$. Figure 12.16a shows the surge data before and after transformation with 16 tidal bands.
Estimated parameters are $\hat{\mu}_{s^*} = 2.943\,(0.274)$, $\hat{\sigma}_{s^*} = 1.617\,(0.124)$, $\hat{\gamma}_{s^*} = -0.041\,(0.049)$, $\hat{\theta} = 0.085$ and $\hat{\theta}_{s^*} = 0.043$.
Return levels and confidence intervals for Honningsvåg are shown in Table 12.5.

Fig. 12.16 Tide-surge interaction plots for Honningsvåg. (a) Tide levels and concurrent surges above the 99.75% quantile. (b) Ratio between $\chi^2$ test statistic and table value $\chi^2_{n_b-1,0.95}$. [Plot omitted; panels in (a): "Surge data" and "Corrected surge data", tide level (cm) against surge level (cm); (b) ratio of chi-squared test statistic to table value against number of bands.]

12.6 The ACER Method

All calculations in this section were performed with the ACER package for Matlab
(Karpa 2012).

12.6.1 Application to Water Level Measurements

Note that for estimation of the 5, 10, and 20-year return period levels, which for all stations are in-sample estimates, the ACER method will provide unbiased estimates of the exact values. For estimation of the long return period levels, the first goal is to find a tail marker $\eta_0$ that represents a sufficiently high threshold. For the ACER method, this is found where the curves of the estimated ACER functions $\hat{\varepsilon}_k(\eta)$ start behaving regularly in the sense of Eq. (5.31). After having chosen such a point initially, one would typically go even further into the tail to verify robustness with respect to the estimated return levels. For the hourly sea level measurements studied in this chapter, it was found that a level for which the remaining data amounted to roughly 5–10% of the total data set gave stable numerical estimates.
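To give a feel for what an estimated ACER function measures, the following Python sketch implements a simple counting estimator for a stationary series. The formal definition and Eq. (5.31) lie outside this section, so the form used here (exceedances of $\eta$ preceded by $k - 1$ non-exceedances, divided by the number of admissible time points) is an assumption about the estimator, and the function name is hypothetical. It is not a transcription of the ACER package for Matlab.

```python
def acer_estimate(series, eta, k):
    """Empirical ACER function of order k at level eta (assumed stationary form).

    Counts indices j with series[j] > eta whose k - 1 immediate predecessors
    are all <= eta, divided by the number of admissible indices.
    """
    n = len(series)
    if k < 1 or n < k:
        raise ValueError("need 1 <= k <= len(series)")
    hits = 0
    for j in range(k - 1, n):
        preceded_by_calm = all(series[i] <= eta for i in range(j - k + 1, j))
        if series[j] > eta and preceded_by_calm:
            hits += 1
    return hits / (n - k + 1)

toy = [0, 5, 6, 0, 7, 0, 0, 8, 0]  # toy data, level eta = 4
print(acer_estimate(toy, 4, 1), acer_estimate(toy, 4, 2))
```

For $k = 1$ this reduces to the plain exceedance frequency, which corresponds to treating the data as independent; larger $k$ discounts exceedances that directly follow another exceedance.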

Oslo

The curves in the ACER plot of Fig. 12.17a detail the effect dependence has on the ACER function estimates for Oslo. It can be seen that the independent case of $k = 1$ has a curve which stays significantly above the rest, clearly demonstrating that hourly measurements are strongly dependent. The ACER plot also shows that there is a significant diurnal dependence effect, which is to be expected. But this effect is seen to vanish at the higher levels, showing that the ACER functions coalesce in the tail for $k \ge 2$. Thus, for estimation of long return period sea levels it is advantageous to choose $k = 2$, since this case allows for the use of more data than for $k > 2$, with the potential benefit of higher accuracy as a result.
With the selected estimated ACER function, i.e., $\hat{\varepsilon}_2(\eta)$, the optimized parameter estimates are $d = -4.048$, $b = 96.901$, $a = 0.037$, $c = 1.117$, which are based on a tail marker $\eta_0 = 100$. The return level estimates for Oslo are presented in Table 12.6.

Fig. 12.17 ACER plots for Oslo. (a) Log plot of estimated ACER functions. (b) Fitted and extrapolated ACER$_2$ function. [Plot omitted; (a) ACER$_k(\eta)$ on a logarithmic scale for $k$ = 1, 2, 4, 12, 24; (b) ACER$_2(\eta)$ on a logarithmic scale.]

Table 12.6 Return level estimates and 95% confidence intervals for the ACER model

R = 1/p   Oslo                      Heimsjø                   Honningsvåg
(years)   ẑ_p (cm)  CI              ẑ_p (cm)  CI              ẑ_p (cm)  CI
5         199.6     (191.0, 205.5)  330.5     (327.0, 333.6)  351.6     (346.4, 357.1)
10        210.1     (199.7, 216.9)  336.3     (332.4, 339.7)  357.7     (352.3, 363.8)
20        220.1     (207.7, 227.6)  341.6     (337.4, 345.3)  363.3     (357.8, 370.1)
100       242.3     (224.9, 251.8)  352.6     (347.7, 357.0)  375.3     (369.6, 383.8)
200       251.6     (232.0, 262.0)  357.0     (351.8, 361.7)  380.2     (374.4, 389.2)

Heimsjø

Figure 12.18a displays well-behaved ACER functions for Heimsjø, but the plot is qualitatively quite different from the previous one for Oslo. Now, the dependence of hourly data has a smaller influence on the ACER function values. This may be due to the fact that the tidal effects are much stronger for Heimsjø than for the other two locations. Also for this location, all ACER functions for $k \ge 2$ coalesce in the tail. Therefore, $k = 2$ is selected for the return level estimation. The tail marker is chosen to be $\eta_0 = 230$, which gives the following optimal values of the parameters: $d = -2.524$, $b = 154.689$, $a = 5.27 \cdot 10^{-6}$, $c = 2.755$. The return level estimates are found in Table 12.6.

Honningsvåg

Figure 12.19a reveals that the ACER plot for Honningsvåg is very similar to the one for Heimsjø. This may be explained by the fact that both stations have a strong tidal component. Since all ACER functions for $k \ge 2$ coalesce in the tail, $k = 2$ is again selected for the return level estimation. The tail marker is chosen to be $\eta_0 = 260$, which gives the following optimal values of the parameters: $d = -3.012$, $b = 231.209$, $a = 8.15 \cdot 10^{-4}$, $c = 1.907$. The return level estimates are given in Table 12.6.

12.7 Discussion of Results

12.7.1 Oslo

Oslo is surge-dominant and has the largest amount of data of the locations presented
in this chapter. Both these factors mean that all the methods applied should perform
quite well here. Surge-dominance means that the non-stochastic tidal component of
the measured sea levels does not interfere to a large degree, and a large amount of
data is naturally desirable. As for possible sources of error, the removal of the
post-glacial rebound trend is certainly worth mentioning, and the same procedure as in
the report by Hansen and Roald (2000) was followed.

Fig. 12.18 ACER plots for Heimsjø. (a) Log plot of estimated ACER functions ACERk(η) for k = 1, 2, 4, 12, 24. (b) Fitted and extrapolated ACER2(η) function

Figure 12.20a shows that the return levels are very similar for all methods for the
5, 10 and 20 year periods, while larger differences arise for the 100 and 200 year
periods. Note that the lines between the indicated points are drawn only as a visual aid.
Figure 12.20b shows the 200 year return level estimates, together with 95%
confidence intervals. As seen, the Gumbel model produces the largest estimate,
at 266 cm, while the GEV model gives the smallest, at 249 cm. This difference is
not large, and the confidence interval of each method contains the estimates from
the other methods. Also note that the ACER method has the narrowest confidence
interval, perhaps indicating a higher estimation accuracy.

Fig. 12.19 ACER plots for Honningsvåg. (a) Log plot of estimated ACER functions ACERk(η) for k = 1, 2, 4, 12, 24. (b) Fitted and extrapolated ACER2(η) function

12.7.2 Heimsjø

For Heimsjø, the non-stochastic tidal component dominates, and estimating the
extreme value distribution becomes more difficult. This is seen in the larger
differences between methods. The RJP method, in particular, disagrees markedly
with the other methods for the first three periods of 5, 10, and 20 years, as shown
in Fig. 12.21a. Since the ACER method provides unbiased estimates of the exact
in-sample values, it can be concluded that the RJP method estimates are inaccurate
at these return periods. However, the RJP method performs much better for the 100
and 200 year levels, where it is essentially in line with all models except the Gumbel one.
Figure 12.21b shows the differences between the 200 year estimates more clearly, and
highlights the extreme upper bound of the estimate obtained by the RJP method. The
other confidence bounds look minuscule in comparison, but this is only because of the
scale set by the bounds of the RJP method. The Gumbel model produces a significantly
higher return level than the other methods. However, the difference between
the highest estimate (Gumbel) and the lowest (POT) for the longest return period is
only 18 cm. It is seen that the ACER method again provides the smallest confidence
interval.

Fig. 12.20 Comparison between return levels in Oslo for the Gumbel, GEV, POT, RJPM and ACER methods. (a) Return levels for 5, 10, 20, 100 and 200 year periods. (b) 200 year return levels and confidence bounds

12.7.3 Honningsvåg

Mostly the same pattern as in Heimsjø is shown for Honningsvåg, but with a greater
scatter of the results. All methods except the RJP method agree at the 5 year return
level, but diverge at the higher levels. The RJP method starts out estimating lower
return levels, but ends up in the same region as the ACER estimate for the 200 year
period. The Gumbel and POT methods give by far the largest estimates, while the
GEV method produces by far the lowest.

Fig. 12.21 Comparison between return levels in Heimsjø for the Gumbel, GEV, POT, RJPM and ACER methods. (a) Return levels for 5, 10, 20, 100 and 200 year periods. (b) 200 year return levels and confidence bounds

Figure 12.22a compares the 5, 10, 20, 100 and 200 year estimates for all methods,
while Fig. 12.22b shows the comparatively large difference between the 200 year
estimates. The POT estimate for that period is 30 cm larger than the corresponding
GEV estimate. Again the ACER method provides the smallest confidence interval.

12.7.4 Comments

The Annual Maxima (AM) method and the Peaks-over-Threshold (POT) method are both
widely known and much applied methods. However, they both have some possible defects.
As mentioned, the AM method throws away much of the data, and may end up
fitting a poor model since all locations have fewer than 100 years of observed maxima
to draw from. The POT method allows for the use of a great deal more data, but
is subject to two individual choices: the extreme threshold and the empirical
clustering rule. The former choice has supporting literature available (Coles
2001), but there is little general theory for the selection of the clustering rule, and
some experience is a valuable asset in dealing with this problem.

Fig. 12.22 Comparison between return levels in Honningsvåg for the Gumbel, GEV, POT, RJPM and ACER methods. (a) Return levels for 5, 10, 20, 100 and 200 year periods. (b) 200 year return levels and confidence bounds

The Revised Joint Probabilities (RJP) method is less widely applied than the
first two methods, but has seen application to a number of British locations by the
creators and proponents of the method, cf. Tawn (1992), Dixon and Tawn (1994),
and Haigh et al. (2010). But because the literature is more sparse, and from fewer
sources, it is more difficult to fully assess the reliability of this method.
As for the estimated return period levels by the RJP method, a notable feature
is what appears to be underestimated short period return levels in the tide-dominant
locations of Heimsjø and Honningsvåg. This is not an artifact from the tide-surge
interaction modeling, since corresponding return levels were equally low when the
estimation was performed without such modeling. The POT method is used to
estimate surge parameters, which was not done by Dixon and Tawn (1994) in earlier
work. This could perhaps affect results.
However, the approach of Dixon and Tawn (1994) is adopted in declustering the data,
using r = 30, which is quite different from the choice made in the POT analysis. This
was done to emulate their method, but the choice could possibly have
benefited from more careful consideration. Other choices that need to be made are
the functions used to correct for tide-surge interaction and the quantiles
that are deemed extreme. The correction functions were made to the specifications
in Dixon and Tawn (1994), while the quantiles used were taken from the preceding
POT section, which provides a generally thorough explanation of the threshold
selection.
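
To make the role of the declustering parameter concrete, the following Python sketch implements a generic runs-declustering rule: exceedances of a threshold belong to the same cluster until r consecutive observations fall at or below the threshold, and only the maximum of each cluster is kept. This is a minimal sketch of the general technique, assuming a regularly sampled series; it is not claimed to reproduce the exact rule of Dixon and Tawn (1994) or the one used in this chapter, and the function name decluster_runs is hypothetical.

```python
from typing import List, Optional, Sequence

def decluster_runs(x: Sequence[float], threshold: float, r: int) -> List[float]:
    """Return the cluster maxima of the exceedances of `threshold`.

    A cluster is closed once `r` consecutive observations are at or
    below the threshold (runs declustering)."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    maxima: List[float] = []
    current_max: Optional[float] = None  # maximum of the open cluster, if any
    gap = 0  # consecutive non-exceedances since the last exceedance
    for value in x:
        if value > threshold:
            # Start a new cluster or extend the open one.
            current_max = value if current_max is None else max(current_max, value)
            gap = 0
        elif current_max is not None:
            gap += 1
            if gap >= r:
                # The run of non-exceedances is long enough: close the cluster.
                maxima.append(current_max)
                current_max = None
                gap = 0
    if current_max is not None:
        # The series ended while a cluster was still open.
        maxima.append(current_max)
    return maxima

if __name__ == "__main__":
    series = [0, 5, 0, 6, 0, 0, 0, 7, 0]
    for r in (1, 2, 4):
        print(r, decluster_runs(series, threshold=3, r=r))
```

A larger r merges neighbouring exceedances into fewer clusters, so fewer (and on average larger) cluster maxima enter the subsequent fit. This is one way in which the choice r = 30 could lead to results that differ from those of a POT analysis based on another clustering rule.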
In total, it is difficult to recommend a method that has the likely defect of
underestimating short period return levels. Although it agrees more with the other
methods for higher return levels, this is an area where there is a large margin of error
anyway. The method is also somewhat difficult to implement, since Eq. (12.7) needs
to be solved numerically. Tide measurements also need to be available. Using such
data is, however, a strong point of the model, if applied correctly. More data are
then used in the estimation. This attempt to incorporate sea level specific data and
methodology into the return level estimation is arguably the most desirable quality
of the method.
The ACER method is the most recently developed method, and it is therefore
interesting to see that it produces results very similar to those by more conventional
methods. In the return level plots in this section it is seen that the ACER method
produces 200 year estimates in the middle of the range of the methods used here.
Two choices need to be made when estimating return levels with the ACER
method: the tail marker η0 and the ACER function εk used for the parameter
estimation. It was found that as long as data around or above the 90–95% quantile
were used, quite consistent results were obtained.
The ACER method seems to be a viable alternative for extreme sea level
estimation, and with the developed methodology for parameter estimation and
construction of confidence intervals it is quite easy to implement. One of the
attractive features of the ACER method is its diagnostic power. By plotting the
ACER functions of relevant order, it can be decided which order is necessary to
capture the effect of dependence in the data on the extreme value statistics.
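
The diagnostic use of the ACER functions can be sketched in a few lines of Python. For a level η, the estimator below counts the exceedances that are preceded by k − 1 non-exceedances and divides by the number of times that conditioning event occurs. This is one common form of the empirical ACER estimate and is given here as an assumption about the estimator, not as the implementation used for the figures in this chapter; the function name empirical_acer is hypothetical.

```python
from typing import Sequence

def empirical_acer(x: Sequence[float], eta: float, k: int) -> float:
    """Empirical ACER function of order k at level eta.

    Ratio of (a) the number of observations exceeding eta that are
    preceded by k-1 consecutive non-exceedances to (b) the number of
    positions preceded by k-1 consecutive non-exceedances.
    For k = 1 this is simply the fraction of observations above eta."""
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    hits = 0       # conditional exceedances
    trials = 0     # occurrences of the conditioning event
    run_below = 0  # consecutive non-exceedances immediately before position j
    for j in range(n):
        exceeds = x[j] > eta
        if run_below >= k - 1:
            trials += 1
            if exceeds:
                hits += 1
        run_below = 0 if exceeds else run_below + 1
    return hits / trials if trials else float("nan")

if __name__ == "__main__":
    data = [1, 1, 5, 1, 1, 5, 5, 1, 1, 1]
    for k in (1, 2, 3):
        print(k, empirical_acer(data, eta=3, k=k))
```

Evaluating such an estimate over a grid of levels η for, say, k = 1, 2, 4, 12 and 24, and comparing the curves on a logarithmic scale, indicates the smallest order k beyond which the curves coalesce in the tail.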
In summary, the ACER method seems to be an attractive method, with its quite
easily implemented methodology and attractive statistical properties. When looking
at 200 year return levels, it usually agrees with the other data-intensive methods,
that is, the RJP method and the POT method. While the empirical ACER functions
provide a completely general nonparametric representation of the extreme value
distribution given by the data, a specific class of parametric distributions was
introduced for the purpose of extrapolation to long return period levels. This class of
distributions specifically targets the cases where the Gumbel distribution is the
appropriate asymptotic extreme value distribution. In the context of this chapter,
this assumption is justified based on the underlying water level statistics, which
invariably belong to the domain of attraction of an asymptotic Gumbel distribution
for the extremes.
References

Adler, R.J. 1981. The Geometry of Random Fields. New York: John Wiley & Sons, Ltd.
Adler, R.J., and J.E. Taylor. 2007. Random Fields and Geometry. New York: Springer.
Anderson, T.W. 1958. An Introduction to Multivariate Statistical Analysis. New York: John Wiley
& Sons, Inc.
Argyris, J., and H.-P. Mlejnek. 1991. Dynamics of Structures. Amsterdam: Elsevier Science
Publishers B.V.
Balakrishnan, N., and C.-D. Lai. 2009. Continuous Bivariate Distributions. New York: Springer,
NY.
Batstone, C., M. Lawless, K. Horsburgh, D. Blackman, and J.A. Tawn. 2009. Calculating extreme
sea level probabilities around complex coastlines: A best practice approach. In Proceedings
Irish National Hydrology Conference, 24–32. Hydrology Ireland.
Battjes, J.A. 1970. Long Term Wave Height Distributions of Seven Stations around the British Isles
(Report A.44 ed.). Godalming: National Institute of Oceanography.
Baxevani, A., and I. Rychlik. 2006. Maxima for Gaussian seas. Ocean Engineering 33 (7): 895–
911.
Beirlant, J., Y. Goegebeur, J. Segers, and J. Teugels. 2004. Statistics of Extremes. Chichester: John
Wiley & Sons, Ltd.
Benfratello, S., M. Di Paola, and P.D. Spanos. 1998. Stochastic response of MDOF wind excited
structures by means of Volterra series approach. Journal of Wind Engineering and Industrial
Aerodynamics 74–76: 1135–1145.
Bollerslev, T. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of
Econometrics 31 (3): 307–327.
Bracewell, R.N. 1986. The Fourier Transform and its Applications. 2nd ed. New York: McGraw-
Hill Inc.
Brockwell, P.J., and R.A. Davis. 2002. Introduction to Time Series and Forecasting. Berlin:
Springer Science.
Bury, K.V. 1975. Statistical Models in Applied Sciences. New York: John Wiley & Sons, Inc.
Byström, H.N.E. 2005. Extreme value theory and extremely large electricity price changes.
International Review of Economics and Finance 14 (1): 41–55.
Casella, G., and R. Berger. 2002. Statistical Inference. Boston: Cengage Learning.
Castillo, E., A.S. Hadi, N. Balakrishnan, and J.M. Sarabia. 2005. Extreme Value and Related
Models with Applications in Engineering and Science. Hoboken: John Wiley & Sons, Ltd.
Cetin, A., and A. Naess. 2012. Toward a proper statistical description of defects. International
Journal of Fatigue 38: 100–107.

Chan, K.F., and P. Gray. 2006. Using extreme value theory to measure value-at-risk for daily
electricity spot prices. International Journal of Forecasting 22 (2): 283–300.
Chatfield, C. 1989. The Analysis of Time Series–An Introduction. London: Chapman and Hall.
Coles, S.G. 1994. A temporal study of extreme rainfall. In Statistics for the Environment 2 - Water
Related Issues, ed. V. Barnett and K.F. Turkman, Chapter 4, 61–78. Chichester: John Wiley &
Sons.
Coles, S.G. 2001. An Introduction to Statistical Modeling of Extreme Values. Springer Series in
Statistics. London: Springer-Verlag.
Coles, S.G., and M.J. Dixon. 1999. Likelihood-based inference of extreme value models. Extremes
2: 5–23.
Coles, S.G., and J.A. Tawn. 1991. Modelling extreme multivariate events. Journal of the Royal
Statistical Society. Series B (Methodological) 53 (2): 377–392.
Coles, S.G., and J.A. Tawn. 1994. Statistical methods for multivariate extremes: An application to
structural design. Journal of the Royal Statistical Society. Series C (Applied Statistics) 43 (1):
1–48.
Contreras, J., R. Espinola, F.J. Nogales, and A.J. Conejo. 2003. ARIMA models to predict next-day
electricity prices. Power Systems, IEEE Transactions 18 (3): 1014–1020.
Cook, N.J. 1982. Towards better estimation of extreme winds. Journal of Wind Engineering and
Industrial Aerodynamics 9 (3): 295–323.
Cook, N.J. 1985. The Designer’s Guide to Wind Loading of Building Structures. London:
Butterworths.
Cramer, H. 1946. Mathematical Methods of Statistics. Princeton: Princeton University Press.
Cramer, H., and M.R. Leadbetter. 1967. Stationary and Related Stochastic Processes. New York:
John Wiley & Sons.
Dahlen, K.E., P.B. Solibakke, S. Westgaard, and A. Naess. 2015. On the estimation of extreme
values for risk assessment and management: The ACER method. International Journal of
Business 20 (1): 33–51.
Davison, A.C., and D.V. Hinkley. 1997. Bootstrap Methods and their Applications. London:
Cambridge University Press.
Davison, A.C., and R.L. Smith. 1990. Models for exceedances over high thresholds. Journal of the
Royal Statistical Society, B 52 (3): 393–442.
de Haan, L. 1970. On regular variation and its applications to the weak convergence of sample
extremes. Tract, Mathematical Centre, Amsterdam.
de Haan, L. 1994. Extreme value statistics. In Extreme Value Theory and Applications, ed.
J. Galambos, J. A. Lechner, and E. Simiu. Dordrecht: Kluwer Academic Publishers.
de Haan, L., and J. de Ronde. 1998. Sea and wind: Multivariate extremes at work. Extremes 1:
7–45.
Dekkers, A.L.M., J.H.J. Einmahl, and L. de Haan. 1989. A moment estimator for the index of an
extreme-value distribution. The Annals of Statistics 17 (4): 1833–1855.
Ditlevsen, O. 1971. Extremes and First Passage Times. Copenhagen: Dissertation (dr. techn.),
Technical University of Denmark.
Ditlevsen, O. 2004. Extremes of random fields over arbitrary domains with application to concrete
rupture stresses. Probabilistic Engineering Mechanics 19: 373–384.
Ditlevsen, O., and H.O. Madsen. 1996. Structural Reliability Methods. Chichester: John Wiley &
Sons, Inc.
Dixon, M.J., and J.A. Tawn. 1994. Extreme sea-levels at the UK A-class sites: site-by-site analyses.
Internal doc. no. 65, Proudman Oceanographic Laboratory.
Donley, M.G., and P.D. Spanos. 1990. Dynamic Analysis of Non-Linear Structures by the Method
of Statistical Quadratization, vol. 57. Lecture Notes in Engineering. Berlin: Springer-Verlag.
Doob, J.L. 1953. Stochastic Processes. New York: John Wiley & Sons.
Draper, N.R., and H. Smith. 1998. Applied Regression Analysis. New York: Wiley-Interscience.
Eastoe, E.F., and J.A. Tawn. 2012. Modelling the distribution of the cluster maxima of exceedances
of subasymptotic thresholds. Biometrika 99 (1): 43–55.
Efron, B., and R.J. Tibshirani. 1993. An Introduction to the Bootstrap. New York: Chapman and
Hall.
Embrechts, P., C. Klüppelberg, and T. Mikosch. 1997. Modelling Extremal Events. New York:
Springer.
Escribano, A., J.I. Pena, and P. Villaplana. 2002. Modeling electricity prices: International
evidence. SSRN (Social Science Research Network).
Falk, M., J. Hüsler, and R.-D. Reiss. 2004. Laws of Small Numbers: Extremes and Rare Events.
2nd ed. Basel: Birkhäuser.
Ferro, C.A.T., and J. Segers. 2003. Inference for clusters of extreme values. Journal of the Royal
Statistical Society, B 65 (2): 545–556.
Fisher, R.A., and L.H.C. Tippett. 1928. On the estimation of the frequency distributions of the
largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society
24: 180–190.
Forristall, G.Z. 2000. Wave crest distributions: Observations and second-order theory. Journal of
Physical Oceanography 30: 1931–1943.
Forristall, G.Z. 2006. Maximum wave heights over an area and the air gap problem. In Proceedings
25th International Conference on Offshore Mechanics and Arctic Engineering, OMAE–2006–
92022. New York: ASME.
Forst, W., and D. Hoffmann. 2010. Optimization - Theory and Practice. New York: Springer.
Fréchet, M. 1927. Sur la loi de probabilité de l’écart maximum. Annales de la Société Polonaise
de Mathematique, Cracow 6: 193–213.
Gaidai, O., G. Storhaug, and A. Naess. 2016. Extreme value statistics of large container ship roll.
Journal of Ship Research 60 (2): 92–100.
Gaidai, O., G. Storhaug, and A. Naess. 2018. Statistics of extreme hydroelastic response for large
ships. Marine Structures 61: 142–154.
Gaidai, O., A. Naess, O. Karpa, X. Xu, Y. Cheng, and R. Ye. 2019. Improving extreme wind speed
prediction for North Sea offshore oil and gas fields. Applied Ocean Research 88: 63–70.
Garcia, R.C., J. Contreras, M.V. Akkeren, and J.B. Garcia. 2005. A GARCH forecasting model to
predict day-ahead electricity prices. Power Systems, IEEE Transactions 20 (2): 867–874.
Gill, P., W. Murray, and M.H. Wright. 1981. Practical Optimization. London: Academic Press.
Gnedenko, B.V. 1943. Sur la distribution limite du terme maximum d’une série aléatoire. Annals
of Mathematics 44: 423–453.
Grigoriu, M. 1984. Crossings of non-Gaussian translation processes. Journal of Engineering
Mechanics, ASCE 110 (4): 610–620.
Grigoriu, M. 1995. Applied Non-Gaussian Processes. Englewood Cliffs: PTR Prentice Hall.
Gudendorf, G., and J. Segers. 2010. Chapter 6. Extreme-Value Copulas. In Copula Theory and
Its Applications: Proceedings of the Workshop Held in Warsaw, 25–26 September 2009, ed.
P. Jaworski, F. Durante, W. Härdle, and T. Rychlik. Berlin: Springer-Verlag.
Gumbel, E.J. 1958. Statistics of Extremes. New York: Columbia University Press.
Gumbel, E.J. 1960a. Bivariate exponential distributions. Journal of the American Statistical
Association 55 (292): 698–707.
Gumbel, E.J. 1960b. Multivariate extremal distributions. Bulletin de l’Institut International de
Statistique 37 (2): 471–475.
Gumbel, E.J. 1961. Bivariate logistic distributions. Journal of the American Statistical Association
56 (294): 335–349.
Gumbel, E.J., and C.K. Mustafi. 1967. Some analytical properties of bivariate extremal
distributions. Journal of the American Statistical Association 62 (318): 569–588.
Haigh, I.D., R. Nicholls, and N. Wells. 2010. A comparison of the main methods for estimating
probabilities of extreme still water levels. Coastal Engineering 57: 838–849.
Hansen, H., and L. Roald. 2000. Extreme water level analysis at sea at selected stations (in
Norwegian). Doc. no. 11, Norwegian Water Resources and Energy Directorate.
Haring, R.E., and J.C. Heideman. 1978. Gulf of Mexico rare wave return periods. In Proceedings
of the Offshore Technology Conference, Number OTC 3230. Houston.
Haver, S. 1980. Analysis of Uncertainties related to the Stochastic Modelling of Ocean Waves.
Report UR-80-09, Division of Marine Structures, NTH (NTNU), Trondheim.
Haver, S. 2002. On the prediction of extreme wave crest heights. In Proceedings of 7th
International Workshop On Wave Hindcasting and Forecasting. Banff: Meteorological Service of
Canada, Environment Canada.
Haver, S., and K.A. Nyhus. 1986. A wave climate description for long term response calculations.
In Proceedings 9th International Conference on Offshore Mechanics and Arctic Engineering.
New York: ASME.
Haver, S., and S.R. Winterstein. 2008. Environmental contour lines: A method for estimating long
term extremes by a short term analysis. Transactions - Society of Naval Architects and Marine
Engineers SMTC-067-2008.
Heffernan, J.E., and J.A. Tawn. 2004. A conditional approach for multivariate extremes (with
discussion). Journal of the Royal Statistical Society, Series B 66 (2): 497–546.
Henrici, P. 1977. Applied and Computational Complex Analysis. vol. II. New York: John Wiley &
Sons, Inc.
Hosking, J.R.M., and J.R. Wallis. 1987. Parameter and quantile estimation for the generalized
Pareto distribution. Technometrics 29: 339–349.
Hosking, J.R.M., and J.R. Wallis. 1997. Regional Frequency Analysis: An Approach Based on
L-Moments. London: Cambridge University Press.
Hosking, J.R.M., J.R. Wallis, and E.F. Wood. 1985. Estimation of the generalized extreme-value
distribution by the method of probability-weighted moments. Technometrics 27: 251–261.
Hougaard, P. 1986. A class of multivariate failure time distributions. Biometrika 73 (3): 671–678.
Hsing, T. 1987. On the characterization of certain point processes. Stochastic Processes and
Applications 26: 297–316.
Hsing, T. 1991. Estimating the parameters of rare events. Stochastic Processes and Applications
37: 117–139.
Jahns, H.O., and J.D. Wheeler. 1972. Long term wave probabilities based on hindcasting of severe
storms. In Proceedings of the Offshore Technology Conference, Number OTC 1590. Houston.
Johnson, N.L., and S. Kotz. 1970. Distributions in Statistics: Univariate Continuous Distributions
- 2. New York: John Wiley & Sons.
Kac, M., and A.J.F. Siegert. 1947. On the theory of noise in radio receivers with square law
detectors. Journal of Applied Physics 18: 383–397.
Karpa, O. 2012. ACER User Guide and Program. NTNU: Freely available at the internet address:
[Link]
Karpa, O. 2014. ACER 2D Matlab Programs. NTNU: Freely available at the internet address:
[Link]
Karpa, O., and A. Naess. 2013. Extreme value statistics of wind speed data by the ACER method.
Journal of Wind Engineering and Industrial Aerodynamics 112: 1–10.
Karunakaran, D., S. Haver, M. Baerheim, and N. Spidsoe. 2001. Dynamic behaviour of the
Kvitebjørn jacket in the North Sea. In Proceedings of the 20th International Conference on
Offshore Mechanics and Arctic Engineering, OMAE2001/OFT1184. Rio de Janeiro: ASME.
Krogstad, H.E. 1985. Height and period distributions of extreme waves. Applied Ocean Research
7 (3): 158–165.
Krogstad, H.E., J. Liu, H. Socquet-Juglard, K.B. Dysthe, and K. Trulsen. 2004. Spatial extreme
value analysis of nonlinear simulations of random surface waves. In Proceedings 23rd
International Conference on Offshore Mechanics and Arctic Engineering, OMAE–2004–
51336. New York: ASME.
Langley, R.S. 1984. The statistics of second order wave forces. Applied Ocean Research 6 (4):
182–186.
Larrabee, R.D., and C.A. Cornell. 1981. Combination of various load processes. Journal of the
Structural Division, ASCE 107: 223–239.
Leadbetter, M.R. 1983. Extremes and local dependence in stationary sequences. Zeitschrift für
Wahrscheinlichkeitstheorie und verwandte Gebiete 65: 291–306.
Leadbetter, M.R. 1995. On high-level exceedance modeling and tail-inference. Journal of
Statistical Planning and Inference 45: 247–260.
Leadbetter, M.R., G. Lindgren, and H. Rootzén. 1983. Extremes and Related Properties of Random
Sequences and Processes. New York: Springer-Verlag.
Ledford, A.W., and J.A. Tawn. 1996. Statistics for near independence in multivariate extreme
values. Biometrika 83: 169–187.
Madsen, H.O., S. Krenk, and N.C. Lind. 1986. Methods of Structural Safety. New Jersey: Prentice-
Hall Inc.
Maes, M., and K. Breitung. 1997. Direct approximation of the extreme value distribution of non-
homogeneous Gaussian random fields. Journal of Offshore Mechanics and Arctic Engineering
119: 252–256.
Makkonen, L. 2006. Plotting positions in extreme value analysis. Journal of Applied Meteorology
and Climatology 45 (2): 334–340.
Makkonen, L. 2008. Problems in the extreme value analysis. Structural Safety 30: 405–419.
MATLAB. 2009. version [Link] (R2009b). Natick: The MathWorks Inc.
McKay, M.D., W.J. Conover, and R.J. Beckman. 1979. A comparison of three methods for
selecting values of input variables in the analysis of output from a computer code. Technometrics 21:
239–245.
McNeil, A.J., and R. Frey. 2000. Estimation of tail-related risk measures for heteroscedastic
financial time series: An extreme value approach. Journal of Empirical Finance 7 (3–4): 271–
300.
Melchers, R.E. 1999. Structural Reliability Analysis and Prediction. 2nd ed. West Sussex: John
Wiley & Sons. ISBN 0471987719.
Moan, T., Z. Gao, and E. Ayala-Uraga. 2005. Uncertainty of wave-induced response of marine
structures due to long term variation of extratropical wave conditions. Journal of Marine
Structures 18 (4): 359–382.
Montgomery, D.C., E.A. Peck, and G.G. Vining. 2002. Introduction to Linear Regression Analysis.
Amsterdam: Elsevier Science Publishers B. V.
Murakami, Y. 2002. Metal Fatigue: Effects of Small Defects and Nonmetallic Inclusions. 2nd ed.
London: Academic Press.
Naess, A. 1983. Prediction of extremes of Morison type loading - an example of a general method.
Ocean Engineering 10 (5): 313–324.
Naess, A. 1984. On the long-term statistics of extremes. Applied Ocean Research 6 (4): 227–228.
Naess, A. 1985a. The joint crossing frequency of stochastic processes and its application to wave
theory. Applied Ocean Research 7 (1): 35–50.
Naess, A. 1985b. Statistical analysis of second-order response of marine structures. Journal of Ship
Research 29 (4): 270–284.
Naess, A. 1986. The statistical distribution of second-order slowly-varying forces and motions.
Applied Ocean Research 8 (2): 110–118.
Naess, A. 1987. The response statistics of non-linear second-order transformations to Gaussian
loads. Journal of Sound and Vibration 115 (1): 103–129.
Naess, A. 1990a. Approximate first-passage and extremes of narrow-band Gaussian and non-
Gaussian random vibrations. Journal of Sound and Vibration 138 (3): 365–380.
Naess, A. 1990b. Statistical analysis of nonlinear, second-order forces and motions of offshore
structures in short-crested random seas. Probabilistic Engineering Mechanics 5 (4): 192–203.
Naess, A. 1998a. Estimation of long return period design values for wind speeds. Journal of
Engineering Mechanics, ASCE 124 (3): 252–259.
Naess, A. 1998b. Statistical extrapolation of extreme value data based on the peaks over threshold
method. Journal of Offshore Mechanics and Arctic Engineering, ASME 120: 91–96.
Naess, A. 1999. Extreme response of nonlinear structures with low damping subjected to stochastic
loading. Journal of Offshore Mechanics and Arctic Engineering, ASME 121: 255–260.
Naess, A. 2000a. Characteristic functions related to quadratic transformations of Gaussian
processes. Technical Report R-1-00, Department of Structural Engineering, Norwegian University
of Science and Technology, Trondheim.
Naess, A. 2000b. Crossing rate statistics of quadratic transformations of Gaussian processes.
Technical Report R-2-00, Department of Structural Engineering, Norwegian University of
Science and Technology, Trondheim.
Naess, A. 2010. Estimation of extreme values of time series with heavy tails. Preprint Statistics
No. 14/2010, Department of Mathematical Sciences, Norwegian University of Science and
Technology, Trondheim.
Naess, A. 2011. A note on the bivariate ACER method. Preprint Statistics No. 01/2011, Department
of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim.
Naess, A., and O. Batsevych. 2010. Space-time extreme value statistics of a Gaussian random field.
Probabilistic Engineering Mechanics 25 (5): 372–379.
Naess, A., and O. Batsevych. 2012. Space-time extreme value statistics of non-Gaussian random
fields. Probabilistic Engineering Mechanics 28: 169–175.
Naess, A., and P.H. Clausen. 1999. Statistical extrapolation and the peaks-over-threshold method.
In Proceedings 18th International Conference on Offshore Mechanics and Arctic
Engineering, OMAE–99–6422. New York: ASME.
Naess, A., and P.H. Clausen. 2001. Combination of peaks-over-threshold and bootstrapping
methods for extreme value prediction. Structural Safety 23: 315–330.
Naess, A., and O. Gaidai. 2008. Monte Carlo methods for estimating the extreme response of
dynamical systems. Journal of Engineering Mechanics, ASCE 134 (8): 628–636.
Naess, A., and O. Gaidai. 2009. Estimation of extreme values from sampled time series. Structural
Safety 31: 325–334.
Naess, A., and J.M. Johnsen. 1992. An efficient numerical method for calculating the statistical
distribution of combined first-order and wave-drift response. Journal of Offshore Mechanics
and Arctic Engineering 114 (3): 195–204.
Naess, A., and J.M. Johnsen. 1993. Response statistics of nonlinear, compliant offshore structures
by the path integral solution method. Probabilistic Engineering Mechanics 8 (2): 91–106.
Naess, A., and H.C. Karlsen. 2004. Numerical calculation of the level crossing rate of second order
stochastic Volterra systems. Probabilistic Engineering Mechanics 19 (2): 155–160.
Naess, A., and O. Karpa. 2015a. Statistics of bivariate extreme wind speeds by the ACER method.
Journal of Wind Engineering and Industrial Aerodynamics 139: 82–88.
Naess, A., and O. Karpa. 2015b. Statistics of extreme wind speeds and wave heights by the bivariate
ACER method. Journal of Offshore Mechanics and Arctic Engineering, ASME 137: 021602–
1–7.
Naess, A., and U. Machado. 2000. Response statistics of large compliant offshore structures.
In Proceedings 8th ASCE Specialty Conference on Probabilistic Mechanics and Structural
Reliability. New York: ASCE.
Naess, A., and T. Moan. 2013. Stochastic Dynamics of Marine Structures. New York: Cambridge
University Press.
Naess, A., and J.O. Royset. 2000. Extensions of Turkstra’s rule and their application to combination
of dependent load effects. Structural Safety 22: 129–143.
Naess, A., H.C. Karlsen, and P.S. Teigen. 2006. Numerical methods for calculating the crossing
rate of high and extreme response levels of compliant offshore structures subjected to random
waves. Applied Ocean Research 28 (1): 1–8.
Naess, A., O. Gaidai, and S. Haver. 2007. Efficient estimation of extreme response of drag
dominated offshore structures by Monte Carlo simulation. Ocean Engineering 34 (16): 2188–
2197.
Naess, A., O. Gaidai, and P.S. Teigen. 2007. Extreme response prediction for nonlinear floating
offshore structures by Monte Carlo simulation. Applied Ocean Research 29 (4): 221–230.
Naess, A., O. Gaidai, and O. Karpa. 2013. Estimation of extreme values by the average conditional
exceedance rate method. Journal of Probability and Statistics. [Link]
797014.
Neal, E. 1974. Second order hydrodynamic forces due to stochastic excitation. In Proceedings 10th
ONR Symposium, Cambridge.
Nelsen, R.B. 2006. An Introduction to Copulas. Springer Series in Statistics. New York: Springer,
NY.
Newland, D.E. 1991. An Introduction to Random Vibrations and Spectral Analysis. 2nd ed.
London: Longman.
Norwegian Meteorological Institute. 2012. Climate Data Web Services. [Link]
wsKlima/start/start_en.html. Accessed 29 Dec 2011.
Numerical Algorithms Group. 2010. NAG Toolbox for Matlab. Oxford: NAG Ltd.
Palutikof, J.P., B.B. Brabson, D.H. Lister, and S.T. Adcock. 1999. A review of methods to calculate
extreme wind speeds. Meteorological Applications 6: 119–132.
Perrin, O., H. Rootzen, and R. Taesler. 2006. A discussion of statistical methods used to estimate
extreme wind speeds. Theoretical and Applied Climatology 85 (3–4): 203–215.
Pickands, J. 1981. Multivariate extreme value distributions. Bulletin of the International Statistical
Institute 49: 859–878.
Piterbarg, V.I. 1996. Asymptotic Methods in the Theory of Gaussian Processes and Fields.
Providence: American Mathematical Society.
Pugh, D.J., and J.M. Vassie. 1980. Applications of the joint probability method for extreme sea-
level computations. In Proceedings of the Institution of Civil Engineers, Part 2(69), 959–975.
Reiss, R.-D., and M. Thomas. 2007. Statistical Analysis of Extreme Values. 3rd ed. Basel:
Birkhäuser.
Rice, S.O. 1954. Mathematical analysis of random noise. In Selected Papers on Noise and
Stochastic Processes, ed. N. Wax, 133–294. New York: Dover Publications, Inc.
Robert, C.Y. 2009. Inference for the limiting cluster size distribution of extreme values. Annals of
Statistics 37 (1): 271–310.
Robinson, M.E., and J.A. Tawn. 2000. Extremal analysis of processes sampled at different
frequencies. Journal of the Royal Statistical Society, B 62 (1): 117–136.
Sagrilo, L.V.S., Z. Gao, A. Naess, and E.C.P. Lima. 2011. A straightforward approach for using
single time domain simulations to assess characteristic extreme responses. Ocean Engineering
38: 1464–1471.
Sarpkaya, T., and M. Isaacson. 1981. Mechanics of Wave Forces on Offshore Structures. New York:
Van Nostrand Reinhold.
Schall, G., M.H. Faber, and R. Rackwitz. 1991. The ergodicity assumption for sea states in
the reliability estimation of offshore structures. Journal of Offshore Mechanics and Arctic
Engineering, ASME 113 (3): 241–246.
Schetzen, M. 1980. The Volterra and Wiener Theories of Nonlinear Systems. New York: John Wiley
& Sons, Inc.
Segers, J. 2005. Approximate distributions of clusters of extremes. Statistics and Probability
Letters 74: 330–336.
Shinozuka, M. 1974. Digital simulation of random processes in engineering mechanics with the
aid of FFT techniques. In Stochastic Problems in Mechanics, 277–286. Waterloo: University of
Waterloo Press.
Shinozuka, M., and G. Deodatis. 1991. Simulation of stochastic processes by spectral representation.
Applied Mechanics Reviews 44 (4): 191–203.
Shinozuka, M., and C.-M. Jan. 1972. Digital simulation of random processes and its application.
Journal of Sound and Vibration 25 (1): 111–128.
Sinsabvarodom, C., A. Naess, B.J. Leira, and W. Chai. 2022. Extreme value estimation of Beaufort
sea ice dynamics driven by global wind effects. China Ocean Engineering. [Link]
1007/s13344-022-0046-3.
Skjong, M., A. Naess, and O.B. Naess. 2013. Statistics of extreme sea levels for locations along
the Norwegian coast. Journal of Coastal Research 29 (5): 1029–1048.
Sklar, A. 1959. Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut
de statistique de l’Université de Paris 8: 229–231.
Smith, R.L. 1992. The extremal index for a Markov chain. Journal of Applied Probability 29:
37–45.
Smith, R.L., J.A. Tawn, and S.G. Coles. 1997. Markov chain models for threshold exceedances.
Biometrika 84 (2): 249–268.
Socquet-Juglard, H., K. Dysthe, K. Trulsen, H.E. Krogstad, and J. Liu. 2005. Probability
distributions of surface gravity waves during spectral changes. Journal of Fluid Mechanics
542 (Nov 2005): 195–216.
Sun, J. 1993. Tail probabilities of the maxima of Gaussian random fields. Annals of Probability 21
(1): 34–71.
Swider, D.J., and C. Weber. 2007. Extended ARMA models for estimating price developments on
day-ahead electricity markets. Electric Power Systems Research 77 (5): 583–593.
Tawn, J.A. 1988. Bivariate extreme value theory: Models and estimation. Biometrika 75 (3): 397–
415.
Tawn, J.A. 1990. Discussion of paper by A. C. Davison and R. L. Smith. Journal of the Royal
Statistical Society, B 52 (3): 393–442.
Tawn, J.A. 1992. Estimating probabilities of extreme sea-levels. Applied Statistics 41 (1): 77–93.
Tawn, J.A., and J.M. Vassie. 1989. Extreme sea levels: The joint probabilities method revisited and
revised. In Proceedings of the Institution of Civil Engineers, Part 2(87), 429–442.
Teigen, P.S., and A. Naess. 1999. Stochastic response analysis of deepwater structures in short-
crested random waves. Journal of Offshore Mechanics and Arctic Engineering, ASME 121:
181–186.
Teigen, P.S., and A. Naess. 2003. Extreme response of floating structures in combined wind and
waves. Journal of Offshore Mechanics and Arctic Engineering, ASME 125: 87–93.
Tiago de Oliveira, J. 1982. Bivariate Extremes: Models and Statistical Decision. North Carolina:
Center for Stochastic Processes, University of North Carolina, Chapel Hill. Technical report
No. 14.
Tiago de Oliveira, J. 1984. Bivariate models for extremes; Statistical decision. In Statistical
Extremes and Applications, 131–153. Dordrecht: Springer.
Toffoli, A., E. Bitner-Gregersen, M. Onorato, and A. Babanin. 2008. Wave crest and trough
distributions in a broad-banded directional wave field. Ocean Engineering 35 (17–18): 1784–
1792.
Toro, G.R. 1984. Probabilistic Analysis of Combined Dynamic Responses. Report no. 65, Stanford
University, Palo Alto.
Tromans, P.S., and L. Vanderschuren. 1995. Response based design conditions in the North Sea:
Application of a new method. In Proceedings of the Offshore Technology Conference, Number
OTC 7683. Houston.
Turkstra, C.J. 1970. Theory of Structural Safety. Ontario: SM Study No. 2, Solid Mechanics
Division, University of Waterloo.
Vanmarcke, E.H. 1975. On the distribution of the first-passage time for normal stationary random
processes. Journal of Applied Mechanics 42: 215–220.
Vanmarcke, E.H. 1983. Random Fields: Analysis and Synthesis. Cambridge: The MIT Press.
Vinje, T. 1983. On the statistical distribution of second-order forces and motions. International
Shipbuilding Progress 30: 58–68.
Waal, D.J., and P.H.A.J.M. van Gelder. 2005. Modelling of extreme wave heights and periods
through copulas. Extremes 8: 345–356.
WAFO-group. 2000. WAFO - A Matlab Toolbox for Analysis of Random Waves and Loads - A
Tutorial. Lund: Math. Stat., Center for Math. Sci., Lund University.
WAMIT. 2008. WAMIT Inc. [Link].
Watson, G.S. 1954. Extreme values in samples from m-dependent stationary stochastic processes.
The Annals of Mathematical Statistics 25 (4): 798–800.
Wen, Y.-K. 1990. Structural Load Modeling and Combination for Performance and Safety
Evaluation. Amsterdam: Elsevier Science Publishers B.V.
Wen, Y.K., and H.T. Pearce. 1981. Recent developments in probabilistic load combinations. In
Proceedings on Probabilistic Methods in Structural Engineering, St. Louis: ASCE.
Weron, R. 2006. Modeling and Forecasting Electricity Loads and Prices: A Statistical Approach.
New York: John Wiley & Sons, Inc.
Winterstein, S.R. 1985. Non-normal responses and fatigue damage. Journal of Engineering
Mechanics, ASCE 111 (10): 1291–1295.
Winterstein, S.R. 1988. Nonlinear vibration models for extremes and fatigue. Journal of Engineering
Mechanics, ASCE 114: 1772–1790.
Winterstein, S.R., and C.A. MacKenzie. 2013. Extremes of nonlinear vibration: Comparing models
based on moments, L-moments, and maximum entropy. Journal of Offshore Mechanics and
Arctic Engineering 135 (2): 021602.
Winterstein, S., T. Ude, C.A. Cornell, P. Bjerager, and S. Haver. 1993. Environmental parameters
for extreme response: Inverse FORM with omission factors. In Proceedings 6th International
Conference on Structural Safety and Reliability (ICOSSAR’93). Innsbruck: Balkema.
Wong, E., and B. Hajek. 1985. Stochastic Processes in Engineering Systems. New York: Springer,
NY.
Xu, X., F. Wang, O. Gaidai, A. Naess, Y. Xing, and J. Wang. 2022. Bivariate statistics of floating
offshore wind turbine dynamic response under operational conditions. Ocean Engineering 257:
paper 111657.
Yu, S., W. Wu, B. Xie, S. Wang, and A. Naess. 2020. Extreme value prediction of current profiles
in the South China Sea based on EOFs and the ACER method. Applied Ocean Research 105:
paper 102408.
Yue, S. 2000. The Gumbel mixed model applied to storm frequency analysis. Water Resources
Management 14: 377–389.
Yue, S. 2001a. A bivariate extreme value distribution applied to flood frequency analysis. Nordic
Hydrology 32 (1): 49–64.
Yue, S. 2001b. The Gumbel logistic model for representing a multivariate storm event. Advances
in Water Resources 24 (2): 179–185.
Yue, S., and C.Y. Wang. 2004. A comparison of two bivariate extreme value distributions.
Stochastic Environmental Research and Risk Assessment 18: 61–66.
Yue, S., T. Ouarda, B. Bobée, P. Legendre, and P. Bruneau. 1999. The Gumbel mixed model for
flood frequency analysis. Journal of Hydrology 226 (1–2): 88–100.
Yun, S. 1998. The extremal index of a higher-order stationary Markov chain. Annals of Applied
Probability 8: 408–437.
Yun, S. 2000. The distribution of cluster functionals of extreme events in a d’th-order Markov
chain. Journal of Applied Probability 37: 29–44.
Zachary, S., G. Feld, G. Ward, and J. Wolfram. 1998. Multivariate extrapolation in the offshore
environment. Applied Ocean Research 20 (5): 273–295.
Zhang, X.-Y., Y.-G. Zhao, and Z.-H. Lu. 2019. Unified Hermite polynomial model and its
application in estimating non-Gaussian processes. Journal of Engineering Mechanics 145 (3):
04019001.
Index

A
ACER function, 64
    asymptotic GEV case, 72
    asymptotic Gumbel case, 69
    confidence interval, 67
    empirical estimation, 64
    modified, 65
    parameter estimation, 70, 74
ACERmanual, 75
ACER method, 246
    long-term analysis, 68
Annual maxima method, 227
AR-GARCH filtering, 98
AR-GARCH model, 97
AR process, 97
Average conditional exceedance rate, 64
Average crossing rate, 120
Average rate of level crossings, 29
Average upcrossing frequency, 32
Average upcrossing rate, 32
    Gaussian process, 35
    transformed process, 44

B
Biextremal model, 176
Bivariate ACER functions, 172
    empirical estimation, 174, 190
Bivariate copula, 175
Bivariate extreme value distributions, 170
    asymmetric logistic model, 176
    bivariate ACER surface, 176
    Gumbel logistic model, 176
    Gumbel mixed model, 176
    type A, 176
    type B, 176
    type C, 176
Block maxima method, 8
Bootstrapping, 15, 27
    nonparametric, 15, 27
    parametric, 15, 147

C
Characteristic function, 108, 122
Clumps of exceedances, 66
Combination of load effects, 151
    non-Gaussian, 152, 155
Componentwise extremes, 170
Confidence interval, 15, 27
Contour line, 55

D
Dirac’s delta function, 121
Distribution-free, 139
Distribution of peaks, 33
    Gaussian process, 35
Downcrossing, 32

E
Electricity market, 93
Electricity price data, 96
Electricity prices
    simulated time series, 95
Energy spectrum, 130
Estimator
    de Haan, 24
    maximum likelihood, 26
    moment, 25
Extremal types theorem, 6
    outline proof, 9
Extreme price changes, 93
Extreme response
    estimation, 141
Extreme value
    expected largest, 42
    Gaussian process, 40
    long-term, 51
    most probable, 42
    quantile, 41
Extreme value distribution, 39
    ACER method, 246
    annual maxima method, 227
    asymptotic limits, 5
    dependent data, 16
    domains of attraction, 10
    long-term, 48
    peaks-over-threshold method, 231
    revised joint probabilities method, 239
    upcrossing rate method, 38
Extreme value estimation
    Gumbel method, 141
    point process method, 142
Extreme value prediction
    measured wind speed data, 79
    narrow band process, 88
    synthetic data, 76
Extreme value theory
    classical, 16

F
Ferry Borges-Castanheta method, 151
First exceedance, 38
Fréchet distribution, 6

G
GARCH(1,1) process, 97
Gaussian random field
    11-dimensional Gaussian field, 207
    11-dimensional Gaussian sea, 208
    numerical examples, 206
    short-crested Gaussian sea, 211
Generalized extreme value distribution, 7
Generalized Pareto distribution, 20
GEV distribution, 7
    block maxima method, 8
    confidence interval by bootstrapping, 15
    maximum likelihood estimation, 12
    method of moments, 12
    model checking, 13
    parameter estimation, 11
GP distribution, 21
Gumbel distribution, 6, 44
Gumbel-Lieblein BLUE method, 83
Gumbel method, 141, 145, 147
Gumbel mixed model, 176

H
Heaviside function, 120
Heavy tails, 93
Hermite moment model, 45
Heteroscedasticity, 97

I
IFORM, 55
Inverse first order reliability method, 55

J
JONSWAP spectrum, 114

K
Kvitebjørn jacket platform, 145

L
Latin hypercube sampling (LHS), 140
Linear transfer function, 105
Load coincidence method, 151
Load effects
    combination, 151
Lognormal model, 53
Long-term extreme value distribution
    all peak values, 48
    all short-term extremes, 51
    long-term formulation, 52

M
Mean residual life plot, 23
Mean upcrossing frequency, 32
Mean upcrossing rate, 32, 107
    empirical estimation, 205
Monte Carlo simulation, 137
Moored deep floater, 113

N
Non-Gaussian random field
    numerical examples, 215
    second-order wave field, 215
    Student’s t random field, 220
Nonstationary process, 48
Nonstationary time series, 65
Nord Pool, 98
Nord Pool market, 93
Nord Pool spot, 93

O
Order statistics, 138

P
Parameter estimation
    ACER function, 70, 74
    GEV distribution, 11
    GP distribution, 24
Parseval’s formula, 120
Peak, 33
Peaks-over-threshold method, 19, 231
Peak value, 48
Pickands dependence function, 175
Piterbarg’s asymptotic formula, 199
Plotting position formula, 138
Point crossing approximation method, 151
Point process method, 142, 145, 147
Poisson process, 58
POT method, 19
    confidence interval by bootstrapping, 27
    de Haan estimator, 24
    maximum likelihood estimator, 26
    model validation, 26
    moment estimator, 25
    parameter estimation, 24
    threshold selection, 22
Power spectral density, 130
PP-plot, 139
Probability plot, 139

Q
QQ-plot, 139
Quadratic transfer function, 106
Quantile plot, 139

R
Rayleigh approximation, 49
Rayleigh-density, 37
Response process, 104
Return period, 23, 47
Revised joint probabilities method, 239
Rice formula, 32
RJP method, 239

S
Saddle point integration technique, 166
Sample statistics, 138
Scatter diagram, 49, 55
Seasonality, 93
Serial correlation, 93
Shape parameter, 2
Short-crested Gaussian sea, 211
Short-term condition, 48
Significant wave height, 48, 115
Simplified methods
    long-term, 54
    method of contour lines, 55
    method of equivalent storms, 54
Slow-drift response, 112
Spatial-temporal extremes
    Gaussian random fields, 196
    non-Gaussian random fields, 202
Spectral density, 130
Spectral peak period, 115
Spectral period, 48
Spectrum, 130
    variance, 130
Spot price data, 98
Square root of sum of squares (SRSS) method, 151
Stationary sea state, 114
Stochastic process
    Gaussian, 35
    realization, 128
    simulation, 127
Stochastic Volterra series, 103
Student’s t-distribution, 97
Surge response of a TLP, 160

T
Tail fitting
    conditional, 97, 98
    unconditional, 98
Time to first passage, 39
Turkstra’s rule, 151

U
Upcrossing, 30

V
Value-at-risk, 94
Variance spectrum, 130
    from realizations, 136
    units, 133
VaR metric, 94
Volatility clustering, 93
Volterra series, 103
Von Mises stress combination, 153
Von Mises yielding stress, 152

W
Water level
    surge-dominant, 226
    tide-dominant, 226
Water level measurements
    Heimsjø, 227, 229, 235, 244, 248
    Honningsvåg, 227, 230, 237, 245, 248
    Oslo, 227, 228, 233, 242, 246
Wave height measurements
    North Sea weather station, 182
Wave process
    realization, 133
Wave spectrum
    JONSWAP, 53
Weibull distribution, 6
Wiener-Khintchine relations, 130
Wind speed measurements
    Nordøyan Fyr, 177
    North Sea weather station, 182
    Obrestad Fyr, 79
    Sula, 79, 177
    Torsvåg Fyr, 79

Z
Zero-upcrossing, 33

Common questions

The Gumbel distribution models the maximum (or minimum) of a set of independent, identically distributed random variables, especially under the assumption of an unbounded upper tail. It is often used to approximate the statistical behavior of rare events in stochastic processes. Once its parameters have been estimated, it gives the probability of extreme occurrences and the corresponding return levels, which makes it central to extreme value analysis.
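As a minimal sketch of how return levels follow from a fitted Gumbel distribution, the Python snippet below estimates the location and scale parameters by the method of moments and inverts the distribution function F(x) = exp(-exp(-(x - mu)/beta)). The function names and the sample of annual maxima are hypothetical and chosen only for illustration. The method of moments is one of several possible estimators and is not necessarily the one a given analysis would use.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_cdf(x, mu, beta):
    """Gumbel distribution function for maxima: exp(-exp(-(x - mu) / beta))."""
    return math.exp(-math.exp(-(x - mu) / beta))


def gumbel_moment_fit(sample):
    """Method-of-moments estimates (mu, beta) from a sample of block maxima."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    beta = math.sqrt(6.0 * var) / math.pi  # Gumbel variance is pi^2 beta^2 / 6
    mu = mean - EULER_GAMMA * beta         # Gumbel mean is mu + gamma * beta
    return mu, beta


def gumbel_return_level(T, mu, beta):
    """Level x_T with F(x_T) = 1 - 1/T, exceeded on average once per T blocks."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / T))


# Hypothetical annual maxima, for illustration only.
annual_maxima = [31.2, 28.7, 35.1, 30.4, 33.8, 29.9, 36.5, 32.0]
mu_hat, beta_hat = gumbel_moment_fit(annual_maxima)
x100 = gumbel_return_level(100, mu_hat, beta_hat)
print(f"mu = {mu_hat:.2f}, beta = {beta_hat:.2f}, 100-block level = {x100:.2f}")
```

With only a handful of maxima the estimated 100-block level carries large uncertainty, so in practice it would be reported together with a confidence interval.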

Approximate representations within the conditional probability framework facilitate the estimation of bivariate distributions by breaking down complex relationships into manageable components, representing them with systematic functions that account for dependencies. This methodology enhances our ability to model multivariate extreme events effectively, allowing for more accurate predictions and understanding of joint occurrence probabilities.

For a narrow-banded process X(t), the density of the peaks can be approximated by a Rayleigh distribution, given by the peak density function f_Xp(a). This density depends only on the variance σ_X² and assumes that the peaks are approximately independent and that the process is Gaussian, as is typical of a lightly damped response.
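The standard Rayleigh form of the peak density is f(a) = (a/σ²) exp(-a²/(2σ²)) for a ≥ 0, with exceedance probability exp(-a²/(2σ²)). The sketch below, with hypothetical function names, evaluates both and checks numerically that the density integrates to one. It illustrates the textbook Rayleigh formula and is not code taken from the book.

```python
import math


def rayleigh_peak_pdf(a, sigma):
    """Rayleigh density of peak values for a narrow-band Gaussian process."""
    if a < 0:
        return 0.0
    return (a / sigma ** 2) * math.exp(-a ** 2 / (2.0 * sigma ** 2))


def rayleigh_peak_exceedance(a, sigma):
    """Probability that a peak exceeds level a."""
    if a < 0:
        return 1.0
    return math.exp(-a ** 2 / (2.0 * sigma ** 2))


def trapezoid(f, lo, hi, steps=10000):
    """Plain trapezoidal rule, sufficient for this smooth integrand."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


sigma_x = 1.5  # hypothetical standard deviation of the process
total_mass = trapezoid(lambda a: rayleigh_peak_pdf(a, sigma_x), 0.0, 10.0 * sigma_x)
below_2sigma = trapezoid(lambda a: rayleigh_peak_pdf(a, sigma_x), 0.0, 2.0 * sigma_x)
print(f"integral of density: {total_mass:.6f}")
print(f"P(peak > 2 sigma):   {rayleigh_peak_exceedance(2.0 * sigma_x, sigma_x):.6f}")
```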

The ACER method is preferred for long return period estimates because it can account for dependence in the data and yields narrower confidence intervals, which indicates higher precision in the estimates. This is particularly advantageous in hydrology, where the accurate prediction of rare, severe events is crucial for risk management and infrastructure planning.
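To make the idea of an average conditional exceedance rate concrete, the sketch below counts, for a level eta and order k, how often an observation exceeds eta given that the k - 1 preceding observations did not. The function names are hypothetical. The estimator follows the ratio-of-counts form commonly given in the ACER literature, and the extreme value approximation exp(-(N - k + 1) ε_k(η)) is likewise taken from that literature. Neither is claimed to reproduce the book's software.

```python
import math


def acer_empirical(x, eta, k):
    """Empirical ACER of order k at level eta for a stationary series x.

    Counts exceedances of eta that follow k - 1 consecutive non-exceedances,
    divided by the number of times k - 1 consecutive non-exceedances occur.
    For k = 1 this reduces to the plain fraction of exceedances.
    """
    n = len(x)
    if k < 1 or k > n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    exceed = 0
    conditioned = 0
    for j in range(k - 1, n):
        # The slice holds the k - 1 values immediately before x[j].
        if all(v <= eta for v in x[j - k + 1:j]):
            conditioned += 1
            if x[j] > eta:
                exceed += 1
    return exceed / conditioned if conditioned else float("nan")


def max_cdf_from_acer(eps_k, n, k):
    """Approximate P(max of n values <= eta) as exp(-(n - k + 1) * eps_k)."""
    return math.exp(-(n - k + 1) * eps_k)


series = [1, 5, 1, 5, 1]  # toy data, for illustration only
print(acer_empirical(series, 3, 1))
print(acer_empirical(series, 3, 2))
```

In practice the empirical values are computed over a range of levels and several orders k, and the choice of k is guided by where the curves for successive k coincide in the tail.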

Choosing a high threshold is the central step in the Peaks-Over-Threshold (POT) method, because every observation exceeding the threshold is treated as an extreme event. Compared with the block maxima approach, this avoids discarding potentially useful data by restricting the analysis to one extreme per block. The threshold defines which observations count as extremes, and their excesses are then modeled with the Generalized Pareto distribution in place of the Generalized Extreme Value distribution.
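The POT steps can be sketched as follows: pick a threshold u, collect the excesses over u, fit a Generalized Pareto distribution with shape xi and scale sigma, and compute a return level. The snippet uses simple moment estimators, valid only when xi < 1/2, and the usual return level formula u + (sigma/xi)((m ζ_u)^xi - 1), where ζ_u is the exceedance fraction. All function names are hypothetical, and the stand-in data are deterministic exponential quantiles rather than measurements.

```python
import math


def excesses_over(data, u):
    """Excesses x - u of all observations strictly above the threshold u."""
    return [x - u for x in data if x > u]


def gpd_moment_fit(excesses):
    """Moment estimates (xi, sigma) of the Generalized Pareto distribution.

    Uses mean = sigma / (1 - xi) and var = sigma^2 / ((1 - xi)^2 (1 - 2 xi)),
    which requires xi < 1/2 for the variance to exist.
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((y - mean) ** 2 for y in excesses) / (n - 1)
    ratio = mean * mean / var
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (1.0 + ratio)
    return xi, sigma


def gpd_return_level(m, u, xi, sigma, rate):
    """Level exceeded on average once every m observations; rate = P(X > u)."""
    if abs(xi) < 1e-9:
        return u + sigma * math.log(m * rate)  # exponential-tail limit
    return u + (sigma / xi) * ((m * rate) ** xi - 1.0)


# Stand-in data: quantiles of a unit exponential distribution (illustration only).
n_obs = 20000
data = [-math.log(1.0 - (i + 0.5) / n_obs) for i in range(n_obs)]
u = 1.0
exc = excesses_over(data, u)
xi_hat, sigma_hat = gpd_moment_fit(exc)
rate_hat = len(exc) / n_obs
level = gpd_return_level(10 * n_obs, u, xi_hat, sigma_hat, rate_hat)
print(f"xi = {xi_hat:.3f}, sigma = {sigma_hat:.3f}, return level = {level:.2f}")
```

For exponential data the excesses over any threshold are again exponential, so the fitted shape should be close to zero and the scale close to one, which gives a quick sanity check on the estimator.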

In sea level studies, non-stochastic tidal components can dominate the observed data, complicating the estimation of extreme value distributions. These components can cause significant variance in the extreme value estimates if not correctly accounted for, as they introduce dependencies and patterns not present in purely random processes. Consequently, models must be adapted to incorporate these deterministic influences for accurate estimation.

The ACER method affects model comparisons by providing narrower confidence intervals, suggesting potentially higher estimation accuracy than methods such as the Gumbel and POT methods. It accounts for temporal and spatial dependence in the data, which helps in comparing and validating the robustness of different models, especially over long return periods where the other methods may diverge more substantially.

The 'domain of attraction' in extreme value theory refers to the set of distribution functions whose suitably normalized sample maxima converge to a specific extreme value distribution family as the sample size grows. In the context of the Peaks-Over-Threshold method, the assumption is that the data exceeding the high threshold stem from a distribution within such a domain of attraction, which justifies using the Generalized Pareto distribution to model the exceedances.

The choice of sampling frequency affects the extent to which a time series is representative of the continuous realization of a stochastic process. Proper sampling is key to ensuring the stored time series captures the essential characteristics of the process, allowing for accurate analyses, such as peak value extraction and extremal analysis. Incorrect sampling may lead to a loss of significant information or misrepresentation of the process.

The k-th order bivariate ACER function, E_k(ξ, η), represents the average conditional exceedance rate and captures the temporal and spatial dependence structure of the bivariate time series. It is crucial for estimating the distribution as it accounts for both simultaneous and non-simultaneous exceedances, thus providing a more compact representation of the bivariate process.
