
“OpenQuake: Calculate, share, explore”

Hazard Modeller’s
Toolkit - User Guide
Copyright © 2014 GEM Foundation

Published by GEM Foundation

globalquakemodel.org/openquake

Citation
Please cite this document as:
Weatherill, G. A. (2014) OpenQuake Hazard Modeller’s Toolkit - User Guide. Global Earthquake
Model (GEM). Technical Report

Disclaimer
The “Hazard Modeller’s Toolkit - User Guide” is distributed in the hope that it will be useful,
but without any warranty: without even the implied warranty of merchantability or fitness for a
particular purpose. While every precaution has been taken in the preparation of this document,
in no event shall the authors of the manual and the GEM Foundation be liable to any party for
direct, indirect, special, incidental, or consequential damages, including lost profits, arising out
of the use of information contained in this document or from the use of programs and source
code that may accompany it, even if the authors and GEM Foundation have been advised of
the possibility of such damage. The Book provided hereunder is on an "as is" basis, and the
authors and GEM Foundation have no obligations to provide maintenance, support, updates,
enhancements, or modifications.
The current version of the book has been revised only by members of the GEM Model Facility
and it must be considered a draft copy.

License
This Book is distributed under the Creative Commons License Attribution-NonCommercial-NoDerivs
3.0 Unported (CC BY-NC-ND 3.0) (see link below). You can download this Book and
share it with others as long as you provide proper credit, but you cannot change it in any way or
use it commercially.

First printing, June 2014


Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1 The Development Process 5
1.2 Getting Started and Running the Software 6
1.2.1 Current Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.2 About this Tutorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.3 Visualisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Catalogue Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1 The Earthquake Catalogue 15
2.1.1 The Catalogue Format and Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.2 The “Selector” Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Declustering 22
2.2.1 Gardner and Knopoff (1974) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.2 AFTERAN (Musson, 1999b) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3 Completeness 25
2.3.1 Stepp, 1971 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Recurrence Models 28
2.4.1 Aki (1965) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.2 Maximum Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.3 Kijko and Smit (2012) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.4 Weichert (1980) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Maximum Magnitude 30
2.5.1 Kijko (2004) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5.2 Cumulative Moment (Makropoulos and Burton, 1983) . . . . . . . . . . . . . . . . . 32
2.6 Smoothed Seismicity 34
2.6.1 Frankel (1995) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.6.2 Implementing the Smoothed Seismicity Analysis . . . . . . . . . . . . . . . . . . . . . . 34
3 Hazard Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1 Source Model and Hazard Tools 37
3.1.1 The Source Model Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1.2 The Source Model Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Hazard Calculation Tools 44

4 Geology Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.1 Fault Recurrence from Geology 47
4.1.1 Epistemic Uncertainties in the Fault Modelling . . . . . . . . . . . . . . . . . . . . . . . 47
4.1.2 Tectonic Regionalisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.3 Definition of the Fault Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.4 Fault Recurrence Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.1.5 Running a Recurrence Calculation from Geology . . . . . . . . . . . . . . . . . . . . 59

5 Geodetic Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1 Recurrence from Geodetic Strain 65
5.1.1 The Recurrence Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1.2 Running a Strain Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Books 69
Articles 69
Reports 70

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

I Appendices 75

A The 10 Minute Guide to Python! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77


A.1 Basic Data Types 77
A.1.1 Scalar Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
A.1.2 Iterables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
A.1.3 Dictionaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
A.1.4 Loops and Logicals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
A.2 Functions 82
A.3 Classes and Inheritance 82
A.3.1 Simple Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
A.3.2 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
A.3.3 Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
A.4 Numpy/Scipy 85

1. Introduction

The Hazard Modeller’s Toolkit (or HMTK) is a Python 2.7 library of functions written by
scientists at the GEM Model Facility, which is intended to provide scientists and engineers with
the tools to help create the seismogenic input models that go into the OpenQuake hazard engine.
The process of building the hazard model is a complex and often challenging one, and whilst
many aspects of the practice are relatively common, the choice of certain methods or tools for
undertaking each step can be a matter of judgement. The intention of this software is to provide
scientists and engineers with the means to apply many of the most commonly used algorithms
for preparing seismogenic source models using seismicity and geological data. In forthcoming
versions we hope to make available more tools for the current processes indicated here, and to
integrate new functionalities for i) merging and homogenisation of earthquake catalogues, ii)
calculation of activity rates from geological and geodetic data, iii) testing and interpretation of
Ground Motion Prediction Equations, and iv) integration of seismological and geological data
and treatment of uncertainty in the construction of seismogenic source zones.

1.1 The Development Process


The Hazard Modeller’s Toolkit has been under development within GEM through several
stages. The present decision to make the modelling tools available as a library reflects the
general trend in the OpenQuake development process toward having a modular software
framework. This means that the modelling - hazard - risk process is separated into libraries
(e.g. oq-hazardlib, oq-risklib) that can be utilised as standalone tools, in addition to being
integrated within the OpenQuake engine and platform. This is designed to allow for flexibility
in the process, and also to allow the user to begin to utilise (possibly in other contexts)
functions and classes that are intended to address particular stages of the calculation. Such an
approach ensures that each sub-component of the toolkit is fully tested, with a minimal degree of
duplication in the testing process. In the HMTK this is taken a step further, as we are aiming
to provide the hazard modeller with as much control over the modelling process as possible, whilst
retaining as complete a level of code testing as is practical to implement given the development
resources available.
The change in the HMTK development process that leads towards the current version is
designed to address particular objectives:

Portability Reduction in the number of Python dependencies to allow for a greater degree of
cross-platform deployment than is currently feasible with the main OpenQuake engine.
Adaptability Cleaner separation of methods into self-contained components that can be imple-
mented and tested without requiring adaptation of the remainder of the code.
Abstraction This concept is often a critical component of object-oriented development. It de-
scribes the specification of a core behaviour of a method, which implementations (by
means of the subclass) must follow. For example, a declustering algorithm must follow a
common behaviour path, in this instance i) reading an earthquake catalogue and some
configurable parameters, ii) identifying the clusters of events, iii) identifying the main-
shocks from within each cluster, iv) returning this information to the user. The details of the
implementation are then dependent on the algorithm, providing that the core flow is met.
This is designed to allow the algorithms to be interchangeable, in the sense that different
methods for a particular task could be selected with no (or at least minimal) modification to
the rest of the code (see the sketch after this list).
Usability The creation of a library which could itself be embedded within larger applications
(e.g. as part of a graphical user interface).
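To make the abstraction concept concrete, the following is a minimal sketch (illustrative only,
not the HMTK’s actual base class) of how a declustering interface of this kind can be expressed
in Python 2.7:

import abc

import numpy as np


class BaseDeclusterer(object):
    """Abstract declustering interface: every algorithm must follow it."""
    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def decluster(self, catalogue, config):
        """Reads a catalogue and a config dictionary; must return the
        pair (cluster_index, cluster_flag) as numpy arrays."""


class AllMainshocksDeclusterer(BaseDeclusterer):
    """Trivial implementation: flags every event as a mainshock."""
    def decluster(self, catalogue, config):
        n_events = catalogue.get_number_events()
        return np.zeros(n_events, dtype=int), np.zeros(n_events, dtype=int)

Any class following this pattern could be swapped in wherever a declustering step is required,
without modification to the calling code.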

1.2 Getting Started and Running the Software


The Modeller’s Toolkit and associated software are designed for execution from the command
line. As with OpenQuake, the preferred environment is Ubuntu Linux (12.04 or later). A careful
effort has been made to keep the number of additional dependencies to a minimum. No packaged
version of the software has been released at the time of writing, so the user must install the
dependencies manually. More information regarding the current dependencies of the toolkit can
be found at
https://github.com/gem/oq-engine.
The current dependencies are:
• Numpy and Scipy (included in the standard OpenQuake installation)
• Shapely (included in the standard OpenQuake installation)
• Openquake nrmllib (included in the standard OpenQuake installation)
(https://github.com/gem/oq-nrmllib)
• Openquake hazardlib (included in the standard OpenQuake installation)
(https://github.com/gem/oq-hazardlib)
• Matplotlib (http://matplotlib.org/)
• PyYaml
• Python Decorator
If the OpenQuake Hazard Library (oq-hazardlib) and the OpenQuake Nrml library are already
installed on your system (as would be the case if a full installation of the OpenQuake-engine has
already been successfully installed) then both dependencies are already available. If this is not
the case, however, you will need to install the packages manually. For Linux and OSX users, the
recommended approach is to run:

~$ sudo pip install -e git+https://github.com/gem/oq-hazardlib.git#egg=oq-hazardlib

~$ sudo pip install -e git+https://github.com/gem/oq-nrmllib.git#egg=oq-nrmllib

For Windows users, the recommended approach would be to use Ubuntu Linux 12.04 within a
Virtual Machine; however, instructions for installing natively in Windows are given below.
The Matplotlib, PyYaml and Decorator dependencies are needed for the demos, and can be
installed easily from the command line by:

~$ sudo pip install matplotlib


~$ sudo pip install pyyaml
~$ sudo pip install decorator

To enable usage of the openquake.hmtk within any location in the operating system, OSX
and Linux users should add the folder location manually to the command line profile file. This
can be done as follows:
1. Using a command line text editor (e.g. VIM), open the ~/.profile file as follows:
~$ vim ~/.profile

2. At the bottom of the profile file (if one does not exist it will be created) add the line:
export PYTHONPATH=/path/to/openquake.hmtk/folder/:$PYTHONPATH

where /path/to/openquake.hmtk/folder/ is the system path to the location of the
openquake.hmtk folder (use the command pwd from within the openquake.hmtk folder to
view the full system path).
3. Re-source the bash shell via the command
~$ source ~/.profile
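To verify that the path has been set correctly (an illustrative check, assuming the package is
importable as openquake.hmtk as used throughout this manual), try importing the library from
any directory:

~$ python -c "import openquake.hmtk"

If the command returns without an ImportError, the toolkit is available system-wide.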

Windows Installation
Although this installation has been primarily tested in a Linux/Unix environment, it is possible to
install natively in Windows using the following process. This assumes that no other version of
Python is installed in your Windows environment.
The easiest way to install the OpenQuake hazardlib and nrmllib is by means of the PythonXY
program (http://code.google.com/p/pythonxy/). This is a free and open Python user interface,
which will bring in almost all of the dependencies the OpenQuake libraries need. An installer
for the latest version of PythonXY can be downloaded from here:
http://code.google.com/p/pythonxy/wiki/Downloads?tm=2.
Click on the executable and follow the instructions (the installation may take up to half an
hour or more, depending on the system). It is strongly recommended that the user opt for the
“FULL” installation, which should bring in almost all of the dependencies needed for installing
the OpenQuake hazard library.
For the oq-nrmllib it is necessary to install the Lxml library (http://lxml.de). This is not
officially supported for Windows, so it is recommended (by the Python developers themselves!) to
install the unofficial Lxml binding from here:1. Download and then run the 32-bit package
named “lxml-#.#.#.win32-py2.7.exe”.
Next you will need to install the oq-hazardlib and oq-nrmllib. From the web repositories
listed previously click the button “Download Zip”, then extract the contents to the folders
C:/oq-hazardlib and C:/oq-nrmllib respectively.
Now open up an enhanced IPython console. Go to Start -> Python(xy) -> Enhanced
Consoles -> IPython (sh). This will open up an IPython console terminal. To install the
oq-hazardlib, at the console prompt type:

~$: cd C:/oq-hazardlib/
~$: python setup.py build --compiler=mingw32 install

This will install the oq-hazardlib with the full C-extensions, which speed up some of the
geometry calculations. Then do the same with the oq-nrmllib.
1 http://www.lfd.uci.edu/ gohlke/pythonlibs/

~$: cd C:/oq-nrmllib/
~$: python setup.py install

Then close the console by typing exit.


Finally, download the zipped folder of the openquake.hmtk from the github repository and
unzip to a folder of your choosing. To allow for usage of the openquake.hmtk throughout your
operating system, do the following:
1. From the desktop, right-click My Computer and open Properties
2. In the “System Properties” window click on the Advanced tab.
3. From the “Advanced” section open the Environment Variables.
4. In the “Environment Variables” you will see a list of “System Variables”; select “Path” and
“Edit”.
5. Add the path to the openquake.hmtk directory to the list of folders then save.
After this process it may be necessary to restart PythonXY.

1.2.1 Current Features


The Hazard Modeller’s Toolkit is currently divided into three sections:
1. Earthquake Catalogue and Seismicity Analysis
These functions are intended to address the needs of defining seismic activity rate from
an earthquake catalogue. They include algorithms for identification of Non-Poissonian
events (declustering), analysis of catalogue completeness, calculation of activity rate
and b-value and, finally, estimation of maximum magnitude using statistical analyses of
the earthquake catalogue. Also included in these tools is an initial implementation of a
smoothed seismicity algorithm using the Frankel (1995) approach.
2. Active Faults Source Models from Geological Data
These functions are intended to address the modeller’s needs for defining earthquake activity
rates on fault sources from the geological slip rate, including support for some epistemic
uncertainty analysis on critical parameters in the process.
3. Seismic Source Models from Geodetic Data
These functions are intended to address the use of geodetic data to derive seismic activity
rates from a strain rate model for a region, implementing the Seismic Hazard Inferred
from Tectonics (SHIFT) methodology developed by Bird and Liu (2007) and applied on a
global scale by Bird et al. (2010).
A summary of the algorithms available in the present version is given in Table 1.1.

1.2.2 About this Tutorial


As previously indicated, the Modeller’s Toolkit itself is a Python library. This means that its
functions can be utilised in many different Python applications. It is not, at present, a stand-alone
software, and requires some investment of time from the user to understand the functionalities
and learn how to link the various tools together into a workflow that will be suitable for the
modelling problem at hand.
This manual is designed to explain the various functions in the toolkit and to provide some
illustrative examples showing how to implement them for particular contexts and applications.
The tutorial itself does not specifically require a working knowledge of Python. However,
an understanding of the basic Python data types, and ideally some familiarity with the use
of Python objects, is highly desirable. Users who are new to Python are recommended to
familiarise themselves with Appendix A of this tutorial. This provides a brief overview of
the Python programming language and should introduce concepts such as classes and
dictionaries, which will be encountered in due course.

Feature                  Algorithm
Seismicity
  Declustering           Gardner and Knopoff (1974)
                         AFTERAN (Musson, 1999a)
  Completeness           Stepp (1971)
  Recurrence             Maximum Likelihood (Aki, 1965)
                         Time-dependent MLE
                         Weichert (1980)
  Smoothed Seismicity    Frankel (1995)
Geology
  Recurrence             Anderson and Luco (1983) “Arbitrary”
                         Anderson and Luco (1983) “Area MMAX”
                         Characteristic (Truncated Gaussian)
                         Youngs and Coppersmith (1985) Exponential
                         Youngs and Coppersmith (1985) Characteristic
Geodetic Strain
  Recurrence             Seismic Hazard Inferred from Tectonics (SHIFT),
                         Bird et al. (2010) and Bird and Liu (2007)

Table 1.1 – Current algorithms in the HMTK

For more detail on the complete Python language, a comprehensive overview of its features and
usage can be found in the standard Python documentation (http://docs.python.org/2/tutorial/).
Where necessary, particular Python programming concepts will be explained in further detail.
The code snippets (indicated by verbatim text) can be executed from within an “Interactive
Python (IPython)” environment, or may form the basis for usage of the openquake.hmtk in other
Python scripts that the user may wish to construct themselves. If not already installed on
your system, IPython can be installed from the Python package repository by entering:

~$ sudo pip install ipython

An “interactive” session can then be opened by typing ipython at the command prompt. If
matplotlib is installed and you wish to use the plotting functionalities described herein then
you should open IPython with the command:

~$ ipython --pylab

To exit an IPython session at any time simply type exit.


For a more visual application of the openquake.hmtk the reader is encouraged to utilise the
“IPython Notebook” (http://ipython.org/notebook.html). This tool implements IPython
inside a web-browser environment, permitting the user to create and store real Python workflows
that can be retrieved and executed, whilst allowing for images and text to be embedded. A
screenshot of the openquake.hmtk used in an IPython Notebook environment is shown in Figure
1.1. From version 1.0 of IPython, the IPython Notebook comes installed by default. A notebook
session can be started via the command:

~$ ipython notebook --pylab inline



Figure 1.1 – Example of the openquake.hmtk embedded in an IPython Notebook

1.2.3 Visualisation
In addition to the scientific tools, which will be described in detail in due course, the open-
quake.hmtk also has a set of functionalities for visualisation of data and results pertinent to the
preparation of seismic hazard input models. Whilst not necessarily an essential component of the
openquake.hmtk, the usage of the plotting functions can facilitate model development. Particular
visualisation functions shall be referred to where relevant for the particular tool or data set.
Map Creation
Contained within the existing plotting tools is a set of functions to create maps of geospatial
data. To do this, the openquake.hmtk requires the Matplotlib "Basemap" add-on, which can also
be installed from the command line using the Python Package Manager (pip) for OSX/Linux
environments:
~$ sudo pip install basemap

Or, for Windows users, it can be downloaded and installed from
http://matplotlib.org/basemap/users/download.html.
To set up a simple basemap it is necessary to define the configuration of the plot (such as
spatial limits and coastline resolution). This is done as follows:
In [1]: from openquake.hmtk.plotting.mapping import HMTKBaseMap

In [2]: map_config = {"min_lon": 18.0,
                      "max_lon": 32.0,
                      "min_lat": 33.0,
                      "max_lat": 43.0,
                      "resolution": "h"}

In [3]: basemap1 = HMTKBaseMap(map_config, "Title of Map")

HMTKBaseMap is instantiated with a dictionary of configuration parameters: minimum
longitude (min_lon), maximum longitude (max_lon), minimum latitude (min_lat) and maximum
latitude (max_lat). The setting resolution determines the coastline resolution: ‘c’ (crude), ‘l’
(low), ‘i’ (intermediate), ‘h’ (high) and ‘f’ (full).
N.B. The resolution of the coastline boundary can be computationally demanding, depending
on the application. For very large-scale maps (e.g. continental or global) it is recommended to
use the ‘c’ or ‘l’ settings, whilst for maps of the order of 20° × 20° or smaller, the ‘h’ setting
can be used.
The class HMTKBaseMap contains a set of methods for mapping other data for model building:
• .add_catalogue(catalogue, overlay=False)
This function will overlay an earthquake catalogue onto the basemap. The input value
catalogue is the earthquake catalogue as an instance of the class
openquake.hmtk.seismicity.catalogue.Catalogue (see the next section for de-
tails), whilst the parameter overlay indicates whether the user expects to add another layer
on top of the map (True) or whether to close the image to overlaying (False), as would be
the case when the image is ready for exporting. An example map for the Aegean is shown
in Figure 1.2.
• .add_source_model(model, area_border=’k-’, border_width=’1.0’,
point_marker=’ks’, point_size=2.0, overlay=False)
This function adds a source model to the basemap. The input value model is an instance of
the class openquake.hmtk.sources.source_model.mtkSourceModel (see next sec-
tion). The remaining parameters control the plot settings: i) area_border (the colour
of the borders of an area source), ii) border_width (the line width of the area source
border), iii) point_marker (the marker style for a point source), iv) point_size (the
size of the point source). An example of a source model plot is shown in Figure 1.3.
• .add_colour_scaled_points(longitude, latitude, data, shape=’s’,
alpha=1.0, size=20, norm=None, overlay=False)
Overlays a set of data points with colour scaled according to the data value.
• .add_size_scaled_points(longitude, latitude, data, shape=’o’,
logplot=False, alpha=1.0, colour=’b’, smin=2.0, sscale=2.0,
overlay=False)
Overlays a set of data points with size scaled according to the data value.
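As a brief illustration of how these methods combine (a sketch assuming a catalogue and a
source model have already been loaded, as described in the following chapters):

>> basemap1 = HMTKBaseMap(map_config, "Aegean Example")
# Plot the source model first, keeping the map open for a further layer
>> basemap1.add_source_model(source_model, overlay=True)
# Add the catalogue on top, then close the image for exporting
>> basemap1.add_catalogue(catalogue, overlay=False)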

Figure 1.2 – Example visualisation of an Earthquake Catalogue

Figure 1.3 – Example visualisation of a source model with area sources (blue) and simple fault
sources (black)

2. Catalogue Tools

2.1 The Earthquake Catalogue


The seismicity tools are intended for use in deriving activity rates from an observed earthquake
catalogue, which may include both instrumental and historical seismicity. The tools are broken
down into five separate libraries: i) Declustering, ii) Completeness, iii) Calculation of Gutenberg-
Richter a- and b-value, iv) Statistical estimators of maximum magnitude from seismicity and
v) Smoothed Seismicity. In a common use case it is likely that many of the above methods,
particularly recurrence and maximum magnitude estimation, may need to be applied to a selected
sub-catalogue (e.g. earthquakes within a particular polygon). The toolkit allows for the creation
of a source model containing one or more of the supported OpenQuake seismogenic source
typologies, which can be used as a reference for selection, e.g. events within an area source
(polygon), events within a distance of a fault, etc. The supported input format for the catalogue
is described below, and the source models in the subsequent chapter.

2.1.1 The Catalogue Format and Class


The input catalogue must be formatted as a comma-separated value file (.csv), with the attributes
listed in Table 2.1 in the header line (attributes with an * indicate essential attributes), although
the order of the columns need not be fixed.
To load the catalogue using the IPython environment, in an open IPython session type:
>> from openquake.hmtk.parsers.catalogue import CsvCatalogueParser
>> catalogue_filename = 'path/to/catalogue_file.csv'
>> parser = CsvCatalogueParser(catalogue_filename)
>> catalogue = parser.read_file()

N.B. the csv file can contain additional attributes of the catalogue too and will be
parsed correctly; however, if the attribute is not one that is specifically recognised by the
catalogue class then a message will be displayed indicating:

Catalogue Attribute ... is not a recognised catalogue key

This is expected behaviour and simply indicates that although this data is given in the
input file, it is not retained in the data dictionary.

Attribute        Description
eventID*         A unique identifier (integer) for each earthquake in the catalogue
Agency           The code (string) of the recording agency for the event solution
year*            Year of event (integer) in the range -10000 to present
                 (events before common era (BCE) should have a negative value)
month*           Month of event (integer)
day*             Day of event (integer)
hour*            Hour of event (integer) - if unknown then set to 0
minute*          Minute of event (integer) - if unknown then set to 0
second*          Second of event (float) - if unknown then set to 0.0
timeError        Error in event time (float)
longitude*       Longitude of event, in decimal degrees (float)
latitude*        Latitude of event, in decimal degrees (float)
SemiMajor90      Length (km) of the semi-major axis of the 90% confidence
                 ellipsoid for location error (float)
SemiMinor90      Length (km) of the semi-minor axis of the 90% confidence
                 ellipsoid for location error (float)
ErrorStrike      Azimuth (in degrees) of the 90% confidence ellipsoid for
                 location error (float)
depth*           Depth (km) of earthquake (float)
depthError       Uncertainty (as standard deviation) in earthquake depth (km) (float)
magnitude*       Homogenised magnitude of the event (float) - typically Mw
sigmaMagnitude*  Uncertainty on the homogenised magnitude (float) - typically Mw

Table 2.1 – List of Attributes in the Earthquake Catalogue File (* Indicates Essential)

The variable catalogue is an instance of the class
openquake.hmtk.seismicity.catalogue.Catalogue, which now contains the catalogue itself
(as catalogue.data) and some methods that can be applied to the catalogue. The first attribute
(catalogue.data) is a dictionary where each attribute of the catalogue is either a 1-D numpy
vector (for float and integer values) or a Python list (for string values). For example, to return a
vector containing all the magnitudes in the magnitude column of the catalogue simply type:
>> catalogue.data['magnitude']
array([ 6.5, 6.5, 6. , ..., 4.8, 5.2, 4.1])
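Because each numeric attribute is a simple numpy vector, standard numpy operations can be
applied directly to the contents of catalogue.data. For example (an illustrative snippet):

>> import numpy as np
# Number of events with magnitude of 6.0 or greater
>> np.sum(catalogue.data['magnitude'] >= 6.0)
# Year of the largest event in the catalogue
>> catalogue.data['year'][np.argmax(catalogue.data['magnitude'])]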

The catalogue class contains several helpful methods (called via catalogue. ...):
• catalogue.get_number_events() Returns the number of events currently in the cata-
logue (integer)
• catalogue.load_to_array(keys) Returns a numpy array of floating data, with the
columns ordered according to the list of keys. If the key corresponds to a string item (e.g.
Agency) then an error will be raised.
>> catalogue.load_to_array(['year', 'longitude', 'latitude',
                            'depth', 'magnitude'])
array([[ 1910. , 26.941, 38.507, 13.2, 6.5],
       [ 1910. , 22.190, 37.720, 20.4, 6.5],
       [ 1910. , 28.881, 33.274, 25.0, 6.0],
       ...,
       [ 2009. , 20.054, 39.854, 20.2, 4.8],
       [ 2009. , 23.481, 38.050, 15.2, 5.2],
       [ 2009. , 28.959, 34.664, 18.4, 4.1]])

• catalogue.load_from_array(...) Creates the catalogue data dictionary
from an array and an ordered list of dictionary keys giving the header. This can be used in
the case where the earthquake catalogue is loaded in a simple ascii format. For example, if
the user wishes to load in a catalogue from the Zmap format, which gives the columns as:
longitude, latitude, year, month, day, magnitude, depth, hour,
minute, second
This file type could be parsed into a catalogue without the need of a specific parser, as
follows:
>> import numpy
# Assuming no headers in the file
# (set skip_header=1 if headers are found)
>> data = numpy.genfromtxt('PATH/TO/ZMAP_FILE.txt', skip_header=0)

>> headers = ['longitude', 'latitude', 'year', 'month',
              'day', 'magnitude', 'depth', 'hour',
              'minute', 'second']

# Create instance of a catalogue class
>> from openquake.hmtk.seismicity.catalogue import Catalogue
>> catalogue = Catalogue()

# Load the data array into the catalogue
>> catalogue.load_from_array(data, headers)

• catalogue.get_decimal_time()
Returns the time of each earthquake in a decimal format
• catalogue.hypocentres_as_mesh()
Returns the hypocentres of the earthquakes as an instance of the class
openquake.hazardlib.geo.mesh.Mesh (useful for geospatial functions)
• catalogue.hypocentres_to_cartesian()
Returns the hypocentres in a 3D cartesian framework
• catalogue.purge_catalogue(flag_vector)
Purges the catalogue of all False events in the boolean vector. This is used for removing
foreshocks and aftershocks from a catalogue after the application of a declustering
algorithm.
• catalogue.sort_catalogue_chronologically()
Sorts the catalogue into chronological order.
N.B. Some methods will implicitly assume that the catalogue is in chronological order, so
it is recommended to run this function if you believe that there may be events out of order
• catalogue.select_catalogue_events(IDX)
Orders the catalogue according to the event order specified in IDX. Behaves the same as
purge_catalogue(IDX) if IDX is a boolean vector
• catalogue.get_depth_distribution(depth_bins, normalisation=False,
bootstrap=None)
Returns a depth histogram for the catalogue using bins specified by depth_bins. If
normalisation=True then the function will return the histogram as a probability mass
function, otherwise the original count will be returned. If uncertainties are reported on
depth such that one or more values in catalogue.data[’depthError’] are greater than
0., the function will perform a bootstrap analysis, taking into account the depth error, with
the number of bootstraps given by the keyword bootstrap.

# Import numpy and matplotlib
>> import numpy as np
>> import matplotlib.pyplot as plt

# Define depth bins for (e.g.)
# 0. - 150 km in intervals of 10 km
>> depth_bins = np.arange(0., 160., 10.)

# Get normalised histogram (without bootstrapping)
>> depth_hist = catalogue.get_depth_distribution(depth_bins,
                                                 normalisation=True)

To generate a simple histogram plot of hypocentral depth, the process below can be
followed to produce a depth histogram similar to the one shown in Figure 2.1:
>> from openquake.hmtk.plotting.seismicity.catalogue_plots import \
       plot_depth_histogram

>> depth_bin = 5.0
>> plot_depth_histogram(catalogue,
                        depth_bin,
                        filename="/path/to/image.eps",
                        filetype="eps")

Figure 2.1 – Example depth histogram

• catalogue.get_magnitude_depth_distribution(magnitude_bins, depth_bins,
normalisation=False, bootstrap=None)
Returns a two-dimensional histogram of magnitude and hypocentral depth, with the
corresponding bins defined by the vectors magnitude_bins and depth_bins. The options
normalisation and bootstrap are the same as for the one-dimensional histogram. The
usage is illustrated below:
# Define depth bins for (e.g.)
# 0. - 150 km in intervals of 5 km
>> depth_bins = np.arange(0., 155., 5.)

# Define magnitude bins (e.g.) 2.5 - 7.6 in intervals of 0.1
>> magnitude_bins = np.arange(2.5, 7.7, 0.1)

# Get normalised histograms (without bootstrapping)
>> m_d_hist = catalogue.get_magnitude_depth_distribution(magnitude_bins,
                                                         depth_bins,
                                                         normalisation=True,
                                                         bootstrap=None)

To generate a plot of magnitude-depth density, the following function can be used to
produce a figure similar to that shown in Figure 2.2:

>> from openquake.hmtk.plotting.seismicity.catalogue_plots import \
       plot_magnitude_depth_density
>> magnitude_bin = 0.1
>> depth_bin = 5.0
>> plot_magnitude_depth_density(catalogue,
                                magnitude_bin,
                                depth_bin,
                                logscale=True,  # Logarithmic colour scale
                                filename="/path/to/image.eps",  # Optional
                                filetype="eps")  # Optional

Figure 2.2 – Example magnitude-depth density plot

• catalogue.get_magnitude_time_distribution(magnitude_bins, time_bins,
normalisation=False, bootstrap=None)
Returns a 2D histogram of magnitude with time. time_bins are the bin edges for the
time windows, in decimal years. To run the function simply follow:

# Define annual time bins from 1900 CE to 2012 CE
>> time_bins = np.arange(1900., 2013., 1.)
# Define magnitude bins (e.g.) 2.5 - 7.6 in intervals of 0.1
>> magnitude_bins = np.arange(2.5, 7.7, 0.1)
# Get normalised histograms (without bootstrapping)
>> mag_time_hist = catalogue.get_magnitude_time_distribution(magnitude_bins,
                                                             time_bins,
                                                             normalisation=True,
                                                             bootstrap=None)

To automatically generate a plot, similar to that shown in Figure 2.3, run the following:

>> from openquake.hmtk.plotting.seismicity.catalogue_plots import \
       plot_magnitude_time_density
>> magnitude_bin_width = 0.1
>> time_bin_width = 0.1
>> plot_magnitude_time_density(catalogue,
                               magnitude_bin_width,
                               time_bin_width,
                               filename="/path/to/image.eps",
                               filetype="eps")

Figure 2.3 – Example magnitude-time density plot

2.1.2 The “Selector” Class


In the process of constructing a PSHA seismogenic source model from seismicity it is necessary
to select sub-sets of the earthquake catalogue, usually for calculating earthquake recurrence
statistics pertinent to a particular region or seismogenic source. As catalogue selection is such a
prevalent aspect of the source modelling process, the selection is done inside the HMTK via the
use of a "Selector" tool. This tool is a container for all methods associated with the selection
of sub-catalogues from a given earthquake catalogue. It will be seen in due course that later
methods relating to the selection of the catalogue for a particular source require as an input an
instance of the selector class, rather than the catalogue itself.

To set up the “Selector” tool:

>> from openquake.hmtk.seismicity.selector import CatalogueSelector

# Assuming that there already exists a
# catalogue named ''catalogue1''
>> selector1 = CatalogueSelector(catalogue1, create_copy=True)

The optional keyword create_copy ensures that when the events not selected are purged
from the catalogue a “deepcopy” is taken of the original catalogue. This ensures that the original
catalogue remains unmodified when a subset of events is selected.
The catalogue selector class has the following methods:
.within_polygon(polygon, distance=None)
Selects events within a polygon described by the class
openquake.hazardlib.geo.polygon.Polygon. distance is the distance (in km) to use as a
buffer, if required. Optional keyword arguments upper_depth and lower_depth can be used to
limit the depth range of the catalogue returned by the selector to only those events whose
hypocentres are within the specified depth limits.
.circular_distance_from_point(point, distance, distance_type="epicentral")
Selects events within a distance from a location. The location (point) is an instance of the
openquake.hazardlib.geo.point.Point class, whilst distance is the selection distance (km) and
distance_type can be either "epicentral" or "hypocentral".
.cartesian_square_centred_on_point(point, distance)
Selects events within a square of side length distance, centred on a location (represented as an
openquake Point class).
.within_joyner_boore_distance(surface, distance)
Returns earthquakes within a distance (km) of the surface projection (“Joyner-Boore” dis-
tance) of a fault surface. The fault surface must be defined as an instance of the class
openquake.hazardlib.geo.surface.simple_fault.SimpleFaultSurface or
openquake.hazardlib.geo.surface.complex_fault.ComplexFaultSurface.
.within_rupture_distance(surface, distance)
Returns earthquakes within a distance (km) of a fault surface. The fault surface must be
defined as an instance of the class
openquake.hazardlib.geo.surface.simple_fault.SimpleFaultSurface or
openquake.hazardlib.geo.surface.complex_fault.ComplexFaultSurface.
.within_time_period(start_time=None, end_time=None)
Selects earthquakes within a time period. Times must be input as instances of a datetime
object. For example:

>> from datetime import datetime
>> selector1 = CatalogueSelector(catalogue1, create_copy=True)
# Early time limit is 1 January 1990 00:00:00
>> early = datetime(1990, 1, 1, 0, 0, 0)
# Late time limit is 31 December 1999 23:59:59
>> late = datetime(1999, 12, 31, 23, 59, 59)
>> catalogue_nineties = selector1.within_time_period(start_time=early,
                                                     end_time=late)

.within_depth_range(lower_depth=None, upper_depth=None)
Selects earthquakes whose hypocentres are within the range specified by the lower depth
limit (lower_depth) and the upper depth limit (upper_depth), both in km.
.within_magnitude_range(lower_mag=None, upper_mag=None)
Selects earthquakes whose magnitudes are within the range specified by the lower limit
(lower_mag) and the upper limit (upper_mag).
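For example, to extract a sub-catalogue limited in both magnitude and depth (an illustrative
combination of the methods above, assuming lower_depth denotes the deeper bound of the
range as in the depth limits described previously):

>> selector1 = CatalogueSelector(catalogue1, create_copy=True)
# Events with 4.0 <= M <= 7.0
>> subcat = selector1.within_magnitude_range(lower_mag=4.0, upper_mag=7.0)
# Of these, only events shallower than 40 km
>> selector2 = CatalogueSelector(subcat, create_copy=True)
>> shallow_subcat = selector2.within_depth_range(lower_depth=40.0)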

2.2 Declustering
To identify the Poissonian rate of seismicity, it is necessary to remove foreshocks, aftershocks
and swarms from the catalogue. The Modeller’s Toolkit contains, at present, two algorithms to
undertake this task, with more under development.

2.2.1 Gardner and Knopoff (1974)


The most widely applied simple windowing algorithm is that of Gardner and Knopoff (1974).
Originally conceived for Southern California, the method simply identifies aftershocks by virtue
of fixed time-distance windows proportional to the magnitude of the main shock. Whilst this
premise is relatively simple, the manner in which the windows are applied can be ambiguous.
Four different possibilities can be considered (Luen and Stark, 2012):
1. Search events in magnitude-descending order. Remove an event if it is in the window of
the largest event.
2. Remove every event that is inside the window of a previous event, including larger events.
3. An event is in a cluster if, and only if, it is in the window of at least one other event in the
cluster. In every cluster remove all events except the largest.
4. In chronological order, if the ith event is in the window of a preceding larger shock that
has not already been deleted, remove it. If a larger shock is in the window of the ith event,
delete the ith event. Otherwise retain the ith event.
It is the first of the four options that is implemented in the current toolkit, whilst others may be
considered in future. The algorithm is capable of identifying foreshocks and aftershocks, simply
by applying the windows forward and backward in time from the mainshock. No distinction
is made between primary aftershocks (those resulting from the mainshock) and secondary or
tertiary aftershocks (those originating due to the previous aftershocks); however, it is assumed
all would occur within the window.
Several modifications to the time and distance windows have been suggested, which are
summarised in Stiphout et al. (2012). The windows originally suggested by Gardner and Knopoff
(1974) are approximated by:

\[
\textrm{distance (km)} = 10^{0.1238M + 0.983} \qquad
\textrm{time (decimal years)} =
\begin{cases}
10^{0.032M + 2.7389} & \textrm{if } M \geq 6.5 \\
10^{0.5409M - 0.547} & \textrm{otherwise}
\end{cases}
\tag{2.1}
\]

An alternative formulation is proposed by Grünthal (as reported in Stiphout et al. (2012)):

\[
\textrm{distance (km)} = e^{1.77 + \sqrt{0.037 + 1.02M}} \qquad
\textrm{time (decimal years)} =
\begin{cases}
\left| e^{-3.95 + \sqrt{0.62 + 17.32M}} \right| & \textrm{if } M < 6.5 \\
10^{2.8 + 0.024M} & \textrm{otherwise}
\end{cases}
\tag{2.2}
\]
A further alternative is suggested by Uhrhammer (1986):

\[
\textrm{distance (km)} = e^{-1.024 + 0.804M} \qquad
\textrm{time (decimal years)} = e^{-2.87 + 1.235M}
\tag{2.3}
\]

A comparison of the expected window sizes with magnitude is shown for distance and time
in Figure 2.4.
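The window sizes for a given magnitude can also be evaluated directly from the window classes
(a sketch assuming, as in the current HMTK source, that each window class exposes a
calc(magnitude) method returning the distance and time windows):

>> import numpy as np
>> from openquake.hmtk.seismicity.declusterer.distance_time_windows import \
       GardnerKnopoffWindow
>> sw_space, sw_time = GardnerKnopoffWindow().calc(np.array([5.0, 6.0, 7.0]))
# sw_space: distance windows (km); sw_time: time windows (decimal years)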

Figure 2.4 – Scaling of declustering time and distance windows with magnitude: (a) distance
(km), (b) time (days), for the Gardner and Knopoff (1974), Grünthal and Uhrhammer (1986)
windows

The Gardner and Knopoff (1974) algorithm and its derivatives represent the most computationally
straightforward approach to declustering. The time_distance_window attribute indicates
the choice of the time and distance window scaling model from the three listed. As the current
version of this algorithm considers the events in a descending-magnitude order, the parameter
fs_time_prop defines the size of the time window used for searching for foreshocks,
as a fractional proportion of the size of the aftershock window (the distance windows are always
equal for both fore- and aftershocks). So for an evenly sized time window for foreshocks and
aftershocks, the fs_time_prop parameter should equal 1. For shorter or longer foreshock time
windows this parameter can be reduced or increased respectively.
To run a declustering analysis on the earthquake catalogue it is necessary to set up the
configuration using a Python dictionary (see Appendix A). A configuration for the Gardner and
Knopoff (1974) algorithm, using for example the Uhrhammer (1986) time-distance windows
with an equally sized time window for aftershocks and foreshocks, would be created as shown:

>> from openquake.hmtk.seismicity.declusterer.distance_time_windows import \
       UhrhammerWindow

>> declust_config = {'time_distance_window': UhrhammerWindow(),
                     'fs_time_prop': 1.0}

To run the declustering algorithm simply import and run it as shown:

>> from openquake.hmtk.seismicity.declusterer.dec_gardner_knopoff import \
       GardnerKnopoffType1

>> declustering = GardnerKnopoffType1()

>> cluster_index, cluster_flag = declustering.decluster(catalogue,
                                                        declust_config)

There are two outputs of a declustering algorithm: cluster_index and cluster_flag.
Both are numpy vectors, of the same length as the catalogue, containing information about the
clusters in the catalogue. cluster_index indicates the cluster to which each event is assigned
(0 if not assigned to a cluster). cluster_flag indicates whether an event is a non-Poissonian
event (in which case the value is assigned as 1) or a mainshock (in which case the value is
assigned as 0). This output definition is the same for all declustering algorithms.
At this point the user may wish to either retain the catalogue in its current format, in
which case they may wish to add on the clustering information into another attribute of the
catalogue.data dictionary, or they may wish to purge the catalogue of non-Poissonian events.
To add the clustering information to the data dictionary, simply type:

>> catalogue.data['Cluster_Index'] = cluster_index
>> catalogue.data['Cluster_Flag'] = cluster_flag

Alternatively, to purge the catalogue of non-Poissonian events:

>> mainshock_flag = cluster_flag == 0
>> catalogue.purge_catalogue(mainshock_flag)

2.2.2 AFTERAN (Musson, 1999b)


A particular development of the standard windowing approach is introduced in the program
AFTERAN (Musson, 1999b). This is a modification of the Gardner and Knopoff (1974)
algorithm, using a moving time window rather than a fixed time window. In AFTERAN,
considering each earthquake in order of descending magnitude, events within a fixed distance
window are identified (the distance windows being those suggested previously). These events are
searched using a moving time window of T days. For a given mainshock, non-Poissonian events
are identified if they occur both within the distance window and the initial time window. The
time window is then moved, beginning at the last flagged event, and the process repeated. For a
given mainshock, all non-Poissonian events are identified when the algorithm finds a continuous
period of T days in which no aftershock or foreshock is identified.
The theory of the AFTERAN algorithm is broadly consistent with that of Gardner and
Knopoff (1974). This algorithm, whilst a little more computationally complex, and therefore
slower, than the Gardner and Knopoff (1974) windowing approach, remains simple to implement.
As with the Gardner and Knopoff (1974) function, the time_distance_window attribute
indicates the choice of the time and distance window scaling model. The parameter time_window
indicates the size (in days) of the moving time window used to identify fore- and aftershocks.
The following example will show how to run the AFTERAN algorithm, using the Gardner and
Knopoff (1974) definition of the distance windows, and a fixed-width moving time window of
100 days.
>> from openquake.hmtk.seismicity.declusterer.dec_afteran import \
       Afteran

>> from openquake.hmtk.seismicity.declusterer.distance_time_windows import \
       GardnerKnopoffWindow

>> declust_config = {'time_distance_window': GardnerKnopoffWindow(),
                     'time_window': 100.0}

>> declustering = Afteran()

>> cluster_index, cluster_flag = declustering.decluster(catalogue,
                                                        declust_config)

2.3 Completeness
In the earliest stages of processing an instrumental seismic catalogue to derive inputs for seismic
hazard analysis, it is necessary to determine the magnitude completeness threshold of the
catalogue. To outline the meaning of the term ”magnitude completeness” and the requirements
for its analysis as an input to PSHA, the terminology of Mignan and Woessner (2012) is adopted.
This defines the magnitude of completeness as the ”lowest magnitude at which 100 % of the
events in a space-time volume are detected (Rydelek and Sacks, 1989; Woessner and Wiemer,
2005)”. Incompleteness of an earthquake catalogue will produce bias when determining models
of earthquake recurrence, which may have a significant impact on the estimation of hazard at a
site. Identification of the completeness magnitude of an earthquake catalogue is therefore a clear
requirement for the processing of input data for seismic hazard analysis.
It should be noted that this summary of methodologies for estimating completeness is
directed toward techniques that can be applied to a ”typical” instrumental seismic catalogue.
We therefore make the assumption that the input data will contain basic information for each
earthquake such as time, location, magnitude. We do not make the assumption that network-
specific or station-specific properties (e.g., configuration, phase picks, attenuation factors) are
known a priori. This limits the selection of methodologies to those classed as estimators of
”sample completeness”, which defines completeness on the basis of the statistical properties
of the earthquake catalogue, rather than ”probability-based completeness”, which defines the
probability of detection given knowledge of the properties of the seismic network (Schorlemmer
and Woessner, 2008). This therefore excludes the methodology of Schorlemmer and Woessner
(2008), and similar approaches such as that of Felzer (2008).
The current workflows assume that completeness will be applied to the whole catalogue,
ideally returning a table of time-varying completeness. The option to explore spatial variation
in completeness is not explicitly supported, but could be accommodated by an appropriate
configuration of the toolkit.
In the current version of the Modeller’s Toolkit the Stepp (1971) methodology for analysis
of catalogue completeness is implemented. Further methods are in development, and will be
included in future releases.

2.3.1 Stepp, 1971


This is one of the earliest analytical approaches to estimation of completeness magnitude. It
is based on estimators of the mean rate of recurrence of earthquakes within given magnitude
and time ranges, identifying the completeness magnitude when the observed rate of earthquakes
above M_C begins to deviate from the expected rate. If a time interval (T_i) is taken, and the
earthquake sequence assumed Poissonian, then the unbiased estimate of the mean rate of events
per unit time interval of a given sample is:

\[
\lambda = \frac{1}{n} \sum_{i=1}^{n} T_i
\tag{2.4}
\]

with variance \sigma_\lambda^2 = \lambda / n. Taking the unit time interval to be 1 year, the
standard deviation of the estimate of the mean is:

\[
\sigma_\lambda = \sqrt{\lambda} / \sqrt{T}
\tag{2.5}
\]

where T is the sample length. As the Poisson assumption implies a stationary process, σ_λ
behaves as 1/√T in the sub-interval of the sample in which the mean rate of occurrence of a
magnitude class is constant. Time variation of M_C can usually be inferred graphically from the
analysis, as is illustrated in Figure 2.5. In this example, the deviation from the 1/√T line for
each magnitude class occurs at around 40 years for 4.5 < M < 5, 100 years for 5.0 < M < 6.0,
approximately 150 years for 6.0 < M < 6.5 and 300 years for M > 6.5. Knowledge of the
sources of earthquake information for a given catalogue may usually be reconciled with the
completeness time intervals.

Figure 2.5 – Example of Completeness Estimation by the Stepp (1971) methodology

The analysis of Stepp (1971) is a coarse, but relatively robust, approach to estimating
the temporal variation in completeness of a catalogue. It has been widely applied since its
development. The accuracy of the completeness magnitude depends on the magnitude and time
intervals considered, and a degree of judgement is often needed to determine the time at which
the rate deviates from the expected values. It has tended to be applied to catalogues on a large
scale, and for relatively higher completeness magnitudes.
To translate the methodology from a largely graphical method into a computational one,
the completeness period needs to be identified by automatically finding the point at which
the gradient of the observed values decreases with respect to that expected from a Poisson
process (see Figure 2.5). In the implementation found within the current toolkit, the divergence
point is identified by fitting a two-segment piecewise linear function to the observed data. Although
a two-segment piecewise linear function is normally fit with four parameters (intercept, slope_1,
slope_2 and crossover point), by virtue of the assumption that for the complete catalogue the rate
is stationary, such that σ_λ = 1/√T, the slope of the first segment can be fixed as
-0.5, and the second slope should be constrained such that slope_2 ≤ -0.5, whilst the crossover
point (x_c) is subject to the constraint x_c ≥ 0.0. Thus it is possible to fit the two-segment linear
function using constrained optimisation with only three free parameters. For this purpose the

toolkit minimises the residual sum-of-squares of the model fit using numerical optimisation.
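The fitting step can be sketched as follows (an illustrative reconstruction of the constrained
three-parameter fit described above, not the toolkit’s exact code; here x and y stand for the
logarithms of the sample length and of σ_λ for one magnitude class):

import numpy as np
from scipy.optimize import minimize

def two_segment_model(params, x):
    # params = [intercept, slope2, crossover]; slope1 is fixed at -0.5
    intercept, slope2, crossover = params
    y_crossover = intercept - 0.5 * crossover
    return np.where(x < crossover,
                    intercept - 0.5 * x,
                    y_crossover + slope2 * (x - crossover))

def sum_of_squares(params, x, y):
    # Residual sum-of-squares of the two-segment fit
    return np.sum((y - two_segment_model(params, x)) ** 2.0)

# Constrained optimisation: slope2 <= -0.5 and crossover >= 0.0
# result = minimize(sum_of_squares, [0.0, -1.0, 1.0], args=(x, y),
#                   bounds=[(None, None), (None, -0.5), (0.0, None)])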
To run the Stepp (1971) algorithm the configuration parameters should be entered in the
form of a dictionary, such as the example shown below:
comp_config = {'magnitude_bin': 0.5,
               'time_bin': 5.,
               'increment_lock': True}

The algorithm has three configurable options. The time_bin parameter describes the
size of the time window in years, and the magnitude_bin parameter describes the size of the
magnitude bin. The final option (increment_lock) is used to ensure consistency in the results,
avoiding a completeness magnitude that increases for the latest intervals in the catalogue simply
due to the variability associated with their short duration. If increment_lock is set to True, the
program will ensure that the completeness magnitude for shorter, more recent windows is less
than or equal to that of older, longer windows. This is often a condition for some recurrence
analysis tools, so it may be advisable to set this option to True in certain workflows. Otherwise it
should be set to False to show the apparent variability. Some degree of judgement is necessary
here. In particular the user may be aware of circumstances particular to their catalogue for
which a recent increase in completeness magnitude is expected (for example, a recording
network that is no longer operational).
The process of running the algorithm is shown below:
>> from openquake.hmtk.seismicity.completeness.comp_stepp_1971 import \
       Stepp1971

>> completeness_algorithm = Stepp1971()

>> completeness_table = completeness_algorithm.completeness(
       catalogue,
       comp_config)

>> completeness_table
array([[ 1990. , 4.25],
       [ 1962. , 4.75],
       [ 1959. , 5.25],
       [ 1906. , 5.75],
       [ 1906. , 6.25],
       [ 1904. , 6.75],
       [ 1904. , 7.25]])

As shown in the resulting completeness_table, the completeness algorithm will output
the time variation in completeness (in this example with the increment_lock set) in the form
of a two-column table, with column 1 indicating the completeness year for the magnitude bin
centred on the magnitude value found in column 2.
It may be the case that the user wishes to enter a time-varying completeness
result for use in subsequent functions, based on alternative methods or on judgement. This can
be entered in the completeness_table setting, as in the example shown here (take note of the
requirements for the square brackets):
completeness_table: [[1990., 4.0],
                     [1960., 5.0],
                     [1930., 6.0],
                     [1900., 6.5]]

If a completeness_table is input then this will override the selection of the completeness
algorithm, and the calculation will take the values in completeness_table directly.
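When working interactively, the same table can equally be constructed directly as a two-column numpy array and passed to the subsequent functions; a minimal sketch (the variable name is arbitrary):

import numpy as np

# Column 1: completeness year; column 2: completeness magnitude
completeness_table = np.array([[1990., 4.0],
                               [1960., 5.0],
                               [1930., 6.0],
                               [1900., 6.5]])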

2.4 Recurrence Models


The current set of tools is intended to determine the parameters of the Gutenberg and Richter
(1944) recurrence model, namely the a- and b-value. It is expected that in the most common
use case the catalogue that is input to these algorithms will be declustered, with a time-varying
completeness defined according to a completeness_table of the kind shown previously. If no
completeness_table is input the algorithm will assume the input catalogue is complete above
the minimum magnitude for its full duration.

2.4.1 Aki (1965)


The classical maximum likelihood estimator for a simple unbounded Gutenberg and Richter
(1944) model is that of Aki (1965), adapted for binned magnitude data by Bender (1983).
It assumes a fixed completeness magnitude (MC ) for the catalogue, and a simple power law
recurrence model. It does not explicitly take into account magnitude uncertainty.

b = \frac{\log_{10} e}{\bar{m} - m_0 + \Delta M / 2} \qquad (2.6)

where m̄ is the mean magnitude, m₀ the minimum magnitude and ΔM the discretisation interval
of magnitude within a given sample.
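As an illustration only (this is not the toolkit's implementation, and the helper name is hypothetical), equation (2.6) can be evaluated directly with numpy:

import numpy as np

def aki_b_value(magnitudes, m_min, dmag=0.1):
    # Select the events above the completeness magnitude
    mags = magnitudes[magnitudes >= m_min]
    # Equation (2.6): Aki (1965) with the Bender (1983) binning correction
    return np.log10(np.e) / (np.mean(mags) - m_min + dmag / 2.0)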

2.4.2 Maximum Likelihood


This method adjusts the Aki (1965) and Bender (1983) method to incorporate time variation
in completeness. The catalogue is divided into S sub-catalogues, where each sub-catalogue
corresponds to a period with a corresponding MC. An average a- and b-value (with uncertainty)
is returned by taking the mean of the a- and b-values of the sub-catalogues, weighted by the
number of events in each sub-catalogue.

\hat{b} = \frac{1}{S} \sum_{i=1}^{S} w_i b_i \qquad (2.7)

>> mle_config = {'magnitude_interval': 0.1,
                 'Average Type': 'Weighted',
                 'reference_magnitude': None}

>> from openquake.hmtk.seismicity.occurrence.b_maximum_likelihood import \
       BMaxLikelihood

>> recurrence = BMaxLikelihood()

>> bval, sigmab, aval, sigmaa = recurrence.calculate(
       catalogue,
       mle_config,
       completeness=completeness_table)

where magnitude_interval indicates the size of the magnitude bin, 'Average Type' the manner
in which the sub-catalogue estimates are averaged, and reference_magnitude the magnitude
for which the output rate corresponds to events of that magnitude or greater (set to 0 so that the
rate is simply 10^a).

2.4.3 Kijko and Smit (2012)


A recent adaptation of the Aki (1965) estimator of b-value for a catalogue containing different
completeness periods has been proposed by Kijko and Smit (2012). Dividing the earthquake
catalogue into s sub-catalogues of n_i events with corresponding completeness magnitudes m_ci for
i = 1, 2, ..., s, the likelihood function of β, where β = b ln(10), is given as:

L = \prod_{i=1}^{s} \prod_{j=1}^{n_i} \beta \exp\left[-\beta\left(m_j^i - m_{min}^i\right)\right] \qquad (2.8)

which gives a maximum likelihood estimator of β :

\beta = \left(\frac{r_1}{\beta_1} + \frac{r_2}{\beta_2} + \cdots + \frac{r_s}{\beta_s}\right)^{-1} \qquad (2.9)

where r_i = n_i/n, n = \sum_{i=1}^{s} n_i is the total number of events above the respective levels of
completeness, and β_i is the Aki (1965) estimate from the i-th sub-catalogue.
>> kijko_smit_config = {'magnitude_interval': 0.1,
                        'reference_magnitude': None}
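The estimator is then run in the same manner as the other recurrence tools. The sketch below assumes that the module path and class name (kijko_smit, KijkoSmit) follow the same conventions as the other occurrence estimators:

>> from openquake.hmtk.seismicity.occurrence.kijko_smit import KijkoSmit

>> recurrence = KijkoSmit()

>> bval, sigmab, aval, sigmaa = recurrence.calculate(
       catalogue,
       kijko_smit_config,
       completeness=completeness_table)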

2.4.4 Weichert (1980)


Recognising the typical conditions of an earthquake catalogue, Weichert (1980) developed a
maximum likelihood estimator of b for grouped magnitudes and unequal periods of observation.
The likelihood formulation for this approach is:

L(\beta \mid n_i, m_i, t_i) = \frac{N!}{\prod_i n_i!} \prod_i p_i^{n_i} \qquad (2.10)
where n_i is the number of earthquakes in magnitude bin m_i with observation period t_i, and
N = Σ_i n_i. The parameter p_i is defined as:

p_i = \frac{t_i \exp(-\beta m_i)}{\sum_j t_j \exp(-\beta m_j)} \qquad (2.11)
The extremum of ln (L) is found at:

\frac{\sum_i t_i m_i \exp(-\beta m_i)}{\sum_j t_j \exp(-\beta m_j)} = \frac{\sum_i n_i m_i}{N} \qquad (2.12)
The computational implementation of this method is given as an appendix to Weichert
(1980). This formulation of the maximum likelihood estimator for b-value, and consequently
seismicity rate, is in widespread use, with applications in many national seismic hazard analyses
(e.g. Frankel et al., 1996; Frankel et al., 2002). The algorithm has been demonstrated to be
efficient and unbiased for most applications. It is recognised by Felzer (2008) that an implicit
assumption is made regarding the stationarity of the seismicity for all the time periods.
To implement the Weichert (1980) recurrence estimator, the configuration properties are
defined as:
>> weichert_config = {'magnitude_interval': 0.1,
                      'reference_magnitude': None,
                      # The remaining parameters are optional
                      'bvalue': 1.0,
                      'itstab': 1E-5,
                      'maxiter': 1000}

As the Weichert (1980) algorithm reaches the maximum likelihood estimate by iteration, three
additional optional parameters can control the iteration process: bvalue is the initial guess for
the b-value, itstab the difference in b-value required to reach convergence, and maxiter the
maximum number of iterations. Note that the iterative nature of the algorithm can result in very
slow convergence and unstable behaviour when the magnitudes infer b-values that are very small,
or even negative. This can occur when very few events are present in the resulting catalogue, or
when the magnitudes are concentrated within a narrow range.
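The execution then proceeds as for the other estimators; a sketch, assuming the module path and class name (weichert, Weichert) mirror the conventions shown previously:

>> from openquake.hmtk.seismicity.occurrence.weichert import Weichert

>> recurrence = Weichert()

>> bval, sigmab, aval, sigmaa = recurrence.calculate(
       catalogue,
       weichert_config,
       completeness=completeness_table)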

2.5 Maximum Magnitude


The estimation of the maximum magnitude for use in seismic hazard analysis is a complex,
and often controversial, process that should be guided by information from the geology and
seismotectonics of a seismic source. Estimation of maximum magnitude from the observed
(instrumental and historical) seismicity can be undertaken using methods that assume a truncated
Gutenberg and Richter (1944) model, or via non-parametric methods that are independent of any
assumed functional form.

2.5.1 Kijko (2004)


Three different estimators of maximum magnitude are given by Kijko (2004), each depending on
a different set of assumptions:
1. "Fixed b-value": assumes a single b-value with no uncertainty
2. "Uncertain b-value": assumes an uncertain b-value defined by an expected value b and its
standard deviation
3. "Non-Parametric Gaussian": assumes no functional form (and can therefore be applied to
seismicity observed to follow a more characteristic distribution)
Each of these estimators assumes the general form:

m_{max} = m_{max}^{obs} + \Delta \qquad (2.13)

where ∆ is an increment that is dependent on the estimator used.


The uncertainty on mmax is also defined according to:

\sigma_{m_{max}} = \sqrt{\sigma_{m_{max}^{obs}}^{2} + \Delta^2} \qquad (2.14)

In the three estimators some lower bound magnitude constraint must be defined. For those
estimators that assume an exponential recurrence model the lower bound magnitude must be
specified by the user. For the Non-Parametric Gaussian method an explicit lower bound
magnitude does not have to be specified; however, the estimation is conditioned upon the largest
N magnitudes, where N must be specified by the user.
If the user wishes to input a maximum magnitude that is larger than that observed in the
catalogue (e.g. a known historical magnitude), this can be specified in the config file using
input_mmax, with the corresponding uncertainty defined by input_mmax_uncertainty. If
these are not defined (i.e. set to None) then the maximum magnitude will be taken from the
catalogue.
All three estimators require an iterative solution, so additional parameters can be specified
in the configuration file to control the iteration process: tolerance, the difference in the
mmax estimate below which the algorithm is considered converged, and maximum_iterations,
the maximum number of iterations permitted for stability.

"Fixed b-value"
For a catalogue of n earthquakes, whose magnitudes are distributed according to a Gutenberg
and Richter (1944) distribution with a fixed b-value, the increment of maximum magnitude is
determined via:

\Delta = \int_{m_{min}}^{m_{max}} \left\{ \frac{1 - \exp\left[-\beta(m - m_{min})\right]}{1 - \exp\left[-\beta(m_{max}^{obs} - m_{min})\right]} \right\}^n dm \qquad (2.15)

The execution of the Kijko (2004) ”fixed-b” algorithm is as follows:


>> mmax_config = {'input_mmax': 7.6,
                  'input_mmax_uncertainty': 0.22,
                  'b-value': 1.0,
                  'input_mmin': 5.0,
                  'tolerance': 1.0E-5,          # Default
                  'maximum_iterations': 1000}   # Default

>> from openquake.hmtk.seismicity.max_magnitude.kijko_sellevol_fixed_b \
       import KijkoSellevolFixedb

>> mmax_estimator = KijkoSellevolFixedb()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue,
                                                    mmax_config)

"Uncertain b-value"
For a catalogue of n earthquakes, whose magnitudes are distributed according to a Gutenberg and
Richter (1944) distribution with an uncertain b-value, characterised by an expected value (b) and
a corresponding uncertainty (σ_b), the increment of maximum magnitude is determined via:

\Delta = C_\beta \int_{m_{min}}^{m_{max}} \left[1 - \left(\frac{p}{p + m - m_{min}}\right)^q\right]^n dm \qquad (2.16)

where β = b ln(10), p = β/σ_β², q = (β/σ_β)², and C_β is a normalising coefficient
determined via:

C_\beta = \frac{1}{1 - \left[p / (p + m_{max} - m_{min})\right]^q} \qquad (2.17)
In both the fixed and uncertain b cases a minimum magnitude must be input into
the calculation. If this value is lower than the minimum magnitude observed in the catalogue
the iterator may not stabilise to a satisfactory value, so it is recommended to use a minimum
magnitude that is greater than the minimum found in the observed catalogue.
The execution of the "uncertain b-value" estimator is undertaken in a manner very similar to
that of the fixed b-value, the only additional parameter being the sigma-b term:
>> mmax_config = {'input_mmax': 7.6,
                  'input_mmax_uncertainty': 0.22,
                  'b-value': 1.0,
                  'sigma-b': 0.15,
                  'input_mmin': 5.0,
                  'tolerance': 1.0E-5,
                  'maximum_iterations': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.kijko_sellevol_bayes \
       import KijkoSellevolBayes

>> mmax_estimator = KijkoSellevolBayes()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue,
                                                    mmax_config)

Non-Parametric Gaussian
The non-parametric Gaussian estimator for maximum magnitude mmax is defined as:

\Delta = \int_{m_{min}}^{m_{max}} \left[ \frac{\sum_{i=1}^{n} \left[\Phi\left(\frac{m - m_i}{h}\right) - \Phi\left(\frac{m_{min} - m_i}{h}\right)\right]}{\sum_{i=1}^{n} \left[\Phi\left(\frac{m_{max} - m_i}{h}\right) - \Phi\left(\frac{m_{min} - m_i}{h}\right)\right]} \right]^n dm \qquad (2.18)

where m_min and m_max are the minimum and maximum magnitudes of the set of n events, Φ is
the standard normal cumulative distribution function, and h is a kernel smoothing factor:

h = 0.9 × min (σ , IQR/1.34) × n−1/5 (2.19)

with σ the standard deviation of a set of n earthquakes with magnitude mi where i = 1, 2, ...n,
and IQR the inter-quartile range.
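As an illustration, equation (2.19) is straightforward to evaluate with numpy (a sketch only; the helper name is hypothetical):

import numpy as np

def kernel_bandwidth(magnitudes):
    sigma = np.std(magnitudes)
    # Inter-quartile range of the magnitudes
    iqr = np.percentile(magnitudes, 75.) - np.percentile(magnitudes, 25.)
    # Equation (2.19)
    return 0.9 * min(sigma, iqr / 1.34) * float(len(magnitudes)) ** (-0.2)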
The uncertainty on mmax is conditioned primarily on the uncertainty of the largest
observed magnitude. As in many catalogues the largest observed magnitude may be an earlier
historical event, which will be associated with a large uncertainty, this estimator tends towards
large uncertainties on mmax.
Due to the need to define some additional parameters the configuration file is slightly
different. No b-value or minimum magnitude needs to be specified; however, the algorithm will
consider only the largest number_earthquakes magnitudes (or all magnitudes if the number
of observations is smaller). The algorithm also numerically approximates the integral of the
Gaussian pdf, so number_samples defines the number of samples of the distribution. The rest of
the execution remains the same as for the exponential recurrence estimators of mmax:
>> mmax_config = {'input_mmax': 7.6,
                  'input_mmax_uncertainty': 0.22,
                  'number_samples': 51,        # Default
                  'number_earthquakes': 100,   # Default
                  'tolerance': 1.0E-5,
                  'maximum_iterations': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.kijko_nonparametric_gaussian \
       import KijkoNonParametricGaussian

>> mmax_estimator = KijkoNonParametricGaussian()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue,
                                                    mmax_config)

2.5.2 Cumulative Moment (Makropoulos and Burton, 1983)


The cumulative moment release method is an adaptation of the cumulative strain energy release
method for estimating mmax originally proposed by Makropoulos and Burton (1983). The
method is based on a pseudo-graphical formulation, in which an estimator of maximum magnitude
is derived from a plot of cumulative seismic moment release with time. The average slope of this
plot indicates the mean moment release for the input catalogue in question. Two further straight
lines, with gradients equal to the slope of the mean cumulative moment release, are defined such
that they envelop the cumulative plot. The vertical distance between these two lines indicates the
total amount of moment that may be released in the region if no earthquakes were to occur in
the corresponding time (i.e. the distance between the upper and lower bounding lines on the time
axis). This concept is illustrated in Figure 2.6.

Figure 2.6 – Illustration of the Cumulative Moment Release Concept

The cumulative moment estimator of mmax, whilst simple in concept, has several key advan-
tages. As a non-parametric method it is independent of any assumed probability distribution
and cannot estimate mmax lower than the observed mmax. It is also principally controlled by the
largest events in the catalogue, thus making it relatively insensitive to uncertainties in completeness
or in the lower bound threshold. In practice, this estimator, and to some extent that of Kijko (2004),
is dependent on having a sufficiently long record of events relative to the strain cycle for
the region in question, such that the estimate of average moment release is stable. This will
obviously depend on the length of the catalogue, and for some regions, particularly those in low
strain intraplate environments, it is often the case that the estimated mmax will be close to the
observed mmax. It may therefore be most appropriate to use these techniques on a larger scale,
either considering multiple sources or an appropriate tectonic proxy.
For the cumulative moment estimator it is possible to take into account the uncertainty on
mmax by applying bootstrap sampling to the observed magnitudes and their respective uncer-
tainties. This has the advantage that σmmax is not controlled by the uncertainty on the observed
mmax, as it is for the Kijko (2004) algorithm; instead it takes into account the uncertainty on
all the magnitudes in the catalogue. The cost of this, however, is that the method is more
computationally intensive, and therefore slower, than Kijko (2004), depending on the number of
bootstrap samples the user chooses. The algorithm is slightly simpler to configure than the Kijko
(2004) methods and is run as per the following example:
>> mmax_config = {'number_bootstraps': 1000}

>> from openquake.hmtk.seismicity.max_magnitude.cumulative_moment_release \
       import CumulativeMoment

>> mmax_estimator = CumulativeMoment()

>> mmax, mmax_uncertainty = mmax_estimator.get_mmax(catalogue,
                                                    mmax_config)

For the cumulative moment algorithm the only user configurable parameter is the
number_bootstraps, which is the number of samples used during the bootstrapping process.

2.6 Smoothed Seismicity


The use of smoothed seismicity has become a common way of characterising distributed
seismicity in seismic hazard analysis, for cases in which the seismogenic sources are defined
exclusively from the uncertain locations of observed seismicity. There are many different methods
for smoothing the catalogue, adopting different smoothing kernels or making different correction
factors to compensate for spatial and/or temporal completeness.

2.6.1 Frankel (1995)


A smoothed seismicity method that has one of the clearest precedents for use in seismic hazard
analysis is that of Frankel (1995), originally derived to characterise the seismicity of the Central
and Eastern United States as part of the 1996 National Seismic Hazard Maps of the United States.
The method applies a simple isotropic Gaussian smoothing kernel to derive the expected rate
of events ñ_i in each cell i from the observed rates n_j of seismicity in a grid of cells j. This kernel
takes the form:

\tilde{n}_i = \frac{\sum_j n_j \, e^{-d_{ij}^2 / c^2}}{\sum_j e^{-d_{ij}^2 / c^2}} \qquad (2.20)

where d_ij is the distance between cells i and j, and c is the smoothing bandwidth.
In the implementation of the algorithm, two steps are taken that we prefer to make config-
urable options here. The first step is that the time-varying completeness is accounted for using a
correction factor (t f ) based on the Weichert (1980) method:

t_f = \frac{\sum_i e^{-\beta m_{ci}}}{\sum_i T_i \, e^{-\beta m_{ci}}} \qquad (2.21)
where m_ci is the completeness magnitude corresponding to the mid-point of each completeness
interval, and T_i the duration of the completeness interval. The completeness magnitude bins must
be evenly spaced; hence, within the application of the method a function is executed to render
the input completeness table into one in which the magnitudes are evenly spaced with a width of
0.1 magnitude units.
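For illustration, the correction factor of equation (2.21) could be computed from a completeness table as follows (a sketch only, assuming a [year, magnitude] table with evenly spaced magnitude bins; the helper name is hypothetical):

import numpy as np

def weichert_time_factor(completeness_table, end_year, beta):
    m_ci = completeness_table[:, 1]            # completeness magnitudes
    t_i = end_year - completeness_table[:, 0]  # interval durations (years)
    weights = np.exp(-beta * m_ci)
    # Equation (2.21)
    return np.sum(weights) / np.sum(t_i * weights)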

2.6.2 Implementing the Smoothed Seismicity Analysis


The smoothed seismicity tool separates out the core implementation (i.e. the gridding, counting
and execution of the code) from the choice of kernel. An example of the execution process is as
follows. The first stage is to load the catalogue into an instance of the catalogue class:
>> input_file = 'path/to/input_file.csv'

>> from openquake.hmtk.parsers.catalogue.csv_catalogue_parser import \
       CsvCatalogueParser

>> parser = CsvCatalogueParser(input_file)

>> catalogue = parser.read_file()

Next, set up the smoothing algorithm and the corresponding kernel:
# Import the smoothed seismicity algorithm
>> from openquake.hmtk.seismicity.smoothing.smoothed_seismicity import \
       SmoothedSeismicity

# Import the kernel function
>> from openquake.hmtk.seismicity.smoothing.kernels.isotropic_gaussian \
       import IsotropicGaussian

# Grid limits should be set up as
# [min_long, max_long, spc_long,
#  min_lat, max_lat, spc_lat,
#  min_depth, max_depth, spc_depth]
>> grid_limits = [0., 10., 0.1, 0., 10., 0.1, 0., 60., 30.]
# Assuming a b-value of 1.0
>> smooth_seis = SmoothedSeismicity(grid_limits,
                                    use_3d=True,
                                    bvalue=1.0)

The smoothed seismicity function needs to be set up with three variables: i) the extent (and
spacing) of the grid, ii) the choice of whether to use 3D smoothing (i.e. distances are taken as
hypocentral rather than epicentral), and iii) the input b-value. The extent of the grid can also be
defined from the catalogue: if preferred, the user need only specify the spacing of the longitude-
latitude grid (as a single floating point value), in which case the grid will be defined by taking the
bounding box of the earthquake catalogue extended by the total smoothing length (i.e. the
bandwidth (in km) multiplied by the maximum number of bandwidths), as illustrated in the sketch
below.
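The sketch below shows a rough manual equivalent of this bounding-box construction (not the toolkit's internal function; the km-to-degrees conversion and the fixed depth range are simplifying assumptions, and the catalogue's coordinates are assumed to be numpy arrays in catalogue.data):

spc = 0.1                       # longitude-latitude grid spacing (degrees)
buffer_km = 50.0 * 3.0          # bandwidth (km) x number of bandwidths
buffer_deg = buffer_km / 111.0  # approximate conversion to degrees

lons = catalogue.data['longitude']
lats = catalogue.data['latitude']
grid_limits = [lons.min() - buffer_deg, lons.max() + buffer_deg, spc,
               lats.min() - buffer_deg, lats.max() + buffer_deg, spc,
               0., 60., 60.]    # fixed depth range (km) for this example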
To run the smoothed seismicity analysis, the configurable parameters are: BandWidth the
bandwidth of the Gaussian kernel (in km), Length_Limit the number of bandwidths considered
as a maximum smoothing length, and increment chooses whether to output the incremental
a-value (for consistency with the original Frankel (1995) methodology) or the cumulative a-value
(corresponding to the a-value of the Gutenberg-Richter model).
The algorithm requires two essential inputs (the earthquake catalogue and the config file),
and three optional inputs:
• completeness_table A table of completeness magnitudes and their corresponding
completeness years (as output from the completeness algorithms)
• smoothing_kernel An instance of the required smoothing kernel class (currently only
Isotropic Gaussian is supported - and will be used if not specified)
• end_year The final year of the catalogue. This will be taken as the last year found in the
catalogue, if not specified by the user
The analysis is then run via:
# Set up the config (e.g. 50 km bandwidth, up to 3 bandwidths)
>> config = {'Length_Limit': 3.,
             'BandWidth': 50.,
             'increment': True}
# Run the analysis!
>> output_data = smooth_seis.run_analysis(
       catalogue,
       config,
       completeness_table,
       smoothing_kernel=IsotropicGaussian(),
       end_year=None)

# To write the resulting data to a csv file
>> smooth_seis.write_to_csv('path/to/output_file.csv')

The resulting output will be a csv file with the following columns:

Longitude, Latitude, Depth, Observed Count, Smoothed Rate, b-value

where Observed Count is the observed number of earthquakes in each cell, and Smoothed Rate
is the smoothed seismicity rate.
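The output file can then be loaded back for plotting or further analysis; a minimal sketch (assuming a single header row in the csv):

import numpy as np

data = np.genfromtxt('path/to/output_file.csv', delimiter=',', skip_header=1)
longitude = data[:, 0]
latitude = data[:, 1]
smoothed_rate = data[:, 4]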

3. Hazard Tools

3.1 Source Model and Hazard Tools


3.1.1 The Source Model Format
The source model is both input and output in GEM's NRML ("Natural Risk Markup Language")
format (although support for shapefile input definitions is expected in future releases). However,
unlike the OpenQuake engine, for which each source typology must contain all of the necessary
attributes, it is recognised that it may be desirable to use the seismic source model with a partially
defined source (one for which only the ID, name and geometry are known) in order to make use
of the modelling tools. Therefore, the validation checks have been relaxed to allow for data such
as the recurrence model, the hypocentral depth distribution and the faulting mechanism to be
specified at a later stage. However, if using this minimal format it will not be possible to use the
resulting output file in OpenQuake until the remaining information is filled in.
A full description of the complete nrml seismogenic source model format is found in the
OpenQuake Version 1.0 manual (Crowley et al., 2010). Examples of the minimal format are
shown below for each of the source typologies:

Point Source

<?xml version='1.0' encoding='utf-8'?>

<nrml xmlns:gml="http://www.opengis.net/gml" xmlns="http://openquake.org/xmlns/nrml/0.4">
<sourceModel name="Some Source Model">
<pointSource id="2" name="point" tectonicRegion="">

<pointGeometry>
<gml:Point>
<gml:pos>-122.0 38.0</gml:pos>
</gml:Point>

<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>10.0</lowerSeismoDepth>
</pointGeometry>

<magScaleRel></magScaleRel>
<ruptAspectRatio></ruptAspectRatio>

<truncGutenbergRichterMFD aValue="" bValue="" minMag="" maxMag="" />



<nodalPlaneDist>
<nodalPlane probability="" strike="" dip="" rake="" />
<nodalPlane probability="" strike="" dip="" rake="" />
</nodalPlaneDist>

<hypoDepthDist>
<hypoDepth probability="" depth="" />
<hypoDepth probability="" depth="" />
</hypoDepthDist>

</pointSource>
</sourceModel>
</nrml>

Area Source

<?xml version='1.0' encoding='utf-8'?>

<nrml xmlns:gml="http://www.opengis.net/gml" xmlns="http://openquake.org/xmlns/nrml/0.4">

<sourceModel name="Some Source Model">


<!-- Note: Area sources are identical to point sources, except for the geometry. -->
<areaSource id="1" name="Quito" tectonicRegion="">
<areaGeometry>
<gml:Polygon>
<gml:exterior>
<gml:LinearRing>
<gml:posList>
-122.5 38.0
-122.0 38.5
-121.5 38.0
-122.0 37.5
</gml:posList>
</gml:LinearRing>
</gml:exterior>
</gml:Polygon>

<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>10.0</lowerSeismoDepth>
</areaGeometry>

<magScaleRel></magScaleRel>

<ruptAspectRatio></ruptAspectRatio>

<incrementalMFD minMag="" binWidth="">


<occurRates></occurRates>
</incrementalMFD>

<nodalPlaneDist>
<nodalPlane probability="" strike="" dip="" rake="" />
<nodalPlane probability="" strike="" dip="" rake="" />
</nodalPlaneDist>

<hypoDepthDist>
<hypoDepth probability="" depth="" />
<hypoDepth probability="" depth="" />
</hypoDepthDist>

</areaSource>
</sourceModel>
</nrml>

Simple Fault Source

<?xml version='1.0' encoding='utf-8'?>

<nrml xmlns:gml="http://www.opengis.net/gml" xmlns="http://openquake.org/xmlns/nrml/0.4">

<sourceModel name="Some Source Model">


<simpleFaultSource id="3" name="Mount Diablo Thrust" tectonicRegion="">

<simpleFaultGeometry>
<gml:LineString>
<gml:posList>
-121.82290 37.73010
-122.03880 37.87710
</gml:posList>
</gml:LineString>

<dip>45.0</dip>
<upperSeismoDepth>10.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>

<magScaleRel></magScaleRel>

<ruptAspectRatio></ruptAspectRatio>

<incrementalMFD minMag="" binWidth="">


<occurRates></occurRates>
</incrementalMFD>

<rake></rake>
</simpleFaultSource>
</sourceModel>
</nrml>

Complex Fault Source

<?xml version='1.0' encoding='utf-8'?>

<nrml xmlns:gml="http://www.opengis.net/gml" xmlns="http://openquake.org/xmlns/nrml/0.4">
<sourceModel name="Some Source Model">
<complexFaultSource id="4" name="Cascadia Megathrust" tectonicRegion="">

<complexFaultGeometry>
<faultTopEdge>
<gml:LineString>
<gml:posList>
-124.704 40.363 0.5493260E+01
-124.977 41.214 0.4988560E+01
-125.140 42.096 0.4897340E+01
</gml:posList>
</gml:LineString>
</faultTopEdge>

<intermediateEdge>
<gml:LineString>
<gml:posList>
-124.704 40.363 0.5593260E+01
-124.977 41.214 0.5088560E+01
-125.140 42.096 0.4997340E+01
</gml:posList>
</gml:LineString>
</intermediateEdge>

<intermediateEdge>
<gml:LineString>
<gml:posList>
-124.704 40.363 0.5693260E+01
-124.977 41.214 0.5188560E+01
-125.140 42.096 0.5097340E+01
</gml:posList>
</gml:LineString>
</intermediateEdge>

<faultBottomEdge>

<gml:LineString>
<gml:posList>
-123.829 40.347 0.2038490E+02
-124.137 41.218 0.1741390E+02
-124.252 42.115 0.1752740E+02
</gml:posList>
</gml:LineString>
</faultBottomEdge>
</complexFaultGeometry>

<magScaleRel></magScaleRel>

<ruptAspectRatio></ruptAspectRatio>

<truncGutenbergRichterMFD aValue="" bValue="" minMag="" maxMag="" />

<rake></rake>
</complexFaultSource>
</sourceModel>
</nrml>

To load in a source model such as those shown above, in an IPython environment simply
execute the following:
1 >> from openquake . hmtk . parsers . source_model . nrml04_parser import \
2 nrm lSourc eMode lParse r
3
4 >> model_filename = ’ path / to / source_model_file . xml ’
5
6 >> model_parser = nrm lSourc eMode lParse r ( model_filename )
7
8 >> model = model_parser . read_file ()
9 Area source - ID : 1 , name : Quito
10 Point Source - ID : 2 , name : point
11 Simple Fault source - ID : 3 , name : Mount Diablo Thrust
12 Complex Fault Source - ID : 4 , name : Cascadia Megathrust

If loaded successfully, a list of the source typology, ID and source name for each source will
be printed to the screen, as shown above. The variable model contains the whole source model
and can support multiple typologies (i.e. point, area, simple fault and complex fault).
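The individual sources can then be inspected programmatically; a brief sketch, assuming the parsed sources are held in the model's sources attribute:

>> for source in model.sources:
       print("%s: %s" % (source.id, source.name))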

3.1.2 The Source Model Classes


The HMTK provides a set of classes (tools, in effect) designed to represent the seismogenic
source. These classes mirror their equivalent classes in OpenQuake, albeit allowing for sources
to be used with only partial attributes (namely the name, ID and geometry), as it is a primary
objective of the HMTK to constrain information sufficient to define the full earthquake rupture
forecast for the source model. The source model tools contain five classes: one for each of the
four main source typologies (point, area, simple fault and complex fault), in addition to a source
model class containing methods to convert the full source model into its OpenQuake equivalent.

HMTK Source Model


The general source model class can be created using the following function, which at a minimum
requires a unique identifier (identifier) and a name (name):
>> from openquake.hmtk.sources.source_model import mtkSourceModel
>> model1 = mtkSourceModel(identifier="0001",
                           name="Source Model 1")

If a list of sources is already provided, these can be passed to the class at the creation:

>> model1 = mtkSourceModel(identifier="0001",
                           name="Source Model 1",
                           sources=list_of_sources)

The source model class contains two methods:


.serialise_to_nrml(filename, use_defaults=False)
This method converts the existing source model to an instance of the equivalent class from
the OpenQuake “nrml” library. This is needed in order to export the source model into the nrml
format for use with OpenQuake. When the boolean parameter use_defaults is set to True the
function will use default values for any missing variables, except for the magnitude frequency
distribution, which if missing will produce an error.
.convert_to_oqhazardlib(tom, simple_mesh_spacing=1.0, complex_mesh_spacing=2.0,
area_discretisation=10.0, use_defaults=False)
This method converts the mtkSourceModel into an instance of the equivalent source model
class in the OpenQuake hazard library. This can be used to run a full PSHA calculation from
the source model. The OpenQuake source model class requires the definition of a temporal
occurrence model (TOM). This describes the type of recurrence model and the period for which
the probabilities are defined. For example, in the most common case in which the user wishes to
run a time-independent (i.e. Poissonian) PSHA and return the probability of exceeding a specific
ground motion level in, e.g., 50 years:
>> from openquake.hazardlib.tom import PoissonTOM

>> temporal_model = PoissonTOM(50.0)

>> oq_source_model1 = model1.convert_to_oqhazardlib(
       temporal_model)

The optional parameters control the discretisation of the geometry of the corresponding
sources, if they are present in the model: simple_mesh_spacing the mesh spacing (in km) of
the simple fault typology, complex_mesh_spacing the mesh spacing (in km) of the complex
fault typology, and area_discretisation the spacing of the mesh of nodes used to discretise
the area source model.

Default Values
In the ideal circumstances the user will have defined, for each source, the complete input model
needed for a PSHA calculation before converting to either the nrml or the oq-hazardlib format. It
is recognised, however, that it still be desirable to generate a hazard model from the source model,
even if some information (such as hypocentral depth distribution or nodal plane distribution)
remains incomplete. This might be the case if one wishes to explore the sensitivity of the hazard
curve to certain aspects of the modelling process. The default values are assumed to be as
follows:
• Aspect Ratio: 1.0
• Magnitude Scaling Relation: Wells and Coppersmith (1994) (“WC1994”)
• Nodal Plane Distribution: Strike = 0.0, Dip = 90.0, Rake = 0.0, Weight=1.0
• Hypocentral Depth Distribution: Depth = 10.0 km, Weight = 1.0
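For example, a partially defined model could be exported to nrml with the missing attributes filled by these defaults (the output path is illustrative):

>> model1.serialise_to_nrml('path/to/output_model.xml', use_defaults=True)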

HMTK Point Source Model


The HMTK point source typology has the following attributes:
• id: Unique Identifier
• name: Name of source
• trt: Tectonic Region Type

• geometry: Geometry of the source as an instance of the OpenQuake Point Geometry


• upper_depth: Upper seismogenic depth (km)
• lower_depth: Lower seismogenic depth (km)
• mag_scale_rel: Magnitude Scaling Relation
• rupt_aspect_ratio: Rupture Aspect Ratio
• mfd: Magnitude Frequency Distribution
• nodal_plane_dist: Nodal Plane Distribution
• hypo_depth_dist: Hypocentral Depth Distribution
• catalogue: Earthquake catalogue associated with the source
A source is created by:
>> from openquake.hmtk.sources.point_source import mtkPointSource

>> from openquake.hazardlib.geo.point import Point

# In this example the point is located at 30.0 E, 30.0 N
>> point_location = Point(30.0, 30.0)

>> point_source1 = mtkPointSource("001",
                                  "Point1",
                                  "Active Shallow Crust",
                                  point_location,
                                  upper_depth=0.0,
                                  lower_depth=30.0)

The point source class has the following methods:

.select_catalogue(selector, distance, selector_type="circle",
distance_metric="epicentral", point_depth=None,
upper_eq_depth=None, lower_eq_depth=None)
This selects a catalogue within a distance from the point location, as in the sketch below. The
input selector must be an instance of the openquake.hmtk.seismicity.selector.CatalogueSelector
class, and distance is the distance (in km). Two different selection types (identified using the
option selector_type) are available: "circle" selects events within a circle of radius distance
from the point, whilst "square" selects events within a square grid cell of side length distance
centred on the point. The distance can be defined in terms of "epicentral" or "hypocentral"
distance. point_depth can locate the selection point at a specific depth (only relevant if
hypocentral distance is used), whilst upper_eq_depth and lower_eq_depth limit the selection
to earthquakes within the specified upper and lower depth limits respectively.
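A brief sketch of the selection process (the create_copy option is assumed here to keep the master catalogue unmodified):

>> from openquake.hmtk.seismicity.selector import CatalogueSelector

>> selector = CatalogueSelector(catalogue, create_copy=True)

# Attach to the source all events within 50 km (epicentral) of the point
>> point_source1.select_catalogue(selector, 50.0)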
.create_oqnrml_source(use_defaults=False)
Converts the mtkPointSource into its equivalent OpenQuake nrml model.
.create_oqhazardlib_source(tom, mesh_spacing, use_defaults=False)
Converts the source model into its equivalent oq-hazardlib class. tom is the temporal
occurrence model; mesh_spacing is not used by the point source typology.

HMTK Area Source Model


The HMTK area source typology contains the same attributes as the HMTK point source typology,
with the following exception:
• geometry: Geometry of the source as an instance of the OpenQuake Polygon geometry
A source is created by:
>> from openquake.hmtk.sources.area_source import mtkAreaSource
>> from openquake.hazardlib.geo import point, polygon
# Create a simple polygon
>> area_boundary = polygon.Polygon([point.Point(30.0, 31.0),
                                    point.Point(30.1, 31.0),
                                    point.Point(30.1, 30.0),
                                    point.Point(30.0, 30.0)])

>> area_source1 = mtkAreaSource("001",
                                "Area1",
                                "Active Shallow Crust",
                                area_boundary,
                                upper_depth=0.0,
                                lower_depth=30.0)

The area source class also has the following methods:

.select_catalogue(selector, distance=None)
where selector is an instance of the HMTK "selector" class, and distance is the buffer
distance (km) around the outside of the polygon (if desired).
.create_oqnrml_source(use_defaults=False)
Converts the mtkAreaSource into its equivalent OpenQuake nrml model.
.create_oqhazardlib_source(tom, mesh_spacing, area_discretisation,
use_defaults=False)
Converts the source model into its equivalent oq-hazardlib class. tom is the temporal
occurrence model and area_discretisation is the spacing (in km) of the mesh of nodes used
to discretise the area source model.

HMTK Simple Fault Source Model


The HMTK Simple Fault source model is one of two typologies intended to characterise a fault
model. The attributes are the same as for the point and area source typologies, with the following
exceptions:
• geometry: Geometry of the source as an instance of the OpenQuake simple fault surface
geometry
• dip: Dip angle in degrees
• rake: The rake angle of the fault (in degrees)
• fault_trace: The fault trace (i.e. the projection of the fault up-dip to the ground surface)
as an instance of the class openquake.hazardlib.geo.line.Line
This class can be created in a slightly different manner when compared to the point and area
source classes, as the example below describes:
>> from openquake.hmtk.sources.simple_fault_source import mtkSimpleFaultSource

>> from openquake.hazardlib.geo import line, point
# Create the fault trace
>> fault_trace = line.Line([point.Point(30.0, 31.0),
                            point.Point(30.5, 30.5),
                            point.Point(31.0, 30.5)])

>> fault_source1 = mtkSimpleFaultSource("001",
                                        "SimpleFault1",
                                        "Active Shallow Crust")

>> fault_source1.create_geometry(fault_trace,
                                 dip=60.0,
                                 upper_depth=0.,
                                 lower_depth=25.,
                                 mesh_spacing=1.0)

The HMTK simple fault source has the following methods:


.select_catalogue(selector, distance, distance_metric="joyner-boore",
upper_eq_depth=None, lower_eq_depth=None)

Selects the earthquakes within a distance of a simple fault source, where selector is an in-
stance of the HMTK “selector” class, distance is the distance from the fault, distance_metric
is the type of distance metric used (“joyner-boore” or “rupture”).
.create_oqnrml_source(use_defaults=False)
Converts the mtkSimpleFaultSource into its equivalent OpenQuake nrml model.
.create_oqhazardlib_source(tom, mesh_spacing, use_defaults=False)
Converts the source model into its equivalent oq-hazardlib class. tom is the temporal
occurrence model and mesh_spacing is the spacing (in km) of the mesh of nodes used to
discretise the fault surface.

HMTK Complex Fault Model


The HMTK Complex Fault source describes a fault model using the OpenQuake complex fault
typology (i.e. one in which the trace edges of the fault need not be parallel). The attributes and
methods of this class are identical to those of the HMTK Simple Fault typology, with the exception
that the attribute fault_trace is replaced with fault_edges:
• fault_edges: The edges of the fault as a list of instances of the class
openquake.hazardlib.geo.line.Line
The object can be created in the following manner:
>> from openquake.hmtk.sources.complex_fault_source import mtkComplexFaultSource

>> from openquake.hazardlib.geo import line, point
# Create the upper edge of the fault in three dimensions
>> upper_edge = line.Line([point.Point(30.0, 31.0, 0.0),
                           point.Point(30.5, 30.5, 1.0),
                           point.Point(31.0, 30.5, 0.5)])

# Create the lower edge of the fault in three dimensions
>> lower_edge = line.Line([point.Point(30.05, 31.0, 27.0),
                           point.Point(30.53, 30.5, 21.0),
                           point.Point(31.1, 30.5, 25.5)])

>> fault_source1 = mtkComplexFaultSource("001",
                                         "ComplexFault1",
                                         "Active Shallow Crust")

>> fault_source1.create_geometry([upper_edge,
                                  lower_edge],
                                 mesh_spacing=1.0)

3.2 Hazard Calculation Tools


The dependency of the HMTK on the OpenQuake hazardlib permits the usage of its seismic
hazard calculators for performing small-scale PSHA calculations. The motivation for doing so
comes primarily from the desire to explore the impact of modelling decisions, not only on the
resulting recurrence model but also on the resulting hazard curve. Such sensitivity studies can
provide an important insight into which elements of the model have the greatest impact on the
seismic hazard analysis.
In the following example we show how to set-up and run a PSHA calculation from an
openquake.hmtk source model, using one GMPE (Akkar and Bommer, 2010) and two intensity
measures (PGA and Sa (1.0)).
1. The initial step to running a PSHA calculation is to transform the HMTK source model into
its corresponding openquake.hazardlib model. To do this we use the .convert_to_oqhazardlib
function described previously

# Set up the temporal occurrence model
>> from openquake.hazardlib.tom import PoissonTOM
>> tom = PoissonTOM(50.0)
# If the HMTK source model is called "mtk_source_model"
>> oq_source_model = \
       mtk_source_model.convert_to_oqhazardlib(tom)

2. The next step is to set up a site model. To do this we use the openquake.hazardlib.site
classes. In this example we consider three sites:
>> from openquake.hazardlib import site
>> from openquake.hazardlib.geo.point import Point
# Site arguments: location, vs30 (m/s), vs30 measured (True/False),
# z1pt0 (m) and z2pt5 (km)
# Site 1 is located at (30 E, 40 N), vs30 is 760 (measured)
>> site_1 = site.Site(Point(30.0, 40.0),
                      760.,
                      True,
                      100.0,
                      5.0)
# Site 2 is located at (30.5 E, 40.5 N), vs30 is 500 (measured)
>> site_2 = site.Site(Point(30.5, 40.5),
                      500.,
                      True,
                      100.0,
                      5.0)
# Site 3 is located at (31.0 E, 40.5 N), vs30 is 200 (inferred)
>> site_3 = site.Site(Point(31.0, 40.5),
                      200.,
                      False,
                      100.0,
                      5.0)
# Join them together to form a site collection
>> sites = site.SiteCollection([site_1, site_2, site_3])

Alternatively, if the site data are held in an array (as would be the case if they were
loaded from a csv file), a built-in HMTK function can be used to create the site model:
# For the same sites as in the previous example
>> import numpy as np
>> from openquake.hmtk.hazard import site_array_to_collection
>> site_array = np.array(
       [[30.0, 40.0, 760., 1.0, 100., 5.0, 1.],
        [30.5, 40.5, 500., 1.0, 100., 5.0, 2.],
        [31.0, 40.6, 200., 0.0, 100., 5.0, 3.]])
>> sites = site_array_to_collection(site_array)

3. Define the GMPE tectonic regionalisation. In this case we consider only one tectonic
region type (Active Shallow Crust) and one GMPE (Akkar and Bommer, 2010):
# The Akkar & Bommer (2010) GMPE is known to
# OpenQuake as AkkarBommer2010
>> gmpe_model = {"Active Shallow Crust": "AkkarBommer2010"}

4. Define the intensity measure types and corresponding intensity measure levels
>> imt_list = ["PGA", "SA(1.0)"]
>> pga_iml = [0.001, 0.01, 0.02, 0.05, 0.1,
              0.2, 0.4, 0.6, 0.8, 1.0, 2.0]
>> sa1_iml = [0.001, 0.01, 0.02, 0.05, 0.1,
              0.2, 0.3, 0.5, 0.7, 1.0, 1.5]
>> iml_list = [pga_iml, sa1_iml]

5. Run the PSHA calculation


>> from openquake.hmtk.hazard import HMTKHazardCurve
>> haz_curves = HMTKHazardCurve(oq_source_model,
                                sites,
                                gmpe_model,
                                iml_list,
                                imt_list,
                                truncation_level=3.0,
                                source_integration_dist=None,
                                rupture_integration_dist=None)

6. The output, in the above example “haz_curves”, is a dictionary that has the following
form:
>> haz_curves
{"PGA": np.array([[P(IML_1), P(IML_2), ... P(IML_nIML)],
                  [P(IML_1), P(IML_2), ... P(IML_nIML)],
                  [P(IML_1), P(IML_2), ... P(IML_nIML)]]),
 "SA(1.0)": np.array([[P(IML_1), P(IML_2), ... P(IML_nIML)],
                      [P(IML_1), P(IML_2), ... P(IML_nIML)],
                      [P(IML_1), P(IML_2), ... P(IML_nIML)]])}

where P(IML_i) is the probability of exceeding intensity measure level i in the period of
the temporal occurrence model (50 years in this case). So for each intensity measure
type there is a corresponding 2-D array of values with N_SITES rows and N_IMLS columns,
where N_SITES is the number of sites in the site model, and N_IMLS is the number of intensity
measure levels defined for the specific intensity measure type.
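For example, the probabilities for a given site and intensity measure type can be extracted and, under the Poisson assumption, converted into equivalent annual rates of exceedance (a sketch, assuming the dictionary keys match the imt_list strings):

>> import numpy as np

# Probabilities of exceedance in 50 years for site 1, PGA
>> poes_pga_site1 = haz_curves["PGA"][0, :]

# Equivalent annual rates of exceedance (Poisson assumption)
>> annual_rates = -np.log(1.0 - poes_pga_site1) / 50.0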

4. Geology Tools

4.1 Fault Recurrence from Geology


The second set of tools is designed to support a workflow in which the modeller has sufficient
information to define both the geometry of the active fault surface and the slip rate, from which
they then wish to calculate the activity rate on the fault according to a particular magnitude
frequency distribution. It is recognised that in practice this is a complex and challenging process
as the physical parameters of many faults may be highly uncertain, and the propagation of this
uncertainty is critical in defining the epistemic uncertainty on such source models (Peruzza et al.,
2010). The current implementation of the tools focusses on the time-independent workflow
entirely, aiming to allow the user to constrain the activity rate from the geological slip for an
assumed single section. It is hoped that in future this will evolve to consider more complex
conditions, such as those in which the observations of displacement at points along the segment
and interactions between segments can be taken into consideration. The manner in which these
features take shape will become clearer as more data is input into the global fault database
created by the GEM Faulted Earth project.
The core of the time-independent workflow originates from the simple moment balance
in which the total moment release rate on the fault, Ṁ₀ (in N m yr⁻¹), is derived from the slip rate
ṡ (Anderson and Luco, 1983; Bungum, 2007):

Ṁo = cµAṡ (4.1)

where A is the area of the fault surface (in km²), µ is the shear modulus (characterised in the
toolkit in terms of gigapascals, GPa) and c is the coefficient of seismogenic coupling. Slip rates
must be input in mm yr⁻¹; lengths and areas in km and km², respectively, as in the worked example
below. The magnitude frequency distribution calculators differ primarily in the manner in which
this moment rate is distributed in order to constrain the activity rate. The different calculators are
described below.
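A worked example of equation (4.1) with the unit conversions made explicit (the numbers are purely illustrative):

# Fully coupled fault: 30 GPa shear modulus, 600 km^2 area, 5 mm/yr slip
c = 1.0
mu = 30.0 * 1.0E9     # GPa -> Pa
area = 600.0 * 1.0E6  # km^2 -> m^2
slip = 5.0 * 1.0E-3   # mm/yr -> m/yr

moment_rate = c * mu * area * slip  # = 9.0E16 N m per year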

4.1.1 Epistemic Uncertainties in the Fault Modelling


The manner in which epistemic uncertainties are incorporated into the fault recurrence calculation
appears to vary somewhat in practice. This is in no small part due to the manner in which
the uncertainty on the contributing parameters is represented in a quantitative sense. For the
present implementation, and driven in part by the need for consistency with the OpenQuake
hazardlib, a purely decision-based epistemic uncertainty analysis is supported. This requires that,
for each parameter upon which epistemic uncertainty is supported, the user must specify the
alternative values and the corresponding weightings. Currently, epistemic uncertainty is supported
on six different parts of the model:
1. Slip Rate (mm yr−1 )
2. Magnitude Scaling Relation
3. Shear Modulus (GPa)
4. Displacement to Length Ratio
5. Number of standard deviations on the Magnitude Scaling Relation
6. Configuration of the Magnitude Frequency Distribution
As input to the function, these epistemic uncertainties must be represented as a list of tuples,
with a corresponding value and weight. For example, if one wished to characterise the slip on a
fault by three values (e.g. 3.0, 5.0, 7.0) with corresponding weightings (e.g. 0.25, 0.5, 0.25), the
slip should be defined as shown below:
>> slip = [(3.0, 0.25), (5.0, 0.5), (7.0, 0.25)]

In characterising uncertainty in this manner the user is essentially creating a logic tree for
each source, in which the total number of activity rate models (i.e. the end branches of the logic
tree) is the product of the number of alternative values input for each supported parameter. The
user can make a choice as to how they wish this uncertainty to be represented in the resulting
hazard model:

Complete Enumeration
This will essentially reproduce the fault in the source model N times, where N is the total number
of end branches; on each branch the resulting activity rate is multiplied by the weight of that
branch.

Collapsed Branches
In some cases it may be too costly to reproduce the fault models separately for each end branch,
and the user may simply wish to collapse the logic tree into a single activity rate. This rate,
represented by an incremental magnitude frequency distribution, is the sum of the weighted
activity rates on all the branches. To calculate this the program will determine the minimum and
maximum magnitude over all the branches, then, using a user-specified bin width, calculate the
weighted sum of the occurrence rates in each bin, as in the sketch below.
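A sketch of this collapsing procedure (illustrative only; each branch is assumed to be a tuple of magnitude-bin centres, incremental rates and a branch weight, all on a common bin width):

import numpy as np

def collapse_branches(branches, bin_width=0.1):
    m_lo = min(mags.min() for mags, _, _ in branches)
    m_hi = max(mags.max() for mags, _, _ in branches)
    bins = np.arange(m_lo, m_hi + bin_width / 2.0, bin_width)
    collapsed = np.zeros(bins.shape)
    for mags, rates, weight in branches:
        # Map each branch bin onto the common magnitude grid
        idx = np.round((mags - m_lo) / bin_width).astype(int)
        collapsed[idx] += weight * rates
    return bins, collapsed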
N.B. When collapsing the branches, the original magnitude scaling relations used on the
branches and the scaling relation associated with the source in the resulting OpenQuake source
model are not necessarily the same! The user will need to specify the scaling relation that will
be assigned to the fault source when the branches are collapsed.

Magnitude Scaling Relations

To ensure compatibility with the OpenQuake engine, the scaling relations are taken directly
from the OpenQuake library. Therefore the only scaling relations available are those that can
currently be found in the oq-hazardlib (Wells and Coppersmith (1994) and Thomas et al. (2010) at
the time of writing). To implement new magnitude scaling relations the reader is referred to the
documentation and source code of the OpenQuake hazard library (http://docs.openquake.org/oq-hazardlib).

4.1.2 Tectonic Regionalisation


Recognising once again that certain parameters may not be possible to constrain on a fault-by-fault
basis, a tectonic regionalisation can be invoked in order to define parameters, or a distribution of
values, that can be assigned to all faults sharing that tectonic regionalisation classification. At
present, the regionalisation can be used to define default parameters/distributions for:
1. Magnitude Scaling Relation
2. Shear Modulus
3. Displacement to Length Ratio
If defining a tectonic regionalisation the inputs must be specified as a set of tuples, in the
same fashion as described for the epistemic uncertainties. In the present version there is no direct
geographical link between the fault and the tectonic region (i.e. the regionalisation is a data
holder and does not have geographical attributes), although it is anticipated that this may change
in the future. At present it will be necessary to indicate for each fault the tectonic regionalisation
class to which it belongs.

4.1.3 Definition of the Fault Input


The “YAML” Format
The establishment of a standard xml format for representing input faults in OpenQuake remains to
be undertaken, and will follow the completion of the GEM Faulted Earth project. The current
version supports a less verbose, and more human-readable, characterisation using the "Yet Another
Markup Language" (YAML) format. The YAML format is both case and spacing sensitive, so
care must be paid to the spacing and the punctuation characters below. For development, the
primary advantage of the YAML format is that the data are largely defined in a manner consistent
with the corresponding Python objects, lists and dictionaries. This makes the reading of the file a
simpler process.
A template example (for a single simple fault) is broken down into steps below.
The first component of the Yaml file is the tectonic regionalisation:

#*****************************************************************************
#FAULT FILE IN YAML (Yet Another Markup Language) FORMAT
#*****************************************************************************
#
tectonic_regionalisation:
- Name: Active Shallow Crust
Code: 001
# Magnitude scaling relation (see http://docs.openquake.org/oq-hazardlib)
# for currently available choices!
Magnitude_Scaling_Relation: {
Value: [WC1994],
Weight: [1.0]}
# Shear Modulus (in gigapascals, GPa)
Shear_Modulus: {
Value: [30.0],
Weight: [1.0]}
# Fault displacement to length ratio
Displacement_Length_Ratio: {
Value: [1.25E-5],
Weight: [1.0]}

Here the tectonic regionalisation represents a list of categories (albeit only one is shown
above). The list elements are demarcated by the ( - ) symbol and all indentation is with respect
to that symbol. So in the above example, a single region class has been defined with the name
"Active Shallow Crust" and a unique identifier code "001". The default values are then provided
for the three data attributes: magnitude scaling relation, shear modulus and displacement to
length ratio. Each attribute is associated with a dictionary containing two keys: "Value" and
"Weight", which define the lists of the values and the corresponding weights, respectively.
N.B. If, for any of the above attributes, the number of weights is not the same as the number
of values, or the weights do not sum to 1.0, then an error will be raised!

A fault model must be defined with both a model identifier key and a model name: "001"
and "Template Simple Fault" in the example below. From then on, each fault is defined as an
element in the list.
Fault_Model_ID: 001
Fault_Model_Name: Template Simple Fault
Fault_Model:
- ID: 001
Tectonic_Region: Active Shallow Crust
Fault_Name: A Simple Fault
Fault_Geometry: {
Fault_Typology: Simple,
# For simple typology, defines the trace in terms of Long., Lat.
Fault_Trace: [30.0, 30.0,
30.0, 31.5],

# Upper Seismogenic Depth (km)


Upper_Depth: 0.0,
# Lower Seismogenic Depth (km)
Lower_Depth: 20.0,
Strike: ,
# Dip (degrees)
Dip: 60.}
Rake: -90
Slip_Type: Thrust
Slip_Completeness_Factor: 1
# slip [value_1, value_2, ... value_n]
# [weight_1, weight_2, ... weight_n]
Slip: {
Value: [18., 20.0, 23.],
Weight: [0.3, 0.5, 0.2]}
#Aseismic Slip Factor
Aseismic: 0.0
MFD_Model:
# Example of constructor for characteristic earthquake
- Model_Type: Characteristic
# Spacing (magnitude units) of the magnitude frequency distribution
MFD_spacing: 0.1
# Weight of the model
Model_Weight: 0.2
# Magnitude of the Characteristic Earthquake
Maximum_Magnitude:
# Uncertainty on Characteristic Magnitude (in magnitude units)
Sigma: 0.12
# Lower bound truncation (in number of standard deviations)
Lower_Bound: -3.0
# Upper bound truncation (in number of standard deviations)
Upper_Bound: 3.0
####################################################
- Model_Name: AndersonLucoArbitrary
# Example constructor of the Anderson & Luco (1983) - Arbitrary Exponential
# Type - chooses between type 1 (’First’), type 2 (’Second’) or type 3 (’Third’)
Type: First
MFD_spacing: 0.1
Model_Weight: 0.1
# Maximum Magnitude of the exponential distribution
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 4.5
# b-value of the exponential distribution as [expected, uncertainty]
b_value: [0.8, 0.05]
####################################################
- Model_Name: AndersonLucoAreaMmax
# Example constructor of the Anderson & Luco (1983) - Area-Mmax Exponential
# Type - chooses between type 1 (’First’), type 2 (’Second’) or type 3 (’Third’)
Model_Type: Second
MFD_spacing: 0.1
Model_Weight: 0.1
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:

Minimum_Magnitude: 4.5
b_value: [0.8, 0.05]
####################################################
- Model_Name: YoungsCoppersmithExponential
# Example constructor of the Youngs & Coppersmith (1985) Exponential model
MFD_spacing: 0.1
Model_Weight: 0.3
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 5.0
b_value: [0.8, 0.05]
####################################################
# Example constructor of the Youngs & Coppersmith (1985) Characteristic model
- Model_Name: YoungsCoppersmithCharacteristic
Model_Weight: 0.3
Maximum_Magnitude:
Maximum_Magnitude_Uncertainty:
Minimum_Magnitude: 5.0
MFD_spacing: 0.1
b_value: [0.8, None]
Shear_Modulus: {
Value: [30., 35.0],
Weight: [0.8, 0.2]}
Magnitude_Scaling_Relation: {
Value: [WC1994],
Weight: [1.0]}
Scaling_Relation_Sigma: {
Value: [-1.5, 0.0, 1.5],
Weight: [0.15, 0.7, 0.15]}
Aspect_Ratio: 1.5
Displacement_Length_Ratio: {
Value: [1.25E-5, 1.5E-5],
Weight:[0.5, 0.5]}

The details of the magnitude frequency distribution configurations MFD_Model will be
expanded upon in the respective sections. The critical attributes for a fault are then:
• ID The unique identifier for the fault
• Fault_Name The fault name
• Tectonic_Region The tectonic region class to which the fault is assigned. Note that if
the region class is not defined in the tectonic region header then an error will be raised.
• Fault_Geometry A dictionary of the geometrical properties of the fault (described in
further detail in due course)
• Rake The rake of the fault (as defined using the Aki and Richards (2002) convention)
• Slip A dictionary with the slip rate values of the fault in mm yr−1 and their corresponding
weights
• Aseismic A coefficient describing the fraction of slip released aseismically (effectively
1 − c where c is the coupling coefficient)
• MFD_Model A list of models describing the corresponding magnitude frequency distribu-
tion properties
• Aspect_Ratio The rupture aspect ratio
• Scaling_Relation_Sigma The number of standard deviations of the magnitude scaling
relation and their corresponding weights.
For the shear modulus, magnitude scaling relation and displacement to length ratio the fault
input is the same as for the tectonic regionalisation. These attributes can be defined for a single
fault; if they are not provided for the fault they are taken from the tectonic regionalisation. If they
are not defined within the tectonic regionalisation either, then the default values are assumed:
30 GPa for the shear modulus, Wells and Coppersmith (1994) for the scaling relation and
1.25 × 10−5 for the displacement to length ratio.
The remaining attributes are not essential and are not used in the calculation at present, but
are included for completeness:
• Slip_Type Description of the type of slip (i.e. normal, thrust, etc.)
• Slip_Completeness_Factor The completeness factor (or quality factor) of the slip as
an integer from 1 (Well constrained) to 4 (Unconstrained)

4.1.4 Fault Recurrence Models


The current implementation of the openquake.hmtk supports nine different models for deriving
recurrence models from geological parameters. Six different models are provided for exponential
distributions, which originate from the study of Anderson and Luco (1983), one for a “simple
characteristic” earthquake (used here), a further exponential model (Youngs and Coppersmith,
1985) and the hybrid characteristic-exponential model also presented by Youngs and Coppersmith
(1985). Models describing exponential recurrence require the definition of a “b-value” for each
source. The various nuances and assumptions behind each model are described in the sources
cited here, and the reader is strongly encouraged to refer to the source material for further detail.
Common to many of these models is the moment magnitude definition of Hanks and
Kanamori (1979):

$$M_o(M_W) = 10.0^{16.05 + 1.5 M_W} = M_o(M_{MAX})\,e^{-\bar{d}\,\Delta M} \qquad (4.2)$$

where $\bar{d} = 1.5 \log_e(10.0)$.
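For orientation, equation 4.2 can be transcribed directly into Python. The function below is purely illustrative (it is not part of the HMTK) and returns the moment in dyne-cm, consistent with the 16.05 constant used above:

import numpy as np

def moment_from_magnitude(mw):
    # Hanks & Kanamori (1979) seismic moment (dyne-cm) for moment magnitude mw
    return 10.0 ** (16.05 + 1.5 * mw)

# The ratio Mo(MW) / Mo(MMAX) equals exp(-dbar * (MMAX - MW)), as in eq. 4.2
dbar = 1.5 * np.log(10.0)
print(moment_from_magnitude(6.0) / moment_from_magnitude(7.5))  # = 10 ** -2.25
print(np.exp(-dbar * 1.5))                                      # same value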


Anderson and Luco (1983) “Arbitrary”
The study of Anderson and Luco (1983) defines three categories of models for defining a
magnitude recurrence distribution from the geological slip. The first model refers to the case
when the recurrence is related to the entire fault, which we call the “Arbitrary” model here. The
second model refers the recurrence to the rupture area of the maximum earthquake, the “Area
Mmax” model here. The third category relates the recurrence to a specific site on the fault, and is
not yet implemented in the tools. Within each of the three categories there are three different
subcategories, which allow for different forms of tapering at the upper edge of the model. The
reader is referred to the original paper of Anderson and Luco (1983) for further details and a
complete derivation of the models. The different forms of the recurrence model are referred
to here as types 1, 2 and 3, which correspond to equations 4, 3 and 2 in the original paper of
Anderson and Luco (1983).
1. The ‘first’ type of Anderson and Luco (1983) arbitrary model is defined as:

$$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{d}}\,\frac{\mu A \dot{s}}{M_o(M_{MAX})}\,\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) \qquad (4.3)$$

where $\bar{b} = b \log_e(10.0)$, $M_o(M_{MAX})$ is the moment of the maximum magnitude, and $A$ and
$\dot{s}$ are the area of the fault and the slip rate, as defined previously.

2. The ‘second’ type of model is defined as:

$$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{b}}\,\frac{\mu A \dot{s}}{M_o(M_{MAX})}\left(\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right) \qquad (4.4)$$

3. The ‘third’ type of model is defined as:

$$N(M_W \geq M) = \frac{\bar{d}\left(\bar{d} - \bar{b}\right)}{\bar{b}}\,\frac{\mu A \dot{s}}{M_o(M_{MAX})}\left[\frac{1}{\bar{b}}\left(\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right) - \left[M_{MAX} - M\right]\right] \qquad (4.5)$$
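To make the three variants concrete, the sketch below transcribes equations 4.3 to 4.5 directly. It is illustrative only and is not the HMTK implementation (which handles unit conversions, discretisation and weighting internally); SI units are assumed here (shear modulus in Pa, area in m², slip in m/yr), with the moment of MMAX taken from the Hanks and Kanamori (1979) relation in N m:

import numpy as np

def anderson_luco_arbitrary_rate(mag, mmax, b, mu, area, slip, model_type="First"):
    # Illustrative transcription of equations 4.3 - 4.5 (not the HMTK code)
    # mu [Pa], area [m^2], slip [m/yr] -> cumulative annual rate N(Mw >= mag)
    bbar = b * np.log(10.0)
    dbar = 1.5 * np.log(10.0)
    mo_max = 10.0 ** (9.05 + 1.5 * mmax)  # Hanks & Kanamori moment in N m
    moment_rate = mu * area * slip
    dm = mmax - mag
    if model_type == "First":
        return ((dbar - bbar) / dbar) * (moment_rate / mo_max) * np.exp(bbar * dm)
    elif model_type == "Second":
        return ((dbar - bbar) / bbar) * (moment_rate / mo_max) * \
            (np.exp(bbar * dm) - 1.0)
    # "Third"
    return (dbar * (dbar - bbar) / bbar) * (moment_rate / mo_max) * \
        ((np.exp(bbar * dm) - 1.0) / bbar - dm)

# e.g. a 200 km x 20 km fault, mu = 30 GPa, slip = 10 mm/yr
print(anderson_luco_arbitrary_rate(5.0, 7.5, 0.8, 3.0E10, 4.0E9, 0.01))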


The configuration of the MFD_Model for the Anderson and Luco (1983) “Arbitrary” calculator
is shown as follows:

- Model_Name: AndersonLucoArbitrary
  # Example constructor of the Anderson & Luco (1983) - Arbitrary Exponential
  # Model_Type - chooses between type 1 ('First'), type 2 ('Second') or type 3 ('Third')
  Model_Type: First
  MFD_spacing: 0.1
  Model_Weight: 0.1
  # Maximum Magnitude of the exponential distribution
  Maximum_Magnitude:
  Maximum_Magnitude_Uncertainty:
  Minimum_Magnitude: 4.5
  # b-value of the exponential distribution as [expected, uncertainty]
  b_value: [0.8, 0.05]

The Model_Name and the Model_Type are self-explanatory. Model_Weight is the weighting of
the particular MFD within the epistemic uncertainty analysis. MFD_spacing is the spacing of the
evenly discretised magnitude frequency distribution that will be output. The three parameters
Minimum_Magnitude, Maximum_Magnitude and Maximum_Magnitude_Uncertainty define the
bounding limits of the MFD and the standard deviation of MMAX in MW units.
Minimum_Magnitude is an essential attribute, whereas Maximum_Magnitude and
Maximum_Magnitude_Uncertainty are optional; if not specified, the code will calculate
Maximum_Magnitude and Maximum_Magnitude_Uncertainty from the magnitude scaling
relationship. Finally, as these are exponential models, the b-value must be specified. Here it is
preferred that the b-value is specified as a tuple of [b-value, b-value error], although at present
the epistemic uncertainty on the b-value does not propagate (this may change in future!).
As with the catalogue tools, plotting functions are available to help the user understand
the nature of the recurrence model used for a given fault. To illustrate the impact of the choice
of the ‘first’, ‘second’ and ‘third’ type of model we consider a simple fault with the following
properties: along-strike length = 200 km, down-dip width = 20 km, rake = 0.0 (strike-slip) and
slip rate = 10 mm/yr. The Wells and Coppersmith (1994) magnitude scaling relation is assumed.
The fault and three magnitude frequency distributions are configured as shown:
>> slip = 10.0  # Slip rate in mm/yr
# Area = along-strike length (km) * down-dip width (km)
>> area = 200.0 * 20.0
# Rake = 0.
>> rake = 0.
# Magnitude Scaling Relation
>> from openquake.hazardlib.scalerel.wc1994 import WC1994
>> msr = WC1994()

>> and_luc_config1 = {'Model_Name': 'AndersonLucoArbitrary',
                      'Model_Type': 'First',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config2 = {'Model_Name': 'AndersonLucoArbitrary',
                      'Model_Type': 'Second',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config3 = {'Model_Name': 'AndersonLucoArbitrary',
                      'Model_Type': 'Third',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}

Figure 4.1 – Comparison of magnitude frequency distributions for the specified fault using the three
different models of the Anderson and Luco (1983) “Arbitrary” configuration [plot of Annual Rate
versus Magnitude, logarithmic rate scale]

The models are then compared, as shown in Figure 4.1, using the following commands:
>> anderson_luco_arb = [and_luc_config1,
                        and_luc_config2,
                        and_luc_config3]
# Import geological recurrence model plotting function
>> from openquake.hmtk.plotting.faults.geology_mfd_plot import \
    plot_recurrence_models
>> plot_recurrence_models(anderson_luco_arb,
                          area,
                          slip,
                          msr,
                          rake,
                          msr_sigma=0.0)  # Number of standard
                                          # deviations above or
                                          # below the median msr

Anderson and Luco (1983) “Area Mmax”

The second set of models from Anderson and Luco (1983) considers the case when the
recurrence model is referred to the rupture area of the maximum earthquake specified on the
fault. As the area is not extracted directly from the geometry, additional information must be
provided, namely the aspect ratio of ruptures on the fault and the displacement to length ratio ($\alpha$)
of the fault (Bungum, 2007). This information is used to derive an additional parameter ($\beta$):

$$\beta = \sqrt{\frac{\alpha\,M_o(0)}{\mu W}} \qquad (4.6)$$

The three types of Anderson and Luco (1983) “Area Mmax” model are then calculated via:

1. Type 1 (“First”)

$$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{d}}\,\frac{\dot{s}}{\beta^2}\,\exp\left(-\frac{\bar{d}}{2}M_{MAX}\right)\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) \qquad (4.7)$$

2. Type 2 (“Second”)

$$N(M_W \geq M) = \frac{\bar{d} - \bar{b}}{\bar{b}}\,\frac{\dot{s}}{\beta^2}\,\exp\left(-\frac{\bar{d}}{2}M_{MAX}\right)\left(\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right) \qquad (4.8)$$

3. Type 3 (“Third”)

$$N(M_W \geq M) = \frac{\bar{d}\left(\bar{d} - \bar{b}\right)}{\bar{b}}\,\frac{\dot{s}}{\beta^2}\,\exp\left(-\frac{\bar{d}}{2}M_{MAX}\right)\left[\frac{1}{\bar{b}}\left(\exp\left(\bar{b}\left[M_{MAX} - M\right]\right) - 1\right) - \left[M_{MAX} - M\right]\right] \qquad (4.9)$$
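For orientation, the additional parameter β of equation 4.6 might be computed as in the short sketch below. This is illustrative only (not the HMTK implementation); SI units are assumed, and $M_o(0)$ is taken as the Hanks and Kanamori (1979) moment of a magnitude-zero event:

import numpy as np

def beta_parameter(disp_length_ratio, mu, width):
    # Illustrative transcription of eq. 4.6: beta = sqrt(alpha * Mo(0) / (mu * W))
    mo_zero = 10.0 ** 9.05  # assumed moment (N m) of an Mw = 0 event
    return np.sqrt(disp_length_ratio * mo_zero / (mu * width))

# e.g. alpha = 1.25E-5, mu = 30 GPa, down-dip width W = 20 km
print(beta_parameter(1.25E-5, 3.0E10, 20000.0))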

As the rupture aspect ratio and displacement to length ratio are attributes of the fault and not
of the MFD, the MFD configuration is the same as that of the Anderson and Luco (1983) “Arbitrary”
calculator, albeit that Model_Name must now be specified as AndersonLucoAreaMmax.
As before, the maximum magnitude and its uncertainty are optional, and will be taken from the
magnitude scaling relation if not specified in the configuration. This is permitted simply to
ensure flexibility of the algorithm, although given the context of the “Area Mmax” algorithm it
is understood that the maximum magnitude should be interpreted by the modeller. If this is not
the case, and the maximum magnitude is intended to be constrained using the geometry of the
rupture, the “Arbitrary” model may be preferable.
The three distributions can be compared visually for the same fault using the plotting tools
shown previously. The example below, using the same fault properties defined previously, will
generate a plot similar to that shown in Figure 4.2.
>> and_luc_config1 = {'Model_Name': 'AndersonLucoAreaMmax',
                      'Model_Type': 'First',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config2 = {'Model_Name': 'AndersonLucoAreaMmax',
                      'Model_Type': 'Second',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> and_luc_config3 = {'Model_Name': 'AndersonLucoAreaMmax',
                      'Model_Type': 'Third',
                      'Model_Weight': 1.0,
                      'MFD_spacing': 0.1,
                      'Maximum_Magnitude': None,
                      'Minimum_Magnitude': 4.5,
                      'b_value': [0.8, 0.05]}
>> anderson_luco_area_mmax = [and_luc_config1,
                              and_luc_config2,
                              and_luc_config3]
>> plot_recurrence_models(anderson_luco_area_mmax,
                          area,
                          slip,
                          msr,
                          rake,
                          disp_length_ratio=1.25E-5,
                          msr_sigma=0.0)

Figure 4.2 – Comparison of magnitude frequency distributions for the specified fault using the three
different models of the Anderson and Luco (1983) “Area-Mmax” configuration [plot of Annual Rate
versus Magnitude, logarithmic rate scale]

Characteristic
Although the term “Characteristic” may take on different meanings in the literature, in
the present calculator it refers to the circumstance when the fault is assumed to rupture
with magnitudes distributed in a narrow range around a single characteristic magnitude. The
model is therefore a truncated Gaussian distribution, in which the following must be specified: the
mean characteristic magnitude, the uncertainty (in magnitude units) and the number of standard
deviations above and below the mean to be used as truncation limits.

# Example of constructor for characteristic earthquake
- Model_Name: Characteristic
  # Spacing (magnitude units) of the magnitude frequency distribution
  MFD_spacing: 0.1
  # Weight of the model
  Model_Weight: 0.2
  # Magnitude of the Characteristic Earthquake
  Maximum_Magnitude:
  # Uncertainty on Characteristic Magnitude (in magnitude units)
  Sigma: 0.12
  # Lower bound truncation (in number of standard deviations)
  Lower_Bound: -3.0
  # Upper bound truncation (in number of standard deviations)
  Upper_Bound: 3.0

The parameters Model_Name, MFD_spacing, Model_Weight, and Maximum_Magnitude
are as described for the previous calculators. Sigma is the uncertainty of the characteristic
magnitude (in magnitude units), and Lower_Bound and Upper_Bound are the lower and upper
truncation limits of the Gaussian distribution respectively. Note that setting Sigma to 0.0, or
Lower_Bound and Upper_Bound to zero, will simply result in the characteristic magnitude being
evaluated as a single Dirac function.
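The shape of this distribution is straightforward to reproduce independently. The sketch below (plain numpy/scipy, illustrative only and not the HMTK implementation) builds an evenly discretised truncated Gaussian incremental MFD whose rates sum to an assumed total rate of characteristic events:

import numpy as np
from scipy.stats import truncnorm

def characteristic_mfd(char_mag, sigma, lower, upper, spacing, total_rate):
    # Truncated Gaussian distribution of magnitudes around char_mag;
    # lower and upper are the truncation limits in standard deviations
    dist = truncnorm(lower, upper, loc=char_mag, scale=sigma)
    edges = np.arange(char_mag + lower * sigma,
                      char_mag + upper * sigma + spacing,
                      spacing)
    # Incremental rate = total rate x probability mass in each magnitude bin
    rates = total_rate * (dist.cdf(edges[1:]) - dist.cdf(edges[:-1]))
    return edges[:-1], rates

mags, rates = characteristic_mfd(7.0, 0.12, -3.0, 3.0, 0.1, 0.005)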

Figure 4.3 – Magnitude frequency distribution for the specified fault configuration using the “Characteristic”
model [plot of Annual Rate versus Magnitude, logarithmic rate scale]

The distribution is shown, for the fault example defined previously, in Figure 4.3, which is
generated using the code example shown below:
>> characteristic = [{'Model_Name': 'Characteristic',
                      'MFD_spacing': 0.05,
                      'Model_Weight': 1.0,
                      'Maximum_Magnitude': None,
                      'Sigma': 0.15,
                      'Lower_Bound': -3.0,
                      'Upper_Bound': 3.0}]
>> plot_recurrence_models(characteristic,
                          area,
                          slip,
                          msr,
                          rake,
                          msr_sigma=0.0)

Youngs and Coppersmith (1985) “Exponential”

This model is another form of “Exponential” model and is noted as being similar in construct to
the Anderson and Luco (1983) Type 2 models. It is included here mostly for completeness. The
model is given as:

$$N(M_W \geq M) = \frac{\mu A \dot{s}\,(1.5 - b)\left(1 - \exp\left(-\beta\left[M_{MAX} - M\right]\right)\right)}{b\,M_0^{MAX}\exp\left(-\beta\left[M_{MAX} - M\right]\right)} \qquad (4.10)$$

where $M_0^{MAX}$ is the moment corresponding to the maximum magnitude. The inputs for the model
are defined in a similar manner as for the Anderson and Luco (1983) models:

- Model_Name: YoungsCoppersmithExponential
  # Example constructor of the Youngs & Coppersmith (1985) Exponential model
  MFD_spacing: 0.1
  Model_Weight: 0.3
  Maximum_Magnitude:
  Maximum_Magnitude_Uncertainty:
  Minimum_Magnitude: 5.0
  b_value: [0.8, 0.05]

Note that all of the exponential models described here contain the term d − b, or some variant
thereof, where d is equal to 1.5. This introduces the condition that b ≤ 1.5.
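Transcribing equation 4.10 directly (illustrative only, not the HMTK implementation; SI units are assumed, as in the earlier sketches) also makes this condition visible through the (1.5 − b) term:

import numpy as np

def yc_exponential_rate(mag, mmax, b, mu, area, slip):
    # Illustrative transcription of eq. 4.10 (assumed SI units; not HMTK code)
    beta = b * np.log(10.0)
    mo_max = 10.0 ** (9.05 + 1.5 * mmax)  # moment (N m) of MMAX
    taper = np.exp(-beta * (mmax - mag))
    return (mu * area * slip * (1.5 - b) * (1.0 - taper)) / (b * mo_max * taper)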
Youngs and Coppersmith (1985) “Characteristic”
The Youngs and Coppersmith (1985) model is a hybrid model, comprising an exponential
distribution for lower magnitudes and a fixed recurrence rate for the characteristic magnitude
$M_C$. The exponential component of the model is described via:

$$N(M) - N(M_C) = \frac{\mu A \dot{s}\;e^{-\beta\left(M_{MAX}-M-0.5\right)}}{1 - e^{-\beta\left(M_{MAX}-M-0.5\right)}}\left[\frac{M_0^{MAX}\,b\,10^{-c/2}}{c-b} + \frac{M_0^{MAX}\,b\,e^{\beta}\left(1-10^{-c/2}\right)}{c}\right]^{-1} \qquad (4.11)$$

where $\beta = b \ln(10)$, $c = 1.5$ and all other parameters are as described for the Youngs and
Coppersmith (1985) “Exponential” and Anderson and Luco (1983) models. The rate for the
characteristic magnitude is then given by:

$$N(M_C) = \frac{\beta\left(N(M) - N(M_C)\right)e^{-\beta\left(M_{MAX}-M-1.5\right)}}{2\left(1 - e^{-\beta\left(M_{MAX}-M-1.5\right)}\right)} \qquad (4.12)$$

As described in Youngs and Coppersmith (1985), this model assumes that:

1. The characteristic magnitude bin width $\Delta M_C$ is 0.5 $M_W$ units
2. Magnitudes are exponentially distributed up to a value $M'$, where $M' = M_{MAX} - \Delta M_C$
3. The absolute rate of characteristic earthquakes $\dot{M}(M_C)$ is approximately equal to $\dot{M}(M' - 1)$
The current calculator adopts the implementation found in the OpenQuake hazardlib. At
present, these three model assumptions are hard-coded, meaning that the distribution need
only be defined from the moment rate and the characteristic magnitude. For implementation
simplicity the input definition for the characteristic earthquake model is actually the same as for
the exponential model. However, here the attribute Maximum_Magnitude actually refers to
the characteristic magnitude and not to the absolute maximum magnitude, which will be 0.25
magnitude units larger.
The two Youngs and Coppersmith (1985) distributions are compared in Figure 4.4, which is
generated using the code below:

Figure 4.4 – Magnitude frequency distribution for the specified fault configuration using the Youngs
and Coppersmith (1985) “Exponential” and “Hybrid” models [plot of Annual Rate versus Magnitude,
logarithmic rate scale]

>> exponential = {'Model_Name': 'YoungsCoppersmithExponential',
                  'MFD_spacing': 0.1,
                  'Maximum_Magnitude': None,
                  'Maximum_Magnitude_Uncertainty': None,
                  'Minimum_Magnitude': 5.0,
                  'Model_Weight': 1.0,
                  'b_value': [0.8, 0.1]}

>> hybrid = {'Model_Name': 'YoungsCoppersmithCharacteristic',
             'MFD_spacing': 0.1,
             'Maximum_Magnitude': None,
             'Maximum_Magnitude_Uncertainty': None,
             'Minimum_Magnitude': 5.0,
             'Model_Weight': 1.0,
             'b_value': [0.8, 0.1],
             'delta_m': None}

>> youngs_coppersmith = [exponential, hybrid]

# View the corresponding magnitude recurrence model
>> plot_recurrence_models(youngs_coppersmith,
                          area,
                          slip,
                          msr,
                          rake,
                          msr_sigma=0.0)

4.1.5 Running a Recurrence Calculation from Geology


The particulars of the geological workflow are largely established in the input definition, and not
in the configuration file. Execution of the whole process is then relatively simple, so it is broken
down here into two steps. The first step simply imports an appropriate parser and loads the fault
source. The parser contains a method called read_file(x), which takes as an input the desired
mesh spacing (in km) used to create the mesh of the fault surface. In the example below this is
set to 1.0 km, but for large complex faults (such as large subduction zones) it may be desirable to
select a larger spacing to avoid storing a large mesh in RAM.
>> from openquake.hmtk.parsers.faults.fault_yaml_parser import \
    FaultYmltoSource

>> input_file = 'path/to/fault_model_file.yml'

>> parser = FaultYmltoSource(input_file)

# Spacing of the fault mesh (km) must be specified
# at input (here 1.0 km)
>> fault_model, tect_reg = parser.read_file(1.0)

In the second step, we simply execute the method to calculate the recurrence on the fault,
and then write the resulting source model to an xml file.

# Builds the fault model
>> fault_model.build_fault_model()
# Specify the xml file for writing the output model
>> output_file = 'path/to/output_source_model_file.xml'

>> fault_model.source_model.serialise_to_nrml(output_file,
                                              use_defaults=True)

The serialiser takes as an optional input the choice to accept default values for certain
attributes that may be missing from the fault source definition. These values are as follows:

Attribute                        Default Value
Aspect Ratio                     1.0
Magnitude Scaling Relation       Wells and Coppersmith (1994) ('WC1994')
Nodal Plane Distribution         Strike = 0.0, Dip = 90.0, Rake = 0.0, Probability = 1.0
Hypocentral Depth Distribution   Depth = 10.0 km, Probability = 1.0

Epistemic Uncertainties
The following example compares the case when epistemic uncertainties are incorporated into the
analysis. A demonstration file (tests/parsers/faults/yaml_examples/
simple_fault_example_4branch.yml) is included, which considers two epistemic uncertainties
for a specific fault: slip rate and magnitude frequency distribution type. The fault has two
estimates of slip rate (5 mm yr−1 and 7 mm yr−1), each assigned a weighting of 0.5. Two magnitude
frequency distribution types (Characteristic and Anderson and Luco (1983) “Arbitrary”) are
assigned, with weights of 0.7 and 0.3 respectively. The analysis is run in the manner described
previously. Firstly we consider the case when the different options are enumerated (the default
option):
>> from openquake.hmtk.parsers.faults.fault_yaml_parser import \
    FaultYmltoSource
>> input_file = \
    'tests/parsers/faults/yaml_examples/simple_fault_example_4branch.yml'
>> parser = FaultYmltoSource(input_file)
>> fault_model, tect_reg = parser.read_file(1.0)
>> fault_model.build_fault_model()
>> output_file = 'path/to/output_source_model_file.xml'
>> fault_model.source_model.serialise_to_nrml(output_file,
                                              use_defaults=True)

As four different activity rates are produced, the source is duplicated four times, each with
an activity rate that corresponds to the rate calculated for the specific branch, multiplied by the
weight of the branch. The output is a nrml file with four sources, as illustrated below:

<?xml version=’1.0’ encoding=’UTF-8’?>


<nrml xmlns:gml="http://www.opengis.net/gml" xmlns="http://openquake.org/xmlns/nrml/0.4">
<sourceModel name="Template Simple Fault">
<simpleFaultSource id="1_1" name="A Simple Fault" tectonicRegion="Active Shallow Crust">
<simpleFaultGeometry>
<gml:LineString>
<gml:posList>30.0 30.0 30.0 31.0</gml:posList>
</gml:LineString>
<dip>30.0</dip>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel>WC1994</magScaleRel>
<ruptAspectRatio>1.5</ruptAspectRatio>
<incrementalMFD minMag="6.64" binWidth="0.1">
<occurRates>1.66984888376e-05 0.000165760464747 0.000658012622916
0.00135343006814 0.00144508466725 0.00080109356265
0.000230216031359 3.05523435745e-05 0.0</occurRates>
</incrementalMFD>
<rake>-90.0</rake>
</simpleFaultSource>
<simpleFaultSource id="1_2" name="A Simple Fault" tectonicRegion="Active Shallow Crust">
<simpleFaultGeometry>
<gml:LineString>
<gml:posList>30.0 30.0 30.0 31.0</gml:posList>
</gml:LineString>
<dip>30.0</dip>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel>WC1994</magScaleRel>
<ruptAspectRatio>1.5</ruptAspectRatio>
<incrementalMFD minMag="4.5" binWidth="0.1">
<occurRates>0.0404671423911 0.033659102961 0.0279964224108
0.0232864098818 0.0193687920987 0.0161102595577
0.0133999302432 0.0111455765116 0.00927048675038
0.00771085501945 0.00641360984941 0.00533460831472
0.00443713392921 0.00369064724985 0.00306974667434
0.00255330407018 0.00212374582219 0.00176645483392
0.00146927313415 0.00122208816284 0.00101648865894
0.000845478440245 0.000703238335844 0.000584928170206
0.000486522060675 0.000404671423911</occurRates>
</incrementalMFD>
<rake>-90.0</rake>
</simpleFaultSource>
<simpleFaultSource id="1_3" name="A Simple Fault" tectonicRegion="Active Shallow Crust">
<simpleFaultGeometry>
<gml:LineString>
<gml:posList>30.0 30.0 30.0 31.0</gml:posList>
</gml:LineString>
<dip>30.0</dip>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel>WC1994</magScaleRel>
<ruptAspectRatio>1.5</ruptAspectRatio>
<incrementalMFD minMag="6.64" binWidth="0.1">
<occurRates>2.33778843726e-05 0.000232064650645 0.000921217672083
0.0018948020954 0.00202311853414 0.00112153098771
0.000322302443902 4.27732810043e-05 0.0</occurRates>
</incrementalMFD>
<rake>-90.0</rake>
</simpleFaultSource>
<simpleFaultSource id="1_4" name="A Simple Fault" tectonicRegion="Active Shallow Crust">
<simpleFaultGeometry>
<gml:LineString>
<gml:posList>30.0 30.0 30.0 31.0</gml:posList>
</gml:LineString>
<dip>30.0</dip>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel>WC1994</magScaleRel>
<ruptAspectRatio>1.5</ruptAspectRatio>
<incrementalMFD minMag="4.5" binWidth="0.1">
<occurRates>0.0566539993476 0.0471227441454 0.0391949913751
0.0326009738345 0.0271163089382 0.0225543633808
0.0187599023404 0.0156038071162 0.0129786814505
0.0107951970272 0.00897905378917 0.00746845164061
0.00621198750089 0.00516690614979 0.00429764534408
0.00357462569825 0.00297324415106 0.00247303676749
0.00205698238781 0.00171092342797 0.00142308412252
0.00118366981634 0.000984533670182 0.000818899438288
0.000681130884944 0.000566539993476</occurRates>
</incrementalMFD>
<rake>-90.0</rake>
</simpleFaultSource>
</sourceModel>
</nrml>

If, alternatively, the user wishes to collapse the logic tree branches to give a single source as
an output, this option can be selected as follows:
...
>> from openquake.hazardlib.scalerel.wc1994 import WC1994
>> fault_model.build_fault_model(collapse=True,
                                 rendered_msr=WC1994())
...

To collapse the branches it is simply necessary to specify collapse=True as an input to the
function. The second option requires further explanation. A seismogenic source requires the
definition of a corresponding magnitude scaling relation, even if multiple scaling relations have
been used as part of the epistemic uncertainty analysis. As collapsing the branches means that
the activity rate is no longer associated with a single specified scaling relation, but is an aggregate
of many, the user must select the scaling relation to be associated with the output activity rate, for
use in the OpenQuake hazard calculation. Therefore the input rendered_msr must be given an
instance of one of the supported magnitude scaling relationships.
The output file should resemble the following:

<?xml version=’1.0’ encoding=’UTF-8’?>


<nrml xmlns:gml="http://www.opengis.net/gml" xmlns="http://openquake.org/xmlns/nrml/0.4">
<sourceModel name="Template Simple Fault">
<simpleFaultSource id="1_1" name="A Simple Fault" tectonicRegion="Active Shallow Crust">
<simpleFaultGeometry>
<gml:LineString>
<gml:posList>30.0 30.0 30.0 31.0</gml:posList>
</gml:LineString>
<dip>30.0</dip>
<upperSeismoDepth>0.0</upperSeismoDepth>
<lowerSeismoDepth>20.0</lowerSeismoDepth>
</simpleFaultGeometry>
<magScaleRel>WC1994</magScaleRel>
<ruptAspectRatio>1.5</ruptAspectRatio>
<incrementalMFD minMag="4.5" binWidth="0.1">
<occurRates>0.0971211417387 0.0807818471064 0.0671914137858
0.0558873837162 0.0464851010369 0.0386646229385
0.0321598325836 0.0267493836278 0.0222491682009
0.0185060520467 0.0153926636386 0.0128030599553
0.0106491214301 0.00885755339963 0.00736739201842
0.00612792976843 0.00509698997324 0.00423949160142
0.00352625552195 0.00293301159081 0.00243957278146
0.00202914825659 0.00184661595792 0.00231362389313
0.00360190992617 0.00434969292261 0.00243432469089
0.000909841780938 0.000164472079868 0.0</occurRates>
</incrementalMFD>
<rake>-90.0</rake>
</simpleFaultSource>
</sourceModel>
</nrml>
5. Geodetic Tools

5.1 Recurrence from Geodetic Strain


The third and final workflow currently supported by the Hazard Modeller's Toolkit is somewhat
more experimental than the seismicity or geological workflows. The role that geodesy plays
in directly constraining earthquake recurrence rates for active seismological structures and
regions is becoming widely recognised. Whilst a standard methodology has not yet emerged, the
tools constructed here are built around the “Seismic Hazard Inferred from Tectonics (SHIFT)”
methodology originally proposed by Bird and Liu (2007) and applied on a global scale by Bird
et al. (2010). The focus is placed on this model, in part, because it is fed by regional and/or
global scale models of strain on a continuum. The initial global application, which underpins part
of the present implementation, uses the first version of the Global Strain Rate Model (Kreemer
et al., 2003). The second version of the Global Strain Rate Model has been produced as part of
the Global Earthquake Model, and it is anticipated that this will form a basis for many future
uses of the present geodetic tools.

5.1.1 The Recurrence Calculation

The point of convergence between the geological and geodetic methodologies for estimating
earthquake recurrence is in the definition of the total moment rate for an active seismogenic
structure:

$$\dot{M}_o = c\,\mu A\dot{s} \qquad (5.1)$$

where $A$ is the area of the coupled surface, $\dot{s}$ is the slip rate, $\mu$ the shear modulus and $c$ the
coefficient describing the fraction of seismogenic coupling. Initially, the moment-rate tensor
may be related to the strain rate tensor ($\varepsilon_{ij}$) using the formula of Kostrov (1974):

$$\dot{M}_o^{ij} = 2\mu H A \varepsilon_{ij} \qquad (5.2)$$

where $H$ is the seismogenic thickness and $A$ the area of the seismogenic source. Adopting
the definition for a deforming continuum given by Bird and Liu (2007), the moment rate is equal
to:

$$\dot{M}_o = A\,\langle cz \rangle\,\mu \begin{cases} 2\dot{\varepsilon}_3 & : \dot{\varepsilon}_2 < 0 \\ -2\dot{\varepsilon}_1 & : \dot{\varepsilon}_2 \geq 0 \end{cases} \qquad (5.3)$$

where it is assumed that $\dot{\varepsilon}_1 \leq \dot{\varepsilon}_2 \leq \dot{\varepsilon}_3$ and $\dot{\varepsilon}_1 + \dot{\varepsilon}_2 + \dot{\varepsilon}_3 = 0$ (i.e. no volumetric changes are observed).
The coupled seismogenic thickness ($\langle cz \rangle$) is a characteristic of each tectonic zone, and for the
current purposes corresponds to the values (and regionalisation) proposed by Bird and Kagan
(2004).
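A minimal sketch of equation 5.3 (illustrative only; it is not the HMTK implementation) makes the branching on the sign of the intermediate principal strain rate explicit. The principal strain rates are assumed to be sorted and trace-free, as stated above:

def continuum_moment_rate(e1, e2, e3, cell_area, coupled_thickness, mu):
    # Eq. 5.3: assumes e1 <= e2 <= e3 and e1 + e2 + e3 = 0
    if e2 < 0.0:
        return cell_area * coupled_thickness * mu * (2.0 * e3)
    return cell_area * coupled_thickness * mu * (-2.0 * e1)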
The Bird and Liu (2007) Approach
As the regionalisation of Bird and Kagan (2004) underpins the Bird and Liu (2007) methodology,
the following approach is used to derive the shallow seismicity rate. The geodetic moment
rate is first divided by the “model” moment rate ($\dot{M}_o^{CMT}$), which is the integral of the tapered
Gutenberg-Richter distribution fit to the subset of the Global CMT catalogue for each tectonic
zone, then multiplied by the rate of events in the sub-catalogue ($\dot{N}_{CMT}$) above the threshold
(completeness) magnitude for the sub-catalogue ($m_T^{CMT}$):

$$\dot{N}\left(m > m_T^{CMT}\right) = \frac{\dot{M}_o}{\dot{M}_o^{CMT}}\,\dot{N}_{CMT} \qquad (5.4)$$
The forecast rate of seismicity greater than $m_T$, $\dot{N}(m > m_T)$, for a particular zone (or cell) is
then described using the tapered Gutenberg-Richter distribution:

$$\dot{N}(m > m_T) = \dot{N}\left(m > m_T^{CMT}\right)\left(\frac{M_o(m_T)}{M_o\left(m_T^{CMT}\right)}\right)^{-\beta}\exp\left(\frac{M_o\left(m_T^{CMT}\right) - M_o(m_T)}{M_o(m_c)}\right) \qquad (5.5)$$

where $m_c$ is the corner magnitude of the tapered distribution.
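The scaling of equations 5.4 and 5.5 can be sketched as follows. This is illustrative only (the HMTK Shift class encapsulates these steps, together with the Bird and Kagan (2004) zone parameters); the corner magnitude and the CMT threshold rate are treated as assumed inputs here:

import numpy as np

def moment(mw):
    # Hanks & Kanamori seismic moment in N m
    return 10.0 ** (9.05 + 1.5 * mw)

def forecast_rate(m_t, m_t_cmt, rate_above_cmt, beta_value, m_corner):
    # Eq. 5.5: tapered Gutenberg-Richter scaling, where rate_above_cmt is
    # the zone rate above the CMT completeness magnitude from eq. 5.4
    return rate_above_cmt * \
        (moment(m_t) / moment(m_t_cmt)) ** -beta_value * \
        np.exp((moment(m_t_cmt) - moment(m_t)) / moment(m_corner))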

5.1.2 Running a Strain Calculation

Strain File Format
As the current process considers only continuum models, a strain model can be input as a simple
comma separated value (.csv) file. This is a basic text file with the following headers:

longitude, latitude, exx, eyy, exy, region
45.0, -45., 0.0, 0.0, 0.0, IPL
...
177.0,-38.2,19.3,-37.0,-49.7,C

In the present format, the values exx, eyy, exy describe the horizontal components of the
strain tensor (in the example above in terms of nanostrain, $10^{-9}$). The region term corresponds
to the region (in this instance the Kreemer et al. (2003) class) to which the cell belongs: Intra-plate
[IPL], Subduction [S], Continental [C], Oceanic [O] and Ridge [R]. If the user does not have a
previously defined regionalisation, one can be assigned from the built-in Kreemer et al. (2003)
regionalisation, defined on 0.6° × 0.5° global cells.
The simplest strain workflow is to implement the model as defined by Bird et al. (2010).
This process is illustrated in the following steps:
1. To simply load in the csv data the ReadStrainCsv tool is used:

>> from openquake.hmtk.parsers.strain.strain_csv_parser import \
    ReadStrainCsv, WriteStrainCsv

# Load the file
>> reader = ReadStrainCsv('path/to/strain/file.csv')
>> strain_model = reader.read_data(scaling_factor=1E-9)

2. If the regionalisation is not supplied within the input file then it can be assigned initially
from the in-built Kreemer et al. (2003) regionalisation. This is done as follows (and may
take time to execute depending on the size of the data):

# (Optional) To assign regionalisation from Kreemer et al. (2003)
>> from openquake.hmtk.strain.regionalisation.kreemer_regionalisation \
    import KreemerRegionalisation
>> regionalisation = KreemerRegionalisation()
>> strain_model = regionalisation.get_regionalisation(strain_model)

3. The next step is to implement the SHIFT calculations. The Shift module must first be
imported and the magnitudes for which the activity rates are to be calculated must be
defined as a list (or array). The strain data is input into the calculation along with two other
configurable options: cumulative decides whether to return the cumulative rate of events
above each magnitude (True) or the incremental activity rate for the bin M(i) : M(i + 1)
(False); in_seconds decides whether to return the rates per second for consistency with
Bird et al. (2010) (True) or as annual rates (False).

>> from openquake.hmtk.strain.shift import Shift
# In this example, calculate cumulative annual rates
# for M > 5., 6., 7., 8.
>> magnitudes = [5., 6., 7., 8.]
>> model = Shift(magnitudes)
>> model.calculate_activity_rate(strain_model,
                                 cumulative=True,
                                 in_seconds=False)

4. Finally the resulting model can be written to a csv file. This will be in the same format as
the input file, now with additional attributes and activity rates calculated.

# Export the resulting rates to a csv file
>> writer = WriteStrainCsv('path/to/output/file.csv')
>> writer.write_file(model.strain, scaling_factor=1E-9)

Additional support for writing a continuum model to a nrml Point Source model is envisaged,
although further work is needed to determine the optimum approach for defining the seismogenic
coupling depths, hypocentral depths and focal mechanisms.
Bibliography

Books
Aki, K. and P. G. Richards (2002). Quantitative Seismology. Sausalito, California: University
Science Books (cited on page 51).

Articles
Aki, K. (1965). “Maximum Likelihood Estimate of b in the formula log N = a - b M and its
Confidence Limits”. In: Bulletin of the Earthquake Research Institute 43, pages 237–239
(cited on pages 9, 28).
Akkar, S. and J. J. Bommer (2010). “Empirical equations for the prediction of PGA, PGV, and
spectral accelerations in Europe, the Mediterranean Region, and the Middle East”. In: Seism.
Res. Lett. 81.2, pages 195–206. DOI: 10.1785/gssrl.81.2.195 (cited on pages 44, 45).
Anderson, J. G. and J. E. Luco (1983). “Consequences of Slip Rate Constraints on Earthquake
Occurrence Relations”. In: Bull. Seism. Soc. Am. 73, pages 471–496 (cited on pages 9, 47,
52–56, 58, 60).
Bender, B. (1983). “Maximum Likelihood Estimation of b Values for Magnitude Grouped
Data”. In: Bulletin of the Seismological Society of America 73.3, pages 831–851 (cited on
page 28).
Bird, P. and Y. Y. Kagan (2004). “Plate-Tectonic Analysis of Shallow Seismicity: Apparent
Boundary Width, Beta, Corner Magnitude, Coupled Lithosphere Thickness, and Coupling
in seven tectonic settings”. In: Bulletin of the Seismological Society of America 94(6),
pages 2380–2399 (cited on page 66).
Bird, P., C. Kreemer, and W. E. Holt (2010). “A Long-term Forecast of Shallow Seismicity
Based on the Global Strain Rate Map”. In: Seismological Research Letters 81(2), pages 184–
194 (cited on pages 8, 9, 65–67).
Bird, P. and Z. Liu (2007). “Seismic Hazard Inferred from Tectonics: California”. In: Seismological
Research Letters 78(1), pages 37–48 (cited on pages 8, 9, 65, 66).
Bungum, H. (2007). “Numerical modelling of fault activities”. In: Computers & Geosciences 33,
pages 808–820 (cited on pages 47, 55).
Frankel, A. (1995). “Mapping Seismic Hazard in the Central and Eastern United States”. In:
Seismological Research Letters 66.4, pages 8–21 (cited on pages 8, 9, 34, 35).

Gardner, J. K. and L. Knopoff (1974). “Is the sequence of earthquakes in Southern California,
with aftershocks removed, Poissonian?” In: Bulletin of the Seismological Society of America
64.5, pages 1363–1367 (cited on pages 9, 22–24).
Gutenberg, B. and C. F. Richter (1944). “Frequency of Earthquakes in California”. In: Bulletin
of the Seismological Society of America 34, pages 185–188 (cited on pages 28, 30, 31).
Hanks, T. and H. Kanamori (1979). “A moment magnitude scale”. In: Journal of Geophysical
Research 84, pages 2348–2350 (cited on page 52).
Kijko, A. (2004). “Estimation of the Maximum Earthquake Magnitude, MMAX ”. In: Pure and
Applied Geophysics 161, pages 1655–1681 (cited on pages 30, 31, 33).
Kijko, A. and A. Smit (2012). “Extension of the Aki-Utsu b-Value Estimator for Incomplete
Catalogs”. In: Bulletin of the Seismological Society of America 3, pages 1283–1287 (cited on
page 28).
Kostrov, V. V. (1974). “Seismic moment and energy of earthquakes and seismic flow of rock”.
In: Izv. Acad. Sci. USSR Physics of the Solid Earth 1, pages 13–21 (cited on page 65).
Kreemer, C., W. E. Holt, and A. J. Haines (2003). “An integrated global model of present-day
plate motions and plate boundary deformation”. In: Geophysical Journal International 154,
pages 8–34 (cited on pages 65–67).
Luen, B. and P. B. Stark (2012). “Poisson tests of declustered catalogues”. In: Geophys. J. Int.
189, pages 691–700 (cited on page 22).
Makropoulos, K. C. and P. W. Burton (1983). “Seismic Risk of Circum-Pacific Earthquakes I.
Strain Energy Release”. In: Pure and Applied Geophysics 121.2, pages 247–266 (cited on
page 32).
Musson, R. M. W. (1999a). “Probabilistic Seismic Hazard Maps for the North Balkan Region”.
In: Annali di Geofisica 42.2, pages 1109–1124 (cited on page 9).
— (1999b). “Probabilistic Seismic Hazard Maps for the North Balkan Region”. In: Annali di
Geofisica 42.2, pages 1109–1124 (cited on page 24).
Peruzza, L., B. Pace, and F. Cavallini (2010). “Error propagation in time-dependent probability of
occurrence for characteristic earthquakes in Italy”. In: Journal of Seismology 14, pages 119–
141 (cited on page 47).
Rydelek, P. A. and I. S. Sacks (1989). “Testing the completeness of earthquake catalogues and
the hypothesis of self-similarity”. In: Nature 337.19, pages 251–253 (cited on page 25).
Schorlemmer, D. and J. Woessner (2008). “Probability of Detecting an Earthquake”. In: Bulletin
of the Seismological Society of America 98.5, pages 2103–2117 (cited on page 25).
Uhrhammer, R. (1986). “Characteristics of Northern and Central California Seismicity”. In:
Earthquake Notes 57.1, page 21 (cited on pages 22, 23).
Weichert, D. H. (1980). “Estimation of the Earthquake Recurrence Parameters for Unequal
Observation Periods for Different Magnitudes”. In: Bulletin of the Seismological Society of
America 70.4, pages 1337–1346 (cited on pages 9, 29, 30, 34).
Wells, D. L. and K. J. Coppersmith (1994). “New Empirical Relationships among Magnitude,
Rupture Length, Rupture Width, Rupture Area, and Surface Displacement”. In: Bull. Seism.
Soc. Am. 84.4, pages 974–1002 (cited on pages 41, 48, 51, 53, 60).
Woessner, J. and S. Wiemer (2005). “Assessing the quality of earthquake catalogues: estimating
the magnitude of completeness and its uncertainty”. In: Bulletin of the Seismological Society
of America 95.2, pages 684–698 (cited on page 25).
Youngs, R. R. and K. J. Coppersmith (1985). “Implications of Fault Slip Rates and Earthquake
Recurrence Models to Probabilistic Seismic Hazard Estimates”. In: Bull. Seism. Soc. Am. 75,
pages 939–964 (cited on pages 9, 52, 58, 59).

Other Sources
Crowley, H., M. Colombi, J. Crempien, E. Erduran, M. Lopez, H. Liu, M. Mayfield, and M.
Milanesi (2010). GEM1 Seismic Risk Report: Part 1. GEM Technical Report 2010-5. Pavia,
Italy: GEM Foundation (cited on page 37).
Felzer, K. R. (2008). The Uniform California Earthquake Rupture Forecast, version 2 (UCERF
2) Appendix I: Calculating California Seismicity Rates. Technical report. U.S. Geological
Survey Open File Report 2007-01347I (cited on pages 25, 29).
Frankel, A., C. Mueller, T. Barnhard, D. Perkins, E. V. Leyendecker, N. Dickman, S. Hanson, and
M. Hopper (1996). National Seismic Hazard Maps: Documentation June 1996. Technical
report. United States Geological Survey Open-File Report 96-532 (cited on page 29).
Frankel, A., M. D. Petersen, C. S. Mueller, K. M. Haller, R. L. Wheeler, E. V. Leyendecker,
R. L. Wesson, S. C. Harmsen, C. H. Cramer, D. M. Perkins, and K. S. Rukstales (2002).
Documentation for the 2002 Update to the National Seismic Hazard Maps. Technical report.
United States Geological Survey Open-File Report 02-420 (cited on page 29).
Mignan, A. and J. Woessner (2012). Theme IV - Understanding Seismicity Catalogs and their
Problems. Technical report doi: 10.5078/corssa-00180805. Community Online Resource for
Statistical Seismicity Analysis. URL: http://www.corssa.org (cited on page 25).
Stepp, J. C. (1971). “An investigation of earthquake risk in the Puget Sound area by the use of
the type I distribution of largest extreme”. PhD thesis. Pennsylvania State University (cited
on pages 9, 25–27).
Stiphout, T. van, J. Zhuang, and D. Marsan (2012). Theme V - Models and Techniques for
Analysing Seismicity. Technical report. Community Online Resource for Statistical Seismicity
Analysis. URL: http://www.corssa.org (cited on page 22).
Thomas, P., I. Wong, and N. A. Abrahamson (2010). Verification of Probabilistic Seismic Hazard
Analysis Computer Programs. PEER Report 2010/106. College of Engineering, University
of California, Berkeley: Pacific Earthquake Engineering Research Center (cited on page 48).
Part I

Appendices

A. The 10 Minute Guide to Python!

The HMTK is intended to be used by scientists and engineers without requiring an existing
knowledge of Python. It is hoped that the examples contained in this manual provide enough
context to allow the user to understand how to use the tools for their own needs. In spite of this,
however, an understanding of the fundamentals of the Python programming language can greatly
enhance the user experience and permit the user to join together the tools in a workflow that best
matches their needs.
The aim of this appendix is therefore to introduce some fundamentals of the Python
programming language in order to help understand how, and why, the HMTK can be used
in a specific manner. If the reader wishes to develop their knowledge of the Python programming
language beyond the examples shown here, there is a considerable body of literature on the topic
from both a scientific and a developer perspective.

A.1 Basic Data Types


Fundamental to the use of the HMTK is an understanding of the basic data types Python
recognises:

A.1.1 Scalar Parameters


• float A floating point (decimal) number. If the user wishes to enter a floating point
value then a decimal point must be included, even if the number is rounded to an integer.
>> a = 3.5
>> print a, type(a)
3.5 <type 'float'>

• integer An integer number. If the decimal point is omitted for a floating point number,
the number will be considered an integer.
>> b = 3
>> print b, type(b)
3 <type 'int'>

The functions float() and int() can convert an integer to a float and vice-versa. Note
that taking int() of a fraction will round the fraction down to the nearest integer

>> float(b)
3.0
>> int(a)
3

• string A text string (technically a “list” of text characters). The string is indicated by the
quotation marks ”something” or ’something else’
>> c = "apples"
>> print c, type(c)
apples <type 'str'>

• bool For logical operations Python can recognise a variable with a boolean data type (True
/ False).
>> d = True
>> if d:
       print "y"
   else:
       print "n"
y
>> d = False
>> if d:
       print "y"
   else:
       print "n"
n

Care should be taken in Python as the values 0 and 0.0 are both recognised as False if
applied to a logical operation. Similarly, booleans can be used in arithmetic, where True
and False take the values 1 and 0 respectively.
>> d = 1.0
>> if d:
       print "y"
   else:
       print "n"
y
>> d = 0.0
>> if d:
       print "y"
   else:
       print "n"
n

Scalar Arithmetic
Scalars support basic mathematical operations (# indicates a comment):
>> a = 3.0
>> b = 4.0
>> a + b  # Addition
7.0
>> a * b  # Multiplication
12.0
>> a - b  # Subtraction
-1.0
>> a / b  # Division
0.75
>> a ** b  # Exponentiation
81.0
# But integer behaviour can be different!
>> a = 3; b = 4
>> a / b
0
>> b / a
1

A.1.2 Iterables
Python can also define variables as lists, tuples and sets. These data types can form the basis for
iterable operations. It should be noted that unlike other languages, such as Matlab or Fortran,
Python iterable locations are zero-ordered (i.e. the first location in a list has an index value of 0,
rather than 1).
• List A simple list of objects, which may have the same or different data types. Data in lists
can be re-assigned or replaced.
>> a_list = [3.0, 4.0, 5.0]
>> print a_list
[3.0, 4.0, 5.0]
>> another_list = [3.0, "apples", False]
>> print another_list
[3.0, 'apples', False]
>> a_list[2] = -1.0
>> print a_list
[3.0, 4.0, -1.0]

• Tuples Collections of objects that can be iterated upon. As with lists, they can support
mixed data types. However, objects in a tuple cannot be re-assigned or replaced.
>> a_tuple = (3.0, "apples", False)
>> print a_tuple
(3.0, 'apples', False)
# Try re-assigning a value in a tuple
>> a_tuple[2] = -1.0
TypeError                                 Traceback (most recent call last)
<ipython-input-43-644687cfd23c> in <module>()
----> 1 a_tuple[2] = -1.0

TypeError: 'tuple' object does not support item assignment

• Range A range is a convenient function to generate arithmetic progressions. They are
called with a start, a stop and (optionally) a step (which defaults to 1 if not specified).
>> a = range(0, 5)
>> print a
[0, 1, 2, 3, 4]  # Note that the stop number is not
                 # included in the set!
>> b = range(0, 6, 2)
>> print b
[0, 2, 4]

• Sets A set is a special case of an iterable in which the elements are unordered, but contains
more enhanced mathematical set operations (such as intersection, union, difference, etc.)
>> from sets import Set
>> x = Set([3.0, 4.0, 5.0, 8.0])
>> y = Set([4.0, 7.0])
>> x.union(y)
Set([3.0, 4.0, 5.0, 7.0, 8.0])
>> x.intersection(y)
Set([4.0])
>> x.difference(y)
Set([8.0, 3.0, 5.0])  # Notice the results are not ordered!

Indexing
For some iterables (including lists, sets and strings) Python allows for subsets of the iterable
to be selected and returned as a new iterable. The selection of elements within the set is done
according to the index of the set.
>> x = range(0, 10)  # Create an iterable
>> print x
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>> print x[0]   # Select the first element in the set
0               # recall that iterables are zero-ordered!
>> print x[-1]  # Select the last element in the set
9
>> y = x[:]     # Select all the elements in the set
>> print y
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>> y = x[:4]    # Select the first four elements of the set
>> print y
[0, 1, 2, 3]
>> y = x[-3:]   # Select the last three elements of the set
>> print y
[7, 8, 9]
>> y = x[4:7]   # Select the elements with indices 4, 5 and 6
>> print y
[4, 5, 6]

A.1.3 Dictionaries
Python is capable of storing multiple data types associated with a map of variable names inside a
single object. This is called a “Dictionary”, and works in a similar manner to a “data structure” in
languages such as Matlab. Dictionaries are used frequently in the HMTK as ways of structuring
inputs to functions that share a common behaviour but may take different numbers and types of
parameters on input.
>> earthquake = {"Name": "Parkfield",
                 "Year": 2004,
                 "Magnitude": 6.1,
                 "Recording Agencies": ["USGS", "ISC"]}
# To call or view a particular element in a dictionary
>> print earthquake["Name"], earthquake["Magnitude"]
Parkfield 6.1

A.1.4 Loops and Logicals


Python's syntax for undertaking logical operations and iterable operations is relatively
straightforward.
Logical
A simple logical branching structure can be defined as follows:
>> a = 3.5
>> if a <= 1.0:
       b = a + 2.0
   elif a > 2.0:
       b = a - 1.0
   else:
       b = a ** 2.0
>> print b
2.5

Boolean operations are simply rendered as and, or and not.

>> a = 3.5
>> if (a <= 1.0) or (a > 3.0):
       b = a - 1.0
   else:
       b = a ** 2.0
>> print b
2.5

Looping
There are several ways to apply looping in Python. For simple mathematical operations, the
easiest is to make use of the range function:
>> for i in range(0, 5):
       print i, i ** 2
0 0
1 1
2 4
3 9
4 16

The same could be achieved using the while statement (though possibly this approach is far
less desirable depending on the circumstance):
>> i = 0
>> while i < 5:
       print i, i ** 2
       i += 1
0 0
1 1
2 4
3 9
4 16

A for loop can be applied to any iterable:


>> fruit_data = ["apples", "oranges", "bananas", "lemons",
                 "cherries"]
>> i = 0
>> for fruit in fruit_data:
       print i, fruit
       i += 1
0 apples
1 oranges
2 bananas
3 lemons
4 cherries

The same results can be generated, arguably more cleanly, by making use of the enumerate
function:
>> fruit_data = ["apples", "oranges", "bananas", "lemons",
                 "cherries"]
>> for i, fruit in enumerate(fruit_data):
       print i, fruit
0 apples
1 oranges
2 bananas
3 lemons
4 cherries

As with many other programming languages, Python contains the statements break to break
out of a loop, and continue to pass to the next iteration.

>> i = 0
>> while i < 10:
       if i == 3:
           i += 1
           continue
       elif i == 5:
           break
       else:
           print i, i ** 2
           i += 1
0 0
1 1
2 4
4 16

A.2 Functions
Python easily supports the definition of functions. A simple example is shown below. Pay careful
attention to indentation and syntax!
>> def a_simple_multiplier(a, b):
       """
       Documentation string - tells the reader the function
       will multiply two numbers, and return the result and
       the square of the result
       """
       c = a * b
       return c, c ** 2.0

>> x = a_simple_multiplier(3.0, 4.0)
>> print x
(12.0, 144.0)

In the above example the function returns two outputs. If only one output is assigned then
that output will take the form of a tuple, where the elements correspond to each of the two
outputs. To assign directly, simply do the following:
>> x, y = a_simple_multiplier(3.0, 4.0)
>> print x
12.0
>> print y
144.0

A.3 Classes and Inheritance


Python is one of many languages that is fully object-oriented, and the use (and terminology) of
objects is prevalent throughout the HMTK and this manual. A full treatise on the topic of
object-oriented programming in Python is beyond the scope of this manual and the reader is
referred to one of the many textbooks on Python for more examples.

A.3.1 Simple Classes


A class is an object that can hold both attributes and methods. For example, imagine we wish to
convert an earthquake magnitude from one scale to another; however, if the earthquake occurred
after a user-defined year we wish to use a different formula. This could be done by a method, but
we can also use a class:

>> class MagnitudeConverter(object):
       """
       Class to convert magnitudes from one scale to another
       """
       def __init__(self, converter_year):
           """
           Instantiate the converter with the cut-off year
           """
           self.converter_year = converter_year

       def convert(self, magnitude, year):
           """
           Converts the magnitude from one scale to another
           """
           if year < self.converter_year:
               converted_magnitude = -0.3 + 1.2 * magnitude
           else:
               converted_magnitude = 0.1 + 0.94 * magnitude
           return converted_magnitude

>> converter1 = MagnitudeConverter(1990)
>> mag_1 = converter1.convert(5.0, 1987)
>> print mag_1
5.7
>> mag_2 = converter1.convert(5.0, 1994)
>> print mag_2
4.8
# Now change the conversion year
>> converter2 = MagnitudeConverter(1995)
>> mag_1 = converter2.convert(5.0, 1987)
>> print mag_1
5.7
>> mag_2 = converter2.convert(5.0, 1994)
>> print mag_2
5.7

In this example the class holds both the attribute converter_year and the method to convert
the magnitude. The class is created (or “instantiated”) with only the information regarding the
cut-off year to use the different conversion formulae. Then the class has a method to convert a
specific magnitude depending on its year.

A.3.2 Inheritance
Classes can be useful in many ways in programming. One such way is due to the property of
inheritance. This allows for classes to be created that can inherit the attributes and methods of
another class, but permit the user to add on new attributes and/or modify methods.
In the following example we create a new magnitude converter, which may work in the same
way as the MagnitudeConverter class, but with different conversion methods.
>> class NewMagnitudeConverter(MagnitudeConverter):
       """
       A magnitude converter using different conversion
       formulae
       """
       def convert(self, magnitude, year):
           """
           Converts the magnitude from one scale to another
           - differently!!!
           """
           if year < self.converter_year:
               converted_magnitude = -0.1 + 1.05 * magnitude
           else:
               converted_magnitude = 0.4 + 0.8 * magnitude
           return converted_magnitude

# Now compare converters
>> converter1 = MagnitudeConverter(1990)
>> converter2 = NewMagnitudeConverter(1990)
>> mag1 = converter1.convert(5.0, 1987)
>> print mag1
5.7
>> mag2 = converter2.convert(5.0, 1987)
>> print mag2
5.15
>> mag3 = converter1.convert(5.0, 1994)
>> print mag3
4.8
>> mag4 = converter2.convert(5.0, 1994)
>> print mag4
4.4

A.3.3 Abstraction
Inspection of the HMTK code (https://github.com/gem/oq-engine) shows frequent usage of classes
and inheritance. This is useful in our case if we wish to make available different methods for
the same problem. In many cases the methods may have similar logic, or may provide the same
types of outputs, but the specifics of the implementation may differ. Functions or attributes that
are common to all methods can be placed in a “Base Class”, permitting each implementation of
a new method to inherit the “Base Class” and its functions/attributes/behaviour. The new method
will simply modify those aspects of the base class that are required for the specific method in
question. This allows functions to be used interchangeably, thus allowing for a "mapping" of
data to specific methods.
An example of abstraction is shown using our two magnitude converters shown previously.
Imagine that a seismic recording network (named "XXX") has a model for converting from their
locally recorded magnitude to a reference global scale (for the purposes of narrative, imagine that
a change in recording procedures in 1990 results in a change of conversion model). A different
recording network (named “YYY”) has a different model for converting their local magnitude
to a reference global scale (and we imagine they also changed their recording procedures, but
they did so in 1994). We can create a mapping that would apply the correct conversion for each
locally recorded magnitude in a short catalogue, provided we know the local magnitude, the year
and the recording network.
>> CONVERSION_MAP = {"XXX": MagnitudeConverter(1990),
                     "YYY": NewMagnitudeConverter(1994)}
>> earthquake_catalogue = [(5.0, "XXX", 1985),
                           (5.6, "YYY", 1992),
                           (4.8, "XXX", 1993),
                           (4.4, "YYY", 1997)]
>> for earthquake in earthquake_catalogue:
       # Note the backslash, which breaks up a long line
       converted_magnitude = \
           CONVERSION_MAP[earthquake[1]].convert(earthquake[0],
                                                 earthquake[2])
       print earthquake, converted_magnitude
(5.0, 'XXX', 1985) 5.7
(5.6, 'YYY', 1992) 5.78
(4.8, 'XXX', 1993) 4.612
(4.4, 'YYY', 1997) 3.92
So we have a simple magnitude homogeniser that applies the correct conversion depending on
the network and year. It then becomes a very simple matter to add new converters for new
agencies; hence we have a “toolkit” of conversion functions!
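In the HMTK itself the logic shared by all implementations would typically be gathered into a
“Base Class”, with each specific method overriding only what it needs. The following is a
minimal sketch of that pattern, re-using our magnitude converters as the running example (the
class name BaseMagnitudeConverter is invented here for illustration and is not an HMTK class):

>> class BaseMagnitudeConverter(object):
       """
       Base class holding the behaviour common to all converters
       """
       def __init__(self, converter_year):
           # Common attribute, inherited by every converter
           self.converter_year = converter_year

       def convert(self, magnitude, year):
           """
           Each specific converter must override this method
           """
           raise NotImplementedError("convert() must be implemented "
                                     "by the specific converter")

Each concrete converter then inherits BaseMagnitudeConverter and implements only convert, so
instances of any converter can be used interchangeably in a mapping such as CONVERSION_MAP above.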
A.4 Numpy/Scipy
Python has two powerful libraries for undertaking mathematical and scientific calculation,
which are essential for the vast majority of scientific applications of Python: Numpy (for
multi-dimensional array calculations) and Scipy (an extensive library of applications for maths,
science and engineering). Both libraries are critical to OpenQuake and the HMTK. Each
package is so extensive that a comprehensive description would require a book in itself. Fortunately
there is abundant online documentation for Numpy (www.numpy.org) and Scipy
(www.scipy.org), so we do not need to go into detail here.
The particular facet we focus upon here is the way in which Numpy operates with respect to
vector arithmetic. Users familiar with Matlab will recognise many similarities in the way the
Numpy package undertakes array-based calculations. Likewise, as with Matlab, code that is well
vectorised is significantly faster and more efficient than the pure Python equivalent.
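As a simple illustration (a sketch only; the actual speed-up depends on the machine and the
array size), the loop and the vectorised expression below produce the same result, but the
vectorised form executes in compiled code and is typically orders of magnitude faster for
large arrays:

>> import numpy as np
>> x = np.arange(1000000.0)
# Pure Python: loop over every element one at a time
>> y_loop = np.zeros(len(x))
>> for i in range(len(x)):
       y_loop[i] = 2.0 * x[i] + 1.0
# Vectorised: the whole array in a single expression
>> y_vec = 2.0 * x + 1.0
>> np.allclose(y_loop, y_vec)
True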
The following shows how to undertake basic array arithmetic operations using the Numpy
library:
>> import numpy as np
# Create two vectors of data, of equal length
>> x = np.array([3.0, 6.0, 12.0, 20.0])
>> y = np.array([1.0, 2.0, 3.0, 4.0])
# Basic arithmetic
>> x + y  # Addition (element-wise)
np.array([4.0, 8.0, 15.0, 24.0])
>> x + 2  # Addition of scalar
np.array([5.0, 8.0, 14.0, 22.0])
>> x * y  # Multiplication (element-wise)
np.array([3.0, 12.0, 36.0, 80.0])
>> x * 3.0  # Multiplication by scalar
np.array([9.0, 18.0, 36.0, 60.0])
>> x - y  # Subtraction (element-wise)
np.array([2.0, 4.0, 9.0, 16.0])
>> x - 1.0  # Subtraction of scalar
np.array([2.0, 5.0, 11.0, 19.0])
>> x / y  # Division (element-wise)
np.array([3.0, 3.0, 4.0, 5.0])
>> x / 2.0  # Division by scalar
np.array([1.5, 3.0, 6.0, 10.0])
>> x ** y  # Exponentiation (element-wise)
np.array([3.0, 36.0, 1728.0, 160000.0])
>> x ** 2.0  # Exponentiation (by scalar)
np.array([9.0, 36.0, 144.0, 400.0])
Numpy also contains a vast set of mathematical functions that operate element-wise on a vector, e.g.:
>> x = np.array([3.0, 6.0, 12.0, 20.0])
>> np.exp(x)
np.array([2.00855369e+01, 4.03428793e+02, 1.62754791e+05,
          4.85165195e+08])
# Trigonometry
>> theta = np.array([0., np.pi / 2.0, np.pi, 1.5 * np.pi])
>> np.sin(theta)
np.array([0.0000, 1.0000, 0.0000, -1.0000])
>> np.cos(theta)
np.array([1.0000, 0.0000, -1.0000, 0.0000])
Some of the most powerful features of Numpy, however, come from its logical indexing:
>> x = np.array([3.0, 5.0, 12.0, 21.0, 43.0])
>> idx = x >= 10.0  # Perform a logical operation
>> print idx
np.array([False, False, True, True, True])
>> x[idx]  # Return an array consisting of elements
           # for which the logical operation returned True
np.array([12.0, 21.0, 43.0])
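Logical conditions can also be combined element-wise, using & for “and” and | for “or” (note
that each condition must be wrapped in parentheses):

>> x = np.array([3.0, 5.0, 12.0, 21.0, 43.0])
>> x[(x >= 5.0) & (x < 25.0)]  # Both conditions must hold
np.array([5.0, 12.0, 21.0])
>> x[(x < 5.0) | (x > 40.0)]  # Either condition may hold
np.array([3.0, 43.0])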
Numpy can also create, index and slice n-dimensional arrays:
>> x = np.array([[3.0, 5.0, 12.0, 21.0, 43.0],
                 [2.0, 1.0, 4.0, 12.0, 30.0],
                 [1.0, -4.0, -2.1, 0.0, 92.0]])
>> np.shape(x)
(3, 5)
>> x[:, 0]  # First column
np.array([3.0, 2.0, 1.0])
>> x[1, :]  # Second row
np.array([2.0, 1.0, 4.0, 12.0, 30.0])
>> x[:, [1, 4]]  # Second and fifth columns
np.array([[ 5.0, 43.0],
          [ 1.0, 30.0],
          [-4.0, 92.0]])
The reader is referred to the online documentation for the full set of functions!
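Scipy builds on Numpy arrays in the same interactive style. Although it is not demonstrated
elsewhere in this appendix, a small illustrative taste of its statistics module is given below,
using the standard normal distribution:

>> from scipy import stats
# Probability that a standard normal variate is less than zero
>> print stats.norm.cdf(0.0)
0.5
# The 97.5th percentile - the familiar "1.96" of statistics tables
>> print round(stats.norm.ppf(0.975), 2)
1.96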