Applied Quantitative Finance
Using Python for Financial Analysis

Mauricio Garita
Universidad Francisco Marroquín
Guatemala City, Guatemala
This Palgrave Pivot imprint is published by the registered company Springer Nature
Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
I want to thank God, my wife Sonia, Míkel and Maia for being my reason to keep moving forward.
Introduction
What This Book Is and Is Not
This book is not a theoretical book that explains each and every detail of every indicator presented. It centers on applying finance to the different indicators to offer a hands-on experience.
This book does not cover all the aspects of finance. It is centered on technical, quantitative and risk analysis of the stock market, but it does not cover every avenue. It does not contain options and futures, Monte Carlo simulations or binomial trees. The reason is that this book aims at an introductory to intermediate level; for the advanced level there are books far more detailed on these aspects.
This book does not aim to teach a programming language to the reader. It explains the easiest way to program a portfolio, a MACD, a VaR and other financial instruments. The code is simple and clean, so that the reader is not overwhelmed by programming. I truly believe that we can learn to program if we start from the basics, and this book aims for that.
This book does not aim to be perfect. Its aim is to open a discussion about finance and the growth of the different methods of the internet. The book therefore works through the Yahoo Finance API, because it presents live information at every moment. Even so, everything can also be elaborated in Excel after uploading the document to Jupyter Notebooks or Google Colaboratory.
Contents

Why Python?
Installing Python in the Computer
Using Jupyter Notebooks with Python
Understanding Jupyter Notebooks
Using Google Colab
References

Loops
For Loop
While Loop
List Comprehension
References

Index
About the Author
List of Figures

Why Python?
Fig. 1 Comparison between R and Python (Source: Pfeiffer 2019)
Fig. 2 Jupyter Notebooks (Source: the author's computer)
Fig. 3 Jupyter Notebook—selecting Python (Source: the author's computer)
Fig. 4 Creating a folder (Source: the author's computer)
Fig. 5 A sample of Jupyter Notebooks (Source: the author's computer)
Fig. 6 Installing a package (Source: the author's computer)
Fig. 7 Selecting package to install (Source: the author's computer)
Fig. 8 Searching for a package (Source: the author's computer)
Fig. 9 Process of installing a package (Source: the author's computer)
Fig. 10 Creating a new Colab document
Fig. 11 Google Colaboratory
Fig. 12 Changing name to notebook in Google Colab
Fig. 13 Accessing Google Colab documents

Value at Risk
Fig. 1 Cumulative Return of the portfolio (Source: Elaborated by the author with information from Yahoo Finance)
Fig. 2 Drawdown of the portfolio (Source: Elaborated by the author with information from Yahoo Finance)
Fig. 3 MSAUX SMA
Fig. 4 MSAUX EMA
Fig. 5 MSAUX Bollinger Bands
Fig. 6 MSAUX RSI
Fig. 7 Rebase of the closing price in funds and ETF
Fig. 8 Funds and ETF monthly progression
List of Tables
List of Equations
Value at Risk
Equation 1 Value at Risk - position
Why Python?
When Guido van Rossum created the language, he was aiming for a better, easier language to handle Amoeba, a microkernel-based operating system. Its creation began in 1989, and in 1991 it was posted by the author to USENET (Python Organization 2020).
From there, Python has grown into one of the most used programming languages in the world. O'Reilly (2020) has mentioned that in their online learning environment Python is preeminent, accounting for approximately 10% of the usage on their website. The growth of the language is also demonstrated by the new application programming interfaces being built around it.
The main reason for using Python is that it is growing faster than predicted in the finance sector, giving way to a faster and better analysis of the stock market. An API allows gathering information without the hassle of downloading an Excel file, deciding where the information should be kept, or knowing which folder is the right one to use. The future of finance is centered on the capacity of the new technology, and Python, for the author, is the best vehicle.
To install Python, download the Anaconda distribution from the download menu on the Anaconda website. This will automatically read the interface of the computer and guide you to selecting the version for macOS, Windows or Linux.
Once the Anaconda Navigator is installed, a menu appears on the left side with the categories Home, Environments, Projects (beta), Learning and Community. Before installing a package, it is important to close any Jupyter Notebook being worked on; otherwise the package will not be picked up by the notebook.
Once downloaded, an installation screen like the one below will guide you through the installation.
When installing Anaconda, in the Advanced installation options it is recommended to set Python 3.8 as the default. This allows sharing information with other programs, which is useful for the rest of the book. Follow the installation according to these recommendations and you will have the Anaconda Navigator installed on your computer.
When the user opens the Anaconda Navigator, it leads to the main screen. There the user will see on the left side a menu with the items Home, Environments, Projects (beta), Learning and Community. More detail on each item will be given as the book advances.
In the center the user will find different options such as JupyterLab, Jupyter Notebook, Qt Console, Spyder, Glueviz, Orange3, RStudio and VS Code. The book will mostly be using the Jupyter Notebook environment for the exercises. Finally, on the right-hand side the user will see a Sign in option.
Once Jupyter Notebook is open, the program leads to the main directory, which is the directory where Python is installed. In the next segment, we will discuss how to change the Python directory.
Fig. 2 Jupyter Notebooks (Source Obtained from the computer of the author)
On the top left-hand side there are three important aspects for
Python:
• Files
– Here are the new and the old files. You can create a new file; you
can move files into a new folder or create a new folder. Also, you
can access files that have been created previously.
• Running
– This will give you the information concerning which programs
are being used.
• Clusters
– This will not be a topic for the book since it involves sharing
information with other data science webpages.
On the right-hand side there will be two options:
• Upload
– This is used to upload a document, an Excel file or a Jupyter Notebook to the directory.
• New
– Depending on the version and the system, by selecting new you
can build a new Python Notebook, a text file, a folder or prompt
the Terminal.
Once Jupyter Notebook is launched, the first step is to create a folder where the work will be archived. This can be done by opening the New menu and then selecting Folder from the options (Fig. 3).
Once the folder has been created, it can be renamed by choosing the
folder and then choosing Rename that will appear once you choose the
folder that will be modified (Fig. 4).
Once the folder is created, Python files can be added by opening the New menu and choosing Python.
Some important aspects before starting in Jupyter Notebooks (Fig. 5):
• Finding your directory: Type pwd in a cell and press enter. This will tell you where the file is, so that it is retrievable and can be used later in the process.
• About the Kernel: A kernel is basically the process that runs the script (what you write in the notebook). It is important to restart, shut down or reconnect the kernel depending on the necessity. When running the program, always restart the kernel if you have been logged out of your computer or away from it for some time.
Fig. 4 Creating a folder (Source Obtained from the computer of the author)
In the center of the Jupyter Notebook you will find the command prompt line with an In [ ] marker and a gray line. Here is where the code is written and then executed. Once executed, you will get an Out [ ] line with the result.
Finally, Jupyter Notebooks are very user-friendly: if a mistake is made, a description of the error is shown, which makes the mistake easier to identify and solve.
Installing a package (Fig. 6):
When selecting Environments there are two columns of information. The first column, titled Search Environments, leads to the different environments used in the process; for example, there are different environments depending on whether the use is deep learning or Big Data. To the right of the environments is the package section, which can be sorted by installed/not installed on the upper left. Choosing installed displays all the packages that have been installed on the computer.
On the top right there is a search packages bar that is very useful when
searching for a specific package such as NumPy, a package that is used
consistently throughout the book (Fig. 7).
Fig. 6 Installing a package (Source Obtained from the computer of the author)
Fig. 7 Selecting package to install (Source Obtained from the computer of the
author)
Fig. 8 Searching for a package (Source Obtained from the computer of the
author)
Once the package is located, the small square box should be selected for download. On the bottom right of the page there is an Apply and a Clear button; select Apply to install the package.
By choosing to apply, a small window will appear that searches for the package, modifies it and solves any dependency problems. Be aware that this may take some time, depending on whether the package is already downloaded and whether additional information needs to be downloaded. Once the package is installed it will appear in the installed section, and it is available to be used when importing a library (see next chapter) (Fig. 9).
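Besides the Navigator interface, a quick way to confirm from a notebook cell that a package is visible to Python is the standard library's importlib; this is a sketch, and numpy is used here only as an example package:

```python
import importlib.util

# Check whether a package (numpy, as an example) can be found
# by the current Python environment, without importing it.
spec = importlib.util.find_spec("numpy")
print("numpy installed:", spec is not None)
```

If the package was installed through the Navigator while the notebook was open, restart the kernel before running the check.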
Once the packages are installed, Jupyter Notebook can be launched by going back to Home and pressing Launch under the Jupyter Notebook symbol. This will open a browser window, which is the first step to create a notebook.
When pressing the New tab to create a new document using the Google option, one should look for Google Colaboratory. It is usually under the More option (Fig. 11). Once Google Colaboratory is found, it can be selected.
Fig. 11 Google Colaboratory
References
365 Data Science. 2020. Why Python for data science and why Jupyter to code in Python. Accessed March 2, 2020. https://365datascience.com/why-python-for-data-science-and-why-jupyter-to-code-in-python/.
Anaconda. 2020. Anaconda distribution. Accessed March 2, 2020. https://www.anaconda.com/distribution/.
Bloomberg Corporation. 2019. Bloomberg puts the power of Python in hedgers' hands. 8 March. Accessed March 2, 2020. https://www.bloomberg.com/professional/blog/bloomberg-puts-power-python-hedgers-hands/.
Burgess, Matthew, and Sarah Wells. 2020. Giant wealth fund seeks managers who can beat frothy market. 9 February. Accessed March 2, 2020. https://finance.yahoo.com/news/giant-wealth-fund-seeks-managers-230000386.html.
Das, Sejuti. 2020. Analytics India Magazine. 27 February. Accessed March 17, 2020. https://analyticsindiamag.com/why-jupyter-notebooks-are-so-popular-among-data-scientists/.
Kan, Chi Nok. 2018. Data Science 101: Is Python better than R? 1 August. Accessed March 2, 2020. https://towardsdatascience.com/data-science-101-is-python-better-than-r-b8f258f57b0f.
O'Reilly. 2020. 5 key areas for tech leaders to watch in 2020. 18 February. Accessed March 2, 2020. https://www.oreilly.com/radar/oreilly-2020-platform-analysis/.
Pfeiffer, Frank. 2019. R versus Python: Which programming language is better for data science projects in finance? 28 May. Accessed March 2, 2020. https://finance-blog.arvato.com/r-versus-python-in-finance/.
Python Organization. 2020. Python organization. 7 January. Accessed March 2, 2020. https://docs.python.org/2/faq/general.html#what-is-python.
Shaik, Naushad. 2018. 5 reasons why learning Python is the best decision. 21 September. Accessed March 2, 2020. https://medium.com/datadriveninvestor/5-reasons-why-i-learned-python-and-why-you-should-learn-it-as-well-917f781aea05.
Learning to Use Python: The Basic Aspects
• Entering a Number
– In [1]: 200
– Out [1]: 200
You can achieve this just by pressing enter once the number has been
inserted into the line of command in the Jupyter Notebook.
• Addition in Python
– In [1]: 200 + 200
– Out [1]: 400
• Subtracting in Python
– In [1]: 200 - 200
– Out [1]: 0
• Multiplying in Python
– In [1]: 200 * 200
– Out [1]: 40000
• Dividing in Python
– In [1]: 200 / 200
– Out [1]: 1.0
• Square root in Python
– In [1]: (-1) ** (0.5)
– Out [1]: (6.123233995736766e-17+1j)
The properties above can be applied to any number by writing the following code:
• To a power of three
– In [1]: 2 ** 3
– Out [1]: 8
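Collected in one cell, the operations above can be sketched as follows; the results in the comments are what Python prints:

```python
addition = 200 + 200          # 400
subtraction = 200 - 200       # 0
multiplication = 200 * 200    # 40000
division = 200 / 200          # 1.0 (division always returns a float)
root = (-1) ** 0.5            # (6.123233995736766e-17+1j), a complex number
cube = 2 ** 3                 # 8
print(addition, subtraction, multiplication, division, cube, root)
```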
The examples below use the closing price of Bitcoin in USD, downloaded into a variable called bitcoin:

import ffn

btcusdclose
Date
2021-01-01 29374.152344
2021-01-02 32127.267578
Given that the numbers have decimal points, the type of number is a float. Checking the variable bitcoin with the type command gives the following result:
type(bitcoin)
pandas.core.frame.DataFrame
• To a power of three
– In [1]: 2 ** 3
– Out [1]: 8
The following can be done with the variable bitcoin. Given that there are
two values in the DataFrame (which will be explained in future chapters),
the following process will choose the latest value:
bitcoin[1:]
btcusdclose
Date
2021-01-02 32127.267578
If the price of bitcoin in USD is needed to the power of three, the process is as follows:
bitcoin[1:] ** 3
btcusdclose
Date
2021-01-02 2.534522e+13
The square root works the same way, raising the price to the power of 0.5:
btcusdclose
Date
2021-01-02 171.388892
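The slicing and powers used above can be reproduced without the download by building a small stand-in DataFrame with the two prices shown in the text (the ffn call itself is omitted here):

```python
import pandas as pd

# Stand-in for the downloaded data, using the two closing prices from the text
bitcoin = pd.DataFrame(
    {"btcusdclose": [29374.152344, 32127.267578]},
    index=pd.to_datetime(["2021-01-01", "2021-01-02"]).rename("Date"),
)

latest = bitcoin[1:]      # keep only the latest row
cubed = latest ** 3       # closing price to the power of three
root = latest ** 0.5      # square root of the closing price
print(cubed)
```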
What Is a List?
Lists are one of the most important features in Python because they help in managing a data frame. When using lists there are certain questions that are important to answer one by one, and that is what will be done in each chapter.
A list is equivalent to an array: both are objects which contain information. There is a difference when comparing a list or an array to a string, because a string cannot be altered; this is useful when the information will be reused, to guarantee that the data stays the same for the duration of the program. That is not so useful in finance, where the data has to change continuously, but it is important to understand the properties of each.
When adding a letter or a word to a list, one has to add quotes around it. An example below:
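A minimal sketch, using values that reappear later in the chapter:

```python
# The quotes mark "string" as text; the numbers need no quotes
my_list = [10, 10.25, "string"]
print(my_list)
```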
For the following example, two of the most important indexes will be used with the following code:
list = [spy[18:], dow[18:]]
The problem with this process is that, because the information comes from a DataFrame, it is not presented properly given the structure of the information:
[ gspcclose
Date
2021-01-29 3741.26001, djiclose
Date
2021-01-29 30170.699219]
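The behaviour can be reproduced with stand-in DataFrames built from the values shown; the variable is called index_list here to avoid shadowing Python's built-in list, which the book's code overwrites:

```python
import pandas as pd

# Hypothetical stand-ins for the S&P 500 (spy) and Dow Jones (dow) downloads
spy = pd.DataFrame({"gspcclose": [3787.379883, 3741.260010]},
                   index=pd.to_datetime(["2021-01-28", "2021-01-29"]).rename("Date"))
dow = pd.DataFrame({"djiclose": [30603.359375, 30170.699219]},
                   index=pd.to_datetime(["2021-01-28", "2021-01-29"]).rename("Date"))

index_list = [spy[1:], dow[1:]]   # a list holding two DataFrame slices
print(index_list)                 # prints full frames with headers, not plain numbers
```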
Measuring a List
To know how many elements a list has, the function to be used is len(). len() allows us to count the elements inside the list in order to create calculations. An example as follows:
• In [1]: len(my_list)
• Out [1]: 7
Since the list has been modified to add different elements, the total number of elements is seven (7). For example, in my_list = [10, 10.25, "string"] there are three elements; if len(my_list) is used, the result will be three (3).
len(list)
len(bitcoin)
Given that the variable bitcoin only had two prices, the length is two (2).
• In [1]: my_list[1]
• Out [1]: 2
One of the most important aspects of lists is that in Python, as in many other programming languages, counting starts from zero (0). This is why indexing the element in position one [1] returns the second element, two (2). Negative indexing counts from the end of the list:
• In [1]: my_list[−1]
• Out [1]: 'b'
• In [1]: my_list[1:]
• Out [1]: [2,3,0.5,400+0j, "apple", "b"]
Before, the example was demonstrated with the Dow Jones and the
S&P 500 by cutting the list as follows to get the last value:
list = [spy[18:],dow[18:]]
The example above takes all the elements except the one in the first position, position zero (0). Indexing can also be done between different values. An example below:
• In [1]: my_list[3:5]
• Out [1]: [0.5,400+0j]
In this case it includes position three (0.5) and position four (400+0j), but it excludes the word "apple" because position five is not included.
In the Dow Jones and S&P 500 example, the process can be elabo-
rated in the same way as before with the problem of the arrangement:
list = [spy[17:19],dow[17:18]]
[ gspcclose
Date
2021-01-28 3787.379883
2021-01-29 3741.260010, djiclose
Date
2021-01-28 30603.359375]
To include the fifth element, the most useful process is to add the sixth position as follows:
• In [1]: my_list[3:6]
• Out [1]: [0.5,400+0j, 'apple']
A list can also be used to recall elements up to a certain position. The process is similar to the one above, but only the index after the colon is used. An example as follows:
• In [1]: my_list[:5]
• Out [1]: [1,2,3,0.5,400+0j]
As before, with the Dow Jones and S&P 500 example, the process can be elaborated in the same way:
list = [spy[:…], dow[:…]]

[ gspcclose
Date
…, djiclose
Date
…]
• In [1]: my_list[:−2]
• Out [1]: [1,2,3,0.5,400+0j]
Or:
• In [1]: my_list[−2:]
• Out [1]: ["apple", "b"]
A list can also be added to another list, which is an important feature when uniting two arrays. With just an addition sign (+) two lists are concatenated. This is useful when two sets of data will be used conjointly. An example as follows:
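A sketch with the two lists used in this chapter (my_list_2 is assumed to hold the values shown in the append example):

```python
my_list = [1, 2, 3, 0.5, 400 + 0j, "apple", "b"]
my_list_2 = ["oranges", 5, 0.5, 300 + 1j]

combined = my_list + my_list_2   # (+) concatenates the two lists
print(combined)
```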
Multiplication can also be used to repeat elements:
• In [1]: my_list*2
• Out [1]: [1, 2, 3, 0.5, (400+0j), 'apple', 'b', 1, 2, 3, 0.5, (400+0j),
'apple', 'b']
In the list created with the indexes, multiplication can be done as follows, with the problem that it even duplicates the titles in the DataFrame:
list * 2

[ gspcclose
Date
…, djiclose
Date
…,
gspcclose
Date
…, djiclose
Date
…]
TypeError                                 Traceback (most recent call last)
<ipython-input-…> in <module>
----> list * …
The reason for this type of error is that, as explained before, the value used is not an integer, given that the information comes from an API, in this case Yahoo Finance. It could be suggested that the int command be used for conversion, but the error is as follows:

TypeError                                 Traceback (most recent call last)
<ipython-input-…> in <module>
----> int(list)

TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'
Appending Lists
Another method of adding elements to a list is append( ), which is useful for adding data. Although the same can be done with the addition sign (+), the append( ) method is common in data structures and can benefit the user when working with databases.
• In [1]: my_list.append("Python")
• Out [1]: [1, 2, 3, 0.5, (400+0j), 'apple', 'b', 'Python']
Apart from a word, another list of items can be appended into the origi-
nal list. An example below:
• In [1]: my_list.append(my_list_2)
• Out [1]: [1, 2, 3, 0.5, (400+0j), 'apple', 'b', 'Python', ['oranges',
5,0.5, (300+1j)], ['oranges', 5, 0.5, (300+1j)]]
This is useful when creating a new list with data from other sets, as in the example above.
Arranging Lists
A list can be arranged to suit the needs of the user. This is extremely
useful when the list has to be arranged from first to last or it has to be
sorted by a certain requirement. An example as follows with the list
my_list = [1,2,3]
• In [1]: my_list.reverse()
• Out [1]: [3, 2, 1]
Another procedure for arranging a list is sort( ), which arranges the data from first to last. This is helpful when the data has to be ordered, for example, from the smallest return to the highest return. An example as follows with the list my_list = [4,1,4,3,2,10,20,23]
• In [1]: my_list.sort()
• Out [1]: [1, 2, 3, 4, 4, 10, 20, 23]
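One detail worth knowing: both reverse( ) and sort( ) work in place and return None, so the result is read from the list itself:

```python
my_list = [4, 1, 4, 3, 2, 10, 20, 23]
result = my_list.sort()   # sorts in place; the return value is None
print(my_list)            # [1, 2, 3, 4, 4, 10, 20, 23]
print(result)             # None
```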
Because these methods modify the list in place, the original order no longer belongs to the list once they run; the best advice is to create a new list with the original data, so that the copy can be modified.
Certain procedures similar to those for lists can be executed when using a matrix (a list of lists). An example below:
In the example above the data that is being retrieved is the first list. If the
first element of the list will be retrieved the process is as follows:
The first bracket is for retrieving the list and the second bracket is for the
first element in the list. This is based on the fact that in Python the posi-
tion zero (0) is the first position.
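A minimal sketch of the matrix indexing described above (the values are illustrative):

```python
matrix = [[1, 2, 3], [4, 5, 6]]   # a matrix built as a list of lists

print(matrix[0])      # the first list: [1, 2, 3]
print(matrix[0][0])   # first element of the first list: 1
```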
A dictionary is divided into two elements: (1) the keys and (2) the
values. The keys are useful because the key will determine the value that
is being requested. The key is separated from the value by a colon (:) to
indicate that the first element is the key and the second element is the
value. An example as follows:
• In [1]: my_dictionary['key1']
• Out [1]: 'value1'
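The dictionary used above is not defined in the text; a minimal assumed definition consistent with the example is:

```python
# Assumed definition: the key 'key1' holds the value 'value1'
my_dictionary = {"key1": "value1", "key2": "value2"}
print(my_dictionary["key1"])   # value1
```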
my_dictionary_2 = {'key1': 23.5}
• In [1]: my_dictionary_2['key1']
• Out [1]: 23.5
• In [1]: my_dictionary_2['key1'] - 23
• Out [1]: 0.5
This is an important feature because it does not alter the data. Each cal-
culation is created apart from the original data, creating results without
any alteration.
Modifying a Dictionary
There will be moments when it is necessary to alter the information inside a dictionary. It is a really simple process. In the example above, 100 was added to the first key, modifying the value to a result of 123.5. This procedure can be done with addition, multiplication, division, square root or any other arithmetic approach. An example of this is as follows:
d = {}
d['price'] = 25000
d['car'] = 'Peugeot'
• In [1]: d
• Out [1]: {'price': 25000, 'car': 'Peugeot'}
If a discount of 10% should be applied to the price of the car the process
would be as follows.
• In [1]: d['price']*(0.9)
• Out [1]: 22500.0
countries = {}
countries = {'Guatemala': 'Guatemala', 'Costa Rica': 'San José',
'Nicaragua': 'Managua'}
• In [1]: countries.keys()
• Out [1]: dict_keys(['Guatemala', 'Costa Rica', 'Nicaragua'])
This allows the user to see which keys exist and how to retrieve the data in the dictionary. For example, the keys can be the names of the stocks and the values the figures each stock holds. If the values of the dictionary are not known and need to be retrieved, the process is as follows:
• In [1]: countries.values()
• Out [1]: dict_values(['Guatemala', 'San José', 'Managua'])
The DataFrame
As seen before, the book is centered on creating DataFrames with data
from the internet obtained directly from the webpage. For this, it is
important to understand the use of a DataFrame and the possibilities
that it has when compared to a list.
import pandas as pd

Data_Frame = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [200, 300, 200, 300], 'col3': ['abc', 'def', 'ghi', 'xyz']})
• In [1]: Data_Frame
• Out [1]:
   col1  col2 col3
0     1   200  abc
1     2   300  def
2     3   200  ghi
3     4   300  xyz
• In [1]: Data_Frame['col1']
• Out [1]:
0 1
1 2
2 3
3 4
Name: col1, dtype: int64
If there are repeated values, such as in the second column, the unique command lists each distinct value only once.
• In [1]: Data_Frame['col2'].unique()
• Out [1]: array([200, 300])
• In [1]: Data_Frame['col1'].sum()
• Out [1]: 10
• In [1]: Data_Frame['col1'].mean()
• Out [1]: 2.5
• In [1]: Data_Frame['col1'].multiply(5)
• Out [1]:
0 5
1 10
2 15
3 20
Name: col1, dtype: int64
• In [1]: Data_Frame.drop('col1',axis=1)
• Out [1]:
col2 col3
0 200 abc
1 300 def
2 200 ghi
3 300 xyz
• In [1]: Data_Frame.sort_values('col2')
• Out [1]:
   col1  col2 col3
0     1   200  abc
2     3   200  ghi
1     2   300  def
3     4   300  xyz
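The DataFrame operations above can be run together as one cell:

```python
import pandas as pd

Data_Frame = pd.DataFrame({"col1": [1, 2, 3, 4],
                           "col2": [200, 300, 200, 300],
                           "col3": ["abc", "def", "ghi", "xyz"]})

print(Data_Frame["col2"].unique())     # the distinct values: [200 300]
print(Data_Frame["col1"].sum())        # 10
print(Data_Frame["col1"].mean())       # 2.5
print(Data_Frame["col1"].multiply(5))  # each value times five
print(Data_Frame.drop("col1", axis=1)) # the frame without col1
print(Data_Frame.sort_values("col2"))  # rows ordered by col2
```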
DataFrames can also be used with Booleans. This is one of the most
interesting features. An example as follows:
– Choosing a column
stocks['gmeclose']

Date
…
Name: gmeclose, Length: …, dtype: float64
gmeclose aaplclose
Date
2020-01-02 6.310000 75.087502
2020-01-03 5.880000 74.357498
2020-01-06 5.850000 74.949997
2020-01-07 5.520000 74.597504
2020-01-08 5.720000 75.797501
… … …
2021-01-25 76.790001 142.919998
2021-01-26 147.979996 143.160004
2021-01-27 347.510010 142.059998
2021-01-28 193.600006 137.089996
2021-01-29 316.644714 133.184998
– Sort values
stocks.sort_values('gmeclose')
gmeclose aaplclose msftclose
Date
2020-04-03 2.800000 60.352501 153.830002
2020-04-02 2.850000 61.232498 155.259995
2020-04-06 3.090000 65.617500 165.270004
2020-04-01 3.250000 60.227501 152.110001
2020-04-07 3.270000 64.857498 163.490005
… … … …
2021-01-25 76.790001 142.919998 229.529999
2021-01-26 147.979996 143.160004 232.330002
2021-01-28 193.600006 137.089996 238.929993
2021-01-29 316.644714 133.184998 234.684998
2021-01-27 347.510010 142.059998 232.899994
– Using a Boolean
stocks > …
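A stand-in stocks DataFrame, built from a few of the rows shown above, illustrates the Boolean comparison; the threshold 100 is an assumption, since the original value was lost:

```python
import pandas as pd

# Stand-in for the stocks DataFrame, using rows quoted in the text
stocks = pd.DataFrame(
    {"gmeclose": [76.790001, 147.979996, 347.510010],
     "aaplclose": [142.919998, 143.160004, 142.059998]},
    index=pd.to_datetime(["2021-01-25", "2021-01-26", "2021-01-27"]).rename("Date"),
)

print(stocks.sort_values("gmeclose"))  # rows ordered by the GME close
print(stocks > 100)                    # element-wise Boolean DataFrame
```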
• In [1]: 1 > 2
• Out [1]: False
• In [1]: 1 < 2
• Out [1]: True
This is one of the most important aspects when comparing data. For
example, if there is a need to know if the Value at Risk (VaR) will be
higher than a specific percent or number, the Boolean could be useful to
understand the process.
To understand how Booleans work, it is important to specify the dif-
ferent cases in which a Boolean creates a response that can be useful for
the user (Table 1).
Booleans can also be used as a chain of comparison. When there are
different elements included, the Booleans can specify if the sequence is
correct. An example below:
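A short sketch of a chained comparison:

```python
# A chain is True only if every pairwise comparison holds
print(1 < 2 < 3)   # True: 1 < 2 and 2 < 3
print(1 < 3 < 2)   # False: 3 < 2 fails
```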
Table 1  Comparison operators

Operator   Description                                               Example
==         If the values of the operators are equal the result is    2 == 1 (FALSE)
           TRUE; if not, the result is FALSE                         1 == 1 (TRUE)
!=         The contrary of the (==) sign: it means not equal. When   2 != 1 (TRUE)
           using !=, the exclamation mark (!) must come before the   1 != 1 (FALSE)
           equal sign (=), as shown in the example. (Python 2 also
           accepted <> with the same result, but it is less useful
           and was removed in Python 3.)
>          The argument on the left is bigger than the argument      2 > 1 (TRUE)
           on the right                                              1 > 2 (FALSE)
<          The argument on the left is smaller than the argument     2 < 1 (FALSE)
           on the right                                              1 < 2 (TRUE)
>= or <=   Known as greater than or equal, or less than or equal.    3 >= 3 (TRUE)
           These include the argument that is equal, which is the    3 >= 4 (FALSE)
           main difference from the greater-than and less-than       4 >= 3 (TRUE)
           operators
For Booleans, the commands and and or are extremely useful when comparing expressions. They allow the user to check whether various statements are True or False. An example as follows:
And:
• In [1]: 3 != 2 and 3 == 2
• Out [1]: False
When using (or), the question is whether at least one statement is correct. It will tell us that the expression is True if one of the statements is True, even when the other is False.
• In [1]: 3 < 2 or 1 == 1
• Out [1]: True
Another example:
• In [1]: 3 < 2 or 1 != 1
• Out [1]: False
If:
The if statement is used when the user chooses an action to perform only if a certain case happens.
if example:
performs action1
Example:
• In [1]: if True:
print ('It is correct')
• Out [1]: It is correct
Else:
The else statement is useful when combined with the if statement. When we use if, we ask that if something happens the program should show a certain result. With else, we complement the if statement and tell the program what to show if the condition is not fulfilled. In this way we have the two parts of the equation: what happens and what does not happen.
Example:
• In [1]: a = False
if a:
print ('a is correct')
else:
print ('anything but correct')
In this example, it can be observed that Python understands the True and False values implicitly. These values, previously identified as Booleans, allow the user to establish a relationship. Here the variable a holds False, so the condition fails and the program prints anything but correct; if a were True, it would print a is correct. For the value that is True, the code is as follows:
• In [1]: a = True
if a:
print ('a is correct')
else:
print ('anything but correct')
Elif:
The elif statement is used when there are different arguments. In the
case of an elif statement what the program wants to validate is if the vari-
ables are True or False. An example of an elif statement is as follows:
if expression:
statement(a)
elif expression(2):
statement (a)
elif expression3:
statement (a)
elif expression4:
statement(a)
The following example takes into account the theory of the Value at Risk (VaR). The purpose of the elif statement is to check which of the proposed VaR values holds; in this case, the values proposed are in millions. To know more about the VaR, please refer to the chapter titled Value at Risk (VaR).
In [1]:
var = 100
if var == 200:
print ("1 – Value is True")
print (var)
elif var == 150:
print ("2 – Value is True")
print (var)
elif var == 100:
print ("3 – Value is True")
print (var)
else:
print ("4 – Value is False")
print (var)
Loops
There are two types of loops that are extremely useful in Python: the for loop and the while loop. Each loop has its own structure and should be used based on the necessity of the program.
For Loop
Loops are iterations that allow the user to elaborate sequences and reuse them again and again. Although creating a for loop may seem complicated, the process centers on the purpose of the iteration.
Example:
m = [1,2,3,4,5,6]
In [1]:
for x in m:
print (x)
Out [1]:
1
2
3
4
5
6
The iteration of the for loop allows us to separate each of the elements in the list in a simple way. Without a for loop, the process would have been extremely complicated and time consuming, requiring one print per element:
print (m[0])
print (m[1])
print (m[2])
print (m[3])
print (m[4])
print (m[5])
By using the iteration, the process is not only simpler but also takes less time. This will be useful when elaborating Monte Carlo simulations or when applying the Sharpe Ratio to a Markowitz model.
The loop variable does not have to be defined beforehand; any valid name works. In the for loop above, x was used, and after the loop ends it simply holds the last element of the iteration. If x is changed into a word such as example, the result is the same.
In [1]:
for example in m:
    print (example)
Out [1]:
1
2
3
4
5
6
m = [1,2,1,2,1]
In [1]:
for x in m:
    print ('I am a result')
Out[1]:
I am a result
I am a result
I am a result
I am a result
I am a result
The modulo operator (%) returns the residue of a division:
In [1]: 12%5
Out[1]: 2
One of the usual exercises practiced with the modulo function is the identification of even numbers. Knowing that the modulo function (%) gives us the residue, to find even numbers we must divide by 2 and check that the result is zero (0). An example follows:
m = [1,2,3,4,5,6]
In [1]:
for number in m:
    if number % 2 == 0:
        print (number)
Out[1]:
2
4
6
If the example is with odd numbers, the only change needed is to replace if number%2 == 0: with if number%2 == 1:.
An else branch can be added to the iteration to print a string as a response. An example is the following:
In [1]:
for number in m:
    if number % 2 == 0:
        print ("the number is even")
    else:
        print ("the number is uneven")
Out[1]:
the number is uneven
the number is even
the number is uneven
the number is even
the number is uneven
the number is even
In the same way the arguments can be changed to obtain a specific string
or another element. An example as follows:
In [1]:
for number in m:
    if number % 2 == 0:
        print ("The number is even")
    else:
        print (number)
Out[1]:
1
The number is even
3
The number is even
5
The number is even
m = [1,2,3,4,5,7,8,9,10]
add = 0
In [1]:
for number in m:
    add = add + number
print (add)
Out[1]: 49
What the previous iteration does is add the elements of the list m one after the other to obtain 49. In this case, the print statement is kept outside the loop body, so only the final total is displayed once the iteration has run through every element.
For loops can also be used for strings with the same modality that is used for lists.
In [1]:
a = "Letter"
for letter in a:
    print (letter)
Out[1]:
L
e
t
t
e
r
For loops can also iterate over dictionaries. Iterating directly over a dictionary yields its keys:
In [1]:
d = {'k1': 1, 'k2': 2, 'k3': 3}
for key in d:
    print (key)
Out[1]:
k1
k2
k3
To obtain both keys and values, the items() method can be used:
In [1]:
for key, value in d.items():
    print (key)
    print (value)
Out[1]:
k1
1
k2
2
k3
3
While Loop
The while loop repeats as long as its condition remains True. That is to say, it will run again and again until the condition becomes False. This can be quite useful to search for a value or a string.
In [1]:
a = 0
while a < 10:
    print ("a is", a)
    a += 1
else:
    print ("end")
Out[1]:
a is 0
a is 1
a is 2
a is 3
a is 4
a is 5
a is 6
a is 7
a is 8
a is 9
end
In this case the loop prints each value of a and adds one (1) to it while a is less than 10. When a reaches the tenth (10) value, the condition fails and the else branch prints the phrase "end". In this way we can both check the elements less than 10 and accumulate a value.
To work with while loops it is important to know 3 commands:
1. break,
2. continue and
3. pass
Break allows the user to close the loop that is running. Continue skips to the next iteration, and pass does nothing; it is a placeholder that leaves the loop unaffected. While loops can be longer, including more conditions, to define precisely what we want.
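The three commands can be seen side by side in a minimal sketch (the stopping values are arbitrary, chosen only to show the behavior):

```python
# Demonstrating pass, continue and break in one while loop
collected = []

a = 0
while a < 10:
    a += 1
    if a == 3:
        continue   # skip 3 and jump to the next iteration
    if a == 6:
        break      # stop the loop entirely when a reaches 6
    if a == 1:
        pass       # pass does nothing; the loop is unaffected
    collected.append(a)

print(collected)
```

Only 1, 2, 4 and 5 are collected: 3 is skipped by continue, and break ends the loop before 6 is appended.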
In [1]:
a = 0
while a < 5:
    print ("a is", a)
    print ("a is less than 5")
    a += 1
    if a == 4:
        print ("a is equal to 4")
    else:
        print ("We continue")
        continue
Out[1]:
a is 0
a is less than 5
We continue
a is 1
a is less than 5
We continue
a is 2
a is less than 5
We continue
a is 3
a is less than 5
a is equal to 4
a is 4
a is less than 5
We continue
1. The variable "a" was created and a value of zero (0) was placed.
2. A while loop was created in which when "a" was less than (<) 5
then it would print "a is" and that could be the value of "a". Then
it would print "a is less than 5".
3. An if conditional was placed so that when a is exactly equal (==) to four, it prints "a is equal to 4." For all values that are not exactly equal to four, the string "We continue" is printed.
4. The "continue" command is placed to keep the iteration going, that is, to continue replicating.
When we use the break statement, we put an end to the loop. That is why it is important to use it correctly, because it will interrupt the loop.
In [1]:
a = 0
while a < 5:
    print ("a is", a)
    print ("a is less than 5")
    a += 1
    if a == 2:
        print ("a is equal to 2")
        break
    else:
        print ("We continue")
        continue
Out[1]:
a is 0
a is less than 5
We continue
a is 1
a is less than 5
a is equal to 2
What was indicated in the code is that the loop stops once it reaches the value that is exactly equal to 2. Therefore, when the number 2 is reached, the last line reads "a is equal to 2" and the code does not continue, despite the "else" branch that would otherwise keep the loop going.
Finally, if one writes pass instead of break, the result is the same as in the loop without break, because pass is a statement that does nothing.
List Comprehension
List comprehensions allow us to build a list easily and clearly with a compact notation that already includes a loop, specifically a for loop. It is easy to understand as follows:
In [1]:
m = [letter for letter in "word"]
print (m)
Out[1]:
['w', 'o', 'r', 'd']
In this case a list was built from the string "word" so that each of the letters is an element. For example, m[0] gives the letter "w" and m[3] gives the letter "d".
The same idea can be applied to arithmetic, as follows:
In [1]:
n = [x**2 for x in range(0, 11)]
print (n)
Out[1]:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
The above can be unpacked as follows: the variable n was created by raising (**) to the power of two each value in the range from 0 to 11 (exclusive). In this sense, 3 times 3 is equal to 9, and so on.
In [1]:
evens = [x for x in range(0, 11) if x % 2 == 0]
print (evens)
Out[1]:
[0, 2, 4, 6, 8, 10]
The above is simple because the comprehension embeds a loop and uses the modulo (%) function to keep the values that leave zero residue when divided by 2. Since the even numbers are the only ones that have zero residue in a division by 2, the result follows directly.
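List comprehensions also shorten small financial calculations. The sketch below, with hypothetical closing prices, computes one-period simple returns between consecutive prices in a single line:

```python
prices = [100.0, 105.0, 102.9]  # hypothetical closing prices

# One-period simple return: (P1 - P0) / P0 for each consecutive pair
returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
print(returns)
```

The zip of the list with itself shifted by one yields each (previous, current) pair, the same shift idea used later for pct_change().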
References
Hunner, Trey. 2018. "Asterisks in Python: What They Are and How to Use Them." Trey Hunner (blog), October 11. Accessed March 23, 2020. https://treyhunner.com/2018/10/asterisks-in-python-what-they-are-and-how-to-use-them/.
Python. 2006. "2.3.4 Numeric Types -- int, float, long, complex." October 18. Accessed March 23, 2020. https://docs.python.org/2.4/lib/typesnumeric.html.
Python. 2020. "Data Structures." March 23. Accessed March 23, 2020. https://docs.python.org/3/tutorial/datastructures.html.
Using FRED® API for Economic Indicators
and Data (Example)
The FRED API requires a key that is unique to the person using the API. To obtain the key, one should visit the API Keys page at https://research.stlouisfed.org/docs/api/api_key.html. The key can be requested by creating a user and filling out the information. Once the key is received (it is usually a quick process), it can be used for analyzing the data.
First Step
The first process is using pip install,1 which installs the package that allows retrieving information from FRED. In the present example, using Google Colaboratory, the process is extremely simple.
Second Step
from fredapi import Fred
fred = Fred(api_key='XXXXXXXX')
In this step, once the fredapi is installed then the Fred data library can
be retrieved. For this it is necessary to have the API_key which will allow
the retrieval of the information.
Third Step
To retrieve information from FRED, one should know how to use the FRED website and how to locate the series on it. The following exercise uses information from the FRED website, which is at: https://research.stlouisfed.org/.
1 For more information on pip install: https://pip.pypa.io/en/stable/reference/pip_install/.
USING FRED® API FOR ECONOMIC INDICATORS … 61
– Indexes
– Prices
– Price Index
– Consumer Price Index
– Employment
There is a second method which is easier and works better with the
methods that are going to be seen in this book.
pip install fredapi
The command pip2 is the program that installs packages in Python. Once fredapi is installed, the following step is to install pandas_datareader. With pandas_datareader it is possible to access different sources such as FRED, the World Bank, the OECD and NASDAQ. For more information concerning the different sources and how to use them, please visit: https://pandas-datareader.readthedocs.io/en/latest/remote_data.html.
import pandas_datareader as pdr
The start is the date from which the data begins, and the end is where the data ends. Setting the dates is important because it frames the analysis to a specific period. With these settings the data can be retrieved by using the code that was used before.
consumer_price_index = pdr.DataReader('CPALTT01USM657N', 'fred', start, end)
When the variable is created, the data can be analyzed. To check that
the data is correct, the next process is suggested:
consumer_price_index.tail()
df_cpi = consumer_price_index.rename(columns={'CPALTT01USM657N': 'CPI'})
DATE CPALTT01USM657N
2020-05-01 0.001950
2020-06-01 0.547205
2020-07-01 0.505824
2020-08-01 0.315321
2020-09-01 0.139275
What can be observed is that the most recent observation in FRED is from September 1, 2020. Given that the series code could be difficult for others to understand when sharing the notebook, the title of the column should be changed, as in the rename step shown above.
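The rename step can be checked offline with a small synthetic DataFrame. The numbers below are made up; only the column code mirrors the FRED series:

```python
import pandas as pd

# Synthetic stand-in for the FRED download
consumer_price_index = pd.DataFrame(
    {'CPALTT01USM657N': [0.19, 0.54, 0.50]},
    index=pd.to_datetime(['2020-05-01', '2020-06-01', '2020-07-01']))

# Rename the cryptic series code to a readable column name
df_cpi = consumer_price_index.rename(columns={'CPALTT01USM657N': 'CPI'})
print(df_cpi.columns.tolist())
```

rename returns a new DataFrame, so the original download keeps its FRED code while df_cpi carries the readable label.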
The approach behind the GDP equation is to understand how the economy is funded and, from there, how it works. The theory is based on the idea that households provide funds to the banks as savings or to the companies as investments. Households also offer labor to the companies in return for funds (salaries), which can be used for consuming the products created by the companies, or can be invested and/or saved. From the salary, the government charges taxes, which are used for public goods and services. Finally, exports and imports reflect the exposure of a country to other economies. In this sense, if a country did not share economic activity with other countries, there would be no exports and imports, just consumption, investment and government expenditure.
In order to understand the economic cycle, the GDP will be analyzed. The Real GDP will be used because it is inflation adjusted, that is, nominal GDP expressed in base-year prices. The FRED code for the Real GDP is GDPC1. The data for the GDP is in billions of dollars, seasonally adjusted at an annual rate and with a quarterly frequency.
Once the data is retrieved, a transformation can be made for a more useful analysis. For this change, pct_change()4 will be used to analyze the growth of the GDP. Using pct_change() is simple: when the parentheses are left blank, the program assumes a one-period change, which is the setting that will be used.
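A minimal offline sketch of pct_change() on made-up GDP levels shows the one-period default and the NaN in the first row:

```python
import pandas as pd

# Hypothetical quarterly GDP levels
gdp = pd.Series([100.0, 102.0, 99.96])

# Empty parentheses default to a one-period change
growth = gdp.pct_change()
print(growth)
```

The first value is NaN because there is no previous period to compare against; the remaining values are (current − previous) / previous.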
DATE GDPC1
2019-07-01 0.636915
2019-10-01 0.586232
2020-01-01 -1.262655
2020-04-01 -8.986117
2020-07-01 7.403492
The name of the column can be changed with the process used before.
DATE GDPC1
2019-04-01 0.370716
2019-07-01 0.636915
2019-10-01 0.586232
2020-01-01 -1.262655
2020-04-01 -8.986117
2020-07-01 7.478741
4 For more information: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pct_change.html.
GDP Deflator = (Nominal GDP / Real GDP) × 100
To calculate the deflator of the GDP the sum function will be used.
The sum function sums all the data points in the list. The first step is to
establish a base year, in this case the year will be 2017.
Once the real GDP and the nominal GDP have been stored in variables, the next step is to sum the production for the year.
ngdp_sum = nominal_GDP.sum()
ngdp_sum = int(ngdp_sum)
rgdp_sum = real_GDP.sum()
rgdp_sum = int(rgdp_sum)
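With the two sums in hand, the deflator itself is one division. The sums below are made-up stand-ins; the real values come from the FRED series:

```python
# Hypothetical yearly sums of nominal and real GDP (billions of dollars)
ngdp_sum = 21433
rgdp_sum = 19092

# GDP Deflator = (Nominal GDP / Real GDP) x 100
deflator = ngdp_sum / rgdp_sum * 100
print(round(deflator, 2))
```

A deflator above 100 means prices are higher than in the base year; below 100, lower.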
Comparing 2019 with the 2017 base year, the deflator indicates how much of the nominal growth is explained by rising prices. Considering that for an investment it is important to achieve growth higher than inflation, following the deflator is an important indicator.
type(deflator)
float
In the example above the deflator is a float given that it has decimal
numbers. The number could be changed to an integer by using the int()
function.
int(deflator)
102
The process above changes the sum of the nominal_GDP variable into an integer so that it can be divided by the integer sum of the real_GDP variable.
Concerning data structures: since the information in this book is retrieved from an API, the object obtained is not a list but a pandas DataFrame.
type(real_GDP)
pandas.core.frame.DataFrame
Comparing GDP
When analyzing growth, it is important to compare how the growth of the country being analyzed measures up against other countries. In this process, a graph will be developed to understand how the different countries behaved.
Installing packages:
import pandas_datareader as pdr
import pandas as pd
import datetime
import matplotlib.pyplot as plt
gdp_comparison.head()
gdp_change.head()
gdp_change = gdp_change.dropna()
gdp_change.head()
There are different sources of data that are viable in finance. Some of them can be accessed without creating a CSV (comma-separated values) file or an XLS file (Microsoft Excel). The sources that can be accessed online through Python, without using one of the aforementioned files, are APIs (application programming interfaces). An API is useful because, since it defines a specific protocol, routines and data structures, the information can be retrieved every time we use Python.
API Sources
(1) Google Finance: Google developed an API for retrieving information from the financial markets. The API has been discontinued and is therefore mentioned here only as a resource that may become available again in the future.
(2) Yahoo Finance: one of the most important APIs the book will work with, given that it is free and the information is accurate, being retrieved directly from the markets. Since Yahoo started its financial platform in 1997 it has grown into one of the most consulted platforms for financial decisions, accessible both in price and in ease of use.
(3) Quandl: the company's first movement in financial data was the launch, in 2013, of a million free datasets through its universal API. This created the possibility of analyzing data from other sources. In late 2018 the company was acquired by Nasdaq, making its collection one of the most interesting datasets in the markets.
Although there are other interesting APIs in the market, this book will use two specific APIs: Yahoo Finance and Quandl. The reason for using these APIs is that they are free and accurate. There are other databases, such as Bloomberg, that offer an API, but access to the data can be expensive.
For the use of CSV and XLS databases, the main sources used throughout the book are Yahoo Finance for statistical and portfolio analysis, and the World Bank and International Monetary Fund for macroeconomic investment strategies.
The first API that will be used is the Yahoo Finance API. For this, it is
important to know certain packages that will be important in order to
handle the different data that will be accessed.
To import the other libraries first they must be installed. Please visit the
section on Anaconda Navigator and Installing Packages, this will be very
helpful in the long run.
Options:
Risk Analysis:
Time Series:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import pandas_datareader as pdr
import datetime
import pandas_datareader.data as web
Once the Jupyter Notebook is set, the Yahoo API can be accessed. The
most important aspect is to set a start date and an end date for the values.
It is also important to add the stock quote that will be consulted, mainly
its ticker. A stock ticker symbol is a one to four letter code representing the
name of the company. For this example, the company Tesla will be used
for analysis. The ticker for Tesla is TSLA. The information is going to be
accessed from January 1, 2015 to January 1, 2019. The code is as follows:
start = datetime.datetime(2015,1,1)
end = datetime.datetime(2019,1,1)
In the example above two variables were created to define the infor-
mation that is going to be accessed. The start date is set to January 1,
2015 and the end date is set to January 1, 2019. Both variables are then
used in the main variable where the Yahoo API is going to be accessed.
The variable Tesla will use the combination of the module web with
DataReader to access the information from the stock company TSLA
10 For more information regarding pandas_datareader: https://pandas-datareader.readthedocs.io/en/latest/.
USING STOCK MARKET DATA … 77
Fig. 1 Example of the retrieval of data from Tesla (Source Obtained from the
computer of the author)
from the Yahoo API for the dates from January 1, 2015 to January 1, 2019:
Tesla = web.DataReader('TSLA', 'yahoo', start, end)
Now the information is in the current Jupyter Notebook. It is important to recall that every time the Jupyter Notebook is shut down, the data must be loaded again using the same procedure or by running the whole script. To visualize the information the following command can be executed:
Tesla.tail() or Tesla.head()
The .tail command will show the last five rows of the data. The .head command will show the first five rows, beginning at the start date that was established. Figure 1 demonstrates the capacity of the Yahoo API.
From the Yahoo API the Opening Price, the Highest Price of the day, the Lowest Price of the day, the Closing Price, the Adjusted Close and the Volume are displayed.
Fig. 2 Petroleum Prices using Quandl (Source Elaborated by the author with
information from Quandl)
import quandl
quandl.ApiConfig.api_key = 'c_TeStKey'
petroleum = quandl.get('OPEC/ORB', authtoken='c_xdULhUkHxnAkfja')
Now that it has been installed, the process for retrieving information for
Yahoo Finance is as follows:
As the example above states, the dates are included when getting the stocks. The stocks are separated by a comma (,) and the order in which they are selected is the order in which they will appear in the DataFrame.
stocks.tail()
Closing prices:
stocks.tail()
High prices:
stocks.tail()
Low prices:
stocks.tail()
pwd
'/Users/mauriciogarita'
df
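As a hedged offline illustration of the pd.read_csv function footnoted below, an in-memory CSV can stand in for a file on disk; the file name and figures are hypothetical:

```python
import pandas as pd
from io import StringIO

# In-memory CSV standing in for a file such as 'prices.csv'
csv_text = "Date,Close\n2019-01-02,157.92\n2019-01-03,142.19\n"

# Parse the Date column and use it as the index, as with API data
df = pd.read_csv(StringIO(csv_text), parse_dates=['Date'], index_col='Date')
print(df.shape)
```

With a real file, the StringIO wrapper is simply replaced by the file path returned under the working directory shown by pwd.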
12 More information concerning pd.read_csv can be accessed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html.
13 More information concerning pd.read_table can be accessed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_table.html.
14 More information concerning pd.read_excel can be accessed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html.
Abstract Statistical Methods are part of the tools for analyzing securi-
ties. The following chapter explains the central limit theorem, returns,
ranges, boxplots, histograms and other sets of statistical measures for the
analysis of securities using Yahoo Finance API.
This next part of the book is centered on the use of mathematical and
statistical methods to understand the security based on quantitative anal-
ysis. The aim of quantitative analysis is to extract a value that explains
financial behavior (Keaton 2019).
μ = (x1 + x2 + ⋯ + xN) / N (1)
where the xi are the N data points of the population.
The formulas are basically the same with one important difference,
that the population mean centers on the number of items of a popula-
tion and the sample mean on a specific sample. The population should be
seen as the total observations that can be made and the sample is usually
a part of the population that is selected.
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader
import datetime
import pandas_datareader.data as web
import matplotlib.pyplot as plt
%matplotlib inline
STATISTICAL METHODS USING PYTHON FOR ANALYZING STOCKS 87
start = datetime.datetime(2015,1,1)
end = datetime.datetime(2019,1,1)
IBM = web.DataReader('IBM','yahoo',start,end)
Given that the information used for the IBM security is from January
1st, 2015 to January 1, 2019, this is considered as a sample. Therefore,
the x̄ will be considered as the mean of the security. To calculate the
mean, as the equation suggests, it is the sum of the elements divided by
the count of the elements.
IBM['Close'].mean()
• Result
151.81089374875862
The second measure of central tendency is the median. The median is the middle value of the ordered set of numbers. To calculate the median in Python:
IBM['Close'].median()
• Result
152.5
Which one to choose? The rule of thumb is to use the mean when
there are no outliers and to use the median when there are outliers.
The third measure of central tendency is the mode. The mode is the most frequent data point in our data set; in a histogram, it is the highest bar. To calculate it in Python:
IBM['Close'].mode()
• Result
146.479996
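The three measures can be sanity-checked on a tiny made-up series of closing prices:

```python
import pandas as pd

# Hypothetical closing prices; 152.5 appears twice on purpose
closes = pd.Series([150.0, 151.0, 152.5, 152.5, 153.0])

print(closes.mean())     # sum of the elements divided by their count
print(closes.median())   # middle value of the ordered data
print(closes.mode()[0])  # most frequent value
```

Because 152.5 is repeated, it is both the median of the ordered data and the mode.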
The process can be combined with the describe function and the ffn package to obtain several statistical measures of central tendency at once. The process is as follows (Fig. 1):
stocks.describe()
Creating a Histogram
One of the most important plots for displaying information in finance is the histogram. A histogram shows the frequency of continuous data, where statistical frequency means the number of times a value appears in the data set.
For example, the variable IBM from the previous example will be used. Once the variable is created, there are different approaches as to which prices should be used to calculate returns. The conventional approach is to use closing prices, since they are registered as the last price of the day; this is useful when working on a historical database.
If there is a need to use the most recent data, the recommendation is to use the opening price for the first day and the closing price for the last day. There is also a discussion concerning the adjusted closing price, but this will be seen in later chapters.
IBM_returns = IBM['Close'].pct_change(1)
Fig. 2 IBM Returns Frequency (Source Elaborated by the author with informa-
tion from Yahoo Finance)
K = 1 + 3.322 · log10(N)
where:
K = number of bins
N = number of observations in the data set
• Result
count 1006.000000
mean -0.000254
std 0.013023
min -0.076282
25% -0.006450
50% 0.000220
75% 0.006213
max 0.088645
Name: Close, dtype: float64
The total number of observations is 1006. With this value, Sturges' rule can be calculated.
bins = 1+3.3222*np.log10(1006)
• Result
10.975231011547681
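Since the bins argument of hist() must be an integer, the Sturges figure is usually rounded up. A small offline check of the arithmetic:

```python
import numpy as np

n_observations = 1006

# Sturges' rule: K = 1 + 3.322 * log10(N)
bins = 1 + 3.322 * np.log10(n_observations)

# Round up to a whole number of bins before plotting
n_bins = int(np.ceil(bins))
print(n_bins)
```

For 1006 observations the rule suggests 11 bins; the 40-bin histogram in Fig. 3 is a deliberately finer choice.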
Fig. 3 IBM histogram with 40 bins (Source Elaborated by the author with
information from Yahoo Finance)
The same process can be followed with the ffn package, in a simpler way, as follows (Fig. 4):
returns = stocks.to_returns().dropna()
Creating a histogram:
returns.hist(bins=40, figsize=(12, 5));
r = ln(P1 / P0)
where:
ln = natural logarithm
P1 = actual (current) price
P0 = previous price
To achieve this in Python, it is necessary to use the shift and the np.log.
The shift2 is used to move the calculation by one value, hence the name.
The np.log3 returns the natural logarithm with a base e. Both of them are
necessary to use in order to calculate the logarithmic returns.
IBM_log_returns = np.log(IBM['Close']/IBM['Close'].shift(1))
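The same shift/np.log combination can be sketched offline on made-up prices:

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices
close = pd.Series([100.0, 105.0, 103.0])

# ln(P1 / P0): divide each price by the previous one, then take the log
log_returns = np.log(close / close.shift(1))
print(log_returns)
```

shift(1) moves the series down by one row, aligning each price with its predecessor; the first row is therefore NaN.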
2 Pandas documentation on shift: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shift.html.
3 NumPy documentation on np.log: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.log.html.
returns = stocks.to_log_returns().dropna()
Fig. 7 IBM Returns with axvline (Source Elaborated by the author with infor-
mation from Yahoo Finance)
After calculating the returns, the process of creating the histogram is similar; vertical lines for the mean and the median are drawn with plt.axvline (the colors here are illustrative choices):
IBM_log_returns.hist(bins=40, figsize=(12,6))
plt.axvline(IBM_log_returns.mean(), color='red', label='Mean')
plt.axvline(IBM_log_returns.median(), color='green', label='Median')
_ = plt.ylabel('Frequency')
plt.legend();
The plt.axvline was created with the mean function and with the
median function. The color4 was modified and a label was added so
that the legend becomes useful when analyzing the data. Given that the
mean and the median are very similar, there is almost no difference in
the graph but in extreme cases the difference between the mean and the
median can be considerable.
For this, it is necessary to discuss skewness.5 Since symmetry matters when reading a distribution, skewness is useful as a measure of the lack of symmetry. When talking about skewness, the information can be skewed to the left, skewed to the right, or centered (Jain 2018).
IBM_log_returns.skew()
• Result
−0.65950468511581717
The skewness for a normal distribution should be zero and the sym-
metric data should be near this number.
The kurtosis,6 which measures the weight of the tails of the distribution, is computed in a similar way:
IBM_log_returns.kurtosis()
• Result
6.2497976901878483
4 The different colors can be consulted at: https://matplotlib.org/examples/color/named_colors.html.
5 Pandas information for skewness: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.skew.html.
6 For more information on kurtosis in Pandas: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.kurtosis.html.
• Leptokurtic: the kurtosis value is greater than (>) zero. The distribution is more peaked around the mean and has heavier tails.
• Mesokurtic: the kurtosis value is equal (=) to zero. This represents a normal distribution.
• Platykurtic: the kurtosis value is less than (<) zero. The distribution is flatter, with the data spread farther from the mean.
In the example of IBM, the distribution is leptokurtic: the values concentrate around the mean and the bell is taller in that area.
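Both measures can be illustrated with a deterministic toy sample, symmetric but with two extreme observations in the tails (numbers are made up):

```python
import pandas as pd

# Symmetric values with two extreme observations at -10 and 10
data = pd.Series([-10, -1, -1, 0, 0, 0, 1, 1, 10])

skewness = data.skew()      # 0: the sample is perfectly symmetric
kurtosis = data.kurtosis()  # positive: heavy tails, i.e. leptokurtic

print(skewness, kurtosis > 0)
```

Pandas reports excess kurtosis, so a positive value means heavier tails than the normal distribution, matching the leptokurtic case above.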
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
• Percent Change
Pepsi['Returns'] = Pepsi['Close'].pct_change(1)
CocaCola['Returns'] = CocaCola['Close'].pct_change(1)
A Returns column is created in each DataFrame by direct assignment. The histogram for both companies can then be elaborated as follows (Fig. 8):
• Creating a histogram
CocaCola['Returns'].hist(bins=100,label='Coca-Cola',alpha=0.5)
Pepsi['Returns'].hist(bins=100,label='Pepsi',figsize=(10,8),alpha=0.5)
plt.legend();
• Creating a histogram with logarithmic returns (the LN Returns columns are built first, using the np.log and shift combination shown earlier)
CocaCola['LN Returns'] = np.log(CocaCola['Close']/CocaCola['Close'].shift(1))
Pepsi['LN Returns'] = np.log(Pepsi['Close']/Pepsi['Close'].shift(1))
CocaCola['LN Returns'].hist(bins=100,label='Coca-Cola',alpha=0.5)
Pepsi['LN Returns'].hist(bins=100,label='Pepsi',figsize=(10,8),alpha=0.5)
_ = plt.ylabel('Frequency')
plt.legend();
IBM_close = IBM['Close']
IBM_close.max() - IBM_close.min()
• Result
74.379997000000003
Since the maximum value of the series is 181.94 and the minimum value is 107.57, the difference is the range. This is important to know where the data is located: between 107.57 and 181.94.
Another important range measure is the Interquartile range (IQR).
The interquartile range is the distance between the third quartile (Q3)
and the first quartile (Q1) (Wan et al. 2014). The Q1 should be understood as the value such that one-fourth (25%) of the data is equal to or less than it. The Q3 should be understood as the value such that three-fourths (75%) of the data is equal to or less than it.
The measure is important for understanding outliers. The rule of thumb is that if a value is larger than Q3 plus 1.5 times the IQR, then this value is an outlier. The same applies if the value is smaller than Q1 minus 1.5 times the IQR.
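The quartiles and the two fences can be verified offline on a small made-up series:

```python
import pandas as pd

# Hypothetical, evenly spaced prices
prices = pd.Series([100.0, 110.0, 120.0, 130.0, 140.0])

q1 = prices.quantile(0.25)
q3 = prices.quantile(0.75)
iqr = q3 - q1

upper_fence = q3 + 1.5 * iqr  # values above this are outliers
lower_fence = q1 - 1.5 * iqr  # values below this are outliers
print(lower_fence, upper_fence)
```

Any observation outside the [lower_fence, upper_fence] band would be flagged as an outlier under the 1.5 × IQR rule.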
IBM_Q1 = IBM_close.quantile(0.25)
IBM_Q1
• Result
144.84000400000002
IBM_Q3 = IBM_close.quantile(0.75)
IBM_Q3
• Result
160.3650055
IBM_IQR = IBM_Q3 - IBM_Q1
IBM_Q3 + 1.5*IBM_IQR
• Result
183.65250774999996
Given the result, there are outliers in the lower range: the minimum value of the series is 107.57, which is below the lower fence of 121.55 (Q1 minus 1.5 times the IQR). There is an intense discussion about removing outliers or retaining them, considering that outliers can be informative. For the present moment, the outliers will be kept.
The visualization for analyzing outliers is the boxplot. Boxplots, also known as box-and-whisker plots, are useful to understand the concentration of the data. To create a boxplot, the information needed is as follows:
plt.boxplot(IBM_close);
_ = plt.ylabel('IBM Closing Price')
_ = plt.title('IBM Close - Boxplot')
_ = plt.xlabel('IBM')
Fig. 10 IBM Boxplot (Source Elaborated by the author with information from
Yahoo Finance)
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
Fig. 11 Coca-Cola and Pepsi box-plots (Source Elaborated by the author with
information from Yahoo Finance)
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
• Concatenating
united_box = pd.concat([Pepsi['Close'],CocaCola['Close']],axis=1)
The next step is to rename the columns in order to clearly identify each price series (the names below are chosen for clarity):
united_box.columns = ['Pepsi Close', 'Coca-Cola Close']
The combined and labeled columns can then be plotted together in one graph.
united_box.plot(kind='box',figsize=(8,11),colormap='jet')
_ = plt.ylabel('Closing Prices')
s² = Σ (x − x̄)² / (n − 1)
where:
x = data point
x̄ = mean of the data points in the series
n = total number of data points
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader
import datetime
import pandas_datareader.data as web
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2015,1,1)
end = datetime.datetime(2019,1,1)
IBM = web.DataReader('IBM','yahoo',start,end)
IBM_log_returns.var()
• Result
0.00017102549386496948
s = √( Σ (x − x̄)² / (n − 1) )
where x = data point, x̄ = mean of the data points in the series, and n = total number of data points.
The standard deviation measures the dispersion of the data set around its mean. In finance, the standard deviation is often referred to as volatility (Hargrave 2020). Volatility is important because higher volatility is usually associated with higher risk. In Python it is computed as follows:
IBM_log_returns.std()
• Results
0.0130776715765831
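The relationship between the two measures, with the standard deviation being the square root of the variance, can be confirmed offline on hypothetical returns:

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns
returns = pd.Series([0.01, -0.02, 0.015, 0.0, -0.005])

variance = returns.var()    # sample variance (n - 1 denominator)
volatility = returns.std()  # standard deviation, a.k.a. volatility

print(np.isclose(volatility ** 2, variance))
```

Both pandas methods use the sample (n − 1) denominator by default, consistent with the formulas above.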
x − observation
Weighting distance = K (9)
bandwith
Fig. 12 IBM KDE with log returns (Source Elaborated by the author with
information from Yahoo Finance)
IBM_log_returns = np.log(IBM['Close']/IBM['Close'].shift(1))
IBM_log_returns.plot(kind='kde',bw_method='scott', label='IBM',figsize=(12,6));
_ = plt.ylabel('Density')
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
Fig. 13 Coca-Cola and Pepsi KDE (Source Elaborated by the author with
information from Yahoo Finance)
• Calculating returns
Pepsi['Returns'] = Pepsi['Close'].pct_change(1)
CocaCola['Returns'] = CocaCola['Close'].pct_change(1)
CocaCola['Returns'].plot(kind='kde',label='Coca-Cola',figsize=(12,6))
Pepsi['Returns'].plot(kind='kde',label='Pepsi')
_ = plt.xlabel('Stock Return')
_ = plt.ylabel('Density')
plt.legend();
Covariance and Correlation
When comparing stocks, the technical aspects are important to understand the behavior of the stock when trying to build a portfolio, but they can be subjective and often misunderstood. There is an interesting discussion concerning technical analysis and other types of financial analysis, such as fundamental or econometric. It is important to understand the tool one is using to make the most out of it.
When trying to understand the relation between different assets, there are two quantitative approaches that are very useful: covariance and correlation. Covariance is useful when comparing how two assets are related (Hayes 2019). It is also an important part of the Capital Asset Pricing Model (CAPM) that will be explained in future chapters. The equation of the covariance is as follows:
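For reference, the sample covariance is cov(X, Y) = Σ(xᵢ − x̄)(yᵢ − ȳ) / (n − 1). A minimal sketch that checks this definition against NumPy, using two made-up return series:

```python
import numpy as np

# Two short, made-up daily return series
x = np.array([0.010, -0.020, 0.015, 0.005])
y = np.array([0.008, -0.015, 0.020, 0.000])

# Sample covariance: sum of products of deviations, divided by n - 1
manual_cov = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)

# np.cov returns the 2x2 covariance matrix; the off-diagonal entry is cov(x, y)
print(abs(manual_cov - np.cov(x, y)[0, 1]) < 1e-15)  # → True
```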
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
• Calculating returns
Nutanix['Returns'] = Nutanix['Close'].pct_change()
SP500['Returns'] = SP500['Close'].pct_change()
Nutanix['Returns'] = Nutanix['Returns'].dropna()
SP500['Returns'] = SP500['Returns'].dropna()
• Create a DataFrame
Nutanix_SP500 = pd.concat([Nutanix['Returns'],SP500['Returns']],axis=1)
Nutanix_SP500.columns = ['Nutanix Returns','SP500']
• Covariance Matrix
covariance = Nutanix_SP500.cov()
covariance
annual_covariance = covariance * 252
annual_covariance
• Correlation Matrix
correlation = Nutanix_SP500.corr()
correlation
First, create a stocks DataFrame that contains the closing prices; the ffn package provides the to_returns helper used below.
returns = stocks.to_returns()
returns.tail()
ntnxclose spyclose
Date
2018-12-24 −0.009323 −0.026423
2018-12-26 0.077221 0.050525
2018-12-27 0.025437 0.007677
2018-12-28 0.009020 −0.001290
2018-12-31 0.032779 0.008759
If logarithmic returns are to be used instead, the process is as follows:
returns = stocks.to_log_returns()
returns.tail()
ntnxclose spyclose
Date
2018-12-24 −0.009366 −0.026778
2018-12-26 0.074385 0.049290
2018-12-27 0.025119 0.007648
2018-12-28 0.008980 −0.001291
2018-12-31 0.032253 0.008721
The process will continue with the logarithmic returns. The correlation matrix is obtained as follows:
returns.corr()
ntnxclose spyclose
ntnxclose 1.000000 0.389009
spyclose 0.389009 1.000000
Scatterplots and Heatmaps
There are two important tools when analyzing the process of comparing
stocks. The first one is the scatterplot which will demonstrate the relation
between variables, in this case the SP500 and Nutanix (Fig. 14).
To create a scatterplot with histograms, and thereby avoid building the histograms from scratch, pandas provides a scatter_matrix plotting function that can be used with the following command:
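A hedged sketch of that command, shown here on a small synthetic DataFrame (in the book's example the Nutanix_SP500 DataFrame would be passed instead; the figure options are assumptions):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs anywhere
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic stand-in for the Nutanix_SP500 returns DataFrame
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 2)), columns=['Nutanix Returns', 'SP500'])

# One scatter panel per pair of columns, histograms on the diagonal
axes = scatter_matrix(df, figsize=(8, 8), hist_kwds={'bins': 20})
print(axes.shape)  # → (2, 2)
```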
Fig. 14 Scatter Matrix of Nutanix and SP500 (Source Elaborated by the author
with information from Yahoo Finance)
If only the scatterplot is needed, the process should be as follows (Fig. 15):
Fig. 15 Nutanix and SP500 Scatterplot (Source Elaborated by the author with
information from Yahoo Finance)
Fig. 16 Nutanix and SP500 Heatmap (Source Elaborated by the author with
information from Yahoo Finance)
• Creating a heatmap
The heatmap is a very useful tool since the color demonstrates the comparison between the different variables and the correlation is expressed in the boxes. When using several variables, which will be the case in the following chapters, it is important to understand which visual representation is the most appropriate.
The process can be followed with the ffn package as follows:
returns = stocks.to_log_returns()
returns.tail()
ntnxclose spyclose
Date
2018-12-24 −0.009366 −0.026778
2018-12-26 0.074385 0.049290
2018-12-27 0.025119 0.007648
2018-12-28 0.008980 −0.001291
2018-12-31 0.032253 0.008721
returns.corr()
ntnxclose spyclose
ntnxclose 1.000000 0.389009
spyclose 0.389009 1.000000
returns.plot_corr_heatmap();
Works Cited
Brooks, Chris. 2008. Introductory econometrics for finance. Boston: Cambridge
University Press.
Ganti, Akhilesh. 2019. Central Limit Theorem (CLT). 13 September. Accessed April 2, 2019. https://www.investopedia.com/terms/c/central_limit_theorem.asp.
Hargrave, Marshall. 2020. Standard deviation definition. 1 February. Accessed February 20, 2020. https://www.investopedia.com/terms/s/standarddeviation.asp.
Hayes, Adam. 2019a. Correlation definition. 20 June. Accessed January 1, 2020.
https://www.investopedia.com/terms/c/correlation.asp.
Hayes, Adam. 2019b. Correlation definition. 20 June. Accessed October 8,
2019. https://www.investopedia.com/terms/c/correlation.asp.
Jain, Diva. 2018. Skew and Kurtosis: 2 Important statistics terms you need to know
in Data Science. 23 August. Accessed August 12, 2019. https://codeburst.
io/2-important-statistics-terms-you-need-to-know-in-data-science-skewness-
and-kurtosis-388fef94eeaa.
Kalla, Siddharth. 2020. Range (Statistics). n.d. Accessed January 4, 2020.
https://explorable.com/range-in-statistics.
Keaton, Will. 2019. Quantitative Analysis (QA). 18 April. Accessed January 15,
2020. https://www.investopedia.com/terms/q/quantitativeanalysis.asp.
Kenton, Will. 2019. Kurtosis. 17 February. Accessed July 30, 2019. https://
www.investopedia.com/terms/k/kurtosis.asp.
Trochim, William M.K. 2020. Correlation. 10 March. Accessed March 12, 2020.
https://conjointly.com/kb/correlation-statistic/.
Wan, Xiang, Wenqian Wang, Jiming Liu, and Tiejun Tong. 2014. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. 19 December. Accessed January 03, 2019. https://doi.org/10.1186/1471-2288-14-135.
Elements for Technical Analysis Using
Python
In the last chapter, retrieving data through an API or from an Excel file was explored. Once the data is in the Jupyter Notebook, the next step is to analyze the information. The first step is to understand how to display the data in Python using the different methods it provides.
A linear plot demonstrates how a variable has behaved. In the case of a stock price, the linear plot shows how one of the characteristics of the variable has behaved through a period of time. An example is as follows:
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
_ = plt.xlabel('Date')
_ = plt.ylabel('Closing Price')
_ = plt.title('Tesla Closing Price')
There is another way the linear plot can be elaborated which, according to the author, makes it easier to deal with the problem of the plot above: how the Date is shown. The recommended way to build a linear plot is as follows (Fig. 2):
Fig. 1 Tesla closing price using Python (Source Elaborated by the author with
information from Yahoo Finance)
Fig. 2 Tesla Closing price with different size (Source Elaborated by the author
with information from Yahoo Finance)
_ = plt.xlabel('Date')
_ = plt.ylabel('Closing Price')
The chart above uses figsize, where the first number is the width and the second the height, in inches. plt.xlabel and plt.ylabel add the labels to the x- and y-axes. These attributes are helpful for creating a clean-looking linear plot.
When analyzing the data, the lowest closing price seems to be between January 2016 and July 2017. To find out when the price was at its lowest, idxmin can be used to return the date of the minimum value (older pandas versions exposed the same behavior through argmin):
Tesla['Close'].idxmin()
• Result:
Timestamp('2016-02-10 00:00:00')
As the timestamp demonstrates, the lowest closing price for Tesla was on February 10, 2016. To know the price at which the stock was trading that day:
Tesla['Close'].min()
• Result:
143.66999799999999
The same can be done for the highest closing price traded by Tesla. From the chart it is difficult to tell when the highest value occurred. To establish the date, idxmax can be used on the series:
Tesla['Close'].idxmax()
• Result:
Timestamp('2017-09-18 00:00:00')
To find out the highest price in the date range, the function max can be used:
Tesla['Close'].max()
• Result:
385.0
With the information from max and min, we can obtain the range. The range is important: it tells us the difference between the highest and the lowest values. It can be obtained as the difference between max and min as follows:
Tesla['Close'].max() - Tesla['Close'].min()
• Result:
241.33000200000001
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
_ = plt.xlabel('Date')
_ = plt.ylabel('Closing Price')
plt.legend();
One of the main differences with the first linear plot is the use of the command plt.legend(). The command is useful when there are different variables and the user wants to know which line is related to each security. There are different means of creating a legend, depending on the necessity of the user.
Fig. 4 Setting the legend with shadow, framealpha, fancybox and borderpad
(Source Elaborated by the author with information from Yahoo Finance)
framealpha sets the transparency of the legend frame: 0 makes the frame fully transparent and 1 fully opaque.
shadow, when set to False, removes the shadow around the box; when set to True, it draws a shadow around the box.
borderpad sets the padding inside the legend box (Fig. 4).
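These options can be sketched on any matplotlib plot; a minimal, self-contained example (data and option values are illustrative):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [10, 11, 9], label='Tesla')

# shadow draws a shadow around the box, framealpha sets the frame transparency,
# fancybox rounds the corners, borderpad pads the inside of the box
leg = ax.legend(shadow=True, framealpha=0.5, fancybox=True, borderpad=1.2)
print(leg.get_frame().get_alpha())  # → 0.5
```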
Tesla['Volume'].plot(label='Tesla',figsize=(16,8),title='Volume Traded')
_ = plt.xlabel('Date')
_ = plt.ylabel('Volume')
When the price has been higher, there has also been more trading activity in Tesla's stock.
Volume of Trade
One of the most interesting aspects when analyzing volume is to multiply it by the security price, which gives the volume of trade during a specific period in terms of money invested. The total money traded is useful for knowing the quantity of investment in a single day or period, for analyzing whether the market is selling or buying a specific security, and for understanding the monetary impact of trade. The equation is rather simple:
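Total traded is the day's volume multiplied by the price. A sketch with made-up numbers (the closing price is assumed here; the book may equally use the opening price):

```python
import pandas as pd

# Made-up prices and volumes standing in for the Tesla DataFrame
Tesla = pd.DataFrame({'Close': [250.0, 255.0, 248.0],
                      'Volume': [1_000_000, 1_200_000, 900_000]})

# Money traded per day = shares traded * price per share
Tesla_total_traded = Tesla['Volume'] * Tesla['Close']
print(Tesla_total_traded.iloc[0])  # → 250000000.0
```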
Tesla_total_traded.plot(label='Tesla',figsize=(16,8),title='Total Traded')
_ = plt.xlabel('Date')
_ = plt.ylabel('Total Traded')
Fig. 6 Total Traded Plot (Source Elaborated by the author with information
from Yahoo Finance)
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
• Choosing securities
_ = plt.xlabel('Date')
_ = plt.ylabel('Volume')
Fig. 7 Volume traded of Tesla, General Motors and Ford (Source Elaborated by
the author with information from Yahoo Finance)
Fig. 8 Closing Price of Ford, General Motors and Tesla (Source Elaborated by
the author with information from Yahoo Finance)
Candlestick Charts
The Candlestick charts were created in Japan during the 1700s by
Homma whose purpose was to analyze if there was a relation between
supply and demand of rice (Mitchell 2019). It is extremely useful for
analyzing emotional trading and actually is one of the most useful charts
in technical analysis.
The Candlestick chart uses the open, high, low and close price of each day. The body of a candlestick bar can be filled in or hollow, although the colors vary into green and red depending on the user. When the body of the candlestick bar is filled, the close was lower than the open; when the body is empty, the close was higher than the open (Fig. 11).
To read a Candlestick chart it is important to understand whether it is bullish or bearish. These aspects are based on the price direction, analyzed through the close and open prices. Candlestick analysis places particular emphasis on the relationship between the opening and closing prices, which is a large part of what makes these charts so important (J. J. Murphy 1999).
To create a Candlestick chart, the mplfinance package will be used. The reasons for using this package are the ease of creating Candlestick charts and how well it adapts to the information accessed through the Yahoo API. The mplfinance package can be accessed at https://github.com/matplotlib/mplfinance#release. There are different aspects of this package that are going to be highlighted during the process of creating Candlestick charts.
Fig. 12 Zoom candlestick chart (Source Elaborated by the author with infor-
mation from Yahoo Finance)
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import mplfinance as mfp
start = datetime.datetime(2020,1,1)
end = datetime.datetime(2020,3,30)
• Choosing security
The above chart analyzes the trend of the security Zoom (ticker ZM), which saw important growth given COVID-19 and the lockdowns in various countries. From March 16 until March 30 the candles grew wider because of the spread between their high and low prices, which demonstrates high volatility in the security. A bullish pattern can also be seen, as well as the difference between closing and opening prices, which leads to black and white candlesticks.
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import mplfinance as mfp
Fig. 13 Dow Jones Candlestick chart with volume (Source Elaborated by the
author with information from Yahoo Finance)
• Choosing security
**kwargs are extremely useful when working with functions and charts. In the case of mplfinance, kwargs can be used to gather the customization of the plots into a single variable. Use kwargs in charts when there are many options rather than describing each option one by one; this is helpful for those reading the notebook (Mastromatteo 2020).
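Independent of mplfinance, the **kwargs mechanism itself can be sketched with a plain function (plot_stub is a made-up stand-in, not an mplfinance API):

```python
def plot_stub(data, title='', ylabel='', volume=False):
    """Made-up stand-in for a plotting call; just reports the options it received."""
    return f"{title}|{ylabel}|volume={volume}"

# Bundle the customization into one dict, then unpack it with **
kwargs = dict(title='Dow Jones', ylabel='Price', volume=True)
print(plot_stub(None, **kwargs))  # → Dow Jones|Price|volume=True
```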
• Creating a **kwarg
mfp.plot(Dow_Jones,**kwargs,style='yahoo')
Fig. 14 Candlestick chart and volume chart using colors (Source Elaborated by
the author with information from Yahoo Finance)
Fig. 15 OHLC bars explained (Source From the article written by Basurto
[2020])
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import mplfinance as mfp
• Choosing security
• Importing libraries
Fig. 16 Dow Jones OHLC chart with volume (Source Elaborated by the
author with information from Yahoo Finance)
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import mplfinance as mfp
start = datetime.datetime(2018,1,1)
end = datetime.datetime(2020,3,30)
• Choosing security
Fig. 17 Line charts with volume (Source Elaborated by the author with infor-
mation from Yahoo Finance)
kwargs = dict(title='Dow Jones line chart with volume-1/1/2020 to 30/3/2020', ylabel='Line chart', figratio=(30,15), figscale=0.75, volume=True)
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
• Choosing companies
Amazon['MA50'] = Amazon['Close'].rolling(50).mean()
Walmart['MA50'] = Walmart['Close'].rolling(50).mean()
Target['MA50'] = Target['Close'].rolling(50).mean()
Amazon['MA100'] = Amazon['Close'].rolling(100).mean()
Walmart['MA100'] = Walmart['Close'].rolling(100).mean()
Target['MA100'] = Target['Close'].rolling(100).mean()
Amazon['MA200'] = Amazon['Close'].rolling(200).mean()
Walmart['MA200'] = Walmart['Close'].rolling(200).mean()
Target['MA200'] = Target['Close'].rolling(200).mean()
For the above example, the rolling method of the pandas DataFrame was used. With rolling, the window of days can be defined (in this case 50), and combining rolling with mean yields the average of day 1 to day 50, then the average of day 2 to day 51, and so on.
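The windowing described above can be verified on a tiny series (a window of 3 instead of 50, to keep it readable):

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])
ma3 = close.rolling(3).mean()  # window of 3 instead of 50

# The first two entries are NaN (the window is not yet full); the third is the
# average of days 1-3, the fourth of days 2-4, and so on.
print(ma3.tolist())  # → [nan, nan, 11.0, 12.0, 13.0]
```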
Once each company's DataFrame has the MA50 column, they can be plotted individually to analyze the momentum of the stocks.
Walmart['MA50'].plot(label='Walmart SMA50',figsize=(16,8))
Walmart['MA100'].plot(label='Walmart SMA100')
Walmart['MA200'].plot(label='Walmart SMA200')
Walmart['Close'].plot(label='Walmart Close')
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
plt.legend();
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
plt.legend();
Amazon['MA50'].plot(label='Amazon SMA50',figsize=(16,8))
Amazon['MA100'].plot(label='Amazon SMA100')
Amazon['MA200'].plot(label='Amazon SMA200')
Amazon['Close'].plot(label='Amazon Close')
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import mplfinance as mfp
Fig. 21 Simple moving average for Walmart (Source Elaborated by the author
with information from Yahoo Finance)
start = datetime.datetime(2014,1,1)
end = datetime.datetime(2019,1,1)
• Choosing security
The SMA is equivalent to the one elaborated with matplotlib in Fig. 19: Simple moving average of Walmart. The only problem is the legends.
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
• Choosing companies
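The code that builds the EMA columns can be sketched with pandas' ewm method, using span equal to the period (the usual convention; the book's exact call may differ):

```python
import pandas as pd

# Illustrative closing prices; the book applies this to Amazon, Walmart and Target
Amazon = pd.DataFrame({'Close': [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]})

for span in (50, 100, 200):
    Amazon[f'EMA{span}'] = Amazon['Close'].ewm(span=span, adjust=False).mean()

# With adjust=False the EMA starts at the first closing price
print(Amazon['EMA50'].iloc[0])  # → 100.0
```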
• Plotting the EMA of the three companies (Figs. 22, 23, and 24)
Amazon['EMA50'].plot(label='Amazon EMA50',figsize=(16,8))
Amazon['EMA100'].plot(label='Amazon EMA100')
Amazon['EMA200'].plot(label='Amazon EMA200')
Amazon['Close'].plot(label='Amazon Close')
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
plt.legend();
Fig. 22 Amazon EMA 50, 100, 200 (Source Elaborated by the author with
information from Yahoo Finance)
Fig. 23 Target EMA 50, 100,200 (Source Elaborated by the author with infor-
mation from Yahoo Finance)
Fig. 24 Walmart EMA 50, 100,200 (Source Elaborated by the author with
information from Yahoo Finance)
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
plt.legend();
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
plt.legend();
When compared with the SMA, the difference is that the fall of Target and Amazon is not as drastic as with the SMA. This leads us to choose between the EMA strategy, which emphasizes the current effects on the market, and the SMA, which is preferable if the market has seen an important change in the past few weeks that one thinks should not carry weight in the analysis.
• The MACD line, which measures the distance between two moving averages.
• The signal line, which identifies changes in price.
• The histogram, which represents the difference between the MACD and the signal line.
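The three components can be sketched with pandas alone, using the conventional 12/26/9 periods (assumed here; the book's parameters may differ):

```python
import pandas as pd

close = pd.Series(range(1, 61), dtype=float)  # illustrative, steadily rising prices

ema12 = close.ewm(span=12, adjust=False).mean()   # fast EMA
ema26 = close.ewm(span=26, adjust=False).mean()   # slow EMA

macd = ema12 - ema26                                 # MACD line
signal_line = macd.ewm(span=9, adjust=False).mean()  # signal line: 9-period EMA
histogram = macd - signal_line                       # MACD histogram

# In a steady uptrend the fast EMA sits above the slow one, so the MACD is positive
print(macd.iloc[-1] > 0)  # → True
```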
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
• Choosing companies
Amazon['baseline'] = 0  # the MACD baseline sits at zero (value assumed; lost in the source)
Walmart['baseline'] = 0
Target['baseline'] = 0
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
_ = plt.title('Amazon MACD')
plt.legend();
Fig. 25 Amazon MACD with Baseline (Source Elaborated by the author with
information from Yahoo Finance)
Fig. 26 Walmart MACD with Baseline (Source Elaborated by the author with
information from Yahoo Finance)
Fig. 27 Target MACD with baseline (Source Elaborated by the author with
information from Yahoo Finance)
Walmart['MACD'].plot(label='Walmart MACD',figsize=(16,8))
Walmart['baseline'].plot(label='Baseline')
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
_ = plt.title('Walmart MACD')
plt.legend();
_ = plt.xlabel('Date')
_ = plt.ylabel('Price')
_ = plt.title('Target MACD')
plt.legend();
Using a signal line with the MACD is one of the most important trading tools because of its interpretation:
– When the MACD crosses the signal line from below to above, the indicator is considered bullish.
– When the MACD crosses the signal line from above to below, the indicator is considered bearish (Posey 2019).
The signal line is equivalent to an EMA of nine periods. The signal line
will be plotted with the MACD.
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime.datetime(2014, 1, 1)
end = datetime.datetime(2019, 1, 1)
• Choosing companies
Fig. 28 MACD and Signal Line (Source Elaborated by the author with infor-
mation from Yahoo Finance)
macd.plot(label='Target MACD',figsize=(16,8))
signal_line.plot(label='Baseline')
_ = plt.xlabel('Date')
plt.legend();
Bollinger Bands ®
When John Bollinger created the Bollinger Bands in the 1980s, he found a means of joining the quantitative aspect (standard deviation) and technical analysis for decision-making. Bollinger Bands combine the standard deviation, a measure of volatility, and a moving average, defining when the security has a contraction or an expansion (Bollinger 2018).
Bollinger Bands are extremely useful when analyzing security because:
– When there is low volatility, the bands will be close together. When there is high volatility, the bands will be far apart. Periods of low volatility are often followed by high volatility, and periods of high volatility by low volatility.
– If the price moves above the upper band, the security is considered overbought, meaning that the stock is being bought at unjustifiably high prices.
– If the price moves below the lower band, the security is considered oversold, meaning that the stock is selling below its true value.
• Importing libraries
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import datetime
import mplfinance as mfp
• Choosing security
window_of_days = 20  # typical Bollinger window (value assumed; lost in the source)
number_std = 2       # typical number of standard deviations (value assumed)
rolling_mean = Dow_Jones['Close'].rolling(window_of_days).mean()
rolling_std = Dow_Jones['Close'].rolling(window_of_days).std()
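The bands that feed make_addplot below are the rolling mean plus and minus number_std rolling standard deviations; a self-contained sketch (synthetic prices stand in for Dow_Jones['Close'], and a window of 3 replaces the real one for readability):

```python
import pandas as pd

# Synthetic prices stand in for Dow_Jones['Close']
close = pd.Series([20.0, 21.0, 19.5, 22.0, 23.0, 21.5, 24.0])
window_of_days, number_std = 3, 2  # small window so the sketch is readable

rolling_mean = close.rolling(window_of_days).mean()
rolling_std = close.rolling(window_of_days).std()

upper_band = rolling_mean + number_std * rolling_std
lower_band = rolling_mean - number_std * rolling_std
high_low = pd.DataFrame({'Upper': upper_band, 'Lower': lower_band})

# The bands are symmetric around the rolling mean
print(bool((((upper_band + lower_band) / 2) - rolling_mean).abs().max() < 1e-12))  # → True
```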
apd = mfp.make_addplot(high_low)
kwargs = dict(title ='Dow Jones Bollinger Band 1/8/2019 to 1/4/2020', ylabel='Bollinger Bands',
figratio=(30,15),figscale= 0.75)
As expressed before, there are three breaking points in the lower band during the COVID-19 pandemic crisis, which means that the stocks are oversold. This supports the argument that, because of fear, investors were dumping or selling securities before they lost value.
Fig. 29 Dow Jones Bollinger Bands (Source Elaborated by the author with
information from Yahoo Finance)
Parabolic SAR
The parabolic SAR gives and edges to the traders given that it analyzes
the movement of the stock. It was created by J. Welles Wilder Jr., which
also created the RSI (C. Murphy 2020). The logic behind the parabolic
SAR is as follows:
import talib
import numpy as np
import pandas as pd
import pandas_datareader as pdr
from pandas_datareader import data as web
import datetime
import mplfinance as mpf
import matplotlib.pyplot as plt
%matplotlib inline
– Choose dates
start = datetime.datetime(2019,8,1)
end = datetime.datetime(2021,1,30)
– Choose a stock
– Plot the Parabolic SAR with the closing price (Fig. 30).
aapl['Close'].plot(label='Closing Price',figsize=(16,8))
aapl['SAR'].plot(label= 'SAR')
_ = plt.xlabel('Date')
plt.legend();
Source Created by the author with information from (Bolsa de Madrid 2020) (London Stock Exchange
Group 2020) (NYSE 2020)
The parabolic SAR is useful for deciding whether to buy or sell a stock depending on its relationship with the closing price; in this example, the stock is Apple. The parabolic SAR recommends buying if the SAR line is below the closing price and selling if the SAR is above the closing price. As seen before, at the beginning of the chart the SAR line and the closing price were very similar, but as volatility has struck Apple, the opportunities for selling and buying have become clearer. This can also be complemented with the stochastic oscillator.
start = datetime.datetime(2020,12,1)
end = datetime.datetime(2021,1,30)
In the equation before, the standard fourteen days were used for the %K
and three days for the %D.
– Result
Date High Low Open Close Volume Adj Close SAR slowk slowd fastk fastd
2021-01-25 145.089996 136.539993 143.070007 142.919998 157,611,700 142.919998 127.173966 87.180935 71.431681 88.401933 87.180935
2021-01-26 144.300003 141.369995 143.600006 143.160004 98,390,600 143.160004 128.248927 90.765333 85.085841 89.684699 90.765333
2021-01-27 144.300003 140.410004 143.429993 142.059998 140,843,800 142.059998 129.259392 87.155227 88.367165 83.379048 87.155227
2021-01-28 141.990005 136.699997 139.520004 137.089996 142,621,100 137.089996 130.209228 76.393343 84.771301 56.116282 76.393343
2021-01-29 136.740005 130.210007 135.830002 131.960007 177,180,600 131.960007 145.089996 55.823745 73.124105 27.975904 55.823745
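The same oscillator can be reproduced without TA-Lib using the standard formula; a hedged, pandas-only sketch (this matches the "fast" %K/%D, while TA-Lib's slowk/slowd apply extra smoothing):

```python
import numpy as np
import pandas as pd

# Synthetic OHLC-like data standing in for the Apple DataFrame
rng = np.random.default_rng(1)
close = pd.Series(100 + rng.normal(0, 1, 40).cumsum())
high, low = close + 0.5, close - 0.5  # illustrative intraday ranges

low14 = low.rolling(14).min()
high14 = high.rolling(14).max()

fastk = 100 * (close - low14) / (high14 - low14)  # %K over 14 days
fastd = fastk.rolling(3).mean()                   # %D: 3-day average of %K

# %K is bounded between 0 and 100 by construction
valid = fastk.dropna()
print(bool(((valid >= 0) & (valid <= 100)).all()))  # → True
```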
aapl['slowk'].plot(label='slowk',figsize=(16,8))
aapl['slowd'].plot(label= 'slowd')
_ = plt.xlabel('Date')
plt.legend();
aapl['fastk'].plot(label='fastk',figsize=(16,8))
aapl['fastd'].plot(label= 'fastd')
_ = plt.xlabel('Date')
plt.legend();
References
Bang, Julie. 2019. Candlestick bar. Investopedia.
Basurto, Stefano. 2020. Python trading toolbox: Introducing OHLC charts,
7 January. Accessed March 31, 2020. https://towardsdatascience.com/
trading-toolbox-03-ohlc-charts-95b48bb9d748.
Bollinger, John. 2018. John Bollinger answers “What are Bollinger Bands?”, n.d.
Accessed March 2, 2020. https://www.bollingerbands.com/bollinger-bands.
Bolsa de Madrid. 2020. Electronic Spanish Stock Market Interconnection System
(SIBE), n.d. Accessed March 23, 2020. http://www.bolsamadrid.es/ing/
Inversores/Agenda/HorarioMercado.aspx.
Chen, James. 2019. Line Chart, 12 August. Accessed March 25, 2020. https://
www.investopedia.com/terms/l/linechart.asp.
Halton, Clay. 2019. Line Graph, 21 August. Accessed March 25, 2020. https://
www.investopedia.com/terms/l/line-graph.asp.
Hayes, Adam. 2018. Volume definition, 4 February. Accessed March 25, 2020.
https://www.investopedia.com/terms/v/volume.asp.
Hayes, Adam. 2020a. Exponential Moving Average - EMA definition, 8 July.
Accessed April 2, 2020. https://www.investopedia.com/terms/e/ema.asp.
Hayes, Adam. 2020b. Moving Average (MA), 31 March. Accessed March 31,
2020. https://www.investopedia.com/terms/m/movingaverage.asp.
Kevin, S. 2015. Security analysis and portfolio management. Delhi: PHI.
London Stock Exchange Group. 2020. London Stock Exchange Group Business
Day, n.d. Accessed March 25, 2020. https://www.lseg.com/areas-expertise/
our-markets/london-stock-exchange/equities-markets/trading-services/
business-days.
Mastromatteo, Davide. 2020. Python args and kwargs: Demystified. 09 September.
Accessed March 31, 2020. https://realpython.com/python-kwargs-and-args/.
Milton, Adam. 2020. Simple, exponential, and weighted moving averages, 9
November. Accessed March 31, 2020. https://www.thebalance.com/
simple-exponential-and-weighted-moving-averages-1031196.
Mitchell, Cory. 2019. Understanding basic candlestick charts. 19 December.
Accessed March 30, 2020. https://www.investopedia.com/trading/
candlestick-charting-what-is-it/.
Mitchell, Cory. 2020. How to use volume to improve your trading, 25 February.
Accessed March 27, 2020. https://www.investopedia.com/articles/technical/02/010702.asp.
Murphy, Casey. 2020. Investopedia, 16 November. Accessed January 1, 2021.
https://www.investopedia.com/trading/introduction-to-parabolic-sar/.
Murphy, John J. 1999. Technical analysis of the financial markets. New York:
New York Institute of Finance.
NYSE. 2020. TAQ closing prices, n.d. Accessed March 25, 2020. https://www.
nyse.com/market-data/historical/taq-nyse-closing-prices.
Posey, Luke. 2019. Implementing MACD, 30 March. Accessed April 2, 2020.
https://towardsdatascience.com/implementing-macd-in-python-cc9b2280126a.
Valuation and Risk Models with Stocks
Abstract One of the most important aspects of the analysis of securities is to analyze their risk and how to value the instrument. The present chapter aims to explain the importance of risk, the different financial measures for understanding risk in a portfolio of securities, and the impact on the returns of a portfolio.
Creating a Portfolio
Managing portfolios with multiple assets is one of the most interesting
processes when working on finance. Given that there are different assets
which can compose a portfolio, one of the most important processes of
start = datetime.datetime(2014, 1, 1)   # illustrative start date (value assumed)
end = datetime.datetime(2018, 12, 26)   # the example uses prices as of December 26, 2018
The second step when creating the portfolio is assigning the weights that the portfolio must have. For this approach, considering market capitalization is important.
To determine market value, two pieces of information are essential: the last price at which the security traded and the number of shares outstanding. For this, it is important to have a benchmark that can be used as a reference for the securities. Since the example establishes the last price as of December 26, 2018, the prices can be obtained as follows:
stocks.tail()
Source Elaborated by the author with information from Yahoo Finance and Nasdaq (a Obtained from https://www.nasdaq.com)
stocks_return = (stocks / stocks.shift(1)) - 1
portfolio_weights = np.array([])  # fill with the market-cap weights computed above
The first step is to obtain the annual covariance. For the annual covariance, the number of days used is 252 rather than 360 or 365, since it is based on the days the stock market is open:
stocks_covariance = stocks_return.cov()
annual_covariance = stocks_covariance * 252
portfolio_variance = np.dot(portfolio_weights.T, np.dot(annual_covariance, portfolio_weights))
portfolio_variance
0.34392103687572534
2 np.dot computes the dot product of two arrays. For more information visit: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html.
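Putting the pieces together, a minimal sketch of the annualized portfolio volatility looks as follows; the return series and weights are assumptions chosen only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for two stocks (placeholders, not the book's data)
returns = pd.DataFrame({
    "AAA": [0.010, -0.020, 0.015, 0.000, 0.005],
    "BBB": [0.005, -0.010, 0.020, -0.005, 0.010],
})
portfolio_weights = np.array([0.6, 0.4])

# Annualize the daily covariance with the 252 trading-day convention
stocks_covariance = returns.cov() * 252

# Portfolio variance w'Cw via two nested np.dot calls, then its square root
portfolio_variance = np.dot(portfolio_weights.T,
                            np.dot(stocks_covariance, portfolio_weights))
portfolio_volatility = np.sqrt(portfolio_variance)
```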
The Beta
The systematic risk of an asset is measured through the beta, the degree of sensitivity of the asset's variation compared with an index that is used as a benchmark.
Given that the covariance will be obtained for calculating the beta, it
is important to elaborate a covariance matrix. First, the process will be
explained with the covariance formula and then it will be integrated with
the correlation formula.
Before coding in Python there are certain aspects that should be clar-
ified to understand the process of computing the beta with covariance:
• Using a for loop: The program below uses a simple for loop because there is a need to create a DataFrame with the two variables. The process is simple, and there are different ways a for loop3 can be used.
• Choosing a market for comparison: For example, the S&P 500
(^GSPC) was used because it is an interesting market reference
when trying to determine the behavior of the company.
• Using iloc: iloc4 is a pandas DataFrame function that selects a posi-
tion. It can be used to slice the DataFrame or to create new col-
umns. Please visit the chapter titled The Basics for more information.
• Using 252 days: The concept of 252 days comes from the number of regular trading hours (RTH) days in a year. It is a rule of thumb (Mitra and Mitra 2011). The daily covariance and variance are multiplied by 252 to annualize them.
These aspects are important since they will be repeated throughout the
following sections of the book.
3 An excellent resource for understanding for loops can be found here: https://www.
w3schools.com/python/python_for_loops.asp.
4 For more information on using iloc and examples visit: https://pandas.pydata.org/
pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html.
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
• start = datetime.datetime(2014, 1, 1)
• end = datetime.datetime(2019, 1, 1)
stocks_return = np.log(stocks/stocks.shift(1 ))
Covariance Matrix
DIS ^GSPC
DIS 0.035899 0.015926
^GSPC 0.015926 0.017540
covariance_market = covariance.iloc[0,1 ]
covariance_market
beta_Disney = covariance_market/market_variance
beta_Disney
Beta = 0.90796359355848766
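The covariance route to the beta can be condensed into a short, self-contained sketch; the daily return series below are hypothetical:

```python
import pandas as pd

# Hypothetical daily returns for a stock and the benchmark (illustrative only)
data = pd.DataFrame({
    "DIS":   [0.010, -0.020, 0.015, 0.000, 0.005],
    "^GSPC": [0.008, -0.015, 0.010, 0.002, 0.004],
})

covariance = data.cov()                     # covariance matrix
covariance_market = covariance.iloc[0, 1]   # Cov(stock, market)
market_variance = data["^GSPC"].var()       # Var(market), sample variance
beta = covariance_market / market_variance
```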
Calculating the beta with the correlation formula is easier but less common. It is important to remember that when comparing betas, the same number of days of data must be used for all the variables. There are different approaches to the beta, but the most notable aspect is to understand the result.
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2014, 1, 1 )
end = datetime.datetime(2019, 1, 1 )
stocks_return = np.log(stocks/stocks.shift(1 ))
• Calculating correlation
correlation = stocks_return.corr()
correlation
Correlation Matrix
NFLX ^GSPC
NFLX 1.000000 0.467229
^GSPC 0.467229 1.000000
correlation_NFLX_GSPC = correlation.iloc[0,1 ]
correlation_NFLX_GSPC
• Netflix variance
• Market variance
• Beta
Beta = 0.46722863704340151
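As a cross-check, the correlation-based beta can be computed by scaling the correlation by the ratio of the two standard deviations, which recovers exactly the covariance/variance definition. The return series below are hypothetical:

```python
import pandas as pd

# Hypothetical daily returns; tickers are placeholders for illustration
data = pd.DataFrame({
    "NFLX":  [0.020, -0.030, 0.010, 0.015, -0.005],
    "^GSPC": [0.008, -0.015, 0.010, 0.002, 0.004],
})

# Beta from correlation: scale the correlation by the ratio of volatilities
correlation = data["NFLX"].corr(data["^GSPC"])
beta = correlation * (data["NFLX"].std() / data["^GSPC"].std())

# Identical to the covariance/variance definition of beta
beta_check = data.cov().iloc[0, 1] / data["^GSPC"].var()
```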
Both stocks are less volatile than the market, meaning that if the S&P 500 shifts upward by 1%, then Disney will shift by 0.91% and Netflix by 0.47%. The beta is important for analyzing how assets behave when compared to a market.
βi > 1: The asset is more volatile than the market it is compared with
0 < βi < 1: The asset is less volatile than the market it is compared with
βi = 0: The asset is not correlated with the market it is compared with
βi < 0: The asset is negatively correlated with the market it is compared with
E(Rp) − RF = βp (E(Rm) − RF)
According to the equation, the first step for calculating the CAPM is
to create a portfolio. To create a portfolio, one must understand that it is
composed of different assets and that those assets should have a different
weight.
The weights are known as portfolio weights and there are different
methods for calculating the weights but the example will focus on the
market value method seen before.
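With a beta in hand, the CAPM equation reduces to simple arithmetic; all of the numbers below are assumptions chosen only to illustrate the formula:

```python
# Illustrative CAPM calculation; every figure is an assumption
risk_free = 0.025        # assumed annual risk-free rate
market_return = 0.080    # assumed expected market return
beta_portfolio = 1.30    # assumed portfolio beta

# E(Rp) = RF + beta * (E(Rm) - RF)
expected_return = risk_free + beta_portfolio * (market_return - risk_free)
```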
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2014, 1, 1 )
end = datetime.datetime(2019, 1, 1 )
tickers = ['NFLX', 'DIS', 'TSLA', 'AMZN' ]
stocks = pd.DataFrame()
for x in tickers:
    stocks[x] = web.DataReader(x, 'yahoo', start, end)['Close']
start = datetime.datetime(2014, 1, 1 )
end = datetime.datetime(2019, 1, 1 )
stocks_return = np.log(stocks/stocks.shift(1 ))
stocks_return.dropna().head()
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1 )
stocks_return['Portfolio' ] = weighted_returns_portfolio.sum(axis = 1 )
cumulative_returns_portfolio = ((1 + stocks_return['Portfolio']).cumprod() - 1)
With the column of the real risk-free rate and the portfolio returns, the excess return can be calculated very easily:
With the information, the only variable missing for the CAPM is the
beta. For this next step, the SPY (S&P 500) will be used:
start = datetime.datetime(2014, 1, 2 )
end = datetime.datetime(2019, 1, 1 )
The second step is to calculate the excess return of the market compared with the risk-free rate. This is essential because it standardizes how the market behaves relative to the risk-free rate and the portfolio.
To obtain the beta there are different processes; for this example, the beta equation using covariance and variance will be used. The equation is as follows:
β = Cov(Rp, Rm) / Var(Rm)
The next step is to obtain the covariance matrix. From the covariance
matrix the coefficient can be inserted into a variable:
• Covariance Matrix
Covariance Matrix
• Covariance Coefficient
Once the covariance is obtained the next process is to insert the variance
of the portfolio into the process.
With the variance and the covariance, the beta can be obtained by using
the formula described before.
• Calculating Beta
beta = covariance_coefficient/variance_coefficient
beta
1.2979841588534764
Sharpe Ratio
The Sharpe Ratio was created by William F. Sharpe based on the importance of understanding the relation between risk and return. The Sharpe Ratio is considered one of the return-to-risk ratios.
A higher Sharpe Ratio is considered better, since the denominator is the standard deviation, or risk. It is useful when comparing peers, for example exchange-traded funds (ETFs) (Hargrave 2019).
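A minimal sketch of an annualized Sharpe Ratio, assuming a 2% annual risk-free rate and a short series of hypothetical daily returns:

```python
import numpy as np
import pandas as pd

# Hypothetical daily portfolio returns; the 2% annual risk-free rate is assumed
daily_returns = pd.Series([0.0010, -0.0020, 0.0015, 0.0005, 0.0020, -0.0010])
risk_free_daily = 0.02 / 252

excess_returns = daily_returns - risk_free_daily
sharpe_daily = excess_returns.mean() / excess_returns.std()

# Annualize with the square root of the 252 trading-day convention
sharpe_annual = sharpe_daily * np.sqrt(252)
```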
For obtaining the Sharpe Ratio the first step is to create the portfolio
using the method that has been used before:
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
In the example above the companies being used are Ford (F), Fiat-Chrysler (FCAU) and Toyota Motors (TM). The dates run from January 2, 2018 to April 1, 2019. For this example, a for loop was used.
stocks.corr()
F FCAU TM
F 1.000000 0.873512 0.837771
FCAU 0.873512 1.000000 0.852925
TM 0.837771 0.852925 1.000000
The percent change is an easier approach because it takes the last price and calculates the change in percentage terms, which is another way of calculating stock returns without using logarithmic returns. Mathematically, it is not recommended to use the logarithmic model to calculate returns (Fig. 2).
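The two ways of computing returns can be compared side by side on a few illustrative prices:

```python
import numpy as np
import pandas as pd

# Illustrative prices to compare the two ways of computing returns
prices = pd.Series([100.0, 102.0, 99.0, 101.0])

simple_returns = prices.pct_change().dropna()             # percent change
log_returns = np.log(prices / prices.shift(1)).dropna()   # logarithmic returns
```

For small daily moves the two series are close; they diverge as the size of the move grows.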
• Create a variable for the portfolio by calculating the sum of the returns
• Add the benchmark by using the S&P 500 and calculating the
returns
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
• Create a new DataFrame for the portfolio and the benchmark for
calculating the correlation
correlation = portfolio_benchmark.corr()
correlation
Portfolio Benchmark
Portfolio 1.000000 0.666916
Benchmark 0.666916 1.000000
-0.7134969693218578
import math
annual_days = 252
-11.18589313516733
Considering that the result is negative, the problem is that the mean return of the portfolio is smaller than the risk-free rate. As discussed before, a portfolio return lower than the risk-free rate shows that the portfolio is not performing above our lowest target (the risk-free rate), and therefore the portfolio is not effective (Fig. 3).
By using highly correlated assets in a portfolio, returns were sacrificed and the risk was higher, because if one security falls the other companies will respond in the same way. This is an important lesson about heeding the warnings revealed by correlation analysis when building a portfolio.
Treynor Ratio
The ratio was created by Jack Treynor in 1965 and measures profitability relative to risk. The rule of thumb is that a higher value reflects better portfolio management. When analyzing the Treynor Ratio, if it is negative, the portfolio has underperformed the risk-free rate.
Since the Treynor Ratio measures the return earned in excess of a riskless investment per unit of market risk, it is useful because it relates the returns to the risk borne by the investor (Keaton 2020). As seen in the following equation, the main difference between the Sharpe Ratio and the Treynor Ratio is the use of beta:
Treynor Ratio = (Rp − RF) / βp
For this example, the data set that was used for the Sharpe Ratio will be reused. This allows a comparison between the ratios.
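In its simplest form the Treynor Ratio is one division; the figures below are assumed, not taken from the book's data:

```python
# Treynor Ratio with assumed annual figures (illustrative only)
portfolio_return = 0.060   # assumed portfolio return
risk_free = 0.025          # assumed risk-free rate
portfolio_beta = 0.97      # assumed portfolio beta

treynor_ratio = (portfolio_return - risk_free) / portfolio_beta
```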
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
tickers = ['F', 'FCAU', 'TM']
stocks = pd.DataFrame()
for x in tickers:
stocks[x] = web.DataReader(x, 'yahoo' , start, end)[ 'Close' ]
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1 )
• Add the benchmark by using the S&P 500 and calculating the
returns
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
covariance_market = covariance.iloc[3,4 ]
covariance_market
0.026395748767295644
0.96828357625053052
-0.011269427871314531
As a result, the Treynor Ratio is negative, which means that the portfolio is not performing better than the risk-free rate. The Sharpe Ratio was also negative, so it complements the information given by the Treynor Ratio. The main difference is that the Treynor Ratio compares excess return with beta rather than with total volatility (Fig. 4).
Jensen’s Measure
Jensen's Measure, also known as alpha, was created in 1968 with the purpose of measuring the return of a portfolio in comparison with another portfolio with the same risk, the same reference market and the same parameters (Chen 2019).
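The calculation behind alpha can be sketched with assumed figures: the realized portfolio return minus the CAPM-expected return for the same beta.

```python
# Jensen's alpha with assumed annual figures (illustrative only)
portfolio_return = 0.055   # assumed realized portfolio return
risk_free = 0.020          # assumed risk-free rate
beta = 0.97                # assumed portfolio beta
market_return = 0.060      # assumed market return

# Alpha: realized return minus the CAPM-expected return
capm_expected = risk_free + beta * (market_return - risk_free)
alpha = portfolio_return - capm_expected
```

A negative alpha, as here, means the portfolio earned less than the CAPM predicts for its level of systematic risk.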
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1 )
• Add the benchmark by using the S&P 500 and calculating the
returns
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
covariance_market = covariance.iloc[3,4 ]
covariance_market
0.026395748767295644
0.96828357625053052
-0.00034608967689839253
The alpha is negative, and so are the Treynor and Sharpe Ratios. This is a perfect example of a portfolio performing negatively when compared to the risk-free rate. For decision making with alpha, the following table is important (Table 6):
Information Ratio
The information ratio analyzes the excess return of the portfolio over its benchmark, relative to the risk taken to obtain it. What the ratio measures is how the portfolio deviates from the benchmark (Murphy 2019). The name is based on the idea that the manager of the portfolio has special information and will therefore beat the benchmark. The formula is as follows:
Information Ratio = (Rp − Rm) / TE
TE = tracking error, the standard deviation of the difference between the portfolio and the benchmark
Rp = portfolio return
Rm = return of the market
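The formula maps directly to code; the portfolio and benchmark return series below are hypothetical:

```python
import pandas as pd

# Hypothetical daily returns for a portfolio and its benchmark
portfolio = pd.Series([0.0010, -0.0020, 0.0015, 0.0005, 0.0020])
benchmark = pd.Series([0.0008, -0.0015, 0.0010, 0.0007, 0.0018])

difference = portfolio - benchmark
tracking_error = difference.std()                      # TE
information_ratio = difference.mean() / tracking_error
```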
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1 )
• Add the benchmark by using the S&P 500 and calculating the
returns
start = datetime.datetime(2018, 1, 2 )
end = datetime.datetime(2019, 4, 1 )
covariance_market = covariance.iloc[3,4 ]
covariance_market
0.026395748767295644
0.96828357625053052
tracking_error = difference_benchmark_portfolio.std()
-0.07703212677308867
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
%matplotlib inline
start = datetime.datetime(2019, 1, 2 )
end = datetime.datetime(2020, 4, 1 )
The zip()5 function is used for the first time in this book. It allows iterating over several variables in parallel, in this case the allocation and the return.
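As a minimal illustration of why zip() is convenient here, the sketch below pairs hypothetical tickers with assumed allocation weights to split a USD 300,000 investment:

```python
# Sketch of building a per-security allocation with zip();
# the tickers and weights are assumptions for illustration
tickers = ["AMZN", "F", "C", "MCD"]
weights = [0.40, 0.10, 0.20, 0.30]
investment = 300000

allocation = {ticker: weight * investment
              for ticker, weight in zip(tickers, weights)}
```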
for securities in (Amazon, Ford, Citi, MacDonalds):
    securities['Investment'] = securities['Allocation'] * 300000
securities.tail()
High Low Open Close Volume Adj Close Return Allocation Investment
Date
2020-03-26 170.929993 161.000000 163.990005 167.350006 8,259,900.0 167.350006 0.950528 0.190106 57,031.696612
2020-03-27 169.740005 159.220001 162.779999 164.009995 6,441,400.0 164.009995 0.931557 0.186311 55,893.444319
2020-03-30 170.309998 163.570007 164.919998 168.130005 5,621,700.0 168.130005 0.954959 0.190992 57,297.514670
2020-03-31 169.509995 165.000000 166.839996 165.350006 4,519,900.0 165.350006 0.939169 0.187834 56,350.110779
2020-04-01 161.440002 156.350006 160.220001 158.169998 4,668,900.0 158.169998 0.898387 0.179677 53,903.214937
all_investments =
[Amazon['Investment'],Ford['Investment'],Citi['Investment'],MacDonalds['Investment']]
value_of_portfolio = pd.concat(all_investments,axis = 1)
value_of_portfolio.columns = [
'Amazon Investment','Ford Investment','Citi Investment','McDonalds Investment' ]
value_of_portfolio.head()
value_of_portfolio['Total Investment'].max()
396373.7772855786
value_of_portfolio['Total Investment'].min()
247946.82954672351
Fig. 5 Total position of the portfolio (Source Elaborated by the author with
information from Yahoo Finance)
0.0003811570091341308
Fig. 6 Position behavior on each security (Source Elaborated by the author with
information from Yahoo Finance)
References
Bryant, Bradley James. 2020. How to Calculate Portfolio Value. n.d.
Accessed January 3, 2020. https://www.sapling.com/5872650/
calculate-portfolio-value.
Chen, James. 2019. Jensen's Measure. 21 November. Accessed December 20,
2019. https://www.investopedia.com/terms/j/jensensmeasure.asp.
Chen, James. 2019. Financial Risk. 15 June. Accessed August 20, 2019.
https://www.investopedia.com/terms/f/financialrisk.asp.
David W. Mullins, Jr. 1982. Does the Capital Asset Pricing Model Work?
n.d. January. Accessed August 13, 2019. https://hbr.org/1982/01/
does-the-capital-asset-pricing-model-work.
Fontinelle, Amy. 2019. Systematic Risk. 30 September. Accessed October 1,
2019. https://www.investopedia.com/terms/s/systematicrisk.asp.
Hargrave, Marshall. 2019. Sharpe Ratio. 17 March. Accessed April 1, 2019.
https://www.investopedia.com/terms/s/sharperatio.asp.
Keaton, Will. 2020. Treynor Ratio. 22 March. Accessed April 1, 2020. https://
www.investopedia.com/terms/t/treynorratio.asp.
Mitra, Gautam, and Leela Mitra. 2011. The Handbook of News Analytics in
Finance. London: Wiley.
Murphy, Chris. 2019. Information Ratio – IR. 10 January. Accessed February
20, 2019. https://www.investopedia.com/terms/i/informationratio.asp.
Value at Risk
Historical VaR(95)
Since the VaR is based on a confidence level, it may have different results at a 65%, 90%, 95% or any other confidence level. The following example is Historical VaR(95), meaning that the confidence level is 95%.
• Install packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
stocks = pd.DataFrame()
for x in tickers:
    stocks[x] = web.DataReader(x, 'yahoo', start, end)['Close']
stocks.tail()
stocks_return = (stocks/stocks.shift(1)) - 1
stocks_return.tail()
portfolio_weights
portfolio_weights = portfolio_weights/np.sum(portfolio_weights)
portfolio_weights
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1)
stocks_return['Portfolio'] = weighted_returns_portfolio.sum(axis=1).dropna()
stocks_return['Portfolio'] = stocks_return['Portfolio'] * 100
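The historical VaR itself is simply a percentile of the return series. The sketch below uses simulated returns (seeded for reproducibility) instead of downloaded prices:

```python
import numpy as np

# Simulated daily portfolio returns in percent; mean and volatility are assumed
rng = np.random.default_rng(42)
portfolio_returns = rng.normal(0.05, 1.0, 1000)

var_95 = np.percentile(portfolio_returns, 5)   # Historical VaR(95): 5th percentile
var_99 = np.percentile(portfolio_returns, 1)   # Historical VaR(99): 1st percentile
```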
Historical VaR(99)
For computing the Historical VaR at a 99% confidence level, the only change needed is in the last part of the script: changing the np.percentile argument to 1, which means the 1st percentile.
-2.5793928700853099
At a 99% confidence level the worst loss for the portfolio is 2.58%. The VaR is higher because the confidence level is higher. This is rational and helps in understanding how the VaR works: a higher confidence level gives a higher percentage of loss, and a lower confidence level gives a lower percentage of loss.
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
from scipy.stats import norm
%matplotlib inline
stocks.tail()
stocks_return = (stocks/stocks.shift(1)) - 1
stocks_return.tail()
portfolio_weights = np.array(np.random.random(5))
portfolio_weights
portfolio_weights = portfolio_weights/np.sum(portfolio_weights)
portfolio_weights
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1)
confidence = 0.99
alpha = norm.ppf(1 - confidence)
For this example, norm.ppf is used: it is the percent point function, the inverse of the cumulative distribution function, evaluated at one (1) minus the confidence level. This is useful because it determines the quantile used for the VaR. It plays a role similar to np.percentile in the historical approach.
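The same quantile can be obtained without SciPy through the standard library's NormalDist, whose inv_cdf is the inverse cumulative distribution function. The portfolio statistics below are assumed for illustration, and the 10-day figure applies the square-root-of-time scaling used later in this section:

```python
import math
from statistics import NormalDist

# Parametric (variance-covariance) VaR with assumed portfolio statistics
mu = 0.0005      # assumed mean daily return
sigma = 0.012    # assumed daily standard deviation
position = 1e6   # position size in USD

confidence = 0.99
# Same quantile as scipy.stats.norm.ppf(1 - confidence)
alpha = NormalDist().inv_cdf(1 - confidence)

var_1day = position * (mu - sigma * alpha)   # one-day parametric VaR
var_10day = var_1day * math.sqrt(10)         # square-root-of-time scaling
```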
• Create a position
position = 1e6
var = position*(mu - sigma*alpha)
var
27088.745452792264
days = 10
88644.949585607217
The worst loss for the next 10 days, based on the chosen stocks and their weights, could be USD 88,644.95, or approximately 8.86% of the USD 1,000,000 position. Consider that this is at a 99% confidence level. If the example had been done at a 95% confidence level, the result would have been as follows:
confidence = 0.95
alpha = norm.ppf(1 - confidence)
63954.684614643818
The result is a worst loss much smaller than the one determined at a 99% confidence level. In this example the loss is approximately 6.40% of the total investment. The result also varies if the number of days is reduced, for example to 5 days.
days_2 = 5
61773.537991536054
Historical Drawdown
A historical drawdown is often included with the VaR because it analyzes
the decline in the specific period that the portfolio is being analyzed and
based on the cumulative growth analyzes the peak and therefore the fall
or drawdown of the portfolio (Mitchell 2019). The process is similar to
the VaR but it uses the negative returns of the portfolio.
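The drawdown mechanics can be sketched on a hypothetical wealth index (the cumulative growth of one unit invested):

```python
import numpy as np
import pandas as pd

# Hypothetical wealth index: cumulative growth of 1 unit invested
wealth = pd.Series([1.00, 1.05, 1.12, 1.08, 1.03, 1.10])

running_maximum = np.maximum.accumulate(wealth)   # the peak reached so far
drawdown = wealth / running_maximum - 1           # decline from that peak
max_drawdown = drawdown.min()                     # the deepest fall
```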
• Installing packages
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
stocks.tail()
stocks_return = (stocks/stocks.shift(1)) - 1
stocks_return.tail()
portfolio_weights= np.array(np.random.random(5))
portfolio_weights
portfolio_weights = portfolio_weights/np.sum(portfolio_weights)
portfolio_weights
weighted_returns_portfolio = stocks_return.mul(portfolio_weights,
axis = 1)
CumulativeReturns.plot()
_ = plt.xlabel('Dates')
_ = plt.ylabel('Returns')
_ = plt.title('Cumulative Returns Portfolio')
plt.show()
running_maximum = np.maximum.accumulate(CumulativeReturns)
running_maximum.tail()
portfolio_drawdown = CumulativeReturns/running_maximum - 1
portfolio_drawdown.plot()
_ = plt.xlabel('Dates')
_ = plt.ylabel('Returns')
plt.show()
Fig. 2 Drawdown of the portfolio (Source Elaborated by the author with infor-
mation from Yahoo Finance)
– Import libraries
import ffn
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
– Select stocks
start = datetime.datetime()
end = datetime.datetime()
stocks_return = stocks.to_returns().dropna()
stocks_return.tail()
ZM
AMZN
DOCU
PTON
dtype: object
– Calculate performance
performance = stocks_return.calc_stats()
– Display performance
performance.display()
daily_mean −111.263
daily_vol 249.075
daily_skew 8.39942
daily_kurt 108.93
best_day 209.395
worst_day −58.9652
monthly_sharpe −1.30932
monthly_sortino −1.31108
monthly_mean −149.798
monthly_vol 114.409
monthly_skew −3.23038
monthly_kurt 10.7853
best_month 1.1376
worst_month −124.275
yearly_sharpe NaN
yearly_sortino NaN
yearly_mean −0.53764
yearly_vol NaN
yearly_skew NaN
yearly_kurt NaN
best_year −0.53764
worst_year −0.53764
avg_drawdown −2.67834
avg_drawdown_days 49.1111
avg_up_month 0.525848
avg_down_month −15.7354
win_year_perc 0
twelve_month_win_perc 0.4
– Importing libraries
import ffn
import numpy as np
import pandas as pd
from pandas_datareader import data as web
import pandas_datareader
import datetime
import matplotlib.pyplot as plt
%matplotlib inline
_ = plt.xlabel('Date')
_ = plt.ylabel('SMA and Closing Price')
_ = plt.title('MSAUX SMA and Closing Price')
plt.legend()
_ = plt.xlabel('Date')
_ = plt.ylabel('EMA and Closing Price')
_ = plt.title('MSAUX EMA and Closing Price')
plt.legend()
_ = plt.xlabel('Date')
plt.legend();
– Creating RSI for the Morgan Stanley Institutional Fund, Inc. Asia
Opportunity Portfolio Class A (MSAUX) using talib with 14 days
(Fig. 6)
_ = plt.xlabel('Date')
_ = plt.ylabel('MSAUX RSI')
_ = plt.title('MSAUX RSI')
plt.legend()
returns = funds.to_log_returns().dropna()
returns.head()
_ = plt.xlabel('Date')
_ = plt.ylabel('Closing Price')
_ = plt.title('Rebase of the closing price')
plt.legend()
performance = funds.calc_stats()
_ = plt.xlabel('Date')
_ = plt.title('Funds monthly Progression')
plt.legend()
performance.display()
funds.to_drawdown_series().tail()
Works Cited
Damodaran, Aswath. 2018. Value at Risk (VAR). New York University. n.d.
Accessed February 20, 2019. http://people.stern.nyu.edu/adamodar/
pdfiles/papers/VAR.pdf.
Mitchell, Cory. 2019. Drawdown definition and example. 25 June. Accessed July
30, 2019. https://www.investopedia.com/terms/d/drawdown.asp.
Works Cited
365 Data Science. 2020. Why Python for data science and why Jupyter to code in
Python Articles 11 min read. n.d. Accessed March 2, 2020. https://365datasci-
ence.com/why-python-for-data-science-and-why-jupyter-to-code-in-python/.
Anaconda. 2020. Anaconda distribution. Accessed March 2, 2020. https://
www.anaconda.com/distribution/.
Bang, Julie. 2019. Candlestick bar. Investopedia.
Basurto, Stefano. 2020. Python trading toolbox: Introducing OHLC charts with
Matplotlib. 07 January. Accessed March 31, 2020. https://towardsdatasci-
ence.com/trading-toolbox-03-ohlc-charts-95b48bb9d748.
Bloomberg Corporation. 2019. Bloomberg puts the power of Python XE “Python”
in hedgers’ hands. 8 March. Accessed March 2, 2020. https://www.bloomb-
erg.com/professional/blog/bloomberg-puts-power-python-hedgers-hands/.
Bollinger, John. 2018. John Bollinger answers “What are Bollinger Bands?”. n.d.
Accessed March 2, 2020. https://www.bollingerbands.com/bollinger-bands.
Bolsa de Madrid. 2020. Electronic Spanish Stock Market Interconnection System
(SIBE). n.d. Accessed March 23, 2020. http://www.bolsamadrid.es/ing/
Inversores/Agenda/HorarioMercado.aspx.
Brooks, Chris. 2008. Introductory econometrics for finance. Boston: Cambridge
University Press.
Bryant, Bradley James. 2020. How to calculate portfolio value. n.d. Accessed January
3, 2020. https://www.sapling.com/5872650/calculate-portfolio-value.
Burgess, Matthew, and Sarah Wells. 2020. Giant wealth fund seeks managers who
can beat frothy market. 9 February. Accessed March 2, 2020. https://finance.
yahoo.com/news/giant-wealth-fund-seeks-managers-230000386.html.
Chen, James. 2019a. Financial risk. 15 June. Accessed August 20, 2019.
https://www.investopedia.com/terms/f/financialrisk.asp.
Jain, Diva. 2018. Skew and Kurtosis: 2 Important statistics terms you need to know
in Data Science. 23 August. Accessed August 12, 2019. https://codeburst.
io/2-important-statistics-terms-you-need-to-know-in-data-science-skewness-
and-kurtosis-388fef94eeaa.
Kalla, Siddharth. 2020. Range (Statistics). n.d. Accessed January 4, 2020.
https://explorable.com/range-in-statistics.
Kan, Chi Nok. 2018. Data Science 101: Is Python better than R? 1 August.
Accessed March 2, 2020. https://towardsdatascience.com/data-science-101-
is-python-better-than-r-b8f258f57b0f.
Keaton, Will. 2019. Quantitative Analysis (QA). 18 April. Accessed January 15,
2020. https://www.investopedia.com/terms/q/quantitativeanalysis.asp.
Keaton, Will. 2020. Treynor ratio. 22 March. Accessed April 1, 2020. https://
www.investopedia.com/terms/t/treynorratio.asp.
Kenton, Will. 2019. Kurtosis. 17 February. Accessed July 30, 2019. https://
www.investopedia.com/terms/k/kurtosis.asp.
Kevin, S. 2015. Security analysis and portfolio management. Delhi: PHI.
London Stock Exchange Group. 2020. London Stock Exchange Group Business
Day. n.d. Accessed March 25, 2020. https://www.lseg.com/areas-expertise/
our-markets/london-stock-exchange/equities-markets/trading-services/
business-days.
Mastromatteo, Davide. 2020. Python args and kwargs: Demystified. 09 September.
Accessed March 31, 2020. https://realpython.com/python-kwargs-and-args/.
Milton, Adam. 2020. Simple, exponential, and weighted moving averages. 09
November. Accessed March 31, 2020. https://www.thebalance.com/
simple-exponential-and-weighted-moving-averages-1031196.
Mitchell, Cory. 2019a. Don’t trade based on MACD divergence until you read
this. 19 November. Accessed April 1, 2020. https://www.thebalance.com/
dont-trade-based-on-macd-divergence-until-you-read-this-1031217.
Mitchell, Cory. 2019b. Drawdown definition and example. 25 June. Accessed
July 30, 2019. https://www.investopedia.com/terms/d/drawdown.asp.
Mitchell, Cory. 2019. Understanding basic candlestick charts. 19 December.
Accessed March 30, 2020. https://www.investopedia.com/trading/
candlestick-charting-what-is-it/.
Mitchell, Cory. 2020. How to use volume to improve your trading. 25 February.
Accessed March 27, 2020. https://www.investopedia.com/articles/techni-
cal/02/010702.asp.
Mitra, Gautam, and Leela Mitra. 2011. The handbook of news analytics in finance.
United Kingdom: Wiley.
Murphy, Casey. 2020. Investopedia. 16 November. Accessed January 01, 2021.
https://www.investopedia.com/trading/introduction-to-parabolic-sar/.