AI-Driven Finance Innovations

The document discusses the integration of artificial intelligence in finance, emphasizing the shift from traditional models to data-driven approaches. It highlights the importance of machine learning and deep learning in predicting market movements and the limitations of existing financial theories. The author concludes that AI has the potential to transform finance by leveraging big data and advanced algorithms, although practical execution and market microstructure considerations remain critical challenges.


Artificial Intelligence in Finance

Dr. Yves J. Hilpisch

ODSC East, Boston, 30. April 2019


The Group
[Link]
150+ hours
16 week program of instruction

1,200 pages PDF


5,000+ lines of code
[Link]
[Link]
[Link]
About Myself
[Link]
NEW book project: “Artificial Intelligence in Finance: A Python-based Guide”

[Link]
Resources (Gist):

[Link]
Overview
1. The Beauty Myth
2. Data-Driven Finance
3. Statistical Learning
4. OLS Regression
5. Efficient Markets
6. AI-First Finance
7. Algorithms
8. Deep Learning
9. Market Prediction
[Link]
The Beauty Myth
GUTs are among several long-established theories that
remain stubbornly unsupported by the big, costly
experiments testing them. …
Despite the dearth of data, the answers that all these theories
offer to some of the most vexing questions in physics are so
elegant that they populate postgraduate textbooks. As Peter
Woit of Columbia University observes, “Over time, these ideas
became institutionalised. People stopped thinking of them as
speculative.” That is understandable, for they appear to have
great explanatory power.
Cornerstones of Economics
A. Arbitrage Pricing
B. Expected Utility
C. Equilibrium
D. Normal Distributions
E. Linear Relationships
F. Efficient Markets
Theory Reality
μi = r + βi(μM − r)
“Market Risk”
“Idiosyncratic Risk”
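A minimal sketch of the CAPM relation μi = r + βi(μM − r) shown above. The function name and all numerical inputs (2% risk-free rate, 8% expected market return, the three beta values) are hypothetical and chosen for illustration only.

```python
def capm_expected_return(risk_free: float, beta: float, market_return: float) -> float:
    """Expected return according to mu_i = r + beta_i * (mu_M - r)."""
    return risk_free + beta * (market_return - risk_free)


# Hypothetical inputs: r = 2%, mu_M = 8%.
for beta in (0.5, 1.0, 1.5):
    mu = capm_expected_return(0.02, beta, 0.08)
    print(f"beta={beta:.1f} -> expected return={mu:.3f}")
```

In this model only the exposure to market risk (beta) is rewarded; idiosyncratic risk does not appear in the formula at all.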
Data-Driven Finance
                  structured data                        unstructured data   alternative data
historical data   price data (eod, minute, tick, …),     texts, news         web texts, social media,
                  fundamental data                                           IoT, satellite data
streaming data    tick data, volume data                 news                web texts, social media,
                                                                             IoT, satellite data
Eugene Wigner’s article “The Unreasonable Effectiveness
of Mathematics in the Natural Sciences” examines why so
much of physics can be neatly explained with simple
mathematical formulas such as f = ma or e = mc².
Meanwhile, sciences that involve human beings rather
than elementary particles have proven more resistant to
elegant mathematics. Economists suffer from physics
envy over their inability to neatly [and successfully] model
human behavior. An informal, incomplete grammar of the
English language runs over 1,700 pages. Perhaps when it
comes to natural language processing and related fields,
we’re doomed to complex theories that will never have
the elegance of physics equations. But if that’s so, we
should stop acting as if our goal is to author extremely
elegant theories, and instead embrace complexity and
make use of the best ally we have: the unreasonable
effectiveness of data.
Statistical Learning
Mathematics. $f(x) = 2 + \frac{1}{2}x$
$y_i = f(x_i), \quad i = 1, 2, \ldots, n$

Statistics. $(y_i, x_i)_{i=1}^{n}$
$\hat{f}(x) = \alpha + \beta x \approx y$
$\alpha, \beta = ?, ?$
Mathematics. Function, Input → Output

Statistics. Input, Output → Function
OLS Regression
Why OLS Regression?

1. centuries old: the least-squares approach has been in use for more than 200 years
2. simple math: easy to understand and transfer to different data sets
3. lightning fast: fast to evaluate even on large data sets
4. scalable: basically no limit regarding data size
5. implementation: efficient implementations (e.g. in Python) are readily available
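Given input data. $(y_i, x_i)_{i=1}^{n}$

Simple linear regression. $\hat{y}_i = \alpha + \beta x_i \approx y_i$

$y_i = \alpha + \beta x_i + \epsilon_i$

Minimization problem. $\min_{\alpha, \beta} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Optimal solution. $\beta = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}$

$\alpha = \bar{y} - \beta \bar{x}$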
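The closed-form solution can be checked numerically. The following is a minimal sketch with NumPy; the helper name `ols_fit` and the example data (points on the line y = 2 + x/2) are assumptions made for illustration, and `np.polyfit` serves only as an independent cross-check.

```python
import numpy as np


def ols_fit(x, y):
    """Closed-form simple OLS: beta = Cov(x, y) / Var(x), alpha = mean(y) - beta * mean(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.cov(x, y, ddof=0)[0, 1] / np.var(x)  # both use the population (ddof=0) convention
    alpha = y.mean() - beta * x.mean()
    return alpha, beta


# Assumed example data: noise-free points on y = 2 + 0.5 * x.
x = np.linspace(-5, 5, 50)
y = 2 + 0.5 * x

alpha, beta = ols_fit(x, y)
beta_np, alpha_np = np.polyfit(x, y, deg=1)  # returns the highest-degree coefficient first

print(f"closed form: alpha={alpha:.4f}, beta={beta:.4f}")
print(f"np.polyfit : alpha={alpha_np:.4f}, beta={beta_np:.4f}")
```

Because the data lie exactly on a line, both approaches should recover the intercept and the slope up to floating point error.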
Major assumptions of the linear regression model:

1. linearity: the model is linear in its parameters (coefficients and
error term)
2. independence: independent variables should not be perfectly
correlated with each other (no multicollinearity)
3. zero mean: the mean of the residuals should be zero
4. no correlation: residuals should not be correlated with the
independent variables
5. homoscedasticity: the standard deviation of the residuals should
be constant
6. no autocorrelation: the residuals should not be correlated with
each other
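Some of these assumptions can be inspected on the residuals of a fitted model. The sketch below uses simulated data (a hypothetical linear relationship plus Gaussian noise, which is an assumption and not part of the talk) and simple heuristics rather than formal statistical tests. Note that for OLS with an intercept, a zero residual mean and zero correlation between residuals and the regressor hold in-sample by construction.

```python
import numpy as np

rng = np.random.default_rng(seed=100)

# Simulated data (assumption): linear relationship plus i.i.d. Gaussian noise.
x = np.linspace(0, 10, 500)
y = 2 + 0.5 * x + rng.normal(0, 1, size=x.size)

beta, alpha = np.polyfit(x, y, deg=1)
resid = y - (alpha + beta * x)

# 3. zero mean of the residuals
mean_resid = resid.mean()
# 4. no correlation between residuals and the independent variable
corr_with_x = np.corrcoef(resid, x)[0, 1]
# 5. homoscedasticity (heuristic): compare the residual std in both halves of the sample
half = resid.size // 2
std_ratio = resid[:half].std() / resid[half:].std()
# 6. no autocorrelation (heuristic): lag-1 correlation of the residuals
lag1_autocorr = np.corrcoef(resid[:-1], resid[1:])[0, 1]

print(f"mean of residuals       : {mean_resid:.2e}")
print(f"corr(residuals, x)      : {corr_with_x:.2e}")
print(f"std ratio (first/second): {std_ratio:.3f}")
print(f"lag-1 autocorrelation   : {lag1_autocorr:.3f}")
```

With real financial time series, the last two checks are the ones most likely to fail, for example because of volatility clustering.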
Efficient Markets
Eugene F. Fama (1965):

“For many years, economists, statisticians, and teachers
of finance have been interested in developing and
testing models of stock price behavior. One important
model that has evolved from this research is the theory
of random walks. This theory casts serious doubt on
many other methods for describing and predicting stock
price behavior—methods that have considerable
popularity outside the academic world. For example, we
shall see later that, if the random-walk theory is an
accurate description of reality, then the various
“technical” or “chartist” procedures for predicting stock
prices are completely without value.”—Eugene F. Fama
(1965): “Random Walks in Stock Market Prices”
Michael Jensen (1978): “Some Anomalous Evidence Regarding
Market Efficiency”:

“A market is efficient with respect to an information set S if it is
impossible to make economic profits by trading on the basis of
information set S.”

If a stock price follows a (simple) random walk (no drift & normally
distributed returns), then it rises and falls with the same probability of
50% (“toss of a coin”).

In such a case, the best predictor of tomorrow’s stock price, in a
least-squares sense, is today’s stock price.
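Both statements can be illustrated with a simulation. The sketch below assumes a driftless random walk in log prices with normally distributed returns; the starting price, the volatility and the seed are hypothetical. It regresses tomorrow's price on today's price by OLS.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Assumed setup: 10,000 zero-mean, normally distributed log returns.
n = 10_000
log_returns = rng.normal(loc=0.0, scale=0.01, size=n)
prices = 100 * np.exp(np.cumsum(log_returns))

# Share of up-moves: should be close to 50% ("toss of a coin").
up_share = (log_returns > 0).mean()

# OLS regression of tomorrow's price on today's price.
today, tomorrow = prices[:-1], prices[1:]
beta, alpha = np.polyfit(today, tomorrow, deg=1)

print(f"share of up-moves: {up_share:.3f}")
print(f"tomorrow ≈ {alpha:.3f} + {beta:.4f} * today")
```

A slope close to one and an intercept that is small relative to the price level mean that the fitted least-squares predictor is, approximately, today's price itself.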
AI-First Finance
“The grand aim of science is to cover the greatest number of
experimental facts by logical deduction from the smallest
number of hypotheses or axioms.”
— Albert Einstein

“Machine learning is the scientific method on steroids. It follows
the same process of generating, testing, and discarding or
refining hypotheses. But while a scientist may spend his or her
whole life coming up with and testing a few hundred
hypotheses, a machine-learning system can do the same in a
second. Machine learning automates discovery. It’s no surprise,
then that it’s revolutionizing science as much as it’s
revolutionizing business.”
Programming. Rules | Code, Data → Output

Machine Learning. Input, Output → Rules | Code
Financial Markets
y
“non-linear, complex, changing”

Finance History
“normative economics = assumptions, axioms, etc.”
(too) “simple and elegant theories”
x → f(•) → f(x) ≠ y
“hardly any supporting empirical evidence”
“brain-driven & beauty myth”

AI in Finance = finaince
“positive economics = data, relationships, etc.”
“general, parametrizable, trainable algorithms”
x → m(•, a, b) → m(x, a*, b*) ≈ y
“might show good performance, but black box”
“data-driven & AI-first”
“The essential tool of econometrics is multivariate linear
regression, an 18th-century technology that was already mastered
by Gauss before 1794 … It is hard to believe that something as
complex as 21st-century finance could be grasped by something as
simple as inverting a covariance matrix.”

“… what if economists finally started to consider non-linear
functions?”

“An ML algorithm can spot patterns in a 100-dimensional world as
easily as in our familiar 3-dimensional one.”

“Econometrics might be good enough to succeed in financial
academia (for now), but succeeding in practice requires ML.”

Marcos López de Prado (2018)


Algorithms
Artificial Intelligence

Machine Learning (LogReg, Gaussian NB, Decision Trees, SVM)
Deep Learning (DNN, CNN, RNN)
Reinforcement Learning (Simple, Q-Learning, DRL)

Supervised Learning
Unsupervised Learning (Clustering, Dim Reduction)
Online Learning

Classification
Estimation
Policies

Supervised Learning. Input (Features), Output (Labels) → Rules | Code

Unsupervised Learning. Input (Features) → Rules | Code

Online Learning. Input (Features), Output (Labels), arriving repeatedly over time → Rules | Code
Some specifications and explanations:

1. supervised learning: input data (features) and output data (labels) are
given; the algorithm learns from observed patterns
2. unsupervised learning: only input data (features) are given; the
algorithm identifies patterns, clusters, etc.
3. online learning: both input data (features) and output data (labels)
arrive incrementally (over time); the algorithm updates its parameters
(policies) incrementally
4. classification: the problem of learning about and predicting labels as
two or more discrete categories (e.g. {0, 1} or {A, B, C})
5. estimation: the problem of learning about and predicting labels as
continuous values (real numbers, floating point numbers, e.g. 1.435)
6. policies: the problem of learning about and applying action policies (e.g.
if (x=1, y=0.5, z=‘low’) then take action B2)
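Online learning (item 3) can be sketched in a few lines of plain Python. The example assumes a stream of feature/label pairs generated from a hypothetical linear relationship and uses a stochastic gradient (least-mean-squares) update; the data-generating rule, the learning rate and the variable names are illustrative assumptions. It is also an estimation problem in the sense of item 5, since the labels are real numbers.

```python
import random

rng = random.Random(0)

# Model parameters, updated incrementally as data arrives.
alpha, beta = 0.0, 0.0
learning_rate = 0.05

for _ in range(2000):
    # One new observation arrives (assumed relationship: y = 2 + 0.5 * x).
    x = rng.uniform(-2.0, 2.0)
    y = 2.0 + 0.5 * x
    # Prediction error of the current model for this single observation.
    error = y - (alpha + beta * x)
    # Gradient step on the squared error, using only this observation.
    alpha += learning_rate * error
    beta += learning_rate * error * x

print(f"alpha={alpha:.4f}, beta={beta:.4f}")
```

The algorithm never sees the full data set at once; each observation is used for one update and can then be discarded.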
Practical Introduction to ML with
Python:
• IPython: Beyond Normal Python

• Introduction to NumPy

• Data Manipulation with Pandas

• Visualization with Matplotlib

• Machine Learning (ca. 180 pages)


Deep Learning
Deep Learning
—Some Background
Success Stories about Deep Learning
and Deep Reinforcement Learning:
• Self-Driving Cars

• Recommendation Engines

• Playing Atari Games

• Image Recognition & Classification

• Speech Recognition

• Playing the Game of Go


Mathematics of Deep Learning:
• Applied Mathematics

• Machine Learning Basics

• Deep Feedforward Networks

• Regularization for Deep Learning

• Optimization for Training Deep Models

• Convolutional Networks

• Recurrent & Recursive Nets

• Monte Carlo Methods

• …
Practice of Deep Learning
(with Python and Keras):
• What is Deep Learning?

• Mathematical Building Blocks

• Getting Started with Neural Networks

• Fundamentals of Machine Learning

• Deep Learning for Computer Vision

• Deep Learning for Text and Sequences

• Advanced Deep Learning Best Practices

• Generative Deep Learning
Deep Learning
—Building Blocks
Neural Network, 0 Hidden Layers
Input Layer (x1, x2, x3) → weights → Output Layer (y1, y2)

Neural Network, 1 Hidden Layer
Input Layer (x1, x2, x3) → weights → Hidden Layer → weights → Output Layer (y1, y2)

Neural Network, 2 Hidden Layers
Input Layer (x1, x2, x3) → weights → Hidden Layers (with weights between them) → weights → Output Layer (y1, y2)
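The three architectures (three inputs, two outputs, and zero, one or two hidden layers) can be sketched as a forward pass with NumPy. The hidden layer width of 4, the ReLU activation, the random weight initialization and the helper names are assumptions for illustration; the diagrams do not specify them.

```python
import numpy as np


def init_network(layer_sizes, rng):
    """Create one weight matrix and one bias vector per pair of adjacent layers."""
    weights = [rng.normal(0.0, 0.5, size=(m, n))
               for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
    biases = [np.zeros(n) for n in layer_sizes[1:]]
    return weights, biases


def forward(x, weights, biases):
    """Propagate x through the network: ReLU on hidden layers, linear output layer."""
    a = x
    last = len(weights) - 1
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = z if i == last else np.maximum(z, 0.0)
    return a


rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])  # inputs x1, x2, x3

outputs = {}
for sizes in ([3, 2], [3, 4, 2], [3, 4, 4, 2]):  # 0, 1 and 2 hidden layers
    weights, biases = init_network(sizes, rng)
    outputs[len(sizes) - 2] = forward(x, weights, biases)
    print(f"{len(sizes) - 2} hidden layer(s): {len(weights)} weight matrices, "
          f"output shape {outputs[len(sizes) - 2].shape}")
```

Without a hidden layer the network reduces to a linear map from inputs to outputs; each additional hidden layer adds one weight matrix and one non-linearity.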


Deep Learning
—Universal Approximation Theorem
“In the mathematical theory of artificial neural
networks, the universal approximation
theorem states that a feed-forward network with a
single hidden layer containing a finite number
of neurons can approximate continuous
functions on compact subsets of ℝⁿ, under mild
assumptions on the activation function. The
theorem thus states that simple neural networks
can represent a wide variety of interesting functions
when given appropriate parameters; however, it
does not touch upon the algorithmic learnability of
those parameters.”
—[Link]
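The statement can be made tangible with a small numerical experiment: a network with a single hidden layer approximates sin(x) on a compact interval. To keep the sketch short, the hidden weights are drawn randomly and only the output weights are fitted by least squares. This random-features shortcut is an assumption made here for illustration, not the training procedure used in the talk. The target function, the number of hidden units and the tanh activation are likewise illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Target: a continuous function on the compact interval [-pi, pi].
x = np.linspace(-np.pi, np.pi, 400).reshape(-1, 1)
y = np.sin(x).ravel()

# Single hidden layer with randomly drawn (untrained) weights and biases.
n_hidden = 100
W = rng.normal(0.0, 1.0, size=(1, n_hidden))
b = rng.normal(0.0, 1.0, size=n_hidden)
H = np.tanh(x @ W + b)                       # hidden activations, shape (400, 100)
H1 = np.column_stack([H, np.ones(len(x))])   # add a constant column for the output bias

# Fit only the output layer by linear least squares.
w_out, *_ = np.linalg.lstsq(H1, y, rcond=None)
approx = H1 @ w_out

mse = np.mean((approx - y) ** 2)
print(f"mean squared error of the approximation: {mse:.2e}")
```

The experiment only shows that suitable parameters exist for this target; as the quote stresses, the theorem says nothing about how easily such parameters can be learned in general.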
First Illustration with Keras & TensorFlow
Market Prediction
Market Prediction
—Scikit-Learn
With Scikit-Learn there are Deep Neural Network (Multi Layer
Perceptron, MLP) models available both for estimation …

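from sklearn.neural_network import MLPRegressor

# x: feature array of shape (n_samples, n_features); y: array of target values
model = MLPRegressor(hidden_layer_sizes=1 * [1024,],
                     activation='relu', solver='adam',
                     learning_rate_init=0.001,
                     nesterovs_momentum=False,  # only used by solver='sgd'
                     shuffle=False, max_iter=10000,
                     validation_fraction=0.1)  # only used if early_stopping=True
model.fit(x, y)
pred = model.predict(x)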
… and classification.

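from sklearn.neural_network import MLPClassifier

# x: feature array of shape (n_samples, n_features); y: array of class labels
# scikit-learn names the sigmoid activation 'logistic'
model = MLPClassifier(hidden_layer_sizes=1 * [1024,],
                      activation='logistic', solver='adam',
                      learning_rate_init=0.001,
                      nesterovs_momentum=False,  # only used by solver='sgd'
                      shuffle=False, max_iter=10000,
                      validation_fraction=0.1)  # only used if early_stopping=True
model.fit(x, y)
pred = model.predict(x)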
Market Prediction
—Keras
Keras, with e.g. TensorFlow as its backend, allows the sequential
building of Deep Neural Networks.

from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import Adam

# x: feature array of shape (n_samples, 1); y: array of target values
model = Sequential()
model.add(Dense(128, input_dim=1, activation='relu'))
model.add(Dense(48, activation='relu'))
model.add(Dense(1, activation='linear'))  # estimation
# model.add(Dense(1, activation='sigmoid'))  # classification
# argument names follow the Keras API of the time;
# newer Keras releases use learning_rate instead of lr
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999,
            epsilon=None, decay=0.0, amsgrad=False)
model.compile(loss='mse', optimizer=adam,
              metrics=['mse', 'accuracy'])

model.fit(x, y, epochs=2000, verbose=False)

pred = model.predict(x)
Conclusions
1. Finance has long been driven by the “beauty myth” — elegant
but too simplistic models, equations and approaches.
2. The availability of big financial data (historical—streaming,
structured—unstructured) gave rise to data-driven finance.
3. It might be assumed that the “unreasonable effectiveness of big
data” holds true in the financial domain as well.
4. Due to the availability of big data (e.g. billions of hours of virtual
car driving), Artificial Intelligence (AI) is changing almost every
area of our lives.
5. It is to be assumed that in the same way the combination of data-
driven and AI-first finance will change the field for good.
1. Deep Learning approaches “make us hopeful” that we can
overcome the main corollary of the Efficient Markets Hypothesis,
i.e. that the analysis of historical data is useless (for the creation of
alpha).
2. Furthermore, there are alternative algorithms available that
might also be useful (better) in predicting market movements:
A. recurrent neural networks
B. convolutional neural networks
C. deep reinforcement learning
1. However, so far we have only considered the prediction part of
algorithmic trading (i.e. the signal generation).
2. Two important topics have been left out:
A. market microstructure elements (e.g. transaction costs)
have not been considered in any meaningful way.
B. In addition, execution rules play an important role (sizing,
resizing, stop loss, profit capture, etc.) for the trading
performance.
After all, working with AI algorithms — based on Python — and
applying them to financial problems is fun, intellectually stimulating
and might finally lead to the “holy grail” of finance:

Being able to consistently outperform others and the markets.

This naturally raises questions regarding the future of the finance
domain, the education of people working in it, the ways companies
compete in the field and also regarding ethics and governance.
The Python Quants GmbH
Dr. Yves J. Hilpisch
+49 3212 112 9194
[Link] | ai@[Link] | @dyjh
