2021 Pho1 15 Neural Networks Part 1
Cyrill Stachniss
https://www.ipb.uni-bonn.de/5min/
Image Classification
Example: an image is mapped to a single label such as “cat” or “5”.
Semantic Segmentation
Example: the network outputs “a label for each pixel”.
Neural Networks
§ Machine learning technique
§ Often used for classification, semantic
segmentation, and related tasks
§ First ideas discussed in the 1950s/60s
§ Theoretical work on NNs in the 1990s
§ Increase in attention from 2000 on
§ Deep learning took off around 2010
§ CNNs for image tasks from 2012 on
Part 1
Neural Networks Basics
Neural Network
Biological Neurons
Biological neurons are the fundamental
units of the brain that
§ Receive sensory input from the
external world or from other neurons
§ Transform and relay signals
§ Send signals to other neurons and
also motor commands to the muscles
Artificial Neurons
Artificial neurons are the fundamental
units of artificial neural networks that
§ Receive inputs
§ Transform information
§ Create an output
Neurons
§ Receive inputs / activations from
sensors or other neurons
§ Combine / transform information
§ Create an output / activation
Neurons as Functions
We can see a neuron as a function
§ Input given by the activations $a_1, \dots, a_n$
§ Transformation of the input data can be described by a function $f$
§ Output $a = f(a_1, \dots, a_n)$
Neural Network
§ NN is a network/graph of neurons
§ Nodes are neurons
§ Edges represent input-output
connections of the data flow
Neural Networks are Functions
§ Neural networks are functions
§ Consist of connected artificial neurons
§ Input layer takes (sensor) data
§ Output layer provides the function
result (information or command)
§ Hidden layers do some computations
Multi-layer Perceptron
Seen as a Function
Image Classification Example
“cat”
What is the Network’s Input?
An image consists of individual pixels.
Each pixel stores an intensity value.
We have N+1 such intensity values.
Stacked into a vector, these pixel intensities are the input layer of our network!
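A minimal sketch of this idea in Python/NumPy (the 28x28 image size is borrowed from the digit example later in this lecture; the pixel values are made up):

```python
import numpy as np

# Hypothetical grayscale image: a 2D array of pixel intensities
image = np.random.randint(0, 256, size=(28, 28))

# Stack all pixel intensities into a single vector and scale
# them to [0, 1] so they can serve as input activations
x = image.reshape(-1).astype(np.float32) / 255.0

print(x.shape)  # (784,) -- one input activation per pixel
```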
What is the Network’s Output?
Is it a cat or a dog or a human or a ...?
The output is an indicator vector with one entry per class.
We are never certain, so the entries are scores rather than hard 0/1 values.
Output of the Network
The predicted label (“cat”) is the class with the largest output value.
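A minimal sketch of this last step (the class names and scores are made up for illustration):

```python
import numpy as np

# Hypothetical output vector with one score per class
classes = ["cat", "dog", "human"]
scores = np.array([0.82, 0.11, 0.07])

# The predicted label is the class with the largest output value
label = classes[int(np.argmax(scores))]
print(label)  # cat
```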
Multi-layer Perceptron
Let’s Look at a Single Neuron
Perceptron (Single Neuron)
[Diagram: the (input) activations are multiplied by the weights, summed together with a bias, and passed through an activation function; the result is the output activation for the next layer]
Function Behind a Neuron
A neuron gets activated ($a > 0$) through
§ A weighted sum of input activations
§ A bias activation
§ An activation function
In formulas: $a = f\big(\sum_i w_i a_i + b\big)$
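A minimal sketch of this computation in Python/NumPy (inputs, weights, and bias are made-up values; ReLU, introduced just below, is used as the activation function):

```python
import numpy as np

def relu(z):
    # Rectified linear unit: max(0, z)
    return np.maximum(0.0, z)

# Hypothetical input activations, weights, and bias
a_in = np.array([0.5, 0.2, 0.9])  # activations from the previous layer
w = np.array([0.4, -0.3, 0.8])    # one weight per input
b = -0.5                          # bias

# Weighted sum of the inputs plus bias, passed through
# the activation function
a_out = relu(w @ a_in + b)
print(a_out)  # approx. 0.36
```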
Similarity to Convolutions?
§ A neuron is similar to a convolution
§ Remember linear shift-invariant
kernels used as local operators
[Figure: example responses, “activated” vs. “no activation”]
ReLU Activation Function
§ The most commonly used one is the so-called “rectified linear unit” or ReLU
§ $f(x) = \max(0, x)$
§ Often advantageous for deep networks
Neuron Activation
§ A neuron is only activated if $f\big(\sum_i w_i a_i + b\big) > 0$
§ For ReLU: if $\sum_i w_i a_i > -b$, i.e.,
§ the weighted activations are larger than the negative bias
Common Activation Functions
There are different activation functions
§ sigmoid()
§ ReLU()
§ tanh()
§ atan()
§ softplus()
§ identity()
§ step-function()
§ …
ReLU is often used
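A minimal sketch of a few of these in NumPy (tanh and atan are available directly as np.tanh and np.arctan):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def softplus(z):
    # Smooth approximation of ReLU
    return np.log1p(np.exp(z))

def step(z):
    # 1 where z >= 0, else 0
    return (z >= 0).astype(float)

z = np.linspace(-2.0, 2.0, 5)
print(relu(z))     # [0. 0. 0. 1. 2.]
print(sigmoid(z))  # values strictly between 0 and 1
```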
Illustration
[Figure: plots of common activation functions, courtesy of S. Sharma]
Function Behind a Neuron
§ A neuron gets activated if the weighted sum of input activations is large enough (larger than the negative bias)
Each Layer Can Be Expressed
Through Matrix Multiplications
$a^{(1)} = f\big(W^{(1)} a^{(0)} + b^{(1)}\big)$, mapping layer 0 to layer 1
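A minimal sketch of one layer as a matrix multiplication (the sizes are made up; ReLU plays the role of $f$ and is applied element-wise):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 inputs (layer 0), 3 neurons (layer 1)
a0 = rng.random(4)                # activations of layer 0 (input)
W1 = rng.standard_normal((3, 4))  # one row of weights per neuron
b1 = rng.standard_normal(3)       # one bias per neuron

# The whole layer in one matrix multiplication; the
# activation function is applied element-wise
a1 = np.maximum(0.0, W1 @ a0 + b1)
print(a1.shape)  # (3,)
```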
Do It Layer by Layer...
input = layer 0: $a^{(0)} = x$
layer 1: $a^{(1)} = f\big(W^{(1)} a^{(0)} + b^{(1)}\big)$
layer 2: $a^{(2)} = f\big(W^{(2)} a^{(1)} + b^{(2)}\big)$
layer k = output: $a^{(k)} = f\big(W^{(k)} a^{(k-1)} + b^{(k)}\big)$
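A minimal sketch of the full forward pass, repeating the matrix form layer by layer (layer sizes and parameters are made up; real parameters come from learning, covered in Part 2):

```python
import numpy as np

def feedforward(x, weights, biases):
    # Apply one layer after the other: a <- f(W a + b)
    a = x
    for W, b in zip(weights, biases):
        a = np.maximum(0.0, W @ a + b)  # ReLU as activation f
    return a

rng = np.random.default_rng(0)

# Hypothetical network with layer sizes 4 -> 5 -> 3
sizes = [4, 5, 3]
weights = [rng.standard_normal((m, n))
           for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

output = feedforward(rng.random(4), weights, biases)
print(output.shape)  # (3,)
```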
Handwritten Digit Recognition
A 28x28 pixel image of a handwritten digit, e.g. a “5”, is the input.
The 28x28 pixel input images are flattened into a 784-dimensional input vector; the output is a 10-dimensional vector with one entry per digit.
[Image courtesy: Nielsen]
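Plugging these dimensions into the feedforward sketch above (the 30-neuron hidden layer is an assumption, borrowed from Nielsen’s example network; with random weights the prediction is of course meaningless until the network is trained):

```python
import numpy as np

rng = np.random.default_rng(0)

# 784-dim input (28x28 pixels) -> 30 hidden neurons -> 10-dim output
sizes = [784, 30, 10]
weights = [rng.standard_normal((m, n))
           for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

a = rng.random(784)  # flattened pixel intensities in [0, 1]
for W, b in zip(weights, biases):
    a = np.maximum(0.0, W @ a + b)

print(a.shape)            # (10,) -- one score per digit 0..9
print(int(np.argmax(a)))  # index of the largest value = predicted digit
```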
What Happens in the Layers?
What Happens in the 1st Layer?
784 input activations = pixel intensities
784 weights = weights for the pixel intensities
Treat activations and weights as images.
[Figure: a weight image with values in the range −1 to +1; positive (white) weights where the pattern should be bright, negative (black) weights where it should be dark, and the rest doesn’t matter]
The weights tell us what matters for activating the neuron!
Here:
§ Global (not local) operators
§ The weight matrix does not (yet) “slide over the image”
Weights & Bias = Patterns
§ Weights define the patterns to look
for in the image
§ Bias tells us how well the image must
match the pattern
§ The activation function “switches the neuron on” if it matches the pattern
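A minimal sketch of this pattern view (the 3x3 “image” and the weights are made up): the weighted sum is the correlation between the weight pattern and the image, and the bias sets how strong the match must be before the neuron switches on.

```python
import numpy as np

# Hypothetical weight pattern: +1 where the image should be
# bright, -1 where it should be dark, 0 where it doesn't matter
pattern = np.array([[ 1.0,  1.0,  1.0],
                    [ 0.0,  0.0,  0.0],
                    [-1.0, -1.0, -1.0]])

image = np.array([[1.0, 0.9, 1.0],   # bright top row
                  [0.5, 0.4, 0.5],
                  [0.0, 0.1, 0.0]])  # dark bottom row

bias = -2.0  # the match score must exceed 2.0 to activate

# Weighted sum = correlation of the weight image and the input image
score = np.sum(pattern * image) + bias
activation = max(0.0, score)  # ReLU switches the neuron on
print(activation)  # approx. 0.8 -> the image matches the pattern
```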
What Happens in the 2nd Layer?
§ The weights in layer 2 tell us which
1st layer patterns should be combined
§ The deeper we go, the more patterns
get arranged and combined
How to Make the Network
Compute What We Want?
§ So far, the network is a recipe for
sequentially performing computations
§ Structure and parameters are the
design choices
§ How to set them?
Learning!
Summary – Part 1
§ What are neurons and neural networks
§ Lots of different networks exist
§ Focus: multi-layer perceptrons (MLP)
§ Activations, weights, bias
§ Networks have many parameters
§ “It’s just a bunch of matrices and
vectors”
§ MLP for simple image classification
§ Part 2: Learning the parameters
Literature & Resources
§ Online Book by Michael Nielsen, Chapter 1:
http://neuralnetworksanddeeplearning.com/chap1.html
§ Nielsen, Chapter 1, Python3 code:
https://github.com/MichalDanielDobrzanski/DeepLearningPython
§ MNIST database:
http://yann.lecun.com/exdb/mnist/
§ Grant Sanderson, Neural Networks
https://www.3blue1brown.com/
§ Alpaydin, Introduction to Machine Learning
Slide Information
§ The slides have been created by Cyrill Stachniss as part of the
photogrammetry and robotics courses.
§ I tried to acknowledge all people from whom I used
images or videos. In case I made a mistake or missed
someone, please let me know.
§ Huge thank you to Grant Sanderson (3blue1brown) for
his great educational videos that influenced this lecture.
§ Thanks to Michael Nielsen for his free online book & code
§ If you are a university lecturer, feel free to use the course
material. If you adapt the course material, please make sure
that you keep the acknowledgements to others and please
acknowledge me as well. To satisfy my own curiosity, please
send me an email notice if you use my slides.