CNN

Uploaded by

azqswxkamilo
Why CNN for Image

• Some patterns are much smaller than the whole image.
A neuron does not have to see the whole image to discover the pattern.
Connecting to a small region takes fewer parameters.
(Figure: a “beak” detector connected to a small region of the image)
Why CNN for Image
• The same patterns appear in different regions.
An “upper-left beak” detector and a “middle beak” detector do almost the same thing.
They can use the same set of parameters.
Why CNN for Image
• Subsampling the pixels will not change the object.
(Figure: an image of a bird is still an image of a bird after subsampling)
We can subsample the pixels to make the image smaller.
Fewer parameters are then needed for the network to process the image.
The whole CNN
(Figure: the pipeline from input to output)
image
→ Convolution
→ Max Pooling
→ Convolution
→ Max Pooling
   (Convolution and Max Pooling can repeat many times)
→ Flatten
→ Fully Connected Feedforward network
→ cat, dog, ……
The whole CNN
Property 1: Some patterns are much smaller than the whole image. → Convolution
Property 2: The same patterns appear in different regions. → Convolution
Property 3: Subsampling the pixels will not change the object. → Max Pooling
(Convolution and Max Pooling can repeat many times, followed by Flatten.)
CNN – Convolution

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1 (matrix):
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2 (matrix):
-1  1 -1
-1  1 -1
-1  1 -1

……

Those are the network parameters to be learned.
Each filter detects a small pattern (3 x 3). (Property 1)
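CNN – Convolution
stride=1

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 1 is placed on the upper-left 3 x 3 region of the 6 x 6 image and the inner product of the two is taken, then the filter moves one pixel to the right.
First two outputs in the top row: 3, -1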
CNN – Convolution
If stride=2

Filter 1 moves two pixels at a time over the 6 x 6 image.
First two outputs in the top row: 3, -3

We set stride=1 below.
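The effect of the stride on the size of the output can be written as a short worked equation. This is the usual formula for a convolution without padding; the slides never mention padding, so that is an assumption here, and the symbols $N$ (image side), $F$ (filter side), $S$ (stride) and $O$ (output side) are introduced only for this note.

```latex
O = \left\lfloor \frac{N - F}{S} \right\rfloor + 1,
\qquad
N = 6,\ F = 3:\quad
S = 1 \Rightarrow O = 4,
\qquad
S = 2 \Rightarrow O = \left\lfloor \tfrac{3}{2} \right\rfloor + 1 = 2
```

With stride=1 the 6 x 6 image therefore gives a 4 x 4 output, and with stride=2 a 2 x 2 output.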
CNN – Convolution
stride=1

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Output of Filter 1 on the 6 x 6 image:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

The same pattern gives the same response (3) in the upper-left and in the lower-left of the image. (Property 2)
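The sliding-window computation on this slide can be sketched in a few lines of plain Python. The image and Filter 1 are copied from the slides; the function name `convolve2d` and its `stride` argument are illustrative choices, and the sketch assumes a square filter and no padding.

```python
# 6 x 6 image and Filter 1, copied from the slides.
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]


def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over the image (no padding) and return the feature map."""
    k = len(kernel)
    n_rows = (len(image) - k) // stride + 1
    n_cols = (len(image[0]) - k) // stride + 1
    feature_map = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            top, left = r * stride, c * stride
            # Inner product between the filter and the patch under it.
            row.append(sum(
                kernel[i][j] * image[top + i][left + j]
                for i in range(k)
                for j in range(k)
            ))
        feature_map.append(row)
    return feature_map


feature_map_1 = convolve2d(IMAGE, FILTER_1, stride=1)
for row in feature_map_1:
    print(row)
```

Calling `convolve2d(IMAGE, FILTER_1, stride=2)` keeps only every second position in each direction, which is the stride=2 case of the previous slide.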
CNN – Convolution
stride=1

Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1

Do the same process for every filter.

Output of Filter 1 on the 6 x 6 image:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

Output of Filter 2 on the 6 x 6 image:
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

Together the outputs form the Feature Map: a 4 x 4 image.
CNN – Colorful image
(Figure: the colorful image is drawn as a stack of 6 x 6 matrices, one per color channel; Filter 1 and Filter 2 are each drawn as a stack of 3 x 3 matrices of the same depth.)
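A minimal sketch of how a stacked filter can be applied to a stacked image, assuming the usual convention that the products are summed over all channels so that each filter still yields a single 2-D output. The slide itself only shows the stacks; the 4 x 4 toy input, the function name `convolve_channels`, and the variable names below are hypothetical.

```python
def convolve_channels(image, kernel):
    """Stride-1, unpadded convolution of a C-channel image with a C-channel filter.

    image:  list of C matrices (one per channel)
    kernel: list of C square k x k matrices
    Returns one 2-D feature map: the products are summed over all channels.
    """
    channels = len(image)
    k = len(kernel[0])
    out_rows = len(image[0]) - k + 1
    out_cols = len(image[0][0]) - k + 1
    return [
        [
            sum(
                kernel[ch][i][j] * image[ch][r + i][c + j]
                for ch in range(channels)
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Hypothetical toy input: the same 4 x 4 diagonal pattern in all three channels.
plane = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
rgb_image = [plane, plane, plane]

# Filter 1 from the slides, repeated once per channel.
diagonal = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
rgb_filter = [diagonal, diagonal, diagonal]

feature_map = convolve_channels(rgb_image, rgb_filter)
weights_per_filter = len(rgb_filter) * len(diagonal) * len(diagonal[0])
print(feature_map)          # one 2 x 2 map, not three
print(weights_per_filter)   # 3 channels x 3 x 3
```

Under this convention a 3 x 3 filter on an RGB image has 27 weights rather than 9.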
Convolution v.s. Fully Connected

convolution: the 6 x 6 image is convolved with Filter 1 and Filter 2.

 1 -1 -1        -1  1 -1
-1  1 -1        -1  1 -1
-1 -1  1        -1  1 -1

Fully-connected: the same 6 x 6 image is flattened into the inputs x1, x2, ……, x36, and every neuron connects to all of them.
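Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

The 6 x 6 image is flattened row by row into inputs 1, 2, ……, 36. Values of the inputs shown:
1: 1, 2: 0, 3: 0, 4: 0, 7: 0, 8: 1, 9: 0, 10: 0, 13: 0, 14: 0, 15: 1, 16: 1

The neuron with output 3 connects only to inputs 1, 2, 3, 7, 8, 9, 13, 14, 15 (the upper-left 3 x 3 region).
Fewer parameters! It only connects to 9 inputs, not fully connected.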

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

The neuron with output 3 connects to inputs 1, 2, 3, 7, 8, 9, 13, 14, 15.
The neuron with output -1 connects to inputs 2, 3, 4, 8, 9, 10, 14, 15, 16.
Fewer parameters!
Shared weights: both neurons use the same 9 weights of Filter 1.
Even fewer parameters!
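The two ideas on these slides (sparse connections and shared weights) can be checked with a small sketch that treats each output of the convolution as a neuron reading from the flattened 6 x 6 image. The helper names `receptive_field` and `neuron_output` are illustrative, and the parameter counts ignore bias terms.

```python
IMAGE = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
FILTER_1 = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]

# Flatten the image row by row; the slides number the inputs from 1 to 36.
x = [pixel for row in IMAGE for pixel in row]
# The 9 filter values, flattened the same way: these are the shared weights.
shared_weights = [w for row in FILTER_1 for w in row]


def receptive_field(top, left, size=3, width=6):
    """1-based input indices seen by the neuron whose patch starts at (top, left)."""
    return [
        (top + i) * width + (left + j) + 1
        for i in range(size)
        for j in range(size)
    ]


def neuron_output(top, left):
    """Weighted sum over the 9 connected inputs, using the shared weights."""
    indices = receptive_field(top, left)
    return sum(w * x[i - 1] for w, i in zip(shared_weights, indices))


first_inputs = receptive_field(0, 0)
second_inputs = receptive_field(0, 1)
print(first_inputs, neuron_output(0, 0))
print(second_inputs, neuron_output(0, 1))

# Weights needed for the 16 neurons of one 4 x 4 output (biases ignored).
fully_connected = 36 * 16      # every neuron sees all 36 inputs
sparse_not_shared = 9 * 16     # every neuron sees 9 inputs, own weights
sparse_and_shared = 9          # all neurons reuse the filter's 9 weights
print(fully_connected, sparse_not_shared, sparse_and_shared)
```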

CNN – Max Pooling

Filter 1:            Filter 2:
 1 -1 -1             -1  1 -1
-1  1 -1             -1  1 -1
-1 -1  1             -1  1 -1

Output of Filter 1:  Output of Filter 2:
 3 -1 -3 -1          -1 -1 -1 -1
-3  1  0 -3          -1 -1 -2  1
-3 -3  0  1          -1 -1 -2  1
 3 -2 -2 -1          -1  0 -4  3
CNN – Max Pooling

6 x 6 image → Conv → Max Pooling → new image, but smaller (2 x 2 image)

After Max Pooling, Filter 1:   After Max Pooling, Filter 2:
3 0                            -1 1
3 1                             0 3

Each filter is a channel.
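As a sketch of the pooling step, the function below takes the maximum of each non-overlapping 2 x 2 block. The two 4 x 4 inputs are the filter outputs from the slides; the name `max_pool` and the choice to drop any leftover rows or columns are assumptions made for this example.

```python
def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Outputs of Filter 1 and Filter 2 on the 6 x 6 image, copied from the slides.
filter_1_output = [
    [3, -1, -3, -1],
    [-3, 1, 0, -3],
    [-3, -3, 0, 1],
    [3, -2, -2, -1],
]
filter_2_output = [
    [-1, -1, -1, -1],
    [-1, -1, -2, 1],
    [-1, -1, -2, 1],
    [-1, 0, -4, 3],
]

# One pooled 2 x 2 channel per filter.
new_image = [max_pool(filter_1_output), max_pool(filter_2_output)]
for channel in new_image:
    print(channel)
```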
The whole CNN
After Convolution and Max Pooling we get a new image:

3 0      -1 1
3 1       0 3

It is smaller than the original image.
The number of channels is the number of filters.
Convolution and Max Pooling can repeat many times.
The whole CNN
image
→ Convolution
→ Max Pooling (gives a new image)
→ Convolution
→ Max Pooling (gives a new image)
→ Flatten
→ Fully Connected Feedforward network
→ cat, dog, ……
Flatten

The channels of the new image:

3 0      -1 1
3 1       0 3

are flattened into one vector:

3, 0, 1, 3, -1, 1, 0, 3

which is the input of the Fully Connected Feedforward network.
CNN in Keras
Only modified the network structure and input format (vector -> 3-D tensor)

input → Convolution → Max Pooling → Convolution → Max Pooling

First Convolution layer: there are 25 3x3 filters.
Input_shape = (28, 28, 1)
28 x 28 pixels; 1: black/white, 3: RGB
CNN in Keras
Only modified the network structure and input format (vector -> 3-D tensor)

input: 1 x 28 x 28
→ Convolution: 25 x 26 x 26
   How many parameters for each filter? 9 (= 3 x 3)
→ Max Pooling: 25 x 13 x 13
→ Convolution: 50 x 11 x 11
   How many parameters for each filter? 225 (= 25 x 3 x 3)
→ Max Pooling: 50 x 5 x 5
CNN in Keras
Only modified the network structure and input format (vector -> 3-D tensor)

input: 1 x 28 x 28
→ Convolution: 25 x 26 x 26
→ Max Pooling: 25 x 13 x 13
→ Convolution: 50 x 11 x 11
→ Max Pooling: 50 x 5 x 5
→ Flatten: 1250
→ Fully Connected Feedforward network
→ output
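The shapes and parameter counts on these slides can be reproduced with a little arithmetic. The sketch below is plain Python, not Keras code; it assumes 3 x 3 filters with stride 1 and no padding, 2 x 2 pooling that drops leftover rows and columns, and it counts weights only (no bias terms). The function names `conv_output` and `pool_output` are made up for this example.

```python
def conv_output(shape, n_filters, kernel=3):
    """Shape after a stride-1, unpadded convolution, plus weights per filter."""
    channels, height, width = shape
    weights_per_filter = channels * kernel * kernel  # bias terms not counted
    new_shape = (n_filters, height - kernel + 1, width - kernel + 1)
    return new_shape, weights_per_filter


def pool_output(shape, size=2):
    """Shape after non-overlapping pooling; leftover rows/columns are dropped."""
    channels, height, width = shape
    return (channels, height // size, width // size)


input_shape = (1, 28, 28)
conv1_shape, conv1_weights = conv_output(input_shape, n_filters=25)
pool1_shape = pool_output(conv1_shape)
conv2_shape, conv2_weights = conv_output(pool1_shape, n_filters=50)
pool2_shape = pool_output(conv2_shape)
flat_length = pool2_shape[0] * pool2_shape[1] * pool2_shape[2]

print("conv 1:", conv1_shape, "weights per filter:", conv1_weights)
print("pool 1:", pool1_shape)
print("conv 2:", conv2_shape, "weights per filter:", conv2_weights)
print("pool 2:", pool2_shape)
print("flatten:", flat_length)
```

The second convolution needs 225 weights per filter because each of its filters spans all 25 channels produced by the first one.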
First Convolution Layer
• Typical-looking filters on the trained first layer
Filter size: 11 x 11 (AlexNet)
http://cs231n.github.io/understanding-cnn/
What does CNN learn?
(Network: input → Convolution (25 3x3 filters) → Max Pooling → Convolution (50 3x3 filters), output 50 x 11 x 11 → Max Pooling)

The output of the k-th filter is an 11 x 11 matrix with elements $a^k_{ij}$.

Degree of the activation of the k-th filter:
$$a^k = \sum_{i=1}^{11} \sum_{j=1}^{11} a^k_{ij}$$

Find the input $x$ that maximizes it, using $\frac{\partial a^k}{\partial x_{ij}}$ (gradient ascent):
$$x^* = \arg\max_x a^k$$
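A heavily simplified sketch of the gradient-ascent idea: instead of the trained two-layer network on the slide, it uses a single linear 3 x 3 filter on a 6 x 6 input, so the gradient of $a^k$ with respect to each pixel can be written down directly. Everything here is an assumption made for illustration (the filter, the step size, clipping pixels to [0, 1], the function names); in a real network the gradient would come from backpropagation.

```python
import random

# Stand-in for the k-th filter: Filter 1 from the earlier slides.
FILTER = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]


def activation(x, kernel):
    """a^k: the sum of all outputs of the filter (stride 1, no padding)."""
    k = len(kernel)
    n = len(x) - k + 1
    return sum(
        kernel[i][j] * x[r + i][c + j]
        for r in range(n) for c in range(n)
        for i in range(k) for j in range(k)
    )


def gradient(x, kernel):
    """d a^k / d x_ij for a linear filter.

    Each pixel collects the weight of every filter position that covers it,
    so the gradient does not depend on the pixel values themselves.
    """
    k = len(kernel)
    size = len(x)
    n = size - k + 1
    grad = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            for i in range(k):
                for j in range(k):
                    grad[r + i][c + j] += kernel[i][j]
    return grad


def gradient_ascent(x, kernel, steps=20, learning_rate=0.1):
    """Move x uphill on a^k, clipping pixels to the range [0, 1]."""
    x = [row[:] for row in x]
    for _ in range(steps):
        grad = gradient(x, kernel)
        for i in range(len(x)):
            for j in range(len(x)):
                x[i][j] = min(1.0, max(0.0, x[i][j] + learning_rate * grad[i][j]))
    return x


random.seed(0)
start = [[random.random() for _ in range(6)] for _ in range(6)]
best = gradient_ascent(start, FILTER)
print(activation(start, FILTER), "->", activation(best, FILTER))
```

The image found this way is the one the filter "likes" most; for the diagonal filter used here, the pixels with a positive gradient get brighter and the rest are pushed toward zero.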
What does CNN learn?
(Network: input → Convolution → Max Pooling → Convolution → Max Pooling → flatten → ……)

Find an image maximizing the output of neuron:
$$x^* = \arg\max_x a^j$$

(Figure: each figure corresponds to a neuron $a^j$)
What does CNN learn?
(Network: input → Convolution → Max Pooling → Convolution → Max Pooling → flatten → …… → output $y^i$)

$$x^* = \arg\max_x y^i$$

Can we see digits?
(Figure: the images found for the outputs 0, 1, 2, 3, 4, 5, 6, 7, 8)

Deep Neural Networks are Easily Fooled
https://www.youtube.com/watch?v=M2IebCN9Ht4
$$\left| \frac{\partial y_k}{\partial x_{ij}} \right|$$
$y_k$: the predicted class of the model
$x_{ij}$: a pixel

Karen Simonyan, Andrea Vedaldi, Andrew Zisserman, “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR, 2014
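The quantity $|\partial y_k / \partial x_{ij}|$ can be approximated without any deep-learning library by nudging one pixel at a time and watching how $y_k$ changes. The sketch below does this with central differences; `toy_model` is a hypothetical stand-in for a trained network, and `saliency_map` and its `eps` parameter are names chosen for this example. A real implementation would normally obtain the gradient by backpropagation.

```python
def saliency_map(model, image, k, eps=1e-4):
    """Approximate |d y_k / d x_ij| for every pixel with central differences."""
    rows, cols = len(image), len(image[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            plus = [row[:] for row in image]
            minus = [row[:] for row in image]
            plus[i][j] += eps
            minus[i][j] -= eps
            saliency[i][j] = abs(model(plus)[k] - model(minus)[k]) / (2 * eps)
    return saliency


def toy_model(image):
    """Hypothetical model with two class scores, used only for illustration."""
    y0 = 2.0 * image[0][0] - 3.0 * image[1][1]   # ignores the other pixels
    y1 = image[0][1] ** 2
    return [y0, y1]


image = [[0.5, 1.0], [0.2, 0.8]]
saliency_0 = saliency_map(toy_model, image, k=0)
saliency_1 = saliency_map(toy_model, image, k=1)
for row in saliency_0:
    print([round(v, 3) for v in row])
```

Pixels with a large value are the ones whose change affects the score of class k the most; pixels the model ignores get a value of zero.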
Deep Dream
• Given a photo, machine adds what it sees ……
(Figure: the photo goes through the CNN, which produces values such as 3.9, −1.5, 2.3, ……; the image is then modified.)
CNN exaggerates what it sees
http://deepdreamgenerator.com/
Deep Style
A Neural Algorithm of Artistic Style
https://arxiv.org/abs/1508.06576
(Figure: CNN → content; CNN → style; CNN → ?)
More Application: Playing Go
Network input: 19 x 19 matrix (image), given as a 19 x 19 vector
Black: 1, white: -1, none: 0
Network output: next move (19 x 19 positions), a 19 x 19 vector
A fully-connected feedforward network can be used, but CNN performs much better.
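The board encoding described on this slide (Black: 1, white: -1, none: 0) can be sketched as follows. The input format, a dictionary from 0-based (row, column) pairs to a colour name, is a hypothetical choice for this example, as are the function names and the two stones placed on the board.

```python
BOARD_SIZE = 19
STONE_VALUE = {"black": 1, "white": -1}   # empty points stay 0


def encode_board(stones):
    """Turn {(row, col): colour} into a 19 x 19 matrix of 1, -1 and 0."""
    board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for (row, col), colour in stones.items():
        board[row][col] = STONE_VALUE[colour]
    return board


def flatten(board):
    """Row-by-row vector, as a fully-connected network would need it."""
    return [value for row in board for value in row]


# Hypothetical position: one black stone and one white stone.
stones = {(4, 4): "black", (9, 9): "white"}
board = encode_board(stones)    # 19 x 19 matrix, the "image" for a CNN
vector = flatten(board)         # 361 values for a fully-connected network
print(len(board), len(board[0]), len(vector))
```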
More Application: Playing Go
Training: record of previous plays
Black: 5-five, White: tengen (the center point), Black: five-5 …

(Figure: the CNN takes the current board as input)
Target: “tengen” = 1, else = 0
Target: “five-5” = 1, else = 0
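One way to turn a record of plays into training examples, under loud assumptions: moves are given as 0-based (row, column) pairs, Black plays first and the players alternate, and captures are ignored entirely, so this is not a full Go implementation. The coordinates in `record` are made up and are not a translation of the moves named on the slide.

```python
BOARD_SIZE = 19


def one_hot_target(move):
    """Vector of 361 zeros with a single 1 at the position of the next move."""
    row, col = move
    target = [0] * (BOARD_SIZE * BOARD_SIZE)
    target[row * BOARD_SIZE + col] = 1
    return target


def training_pairs(record):
    """Pair the board before each move with the one-hot encoding of that move.

    Assumes Black moves first and players alternate; captures are ignored.
    """
    board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    pairs = []
    for turn, move in enumerate(record):
        snapshot = [row[:] for row in board]          # input to the CNN
        pairs.append((snapshot, one_hot_target(move)))
        row, col = move
        board[row][col] = 1 if turn % 2 == 0 else -1  # Black: 1, white: -1
    return pairs


# Hypothetical record: Black, White (center point), Black.
record = [(4, 4), (9, 9), (4, 14)]
pairs = training_pairs(record)
board_before_white, white_target = pairs[1]
print(board_before_white[4][4], white_target.index(1))
```

The second pair corresponds to the first example on the slide: the input is the board after Black's first move, and the target is 1 at White's move and 0 everywhere else.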
Why CNN for playing Go?
• Some patterns are much smaller than the whole image
Alpha Go uses 5 x 5 for first layer
• The same patterns appear in different regions.
More Application: Speech
(Figure: a Spectrogram, with axes Time and Frequency, is treated as an Image and fed to the CNN)
The filters move in the frequency direction.
More Application: Text
Source of image:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.703.6858&rep=rep1&type=pdf
