Intro to TensorFlow 2.0
MBL, August 2019
Josh Gordon (@random_forests)
Agenda 1 of 2
Exercises
● Fashion MNIST with dense layers
● CIFAR-10 with convolutional layers
Concepts (as many as we can intro in this short time)
● Gradient descent, dense layers, loss, softmax, convolution
Games
● QuickDraw
Agenda 2 of 2
Walkthroughs and new tutorials
● Deep Dream and Style Transfer
● Time series forecasting
Games
● Sketch RNN
Learning more
● Book recommendations
Deep Learning is representation learning
Latest tutorials and guides
[Link]/beta
News and updates
[Link]/tensorflow
[Link]/tensorflow
Demo
PoseNet and BodyPix
[Link]/pose-net
[Link]/body-pix
TensorFlow for JavaScript, Swift,
Android, and iOS
[Link]/js
[Link]/swift
[Link]/lite
Minimal MNIST in TF 2.0
A linear model, neural network, and deep
neural network - then a short exercise.
[Link]/mnist-seq
Softmax
model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(784,)))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
Linear model
Neural network
Deep neural network
Softmax activation
After training, select all the
weights connected to this
output.
model.layers[0].get_weights()
# Your code here
# Select the weights for a single output
# ...
img = weights.reshape(28, 28)
plt.imshow(img, cmap=plt.get_cmap('seismic'))
Softmax activation
After training, select all the
weights connected to this
output.
Exercise 1 (option #1)
Exercise: [Link]/mnist-seq
Reference:
tensorflow.org/beta/tutorials/keras/basic_classification
TODO:
Add a validation set. Add code to plot loss vs epochs (next slide).
Exercise 1 (option #2)
[Link]/ijcav_adv
Answers: next slide.
import matplotlib.pyplot as plt
# Add a validation set
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), ...)
# Get stats from the history object
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
epochs = range(len(acc))
# Plot accuracy vs epochs
plt.title('Training and validation accuracy')
plt.plot(epochs, acc, color='blue', label='Train')
plt.plot(epochs, val_acc, color='orange', label='Val')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
Exercise 1 (option #2)
[Link]/ijcav_adv
Answers: next slide.
[Link]/ijcai_adv_answer
About TensorFlow 2.0
Install
# GPU
!pip install tensorflow-gpu==2.0.0-beta1
# CPU
!pip install tensorflow==2.0.0-beta1
In either case, check your installation (in Colab, you may need to use runtime -> restart after installing).
import tensorflow as tf
print(tf.__version__) # 2.0.0-beta1
Nightly is available too, but best bet: stick with a named release for stability.
TF2 is imperative by default
import tensorflow as tf
print(tf.__version__) # 2.0.0-beta1
x = tf.constant(1)
y = tf.constant(2)
z = x + y
print(z) # tf.Tensor(3, shape=(), dtype=int32)
You can interactively explore layers
from tensorflow.keras.layers import Dense
layer = Dense(units=1, kernel_initializer='ones', use_bias=False)
data = tf.constant([[1.0, 2.0, 3.0]]) # Note: a batch of data
print(data) # tf.Tensor([[1. 2. 3.]], shape=(1, 3), dtype=float32)
# Call the layer on our data
result = layer(data)
print(result) # tf.Tensor([[6.]], shape=(1, 1), dtype=float32)
print(result.numpy()) # Tensors have a handy .numpy() method
TF1: Build a graph, then run it.
import tensorflow as tf # 1.14.0
print(tf.__version__)
x = tf.constant(1)
y = tf.constant(2)
z = tf.add(x, y)
print(z)
TF1: Build a graph, then run it.
import tensorflow as tf # 1.14.0
print(tf.__version__)
x = tf.constant(1)
y = tf.constant(2)
z = tf.add(x, y)
print(z) # Tensor("Add:0", shape=(), dtype=int32)
with tf.Session() as sess:
    print(sess.run(z)) # 3
Keras is built into TF2
How to import tf.keras
If you see the message “Using TensorFlow backend”, you have accidentally imported standalone Keras (which is installed by default on Colab) from outside of TensorFlow.
Example
# !pip install tensorflow==2.0.0-beta1, then
>>> from tensorflow.keras import layers # Right
>>> from keras import layers # Oops
Using TensorFlow backend. # You shouldn’t see this
When in doubt, copy the imports from one of the tutorials on tensorflow.org/beta
Notes
tf.keras is a superset of the reference implementation, and is built into TensorFlow 2.0 (no need to install Keras separately).
Documentation and examples:
● Tutorials: tensorflow.org/beta
● Guide: tensorflow.org/beta/guide/keras/
tf.keras adds a bunch of stuff, including model subclassing (Chainer / PyTorch style model building), custom training loops using a GradientTape, a collection of distributed training strategies, and support for [Link], Android, iOS, etc.
!pip install tensorflow==2.0.0-beta1
from tensorflow import keras
I’d recommend the examples you find on tensorflow.org/beta over other resources (they are better maintained and most of them are carefully reviewed).
More notes
TF 2.0 is similar to NumPy, with:
● GPU support
● Autodiff
● Distributed training
● JIT compilation
● A portable format (train in Python on Mac, deploy on iOS using Swift, or in a browser using
JavaScript)
Write models in Python, JavaScript or Swift (and run anywhere).
API doc: tensorflow.org/versions/r2.0/api_docs/python/tf
Note: make sure you’re looking at version 2.0 (the website still defaults to 1.x)
Three model building styles
Sequential, Functional, Subclassing
Sequential models
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
TF 1.x
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
TF 2.0
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
Functional models
inputs = tf.keras.Input(shape=(32, 32, 3))
y = layers.Conv2D(3, (3, 3), activation='relu', padding='same')(inputs)
outputs = layers.add([inputs, y])
model = tf.keras.Model(inputs, outputs)
tf.keras.utils.plot_model(model, 'skip_connection.png', show_shapes=True)
Subclassed models
class MyModel(tf.keras.Model):
    def __init__(self, num_classes=10):
        super(MyModel, self).__init__(name='my_model')
        self.dense_1 = layers.Dense(32, activation='relu')
        self.dense_2 = layers.Dense(num_classes, activation='sigmoid')

    def call(self, inputs):
        # Define your forward pass here
        x = self.dense_1(inputs)
        return self.dense_2(x)
Two training styles
Built-in and custom
Use a built-in training loop
model.fit(x_train, y_train, epochs=5)
Or, define your own
model = MyModel()
with tf.GradientTape() as tape:
    logits = model(images)
    loss_value = loss(logits, labels)
grads = tape.gradient(loss_value, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
A few concepts
Gradient descent
Calculate the gradient. Take a step. Repeat.
The gradient is a vector of partial derivatives. It points in the direction of steepest ascent, so we step in the reverse direction.
[Figure: loss vs. parameter, with steps t=1, t=2, t=3; the step size is the learning rate.]
With more than one variable
The gradient is a vector of partial derivatives (the derivative of a function w.r.t. each variable while the others are held constant).
The gradient points in the direction of steepest ascent. We usually want to minimize a function (like loss), so we take a step in the opposite direction.
[Figure: loss surface over weights w0 and w1.]
Training models with gradient descent
Forward pass
● Linear regression: y = mx + b
● Neural network: f(x) = softmax(W2 · g(W1 · x))
Calculate loss
● Regression: squared error.
● Classification: cross entropy.
Backward pass
● Backprop: efficient method to calculate gradients
● Gradient descent: nudge parameters a bit in the opposite direction
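The loop above (forward pass, loss, step against the gradient) can be sketched in plain Python for 1-d linear regression. The data points and learning rate below are made up for illustration, and the derivatives of the squared-error loss are written out by hand rather than computed by backprop:

```python
# Fit y = m*x + b by gradient descent (a sketch, not the workshop notebook).
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1

m, b = 0.0, 0.0
learning_rate = 0.05

for step in range(2000):
    # Forward pass + gradients of mean squared error w.r.t. m and b
    dm = sum(2 * (m * x + b - y) * x for x, y in data) / len(data)
    db = sum(2 * (m * x + b - y) for x, y in data) / len(data)
    # Nudge each parameter in the opposite direction of its gradient
    m -= learning_rate * dm
    b -= learning_rate * db

print(round(m, 2), round(b, 2))  # approaches m = 2, b = 1
```

TensorFlow's GradientTape automates exactly the `dm`/`db` step, which is why the Deep Dream training loop later looks so similar.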
Try it: Linear regression
[Link]/tf-ws1
Bonus: Deep Dream training loop will be
similar.
A neuron
A linear combination of inputs and weights, passed through an activation function:
ŷ = g(Σ xᵢθᵢ)
Can rewrite as a dot product:
ŷ = g(xᵀθ)
[Diagram: inputs x0, x1, x2 with weights θ0, θ1, θ2 feed a sum Σ, then an activation g, producing the output ŷ.]
Bias not drawn (you could set x1 to be a constant input of 1).
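In code, the neuron above is just a dot product followed by an activation. The inputs and weights below are made-up numbers, and g is ReLU:

```python
def relu(z):
    return max(0.0, z)

def neuron(x, theta, g):
    # y_hat = g(x . theta)
    return g(sum(xi * ti for xi, ti in zip(x, theta)))

x = [1.0, 2.0, 3.0]       # inputs (made up)
theta = [0.5, -1.0, 1.0]  # weights (made up)
print(neuron(x, theta, relu))  # 0.5*1 - 1*2 + 1*3 = 1.5
```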
One image and one class
Multiple inputs; one output.
w = [1.4, 0.5, 0.7, 1.2]
x = [12, 48, 96, 18]
b = 0.5
w·x + b = 130.1 (Plane)
One image and two classes
Multiple inputs; multiple outputs. W is now a matrix:
W = [[ 1.4, 0.5, 0.7,  1.2],
     [-2.0, 0.1, 0.2, -0.7]]
x = [12, 48, 96, 18]
b = [0.5, 1.2]
Wx + b = [130.1, -11.4] → Plane, Car
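You can check the "one image and two classes" numbers with a few lines of plain Python (no NumPy needed at this size):

```python
W = [[1.4, 0.5, 0.7, 1.2],
     [-2.0, 0.1, 0.2, -0.7]]
x = [12, 48, 96, 18]
b = [0.5, 1.2]

# scores[i] = (row i of W) . x + b[i]
scores = [sum(w_ik * x_k for w_ik, x_k in zip(row, x)) + b_i
          for row, b_i in zip(W, b)]
print([round(s, 1) for s in scores])  # [130.1, -11.4] -> Plane, Car
```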
Two images and three classes
Each image is a column of x:
W = [[ 1.4, 0.5,  0.7,  1.2],
     [-2.0, 0.1,  0.2, -0.7],
     [ 0.2, 0.9, -0.2,  0.5]]
x = [[12,  4],
     [48, 18],
     [96,  2],
     [18, 96]]
b = [0.5, 1.2, 0.2]
Wx + b = [[130.1, 131.7],  (Plane)
          [-11.4, -71.7],  (Car)
          [ 12.8,  64.8]]  (Truck)
Softmax activation
After training, select all the
weights connected to this
output.
model.layers[0].get_weights()
# Your code here
# Select the weights for a single output
# ...
img = weights.reshape(28, 28)
plt.imshow(img, cmap=plt.get_cmap('seismic'))
Softmax activation
After training, select all the
weights connected to this
output.
A neural network
ReLU, applied piecewise to each score:
g(130.1) = 130.1 (Plane)
g(-11.4) = 0 (Car)
g(12.8) = 12.8 (Truck)
Activation functions introduce non-linearities
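ReLU itself is one line of code; applying it piecewise to the scores from the slide:

```python
def relu(z):
    # ReLU: pass positives through, zero out negatives
    return max(0.0, z)

scores = [130.1, -11.4, 12.8]  # Plane, Car, Truck
print([relu(s) for s in scores])  # [130.1, 0.0, 12.8]
```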
Notes
- You can make similar plots (and more) with this example. Note: from an older version of TF, but should work out of the box in Colab.
- Each of our convolutional layers used an activation as well (not shown in previous slides).
- You can make a demo of this in TensorFlow Playground by setting activation = Linear (or none)
Without activation, many layers are equivalent to one
# If you replace 'relu' with None, this model ...
model = Sequential([
Dense(256, activation='relu', input_shape=(2,)),
Dense(256, activation='relu'),
Dense(256, activation='relu'),
Dense(1, activation='sigmoid')
])
# ... has the same representation power as this one
model = Sequential([Dense(1, activation='sigmoid', input_shape=(2,))])
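You can see the collapse directly: composing two linear (no-activation) layers is the same as a single linear layer whose weight matrix is the product of the two. A tiny plain-Python sketch with made-up 2x2 weights:

```python
def matmul(A, B):
    # Naive matrix multiply for small lists-of-lists
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W1 = [[1.0, 2.0],
      [0.0, 1.0]]
W2 = [[3.0, 0.0],
      [1.0, 1.0]]
x = [[5.0], [7.0]]  # a column vector

# Two linear layers applied in sequence...
two_layers = matmul(W2, matmul(W1, x))
# ...equal one layer with the combined weight matrix W2 @ W1
one_layer = matmul(matmul(W2, W1), x)
print(two_layers == one_layer)  # True
```

Only a non-linearity between the layers breaks this equivalence, which is why depth without activations buys nothing.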
Softmax converts scores to probabilities
Scores: [130.1 (Plane), -11.4 (Car), 12.8 (Truck)]
softmax([130.1, -11.4, 12.8])
>>> 0.999, 0.001, 0.001
Note: these are ‘probability like’ numbers (do not go to Vegas and bet in this ratio).
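Softmax is a few lines of plain Python. Note that with scores this far apart, essentially all the mass lands on Plane; the 0.999/0.001 on the slide is illustrative rounding:

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([130.1, -11.4, 12.8])
print([round(p, 3) for p in probs])  # nearly all mass on the first class
```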
Cross entropy compares two distributions
Cross entropy loss for a batch of examples: sum over all examples of
−Σ (true prob) · (log predicted prob)
where the true prob is either 1 or 0 (one-hot labels, in our case) and the predicted prob is between 0 and 1.
Example: “this is a bird” (class 2):
Class:      0    1    2    3    4    5    6    7    8    9
True:       0    0    1    0    0    0    0    0    0    0
Predicted:  0.1  0.2  0.6  0.2  0.0  0.0  0.0  0.0  0.0  0.0
Rounded! Softmax output is always 0 < x < 1.
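For a one-hot label, only the true class contributes to the sum, so the per-example loss reduces to minus the log of the predicted probability for the true class:

```python
import math

# One-hot true distribution and the (rounded) predicted distribution from the slide
y_true = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0.1, 0.2, 0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Only the true class contributes: -log(predicted prob of the true class)
loss = -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)
print(round(loss, 3))  # -log(0.6) = 0.511
```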
Exercise
[Link]/ijcai_1-a
Complete the notebook for Fashion MNIST
Answers: next slide.
Exercise
[Link]/ijcai_1-a
Complete the notebook for Fashion MNIST
Answers: [Link]/ijcai_1-a_answers
TensorFlow RFP
jbgordon@[Link]
[Link]/tensorflow-rfp
Convolution
Not a Deep Learning concept
import scipy.signal
from skimage import color, data
import matplotlib.pyplot as plt
img = data.astronaut()
img = color.rgb2gray(img)
plt.axis('off')
plt.imshow(img, cmap=plt.cm.gray)
Convolution example
-1 -1 -1
-1 8 -1
-1 -1 -1
Notes
Edge detection intuition: dot
product of the filter with a
region of the image will be
zero if all the pixels around
the border have the same
value as the center.
Does anyone know who this is?
Convolution example
-1 -1 -1
-1 8 -1
-1 -1 -1
Eileen Collins
A simple edge detector
import numpy as np
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])
result = scipy.signal.convolve2d(img, kernel, 'same')
plt.axis('off')
plt.imshow(result, cmap=plt.cm.gray)
Easier to see with seismic
-1 -1 -1
-1 8 -1
-1 -1 -1
Eileen Collins
Example
An input image (4x4, no padding):
2 0 1 1
0 1 0 0
0 0 1 0
0 3 0 0
A filter (3x3):
1 0 1
0 0 0
0 1 0
Convolving with stride 1: slide the filter over the image, taking the dot product of the filter with each 3x3 region. The first (top-left) output value is
2*1 + 0*0 + 1*1 + 0*0 + 1*0 + 0*0 + 0*0 + 0*1 + 1*0 = 3
Output image (after convolving with stride 1):
3 2
3 1
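The worked example above can be reproduced with a few loops. As is conventional in deep learning, this slides the filter without flipping it (cross-correlation), valid padding, stride 1:

```python
def conv2d_valid(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1   # output height
    ow = len(image[0]) - kw + 1  # output width
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # Dot product of the filter with the 3x3 region at (i, j)
            out[i][j] = sum(image[i + a][j + b] * kernel[a][b]
                            for a in range(kh) for b in range(kw))
    return out

image = [[2, 0, 1, 1],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 3, 0, 0]]
kernel = [[1, 0, 1],
          [0, 0, 0],
          [0, 1, 0]]
print(conv2d_valid(image, kernel))  # [[3, 2], [3, 1]]
```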
In 3d
model = Sequential()
model.add(Conv2D(filters=4,
                 kernel_size=(4,4),
                 input_shape=(10,10,3)))
An RGB image is a 3d volume; each color (or channel) is a layer. In 3d, our filters have width, height, and depth (here, 4x4x3).
Each filter is applied in the same way as in 2d (sum of weight * pixel value as it slides across the image), over the whole input image. More filters, more output channels.
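The output shapes follow simple arithmetic: with 'valid' padding and stride 1, each spatial dimension shrinks by (kernel size − 1), and the channel count equals the number of filters. A small helper (hypothetical, not part of Keras) to check the model above:

```python
def conv_output_shape(h, w, channels, filters, kh, kw):
    # 'valid' padding, stride 1: each spatial dim shrinks by (kernel - 1);
    # the output depth is the number of filters
    return (h - kh + 1, w - kw + 1, filters)

# Conv2D(filters=4, kernel_size=(4,4)) on a 10x10x3 input:
shape = conv_output_shape(10, 10, 3, 4, 4, 4)
print(shape)  # (7, 7, 4)

# Stacking Conv2D(filters=8, kernel_size=(3,3)) on that output:
print(conv_output_shape(*shape, 8, 3, 3))  # (5, 5, 8)
```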
Going deeper
model = Sequential()
model.add(Conv2D(filters=4,
                 kernel_size=(4,4),
                 input_shape=(10,10,3)))
model.add(Conv2D(filters=8,
                 kernel_size=(3,3)))
The second layer's filters are 3x3x4: their depth matches the four output channels of the first layer.
Feature hierarchy: edges → shapes → textures → …
Exercise
[Link]/ijcai_1_b
Write a CNN from scratch for CIFAR-10.
Answers: next slide.
Ref: tensorflow.org/beta/tutorials/images/intro_to_cnns
Exercise
[Link]/ijcai_1b
Write a CNN from scratch for CIFAR-10.
Answers: [Link]/ijcai_1_b_answers
Game 1
Would you like to volunteer?
[Link]
Example: transfer learning
[Link]/ijcai_2
Transfer learning using a pretrained MobileNet and a Dense
layer.
Ref: tensorflow.org/beta/tutorials/images/transfer_learning
Ref: tensorflow.org/beta/tutorials/images/hub_with_keras
Example: transfer learning
[Link]/ijcai_2
Transfer learning using a pretrained
MobileNet and a Dense layer.
Answers: [Link]/ijcai_2_answers
Deep Dream
New tutorial
[Link]/dream-wip
Image segmentation
Recent tutorial
[Link]/im-seg
Timeseries forecasting
Recent tutorial
Game 2
Who would like to volunteer?
[Link]fl[Link]/assets/sketch_rnn_demo/[Link]
CycleGAN
Recent tutorial
Under the hood
Let’s make this faster
lstm_cell = tf.keras.layers.LSTMCell(10)

def fn(input, state):
    return lstm_cell(input, state)

input = tf.zeros([10, 10]); state = [tf.zeros([10, 10])] * 2
lstm_cell(input, state); fn(input, state)  # warm up
# benchmark
import timeit
timeit.timeit(lambda: lstm_cell(input, state), number=10)  # 0.03
Let’s make this faster
lstm_cell = tf.keras.layers.LSTMCell(10)

@tf.function
def fn(input, state):
    return lstm_cell(input, state)

input = tf.zeros([10, 10]); state = [tf.zeros([10, 10])] * 2
lstm_cell(input, state); fn(input, state)  # warm up
# benchmark
import timeit
timeit.timeit(lambda: lstm_cell(input, state), number=10)  # 0.03
timeit.timeit(lambda: fn(input, state), number=10)  # 0.004
AutoGraph makes this possible
@tf.function
def f(x):
    while tf.reduce_sum(x) > 1:
        x = tf.tanh(x)
    return x

# you never need to run this (unless curious)
print(tf.autograph.to_code(f))
Generated code
def tf__f(x):
  def loop_test(x_1):
    with ag__.function_scope('loop_test'):
      return ag__.gt(tf.reduce_sum(x_1), 1)
  def loop_body(x_1):
    with ag__.function_scope('loop_body'):
      with ag__.utils.control_dependency_on_returns([Link](x_1)):
        tf_1, x = ag__.utils.alias_tensors(tf, x_1)
        x = tf_1.tanh(x)
        return x,
  x = ag__.while_stmt(loop_test, loop_body, (x,), (tf,))
  return x
Going big: [Link]
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=[10]),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Going big: Multi-GPU
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, input_shape=[10]),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')])
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
Learning more
Latest tutorials and guides
● [Link]/beta
Books
● Hands-on ML with Scikit-Learn, Keras and TensorFlow (2nd edition)
● Deep Learning with Python
For details
● [Link]