05 Transfer Learning with TensorFlow
Part 2: Fine-tuning
Where can you get help?
“If in doubt, run the code”
Model learns patterns/weights from a similar problem space. Patterns get used/tuned to the specific problem.
Why use transfer learning?
• Can leverage an existing neural network architecture proven to work on problems similar to our own
• Can leverage a working network architecture which has already learned patterns on similar data to our own (often achieving great results with less data)
[Figure: fine-tuning a working architecture (e.g. EfficientNet) pre-trained on ImageNet. The bottom layers stay the same (frozen, or may stay frozen), while the top layers (e.g. Layers 234 and 235) change: they get unfrozen and fine-tuned on a custom dataset (e.g. 10 classes of food). Note: fine-tuning usually requires more data than feature extraction.]
• Data augmentation (making your training set more diverse without adding samples), as sketched below
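A minimal sketch of a data augmentation stage built from Keras preprocessing layers (available as tf.keras.layers in TF 2.6+; the specific layers and parameter values here are illustrative choices, not fixed by the course):

```python
import tensorflow as tf

# Data augmentation as a stack of Keras preprocessing layers.
# Applied during training, these randomly transform each batch,
# making the training set more diverse without adding samples.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # random horizontal flips
    tf.keras.layers.RandomRotation(0.2),       # rotate up to ±20% of a full turn
    tf.keras.layers.RandomZoom(0.2),           # zoom in/out by up to 20%
], name="data_augmentation")
```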
👩‍🍳 👩‍🔬
(we’ll be cooking up lots of code!)
How:
Let’s code!
Dataset shapes
“Create batches of 32 images of size 224x224 split into red, green, blue colour channels.”
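In code, that sentence maps onto tf.keras.preprocessing.image_dataset_from_directory (a sketch; the directory path below is a placeholder):

```python
import tensorflow as tf

IMG_SIZE = (224, 224)  # resize images to 224x224 (RGB supplies the 3 colour channels)

# Create batches of 32 images of size 224x224x3 from a directory of class folders
train_data = tf.keras.preprocessing.image_dataset_from_directory(
    "10_food_classes_10_percent/train",  # placeholder path
    label_mode="categorical",            # one-hot labels for multi-class
    image_size=IMG_SIZE,
    batch_size=32)                       # 32 is also the default batch size
```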
• Differences: model construction (the Functional API is more flexible and able to produce more sophisticated models), as contrasted in the sketch below
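A quick sketch of the construction difference (toy layer sizes, purely to contrast the two styles):

```python
import tensorflow as tf

# Sequential API: a plain stack of layers, one input, one output
sequential_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Functional API: layers are called on tensors, which allows branching,
# multiple inputs/outputs and wrapping pretrained models as layers
inputs = tf.keras.layers.Input(shape=(224, 224, 3))
x = tf.keras.layers.GlobalAveragePooling2D()(inputs)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
functional_model = tf.keras.Model(inputs, outputs)
```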
Building a feature extraction model with the Keras Functional API
Useful computer vision architectures
• tf.keras.applications and keras.applications have many of the most popular and best-performing computer vision architectures built-in & pre-trained, ready to use for your own problems
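For example (ImageNet weights download automatically on first use; include_top=False drops the built-in classifier head so you can attach your own):

```python
import tensorflow as tf

# Two popular pretrained backbones, ready for transfer learning
resnet = tf.keras.applications.ResNet50V2(include_top=False)
efficientnet = tf.keras.applications.EfficientNetB0(include_top=False)
```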
[Diagram: Input data → Model → Output (feature vector). The model learns a feature representation of the input data.]
EfficientNet feature extractor
[Figure: EfficientNetB0 as a feature extractor. Input data (10 classes of Food101) flows through the EfficientNetB0 architecture, which stays the same (frozen, pre-trained on ImageNet). Only the fully-connected (dense) classifier layer changes, with output shape 10 (the same shape as the number of classes). EfficientNetB0 architecture source: https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html]
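A sketch of the diagram above as Keras Functional API code (layer names are illustrative; the 10 output units match the 10 food classes):

```python
import tensorflow as tf

# 1. Create the base model, pre-trained on ImageNet, with no top
base_model = tf.keras.applications.EfficientNetB0(include_top=False)
base_model.trainable = False  # freeze: the base stays the same

# 2. Build the feature extraction model with the Functional API
inputs = tf.keras.layers.Input(shape=(224, 224, 3), name="input_layer")
x = base_model(inputs)                                # extract features
x = tf.keras.layers.GlobalAveragePooling2D(
    name="global_average_pooling_layer")(x)           # pool feature maps into a feature vector
outputs = tf.keras.layers.Dense(
    10, activation="softmax", name="output_layer")(x) # same shape as number of classes
model_0 = tf.keras.Model(inputs, outputs)

model_0.compile(loss="categorical_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=["accuracy"])
```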
EfficientNet fine-tuning
[Figure: EfficientNetB0 fine-tuning. Input data (10 classes of Food101). Most of the architecture stays the same (frozen, pre-trained on ImageNet) while the top changes (unfrozen, gets fine-tuned on custom data), along with the fully-connected (dense) classifier layer (output shape 10). Layers closer to the output layer get unfrozen/fine-tuned first; bottom layers tend to stay frozen (or are last to get unfrozen). EfficientNetB0 architecture source: https://ai.googleblog.com/2019/05/efficientnet-improving-accuracy-and.html]
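A sketch of how the unfreezing in the diagram might look in code, continuing from the feature extraction model above (how many layers to unfreeze, and the lower learning rate, are judgment calls rather than fixed rules):

```python
# Unfreeze the whole base model...
base_model.trainable = True

# ...then re-freeze every layer except the last 10, since layers
# closer to the output layer get unfrozen/fine-tuned first
for layer in base_model.layers[:-10]:
    layer.trainable = False

# Recompile with a lower learning rate; a common rule of thumb for
# fine-tuning is ~10x lower than for feature extraction
model_0.compile(loss="categorical_crossentropy",
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                metrics=["accuracy"])
```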
Modelling experiments we’re running
Experiment | Data | Preprocessing | Model
Model 0 (baseline) | 10 classes of Food101 data (random 10% of training data only) | None | Feature extractor: EfficientNetB0 (pre-trained on ImageNet, all layers frozen) with no top
Dataset name | Source | Classes | Training data | Test data
pizza_steak | Food101 | Pizza, steak (2) | 750 images of pizza and steak (same as original Food101 dataset) | 250 images of pizza and steak (same as original Food101 dataset)
Callback | Use case | Code
TensorBoard | Log the performance of multiple models and then view and compare these models in a visual way on TensorBoard (a dashboard for inspecting neural network parameters). Helpful to compare the results of different models on your data. | tf.keras.callbacks.TensorBoard()
Model checkpointing | Save your model as it trains so you can stop training if needed and come back to continue off where you left off. Helpful if training takes a long time and can't be done in one sitting. | tf.keras.callbacks.ModelCheckpoint()
Early stopping | Leave your model training for an arbitrary amount of time and have it stop training automatically when it ceases to improve. Helpful when you've got a large dataset and don't know how long training will take. | tf.keras.callbacks.EarlyStopping()
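A sketch of creating the three callbacks above (all directory and file paths are placeholders):

```python
import tensorflow as tf

# TensorBoard: log metrics for visual comparison across experiments
tensorboard_callback = tf.keras.callbacks.TensorBoard(
    log_dir="tensorboard_logs/model_0")  # placeholder path

# ModelCheckpoint: save weights during training so long runs can resume
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/model_0.ckpt",  # placeholder path
    save_weights_only=True,
    save_best_only=True,
    monitor="val_loss")

# EarlyStopping: stop automatically when the model ceases to improve
early_stopping_callback = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3)  # epochs without improvement before stopping

# Pass them to fit, e.g.:
# model.fit(..., callbacks=[tensorboard_callback,
#                           checkpoint_callback,
#                           early_stopping_callback])
```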
Original Model vs. Feature Extraction
[Figure: original model vs. feature extraction vs. fine-tuning. The original model (e.g. EfficientNet) is trained on a large dataset (e.g. ImageNet) and has an output layer of shape 1000. In feature extraction, the original model's layers stay the same (frozen, they don't update during training) and only the output layer(s) (shape 10) get trained on the new data. In fine-tuning, the bottom layers stay the same (frozen) or might change, while the top layers (e.g. Layers 234 and 235) change (unfrozen) and the model trains on a different dataset (e.g. 10 classes of food). Note: fine-tuning usually requires more data than feature extraction.]
Feature extraction: take the underlying patterns (also called weights) a pretrained model has learned and adjust its outputs to be more suited to your problem. Most of the layers in the original model remain frozen during training (only the top 1-3 layers get updated). Helpful if you have a small amount of custom data (similar to what the original model was trained on) and want to utilise a pretrained model to get better results on your specific problem.
Source: https://tensorboard.dev/experiment/73taSKxXQeGPQsNBcVvY3g/#scalars