03 Convolution Neural Networks and Computer Vision With Tensorflow

The document discusses convolutional neural networks and computer vision problems. It provides an overview of common computer vision tasks like binary classification, multiclass classification, and object detection. It also outlines the basic architecture of a convolutional neural network, including the inputs and outputs of computer vision problems. The document will cover building CNN models for image classification problems in TensorFlow and evaluating model performance.

Uploaded by

Akbar Shakoor
Copyright
© All Rights Reserved

Convolutional Neural Networks & Computer Vision with TensorFlow

Where can you get help?
“If in doubt, run the code”

• Follow along with the code
• Try it for yourself
• Press SHIFT + CMD + SPACE to read the docstring
• Search for it
• Try again
• Ask (don’t forget the Discord chat!) (yes, including the “dumb” questions)

“What is a computer vision problem?”

Example computer vision problems

“Is this a photo of sushi, steak or pizza?”

• Binary classification (one thing or another)
• Multiclass classification (more than one thing or another)
• Object detection (where’s the thing we’re looking for?)

What we’re going to cover (broadly)

• Getting a dataset to work with (pizza_steak 🍕🥩)
• Architecture of a convolutional neural network (CNN) with TensorFlow
• An end-to-end binary image classification problem
• Steps in modelling with CNNs
• Creating a CNN, compiling a model, fitting a model, evaluating a model
• An end-to-end multi-class image classification problem
• Making predictions on our own custom images

How: 👩🍳 👩🔬 (we’ll be cooking up lots of code!)

Computer vision inputs and outputs

Input: an image with W = 224, H = 224, C = 3 (C = colour channels: R, G, B)

Numerical encoding (normalized pixel values):
[[0.31, 0.62, 0.44…],
[0.92, 0.03, 0.27…],
[0.25, 0.78, 0.07…],
…,

Model: this is often a convolutional neural network (CNN)! (often already exists; if not, you can build one)

Predicted output for 🍣 🥩 🍕 (comes from looking at lots of these):
[[0.97, 0.00, 0.03],
[0.81, 0.14, 0.05],
[0.03, 0.07, 0.90],
…,

Actual output: Sushi 🍣, Steak 🥩, Pizza 🍕

Input and output shapes
(for an image classification example)

Input: a 224 x 224 image (gets represented as a tensor)
[[0.31, 0.62, 0.44…],
[0.92, 0.03, 0.27…],
[0.25, 0.78, 0.07…],
…,

[batch_size, height, width, colour_channels]
Shape = [None, 224, 224, 3]
or
Shape = [32, 224, 224, 3] (32 is a very common batch size)

Model: we’re going to be building CNNs to do this part!

Output for 🍣 🥩 🍕 (prediction probabilities):
[0.00, 0.97, 0.03]
Shape = [3]

These will vary depending on the problem you’re working on.
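
As a rough illustration of the shapes above, the sketch below builds a tiny batch as plain nested lists and reads a class name off a row of prediction probabilities. The helper `shape_of`, the batch of 2 blank images, and the `class_names` list are assumptions made for this example (the class order follows the 🍣 🥩 🍕 columns on the slide); in practice TensorFlow tensors carry their own `.shape`.

```python
def shape_of(nested):
    """Return the shape of a regular nested list as a list of dimension sizes."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


# Hypothetical batch: 2 blank RGB images, each 224 pixels high and 224 wide.
batch = [[[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)] for _ in range(2)]
batch_shape = shape_of(batch)  # [batch_size, height, width, colour_channels]

# Prediction probabilities for one image, in the order sushi, steak, pizza.
class_names = ["sushi", "steak", "pizza"]
pred_probs = [0.00, 0.97, 0.03]

# The predicted class is the one with the highest probability.
best_index = max(range(len(pred_probs)), key=lambda i: pred_probs[i])
predicted_label = class_names[best_index]
```

Here the output has shape [3] because there are three classes; a binary problem would typically use a single output value instead.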
Steps in modelling with TensorFlow

1. Turn all data into numbers (neural networks can’t handle images)
2. Make sure all of your tensors are the right shape
3. Scale features (normalize or standardize, neural networks tend to prefer normalization)
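
Step 3 can be sketched in plain Python as follows, assuming pixel values stored as integers from 0 to 255. The function name `normalize_pixels` and the tiny 2 x 2 image are made up for illustration; with real data this scaling is usually done by the data-loading pipeline.

```python
def normalize_pixels(image, max_value=255.0):
    """Scale every colour channel of every pixel into the range [0, 1]."""
    return [[[channel / max_value for channel in pixel] for pixel in row] for row in image]


# Hypothetical 2 x 2 RGB image with raw 0-255 values.
raw_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [128, 128, 128]],
]

scaled_image = normalize_pixels(raw_image)
```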
“What is a convolutional neural network (CNN)?”

(Typical)* architecture of a CNN
(what we’re working towards building)

Outputs: Steak 🥩, Pizza 🍕

*Note: there is an almost unlimited number of ways you could stack together a convolutional neural network; this slide demonstrates only one.
Let’s code!
Architecture of a CNN
(coloured block edition)
Simple CNN

Deeper CNN
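
The slide images are not reproduced here, so the following is only a minimal sketch of what a simple CNN for the binary pizza/steak problem might look like in Keras. The number of layers, the filter counts, and the choice of optimizer are assumptions for illustration, not the exact architecture from the slides.

```python
import tensorflow as tf

# A small stack: two convolutional layers, one pooling layer, then a
# single sigmoid output unit for binary classification.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),  # height, width, colour channels
    tf.keras.layers.Conv2D(filters=10, kernel_size=3, activation="relu"),
    tf.keras.layers.Conv2D(filters=10, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPool2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    loss="binary_crossentropy",
    optimizer=tf.keras.optimizers.Adam(),
    metrics=["accuracy"],
)
```

A deeper CNN would repeat the Conv2D and MaxPool2D layers more times before the Flatten layer.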
Breakdown of Conv2D layer
Example code: tf.keras.layers.Conv2D(filters=10, kernel_size=(3, 3), strides=(1, 1), padding="same")
Example 2 (same as above): tf.keras.layers.Conv2D(filters=10, kernel_size=3, strides=1, padding="same")

Hyperparameter name | What does it do? | Typical values

Filters | Decides how many filters should pass over an input tensor (e.g. sliding windows over an image). | 10, 32, 64, 128 (higher values lead to more complex models)

Kernel size (also called filter size) | Determines the shape of the filters (sliding windows) that pass over the input. | 3, 5, 7 (lower values learn smaller features, higher values learn larger features)

Padding | Pads the target tensor with zeroes (if "same") to preserve the input shape, or leaves the target tensor as is (if "valid"), lowering the output shape. | "same" or "valid"

Strides | The number of steps a filter takes across an image at a time (e.g. if strides=1, a filter moves across an image 1 pixel at a time). | 1 (default), 2

📖 Resource: For an interactive demonstration of the above hyperparameters, see the CNN explainer website.
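
To make the effect of kernel size, padding and strides on the output shape concrete, here is a small sketch of the size arithmetic along one spatial dimension. The helper name `conv_output_size` is hypothetical; the formulas are the usual ones for "same" and "valid" padding.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension of a convolution."""
    if padding == "same":
        # Zero-padding is added so only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # No padding: the filter must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


same_size = conv_output_size(224, kernel_size=3, stride=1, padding="same")
valid_size = conv_output_size(224, kernel_size=3, stride=1, padding="valid")
strided_size = conv_output_size(224, kernel_size=3, stride=2, padding="same")
```

With a 224-pixel input and a 3 x 3 kernel, "same" padding keeps 224 while "valid" padding trims one pixel from each side.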
What is overfitting?
Overfitting: when a model over-learns patterns in a particular dataset and isn’t able to generalise to unseen data.

For example, a student who studies the course materials too hard and then isn’t able to perform well on the final exam. Or tries to put their knowledge into practice at the workplace and finds what they learned has nothing to do with the real world.

Underfitting | Balanced (goldilocks zone) | Overfitting
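
One rough way to spot overfitting is to compare training and validation loss epoch by epoch: if training loss keeps falling while validation loss starts rising, the model is likely memorising the training set. The helper `first_overfit_epoch` and the loss values below are invented for illustration; this is a simple heuristic, not a definitive test.

```python
def first_overfit_epoch(train_loss, val_loss):
    """Return the first epoch index where training loss falls but
    validation loss rises, or None if that never happens."""
    for epoch in range(1, len(train_loss)):
        train_improving = train_loss[epoch] < train_loss[epoch - 1]
        val_worsening = val_loss[epoch] > val_loss[epoch - 1]
        if train_improving and val_worsening:
            return epoch
    return None


# Made-up loss curves: validation loss turns upward at epoch index 3.
train_loss = [0.70, 0.50, 0.35, 0.22, 0.12]
val_loss = [0.72, 0.55, 0.48, 0.52, 0.61]

overfit_epoch = first_overfit_epoch(train_loss, val_loss)
```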
Improving a model (from a model’s perspective)

(Figure: smaller model vs. larger model)

Common ways to improve a deep model:

• Add layers
• Increase the number of hidden units
• Change the activation functions
• Change the optimization function
• Change the learning rate
• Fit on more data
• Fit for longer

(because you can alter each of these, they’re hyperparameters)

Improving a model (from a data perspective)

Method to improve a model (reduce overfitting) | What does it do?

More data | Gives a model more of a chance to learn patterns between samples (e.g. if a model is performing poorly on images of pizza, show it more images of pizza).

Data augmentation | Increases the diversity of your training dataset without collecting more data (e.g. take your photos of pizza and randomly rotate them 30°). Increased diversity forces a model to learn more generalisable patterns.

Better data | Not all data samples are created equally. Removing poor samples from or adding better samples to your dataset can improve your model’s performance.

Use transfer learning | Take a model’s pre-learned patterns from one problem and tweak them to suit your own problem. For example, take a model trained on pictures of cars to recognise pictures of trucks.
What is data augmentation?
Looking at the same image but from different perspective(s)*.

Original | Rotate | Shift | Zoom

*Note: There are many more kinds of data augmentation, such as cropping, replacing and shearing. This slide only demonstrates a few.
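
The idea can be sketched on a tiny image stored as a nested list. The helpers `shift_right` and `rotate_90_clockwise` are hypothetical names for this example, and a fixed 90° rotation stands in for the random rotations an augmentation pipeline would normally apply.

```python
def shift_right(image, pixels, fill=0):
    """Shift every row to the right, filling the gap on the left."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(column) for column in zip(*image[::-1])]


# Hypothetical 2 x 3 single-channel image.
original = [
    [1, 2, 3],
    [4, 5, 6],
]

shifted = shift_right(original, pixels=1)
rotated = rotate_90_clockwise(original)
```

Each transformed copy shows the model the same content from a slightly different perspective, without collecting any new photos.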
Popular & useful computer vision architectures

Architecture | Release date | Paper | Use in TensorFlow | When to use

ResNet (residual networks) | 2015 | https://arxiv.org/abs/1512.03385 | Find pre-trained versions on TensorFlow Hub or tf.keras.applications | A good backbone for many computer vision problems

EfficientNet(s) | 2019 | https://arxiv.org/abs/1905.11946 | Find pre-trained versions on TensorFlow Hub or tf.keras.applications | Typically now better than ResNets for computer vision

MobileNet(s) | 2017 | https://arxiv.org/abs/1704.04861 | Find pre-trained versions on TensorFlow Hub or tf.keras.applications | Lightweight architecture suitable for devices with less computing power

Steps in modelling with TensorFlow
The machine learning explorer’s motto
“Visualize, visualize, visualize”

• Data
• Model
• Training
• Predictions

It’s a good idea to visualize these as often as possible.

The machine learning practitioner’s motto

“Experiment, experiment, experiment”

👩🍳 👩🔬 (try lots of things and see what tastes good)
