Handwritten Digit Recognition
Machine Learning
Introduction
Using the DeepLearning package, this application trains a neural network to recognize the numbers
in images of handwritten digits. The trained neural network is then applied to a number of test
images.
The training and testing images are a very small subset of the MNIST database of handwritten digits;
these consist of 28 x 28 pixel images of a handwritten digit, ranging from 0 to 9. (A sample
image of the digit zero appears here in the worksheet.)
Ultimately, this application generates a vector of weights for each digit; think of the weights as a
marking grid for a multiple-choice exam. When reshaped into a 28 x 28 matrix, the weight vector
for the digit 0 can be visualized as a heat map (see Visualize Weights below).
If a pixel with a high intensity lands in the red area, the evidence is high that the
handwritten digit is zero.
Conversely, if a pixel with a high intensity lands in the blue area, the evidence is low that
the handwritten digit is zero.
https://www.tensorflow.org/versions/r1.1/get_started/mnist/beginners
https://www.oreilly.com/learning/not-another-mnist-tutorial-with-tensorflow
Notes
Introduction
We first build a computational (or dataflow) graph. Then, we create a TensorFlow session to run
the graph.
Images
Each 28 x 28 image is flattened into a list with 784 elements.
Once flattened, the training images are stored in a tensor x with a shape of [none, 784]. The first
index is the number of training images ("none" means that we can use an arbitrary number of
training images).
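The worksheet does this flattening with Maple tools; purely as an illustration of the shapes involved, here is the same idea sketched in NumPy (the zero-valued image is a stand-in for real pixel data):

```python
import numpy as np

# A 28 x 28 image of pixel intensities (zeros as a stand-in for real data)
image = np.zeros((28, 28))
flat = image.reshape(784)          # flattened into a 784-element vector

# A batch of training images stacks such vectors: shape [num_images, 784]
x_batch = np.stack([flat, flat])
```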
Labels
Each training image is associated with a label.
So for an image that displays the digit 5, the label is [0,0,0,0,0,1,0,0,0,0]. This is known as a
one-hot encoding.
All the labels are stored in a tensor y_ with a shape of [none, 10].
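One-hot encoding can be sketched in a few lines of NumPy (illustrative only; the worksheet builds the same labels with ListTools:-Rotate below):

```python
import numpy as np

def one_hot(digit, num_classes=10):
    """Label vector with a 1 at the digit's position and 0 elsewhere."""
    label = np.zeros(num_classes)
    label[digit] = 1.0
    return label

# The label for the digit 5:
label_5 = one_hot(5)   # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```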
Training
The neural network is trained via multinomial logistic regression (also known as softmax
regression).
Step 1
Calculate the evidence that each image is in a selected class. Do this by performing a weighted
sum of the pixel intensities for the flattened image:

evidence[i] = sum(W[j, i] * x[j], j = 1 .. 784) + b[i]

where
W[j, i] and b[i] are the weight and the bias for pixel j and digit i. Think of W as a matrix
with 784 rows (one for each pixel) and 10 columns (one for each digit), and b as a vector
with 10 entries (one for each digit)
x[j] is the intensity of pixel j
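The weighted sum above can be sketched in NumPy (random values stand in for trained weights; in the worksheet, W and b come from training):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(784)                  # pixel intensities of one flattened image
W = rng.standard_normal((784, 10))   # 784 rows (pixels) x 10 columns (digits)
b = np.zeros(10)                     # one bias per digit

evidence = x @ W + b                 # evidence[i] = sum_j W[j, i] * x[j] + b[i]
```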
Step 2
Normalize the evidence into a vector of probabilities with softmax:

y[i] = exp(evidence[i]) / sum(exp(evidence[k]), k = 1 .. 10)
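A minimal NumPy sketch of softmax:

```python
import numpy as np

def softmax(evidence):
    # Subtracting the max first avoids overflow; it does not change the result.
    e = np.exp(evidence - np.max(evidence))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))   # positive entries summing to 1
```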
Step 3
For each image, calculate the cross-entropy of the vector of predicted probabilities and the
actual probabilities (i.e. the labels):

H(y_, y) = -sum(y_[i] * log(y[i]), i = 1 .. 10)

where
y_ is the true distribution of probabilities (i.e. the one-hot encoded label)
y is the predicted distribution of probabilities
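Cross-entropy sketched in NumPy; with a one-hot y_, it reduces to minus the log of the probability assigned to the true digit (the example probabilities below are made up):

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    """H(y_, y) = -sum_i y__i * log(y_i)."""
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0])   # one-hot label for the digit 5
y_pred = np.full(10, 0.05)                           # predicted probabilities
y_pred[5] = 0.55                                     # (sums to 1)
loss = cross_entropy(y_true, y_pred)                 # equals -log(0.55)
```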
Step 4
The mean cross-entropy across all training images is then minimized to find the optimum values
of W and b.
Testing
For each test image, we will generate 10 probabilities (one per digit) that sum to 1. The position
of the highest probability is the predicted value of the digit.
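In the worksheet this lookup is max[index](...) - 1 (Maple indices start at 1); the NumPy equivalent, with made-up probabilities, is:

```python
import numpy as np

# Hypothetical probabilities for one test image (sum to 1); digit 5 dominates
probs = np.array([0.02, 0.03, 0.05, 0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05])
predicted_digit = int(np.argmax(probs))   # position of the highest probability
```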
Miscellaneous
This application consists of
this worksheet
and a very small subset of images from the MNIST handwritten digit database
in a single zip file. The images are stored in folders; the folders should be extracted to the same
location as this worksheet.
Load Packages and Define Parameters
> restart:
with(DeepLearning):
with(DocumentTools):
with(DocumentTools:-Layout):
Generate the labels for digit j, where label[n] is the label for image[n].
> for j from 0 to L - 1 do
    labels[j] := ListTools:-Rotate~([[1,0,0,0,0,0,0,0,0,0]$N], -j)[]:
  end do:
Collect labels
> y_train := [seq(labels[i - 1], i = 1 .. L)]:
Define placeholders x and y_ to feed the training images and labels into the graph.
> x := Placeholder(float[4], [none, 784]):
y_ := Placeholder(float[4], [none, L]):
For each test image, generate 10 probabilities that the digit is a number from 0 to 9
> pred := sess:-Run(y, {x in x_train})
(6.2)
For each test image, find the predicted digit associated with the greatest probability
> predList := seq( max[index]( pred[i, ..] ) - 1, i = 1 .. T )
(6.3)
9 1 0 5 3 4 5 2 8 3 8 2 5 2 4 6 8 4 1 1 1 6 7 5 7
7 4 7 7 8 7 5 5 7 5 5 3 3 0 6 7 7 4 3 4 0 3 2 3 7
Visualize Weights
Generate the weights associated with each digit
> weights := sess:-Run(W)
(7.1)
(7.2)
Reshape the 784-element vector of weights for each label into a 28 x 28 Matrix
> V := Vector():
  for i from 1 to L do
    w := ArrayTools:-Reshape(weights[.., i], [1 .. 28, 1 .. 28])^%T:
    V(i) := Statistics:-HeatMap(w, color = [blue, white, "DarkRed"],
        size = [200, 200], labels = ["", ""], axes = none,
        range = minRange .. maxRange):
  end do:
> plots:-display(V^%T)