MNIST Assignment Architecture Tuning and Realtime Processing
NAME______________
Model Persistence
It is no good having to train your network every time you want to use it. We need the model to
persist on disk, so it survives the program ending and the computer being turned off.
1. Find out the Keras Python code to save your model to a file. Enter it below:
model.save("path")
2. Find out the Keras Python code to load your model from a file on disk. Enter it
below:
model = keras.models.load_model("path")
Pretty easy, isn’t it?
3. When saving models, it's no good if we cannot remember which one was the
best. Imagine you have variables called LR and acc, which store the learning rate and the accuracy
after evaluation.
How could we save a model to disk with a name like model Acc 91.1 LR 0.4, so we can sort models in File
Explorer from best to worst?
Enter the code below:
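One possible answer, sketched as a hedged example: an f-string builds the filename from the two variables. The helper name and the .keras extension are my own choices, not part of the assignment.

```python
def model_filename(acc, LR):
    # Build a name like "model Acc 91.1 LR 0.4.keras" so files sort by accuracy
    return f"model Acc {acc} LR {LR}.keras"

# After evaluation you might then save with:
# model.save(model_filename(acc, LR))
print(model_filename(91.1, 0.4))  # model Acc 91.1 LR 0.4.keras
```

Note that plain decimal numbers sort alphabetically in File Explorer, which works here because all accuracies have the same number of digits; zero-padding the number would make the sort robust in general.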
Tuning Hyperparameters
4. Choose a suitable number of epochs (between 10 and 30 passes through the data set) and use your network
tuning skills to adjust the learning rate many times to get the best possible learning rate for the MNIST
handwritten digits dataset.
Hint: it will be somewhere between 1 and 0.01.
What learning rate did you choose? Learning Rate: 0.2
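A minimal sketch of how this search could be automated, assuming a hypothetical build_model() helper that returns your MNIST network; the Keras training calls are commented so the sketch stands alone.

```python
# Candidate learning rates spanning the hinted range, from 1 down to 0.01
learning_rates = [1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01]

for lr in learning_rates:
    # Assumed training code (build_model() is hypothetical, not from the assignment):
    # model = build_model()
    # model.compile(optimizer=keras.optimizers.SGD(learning_rate=lr),
    #               loss="categorical_crossentropy", metrics=["accuracy"])
    # model.fit(x_train, y_train, epochs=20, verbose=0)
    # loss, acc = model.evaluate(x_test, y_test, verbose=0)
    print("trying learning rate", lr)
```

Keeping the epoch count fixed while varying only the learning rate makes the runs comparable.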
Extension:
There are other hyperparameters aside from the learning rate… investigate and experiment.
6. How many outputs should we have? Number of outputs: 10
Note: this network only has one output, which means it would only be suitable for a yes/no (above 0.5 / below 0.5)
classification. Your network will be different.
The classification task dictates the numbers of nodes on the input and output layers, but we are still free to use any
number of nodes and layers in the hidden (middle) layers of the network.
I want you to try 7 different values for the number of nodes in the hidden layer.
What is the range of numbers you will use for the hidden layer?
20 up to and including 80, step 10
FYI, Keras' model.add(Dense()) function adds a dense (fully connected) layer like the ones shown in this diagram.
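For reference, a hedged sketch of how those Dense layers might be stacked; the hidden size here is a placeholder, and the Keras lines are commented because they assume your existing setup.

```python
input_nodes = 28 * 28   # MNIST images are 28x28 pixels, flattened
hidden_nodes = 50       # free choice, e.g. somewhere in your 20 to 80 range
output_nodes = 10       # fixed by the task: one output per digit, 0 to 9

# Assumed Keras usage (commented so the sketch stands alone):
# from tensorflow import keras
# model = keras.Sequential()
# model.add(keras.layers.Dense(hidden_nodes, activation="sigmoid",
#                              input_shape=(input_nodes,)))
# model.add(keras.layers.Dense(output_nodes, activation="softmax"))
print(input_nodes, hidden_nodes, output_nodes)
```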
The easiest way is to use a for loop: ensure that models and results are being saved, and leave the computer training
overnight. It should save the models as it goes, with names that tell you the number of nodes.
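A sketch of that overnight loop, under the same assumptions as before (build_model is a hypothetical helper taking the hidden-layer size; the training calls are commented):

```python
hidden_sizes = list(range(20, 81, 10))  # 20, 30, ..., 80: seven values

for nodes in hidden_sizes:
    filename = f"model Hidden {nodes}.keras"  # name records the node count
    # model = build_model(hidden_nodes=nodes)  # hypothetical helper
    # model.fit(x_train, y_train, epochs=20)
    # loss, acc = model.evaluate(x_test, y_test)
    # model.save(filename)                     # saved as the loop goes
    print(filename)
```

Saving inside the loop means a crash partway through the night only loses the current run, not the earlier ones.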
9. Investigate networks with two hidden layers and fill in the table.
Pretty sure something went horribly wrong here; please check my
code to see whether something was done wrong.
Here we are using a step size of 10 because there's probably not much difference between a network with 20 nodes
in layer 3 and one with 21 nodes in layer 3.
You can automate this investigation using nested For loops like this:
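A hedged sketch of those nested loops, again with the Keras calls commented and build_model as a hypothetical helper:

```python
# Every combination of sizes for the two hidden layers, step 10 as before
layer1_sizes = range(20, 81, 10)
layer2_sizes = range(20, 81, 10)

filenames = []
for n1 in layer1_sizes:
    for n2 in layer2_sizes:
        # model = build_model(hidden1=n1, hidden2=n2)  # hypothetical helper
        # model.fit(x_train, y_train, epochs=20)
        # model.save(f"model H1 {n1} H2 {n2}.keras")   # save as you go
        filenames.append(f"model H1 {n1} H2 {n2}.keras")
print(len(filenames), "combinations")
```

Seven sizes per layer gives 49 combinations, which is why splitting the range across two computers (as suggested below) helps.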
Also, these networks might start to take a long time to train. It may be an idea to team up with someone, or to use two
computers: one does the range 20 to 30 and the other does 40 to 50, for example.