MNIST Assignment Architecture Tuning and Realtime Processing

Year 10 Data Science – Assignment

NAME______________

Tuning, Stored Models and


Real-time Processing with
Neural Nets
Introduction
"Model" is another name for a trained neural network. In this assignment we train networks and save
them to disk along with their accuracy. We change hyperparameters and network architectures, running
many training sessions from random starting points and spending many hours of CPU time. We then select the
best model of all and use it in a real-time application with constantly updating predictions.

Model Persistence
It is no good having to retrain your network every time you want to use it. We need it to persist on
disk so it survives the program ending and the computer being turned off.

1. Find out the Keras Python code to save your model to a file. Enter it below:

model.save("path")

2. Find out the Keras Python code to load your model back from a file on disk. Enter it
below:

model = keras.models.load_model("path")
Pretty easy, isn’t it?
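As a quick sanity check, the two answers above can be combined into a round trip. This is only a sketch: the tiny stand-in model and the file name `my_model.keras` are placeholders, and your trained MNIST network would go in its place.

```python
from tensorflow import keras

# A tiny stand-in model (your trained MNIST network goes here instead)
model = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd", loss="categorical_crossentropy")

model.save("my_model.keras")                          # write the model to disk
reloaded = keras.models.load_model("my_model.keras")  # read it back into a new variable
```

The reloaded model makes exactly the same predictions as the original, which is what persistence is for.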

3. When saving models, it's no good if we cannot remember which one was the
best. Imagine you have variables called LR and acc, which store the learning rate and the accuracy
after evaluation.

How could we save a model to disk with a name like model Acc 91.1 LR 0.4 so we can sort models in File
Explorer from best to worst?
Enter the code below:

model.save(f"path\\model Acc {acc} LR {LR}")
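A minimal sketch of the name formatting on its own, assuming acc holds the accuracy as a fraction (as model.evaluate returns it) and LR the learning rate. The values here are made up for illustration:

```python
LR = 0.4
acc = 0.911  # fraction returned by evaluation

# Build a file name that shows accuracy as a percentage to one decimal place
filename = f"models/model Acc {acc * 100:.1f} LR {LR}"
print(filename)  # models/model Acc 91.1 LR 0.4
```

One design note: File Explorer sorts names as text, so an accuracy of 100.0 would sort before 91.1. Zero-padding the number (e.g. `{acc * 100:05.1f}`, giving 091.1) keeps the text order matching the numeric order.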

Tuning Hyperparameters

4. Choose a suitable number of epochs (between 10 and 30 passes through the dataset) and use your network
tuning skills to adjust the learning rate many times to get the best possible learning rate for the MNIST
handwritten digits dataset.
Hint: it will be somewhere between 1 and 0.01.

What learning rate did you choose?
Learning Rate: 0.2

Extension:
There are other hyperparameters aside from the learning rate… investigate and experiment.
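One way to automate the learning-rate search in question 4 is a simple loop over candidate rates. This is a sketch only: the 30-node sigmoid hidden layer is an illustrative choice, and the fit/evaluate calls are left as comments because they need the MNIST data (trainX, trainY, etc.) loaded first.

```python
from tensorflow import keras
from tensorflow.keras.layers import Dense

def build_model(lr):
    """One hidden layer of 30 nodes, compiled with SGD at the given learning rate."""
    model = keras.Sequential([
        keras.Input(shape=(784,)),
        Dense(30, activation="sigmoid"),
        Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=lr),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Candidate rates between 1 and 0.01; fit and evaluate each, keep the best
for lr in [1.0, 0.5, 0.2, 0.1, 0.05, 0.01]:
    model = build_model(lr)
    # model.fit(trainX, trainY, epochs=20) and model.evaluate(testX, testY)
    # would go here once the MNIST data is loaded
```

Rebuilding the model inside the loop matters: reusing one model would carry trained weights from one rate to the next and spoil the comparison.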

Optimising Network Architecture


The MNIST dataset dictates the number of inputs our network must have, and the number of classes in our
classification problem dictates the number of output nodes we must have.

5. The MNIST dataset consists of images with 28*28 pixels.
How many inputs must we therefore have?
Number of inputs: 784

6. How many outputs should we have?
Number of outputs: 10

Note: the example network shown only has one output – which means it would only be suitable for a yes/no (>0.5 / <0.5)
classification. Your network will be different.

The classification task dictates the numbers of nodes on the input and output layers, but we are still free to use any
number of nodes and layers in the hidden (middle) layers of the network.

The network we built in class has just one hidden layer.

7. Draw the starting network diagram:


Note: you don't need to draw every single node; you can
use dots (…) to indicate a range of nodes.
8. You will build and train several different networks, each with a different size of hidden layer.

I want you to try 7 different values for the number of nodes in the hidden layer.

What is the range of numbers you will use for the hidden layer?
20 up to and including 80, step 10

and put the results in this table:

Layer 2 # nodes    F1 Score

20                 0.9637697690534851
30                 0.9766043311488675
40                 0.9697217751012853
50                 0.9766732253122498
60                 0.9776018679394334
70                 0.019165469713456656
80                 0.023138302649304

FYI: Keras' model.add(Dense()) function adds a dense (fully connected) layer like the ones shown in this diagram.
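Each pass of the loop below can rebuild the network with model.add. A minimal sketch, assuming a sigmoid hidden layer and the learning rate from question 4 (both just illustrative choices):

```python
from tensorflow import keras
from tensorflow.keras.layers import Dense

numOfNodesL2 = 30  # in the loop, this value comes from the loop variable

model = keras.Sequential()
model.add(keras.Input(shape=(784,)))                  # 28*28 pixel inputs
model.add(Dense(numOfNodesL2, activation="sigmoid"))  # the hidden layer being varied
model.add(Dense(10, activation="softmax"))            # one output per digit class
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.2),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```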

The easiest way is to use a for loop; ensure that models and results are being saved, and leave the computer training
overnight. It should save the models as it goes, with names that tell you the number of nodes.

for numOfNodesL2 in range(20, 90, 10):  # 20, 30, ... up to 80

    # TODO: write code to build each network using numOfNodesL2 here

    # train the network and record the training history
    history = model.fit(…)

    # evaluate by predicting with the test data (that's testX)
    predictions = model.predict(testX, batch_size=128)

    # report how the predictions match up with the test labels (testY);
    # output_dict=True makes the report a dictionary we can read values from
    report = classification_report(testY.argmax(axis=1),
                                   predictions.argmax(axis=1),
                                   target_names=[str(x) for x in lb.classes_],
                                   output_dict=True)
    f1score = report["weighted avg"]["f1-score"]

Submit your code for this question!

9. Investigate networks with two hidden layers and fill in the table.

F1 Scores                       Number of nodes in Layer 3

Nodes in layer 2    20                      30                      40                      50
10                  0.9649813411388668      0.9671878049064717      0.9712019099899467      0.023138302649304
20                  0.023138302649304       0.023138302649304       0.023138302649304       0.017493624772313296
30                  0.019165469713456656    0.023138302649304       0.023138302649304       0.023138302649304
40                  0.023138302649304       0.023138302649304       0.023138302649304       0.023138302649304
Submit the classification report for each of these cases.

Pretty sure something went horribly wrong here; please check my
code to see if something is done wrong.

Here we are using a step size of 10 because there’s probably not much difference between a network with 20 layer 3
nodes and one with 21 layer 3 nodes.

You can automate this investigation using nested For loops like this:

for nodesL2 in range(10, 50, 10):      # 10, 20, 30, 40 (rows of the table)

    for nodesL3 in range(20, 60, 10):  # 20, 30, 40, 50 (columns of the table)

        # write code to build the network using nodesL2 and nodesL3 here

        # train the network and record the training history
        history = model.fit(…)

        # evaluate by predicting with the test data (that's testX)
        predictions = model.predict(testX, batch_size=128)

        # report how the predictions match up with the test labels (testY)
        report = classification_report(testY.argmax(axis=1),
                                       predictions.argmax(axis=1),
                                       target_names=[str(x) for x in lb.classes_],
                                       output_dict=True)
        f1score = report["weighted avg"]["f1-score"]

Submit your code for this question!

Also, these networks might start to take a long time to train. It may be an idea to team up with someone or use two
computers: one does range 20 to 30 and the other does range 40 to 50 for example.
