
Homework #3
EE 541: Fall 2023
Assigned: 25 September

1. An MLP has two input nodes, one hidden layer, and two outputs. Recall that the output for layer $l$ is given by $a^{(l)} = h_l\!\left(W_l\, a^{(l-1)} + b_l\right)$. The two sets of weights and biases are given by:

$$W_1 = \begin{bmatrix} 1 & -2 \\ 3 & 4 \end{bmatrix}, \quad b_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad W_2 = \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 0 \\ -4 \end{bmatrix}$$

The non-linear activation for the hidden layer is ReLU (rectified linear unit) – that is, $h(x) = \max(x, 0)$. The output layer is linear (i.e., identity activation function). What is the output activation for input $x = [+1, -1]^T$?
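If you want to check your hand computation, a minimal numpy sketch of the forward pass is below; it simply encodes the matrices given above (running it reveals the answer, so do the calculation by hand first).

    import numpy as np

    # Weights and biases from the problem statement
    W1 = np.array([[1, -2],
                   [3,  4]])
    b1 = np.array([1, 0])
    W2 = np.array([[2,  2],
                   [2, -3]])
    b2 = np.array([0, -4])

    x = np.array([1, -1])            # input x = [+1, -1]^T

    a1 = np.maximum(W1 @ x + b1, 0)  # hidden layer: ReLU activation
    a2 = W2 @ a1 + b2                # output layer: identity (linear) activation
    print(a2)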

2. The HDF5 format can store multiple data objects in a single file, each keyed by object name – e.g., a single file can store a numpy float array called regressor and a numpy integer array called labels. HDF5 also allows fast non-sequential access to objects without scanning the entire file. This means you can efficiently access objects and data such as x[idxs] with non-consecutive indexes, e.g., idxs = [2, 234, 512]. This random-access property is useful when extracting a random subset from a larger training database.

In this problem you will create an HDF5 file containing a numpy array of binary random sequences you generate yourself. Follow these steps:

(1) Run the provided template python file (random binary collection.py). The script is set to DEBUG mode by default.

(2) Experiment with the assert statements to trap errors and understand what they are doing, e.g., by using the shape attribute on numpy arrays, etc.

(3) Set the DEBUG flag to False. Manually create 25 binary sequences, each with length 20. It is important that you do this by hand, i.e., do not use a coin, computer, or random number generator.

(4) Verify that your HDF5 file was written properly by checking that it can be read back.

(5) Submit your HDF5 file as directed.
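As a rough illustration (not a replacement for the provided template), writing a 25 × 20 binary array to an HDF5 file and reading back a non-consecutive subset might look like the sketch below. The file name 'binary_sequences.hdf5' and the dataset key 'sequences' are placeholders, not required names.

    import numpy as np
    import h5py

    # 25 binary sequences of length 20 (the zeros here are placeholders --
    # fill in your own hand-generated 0/1 values).
    sequences = np.zeros((25, 20), dtype=np.int64)

    # Write the array to an HDF5 file under a named key.
    with h5py.File('binary_sequences.hdf5', 'w') as hf:
        hf.create_dataset('sequences', data=sequences)

    # Read it back to verify the file, including non-consecutive row access.
    with h5py.File('binary_sequences.hdf5', 'r') as hf:
        x = hf['sequences']
        subset = x[[2, 7, 19]]            # fancy indexing: rows 2, 7, 19 only
        assert x.shape == (25, 20)
        assert subset.shape == (3, 20)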


3. Logistic regression

The MNIST dataset of handwritten digits is one of the earliest and most used datasets for benchmarking machine learning classifiers. Each datapoint contains 784 input features – the pixel values from a 28 × 28 image – and belongs to one of 10 output classes, represented by the digits 0-9.

In this problem you will use numpy to classify input images using logistic regression. Use only Python standard library modules, numpy, and matplotlib for this problem.

(a) Logistic “2” detector

In this part you will use the provided MNIST handwritten-digit data to build and train a logistic "2" detector:

$$y = \begin{cases} 1 & x \text{ is a ``2''} \\ 0 & \text{else.} \end{cases}$$

A logistic classifier takes a learned weight vector $w = [w_1, w_2, \ldots, w_L]^T$ and the unregularized bias offset $w_0 \triangleq b$ to estimate the probability that an input vector $x = [x_1, x_2, \ldots, x_L]^T$ is a "2":

$$p(x) = P[Y = 1 \mid x, w] = \frac{1}{1 + \exp\!\left(-\left(\sum_{k=1}^{L} w_k x_k + w_0\right)\right)} = \frac{1}{1 + \exp\!\left(-(w^T x + w_0)\right)}.$$

Train a logistic classifier to find weights that minimize the binary log-loss (also called the binary cross entropy loss):

$$\ell(w) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p(x^{(i)}) + (1 - y_i) \log\!\left(1 - p(x^{(i)})\right) \right]$$

where the sum is over the $N$ samples in the training set. Train your model until convergence according to some metric you choose. Experiment with variations of $\ell_1$- and/or $\ell_2$-regularization to stabilize training and improve generalization.
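One possible structure for the training loop is sketched below: plain batch gradient descent on the binary cross entropy with an optional $\ell_2$ penalty. The function name, learning rate, iteration count, and regularization strength are illustrative assumptions, not required settings.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_logistic(X, y, lr=0.1, n_iters=500, lam=0.0):
        """Batch gradient descent on the binary cross entropy.
        X: (N, 784) inputs scaled to [0, 1]; y: (N,) 0/1 labels ("is a 2").
        lam: optional l2 regularization strength (0 disables it)."""
        N, L = X.shape
        w = np.zeros(L)
        b = 0.0
        for _ in range(n_iters):
            p = sigmoid(X @ w + b)             # predicted P[Y = 1 | x, w]
            err = p - y                        # per-sample gradient w.r.t. the logit
            grad_w = X.T @ err / N + lam * w   # l2 term is not applied to the bias
            grad_b = err.mean()
            w -= lr * grad_w
            b -= lr * grad_b
        return w, b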

Submit answers to the following.

i. How did you determine a learning rate? What values did you try? What was your final value?

ii. Describe the method you used to establish model convergence.

iii. What regularizers did you try? Specifically, how did each impact or improve your model's performance?

iv. Plot the log-loss (i.e., learning curve) of the training set and test set on the same figure. On a separate figure plot the accuracy of your model on the training set and test set. Plot each as a function of the iteration number.

v. Classify each input to the binary output "digit is a 2" using a 0.5 threshold. Compute the final loss and final accuracy for both your training set and test set.

Submit your trained weights to Autolab. Save your weights and bias to an hdf5 file. Use keys w and b for the weights and bias, respectively. w should be a length-784 numpy vector/array and b should be a numpy scalar. Use the following as guidance:
Use the following as guidance:

with h5py.File(outFile, 'w') as hf:
    hf.create_dataset('w', data=np.asarray(weights))
    hf.create_dataset('b', data=np.asarray(bias))

Note: you will not be scored on your model's overall accuracy. But a low score may indicate errors in training or poor optimization.

(b) Softmax classification: gradient descent (GD)

In this part you will use soft-max to perform multi-class classification instead of distinct "one against all" detectors. The target vector is

$$[Y]_l = \begin{cases} 1 & x \text{ is an ``}l\text{''} \\ 0 & \text{else} \end{cases}$$

for $l = 0, \ldots, K-1$. You can alternatively consider a scalar output $Y$ equal to the value in $\{0, 1, \ldots, K-1\}$ corresponding to the class of input $x$. Construct a logistic classifier that uses $K$ separate linear weight vectors $w_0, w_1, \ldots, w_{K-1}$. Compute estimated probabilities for each class given input $x$ and select the class with the largest score among your $K$ predictors:

$$P[Y = l \mid x, w] = \frac{\exp(w_l^T x)}{\sum_{i=0}^{K-1} \exp(w_i^T x)}$$

$$\hat{Y} = \arg\max_l \, P[Y = l \mid x, w].$$

Note that the probabilities sum to 1. Use log-loss and optimize with batch gradient descent. The (negative) likelihood function on an $N$-sample training set is:

$$L(w) = -\frac{1}{N} \sum_{i=1}^{N} \log P\!\left[Y = y^{(i)} \mid x^{(i)}, w\right]$$

where the sum is over the $N$ points in our training set.
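For orientation, a minimal numpy sketch of batch gradient descent on this loss is shown below. The function name, zero initialization, one-hot encoding of the labels, learning rate, and iteration count are illustrative assumptions; the gradient it uses is the one you are asked to derive by hand in part i.

    import numpy as np

    def softmax(Z):
        # Subtract the row-wise max for numerical stability before exponentiating.
        Z = Z - Z.max(axis=1, keepdims=True)
        E = np.exp(Z)
        return E / E.sum(axis=1, keepdims=True)

    def train_softmax_gd(X, y, K=10, lr=0.1, n_iters=500):
        """Batch gradient descent on the multi-class log-loss.
        X: (N, 784) inputs; y: (N,) integer labels in {0, ..., K-1}."""
        N, L = X.shape
        W = np.zeros((K, L))
        b = np.zeros(K)
        Y = np.eye(K)[y]                  # one-hot targets, shape (N, K)
        for _ in range(n_iters):
            P = softmax(X @ W.T + b)      # P[Y = l | x, w] for every sample and class
            G = (P - Y) / N               # gradient of the loss w.r.t. the logits
            W -= lr * (G.T @ X)
            b -= lr * G.sum(axis=0)
        return W, b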

Submit answers to the following.

i. Compute (by hand) the derivative of the log-likelihood of the soft-max function. Write the derivative in terms of conditional probabilities, the vector x, and indicator functions (i.e., do not write this expression in terms of exponentials). You need this gradient in subsequent parts of this problem.

ii. Implement batch gradient descent. What learning rate did you use?

iii. Plot the log-loss (i.e., learning curve) of the training set and test set on the same figure. On a separate figure plot the accuracy of your model on the training set and test set. Plot each as a function of the iteration number.

iv. Compute the final loss and final accuracy for both your training set and test set.

(c) Softmax classification: stochastic gradient descent

In this part you will use stochastic gradient descent (SGD) in place of the (deterministic) gradient descent above. Test your SGD implementation using single-point updates and a mini-batch size of 100. You may need to adjust the learning rate to improve performance. You can either modify the rate by hand or according to some decay scheme, or you may choose a single learning rate. You should get a final predictor comparable to that in the previous question.
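A rough sketch of how the mini-batch loop could be organized is below; the batch size, learning rate, epoch count, and seed are placeholder assumptions, and recording the log-loss and accuracy every 5,000 samples (as required in the questions that follow) is left to you.

    import numpy as np

    def softmax(Z):
        Z = Z - Z.max(axis=1, keepdims=True)
        E = np.exp(Z)
        return E / E.sum(axis=1, keepdims=True)

    def train_softmax_sgd(X, y, K=10, lr=0.01, batch_size=100, n_epochs=5, seed=0):
        """Mini-batch SGD on the multi-class log-loss.
        batch_size=1 gives single-point updates; batch_size=100 gives the mini-batch variant."""
        rng = np.random.default_rng(seed)
        N, L = X.shape
        W = np.zeros((K, L))
        b = np.zeros(K)
        for _ in range(n_epochs):
            order = rng.permutation(N)                # reshuffle the training set each epoch
            for start in range(0, N, batch_size):
                idx = order[start:start + batch_size]
                Xb = X[idx]
                Yb = np.eye(K)[y[idx]]                # one-hot targets for this mini-batch
                P = softmax(Xb @ W.T + b)
                G = (P - Yb) / len(idx)               # averaged gradient w.r.t. the logits
                W -= lr * (G.T @ Xb)
                b -= lr * G.sum(axis=0)
        return W, b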

Submit answers to the following.

i. Implement SGD with a mini-batch size of 1 (i.e., compute the gradient and update weights after each sample). Record the log-loss and accuracy of the training set and test set every 5,000 samples. Plot the sampled log-loss and accuracy values on the same (respective) figures against the batch number. Your plots should start at iteration 0 (i.e., include the initial log-loss and accuracy). Your curves should show performance comparable to batch gradient descent. How many iterations did it take to achieve comparable performance with batch gradient descent? How does this number depend on the learning rate (or learning rate decay schedule if you have a non-constant learning rate)?

ii. Compare (to batch gradient descent) the total computational complexity to reach a comparable accuracy on your training set. Note that each iteration of batch gradient descent costs an extra factor of N operations, where N is the number of data points.

iii. Implement SGD with a mini-batch size of 100 (i.e., compute the gradient and update weights with the accumulated average after every 100 samples). Record the log-loss and accuracies as above (every 5,000 samples – not 5,000 batches) and create similar curves. Your plots should show performance comparable to batch gradient descent. How many iterations did it take to achieve comparable performance with batch gradient descent? How does this number depend on the learning rate (or learning rate decay schedule if you have a non-constant learning rate)?

iv. Compare the computational complexity to reach comparable performance between the 100-sample mini-batch algorithm, the single-point mini-batch, and batch gradient descent.

Submit your trained weights to Autolab. Save your weights and bias to an hdf5 file. Use keys W and b for the weights and bias, respectively. W should be a 10 × 784 numpy array and b should be a 10 × 1 – shape: (10,) – numpy array. The code to save the weights is the same as (a), substituting W for w.

Note: you will not be scored on your model's overall accuracy. But a low score may indicate errors in training or poor optimization.
