
NPTEL Online Certification Courses

Indian Institute of Technology Kharagpur

Deep Learning
Assignment- Week 4
TYPE OF QUESTION: MCQ/MSQ
Number of questions: 10; Total marks: 10 × 1 = 10
______________________________________________________________________________

QUESTION 1:
Which of the following cannot be realized with a single-layer perceptron (only input and output
layers)?

a. AND
b. OR
c. NAND
d. XOR

Correct Answer: d

Detailed Solution:
A single-layer perceptron cannot implement the XOR gate, because the XOR classes are not linearly separable: no single hyperplane separates its positive and negative examples.
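
A quick way to see this empirically (a minimal NumPy sketch, not part of the original solution): train a perceptron with the classic learning rule on each gate's truth table. The linearly separable gates reach 100% accuracy; XOR never does.

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=0.1):
    """Classic perceptron learning rule; returns final training accuracy."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return np.mean([(1 if xi @ w + b > 0 else 0) == yi for xi, yi in zip(X, y)])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
gates = {"AND": [0, 0, 0, 1], "OR": [0, 1, 1, 1],
         "NAND": [1, 1, 1, 0], "XOR": [0, 1, 1, 0]}
for name, y in gates.items():
    print(f"{name}: accuracy = {train_perceptron(X, np.array(y)):.2f}")
    # AND/OR/NAND reach 1.00; XOR stays below 1.00
```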

QUESTION 2:
For a function f(θ0, θ1), if θ0 and θ1 are initialized at a local minimum, what will the
values of θ0 and θ1 be after a single iteration of gradient descent?

a. θ0 and θ1 will update as per the gradient descent rule
b. θ0 and θ1 will remain the same
c. Depends on the values of θ0 and θ1
d. Depends on the learning rate

Correct Answer: b

Detailed Solution:

At a local minimum, the derivative (gradient) is zero, so gradient descent will not change the
parameters.
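
As a sanity check (a small sketch, not from the original solution; the quadratic f and its minimum at (1, 2) are assumptions for illustration), evaluate the update rule θ ← θ − α∇f(θ) at a point where the gradient vanishes:

```python
import numpy as np

def grad_f(theta):
    # f(t0, t1) = (t0 - 1)**2 + (t1 - 2)**2, minimized at (1, 2)
    return np.array([2 * (theta[0] - 1), 2 * (theta[1] - 2)])

theta = np.array([1.0, 2.0])            # initialized exactly at the minimum
theta_next = theta - 0.1 * grad_f(theta)
print(theta_next)                        # [1. 2.] -- unchanged, the gradient is zero
```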

QUESTION 3:

Choose the correct option:

i) The inability of a model to obtain sufficiently low training error is termed overfitting
ii) The inability of a model to reduce the large gap between training and testing error is
termed overfitting
iii) The inability of a model to obtain sufficiently low training error is termed underfitting
iv) The inability of a model to reduce the large gap between training and testing error is
termed underfitting

a. Only option (i) is correct
b. Both options (ii) and (iii) are correct
c. Both options (ii) and (iv) are correct
d. Only option (iv) is correct

Correct Answer: b

Detailed Solution:

Underfitting is the inability to achieve sufficiently low training error, while overfitting is the inability to close a large gap between training and testing error, so (ii) and (iii) are correct. Follow Lecture 17.

QUESTION 4:
Suppose the cost function is J(θ) = 0.25θ², as shown in the graph below, with θ plotted along
the horizontal axis. Referring to this graph, choose the correct option regarding the
statements given below.

Statement i: The magnitude of weight update at the green point is higher than the magnitude
of weight update at yellow point.

Statement ii: The magnitude of weight update at the green point is higher than the magnitude
of weight update at red point.

a. Only Statement i is true


b. Only Statement ii is true
c. Both Statement i and ii are true
d. None of them are true

Correct Answer: a

Detailed Solution:

The weight update is directly proportional to the magnitude of the gradient of the cost
function. In our case, ∂J(θ)/∂θ = 0.5θ, so the weight update is larger for larger values of |θ|.
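
A tiny numeric check (a sketch; the sample θ values below are hypothetical, since the graph's marked points are not reproduced here): the farther a point lies from the minimum at θ = 0, the larger the gradient magnitude and hence the weight update.

```python
# dJ/dtheta for J(theta) = 0.25 * theta**2
grad = lambda theta: 0.5 * theta

for theta in [3.0, 1.0, -4.0]:        # hypothetical green, yellow, red positions
    print(theta, abs(grad(theta)))     # update magnitude grows with |theta|
```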

QUESTION 5:
Choose the correct option. The gradient of a continuous and differentiable function:

i) is zero at a minimum
ii) is non-zero at a maximum
iii) is zero at a saddle point
iv) decreases in magnitude as you get closer to the minimum

a. Only option (i) is correct
b. Options (i), (iii) and (iv) are correct
c. Options (i) and (iv) are correct
d. Only option (ii) is correct

Correct Answer: b

Detailed Solution:

The gradient is zero at every stationary point: minima, maxima, and saddle points alike, so (ii) is false. Since the gradient is continuous and vanishes at the minimum, its magnitude shrinks as you approach it, so (i), (iii), and (iv) hold.
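
A small numerical illustration (a sketch with an assumed function, not from the original solution): f(x, y) = x² − y² has a saddle point at the origin, where the gradient is exactly zero even though the point is neither a minimum nor a maximum.

```python
import numpy as np

def grad(x, y):
    # f(x, y) = x**2 - y**2; the gradient is (2x, -2y)
    return np.array([2 * x, -2 * y])

print(grad(0.0, 0.0))   # [ 0. -0.] -- zero at the saddle point
print(grad(0.1, 0.0))   # magnitude grows away from the stationary point
```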

QUESTION 6:
The input to a SoftMax activation function is [3, 1, 2]. What will the output be?

a. [0.58, 0.11, 0.31]
b. [0.43, 0.24, 0.33]
c. [0.60, 0.10, 0.30]
d. [0.67, 0.09, 0.24]

Correct Answer: d

Detailed Solution:

SoftMax: σ(xⱼ) = e^(xⱼ) / Σₖ₌₁ⁿ e^(xₖ), for j = 1, 2, …, n.

Therefore σ(3) = e³ / (e³ + e¹ + e²) ≈ 0.67, and similarly σ(1) ≈ 0.09 and σ(2) ≈ 0.24.
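
To reproduce the numbers, here is a minimal NumPy sketch, with the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax(x):
    # Subtracting the max avoids overflow and does not change the result.
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(np.round(softmax(np.array([3.0, 1.0, 2.0])), 2))  # [0.67 0.09 0.24]
```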

QUESTION 7:
If the SoftMax of xⱼ is denoted σ(xⱼ), where xⱼ is the j-th element of the n-dimensional vector
X = [x₁, …, xⱼ, …, xₙ], then the derivative of σ(xⱼ) with respect to xⱼ, i.e., ∂σ(xⱼ)/∂xⱼ, is given by:

a. σ(xⱼ) × σ(xⱼ)
b. 1 − σ(xⱼ)
c. 0
d. σ(xⱼ) × (1 − σ(xⱼ))

Correct Answer: d

Detailed Solution:

SoftMax: σ(xⱼ) = e^(xⱼ) / Σₖ₌₁ⁿ e^(xₖ), for j = 1, 2, …, n.

Taking the derivative with respect to xⱼ, we get ∂σ(xⱼ)/∂xⱼ = σ(xⱼ) × (1 − σ(xⱼ)).
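
A quick finite-difference check (a sketch; the test vector [3, 1, 2] is reused from Question 6) confirms the diagonal SoftMax derivative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([3.0, 1.0, 2.0])
j, eps = 0, 1e-6

# Numerical derivative of sigma(x_j) with respect to x_j
xp, xm = x.copy(), x.copy()
xp[j] += eps
xm[j] -= eps
numeric = (softmax(xp)[j] - softmax(xm)[j]) / (2 * eps)

s = softmax(x)[j]
print(numeric, s * (1 - s))  # both ~0.2227
```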

QUESTION 8:
Which of the following options is true?

a. In Stochastic Gradient Descent, a small batch of samples is selected randomly instead
of the whole data set for each iteration; too-large weight updates lead to faster
convergence.
b. In Stochastic Gradient Descent, the whole data set is processed together for the
update in each iteration.
c. Stochastic Gradient Descent considers only one sample for updates and has
noisier updates.
d. Stochastic Gradient Descent is a non-iterative process.
Correct Answer: c
Detailed Solution:

Stochastic Gradient Descent considers just one sample for update and thus has noisier updates.
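
A minimal sketch of this behaviour (an assumed linear-regression setup, not from the original solution): SGD updates the weight from one randomly drawn sample per step, so each step is a noisy estimate of the full-batch gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

w, lr = 0.0, 0.05
for step in range(200):
    i = rng.integers(len(X))                    # one random sample per update
    grad = 2 * (w * X[i, 0] - y[i]) * X[i, 0]   # gradient of (w*x - y)**2
    w -= lr * grad                              # noisy single-sample update
print(w)  # fluctuates around the true slope 3.0
```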

QUESTION 9:
What are the steps for using a gradient descent algorithm?

1. Calculate the error between the actual value and the predicted value
2. Re-iterate until you find the best weights for the network
3. Pass an input through the network and get values from the output layer
4. Initialize random weights and biases
5. Go to each neuron that contributes to the error and change its respective values to reduce
the error

a. 1, 2, 3, 4, 5

b. 5, 4, 3, 2, 1

c. 3, 2, 1, 5, 4

d. 4, 3, 1, 5, 2

Correct Answer: d

Detailed Solution:

Initialize random weights and biases, pass an input through the network, calculate the error
at the output layer, back-propagate the error through each preceding layer, and update the
neuron weights using the learning rate and the gradient of the error; repeat until the best
weights are found. This is the order 4, 3, 1, 5, 2. Please refer to the lectures of Week 4.
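
The ordering maps directly onto a training loop; here is a minimal sketch for a single linear neuron (the data and learning rate are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.5, -2.0]) + 0.5

w, b, lr = rng.normal(size=2), 0.0, 0.1   # step 4: random initialization
for epoch in range(100):                  # step 2: re-iterate
    pred = X @ w + b                      # step 3: forward pass
    err = pred - y                        # step 1: error vs. actual value
    w -= lr * (X.T @ err) / len(X)        # step 5: adjust weights to
    b -= lr * err.mean()                  #         reduce the error
print(w, b)  # approaches [1.5, -2.0] and 0.5
```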

QUESTION 10:
J(θ) = 2θ² − 2θ + 2 is a given cost function. Find the correct weight update rule for gradient
descent optimization at step t+1. Consider α = 0.01 to be the learning rate.

a. θₜ₊₁ = θₜ − 0.01(2θ − 1)
b. θₜ₊₁ = θₜ + 0.01(2θ − 1)
c. θₜ₊₁ = θₜ − (2θ − 1)
d. θₜ₊₁ = θₜ − 0.02(2θ − 1)

Correct Answer: d

Detailed Solution:

∂J(θ)/∂θ = 4θ − 2 = 2(2θ − 1)

So the weight update will be
θₜ₊₁ = θₜ − 0.01 × 2(2θ − 1) = θₜ − 0.02(2θ − 1)
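
A short sketch verifying the rule (the starting point θ = 0 is an assumption for illustration): iterating θₜ₊₁ = θₜ − 0.02(2θₜ − 1) drives θ to the minimizer of J, which is θ = 0.5.

```python
# J(theta) = 2*theta**2 - 2*theta + 2; dJ/dtheta = 4*theta - 2
theta, lr = 0.0, 0.01
for _ in range(500):
    theta -= lr * (4 * theta - 2)   # equivalently: theta -= 0.02*(2*theta - 1)
print(theta)  # -> 0.5, the minimizer of J
```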
______________________________________________________________________________

************END*******
