Assignment - With Hints
Q1 (40 marks):
The following shows a simple 1D convnet with a single convolution kernel; the input
is a 5-dimensional vector.
1. List all learnable parameters in the network. (Notes: in the conv layer, k
represents weights of the convolution kernel, b represents the bias (please refer to
P13 of the class slides); in the fc layer, w represents the weights, a represents the
bias.)
k, b, w, a
2. Write down the forward propagation of the network in a layer-by-layer manner.
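Hint: a layer-by-layer forward pass can be sketched as below. This is only an illustration, not the network in the figure: the kernel size (3), stride (1), absence of padding and pooling, the ReLU activation, the single output unit, and all numeric values are assumptions chosen for the example. Only the parameter names k, b, w, a come from the question.

```python
# Sketch of a 1D convnet forward pass on a 5-dimensional input.
# Assumed (not given in the assignment): kernel size 3, stride 1,
# no padding, ReLU activation, one scalar output from the fc layer.

def conv1d(x, k, b):
    """Valid 1D convolution (cross-correlation form) plus a shared bias b."""
    n_out = len(x) - len(k) + 1
    return [sum(k[j] * x[i + j] for j in range(len(k))) + b
            for i in range(n_out)]

def relu(z):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in z]

def fc(h, w, a):
    """Fully connected layer with a single output unit."""
    return sum(wi * hi for wi, hi in zip(w, h)) + a

def forward(x, k, b, w, a):
    z = conv1d(x, k, b)   # conv layer: z_i = sum_j k_j * x_{i+j} + b
    h = relu(z)           # activation: h_i = max(0, z_i)
    return fc(h, w, a)    # fc layer:   y = sum_i w_i * h_i + a

# Illustrative values only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
k = [1.0, 0.0, 1.0]
b = 0.5
w = [1.0, 1.0, 1.0]
a = 0.1
y = forward(x, k, b, w, a)
print(y)
```

Writing each layer as its own function mirrors the layer-by-layer form the question asks for; the written answer should use the layers actually shown in the figure.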
Q2 (30 marks):
Consider the following scenarios. Select an appropriate gradient descent
algorithm for each scenario and explain your reasoning (preferably
with formulas).
1. When dealing with online data.
2. When the area around a local optimum is like a ravine, i.e., where the surface curves
much more steeply in one dimension than in another.
3. When the data is sparse and the features have very different frequencies.
Please refer to “An overview of gradient descent optimization algorithms”
(https://arxiv.org/pdf/1609.04747.pdf)
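Hint: three update rules described in the referenced overview are stochastic gradient descent (one update per example), momentum, and Adagrad. The sketch below shows one step of each on a plain list of parameters. The function names, default hyperparameters, and list-based representation are assumptions made for illustration; deciding which rule fits which scenario, and why, is left to the answer.

```python
import math

def sgd_step(theta, grad, lr=0.1):
    """SGD: theta <- theta - lr * grad, applied once per incoming example."""
    return [t - lr * g for t, g in zip(theta, grad)]

def momentum_step(theta, grad, velocity, lr=0.1, gamma=0.9):
    """Momentum: v <- gamma * v + lr * grad; theta <- theta - v."""
    velocity = [gamma * v + lr * g for v, g in zip(velocity, grad)]
    theta = [t - v for t, v in zip(theta, velocity)]
    return theta, velocity

def adagrad_step(theta, grad, accum, lr=0.1, eps=1e-8):
    """Adagrad: per-parameter step lr / sqrt(G_i + eps), where G_i is the
    running sum of squared gradients for parameter i."""
    accum = [s + g * g for s, g in zip(accum, grad)]
    theta = [t - lr / math.sqrt(s + eps) * g
             for t, s, g in zip(theta, accum, grad)]
    return theta, accum

# Illustrative single step on a two-parameter model.
theta = [1.0, 1.0]
grad = [2.0, 0.0]   # second parameter receives no gradient this step
print(sgd_step(theta, grad))
print(momentum_step(theta, grad, velocity=[0.0, 0.0]))
print(adagrad_step(theta, grad, accum=[0.0, 0.0]))
```

Note that Adagrad keeps a separate accumulator per parameter, so parameters that rarely receive gradients keep a comparatively large effective learning rate.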
Q3 (30 marks):
Consider an unknown Markov Decision Process (MDP) with 3 states (A, B, C) and 2
actions (turnLeft, turnRight), in which the agent makes decisions according to some policy
COMP3057 Assignment 2
also stochastic.)
s    a          s'   r
A    turnRight  B    2
C    turnLeft   B    2
B    turnRight  C    -2
A    turnRight  B    4
You may consider a discount factor of γ=1.
The update function of Q-learning is:
Q(s_t, a_t) = (1 - \alpha) \cdot Q(s_t, a_t) + \alpha \cdot (r_t + \gamma \max_{a'} Q(s_{t+1}, a'))    (1)
Assume all Q-values are initialized to 0 and use a learning rate of α = 1/2.
1. Run Q-learning with the data in the table and compute the values of Q(A, turnRight)
and Q(B, turnRight). (Hint: consider computing Q1(A, turnRight), Q1(C, turnLeft),
Q1(B, turnRight), and Q2(A, turnRight) with the update function in Eq. (1).)
B?
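Hint: the hand computation for question 1 can be checked with a short script. The sketch below applies the update in Eq. (1) to the four transitions in the order they appear in the table, with α = 1/2, γ = 1, and all Q-values starting at 0. The variable names and the dictionary representation of the Q-table are assumptions made for illustration.

```python
from collections import defaultdict

ACTIONS = ("turnLeft", "turnRight")
ALPHA = 0.5   # learning rate from the question
GAMMA = 1.0   # discount factor from the question

# (s, a, s', r) transitions, in the order given in the table.
transitions = [
    ("A", "turnRight", "B", 2),
    ("C", "turnLeft", "B", 2),
    ("B", "turnRight", "C", -2),
    ("A", "turnRight", "B", 4),
]

Q = defaultdict(float)  # every Q(s, a) starts at 0

for s, a, s_next, r in transitions:
    # max over a' of Q(s', a'), using the current estimates
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    # Eq. (1): blend the old estimate with the one-step target
    Q[(s, a)] = (1 - ALPHA) * Q[(s, a)] + ALPHA * (r + GAMMA * best_next)
    print(f"Q({s}, {a}) = {Q[(s, a)]}")
```

Each printed line corresponds to one of the intermediate values named in the hint, so the output can be compared step by step against the hand calculation.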