
COMP3057 Assignment 2

Q1 (40 marks):
The following shows a simple 1D convnet with a single convolution kernel; the input
is a 5-dimensional vector.

1. List all learnable parameters in the network. (Note: in the conv layer, k
represents the weights of the convolution kernel and b represents the bias (please
refer to P13 of the class slides); in the fc layer, w represents the weights and a
represents the bias.)
Hint: k, b, w, a
2. Write down the forward propagation of the network in a layer-by-layer manner.

3. Write down the backward propagation of the network in a layer-by-layer manner.
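For parts 2 and 3, it can help to check your hand-derived formulas numerically. The sketch below is only an assumed minimal instance (kernel size 2, stride 1, no padding, ReLU activation, scalar output, squared-error loss) using the parameter names k, b, w, a from part 1; the actual architecture is the one on the class slides, so adapt the shapes and activation accordingly.

```python
import numpy as np

# Illustrative architecture only: kernel size 2, stride 1, no padding, ReLU,
# scalar fc output and a squared-error loss against a target t are assumptions.
rng = np.random.default_rng(0)
x = rng.normal(size=5)    # 5-dimensional input vector
k = rng.normal(size=2)    # conv kernel weights
b = 0.1                   # conv bias
w = rng.normal(size=4)    # fc weights (length-4 feature map -> scalar)
a = 0.0                   # fc bias
t = 1.0                   # target used by the loss

# ---- forward propagation, layer by layer ----
z = np.array([k @ x[i:i + 2] + b for i in range(4)])  # conv layer: z_i = k . x[i:i+2] + b
h = np.maximum(z, 0.0)                                 # ReLU activation
y = w @ h + a                                          # fully connected layer
L = 0.5 * (y - t) ** 2                                 # squared-error loss

# ---- backward propagation, layer by layer (chain rule) ----
dL_dy = y - t                    # dL/dy
dL_dw = dL_dy * h                # fc weights
dL_da = dL_dy                    # fc bias
dL_dh = dL_dy * w                # gradient flowing back into the feature map
dL_dz = dL_dh * (z > 0)          # back through the ReLU
dL_db = dL_dz.sum()              # conv bias is shared across all positions
dL_dk = np.array([np.sum(dL_dz * x[j:j + 4]) for j in range(2)])  # conv kernel
```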

Q2 (30 marks):
For each of the following scenarios, select an appropriate gradient descent
algorithm and write down your reasoning (preferably with formulas).
1. When dealing with online data.
2. When the area around a local optimum is like a ravine, i.e., the surface curves
much more steeply in one dimension than in another.
3. When the data is sparse and the features have very different frequencies.
Please refer to “An overview of gradient descent optimization algorithms”
(https://arxiv.org/pdf/1609.04747.pdf)
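As a rough starting point (the survey above covers all of these in detail), the sketch below shows one common pairing of scenario and update rule: plain SGD for online data, momentum for ravine-shaped surfaces, and Adagrad for sparse features with very different frequencies. The gradient function and hyperparameter values are placeholders for illustration only.

```python
import numpy as np

def grad(theta):
    # placeholder gradient of the cost w.r.t. the parameters
    return 2 * theta

theta = np.array([1.0, -3.0])
eta = 0.1

# 1. Online data: plain stochastic gradient descent, one update per incoming sample
theta_sgd = theta - eta * grad(theta)

# 2. Ravine-shaped surfaces: momentum damps oscillation along the steep direction
gamma, v = 0.9, np.zeros_like(theta)
v = gamma * v + eta * grad(theta)
theta_mom = theta - v

# 3. Sparse data / uneven feature frequencies: Adagrad scales the step per parameter
eps, G = 1e-8, np.zeros_like(theta)
g = grad(theta)
G += g ** 2                                     # accumulate squared gradients per dimension
theta_ada = theta - eta / np.sqrt(G + eps) * g
```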

Q3 (30 marks):
Consider an unknown Markov Decision Process (MDP) with 3 states (A, B, C) and 2
actions (turnLeft, turnRight), where the agent makes decisions according to some
policy π. You are given a dataset consisting of samples (s, a, s', r), each
representing taking action a in state s, resulting in a transition to state s' and a
reward r. (Hint: here we consider a dynamic system p(s', r | s, a), which means the
reward in each step is also stochastic.)
s    a          s'    r
A    turnRight  B     2
C    turnLeft   B     2
B    turnRight  C    -2
A    turnRight  B     4
You may consider a discount factor of γ=1.
The update function of Q-learning is:
Q(s_t, a_t) = (1 − α)·Q(s_t, a_t) + α·(r_t + γ·max_{a'} Q(s_{t+1}, a'))    (1)

Assume all Q-values are initialized to 0 and use a learning rate of α = 1/2.
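For example, applying Eq. (1) to the first sample (A, turnRight, B, 2) with all Q-values still at 0 gives

Q_1(A, turnRight) = (1 − 1/2)·0 + 1/2·(2 + 1·max_{a'} Q_0(B, a')) = 1/2·(2 + 0) = 1

and the remaining samples are processed the same way, in table order.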
1. Run Q-learning with the data in the table and compute the values of
Q(A, turnRight) and Q(B, turnRight). (Hint: you may consider computing
Q_1(A, turnRight), Q_1(C, turnLeft), Q_1(B, turnRight), Q_2(A, turnRight) with the
update function in Eq. (1).)

2. Construct a policy π_Q that maximizes the Q-value in a given state:
π_Q(s) = argmax_a Q(s, a). What are the actions chosen by the policy in states A and
B?
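A minimal sketch for checking your hand computation, assuming the four samples are replayed once in table order with α = 1/2 and γ = 1; the dictionary-based Q-table and the helper name pi_Q are illustrative, not prescribed by the assignment.

```python
from collections import defaultdict

# (s, a, s', r) samples from the table, replayed once in order
data = [("A", "turnRight", "B",  2),
        ("C", "turnLeft",  "B",  2),
        ("B", "turnRight", "C", -2),
        ("A", "turnRight", "B",  4)]

actions = ["turnLeft", "turnRight"]
alpha, gamma = 0.5, 1.0
Q = defaultdict(float)                      # all Q-values start at 0

for s, a, s_next, r in data:
    target = r + gamma * max(Q[(s_next, ap)] for ap in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * target   # Eq. (1)

print(Q[("A", "turnRight")], Q[("B", "turnRight")])

# Greedy policy pi_Q(s) = argmax_a Q(s, a)
def pi_Q(s):
    return max(actions, key=lambda a: Q[(s, a)])

print(pi_Q("A"), pi_Q("B"))
```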
