
2024_Machine_Learning

The document is a question paper for a Machine Learning course, detailing instructions for candidates and various sections of questions. It includes topics such as linear regression, clustering, support vector machines, and neural networks, along with practical exercises and theoretical questions. The paper is structured into compulsory and optional sections, with a total duration of 3 hours and a maximum score of 75 marks.

Uploaded by

tee1708rawal

[This question paper contains 12 printed pages.]

Your Roll No...............

Sr. No. of Question Paper : 3143
Unique Paper Code : 32347607
Name of the Paper : Machine Learning
Name of the Course : B.Sc. (H) Computer Science
Admissions of 2019, 2020 & 2021
Semester : [illegible]

Duration : 3 Hours          Maximum Marks : 75

Instructions for Candidates

1. Write your Roll No. on the top immediately on receipt of this question paper.
2. Section A is compulsory.
3. Attempt any 4 questions from Section B.
4. Use of scientific calculator is allowed.


Section A
(Compulsory)

1. (a) Consider a scenario where 6000 patients are tested for Covid. Of these, 5000 are actually Covid negative and 1000 are actually Covid positive. For the Covid positive patients, the test gave a positive indication for only 700, and for the Covid negative patients, the test gave a positive indication for 200. Construct a confusion matrix for the above scenario and find the values of the True Positive Rate (TPR), False Positive Rate (FPR), Specificity and Sensitivity metrics. (5)

(b) Answer the following : (5)

(i) What is the impact of a small dataset with respect to a large number of features?

(ii) For the given values theta_0 = 0.2, theta_1 = 0.1 and theta_2 = 0.1, predict the values of the dependent variable y for all instances of the independent variables x1 and x2 given in the following data table, using linear regression. Also compute the mean squared error.

[Data table: illegible in the source.]

(c) Cluster the following set of data objects into two clusters by applying one iteration of the k-means algorithm. Treat objects 2 and 5 as the initial cluster centres. Use Euclidean distance as the distance metric. Determine the updated cluster centres [illegible]. (5)

Object Number | X-coordinate | Y-coordinate
1 | [illegible] | [illegible]
2 | 4 | 6
3 | [illegible] | 8
4 | [illegible] | [illegible]
5 | 12 | 4
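For the Covid test scenario in Question 1(a), the requested metrics follow directly from the four confusion-matrix counts. The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative assumptions, and only the counts 6000, 5000, 1000, 700 and 200 come from the question.

```python
# Counts taken from the Covid test scenario in the question.
actual_positive = 1000
actual_negative = 5000
tp = 700                     # Covid positive patients the test flagged as positive
fp = 200                     # Covid negative patients the test flagged as positive
fn = actual_positive - tp    # positive patients the test missed
tn = actual_negative - fp    # negative patients the test cleared


def rates(tp, fp, fn, tn):
    """Return TPR, FPR, specificity and sensitivity from confusion-matrix counts."""
    tpr = tp / (tp + fn)             # sensitivity is the same quantity as TPR
    fpr = fp / (fp + tn)
    specificity = tn / (tn + fp)     # equals 1 - FPR
    return {"TPR": tpr, "FPR": fpr, "Specificity": specificity, "Sensitivity": tpr}


metrics = rates(tp, fp, fn, tn)
print(metrics)
```

The confusion matrix itself is the 2 x 2 arrangement of `tp`, `fn`, `fp` and `tn`, with actual classes on one axis and predicted classes on the other.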

(d) Differentiate between linear regression and polynomial regression. Derive the gradient descent algorithm to find the unknown parameters in multivariate linear regression. (5)

(e) How does the PCA (Principal Component Analysis) algorithm help in dimension reduction in machine learning? Write the steps of the PCA algorithm. (5)

(f) What is regularization? Write the equations of the cost function for regularized linear regression and regularized logistic regression. What will be the effect on the model when the regularization parameter is set to zero? (5)

(g) Consider the following dataset with 8 training instances. Use the k-NN algorithm (for k = 3) to determine the 'Result' status for a new test instance with values CGPA = 7.6, Assessment = 60 and Project Points = 7. (5)

S.No. | CGPA | Assessment | Project Points | Result
1 | 9.2 | 85 | [illegible] | [illegible]
[Rows 2 to 5 are only partly legible in the source; legible fragments: "80 8" and "6.5 30 4 Fail".]
6 | 8.2 | [illegible] | [illegible] | Pass
7 | 5.8 | 38 | 5 | Fail
8 | 8.9 | 91 | 9 | Pass

Section - B

2. (a) Consider two features [illegible] and their values: (4)

* [illegible]: values (medium, low, high, very high)
* Status: values (SO, AO, Clerk)

Answer the following questions :

(i) Using Cartesian product on the above feature set, construct a new feature and generate its possible values list.
(ii) State one advantage and one disadvantage of the above approach for feature construction.
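As an illustration of the Cartesian-product feature construction asked for above, a minimal Python sketch could look like the following. The name `level` is a placeholder assumption, since the first feature's name is not legible in the paper; the value lists are the ones printed in the question, and the underscore-joined naming of the combined values is also just one possible convention.

```python
from itertools import product

# "level" is a placeholder name: the first feature's name is not legible in the question.
level_values = ["medium", "low", "high", "very high"]
status_values = ["SO", "AO", "Clerk"]

# Each value of the constructed feature pairs one level with one status.
combined_values = [
    f"{level}_{status}" for level, status in product(level_values, status_values)
]

print(len(combined_values))
print(combined_values)
```

The constructed feature has 4 x 3 = 12 possible values, which hints at the trade-off the question asks about: interactions become explicit, but the number of categories grows multiplicatively.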

(b) For the given set of points, identify clusters using complete linkage in agglomerative clustering. Use Euclidean distance to calculate the distance between two points. (6)

Points | X coordinate | Y coordinate
P1 | 1 | 1
P2 | 1.5 | 1.5
P3 | 3 | 5
P4 | 3 | 4

3. (a) Consider the following two dimensional space with some data points such that circle points represent positive class points and triangular points represent negative class points separated by a decision boundary as shown. (5)

[Figure: scatter plot of the data points and the decision boundary, x-axis running from 0 to 6; legible point labels include (1.5, 3), (5, 3), (3, 2.5) and (2.5, 4.5).]

Answer the following questions :

(i) Identify support vectors (with respect to SVM classifier applied on above data).

(ii) Draw marginal planes (with respect to SVM classifier applied on above data).

(iii) Define Marginal Distance in SVM algorithm.

(b) Construct a neural network for a two input NOR gate using its truth table. Show a diagram of your generated neural network model with weights. (5)
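One possible network for the NOR gate question is a single neuron with a step activation. The sketch below is only an illustration: the weights (-1, -1), the bias 0.5 and the function names are assumptions chosen to reproduce the NOR truth table, not values given in the paper, and other weight choices work equally well.

```python
def step(z):
    """Threshold activation: fire (1) only when the weighted sum is positive."""
    return 1 if z > 0 else 0


def nor_neuron(x1, x2, w1=-1.0, w2=-1.0, bias=0.5):
    """Single neuron; the weights and bias are illustrative assumptions."""
    return step(w1 * x1 + w2 * x2 + bias)


# Evaluate the neuron on all four input combinations of the truth table.
truth_table = {(a, b): nor_neuron(a, b) for a in (0, 1) for b in (0, 1)}

for inputs, output in truth_table.items():
    print(inputs, "->", output)
```

With these values the weighted sum is positive only for the input (0, 0), so the neuron outputs 1 there and 0 for the other three rows.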

4. (a) Apply Naive Bayesian Classifier to predict whether a car is stolen or not with features {Color: RED, Origin: DOMESTIC, Type: SUV} based on the given dataset. (5)

Color | Type | Origin | Stolen
RED | SPORTS | DOMESTIC | YES
RED | SPORTS | DOMESTIC | NO
RED | SPORTS | DOMESTIC | YES
YELLOW | SPORTS | DOMESTIC | NO
YELLOW | SPORTS | IMPORTED | YES
YELLOW | SUV | IMPORTED | NO
YELLOW | SUV | IMPORTED | YES
YELLOW | SUV | DOMESTIC | NO
RED | SUV | IMPORTED | NO
RED | SPORTS | IMPORTED | YES
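The Naive Bayes computation for the stolen-car table above can be sketched in a few lines of Python. This is a minimal illustration using plain frequency estimates without Laplace smoothing, which is an assumption since the paper does not say which estimate to use; the function and variable names are likewise illustrative. The rows are copied from the table in the order Color, Type, Origin, Stolen.

```python
from collections import Counter

# Rows copied from the question's table: (color, type, origin, stolen).
rows = [
    ("RED", "SPORTS", "DOMESTIC", "YES"),
    ("RED", "SPORTS", "DOMESTIC", "NO"),
    ("RED", "SPORTS", "DOMESTIC", "YES"),
    ("YELLOW", "SPORTS", "DOMESTIC", "NO"),
    ("YELLOW", "SPORTS", "IMPORTED", "YES"),
    ("YELLOW", "SUV", "IMPORTED", "NO"),
    ("YELLOW", "SUV", "IMPORTED", "YES"),
    ("YELLOW", "SUV", "DOMESTIC", "NO"),
    ("RED", "SUV", "IMPORTED", "NO"),
    ("RED", "SPORTS", "IMPORTED", "YES"),
]


def naive_bayes_scores(rows, query):
    """Return prior * product of per-feature likelihoods for each class label."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(rows)  # class prior
        for i, value in enumerate(query):
            # Frequency estimate of P(feature_i = value | label), no smoothing.
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            score *= matches / n_label
        scores[label] = score
    return scores


# Query instance in the same column order: Color, Type, Origin.
scores = naive_bayes_scores(rows, ("RED", "SUV", "DOMESTIC"))
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

The scores are unnormalised; dividing each by their sum would give posterior probabilities, but the predicted class is the same either way.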

(b) Differentiate between hold out method, leave one out method and k-fold method for cross-validation. Which of the above methods has low bias and high variance? Justify. (5)


5. (a) Using the data given below, build a logistic regression model to predict whether a student passes or fails based on exam score using the gradient descent algorithm. Assume initial values for model parameters (thetas) as 0 and learning rate as 0.3. Use one iteration of the gradient descent algorithm to update the model parameters. (6)

Exam Score (x) : 50, 55, 60, 65, 70, 75, 80, 85, 90, 95
Pass/Fail (y) : [illegible]

(b) Using least squares method, learn the regression coefficients for the data given below. Also predict the value of y for x = 12 using your learned coefficients. (4)

x | y
2 | [illegible]
4 | [illegible]
6 | 29
[illegible] | [illegible]

6. (a) [Figure: neural network diagram, not recoverable from the source.]

For given input values of x1 and x2 as 0.3 and 0.5 respectively, determine the values of output nodes y1 and y2. Use bias b1 = 0.5 and b2 = 0.5. Use sigmoid as the activation function for hidden as well as output layer. (7)

(b) Explain the effect of the following factors in achieving model convergence with respect to the gradient descent algorithm. (3)

* Learning rate is too small.
* Learning rate is too large.

7. (a) Consider the following training data for 5 persons. For binary classification of a person as sick or not sick, create a decision tree model. Show all the steps. (8)

Person No | A1 | A2 | A3 | Class
1 | Yes | Yes | Yes | Not Sick
2 | Yes | No | Yes | Sick
3 | No | No | Yes | Sick
4 | No | Yes | Yes | Not Sick
5 | No | Yes | No | Sick

(b) Consider the expected and predicted outcomes of a machine learning classifier on a data set containing 7 observations. Calculate the performance of the classifier using Jaccard Index metric. (2)

Y expected | 0 | 0 | 0 | 0 | [illegible]
Y predicted | 1 | 0 | 0 | [illegible]
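The question on learning rate and convergence of gradient descent can be illustrated with a small experiment. The sketch below is a hedged example only: the objective f(w) = w**2, the three learning rates and the step count are assumptions chosen for illustration and do not come from the paper.

```python
def gradient_descent(learning_rate, steps=50, w0=1.0):
    """Minimise f(w) = w**2 (gradient 2*w) and return the final value of w."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w


w_small = gradient_descent(0.001)  # too small: barely moves from 1.0 towards the minimum at 0
w_good = gradient_descent(0.1)     # moderate: very close to 0 after 50 steps
w_large = gradient_descent(1.1)    # too large: overshoots each step and grows in magnitude

print(w_small, w_good, w_large)
```

On this objective each update multiplies w by (1 - 2 * learning_rate), so a tiny rate shrinks w very slowly, while any rate above 1 makes the factor larger than 1 in magnitude and the iterates diverge.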
