
ML FA24 Final Term Exam (Solution)

The document outlines the terminal examination for the Machine Learning course at COMSATS University Islamabad, detailing the structure, questions, and marking scheme. It includes various topics such as FP-Growth, Decision Trees, Performance Evaluation Metrics, K-Means Clustering, and Reinforcement Learning. The exam consists of multiple questions requiring analysis and application of machine learning concepts, with specific instructions for students based on their registration numbers.

Uploaded by

Hanzala Shafique

COMSATS University Islamabad, Wah Campus

Terminal Examination Fall 2024


Department of Computer Science

Program(s)/Classes: BCS 6 A Date: 09th January 2025


Subject: Machine Learning (CSC354) Maximum Marks: 100 Marks
Instructor Name(s): Prof. Dr Sheraz Anjum Total Time Allowed: 3 hrs

Note: Solve the questions on the separately provided answer sheet. Attempt all questions.
Question No. 1 (CLO-1(SO-1)) (05 + 05 = 10 Marks)
a- FP-Growth tree: The FP-Growth tree for a dataset has been computed and depicted below,
along with the frequent item set table. Determine the association rules between frequent
items using confidence, considering a threshold of over 75%.

• Confidence of (B→E) = S(B∩E) / S(B) = 3/4 = 0.75
• Confidence of (E→B) = S(E∩B) / S(E) = 3/3 = 1.0
• Confidence of (J→C) = S(J∩C) / S(J) = 3/4 = 0.75
• Confidence of (C→J) = S(C∩J) / S(C) = 3/3 = 1.0
• Confidence of (B→J) = S(B∩J) / S(B) = 3/4 = 0.75
• Confidence of (J→B) = S(J∩B) / S(J) = 3/4 = 0.75
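
The threshold check on these rules can be sketched in Python as follows. The support counts are read off the calculations above; the names `support`, `confidence` and `strong_rules` are illustrative, and the filter assumes that "over 75%" means strictly greater than 0.75.

```python
# Support counts taken from the confidence calculations above.
support = {
    frozenset("B"): 4,
    frozenset("E"): 3,
    frozenset("J"): 4,
    frozenset("C"): 3,
    frozenset("BE"): 3,
    frozenset("JC"): 3,
    frozenset("BJ"): 3,
}


def confidence(antecedent, consequent):
    """Confidence of antecedent -> consequent = S(both) / S(antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support[both] / support[frozenset(antecedent)]


candidate_rules = [("B", "E"), ("E", "B"), ("J", "C"),
                   ("C", "J"), ("B", "J"), ("J", "B")]

# Assumption: "over 75%" is a strict inequality, so 0.75 itself does not qualify.
strong_rules = [(a, c) for a, c in candidate_rules if confidence(a, c) > 0.75]
print(strong_rules)
```

Under that strict reading only E→B and C→J pass; if the threshold were read as "at least 75%", all six rules would qualify.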

b- Eclat algorithm: Apply the Eclat algorithm to the following dataset with a minimum support
of 2.
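
A minimal sketch of the Eclat idea (vertical transaction-id lists combined by set intersection) is given below. The exam's dataset is not reproduced in this text, so the four transactions used here are hypothetical and serve only to show the mechanics; the function name `eclat` is likewise illustrative.

```python
def eclat(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    # Vertical layout: map each item to the set of transaction ids containing it.
    tidlists = {}
    for tid, items in transactions.items():
        for item in items:
            tidlists.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that are frequent together with prefix
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect tid-sets with the remaining items to grow the itemset.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Hypothetical transactions (NOT the exam's dataset).
transactions = {1: {"A", "B", "C"}, 2: {"A", "C"}, 3: {"A", "D"}, 4: {"B", "C"}}
result = eclat(transactions, min_support=2)
print(result)
```

The support of a larger itemset is simply the size of the intersected id set, which is why Eclat never rescans the original transactions.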

Question No. 2 (CLO-2(SO-2,4)) (10+05+05+05+05= 30 Marks)

a- Decision Tree: You are a data analyst for a social media company aiming to predict whether
a user will spend more than 2 hours on a platform based on usage behavior. Your dataset
has 14 records with attributes related to platform, activity level, content engagement, and
network strength, as well as a final decision on whether the user spent more than 2 hours
("Yes" or "No"). Using the ID3 algorithm with Entropy and Information Gain, you need to
identify the attribute that best fits as the root node of your decision tree.

Day  Platform   Activity Level  Content Engagement  Network Strength  Spent > 2 hours?
1    Instagram  High            Low                 Weak              No
2    Instagram  High            Low                 Strong            No
3    Twitter    High            Low                 Weak              Yes
4    Facebook   Moderate        Low                 Weak              Yes
5    Facebook   Low             Moderate            Weak              Yes
6    Facebook   Low             Moderate            Strong            No
7    Twitter    Low             Moderate            Strong            Yes
8    Instagram  Moderate        Low                 Weak              No
9    Instagram  Low             Moderate            Weak              Yes
10   Facebook   Moderate        Moderate            Weak              Yes
11   Instagram  Moderate        Moderate            Strong            Yes
12   Twitter    Moderate        Low                 Strong            Yes
13   Twitter    High            Moderate            Weak              Yes
14   Facebook   Moderate        Low                 Strong            No
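
The root-node selection can be checked with a short Python sketch of entropy and information gain over the 14 records in the table. The function and variable names are illustrative; the data is copied from the table.

```python
from collections import Counter
from math import log2

# (platform, activity level, content engagement, network strength, spent > 2 hours?)
records = [
    ("Instagram", "High", "Low", "Weak", "No"),
    ("Instagram", "High", "Low", "Strong", "No"),
    ("Twitter", "High", "Low", "Weak", "Yes"),
    ("Facebook", "Moderate", "Low", "Weak", "Yes"),
    ("Facebook", "Low", "Moderate", "Weak", "Yes"),
    ("Facebook", "Low", "Moderate", "Strong", "No"),
    ("Twitter", "Low", "Moderate", "Strong", "Yes"),
    ("Instagram", "Moderate", "Low", "Weak", "No"),
    ("Instagram", "Low", "Moderate", "Weak", "Yes"),
    ("Facebook", "Moderate", "Moderate", "Weak", "Yes"),
    ("Instagram", "Moderate", "Moderate", "Strong", "Yes"),
    ("Twitter", "Moderate", "Low", "Strong", "Yes"),
    ("Twitter", "High", "Moderate", "Weak", "Yes"),
    ("Facebook", "Moderate", "Low", "Strong", "No"),
]
attributes = ["Platform", "Activity Level", "Content Engagement", "Network Strength"]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy of the whole set minus the weighted entropy after splitting on column."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(records, i) for i, name in enumerate(attributes)}
root = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"{name}: {gain:.3f}")
print("Root:", root)
```

With 9 "Yes" and 5 "No" records the overall entropy is about 0.940. The gains come out at roughly 0.247 for Platform, 0.152 for Content Engagement, 0.048 for Network Strength and 0.029 for Activity Level, so Platform is the best fit for the root node.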

b- Performance Evaluation Metrics: You are working on a machine learning project to classify
birds into two categories: Sparrow and Not-Sparrow. The table below shows the results of a
test dataset where a machine learning model was used to predict whether a bird is a
"Sparrow" or "Not-Sparrow" based on various features. The table provides the actual labels
and the model's predicted labels for 10 test cases.
• Odd registration # will compute the Accuracy and Recall.
• Even registration # will compute the F1-Score.
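
The three metrics can be sketched in Python as below. The table of actual and predicted labels is not reproduced in this text, so the two label lists are hypothetical placeholders; only the formulas carry over to the real data.

```python
# Hypothetical labels for illustration only (NOT the exam's table).
actual = ["Sparrow", "Sparrow", "Not-Sparrow", "Sparrow", "Not-Sparrow",
          "Not-Sparrow", "Sparrow", "Not-Sparrow", "Sparrow", "Not-Sparrow"]
predicted = ["Sparrow", "Not-Sparrow", "Not-Sparrow", "Sparrow", "Sparrow",
             "Not-Sparrow", "Sparrow", "Not-Sparrow", "Sparrow", "Not-Sparrow"]


def confusion_counts(actual, predicted, positive="Sparrow"):
    """Return (TP, FP, FN, TN), treating `positive` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + fn + tn)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, recall, precision, f1)
```

Accuracy uses all four cells of the confusion matrix, recall only the actual positives, and F1 is the harmonic mean of precision and recall.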

c- Cross Validation: Briefly explain how K-fold cross-validation improves on the holdout method.
Illustrate with a pictorial representation.
K-fold cross-validation improves on the holdout method by splitting the dataset into k
subsets (folds) and repeating the train/test cycle k times. Each iteration trains the model on
k-1 folds and tests it on the remaining fold, so every record is used for testing exactly once.
The average accuracy over the k iterations serves as the cross-validation metric, which makes
the estimate less dependent on any single train/test split. While it mitigates the bias of a
single split and is effective for limited data, it demands more computation because the model
is trained k times.
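
The fold rotation can be sketched in plain Python as follows. This is a minimal illustration, assuming contiguous folds without shuffling; the function name `k_fold_splits` is made up for the example.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


splits = list(k_fold_splits(10, 5))
for fold, (train, test) in enumerate(splits, start=1):
    print(f"Fold {fold}: test={test} train={train}")
```

Each index appears in exactly one test fold, in contrast to the holdout method where a fixed subset is never used for testing or never used for training.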

d- Linear Regression: Emma recently bought a new car and decided to track the gallons of
gas she used on five of her business trips. She seeks assistance in predicting the consumed
gallons of gas. Calculate the predicted gallons using the equation ŷ = -2125 + 1.08x for
x = 2005, 2006, 2007, 2009, and 2013.
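
The predictions follow from substituting each value into the given equation; a short Python sketch (the function name is illustrative) is:

```python
def predicted_gallons(x):
    """Evaluate the given regression line y_hat = -2125 + 1.08 * x."""
    return -2125 + 1.08 * x


predictions = {x: round(predicted_gallons(x), 2) for x in (2005, 2006, 2007, 2009, 2013)}
for x, y_hat in predictions.items():
    print(x, y_hat)
```

This gives approximately 40.40, 41.48, 42.56, 44.72 and 49.04 gallons for the five inputs.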
e- Ensemble Learning: What is the difference between the Max Voting and Averaging
techniques? Justify your answer with the help of an example.
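Max Voting: Selects the class with the most votes (used for classification).
Averaging: Calculates the average of the predictions (used for regression or probabilistic
outputs).

Example: for five predictions 5, 4, 5, 4, 4, averaging gives (5+4+5+4+4)/5 = 4.4, whereas
max voting gives 4, the most frequent value.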
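
The two combination rules can be compared on the same five predictions with a small Python sketch (function names are illustrative):

```python
from collections import Counter
from statistics import mean


def max_voting(predictions):
    """Return the most frequent prediction (ties go to the value seen first)."""
    return Counter(predictions).most_common(1)[0][0]


def averaging(predictions):
    """Return the arithmetic mean of the predictions."""
    return mean(predictions)


predictions = [5, 4, 5, 4, 4]  # the five values from the example above
print(max_voting(predictions))  # 4
print(averaging(predictions))   # 4.4
```

Max voting always returns one of the submitted values, while averaging can return a value (4.4) that no individual model predicted.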

Question No. 3 (CLO-3(SO-2,4)) (15+15+05= 35 Marks)
Note:
• Even Registration # will use Manhattan Distance where needed
• Odd Registration # will use Euclidean Distance where needed

a- K-Means Clustering: Consider the dataset given below. Here k=2 (there are going to be
just two clusters). Find the clusters using k-means clustering. Stop after the 2nd iteration.
• Centroid for Cluster 01 is Individual A
• Centroid for Cluster 02 is Individual C

S# X1 X2
A 1 1
B 1 0
C 0 2
D 2 4
E 3 5
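
Two iterations of k-means on individuals A to E can be sketched in Python as below. This is a minimal illustration that uses Euclidean distance and the stated initial centroids (A and C); the function names and the tie-breaking rule (first centroid wins) are assumptions of the sketch.

```python
from math import dist

points = {"A": (1, 1), "B": (1, 0), "C": (0, 2), "D": (2, 4), "E": (3, 5)}


def manhattan(p, q):
    """Manhattan distance, for the even-registration variant."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign(points, centroids, distance):
    """Put each point into the cluster of its nearest centroid (ties: first centroid)."""
    clusters = [[] for _ in centroids]
    for name, p in points.items():
        distances = [distance(p, c) for c in centroids]
        clusters[distances.index(min(distances))].append(name)
    return clusters


def update(points, clusters):
    """Recompute each centroid as the mean of its members (assumes no empty cluster)."""
    return [
        tuple(sum(points[n][d] for n in cluster) / len(cluster) for d in range(2))
        for cluster in clusters
    ]


centroids = [points["A"], points["C"]]  # initial centroids given in the question
for iteration in (1, 2):
    clusters = assign(points, centroids, dist)  # pass `manhattan` for the other variant
    centroids = update(points, clusters)
    print(iteration, clusters, centroids)
```

With Euclidean distance the first iteration groups {A, B} and {C, D, E}, and the second moves C to give {A, B, C} and {D, E}. With Manhattan distance, D and E are equidistant from both initial centroids in the first iteration, so the intermediate steps depend on how ties are broken.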

b- Hierarchical Clustering: Apply Bottom-up approach of hierarchical clustering on the
following data.
Points X Y
A 1 1
B 1.5 1.5
C 5 5
D 3 4
E 4 4
F 3 3.5
• Even Registration # will compute and draw the dendrogram using Complete Linkage
• Odd Registration # will compute and draw the dendrogram using Single Linkage
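
The merge order for either linkage can be sketched in Python as follows. This is a minimal illustration assuming Euclidean distance between points; the function name `agglomerative` is made up for the example, and passing `min` or `max` selects single or complete linkage.

```python
from math import dist

points = {"A": (1, 1), "B": (1.5, 1.5), "C": (5, 5),
          "D": (3, 4), "E": (4, 4), "F": (3, 3.5)}


def agglomerative(points, linkage):
    """Repeatedly merge the two closest clusters until one remains.

    linkage is `min` for single linkage and `max` for complete linkage.
    Returns the merges as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], round(d, 3)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


single = agglomerative(points, min)
complete = agglomerative(points, max)
for a, b, d in single:
    print("single  ", a, b, d)
for a, b, d in complete:
    print("complete", a, b, d)
```

The merge distances are the heights at which the dendrogram branches join. Both linkages start with D-F and A-B, then add E, then C, and finally join the A-B group; only the heights of the later merges differ.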

c- DBSCAN: Use the following Proximity matrix to identify core points and outliers if any. Given,
Epsilon(Eps) = 2.5 and Minimum Points (MinPts) = 3.

There are neither core points nor outliers.
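
The classification rule can be sketched in Python as below. The exam's proximity matrix is not reproduced in this text, so the 5-point matrix here is hypothetical, and the sketch assumes the common convention that a point counts as its own neighbour when comparing against MinPts.

```python
def classify_points(distances, eps, min_pts):
    """Label each point 'core', 'border' or 'noise' from a distance matrix.

    Assumption: a point counts as its own neighbour, so a core point has
    at least min_pts points (itself included) within eps.
    """
    n = len(distances)
    neighbours = [[j for j in range(n) if distances[i][j] <= eps] for i in range(n)]
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbours[i]):
            labels.append("border")  # not core, but within eps of a core point
        else:
            labels.append("noise")   # outlier
    return labels


# Hypothetical proximity matrix (NOT the one from the exam).
example = [
    [0, 1, 2, 4, 10],
    [1, 0, 1, 3, 9],
    [2, 1, 0, 2, 8],
    [4, 3, 2, 0, 6],
    [10, 9, 8, 6, 0],
]
labels = classify_points(example, eps=2.5, min_pts=3)
print(labels)
```

For this made-up matrix the first three points are core, the fourth is a border point and the fifth is an outlier.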


Question No. 4 (CLO-4(SO-2,4)) (04+05+06 = 15 Marks)
Reinforcement Learning (RL) and Deep Learning (DL):
a- Briefly explain the difference between model-based and model-free RL.

Model-Based RL: The learner builds a model of the environment (e.g., transition and reward
functions) and uses it to plan actions by simulating future states.
Model-Free RL: The learner directly learns the optimal policy or value function from interactions
with the environment without explicitly modeling it.
b- Consider the following R matrix. Imagine that we are in State 01, and then compute the Q
value for Q(state, action) = Q(3,1). Note that the learning rate is 0.8.

Q(3,1) = 80
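
The stated answer is consistent with the simplified Q-learning update Q(s, a) = R(s, a) + 0.8 · max Q(s', a'), where 0.8 weights the best value available from the next state. The R matrix is not reproduced in this text, so the immediate reward R(3,1) = 0 and the best already-learned value of 100 in state 1 used below are assumptions chosen to match the answer, not values read from the exam.

```python
def q_update(reward, next_q_values, gamma=0.8):
    """Simplified Q-learning update: Q(s, a) = R(s, a) + gamma * max Q(s', a')."""
    return reward + gamma * max(next_q_values)


# Assumed values (the exam's R matrix is not available here):
# immediate reward R(3, 1) = 0 and Q-values [0, 100] for the actions in state 1.
q_3_1 = q_update(reward=0, next_q_values=[0, 100])
print(q_3_1)
```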

c- Recall the basic concepts of DL and label the following diagram of a perceptron.
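
The parts usually labelled on a perceptron diagram (inputs, weights, bias, weighted sum, activation function, output) can be sketched in Python as below. The diagram itself is not reproduced in this text; the step activation and the weights chosen here are illustrative assumptions.

```python
def perceptron(inputs, weights, bias):
    """Single perceptron: weighted sum of the inputs plus bias, then a step activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum >= 0 else 0  # step activation produces the output


# Hypothetical weights and bias that make the unit behave like a logical AND.
weights, bias = [1.0, 1.0], -1.5
outputs = {(x1, x2): perceptron([x1, x2], weights, bias)
           for x1 in (0, 1) for x2 in (0, 1)}
print(outputs)
```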

Question No. 5 (CLO-5(SO-2,3,4,5)) (10 Marks)

• Briefly explain about your semester project and your findings related to your semester
project.

***Success is not measured by how well you cheat, but by how honestly you strive***
