Midterm-Lab-Exam_-Attempt-review

The document is a review of a Midterm Lab Exam attempt taken on June 19, 2024, in which the participant scored 49 out of 50 marks, a grade of 98%. It contains 50 multiple-choice questions on data analysis, machine learning concepts, and algorithms; all but one (Question 40) were marked correct. Topics include Bayesian networks, clustering algorithms, and the least squares method.

6/19/24, 11:46 PM Midterm Lab Exam: Attempt review

Started on: Wednesday, 19 June 2024, 11:26 PM
State: Finished
Completed on: Wednesday, 19 June 2024, 11:33 PM
Time taken: 7 mins 53 secs
Marks: 49.00/50.00
Grade: 98.00 out of 100.00

Question 1
Correct

Mark 1.00 out of 1.00

How does the least squares method handle outliers in the data set?

Select one:
a. It removes them
b. It gives them more weight
c. It gives them less weight
d. It ignores them
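
Note on Question 1: least squares minimizes the sum of squared residuals, so a point far from the trend contributes disproportionately to the fit. The following is a minimal sketch of that effect, not part of the exam; the helper `fit_line` and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the ratio of co-deviation to squared x-deviation.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
clean_ys = [2.0, 4.0, 6.0, 8.0, 10.0]    # lies exactly on y = 2x
outlier_ys = [2.0, 4.0, 6.0, 8.0, 30.0]  # same data, last point replaced by an outlier

clean_intercept, clean_slope = fit_line(xs, clean_ys)
outlier_intercept, outlier_slope = fit_line(xs, outlier_ys)
print(clean_slope, outlier_slope)
```

Changing one of five points is enough to move the fitted slope well away from the slope of the remaining four, which is why robust alternatives are often preferred when outliers are expected.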

Question 2

Correct

Mark 1.00 out of 1.00

Which of the following file types can be imported into KNIME?

Select one:
a. XML
b. CSV
c. Excel
d. All of the above

https://trimestralexam.amaesonline.com/2333B/mod/quiz/review.php?attempt=45210&cmid=12925&showall=1 1/17

Question 3

Correct

Mark 1.00 out of 1.00

What is an edge in a Bayesian network?

Select one:
a. A probabilistic relationship between two variables
b. A point in the network where two or more nodes meet
c. A variable in the system being modeled
d. None of the above

Question 4

Correct

Mark 1.00 out of 1.00

What is a perceptron?

Select one:
a. A type of deep learning neural network
b. A type of artificial neuron that can be trained to recognize patterns
c. A type of unsupervised learning algorithm
d. A type of machine learning algorithm for classification tasks
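
Note on Question 4: a perceptron can be sketched as a single thresholded neuron whose weights are nudged whenever it misclassifies a training example. The code below is an illustrative sketch under that assumption; the function names, the learning rate, and the choice of the logical AND data set are hypothetical and not taken from the exam.

```python
def predict(weights, bias, x):
    """Threshold unit: fire (1) when the weighted sum is positive."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Classic perceptron learning rule on 2-input samples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            # Move the weights toward the input when the unit under-fires,
            # away from it when the unit over-fires.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Logical AND is linearly separable, so the rule converges on it.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(weights, bias, predictions)
```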

Question 5
Correct

Mark 1.00 out of 1.00

The KL distance is often used in natural language processing to compare the distribution of words in a document with the distribution of words in a reference corpus. In this context, a low KL distance indicates that the document is:

Select one:
a. Somewhat similar to the reference corpus
b. Very similar to the reference corpus
c. Very different from the reference corpus
d. Somewhat different from the reference corpus

Question 6

Correct

Mark 1.00 out of 1.00

What is the Naive Bayes classifier used for?

Select one:
a. To classify data into different categories based on certain features
b. To predict the probability of an event occurring
c. All of the above
d. To predict the value of a continuous variable

Question 7

Correct

Mark 1.00 out of 1.00

What is an example of a regression task in supervised learning?

Select one:
a. Predicting the stock price for the next day based on historical data
b. Grouping customers into different segments based on their spending habits
c. Determining whether an email is spam or not
d. Predicting the price of a house based on its characteristics

Question 8
Correct

Mark 1.00 out of 1.00

What is the process of removing unnecessary weights from a trained perceptron called?

Select one:
a. Validation
b. Training
c. Testing
d. Pruning

Question 9

Correct

Mark 1.00 out of 1.00

What is the EM algorithm used to optimize in the "M" step?

Select one:
a. The latent variables
b. The likelihood of the model
c. The model parameters
d. The prediction accuracy of the model

Question 10

Correct

Mark 1.00 out of 1.00

Hierarchical clustering is sensitive to the ______________ of the data.

Select one:
a. All of the above
b. Outliers
c. Scale
d. Variance

Question 11
Correct

Mark 1.00 out of 1.00

What is the EM algorithm used for?

Select one:
a. All of the above
b. Classification
c. Clustering
d. Regression

Question 12

Correct

Mark 1.00 out of 1.00

What is a batch learning algorithm?

Select one:
a. An algorithm that processes the training data one example at a time
b. An algorithm that processes the training data in real-time
c. An algorithm that processes all of the training data at once
d. An algorithm that processes the training data in small groups or batches

Question 13

Correct

Mark 1.00 out of 1.00

What is the process of selecting a subset of data for analysis called?

Select one:
a. Filtering
b. Cleaning
c. Normalizing
d. Sampling

Question 14
Correct

Mark 1.00 out of 1.00

Which of the following is NOT a disadvantage of the k-means algorithm?

Select one:
a. It can be computationally expensive for large datasets
b. It is sensitive to the initial placement of centroids
c. It can handle categorical variables
d. It may produce suboptimal results if the clusters are not spherical

Question 15

Correct

Mark 1.00 out of 1.00

How can the sensitivity to the initial placement of centroids be addressed in the k-means algorithm?

Select one:
a. By using the k-means++ initialization method
b. By using a different clustering algorithm
c. By using a hierarchical clustering approach
d. By normalizing the data prior to clustering
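
Note on Question 15: k-means++ seeding spreads the initial centroids out by sampling each new centroid with probability proportional to its squared distance from the nearest centroid already chosen. The sketch below assumes 1-D points for brevity; `kmeans_pp_init` and the sample data are hypothetical names and values used only for illustration.

```python
import random

def kmeans_pp_init(points, k, rng):
    """Pick k initial centroids from 1-D points using k-means++ seeding."""
    # First centroid: uniformly at random.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min((p - c) ** 2 for c in centroids) for p in points]
        # Far-away points are more likely to be picked; chosen points have weight 0.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids

points = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
centroids = kmeans_pp_init(points, k=2, rng=random.Random(0))
print(centroids)
```

The seeding only chooses starting positions; the usual assign-and-update iterations of k-means would follow it.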

Question 16

Correct

Mark 1.00 out of 1.00

The KL distance is often used in machine learning to evaluate the performance of a classification model. In this context, a low KL distance indicates that the model's predicted class probabilities are:

Select one:
a. Somewhat different from the true class probabilities
b. Somewhat similar to the true class probabilities
c. Very different from the true class probabilities
d. Very similar to the true class probabilities

Question 17

Correct

Mark 1.00 out of 1.00

What is the "E" step in the EM algorithm?

Select one:
a. The step where the model parameters are updated
b. The step where the likelihood of the model is maximized
c. The step where the expectation of the latent variables is calculated
d. The step where the prediction accuracy of the model is calculated
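
Note on the EM questions (9, 11, 17, 40): the two steps can be sketched on a two-component 1-D Gaussian mixture. This is a minimal illustration under stated assumptions, not the course's implementation; the data, the starting parameters, and the variance floor are hypothetical choices.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    # E step: expected membership (responsibility) of each component for each point.
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M step: update the model parameters using those responsibilities.
    new_weights, new_means, new_vars = [], [], []
    for j in range(len(weights)):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_weights.append(nj / len(data))
        new_means.append(mean)
        new_vars.append(max(var, 1e-6))  # floor guards against a collapsed component
    return new_weights, new_means, new_vars

def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the current mixture."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )

data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
weights, means, variances = [0.5, 0.5], [0.0, 6.0], [1.0, 1.0]
history = [log_likelihood(data, weights, means, variances)]
for _ in range(10):
    weights, means, variances = em_step(data, weights, means, variances)
    history.append(log_likelihood(data, weights, means, variances))
print(means, history[0], history[-1])
```

Tracking the log-likelihood across iterations shows the quantity EM is driving upward: each E step followed by an M step should not decrease it.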

Question 18

Correct

Mark 1.00 out of 1.00

What is an example of a batch learning algorithm used for classification tasks?

Select one:
a. Decision tree
b. Linear regression
c. Support vector machine
d. K-nearest neighbors

Question 19

Correct

Mark 1.00 out of 1.00

How is the slope of the line of best fit calculated using the least squares method?

Select one:
a. By dividing the sum of the product of the x values and the y values by the sum of the x values
b. By dividing the sum of the y values by the sum of the squares of the x values
c. By dividing the sum of the product of the x values and the y values by the sum of the squares of the x values
d. By dividing the sum of the y values by the sum of the x values
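
Note on Question 19: the ratio "sum of x times y over sum of x squared" gives the least squares slope when the line is forced through the origin, or when x and y are first expressed as deviations from their means. The sketch below compares the two cases; the function names and data are hypothetical.

```python
def slope_through_origin(xs, ys):
    """Least squares slope for y = b*x (no intercept): sum(xy) / sum(x^2)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def slope_with_intercept(xs, ys):
    """Least squares slope for y = a + b*x: same ratio on mean-centered data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return slope_through_origin(dx, dy)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # lies exactly on y = 2x + 1

raw_slope = slope_through_origin(xs, ys)
centered_slope = slope_with_intercept(xs, ys)
print(raw_slope, centered_slope)
```

On data with a non-zero intercept the uncentered ratio overstates the slope, so the centering step matters whenever the fitted line is allowed an intercept.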

Question 20
Correct

Mark 1.00 out of 1.00

What is a Bayesian network used for?

Select one:
a. All of the above
b. To perform machine learning tasks
c. To optimize the use of resources
d. To model and predict the behavior of systems

Question 21

Correct

Mark 1.00 out of 1.00

How is the line of best fit calculated using the least squares method?

Select one:
a. By minimizing the sum of the squares of the errors between the data points and the line of best fit
b. By minimizing the variance of the data set
c. By minimizing the mean of the data set
d. By minimizing the sum of the absolute values of the errors between the data points and the line of best fit

Question 22

Correct

Mark 1.00 out of 1.00

How is the final set of clusters determined in the k-means algorithm?

Select one:
a. By selecting the set of clusters that maximize the sum of squared errors
b. By selecting the set of clusters that minimize the within-cluster variance
c. By selecting the set of clusters that minimize the sum of squared errors
d. By selecting the set of clusters that maximize the within-cluster variance

Question 23
Correct

Mark 1.00 out of 1.00

What is the process of identifying and removing duplicate data called?

Select one:
a. De-duplication
b. Sampling
c. Cleaning
d. Filtering

Question 24

Correct

Mark 1.00 out of 1.00

How is KNIME different from other data analysis tools?

Select one:
a. It is open source
b. It is free
c. It has a user-friendly interface
d. It allows users to build custom data pipelines

Question 25

Correct

Mark 1.00 out of 1.00

What is a node in a Bayesian network?

Select one:
a. A point in the network where two or more edges meet
b. A probabilistic relationship between two variables
c. A variable in the system being modeled
d. All of the above

Question 26
Correct

Mark 1.00 out of 1.00

What is supervised learning used for?

Select one:
a. Classification tasks
b. Regression tasks
c. Both classification and regression tasks
d. Unsupervised learning tasks

Question 27

Correct

Mark 1.00 out of 1.00

What is an example of a batch learning algorithm used for clustering tasks?

Select one:
a. K-means
b. All of the above
c. Agglomerative clustering
d. DBSCAN

Question 28

Correct

Mark 1.00 out of 1.00

Hierarchical clustering is a type of ______________ technique.

Select one:
a. Classification
b. Dimensionality reduction
c. Clustering
d. Regression

Question 29
Correct

Mark 1.00 out of 1.00

What is the main advantage of using a directed acyclic graph (DAG) over other types of graphs?

Select one:
a. All of the above
b. DAGs are more efficient for storing and processing data
c. DAGs are easier to understand and visualize
d. DAGs can represent more complex relationships between data

Question 30

Correct

Mark 1.00 out of 1.00

What is an example of a batch learning algorithm?

Select one:
a. Support vector machine
b. All of the above
c. K-nearest neighbors
d. Linear regression

Question 31

Correct

Mark 1.00 out of 1.00

What is the process of applying machine learning algorithms to data called?

Select one:
a. Data analysis
b. Data mining
c. Data visualization
d. Data modeling

Question 32
Correct

Mark 1.00 out of 1.00

The ______________ linkage criterion is a popular choice for hierarchical clustering. It defines the distance between two clusters as the maximum distance between their members and merges the pair of clusters for which this distance is smallest.

Select one:
a. Single
b. Average
c. Centroid
d. Complete
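
Note on Question 32: linkage criteria differ in how they score the distance between two clusters; the merge step then picks the pair with the smallest score. Single linkage scores a pair by its closest members, complete linkage by its farthest members. The sketch below uses hypothetical 1-D clusters chosen so that the two criteria disagree.

```python
from itertools import combinations

def single_linkage(a, b):
    """Cluster distance = distance between the closest pair of members."""
    return min(abs(x - y) for x in a for y in b)

def complete_linkage(a, b):
    """Cluster distance = distance between the farthest pair of members."""
    return max(abs(x - y) for x in a for y in b)

def closest_pair(clusters, linkage):
    """Return the names of the two clusters that would be merged next."""
    return min(
        combinations(sorted(clusters), 2),
        key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
    )

clusters = {"A": [0.0, 4.0], "B": [5.0, 6.0], "C": [7.5, 8.0]}
single_merge = closest_pair(clusters, single_linkage)
complete_merge = closest_pair(clusters, complete_linkage)
print(single_merge, complete_merge)
```

Here single linkage merges A with B because two of their members are adjacent, while complete linkage prefers B with C because that pair is the most compact overall.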

Question 33

Correct

Mark 1.00 out of 1.00

What is an example of a batch learning algorithm used for dimensionality reduction tasks?

Select one:
a. Principal component analysis
b. All of the above
c. Multidimensional scaling
d. t-SNE

Question 34

Correct

Mark 1.00 out of 1.00

What is KNIME used for?

Select one:
a. Data mining
b. Data visualization
c. All of the above
d. Data analysis

Question 35
Correct

Mark 1.00 out of 1.00

How can users access the KNIME Marketplace?

Select one:
a. From the KNIME forum
b. From the KNIME interface
c. All of the above
d. From the KNIME website

Question 36

Correct

Mark 1.00 out of 1.00

Which of the following is NOT a feature of KNIME?

Select one:
a. Data transformation
b. Machine learning
c. Flow-based programming
d. Data storage

Question 37

Correct

Mark 1.00 out of 1.00

What is the advantage of the Naive Bayes classifier over other classifiers?

Select one:
a. It is faster to train and predict
b. It is more flexible
c. It is able to handle large amounts of data
d. It is more accurate

Question 38
Correct

Mark 1.00 out of 1.00

What is an example of a real-world application of directed acyclic graphs (DAGs)?

Select one:
a. Social media networks
b. Computer networks
c. All of the above
d. Data pipelines

Question 39

Correct

Mark 1.00 out of 1.00

Is the least squares method a deterministic or a probabilistic method?

Select one:
a. Both deterministic and probabilistic
b. Probabilistic
c. Deterministic
d. Neither deterministic nor probabilistic

Question 40

Incorrect

Mark 0.00 out of 1.00

What is the main goal of the EM algorithm?

Select one:
a. To maximize the prediction accuracy of the model 
b. To maximize the likelihood of a model given the data
c. To minimize the error between the predicted and actual values of the data
d. To minimize the cost or loss function of a model

Question 41
Correct

Mark 1.00 out of 1.00

What is an example of a classification task in supervised learning?

Select one:
a. Grouping customers into different segments based on their spending habits
b. Determining whether an email is spam or not
c. Predicting the price of a house based on its characteristics
d. Predicting the stock price for the next day based on historical data

Question 42

Correct

Mark 1.00 out of 1.00

The KL distance can be used to measure the information lost when approximating one distribution with another. In this context, the distribution being approximated is known as the:

Select one:
a. Reference distribution
b. Approximation distribution
c. Base distribution
d. Target distribution

Question 43
Correct

Mark 1.00 out of 1.00

What is the main disadvantage of the Hebb rule?

Select one:
a. It is slow to converge
b. It is unable to handle large datasets
c. It is unable to handle nonlinear relationships
d. It is prone to overfitting

Question 44

Correct

Mark 1.00 out of 1.00

How is the Hebb rule used in the training of a neural network?

Select one:
a. It is used to calculate the output of the neural network
b. It is used to determine the input to the neural network
c. It is used to determine the structure of the neural network
d. It is used to adjust the weights of the neural network based on the input and output
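
Note on the Hebb rule questions (43, 44, 47): in its simplest supervised form the rule adds the product of input and output to each weight, which is a one-line update. The sketch below assumes bipolar (+1/-1) inputs and targets and uses the target as the output signal, as in the textbook "Hebb net"; the function name and the AND data set are hypothetical choices for illustration.

```python
def hebb_update(weights, inputs, output, lr=1.0):
    """Hebb rule: strengthen each weight by input * output."""
    return [w + lr * x * output for w, x in zip(weights, inputs)]

# Bipolar logical AND: inputs and targets are +1 or -1.
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]

weights = [0.0, 0.0]
bias = 0.0
for x, target in samples:
    weights = hebb_update(weights, x, target)
    bias += target  # bias behaves like a weight on a constant input of 1

def classify(x):
    """Sign of the weighted sum, reported as +1 or -1."""
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else -1

predictions = [classify(x) for x, _ in samples]
print(weights, bias, predictions)
```

A single pass over the data is all the rule needs, which reflects how simple it is to implement; the resulting unit is still a linear threshold, so it cannot separate classes that are not linearly separable.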

Question 45

Correct

Mark 1.00 out of 1.00

What is the minimum required Java version to run KNIME?

Select one:
a. Java 9
b. Java 8
c. Java 10
d. Java 7

Question 46

Correct

Mark 1.00 out of 1.00

In information theory, the KL distance can be used to measure the information lost when approximating one distribution with another. Which of the following is NOT a property of the KL distance in this context?

Select one:
a. It is non-negative
b. It is zero only when the two distributions are identical
c. It is always positive
d. It is non-symmetric
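
Note on the KL distance questions (5, 16, 42, 46, 48): the properties listed above can be checked numerically for discrete distributions. This is a small sketch assuming both distributions are given as probability lists over the same outcomes and that q is positive wherever p is; the function name and the example distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions, in nats.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

kl_pq = kl_divergence(p, q)  # information lost when q approximates p
kl_qp = kl_divergence(q, p)  # generally a different number
kl_pp = kl_divergence(p, p)  # identical distributions
print(kl_pq, kl_qp, kl_pp)
```

The two directions give different values, and the value for identical distributions is zero, so the measure is non-negative and non-symmetric rather than a true distance.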

Question 47

Correct

Mark 1.00 out of 1.00

What is the main advantage of the Hebb rule?

Select one:
a. It is able to handle nonlinear relationships
b. It is able to handle large datasets
c. It is fast to converge
d. It is easy to implement

Question 48

Correct

Mark 1.00 out of 1.00

The KL distance is always non-negative and is equal to zero only when the two probability distributions are:

Select one:
a. Independently distributed
b. Mutually exclusive
c. Identically distributed
d. Uniformly distributed

Question 49

Correct

Mark 1.00 out of 1.00

How does supervised learning differ from unsupervised learning?

Select one:
a. Supervised learning involves predicting a value, while unsupervised learning involves clustering data
b. Supervised learning involves clustering data, while unsupervised learning involves predicting a value
c. Supervised learning involves labeled data, while unsupervised learning involves unlabeled data
d. Supervised learning involves predicting a continuous value, while unsupervised learning involves predicting a categorical value

Question 50
Correct

Mark 1.00 out of 1.00

How does the Naive Bayes classifier calculate the probability of a data point belonging to a particular class?

Select one:
a. By using the least squares method
b. By using the gradient descent algorithm
c. By using the Bayes theorem
d. By using the maximum likelihood estimation
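
Note on Question 50: Naive Bayes scores each class as its prior times the product of per-feature likelihoods (the "naive" independence assumption), then normalizes the scores with Bayes' theorem. The sketch below uses made-up priors and likelihoods for a two-class spam example; every number and name in it is hypothetical.

```python
def naive_bayes_posterior(priors, likelihoods, features):
    """Posterior P(class | features) under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for feature in features:
            # Multiply in P(feature | class) for each observed feature.
            score *= likelihoods[label][feature]
        scores[label] = score
    # Normalizing by the total is the evidence term in Bayes' theorem.
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["free"])
print(posterior)
```

In practice the priors and likelihoods would be estimated from labeled training data, typically by counting with smoothing, rather than written by hand.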
