ISE 529 mock test answers
Mock mid
5. In the Bag of Words model, what is the primary limitation that can
affect text classification tasks?
6. In the Bag of Words model, how is text data typically represented for machine
learning models?
A) As a structured table where each word is a feature with its frequency as a value.
B) As a continuous vector where word meanings are preserved.
C) As a sequence of words in the order they appear in the document.
D) As an image matrix where words are represented as pixels.
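As a rough illustration of the representation asked about in questions 5 and 6, the sketch below builds a small word-count table with the standard library only. The three sentences are made-up examples, not exam data.

```python
from collections import Counter

# Hypothetical toy documents (not from the exam).
docs = ["dog bites man", "man bites dog", "dog bites dog"]

# The vocabulary becomes the feature columns of the table.
vocab = sorted({word for doc in docs for word in doc.split()})

# One row per document; each cell is how often that word occurs.
rows = []
for doc in docs:
    counts = Counter(doc.split())
    rows.append([counts[word] for word in vocab])

print(vocab)
for row in rows:
    print(row)
```

In this toy case the first two documents get identical rows even though they mean different things, because word counts discard word order.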
9. Which of the following preprocessing steps helps reduce the dimensionality of text
data while preserving important information?
A) Removing stop words and applying stemming or lemmatization
B) Converting text into uppercase letters for uniformity
C) Keeping all punctuation and special characters
D) Replacing all words with their synonyms to increase variability
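A minimal sketch of how stop-word removal and stemming can shrink a vocabulary. The stop-word list, the `crude_stem` helper, and the two sentences are all assumptions made for illustration; `crude_stem` is a simple suffix stripper standing in for a real stemmer or lemmatizer.

```python
# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "a", "is", "are", "and", "of", "in", "on"}

def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop stop words, then stem the remaining tokens."""
    tokens = text.lower().split()
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

# Hypothetical toy documents.
docs = ["the cats are chasing the mouse", "a cat chased the mice in the garden"]

raw_vocab = {t for d in docs for t in d.lower().split()}
clean_vocab = {t for d in docs for t in preprocess(d)}

print(len(raw_vocab), len(clean_vocab))
```

Each distinct token is one feature column, so a smaller vocabulary means fewer dimensions. Note that the crude stemmer does not merge "mouse" and "mice"; a lemmatizer would be needed for that.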
12. Which evaluation metric is most suitable for a classification model when dealing
with imbalanced datasets?
A) Accuracy
B) Precision, Recall, and F1-score
C) Mean Squared Error (MSE)
D) R-squared (R²)
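To see why the choice of metric matters on imbalanced data, the sketch below scores a classifier that always predicts the majority class. The 95/5 class split and the predictions are made-up numbers.

```python
# Hypothetical imbalanced labels: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)
# Guard against division by zero when nothing is predicted positive.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(accuracy, precision, recall, f1)
```

Accuracy looks high here while recall and F1 are zero, since the model never finds a positive case.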
13. How are Decision Trees different from linear models?
A) Decision Trees create splits in the data based on linear combinations of features.
B) Decision Trees can capture non-linear relationships without needing feature
transformations.
C) Decision Trees only work with continuous data, while linear models handle both continuous
and categorical data.
D) Decision Trees require the data to be scaled like linear models.
15. Assume you are working on predicting housing prices in a highly volatile market
where some houses are priced unusually high due to unique features (like proximity
to landmarks).
You want to create a model that ignores these extreme outliers and focuses on
capturing the general trend in pricing. There are a few things you are considering:
- You expect that the relationship between the house features (like size, location,
and age) and the price is non-linear.
Which model would be the most appropriate for this situation to ensure the best fit
while considering maximum margin and handling non-linearity?
16. Which assumption of linear regression ensures that the error terms have the same
variance across all levels of the independent variables?
A) Linearity
B) Normality of residuals
C) Homoscedasticity
D) Independence of errors
18. Based on the data table, you are tasked with fitting a polynomial regression model to
predict Y from X. You suspect that a polynomial relationship exists between X and
Y. You decide to increase the polynomial degree until you find the best fit.
Looking at the values of Y as X increases, what degree of polynomial would you expect to
be appropriate for fitting this data?
X (Features) Y (Target)
1 3
2 12
3 27
4 48
5 75
A) Linear (degree = 1)
B) Quadratic (degree = 2)
C) Cubic (degree = 3)
D) Quartic (degree = 4)
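One quick way to inspect the table in question 18 is with finite differences: for equally spaced X values, a polynomial of degree d has constant d-th differences. The sketch below applies this to the Y values from the table; the helper name `differences` is an assumption.

```python
def differences(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]

x = [1, 2, 3, 4, 5]
y = [3, 12, 27, 48, 75]  # Y values from the table

first = differences(y)
second = differences(first)

print(first)   # not constant
print(second)  # constant

# Check the table against a simple closed form.
matches_3x_squared = all(yi == 3 * xi ** 2 for xi, yi in zip(x, y))
print(matches_3x_squared)
```

The second differences are constant, and every row satisfies Y = 3X².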
19. You are working on a multiple linear regression model to predict house prices based
on house size and location, using the following table. The "Location" variable is
categorical and needs to be encoded using dummy variables.
House Size (sqft) Location Price ($)
Which of the following actions should you take to avoid the Dummy Variable Trap?
A) Keep all three dummy variables in the model.
B) Remove the dummy variable for "Urban" since it's the first category.
C) Remove one of the dummy variables (for any category) to prevent multicollinearity.
D) Add an additional dummy variable for houses that do not belong to any of the existing
categories.
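A small sketch of why keeping every dummy column causes trouble. Only "Urban" is named in the question; the other two category names and the sample rows are hypothetical placeholders.

```python
# "Urban" comes from the question; "Suburban" and "Rural" are assumed names.
categories = ["Urban", "Suburban", "Rural"]
# Hypothetical sample of the Location column.
locations = ["Urban", "Suburban", "Rural", "Urban"]

# Full one-hot encoding: one column per category.
full = [[1 if loc == c else 0 for c in categories] for loc in locations]

# Every row sums to 1, so the three columns add up to the intercept column.
row_sums = [sum(row) for row in full]

# Dropping any one column (here the first) removes that exact dependence;
# the dropped category becomes the baseline.
reduced = [row[1:] for row in full]

print(row_sums)
print(reduced)
```

Which category is dropped does not affect the fit; it only changes which category the remaining coefficients are measured against.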
Calculate R² adjusted and R².
R² = 173383 / 225993
R² adjusted = 1 − (1 − R²) × 58 / 44
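The arithmetic above can be checked with a few lines of Python. This sketch assumes the standard adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − p − 1), and reads 58 as n − 1 and 44 as n − p − 1; the variable names are illustrative.

```python
# Values taken from the answer above.
explained = 173383   # numerator of R²
total = 225993       # denominator of R²
df_total = 58        # assumed to be n - 1
df_residual = 44     # assumed to be n - p - 1

r2 = explained / total
r2_adjusted = 1 - (1 - r2) * df_total / df_residual

print(round(r2, 4), round(r2_adjusted, 4))
```

Under these assumptions R² is roughly 0.767 and the adjusted value is roughly 0.693, lower because of the penalty for the number of predictors.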
Null Hypothesis (H₀): There is no significant difference in pulp brightness between the shift
operators.
Use ANOVA (Analysis of Variance) to compare means across the four operators.
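A minimal sketch of the one-way ANOVA F statistic, computed by hand with the standard library. The brightness readings below are made-up placeholder numbers for four operators, not the exam data, and `one_way_anova_f` is a hypothetical helper name.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(x for g in groups for x in g) / n_total

    # Between-group sum of squares: group means vs. the grand mean.
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations vs. their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical readings, one list per operator.
operators = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
f_stat, df_between, df_within = one_way_anova_f(operators)
print(f_stat, df_between, df_within)
```

The F statistic would then be compared with the critical value of the F distribution for (df_between, df_within) degrees of freedom at the chosen significance level; H₀ is rejected if F exceeds it.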