
All iClicker Questions

Feel free to edit this (formatting, missing content, answers...). Can an instructor provide answers
to these? I can also try to make a key if I get the time. Maybe we can pin and keep growing this
post?

1.1 (Intro): select all of the following statements which are True

 (A) Predicting spam is an example of machine learning.


 (B) Predicting housing prices is not an example of machine learning.
 (C) For problems such as spelling correction, translation, face recognition, and spam identification, if you are a domain expert, it's usually faster and more scalable to come up with a robust set of rules manually rather than building a machine learning model.
 (D) If you are asked to write a program to find all prime numbers up to a limit, it is better
to implement one of the algorithms for doing so rather than using machine learning.
 (E) Google News is likely using machine learning to organize news.

Varada's answers: A, D, E

2.2: Supervised vs unsupervised

Select all of the following statements which are examples of supervised machine learning

 (A) Finding groups of similar properties in a real estate data set.


 (B) Predicting whether someone will have a heart attack or not on the basis of
demographic, diet, and clinical measurement.
 (C) Grouping articles on different topics from different news sources (something like the
Google News app).
 (D) Detecting credit card fraud based on examples of fraudulent and non-fraudulent
transactions.
 (E) Given some measure of employee performance, identify the key factors which are
likely to influence their performance.

Varada's answers: B, D, E

2.3: Classification vs regression

Select all of the following statements which are examples of regression problems

 (A) Predicting the price of a house based on features such as number of bedrooms and the
year built.
 (B) Predicting if a house will sell or not based on features like the price of the house,
number of rooms, etc.
 (C) Predicting percentage grade in CPSC 330 based on past grades.

 (D) Predicting whether you should bicycle tomorrow or not based on the weather
forecast.
 (E) Predicting appropriate thermostat temperature based on the wind speed and the
number of people in a room.

Varada's answers: A, C, E

Exercise 2.4: Decision Trees

1. Order the steps below to build ML models using sklearn.


o score to evaluate the performance of a given model
o predict on new examples
o Creating a model instance
o Creating X and y
o fit

Varada's answer:


o Creating X and y
o Creating a model instance
o fit
o score to evaluate the performance of a given model
o predict on new examples
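
For reference, here's a minimal sketch of that workflow in sklearn (the tiny DataFrame and column names are just illustrative):

# Minimal sketch of the Exercise 2.4 workflow; the toy DataFrame is illustrative.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame(
    {"feature1": [1, 2, 3, 4], "feature2": [0, 1, 0, 1], "target": [0, 0, 1, 1]}
)

# 1. Creating X and y
X = df.drop(columns=["target"])
y = df["target"]

# 2. Creating a model instance
model = DecisionTreeClassifier()

# 3. fit
model.fit(X, y)

# 4. score to evaluate the performance of a given model
print(model.score(X, y))

# 5. predict on new examples
print(model.predict(pd.DataFrame({"feature1": [5], "feature2": [1]})))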

2.5: Baselines and decision trees

Select all of the following statements which are TRUE.

 (A) Change in features (i.e., binarizing features above) would change DummyClassifier predictions.
 (B) predict takes only X as argument whereas fit and score take both X and y as
arguments.
 (C) For the decision tree algorithm to work, the feature values must be binary.
 (D) The prediction in a decision tree works by routing the example from the root to the
leaf.

Varada's answers: B, D
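
A small sketch (toy data, assumed names) showing why A is false and B is true: DummyClassifier ignores the feature values, and predict takes only X while fit and score take X and y.

import numpy as np
from sklearn.dummy import DummyClassifier

X = np.array([[1.0], [2.0], [3.0], [4.0]])
X_binarized = (X > 2.5).astype(int)    # a changed (binarized) version of the features
y = np.array([0, 0, 1, 1])

dummy = DummyClassifier(strategy="most_frequent")
dummy.fit(X, y)                        # fit(X, y)
print(dummy.score(X, y))               # score(X, y)
print(dummy.predict(X))                # predict(X): only X
print(dummy.predict(X_binarized))      # same predictions even with changed features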

3.1: ML Fundamentals

Select all of the following statements which are TRUE.

 (A) A decision tree model with no maximum depth (the default max_depth in sklearn) is likely to perform very well on the deployment data.
 (B) Data splitting helps us assess how well our model would generalize.

 (C) Deployment data is only scored once.
 (D) Validation data could be used for hyperparameter optimization.
 (E) It's recommended that data be shuffled before splitting it into train and test sets.

Varada's answers: B, D, E
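
A quick sketch of data splitting with shuffling (toy arrays, illustrative sizes):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# shuffle=True is the sklearn default; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=True, random_state=123
)
print(X_train.shape, X_test.shape)  # (7, 2) (3, 2)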

3.2: ML Fundamentals

Select all of the following statements which are TRUE.

 (A) k-fold cross-validation calls fit k times.


 (B) We use cross-validation to get a more robust estimate of model performance.
 (C) If the mean train accuracy is much higher than the mean cross-validation accuracy it's
likely to be a case of overfitting.
 (D) The fundamental tradeoff of ML states that as training error goes down, validation
error goes up.
 (E) A decision stump on a complicated classification problem is likely to underfit.

Varada's answers: A, B, C, E
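
A sketch of cross-validation (synthetic data, illustrative model choice): 5-fold CV calls fit 5 times, and a large gap between the mean train and validation scores points to overfitting.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=42)

scores = cross_validate(
    DecisionTreeClassifier(), X, y, cv=5, return_train_score=True
)
print(scores["train_score"].mean())  # typically near 1.0 for an unpruned tree
print(scores["test_score"].mean())   # usually noticeably lower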

4.1: kNNs & SVM-RBF

Select all of the following statements which are TRUE.

 (A) Analogy-based models find examples from the test set that are most similar to the
query example we are predicting.
 (B) Euclidean distance will always have a non-negative value.
 (C) With k-NN, setting the hyperparameter k to larger values typically reduces training
error.
 (D) Similar to decision trees, k-NN finds a small set of good features.
 (E) In k-NN, with k >1, the classification of the closest neighbour to the test example
always contributes the most to the prediction.

Varada's answers: B
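
A sketch (synthetic data) of two of the points above: Euclidean distance is never negative, and increasing k typically makes k-NN simpler, so the training score tends to drop rather than improve.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

print(np.linalg.norm(np.array([1.0, 2.0]) - np.array([4.0, 6.0])))  # 5.0, never negative

X, y = make_classification(n_samples=200, random_state=0)
for k in [1, 5, 25]:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, knn.score(X, y))  # training score usually decreases as k grows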

4.2: kNNs & SVM-RBF

Select all of the following statements which are TRUE.

 (A) k-NN may perform poorly in high-dimensional space (say, d > 1000).
 (B) In SVM RBF, removing a non-support vector would not change the decision
boundary.
 (C) In sklearn’s SVC classifier, large values of gamma tend to result in higher training
score but probably lower validation score.

 (D) If we increase both gamma and C, we can't be certain if the model becomes more
complex or less complex.

Varada's answers: A, C (B and D are a bit ambiguous. We will avoid ambiguous questions in exams.)
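
A sketch (synthetic data) of the gamma effect in sklearn's SVC: larger gamma usually means a more complex model, so a higher training score but possibly a lower validation score.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=1)

for gamma in [0.01, 1.0, 100.0]:
    scores = cross_validate(SVC(gamma=gamma), X, y, return_train_score=True)
    print(gamma, scores["train_score"].mean(), scores["test_score"].mean())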

5.1: Preprocessing & Pipelines

Take a guess: In your machine learning project, how much time will you typically spend on data
preparation and transformation?

 (A) ~80% of the project time


 (B) ~20% of the project time
 (C) ~50% of the project time
 (D) None. Most of the time will be spent on model building

Varada's answers: (A)

5.2: Preprocessing & Pipelines

Select all of the following statements which are TRUE.

1. StandardScaler ensures a fixed range (i.e., minimum and maximum values) for the
features.
2. StandardScaler calculates mean and standard deviation for each feature separately.
3. In general, it's a good idea to apply scaling on numeric features before training k-NN or
SVM RBF models.
4. The transformed feature values might be hard to interpret for humans.
5. After applying SimpleImputer, the transformed data has a different shape than the original data.

Varada's answers: 2., 3., 4.
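
A sketch (toy array) relevant to statements 2-5: StandardScaler computes a mean and standard deviation per feature (without forcing a fixed range), and SimpleImputer keeps the data's shape.

import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, np.nan]])

X_imp = SimpleImputer(strategy="mean").fit_transform(X)
print(X.shape, X_imp.shape)          # same shape before and after imputation

scaler = StandardScaler().fit(X_imp)
print(scaler.mean_, scaler.scale_)   # one mean and one std per feature
print(scaler.transform(X_imp))       # standardized values, not bounded to a fixed range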

5.3: Preprocessing & Pipelines

Select all of the following statements which are TRUE.

1. You can have scaling of numeric features, one-hot encoding of categorical features, and a scikit-learn estimator within a single pipeline.
2. Once you have a scikit-learn pipeline object with an estimator as the last step, you can
call fit, predict, and score on it.
3. You can carry out data splitting within a scikit-learn pipeline.
4. We have to be careful of the order we put each transformation and model in a pipeline.
5. If you call cross_validate with a pipeline object, it will call fit and transform on the
training fold and only transform and predict/score on the validation fold.

Varada's answers: 2., 4., 5.
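
A sketch of a pipeline (synthetic data, illustrative steps): transformers first, estimator last, and fit/predict/score (or cross_validate) are called on the pipeline as a whole.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

pipe = make_pipeline(StandardScaler(), SVC())   # order matters: scale, then fit the model
pipe.fit(X, y)
print(pipe.score(X, y))

# cross_validate fits/transforms on the training folds and only
# transforms/scores on the validation folds
print(cross_validate(pipe, X, y)["test_score"].mean())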

6.1: Column Transformer & Text Features

Select all of the following statements which are TRUE.

1. You could carry out cross-validation by passing a ColumnTransformer object to cross_validate.
2. After applying a column transformer, the order of the columns in the transformed data has to be the same as the order of the columns in the original data.
3. After applying a column transformer, the transformed data is always going to be of a different shape than the original data.
4. When you call fit_transform on a ColumnTransformer object, you get a numpy ndarray.

Varada's answers: 4.
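
A sketch of a ColumnTransformer (made-up DataFrame and column names): scaling a numeric column and one-hot encoding a categorical one; here fit_transform returns a numpy array whose shape differs from the input.

import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame(
    {"age": [25, 32, 47], "city": ["Vancouver", "Toronto", "Vancouver"]}
)

ct = make_column_transformer(
    (StandardScaler(), ["age"]),
    (OneHotEncoder(), ["city"]),
)
transformed = ct.fit_transform(df)
print(type(transformed))   # numpy array (dense in this case)
print(transformed.shape)   # (3, 3): 1 scaled column + 2 one-hot columns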

6.2: Column Transformer & Text Features

Select all of the following statements which are TRUE.

 (A) handle_unknown="ignore" would treat all unknown categories equally.
 (B) As you increase the value of the max_features hyperparameter of CountVectorizer, the training score is likely to go up.
 (C) Suppose you are encoding text data using CountVectorizer. If you encounter a word in the validation or the test split that's not available in the training data, you'll get an error.
 (D) In the code below, inside cross_validate, each fold might have a slightly different number of features (columns) in the fold.

from sklearn.pipeline import make_pipeline

pipe = make_pipeline(CountVectorizer(), SVC())
cross_validate(pipe, X_train, y_train)

Varada's answers: A, B, D
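
On (C): a small sketch (made-up documents) showing that a fitted CountVectorizer simply ignores words it didn't see during fit, rather than raising an error, and that max_features caps the vocabulary size.

from sklearn.feature_extraction.text import CountVectorizer

train_docs = ["good movie", "bad movie", "good acting"]
vec = CountVectorizer(max_features=3).fit(train_docs)
print(vec.get_feature_names_out())   # vocabulary learned from the training documents only

# "unseen" and "word" are not in the vocabulary, so they are silently dropped
print(vec.transform(["good unseen word"]).toarray())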

7.1: Linear Models

Select all of the following statements which are TRUE.

 (A) Increasing the hyperparameter alpha of Ridge is likely to decrease model complexity.
 (B) Ridge can be used with datasets that have multiple features.
 (C) With Ridge, we learn one coefficient per training example.
 (D) If you train a linear regression model on a 2-dimensional problem (2 features), the
model will learn 3 parameters: one for each feature and one for the bias term.

Varada's answers: A, B, D
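
A sketch of Ridge (synthetic data): one coefficient per feature plus an intercept (3 parameters for 2 features), and larger alpha shrinks the coefficients toward zero.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                    # 2 features
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=100)

for alpha in [0.01, 100.0]:
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(alpha, ridge.coef_, ridge.intercept_)  # 2 coefficients + 1 intercept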

7.2: Linear Models

Select all of the following statements which are TRUE.

 (A) Increasing logistic regression's C hyperparameter increases model complexity.


 (B) The raw output score (i.e., the weighted sum of feature values + bias) can be used to
calculate the probability score for a given prediction.
 (C) For a linear classifier trained on d features, the decision boundary is a (d-1)-dimensional hyperplane.
 (D) A linear model is likely to be uncertain about the data points close to the decision
boundary.

Varada's answers: A, B, C, D
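
A sketch (synthetic data) of (B) and (D): the raw score from decision_function passed through the sigmoid matches predict_proba, and examples with raw scores near 0 (close to the boundary) get probabilities near 0.5.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
lr = LogisticRegression(C=1.0).fit(X, y)

raw = lr.decision_function(X[:3])     # weighted sum of feature values + bias
print(1 / (1 + np.exp(-raw)))         # sigmoid of the raw scores
print(lr.predict_proba(X[:3])[:, 1])  # same values: probability of class 1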

8.1: Hyperparameter Optimization

Select all of the following statements which are TRUE.

 (A) If you get best results at the edges of your parameter grid, it might be a good idea to
adjust the range of values in your parameter grid.
 (B) Grid search is guaranteed to find the best hyperparameter values.
 (C) It is possible to get different hyperparameters in different runs
of RandomizedSearchCV.

Varada's answers: A, C
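
A sketch of randomized search (synthetic data, illustrative distributions): because values are sampled, different runs can return different best hyperparameters, and if the best value lands at the edge of your ranges it's worth widening them.

from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

param_dist = {"C": loguniform(1e-3, 1e3), "gamma": loguniform(1e-4, 1e1)}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=10, random_state=1)
search.fit(X, y)
print(search.best_params_)   # may differ between runs with a different random_state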

8.2: Hyperparameter Optimization

Would you trust the model?

 You have a dataset and you give me 1/10th of it. The dataset given to me is rather small
and so I split it into 96% train and 4% validation split. I carry out hyperparameter
optimization using a single 4% validation split and report validation accuracy of 0.97.
Would it classify the rest of the data with similar accuracy?

1. Probably
2. Probably not

Varada's answers: Probably not

9.1: Classification Metrics

Select all of the following statements which are TRUE.

 (A) In medical diagnosis, false positives are more damaging than false negatives (assume
"positive" means the person has a disease, "negative" means they don't).
 (B) In spam classification, false positives are more damaging than false negatives (assume "positive" means the email is spam, "negative" means it's not).
 (C) If method A gets a higher accuracy than method B, that means its precision is also
higher.
 (D) If method A gets a higher accuracy than method B, that means its recall is also
higher.

Varada's answers: B
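
A toy numeric example (made-up labels) for (C)/(D): higher accuracy does not imply higher precision or recall.

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true   = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred_A = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # misses most positives
y_pred_B = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # catches all positives, with false positives

for name, y_pred in [("A", y_pred_A), ("B", y_pred_B)]:
    print(name,
          accuracy_score(y_true, y_pred),   # A: 0.8,  B: 0.7
          precision_score(y_true, y_pred),  # A: 1.0,  B: 0.5
          recall_score(y_true, y_pred))     # A: 0.33, B: 1.0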

10.1: Regression Metrics

Select all of the following statements which are TRUE.

 (A) Price per square foot would be a good feature to add in our X.
 (B) The alpha hyperparameter of Ridge has a similar interpretation to the C hyperparameter of LogisticRegression; higher alpha means a more complex model.
 (C) In Ridge, smaller alpha means bigger coefficients whereas bigger alpha means
smaller coefficients.

Varada's answers: C

10.2: Regression Metrics

Select all of the following statements which are TRUE.

 (A) We can still use precision and recall for regression problems, but now we have other metrics we can use as well.
 (B) In sklearn for regression problems, using r2_score() and .score() (with default
values) will produce the same results.
 (C) RMSE is always going to be non-negative.
 (D) MSE does not directly provide the information about whether the model is
underpredicting or overpredicting.
 (E) We can pass multiple scoring metrics to GridSearchCV or RandomizedSearchCV for
regression as well as classification problems.

Varada's answers: B, C, D, E
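
A sketch (synthetic regression data) of (B)-(D): .score() and r2_score give the same R^2, RMSE is non-negative by construction, and MSE/RMSE don't tell you the direction of the errors.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=100)

model = Ridge().fit(X, y)
pred = model.predict(X)

print(model.score(X, y), r2_score(y, pred))  # identical R^2 values
print(np.sqrt(mean_squared_error(y, pred)))  # RMSE, always >= 0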

11.1: Ensembles

Select the most accurate option below.

 (A) Every tree in a random forest uses a different bootstrap sample of the training set.

 (B) To train a tree in a random forest, we first randomly select a subset of features. The
tree is then restricted to only using those features.
 (C) The n_estimators hyperparameter of random forests should be tuned to get a better
performance on the validation or test data.
 (D) In random forests we build trees in a sequential fashion, where the current tree is
dependent upon the previous tree.
 (E) Let classifiers A, B, and C have training errors of 10%, 20%, and 30%, respectively.
Then, the best possible training error from averaging A, B and C is 10%.

Varada's answers: A
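
A sketch (synthetic data) of where the randomness in a random forest comes from: each tree gets its own bootstrap sample, and a random subset of features is considered at each split (via max_features), rather than fixing one feature subset per tree.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)

rf = RandomForestClassifier(
    n_estimators=50,      # more trees rarely hurts; not a knob to tune against the test set
    bootstrap=True,       # each tree is trained on its own bootstrap sample
    max_features="sqrt",  # random subset of features considered at each split
    random_state=0,
).fit(X, y)
print(rf.score(X, y))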
