DS QS
L2
What cross-validation technique L2
What’s the F1 score? How would you use it? L1
What do you understand by Precision and Recall? L1
Explain false negative, false positive, true negative and true
positive with a simple example. L1
What is a Confusion Matrix? L1
How is KNN different from K-means clustering? L1
Is it better to have too many false positives or too many false
negatives? Explain. L1
What is the difference between Entropy and Information
Gain? L2
Your model is suffering from low bias and high variance.
Which algorithm should you use to tackle it? Why? L1
You’ve built a random forest model with 10000 trees. You were
delighted to get a training error of 0.00, but the validation
error is 34.23. What is going on? L2
Explain the difference between L1 and L2 regularization. L2
When does regularization become necessary in Machine
Learning? L2
When is Ridge regression preferable to Lasso regression? L2
In which scenarios would you use a neural network? L2
Can PCA + Linear Regression work together? Why or why not? L3
How do you validate your model? L2
How do you select important variables? Explain your
methods. L2
You are given a data set. The data set contains many
variables, some of which you know to be highly correlated.
Your manager has asked you to run PCA. Would you
remove correlated variables first? Why? L2
How do you impute missing values? L1
How do you deal with outliers? L1
How would you screen for outliers and what should you do if
you find one? L1
Name a few libraries in Python used for Data Analysis and
Scientific Computations. L1
How are NumPy and SciPy related? L1
Level: L1 = Beginner, L2 = Intermediate, L3 = Expert
Type: Tech, Subjective