ml_interview

The document is a mock paper containing 10 interview questions related to machine learning, along with their solutions. Key topics include the bias-variance trade-off, differences between supervised and unsupervised learning, precision and recall, Bayes' Theorem, and error types. It serves as a resource for individuals preparing for machine learning interviews.

www.foxmula.com

MOCK PAPER

Interview Questions for ML

This Mock Paper has 10 Questions and solutions.

#KeepLearning #KeepGrowing
#TheSmartWay
#foxmula
QUESTION 1
What’s the trade-off between bias and variance?

Bias is error due to erroneous or overly simplistic assumptions in the
learning algorithm you’re using. This can lead to the model
underfitting your data, making it hard for it to have high predictive
accuracy and for you to generalize your knowledge from the training
set to the test set.

Variance is error due to too much complexity in the learning algorithm
you’re using. This leads to the algorithm being highly sensitive to
small fluctuations in your training data, which can lead your model
to overfit the data. You’ll be carrying too much noise from your training
data for your model to be very useful for your test data.

The bias-variance decomposition essentially decomposes the expected
learning error of any algorithm into the sum of the squared bias, the
variance and a bit of irreducible error due to noise in the underlying
dataset. Essentially, if you make the model more complex and add more
variables, you’ll reduce bias but gain some variance. To get the
optimally reduced amount of error, you’ll have to trade off bias
against variance. You don’t want either high bias or high variance in
your model.
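
To make the trade-off concrete, here is a minimal sketch that compares a deliberately too-simple model with a deliberately too-flexible one. The toy data (y = x² plus noise), the seed and the two models are illustrative assumptions, not part of the original paper.

```python
import random

def mse(model, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, rng):
    """Toy data (an assumption for this sketch): y = x^2 plus Gaussian noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [x * x + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

rng = random.Random(0)
train_x, train_y = make_data(30, rng)
test_x, test_y = make_data(30, rng)

# High-bias model: ignores x and always predicts the training mean.
mean_y = sum(train_y) / len(train_y)

def constant_model(x):
    return mean_y

# High-variance model: 1-nearest-neighbour, which memorises every
# training point, noise included.
def nearest_neighbour_model(x):
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

for name, model in [("constant", constant_model),
                    ("1-NN", nearest_neighbour_model)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 4),
          "test MSE:", round(mse(model, test_x, test_y), 4))
```

The constant model underfits, so its error is already high on the training set. The nearest-neighbour model has zero training error because it reproduces the noise exactly, so its test error is the number that tells you how well it generalizes.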

QUESTION 2
What is the difference between supervised and unsupervised
machine learning?

Supervised learning requires labeled training data. For example, in
order to do classification (a supervised learning task), you’ll need to
first label the data you’ll use to train the model to classify data into your
labeled groups. Unsupervised learning, in contrast, does not require
labeling data explicitly.
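
As a rough illustration of the difference, the sketch below runs a tiny nearest-centroid classifier (which needs labels) and a one-dimensional 2-means clustering (which does not) on the same numbers. The data, label names and function names are made up for the example.

```python
def nearest_centroid_fit(xs, labels):
    """Supervised: use the labels to compute one centroid per class."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(v) / len(v) for label, v in groups.items()}

def nearest_centroid_predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(xs, n_iter=10):
    """Unsupervised: 1-D k-means with k=2; no labels are involved."""
    centroids = [min(xs), max(xs)]  # simple deterministic starting points
    assignments = []
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        assignments = [
            0 if abs(x - centroids[0]) <= abs(x - centroids[1]) else 1
            for x in xs
        ]
        # Move each centroid to the mean of its members.
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments

xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["small", "small", "small", "large", "large", "large"]

centroids = nearest_centroid_fit(xs, labels)          # needs labels
prediction = nearest_centroid_predict(centroids, 4.5)
clusters = two_means(xs)                              # sees only xs
print(prediction, clusters)
```

The classifier can name its answer because you told it the names up front. The clustering only returns group indices, and it is up to you to decide what each group means.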
QUESTION 3
Define precision and recall.

Recall is also known as the true positive rate: the number of positives
your model correctly identifies compared to the actual number of
positives there are throughout the data. Precision is also known as the
positive predictive value, and it is a measure of the number of accurate
positives your model claims compared to the total number of positives
it claims. It can be easier to think of recall and precision in the
context of a case where you’ve predicted that there were 10 apples
and 5 oranges in a case of 10 apples. You’d have perfect recall (there
are actually 10 apples, and you predicted there would be 10) but
66.7% precision because out of the 15 events you predicted, only 10
(the apples) are correct.
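
The apple example can be checked with a few lines of code. This is a minimal sketch; the function and argument names are chosen for illustration, with the 10 correct predictions counted as true positives and the 5 wrong ones as false positives.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# 10 correct predictions, 5 incorrect ones, and no apples missed.
precision, recall = precision_recall(
    true_positives=10, false_positives=5, false_negatives=0
)
print(f"precision={precision:.3f} recall={recall:.3f}")
```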

QUESTION 4
What is Bayes’ Theorem? How is it useful in a machine learning
context?

Bayes’ Theorem gives you the posterior probability of an event given
what is known as prior knowledge.

Mathematically, it’s expressed as the true positive rate of a condition
sample divided by the sum of the false positive rate of the population
and the true positive rate of a condition, each weighted by how common
the condition is. Say a flu test comes back positive for 60% of the
people who actually have the flu, but it also comes back positive for
50% of the people who don’t, and the overall population only has a 5%
chance of having the flu. Would you actually have a 60% chance of
having the flu after having a positive test?

Bayes’ Theorem says no. It says that you have a

(0.6 * 0.05) / ((0.6 * 0.05) + (0.5 * 0.95)) = 0.0594

or 5.94% chance of having the flu. Here 0.6 * 0.05 is the true positive
rate of the condition sample and 0.5 * 0.95 is the false positive rate
of the population.
Bayes’ Theorem is the basis behind a branch of machine learning that
most notably includes the Naive Bayes classifier. That’s something
important to consider when you’re faced with machine learning
interview questions.
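
The flu calculation above can be reproduced with a small helper. This is a sketch only; the function and parameter names are illustrative, and the three inputs are the figures used in the example (5% prevalence, 0.6 true positive rate, 0.5 false positive rate).

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) via Bayes' Theorem."""
    # Total probability of a positive test across sick and healthy people.
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1 - prior))
    return true_positive_rate * prior / evidence

p_flu_given_positive = posterior(
    prior=0.05, true_positive_rate=0.6, false_positive_rate=0.5
)
print(round(p_flu_given_positive, 4))
```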
QUESTION 5
What’s your favorite algorithm, and can you explain it to me in less
than a minute?

This type of question tests your understanding of how to
communicate complex and technical nuances with poise and the
ability to summarize quickly and efficiently. Make sure you have a
choice and make sure you can explain different algorithms so simply
and effectively that a five-year-old could grasp the basics!

QUESTION 6
What’s the F1 score? How would you use it?

The F1 score is a measure of a model’s performance. It is the harmonic
mean of the precision and recall of a model, with results tending to 1
being the best, and those tending to 0 being the worst. You would use
it in classification tests where true negatives don’t matter much.
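
F1 is commonly defined as the harmonic mean of precision and recall, 2PR / (P + R). A minimal sketch follows; the function name is illustrative, and the inputs are the 66.7% precision and perfect recall from the apple example in Question 3.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(precision=2 / 3, recall=1.0)
print(round(f1, 3))
```

Because the harmonic mean is pulled toward the smaller of the two values, a model can’t get a high F1 by doing well on precision alone or recall alone.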

QUESTION 7
What’s the difference between Type I and Type II error?

Don’t think that this is a trick question! Many machine learning
interview questions will be an attempt to lob basic questions at you
just to make sure you’re on top of your game and you’ve covered all
of your bases.

Type I error is a false positive, while Type II error is a false negative.
Briefly stated, Type I error means claiming something has happened
when it hasn’t, while Type II error means that you claim nothing is
happening when in fact something is.

A clever way to think about this is to think of Type I error as telling a
man he is pregnant, while Type II error means you tell a pregnant
woman she isn’t carrying a baby.
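
In code, the two error types are just two different kinds of mismatch between what is true and what you claimed. A small sketch, with made-up labels and an illustrative function name:

```python
def error_counts(actual, predicted):
    """Return (type_i, type_ii) counts for boolean labels."""
    # Type I: predicted positive when the truth is negative.
    type_i = sum(1 for a, p in zip(actual, predicted) if p and not a)
    # Type II: predicted negative when the truth is positive.
    type_ii = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return type_i, type_ii

actual = [True, True, False, False, False]
predicted = [True, False, True, True, False]

type_i, type_ii = error_counts(actual, predicted)
print("Type I (false positives):", type_i)
print("Type II (false negatives):", type_ii)
```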
QUESTION 8
What’s a Fourier transform?

A Fourier transform is a generic method to decompose generic
functions into a superposition of simple periodic functions (sines and
cosines). Or, as one more intuitive explanation puts it, given a
smoothie, it’s how we find the recipe. The Fourier transform finds the
set of cycle speeds, amplitudes and phases to match any time signal.
A Fourier transform converts a signal from the time domain to the
frequency domain. It’s a very common way to extract features from
audio signals or other time series such as sensor data.
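
To see the time-to-frequency conversion in action, here is a naive discrete Fourier transform in plain Python applied to a sine wave. The signal (3 cycles over 32 samples) is an assumption for the example, and in practice you would reach for an FFT routine from a numerical library rather than this O(n²) loop.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform of a list of samples."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n))
        for k in range(n)
    ]

n_samples = 32
# A pure sine wave that completes 3 cycles over the sampled window.
signal = [math.sin(2 * math.pi * 3 * t / n_samples) for t in range(n_samples)]

spectrum = dft(signal)
# For a real signal, only the first half of the bins is informative.
magnitudes = [abs(c) for c in spectrum[: n_samples // 2 + 1]]
dominant_frequency = max(range(len(magnitudes)), key=magnitudes.__getitem__)
print("dominant frequency bin:", dominant_frequency)
```

The magnitudes are the kind of feature the answer refers to: one number per frequency, telling you how much of that frequency is in the signal.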

QUESTION 9
How do you handle missing or corrupted data in a dataset?

You could find missing/corrupted data in a dataset and either drop
those rows or columns, or decide to replace them with another value.

In Pandas, there are two very useful methods: isnull() and dropna() that
will help you find columns of data with missing or corrupted data and
drop those values. If you want to fill the invalid values with a
placeholder value (for example, 0), you could use the fillna() method.
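
A short sketch of those three methods on a toy DataFrame follows. The column names and values are invented for the example, and it assumes pandas is installed.

```python
import pandas as pd

nan = float("nan")
df = pd.DataFrame({
    "age": [25, nan, 31, 47],
    "income": [50000, 62000, nan, 58000],
})

# isnull() flags missing cells; summing counts them per column.
missing_per_column = df.isnull().sum()

# dropna() removes every row that has at least one missing value.
dropped = df.dropna()

# fillna(0) keeps all rows and puts a placeholder in the gaps.
filled = df.fillna(0)

print(missing_per_column)
print(len(dropped), "rows left after dropna()")
```

Dropping is the safer choice when only a few rows are affected. Filling keeps the rows, but a placeholder such as 0 can distort statistics like the mean, so pick the fill value with the downstream model in mind.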

QUESTION 10
Pick an algorithm. Write the pseudocode for a parallel
implementation.

This kind of question demonstrates your ability to think in parallelism
and how you could handle concurrency in programming
implementations dealing with big data. Take a look at pseudocode
frameworks such as Peril-L and visualization tools such as Web
Sequence Diagrams to help you demonstrate your ability to write code
that reflects parallelism.
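
If you’d rather show runnable code than pseudocode, one possible answer is a split, map and reduce pattern. The sketch below picks a sum of squares as the algorithm purely for illustration and uses Python’s standard `concurrent.futures` module. Threads are used to keep the example simple; for CPU-bound work in CPython you would more likely use processes, but the structure stays the same.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(values, n_chunks):
    """Split a list into at most n_chunks roughly equal pieces."""
    size = max(1, (len(values) + n_chunks - 1) // n_chunks)
    return [values[i:i + size] for i in range(0, len(values), size)]

def partial_sum_of_squares(chunk):
    """Work done independently by each worker (the map step)."""
    return sum(v * v for v in chunk)

def parallel_sum_of_squares(values, n_workers=4):
    chunks = chunked(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each chunk is processed by a separate worker.
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # Combine the partial results (the reduce step).
    return sum(partials)

total = parallel_sum_of_squares(list(range(1, 101)))
print(total)
```

The point to make in the interview is that the chunks share no state, so the workers never need to coordinate until the final reduce.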
