ML Questions
1. Fundamentals of ML
* What is Machine Learning? How does it differ from traditional programming?
* Explain the differences between Supervised, Unsupervised, and Reinforcement
Learning.
* What is overfitting? How can you prevent it?
* What is underfitting? How does it impact a model's performance?
* What is the bias-variance tradeoff?
* What is the difference between parametric and non-parametric models?
* Explain cross-validation. Why is it important?
* What is the difference between classification and regression?
* What are some common loss functions for regression and classification?
* What are the different types of activation functions? When should you use each?
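To go with the activation-function question above, here is a minimal NumPy sketch of three common activations (sigmoid, ReLU, tanh); the function names and the sample input are illustrative, not taken from any particular library.

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs into (0, 1); common for binary-classification outputs.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Zeroes out negatives; a common default for hidden layers.
    return np.maximum(0.0, x)

def tanh(x):
    # Squashes inputs into (-1, 1); zero-centered, unlike sigmoid.
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # sigmoid(0) is exactly 0.5
print(relu(x))     # negatives become 0
print(tanh(x))     # odd function: tanh(0) = 0
```

A useful talking point in an interview: sigmoid and tanh saturate for large |x| (vanishing gradients), which is one reason ReLU became the default for deep hidden layers.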
2. Data Preprocessing & Feature Engineering
* Why is feature scaling important? What are the common methods?
* What is one-hot encoding, and when should you use it?
* Explain dimensionality reduction. What techniques can be used?
* What is PCA (Principal Component Analysis)? How does it work?
* What are outliers? How can you handle them?
* What is multicollinearity, and how can you detect it?
* How do you deal with imbalanced datasets?
* Explain the difference between L1 (Lasso) and L2 (Ridge) regularization.
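To accompany the PCA question in this section, a minimal NumPy sketch of PCA via the singular value decomposition; the toy dataset and variable names are invented for illustration.

```python
import numpy as np

def pca(X, n_components):
    # Center the data: principal directions are defined on mean-centered features.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered matrix; the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Project the centered data onto the top components.
    return X_centered @ components.T, components

rng = np.random.default_rng(0)
# Toy data: the second feature is a noisy copy of the first, so a single
# component should capture almost all of the variance.
x = rng.normal(size=(100, 1))
X = np.hstack([x, x + 0.01 * rng.normal(size=(100, 1))])
Z, comps = pca(X, n_components=1)
print(Z.shape)  # projected data: (100, 1)
```

Because the two features are nearly identical, the leading component points roughly along the diagonal, weighting both features about equally.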
Basic Questions
1. What is Machine Learning?
Follow-up: Can you explain the difference between supervised, unsupervised, and
reinforcement learning?
2. What are the key differences between machine learning and traditional
programming?
Follow-up: What challenges might arise when designing a system based on ML?
3. Explain the concept of training, validation, and testing datasets.
Follow-up: How would you determine if your model is overfitting?
4. What is feature engineering and why is it important?
Follow-up: Can you provide an example where feature engineering significantly
improved your model's performance?
5. Define bias and variance in the context of machine learning.
Follow-up: How do these concepts relate to the bias-variance tradeoff?
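The train/validation/test question above can be made concrete with a short NumPy sketch; the 60/20/20 split is a common convention, not something prescribed here.

```python
import numpy as np

def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    # Shuffle the indices once so that X and y stay aligned across splits.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return (X[train_idx], y[train_idx],
            X[val_idx], y[val_idx],
            X[test_idx], y[test_idx])

X = np.arange(100).reshape(-1, 1)
y = np.arange(100)
X_tr, y_tr, X_va, y_va, X_te, y_te = train_val_test_split(X, y)
print(len(X_tr), len(X_va), len(X_te))  # 60 20 20
```

The overfitting follow-up then has a natural answer: training loss keeps falling while validation loss stalls or rises, and the held-out test set is touched only once at the end.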
Intermediate Questions
1. Discuss common algorithms for classification and regression.
Follow-up: When would you choose logistic regression over a decision tree, for
example?
2. How does cross-validation work and why is it used?
Follow-up: What are some pitfalls of cross-validation in time series data?
3. What are ensemble methods? Explain bagging, boosting, and stacking.
Follow-up: Can you share a scenario where an ensemble method outperformed a single
model?
4. How do you handle imbalanced datasets?
Follow-up: What techniques (e.g., oversampling, undersampling, synthetic data
generation) have you used and what were the outcomes?
5. Explain the concept of gradient descent.
Follow-up: What are the differences between batch, mini-batch, and stochastic
gradient descent?
6. How do you evaluate the performance of a machine learning model?
Follow-up: Which metrics would you use for a classification task versus a
regression task?
7. What is regularization, and why is it important?
Follow-up: Explain L1 versus L2 regularization and their effects on model
parameters.
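The gradient-descent questions above (item 5 and its batch/mini-batch/stochastic follow-up) can be illustrated on least-squares linear regression; the learning rate, epoch count, and toy data below are arbitrary choices for demonstration.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, epochs=200, batch_size=None, seed=0):
    # batch_size=None -> full batch; 1 -> stochastic; k -> mini-batch.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n = len(X)
    bs = n if batch_size is None else batch_size
    for _ in range(epochs):
        idx = rng.permutation(n)  # reshuffle each epoch
        for start in range(0, n, bs):
            b = idx[start:start + bs]
            # Gradient of mean squared error on the current batch.
            grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
true_w = np.array([3.0, -2.0])
y = X @ true_w  # noise-free targets, so both variants can recover true_w
w_batch = gradient_descent(X, y)                # full-batch
w_mini = gradient_descent(X, y, batch_size=32)  # mini-batch
print(w_batch, w_mini)  # both approach [3, -2]
```

The trade-off to articulate: full-batch steps are exact but expensive per update; smaller batches give cheaper, noisier updates, and that noise is why mini-batch SGD dominates at large scale.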
Advanced Questions
1. Explain the mathematics behind support vector machines (SVMs).
Follow-up: How does the kernel trick work, and why is it useful in SVM?
2. Discuss deep learning architectures.
Follow-up: How do convolutional neural networks (CNNs) differ from recurrent neural
networks (RNNs) and when would you use one over the other?
3. How do you approach hyperparameter tuning?
Follow-up: What are the benefits and drawbacks of grid search versus random search
versus more advanced methods like Bayesian optimization?
4. Discuss model interpretability techniques.
Follow-up: What methods would you use to explain complex models like deep neural
networks?
5. Explain how you would deploy a machine learning model to production.
Follow-up: What challenges might you face with scalability, monitoring, and model
updating?
6. Discuss advanced topics such as transfer learning and unsupervised pre-training.
Follow-up: Can you provide an example of when transfer learning was particularly
effective?
7. How do you manage and version control data, models, and experiments in your
workflow?
Follow-up: What tools or frameworks have you found most effective (e.g., MLflow,
DVC)?
8. What are some recent trends in ML research that excite you?
Follow-up: How do you stay updated with the latest developments in the field?
9. Explain the concept of causal inference in machine learning.
Follow-up: How does it differ from standard predictive modeling, and what are its
challenges?
10. Discuss the ethical considerations and potential biases in machine learning
models.
Follow-up: How would you mitigate unintended biases in a deployed system?
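As a concrete companion to the hyperparameter-tuning question above, here is a toy comparison of grid search and random search over a single hypothetical hyperparameter (a learning rate); the objective function and search ranges are invented for illustration.

```python
import numpy as np

def objective(lr):
    # Hypothetical validation loss, minimized near lr = 0.1.
    return (np.log10(lr) + 1.0) ** 2

def grid_search(grid):
    # Evaluate every candidate; cost grows exponentially with the
    # number of hyperparameters.
    return min(grid, key=objective)

def random_search(n_trials, seed=0):
    # Sample log-uniformly; with a fixed budget this often covers each
    # individual dimension more densely than a grid does.
    rng = np.random.default_rng(seed)
    candidates = 10.0 ** rng.uniform(-4, 0, size=n_trials)
    return min(candidates, key=objective)

grid = [1e-4, 1e-3, 1e-2, 1e-1]
best_grid = grid_search(grid)
best_rand = random_search(20)
print(best_grid)  # 0.1 is the best grid point for this objective
```

Bayesian optimization, mentioned in the follow-up, goes one step further: it fits a surrogate model to past trials and spends each new evaluation where the surrogate predicts the most promise.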