ML Mod 3
ML Mod 3
Write expressions for hypothesis, cost function and for parameter using gradient
descent for logistic regression. Explain each term in short.
Logistic regression is a type of machine learning algorithm used for binary classification tasks, like
predicting whether an email is spam or not spam.
Let's say we're trying to predict whether students will pass (1) or fail (0) an exam based on the
number of hours they studied. We have data for several students, where each row represents one
student:
Hours studied: 3, 5, 2, 8, 6
Result (pass/fail): 0, 1, 0, 1, 1
Logistic regression takes this data and calculates the probability of passing the exam for each
student based on the number of hours they studied. For example, it might predict a 70% chance of
passing for a student who studied for 5 hours.
Then, using a threshold (e.g., 0.5), it decides whether to predict a pass (1) or fail (0) based on
these probabilities.
In summary, logistic regression predicts binary outcomes (like pass or fail) based on input features
(like hours studied), making it useful for classification tasks.
Convex curves:
Non-convex curves:
4. Write expression for sigmoid function. Explain utility of Sigmoid function in logistic
regression.
Sigmoid function is useful to map any predicted values of probabilities into another value
between 0 and 1.
The function's shape resembles an "S" curve, starting from 0 and gradually increasing
towards 1 as z becomes larger.
- As z becomes more negative, g(z) approaches 0, and as z becomes more positive, g(z)
approaches 1.
- The sigmoid function is particularly useful in logistic regression because it converts the
output of the linear equation (which can be any real number) into a probability between 0
and 1.
This probability represents the likelihood of a binary outcome (e.g., pass or fail, spam or
not spam), making it suitable for binary classification tasks.
5. Application:
● Linear regression is suitable for predicting real-valued outputs like house prices, stock prices,
etc.
● Logistic regression is commonly used for binary classification tasks like spam detection,
medical diagnosis, etc.
6. Assumption:
● Linear regression assumes a linear relationship between input and output variables.
● Logistic regression assumes a linear relationship between the log odds of the outcome and
the input variables.
7. Optimization Algorithm:
● Linear regression often uses methods like Ordinary Least Squares (OLS) or gradient descent
for optimization.
● Logistic regression uses gradient descent or other optimization algorithms to minimize the
log loss function.
8. Model Interpretability:
● Linear regression coefficients represent the change in the output variable for a one-unit
change in the input variable.
● Logistic regression coefficients represent the change in the log odds of the outcome for a
one-unit change in the input variable.
9. Handling Outliers:
● Linear regression is sensitive to outliers, which can significantly impact the model's
performance.
● Logistic regression is less affected by outliers due to its robustness in handling binary
outcomes.
10. Decision Boundary:
● Linear regression doesn't have a distinct decision boundary since it's continuous.
● Logistic regression has a decision boundary (typically at 0.5 probability) separating the two
classes.