
Logistic Regression

By
Dr Ravi Prakash Verma
Professor
Department of CSAI
ABESIT
Logistic Regression
• Introduction
• Logistic Regression is a supervised learning algorithm used for
classification problems.
• Despite its name, it is not a regression algorithm in the traditional
sense (like Linear Regression) but rather a probabilistic classification
model.
• It is widely used in binary classification tasks, where the goal is to
predict one of two possible outcomes.
Why Not Linear Regression for Classification?
• Linear Regression models continuous values and is unsuitable for
classification because:
1. It can output values beyond the range of 0 to 1, which are not meaningful for
classification.
2. It is sensitive to outliers, which can distort decision boundaries.
3. It does not model probabilities directly.
• To address these issues, Logistic Regression applies a transformation
to map outputs into a probability range of (0,1).
Why Not Linear Regression for Classification?
• Consider fitting a straight line h(x) to a dataset of tumors labelled malignant (1) or benign (0). Here the threshold value is 0.5: if h(x) is greater than 0.5 we predict a malignant tumor (1), and if it is less than 0.5 we predict a benign tumor (0).
Why Not Linear Regression for Classification?
• If we add some outliers to the dataset, the best-fit line shifts toward those points, so the fitted line changes considerably.
Why Not Linear Regression for Classification?
• With the shifted line, the old threshold of 0.5 no longer separates the classes correctly; the new threshold would have to be somewhere around 0.2.
• To keep the predictions right, the threshold value has to be lowered every time the data changes like this.
• Hence, we can say that linear regression is prone to outliers.
Why Not Linear Regression for Classification?
• Now the regression gives correct outputs only if we predict 1 when h(x) is greater than 0.2, so the threshold itself depends on the outliers.
• Another problem with linear regression is that the predicted values may be out of range.
• A probability must lie between 0 and 1, but the output of linear regression may exceed 1 or go below 0.
• To overcome these problems, we use Logistic Regression, which converts the straight best-fit line of linear regression into an S-curve using the sigmoid function, so the output always lies between 0 and 1.
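To make the contrast concrete, here is a small sketch (with made-up weights and inputs, not values from the slides) showing that a raw linear score can fall well outside the 0 to 1 range, while the sigmoid of the same score always lies between 0 and 1:

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear model h(x) = w0 + w1 * x with made-up weights.
w0, w1 = -0.5, 0.3
x = np.array([-10.0, 0.0, 2.0, 10.0, 50.0])

linear_scores = w0 + w1 * x             # can go below 0 or above 1
probabilities = sigmoid(linear_scores)  # always strictly between 0 and 1

for xi, h, p in zip(x, linear_scores, probabilities):
    print(f"x = {xi:6.1f}   h(x) = {h:6.2f}   sigmoid = {p:.4f}")
```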
Types of Logistic Regression
• On the basis of the categories, Logistic Regression can be classified into
three types:
1. Binomial: In binomial logistic regression, the dependent variable can take only two possible values, such as 0 or 1, or Pass or Fail.
2. Multinomial: In multinomial logistic regression, the dependent variable can take three or more possible unordered values, such as "cat", "dog", or "sheep".
3. Ordinal: In ordinal logistic regression, the dependent variable can take three or more possible ordered values, such as "low", "medium", or "high".
Assumptions of Logistic Regression
1. Independent observations: Each observation is independent of the others, meaning there is no correlation or dependence between observations.
2. Binary dependent variable: The dependent variable must be binary or dichotomous, meaning it can take only two values. For more than two categories, the softmax function is used.
3. Linear relationship between independent variables and log odds: The relationship between the independent variables and the log odds of the dependent variable should be linear.
4. No outliers: There should be no extreme outliers in the dataset.
5. Large sample size: The sample size should be sufficiently large.
The Logistic (Sigmoid) Function
• The core of Logistic Regression is the Sigmoid Function, which maps
any real-valued number into the range (0,1):
• σ(z) = 1 / (1 + e^(−z))
• where:
• z is the linear combination of input features: z = w0 + w1x1 + w2x2 + ... + wnxn
• w are the weights (or coefficients) associated with each feature.
• x are the input feature values.
• The sigmoid function ensures that outputs represent probabilities.
• For example:
• If z→+∞, then σ(z)≈1
• If z→−∞, then σ(z)≈0
• If z=0, then σ(z)=0.5
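As a minimal sketch of these definitions (the weights and feature values below are illustrative), z is computed as the weighted sum of the inputs and then passed through the sigmoid to get a probability:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative bias w0, weights w1..w3, and one feature vector x1..x3.
w0 = -1.0
w = np.array([0.8, -0.4, 1.5])
x = np.array([2.0, 1.0, 0.5])

z = w0 + np.dot(w, x)   # z = w0 + w1*x1 + w2*x2 + w3*x3
p = sigmoid(z)          # probability that y = 1

print(f"z = {z:.3f}, sigma(z) = {p:.3f}")
print(sigmoid(0.0))     # 0.5, matching the z = 0 case above
print(sigmoid(100.0))   # ~1, as z -> +infinity
print(sigmoid(-100.0))  # ~0, as z -> -infinity
```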
Binary Classification with Logistic Regression
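A minimal sketch of the idea behind this section, assuming a model whose weights have already been fitted (the weights and data below are illustrative): the model outputs σ(z) as the probability of class 1, and a threshold of 0.5 turns that probability into a hard 0/1 prediction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, w, b, threshold=0.5):
    """Return hard 0/1 labels from the model's predicted probabilities."""
    probs = sigmoid(X @ w + b)
    return (probs >= threshold).astype(int)

# Illustrative fitted parameters and a small feature matrix (3 samples, 2 features).
w = np.array([1.2, -0.7])
b = 0.1
X = np.array([[3.0, 1.0],
              [0.2, 2.5],
              [1.0, 1.0]])

print(predict(X, w, b))  # -> [1 0 1] for these illustrative values
```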
Cost Function in Logistic Regression
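For reference, a minimal sketch of the binary cross-entropy (log-loss) cost that logistic regression is usually trained to minimize, J = -(1/m) Σ [y log(p) + (1 − y) log(1 − p)]; the labels and predicted probabilities below are illustrative:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average log loss: -(1/m) * sum(y*log(p) + (1-y)*log(1-p))."""
    p = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1, 0, 1, 1])              # true labels
y_pred = np.array([0.9, 0.2, 0.7, 0.6])      # predicted probabilities
print(binary_cross_entropy(y_true, y_pred))  # small value = good predictions
```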
Optimization Using Gradient Descent
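A minimal sketch of batch gradient descent for logistic regression, assuming the log-loss cost above; the gradient with respect to the weights is (1/m) Xᵀ(σ(Xw + b) − y), and the learning rate and data below are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, lr=0.1, n_iters=1000):
    """Fit logistic regression weights and bias by batch gradient descent."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)        # current predicted probabilities
        error = p - y                 # derivative of the log loss w.r.t. z
        w -= lr * (X.T @ error) / m   # weight update
        b -= lr * error.mean()        # bias update
    return w, b

# Tiny illustrative dataset: one feature, label roughly increases with x.
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
print(gradient_descent(X, y))
```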
Multi-Class Logistic Regression (Softmax Regression)
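A minimal sketch of the softmax function, which generalizes the sigmoid to K classes by turning a vector of per-class scores into probabilities that sum to 1 (the scores below are illustrative):

```python
import numpy as np

def softmax(scores):
    """Turn a vector of per-class scores into probabilities that sum to 1."""
    shifted = np.exp(scores - np.max(scores))  # subtract max for numerical stability
    return shifted / shifted.sum()

scores = np.array([2.0, 1.0, 0.1])  # one score per class, e.g. cat/dog/sheep
probs = softmax(scores)
print(probs, probs.sum())           # ~[0.66 0.24 0.10], sums to 1.0
```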
Regularization in Logistic Regression
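A minimal sketch of L2 (ridge) regularization, one common way to regularize logistic regression: a penalty proportional to the squared weights is added to the cost, which discourages large weights and helps prevent overfitting (the lambda value below is illustrative):

```python
import numpy as np

def regularized_cost(y_true, y_pred, w, lam=0.1, eps=1e-12):
    """Binary cross-entropy plus an L2 penalty on the weights (bias excluded)."""
    m = len(y_true)
    p = np.clip(y_pred, eps, 1 - eps)
    log_loss = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    l2_penalty = (lam / (2 * m)) * np.sum(w ** 2)
    return log_loss + l2_penalty

w = np.array([0.8, -0.4, 1.5])              # illustrative weights
y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.2, 0.7, 0.6])
print(regularized_cost(y_true, y_pred, w))  # log loss + weight penalty
```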
Performance Evaluation
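A minimal sketch of common evaluation metrics for a binary classifier: accuracy, precision, recall, and F1 computed from the confusion-matrix counts (the labels below are illustrative):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 from hard 0/1 predictions."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])
print(evaluate(y_true, y_pred))
```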
Applications of Logistic Regression
• Medical Diagnosis: Predicting disease presence (e.g., cancer
detection).
• Fraud Detection: Identifying fraudulent transactions.
• Marketing: Predicting customer conversion.
• Spam Detection: Classifying emails as spam or not spam.
• Credit Scoring: Determining loan approvals.
Conclusion
• Logistic Regression is a simple yet powerful classification algorithm.
• It provides probabilistic predictions, making it interpretable.
• Regularization techniques help prevent overfitting.
• It works well for binary and multiclass classification (via Softmax).
• Suitable for problems where relationships are linear in log-odds
space.
Example Logistic Regression
• Predicting if a Student Passes an Exam
• Step 7: Repeat Until Convergence
• We continue updating w and b over multiple iterations until the cost function J is minimized.
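A runnable sketch of this worked example end to end; the study hours and pass/fail labels below are made up rather than taken from the slides, and the loop implements the repeated updates of Step 7:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):                        # Step 7: repeat until convergence
    p = sigmoid(w * hours + b)               # predicted pass probability
    w -= lr * np.mean((p - passed) * hours)  # gradient of the log loss w.r.t. w
    b -= lr * np.mean(p - passed)            # gradient of the log loss w.r.t. b

print(f"w = {w:.3f}, b = {b:.3f}")
print(f"P(pass | 4.5 hours studied) = {sigmoid(w * 4.5 + b):.3f}")
```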
Multivariate Logistic Regression
• So far, we used only one feature (study hours). In real-world scenarios, we have multiple features.
• The logistic regression equation for multiple variables is:
• P(y=1 | x) = σ(z) = 1 / (1 + e^(−z)), where z = w0 + w1x1 + w2x2 + ... + wnxn
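A minimal multivariate sketch: the same gradient-descent loop works once z is a weighted sum over several features, for example hours studied and a previous score (all data below are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical features: [hours studied, previous score], and pass/fail labels.
X = np.array([[1.0, 40.0],
              [2.0, 55.0],
              [3.0, 45.0],
              [5.0, 70.0],
              [6.0, 65.0],
              [8.0, 90.0]])
y = np.array([0, 0, 0, 1, 1, 1])

X_std = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize features

w, b, lr = np.zeros(X.shape[1]), 0.0, 0.1
for _ in range(5000):
    p = sigmoid(X_std @ w + b)
    w -= lr * (X_std.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

print("weights:", w.round(3), "bias:", round(b, 3))
print("P(pass):", sigmoid(X_std @ w + b).round(3))
```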
