Classification

This document discusses the concepts of classification, focusing on logistic regression as a preferred method over linear regression for binary outcomes due to its ability to produce probabilities between 0 and 1. It includes examples, such as credit card default prediction, and explains the logistic regression model's parameters and their estimation through maximum likelihood. Additionally, it touches on the application of logistic regression for multiclass problems and introduces discriminant analysis as an alternative when classes are well-separated.


Classification – Logistic Regression

Agenda
In this session, we will discuss:
• Introduction to classification
• Linear regression vs. logistic regression
• A brief look at prediction
• Introduction to logistic regression

Classification
Supervised learning…class labels!

Classification
• Qualitative variables (nominal) take values in an unordered set of classes (C), such as:

Eye color ∈ {brown, blue, green}


Email ∈ {spam, ham}

• Given a feature vector X and a qualitative response Y taking values in the set C, the
classification task is to build a function C(X) that takes as input the feature vector X and
predicts its value for Y; i.e., C(X) ∈ C.
• Classifiers can be non-probabilistic or probabilistic.
• Often, we are more interested in estimating the probabilities that X belongs to every
category in C.

Example: Credit Card Default
Group-level dispersion vs. income and balance

[Figure: scatterplot of Income vs. Balance colored by default status, with boxplots of Balance and Income grouped by Default (No/Yes).]

Balance seems to be of more importance than income for predicting default.

Default ~ failing to make credit card payments for a given number of days (~180)
Can we use Linear Regression?
Suppose for the Default classification task that we code

Y = 0 if No, 1 if Yes.

Can we simply perform a linear regression of Y on X and classify as Yes if ŷ > 0.5?

• In this case of a binary outcome, linear regression does a good job as a classifier and is equivalent to linear discriminant analysis, which we will discuss later.
• Since in the population E(Y | X = x) = Pr(Y = 1 | X = x), we might think that regression is perfect for this task.
• However, linear regression might produce probabilities less than zero or greater than one. Logistic regression is more appropriate (it "squeezes" the range of the response).

Linear vs. Logistic Regression
[Figure: Probability of Default vs. Balance, fit by linear regression (left panel) and logistic regression (right panel). Orange tick marks show the observed responses Y (0 or 1).]

● Left: Linear regression does not reflect Pr(Y = 1 | X) correctly (among other problems).
● Right: Logistic regression seems to reflect the actual change in probabilities across values of X.
Linear vs. Logistic Regression

Sigmoid curve

[Figure: the same Probability of Default vs. Balance plots; the logistic regression fit follows an S-shaped (sigmoid) curve bounded between 0 and 1.]

Logistic regression ensures that our estimate for p(X) lies between 0 and 1.

Logistic Regression
Logistic regression uses the form

p(X) = e^(β0 + β1X) / (1 + e^(β0 + β1X))

(Remember our lecture on linear models?)

It is easy to see that no matter what values β0, β1, or X take, p(X) will have values between 0 and 1.

A bit of rearrangement gives

log( p(X) / (1 − p(X)) ) = β0 + β1X

This monotone transformation is called the log odds or logit transformation of p(X). (Here, log = natural log.)
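As a quick illustration (not part of the original slides), here is a minimal NumPy sketch of the logistic form and its logit inverse, with made-up coefficients:

import numpy as np

def logistic(x, beta0, beta1):
    # p(X) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x)); always strictly between 0 and 1
    z = beta0 + beta1 * x
    return np.exp(z) / (1.0 + np.exp(z))

def logit(p):
    # log odds: log(p / (1 - p))
    return np.log(p / (1.0 - p))

x = np.linspace(-10, 10, 5)
p = logistic(x, beta0=0.5, beta1=1.2)   # hypothetical coefficients
print(p)          # every value lies in (0, 1)
print(logit(p))   # recovers beta0 + beta1 * x (up to floating point error)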
Maximum Likelihood
We use maximum likelihood to estimate the parameters. The likelihood is

ℓ(β0, β1) = ∏_{i: yi = 1} p(xi) × ∏_{i′: yi′ = 0} (1 − p(xi′))

This likelihood gives the probability of the observed zeros and ones in the data. We pick β0 and β1 to maximize the likelihood of the observed data.

Maximum likelihood estimates: differentiate the log-likelihood (the "cost function") with respect to the parameters, set the derivatives equal to zero, and solve. (Another approach: gradient descent.)

Most statistical packages can fit linear logistic regression models by maximum likelihood. For the Default data:

Coefficient Std. error Z-statistic P-value
Intercept -10.7 0.3612 -29.5 < 0.0001
balance 0.006 0.0002 24.9 < 0.0001
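A minimal sketch (not from the slides) of fitting such a model by maximum likelihood in Python with statsmodels, using synthetic data generated from the rounded coefficients above:

import numpy as np
import statsmodels.api as sm

# synthetic stand-in for the Default data: 0/1 outcome driven by 'balance'
rng = np.random.default_rng(0)
balance = rng.uniform(0, 2500, size=1000)
p = 1 / (1 + np.exp(-(-10.7 + 0.006 * balance)))   # true probabilities (rounded slide coefficients)
default = rng.binomial(1, p)

X = sm.add_constant(balance)                        # add the intercept term
result = sm.Logit(default, X).fit()                 # maximum likelihood fit
print(result.summary())                             # coefficients, std. errors, z-statistics, p-values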
Making Predictions
What is our estimated probability of default for someone with a balance of $1000?

With a balance of $2000?



Coefficient Std. error Z-statistic P-value


Intercept -10.7 0.3612 -29.5 < 0.0001
balance 0.006 0.0002 24.9 < 0.0001
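A worked check with the rounded coefficients above (a sketch, not from the slides). Note the $2,000 figure is sensitive to rounding: the textbook's unrounded estimates are roughly 0.6% and 59%.

import math

def p_hat(balance, b0=-10.7, b1=0.006):
    # estimated default probability from the fitted logistic model (rounded coefficients)
    z = b0 + b1 * balance
    return math.exp(z) / (1.0 + math.exp(z))

print(p_hat(1000))   # ~0.009, i.e. under 1%
print(p_hat(2000))   # ~0.79 with these rounded coefficients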

Making Predictions
Let's do it again, using student as the predictor.

Coefficient Std. error Z-statistic P-value


Intercept -3.5 0.0707 -49.6 < 0.0001
student[Yes] 0.405 0.115 3.52 4E-04
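The same arithmetic applied to these coefficients (a sketch, not from the slides; the textbook reports about 4.3% for students and 2.9% for non-students):

import math

p_student = math.exp(-3.5 + 0.405) / (1 + math.exp(-3.5 + 0.405))   # ~0.043
p_non_student = math.exp(-3.5) / (1 + math.exp(-3.5))               # ~0.029
print(p_student, p_non_student)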

Logistic Regression with Several Variables
…analogous to multiple linear regression models:

log( p(X) / (1 − p(X)) ) = β0 + β1X1 + ⋯ + βpXp

Coefficient Std. error Z-statistic P-value
Intercept -10.9 0.4923 -22.1 < 0.0001
balance 0.006 0.0002 24.74 < 0.0001
income 0.003 0.0082 0.37 0.712
student[Yes] -0.65 0.2362 -2.74 0.006

The coefficient for student is now negative (it was positive before).

Logistic Regressions: Confounding

[Figure: left — default rate vs. credit card balance, shown separately for students and non-students; right — boxplots of credit card balance by student status.]

• Students tend to have higher balances than non-students, so their marginal default rate
is higher than for non-students.
• But for each level of balance, students default less than non-students.
• Multiple logistic regression can tease this out.

Logistic Regression with more than Two Classes
Logistic regressions are easily generalized to more than two classes (multinomial regression). Two
major options:

• “Classic”: Using a baseline class (K−1 functions). Parameter estimates are given with respect to the Kth class. The choice of baseline is not important.


• Softmax: No baseline selection. A linear function for each class (implemented in glmnet).
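A minimal sketch (not from the slides) of a multiclass fit in Python with scikit-learn, which uses the softmax (multinomial) formulation with its default solver; the iris data is just a stand-in example:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)          # 3 classes, 4 features
clf = LogisticRegression(max_iter=1000)    # multinomial (softmax) logistic regression
clf.fit(X, y)
print(clf.predict_proba(X[:3]))            # per-class probabilities; each row sums to 1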

Summary
Here is a quick recap:
• We discussed and understood classification using the example of credit card default.
• We discussed the reason for preferring logistic regression over linear regression: linear regression can produce probabilities less than zero or greater than one, so logistic regression is more appropriate for this kind of modeling.
• We walked through making predictions, first with balance and then with student status as the predictor.
• We discussed how logistic regression is easily generalized to more than two classes (multinomial regression) and ensures that our estimate for p(X) lies between 0 and 1.

Classification – Discriminant Analysis

Agenda
In this session, we will cover:
• Introduction to discriminant analysis
• Conditional Probabilities
• Introduction to Bayes' theorem
• Linear discriminant analysis for different values of p
• Types of errors
• ROC curve
• Limitations of LDA

Why Discriminant Analysis?
• When the classes are well-separated, parameter estimates for the logistic regression model are
unreliable. Linear discriminant analysis does not suffer from this problem.

• If n is small and the distribution of the predictors X is approximately normal in each of the classes,
the linear discriminant model is more stable than the logistic regression model.

• Linear discriminant analysis is popular when we have >2 classes because it also provides low-
dimensional views of the data.

Discriminant Analysis
Goal:
Model the distribution of X in each of the classes separately.
Then, use Bayes' theorem to obtain Pr(Y | X).

When we use normal (Gaussian) distributions for each class, this leads to linear or
quadratic discriminant analysis (other distributions can be used as well).

Conditional Probabilities
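For reference (the formula itself is presumably shown only in the slide graphic), conditional probability is defined as Pr(F | H) = Pr(F ∩ H) / Pr(H): the probability of F given that H has occurred.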

[Figure: Venn diagram of two events, F and H, illustrating conditional probability.]

Bayes’ Rule

[Figure: Venn diagram of two events, A and B.]

Bayes’ Rule
Concepts:

• Likelihood
- How much does a certain hypothesis explain the data?

• Prior
- What do you believe before seeing any data?

• Posterior
- What do we believe after seeing the data?

Bayes’ Theorem for Classification
Bayes' theorem:

Pr(Y = k | X = x) = Pr(X = x | Y = k) · Pr(Y = k) / Pr(X = x)

In the context of discriminant analysis, this becomes

Pr(Y = k | X = x) = πk fk(x) / Σ_{l=1}^{K} πl fl(x)

where
• fk(x) = Pr(X = x | Y = k) is the density for X in class k. We will use normal densities (one for each class).
• πk = Pr(Y = k) is the prior probability for class k.

Classify to the Highest Density
[Figure: two one-dimensional class densities shown in two panels; left with priors π1 = 0.5, π2 = 0.5, right with π1 = 0.3, π2 = 0.7. The Bayes' boundaries give the smallest possible error.]

We classify a new point according to which density is the highest.


When the priors are different, we take them into account as well and
compare πkfk(x). On the right, we favor the pink class — the decision
boundary has shifted to the left.

Linear Discriminant Analysis when p = 1
Again, we assume a normal (Gaussian) distribution for X within each class:

fk(x) = (1 / (√(2π) σk)) exp( −(x − µk)² / (2σk²) )

• Here µk is the mean, and σk² the variance, in class k. We will assume that all the σk = σ are the same.

• Plugging this into Bayes' formula, we get the following expression for pk(x) = Pr(Y = k | X = x):

pk(x) = πk (1/(√(2π) σ)) exp( −(x − µk)² / (2σ²) ) / Σ_{l=1}^{K} πl (1/(√(2π) σ)) exp( −(x − µl)² / (2σ²) )

Here πk is the prior, the normal density describes the kth group, and the denominator is the marginal probability of X. Some simplifications later…
Discriminant Functions (Bayes classifier)
To classify at the value X = x, we need to see which of the pk(x) is the largest. Taking logs, and discarding terms that do not depend on k, this is equivalent to assigning x to the class with the largest discriminant score:

δk(x) = x · µk/σ² − µk²/(2σ²) + log(πk)

Note that δk(x) is a linear function of x.

If there are K = 2 classes and π1 = π2 = 0.5, then one can see that the decision boundary is at

x = (µ1 + µ2) / 2.
Discriminant Functions (Bayes’ classifier)

[Figure: left — the two class densities with the Bayes decision boundary marked; right — histograms of samples from the two classes with both the Bayes boundary and the estimated LDA decision boundary marked.]
Example with µ1 = −1.5, µ2 = 1.5, π1 = π2 = 0.5, and σ2 = 1.


Typically, we don't know these parameters; we just have the training data. In that case, we simply estimate the parameters (e.g., via LDA) and plug them into the rule.
Estimating the Parameters (LDA)

π̂k = nk / n   (an idea for a prior: the fraction of training observations in class k)

µ̂k = (1/nk) Σ_{i: yi = k} xi   (an expectation: the average of all the training observations from the kth class)

σ̂² = Σ_{k=1}^{K} (nk − 1) σ̂k² / (n − K)   (a weighted average of the sample variances for each class), where σ̂k² is the usual formula for the estimated variance in the kth class.

LDA assumes:
- Label-specific means
- A shared variance
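A minimal sketch (not from the slides) of these estimates in Python for the one-dimensional case, using synthetic data; it estimates the priors, class means, and pooled variance, then evaluates the discriminant scores δk(x):

import numpy as np

rng = np.random.default_rng(0)
# synthetic 1-D training data from two normal classes with a shared variance
x = np.concatenate([rng.normal(-1.5, 1.0, 50), rng.normal(1.5, 1.0, 50)])
y = np.concatenate([np.zeros(50, dtype=int), np.ones(50, dtype=int)])

classes = np.unique(y)
n, K = len(x), len(classes)
pi_hat = np.array([np.mean(y == k) for k in classes])        # priors: n_k / n
mu_hat = np.array([x[y == k].mean() for k in classes])       # class means
var_hat = sum((np.sum(y == k) - 1) * x[y == k].var(ddof=1)   # pooled (shared) variance
              for k in classes) / (n - K)

def delta(x0):
    # discriminant scores: delta_k(x0) = x0*mu_k/var - mu_k^2/(2*var) + log(pi_k)
    return x0 * mu_hat / var_hat - mu_hat**2 / (2 * var_hat) + np.log(pi_hat)

print(delta(0.0))                        # scores near the boundary when priors are equal
print(classes[np.argmax(delta(2.0))])    # predicted class for x = 2.0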

Linear Discriminant Analysis when p > 1
[Figure: two bivariate Gaussian densities; one with Cor(x1, x2) = 0 and Var(x1) = Var(x2), the other with Cor(x1, x2) ≠ 0 and Var(x1) ≠ Var(x2).]

Density (with vector of mean values µk and a covariance matrix Σ shared across labels):

fk(x) = (1 / ((2π)^(p/2) |Σ|^(1/2))) exp( −(1/2) (x − µk)ᵀ Σ⁻¹ (x − µk) )

Discriminant function:

δk(x) = xᵀ Σ⁻¹ µk − (1/2) µkᵀ Σ⁻¹ µk + log πk

Despite its complex form, δk(x) is still a linear function of x.
Illustration: p = 2 and K = 3 classes
[Figure: left panel — three classes shown as ellipses containing 95% of the observations, with the Bayes' decision boundaries; right panel — 20 simulated observations per class with both the Bayes' and the LDA decision boundaries.]

Here π1 = π2 = π3 = 1/3.
The dashed lines are known as the Bayes’ decision boundaries. In this
case, they yield the fewest misclassification errors, among all possible
classifiers.
From δk(x) to Probabilities
Once we have estimates δ̂k(x), we can turn these into estimates for the class probabilities:

p̂k(x) = e^( δ̂k(x) ) / Σ_{l=1}^{K} e^( δ̂l(x) )

So, classifying to the largest δ̂k(x) amounts to classifying to the class for which p̂k(x) is largest.

When K = 2, we classify to class 2 if p̂2(x) ≥ 0.5, and to class 1 otherwise.

LDA on Credit Data


(23 + 252)/10000 errors — a 2.75% misclassification rate! Some caveats:
• This is a training error, and we may be overfitting.
• Of the true No's, we make 23/9667 = 0.2% errors; of the true Yes's, we make 252/333 = 75.7% errors!
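For reference, the confusion matrix implied by the counts quoted above (reconstructed from those figures, so only as exact as the quoted error rates):

                 True No    True Yes   Total
Predicted No       9644        252      9896
Predicted Yes        23         81       104
Total              9667        333     10000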

A Summary of the Types of Errors (Confusion Matrix)
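(The slide itself is an image; reproduced below is the standard confusion-matrix layout, consistent with the definitions on the next slide.)

                   True class: −          True class: +
Predicted: −       True negative (TN)     False negative (FN)
Predicted: +       False positive (FP)    True positive (TP)

False positive rate = FP / (FP + TN); false negative rate = FN / (FN + TP); true positive rate (sensitivity) = TP / (TP + FN).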


Types of Errors
False positive rate: the fraction of true negative examples that are classified as positive — 0.2% in our example.
False negative rate: the fraction of true positives that are classified as negative — 75.7% in our example.

We produced this table by classifying to class Yes if

P̂r(Default = Yes | Balance, Student) ≥ 0.5.

We can change the two error rates by changing the threshold from 0.5 to some other value in [0, 1]:

P̂r(Default = Yes | Balance, Student) ≥ threshold, and vary the threshold.

Varying the threshold
[Figure: error rate vs. threshold. Three curves are shown: the overall error rate, the false positive rate (the fraction of true negatives classified as positive), and the false negative rate (the fraction of true positives classified as negative), for thresholds between 0.0 and 0.5.]

In order to reduce the false negative rate, we may want to reduce the threshold to 0.1 or less.
ROC Curve

[Figure: ROC curve plotting the true positive rate against the false positive rate; the diagonal line corresponds to no association between the predictor(s) and the response.]

The ROC plot displays both simultaneously.


Sometimes we use the AUC, or area under the curve, to summarize the overall performance. Higher AUC is good.
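A minimal sketch (not from the slides) of computing an ROC curve and its AUC in Python with scikit-learn; the labels and predicted probabilities below are synthetic stand-ins for a fitted classifier's output:

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.1, size=1000)                                  # synthetic 0/1 labels
scores = np.clip(0.3 * y_true + rng.normal(0.2, 0.15, size=1000), 0, 1)  # synthetic predicted probabilities

fpr, tpr, thresholds = roc_curve(y_true, scores)   # false/true positive rates as the threshold varies
print(roc_auc_score(y_true, scores))               # area under the ROC curve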
Limitations of LDA
If the distributions are significantly non-Gaussian, the LDA projections may not preserve complex
structure in the data needed for classification.


LDA will also fail if discriminatory information is not in the mean but in the variance of the data.

Summary
Here is a quick recap:
• We discussed that we use discriminant analysis when the classes are well separated and parameter estimates for the logistic regression model are unreliable.
• We learned about conditional probabilities using an example.
• We discussed Bayes' theorem mathematically and understood how it works.
• We talked about linear discriminant analysis for different values of p and K, and saw that the Bayes' decision boundaries yield the fewest misclassification errors among all possible classifiers.
• We talked about various types of errors, such as the false positive rate and the true positive rate.
• We discussed varying the classification threshold to reduce the false negative rate, and understood the ROC curve relating the true and false positive rates.
• We talked about the limitations of LDA, such as its failure when the discriminatory information is not in the mean but in the variance of the data.

Extensions of the LDA

Agenda
In this session, we will discuss:
• Other forms of Discriminant Analysis


Other forms of Discriminant Analysis

Pr(Y = k | X = x) = πk fk(x) / Σ_{l=1}^{K} πl fl(x)

LDA: when the fk(x) are Gaussian densities with the same covariance matrix Σ in each class.
By altering the forms for fk(x), we get different classifiers.

• With Gaussians but a different Σk in each class, we get quadratic discriminant analysis.

• By assuming independence between the features, we get a naïve Bayes classifier.

Quadratic Discriminant Analysis
[Figure: two simulated two-class examples; in one panel both classes have correlation r = 0.7, in the other the correlations are r = 0.7 and r = −0.7. The Bayes', LDA, and QDA decision boundaries are shown.]

Because the Σk are different, the quadratic terms matter.
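For reference (not spelled out in the extracted text), the QDA discriminant score keeps the class-specific covariance and so is quadratic in x: δk(x) = −(1/2)(x − µk)ᵀ Σk⁻¹ (x − µk) − (1/2) log|Σk| + log πk.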


Number of parameters: QDA = Kp(p+1)/2 vs. LDA = Kp.
Naïve Bayes
Assumes the features are independent within each class.
Useful when p is large, so multivariate methods like QDA and even LDA break down (curse of dimensionality!).
• Gaussian naïve Bayes assumes each Σk is diagonal:

fk(x) = ∏_{j=1}^{p} fkj(xj), with each fkj a one-dimensional Gaussian density.

• Can be used for mixed feature vectors (qualitative and quantitative). If Xj is qualitative, replace fkj(xj) with a probability mass function (histogram) over the discrete categories.

Despite strong assumptions, naïve Bayes often produces good classification results.
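A minimal sketch (not from the slides) of Gaussian naïve Bayes in Python with scikit-learn; the iris data is just a stand-in example:

from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB()                   # one Gaussian per feature per class (diagonal covariance)
nb.fit(X, y)
print(nb.predict_proba(X[:3]))      # posterior class probabilities
print(nb.score(X, y))               # training accuracy (optimistic; use a held-out set in practice)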
Summary
Here is a quick recap:
• We discussed other forms of discriminant analysis, such as quadratic discriminant analysis and the naïve Bayes classifier, and understood both forms graphically and mathematically.


KNN

Agenda
In this session, we will discuss:
• Introduction to K-nearest neighbors
• Distance metrics
• Advantages and limitations of KNN


KNN
• K-nearest neighbors (KNN)
• Very popular because it is very simple and has excellent empirical performance
• Handles both binary and multi-class data
• Makes no assumptions about the parametric form of the decision boundary:
  • A non-parametric method

KNN
• Does not have a training phase: just store the training data and do the computation when it is time to classify.

• Find the K "training points" that are closest to xnew.
• Select the majority class amongst these K neighbors (or, for regression, their average).
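A minimal sketch (not from the slides) of this procedure in Python with scikit-learn; the iris data is just a stand-in example:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)   # K = 5, Euclidean distance by default
knn.fit(X_train, y_train)                   # "fitting" just stores the training data
print(knn.predict(X_test[:5]))              # majority class among the 5 nearest training points
print(knn.score(X_test, y_test))            # test-set accuracy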

K-nearest Neighbor
What value of k should we use?
• Using only the closest example (1NN) to determine the class is subject to errors due to:
• A single atypical example
• Noise

• Pick k too large and you end up looking at neighbors that are not that close.

• Value of k is typically odd to avoid ties; 3 and 5 are most common.

Similarity Metrics
Nearest neighbor methods depend on a similarity (or distance) metric. Ideas?

Euclidean distance: d(x, x′) = √( Σ_{j=1}^{p} (xj − x′j)² )

For a binary instance space, the natural choice is Hamming distance (the number of feature values that differ).

For text, cosine similarity of tf.idf-weighted vectors is typically most effective.
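A minimal sketch (not from the slides) of these three metrics in Python, using NumPy only:

import numpy as np

def euclidean(a, b):
    # square root of the sum of squared coordinate differences
    return np.sqrt(np.sum((a - b) ** 2))

def hamming(a, b):
    # number of positions at which two binary/categorical vectors differ
    return int(np.sum(a != b))

def cosine_similarity(a, b):
    # cosine of the angle between two (e.g., tf.idf) vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 1.0, 2.0])
print(euclidean(a, b), hamming(a, b), cosine_similarity(a, b))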

Advantages and limitations of KNN
• Good
o No training is necessary.
o No feature selection necessary.
o Scales well with large number of classes.
▪ Don’t need to train n classifiers for n classes.
• Bad
o Classes can influence each other.
▪ Small changes to one class can have ripple effect.
o Scores can be hard to convert to probabilities.
o Can be more expensive at test time.
o “Model” is all of your training examples which can be large.

Example: K-nearest neighbors in two dimensions


For different values of K
[Figure: KNN decision boundaries for K = 1 and K = 100.]


[Figure: KNN decision boundary for K = 10.]


Summary
Here is a quick recap:
• We discussed K-nearest neighbors, a simple method with excellent empirical performance that handles both binary and multi-class data.
• We talked briefly about various distance metrics, such as Euclidean and Hamming distance.
• We discussed the advantages and limitations of KNN: no training is required, but prediction can be more expensive at test time.
• We talked about KNN with different values of k and understood the effect on the decision boundary graphically.

