Logistic Regression
Classification
Machine Learning
Slides from CS-229 by Andrew Ng
Classification
    Email: Spam / Not Spam?
    Online Transactions: Fraudulent (Yes / No)?
    Tumor: Malignant / Benign ?
    y ∈ {0, 1}
      0: “Negative Class” (e.g., benign tumor)
      1: “Positive Class” (e.g., malignant tumor)
[Figure: Malignant? ((Yes) 1 / (No) 0) vs. Tumor Size, fit with linear regression and a 0.5 threshold; adding new points far to the right changes the threshold.]
Threshold classifier output h_θ(x) at 0.5:
    If h_θ(x) ≥ 0.5, predict “y = 1”
    If h_θ(x) < 0.5, predict “y = 0”
Classification: y = 0 or 1
    With linear regression, h_θ(x) can be > 1 or < 0
Logistic Regression: 0 ≤ h_θ(x) ≤ 1
Logistic Regression
Hypothesis Representation
Machine Learning
Logistic Regression Model
  Want 0 ≤ h_θ(x) ≤ 1
      h_θ(x) = g(θᵀx)
  where
      g(z) = 1 / (1 + e^(−z))
  Sigmoid function / Logistic function
  [Figure: plot of g(z), rising from 0 toward 1, with g(0) = 0.5]
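A minimal NumPy sketch of this hypothesis (the function and variable names, and the example values of θ and x, are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); always lies in (0, 1)."""
    return sigmoid(np.dot(theta, x))

# Illustrative values only: x includes the intercept term x_0 = 1.
theta = np.array([-1.0, 0.5])
x = np.array([1.0, 4.0])
print(hypothesis(theta, x))  # ~0.731, since theta^T x = 1
```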
Interpretation of Hypothesis Output
  h_θ(x) = estimated probability that y = 1 on input x
  Example: If x = [x_0; x_1] = [1; tumorSize] and h_θ(x) = 0.7:
  Tell the patient there is a 70% chance of the tumor being malignant.
  h_θ(x) = P(y = 1 | x; θ)
  “probability that y = 1, given x, parameterized by θ”
Logistic Regression
Decision boundary
Machine Learning
Logistic regression
    h_θ(x) = g(θᵀx),  g(z) = 1 / (1 + e^(−z))
    [Figure: sigmoid g(z), crossing 0.5 at z = 0]
  Suppose we predict “y = 1” if h_θ(x) ≥ 0.5
      equivalently, if z ≥ 0, i.e. θᵀx ≥ 0
  and predict “y = 0” if h_θ(x) < 0.5
      equivalently, if z < 0, i.e. θᵀx < 0
Decision Boundary
    h_θ(x) = g(θ_0 + θ_1·x_1 + θ_2·x_2)
    Suppose θ_0 = −3, θ_1 = 1, θ_2 = 1
    Predict “y = 1” if θᵀx ≥ 0, i.e. −3 + x_1 + x_2 ≥ 0
        x_1 + x_2 ≥ 3   (a linear decision boundary)
    [Figure: x_1–x_2 plane with the line x_1 + x_2 = 3 separating the predicted classes]
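A small sketch of the prediction rule for this example boundary (names are illustrative; θ = [−3, 1, 1] is the value assumed above):

```python
import numpy as np

def predict(theta, x):
    """Predict y = 1 when theta^T x >= 0, i.e. when h_theta(x) >= 0.5."""
    return 1 if np.dot(theta, x) >= 0 else 0

theta = np.array([-3.0, 1.0, 1.0])                 # theta_0 = -3, theta_1 = theta_2 = 1
print(predict(theta, np.array([1.0, 2.0, 2.0])))   # x_1 + x_2 = 4 >= 3  -> 1
print(predict(theta, np.array([1.0, 1.0, 1.0])))   # x_1 + x_2 = 2 <  3  -> 0
```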
Logistic Regression
Cost function
Machine Learning
Training set: { (x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m)) }   (m examples)
    x = [x_0; x_1; …; x_n],  x_0 = 1,  y ∈ {0, 1}
    h_θ(x) = 1 / (1 + e^(−θᵀx))
How to choose parameters θ?
Cost function
  Linear regression:  Cost(h_θ(x), y) = ½ (h_θ(x) − y)²,   J(θ) = (1/m) Σ_{i=1}^{m} Cost(h_θ(x^(i)), y^(i))
  With the sigmoid h_θ(x) plugged into this squared-error cost, J(θ) is “non-convex” (many local minima);
  we want a “convex” cost so gradient descent reaches the global minimum.
  [Figure: a bumpy non-convex J(θ) next to a bowl-shaped convex J(θ)]
Logistic regression cost function
          Cost(h_θ(x), y) = −log(h_θ(x))       if y = 1
                            −log(1 − h_θ(x))   if y = 0
          If y = 1:
          [Figure: −log(h_θ(x)) on h_θ(x) ∈ (0, 1): cost = 0 at h_θ(x) = 1, cost → ∞ as h_θ(x) → 0]
Logistic regression cost function
            If y = 0:
            [Figure: −log(1 − h_θ(x)) on h_θ(x) ∈ (0, 1): cost = 0 at h_θ(x) = 0, cost → ∞ as h_θ(x) → 1]
Logistic Regression
Simplified cost function
Machine Learning
Logistic regression cost function
  Cost(h_θ(x), y) = −y·log(h_θ(x)) − (1 − y)·log(1 − h_θ(x))
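A one-line NumPy sketch of this per-example cost (names and values are illustrative):

```python
import numpy as np

def cost_per_example(h, y):
    """Cost(h_theta(x), y) = -y*log(h) - (1 - y)*log(1 - h)."""
    return -y * np.log(h) - (1 - y) * np.log(1 - h)

print(cost_per_example(0.7, 1))  # ~0.357: confident and correct, small cost
print(cost_per_example(0.7, 0))  # ~1.204: confident but wrong, larger cost
```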
Logistic regression cost function
  J(θ) = −(1/m) Σ_{i=1}^{m} [ y^(i)·log(h_θ(x^(i))) + (1 − y^(i))·log(1 − h_θ(x^(i))) ]
To fit parameters θ:  min_θ J(θ)
To make a prediction given a new x:
   Output  h_θ(x) = 1 / (1 + e^(−θᵀx))
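A vectorized sketch of J(θ) over a whole training set, assuming X is an m×(n+1) design matrix with a leading column of ones (all names and the tiny data set are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(theta, X, y):
    """J(theta) = -(1/m) * sum( y*log(h) + (1 - y)*log(1 - h) )."""
    m = len(y)
    h = sigmoid(X @ theta)               # h_theta(x^(i)) for every example i
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

X = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 3.0, 3.5]])
y = np.array([0.0, 1.0, 1.0])
theta = np.array([-3.0, 1.0, 1.0])
print(compute_cost(theta, X, y))         # ~0.22 on this toy data
```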
Gradient Descent
Want min_θ J(θ):
 Repeat {
     θ_j := θ_j − α · ∂J(θ)/∂θ_j
 }                  (simultaneously update all θ_j)
Gradient Descent
Want min_θ J(θ):
 Repeat {
     θ_j := θ_j − α · (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
 }                    (simultaneously update all θ_j)
Algorithm looks identical to linear regression!
   What’s the difference then?
Recall
 Linear regression:
                   h_θ(x) = Σ_{i=0}^{n} θ_i·x_i = θᵀx
 Logistic regression:
                   h_θ(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx))
Gradient Descent
Want min_θ J(θ):
 Repeat {
     θ_j := θ_j − α · (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
 }                    (simultaneously update all θ_j)
Algorithm looks identical to linear regression!
The hypothesis h_θ(x) has changed now!
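A minimal batch gradient descent sketch for this update rule; the learning rate, iteration count, data, and all names are illustrative choices, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    """Repeat: theta_j := theta_j - alpha * (1/m) * sum_i (h_theta(x^(i)) - y^(i)) * x_j^(i),
    with all theta_j updated simultaneously (one vectorized step per iteration)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        h = sigmoid(X @ theta)             # predictions for all m examples
        grad = (X.T @ (h - y)) / m         # partial derivatives dJ/dtheta_j
        theta -= alpha * grad              # simultaneous update of all theta_j
    return theta

# Toy data: y tends to be 1 when x_1 + x_2 is large (intercept column x_0 = 1).
X = np.array([[1.0, 0.5, 1.0],
              [1.0, 1.0, 1.5],
              [1.0, 3.0, 2.5],
              [1.0, 4.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
print(sigmoid(X @ theta))                  # fitted probabilities split around 0.5
```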
Logistic Regression (Binary Classification)
Want min_θ J(θ):
 Repeat {
     θ_j := θ_j − α · (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
 }                     (simultaneously update all θ_j)
    How to extend for multi-class classification?
Logistic Regression
Multi-class classification: One-vs-all
Machine Learning
Multiclass classification
Email tagging: Work, Friends, Family, Hobby, etc.
Medical diagnosis: Not ill, Cold, Flu, etc.
Weather: Sunny, Cloudy, Rain, Snow, etc.
Images: Cat, Table, Person, etc.
 Binary classification:                 Multi-class classification:
 [Figure: two scatter plots in the x_1–x_2 plane; two classes on the left, three classes on the right]
One-vs-all (one-vs-rest):
    [Figure: the three-class data set split into three binary problems, one per class, each fit with its own logistic regression classifier]
        Class 1:  h_θ^(1)(x)
        Class 2:  h_θ^(2)(x)
        Class 3:  h_θ^(3)(x)
    h_θ^(i)(x) = P(y = i | x; θ)    (i = 1, 2, 3)
One-vs-all
Train a logistic regression classifier h_θ^(i)(x) for each
class i to predict the probability that y = i.
On a new input x, to make a prediction, pick the class i
that maximizes h_θ^(i)(x).
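A sketch of one-vs-all built from the same gradient descent update; the helper names and the toy three-class data set are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_vs_all(X, y, num_classes, alpha=0.1, num_iters=2000):
    """Fit one binary logistic regression per class: class i vs. the rest."""
    m, n = X.shape
    all_theta = np.zeros((num_classes, n))
    for i in range(num_classes):
        y_binary = (y == i).astype(float)          # 1 for class i, 0 otherwise
        theta = np.zeros(n)
        for _ in range(num_iters):
            h = sigmoid(X @ theta)
            theta -= alpha * (X.T @ (h - y_binary)) / m
        all_theta[i] = theta
    return all_theta

def predict_one_vs_all(all_theta, x):
    """Pick the class i whose classifier h_theta^(i)(x) is largest."""
    return int(np.argmax(sigmoid(all_theta @ x)))

# Toy data: class 0 near the origin, class 1 has large x_1, class 2 has large x_2.
X = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.5, 0.2],
              [1.0, 3.0, 0.1],
              [1.0, 3.5, 0.3],
              [1.0, 0.2, 3.0],
              [1.0, 0.4, 3.5]])
y = np.array([0, 0, 1, 1, 2, 2])
all_theta = train_one_vs_all(X, y, num_classes=3)
print(predict_one_vs_all(all_theta, np.array([1.0, 3.2, 0.2])))  # expected: 1
```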