lecture6

The lecture focuses on constructing binary classifiers using Logistic Regression, covering topics such as applying logistic regression, formulating likelihoods, and deriving gradients and Hessians. It introduces the Iteratively Reweighted Least Squares (IRLS) algorithm for optimization and discusses the softmax link for logistic regression. The next lecture will address automatic derivative computation through back-propagation.

Uploaded by Tachbir Dewan

Outline of the lecture

This lecture describes the construction of binary classifiers using a technique called Logistic Regression. The objective is for you to learn:

- How to apply logistic regression to discriminate between two classes.
- How to formulate the logistic regression likelihood.
- How to derive the gradient and Hessian of logistic regression.
- How to incorporate the gradient vector and Hessian matrix into Newton's optimization algorithm so as to come up with an algorithm for logistic regression, which we call IRLS.
- How to do logistic regression with the softmax link.
McCulloch-Pitts model of a neuron
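The figure for this slide did not survive extraction. As a reminder, the McCulloch-Pitts unit outputs 1 when a weighted sum of its inputs reaches a threshold. A minimal sketch (the function name and the AND-gate parameters are illustrative, not from the slide):

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Fire (output 1) iff the weighted sum of the inputs reaches the threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0

# With unit weights and threshold 2, the unit computes logical AND:
# it fires only when both binary inputs are 1.
```

Replacing this hard threshold with the smooth sigmoid below is what turns such a unit into a logistic regression model.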
Sigmoid function

sigm(η) refers to the sigmoid function, also known as the logistic or logit function:

    sigm(η) = 1 / (1 + e^(−η)) = e^η / (e^η + 1)
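The two algebraically equal forms above matter in practice: evaluating whichever one only exponentiates a non-positive argument avoids floating-point overflow. A minimal sketch (the function name mirrors the slide's notation):

```python
import math

def sigm(eta):
    """Numerically stable sigmoid: pick the form whose exp() argument is <= 0."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))  # sigm = 1 / (1 + e^-eta)
    z = math.exp(eta)                        # safe: eta < 0
    return z / (z + 1.0)                     # sigm = e^eta / (e^eta + 1)
```

Useful identities for later derivations: sigm(0) = 0.5 and sigm(−η) = 1 − sigm(η).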
Linear separating hyper-plane

[Greg Shakhnarovich]
Bernoulli: a model for coins

A Bernoulli random variable X takes values in {0, 1}:

    p(x|θ) = θ       if x = 1
    p(x|θ) = 1 − θ   if x = 0

where θ ∈ (0, 1). We can write this probability more succinctly as follows:

    p(x|θ) = θ^x (1 − θ)^(1−x)
Entropy

In information theory, entropy H is a measure of the uncertainty associated with a random variable. It is defined as:

    H(X) = − Σ_x p(x|θ) log p(x|θ)

Example: for a Bernoulli variable X, the entropy is:

    H(X) = −θ log θ − (1 − θ) log(1 − θ)


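The Bernoulli entropy above can be sketched in a few lines (in nats, using the convention 0 log 0 = 0):

```python
import math

def bernoulli_entropy(theta):
    """H(X) = -theta*log(theta) - (1-theta)*log(1-theta), with 0*log(0) := 0."""
    h = 0.0
    for p in (theta, 1.0 - theta):
        if p > 0.0:
            h -= p * math.log(p)
    return h
```

The entropy is maximal for a fair coin (θ = 0.5, giving log 2 nats) and zero for a deterministic one, matching the intuition that entropy measures uncertainty.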
Logistic regression

The logistic regression model specifies the probability of a binary output y_i ∈ {0, 1} given the input x_i as follows:

    p(y|X, θ) = Π_{i=1}^n Ber(y_i | sigm(x_i θ))
              = Π_{i=1}^n [1 / (1 + e^(−x_i θ))]^(y_i) [1 − 1 / (1 + e^(−x_i θ))]^(1−y_i)

where x_i θ = θ_0 + Σ_{j=1}^d θ_j x_ij.
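Taking −log of the product above turns it into a sum of per-example losses. A NumPy sketch, assuming X already carries a leading column of ones so that θ_0 plays the role of the bias:

```python
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def nll(theta, X, y):
    """J(theta) = -log p(y|X, theta)
                = -sum_i [y_i log pi_i + (1-y_i) log(1-pi_i)]."""
    pi = sigmoid(X @ theta)  # pi_i = sigm(x_i theta)
    return -np.sum(y * np.log(pi) + (1.0 - y) * np.log(1.0 - pi))
```

At θ = 0 every π_i is 0.5, so J(0) = n log 2, a handy sanity check.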
Gradient and Hessian of binary logistic regression

The gradient and Hessian of the negative log-likelihood, J(θ) = −log p(y|X, θ), are given by:

    g(θ) = d/dθ J(θ) = Σ_{i=1}^n x_i^T (π_i − y_i) = X^T (π − y)

    H = d/dθ g(θ)^T = Σ_i π_i (1 − π_i) x_i x_i^T = X^T diag(π_i (1 − π_i)) X

where π_i = sigm(x_i θ).

One can show that H is positive definite; hence the NLL is convex and has a unique global minimum.

To find this minimum, we turn to batch optimization.
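The two formulas above translate directly into NumPy. A sketch under the same conventions (rows of X are the x_i, labels in {0, 1}):

```python
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def gradient(theta, X, y):
    """g(theta) = X^T (pi - y)."""
    pi = sigmoid(X @ theta)
    return X.T @ (pi - y)

def hessian(theta, X):
    """H = X^T diag(pi_i (1 - pi_i)) X."""
    pi = sigmoid(X @ theta)
    return X.T @ np.diag(pi * (1.0 - pi)) @ X
```

H is positive definite whenever X has full column rank, since every weight π_i(1 − π_i) is strictly positive; this is the convexity claim made above.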


Iteratively reweighted least squares (IRLS)

For binary logistic regression, recall that the gradient and Hessian of the negative log-likelihood are given by

    g_k = X^T (π_k − y)
    H_k = X^T S_k X
    S_k := diag(π_{1k}(1 − π_{1k}), …, π_{nk}(1 − π_{nk}))
    π_{ik} = sigm(x_i θ_k)

The Newton update at iteration k + 1 for this model is as follows (using step size η_k = 1, since the Hessian is exact):

    θ_{k+1} = θ_k − H_k^(−1) g_k
            = θ_k + (X^T S_k X)^(−1) X^T (y − π_k)
            = (X^T S_k X)^(−1) [(X^T S_k X) θ_k + X^T (y − π_k)]
            = (X^T S_k X)^(−1) X^T [S_k X θ_k + y − π_k]
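The last line of the derivation is a weighted least-squares solve repeated at every step, which is where the name comes from. A minimal sketch (no regularization; on linearly separable data the unpenalized MLE diverges, so a real implementation would add a ridge term):

```python
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def irls(X, y, num_iters=20):
    """Newton's method for binary logistic regression:
    theta <- (X^T S X)^{-1} X^T [S X theta + y - pi] at each iteration."""
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        pi = sigmoid(X @ theta)
        s = pi * (1.0 - pi)                # diagonal of S_k
        z = s * (X @ theta) + (y - pi)     # S_k X theta_k + y - pi_k
        theta = np.linalg.solve(X.T @ (s[:, None] * X), X.T @ z)
    return theta
```

Each update re-weights the examples by π_i(1 − π_i), so points near the decision boundary (π_i ≈ 0.5) influence the fit the most.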
Softmax formulation
Likelihood function
Negative log-likelihood criterion
Neural network representation of loss
Manual gradient computation
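The softmax slides above lost their equations in extraction. As a hedged sketch of the softmax link named in the outline: the model assigns p(y = c | x) = e^(s_c) / Σ_{c'} e^(s_{c'}) to per-class scores s_c, and the gradient of the negative log-likelihood with respect to the scores is π − onehot(y):

```python
import numpy as np

def softmax(scores):
    """p_c = exp(s_c) / sum_c' exp(s_c'); subtracting max(s) first
    changes nothing mathematically but prevents overflow."""
    z = np.exp(scores - np.max(scores))
    return z / np.sum(z)

def softmax_nll_grad(scores, c):
    """d/ds of -log softmax(s)[c], i.e. pi - onehot(c)."""
    pi = softmax(scores)
    onehot = np.zeros_like(pi)
    onehot[c] = 1.0
    return pi - onehot
```

With two classes the softmax reduces to the sigmoid of the score difference, recovering binary logistic regression.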
Next lecture
In the next lecture, we develop an automatic, layer-wise way of computing all the necessary derivatives, known as back-propagation.

This is the approach used in Torch. We will review the torch nn class.
