Unit 2.1
Introduction
□ Modern deep learning provides a very powerful
framework for supervised learning.
□ By adding more layers and more units within a
layer, a deep network can represent functions
of increasing complexity
□ Deep feedforward networks, also often
called feedforward neural networks, or
multilayer perceptrons (MLPs) are the
quintessential deep learning models.
□ The goal of a feedforward network is to
approximate some function f*.
□ For a classifier, y = f∗(x) maps an input x to a category y.
□ A feedforward network defines a mapping y = f(x; θ) and learns the value of the parameters θ that result in the best function approximation.
□ Feedforward: information flows through the function being evaluated from x, through the intermediate computations used to define f, and finally to the output y.
□ There are no feedback connections in which outputs of the model are fed back into the model itself.
□ When feedforward neural networks are
extended to include feedback connections,
they are called recurrent neural networks
□ Feedforward neural networks are called networks because they are typically represented by composing together many different functions.
■ e.g., a chain of three functions: f(x) = f^(3)(f^(2)(f^(1)(x))), where f^(1) is the first layer, f^(2) is the second layer, and so on; the length of the chain gives the depth of the model.
□ During neural network training, we drive f(x) to
match f∗(x).
■ y ≈ f∗(x)
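As a minimal sketch of this chain structure (assuming NumPy; the layer widths, random weights, and ReLU/identity activations are illustrative choices, not values from the text):

import numpy as np

rng = np.random.default_rng(0)

def layer(W, b, activation):
    # One function f^(i) in the chain: activation(W x + b)
    return lambda x: activation(W @ x + b)

relu = lambda z: np.maximum(0.0, z)
identity = lambda z: z

# Three layers, widths 2 -> 4 -> 4 -> 1 (assumptions for the demo)
f1 = layer(rng.normal(size=(4, 2)), np.zeros(4), relu)      # f^(1), first layer
f2 = layer(rng.normal(size=(4, 4)), np.zeros(4), relu)      # f^(2), second layer
f3 = layer(rng.normal(size=(1, 4)), np.zeros(1), identity)  # f^(3), output layer

def f(x):
    # Feedforward: information flows from x through f^(1), f^(2), f^(3);
    # there are no feedback connections
    return f3(f2(f1(x)))

print(f(np.array([0.0, 1.0])))

Training would then adjust the weights and biases so that f(x) ≈ f∗(x) on the training data.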
□ Linear models, such as logistic regression and linear regression, are appealing because they can be fit efficiently and reliably.
□ To extend linear models to represent nonlinear functions of x, we can apply the linear model not to x itself but to a transformed input φ(x).
■ φ is a nonlinear transformation.
□ Choosing the mapping φ:
■ Use a very generic φ, such as the infinite-dimensional φ that is implicitly used by kernel machines based on the RBF kernel (sketched below).
■ Manually engineer φ.
■ Use deep learning to learn φ.
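A hedged sketch of the first option: a finite RBF feature map stands in here for the kernel machine's implicit infinite-dimensional φ. The centers (the training points) and the width γ are assumptions for the demo; the linear model is then fit to φ(x) rather than to x:

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])  # XOR targets, previewing the next section

def phi(x, centers, gamma=1.0):
    # RBF feature map: phi_i(x) = exp(-gamma * ||x - c_i||^2)
    return np.exp(-gamma * np.sum((centers - x) ** 2, axis=1))

Phi = np.stack([phi(x, X) for x in X])       # design matrix in feature space
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # linear model applied to phi(x)

print(Phi @ w)  # approximately [0, 1, 1, 0]: linear in phi(x), nonlinear in x

Deep learning instead makes φ itself parametric and learns it jointly with the linear output layer.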
Learning XOR
□ XOR function: when exactly one of its two binary inputs is equal to 1, the XOR function returns 1; otherwise it returns 0.
□ The target function is y = f∗(x).
□ Our model provides a function y = f(x; θ), and our learning algorithm will adapt the parameters θ to make f as similar as possible to f∗.
□ X = {[0, 0]ᵀ, [0, 1]ᵀ, [1, 0]ᵀ, [1, 1]ᵀ}
□ We treat XOR as a regression problem and use a mean squared error (MSE) loss function, as sketched below.
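A small sketch (NumPy assumed) of the four-point training set and an MSE evaluation helper; the constant model passed in at the end is only a placeholder:

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # the four XOR inputs
y_true = np.array([0., 1., 1., 0.])  # f*(x): 1 iff exactly one input is 1

def mse(f, X, y):
    # Mean squared error of model f over the whole training set
    preds = np.array([f(x) for x in X])
    return np.mean((y - preds) ** 2)

print(mse(lambda x: 0.5, X, y_true))  # constant 0.5 everywhere -> MSE 0.25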
□ Evaluated on our whole training set, the MSE loss function is
■ J(θ) = (1/4) Σ_{x∈X} (f∗(x) − f(x; θ))²
□ Linear model: choose f(x; w, b) = xᵀw + b, with parameters θ = {w, b}.
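A hedged sketch (NumPy assumed) of fitting this linear model to the XOR data by least squares, i.e. solving the normal equations for the MSE above in closed form:

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

A = np.hstack([X, np.ones((4, 1))])  # ones column so that theta = [w1, w2, b]
theta, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimizes the MSE J(theta)
w, b = theta[:2], theta[2]

print(w, b)       # w = [0, 0], b = 0.5
print(A @ theta)  # 0.5 at every input: a linear model cannot represent XOR

The fit collapses to w = 0 and b = 1/2, so the model outputs 0.5 everywhere; this failure is what motivates learning a nonlinear φ with a hidden layer.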