TensorFlow Regression
Neural networks are the most common type of deep learning algorithm
Estimators
Understanding the Foundations of TensorFlow
Mammals: members of the infraorder Cetacea
Fish: look like fish, swim like fish, move with fish
Rule-based Binary Classifier
[Diagram: human experts supply the rules; the rule-based classifier labels the whale a mammal]
“Traditional” ML-based Binary Classifier
[Diagram: a corpus of labelled examples trains the classifier; the output is a label]
The attributes that the ML algorithm focuses on are called features.
Feature Vectors: each data point is a list - or vector - of such features.
“Representation” ML-based Binary Classifier
[Diagram: the classifier works out for itself which features of the corpus matter]
Deep learning: algorithms that learn what features matter
Neural networks: the most common class of deep learning algorithms
Neurons: simple building blocks that actually “learn”
Deep Learning Book - Chapter 1 (intro), page 6
“Deep Learning”-based Binary Classifier
[Diagram: a corpus of images flows through the layers of a neural network - pixels → edges → corners → object parts - across Layer 1, Layer 2, …, Layer N, into the ML-based classifier]
Neural Networks Introduced
Performance: accuracy in classification, residual variance in regression
Experience: training using a corpus of labelled instances
Learning Algorithms
A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E
Deep Learning and Neural Networks
Deep learning: algorithms that learn what features matter
Neural networks: the most common class of deep learning algorithms
Neurons: simple building blocks that actually “learn”
“Deep Learning”-based Binary Classifier
[Diagram: representations are built up layer by layer - pixels → edges → corners → object parts]
Deep Learning: the more complex the graph, the more relationships it can “learn”.
“Learning” Regression
Regression can be reverse-engineered by a single neuron
Regression: The Simplest Neural Network
[Diagram: a set of points → a single neuron → the regression line]

def XOR(x1, x2):
    if x1 == x2:
        return 0
    return 1
“Learning” XOR
The XOR function can be reverse-engineered using 3 neurons arranged in 2 layers
XOR: 3 Neurons, 2 Layers
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
The Computational Graph
The nodes in the computation graph are simple entities called neurons (simple building blocks).
The edges in the computation graph are data items called tensors.
Neural Networks
- Convolutional
- Recurrent
Groups of neurons that perform similar functions are aggregated into layers.
Layers in the Computation Graph
[Diagram: a corpus of images flows through Layer 1, Layer 2, …, Layer N - pixels → edges → corners → object parts - into the ML-based classifier]
Neurons
Each layer consists of units called neurons.
…But each neuron only applies two simple functions to its inputs (as sketched below):
- A linear (affine) transformation
- An activation function
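As a rough illustration (not from the course materials), a single neuron with a ReLU activation can be sketched in a few lines of NumPy; the inputs, weights and bias below are made-up values:

import numpy as np

def neuron(x, W, b):
    """One neuron: an affine transformation followed by a ReLU activation."""
    affine = np.dot(W, x) + b          # Wx + b
    return np.maximum(affine, 0)       # max(Wx + b, 0)

# hypothetical inputs, weights and bias
x = np.array([0.5, -1.2, 3.0])
W = np.array([0.1, 0.4, -0.2])
b = 0.05
print(neuron(x, W, b))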
Operation of a Single Neuron
[Diagram: inputs X1 … Xn with weights W1 … Wn and a bias b feed an affine transformation Wx + b, followed by an activation function max(Wx + b, 0)]
Different types of neural networks wire up neurons in different ways.
These interconnections can get very sophisticated…
During training, the output of deeper layers may be “fed back” to find the best W, b.
This is called backpropagation.
Operation of a Single Neuron
[Diagram: the affine transformation Wx + b feeds the activation function max(Wx + b, 0)]
ReLU(x) = max(0, x)
Regression: The Simplest Neural Network

def doSomethingReallyComplicated(x1, x2, …):
    …
    return complicatedResult
“Learning” Regression
Regression can be reverse-engineered by a single neuron.
Regression: The Simplest Neural Network
[Diagram: a set of points → a single neuron → the regression line]
Operation of a Single Neuron
[Diagram: inputs X1 … Xn, weights W1 … Wn and a bias b; the affine transformation produces y = Wx + b]
Here the neuron is an entity that finds the “best fit” line through a set of points.
The affine transformation is just a weighted sum of the inputs with a bias added.
The output from the neuron is y = Wx + b.
y1 = A + Bx1
y2 = A + Bx2
y3 = A + Bx3
…
yn = A + Bxn
Simple Regression
Regression Equation:
y = A + Bx
y1 = A + Bx1 + e1
y2 = A + Bx2 + e2
y3 = A + Bx3 + e3
…
yn = A + Bxn + en
In matrix form:
[y1, y2, …, yn]ᵀ = A·[1, 1, …, 1]ᵀ + B·[x1, x2, …, xn]ᵀ + [e1, e2, …, en]ᵀ
Minimising Least Square Error
[Diagram: the residual ei = yi - y'i is the vertical distance between the actual point (xi, yi) and the fitted point (xi, y'i) on the regression line y = A + Bx]
Residuals of a regression are the difference between actual and
fitted values of the dependent variable
The “Best” Regression Line
[Diagram: two candidate lines through the same points - Line 1: y = A1 + B1x and Line 2: y = A2 + B2x. A1 and A2 are the intercepts; when x increases by 1, y changes by B1 along Line 1 and by B2 along Line 2]
Minimising Least Square Error
[Diagram: dotted vertical lines mark the error from each point to each candidate line]
The “best fit” line is the one where the sum of the squares of the lengths of the errors is minimum.
Minimising Least Square Error
[Diagram: the fitted regression line y = A + Bx]
Ways of estimating the line:
- Method of moments
- Maximum likelihood estimation
- Method of least squares
Operation of a Single Neuron
[Diagram: the affine transformation produces y = Wx + b]
“Learning” Regression
Regression can be learnt by a single neuron using an affine transformation alone
Regression: The Simplest Neural Network
[Diagram: a set of points → a single neuron → the regression line]
def XOR(x1, x2):
    if x1 == x2:
        return 0
    return 1
“Learning” XOR
Reverse-engineering XOR requires 3 neurons (arranged in 2 layers) as well as a non-linear activation function
XOR: Not Linearly Separable
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
[Diagram: the four points plotted in the X1-X2 plane; (0, 0) and (1, 1) have Y = 0, while (0, 1) and (1, 0) have Y = 1]
No one straight line neatly divides the points into disjoint regions where Y = 0 and Y = 1.
XOR: 3 Neurons, 2 Layers
[Diagram: X1 and X2 feed two neurons in the first layer, whose outputs feed a single neuron in the second layer]
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
“Learning” XOR
Reverse-engineering XOR requires 3 neurons (arranged in 2 layers) as well as a non-linear activation function
1-Neuron Regression
[Diagram: a single neuron applies the affine transformation Wx + b, so y = Wx + b]
ReLU(x) = max(0, x)
XOR: 3 Neurons, 2 Layers
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
3-Neuron XOR
[Diagram: Neuron #1 (weights W1, W3, bias b1) and Neuron #2 (weights W2, W4, bias b2) each apply an affine transformation followed by an activation function to the inputs X1 and X2; Neuron #3 (weights W5, W6, bias b3) applies an affine transformation followed by an activation function to their outputs]
The most common form of the activation function is the ReLU (Rectified Linear Unit):
ReLU(x) = max(0, x)
[Diagram: the activation function maps Wx + b to max(Wx + b, 0)]
3-Neuron XOR
[Diagram: X1 and X2 are the inputs. Neurons #1 and #2 form Layer 1: each applies an affine transformation followed by a ReLU, i.e. x → max(x, 0). Neuron #3 forms Layer 2: it applies an affine transformation (weights W5 and W6, bias b3) followed by the identity function, and its value is the output.]
Information only “feeds forward”.
“2-Layer Feed-forward Neural Network”
[Diagram: the same three neurons - two ReLU neurons in the first layer feeding one identity-activation neuron in the second layer]
XOR: 3 Neurons, 2 Layers
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
Each neuron has weights and a bias that must be calculated by the training algorithm (done for us by TensorFlow).
Weights and Bias of Neurons #1, #2 and #3
The weights and biases of individual neurons are determined during the training process. For XOR they come out as:
- Neuron #1: W1 = 1, W3 = 1, b1 = 0
- Neuron #2: W2 = 1, W4 = 1, b2 = -1
- Neuron #3: W5 = 1, W6 = -2, b3 = 0
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
“Learning” XOR
Reverse-engineering XOR requires 3 neurons (arranged in 2 layers) as well as a non-linear activation function.
3-Neuron XOR: working through the network with these weights
- Inputs X1 = 0, X2 = 0: Neuron #1 affine = 0, ReLU = 0; Neuron #2 affine = -1, ReLU = 0; Neuron #3 output = 1·0 + (-2)·0 + 0 = 0
- Inputs X1 = 0, X2 = 1: Neuron #1 affine = 1, ReLU = 1; Neuron #2 affine = 0, ReLU = 0; Neuron #3 output = 1·1 + (-2)·0 + 0 = 1
- Inputs X1 = 1, X2 = 0: Neuron #1 affine = 1, ReLU = 1; Neuron #2 affine = 0, ReLU = 0; Neuron #3 output = 1·1 + (-2)·0 + 0 = 1
- Inputs X1 = 1, X2 = 1: Neuron #1 affine = 2, ReLU = 2; Neuron #2 affine = 1, ReLU = 1; Neuron #3 output = 1·2 + (-2)·1 + 0 = 0
The outputs match the XOR truth table:
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
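A quick way to check this arithmetic (a sketch, not code from the course) is to wire the same weights up in NumPy:

import numpy as np

def relu(z):
    return np.maximum(z, 0)

def xor_net(x1, x2):
    # Layer 1: Neuron #1 (W1=1, W3=1, b1=0) and Neuron #2 (W2=1, W4=1, b2=-1), both ReLU
    h1 = relu(1 * x1 + 1 * x2 + 0)
    h2 = relu(1 * x1 + 1 * x2 - 1)
    # Layer 2: Neuron #3 (W5=1, W6=-2, b3=0), identity activation
    return 1 * h1 - 2 * h2 + 0

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_net(x1, x2))   # prints 0, 1, 1, 0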
Choice of Activation Function
[Diagram: the activation-function block in the neuron, e.g. SoftMax]
“Learning” XOR
Reverse-engineering XOR requires 3 neurons (arranged in 2 layers) as well as a non-linear activation function
XOR: 3 Neurons, 2 Layers
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
def doSomethingReallyComplicated(x1, x2, …):
    …
    return complicatedResult
Baseline: regular Python code · Cost function: quantifying goodness-of-fit · Training: batch size for each epoch
Cause → Effect
Baseline: non-TensorFlow implementation
Forward indices run from 0 to n-1; backward indices run from -n to -1.
Prices to Returns
returns = prices[1:] / prices[:-1] - 1
Each return divides a price by the previous price and subtracts 1, so n prices yield n-1 returns.
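A small NumPy sketch of this slicing (the price values are made up):

import numpy as np

prices = np.array([100.0, 102.0, 101.0, 105.0])
# prices[1:]  -> [102.0, 101.0, 105.0]   (drops the first element)
# prices[:-1] -> [100.0, 102.0, 101.0]   (drops the last element)
returns = prices[1:] / prices[:-1] - 1
print(returns)   # [ 0.02       -0.00980392  0.03960396]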
[Diagram: the points (x1, y1) … (xn, yn) are used to estimate (A, B); the x values are first reshaped into single-column form [x1], [x2], …, [xn]]
Reshaping in NumPy
reshape(-1, 1) turns the flat array x1, x2, …, xn into a column of one-element rows [x1], [x2], …, [xn].
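For instance (a sketch with made-up values):

import numpy as np

x = np.array([1.5, 2.0, 2.5, 3.0])     # shape (4,)  - a flat array
x_col = x.reshape(-1, 1)               # shape (4, 1) - one column, row count inferred from -1
print(x_col)
# [[1.5]
#  [2. ]
#  [2.5]
#  [3. ]]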
Implementing Regression in TensorFlow
Baseline: non-TensorFlow implementation
Computation Graph: neural network of 1 neuron
[Diagram: a set of points → a single neuron → the regression line]
Regression: The Simplest Neural Network
[Diagram: a single neuron whose affine transformation Wx + b is followed by the identity activation, so the output is Wx + b]
Minimising Least Square Error
[Diagram: candidate lines y = A1 + B1x and y = A2 + B2x against the data, and the fitted regression line y = A + Bx]
The “best fit” line is the one where the sum of the squares of the lengths of the errors is minimum.
What we would like to achieve · What slows us down · What we really control
y = Wx + b
Minimizing MSE
[Diagram: the MSE surface as a function of W and b]
Make the MSE as small as possible: the best value of W and the best value of b give the smallest value of MSE.
Start Somewhere
Initial values - you have to start somewhere: an initial value of W and an initial value of b give an initial value of MSE.
“Gradient Descent”
Converging on the “best” values of W and b from those initial values, using an optimization algorithm driven by the training data.
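A minimal sketch of this loop, assuming the TensorFlow 1.x low-level API and made-up training points (not the course's exact code):

import numpy as np
import tensorflow as tf

x_train = np.array([[1.0], [2.0], [3.0], [4.0]])   # hypothetical points
y_train = np.array([[2.0], [4.1], [5.9], [8.2]])

x = tf.placeholder(tf.float32, shape=[None, 1])
y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.zeros([1, 1]))                  # initial value of W
b = tf.Variable(tf.zeros([1]))                     # initial value of b

y_pred = tf.matmul(x, W) + b                       # the affine transformation Wx + b
mse = tf.reduce_mean(tf.square(y_pred - y))        # cost: mean squared error
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(mse)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):                          # each step moves W, b downhill on the MSE surface
        sess.run(train_step, feed_dict={x: x_train, y: y_train})
    print(sess.run([W, b, mse], feed_dict={x: x_train, y: y_train}))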
Baseline: regular Python code · Cost function: quantifying goodness-of-fit · Training: batch size for each epoch
[Diagram: cause → effect, e.g. government bond yields and the NASDAQ share index]
Linear regression and logistic regression are similar, yet quite different
Starting Five Minutes in Advance
Probability of meeting the deadline: 0%; probability of getting other important work done: 100%.
The Goldilocks Solution
Start very late and hope for the best · Start as late as possible to be sure to make it · Start very early and do little else
Aim for about a 95% probability of meeting the deadline while keeping a 95% probability of getting other important work done.
Working Hard, Fast, Smart
[Graph: probability of meeting the deadline against time to deadline. Starting 1 year ahead gives (1 year, 100%); starting 5 minutes ahead gives (5 mins, 0%). Working hard, fast and smart shifts the curve; a 95% probability is reached at (11 days, 95%).]
Logistic Regression helps find how probabilities
are changed by actions
Working Smart with Logistic Regression
[Graph: an S-shaped curve of probability (0% to 100%) against time to deadline]
Start too late, and you’ll definitely miss.
Start too early, and you’ll definitely make it.
Working smart is knowing when to start - the point where the probability crosses 50%.
Y-axis: probability of meeting deadline
- floor of 0
- ceiling of 1
y: hit or miss? (0 or 1?)
p(y) : probability of y = 1
Logistic regression involves finding the “best fit” such curve:
p(yi) = 1 / (1 + e^-(A + Bxi))
- A is the intercept
- B is the regression coefficient
As x → -∞, p(y) → 0; as x → +∞, p(y) → 1.
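A tiny sketch of this curve and its limits (illustrative values of A and B, not from the course):

import numpy as np

def p(x, A=0.0, B=1.0):
    """Logistic curve p(y) = 1 / (1 + e^-(A + Bx))."""
    return 1.0 / (1.0 + np.exp(-(A + B * x)))

print(p(np.array([-10.0, 0.0, 10.0])))   # ~0 at the far left, 0.5 in the middle, ~1 at the far right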
Probability of the outcome is very sensitive to changes in the cause.
p(yi) = 1 / (1 + e^-(A + Bxi))
Categorical and Continuous Variables
Continuous: can take an infinite set of values (height, weight, income…)
Categorical: can take a finite set of values (male/female, day of week…)
Categorical variables that can take just two values are called
binary variables
Logistic Regression helps estimate how
probabilities of categorical variables are
influenced by causes
Logistic Regression in Classification
Whales: Fish or Mammals
Mammal: member of the infraorder Cetacea
Fish: looks like a fish, swims like a fish, moves like a fish
Rule-based Binary Classifier
[Diagram: human experts supply the rules; the rule-based classifier labels the whale a mammal]
ML-based Binary Classifier
[Diagram: an ML-based classifier - or predictor - is trained from a corpus of labelled examples]
Applying Logistic Regression
[Diagram: the estimated probability of an animal being a fish, given features such as “lives in water, breathes with gills, lays eggs” - e.g. 95%, 80%, 60%]
Rule of 50%: classify as Fish when the probability is above 50%, and as Mammal otherwise.
Cause → Effect
[Diagram: data points (x1, y1) … (xn, yn) with a fitted regression line y = A + Bx, alongside binary points (y = 0 or 1) with a fitted regression curve p(y) = 1 / (1 + e^-(A + Bx))]
Similar, yet Different
[Diagram: linear regression fits y against x; logistic regression fits p(y) against x]
- The objective of both is to find the A, B that “best fit” the data.
- Linear regression: the relationship is already linear (by assumption). Logistic regression: the relationship can be made linear, by the log transformation logit(p) = ln(p / (1 - p)).
- Both then solve the regression problem using cookie-cutter solvers.
Logistic Regression
[Diagram: binary outcomes y = 0 or 1 plotted against x, with the fitted regression curve p(y) = 1 / (1 + e^-(A + Bx))]
Linear regression fits
y1 = A + Bx1
y2 = A + Bx2
y3 = A + Bx3
…
yn = A + Bxn
Logistic regression instead fits
p(y1) = 1 / (1 + e^-(A + Bx1))
…
p(yn) = 1 / (1 + e^-(A + Bxn))
Regression Equation:
p(yi) = 1 / (1 + e^-(A + Bxi))
Odds of an Event
Odds(p) = p / (1 - p)
p = 1 / (1 + e^-(A + Bx)) = e^(A + Bx) / (1 + e^(A + Bx))
1 - p = 1 - e^(A + Bx) / (1 + e^(A + Bx)) = (1 + e^(A + Bx) - e^(A + Bx)) / (1 + e^(A + Bx)) = 1 / (1 + e^(A + Bx))
Odds(p) = p / (1 - p) = e^(A + Bx)
Logit Is Linear
logit(p) = ln Odds(p) = A + Bx
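A quick numerical check of this derivation (a sketch, with arbitrary values of A and B):

import numpy as np

A, B = 0.5, 2.0                          # arbitrary coefficients
x = np.linspace(-3, 3, 7)

p = 1.0 / (1.0 + np.exp(-(A + B * x)))   # logistic curve
logit = np.log(p / (1 - p))              # ln Odds(p)

print(np.allclose(logit, A + B * x))     # True: the logit is exactly linear in x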
Cause → Effect
x = returns on the S&P 500 (S&P500) · y = returns on Google stock (GOOG)
Logistic Regression
p(yi) = 1 / (1 + e^-(A + Bxi))
Set up the Problem
GOOG returns > 0% → Up → label 1
GOOG returns <= 0% → Down → label 0
Predicted probability < 0.5 → Down → predicted label False
Example rows: 2017-01-01: 1, 1 · 2017-02-01: 1, 1
Baseline: regular Python code · Cost function: quantifying goodness-of-fit (linear) or similarity of distribution (logistic) · Training: batch size for each epoch
Linear Regression with One Neuron
[Diagram: a set of points feeds a single neuron; the affine transformation Wx + b followed by the identity activation produces the regression line]
Logistic Regression with One Neuron
[Diagram: the inputs feed an affine transformation x' = W1x + b1; a softmax stage with its own weights W2 and bias b2 then converts x' into
P(Y = True) = 1 / (1 + e^-(W2x' + b2)) and
P(Y = False) = 1 / (1 + e^(W2x' + b2)) = 1 - P(Y = True)]
SoftMax for True/False Classification
[Diagram: the softmax function maps x to p(Y = True) = 1 / (1 + e^-(Wx + B)) and p(Y = False) = 1 / (1 + e^(Wx + B))]
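A small sketch (not the course's code) showing that a two-way softmax over the logits [0, Wx + B] reduces to exactly these two expressions:

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))        # subtract the max for numerical stability
    return e / e.sum()

W, B, x = 1.5, -0.3, 0.8             # arbitrary values
z = np.array([0.0, W * x + B])       # logits for the two classes False, True

p_false, p_true = softmax(z)
print(p_true, 1 / (1 + np.exp(-(W * x + B))))    # the same number
print(p_false, 1 / (1 + np.exp(W * x + B)))      # the same number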
Linear Regression with One Neuron: a 1-dimensional feature vector with Shape(W) = [1, 1] produces the regression line.
Logistic Regression with One Neuron: a 1-dimensional feature vector with Shape(W) = [1, 2] and Shape(b) = [2] produces the S-curve.
SoftMax for Digit Classification
[Diagram: the softmax function outputs P(Y = 0), P(Y = 1), …, P(Y = 9)]
1-dimensional feature vector: Shape(W) = [1, 10]
SoftMax N-category Classification
[Diagram: the softmax function outputs P(Y = Y1), P(Y = Y2), …, P(Y = YN)]
1-dimensional feature vector: Shape(W) = [1, N], Shape(b) = [N]
M-dimensional feature vector: Shape(W) = [M, N]
Baseline: regular Python code · Cost function: similarity of distribution · Training: batch size for each epoch
Set up the Problem
GOOG returns > 0% → Up → label 1
GOOG returns <= 0% → Down → label 0
Predicted probability < 0.5 → Down → predicted label False
Labels and predicted labels by month:
2005-01-01: NA, NA
2005-02-01: 0, 1
2005-03-01: 0, 0
…
2017-01-01: 1, 1
2017-02-01: 1, 1
Cross Entropy
Intuition: low cross entropy - the distribution of Ypredicted closely matches the distribution of Yactual.
Intuition: high cross entropy - the distribution of Ypredicted differs greatly from the distribution of Yactual.
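A quick illustration of that intuition (made-up distributions, not course data):

import numpy as np

def cross_entropy(y_actual, y_predicted):
    """H(actual, predicted) = -sum(actual * log(predicted))."""
    return -np.sum(y_actual * np.log(y_predicted))

y_actual = np.array([0.0, 1.0])                      # the true label, one-hot
print(cross_entropy(y_actual, np.array([0.1, 0.9]))) # ~0.105: distributions agree -> low
print(cross_entropy(y_actual, np.array([0.9, 0.1]))) # ~2.303: distributions disagree -> high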
Logistic Regression in TensorFlow
Baseline: regular Python code · Cost function: similarity of distribution · Training: batch size for each epoch
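A minimal sketch of this setup, assuming the TensorFlow 1.x API (shapes and names are illustrative, not the course's exact code):

import tensorflow as tf

x  = tf.placeholder(tf.float32, [None, 1])   # 1-dimensional feature
y_ = tf.placeholder(tf.float32, [None, 2])   # actual labels, one-hot encoded (e.g. Down, Up)
W  = tf.Variable(tf.zeros([1, 2]))           # Shape(W) = [1, 2]
b  = tf.Variable(tf.zeros([2]))              # Shape(b) = [2]

y = tf.nn.softmax(tf.matmul(x, W) + b)       # predicted probabilities for the two classes

# cost: cross entropy between the actual and predicted label distributions
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)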
tf.argmax(y, 1)
[Example: for a column of values 5, 15, 12, 100, 74, 33 at indices 0 to 5, tf.argmax returns 3 - the index of the largest value, 100]
In general, tf.argmax(y, 1) returns, for each row of y, the index (0 … M) of the largest value.
tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
[Example: the actual labels y_ are one-hot encoded - TRUE → (1, 0), FALSE → (0, 1); tf.argmax recovers the index of the 1 in each row, and tf.equal compares actual and predicted indices row by row]
[Example: the predicted output for each row is a pair of probabilities P(TRUE), P(FALSE) - e.g. 0.70/0.30, 0.44/0.56, 0.34/0.66, …, 0.84/0.16 - to be compared against the actual labels (e.g. Mammal vs. Fish)]
[Example: for digit classification, the labels 0 … 9 are one-hot encoded as 10-element vectors, and the predicted output for each row is a vector of 10 probabilities]
tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
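Continuing the sketch above (with y and y_ as defined there), the row-by-row comparison is typically turned into an accuracy figure like this (again assuming the TensorFlow 1.x API):

# a vector of booleans, one per row: did the predicted class match the actual class?
correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))
# cast the booleans to 0.0 / 1.0 and average them: the fraction classified correctly
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))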
How Estimators Work
[Diagram: the estimator is configured with feature columns - each with a name, a type and dimensions - and an input function]
The input function supplies the feature names, the y variable, the batch size and the number of epochs.
The estimator runs the optimisation and returns the trained model.
Complex Neural Networks
[Diagram: a densely connected network]
All those interconnections represent intermediate feature vector data!
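A minimal sketch of that workflow, assuming the TensorFlow 1.x estimator API (the feature name, data values and step count are made up):

import numpy as np
import tensorflow as tf

# one numeric feature named 'x': its name, type and dimensions
feature_columns = [tf.feature_column.numeric_column('x', shape=[1])]

estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

x_train = np.array([1.0, 2.0, 3.0, 4.0])           # hypothetical training data
y_train = np.array([2.0, 4.1, 5.9, 8.2])

# the input function supplies the features, the y variable, the batch size and the epochs
input_fn = tf.estimator.inputs.numpy_input_fn(
    {'x': x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)

estimator.train(input_fn=input_fn, steps=1000)      # run the optimisation, return a trained model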
Linear Regression in TensorFlow
Baseline: regular Python code
Cost function: quantifying goodness-of-fit (linear regression) · similarity of distribution (logistic regression)
Training: batch size for each epoch
Estimators: set up X, Y, batch_size, num_epochs · abstract away cost and optimizer choices · can re-specify the number of training steps · predict new points (test data)
Learning using Neurons
Estimators
“Representation” ML-based systems figure out by
themselves what features to pay attention to
y = Wx + b
“Learning” Regression
Regression can be reverse-engineered by a single neuron.
Regression: The Simplest Neural Network
[Diagram: a set of points → a single neuron → the regression line]
def XOR(x1, x2):
    if x1 == x2:
        return 0
    return 1
“Learning” XOR
The XOR function can be reverse-engineered using 3 neurons arranged in 2 layers.
XOR: 3 Neurons, 2 Layers
X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0
Baseline: regular Python code · Cost function: quantifying goodness-of-fit / similarity of distribution · Training: batch size for each epoch