Ch-2-Mathematical Building Blocks NN

Uploaded by

shahmurrawat
Copyright
© © All Rights Reserved

The Mathematical Building Blocks of Neural Networks


Mathematics is at the core of machine learning: every machine learning
principle is built on it. To learn machine learning, you need a good
understanding of vectors, matrices, probability and statistics, and a
little calculus (derivatives and partial derivatives).
Designing a machine learning algorithm correctly draws on several of
these core mathematical concepts at once.

The importance of mathematics topics for machine learning and data
science is summarized below:
The core building block of neural networks is the layer, a data-
processing module that you can think of as a filter for data. Some data
goes in, and it comes out in a more useful form. Specifically, layers
extract representations out of the data fed into them—hopefully,
representations that are more meaningful for the problem at hand.
Most of deep learning consists of chaining together simple layers that
implement a form of progressive data distillation. A deep-learning
model is like a sieve for data processing, made of a chain of increasingly
refined data filters: the layers.

Here, our network consists of a sequence of two Dense layers, which
are densely connected (also called fully connected) neural layers. The
second (and last) layer is a 10-way softmax layer, which means it will
return an array of 10 probability scores (summing to 1). Each score will
be the probability that the current digit image belongs to one of our 10
digit classes.

The softmax function, often used in the final layer of a neural network
model for classification tasks, converts raw output scores — also
known as logits — into probabilities by taking the exponential of each
output and normalizing these values by dividing by the sum of all the
exponentials.
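
The softmax computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used by any particular framework; the example logits are made up, and subtracting the maximum before exponentiating is a common numerical-stability step that does not change the result.

```python
import numpy as np

def softmax(logits):
    """Convert a 1D array of raw scores (logits) into probabilities."""
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    # Normalize by the sum of all the exponentials.
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores for 3 classes
probs = softmax(logits)

print(probs)        # three positive values, largest for the largest logit
print(probs.sum())  # 1.0, up to floating-point rounding
```

In a 10-way softmax layer the same computation is applied to 10 logits, giving one probability per digit class.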

To make the network ready for training, we need to pick three more
things, as part of the compilation step:
• A loss function: how the network measures its performance on the
training data, and thus how it steers itself in the right direction.
• An optimizer: the mechanism through which the network updates
itself based on the data it sees and its loss function.
• Metrics to monitor during training and testing: here, we'll only care
about accuracy (the fraction of the images that were correctly
classified).
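
To make the loss and the accuracy metric concrete, here is a small NumPy sketch. The text does not say which loss is used; categorical cross-entropy is assumed here because it is a common choice for softmax classifiers. The probabilities and labels are invented for illustration.

```python
import numpy as np

# Hypothetical predicted probabilities for 3 images over 10 digit classes.
# Each row sums to 1, as the output of a softmax layer would.
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82   # confident and correct (true label 7)
probs[1, 2] = 0.82   # confident and correct (true label 2)
probs[2, 1] = 0.82   # confident but wrong (true label 4)
labels = np.array([7, 2, 4])  # true digit for each image

# Categorical cross-entropy (assumed loss): mean negative log-probability
# that the network assigned to the true class.
true_class_probs = probs[np.arange(len(labels)), labels]
loss = -np.mean(np.log(true_class_probs))

# Accuracy: fraction of images whose highest-scoring class is the true one.
accuracy = np.mean(np.argmax(probs, axis=1) == labels)

print(loss)      # larger when the true class gets low probability
print(accuracy)  # 2 of 3 images correct
```

The optimizer's job is to change the network's weights so that the loss goes down; accuracy is only monitored, not optimized directly.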

Data representations for neural networks
In the previous example, we started from data stored in
multidimensional Numpy arrays, also called tensors. In general, all
current machine-learning systems use tensors as their basic data
structure. Tensors are fundamental to the field—so fundamental that
Google’s TensorFlow was named after them.
At its core, a tensor is a container for data—almost always numerical
data. So, it’s a container for numbers. You may be already familiar with
matrices, which are 2D tensors: tensors are a generalization of matrices
to an arbitrary number of dimensions (note that in the context of
tensors, a dimension is often called an axis).

Tensors
A Tensor is an N-dimensional array:
• A Scalar is a 0-dimensional tensor
• A Vector is a 1-dimensional tensor
• A Matrix is a 2-dimensional tensor
A Tensor is a generalization of Vectors and Matrices to higher
dimensions.

Tensors are multi-dimensional arrays with a uniform type (called a dtype).
You can see all supported dtypes at tf.dtypes. Tensors are (kind of) like np.arrays.
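
The "uniform type" rule can be seen directly in NumPy, used here as a stand-in for TensorFlow tensors. This is a small sketch with made-up values: every element of an array shares one dtype, so mixed inputs are converted to a common type.

```python
import numpy as np

x = np.array([1, 2, 3])                     # all integers -> an integer dtype
y = np.array([1, 2, 3.5])                   # mixed int/float -> all stored as float64
z = np.array([1, 2, 3], dtype=np.float32)   # dtype requested explicitly

print(x.dtype)  # a platform-dependent integer type
print(y.dtype)  # float64
print(z.dtype)  # float32
```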

Tensor Ranks
The number of axes (directions) a tensor has is called the Rank of the
tensor.
The rank is denoted R.
A Scalar is a single number.
• It has 0 Axes
• It has a Rank of 0
• It is a 0-dimensional Tensor

A Vector is an array of numbers.
• It has 1 Axis
• It has a Rank of 1
• It is a 1-dimensional Tensor
A Matrix is a 2-dimensional array.
• It has 2 Axes
• It has a Rank of 2
• It is a 2-dimensional Tensor

Vectors

A vector is an array of numbers, which may be continuous or discrete,
and the space that consists of vectors is called a vector space. A vector
can also be viewed as a quantity that has two independent properties:
magnitude and direction. The dimension of a vector space can be finite
or infinite, but machine learning and data science problems deal with
fixed-length vectors.
In deep learning, we deal with multidimensional data, so vectors are
crucial: they serve as the input features for a prediction problem.

An array of numbers is called a vector, or 1D tensor. A 1D tensor is said
to have exactly one axis. Following is a Numpy vector:
>>> import numpy as np
>>> x = np.array([12, 3, 6, 14, 7])
>>> x
array([12,  3,  6, 14,  7])
>>> x.ndim
1
This vector has five entries and so is called a 5-dimensional vector.
Don’t confuse a 5D vector with a 5D tensor! A 5D vector has only one
axis and has five dimensions along its axis, whereas a 5D tensor has five
axes (and may have any number of dimensions along each axis).
Dimensionality can denote either the number of entries along a specific
axis (as in the case of our 5D vector) or the number of axes in a tensor
(such as a 5D tensor), which can be confusing at times.
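
The difference between a 5D vector and a 5D tensor is easy to check with ndim and shape. A minimal sketch; the axis sizes of the 5D tensor are chosen arbitrarily for illustration.

```python
import numpy as np

vector_5d = np.array([12, 3, 6, 14, 7])  # one axis holding five entries
tensor_5d = np.zeros((2, 3, 4, 5, 6))    # five axes; sizes are arbitrary

print(vector_5d.ndim, vector_5d.shape)   # 1 (5,)
print(tensor_5d.ndim, tensor_5d.shape)   # 5 (2, 3, 4, 5, 6)
```

Here ndim counts axes, while each entry of shape counts the entries along one axis.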

Scalars (0D tensors)
A tensor that contains only one number is called a scalar (or scalar tensor, or
0-dimensional tensor, or 0D tensor). In Numpy, a float32 or float64 number is
a scalar tensor (or scalar array). You can display the number of axes of a
Numpy tensor via the ndim attribute; a scalar tensor has 0 axes (ndim == 0).
The number of axes of a tensor is also called its rank. Here’s a Numpy scalar:
>>> import numpy as np
>>> x = np.array(12)
>>> x
array(12)
>>> x.ndim
0
Note: ndim is the number of dimensions (axes) of the ndarray.

Matrices (2D tensors)
An array of vectors is a matrix, or 2D tensor. A matrix has two axes
(often referred to as rows and columns).
A rectangular representation of numbers in rows and columns is called
a matrix. If a matrix has n rows and m columns it is said to be an order
“n x m” matrix (read as n by m matrix). Matrices are used to represent
a dataset systematically with the rows representing different data
points and the columns representing their different parameters.

This is a Numpy matrix:
>>> x = np.array([[5, 78, 2, 34, 0],
...               [6, 79, 3, 35, 1],
...               [7, 80, 4, 36, 2]])
>>> x.ndim
2
The entries from the first axis are called the rows, and the entries from
the second axis are called the columns. In the previous example, [5, 78,
2, 34, 0] is the first row of x, and [5, 6, 7] is the first column.

3D tensors and higher-dimensional tensors
Most structured data is represented in the form of tables, that is, as
matrices.
If you pack such matrices in a new array, you obtain a 3D tensor, which
you can visually understand as a cube of numbers. By packing 3D tensors
in an array, you can create a 4D tensor, and so on. In deep learning,
you'll generally manipulate tensors that are 0D to 4D, although you may
go up to 5D if you process video data.
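
Packing matrices into a 3D tensor can be sketched as follows. The values are an assumption: the same 3 × 5 matrix is simply repeated three times so that the result has shape (3, 3, 5).

```python
import numpy as np

matrix = [[5, 78, 2, 34, 0],
          [6, 79, 3, 35, 1],
          [7, 80, 4, 36, 2]]

# Packing three matrices in a new array gives a 3D tensor.
x3 = np.array([matrix, matrix, matrix])
print(x3.ndim)    # 3
print(x3.shape)   # (3, 3, 5)

# Packing 3D tensors in a new array gives a 4D tensor.
x4 = np.array([x3, x3])
print(x4.ndim)    # 4
print(x4.shape)   # (2, 3, 3, 5)
```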

Key attributes
A tensor is defined by three key attributes:
• Number of axes (rank): For instance, a 3D tensor has three axes, and a
matrix has two axes. This is also called the tensor's ndim in Python libraries
such as Numpy.
• Shape: This is a tuple of integers that describes how many dimensions the
tensor has along each axis. For instance, the previous matrix example has
shape (3, 5), and the 3D tensor example has shape (3, 3, 5). A vector has a
shape with a single element, such as (5,), whereas a scalar has an empty
shape, ().
• Data type (usually called dtype in Python libraries): This is the type of the
data contained in the tensor; for instance, a tensor's type could be float32,
uint8, float64, and so on. On rare occasions, you may see a char tensor. Note
that string tensors don't exist in Numpy (or in most other libraries), because
tensors live in preallocated, contiguous memory segments, and strings, being
variable length, would preclude the use of this implementation.
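
The three attributes can be read directly from a NumPy array. A short sketch; the uint8 dtype is chosen only to show a non-default data type.

```python
import numpy as np

x = np.array([[5, 78, 2, 34, 0],
              [6, 79, 3, 35, 1],
              [7, 80, 4, 36, 2]], dtype=np.uint8)

print(x.ndim)    # 2      -> number of axes (rank)
print(x.shape)   # (3, 5) -> 3 rows, 5 columns
print(x.dtype)   # uint8  -> type of every element

print(np.array([12, 3, 6, 14, 7]).shape)  # (5,) -> a vector
print(np.array(12).shape)                 # ()   -> a scalar
```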

Let's display the fifth digit (the one at index 4) in this 3D tensor, using
the library Matplotlib (part of the standard scientific Python suite).
Matplotlib is a low-level graph plotting library in Python that serves as a
visualization utility.
• Matplotlib was created by John D. Hunter.
• Matplotlib is open source and we can use it freely.
• Matplotlib is mostly written in Python; a few segments are written in
C, Objective-C, and JavaScript for platform compatibility.

Displaying the fifth digit
# Assumption: train_images is the MNIST training set, loaded here through Keras.
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
digit = train_images[4]                # index 4 is the fifth image
plt.imshow(digit, cmap=plt.cm.binary)  # draw the digit in black on white
plt.show()

Real-world examples of data tensors
Let’s make data tensors more concrete with a few examples similar to
what you’ll encounter later. The data you’ll manipulate will almost
always fall into one of the following categories:
• Vector data—2D tensors of shape (samples, features)
• Timeseries data or sequence data—3D tensors of shape (samples,
timesteps, features)
• Images—4D tensors of shape (samples, height, width, channels) or
(samples, channels, height, width)
• Video—5D tensors of shape (samples, frames, height, width,
channels) or (samples, frames, channels, height, width)

Vector data
This is the most common case. In such a dataset, each single data point can
be encoded as a vector, and thus a batch of data will be encoded as a 2D
tensor (that is, an array of vectors), where the first axis is the samples axis
and the second axis is the features axis. Let’s take a look at two examples:
• An actuarial dataset of people, where we consider each person’s age, ZIP
code, and income. Each person can be characterized as a vector of 3 values,
and thus an entire dataset of 100,000 people can be stored in a 2D tensor
of shape (100000, 3).
• A dataset of text documents, where we represent each document by the
counts of how many times each word appears in it (out of a dictionary of
20,000 common words). Each document can be encoded as a vector of
20,000 values (one count per word in the dictionary), and thus an entire
dataset of 500 documents can be stored in a tensor of shape (500, 20000).
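
The word-count encoding in the second example can be sketched on a toy scale. Everything here is hypothetical: a 5-word dictionary stands in for the 20,000-word one, and two short strings stand in for the 500 documents.

```python
import numpy as np

# Hypothetical 5-word dictionary standing in for the 20,000-word one.
vocabulary = ["tensor", "axis", "shape", "layer", "data"]
word_index = {word: i for i, word in enumerate(vocabulary)}

documents = [
    "tensor shape tensor data",
    "layer data data axis",
]

# One row per document (samples axis), one column per word (features axis).
counts = np.zeros((len(documents), len(vocabulary)), dtype=np.int64)
for row, text in enumerate(documents):
    for word in text.split():
        if word in word_index:  # words outside the dictionary are ignored
            counts[row, word_index[word]] += 1

print(counts.shape)  # (2, 5)
print(counts)
```

With the real sizes from the text, the same layout would give a tensor of shape (500, 20000).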

Timeseries data or sequence data
Whenever time matters in your data (or the notion of sequence order),
it makes sense to store it in a 3D tensor with an explicit time axis. Each
sample can be encoded as a sequence of vectors (a 2D tensor), and
thus a batch of data will be encoded as a 3D tensor.
A 3D timeseries data tensor

The time axis is always the second axis (axis of index 1), by convention.
Let’s look at a few examples:

• A dataset of stock prices. Every minute, we store the current price of
the stock, the highest price in the past minute, and the lowest price in
the past minute. Thus every minute is encoded as a 3D vector, an
entire day of trading is encoded as a 2D tensor of shape (390, 3)
(there are 390 minutes in a trading day), and 250 days’ worth of data
can be stored in a 3D tensor of shape (250, 390, 3). Here, each
sample would be one day’s worth of data.

• A dataset of tweets, where we encode each tweet as a sequence of
280 characters out of an alphabet of 128 unique characters. In this
setting, each character can be encoded as a binary vector of size 128
(an all-zeros vector except for a 1 entry at the index corresponding to
the character). Then each tweet can be encoded as a 2D tensor of
shape (280, 128), and a dataset of 1 million tweets can be stored in a
tensor of shape (1000000, 280, 128).
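
Both examples can be sketched in NumPy. The stock tensor is filled with zeros because no real prices are available here. For the tweets, the 128-character alphabet is assumed to be 7-bit ASCII (so ord() gives the index), characters outside it are skipped, and tweets shorter than 280 characters are padded with all-zero rows; none of these details come from the text.

```python
import numpy as np

# Stock prices: (days, minutes per trading day, values stored per minute).
prices = np.zeros((250, 390, 3), dtype=np.float32)  # zeros as placeholder data
print(prices.shape)     # (250, 390, 3)
print(prices[0].shape)  # (390, 3): one sample is one day of trading

# Tweets: one-hot encode each character over a 128-symbol alphabet.
MAX_LEN, ALPHABET = 280, 128

def encode_tweet(text):
    """Return a (280, 128) binary tensor with one 1 per encoded character."""
    encoded = np.zeros((MAX_LEN, ALPHABET), dtype=np.uint8)
    for position, char in enumerate(text[:MAX_LEN]):
        code = ord(char)
        if code < ALPHABET:  # assumption: characters outside ASCII are skipped
            encoded[position, code] = 1
    return encoded

tweets = ["hello tensors", "time matters"]  # hypothetical tweets
batch = np.stack([encode_tweet(t) for t in tweets])
print(batch.shape)  # (2, 280, 128)
```

In both cases the time axis (minutes or character positions) is axis 1, following the convention stated above.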

Image Data

Images typically have three dimensions: height, width, and color depth.
Although grayscale images (like our MNIST digits) have only a single
color channel and could thus be stored in 2D tensors, by convention
image tensors are always 3D, with a one-dimensional color channel for
grayscale images. A batch of 128 grayscale images of size 256 × 256
could thus be stored in a tensor of shape (128, 256, 256, 1), and a
batch of 128 color images could be stored in a tensor of shape (128,
256, 256, 3).

A 4D image data tensor (channels-first convention)

There are two conventions for the shapes of image tensors: the
channels-last convention (used by TensorFlow) and the channels-first convention
(used by Theano). The TensorFlow machine-learning framework, from
Google, places the color-depth axis at the end: (samples, height, width,
color_depth).
Meanwhile, Theano places the color depth axis right after the batch
axis: (samples, color_depth, height, width). With the Theano
convention, the previous examples would become (128, 1, 256, 256)
and (128, 3, 256, 256). The Keras framework provides support for both
formats.
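
Converting between the two conventions is a matter of reordering axes. A minimal NumPy sketch, using zero-filled uint8 arrays as placeholder image batches:

```python
import numpy as np

# Channels-last batches: (samples, height, width, color_depth).
gray_batch = np.zeros((128, 256, 256, 1), dtype=np.uint8)
color_batch = np.zeros((128, 256, 256, 3), dtype=np.uint8)

# Move the channel axis (index 3) to right after the batch axis.
gray_first = np.transpose(gray_batch, (0, 3, 1, 2))
color_first = np.transpose(color_batch, (0, 3, 1, 2))

print(gray_first.shape)   # (128, 1, 256, 256)
print(color_first.shape)  # (128, 3, 256, 256)

# The reverse permutation restores the channels-last layout.
restored = np.transpose(color_first, (0, 2, 3, 1))
print(restored.shape)     # (128, 256, 256, 3)
```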
Theano is a Python library that allows us to efficiently evaluate
mathematical operations involving multi-dimensional arrays. It is mostly
used in building deep learning projects. Theano typically runs much
faster on a graphics processing unit (GPU) than on a CPU.

Video data
Video data is one of the few types of real-world data for which you’ll need
5D tensors. A video can be understood as a sequence of frames, each frame
being a color image. Because each frame can be stored in a 3D tensor
(height, width, color_depth), a sequence of frames can be stored in a 4D
tensor (frames, height, width, color_depth), and thus a batch of different
videos can be stored in a 5D tensor of shape (samples, frames, height, width,
color_depth).
For instance, a 60-second, 144 × 256 YouTube video clip sampled at 4 frames
per second would have 240 frames. A batch of four such video clips would be
stored in a tensor of shape (4, 240, 144, 256, 3). That’s a total of 106,168,320
values! If the dtype of the tensor was float32, then each value would be
stored in 32 bits, so the tensor would represent 405 MB. Heavy! Videos you
encounter in real life are much lighter, because they aren’t stored in float32,
and they’re typically compressed by a large factor (such as in the MPEG
format).
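
The arithmetic above can be checked without allocating the tensor. A small sketch; "MB" is taken to mean 2**20 bytes, which is the reading that gives 405.

```python
import math
import numpy as np

frames = 60 * 4                      # 60 seconds sampled at 4 frames per second
shape = (4, frames, 144, 256, 3)     # (samples, frames, height, width, color_depth)

n_values = math.prod(shape)
n_bytes = n_values * np.dtype(np.float32).itemsize  # 4 bytes per float32 value

print(frames)             # 240
print(n_values)           # 106168320
print(n_bytes / 2**20)    # 405.0
```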