DL UNIT 3

COMPUTER SCIENCE ENGINEERING (Jawaharlal Nehru Technological University, Kakinada)


UNIT -III
Neural Networks: Anatomy of Neural Network, Introduction to Keras: Keras, TensorFlow,
Theano and CNTK, Setting up Deep Learning Workstation, Classifying Movie Reviews: Binary
Classification, Classifying newswires: Multiclass Classification.

3.1. Anatomy of a neural network


Training a neural network revolves around the following objects:
Layers, which are combined into a network (or model)
The input data and corresponding targets
The loss function, which defines the feedback signal used for learning
The optimizer, which determines how learning proceeds

You can visualize their interaction as illustrated in figure 3.1: the network, composed of layers that are
chained together, maps the input data to predictions.
The loss function then compares these predictions to the targets, producing a loss value: a measure of how
well the network’s predictions match what was expected.
The optimizer uses this loss value to update the network’s weights.

3.1.1 Layers: the building blocks of deep learning


The fundamental data structure in neural networks is the layer.
A layer is a data-processing module that takes as input one or more tensors and that outputs one or more
tensors. Some layers are stateless, but more frequently layers have a state.
Different layers are appropriate for different tensor formats and different types of data processing.
For instance, simple vector data, stored in 2D tensors of shape (samples, features), is often processed by
densely connected layers, also called fully connected or dense layers (the Dense class in Keras). Sequence
data, stored in 3D tensors of shape (samples, timesteps, features), is typically processed by recurrent
layers such as an LSTM layer. Image data, stored in 4D tensors, is usually processed by 2D convolution
layers (Conv2D).
Building deep-learning models in Keras is done by clipping together compatible layers to form useful
data-transformation pipelines.

Consider the following example
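Here is a minimal sketch of that example, assuming the standalone Keras API:

from keras import layers

# A dense layer with 32 output units that accepts 2D input tensors whose
# second dimension is 784; the batch dimension (axis 0) is left unspecified.
layer = layers.Dense(32, input_shape=(784,))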

We’re creating a layer that will only accept as input 2D tensors where the first dimension is 784 (axis 0,
the batch dimension, is unspecified, and thus any value would be accepted). This layer will return a tensor
where the first dimension has been transformed to be 32.
When using Keras, you don’t have to worry about compatibility, because the layers you add to your
models are dynamically built to match the shape of the incoming layer. For instance, suppose you write
the following
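(a sketch of what such a model might look like, again assuming the Keras Sequential API):

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(32, input_shape=(784,)))
# No input_shape argument here: the layer infers its input shape
# from the output shape of the layer before it.
model.add(layers.Dense(32))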


The second layer didn’t receive an input shape argument—instead, it automatically inferred its input
shape as being the output shape of the layer that came before.
3.1.2 Models: networks of layers
A deep-learning model is a directed, acyclic graph of layers. The most common instance is a linear stack
of layers, mapping a single input to a single output.
But as you move forward, you’ll be exposed to a much broader variety of network topologies. Some
common ones include the following:

o Two-branch networks
o Multihead networks
o Inception blocks

Picking the right network architecture is more an art than a science; and although there are some best
practices and principles you can rely on, only practice can help you become a proper neural-network
architect.
3.1.3 Loss functions and optimizers: keys to configuring the learning process
Once the network architecture is defined, you still have to choose two more things:
• Loss function (objective function)—The quantity that will be minimized during training. It
represents a measure of success for the task at hand.

• Optimizer—Determines how the network will be updated based on the loss function.
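As an illustrative sketch of these two choices in Keras (the model and its input size here are hypothetical; only the compile step matters):

from keras import models, layers, optimizers

# A small hypothetical binary classifier, just to have something to compile.
model = models.Sequential()
model.add(layers.Dense(16, activation='relu', input_shape=(100,)))
model.add(layers.Dense(1, activation='sigmoid'))

# The loss function and the optimizer are chosen at compile time. Passing an
# optimizer instance lets you configure it (the argument is named
# learning_rate instead of lr in newer Keras versions).
model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])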

3.2. INTRODUCTION TO KERAS

Deep learning is one of the major subfields of machine learning. Machine learning is the study of the
design of algorithms inspired by the model of the human brain. Deep learning is becoming more popular
in data science fields like robotics, artificial intelligence (AI), audio and video recognition, and image
recognition. The artificial neural network is the core of deep learning methodologies. Deep learning is
supported by various libraries such as Theano, TensorFlow, Caffe, MXNet, etc. Keras is one of the most
powerful and easy-to-use Python libraries for creating deep learning models, built on top of popular deep
learning libraries like TensorFlow and Theano.

Overview of Keras

Keras runs on top of open-source machine learning libraries like TensorFlow, Theano, or the Cognitive Toolkit (CNTK).
Theano is a Python library used for fast numerical computation tasks. TensorFlow is the most famous
symbolic math library used for creating neural networks and deep learning models; it is very flexible, and
its primary benefit is distributed computing. CNTK is a deep learning framework developed by Microsoft,
usable from Python, C#, and C++ or as a standalone machine learning toolkit. Theano and TensorFlow are
very powerful libraries, but they are difficult to use directly for creating neural networks.

Keras is based on a minimal structure that provides a clean and easy way to create deep learning models
on top of TensorFlow or Theano. Keras is designed to let you define deep learning models quickly, which
makes it an optimal choice for deep learning applications.

Features

Keras leverages various optimization techniques to make its high-level neural network API easier to use
and more performant. It supports the following features −

• Consistent, simple and extensible API.


• Minimal structure - easy to achieve the result without any frills.
• It supports multiple platforms and backends.

• It is a user-friendly framework that runs on both CPU and GPU.


• Highly scalable computation.

Benefits

Keras is a highly powerful and dynamic framework that comes with the following advantages −

• Larger community support.


• Easy to test.
• Keras neural networks are written in Python which makes things simpler.
• Keras supports both convolution and recurrent networks.
• Deep learning models are built from discrete components that you can combine in many ways.

Keras:
Keras is a high-level neural networks API written in Python, which serves as an interface for building and
training deep learning models. It's designed to be user-friendly, modular, and extensible, allowing
developers to quickly prototype and experiment with neural networks. Keras provides a simple and intuitive
API for constructing various types of neural network architectures, such as convolutional neural networks
(CNNs), recurrent neural networks (RNNs), and more. It supports both CPU and GPU computations.
TensorFlow:
TensorFlow is an open-source machine learning framework developed by Google Brain. It provides a
comprehensive ecosystem of tools, libraries, and resources for building and deploying machine learning
models. TensorFlow includes a low-level API that allows users to define and execute computational graphs,
as well as a high-level API called TensorFlow Keras, which integrates seamlessly with Keras. In fact, since
TensorFlow 2.0, Keras has become the official high-level API for building models in TensorFlow.
Theano:
Theano was an open-source numerical computation library developed by the Montreal Institute for Learning
Algorithms (MILA). It allowed users to define, optimize, and evaluate mathematical expressions involving
multi-dimensional arrays efficiently. Theano was widely used in the early days of deep learning, and Keras
originally supported it as one of its backends. However, Theano development ceased in 2017, and it's no
longer actively maintained.
CNTK (Microsoft Cognitive Toolkit):
The Microsoft Cognitive Toolkit, formerly known as CNTK, is an open-source deep learning framework
developed by Microsoft. Like TensorFlow and Theano, CNTK provides a scalable and efficient platform
for training deep learning models. Keras also supported CNTK as one of its backends, allowing users to
leverage the capabilities of CNTK while using the Keras API. However, CNTK has since been deprecated,
and Microsoft has shifted its focus to supporting TensorFlow as the primary deep learning framework on
its Azure platform.

3.3. Setting up Deep Learning Workstation


How to set up your deep learning workstation: the most comprehensive guide
• Prerequisites to set up deep learning workstation
• The main steps to set up a deep learning workstation
• Updating the Linux system packages
• Installing the Python-pip command
• Installation steps for the Python scientific suite in Ubuntu
• Installation of the BLAS library
• Installation of Python basic libraries
• Installation of HDF5
• Installation of modules to visualize Keras model
• Installation of opencv package
• Setting up GPU for deep learning
• CUDA installation
• Install cuDNN


• Installation of TensorFlow
• Installing Keras
• Optional installation of Theano
I assume that you already have Ubuntu on your computer. If not, please install the latest version of
Ubuntu, the most famous open-source Linux distribution. Although it is possible to run Keras deep
learning models on Windows, it is not recommended.
Another prerequisite for running deep learning models is a good-quality GPU. I advise you to have an
NVIDIA GPU in your computer for satisfactory performance. It is strongly recommended, though not
strictly required, because running sequence processing with recurrent neural networks or image processing
with convolutional models on a CPU is a difficult proposition.
Such models may take hours to give results when run on a CPU, whereas a modern NVIDIA GPU will
take merely 5-10 minutes to complete them. If you are not interested in investing in a GPU, an alternative
is using a cloud computing service and paying hourly rent.
However, in the long run, using such a service may cost you more than upgrading your local system. So my
suggestion is: if you are serious about deep learning and expect even moderate use, go for a good
workstation setup.

The main steps to set up a deep learning workstation


It is a somewhat time-consuming process. You will require a stable internet connection to download various files.
Depending on the internet speed, the complete process may take 2-3 hours (with an internet speed of 1 Gbps,
in my case it took 2 hours). The main steps to set up a deep learning workstation are as follows:

• Updating the Linux system packages


• Installation of the Python pip command, the basic tool that will be used to
install the other components
• Installing the Basic Linear Algebra Subprograms (BLAS) library required for
mathematical operations.
• HDF5 data frame installation to store hierarchical data
• Installation of Graphviz to visualize Keras model
• CUDA and cuDNN NVIDIA graphics drivers installation
• Installation of TensorFlow as the backend of Keras
• Keras installation
• Installation of Theano (optional)
So, we will now proceed with the step-by-step installation process.

Updating the Linux system packages


The following commands complete the Linux system upgrade. You have to type them in the Ubuntu
terminal; the keyboard shortcut to open the terminal is "Ctrl+Alt+T". Open the terminal and execute the
following lines of code.

$ sudo apt-get update


$ sudo apt-get --assume-yes upgrade
Installing the Python-pip command
The pip command is for installing and managing Python packages. Whichever packages we are going to
install next, this pip command will be used. It is a replacement for the earlier easy_install command. Run
the following command to install python-pip.
$ sudo apt-get install python-pip python-dev
This should install pip on your computer, but sometimes there are exceptions, as happened to me: the
Ubuntu terminal reported "Unable to locate package python-pip".


This was puzzling at first, as on my old computer I had used the command a number of times without any
issue. After scouring the internet for several hours, I found the solution: it has to do with the Python
version installed on your computer.

If you are also facing the problem (most likely if using a new computer) then first check the python version
with this command.

$ ls /bin/python*

If it returns Python 2 (for example, Python 2.7), use the python2-pip command; if it returns a higher
Python version such as Python 3.8, use the python3-pip command to install pip. The command will then be
as below:
$ sudo apt-get install python3-pip
Ubuntu by default uses Python 2 when updating its packages; plain "python" means Python 2 to Ubuntu.
If you want to use Python 3, it needs to be mentioned explicitly. So, to change the Python version, use
the following code.

# Installing Python3
$ sudo apt-get install python3-pip python3-dev
Installation steps for the Python scientific suite in Ubuntu
The process discussed here is for Windows and Linux operating systems. Mac users need to install the
Python scientific suite via Anaconda, from the Anaconda repository. The Anaconda documentation is
continuously updated and describes every step in detail.
Installation of the BLAS library
The Basic Linear Algebra Subprograms (BLAS) installation is the first step in setting up your deep learning
workstation. One thing Mac users should keep in mind: this installation does not include Graphviz and
HDF5, which have to be installed separately.

Here we will install OpenBLAS using the following command.

$ sudo apt-get install build-essential cmake git unzip \


pkg-config libopenblas-dev liblapack-dev
Installation of Python basic libraries
In the next step, we install the basic Python libraries: NumPy, pandas, Matplotlib, SciPy, etc. These are
core Python libraries required for any kind of mathematical operation, so be it machine learning, deep
learning, or any other computation-intensive task, we will need these libraries.
Use the following command in the Ubuntu terminal to install this scientific suite in one go.

# installation of Python basic libraries


$ sudo apt-get install python-pandas python-numpy python-scipy python-matplotlib python-yaml
Installation of HDF5
The Hierarchical Data Format (HDF) version 5 is an open-source file format that supports large, complex,
and heterogeneous data sources. It was developed at the National Center for Supercomputing Applications
(NCSA) to store large numeric data files in efficient binary formats, and it builds on earlier hierarchical
data formats such as HDF4 and NetCDF.
HDF5 data format allows the developer to organize his machine learning/deep learning data in a file
directory structure very similar to what we use in any computer. This directory structure can be used to
maintain the hierarchy of the data.

If we map HDF5 onto the nomenclature of the computer filing system, then a "directory" or "folder" is a
"group" and the "files" are "datasets" in the HDF5 data format. HDF5 matters in deep learning because it
is used to save and fetch Keras models from disk.

Run the following command to install HDF5 in your machine

# Install HDF5 data format to save the Keras models


$ sudo apt-get install libhdf5-serial-dev python-h5py
Installation of modules to visualize Keras model


In the next step we will install two packages, Graphviz and pydot-ng. These two packages are necessary
to visualize the Keras model. The commands for installing these two packages are as follows:
# Install graphviz
$ sudo apt-get install graphviz
# Install pydot-ng
$ sudo pip install pydot-ng
These two packages will help you visualize the deep learning models you create. But for the time being,
you can skip their installation and proceed with the GPU configuration part; Keras can also function
without these two packages.

Installation of opencv package


Use the following code to install opencv package

# Install opencv
$ sudo apt-get install python-opencv
Setting up GPU for deep learning
Here comes the most important part. As you know, the GPU plays an important role in deep learning
modelling. In this section, we are going to set up GPU support by installing two components, namely
CUDA and cuDNN. To function properly, they need an NVIDIA GPU.

Although you can run your Keras model even on the CPU, it will take much longer to train a model than
on a GPU. So, my advice is: if you are serious about deep learning modelling, plan to procure an NVIDIA
GPU (using a cloud service and paying hourly rent is also an alternative).

Let's concentrate on setting up the GPU, assuming that your computer already has a recent one.

CUDA installation
To install CUDA, visit the NVIDIA download page at https://developer.nvidia.com/cuda-downloads.
You will land on a page that asks you to select the OS you are using. As we are using Ubuntu here,
click Ubuntu.

CUDA installation-OS selection


Then it will ask for the other specifications of your workstation environment. Select them as per your
existing specifications. Here I selected the OS as Linux; I am using a Dell Latitude 3400 laptop, which is
a 64-bit computer, so in the next option I selected x86_64; the Linux distribution is Ubuntu version 20.04.

Finally, you have to select the installer type. Here I selected the network installer, mainly because it has
a comparatively smaller download size; I was using my mobile internet at the time, so it was the best
option for me. But you can choose one of the local installation options if there is no constraint on internet
bandwidth. The plus point of a local installation is that you have to download it only once.


CUDA installation-specification selection


Once all the specifications are selected, NVIDIA will provide you the installer commands. Copy the code
from there and run it in the Ubuntu terminal. It will use Ubuntu's apt to install the packages, which is the
easiest way to install CUDA.

CUDA installation code


$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-
ubuntu2004.pin
$ sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ sudo apt-key adv --fetch-keys
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
$ sudo add-apt-repository "deb
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
$ sudo apt-get update
$ sudo apt-get -y install cuda
Install cuDNN
“cuDNN is a powerful library for Machine Learning. It has been developed to help developers like yourself
to accelerate the next generation of world changing applications.”

NVIDIA.com
To download the specific cuDNN file for your operating system and Linux distribution, you have to visit
the NVIDIA download page.


Downloading cuDNN
To download the library, you have to create an account with NVIDIA. It is a compulsory step.

NVIDIA membership for Downloading cuDNN


Fill in the necessary fields.

NVIDIA membership for Downloading cuDNN


As you finish registration a window with some optional settings will appear. You can skip them and proceed
for the next step.


NVIDIA membership for Downloading cuDNN


A short survey by NVIDIA is the next step. Although it asks about your experience as a developer, you
can fill it in with any of the options just to navigate to the download page.

Download survey for cuDNN


Now the page with several download options will appear and you have to choose according to your
specifications. I have selected the following debian file for my workstation.

Selecting the OS for cuDNN download


Download the file (the file size is around 300 MB in my case). Now, to install the library, first change
directory into the download folder and execute the install command.
Once you are in the directory where the library has been downloaded (by default, the download folder of
your computer), run the command below. Use the actual filename in place of **** in the command.

$ sudo dpkg -i ******.deb


You can follow the installation process from this page. With this, the cuDNN installation is complete.
Installation of TensorFlow
The next step is the installation of TensorFlow. It is very simple: just execute the command below to
install TensorFlow without GPU support, using the pip command.

# Installing TensorFlow using pip3 command for Python3


$ sudo pip3 install tensorflow
Installing Keras
This is the final step of setting up your deep learning workstation, and then you are good to go. You can
run the simple command below.

$ sudo pip3 install keras


Or you can install it from GitHub. The benefit of installing Keras from GitHub is that you get lots of
example code. You can run those example scripts to test them on your machine; they are a very good
source of learning.
$ git clone https://github.com/fchollet/keras
$ cd keras
$ sudo python setup.py install
Optional installation of Theano
Installation of Theano is optional as we have already installed TensorFlow. However, installing Theano can
prove advantageous while building Keras code and switching between TensorFlow and Theano. Execute
the code below to finish installing Theano:

$ sudo pip3 install theano


Congratulations! You have finished all the installations and completed the setup of your deep learning
workstation. You are now ready to execute your first deep learning neural network code.

I hope this article proves helpful in setting up your deep learning workstation. It is indeed a lengthy
article, but it covers all the technicalities you may need in case of any difficulty during the process. A
little knowledge about every component you are installing also helps you make further changes to the
setup.



3.4. Classifying Movie Reviews: Binary Classification

Two-class classification, or binary classification, may be the most widely applied kind of machine-learning problem.
In this example, you’ll learn to classify movie reviews as positive or negative, based on the text content of the reviews.
3.4.1 The IMDB dataset
You’ll work with the IMDB dataset: a set of 50,000 highly polarized reviews from the Internet Movie Database.
They’re split into 25,000 reviews for training and 25,000 reviews for testing, each set consisting of 50% negative and
50% positive reviews.
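Loading the dataset, as a sketch using the loader bundled with Keras:

from keras.datasets import imdb

# The IMDB dataset comes packaged with Keras, already preprocessed
# into sequences of word indices.
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(
    num_words=10000)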

The argument num_words=10000 means you'll only keep the top 10,000 most frequently occurring words in the
training data. Rare words will be discarded. This allows you to work with vector data of manageable size.
The variables train_data and test_data are lists of reviews; each review is a list of word indices (encoding a sequence of
words). train_labels and test_labels are lists of 0s and 1s, where 0 stands for negative and 1 stands for positive.

Because you’re restricting yourself to the top 10,000 most frequent words, no word index will exceed 10,000:
>>> max([max(sequence) for sequence in train_data]) output:
9999
For kicks, here’s how you can quickly decode one of these reviews back to English words:


3.4.2 Preparing the data


You can’t feed lists of integers into a neural network. You have to turn your lists into tensors. There are twoways to do
that:

1. Pad your lists so that they all have the same length, turn them into an integer tensor of shape (samples,
word_indices), and then use as the first layer in your network a layer capable of handling such integer tensors (the
Embedding layer, which we’ll cover in detail later in the book).

2. One-hot encode your lists to turn them into vectors of 0s and 1s. This would mean, for instance, turning the
sequence [3, 5] into a 10,000-dimensional vector that would be all 0s except for indices 3 and 5, which would be 1s.
Then you could use as the first layer in your network a Dense layer, capable of handling floating-point vector data.
Let’s go with the latter solution to vectorize the data, which you’ll do manually for maximum clarity

Here’s what the samples look like now:


>>> x_train[0]
array([ 0., 1., 1., ..., 0., 0., 0.])
You should also vectorize your labels, which is straightforward:
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')
3.4.3 Building your network
The input data is vectors, and the labels are scalars (1s and 0s): this is the easiest setup you’ll ever encounter. A type of
network that performs well on such a problem is a simple stack of fully connected (Dense) layers with relu activations:
Dense(16, activation='relu').
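A sketch of such a network, with two 16-unit relu layers and the single-unit sigmoid output layer discussed below:

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))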


Finally, you need to choose a loss function and an optimizer. Because you’re facing a binary classification problem
and the output of your network is a probability (you end your network with a single-unit layer with a sigmoid
activation), it’s best to use the binary_crossentropy loss. It isn’t the only viable choice: you could use, for instance,
mean_squared_error. But crossentropy is usually the best choice when you’re dealing with models that output
probabilities.
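A sketch of the compilation step, assuming the rmsprop optimizer used elsewhere in this unit:

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])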

3.4.4 Validating your approach


In order to monitor during training the accuracy of the model on data it has never seen before, you'll create a validation
set by setting apart 10,000 samples from the original training data:
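A sketch of that split, assuming the vectorized x_train and y_train from above:

x_val = x_train[:10000]
partial_x_train = x_train[10000:]
y_val = y_train[:10000]
partial_y_train = y_train[10000:]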

You’ll now train the model for 20 epochs (20 iterations over all samples in the x_train and y_train tensors), in mini-
batches of 512 samples. At the same time, you’ll monitor loss and accuracy on the 10,000 samples that you set apart.
You do so by passing the validation data as the validation_data argument.
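A sketch of the training call:

history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))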


let’s use Matplotlib to plot the training and validation loss side by side (see figure 3.7), as well as the training and
validation accuracy (see figure 3.8).
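A sketch of the loss plot (the accuracy plot is analogous, using the accuracy history keys, whose exact names, 'acc' or 'accuracy', vary with the Keras version):

import matplotlib.pyplot as plt

history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
epochs = range(1, len(loss_values) + 1)

plt.plot(epochs, loss_values, 'bo', label='Training loss')       # 'bo': blue dots
plt.plot(epochs, val_loss_values, 'b', label='Validation loss')  # 'b': solid blue line
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()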

As you can see, the training loss decreases with every epoch, and the training accuracy increases with every epoch. That's
what you would expect when running gradient-descent optimization — the quantity you're trying to minimize should be
less with every iteration.


3.5. Classifying newswires: Multiclass Classification


In this example, we will build a model to classify Reuters newswires into 46 mutually exclusive topics. Because we have
many classes, this problem is an instance of multiclass classification; and because each data point should be classified
into only one category, the problem is more specifically an instance of single-label, multiclass classification. If each
data point could belong to multiple categories (in this case, topics), you’d be facing a multilabel, multiclass
classification problem.
Reuters dataset
We will work with the Reuters dataset, a set of short newswires and their topics, published by Reuters in 1986. It’s a simple,
widely used toy dataset for text classification. There are 46 different topics; some topics are more represented than others,
but each topic has at least 10 examples in the training set. Like IMDB and MNIST, the Reuters dataset comes packaged as
part of Keras.
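Loading it follows the same pattern as the IMDB example; a sketch using the loader bundled with Keras:

from tensorflow.keras.datasets import reuters

# Keep only the 10,000 most frequent words, as in the IMDB example.
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(
    num_words=10000)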



4. Data Prep
Vectorize the input data:

import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results

x_train = vectorize_sequences(train_data)   # 1. Vectorize training data
x_test = vectorize_sequences(test_data)     # 2. Vectorize test data


Vectorize the labels with the exact same code as in the previous example:

def to_one_hot(labels, dimension=46):
    results = np.zeros((len(labels), dimension))
    for i, label in enumerate(labels):
        results[i, label] = 1.
    return results

one_hot_train_labels = to_one_hot(train_labels)   # 1. Vectorize training labels
one_hot_test_labels = to_one_hot(test_labels)     # 2. Vectorize test labels


Note that there is a built-in way to do this in Keras:
from tensorflow.keras.utils import to_categorical
one_hot_train_labels = to_categorical(train_labels)
one_hot_test_labels = to_categorical(test_labels)
5. Building the model
This topic-classification problem looks similar to the previous movie-review classification: in both cases, we are trying
to classify short snippets of text. But there is a new constraint here: the number of output classes has gone from 2 to 46. The
dimensionality of the output space is much larger.
In a stack of Dense layers like the ones we have been using, each layer can only access information present in the output of the
previous layer. If one layer drops some information relevant to the classification problem, this information can never
be recovered by later layers: each layer can potentially become an information bottleneck. In the previous example, we
used 16-dimensional intermediate layers, but a 16-dimensional space may be too limited to learn to separate 46 different
classes: such small layers may act as information bottlenecks, permanently dropping relevant information. For this
reason we will use larger layers. Let's go with 64 units.


Model Definition

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(46, activation='softmax')
])
Note about this architecture:

1. We end the model with a Dense layer of size 46. This means for each input sample, the network will output a
46-dimensional vector. Each entry in this vector (each dimension) will encode a different output class.

2. The last layer uses a softmax activation. You saw this pattern in the MNIST example. It means the model will
output a probability distribution over the 46 different output classes — for every input sample, the model will
produce a 46-dimensional output vector, where output[i] is the probability that the sample belongs to class i. The
46 scores will sum to 1.

3. The best loss function to use in this case is categorical_crossentropy. It measures the distance between two
probability distributions: here, between the probability distribution output by the model and the true
distribution of the labels. By minimizing the distance between these two distributions, you train the model to
output something as close as possible to the true labels.

6. Compile the model


model.compile(optimizer='rmsprop',
loss='categorical_crossentropy',
metrics=['accuracy'])
Validation of the approach
Let's set apart 1,000 samples in the training data to use as a validation set:

x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = one_hot_train_labels[:1000]
partial_y_train = one_hot_train_labels[1000:]

Let's train the model for 20 epochs.
Training the model
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))


Epoch 1/20
16/16 [==============================] - 2s 81ms/step - loss: 3.1029 - accuracy: 0.4079 - val_loss: 1.7132 - val_accuracy: 0.6440
Epoch 2/20
16/16 [==============================] - 1s 38ms/step - loss: 1.4807 - accuracy: 0.6992 - val_loss: 1.2964 - val_accuracy: 0.7230
Epoch 3/20
16/16 [==============================] - 1s 36ms/step - loss: 1.0763 - accuracy: 0.7762 - val_loss: 1.1460 - val_accuracy: 0.7380
Epoch 4/20
16/16 [==============================] - 1s 36ms/step - loss: 0.8441 - accuracy: 0.8245 - val_loss: 1.0389 - val_accuracy: 0.7810
Epoch 5/20
16/16 [==============================] - 1s 37ms/step - loss: 0.6595 - accuracy: 0.8658 - val_loss: 0.9456 - val_accuracy: 0.8050
Epoch 6/20
16/16 [==============================] - 1s 37ms/step - loss: 0.5237 - accuracy: 0.8945 - val_loss: 0.9203 - val_accuracy: 0.8040
Epoch 7/20
16/16 [==============================] - 1s 36ms/step - loss: 0.4181 - accuracy: 0.9160 - val_loss: 0.8765 - val_accuracy: 0.8140
Epoch 8/20
16/16 [==============================] - 1s 35ms/step - loss: 0.3485 - accuracy: 0.9316 - val_loss: 0.8895 - val_accuracy: 0.8060
Epoch 9/20
16/16 [==============================] - 1s 36ms/step - loss: 0.2829 - accuracy: 0.9390 - val_loss: 0.8829 - val_accuracy: 0.8110
Epoch 10/20
16/16 [==============================] - 1s 36ms/step - loss: 0.2246 - accuracy: 0.9479 - val_loss: 0.9112 - val_accuracy: 0.8140
Epoch 11/20
16/16 [==============================] - 1s 36ms/step - loss: 0.1894 - accuracy: 0.9532 - val_loss: 0.9060 - val_accuracy: 0.8120
Epoch 12/20
16/16 [==============================] - 1s 37ms/step - loss: 0.1765 - accuracy: 0.9538 - val_loss: 0.9068 - val_accuracy: 0.8160
Epoch 13/20
16/16 [==============================] - 1s 37ms/step - loss: 0.1610 - accuracy: 0.9529 - val_loss: 0.9394 - val_accuracy: 0.8100
Epoch 14/20
16/16 [==============================] - 1s 37ms/step - loss: 0.1438 - accuracy: 0.9574 - val_loss: 0.9254 - val_accuracy: 0.8190
Epoch 15/20
16/16 [==============================] - 1s 35ms/step - loss: 0.1305 - accuracy: 0.9584 - val_loss: 0.9666 - val_accuracy: 0.8060
Epoch 16/20
16/16 [==============================] - 1s 37ms/step - loss: 0.1291 - accuracy: 0.9562 - val_loss: 0.9537 - val_accuracy: 0.8120
Epoch 17/20
16/16 [==============================] - 1s 36ms/step - loss: 0.1140 - accuracy: 0.9593 - val_loss: 1.0202 - val_accuracy: 0.8020
Epoch 18/20
16/16 [==============================] - 1s 38ms/step - loss: 0.1167 - accuracy: 0.9567 - val_loss: 0.9942 - val_accuracy: 0.8070
Epoch 19/20
16/16 [==============================] - 1s 38ms/step - loss: 0.0972 - accuracy: 0.9669 - val_loss: 1.0709 - val_accuracy: 0.7960
Epoch 20/20
16/16 [==============================] - 1s 34ms/step - loss: 0.1035 - accuracy: 0.9607 - val_loss: 1.0530 - val_accuracy: 0.8020

7. Plotting the training and validation loss


import matplotlib.pyplot as plt

loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(loss) + 1)
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
