0% found this document useful (0 votes)

36 views16 pages

uNIT 1

tttt

Uploaded by

Senthil Pandi S

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

36 views16 pages

uNIT 1

tttt

Uploaded by

Senthil Pandi S

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 16

UNIT-I INTRODUCTION TO DEEP LEARNING

Introduction: History, AI vs ML vs DL, Deep Learning: Hardware, Data and Algorithms -

Building blocks of neural networks: Data Representation - Gradient-based Optimization -
Stochastic Gradient Descent - Backpropagation – Anatomy of a Neural Network: Layers,
Models, Loss Functions and Optimizers.

DEEP LEARNING

 Deep learning is an aspect of artificial intelligence (AI) that is concerned with emulating
the learning approach that human beings use to gain certain types of knowledge. At its
simplest, deep learning can be thought of as a way to automate predictive analytics.
 Deep learning algorithms seek to exploit the unknown structure in the input distribution
in order to discover good representations, often at multiple levels, with higher-level
learned features defined in terms of lower-level features

How deep learning works

Computer programs that use deep learning go through much the same process. Each algorithm in
the hierarchy applies a nonlinear transformation on its input and uses what it learns to create a
statistical model as output. Iterations continue until the output has reached an acceptable level of
accuracy. The number of processing layers through which data must pass is what inspired the
label deep.
In traditional machine learning, the learning process is supervised and the programmer has to be
very, very specific when telling the computer what types of things it should be looking for when
deciding if an image contains a dog or does not contain a dog. This is a laborious process called
feature extraction and the computer's success rate depends entirely upon the programmer's ability
to accurately define a feature set for "dog." The advantage of deep learning is that the
program builds the feature set by itself without supervision. Unsupervised learning is not
only faster, but it is usually more accurate.

INTRODUCTION TO DEEPP LEARNING

 Deep learning models are trained by using large sets of labeled data and neural network
architectures that learn features directly from the data without the need for manual feature
extraction.
 Deep learning is a specific subset of Machine Learning, which is a specific subset of
Artificial Intelligence.
 Artificial Intelligence is the broad mandate of creating machines that can think intelligently.
 Machine Learning is one way of doing that, by using algorithms to glean insights from data.
 Deep Learning is one way of doing that, using a specific algorithm called a Neural Network
 Neural networks are inspired by the structure of the cerebral cortex. At the basic level is the
perceptron, the mathematical representation of a biological neuron.
 Like in the cerebral cortex, there can be several layers of interconnected perceptrons.
 Input values, or in other words our underlying data, get passed through this “network” of
hidden layers until they eventually converge to the output layer.
 The output layer is our prediction: it might be one node if the model just outputs a number, or
a few nodes if it’s a multiclass classification problem.
 The hidden layers of a Neural Net perform modifications on the data to eventually feel out
what its relationship with the target variable is.
 Each node has a weight, and it multiplies its input value by that weight. Do that over a few
different layers, and the Net is able to essentially manipulate the data into something
meaningful.
 To figure out what these small weights should be, we typically use an algorithm called
Backpropagation.
 The great reveal about Neural Nets is that they aren’t all that smart – they’re basically just
feeling around, through trial and error, to try and find the relationships in your data.
 The hiker doesn’t actually know where she’s going – she just feels around to find a path that
might take her down the mountain. Our algorithm is the same – it’s feeling around to figure
out how to make the most accurate predictions. The final values that each our our nodes in a
Neural Net takes on is a reflection of that process.
 In the 1980s, most neural networks were a single layer due to the cost of computation and
availability of data.
 Nowadays we can afford to have more hidden layers in our Neural Nets, hence the moniker
“Deep” Learning.
 The different types of Neural Networks available for use have also proliferated. Models like
Convolutional Neural Nets, Recurrent Neural Nets, and Long Short-Term Memory are
finding compelling use cases across the board.
Why is Deep Learning Important?

 Deep Learning is important for one reason, and one reason only: we’ve been able to achieve
meaningful, useful accuracy on tasks that matter.
 Machine Learning has been used for classification on images and text for decades, but it
struggled to cross the threshold – there’s a baseline accuracy that algorithms need to have to
work in business settings.
 Deep Learning is finally enabling us to cross that line in places we weren’t able to before.

Applications

 Computer vision is a great example of a task that Deep Learning has transformed into
something realistic for business applications. Using Deep Learning to classify and label
images isn’t only better than any other traditional algorithms: it’s starting to be better
than actual humans.
 Speech recognition is a another area that has felt Deep Learning’s impact. Spoken
languages are so vast and ambiguous. Baidu – one of the leading search engines of China
– has developed a voice recognition system that is faster and more accurate than
humansat producing text on a mobile phone; in both English and Mandarin.

Deep Learning is important because it finally makes these tasks accessible – it brings previously
irrelevant workloads into the purview of Machine Learning.
Deep Learning Vs Machine Learning

 Deep learning is a specialized form of machine learning.

 A machine learning workflow starts with relevant features being manually extracted from
images.
 The features are then used to create a model that categorizes the objects in the image.
With a deep learning workflow, relevant features are automatically extracted from
images.
 In addition, deep learning performs “end-to-end learning” – where a network is given raw
data and a task to perform, such as classification, and it learns how to do this
automatically.
 Another key difference is deep learning algorithms scale with data, whereas shallow
learning converges. Shallow learning refers to machine learning methods that plateau at a
certain level of performance when you add more examples and training data to the
network.
 A key advantage of deep learning networks is that they often continue to improve as the
size of your data increases.In machine learning, you manually choose features and a
classifier to sort images.
 With deep learning, feature extraction and modeling steps are automatic.

Choosing Between Machine Learning and Deep Learning

 Machine learning offers a variety of techniques and models you can choose based on
your application, the size of data you're processing, and the type of problem you want to
solve.
 A successful deep learning application requires a very large amount of data (thousands of
images) to train the model, as well as GPUs, or graphics processing units, to rapidly
process your data.
 When choosing between machine learning and deep learning, consider whether you have
a high-performance GPU and lots of labeled data.
 Deep learning is generally more complex, so you’ll need at least a few thousand images
to get reliable results. Having a high-performance GPU means the model will take less
time to analyze all those images.
How to Create and Train Deep Learning Models

The three most common ways people use deep learning to perform object classification are:
Training from Scratch
 To train a deep network from scratch, you gather a very large labeled data set and design
a network architecture that will learn the features and model.
 This is good for new applications, or applications that will have a large number of output
categories.
 This is a less common approach because with the large amount of data and rate of
learning, these networks typically take days or weeks to train.
Transfer Learning
 Most deep learning applications use the transfer learning approach, a process that
involves fine-tuning a pretrained model.
 You start with an existing network, such as AlexNet or GoogLeNet, and feed in new data
containing previously unknown classes.
 After making some tweaks to the network, you can now perform a new task, such as
categorizing only dogs or cats instead of 1000 different objects. This also has the
advantage of needing much less data (processing thousands of images, rather than
millions), so computation time drops to minutes or hours.
 Transfer learning requires an interface to the internals of the pre-existing network, so it
can be surgically modified and enhanced for the new task.
Feature Extraction
 A slightly less common, more specialized approach to deep learning is to use the network
as a feature extractor.
 Since all the layers are tasked with learning certain features from images, we can pull
these features out of the network at any time during the training process.
 These features can then be used as input to a machine learning model such as support
vector machines (SVM).

HISTORY OF AI
1943 – The first mathematical model of a neural network: Walter Pitts and Warren
McCulloch

 Walter Pitts, a logician, and Warren McCulloch, a neuroscientist, gave us that piece of
the puzzle in 1943 when they created the first mathematical model of a neural network.
 Published in their seminal work “A Logical Calculus of Ideas Immanent in Nervous
Activity”, they proposed a combination of mathematics and algorithms that aimed to
mimic human thought processes.
 Their model – typically called McCulloch-Pitts neurons – is still the standard today
(although it has evolved over the years).

1950 – The prediction of machine learning: Alan Turing

 Turing, a British mathematician, is perhaps most well-known for his involvement in

code-breaking during World War II.
 But his contributions to mathematics and science don’t stop there. In 1947, he predicted
the development of machine learning, even going so far as to describe the impact it could
have on jobs.
 In 1950, Turing proposed just such a machine, even hinting at genetic algorithms, in his
paper “Computing Machinery and Intelligence.”
 In it, he crafted what has been dubbed The Turing Test – although he himself called it
The Imitation Game – to determine whether a computer can “think.”
 At its simplest, the test requires a machine to carry on a conversation via text with a
human being.
 If after five minutes the human is convinced that they’re talking to another human, the
machine is said to have passed.
 It would take 60 years for any machine to do so, although many still debate the validity of
the results.
1952 – First machine learning programs: Arthur Samuel

 Upon joining the Poughkeepsie Laboratory at IBM, Arthur Samuel would go on to create
the first computer learning programs. The programs were built to play the game of
checkers.
 Arthur Samuel’s program was unique in that each time checkers was played, the
computer would always get better, correcting its mistakes and finding better ways to win
from that data. This automatic learning would be one of the first examples of machine
learning.

1957 – Setting the foundation for deep neural networks: Frank Rosenblatt

 Rosenblatt, a psychologist, submitted a paper entitled “The Perceptron: A Perceiving and

Recognizing Automaton” to Cornell Aeronautical Laboratory in 1957.
 He declared he would “construct an electronic or electromechanical system which would
learn to recognize similarities or identities between patterns of optical, electrical, or tonal
information, in a manner which may be closely analogous to the perceptual processes of a
biological brain.” Whew.
 His idea was more hardware than software or algorithm, but it did plant the seeds of
bottom-up learning, and is widely recognized as the foundation of deep neural networks
(DNN).

1959 – Discovery of simple cells and complex cells: David H. Hubel and Torsten Wiesel

 In 1959, neurophysiologists and Nobel Laureates David H. Hubel and Torsten Wiesel
discovered two types of cells in the primary visual cortex: simple cells and complex cells.
 Many artificial neural networks (ANNs) are inspired by these biological observations in
one way or another. While not a milestone for deep learning specifically, it was definitely
one that heavily influenced the field.

1960 – Control theory: Henry J. Kelley

 Kelley was a professor of aerospace and ocean engineering at the Virginia Polytechnic
Institute.
 In 1960, he published “Gradient Theory of Optimal Flight Paths,” itself a major and
widely recognized paper in his field.
 Many of his ideas about control theory – the behavior of systems with inputs, and how
that behavior is modified by feedback – have been applied directly to AI and ANNs over
the years.
 They were used to develop the basics of a continuous backpropagation model (aka the
backward propagation of errors) used in training neural networks.

1965 – The first working deep learning networks: Alexey Ivakhnenko and V.G. Lapa

 Mathematician Ivakhnenko and associates including Lapa arguably created the first
working deep learning networks in 1965, applying what had been only theories and ideas
up to that point.
 Ivakhnenko developed the Group Method of Data Handling (GMDH) – defined as a
“family of inductive algorithms for computer-based mathematical modeling of multi-
parametric datasets that features fully automatic structural and parametric optimization of
models” – and applied it to neural networks.
 For that reason alone, many consider Ivakhnenko the father of modern deep learning.
 His learning algorithms used deep feedforward multilayer perceptrons using statistical
methods at each layer to find the best features and forward them through the system.
 Using GMDH, Ivakhnenko was able to create an 8-layer deep network in 1971, and he
successfully demonstrated the learning process in a computer identification system called
Alpha.

1979-80 – An ANN learns how to recognize visual patterns: Kunihiko Fukushima

 A recognized innovator in neural networks, Fukushima is perhaps best known for the
creation of Neocognitron, an artificial neural network that learned how to recognize
visual patterns.
 It has been used for handwritten character and other pattern recognition tasks,
recommender systems, and even natural language processing.
 His work – which was heavily influenced by Hubel and Wiesel – led to the development
of the first convolutional neural networks, which are based on the visual cortex
organization found in animals.
 They are variations of multilayer perceptrons designed to use minimal amounts of
preprocessing.

1982 – The creation of the Hopfield Networks:John Hopfield

 In 1982, Hopfield created and popularized the system that now bears his name.
 Hopfield Networks are a recurrent neural network that serve as a content-addressable
memory system, and they remain a popular implementation tool for deep learning in the
21st century.

1985 – A program learns to pronounce English words :TerrySejnowski

 Computational neuroscientist Terry Sejnowski used his understanding of the learning

process to create NETtalk in 1985.
 The program learned how to pronounce English words in much the same way a child
does, and was able to improve over time while converting text to speech.
1986 – Improvements in shape recognition and word prediction: David Rumelhart,
Geoffrey Hinton, and Ronald J. Williams

 In a 1986 paper entitled “Learning Representations by Back-propagating Errors,”

Rumelhart, Hinton, and Williams described in greater detail the process of
backpropagation.
 They showed how it could vastly improve the existing neural networks for many tasks
such as shape recognition, word prediction, and more.
 Despite some setbacks after that initial success, Hinton kept at his research during the
Second AI Winter to reach new levels of success and acclaim. He is considered by many
in the field to be the godfather of deep learning.

1989 – Machines read handwritten digits :YannLeCun

 LeCun – another rock star in the AI and DL universe – combined convolutional neural
networks (which he was instrumental in developing) with recent backpropagation
theories to read handwritten digits in 1989.
 His system was eventually used to read handwritten checks and zip codes by NCR and
other companies, processing anywhere from 10-20% of cashed checks in the United
States in the late 90s and early 2000s.

1989 – Q-learning: Christopher Watkins

 Watkins published his PhD thesis – “Learning from Delayed Rewards” – in 1989. In it,
he introduced the concept of Q-learning, which greatly improves the practicality and
feasibility of reinforcement learning in machines.
 This new algorithm suggested it was possible to learn optimal control directly without
modelling the transition probabilities or expected rewards of the Markov Decision
Process.

1993 – A ‘very deep learning’ task is solved: Jürgen Schmidhuber

 German computer scientist Schmidhuber solved a “very deep learning” task in 1993 that
required more than 1,000 layers in the recurrent neural network.
 It was a huge leap forward in the complexity and ability of neural networks.

1995 – Support vector machines: Corinna Cortes and Vladimir Vapnik

 Support vector machines – or SVMs – have been around since the 1960s, tweaked and
refined by many over the decades.
 The current standard model was designed by Cortes and Vapnik in 1993 and presented in
1995.
 A SVM is basically a system for recognizing and mapping similar data, and can be used
in text categorization, handwritten character recognition, and image classification as it
relates to machine learning and deep learning.

1997 – Long short-term memory was proposed: Jürgen Schmidhuber and SeppHochreiter

 A recurrent neural network framework, long short-term memory (LSTM) was proposed
by Schmidhuber and Hochreiter in 1997.
 They improve both the efficiency and practicality of recurrent neural networks by
eliminating the long-term dependency problem (when necessary information is located
too far “back” in the RNN and gets “lost”). LSTM networks can “remember” that
information for a longer period of time.
 Refined over time, LSTM networks are widely used in DL circles, and Google recently
implemented it into its speech-recognition software for Android-powered smartphones.

1998 – Gradient-based learning: YannLeCun

 LeCun was instrumental in yet another advancement in the field of deep learning when he
published his “Gradient-Based Learning Applied to Document Recognition” paper in
1998.
 The Stochastic gradient descent algorithm (aka gradient-based learning) combined with
the backpropagation algorithm is the preferred and increasingly successful approach to
deep learning.

2009 – Launch of ImageNet: Fei-Fei Li

 A professor and head of the Artificial Intelligence Lab at Stanford University, Fei-Fei Li
launched ImageNet in 2009.
 As of 2017, it’s a very large and free database of more than 14 million (14,197,122 at last
count) labeled images available to researchers, educators, and students.
 Labeled data – such as these images – are needed to “train” neural nets in supervised
learning.
 Images are labeled and organized according to Wordnet, a lexical database of English
words – nouns, verbs, adverbs, and adjectives – sorted by groups of synonyms called
synsets.

2011 – Creation of AlexNet: Alex Krizhevsky

 Between 2011 and 2012, Alex Krizhevsky won several international machine and deep
learning competitions with his creation AlexNet, a convolutional neural network.
 AlexNet built off and improved upon LeNet5 (built by YannLeCun years earlier). It
initially contained only eight layers – five convolutional followed by three fully
connected layers – and strengthened the speed and dropout using rectified linear units.
 Its success kicked off a convolutional neural network renaissance in the deep learning
community.

2012 – The Cat Experiment

 It may sound cute and insignificant, but the so-called “Cat Experiment” was a major step
forward.Using a neural network spread over thousands of computers, the team presented
10,000,000 unlabeled images – randomly taken from YouTube – to the system and
allowed it to run analyses on the data.
 when this unsupervised learning session was complete, the program had taught itself to
identify and recognize cats, performing nearly 70% better than previous attempts at
unsupervised learning.
 It wasn’t perfect, though. The network recognized only about 15% of the presented
objects. That said, it was yet another baby step towards genuine AI.

2014 – DeepFace

 Monster platforms are often the first thinking outside the box, and none is bigger than
Facebook.
 Developed and released to the world in 2014, the social media behemoth’s deep learning
system – nicknamed DeepFace – uses neural networks to identify faces with 97.35%
accuracy. That’s an improvement of 27% over previous efforts, and a figure that rivals
that of humans (which is reported to be 97.5%).

2014 – Generative Adversarial Networks (GAN)

 Introduced in 2014 by a team of researchers lead by Ian Goodfellow, an authority no less

than YannLeCun himself had this to say about GANs:They’re kind of a big deal.
 generative adversarial networks enable models to tackle unsupervised learning, which is
more or less the end goal in the artificial intelligence community.
 Essentially, a GAN uses two competing networks: the first takes in data and attempts to
create indistinguishable samples, while the second receives both the data and created
samples, and must determine if each data point is genuine or generated.
 Learning simultaneously, the networks compete against one another and push each other
to get “smarter” faster.

2016 – Powerful machine learning products

 Cray Inc., as well as many other businesses like it, are now able to offer powerful
machine and deep learning products and solutions.
 Using Microsoft’s neural-network software on its XC50 supercomputers with 1,000
Nvidia Tesla P100 graphic processing units, they can perform deep learning tasks on data
in a fraction of the time they used to take – hours instead of days.
1960s: Shallow neural networks
1960-70s: Backpropagation emerges
1974-80: First AI Winter
1980s: Convolution emerges
1987-93: Second AI Winter
1990s: Unsupervised deep learning
1990s-2000s: Supervised deep learning back en vogue
2006s-present: Modern deep learning

DEEP LEARNING PLATFORMS

The machine learning paradigm is continuously evolving. Deep learning is what makes solving
complex problems possible.
1. TensorFlow: top-deep-learning-framework

 TensorFlow is arguably one of the best deep learning frameworks and has been adopted
by several giants such as Airbus, Twitter, IBM, and others mainly due to its highly
flexible system architecture.
 The most well-known use case of TensorFlow has got to be Google Translate coupled
with capabilities such as natural language processing, text classification/summarization,
speech/image/handwriting recognition, forecasting, and tagging.
 TensorFlow is available on both desktop and mobile and also supports languages such as
Python, C++, and R to create deep learning models along with wrapper libraries.
 TensorFlow comes with two tools that are widely used:
1. TensorBoard for the effective data visualization of network modeling and
performance.
2. TensorFlow Serving for the rapid deployment of new algorithms/experiments while
retaining the same server architecture and APIs. It also provides integration with other
TensorFlow models, which is different from conventional practices and can be extended
to serve other model and data types.

2. Caffe: caffe-top-deep-learning-framework

 Caffe is a deep learning framework that is supported with interfaces like C, C++, Python,
and MATLAB as well as the command line interface.
 It is well known for its speed and transposability and its applicability in modeling
convolution neural networks (CNN).
 The biggest benefit of using Caffe’s C++ library (comes with a Python interface) is the
ability to access available networks from the deep net repository Caffe Model Zoo that
are pre-trained and can be used immediately.
 When it comes to modeling CNNs or solving image processing issues, this should be
your go-to library.
 Caffe’s biggest USP is speed. It can process over 60 million images on a daily basis with
a single Nvidia K40 GPU. That’s 1 ms/image for inference and 4 ms/image for learning
— and more recent library versions are faster still.
 Caffe is a popular deep learning network for visual recognition. However, Caffe does not
support fine-granular network layers like those found in TensorFlow or CNTK.
 Given the architecture, the overall support for recurrent networks, and language modeling
it's quite poor, and establishing complex layer types has to be done in a low-level
language.

3. Microsoft Cognitive Toolkit/CNTK: microsoft-top-deep-learning-framework

 Popularly known for easy training and the combination of popular model types across
servers, the Microsoft Cognitive Toolkit (previously known as CNTK) is an open-source
deep learning framework to train deep learning models.
 It performs efficient convolution neural networks and training for image, speech, and
text-based data. Similar to Caffe, it is supported by interfaces such as Python, C++, and
the command line interface.
 Given its coherent use of resources, the implementation of reinforcement learning models
or generative adversarial networks (GANs) can be done easily using the toolkit. It is
known to provide higher performance and scalability as compared to toolkits like Theano
or TensorFlow while operating on multiple machines.
 Compared to Caffe, when it comes to inventing new complex layer types, users don’t
need to implement them in a low-level language due to the fine granularity of the
building blocks.
 The Microsoft Cognitive Toolkit supports both RNN and CNN types of neural models
and thus is capable of handling images, handwriting, and speech recognition problems.
Currently, due to the lack of support on ARM architecture, its capabilities on mobile are
fairly limited.

4. Torch/PyTorch: pytorch-top-deep-learning-framework

 Torch is a scientific computing framework that offers wide support for machine learning
algorithms.
 It is a Lua-based deep learning framework and is used widely amongst industry giants
such as Facebook, Twitter, and Google. It employs CUDA along with C/C++ libraries for
processing and was basically made to scale the production of building models and
provide overall flexibility.
 As of late, PyTorch has seen a high level of adoption within the deep learning framework
community and is considered to be a competitor to TensorFlow.
 PyTorch is basically a port to the Torch deep learning framework used for constructing
deep neural networks and executing tensor computations that are high in terms of
complexity.
 As opposed to Torch, PyTorch runs on Python, which means that anyone with a basic
understanding of Python can get started on building their own deep learning models.
 Given PyTorch framework’s architectural style, the entire deep modeling process is far
simpler as well as transparent compared to Torch.

5. MXNet: mxnet-top-deep-learning-framework

 Designed specifically for the purpose of high efficiency, productivity, and flexibility,
MXNet (pronounced as mix-net) is a deep learning framework supported by Python, R,
C++, and Julia.
 The beauty of MXNet is that it gives the user the ability to code in a variety of
programming languages. This means that you can train your deep learning models with
whichever language you are comfortable in without having to learn something new from
scratch.
 With the backend written in C++ and CUDA, MXNet is able to scale and work with a
myriad of GPUs, which makes it indispensable to enterprises. Case in point: Amazon
employed MXNet as its reference library for deep learning.
 MXNet supports long short-term memory (LTSM) networks along with both RNNs and
CNNs.
 This deep learning framework is known for its capabilities in imaging,
handwriting/speech recognition, forecasting, and NLP.

6. Chainer :chainer-top-deep-learning-framework

 Highly powerful, dynamic and intuitive, Chainer is a Python-based deep learning

framework for neural networks that is designed by the run strategy.
 Compared to other frameworks that use the same strategy, you can modify the networks
during runtime, allowing you to execute arbitrary control flow statements.
 Chainer supports both CUDA computation along with multi-GPU. This deep learning
framework is utilized mainly for sentiment analysis, machine translation, speech
recognition, etc. using RNNs and CNNs.

7. Keras: keras-top-deep-learning-framework

 Well known for being minimalistic, the Keras neural network library (with a supporting
interface of Python) supports both convolutional and recurrent networks that are capable
of running on either TensorFlow or Theano.
 The library is written in Python and was developed keeping quick experimentation as its
USP.
 Due to the fact that the TensorFlow interface is a tad bit challenging coupled with the fact
that it is a low-level library that can be intricate for new users, Keras was built to provide
a simplistic interface for the purpose of quick prototyping by constructing effective
neural networks that can work with TensorFlow.
 Lightweight, easy to use, and really straightforward when it comes to building a deep
learning model by stacking multiple layers: that is Keras in a nutshell. These are the very
reasons why Keras is a part of TensorFlow’s core API.
 The primary usage of Keras is in classification, text generation and summarization,
tagging, and translation, along with speech recognition and more. If you happen to be a
developer with some experience in Python and wish to dive into deep learning, Keras is
something you should definitely check out.

8. Deeplearning4j: dl4j-top-deep-learning-framework

 Parallel training through iterative reduce, microservice architecture adaptation, and

distributed CPUs and GPUs are some of the salient features of the Deeplearning4j deep
learning framework. It is developed in Java as well as Scala and supports other JVM
languages, too.
 Widely adopted as a commercial, industry-focused distributed deep learning platform, the
biggest advantage of this deep learning framework is that you can bring together the
entire Java ecosystem to execute deep learning.
 It can also be administered on top of Hadoop and Spark to orchestrate multiple host
threads. DL4J uses MapReduce to train the network while depending on other libraries to
execute large matrix operations.
 Deeplearning4j comes with a deep network support through RBM, DBN, convolution
neural networks (CNNs), recurrent neural networks (RNNs), recursive neural tensor
networks (RNTNs), and long short-term memory (LTSM).
 Since this deep learning framework is implemented in Java, it is much more efficient
compared to Python.
 When it comes to image recognition tasks using multiple GPUs, it is as fast as Caffe. This
framework shows matchless potential for image recognition, fraud detection, text mining,
parts-of-speech tagging, and natural language processing.
 With Java as your core programming language, you should certainly opt for this deep
learning framework if you’re looking for a robust and effective method of deploying your
deep learning models to production.

Gradient-Based Optimization

Gradient-based optimization is a class of algorithms used to find the minimum or maximum of a function
by iteratively adjusting parameters based on the gradient. Its Used extensively in machine learning, deep
learning, and various fields with optimization problems.
Objective: Minimize (or maximize) an objective function.

1. Objective Function - Function that measures performance, often a loss or cost function in machine
learning.

2. Gradient - Vector pointing in the steepest increase direction of a function. Negative gradient guides
parameter updates.

3. Learning Rate- Hyperparameter controlling step size in each iteration. & Importance of choosing an
appropriate value.

Stochastic Gradient Descent (SGD)

Random mini-batches for gradient estimation. Efficient for large datasets.

Deep Learning
100% (3)
Deep Learning
207 pages
Modul English PSPK
No ratings yet
Modul English PSPK
139 pages
(Machine Learning - Foundations, Methodologies, and Applications) Fengxiang He, Dacheng Tao - Foundations of Deep Learning-Springer (2025)
No ratings yet
(Machine Learning - Foundations, Methodologies, and Applications) Fengxiang He, Dacheng Tao - Foundations of Deep Learning-Springer (2025)
298 pages
PowerPoint Presentation-1
No ratings yet
PowerPoint Presentation-1
39 pages
Liebherr LTM 1150-6.1 Product Advantages
100% (2)
Liebherr LTM 1150-6.1 Product Advantages
18 pages
Deep Learning PIAIC
100% (1)
Deep Learning PIAIC
229 pages
21UGYS01 - Mapping Techniques
No ratings yet
21UGYS01 - Mapping Techniques
109 pages
Deep Learning Notes
100% (1)
Deep Learning Notes
71 pages
DL Unit 2
No ratings yet
DL Unit 2
29 pages
Deep Learning Module-01 Search Creators
No ratings yet
Deep Learning Module-01 Search Creators
17 pages
Deep Learning Note 21cs743
No ratings yet
Deep Learning Note 21cs743
96 pages
Deep Learning Unit1
No ratings yet
Deep Learning Unit1
63 pages
Deep Learning
100% (1)
Deep Learning
21 pages
Deep Learning University
No ratings yet
Deep Learning University
129 pages
R21 - A7709 - Deep Learning: Dr. Bhawani Sankar Panigrahi
No ratings yet
R21 - A7709 - Deep Learning: Dr. Bhawani Sankar Panigrahi
92 pages
Nvidia Fundamentals of Deep Learning PPT 1
No ratings yet
Nvidia Fundamentals of Deep Learning PPT 1
40 pages
UNIT I Part 1 Notes
No ratings yet
UNIT I Part 1 Notes
28 pages
Efficient Deep Learning (First Early Release) (Gaurav Menghani Naresh Singh) (Z-Library)
No ratings yet
Efficient Deep Learning (First Early Release) (Gaurav Menghani Naresh Singh) (Z-Library)
69 pages
Deep Learning Project
No ratings yet
Deep Learning Project
24 pages
PytorchNVIDIA 1
No ratings yet
PytorchNVIDIA 1
39 pages
Chapter1. Introduction To Deep Learning
No ratings yet
Chapter1. Introduction To Deep Learning
21 pages
Deep Learning - PPT - Updated
No ratings yet
Deep Learning - PPT - Updated
15 pages
Deep Learning
No ratings yet
Deep Learning
61 pages
Deep Learning: A Visual Introduction
No ratings yet
Deep Learning: A Visual Introduction
53 pages
Artificial Intelligence Vs Machine Learning Vs Deep Learning
No ratings yet
Artificial Intelligence Vs Machine Learning Vs Deep Learning
38 pages
B Quiz (Mains)
No ratings yet
B Quiz (Mains)
41 pages
Box Pushing Paper - 1
No ratings yet
Box Pushing Paper - 1
5 pages
Deep Learning Fundamentals
No ratings yet
Deep Learning Fundamentals
19 pages
Deep Learning File
No ratings yet
Deep Learning File
58 pages
Artificial Intelligence and Deep Learning
0% (1)
Artificial Intelligence and Deep Learning
9 pages
Deep Learning Presentation
No ratings yet
Deep Learning Presentation
10 pages
Deep Learning Unit-II
No ratings yet
Deep Learning Unit-II
19 pages
What's The Difference Between AI, Machine Learning
No ratings yet
What's The Difference Between AI, Machine Learning
21 pages
Elliott Wave Theroist
100% (1)
Elliott Wave Theroist
5 pages
Hao 2016
No ratings yet
Hao 2016
23 pages
Deep Learning
No ratings yet
Deep Learning
7 pages
Day-4 Deep Learning and Machine Learning
No ratings yet
Day-4 Deep Learning and Machine Learning
11 pages
Ai, ML, DL PDF
No ratings yet
Ai, ML, DL PDF
2 pages
Neural Networks and Deep Learning - Deep Learning Explained To Your Granny - A Visual Introduction For Beginners Who Want To Make Their Own Deep Learning Neural Network (Machine Learning)
100% (5)
Neural Networks and Deep Learning - Deep Learning Explained To Your Granny - A Visual Introduction For Beginners Who Want To Make Their Own Deep Learning Neural Network (Machine Learning)
84 pages
LIBRARY MANAGEMENT SYSTEM - Final
100% (1)
LIBRARY MANAGEMENT SYSTEM - Final
25 pages
Reading+10+ +Introduction+to+Deep+Learning
No ratings yet
Reading+10+ +Introduction+to+Deep+Learning
21 pages
Deep Learning
No ratings yet
Deep Learning
13 pages
Playbook Executive Briefing Deep Learning
No ratings yet
Playbook Executive Briefing Deep Learning
20 pages
Deep Learning Introduction
No ratings yet
Deep Learning Introduction
14 pages
Unit 3
No ratings yet
Unit 3
21 pages
Deep Learning
No ratings yet
Deep Learning
5 pages
What Is Deep Learning - How It Works, Techniques & Applications - MATLAB & Simulink
No ratings yet
What Is Deep Learning - How It Works, Techniques & Applications - MATLAB & Simulink
14 pages
What Is Deep Learning - Definition, Examples, and Careers - Coursera
No ratings yet
What Is Deep Learning - Definition, Examples, and Careers - Coursera
11 pages
Krishna Rungta - TensorFlow in 1 Day Make Your Own Neural Network (2018) - Trang-2
No ratings yet
Krishna Rungta - TensorFlow in 1 Day Make Your Own Neural Network (2018) - Trang-2
12 pages
Jntuk r20 Unit-I Deep Learning Techniques (WWW - Jntumaterials.co - In)
No ratings yet
Jntuk r20 Unit-I Deep Learning Techniques (WWW - Jntumaterials.co - In)
23 pages
Deep Learning Part 1
No ratings yet
Deep Learning Part 1
16 pages
Introducton To Deep Learning: 1-SAM 2-JU ST 3-LAL IE 4-NA S
No ratings yet
Introducton To Deep Learning: 1-SAM 2-JU ST 3-LAL IE 4-NA S
37 pages
Introduction To Deep Learning: by Gargee Sanyal
No ratings yet
Introduction To Deep Learning: by Gargee Sanyal
20 pages
Unit - 1 Deep Learning 3-2
No ratings yet
Unit - 1 Deep Learning 3-2
15 pages
Difference Between Machine Learning and Deep Learning
No ratings yet
Difference Between Machine Learning and Deep Learning
5 pages
Deep Learning
No ratings yet
Deep Learning
6 pages
Report of Ann Cat3-Dev
No ratings yet
Report of Ann Cat3-Dev
8 pages
What Is AI & Machine
No ratings yet
What Is AI & Machine
8 pages
What Is Deep Learning
No ratings yet
What Is Deep Learning
5 pages
Unit Plan: Paises Hispano-Hablantes
100% (1)
Unit Plan: Paises Hispano-Hablantes
30 pages
What Is Deep Learning
No ratings yet
What Is Deep Learning
5 pages
(IJCST-V9I4P17) :yew Kee Wong
No ratings yet
(IJCST-V9I4P17) :yew Kee Wong
4 pages
Matlab Deep Learning Series
No ratings yet
Matlab Deep Learning Series
6 pages
Machine Learning Vs Deep Learning
No ratings yet
Machine Learning Vs Deep Learning
2 pages
MRK - Spring 2022 - CS719 - 2 - MS210400057
No ratings yet
MRK - Spring 2022 - CS719 - 2 - MS210400057
6 pages
About Deep Learning: How Does Deep Learning Attain Such Impressive Results?
No ratings yet
About Deep Learning: How Does Deep Learning Attain Such Impressive Results?
3 pages
AI-Powered Course Recommendation System
No ratings yet
AI-Powered Course Recommendation System
11 pages
Bamboo Art: Terracotta
No ratings yet
Bamboo Art: Terracotta
2 pages
Annexure-4 CertificatefromUniversity
No ratings yet
Annexure-4 CertificatefromUniversity
1 page
Introduction To TikTok Shop Affiliate Program
No ratings yet
Introduction To TikTok Shop Affiliate Program
10 pages
TEST INITIAL LA LIMBA ENGLEZA Cls 11 E Lic.4
No ratings yet
TEST INITIAL LA LIMBA ENGLEZA Cls 11 E Lic.4
1 page
Cardiosync Corporate Business Plan
No ratings yet
Cardiosync Corporate Business Plan
7 pages
1 Datasheet Solis-3P10K-4G
No ratings yet
1 Datasheet Solis-3P10K-4G
2 pages
Lesson 3 - Week 1
No ratings yet
Lesson 3 - Week 1
28 pages
The Trade - Offs of ChatGPT To Filipino Freelance Content Writers A Diffusion of Innovation Theory Perspective
No ratings yet
The Trade - Offs of ChatGPT To Filipino Freelance Content Writers A Diffusion of Innovation Theory Perspective
7 pages
7º Basico B (7 Grade) : I.-Listening Test (Questions 1 To 10) Listen and Answer
No ratings yet
7º Basico B (7 Grade) : I.-Listening Test (Questions 1 To 10) Listen and Answer
3 pages
Position Paper
No ratings yet
Position Paper
2 pages
Implementation and Analysis of Smart Lamp Using An
No ratings yet
Implementation and Analysis of Smart Lamp Using An
4 pages
2012apr TDM Fortier
No ratings yet
2012apr TDM Fortier
19 pages
Cidam Layout
No ratings yet
Cidam Layout
40 pages
A Comprehensive Overview of Chinas Belt and Road Initiative and Its Implication For The Region and Beyond
No ratings yet
A Comprehensive Overview of Chinas Belt and Road Initiative and Its Implication For The Region and Beyond
12 pages
ESD Assignment
No ratings yet
ESD Assignment
14 pages
Lesson Plan #5-Final Demo
No ratings yet
Lesson Plan #5-Final Demo
5 pages
Modeling Class X AI
No ratings yet
Modeling Class X AI
24 pages
Timber Formwork Design
No ratings yet
Timber Formwork Design
12 pages
TOMEI Camshaft Spec Card TOMEI Camshaft Spec Card TOMEI Camshaft Spec Card
No ratings yet
TOMEI Camshaft Spec Card TOMEI Camshaft Spec Card TOMEI Camshaft Spec Card
1 page
Objective Function Decisions Demand Supply Constraints
No ratings yet
Objective Function Decisions Demand Supply Constraints
7 pages
COVID-19 Testing
No ratings yet
COVID-19 Testing
2 pages