DZone Refcard 383
Neural Network Essentials

CONTENTS
• What Are Neural Networks?
− Common Neural Architectures
− Neural Network Model Optimization
− Neural Network Chips
• Conclusion
− Additional Resources
DR. TUHIN CHATTOPADHYAY
FOUNDER & CEO, TUHIN AI ADVISORY
This Refcard begins with neural networks, which are considered the building blocks. Common neural architectures are discussed thereafter, defining the underlying arrangement of the neurons and the specific purposes they serve.

Then we'll discuss the different AI accelerators specifically designed for the efficient processing of DNN workloads, along with the neural network optimizers that work on the learning rate to reduce overall loss and improve accuracy. Finally, we will cover various applications of DNNs across industries and explore the power of leveraging high-performance computing (HPC) with AI.

All the independent variables of the model are a part of the input layer and play their part in predicting the desired outcome. The one-to-many interconnected hidden layers are configured based on the purpose that the neural network is going to serve, like object detection and classification through visual recognition and NLP.
Hidden layers are a function of the weighted sum of the inputs/predictors. When the network contains multiple hidden layers, each hidden unit is a function of the weighted sum of the units of the previous hidden layer.

The output layer, as a function of the hidden layers, contains the target (dependent) variables. For any image classification, the output layer segregates the input data into multiple nodes as per the desired objective of the model.

Table 1

TYPE | DESCRIPTION | FORMULA
Rectified Linear Activation (ReLU) | A linear function that will output the input directly if it is positive, but if the input is negative, the output is 0 | max(0, x)
Logistic (Sigmoid) | An "S" curve that generates an output between 0 and 1 and is expressed as probability | 1 / (1 + e^(-x))
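To make the layer arithmetic above concrete, here is a minimal NumPy sketch of a single forward pass: a hidden layer built as a ReLU over a weighted sum of the inputs, and a sigmoid output layer, as in Table 1. The layer sizes and random weights are illustrative assumptions, not values from this Refcard.

import numpy as np

def relu(x):
    # max(0, x): outputs the input directly if positive, else 0 (Table 1)
    return np.maximum(0, x)

def sigmoid(x):
    # "S" curve mapping any real number into (0, 1), read as a probability
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = rng.normal(size=4)                         # 4 independent variables (input layer)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)  # input -> hidden weights
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)  # hidden -> output weights

h = relu(W1 @ x + b1)      # hidden units: weighted sum of inputs through ReLU
y = sigmoid(W2 @ h + b2)   # output layer: weighted sum of hidden units as a probability
print(y)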
COMMON NEURAL ARCHITECTURES
Each architecture demands its own level of computational and processing power. The key architectures are discussed below.

RADIAL BASIS FUNCTION
The radial basis function (RBF) network has a single non-linear hidden layer called a "feature vector," where the number of neurons in the hidden layer should be more than the number of neurons in the input layer to cast the data into a higher dimensional space. Thus, RBF increases the dimension of the feature vector to make the classification highly separable in high-dimensional space. The figure below illustrates how the inputs (x) are transformed to output (y) through a single hidden layer (i.e., the feature vector), which connects to x and y through the weights.
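A minimal sketch of an RBF forward pass, assuming Gaussian basis functions (the usual choice); the centers, width gamma, and read-out weights below are made-up placeholders, not values from this Refcard.

import numpy as np

def rbf_features(x, centers, gamma=1.0):
    # One Gaussian unit per center: phi_j(x) = exp(-gamma * ||x - c_j||^2)
    sq_dists = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
x = rng.normal(size=2)              # 2 input neurons
centers = rng.normal(size=(10, 2))  # 10 hidden neurons > 2 inputs: the data is
                                    # cast into a higher-dimensional space
w = rng.normal(size=10)             # hidden -> output weights

phi = rbf_features(x, centers)      # the non-linear "feature vector" layer
y = w @ phi                         # linear read-out from the feature vector
print(y)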
RESTRICTED BOLTZMANN MACHINES
Restricted Boltzmann machines (RBMs) are unsupervised learning algorithms with two-layer neural networks comprising a visible/input layer and the hidden layer without any intra-layer connections — i.e., no two nodes in the same layer are connected, which creates the restriction. RBMs are used for recommendation engines of movies, pattern recognition (e.g., understanding handwritten text), and radar target recognition for real-time intra-pulse recognition.

Figure 5: Restricted Boltzmann machines
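For intuition, below is a compact sketch of one contrastive divergence (CD-1) update for a binary RBM, a common way such networks are trained; the layer sizes and learning rate are arbitrary assumptions, and bias terms are omitted for brevity.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 4, 0.1
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # inter-layer weights only:
                                                       # no visible-visible or
                                                       # hidden-hidden connections
v0 = rng.integers(0, 2, size=n_visible).astype(float)  # one binary input vector

# Positive phase: hidden activations driven by the data
h0_prob = sigmoid(v0 @ W)
h0 = (rng.random(n_hidden) < h0_prob).astype(float)

# Negative phase: reconstruct the visible layer, then re-sample the hidden layer
v1_prob = sigmoid(h0 @ W.T)
v1 = (rng.random(n_visible) < v1_prob).astype(float)
h1_prob = sigmoid(v1 @ W)

# CD-1 update: data correlations minus reconstruction correlations
W += lr * (np.outer(v0, h0_prob) - np.outer(v1, h1_prob))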
RECURRENT NEURAL NETWORKS
Recurrent neural networks (RNNs) consider input as time series to generate output as time series with at least one connection cycle. RNNs are universal approximators: They can approximate virtually any dynamical system. RNNs are used for time series analyses like stock predictions, sales forecasting, natural language processing and translation, chatbots, image captioning, and music synthesis.

Figure 6: Recurrent neural networks
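The recurrence is easy to see in code. This illustrative NumPy sketch of an Elman-style RNN (all sizes and weights are arbitrary assumptions) carries a hidden state through a short input series, so each output depends on the whole history:

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 2
Wx = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
Wh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden: the cycle
Wy = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output

h = np.zeros(n_hidden)               # initial hidden state
series = rng.normal(size=(7, n_in))  # an input time series of 7 steps

outputs = []
for x_t in series:
    h = np.tanh(Wx @ x_t + Wh @ h)   # new state depends on input AND prior state
    outputs.append(Wy @ h)           # so the output series carries history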
The GRU is a simplified variant of LSTM where forget and input gates are combined into a single update gate, and the cell state and hidden state are also combined. Thus, a GRU uses less memory and is therefore faster than LSTM.
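Here is a sketch of a single GRU step under the standard gate equations, showing the single update gate z that stands in for LSTM's separate forget and input gates, and the single state h with no separate cell state; all sizes and weights are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_in, n_h = 3, 4
Wz, Uz = rng.normal(size=(n_h, n_in)), rng.normal(size=(n_h, n_h))
Wr, Ur = rng.normal(size=(n_h, n_in)), rng.normal(size=(n_h, n_h))
Wc, Uc = rng.normal(size=(n_h, n_in)), rng.normal(size=(n_h, n_h))

x, h = rng.normal(size=n_in), np.zeros(n_h)

z = sigmoid(Wz @ x + Uz @ h)             # update gate: forget + input in one
r = sigmoid(Wr @ x + Ur @ h)             # reset gate
h_cand = np.tanh(Wc @ x + Uc @ (r * h))  # candidate state
h = (1 - z) * h + z * h_cand             # blend old state with candidate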
CONVOLUTIONAL NEURAL NETWORKS
Convolutional neural networks (CNNs) are widely popular for image classification. A CNN assigns weights and biases to objects in the image for classification purposes. An image comprising a matrix of pixel values is processed through the convolutional layer, pooling layer, and fully connected (FC) layer. The pooling layer reduces the spatial size of the convolved feature.
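The toy NumPy sketch below shows the two operations just named on a made-up 6x6 "image": a valid 2-D convolution with one 3x3 kernel, then 2x2 max pooling, which shrinks the spatial size of the convolved feature.

import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; each output cell is a weighted sum
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature, size=2):
    # Keep the maximum of each size x size window, reducing spatial size
    h, w = feature.shape[0] // size, feature.shape[1] // size
    return feature[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((6, 6))        # matrix of pixel values
kernel = rng.normal(size=(3, 3))  # one learned filter
fmap = conv2d(image, kernel)      # convolutional layer -> 4x4 feature map
pooled = max_pool(fmap)           # pooling layer -> 2x2 convolved feature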
GENERATIVE ADVERSARIAL NETWORKS
Generative adversarial networks (GANs) use two neural networks: a generator and a discriminator. While the generator helps in generating image, voice, and video content, the discriminator classifies them as either from the domain or generated. The two models are trained in a zero-sum game until it's proven that the generator model is producing reasonable results.
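A condensed PyTorch sketch of that zero-sum game; the tiny MLP generator and discriminator and the synthetic "real" data are placeholders of our own, not models from this Refcard.

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))  # generator
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, 2) + 3.0  # stand-in for samples "from the domain"
    fake = G(torch.randn(32, 8))     # generated samples

    # Discriminator step: classify real as 1, generated as 0
    d_loss = loss_fn(D(real), torch.ones(32, 1)) + \
             loss_fn(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator call fresh fakes real
    fake = G(torch.randn(32, 8))
    g_loss = loss_fn(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()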
NEURAL NETWORK CHIPS
Neural network chips provide the power of computing infrastructure through processing speed, storage, and networking that make the chips capable of quickly running neural network algorithms on vast amounts of data. Network chips break tasks into multiple sub-tasks, which can run on multiple cores concurrently to increase the processing speed.
TYPES OF AI ACCELERATORS
Specialized AI accelerators have been designed that vary significantly depending on the model size, supported framework, programmability, learning curve, target throughput, latency, and cost. Such hardware includes the graphical processing unit (GPU), vision processing unit (VPU), field-programmable gate array (FPGA), central processing unit (CPU), and Tensor Processing Unit (TPU).

While some accelerators like GPUs are more capable of handling computer graphics and image processing, FPGAs demand field programming using hardware description languages (HDLs) like VHDL and Verilog, and TPUs by Google are more specialized for neural network machine learning. Let's look at each of them separately below.

GPUs were originally developed for graphics processing and are now widely used for deep learning (DL). Their benefit is parallel processing across industries ranging from genome mapping to autonomous transportation.

CPUs with MIMD architecture are brilliant at task optimization and are more suitable for applications with limited parallelism, such as sparse DNNs, RNNs that have dependency on the steps, and small models with small effective batch sizes.

A TPU is Google's custom-developed application-specific integrated circuit (ASIC) that is used to accelerate DL workloads. TPUs provide high throughput for large batch sizes and are suitable for models that train for weeks, dominated by matrix computations.
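As a small framework-level illustration (PyTorch here is just one example, not an endorsement from this Refcard), the same model code can target whichever accelerator is available:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"  # GPU if present, else CPU
model = torch.nn.Linear(128, 10).to(device)              # place the weights on it
x = torch.randn(32, 128, device=device)                  # and the batch as well
y = model(x)                                             # runs on GPU or CPU
print(device, y.shape)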
AI ACCELERATORS FOR DEEP LEARNING INFERENCE
AI accelerators are required for DL inference for faster computation through parallel computational capabilities. They have high-bandwidth memory that can allocate four to five times more bandwidth between processors than traditional chips. A couple of leading AI accelerators for DL inference are AWS Inferentia, a custom-designed ASIC, and Open Visual Inference and Neural Network Optimization (OpenVINO), an open-source toolkit for optimizing and deploying AI inference.

They both boost deep learning performance for tasks like computer vision, speech recognition, NLP, and NLG. OpenVINO uses models trained in frameworks including TensorFlow, PyTorch, Caffe, and Keras, and optimizes model performance with acceleration from CPU, GPU, VPU, and iGPU.
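A minimal sketch of the OpenVINO Python workflow described above; "model.xml" is a placeholder for a model already converted to OpenVINO's intermediate representation (IR), the input shape is an assumption, and the exact module path may differ across OpenVINO releases (the openvino.runtime API shown here is from the 2022-era releases).

import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")         # load the optimized IR model
compiled = core.compile_model(model, "CPU")  # or "GPU" and other devices
output = compiled.output(0)

x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example input tensor
result = compiled([x])[output]               # run inference on the chosen device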
NEURAL NETWORK MODEL OPTIMIZATION
Deep learning model optimizations are used for various scenarios, including video analytics as well as computer vision. Since most of these computation-intensive analyses are done in real time, the following goals are key:

• Faster performance
• Reduced computational requirements
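One common optimization serving both goals is post-training quantization. The sketch below uses PyTorch's dynamic quantization as an example; the toy model and the Linear-only module set are our assumptions, not the Refcard's.

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(256, 128), torch.nn.ReLU(), torch.nn.Linear(128, 10)
)

# Convert Linear weights to int8; activations are quantized on the fly,
# trading a little accuracy for a smaller model and faster CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)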
ADDITIONAL RESOURCES