
unit 2


Unit II

Unsupervised Learning Network: Introduction, Fixed Weight Competitive Nets, Maxnet,
Hamming Network, Kohonen Self-Organizing Feature Maps, Learning Vector Quantization,
Counter Propagation Networks, Adaptive Resonance Theory Networks. Special Networks:
Introduction to various networks.

Unsupervised Learning Network

Unsupervised learning

Unsupervised learning is the training of a machine using information that is neither
classified nor labeled, allowing the algorithm to act on that information without
guidance. The task of the machine is to group unsorted information according to
similarities, patterns, and differences without any prior training on the data.

Unlike supervised learning, no teacher is provided, which means the machine receives no
labeled training examples. The machine must therefore find the hidden structure in
unlabeled data by itself. For instance, suppose it is given a collection of images
containing both dogs and cats, none of which it has seen before.

The machine has no idea about the features of dogs and cats, so it cannot label the images
as ‘dogs’ and ‘cats’. But it can categorize them according to their similarities, patterns,
and differences, i.e., it can split the collection into two parts: the first may contain all
the pictures with dogs in them and the second all the pictures with cats in them. Nothing
was learned beforehand, which means no training data or examples were used. This allows
the model to work on its own to discover patterns and information that were previously
undetected. It mainly deals with unlabelled data.

Unsupervised learning is classified into two categories of algorithms:


• Clustering: A clustering problem is where you want to discover the inherent groupings
in the data, such as grouping customers by purchasing behavior.
• Association: An association rule learning problem is where you want to discover rules
that describe large portions of your data, such as people that buy X also tend to buy Y.

Types of Unsupervised Learning: Clustering

1. Exclusive (partitioning)
2. Agglomerative
3. Overlapping
4. Probabilistic

Clustering Types:-

1. Hierarchical clustering
2. K-means clustering

Advantages of unsupervised learning:

• It does not require training data to be labeled.


• Dimensionality reduction can be easily accomplished using unsupervised learning.
• Capable of finding previously unknown patterns in data.
• Flexibility: Unsupervised learning is flexible in that it can be applied to a wide variety
of problems, including clustering, anomaly detection, and association rule mining.
• Exploration: Unsupervised learning allows for the exploration of data and the discovery
of novel and potentially useful patterns that may not be apparent from the outset.
• Low cost: Unsupervised learning is often less expensive than supervised learning
because it doesn’t require labeled data, which can be time-consuming and costly to
obtain.

Disadvantages of unsupervised learning :


• Difficult to measure accuracy or effectiveness due to lack of predefined answers during
training.
• The results often have lower accuracy.
• The user needs to spend time interpreting and labeling the classes that result from the
clustering.
• Lack of guidance: Unsupervised learning lacks the guidance and feedback provided by
labeled data, which can make it difficult to know whether the discovered patterns are
relevant or useful.
• Sensitivity to data quality: Unsupervised learning can be sensitive to data quality,
including missing values, outliers, and noisy data.
• Scalability: Unsupervised learning can be computationally expensive, particularly for
large datasets or complex algorithms, which can limit its scalability.

This learning process is independent of any teacher. During the training of an ANN under
unsupervised learning, input vectors of similar type are combined to form clusters. When a
new input pattern is applied, the neural network gives an output response indicating the
class to which the input pattern belongs. There is no feedback from the environment as to
what the desired output should be or whether it is correct or incorrect. Hence, in this type
of learning the network itself must discover the patterns and features in the input data and
the relation between the input data and the output.

Basic Concept of Competitive Network

This network is like a single-layer feed-forward network with feedback connections
between the outputs. The connections between the outputs are inhibitory (drawn as dotted
lines), which means each unit suppresses its competitors rather than supporting them.

Basic Concept of Competitive Learning Rule

As said earlier, there is competition among the output nodes. The main concept is that,
during training, the output unit that has the highest activation for a given input pattern is
declared the winner. This rule is also called Winner-takes-all because only the winning
neuron is updated and the rest of the neurons are left unchanged.
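
The winner-takes-all rule can be illustrated with a minimal sketch. The notes do not fix
the form of the update, so the step below (move the winner's weights a fraction `lr` of
the way toward the input) is an assumption, and the function name and sample values are
hypothetical.

```python
def winner_take_all_update(weights, x, lr=0.5):
    """Update only the output unit with the highest activation for input x.

    weights: list of weight vectors, one per output unit (modified in place).
    Returns the index of the winning unit.
    """
    # Activation of each output unit: dot product of its weight vector with x.
    activations = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    winner = max(range(len(weights)), key=lambda j: activations[j])
    # Assumed update: move only the winner toward the input; others stay unchanged.
    weights[winner] = [w_i + lr * (x_i - w_i) for w_i, x_i in zip(weights[winner], x)]
    return winner


weights = [[1.0, 0.0], [0.0, 1.0]]
winner = winner_take_all_update(weights, [0.9, 0.1], lr=0.5)
print(winner, weights)
```

Here the first unit has the larger activation, so only its weight vector moves toward the
input while the second unit keeps its weights.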

Fixed Weight Competitive Nets

In these competitive networks the weights remain fixed, even during the training process.
Competition among the neurons is used to enhance the contrast in their activations. Two
such networks are considered here: Maxnet and the Hamming network.

In most neural networks that use unsupervised learning, it is essential to compute
distances and perform comparisons.

Max Net

This is also a fixed weight network, which serves as a subnet for selecting the node having
the highest input. All the nodes are fully interconnected, and the weights on these
interconnections are symmetrical.

Suppose a net is trained to classify an input signal into one of the output categories A, B,
C, D, E, J, or K. Such a net sometimes responds that the signal is both a C and a K, or both
an E and a K, or both a J and a K, because of similarities in these character pairs. In such
cases it is better to include additional structure in the net to force it to make a
definitive decision. The mechanism by which this is accomplished is called competition.

The most extreme form of competition among a group of neurons is called Winner-Take-All,
where only one neuron (the winner) in the group has a nonzero output signal when the
competition is completed. An example of this is the MAXNET.

Architecture

Maxnet uses an iterative process in which each node receives inhibitory inputs from all
other nodes through its connections. The single node whose value is maximum ends up
active (the winner), and all other nodes become inactive.
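
The iterative competition can be sketched as follows. The notes do not state the update
formula, so this uses the commonly cited Maxnet iteration
a_j(new) = max(0, a_j(old) − ε · Σ_{k≠j} a_k(old)) with 0 < ε < 1/m as an assumption; the
function name, the default ε and the sample activations are illustrative.

```python
def maxnet(activations, epsilon=None, max_iters=1000):
    """Iterate mutual inhibition until at most one node stays positive."""
    a = list(activations)
    m = len(a)
    if epsilon is None:
        epsilon = 1.0 / (2 * m)  # assumed default, inside the interval (0, 1/m)
    for _ in range(max_iters):
        if sum(1 for v in a if v > 0) <= 1:
            break  # competition finished: a single winner (or none) remains
        total = sum(a)
        # Each node is inhibited by the summed activation of all the other nodes.
        a = [max(0.0, v - epsilon * (total - v)) for v in a]
    return a


final = maxnet([0.2, 0.4, 0.6, 0.8], epsilon=0.2)
winner = max(range(len(final)), key=lambda j: final[j])
print(winner, final)
```

The node that started with the largest activation is the only one left with a nonzero
value. If two nodes tie for the maximum, the loop simply stops at `max_iters`.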

Hamming networks

In a Hamming network, every given input vector is clustered into one of several groups.
Following are some important features of Hamming networks −

• Lippmann started working on Hamming networks in 1987.
• It is a single layer network.
• The inputs can be either binary {0, 1} or bipolar {-1, 1}.
• The weights of the net are calculated from the exemplar vectors.
• It is a fixed weight network, which means the weights remain the same even during
training.

Hamming Distance

For two bipolar vectors x and y of dimension n, the dot product is

$$x \cdot y = a - d$$

where a is the number of components in agreement in x and y (number of similar bits) and d
is the number of components that differ in x and y (number of dissimilar bits). The
Hamming distance between the two vectors is d. Since the total number of components is n,
we have

$$n = a + d, \quad \text{i.e.,} \quad d = n - a$$

On simplification, we get

$$x \cdot y = a - (n - a) = 2a - n$$

$$2a = x \cdot y + n$$

$$a = \tfrac{1}{2}\, x \cdot y + \tfrac{1}{2}\, n$$

From the above equation, it follows that the weights can be set to one-half the exemplar
vector and the bias can be set initially to n/2, so that the net input of each unit equals a,
the number of components in agreement with its exemplar.
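
As a small check of this result, the sketch below computes the net input of each unit
with weights e/2 and bias n/2. It should equal a, the number of agreeing components. The
function name and the bipolar example vectors are made up for illustration.

```python
def hamming_net_scores(exemplars, x):
    """Net input of each unit: bias n/2 plus the dot product of x with e/2."""
    n = len(x)
    scores = []
    for e in exemplars:
        weights = [e_i / 2 for e_i in e]  # weights are one-half the exemplar
        bias = n / 2                      # bias is n/2
        scores.append(bias + sum(w_i * x_i for w_i, x_i in zip(weights, x)))
    return scores


exemplars = [[1, -1, -1, -1], [-1, -1, -1, 1]]
x = [1, 1, -1, -1]
scores = hamming_net_scores(exemplars, x)
best = max(range(len(scores)), key=lambda j: scores[j])
print(scores, best)
```

The input agrees with the first exemplar in three positions and with the second in one,
so the first unit has the larger score. A Maxnet can then be used to pick that unit.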

Kohonen Self-Organizing Feature Map

A Kohonen self-organizing feature map (SOM) is a neural network trained using competitive
learning. Basic competitive learning implies that a competition process takes place before
each learning cycle: some criterion selects a winning processing element.

The self-organizing map makes topologically ordered mappings between input data and
processing elements of the map. Topologically ordered means that if two inputs have
similar characteristics, the most active processing elements responding to them are
located close to each other on the map. The weight vectors of the processing elements are
organized in ascending or descending order: $W_i < W_{i+1}$ for all values of i, or
$W_i > W_{i+1}$ for all values of i (this definition is valid for a one-dimensional
self-organizing map only).

The self-organizing map is typically represented as a two-dimensional sheet of processing
elements. Each processing element has its own weight vector, and learning in the SOM
depends on the adaptation of these vectors. The processing elements of the network are
made competitive in a self-organizing process, and a specific criterion picks the winning
processing element whose weights are updated. Generally, this criterion is to minimize the
Euclidean distance between the input vector and the weight vector. The SOM differs from
basic competitive learning in that, instead of adjusting only the weight vector of the
winning processing element, the weight vectors of neighboring processing elements are
adjusted as well.

It was introduced by the Finnish professor and researcher Dr. Teuvo Kohonen in 1982. The
self-organizing map is an unsupervised learning model proposed for applications in which
maintaining a topology between input and output spaces is important. The entire learning
process occurs without supervision because the nodes are self-organizing. SOMs are also
known as feature maps, as they essentially retain the features of the input data and group
themselves according to the similarity between one another. They have practical value for
visualizing complex or large quantities of high-dimensional data and showing the
relationships within them in a low-dimensional, usually two-dimensional, field to check
whether the given unlabeled data have any structure.

A self-organizing map (SOM) differs from typical artificial neural networks (ANNs) both in
its architecture and in its algorithmic properties. Its structure consists of a single-layer
2D grid of neurons rather than a series of layers. All the nodes on this lattice are
connected directly to the input vector, but not to each other. This means the nodes do not
know the values of their neighbors and only update the weights of their connections as a
function of the given input. The grid itself is the map, which reorganizes itself at each
iteration as a function of the input data. As such, after clustering, each node has its own
coordinate (i, j), which enables one to calculate the Euclidean distance between two nodes.

A self-organizing map uses competitive learning instead of error-correction learning to
modify its weights. This implies that only a single node is activated at each cycle in which
the features of an instance of the input vector are presented to the network, as all nodes
compete for the privilege to respond to the input.

The selected node, the Best Matching Unit (BMU), is chosen according to the similarity
between the current input values and all the nodes in the network. The node with the
smallest Euclidean distance to the input vector is selected, and it and its neighboring
nodes within a specific radius have their weights slightly adjusted toward the input
vector. By going through all the nodes on the grid, the whole grid eventually matches the
entire input dataset, with similar nodes gathered toward one area and dissimilar ones kept
apart.

Algorithm:

Step 1: Initialize each node weight w_ij to a random value.

Step 2: Choose a random input vector x_k.

Step 3: Repeat steps 4 and 5 for all nodes on the map.

Step 4: Calculate the Euclidean distance between the weight vector w_ij and the input
vector x(t), starting with the first node, where t, i, j = 0.

Step 5: Track the node that generates the smallest distance.

Step 6: Determine the overall Best Matching Unit (BMU), i.e., the node with the smallest
distance among all those calculated.

Step 7: Determine the topological neighborhood β_ij(t) of the BMU and its radius σ(t) in
the Kohonen map.

Step 8: Repeat for all nodes in the BMU neighborhood: update the weight vector w_ij of the
node by adding a fraction of the difference between the input vector x(t) and the weight
w(t) of the neuron:

$$w_{ij}(\text{new}) = w_{ij}(\text{old}) + \alpha\,[x_i - w_{ij}(\text{old})]$$

Step 9: Repeat the complete iteration until reaching the selected iteration limit t = n.

Here, step 1 represents the initialization phase, while steps 2 to 9 represent the training
phase, where:

t = current iteration.
i = row coordinate of the node grid.
j = column coordinate of the node grid.
w = weight vector.
w_ij = association weight between the nodes i, j in the grid.
x = input vector.
x(t) = the input vector instance at iteration t.
β_ij = the neighborhood function, decreasing and representing the distance of node i, j
from the BMU.
σ(t) = the radius of the neighborhood function, which determines how far neighbor nodes
are examined in the 2D grid when updating vectors.
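
The steps above can be sketched in plain Python. This is a minimal illustration, not a
reference implementation: the Gaussian form of the neighborhood function, the
exponential decay of σ(t) and of the learning rate, and all function and parameter names
are assumptions that the notes do not specify.

```python
import math
import random


def find_bmu(w, x):
    """Steps 3 to 6: grid coordinate (i, j) of the node closest to x."""
    best, best_d = None, float("inf")
    for i, row in enumerate(w):
        for j, node in enumerate(row):
            d = sum((x_k - w_k) ** 2 for x_k, w_k in zip(x, node))
            if d < best_d:
                best, best_d = (i, j), d
    return best


def train_som(data, rows, cols, n_iters=200, alpha0=0.5, sigma0=None, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood radius
    # Step 1: initialize every node weight to a random value.
    w = [[[rng.random() for _ in range(dim)] for _ in range(cols)] for _ in range(rows)]
    for t in range(n_iters):                         # Step 9: iterate up to the limit
        x = rng.choice(data)                         # Step 2: random input vector
        bi, bj = find_bmu(w, x)                      # Steps 3 to 6: best matching unit
        # Step 7: radius and learning rate shrink over time (assumed exponential decay).
        decay = math.exp(-3.0 * t / n_iters)
        sigma, alpha = sigma0 * decay, alpha0 * decay
        # Step 8: pull each node toward x, weighted by its grid distance from the BMU.
        for i in range(rows):
            for j in range(cols):
                grid_d2 = (i - bi) ** 2 + (j - bj) ** 2
                beta = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                w[i][j] = [w_k + alpha * beta * (x_k - w_k) for w_k, x_k in zip(w[i][j], x)]
    return w


data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, rows=3, cols=3)
print(find_bmu(som, [0.0, 0.0]), find_bmu(som, [1.0, 1.0]))
```

Updating every node with a Gaussian weight is one way to realize the neighborhood: nodes
far from the BMU receive a factor close to zero, which has the same effect as a radius
cut-off.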

Neighbor Topologies in Kohonen SOM

There can be various topologies; however, the following two are used the most −

Rectangular Grid Topology

This topology has 24 nodes in the distance-2 grid, 16 nodes in the distance-1 grid, and 8
nodes in the distance-0 grid, which means the difference between each rectangular grid is 8
nodes. The winning unit is indicated by #.

Hexagonal Grid Topology

This topology has 18 nodes in the distance-2 grid, 12 nodes in the distance-1 grid, and 6
nodes in the distance-0 grid, which means the difference between each hexagonal grid is 6
nodes. The winning unit is indicated by #.

Learning Vector Quantization

Learning Vector Quantization (LVQ) is a type of artificial neural network that is also
inspired by biological models of neural systems. It is a prototype-based supervised
classification algorithm and trains its network through a competitive learning algorithm
similar to that of the self-organizing map. It can also deal with multiclass classification
problems. LVQ has two layers: the input layer and the output layer. In the LVQ
architecture, the input layer has n units, one per input feature of a sample, and the output
layer has one unit per class in the input data.

How does Learning Vector Quantization work?

Suppose the input data has size ( m, n ), where m is the number of training examples and n
is the number of features in each example, with a label vector of size ( m, 1 ). First, the
weights of size ( n, c ) are initialized from the first c training samples with different
labels, and these samples are then discarded from the training set. Here, c is the number of
classes. The algorithm then iterates over the remaining input data and, for each training
example, updates the winning vector (the weight vector with the shortest distance, e.g.
Euclidean distance, from the training example).

The weight update rule is given by:

if correctly_classified:
    wij(new) = wij(old) + alpha(t) * (xik - wij(old))
else:
    wij(new) = wij(old) - alpha(t) * (xik - wij(old))

where alpha is the learning rate at time t, j denotes the winning vector, i denotes the i-th
feature of the training example, and k denotes the k-th training example from the input
data. After training the LVQ network, the trained weights are used for classifying new
examples. A new example is labelled with the class of the winning vector.

Algorithm LVQ:

Step 1: Initialize the reference vectors. From the given set of training vectors, take the
first m training vectors (one per cluster) and use them as weight vectors; the remaining
vectors are used for training. Alternatively, assign the initial weights and
classifications randomly.

Step 2: Calculate the squared Euclidean distance for i = 1 to n and j = 1 to m (n input
features, m output units),

$$D(j) = \sum_{i=1}^{n} (x_i - w_{ij})^2$$

and find the winning unit index J for which D(J) is minimum.

Step 3: Update the weights of the winning unit w_J using the following conditions, where T
is the target class of the training vector x and C_J is the class represented by unit J:

if T = C_J then w_J(new) = w_J(old) + α[x – w_J(old)]
if T ≠ C_J then w_J(new) = w_J(old) – α[x – w_J(old)]

Step 4: Check the stopping condition; if it is false, repeat the above steps.
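
What follows is a minimal sketch of the LVQ steps in plain Python, not the original
listing from these notes. The linearly decaying learning rate, the fixed number of epochs
as the stopping condition, and all function names and sample data are assumptions made
for illustration.

```python
def nearest(prototypes, x):
    """Index of the prototype with the smallest squared Euclidean distance to x."""
    return min(range(len(prototypes)),
               key=lambda j: sum((x_i - w_i) ** 2 for x_i, w_i in zip(x, prototypes[j])))


def train_lvq(samples, labels, n_epochs=20, alpha0=0.3):
    # Step 1: the first sample of each class becomes that class's reference vector;
    # the remaining samples are kept for training.
    prototypes, proto_labels, rest = [], [], []
    for x, t in zip(samples, labels):
        if t not in proto_labels:
            prototypes.append(list(x))
            proto_labels.append(t)
        else:
            rest.append((x, t))
    for epoch in range(n_epochs):                    # Step 4: assumed stopping rule
        alpha = alpha0 * (1.0 - epoch / n_epochs)    # assumed learning-rate schedule
        for x, t in rest:
            j = nearest(prototypes, x)               # Step 2: winning unit
            # Step 3: attract the winner if its class matches, otherwise repel it.
            sign = 1.0 if proto_labels[j] == t else -1.0
            prototypes[j] = [w_i + sign * alpha * (x_i - w_i)
                             for w_i, x_i in zip(prototypes[j], x)]
    return prototypes, proto_labels


def predict(prototypes, proto_labels, x):
    """A new example gets the class of its winning (nearest) prototype."""
    return proto_labels[nearest(prototypes, x)]


samples = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
labels = [0, 1, 0, 1, 0, 1]
prototypes, proto_labels = train_lvq(samples, labels)
print(predict(prototypes, proto_labels, [0.15, 0.1]),
      predict(prototypes, proto_labels, [0.85, 0.95]))
```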

Counter propagation network

Counter propagation networks (CPN) were proposed by Hecht-Nielsen in 1987. They are
multilayer networks based on combinations of input, output, and clustering layers.
Applications of counter propagation nets include data compression, function approximation,
and pattern association. The counter propagation network is basically constructed from an
instar-outstar model. This model is a three-layer neural network that performs input-output
data mapping, producing an output vector y in response to an input vector x on the basis of
competitive learning. The three layers in an instar-outstar model are the input layer, the
hidden (competitive) layer, and the output layer.

There are two stages involved in the training process of a counter propagation net. The
input vectors are clustered in the first stage. In the second stage of training, the weights
from the cluster layer units to the output units are tuned to obtain the desired response.
There are two types of counter propagation network:

1. Full counter propagation network

2. Forward-only counter propagation network

Full CPN

• The full CPN can produce a correct output even when it is given an input vector that is
partially incomplete or incorrect.

• In the first phase, the training vector pairs are used to form clusters using either the
dot product or the Euclidean distance.

• If the dot product is used, normalization is a must.

• During the second phase, the weights between the cluster units and the output units are
adjusted.

• The architecture of the CPN resembles an instar and outstar model.

• The model that connects the input layers to the hidden layer is called the instar model,
and the model that connects the hidden layer to the output layer is called the outstar
model.

• The weights are updated in both the instar model (in the first phase) and the outstar
model (in the second phase).

• The network is a fully interconnected network.

Architecture of Full Counter propagation

First phase of Full CPN

• This phase of training is called instar-modeled training.

• The active units here are the units in the x-input, z-cluster, and y-input layers.

• The winning unit uses the standard Kohonen learning rule for its weight update.

• The rule is:

$$v_{ij}(\text{new}) = v_{ij}(\text{old}) + \alpha\,[x_i - v_{ij}(\text{old})] = (1-\alpha)\,v_{ij}(\text{old}) + \alpha\,x_i, \quad i = 1 \text{ to } n$$

$$w_{kj}(\text{new}) = w_{kj}(\text{old}) + \beta\,[y_k - w_{kj}(\text{old})] = (1-\beta)\,w_{kj}(\text{old}) + \beta\,y_k, \quad k = 1 \text{ to } m$$

Second phase of full CPN

• In this phase, only the winning unit J remains active in the cluster layer.

• The weights from the winning cluster unit J to the output units are adjusted so that the
vector of activations of the units in the y-output layer, y*, is an approximation of the
input vector y, and x* is an approximation of the input vector x.

• Here the weight update is done by the Grossberg learning rule.

• Here no competition is assumed among units.

• The weight update rule is given as:

$$u_{jk}(\text{new}) = u_{jk}(\text{old}) + a\,[y_k - u_{jk}(\text{old})] = (1-a)\,u_{jk}(\text{old}) + a\,y_k, \quad k = 1 \text{ to } m$$

$$t_{ji}(\text{new}) = t_{ji}(\text{old}) + b\,[x_i - t_{ji}(\text{old})] = (1-b)\,t_{ji}(\text{old}) + b\,x_i, \quad i = 1 \text{ to } n$$

Training Algorithm

• The parameters used are:
• x – input training vector, x = (x1, …, xi, …, xn)
• y – target output vector, y = (y1, …, yk, …, ym)
• zj – activation of cluster unit Zj.
• x* – approximation to vector x.
• y* – approximation to vector y.
• vij – weight from x-input layer to Z-cluster layer.
• wkj – weight from y-input layer to Z-cluster layer.
• tji – weight from cluster layer to X-output layer.
• ujk – weight from cluster layer to Y-output layer.
• α, β – learning rates during Kohonen learning.
• a, b – learning rates during Grossberg learning.
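
The two training phases can be sketched with these parameters. This is a simplified
illustration under stated assumptions: the cluster weights are initialized from the first
training pairs, the winner is chosen by Euclidean distance, the learning rates stay
constant, and the function names and sample data are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p_i - q_i) ** 2 for p_i, q_i in zip(p, q))


def find_winner(v, w, x=None, y=None):
    """Cluster unit J closest to whichever of x and y is supplied."""
    def dist(j):
        d = 0.0
        if x is not None:
            d += sq_dist(x, v[j])
        if y is not None:
            d += sq_dist(y, w[j])
        return d
    return min(range(len(v)), key=dist)


def move_toward(weights, target, rate):
    """weights + rate * (target - weights), the form shared by all four rules."""
    return [wt + rate * (tg - wt) for wt, tg in zip(weights, target)]


def train_full_cpn(pairs, n_clusters, epochs=50, alpha=0.3, beta=0.3, a=0.3, b=0.3):
    n, m = len(pairs[0][0]), len(pairs[0][1])
    # Assumption: cluster weights start at the first n_clusters training pairs.
    # v[j][i] stores v_ij and w[j][k] stores w_kj (cluster unit index first).
    v = [list(x) for x, _ in pairs[:n_clusters]]
    w = [list(y) for _, y in pairs[:n_clusters]]
    # Phase 1 (instar): Kohonen rule on the winning cluster unit J.
    for _ in range(epochs):
        for x, y in pairs:
            J = find_winner(v, w, x, y)
            v[J] = move_toward(v[J], x, alpha)
            w[J] = move_toward(w[J], y, beta)
    # Phase 2 (outstar): Grossberg rule on the weights leaving unit J.
    t = [[0.0] * n for _ in range(n_clusters)]  # t[j][i] stores t_ji
    u = [[0.0] * m for _ in range(n_clusters)]  # u[j][k] stores u_jk
    for _ in range(epochs):
        for x, y in pairs:
            J = find_winner(v, w, x, y)
            t[J] = move_toward(t[J], x, b)
            u[J] = move_toward(u[J], y, a)
    return v, w, t, u


def recall(v, w, t, u, x=None, y=None):
    """Return the approximations (x*, y*) for a possibly incomplete input."""
    J = find_winner(v, w, x, y)
    return t[J], u[J]


pairs = [([0.0, 0.0], [0.0]), ([1.0, 1.0], [1.0]),
         ([0.1, 0.0], [0.0]), ([0.9, 1.0], [1.0])]
v, w, t, u = train_full_cpn(pairs, n_clusters=2)
x_star, y_star = recall(v, w, t, u, x=[0.95, 1.0])  # y is not supplied
print(x_star, y_star)
```

The recall step shows the property mentioned for the full CPN: with only x supplied, the
winning cluster unit still yields an approximation y* of the missing y.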

Forward-only Counterpropagation network:

A simplified version of the full CPN is the forward-only CPN. The forward-only CPN uses
only the x vectors to form the clusters on the Kohonen units during phase I training. The
input vectors are first presented to the input units, and the weights between the input
layer and the cluster layer are trained. Then the weights between the cluster layer and the
output layer are trained. This is a specific competitive network in which the targets are
known.

Architecture of forward-only CPN

It consists of three layers: input layer, cluster layer, and output layer. Its architecture
resembles the back-propagation network, but in CPN there exist interconnections between
the units in the cluster layer.

The activation function is similar to that of the full counter propagation network.

First phase formula from x to z

• The rule is:

$$v_{ij}(\text{new}) = v_{ij}(\text{old}) + \alpha\,[x_i - v_{ij}(\text{old})] = (1-\alpha)\,v_{ij}(\text{old}) + \alpha\,x_i, \quad i = 1 \text{ to } n$$

• vij – weight from x-input layer to Z-cluster layer.

The weight update rule for the second phase is given as:

$$u_{jk}(\text{new}) = u_{jk}(\text{old}) + a\,[y_k - u_{jk}(\text{old})] = (1-a)\,u_{jk}(\text{old}) + a\,y_k, \quad k = 1 \text{ to } m$$

• ujk – weight from cluster layer to Y-output layer.


Adaptive Resonance Theory (ART)

Adaptive resonance theory is a type of neural network technique developed by Stephen
Grossberg and Gail Carpenter in 1987. The basic ART uses an unsupervised learning
technique. The terms “adaptive” and “resonance” suggest that these networks are open to
new learning (i.e. adaptive) without discarding previous or old information (i.e.
resonance). ART networks are known for addressing the stability-plasticity dilemma:
stability refers to their ability to retain what has been learned, and plasticity refers to
their flexibility in acquiring new information. Because of this, ART networks are able to
learn new input patterns without forgetting past ones. ART networks implement a
clustering algorithm. An input is presented to the network and the algorithm checks
whether it fits into one of the already stored clusters. If it fits, the input is added to
the cluster that matches it best; otherwise a new cluster is formed.

Types of Adaptive Resonance Theory (ART)

Carpenter and Grossberg developed different ART architectures as a result of 20 years of
research. They can be classified as follows:
• ART1 – It is the simplest and the basic ART architecture. It is capable of clustering
binary input values.
• ART2 – It is an extension of ART1 that is capable of clustering continuous-valued input
data.
• Fuzzy ART – It combines fuzzy logic with ART.
• ARTMAP – It is a supervised form of ART learning in which one ART module learns based
on the previous ART module. It is also known as predictive ART.
• FARTMAP – This is a supervised ART architecture with fuzzy logic included.

Basics of Adaptive Resonance Theory (ART) Architecture

Adaptive resonance theory describes a type of neural network that is self-organizing and
competitive. It can be of either type: unsupervised (ART1, ART2, ART3, etc.) or supervised
(ARTMAP). Generally, the supervised algorithms are named with the suffix “MAP”. The basic
ART model is unsupervised in nature and consists of:
• F1 layer or the comparison field (where the inputs are processed)
• F2 layer or the recognition field (which consists of the clustering units)
• The reset module (which acts as a control mechanism)

The F1 layer accepts the inputs, performs some processing, and transfers the result to the
F2 layer unit that best matches the classification factor. There exist two sets of weighted
interconnections for controlling the degree of similarity between the units in the F1 and
F2 layers. The F2 layer is a competitive layer. The cluster unit with the largest net input
becomes the candidate to learn the input pattern first, and the rest of the F2 units are
ignored. The reset unit decides whether or not the cluster unit is allowed to learn the
input pattern, depending on how similar its top-down weight vector is to the input vector.
This is called the vigilance test. Thus the vigilance parameter helps to incorporate new
memories or new information. Higher vigilance produces more detailed memories; lower
vigilance produces more general memories.

Generally, two types of learning exist: slow learning and fast learning. In fast learning,
the weight update during resonance occurs rapidly. It is used in ART1. In slow learning,
the weight change occurs slowly relative to the duration of the learning trial. It is used
in ART2.
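
The interplay of F2 competition, the vigilance test, and fast learning can be sketched for
binary inputs. This is a simplified ART1-style illustration, not the full network: the
choice function overlap / (choice_param + |w|), the match ratio overlap / |x| used as the
vigilance test, and the intersection update are assumptions taken from common textbook
presentations, and all names and patterns are hypothetical.

```python
def art1_cluster(patterns, vigilance=0.7, choice_param=1.0):
    """Cluster binary patterns; returns (prototypes, cluster index per pattern)."""
    prototypes = []   # one top-down binary weight vector per F2 cluster unit
    assignments = []
    for x in patterns:
        norm_x = sum(x)

        def overlap(j):
            return sum(a & b for a, b in zip(x, prototypes[j]))

        def choice(j):
            # Assumed choice function used to rank the F2 units (competition).
            return overlap(j) / (choice_param + sum(prototypes[j]))

        chosen = None
        for j in sorted(range(len(prototypes)), key=choice, reverse=True):
            # Vigilance test: is the candidate's prototype similar enough to x?
            if norm_x > 0 and overlap(j) / norm_x >= vigilance:
                # Resonance with fast learning: keep only the shared features.
                prototypes[j] = [a & b for a, b in zip(x, prototypes[j])]
                chosen = j
                break
            # Otherwise the unit is reset and the next candidate is tried.
        if chosen is None:
            prototypes.append(list(x))  # no unit passed the test: new cluster
            chosen = len(prototypes) - 1
        assignments.append(chosen)
    return prototypes, assignments


patterns = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
protos_lo, assign_lo = art1_cluster(patterns, vigilance=0.6)
protos_hi, assign_hi = art1_cluster(patterns, vigilance=0.9)
print(assign_lo, assign_hi)
```

With the lower vigilance the four patterns fall into two clusters, while the higher
vigilance splits them into three, which matches the statement that higher vigilance
produces more detailed memories.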

Advantage of Adaptive Resonance Theory (ART)


• It exhibits stability and is not disturbed by a wide variety of inputs provided to its
network.
• It can be integrated and used with various other techniques to give better results.
• It can be used in various fields such as mobile robot control, face recognition, land cover
classification, target recognition, medical diagnosis, signature verification, clustering web
users, etc.
• It has advantages over competitive learning: competitive learning lacks the capability to
add new clusters when deemed necessary, and it does not guarantee stability in forming
clusters.

Application of ART:

ART stands for Adaptive Resonance Theory. ART neural networks, used for fast, stable
learning and prediction, have been applied in different areas. The applications include
target recognition, face recognition, medical diagnosis, signature verification, and mobile
robot control.

Target recognition:

A fuzzy ARTMAP neural network can be used for automatic classification of targets based
on their radar range profiles. Tests on synthetic data show that fuzzy ARTMAP can result in
substantial savings in memory requirements compared to k-nearest-neighbor (kNN)
classifiers. The use of multiwavelength profiles mainly improves the performance of
both kinds of classifiers.

Medical diagnosis:

Medical databases present many of the challenges found in general information management
settings, where speed, use, efficiency, and accuracy are the prime concerns. A direct
objective of improved computer-assisted medicine is to help deliver intensive care in
situations that may be less than ideal. Working on these issues has stimulated several ART
architecture developments, including ARTMAP-IC.

Signature verification:

Automatic signature verification is a well-known and active area of research with various
applications such as bank check confirmation, ATM access, etc. The network is trained
using ART1 with global features as the input vector, and the verification and recognition
phase uses a two-step process. In the first step, the input vector is matched against the
stored reference vector, which was used as a training set, and in the second step, cluster
formation takes place.

Mobile control robot:

Nowadays, we see a wide range of robotic devices. Their software side, called artificial
intelligence, is still a field of research. The human brain is an interesting subject as a
model for such an intelligent system. Inspired by the structure of the human brain, the
artificial neural network emerged. Similar to the brain, an artificial neural network
contains numerous simple computational units, neurons, that are mutually interconnected to
allow the transfer of signals from neuron to neuron. Artificial neural networks are used to
solve different problems with good outcomes compared to other decision algorithms.

Limitations of Adaptive Resonance Theory

Some ART networks are inconsistent (like Fuzzy ART and ART1), as their results depend upon
the order of the training data or upon the learning rate.

Special Networks

An Artificial Neural Network (ANN) is an information processing paradigm that is inspired
by the brain. ANNs, like people, learn by example. An ANN is configured for a specific
application, such as pattern recognition or data classification, through a learning process.
Learning largely involves adjustments to the synaptic connections that exist between the
neurons.

Artificial Neural Networks (ANNs) are a type of machine learning model inspired by the
structure and function of the human brain. They consist of layers of interconnected
“neurons” that process and transmit information.

There are several different architectures for ANNs, each with their own strengths and
weaknesses. Some of the most common architectures include:

Feedforward Neural Networks: This is the simplest type of ANN architecture, where the
information flows in one direction from input to output. The layers are fully connected,
meaning each neuron in a layer is connected to all the neurons in the next layer.

Recurrent Neural Networks (RNNs): These networks have a “memory” component, where
information can flow in cycles through the network. This allows the network to process
sequences of data, such as time series or speech.

Convolutional Neural Networks (CNNs): These networks are designed to process data with
a grid-like topology, such as images. The layers consist of convolutional layers, which learn
to detect specific features in the data, and pooling layers, which reduce the spatial
dimensions of the data.

Autoencoders: These are neural networks that are used for unsupervised learning. They
consist of an encoder that maps the input data to a lower-dimensional representation and a
decoder that maps the representation back to the original data.

Generative Adversarial Networks (GANs): These are neural networks that are used for
generative modeling. They consist of two parts: a generator that learns to generate new data
samples, and a discriminator that learns to distinguish between real and generated data.

The model of an artificial neural network can be specified by three entities:

• Interconnections
• Activation functions
• Learning rules

Interconnections:

Interconnection can be defined as the way processing elements (neurons) in an ANN are
connected to each other. Hence, the arrangement of these processing elements and the
geometry of their interconnections are essential in an ANN.

These arrangements always have two layers that are common to all network architectures:
the input layer, which buffers the input signal, and the output layer, which generates the
output of the network. The third layer is the hidden layer, whose neurons belong neither to
the input layer nor to the output layer. These neurons are hidden from the people who
interface with the system and act as a black box to them. By adding hidden layers with
neurons, the system’s computational and processing power can be increased, but training
the system becomes more complex at the same time. There exist five basic types of neuron
connection architecture:

1. Single-layer feed-forward network


2. Multilayer feed-forward network
3. Single node with its own feedback
4. Single-layer recurrent network
5. Multilayer recurrent network

1. Single-layer feed-forward network

In this type of network, we have only two layers, the input layer and the output layer, but
the input layer does not count because no computation is performed in it. The output layer
is formed when different weights are applied to the input nodes and the cumulative effect
per node is taken. The neurons of the output layer then collectively compute the output
signals.

2. Multilayer feed-forward network

This network also has a hidden layer that is internal to the network and has no direct
contact with the external layer. The existence of one or more hidden layers makes the
network computationally stronger. It is a feed-forward network because information flows
from the input through the intermediate computations that determine the output Z. There
are no feedback connections in which outputs of the model are fed back into itself.
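
A forward pass through such a network can be sketched in a few lines. The sigmoid
activation, the layer sizes, and the weight values below are arbitrary choices made for
illustration; the notes do not prescribe them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs plus a bias."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """Feed x through the layers in order; nothing is ever fed back."""
    activation = x
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),  # hidden layer weights and biases
    ([[1.0, -1.0]], [0.2]),                   # output layer weights and biases
]
output = forward([1.0, 2.0], layers)
print(output)
```

A single-layer feed-forward network is the special case where `layers` contains only the
output layer.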

3. Single node with its own feedback

When outputs can be directed back as inputs to the same layer or to preceding layer nodes,
the result is a feedback network. Recurrent networks are feedback networks with closed
loops. The simplest case is a single recurrent network having a single neuron with
feedback to itself.

4. Single-layer recurrent network

This is a single-layer network with feedback connections, in which a processing element’s
output can be directed back to itself, to another processing element, or both. A recurrent
neural network is a class of artificial neural networks where connections between nodes
form a directed graph along a sequence. This allows it to exhibit dynamic temporal behavior
for a time sequence. Unlike feedforward neural networks, RNNs can use their internal state
(memory) to process sequences of inputs.

5. Multilayer recurrent network

In this type of network, a processing element’s output can be directed to processing
elements in the same layer and in the preceding layer, forming a multilayer recurrent
network. Such networks perform the same task for every element of a sequence, with the
output being dependent on the previous computations. Inputs are not needed at each time
step. The main feature of a recurrent neural network is its hidden state, which captures
some information about a sequence.
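
The role of the hidden state can be seen in a one-neuron sketch. The tanh activation, the
scalar weights, and the input sequence are illustrative assumptions rather than part of
the notes.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Apply the same computation at every time step, carrying the hidden state."""
    h = h0
    states = []
    for x_t in inputs:
        # The new state depends on the current input and on the previous state.
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    return states


# One nonzero input followed by zeros: later states stay nonzero through feedback.
states = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(states)
```

Even though the second and third inputs are zero, the corresponding states are not,
because the feedback weight `w_h` carries information forward from the first step.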

Types of Neural Networks

There are seven types of neural networks that can be used.


• Multilayer Perceptron (MLP): A type of feedforward neural network with three or more
layers, including an input layer, one or more hidden layers, and an output layer. It uses
nonlinear activation functions.
• Convolutional Neural Network (CNN): A neural network that is designed to process input
data that has a grid-like structure, such as an image. It uses convolutional layers and
pooling layers to extract features from the input data.
• Recursive Neural Network (RvNN): A neural network that can operate on input sequences
of variable length, such as text. It uses weights to make structured predictions.
• Recurrent Neural Network (RNN): A type of neural network that makes connections
between the neurons in a directed cycle, allowing it to process sequential data.
• Long Short-Term Memory (LSTM): A type of RNN that is designed to overcome the
vanishing gradient problem in training RNNs. It uses memory cells and gates to
selectively read, write, and erase information.
• Sequence-to-Sequence (Seq2Seq): A type of neural network that uses two RNNs to map
input sequences to output sequences, such as translating one language to another.
• Shallow Neural Network: A neural network with only one hidden layer, often used for
simpler tasks or as a building block for larger networks.
