What are the commonly used activation functions?
Some of the commonly used activation functions are the binary (threshold) function, the sigmoidal function, and the hyperbolic tangent function; the latter two are nonlinear.
Binary - The output has only two values, either 0 or 1. A threshold value is set up; if the net weighted input is greater than the threshold, the output is taken as 1, otherwise it is 0.
Sigmoidal Hyperbolic - This function has an ‘S’-shaped curve. Here, the tan hyperbolic function is used to approximate the output from the net input. The function is defined as f(x) = 1 / (1 + exp(-λx)), where λ is the steepness parameter.
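As a quick illustration, here is a minimal Python/NumPy sketch of these activations; the steepness parameter λ is exposed as an argument, and its default value is just an illustrative choice:

```python
import numpy as np

def binary_step(x, threshold=0.0):
    # Output is 1 if the net weighted input exceeds the threshold, else 0.
    return np.where(x > threshold, 1, 0)

def sigmoid(x, lam=1.0):
    # f(x) = 1 / (1 + exp(-lam * x)); lam is the steepness parameter.
    return 1.0 / (1.0 + np.exp(-lam * x))

def tanh_activation(x):
    # Hyperbolic tangent squashes the net input into (-1, 1).
    return np.tanh(x)
```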
Neural Network Architecture Types
Perceptron Model in Neural Networks
Multilayer Perceptron Neural Network
Recurrent Neural Network
Hopfield Network
Perceptron Model
A perceptron is a neural network with input units connected directly to a single output unit and no hidden layers. These are also known as ‘single-layer perceptrons’.
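A minimal sketch of a single-layer perceptron with two inputs; the weights, bias, and threshold below are arbitrary illustrative values, not prescribed by the text:

```python
import numpy as np

def perceptron(x, w, b, threshold=0.0):
    # Weighted sum of inputs plus bias, passed through a binary step.
    net = np.dot(w, x) + b
    return 1 if net > threshold else 0

# Example: two inputs with hand-picked weights that realise an OR-like decision.
print(perceptron(np.array([1, 0]), w=np.array([0.6, 0.6]), b=-0.5))  # -> 1
```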
Multilayer Perceptron
Unlike single-layer perceptrons, deep feedforward neural networks use more than one hidden
layer of neurons.
Recurrent
A type of neural network in which hidden-layer neurons have self-connections, so the network possesses memory. At any instant, a hidden-layer neuron receives activation from the lower layer as well as its own previous activation value.
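A small sketch of this recurrent update, assuming a tanh activation and randomly initialized weights (both illustrative choices): at each step the hidden activation is computed from the current input and the previous hidden activation.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # New hidden activation depends on the current input and the previous
    # hidden activation (the network's "memory").
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Example with 3 input units and 2 hidden neurons.
rng = np.random.default_rng(0)
W_x, W_h, b = rng.normal(size=(2, 3)), rng.normal(size=(2, 2)), np.zeros(2)
h = np.zeros(2)
for x_t in rng.normal(size=(4, 3)):   # a short input sequence
    h = rnn_step(x_t, h, W_x, W_h, b)
```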
Hopfield Network
A fully interconnected network of neurons in which each neuron is connected to every other neuron. The network is trained by setting the neurons' values to the desired input pattern, after which the weights are computed; the weights are then left unchanged. Once trained on one or more patterns, the network converges to the nearest learned pattern when presented with a new or noisy input, which sets it apart from the other network types above.
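A minimal sketch of this behaviour, using the common Hebbian outer-product rule to compute the fixed weights and a synchronous sign update for recall; the stored patterns are illustrative:

```python
import numpy as np

# Two bipolar patterns to store (illustrative).
patterns = np.array([[ 1,  1,  1,  1, -1, -1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1,  1, -1]])

# Hebbian weight computation; weights are symmetric and fixed after training.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

# Recall: start from a corrupted copy of the first pattern and iterate.
state = patterns[0].copy()
state[0] = -state[0]                       # flip one bit
for _ in range(5):
    state = np.where(W @ state >= 0, 1, -1)
print(np.array_equal(state, patterns[0]))  # True: settles on the stored pattern
```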
Learning Techniques
The neural network learns by iteratively adjusting its weights and bias (threshold) to yield
the desired output. These are also called free parameters. For learning to take place, the
Neural Network must be trained first. The training is performed using a defined set of rules,
the learning algorithm.
Training Algorithms
Gradient Descent Algorithm—This is the simplest training algorithm, used for supervised training. If the actual output differs from the target output, the difference, or error, is computed. The gradient descent algorithm then adjusts the network's weights so as to minimize this error.
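A minimal sketch of a gradient-descent update for a single linear unit with squared error; the learning rate, input, and target below are illustrative:

```python
import numpy as np

def gd_step(w, b, x, target, lr=0.1):
    # Forward pass, error, then move the weights against the error gradient.
    y = np.dot(w, x) + b
    error = y - target
    w = w - lr * error * x        # dE/dw for E = 0.5 * error**2
    b = b - lr * error            # dE/db
    return w, b

w, b = np.zeros(2), 0.0
for _ in range(100):
    w, b = gd_step(w, b, np.array([1.0, 2.0]), target=3.0)
```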
Back Propagation Algorithm—It extends the gradient-based delta learning rule. Here, after the error (the difference between the target and the actual output) is found, it is propagated backwards from the output layer to the input layer via the hidden layer(s). It is used in multilayer neural networks.
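Extending the previous sketch, the snippet below shows the error being propagated back through one sigmoid hidden layer of a tiny network; the layer sizes, data, and learning rate are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(W1, W2, x, target, lr=0.1):
    # Forward pass: input -> hidden -> output.
    h = sigmoid(W1 @ x)
    y = W2 @ h
    # Output-layer error, then propagate it back to the hidden layer.
    delta_out = y - target
    delta_hidden = (W2.T @ delta_out) * h * (1 - h)   # sigmoid derivative
    # Gradient-descent updates for both weight matrices.
    W2 -= lr * np.outer(delta_out, h)
    W1 -= lr * np.outer(delta_hidden, x)
    return W1, W2

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
for _ in range(200):
    W1, W2 = backprop_step(W1, W2, np.array([0.5, -0.2]), target=np.array([1.0]))
```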
Training Data Set: A set of examples used for learning, that is, to fit the parameters [i.e., weights] of the network. One epoch comprises one full training cycle on the training set.
Validation Set: A set of examples used to tune the hyperparameters [i.e., architecture] of the network, for example, to choose the number of hidden units in a neural network.
Test Set: A set of examples used only to assess the performance [generalization] of a fully specified network, or to apply the trained network to predict outputs for new inputs.
Algorithms to Train a Neural Network
Hebbian Learning Rule
Self-Organizing Kohonen Rule
Hopfield Network Law
Hebbian Learning Rule with Implementation of AND Gate
The Hebbian Learning Rule, also known as the Hebb Learning Rule, was proposed by Donald O. Hebb. It is one of the earliest and simplest learning rules for neural networks and is used for pattern classification. A Hebb net is a single-layer neural network: it has one input layer, which can have many units, say n, and one output layer with a single unit. The Hebbian rule works by updating the weights between neurons for each training sample.
Hebbian Learning Rule Algorithm:
1. Set all weights to zero, wi = 0 for i = 1 to n, and set the bias to zero.
2. For each training pair (S : t), where S is the input vector and t the target output, perform steps 3-5.
3. Set the activations of the input units to the input vector: Xi = Si for i = 1 to n.
4. Set the output neuron to the corresponding target value: y = t.
5. Update the weights and bias by applying the Hebb rule for all i = 1 to n: wi(new) = wi(old) + Xi * y and b(new) = b(old) + y.
Implementing AND Gate:
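The AND-gate implementation can be sketched as below, using bipolar inputs and targets (the usual choice for Hebb nets, since 0/1 inputs would leave some weights unchanged); this is an illustrative rendering of the algorithm above, not the original worked example:

```python
import numpy as np

# AND gate in bipolar form: inputs in {-1, +1}, target is +1 only for (+1, +1).
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]

w = np.zeros(2)   # step 1: weights start at zero
b = 0.0           # ... and so does the bias

for (x1, x2), t in samples:                # steps 2-4: one pass over the data
    x = np.array([x1, x2])
    y = t                                  # output is set to the target
    w = w + x * y                          # step 5: w_i(new) = w_i(old) + x_i * y
    b = b + y                              #         b(new)  = b(old)  + y

# Check the learned net: sign(w . x + b) reproduces AND on all four inputs.
for (x1, x2), t in samples:
    net = w @ np.array([x1, x2]) + b
    print((x1, x2), 1 if net >= 0 else -1, "target", t)
```

After the single pass, the learned parameters are w = (2, 2) and b = -2, which classify all four bipolar input pairs correctly.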
Architecture of KSOM
A KSOM consists of a (typically two-dimensional) grid of neurons, each holding a weight vector of the same dimension as the input data. Training proceeds as follows:
1. Initialize the weight vectors of the neurons in the grid, typically with small random values.
2. Present an input vector from the training data to the network.
3. Calculate the activation level of each neuron in the grid in response to the input data.
4. Select the neuron with the highest activation level as the winning neuron.
5. Update the weights of the winning neuron and its neighbors, using a learning rate that decays over time and a neighborhood function that decreases with distance from the winning neuron.
The resulting weight vectors of the neurons in the grid can be visualized as a low-
dimensional representation of the high-dimensional input data.
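A compact sketch of these training steps for a small two-dimensional grid; the grid size, learning-rate schedule, and neighbourhood width are illustrative, and the "highest activation" of step 3 is implemented in the usual way as the smallest Euclidean distance between a neuron's weight vector and the input:

```python
import numpy as np

def train_ksom(data, grid_shape=(5, 5), epochs=20, lr0=0.5, sigma0=2.0):
    rng = np.random.default_rng(0)
    rows, cols = grid_shape
    dim = data.shape[1]
    weights = rng.random((rows, cols, dim))              # step 1: random weights
    # Grid coordinates of every neuron, used by the neighbourhood function.
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                  # learning rate decays over time
        sigma = sigma0 * (1 - epoch / epochs) + 0.5      # so does the neighbourhood width
        for x in data:                                   # step 2: present each input
            dists = np.linalg.norm(weights - x, axis=-1) # step 3: response of each neuron
            winner = np.unravel_index(np.argmin(dists), grid_shape)  # step 4: winning neuron
            grid_dist = np.linalg.norm(coords - np.array(winner), axis=-1)
            h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))          # neighbourhood function
            weights += lr * h[..., None] * (x - weights)              # step 5: update weights
    return weights

# Example: map 3-D points onto a 5x5 grid.
data = np.random.default_rng(1).random((100, 3))
som = train_ksom(data)
```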
Advantages of KSOM
Kohonen Self-Organizing Maps (KSOM) have several advantages that make them
useful for a wide range of applications, including:
1. Unsupervised learning: KSOMs are trained with unsupervised learning, which means that they do not require labeled data for training. This makes them useful for tasks where labeled data is not available or is too expensive to obtain.
3. Clustering and visualization: KSOMs can be used for clustering and visualization of high-dimensional data, making its structure easier to explore.
4. Robustness to noise: KSOMs are relatively robust to noise and can still perform
well even if the input data contains some level of noise or errors.
5. Easy interpretation: The results of KSOMs are easy to visualize and interpret, which can be useful for identifying trends and patterns in the data, and for communicating the results to others.
Disadvantages of KSOM
While Kohonen Self-Organizing Maps (KSOM) have many advantages, there are also
some limitations and disadvantages to using this technique, including:
1. Sensitivity to initial conditions: The final solution can depend on the initial conditions of the network, such as the initial weights of the neurons in the grid. This means that different initializations can result in different final solutions, and it may be necessary to run the algorithm multiple times to obtain a stable solution.
2. Computational cost: Training a KSOM can require a large number of iterations, particularly for large datasets and complex network architectures. This can make training and testing the network time-consuming and computationally expensive.
3. Choice of network size: Selecting the network size, or the number of neurons in the grid, can be difficult and is often a trial-and-error process. Using too few neurons can result in a poor representation of the input data, while using too many neurons can lead to overfitting.
4. Limited to low-dimensional data: KSOMs are typically used for dimensionality reduction onto a low-dimensional (usually two-dimensional) grid, so some structure in the original high-dimensional data may be lost in the mapping.
5. Difficulty of interpretation: Interpreting the resulting clusters or patterns in the data can be difficult. The meaning of the clusters or patterns may be unclear, and it may be necessary to combine KSOMs with other techniques to gain a deeper understanding of the data.