Activation functions in Neural Networks | Set 2
Last Updated: 05 Sep, 2020
The article Activation-functions-neural-networks explains the purpose of activation functions and covers some common variants: linear, sigmoid, tanh, ReLU, and softmax. This article briefly discusses some further variants: Leaky ReLU, ELU, SELU, Softsign, and Softplus.
Leaky ReLU function:
Leaky Rectified Linear Unit (Leaky ReLU) is an extension of the ReLU function designed to overcome the dying neuron problem.

Equation:
leaky_relu(x) = x        if x > 0
leaky_relu(x) = 0.01 * x if x <= 0
Derivative:
d/dx leaky_relu(x) = 1    if x > 0
d/dx leaky_relu(x) = 0.01 if x <= 0
Uses: ReLU returns 0 for negative inputs, so such a neuron becomes inactive and stops contributing to gradient flow. Leaky ReLU overcomes this problem by letting a small value flow through when the input is negative. So, if learning is too slow with ReLU, one can try Leaky ReLU and check whether it improves.
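As a minimal sketch, the equations above translate directly into NumPy (the function names and the 0.01 default slope here are illustrative):

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    # Positive inputs pass through unchanged; negative inputs are
    # scaled by a small slope so the gradient never dies completely.
    return np.where(x > 0, x, slope * x)

def leaky_relu_grad(x, slope=0.01):
    # Gradient is 1 for positive inputs and the small slope otherwise.
    return np.where(x > 0, 1.0, slope)
```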
ELU function:
The Exponential Linear Unit (ELU) is similar to Leaky ReLU but treats negative inputs differently: instead of a small linear slope, it applies a saturating exponential curve. It also helps overcome the dying neuron problem.

Equation:
elu(x) = x                    if x > 0
elu(x) = alpha * (exp(x) - 1) if x <= 0
Derivative:
d/dx elu(x) = 1              if x > 0
d/dx elu(x) = elu(x) + alpha if x <= 0
Uses: It serves the same purpose as Leaky ReLU, but the cost function converges toward its minimum faster than with ReLU or Leaky ReLU. For example, training a neural network on ImageNet is faster with ELU than with ReLU.
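A comparable sketch for ELU, using the identity d/dx elu(x) = elu(x) + alpha on the negative side (the alpha = 1.0 default is an assumption, not from the text):

```python
import numpy as np

def elu(x, alpha=1.0):
    # Identity for positive inputs; a smooth exponential curve,
    # saturating at -alpha, for negative inputs.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

def elu_grad(x, alpha=1.0):
    # For x <= 0 the derivative simplifies to elu(x) + alpha.
    return np.where(x > 0, 1.0, elu(x, alpha) + alpha)
```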
SELU function:
The Scaled Exponential Linear Unit (SELU) is a scaled form of ELU: multiplying the output of ELU (with a fixed alpha) by a predetermined "scale" parameter gives the SELU output.

Equation:
selu(x) = scale * x                    if x > 0
selu(x) = scale * alpha * (exp(x) - 1) if x <= 0
where,
alpha = 1.67326324
scale = 1.05070098
Derivative:
d/dx selu(x) = scale                   if x > 0
d/dx selu(x) = selu(x) + scale * alpha if x <= 0
Uses: SELU is the activation function used in Self-Normalizing Neural Networks (SNNs), which train deep, robust networks that are less affected by the vanishing and exploding gradient problems.
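A minimal sketch of SELU using the fixed constants quoted above:

```python
import numpy as np

ALPHA = 1.67326324  # fixed constants from the SELU formulation,
SCALE = 1.05070098  # chosen so activations self-normalize

def selu(x):
    # Scaled ELU: ELU with the fixed alpha, multiplied by scale.
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

def selu_grad(x):
    # For x <= 0 this equals selu(x) + SCALE * ALPHA, as in the
    # derivative above.
    return SCALE * np.where(x > 0, 1.0, ALPHA * np.exp(x))
```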
Softsign function:
The Softsign function is an alternative to the tanh function: tanh saturates exponentially, whereas softsign saturates polynomially, approaching its asymptotes of -1 and 1 more gradually.
Equation:
softsign(x) = x / (1 + |x|)
Derivative:
d/dx softsign(x) = 1 / (1 + |x|)^2
Uses: It is mostly used in regression problems and can also be used in deep neural networks for text-to-speech conversion.
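A minimal sketch of softsign and its derivative, following the equations above:

```python
import numpy as np

def softsign(x):
    # Bounded in (-1, 1) like tanh, but approaches the bounds
    # polynomially rather than exponentially.
    return x / (1 + np.abs(x))

def softsign_grad(x):
    # Derivative decays only quadratically in |x|.
    return 1 / (1 + np.abs(x)) ** 2
```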
Softplus function:
The Softplus function is a smooth approximation of the ReLU activation function, and its derivative is the sigmoid function. It also helps overcome the dying neuron problem.
Equation:
softplus(x) = log(1 + exp(x))
Derivative:
d/dx softplus(x) = 1 / (1 + exp(-x))
Uses: Some experiments show that softplus takes fewer epochs to converge than ReLU and sigmoid. It can be used in speech recognition systems.
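A minimal sketch of softplus; using np.logaddexp for the forward pass is an implementation choice to avoid overflow, not something the equations above require:

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)), computed as logaddexp(0, x) so that large
    # positive x does not overflow exp().
    return np.logaddexp(0, x)

def softplus_grad(x):
    # The derivative of softplus is the sigmoid function.
    return 1 / (1 + np.exp(-x))
```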