Activation Function
The role of the Activation Function is to derive an output from the set of input values fed to a node (or a layer). Because it transfers the combined input of a node to its output, it is also often referred to as a Transfer Function in Artificial Neural Networks.
It is used to determine the output of a neural network, for example a yes or no decision. It maps the resulting values into a range such as 0 to 1 or -1 to 1, depending on the function.
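As a rough illustration of this role, here is a minimal Python/NumPy sketch of a single node; the names node_output, weights and bias are made up for the example, and the sigmoid is used purely as one possible activation. The inputs are combined into a weighted sum, and the activation function squashes that sum into the 0 to 1 range.

import numpy as np

def sigmoid(z):
    # squashes any real number into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

def node_output(inputs, weights, bias):
    # weighted sum of the input values fed to the node
    z = np.dot(weights, inputs) + bias
    # the activation function maps that sum to the node's output
    return sigmoid(z)

x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
print(node_output(x, w, bias=0.2))  # a value between 0 and 1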
A linear (identity) activation function does nothing to the weighted sum of the input; it simply passes on the value it was given.
If a linear activation function is used, all layers of the neural network collapse into one. No matter how many layers the network has, the output of the last layer is still a linear function of the input to the first, so, essentially, a linear activation function turns the neural network into just one layer. Because of this limited expressive power, the model cannot create complex mappings between the network's inputs and outputs.
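To see why the layers collapse, consider this short NumPy sketch with arbitrary example weights: two layers that use a linear (identity) activation produce exactly the same output as a single layer whose weight matrix is W2 @ W1 and whose bias is W2 @ b1 + b2.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# two "layers" with a linear (identity) activation
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
two_layers = W2 @ (W1 @ x + b1) + b2

# one equivalent layer: weights W2 @ W1, bias W2 @ b1 + b2
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)

print(np.allclose(two_layers, one_layer))  # True: the two layers collapse into one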
The sigmoid (logistic) function behaves differently: the larger the input (more positive), the closer the output value will be to 1.0, whereas the smaller the input (more negative), the closer the output will be to 0.0. Mathematically it can be represented as:

f(x) = 1 / (1 + e^(-x))
Here is why the sigmoid/logistic activation function is one of the most widely used functions: because its output always lies between 0 and 1, it is a natural choice for models that have to predict a probability as the output.
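As a small sketch, evaluating the sigmoid at a few arbitrary example inputs shows this saturating behaviour and the (0, 1) output range.

import numpy as np

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, sigmoid(x))
# large negative inputs give values close to 0.0,
# large positive inputs give values close to 1.0,
# and sigmoid(0) is exactly 0.5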
The main catch with the ReLU (Rectified Linear Unit) function is that it does not activate all of the neurons at the same time. A neuron is deactivated only if the output of the linear transformation feeding it is less than 0, in which case ReLU outputs 0.
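The sketch below, using made-up pre-activation values, shows that behaviour: ReLU passes positive values through unchanged and outputs 0 for every negative value, so the corresponding neurons are effectively switched off.

import numpy as np

def relu(z):
    # max(0, z): negative pre-activations are zeroed out ("deactivated")
    return np.maximum(0.0, z)

pre_activations = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(pre_activations))  # [0.  0.  0.  0.5 2. ]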