Sigmoid Deep Learning

What is sigmoid in deep learning?

The sigmoid neuron is a basic building block of deep neural networks. Sigmoid neurons are similar to perceptrons, but they are modified so that their output is a smooth function of the input rather than the perceptron's hard step output.

In a sigmoid neuron, a small change in the input causes only a small change in the output, unlike the stepped output produced by a perceptron.

The inputs to a sigmoid neuron can be real numbers, unlike the Boolean inputs of the McCulloch–Pitts (MP) neuron, and the output is a real number between 0 and 1. A sigmoid neuron models the relationship between X and Y in terms of probability. Even though the output lies between 0 and 1, you can still use the sigmoid function for binary classification tasks by choosing a threshold.

What is a Sigmoid function?

The sigmoid function is a mathematical function that takes any real value and maps it to a value between 0 and 1; its graph is shaped like the letter “S”.

The sigmoid function is also known as the logistic function.

Mathematical Formulation

The sigmoid function, denoted σ(x), is defined as follows:

σ(x) = 1 / (1 + exp(-x))

Where:

 x is the input to the function.

 exp() denotes the exponential function.

Graphically, the sigmoid function resembles an “S”-shaped curve, with values approaching 0 as x approaches negative infinity and values approaching 1 as x approaches positive infinity. The midpoint of the curve occurs at x = 0, where σ(0) = 0.5.
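A minimal Python sketch of this definition (the function name is illustrative, not from the text):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Midpoint of the curve: sigma(0) = 0.5
print(sigmoid(0.0))    # 0.5
# Large positive inputs approach 1; large negative inputs approach 0.
print(sigmoid(10.0))   # ~0.99995
print(sigmoid(-10.0))  # ~4.54e-05
```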

Sigmoid graph

As x goes up towards positive infinity, the predicted value of y approaches 1. As x goes down towards negative infinity, the predicted value of y approaches 0.

If the output of the sigmoid function is greater than 0.5, you classify the example as class 1 (the positive class); if it is less than 0.5, you classify it as class 0 (the negative class).
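The thresholding rule above can be sketched as follows; the scores used here are hypothetical:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def classify(score: float, threshold: float = 0.5) -> int:
    """Map a raw model score to class 1 (positive) or class 0 (negative)."""
    return 1 if sigmoid(score) > threshold else 0

print(classify(2.3))   # 1  (sigmoid(2.3) ~ 0.909 > 0.5)
print(classify(-1.7))  # 0  (sigmoid(-1.7) ~ 0.154 < 0.5)
```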

The sigmoid function acts as an activation function in machine learning, adding non-linearity to a model. In effect, the function determines how strongly a given value is passed on as output. Several types of sigmoid-shaped activation functions are used in machine learning and deep learning; seven of them are covered below.

What are the types of sigmoid functions?

There are several types of sigmoid functions. Here are seven of the most common.

1. Logistic Sigmoid Function

The logistic sigmoid function is what is normally meant by “the sigmoid function” in machine learning. It can take any real-valued input and outputs a value between zero and one. The logistic sigmoid function is used in many fields, including biomathematics and computer science; in machine learning, it is used in the output layer of a network to predict probabilities.

The logistic function, or logistic curve, is a common “S”-shaped curve defined by the equation below. The logistic curve is also known as the sigmoid curve.

f(x) = L / (1 + e^(-k(x - x0)))

Where,

L = the maximum value of the curve

e = the natural logarithm base (Euler’s number)

x0 = the x-value of the sigmoid’s midpoint

k = the steepness of the curve (the logistic growth rate)
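The parameterised curve can be sketched directly from this definition (parameter defaults are mine):

```python
import math

def logistic(x: float, L: float = 1.0, k: float = 1.0, x0: float = 0.0) -> float:
    """General logistic curve: L / (1 + e^(-k * (x - x0)))."""
    return L / (1.0 + math.exp(-k * (x - x0)))

# With L=1, k=1, x0=0 this reduces to the standard sigmoid.
print(logistic(0.0))                          # 0.5
# At the midpoint x0, the curve always sits at half its maximum L.
print(logistic(2.0, L=10.0, k=4.0, x0=2.0))   # 5.0
```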

The logistic function plays an important role in many applications, including:

Ecology: modelling population growth and time-varying carrying capacity

Statistics and machine learning: logistic regression and neural networks

Medicine: modelling the growth of tumours

Agriculture: modelling crop response

Example problem 1:

How many years will it take for a bacteria population to reach 9000, if its growth is modelled by

P(t) = 10000 / (1 + e^(-0.12(t - 20))), where t is in years?

Solution:

According to the given model, setting P(t) = 9000:

9000 = 10000 / (1 + e^(-0.12(t - 20)))

1 + e^(-0.12(t - 20)) = 10/9, so e^(-0.12(t - 20)) = 1/9 ≈ 0.111

Taking the logarithm on both sides,

-0.12(t - 20) = ln(0.111)

t = -ln(0.111)/0.12 + 20

On simplifying,

t ≈ 38.31 years
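Under the logistic model P(t) = 10000 / (1 + e^(-0.12(t - 20))), which is consistent with the solution steps shown (e^(-0.12(t-20)) = 1/9 ≈ 0.111), the answer can be checked in a few lines:

```python
import math

def population(t: float) -> float:
    """Logistic growth model consistent with the worked solution."""
    return 10000.0 / (1.0 + math.exp(-0.12 * (t - 20.0)))

# Solve P(t) = 9000 algebraically: e^(-0.12(t-20)) = 1/9, so
t = 20.0 - math.log(1.0 / 9.0) / 0.12
print(round(t, 2))           # 38.31
print(round(population(t)))  # 9000
```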


2. Hyperbolic Tangent Function

The hyperbolic tangent function is another commonly used sigmoid function. It maps any real-valued input to the range between -1 and 1. Here is the mathematical definition of the hyperbolic tangent function:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
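A quick sketch of this definition, checked against the standard library's implementation:

```python
import math

def tanh_manual(x: float) -> float:
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x); output lies in (-1, 1)."""
    ex, emx = math.exp(x), math.exp(-x)
    return (ex - emx) / (ex + emx)

print(tanh_manual(0.0))                                # 0.0
print(math.isclose(tanh_manual(1.5), math.tanh(1.5)))  # True
```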

3. Arctangent Function

This is yet another type of sigmoid function. The arctangent function is the inverse of the tangent function. It maps any real-valued input to the range -π/2 to π/2. This is the mathematical definition of the arctangent function:

y = arctan(x), i.e. the angle y in (-π/2, π/2) such that tan(y) = x
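In Python this function is available as math.atan; a brief sketch of its sigmoid-like behaviour:

```python
import math

# math.atan maps any real input into (-pi/2, pi/2).
print(math.atan(0.0))  # 0.0
print(math.atan(1.0))  # ~0.7854 (pi/4)
# The output approaches +-pi/2 as x -> +-infinity.
print(math.atan(1e9))  # ~1.5708
```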

4. Gudermannian Function

This type of sigmoid function is related to the hyperbolic tangent function (tanh) in the same way that the arctangent function is related to the tangent function. It is generally applied in signal processing, mathematical physics, and communication theory. This is the mathematical definition of the Gudermannian function:

gd(x) = 2 arctan(tanh(x/2))
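A sketch of this definition, also checking the equivalent closed form gd(x) = arctan(sinh(x)):

```python
import math

def gd(x: float) -> float:
    """Gudermannian function: gd(x) = 2*atan(tanh(x/2)); range (-pi/2, pi/2)."""
    return 2.0 * math.atan(math.tanh(x / 2.0))

print(gd(0.0))                                          # 0.0
print(math.isclose(gd(2.0), math.atan(math.sinh(2.0))))  # True
```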

5. Error Function

The error function (Gauss error function) is used in probability theory and statistics for describing the probability distribution of a Gaussian random variable. It can also perform well as an activation in pattern recognition and classification algorithms. This is the mathematical definition of the error function:

erf(x) = (2/√π) ∫₀ˣ e^(-t²) dt
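Python ships this function in the standard library as math.erf; a brief sketch of its sigmoid-like shape:

```python
import math

# erf maps the real line into (-1, 1), with erf(0) = 0.
print(math.erf(0.0))  # 0.0
print(math.erf(3.0))  # ~0.99998, approaching 1
# Odd symmetry: erf(-x) = -erf(x)
print(round(math.erf(-1.0) + math.erf(1.0), 12))  # 0.0
```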

6. Smoothstep Function

This function is most often used in computer graphics, animation, and other areas of computer science, where it helps to create smooth transitions between colours, textures, and other visual elements. This is the mathematical definition of the smoothstep function:

S(x) = 3x² - 2x³, with x clamped to the interval [0, 1]
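A sketch of the classic clamped form:

```python
def smoothstep(x: float) -> float:
    """Classic smoothstep: clamps x to [0, 1], then applies 3x^2 - 2x^3."""
    x = max(0.0, min(1.0, x))
    return x * x * (3.0 - 2.0 * x)

print(smoothstep(0.0))   # 0.0
print(smoothstep(0.5))   # 0.5
print(smoothstep(1.0))   # 1.0
print(smoothstep(0.25))  # 0.15625 (eases in slowly near the edges)
```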

7. Generalised Logistic Function


This type of sigmoid function helps to model the growth of populations, the spread of diseases, and other processes in biology, ecology, and economics. A practical example is in marketing, to model the growth of sales over a certain period, or in economics, to chart the adoption of new technologies. This is the mathematical definition of the generalised logistic (Richards) function:

Y(t) = A + (K - A) / (1 + Q e^(-Bt))^(1/ν)

where A is the lower asymptote, K the upper asymptote, B the growth rate, Q an offset, and ν a shape parameter.
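A sketch of the Richards curve as defined above (parameter defaults are mine, chosen so the curve reduces to the standard sigmoid):

```python
import math

def richards(t: float, A: float = 0.0, K: float = 1.0, B: float = 1.0,
             Q: float = 1.0, nu: float = 1.0) -> float:
    """Generalised logistic (Richards) curve:
    Y(t) = A + (K - A) / (1 + Q * e^(-B*t))^(1/nu)
    A: lower asymptote, K: upper asymptote, B: growth rate,
    Q: offset, nu: shape parameter."""
    return A + (K - A) / (1.0 + Q * math.exp(-B * t)) ** (1.0 / nu)

# With the defaults this is exactly the standard sigmoid.
print(richards(0.0))   # 0.5
print(richards(10.0))  # ~0.99995, approaching the upper asymptote K
```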

What are the applications of Sigmoid function?

1. Artificial neural networks - as an activation function for neurons

2. Logistic regression - for modelling the probability of a binary outcome

3. Image processing - to adjust intensity values, enhancing the contrast between dark and light regions

4. Economics and finance - for representing the rate of adoption of new technology by consumers

5. Biological systems - to represent the rate of change of system activation over time

What is the history of the sigmoid function?

1798 - The English cleric and economist Thomas Robert Malthus published An Essay on the Principle of Population. He asserted that the population was increasing in a geometric progression (doubling every 25 years) while food supplies were increasing arithmetically, and claimed that this difference would cause widespread famine.

1830 - Pierre François Verhulst, a Belgian mathematician, wanted to account for the fact that a population's growth is ultimately self-limiting: it does not increase exponentially forever. To model the slowing of a population's growth that occurs as the population begins to exhaust its resources, Verhulst chose the logistic function as a logical adjustment to the simple exponential model.

1943 - Warren McCulloch and Walter Pitts developed an artificial neural network model using a hard cutoff as an activation function. In this model, a neuron outputs 1 or 0 depending on whether its input is above or below a threshold.

1972 - The biologists Hugh Wilson and Jack Cowan at the University of
Chicago were trying to model biological neurons computationally and
ended up publishing the Wilson–Cowan model, in which a neuron sends a
signal to another neuron if it receives a signal greater than an activation
potential. Wilson and Cowan employed the logistic sigmoid function to
model the activation of a neuron as a function of a stimulus.

1998 - Yann LeCun selected the hyperbolic tangent as the activation function in his groundbreaking convolutional neural network LeNet, the first CNN able to recognise handwritten digits to a practical level of accuracy.

Recently, ANNs have shifted away from sigmoid functions towards the ReLU function, because all variants of the sigmoid function are computationally expensive to evaluate, while ReLU provides the nonlinearity needed to exploit the depth of the network and is very fast to compute.
