ML - Swish Function by Google in Keras
Last Updated: 26 May, 2020
ReLU has long been the default activation function in the deep learning community, but in 2017 the Google Brain team proposed Swish as an alternative. The authors' research shows that simply substituting ReLU units with Swish units improves classification accuracy on ImageNet by 0.6% for Inception-ResNet-v2; hence, it outperforms ReLU in many deep neural networks.
Swish Activation Function:
- Mathematical formula: Y = X * sigmoid(X)
- Bounded below but unbounded above: Y approaches a constant value as X approaches negative infinity, but Y approaches infinity as X approaches infinity.
- Derivative of Swish: Y' = Y + sigmoid(X) * (1 - Y)
- Smooth curve and non-monotonic function (a quick numerical check of the derivative follows this list).
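As a sanity check of the derivative identity above, here is a minimal standalone sketch in NumPy (the helper functions are ours, not part of Keras) that compares it against a finite-difference approximation:
Python3
import numpy as np

# Plain NumPy versions of sigmoid and Swish for a standalone check
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    return x * sigmoid(x)

x = np.linspace(-5, 5, 11)
y = swish(x)

# Analytic derivative from the identity Y' = Y + sigmoid(X) * (1 - Y)
analytic = y + sigmoid(x) * (1 - y)

# Central finite-difference approximation for comparison
h = 1e-6
numeric = (swish(x + h) - swish(x - h)) / (2 * h)

print(np.max(np.abs(analytic - numeric)))  # on the order of 1e-10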
(Figure: Swish vs ReLU activation curves)
Advantages over the ReLU Activation Function:
Being unbounded above is desirable for an activation function, as it avoids saturation regions where gradients are nearly zero; like ReLU, Swish has no upper bound. Being bounded below, on the other hand, can regularize the model to an extent, and functions that approach zero in the limit towards negative infinity are particularly good at this, because large negative inputs are effectively discarded. Swish provides both properties, and in addition it is non-monotonic, which enhances the expressiveness of the input data and the weights to be learnt, as the short demonstration below shows.
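A small standalone sketch (the printed values are approximate) illustrates these properties: large negative inputs are squashed towards zero, large positive inputs pass through almost unchanged, and the curve dips slightly below zero around x ≈ -1.28, which is where the non-monotonicity appears:
Python3
import numpy as np

def swish(x):
    return x / (1.0 + np.exp(-x))  # equivalent to x * sigmoid(x)

xs = np.array([-10.0, -5.0, -1.28, 0.0, 2.0, 10.0])
print(np.round(swish(xs), 4))
# [-0.0005 -0.0335 -0.2784  0.      1.7616  9.9995]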
In the paper, Swish is benchmarked against widely used activation functions such as ReLU, SELU, Leaky ReLU and others, and is reported to match or outperform them on most tasks.
Implementation of the Swish activation function in Keras:
Swish is implemented as a custom function in Keras; once defined, it has to be registered as a custom object under a key of our choice, wrapped in the Activation class.
Code:
Python3
# Our aim is to use "swish" in place of "relu"
# and have Keras understand it
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(64, activation = "relu"))
model.add(Dense(16, activation = "relu"))
Now we will create a custom function named swish, which produces the output according to the mathematical formula of the Swish activation function:
Python3
# Import the sigmoid function from the Keras backend
from keras.backend import sigmoid

def swish(x, beta = 1):
    # beta = 1 gives the standard Swish: x * sigmoid(x)
    return x * sigmoid(beta * x)
Now that we have a custom-designed function that processes its input as the Swish activation, we need to register this custom object with Keras. For this, we pass it in a dictionary whose key is the name we want to call it by and whose value is the function wrapped in the Activation class, which actually builds the activation.
Code:
Python3
# Get the custom objects dictionary and register swish in it
from keras.utils.generic_utils import get_custom_objects
from keras.layers import Activation

# In place of 'swish' you can use any custom key as the name
get_custom_objects().update({'swish': Activation(swish)})
Code: Implementing the custom-designed activation function
Python3
model.add(Dense(64, activation = "swish"))
model.add(Dense(16, activation = "swish"))
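Putting all the pieces together, below is a minimal end-to-end sketch; the input shape, the final output layer, and the random dummy data are our own assumptions for illustration, not part of the original example:
Python3
import numpy as np
from keras.backend import sigmoid
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils.generic_utils import get_custom_objects

def swish(x, beta = 1):
    return x * sigmoid(beta * x)

# Register swish under the key used in the activation argument below
get_custom_objects().update({'swish': Activation(swish)})

model = Sequential()
model.add(Dense(64, activation = "swish", input_shape = (20, )))
model.add(Dense(16, activation = "swish"))
model.add(Dense(1, activation = "sigmoid"))
model.compile(optimizer = "adam", loss = "binary_crossentropy")

# Fit on random dummy data just to confirm everything is wired up
X = np.random.rand(100, 20)
y = np.random.randint(2, size = (100, 1))
model.fit(X, y, epochs = 2, batch_size = 32, verbose = 0)
Note that newer versions of TensorFlow also ship Swish built in as tf.keras.activations.swish, so the custom registration above is mainly needed on older Keras versions or when you want a non-default beta.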