Activation Function Sigmoid
“The S-shaped function”
What is Sigmoid?
The sigmoid function, also called the logistic function, is traditionally a very popular activation function for neural networks.
What does Sigmoid do?
Sigmoid takes a real value as input and transforms it into an output value between 0 and 1. Inputs that are much larger than 1 are transformed to values very close to 1; similarly, inputs much smaller than 0 are snapped toward 0.
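For instance, a minimal sketch of this squashing behavior (assuming NumPy; the formula itself is defined formally in a later section):

import numpy as np

def sigmoid(x):
    # 1 / (1 + e^(-x)), defined formally below
    return 1 / (1 + np.exp(-x))

print(sigmoid(-8))   # ~0.00034 -> large negative inputs snap toward 0
print(sigmoid(0))    # 0.5      -> the midpoint
print(sigmoid(8))    # ~0.99966 -> large positive inputs snap toward 1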
Who invented Sigmoid?
The sigmoid function was introduced in a series of three papers published between 1838 and 1847 by Pierre François Verhulst, who devised it as a model of population growth by adjusting the exponential growth model, under the guidance of Adolphe Quetelet.
source: Wikipedia
What does sigmoid look like?
The shape of the function for all possible inputs is an S-curve, rising from 0.0 up through 0.5 to 1.0.
The sigmoid function:
S(z) = \frac{1}{1 + e^{-z}}
where S denotes the sigmoid function and z is the input for which it is evaluated.
The derivative of the sigmoid function:
S'(z) = S(z)\,(1 - S(z))
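This identity follows directly from the chain rule; a short derivation using the definition above:

S'(z) = \frac{d}{dz}\,(1 + e^{-z})^{-1} = \frac{e^{-z}}{(1 + e^{-z})^2} = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}} = S(z)\,(1 - S(z))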
The Implementation
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 100)   # x-axis from -10 to 10
z = 1 / (1 + np.exp(-x))        # sigmoid function formula
y = z * (1 - z)                 # derivative of the sigmoid function

plt.plot(x, z, label="Sigmoid(x)")
plt.plot(x, y, label="Derivative of Sigmoid(x)")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
The Output:
[Plots: Sigmoid Function and Derivative of Sigmoid Function]
Why use sigmoid?
For a long time, up through the early 1990s, it was the default activation function used in neural networks. It is easy to work with and has all the desirable properties of an activation function. Meaning, it:
- is non-linear.
- is continuously differentiable.
- is monotonic.
- has a fixed output range.
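A quick numerical check of these properties (a minimal sketch using NumPy):

import numpy as np

x = np.linspace(-30, 30, 601)
s = 1 / (1 + np.exp(-x))        # sigmoid outputs
ds = s * (1 - s)                # derivative values

print(np.all((s > 0) & (s < 1)))   # True: fixed output range (0, 1)
print(np.all(np.diff(s) >= 0))     # True: monotonically increasing
print(ds.max())                    # 0.25, the derivative's maximum, at x = 0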
What are the limitations?
- It gives rise to the vanishing gradient problem.
- Toward either end of the sigmoid curve, the Y values respond very little to changes in X.
- Its output is not zero-centered, which makes gradient updates go too far in different directions.
- It makes optimization harder.
- The network sometimes refuses to learn further or learns drastically slowly.
- Sigmoids saturate and kill gradients (see the sketch after this list).
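To see the saturation problem concretely, a minimal sketch (assuming NumPy; the helper name is hypothetical):

import numpy as np

def sigmoid_grad(x):
    # derivative of the sigmoid: S(x) * (1 - S(x))
    s = 1 / (1 + np.exp(-x))
    return s * (1 - s)

for v in [0.0, 2.5, 5.0, 10.0]:
    print(f"x = {v:5.1f} -> gradient = {sigmoid_grad(v):.2e}")
# x =   0.0 -> gradient = 2.50e-01  (the maximum is only 0.25)
# x =   2.5 -> gradient = 7.01e-02
# x =   5.0 -> gradient = 6.65e-03
# x =  10.0 -> gradient = 4.54e-05  (saturated: the gradient has all but vanished)

# Gradients multiply across layers during backpropagation, so even the
# best case shrinks exponentially with depth: 0.25 ** 10 is about 9.5e-07.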
Learn more about sigmoid on:
Activation functions on Machine Learning Glossary (ml-cheatsheet.readthedocs.io).
Sigmoid function: Wikipedia