Sigmoid Function as Neural Network Activation Function

The sigmoid function (also known as the logistic function) is a common choice of activation function in neural networks, largely because its derivative is easy to compute. It maps any real input to an output in the range (0, 1), but the output only changes meaningfully for inputs between roughly -5 and +5. Outside this range the function saturates, and the output stays very close to 0 or 1. In this post, we’ll walk through the derivation of its derivative.

The sigmoid function is defined as follows:

f(x) = 1 / (1 + e^{-x})
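
As a quick illustration of the definition and of the saturation outside roughly [-5, +5], here is a minimal Python sketch. The function name `sigmoid` and the sample inputs are chosen for this example only. It uses plain `math.exp`, which is fine for moderate inputs but overflows for very large negative x.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: f(x) = 1 / (1 + e^(-x))."""
    # Direct translation of the formula; math.exp(-x) can overflow
    # for very large negative x, which this sketch does not guard against.
    return 1.0 / (1.0 + math.exp(-x))

# Sample inputs (illustrative choice) inside and outside [-5, +5].
for x in (-10, -5, 0, 5, 10):
    print(f"x = {x:>3}  f(x) = {sigmoid(x):.5f}")
```

At x = -5 and x = 5 the output is already about 0.0067 and 0.9933, so moving further out changes it very little.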

The function can also be written with a negative exponent, which moves the denominator into the numerator:

f(x) = 1 · (1 + e^{-x})^{-1} = (1 + e^{-x})^{-1}

Derivative of the sigmoid function

d f(x) / dx = (-1) · (1 + e^{-x})^{-1-1} · d(1 + e^{-x})/dx

d f(x) / dx = (-1) · (1 + e^{-x})^{-2} · e^{-x} · d(-x)/dx

d f(x) / dx = (-1) · (1 + e^{-x})^{-2} · e^{-x} · (-1)

d f(x) / dx = e^{-x} / (1 + e^{-x})^2
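
One way to sanity-check this result is to compare it against a numerical derivative. The sketch below is only an illustration: the helper names (`sigmoid_derivative`, `central_difference`), the step size `h` and the test points are assumptions made for this example.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x: float) -> float:
    # Closed form from the derivation: e^(-x) / (1 + e^(-x))^2
    return math.exp(-x) / (1.0 + math.exp(-x)) ** 2

def central_difference(f, x: float, h: float = 1e-5) -> float:
    # Numerical approximation of f'(x): (f(x + h) - f(x - h)) / (2h)
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Test points are arbitrary; the two columns should agree closely.
for x in (-2.0, 0.0, 1.5):
    analytic = sigmoid_derivative(x)
    numeric = central_difference(sigmoid, x)
    print(f"x = {x:>4}  analytic = {analytic:.8f}  numeric = {numeric:.8f}")
```

If the chain rule steps were applied correctly, the analytic and numeric values differ only by a tiny approximation error.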

That’s the derivative of the sigmoid function. However, it can be expressed in a simpler form. Let’s add and subtract 1 in the numerator, which leaves the value unchanged.

d f(x) / dx = (e^{-x} + 1 - 1) / (1 + e^{-x})^2

d f(x) / dx = [(1 + e^{-x}) / (1 + e^{-x})^2] - [1 / (1 + e^{-x})^2]

d f(x) / dx = [1 / (1 + e^{-x})] - [1 / (1 + e^{-x})^2]

d f(x) / dx = [1 / (1 + e^{-x})] - [1 / (1 + e^{-x})] · [1 / (1 + e^{-x})]

d f(x) / dx = [1 / (1 + e^{-x})] · [1 - 1 / (1 + e^{-x})]

Substituting f(x) for 1 / (1 + e^{-x}) in the equation above gives:

d f(x) / dx = f(x) · (1 - f(x))
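
This form is convenient in practice because the derivative can be computed from the sigmoid output alone, without evaluating the exponential again. The following sketch checks that the product form matches the quotient form e^(-x) / (1 + e^(-x))^2. The helper name `sigmoid_derivative_from_output` and the sample inputs are assumptions made for this example.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative_from_output(y: float) -> float:
    # y is assumed to be an already computed sigmoid output f(x),
    # so the derivative is simply f(x) * (1 - f(x)).
    return y * (1.0 - y)

for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    y = sigmoid(x)
    quotient_form = math.exp(-x) / (1.0 + math.exp(-x)) ** 2
    product_form = sigmoid_derivative_from_output(y)
    print(f"x = {x:>4}  quotient = {quotient_form:.8f}  product = {product_form:.8f}")
```

The derivative peaks at x = 0, where f(0) = 0.5 and the derivative equals 0.25, and it shrinks toward 0 as the input moves into the saturated regions.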