Feed Forward Neural Networks | Intuition on Forward Propagation

Why Neural Networks? How do Neural Networks do what they do? How does Forward Propagation work?

Architecture and working of neural networks

Before we see why neural networks work, it would be appropriate to show what they do. And before understanding the architecture of a neural network, we need to look at what a single neuron does.

fig 1.1: An Artificial Neuron
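To make the figure concrete, here is a minimal sketch of a single neuron in NumPy. The tanh activation and the input values are assumptions for illustration; any activation function could take its place:

```python
import numpy as np

def neuron(x, w, b, activation=np.tanh):
    """A single artificial neuron: a weighted sum of the inputs plus a bias,
    passed through a non-linear activation function."""
    z = np.dot(w, x) + b      # linear combination, just like linear regression
    return activation(z)      # non-linearity applied at the end

# Example: a neuron with 3 inputs (hypothetical values)
x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.4, 0.7, -0.2])   # one weight per input
b = 0.1                          # bias
print(neuron(x, w, b))
```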

fig 1.2: A Neural Network

Forward propagation example

Let us consider the neural network in fig 1.2 and walk through how forward propagation works with it. There are 6 neurons in the input layer, which means the network takes 6 inputs.
Note: For calculation purposes, I am not including the biases. But if biases were to be included, there would simply be an extra input I0 whose value is always 1, and an extra row w01, w02, …, w04 at the beginning of the weight matrix. A code sketch of the full forward pass follows the figures below.

fig 2.1

fig 2.2

fig 2.3
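As a rough NumPy sketch of the forward pass above: the 6 inputs come from fig 1.2, but the hidden-layer size of 4 (suggested by the w01…w04 row in the note), the single output, and the sigmoid activation are assumptions for illustration. The note's "extra input I0 = 1" trick is replaced here by explicit bias vectors:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Assumed shapes: 6 inputs -> 4 hidden neurons -> 1 output
x  = rng.normal(size=6)        # input vector, one value per input neuron
W1 = rng.normal(size=(4, 6))   # each row is one hidden neuron's weight vector
b1 = rng.normal(size=4)        # biases (the "I0 = 1" row from the note)
W2 = rng.normal(size=(1, 4))
b2 = rng.normal(size=1)

h = sigmoid(W1 @ x + b1)       # hidden layer: weighted sums, then activation
y = sigmoid(W2 @ h + b2)       # output layer repeats the same pattern
print(y)
```

The weights are random here because, as we will see in the conclusion, they start out random and are only later learned.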

Why does this approach work?

We have already seen that what each neuron in the network does is not so different from linear regression. In addition, the neuron applies an activation function at the end, and each neuron has its own weight vector. But why does this work?

Non-Linearity is the key

Before we go further, we need to understand the power of non-linearity. When we add two or more linear objects, such as lines, planes, or hyperplanes, the result is also a linear object of the same kind. No matter in what proportions we add these linear objects, we still get a linear object.
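We can check this numerically (a small NumPy demonstration of my own, with arbitrary example coefficients): take two lines, combine them in any proportions, and the result is still a line.

```python
import numpy as np

x  = np.linspace(-5, 5, 100)
f1 = 2.0 * x + 1.0            # a line
f2 = -0.5 * x + 3.0           # another line
g  = 0.3 * f1 + 1.7 * f2      # an arbitrary weighted sum of the two

# g is still a line: its second differences are (numerically) zero
print(np.allclose(np.diff(g, n=2), 0.0))   # True
```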

What if neural networks didn't use activation functions?

If neural networks didn't use an activation function, the whole network would just be one big linear unit, which could easily be replaced by a single linear regression model.
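A quick sketch makes this concrete (the matrices and the ReLU are my own illustrative choices): stacking two linear layers with no activation is exactly equivalent to one linear layer, while inserting a non-linearity between them breaks that equivalence.

```python
import numpy as np

# Two layers with no activation collapse into a single linear map:
W1 = np.array([[1.0, -1.0],
               [2.0,  0.0]])
W2 = np.array([[1.0,  1.0]])
x  = np.array([1.0,  3.0])

no_activation = W2 @ (W1 @ x)          # two "layers", no non-linearity
single_layer  = (W2 @ W1) @ x          # one equivalent layer
print(np.allclose(no_activation, single_layer))   # True

# Insert a ReLU between the layers and the equivalence breaks:
relu = lambda z: np.maximum(z, 0.0)
with_relu = W2 @ relu(W1 @ x)
print(with_relu, single_layer)         # [2.] vs [0.] -- no longer equal
```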

Conclusion

We saw how neural networks calculate their outputs and why that method works. To put it simply, the main reason neural networks are able to learn complex relationships is that at each and every layer we introduce non-linearity and combine the outputs of the previous layer in different proportions to get the desired result; this result again goes through an activation function, and the same process is repeated to further shape the resultant function. All the weights and biases in the network are important, and they can be adjusted in certain ways to approximate the target relationship. Even though the weights assigned to each neuron are random initially, they will be learned through a special algorithm called backpropagation, which we will see in the next blog.