Neural Networks | A beginner's guide

Neural networks are artificial systems inspired by biological neural networks. These systems learn to perform tasks by being exposed to various datasets and examples, without any task-specific rules: the system derives identifying characteristics from the data it is passed rather than relying on a pre-programmed understanding of the datasets. Neural networks are based on computational models for threshold logic, which is a combination of algorithms and mathematics. They draw either on the study of the brain or on the application of neural networks to artificial intelligence, and the work has also led to improvements in finite automata theory.

The components of a typical neural network are neurons, connections (known as synapses), weights, biases, a propagation function, and a learning rule. A neuron $j$ receives an input $p_j(t)$ from its predecessor neurons and has an activation $a_j(t)$, a threshold $\theta_j$, an activation function $f$, and an output function $f_{out}$. Connections carry the weights and biases that govern how neuron $i$ transfers output to neuron $j$. The propagation function computes a neuron's input as the weighted sum of the outputs of its predecessors, $p_j(t) = \sum_i o_i(t)\, w_{ij}$.

Learning in a neural network refers to the adjustment of its free parameters, i.e. the weights and biases: the learning rule modifies the weights and thresholds of the variables in the network. The learning process is basically a sequence of three events:

  1. The neural network is stimulated by a new environment.
  2. The free parameters of the neural network are changed as a result of this stimulation.
  3. The neural network then responds in a new way to the environment because of the changes in its free parameters.
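
To make the component definitions above concrete, here is a minimal sketch of a single artificial neuron in NumPy. The sigmoid activation, the weight and bias values, and the helper names are illustrative assumptions, not part of any particular library.

Python3

import numpy as np

def sigmoid(z):
    # Activation function f: squashes the net input into (0, 1)
    return 1 / (1 + np.exp(-z))

def neuron_output(activations, weights, bias):
    # Propagation function: weighted sum of predecessor outputs plus a bias
    net_input = np.dot(weights, activations) + bias
    # Output function: here simply the activation value itself
    return sigmoid(net_input)

a = np.array([0.5, 0.1, 0.9])   # predecessor activations a_i(t)
w = np.array([0.4, -0.2, 0.7])  # connection weights w_ij
b = -0.1                        # bias, acting as a learned threshold

print(neuron_output(a, w, b))   # prints roughly 0.67

Here the bias plays the role of the threshold $\theta_j$: shifting the net input shifts the point at which the neuron starts to fire.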

Supervised vs Unsupervised Learning: Neural networks learn via supervised learning. Supervised machine learning involves an input variable x and a corresponding desired output variable y. Here we introduce the concept of a teacher who has knowledge about the environment, so the teacher knows the correct input-output pairs while the neural network does not. The input is exposed to both the teacher and the neural network, and the neural network generates an output based on the input. This output is then compared with the desired output known to the teacher, and an error signal is produced. The free parameters of the network are adjusted step by step so that the error is minimized, and learning stops when the algorithm reaches an acceptable level of performance. Unsupervised machine learning has input data X and no corresponding output variables; the goal is to model the underlying structure of the data in order to understand it better. The keywords for supervised machine learning are classification and regression, while for unsupervised machine learning they are clustering and association.
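
The teacher-and-error-signal loop described above can be sketched in a few lines of code. Below is a minimal, illustrative example of error-correction learning for a single linear neuron (the delta rule); the dataset, learning rate, and epoch count are assumptions chosen for the sketch.

Python3

import numpy as np

# The teacher's knowledge of the environment: inputs X with desired outputs d
X = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [1.0, 1.0]])
d = np.array([1.0, 1.0, 0.0])

w = np.zeros(2)  # free parameters of the network, adjusted step by step
lr = 0.1         # learning rate (assumed value)

for epoch in range(50):
    for x, target in zip(X, d):
        y = np.dot(w, x)      # the network's output for this input
        error = target - y    # error signal: desired output minus actual
        w += lr * error * x   # delta rule: adjust the parameters to shrink the error

print(w)  # weights after training

Learning stops here after a fixed number of epochs; in practice one would stop once the error falls to an acceptable level.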

Evolution of Neural Networks: Hebbian learning deals with neural plasticity; it is unsupervised and is related to long-term potentiation. Early Hebbian-style networks were applied to pattern recognition, but they could not handle the exclusive-or problem or if-then rules. Backpropagation solved the exclusive-or issue that Hebbian learning could not handle, and it made multi-layer networks feasible and efficient: when an error is found, it is corrected at each layer by modifying the weights at each node. This period also saw the development of support vector machines, linear classifiers, and max-pooling. The vanishing gradient problem affects many-layered feedforward networks and recurrent neural networks trained with backpropagation; overcoming it in such deep architectures is what made deep learning practical. Hardware-based designs are used for biophysical simulation and neuromorphic computing, and large-scale component analysis and convolution have created a new class of analog neural computing. Convolutional networks alternate convolutional layers and max-pooling layers, followed by connected layers (fully or sparsely connected) and a final classification layer; the learning is done without unsupervised pre-training, and each filter is equivalent to a weight vector that has to be trained. Shift invariance has to be guaranteed when dealing with small and large neural networks, an issue being addressed in Developmental Networks. Some of the other learning techniques involve error-correction learning, memory-based learning, and competitive learning.
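
As a small illustration of the convolution-and-pooling pattern described above, the sketch below slides a single filter (one trainable weight vector) across a one-dimensional input and then max-pools the responses. The filter values and pooling width are arbitrary assumptions.

Python3

import numpy as np

signal = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 1.0])
filt = np.array([0.5, 1.0, 0.5])  # one filter = one trainable weight vector

# Convolutional layer: the filter's response at each position of the input
conv = np.array([np.dot(signal[i:i + len(filt)], filt)
                 for i in range(len(signal) - len(filt) + 1)])

# Max-pooling layer: keep only the strongest response in each window of two
pooled = np.array([conv[i:i + 2].max() for i in range(0, len(conv) - 1, 2)])

print(conv)    # filter responses at every position
print(pooled)  # downsampled feature map

In a real convolutional network, many such filters are stacked, their responses pass through an activation function, and the convolution/pooling pairs repeat before the final classification layer.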

Types of Neural Networks

There are seven types of neural networks that can be used.

  • The first is a multilayer perceptron which has three or more layers and uses a nonlinear activation function.
  • The second is the convolutional neural network that uses a variation of multilayer perceptrons.
  • The third is the recursive neural network that uses weights to make structured predictions.
  • The fourth is a recurrent neural network that makes connections between the neurons in a directed cycle. The long short-term memory network builds on the recurrent architecture, adding gates that control what the network remembers and forgets across time steps.
  • The final two are sequence-to-sequence models, which use two recurrent networks, and shallow neural networks, which produce a vector space from a body of text. These neural networks are applications of the basic neural network demonstrated below.

For the example, the neural network works with three arrays: a matrix of attributes X, a vector of classes y, and a vector of weights W. The code uses 100 iterations to fit the attributes to the classes: in each iteration, predictions are generated from the inputs and weights, compared against the classes, and the weights are updated via backpropagation; the final predictions are printed at the end.

Examples:

Input :
X { 2.6, 3.1, 3.0,
    3.4, 2.1, 2.5,
    2.6, 1.3, 4.9, 
    0.1, 0.3, 2.3};
y {1, 1, 1};
W {0.3, 0.4, 0.6}; 

Output :
0.990628 
0.984596 
0.994117 

Below is the implementation: 

Python3

import numpy as np

def sigmoid(z):
    # Logistic activation function, squashing values into (0, 1)
    return 1 / (1 + np.exp(-z))

# Attribute matrix X and desired outputs y (as a column vector)
X = np.array([[1, 2, 3],
              [3, 4, 1],
              [2, 5, 3]])

y = np.array([[.5, .3, .2]]).T

# First-layer weight (a scalar that becomes a 3x3 matrix after the
# first update) and randomly initialised second-layer weight matrix
sigm = 2
delt = np.random.random((3, 3)) - 1

for j in range(100):

    # Forward pass through both layers
    l1 = sigmoid(np.dot(X, sigm))
    l2 = sigmoid(np.dot(l1, delt))

    # Output-layer error, scaled by the sigmoid derivative l2 * (1 - l2)
    m1 = (y - l2) * (l2 * (1 - l2))

    # Error propagated back to the first layer
    m2 = m1.dot(delt.T) * (l1 * (1 - l1))

    # Update both sets of weights
    delt = delt + l1.T.dot(m1)
    sigm = sigm + X.T.dot(m2)

# Final predictions after training
print(sigmoid(np.dot(X, sigm)))

Output:

[[ 0.99999294  0.99999379  0.99999353]
 [ 0.99999987  0.99999989  0.99999988]
 [ 1.          1.          1.        ]]

Limitations: This neural network is a supervised model; it does not handle unsupervised machine learning and does not cluster or associate data. It also lacks the level of accuracy found in more computationally expensive neural networks. The implementation is based on Andrew Trask's neural network, and it does not work with matrices whose dimensions do not line up, i.e. where the number of rows and columns of X does not match the number of rows of y and W. The next steps would be to create an unsupervised neural network and to increase the computational power of the supervised model with more iterations and threading.
