6 Types of Neural Networks Every Data Scientist Must Know

The most common types of Neural Networks and their applications

Neural networks are powerful deep learning models capable of learning complex patterns from large amounts of data. There are many different types of neural networks, and they help us in a variety of everyday tasks, from recommending movies or music to helping us buy groceries online.

Similar to the way airplanes were inspired by birds, neural networks (NNs) are inspired by biological neural networks. Though the principles are the same, the process and the structures can be very different. This is as true for birds and planes as it is for biological neural networks and deep learning neural networks.

To help put it into perspective, let’s look briefly at the structure of a biological neuron. Figure 1 shows the anatomy of a single neuron. The central part is called the cell body, where the nucleus resides. Branching connections called dendrites carry incoming stimuli to the cell body, and a long connection called the axon carries the output to other neurons. The strength of these connections determines how strongly a stimulus is passed along. Many such interconnected neurons form a biological neural network.

Figure 1: Anatomy of Single Neuron (Image by author)

This same structure is visible in deep learning neural networks. Inputs arrive over weighted edges (similar to dendrites), are combined, and are passed through an activation function (similar to the cell body) to produce an output, which can in turn feed into further activation functions. Many of these activation functions can be stacked up, and each such group is called a layer. The layers between the input layer and the output layer are called hidden layers.
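
To make this concrete, here is a minimal Python sketch of a single artificial neuron: a weighted sum of its inputs passed through a sigmoid activation. The input values and weights are illustrative.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs (the "dendrites"), then an
    # activation function (the "cell body")
    return sigmoid(np.dot(weights, inputs) + bias)

x = np.array([0.5, -1.2, 3.0])   # stimulus arriving at the neuron
w = np.array([0.4, 0.7, -0.2])   # connection strengths (weights)
print(neuron(x, w, bias=0.1))    # output passed on to the next layer
```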

6 Essential Types of Neural Networks

Now that we have a picture of how neural networks work, let’s look at the various types and functions of the neural networks used in deep learning. Note that each type of artificial neural network is tailored to certain tasks.

We’ll look at the most common types of neural networks, listed below:

  1. Perceptron
  2. Multi-layer Perceptron
  3. Convolutional Neural Networks
  4. Recurrent Neural Networks
  5. Long Short Term Memory Networks
  6. Generative Adversarial Networks

1. Perceptron

The perceptron is the simplest neural network structure. This model, also known as a single-layer neural network, contains only two layers:

  • The Input Layer
  • The Output Layer

There are no hidden layers here. A perceptron multiplies each input by a weight and sums the results. This weighted sum is passed through an activation function to generate the output.

Figure 2: Single neuron neural network (Image by author)

Due to its simple architecture, the perceptron cannot be used for complex tasks. It can, however, model linearly separable logic gates such as AND, OR, and NOT. Famously, a single perceptron cannot represent XOR, which is not linearly separable; that requires the multilayer networks discussed next.
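
As an illustration, the short sketch below hard-codes perceptron weights that realize the AND gate (in practice, the weights would be learned from labeled examples):

```python
import numpy as np

def perceptron(inputs, weights, bias):
    # Weighted sum followed by a step activation: fire (1) or not (0)
    return int(np.dot(weights, inputs) + bias > 0)

# Hand-picked weights that realize the AND gate
w, b = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x), w, b))  # prints 0, 0, 0, 1
```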

Applications

Perceptrons are used for linear, binary classification. They are also the building blocks of multilayer perceptrons, which we’ll look at next.

2. Multi-layer Perceptron

Multilayer perceptrons (MLPs), or feedforward neural networks, are usually fully connected networks. In other words, each neuron in one layer is connected to all neurons in the adjacent layers. Hence, an MLP has higher processing power than a perceptron. However, the “fully-connectedness” of these networks makes them prone to overfitting. Typical ways to reduce overfitting include early stopping, adding dropout layers, and adding regularization terms.
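
As a minimal sketch (using PyTorch; the layer sizes and input dimension are placeholders), here is how two of those remedies, dropout and L2 regularization via weight decay, might be wired into a small MLP:

```python
import torch
import torch.nn as nn

# A small fully connected network; layer sizes are illustrative.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Dropout(p=0.5),               # randomly silences units to curb overfitting
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),  # output for binary classification
)

# weight_decay adds an L2 regularization term during optimization.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.BCELoss()
# Early stopping would live in the training loop: monitor the validation
# loss each epoch and halt when it stops improving.
```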

Figure 3: Architecture of Multi-layered perceptron (Image by author)

Applications

MLPs are widely used in a variety of areas. They’re common in data compression for social networks, speech recognition, handwritten character recognition, computer vision applications, and data prediction systems.

3. Convolutional Neural Networks

Humans identify objects using neurons in the eyes which detect edges, shapes, depth, and motion. One of the most important types of neural networks in computer vision, convolutional neural networks (CNNs) are inspired by the visual cortex of the brain and are used for visual tasks like object detection. The convolution layer of a CNN is what sets it apart from other neural networks. This layer slides a filter over the input and, at each position, computes a dot product: element-wise multiplication followed by a sum.

In the initial phases of a CNN, the filters are randomized and do not produce any useful results. Guided by a loss function, the filters are adjusted over many iterations, and the network gets better at its task, such as detecting object edges. Though they often require a large amount of training data, CNNs are applicable to a wide range of image tasks and even some language tasks.
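
To make the convolution operation concrete, here is a minimal (and deliberately unoptimized) NumPy sketch with no padding or stride, applying a classic vertical-edge filter:

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; at each position, take the
    # element-wise product with the patch underneath and sum it up.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(5, 5)          # a toy 5x5 grayscale image
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])  # responds to vertical edges
print(conv2d(image, edge_filter))     # a 3x3 feature map
```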

Figure 4: Convolution operation using a filter (Image by author)

Applications

Because CNNs were inspired by the visual cortex, they are widely used in computer vision applications. These include facial recognition, face detection, object recognition, handwritten letter recognition, and the detection of tumors in medical diagnosis.

4. Recurrent Neural Networks

A book often consists of a sequence of chapters. When we read a particular chapter, we don’t try to understand it in isolation, but rather in connection with previous chapters. Similarly, a machine learning model needs to understand new text in the context of the text it has already seen.

Traditional machine learning models cannot do this because they process each input independently, with no memory of previous inputs. Recurrent neural networks (commonly called RNNs) are a type of neural network that retains such a memory, making them useful for applications that require the use of past data. Let’s take a closer look at RNNs below.

Figure 5: Working of a basic RNN (Image by author)

Recurrent neural networks are networks designed to interpret temporal or sequential information. RNNs use other data points in a sequence to make better predictions. They do this by feeding the activations from previous steps in the sequence back into the network, so that earlier inputs influence the current output.

An RNN has a repeating module that takes input from the previous stage and gives its output as input to the next stage.
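
A minimal NumPy sketch of this repeating module (the dimensions and random weights are illustrative): the same cell, with the same weights, is applied at every time step, carrying a hidden state forward.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # The new hidden state mixes the current input with the previous
    # hidden state, so earlier steps influence later outputs.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

input_dim, hidden_dim = 4, 3
W_x = 0.1 * np.random.randn(hidden_dim, input_dim)
W_h = 0.1 * np.random.randn(hidden_dim, hidden_dim)
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                    # initial hidden state
for x_t in np.random.randn(5, input_dim):   # a toy sequence of 5 steps
    h = rnn_step(x_t, h, W_x, W_h, b)       # same weights reused each step
print(h)
```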

Applications

RNNs are commonly used in connected-sequence applications such as time series forecasting, signal processing, and handwritten character recognition. Also, RNNs are widely used in music generation, image captioning, and predicting stock market fluctuations.

5. Long Short-Term Memory Networks

A plain RNN retains information mainly from the most recent steps; over long sequences, its memory of earlier inputs fades (the well-known vanishing gradient problem). But for a problem like language translation, we need much longer retention. That’s where LSTM networks come into the picture.

To learn long-term dependencies, our neural network needs memorization power. LSTMs are a special kind of RNN that provides it. They have the same chain-like structure as RNNs, but with a different repeating module. This module contains gates that control what is written to, kept in, and forgotten from an internal cell state, allowing the network to retain information over far more time steps.
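
As a sketch of what this looks like in practice (using PyTorch; the sequence length, feature size, and hidden size are placeholders), the LSTM cell can serve as a drop-in replacement for a plain recurrent cell:

```python
import torch
import torch.nn as nn

# Toy setup: batches of sequences, 30 steps long, 8 features per step.
lstm = nn.LSTM(input_size=8, hidden_size=32, batch_first=True)
head = nn.Linear(32, 1)

x = torch.randn(4, 30, 8)        # a batch of 4 toy sequences
outputs, (h_n, c_n) = lstm(x)    # h_n: final hidden state, c_n: final cell state
prediction = torch.sigmoid(head(h_n[-1]))  # e.g., a binary label per sequence
print(prediction.shape)          # torch.Size([4, 1])
```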

Figure 6: How an LSTM network looks (Image by author)

Applications

I’ve already mentioned how powerful LSTM networks are for language translation systems, but they have a wide range of applications. Some of these applications include sequence-to-sequence modeling tasks like anomaly detection, speech recognition, text summarization, and video classification.

6. Generative Adversarial Networks

Given training data, Generative Adversarial Networks (or simply, GANs) learn to generate new data with the same statistics as the training data. For example, if we train a GAN model on photographs, then a trained model will be able to generate new photographs that look similar to the input photographs.

A GAN contains two parts: a generator and a discriminator. The generator creates new data while the discriminator tries to distinguish real data from generated data. As the generator and discriminator get better at their respective jobs, the generated data improves until it is (ideally) nearly identical in quality to the training data.

Think of the relationship as that of cops and robbers. Both are always trying to outsmart the other; the robbers to steal, and the police to catch the robbers.

Figure 7: Architecture of Generative Adversarial Networks (Image by author)

We first feed random noise samples to the generator and pass its outputs through the discriminator. Early in training, the discriminator easily tells generated data from real data, so we adjust the generator and train again. Over many iterations, the generator learns to create data that is indistinguishable from the training data.
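
A minimal PyTorch sketch of this adversarial loop on toy 2-D data (the network sizes, learning rates, and the stand-in “real” distribution are all illustrative):

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 16, 2   # toy sizes; real GANs are much larger

generator = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(),
                          nn.Linear(32, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(),
                              nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(1000):
    real = torch.randn(64, data_dim) * 0.5 + 2.0   # stand-in "real" data
    fake = generator(torch.randn(64, noise_dim))   # noise -> candidate data

    # 1) Train the discriminator to tell real (label 1) from fake (label 0).
    d_loss = (loss_fn(discriminator(real), torch.ones(64, 1)) +
              loss_fn(discriminator(fake.detach()), torch.zeros(64, 1)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Train the generator to make the discriminator output "real".
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```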

Applications

GANs are commonly used to create cartoon images or faces for gaming and animated movies. Also, GANs can help generate synthetic data from a small amount of data to help improve machine learning models. GANs are also a popular choice for artists looking to use machine learning models to expand their expression.

Summary

In this article, we looked at the relationship between neural networks and biological neural networks, then jumped into a number of neural networks and how they work. This article is a general guide to neural network concepts, but the domain is evolving constantly. With these concepts in hand, I hope it will be easier to understand and explore the workings of other types of neural networks, and those currently being developed. I encourage you to explore more in this domain because there is a wide range of incredible applications and lots of ongoing research.

Thanks for reading! This article was originally posted here. I am going to write more beginner-friendly posts in the future too. Follow me on Medium to be informed about them. I welcome feedback and can be reached on Twitter ramya_vidiyala and LinkedIn RamyaVidiyala. Happy learning!