A Layman’s Guide to Deep Neural Networks

A non-mathematical introduction to Deep Neural Networks with PyTorch

Prelude

The recent A.I. buzz has created broad awareness of Neural Networks in academia and enterprise. You have certainly crossed paths with content claiming that some form of AI/Neural Net system will overtake your traditional workflow, and I am sure you have heard of (though may not be fully familiar with) Deep Neural Networks and Deep Learning.

In this post, I would like to introduce the topic in the shortest yet most effective way, and show how to embrace Deep Neural Networks by implementing them with PyTorch.

A layman’s definition for Deep Neural Networks a.k.a. Deep Learning

Take 1

Deep Learning is a sub-field of machine learning in Artificial Intelligence (A.I.) that deals with algorithms inspired by the biological structure and functioning of the brain to aid machines with intelligence.

Does it sound complicated? Let’s simplify it by breaking down each term in the definition, and then huddle up again around the full definition. We will start with Artificial Intelligence, a.k.a. A.I.

Source — Learn Keras for Deep Neural Networks (Apress)

Artificial Intelligence (A.I.) in its most generic form can be defined as the quality of intelligence being introduced into machines. Machines are usually dumb, so to make them smarter we induce some sort of intelligence into them so they can make decisions independently. Take a washing machine that can decide on the right amount of water to ingest and on the required time for soaking, washing, and spinning: it makes decisions based on the specific inputs provided and therefore works in a smarter way. Similarly, an ATM decides how to disburse the amount you want using the right combination of notes available in the machine. This intelligence is induced into the machine artificially, and that’s why the name Artificial Intelligence.

An important point to note here is that this intelligence is explicitly programmed, say as a comprehensive list of if-else rules. The engineer who designed the system carefully thought through all the possible combinations and designed a rule-based system that decides by traversing the defined rule path. What if we need to introduce intelligence into a machine without explicit programming, so that the machine can learn on its own? That’s when we touch base with Machine Learning. To see the contrast, a toy sketch of the explicit, rule-based approach follows.
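Every rule and number in this hypothetical washing-machine controller is invented purely for illustration; the point is that a human wrote each decision path by hand, and nothing is learned from data:

```python
def soak_minutes(load_kg: float, dirt_level: str) -> int:
    """Decide soaking time from hand-written rules; nothing is learned."""
    if dirt_level == "heavy":
        return 30 if load_kg > 5 else 20
    elif dirt_level == "medium":
        return 15
    else:  # light dirt
        return 5

print(soak_minutes(6.5, "heavy"))  # 30: the engineer pre-defined this path
```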

Machine learning can be defined as the process of inducing intelligence into a system or machine without explicit programming.

– Andrew Ng, Stanford Adjunct Professor

An example of machine learning would be a system that predicts whether a student will pass or fail a test by learning from historical test results and student attributes. Here, the system is not encoded with a comprehensive list of rules that decide whether a student will pass or fail; instead, it learns on its own from the patterns it identifies in the historical training data it was fed.

So, where does Deep Learning stand in this context? It turns out that machine learning, though it works very well for a variety of problems, fails to excel at some specific tasks that are very easy for humans: say, classifying an image as a cat or a dog, or distinguishing an audio clip as a male or female voice. Machine learning mostly performs poorly with images, audio, and other unstructured data types. On researching the reasons for this poor performance, researchers were inspired to mimic the human brain’s biological process, which is composed of billions of neurons connected and orchestrated to learn new things. On a parallel track, neural networks had already been a research topic for several years, but had made only limited progress due to the computational and data limitations of the time. When researchers reached the cusp of machine learning and neural networks, the field of deep learning emerged; it was framed by developing deep neural networks, i.e. improved neural networks with many more layers.

Now, let’s take another attempt to understand the definition of Deep Learning.

Take 2

Deep Learning is a field within machine learning and Artificial Intelligence (A.I.) that deals with algorithms inspired by the human brain to aid machines with intelligence without explicit programming.

Isn’t it much easier to understand now? 🙂

Deep Learning excelled at the new frontiers where machine learning was falling behind. In due course, additional research and experimentation led to the understanding that we could leverage deep learning for all machine learning tasks and expect better performance, provided there was surplus data available. Deep learning therefore became ubiquitous for solving predictive problems, rather than being confined to areas like computer vision and speech.

What are some problems that are solved by Deep Learning today?

With the advent of cost-effective compute power and data storage, deep learning has been embraced in every digital corner of our day-to-day lives. A few examples of common digital products based on Deep Learning are the popular virtual assistants like Siri/Alexa/Google Assistant, the suggestion to tag your friend in an uploaded Facebook photo, autonomous driving in Tesla cars, the cat filter in Snapchat, product recommendations on Amazon and Netflix, the recently viral photo apps FaceApp and Prisma, and the list goes on. You might well have used a Deep Learning based product without realizing it.

Deep Learning has forayed into virtually every industry vertical: healthcare with detecting cancer and diabetic retinopathy, aviation with fleet optimization, oil & gas with predictive maintenance of machinery, banking & financial services with fraud detection, retail and telecom with customer churn prediction, and a million more. Andrew Ng has rightly said that AI is the new electricity: just as electricity transformed everything, AI will transform nearly everything in the near future.

Decomposing a Deep Neural Network

A simplified view of a Deep Neural Network is a hierarchical (layered) organization of neurons (similar to the neurons in the brain) with connections to other neurons. These neurons pass a message or signal to other neurons based on the input they receive, and form a complex network that learns with some feedback mechanism. The following diagram represents an ’N’-layered Deep Neural Network.

A Deep Neural Network with N hidden layers

As you can see in the figure above, the input data is consumed by the neurons in the first layer, which then provide an output to the neurons in the next layer, and so on, until the final output is produced. The output might be a prediction like Yes or No (represented as a probability). Each layer can have one or many neurons, and each of them computes a small function, the activation function. The activation function mimics the signal passed on to the next connected neurons: if the incoming value is greater than a threshold, the output is passed on, else it is ignored. The connection between two neurons of successive layers has an associated weight. The weight defines the influence of the input on the output of the next neuron, and eventually on the overall final output. In a neural network, the initial weights are all random, but during model training these weights are updated iteratively so the network learns to predict the correct output.

Decomposing the network, we can define a few logical building blocks: a neuron, a layer, a weight, an input, an output, an activation function, and finally a learning mechanism (optimizer) that helps the neural network incrementally update the randomly initialized weights toward values that aid correct prediction of the outcome. The sketch below maps these building blocks to PyTorch.
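Here is a minimal sketch of these building blocks in PyTorch; the layer sizes and learning rate are arbitrary values chosen purely for illustration:

```python
import torch
import torch.nn as nn

# Layers of neurons: each nn.Linear holds the weights connecting
# one layer to the next; ReLU is the activation function.
model = nn.Sequential(
    nn.Linear(4, 8),   # input (4 features) -> hidden layer of 8 neurons
    nn.ReLU(),         # activation: passes positive signals, ignores negative ones
    nn.Linear(8, 1),   # hidden layer -> single output neuron
    nn.Sigmoid(),      # squashes the output into a probability between 0 and 1
)

# The optimizer is the learning mechanism that updates the (random) weights.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(1, 4)  # one input sample with 4 features
print(model(x))        # an output probability, e.g. tensor([[0.47]])
```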

For an intuitive understanding, let’s take the example of how a human brain learns to identify different people. When you meet a person for the second time, you will be able to identify them. How does this happen? People have a resemblance in their overall structure: two eyes, two ears, a nose, lips, etc. Everyone has the same structure, yet we are able to distinguish between people quite easily, right?

The nature of the learning process in the brain is quite intuitive. Rather than learning the structure of a face to identify people, it learns the deviation from a generic face, i.e. how different a person’s eyes are from a reference eye, which can then be quantified as an electrical signal with a defined strength. Similarly, it learns deviations for all parts of the face from a reference base, combines these deviations into new dimensions, and finally gives an output. All of this happens in such a fraction of a second that none of us realizes what our subconscious mind has actually done.

Similarly, the neural network showcased above tries to mimic the same process with a mathematical approach. The input is consumed by the neurons in the first layer, and an activation function is calculated within each neuron. Based on a simple rule, each neuron forwards an output to the next one, similar to the deviations learned by the human brain: the larger the output of a neuron, the larger the significance of that input dimension. These dimensions are then combined in the next layer to form additional new dimensions that we probably can’t make sense of, but that the system learns intuitively. The process, multiplied several times over, develops a complex network with many connections.
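To make that “simple rule” concrete, here is a sketch of one common activation function, ReLU, which passes positive signals through and suppresses negative ones; the input values are arbitrary:

```python
import torch

signals = torch.tensor([-2.0, -0.5, 0.0, 1.3, 4.0])
activated = torch.relu(signals)  # negative signals are zeroed out, i.e. ignored
print(activated)                 # tensor([0.0000, 0.0000, 0.0000, 1.3000, 4.0000])
```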

Now that the structure of the neural network is understood, let’s look at how the learning happens. When we feed input data to the defined structure, the end output is a prediction (computed through a series of matrix multiplications), which could be either correct or incorrect. Based on that output, we provide feedback to the network so that it predicts better, and the system learns by updating the weights of the connections. To provide this feedback and define the next step for changing the weights in the right direction, we use a beautiful mathematical algorithm called ‘backpropagation’. Iterating the process step by step several times, with more and more data, helps the network update its weights appropriately, creating a system that can make decisions for predicting outputs based on the rules it has created for itself through its weights and connections. A toy training loop follows below.
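Here is a minimal training-loop sketch in PyTorch; the task (predict whether a sample’s features sum to a positive number), the network shape, and the hyperparameters are all invented for illustration:

```python
import torch
import torch.nn as nn

# Toy task: label is 1 when a sample's features sum to a positive number.
x = torch.randn(64, 4)
y = (x.sum(dim=1) > 0).float().unsqueeze(1)

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()                      # measures how wrong the prediction is
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    prediction = model(x)                   # forward pass: make a prediction
    loss = loss_fn(prediction, y)           # compare prediction with the truth
    optimizer.zero_grad()                   # clear gradients from the last step
    loss.backward()                         # backpropagation: compute the feedback
    optimizer.step()                        # nudge weights in the right direction
```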

The name Deep Neural Networks comes from the use of many more hidden layers, making the network ‘deep’ and able to learn more complex patterns. The success stories of deep learning have only surfaced in the last few years because training a network is computationally heavy and needs large amounts of data. The experiments finally saw the light of day only when compute and data storage became more available and affordable.

What are some popular frameworks for Deep Learning?

Given that the adoption of deep learning has propelled forward at a staggering pace, the maturity of the ecosystem has also improved phenomenally. Thanks to many large tech organizations and open-source initiatives, we now have a plethora of options to choose from. These deep learning frameworks provide reusable code blocks that abstract the logical building blocks we discussed above, along with several additional handy modules for developing a deep learning model.

We can classify the available options as low-level or high-level deep learning frameworks. While this is by no means industry-recognized terminology, the segregation helps build a more intuitive understanding of the frameworks. Low-level frameworks provide more basic blocks of abstraction while giving a ton of room for customization and flexibility. High-level frameworks aggregate the abstraction further to ease our work, while limiting the scope for customization and flexibility; they use a low-level framework as a backend and typically work by converting the source into the desired low-level framework for execution. Below are a few of the popular choices of frameworks for deep learning.

Low-level frameworks

  1. TensorFlow
  2. MXNet
  3. PyTorch

High-level frameworks

  1. Keras (uses TensorFlow as a backend)
  2. Gluon (uses MXNet as a backend)

At the moment, the most popular choice is TensorFlow (by Google). Keras is also popular, given the ease with which it lets you quickly prototype deep learning models. However, PyTorch (by Facebook) is another popular framework that is catching up in the race at an astounding speed. PyTorch is a great choice for many AI practitioners: it has an easier learning curve than TensorFlow and can take you from prototyping to productionizing a Deep Learning model with ease.

In this tutorial, we will use PyTorch to implement a baby neural network. You should research and study more before deciding on your choice of framework. This post by Skymind provides a great comparison and details to help in deciding on a framework for your needs.

This post will, however, not cover an introduction to PyTorch. I would recommend exploring PyTorch’s official tutorials. This guide by Elvis is also a neat write-up for beginners in PyTorch.

Building a baby neural network with PyTorch

With a basic layman’s overview of the subject, we can now start building a basic neural network in PyTorch. In this example, we generate a dummy dataset that mimics a classification use-case with 32 features (columns) and 6,000 samples (rows). The dataset will be generated using the randn function from PyTorch.
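Since the embedded code does not render here, below is a minimal sketch of such a baby network, reconstructed under the assumptions above (32 features, 6,000 samples, binary classification, with random labels standing in for real ones); the layer sizes, optimizer, and epoch count are illustrative choices rather than the only valid ones:

```python
import torch
import torch.nn as nn

torch.manual_seed(42)                        # reproducible dummy data

# Dummy classification dataset: 6000 samples, 32 features, binary labels.
X = torch.randn(6000, 32)
y = torch.randint(0, 2, (6000, 1)).float()   # random 0/1 labels for illustration

# A baby network: two hidden layers with ReLU, sigmoid output for a probability.
model = nn.Sequential(
    nn.Linear(32, 64),
    nn.ReLU(),
    nn.Linear(64, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
    nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(10):
    prediction = model(X)                    # forward pass over the whole dataset
    loss = loss_fn(prediction, y)            # measure how wrong we are
    optimizer.zero_grad()
    loss.backward()                          # backpropagation
    optimizer.step()                         # update the weights
    print(f"epoch {epoch + 1}: loss = {loss.item():.4f}")
```

Of course, training on random labels will not converge to anything meaningful; the point is only to exercise the end-to-end mechanics of data, model, loss, backpropagation, and weight updates.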

This code is also available in the accompanying GitHub repository.

Concluding thoughts

This post aimed to give beginners an easy head start on the subject in very simple language, keeping the math abstracted away and focusing on a purely functional means of leveraging Deep Learning for modern enterprise projects. Most of the content was borrowed from Chapter 1 of my book ‘Learn Keras for Deep Neural Networks’.

In the next post, I will cover “A Layman’s Guide to Convolutional Neural Networks”, and again the examples will be in PyTorch.