Deep Neural Networks Tutorial with TensorFlow

Hi everyone! In this blog, I am going to tell you about Deep Neural Networks, or DNN for short. In my last blog, I had told you about the 3 part blog that will talk about the various types of Neural Networks and this is the first part of the 3 part blog.

This blog can be followed if you have almost no prior experience and you just want to get started making AI models. However, I’ll assume you have experience with Python. In this blog we will talk about DNN’s which are complex versions of “Neural Networks” that take in and give out numbers.

So let’s begin.

Let us first define a “Network”.

This is the simplest type of Neural Network that we can make. This network has 3 inputs and 1 output. “Weights” are nothing but numbers that help neural networks determine that “strength” of the input.

To make the relationship between the weights and the nodes(the circles) clear, consider a neuron in our brain. It gets electrical signals as the inputs and outputs electrical signals too. Suppose you touch something warm. The impulse will travel through the neuron, and then to the brain. There it classifies the impulse as weak, so it knows that the fingers have touched something warm. Now if you touch something really hot, a much stronger impulse will travel through the neuron, and the brain will classify it accordingly.

The weights in the neural network do the same work and help the neural network classify the “strength” of the impulse/input.

A neural network can have only an input layer and an output layer. In the image, the input layer has 3 nodes and the output layer has 1 node, however it can have as many as we want.

Such networks(like the one shown above) can be used for things such as linear regression which is basically a straight line. Linear regression can be used to model data that shows a linear pattern such as house prices. As the floor area increases the cost increases.

A deep neural network is only a little different.

It has the same input layer and output layer. But to model more complex data, we need to find out more features in it so we add a hidden layer where the real magic happens. In these layers, the DNN extracts features from the more complex data.

So let’s start coding a DNN. I am going to use the Titanic competition from Kaggle as my data source. We are first going to perform data analysis with pandas and then train a model with TensorFlow and Keras.

First import the necessary modules:

import pandas as pd
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt

I am using TensorFlow version “2.2.0″. You can check you version by typing:

print(tf.__version__)

Next we import the data:

train_df = pd.read_csv("data/train.csv")
train_df.head()

Next we clean up the data a little:

train_df.drop(columns='Ticket', inplace=True)
train.df.dropna(inplace=True)
train_df.index = train_df['PassengerId']
train_df.columns = ['passenger_id', 'survived', 'pclass', 'name', 
                    'sex', 'age', 'sibsp', 'parch', 'fare', 'cabin', 'embarked']
train_df.drop(columns='passenger_id', inplace=True)
train_df.head()

Next we assign categorical classes numerical values to represent them. This can be done by using “map” which will replace certain values in the data by comparing it to the input dictionary. An example of gender is:

train_df['sex'] = train_df['sex'].map({"female":0, "male":1})

After the cleanup of columns like sex and embarked we can further classify the data by putting the age and fare columns under groups. For example the first record might be in the age group of 20–40. To accomplish this, we use “pd.cut”. Here is an example for age:

train_df['age'] = pd.cut(train_df['age'], bins=[0,20,40,60,80,100], 
labels=['0-20', '20-40', '40-60', '60-80', '80-100'])
train_df['age'] = train_df['age'].map({
    '0-20': 0,
    '20-40': 1,
    '40-60': 2,
    '60-80': 3,
    '80-100': 4
})

After doing the same for the fare, this will be our table:

We did all of this to compare the data we have and find out which is a feature that we can use to train a model. If we consider gender and write code that shows that number of each gender that survived and the survival rates of each gender I will do this:

gender = {
    'male': train_df[(train_df['sex'] == 1) & (train_df['survived'] == 1)].count()[1],
    'female': train_df[(train_df['sex'] == 0) & (train_df['survived'] == 1)].count()[1],
}
f = plt.figure()
plt.bar(gender.keys(), gender.values())
plt.title("Number of people survived based on gender")
f.show()
survival_rate = {
    'male': train_df[(train_df['sex'] == 1) & (train_df['survived'] == 1)].count()[1]/train_df[train_df['sex'] == 1].count()[1],
    'female': train_df[(train_df['sex'] == 0) & (train_df['survived'] == 1)].count()[1]/train_df[train_df['sex'] == 0].count()[1],
}
g = plt.figure()
plt.bar(survival_rate.keys(), survival_rate.values())
plt.title('Survival rate per gender')
g.show()

We can do the same for all the other features and the code remains fairly the same. Here are the graphs for all the other features:

All of these graphs show some observable trend and hence it is required to train our AI model. Analysing the data like this before training is really helpful as it will help get rid of data that shows no trend.

Now we need to build and train a model. I will import a fresh training data copy to work with as I want to train my model on the raw numbers rather than the class numbers. But because data such as age and fare have very high numbers we will need to normalise the data. Normalising means we just scale down our data between 0 and 1 as neural networks work better with numbers between such ranges. Here is the clean up I did:

def normalise(series):
    return (series - series.min())/(series.max() - series.min())
train = pd.read_csv('data/train.csv')
train = train[['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked', 'Survived']]
train['relatives'] = train['SibSp'] + train['Parch']
train.drop(columns = ['SibSp', 'Parch'], inplace=True)
train.replace({'Sex': {'male': 0, 'female': 1}, 'Embarked': {"S": 0, "C": 1, "Q": 2}}, inplace=True)
train['Age'] = normalise(train['Age'])
train['Fare'] = normalise(train['Fare'])
train.head()

Now we will build our model:

# creating a dictionary with the data
data = {
    'training data': np.array(train[['Pclass', 'Sex', 'Age', 'Fare', 'Embarked', 'relatives']]), 
    'labels': np.array(train['Survived'])
}
####################################################################
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=6, activation='relu', input_shape=[6,]))
model.add(tf.keras.layers.Dense(units=20, activation='relu'))
model.add(tf.keras.layers.Dense(units=25, activation='relu'))
model.add(tf.keras.layers.Dense(units=20, activation='relu'))
model.add(tf.keras.layers.Dense(units=1, activation='relu'))
opt = tf.keras.optimizers.Adam(learning_rate=1e-7)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['acc'])
hist = model.fit(data['training data'], data['labels'], verbose=False, epochs=500)
plt.plot(hist.history['acc'])
plt.show()

Here I have made a sequential model as I want my layers in sequence like the image of the neural network above. A “Dense” layer may sound fancy but it is just a name for a normal layer. Remember a “layer” is the entire selection of nodes(circles) from where the data enters and exists. The complex calculations are already defined by tensorflow so we do not need to do anything. The compile method and its arguments tell the model how we want to build our model. A loss function helps us determine how far our model is from the actual values. In this case it is binary crossentropy. Binary because our model only has 2 possible outputs: 1(Survived) or 0(Not survived). An optimizer helps us reach that spot where the error of the model is least from the actual values. The metrics just tell the model to record the accuracy of the model while training.

I am leaving a lot of things out here as they are not very important and they can be understood if you simply solve more AI problems. As the code shows, I have trained my model(model.fit()) and here is the graph of the accuracy of the model:

As we can see it is around 0.615(61.5%). This is fine for the first trail. However, you can tune other parameters such as the learning rate, which tells the model how “fast” to move to a solution. You can also go to TensorFlow’s official website to see other optimisers(here I have used Adam) to improve your model.

So that’s it for today! I hope you enjoyed this blog and learned something. The next blog will be on training a Computer Vision model. If you liked this blog then please share it and follow me here on Medium.

Thanks for reading!