Create Your First Neural Network with Python* and TensorFlow*
Your First Neural Network
We’ll be using Python and TensorFlow to create a CNN that takes a small image of a typed digit from 0 to 9 and predicts which digit it is. This is a great use case to start with and will give you a good foundation for understanding key principles of the TensorFlow framework.
We’ll use the Intel Optimization for TensorFlow, which optimizes TensorFlow performance when running on Intel® architecture. The core of this optimization is the Intel® oneAPI Deep Neural Network Library (oneDNN), which is a set of building blocks for DL applications that includes convolutional and pooling layers — the base components of any CNN model. The library is a part of the Intel® oneAPI Base Toolkit, a set of libraries for developing high-performance AI, machine learning (ML), and other applications across diverse architectures.
The programming model of all oneAPI components is unified, so the same code can be deployed on a CPU, GPU, or FPGA. Intel continues to develop oneAPI components to support new processors and optimize performance by taking advantage of new instruction set extensions.
Using oneDNN primitives as the back-end implementation for core TensorFlow algorithms provides higher performance for DNN models and ensures that the performance will be optimized for newer processor architectures.
Let’s Start Coding
Let’s get started with our simple CNN. This neural network classifies images with typed digits. The input for the network will be a small 28 × 28 pixel grayscale image, and the output will be an array of probabilities for each digit from 0 to 9. The first step is to build the TensorFlow model of the CNN. We’ll use the Keras API for this task, as it’s easier to understand when creating your first neural network.
Write and run the following code in your DL environment:
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '1'  # enable oneDNN optimizations; must be set before TensorFlow is imported
import tensorflow
print(tensorflow.__version__)  # check the installed TensorFlow version
The first two lines of the code turn on the oneDNN optimizations for the session, and the last two lines check the version of the TensorFlow framework. Note that beginning with TensorFlow 2.9, the oneDNN optimizations are enabled by default, and you have to set TF_ENABLE_ONEDNN_OPTS to '0' if you want to test performance without them.
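For that comparison, a minimal sketch looks like the following. Keep in mind that the variable only takes effect if it’s set before TensorFlow is imported for the first time, so run it in a fresh session:

import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'  # disable oneDNN to measure baseline performance
import tensorflow  # import only after the variable is set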
Next, initialize the CNN model with the input layer:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input

model = Sequential()            # layers will be stacked one after another
inp = Input(shape=(28, 28, 1))  # 28 × 28 grayscale images
model.add(inp)
We used the ‘Sequential’ model for our CNN. This model has the simplest structure: layers are stacked sequentially so that the output of one layer is the input of the next. Then, we added the input layer to the model. All TensorFlow models need to know the shape of their input data. In our case, this is a tensor with three dimensions: 28, 28, and 1. This input corresponds to an image of 28 × 28 pixels with one color channel (a grayscale image).
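To make that shape concrete, here’s a minimal sketch of one input sample as a NumPy array. Note that shape=(28, 28, 1) describes a single sample; when you actually feed data to the model, Keras expects an extra leading batch dimension:

import numpy as np

batch = np.zeros((1, 28, 28, 1), dtype="float32")  # (batch, height, width, channels)
print(batch.shape)  # (1, 28, 28, 1): a batch of one grayscale image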
The Convolutional Layer
Let’s continue and write the following code:
from tensorflow.keras.layers import Conv2D

conv = Conv2D(32, (5, 5), padding="same", strides=(1, 1))  # 32 filters of size 5 × 5
model.add(conv)
The code initializes a convolutional layer ‘Conv2D’ and places it after the input layer. This convolutional layer is the key element of our neural network. It’s in charge of extracting geometric features from input images, which are then used by the next layers. We created the layer with 32 kernels (or filters) of size (5, 5). The next two arguments specify how these filters are applied to the input image: strides specifies the vertical and horizontal shift of the filter at each step, and padding specifies whether the input image is padded with extra pixels so that the output keeps the same spatial size ("same") or shrinks ("valid").
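To see the effect of the padding argument, you can apply a fresh layer directly to a dummy batch and compare the two modes (a quick sketch):

import numpy as np
from tensorflow.keras.layers import Conv2D

x = np.zeros((1, 28, 28, 1), dtype="float32")                       # dummy batch of one image
same = Conv2D(32, (5, 5), padding="same", strides=(1, 1))(x)        # input is padded
valid = Conv2D(32, (5, 5), padding="valid", strides=(1, 1))(x)      # no padding
print(same.shape)   # (1, 28, 28, 32): spatial size is preserved
print(valid.shape)  # (1, 24, 24, 32): each axis shrinks by kernel_size - 1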
The Activation Layer
Any convolutional layer should be followed by an activation layer. This layer introduces an activation function into the model, which controls whether a neuron will “fire” (provide an output) based on its weighted input. The most popular activation function in recent deep neural networks (DNNs) is the rectified linear unit (ReLU). The following code adds the activation layer to our model:
from tensorflow.keras.layers import Activation

conv_act = Activation("relu")  # apply ReLU after the convolution
model.add(conv_act)
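ReLU simply computes max(0, x): negative inputs produce no output (the neuron doesn’t “fire”), while positive inputs pass through unchanged. A quick check:

import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 1.5])
print(tf.nn.relu(x).numpy())  # [0.  0.  0.  1.5]: negatives are clipped to zero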
The Pooling Layer
Another important element of a DNN is the pooling operation. Its goal is to decrease the spatial dimension of the data passed to the next layer. We’ll be using the max pooling operation, as it has proven effective at downsampling feature maps in CNN models.
from tensorflow.keras.layers import MaxPooling2D

pool = MaxPooling2D((4, 4))  # keep only the maximum of each 4 × 4 window
model.add(pool)
The added MaxPooling2D layer was initialized with a pool size of (4, 4). This means the spatial dimensions of the layer’s output decrease by a factor of four along both the vertical and horizontal axes, so the 28 × 28 feature maps produced by the convolutional layer are reduced to 7 × 7 after pooling.
The pooling operation serves two purposes. One is to make the model robust to slight shifts in the positions of extracted features, and the other is to reduce the amount of data passed to the next processing layers, thus making the model faster.
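To make the mechanics visible, here’s the operation on a toy 4 × 4 feature map, using a (2, 2) pool for readability instead of the (4, 4) pool in our model:

import numpy as np
import tensorflow as tf

x = np.array([[ 1,  2,  5,  6],
              [ 3,  4,  7,  8],
              [ 9, 10, 13, 14],
              [11, 12, 15, 16]], dtype="float32").reshape(1, 4, 4, 1)
pooled = tf.keras.layers.MaxPooling2D((2, 2))(x)
print(pooled.numpy().reshape(2, 2))
# [[ 4.  8.]
#  [12. 16.]]
# Each value is the maximum of one non-overlapping 2 × 2 window.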
Visualizing Our Model So Far
Now, our model consists of four layers: input, convolutional, activation, and pooling. At any stage of model building, we can view information on the model’s structure by calling the model.summary() method.
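Exact layer names depend on your session, but at this point the output should look roughly like this (abridged sketch):

model.summary()
# Output (abridged; layer names may differ):
#  conv2d (Conv2D)               (None, 28, 28, 32)   832
#  activation (Activation)       (None, 28, 28, 32)   0
#  max_pooling2d (MaxPooling2D)  (None, 7, 7, 32)     0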
We can see information about the output shape for every layer. For example, after the pooling layer, the spatial dimensions of the output decreased by a factor of four, from 28 to 7, just as expected. Another important piece of information is the number of trainable parameters in each layer. The convolutional layer in our model has 832 such parameters. Trainable parameters are the coefficients (weights) of a layer whose values are tuned during training. Their number directly affects the training time: the greater the number, the longer the training process.
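The 832 figure is easy to verify by hand: each of the 32 filters has 5 × 5 weights per input channel, plus one bias:

print(32 * (5 * 5 * 1 + 1))  # 832 trainable parameters in the convolutional layer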
The Flatten and Dense Layers
Let’s continue to build the model with the following code:
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense

flat = Flatten()  # (7, 7, 32) tensor -> flat vector of 1568 values
model.add(flat)

dense = Dense(128)  # hidden fully connected layer with 128 neurons
model.add(dense)
dense_act = Activation("sigmoid")
model.add(dense_act)
After this, our model outputs the following summary:
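As before, exact layer names vary by session; an abridged sketch of the new rows:

model.summary()
# New rows in the output (abridged):
#  flatten (Flatten)             (None, 1568)         0
#  dense (Dense)                 (None, 128)          200832
#  activation_1 (Activation)     (None, 128)          0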
The code above adds a “flatten” layer that converts the three-dimensional tensor (7, 7, 32) coming from the pooling layer into a flat vector with 1568 components. It then adds a “dense” layer with 128 neurons and a sigmoid activation function to the model. This is the so-called hidden fully connected layer. Together with the classification layer that follows, it forms a multilayer perceptron (MLP), which is commonly the final block in classification neural networks.
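Both numbers in the summary can be checked by hand:

print(7 * 7 * 32)        # 1568: the length of the flattened vector
print(1568 * 128 + 128)  # 200832: one weight per input per neuron, plus one bias per neuron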
The Classification Layer
The last layer of our model must be a classification layer with 10 output values: the probabilities for each digit. We’ll use a dense layer with 10 neurons and the softmax activation function, a common choice for modern classifier models.
out = Dense(10)  # one output neuron per digit (0 to 9)
model.add(out)
out_act = Activation("softmax")  # turn the 10 raw outputs into probabilities
model.add(out_act)
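Softmax turns the ten raw neuron outputs into a probability distribution. A quick sketch with made-up logits:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
probs = tf.nn.softmax(logits)
print(probs.numpy().round(3))       # the largest logit gets the largest probability
print(float(tf.reduce_sum(probs)))  # ~1.0: the probabilities sum to one (up to rounding)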
The final summary of our model is shown below:
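Here is a sketch of what the complete summary should contain. Layer names vary by session, but the shapes and parameter counts follow from the layers we added:

model.summary()
# Output (abridged):
#  conv2d (Conv2D)               (None, 28, 28, 32)   832
#  activation (Activation)       (None, 28, 28, 32)   0
#  max_pooling2d (MaxPooling2D)  (None, 7, 7, 32)     0
#  flatten (Flatten)             (None, 1568)         0
#  dense (Dense)                 (None, 128)          200832
#  activation_1 (Activation)     (None, 128)          0
#  dense_1 (Dense)               (None, 10)           1290
#  activation_2 (Activation)     (None, 10)           0
# Total params: 202,954 (all trainable)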