6 Steps to Build a Neural Network | Documentation
6 steps to build a neural network in OpenNN
This tutorial shows the main ingredients needed to build a neural network model in a few steps using OpenNN. The script of the example used here is available on GitHub.
The central goal here is to design a model that makes good classifications for new data or,
in other words, one which exhibits good generalization.
If you want to know more about the concepts we are going to see in this tutorial, you can read this neural network tutorial
or the machine learning blog created by Neural Designer.
Contents:
- Data set.
- Neural network.
- Training strategy.
- Model selection.
- Testing analysis.
- Model deployment.
1. Data set
The first step is to prepare the data set, which is the source of information for the classification problem.
For that, we need to configure the following concepts:
- Data source.
- Variables.
- Instances.
The data source is the file iris_flowers.csv.
It contains the data for this example in comma-separated values (CSV) format and can be loaded as
DataSet data_set("path_to_source/iris_flowers.csv",',',true);
The number of columns is 5, and the number of rows is 150. The variables in this problem are:
- sepal_length: Sepal length, used as input.
- sepal_width: Sepal width, used as input.
- petal_length: Petal length, used as input.
- petal_width: Petal width, used as input.
- class: Iris Setosa, Versicolor, or Virginica, used as the target.
OpenNN recognizes categorical variables and transforms them into numerical variables, one binary variable per category. In this example, the transformation is as follows:
- iris_setosa: 1 0 0.
- iris_versicolor: 0 1 0.
- iris_virginica: 0 0 1.
As a result, the data set contains seven numerical variables: four inputs and three targets. Once the data is ready, we can retrieve information about the variables, such as their names and statistical descriptives. The names are obtained with
const Tensor<string, 1> inputs_names = data_set.get_input_variables_names();
const Tensor<string, 1> targets_names = data_set.get_target_variables_names();
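The categorical-to-numerical transformation listed above can be illustrated with a small stand-alone sketch. It does not use OpenNN and is not its implementation; the function name one_hot and the use of std::vector are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns a vector with 1.0 at the position of `label` in `categories`
// and 0.0 everywhere else. The order of `categories` fixes the column order.
std::vector<double> one_hot(const std::string& label,
                            const std::vector<std::string>& categories)
{
    std::vector<double> encoded(categories.size(), 0.0);
    bool found = false;

    for (std::size_t i = 0; i < categories.size(); ++i)
    {
        if (categories[i] == label)
        {
            encoded[i] = 1.0;
            found = true;
        }
    }

    // An unknown label would silently produce an all-zero row, so reject it.
    assert(found && "label must be one of the known categories");
    return encoded;
}
```

With the categories ordered as iris_setosa, iris_versicolor, iris_virginica, calling one_hot("iris_versicolor", categories) gives the row 0 1 0 shown in the list above.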
The instances are divided into training, selection, and testing subsets.
They represent 60% (90), 20% (30), and 20% (30) of the original instances, respectively, and are split at random using the following command
data_set.split_samples_random();
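To make the 60/20/20 split concrete, here is a minimal stand-alone sketch of a random split over sample indices. It is only an illustration of the idea, not the code behind split_samples_random; the SampleSplit struct, the split_random function, and the explicit seed are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct SampleSplit
{
    std::vector<std::size_t> training;
    std::vector<std::size_t> selection;
    std::vector<std::size_t> testing;
};

// Shuffles the indices 0..samples_number-1 and cuts them into
// 60% training, 20% selection, and the remainder for testing.
SampleSplit split_random(std::size_t samples_number, unsigned seed)
{
    std::vector<std::size_t> indices(samples_number);
    std::iota(indices.begin(), indices.end(), std::size_t{0});

    std::mt19937 generator(seed);
    std::shuffle(indices.begin(), indices.end(), generator);

    const std::size_t training_number = samples_number * 60 / 100;
    const std::size_t selection_number = samples_number * 20 / 100;

    const auto first_cut = indices.begin() + static_cast<std::ptrdiff_t>(training_number);
    const auto second_cut = first_cut + static_cast<std::ptrdiff_t>(selection_number);

    SampleSplit split;
    split.training.assign(indices.begin(), first_cut);
    split.selection.assign(first_cut, second_cut);
    split.testing.assign(second_cut, indices.end());
    return split;
}
```

For 150 samples this yields subsets of 90, 30, and 30 indices, and every sample belongs to exactly one subset.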
To get the input variables number and target variables number, we use the following command
const Index input_variables_number = data_set.get_input_variables_number();
const Index target_variables_number = data_set.get_target_variables_number();
To make the neural network work in the best possible conditions, we scale the data set. In our case, we will choose the minimum-maximum scaling method
Tensor<string, 1> scaling_inputs_methods(input_variables_number);
scaling_inputs_methods.setConstant("MinimumMaximum");
const Tensor<Descriptives, 1> inputs_descriptives = data_set.scale_input_variables();
This call returns the statistical descriptives of each input: minimum value, maximum value, mean, and standard deviation. In this case, we do not scale the targets because their values are already 0 or 1, a range the neural network handles well.
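The following stand-alone sketch shows what minimum-maximum scaling does to one column of values and how the stored descriptives allow the operation to be reversed. The source does not state which target range OpenNN maps to, so the range is left as a parameter here; the MinMax struct and the function names are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Minimal descriptives needed for minimum-maximum scaling.
struct MinMax
{
    double minimum;
    double maximum;
};

MinMax calculate_min_max(const std::vector<double>& column)
{
    assert(!column.empty());
    const auto extremes = std::minmax_element(column.begin(), column.end());
    return MinMax{*extremes.first, *extremes.second};
}

// Maps a value from [minimum, maximum] linearly onto [new_minimum, new_maximum].
double scale(double value, const MinMax& descriptives,
             double new_minimum, double new_maximum)
{
    assert(descriptives.maximum > descriptives.minimum);
    return new_minimum + (value - descriptives.minimum) * (new_maximum - new_minimum)
                         / (descriptives.maximum - descriptives.minimum);
}

// Inverse of scale(): recovers the original value from a scaled one.
double unscale(double scaled, const MinMax& descriptives,
               double new_minimum, double new_maximum)
{
    assert(new_maximum > new_minimum);
    return descriptives.minimum + (scaled - new_minimum)
                                  * (descriptives.maximum - descriptives.minimum)
                                  / (new_maximum - new_minimum);
}
```

Keeping the descriptives is what makes the later unscaling step possible: unscale(scale(x)) returns x up to rounding.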
For more information about the data set methods, see DataSet class.
2. Neural network
The second step is to choose the correct neural network architecture.
For classification problems, it is usually composed of:
- A scaling layer.
- Two perceptron layers.
- A probabilistic layer.
We define the architecture by
Tensor<Index, 1> architecture(3);
const Index hidden_neurons_number = 3;
architecture.setValues({input_variables_number, hidden_neurons_number, target_variables_number});
Now, the NeuralNetwork class is responsible for building the neural network and properly organizing the layers of neurons using the following constructor. If you need more complex architectures, you should see NeuralNetwork class.
NeuralNetwork neural_network(NeuralNetwork::Classification, architecture);
Once the neural network has been created, we can add information to its layers, starting with the names of the inputs and outputs
neural_network.set_inputs_names(inputs_names);
neural_network.set_outputs_names(targets_names);
In the case of the scaling layer, it is necessary to enter the descriptives and the scaling method calculated previously
ScalingLayer* scaling_layer_pointer = neural_network.get_scaling_layer_pointer();
scaling_layer_pointer->set_descriptives(inputs_descriptives);
scaling_layer_pointer->set_scalers(MinimumMaximum);
The model is now fully defined, so we proceed to the learning process with TrainingStrategy.
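To give an idea of what the layers of such a classification network compute, here is a stand-alone sketch of one perceptron layer followed by a probabilistic layer. It is not OpenNN code: the hyperbolic tangent and softmax activations are assumptions (the text above does not name them), and all function names, weights, and biases are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;   // weights[j] holds the weights of neuron j

// Weighted sum of the inputs plus the bias, for every neuron of a layer.
Vector combine(const Vector& inputs, const Matrix& weights, const Vector& biases)
{
    assert(weights.size() == biases.size());
    Vector combinations(biases);
    for (std::size_t j = 0; j < weights.size(); ++j)
    {
        assert(weights[j].size() == inputs.size());
        for (std::size_t i = 0; i < inputs.size(); ++i)
            combinations[j] += weights[j][i] * inputs[i];
    }
    return combinations;
}

// Perceptron layer with hyperbolic tangent activation (assumed here).
Vector perceptron_layer(const Vector& inputs, const Matrix& weights, const Vector& biases)
{
    Vector outputs = combine(inputs, weights, biases);
    for (double& value : outputs) value = std::tanh(value);
    return outputs;
}

// Probabilistic layer: softmax turns the combinations into values
// between 0 and 1 that sum to 1.
Vector probabilistic_layer(const Vector& inputs, const Matrix& weights, const Vector& biases)
{
    Vector outputs = combine(inputs, weights, biases);
    assert(!outputs.empty());
    const double largest = *std::max_element(outputs.begin(), outputs.end());
    double sum = 0.0;
    for (double& value : outputs)
    {
        value = std::exp(value - largest);   // subtract the maximum for numerical stability
        sum += value;
    }
    for (double& value : outputs) value /= sum;
    return outputs;
}
```

With the architecture used in this tutorial, the perceptron layer would map 4 scaled inputs to 3 hidden values, and the probabilistic layer would map those to 3 class probabilities.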
3. Training strategy
The third step is to set the training strategy, which is composed of:
- Loss index.
- Optimization algorithm.
Firstly, we construct the training strategy object
TrainingStrategy training_strategy(&neural_network, &data_set);
then, set the error term
training_strategy.set_loss_method(TrainingStrategy::NORMALIZED_SQUARED_ERROR);
and finally the optimization algorithm
training_strategy.set_optimization_method(TrainingStrategy::ADAPTIVE_MOMENT_ESTIMATION);
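Note that these two calls are optional: by default, OpenNN builds the training strategy object with the quasi-Newton method as the optimization algorithm and the normalized squared error as the loss method. Setting the loss method above only restates the default, whereas selecting adaptive moment estimation replaces the default optimizer.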
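As a rough illustration of the loss method named above, the sketch below computes a normalized squared error for a single target column: the sum of squared errors divided by the sum of squared deviations of the targets from their mean. This is a commonly used definition and is assumed here; it is not taken from the OpenNN source, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum of squared errors divided by the sum of squared deviations of the
// targets from their mean. A value of 0 means a perfect fit; a value of 1
// means the model is no better than always predicting the mean target.
double normalized_squared_error(const std::vector<double>& outputs,
                                const std::vector<double>& targets)
{
    assert(outputs.size() == targets.size() && !targets.empty());

    const double mean = std::accumulate(targets.begin(), targets.end(), 0.0)
                        / static_cast<double>(targets.size());

    double squared_error = 0.0;
    double normalization = 0.0;
    for (std::size_t i = 0; i < targets.size(); ++i)
    {
        squared_error += (outputs[i] - targets[i]) * (outputs[i] - targets[i]);
        normalization += (targets[i] - mean) * (targets[i] - mean);
    }

    assert(normalization > 0.0 && "targets must not all be equal");
    return squared_error / normalization;
}
```

The normalization makes the error independent of the scale of the targets and of the number of samples, which is why it is a convenient default.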
We can now start the training process by using the command
training_strategy.perform_training();
For finer control, OpenNN also exposes the settings of the optimization algorithm. In that case, configure them before training, for example
AdaptiveMomentEstimation* adam = training_strategy.get_adaptive_moment_estimation_pointer();
adam->set_loss_goal(type(1.0e-3));
adam->set_maximum_epochs_number(10000);
adam->set_display_period(1000);
training_strategy.perform_training();
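The stopping criteria set above (a loss goal and a maximum number of epochs) can be seen at work in the following stand-alone sketch of adaptive moment estimation on a single parameter. It follows the standard published Adam update rule and is not OpenNN's implementation; the function name, the AdamResult struct, and the hyperparameter values are assumptions for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>

struct AdamResult
{
    double parameter;
    double loss;
    std::size_t epochs;
};

// Minimizes a one-dimensional loss with the standard Adam update rule.
// Training stops when the loss goal is reached or the epoch limit is hit.
AdamResult minimize_with_adam(const std::function<double(double)>& loss,
                              const std::function<double(double)>& gradient,
                              double parameter,
                              double loss_goal,
                              std::size_t maximum_epochs)
{
    // Typical hyperparameter values, chosen here only for illustration.
    const double learning_rate = 0.01;
    const double beta_1 = 0.9;
    const double beta_2 = 0.999;
    const double epsilon = 1.0e-8;

    double first_moment = 0.0;
    double second_moment = 0.0;
    std::size_t epoch = 0;

    while (epoch < maximum_epochs && loss(parameter) > loss_goal)
    {
        ++epoch;
        const double g = gradient(parameter);

        // Exponential moving averages of the gradient and its square.
        first_moment = beta_1 * first_moment + (1.0 - beta_1) * g;
        second_moment = beta_2 * second_moment + (1.0 - beta_2) * g * g;

        // Bias correction compensates for the zero initialization.
        const double t = static_cast<double>(epoch);
        const double corrected_first = first_moment / (1.0 - std::pow(beta_1, t));
        const double corrected_second = second_moment / (1.0 - std::pow(beta_2, t));

        parameter -= learning_rate * corrected_first
                     / (std::sqrt(corrected_second) + epsilon);
    }

    return AdamResult{parameter, loss(parameter), epoch};
}
```

For example, minimizing the loss (x - 3)^2 from x = 0 with a loss goal of 1.0e-3 and at most 10000 epochs stops as soon as the loss drops below the goal, well before the epoch limit.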
For more information about the training strategy methods, see TrainingStrategy class.
4. Model selection
The fourth step is to set the model selection, which is composed of:
- Inputs selection algorithm.
- Neurons selection algorithm.
If you are not sure that you have chosen the right architecture, the model selection class can search for the network architecture with the best generalization properties,
that is, the one that minimizes the error on the selection instances of the data set.
The first step is to construct the model selection object
ModelSelection model_selection(&training_strategy);
In this example, we want to optimize the number of neurons in the network architecture using the neurons selection algorithm
model_selection.perform_neurons_selection();
Once the algorithm has finished, the model has the architecture with the lowest selection error among those tried.
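The idea behind neurons selection can be sketched as a simple sweep over candidate numbers of hidden neurons, keeping the one with the lowest selection error. The text above does not describe the actual algorithm OpenNN uses, so this is only a plain exhaustive search under that assumption; select_neurons and the train_and_evaluate callback are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <limits>

struct NeuronsSelectionResult
{
    std::size_t neurons_number;
    double selection_error;
};

// Tries every number of hidden neurons in [minimum_neurons, maximum_neurons].
// `train_and_evaluate` is a hypothetical callback that trains a network with
// the given number of hidden neurons and returns its selection error.
NeuronsSelectionResult select_neurons(
    std::size_t minimum_neurons,
    std::size_t maximum_neurons,
    const std::function<double(std::size_t)>& train_and_evaluate)
{
    NeuronsSelectionResult best{minimum_neurons,
                                std::numeric_limits<double>::infinity()};

    for (std::size_t neurons = minimum_neurons; neurons <= maximum_neurons; ++neurons)
    {
        const double error = train_and_evaluate(neurons);
        if (error < best.selection_error)
        {
            best = NeuronsSelectionResult{neurons, error};
        }
    }

    return best;
}
```

Using the selection error rather than the training error as the criterion is the key design choice: the training error tends to keep falling as neurons are added, while the selection error rises again once the network starts to overfit.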
For more information about the model selection methods, see the ModelSelection class.
5. Testing analysis
The fifth step is to evaluate our model. For that purpose, we need to use the testing analysis class, whose goal is to validate the model’s generalization performance. Here, we compare the neural network outputs to the corresponding targets in the testing instances of the data set.
First, we restore the original input values by unscaling the data set, so that the scaling layer of the neural network receives unscaled inputs
data_set.unscale_input_variables(inputs_descriptives);
We are ready to test our model. As in the previous cases, we start by building the testing analysis object
TestingAnalysis testing_analysis(&neural_network, &data_set);
and perform the testing. In our case, we use the confusion matrix
Tensor<Index, 2> confusion = testing_analysis.calculate_confusion();
In a confusion matrix, rows represent targets (or real values), and columns represent outputs (or predicted values).
The diagonal cells show the correctly classified cases, and the off-diagonal cells show the misclassified instances.
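The layout described above (rows for targets, columns for outputs) can be reproduced with a short stand-alone sketch. It is not the code behind calculate_confusion; it assumes one-hot targets and picks the predicted class as the output with the largest value, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Confusion = std::vector<std::vector<std::size_t>>;

// Index of the largest element in a row.
std::size_t arg_max(const std::vector<double>& row)
{
    assert(!row.empty());
    return static_cast<std::size_t>(
        std::max_element(row.begin(), row.end()) - row.begin());
}

// confusion[t][o] counts the samples whose real class is t
// and whose predicted class is o.
Confusion calculate_confusion(const Matrix& targets, const Matrix& outputs,
                              std::size_t classes_number)
{
    assert(targets.size() == outputs.size());
    Confusion confusion(classes_number, std::vector<std::size_t>(classes_number, 0));

    for (std::size_t i = 0; i < targets.size(); ++i)
    {
        assert(targets[i].size() == classes_number);
        assert(outputs[i].size() == classes_number);
        ++confusion[arg_max(targets[i])][arg_max(outputs[i])];
    }
    return confusion;
}

// Fraction of correctly classified samples: diagonal sum over total.
double accuracy(const Confusion& confusion)
{
    std::size_t correct = 0;
    std::size_t total = 0;
    for (std::size_t t = 0; t < confusion.size(); ++t)
    {
        for (std::size_t o = 0; o < confusion[t].size(); ++o)
        {
            total += confusion[t][o];
            if (t == o) correct += confusion[t][o];
        }
    }
    assert(total > 0);
    return static_cast<double>(correct) / static_cast<double>(total);
}
```

The accuracy helper shows how the diagonal cells summarize the matrix in a single number.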
For more information about the testing analysis methods, see TestingAnalysis class.
6. Model deployment
Once our model is completed, the neural network is ready to predict outputs for inputs that it has never seen.
This process is called model deployment.
In order to generate predictions with new data, you can use
neural_network.calculate_outputs();
For instance, the new inputs are:
- Sepal length: 5.10 cm.
- Sepal width: 3.50 cm.
- Petal length: 1.40 cm.
- Petal width: 0.20 cm.
and in OpenNN we can write it as
Tensor<type, 2> inputs(1,4);
inputs.setValues({{type(5.1),type(3.5),type(1.4),type(0.2)}});
neural_network.calculate_outputs(inputs);
data_set.unscale_input_variables(inputs_descriptives);
Alternatively, you can save the model expression for later implementation in other languages such as Python, PHP, …
neural_network.save_expression_c("../data/expression.c");
neural_network.save_expression_python("../data/expression.py");
References:
- UCI Machine Learning Repository. Iris Data Set.
- Fisher, R.A. "The use of multiple measurements in taxonomic problems." Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).