How To Create a Siamese Network With Keras to Compare Images

The whole code is available on Kaggle:

There you can execute the code, fork it, and modify it as you like.

I have used the famous MNIST dataset to train the system: 42,000 grayscale 28×28 images of handwritten digits. The model will try to identify whether two digits are identical.

A brief description of how a Siamese network works.

It receives two inputs and produces two output vectors; the Euclidean distance between these vectors measures how different the inputs are.

Image by Author

The output of the model is a number representing the difference between the inputs. We must decide the threshold below which the images are considered to be of the same type. The smaller the number returned by the model, the smaller the difference. If the model receives two identical images, the returned value must be zero.
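The decision step described above can be sketched in a few lines. The threshold value here is illustrative, not taken from the article; in practice you would tune it on validation pairs:

```python
# The model outputs the Euclidean distance between the two embedding
# vectors. A pair is declared "same class" when the distance falls
# below a chosen threshold (0.5 is an assumed, illustrative value).
def same_class(distance, threshold=0.5):
    return distance < threshold

# An identical pair yields distance 0, so it is always "same".
print(same_class(0.0))  # True
print(same_class(1.3))  # False
```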

In which situations can a Siamese Network be useful?

Although they are widely used in fields such as facial recognition, Siamese networks are not limited to the field of images.

They are also very popular in NLP, where they can be used to identify duplicates, texts that deal with the same topic, or even identify if two texts are of the same style or author.

They can also be used to recognize audio files, for example, to compare voices and determine whether they belong to the same person.

Siamese networks work whenever we want to compare two items with each other, whatever their type. These networks are especially recommended when the training dataset is limited, since we can match the available items in different ways, increasing the information we can obtain from the data.

In the notebook, I have matched each item with one other item on the list. But I could have matched each item more than once with different items in the dataset, creating as many input pairs as I wanted, which would give us an impressive amount of data. This possibility of combination allows us to have enough training data, no matter how small the dataset is.

Data Treatment.

To see the whole code, the best option is to have the Kaggle notebook open and follow along as you read.

Here I’m going to explain only the part in which I create the pairs of data, the part of data treatment most specific to Siamese networks.

This function iterates through each of the elements in the dataset and matches it with another element, resulting in a number of pairs equal to the number of elements in the dataset.

To ensure that there is a minimum number of elements with pairs of the same type, the min_equals parameter has been incorporated. The first pairs, until reaching min_equals, will be created with elements of the same type. The rest are matched randomly.

The last lines show how the pairs are created and stored in the pairs variable. Their labels are stored in the labels variable.

The data is transformed on return so that the model can work with it.

Note this line at the beginning of the function: index = [np.where(y == i)[0] for i in range(10)]. It creates a list named index of 10 rows, one per digit. Each row contains the positions in the array of labels (y) of the numbers that belong to the category indicated by the variable i.

That is, index[0] holds all the positions in the array y of the numbers with value 0; index[1], the positions of the value 1, and so on.
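The pairing logic described above can be sketched as follows. This is not the notebook's exact code: the function name, the default for min_equals, and the label convention (1 for a same-digit pair, 0 otherwise) are assumptions consistent with the description:

```python
import numpy as np

def create_pairs(X, y, min_equals=3000):
    """Pair every item with another item. The first `min_equals`
    pairs use two items of the same digit; the rest are matched
    randomly. (A sketch of the logic described in the article.)"""
    # index[i] holds the positions in y whose label is the digit i.
    index = [np.where(y == i)[0] for i in range(10)]

    pairs, labels = [], []
    for n, digit in enumerate(y):
        if n < min_equals:
            # Match with a random item of the same digit -> label 1.
            partner = np.random.choice(index[digit])
            labels.append(1)
        else:
            # Match with a random item of any digit.
            partner = np.random.randint(len(y))
            labels.append(1 if y[partner] == digit else 0)
        pairs.append([X[n], X[partner]])

    # Return arrays the model can consume directly.
    return np.array(pairs), np.array(labels).astype("float32")
```

The returned pairs array has shape (n_samples, 2, …), so pairs[:, 0] and pairs[:, 1] can be fed to the two inputs of the Siamese model.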

Creating the Siamese Model.

Before creating the model, we need to write three functions. One calculates the Euclidean distance between the two output vectors. Another specifies the shape of that distance layer's output. And a third is the loss function used to calculate the loss. Don't worry: it is, I would say, a standard function for all Siamese models.
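The first two of these helpers can be sketched as below; the exact function names are assumptions, but this shape is standard across Keras Siamese examples. The third, the contrastive loss, follows the formula given further on:

```python
import tensorflow.keras.backend as K

def euclidean_distance(vectors):
    """Euclidean distance between the two output vectors of a pair."""
    x, y = vectors
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    # K.epsilon() avoids sqrt(0), whose gradient is undefined.
    return K.sqrt(K.maximum(sum_square, K.epsilon()))

def eucl_dist_output_shape(shapes):
    """Output shape of the distance layer: one number per pair."""
    shape1, _ = shapes
    return (shape1[0], 1)
```

The distance function is typically wrapped in a Lambda layer applied to the two branch outputs, with eucl_dist_output_shape declaring its output shape.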

The loss function, contrastive_loss, is nested inside another function, contrastive_loss_with_margin, which allows us to pass a parameter to it. We could have skipped this and always used the same value for the margin.

When building the model, the loss function is passed to the compile method, which takes care of passing the y_true and y_pred parameters so that the loss can be calculated at each step. We can't pass any other parameters, which is why we nest the function: what compile receives is a loss function that expects only y_true and y_pred.

The mathematical formula of the loss function, adapted to our variables, is:

Ytrue * Ypred² + (1 − Ytrue) * max(margin − Ypred, 0)²

Following the code, we see that the variable square_pred stores the value of Ypred², and margin_square stores the value of max(margin − Ypred, 0)². Substituting these variables into the equation gives us the expression contained in the return line.
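Putting the formula and the nesting together gives a sketch like this (variable names square_pred and margin_square are from the article; the rest follows the standard contrastive-loss pattern):

```python
import tensorflow.keras.backend as K

def contrastive_loss_with_margin(margin=1.0):
    # The outer function exists only to capture `margin`;
    # compile() will call the inner function with y_true, y_pred.
    def contrastive_loss(y_true, y_pred):
        square_pred = K.square(y_pred)                           # Ypred²
        margin_square = K.square(K.maximum(margin - y_pred, 0))  # max(margin − Ypred, 0)²
        return K.mean(y_true * square_pred + (1 - y_true) * margin_square)
    return contrastive_loss
```

It would then be passed to compile as, for example, model.compile(loss=contrastive_loss_with_margin(margin=1.0), optimizer="rmsprop").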