One-Shot Learning (Part 2/2): Facial Recognition Using a Siamese Network

Photo by Andrew Seaman on Unsplash

In my previous article, we discussed the one-shot learning problem in detail, compared how various algorithms tackle it, and settled on siamese networks as the most effective approach. I’d encourage you to check it out:

But is this enough? Not quite! We need some hands-on experience to actually learn how to implement it. And what could be more relatable these days than facial recognition, which just happens to be a perfect use case for one-shot learning?

Humans learn new concepts with very little supervision. This is the principle behind one-shot learning, and it’s what we need to recreate in our model. Hence, we use a siamese network, which does not require extensive training samples for recognition tasks.

Image Courtesy — reports.ias

To provide a quick overview, a siamese network consists of two symmetrical neural networks that share the same architecture and the same weights.

They’re joined at the end by an energy function, E, which acts as a distance function. The objective is to learn whether two input images are similar or dissimilar. We’ll get a better understanding of siamese networks by building a facial recognition model.
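To make the idea of shared weights and an energy function concrete, here is a minimal NumPy sketch. It is not the model we build later: the one-layer branch, the layer sizes, and the choice of Euclidean distance for E are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# One set of weights, used by BOTH branches (sizes are hypothetical).
W = rng.normal(scale=0.1, size=(64, 16))  # 64-dim input -> 16-dim embedding
b = np.zeros(16)


def embed(x):
    """A single branch: one dense layer followed by ReLU."""
    return np.maximum(0.0, x @ W + b)


def energy(x1, x2):
    """E(x1, x2): Euclidean distance between the two branch outputs (assumed)."""
    return float(np.linalg.norm(embed(x1) - embed(x2)))


# Two stand-in "images", flattened to 64 values each.
img_a = rng.random(64)
img_b = rng.random(64)

print(energy(img_a, img_a))  # identical inputs give 0.0
print(energy(img_a, img_b))  # different inputs give a non-negative distance
```

Because both inputs pass through the same `embed` function, there is really only one network to train. Training then adjusts `W` and `b` so that the energy is small for two images of the same person and large otherwise.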

The complete code for this facial recognition model using a siamese network can be found at this link:

Once you’ve downloaded and extracted the zip, you’ll see a /data folder containing sub-folders named after some famous people, as shown here:

Each celebrity’s folder has a few sample images of that person taken from various angles, as shown below:
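A layout like this, with one sub-folder per person, is easy to walk in code. The sketch below is only an assumption about how you might index it: the function name `index_faces` and the list of accepted file extensions are hypothetical, so adjust them to match the files in your copy of the data.

```python
from pathlib import Path

# Assumed image extensions; change these to match your files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pgm"}


def index_faces(data_dir):
    """Map each person's sub-folder name to a sorted list of image paths."""
    index = {}
    for person_dir in sorted(Path(data_dir).iterdir()):
        if not person_dir.is_dir():
            continue  # skip stray files next to the sub-folders
        images = sorted(
            p for p in person_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
        if images:  # ignore folders that contain no images
            index[person_dir.name] = images
    return index


# Only runs if a data folder exists in the current directory.
if Path("data").is_dir():
    for name, images in index_faces("data").items():
        print(f"{name}: {len(images)} image(s)")
```

The folder name doubles as the label, so adding a new person is just a matter of adding a new sub-folder.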

You can also build your own database by making sub-folders for family and friends. The only thing to keep in mind is to crop the images to the face. This can be done with OpenCV and a Haar cascade classifier, a pre-trained model that detects faces and eyes in an image:
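As a rough sketch of that cropping step, the snippet below uses the frontal-face cascade that ships with the opencv-python package. The helper names `largest_box` and `crop_face`, the detection parameters, and the choice to keep only the largest detected face are my assumptions here, not necessarily what the repository code does.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    boxes = [tuple(int(v) for v in box) for box in boxes]
    return max(boxes, key=lambda b: b[2] * b[3]) if boxes else None


def crop_face(src_path, dst_path):
    """Crop the largest detected face in src_path and save it to dst_path.

    Returns True if a face was found and written, False otherwise.
    """
    # Imported here so largest_box stays usable without OpenCV installed.
    import cv2

    # Pre-trained frontal-face cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(str(src_path))
    if image is None:  # unreadable or missing file
        return False

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # scaleFactor and minNeighbors are common starting values, not tuned.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    box = largest_box(faces)
    if box is None:  # no face detected
        return False

    x, y, w, h = box
    return bool(cv2.imwrite(str(dst_path), image[y:y + h, x:x + w]))
```

Calling `crop_face("raw/mom_01.jpg", "data/mom/01.jpg")` (hypothetical paths) would write the cropped face into that person's sub-folder. If detection misses faces or picks up false positives, `scaleFactor` and `minNeighbors` are the first values to adjust.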

Let’s lay out a roadmap for the rest of the code: