Generative Adversarial Networks — Explained

Deep learning has changed the way we work and compute, and has made our lives a lot easier. As Andrej Karpathy put it, it is indeed "Software 2.0": we have taught machines to figure things out themselves. Many existing deep learning techniques can be credited for this prolific success. Deep generative models, however, had far less impact, largely because of the difficulty of approximating the intractable probabilistic computations they require. Ian Goodfellow found a way to sidestep these difficulties and created an ingenious new model: the Generative Adversarial Network (GAN). The capabilities of a GAN are astonishing. Before moving on to an introduction, let us look at some examples of what a GAN and its variants can do.

Examples

  • Given a segmented image of a road scene, the network can fill in the details with objects such as cars.

  • The network can convert a black & white image into colour.

  • Given an aerial map, the network can find the roads in the image.

  • It can also fill in the details of a photo, given only the edges.

  • Given an image of a face, the network can construct an image which represents how that person could look when they are old.

These are just a few examples; many more exist. Now that I have whetted your appetite, let's move on to what a GAN is and how it works.

Introduction

Generative Adversarial Networks take a game-theoretic approach, unlike a conventional neural network. The network learns to generate samples from the training distribution through a two-player game between two entities: the Generator and the Discriminator. These two adversaries are in constant battle throughout training. Because an adversarial learning method is adopted, we do not need to approximate intractable density functions.

How it works

Generative Adversarial Network

As their names suggest, the generator's job is to produce real-looking images, and the discriminator's job is to identify which ones are fake. The two adversaries are in constant battle: the generator tries to fool the discriminator, while the discriminator tries not to be fooled. To generate the best images you need both a very good generator and a very good discriminator. If the generator is not good enough, it will never be able to fool the discriminator and the model will never converge. If the discriminator is bad, images that make no sense will also be classified as real, so the model never trains properly and never produces the desired output. Training works as follows: random noise, typically sampled from a Gaussian distribution, is fed into the generator network, which produces an image. The discriminator then receives this generated image alongside real images and tries to identify whether each given image is fake or real.
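The two-player game above can be sketched in a deliberately tiny 1-D setting. This is a hypothetical toy, not a real image GAN: the "real" data is a Gaussian, the generator is a single affine map, and the discriminator is a logistic regression. The generator uses the non-saturating update that is standard in practice (maximizing log D(G(z)) rather than minimizing log(1 − D(G(z)))), and all hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Logits are clipped to keep np.exp well-behaved.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30, 30)))

def sample_real(n):
    # "Real" data distribution the generator must learn to imitate.
    return rng.normal(4.0, 0.5, size=n)

g_w, g_b = 1.0, 0.0   # generator:     G(z) = g_w * z + g_b
d_w, d_b = 0.1, 0.0   # discriminator: D(x) = sigmoid(d_w * x + d_b)
lr = 0.05

for step in range(2000):
    z = rng.normal(size=32)          # Gaussian noise fed to the generator
    x_real = sample_real(32)
    x_fake = g_w * z + g_b

    # Discriminator update: gradient ASCENT on E[log D(x)] + E[log(1 - D(G(z)))].
    d_real = sigmoid(d_w * x_real + d_b)
    d_fake = sigmoid(d_w * x_fake + d_b)
    # Derivative of log D w.r.t. the logit is (1 - D); of log(1 - D) it is -D.
    d_w += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    d_b += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator update: gradient ASCENT on the non-saturating E[log D(G(z))],
    # chaining through the (freshly updated) discriminator.
    d_fake = sigmoid(d_w * (g_w * z + g_b) + d_b)
    g_w += lr * np.mean((1 - d_fake) * d_w * z)
    g_b += lr * np.mean((1 - d_fake) * d_w)

print(f"generated mean after training: {g_b:.2f} (target 4.0)")
```

After training, the generator's output mean drifts toward the real-data mean of 4.0, even though neither network ever sees the other's parameters directly; all interaction happens through the discriminator's classifications.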

Objective Function

Since a game-theoretic approach is taken, the objective function is formulated as a minimax game. The discriminator tries to maximize the objective, so we perform gradient ascent on its parameters; the generator tries to minimize it, so we perform gradient descent on its parameters. The network is trained by alternating between these two updates.
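Written out in the notation of the original GAN paper, the minimax objective is:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here D(x) is the discriminator's probability that x is real, and G(z) is the sample the generator produces from noise z. The discriminator maximizes V, the generator minimizes it.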

Gradient Ascent on Discriminator

Gradient Descent on Generator

In practice, however, optimizing this generator objective does not work well. When a generated sample is likely to be classified as fake, the generator would like to learn from the gradients, but the gradients in that region turn out to be relatively flat. This makes it difficult for the model to learn. Therefore, the generator objective function is changed as below.

New Generator Objective function

Instead of minimizing the likelihood of the discriminator being correct, we maximize the likelihood of the discriminator being wrong. Accordingly, we perform gradient ascent on the generator using this objective function.
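Written out, the non-saturating generator objective is:

```latex
\max_G \; \mathbb{E}_{z \sim p_z(z)}\big[\log D(G(z))\big]
```

A quick way to see why this helps: write $D = D(G(z))$ and compare the derivatives of the two losses with respect to $D$. Early in training, when samples are easily spotted as fake, $D \approx 0$:

```latex
\frac{\partial}{\partial D}\log(1 - D) = -\frac{1}{1 - D} \approx -1
\qquad\text{vs.}\qquad
\frac{\partial}{\partial D}\log D = \frac{1}{D} \to \infty \;\text{ as } D \to 0
```

So the original objective gives its weakest signal exactly where the generator most needs to learn, while the non-saturating objective gives its strongest signal there.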

Disadvantages

  • GANs are more unstable to train because two networks must be kept in balance through alternating updates. Choosing the right objectives can therefore make a big difference.
  • GANs do not give an explicit representation of the density, so we cannot perform inference queries such as evaluating p(x).

Conclusion

Generative Adversarial Networks are a recent development and have already shown huge promise. They are an active area of research, and new GAN variants appear frequently.
