Multi-Agent Diverse Generative Adversarial Networks
Arnab Ghosh*, Viveka Kulharia*, Vinay Namboodiri, Philip H. S. Torr, Puneet K. Dokania
As we can see from the Bags Generated by the 3 Generators, each of them seem highly artistic and designer quality.
Why to employ Multiple Generators?
As we can see from the 4 randomly sampled images from Imagenet below, the variations in the image samples is staggering.
As we have seen previously that single GANs (DCGAN) have experienced great success in modeling restricted domains of images such as celebrity faces and our motivation is to distribute the tasks of the Generators so that they can master disjoint modes of the data distribution.
In GANs the training signal for the Generator comes from the Discriminator. Hence the gradients from the Discriminator are attracted more by high density modes than the lower density modes. As shown in figure below, the generation manifold isn’t able to master low density modes well.
Problems with naive modeling of multiple Generators:
There is nothing to stop the different Generators from learning the same parts of the data distribution manifold.
Parameters would explode if the Generators share no information.
To ensure that the Generators are pushed towards distinct areas of the distribution manifold, we use diversity promoting objectives such as Generator Identification based Objective and Competing Objective.
To avoid the problem of exploding parameters we share the parameters between the Generators except the last layer.
As shown in the figure above, even if the generation manifolds of the Generators G1 and G2 are close together, our formulation can master low density modes by diversity enforcing objectives.
Model architecture with Competing Objective.
Discriminator output in traditional GANs score how real an image is.
Competing Objective is forcing the Generators to generate different kinds of images besides being real looking.
Model architecture with Generator Identification based Objective.
Intuitively, the Discriminator has knowledge of the data distribution and we leverage that information to push the different Generators towards disjoint modes.
We change the game a bit and ask the Discriminator to predict not only whether an image is real or fake but also if it's fake then which Generator produced it.
Intuitively, to succeed in this task, since the gradients of the Generators come from the Discriminator it has to learn to push the different Generators towards different kinds of modes.
To show the efficacy of our method in toy datasets, we have 18 Gaussians arranged in 3 clusters and from one cluster less points were sampled to emulate the mode density imbalance situation. GAN (on the top-right) collapses all the modes and is not able to recover the data distribution while GAN with Batch Normalization (on the bottom-left) misses out the cluster from which less samples were sampled.
Our model (on the top-right) is able to capture most of the modes and as we can see from the different colors the different Generators master disjoint modes of the data distribution.
The figure top left is for Competing Objective while on top right is for Generator Identification based Objective. Diverse generations for 'night to day' image generation task. First column in each figure represents the input. The remaining three columns show the diverse outputs of different Generators. It is evident from that different Generators are able to produce very diverse results capturing different lighting conditions, different sky patterns (cloudy vs clear sky), different weather conditions (winter, summer, rain), different landscapes, among many other minute yet useful cues.
The figure on top is for Competing Objective while the bottom one is for Generator Identification based Objective. Diverse generations for 'sketch-bag to real-bag' generation task. The first column in each figure represents the input. The remaining three columns show the diverse outputs of different Generators. It is evident from it that different Generators are able to produce very diverse results capturing different colors (blue, green, black etc), textures (leather, fabric, plastic etc), design patterns, among others.
The left figure corresponds to Competing Objective. Here the first Generator generates as expected by the DC-GAN, while the second Generator generates with high detailing of the faces and exaggerated facial features, and the third Generator generates images with sketch like quality with an artistic touch. The right figure corresponds to Generator Identification based Objective. Here, the first Generator is generating faces as the normal DC-GAN architecture would. The second Generator is generating faces like oil-paintings, while the third Generator seems to be generating images with coarse strokes without much focus on the background