Gan Final

The document summarizes recent research on Generative Adversarial Networks (GANs) for improving image quality and variation. It discusses how GANs can now generate sharp, high-resolution images using techniques like progressive growing, which starts training at low resolutions and increases the resolution over time. It also explores ways to improve variation, such as minibatch discrimination and repelling regularization. The research evaluates GANs on several datasets, finding that they can generate 1024×1024 CelebA images and achieve an inception score of 7.9 on CIFAR10 in an unsupervised setup.


SUMMARY

Progressive Growing of GANs for Improved Quality and
Variation:

GAN – Generative Adversarial Network


The current most prominent approaches for producing novel
samples from high-dimensional data distributions are:
• Autoregressive models (e.g. PixelCNN) – slow sampling limits their applicability
• Variational Autoencoders (VAEs) – produce blurry images due to
restrictions in the model
• Generative Adversarial Networks (GANs) – produce sharp images, but only at
fairly small resolutions

GAN:
➢ Typically, a GAN consists of two networks: a generator and a discriminator.
The generator produces an image (sample) from a latent code and is the
network of main interest; the discriminator is an adaptive loss
function that is discarded once the generator has been trained.
➢ The distance between the training distribution and the generated
distribution can be measured with many formulations; the current work
uses the improved Wasserstein loss (WGAN-GP), along with experiments on
the least-squares (LSGAN) loss.
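The improved Wasserstein critic objective mentioned above can be sketched numerically. In the minimal sketch below, the critic scores and the gradient norms at interpolated points are passed in as plain arrays (in a real framework the norms would come from automatic differentiation); the function name, toy values, and default penalty weight of 10 are illustrative assumptions, not the paper's code.

```python
import numpy as np

def wgan_gp_critic_loss(d_real, d_fake, grad_norms, lam=10.0):
    """Improved Wasserstein (WGAN-GP) critic loss:
    E[D(fake)] - E[D(real)] + lambda * E[(||grad|| - 1)^2].
    grad_norms are the critic gradient norms at interpolated samples."""
    wasserstein = np.mean(d_fake) - np.mean(d_real)
    penalty = lam * np.mean((grad_norms - 1.0) ** 2)
    return wasserstein + penalty

# Toy numbers: the critic scores real samples higher than fakes, and the
# gradients are already at unit norm, so the penalty term vanishes.
loss = wgan_gp_critic_loss(np.array([1.0, 2.0]),
                           np.array([0.0, 1.0]),
                           np.array([1.0, 1.0]))
print(loss)  # -1.0
```

The gradient penalty pushes the critic toward unit gradient norm, which is what enforces the 1-Lipschitz constraint in this formulation.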
➢ The contributions are evaluated on the CELEBA, LSUN, and CIFAR10
datasets. An improved version of the CELEBA dataset was created to allow
experimentation with output resolutions up to 1024×1024 pixels.
➢ The training methodology starts with low-resolution images and then
progressively increases the resolution by adding layers to the
networks. This stabilizes training sufficiently to reliably synthesize
megapixel-scale images using the WGAN-GP and LSGAN losses.
➢ Training starts with two networks at a low spatial resolution of
4×4 pixels; layers are then added, with G (generator) and
D (discriminator) trained at each stage, until a resolution of
1024×1024 is reached.
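The doubling schedule described above can be sketched in a few lines. This is a minimal illustration, not the paper's training code: the fade-in helper assumes each new layer is blended in linearly with a weight that ramps from 0 to 1, and the image counts used in the example are made up.

```python
# Progressive growing: training starts at 4x4 and doubles the
# resolution until 1024x1024 is reached.

def resolution_schedule(start=4, final=1024):
    """Yield the sequence of training resolutions (4, 8, ..., 1024)."""
    res = start
    while res <= final:
        yield res
        res *= 2

def fade_in_alpha(images_seen, fade_images):
    """Linear blending weight for a newly added layer, clipped to [0, 1]."""
    return min(1.0, images_seen / fade_images)

print(list(resolution_schedule()))      # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(fade_in_alpha(300_000, 600_000))  # 0.5
```

Halfway through the (hypothetical) fade-in budget, the new layer contributes with weight 0.5, so the transition to the higher resolution is smooth rather than abrupt.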
➢ Since GANs have a tendency to capture only a subset of the variation
found in the training data, "minibatch discrimination" is considered as
a solution, i.e. a minibatch layer is added at the end of the
discriminator in order to improve variation.
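One simplified form of such a minibatch layer computes how much the samples in a minibatch differ from each other and feeds that statistic back to the discriminator as an extra feature map, so the discriminator can penalize generators that collapse to low variation. The sketch below is an assumed simplified variant (average standard deviation across the batch appended as one constant channel), not necessarily the exact layer used in the work summarized here.

```python
import numpy as np

def minibatch_stddev(x):
    """Append a constant feature map holding the average per-feature
    standard deviation across the minibatch.
    Shape: (N, C, H, W) -> (N, C+1, H, W)."""
    std = np.std(x, axis=0)   # std over the batch, per feature and pixel
    avg = np.mean(std)        # collapse to a single scalar statistic
    extra = np.full((x.shape[0], 1, x.shape[2], x.shape[3]), avg)
    return np.concatenate([x, extra], axis=1)

x = np.random.randn(4, 8, 4, 4)
print(minibatch_stddev(x).shape)  # (4, 9, 4, 4)
```

If the generator produces near-identical samples, the appended channel is close to zero, which gives the discriminator a direct signal that variation is missing.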
➢ An alternative solution is the "repelling regularizer".
➢ Normalization in the generator and discriminator is done in two ways:
"equalized learning rate" and "pixelwise feature vector normalization
in the generator".
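Pixelwise feature vector normalization can be sketched directly: each pixel's feature vector is rescaled to unit average magnitude across the channel axis. The function name and epsilon value below are illustrative choices.

```python
import numpy as np

def pixelwise_feature_norm(x, eps=1e-8):
    """Normalize each pixel's feature vector to unit average magnitude.
    x has shape (N, C, H, W); the mean of squares is taken over the
    channel axis C, so feature magnitudes cannot drift during training."""
    return x / np.sqrt(np.mean(x ** 2, axis=1, keepdims=True) + eps)

x = np.random.randn(2, 8, 4, 4)
y = pixelwise_feature_norm(x)
# After normalization, the mean squared channel value at every pixel is ~1.
print(np.allclose(np.mean(y ** 2, axis=1), 1.0, atol=1e-3))  # True
```

Because the rescaling is per pixel, it constrains the magnitude of activations without changing the direction of any feature vector.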
➢ A Laplacian pyramid representation is used to analyze the similarity
between local patches of generated and target images, starting at a
resolution of 16×16 pixels.
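A Laplacian pyramid decomposes an image into per-scale detail bands; local patches drawn from these bands are what the evaluation compares. The sketch below uses a crude box filter for down/upsampling (the real metric would use better filters), so it illustrates only the structure of the decomposition.

```python
import numpy as np

def downsample(img):
    """2x box-filter downsample (simplified filter choice)."""
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def upsample(img):
    """Nearest-neighbour 2x upsample."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Each level stores the detail lost between two adjacent
    resolutions; the final entry is the low-frequency residual."""
    pyramid = []
    for _ in range(levels):
        low = downsample(img)
        pyramid.append(img - upsample(low))  # band-pass detail at this scale
        img = low
    pyramid.append(img)
    return pyramid

levels = laplacian_pyramid(np.random.randn(32, 32), 3)
print([l.shape for l in levels])  # [(32, 32), (16, 16), (8, 8), (4, 4)]
```

With this construction the original image is exactly recoverable by upsampling the residual and adding back each detail band, which confirms no information is lost in the decomposition.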
➢ Sliced Wasserstein distance (SWD) and multi-scale structural similarity
(MS-SSIM) are used to evaluate the importance of the individual
contributions, building on top of a previous state-of-the-art loss
function (WGAN-GP) and training configuration in an unsupervised
setting, using the CELEBA and LSUN BEDROOM datasets at 128×128 pixel
resolution.
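The idea behind SWD is that a high-dimensional Wasserstein distance between two sets of patch descriptors can be approximated by averaging cheap one-dimensional Wasserstein distances over many random projections. The sketch below illustrates that idea with made-up parameter choices (projection count, seed); it is not the evaluation code of the summarized work.

```python
import numpy as np

def sliced_wasserstein(a, b, n_projections=128, seed=0):
    """Approximate the Wasserstein distance between two sets of patch
    descriptors (one per row) by averaging 1-D distances over random
    unit-vector projections."""
    rng = np.random.default_rng(seed)
    dim = a.shape[1]
    total = 0.0
    for _ in range(n_projections):
        d = rng.standard_normal(dim)
        d /= np.linalg.norm(d)
        # In 1-D, the Wasserstein distance is the mean absolute
        # difference of the two sorted projected samples.
        total += np.mean(np.abs(np.sort(a @ d) - np.sort(b @ d)))
    return total / n_projections

a = np.random.default_rng(1).standard_normal((256, 16))
print(sliced_wasserstein(a, a) < 1e-9)  # identical sets -> distance ~0
```

Sorting reduces each projected comparison to an optimal 1-D transport, which is what makes the sliced approximation tractable even for many patches.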
➢ On the CELEBA dataset, high-quality 1024×1024 images were obtained.
The best inception scores for CIFAR10 were 7.90 in the unsupervised
setup and 8.87 in the label-conditioned setup; after removing the
"ghosts" that appear between classes in the unsupervised setting, the
score was found to be 8.80.
