lecture16 GAN cont

The document discusses Generative Adversarial Networks (GANs) and their training processes, including the roles of the generator and discriminator. It introduces Wasserstein Loss as an alternative to binary cross-entropy loss, addressing issues like vanishing gradients. Additionally, it covers conditional and controllable generation techniques, evaluation metrics like Frechet Inception Distance (FID), and the Pix2Pix model for image-to-image translation.


COMP 4102 – Computer Vision

Generative Adversarial Networks (cont.)


Majid Komeili

* Unless otherwise noted, all material posted for this course is copyright of the
instructor and cannot be reused or reposted without the instructor's written permission.
Housekeeping Items

▪ GAN Tutorial: Thursday March 13 (tomorrow) at 5 pm over Zoom.
▪ The tutorial will be recorded.

Majid Komeili, Carleton University 2


Generative Adversarial Networks
Discriminator learns to distinguish real from fake.
Generator learns to generate fakes, from random noise, that look real.
Generator and Discriminator learn from the competition with each other.
Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
Training the Discriminator
Recall: BCE = −Σᵢ₌₁ⁿ [yᵢ log(ŷᵢ) + (1 − yᵢ) log(1 − ŷᵢ)]

Update the discriminator weights θ using backprop, computing BCE with labels 1 for real samples x and 0 for fake samples G(z):

Minimize over θ:   𝒥_D = 𝔼_{x∈𝒟}[−log D(x)] + 𝔼_z[−log(1 − D(G(z)))]

The first term is the negative log-probability of D predicting that real-world data x is genuine; the second is the negative log-probability of D predicting that generated data is not genuine.
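The discriminator objective above can be checked numerically. A minimal pure-Python sketch — the `bce` helper and the toy D scores are hypothetical, for illustration only:

```python
import math

def bce(y_true, y_pred, eps=1e-12):
    # Binary cross-entropy summed over samples, matching the slide's formula.
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred))

# Discriminator objective on a toy batch: real samples labelled 1, fakes labelled 0.
d_real = [0.9, 0.8]   # hypothetical D(x) scores on two real images
d_fake = [0.2, 0.1]   # hypothetical D(G(z)) scores on two fake images
j_d = bce([1, 1], d_real) + bce([0, 0], d_fake)
```

A perfect discriminator (D(x) = 1 on reals, D(G(z)) = 0 on fakes) drives both terms of `j_d` to zero.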
Training the Generator
Minimize over θ:   𝒥_D = 𝔼_{x∈𝒟}[−log D(x)] + 𝔼_z[−log(1 − D(G(z)))]

Recall: BCE = −Σᵢ₌₁ⁿ [yᵢ log(ŷᵢ) + (1 − yᵢ) log(1 − ŷᵢ)]

Feed random noise z through the Generator G_φ and the Discriminator D_θ; backprop through the discriminator but do not update its parameters θ. Update the generator weights φ using backprop, computing BCE with all labels equal to real (1):

Minimize over φ:   𝒥_G = 𝔼_z[−log D(G(z))]

We may also define 𝒥_G as 𝒥_G = −𝒥_D = const. + 𝔼_z[log(1 − D(G(z)))]

Therefore, the entire cost function for the GAN can be written as min_θ max_φ 𝒥_D:

min_θ max_φ   𝔼_{x∈𝒟}[−log D_θ(x)] + 𝔼_z[−log(1 − D_θ(G_φ(z)))]
Training GANs
Original GAN loss:

min_θ max_φ   𝔼_{x∈𝒟}[−log D_θ(x)] + 𝔼_z[−log(1 − D_θ(G_φ(z)))]

▪ For each iteration
  ▪ Repeat K times
    ▪ Draw z⁽¹⁾, z⁽²⁾, …, z⁽ⁿ⁾ (random noise) and generate n fake samples.
    ▪ Draw x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾ from the training set.
    ▪ Update the discriminator by gradient descent using:
      θ_new = θ_old − ∇_θ (1/n) Σᵢ₌₁ⁿ [−log D_θ(x⁽ⁱ⁾) − log(1 − D_θ(G_φ(z⁽ⁱ⁾)))]
  ▪ Draw z⁽¹⁾, z⁽²⁾, …, z⁽ⁿ⁾ (random noise) and generate n fake samples.
  ▪ Update the generator by gradient descent using the modified generator loss:
      φ_new = φ_old − ∇_φ (1/n) Σᵢ₌₁ⁿ [−log D_θ(G_φ(z⁽ⁱ⁾))]
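The alternating schedule above (K discriminator updates, then one generator update, per iteration) can be sketched as a loop. `d_step` and `g_step` are hypothetical stand-ins for one gradient update each:

```python
import random

def train_gan(d_step, g_step, iters=3, k=2, n=4):
    # Alternating updates as on the slide: K discriminator steps, then one
    # generator step, per outer iteration.
    schedule = []
    for _ in range(iters):
        for _ in range(k):
            z = [random.gauss(0, 1) for _ in range(n)]  # noise batch -> n fakes
            x = [random.gauss(2, 1) for _ in range(n)]  # batch from the "training set"
            d_step(x, z)                                # one discriminator update
            schedule.append("D")
        z = [random.gauss(0, 1) for _ in range(n)]      # fresh noise for the generator
        g_step(z)                                       # one generator update
        schedule.append("G")
    return schedule

schedule = train_gan(lambda x, z: None, lambda z: None)
```

With `iters=3, k=2`, the update order is D, D, G repeated three times — the discriminator is trained more often because its gradient signal also drives the generator.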
Issue with BCE loss
▪ Generally, the discriminator's task is easier than the generator's.
▪ If the discriminator becomes too strong relative to the generator, the BCE
loss saturates (flat regions).
▪ Consequently, there is no gradient signal for the generator.



Wasserstein Loss
▪ Wasserstein Loss is an alternative to the BCE loss.
▪ It is based on the Earth Mover's Distance.
▪ Instead of a discriminator, we have a critic c:

min_g max_c   𝔼[c(x)] − 𝔼[c(G(z))]

▪ The output of W-Loss can be any real value (i.e. the output of a linear layer rather than a sigmoid), representing how real or fake an image is.
▪ Helps with vanishing gradients and mode collapse.
▪ The critic should satisfy the 1-Lipschitz continuity condition. Two common ways to enforce this condition:
  ▪ Weight clipping to [−β, β].
  ▪ Gradient penalty: add a regularization term that penalizes the critic when its gradient norm is higher than 1, defined as (‖∇f(x̂)‖₂ − 1)².

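As a rough sketch of these ideas: a toy linear critic with the W-Loss objective and a numerically estimated gradient penalty. The critic's weight (1.5) is made up for illustration; a real critic is a neural network:

```python
def critic(x, w=1.5, b=0.0):
    # Toy linear critic c(x) = w*x + b: an unbounded real score, no sigmoid.
    return w * x + b

def w_loss(reals, fakes):
    # Critic objective to MAXIMIZE: E[c(x)] - E[c(G(z))].
    return (sum(critic(x) for x in reals) / len(reals)
            - sum(critic(g) for g in fakes) / len(fakes))

def grad_penalty(x_hat, eps=1e-5):
    # (|grad c(x_hat)| - 1)^2 at an interpolate x_hat, via central differences.
    g = (critic(x_hat + eps) - critic(x_hat - eps)) / (2 * eps)
    return (abs(g) - 1.0) ** 2
```

Because this toy critic has slope 1.5 > 1, it violates 1-Lipschitz continuity, so the penalty is positive everywhere and would push the slope back toward 1.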


WGAN vs GAN

Discriminator/Critic:
  GAN:   min_D (1/n) Σᵢ₌₁ⁿ [−log D(x⁽ⁱ⁾) − log(1 − D(G(z⁽ⁱ⁾)))]
  WGAN:  max_c (1/n) Σᵢ₌₁ⁿ [c(x⁽ⁱ⁾) − c(G(z⁽ⁱ⁾))]

Generator:
  GAN:   min_G (1/n) Σᵢ₌₁ⁿ [−log D(G(z⁽ⁱ⁾))]
  WGAN:  max_G (1/n) Σᵢ₌₁ⁿ c(G(z⁽ⁱ⁾))

Conditional Generation
• Specify which class we want the Generator to generate images from.
• Generator input: random noise concatenated with a one-hot vector (e.g. [0, 1, 0, 0, 0, 0]) indicating the class.
• Discriminator input: the RGB image stacked with one-hot class maps: a matrix full of ones for the indicated class and matrices full of zeros for the other classes.
• Discriminator label: 1 if real and from the correct class, 0 otherwise.
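A minimal sketch of how the conditional inputs above might be assembled; the function names are my own, not from the slides:

```python
def one_hot(label, num_classes):
    # One-hot class vector, e.g. label 1 of 3 -> [0.0, 1.0, 0.0].
    v = [0.0] * num_classes
    v[label] = 1.0
    return v

def generator_input(noise, label, num_classes):
    # Condition the generator: append the one-hot class vector to the noise.
    return noise + one_hot(label, num_classes)

def discriminator_input(image_hw, label, num_classes):
    # Condition the discriminator: one-hot "class maps" to stack on the image --
    # a matrix of ones for the indicated class, matrices of zeros for the rest.
    h, w = image_hw
    return [[[float(c == label)] * w for _ in range(h)]
            for c in range(num_classes)]
```

In practice these class maps are concatenated channel-wise with the RGB image before the first convolution.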


Conditional vs Controllable Generation
▪ Conditional Generation
  ▪ Generate images from a desired class.
  ▪ e.g. generate a sample from the class "dog".
▪ Controllable Generation
  ▪ Generate images with a desired feature.
  ▪ e.g. eyeglasses when generating face images.



Controllable Generation
▪ Goal: generate images with a desired feature (e.g. eyeglasses).

Random noise → Generator → Classifier (pre-trained eyeglass detector)

https://arxiv.org/pdf/1907.10786.pdf



Controllable Generation
▪ Update the random noise z based on the gradient from a pre-trained
classifier.

z₀ (random noise) → Generator → Classifier (pre-trained eyeglass detector) → P(eyeglasses) = 0.01

https://arxiv.org/pdf/1907.10786.pdf



Controllable Generation
▪ Update the random noise z based on the gradient from a pre-trained
classifier.

z₁ → Generator → Classifier (pre-trained eyeglass detector) → P(eyeglasses) = 0.4

Backprop through the generator and the classifier but do not update their weights; update only z using backprop.

https://arxiv.org/pdf/1907.10786.pdf



Controllable Generation
▪ Update the random noise z based on the gradient from a pre-trained
classifier.

z₂ → Generator → Classifier (pre-trained eyeglass detector) → P(eyeglasses) = 0.8

Backprop through the generator and the classifier but do not update their weights; update only z using backprop.

https://arxiv.org/pdf/1907.10786.pdf



Controllable Generation
▪ Update the random noise z based on the gradient from a pre-trained
classifier.

z₃ → Generator → Classifier (pre-trained eyeglass detector) → P(eyeglasses) = 0.99

Backprop through the generator and the classifier but do not update their weights; update only z using backprop.

https://arxiv.org/pdf/1907.10786.pdf

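The z-update loop across the last few slides can be sketched with toy stand-ins for the frozen networks. Both functions below are hypothetical 1-D caricatures of a generator and a pre-trained eyeglass detector; the gradient is estimated by finite differences to stand in for backprop:

```python
import math

def generator(z):
    # Toy frozen "generator": maps the latent to an eyeglass-intensity feature.
    return 2.0 * z

def classifier(x):
    # Toy frozen detector: sigmoid score standing in for P(eyeglasses | image).
    return 1.0 / (1.0 + math.exp(-x))

def steer_latent(z, steps=50, lr=0.5, eps=1e-5):
    # Gradient ASCENT on P(eyeglasses) with respect to z only; the generator's
    # and classifier's weights are never updated, exactly as on the slides.
    for _ in range(steps):
        grad = (classifier(generator(z + eps))
                - classifier(generator(z - eps))) / (2 * eps)
        z += lr * grad
    return z
```

Starting from a latent with a low score, each update nudges z so that the generated image scores higher under the frozen detector — mirroring the 0.01 → 0.4 → 0.8 → 0.99 progression above.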


Evaluating GANs
▪ Evaluation is a challenge because there is no ground truth.
▪ Fidelity
  ▪ Quality of the generated images: how realistic are they?
  ▪ Compare more abstract features (e.g. for generated dogs: two eyes, a nose, four legs, …).
▪ Diversity
  ▪ The range of images the generator can generate.



Frechet Inception Distance (FID)
▪ Goal: compare the statistics of a set of fake images versus the statistics of a set of real images in an embedding space.
▪ Embedding space: the Inception-v3 model is used, taking the last pooling layer prior to the output (2048 features).
▪ The activations are summarized as a multivariate Gaussian: calculate the mean and covariance of the images in the embedding space.
▪ The distance between these two distributions is then calculated using the Fréchet distance.

▪ Lower FID = closer distributions.

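As a sketch, here is the univariate special case of the Fréchet distance between two Gaussians; the real FID applies the multivariate version to the 2048-dimensional Inception-v3 feature statistics:

```python
import math

def fid_1d(mu_r, var_r, mu_f, var_f):
    # Univariate Frechet distance between N(mu_r, var_r) and N(mu_f, var_f):
    #   d^2 = (mu_r - mu_f)^2 + var_r + var_f - 2*sqrt(var_r * var_f)
    # Real FID replaces the last term with 2*Tr((Sigma_r Sigma_f)^(1/2)) over
    # the covariance matrices of the Inception-v3 embeddings.
    return (mu_r - mu_f) ** 2 + var_r + var_f - 2.0 * math.sqrt(var_r * var_f)
```

Identical distributions give FID = 0; shifting the mean or changing the spread of the fake distribution both increase it, which is why lower FID means closer distributions.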


Image-to-image Translation
▪ A wide range of important image-to-image problems exists in the space of
computer graphics, image processing, and computer vision.

Isola, Phillip, et al. "Image-to-image translation with conditional adversarial networks." 2017.
Image-to-image Translation

• Using naive loss functions often produces blurry images → use a GAN.

Isola, Phillip, et al. "Image-to-image translation with conditional adversarial networks." 2017.



Image-to-image Translation
▪ GANs are generative models that learn a mapping from
random noise vector z to output image y:
G:z→y
▪ Training data: a set of real data

▪ In contrast, conditional GANs learn a mapping from an observed image x and a random noise vector z to y:
G : {x, z} → y
▪ We condition the generated image y on the input image x.
▪ Training data: pairs of source and target images.



Pix2Pix Model
GAN loss function (x: real image, z: noise, G(z): fake image; training dataset: images x):

min_θ max_φ   𝔼_{x∈𝒟}[−log D_θ(x)] + 𝔼_z[−log(1 − D_θ(G_φ(z)))]

Equivalently, writing y for a real image (y: real image, z: noise, G(z): fake image; training dataset: images y):

min_φ max_θ   𝔼_{y∈𝒟}[log D_θ(y)] + 𝔼_z[log(1 − D_θ(G_φ(z)))]

cGAN loss function, as seen in the original paper (x: source image, y: real target image, z: noise, G(x, z): fake target image; training dataset: pairs of (x, y) images):

min_φ max_θ   𝔼_{x,y}[log D_θ(x, y)] + 𝔼_{x,z}[log(1 − D_θ(G_φ(x, z)))]

The generator tries to minimize this function against an adversarial discriminator that tries to maximize it.

• cGANs were first proposed in Mirza, Mehdi, and Simon Osindero. "Conditional generative adversarial nets." arXiv preprint arXiv:1411.1784 (2014).
• Pix2Pix is a cGAN proposed in Isola, Phillip, et al. "Image-to-image translation with conditional adversarial networks." 2017.


Pix2Pix Model
x: source image; y: real target image; z: noise; G(x, z): fake target image. Training dataset: pairs of (x, y) images.

• To force the generator to produce outputs that are near the ground truth and to reduce blurriness, the authors added an L1 loss:

ℒ_L1(G) = 𝔼_{x,y,z}[‖y − G(x, z)‖₁]

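A minimal sketch of the combined generator objective, assuming a scalar discriminator score and flattened images; the helper names are mine, and λ = 100 follows the paper:

```python
import math

def l1_loss(y, y_hat):
    # Mean absolute error between the ground-truth target y and G(x, z).
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def pix2pix_gen_loss(d_score_fake, y, y_hat, lam=100.0, eps=1e-12):
    # Generator objective: fool the discriminator on the fake target (BCE term),
    # plus lambda * L1 pulling the output toward the ground truth.
    adv = -math.log(d_score_fake + eps)
    return adv + lam * l1_loss(y, y_hat)
```

The large λ means the L1 term dominates early training, which is what suppresses blurry averages while the adversarial term sharpens details.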


PatchGAN discriminator
▪ Instead of predicting whether the entire image is real or fake, the discriminator predicts whether each NxN patch of the image is real or fake (each output unit has an NxN receptive field).

▪ Pix2Pix uses a PatchGAN discriminator.

Demir, U., & Unal, G. (2018). Patch-based image inpainting with generative adversarial networks. arXiv preprint arXiv:1803.07422.

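The per-patch idea can be sketched with a sliding window over a toy image; the default `score` stand-in (the patch mean) replaces a learned patch discriminator:

```python
def patch_decisions(image, patch=4, stride=4, score=None):
    # Slide an NxN window over the image and emit one real/fake score per patch;
    # a PatchGAN loss averages this grid instead of using one whole-image score.
    score = score or (lambda p: sum(sum(r) for r in p) / (len(p) * len(p[0])))
    h, w = len(image), len(image[0])
    grid = []
    for i in range(0, h - patch + 1, stride):
        row = [score([r[j:j + patch] for r in image[i:i + patch]])
               for j in range(0, w - patch + 1, stride)]
        grid.append(row)
    return grid

ones = [[1.0] * 8 for _ in range(8)]   # an 8x8 toy "image"
grid = patch_decisions(ones)           # 2x2 grid of per-patch scores
```

In a real PatchGAN the grid is produced in one shot by a fully convolutional network, so the receptive field of each output unit plays the role of the window here.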


Pix2pix: Cityscapes labels→photo



Pix2pix: edges→handbags



Pix2pix: Image Inpainting



Cycle GAN
▪ Image-to-image translation in the absence of paired examples.

Cannot use Pix2pix!


Unpaired Image to Image Translation



CycleGAN
▪ Two mapping functions (Generators)
▪ G: X → Y
▪ F: Y → X

▪ Two Discriminators:
▪ 𝐷𝑦 : Discriminates y from G(x)
▪ 𝐷𝑥 : Discriminates x from F(y)



CycleGAN

Forward cycle-consistency loss:  x → G(x) → F(G(x)) ≈ x
Backward cycle-consistency loss: y → F(y) → G(F(y)) ≈ y



CycleGAN
▪ Adversarial loss (for the mapping G: X → Y and its discriminator D_Y; similarly for F and D_X):

min_G max_{D_Y}   ℒ_GAN(G, D_Y, X, Y) = 𝔼_y[log D_Y(y)] + 𝔼_x[log(1 − D_Y(G(x)))]

▪ Cycle-consistency loss:

ℒ_cyc(G, F) = 𝔼_x[‖F(G(x)) − x‖₁] + 𝔼_y[‖G(F(y)) − y‖₁]

▪ Final loss:

ℒ(G, F, D_X, D_Y) = ℒ_GAN(G, D_Y, X, Y) + ℒ_GAN(F, D_X, Y, X) + λ ℒ_cyc(G, F)
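The cycle-consistency term can be sketched directly, assuming scalar "images" and hypothetical mappings G and F:

```python
def cycle_loss(x_batch, y_batch, G, F, lam=10.0):
    # L_cyc = E[|F(G(x)) - x|] + E[|G(F(y)) - y|], weighted by lambda in the
    # full objective (lambda = 10 in the CycleGAN paper).
    fwd = sum(abs(F(G(x)) - x) for x in x_batch) / len(x_batch)
    bwd = sum(abs(G(F(y)) - y) for y in y_batch) / len(y_batch)
    return lam * (fwd + bwd)

def G(v): return v + 1.0   # hypothetical X -> Y mapping
def F(v): return v - 1.0   # hypothetical inverse Y -> X mapping

loss = cycle_loss([0.0, 1.0], [0.0, 1.0], G, F)  # perfect inverses -> 0.0
```

When G and F are exact inverses the loss is zero; any translation that cannot be undone is penalized, which is what replaces paired supervision.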


CycleGAN

▪ Horse to zebra



CycleGAN

▪ Orange to Apple



CycleGAN
▪ Smartphone snaps to professional DSLR photographs

