

Game of GANs: Game-Theoretical Models for Generative Adversarial Networks

Monireh Mohebbi Moghadam, Bahar Boroumand*, Mohammad Jalali*, Arman Zareian*, Alireza Daeijavad, and Mohammad Hossein Manshaei
Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran
Emails: {monireh.mohebbi, b.boroomand, mjalali, armanzareian.az, daeijavad}@ec.iut.ac.ir, [email protected]
*These authors contributed equally.

arXiv:2106.06976v1 [cs.LG] 13 Jun 2021

Abstract—Generative Adversarial Networks (GANs), as a promising research direction in the AI community, have recently attracted considerable attention due to their ability to generate high-quality realistic data. A GAN is a competitive game between two neural networks trained in an adversarial manner to reach a Nash equilibrium. Despite the improvements accomplished in GANs in the last years, several issues remain to be solved, and how to tackle them and make further advances has become a rising research interest. This paper reviews the literature that leverages game theory in GANs and addresses how game models can relieve specific challenges of generative models and improve GAN performance. In particular, we first review some preliminaries, including the basic GAN model and some game theory background. After that, we present our taxonomy, which summarizes the state-of-the-art solutions into three significant categories: modified game model, modified architecture, and modified learning method. The classification is based on the modifications made to the basic model by the proposed approaches from the game-theoretic perspective. We further classify each category into several subcategories. Following the proposed taxonomy, we explore the main objective of each class and review the recent work in each group. Finally, we discuss the remaining challenges in this field and present potential future research topics.

Index Terms—generative adversarial network (GAN), game theory (GT), multi-agent systems (MAS), deep generative models, deep learning, GAN variants.

I. INTRODUCTION

Generative Adversarial Networks (GANs) are a class of generative models originally proposed by Goodfellow et al. in 2014 [1]. GANs have gained wide attention in recent years due to their potential to model high-dimensional, complex real-world data, and they quickly became a promising research direction [2]. As a type of generative model, GANs do not minimize a single training criterion; they are used to estimate the probability distribution of real data. A GAN usually comprises two neural networks, a discriminator and a generator, which are trained simultaneously via an adversarial learning concept, making GANs powerful in both feature learning and representation [3]. The discriminator attempts to differentiate between real data samples and fake samples made by the generator, while the generator tries to create realistic samples that cannot be distinguished by the discriminator [4]. In particular, GAN models do not rely on any assumptions about the distribution and can generate an unlimited number of realistic new samples from the latent space [2]. This feature enables GANs to be successfully applied to various applications, ranging from image synthesis, computer vision, and video and animation generation, to speech and language processing and cybersecurity [5].

The core idea of GANs is inspired by the two-player zero-sum minimax game between the discriminator and the generator, in which the total utility of the two players is zero, and each player's gain or loss is exactly balanced by the loss or gain of the other player. GANs are designed to reach a Nash equilibrium at which neither player can increase its gain without reducing the other player's gain [1], [6]. Despite the significant success of GANs in many domains, applying these generative models to real-world problems has been hindered by some challenges. The most significant problems of GANs are that they are hard to train and suffer from instability problems such as mode collapse, non-convergence, and vanishing gradients. A GAN needs to converge to the Nash equilibrium during the training process, but this convergence has been proved to be challenging [7], [8].

Since 2014, GANs have been widely studied, and numerous methods have been proposed to address their challenges. However, to produce high-quality generated samples, it is necessary to improve the GAN theory, as shortcomings in this area are among the most important obstacles to developing better GANs [9]. As the basic principle of GANs is rooted in game theory, with the data distribution learned via a game between the generator and the discriminator, exploiting game-theoretic techniques has become one of the most discussed topics and has attracted considerable research effort in recent years.

A. Motivation and Contribution

The biggest motivation behind this survey is the absence of any other review paper that focuses particularly on game-theoretic advances in GANs. Many comprehensive surveys have investigated GANs in detail with different focuses (e.g. [2], [6], [10]–[22]), but, to the best of our knowledge, this work is the first to explore GAN advancements from a game-theoretical perspective. Hence, in this paper, we attempt to provide readers with recent advances in GANs using game theory by classifying and surveying recently proposed works.

Our survey first introduces some background and key concepts in this field. Then we classify the recently proposed game
of GANs models into three major categories: modified game model, modified architecture in terms of the number of agents, and modified learning method. Each group is further classified into several subcategories. We review the main contributions of each work in each subcategory. We also point to some existing problems in the discussed context and forecast potential future research topics.

B. Paper Structure and Organization

The rest of this paper is organized as follows. Section II presents some background on game theory and on GANs, including the basic idea, learning method, and challenges. In Section III we take a glimpse at the other surveys conducted in the field of GANs. We provide our proposed taxonomy in Section IV and review the research models of each category in that section. The final section is devoted to discussion and conclusion.

II. BACKGROUND AND PRELIMINARIES

Before presenting our taxonomy and discussing the works that apply game theory methods to GANs, we need to present some preliminary concepts in the fields of game theory and GANs. Here, we start with an overview of game theory and then move toward GANs. Table I lists the acronyms and their definitions used in this paper.

TABLE I: Acronyms and corresponding full names appearing in the paper.

GAN       Generative Adversarial Network
WGAN      Wasserstein GAN
MAD-GAN   Multi-Agent Diverse GAN
MADGAN    Multiagent Distributed GAN
MPM GAN   Message Passing Multi-Agent GAN
Seq-GAN   Sequence GAN
L-GAN     Latent-space GAN
FedGAN    Federated GAN
ORGAN     Objective-Reinforced GAN
CS-GAN    Cyclic-Synthesized GAN
SCH-GAN   Semi-supervised Cross-modal Hashing GAN
MolGAN    Molecular GAN
RNN       Recurrent Neural Network
RL        Reinforcement Learning
IRL       Inverse Reinforcement Learning
NE        Nash Equilibrium
JSD       Jensen-Shannon Divergence
KL        Kullback-Leibler
AE        Auto-Encoder
DDPG      Deep Deterministic Policy Gradient
ODE       Ordinary Differential Equation
OCR       Optical Character Recognition
SS        Self-Supervised task
MS        Multi-class minimax game based Self-supervised tasks
SPE       Subgame Perfect Equilibrium
SNEP      Stochastic Nash Equilibrium Problem
SVI       Stochastic Variational Inequality
SRFB      Stochastic Relaxed Forward-Backward
aSRFB     Averaging over Decision Variables
SGD       Stochastic Gradient Descent
NAS       Neural Architecture Search
IID       Independent and Identically Distributed
DDL       Discriminator Discrepancy Loss

A. Game Theory

Game theory aims to model situations in which several decision makers interact with each other. The interaction between these decision makers is called a "game", and the decision makers are called "players". In each turn of the game, players take actions, and the set of these actions is called the strategy set. It is usually assumed that players are rational, meaning that each agent tries to maximize its utility by choosing the action that maximizes its payoff. Each player's action is chosen with respect to the other players' actions, and for this reason each agent should have a belief system about the other players [23].
Several solution concepts have been introduced for analyzing games, and finding a Nash equilibrium is one of them. A "Nash equilibrium" is a state in which no player can increase its payoff by changing its strategy. In other words, a Nash equilibrium is a state where nobody regrets its choice, given the others' strategies and with respect to its own payoff [24]. In the situation where players assign a probability distribution to their strategy sets instead of choosing a single strategy, the Nash equilibrium is called a "mixed Nash equilibrium" [24]. Constant-sum games are two-player games in which the sum of the two players' utilities is the same constant in all states. When this constant equals zero, the game is called a zero-sum game [24].

Other solution concepts are the maximin strategy and the minimax strategy. In the maximin strategy method, the decision maker maximizes its worst-case payoff, which occurs when all other players cause as much harm as they can to the decision maker. In the minimax strategy, the decision maker wants to cause harm to others; to rephrase it, the decision maker wants to minimize the other players' maximum payoff [24]. The value that players obtain under the minimax or maximin strategy method is called the min-max (minimax) or max-min (maximin) value, respectively. In [25], von Neumann proved that in any finite, two-player, zero-sum game, all Nash equilibria coincide with the min-max and max-min strategies of the players. Also, the min-max value and max-min value are equal to the Nash equilibrium utility.
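As a concrete illustration of this equivalence (our example, not from the reviewed literature; it assumes only numpy), the following snippet estimates the maximin and minimax values of the classic matching-pennies zero-sum game by a grid search over mixed strategies. Both values come out equal to the Nash equilibrium utility, here 0, as von Neumann's theorem states.

```python
import numpy as np

# Payoff matrix of a 2x2 zero-sum game (row player's payoff):
# matching pennies, whose unique equilibrium is the mixed strategy
# (1/2, 1/2) for both players, with game value 0.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

p = np.linspace(0, 1, 1001)  # row player's probability on action 0

# Maximin: for each mixed strategy p, the column player best-responds by
# picking the column that minimizes the row player's expected payoff.
row_payoff = np.minimum(p * A[0, 0] + (1 - p) * A[1, 0],
                        p * A[0, 1] + (1 - p) * A[1, 1])
maximin = row_payoff.max()

# Minimax: for each column mix q, the row player best-responds by
# picking the row that maximizes its expected payoff.
q = np.linspace(0, 1, 1001)
col_loss = np.maximum(q * A[0, 0] + (1 - q) * A[0, 1],
                      q * A[1, 0] + (1 - q) * A[1, 1])
minimax = col_loss.min()

print(maximin, minimax)  # both approximately 0: the two values coincide
```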
B. Generative Adversarial Networks

We provide some preliminaries on GANs in order to facilitate the understanding of the basic and key concepts of this generative model. In particular, we first briefly review generative models. Then, we give a brief description of the GAN by reviewing its basic idea, learning method, and challenges.
Fig. 1: GAN's Architecture [6]. (Noise z is fed into the generator G; the generated sample G(z), together with real samples x, is passed to the discriminator D, which outputs True/False.)

1) Generative Models: A generative model is a model whose purpose is to simulate the distribution of a training set. Generative models can be divided into three types. In the first type, the model gets a training set with distribution $p_{data}$ (unknown to the model) and tries to build a distribution $p_{model}$ which approximates $p_{data}$. The second type is solely capable of producing samples from $p_{model}$, and the third type can do both. GANs, as one kind of generative model, concentrate mainly on producing samples [10].

2) GAN Fundamentals: In 2014, Goodfellow et al. [26] introduced GANs as a framework in which two players play a game against each other. The result of the game is a generative model that can produce samples similar to the training set. In the introduced game, the players are named the generator G and the discriminator D. The generator is the one which, in the end, should produce samples, and the discriminator's aim is to distinguish training set samples from the generator's samples. The more indistinguishable the produced samples are, the better the generative model is [26]. Any differentiable function, such as a multi-layer neural network, can represent the generator and the discriminator. The generator, G(z), takes a prior noise distribution $p_z$ as input and maps it to an approximation of the training data distribution $p_g$. The discriminator, D(x), maps the input data distribution $p_{data}$ to a real number in the interval [0, 1], which is the probability of the sample being real rather than fake (i.e., produced by the generator) [13].

3) GAN Learning Models: The generator and discriminator can be trained using an iterative process of optimizing an objective function. The objective function is stated in (1), introduced by Goodfellow et al. [26]:

$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[f_0(D(x))] + \mathbb{E}_{z \sim p_z(z)}[f_1(D(G(z)))] \qquad (1)$$

where $f_0$ and $f_1$ can be chosen from TABLE II based on the divergence metric. The first proposed GAN uses the Jensen-Shannon metric.

TABLE II: Variants of GANs based on the divergence metric [27].

Divergence metric    f_0(.)     f_1(.)         Game Value
Kullback-Leibler     log(D)     1 - D          0
Reverse KL           -D         log(D)         -1
Jensen-Shannon       log(D)     log(1 - D)     -log 4
WGAN                 D          -D             0

To train the simple model shown in Fig. 1, we first fix G and optimize D to discriminate optimally. Next, we fix D and try to minimize the objective function. The discriminator operates optimally when it cannot distinguish between real and fake data; for example, with the Jensen-Shannon metric this happens when $p_r(x)/(p_r(x) + p_g(x))$ equals 1/2. If both discriminator and generator work optimally, the game reaches the Nash equilibrium, and the min-max and max-min values will be equal. As shown in TABLE II, for the Jensen-Shannon metric this value equals $-\log 4$.
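The following minimal PyTorch sketch (our illustration, not code from any reviewed paper; the toy data, network sizes, and learning rates are arbitrary choices) shows this alternating procedure for the Jensen-Shannon choice f_0 = log(D), f_1 = log(1 - D) of TABLE II, using the common non-saturating generator update.

```python
import torch
import torch.nn as nn

# Toy setup: learn a 1-D Gaussian. All sizes and hyperparameters are
# illustrative only.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0   # samples from p_data
    z = torch.randn(64, 8)                  # prior noise from p_z

    # (i) Fix G, optimize D: maximize log D(x) + log(1 - D(G(z))),
    # written here as minimizing the binary cross-entropy.
    fake = G(z).detach()                    # detach: G is held fixed
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # (ii) Fix D, optimize G: the non-saturating variant maximizes
    # log D(G(z)) instead of minimizing log(1 - D(G(z))).
    g_loss = bce(D(G(z)), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```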
III. RELATED SURVEYS

As GANs become increasingly popular, the number of works in this field, and consequently the number of review articles, is also increasing. By now, many surveys on GANs (about 40) have been presented, which can be classified into three categories. The works in the first category [2], [6], [10]–[22] explore a relatively broad scope of GANs, including key concepts, algorithms, applications, and different variants and architectures. In contrast, the surveys in the second group [5], [8], [28]–[30] focus solely on a specific segment or issue of GANs (e.g., regularization methods or loss functions) and review how researchers deal with that problem. In the third category, a plethora of survey studies [7], [9], [19], [31]–[43] summarize the applications of GANs in a specific field, from computer vision and image synthesis to cybersecurity and anomaly detection. In the following, we briefly review the surveys in each category and express how our paper differs from the others.

A. GAN General surveys

Goodfellow in his tutorial [10] answers the most frequent questions in the context of GANs. Wang et al. [6] survey theoretic and implementation models of GANs, their applications, as well as the advantages and disadvantages of this generative model. Creswell et al. [11] provide an overview of GANs, especially for the signal processing community, by characterizing different methods for training and constructing GANs, and the challenges in theory and applications. In [19] Ghosh et al. present a comprehensive summary of the progression and performance of GANs along with their various applications. Saxena et al. in [20] conduct a survey of the advancements in GAN design and of the optimization solutions proposed to handle GAN challenges. Kumar et al. [17] present state-of-the-art related work on GANs, their applications, evaluation metrics, challenges, and benchmark datasets. In [18] two new deep generative models, including GANs, are compared, and the most remarkable GAN architectures are categorized and discussed by Salehi et al.

Gui et al. in [21] provide a review of various GAN methods from the perspectives of algorithms, theory, and applications. [22] surveys different GAN variants, applications, and several training solutions. Hitawala in [12] presents different versions of GANs and provides a comparison between them along aspects such as learning, architecture, gradient updates, objectives, and performance metrics. In a similar manner, Gonog et al. in [13] review the extensional variants of GANs and classify them by how they optimize the original GAN or change its basic structure, as well as by their learning methods. In [2] Hong et al. discuss the details of GANs from the perspective of various objective functions and architectures and the theoretical and practical issues in training GANs. The authors also enumerate the GAN variants that are applied in different domains. Bissoto et al. in [14] conduct a review of GAN advancements on six fronts: architectural contributions, conditional techniques, normalization and constraint contributions, loss functions, image-to-image translations, and validation metrics. Zhang et al., in their review paper [15], survey twelve extended GAN models and classify them in terms of the number of game players. Pan et al. in [16] analyze the differences among different generative models, classify them from the perspective of architecture and objective function optimization, discuss training tricks and evaluation metrics, and describe GAN applications and challenges.

B. GAN Challenges

In a different manner, Lucic in [28] conducts an empirical comparison of GAN models, with a focus on unconditional variants. As another survey in the second category, Alqahtani et al. [5] mainly focus on potential applications of GANs in different domains. This paper attempts to identify the advantages, disadvantages, and major challenges for the successful implementation of GANs in different application areas. As another specific review paper, Wiatrak et al. in [8] survey current approaches for stabilizing the GAN training procedure, categorizing various techniques and key concepts. More specifically, in [29], Lee et al. review the regularization methods used in the stable training of GANs and classify them into several groups by their operating principles. In contrast, [30] performs a survey of the loss functions used in GANs and analyzes the pros and cons of these functions. As differentially private GAN models provide a promising direction for generating private synthetic data, Fan et al. in [44] survey the existing approaches presented for this purpose.

C. GAN Applications

As we mentioned before, GANs have been successfully applied to numerous applications, and some review articles survey these advances. The authors in [7], [9], [31]–[38] review different aspects of GAN progress in the field of computer vision and image synthesis. Cao et al. [9] review recently proposed GAN models and their applications in computer vision, and compare the classical and state-of-the-art GAN algorithms in terms of mechanism, visual results of generated samples, and so on. Wang et al. structure a review [7] towards addressing practical challenges relevant to computer vision. They discuss the most popular architecture-variant and loss-variant GANs for tackling these challenges. Wu et al. in [31] present a survey of image synthesis and editing, and video generation, with GANs. They cover recent papers that leverage GANs in image applications including texture synthesis, image inpainting, image-to-image translation, and image editing, as well as video generation. In the same way, [32] introduces the recent research on GANs in the field of image processing, categorized into four fields: image synthesis, image-to-image translation, image editing, and cartoon generation.

Works such as [35] and [36] focus on reviewing recent techniques that incorporate GANs into the problem of text-to-image synthesis. In [35], Agnese et al. propose a taxonomy to summarize GAN-based text-to-image synthesis papers into four major categories: Semantic Enhancement GANs, Resolution Enhancement GANs, Diversity Enhancement GANs, and Motion Enhancement GANs. Different from the other surveys in this field, Sampath et al. [37] examine the most recent developments of GAN techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are covered in this survey.

In [33], [34], [38], the authors deal with the medical applications of image synthesis by GANs. Yi et al. in [38] describe the promising applications of GANs in medical imaging and identify some remaining challenges that need to be solved. As another paper on this subject, [33] reviews GAN applications in image denoising and reconstruction in radiology. Tschuchnig et al. in [34] summarize existing GAN architectures in the field of histological image analysis.

As another application of GANs, [39] and [19] structure reviews of GANs in cybersecurity. Yinka et al. [39] survey studies where the GAN plays a key role in the design of a security system or adversarial system. Ghosh et al. [19] focus on the various ways in which GANs have been used to provide both security advances and attack scenarios in order to bypass detection systems.

Di Mattia et al. [40] survey the principal GAN-based anomaly detection methods. Georges et al. [41] review the published literature on observational health data to uncover the reasons for the slow adoption of GANs for this subject. Gao et al. in [42] address the practical applications and challenges relevant to spatio-temporal applications such as trajectory prediction, event generation, and time-series data imputation. The recently proposed user mobility synthesis schemes based on GANs are summarized in [43].

According to the classification provided for review papers, our survey falls into the second category. We focus specifically on the recent progress in applying game-theoretical approaches towards addressing GAN challenges. While several surveys on GANs have been presented to date, to the best of our knowledge, our survey is the first to address this topic. Although the authors in [8] presented a few game-model GANs, they have not done a comprehensive survey of this field, and many new pieces of research have not been covered. We hope that our survey will serve as a reference for interested researchers on this subject.
Fig. 2: The proposed taxonomy of the GAN advances by game theory.
- Modified Game Model: Stochastic game [45]; Stackelberg game [46], [47]; Bi-affine game [48]
- Modified Learning Method: No-regret learning [10], [49], [50]; Fictitious play [27]; Federated learning [51], [52]; Reinforcement learning [4], [53]–[63]
- Modified Architecture: Multiple generators, One discriminator [46], [64]–[67]; One generator, Multiple discriminators [60], [68]–[72]; Multiple generators, Multiple discriminators [51], [66], [73]; One generator, One discriminator, One classifier [4], [74]; One generator, One discriminator, One RL agent [58], [59], [75], [76]

IV. GAME OF GANS: A TAXONOMY

In this section, we present our taxonomy, which summarizes the reviewed papers into three categories by focusing on how these works extend the original GAN. The taxonomy is organized in terms of (1) modified game model, (2) modified architecture, and (3) modified learning method, as shown in Fig. 2. Based on these primary classes, we further classify each category into several subsets (Fig. 2). In the following sections, we introduce each category and discuss the recent advances in each group.

A. Modified Game Model

The core of all GANs is a competition between a generator and a discriminator, which is modeled as a game. Therefore, game theory plays a key role in this context. Most GANs rely on the basic model, formulating the competition as a two-player zero-sum (minimax) game, but some research utilizes other game variants to tackle the challenges in this field. In this section, we review this literature, classifying the works into three subcategories. Section IV-A1 presents research that casts the training process as a stochastic game. Research works presented in Section IV-A2 apply the leader-follower idea of the Stackelberg game to GANs. Finally, Section IV-A3 presents GAN models cast as a bi-affine game. A summary of the reviewed research in the modified game model category is shown in Table III.

1) Stochastic game: One of the main issues for GANs is that these neural networks are very hard to train because of convergence problems. Franci et al. in [45] addressed this problem by casting the training procedure as a stochastic Nash equilibrium problem (SNEP). The SNEP is recast as a stochastic variational inequality (SVI), targeting the solutions that are SNEs. The advantage of this approach is that there are many algorithms for finding the solution of an SVI, like the forward-backward algorithm, also known as gradient descent. Franci et al. proposed a stochastic relaxed forward-backward (SRFB) algorithm and a variant with an additional step for averaging over the decision variables (aSRFB) for the training process of GANs. Proving convergence to a solution requires monotonicity of the pseudogradient mapping, which is defined by Equation (2), where $J_g$ and $J_d$ are the payoff functions of the generator and the discriminator:

$$\mathbb{F} = \begin{bmatrix} \mathbb{E}[\nabla_{x_g} J_g(x_g, x_d)] \\ \mathbb{E}[\nabla_{x_d} J_d(x_d, x_g)] \end{bmatrix} \qquad (2)$$

If the pseudogradient mapping of the game is monotone and an increasing number of samples is available, the algorithm converges to the exact solution; with only a finite, fixed mini-batch of samples, using the averaging technique, it converges to a neighborhood of the solution.
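The following toy sketch (our paraphrase for intuition only, assuming numpy; it is not the authors' exact algorithm) shows a relaxed forward-backward update of this kind on a simple monotone pseudogradient mapping, where the stacked vector x plays the role of the generator and discriminator parameters.

```python
import numpy as np

def pseudogradient(x, rng):
    # Toy monotone mapping F(x) = A x with A positive definite, plus
    # sampling noise standing in for the expectation in Equation (2).
    A = np.array([[1.0, 0.5], [0.5, 1.0]])
    return A @ x + 0.1 * rng.standard_normal(2)

rng = np.random.default_rng(0)
x = np.array([1.0, -1.0])    # current iterate (generator + discriminator)
x_bar = x.copy()             # relaxed (averaged) iterate
gamma, delta = 0.05, 0.5     # step size and relaxation weight

for k in range(500):
    x_bar = (1 - delta) * x_bar + delta * x      # relaxation step
    x = x_bar - gamma * pseudogradient(x, rng)   # forward (gradient) step

print(x)  # hovers near the solution x* = 0 of the monotone problem
```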
TABLE III: Summary of publications included in the modified game model category (Section IV-A).

Stackelberg game (Subsection IV-A2):
- Stackelberg GAN [46]. Convergence: SPNE. Methodology and contributions: models multi-generator GANs as a Stackelberg game. Pros: can be built on top of standard GANs; proved convergence. Cons: -.
- [47]. Convergence: SPE. Methodology and contributions: theoretical examples of standard GANs with no NE; proximal equilibrium as a solution; proximal training for GANs. Pros: can apply to any two-player GAN; allows the discriminator to locally optimize. Cons: focuses only on the pure strategies of zero-sum GANs in non-realizable settings.

Stochastic game (Subsection IV-A1):
- [45]. Convergence: SNE. Methodology and contributions: casts the problem as an SNEP and recasts it as an SVI; SRFB and aSRFB solutions. Pros: proved convergence. Cons: needs monotonicity of the pseudogradient mapping and an increasing number of samples to reach an equilibrium.

Bi-affine game (Subsection IV-A3):
- [48]. Convergence: Mixed NE. Methodology and contributions: tackles the training of GANs by reconsidering the problem formulation from the mixed Nash equilibrium perspective. Pros: shows that all GANs can be relaxed to mixed-strategy forms; flexibility. Cons: -.

2) Stackelberg game: One of the main issues for GANs is the convergence of the algorithm. Farnia et al. in [47] showed that "GAN zero-sum games may not have any local Nash equilibria" by presenting certain theoretical and numerical examples of standard GAN problems. Therefore, based on the naturally sequential structure of GANs, where the generator moves first and is then followed by the discriminator, this problem can be considered as a Stackelberg game, focusing on the subgame perfect equilibrium (SPE). To solve the convergence issue, the authors sought an equilibrium notion called the proximal equilibrium, which enables traversing the spectrum between Stackelberg and Nash equilibria. A proximal equilibrium, as shown in Equation (3), allows the discriminator to optimize locally in a norm-ball around the primary discriminator. To keep $\tilde{D}$ close to $D$, the distance between the two functions is penalized by $\lambda$; as $\lambda$ goes from zero to infinity, the equilibria change from Stackelberg to Nash.

$$V_{\lambda}^{prox}(G, D) := \max_{\tilde{D} \in \mathcal{D}} \; V(G, \tilde{D}) - \frac{\lambda}{2}\|\tilde{D} - D\|^2 \qquad (3)$$

Farnia et al. also proposed proximal training, which optimizes the proximal objective $V_{\lambda}^{prox}(G, D)$ instead of the original objective $V(G, D)$ and can be applied to any two-player GAN. Zhang et al. in [46] also modeled the GAN with this game and presented Stackelberg GAN to tackle the instability of the GAN training process. Stackelberg GAN uses a multi-generator architecture, and the competition is between the generators (followers) and the discriminator (leader). We discuss the architecture details in Section IV-B1.

3) Bi-affine game: Hsieh et al. in [48] examine the training of GANs by reconsidering the problem formulation from the mixed NE perspective. In the absence of convexity, the theory focuses only on local convergence, and it implies that even the local theory can break down if intuitions are blindly applied from convex optimization. In [48] mixed Nash equilibria of GANs are proposed; they are, in fact, global optima of infinite-dimensional bi-affine games. Finite-dimensional bi-affine games are also applied for finding mixed NEs of GANs. It is also shown that all current GAN objectives can be relaxed into their mixed-strategy forms. Eventually, the article experimentally shows that this method achieves better or comparable performance relative to popular baselines such as SGD, Adam, and RMSProp.

B. Modified Architecture

As we mentioned in Section II, GANs are a framework for producing a generative model through a two-player minimax game; however, in recent works, by extending the idea of using a single pair of generator and discriminator to the multi-agent setting, the two-player game transforms into multiple games or a multi-agent game. In this section, we review literature in which the proposed GAN variants modify the architecture in such a way that we have GANs with a mixture of generators and/or discriminators, and we show how applying such methods can provide better convergence properties and prevent mode collapse. The majority of the works in this category focus on introducing a larger number of generators and/or discriminators, but in some papers the number of generators and discriminators does not change; instead, another agent is added, which converts the problem into a multi-agent one.

In Section IV-B1, we discuss GAN variants which extend the basic structure from a single generator to many generators. In Section IV-B2, we review articles that deal with the problem of mode collapse by increasing the number of discriminators in order to force the generator to produce different modes. Section IV-B3 is dedicated to discussing works which develop GANs with multiple generators and multiple discriminators. The articles reviewed in Sections IV-B4 and IV-B5 extend the architecture by adding another agent, either a classifier (Section IV-B4) or an RL agent (Section IV-B5), to show the benefits of adding these agents to GANs. The methodology and contributions as well as the pros and cons of the reviewed papers are summarized in Table IV.

1) Multiple generators, One discriminator: The minimax gap is smaller in GANs with a multi-generator architecture, and more stable training performance is experienced in these GANs [46]. As we mentioned in Section IV-A2, Zhang et al. in [46] tackle the problem of instability during GAN training as a result of a gap between the minimax and maximin objective values. To mitigate this issue, they design a multi-generator architecture and model the competition among agents as a Stackelberg game. Results have shown that the minimax duality gap decreases as the number of generators increases. In this article, the mode collapse issue is also investigated, and it is shown that this architecture effectively alleviates the mode collapse issue.
One of the significant advantages of this architecture is that it can be applied to all variants of GANs, e.g., Wasserstein GAN, vanilla GAN, etc. Additionally, with an extra condition on the expressive power of generators, it is shown that Stackelberg GAN can achieve an ε-approximate equilibrium with Õ(1/ε) generators [46].

Furthermore, Ghosh et al. in [64] proposed a multi-generator and single-discriminator architecture for GANs named Multi-Agent Diverse Generative Adversarial Networks (MAD-GAN). In this paper, different generators capture varied, high-probability modes, and the discriminator is designed such that, along with identifying real and fake samples, it identifies the generator that generated the given fake sample [64]. It is shown that at convergence, the global optimum value of $-(k + 1)\log(k + 1) + k \log k$ is achieved, where k is the number of generators.

Comparing the models presented in [64] and [46]: in MAD-GAN [64] multiple generators are combined under the assumption that the generators and the discriminator have infinite capacity, but in the Stackelberg GAN [46] there is no assumption on the model capacity. Also, in MAD-GAN [64] the generators share common network parameters, whereas in the Stackelberg GAN [46] various sampling schemes beyond the mixture model are allowed, and each generator has free parameters.

The assumption that increasing the number of generators will cover the whole data space is not valid in practice. So Hoang et al. in [65], in contrast with [64], approximate the data distribution by forcing generators to capture subsets of the data modes independently of one another, rather than by separating their samples. Thus, they established a minimax formulation among a classifier, a discriminator, and a set of generators. The classifier determines which generator generated a sample by performing multi-class classification. Each generator is encouraged to generate data separable from those produced by the other generators because of the interaction between the generators and the classifier. In this model, multiple generators create the samples, and then one of them is randomly picked as the final output, similar to the mechanism of a probabilistic mixture model. The authors theoretically prove that, at the equilibrium, the Jensen-Shannon divergence (JSD) between the final output and the data distribution is minimal, while the JSD among the generators' distributions is maximal; hence the mode collapse problem is effectively avoided. Moreover, the computational cost added to the standard GAN is minimal in the suggested model thanks to parameter sharing, and the proposed model can efficiently scale to large-scale datasets as well.

Ke et al. propose a new architecture, multiagent distributed GAN (MADGAN), in [66]. In this framework, the discriminator is considered a leader, and the generator is considered a follower. This article is also based on social group wisdom and the influence of the network structure on agents. MADGAN can have a multi-generator and multi-discriminator architecture (e.g., two discriminators and four generators) as well as a multiple-generator and single-discriminator architecture, which is the topic discussed in this section. One of the vital contributions of MADGAN is that it can train multiple generators simultaneously, and the training results of all generators are consistent.

Furthermore, in Message Passing Multi-Agent Generative Adversarial Networks [67], it is proposed that, with two generators and one discriminator communicating through message passing, better image generation can be achieved. In this paper, there are two objectives: competing and conceding. Competing is introduced based on the fact that the generators compete with each other to get better scores for their generations from the discriminator. Conceding, in contrast, is introduced based on the fact that the two generators try to guide each other in order to get better scores for their generations from the discriminator, and it ensures that the message-sharing mechanism guides the other generator to generate better than itself. Overall, this paper presents innovative architectures and objectives aimed at training multi-agent GANs.

2) One generator, Multiple discriminators: The multiple discriminators are constructed with a homogeneous network architecture and trained for the same task on the same training data. In addition to introducing a multi-discriminator schema, Durugkar et al. in [68] show, from the perspective of game theory, that because of these similarities the discriminators act like each other; thus, they will converge to similar decision boundaries, and in the worst case they may even converge to a single discriminator. So Jin et al. in [69] introduce the discriminator discrepancy loss (DDL): their multiplayer minimax game unifies the optimization of the DDL and the GAN loss, seeking an optimal trade-off between the accuracy and diversity of the multiple discriminators. Compared to [68], Hardy et al. in [70] distribute the discriminators over multiple servers; thus, they can train over datasets that are spread across numerous servers.

In FakeGAN, proposed in [60], Aghakhani et al. use two discriminators and one generator. The discriminators use the Monte Carlo search algorithm to evaluate and pass the intermediate action-value as the reinforcement learning (RL) reward to the generator, and the generator is modeled as a stochastic policy agent in RL [60]. Instead of one batch as in [69], Mordido et al. in [71] divide the generated samples into multiple micro-batches and update each discriminator's task to discriminate between different samples: samples coming from its assigned fake micro-batch, samples from the micro-batches assigned to the other discriminators, and the real samples.

Unlike [68], Nguyen et al. in [72] combined the Kullback-Leibler (KL) and reverse KL divergences (measures of how much one probability distribution differs from a second one) into a unified objective function. Combining these two measures can exploit their complementary statistical properties to diversify the estimated density in capturing multiple modes effectively. From the perspective of game theory, in [72] there are two discriminators and one generator, with the analogy of a three-player minimax game. In this case, there are two pairs of players playing two minimax games simultaneously. In one of the games, the discriminator rewards high scores for samples from the data distribution (reverse KL divergence), as in (4), while the other conversely rewards high scores for samples from the generator, and the generator produces data to fool both discriminators (KL divergence), as in (5):

$$\min_G \max_{D_1} \mathcal{J}_1(G, D_1) = \alpha \times \mathbb{E}_{x \sim P_{data}}[\log D_1(x)] + \mathbb{E}_{z \sim P_z}[-D_1(G(z))] \qquad (4)$$

$$\min_G \max_{D_2} \mathcal{J}_2(G, D_2) = \mathbb{E}_{x \sim P_{data}}[-D_2(x)] + \beta \times \mathbb{E}_{z \sim P_z}[\log D_2(G(z))] \qquad (5)$$

The hyperparameters α and β are used to control and stabilize the learning method.
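A compact PyTorch sketch of this three-player objective (our illustration derived from Equations (4)-(5) as reconstructed above; the toy data, model sizes, and the α, β values are arbitrary assumptions) could look as follows. Softplus outputs keep the discriminator scores positive so that the logarithms are defined.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
D1 = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
D2 = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(list(D1.parameters()) + list(D2.parameters()), lr=2e-4)
alpha, beta = 0.2, 0.1  # stabilizing hyperparameters (illustrative values)

def score(D, x):
    return F.softplus(D(x))  # positive-valued discriminator output

for step in range(1000):
    real = torch.randn(64, 1) + 2.0
    fake = G(torch.randn(64, 4))

    # Discriminators ascend J1 + J2: D1 rewards real data via log D1(x),
    # D2 rewards generated data via log D2(G(z)).
    j = (alpha * torch.log(score(D1, real)) - score(D1, fake.detach())
         - score(D2, real) + beta * torch.log(score(D2, fake.detach()))).mean()
    opt_d.zero_grad(); (-j).backward(); opt_d.step()

    # Generator descends its terms of the same objective, fooling both.
    fake = G(torch.randn(64, 4))
    j_g = (-score(D1, fake) + beta * torch.log(score(D2, fake))).mean()
    opt_g.zero_grad(); j_g.backward(); opt_g.step()
```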
Minimizing the Kullback-Leibler (KL) divergence between the data and model distributions covers multiple modes but may produce completely unseen and potentially undesirable samples. For the reverse KL divergence, it is observed that optimizing towards the reverse KL criterion mimics the mode-seeking process, where $P_{model}$ concentrates on a single mode of $P_{data}$ while ignoring other modes.

3) Multiple generators, Multiple discriminators: The existence of equilibrium has always been considered one of the open theoretical problems in this game between the generator and the discriminator. Arora et al. in [73] turn to infinite mixtures of generator deep nets in order to investigate the existence of equilibria. Unsurprisingly, an equilibrium exists in an infinite mixture. Therefore, [73] showed that a mixture of a finite number of generators and discriminators can approximate the min-max solution in GANs. This implies that an approximate equilibrium can be achieved with a (not too large) mixture of generators and discriminators. In this article [73], a heuristic approximation to the mixture idea is proposed to introduce a new training framework called MIX+GAN: use a mixture of T components, where T is as large as allowed by the size of GPU memory (usually T ≤ 5). In fact, a mixture of T generators and T discriminators is trained; they share the same network architecture but have their own trainable parameters. Maintaining a mixture means maintaining a weight $w_{u_i}$ for the generator $G_{u_i}$, which corresponds to the probability of selecting the output of $G_{u_i}$. These generator weights are updated by backpropagation. This heuristic can be applied to existing methods like DCGAN, W-GAN, etc. Experiments show the MIX+GAN protocol improves the quality of several existing GAN training methods and can lead to more stable training.

As we mentioned earlier, one of the significant challenges in GAN algorithms is their convergence. According to [51], this challenge is a result of the fact that the cost functions may not converge using gradient descent in the minimax game between the discriminator and the generator. Convergence is also one of the considerable challenges in federated learning, and the problem becomes even more challenging when the data at different sources are not independent and identically distributed. Therefore, [51] proposed an algorithm with a multi-generator and multi-discriminator architecture for training a GAN with distributed sources of non-independent-and-identically-distributed data, named Federated Generative Adversarial Network (FedGAN). Local generators and discriminators are used in this algorithm; they are periodically synced via an intermediary that averages and broadcasts the generator and discriminator parameters. In fact, results from stochastic approximation for GAN convergence and communication-efficient SGD for federated learning are connected by Rasouli et al. in this article to address FedGAN convergence. One of the notable results in [51] is that FedGAN has performance similar to a general distributed GAN while it converges and reduces communication complexity as well.

In [66] the multi-agent distributed GAN (MADGAN) framework is proposed based on social group wisdom and the influence of the network structure on agents, in which the discriminator and the generator are regarded as the leader and the follower, respectively. The multi-agent cognitive consistency problem in large-scale distributed networks is addressed in MADGAN. In fact, in this paper [66] the conditions of consensus are presented for a multi-generator and multi-discriminator distributed GAN by analyzing the existence of a stationary distribution for the Markov chain of the agents' states. The experimental results show that the generation quality of the generators trained by MADGAN is comparable to that of a generator trained by a standard GAN. More importantly, MADGAN can train multiple generators simultaneously, and the training results of all generators are consistent.

4) One generator, One discriminator, One classifier: One of the issues that GANs face is catastrophic forgetting in the discriminator neural network. Self-supervised (SS) tasks were designed to handle this issue; however, these methods allow a severely mode-collapsed generator to pass the SS tasks. Tran et al. in [74] proposed new SS tasks, called multi-class minimax game based self-supervised tasks (MS), which are based on a multi-class minimax game involving a discriminator, a generator, and a classifier. The SS task is a 4-way classification task of recognizing one among four image rotations (0, 90, 180, 270 degrees). The discriminator's SS task is to train the classifier C to predict the rotation applied to real samples, and the generator's SS task is to train the generator G to produce fake samples that maximize classification performance. The SS task helps the generator learn the data distribution and generate diverse samples by closing the gap between supervised and unsupervised image classification. Theoretical and experimental analysis showed improved convergence for this approach.

Li et al. in [4] also used a classifier, for generating categorized text. The authors proposed a new framework, Cyclic-Synthesized GAN (CS-GAN), which uses a GAN, an RNN, and RL to generate better sentences. The classifier's role is to ensure that the generated text contains the label information, and the RNN is a character predictor, because the model is built at the character level to limit the large search space. The generation process can be divided into two steps: first, adding category information into the model and making the model generate category-specific sentences; then, combining the category information in the GAN to generate labeled sentences. CS-GAN performs strongly in supervised learning, especially on multi-category datasets.

5) One generator, One discriminator, One RL agent: With an RL agent, we can have fast and robust control over the GAN's output or input. This architecture can also be used to optimize the generation process by adding an arbitrary (not necessarily differentiable) objective function to the model.
In [58], Cao et al. used this architecture for generating molecules and for drug discovery. The authors encoded the molecules in the original graph-based representation, which has no overhead compared to similar approaches like SMILES [77], which generates a text sequence from the original graph. For the training part, the authors were not only interested in generating chemically valid compounds but also tried to optimize the generation process toward some non-differentiable metrics (e.g., how likely the new molecule is to be water-soluble or fat-soluble) using the RL agent. In Molecular GAN (MolGAN), external software computes the RL loss for each molecule, and a linear combination of the RL loss and the WGAN loss is utilized by the generator.

Weininger et al. in [77] tackled the same problem. Compared to [58], they encoded the molecules as text sequences by using SMILES, the string representation of the molecule, instead of the original graph-based one. They presented Objective-Reinforced Generative Adversarial Networks (ORGAN), which is built on SeqGAN [55], and their RL agent uses REINFORCE [78], a gradient-based approach, instead of the deep deterministic policy gradient (DDPG) [79], an off-policy actor-critic algorithm which Cao et al. used in [58]. MolGAN attains better chemical property scores compared to ORGAN, but it suffers from mode collapse because neither the GAN objective nor the RL objective encourages generating diverse outputs; the ORGAN RL agent, by contrast, depends on REINFORCE, and the uniqueness score is optimized by penalizing non-unique outputs.

For controlling the generator, we can also use an RL agent. Sarmad et al. in [75] presented RL-GAN-Net, a real-time completion framework for point cloud shapes. Their suggested architecture is the combination of an auto-encoder (AE), a reinforcement learning (RL) agent, and a latent-space generative adversarial network (l-GAN). Based on the pre-trained AE, the RL agent selects the proper seed for the generator. This idea of controlling the GAN's output can open up new possibilities to overcome the fundamental instabilities of current deep architectures.

C. Modified Learning Algorithm

This category covers methods in which the proposed improvements involve modifications of the learning method. Here, we turn our attention to the literature which combines other learning approaches, such as fictitious play and reinforcement learning, with GANs.

The GAN variants surveyed in Section IV-C1 study the GAN training process as a regret minimization problem, instead of the popular view which seeks to minimize the divergence between the real and generated distributions. As another learning method, the works in Section IV-C2 utilize fictitious play to simulate the training algorithm of GANs. Section IV-C3 reviews proposed GAN models that use a federated learning framework, which trains across distributed sources to overcome the data limitation of GANs. The research in Section IV-C4 seeks to make a connection between GANs and RL. Table V summarizes the contributions, pros, and limitations of the literature reviewed in this category.

1) No-regret learning: The best-response algorithms for GANs are often computationally intractable, and they do not lead to convergence, exhibiting cycling behavior even in simple games. A simple solution in that case is to average the iterates. Regret minimization is the more suitable way to think about GAN training dynamics. In [49], Kodali et al. propose studying GAN training dynamics as a repeated game in which both players use no-regret algorithms. The authors also show that the convex-concave case of the GAN game has a unique solution: if G and D have enough capacity in the non-parametric limit and updates are made in the function space, the GAN game is convex-concave, and convergence (of the averaged iterates) can be guaranteed using no-regret algorithms. With standard arguments from the game theory literature, the authors show that the discriminator does not need to be optimal at each step.

In contrast to [49], much of the recent development [10] is based on the unrealistic assumption that the discriminator is playing optimally; this corresponds to at least one player using the best-response algorithm. But in the practical case with neural networks, these convergence results do not hold, because the game objective function is non-convex. In non-convex games, global regret minimization and equilibrium computation are computationally hard. Moreover, Kodali et al. in [49] also analyze the convergence of GAN training from this point of view to understand mode collapse. They show that mode collapse happens because of undesirable local equilibria in this non-convex game, accompanied by sharp gradients of the discriminator function around some real data points. Furthermore, the authors show that a gradient penalty scheme can avoid mode collapse by regularizing the discriminator to constrain its gradients in the ambient data space.

Compared to [49], although Grnarova et al. in [50] also use regret minimization, they provide a method that provably converges to a mixed Nash (MN) equilibrium. Because the minimax value of a pure generator strategy is always higher than the minimax value of the mixed equilibrium strategy of the generators, the mixed strategies are more suitable. This convergence holds for semi-shallow GAN architectures when every player uses a regret minimization procedure. Semi-shallow GAN architectures are architectures in which the generator is an arbitrary network and the discriminator consists of a single-layer network. The result holds even though the game induced by such architectures is not convex-concave. Furthermore, they show that the minimax objective of the generator's equilibrium strategy is optimal for the minimax objective.

2) Fictitious play: A GAN is a two-player zero-sum game whose training process is a repeated game. If a zero-sum game is played repeatedly between two rational players, they try to increase their payoffs. Let $s_i^n \in S_i$ denote the action taken by player $i$ at time $n$, and let $\{s_i^0, s_i^1, \ldots, s_i^{n-1}\}$ be the previous actions chosen by player $i$. Player $j$ can then choose the best response, assuming player $i$ is choosing its strategy according to the empirical distribution of $\{s_i^0, s_i^1, \ldots, s_i^{n-1}\}$. Since the expected utility is a linear combination of the utilities under different pure strategies, we can assume that each player plays the best pure response at each round. In game theory, this learning rule is called fictitious play, and it can help us find the Nash equilibrium.
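The rule is easy to state in code. The following sketch (ours, assuming numpy; not from the reviewed papers) runs fictitious play on the matching-pennies zero-sum game: each player best-responds to the empirical distribution of the opponent's past actions, and the empirical frequencies approach the unique mixed Nash equilibrium (1/2, 1/2).

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])       # row player's payoff (matching pennies)
row_counts = np.array([1.0, 0.0])  # arbitrary initial action histories
col_counts = np.array([1.0, 0.0])

for n in range(10000):
    # Row player best-responds to the column player's empirical mix.
    row_br = np.argmax(A @ (col_counts / col_counts.sum()))
    # Column player best-responds by minimizing the row player's payoff.
    col_br = np.argmin((row_counts / row_counts.sum()) @ A)
    row_counts[row_br] += 1
    col_counts[col_br] += 1

print(row_counts / row_counts.sum(),
      col_counts / col_counts.sum())  # both tend to (0.5, 0.5)
```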
TABLE IV: Summary of the publications included in the modified architecture category (Subsection IV-B).

Multiple generators, One discriminator (Subsection IV-B1):
- Stackelberg GAN [46]. Methodology and contributions: tackles the instability problem in the training procedure with a multi-generator architecture. Pros: more stable training performance; alleviates mode collapse. Cons: -.
- MADGAN [66]. Methodology and contributions: a multiagent distributed GAN framework based on social group wisdom. Pros: simultaneous training of multiple generators; consistency of all generators' training results. Cons: -.
- MAD-GAN [64]. Methodology and contributions: a multi-agent diverse GAN architecture. Pros: captures diverse modes while producing high-quality samples. Cons: assumes infinite capacity for players; the global optimum is not practically reachable.
- MGAN [65]. Methodology and contributions: encourages generators to generate separable data via a classifier. Pros: overcomes mode collapse; diversity. Cons: -.
- MPM GAN [67]. Methodology and contributions: an innovative message passing model, where messages are passed among generators. Pros: improvement in image generation; valuable representations from the message generator. Cons: -.

One generator, Multiple discriminators (Subsection IV-B2):
- DDL-GAN [69]. Methodology and contributions: uses the DDL. Pros: diversity. Cons: only applicable to multiple discriminators.
- D2GAN [72]. Methodology and contributions: combines KL and reverse KL divergences. Pros: quality and diversity; scalable to large-scale datasets. Cons: not as powerful as the combination of an autoencoder and a GAN.
- GMAN [68]. Methodology and contributions: multiple discriminators. Pros: robust to mode collapse. Cons: complexity; discriminators converge to the same outputs.
- microbatchGAN [71]. Methodology and contributions: uses micro-batches. Pros: mitigates mode collapse. Cons: -.
- MD-GAN [70]. Methodology and contributions: parallel computation; distributed data. Pros: less communication cost and computation complexity. Cons: -.
- FakeGAN [60]. Methodology and contributions: text classification. Pros: -. Cons: -.

Multiple generators, Multiple discriminators (Subsection IV-B3):
- [73]. Methodology and contributions: tackles generalization and equilibrium in GANs. Pros: improves the quality of several existing GAN training methods. Cons: aggregation of losses with an extra regularization term discourages the weights from moving too far away from uniform.
- MADGAN [66]. Methodology and contributions: addresses the multiagent cognitive consistency problem in large-scale distributed networks. Pros: simultaneous training of multiple generators; consistency of all generators' training results. Cons: -.
- FedGAN [51]. Methodology and contributions: a multi-generator and multi-discriminator architecture for training a GAN with distributed sources. Pros: performance similar to a general distributed GAN with reduced communication complexity. Cons: -.

One generator, One discriminator, One classifier (Subsection IV-B4):
- CS-GAN [4]. Methodology and contributions: combines RNN, GAN, and RL; uses a classifier to validate the category; a character-level model. Pros: generates sentences based on category; limits the action space. Cons: -.
- [74]. Methodology and contributions: multi-class minimax game based self-supervised tasks. Pros: improves convergence; can be integrated into GAN models. Cons: -.

One generator, One discriminator, One RL agent (Subsection IV-B5):
- MolGAN [58]. Methodology and contributions: uses the original graph-structured data; uses an RL objective to generate molecules with specific chemical properties. Pros: better chemical property scores; no overhead in representation. Cons: susceptible to mode collapse.
- ORGAN [59]. Methodology and contributions: encodes molecules as text sequences; controls properties of generated samples with RL; uses the Wasserstein distance as the loss function. Pros: better results than RNNs trained via MLE or SeqGAN. Cons: overhead in representation; works only on sequential data.
- RL-GAN-Net [75]. Methodology and contributions: uses RL to find the correct input for the GAN; combines AE, RL, and l-GAN. Pros: real-time point cloud shape completion; less complexity. Cons: -.
equilibrium. Fictitious play achieves a Nash equilibrium in two-player zero-sum games if the game's equilibrium is unique; however, if there exist multiple Nash equilibria, a different initialization may yield a different solution.

By relating the GAN to the two-player zero-sum game, Ge et al. in [27] design a training algorithm that simulates fictitious play on GANs and provide a theoretical convergence guarantee. They also show that, assuming a best response at each update of Fictitious GAN, the distribution of the mixture outputs from the generators converges to the data distribution, and the discriminator outputs converge to the optimal discriminator function. The authors in [27] use two queues, D and G, to store the historically trained models of the discriminator and the generator. They also show that Fictitious GAN can effectively resolve some convergence issues that the standard training approach cannot, and that it can be applied on top of existing GAN variants.
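As a concrete illustration, the following is a minimal sketch of this queue-based scheme, not the authors' implementation: each player is updated against the uniform mixture of the opponent's historical snapshots. All names (`G`, `D`, `gan_loss_d`, `gan_loss_g`, `sample_real`, `sample_noise`, the optimizers, `num_steps`, and `batch`) are assumed placeholders.

```python
import copy

# Sketch of Fictitious-GAN-style training [27]: each player best-responds to
# the *mixture* of the opponent's historical models kept in a bounded queue.
HISTORY = 5
g_queue, d_queue = [], []  # queues of past generator / discriminator snapshots

for step in range(num_steps):
    # Discriminator update: best response to the mixture of historical generators.
    d_loss = 0.0
    for g_old in (g_queue or [G]):
        fake = g_old(sample_noise(batch)).detach()
        d_loss = d_loss + gan_loss_d(D(sample_real(batch)), D(fake)) / max(len(g_queue), 1)
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: best response to the mixture of historical discriminators.
    g_loss = 0.0
    for d_old in (d_queue or [D]):
        g_loss = g_loss + gan_loss_g(d_old(G(sample_noise(batch)))) / max(len(d_queue), 1)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # Push snapshots; at test time, samples come from the generator mixture.
    g_queue.append(copy.deepcopy(G)); d_queue.append(copy.deepcopy(D))
    g_queue, d_queue = g_queue[-HISTORY:], d_queue[-HISTORY:]
```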
3) Federated learning: Data limitation is a common drawback in deep learning models such as GANs. This issue can be mitigated by using distributed data from multiple sources, but doing so is difficult for reasons such as users' privacy concerns, communication efficiency, and statistical heterogeneity. This motivates the use of federated learning in GANs to address these concerns [51], [52].
Rasouli et al. in [51] proposed a federated approach to GANs that trains over distributed, non-independent-and-identically-distributed (non-IID) data sources. In this model, after every K local gradient steps, agents send their local discriminator and generator parameters to an intermediary and receive back the synchronized parameters. Owing to its lower average communication per round per agent, FedGAN is more efficient than a general distributed GAN, and experiments showed it remains robust as K increases. To prove convergence, the authors connect the convergence of the GAN to the convergence of an ordinary differential equation (ODE) representation of the parameter updates [80], under equal or two-time-scale updates of generators and discriminators. They showed that the ODE representation of the FedGAN parameter updates asymptotically tracks the ODE representing the parameter updates of a centralized GAN; hence, by invoking existing results for centralized GANs, FedGAN also converges.
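A minimal sketch of this K-step synchronization pattern is given below; `agents`, `local_step`, `num_rounds`, and the per-agent models are assumed placeholders, not the FedGAN code.

```python
import torch

K = 20  # synchronization interval (local gradient steps between rounds)

def average_state_dicts(dicts):
    """Element-wise average of model parameter dictionaries (parameters only;
    integer buffers are glossed over in this sketch)."""
    return {k: torch.stack([d[k].float() for d in dicts]).mean(dim=0)
            for k in dicts[0].keys()}

for t in range(num_rounds):
    for agent in agents:
        for _ in range(K):
            local_step(agent)  # one local G/D gradient update on the agent's data
    # Intermediary: average and broadcast the synchronized parameters.
    g_avg = average_state_dicts([a.G.state_dict() for a in agents])
    d_avg = average_state_dicts([a.D.state_dict() for a in agents])
    for agent in agents:
        agent.G.load_state_dict(g_avg)
        agent.D.load_state_dict(d_avg)
```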
Fan et al. in [52] also proposed a generative learning model built on a federated learning framework. The aim is to train a unified central GAN model from the combined generative models of the clients. They examine four synchronization strategies: synchronizing both the central discriminator and generator to every client (Sync D&G), synchronizing only the generator or only the discriminator (Sync G or Sync D), or synchronizing neither (Sync None). When communication costs are high, they recommend Sync G, at the price of some generative potential; otherwise, both D and G should be synchronized. Their results showed that federated learning is generally robust to the number of clients with independent and identically distributed (IID) and moderately non-IID training data. However, for highly skewed data distributions, the model behaved abnormally due to weight divergence.
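A small illustrative sketch of these four strategies follows; `central_G`, `central_D`, and `clients` are assumed placeholders.

```python
# Hypothetical sketch of the synchronization strategies examined in [52].
def synchronize(strategy, central_G, central_D, clients):
    for c in clients:
        if strategy in ("sync_dg", "sync_g"):
            c.G.load_state_dict(central_G.state_dict())  # push central generator
        if strategy in ("sync_dg", "sync_d"):
            c.D.load_state_dict(central_D.state_dict())  # push central discriminator
        # "sync_none": clients keep training purely locally.

# When communication is expensive, [52] recommends generator-only sync:
synchronize("sync_g", central_G, central_D, clients)
```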
4) Reinforcement learning: Cross-modal hashing maps different types of multimedia data into a common Hamming space, enabling fast and flexible retrieval across modalities. It has two weaknesses: (1) it depends on large-scale labeled cross-modal training data, and (2) it ignores the rich information contained in the large amount of unlabeled data across modalities. Zhang et al. in [53] therefore propose the Semi-supervised Cross-modal Hashing GAN (SCH-GAN), which exploits a large amount of unlabeled data to improve hashing learning. Given a query in one modality, the generator takes the correlation score predicted by the discriminator as a reward and tries to pick margin examples of the other modality from the unlabeled data. The discriminator tries to predict the correlation between the query and the generator's chosen examples, with the generator trained via reinforcement learning.
captioning like maximum likelihood method suffer from a so- we need to generate data similar to real ones and have specific
called exposure bias problem which happens when the model properties or attributes. Hossam et al. in [62] introduce the first
tries to produce a sequence of tokens based on previous tokens. GAN-controlled generative model for sequences that address
In this situation, the model may generate tokens that were the diversity issue in a principled approach. The authors
never seen in training data [82]. Yan et al. in [57] used the combine GAN and RL policy learning benefits while avoiding
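As an illustration, goals of "appropriate difficulty" can be labeled by the empirical success rate of the current policy; the band `[R_MIN, R_MAX]` and `evaluate_success_rate` below are illustrative assumptions rather than the exact procedure of [54].

```python
# Goals are kept only when the current policy succeeds at an intermediate
# rate, pushing the goal generator toward the agent's learning frontier.
R_MIN, R_MAX = 0.1, 0.9  # success-rate band defining intermediate difficulty

def label_goals(goals, policy, episodes_per_goal=10):
    labels = []
    for g in goals:
        rate = evaluate_success_rate(policy, g, episodes_per_goal)  # fraction of successes
        labels.append(1 if R_MIN <= rate <= R_MAX else 0)           # 1 = feed to D/G training
    return labels
```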
GANs have limitations when the goal is to generate sequences of discrete tokens. First, it is hard to pass the gradient update from the discriminator to the generator when the outputs are discrete. Second, the discriminator can only reward an entire sequence after it has been generated; for a partially generated sequence, it is non-trivial to balance how good the sequence is now against how good it will be once completed. Yu et al. in [55] proposed Sequence GAN (SeqGAN), which models the data generator as a stochastic policy in reinforcement learning (RL). The RL reward signal comes from the discriminator's judgment of a complete sequence and, using Monte Carlo search, is passed back to the intermediate state-action steps. The method thus accounts for the long-term reward at every timestep, considering not only the fitness of the previous tokens but also the resulting future outcome. "This is similar to playing the games such as Go or Chess, where players sometimes give up the immediate interests for the long-term victory" [81].
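A compact sketch of this reward mechanism is given below; `generator.rollout`, `generator.log_prob`, and `discriminator` are assumed placeholders, not the SeqGAN code.

```python
import torch

def mc_reward(prefix, generator, discriminator, n_rollouts=16):
    """Intermediate reward via Monte Carlo rollout [55]: complete the prefix
    several times with the current policy and average the discriminator score."""
    scores = []
    for _ in range(n_rollouts):
        full_seq = generator.rollout(prefix)    # complete the prefix stochastically
        scores.append(discriminator(full_seq))  # prob. that the completion is "real"
    return torch.stack(scores).mean()

def policy_gradient_loss(tokens, generator, discriminator):
    # REINFORCE-style loss: reward-weighted negative log-likelihood of each action.
    loss = 0.0
    for t in range(1, len(tokens) + 1):
        r = mc_reward(tokens[:t], generator, discriminator)
        loss = loss - r.detach() * generator.log_prob(tokens[:t])
    return loss
```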
The main problem in [55] is that the classifier's reward cannot accurately reflect the novelty of the text. In contrast to [55], Xu et al. in [56] assign a low reward to repeatedly generated text and a high reward to "novel" and fluent text, encouraging the generator to produce diverse and informative output, and propose a language-model-based discriminator that better distinguishes novel text from repeated text without the saturation problem. The generator reward consists of two parts: a sentence-level reward and a word-level reward. To train the discriminator, the authors maximize the reward of real text and minimize the reward of generated text. The rationale for minimizing the reward of generated text is that text repeatedly produced by the generator can be identified by the discriminator and assigned a lower reward; the motivation for maximizing the reward of real-world data is that not only can uncommon text in the generated data receive a high reward, but the discriminator can also punish low-quality text to some extent.
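The discriminator objective described above can be sketched as follows, where `reward_model` is an assumed stand-in for the language-model-based reward combining the sentence- and word-level terms.

```python
def discriminator_loss(reward_model, real_batch, fake_batch):
    """Maximize reward on real text, minimize it on generated text [56],
    so repeated generator output is driven toward a low reward."""
    r_real = reward_model(real_batch).mean()  # push rewards of human text up
    r_fake = reward_model(fake_batch).mean()  # push rewards of generated text down
    return -(r_real - r_fake)                 # minimizing the loss maximizes the gap
```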
The same notion as SeqGAN can be applied in domains such as image captioning, whose aim is to describe an image with words. Earlier approaches to image captioning, such as maximum likelihood training, suffer from the so-called exposure bias problem, which arises when the model produces a sequence of tokens conditioned on its own previous tokens; in this situation, the model may generate token contexts never seen in the training data [82]. Yan et al. in [57] used the idea of SeqGAN to address exposure bias. In this scheme, the image captioning model plays the generator in the GAN framework, aiming to describe the images. The discriminator has two duties: first, to distinguish real descriptions from generated ones, and second, to determine whether a description is related to the image. To handle the discreteness of the generated text, the discriminator is treated as an agent that produces a reward for the generator, and the lack of an intermediate reward is again resolved with the Monte Carlo roll-out strategy, as in SeqGAN.
Finding new chemical compounds and generating molecules are also challenging tasks in a discrete setting. [59] and [58] tackled this problem with two models that build on SeqGAN; their main difference is the addition of an RL component to the basic GAN architecture, as discussed in subsection IV-B5.

The idea behind SeqGAN has also been applied to generating sentences with certain labels. Li et al. in [4] introduced CS-GAN, which consists of a generator and a descriptor (a discriminator plus a classifier). In this model, the generator takes an action, and the descriptor's task is to identify sentence categories and return the corresponding reward. Details of this model are given in subsection IV-B4.
Aghakhani et al. in [60] introduce FakeGAN, the first system to extend GANs to a text classification task, specifically detecting deceptive reviews. Previous text classification models have limitations: (1) bias problems, as in recurrent neural networks, where later words in a text carry more weight than earlier ones, and (2) dependence on the window size, as in CNNs. Unlike a standard GAN with a single generator and discriminator, FakeGAN uses two discriminators and one generator. The authors model the generator as a stochastic policy agent in reinforcement learning (RL) and use the Monte Carlo search algorithm to estimate the intermediate action-value and pass it from the discriminators to the generator as the RL reward. One discriminator tries to distinguish truthful reviews from deceptive ones, whereas the other tries to distinguish fake reviews from real ones.
Ghosh et al. in [61] use GANs to learn the handwriting of an individual and combine them with reinforcement learning techniques to achieve faster learning. The generator produces words that look similar to the reference word, and the discriminator network can be used as an optical character recognition (OCR) system. Reinforcement learning comes into play when letters need to be joined to form words, for aspects such as the spacing between characters and the strokes from one note to another, providing suitable rewards or penalties for the generator to learn the handwriting more accurately.
Generating sequences optimized for particular desired goals is also challenging. Most current work mainly learns to generate outputs that are close to the real distribution; however, many applications require data that are both similar to real samples and endowed with specific properties or attributes. Hossam et al. in [62] introduce the first GAN-controlled generative model for sequences that addresses the diversity issue in a principled way, combining the benefits of GAN and RL policy learning while avoiding the drawbacks of mode collapse and high variance. The authors show that if pure RL alone is applied with the GAN-based objective, the realistic quality of the output may be sacrificed for the sake of a higher reward; in text generation, for example, the model could achieve a similar quality score by generating sentences in which a few words are repeated all the time. Combining a GAN-based objective with RL therefore keeps the RL optimization process close to the actual data distribution. The approach can be attached to any GAN model, enabling it to directly optimize the desired goal for the given task.
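One way to read this combination is as a weighted objective; the sketch below is an illustrative approximation (the weights, `goal_reward`, and the model interfaces are assumptions, not OptiGAN's actual code).

```python
LAMBDA_MLE, LAMBDA_GAN, LAMBDA_RL = 1.0, 0.5, 0.5  # illustrative weights

def generator_loss(batch, generator, discriminator, goal_reward):
    samples = generator.sample(len(batch))
    loss_mle = -generator.log_prob(batch).mean()    # anchor to the real data
    loss_gan = -discriminator(samples).log().mean() # keep samples realistic
    # REINFORCE on the task-specific goal; goal_reward is a non-differentiable score.
    loss_rl = -(goal_reward(samples) * generator.log_prob(samples)).mean()
    return LAMBDA_MLE * loss_mle + LAMBDA_GAN * loss_gan + LAMBDA_RL * loss_rl
```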
A novel RL-based neural architecture search (NAS) methodology for GANs is proposed by Tian et al. in [63]. The article redefines neural architecture search for GANs as a Markov decision process, yielding a more effective RL-based search algorithm with more global optimization. This formulation also improves data efficiency by better facilitating off-policy RL training [63]. Most previously proposed RL-based GAN architecture search methods use on-policy RL, which can lead to significantly long training times because of limited data efficiency. Off-policy RL agents can learn more effectively because they reuse past experience; however, off-policy data can destabilize policy network training, since the training samples differ systematically from on-policy ones [63]. The new formulation in [63] better supports the off-policy strategy and lessens this instability problem.
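A rough sketch of such an MDP formulation, in which architecture construction becomes a sequence of cell-selection actions whose transitions are stored for off-policy training, might look as follows; `CELL_CHOICES`, `train_and_score`, `agent`, and `replay_buffer` are all illustrative assumptions rather than the design of [63].

```python
CELL_CHOICES = ["conv3x3", "conv5x5", "upsample", "attention"]
NUM_CELLS = 5

def search_episode(agent, replay_buffer):
    """One MDP episode: state = partial architecture, action = next cell,
    reward (e.g., based on IS/FID of the trained candidate) only at the end."""
    state = []  # partial architecture
    for step in range(NUM_CELLS):
        action = agent.act(state)            # pick the next cell from CELL_CHOICES
        next_state = state + [action]
        done = (step == NUM_CELLS - 1)
        reward = train_and_score(next_state) if done else 0.0
        # Off-policy RL: transitions go to a replay buffer and can be reused later.
        replay_buffer.append((state, action, reward, next_state, done))
        state = next_state
    return state  # the sampled architecture
```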
TABLE V: Summary of the publications included in the learning method category (Subsection IV-C).

| Reference | Methodology and Contributions | Pros | Cons |
|---|---|---|---|
| No-regret learning: Subsection IV-C1 | | | |
| DRAGAN [49] | Applying a no-regret algorithm; new regularizer | High stability across objective functions; mitigates mode collapse | - |
| Chekhov GAN [50] | Online learning algorithm for semi-concave games | Converges to a mixed NE for semi-shallow discriminators | - |
| Fictitious play: Subsection IV-C2 | | | |
| Fictitious GAN [27] | Fictitious play (historical models) | Solves the oscillation behavior; solves the divergence issue in some cases; applicable on top of existing GAN variants | Applies only to two-player zero-sum games |
| Federated learning: Subsection IV-C3 | | | |
| FedGAN [51] | Communication-efficient distributed GAN subject to privacy constraints; connects GAN convergence to an ODE | Proven convergence; less communication complexity compared to a general distributed GAN | - |
| [52] | Using a federated learning framework | Robust to the number of clients with IID and moderately non-IID data | Performs anomalously for highly skewed data distributions; accuracy drops with non-IID data |
| Reinforcement learning: Subsection IV-C4 | | | |
| [54] | Generating a diverse set of goals at the appropriate level of difficulty | - | - |
| Diversity-promoting GAN [56] | New objective function; text generation | Diversity and novelty | - |
| SCH-GAN [53] | Using GAN for cross-modal hashing | Extracts rich information from unlabeled data | - |
| SeqGAN [55] | Extending GANs to generate sequences of discrete tokens | Solves the problem of discrete data | - |
| FakeGAN [60] | Text classification | - | - |
| CS-GAN [4] | Combining RL, GAN, and RNN | More realistic; faster | - |
| [61] | Handwriting recognition | - | - |
| ORGAN [59] | RL agent + SeqGAN | Better results than RNNs trained via MLE or SeqGAN | Works only on sequential data |
| MolGAN [58] | RL agent + SeqGAN | Optimizes non-differentiable metrics via RL; faster training time | Susceptible to mode collapse |
| OptiGAN [62] | Combining MLE and GAN | Usable for different models and goals | - |
| [63] | Redefining neural architecture search for GANs as a Markov decision process | More effective RL-based search algorithm; smoother architecture sampling | - |
V. DISCUSSION, CONCLUSION AND FUTURE WORK

Although various studies have explored different aspects of GANs, several challenges remain to be investigated. In this section, we discuss these challenges, especially for the subject at hand, the game of GANs, and propose future research directions to tackle them.

A. Open Problems and Future Directions

While GANs achieve state-of-the-art performance and compelling results on various generative tasks, these results come with several challenges, above all the difficulty of training GANs. The training procedure suffers from instability problems: while approaching a Nash equilibrium, the generator and discriminator each try to minimize their own cost function, regardless of the other. This can cause non-convergence and instability, because minimizing one cost can lead to maximizing the other player's cost. Another main problem of GANs that needs to be addressed is mode collapse, which becomes more critical for unbalanced data sets or when the number of classes is high. On the other hand, when the discriminator distinguishes samples well, the generator's gradients vanish; this vanishing gradient problem should also be considered. Compared with other generative models, the evaluation of GANs is more difficult, partly due to the lack of appropriate metrics: most evaluation metrics are qualitative rather than quantitative, and qualitative assessment such as human examination of samples is an arduous task that depends on the subject.

More specifically, as the authors of [9] express, one of the most important future directions is to improve the theoretical aspects of GANs to solve problems such as mode collapse, non-convergence, and training difficulties. Although there are many works on the theoretical side, most current training strategies are based on optimization theory, whose scope is restricted to local convergence due to non-convexity, and the utilization of game-theoretic techniques is still in its infancy. At present, the game-theoretic GAN variants are limited, many of them are highly restrictive, and they are rarely directly applicable. Hence, there is much room for research on game-based GANs involving other game models.
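As a tiny, self-contained illustration of the non-convergence issue discussed above (not taken from any of the surveyed papers), simultaneous gradient descent-ascent on the bilinear game min_x max_y xy spirals away from the unique equilibrium (0, 0):

```python
x, y, lr = 1.0, 1.0, 0.1
for t in range(100):
    gx, gy = y, x                     # d(xy)/dx = y, d(xy)/dy = x
    x, y = x - lr * gx, y + lr * gy   # simultaneous update of both players
    if t % 25 == 0:
        print(t, x, y, x * x + y * y) # squared distance from (0, 0) keeps growing
```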
From the convergence viewpoint, most of the current training methods converge to a local Nash equilibrium, which can be far from an actual, global NE. While there is a vast literature on GAN training, only a few works, such as [48], formulate the training procedure from the mixed-NE perspective, and the mixed NE of GANs should be investigated in more depth. On the other hand, the existence of an equilibrium does not imply that it can easily be found by a simple algorithm. In particular, training GANs requires finding Nash equilibria in non-convex games, and computing the equilibria of these games is computationally hard. In the future, we expect to see more solutions that try to make GAN training more stable and converge to an actual NE.

Multi-agent models such as [46], [51], [60], [64]-[73] are computationally more complex and expensive than two-player models, and this factor should be taken into account in the development of such variants. Moreover, in multi-generator structures, the divergence among the generators should be controlled so that they do not all generate the same samples.

Another direction in which we expect to witness innovation in the future is the integration of GANs with other learning methods. There is a variety of methods in the multi-agent learning literature that should be explored, as they may be useful when applied to multi-agent GANs. In addition, much more research on the relationship and combination between GANs and currently applied learning approaches such as RL is still required, and this will be a promising research direction in the next few years. Moreover, GANs were proposed for unsupervised learning, but adding a certain number of labels, especially in practical applications, can substantially improve their generative capability; how to combine GANs with semi-supervised learning is therefore also one of the potential future research topics.

As a final note, the GAN is a relatively novel model with significant recent progress, so the landscape of possible applications remains open for exploration. Advances in solving the above challenges can be decisive for GANs to be employed more widely in real scenarios.

B. Conclusion

We conducted this review of recent progress in GANs through the lens of game theory, which can serve as a reference for future research. Compared to the other reviews in the literature, and considering the many published works that deal with GAN challenges, we emphasize the theoretical aspects, taking a game-theoretic perspective based on our proposed taxonomy. In this survey, we first provided detailed background information on game theory and GANs. To present a clear roadmap, we introduced our taxonomy, which has three major categories: the game model, the architecture, and the learning approach. Following the proposed taxonomy, we discussed each category separately in detail and presented the GAN-based solutions in each subcategory. We hope this paper is beneficial for researchers interested in this field.

REFERENCES

[1] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial networks," arXiv preprint arXiv:1406.2661, 2014.
[2] Y. Hong, U. Hwang, J. Yoo, and S. Yoon, "How generative adversarial networks and their variants work: An overview," ACM Computing Surveys (CSUR), vol. 52, no. 1, pp. 1-43, February 2019.
[3] W. Fedus, M. Rosca, B. Lakshminarayanan, A. M. Dai, S. Mohamed, and I. Goodfellow, "Many paths to equilibrium: GANs do not need to decrease a divergence at every step," arXiv preprint arXiv:1710.08446, 2017.
[4] Y. Li, Q. Pan, S. Wang, T. Yang, and E. Cambria, "A generative model for category text generation," Information Sciences, vol. 450, pp. 301-315, June 2018.
[5] H. Alqahtani, M. Kavakli-Thorne, and G. Kumar, "Applications of generative adversarial networks (GANs): An updated review," Archives of Computational Methods in Engineering, vol. 28, no. 3, pp. 525-552, March 2021.
[6] K. Wang, C. Gou, Y. Duan, Y. Lin, X. Zheng, and F.-Y. Wang, "Generative adversarial networks: introduction and outlook," IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 4, pp. 588-598, 2017.
[7] Z. Wang, Q. She, and T. E. Ward, "Generative adversarial networks in computer vision: A survey and taxonomy," arXiv preprint arXiv:1906.01529, 2019.
[8] M. Wiatrak and S. V. Albrecht, "Stabilizing generative adversarial network training: A survey," arXiv preprint arXiv:1910.00927, 2019.
[9] Y.-J. Cao, L.-L. Jia, Y.-X. Chen, N. Lin, C. Yang, B. Zhang, Z. Liu, X.-X. Li, and H.-H. Dai, "Recent advances of generative adversarial networks in computer vision," IEEE Access, vol. 7, pp. 14985-15006, December 2018.
[10] I. Goodfellow, "NIPS 2016 tutorial: Generative adversarial networks," arXiv preprint arXiv:1701.00160, 2016.
[11] A. Creswell, T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, "Generative adversarial networks: An overview," IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 53-65, 2018.
[12] S. Hitawala, "Comparative study on generative adversarial networks," arXiv preprint arXiv:1801.04271, 2018.
[13] L. Gonog and Y. Zhou, "A review: Generative adversarial networks," in 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA), Xi'an, China, June 2019, pp. 505-510.
[14] A. Bissoto, E. Valle, and S. Avila, "The six fronts of the generative adversarial networks," arXiv preprint arXiv:1910.13076, 2019.
[15] S.-F. Zhang, J.-H. Zhai, D.-S. Luo, Y. Zhan, and J.-F. Chen, "Recent advance on generative adversarial networks," in 2018 International Conference on Machine Learning and Cybernetics (ICMLC), Chengdu, China, July 2018, pp. 69-74.
[16] Z. Pan, W. Yu, X. Yi, A. Khan, F. Yuan, and Y. Zheng, "Recent progress on generative adversarial networks (GANs): A survey," IEEE Access, vol. 7, pp. 36322-36333, 2019.
[17] M. P. Kumar and P. Jayagopal, "Generative adversarial networks: a survey on applications and challenges," International Journal of Multimedia Information Retrieval, vol. 10, no. 1, pp. 1-24, March 2021.
[18] P. Salehi, A. Chalechale, and M. Taghizadeh, "Generative adversarial networks (GANs): An overview of theoretical model, evaluation metrics, and recent developments," arXiv preprint arXiv:2005.13178, 2020.
[19] B. Ghosh, I. K. Dutta, M. Totaro, and M. Bayoumi, "A survey on the progression and performance of generative adversarial networks," in 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT). Kharagpur, India: IEEE, July 2020, pp. 1-8.
[20] D. Saxena and J. Cao, "Generative adversarial networks (GANs): Challenges, solutions, and future directions," arXiv preprint arXiv:2005.00065, 2020.
[21] J. Gui, Z. Sun, Y. Wen, D. Tao, and J. Ye, "A review on generative adversarial networks: Algorithms, theory, and applications," arXiv preprint arXiv:2001.06937, 2020.
[22] A. Jabbar, X. Li, and B. Omar, "A survey on generative adversarial networks: Variants, applications, and training," arXiv preprint arXiv:2006.05132, 2020.
[23] M. J. Osborne et al., An introduction to game theory. Oxford University Press, New York, 2004, vol. 3, no. 3.
[24] Y. Shoham and K. Leyton-Brown, Multiagent systems: Algorithmic, game-theoretic, and logical foundations. Cambridge University Press, 2008.
[25] J. v. Neumann, "Zur theorie der gesellschaftsspiele," Mathematische Annalen, vol. 100, no. 1, pp. 295-320, December 1928.
[26] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, ser. NIPS'14. Cambridge, MA, USA: MIT Press, December 2014, pp. 2672-2680.
[27] H. Ge, Y. Xia, X. Chen, R. Berry, and Y. Wu, "Fictitious GAN: Training GANs with historical models," in Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, September 2018, pp. 119-134.
[28] M. Lucic, K. Kurach, M. Michalski, S. Gelly, and O. Bousquet, "Are GANs created equal? A large-scale study," arXiv preprint arXiv:1711.10337, 2017.
[29] M. Lee and J. Seok, "Regularization methods for generative adversarial networks: An overview of recent studies," arXiv preprint arXiv:2005.09165, 2020.
[30] Z. Pan, W. Yu, B. Wang, H. Xie, V. S. Sheng, J. Lei, and S. Kwong, "Loss functions of generative adversarial networks (GANs): Opportunities and challenges," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 4, no. 4, August 2020.
[31] X. Wu, K. Xu, and P. Hall, "A survey of image synthesis and editing with generative adversarial networks," Tsinghua Science and Technology, vol. 22, no. 6, pp. 660-674, December 2017.
[32] L. Wang, W. Chen, W. Yang, F. Bi, and F. R. Yu, "A state-of-the-art review on image synthesis with generative adversarial networks," IEEE Access, vol. 8, pp. 63514-63537, March 2020.
[33] V. Sorin, Y. Barash, E. Konen, and E. Klang, "Creating artificial images for radiology applications using generative adversarial networks (GANs) - a systematic review," Academic Radiology, vol. 27, pp. 1175-1185, August 2020.
[34] M. E. Tschuchnig, G. J. Oostingh, and M. Gadermayr, "Generative adversarial networks in digital pathology: a survey on trends and future potential," Patterns, vol. 1, no. 6, p. 100089, September 2020.
[35] J. Agnese, J. Herrera, H. Tao, and X. Zhu, "A survey and taxonomy of adversarial neural networks for text-to-image synthesis," Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 10, no. 4, p. e1345, February 2020.
[36] P. Jain and T. Jayaswal, "Generative adversarial training and its utilization for text to image generation: A survey and analysis," Journal of Critical Reviews, vol. 7, no. 8, pp. 1455-1463, July 2020.
[37] V. Sampath, I. Maurtua, J. J. A. Martín, and A. Gutierrez, "A survey on generative adversarial networks for imbalance problems in computer vision tasks," Journal of Big Data, vol. 8, no. 1, p. 27, January 2021.
[38] X. Yi, E. Walia, and P. Babyn, "Generative adversarial network in medical imaging: A review," Medical Image Analysis (MEDIA), vol. 58, p. 101552, December 2019.
[39] C. Yinka-Banjo and O.-A. Ugot, "A review of generative adversarial networks and its application in cybersecurity," Artificial Intelligence Review, vol. 53, no. 3, pp. 1721-1736, March 2020.
[40] F. Di Mattia, P. Galeone, M. De Simoni, and E. Ghelfi, "A survey on GANs for anomaly detection," arXiv preprint arXiv:1906.11632, 2019.
[41] J. Georges-Filteau and E. Cirillo, "Synthetic observational health data with GANs: from slow adoption to a boom in medical research and ultimately digital twins?" November 2020.
[42] N. Gao, H. Xue, W. Shao, S. Zhao, K. K. Qin, A. Prabowo, M. S. Rahaman, and F. D. Salim, "Generative adversarial networks for spatio-temporal data: A survey," arXiv preprint arXiv:2008.08903, 2020.
[43] S. Shin, H. Jeon, C. Cho, S. Yoon, and T. Kim, "User mobility synthesis based on generative adversarial networks: A survey," in 2020 22nd International Conference on Advanced Communication Technology (ICACT). Phoenix Park, Korea (South): IEEE, February 2020, pp. 94-103.
[44] L. Fan, "A survey of differentially private generative adversarial networks," in The AAAI Workshop on Privacy-Preserving Artificial Intelligence, New York, NY, USA, February 2020.
[45] B. Franci and S. Grammatico, "Generative adversarial networks as stochastic Nash games," arXiv preprint arXiv:2010.10013, July 2020.
[46] H. Zhang, S. Xu, J. Jiao, P. Xie, R. Salakhutdinov, and E. P. Xing, "Stackelberg GAN: Towards provable minimax equilibrium via multi-generator architectures," arXiv preprint arXiv:1811.08010, 2018.
[47] F. Farnia and A. Ozdaglar, "GANs may have no Nash equilibria," arXiv preprint arXiv:2002.09124, 2020.
[48] Y.-P. Hsieh, C. Liu, and V. Cevher, "Finding mixed Nash equilibria of generative adversarial networks," in International Conference on Machine Learning, Long Beach, CA, USA, June 2019, pp. 2810-2819.
[49] N. Kodali, J. Abernethy, J. Hays, and Z. Kira, "On convergence and stability of GANs," arXiv preprint arXiv:1705.07215, 2017.
[50] P. Grnarova, K. Y. Levy, A. Lucchi, T. Hofmann, and A. Krause, "An online learning approach to generative adversarial networks," arXiv preprint arXiv:1706.03269, 2017.
[51] M. Rasouli, T. Sun, and R. Rajagopal, "FedGAN: Federated generative adversarial networks for distributed data," arXiv preprint arXiv:2006.07228, 2020.
[52] C. Fan and P. Liu, "Federated generative adversarial learning," arXiv preprint arXiv:2005.03793, 2020.
[53] J. Zhang, Y. Peng, and M. Yuan, "SCH-GAN: Semi-supervised cross-modal hashing by generative adversarial network," IEEE Transactions on Cybernetics, vol. 50, no. 2, pp. 489-502, February 2018.
[54] C. Florensa, D. Held, X. Geng, and P. Abbeel, "Automatic goal generation for reinforcement learning agents," in Proceedings of the 35th International Conference on Machine Learning, vol. 80, Stockholm, Sweden, July 2018, pp. 1515-1528.
[55] L. Yu, W. Zhang, J. Wang, and Y. Yu, "SeqGAN: Sequence generative adversarial nets with policy gradient," in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. San Francisco, CA, USA: AAAI Press, February 2017, pp. 2852-2858.
[56] J. Xu, X. Ren, J. Lin, and X. Sun, "Diversity-promoting GAN: A cross-entropy based generative adversarial network for diversified text generation," in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, November 2018, pp. 3940-3949.
[57] S. Yan, F. Wu, J. S. Smith, W. Lu, and B. Zhang, "Image captioning using adversarial networks and reinforcement learning," in 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, August 2018, pp. 248-253.
[58] N. De Cao and T. Kipf, "MolGAN: An implicit generative model for small molecular graphs," arXiv preprint arXiv:1805.11973, 2018.
[59] G. L. Guimaraes, B. Sanchez-Lengeling, C. Outeiral, P. L. C. Farias, and A. Aspuru-Guzik, "Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models," arXiv preprint arXiv:1705.10843, 2017.
[60] H. Aghakhani, A. Machiry, S. Nilizadeh, C. Kruegel, and G. Vigna, "Detecting deceptive reviews using generative adversarial networks," in 2018 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, USA, May 2018, pp. 89-95.
[61] A. Ghosh, B. Bhattacharya, and S. B. R. Chowdhury, "Handwriting profiling using generative adversarial networks," arXiv preprint arXiv:1611.08789, 2016.
[62] M. Hossam, T. Le, V. Huynh, M. Papasimeon, and D. Phung, "OptiGAN: Generative adversarial networks for goal optimized sequence generation," arXiv preprint arXiv:2004.07534, 2020.
[63] Y. Tian, Q. Wang, Z. Huang, W. Li, D. Dai, M. Yang, J. Wang, and O. Fink, "Off-policy reinforcement learning for efficient and effective GAN architecture search," in European Conference on Computer Vision (ECCV). Springer, August 2020, pp. 175-192.
[64] A. Ghosh, V. Kulharia, V. P. Namboodiri, P. H. Torr, and P. K. Dokania, "Multi-agent diverse generative adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, June 2018, pp. 8513-8521.
[65] Q. Hoang, T. D. Nguyen, T. Le, and D. Phung, "MGAN: Training generative adversarial nets with multiple generators," in International Conference on Learning Representations, Vancouver, Canada, February 2018.
[66] S. Ke and W. Liu, "Consistency of multiagent distributed generative adversarial networks," IEEE Transactions on Cybernetics, early access, pp. 1-11, October 2020.
[67] A. Ghosh, V. Kulharia, and V. Namboodiri, "Message passing multi-agent GANs," arXiv preprint arXiv:1612.01294, 2016.
[68] I. Durugkar, I. Gemp, and S. Mahadevan, "Generative multi-adversarial networks," arXiv preprint arXiv:1611.01673, 2016.
[69] Y. Jin, Y. Wang, M. Long, J. Wang, S. Y. Philip, and J. Sun, "A multi-player minimax game for generative adversarial networks," in 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK, July 2020, pp. 1-6.
[70] C. Hardy, E. Le Merrer, and B. Sericola, "MD-GAN: Multi-discriminator generative adversarial networks for distributed datasets," in 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, May 2019, pp. 866-877.
[71] G. Mordido, H. Yang, and C. Meinel, "microbatchGAN: Stimulating diversity with multi-adversarial discrimination," in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass, CO, USA, March 2020, pp. 3050-3059.
[72] T. Nguyen, T. Le, H. Vu, and D. Phung, "Dual discriminator generative adversarial nets," Advances in Neural Information Processing Systems (NIPS 2017), vol. 30, pp. 2670-2680, December 2017.
[73] S. Arora, R. Ge, Y. Liang, T. Ma, and Y. Zhang, "Generalization and equilibrium in generative adversarial nets (GANs)," arXiv preprint arXiv:1703.00573, 2017.
[74] N.-T. Tran, V.-H. Tran, B.-N. Nguyen, L. Yang, and N.-M. M. Cheung, "Self-supervised GAN: Analysis and improvement with multi-class minimax game," Advances in Neural Information Processing Systems (NeurIPS 2019), vol. 32, pp. 13253-13264, 2019.
[75] M. Sarmad, H. J. Lee, and Y. M. Kim, "RL-GAN-Net: A reinforcement learning agent controlled GAN network for real-time point cloud shape completion," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, June 2019, pp. 5898-5907.
[76] J. Liang and W. Tang, "Sequence generative adversarial networks for wind power scenario generation," IEEE Journal on Selected Areas in Communications, vol. 38, no. 1, pp. 110-118, January 2020.
[77] D. Weininger, "SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules," Journal of Chemical Information and Computer Sciences, vol. 28, no. 1, pp. 31-36, February 1988.
[78] R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Machine Learning, vol. 8, no. 3, pp. 229-256, May 1992.
[79] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, 2015.
[80] L. Mescheder, S. Nowozin, and A. Geiger, "The numerics of GANs," in Proceedings of the conference "Neural Information Processing Systems 2017," vol. 30. Red Hook, NY, USA: Curran Associates Inc., December 2017, pp. 1825-1835.
[81] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, no. 7587, pp. 484-489, January 2016.
[82] S. Bengio, O. Vinyals, N. Jaitly, and N. Shazeer, "Scheduled sampling for sequence prediction with recurrent neural networks," in Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, ser. NIPS'15. Cambridge, MA, USA: MIT Press, September 2015, pp. 1171-1179.