
International Journal of All Research Education and Scientific Methods (IJARESM),

ISSN: 2455-6211, Volume 12, Issue 5, May-2024, Available online at: www.ijaresm.com

META – GANs for Progressive Image Processing


Syed Saad Quadri1, Venkatesh G2, Anuroop S R3, Tharun Kumar J4
1,2,4 UG Students, Department of Computer Science and Engineering (Data Science), MVJ College of Engineering, Bangalore, Karnataka, India
3 Assistant Professor, Department of Information Science and Engineering, MVJ College of Engineering, Bangalore, Karnataka, India

---------------------------------------------------------------****************-------------------------------------------------------------

ABSTRACT

In this paper, we present Meta-GAN, a new model designed to enhance the quality of facial images in low-resolution
photographs. Meta-GAN utilizes a concept known as Generative Facial Prior, which is derived from a pre-trained
facial generation model, to achieve a balanced approach between realism and accuracy during the restoration
process. The framework addresses the challenges posed by various forms of degradation such as blur, low
resolution and noise. Unlike conventional methods, Meta-GAN simplifies the restoration process by concurrently
enhancing facial details and correcting colors in a single pass of the neural network.

The key feature of Meta-GAN lies in its seamless integration of facial detail restoration and color enhancement,
eliminating the need for cumbersome manual adjustments. This optimization streamlines the restoration process,
resulting in increased efficiency and effectiveness. Our comprehensive experiments, conducted across diverse
datasets including synthetic benchmarks and real-world images, highlight Meta-GAN's capability in revitalizing
previously unclear faces. Additionally, it represents a significant advancement in facial restoration technology, offering
a practical solution to real-world challenges. It underscores the potential of data-driven approaches in overcoming
limitations in image quality and fidelity. Ultimately, Meta-GAN serves as a testament to the power of innovation in
tackling complex problems, paving the way for further advancements in facial image restoration techniques.

Keywords: Adversarial Loss, Dual Path Network, Feature Splitter Modulation, Generative Adversarial Networks,
Meta-GANs, Local Component Discriminators.

INTRODUCTION

Many advances in the sphere of machine learning, particularly deep learning, were discovered when more computing or
processing capacity became accessible. Deep learning makes it easier to extract relevant, abstract, and high-level features
from input data for use as classifiers and detectors. Generative Adversarial Networks (GANs), introduced by Ian Goodfellow [1], rely on an adversarial training process in which a generator and a discriminator network compete, yielding a discriminator that learns to distinguish real samples from generated ones. Face restoration using GANs targets the recovery of high-quality facial images from degraded low-quality versions affected by unknown forms of deterioration such as blur [2, 3], low resolution [4] and noise. When applied to real-life situations, the challenge grows even further: images exhibit more complicated degradation, along with variations in facial orientation and expression. In this investigation,
we utilize the concept of Generative Facial Prior to improve the process of facial restoration, drawing on the inherent
insights embedded within pre-existing face models like StyleGAN [5, 6].

Face GANs exhibit the ability to produce authentic faces with a wide range of variations, offering valuable priors like
colors, textures, and facial geometry. These generative capabilities enable the simultaneous restoration of facial details and
enhancement of colors. Nonetheless, integrating these generative priors into the restoration process poses significant
challenges. One key obstacle lies in ensuring the stability and quality of the restored faces, especially when dealing with
low-quality input images. Additionally, the complexity of mapping the latent codes from degraded face images to the
generative priors accurately hinders the restoration process. Moreover, achieving a balance between realism and fidelity in
the restored faces remains a persistent challenge due to the intricate nature of facial features and textures.


In the past, conventional methods have often resorted to GAN inversion techniques [7, 8]. The standard procedure first inverts the degraded image back to a latent code of the pretrained GAN, followed by resource-intensive
optimizations tailored to each image. Despite yielding visually convincing results, these methods often compromise image
fidelity due to the limitations of low-dimensional latent codes in guiding precise restoration.

To overcome these challenges, we present Meta-GAN, a novel framework meticulously designed to achieve a harmonious
blend of realism and fidelity in a single computational iteration. Meta-GAN comprises a degradation removal module and a
pretrained face GAN serving as the facial prior. These elements are seamlessly integrated through a direct latent code
mapping mechanism, facilitated by a series of Feature Splitter Modulation (FSM) layers operating in a progressive
refinement manner.

By implementing FSM layers, we enact spatial modulation on feature subsets, enabling the preservation of unaltered
features for enhanced information retention. This methodology seamlessly integrates generative priors while upholding
fidelity standards. Furthermore, our approach introduces facial component loss using local discriminators to elevate
perceptual facial intricacies, complemented by the employment of identity-preserving loss for an additional fidelity boost.

LITERATURE SURVEY

In these methods [9, 10, 11], reference priors usually depend on images of the same person. To relax this requirement, DFDNet proposes building a dictionary of facial components such as eyes and mouth from CNN features to guide restoration. However, DFDNet tends to focus only on these components, which can cause problems in other regions such as hair or ears. In contrast, our Meta-GAN approach considers the whole face, making restoration more thorough.

Additionally, while DFDNet's dictionary is limited in size, Meta-GAN taps into a wider range of priors, including
geometry, textures, and colors. This broader approach gives us more options for restoration, making the results richer and
more varied.

Building on general face hallucination techniques [12, 13, 14, 15], many face restoration methods integrate two kinds of face-specific priors, geometry priors and reference priors, to enhance overall performance. Geometry priors encompass facial landmarks [16, 17], face parsing maps [18, 16, 3], and facial component heatmaps [15]. However, these methods face two main problems: they rely on estimates made from poor-quality images, which is unreliable in real-life situations, and they mainly capture the shape of a face, missing many of the fine details needed for effective restoration.

In contrast, our approach utilizes the Generative Facial Prior, which eliminates the need for explicit geometry estimation from degraded images and contains rich textures within its pretrained network, offering a promising alternative for more effective restoration.

The channel split operation has often been explored to create more efficient models and enhance their representational ability. For instance, MobileNet [19] introduces depth-wise convolutions, while GhostNet [20] splits convolutional layers into two parts and uses fewer filters to create essential feature maps. Additionally, the Dual Path Network (DPN) architecture [21] allows feature reuse and exploration in each path, thus enhancing its representational ability. Similar concepts have also been applied in super-resolution techniques. Our Feature Splitter Modulation (FSM) layers share a similar philosophy but employ different operations and serve a different purpose: we apply a spatial feature transformation to one split while keeping the other split as the original identity, striking a balance between realism and fidelity.

Local Component Discriminators are designed to prioritize the distributions within specific patches [22, 23, 24]. When
these discriminative losses are applied to faces, they are tailored to distinct facial regions [25, 26] with semantic
significance. Similarly, our newly introduced facial component loss adopts such strategies, focusing on specific facial areas.


Furthermore, our method integrates style supervision based on learned discriminative features, amplifying the effectiveness
of the facial component loss.

METHODOLOGIES

Figure 1: Overview of Meta-GAN Framework. It consists of two core elements: a U-Net module for removing image
degradation and a pretrained face GAN used as a reference for facial features. They communicate through a latent
code mapping process and several Feature Splitter Modulation (FSM) layers to improve overall performance. (Zoom in for best view.)

Meta-GAN Framework
In this section, we describe the main framework of Meta-GAN. Given a facial image x affected by some form of degradation, our aim in the facial restoration process is to produce an improved image ū that closely resembles the ground-truth image u in terms of both accuracy and authenticity. The entire framework is depicted in Fig. 1. It consists of a degradation removal module (a U-Net) and a pretrained face GAN (such as StyleGAN2) serving as a prior. These components are connected via a latent code mapping and several Feature Splitter Modulation (FSM) layers.

The degradation removal module is tasked with addressing complex degradation in the input image and extracting two types of features: latent features A_latent, used to map the input image to the closest latent code in StyleGAN2 [6], and multi-resolution spatial features A_spatial, used to modulate the StyleGAN2 [6] features. A_latent is subsequently mapped to intermediate latent codes P through linear layers. StyleGAN2 generates intermediate convolutional features A_GAN from these latent codes, capturing rich facial details encoded in the pretrained GAN weights.

Multi-resolution features A_spatial are utilized to spatially modulate the face GAN features A_GAN through FSM layers, operating in a coarse-to-fine manner to achieve realistic results while preserving high fidelity. During training, in addition to the global discriminative loss, we introduce a facial component loss using local discriminators to enhance perceptually significant facial components, such as the eyes and mouth. Additionally, identity-preserving guidance is employed to retain identity information.
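
For concreteness, the following minimal PyTorch-style sketch shows how these pieces compose. The module names (unet, mlp, stylegan, fsm_layers) and the assumption that both the U-Net and the GAN expose per-scale feature lists are illustrative, not the exact implementation:

import torch.nn as nn

class MetaGANRestorer(nn.Module):
    """Sketch of the restoration pipeline described above (assumed interfaces)."""

    def __init__(self, unet, mlp, stylegan, fsm_layers):
        super().__init__()
        self.unet = unet              # degradation removal module (U-Net)
        self.mlp = mlp                # maps A_latent to intermediate codes P
        self.stylegan = stylegan      # pretrained face GAN used as the prior
        self.fsm_layers = fsm_layers  # one FSM layer per resolution scale

    def forward(self, x):
        # Eq. (1): latent and multi-resolution spatial features.
        a_latent, a_spatial = self.unet(x)
        # Eqs. (2)-(3): intermediate latent codes and prior GAN features.
        p = self.mlp(a_latent)
        a_gan = self.stylegan(p)      # assumed to return one feature map per scale
        # FSM layers modulate the GAN features coarse-to-fine (Eqs. (4)-(7)).
        feats = [fsm(g, s) for fsm, g, s in zip(self.fsm_layers, a_gan, a_spatial)]
        # A final to-RGB layer (omitted here) turns the finest-scale features
        # into the restored face ū.
        return feats[-1]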

Degradation Removal Module


During the facial restoration process, images often exhibit intricate degradation caused by factors like low-resolution, blur,
noise, and JPEG artifacts. To address these challenges, the degradation removal module is designed to eliminate such degradation and extract refined features A_latent and A_spatial, reducing the workload of subsequent modules. We adopt the U-Net architecture [27] for our degradation removal module due to its ability to broaden the receptive field, facilitating effective blur removal and the generation of multi-resolution features.

The features used to map the input image and to modulate the StyleGAN2 features are produced as follows:

A_latent, A_spatial = U-Net(x).   (1)

The multi-resolution spatial features A_spatial are used to modulate the StyleGAN2 features, whereas the latent features A_latent are employed to map the input image to the closest latent code in StyleGAN2.

To provide intermediate supervision for degradation removal, we utilize an L1 restoration loss at each resolution scale during the initial training phase. More specifically, we generate an image at each resolution scale of the U-Net decoder and enforce proximity to the corresponding level of the ground-truth image pyramid.
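
A small sketch of this intermediate supervision is given below; the per-scale to-RGB convolutions and the bilinear resizing used to build the ground-truth pyramid are assumptions for illustration:

import torch.nn.functional as F

def pyramid_restoration_loss(decoder_feats, to_rgb_convs, gt):
    """Intermediate L1 supervision for the degradation removal U-Net (sketch).

    decoder_feats: list of U-Net decoder features, coarse to fine.
    to_rgb_convs:  assumed per-scale 1x1 convolutions producing images.
    gt:            ground-truth image tensor of shape (N, 3, H, W).
    """
    loss = 0.0
    for feat, to_rgb in zip(decoder_feats, to_rgb_convs):
        pred = to_rgb(feat)                                   # image at this scale
        gt_scaled = F.interpolate(gt, size=pred.shape[-2:],   # ground-truth pyramid
                                  mode='bilinear', align_corners=False)
        loss = loss + F.l1_loss(pred, gt_scaled)
    return loss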

Generative Facial Prior and Latent Code Mapping


A pretrained face GAN encompasses a wide spectrum of facial characteristics within its convolutional weights, referred to as the generative prior [7, 8]. Leveraging these pretrained face GANs allows us to access a multitude of diverse facial details for our task. Traditionally, exploiting generative priors entails mapping the input image to its nearest latent code Z and then generating the corresponding output using a pre-trained GAN [4, 28, 8, 7]. However, these conventional approaches often necessitate time-consuming iterative optimization to uphold precision. Instead of directly producing a final image, we opt to generate the intermediate convolutional features A_GAN of the closest face. This strategy enables us to retain a wealth of intricate details and affords us the opportunity to further refine these features using the input characteristics, ultimately enhancing precision.

P = MLP(A_latent),   (2)
A_GAN = StyleGAN(P).   (3)

In particular, given the encoded vector A_latent of the input image, our first step is to translate it into an intermediate code P to better preserve semantic properties. This intermediate space, derived from Z with the help of a series of multi-layer perceptrons [29], ensures a more effective representation. Subsequently, the latent codes P pass through each convolutional layer within the pretrained GAN, generating GAN features at every resolution scale.
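
The mapping in Eqs. (2) and (3) could be sketched as follows; the layer widths, the two-layer MLP, and the stylegan(...) interface returning per-scale features are assumptions:

import torch.nn as nn

class LatentMapper(nn.Module):
    """Sketch of Eqs. (2)-(3): A_latent -> intermediate codes P -> GAN features."""

    def __init__(self, in_dim=512, style_dim=512, n_styles=14):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, style_dim), nn.LeakyReLU(0.2),
            nn.Linear(style_dim, style_dim * n_styles))
        self.n_styles, self.style_dim = n_styles, style_dim

    def forward(self, a_latent, stylegan):
        # One intermediate code P per convolutional layer of the prior.
        p = self.mlp(a_latent).view(-1, self.n_styles, self.style_dim)
        # Assumed interface: the pretrained GAN returns its per-scale
        # convolutional features A_GAN when driven by the codes P.
        a_gan = stylegan(p)
        return p, a_gan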

Feature Splitter Modulation


To further preserve fidelity, we use the spatial information carried by the input features A_spatial from the U-Net's output (Eq. 1) to modulate the corresponding GAN features A_GAN obtained from (Eq. 2) and (Eq. 3). Preserving such spatial information is pivotal in face restoration, ensuring the faithful retention of local attributes and allowing adaptive enhancement of distinct facial regions. To this end, we employ Feature Splitter Modulation (FSM), which produces affine transformation parameters for spatial-wise feature modulation. Spatial modulation of this kind has proven effective for incorporating additional conditions in a range of image restoration and generation scenarios.

At each resolution scale, we derive a pair of affine transformation parameters (α, β) from the input features A_spatial via a few convolutional layers. Modulation is then carried out by scaling and shifting the GAN features A_GAN, as given by:

α, β = Conv(A_spatial),   (4)
A_output = FSM(A_GAN | α, β) = α ⊙ A_GAN + β.   (5)

To achieve a better balance of realness and fidelity, we further propose Feature Splitter Modulation (FSM) layers. These layers perform spatial modulation on one split of the GAN features, driven by the input features A_spatial (contributing to fidelity), while allowing the remaining GAN features (contributing to realness) to pass through unaltered, as illustrated in Figure 1:

A_output = FSM(A_GAN | α, β)   (6)
         = Concat[Identity(A_GAN^split_0), α ⊙ A_GAN^split_1 + β],   (7)


where Concat denotes the concatenation operation, and A_GAN^split_0 and A_GAN^split_1 are features split from A_GAN along the channel dimension.

Consequently, FSM combines the advantages of integrating prior information and efficiently modulating the input features, striking a good balance between texture realness and fidelity. FSM also reduces complexity, requiring fewer channels for modulation, akin to the approach employed in GhostNet [20]. We apply FSM layers at every resolution scale, culminating in the generation of the restored face ū.
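
A minimal sketch of an FSM layer, assuming a half-and-half channel split and 3x3 convolutions for predicting α and β, is given below:

import torch
import torch.nn as nn

class FeatureSplitterModulation(nn.Module):
    """Sketch of Eqs. (4)-(7): modulate one channel split, pass the other through."""

    def __init__(self, gan_channels, spatial_channels):
        super().__init__()
        self.split = gan_channels // 2            # channels left as identity
        mod_ch = gan_channels - self.split        # channels to be modulated
        def head():
            return nn.Sequential(
                nn.Conv2d(spatial_channels, mod_ch, 3, padding=1),
                nn.LeakyReLU(0.2),
                nn.Conv2d(mod_ch, mod_ch, 3, padding=1))
        self.to_alpha, self.to_beta = head(), head()

    def forward(self, a_gan, a_spatial):
        alpha = self.to_alpha(a_spatial)          # Eq. (4)
        beta = self.to_beta(a_spatial)
        identity, modulated = torch.split(
            a_gan, [self.split, a_gan.size(1) - self.split], dim=1)
        modulated = alpha * modulated + beta      # spatial-wise affine modulation
        return torch.cat([identity, modulated], dim=1)   # Eq. (7)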

Proposed Model
In training our Meta-GAN model, we pursue four primary learning objectives. Firstly, reconstruction loss is employed to
ensure that the generated outputs closely resemble the ground-truth images. This loss function acts as a guiding principle,
compelling the model to produce outputs that faithfully represent the original images. Secondly, adversarial loss is
introduced to facilitate the restoration of realistic textures in the generated images. Through adversarial training, the model
learns to generate outputs with convincing textures, enhancing the overall realism of the restored faces.

Additionally, our framework incorporates Facial Component Loss, a novel component aimed at further refining the
intricate details of facial features. By targeting specific facial components, this loss function allows the model to focus on
preserving and enhancing key facial attributes. Finally, identity preserving loss is integrated into the training process to
ensure that the identity of the restored faces remains consistent with the original input images. This loss function helps
mitigate any distortions or alterations to the facial identity during the restoration process, maintaining the overall integrity
of the facial features.

For the reconstruction loss, we utilize two commonly employed terms: an L1 loss and a perceptual loss. This reconstruction loss, denoted Ɫ_rec, is defined as follows:

Ɫ_rec = λ_l1 ||ū − u|| + λ_per ||ɸ(ū) − ɸ(u)||   (8)

Here, ɸ represents the pretrained VGG-19 network [29], and we use the feature maps {conv1, conv2, conv3, conv4, conv5} before activation. The weights for the L1 loss and the perceptual loss are denoted λ_l1 and λ_per, respectively.

With the adversarial loss, denoted Ɫ_adv, our objective is to guide Meta-GAN towards generating outputs that lie on the natural image manifold and exhibit realistic textures. Following the methodology used by StyleGAN2 [6], we employ the logistic loss:

Ɫ_adv = −λ_adv E_ū[softplus(D(ū))]   (9)

where λ_adv represents the adversarial loss weight and D denotes the discriminator.

To emphasize the facial components most critical to perception, we introduce a facial component loss, employing local discriminators targeting the left eye, right eye, and mouth regions. We first crop these regions using ROI align [30]. We then deploy individual local discriminators to assess the authenticity of the restored patches, ensuring their alignment with natural facial component distributions.
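
The cropping step can be sketched with torchvision's ROI align; the box format, patch size, and the helper name crop_facial_components are illustrative assumptions:

from torchvision.ops import roi_align

def crop_facial_components(restored, component_boxes, patch_size=80):
    """Crop left-eye / right-eye / mouth patches for the local discriminators (sketch).

    restored:        (N, 3, H, W) restored faces.
    component_boxes: float tensor of shape (K, 5) holding one
                     (batch_index, x1, y1, x2, y2) row per component,
                     assumed to be derived from the facial landmarks
                     provided with the training dataset.
    """
    patches = roi_align(restored, component_boxes,
                        output_size=(patch_size, patch_size), aligned=True)
    return patches   # (K, 3, patch_size, patch_size)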

Inspired by prior work [24], we further incorporate a feature style loss, departing from conventional feature matching approaches that are constrained by spatial alignment. Instead, our approach matches the Gram matrix statistics [31] of real and restored patches. Since the Gram matrix effectively captures feature correlations and texture information, we extract features from multiple layers of the learned local discriminators and align the Gram statistics of these intermediate representations for real and restored patches.

In our empirical assessments, we observe that the feature style loss surpasses prior feature matching techniques, yielding
more authentic facial details and mitigating undesirable artifacts.
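
A sketch of the feature style loss is shown below, assuming the learned local discriminator exposes a list of intermediate feature maps for both real and restored patches:

import torch.nn.functional as F

def gram_matrix(feat):
    """Channel-wise feature correlations (Gram statistics) of a feature map."""
    n, c, h, w = feat.shape
    f = feat.view(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def feature_style_loss(restored_feats, real_feats):
    """Match Gram statistics of discriminator features for restored vs. real patches."""
    loss = 0.0
    for fr, fg in zip(restored_feats, real_feats):
        loss = loss + F.l1_loss(gram_matrix(fr), gram_matrix(fg))
    return loss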

Incorporating insights from previous studies [32], we integrate identity preservation into our model via a specialized loss function. Operating similarly to the perceptual loss [22], this mechanism is based on the feature embedding of the faces. To accomplish this, we utilize the pre-trained ArcFace [34] model, which captures the features most essential for identity discrimination. By enforcing identity preservation, we ensure that the restored output closely aligns with the ground truth in the compact deep feature space:


Ɫ_id = λ_id ||η(ū) − η(u)||   (10)

where λ_id denotes the weight of the identity-preserving loss and η represents the face feature extractor, i.e. ArcFace [34] in our implementation.

The overall training objective is a combination of the above losses:

Ɫ_total = Ɫ_rec + Ɫ_adv + Ɫ_id   (11)

The loss hyper-parameters are set as follows: λ_l1 = 0.1, λ_per = 1, λ_adv = 0.1, and λ_id = 10.
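
The following sketch assembles Eqs. (8)-(11) with the stated weights; the vgg_feats, discriminator and arcface callables are assumed feature extractors (pretrained VGG-19 layers, the global discriminator and ArcFace), and the facial component loss computed by the local discriminators is omitted for brevity:

import torch.nn.functional as F

def total_loss(u_hat, u, vgg_feats, discriminator, arcface,
               lam_l1=0.1, lam_per=1.0, lam_adv=0.1, lam_id=10.0):
    """Sketch of the overall training objective of Eq. (11)."""
    # Eq. (8): L1 reconstruction plus perceptual loss on VGG-19 feature maps.
    l_rec = lam_l1 * F.l1_loss(u_hat, u)
    for f_hat, f_gt in zip(vgg_feats(u_hat), vgg_feats(u)):
        l_rec = l_rec + lam_per * F.l1_loss(f_hat, f_gt)
    # Eq. (9): logistic adversarial term as written (raising D(ū) drives the
    # generator towards outputs the discriminator scores as real).
    l_adv = -lam_adv * F.softplus(discriminator(u_hat)).mean()
    # Eq. (10): identity-preserving loss in the ArcFace embedding space.
    l_id = lam_id * F.l1_loss(arcface(u_hat), arcface(u))
    # Eq. (11): overall objective.
    return l_rec + l_adv + l_id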

DATASETS AND IMPLEMENTATIONS

Dataset for Training: Our Meta-GAN model undergoes training using the FFHQ dataset, which encompasses
approximately 70,000 high-quality images. To ensure consistency, all images are resized to dimensions of 512x512 during
the training process.

Synthetic Data Generation: Our Meta-GAN is trained on synthetic data that closely mimic real low-quality images, enabling it to generalize effectively to real-world scenarios during inference. Following established practice in [Cite 19th Research Paper], we employ a degradation model to synthesize the training data:

x = [(u * k_a) ↓_r + n_Δ]_JPEG_q   (12)

Here, the high-quality image u undergoes convolution with a Gaussian blur kernel k_a, followed by downsampling with a scale factor r. Subsequently, additive white Gaussian noise n_Δ is introduced, and the image is compressed using JPEG with a quality factor q. For each training pair, the parameters a, Δ, r, and q are randomly sampled from specified ranges. Additionally, color jittering is employed during training to enhance color rendition.
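
A possible implementation of this degradation pipeline using OpenCV is sketched below; the parameter ranges are illustrative assumptions rather than the exact ranges used for training:

import random
import numpy as np
import cv2

def degrade(hq, sigma_range=(0.2, 10.0), down_range=(1.0, 8.0),
            noise_range=(0.0, 20.0), jpeg_range=(60, 100)):
    """Eq. (12) as a sketch: blur -> downsample -> Gaussian noise -> JPEG.

    hq is assumed to be an 8-bit (H, W, 3) image array.
    """
    h, w = hq.shape[:2]
    sigma = random.uniform(*sigma_range)                    # blur kernel width
    img = cv2.GaussianBlur(hq, (0, 0), sigma)
    r = random.uniform(*down_range)                         # downsampling factor
    img = cv2.resize(img, (int(w / r), int(h / r)), interpolation=cv2.INTER_LINEAR)
    delta = random.uniform(*noise_range)                    # noise standard deviation
    img = np.clip(img + np.random.normal(0, delta, img.shape), 0, 255).astype(np.uint8)
    q = random.randint(*jpeg_range)                         # JPEG quality factor
    _, buf = cv2.imencode('.jpg', img, [int(cv2.IMWRITE_JPEG_QUALITY), q])
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)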

Testing Datasets: We compile a synthetic dataset and three distinct real datasets sourced from diverse origins. Notably,
these datasets are carefully curated to ensure no overlap with our training dataset. Here's a brief overview:

•  CelebChild-Test: This dataset comprises 180 images featuring child faces of celebrities sourced from the internet. Many of these images are of low quality, with a significant portion being old black-and-white photographs.
•  CelebA-Test: This synthetic dataset comprises 3,000 CelebA-HQ images sourced from its testing partition. The generation process mirrors that employed during training.
•  WebPhoto-Test: Extracted from internet sources, this dataset comprises 188 low-quality photos collected from real-life scenarios. We further isolate 407 faces to assemble the WebPhoto testing dataset, characterized by diverse and complex degradation patterns. Notably, some photos exhibit severe degradation, particularly in terms of detail and color rendition.
•  LFW-Test: Derived from the LFW dataset [35], this collection encompasses low-quality images obtained from real-world scenarios. Specifically, we aggregate the first image of each identity in the validation partition, resulting in 1,711 testing images.

Implementation: In our practical application, we adopt the pretrained StyleGAN2 [6] with 512² outputs as our generative facial prior. To maintain efficiency, we set the channel multiplier of StyleGAN2 to one. The degradation removal module uses a U-Net architecture comprising seven downsampling and seven upsampling layers, each with a residual block. For the Feature Splitter Modulation (FSM) layers, we use two convolutional layers to compute the affine parameters α and β, respectively.

During training, we use a mini-batch size of 12 and apply data augmentation such as horizontal flips and color jittering. To focus on significant facial components, namely the left eye, right eye, and mouth, we employ the facial component loss; these components are cropped using ROI align [30] and the facial landmarks provided with the original training dataset. Training is conducted with the Adam optimizer over 800k iterations, with the learning rate initialized at 2 × 10⁻³ and decayed by a factor of 2 at the 700k-th and 750k-th iterations. Our implementation uses the PyTorch framework and is trained on four NVIDIA DGX A100 GPUs. Moreover, our project has been published on the Hugging Face platform, a prominent open-source platform designed for machine learning applications. Hugging Face is highly regarded in the machine learning community for its vast array of pre-trained models and tools crafted for various machine learning tasks.
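
The optimizer and learning-rate schedule described above can be set up as in the following sketch; the Adam betas are an illustrative assumption:

from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

def build_optimizer(generator):
    """Adam with lr 2e-3, halved at the 700k-th and 750k-th of 800k iterations."""
    optimizer = Adam(generator.parameters(), lr=2e-3, betas=(0.9, 0.99))
    # Milestones are counted in training iterations, so scheduler.step()
    # is called once per iteration rather than once per epoch.
    scheduler = MultiStepLR(optimizer, milestones=[700_000, 750_000], gamma=0.5)
    return optimizer, scheduler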

RESULTS AND DISCUSSIONS


Figure 2: Illustrates the Input and Output of Low-quality images


Figure 3: Illustrates the output for noisy and degraded images

In Figure 3, the output of Meta-GAN is showcased through two examples: a severely degraded, noisy grayscale image and
a low-quality grayscale image. In both cases, Meta-GAN demonstrates remarkable restoration capabilities, transforming the
input images into high-quality, clear representations. The noise and distortion in the first image are effectively suppressed,
revealing intricate details and textures with newfound clarity.

Similarly, the second image undergoes significant enhancement, with fine details and grayscale tones rendered with precision and realism. The output of Meta-GAN underscores its effectiveness in restoring and enhancing low-quality
images, particularly in scenarios where noise and degradation hinder visual clarity. By leveraging deep learning techniques,
Meta-GAN achieves impressive results in image reconstruction, offering valuable applications in fields such as forensics,
biometrics, and medical imaging. However, further research is needed to explore the robustness and scalability of Meta-
GAN across diverse datasets and real-world scenarios.


Figure 4: Result on Dark Skinned Faces

LIMITATIONS AND FUTURE PROSPECTS


Limitations: Our method exhibits strong performance across most dark-skinned faces and diverse population groups,
leveraging both pretrained GAN and input image features for modulation. Additionally, reconstruction loss and identity-
preserving loss are employed to ensure fidelity retention in outputs. However, for grayscale input images, there may be a
bias in facial color representation due to insufficient color information, as exemplified in the last instance of Figure 4.

Hence, a diverse and balanced dataset is imperative for addressing such limitations.

Challenges: There were instances where severe degradation in real images led to distorted facial details and artifacts in the restored results produced by our method. Furthermore, our approach may yield unnatural outcomes for faces with very large poses. These discrepancies stem from the gap between the synthetic degradation and training data distributions and those encountered in real-world scenarios. Future work could involve learning these distributions from real data instead of relying solely on synthetic data.

Future Prospects: Looking ahead, our method holds promise for integration into mobile phone camera applications to
enhance image quality from low-quality cameras. Additionally, future endeavors aim to optimize hardware costs associated
with training the model.

CONCLUSION

Our study introduces the Meta-GAN framework, designed to tackle the intricate challenge of face restoration by harnessing
a rich and diverse generative facial prior. Through the innovative integration of this prior into the restoration process via
Feature Splitter Modulation layers, we have achieved a remarkable equilibrium between realism and fidelity. Our extensive
comparative analyses unequivocally demonstrate the unparalleled efficacy of Meta-GAN in the realm of face restoration
and color enhancement for real-world images, surpassing existing methodologies. This advancement holds promise for
addressing the complexities inherent in facial image restoration tasks, offering a robust solution that bridges the gap
between synthetic and real-world data distributions. By leveraging the power of generative facial priors, Meta-GAN opens
avenues for further research aimed at refining restoration techniques and enhancing the perceptual quality of facial imagery
in practical applications.

REFERENCES

[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio, "Generative Adversarial Nets," in NeurIPS, 2014.
[2] O. Kupyn, V. Budzan, M. Mykhailych, D. Mishkin and J. Matas, "DeblurGAN: Blind motion deblurring using conditional adversarial networks," in CVPR, 2018.
[3] Z. Shen, W.-S. Lai, T. Xu, J. Kautz and M.-H. Yang, "Deep semantic face deblurring," in CVPR, 2018.
[4] R. Abdal, Y. Qin and P. Wonka, "Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?,"
ICCV, 2019.
[5] T. Karras, S. Laine and T. Aila, "A style-based generator architecture for generative adversarial networks," in CVPR,
2018.
[6] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen and T. Aila, "Analyzing and improving the image quality of
stylegan," in CVPR, 2020.
[7] J. Gu, Y. Shen and B. Zhou, "Image Processing Using Multi-Code GAN Prior," in CVPR, 2020.
[8] X. Pan, X. Zhan, B. Dai, D. Lin, C. C. Loy and P. Luo, "Exploiting deep generative prior for versatile image restoration and manipulation," in ECCV, 2020.
[9] X. Li, W. Li, D. Ren, H. Zhang, M. Wang and W. Zuo, "Enhanced blind face restoration with multi-exemplar images
and adaptive spatial feature fusion," in CVPR, 2020.
[10] X. Li, M. Liu, Y. Ye, W. Zuo, L. Lin and R. Yang, "Learning warped guidance for blind face restoration," in ECCV,
2018.
[11] B. Dogan, S. Gu and R. Timofte, "Exemplar guided face image super-resolution without facial landmarks," in
CVPRW, 2019.
[12] Q. Cao, L. Lin, Y. Shi, X. Liang and G. Li, "Attention-Aware Face Hallucination via Deep Reinforcement Learning,"
in CVPR, 2017.
[13] H. Huang, R. He, Z. Sun and T. Tan, "Wavelet-srnet: A wavelet-based cnn for multi-scale face super resolution," in
ICCV, 2017.
[14] X. Xu, D. Sun, J. Pan, Y. Zhang, H. Pfister and M.-H. Yang, "Learning to superresolve blurry face and text images," in
ICCV, 2017.
[15] X. Yu, B. Fernando, R. Hartley and F. Porikli, "Super-resolving very low-resolution face images with supplementary attributes," in CVPR, 2018.


[16] Y. Chen, Y. Tai, X. Liu, C. Shen and J. Yang, "Fsrnet: End-to-end learning face super-resolution with facial priors," in
CVPR, 2018.
[17] S. Zhu, S. Liu, C. C. Loy and X. Tang, "Deep cascaded bi-network for face hallucination," in ECCV, 2016.
[18] C. Chen, X. Li, L. Yang, X. Lin, L. Zhang and K.-Y. K. Wong, "Progressive semantic-aware style transformation for blind face restoration," in arXiv:2009.08709, 2020.
[19] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto and H. Adam, "Mobilenets:
Efficient convolutional neural networks for mobile vision applications," in arXiv, 2017.
[20] K. Han, Y. Wang, Q. Tian, J. Guo, C. Xu and C. Xu, "GhostNet: More Features from Cheap Operations," in CVPR,
2020.
[21] Y. Chen, J. Li, H. Xiao, X. Jin, S. Yan and J. Feng, "Dual path networks," in NeurIPS, 2017.
[22] S. Iizuka, E. Simo-Serra and H. Ishikawa, "Globally and locally consistent image completion," ACM Transactions on
Graphics (ToG), vol. 36, no. 4, pp. 1-14, 2017.
[23] Y. Li, S. Liu, J. Yang and M.-H. Yang, "Generative face completion," in CVPR, 2017.
[24] T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz and B. Catanzaro, "High-resolution image synthesis and semantic manipulation with conditional GANs," in CVPR, 2018.
[25] T. Li, R. Qian, C. Dong, S. Liu, Q. Yan, W. Zhu and L. Lin, "Beautygan: Instance-level facial makeup transfer with
deep generative adversarial network," in ACM MM, 2018.
[26] Q. Gu, G. Wang, M. T. Chiu, Y.-W. Tai and C.-K. Tang, "LADN: Local adversarial disentangling network for facial makeup and de-makeup," in ICCV, 2019.
[27] O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015.
[28] J. Zhu, Y. Shen, D. Zhao and B. Zhou, "Indomain gan inversion for real image editing," in ECCV, 2020.
[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in ICLR,
2015.
[30] K. He, G. Gkioxari, P. Dollár and R. Girshick, "Mask R-CNN," in ICCV, 2017.
[31] L. A. Gatys, A. S. Ecker and M. Bethge, "Image style transfer using convolutional neural networks," in CVPR, 2016.
[32] R. Huang, S. Zhang, T. Li and R. He, "Beyond face rotation: Global and local perception gan for photorealistic and
identity preserving frontal view synthesis," in CVPR, 2018.
[33] J. Deng, J. Guo, N. Xue and S. Zafeiriou, "ArcFace: Additive angular margin loss for deep face recognition," in CVPR, 2019.
[34] J. Deng, J. Guo, N. Xue and S. Zafeiriou, "ArcFace: Additive angular margin loss for deep face recognition," in CVPR, 2019.
[35] G. B. Huang, M. Ramesh, T. Berg and E. Learned-Miller, "Labeled faces in the wild: A database for studying face
recognition in unconstrained environments," University of Massachusetts, Amherst, 2007.

