Group 85 Survey Paper (1) - 1
Abstract: In recent months, free deep learning-based software tools have facilitated the creation of credible face exchanges in videos that leave few traces of manipulation, commonly known as "DeepFake" (DF) videos. Manipulations of digital videos have been demonstrated for several decades through the use of visual effects, but recent advances in deep learning have drastically increased the realism of fake content and made its creation more accessible. These so-called AI-synthesized media (popularly referred to as DF) can be generated with relative ease using AI tools. However, detecting these DF videos presents a significant challenge, as training algorithms to spot such fakes is complex and requires substantial computational resources.

To address this challenge, we propose a system that leverages both Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) for DF detection. The system uses a CNN to extract frame-level features and an RNN to learn temporal inconsistencies between frames caused by manipulation tools, helping to identify whether a video has been altered. By training this architecture on a large dataset of fake videos, we demonstrate that our approach can achieve competitive results, offering a promising solution for combating the growing threat of DF videos. Furthermore, the use of this combined architecture allows for efficient detection without compromising accuracy.

Keywords: Deepfake video detection, convolutional neural network (CNN), recurrent neural network (RNN)

I. INTRODUCTION
The increasing sophistication of smartphone cameras, coupled with the widespread availability of high-speed internet across the globe, has significantly expanded the reach of social media platforms and media-sharing portals. This technological advancement has made the creation, sharing, and transmission of digital videos easier and faster than ever before. The surge in computational power has further bolstered the capabilities of deep learning technologies, enabling advancements that would have been unimaginable just a few years ago. Deep learning, now more accessible and powerful, has opened new avenues for innovation but also introduced significant challenges. One of the most prominent of these is the rise of "DeepFake" (DF) technology.

DeepFakes are produced using deep generative adversarial networks (GANs), which can manipulate both video and audio content to create highly realistic and convincing fake media. What was once the domain of highly skilled visual effects professionals can now be done relatively easily using publicly available tools. The accessibility of DF creation has led to their proliferation across social media platforms, where they are often used maliciously to spread false or misleading information. This has led to a rise in online spamming, disinformation campaigns, and the manipulation of public opinion.

The danger posed by DeepFakes goes beyond simple pranks or entertainment. In the wrong hands, they can be used to create damaging content that tarnishes reputations, spreads propaganda, or incites panic among the general public. For instance, DeepFakes have been used to impersonate public figures in ways that can influence elections, spread fake news, or even lead to financial fraud. These videos are so convincing that they can be difficult to distinguish from genuine content, leading to a growing threat of misinformation that is difficult to counter.

As these technologies continue to evolve, the challenge of detecting and mitigating the harmful effects of DeepFakes becomes more urgent. The manipulation of audio and video content not only undermines trust in online media but also poses serious ethical, social, and political risks. This makes it critical for researchers, technologists, and policymakers to work together to develop tools and strategies to combat the misuse of DeepFake technology, safeguard the public, and maintain the integrity of digital media. DF detection is therefore essential. We describe a new deep learning-based method that can effectively distinguish AI-generated fake videos (DF videos) from real videos, so that DF content can be identified and prevented from spreading over the internet.

To detect DF videos, it is important to understand how a Generative Adversarial Network (GAN) creates them. The GAN takes as input a video and an image of a specific individual ('target'), and outputs another video with the target's faces replaced with those of another individual ('source'). The backbone of DF is a set of deep adversarial neural networks trained on face images and target videos to automatically map the faces and facial expressions of the source to the target. With proper post-processing, the resulting videos can achieve a high level of realism. The GAN splits the video into frames and replaces the input image in every frame. It then reconstructs the video. This process is usually achieved by using autoencoders.

Our method is based on the same process that the GAN uses to create the DF. It exploits a property of DF videos: due to limited computational resources and production time, the DF algorithm can only synthesize face images of a fixed size, and these must undergo an affine warping to match the configuration of the source's face. This warping leaves distinguishable artifacts in the output deepfake video because of the resolution inconsistency between the warped face area and the surrounding context. Our method detects such artifacts by comparing the generated face areas with their surrounding regions: it splits the video into frames, extracts features with a ResNeXt Convolutional Neural Network (CNN), and uses a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) to capture the temporal inconsistencies between frames introduced by the GAN during the reconstruction of the DF. To train the ResNeXt CNN model, we simplify the process by simulating the resolution inconsistency in affine face warpings directly.
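As an illustration only, the resolution inconsistency described above can be mimicked in a few lines. The sketch below is not the training procedure used in this work: it replaces the affine warp with a much simpler stand-in (block-wise downsampling and nearest-neighbour upsampling of the face box), and it represents a grayscale frame as a plain list of rows so that it has no dependencies. The function name, the `face_box` layout `(x0, y0, x1, y1)`, and the `scale` parameter are assumptions made for this example.

```python
def simulate_resolution_inconsistency(frame, face_box, scale=4):
    """Return a copy of `frame` whose face box has a lower effective resolution.

    frame    : list of rows, each row a list of pixel values (grayscale)
    face_box : (x0, y0, x1, y1), hypothetical layout with exclusive upper bounds
    scale    : size of the blocks that are flattened to a single value
    """
    x0, y0, x1, y1 = face_box
    out = [row[:] for row in frame]  # copy so the input frame is left untouched
    for y in range(y0, y1):
        # Every pixel takes the value of the top-left pixel of its scale x scale block,
        # which is what downsampling followed by nearest-neighbour upsampling produces.
        src_y = y0 + ((y - y0) // scale) * scale
        for x in range(x0, x1):
            src_x = x0 + ((x - x0) // scale) * scale
            out[y][x] = frame[src_y][src_x]
    return out
```

The surrounding context keeps its original detail while the face box becomes blocky, which is the kind of mismatch a detector could be trained to pick up. A realistic pipeline would more likely use an image library with a proper affine warp and smoothing.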
II. LITERATURE SURVEY
The explosive growth of deepfake video and its illegal use is a major threat to democracy, justice, and public trust. As a result, there is an increased demand for fake video analysis, detection, and intervention. Some of the related work in deepfake detection is listed below:

"Exposing DF Videos by Detecting Face Warping Artifacts" detects artifacts by comparing the generated face areas and their surrounding regions with a dedicated Convolutional Neural Network model. In this work, the face artifacts were two-fold. Their method is based on the observation that current DF algorithms can only generate images of limited resolution, which then need to be further transformed to match the faces to be replaced in the source video.

"Exposing AI Created Fake Videos by Detecting Eye Blinking" describes a new method to expose fake face videos generated with deep neural network models. The method is based on the detection of eye blinking in the videos, a physiological signal that is not well presented in synthesized fake videos. The method is evaluated on benchmark eye-blinking detection datasets and shows promising performance in detecting videos generated with the deep neural network based software DF. Their method uses only the lack of blinking as a clue for detection. However, other parameters must also be considered when detecting deepfakes, such as teeth enhancement and wrinkles on faces. Our method is proposed to consider all these parameters.

"Using Capsule Networks to Detect Forged Images and Videos" uses a capsule network to detect forged, manipulated images and videos in different scenarios, such as replay attack detection and computer-generated video detection. In their method, they used random noise in the training phase, which is not a good option. The model still performed well on their dataset but may fail on real-time data because of the noise in training. Our method is proposed to be trained on noiseless and real-time datasets.

"Detection of Synthetic Portrait Videos using Biological Signals" extracts biological signals from facial regions on authentic and fake portrait video pairs. It applies transformations to compute the spatial coherence and temporal consistency, captures the signal characteristics in feature sets and PPG maps, and trains a probabilistic SVM and a CNN. It then aggregates the authenticity probabilities to decide whether the video is fake or authentic. FakeCatcher detects fake content with high accuracy, independent of the generator, content, resolution, and quality of the video. However, because the lack of a discriminator leads to a loss in preserving biological signals, formulating a differentiable loss function that follows the proposed signal processing steps is not a straightforward process.

III. PROPOSED SYSTEM
Our project focuses on developing a web-based platform for detecting DeepFake (DF) videos using deep learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). The platform will allow users to upload videos, which will be analyzed and classified as either real or fake based on temporal inconsistencies and manipulations introduced by DF creation tools. This platform could also evolve into a browser plugin or be integrated into popular applications like WhatsApp, Facebook, or Instagram, enabling real-time detection of fake media before content is shared with others.

The system is designed to detect various types of DeepFakes, including face-swapping (replacement DF), partial manipulations like altered facial expressions (retrenchment DF), and identity blending (interpersonal DF). Our approach aims to enhance digital content verification by offering a solution that is secure, easy to use, and highly accurate. The project will focus on performance, ensuring it meets standards of reliability, accuracy, and user-friendliness while handling large datasets of fake and real videos.

By providing this tool, we aim to significantly reduce the spread of manipulated content across social media platforms and improve the trustworthiness of digital media. This system has the potential to become an essential part of the global effort to combat misinformation and the growing misuse of DeepFake technology. Fig. 1 shows the basic architecture of the proposed system.

Fig. 1: System Architecture
A. Dataset:
We use a mixed dataset consisting of an equal number of videos from different sources: YouTube, FaceForensics++ [14], and the Deepfake Detection Challenge dataset [13]. Our newly prepared dataset contains 50% original videos and 50% manipulated deepfake videos. The dataset is split into a 70% training set and a 30% test set.
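A 70/30 split of a balanced dataset can be sketched as follows. This is a minimal illustration, not the project's data loader. It assumes that the split is stratified, so that the 50/50 ratio of real to fake videos holds in both subsets; the text above does not state this. The function name, the `(path, label)` tuples, the label convention (0 = real, 1 = fake), and the file names are all hypothetical.

```python
import random


def stratified_split(videos, train_frac=0.7, seed=0):
    """Split (path, label) pairs into train and test sets, class by class.

    Assumes label 0 = real and 1 = fake (hypothetical convention).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in (0, 1):
        group = [v for v in videos if v[1] == label]
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    # Shuffle again so the classes are mixed within each subset.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Hypothetical file names, 50 real and 50 fake videos.
videos = [(f"real_{i}.mp4", 0) for i in range(50)]
videos += [(f"fake_{i}.mp4", 1) for i in range(50)]
train_set, test_set = stratified_split(videos)
```

Splitting per class rather than over the whole list keeps the class balance identical in the training and test sets, which makes accuracy on the test set easier to interpret.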
B. Preprocessing:
Dataset preprocessing begins by splitting each video into frames, followed by face detection and cropping each frame to the detected face. To keep the number of frames uniform, the mean frame count of the dataset videos is calculated, and a new face-cropped dataset is created in which each video contains a number of frames equal to that mean. Frames that do not contain a face are ignored during preprocessing. Processing a 10-second video at 30 frames per second, i.e. 300 frames in total, requires a lot of computational power, so for experimental purposes we propose to use only the first 100 frames for training the model.
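The frame selection logic of this step can be sketched as below. The sketch is illustrative and makes several assumptions: frames are plain lists of pixel rows, `detect_face` is a hypothetical callable that returns a box `(x0, y0, x1, y1)` or `None`, and the 100-frame limit counts frames that contain a face (the text could also be read as taking the first 100 frames of the video and then discarding those without a face).

```python
def select_face_frames(frames, detect_face, max_frames=100):
    """Return up to `max_frames` face crops, skipping frames with no detected face.

    frames      : iterable of frames, each a list of pixel rows
    detect_face : hypothetical detector, returns (x0, y0, x1, y1) or None
    """
    crops = []
    for frame in frames:
        box = detect_face(frame)
        if box is None:
            continue  # frames without a face are ignored
        x0, y0, x1, y1 = box
        # Crop the frame to the detected face region.
        crops.append([row[x0:x1] for row in frame[y0:y1]])
        if len(crops) == max_frames:
            break  # stop early to limit computation
    return crops
```

Stopping as soon as the limit is reached avoids running the detector on the remaining frames of a 300-frame clip.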
C. Model:
The model consists of resnext50_32x4d followed by one LSTM layer. The data loader loads the preprocessed face-cropped videos and splits them into training and test sets. The frames from the processed videos are then passed to the model in mini-batches for training and testing.
V. CONCLUSION
We presented a neural network-based approach designed to classify videos as either DeepFake or real, while also providing a confidence score for each prediction of the proposed model. This method draws inspiration from the process of DeepFake creation, which typically employs Generative Adversarial Networks (GANs) in conjunction with autoencoders. By understanding the mechanics behind how DeepFakes are generated, we have developed a model that aims to reverse-engineer this process for detection purposes.