
Visvesvaraya Technological University

Jnana Sangama, Belagavi - 590018

A Project Work Phase-I (BAD685)

Report On
“DEEPFAKE DETECTION SYSTEM”

Submitted in partial fulfilment of the requirement for the award of the degree of

BACHELOR OF ENGINEERING
IN
ARTIFICIAL INTELLIGENCE & DATA SCIENCE
Submitted by
Jayashree N 1KG22AD024
Manoj B 1KG22AD036
Niveditha P 1KG22AD041
Tejas J 1KG22AD060

Under the guidance of:


Mrs. Tanushree Mohapatra
Assistant Professor
Department of AI&DS

DEPARTMENT OF ARTIFICIAL INTELLIGENCE & DATA SCIENCE


K. S. School of Engineering and Management
#15, Mallasandra, off Kanakapura Road, Bengaluru – 560109
2024 - 2025
K. S. School of Engineering and Management
#15, Mallasandra, off Kanakapura Road, Bengaluru - 560109

Department of Artificial Intelligence & Data Science

CERTIFICATE

Certified that the Project Work Phase I (BAD685) entitled “DEEPFAKE DETECTION
SYSTEM” is a bonafide work carried out by:

Jayashree N 1KG22AD024
Manoj B 1KG22AD036
Niveditha P 1KG22AD041
Tejas J 1KG22AD060

in partial fulfilment for VI semester B.E., Project Work in the branch of Artificial Intelligence
& Data Science prescribed by Visvesvaraya Technological University, Belagavi during the
academic year 2024-2025. It is certified that all the corrections and suggestions indicated for
internal assessment have been incorporated. The Project Work Phase I (BAD685) report has
been approved as it satisfies the academic requirements in respect of project work prescribed
for the Bachelor of Engineering degree.

…………………………….. …………………………….. ………………………………


Signature of the Guide Signature of the HOD Signature of the Principal
Mrs. Tanushree Mohapatra Dr. Manjunath T K Dr. K Rama Narasimha
Assistant Professor Professor & Head Director & Principal
AI&DS, KSSEM AI&DS, KSSEM KSSEM

DECLARATION

We, the undersigned students of 6th semester, Artificial Intelligence & Data Science, KSSEM,
declare that our Project Work entitled “Deepfake Detection System” is a bonafide work of ours.
Our project is neither a copy nor in any way a modification of any other engineering project.

We also declare that this project has not been submitted to any other university in the past
and will not be submitted by us to any other university in the future.

Place: Bengaluru
Date:

Name and USN Signature

Jayashree N (1KG22AD024) ………………….


Manoj B (1KG22AD036) ………………….
Niveditha P (1KG22AD041) ………………….
Tejas J (1KG22AD060) ………………….

ACKNOWLEDGEMENT

The satisfaction and euphoria that accompany the successful completion of any task would be
incomplete without mentioning the individuals to whom we are greatly indebted, whose guidance
and facilities have served as a beacon of light and crowned our efforts with success.

We would like to express our gratitude to our MANAGEMENT, K.S. School of Engineering and
Management, Bengaluru, for providing excellent infrastructure and for all the kindness extended
to us in carrying out this project work in college.

We would like to express our gratitude to Dr. K.V.A Balaji, CEO, K.S. School of Engineering
and Management, Bengaluru, for his valuable guidance.

We would like to express our gratitude to Dr. K. Rama Narasimha, Principal, K.S. School of
Engineering and Management, Bengaluru, for his valuable guidance.

We would like to extend our gratitude to Dr. Manjunath T K, Professor and Head, Department of
Artificial Intelligence & Data Science, for providing excellent facilities and all the support
extended to us in carrying out this project work successfully.

We would also like to thank our Project Coordinator, Mrs. K. Padma Priya, Assistant Professor,
Artificial Intelligence & Data Science, for her help and support in carrying out the Project
Work successfully.

Also, we are thankful to Mrs. Tanushree Mohapatra, Assistant Professor, Department of
Artificial Intelligence & Data Science, for being our Project Guide, under whose guidance this
project work has been carried out successfully.

We are also thankful to the teaching and non-teaching staff of Artificial Intelligence & Data
Science, KSSEM for helping us in completing the Project Work.

Jayashree N (1KG22AD024)
Manoj B (1KG22AD036)
Niveditha P (1KG22AD041)
Tejas J (1KG22AD060)

ABSTRACT

Deepfake technology leverages advanced deep learning techniques to generate highly realistic
fake videos, images, and audio that are often indistinguishable from authentic content. While these
synthetic media offer creative applications in fields like entertainment, education, and
accessibility, they also present significant threats including misinformation, identity theft, and
digital fraud. As deepfakes continue to evolve in quality and accessibility, the need for effective
and reliable detection systems becomes increasingly urgent.

This project presents a detailed literature survey and comparative analysis of recent research
papers focusing on machine learning and deep learning-based deepfake detection techniques. The
selected studies explore a range of models, including Support Vector Machines (SVM), Decision
Trees, Convolutional Neural Networks (CNNs), EfficientNet, InceptionNet, and hybrid
architectures. Each method is analyzed in terms of its dataset usage, model architecture, accuracy,
limitations, and real-world applicability.

The goal of this phase is to understand the landscape of current detection methods, evaluate their
performance across different datasets, and identify existing research gaps. Based on the findings,
the project aims to implement or propose a robust, scalable, and generalizable deepfake detection
framework that can adapt to evolving manipulation techniques and support real-time media
verification.

TABLE OF CONTENTS

CHAPTER NO. CONTENT PAGE NO.

CERTIFICATE I

DECLARATION II

ACKNOWLEDGEMENT III

ABSTRACT IV

TABLE OF CONTENTS V

Chapter 1. INTRODUCTION 1
Chapter 2. LITERATURE SURVEY 2
2.1 Deepfake Detection using Inception-ResNetV2 2
2.2 DeepFake Videos Detection and Classification Using ResNeXt 3
and LSTM Neural Network
2.3 Facial Recognition for Deepfake Detection 4
2.4 A Comprehensive Overview of Deepfake: Generation, 4
Detection, Datasets, and Opportunities
2.5 Improving Deepfake Detection by Mixing Top Solutions of the 5
DFDC
2.6 Real-Time Face Transition Using Deepfake Technology (GAN 6
Model)
2.7 Wave-Spectrogram Cross-Modal Aggregation for Audio 7
Deepfake Detection
2.8 Spatial Vision Transformer: A Novel Approach to Deepfake 8
Video Detection
2.9 Robust and Generalized DeepFake Detection 9

2.10 DeepFake Videos Detection and Classification Using ResNeXt 10


and LSTM Neural Network
2.11 ABC-CapsNet: Attention-based Cascaded Capsule Network for 11
Audio Deepfake Detection
2.12 Audio Deepfake Detection Using Deep Learning 11

2.13 AI-Based Deepfake Detection. 12
2.14 Motion Magnified 3D Residual-in-Dense Network for 12
DeepFake Detection.
2.15 Domain Generalization via Aggregation and Separation for 13
Audio Deepfake Detection.
2.16 Comparative Analysis on Different DeepFake Detection 13
Methods and Semi Supervised GAN Architecture for
DeepFake Detection.
2.17 Deepfake Detection: A Systematic Literature Review. 14
2.18 Deepfake Generation and Detection – An Exploratory Study. 15
2.19 An Effective Approach for Detecting Deepfake Videos Using 15
Long Short-Term Memory and ResNet
2.20 Artificial Intelligence into Multimedia Deepfakes Creation 16
and Detection.
2.21 DeepFake Video Detection Using Machine Learning and 17
Deep Learning Techniques.
2.22 Enhancing Deepfake Video Detection Performance with a 18
Hybrid CNN Deep Learning Model.
2.23 Comparative Analysis of Deepfake Video Detection Using 19
InceptionNet and EfficientNet.
2.24 Comparison of Different Machine Learning Algorithms for 20
Deep Fake Detection.
2.25 Analysis and Comparison of Deepfakes Detection Methods 22
for Cross-Library Generalisation
2.26 A Comprehensive Review on Fake Images/Videos Detection 23
Techniques.
2.27 Contemporary Cybersecurity Challenges in Metaverse Using 24
Artificial Intelligence.
2.28 Deepfake Disasters: A Comprehensive Review of 25
Technology, Ethical Concerns, Countermeasures, and
Societal Implications.
2.29 DeepFake Video Detection. 26
2.30 Unmasking the Illusions: A Comprehensive Study on 27
Deepfake Videos and Images.

2.31 A Heterogeneous Feature Ensemble Learning based Deepfake 29
Detection Method
2.32 Deepfake Detection in Videos and Picture: Analysis of Deep 30
Learning Models and Dataset
2.33 Deepfake Detection Using Deep Learning 30
2.34 Detecting Deepfakes: Training Adversarial Detectors with 31
GANs for Image Authentication
2.35 Detection of Deepfakes: Protecting Images and Videos 32
Against Deepfake
2.36 Div-Df: A Diverse Manipulation Deepfake Video Dataset 32
2.37 Model Attribution of Face-Swap Deepfake Videos 33
2.38 Review: DeepFake Detection Techniques using Deep Neural 34
Networks (DNN)
2.39 Deepfake Generation and Detection: A Survey 34
2.40 Deep Fake in Picture Using Convolutional Neural Network 35
Chapter 3 PROBLEM STATEMENT IDENTIFICATION 36
3.1 Problem statement
3.2 Project Scope
Chapter 4 GOALS AND OBJECTIVES 37
4.1 Project Goals
4.2 Project Objectives
Chapter 5 SYSTEM REQUIREMENT SPECIFICATION 38
5.1 Software Requirements
5.2 Hardware Requirements
Chapter 6 METHODOLOGY 39
6.1 Literature Survey
6.2 Dataset Collection and Preprocessing
6.3 Feature Extraction
6.4 Model Training and Evaluation 40
6.5 Result Analysis 40
6.6 Documentation and Reporting 40
Chapter 7 APPLICATIONS 41
REFERENCES 42


CHAPTER 1

INTRODUCTION
Deepfake technology, a portmanteau of "deep learning" and "fake," refers to the use of
advanced artificial intelligence techniques to create highly realistic synthetic videos, images,
or audio that can convincingly mimic real people. It utilizes deep learning models like
Generative Adversarial Networks (GANs) to manipulate or generate visual and auditory
content. While deepfakes offer creative applications in fields like filmmaking, education, and
accessibility, they also pose serious threats to security, privacy, and public trust.

With the emergence of easy-to-use tools such as FakeApp, DeepFaceLab, and FaceSwap, even
individuals without technical expertise can generate convincing fake media. These deepfakes
have been weaponized in various contexts, including fake political speeches, doctored celebrity
content, identity theft, and non-consensual pornography. Their widespread misuse has made
deepfake detection a critical issue in cybersecurity and digital forensics.

Early detection techniques relied on handcrafted features and traditional machine learning
algorithms like Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Decision
Trees. However, as deepfakes became more sophisticated, deep learning models— particularly
CNNs, InceptionNet, EfficientNet, and hybrid architectures—have shown greater success.
Despite this, a major limitation persists: many models excel on their training datasets but fail
to generalize to unseen, real-world data.

In this project, we conduct a comprehensive review of recent research papers, each exploring
different detection techniques, including ML classifiers, deep CNNs, and hybrid models
incorporating both spatial and temporal cues. We compare their performance on datasets like
FaceForensics++, DFDC, and CelebDF to evaluate accuracy, generalization, and feasibility.
This review aims to highlight current advancements, uncover research gaps, and guide future
development of scalable and reliable deepfake detection systems.


CHAPTER 2
LITERATURE SURVEY

MEMBER 1: Jayashree N – Literature Survey

2.1. The paper “Deepfake Detection using Inception-ResNetV2” (IEEE ICACFCT 2021)
provides a detailed study on frame-based fake video detection using deep learning. Key
highlights include (an illustrative code sketch follows this summary):
• Problem Statement: With the rapid evolution of AI-generated facial manipulations,
detecting deepfakes has become increasingly challenging, as they closely resemble real
facial expressions and movements.
• Proposed System: The paper proposes a CNN-based classifier using Inception- ResNetV2
to extract features from facial frames in videos. It focuses on binary classification of frames
as real or fake using deep hierarchical features.
• Implementation Details:
o Video frames are extracted and resized to 299×299.
o Face alignment is performed before feeding the data into the network.
o The model is trained and tested on a deepfake dataset.
o The network uses multiple convolutional and residual layers for feature extraction.
• Technologies Used:
o Deep Learning Model: Inception-ResNetV2
o Tools/Libraries: TensorFlow, Keras, OpenCV
o Dataset: Not explicitly mentioned but aligned with standard deepfake benchmarks.
• Accuracy: The model achieves high accuracy (over 90%) in identifying deepfake frames
using visual cues.
• Advantages:
o Strong feature extraction capability.
o Performs well on high-quality face-level manipulations.
o Scalable and can be integrated with real-time systems
• Conclusion: The study concludes that CNNs like Inception-ResNetV2 are powerful in
detecting facial forgeries in videos. However, further research is required to improve
detection under occlusion and low-light conditions.
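To make the frame-level pipeline above concrete, the following minimal Python/Keras sketch builds an Inception-ResNetV2 binary classifier for 299×299 face crops. The paper does not publish code, so the directory layout, preprocessing, and training settings here are illustrative assumptions rather than the authors' exact setup.

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionResNetV2

# ImageNet-pretrained backbone; frames are face crops resized to 299x299.
backbone = InceptionResNetV2(include_top=False, weights="imagenet",
                             input_shape=(299, 299, 3), pooling="avg")

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1.0),   # map [0, 255] pixels to [-1, 1]
    backbone,
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),        # real (0) vs fake (1)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hypothetical directory layout: frames/train/{real,fake}/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "frames/train", image_size=(299, 299), batch_size=32, label_mode="binary")
model.fit(train_ds, epochs=5)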


2.2. The paper “DeepFake Videos Detection and Classification Using ResNeXt and LSTM
Neural Network” by Suman Patel, Saroj Kumar Chandra, and Amit Jain provides a detailed
study on deepfake video classification using spatial-temporal neural networks. Key highlights
include (an illustrative sketch follows this summary):
• Problem Statement: Deepfake videos pose severe threats to digital authenticity by
creating realistic face-swapped videos, which are hard to detect with the human eye or
traditional algorithms.
• Proposed System: The model integrates ResNeXt (a CNN) for extracting frame-level
features and LSTM (an RNN) for capturing temporal dependencies to detect deepfake
sequences effectively.
• Implementation Details:
o Extract video frames and apply preprocessing.
o ResNeXt extracts features from each frame.
o LSTM analyzes time-based dependencies.
o Network is trained using a binary classification objective.
• Technologies Used:
o CNN + RNN: ResNeXt and LSTM
o Tools: Python, TensorFlow/Keras
o Datasets: FaceForensics++, Celeb-DF, DFDC
• Accuracy: Performance improves with more epochs. The paper highlights a significant
drop in training loss and a consistent increase in model accuracy, although exact figures are
not given.
• Advantages:
o Effective at capturing spatial and temporal artifacts.
o Can detect face-swapping and reconstruction-based deepfakes.
o Robust with sequential data.
• Conclusion: Combining ResNeXt and LSTM provides improved deepfake video
classification. Future work should explore attention mechanisms and transformer-based
models for enhanced performance.
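The spatial-temporal pattern described above (frame-level ResNeXt features fed to an LSTM) can be sketched in PyTorch as follows. The hidden size, clip length, and classifier head are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from torchvision import models

class ResNeXtLSTM(nn.Module):
    """Frame-level ResNeXt features, an LSTM over time, then a real/fake logit."""
    def __init__(self, hidden=256):
        super().__init__()
        # Pass weights="IMAGENET1K_V1" to start from pretrained ImageNet weights.
        backbone = models.resnext50_32x4d(weights=None)
        backbone.fc = nn.Identity()               # keep the 2048-d pooled features
        self.backbone = backbone
        self.lstm = nn.LSTM(2048, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, clips):                     # clips: (batch, frames, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1))        # (batch*frames, 2048)
        feats = feats.view(b, t, -1)
        _, (h, _) = self.lstm(feats)                       # last hidden state per clip
        return self.head(h[-1]).squeeze(-1)                # one logit per clip

model = ResNeXtLSTM()
dummy = torch.randn(2, 8, 3, 224, 224)            # two clips of eight frames each
print(model(dummy).shape)                          # torch.Size([2])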


2.3. The paper “Facial Recognition for Deepfake Detection” by Firaol Desta and Emily J
Brown provides a detailed study on using face embeddings for detecting altered or
impersonated identities. Key highlights include (an illustrative sketch follows this summary):
• Problem Statement: Deepfakes enable malicious impersonation on social media, which
can lead to privacy violations, fraud, and misinformation.
• Proposed System: The system uses the Python face_recognition library to compare
known and unknown face images to detect tampered or fake identities.
• Implementation Details:
o Create a folder of known celebrity faces.
o Prepare a separate folder of altered/unknown images.
o Compare faces using facial embeddings.
o Count matches to determine authenticity.
• Technologies Used:
o Tool: Python face_recognition library.
o Libraries: Dlib, OpenCV.
o Dataset: Custom image dataset of known/unknown faces.
• Accuracy: Effective in detecting obvious modifications or disguises, but limited against
high-quality deepfakes.
• Advantages:
o Easy to implement.
o Efficient for image-level manipulation detection.
o Ideal for small-scale verification tasks.

• Conclusion: While basic facial recognition is useful for flagging identity mismatches, it is
insufficient alone for detecting modern deepfakes. Future systems must combine visual,
temporal, and contextual cues.
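A minimal sketch of the face_recognition comparison workflow described above is shown below. The folder names and the tolerance value are illustrative assumptions.

import os
import face_recognition

# Hypothetical folders: known/ holds reference images, unknown/ holds suspect images.
known_encodings = []
for name in os.listdir("known"):
    image = face_recognition.load_image_file(os.path.join("known", name))
    encodings = face_recognition.face_encodings(image)
    if encodings:
        known_encodings.append(encodings[0])

for name in os.listdir("unknown"):
    image = face_recognition.load_image_file(os.path.join("unknown", name))
    for encoding in face_recognition.face_encodings(image):
        matches = face_recognition.compare_faces(known_encodings, encoding, tolerance=0.6)
        verdict = "matches a known identity" if any(matches) else "no match (possibly altered)"
        print(name, verdict)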

2.4. The paper “A Comprehensive Overview of Deepfake: Generation, Detection, Datasets,
and Opportunities” by Jia Wen Seow, Mei Kuan Lim, Raphaël C.W. Phan, and Joseph K. Liu
provides a detailed study on the technical landscape and challenges of deepfake technology.
Key highlights include:
• Problem Statement: Deepfakes are a rising threat to privacy, trust, and national security.
A lack of unified detection frameworks makes current detection efforts fragmented and
inconsistent.


• Proposed System: The paper categorizes deepfake generation models (GANs,
autoencoders, etc.) and detection approaches (handcrafted, learning-based, multi-modal),
offering a comparative review of their effectiveness and limitations.
• Implementation Details:
o Reviews deepfake generation tools: StyleGAN, ProGAN, etc.
o Discusses detection: temporal, spatial, frequency-based, and hybrid approaches.
o Outlines benchmark datasets and evaluation metrics.
• Technologies Used:
o Generation: GANs, Autoencoders.
o Detection: CNNs, RNNs, Hybrid architectures.
o Datasets: DFDC, FaceForensics++, Celeb-DF, Google DFD.
• Accuracy: Deep learning models show high performance (~90–98%) on in-domain data
but often fail on out-of-distribution samples.
• Advantages:
o Comprehensive categorization of fake media techniques.
o Highlights gaps in current approaches.
o Emphasizes multi-modal detection as the future.
• Conclusion: No single model suffices. Future work must focus on generalizability,
multimodal features, and adversarial robustness to combat the ever-evolving landscape of
synthetic media.

2.5. The paper “Improving Deepfake Detection by Mixing Top Solutions of the DFDC”
by Anis Trabelsi, Marc Michel Pic, and Jean-Luc Dugelay provides a detailed study on model
ensembling for robust deepfake detection. Key highlights include (an ensemble sketch follows
this summary):
• Problem Statement: Top-performing models in the Deepfake Detection Challenge
(DFDC) often overfit and fail on unseen data, highlighting a lack of generalization.
• Proposed System: The paper investigates assembling the top five DFDC models using
boosting, bagging, and stacking to create a more robust and generalizable detector.
• Implementation Details:
o Collect predictions from different models.
o Apply ensemble strategies to merge outputs.

o Evaluate performance on DFDC and unseen video samples.


• Technologies Used:
o Ensemble Methods: Bagging, Boosting, Stacking.
o Frameworks: PyTorch, TensorFlow.
o Datasets: DFDC, Celeb-DF, Google DFD, FaceForensics++.
• Accuracy:
o +2.26% improvement in accuracy.
o +41% improvement in log-loss over single models.
• Advantages:
o High generalization.
o Combines strengths of diverse models.
o Scalable to other detection frameworks.
• Conclusion: Ensembling top solutions yields better performance than individual models.
The study encourages dynamic, modular fusion strategies for practical deepfake detection.
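To illustrate the ensembling idea (merging per-video probabilities from several detectors), the sketch below averages model outputs and also trains a stacking meta-classifier with scikit-learn. The prediction arrays are random placeholders standing in for the DFDC base models, so the printed numbers are not meaningful.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# preds[i, j] = probability that video i is fake according to base model j.
rng = np.random.default_rng(0)
preds = rng.uniform(size=(1000, 5))          # placeholder outputs of five base models
labels = rng.integers(0, 2, size=1000)       # placeholder ground truth

# Bagging-style fusion: simply average the probabilities.
avg = preds.mean(axis=1)
print("averaging log-loss:", log_loss(labels, avg))

# Stacking: a meta-classifier learns how to weight the base models.
stacker = LogisticRegression().fit(preds, labels)
stacked = stacker.predict_proba(preds)[:, 1]
print("stacking log-loss:", log_loss(labels, stacked))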

2.6. The paper “Improving Deepfake Detection by Mixing Top Solutions of the DFDC”
by Anis Trabelsi, Marc Michel Pic, and Jean-Luc Dugelay provides a detailed study on model
ensembling for robust deepfake detection. Key highlights include:
• Problem Statement: Top-performing models in the Deepfake Detection Challenge
(DFDC) often overfit and fail on unseen data, highlighting a lack of generalization.
• Proposed System: The paper investigates assembling the top five DFDC models using
boosting, bagging, and stacking to create a more robust and generalizable detector.
• Implementation Details:
o Collect predictions from different models.
o Apply ensemble strategies to merge outputs.
o Evaluate performance on DFDC and unseen video samples.
• Technologies Used:
o Ensemble Methods: Bagging, Boosting, Stacking.
o Frameworks: PyTorch, TensorFlow.
o Datasets: DFDC, Celeb-DF, Google DFD, FaceForensics++.
• Accuracy:
o +2.26% improvement in accuracy.
o +41% improvement in log-loss over single models.

• Advantages:
o High generalization.
o Combines strengths of diverse models.

o Scalable to other detection frameworks.


• Conclusion: Ensembling top solutions yields better performance than individual models.
The study encourages dynamic, modular fusion strategies for practical deepfake detection.

2.7. The paper “Improving Deepfake Detection by Mixing Top Solutions of the DFDC”
by Anis Trabelsi, Marc Michel Pic, and Jean-Luc Dugelay provides a detailed study on model
ensembling for robust deepfake detection. Key highlights include:
• Problem Statement: Top-performing models in the Deepfake Detection Challenge
(DFDC) often overfit and fail on unseen data, highlighting a lack of generalization.
• Proposed System: The paper investigates assembling the top five DFDC models using
boosting, bagging, and stacking to create a more robust and generalizable detector.
• Implementation Details:
o Collect predictions from different models.
o Apply ensemble strategies to merge outputs.
o Evaluate performance on DFDC and unseen video samples.
• Technologies Used:
o Ensemble Methods: Bagging, Boosting, Stacking.
o Frameworks: PyTorch, TensorFlow.
o Datasets: DFDC, Celeb-DF, Google DFD, FaceForensics++.
• Accuracy:
o +2.26% improvement in accuracy.
o +41% improvement in log-loss over single models.
• Advantages:
o High generalization.
o Combines strengths of diverse models.
o Scalable to other detection frameworks.
• Conclusion: Ensembling top solutions yields better performance than individual models.
The study encourages dynamic, modular fusion strategies for practical deepfake detection.

2.8. The paper “Spatial Vision Transformer: A Novel Approach to Deepfake Video Detection”
provides a detailed study on a hybrid CNN and Vision Transformer architecture for deepfake
video detection. Key highlights include (an illustrative sketch follows this summary):
• Problem Statement: With the increasing sophistication of deepfake generation techniques,
detecting forgeries in videos has become more challenging. Traditional CNN-based


methods often fall short in modeling long-range dependencies and generalizing across
diverse manipulation types.
• Proposed System: The authors propose a hybrid deepfake detection model named Spatial
Vision Transformer (SViT), which integrates CNN-based ConvBlocks and SCConv
(Spatial and Channel Reconstruction Convolution) blocks with a Vision Transformer
architecture. This model is designed to capture both local (spatial) and global (contextual)
features effectively from face regions in video frames.
• Implementation Details:
o Videos are preprocessed using frame extraction, face detection (BlazeFace), and
cropping with post-processing.
o Frames are resized to 224×224 pixels.
o The architecture includes 5 ConvBlocks and 2 SCConvBlocks for feature extraction.
o These features are embedded into patches and passed through a transformer encoder
with multi-head self-attention.
o The output is processed through an MLP head for binary classification (real or fake).
• Technologies Used:
o Model Architecture: SViT (ConvBlocks + SCConv + Vision Transformer)
o Tools: Python, PyTorch, OpenCV, BlazeFace, MTCNN
o Dataset: Subset of DFDC (Deepfake Detection Challenge) dataset
• Accuracy: The SViT model achieved an accuracy of 93.92%, AUC of 94.79%, and an
F1 Score of 93.01%, outperforming baseline models like CViT and DSViT with fewer
parameters (79.9M vs. CViT’s 88.9M).
• Advantages:
o Enhanced feature representation via SCConv
o High accuracy with reduced false positives
o Efficient architecture with optimized preprocessing
o Balanced trade-off between complexity and performance

• Conclusion: The paper concludes that the Spatial Vision Transformer model provides
superior detection capabilities by combining CNN and transformer strengths. It addresses
the limitations of earlier models like CViT and DSViT, making it suitable for real-time or
resource-constrained deployment. Future research aims to test SViT on more sophisticated
deepfake techniques and larger, diverse datasets.
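The general CNN-plus-transformer pattern the paper describes (convolutional feature extraction, patch embedding, multi-head self-attention, an MLP head) is sketched below in PyTorch. Layer sizes are illustrative and the SCConv blocks are simplified to plain convolutions, so this is not the authors' SViT implementation.

import torch
import torch.nn as nn

class TinyConvViT(nn.Module):
    """Simplified CNN + transformer classifier in the spirit of SViT."""
    def __init__(self, dim=128, heads=4, depth=2):
        super().__init__()
        self.conv = nn.Sequential(                       # stand-in for ConvBlocks/SCConv
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))  # classification token
        self.head = nn.Linear(dim, 1)                    # real/fake logit

    def forward(self, x):                                # x: (batch, 3, 224, 224)
        feats = self.conv(x)                             # (batch, dim, 28, 28)
        patches = feats.flatten(2).transpose(1, 2)       # (batch, 784, dim)
        cls = self.cls.expand(x.size(0), -1, -1)
        encoded = self.encoder(torch.cat([cls, patches], dim=1))
        return self.head(encoded[:, 0]).squeeze(-1)

print(TinyConvViT()(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2])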


2.9. The paper “Improving Deepfake Detection by Mixing Top Solutions of the DFDC” by
Anis Trabelsi, Marc Michel Pic, and Jean-Luc Dugelay provides a detailed study on model
ensembling for robust deepfake detection. Key highlights include:
• Problem Statement: Top-performing models in the Deepfake Detection Challenge
(DFDC) often overfit and fail on unseen data, highlighting a lack of generalization.
• Proposed System: The paper investigates assembling the top five DFDC models using
boosting, bagging, and stacking to create a more robust and generalizable detector.
• Implementation Details:
o Collect predictions from different models.
o Apply ensemble strategies to merge outputs.
o Evaluate performance on DFDC and unseen video samples.
• Technologies Used:
o Ensemble Methods: Bagging, Boosting, Stacking.
o Frameworks: PyTorch, TensorFlow.
o Datasets: DFDC, Celeb-DF, Google DFD, FaceForensics++.
• Accuracy:
o +2.26% improvement in accuracy.
o +41% improvement in log-loss over single models.
• Advantages:
o High generalization.
o Combines strengths of diverse models.
o Scalable to other detection frameworks.
• Conclusion: Ensembling top solutions yields better performance than individual models.
The study encourages dynamic, modular fusion strategies for practical deepfake detection.


2.10. The paper “DeepFake Videos Detection and Classification Using ResNeXt and LSTM
Neural Network” provides a detailed study on hybrid spatial-temporal deepfake video
detection. Key highlights include:
• Problem Statement: Deepfake videos use advanced face-swapping and generation
techniques to create highly realistic fake content that can deceive both humans and
traditional detection algorithms. Detecting these manipulations requires understanding not
just frame-level features but also the temporal consistency across frames.
• Proposed System: The study proposes a hybrid model combining ResNeXt-50, a
convolutional neural network, for spatial feature extraction, with LSTM (Long Short-Term
Memory) networks to capture temporal dependencies between video frames. The dual-
architecture is designed to leverage spatial and sequential patterns to detect manipulations
more effectively.
• Implementation Details:
o Video frames are extracted and preprocessed for consistent input.
o ResNeXt-50 extracts frame-level features from individual video frames.
o The sequence of feature vectors is passed into an LSTM to analyze frame-to-frame
dynamics.
o The model is trained with binary classification labels: real or fake.
• Technologies Used:
o Deep Learning Architecture: ResNeXt-50 + LSTM
o Tools: Python, TensorFlow/Keras
o Datasets: Celeb-DF, FaceForensics++, DFDC
• Accuracy: The hybrid ResNeXt-LSTM model achieved an accuracy of 93.5%,
demonstrating strong performance in identifying both spatial artifacts and temporal
inconsistencies typical of deepfake content.
• Advantages:
o Combines powerful spatial and temporal modeling.
o Performs well across multiple datasets and deepfake types.
o Suitable for sequential data such as videos.
• Conclusion: The paper highlights the effectiveness of combining CNN and RNN
architectures for deepfake detection. The integration of ResNeXt and LSTM enables the
model to analyze both visual quality and temporal coherence.


MEMBER 2: Manoj B – Literature Survey

2.11. The paper “ABC-CapsNet: Attention-based Cascaded Capsule Network for Audio
Deepfake Detection” by Taiba Majid Wani presents a novel approach for detecting audio
deepfakes with high precision using capsule networks and attention mechanisms. Key
highlights include:
• Problem Statement: Traditional CNN-based audio deepfake detectors suffer from loss of
spatial hierarchy and temporal information, resulting in limited performance against
sophisticated attacks.
• Proposed System: Introduces ABC-CapsNet which combines Mel spectrograms with
VGG18-based feature extraction, followed by a cascaded capsule network and attention
layer for robust classification.
• Implementation Details (the Mel-spectrogram front end is sketched after this summary):
o Input: Mel spectrograms from audio samples.
o Features extracted using VGG18.
o Attention layer highlights critical segments.
o Capsule network captures hierarchical features.
o Evaluated on ASVspoof2019 and FoR datasets.
• Technologies Used: Python, VGG18, Capsule Networks, Attention Mechanisms.
• Accuracy: Achieved EER of 0.06% (ASVspoof2019) and 0.04% (FoR).
• Advantages: Maintains spatial hierarchy, improves generalization, reduces EER, and
surpasses CNN limitations.
• Conclusion: ABC-CapsNet effectively detects manipulated audio, offering high accuracy
across varied datasets and establishing a new benchmark in audio deepfake detection.
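Since the audio papers in this survey feed Mel spectrograms into their networks, a small librosa sketch of that front end is shown below; the file path and spectrogram parameters are illustrative assumptions.

import numpy as np
import librosa

# Hypothetical input file; ASVspoof2019 audio is typically sampled at 16 kHz.
audio, sr = librosa.load("sample.wav", sr=16000)

# Mel spectrogram in decibels, a common input representation for CNN/CapsNet detectors.
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_fft=1024,
                                     hop_length=256, n_mels=80)
mel_db = librosa.power_to_db(mel, ref=np.max)
print(mel_db.shape)    # (80, number_of_frames)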

2.12. The paper “Audio Deepfake Detection Using Deep Learning” by R. Anagha focuses
on detecting audio deepfakes using CNNs applied on Mel spectrograms. Key highlights
include:
• Problem Statement: Audio deepfakes are increasingly used for impersonation and
misinformation, necessitating robust detection systems.
• Proposed System: A CNN-based deep learning architecture that processes Mel
spectrograms of audio files to distinguish real vs. fake samples.
• Implementation Details:
o Dataset: ASVspoof2019.
o Mel spectrograms generated and used as input.


o CNN with convolution, pooling, dropout, batch normalization.


o Binary classification with Adam optimizer and binary cross-entropy.
• Technologies Used: Python, CNN, Keras, ASVspoof2019 dataset.
• Accuracy: Accuracy and AUC not precisely quantified but shows strong detection
capabilities via ROC curve.
• Advantages: Simple architecture, effective on known dataset, supports augmentation.
• Conclusion: Deep learning-based audio spectrogram analysis is a promising direction for
deepfake detection with potential for real-world deployment.

2.13. The paper “AI-Based Deepfake Detection” by Aditi Garde explores physiological cue-
based deepfake detection focusing on eye-blinking irregularities. Key highlights include:
• Problem Statement: GANs often fail to mimic natural human traits like eye blinking,
which can be exploited for detection.
• Proposed System: A deepfake detection method based on detecting unnatural eye blinking
using CNN and SVM.
• Implementation Details (a common blink metric is sketched after this summary):
o Extract video frames.
o Detect eyes and analyze blinking patterns.
o Train classifiers (SVM and CNN) to detect irregularities.
• Technologies Used: Python, OpenCV, CNN, SVM.
• Advantages: Innovative use of involuntary physiological behavior; works on low-
resolution content.
• Conclusion: Eye-blinking detection offers an efficient and interpretable method for real-
time deepfake detection in videos.
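The paper's own CNN/SVM classifiers are not reproduced here. As a hedged illustration of how blinking can be quantified before classification, the sketch below computes the eye aspect ratio (EAR) from six eye landmarks, a blink metric commonly used with dlib's 68-point layout; it is an assumed preprocessing step, not necessarily the authors' exact feature.

import numpy as np

def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks around one eye, ordered as in the 68-point dlib layout.
    EAR drops sharply when the eye closes, so a time series of EAR values can be
    thresholded or fed to a classifier to characterize blinking behaviour."""
    eye = np.asarray(eye, dtype=float)
    vertical_1 = np.linalg.norm(eye[1] - eye[5])
    vertical_2 = np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return (vertical_1 + vertical_2) / (2.0 * horizontal)

# Toy example with hypothetical landmark coordinates for an open eye.
open_eye = [(0, 3), (2, 5), (4, 5), (6, 3), (4, 1), (2, 1)]
print(round(eye_aspect_ratio(open_eye), 3))   # about 0.667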

2.14. The paper “Motion Magnified 3D Residual-in-Dense Network for DeepFake Detection”
by Aman Mehra proposes a spatio-temporal model with motion magnification to
enhance detection in compressed videos. Key highlights include:
• Problem Statement: Compression reduces the visibility of artifacts in deepfake videos,
impacting detection accuracy.
• Proposed System: Applies motion magnification to highlight inconsistencies, followed by
a 3D Residual-in-Dense ConvNet for classification.
• Implementation Details:
o Motion magnification pre-processing.
o 3D ConvNet captures temporal inconsistencies.

o Dense residual connections improve learning without extra memory overhead.


• Technologies Used: Python, FaceForensics++, CelebDF, 3D ConvNet.
• Accuracy: >93% detection accuracy on high-compression FaceForensics++ dataset.
• Advantages: Compression-resilient, addresses ethnicity bias, robust across multiple
datasets.
• Conclusion: Motion-enhanced deepfake detection using 3D spatio-temporal networks
proves effective even under challenging conditions.

2.15. The paper “Domain Generalization via Aggregation and Separation for Audio
Deepfake Detection” by Yuankun Xie proposes a model that generalizes better across unseen
audio deepfake datasets. Key highlights include:
• Problem Statement: Audio deepfake detectors often overfit to the training dataset and fail
to detect new types of fake audio.
• Proposed System: ASDG method that aggregates real speech and separates fake samples
using adversarial and triplet loss learning.
• Implementation Details (the triplet-loss component is sketched after this summary):
o Uses Lightweight CNN for feature generation.
o Adversarial training on real data only.
o Triplet loss increases separation between fake types.
• Technologies Used: TensorFlow, Python, multiple English audio datasets.
• Accuracy: Up to 39.24% reduction in Equal Error Rate (EER) vs. baseline.
• Advantages: Strong generalization to unseen domains; improves robustness without
needing new spoofed data.
• Conclusion: ASDG offers a scalable and adaptive audio deepfake detection approach
suitable for rapidly evolving real-world threats.
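The separation objective described above relies on a triplet loss that keeps real-speech embeddings close together while pushing fake embeddings away. A minimal PyTorch sketch of that component, with random placeholder embeddings, is shown below.

import torch
import torch.nn as nn

# Embeddings that would come from the lightweight CNN feature generator (placeholders here).
anchor   = torch.randn(16, 128)   # real speech
positive = torch.randn(16, 128)   # other real speech (aggregated "real" domain)
negative = torch.randn(16, 128)   # spoofed / fake speech

# Margin-based triplet loss: d(anchor, positive) + margin < d(anchor, negative).
triplet = nn.TripletMarginLoss(margin=1.0, p=2)
loss = triplet(anchor, positive, negative)
print(loss.item())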

2.16. The paper “Comparative Analysis on Different DeepFake Detection Methods and
Semi Supervised GAN Architecture for DeepFake Detection” by Jerry John and Bismin V.
Sherif presents a comparative overview of detection methods and proposes a semi-supervised
GAN-based model. Key highlights include:
• Problem Statement: Deepfakes, particularly with combined visual and voice
manipulations, are difficult to detect manually. There is a need to identify optimal detection
techniques across varied scenarios.


• Proposed System: The paper performs a comparative analysis of feature-based, temporal-
based, and deep feature-based detection methods. It also proposes a Semi-Supervised GAN
(SGAN) architecture for improved detection.
• Implementation Details:
o Evaluates models based on face detection method, architecture type, dataset used,
and modality (video/image/audio).
o SGAN is trained with both labeled and unlabeled data to improve learning when
ground truth is limited.
• Technologies Used: GANs, CNNs, SGANs; datasets like DFDC, CelebDF,
FaceForensics++.
• Advantages: SGAN improves learning in low-data environments. The comparative
framework aids in selecting suitable detection models.
• Conclusion: The SGAN approach adds robustness and adaptability to deepfake detection
models and is particularly effective in situations with limited labeled data.

2.17. The paper “Deepfake Detection: A Systematic Literature Review” by Md Shohel Rana
provides a comprehensive survey of over 100 papers on deepfake detection. Key highlights
include:
• Problem Statement: The rapid development of deepfake generation techniques has
outpaced detection methods, necessitating a structured review to map current
advancements.
• Proposed System: The paper categorizes deepfake detection approaches into four groups:
classical ML, deep learning, statistical methods, and blockchain-based techniques.
• Implementation Details:
o Reviews 112 research papers (2018–2020).
o Evaluates based on technique, dataset used, metrics, and performance.
o Provides taxonomy and challenges for each detection method.
• Technologies Used: Multi-technique (DL, ML, blockchain); FaceSwap, DeepFaceLab,
Face2Face.
• Advantages: Broad coverage, useful for benchmarking; highlights the pros/cons of each
approach.
• Conclusion: Deep learning-based approaches outperform others, but generalization, real-
time execution, and interpretability remain key challenges for future research.


2.18. The paper “Deepfake Generation and Detection – An Exploratory Study” by Diya
Garg and Rupali Gill gives an overview of creation and detection methods along with
benchmark datasets. Key highlights include:
• Problem Statement: Deepfakes, especially multimodal ones (audio + video), are
increasingly realistic and pose risks in privacy and public trust.
• Proposed System: Reviews state-of-the-art detection models, including an AVoiD-DF
model that exploits audio-visual disparity using temporal-spatial information.
• Implementation Details:
o Covers GAN-based generation, D-CNN-based detection.
o Proposes multi-modal joint decoder for integrated feature analysis.
o Introduces DefakeAVMiT dataset for audio-video deepfakes.
• Technologies Used: Deep CNNs, multi-modal encoder-decoder, binary cross-entropy,
Adam optimizer.
• Advantages: Exploits cross-modal inconsistencies; novel dataset supports real-world
multimodal detection.
• Conclusion: Detecting multimodal deepfakes is more challenging but achievable with joint
analysis of audio and visual cues.

2.19. The paper “An Effective Approach for Detecting Deepfake Videos Using Long Short-
Term Memory and ResNet” by Keerthana S. introduces a hybrid LSTM + ResNet architecture
to capture temporal and spatial features. Key highlights include:
• Problem Statement: Most models fail to capture both frame-level and temporal
inconsistencies in deepfake videos.
• Proposed System: Combines ResNet (for spatial features from frames) with LSTM (for
temporal dependencies) to detect forged videos.
• Implementation Details:
o ResNext CNN used for extracting frame-level features.
o LSTM processes temporal information across frames.
o Datasets used: FaceForensics++, DFDC, Celeb-DF.
• Technologies Used: Python, ResNet, LSTM, TensorFlow/Keras.
• Accuracy: Not explicitly stated; validated across multiple benchmark datasets.
• Advantages: Captures both short and long-term inconsistencies, effective in detecting
sequential manipulations.
• Conclusion: The LSTM-ResNet model enhances deepfake detection accuracy and robustness.


2.20. The paper “Artificial Intelligence into Multimedia Deepfakes Creation and
Detection” by Moaiad Ahmed Khder explores AI techniques in both creating and detecting
deepfakes. Key highlights include:
• Problem Statement: Deepfake detection is becoming increasingly challenging as videos
grow more realistic, leading to a loss of public trust and increased misinformation. Traditional
detection techniques are no longer sufficient to identify subtle manipulations.
• Proposed System: The paper reviews deep learning-based systems that analyze facial
features and behaviors to distinguish real videos from manipulated ones. It emphasizes the use
of autoencoders and GANs for creation and binary classification models for detection.
• Implementation Details:
o Deep autoencoders and GANs used to generate deepfakes.
o Detection uses binary classification models trained on real vs fake data.
o Features like facial movements and speech patterns are analyzed.
o Benchmarks discussed for cross-dataset and cross-forgery generalization.
• Technologies Used: Python, Deep Learning, TensorFlow, Autoencoders, GANs, Web
scraping for data collection.
• Accuracy: Not explicitly stated; emphasizes the need for better generalization and real-
world robustness.
• Advantages:
o Can detect high-quality and subtle manipulations.
o Helps restore trust by identifying fake content.
o Adaptable with updated datasets and model improvements.
• Conclusion: Deepfake technology poses serious risks but also benefits in entertainment
and healthcare. The paper concludes that evolving AI-based detection methods, updated
datasets, and ethical awareness are crucial to combating this growing threat.


MEMBER 3: Niveditha P – Literature Survey


2.21. The paper “DeepFake Video Detection Using Machine Learning and Deep Learning
Techniques” by Swati N. Patil and Dr. R.S. Holambe compares traditional machine
learning and deep learning methods for deepfake video detection. Key highlights include (the
shared face-crop preprocessing step is sketched after this summary):
• Problem Statement: With the rise in deepfake videos, distinguishing real from fake has
become extremely difficult using human perception. The authors aim to build a system
that can automatically detect deepfakes from videos using ML and DL techniques.
• Proposed System: The paper proposes a system that implements both traditional
machine learning models—Support Vector Machines (SVM) and K-Nearest Neighbors
(KNN)—and a Convolutional Neural Network (CNN) to detect deepfake videos. The CNN
model is designed to automatically learn spatial features and patterns from video
frames, while the ML models rely on handcrafted feature extraction. The system aims to
compare the effectiveness of these approaches in real-world deepfake datasets.
• Implementation Details:
o Video datasets (FaceForensics++ and DFDC) are used, with videos split into
individual frames.
o Face regions are detected and cropped from each frame using Haar Cascade Classifier.
o Frames are preprocessed by resizing and normalization.
o CNN is trained end-to-end for feature extraction and classification.
o SVM and KNN classifiers are trained using manually extracted features like edges
and facial landmarks.
o Performance metrics include accuracy, precision, recall, and error rate.
• Accuracy: CNN achieved an accuracy of 91%, outperforming the traditional ML models.
• Technologies Used:
o Python programming language.
o OpenCV for video frame extraction and face detection.
o TensorFlow and Keras for CNN implementation.
o Scikit-learn for machine learning classifiers (SVM, KNN).
o Pandas and NumPy for data manipulation and preprocessing.
• Advantages:
o CNNs automatically learn deep features without manual intervention, improving
detection accuracy.
o The combination of ML and DL methods provides a good baseline for future
experiments.

o The approach uses widely available datasets and standard tools, enhancing
replicability.
o The system provides insights into model effectiveness on complex deepfake data.
• Conclusion: The paper concludes that deep learning methods, especially CNNs,
significantly outperform traditional machine learning classifiers in detecting deepfake
videos. It highlights the importance of automated feature learning and sets the stage for
integrating temporal analysis techniques like LSTM in future work.
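The preprocessing step shared by this and several later papers (frame extraction, Haar-cascade face detection, cropping, resizing) can be sketched with OpenCV as follows; the video path and output size are illustrative assumptions.

import cv2

# OpenCV ships the Haar cascade file referenced below with its installation.
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

capture = cv2.VideoCapture("input_video.mp4")   # hypothetical path
face_crops = []
while True:
    ok, frame = capture.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        crop = cv2.resize(frame[y:y + h, x:x + w], (224, 224))   # CNN-ready face crop
        face_crops.append(crop)
capture.release()
print(len(face_crops), "face crops extracted")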

2.22. The paper “Enhancing Deepfake Video Detection Performance with a Hybrid CNN
Deep Learning Model” by R. Bhanu Prasad et al. proposes a multi-CNN hybrid approach
to improve deepfake detection accuracy and robustness. Key highlights include (an
illustrative sketch follows this summary):
• Problem Statement:
o Deepfake videos have increasingly subtle manipulations that single CNN models
struggle to detect reliably.
o Many existing systems fail to capture both shallow pixel-level artifacts and deeper
semantic inconsistencies.
o There is a need for a hybrid model that combines strengths of multiple CNN
architectures to enhance detection performance.
• Proposed System: The system integrates multiple CNN models—Xception, VGG16, and
InceptionResNetV2—to form a hybrid architecture that fuses features from different layers.
This approach allows the model to simultaneously detect fine-grained anomalies and global
inconsistencies in facial videos, improving overall accuracy and reducing false positives.
• Implementation Details:
o The DFDC dataset is used, featuring thousands of deepfake and real videos in varied
settings.
o Video frames are extracted and faces detected using Haar cascades or MTCNN.
o Each CNN model is fine-tuned on the dataset to extract complementary features.
o Features are concatenated and passed through dense layers for classification.
o The system is trained over multiple epochs using batch normalization and dropout to
prevent overfitting.
• Technologies Used:
o Python and Keras for model development
o TensorFlow backend
o DFDC dataset for training and evaluation


o OpenCV and MTCNN for preprocessing and face detection


o Transfer learning techniques to leverage pretrained CNN weights
• Accuracy: Achieved 98.1% accuracy and an F1-score of 0.98 on the DFDC test set,
outperforming individual CNN models.
• Advantages:
o Combines multiple CNNs for improved feature extraction.
o Robust to variations in lighting, pose, and video quality.
o Reduces false positive rates.
o Scalable and adaptable to other datasets or modalities.
• Conclusion: The hybrid CNN model significantly improves deepfake detection accuracy
by leveraging multiple CNN architectures. It demonstrates potential for real-world
applications requiring high reliability and adaptability.
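A minimal Keras sketch of the hybrid idea (several pretrained backbones whose pooled features are concatenated and classified by dense layers) is given below. The input size, head sizes, and the omission of per-backbone preprocess_input calls are simplifying assumptions, not the paper's exact configuration.

from tensorflow.keras import Input, Model, layers
from tensorflow.keras.applications import Xception, VGG16, InceptionResNetV2

inputs = Input(shape=(299, 299, 3))
backbones = [
    Xception(include_top=False, weights="imagenet", pooling="avg"),
    VGG16(include_top=False, weights="imagenet", pooling="avg"),
    InceptionResNetV2(include_top=False, weights="imagenet", pooling="avg"),
]
# Concatenate the pooled feature vectors of all three backbones.
features = layers.Concatenate()([b(inputs) for b in backbones])
x = layers.Dense(256, activation="relu")(features)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)        # real vs fake

hybrid = Model(inputs, outputs)
hybrid.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
hybrid.summary()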

2.23. The paper “Comparative Analysis of Deepfake Video Detection Using InceptionNet
and EfficientNet” by V. Pandey and K. Jain evaluates the suitability of two CNN
architectures for efficient and accurate deepfake detection. Key highlights include:
• Problem Statement:
o Many deepfake detection models achieve accuracy at the expense of high
computational costs, making real-time application challenging.
o There is a need to balance detection accuracy with inference speed and resource
efficiency.
o Choosing an optimal CNN architecture can address this trade-off.
• Proposed System: The system integrates multiple CNN models—Xception, VGG16, and
InceptionResNetV2—to form a hybrid architecture that fuses features from different
layers. This approach allows the model to simultaneously detect fine-grained anomalies and
global inconsistencies in facial videos, improving overall accuracy and reducing false
positives.
• Implementation Details:
o The DFDC dataset is used, featuring thousands of deepfake and real videos in varied
settings.
o Video frames are extracted and faces detected using Haar cascades or MTCNN.
o Each CNN model is fine-tuned on the dataset to extract complementary features.
o Features are concatenated and passed through dense layers for classification.


o The system is trained over multiple epochs using batch normalization and dropout to
prevent overfitting.
• Technologies Used:
o TensorFlow and Keras frameworks
o OpenCV for face detection preprocessing
o YOLOv3 for real-time face extraction
o Vision Transformers integrated with CNN backbones
o Custom dataset compiled from public deepfake sources
• Accuracy: EfficientNet achieved an accuracy of 94%, outperforming InceptionNet in both
accuracy and inference speed.
• Advantages:
o EfficientNet offers a lightweight yet powerful alternative for deepfake detection.
o Vision Transformers enhance the model’s ability to capture context.
o Suitable for real-time deployment in low-resource environments.
o Demonstrates better generalization on varied datasets.
• Conclusion: EfficientNet combined with Vision Transformers presents an optimal solution
balancing speed and accuracy, promising for scalable and real-time deepfake detection
systems

2.24. The paper “Comparison of Different Machine Learning Algorithms for Deep Fake
Detection” by R. Arora and K. Madan compares traditional machine learning classifiers for
image-based deepfake detection. Key highlights include:
• Problem Statement:
o Deepfake detection research predominantly focuses on deep learning, but there is still
a need to evaluate simpler machine learning methods for resource-constrained
environments.
o Understanding which traditional ML algorithm performs best on static image datasets
aids in selecting appropriate models for different application scenarios.
• Proposed System: The authors propose a comparative study of five machine learning
algorithms—Support Vector Machine (SVM), Naïve Bayes, Decision Tree, Random
Forest, and K-Nearest Neighbors (KNN)—to classify real and fake images.
The images are subjected to preprocessing and dimensionality reduction using Principal
Component Analysis (PCA) before being fed to the classifiers. The models are evaluated
based on standard classification metrics.


• Implementation Details (sketched after this summary):
o Images are preprocessed by resizing and grayscale conversion.
o PCA reduces features while retaining essential information.
o Classifiers are trained using 80:20 train-test splits.
o Model performance evaluated using accuracy, precision, recall, and F1-score.
• Technologies Used:
o Python’s Scikit-learn library for ML models
o Pandas and NumPy for data handling
o Matplotlib for visualization
o Kaggle dataset for real and fake images
• Accuracy:
o SVM achieved the highest accuracy at approximately 94%.
o Random Forest and Decision Tree showed slightly lower but comparable
performances.
• Advantages:
o Machine learning models are less computationally expensive than deep learning
models.
o Easier to implement and interpret.
o Suitable for image-only detection where temporal features are not required.
o Faster training times on moderate datasets.
• Conclusion: The study concludes that Support Vector Machine outperforms other
traditional ML algorithms in detecting deepfake images. Despite the simplicity of the
approach, SVM demonstrated strong accuracy and generalization, suggesting that it can
serve as an efficient alternative when deep learning is not practical. The paper emphasizes
the relevance of machine learning techniques in scenarios where speed, interpretability, or low-
resource deployment is crucial.
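The comparison pipeline described above (preprocessed images, PCA, then several scikit-learn classifiers on an 80:20 split) can be sketched as follows; the synthetic arrays stand in for the Kaggle real/fake image set.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data: 500 flattened 64x64 grayscale "images" with real/fake labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64 * 64))
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
pca = PCA(n_components=50).fit(X_train)              # dimensionality reduction
X_train, X_test = pca.transform(X_train), pca.transform(X_test)

classifiers = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "KNN": KNeighborsClassifier(),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))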

2.25. The paper “Analysis and Comparison of Deepfakes Detection Methods for Cross-
Library Generalisation” by B. Packkildurai and V. Sivasankari investigates the
generalizability of popular detection models across datasets. Key highlights include (the
evaluation metrics are sketched after this summary):
• Problem Statement:
o Deepfake detection models often overfit to the datasets they are trained on, limiting
their real-world effectiveness.


o There is a critical need to evaluate how well these models generalize when tested on
entirely different datasets with varying video qualities and compression artifacts.
• Proposed System: The study compares the performance of XceptionNet, EfficientNet, and
Capsule Networks across multiple datasets including FaceForensics++, DFDC, and Celeb-
DF. Models are trained on one dataset and tested on others to evaluate cross-library
generalization.
• Implementation Details:
o Transfer learning is used for model training.
o Cross-dataset testing protocol ensures models are evaluated on unseen data distributions.
o Accuracy and F1-score are computed for each training-testing dataset pair.
o Confusion matrices and ROC curves analyze misclassification patterns.
• Technologies Used:
o Python-based scripting and experimentation
o TensorFlow/Keras deep learning frameworks
o Data augmentation tools for preprocessing.
o Evaluation metrics: Accuracy, EER, HTER
o Datasets: FaceForensics++, Celeb-DF, DeepfakeTIMIT
• Accuracy:
o Intra-library Testing: Most models achieved high accuracy (>90%), with Multi-task
reaching up to 98% accuracy in same-dataset testing.
o Cross-library Testing: Performance dropped significantly; HTER remained above
30% for all models, indicating poor generalization.
• Advantages:
o Provides a benchmark comparison across six popular detection models.
o Introduces a unified testing framework including cross-library testing, making results
more interpretable.
o Identifies real-world factors like domain offset, data partitioning, and threshold tuning
that directly impact model performance.
o Highlights the importance of person-based dataset splitting for better generalization.
• Conclusion: The study concludes that while many models perform excellently within their
training datasets, they struggle with real-world generalizability. Random data partitioning
and poorly tuned thresholds inflate performance metrics. The paper recommends using
person-based partitioning, carefully selected thresholds, and diverse datasets for


evaluation. It emphasizes the importance of creating more robust and transferable models
that can handle domain shifts across various deepfake generation techniques and contexts.
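Since EER and HTER are the key metrics in this cross-library study, the sketch below shows one standard way to compute them from prediction scores with scikit-learn; the score and label arrays are random placeholders.

import numpy as np
from sklearn.metrics import roc_curve

# Placeholder scores: higher means "more likely fake".
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
scores = labels * 0.3 + rng.uniform(size=1000) * 0.7

fpr, tpr, thresholds = roc_curve(labels, scores)
fnr = 1 - tpr

# Equal Error Rate: the operating point where the two error rates meet.
idx = np.nanargmin(np.abs(fpr - fnr))
print("EER:", round((fpr[idx] + fnr[idx]) / 2, 3))

# HTER at a fixed threshold (here the EER threshold, as if chosen on the training library).
threshold = thresholds[idx]
predicted_fake = scores >= threshold
false_accept = np.mean(predicted_fake[labels == 0])      # real samples flagged as fake
false_reject = np.mean(~predicted_fake[labels == 1])     # fake samples missed
print("HTER:", round((false_accept + false_reject) / 2, 3))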

2.26. The paper “A Comprehensive Review on Fake Images/Videos Detection Techniques”
by Ruby Chauhan, Renu Popli, and Isha Kansal provides a detailed study on
various fake media detection techniques, ranging from traditional image forensics to
modern deep learning approaches. Key highlights include:
• Problem Statement:
o With the rapid advancement of AI, especially GANs, the creation of fake videos and
images has become alarmingly easy and realistic.
o There is a lack of unified frameworks and models that can effectively identify all
types of fake media across platforms and formats.
• Proposed System: The paper categorizes various fake image and video detection
techniques into machine learning-based, deep learning-based, and hybrid approaches. It
emphasizes how CNNs, RNNs, GAN-based detectors, and ensemble models play a critical
role in detecting digital forgeries. Instead of proposing one specific model, it offers a broad
comparative review across algorithmic types and application scenarios.
• Implementation Details:
o Techniques are evaluated over datasets like FaceForensics++, Celeb-DF, and
DFDC.
o Covers preprocessing steps like face alignment, frame extraction, and data
normalization.
o Highlights feature-based approaches (e.g., eye-blink detection, head pose
inconsistencies) and deep feature learning.
o Comparative insights are drawn between handcrafted and learned feature
representations.
• Technologies Used:
o Traditional ML models: SVM, KNN, Decision Tree
o Deep Learning models: CNN, RNN, Autoencoders, GAN-based detectors
o Tools: OpenCV, TensorFlow, Keras
o Datasets: FaceForensics++, DFDC, CelebDF, FakeApp-generated sets
• Accuracy: Deep learning models like XceptionNet and hybrid CNNs demonstrate better
accuracy (~90–98%) on high-quality data. Traditional ML techniques underperform on
complex manipulations but are still useful in constrained settings.


• Advantages:
o Offers a consolidated comparison of detection techniques and their trade-offs.
o Bridges the understanding between traditional media forensics and modern AI-
driven approaches.
• Conclusion: The study concludes that no single model or method can universally detect all
types of fake content. A combination of deep learning, statistical analysis, and multi-modal
feature extraction is essential. It calls for more diverse datasets, real-time testing
environments, and adaptive models to tackle the fast-evolving landscape of synthetic media.

2.27. The paper “Contemporary Cybersecurity Challenges in Metaverse Using Artificial
Intelligence” presented at the 2022 IEEE CSCI Conference explores how AI can address
emerging threats like deepfakes in virtual and immersive environments. Key highlights
include:
emerging threats like deepfakes in virtual and immersive environments. Key highlights
include:
• Problem Statement: With the rise of the metaverse, users are vulnerable to identity theft
and impersonation using realistic deepfake avatars and voices. Existing cybersecurity
models are insufficient to detect AI-generated identities in real-time virtual interactions.
• Proposed System: The paper suggests incorporating AI-driven verification systems within
the metaverse to authenticate avatars and detect potential deepfakes. Emphasis is placed on
anomaly detection, behavioral tracking, and facial data authentication using AI/ML tools.
• Implementation Details:
o The approach is conceptual, but outlines how machine learning models can monitor
avatar behaviors, facial movements, and voice patterns.
o Discusses using real-time surveillance modules for avatar-based systems.
o Proposes AI integration at the platform level to proactively detect suspicious activity
in metaverse spaces.
• Technologies Used:
o AI classifiers for behavioral anomaly detection
o Face and voice recognition algorithms
o Metaverse platforms, VR environments
o Datasets: Not specified (theoretical)
• Accuracy: As this is a conceptual framework, numerical accuracy is not evaluated.
However, AI systems are claimed to be effective in simulating and detecting synthetic
behavior in real-time.

• Advantages:
o Raises awareness of the future impact of deepfakes in emerging platforms like the
metaverse.
o Encourages the use of AI not only for security but also for trust-building and identity
protection.
• Conclusion: The paper concludes that AI must play a central role in metaverse security to
mitigate deepfake threats. Embedding intelligent detection tools within virtual systems is
essential for maintaining trust, user safety, and the integrity of digital identities.

2.28. The paper “Deepfake Disasters: A Comprehensive Review of Technology, Ethical Concerns, Countermeasures, and Societal Implications” by Devendra Chapagain,
Naresh Kshetri, and Bindu Aryal presents a broad and interdisciplinary review of deepfake
technology and its impact. Key highlights include:
• Problem Statement: Deepfakes now pose serious risks such as misinformation and
blackmail; their rapid growth threatens society while legal and ethical safeguards lag behind.
• Proposed System: The paper doesn’t propose a technical detection model but instead offers
a layered framework that addresses deepfakes from technological, ethical, social, and legal
perspectives. Recommends combining detection tools with legal policies and digital
awareness campaigns to curb the threat.
• Implementation Details:
o Analyzes over 32 research articles, grouping them by technology, detection technique,
societal impact, and countermeasure.
o Evaluates use cases such as revenge porn, fake news, and impersonation scams.

o Recommends industry standards and community-based flagging systems for digital content.
• Technologies Used:
o Detection techniques covered: CNNs, GAN-based classifiers, watermarking,
blockchain verification, digital forensics.
o Tools: Video analysis platforms, facial detection, social media monitoring tools.
o Datasets: References known datasets like FaceForensics++, Celeb-DF, but doesn’t
implement them.
• Accuracy: The paper does not benchmark detection models but highlights that current tools
show high false positives and struggle with realistic deepfakes.

• Advantages:
o Offers a holistic understanding of deepfakes beyond just detection.
o Bridges technology with ethics, law, media, and psychology.
• Conclusion: The paper concludes that combating deepfakes requires more than technical
solutions. It advocates for multi-pronged efforts combining detection technology, strict
regulation, media awareness, and ethical design to create safe digital ecosystems.

2.29. The paper “DeepFake Video Detection” by Abdelrahman M. Saber, Mohamed T. Hassan, and others investigates deepfake face-swapping detection using a combination of
spatial and temporal features. Key highlights include:
• Problem Statement:
o Traditional models focus mainly on visual inconsistencies but often overlook
temporal cues in videos, which are crucial for spotting deepfakes.
o There is a need for models that can process both appearance and motion-based
features in a unified pipeline.
• Proposed System: A hybrid detection model combining EfficientNet-B5 (for spatial
features) and Bi-LSTM (for temporal inconsistencies) is proposed.
This system leverages CNNs to capture facial textures and LSTMs to identify unusual frame
transitions or movements.
• Implementation Details:
o Uses datasets such as FaceForensics++ and CelebDF for training.
o Each video is split into frames, and facial regions are cropped and resized.
o Spatial features are extracted using EfficientNet and passed to a Bi-LSTM layer to
capture frame-level dependencies.
o The system is trained using standard backpropagation and optimized using Adam
optimizer.
• Technologies Used:
o EfficientNet-B5 CNN model
o Bi-directional Long Short-Term Memory (Bi-LSTM)
o TensorFlow/Keras
o Python and OpenCV for video processing
• Accuracy:
o Accuracy: 89.38%
o AUROC: 89.35%

o F1-score: 84.23%
• Advantages:
o Incorporates both spatial and temporal learning for more reliable detection.
o Performs well across multiple datasets.
o Flexible design — can be extended to other biometric modalities.
• Conclusion: The paper concludes that combining CNN and LSTM improves detection
accuracy and robustness. However, the model is computationally intensive, and future work
should focus on optimizing it for real-time and mobile environments.
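The following is a minimal, illustrative sketch of such a spatial-temporal pipeline (a frozen CNN backbone applied per frame, followed by a Bi-LSTM classifier head). It is an assumption-based outline in Python/Keras, not the authors' exact implementation; EfficientNetB0 is used instead of B5 only to keep the example small, and the sequence length is an assumed 10 face crops per video.

# Sketch of a CNN + Bi-LSTM deepfake classifier (illustrative, not the paper's code)
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, IMG_SIZE = 10, 224  # assumed: 10 face crops per video, 224x224 RGB

# Frozen CNN backbone extracts one feature vector per frame
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(IMG_SIZE, IMG_SIZE, 3))
backbone.trainable = False

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, IMG_SIZE, IMG_SIZE, 3)),
    layers.TimeDistributed(backbone),        # per-frame spatial features
    layers.Bidirectional(layers.LSTM(128)),  # temporal inconsistencies across frames
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),   # real (0) vs fake (1)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC()])
model.summary()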

2.30. The paper “Unmasking the Illusions: A Comprehensive Study on Deepfake Videos
and Images” by Ravikant Ranout and Prof. CRS Kumar examines both the creation and
detection of deepfakes, focusing on the evolution of detection techniques. Key highlights
include:
• Problem Statement:
o Deepfake generation techniques have outpaced detection methods, making it harder
for existing tools to keep up.
o A dual perspective is needed to understand both how deepfakes are made and how
they can be countered effectively.
• Proposed System: The paper reviews traditional and modern detection strategies including
forensic fingerprinting, CNN-based models, and attention-based architectures. It also
highlights future directions such as using blockchain and watermarking for validation.
• Implementation Details:
o Analyzes how early detection was done using cues like eye-blink rate, facial jitter,
and inconsistent lighting.
o Reviews recent deep learning approaches using XceptionNet, Capsule Networks,
and GAN discriminators.
o Discusses role of datasets like FaceForensics++, DFDC, and their influence on
model development.
• Technologies Used:
o CNNs, RNNs, Capsule Networks, Attention Mechanisms.
o Autoencoders, GAN discriminators.
o Python, TensorFlow, Deep Learning libraries.

• Accuracy: As a survey paper, it does not report specific accuracy values but identifies
XceptionNet and CapsuleNet as among the most effective models in reviewed studies.
• Advantages:
o Comprehensive summary of creation and detection mechanisms
o Bridges the technical and societal perspective on deepfakes
• Conclusion: The paper concludes that a multi-modal approach combining video, audio,
metadata, and user behavior is essential for future-proof deepfake detection.

MEMBER 4: Tejas J – Literature Survey

2.31. The paper “A Heterogeneous Feature Ensemble Learning based Deepfake Detection
Method” by Jixin Zhang, Ke Cheng, Giuliano Sovernigo, and Xiaodong Lin presents a novel
method combining multiple feature types to improve the robustness and accuracy of deepfake
image detection. Key highlights include:
• Problem Statement: Deepfake detectors often perform poorly when tested on fake content
generated by models different from the training set due to a lack of generalization.
• Proposed System: The authors propose an ensemble learning model integrating gray
gradient, spectrum, and texture features, which are flattened and input into a back-
propagation neural network classifier.
• Implementation Details:
o Three features: grey gradients (facial landmarks), co-occurrence matrix (texture), and
spectrum are extracted.
o A flattening process is used to unify heterogeneous features.
o These features are used in a back-propagation neural network for training and
classification.
• Technologies Used:
o Feature Engineering: Gray Gradient Histogram, Co-occurrence Matrix, Spectral
Features
o Model: Back propagation Neural Network
• Accuracy: Achieved detection accuracy of 97.04%, outperforming state-of-the-art
detectors on cross-model evaluations.
• Advantages:
o Good generalization across unknown deepfake models
o Effective feature combination strategy
o Improved accuracy and robustness
• Conclusion: The paper demonstrates that heterogeneous feature ensemble learning
significantly boosts the robustness and accuracy of deepfake detectors against unseen
manipulations.
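As an illustration of the heterogeneous-feature idea summarized above, the sketch below extracts a gray-gradient histogram, co-occurrence (GLCM) texture statistics, and a frequency-spectrum profile from a grayscale face crop and feeds the concatenated vector to a back-propagation (MLP) classifier. Bin counts, feature sizes, and classifier settings are assumptions for illustration, not values from the paper.

# Heterogeneous handcrafted features + back-propagation classifier (illustrative sketch)
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.neural_network import MLPClassifier

def extract_features(gray_face: np.ndarray) -> np.ndarray:
    """gray_face: 2-D uint8 face crop (e.g., 128x128)."""
    # 1. Gray-gradient histogram (magnitude of Sobel gradients)
    gx = cv2.Sobel(gray_face, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray_face, cv2.CV_32F, 0, 1)
    grad_hist, _ = np.histogram(cv2.magnitude(gx, gy), bins=32, range=(0, 255))
    # 2. Texture statistics from the gray-level co-occurrence matrix
    glcm = graycomatrix(gray_face, distances=[1], angles=[0, np.pi / 2], levels=256)
    texture = np.hstack([graycoprops(glcm, p).ravel()
                         for p in ("contrast", "homogeneity", "energy", "correlation")])
    # 3. Spectrum profile from the 2-D FFT magnitude
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray_face)))
    spec_profile = spectrum.mean(axis=0)[:64]
    # Flatten and concatenate the three heterogeneous feature groups
    return np.hstack([grad_hist, texture, spec_profile]).astype(np.float32)

# X: stacked feature vectors, y: labels (0 = real, 1 = fake) -- placeholders here
clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300)
# clf.fit(X_train, y_train); clf.score(X_test, y_test)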


2.32. The paper “Deepfake Detection in Videos and Picture: Analysis of Deep Learning
Models and Dataset” by Surendra Singh Chauhan, Nitin Jain, Satish Chandra Pandey, and
Aakash Chabaque presents a comparative analysis of deepfake detection methods and datasets.
Key highlights include:
• Problem Statement: Rising misuse of easily accessible deepfake tools necessitates better
detection systems that can keep up with evolving manipulation techniques.
• Proposed System: The authors analyze different deepfake detection models (CNN, RNN,
hybrid), datasets (FaceForensics++, DFDC), and manipulation types (lip-sync, face swap,
puppeteering).
• Implementation Details:
o Overview of models like CNN, LSTM, and DenseNet with recurrent convolutional
components.
o Used pre-existing datasets for training and evaluation (FaceForensics++, DFDC).
o Compared methods based on detection capabilities for various manipulation types.
• Technologies Used:
o Models: DenseNet + GRU, CNN + LSTM
o Libraries: TensorFlow, OpenCV, Python
o Datasets: FaceForensics++, DFDC
• Accuracy: Methods discussed achieved promising results with temporal-aware models
outperforming frame-wise models in many cases.
• Advantages:
o Comprehensive coverage of techniques
o Identifies limitations in dataset diversity
o Emphasizes need for multi-modal analysis
• Conclusion: The paper underscores the importance of hybrid and temporally aware models
and datasets with broader manipulation coverage for reliable deepfake detection.

2.33. The paper “Deepfake Detection Using Deep Learning” by Prakash Raj S, Pravin D,
Sabareeswaran G, Sanjith R K, and Gomathi B presents an application-centric study on using
modern deep learning architectures to tackle deepfake media. Key highlights include:
• Problem Statement: Traditional approaches fall short against sophisticated deepfake
techniques due to lack of contextual understanding and robustness.
• Proposed System: A study of CNN-based and hybrid (CNN + RNN) models for detecting
deepfake video and image content with focus on robustness and real-time application.

• Implementation Details:
o Preprocessing of video/image data for model ingestion.
o Model architectures include CNN variants (VGG, ResNet) and hybrid with LSTM
layers.
o Evaluation based on metrics like accuracy and F1-score.
• Technologies Used:
o Tools: Python, TensorFlow, Keras
o Architectures: CNN, ResNet, VGG16, LSTM
• Accuracy: The hybrid models demonstrated improved detection with temporal consistency
across frames.
• Advantages:
o Enhanced performance using RNNs on video sequences
o Effective for both static and dynamic deepfakes
• Conclusion: The authors affirm that combining spatial and temporal features using deep
learning significantly strengthens deepfake detection systems.

2.34. The paper “Detecting Deepfakes: Training Adversarial Detectors with GANs for
Image Authentication” presents a generative adversarial training approach for improving fake
image detection. Key highlights include:
• Problem Statement: Existing detectors are vulnerable to adversarial examples and fail to
generalize across manipulation techniques.
• Proposed System: A system where GANs are used to simulate adversarial attacks on
detectors, thereby improving their robustness and adaptability.
• Implementation Details:
o Adversarial examples generated using GAN variants.
o Fine-tuning traditional classifiers with adversarial training.
o Evaluation based on classification robustness.
• Technologies Used:
o GANs, TensorFlow, Custom CNNs
• Accuracy: Improved generalization on unseen manipulations compared to non-adversarial
baselines.
• Advantages:
o Boosts robustness against adversarial attacks
o Prepares models for real-world conditions

• Conclusion: The paper validates adversarial training as a potent strategy for developing
resilient deepfake detectors capable of withstanding attacks and unseen variations.

2.35. The paper “Detection of Deepfakes: Protecting Images and Videos Against Deepfake”
by Ying Tian, Wang Zhou, and Amin Ul Haq presents an overview of deepfake generation and
CNN-based detection techniques, with attention to societal implications. Key highlights
include:
• Problem Statement: The increasing misuse of deepfake technology poses threats to
security, privacy, and social trust.
• Proposed System: The authors describe the use of CNN architectures for detecting
manipulated images and videos, analyzing facial attributes and inconsistencies.
• Implementation Details:
o Deepfake videos analyzed for facial distortions, light mismatches, and temporal
inconsistencies.
o Model trained on facial landmarks, pixel inconsistencies, and edge information
• Technologies Used:
o CNN, GANs
o Tools: TensorFlow, OpenCV
• Accuracy: Empirical results suggest robust performance for basic CNN configurations on
typical datasets.
• Advantages:
o Ease of implementation
o Good performance on synthetic datasets
• Conclusion: The authors highlight the importance of continuous improvement in detection
tools to meet evolving threats posed by deepfake technologies.

2.36. The paper “Div-Df: A Diverse Manipulation Deepfake Video Dataset” by Deepak
Dagar and Dinesh Kumar Vishwakarma introduces a novel dataset for benchmarking deepfake
detection. Key highlights include:
• Problem Statement: Existing datasets are biased toward face-swap manipulation and do
not represent diverse real-world deepfakes.
• Proposed System: The authors present Div-DF, a dataset containing 250 manipulated
videos using face-swap, lip-sync, and facial reenactment, along with 150 real videos.

• Implementation Details:
o Fake videos created using FaceSwap-GAN and Wav2Lip methods.
o Includes diverse content: different identities, lighting, professions, and expressions.
• Technologies Used:
o Deepfake Synthesis Tools: FSGAN, Wav2Lip
o Evaluation Models: CNNs, Vision Transformers
• Accuracy: Detection models performed significantly lower on Div-DF than on
conventional datasets, revealing its challenge.
• Advantages:
o Improves generalization benchmarking
o Highlights real-world diversity issues
• Conclusion: Div-DF provides a richer and more realistic benchmark for testing and
improving deepfake detection systems.

2.37. The paper “Model Attribution of Face-Swap Deepfake Videos” by Shan Jia, Xin Li,
and Siwei Lyu presents a novel method to identify the specific model used to generate a
deepfake. Key highlights include:
• Problem Statement: Most studies focus on detecting whether a video is fake, but forensic
attribution to the generating model is lacking.
• Proposed System: The authors frame attribution as a multi-class classification task using a
new dataset (DFDM) containing videos from five different autoencoder models.
• Implementation Details:
o Spatial and temporal attention mechanism (DMA-STA) for feature extraction.
o Trained on 6450 Deepfake videos from different encoder-decoder settings.
• Technologies Used:
o Vision Transformers with Attention Modules
o DFDM Dataset
• Accuracy: Achieved over 70% attribution accuracy on high-quality deepfakes.
• Advantages:
o Supports forensic tracking
o Discriminates subtle visual differences
• Conclusion: The method introduces a new dimension to deepfake analysis by enabling
traceability of generation tools.

2.38. The paper “Review: DeepFake Detection Techniques using Deep Neural Networks
(DNN)” by Harsh Chotaliya, Mohammed Adil Khatri, Shubham Kanojiya, and Mandar
Bivalkar provides a comparative analysis of various deepfake detection methods using DNNs.
Key highlights include:
• Problem Statement: Deepfake detection methods struggle with generalization and
robustness against evolving manipulations.
• Proposed System: Comparison of CNN models (VGG16, ResNet, EfficientNet) and hybrid
models (CNN + LSTM) on benchmark datasets.
• Implementation Details:
o Model evaluation on FaceForensics++ and custom test sets.
o Analysis based on accuracy, AUC, and processing time.
• Technologies Used:
o DNN Frameworks: TensorFlow, Keras
o Models: VGG16, ResNet, EfficientNet, LSTM
• Accuracy: CNN + LSTM models showed improved performance on temporal datasets.
• Advantages:
o Wide model comparison
o Identification of pros and cons for each architecture
• Conclusion: The paper offers practical insights into selecting and improving deepfake
detection models using deep learning.

2.39. The paper “Deepfake Generation and Detection: A Survey” by Tao Zhang provides an
extensive review of deepfake synthesis and detection methods, highlighting existing challenges
and future research directions. Key highlights include:
• Problem Statement: Deepfake technology poses severe societal risks due to its realism and
accessibility.
• Proposed System: Surveys both generation (face-swap, reenactment) and detection
methods (CNN, RNN, GAN-based, multimodal).
• Implementation Details:
o Compares over 20 generation and detection approaches.
o Identifies gaps in dataset diversity and model robustness.
• Technologies Used:
o DNNs, GANs, CNNs, RNNs, LSTMs, SVMs
• Accuracy: Varies widely across techniques and datasets.

• Advantages:
o Comprehensive landscape overview
o Highlights real-world threats and limitations
• Conclusion: Emphasizes the need for scalable, efficient, and generalizable detection
systems with robust benchmarking datasets.

2.40. The paper “Deep Fake in Picture Using Convolutional Neural Network” by Dr. J.N. Singh,
Ashutosh Gautam, and Harsh Tomar presents a deep learning approach for detecting fake
images using CNNs. Key highlights include:
• Problem Statement: With the rise of realistic image manipulation through deepfake tools,
it becomes critical to develop reliable methods to identify fake images generated through
AI.
• Proposed System: The authors develop and train a CNN-based model to differentiate
between authentic and manipulated images, leveraging the MesoNet architecture for
enhanced feature extraction.
• Implementation Details:
o Dataset: Real and fake images sourced from Kaggle.
o Model: Custom CNN using four convolutional layers, batch normalization, max
pooling, and dense layers.
o Script Flow: Data is loaded via Google Drive, images are passed through the
network using a generator, and predictions are evaluated visually.
• Technologies Used:
o Language: Python
o Frameworks: Custom MesoNet implementation, Matplotlib for visualization
o Tools: Google Colab, Image datasets from Kaggle
• Accuracy: While exact metrics are not reported, the model demonstrated strong
performance, particularly on compressed and altered inputs where traditional systems fail.
• Advantages:
o Strong image-based detection performance
o Lightweight architecture suitable for mid-tier systems (4GB RAM, 128GB SSD)
o Practical against real-world noise and compression artifacts
• Conclusion: The CNN-based approach using MesoNet provides a reliable means to detect
fake images. The authors recommend further enhancements for broader dataset diversity
and cross-domain generalization.
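A minimal MesoNet-style model of the kind described above can be sketched as follows. The layer sizes follow the publicly known Meso-4 design and are assumptions rather than the authors' exact configuration.

# Meso-4-style CNN for image-level fake detection (illustrative sketch)
import tensorflow as tf
from tensorflow.keras import layers, models

def build_meso4(input_shape=(256, 256, 3)) -> tf.keras.Model:
    m = models.Sequential([layers.Input(shape=input_shape)])
    # Four conv blocks: convolution + batch normalization + max pooling
    for filters, kernel, pool in [(8, 3, 2), (8, 5, 2), (16, 5, 2), (16, 5, 4)]:
        m.add(layers.Conv2D(filters, kernel, padding="same", activation="relu"))
        m.add(layers.BatchNormalization())
        m.add(layers.MaxPooling2D(pool_size=pool, padding="same"))
    m.add(layers.Flatten())
    m.add(layers.Dropout(0.5))
    m.add(layers.Dense(16))
    m.add(layers.LeakyReLU(0.1))
    m.add(layers.Dropout(0.5))
    m.add(layers.Dense(1, activation="sigmoid"))  # real vs fake
    return m

model = build_meso4()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])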

CHAPTER 3

PROBLEM IDENTIFICATION
3.1 Problem statement

Despite significant advancements in deepfake detection through spatial, temporal, and hybrid
models, existing systems struggle to operate effectively in dynamic real-world environments
such as social media, legal platforms, or crisis communication channels. Current approaches
often lack adaptability to low-resolution, user-uploaded content, overlook contextual metadata
anomalies, and fail to provide transparent, explainable outputs. This gap not only limits
detection accuracy in uncontrolled settings but also undermines public trust and decision-
making, especially during critical events where manipulated media can have serious ethical,
social, and legal consequences.

3.2 Project scope

The scope of this project includes the research and evaluation of various machine learning and
deep learning techniques for deepfake detection. It covers a comprehensive review of existing
models, ranging from traditional classifiers like SVMs to advanced CNN-based and hybrid
architectures such as EfficientNet, InceptionNet, and ensemble methods. The objective is to
analyze their performance, generalizability, computational efficiency, and applicability in real-
world scenarios. Key limitations such as poor cross-dataset performance, high resource
demands, and lack of real-time processing will be identified to inform the design of a more
robust detection methodology.

In addition to technical improvements, the project also addresses ethical and societal
considerations. The study involves experimentation using benchmark datasets like
FaceForensics++, DFDC, and Celeb-DF, with evaluation metrics including accuracy,
precision, recall, and generalization. Ultimately, the project aims to develop a scalable,
adaptive, and trustworthy deepfake detection system suitable for applications in legal forensics,
media verification, and combating misinformation.

CHAPTER 4

GOALS AND OBJECTIVES


4.1 Project Goals

The primary goal of this project is to explore and analyze various machine learning and deep
learning models for the detection of deepfake videos and images. The project aims to develop
a deeper understanding of how existing detection systems perform, especially across different
datasets, manipulation types, and model architectures.

Another major goal is to identify and implement an approach that offers improved accuracy,
better generalizability across datasets, and potential for real-time applicability. The ultimate
aim is to contribute toward building a reliable and scalable deepfake detection system that can
assist in real-world scenarios such as media verification, digital forensics, and online content
moderation.

4.2 Project Objectives


• To develop a robust deepfake detection system that combines spatial and temporal features
using architectures like EfficientNet and Bi-LSTM to effectively analyze user-generated
video content.
• To incorporate contextual cues—such as metadata inconsistencies, source credibility, and
upload behavior—into the detection pipeline to enhance reliability in real-world scenarios.
• To implement an explainability module (e.g., Grad-CAM) that visually highlights
manipulated regions, supporting transparency and interpretability for users, moderators, and
forensic experts.
• To design the system for deployment in varied environments such as social media,
journalism, law enforcement, and public awareness platforms, ensuring it adapts to diverse
conditions including low bandwidth, multilingual contexts, and high-stakes information
flows.
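In support of the explainability objective above, the following is a hedged Grad-CAM sketch in TensorFlow/Keras: given a trained CNN and a preprocessed face crop, it returns a normalized heatmap over the regions that most influenced the "fake" score. The convolutional layer name (e.g., "top_conv") and the model variable are placeholders that depend on the chosen backbone; this is an outline of the planned module, not a finished implementation.

# Grad-CAM sketch for highlighting manipulated regions (assumed names, illustrative only)
import numpy as np
import tensorflow as tf

def grad_cam(model: tf.keras.Model, image: np.ndarray, conv_layer_name: str) -> np.ndarray:
    """image: (H, W, 3) preprocessed array; returns an (h, w) heatmap in [0, 1]."""
    grad_model = tf.keras.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        score = preds[:, 0]                       # probability of the "fake" class
    grads = tape.gradient(score, conv_out)        # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))  # global-average-pool the gradients
    cam = tf.reduce_sum(conv_out[0] * weights[0], axis=-1)
    cam = tf.nn.relu(cam)                         # keep only positive evidence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

# Example (assumed layer name): heatmap = grad_cam(model, face_crop, "top_conv")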

CHAPTER 5

SYSTEM REQUIREMENT SPECIFICATION


The successful implementation of a deepfake detection system requires specific hardware and
software configurations. This chapter outlines the necessary tools, libraries, and system
capabilities required for building, training, and evaluating machine learning and deep learning
models as part of this project.

5.1 Software Requirements

o The project will be developed using the Python 3.x programming language due to its
strong support for AI/ML libraries.
o TensorFlow and Keras will be the primary deep learning frameworks used for
building and training CNN models.
o Scikit-learn will be used for implementing traditional machine learning algorithms
like SVM, Random Forest, etc.
o OpenCV will be used for video frame extraction and face detection.
o NumPy, Pandas, and Matplotlib will be used for data manipulation and visualization.
o Development will be done using Jupyter Notebook, Google Colab, or VS Code.
o Seaborn will be optionally used for advanced plotting and visual analytics.
o Dataset integration will be done using tools/APIs for FaceForensics++, DFDC, and
Celeb-DF.
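A small, optional environment check (assuming the libraries listed above are installed via pip) can confirm the software stack and GPU visibility before any training begins:

# Sanity check of the assumed software stack (illustrative)
import cv2, numpy, pandas, sklearn, tensorflow as tf

for name, mod in [("OpenCV", cv2), ("NumPy", numpy), ("Pandas", pandas),
                  ("scikit-learn", sklearn), ("TensorFlow", tf)]:
    print(f"{name}: {mod.__version__}")
print("GPUs visible:", tf.config.list_physical_devices("GPU"))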

5.2 Hardware Requirements

o A system with a minimum Intel i5 or AMD Ryzen 5 processor is recommended.


o 8 GB RAM is required; 16 GB or more is preferred for efficient training of deep
learning models.
o A minimum of 250 GB of storage is required, preferably SSD for faster data access.
o A GPU-enabled system (such as NVIDIA GTX 1650 or higher) is recommended for
faster training and testing.
o Standard HD display, keyboard, mouse, and internet access are needed for
development and dataset downloads.

CHAPTER 6

METHODOLOGY
The methodology for this project outlines the systematic approach followed to study, analyze,
and implement deepfake detection models. It is divided into multiple phases, each contributing
to the successful development and evaluation of the detection system.

6.1 Literature Survey


o A thorough literature review was conducted to understand existing deepfake detection
methods using machine learning and deep learning.
o Five research papers were selected that utilized different models like SVM, CNN,
EfficientNet, InceptionNet, and hybrid CNN approaches.
o Each paper was analyzed for its architecture, dataset, methodology, accuracy,
limitations, and future potential.

6.2 Dataset Collection and Preprocessing


o Publicly available datasets such as FaceForensics++, DFDC, and Celeb-DF are used
for experimentation.
o Videos from these datasets are converted into individual frames using OpenCV.
o Face detection is applied using Haar cascades or MTCNN, and detected faces are
cropped and resized (e.g., 224x224 pixels).
o All images are normalized and labeled as real or fake for training and testing
purposes.

6.3 Feature Extraction


o In ML models, Principal Component Analysis (PCA) is used for dimensionality
reduction.
o In DL models, Convolutional Neural Networks (CNNs) automatically learn and
extract features from facial regions.
o Hybrid CNN models use multi-layer feature fusion for better representation of both
low-level and high-level image details.
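For the traditional ML pipeline, the PCA step can be sketched as below; the number of components (100) is an assumption for illustration.

# PCA-based dimensionality reduction of flattened face crops (sketch)
import numpy as np
from sklearn.decomposition import PCA

# X_faces: array of shape (n_samples, 224, 224, 3) from the preprocessing step above
def pca_features(X_faces: np.ndarray, n_components: int = 100):
    X_flat = X_faces.reshape(len(X_faces), -1)   # flatten each face crop to a vector
    pca = PCA(n_components=n_components, whiten=True)
    X_reduced = pca.fit_transform(X_flat)        # fit on training data only
    return X_reduced, pca                        # reuse pca.transform on test data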


6.4 Model Training and Evaluation


o Machine learning models like SVM, Decision Tree, and Random Forest are trained
using Scikit-learn.
o Deep learning models such as EfficientNet, InceptionNet, and hybrid CNNs are
implemented using TensorFlow/Keras.
o Training is performed using an 80:20 train-test split. Evaluation metrics include accuracy,
precision, recall, F1-score, and confusion matrix to analyze model performance.
o Cross-dataset testing is performed to measure generalization ability.
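A condensed sketch of the machine learning baseline in this phase (80:20 split, an SVM on PCA features, and the listed metrics) is shown below; variable names carry over from the earlier sketches and are assumptions, and the deep learning models are trained analogously with TensorFlow/Keras.

# Training and evaluation of the ML baseline (sketch; X_reduced and y are assumed inputs)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.2, stratify=y, random_state=42)  # 80:20 split

clf = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1-score :", f1_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))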

6.5 Result Analysis

o Each model’s performance is recorded and compared based on detection accuracy and
computational efficiency.
o Deep learning models generally outperform ML models in terms of accuracy but
require more training time and computational resources.
o Cross-dataset performance is analyzed to determine how well models generalize to
unseen datasets.

6.6 Documentation and Reporting

o Results, observations, and limitations are documented.


o Visualizations such as bar charts, ROC curves, and confusion matrices are generated.
o A comparative summary is prepared based on literature and implementation outcomes.
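The ROC curve and confusion-matrix visualizations mentioned above can be produced with scikit-learn and Matplotlib as in the following sketch; the variable names are assumptions carried over from the earlier training sketch.

# ROC curve and confusion-matrix visualization for one trained model (sketch)
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc, ConfusionMatrixDisplay, confusion_matrix

y_scores = clf.predict_proba(X_test)[:, 1]          # probability of the "fake" class
fpr, tpr, _ = roc_curve(y_test, y_scores)
plt.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.3f}")
plt.plot([0, 1], [0, 1], linestyle="--")
plt.xlabel("False positive rate"); plt.ylabel("True positive rate")
plt.legend(); plt.title("ROC curve"); plt.show()

ConfusionMatrixDisplay(confusion_matrix(y_test, clf.predict(X_test)),
                       display_labels=["Real", "Fake"]).plot()
plt.show()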


CHAPTER 7

APPLICATIONS
Deepfake detection systems have wide-ranging applications across several domains due to the
rising concern over the misuse of manipulated media. As synthetic media becomes increasingly
realistic and accessible, the need for automated and accurate detection systems has become
critical. The following are some key application areas where deepfake detection plays a vital
role:

7.1 Social Media Monitoring


o Platforms like Facebook, Instagram, YouTube, and X (formerly Twitter) face constant
threats from fake videos going viral.
o Deepfake detection models can be integrated into content moderation pipelines to
automatically flag or remove suspicious media before it spreads misinformation.

7.2 Digital Forensics and Cybersecurity


o In cybercrime investigations, detecting manipulated audio-visual evidence is essential.
o Law enforcement agencies can use deepfake detection tools to verify the authenticity
of video evidence, especially in criminal cases involving identity impersonation.

7.3 Journalism and Media Verification


o News outlets and fact-checking organizations can deploy deepfake detectors to ensure
the credibility of visual content before publication or broadcast.
o This helps fight against fake news and politically motivated disinformation.

7.4 Financial and Legal Sectors


o In online banking, virtual KYC (Know Your Customer), or digital signatures,
deepfake videos can be used for fraudulent identity verification.
o Detection systems can protect institutions by flagging forged video identities in legal
or financial processes.


REFERENCES
1. S. Patel, S. K. Chandra, and A. Jain, “DeepFake Videos Detection and Classification Using
ResNeXt and LSTM Neural Network,” 2023 3rd Int. Conf. Smart Gen. Comput., Commun.
Netw. (SMART GENCON), IEEE, 2023.
2. F. Desta and E. J. Brown, “Facial Recognition for Deepfake Detection,” 2022 IEEE
Integrated STEM Educ. Conf. (ISEC), pp. 364–367, IEEE, 2022.
3. J. W. Seow et al., “A Comprehensive Overview of Deepfake: Generation, Detection,
Datasets, and Opportunities,” Neurocomputing, vol. 513, pp. 351–371, Elsevier, 2022.
4. A. Trabelsi, M. M. Pic, and J.-L. Dugelay, “Improving Deepfake Detection by Mixing Top
Solutions of the DFDC,” 2022 Eur. Signal Process. Conf. (EUSIPCO), IEEE, 2022.
5. S. Tandon et al., “Real-Time Face Transition Using Deepfake Technology (GAN Model),”
2023 Int. Conf. RAEEUCCI, IEEE, 2023.
6. Z. Jin, L. Lang, and B. Leng, “Wave-Spectrogram Cross-Modal Aggregation for Audio
Deepfake Detection,” 2025 IEEE ICASSP, pp. 1–5, 2025.
7. Anonymous, “Deepfake Detection using Inception-ResNetV2,” 2021 Int. Conf. ACFCFT,
IEEE, 2021.
8. P. M. Thuan, B. T. Lam, and P. D. Trung, “Spatial Vision Transformer: A Novel Approach
to Deepfake Video Detection,” Proc. VCRIS, IEEE, 2024, pp. 1–10.
9. S. Yadav et al., “Robust and Generalized DeepFake Detection,” Proc. ICCCNT, IEEE,
2022, pp. 1–8.
10. P. Karthik et al., “DeepFake Videos Detection and Classification Using ResNeXt and
LSTM Neural Network,” 2023 Int. Conf. SMART GENCON, IEEE, 2023.
11. T. M. Wani et al., “ABC-CapsNet: Attention-based Cascaded Capsule Network for Audio
Deepfake Detection,” CVPRW, pp. 2464–2473, IEEE, 2024.
12. R. Anagha et al., “Audio Deepfake Detection Using Deep Learning,” Proc. SMART–2023,
IEEE, 2023.
13. Garde, S. Suratkar, and F. Kazi, “AI-Based Deepfake Detection,” 2022 1st Int. Conf. DDS,
IEEE, 2022.
14. A. Mehra et al., “Motion Magnified 3-D Residual-in-Dense Network for DeepFake
Detection,” IEEE Trans. Biometrics, Behav. Identity Sci., vol. 5, no. 1, pp. 39–52, Jan.
2023.
15. Y. Xie et al., “Domain Generalization via Aggregation and Separation for Audio Deepfake
Detection,” IEEE Trans. Inf. Forensics Secur., vol. 19, pp. 344–356, 2024.

16. J. John and B. V. Sherif, “Comparative Analysis on Different DeepFake Detection Methods,” 2022 Int. Conf. I-SMAC, IEEE, 2022.
17. M. S. Rana et al., “Deepfake Detection: A Systematic Literature Review,” IEEE Access,
vol. 10, pp. 25494–25521, Mar. 2022.
18. D. Garg and R. Gill, “Deepfake Generation and Detection – An Exploratory Study,” 2023
IEEE UPCON, IEEE, 2023.
19. K. S. Keerthana et al., “Detecting Deepfake Videos Using LSTM and ResNet,” 2024 Int.
Conf. IC3IoT, IEEE, pp. 1–6, 2024.
20. M. A. Khder et al., “AI into Multimedia Deepfakes Creation and Detection,” 2022 IEEE
ITIKD, IEEE, 2022.
21. K. Kavitha and S. Samundeswari, “Missing children face identification using deep
learning,” Proc. 3rd ICPCSN, 2023, pp. 1–5, doi: 10.1109/ICPCSN58827.2023.00013.
22. S. Suvarna et al., “Harnessing Deep Learning for Missing Child Identification,” Proc. 2nd
ICAIT, 2024, pp. 1–5, doi: 10.1109/ICAIT61638.2024.10690354.
23. B. Sridhar et al., “Missing Children Identification Using Face Recognition,” Proc. ASSIC,
2022, pp. 1–5, doi: 10.1109/ASSIC55218.2022.10088405.
24. A. Ponmalar et al., “Finding Missing Person Using AI,” Proc. ICCPC, 2022, pp. 1–5, doi:
10.1109/ICCPC55978.2022.10072122.
25. V. Sudeksha et al., “Missing Persons Comprehensive Tracking System,” Proc. ICITEICS,
2024, pp. 1–5, doi: 10.1109/ICITEICS61368.2024.10625192.
26. J. Zhang et al., “A Heterogeneous Feature Ensemble Learning based Deepfake Detection,”
2022 IEEE ICC Workshops, pp. 613–618.
27. S. S. Chauhan et al., “Deepfake Detection in Videos and Picture,” 2022 IEEE ICDSIS,
2022.
28. P. R. S. et al., “Deepfake Detection Using Deep Learning,” 2024 ICACCS, IEEE, 2024.
29. [Author unspecified], “Detecting Deepfakes: Training Adversarial Detectors with GANs,”
[Conf./Journal unspecified].
30. Y. Tian et al., “Detection of Deepfakes: Protecting Images and Videos,” 2024
ICCWAMTIP, IEEE, 2024.
31. D. Dagar and D. K. Vishwakarma, “Div-Df: A Diverse Manipulation Deepfake Video
Dataset,” 2023 GCITC, IEEE, 2023.
32. S. Jia, X. Li, and S. Lyu, “Model Attribution of Face-Swap Deepfake Videos,” 2022 IEEE
ICIP, IEEE, 2022.
33. H. Chotaliya et al., “Review: DeepFake Detection Techniques using DNN,” 2023 ICAST, IEEE, 2023.
34. T. Zhang, “Deepfake Generation and Detection: A Survey,” Multimedia Tools Appl., vol.
81, pp. 6259–6276, Springer, 2022. doi: 10.1007/s11042-021-11733-y.
35. J. N. Singh et al., “Deep Fake in Picture Using CNN,” 2023 5th ICAC3N, IEEE, pp. 1104–
1107, 2023. doi: 10.1109/ICAC3N60023.2023.10541758.
