
DeepFake Video Detection

The document summarizes a project presentation on deepfake video detection using ResNeXt CNN and LSTM. The proposed system combines ResNeXt Convolutional Neural Networks and Long Short-Term Memory networks to leverage the strengths of both architectures and improve the accuracy of detecting manipulated videos. The ResNeXt CNN is used to extract spatial features from video frames while LSTM networks capture temporal dependencies in the data. This hybrid model allows for modeling of visual cues in multimedia content. A comprehensive dataset of deepfake videos is introduced for training and testing the model.

Uploaded by

rd723489

East West Institute of Technology
Bengaluru-560019
Department of Information Science and Engineering
A Project Work Phase 1 Presentation on
“DeepFake Video Detection using Res-NeXt CNN and LSTM”
Presented by:
Abhishek TN [1EW20IS003]
Chethan Gowda R [1EW20IS018]
Under the guidance of:
Dr. Suresh MB
Prof. and Head, Dept. of ISE, EWIT
TABLE OF CONTENTS
ABSTRACT
INTRODUCTION
LITERATURE SURVEY
EXISTING SYSTEM
PROPOSED SYSTEM
ALGORITHMS
APPLICATIONS
ABSTRACT
Deep fake technology has become a significant concern due to its potential for misinformation and deception. The proposed system
presents a novel approach for detecting deep fake videos by combining Res-NeXt Convolutional Neural Networks (CNN) with Long
Short-Term Memory (LSTM) networks. ResNeXt-50 has been used in deepfake detection by training the network on a large dataset of
real and fake videos. Here, ResNeXt-50 and LSTM are combined into a hybrid architecture for deep fake video detection, delivered as
a web framework written in Python. Combining the two helps to leverage the strengths of both architectures and improve the accuracy
of deep fake video detection, especially for videos that involve both image-based and sequential data. The Res-NeXt CNN is employed
to extract spatial features from video frames, while LSTM networks capture temporal dependencies within the data. This hybrid
architecture allows for the modeling of visual cues in multimedia content. We introduce a comprehensive dataset of deep fake video
samples for training and testing, ensuring diversity in content and quality. A comparative analysis of different models was carried
out using datasets such as Celeb-DF and FaceForensics++. The proposed model is trained on this dataset and achieves high accuracy
in distinguishing between authentic and manipulated media.
INTRODUCTION
• Deepfake videos are manipulated videos that use advanced artificial intelligence and machine learning techniques to create a
fake video that looks very realistic.
• Detecting deep fake videos is a complex task, and there are several techniques and approaches that can be used. Some of the
most common methods include:
o Facial analysis: This technique involves analyzing the facial expressions, movements, and inconsistencies in the video to
determine if it is a deep fake.
o Metadata analysis: Metadata can provide valuable information about the video, such as the location, date, and time it
was recorded, which can help determine if the video is authentic or not.
o Source analysis: This technique involves tracing the origin of the video and analyzing its source to determine if it has
been tampered with or manipulated.
o Machine learning: Machine learning algorithms can help detect deep fake videos by being trained to recognize patterns and
anomalies in the video.
• Additionally, as technology advances, so do the techniques used to create deep fakes, making it a continuously evolving field
that requires ongoing research and development.
Literature Survey

Paper 1: Detecting Face Warping Artifacts in Exposing DF Videos
Siwei Lyu, Yuezun Li, University at Albany, State University of New York, USA

Paper 2: Detecting Eye Blinking in Exposing AI Created Fake Videos
Siwei Lyu, Ming-Ching Chang, Yuezun Li, University at Albany, State University of New York, USA

Paper 3: Detecting Forged Images and Videos Using Capsule Networks
Latha M, Abdul Samad, Alekhya B, Harshitha M, Rakshitha P, Atria Institute of Technology, Bangalore, India

Paper 4: Detecting Synthetic Portrait Videos using Biological Signals
Umur Aybars Ciftci, Ilke Demir, Lijun Yin, Senior Member, IEEE
Cont. of Literature Survey
Detecting Face Warping Artifacts in Exposing DF Videos
• Focuses on identifying artifacts present in deep fake videos.
• This technique utilizes a specialized Convolutional Neural Network to compare the generated face regions with their neighboring
areas.
• By examining these areas, the method aims to detect specific visual irregularities that are indicative of deep fakes.
• By analyzing and comparing these transformations, the method can effectively identify the presence of face warping artifacts, thus
aiding in the detection of deep fake videos.

Overview of DeepFake production pipeline
Overview of negative data generation (face alignment and Gaussian blur)

Cont. of Literature Survey
Detecting Eye Blinking in Exposing AI Created Fake Videos
• Tackles the issue of uncovering fake face videos produced using deep neural network models.
• The approach relies on the detection of eye blinking, which is a physiological signal that is typically not well represented in
synthesized fake videos.
• By examining the presence or absence of eye blinking in the videos, this method can distinguish between genuine and fake content.

Overview of the LRCN (Long-term Recurrent Convolutional Network) method


Cont. of Literature Survey
Detecting Forged Images and Videos Using Capsule Networks
• Introduces the use of capsule networks for identifying manipulated images and videos in various scenarios.
• This technique employs a network architecture that is specifically designed to capture hierarchical relationships and spatial
arrangements of visual features.
• During the training phase, random noise is introduced to enhance the network's robustness against different forms of manipulation.
• It should be acknowledged that the inclusion of random noise may not be an ideal solution and could potentially affect the method's
performance on real-time data.

Capsule architecture in broad network of forensics
An example of a real and a fake image
Cont. of Literature Survey
Detecting Synthetic Portrait Videos using Biological Signals
• Focuses on extracting and analyzing biological signals from both authentic and fake portrait videos.
• By applying specific transformations and processing steps, such as feature set extraction and PPG (Photoplethysmogram) map
generation, the method aims to differentiate between real and synthetic content.
• Ensuring the preservation and discrimination of biological signals throughout the detection process is crucial to achieve optimal
performance.

• However, one of the main challenges associated with this method lies in formulating a differentiable loss function that
effectively follows the proposed signal processing steps.

Overview of extracted biological signal


EXISTING SYSTEM
• Paper 1: This method relies on the detection of face warping artifacts, which can be difficult to identify in high-quality deepfake
videos.
• Paper 2: This approach may not be reliable in cases where DeepFakes have improved eye blinking synthesis or when the face area
is partially occluded, making it challenging to detect eye blinking as the sole indicator of manipulation. This method relies on the
detection of eye blinking patterns, which can be difficult to identify in videos where the subject is wearing glasses or has long
eyelashes.
• Paper 3: While capsule networks can be effective, they are computationally expensive and require a large training dataset,
which makes them less efficient for real-time DeepFake detection on resource-constrained devices.
• Paper 4: The effectiveness of this method may be limited in cases where DeepFakes are created with high-quality biological
signal synthesis or when the forgers take measures to reduce the visibility of such signals. This method relies on the detection of
biological signals, such as heart rate and respiratory rate, which can be difficult to measure in videos of low quality or where the
subject is not facing the camera.
PROPOSED SYSTEM

Training Flow:
Data-set (Fake/Real Video)
-> Preprocessing (Splitting video into frames -> Face detection -> Face cropping -> Saving the face cropped video)
-> Pre-processed Data-set (Contains only face video)
-> Data splitting (Train/test data split)
-> Data loader (Loading the train videos and labels)
-> DeepFake Detection Model: Res-Next (Feature extraction) -> LSTM (Video classification)
-> Export trained model

Prediction Flow:
Upload Video
-> Preprocessing
-> Load trained model
-> REAL/FAKE
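
The data-preparation stages of the training flow can be illustrated with a small, library-free sketch. It is not the project's implementation: the function names, the file names, the 20% test fraction, and the idea of keeping a fixed number of evenly spaced frames per video are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_frame_indices(total_frames, num_frames):
    """Pick evenly spaced frame indices so every video yields a
    sequence of at most `num_frames` frames (an assumed strategy)."""
    if total_frames <= num_frames:
        return list(range(total_frames))
    step = total_frames / num_frames
    return [int(i * step) for i in range(num_frames)]


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (video_path, label) pairs into train and test lists while
    keeping the REAL/FAKE ratio the same in both parts."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed gives a reproducible split
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical pre-processed (face-only) videos with their labels.
samples = [(f"real_{i}.mp4", "REAL") for i in range(10)]
samples += [(f"fake_{i}.mp4", "FAKE") for i in range(10)]

train, test = stratified_split(samples)
frame_ids = sample_frame_indices(total_frames=100, num_frames=10)
```

Splitting per label rather than over the whole list avoids a test set that happens to contain mostly one class, which would make the reported accuracy misleading.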
ALGORITHMS
CONVOLUTIONAL NEURAL NETWORKS (CNN) :
Convolutional Neural Networks (CNNs) are a class of deep learning algorithms commonly used for tasks involving image analysis
and recognition. They are designed to automatically and adaptively learn spatial hierarchies of features from data.

WORKING :
• Convolution:
o The core operation in CNNs is convolution. It involves applying a set of learnable filters (also known as kernels) to the input
image.
o Each filter scans the input data with a sliding window, performing element-wise multiplications and then summing the results to
create a feature map. The feature maps capture various features, such as edges, textures, and patterns in the input image.
• Activation Layer:
o After the convolution operation, an activation function like ReLU (Rectified Linear Unit) is often applied element-wise to
introduce non-linearity to the model. This helps the network learn complex patterns.
• Pooling Layer (subsampling):
o After convolution, a common step is to apply a pooling layer. Max pooling and average pooling are two commonly used
techniques.
o Pooling reduces the spatial dimensions of the feature maps while preserving the most important information. It helps make the
network more robust to variations in scale and position.

Overview of working of CNN feature extraction


• Fully Connected Layers:
o After several convolutional and pooling layers, one or more fully connected layers are typically added. These layers perform
high-level reasoning and classification.
o The fully connected layers combine the features learned in previous layers to make final predictions or classifications.

Purpose of working steps
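
The first three steps (convolution, activation, pooling) can be traced on a tiny input with plain Python lists. This is only a sketch under stated assumptions: the 5x5 image and the hand-written edge kernel are made up for illustration, whereas a real CNN learns its kernels during training and relies on an optimized library.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding), multiply
    element-wise and sum. As in most deep learning libraries, the kernel
    is not flipped (strictly speaking this is cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU: negative responses become zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; an incomplete border window is dropped."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Assumed toy input: a bright vertical stripe (1) on a dark background (0).
image = [[0, 1, 1, 0, 0] for _ in range(5)]

# Assumed hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)  # 4x4, each row is [2, 0, -2, 0]
activated = relu(feature_map)        # the -2 responses are clipped to 0
pooled = max_pool(activated)         # 2x2 summary: [[2, 0], [2, 0]]
```

The pooled map still records that a dark-to-bright edge exists on the left side of the image, although its exact column is no longer kept; this is the robustness to small shifts described above.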


LONG SHORT-TERM MEMORY (LSTM) :
Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture designed to overcome the vanishing
gradient problem, which can occur when training traditional RNNs. LSTMs are particularly well-suited for tasks involving sequential
data, such as natural language processing, speech recognition, and time series analysis.

WORKING :
Each LSTM cell in a layer is fed with the output of the previous cell. This gives the cell access to the previous inputs and
sequence information. A cyclic set of steps happens in each LSTM cell :
o The Forget gate is computed.
o The Input gate value is computed.
o The Cell state is updated using the above two outputs.
o The output (hidden state) is computed using the output gate.
The Cell state aggregates information from all past data and is the long-term information retainer. The Hidden state carries the
output of the last cell, i.e. short-term memory. This combination of long-term and short-term memory enables LSTMs to
perform well on time series and sequence data.
• Forget Gate:
The forget gate determines what information from the previous cell state should be forgotten or retained. It helps the LSTM decide
which information is no longer relevant and should be discarded.
• Input Gate:
The input gate is responsible for updating the cell state with new information. It controls what new information should be stored in the
cell state.
• Output Gate:
The output gate controls what information from the current cell state should be used to produce the current hidden state (output) of the
LSTM.

LSTM Cell
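
The four steps of one LSTM cell can be written out in a few lines. The sketch below uses the standard LSTM equations with a scalar input and a hidden size of 1 so that every gate is a single number; the parameter layout (`params`), the weight values, and the input sequence are made-up assumptions, not values from the project.

```python
import math


def sigmoid(x):
    """Squash a value into (0, 1); used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM cell step for a scalar input and hidden size 1.

    `params` maps each of "f", "i", "g", "o" to a tuple
    (input weight, hidden weight, bias).
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))       # forget gate: how much old cell state to keep
    i = sigmoid(pre("i"))       # input gate: how much new information to store
    g = math.tanh(pre("g"))     # candidate values for the cell state
    c = f * c_prev + i * g      # cell state update (long-term memory)
    o = sigmoid(pre("o"))       # output gate
    h = o * math.tanh(c)        # hidden state (short-term memory / output)
    return h, c


# Arbitrary illustrative weights; a trained network would learn these.
params = {
    "f": (0.5, 0.1, 0.0),
    "i": (0.4, -0.2, 0.1),
    "g": (1.0, 0.3, 0.0),
    "o": (0.6, 0.2, -0.1),
}

h, c = 0.0, 0.0
for x in [0.5, -1.0, 0.25]:    # a short made-up input sequence
    h, c = lstm_step(x, h, c, params)
```

In a video model of the kind described in this presentation, `x` would be a per-frame feature vector from the CNN and the weights would be matrices, but the order of operations inside the cell is the same.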
APPLICATIONS
Social Media Content Moderation:
Social media platforms can use this technology to identify and
remove deep fake content, thereby preventing the spread of
deceptive and harmful information. Implement automated deep
fake detection algorithms that utilize ResNeXt CNN and LSTM
models to scan uploaded videos and flag potential deep fakes
based on irregularities in content and behavior.
APPLICATIONS
Digital Identity Verification:
In digital transactions, deep fake detection helps in verifying the
identity of users, enhancing security and preventing
fraudulent activities. Digital identity verification is a crucial
process used in various domains, including financial
transactions, online services, and security, and deep fake video
detection can be implemented as part of it. Digital identity
verification is integral to secure online transactions, access to
digital services, and the prevention of identity fraud.
APPLICATIONS
Cybersecurity:
In the realm of cybersecurity, deep fake detection can identify
and counteract phishing attacks, online scams, and identity
theft. Implement multimodal verification techniques that
combine image analysis, audio analysis, and metadata checks to
validate video content. Use strong authentication methods to
confirm the identity of users and content creators, preventing
unauthorized uploads and account takeovers.
APPLICATIONS
Evidence Authentication:
Deepfake detection technology can help verify the authenticity
of video evidence presented in court. This is particularly
important in criminal cases where video footage is used
to support or refute a claim. Deepfake detection technology can
help establish the chain of custody for video evidence, ensuring
that the video has not been tampered with or manipulated at any
point.
We are open to any questions you may have!
THANK YOU!
