A Systematic Literature Review on the Effectiveness of Deepfake Detection Techniques
Laura Stroebel, Mark Llewellyn, Tricia Hartley, Tsui Shan Ip & Mohiuddin
Ahmed
To cite this article: Laura Stroebel, Mark Llewellyn, Tricia Hartley, Tsui Shan Ip &
Mohiuddin Ahmed (2023) A systematic literature review on the effectiveness of
deepfake detection techniques, Journal of Cyber Security Technology, 7:2, 83-113, DOI:
10.1080/23742917.2023.2192888
REVIEW ARTICLE
1. Introduction
The remarkable technological advancements, especially in artificial neural networks, coupled with high computing power, have led to the creation of technologies that can be used to tamper with digital content. These technologies, including FakeApp and FaceApp, have been used intensively in the recent past to generate realistic-looking, but fake, visual content. Most of these technologies enable a creator to alter various attributes of image and video content, such as
hairstyle, age, voice, and many others, or even to swap the entire face in the image with another. This has led to the concept of Deepfake, a name coined from Deep Learning and Fake, which exploits the power of deep learning to create fake, realistic video and image content. Most of these deepfake technologies result from two core neural network technologies: a generative network and a discriminative network, which, when combined, give rise to Generative Adversarial Networks (GANs).

CONTACT Mohiuddin Ahmed [email protected] School of Science, Edith Cowan University, Perth, Western Australia.
© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited and is not altered, transformed, or built upon in any way. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.
Deepfake content has a wide range of useful applications across many industries that are yet to be fully harnessed. Abdulreda and Obaid (2022), in their study [2], discuss techniques of facial modification. They argue that Entire Face Synthesis is a facial modification technique used to produce non-existent faces. Methods such as StyleGAN utilise GANs and can produce high-quality face pictures that can be used in video games and 3D modelling. However, the technology has also been widely used by malicious actors. For example, as early as 2017, deepfake technologies were used to share pornographic videos with celebrity faces superimposed on the actors. These technologies have also been used to spread misinformation, mainly across social media platforms, to conduct cybercrimes, and to create political tension and instability.
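The generator-discriminator interplay behind GANs such as StyleGAN can be stated compactly; the standard adversarial objective (due to Goodfellow et al.) is:

```latex
\min_{G}\max_{D} \; V(D,G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\!\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator D is rewarded for telling real samples x from generated samples G(z), while the generator G is rewarded for fooling it; this arms race is precisely what makes GAN-produced deepfakes progressively harder to detect.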
This has prompted the need to detect and distinguish between what is real and what is fake. So far, much research has been done on deepfake detection techniques and on the ability of these techniques to identify deepfake images and videos using artificial intelligence (AI) models trained on well-established datasets of deepfake and genuine examples. However, there has been limited research on the validity of the training datasets used by these AI models, which, due to a lack of annotation of the features and potential bias in the content, may cause a deepfake detection tool to fail [3].
Bias in the datasets used for AI and machine learning has been suspected for some time, and studies suggest AI models trained on unbalanced datasets will be biased against certain groups and will perform poorly on them. Deepfake detection technologies using these AI models will sometimes fail to identify deepfake content due to the lack of diversity in the dataset they are trained on. Further to this, Xu et al. (2022) propose that the deepfake detection tools themselves have a bias that influences results, and this is further exacerbated by the bias in the dataset used [4].
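The kind of imbalance at issue can be surfaced with a very small check; the sketch below (plain Python, with entirely hypothetical group tags) computes each demographic group's share of a training set:

```python
from collections import Counter

def group_balance(labels):
    """Share of each demographic group in a training set.

    labels: iterable of group tags, one per sample. Detectors
    trained on sets where one group dominates tend to perform
    worse on the under-represented groups.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

# Hypothetical dataset annotation (not from any dataset in this SLR):
tags = ["lighter"] * 850 + ["darker"] * 150
print(group_balance(tags))  # {'lighter': 0.85, 'darker': 0.15}
```

A check like this only works when group annotations exist at all, which is exactly the gap in annotation that the studies above point to.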
The limited number of audio datasets, particularly non-English ones, is another area where detection of deepfake media is lacking. With the impact of accents in the current audio datasets yet to be investigated, and audio deepfakes increasing, the need for further study in this area is paramount.
Table 1 summarises the literature reviews/surveys that have already been
undertaken on this subject – these will be discussed in more detail in section 3.1.
The main contributions of this SLR are:
Table 1. Summary of previous works in comparison to our study. (ref)* means references listed as reviewed document numbers are not noted in the paper, and thus the number is inferred from the paper's reference section. The (–) under the generation and detection models means the study did not explore the technology under that column.
upon the strategy presented in [9] and was conducted using the standard
research databases. Table 2 shows the search strategy used in this paper.
Further investigation of identified areas of interest was done via the data sources in Table 2, as well as other scholarly literature databases such as arXiv and ResearchGate. Figure 1 shows the search results for this SLR – made up of 83 documents reviewed and considered relevant for this SLR, 21 artifacts collected for reference purposes only, and 16 documents reviewed and found not to fit the scope of this SLR based on the document's creation date or content.
Figure 2 shows the source type of the articles included in the Reviewed Works
section of this document.
Due to the large amount of data to collect, sort through and categorise, it was
important to do this in an organised and meaningful way. This enabled us to
monitor the data, ensure information was not missed and be able to extract
meaning from the data.
Figure 1. Documents returned by our search strategy and how they were categorised after
review.
Figure 2. The source type of the documents (by year) included as reviewed works.
3. Reviewed works
For this SLR the works reviewed could be categorised in two ways – previous
systematic literature reviews/surveys and modern deepfake detection
methods.
proportionate to deepfake generation techniques and vice versa – that is, the creation of generation techniques was driven as much by the creation of detection techniques as the other way around (detection by generation). This survey [8] noted a need for competitive baselines to evaluate deepfake detection techniques so that true performance could be assessed. Zhang (2022) agreed, indicating that the deepfake detection techniques up to 2020, reviewed in this survey [6], were not robust and that there were many challenges when assessing generation and detection models, including a lack of benchmark test methods and quality datasets.
Malik et al. (2022) took a similar approach, evaluating face image and video deepfake techniques (generation and detection) up to early 2021, and concluded there was a general inability in the detection models to transfer and generalise, indicating further research was needed [7].
Rana et al. (2022) undertook a detailed SLR of 112 articles relating to deepfake detection technologies, published between 2018 and 2020, inclusive. They [9] classified these technologies into four categories (deep learning-based techniques, classical machine learning-based methods, statistical techniques, and blockchain-based techniques) and evaluated the performance of each when used with the datasets available at the time. At the conclusion of this SLR [9], deep learning-based methods were identified as outperforming all other categories. This conclusion was also noted by Weerawardana and Fernando (2021), who categorised the deepfake detection techniques they reviewed into traditional and deep learning methods only [5].
Almutairi and Elgibreen (2022) identified three types of audio deepfakes – synthetic-based, imitation-based and replay-based. This survey [11] examined machine and deep learning audio deepfake detection technologies and determined that machine learning methods proved more accurate than deep learning, but required excessive training and manual feature extraction, potentially making them unscalable. It [11] noted that much more study in this area was needed to address existing gaps and challenges.
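The manual feature engineering that makes classical machine-learning approaches hard to scale can be illustrated with a minimal sketch; the features and synthetic waveform below are illustrative assumptions, not drawn from any dataset in the survey:

```python
import math

def manual_audio_features(samples):
    """Hand-crafted features of the kind classical ML detectors need.

    samples: list of floats in [-1, 1] (a mono waveform window).
    Returns (rms_energy, zero_crossing_rate). Every such feature
    must be designed and extracted by hand, which is the scaling
    cost the survey attributes to machine-learning approaches.
    """
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (n - 1)
    return rms, zcr

# A synthetic 100 Hz tone sampled at 8 kHz (0.1 s window):
tone = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(800)]
rms, zcr = manual_audio_features(tone)
```

A deep learning detector would instead learn its features from raw audio or spectrograms, trading this manual effort for larger data and compute requirements.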
Table 4. (Continued). Each row lists: DFD Technique | DFD Models/Methods | DFD Dataset | DF Media Researched | Source Documents | Year Pub | DF Type | Metrics (NB: averaged if multiple results; AUC % / ACC % as labelled).

CNN++ | Multi-dimensional biological signals | FF++, DFD, UADFV | Video | [38] | 2021 | FM | FF++ 96, DFD 97, UADFV 95
CNN++ | SiteForge – CNN with added Local Interpretable Model-agnostic Explanations (LIME) | CASIA 2.0; own dataset (Twitter); Indian Dataset 2.0 | Image | [39] | 2021 | FM/IM | CASIA 94.7; own dataset 83.2
CNN++ | ResNet-18 – pre-processed with image saliency and guided filter processing | FF++ | Image | [40] | 2021 | FM | FF++ 99
SCNN | Set Convolutional Neural Network (SCNN) using MesoNet/XceptionNet | DFTIMIT, FF++, DFDC-P | Video | [41] | 2021 | FM | AUC 80/99 (HQ), 79/95 (LQ); ACC 80/96 (HQ), 80/94 (LQ)
CNN | TCN – temporal convolution network | FoR [42] | Audio | [11,42] | 2021/2022 | FM | 92 [42]
CNN | STN – spatial transformer network | FoR [42] | Audio | [11,42] | 2021/2022 | FM | 80 [42]
CNN | EfficientNet-B0 – adversarially trained on FF++; comparison test on DFDC training set | FF++ | Image/Video | [18,43] | 2022/2021 | FM | 69 [43]; 96 (many datasets) [18]
CNN | EfficientNet-V2 | FF++, FFIW-10K | Video | [44] | 2022 | FS | FF++ 98, FFIW-10K 93
DNN | MC-LCR – multimodal contrastive classification by locally correlated representations; Xception (pre-trained on ImageNet) with MLP-Mixer | FF++, CDF, DFDC, DFor-1.0 | Video | [45] | 2022 | FM | AUC FF++ 99/90, FF++/DFDC 71, FF++/DFor-1.0 75; ACC FF++ 98/88, FF++/CDF 71, FF++/DFDC 70
DNN | Face/background squares captured; Siamese noise trace extraction; noise similarity analysis | CDF, UADFV | Video | [46] | 2022 | FS | CDF AUC 99.92, ACC 99.15; UADFV 88.95
CNN + ViT | ViXNet | FF++, CDF, DFID | Image | [47] | 2022 | FM | AUC FF++ 99/75, CDF 99/73, DFID 99/75; ACC FF++ 97/69, CDF 94/67, DFID 95/68
CNN++ | ID-Reveal – 3D morphable model (3DMM) + Temporal ID Network + GAN | DFD | Video | [48] | 2022 | FM | AUC 87/96 (FR/FS)(HQ), 90/94 (FR/FS)(LQ); ACC 76/85 (FR/FS)(HQ), 82/78 (FR/FS)(LQ)
CNN++ | Dual-level collaborative framework (un-named) – utilising ResNet-50 and temporal learning models, trained on ImageNet | FF++, CDF, DFDC | Video | [49] | 2022 | FM | AUC FF++ 99, CDF 98, DFDC 84; ACC FF++ 95, CDF 96, DFDC 99
CNN++ | DenseNetXX – extends ResNet by adding a transition layer | RFFD | Image | [21,50] | 2022 | FM [50], FS [21] | AUC 99 [50], DN121 97.1, DN169 99.6, DN201 99.4 [21]; ACC 94 [50], DN121 97, DN169 95, DN201 96 [21]
CNN++ | VGG-XX – fully connected classifiers followed by maxpooling layers | RFFD | Image | [21,50] | 2022 | FM [50], FS [21] | AUC VGG-19 96 [50], VGG-19 98.7, VGG-16 97.7, VGG-Face 99.8 [21]; ACC VGG-19 95 [50], VGG-19 94, VGG-16 92, VGG-Face 99 [21]
CNN++ | CustomCNN – analyses dropout, padding, augmentation and grayscale effects on model performance | RFFD | Image | [50] | 2022 | FM | AUC 98; ACC 89
CNN++ | LiSiam (Localisation Invariance Siamese Networks) using Xception backbone | FF++, CDF | Video | [51] | 2022 | FM | AUC 99 (HQ), 91 (LQ), FF++/CDF 81; ACC 96 (HQ), 87 (LQ)
CNN++ | Multilayer Perceptron | – | Video | [52] | 2022 | FM | AUC 87; ACC 87
CNN++ | Frame-temporality two-stream convolutional network – MesoNet for frame-level and ResNet-18 for time-dependent residual features | CDF, FF++ | Video | [53] | 2022 | FS | AUC CDF 87, FF++ 94; ACC CDF 80.74, FF++ 86.61
CNN++ | Patch-DFD – using ResNet-50 and Inception-v3, pre-trained on ImageNet | DFTIMIT, CDF, FF++ | Image | [54] | 2022 | FM | DFTIMIT 99.42/99.1; CDF 98.88; FF++ 96.23/87.36
RNN (CNN+LSTM) | Xception-ConvLSTM – with spatiotemporal attention mechanism | FF++, CDF, DFDC | Video | [55] | 2022 | FS | AUC FF++ 99, CDF 99, DFDC 94; ACC FF++ 99, CDF 99, DFDC 92
RNN (CNN+LSTM) | EfficientNet-B3 [56] with vanilla LSTM [57] | FF++ [56], Celeb-DF [57] | Video [56], Image [57] | [56,57] | 2022 | FS [56], FM [57] | AUC Celeb-DF 99 [57]; ACC FF++ 99 (raw), 99 (HQ), 91 (LQ) [56], Celeb-DF 93.9 [57]
RNN (CNN+LSTM) | Meso-4 with vanilla LSTM | Celeb-DF | Image | [57] | 2022 | FM | AUC Celeb-DF 96; ACC Celeb-DF 89.3
RNN (CNN+LSTM) | ResNet and LSTM | CDF | Video | [58] | 2022 | FM | AUC CDF 88.8; ACC CDF 91

Other Models
– | Jointly modelling video and audio modalities for deepfake detection | FF++, DFDC | Video and Audio | [59] | 2021 | AM/VM | AUC FF++ 99; ACC FF++ 95
– | PSCC-Net [60] – Progressive Spatio-Channel Correlation Network | Own dataset | Image | [61] | 2022 | AM | 99.6
– | EfficientNet and Vision Transformers [62] | FF++, DFDC | Video | [62] | 2022 | VM | 95.1
Many papers combined the traditional techniques (CNN, DNN, LSTM) with other modules to produce ensembled and multi-attentional architectures. For example, we found a number of models that extended various versions of ResNetXX. Bita-Net, created by Ru et al. (2021), was based on human detection methods – using a temporal examination at a high frame rate and frame-by-frame scrutiny of key frames with an additional attention branch – built on ResNet-50 and U-Net. This model [29] produced excellent, consistent results across multiple datasets, even when trained on one dataset and then validated and tested on another.
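The ensembling idea behind such multi-branch architectures amounts, at inference time, to fusing each branch's score into one verdict; a minimal sketch, where the branch names, weights and scores are hypothetical rather than taken from any reviewed model:

```python
def ensemble_score(frame_scores, weights=None):
    """Soft-voting ensemble: combine per-branch fake probabilities.

    frame_scores: dict mapping branch name -> probability that the
    input is fake, e.g. a temporal branch and a key-frame branch
    as in two-branch designs. Weights allow one branch to dominate.
    """
    if weights is None:
        weights = {name: 1.0 for name in frame_scores}
    total = sum(weights.values())
    return sum(weights[n] * s for n, s in frame_scores.items()) / total

# Hypothetical branch outputs for one video:
scores = {"temporal_branch": 0.91, "keyframe_branch": 0.83}
combined = ensemble_score(scores)            # weighted mean, here 0.87
verdict = "fake" if combined >= 0.5 else "real"
```

Real systems often learn the fusion (e.g. with an attention layer) rather than fixing the weights, but the principle of combining complementary branch evidence is the same.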
Pu et al. (2022) examined videos with frame-level and video-level methods
to identify fake content. ResNet-50 extracted facial features, and then fed
them to a temporal learning model and a video and frame level classifier, in
a three-step examination process to determine if a video was fake or not.
This model [49] was touted to outperform all existing methods for frame and
video level detection, as well as being robust to video quality and database
variations. However, it [49] was not generalisable to other types of facially
manipulated images or videos, such as those that were GAN-generated
(completely made-up).
giving good results on high and low-quality images, seemed to struggle with
varying light conditions and biometric challenges.
A dual-path system proposed by Luo et al. (2021) achieved good results using neural ordinary differential equations (NODE) and a facial feature transformer architecture [28]. However, the technique was suitable for facial manipulation deepfakes only: it focused heavily on local facial features, meaning it did not perform well on face-swapping deepfakes.
Vamsi et al. (2022) created an RNN model called ConvNets, consisting of
ResNet and LSTM [58]. The paper [58] noted, due to the architecture of the
model, less pre-processing was required and images were handled in a way that
made the model fast to implement and easy to use. It [58] also claimed high
performance when compared to other CNN models using the same datasets.
Xu et al. (2021), through extensive experimentation with multiple variations of detection techniques, developed an SCNN that gave robust and verifiable results. The authors asserted that the detection technique [41] was more robust than previous models and could potentially be integrated with other backbone networks to enhance performance.
Raskar et al. (2021) proposed a YOLO-based technique claimed to be extremely accurate and efficient, in terms of computational time, for object-based video forgery detection. This technique [19] detected complex copy-move video deepfakes with 99% accuracy and was best suited to copy-move attacks, including scaling, rotation and flipping [19]. However, the researchers recognised that the technique cannot efficiently detect inter-frame video deepfakes.
Bakas et al. (2021) created a technique [88] that claimed to work well with compressed videos. However, performance dropped when one or more complete groups of pictures (GOP) were removed from the video sequence.
Sun et al. (2021) developed a technique [89] that was efficient when applied to data that had been forged using deepfake technologies. However, while performing well on data faked with deepfake techniques, the model performed poorly when given neutral data.
3.2.4. Datasets
Hazirbas et al. (2021) suggested the top five deepfake detection tools perform poorly on some specific groups, as they did not generalise to all people [3]. This paper [3] went on to introduce a new dataset, called the Casual Conversations dataset, made up of 45,000 videos of 3,011 subjects from various age, gender and skin tone groups, annotated with additional classifiers to enhance the accuracy of the AI models trained on it. The conclusion of this study [3] was that deepfake detection techniques trained on a traditional dataset had a strong bias towards lighter skin tones, and that gender classification was more successful for older age groups (45+ years old).
Nadimpalli et al. (2022), after examining the gender labelling in current datasets, concluded that when used to train deepfake detection tools they produced biased results, skewed to successfully identify male-based deepfakes more often than female-based ones [85]. That is, female-based content had a higher false match rate and a higher non-match rate than male-based content.
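The two error rates compared here can be computed per group as sketched below; the definitions (false match = real content wrongly flagged, false non-match = fake content missed) and the sample records are illustrative assumptions, not data from [85]:

```python
def match_error_rates(records):
    """False-match rate and false-non-match rate per group.

    records: list of (group, is_fake, flagged_fake) tuples.
    FMR  = real items wrongly flagged as fake / all real items.
    FNMR = fake items not flagged as fake    / all fake items.
    Returns {group: (fmr, fnmr)}.
    """
    stats = {}
    for group, is_fake, flagged in records:
        g = stats.setdefault(group, {"real": 0, "fm": 0, "fake": 0, "fnm": 0})
        if is_fake:
            g["fake"] += 1
            if not flagged:
                g["fnm"] += 1
        else:
            g["real"] += 1
            if flagged:
                g["fm"] += 1
    return {
        group: (g["fm"] / g["real"], g["fnm"] / g["fake"])
        for group, g in stats.items()
    }

# Illustrative (made-up) detector decisions:
records = [
    ("male", False, False), ("male", False, False),
    ("male", True, True),   ("male", True, True),
    ("female", False, True), ("female", False, False),
    ("female", True, False), ("female", True, True),
]
print(match_error_rates(records))
```

A gap between groups on either rate, as in this toy example, is the bias signature the study reports.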
Xu et al. (2022) went further, investigating 41 distinct classifiers – demographic (age, gender, ethnicity) and non-demographic (hair, skin, accessories, etc.) – across five deepfake datasets to determine whether the datasets had diversity. The paper [4] also analysed whether the deepfake detection tools themselves had AI bias, which could lead to generalisability, fairness and security issues. The analysis [4] concluded that both the investigated datasets and the deepfake detection models exhibited bias with regard to demographic and non-demographic classifiers. That is, the datasets showed a lack of diversity across the attributes of their content, and the detection models themselves had a strong bias toward certain attributes.
Almutairi and Elgibreen (2022) identified a lack of datasets available for audio
deepfake detection model training, particularly non-English-based datasets [11].
They also questioned the impact of accents (even in English-based datasets) on
the accuracy of audio deepfake detection [11].
To conclude this section, we would like to mention several papers [15,92], published in 2022, which presented medical imagery deepfakes as an emerging threat. Solaiyappan et al. (2022) discussed the need for more work on detecting deepfakes generated by GANs, rather than by tampering methods such as copy-move or image splicing [15]. Arora et al. (2022) likewise recognised the threat of deepfakes in medical imagery, but also suggested that deepfake generation in this area was an opportunity to build realistic, effective training material that would not be impacted by restrictions placed on personal information management [92].
4. Observations
In this section, we have addressed the research questions mentioned above,
which we believed were the most relevant and important in this area of research.
The section discusses the deepfake detection techniques identified through this
SLR, including metrics gathered on the effectiveness of these techniques and the
datasets used by them. Advances in deepfake detection technologies were noted
and current challenges and limitations were identified. Finally, potential trends in
future deepfake detection techniques have been discussed.
4.1.3. Datasets
There were 30 datasets used across the deepfake detection techniques identified by this SLR, which are detailed in Table 5. However, many of the deepfake detection models used one or more of the following datasets:
FaceForensics++ – contains 1,000 real and 4,000 manipulated videos sourced from 977 YouTube videos. The fake videos come in two compression versions – high quality (c23) and low quality (c40) [54]. This is a first-generation dataset, produced in 2019. It was the most used dataset identified for this SLR, appearing in 51% of the models listed.
CelebDF – contains 590 real videos sourced from YouTube that have been
manipulated into 5,639 deepfake videos. This dataset contains higher-quality
videos than most, as an attempt has been made to remove visible source
artifacts. It was used in 34% of the models listed.
Deepfake Detection – was created by Google and has 3,431 videos, with a ratio
of 1 real to 8.5 fake videos in the dataset. It was used in 23% of the models listed.
Deepfake Detection Challenge Dataset - contains 128,154 videos sourced
from 3,426 paid actors – the video breakdown is 104,500 fake and 23,654 real
videos. This dataset consists of videos in different lighting conditions (indoor/
outdoor) that were taken with high-resolution cameras. It was used in 19% of
the models listed.
● The quality, fairness, and trust of deepfake datasets (biased and imbalanced data) [6,7,9,12,90];
● Robustness of the deepfake detection techniques against unknown attack types [7];
● Temporal aggregation [7,12]; and
● Social media laundering [12].
5. Conclusion
This paper comprehensively discussed methods used in the detection of deepfakes and surveyed current techniques. As part of the review process, papers were sorted into two categories: a) previous literature reviews and surveys, and b) modern deepfake detection methods. The SLR reviewed 8 literature review/survey documents, 7 of which were published in 2022. It also reviewed 75 documents from category b), modern deepfake detection methods, ranging from 2021 to August 2022, which identified developing trends and future research areas. The majority (90%) of the papers reviewed focused on video, image, or a combination of both.
This paper identified 48 noteworthy detection models, across all types of
media including image, video, audio, text, and a combination of all. Of these 48,
23 detection techniques were from research and proposals from 2022 leading
up to August 2022, with a further 28 proposed in 2021.
Advancements in deepfake detection have combined traditional techniques (CNN, DNN and LSTM) with other modules to strengthen them, producing ensembled and multi-attentional architectures. Recent papers have adopted this strategy: many 2021 papers suggested multi-attentional architectures, and the 2022 papers reviewed suggested multi-attentional architectures exclusively.
Models discussed within this paper were judged by Area Under the ROC Curve (AUC) and/or accuracy (ACC), a growing method of reporting the effectiveness of deepfake detection techniques. It is worth noting that all models this report discovered reported high performance on these metrics when trained, validated, and tested on the same dataset. However, when these models were trained and validated on differing datasets, there were noticeable performance drops. As such, this report concluded the models were not robust or adaptable when presented with challenging or unknown conditions.
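A minimal sketch of the two metrics, assuming label 1 marks fake content and higher scores mean "more likely fake" (the sample labels and scores are hypothetical):

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed via the rank statistic:
    the probability that a randomly chosen fake sample scores
    higher than a randomly chosen real one (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    hits = sum(
        1 for l, s in zip(labels, scores) if (s >= threshold) == (l == 1)
    )
    return hits / len(labels)

# Hypothetical detector outputs:
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
# roc_auc(y, s) ~ 0.889; accuracy(y, s) ~ 0.667
```

Note that AUC is threshold-free while ACC depends on the chosen cut-off, which is one reason papers increasingly report both.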
As part of this review, information was discovered suggesting deepfake detection tools perform poorly on various cultural and ethnic groups. The studies reviewed concluded that deepfake detection techniques trained on a standard dataset had a strong bias towards lighter skin tones, and that gender classification had greater success with people aged above 45 years. As such, challenges identified within this paper include the quality, fairness, and trust of deepfake datasets. The robustness of models trained on these datasets against unknown attack types was also a challenge for detection. Many techniques reviewed rely heavily on specific datasets; if those datasets do not represent a broad section of society, including cultural and ethnic representation, their effectiveness is severely reduced. As such, the viability of datasets is a large challenge moving forward.
A recommendation this paper makes is the need to create a standardised approach to dataset usage and a uniform rating system under which techniques can be validated against each other, reducing the ability to select the best or most desirable datasets. Currently, deepfake detection technique creators and owners can elevate their models using targeted datasets known to produce better results for that model type, making them appear superior.
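One way such a uniform rating could look in practice is a full train/test grid over all candidate datasets, so that no model is reported only on its most favourable pairing; the evaluator below is a toy stand-in, not a real model:

```python
def cross_dataset_report(models, datasets, evaluate):
    """Uniform report: every model scored on every train/test pair.

    evaluate(model, train_ds, test_ds) -> metric in [0, 1].
    Publishing the whole grid, rather than a hand-picked dataset,
    makes cross-dataset performance drops visible.
    """
    return {
        model: {
            (train, test): evaluate(model, train, test)
            for train in datasets for test in datasets
        }
        for model in models
    }

# Toy stand-in evaluator: strong in-dataset, weaker cross-dataset.
def fake_eval(model, train, test):
    return 0.97 if train == test else 0.72

grid = cross_dataset_report(["ModelA"], ["FF++", "CDF"], fake_eval)
# grid["ModelA"][("FF++", "CDF")] -> 0.72 (the cross-dataset drop)
```

The off-diagonal cells of such a grid are exactly where the robustness problems identified in this SLR show up.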
There are current shortfalls within the research field of deepfake detection
outside of the current popular image and video types. As mentioned, this paper
found the majority of research reviewed related to one or both types. As a result,
more research is required in additional fields including audio-only deepfake
detection.
Future trends identified in this paper include audio deepfakes as the latest evolution of deepfake exploitation, utilising synthetic voice generation or voice conversion algorithms that output undetectable deepfake audio. As previously mentioned, Gartner has predicted that within two years 20% of all successful account takeovers will use deepfakes and synthetic voice augmentation [91], leading to audio deepfakes being used in deception methods such as spear phishing scams. This prediction highlights a future trend in deepfake generation and the requirement for additional research in audio deepfake detection. In addition to audio deepfakes, several papers also mentioned concerns surrounding medical imagery deepfakes. As such, additional research is required into methods of detecting these deepfake types.
This research has also highlighted a positive aspect of deepfake material: for example, deepfake imagery can be used for good in medical training by providing realistic material.
With the evolution of technology and the low cost barriers to entry, deepfakes will continue to advance on a rapid trajectory. As they evolve, detection techniques will be required to adapt at the same pace as, or faster than, those technological advancements.
Disclosure statement
No potential conflict of interest was reported by the authors.
ORCID
Mohiuddin Ahmed http://orcid.org/0000-0002-4559-4768
References
[1] Curry L, Nembhard I, Bradley E. Qualitative and mixed methods provide unique contributions to outcomes research. Circulation. 2009;119(10):1442–1452.
[2] Abdulreda AS, Obaid AJ. A landscape view of deepfake techniques and detection
methods. Int J Nonlinear Anal Appl. 2022;13(1):745–755.
[3] Hazirbas C, Bitton J, Dolhansky B. Towards measuring fairness in AI: the Casual Conversations dataset. IEEE Transactions on Biometrics, Behavior, and Identity Science. 2021.
[4] Xu Y, Terhörst P, Raja K, et al. A Comprehensive Analysis of AI Biases in DeepFake
Detection with Massively Annotated Databases. arXiv preprint arXiv:2208.05845, 2022.
[5] Weerawardana M, Fernando T, Deepfakes detection methods: a literature survey, in 2021
10th International Conference on Information and Automation for Sustainability (ICIAfS).
2021, IEEE Access: Negambo, Sri Lanka. p. 76–81.
[6] Zhang T. Deepfake generation and detection, a survey. Multimedia Tools Appl. 2022;81
(5):6259–6276.
[7] Malik A, Kuribayashi M, Abdullahi SM, et al. DeepFake detection for human face images
and videos: a survey. IEEE Access. 2022;10:18757–18775.
[8] Juefei-Xu F, Wang R, Huang Y, et al. Countering malicious deepfakes: survey, battleground, and horizon. Int J Comput Vis. 2022;130(7):1678–1734. DOI:10.1007/s11263-022-01606-8
[9] Rana MS, Nobi MN, Murali B, et al. Deepfake detection: a systematic literature review.
IEEE Access. 2022;10:25494–25513.
[10] Celebi N, Liu Q, Karatoprak M, A survey of deep fake detection for trial courts, in 9th
International Conference on Artificial Intelligence and Applications (AIAPP 2022). 2022,
ResearchGate: Vancouver, Canada. p. 227–238.
[11] Almutairi Z, Elgibreen H. A review of modern audio deepfake detection methods:
challenges and future directions. Algorithms. 2022;15(5):155.
[12] Masood M, Nawaz M, Malik KM, et al. Deepfakes generation and detection: state-of-the-art, open challenges, countermeasures, and way forward. Appl Intell. 2022;53(4):3974–4026. DOI:10.1007/s10489-022-03766-z
[53] Hu J, Liao X, Wang W, et al. Detecting compressed deepfake videos in social networks
using frame-temporality two-stream convolutional network. IEEE Trans Circuits Syst
Video Technol. 2022;32(3):1089–1102. DOI:10.1109/TCSVT.2021.3074259
[54] Yu M, Ju S, Zhang J, et al. Patch-DFD: patch-based end-to-end DeepFake discriminator.
Neurocomputing. 2022;501:583–595.
[55] Chen B, Li T, Ding W. Detecting deepfake videos based on spatiotemporal attention
and convolutional LSTM. Inf Sci. 2022;601:58–70.
[56] Saif S. Generalized Deepfake Video Detection Through Time-Distribution and Metric
Learning. IT Prof. 2022;24(2):38–44.
[57] Chamot F, Geradts Z, Haasdijk E. Deepfake forensics: cross-manipulation robustness of
feedforward- and recurrent convolutional forgery detection methods. Forensic Sci Int:
Digital Invest. 2022;40:301374.
[58] Vamsi VVVNS. Deepfake detection in digital media forensics. Global Transitions
Proceedings. 2022;3(1):74–79.
[59] Zhou Y, Lim S-N. Joint audio-visual deepfake detection. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE; 2021. p. 14780–14789.
[60] Prototypr. AI-generated faces: free resource of 100K faces without copyright. 2019 September 18 [cited 2022 October 8]. Available from: https://blog.prototypr.io/generated-photos-free-resource-of-100k-diverse-faces-generated-by-ai-2144a8615d1f.
[61] Liu X. PSCC-Net: progressive Spatio-Channel Correlation Network for Image
Manipulation Detection and Localization. IEEE Transactions on Circuits and Systems
for Video Technology. 2022;1.
[62] Coccomini DA, Messina N, Gennaro C, et al. Combining EfficientNet and vision
transformers for video deepfake detection. In: Sclaroff S, Distante C, Leo M,
Farinella GM Tombari F, editors. Image analysis and processing – iciap 2022. iciap
2022. lecture notes in computer science. Vol. 13233. Cham: Springer; 2022. doi:10.
1007/978-3-031-06433-3_19.
[63] Babbar A The VidTIMIT Audio-Video Dataset. 2016 [cited 2022 18 September]; Available
from: https://www.kaggle.com/datasets/akshay4/speakerrecognition .
[64] Afchar D. MesoNet: a Compact Facial Video Forgery Detection Network, in 2018 IEEE
International Workshop on Information Forensics and Security (WIFS). 2018, Hong Kong:
IEEE. p. 1–7.
[65] University S. ImageNet. 2021 [cited 2022 September 18]. Available from: https://www.image-net.org/.
[66] Bilello E LIDC-IDRI. 2022 16 September, 2022 [cited 2022 20 September]; Available from:
https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI.
[67] Li J CelebFaces Attributes (CelebA) Dataset. 2018 [cited 2022 18 September]; Available
from: https://www.kaggle.com/datasets/jessicali9530/celeba-dataset .
[68] Yu F. LSUN: construction of a large-scale image dataset using deep learning with humans in the loop. 2015. doi:10.48550/arXiv.1506.03365.
[69] Group VI. FaceScrub. 2021 [cited 2022 19 September]; Available from: http://vintage.winklerbros.net/facescrub.html.
[70] Dolhansky B. The DeepFake Detection Challenge (DFDC) Dataset. arXiv preprint
arXiv:2006.07397, 2020.
[71] Xie W. VGGFace2 dataset for face recognition. 2020 [cited 2022 19 September]; Available from: https://github.com/ox-vgg/vgg_face2.
[72] Papers With Code. CASIA-WebFace. [cited 2022 19 September]; Available from: https://paperswithcode.com/dataset/casia-webface.
[73] Rathgeb C, editor. Handbook of Digital Face Manipulation and Detection. Advances in Computer Vision and Pattern Recognition. Cham: Springer International Publishing; 2022. p. 487.
[74] Zhang Y, Jiang F, Duan Z. One-class learning towards synthetic voice spoofing
detection. IEEE Signal Process Lett. 2021;28:937–941.
[75] Yonsei University. Real and Fake Face Detection. 2019 [cited 2022 19 September]; Available from: https://www.kaggle.com/datasets/ciplab/real-and-fake-face-detection.
[76] Mirsky Y. CT-GAN: Malicious Tampering of 3D Medical Imagery using Deep Learning. In: 28th USENIX Security Symposium (USENIX Security 19). 2019, Santa Clara, CA.
[77] Reimao R. FoR: a Dataset for Synthetic Speech Detection. In: 2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD). 2019, Timisoara, Romania: IEEE. p. 1–10.
[78] GitHub. Celeb-DF: A Large-scale Challenging Dataset for DeepFake Forensics. 2021 [cited 2022 8 October]; Available from: https://github.com/yuezunli/celeb-deepfakeforensics.
[79] Hao D. DFFD: Diverse Fake Face Dataset. 2020 [cited 2022 8 October]; Available from: http://cvlab.cse.msu.edu/dffd-dataset.html.
[80] GitHub. Hybrid Fake Face Dataset. 2021 [cited 2022 8 October]; Available from: https://github.com/EricGzq/Hybrid-Fake-Face-Dataset.
[81] Zi B. WildDeepfake: a challenging real-world dataset for deepfake detection. In: Proceedings of the 28th ACM International Conference on Multimedia. 2020, Seattle, WA, USA.
[82] Zhou T. Face Forensics in the Wild. 2021 [cited 2022 19 September]; Available from: https://github.com/tfzhou/FFIW.
[83] MetaAI. Casual Conversations Dataset. 2021 [cited 2022 20 September]; Available from:
https://ai.facebook.com/datasets/casual-conversations-dataset/.
[84] Kwon P. KoDF: a Large-scale Korean DeepFake Detection Dataset. In: International Conference on Computer Vision (ICCV). 2021, Montreal, Canada: IEEE. p. 10724–10733.
[85] Nadimpalli AV, Rattani A. GBDF: gender balanced deepfake dataset towards fair deepfake detection. In: International Conference on Pattern Recognition 2022. 2022, ResearchGate: Montreal, Canada.
[86] Zhang J, Ni J, Xie X. DeepFake videos detection using self-supervised decoupling network. In: 2021 IEEE International Conference on Multimedia and Expo (ICME). 2021, Shenzhen, China. p. 1–6.
[87] Chen B, Liu X, Zheng Y, et al. A robust gan-generated face detection method based on
dual-color spaces and an improved xception. IEEE Trans Circuits Syst Video Technol.
2022;32(6):3527–3538. DOI:10.1109/TCSVT.2021.3116679
[88] Bakas J, Naskar R, Bakshi S. Detection and localization of inter-frame forgeries in videos based on macroblock variation and motion vector analysis. Comp Elec Eng. 2021;89:106929.
[89] Sun F, Zhang N, Pan X, Song Z. Deepfake detection method based on cross-domain fusion. Secur Commun Networks. 2021;2021:11.
[90] Muneef Z, Elgibreen H. A review of modern audio deepfake detection methods:
challenges and future directions. Algorithms. 2022;15(5):19.
[91] Martin EJ. Deepfakes: the latest trick of the tongue. Speech Technology. 2022;27(2):12–16.
[92] Arora A, Arora A. Generative adversarial networks and synthetic patient data: current
challenges and future perspectives. Future Healthc J. 2022;9(2):190–193.
[93] Pfefferkorn R. “Deepfakes” in the courtroom. Public Interest Law J. 2020;29:245–275.
[94] Venema AE, Geradts ZJP. Digital forensics, deepfakes, and the legal process. Sci Tech
Lawyer. 2020;16(4):14–17,23.