
Received 24 June 2024, accepted 20 July 2024, date of publication 29 July 2024, date of current version 27 August 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3435497

Fake News Detection Using Deep Learning: A Systematic Literature Review
MOHAMMAD Q. ALNABHAN AND PAULA BRANCO
Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada
Corresponding author: Mohammad Q. Alnabhan.

ABSTRACT Nowadays, we witness rapid technological advancements in online communication platforms, with a growing number of people using a vast range of communication solutions. The fast flow of information and the enormous number of users open the door to the publication of non-truthful news, which has the potential to reach many people. Disseminating this news through low- or no-cost channels has resulted in a flood of fake news that is difficult for humans to detect. Social media networks are among the channels used to spread fake news quickly, manipulating it in ways that influence readers in many respects. That influence was apparent in recent examples such as the COVID-19 pandemic and various political events, including the recent US presidential elections. Given how this phenomenon impacts society, it is crucial to understand it well and to study mechanisms that allow its timely detection. Deep learning (DL) has proven its potential for multiple complex tasks in the last few years with outstanding results. In particular, multiple specialized solutions have been put forward for natural language processing (NLP) tasks. In this paper, we systematically review existing fake news detection (FND) strategies that use DL techniques. We survey the existing research articles by investigating the DL algorithms used in the detection process. Our focus then shifts to the datasets utilized in previous research and the effectiveness of the different DL solutions. Special attention is given to the application of strategies for transfer learning and for dealing with the class imbalance problem. The effect of these solutions on detection accuracy is also discussed. Finally, our survey provides an overview of key challenges that remain unsolved in the context of FND.

INDEX TERMS Classification, deep learning, fake news, misinformation, systematic literature review.

The associate editor coordinating the review of this manuscript and approving it for publication was Mohamad Afendee Mohamed.

2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024, 114435
M. Q. Alnabhan, P. Branco: Fake News Detection Using DL: A Systematic Literature Review

I. INTRODUCTION
Due to a greater interest in the use of the internet, the spread of fake news has become more common than ever before. Before the popularity of social media platforms, fake news was less common and much more difficult to spread to a vast number of people, as it was achieved either through word of mouth or through printed media. Fake news can be defined as the phenomenon that occurs when incorrect information is purposefully spread throughout social media outlets with a significant ability to convince the reader of the content written [1]. Nowadays, anyone can publish content without regulation or scrutiny. Several social media platforms, such as Facebook and Twitter, serve as means for disseminating fake news, as people and influencers utilize them to share their opinions, videos, and various activities [2], [3].

Fake news greatly increased in 2016 during the period preceding the United States (US) presidential election [4]. As such, fake news on social media networks has captured the attention of many researchers. Recently, detecting fake news has become an emerging area of interest for many researchers, such as [4] and [5]. However, fake news detection is a complicated task requiring the use of complex models to compare related or unrelated information with known truthful information [6]. Furthermore, fake news is perceived in several ways by researchers, leading to multiple ways of addressing and solving this issue. Some terms related to misinformation are used interchangeably in multiple cases. These terms include fake news, rumors, spam, and disinformation, which usually contain numerical, categorical, textual, and image contents [7], [8], [9]. Unfortunately, many people have the urge to spread false information on social media, backed with professionally written, long, and referenced comments that allow the reader to more easily agree with the misinformation provided (e.g., [10], [11]). Researchers aim to curb the increased spread of misinformation by detecting the varied manners in which misinformation can be spread. As such, researchers have resorted to the use of deep learning (DL) algorithms to detect fake news before it spreads (e.g., [12]). This is accomplished by collecting or creating a dataset containing both true and false information within articles. Then, a pattern is determined, creating a model that can predict whether a given article contains true or false information.

There are noticeable gaps in the existing studies on fake news detection that our research highlights. These include (i) a lack of clear distinction between the definitions of misinformation, disinformation, and false information; (ii) a lack of DL-based systematic reviews on varying types of misinformation problems; (iii) a lack of generalizable DL models that achieve an acceptable baseline detection accuracy on different datasets, which is related to the scarce use of transfer learning in this context; and (iv) a lack of models that deal with different levels of imbalanced datasets in a fake news detection environment.

As technology progresses, detecting misinformation becomes more complicated and thus more difficult with standard machine learning (ML) techniques. This motivates our focus on DL techniques for the problem of fake news detection.

In this systematic literature review (SLR), we investigate existing fake news detection (FND) strategies that use deep learning. We focus on publicly available datasets used in FND and their NLP approaches. We aim to gather information about the transfer learning techniques applied and the methods used for addressing class imbalance, to examine their effect on detection accuracy. Our survey aims to identify open issues and research gaps in current studies. To the best of our knowledge, we are the first to provide a comprehensive SLR that investigates the effects of transfer learning and class imbalance treatment in the fake news detection domain.

Key Contributions:
The main contributions of this paper are as follows:
• We provide a detailed discussion of the main deep learning-based algorithms used to detect fake news, including their effectiveness.
• We discuss the main datasets available for fake news detection as well as their respective characteristics, advantages, and disadvantages.
• We study transfer learning techniques and strategies for dealing with class imbalance in this application domain. We also investigate their effects on the detection of fake news and the challenges associated with implementing these strategies.

Paper Organization:
This paper is organized as follows. Section II presents the research methodology, including the search strategy, research questions, source databases, search query, inclusion and extraction criteria, and data collection summary. In Section III, we investigate the deep learning (DL) algorithms used for detecting fake news. Section IV describes the publicly available datasets in the fake news domain and the associated challenges. Section V discusses transfer learning strategies and open challenges in the FND context. Section VI analyzes the class imbalance problem in fake news detection. Section VII provides a summary of the data collected in this SLR and answers to our research questions. Section VIII addresses the research threats to validity, and Section IX discusses the main gaps and open issues that still exist in fake news detection. Lastly, Section X concludes our paper.

II. RESEARCH METHODOLOGY
A. SEARCH STRATEGY OVERVIEW
Our SLR follows a set of detailed steps described in [13]. We begin by defining our research questions, after which we build the keywords for the search query to obtain the relevant papers for our study. Then, we select the most relevant databases to query and establish the inclusion and exclusion criteria. Finally, we define the fields to be extracted from the retrieved documents.

B. RESEARCH QUESTIONS
The key focus of our SLR is on understanding how DL techniques have been used to address the FND problem. We are also interested in how transfer learning (TL) has been applied in this field and how the class imbalance problem has been tackled.
• RQ1: Which deep learning algorithms have been used for fake news detection throughout time?
• RQ2: Which datasets are used in the fake news detection domain?
• RQ3: How effective are deep learning methods for fake news detection?
• RQ4: Which solutions incorporate transfer learning mechanisms, if any?
• RQ5: Which solutions deal with different levels of imbalanced datasets (if any)?

C. SOURCE DATABASES AND SEARCH QUERY
For the purpose of collecting research articles, we selected four digital databases that are renowned for their comprehensive coverage and relevance to our field of study. These databases include:
• Google Scholar (we selected the articles that appeared in the first thirteen retrieved pages);
• Association for Computing Machinery (ACM) Digital Library database;
• IEEE Xplore database; and
• Scopus.


Based on the research questions established in Section II-B, we collected a set of precise concepts that can cover the topic we are studying. We, therefore, formulated the search query as follows:

FIGURE 1. Search query used in our SLR.

The above search statement addresses the research questions by focusing on the 4 key concepts in the studied topic: “fake”, “information”, “detect”, and “deep learning”. We searched both the title and the abstract for articles published between January 2018 and December 2023 inclusive. Limiting the search to this date range is motivated by the fact that FND has become more popular in recent years, especially during the COVID-19 pandemic that started around the beginning of 2020.

We defined the following set of restrictions on the results that limit the selection among the returned articles.
• The selected articles must be published in peer-reviewed journals or conferences. Thus, we excluded patents and any articles that did not conform to this condition.
• The language of the surveyed papers must be English. Any papers retrieved that were not written in English were excluded.
• Articles containing classification models that do not mention the performance evaluation of the methods (e.g., accuracy, precision, recall, F1-score, etc.) were excluded.
• We excluded the articles that did not mention the classifier/model used in the detection task in their methodology.
• We excluded the articles that only applied standard ML algorithms instead of DL ones.
• We excluded older articles when extensions and more recent editions were found.
• We excluded articles that were published in domains outside of Computer Science such as art, business, or other domains.

The total number of articles obtained from the search query, adhering to the extraction, duplicate removal, inclusion, and exclusion criteria, is 176. This includes 88 journal articles and 88 conference articles. The process of obtaining the articles is detailed in Section II-F.

D. INCLUSION CRITERIA
We considered the following inclusion criteria for our systematic literature review:
• Peer-reviewed journals and conference articles retrieved from the search query defined in Figure 1.
• Articles from the Computer Science domain.
• Research articles that focus on detecting or classifying fake news.

We applied the backward snowballing technique [14] to gather relevant articles that might have been missed in our search by inspecting the reference sections of the retrieved papers. We identified two articles that were not picked up through our search query and added them to the set of manuscripts to analyze.

E. DATA EXTRACTION
We used Covidence [15], a special web-based software for supporting the data aggregation and extraction of SLRs. The extracted data was organized in a spreadsheet that was exported from Covidence. The data that was extracted from the retrieved and selected articles is the following:
• Date: date of the publication;
• Publication Type: where the article has been published (conference/journal).
• Classifier/Model: algorithms used for FND in the article.
• Network Structure: the architecture of the network (details including the number/types of layers and any special setup in the network).
• Dataset: name of the fake news corpus or dataset(s) used.
• TL Techniques: the TL mechanism(s) used in the proposed solution.
• Imbalance Techniques: shows whether the imbalance issue was treated in the proposed solution and how the authors dealt with it.
• Effectiveness: depicts the performance of the model in terms of accuracy, precision, recall, F-measure, and other evaluation metrics.

F. DATA COLLECTION SUMMARY
Overall, our search query retrieved 1642 articles. We found 436 duplicate articles that were removed. After the first screening of the titles and abstracts, we ended up with 393 research papers matching our research keywords and the inclusion and exclusion criteria. After a second, full-text screening, we excluded 217 articles, obtaining 176 research papers for analysis. Figure 2 shows the PRISMA chart demonstrating the retrieved papers’ selection strategy.

III. DEEP LEARNING ALGORITHMS USED FOR FAKE NEWS DETECTION
The thorough examination of various models and techniques pointed out the significant role that DL plays in different classification tasks, including detecting fake news. Building and improving such algorithms became a pressing necessity, especially during the COVID-19 pandemic, when a large volume of fake news and rumours was being disseminated widely. Figure 3 demonstrates a clear increase in the use of DL models over the years.
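As a consistency check on the counts reported in Section II-F, the screening funnel can be reproduced with simple arithmetic. The following sketch is illustrative only: the variable names are hypothetical, and the two intermediate figures (records left after duplicate removal, and records excluded at title and abstract screening) are derived here rather than stated in the text.

```python
# Counts as reported in Section II-F (variable names are illustrative).
retrieved = 1642             # articles returned by the search query
duplicates = 436             # duplicate records removed
after_title_abstract = 393   # kept after title/abstract screening
excluded_full_text = 217     # removed during full-text screening

# Derived values (not stated explicitly in the text).
after_dedup = retrieved - duplicates
excluded_title_abstract = after_dedup - after_title_abstract

# Final number of papers kept for analysis.
included = after_title_abstract - excluded_full_text

# Split by publication type, as reported in Section II-C.
journal_articles = 88
conference_articles = 88

print(after_dedup, excluded_title_abstract, included)
```

The derived total agrees with the 176 papers (88 journal and 88 conference articles) reported for the final selection.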


FIGURE 4. The general DL framework used for FND.
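To make the generic framework of Figure 4 concrete, the following is a minimal, self-contained sketch of its stages (dataset, pre-processing, word embedding, model training, prediction). It is not taken from any surveyed paper: the toy corpus, the sizes, and all function names are hypothetical, random vectors stand in for pre-trained Word2vec or GloVe embeddings, and a single logistic unit stands in for the deep neural network.

```python
import math
import random

random.seed(0)

# Stage 1: a toy labelled dataset (hypothetical examples; 1 = fake, 0 = real).
DATA = [
    ("shocking miracle cure doctors hate", 1),
    ("aliens secretly control the election", 1),
    ("miracle pill melts fat overnight", 1),
    ("city council approves new budget", 0),
    ("central bank holds interest rates", 0),
    ("university publishes annual report", 0),
]
MAX_LEN, EMB_DIM, LEARNING_RATE = 8, 4, 0.5  # tiny stand-ins for real sizes

# Stage 2: pre-processing (lowercasing, whitespace tokenization, padding).
def tokenize(text):
    return text.lower().split()

vocab = {"<pad>": 0}
for text, _ in DATA:
    for token in tokenize(text):
        vocab.setdefault(token, len(vocab))

def encode(text):
    ids = [vocab.get(token, 0) for token in tokenize(text)][:MAX_LEN]
    return ids + [0] * (MAX_LEN - len(ids))

# Stage 3: word embedding. Random vectors stand in for Word2vec/GloVe.
embeddings = [[random.uniform(-0.5, 0.5) for _ in range(EMB_DIM)]
              for _ in range(len(vocab))]

def featurize(text):
    # Average the embeddings of all known (non-padding) tokens.
    ids = [i for i in encode(text) if i != 0]
    if not ids:
        return [0.0] * EMB_DIM
    return [sum(embeddings[i][d] for i in ids) / len(ids) for d in range(EMB_DIM)]

# Stage 4: model training (full-batch gradient descent on a logistic unit).
X = [featurize(text) for text, _ in DATA]
y = [label for _, label in DATA]
weights, bias = [0.0] * EMB_DIM, 0.0

def predict_proba(x):
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss():
    total = 0.0
    for x, label in zip(X, y):
        p = predict_proba(x)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(X)

initial_loss = mean_loss()
for _ in range(200):
    grad_w, grad_b = [0.0] * EMB_DIM, 0.0
    for x, label in zip(X, y):
        error = predict_proba(x) - label
        grad_b += error / len(X)
        for d in range(EMB_DIM):
            grad_w[d] += error * x[d] / len(X)
    weights = [w - LEARNING_RATE * g for w, g in zip(weights, grad_w)]
    bias -= LEARNING_RATE * grad_b
final_loss = mean_loss()

# Stage 5: prediction for an unseen headline.
probability_fake = predict_proba(featurize("miracle cure approved overnight"))
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, P(fake) = {probability_fake:.3f}")
```

In practice, the averaged-embedding features and the logistic unit would be replaced by one of the architectures discussed in the following sections, and the embedding table would be loaded from a pre-trained model.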

FIGURE 2. PRISMA chart of selecting and retrieving the articles.

FIGURE 3. DL models used for FND between 2018 and 2023.

The extracted data shows that the FND task usually follows a generic framework, as shown in Figure 4. Initially, the process involves acquiring or generating a dataset. The majority of studies have utilized news articles gathered from openly accessible datasets. After the dataset is collected, pre-processing techniques are employed to prepare the data for input into a neural network. Prior investigations have mainly employed the Word2vec and GloVe word embedding methods to transform words into vectors [16]. Finally, the neural network model is trained and the predictions are obtained.

Neural networks for FND can be categorized into different types based on their architecture and how they process data. The first type is feedforward neural networks, including single-layer and multi-layer perceptrons. Convolutional neural networks (CNNs) are another type, designed to process data with a grid-like topology, such as images. They include traditional convolutional neural networks, residual networks, and dense networks.

Recurrent neural networks (RNNs) are designed to handle sequential data, such as time series or language, and include basic recurrent neural networks and bi-directional RNNs, long short-term memory networks (LSTM), gated recurrent units (GRUs) and bi-directional GRUs, and bi-directional long short-term memory networks (BiLSTM). We will refer to a model and its bi-directional version collectively as (Bi)X, where X is the model. Graph neural networks (GNNs), a newer type of neural network, are designed to operate on graph-structured data, such as social networks, chemical molecules, or protein structures.

Recently, attention-based models have gained popularity due to their ability to focus selectively on certain parts of the input data. They include self-attention networks and multi-head attention networks. Hybrid models, which combine different types of neural networks, have also become popular. For example, convolutional recurrent neural networks combine the spatial processing capabilities of convolutional neural networks with the temporal modeling capabilities of recurrent neural networks. Transformer networks, like BERT, combine self-attention mechanisms with feedforward neural networks to process sequences of data. Figure 5 shows a taxonomy of the various types of neural networks that are used for FND.

FIGURE 5. Taxonomy of the main neural network categories used for FND.

Based on the data that we gathered from the surveyed articles, it is evident that researchers extensively explored several DL algorithms for the detection task. Figure 6 shows the usage of different DL detection models for fake news. More precisely, this figure displays the percentage of papers where a particular model was used. We observe that the (Bi)LSTM was the most frequently used model, appearing in 72% of the articles, and the CNN was the second most used model, appearing in 61% of the articles reviewed. The third architecture used is the hybrid architecture, which combines different types of neural networks in the detection process. Since multiple models may be used in the same research paper, the percentages in Figure 6 sum to more than 100%. The following sections provide a detailed discussion of the main architectures used for FND.

FIGURE 6. Deep Learning Models for FND.

A. ARCHITECTURES BASED ON CONVOLUTIONAL NEURAL NETWORKS
Our findings also show that 61% of the previous works used Convolutional Neural Networks (CNNs) to handle the detection task, attempting to boost the performance of the FND process through the use of this DL algorithm [12], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57], [58], [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [74], [75], [76], [77], [78], [79], [80], [81], [82], [83], [84], [85], [86], [87], [88], [89], [90], [91], [92], [93], [94], [95], [96], [97], [98], [99], [100], [101], [102], [103], [104], [105].

The detection effectiveness is the result of the CNN’s ability to carry out feature extraction [106]. It is worth noting here that CNNs were trained on different fake news datasets. The CNN achieved notable effectiveness, with accuracy ranging between 95% and 98%, depending on whether it was used individually [78] or in conjunction with another model, such as the Gated Recurrent Unit (GRU) [41], respectively. It is also worth mentioning that the CNN’s detection accuracy has fallen in some cases to about 47% [30], which leads to the conclusion that some key points may affect the effectiveness of CNNs in FND tasks. The first point is the depth of the network being used. A deeper CNN is considered an advantage for solving the overfitting issue [62]. This is supported by the data we collected, which includes a case of building a deeper CNN, called FNDNet, which solved the overfitting problem by learning the discriminatory features for FND using multiple hidden layers [12]. The second point affecting the CNNs’ effectiveness concerns the dataset selected for the detection task and its readability and cleanness before being fed into the model [28]. Finally, the overall architecture used for the detection task may adopt either the CNN itself or a CNN in a hybrid approach, as we can see in our extracted results [41].

Figure 7 illustrates an example of a CNN architecture used for FND as proposed in [100]. The CNN architecture used in this study is composed of an input layer, an embedding layer, and three sets of convolutional and max pooling layers. The input layer resizes the input data to a uniform size of 1000, while the embedding layer reduces the size to 100 by embedding the data. The convolutional and max pooling pairs extract features from the input. To perform this task, each convolutional layer applies 128 filters with a kernel size of 5 and a ReLU activation function. Additionally, the fully connected network includes both a flatten layer and a dense layer. Lastly, the feature maps are classified using a dense layer with a softmax activation function.

FIGURE 7. An example of CNN architecture used in FND.

B. ARCHITECTURES BASED ON RECURRENT NEURAL NETWORKS
Another popular FND algorithm examined in the previous studies is the Recurrent Neural Network (RNN) and its variations. Authors have investigated various RNN models to detect fake news in sequential data. They have proposed Long Short-Term Memory (LSTM), GRU, unidirectional LSTM-RNN, vanilla RNN, and Bi-directional LSTM (BiLSTM). Our findings show that researchers’ focus has shifted strongly toward RNNs and their variations in fake news detection. Figure 8 shows the utilization of RNNs in the previous studies.

FIGURE 8. RNNs utilization in FND.

Our findings show that researchers examined FND using classic RNNs in only 12% of the total number of the surveyed articles [16], [31], [46], [50], [59], [67], [72], [85], [89], [95], [98], [107], [108], [109], [110], [111], [112], [113], [114]. Despite the importance of the RNN in such domains, authors have discussed the RNN vanishing gradient problem [115]. One way to mitigate vanishing gradients in RNNs is to use other architectures such as LSTM and (Bi)LSTM. The percentage of the articles that examined both LSTM and (Bi)LSTM was around 72% of the total articles [16], [17], [18], [19], [20], [21], [23], [24], [25], [26], [27], [29], [30], [31], [32], [33], [34], [35], [36], [43], [44], [49], [50], [52], [53], [56], [57], [59], [60], [62], [63], [64], [67], [69], [70], [72], [73], [76], [77], [79], [80], [82], [83], [85], [86], [87], [89], [90], [91], [95], [96], [99], [100], [102], [103], [104], [107], [108], [109], [110], [111], [112], [116], [117], [118], [119], [120], [121], [122], [123], [124], [125], [126], [127], [128], [129], [130], [131], [132], [133], [134], [135], [136], [137], [138], [139], [140], [141], [142], [143], [144], [145], [146], [147], [148], [149], [150], [151], [152], [153], [154], [155], [156], [157], [158], [159].

Other solutions were also adopted in the previous studies, which include using (Bi)GRU as a detection architecture. (Bi)GRU has been examined in 16% of the total surveyed articles [16], [24], [25], [31], [41], [56], [72], [75], [79], [84], [89], [93], [100], [109], [119], [121], [136], [143], [153], [160], [161], [162], [163].

Figure 9 shows the RNN GRU-based architecture for FND that was presented in [100]. In this proposed solution, the use of GRU RNNs for FND is explored. The proposed model includes an input layer and an embedding layer with data sizes of 1000 and 100, respectively. The GRU layer is then implemented with hyperparameters identical to those of the LSTM layer to facilitate a reliable comparison between the two. Finally, fully connected networks are used, along with a batch normalization layer, and a dense layer with a softmax activation function is applied for classification.
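The core of such a GRU-based classifier, a recurrent pass over the embedded sequence followed by a softmax layer, can be sketched in a few lines. This is a simplified, hypothetical illustration and not the model of [100]: the sizes are toy values, the weights are random and untrained, the batch normalization and intermediate fully connected layers are omitted, and all function names are assumptions made for this example.

```python
import math
import random

random.seed(0)
EMB_DIM, UNITS, CLASSES = 4, 3, 2  # toy sizes, not those used in [100]

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def add(*vectors):
    return [sum(values) for values in zip(*vectors)]

def sigmoid(vector):
    return [1.0 / (1.0 + math.exp(-v)) for v in vector]

def softmax(vector):
    peak = max(vector)
    exps = [math.exp(v - peak) for v in vector]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# One (W, U, b) triple per gate: update (z), reset (r), candidate (h).
gates = {name: (rand_matrix(UNITS, EMB_DIM), rand_matrix(UNITS, UNITS), [0.0] * UNITS)
         for name in ("z", "r", "h")}
W_out = rand_matrix(CLASSES, UNITS)  # dense softmax classification layer

def gru_step(x, h_prev):
    Wz, Uz, bz = gates["z"]
    Wr, Ur, br = gates["r"]
    Wh, Uh, bh = gates["h"]
    z = sigmoid(add(matvec(Wz, x), matvec(Uz, h_prev), bz))   # update gate
    r = sigmoid(add(matvec(Wr, x), matvec(Ur, h_prev), br))   # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    candidate = [math.tanh(v) for v in add(matvec(Wh, x), matvec(Uh, gated), bh)]
    # Convention used here: z close to 1 favours the new candidate state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, candidate)]

def classify(embedded_sequence):
    h = [0.0] * UNITS
    for x in embedded_sequence:
        h = gru_step(x, h)
    return softmax(matvec(W_out, h))

# A random "embedded article" of five tokens stands in for real input.
sequence = [[random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(5)]
probs = classify(sequence)
print(probs)
```

Because the gates interpolate between the previous state and a candidate state, the hidden state stays bounded, which is one reason gated units are preferred over vanilla RNNs for long sequences.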

FIGURE 9. An example of GRU architecture used in FND.

The findings from our survey also show that RNNs and their variations achieved remarkable detection accuracy in the fake news domain when compared against other detection models, taking into consideration the usage of different datasets. The reported RNN detection accuracy ranged from 48% [50] through around 92% [109] to around 99% [96].
Using another architecture with the RNN does not seem to increase the accuracy of the detection results, as shown in the works of Ilie et al. [31] and Nasir et al. [46]. Our findings also show that the GRU model detected fake news with an accuracy ranging between about 76% [161] and 97% [24]. These satisfactory results do not hold when the BiGRU is used instead of the standard GRU architecture. In that case, the detection accuracy decreases to a range between 28% and 71% [136]. Finally, detection was more accurate when the GRU was paired with a second model in a hybrid mechanism. This was observed when the GRU was used with a CNN in [41] and [161] and when a GRU was used with (Bi)LSTM in [119].
FIGURE 10. An example of CLSTM architecture used in FND.
Despite the effectiveness of the above-mentioned architectures in fake news detection, previous studies showed that the LSTM and the (Bi)LSTM are the future key players in enhancing fake news detection. The average accuracy of detecting fake news using LSTM architectures ranged between 79.03% and 81.21%. These models also reported a maximum detection accuracy of 99.9% in [25] and a minimum of 11% in [43].


In addition, the findings show that LSTM was used in a hybrid fashion with one or more architectures to determine the optimal FND system among the proposed systems. Figure 10 shows the architecture of the CNNs-LSTM model proposed in [100]. This model utilizes both hybrid and recurrent models on collected news data. The proposed hybrid model incorporates both CNNs and LSTM models. The algorithm includes an input layer that resizes the input data frames to 1000 and an embedding layer that embeds the input tensor size from 1000 to 100. The embedded tensors are then processed through two sets of convolutional and max-pooling layers for feature extraction. The convolutional layers have 32 and 64 filters, respectively, and a kernel size of 3. The feature extraction process is then continued by the LSTM layer with 100 units, a dropout rate of 0.2, and a recurrent dropout rate of 0.2. Additionally, the fully connected network is designed with a batch normalization layer, followed by three dense layers with a ReLU activation function and 256, 128, and 64 units, respectively. The classification task is carried out using a dense layer with a softmax activation function.

Researchers focus more on testing the effects of developing hybrid models that adopt LSTM in the detection process. They tested the importance of BiLSTMs over LSTM and reported that the BiLSTM+CNN achieved considerably higher accuracy than the LSTM combined with CNNs. They reported a detection accuracy of about 99% when using the BiLSTM instead of the LSTM [96].

When LSTM is combined with CNN, studies also reported an accuracy ranging between 97.8% in [129] and 47.06% in [30], with an average accuracy of 82.3%. In the case where LSTM is combined with a DNN architecture, we observed an accuracy of 91.16% [133], while when it is combined with BERT the accuracy achieved was 84.10% [134].

BiLSTM is also gaining popularity in fake news detection, as our survey findings show. It recorded the highest detection accuracy of 99.52% [126] and the lowest of 28% [136], with an average of 75.22%. It also appeared in combination with other detection architectures such as CNN and GRU. BiLSTM with CNN recorded the highest accuracy of 98.65% in [132] and the lowest accuracy of 35.13% in [56]. The average detection accuracy in such cases was about 77.6%. BiLSTM with GRU reached 89.8% detection accuracy [119].

C. ARCHITECTURES BASED ON GRAPH NEURAL NETWORKS
Another model used in fake news detection is the Graph Neural Network (GNN) and its variants, such as Sequence Graph Transform (SGT) [164], Graph Attention Networks (GAT) [165], GraphSAGE [166], and Graph Convolutional Networks (GCN) [167]. A GNN is a neural network that operates directly on the graph structure. One of its popular applications is node classification, in which every node in the network has a label. The network predicts the labels of nodes whose ground truth is not available [168]. In FND using GNNs, news articles and related information are represented as a graph. The nodes of the graph represent the individual entities, such as news articles, users, or social media posts, and the edges represent the relationships or interactions between them. To create the graph, the news articles are typically preprocessed to extract features such as the article content, metadata, and social media interactions. These features are used to construct the nodes and edges of the graph, with the nodes representing the articles and the edges representing the relationships between articles, users, or other entities. For example, edges could represent similarities between articles or social media interactions such as retweets or mentions. Once the graph is constructed, Graph Neural Networks are used to analyze the graph structure and extract useful features for fake news detection. The GNNs use graph algorithms to propagate information across the graph and learn representations of the nodes and edges that capture their relationships and interactions. These learned representations can then be used to classify the news articles as fake or real based on their similarity to other articles and the overall structure of the graph.

Our findings show that only 4% of the selected research articles adopted GNN architectures for fake news detection [93], [169], [170], [171], [172], [173], [174], [175]. The reported detection accuracy was comparatively low relative to the other deep-learning models on the different datasets used. The highest detection accuracy obtained with GraphSAGE was 89.7%, without mention of whether this was on the training or the testing dataset [171]. The accuracy dropped to 61.5% when the GNN was adopted. GCN recorded 73.12% [169], with a maximum of 88.6% [171]. The other variants, such as SGT, GCN, and GAT, reached an average accuracy of about 83.1%.

D. ATTENTION-BASED AND BERT-BASED ARCHITECTURES
Another notable advancement in fake news detection came with the use of attention-based approaches on different datasets. Our findings show that their use has been increasing since 2018 and peaked in 2022. In addition, this approach appeared in 15% of the surveyed articles, mostly in 2022. Authors have applied it to other detection models, including RNNs [31], GRU [31], [75], [160], [163], [176], LSTM and (Bi)LSTM [19], [49], [57], [77], [123], [132], [140], [154], [174], [177], BERT [57], and CNN alone [49], [77] or with other models [19], [49], [140]. The detection accuracy ranged between 54% [49] and 98.65% [132].

Another deep learning model present in our surveyed works that shows cutting-edge detection performance is BERT [178]. It is a sophisticated pre-trained word-embedding model built on a transformer encoder architecture. The findings show that 16% of the surveyed studies adopted BERT as a detection mechanism [43], [51], [57], [66], [74], [79], [81], [83], [88], [89], [91], [99], [110], [111], [134], [144], [156], [157], [179], [180], [181], [182], [183], [184].


The findings also show that authors started using BERT as a detection model for fake news in 2021, which makes it still a novel detection tool and a future direction in the fake news detection field. Our findings show that this model has reached a remarkable detection accuracy, with a highest recorded accuracy of 98.5% [181] and an average accuracy of around 90%. It is also clear from our findings that researchers experimented with combining BERT with other models such as LSTM [134] and CNN [51] for the detection of fake news using different datasets. As an example of using BERT in the fake news detection process, FakeBERT has been proposed in [185]; it outperforms all other models with an accuracy of 98.9%. Figure 11 illustrates the proposed FakeBERT.
As Figure 11 shows, this design employs three parallel blocks of 1D-CNN with 128 filters, with each block having one convolutional layer. The first layer has a kernel size of 3 and 128 filters, reducing the input embedding vector from 1000 to 998. The second layer has a kernel size of 4 and 128 filters, reducing the input vector from 1000 to 997. The third layer has a kernel size of 5 and 128 filters, decreasing the input vector from 1000 to 996. Max-pooling layers are also included after each convolutional layer to further reduce the dimension. A max-pooling layer with a kernel size of 5 reduces the vector to 1/5th of 996, which is 199 after rounding down. After concatenating the three convolutional layers, another convolution layer with a kernel size of 5 and 128 filters is applied. This is followed by two hidden layers with 384 and 128 nodes, respectively. The number of trainable parameters for each layer is also provided in the "Param number" column for further details.
A recent study has conducted a thorough comparison between different deep learning models in fake news detection using various datasets [186]. The authors studied the effect of deploying (Bi)LSTM, CNN-RNN, C-LSTM, CNN, and BERT in the detection of fake news. They used seven fake news detection datasets with each model to be able to draw a generalized conclusion. They found that the (Bi)LSTM and BERT detection models achieved the best detection accuracies and F-scores. The authors also concluded that BERT performs better than the (Bi)LSTM when the model aims at detecting fake news in contexts different from the one it was trained on [186].

E. ARCHITECTURE BASED ON FEEDFORWARD NEURAL NETWORKS
Finally, other deep learning models have been used in the fake news detection field with basic and standard feedforward neural network (FFN) settings. Authors categorized these under simple neural networks (NN; ANN, DNN, and FNN). Although these models are referred to as simple detection techniques and were used in 7% of the total surveyed articles, they still reached a noticeable accuracy in detecting fake news. Our findings show that FFN reached a detection accuracy of 89.8% [44], rising to about 95% when it was supported by a strong embedding technique [148], and a lower accuracy of 83.35% [107]. On the other hand, DNN reached an accuracy of 94.68% [187], while it was less accurate by 2.8% when applied with an LSTM [133]. The findings also show that using a multichannel ANN [187] increased the detection accuracy by approximately 13% over the basic ANN, which was 80.9% accurate in detecting fake news [188].
In conclusion, we observe a clear growing trend in the solutions proposed using (Bi)LSTM, CNN, BERT, etc. throughout the years, as Figure 12 shows.

F. CHALLENGES RELATED TO DEEP LEARNING METHODS FOR FAKE NEWS DETECTION
Despite the promising results of deep learning methods for fake news detection, several challenges remain to be addressed. These include issues related to dataset quality, model performance on imbalanced datasets, and the generalizability of models across different datasets. In this article, we will explore these challenges in more detail and discuss potential solutions.
It is important to recognize that there are several challenges when it comes to achieving effective fake news detection using deep learning methods. One major issue is the potential for overfitting, where models achieve high accuracy on the training data but perform poorly on new, unseen data [60]. Some previous research has reported extremely high accuracy results, but these were obtained by evaluating the model on the same data that was used for training. The models in these studies achieved accuracies as high as 99.9% [25], [53], [60], [120], [124], [126], [131], and [132]. This raises questions about the models' ability to generalize to new data. Another challenge is the use of accuracy as the sole evaluation measure for imbalanced datasets, where the number of fake news samples vastly outweighs the number of real news samples [17], [18], [19], [20], [36], [57], [60], [61], [62], [117], [120], [123], [131], [137], [138], [139]. Accuracy can be misleading in these cases, as it can be skewed by the dominance of the majority class. A more appropriate measure, such as precision or recall, would provide a better understanding of the model's performance. Additionally, different datasets can have varying characteristics and biases, and models that perform well on one dataset may not generalize to other datasets. This is noted from our findings in [43], [50], [125], [135], and [136]. It was also demonstrated by the thorough experiments conducted in a recent study of cross-domain fake news detection [186]. Finally, the quality and diversity of the training data can greatly impact the performance of the model [189], [190]. In some cases, models have been trained on datasets that are not representative of the full range of fake news content, leading to poor detection performance [190], [191].

IV. DATASETS USED FOR FAKE NEWS DETECTION
In this section, we first discuss the main characteristics of the datasets used in the surveyed works. Then, we discuss


FIGURE 11. An example of BERT architecture used in FND.
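
The layer sizes quoted for FakeBERT in the text can be checked with a little arithmetic. The sketch below is not the authors' implementation. It assumes that the convolutions use no padding and a stride of 1, and that pooling discards any incomplete window; these assumptions reproduce the lengths 998, 997, 996, and 199 reported in the text. The helper names are hypothetical.

```python
def conv1d_out_len(input_len, kernel_size, stride=1):
    """Output length of a 1D convolution without padding ("valid" mode)."""
    return (input_len - kernel_size) // stride + 1


def maxpool1d_out_len(input_len, pool_size):
    """Output length of non-overlapping max-pooling; leftover positions are dropped."""
    return input_len // pool_size


INPUT_LEN = 1000          # embedding vector length stated in the text
KERNEL_SIZES = (3, 4, 5)  # one kernel size per parallel 1D-CNN block
POOL_SIZE = 5             # max-pooling kernel size stated in the text

# Length after each parallel convolution, keyed by kernel size.
conv_lengths = {k: conv1d_out_len(INPUT_LEN, k) for k in KERNEL_SIZES}
# Length after the max-pooling layer that follows each convolution.
pooled_lengths = {k: maxpool1d_out_len(n, POOL_SIZE) for k, n in conv_lengths.items()}

for k in KERNEL_SIZES:
    print(f"kernel {k}: {INPUT_LEN} -> conv {conv_lengths[k]} -> pool {pooled_lengths[k]}")
```

Under these assumptions, the kernel-size-5 branch goes from 1000 to 996 and then to 199, matching the figures given in the description of the architecture.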

FIGURE 12. DL models in fake news detection throughout the time.
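
The caution raised in the discussion of deep learning challenges, that accuracy alone can mislead on imbalanced data, can be made concrete with a toy example. The counts below are invented for illustration and come from none of the surveyed studies; the function name is hypothetical. A classifier that always predicts the majority class scores 95% accuracy while never identifying a single minority-class item.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test set: 95 majority-class items (label 0) and 5 minority-class items (label 1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred, positive=1)
print(metrics)
```

Here precision, recall, and F1 for the minority class are all zero even though accuracy looks high, which is why the text recommends reporting measures such as precision or recall alongside accuracy.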

some of the open challenges related to the datasets in this application domain.

A. MAIN DATASETS USED FOR FAKE NEWS DETECTION
Researchers have used several datasets in the context of fake news detection. However, we found that only a small part of these datasets is publicly available, while a considerable percentage is created by the researchers and/or is not disclosed publicly. A pie chart of the datasets used in the surveyed studies is presented in Figure 13.
We observe that ISOT [192], PHEME [193], LIAR [194], and FakeNewsNet [195], with its three sub-datasets, GossipCop, PolitiFact, and BuzzFeedNews, are examples of publicly available fake news datasets. These are among the most popular and frequently used datasets.
The LIAR dataset includes short statements obtained from the PolitiFact fact-checking website. This dataset includes a total of 12.8 K labelled short statements. The annotation task has been done by the PolitiFact site, and the statements are classified into 6 classes: pants-fire, false, barely-true,


half-true, mostly-true, and true. In addition, another fake news dataset, called ISOT, was collected from real-world news articles. The real news cases were collected by crawling news articles from Reuters.com, and the fake news examples were obtained from unreliable websites, which were annotated by the PolitiFact website. The PHEME dataset was collected from Twitter based on 9 newsworthy events classified by journalists. The annotation process was conducted by journalists (human annotators), and each tweet was annotated with one of the following labels: "proven to be false", "confirmed as true" or "unverified". The FakeNewsNet dataset consists of three sub-datasets, which are GossipCop, PolitiFact, and BuzzFeedNews. In total, the FakeNewsNet dataset contains approximately 19,838 news articles labelled as either "fake" or "real". The news articles in the FakeNewsNet dataset were annotated by a team of human annotators. The annotators were given guidelines for identifying fake news and were trained to identify various characteristics of fake news, such as misleading headlines, fabricated content, and misleading images. Table 1 summarizes the main characteristics related to these datasets.

FIGURE 13. Datasets used in the surveyed studies.

TABLE 1. Main characteristics of the publicly available FND datasets.

From our findings, the highest accuracy on LIAR was 98.95%, obtained when a Bi(LSTM) model was used for training [125]. The same dataset was used to train the (Bi)GRU in [136], which recorded a low detection accuracy of 28.12%. ISOT recorded high detection effectiveness in many cases, especially when the trained model was a Bi(LSTM) [53], with an accuracy of 99.95%. Still, the accuracy decreased when the models used for training were RNN and CNN, with the lowest performance recorded at 82.5% [46]. PHEME also exhibited high performance when the CLSTM model was used, achieving an accuracy of 91.88%, and recorded a minimum accuracy of 65.5% when a CNN was trained [55]. Lastly, the FakeNewsNet sub-datasets were used to train different models such as CNNs [50], various RNNs [50], [110], [111], [134], [136], GNNs [170], and BERT [110], [111], [134], [180]. The best detection accuracy, 96.42%, was achieved when training the GAT, while training a (Bi)GRU recorded an accuracy of 71.16%.

B. CHALLENGES RELATED TO THE DATASETS USED FOR FAKE NEWS DETECTION
One of the main difficulties in the fake news detection field is the scarcity of labelled cases [196], [197]. Even though multiple datasets with a massive number of records exist, they are mostly unlabelled or have only a few records labelled. Researchers have collected datasets over the last few years for use with DL models in different contexts associated with fake news detection. Datasets differ greatly from one another because they serve different research goals within the fake news detection application domain [198]. For example, some datasets contain exclusively political statements, while other datasets only include news articles or social media posts [186].
To collect appropriate datasets to serve in fake news detection, we need fake articles and non-fake articles. Fake articles are gathered from deceitful websites that are designed on purpose to disseminate misinformation and fake news. The fake news published on these websites will eventually be shared on social media to be read and circulated by innocent people who do not check the news source.
It is also clear from our findings that the datasets used in fake news detection are insufficient for training models due to their characteristics, such as language features or size [199]. That leads us to the question of creating a dataset to serve as a benchmark in the detection process. However, this can be challenging for several reasons, some of which are:
• Sources of fake and non-fake news: Identifying reliable sources of fake and non-fake news can be difficult, especially in today's world where there are numerous sources of information and not all of them are trustworthy [200]. It is crucial that the dataset contains a diverse range of sources so that the model is trained to detect fake news from a variety of sources.
• Bias in the data: Bias can be introduced in the data for various reasons, such as the sources of the data, the labelling process, or the selection criteria for the dataset. Bias can affect the accuracy of the model and can also lead to unfair predictions [201].
• Labelling issues: Labelling data for fake news detection can be challenging, as there can be discrepancies in the definition of what constitutes fake news. Human labels


may be subjective, and there may be inconsistencies in the labelling process. Automatic labels generated using machine learning techniques can also have their limitations [197], [202].
• Bots involvement: Bots can be used to generate large volumes of fake news and spread it rapidly across the internet, making it difficult to detect and remove. Bots can also be used to manipulate the labelling process by providing biased labels, leading to inaccuracies in the dataset [200].
• Rapidly evolving nature of fake news: The nature of fake news is constantly evolving, and new techniques for creating and spreading it are being developed all the time. This makes it difficult to create a comprehensive and up-to-date dataset [198], [203].

To address these issues, it is crucial to have a well-designed and diverse dataset that is regularly updated to reflect the changing nature of fake news. It is also important to have robust labelling procedures in place, using a combination of human and machine labels, to ensure that the dataset is unbiased and accurate. Additionally, researchers should consider incorporating techniques such as adversarial training to improve the robustness of the model to adversarial attacks.

V. STRATEGIES FOR TRANSFER LEARNING
A. TRANSFER LEARNING STRATEGIES APPLIED TO FAKE NEWS DETECTION
Numerous real-world applications have made use of machine and deep learning techniques. These learning methodologies assume that the input feature space and data distribution properties are maintained across the experiments carried out because the training data and testing data are drawn from the same domain [204]. This assumption, however, may not be accurate in some real-world machine-learning situations. In fact, in some circumstances, gathering training data can be costly and/or challenging. As a result, the research community has been considering the development of high-performance learners that are trained using data that could be more easily obtained from various other domains instead of the deployment domain.
Transfer learning is a technique used to advance a learner in one domain by transferring knowledge from a related domain. Real-world, non-technical experiences can help us comprehend why transfer learning is feasible. Take the case of two individuals who wish to learn how to play the piano. One person has no prior musical training, whereas the other plays the guitar and has a wealth of musical expertise. By applying previously acquired musical information to the goal of learning to play the piano, a person with a strong musical background will be able to learn the piano more quickly and effectively [205]. One can employ knowledge from a task they have already mastered to help them learn a new one that is related.
The essence and necessity of transfer learning appear when there is a dearth of target training data [204]. This can be the result of the data being rare, expensive to gather and label, or inaccessible. The use of other existing datasets that are related to, but not precisely the same as, a given target domain of interest makes transfer learning solutions an attractive strategy as big data repositories become more widespread. Transfer learning has been successfully used in many machine and deep learning applications, including text sentiment classification [206], image classification [207], [208], [209], classification of human activity [210], classification of software defects [211], and classification of multi-language text [212].
Different techniques can be utilized in transfer learning to accomplish tasks, such as the following [213]:
• Training models in similar domains: This transfer learning method trains models that belong to similar domains. For instance, if there is insufficient data to complete task X, but task Y is similar and has adequate data, a model can be trained on task Y and then used to create a new model for task X [214].
• Feature extraction: Feature extraction is another transfer learning approach where deep neural networks are trained to extract features automatically. After training them on pre-existing models, the representations are exported to new models. This technique is commonly employed by data scientists [215].
• Utilizing pre-trained models: This approach involves developing pre-trained models that take transfer learning variables into account. Companies experienced in model development often have access to a library of models that can be used to create future models. This means that when dealing with a new problem, a pre-trained model can be selected, optimized for the problem at hand, and then reused to train another model [214].
The first transfer learning technique involves training models in similar domains by using a pre-trained model from a source domain that is similar or related to the target domain. The idea is that the knowledge learned from the source domain can be leveraged to improve model performance on the target domain, even if the target domain has limited labelled data.
Training models in similar domains typically involves the following steps:
1) Selecting a source domain: The source domain should be chosen based on its similarity or relevance to the target domain. Ideally, the source domain should have similar data distribution, task, or domain characteristics as the target domain, so the knowledge learned from the source domain can be effectively transferred to the target domain.
2) Acquiring or creating a labelled dataset in the source domain: A labelled dataset in the source domain is needed for training the pre-trained model. This dataset should be representative of the data in the source domain and should cover the task or tasks of interest.


3) Pre-training the model on the source domain: The pre-trained model is trained on the labelled dataset in the source domain. This involves training the model using standard machine learning or deep learning techniques, such as supervised learning or unsupervised learning, depending on the availability of labelled data in the source domain.
4) Fine-tuning or adapting the pre-trained model to the target domain: After pre-training on the source domain, the pre-trained model is fine-tuned or adapted to the target domain. This typically involves further training the model using the limited labelled data available in the target domain, while retaining the knowledge learned from the source domain. Fine-tuning can be done by updating the weights of some or all of the layers of the pre-trained model, depending on the specific task and data.
5) Evaluating and validating the model performance: The fine-tuned model is evaluated and validated on the target domain dataset to assess its performance. This may involve measuring metrics such as accuracy, precision, recall, F1 score, or other relevant performance indicators to determine the effectiveness of the transfer learning approach.
Transfer learning by training models in similar domains can be useful when the target domain has limited labelled data, but related or similar domains have abundant labelled data. By leveraging the knowledge learned from the related source domain, the model can benefit from the additional data and potentially achieve better performance on the target domain task. However, it is important to carefully consider the similarity and relevance between the source and target domains to ensure that the knowledge transfer is effective and results in improved performance.
The second transfer learning technique, feature extraction, is one of the common techniques used in transfer learning, where a pre-trained model is used to extract features from data in one domain and these features are then used to train a new model for a different task or domain [216].
In transfer learning with feature extraction, the pre-trained model is typically a deep neural network trained on a large dataset from a source domain. This model has learned to extract relevant features from the source domain data, which can be representations or embeddings of the input data at different layers of the network. These learned features are then used as inputs to a new model, often referred to as the target model, which is trained on the limited labelled data available in the target domain.
The process of using feature extraction in transfer learning typically involves the following steps:
1) Selecting a pre-trained model: The pre-trained model should be chosen based on its relevance to the target task or domain. Ideally, the pre-trained model should have been trained on a large dataset from a source domain that is similar or related to the target domain, so that the learned features are relevant to the target task.
2) Removing the last layers of the pre-trained model: The last layers of the pre-trained model, which are often responsible for task-specific predictions, are removed to retain the feature extraction capability of the model. These last layers are replaced with new layers that are specific to the target task.
3) Extracting features from the source data: The pre-trained model is used to extract features from the data in the source domain. This typically involves passing the data through the layers of the pre-trained model up to a certain layer and using the outputs of that layer as the learned features.
4) Training a new model on top of the extracted features: The extracted features are then used as inputs to a new model, which is trained on the limited labelled data available in the target domain. This new model, often referred to as the target model, is trained using standard machine learning or deep learning techniques, such as supervised learning or fine-tuning, depending on the availability of data in the target domain.
5) Evaluating and validating the target model performance: The trained target model is evaluated and validated on the target domain dataset to assess its performance. This may involve measuring metrics such as accuracy, precision, recall, F1 score, or other relevant performance indicators to determine the effectiveness of the transfer learning approach.
Feature extraction in transfer learning allows leveraging the knowledge learned from the source domain to extract relevant features from the data in the target domain, even if the target domain has limited labelled data. By using the learned features as inputs to a new model, the target model can potentially benefit from the representations or embeddings learned from the source domain. This can help improve the performance of the target model on the target domain task. However, it is important to carefully consider the similarity and relevance between the source and target domains to ensure that the features extracted from the source domain are relevant to the target task.
The third transfer learning technique, utilizing pre-trained models, is a common approach in transfer learning where a pre-trained model, typically trained on a large dataset, is used as a starting point for training a new model on a smaller target dataset. The idea is that the knowledge learned from the source domain can be transferred to the target domain, even if the two domains are different, to improve the performance of the target model [217].
Here are some key steps involved in utilizing pre-trained models for transfer learning:
1) Select a pre-trained model: Choose a pre-trained model that is trained on a large dataset and is relevant to your target task. For example, suppose you are working on an image classification task. In that case, you can


choose a pre-trained Convolutional Neural Network (CNN) such as VGG, ResNet, or Inception, which have been trained on large image datasets like ImageNet.
2) Remove or freeze some layers: Depending on the architecture of the pre-trained model, it may be necessary to remove or freeze some layers. For example, you can remove the output layer(s) of the pre-trained model and replace them with new layers that are suitable for your target task. Alternatively, you can freeze the weights of some of the layers in the pre-trained model and only fine-tune the remaining layers during the training process.
3) Add new layers: Add new layers on top of the pre-trained model to adapt it to your target task. These new layers are typically randomly initialized and are trained using the target dataset. The output of these new layers serves as the final prediction layer for your target task.
4) Fine-tune the model: Train the entire model, including the pre-trained layers and the newly added layers, on your target dataset. During the fine-tuning process, the weights of the pre-trained layers and the new layers are updated using the gradients computed from the target dataset. Fine-tuning allows the model to learn task-specific representations while leveraging the knowledge from the pre-trained model.
5) Evaluate and tune: After training, evaluate the performance of the transferred model on your target task. You may need to tune the hyperparameters and architecture of the transferred model to optimize its performance.
Utilizing pre-trained models can be an effective transfer learning approach as it allows leveraging the knowledge learned from large datasets, reducing the need for extensive training data in the target domain, and potentially improving the performance of the target model. However, it is important to carefully choose the pre-trained model, architecture, and fine-tuning strategy to ensure that the transferred knowledge is relevant and beneficial for the target task.
There are various pre-trained machine learning models available, such as Google's Inception model [218], Microsoft's MicrosoftML R package [219] and Microsoftml Python package [220], and others like AlexNet [221], Oxford's VGG Model [222], and Microsoft's ResNet [223]. In addition, some of the well-known pre-trained models used for NLP-related data problems are Google's word2vec Model [224], Stanford's GloVe Model [225] and BERT [178].
BERT is a pre-trained language model that was initially introduced by Google in 2018. The model is trained on a large corpus of unlabelled text data to learn the underlying structure of the language. It utilizes a transformer-based architecture that allows it to capture long-term dependencies and contextual relationships between words. After pre-training, the model is fine-tuned on a specific downstream NLP task, such as sentiment analysis, question answering, or named entity recognition. This fine-tuning step enables the model to adapt to the specific requirements of the downstream task. In summary, BERT is a transfer learning technique that leverages pre-training on unlabelled text data and fine-tuning on specific NLP tasks to achieve state-of-the-art performance on a variety of NLP benchmarks [178].
In particular, the most common transfer learning strategy in fake news detection is fine-tuning pre-trained models. Models like BERT, Llama, and GPT (Generative Pre-trained Transformer) have been pre-trained on extensive text corpora and can be fine-tuned for fake news detection [91]. By adjusting the weights of these models on a specific fake news dataset, researchers can achieve high detection accuracy with relatively low computational resources.
Other transfer learning strategies used for fake news detection include the adaptation of Convolutional Neural Networks (CNNs), traditionally used for image recognition, to text classification tasks, including fake news detection [22]. Models like VGG16, which were previously trained on large image datasets, may be reused by replacing the last layers and retraining on textual data [226]. This strategy takes advantage of CNNs' hierarchical feature extraction capabilities, which enable them to detect detailed patterns in textual data that indicate fake news. In addition, pre-training hybrid models on large datasets and then fine-tuning them on specific fake news datasets has been used in fake news detection to exploit the strengths of both architectures.
Recently, researchers have shifted their focus to transformer-based models, particularly those like BERT, GPT-3, and Llama [227], [228]. These models are pre-trained on massive datasets using self-supervised learning techniques, which enable them to understand and generate human-like text. For fake news detection, these models can be fine-tuned on labelled datasets specific to fake news, enabling them to distinguish between fake and real news with high precision.
Based on the collected data from the surveyed articles, along with their corresponding fake news detection effectiveness, the following conclusions can be drawn related to the use of transfer learning techniques in this domain:
1) CNN with AlexNet as a transfer learning technique achieved an accuracy of 93.2%. In comparison, not applying transfer learning recorded an accuracy of 70.1% in [22].
2) In [179], pre-trained BERT as a transfer learning technique achieved an accuracy of 94.66%. Similarly, a pre-trained BERT has also helped in the detection of fake news using the ISOT dataset [91]. In another case of BERT variations, RoBERTa achieved accuracies of 92.77% and 91.7% on PolitiFact and GossipCop, respectively [227], outperforming state-of-the-art techniques that do not use transfer learning by average accuracy improvements of 10.49% and 14.53% on PolitiFact and GossipCop, respectively.
3) CNN with various transfer learning techniques, such as AlexNet, ResNet50, MobileNet, DenseNet,


XceptionNet, InceptionV3, VGG16, and VGG19, achieved high accuracy on the EMERGENT dataset [226]. The detection accuracy ranged between 91.22% and 97.68% in [48]. VGG16 was also used as a pre-trained model, with some layers frozen, and trained on a self-created dataset, achieving about 98% detection accuracy [71].
4) The Universal Language Model Fine-tuning transfer learning technique achieved over 80% on all the evaluation metrics (Accuracy, Precision, Recall, F1) on the PHEME dataset [55].

B. TRANSFER LEARNING CHALLENGES FOR FAKE NEWS DETECTION
Transfer learning has been used in various natural language processing (NLP) applications, including fake news detection. However, there are several challenges associated with applying transfer learning in this domain.
One initial aspect that we must highlight is that transfer learning has not been extensively explored in fake news detection. We consider that this is partly due to the complexity of the task, which requires identifying subtle linguistic cues and context-specific information.
Another challenge concerns the difficulty in finding related domains and publicly available datasets that can be useful for training the models. The success of transfer learning relies on the availability of large and diverse datasets that share some commonality with the target task. However, in the case of fake news detection, relevant datasets are often limited, and it can be challenging to find related domains that can be used for transfer learning.
The scarcity of data is another significant challenge in fake news detection. Since the detection of fake news is a relatively new area of research, there are limited annotated datasets available for training and testing models. This scarcity of data makes it difficult to apply transfer learning techniques, which rely on large amounts of labelled data for pre-training.
In addition to these challenges, other issues need to be addressed to apply transfer learning effectively in fake news detection. For instance, the choice of pre-trained models and their adaptation to specific tasks can significantly impact the performance of the models. Furthermore, the transferability of pre-trained models across different languages, domains, and cultures is still an active area of research. Considering transfer learning strategies is a relevant area for further research that can lead to improved solutions for FND.

VI. STRATEGIES FOR DEALING WITH IMBALANCE
Deep learning and machine learning algorithms presuppose that the target classes of the training data have similar prior probabilities. This assumption, however, is flagrantly violated in a variety of real-world applications, including fake news detection. In this section, we start by summarizing the main techniques used to deal with the class imbalance problem and describe our main findings from the surveyed articles in the context of fake news detection. Then we summarize the main challenges related to the class imbalance problem that are still open in this domain.

A. THE CLASS IMBALANCE PROBLEM IN THE CONTEXT OF FAKE NEWS DETECTION
In many real-world domains, the majority of the available examples belong to one class (the majority or negative class) while a much smaller number belongs to the other class (the minority or positive class), which is typically the most important class [229]. This situation is known as the class imbalance problem. The dominant class tends to overpower classifiers in this situation, causing them to overlook the minority class. The significance of the imbalance problem grew as more researchers discovered that it leads to inadequate classification performance and that most algorithms perform poorly when datasets are highly imbalanced [230]. From the standpoint of applications, the nature of the imbalance can be divided into two categories: data that is naturally imbalanced (e.g., credit card fraud, earthquakes, shuttle failures, and rare diseases), or data for which it is too expensive to obtain examples of the minority class for learning. For example, predicting natural disasters or other uncommon events, such as volcanic eruptions or tsunamis, may require historical data or expert knowledge, which could be sparse or expensive to obtain [230]. This is also the case for fake news detection, where fake news examples are much less represented in the available data.
Several techniques have been proposed to address the issues associated with class imbalance. The three main types of techniques that can be applied are resampling techniques (or data pre-processing), algorithm-level techniques, and data post-processing techniques [231]. The solutions most commonly used are the data pre-processing or algorithm-level techniques.
In data pre-processing techniques, sampling is applied to the training data to add new samples or remove existing ones. These techniques aim to change the training data distribution to force the learning algorithm to focus on the most relevant class. This change in the training data can be accomplished through over- and/or under-sampling. Over-sampling is the process of adding new samples to the training data, while under-sampling is the process of removing samples. Figure 14 and Figure 15 illustrate the random under-sampling and random over-sampling techniques. These techniques act by randomly removing cases or adding copies of existing cases.
In random under-sampling, examples from the majority class are randomly removed from the training dataset until the class distribution becomes more balanced. This can be achieved by randomly selecting examples from the majority class and removing them from the training dataset. Random under-sampling can be a simple and quick technique to address class imbalance, but it may result in the loss of valuable information from the majority class, leading to a potential loss of predictive performance.
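The random under-sampling and random over-sampling steps described above can be sketched as follows. This is a minimal illustration and not code from any of the surveyed studies: the function names, the toy data, and the choice to balance the classes exactly are assumptions made for the example, and only the Python standard library is used.

```python
import random
from collections import Counter


def _group_by_label(samples, labels):
    """Group samples into a dict mapping each label to its list of samples."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return groups


def random_undersample(samples, labels, seed=0):
    """Randomly remove examples so every class matches the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(items) for items in groups.values())
    pairs = []
    for label, items in groups.items():
        # rng.sample draws without replacement, so kept examples are distinct.
        pairs.extend((item, label) for item in rng.sample(items, target))
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


def random_oversample(samples, labels, seed=0):
    """Randomly duplicate examples so every class matches the largest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(items) for items in groups.values())
    pairs = []
    for label, items in groups.items():
        # Keep every original example and add random copies to fill the gap.
        copies = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((item, label) for item in items + copies)
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


# Hypothetical toy training split: 8 real articles and 2 fake ones.
texts = [f"real article {i}" for i in range(8)] + [f"fake article {i}" for i in range(2)]
labels = ["real"] * 8 + ["fake"] * 2

_, under_labels = random_undersample(texts, labels)
_, over_labels = random_oversample(texts, labels)
print(Counter(under_labels))  # both classes reduced to the minority size
print(Counter(over_labels))   # both classes raised to the majority size
```

As the text notes, resampling is applied to the training data only; a held-out test split would normally keep its original class distribution so that the reported metrics reflect the real imbalance.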

FIGURE 14. Random under-sampling.

FIGURE 15. Random over-sampling.

In random over-sampling, examples from the minority class are randomly duplicated or synthetically generated to increase their representation in the training dataset. This can be achieved by randomly selecting examples from the minority class and duplicating them, or by generating synthetic examples using techniques such as SMOTE (Synthetic Minority Over-sampling Technique) [232] or ADASYN (Adaptive Synthetic Sampling) [233]. Random over-sampling can help increase the representation of the minority class, but it may also result in overfitting or amplification of noise if not done carefully.
The second method for resolving class imbalance is to create a new algorithm or modify an existing one. Instead of changing the distribution of the training data, the change is applied to the learning and decision process by increasing the importance of the positive class. Cost-sensitive methods, recognition-based approaches, and kernel-based learning, such as support vector machines (SVM) and radial basis functions [234], are among the algorithms that have been adapted to address the class imbalance problem. Typically, algorithms specially developed for dealing with class imbalance work very well in the specific domain for which they were designed. However, they will fail in other domains, and they require a thorough understanding of the algorithm to implement the modifications [231].
Our findings show that imbalance techniques, such as oversampling and downsampling, yield promising results in improving the performance of different classifiers, including RNN variations, CNN, and hybrid models like CNN+LSTM. Oversampling was effective in improving the accuracy of LSTM and CNN models in [21], achieving accuracies of 95.51% and 98.96%, respectively. Similarly, oversampling was also beneficial for BERT, achieving an accuracy of 94.66% in [179].
Moreover, the use of SMOTE oversampling has demonstrated effectiveness in dealing with class imbalance. For instance, in [235], SMOTE was used to improve the performance of a DNN model, achieving an accuracy of 98% on the Politifact dataset.
Additionally, another study [177] utilized the focal loss function to prevent classification bias towards the majority class, which significantly improved the performance of their models on imbalanced datasets.
Downsampling has shown effectiveness in improving the training accuracy of hybrid models like CNN+LSTM on the PHEME and FN-COV datasets in [34], achieving accuracy rates of 91.88% and 98.62%, respectively. Additionally, downsampling improved the accuracy of CNN and LSTM models in [52], achieving accuracies of 92.38% and 93.56%, respectively.
It should be noted that despite the effectiveness of class imbalance techniques in improving the accuracy of fake news detection models, a significant portion of the literature has not thoroughly investigated or addressed this issue. From our findings, only five research articles investigated the class imbalance effect on fake news detection. This highlights the need for further research and exploration of various imbalance techniques to better understand their impact on model performance and generalizability in the context of fake news detection.

B. CHALLENGES RELATED TO THE CLASS IMBALANCE PROBLEM IN FAKE NEWS DETECTION
Class imbalance is a common problem in multiple application domains, and fake news detection is not an exception. However, it has not received as much attention as it deserves in the context of fake news detection, which we consider a big challenge to be addressed. The imbalance between real and fake news samples in the dataset can lead to biased classification, where the model performs well on the majority class but poorly on the minority class [231]. Even when deep learning models are used, it was shown that the class imbalance problem still affects the performance of the models [236].
One of the challenges related to the class imbalance problem in fake news detection is the use of adequate performance assessment metrics to evaluate the model’s performance. Traditional metrics such as accuracy can be misleading, as the model may perform well on the majority class but fail to correctly identify the minority class. This issue emphasizes the need for specialized metrics such as F1 score, precision, and recall [237].
In the FND domain, there is a lack of systematic studies that evaluate the impact of known techniques for dealing with class imbalance. Techniques such as oversampling, undersampling, and ensemble methods have been widely used in many other domains. However, their effectiveness in fake news detection remains understudied. Therefore, more research is needed to explore the effectiveness of these techniques in the FND domain. An important challenge with the application of these techniques for fake news detection is related to the generation of fake news texts. In this case, it is
necessary to generate complete texts that look like real news, but it is also necessary to generate texts that correspond to fake news. This leads to another challenge connected to the need to carefully craft the synthetic text generation so that it corresponds to either fake news or real news. In particular, the generated fake news articles should be realistic and representative of the actual fake news articles to ensure the effectiveness of the model.
The context is also a challenge when considering the generation of fake news. Since fake news is often generated in response to specific events or situations, it can be difficult to apply generic techniques for dealing with class imbalance that do not consider the specific context in which the fake news was generated.
Lastly, special-purpose algorithms that can deal with the class imbalance problem have not been explored or evaluated for FND. These algorithms include cost-sensitive learning, manipulating the loss functions, or building ensembles that are specially developed to address the class imbalance problem [231]. These techniques have shown promising results in multiple domains, and their effectiveness in FND requires further investigation.
In conclusion, addressing the class imbalance problem in fake news detection is crucial for developing accurate and reliable models. Still, not much research has been done to address this problem. Researchers and practitioners need to pay more attention to this problem and explore various techniques to overcome it. This is an area on which future research should focus and that may lead to improved solutions for FND.

VII. ANSWERS TO RESEARCH QUESTIONS
In this section, we attempt to answer the research questions presented in Section II-B based on our findings. The detailed answers are described below.
RQ1: Which algorithms are used for fake news detection throughout time?
Given our findings, deep learning models are considered effective models in fake news detection. There is a notable increase in the number of articles that address the different models and architectures for this task. We also noticed that the research focus shifted towards deep learning models for FND during the global COVID-19 pandemic in 2021, which forms about 83% of the research effort that was conducted on FND. The remaining 17% of the FND research was conducted before this year.
Our findings also show that fake news can be detected by CNN, RNN, GRU, LSTM, and BERT models in many variations and with different architectures. We noticed that LSTM/(Bi)LSTM were the models that appeared most frequently in the surveyed articles. The detection was also examined using hybrid models, which increased the detection effectiveness at some points. It is also noticeable that using the BERT model in the detection of fake news has a huge positive impact on the detection effectiveness.
RQ2: Which datasets are used in the fake news detection domain?
The most difficult part of detecting fake news is the absence of a labelled dataset of an accepted size with trustworthy ground truth labels [195]. Over the last few years, researchers have attempted to collect datasets for several uses in DL. The collected datasets vary widely from one another depending on the purpose of the study. For instance, some of these datasets are political and consist of political statements, as is the case in PolitiFact. Other datasets are built with news articles collected in a specific time frame, while other datasets include social media posts, such as those from Twitter. Moreover, fake news is frequently collected from duplicitous websites intended to disseminate misinformation. This fake news will end up being shared on social media platforms by its creator. This fake news will also be shared by other individuals unintentionally without checking the news source or by other malicious users and bots.
Our findings show that Liar, ISOT, PHEME, and FakeNewsNet (with its three variations) are the most popular datasets being used in fake news detection. These six datasets have been used in about 80% of the surveyed articles. We also noticed that researchers frequently attempted to create their own dataset to reach the required size and domain, which is evident in about 45% of the surveyed studies. Other researchers combined two or more datasets to have an acceptable-sized dataset.
It is also worth mentioning that selecting a proper dataset is a crucial task in fake news detection since it will impact the detection effectiveness. It is noticeable from our findings that applying the same detection model to different datasets yields enormous differences in the detection accuracy [28], [30], [41], [42], [43], [46], [50], [122], [125], [135], [136], [163], [170], [172], [186], [188].
RQ3: How effective are deep learning methods for fake news detection?
Researchers studied various DL algorithms in the detection and classification of fake news, as we mentioned previously. These algorithms include CNN, RNN (with its variations), GNN, BERT and Attention-based mechanisms, and hybrid approaches. The detection effectiveness of these algorithms is influenced by the datasets used and the combination of different architectures for detection.
CNN and (Bi)LSTM have been the most used detection models and achieved the highest detection accuracy when compared against other approaches. RNNs, including their variations such as LSTM/(Bi)LSTM and GRU, are utilized with considerable effectiveness in about 70% of the surveyed articles. Their ability to maintain information over sequences allows them to understand context better, which is essential for identifying fake news. CNNs, on the other hand, have proven to be effective for fake news detection tasks, appearing in 61% of the research articles we surveyed. BERT and hybrid detection models have also shown noticeable detection effectiveness, appearing in about 47% of the surveyed articles. Feedforward
Neural Networks and Graph Neural Networks were also used in the detection process, even though not in many studies. It is worth mentioning that in one research article, many deep learning models were developed to draw comprehensive conclusions. Hence, the total percentage of all the models that appeared in the surveyed articles is more than 100%.
The detailed effectiveness of DL detection models in the fake news field is in Section III.
RQ4: Which solutions incorporate transfer learning mechanisms, if any?
Transfer learning is the process of exploiting what has been learned in one task to improve the generalization in another task [204]. The goal of transfer learning is to improve learning in the target task by leveraging knowledge from the source task.
Transfer learning is not applied in many fake news detection studies, as our findings show. There are only seven research articles that examined the effect of transfer learning on detection accuracy [22], [71], [91], [98], [179], [183], [227]. However, utilizing transfer learning strategies increased the detection accuracy. The transfer learning approaches used in the FND domain may be categorized as fine-tuning pre-trained models, using CNN-based architectures, employing pre-trained hybrid models, and leveraging transformer-based models. The highest improvement obtained with transfer learning was an accuracy of 93.2% when applying the pre-trained AlexNet model, which represents an improvement of 23.1% over the baseline case without transfer learning.
It is worth mentioning that applying the same detection model on different datasets recorded enormous differences in the detection accuracy [28], [29], [30], [41], [43], [46], [50], [122], [125], [135], [136], [163], [170], [172], [186], [188]. This issue might be tackled by including a transfer learning approach so the detection model can report an approximate accuracy.
RQ5: Which solutions deal with different levels of an imbalanced dataset?
A dataset with a skewed class distribution where the end-user preferences are biased towards the least represented class(es) suffers from a class imbalance problem. A model learned under these conditions will focus on the majority class and will not correctly learn the minority, and most important, classes [231].
Most of the available fake news datasets are imbalanced. From the articles we surveyed, only seven papers specifically treated class imbalance and studied its effect on fake news detection by utilizing various strategies to handle this issue. These strategies were: random and advanced oversampling in four articles, random undersampling in two articles, and utilizing a different loss function in one article. Oversampling was deployed by increasing the number of instances in the minority class to match the majority class, which improved the detection effectiveness [21], [31], [179]. In addition, advanced oversampling techniques, such as the Synthetic Minority Over-sampling Technique (SMOTE), generate synthetic examples of the minority class by interpolating between existing instances rather than duplicating existing ones. SMOTE helped the trained model reach about 95% detection accuracy, a noticeable improvement compared to the baseline case without treating the class imbalance [235]. This helped to mitigate the risk of overfitting and enhanced the model’s generalization ability.
Another strategy presented to balance the dataset was random undersampling, which involves reducing the number of instances in the majority class to match the minority class [34], [52].
Finally, the focal loss function is designed to address the class imbalance by down-weighting the loss assigned to well-classified examples and focusing more on hard-to-classify instances [177]. This approach helps to prevent the model from becoming biased towards the majority class and ensures that the minority class instances are given appropriate attention during training.
Handling the imbalanced dataset achieved better accuracy compared to the baseline cases that do not deal with the class imbalance. Thus, this is a relevant area for further research that can lead to improved solutions for FND.

VIII. THREATS TO VALIDITY
SLRs are prone to several threats to validity that may lead to a bias in the review outcomes. These threats are publication bias and errors in data collection, study exclusion, and data extraction. Regarding publication bias, studies with positive results are more likely to be selected than studies with negative results. This issue is alleviated by attempting to determine whether the studies discuss their results and limitations. Moreover, the sole purpose of this SLR is to report the effectiveness of DL models rather than present new results. In addition, there is no motivation from our SLR to select studies reporting only positive results.
Regarding filtering out studies based on the search criteria, we aimed to have a broad search query, as we mentioned in Section II-C, to alleviate this threat. We could also expand the survey date range to include studies published before 2018. However, fake news detection became a more popular research topic from 2018 onward, and we aimed to provide an updated review of the most recent trends in this application domain. This motivation is supported by Figure 3, which demonstrates the remarkable increase in fake news detection publications over time.
Regarding incorrectly excluded articles and data extraction errors, we alleviated this issue by asking another researcher to review a random selection of studies. There is no rule for determining the number of articles for the random check task, but about half of the surveyed articles were selected for this special check.

IX. MAIN GAPS AND OPEN ISSUES
From our investigation, we gathered a list of the main gaps and open challenges that still deserve the attention
of the research community for the fake news detection problem. We must highlight that this is a challenging task, involving several difficulties which we describe to allow future researchers to focus on the most important open issues.
• Lack of labelled data: One of the major challenges in training deep learning models for fake news detection is the limited availability of labelled data [197], [202]. Fake news datasets are often small, and obtaining accurate and comprehensive annotations for training can be challenging. This can impact the performance and generalization of deep learning models, as they heavily rely on large amounts of labelled data for effective training.
• Potentially biased datasets: Another issue in fake news detection is the potential bias in the datasets used for training and evaluation [201]. Fake news datasets may contain inherent biases, such as political or cultural biases, that can affect the performance and fairness of deep learning models. It is essential to carefully curate and preprocess datasets to mitigate these biases and ensure the reliability and generalizability of the models.
• Lack of benchmarks: There is a lack of standardized benchmarks for evaluating the performance of deep learning models in fake news detection. The absence of benchmark datasets, evaluation metrics, and protocols makes it challenging to compare the performance of different models and assess their effectiveness [186], [238]. The development of standardized benchmarks can facilitate fair and rigorous comparisons and foster advancements in the field.
• Transfer learning solutions not sufficiently explored: Transfer learning, which leverages pre-trained models for feature extraction or model initialization, has shown promise in improving the performance of deep learning models for various tasks [239], [240]. However, in the context of fake news detection, the exploration of transfer learning solutions is still limited. There is a need to further investigate and optimize transfer learning approaches for fake news detection to leverage knowledge from related tasks and domains.
• Class imbalance not adequately addressed: Class imbalance, where the number of samples in different classes differs significantly, is a common issue in fake news detection. Deep learning models trained on imbalanced datasets may produce biased and inaccurate predictions, as they tend to be biased towards the majority class [241], [242]. Although some studies have explored imbalance techniques such as oversampling or undersampling, the effectiveness of these techniques in deep learning for fake news detection needs further investigation.
• Limited understanding of fake news dynamics: Despite extensive research on fake news, there is still a limited understanding of the complex dynamics and mechanisms underlying the spread and impact of misinformation [243]. Deep learning models for fake news detection may be limited by the lack of a comprehensive understanding of how fake news is created, disseminated, and received, which can impact the models’ accuracy and effectiveness. Further research is needed to better understand the underlying dynamics of fake news and inform the development of more effective solutions.
• Real-world applicability: While deep learning models for fake news detection show promising results in controlled research settings, their real-world applicability and effectiveness in detecting fake news in diverse and dynamic environments, such as social media or online news platforms, is still a challenge. Real-world factors, such as varying levels of information quality, diverse sources of misinformation, and rapid information spread, can impact the performance and reliability of deep learning models in practical scenarios [198], [203].

X. CONCLUSION
The increasing volume of people using communication platforms has opened the door for the spread of fake news. Fake news can influence readers in many aspects, and it is crucial to understand this phenomenon and study mechanisms that allow its early detection. Deep learning has shown its potential in various tasks, including natural language processing, and our systematic literature review highlights its effectiveness in fake news detection.
From our findings, the main categories of algorithms used for FND are CNN, RNN, GNN, Attention-based mechanisms, and BERT. Among these, the most frequently used are RNN-based models, which include the (Bi)LSTM. We also found that Liar, ISOT, PHEME, and FakeNewsNet are the publicly available datasets most frequently used in fake news detection. These datasets are a central aspect because selecting a proper dataset is crucial. In effect, the data selection will have an important impact on the detection effectiveness.
Finally, we found that transfer learning and techniques for handling the class imbalance problem are not widely explored in fake news detection studies, even though these techniques have shown promising results in increasing detection accuracy in many fields. Overall, our systematic review highlights the potential of deep learning in fake news detection and identifies important areas for future research. We also provide a comprehensive list of the main gaps and open issues in this domain to guide the next steps of research in this area.

REFERENCES

[1] H. Allcott and M. Gentzkow, “Social media and fake news in the 2016 election,” J. Econ. Perspect., vol. 31, no. 2, pp. 211–236, May 2017.
[2] M. R. Islam, M. A. Kabir, A. Ahmed, A. R. M. Kamal, H. Wang, and A. Ulhaq, “Depression detection from social network data using machine learning techniques,” Health Inf. Sci. Syst., vol. 6, no. 1, pp. 1–12, Dec. 2018.
[3] H. Gao and H. Liu, “Data analysis on location-based social networks,” in Mobile Social Networking: An Innovative Approach, 2014, pp. 165–194.
[4] K. Sharma, F. Qian, H. Jiang, N. Ruchansky, M. Zhang, and Y. Liu, “Combating fake news: A survey on identification and mitigation techniques,” ACM Trans. Intell. Syst. Technol., vol. 10, no. 3, pp. 1–42, May 2019.
[5] L. Wu, J. Li, X. Hu, and H. Liu, “Gleaning wisdom from the past: Early detection of emerging rumors in social media,” in Proc. SIAM Int. Conf. Data Mining, 2017, pp. 99–107.
[6] L. Wu, F. Morstatter, K. M. Carley, and H. Liu, “Misinformation in social media: Definition, manipulation, and detection,” ACM SIGKDD Explorations Newslett., vol. 21, no. 2, pp. 80–90, Nov. 2019.
[7] J. Ma, W. Gao, P. Mitra, S. Kwon, B. J. Jansen, K.-F. Wong, and M. Cha, “Detecting rumors from microblogs with recurrent neural networks,” in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), 2016, pp. 3818–3824.
[8] S. K. Bharti, R. Pradhan, K. S. Babu, and S. K. Jena, “Sarcasm analysis on twitter data using machine learning approaches,” in Trends in Social Network Analysis: Information Propagation, User Behavior Modeling, Forecasting, and Vulnerability Assessment, 2017, pp. 51–76.
[9] S. Helmstetter and H. Paulheim, “Weakly supervised learning for fake news detection on Twitter,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining (ASONAM), Aug. 2018, pp. 274–277.
[10] S. Kumar and N. Shah, “False information on web and social media: A survey,” 2018, arXiv:1804.08559.
[11] K. Shu, L. Cui, S. Wang, D. Lee, and H. Liu, “DEFEND: Explainable fake news detection,” in Proc. 25th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Jul. 2019, pp. 395–405.
[12] R. K. Kaliyar, A. Goswami, P. Narang, and S. Sinha, “FNDNet – A deep convolutional neural network for fake news detection,” Cognit. Syst. Res., vol. 61, pp. 32–44, Jun. 2020.
[13] S. Keele, “Guidelines for performing systematic literature reviews in software engineering,” Tech. Rep., 2007.
[14] S. Jalali and C. Wohlin, “Systematic literature studies: Database searches vs. Backward snowballing,” in Proc. ACM-IEEE Int. Symp. Empirical Softw. Eng. Meas., Sep. 2012, pp. 29–38.
[15] J. Babineau, “Product review: Covidence (Systematic review Software),” J. Can. Health Libraries Assoc. J. de l’Association des bibliothèques de la santé du Canada, vol. 35, no. 2, p. 68, Aug. 2014.
[16] S. Girgis, E. Amer, and M. Gadallah, “Deep learning algorithms for detecting fake news in online text,” in Proc. 13th Int. Conf. Comput. Eng. Syst. (ICCES), Dec. 2018, pp. 93–97.
[17] Q. Abbas, M. U. Zeshan, and M. Asif, “A CNN-RNN based fake news detection model using deep learning,” in Proc. Int. Seminar Comput. Sci. Eng. Technol. (SCSET), Jan. 2022, pp. 40–45.
[18] A. Abdullah, M. Awan, M. Shehzad, and M. Ashraf, “Fake news classification bimodal using convolutional neural network and long short-term memory,” Int. J. Emerg. Technol. Learn, vol. 11, pp. 209–212, Jul. 2020.
[27] Md. Z. H. George, N. Hossain, Md. R. Bhuiyan, A. K. M. Masum, and S. Abujar, “Bangla fake news detection based on multichannel combined CNN-LSTM,” in Proc. 12th Int. Conf. Comput. Commun. Netw. Technol. (ICCCNT), Jul. 2021, pp. 1–5.
[28] M. H. Goldani, R. Safabakhsh, and S. Momtazi, “Convolutional neural network with margin loss for fake news detection,” Inf. Process. Manage., vol. 58, no. 1, Jan. 2021, Art. no. 102418.
[29] S. Gonwirat, A. Choompol, and N. Wichapa, “A combined deep learning model based on the ideal distance weighting method for fake news detection,” Int. J. Data Netw. Sci., vol. 6, no. 2, pp. 347–354, 2022.
[30] Y.-F. Huang and P.-H. Chen, “Fake news detection using an ensemble learning model based on self-adaptive harmony search algorithms,” Expert Syst. Appl., vol. 159, Nov. 2020, Art. no. 113584.
[31] V.-I. Ilie, C.-O. Truica, E.-S. Apostol, and A. Paschke, “Context-aware misinformation detection: A benchmark of deep learning architectures using word embeddings,” IEEE Access, vol. 9, pp. 162122–162146, 2021.
[32] K. Ivancová, M. Sarnovský, and V. Maslej-Krešňáková, “Fake news detection in Slovak language using deep learning techniques,” in Proc. IEEE 19th World Symp. Appl. Mach. Intell. Informat. (SAMI), Jan. 2021, pp. 000255–000260.
[33] Y. Ji, “Fake news detection based on a bi-directional LSTM with CNN,” in Proc. 3rd Int. Conf. Comput. Data Sci. (CONF-CDS). Springer, 2022, pp. 36–44.
[34] R. Kaliyar, A. Goswami, and P. Narang, “A hybrid model for effective fake news detection with a novel COVID-19 dataset,” in Proc. 13th Int. Conf. Agents Artif. Intell., 2021, pp. 1066–1072.
[35] R. K. Kaliyar, A. Mohnot, R. Raghhul, V. Prathyushaa, A. Goswami, N. Singh, and P. Dash, “Multideepfake: Improving fake news detection with a deep convolutional neural network using a multimodal dataset,” in Proc. 10th Int. Conf. Adv. Comput. (IACC), Panaji, Goa, India. Springer, 2021, pp. 267–279.
[36] R. K. Kaliyar, R. Singh, S. N. Laya, M. S. Sudharshan, A. Goswami, and D. Garg, “Rumeval2020-an effective approach for rumour detection with a deep hybrid C-LSTM model,” in Proc. 10th Int. Conf. Adv. Comput. (IACC), Panaji, Goa, India. Springer, Dec. 2021, pp. 300–312.
[37] A. Zubiaga, A. Aker, K. Bontcheva, M. Liakata, and R. Procter, “Detection and resolution of rumours in social media: A survey,” ACM Comput. Surveys, vol. 51, no. 2, pp. 1–36, Mar. 2019.
[38] V. L. Rubin, Y. Chen, and N. K. Conroy, “Deception detection for news: Three types of fakes,” in Proc. Assoc. Sci. Technol., Jan. 2015, vol. 52, no. 1, pp. 1–4.
[39] J. Brummette, M. DiStaso, M. Vafeiadis, and M. Messner, “Read all about it: The politicization of ‘fake news’ on Twitter,” Journalism Mass Commun. Quart., vol. 95, no. 2, pp. 497–517, 2018.
[40] A. Marlatt. Records Suggest 2020 Election Conspiracy Involved 80m People. Accessed: Nov. 11, 2022. [Online]. Available: https://www.satirewire.com/claim-anti-trump-conspiracy-involved-
[19] A. Abedalla, A. Al-Sadi, and M. Abdullah, ‘‘A closer look at fake news 80-million-people/
detection: A deep learning perspective,’’ in Proc. 3rd Int. Conf. Adv. Artif. [41] A. J. Keya, S. Afridi, A. S. Maria, S. S. Pinki, J. Ghosh, and M. F. Mridha,
Intell., Oct. 2019, pp. 24–28. ‘‘Fake news detection based on deep learning,’’ in Proc. Int. Conf. Sci.
[20] M. Al-Sarem, A. Alsaeedi, F. Saeed, W. Boulila, and O. AmeerBakhsh, ‘‘A Contemp. Technol. (ICSCT), Aug. 2021, pp. 1–6.
novel hybrid deep learning model for detecting COVID-19-related rumors [42] P. M. Konkobo, R. Zhang, S. Huang, T. T. Minoungou, J. A. Ouedraogo,
on social media based on LSTM and concatenated parallel CNNs,’’ Appl. and L. Li, ‘‘A deep learning model for early detection of fake news on
Sci., vol. 11, no. 17, p. 7940, Aug. 2021. social media,’’ in Proc. 7th Int. Conf. Behavioural Social Comput. (BESC),
[21] M. N. Alenezi and Z. M. Alqenaei, ‘‘Machine learning in detecting Nov. 2020, pp. 1–6.
COVID-19 misinformation on Twitter,’’ Future Internet, vol. 13, no. 10, [43] R. Kozik, S. Kula, M. Choraś, and M. Woźniak, ‘‘Technical solution
p. 244, Sep. 2021. to counter potential crime: Text analysis to detect fake news and
[22] N. M. AlShariah and A. Khader, ‘‘Detecting fake images on social media disinformation,’’ J. Comput. Sci., vol. 60, Apr. 2022, Art. no. 101576.
using machine learning,’’ Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 12, [44] V. M. Kresnáková, M. Sarnovský, and P. Butka, ‘‘Deep learning methods
pp. 170–176, 2019. for fake news detection,’’ in Proc. IEEE 19th Int. Symp. Comput. Intell.
[23] M. Z. Asghar, A. Habib, A. Habib, A. Khan, R. Ali, and A. Khattak, Informat. 7th IEEE Int. Conf. Recent Achievements Mechatronics, Autom.,
‘‘Exploring deep neural networks for rumor detection,’’ J. Ambient Intell. Comput. Sci. Robot. (CINTI-MACRo), Nov. 2019, pp. 000143–000148.
Humanized Comput., vol. 12, no. 4, pp. 4315–4333, Apr. 2021. [45] E. Masciari, V. Moscato, A. Picariello, and G. Sperlí, ‘‘Detecting fake
[24] M. C. Buzea, S. Trausan-Matu, and T. Rebedea, ‘‘Automatic fake news news by image analysis,’’ in Proc. 24th Symp. Int. Database Eng. Appl.,
detection for Romanian online news,’’ Information, vol. 13, no. 3, p. 151, Aug. 2020, pp. 1–5.
Mar. 2022. [46] J. A. Nasir, O. S. Khan, and I. Varlamis, ‘‘Fake news detection: A hybrid
[25] M. K. Elhadad, K. F. Li, and F. Gebali, ‘‘An ensemble deep learning CNN-RNN based deep learning approach,’’ Int. J. Inf. Manage. Data
technique to detect COVID-19 misleading information,’’ in Proc. 23rd Int. Insights, vol. 1, no. 1, Apr. 2021, Art. no. 100007.
Conf. Network-Based Inf. Syst. Adv. Networked-Based Inf. Syst. (NBiS- [47] A. Priya and A. Kumar, ‘‘Deep ensemble approach for COVID-19 fake
2020). Springer, 2020, pp. 163–175. news detection from social media,’’ in Proc. 8th Int. Conf. Signal Process.
[26] K. M. Fouad, S. F. Sabbeh, and W. Medhat, ‘‘Arabic fake news Integr. Netw. (SPIN), Aug. 2021, pp. 396–401.
detection using deep learning,’’ Comput., Mater. Continua, vol. 71, no. 2, [48] C. Raj and P. Meel, ‘‘ConvNet frameworks for multi-modal fake news
pp. 3647–3665, 2022. detection,’’ Appl. Intell., vol. 51, no. 11, pp. 1–17, 2021.

114454 VOLUME 12, 2024


M. Q. Alnabhan, P. Branco: Fake News Detection Using DL: A Systematic Literature Review

[49] S. P. Ramya and R. Eswari, ‘‘Attention-based deep learning models for detection of fake news in social networks,’’ Int. J. Cognit. Informat. Natural Intell., vol. 15, no. 4, pp. 1–25, Jan. 2022.
[50] H. Saleh, A. Alharbi, and S. H. Alsamhi, ‘‘OPCNN-FAKE: Optimized convolutional neural network for fake news detection,’’ IEEE Access, vol. 9, pp. 129471–129489, 2021.
[51] M. Samadi, M. Mousavian, and S. Momtazi, ‘‘Persian fake news detection: Neural representation and classification at word and text levels,’’ ACM Trans. Asian Low-Resource Lang. Inf. Process., vol. 21, no. 1, pp. 1–11, Jan. 2022.
[52] M. Sarnovskỳ, V. Maslej-Krešňáková, and K. Ivancová, ‘‘Fake news detection related to the COVID-19 in Slovak language using deep learning methods,’’ Acta Polytechnica Hungarica, vol. 19, no. 2, pp. 43–57, 2022.
[53] I. K. Sastrawan, I. P. A. Bayupati, and D. M. S. Arsa, ‘‘Detection of fake news using deep learning CNN–RNN based methods,’’ ICT Exp., vol. 8, no. 3, pp. 396–408, Sep. 2022.
[54] K. L. Tan, C. Poo Lee, and K. M. Lim, ‘‘FN-net: A deep convolutional neural network for fake news detection,’’ in Proc. 9th Int. Conf. Inf. Commun. Technol. (ICoICT), Aug. 2021, pp. 331–336.
[55] M. P. Thilakarathna, V. A. Wijayasekara, Y. Gamage, K. H. Peiris, C. Abeysinghe, I. Rafaideen, and P. Vekneswaran, ‘‘Hybrid approach and architecture to detect fake news on Twitter in real-time using neural networks,’’ in Proc. 5th Int. Conf. Inf. Technol. Res. (ICITR), Dec. 2020, pp. 1–6.
[56] F. Torgheh, M. R. Keyvanpour, B. Masoumi, and S. V. Shojaedini, ‘‘A novel method for detecting fake news: Deep learning based on propagation path concept,’’ in Proc. 26th Int. Comput. Conf., Comput. Soc. Iran (CSICC), Iran, Mar. 2021, pp. 1–5.
[57] A. Wani, I. Joshi, S. Khandve, V. Wagh, and R. Joshi, ‘‘Evaluating deep learning approaches for COVID19 fake news detection,’’ in Proc. Int. Workshop Combating Online Hostile Posts Regional Lang. During Emergency Situation. Springer, 2021, pp. 153–163.
[58] Z. Wang, Z. Yin, and Y. A. Argyris, ‘‘Detecting medical misinformation on social media using multimodal deep learning,’’ IEEE J. Biomed. Health Informat., vol. 25, no. 6, pp. 2193–2203, Jun. 2021.
[59] F. Xing and C. Guo, ‘‘Mining semantic information in rumor detection via a deep visual perception based recurrent neural networks,’’ in Proc. IEEE Int. Congr. Big Data (BigDataCongress), Jul. 2019, pp. 17–23.
[60] A. Zervopoulos, A. G. Alvanou, K. Bezas, A. Papamichail, M. Maragoudakis, and K. Kermanidis, ‘‘Deep learning for fake news detection on Twitter regarding the 2019 Hong Kong protests,’’ Neural Comput. Appl., vol. 34, no. 2, pp. 969–982, Jan. 2022.
[61] Y. Wang, F. Ma, Z. Jin, Y. Yuan, G. Xun, K. Jha, L. Su, and J. Gao, ‘‘EANN: Event adversarial neural networks for multi-modal fake news detection,’’ in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2018, pp. 849–857.
[62] R. K. Kaliyar, ‘‘Fake news detection using a deep neural network,’’ in Proc. 4th Int. Conf. Comput. Commun. Autom. (ICCCA), Dec. 2018, pp. 1–7.
[63] O. Ajao, D. Bhowmik, and S. Zargari, ‘‘Fake news identification on Twitter with hybrid CNN and RNN models,’’ in Proc. 9th Int. Conf. Social Media Soc., Jul. 2018, pp. 226–230.
[64] L. Wu, Y. Rao, H. Yu, Y. Wang, and A. Nazir, ‘‘False information detection on social media via a hybrid deep model,’’ in Proc. 10th Int. Conf. Social Inform. (SocInfo), St. Petersburg, Russia. Springer, Sep. 2018, pp. 323–333.
[65] K. Popat, S. Mukherjee, A. Yates, and G. Weikum, ‘‘DeClarE: Debunking fake news and false claims using evidence-aware deep learning,’’ 2018, arXiv:1809.06416.
[66] S. Alyoubi, M. Kalkatawi, and F. Abukhodair, ‘‘The detection of fake news in Arabic tweets using deep learning,’’ Appl. Sci., vol. 13, no. 14, p. 8209, Jul. 2023.
[67] G. Güler and S. Gündüz, ‘‘Deep learning based fake news detection on social media,’’ Int. J. Inf. Secur. Sci., vol. 12, no. 2, pp. 1–21, Jun. 2023.
[68] Y. Doke, ‘‘Deep fake detection through deep learning,’’ Int. J. Res. Appl. Sci. Eng. Technol., vol. 11, no. 5, pp. 861–866, May 2023.
[69] Y. Lu and H. Ye, ‘‘Detection method of fake news spread in social network based on deep learning,’’ in Proc. Int. Conf. Adv. Hybrid Inf. Process. Springer, 2022, pp. 473–488.
[70] F. Mira, ‘‘Deep learning technique for recognition of deep fake videos,’’ in Proc. IEEE IAS Global Conf. Emerg. Technol. (GlobConET), May 2023, pp. 1–4.
[71] C. Mallick, S. Mishra, and M. R. Senapati, ‘‘A cooperative deep learning model for fake news detection in online social networks,’’ J. Ambient Intell. Humanized Comput., vol. 14, no. 4, pp. 4451–4460, Apr. 2023.
[72] M. Madani, H. Motameni, and R. Roshani, ‘‘Fake news detection using feature extraction, natural language processing, curriculum learning, and deep learning,’’ Int. J. Inf. Technol. Decis. Making, vol. 23, no. 3, pp. 1063–1098, May 2024.
[73] G. Mareeswari and E. V. Dinesh, ‘‘Deep neural networks based detection and analysis of fake tweets,’’ in Proc. 4th Int. Conf. Signal Process. Commun. (ICSPC), Mar. 2023, pp. 56–61.
[74] M. Samadi and S. Momtazi, ‘‘Fake news detection: Deep semantic representation with enhanced feature engineering,’’ Int. J. Data Sci. Analytics, pp. 1–12, Mar. 2023.
[75] J. Alghamdi, Y. Lin, and S. Luo, ‘‘Does context matter? Effective deep learning approaches to curb fake news dissemination on social media,’’ Appl. Sci., vol. 13, no. 5, p. 3345, Mar. 2023.
[76] F. W. R. Tokpa, B. H. Kamagaté, V. Monsan, and S. Oumtanaga, ‘‘Fake news detection in social media: Hybrid deep learning approaches,’’ J. Adv. Inf. Technol., vol. 14, no. 3, pp. 606–615, 2023.
[77] O. Prakash and R. Kumar, ‘‘Fake news detection in social networks using attention mechanism,’’ in Proc. Int. Conf. Cogn. Intell. Comput. (ICCIC). Springer, 2023, pp. 453–462.
[78] S. Kumar, A. Kumar, A. Mallik, and R. R. Singh, ‘‘OptNet-fake: Fake news detection in socio-cyber platforms using grasshopper optimization and deep neural network,’’ IEEE Trans. Computat. Social Syst., early access, 2023, doi: 10.1109/TCSS.2023.3246479.
[79] Q. Zhang, Z. Guo, Y. Zhu, P. Vijayakumar, A. Castiglione, and B. B. Gupta, ‘‘A deep learning-based fast fake news detection model for cyber-physical social services,’’ Pattern Recognit. Lett., vol. 168, pp. 31–38, Apr. 2023.
[80] A. Kishwar and A. Zafar, ‘‘Fake news detection on Pakistani news using machine learning and deep learning,’’ Expert Syst. Appl., vol. 211, Jan. 2023, Art. no. 118558.
[81] P. K. Verma, P. Agrawal, V. Madaan, and R. Prodan, ‘‘MCred: Multi-modal message credibility for fake news detection using BERT and CNN,’’ J. Ambient Intell. Humanized Comput., vol. 14, no. 8, pp. 10617–10629, Aug. 2023.
[82] A. K. Yadav, S. Kumar, D. Kumar, L. Kumar, K. Kumar, S. K. Maurya, M. Kumar, and D. Yadav, ‘‘Fake news detection using hybrid deep learning method,’’ Social Netw. Comput. Sci., vol. 4, no. 6, p. 845, Nov. 2023.
[83] A. Saeed and E. A. Solami, ‘‘Fake news detection using machine learning and deep learning methods,’’ Comput., Mater. Continua, vol. 77, no. 2, pp. 2079–2096, 2023.
[84] A. Y. Umar, I. S. Ahmad, and K. Muhammad, ‘‘Fake news detection using CNN/GRU deep learning model,’’ Sule Lamido Univ. J. Sci. Technol., vol. 7, no. 1, pp. 57–67, 2023.
[85] Y. Singh, ‘‘Fake news detection using LSTM in TensorFlow and deep learning,’’ J. Appl. Sci. Educ. (JASE), vol. 3, no. 2, pp. 1–14, 2023.
[86] P. Sharma and R. Sahu, ‘‘Fake news detection using deep learning based approach,’’ in Proc. Int. Conf. Circuit Power Comput. Technol. (ICCPCT), Aug. 2023, pp. 651–656.
[87] M. R. H. Shezan, M. N. Zawad, Y. A. Shahed, and S. Ripon, ‘‘Bangla fake news detection using hybrid deep learning models,’’ in Applied Informatics for Industry 4.0. Boca Raton, FL, USA: CRC Press, 2023, pp. 46–60.
[88] C. Nandhakumar, C. Kowsika, R. Reshema, and L. Sandhiya, ‘‘Fake news detection using machine learning and deep learning classifiers,’’ in Proc. Int. Conf. Inf. Commun. Technol. Intell. Syst. Springer, 2023, pp. 165–175.
[89] P. M. Subhash, D. Gupta, S. Palaniswamy, and M. Venugopalan, ‘‘Fake news detection using deep learning and transformer-based model,’’ in Proc. 14th Int. Conf. Comput. Commun. Netw. Technol. (ICCCNT), Jul. 2023, pp. 1–6.
[90] A. Jaiswal, H. Verma, and N. Sachdeva, ‘‘Swarm optimized fake news detection on social-media textual content using deep learning,’’ in Proc. Int. Conf. Adv. Comput., Commun. Appl. Informat. (ACCAI), May 2023, pp. 1–8.
[91] I. Ennejjai, A. Ariss, N. Kharmoum, W. Rhalem, S. Ziti, and M. Ezziyyani, ‘‘Artificial intelligence for fake news,’’ in Proc. Int. Conf. Adv. Intell. Syst. Sustain. Develop. Springer, 2022, pp. 77–91.
[92] O. Ngada and B. Haskins, ‘‘Investigating fake news detection by means of deep learning on a limited data set,’’ in Proc. IEEE Asia–Pacific Conf. Comput. Sci. Data Eng. (CSDE), Dec. 2022, pp. 1–6.
[93] Z. Wang, ‘‘Deep learning methods for fake news detection,’’ in Proc. IEEE 2nd Int. Conf. Data Sci. Comput. Appl. (ICDSCA), 2022, pp. 472–475.


[94] N. Jayakody, A. Mohammad, and M. N. Halgamuge, ‘‘Fake news detection using a decentralized deep learning model and federated learning,’’ in Proc. 48th Annu. Conf. IEEE Ind. Electron. Soc. (IECON), Oct. 2022, pp. 1–6.
[95] D. R. Collen, L. K. Nyandoro, and K. Zvarevashe, ‘‘Fake news detection using 5L-CNN,’’ in Proc. 1st Zimbabwe Conf. Inf. Commun. Technol. (ZCICT), Zimbabwe, Nov. 2022, pp. 1–7.
[96] A. Qdroo and M. Baykara, ‘‘A new approach to detect fake news related to COVID-19 pandemic using deep neural network,’’ J. Appl. Sci. Technol. Trends, vol. 3, no. 2, pp. 81–88, Dec. 2022.
[97] Z. A. Jawad and A. J. Obaid, ‘‘Combination of convolution neural networks and deep neural networks for fake news detection,’’ 2022, arXiv:2210.08331.
[98] S. Suratkar and F. Kazi, ‘‘Deep fake video detection using transfer learning approach,’’ Arabian J. Sci. Eng., vol. 48, no. 8, pp. 9727–9737, Aug. 2023.
[99] J. Del Ser, M. N. Bilbao, I. Laña, K. Muhammad, and D. Camacho, ‘‘Efficient fake news detection using bagging ensembles of bidirectional echo state networks,’’ in Proc. Int. Joint Conf. Neural Netw. (IJCNN), 2022, pp. 1–7.
[100] A. Sedik, A. A. Abohany, K. M. Sallam, K. Munasinghe, and T. Medhat, ‘‘Deep fake news detection system based on concatenated and recurrent modalities,’’ Expert Syst. Appl., vol. 208, Dec. 2022, Art. no. 117953.
[101] K. Sangeeta, ‘‘Fake news detection using feature selection and deep learning,’’ Int. J. Res. Appl. Sci. Eng. Technol., vol. 10, no. 6, pp. 3878–3887, Jun. 2022.
[102] J. Rautela, V. Ramalingam, and H. Makhdoomi, ‘‘Fake news detection through deep learning techniques,’’ Int. J. Health Sci., vol. 6, no. 5, pp. 2107–2111, 2022.
[103] V. Kandasamy, Š. Hubálovský, and P. Trojovský, ‘‘Deep fake detection using a sparse auto encoder with a graph capsule dual graph CNN,’’ PeerJ Comput. Sci., vol. 8, p. e953, May 2022.
[104] Y. Tashtoush, B. Alrababah, O. Darwish, M. Maabreh, and N. Alsaedi, ‘‘A deep learning framework for detection of COVID-19 fake news on social media platforms,’’ Data, vol. 7, no. 5, p. 65, May 2022.
[105] R. Muppidi and D. V. Biksham, ‘‘Deep convolutional neural network for fake news detection over online social networks,’’ International J. Sci. Res. Eng. Manage., vol. 6, no. 4, pp. 1–10, Apr. 2022.
[106] Q. Hu, Q. Li, Y. Lu, Y. Yang, and J. Cheng, ‘‘Multi-level word features based on CNN for fake news detection in cultural communication,’’ Pers. Ubiquitous Comput., vol. 24, no. 2, pp. 259–272, Apr. 2020.
[107] S. Deepak and B. Chitturi, ‘‘Deep neural approach to fake-news identification,’’ Proc. Comput. Sci., vol. 167, pp. 2236–2243, 2020.
[108] C. Kulkarni, P. Monika, S. Shruthi, M. D. Bharadwaj, and D. Uday, ‘‘COVID-19 fake news detection using glove and bi-LSTM,’’ in Proc. 2nd Int. Conf. Sustain. Expert Syst. (ICSES). Springer, 2022, pp. 43–56.
[109] R. Malhotra, A. Mahur, and Achint, ‘‘COVID-19 fake news detection system,’’ in Proc. 12th Int. Conf. Cloud Comput., Data Sci. Eng. (Confluence), Jan. 2022, pp. 428–433.
[110] Y. Tian, J. Gu, Y. Jia, and R. O. Sinnott, ‘‘An exploration of machine and deep learning models for fake news detection in social media,’’ in Proc. 8th Int. Conf. Behav. Social Comput. (BESC), 2021, pp. 1–6.
[111] H. Zhu and R. O. Sinnott, ‘‘A performance comparison of fake news detection approaches,’’ in Proc. IEEE Asia–Pacific Conf. Comput. Sci. Data Eng. (CSDE), Dec. 2021, pp. 1–7.
[112] G. S. Mahara and S. Gangele, ‘‘Fake news detection: A RNN-LSTM, bi-LSTM based deep learning approach,’’ in Proc. IEEE 1st Int. Conf. Data, Decis. Syst. (ICDDS), Dec. 2022, pp. 01–06.
[113] G. Anusha, G. Praveen, D. Mounika, U. S. Krishna, and R. Cristin, ‘‘Detection of fake news using recurrent neural network,’’ in Proc. IEEE Int. Conf. Distrib. Comput. Electr. Circuits Electron. (ICDCECE), Apr. 2022, pp. 1–5.
[114] P. K. Sree, G. R. Babu, P. B. V. R. Rao, P. V. Chintalapati, and M. Prasad, ‘‘Fake news detection using cellular automata based deep learning,’’ in Proc. 3rd Int. Conf. Comput. Inf. Technol. (ICCIT), Sep. 2023, pp. 167–171.
[115] H. Ali, M. Khan, A. AlGhadhban, M. Alazmi, A. Alzamil, K. Al-Utaibi, and J. Qadir, ‘‘Analyzing the robustness of fake-news detectors under black-box adversarial attacks,’’ IEEE Access, vol. 9, pp. 81678–81692, 2021.
[116] E. Qawasmeh, M. Tawalbeh, and M. Abdullah, ‘‘Automatic identification of fake news using deep learning,’’ in Proc. 6th Int. Conf. Social Netw. Anal., Manage. Secur. (SNAMS), Oct. 2019, pp. 383–388.
[117] K. K. Kumar, S. H. Rao, G. Srikar, and M. B. Chandra, ‘‘A novel approach for detection of fake news using long short term memory (LSTM),’’ Int. J., vol. 10, no. 5, 2021, Art. no. 1320800.
[118] T. Ahmad, M. S. Faisal, A. Rizwan, R. Alkanhel, P. W. Khan, and A. Muthanna, ‘‘Efficient fake news detection mechanism using enhanced deep learning model,’’ Appl. Sci., vol. 12, no. 3, p. 1743, Feb. 2022.
[119] N. Aslam, I. Ullah Khan, F. S. Alotaibi, L. A. Aldaej, and A. K. Aldubaikil, ‘‘Fake detect: A deep learning ensemble model for fake news detection,’’ Complexity, vol. 2021, pp. 1–8, Apr. 2021.
[120] T. Chauhan and H. Palivela, ‘‘Optimization and improvement of fake news detection using deep learning approaches for societal benefit,’’ Int. J. Inf. Manage. Data Insights, vol. 1, no. 2, Nov. 2021, Art. no. 100051.
[121] M.-Y. Chen, Y.-W. Lai, and J.-W. Lian, ‘‘Using deep learning models to detect fake news about COVID-19,’’ ACM Trans. Internet Technol., vol. 23, no. 2, pp. 1–23, May 2023.
[122] R. Garg, ‘‘Effective fake news classifier and its applications to COVID-19,’’ in Proc. IEEE Bombay Sect. Signature Conf. (IBSSC), Nov. 2021, pp. 1–6.
[123] V. Jain, R. K. Kaliyar, A. Goswami, P. Narang, and Y. Sharma, ‘‘AENeT: An attention-enabled neural architecture for fake news detection using contextual features,’’ Neural Comput. Appl., vol. 34, no. 1, pp. 771–782, Jan. 2022.
[124] T. Jiang, J. P. Li, A. U. Haq, and A. Saboor, ‘‘Fake news detection using deep recurrent neural networks,’’ in Proc. 17th Int. Comput. Conf. Wavelet Act. Media Technol. Inf. Process. (ICCWAMTIP), 2020, pp. 205–208.
[125] N. Kanagavalli, S. B. Priya, and J. D, ‘‘Design of hyperparameter tuned deep learning based automated fake news detection in social networking data,’’ in Proc. 6th Int. Conf. Comput. Methodologies Commun. (ICCMC), Mar. 2022, pp. 958–963.
[126] J. Kumari, R. Choudhary, S. Kumari, and G. Krishna, ‘‘A deep learning based approach for classification of news as real or fake,’’ in Proc. Data Sci. Security (IDSCS). Singapore: Springer, 2021, pp. 239–246.
[127] D.-H. Lee, Y.-R. Kim, H.-J. Kim, S.-M. Park, and Y.-J. Yang, ‘‘Fake news detection using deep learning,’’ J. Inf. Process. Syst., vol. 15, no. 5, pp. 1119–1130, 2019.
[128] J. Liu, C. Wang, C. Li, N. Li, J. Deng, and J. Z. Pan, ‘‘DTN: Deep triple network for topic specific fake news detection,’’ J. Web Semantics, vol. 70, Jul. 2021, Art. no. 100646.
[129] R. Mahesh, B. Poornika, N. Sharaschandrika, S. D. Goud, and P. U. Kumar, ‘‘Identification of fake news using deep learning architecture,’’ in Proc. 3rd Int. Conf. Inventive Res. Comput. Appl. (ICIRCA), Sep. 2021, pp. 1246–1253.
[130] B. Majumdar, Md. Rafiuzzaman Bhuiyan, Md. A. Hasan, Md. S. Islam, and S. R. H. Noori, ‘‘Multi class fake news detection using LSTM approach,’’ in Proc. 10th Int. Conf. Syst. Model. Advancement Res. Trends (SMART), Dec. 2021, pp. 75–79.
[131] S. Mengji, ‘‘Fake news detection using RNN-LSTM,’’ Int. J. Res. Appl. Sci. Eng. Technol., vol. 9, no. 10, pp. 1731–1737, Oct. 2021.
[132] A. Mohapatra, N. Thota, and P. Prakasam, ‘‘Fake news detection and classification using hybrid BiLSTM and self-attention model,’’ Multimedia Tools Appl., vol. 81, no. 13, pp. 18503–18519, May 2022.
[133] U. Narayan, A. Kumar, and K. Kumar, ‘‘Fake news detection using hybrid of deep neural network and stacked LSTM,’’ in Proc. 3rd Int. Conf. Adv. Comput., Commun. Control Netw. (ICAC3N), Dec. 2021, pp. 385–390.
[134] N. Rai, D. Kumar, N. Kaushik, C. Raj, and A. Ali, ‘‘Fake news classification using transformer based enhanced LSTM and BERT,’’ Int. J. Cognit. Comput. Eng., vol. 3, pp. 98–105, Jun. 2022.
[135] R. R. Rajalaxmi, L. V. Narasimha Prasad, B. Janakiramaiah, C. S. Pavankumar, N. Neelima, and V. E. Sathishkumar, ‘‘Optimizing hyperparameters and performance analysis of LSTM model in detecting fake news on social media,’’ ACM Trans. Asian Low-Resource Lang. Inf. Process., Mar. 2022, doi: 10.1145/3511897.
[136] F. Sadeghi, A. J. Bidgoly, and H. Amirkhani, ‘‘Fake news detection on social media using a natural language inference approach,’’ Multimedia Tools Appl., vol. 81, no. 23, pp. 33801–33821, Sep. 2022.
[137] S. R. Sahoo and B. B. Gupta, ‘‘Multiple features based approach for automatic fake news detection on social networks using deep learning,’’ Appl. Soft Comput., vol. 100, Mar. 2021, Art. no. 106983.
[138] Y. Seo and C.-S. Jeong, ‘‘FaGoN: Fake news detection model using grammatic transformation on neural network,’’ in Proc. 13th Int. Conf. Knowl., Inf. Creativity Support Syst. (KICSS), Nov. 2018, pp. 1–5.


[139] P. Shrivastava and D. K. Sharma, ‘‘Fake content identification using pre-trained glove-embedding,’’ in Proc. 5th Int. Conf. Inf. Syst. Comput. Netw. (ISCON), Oct. 2021, pp. 1–6.
[140] T. E. Trueman, A. Kumar, P. Narayanasamy, and J. Vidya, ‘‘Attention-based C-BiLSTM for fake news detection,’’ Appl. Soft Comput., vol. 110, Oct. 2021, Art. no. 107600.
[141] M. Umer, Z. Imtiaz, S. Ullah, A. Mehmood, G. S. Choi, and B.-W. On, ‘‘Fake news stance detection using deep learning architecture (CNN-LSTM),’’ IEEE Access, vol. 8, pp. 156695–156706, 2020.
[142] P. Ushashree, A. Naik, S. Gurav, A. Kumar, S. Chethan, and B. Madhumala, ‘‘Fake news detection using neural network,’’ in Proc. IEEE Int. Conf. Integr. Circuits Commun. Syst. (ICICACS), Aug. 2023, pp. 1–5.
[143] A. Anand, R. Kulkarni, and P. Agrawal, ‘‘Fake news identification: An effective combined approach using ML and DL techniques,’’ in Proc. 2nd Int. Conf. Paradigm Shifts Commun. Embedded Syst., Mach. Learn. Signal Process. (PCEMS), Apr. 2023, pp. 1–6.
[144] K. Saini and R. Jain, ‘‘A hybrid LSTM-BERT and glove-based deep learning approach for the detection of fake news,’’ in Proc. 3rd Int. Conf. Smart Data Intell. (ICSMDI), Mar. 2023, pp. 400–406.
[145] A. Khoudi, N. Yahiaoui, and F. Rebahi, ‘‘Detect misinformation of COVID-19 using deep learning: A comparative study based on word embedding,’’ in Proc. 1st Int. Conf. Adv. Innov. Smart Cities (ICAISC), Jan. 2023, pp. 1–5.
[146] S. M. Bankar and S. K. Gupta, ‘‘Fake news detection using LSTM-based deep learning approach and word embedding feature extraction,’’ in Proc. Int. Conf. Commun., Electron. Digital Technol. Singapore: Springer, 2023, pp. 129–141.
[147] A. Matheven and B. V. D. Kumar, ‘‘Fake news detection using deep learning and natural language processing,’’ in Proc. 9th Int. Conf. Soft Comput. Mach. Intell. (ISCMI), Nov. 2022, pp. 11–14.
[148] J. A. Reshi and R. Ali, ‘‘Online fake news detection using pre-trained embeddings,’’ in Proc. 5th Int. Conf. Multimedia, Signal Process. Commun. Technol. (IMPACT), 2022, pp. 1–5.
[149] M. Madani, H. Motameni, and H. Mohamadi, ‘‘Fake news detection using deep learning integrating feature extraction, natural language processing, and statistical descriptors,’’ Secur. Privacy, vol. 5, no. 6, p. e264, 2022.
[150] P. Katariya, V. Gupta, R. Arora, A. Kumar, S. Dhingra, Q. Xin, and J. Hemanth, ‘‘A deep neural network-based approach for fake news detection in regional language,’’ Int. J. Web Inf. Syst., vol. 18, nos. 5–6, pp. 286–309, Dec. 2022.
[151] A. Divija, ‘‘Fake news classifier,’’ Int. J. Res. Appl. Sci. Eng. Technol., vol. 10, no. 6, pp. 1716–1722, Jun. 2022.
[152] J. Soni, ‘‘An efficient LSTM model for fake news detection,’’ Comput. Sci. Eng., Int. J., vol. 12, no. 2, pp. 1–10, Apr. 2022.
[153] E. Amer, K.-S. Kwak, and S. El-Sappagh, ‘‘Context-based fake news detection model relying on deep learning models,’’ Electronics, vol. 11, no. 8, p. 1255, Apr. 2022.
[154] N. Xiang, ‘‘Deep learning-based fake information detection and influence evaluation,’’ Comput. Intell. Neurosci., vol. 2022, pp. 1–8, Feb. 2022.
[155] S. M. Jaybhaye, V. Badade, A. Dodke, A. Holkar, and P. Lokhande, ‘‘Fake news detection using LSTM based deep learning approach,’’ in Proc. ITM Web Conf., vol. 56, 2023, p. 03005.
[156] R. S. Aziz, A. T. Sadiq, M. Kherallah, and A. Douik, ‘‘Arabic fake news detection for COVID-19 using deep learning and machine learning,’’ Periodicals Eng. Natural Sci. (PEN), vol. 11, no. 6, p. 56, Dec. 2023.
[157] M. B. Narayanan, A. K. Ramesh, K. S. Gayathri, and A. Shahina, ‘‘Fake news detection using a deep learning transformer based encoder–decoder architecture,’’ J. Intell. Fuzzy Syst., vol. 45, no. 5, pp. 8001–8013, Nov. 2023.
[158] S. Malik, A. K. Chakraverti, and A. I. Abidi, ‘‘Enhancing fake news detection using classification algorithms and deep learning,’’ in Proc. 10th IEEE Uttar Pradesh Sect. Int. Conf. Electr., Electron. Comput. Eng. (UPCON), Dec. 2023, pp. 780–787.
[159] A. Chabukswar and P. D. Shenoy, ‘‘Fake news detection using optimized deep learning model through effective feature extraction,’’ in Proc. Int. Conf. Recent Adv. Inf. Technol. for Sustain. Develop. (ICRAIS), Nov. 2023, pp. 118–123.
[160] N. Ahuja and S. Kumar, ‘‘S-HAN: Hierarchical attention networks with stacked gated recurrent unit for fake news detection,’’ in Proc. 8th Int. Conf. Rel., INFOCOM Technol. Optim. (Trends Future Directions) (ICRITO), Jun. 2020, pp. 873–877.
[161] G. Sarin and P. Kumar, ‘‘ConvGRUText: A deep learning method for fake text detection on online social media,’’ in Proc. Pacific Asia Conf. Inf. Syst. (PACIS), 2020, pp. 60–74.
[162] A. Verma, V. Mittal, and S. Dawn, ‘‘FIND: Fake information and news detections using deep learning,’’ in Proc. 12th Int. Conf. Contemp. Comput. (IC), Aug. 2019, pp. 1–7.
[163] M. Dong, L. Yao, X. Wang, B. Benatallah, Q. Z. Sheng, and H. Huang, ‘‘DUAL: A deep unified attention model with latent relation representations for fake news detection,’’ in Proc. 19th Int. Conf. Web Inf. Syst. Eng. (WISE), Dubai, United Arab Emirates. Cham, Switzerland: Springer, Nov. 2018, pp. 199–209.
[164] S. Taheri, S. H. Hashemi, A. Y. Zomaya, and J. Yong, ‘‘Sequence graph transform: A general approach to substructure-aware sequence encoding,’’ in Proc. IEEE Int. Conf. Data Mining (ICDM), 2018, pp. 117–126.
[165] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, ‘‘Graph attention networks,’’ in Proc. ICLR, 2018, pp. 1–12.
[166] W. Hamilton, Z. Ying, and J. Leskovec, ‘‘Inductive representation learning on large graphs,’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 30, 2017, pp. 1025–1035.
[167] T. N. Kipf and M. Welling, ‘‘Semi-supervised classification with graph convolutional networks,’’ 2016, arXiv:1609.02907.
[168] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, ‘‘The graph neural network model,’’ IEEE Trans. Neural Netw., vol. 20, no. 1, pp. 61–80, Jan. 2008.
[169] N. Bai, F. Meng, X. Rui, and Z. Wang, ‘‘Rumour detection based on graph convolutional neural net,’’ IEEE Access, vol. 9, pp. 21686–21693, 2021.
[170] F. B. Mahmud, M. Md. S. Rayhan, M. H. Shuvo, I. Sadia, and Md. K. Morol, ‘‘A comparative analysis of graph neural networks and commonly used machine learning algorithms on fake news detection,’’ in Proc. 7th Int. Conf. Data Sci. Mach. Learn. Appl. (CDMA), Mar. 2022, pp. 97–102.
[171] I. Pilkevych, D. Fedorchuk, O. Naumchak, and M. Romanchuk, ‘‘Fake news detection in the framework of decision-making system through graph neural network,’’ in Proc. IEEE 4th Int. Conf. Adv. Inf. Commun. Technol. (AICT), Sep. 2021, pp. 153–157.
[172] Y. Ren, B. Wang, J. Zhang, and Y. Chang, ‘‘Adversarial active learning based heterogeneous graph neural network for fake news detection,’’ in Proc. IEEE Int. Conf. Data Mining (ICDM), 2020, pp. 452–461.
[173] M. Sun, I. A. Hameed, H. Wang, and M. Pasquine, ‘‘Perceiving the narrative style for fake news detection using deep learning,’’ in Proc. IEEE 23rd Int Conf High Perform. Comput. Communications; 7th Int Conf Data Sci. Systems; 19th Int Conf Smart City; 7th Int Conf Dependability Sensor, Cloud Big Data Syst. Appl. (HPCC/DSS/SmartCity/DependSys), Dec. 2021, pp. 1195–1202.
[174] B. Upadhayay and V. Behzadan, ‘‘Hybrid deep learning model for fake news detection in social networks (student abstract),’’ in Proc. AAAI Conf. Artif. Intell., 2022, vol. 36, no. 11, pp. 13067–13068.
[175] P. Hiremath, S. S. Kalagi, and Mohana, ‘‘Analysis of fake news detection using graph neural network (GNN) and deep learning,’’ in Proc. 2nd Int. Conf. Autom., Comput. Renew. Syst. (ICACRS), Dec. 2023, pp. 1805–1811.
[176] E. Y. Okano, Z. Liu, D. Ji, and E. E. S. Ruiz, ‘‘Fake news detection on fake.BR using hierarchical attention networks,’’ in Proc. 14th Int. Conf. Comput. Process. Portuguese Language (PROPOR), Evora, Portugal. Springer, Mar. 2020, pp. 143–152.
[177] A. Al Obaid, H. Khotanlou, M. Mansoorizadeh, and D. Zabihzadeh, ‘‘Multimodal fake-news recognition using ensemble of deep learners,’’ Entropy, vol. 24, no. 9, p. 1242, Sep. 2022.
[178] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, ‘‘BERT: Pre-training of deep bidirectional transformers for language understanding,’’ 2018, arXiv:1810.04805.
[179] S. M. Isa, G. Nico, and M. Permana, ‘‘Indobert for Indonesian fake news detection,’’ ICIC Exp. Lett., vol. 16, no. 3, pp. 289–297, 2022.
[180] B. Palani, S. Elango, and V. Viswanathan K, ‘‘CB-fake: A multimodal deep learning framework for automatic fake news detection using capsule neural network and BERT,’’ Multimedia Tools Appl., vol. 81, no. 4, pp. 5587–5620, Feb. 2022.
[181] S. Sharma, M. Saraswat, and A. K. Dubey, ‘‘Fake news detection using deep learning,’’ in Proc. 3rd Iberoamerican Conf. 2nd Indo-American Conf. Knowl. Graphs Semantic Web (KGSWC), Kingsville, Texas, USA, Nov. 2021. Springer, pp. 249–259.
[182] M. Kanchana, V. M. Kumar, T. P. Anish, and P. Gopirajan, ‘‘Deep fake BERT: Efficient online fake news detection system,’’ in Proc. Int. Conf. Netw. Commun. (ICNWC), Apr. 2023, pp. 1–6.


[183] R. H. Khan, A. Shihavuddin, M. M. M. Syeed, R. U. Haque, and M. F. Uddin, ‘‘Improved fake news detection method based on deep learning and comparative analysis with other machine learning approaches,’’ in Proc. Int. Conf. Eng. Emerg. Technol. (ICEET), Oct. 2022, pp. 1–6.
[184] A. Kumar, J. P. Singh, and A. K. Singh, ‘‘COVID-19 fake news detection using ensemble-based deep learning model,’’ IT Prof., vol. 24, no. 2, pp. 32–37, Mar. 2022.
[185] R. K. Kaliyar, A. Goswami, and P. Narang, ‘‘FakeBERT: Fake news detection in social media with a BERT-based deep learning approach,’’ Multimedia Tools Appl., vol. 80, no. 8, pp. 11765–11788, Mar. 2021.
[186] M. Q. Alnabhan and P. Branco, ‘‘Evaluating deep learning for cross-domains fake news detection,’’ in Proc. Int. Symp. Found. Pract. Secur. Cham, Switzerland: Springer, 2023, pp. 40–51.
[187] J. V. Tembhurne, M. M. Almin, and T. Diwan, ‘‘Mc-DNN: Fake news detection using multi-channel deep neural networks,’’ Int. J. Semantic Web Inf. Syst., vol. 18, no. 1, pp. 1–20, Feb. 2022.
[188] R. K. Kaliyar, P. Kumar, M. Kumar, M. Narkhede, S. Namboodiri, and S. Mishra, ‘‘DeepNet: An efficient neural network for fake news detection using news-user engagements,’’ in Proc. 5th Int. Conf. Comput., Commun. Secur. (ICCCS), Oct. 2020, pp. 1–6.
[189] F. Zhou, Y. Hu, and X. Shen, ‘‘MSANet: Multimodal self-augmentation and adversarial network for RGB-D object recognition,’’ Vis. Comput., vol. 35, no. 11, pp. 1583–1594, Nov. 2019.
[190] A. Mumuni and F. Mumuni, ‘‘Data augmentation: A comprehensive survey of modern approaches,’’ Array, vol. 16, Dec. 2022, Art. no. 100258.
[191] S. K. Hamed, M. J. A. Aziz, and M. R. Yaakub, ‘‘A review of fake news detection models: Highlighting the factors affecting model performance and the prominent techniques used,’’ Int. J. Adv. Comput. Sci. Appl., vol. 14, no. 7, pp. 379–390, 2023.
[192] I. Ahmad, M. Yousaf, S. Yousaf, and M. O. Ahmad, ‘‘Fake news detection using machine learning ensemble methods,’’ Complexity, vol. 2020, pp. 1–11, Oct. 2020.
[193] A. Zubiaga, M. Liakata, and R. Procter, ‘‘Learning reporting dynamics during breaking news for rumour detection in social media,’’ 2016, arXiv:1610.07363.
[194] W. Y. Wang, ‘‘‘Liar, liar pants on fire’: A new benchmark dataset for fake news detection,’’ 2017, arXiv:1705.00648.
[195] K. Shu, D. Mahudeswaran, S. Wang, D. Lee, and H. Liu, ‘‘FakeNewsNet: A data repository with news content, social context, and spatiotemporal information for studying fake news on social media,’’ Big Data, vol. 8, no. 3, pp. 171–188, Jun. 2020.
[196] M. Nirav Shah and A. Ganatra, ‘‘A systematic literature review and existing challenges toward fake news detection models,’’ Social Netw. Anal. Mining, vol. 12, no. 1, p. 168, Dec. 2022.
[197] B. Cao, L. Hua, J. Cao, J. Gui, B. Liu, and J. Tin-Yau Kwok, ‘‘No place to hide: Dual deep interaction channel network for fake news detection based on data augmentation,’’ 2023, arXiv:2303.18049.
[198] S. Warjri, P. Pakray, S. A. Lyngdoh, and A. K. Maji, ‘‘Fake news detection using social media data for khasi language,’’ in Proc. Int. Conf. Intell. Syst., Adv. Comput. Commun. (ISACC), Feb. 2023, pp. 1–6.
[199] S. Helmstetter and H. Paulheim, ‘‘Collecting a large scale dataset for classifying fake news tweets using weak supervision,’’ Future Internet, vol. 13, no. 5, p. 114, Apr. 2021.
[200] K. D. K. Parimala and A. G. Mala, ‘‘An optimal detection of fake news from Twitter data using dual-stage deep capsule autoencoder,’’ J. Experim. Theor. Artif. Intell., vol. 36, no. 2, pp. 287–313, Feb. 2024.
[201] S. Kato, L. Yang, and D. Ikeda, ‘‘Domain bias in fake news datasets consisting of fake and real news pairs,’’ in Proc. 12th Int. Congr. Adv. Appl. Informat. (IIAI-AAI), Jul. 2022, pp. 101–106.
[202] H. F. Villela, F. Corrêa, J. S. D. A. N. Ribeiro, A. Rabelo, and D. B. F. Carvalho, ‘‘Fake news detection: A systematic literature review of machine learning algorithms and datasets,’’ J. Interact. Syst., vol. 14, no. 1, pp. 47–58, Mar. 2023.
[203] S. Suryavardan, S. Mishra, M. Chakraborty, P. Patwa, A. Rani, A. Chadha, A. Reganti, A. Das, A. Sheth, M. Chinnakotla, A. Ekbal, and S. Kumar, ‘‘Findings of factify 2: Multimodal fake news detection,’’ 2023, arXiv:2307.10475.
[204] K. Weiss, T. M. Khoshgoftaar, and D. Wang, ‘‘A survey of transfer learning,’’ J. Big Data, vol. 3, no. 1, pp. 1–40, 2016.
[205] S. J. Pan and Q. Yang, ‘‘A survey on transfer learning,’’ IEEE Trans. Knowl. Data Eng., vol. 22, no. 10, pp. 1345–1359, Oct. 2010.
[206] C. Wang and S. Mahadevan, ‘‘Heterogeneous domain adaptation using manifold alignment,’’ in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), 2011, vol. 22, no. 1, p. 1541.
[207] Y. Zhu, Y. Chen, Z. Lu, S. Pan, G.-R. Xue, Y. Yu, and Q. Yang, ‘‘Heterogeneous transfer learning for image classification,’’ in Proc. AAAI Conf. Artif. Intell., 2011, vol. 25, no. 1, pp. 1304–1309.
[208] B. Kulis, K. Saenko, and T. Darrell, ‘‘What you saw is not what you get: Domain adaptation using asymmetric kernel transforms,’’ in Proc. CVPR, Jun. 2011, pp. 1785–1792.
[209] F. Arslan, N. Hassan, C. Li, and M. Tremayne, ‘‘A benchmark dataset of check-worthy factual claims,’’ in Proc. Int. AAAI Conf. Web Social Media, vol. 14, 2020, pp. 821–829.
[210] M. Harel and S. Mannor, ‘‘Learning from multiple outlooks,’’ 2010, arXiv:1005.0027.
[211] J. Nam and S. Kim, ‘‘Heterogeneous defect prediction,’’ in Proc. 10th Joint Meeting Found. Softw. Eng., 2015, pp. 508–519.
[212] P. Prettenhofer and B. Stein, ‘‘Cross-language text classification using structural correspondence learning,’’ in Proc. 48th Annu. Meeting Assoc. Comput. Linguistics, Jul. 2010, pp. 1118–1127.
[213] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, ‘‘How transferable are features in deep neural networks?’’ in Proc. Adv. Neural Inf. Process. Syst., vol. 27, 2014, pp. 1–9.
[214] Z. Alyafeai, M. S. AlShaibani, and I. Ahmad, ‘‘A survey on transfer learning in natural language processing,’’ 2020, arXiv:2007.04239.
[215] J. Howard and S. Ruder, ‘‘Universal language model fine-tuning for text classification,’’ 2018, arXiv:1801.06146.
[216] M. Masum, H. Shahriar, and H. M. Haddad, ‘‘A transfer learning with deep neural network approach for network intrusion detection,’’ Int. J. Intell. Comput. Res., vol. 12, no. 1, pp. 1087–1095, Jun. 2021.
[217] C. Käding, E. Rodner, A. Freytag, and J. Denzler, ‘‘Fine-tuning deep neural networks in continuous learning scenarios,’’ in Proc. Int. Workshops Comput. Vis. Workshops, Taipei, Taiwan. Cham, Switzerland: Springer, Nov. 2017, pp. 588–605.
[218] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, ‘‘Going deeper with convolutions,’’ 2014, arXiv:1409.4842.
[219] Microsoft Corporation. (2018). MicrosoftML: A Package for Machine Learning With R. [Online]. Available: https://microsoft.github.io/MicrosoftML/
[220] Microsoft. Microsoft Machine Learning. Accessed: May 5, 2024. [Online]. Available: https://docs.microsoft.com/en-us/machine-learning/
[221] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ‘‘ImageNet classification with deep convolutional neural networks,’’ Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
[222] K. Simonyan and A. Zisserman, ‘‘Very deep convolutional networks for large-scale image recognition,’’ 2014, arXiv:1409.1556.
[223] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’ in Proc. IEEE Conf. Comput. Vis. pattern Recognit. (CVPR), 2016, pp. 770–778.
[224] T. Mikolov, K. Chen, G. Corrado, and J. Dean, ‘‘Efficient estimation of word representations in vector space,’’ 2013, arXiv:1301.3781.
[225] J. Pennington, R. Socher, and C. D. Manning, ‘‘GloVe: Global vectors for word representation,’’ in Proc. Empirical Methods Natural Lang. Process. (EMNLP), 2014, pp. 1532–1543.
[226] W. Ferreira and A. Vlachos, ‘‘Emergent: A novel data-set for stance classification,’’ in Proc. NAACL-HLT, 2016, pp. 1163–1168.
[227] B. Palani and S. Elango, ‘‘CTrL-FND: content-based transfer learning approach for fake news detection on social media,’’ Int. J. Syst. Assurance Eng. Manage., vol. 14, no. 3, pp. 903–918, Jun. 2023.
[228] B. M. Pavlyshenko, ‘‘Analysis of disinformation and fake news detection using fine-tuned large language model,’’ 2023, arXiv:2309.04704.
[229] C. Seiffert, T. M. Khoshgoftaar, J. Van Hulse, and A. Napolitano, ‘‘A comparative study of data sampling and cost sensitive learning,’’ in Proc. IEEE Int. Conf. Data Mining Workshops, 2008, pp. 46–52.
[230] R. Longadge and S. Dongre, ‘‘Class imbalance problem in data mining review,’’ 2013, arXiv:1305.1707.
[231] P. Branco, L. Torgo, and R. P. Ribeiro, ‘‘A survey of predictive modeling on imbalanced domains,’’ ACM Comput. Surveys, vol. 49, no. 2, pp. 1–50, Jun. 2017.
[232] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, ‘‘SMOTE: Synthetic minority over-sampling technique,’’ J. Artif. Intell. Res., vol. 16, pp. 321–357, Jun. 2002.


[233] H. He, Y. Bai, E. A. Garcia, and S. Li, ‘‘ADASYN: Adaptive synthetic sampling approach for imbalanced learning,’’ in Proc. IEEE Int. Joint Conf. Neural Netw. (IEEE World Congr. Comput. Intell.), Jun. 2008, pp. 1322–1328.
[234] N. V. Chawla, N. Japkowicz, and A. Kotcz, ‘‘Editorial: Special issue on learning from imbalanced data sets,’’ ACM SIGKDD Explorations Newslett., vol. 6, no. 1, pp. 1–6, Jun. 2004.
[235] T. Bhatia, B. Manaskasemsak, and A. Rungsawang, ‘‘Detecting fake news sources on Twitter using deep neural network,’’ in Proc. 11th Int. Conf. Inf. Educ. Technol. (ICIET), Mar. 2023, pp. 508–512.
[236] K. Ghosh, C. Bellinger, R. Corizzo, P. Branco, B. Krawczyk, and N. Japkowicz, ‘‘The class imbalance problem in deep learning,’’ Mach. Learn., vol. 113, no. 7, pp. 4845–4901, Jul. 2024.
[237] J.-G. Gaudreault, P. Branco, and J. Gama, ‘‘An analysis of performance metrics for imbalanced classification,’’ in Proc. 24th Int. Conf. Discovery Sci. (DS), Halifax, NS, Canada. Cham, Switzerland: Springer, Oct. 2021, pp. 67–77.
[238] Z. Yan, Y. Zhang, X. Yuan, S. Lyu, and B. Wu, ‘‘DeepfakeBench: A comprehensive benchmark of deepfake detection,’’ in Proc. Adv. Neural Inf. Processing Syst., 2023, vol. 2, no. 6, pp. 1–32.
[239] P. Shrivastava and D. K. Sharma, ‘‘COVID-19 fake news detection using pre-tuned BERT-based transfer learning models,’’ in Proc. 11th Int. Conf. Syst. Model. Advancement Res. Trends (SMART), Dec. 2022, pp. 64–68.
[240] W. Tang, Z. Ma, H. Sun, and J. Wang, ‘‘Learning sparse alignments via optimal transport for cross-domain fake news detection,’’ in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Aug. 2023, pp. 1–5.
[241] I. Y. Agarwal and D. P. Rana, ‘‘Fake news and imbalanced data perspective,’’ in Data Preprocessing, Active Learning, and Cost Perceptive Approaches for Resolving Data Imbalance. Hershey, PA, USA: IGI Global, 2021, pp. 195–210.
[242] A. J. Keya, M. A. H. Wadud, M. F. Mridha, M. Alatiyyah, and M. A. Hamid, ‘‘AugFake-BERT: Handling imbalance through augmentation of fake news using BERT to enhance the performance of fake news classification,’’ Appl. Sci., vol. 12, no. 17, p. 8398, Aug. 2022.
[243] J. W. W. Muigai, ‘‘Understanding fake news,’’ Int. J. Sci. Res. Publications, vol. 9, no. 1, pp. 29–38, 2019.

MOHAMMAD Q. ALNABHAN received the bachelor’s and master’s degrees in computer science with a specialization in data mining. Currently, he is pursuing the Ph.D. degree with the University of Ottawa, Canada. His major field of study is focused on deep learning for security matters and fake news detection on social media. He brings a wealth of expertise to his academic pursuits through a diverse range of work experiences. He has been a dedicated part-time Professor with the Computer Science Department, University of Ottawa, since 2022, where he imparts his knowledge and passion for the subject to students. In addition to his academic role, he has proven his skills as a full-stack Programmer and a Microsoft Certified Trainer Specialist, showcasing his proficiency in industry-standard technologies.

PAULA BRANCO received the Ph.D. degree in computer science from the Faculty of Sciences, University of Porto, Portugal, in 2018. She joined the School of Electrical Engineering and Computer Science, University of Ottawa, as an Assistant Professor, in January 2020. Her research interests include artificial intelligence, machine learning, KDD, data mining, and in particular, imbalanced domain problems, rare extreme values forecast, health and cybersecurity applications, and privacy.
