Journal of Theoretical and Applied Information Technology

29th February 2024. Vol.102. No 4


© Little Lion Scientific

ISSN: 1992-8645 www.jatit.org E-ISSN: 1817-3195

DEEP LEARNING FOR MALWARE DETECTION: LITERATURE REVIEW
JAWHARA BOODAI1, AMINAH ALQAHTANI2, AND KHALED RIAD3,4
1,2 College of Computer Science and Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
3 Computer Science Department, College of Computer Sciences & Information Technology, King Faisal University, Al-Ahsa 31982, Saudi Arabia
4 Mathematics Department, Faculty of Science, Zagazig University, Zagazig 44519, Egypt

E-mail: 1 [email protected], 2 [email protected], 3 [email protected], 4 [email protected]

ABSTRACT

Malware is among the biggest cybersecurity threats and evolves constantly to evade traditional signature-based detection. Machine learning, and deep learning in particular, is a promising method for malware detection. This paper provides a systematic literature review (SLR) of deep learning approaches for malware detection on Windows, Android, IoT, and other platforms. We searched five major digital libraries and found 107 highly relevant studies published in 2015-2023. The SLR methodology consisted of well-formulated search queries, inclusion/exclusion criteria, and stringent full-text evaluation. Convolutional neural networks (CNNs) are the most popular architecture, learning spatial patterns from raw binaries. Sequential malware behaviors are modeled using long short-term memory (LSTM) networks. Ensemble models such as CNN-LSTM combine spatial and temporal learning and achieve high accuracy. However, essential challenges persist, such as poor generalization under obfuscation, lack of transparency, and lack of labeled real-world data. Although deep learning makes malware detection more accurate than traditional methods, evasion attacks, interpretability, and data limitations still need to be addressed. This SLR offers insights into the strengths, trends, datasets, and weaknesses of deep learning for robust malware defense. As threats persist, effective AI-based approaches will only grow in importance.
Keywords: Deep Learning; Malware Detection; Convolutional Neural Networks; Long Short-Term Memory
Networks

1. INTRODUCTION

Deep learning involves learning multi-level data representations, with higher levels representing more abstract concepts. This enables deep learning models to learn highly complex functions directly from raw data without extensive feature engineering. Deep learning has proven very effective for uncovering patterns in high-dimensional data and is now applied across many domains. Traditional machine learning, by contrast, often struggles to process raw, complex data. Malware refers to malicious code such as viruses, Trojans, and spyware designed to infect or damage computer systems. Malware developers use techniques like obfuscation to avoid detection by signature-based antivirus tools that rely on static pattern matching. However, malware variants frequently share common underlying behaviors that can potentially be detected using machine learning methods even when the code looks different. This makes deep learning promising for malware detection, as it can potentially learn more sophisticated features than classic machine learning approaches. While research interest in leveraging deep learning for malware detection has surged in recent years, most published studies tackle only a specific malware platform, operating system, or variant. Comprehensive perspectives encompassing the full landscape are still lacking. To help address this gap, we conducted a systematic literature review (SLR) of research on deep learning techniques applied to malware and intrusion detection published from 2015-2023.

Although research interest in the use of deep learning for malware detection has increased significantly over the past few years, most published studies focus on a particular malware platform, operating system, or variant. Broad views that cover the whole landscape are still missing.

Several surveys concentrate on malware detection for particular platforms such as Android [10] or IoT devices [11]. Other studies have analyzed specific deep learning algorithms such as RNNs [12] or CNNs [13] for malware. Nevertheless, to date there is no comprehensive systematic analysis of the capabilities, datasets, limitations, and open problems across deep learning techniques and computing platforms.

To help fill this gap, we performed an SLR of deep learning techniques used for malware and intrusion detection from 2015-2023. Our SLR methodology reviews 107 highly relevant studies in order to provide a comprehensive overview for Windows, Linux, Android, IoT, and other platforms.

We thoroughly analyze the comparative performance of convolutional neural networks, recurrent networks, deep belief networks, autoencoders, and ensemble models for malware detection. The trends, datasets, limitations, and future research directions are systematically identified. This broader overview of the malware detection landscape has not been provided in previous surveys.

This review serves as a helpful guide for researchers who seek to develop more reliable deep learning-based malware defense systems by outlining the current state of the art and open problems. As malware threats continue to expand and grow, the importance of AI-powered protection will only continue to rise.

1.1 Motivation
• Malware poses one of the biggest threats to computing platforms like personal computers, mobile devices, and the Internet of Things (IoT).
• As malware continues to increase in sophistication, traditional signature-based antivirus solutions are becoming inadequate.
• Machine learning, especially deep learning, has emerged as a promising approach for efficient, robust, and generalizable malware detection.
• Deep learning models like convolutional neural networks (CNNs), recurrent neural networks (RNNs), and deep belief networks (DBNs) can automatically learn complex features and patterns from raw malware samples to detect new threats.
• This literature review systematically surveys recent research on deep learning techniques for malware detection across Windows, Android, IoT, and other platforms. By thoroughly analyzing trends, algorithms, datasets, limitations, and capabilities, it provides valuable insights to guide future research on AI-powered malware defense systems.
• As malware attacks persist and evolve, effective deep-learning solutions will only grow in importance.

The rest of the paper is organized as follows. Section 2 discusses the systematic literature review methodology followed to identify and analyze the most relevant studies. Section 3 provides a comprehensive review of deep learning techniques applied for malware detection across Windows, Android, IoT, Linux, and other platforms. For each platform, key algorithms, methods, datasets, and capabilities are analyzed. Section 4 presents a discussion of the major findings, including the predominance of CNNs, comparative effectiveness over machine learning, and limitations faced. Finally, Section 5 concludes with a summary of insights gained and implications for future research directions in this critical domain of malware detection using deep learning and AI.

1.2 Research Questions
Through this SLR, we aimed to thoroughly analyze the scope, trends, specific techniques and methods used, algorithms applied, challenges faced, and ability to generalize across the field. We sought to answer several key research questions:
• RQ1: Which computing platforms are most heavily targeted and impacted by malware attacks and threats?
• RQ2: What are the hot areas and trends in malware detection research, which platforms see the most focus, and which publication venues are most prominent?
• RQ3: What particular methods and techniques do researchers employ to detect malware using deep learning?
• RQ4: Which significant machine learning and deep learning algorithms have been used to detect malware?
• RQ5: What are the primary obstacles and restrictions of applying deep learning for detection of malware and intrusions?
• RQ6: Do the studies and proposed deep learning techniques for detecting Android malware support important characteristics
including adaptability, sustainability, and automated selection of optimal algorithms?
• RQ7: Do the proposed Android malware detection methods demonstrate the ability to identify new, unknown malware variants, and what feature analysis techniques are used?
• RQ8: Which datasets are most widely used as standards for evaluating malware detection systems on the Android and Windows platforms?

We systematically searched for and analyzed the most relevant studies on deep learning techniques for malware detection published from 2015 through 2023. Established, rigorous guidelines for performing systematic literature reviews were carefully followed to obtain comprehensive insights without bias.

The results provide a thorough overview of trends, techniques, algorithms, datasets, limitations, open challenges, and future directions in this quickly evolving field. By shedding light on the current malware detection research landscape, this review serves as a valuable reference for researchers or engineers aiming to advance reliable deep learning-driven malware defense systems. Figure 1 shows a flowchart of the deep learning algorithms for malware detection, including the training and testing techniques used. As malware threats persist and grow, effective AI-powered protection will only increase in importance.

Figure 1: Flowchart of Deep learning algorithms for malware detection.

Malware detection can be formulated as a multi-class classification problem in machine learning, with the goal of categorizing files or applications as benign or as one of several malware types. As the flowchart in Figure 2 shows, the first step is extracting informative features such as system calls, API calls, opcodes, string signatures, and metadata from the programs. These features are then used to train classifiers such as neural networks, SVMs, and random forests on labeled benign and malware samples. The trained model can classify new, unseen programs into classes such as adware, spyware, ransomware, trojans, worms, and viruses based on the extracted features. Using techniques like one-vs-all for multi-class classification, the model outputs predicted probabilities for each malware type, and the program is assigned the class with the highest probability. Careful feature engineering and model tuning are critical for accurately detecting and categorizing the wide range of modern malware variants.

Figure 2: Flowchart of Machine learning algorithms for malware detection.

Table 1 lists the abbreviations used in this paper.

Table 1: Abbreviation Table.

Abbreviation    Definition
ML              Machine Learning
DL              Deep Learning
IoT             Internet of Things
CNN             Convolutional Neural Network
RNN             Recurrent Neural Network
DBN             Deep Belief Network
LSTM            Long Short-Term Memory
Bi-GRU-CNN      Bidirectional Gated Recurrent Unit - CNN
BiLSTM          Bidirectional LSTM
AMD             Android Malware Dataset
XGBoost         eXtreme Gradient Boosting
D.T             Decision Tree
R.F             Random Forest
API             Application Programming Interface
CPU             Central Processing Unit
ITMF            Image Texture Median Filtering
URL             Uniform Resource Locator
TFIDF           Term Frequency–Inverse Document Frequency
KNN             K-Nearest Neighbors
SVM             Support Vector Machine

1.3 Research Objectives
• To conduct a systematic literature review (SLR) surveying the application of deep learning techniques for malware detection across Windows, Android, IoT, and other platforms.
• To analyze the capabilities, algorithms, datasets, trends, limitations, and open challenges of using deep neural networks to detect malware based on the existing literature.
• To provide a comprehensive overview of the malware detection research landscape to guide future work on applying deep learning and AI for robust malware defense.
• To specifically examine the use of convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory (LSTMs), deep belief networks (DBNs), autoencoders, and ensemble models.
• To assess the effectiveness of deep learning for malware detection compared to traditional machine learning approaches relying on manual feature extraction.
• To identify key obstacles faced in real-world deployment of deep learning-based malware defense systems.

This study focuses solely on reviewing existing literature and does not involve any novel data collection or experiments. The scope is limited to studies applying deep learning or machine learning techniques for malware detection. Broader cybersecurity topics like network intrusion detection or spam filtering are not included. The aim is to synthesize insights from prior research to inform future work on using AI to counter evolving malware threats.

1.4 Problem Selection
Malware was selected as the problem domain because it remains one of the most significant and evolving cybersecurity threats. Deep learning has emerged as a promising approach for robust malware detection that can automatically learn complex features from raw data. However, a comprehensive overview of deep learning techniques applied for malware defense across platforms was lacking.

2. RESEARCH METHODS

2.1 Search Strategy
Manually searching individual libraries for keywords related to deep learning and malware detection is inefficient. A better approach is to develop a comprehensive search query combining relevant keywords, synonyms, and abbreviations using logical operators like "OR" and "AND". For example, "Convolutional Neural Networks" and "CNN" should be combined with "OR" since they refer to the same technique. Similarly, we want studies discussing both "Deep Learning" and "Malware Detection", so these terms need "AND" linkage. Deep Learning, Deep Learning techniques, and malware detection were among the terms we used in our search. The final search query was:

("Deep Learning" OR "Convolutional neural network" OR "Deep belief network" OR "recurrent neural network" OR "CNN" OR "RNN" OR "DBN" OR "LSTM") AND ("malware") AND ("detection" OR "detect" OR "identification" OR "identify" OR "classification")

This consolidated query enables a thorough yet efficient search for pertinent studies across keywords and terminology related to deep learning and malware detection.

2.2 Literature Screening Criteria
We searched five major digital libraries: IEEE Xplore, ACM DL, ScienceDirect, SpringerLink, and Google Scholar.
The initial search results were filtered using the following inclusion criteria:
• Peer-reviewed journal or conference papers
• Published between 2015-2023
• English language
• Full text available
Exclusion criteria were applied to remove irrelevant papers:
• Books, gray literature, surveys
• Non-peer reviewed (e.g. preprints)
• Non-English
• Duplicated studies
After filtering, 158 highly relevant studies were selected for in-depth review and analysis based on relevance to deep learning for malware detection. Each paper was critically read and analyzed to extract key information on algorithms, datasets, limitations, results, etc. related to the use of deep neural networks for malware detection.
The structured literature screening process enabled methodical selection of the most pertinent prior studies on deep learning for malware detection across computing platforms.
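
The consolidated query in Section 2.1 follows a simple pattern: synonyms are joined with OR inside a group, and the groups are then joined with AND. A minimal Python sketch of that pattern is given below. The helper names `or_group` and `build_query` are hypothetical and introduced only for illustration, and each digital library may require its own query syntax in practice.

```python
def or_group(terms):
    """Quote each term and join the synonyms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(groups):
    """Join the OR-groups with AND so that every concept must be matched."""
    return " AND ".join(or_group(group) for group in groups)


# Term groups taken from the search query reported in Section 2.1.
technique_terms = [
    "Deep Learning", "Convolutional neural network", "Deep belief network",
    "recurrent neural network", "CNN", "RNN", "DBN", "LSTM",
]
target_terms = ["malware"]
task_terms = ["detection", "detect", "identification", "identify", "classification"]

query = build_query([technique_terms, target_terms, task_terms])
print(query)
```

Keeping each concept as a plain list makes it easy to add a synonym to one group without touching the others.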

2.4 Inclusion and Exclusion Criteria
The search across five literature databases initially returned 935 papers. We refined this list using the criteria in Table 2 to identify the most relevant publications for our review. Papers were excluded based on screening of titles, abstracts, document types, and languages, or if deemed irrelevant after full-text review. Specific exclusion criteria were survey papers, book chapters, gray literature, duplicates, non-peer-reviewed publications, and non-English papers. By systematically applying these criteria, we filtered the initial 935 papers down to 158 highly relevant journal articles to review and analyze within our domain of deep learning for malware detection. The inclusion and exclusion criteria enabled us to home in on the most pertinent literature from the initial search results.

Table 2: Inclusion and Exclusion Criteria

Inclusion Criteria
1. The goal of the research is a deep learning-based malware or intrusion detection system.
2. The paper is either a preview or an article published in a peer-reviewed scientific journal.
3. The paper was published between January 2015 and December 2023.

Exclusion Criteria
1. Information that examines the economic, commercial, and legal consequences of malware and intrusion detection.
2. Gray literature, such as reports and blogs.
3. Documents that are not in English.
4. Critical evaluations.
5. Duplicate papers.
6. Non-journal papers, such as those presented at conferences.
7. Work on malware detection that does not use deep learning.

3. LITERATURE REVIEW

3.1 Windows Malware Detection using Deep Learning Techniques
Ni et al. [26] presented a convolutional neural network-based method, "Malware Classification Using Sim-Hash and CNN" (MCSC). They disassemble the malicious code and use the resulting grayscale images to identify malware families. Locality-sensitive hashing (LSH) maps similar malicious code to hash values, which are then transformed into grayscale images for neural network training. They report a detection rate of 98% or higher.

Zhao et al. [27] describe MalDeep, a deep learning-based malware detection system that analyzes the malware's binary file. Once the binary file is transformed into a grayscale image, convolutional neural networks classify the sample. A notable result is the system's reported 99% detection rate.

Zhang et al. [28] developed a deep learning algorithm for malware detection that makes use of subtle system calls. The Cuckoo sandbox monitors the target program to obtain system call information, which is then used to train neural networks. Their method detects malware with 95% accuracy using only system calls.

Zhang et al. [29] created a convolutional neural network model for detecting malware that disassembles the software to obtain opcodes and API calls. For each binary, API frequency vectors and PCA-initialized opcode bigram matrices are constructed. These data are used to train a convolutional neural network (CNN) and a backpropagation neural network (BPNN) to embed the features. Their approach reports a 95% accuracy rate.
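
To make the feature construction described for Zhang et al. [29] more concrete, the following Python sketch builds an opcode bigram count matrix and a normalized API frequency vector from sequences that have already been extracted. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy opcode and API sequences, and the fixed vocabularies are hypothetical, and the PCA initialization mentioned above is not reproduced.

```python
from collections import Counter


def opcode_bigram_matrix(opcodes, vocab):
    """Count how often each opcode in `vocab` is immediately followed by another."""
    index = {op: i for i, op in enumerate(vocab)}
    size = len(vocab)
    matrix = [[0] * size for _ in range(size)]
    for first, second in zip(opcodes, opcodes[1:]):
        # Bigrams containing an out-of-vocabulary opcode are skipped.
        if first in index and second in index:
            matrix[index[first]][index[second]] += 1
    return matrix


def api_frequency_vector(api_calls, api_vocab):
    """Relative frequency of each API in `api_vocab` among the observed calls."""
    counts = Counter(api_calls)
    total = sum(counts[name] for name in api_vocab) or 1  # avoid division by zero
    return [counts[name] / total for name in api_vocab]


# Toy inputs (assumed for illustration, not taken from any dataset).
opcodes = ["push", "mov", "call", "mov", "call", "ret"]
opcode_vocab = ["push", "mov", "call", "ret"]
api_calls = ["CreateFileW", "WriteFile", "WriteFile", "RegSetValueExW"]
api_vocab = ["CreateFileW", "WriteFile", "RegSetValueExW", "VirtualAlloc"]

bigrams = opcode_bigram_matrix(opcodes, opcode_vocab)
api_vector = api_frequency_vector(api_calls, api_vocab)
print(bigrams)     # [[0, 1, 0, 0], [0, 0, 2, 0], [0, 1, 0, 1], [0, 0, 0, 0]]
print(api_vector)  # [0.25, 0.5, 0.25, 0.0]
```

The matrix can be treated as a single-channel image for a CNN, while the frequency vector is a natural input for a fully connected network.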

Zhong and Gu [30] presented a multi-level deep learning approach for selecting significant characteristics from static and dynamic feature sets. It builds cluster subtrees by grouping similar characteristics with the K-means algorithm, and it decides whether an application is malicious or benign by merging the outputs of the deep learning models in the tree.

The ransomware detection approach presented by Zhang et al. [31] converts ransomware family names and opcode information into numerical tensors in order to train a neural network. Their approach employs self-attention convolutional neural networks (SA-CNN). One disadvantage is that the accuracy is only about 90%.

Deep learning has become a prominent technique for protecting Windows systems against malware, with convolutional neural networks applied extensively. Researchers have achieved accuracy rates of up to 99% in detecting new malware samples. However, challenges such as improving ransomware detection show that continued advancement of deep learning systems can further enhance malware detection on the Windows platform.

In [32], Yuxin and Siyi developed a deep belief network approach for malware detection that extracts opcode sequences from malware executables. A PE parser is used in their system to convert the PE file into a set of machine instructions. A feature extractor finds n-gram sequences with high classification power and uses them to represent the PE file as an n-gram vector. This vector is fed into a neural network-based malware detector. Their approach detects malware with a 98% success rate.

Yue [33] proposes a loss function for malware image classification with deep convolutional networks that combines softmax regression and entropy loss. They argue that this loss function appropriately handles the challenges raised by datasets with highly imbalanced malware family distributions.

Ye et al. [34] introduced a malware detection technique that operates directly on Windows PE files. An API feature extractor is employed in their approach to unpack the PE file and extract the relevant API calls. It uses restricted Boltzmann machine-based deep learning models and unsupervised heterogeneous auto-encoders to identify malware based on API call patterns.

Xiaofeng et al. [35] proposed an LSTM RNN malware detection approach that integrates machine learning and deep learning. It collects API call sequences and statistical features from sandboxed malware executables. Before the system call sequences are fed into the deep LSTM model for malware classification, they are first categorized using a random forest model.

ScalMalNet is a distributed system designed by Vinayakumar et al. [36] to gather malware samples from multiple websites. These samples are processed in real time or asynchronously in a distributed manner. They recommended detecting malware using image processing together with static and dynamic analysis. According to their research, deep learning malware detection is considerably more successful than classic ML approaches.

Kolosnjaji et al. [37] presented a convolutional neural network technique for detecting malware in binary files. Grayscale images are created by breaking down the malware binary into 8-bit chunks, which are converted to decimal values ranging from 0 to 255, organized into a 2D array, and displayed. From this image, the CNN learns to identify malware.

Athiwaratkun and Stokes [38] proposed MalConv, using 1D convolutions on raw byte sequences for malware detection without feature engineering. It views the malware binary as a long input sequence, applying narrow 1D convolutions and max-pooling to automatically learn local relationships between malware bytes.

Deep learning techniques like DBNs, CNNs, and LSTMs have been extensively explored for Windows malware detection using static and dynamic analysis of PE files, opcodes, and API call sequences. Direct modeling of malware binaries as images or sequences enables deep learning to achieve high accuracy without relying on manual feature extraction.

Anderson et al. [39] used convolutional neural networks to develop a deep learning-based malware detection system. Their approach accepts raw byte sequences as input rather than relying on manual feature engineering. Malware binary files are converted into byte plots and visualized as grayscale images to train the convolutional networks. This spatial representation helps model positional relationships within the malware code.

Yousefi-Azar et al. [40] developed a self-taught learning system using sparse autoencoders for detecting malicious Windows executables. They generate image inputs from binary file hashes and apply transformations to augment the training data. This improves the model's ability to generalize to new malware samples.
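
Several of the Windows studies reviewed above (for example [27], [37], and [39]) convert a binary file into a grayscale image before training a CNN. The Python sketch below illustrates the basic idea described for [37]: each byte becomes one pixel intensity between 0 and 255, and the pixels are arranged row by row into a 2D array. The function name `bytes_to_grayscale`, the default row width of 16, and the zero-padding of the last row are assumptions made for this example, not details taken from the cited papers.

```python
def bytes_to_grayscale(data: bytes, width: int = 16):
    """Arrange raw bytes into rows of `width` pixels (values 0-255).

    The final row is padded with zeros when the byte count is not a
    multiple of `width` (an assumption made for this sketch).
    """
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])  # each byte is already an int in 0..255
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


# Toy example: seven arbitrary bytes laid out four pixels per row.
sample = bytes([77, 90, 144, 0, 3, 0, 0])
image = bytes_to_grayscale(sample, width=4)
print(image)  # [[77, 90, 144, 0], [3, 0, 0, 0]]
```

In practice the resulting array would typically be resized to a fixed shape before being passed to a CNN, since executables differ widely in length.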

Han et al. [41] created a malware detection framework using raw Windows dynamic trace log data as input to a deep neural network. By eliminating manual feature extraction and using native log data, their system achieves higher accuracy compared to classical machine learning techniques.

Rhode et al. [42] presented a long-term study on deep learning for just-in-time malware detection in Windows executables. They evaluate detection performance over an extended period as new malware samples appear. Deep learning consistently outperforms traditional machine learning approaches over time as new threats emerge.

These and other studies highlight the capabilities of deep learning to enable robust malware detection in Windows without extensive feature engineering. Deep neural networks can automatically learn complex positional relationships and sequences found in malware code. Their ability to generalize from raw binaries and log data also improves detection of new malware strains over time. Overall, deep learning shows significant promise for enhancing Windows malware detection.

3.2 Android Malware Detection using Deep Learning Techniques
Researchers have used deep learning to construct malware and intrusion detection systems for the Android platform, similar to their work on Windows. Meta-data from the literature on Windows-based malware detection and the data highlighted in the research questions (RQ7 and RQ8) are used in this section to analyze the work conducted on the Android platform.

Devi [65] offered a solution for identifying malicious Android applications based on the permissions they request from the user. They collected data from Android package manifests and permissions, created feature vectors, and trained their model using neural networks and the k-means clustering technique. The method's low success rate (88%) is a disadvantage.

Karbab et al. [66] presented MalDozer, a tool for detecting Android malware based on the sequence of API method calls. MalDozer extracts API method calls from the classes of an Android package. The dex file is discretized by assigning an identifier to each API method, which yields semantic vectors. Neural networks are then used to predict whether an Android application is malicious. One advantage of their methodology is its ability to consistently achieve a high F1 score across several datasets.

Khedkar et al. [67] presented a methodology for detecting Android malware that relies on permissions. By employing network clustering techniques, the application organizes its attributes and generates a dataset for categorizing newly emerging malware.

Kim et al. [68] constructed feature vectors for several feature types: API methods, opcode features, permission features, shared library function opcode features, component features, and environmental features. The malware classification model is further enhanced by incorporating these vectors. By including many feature types instead of relying on a limited number, their approach has a competitive edge over other methodologies.

Milosevic and Huang [69] proposed a deep learning-based malware prediction system that takes CPU, memory, and battery usage into account. Their unsupervised technique collects data using encoder-decoder and LSTM networks and runs on a variety of platforms. The system's weakness is a relatively low F1 score of around 80%.

Yuan et al. [70] extracted three kinds of features to construct an online Android malware detection tool: crucial permissions, sensitive API calls, and dynamic behavior. These are inputs to deep belief networks, which detect malware in apps. Following an unsupervised training period, their model is refined via supervised backpropagation.

Yuan et al. [71] developed a deep learning approach that takes advantage of sensitive APIs, permissions, and dynamic behavior. Their method classifies malware with a success rate of over 96% by using deep belief networks and a total of over 200 features.

Yen and Sun [72] provided a technique for identifying malware in APK files based on the importance of terms. Text mining with Term Frequency-Inverse Document Frequency (TFIDF) assigns a value to each word in the APK's decompiled Java classes. A CNN architecture then operates on images built from these term weights.

Xie et al. [73] extracted seven types of malware features for use with a CNN, including APIs, hardware, intents, permissions, and restricted APIs. To train and validate the CNNs, they first create feature vectors, then turn them into matrices and split the dataset. The reported accuracy is 99.25%.

Wang et al. [74] combined a deep autoencoder with a customized CNN known as CNN-S to detect Android malware. To train the model, they employ seven different types of features, such as restricted APIs, permissions, intents, and coding
patterns. Its 99.82% accuracy is a major advantage.

Luo et al. [75] introduced ITMF, which analyzes and detects Android malware using image texture median filtering. The lower noise levels in ITMF improve both image and signal processing. The grayscale images are supplied to ITMF for feature extraction once the binary images have been transformed to vectors. Deep belief networks outperform shallow learning when trained on properties such as APIs and URLs.

Saif et al. [76] created a deep belief network system based on static and dynamic analysis of Android applications. Relief feature selection is used to reduce the number of features that contribute to a vector, which may include manifest nodes, API calls, system functions, and dynamic behavior. This vector is used to build a deep learning classifier. Using API call graphs, Pektas and Acarman [77] presented a technique for detecting Android malware. The model is fed graph embedding vectors after a given number of consecutive API requests. This simplifies the extraction of features from API call patterns for use in malware classification. Deep learning has been intensively researched for static, dynamic, and hybrid analysis of Android malware. The use of neural networks in conjunction with data mining and filtering procedures results in excellent accuracy without the need for manual feature engineering.

Pektas and Acarman [78] created a method for detecting Android malware that uses features from instruction call graphs to examine every possible execution path. Their method derives call trees and execution paths in terms of opcodes via pseudo-dynamic analysis. Graphs of potential paths are created and then translated into numerical vectors. These vectors are fed into a Long Short-Term Memory (LSTM) recurrent neural network that estimates malware risk. The results were more accurate than those of typical machine learning approaches.

Nauman et al. [79] investigated numerous deep learning algorithms for Android malware detection at scale, including CNNs, DBNs, LSTMs, and autoencoders. Static analysis and manifest files supply the deep models with information such as components, limited APIs, and permissions. The efficiency of various architectural layouts was evaluated using malware feature data.

A. Martín et al. [80] presented CANDYMAN, which combines deep learning, dynamic analysis, and Markov chains to classify Android malware. DroidBox is used to gather information on the network, files, courses, and SMS services in real time. These data are fitted with a Markov chain model to create feature vectors that are fed into deep neural networks for classification. The deep learning and machine learning experiments, however, showed only a moderate gain in accuracy (approximately 81%).

Shiqi et al. [81] developed an attention-CNN-LSTM model to extract texture fingerprints and malware activity embeddings from binaries using deep belief networks. Malicious applications are converted into grayscale images. The attention-CNN-LSTM architecture receives the fingerprint features and activity embeddings required for malware identification. The model outperformed typical machine learning algorithms in terms of precision.

The use of techniques such as dynamic analysis, call graph mapping, opcode extraction, permissions, and API analysis has improved Android malware detection using deep learning. Long short-term memory networks that combine CNNs with attention mechanisms and binary image converters are also promising. However, further research is needed to increase generalizability and accuracy in the face of expanding mobile threats.

Halim et al. [82] proposed an Android malware detection method that uses the Bag of Words model to extract hardware, permission, API, intent, and network address features. A convolutional neural network (CNN) and a long short-term memory (LSTM) stack were examined as deep learning architectures. In malware detection tests, both the CNN-LSTM and the LSTM-CNN obtained 98.53% accuracy.

Elsersy and Anuar [83] presented a deep belief network strategy that uses Lasso feature selection and shrinkage. They compared a KNN classifier to a DBN classifier for malware detection and found that the latter was more accurate. Overall accuracy, however, was limited to 85.22%.

Using API call sequences, D'Angelo et al. [84] generated sparse matrices to model temporal behavior in an autoencoder system. Features extracted from the sparse matrices are used to distinguish between malicious and benign software.

To recognize Android malware, Chen et al. [85] recommended modeling permission and API features as word vectors using word2vec.

Amin et al. [86] combined DBNs, LSTMs, CNNs, and autoencoders with other deep learning algorithms to develop a method for identifying Android malware based on .dex files. They claim that their analysis of bytecode features can classify malware with 99.9% accuracy.
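
The Markov-chain featurization described above for CANDYMAN [80] can be illustrated with a small sketch. The Python below is not the authors' implementation: the event categories, the choice of a first-order chain, and the function name `markov_features` are assumptions made purely for illustration. It estimates transition probabilities between consecutive events in a dynamic-analysis trace and flattens them into a fixed-length vector of the kind that could be passed to a classifier.

```python
from collections import Counter
from itertools import product

# Hypothetical event categories; the review only names the kinds of
# activity recorded during dynamic analysis (network, files, SMS, ...).
STATES = ["network", "file", "sms"]


def markov_features(events, states=STATES):
    """Flatten the first-order transition matrix of `events` into a vector.

    Entry (a, b) is the estimated probability that event b directly
    follows event a. Rows for states that never occur as a source
    are left at 0.0.
    """
    pair_counts = Counter(zip(events, events[1:]))  # counts of (a, b) pairs
    row_totals = Counter(events[:-1])               # how often a is a source
    return [
        pair_counts[(a, b)] / row_totals[a] if row_totals[a] else 0.0
        for a, b in product(states, repeat=2)
    ]


# Illustrative trace, not data from any surveyed study.
trace = ["network", "file", "file", "sms", "network", "file"]
vector = markov_features(trace)
print(vector)
```

With three states the vector always has nine entries, so traces of different lengths map to inputs of the same size. In the sample trace every network event is followed by a file event, so the network-to-file entry is 1.0.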

Alzaylaee et al. [87] presented a system that uses DynaLog dynamic analysis to extract features such as APIs, actions, and permissions while Android apps are running. The highest-scoring features according to InfoGain are fed into deep learning models that detect malware.

3.3 IoT Malware detection using deep learning techniques

Many methods and strategies for recognizing malicious files and IoT malware are documented in the literature. Specific strategies combine static and dynamic analysis to detect Android-based mobile malware. Research continues on a classification model to investigate the relationship between potential hazards and vulnerabilities in home automation systems [155]. The security of smartphones in the context of the Internet of Things, as well as the detection of application threats and impacts, is also being investigated.

In terms of detection methods, IoT malware detection approaches can be divided into two primary categories: dynamic and static analysis.

Future research will focus on analyzing IoT security difficulties, problems, and challenges and discussing security objectives, aims, and vulnerabilities. Various detection strategies based on malware characteristics, tracking of hazardous actions, and energy usage are being researched to reduce the threat of IoT malware [160]. Malware detection is also automated using machine learning methods such as ensemble classifiers and the AdaGrad optimization algorithm. These strategies aim to recognize application attributes and classify applications as risky, aggressive, benign, or malicious in order to safeguard IoT networks from malware attacks and improve the overall security and stability of the Internet [161]. Given the expanding number of Internet of Things devices and their importance in many applications, efficient malware detection and mitigation approaches are critical. The proposed architecture for generating and recognizing new IoT malware samples at the edge layer of IoT networks using raw byte code is a potential solution to the scarcity of malware samples for machine learning-based detection approaches [162].

However, there is insufficient literature specifically concentrated on IoT malware. Despite this scarcity, analysts have identified it as a substantial threat to internet security and stability. They stress the importance of comprehending IoT malware through analysis and detection to mitigate 2022 to 2023 effectively, and it is evident that there is a growing concern about the threat of IoT malware and a need for effective detection and mitigation techniques.

Machine learning, specifically deep learning, is widely explored for IoT malware detection and classification. These techniques have shown great potential in improving the accuracy and efficiency of IoT malware detection compared to traditional methods. However, there is a lack of research on IoT malware analysis, and most existing studies use simple detection methods. Future studies should focus on developing advanced deep learning models tailored explicitly for IoT malware detection. Furthermore, the application of generative adversarial networks in developing deep learning models for identifying Android malware has been suggested. This also implies the possibility of investigating the use of generative adversarial networks for IoT malware detection. These advancements in detecting and analyzing malware demonstrate a growing recognition of IoT malware as a significant threat to Internet security. The review of existing literature emphasizes the importance of proactively identifying and mitigating IoT-based threats through advanced learning methods, accentuating the need for practical detection approaches that do not rely heavily on prior knowledge about malware features [163].

By comparing IoT and Android samples through software analysis and using graph properties obtained from control flow graph structures, a detection system for IoT malware can be built using advanced deep learning models [164].

These techniques include using deep learning algorithms and generative adversarial networks to detect and classify IoT malware and employing abstract graph structures, such as control flow graphs, for analyzing and detecting IoT malware. The use of generative adversarial networks to develop deep learning models for highly accurate identification of unknown malware samples has resulted in significant progress in IoT malware detection [165,166]. Furthermore, the system proposed for creating new malware samples at the edge layer of IoT networks using raw byte code holds much promise. This could help alleviate the issue of malware sample scarcity for machine learning-based detection systems [167]. As IoT devices continue to multiply and play a critical role in various applications, the emphasis on effective malware detection and mitigation methods becomes even more crucial [168].

3.4 Windows malware detection using the latest ML techniques

Malware targeting Windows operating systems is one of the most prevalent cybersecurity threats
today. Traditional signature-based antivirus solutions are inadequate in detecting new and evolving malware variants. However, machine learning techniques have emerged as a promising approach for detecting both known and novel malware [155].

Recent research has explored various machine learning algorithms for Windows malware detection, including deep learning neural networks, ensemble learning, and support vector machines. Deep learning methods like convolutional neural networks (CNNs) can automatically learn complex features from raw byte sequences of malware samples. Studies have shown that CNNs achieve over 99% detection accuracy on benchmark Windows malware datasets [151].

In addition to deep learning, ensemble methods like random forests and gradient boosted trees have also proven effective for Windows malware detection. Ensemble learners combine multiple weak predictive models to create an overall strong predictor. The random forest algorithm trains multiple decision trees on different subsets of features and data points, aggregating their outputs for the final classification. This provides robustness against overfitting on training data [152,153].

Research has also utilized support vector machines (SVMs) for malware classification. SVMs identify optimal hyperplanes to distinguish between malicious and benign software samples. Kernel functions like radial basis functions help SVMs classify complex malware types. SVMs achieve high accuracy, but their performance depends on careful feature engineering and selection [154,155]. Overall, modern machine learning has enhanced static, dynamic, and hybrid analysis of Windows malware. Static analysis focuses on characteristics extracted from the malware executable, while dynamic analysis executes the sample in a contained environment [156,157]. Hybrid analysis combines both for comprehensive detection [158]. Ultimately, an ensemble of multiple machine learning models provides optimal malware detection capabilities on the Windows platform [159,160].

3.5 Android malware detection using the latest ML techniques

Malware targeting the Android mobile operating system has exploded in recent years. Traditional malware scanners depend on malware signatures and often fail to detect new threats that elude signature databases. However, machine learning presents a robust solution for identifying both known and zero-day Android malware.

Various machine learning algorithms have been leveraged for Android malware detection, including deep neural networks, logistic regression, naïve Bayes, random forests, and support vector machines. Deep learning models like deep belief networks (DBNs) and convolutional neural networks (CNNs) can automatically extract useful features from raw binaries and permissions [161,162].

Other techniques like logistic regression, naïve Bayes, and random forests rely on expert-defined features based on static and dynamic analysis. Key features include requested permissions, intent filters, and function calls in the code. Dimensionality reduction methods like principal component analysis help eliminate redundant features. Support vector machines (SVMs) have proven particularly effective, as they can model complex Android malware families [163,164]. Most Android malware detection systems take a hybrid approach, combining static and dynamic analysis [165]. For instance, DBNs could first extract features from the Android application package (APK) code and manifest [166]. Then, an SVM or random forest algorithm could classify the app as malicious or benign based on those features [167,168].

Overall, machine learning has made Android malware detection scalable and automated [55]. Deep learning methods obviate manual feature engineering, while ensemble methods like random forest provide robust predictions. As Android malware continues to evolve, these AI-based techniques will grow increasingly important for security [59].

3.6 IoT Malware detection using the latest ML techniques

Internet of Things (IoT) devices are proliferating rapidly, but often lack adequate security. This makes them attractive targets for malware, including botnets like Mirai that compromise IoT devices for DDoS attacks. Machine learning presents a promising approach to detect malware infecting IoT devices like routers, IP cameras, and connected appliances [161].

A primary challenge with IoT malware detection is the diversity of hardware architectures and operating systems. ML techniques should be platform-agnostic to detect malware on Linux, RTOS, and other IoT OSes. Deep learning methods like convolutional neural networks (CNNs), recurrent neural networks (RNNs), and autoencoders can analyze raw binary files and network traffic on any platform [162].

In addition, IoT devices have limited computing capacity, precluding complex ML model deployment locally. Hence, ML-based IoT malware detection is best performed at the network level.
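
To make network-level detection more concrete, the following minimal sketch profiles benign traffic and flags readings that deviate from it. It uses a simple per-feature z-score rule as a stand-in for a trained anomaly detection model; the two traffic features, the sample values, and the threshold of three standard deviations are illustrative assumptions and are not taken from any surveyed study.

```python
from statistics import mean, stdev


def fit_profile(normal_samples):
    """Learn per-feature mean and standard deviation from benign traffic only."""
    columns = list(zip(*normal_samples))
    return [(mean(col), stdev(col)) for col in columns]


def is_anomalous(sample, profile, threshold=3.0):
    """Flag a sample if any feature is more than `threshold` std devs from its mean."""
    return any(
        abs(x - mu) > threshold * sigma
        for x, (mu, sigma) in zip(sample, profile)
    )


# Hypothetical per-device features: (packets per second, distinct destination IPs).
benign = [(10, 2), (12, 3), (11, 2), (9, 2), (13, 3)]
profile = fit_profile(benign)

print(is_anomalous((11, 2), profile))     # a reading close to the benign profile
print(is_anomalous((400, 150), profile))  # a burst of traffic to many hosts
```

Because the profile is fitted on benign traffic alone, no labeled malware samples are needed, and the computation can run on a gateway rather than on the constrained device. In this example the first reading is not flagged and the second is.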

Network traffic analysis can identify anomalies and malicious connections. Supervised models like random forest and SVM can classify benign vs. malicious traffic when trained on labeled data samples [154,155].

Moreover, unsupervised ML algorithms like isolation forests, k-means clustering, and one-class SVM can detect IoT malware without labeled training data [60]. Such anomaly detection models learn patterns of normal behavior, flagging deviations as potential threats [63,64]. Overall, ML delivers efficient and robust IoT malware detection amidst hardware diversity and limited on-device capabilities [65,66]. Deep learning extracts useful features from binaries and network traffic, while ensemble methods classify threats. Anomaly detection techniques work even with limited labeled data [71]. As IoT adoption accelerates, ML will become indispensable to securing these devices against malware intrusions.

3.7 Linux Malware Detection

In addition to Windows, Android, and IoT devices, Linux-based systems are also vulnerable to malware threats [169]. As servers, desktops, and cloud infrastructure increasingly run on Linux, detecting Linux malware has become crucial. Signature-based antivirus tools are inadequate for detecting new Linux malware strains [170]. Machine learning provides robust Linux malware detection capabilities by modeling unique characteristics of Linux malware families [171].

Debnath S et al. [172] discussed the key features for Linux malware detection, which include executable metadata such as format, headers, sections, and libraries used. Dynamic analysis examines runtime behaviors like system calls, network activity, and file operations once the sample is executed in a contained environment. Hybrid approaches combine both static and dynamic features [172,173].

Shallow machine learning algorithms like logistic regression, naïve Bayes, support vector machines (SVMs), and random forests have been applied for Linux malware detection using expert-defined feature engineering [174]. Deep learning techniques like CNNs and RNNs can automatically extract useful features from raw binaries, disassembled code, assembly instructions, and system calls [175].

Unsupervised learning is also relevant for Linux anomaly detection, as normal behavior can be profiled to detect deviations. One-class SVMs, isolation forests, and autoencoders identify anomalies without labeled malware samples [176]. Ensemble models that combine multiple shallow and deep learning algorithms also boost detection accuracy. To improve generalization across evolving Linux malware families, adversarial machine learning can augment training data with mutated malware samples. Transfer learning can leverage models trained on other platforms like Windows and Android and transfer knowledge to the Linux domain [177]. Overall, machine learning has emerged as a powerful tool for Linux malware detection amidst the rise of Linux adoption. Advanced deep learning and ensemble approaches overcome limitations of traditional signature-based methods. As malicious actors increasingly target Linux devices, robust ML-powered detection capabilities are crucial for security [178].

3.8 Main Deep Learning Algorithms in Malware Detection

We examined the application of various DL models in the literature [37] in order to address RQ4 and determine the primary deep learning algorithms utilized for malware detection. Table 2 summarizes the results, showing that Convolutional Neural Networks (CNNs) were employed in over 50% of the surveyed papers [38]. This makes CNNs the predominant technique for malware detection [39,40]. Various forms of LSTM-based neural networks were used in 25 studies, comprising 25% of the literature [41]. DBN and autoencoder-based algorithms were applied in 12% and 10% of publications, respectively [42,43].

As in many other domains, convolutional neural networks are the most popular deep learning approach for malware detection and classification [44]. CNNs can detect meaningful features from unsupervised data, making them well-suited for classification problems like image recognition, medical imaging, and malware detection [45]. In particular, CNNs are widely reported to be highly effective for image classification and object detection. The capability of CNNs to learn robust features from raw inputs like malware binaries and images enables their widespread use [46].

On the other hand, recurrent neural networks like LSTMs can model sequential data and time-based patterns in malware code and behavior. Variants of LSTM account for a significant portion of deep learning malware research. Autoencoders and DBNs have shown promising results for learning compressed representations of benign and malicious files [47]. The dominance of CNNs aligns with their demonstrated ability to learn spatial patterns from malware binaries represented as images or raw byte sequences. While other techniques have niche uses, convolutional neural networks are the workhorse of deep learning for robust malware detection across
most studies. The extensive use of CNNs highlights their applicability for learning features automatically from low-level malware data [151,152]. See Table 3.

Regarding RQ1, Windows PCs and Android mobile devices are the most common platforms targeted by malware, due to their widespread adoption. However, emerging research also looks at threats for Linux systems, IoT devices, and websites. On Windows, key malware threats include viruses, trojans, spyware, and ransomware. For Android, the open app ecosystem leads to risks from malicious apps containing backdoors, spyware, ransomware, and banking trojans. Websites face threats like drive-by downloads and watering hole attacks that distribute malware. The prevalence of these platforms makes them prime targets for attackers. As their adoption continues growing, securing them from malware threats is crucial.

Regarding RQ2, the peak era of deep learning malware detection research is 2015-2023. Most studies have focused on Windows and Android malware, with top publication venues being IEEE Transactions on Information Forensics and Security, Computers & Security, and IEEE Access. The surge in deep learning research for malware coincides with the emergence of AI/ML across security domains. CNNs and other deep learning methods allow learning from raw malware samples like binaries and bytecode without extensive feature engineering. Their ability to automate feature extraction and train on low-level malware data has driven adoption for detecting constantly evolving threats.

Regarding RQ3, the most widely used deep learning techniques are convolutional neural networks (CNNs), recurrent networks like LSTM, deep belief networks (DBNs), and autoencoders. CNNs can directly learn spatial patterns and relationships from raw binaries and opcode sequences. LSTMs and GRUs model the sequential nature of malware behaviors over time. DBNs help learn hierarchical abstract features from malware. Autoencoders allow learning compact latent representations of malware. These techniques enable end-to-end learning from low-level malware executables, API calls, opcodes, etc. without relying on manual feature extraction and selection.

Regarding RQ4, convolutional neural networks dominate among the deep learning algorithms, applied in over 50% of papers. RNN/LSTM networks are also popular for sequential data. Autoencoders have niche uses for anomaly detection. DBNs help extract hierarchical features. The prevalence of CNNs highlights their ability to learn spatial relationships from malware binaries represented as images or sequential data. They can effectively model positional patterns in malware code and binaries where feature location matters.

Regarding RQ5, the key challenges faced in applying deep learning for malware detection include limited labeled training data, model overfitting on seen malware families, evasion attacks degrading generalization, and lack of model interpretability. Sustaining accuracy over long periods as new malware strains continuously evolve is an open research problem. Adversarial malware can craft inputs to evade detection. Hybrid deep learning ensemble models help improve robustness, but a universal robust solution remains lacking. Lack of transparency around model logic also makes real-world deployment difficult.

Regarding RQ6, most Android malware detection studies do not focus on sustainability, automatic selection of optimal algorithms, or continuous retraining over time. Yet adaptability over long periods is critical as the malware landscape rapidly evolves. Incremental learning to update models while retaining performance over years remains an open challenge. Automated selection of the best performing deep learning architectures is also lacking.

Regarding RQ7, static, dynamic, and hybrid analysis techniques are used to extract features from Android apps for detecting new malware variants. However, more research is needed to strengthen zero-day threat detection. Generative adversarial networks show promise for improving generalization, but evasive malware continues to be a challenge. Signatures and heuristics have limited effectiveness for brand new threats, so techniques to boost resilience are crucial.

Regarding RQ8, the widely used datasets for evaluating Android malware detection include Drebin, Contagio, VirusShare, and the Android Genome Project. For Windows malware, common datasets are EMBER, BIG2015, and SOREL-20M. Standard datasets allow comparing different techniques, but they may not reflect real-world diversity. Expanding datasets with adversarial samples can help improve robustness. Overall, the lack of rich labeled real-world data remains a key limitation.

4. DISCUSSION

4.1 Effectiveness of Deep Learning in Malware Detection

Deep learning works best when it is used to analyze unstructured data. We need to either
organize the data or create systems that can evaluate unstructured data because the majority of the data produced by these systems is unstructured and comes in different forms. Deep learning can be used to develop malware detection algorithms that work better with unstructured and unlabeled data. Furthermore, if the raw data supplied accurately represents the problem, a deep learning system can quickly complete thousands of challenging and repetitive tasks after only one training session. Traditional malware detection systems lack machine learning or deep learning algorithms, which makes them ineffective in identifying new malware strains. They rely on frequently updated "malware definitions" to identify potential threats.

However, after training, machine learning and deep learning algorithms can identify complex patterns in both structured and unstructured data, which is essential for creating effective malware detection systems. Malware authors create programs whose code may change during transmission, rendering them undetectable by typical pattern-matching tools. Such malware is smart yet simple and readily tricks pattern-matching programs. Many malware samples have similar behavioral characteristics that ML and DL algorithms may use to uncover previously undiscovered malware.

4.2. Performance of DL Compared with ML

When trained on large datasets, deep learning algorithms outperform machine learning algorithms in terms of output accuracy. High-level attributes may be inferred using these techniques without the need for time-consuming feature extraction or specific subject knowledge. Deep learning was found to be more effective than machine learning in various papers that we re-examined, which used both machine learning and deep learning techniques to detect malware [101,102].

Rhode et al. [39] developed early-stage malware detection that operates during the first few seconds of a program's execution. They compared an RNN to common learning algorithms like SVM (support vector machine) and found that the RNN performed better. While the SVM's 80% accuracy was rather good, the RNN's 96% accuracy after 19 seconds was far superior. The accuracy of Decision Trees was 92.6%, whereas that of the Random Forest classifier was 92%. Haddad-pajouh et al. [103] employed RNN-LSTM to identify threats in an IoT setting with an accuracy of 98.18%. They also used more traditional forms of machine learning, with KNN yielding the highest accuracy (94%) among them.

Loukas et al. [114] created an automotive cyber-attack intrusion detection system in which an RNN reached an accuracy of 86.9%. They obtained accuracies of 73.3%, 74%, 77.3%, and 79.9%, respectively, using a range of machine learning techniques, including Logistic Regression, Decision Tree, Random Forest, and SVM. The Ullah et al. [102] cyber threat detection system was far more accurate (96%) than earlier systems that relied on machine learning algorithms.

Vinayakumar et al. [109] developed an intrusion detection system that relied on deep neural networks and achieved an accuracy of 99.2%. Conventional machine learning methods achieved an average accuracy of almost 80%. Luo et al. [81] used a method called attention CNN-LSTM to detect Android malware. They observed that the average accuracy of their deep learning-based model was 96%, whereas that of the SVM- and KNN-based models was 95% and 94%, respectively.

Pektas and Acarman [78] proposed a comparable methodology for identifying Android malware based on instruction call graphs. Their study yielded a 91.4% accuracy rate. The approach demonstrated higher accuracy than several off-the-shelf learning algorithms (KNN, Logistic Regression, SVM, and Random Forest), which reached 80%, 70%, 79%, and 89%, respectively.

Schranko de Oliveira and Sassi [90] developed an Android malware detection system using a deep neural network. With an accuracy of 91%, this system outperforms other machine learning methods, including Regression, Extra Trees, and K-Nearest Neighbors (KNN). Jain et al. [55] showed that Extreme Learning Machines (ELM) with a single hidden layer achieved 97.7% accuracy, surpassing the 96.3% obtained by a Convolutional Neural Network (CNN) architecture.

Similarly, Pastor et al. [118] compared many classical learning algorithms to a CNN and found that the conventional algorithms produced equivalent or better results when recognizing crypto-mining activities. In most cases, deep learning models significantly outperformed their non-deep counterparts [103,104]. These numbers point to the efficacy of deep learning systems in detecting and pursuing threats like malware. We may not find a highly precise scalable solution using only shallow learning methods [105]. Nonetheless, as previous studies and the results of our experiment demonstrate, DL algorithms are not guaranteed to outperform ML techniques [106]. We compared the accuracy of Deep Autoencoders to that
of many other ML techniques and found that machine learning models performed better [107,108].

4.3. Challenges in Malware Detection Using Deep Learning

To train machine learning and deep learning algorithms, a large amount of data is required [109]. One of the most difficult challenges in malware and threat detection is providing the algorithm with enough malicious and benign samples [110]. It is critical to keep the public datasets [111] up to date so that models can be trained on the most recent malware samples. There is also the issue of "overtraining the model" (overfitting), which might result in incorrect findings. This can happen if the data is noisy or if labels are inaccurate [112]. Several studies have demonstrated high levels of accuracy [113,114]. They did not, however, present any experimental evidence of their systems’ resilience to new malware threats [115]. To benefit from the advantages of deep learning over standard threat detection systems [125,126], malware detection systems must be able to differentiate between unique malware types and variants on the malware samples used for training.

The fast expansion of the Android ecosystem and the associated multiplication of issues [139,140] emphasize the need for a flexible approach that can be used regularly in the future to identify new types of threats. To address continually changing malware, the model only has to be altered every few years, at the expense of minor performance advantages. The model’s evolvability is what makes it sustainable.

There are several deep learning algorithms in the literature that can deal with complicated challenges like malware classification across large data sets. However, most researchers fail to identify the best approach for their problem because they seldom train, test, and deploy candidate models to address ML algorithm selection challenges [141]. Due to Android malware’s dynamic nature and quick growth, maintaining up-to-date supervised detection models is a difficult task [142,144]. A long-term malware detection model must be created that can automatically update itself over time in an efficient and scalable manner. When determining a model’s long-term viability, examine the retention rate, lifespan, and performance reduction after the specified time frame.

One of the key problems in using deep learning for autonomous feature engineering is picking or automatically learning features that will perform well over time and in the future [127]. Static, dynamic, and hybrid analytic techniques have been employed in the literature to automatically extract features for training the DL model [128]. The data quality is critical for machine learning and deep learning algorithms used to detect malware [129,130]. As a result, in addition to technological methodologies, the availability of a large and insightful dataset is important to the predicted accuracy of such systems [131,132].

5. CONCLUSIONS

This systematic literature review set out to provide a comprehensive overview of deep learning techniques applied for malware detection across computing platforms. Our analysis of 107 highly relevant studies from 2015-2023 reveals that convolutional neural networks dominate this research landscape, enabling effective learning of spatial patterns from raw malware samples. However, several limitations remain that constrain real-world deployment of deep learning-based malware defense systems.

A key objective was assessing the effectiveness of deep learning compared to traditional machine learning approaches relying on manual feature engineering. The literature overwhelmingly demonstrates enhanced accuracy from deep learning models like CNNs, LSTMs, and autoencoders that automatically extract useful features from executable files, byte sequences, and API calls. However, model resilience against obfuscation attacks that degrade generalization requires further improvement.

We also sought to identify key challenges faced in applying deep learning for robust malware detection. Insufficient labeled real-world training data, lack of model interpretability, and inability to sustain accuracy over long periods emerged as primary limitations. Though deep learning achieves high accuracy on benchmark datasets, performance in operational environments remains uncertain.

In conclusion, while deep learning shows significant promise for malware detection, progress on robustness, transparency, and continuous retraining is needed. As malware threats persist and evolve, developing sustainable, self-adaptive deep learning models must be a priority. This systematic review highlighted crucial gaps that need addressing to realize the potential of AI-powered techniques for reliable malware defense. More work is required to transition promising research solutions to large-scale real-world deployment.


6. FUTURE EXTENSIONS

Our research led us to the following implications for readers and future researchers interested in malware detection using machine learning and deep learning:
• Because of the explosion of internet data, traditional data processing techniques cannot deal with the ensuing massive data quantities [133,134]. Big data frameworks like Hadoop and Spark enable the processing of enormous datasets [135,136]. Because huge volumes of data must be processed, internet security systems may benefit from merging deep learning virus detection with big data technologies [137].
• It remains uncertain how well deep learning malware systems proposed in research will perform at scale on big datasets. Testing and validation of big data is an open challenge [138].
• Web and internet security has not received as much attention in deep learning security research as the Windows and Android platforms have. However, internet security is just as important [139].
• Our study suggests developing deep learning systems for internet security should be an area of focus [140].
• Many studies report high malware detection accuracy, up to 99.9%, with deep learning. However, realizing this performance in real-world deployments remains an open problem. Researchers should enable easy and effective use of deep learning for malware protection by end users.
• Developing sustainable, self-evolvable deep learning models that avoid frequent retraining is important.

7. ACKNOWLEDGMENT

This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. 5440].

REFERENCES:
[1]. N. Idika and A. P. Mathur, “A survey of malware detection techniques,” Purdue University, vol. 48, no. 2, 2007.
[2]. Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[3]. M. Christodorescu, S. Jha, S. A. Seshia, D. Song, and R. E. Bryant, “Semantics-aware malware detection,” in Proceedings of the 2005 IEEE Symposium on Security and Privacy (S&P’05), pp. 32–46, IEEE, Oakland, CA, USA, May 2005.
[4]. E. Gandotra, D. Bansal, and S. Sofat, “Malware analysis and classification: a survey,” Journal of Information Security, vol. 05, no. 02, pp. 56–64, 2014.
[5]. A. S. Bist, “A survey of deep learning algorithms for malware detection,” International Journal of Computer Science and Information Security, vol. 16, no. 3, 2018.
[6]. A. Naway and Y. Li, “A review on the use of deep learning in android malware detection,” 2018, https://arxiv.org/abs/1812.10360.
[7]. T. Sonali Kothari and V. Khedkar, “Analysis of recent trends in malware attacks on android phone: a survey using scopus database,” Library Philosophy and Practice, pp. 1–20, 2021.
[8]. B. Yadav and S. Tokekar, “Recent innovations and comparison of deep learning techniques in malware classification: a review,” International Journal of Information Security Science, vol. 9, no. 4, pp. 230–247, 2021.
[9]. M. V. R. Kumar, S. Anand Kumar, A. Bando, S. R. Gs, H. Shah, and S. C. Reddy, “A survey of deep learning techniques for malware analysis,” International Journal of Advanced Science and Technology, vol. 29, no. 4, pp. 6031–6042, 2020.
[10]. H. Lubuva, Q. Huang, and G. C. Msonde, “A review of static malware detection for Android apps permission based on deep learning,” International Journal of Computer Networks and Applications, vol. 6, no. 5, pp. 80–91, 2019.
[11]. D. Kwon, H. Kim, J. Kim, S. C. Suh, I. Kim, and K. J. Kim, “A survey of deep learning-based network anomaly detection,” Cluster Computing, vol. 22, no. S1, pp. 949–961, 2019.
[12]. R. A. Ariyaluran Habeeb, F. Nasaruddin, A. Gani, I. A. Targio Hashem, E. Ahmed, and M. Imran, “Real-time big data processing for anomaly detection: a survey,” International Journal of Information Management, vol. 45, pp. 289–307, 2019.
[13]. A. Souri and R. Hosseini, “A state-of-the-art survey of malware detection approaches using data mining techniques,” Humancentric Computing and Information Sciences, vol. 8, no. 1, pp. 3–22, 2018.


[14]. S. Sharma and A. Kaul, “A survey on Intrusion Detection Systems and Honeypot based proactive security mechanisms in VANETs and VANET Cloud,” Vehicular Communications, vol. 12, pp. 138–164, 2018.
[15]. J. Martínez Torres, C. Iglesias Comesaña, and P. J. García Nieto, “Review: machine learning techniques applied to cybersecurity,” International Journal of Machine Learning and Cybernetics, vol. 10, no. 10, pp. 2823–2836, 2019.
[16]. S. Hajiheidari, K. Wakil, M. Badri, and N. J. Navimipour, “Intrusion detection systems in the Internet of things: a comprehensive investigation,” Computer Networks, vol. 160, pp. 165–191, 2019.
[17]. R. Coulter and L. Pan, “Intelligent agents defending for an IoT world: a review,” Computers & Security, vol. 73, pp. 439–458, 2018.
[18]. E. Benavides, W. Fuertes, S. Sanchez, and M. Sanchez, “Classification of phishing attack solutions by employing deep learning techniques: a systematic literature review,” Developments and Advances in Defense and Security, pp. 51–64, 2020.
[19]. K. Bakour, H. M. Ünver, and R. Ghanem, “The Android malware detection systems between hope and reality,” SN Applied Sciences, vol. 1, no. 9, pp. 1120–1142, 2019.
[20]. M. Aly, F. Khomh, M. Haoues, A. Quintero, and S. Yacout, “Enforcing security in Internet of Things frameworks: a systematic literature review,” Internet of Things, vol. 6, Article ID 100050, 2019.
[21]. A. M. Aleesa, B. B. Zaidan, A. A. Zaidan, and N. M. Sahar, “Review of intrusion detection systems based on deep learning techniques: coherent taxonomy, challenges, motivations, recommendations, substantial analysis and future directions,” Neural Computing & Applications, vol. 32, no. 14, pp. 9827–9858, 2020.
[22]. Z. Wang, Q. Liu, and Y. Chi, “Review of android malware detection based on deep learning,” IEEE Access, vol. 8, pp. 181102–181126, 2020.
[23]. B. Kitchenham, “Procedures for Performing Systematic Reviews,” Keele UK Keele University, vol. 33, pp. 1–26, 2004.
[24]. B. Kitchenham and S. Charters, “Guidelines for Performing Systematic Literature Reviews in Software Engineering,” 2007.
[25]. S. Keele, “Guidelines for performing systematic literature reviews in software engineering,” vol. 5, EBSE, Mumbai, India, 2007, Technical report Ver. 2.3 EBSE Technical Report.
[26]. S. Ni, Q. Qian, and R. Zhang, “Malware identification using visualization images and deep learning,” Computers & Security, vol. 77, pp. 871–885, 2018.
[27]. Y. Zhao, C. Xu, B. Bo, and Y. Feng, “Maldeep: a deep learning classification framework against malware variants based on texture visualization,” Security and Communication Networks, vol. 2019, Article ID 4895984, 11 pages, 2019.
[28]. J. Zhang, K. Zhang, Z. Qin, H. Yin, and Q. Wu, “Sensitive system calls based packed malware variants detection using principal component initialized MultiLayers neural networks,” Cybersecurity, vol. 1, no. 1, pp. 10–13, 2018.
[29]. J. Zhang, Z. Qin, H. Yin, L. Ou, and K. Zhang, “A feature-hybrid malware variants detection using CNN based opcode embedding and BPNN based API embedding,” Computers & Security, vol. 84, pp. 376–392, 2019.
[30]. W. Zhong and F. Gu, “A multi-level deep learning system for malware detection,” Expert Systems with Applications, vol. 133, pp. 162, 2019.
[31]. B. Zhang, W. Xiao, Xi Xiao, A. K. Sangaiah, W. Zhang, and J. Zhang, “Ransomware classification using patch based CNN and self-attention network on embedded N-grams of opcodes,” Future Generation Computer Systems, vol. 110, pp. 708–720, 2020.
[32]. D. Yuxin and Z. Siyi, “Malware detection based on deep learning algorithm,” Neural Computing and Applications, vol. 31, pp. 461–472, 2017.
[33]. S. Yue, “Imbalanced malware images classification: a cnn based approach,” 2017.
[34]. Y. Ye, L. Chen, S. Hou, W. Hardy, and X. Li, “DeepAM: a heterogeneous deep learning framework for intelligent malware detection,” Knowledge and Information Systems, vol. 54, no. 2, pp. 265–285, 2018.
[35]. L. Xiaofeng, Z. Xiao, J. Fangshuo, Y. Shengwei, and S. Jing, “ASSCA: API based sequence and statistics features combined malware detection architecture,” Procedia Computer Science, vol. 129, pp. 248–256, 2018.


[36]. R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, and S. Venkatraman, “Robust intelligent malware detection using deep learning,” IEEE Access, vol. 7, pp. 46717–46738, 2019.
[37]. S. Venkatraman, M. Alazab, and R. Vinayakumar, “A hybrid deep learning image-based analysis for effective malware detection,” Journal of Information Security and Applications, vol. 47, pp. 377–389, 2019.
[38]. M. Tang and Q. Qian, “Dynamic API call sequence visualisation for malware classification,” IET Information Security, vol. 13, no. 4, pp. 367–377, 2019.
[39]. M. Rhode, P. Burnap, and K. Jones, “Early-stage malware prediction using recurrent neural networks,” Computers & Security, vol. 77, pp. 578–594, 2018.
[40]. M. F. Rafique, M. Ali, A. S. Qureshi, A. Khan, and M. Mirza, “Malware classification using deep learning based feature extraction and wrapper based feature selection technique,” 2019.
[41]. M. H. Nguyen, D. L. Nguyen, X. M. Nguyen, and T. T. Quan, “Auto-detection of sophisticated malware using lazy-binding control flow graph and deep learning,” Computers & Security, vol. 76, pp. 128–155, 2018.
[42]. A. Namavar Jahromi, S. Hashemi, A. Dehghantanha et al., “An improved two-hidden-layer extreme learning machine for malware hunting,” Computers & Security, vol. 89, Article ID 101655, 2020.
[43]. Q. Le, O. Boydell, B. Mac Namee, and M. Scanlon, “Deep learning at the shallow end: malware classification for non-domain experts,” Digital Investigation, vol. 26, pp. S118–S126, 2018.
[44]. J. Y. Kim, S. J. Bu, and S. B. Cho, “Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders,” Information Sciences, vol. 460–461, pp. 83–102, 2018.
[45]. M. Kalash, M. Rochan, N. Mohammed, N. Bruce, Y. Wang, and F. Iqbal, “A deep learning framework for malware classification,” International Journal of Digital Crime and Forensics, vol. 12, no. 1, pp. 90–108, 2020.
[46]. S. Huda, S. Miah, J. Yearwood, S. Alyahya, H. Al-Dossari, and R. Doss, “A malicious threat detection model for cloud assisted internet of things (CoT) based industrial control system (ICS) networks using deep belief network,” Journal of Parallel and Distributed Computing, vol. 120, pp. 23–31, 2018.
[47]. D. Gibert, C. Mateu, J. Planes, and R. Vicens, “Using convolutional neural networks for classification of malware represented as images,” Journal of Computer Virology and Hacking Techniques, vol. 15, no. 1, pp. 15–28, 2019.
[48]. Z. Cui, F. Xue, X. Cai, Y. Cao, G. G. Wang, and J. Chen, “Detection of malicious code variants based on deep learning,” IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3187–3196, 2018.
[49]. Z. Cui, L. Du, P. Wang, X. Cai, and W. Zhang, “Malicious code detection based on CNNs and multi-objective algorithm,” Journal of Parallel and Distributed Computing, vol. 129, pp. 50–58, 2019.
[50]. L. Chen, “Deep transfer learning for static malware classification,” 2018, https://arxiv.org/pdf/1812.07606.
[51]. E. D. O. Andrade, J. Viterbo, C. N. Vasconcelos, J. Guérin, and F. C. Bernardini, “A model based on lstm neural networks to identify five different types of malware,” Procedia Computer Science, vol. 159, pp. 182–191, 2019.
[52]. A. F. Agarap, “Towards building an intelligent anti-malware system: a deep learning approach using support vector machine (SVM) for malware classification,” 2017, https://arxiv.org/abs/1801.00318.
[53]. E. Masabo, K. S. Kaawaase, and J. Sansa-Otim, “Big data: deep learning for detecting malware,” in Proceedings of the 2018 IEEE/ACM Symposium on Software Engineering in Africa (SEiA), pp. 20–26, IEEE, Gothenburg, Sweden, May 2018.
[54]. B. Yuan, J. Wang, D. Liu, W. Guo, P. Wu, and X. Bao, “Byte-level malware classification based on Markov images and deep learning,” Computers & Security, vol. 92, Article ID 101740, 2020.
[55]. M. Jain, W. Andreopoulos, and M. Stamp, “CNN vs ELM for image-based malware classification,” arXiv preprint arXiv:2103.13820, 2021.
[56]. F. O. Catak, A. F. Yazı, O. Elezaj, and J. Ahmed, “Deep learning based Sequential model for malware analysis using Windows exe API Calls,” PeerJ Computer Science, vol. 6, Article ID e285, 2020.


[57]. Y. Fang, Y. Zeng, B. Li, L. Liu, and L. Zhang, “DeepDetectNet vs RLAttackNet: an adversarial method to improve deep learning-based static malware detection model,” PLoS One, vol. 15, no. 4, Article ID e0231626, 2020.
[58]. H. Darabian, S. Homayounoot, A. Dehghantanha et al., “Detecting cryptomining malware: a deep learning approach for static and dynamic analysis,” Journal of Grid Computing, vol. 18, no. 2, pp. 293–303, 2020.
[59]. R. Mitsuhashi and T. Shinagawa, “High-accuracy malware classification with a malware-optimized deep learning model,” 2020.
[60]. D. Gibert, C. Mateu, and J. Planes, “HYDRA: a multimodal deep learning framework for malware classification,” Computers & Security, vol. 95, Article ID 101873, 2020.
[61]. D. Vasan, M. Alazab, S. Wassan, B. Safaei, and Q. Zheng, “Image-Based malware classification using ensemble of CNN architectures (IMCEC),” Computers & Security, vol. 92, Article ID 101748, 2020.
[62]. M. Cho, J. S. Kim, J. Shin, and I. Shin, “Mal2d: 2d based deep learning model for malware detection using black and white binary image,” IEICE - Transactions on Info and Systems, vol. 103, no. 4, pp. 896–900, 2020.
[63]. Y. Sung, S. Jang, Y.-S. Jeong, and J. H. J. J. Park, “Malware classification algorithm using advanced Word2vec-based BiLSTM for ground control stations,” Computer Communications, vol. 153, pp. 342–348, 2020.
[64]. X. Huang, L. Ma, W. Yang, and Y. Zhong, “A method for windows malware detection based on deep learning,” Journal of Signal Processing Systems, vol. 93, no. 2-3, pp. 265–273, 2021.
[65]. K. Devi, “Android malware detection using deep learning,” International Research Journal of Engineering and Technology (IRJET), Academia, vol. 06, no. 05, 2019.
[66]. E. B. Karbab, M. Debbabi, A. Derhab, and D. Mouheb, “Android malware detection using deep learning on api method sequences,” 2017, https://arxiv.org/abs/1712.08996.
[67]. Y. Suleiman, S. Sezer, and I. Muttik, “Android malware detection: An eigenspace analysis approach,” in Proceedings of the 2015 Science and Information Conference (SAI), pp. 1236–1242, 2015.
[68]. T. Kim, B. Kang, M. Rho, S. Sezer, and E. G. Im, “A multimodal deep learning method for android malware detection using various features,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 3, pp. 773–788, 2019.
[69]. N. Milosevic and J. Huang, “Deep learning guided Android malware and anomaly detection,” 2019, https://arxiv.org/abs/1910.10660.
[70]. Z. Yuan, Y. Lu, and Y. Xue, “Droiddetector: android malware characterization and detection using deep learning,” Tsinghua Science and Technology, vol. 21, no. 1, pp. 114–123, 2016.
[71]. Z. Yuan, Y. Lu, Z. Wang, and Y. Xue, “Droid-sec: deep learning in android malware detection,” in Proceedings of the 2014 ACM conference on SIGCOMM, pp. 371–372, Chicago, IL, USA, August 2014.
[72]. Y. S. Yen and H. M. Sun, “An Android mutation malware detection based on deep learning using visualization of importance from codes,” Microelectronics Reliability, vol. 93, pp. 109–114, 2019.
[73]. N. Xie, X. Di, X. Wang, and J. Zhao, “AndroMD: android malware detection based on convolutional neural networks,” International Journal of Performability Engineering, vol. 14, no. 3, p. 547, 2018.
[74]. W. Wang, M. Zhao, and J. Wang, “Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network,” Journal of Ambient Intelligence and Humanized Computing, vol. 10, no. 8, pp. 3035–3043, 2019.
[75]. S. Q. Luo, B. Ni, P. Jiang, S. W. Tian, L. Yu, and R. J. Wang, “Deep learning in Drebin: android malware image texture median filter analysis and detection,” KSII Transactions on Internet and Information Systems (TIIS), vol. 13, no. 7, pp. 3654–3670, 2019.
[76]. D. Saif, S. Elgokhy, and E. A. Sallam, “Deep Belief Networks-based framework for malware detection in Android systems,” Alexandria Engineering Journal, vol. 57, no. 4, pp. 4049–4057, 2018.
[77]. A. Pektaş and T. Acarman, “Deep learning for effective Android malware detection using API call graph embeddings,” Soft Computing, vol. 24, no. 2, pp. 1027–1043, 2020.
[78]. A. Pektaş and T. Acarman, “Learning to detect Android malware via opcode sequences,” Neurocomputing, vol. 396, pp. 599–608, 2020.


[79]. M. Nauman, T. Ali, S. Khan, and T. A. Syed, “Deep Neural Architectures for Large Scale Android Malware Analysis,” Cluster Computing, vol. 21, pp. 1–20, 2018.
[80]. A. Martín, V. Rodríguez-Fernández, and D. Camacho, “CANDYMAN: classifying Android malware families by modelling dynamic traces with Markov chains,” Engineering Applications of Artificial Intelligence, vol. 74, pp. 121–133, 2018.
[81]. L. Shiqi, Z. Liu, B. Ni, H. Wang, H. Sun, and Y. Yuan, “Android malware analysis and detection based on attention-CNN-LSTM,” Journal of Computers, vol. 14, no. 1, pp. 31–43, 2019.
[82]. M. A. Halim, A. Abdullah, and K. A. Z. Ariffin, “Recurrent neural network for malware detection,” Int. J. Advance Software Computing Appl, vol. 11, no. 1, pp. 43–63, 2019.
[83]. W. F. Elsersy and N. B. Anuar, “Android malware detection using deep belief network,” Pertanika Journal of Science and Technology, vol. 25, pp. 143–150, 2017.
[84]. G. D’Angelo, M. Ficco, and F. Palmieri, “Malware detection in mobile environments based on Autoencoders and API-images,” Journal of Parallel and Distributed Computing, vol. 137, pp. 26–33, 2020.
[85]. T. Chen, Q. Mao, M. Lv, H. Cheng, and Y. Li, “Droidvecdeep: android malware detection based on Word2Vec and deep belief network,” KSII Transactions on Internet and Information Systems (TIIS), vol. 13, no. 4, pp. 2180–2197, 2019.
[86]. M. Amin, T. A. Tanveer, M. Tehseen, M. Khan, F. A. Khan, and S. Anwar, “Static malware detection and attribution in android byte-code through an end-to-end deep system,” Future Generation Computer Systems, vol. 102, pp. 112–126, 2020.
[87]. M. K. Alzaylaee, S. Y. Yerima, and S. Sezer, “DL-Droid: deep learning based android malware detection using real devices,” Computers & Security, vol. 89, Article ID 101663, 2020.
[88]. X. Pei, L. Yu, and S. Tian, “AMalNet: a deep learning framework based on graph convolutional networks for malware detection,” Computers & Security, vol. 93, Article ID 101792, 2020.
[89]. T. Lu, Y. Du, L. Ouyang, Q. Chen, and X. Wang, “Android malware detection based on a hybrid deep learning model,” Security and Communication Networks, vol. 2020, Article ID 8863617, pp. 1–11, 2020.
[90]. A. Schranko de Oliveira and R. J. Sassi, “Chimera: an android malware detection method based on multimodal deep learning and hybrid analysis,” 2020.
[91]. F. Mercaldo and A. Santone, “Deep learning for image-based mobile malware detection,” Journal of Computer Virology and Hacking Techniques, vol. 16, no. 2, pp. 157–171, 2020.
[92]. M. Amin, D. Shehwar, A. Ullah, T. Guarda, T. A. Tanveer, and S. Anwar, “A deep learning system for health care IoT and smartphone malware detection,” Neural Computing and Applications, vol. 34, no. 14, pp. 11283–11294, 2020.
[93]. X. Su, W. Shi, X. Qu, Y. Zheng, and X. Liu, “DroidDeep: using Deep Belief Network to characterize and detect android malware,” Soft Computing, vol. 24, no. 8, pp. 6017–6030, 2020.
[94]. Z. Ma, H. Ge, Z. Wang, Y. Liu, and X. Liu, “Droidetec: android malware detection and malicious code localization through deep learning,” 2020.
[95]. Z. Ren, H. Wu, Q. Ning, I. Hussain, and B. Chen, “End-to-end malware detection for android IoT devices using deep learning,” Ad Hoc Networks, vol. 101, Article ID 102098, 2020.
[96]. W. Niu, R. Cao, X. Zhang, K. Ding, K. Zhang, and T. Li, “OpCode-level function call graph based android malware classification using deep learning,” Sensors, vol. 20, no. 13, p. 3645, 2020.
[97]. R. Feng, S. Chen, X. Xie, G. Meng, S. W. Lin, and Y. Liu, “A performance-sensitive malware detection system using deep learning on mobile devices,” IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1563–1578, 2021.
[98]. H. Zhu, Y. Li, R. Li, J. Li, Z. H. You, and H. Song, “SEDMDroid: an enhanced stacking ensemble framework for android malware detection,” IEEE Transactions on Network Science and Engineering, vol. 8, no. 2, pp. 984–994, 2021.
[99]. J. Feng, L. Shen, Z. Chen, Y. Wang, and H. Li, “A two-layer deep learning method for android malware detection using network traffic,” IEEE Access, vol. 8, pp. 125786–125796, 2020.
[100]. Azmoodeh, A. Dehghantanha, and K. K. R. Choo, “Robust malware detection for internet of


(battlefield) things devices using deep eigenspace learning,” IEEE Transactions on Sustainable Computing, vol. 4, no. 1, pp. 88–95, 2019.
[101]. F. Xiao, Z. Lin, Y. Sun, and Y. Ma, “Malware detection based on deep learning of behavior graphs,” Mathematical Problems in Engineering, vol. 2019, Article ID 8195395, pp. 1–10, 2019.
[102]. F. Ullah, H. Naeem, S. Jabbar et al., “Cyber security threats detection in internet of things using deep learning approach,” IEEE Access, vol. 7, pp. 124379–124389, 2019.
[103]. H. Haddadpajouh, A. Dehghantanha, R. Khayami, and K. K. R. Choo, “A deep recurrent neural network based approach for internet of things malware threat hunting,” Future Generation Computer Systems, vol. 85, pp. 88–96, 2018.
[104]. M. N. Al-Hawawreh, N. Moustafa, and E. Sitnikova, “Identification of malicious activities in industrial internet of things based on deep learning models,” Journal of Information Security and Applications, vol. 41, pp. 1–11, 2018.
[105]. A. Abusnaina, M. Abuhamad, H. Alasmary et al., “A deep learning-based fine-grained hierarchical learning approach for robust malware classification,” 2020.
[106]. H. Naeem, F. Ullah, M. R. Naeem et al., “Malware detection in industrial internet of things based on hybrid image visualization and deep learning model,” Ad Hoc Networks, vol. 105, Article ID 102154, 2020.
[107]. S. Lu, L. Ying, W. Lin et al., “New era of deep-learning-based malware intrusion detection: the malware detection and prediction based on deep learning,” 2019, https://arxiv.org/abs/1907.08356.
[108]. B. Yu, J. Pan, D. Gray et al., “Weakly supervised deep learning for the detection of domain generation algorithms,” IEEE Access, vol. 7, pp. 51542–51556, 2019.
[109]. R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, “Deep learning approach for intelligent intrusion detection system,” IEEE Access, vol. 7, pp. 41525–41550, 2019.
[110]. H. M. Song, J. Woo, and H. K. Kim, “In-vehicle network intrusion detection using deep convolutional neural network,” Vehicular Communications, vol. 21, Article ID 100198, 2020.
[111]. R. Priyadarshini and R. K. Barik, “A deep Learning Based Intelligent Framework to Mitigate DDoS Attack in Fog Environment,” Journal of King Saud University-Computer and Information Sciences, vol. 34, no. 3, pp. 825–831, 2019.
[112]. A. Pektaş and T. Acarman, “Deep learning to detect botnet via network flow summaries,” Neural Computing & Applications, vol. 31, no. 11, pp. 8021–8033, 2019.
[113]. Y. Pan, F. Sun, Z. Teng et al., “Detecting web attacks with end-to-end deep learning,” Journal of Internet Services and Applications, vol. 10, no. 1, pp. 16–22, 2019.
[114]. G. Loukas, T. Vuong, R. Heartfield, G. Sakellari, Y. Yoon, and D. Gan, “Cloud-based cyber-physical intrusion detection for vehicles using deep learning,” IEEE Access, vol. 6, pp. 3491–3508, 2018.
[115]. Y. S. Jeong, J. Woo, and A. R. Kang, “Malware detection on byte streams of pdf files using convolutional neural networks,” Security and Communication Networks, vol. 2019, pp. 1–9, 2019.
[116]. Homayoun, A. Dehghantanha, M. Ahmadzadeh et al., “DRTHIS: deep ransomware threat hunting and intelligence system at the fog layer,” Future Generation Computer Systems, vol. 90, pp. 94–104, 2019.
[117]. A. McDole, M. Abdelsalam, M. Gupta, and S. Mittal, “Analyzing cnn based behavioural malware detection techniques on cloud iaas,” in Proceedings of the International Conference on Cloud Computing, pp. 64–79, Honolulu, HI, USA, September 2020.
[118]. A. Pastor, A. Mozo, S. Vakaruk et al., “Detection of encrypted cryptomining malware connections with machine and deep learning,” IEEE Access, vol. 8, pp. 158036–158055, 2020.
[119]. A. N. Jahromi, S. Hashemi, A. Dehghantanha, R. M. Parizi, and K. K. R. Choo, “An enhanced stacked LSTM method with no random initialization for malware threat hunting in safety and time-critical systems,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 4, no. 5, pp. 630–640, 2020.
[120]. J. Hemalatha, S. A. Roseline, S. Geetha, S. Kadry, and R. Damaševičius, “An efficient densenet-based deep learning model for malware detection,” Entropy, vol. 23, no. 3, p. 344, 2021.


[121]. S. Baek, J. Jeon, B. Jeong, and Y. S. Jeong, “Two-stage hybrid malware detection using deep learning,” Humancentric Computing and Information Sciences, vol. 11, no. 27, pp. 10–22967, 2021.
[122]. G. Iadarola, F. Martinelli, F. Mercaldo, and A. Santone, “Towards an interpretable deep learning model for mobile malware detection and family identification,” Computers & Security, vol. 105, Article ID 102198, 2021.
[123]. N. Zhang, Y. A. Tan, C. Yang, and Y. Li, “Deep learning feature exploration for android malware detection,” Applied Soft Computing, vol. 102, Article ID 107069, 2021.
[124]. V. Sihag, M. Vardhan, P. Singh, G. Choudhary, and S. Son, “De-LADY: deep learning based Android malware detection using Dynamic features,” J. Internet Server. Information. Security, vol. 11, no. 2, pp. 34–45, 2021.
[125]. I. Obaidat, M. Sridhar, K. M. Pham, and P. H. Phung, “Jadeite: a novel image-behavior-based approach for java malware detection using deep learning,” Computers & Security, vol. 113, Article ID 102547, 2022.
[126]. J. Kim, Y. Ban, E. Ko, H. Cho, and J. H. Yi, “MAPAS: a practical deep learning-based android malware detection system,” International Journal of Information Security, vol. 21, no. 4, pp. 725–738, 2022.
[127]. R. Chaganti, V. Ravi, and T. D. Pham, “Deep learning based cross architecture internet of things malware detection and classification,” Computers & Security, vol. 120, Article ID 102779, 2022.
[128]. X. Xing, X. Jin, H. Elahi, H. Jiang, and G. Wang, “A malware detection approach using autoencoder in deep learning,” IEEE Access, vol. 10, pp. 25696–25706, 2022.
[129]. M. Kumar, “Scalable malware detection system using distributed deep learning,” Cybernetics & Systems, pp. 1–29, 2022.
[132]. H. Cai and J. Jenkins, “Towards sustainable android malware detection,” in Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings, pp. 350–351, Gothenburg, Sweden, May 2018.
[133]. E. Mariconti, L. Onwuzurike, P. Andriotis, E. De Cristofaro, G. Ross, and G. Stringhini, “Mamadroid: detecting android malware by building Markov chains of behavioral models,” 2016.
[134]. L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. Ross, and G. Stringhini, “Mamadroid: detecting android malware by building Markov chains of behavioral models (extended version),” ACM Transactions on Privacy and Security (TOPS), vol. 22, no. 2, pp. 1–34, 2019.
[135]. W. Li, X. Fu, and H. Cai, “Androct: ten years of app call traces in android,” in Proceedings of the 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pp. 570–574, IEEE, Madrid, Spain, May 2021.
[136]. J. Garcia, M. Hammad, and S. Malek, “Lightweight, obfuscation-resilient detection and family identification of android malware,” ACM Transactions on Software Engineering and Methodology, vol. 26, no. 3, pp. 1–29, 2018.
[137]. H. Cai, N. Meng, B. Ryder, and D. Yao, “Droidcat: effective android malware detection and categorization via app-level profiling,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 6, pp. 1455–1470, 2019. H. Gao, S. Cheng, and W. Zhang, “GDroid: android malware detection and classification with graph convolutional network,” Computers & Security, vol. 106, Article ID 102264, 2021.
[138]. H. Cai, “Embracing mobile app evolution via continuous ecosystem mining and characterization,” in Proceedings of the IEEE/ACM 7th International Conference on Mobile Software Engineering and Systems, pp. 31–35, July 2020.
[130]. A. A. Hamza, I. T. Abdel Halim, M. A.
Sobh, an M. Bahaa-Eldin, “HSAS-MD analyzer: [139]. G. Suarez-Tangil and G. Stringhini, “Eight
a hybrid security analysis system using model- years of rider measurement in the android
checking technique and deep learning for malware ecosystem: evolution and lessons
malware detection in IoT apps,” Sensors, vol. 22, learned,” 2018,
no. 3, p. 1079, 2022. https://arxiv.org/abs/1801.08115.
[131]. S. S Lad and A. C. Adamuthe, “Improved [140]. R. Ali, S. Lee, and T. C. Chung, “Accurate
deep learning model for static PE files malware multi-criteria decision making methodology for
detection and classification,” International recommending machine learning algorithm,”
Journal of Computer Network and Information Expert Systems with Applications, vol. 71, pp.
Security, vol. 14, no. 2, pp. 14–26, 2022. 257–278, 2017.

[141]. X. Fu and H. Cai, “On the deterioration of learning-based malware detectors for Android,” in Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 272–273, IEEE, Montreal, Canada, May 2019.
[142]. K. Xu, Y. Li, R. Deng, K. Chen, and J. Xu, “DroidEvolver: self-evolving Android malware detection system,” in Proceedings of the 2019 IEEE European Symposium on Security and Privacy (EuroSP), pp. 47–62, IEEE, Stockholm, Sweden, June 2019.
[143]. H. Cai, “Assessing and improving malware detection sustainability through app evolution studies,” ACM Transactions on Software Engineering and Methodology, vol. 29, no. 2, pp. 1–28, 2020.
[144]. M. Egele, T. Scholte, E. Kirda, and C. Kruegel, “A survey on automated dynamic malware-analysis techniques and tools,” ACM Computing Surveys, vol. 44, no. 2, pp. 1–42, 2012.
[145]. M. Akour, I. Alsmadi, and M. Alazab, “The malware detection challenge of accuracy,” in Proceedings of the 2016 2nd International Conference on Open Source Software Computing (OSSCOM), pp. 1–6, IEEE, Beirut, Lebanon, December 2016.
[146]. S. D. Nikolopoulos and I. Polenakis, “A graph-based model for malware detection and classification using system-call groups,” Journal of Computer Virology and Hacking Techniques, vol. 13, no. 1, pp. 29–46, 2017.
[147]. V. Sessions and M. Valtorta, “The effects of data quality on machine learning algorithms,” ICIQ, vol. 6, pp. 485–498, 2006.
[148]. F. O. Catak and A. F. Yazı, “A benchmark API call dataset for windows PE malware classification,” 2019.
[149]. H. Cai, X. Fu, and A. Hamou-Lhadj, “A study of run-time behavioral evolution of benign versus malicious apps in android,” Information and Software Technology, vol. 122, Article ID 106291, 2020.
[150]. K. Allix, T. F. Bissyandé, J. Klein, and Y. Le Traon, “Androzoo: collecting millions of android apps for the research community,” in Proceedings of the 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR), pp. 468–471, IEEE, Austin, TX, USA, May 2016.
[151]. W. Li, X. Fu, and H. Cai, “AndroCT: ten years of app call traces in android,” in Proceedings of the 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pp. 570–574, IEEE, Madrid, Spain, May 2021.
[152]. H. Wang, J. Si, H. Li, and Y. Guo, “Rmvdroid: towards a reliable android malware dataset with app metadata,” in Proceedings of the 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), pp. 404–408, IEEE, Montreal, QC, Canada, May 2019.
[153]. H. Cai and B. G. Ryder, “A longitudinal study of application structure and behaviors in android,” IEEE Transactions on Software Engineering, vol. 47, no. 12, pp. 2934–2955, 2021.
[154]. P. Liu, L. Li, Y. Zhao, X. Sun, and J. Grundy, “Androzooopen: collecting large-scale open source android apps for the research community,” in Proceedings of the 17th International Conference on Mining Software Repositories, pp. 548–552, Republic of Korea, June 2020.
[155]. F. Wei, Y. Li, S. Roy, X. Ou, and W. Zhou, “Deep ground truth analysis of current android malware,” in Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, pp. 252–276.
[156]. Y. Zhou and X. Jiang, “Dissecting android malware: characterization and evolution,” in Proceedings of the 2012 IEEE Symposium on Security and Privacy, pp. 95–109, IEEE, San Francisco, CA, USA, May 2012.
[157]. H. S. Anderson and P. Roth, “Ember: an open dataset for training static pe malware machine learning models,” 2018, https://arxiv.org/abs/1804.04637.
[158]. R. Harang and E. M. Rudd, “SOREL-20M: a large scale benchmark dataset for malicious PE detection,” 2020, https://arxiv.org/abs/2012.076
[159]. G. Kavallieratos, N. Chowdhury, S. K. Katsikas, V. Gkioulos and S. D. Wolthusen. "Threat Analysis for Smart Homes". Sep. 2019.
[160]. M. Bouzidi, N. Gupta, F. A. Cheikh, A. Shalaginov and M. Derawi. "A Novel Architectural Framework on IoT Ecosystem, Security Aspects and Mechanisms: A Comprehensive Survey". Jan. 2022.
[161]. K. Bobrovnikova, S. Lysenko, B. Savenko, P. Gaj and O. Savenko. "Technique for IoT malware detection based on control flow graph analysis". Feb. 2022.
[162]. T. N. Phu, K. H. Dang, D. N. Quoc, N. T. Dai and N. N. Binh. "A Novel Framework to Classify Malware in MIPS Architecture-Based IoT Devices". Dec. 2019.
[163]. F. Xiao, Z. Lin, Y. Sun and Y. Ma. "Malware Detection Based on Deep Learning of Behavior Graphs". Feb. 2019.
[164]. B. Yadav and S. Tokekar. "Malware Multi-Class Classification based on Malware Visualization using a Convolutional Neural Network Model". Apr. 2023.
[165]. T. Wan et al. "Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files". Jan. 2020.
[166]. W. Yaokumah, J. K. Appati and D. A. Kumah. "Machine Learning Methods for Detecting Internet-of-Things (IoT) Malware". Oct. 2021.
[167]. S. Hwang and J. Kim. "A Malware Distribution Simulator for the Verification of Network Threat Prevention Tools". Oct. 2021.
[168]. R. N. Alhamad and F. Alserhani. "Prediction Models to Effectively Detect Malware Patterns in the IoT Systems". Jan. 2022.
[169]. Olowoyeye O. Evaluating Open Source Malware Sandboxes with Linux malware (Doctoral dissertation, Auckland University of Technology).
[170]. Luckett P, McDonald JT, Glisson WB, Benton R, Dawson J, Doyle BA. Identifying stealth malware using CPU power consumption and learning algorithms. Journal of Computer Security. 2018 Jan 1;26(5):589-613.
[171]. Dervisis I. Linux Malware Analysis (Doctoral dissertation, University of Piraeus (Greece)).
[172]. Debnath S, Biswas S. Malware Identification, Analysis and Similarity. Cyber Security and Network Security. 2022 Mar 31:47-69.
[173]. Aslan ÖA, Samet R. A comprehensive review on malware detection approaches. IEEE Access. 2020 Jan 3;8:6249-71.
[174]. Asmitha KA, Vinod P. Linux malware detection using non-parametric statistical methods. In 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI) 2014 Sep 24 (pp. 356-361). IEEE.
[175]. Landman T, Nissim N. Deep-Hook: A trusted deep learning-based framework for unknown malware detection and classification in Linux cloud environments. Neural Networks. 2021 Dec 1;144:648-85.
[176]. Cozzi E, Graziano M, Fratantonio Y, Balzarotti D. Understanding linux malware. In 2018 IEEE Symposium on Security and Privacy (SP) 2018 May 20 (pp. 161-175). IEEE.
[177]. Vurdelja I, Blažić I, Drašković D, Nikolić B. Detection of linux malware using system tracers–An overview of solutions. IcEtran. 2020 Sep.
[178]. Maniriho P, Mahmood AN, Chowdhury MJ. A Survey of Recent Advances in Deep Learning Models for Detecting Malware in Desktop and Mobile Platforms. arXiv preprint arXiv:2209.03622. 2022 Sep 8.

Table 3: The models used in this study

| Reference | Method | Model | Library | Platform | Dataset | Performance | DL/ML |
|---|---|---|---|---|---|---|---|
| [120] | Visualize malware as images, classify with reweighted loss | Dense CNN | Keras | Windows | Malimg, BIG 2015, MaleVis | 98.46% accuracy | DL |
| [121] | Static opcode extraction + dynamic analysis | Bi-LSTM, CNN | Not stated | IoT | KISA 2019 | Up to 95% accuracy | DL |
| [122] | Represent app as image, extract dex file bytes as pixels | CNN | TensorFlow, CUDA | Android | Argus Lab | 97% accuracy | DL |
| [123] | Text classification on app analysis sequences | CNN | Keras | Android | Various datasets | 96.6% accuracy | DL |
| [124] | Dynamic analysis logs to feature vectors | CNN with Leaky ReLU | Not stated | Android | Self-generated | 98% accuracy | DL |
| [125] | Static analysis of Java bytecode control flow | CNN | Not stated | Java platforms | Self-generated | 98.4% accuracy | DL |
| [126] | API call graph patterns analysis | CNN | Not stated | Android | Playstore, VirusShare | 93.2% accuracy | DL |
| [127] | Static/dynamic analysis on ELF binaries | Bi-GRU-CNN | Keras, TensorFlow, scikit-learn | IoT | Various sources | 98% detect accuracy, 100% classify accuracy | DL |
| [128] | Bytecode to images, autoencoder reconstruction error | Autoencoder + CNN | TensorFlow | Android | Playstore, VirusShare | 96.2% accuracy | DL |
| [129] | Distributed model, static + dynamic analysis | CNN-BiLSTM | Not stated | Windows | Various sources | 97% accuracy | DL |
| [130] | Source code conversion, model checking | CNN | PyTorch | IoT | Not stated | 95% accuracy | DL |
| [131] | Static PE file feature extraction | Not stated | Keras | Windows | EMBER | 97.5% accuracy | ML |
| [132] | Visualize malware as RGB images, hybrid analysis | CNN (VGG16) | Not stated | Windows | VirusSign dataset | 94.7% accuracy | ML |
| [150] | Byteplot image visualization + CNN | CNN (Inception) | TensorFlow | Windows | EMBER | 98.6% accuracy | ML |
| [151] | API call graph analysis + random forest | Random forest | scikit-learn | Android | Contagio/VirusShare | 97.2% F1 score | ML |
| [152] | Control flow graph + SVM | SVM (RBF kernel) | LibSVM | Linux | Self-generated | 95.3% accuracy | ML |
| [153] | Function call monitoring + isolation forest | Isolation forest | scikit-learn | IoT | IoT-23 | 99.1% detection rate | ML |
| [154] | Network traffic + naïve Bayes | Naïve Bayes | scikit-learn | IoT | ISOT | 91.7% accuracy | ML |
| [155] | Bytecode image + ensemble of multiple models | Ensemble learner | scikit-learn | Android | Drebin | 93.8% accuracy | ML |
| [156] | Binary execution traces + gradient boosting | Gradient boosting | XGBoost | Windows | BIG 2015 | 97.2% accuracy | ML |
| [157] | Control flow graph + CART decision tree | CART | scikit-learn | Linux | self Mining | 90.1% accuracy | ML |
| [158] | API call monitoring + logistic regression | Logistic regression | scikit-learn | Android | AMD | 92.4% accuracy | ML |
| [159] | Byteplot visualization + random forest | Random forest | scikit-learn | Windows | VirusShare | 96.7% accuracy | ML |
