ML Approach for Cyberattack Detection & Prevention in IoT
International Journal of Computer Applications (0975 – 8887)
Volume 186 – No.77, March 2025
Pattern-matching for keystroke sequences suggesting an attack [26]. This method's drawbacks include the various ways to describe the same attack at the keystroke level and the general unavailability of user-typed keystrokes.
iv) Expert Systems:
According to [25], an expert system is a computer program that can represent and reason about a knowledge-rich domain in order to provide guidance and solve problems. Expert system detectors encode attack knowledge as if-then implication rules. The if part of a rule specifies the prerequisites for an attack. When every condition on the left side of a rule is met, the actions on the right side of the rule are carried out, which may conclude that an intrusion exists or trigger the firing of further rules [27]. The primary benefit of if-then implication rules is that control reasoning is kept separate from the formulation of the problem solution.
Fig 2 Misuse detection system with pattern matching [9]
3. DATA MINING APPROACHES FOR INTRUSION DETECTION
As the volume of digital documents continues to grow across multiple languages worldwide, data mining has gained significant traction in the field of knowledge discovery [10]. According to [11], data mining is an automated process used to extract meaningful and valuable insights from vast data repositories, making it an essential tool for handling large-scale information. The rapid advancements in data mining have led to the development of numerous algorithms derived from statistics, pattern recognition, machine learning, and database management [12]. These innovations have expanded the capabilities of data analysis, enabling more efficient and accurate knowledge extraction. In the context of this study, the following data mining techniques are particularly relevant:
3.1 Feature selection
Feature Selection (FS) is a crucial step in enhancing intrusion detection for IoT networks. It focuses on identifying the most relevant features while eliminating redundant or unnecessary ones to improve classification accuracy [32], [33]. This process becomes even more important when dealing with high-dimensional data, where using every available feature can be inefficient and may reduce model effectiveness due to limited data samples [34], [35]. The quality of FS directly impacts the performance of machine learning-based detection systems, ensuring that only the most meaningful attributes are used for accurately identifying cyber threats [36], [37]. In cybersecurity research, different FS techniques, including filter, wrapper, and embedded methods, are employed to optimize feature subsets and improve detection accuracy [38], [39]. Figures 3, 4, and 5 illustrate how key features were extracted from 14 attack files in the Edge-IIoT dataset using Chi-Square, Mutual Information, and Random Forest selection techniques. These statistical approaches help identify the most significant features and enhance cyberattack detection models by improving classification accuracy while decreasing computational complexity [40], [41], [42]. By examining each selected attribute, the system extracts meaningful insights, ranks features based on their importance, and applies them strategically to boost detection accuracy and overall model performance.
3.2 Machine learning
Machine learning focuses on developing algorithms that improve automatically through experience. These algorithms are widely used in information filtering systems to identify user preferences and in data mining applications to uncover patterns within large datasets [34], [35]. The two primary machine learning techniques are clustering and classification, which are particularly effective in detecting hidden patterns without requiring prior knowledge of their structure [36]. Unlike traditional methods of cyberattack detection, machine learning approaches can dynamically adapt to complex data distributions, making them well-suited for cyber threat detection and anomaly identification in IoT networks [37], [38].
3.2.1 Clustering
Clustering is a fundamental technique in data mining that combines data points into distinct groups based on their similarities and shared attributes [36], [37]. As an unsupervised learning method, it uncovers hidden structures in datasets without requiring predefined labels [38]. Various clustering approaches exist, each using different strategies to enhance data partitioning and pattern recognition [39], [40].
i) Hierarchical clustering
Hierarchical clustering arranges datasets through a stepwise, iterative process rather than grouping all data points simultaneously [37], [38]. This method progressively merges or splits clusters based on similarity measures, resulting in a structured hierarchy. Hierarchical clustering is further divided into distinct approaches, including:
a) Divisive clustering
In divisive clustering, the dataset starts as a single cluster and is recursively split until each data point forms its own cluster, following a top-down hierarchical structure.
b) Agglomerative clustering
Agglomerative clustering initially treats each data point as an individual cluster and iteratively merges the closest clusters based on predefined criteria, following a bottom-up approach from leaf to root.
ii) Partitional clustering
Partitional clustering segments data points into k distinct groups based on specific significance criteria, ensuring optimal separation and similarity within each cluster [39].
iii) K-means clustering:
This algorithm groups data into clusters by reducing the distance between each point and the respective cluster centroid. It has three main variations: k-means for numerical data, k-medoids for categorical datasets, and k-prototypes for mixed data types [40].
a) K-means: Applied to numerical datasets.
b) K-medoids: Applied to categorical datasets.
c) K-prototypes: Applied to both numerical and categorical datasets.
iv) Fuzzy C-means clustering:
This clustering method not only evaluates the distance between data points and cluster centers but also incorporates membership values, allowing data points to belong to multiple clusters with varying degrees of association [40].
v) QT clustering
Quality Threshold (QT) clustering groups data points based on a predefined quality criterion. It ensures high-quality clusters by identifying large groups whose diameters do not exceed a user-specified threshold, maintaining consistency and reliability in cluster formation [41].
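To make the centroid-based grouping behind k-means concrete, the following is a minimal plain-Python sketch, not the implementation used in this study. The function name kmeans, the traffic values, and the choice to seed the centroids with the first k points are illustrative assumptions; the features are also left unscaled for brevity.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on numeric tuples.

    Assumption for illustration: the first k points seed the centroids,
    which keeps the example deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignment, centroids


# Hypothetical (packets per second, mean packet size in bytes) per device.
traffic = [(10, 200), (12, 210), (11, 190), (900, 60), (950, 64), (920, 58)]
labels, centers = kmeans(traffic, k=2)
print(labels)   # cluster index assigned to each device
print(centers)  # final centroid of each cluster
```

With this toy data the three low-rate devices end up in one cluster and the three high-rate devices in the other; in an IDS setting, a small or distant cluster would be a candidate for closer inspection.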
3.2.2 Classification
Classification assigns a data item to one of a few pre-established categories. Typically, these algorithms produce "classifiers" in the form of rules or decision trees [13]. In intrusion detection, this technique can be applied by collecting enough "normal" and "abnormal" audit data for a user or program. A classification algorithm can then be used to learn a classifier that identifies whether audit data belongs to the normal or the abnormal class [14].
In a classification-based IDS, all traffic is classified as either malicious or normal [15]. However, reducing false positives (the classification of benign traffic as malicious) and false negatives (the classification of malicious traffic as normal) is the difficult part of classification-based IDS [16]. In intrusion detection systems, classification methods include fuzzy logic, neural networks, genetic algorithms, and inductive rule generation.
3.3 Statistical Techniques
This method compares events statistically according to a predefined list of parameters [44]. Statistical methods are referred to as "top-down" learning and are used once the relationships between the data are established, using mathematics to help with the search.
The three fundamental categories of statistical methods are decision trees, nonlinear, and linear [18]. Statistical models test the obtained system and network data for attack analysis. The Operational, Mean and Standard Deviation, Multivariate, Markovian, and Time Series models are the most used models. Different periods, such as the day of the week, the month, or the year, or a per-host or per-service basis, can be used to compute statistical trends.
Denning (1987) [45] discussed some of the issues and solutions associated with statistical measurements to identify abnormalities. The operational model, mean and standard deviation model, multivariate model, Markov process model, and time series model are the five statistical measurements she described. The IDS's rules use these measures to identify intrusions.
Operational model: An intrusion is indicated when an observed measure surpasses a predetermined threshold. The security policy typically establishes the threshold. For instance, the security policy may stipulate that a login should be reported if three or more password attempts are unsuccessful [46].
Mean and standard deviation model: This model indicates an intrusion when an observation deviates from the mean by more than a threshold number of standard deviations (mean ± threshold × stdev) [47]. Here the threshold differs from the one used in the operational model and is typically set to four since, under a normal distribution, nearly 100% of the data should fall inside that range.
Multivariate model: Correlations between activities are employed, for instance between the CPU time and the I/O that a program uses. Observing CPU usage alone may be insufficient to identify an intrusion [48].
Markov chain model: Activities are viewed as events, and the likelihood that an event will occur is determined by its past [49]. For instance, if a programmer often uses a fixed set of commands to modify, compile, link, and run an application, the IDS can determine which commands to expect. An intrusion is suspected if an unusual command occurs, and the IDS will raise an alarm.
Hidden Markov model: An HMM is a simple kind of dynamic Bayesian network and a statistical tool for modelling sequential observations (visible) that depend probabilistically on a hidden sequence of occurrences (hidden states) [50]. The study by [51] describes a state-of-the-art technique that uses Hidden Markov Models (HMMs) to detect advanced online threats. According to the findings, these attacks involve several steps and may occur over an extended period. Certain acts may be interchangeable within each step. To conceal the intrusion, an intruder may purposefully choose a series of acts within a step.
Other cases can entail inconsistent action sequences (due to background noise or the offender's inexperience) [52]. An intrusion detection system must be able to manage some of these uncertainties. HMMs are ideally suited to tackle the multi-step attack problem [53]. The authors of [54] and [55] directly compare HMMs to two other traditional methods, decision trees and neural networks, and demonstrate that HMMs detect these intricate intrusions significantly better than neural networks and generally better than decision trees.
According to [45], the Hidden Markov Model "assumes that the state variables are hidden and correspond to phenomena that are, perhaps, fundamentally unobservable," and as such should perform well in modelling user actions. The author concluded that the HMM and the instance-based learner, trained using the same data, performed comparably.
3.4 Profiles
There are three categories of profiles: activity, template, and anomaly. When an audit record is created, the IDS retrieves the associated activity profile [17]. Depending on the model and value, it may generate an anomaly profile and trigger an alarm. If the activity profile doesn't exist, it is created from a profile template.
Creating profiles is the most challenging aspect of IDSs, though templates are retained [18]. A template comprises the profile fields in a data structure. When a new user is created in the system, the IDS has no activity profiles for that user and instead generates the necessary ones from the appropriate template profiles upon the user's initial login [19]. Except for the subject, every field in the template is duplicated in the new activity profile. Profiles can be used by each individual subject. The fields commonly used in the profile data structure are: Name, Subject, Object, Action-pattern, Resource-usage-pattern, Exception-pattern, Time, Variable type, Threshold, and Value. The profile's three main components are Name, Subject, and Object.
3.5 Proposed Structural Framework Design
The Internet has significantly transformed modern life, offering vast opportunities alongside increasing cybersecurity threats [58]. Cyber intruders are generally classified into two categories: outsiders and insiders. Outsiders operate externally, targeting systems through email-based spam attacks or attempting to bypass firewalls to compromise internal networks [59]. In contrast, insiders are legitimate users who exploit their access privileges, impersonate higher-level users, or misuse confidential data to facilitate unlawful access from external sources. Addressing these threats requires robust security mechanisms for cyberattack detection.
To enhance network security, this study introduces a structural design framework for intrusion detection and prevention in IoT networks. While detecting known attacks is essential, the identification of unknown threats is equally critical. Anomaly detection techniques play a vital role in uncovering these unknown attacks. Since each intrusion detection method offers distinct advantages in identifying cyber threats, the proposed framework leverages a statistical approach and machine learning, as depicted in Fig. 6, to improve cyberattack detection and response effectiveness.
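As a small illustration of the statistical side of such a framework, the operational model and the mean and standard deviation model from Section 3.3 can be sketched with the Python standard library. This is a hedged sketch rather than the study's implementation: the function names, the CPU-time history, and the default multiplier d = 4 are assumptions made for illustration, while the limit of three failed attempts follows the example given in the text.

```python
from statistics import mean, stdev


def operational_alert(failed_logins, limit=3):
    """Operational model: flag when a counter reaches a fixed policy limit."""
    return failed_logins >= limit


def mean_std_alert(history, value, d=4.0):
    """Mean and standard deviation model.

    Flags a new observation that lies outside mean ± d * stdev of the
    previously observed values. d = 4 is an assumed default.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    return abs(value - mu) > d * sigma


# Hypothetical CPU seconds used by one program in past audit records.
cpu_history = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0]

print(operational_alert(3))               # three failed attempts reach the limit
print(mean_std_alert(cpu_history, 2.3))   # within the expected range
print(mean_std_alert(cpu_history, 9.0))   # far outside the expected range
```

Both checks rely on a single measure; the multivariate model described in Section 3.3 would instead look at several measures, such as CPU time and I/O, together.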
Fig 3: Proposed structural framework design for cyberattack detection in IoT network
The suggested approach considers captured IoT network traffic and IoT system audit logs. Supervised learning algorithms can be used to determine whether an activity is harmful or normal. Network packet datasets are classified to detect attacks. This paper suggests using supervised learning to develop network traffic rules, which help differentiate standard connections from abnormal ones. The study follows a two-step approach: first, a statistical method is used to identify the most relevant features, and then supervised learning is applied to identify attack patterns effectively. The optimal features are used to form rules for detecting various cyberattacks using Random Forests. This permits the introduction of higher levels of generality and thus higher detection rates.
An initial sample of randomly selected participants is used to begin the process. After that, the population changes over several generations as participants' attributes steadily improve, as seen by an increase in fitness value. As the last phase, the network is trained using the supervised approach to identify unknown attacks. A good fitness function should reward the appropriate kind of participants. To enhance the grouping outcomes, the study considers all pertinent criteria. Our fitness function is computed as:

Fitness = Error rate + Entropy measure + Rule consistency

If a rule applies to a specific case, its classification is given by its consequent part. If they don't match, there's no classification. An
Table 1 The proposed model performance
Metric                         Value
Accuracy                       0.9705
Macro Average Precision        0.92
Macro Average Recall           0.87
Macro Average F1-Score         0.89
Weighted Average Precision     0.9681
Weighted Average Recall        0.9705
Weighted Average F1-Score      0.9691

Table 1 demonstrates the model's strong overall performance, with an accuracy of 0.9705, indicating that approximately 97.05% of predictions are correct. This highlights the model's high effectiveness in accurately classifying instances.
Macro Averages: The model demonstrates strong performance across all classes, with an average precision of 0.92, meaning 92% of predicted positive instances were correctly classified. The recall score of 0.87 indicates that, on average, 87% of actual positive instances were accurately identified. Additionally, the F1-score of 0.89, which balances precision and recall, reflects the model's effectiveness when treating each class equally.
Weighted Averages: With precision (0.9681), recall (0.9705), and F1-score (0.9691), these metrics account for the support (number of instances) in each class, providing insight into the model's performance on larger classes. The high weighted scores, closely aligning with overall accuracy, indicate that the model excels in correctly classifying most instances.

5. DISCUSSION OF THE RESULTS
The evaluation of the cyberattack detection model, trained using the Random Forest algorithm with improved statistical feature selection, demonstrates robust overall performance while exhibiting varying effectiveness across different attack categories.
For backdoor attacks, the model demonstrates exceptional performance, achieving perfect precision (1.00) and high recall (0.97), leading to an F1-score of 0.98 across 4,952 instances. This indicates that nearly all backdoor attacks are correctly identified, with minimal false negatives and no false positives.
For DDoS attacks, the model performs remarkably well in three out of four attack types. ICMP, TCP SYN, and UDP Flood attacks achieve perfect detection (precision, recall, and F1-score all at 1.00) across a large number of instances. However, HTTP Flood attacks show slightly lower performance, with a precision of 0.93, recall of 0.87, and an F1-score of 0.90 over 46,159 samples, indicating a minor misclassification rate for this specific attack variant.
For MITM attacks, despite the limited sample size (241 instances), the model achieves flawless classification with perfect precision, recall, and F1-score.
Similarly, OS Fingerprinting attacks are detected with high precision (0.98) and strong recall (0.91), leading to an F1-score of 0.94. While performance is strong, there remains minor room for enhancement.
The model also demonstrates robust detection of Password attacks (precision 0.95, recall 0.99, F1-score 0.97) and Port Scanning attacks (precision 0.95, recall 1.00, F1-score 0.98), confirming its reliability in identifying these threats.
For Ransomware attacks, the model achieves perfect precision (1.00) but has a lower recall (0.88), leading to an F1-score of 0.94. This suggests that while false positives are nearly nonexistent, some actual ransomware instances are misclassified.
The most challenging attack types appear to be SQL Injection and XSS attacks. SQL Injection attacks demonstrate high precision (0.96) but suffer from low recall (0.71), resulting in an F1-score of 0.81. Similarly, XSS attacks have a precision of 0.96 but a recall of only 0.75, leading to an F1-score of 0.84. These lower recall values indicate that a significant number of these attacks go undetected, posing a potential security risk.
Uploading attacks demonstrate strong performance, achieving a precision of 0.97, recall of 0.85, and an F1-score of 0.91. Meanwhile, Vulnerability Scanner attacks are detected with near-perfect accuracy, with a precision of 1.00, recall of 0.97, and an F1-score of 0.98, indicating highly reliable detection.
Our proposed model integrates statistical feature selection techniques with the Random Forest algorithm to enhance cyberattack detection in IoT networks. Optimal features are selected using Random Forest, Information Gain, and Chi-Square methods, considering CPU time and program input/output. Moving forward, our research will focus on implementing ensemble learning and evaluating its effectiveness in detecting multiple cyberattacks while reducing false alarms, considering the resource limitations of IoT networks.

6. CONCLUSION
This study introduces a structured framework design to improve cyberattack detection in IoT networks through an integrated feature selection and machine learning approach. The proposed model leverages statistical techniques such as Random Forest, Information Gain, and Chi-Square to identify the most relevant features, thereby enhancing classification accuracy. Experimental results demonstrate strong overall performance with high precision, recall, and F1 scores across multiple attack categories. While the model exhibits near-perfect detection for several attack types, challenges persist in identifying specific threats, such as SQL Injection and XSS attacks, where recall values remain comparatively lower. These findings stress the crucial role of feature engineering in improving detection accuracy while also underscoring the need for further refinement to enhance the detection of harder-to-classify attacks.

7. RECOMMENDATION
The study findings indicate potential for further performance improvement. Future research can explore the following directions to refine and enhance the model.
• Enhance Low-Recall Detection: Optimize recall for SQL Injection and XSS by refining feature selection, adjusting thresholds, or adding relevant features.
• Expand Dataset: Increase data volume and diversity to improve generalization across attack types.
• Explore Advanced Models: Investigate deep learning techniques like CNNs and RNNs for better pattern recognition.
• Real-World Deployment: Test the model in IoT environments to assess adaptability and effectiveness.
• Hybrid Detection: Combine anomaly- and signature-based methods for robust threat identification.

8. DECLARATION
This is my work, and it hasn't been submitted to another publication.

9. REFERENCES
[1] A. N. Ayesh, “Enhancing Urban Living in Smart Cities Using the Internet of Things (IoT),” Int. Acad. J. Sci. Eng., vol. 11, no. 1, pp. 237–246, 2024, doi: 10.9756/iajse/v11i1/iajse1127.
[2] R. Lakhani, “Cybersecurity Threats in Internet of Things (IoT) Networks: Vulnerabilities and Defense
Mechanisms,” vol. 12, no. 11, pp. 25965–25980, 2023, doi: 10.18535/ijecs/v12i11.4779.
[3] Y. Lu, “Security and Privacy of Internet of Things: A Review of Challenges and Solutions,” J. Cyber Secur. Mobil., vol. 12, no. 6, pp. 813–844, 2023, doi: 10.13052/jcsm2245-1439.1261.
[4] K. Mahanta and H. B. Maringanti, “Security in the Internet of Things (IoT): Developing intrusion detection systems for IoT devices and networks and addressing the unique security challenges posed by this connection,” Proc. Int. Conf. Artif. Intell. 5G Commun. Netw. Technol., no. May, pp. 570–576, 2023.
[5] A. Alaa Hammad, M. Adnan Falih, S. Ali Abd, and S. Rashid Ahmed, “International Journal of Computing and Digital Systems Detecting Cyber Threats in IoT Networks: A Machine Learning Approach,” no. December 2024, doi: 10.12785/ijcds/1571020041.
[6] F. Alwahedi, A. Aldhaheri, M. A. Ferrag, A. Battah, and N. Tihanyi, “Machine learning techniques for IoT security: Current research and future vision with generative AI and large language models,” Internet Things Cyber-Physical Syst., vol. 4, no. December 2023, pp. 167–185, 2024, doi: 10.1016/j.iotcps.2023.12.003.
[7] Z. Hasan, H. R. Mohammad, and M. Jishkariani, “Machine Learning and Data Mining Methods for Cyber Security: A Survey,” Mesopotamian J. CyberSecurity, vol. 2022, no. January, pp. 47–56, 2022, doi: 10.58496/MJCS/2022/006.
[8] W. Hilal, S. A. Gadsden, and J. Yawney, “Financial Fraud: A Review of Anomaly Detection Techniques and Recent Advances,” Expert Syst. Appl., vol. 193, p. 116429, 2022, doi: 10.1016/j.eswa.2021.116429.
[9] H. Taherdoost, “Security and Internet of Things: Benefits, Challenges, and Future Perspectives,” Electron., vol. 12, no. 8, pp. 1–22, 2023, doi: 10.3390/electronics12081901.
[10] T. Sobh, “An Artificial Immune System for Detecting Network Anomalies Using Hybrid Immune Theories,” J. ACS Adv. Comput. Sci., vol. 0, no. 0, pp. 0–0, 2024, doi: 10.21608/asc.2024.258634.1021.
[11] P. Satam, “Anomaly Based Wi-Fi Intrusion Detection System,” Proc. - 2017 IEEE 2nd Int. Work. Found. Appl. Self* Syst. FAS*W 2017, pp. 377–378, 2017, doi: 10.1109/FAS-W.2017.180.
[12] J. C. S. Sicato, S. K. Singh, S. Rathore, and J. H. Park, “A comprehensive analyses of intrusion detection system for IoT environment,” J. Inf. Process. Syst., vol. 16, no. 4, pp. 975–990, 2020, doi: 10.3745/JIPS.03.0144.
[13] D. Fahrmann, L. Martin, L. Sanchez, and N. Damer, “Anomaly Detection in Smart Environments: A Comprehensive Survey,” IEEE Access, vol. 12, pp. 64006–64049, 2024, doi: 10.1109/ACCESS.2024.3395051.
[14] S. Trilles, S. S. Hammad, and D. Iskandaryan, “Anomaly detection based on Artificial Intelligence of Things: A Systematic Literature Mapping,” Internet of Things (Netherlands), vol. 25, no. April, p. 101063, 2024, doi: 10.1016/j.iot.2024.101063.
[15] M. Landauer, S. Onder, F. Skopik, and M. Wurzenberger, “Deep learning for anomaly detection in log data: A survey,” Mach. Learn. with Appl., vol. 12, no. April, p. 100470, 2023, doi: 10.1016/j.mlwa.2023.100470.
[16] M. H. Thwaini, “Anomaly Detection in Network Traffic using Machine Learning for Early Threat Detection,” Data Metadata, vol. 1, pp. 1–16, 2022, doi: 10.56294/dm202272.
[17] R. Foorthuis, On the nature and types of anomalies: a review of deviations in data, vol. 12, no. 4. Springer International Publishing, 2021. doi: 10.1007/s41060-021-00265-1.
[18] K. C. Nalavade, “Using Machine Learning and Statistical Models for Intrusion Detection,” Int. J. Comput. Appl., vol. 175, no. 31, pp. 14–21, 2020, doi: 10.5120/ijca2020920854.
[19] P. Schummer, A. del Rio, J. Serrano, D. Jimenez, G. Sánchez, and Á. Llorente, “Machine Learning-Based Network Anomaly Detection: Design, Implementation, and Evaluation,” AI, vol. 5, no. 4, pp. 2967–2983, 2024, doi: 10.3390/ai5040143.
[20] Peng Zhou, “Payload-based Anomaly Detection for Industrial Internet Using Encoder Assisted GAN,” in 2020 IEEE 6th International Conference on Computer and Communications, 2020, pp. 669–673.
[21] A. Chatterjee and B. S. Ahmed, “IoT anomaly detection methods and applications: A survey,” Internet of Things (Netherlands), vol. 19, no. June, p. 100568, 2022, doi: 10.1016/j.iot.2022.100568.
[22] B. Nawaal, U. Haider, I. U. Khan, and M. Fayaz, “Signature-Based Intrusion Detection System for IoT,” Cyber Secur. Next-Generation Comput. Technol., no. November, pp. 141–158, 2024, doi: 10.1201/9781003404361-8.
[23] A. Abbas, M. A. Khan, S. Latif, M. Ajaz, A. A. Shah, and J. Ahmad, “A New Ensemble-Based Intrusion Detection System for Internet of Things,” Arab. J. Sci. Eng., vol. 47, no. 2, pp. 1805–1819, 2022, doi: 10.1007/s13369-021-06086-5.
[24] G. Rekha, S. Malik, A. K. Tyagi, and M. M. Nair, “Intrusion detection in cyber security: Role of machine learning and data mining in cyber security,” Adv. Sci. Technol. Eng. Syst., vol. 5, no. 3, pp. 72–81, 2020, doi: 10.25046/aj050310.
[25] A. Meleshko and V. Desnitsky, “The Modeling and Detection of Attacks in Role-Based Self-Organized Decentralized Wireless Sensor Networks,” Telecom, vol. 5, no. 1, pp. 145–175, 2024, doi: 10.3390/telecom5010008.
[26] Z. Yang, Z. Sarwar, I. Hwang, R. Bhaskar, B. Y. Zhao, and H. Zheng, “Can Virtual Reality Protect Users from Keystroke Inference Attacks?,” 2023, [Online]. Available: http://arxiv.org/abs/2310.16191
[27] M. S. Hammad, R. E. N. Altarazi, R. N. Al Banna, D. F. Al Borno, and S. S. Abu-naser, “A Proposed Expert System for Diagnosis of Migraine,” vol. 7, no. 6, pp. 1–8, 2023.
[28] J. Sen and S. Mehtab, “Machine Learning Applications in Misuse and Anomaly Detection,” Secur. Priv. From a Leg. Ethical, Tech. Perspect., pp. 1–22, 2020, doi: 10.5772/intechopen.92653.
[29] I. E. Salem, M. M. Mijwil, A. W. Abdulqader, M. M. Ismaeel, A. Alkhazraji, and A. M. Z. Alaabdin, “Introduction to The Data Mining Techniques in Cybersecurity,” Mesopotamian J. CyberSecurity, vol. 2022, pp. 28–37, 2022, doi: 10.58496/MJCS/2022/004.
[30] R. R. Asaad and R. M. Abdulhakim, “The Concept of Data Mining and Knowledge Extraction Techniques,” Qubahan Acad. J., vol. 1, no. 2, pp. 17–21, 2021, doi: 10.48161/qaj.v1n2a43.
[31] C. Singh, “Machine Learning in Pattern Recognition,” Eur. J. Eng. Technol. Res., vol. 8, no. 2, pp. 63–68, 2023, doi: 10.24018/ejeng.2023.8.2.3025.
[32] M. Mohamed, A. Abdullah, A. M. Zaki, F. H. Rizk, M. M. Eid, and E. M. El El-Kenway, “Advances and Challenges in Feature Selection Methods: A Comprehensive Review,” J. Artif. Intell. Metaheuristics, vol. 7, no. 1, pp. 67–77, 2024, doi: 10.54216/jaim.070105.
[33] M. Kumar, C. Sharma, S. Sharma, N. Nidhi, and N. Islam, “Analysis of Feature Selection and Data Mining Techniques to Predict Student Academic Performance,” 2022 Int. Conf. Decis. Aid Sci. Appl. DASA 2022, no. March, pp. 1013–1017, 2022, doi: 10.1109/DASA54658.2022.9765236.
[34] I. H. Sarker, “Machine Learning: Algorithms, Real-World Applications and Research Directions,” SN Comput. Sci., vol. 2, no. 3, pp. 1–21, 2021, doi: 10.1007/s42979-021-00592-x.
[35] A. F. A. H. Alnuaimi and T. H. K. Albaldawi, “An overview of machine learning classification techniques,” BIO Web Conf., vol. 97, pp. 1–24, 2024, doi: 10.1051/bioconf/20249700133.
[36] T. ALASALI and Y. ORTAKCI, “Clustering Techniques in Data Mining: A Survey of Methods, Challenges, and Applications,” Comput. Sci., no. June 2024, doi: 10.53070/bbd.1421527.
[37] P. Shetty and S. Singh, “Hierarchical Clustering: A Survey,” Int. J. Appl. Res., vol. 7, no. 4, pp. 178–181, 2021, doi: 10.22271/allresearch.2021.v7.i4c.8484.
[38] J. Landaburu, “済無No Title No Title No Title,” J. GEEJ, vol. 7, no. 2, pp. 1–23, 2016, [Online]. Available: http://www.joi.isoss.net/PDFs/Vol-7-no-2-2021/03_J_ISOSS_7_2.pdf
[39] S. Pitafi, T. Anwar, and Z. Sharif, “A Taxonomy of Machine Learning Clustering Algorithms, Challenges, and Future Realms,” Appl. Sci., vol. 13, no. 6, 2023, doi: 10.3390/app13063529.
[40] C. A. Buckner et al., “We are IntechOpen, the world’s leading publisher of Open Access books Built by scientists, for scientists TOP 1 %,” Intech, vol. 11, no. Tourism, p. 13, 2016, [Online]. Available: https://www.intechopen.com/books/advanced-biometric-technologies/liveness-detection-in-biometrics
[41] A. Rachwał et al., “Determining the Quality of a Dataset in Clustering Terms,” Appl. Sci., vol. 13, no. 5, pp. 1–20, 2023, doi: 10.3390/app13052942.
[42] D. Phiri, M. Simwanda, V. Nyirenda, Y. Murayama, and M. Ranagalage, “Decision tree algorithms for developing rulesets for object-based land cover classification,” ISPRS Int. J. Geo-Information, vol. 9, no. 5, pp. 1–16, 2020, doi: 10.3390/ijgi9050329.
[43] P. Dini, A. Elhanashi, A. Begni, S. Saponara, Q. Zheng, and K. Gasmi, “Applied Sciences Overview on Intrusion Detection Systems Design Exploiting Machine Learning for Networking Cybersecurity,” 2023.
[44] L. Boero, M. Cello, M. Marchese, E. Mariconti, T. Naqash, and S. Zappatore, “Statistical fingerprint-based intrusion detection system (SF-IDS),” Int. J. Commun. Syst., vol. 30, no. 10, 2017, doi: 10.1002/dac.3225.
[45] T. Lappas and K. Pelechrinis, “Data Mining Techniques for (Network) Intrusion Detection Systems,” Dep. Comput. Sci. Eng. UC Riverside, Riverside CA 92521, 2007.
[46] Y. Hu, A. Yang, H. Li, Y. Sun, and L. Sun, “A survey of intrusion detection on industrial control systems,” Int. J. Distrib. Sens. Networks, vol. 14, no. 8, 2018, doi: 10.1177/1550147718794615.
[47] M. N. Martinez and M. J. Bartholomew, “What does it ‘mean’? A review of interpreting and calculating different types of means and standard deviations,” Pharmaceutics, vol. 9, no. 2, 2017, doi: 10.3390/pharmaceutics9020014.
[48] M. R. Ahmed, S. Islam, S. Shatabda, A. K. M. Muzahidul Islam, M. Towhidul, and I. Robin, “Intrusion Detection System in Software-Defined Networks Using Machine Learning and Deep Learning Techniques-A Comprehensive Survey,” Ieee, no. December, pp. 1–47, 2023, doi: 10.36227/techrxiv.17153213.v1.
[49] A. Goswami, G. Choudhury, H. K. Sarmah, and A. Begum, “‘Markov Chain’ - The Most Invaluable Contribution of A. A Markov Towards Probability Theory and Modern Technology: A Historical Search,” Int. J. Innov. Res. Sci. Technol., vol. 7, no. 3, 2020.
[50] S. N. Eshun and P. Palmieri, “De-anonymisation of real-world location traces: two attacks based on the hidden Markov model,” J. Locat. Based Serv., vol. 18, no. 3, pp. 272–301, 2024, doi: 10.1080/17489725.2024.2385312.
[51] A. Ahmadian Ramaki, A. Rasoolzadegan, and A. Javan Jafari, “A systematic review on intrusion detection based on the Hidden Markov Model,” Stat. Anal. Data Min., vol. 11, no. 3, pp. 111–134, 2018, doi: 10.1002/sam.11377.
[52] R. Gaharwal, P. Kumar, and U. Dwivedi, “Xournals Xournals Detection techniques for Intrusion Detection System Xournals,” vol. 01, no. 01, pp. 16–20, 2019.
[53] S. Ingale, M. Paraye, and D. Ambawade, “A Survey on Methodologies for Multi-Step Attack Prediction,” Proc. 4th Int. Conf. Inven. Syst. Control. ICISC 2020, no. Icisc, pp. 37–45, 2020, doi: 10.1109/ICISC47916.2020.9171106.
[54] M. Rabbani et al., “A review on machine learning approaches for network malicious behavior detection in emerging technologies,” Entropy, vol. 23, no. 5, pp. 1–41, 2021, doi: 10.3390/e23050529.
[55] A. Mishra, Y. I. Alzoubi, M. J. Anwar, and A. Q. Gill, “Attributes impacting cybersecurity policy development: Evidence from seven nations,” Comput. Secur., vol. 120, 2022, doi: 10.1016/j.cose.2022.102820.
[56] O. Watts, G. E. Henter, T. Merritt, Z. Wu, and S. King, “From HMMS to DNNS: Where do the improvements
come from?,” ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., vol. 2016-May, pp. 5505–5509, 2016, doi: 10.1109/ICASSP.2016.7472730.
[57] G. Alter, “Reflections on the Intermediate Data Structure (IDS),” Hist. Life Course Stud., vol. 10, no. 3, pp. 71–75, 2021, doi: 10.51964/hlcs9570.
[58] Z. Azam, M. M. Islam, and M. N. Huda, “Comparative Analysis of Intrusion Detection Systems and Machine Learning-Based Model Analysis Through Decision Tree,” IEEE Access, vol. 11, no. June, pp. 80348–80391, 2023, doi: 10.1109/ACCESS.2023.3296444.
[59] A. Yadav, N. Thaker, D. Makwana, N. Waingankar, and P. Upadhyay, “Intruder Detection System: A Literature Review,” SSRN Electron. J., 2021, doi: 10.2139/ssrn.3866777.