
Machine learning for securing Cyber-Physical Systems under Cyber attacks

This study explores the use of the Random Forest algorithm for enhancing intrusion detection in Cyber-Physical Systems against evolving cyber threats. It emphasizes the importance of data preprocessing and feature selection, utilizing the KDD dataset to categorize attributes and evaluate performance based on Detection Rate and False Alarm Rate. The proposed system aims to improve accuracy, reduce overfitting, and adapt to various network traffic patterns, ultimately strengthening cybersecurity measures.

Machine learning for securing Cyber–Physical Systems under Cyber attacks

Presented By:
V. Anitha (731720205002)
G. Mariyammal (731720205017)
R. Menaka (731720205018)
S. Priyadharshini (731720205021)
Guided By:
Mrs. J. Jenshya, AP/IT
ABSTRACT
• In the face of rapidly evolving cyber threats within the Internet landscape, this study delves into the application of the Random
Forest algorithm for intrusion detection, a crucial facet of cyber security.

• Embracing Machine Learning (ML) methodologies, the paper provides a tutorial overview of the Random Forest algorithm
along with its counterparts, indexing and summarizing pertinent papers based on temporal or thermal correlations.

• Acknowledging the pivotal role of data in ML methods, the research also highlights commonly used network datasets and
addresses associated challenges in the cyber security domain.

• Focusing on the well-established KDD dataset, the project employs the Random Forest (RF) algorithm to categorize attributes
into four classes (Basic, Content, Traffic, and Host), specifically targeting the improvement of intrusion detection strategies.

• Evaluation of the analysis is performed with respect to two key metrics, Detection Rate (DR) and False Alarm Rate (FAR),
offering valuable insights into the effectiveness of the Random Forest algorithm in enhancing the capabilities of an Intrusion
Detection System (IDS).
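
The two metrics named above can be made concrete with a small sketch. It assumes the commonly used definitions, DR = TP / (TP + FN) and FAR = FP / (FP + TN), which the slides do not spell out; the function name and the toy labels are illustrative only.

```python
def detection_and_false_alarm_rate(y_true, y_pred, attack_label=1):
    """Return (DR, FAR) for binary labels where attack_label marks an intrusion."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == attack_label and pred == attack_label:
            tp += 1  # attack correctly flagged
        elif truth == attack_label:
            fn += 1  # attack missed
        elif pred == attack_label:
            fp += 1  # normal traffic flagged as attack (false alarm)
        else:
            tn += 1  # normal traffic correctly passed
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


# Toy labels (assumed for illustration): 4 attacks, 6 normal connections.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
dr, far = detection_and_false_alarm_rate(y_true, y_pred)
print(f"DR = {dr:.2f}, FAR = {far:.2f}")  # DR = 0.75, FAR = 0.17
```

A good IDS pushes DR toward 1 while keeping FAR close to 0; reporting both avoids the misleading picture that plain accuracy can give on imbalanced traffic.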
OBJECTIVES
• Detect network intrusions with high accuracy. The IDS should identify both known and
unknown attacks while keeping the false positive rate low.

• Reduce overfitting. Overfitting occurs when the IDS fits the training data too closely, which leads to poor
performance on new data.

• Improve adaptability and flexibility. The proposed system selects its parameter values automatically according to the
training dataset in use, which makes the system more adaptable to different types of network traffic and intrusion patterns.
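
One plausible way to realize the automatic parameter selection in the last objective is a cross-validated grid search. The sketch below is an assumption, not the project's confirmed method: it uses scikit-learn's GridSearchCV, a synthetic dataset in place of real traffic, and an arbitrary small grid.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for preprocessed traffic features (not real KDD data).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Hypothetical candidate values; a real study would choose its own grid.
param_grid = {"n_estimators": [10, 30], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,          # 3-fold cross-validation on the training data
    scoring="f1",  # balances missed attacks against false alarms
)
search.fit(X, y)

print("Selected parameters:", search.best_params_)
print(f"Mean cross-validated F1: {search.best_score_:.3f}")
```

Because the parameters are chosen from the training data itself, rerunning the search on a different traffic dataset may select different values, which is the adaptability the objective describes.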
INTRODUCTION
• In the digital age, the security of computer networks and data has become paramount. With the increasing sophistication of
cyber threats and the interconnectedness of our systems, the need for robust network intrusion detection systems (NIDS) has
never been greater.

• Intrusion detection plays a pivotal role in safeguarding organizations, detecting unauthorized access, and mitigating potential
threats to information systems. Traditional intrusion detection methods often face challenges in adapting to the ever-
evolving threat landscape.

• To address these challenges and enhance the efficacy of intrusion detection, we propose a novel approach: "Network
Intrusion Detection with Two-Phased Hybrid Ensemble Learning and Automatic Feature Selection."

• This research embarks on a journey to amalgamate cutting-edge techniques from the realms of machine learning, data
science, and cybersecurity. By fusing the power of ensemble learning and automatic feature selection into a two-phased
detection system, we aim to redefine the landscape of network intrusion detection.
LITERATURE SURVEY
• INTRUSION DETECTION SYSTEMS IN THE INTERNET OF THINGS: A COMPREHENSIVE INVESTIGATION

• Somayye Hajiheidari et al. note that a new dimension of intelligent objects has recently been provided by
reducing the power consumption of electrical appliances. Everyday physical objects have been upgraded with electronic devices
connected over the Internet to create local intelligence and communicate with cyberspace. The Internet of Things (IoT) is the
term used in this domain for realizing these intelligent objects. Since objects in the IoT are directly connected to the unsafe
Internet, the resource-constrained devices are easily accessible to attackers. Such public access to the Internet makes things
vulnerable to intrusions. The purpose is to categorize attacks that do not explicitly damage the network but instead
infect internal nodes, leaving them ready to carry out attacks on the network; these are termed internal attacks.
Therefore, the significance of Intrusion Detection Systems (IDSs) in the IoT is undeniable. However, despite the importance of
this topic, there is no comprehensive and systematic review that discusses and analyzes its significant mechanisms.
LITERATURE SURVEY
• ENSEMBLE LEARNING FOR INTRUSION DETECTION SYSTEMS: A SYSTEMATIC MAPPING STUDY AND
CROSS-BENCHMARK EVALUATION

• Bayu Adhi Tama et al. state that intrusion detection systems (IDSs) are intrinsically linked to a
comprehensive solution of cyberattack prevention instruments. To achieve a higher detection rate, the ability to design an
improved detection framework is sought after, particularly when utilizing ensemble learners. Designing an ensemble involves
two main challenges: the choice of available base classifiers and the choice of combiner methods. The paper gives an
overview of how ensemble learners are exploited in IDSs by means of a systematic mapping study. The authors collected and
analyzed 124 prominent publications from the existing literature. The selected publications were then mapped into several
categories, such as years of publication, publication venues, datasets used, ensemble methods, and IDS techniques.
LITERATURE SURVEY
• DEEP ABSTRACTION AND WEIGHTED FEATURE SELECTION FOR WI-FI IMPERSONATION DETECTION

• Muhamad Erza Aminanto et al. observe that recent advances in mobile technologies have resulted in IoT-enabled
devices becoming more pervasive and integrated into our daily lives. The security challenges that need to be overcome
mainly stem from the open nature of a wireless medium such as a Wi-Fi network. An impersonation attack is an attack in
which an adversary is disguised as a legitimate party in a system or communications protocol. The connected devices are
pervasive, generating high-dimensional data on a large scale, which complicates simultaneous detections. Feature learning,
however, can circumvent the potential problems that could be caused by the large-volume nature of network data. The study
thus proposes a novel Deep-Feature Extraction and Selection (D-FES) method, which combines stacked feature extraction and
weighted feature selection.
EXISTING SYSTEM

• In recent years, machine learning-based cyber intrusion detection methods have gained increasing popularity. The
number and complexity of new attacks continue to rise; therefore, effective and intelligent solutions are necessary.

• Unsupervised machine learning techniques are particularly appealing to intrusion detection systems since they can
detect known and unknown types of attacks as well as zero-day attacks.

• The existing work presents an unsupervised anomaly detection method that combines Sub-Space
Clustering (SSC) and One-Class Support Vector Machine (OCSVM) to detect attacks without any prior
knowledge.

• That approach is evaluated on the well-known NSL-KDD dataset. Its reported experimental results
show that it performs better than some of the other existing techniques.
DISADVANTAGES

• Unsupervised anomaly detection methods can generate false positives, which are instances that are
incorrectly classified as anomalous. This can lead to unnecessary alerts and disruptions to normal operations.

• They can also generate false negatives, which are instances that are incorrectly classified as normal. This can
allow attacks to go undetected.

• Unsupervised anomaly detection methods can be computationally expensive, especially for large and
high-dimensional datasets.

• They can be difficult to tune, as there is no single set of parameters that will work best for all datasets.
PROPOSED SYSTEM
• The proposed system integrates the Random Forest algorithm for intrusion detection within the dynamic and evolving
landscape of cyber threats. It first loads the relevant data, including the well-established KDD dataset, which provides the
foundation for the later stages.

• Subsequent data pre-processing addresses challenges in cyber security data, ensuring the dataset's cleanliness and readiness
for analysis. Feature selection categorizes attributes into classes, such as Basic, Content, Traffic, and Host, aiming to
improve intrusion detection strategies.

• The training and testing phases apply the Random Forest algorithm to learn patterns and correlations within the data,
subsequently evaluating the model's performance using key metrics like Detection Rate (DR) and False Alarm Rate (FAR).

• The culmination of these modules results in an Intrusion Detection System (IDS) that leverages machine learning
methodologies, specifically the Random Forest algorithm, to effectively enhance cyber security measures against emerging
threats in the digital domain.
ADVANTAGES

• The Random Forest algorithm exhibits superior accuracy in intrusion detection, effectively distinguishing between normal
and malicious network behavior.

• Its ensemble nature and use of multiple decision trees make Random Forest resilient to overfitting, enhancing the model's
generalization capabilities.

• The algorithm provides insights into feature importance, aiding in the identification of critical attributes for effective
intrusion detection strategies.

• Random Forest is versatile, accommodating various types of data and exhibiting consistent performance across diverse
cyber security scenarios.
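
The feature-importance advantage can be illustrated with a short sketch. It assumes scikit-learn's RandomForestClassifier and a toy dataset in which only one column actually determines the label; the feature names are borrowed from KDD for readability, but the values are random and not real traffic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["duration", "src_bytes", "count", "dst_host_count"]

# Toy data (assumed): four random columns, 400 rows.
X = rng.normal(size=(400, 4))
# Only the "count" column (index 2) decides whether a row is an attack.
y = (X[:, 2] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means more influential.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:15s} {score:.3f}")
```

On this toy data the forest ranks "count" first, because the other columns carry no signal. On real traffic the ranking is a starting point for feature selection rather than a proof that a feature matters.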
SYSTEM SPECIFICATION:

• HARDWARE REQUIREMENTS

• Processor Type : AMD Ryzen 7
• Speed : 4.40 GHz
• RAM : 16 GB
• Hard Disk : 1 TB
• Keyboard : 101/102 Standard Keys
• Mouse : Optical Mouse

• SOFTWARE REQUIREMENTS

• Operating System : Windows 10
• Front End : Jupyter Notebook / Anaconda
• Coding Language : Python
MODULES
• Load Data

• Data Pre-processing

• Feature Selection

• Training and Testing

• Intrusion Detection using Random Forest Algorithm


MODULE DESCRIPTION

• Load Data

• This module performs the initial step of loading the relevant data for intrusion detection. The study uses the
well-established KDD dataset, which contains labeled network connection records. This step is crucial for
acquiring the necessary information to train and test the Random Forest algorithm for intrusion detection.
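
A minimal sketch of the loading step follows, using only the Python standard library. The column subset and the four sample rows are hypothetical: a full KDD record carries 41 features plus a label, and the function name load_records is introduced here for illustration.

```python
import csv
import io

# Hypothetical excerpt: a real KDD record has 41 features plus a label.
COLUMNS = ["duration", "protocol_type", "service", "flag",
           "src_bytes", "dst_bytes", "label"]

SAMPLE = """\
0,tcp,http,SF,181,5450,normal
0,tcp,private,S0,0,0,neptune
0,icmp,ecr_i,SF,1032,0,smurf
2,tcp,smtp,SF,1684,363,normal
"""


def load_records(handle, columns=COLUMNS):
    """Read headerless comma-separated connection records into dicts."""
    reader = csv.DictReader(handle, fieldnames=columns)
    return [dict(row) for row in reader]


records = load_records(io.StringIO(SAMPLE))

# Binary target: anything not labeled "normal" is treated as an attack.
labels = [0 if r["label"] == "normal" else 1 for r in records]
print(len(records), "records loaded;", sum(labels), "attacks")
```

To read an actual dataset file, the in-memory sample would be replaced by an opened file handle and COLUMNS by the full list of feature names.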

• Data Pre-processing

• Data pre-processing is a pivotal stage in any machine learning project. In this module, the study addresses the challenges
associated with cyber security data. It involves cleaning and organizing the loaded data, handling missing values, and
ensuring that the dataset is ready for the subsequent stages, such as feature selection and training.
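
The cleaning steps described above might look like the following sketch. The specific choices are assumptions, since the slides do not say how missing values are filled or how categorical fields are encoded: missing numeric values take the column's (upper) median, categorical values get integer codes, and every column is min-max scaled.

```python
def preprocess(records, numeric_cols, categorical_cols):
    """Impute, encode, and scale a list of record dicts into numeric rows."""
    # 1. Missing numeric values ("" or None) take the column's upper median.
    medians = {}
    for col in numeric_cols:
        present = sorted(
            float(r[col]) for r in records if r.get(col) not in ("", None)
        )
        medians[col] = present[len(present) // 2] if present else 0.0

    # 2. Each categorical value gets an integer code (alphabetical order).
    codes = {
        col: {v: i for i, v in enumerate(sorted({r[col] for r in records}))}
        for col in categorical_cols
    }

    rows = []
    for r in records:
        nums = [
            float(r[c]) if r.get(c) not in ("", None) else medians[c]
            for c in numeric_cols
        ]
        cats = [float(codes[c][r[c]]) for c in categorical_cols]
        rows.append(nums + cats)

    # 3. Min-max scale every column to [0, 1]; constant columns become 0.
    for j in range(len(rows[0])):
        lo = min(row[j] for row in rows)
        hi = max(row[j] for row in rows)
        for row in rows:
            row[j] = (row[j] - lo) / (hi - lo) if hi > lo else 0.0
    return rows


# Toy records (assumed); the second one is missing src_bytes.
records = [
    {"duration": "0", "src_bytes": "181", "protocol_type": "tcp"},
    {"duration": "2", "src_bytes": "", "protocol_type": "udp"},
    {"duration": "4", "src_bytes": "1032", "protocol_type": "tcp"},
]
rows = preprocess(records, ["duration", "src_bytes"], ["protocol_type"])
for row in rows:
    print(row)
```

Tree-based models such as Random Forest do not strictly need scaling, so step 3 is optional here; it is kept so the same rows could feed other classifiers for comparison.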
MODULE DESCRIPTION

• Feature Selection

• Feature selection is the process of identifying and choosing relevant attributes from the dataset. In the context of intrusion
detection, the study categorizes attributes into classes such as Basic, Content, Traffic, and Host. This module delves into the
criteria for selecting these features and highlights their importance in improving intrusion detection strategies.
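
The four attribute classes can be written down as a lookup table. The grouping below follows the commonly documented layout of the 41 KDD Cup 99 features; that it matches the exact grouping used in this project is an assumption, and select_features is a hypothetical helper.

```python
KDD_FEATURE_GROUPS = {
    # Properties of a single connection.
    "Basic": [
        "duration", "protocol_type", "service", "flag", "src_bytes",
        "dst_bytes", "land", "wrong_fragment", "urgent",
    ],
    # Payload-derived indicators such as failed logins.
    "Content": [
        "hot", "num_failed_logins", "logged_in", "num_compromised",
        "root_shell", "su_attempted", "num_root", "num_file_creations",
        "num_shells", "num_access_files", "num_outbound_cmds",
        "is_host_login", "is_guest_login",
    ],
    # Statistics over a short time window.
    "Traffic": [
        "count", "srv_count", "serror_rate", "srv_serror_rate",
        "rerror_rate", "srv_rerror_rate", "same_srv_rate",
        "diff_srv_rate", "srv_diff_host_rate",
    ],
    # Statistics over recent connections to the same destination host.
    "Host": [
        "dst_host_count", "dst_host_srv_count", "dst_host_same_srv_rate",
        "dst_host_diff_srv_rate", "dst_host_same_src_port_rate",
        "dst_host_srv_diff_host_rate", "dst_host_serror_rate",
        "dst_host_srv_serror_rate", "dst_host_rerror_rate",
        "dst_host_srv_rerror_rate",
    ],
}


def select_features(record, groups):
    """Keep only the features of `record` that belong to the named groups."""
    wanted = {name for g in groups for name in KDD_FEATURE_GROUPS[g]}
    return {k: v for k, v in record.items() if k in wanted}


for group, names in KDD_FEATURE_GROUPS.items():
    print(f"{group:8s} {len(names)} features")

sample = {"duration": 0, "hot": 0, "count": 8, "dst_host_count": 255}
print(select_features(sample, ["Basic", "Traffic"]))
```

Training one model per group, or per combination of groups, is one way to compare how much each class of attributes contributes to detection.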

• Training and Testing

• This module involves the application of the Random Forest algorithm to train the model using the prepared dataset. The
training phase is essential for the algorithm to learn patterns and correlations within the data. Subsequently, the model is
tested on separate datasets to evaluate its performance. The evaluation is based on key metrics, including Detection Rate
(DR) and False Alarm Rate (FAR).
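
The training and testing phase can be sketched as below. This is an illustration under stated assumptions rather than the project's actual code: it uses scikit-learn, a synthetic imbalanced dataset instead of KDD, a 70/30 stratified hold-out split as the "separate" test data, and the usual definitions DR = TP / (TP + FN) and FAR = FP / (FP + TN).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for preprocessed traffic; label 1 = attack, 0 = normal.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8,
    weights=[0.8, 0.2], random_state=42,
)

# Hold out 30% of the rows for testing, keeping the attack ratio intact.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# With labels=[0, 1], ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
detection_rate = tp / (tp + fn)
false_alarm_rate = fp / (fp + tn)
print(f"DR = {detection_rate:.3f}, FAR = {false_alarm_rate:.3f}")
```

Evaluating only on rows the model never saw during training is what makes DR and FAR a fair check against overfitting.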
MODULE DESCRIPTION

• Intrusion Detection using Random Forest Algorithm

• The final module focuses on the application of the Random Forest algorithm for intrusion detection. It utilizes the insights
gained from the training and testing phases to enhance the capabilities of an Intrusion Detection System (IDS). The module
provides a comprehensive analysis of the algorithm's effectiveness in improving the detection of cyber threats within the
Internet landscape.
SYSTEM FLOW DIAGRAM

Loading Dataset → Preprocessing → Feature Selection → Computing Anomaly Score Based On Selected Features → Detecting Threats Using RF Method → Result
SCOPE OF THE PROJECT
• Future work in advancing the Random Forest-based Intrusion Detection System (IDS) could explore the integration of deep
learning techniques to enhance the model's ability to capture intricate patterns and dependencies in network data.

• Additionally, research efforts could focus on the development of more sophisticated feature selection methods tailored to
the unique challenges of evolving cyber threats. Exploring real-time adaptation mechanisms, such as online learning or
reinforcement learning, would further fortify the IDS against dynamic attack landscapes.
CONCLUSION
• In conclusion, this study underscores the efficacy of the Random Forest algorithm as a robust tool for intrusion detection in
the ever-evolving landscape of cyber threats.

• Through meticulous data loading, pre-processing, and feature selection, the algorithm demonstrates its versatility and
ability to improve detection strategies.

• The training and testing phases substantiate its high accuracy, resilience to overfitting, and efficiency in handling large
datasets. The insights gained into feature importance further contribute to enhancing the overall performance of an Intrusion
Detection System (IDS).
REFERENCES
• [1] R. Kumar, A. Malik, and V. Ranga, “An intellectual intrusion detection system using hybrid hunger games search and
remora optimization algorithm for IoT wireless networks,” Knowl.-Based Syst., vol. 256, Nov. 2022, Art. no. 109762.

• [2] W. Wang, S. Jian, Y. Tan, Q. Wu, and C. Huang, “Representation learning-based network intrusion detection system
by capturing explicit and implicit feature interactions,” Comput. Secur., vol. 112, Jan. 2022, Art. no. 102537.

• [3] J. Oughton, W. Lehr, K. Katsaros, I. Selinis, D. Bubley, and J. Kusuma, “Revisiting wireless internet connectivity:
5G vs Wi-Fi 6,” Telecomm. Policy, vol. 45, no. 5, Jun. 2021, Art. no. 102127.

• [4] B. A. Tama and S. Lim, “Ensemble learning for intrusion detection systems: A systematic mapping study and
cross-benchmark evaluation,” Comput. Sci. Rev., vol. 39, Feb. 2021, Art. no. 100357.

• [5] S. Lei, C. Xia, Z. Li, X. Li, and T. Wang, “HNN: A novel model to study the intrusion detection based on
multi-feature correlation and temporal-spatial analysis,” IEEE Trans. Netw. Sci. Eng., vol. 8, no. 4, pp. 3257–3274, Oct. 2021.
LOADING DATASET AND PREPROCESSING:
FEATURE SELECTION:
DETECTING THREATS USING MRF METHOD:
THANK YOU
