Enhanced Network Anomaly Detection Using Autoencoders: A Deep Learning Approach for Proactive Cybersecurity
Abstract - Anomaly detection in network traffic is a core part of modern network management. It plays a vital role in finding security threats and performance issues. Detecting threats with machine learning (ML) methods provides an efficient way to dynamically learn normal network behaviour and detect deviations indicative of anomalies. This study proposes a support vector machine combined with an autoencoder for detecting anomalies in network traffic. Implementing ML methods for threat identification improves network security by enabling rapid response to potential threats and issues, which also reduces downtime and improves overall network reliability. This study indicates the benefits of ML-driven anomaly detection, highlighting its capability to provide a proactive security posture and rapid identification and resolution of anomalies, ensuring an efficient network infrastructure.

I. INTRODUCTION

... network traffic. This leads to the adoption of machine learning (ML) methods, which provide the capability to learn and adapt to new attack patterns dynamically. Among the many ML methods [1], deep learning methods such as autoencoders have shown significant promise in anomaly detection. Autoencoders are unsupervised neural networks designed to learn efficient representations of input data. By training on normal network traffic, autoencoders can identify patterns and find deviations, making them well suited for anomaly detection tasks.
➢ To compare the autoencoder's effectiveness with other machine learning techniques in anomaly detection.
➢ To advance the practical implementation of autoencoders for real-time network security.

II. LITERATURE SURVEY

To mitigate the high-dimensional traffic and overfitting issues, [2] presents a network traffic anomaly detection method. It utilizes the chaotic neural network algorithm. An adaptive technique is also employed to increase classification performance and data quality. This method reduces the feature dimension and enhances effectiveness. It also reduces the computational complexity. A network traffic anomaly detection method based on the Gaussian mixture method is proposed in [3]. This approach learns to distinguish effectively between normal and attack traffic. However, its complexity weighs on performance. A technique for anomaly detection in network traffic is proposed in [4]. The method utilizes clustering methodologies with Euclidean distance calculations. This method outperforms the existing methods, showing its efficiency and robustness [5].

The integration of SVM and an advanced support vector machine is introduced in [6]. This method mitigates the issues in conventional approaches by using multiple SVMs. This method is effective in zero-day attack detection and its applications [7, 8]. An Intrusion Detection System (IDS) for detecting Denial of Service (DoS) attacks in IoT networks using various machine learning algorithms is proposed in [9]. Key merits of this study include robust detection capabilities against DoS attacks and adaptability to IoT network configurations. However, limitations might include potential overfitting with complex methods and the computational demand of the genetic algorithm in real-time applications. This work enhances the security of IoT networks by effectively identifying and classifying DoS traffic [10, 11].

The study [12] focuses on real-time anomaly detection in network traffic using Convolutional Neural Networks (CNN) integrated with Software Defined Networks (SDN), addressing dynamic network configurations and preventing information loss in edge cluster networks. The proposed system shows high accuracy in anomaly detection, as highlighted by empirical results where the CNN-based method efficiently processes and identifies anomalies through direct feature extraction from network traffic [13]. A novel approach to traffic anomaly detection on road networks is proposed in [14], utilizing a spatial-temporal graph neural network. It achieves enhanced performance over baseline methods by employing a spatial-temporal representation that allows for accurate and automatic anomaly detection. However, the method's complexity and computational demand could pose challenges in real-time applications within highly dynamic network environments. The integration of machine learning with spatial-temporal data analysis stands out as a major merit, enabling effective adaptation to network changes and conditions [15].

The study [16] focuses on real-time detection of network traffic anomalies in big data environments using deep learning methods, specifically Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN). It introduces a hybrid CNN-LSTM method that processes network traffic data efficiently, handling the high volume and dynamic nature of modern network traffic [17].

[18] presents an unsupervised learning method for anomaly detection in the cloud. It handles unlabelled data and also reduces complexity. However, this method remains exposed to novel real-time attack vectors, which can lead to zero-day threats. Different deep learning methods for network anomaly detection are employed in [19] using the CSE-CIC-IDS2018 dataset. It demonstrates effective data preprocessing and hyperparameter tuning for accurate anomaly detection. These methods achieve multi-class classification accuracies above 98%, showcasing a significant improvement in network intrusion detection systems (IDS).

III. PROPOSED METHODOLOGY

This study focuses on utilizing autoencoders for anomaly detection in network traffic. It involves gathering and preprocessing network traffic data, training an autoencoder on normal traffic, and assessing the ability of the method to identify anomalies. The autoencoder is evaluated in various network scenarios and compared with other ML-based detection methods.

3.1 Data Collection

Collecting network traffic data from various sources is crucial to obtain a comprehensive view of network activity. Here are the primary sources for collecting network traffic data:

➢ Routers
➢ Firewalls
➢ Network analysis tools
➢ Intrusion detection systems
➢ Network switches
➢ Endpoints

3.2 Preprocessing Network Traffic Data

Preprocessing network traffic data is a crucial step to ensure that the data is clean, well-structured, and suitable for machine learning methods. Here are the detailed steps and methods used in preprocessing:
➢ Noise is removed to smooth out short-term fluctuations and reveal long-term trends.
➢ Linear interpolation is used to estimate missing values based on the existing data.
➢ Normalization methods are used to guarantee uniformity across features and improve the performance of the method.

Feature extraction is a crucial step in preparing data for machine learning methods, particularly for tasks like anomaly detection in network traffic. This process involves selecting and engineering features that will help the method identify patterns and make accurate predictions.
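The three preprocessing steps listed above (noise removal, linear interpolation of missing values, and normalization) can be sketched in plain Python. This is a minimal illustration and not the authors' pipeline: the moving-average smoother, the min-max scaling, the window size, and the sample values are all assumptions. Interpolation is applied first here so that the smoother sees a complete series.

```python
from statistics import mean


def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours.

    Leading or trailing gaps are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("at least one known value is required")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


def moving_average(values, window=3):
    """Smooth short-term fluctuations with a trailing moving average."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical bytes-per-second samples with two missing readings.
raw = [100.0, None, 140.0, 400.0, None, 120.0]
filled = interpolate_missing(raw)
smoothed = moving_average(filled, window=3)
scaled = min_max_normalize(smoothed)
```

Other choices, such as z-score normalization or a centred smoothing window, would fit the same description equally well.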
Feature engineering involves creating new
features from the existing data to enhance the
method’s ability to detect anomalies. This process can
reveal hidden patterns and relationships in the data
that are not immediately apparent from the raw
features.
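As a small illustration of feature engineering, the sketch below derives ratio features from raw flow counters. The field names (`bytes`, `packets`, `duration_s`) and the derived features are hypothetical and chosen only for the example; the paper does not list its engineered features in this section.

```python
def engineer_features(flow):
    """Derive ratio features from one flow record (a dict of raw counters)."""
    packets = flow["packets"]
    duration = flow["duration_s"]
    return {
        # Average payload size; very small or very large values can be telling.
        "bytes_per_packet": flow["bytes"] / packets if packets else 0.0,
        # Rates expose bursts that raw totals hide.
        "packets_per_second": packets / duration if duration > 0 else 0.0,
        "bytes_per_second": flow["bytes"] / duration if duration > 0 else 0.0,
    }


# Hypothetical flow records.
flows = [
    {"bytes": 1500, "packets": 10, "duration_s": 2.0},
    {"bytes": 64000, "packets": 1000, "duration_s": 0.5},
]
features = [engineer_features(f) for f in flows]
```

The second flow has a modest byte total but a packet rate of 2000 per second, which is the kind of relationship that is not apparent from the raw counters alone.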
3.6 Classification

\[ y_i (w \cdot x_i - b) \ge 1 \]

The problem can be transformed into its dual form using Lagrange multipliers, starting from the Lagrangian:

\[ \mathcal{L}(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (w \cdot x_i - b) - 1 \right] \tag{4} \]

$b_e$ is the bias vector of the encoder.
$\sigma$ is the activation function (e.g., ReLU, sigmoid).

Decoder: Reconstructs the input from the latent representation $z$.
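The encoder and decoder roles described above can be sketched as a single forward pass. This assumes the common single-layer form z = σ(W_e x + b_e) and x̂ = σ(W_d z + b_d). The names `w_e`, `w_d`, `b_d`, the hand-set weights, and the threshold are illustrative assumptions and do not come from the paper. In practice the weights would be learned by training on normal traffic.

```python
import math


def sigmoid(v):
    """Logistic activation, one possible choice for sigma."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vector):
    """Compute W @ vector + bias for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def encode(x, w_e, b_e):
    """Map the input x to the latent representation z."""
    return [sigmoid(v) for v in affine(w_e, b_e, x)]


def decode(z, w_d, b_d):
    """Reconstruct the input from the latent representation z."""
    return [sigmoid(v) for v in affine(w_d, b_d, z)]


def reconstruction_error(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# Hypothetical, untrained parameters: 3 input features, 2 latent units.
W_E = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
B_E = [0.0, 0.1]
W_D = [[0.6, -0.1], [0.2, 0.4], [-0.3, 0.7]]
B_D = [0.0, 0.0, 0.0]
THRESHOLD = 0.05  # assumed cut-off; normally tuned on held-out normal traffic

x = [0.2, 0.7, 0.1]  # one normalized traffic sample (made up)
z = encode(x, W_E, B_E)
x_hat = decode(z, W_D, B_D)
error = reconstruction_error(x, x_hat)
is_anomaly = error > THRESHOLD
```

A sample is flagged when its reconstruction error exceeds the threshold, on the reasoning that an autoencoder trained only on normal traffic reconstructs unfamiliar patterns poorly.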
traffic obtained over a month, indicating the realistic and modern network environment.

Key Features:

TABLE 1: COMPARATIVE ANALYSIS FOR ACCURACY VALUES OF PROPOSED WITH EXISTING METHOD

TABLE 2: COMPARATIVE ANALYSIS FOR PRECISION VALUES OF PROPOSED WITH EXISTING METHOD

[Figure: precision (in %) by methodology: SRU [15], HMAODL-CTC [18], RF+SVM [14], Proposed]
Fig. 4. Precision analysis

Figure 4 shows the precision values for the same four methods. Precision measures the proportion of correctly identified anomalies out of all instances identified as anomalies. The proposed methodology achieves the highest precision at 97%, indicating its superior ability to detect anomalies accurately with minimal false positives. This high precision suggests that the proposed method is highly effective in correctly identifying true positive anomalies, reducing the rate of false alarms.

iii. Recall:

Recall (also known as sensitivity or true positive rate) measures the proportion of correctly identified anomalies out of all actual anomalies. A high recall indicates a low number of false negatives. It is given by

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{15} \]

Figure 5 illustrates the recall values, which measure the proportion of correctly identified anomalies out of all actual anomalies. The proposed methodology achieves the highest recall at 96%, indicating its superior capability to detect almost all actual anomalies in network traffic. This high recall value suggests that the proposed method effectively minimizes false negatives, ensuring that most of the anomalies present in the network traffic are detected.

iv. F1 Score:

The F1-score is defined as the harmonic mean of precision and recall, offering a balance between the two. It is calculated by

\[ F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{16} \]

TABLE 4: COMPARATIVE ANALYSIS FOR F1-SCORE VALUES OF PROPOSED WITH EXISTING METHOD
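The metric definitions in this section can be checked numerically with a short sketch. Precision is taken as TP / (TP + FP), recall as TP / (TP + FN), and F1 as the harmonic mean 2PR / (P + R). The confusion counts below are hypothetical and are not the paper's results.

```python
def precision(tp, fp):
    """Share of flagged instances that are true anomalies."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual anomalies that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical confusion counts, for illustration only.
tp, fp, fn = 90, 10, 30
p = precision(tp, fp)  # 0.9
r = recall(tp, fn)     # 0.75
f1 = f1_score(p, r)    # about 0.818
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot reach a high F1-score by maximizing precision or recall alone.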
[Figure: F1-score (in %) by methodology: HMAODL-CTC, SRU [15], RF+SVM [14]]

[5] Simon, J., Kapileswar, N., Phani Kumar, P. and Aarthi Elaveini, M., 2024. Improved geographic opportunistic routing protocol for void hole elimination in underwater IoTs: Parameter tuning by TSA optimization. International Journal of Communication Systems, 37(3), p.e5659.
[6] Pradeep, S. and Geetha, A., 2024. Advanced Support Vector Machine Based Aggregation Method for Network Anomaly Detection. Journal of Basic Science and Engineering, 21(1), pp.1442-1452.
[7] Murugan, K.S., Sudharsanam, V., Padmavathi, B., Simon, J., Jacintha, V. and Sumathi, K., 2020, November. BER analysis of