Machine Learning Based Intrusion Detection Systems Using HGWCSO and ETSVM Techniques
Abstract: In recent years, computer networks have grown significantly in size and complexity, and Intrusion Detection Systems (IDS) have become an integral part of the system foundation. An IDS must overcome obstacles such as a low detection rate and a high computational load. Insufficient feature selection in an IDS can have a negative impact on the accuracy of machine learning methods, resulting in errors in the form of False Negatives (FN) and False Positives (FP), which must be minimised. This research presents an effective feature selection and classification technique for intrusion detection that combines the Hybrid Grey Wolf optimizer Cuckoo Search Optimization (HGWCSO) with the Enhanced Transductive Support Vector Machine (ETSVM). The proposed strategies are capable of selecting the top eight features from a total of 41 features without sacrificing precision or recall. The experimental results reveal that the proposed system outperforms the current system in terms of accuracy, precision, recall, and F-measure.

Keywords: HGWCSO, ETSVM, IDS, accuracy, precision, recall, and F-measure.

I. INTRODUCTION
In this study, the HGWCSO approach is integrated with the ETSVM algorithm to improve intrusion detection accuracy [1-4]. Min-max normalisation is used for preprocessing, which enhances the attack detection accuracy. Feature selection is then carried out with the HGWCSO algorithm, which selects the best features from the KDD dataset. The features with the highest fitness values are then retained. Using the ETSVM classification technique, the intrusion and normal records in the dataset are correctly separated [5]. The performance measures considered include precision, recall, specificity, and accuracy. The computational complexity of the approach needs further consideration. Various attacks on real-time data can be studied in the future.

II. LITERATURE SURVEY
[6] used TLMD, C5.0, and the Naive Bayes algorithm to improve the detection rate and lower the false alarm rate of adaptive network intrusion detection. The TLMD approach also helps manage unbalanced datasets, cope with continuous attributes, and reduce noise in the training dataset. The detection rate, accuracy, and false alarm rate of the TLMD approach are compared to current methods on the KDD Cup99 benchmark intrusion detection dataset. The TLMD technique provides a low false alarm rate and a high detection rate on the unbalanced dataset. [7] provides a novel method. With feature sets, feature selection algorithms, simplified sub spacing, and randomised meta-learning, T-IDS built an RDPLM. The model's accuracy is 99.984% on the popular Botnet dataset, and training takes 21.38 seconds. The sequential minimal optimization and random tree identification tasks were shown to reduce error pruning using deep neural networks.

[8] prioritised alerts based on risk. The approach employs priority, reliability, and asset value as judgement criteria to quantify an alert's risk. With Snort, intrusion detection can be improved by classifying alerts by risk category, so that only the most serious alerts are displayed to the security administrator, reducing the number of FP. The approach was evaluated using the KDD Cup 99 dataset and pattern matching. [9] tested the performance of a NIDS on an OpenStack private cloud. The study's purpose is to assess the NIDS's performance and accuracy in classifying attacks. The results show the model's output is safe and exact. The NIDS's real-time warning can also identify attacks across the network.

[10] created an ARIMA model for online service intrusion detection. First, the ARIMA model processes the training data. Second, it predicts future behaviour within a confidence interval. To check for anomalies, it analyses the testing data; if any observation falls outside the confidence interval, it notifies an administrator. The experiments and results are based on real-world data.

[11] proposes an intrusion detection method based on information gain, mutual correlation, and feature cardinality. The variable feature subset uses a genetic method. The results show that information-based feature selection can enhance detection rates, with this model obtaining 87.54 percent accuracy.

III. PROBLEM SPECIFICATION
Several approaches have been proposed for developing automated and intelligent IDS that can detect and eliminate malicious attacks on computer networks. In many IDSs, rule-based expert systems and statistical methods are used as detectors. Because rule-based expert systems detect only certain well-known intrusions, detecting novel intrusions is difficult, and the signature database must be updated frequently and manually. Furthermore, statistical IDS need enough data to build a complex mathematical model, which is problematic for complex network traffic. To reduce training time and improve accuracy, it is necessary to identify the significant network traffic features and to use a capable classifier.

IV. METHODOLOGY
The HGWCSO with ETSVM method is used in the newly presented system to provide more accurate
Authorized licensed use limited to: SRM Institute of Science and Technology. Downloaded on April 06,2023 at 03:43:21 UTC from IEEE Xplore. Restrictions apply.
classification output for the KDD dataset. The proposed architecture is shown in Figure 1.

ETSVM works well with high-dimensional data. The KDD dataset is first normalised using min-max normalisation. The HGWCSO algorithm then selects the most significant features. The cuckoo search algorithm improves the reliability and search performance of GWO. Finally, the ETSVM algorithm helps to detect intrusion attacks more efficiently. Compared to other algorithms, the newly proposed HGWCSO with ETSVM method yields higher precision, recall, specificity, and accuracy.
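
The min-max normalisation step mentioned above can be illustrated with a minimal sketch. The paper does not give its preprocessing code, so the function name `min_max_normalise`, the [0, 1] target range, the handling of constant columns, and the toy records below are all assumptions made for illustration only.

```python
from typing import List, Sequence


def min_max_normalise(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale every column of `rows` to the range [0, 1] (assumed target range)."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled_row = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # Assumption: a constant column is mapped to 0.0 to avoid
            # dividing by zero.
            scaled_row.append((value - low) / span if span else 0.0)
        scaled.append(scaled_row)
    return scaled


# Hypothetical toy records with three numeric features each.
records = [
    [0.0, 181.0, 5450.0],
    [2.0, 239.0, 486.0],
    [4.0, 235.0, 1337.0],
]
normalised = min_max_normalise(records)
```

Each feature is scaled independently, so features with large raw magnitudes (such as byte counts) do not dominate features with small ones when the classifier is trained.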
against Particle Swarm Optimization (PSO) with AdaBoost based SVM, Best Feature Selection Algorithm (BFS) with Genetic and Ant Colony Optimization (GACO), and Gaussian Firefly Algorithm (GFA) with Improved Relevance Vector Machine (IRVM) methods. 10-fold cross-validation was used in the experiments. In 10-fold cross-validation, the sample is divided into ten equal-sized subsamples. One of the ten subsamples is used for model validation, while the remaining nine are used for training. The cross-validation procedure is performed ten times (the folds), with each of the ten subsamples serving as the validation data exactly once. The ten results are then averaged to produce a single estimate. All of the data is used for both training and testing, and each observation is used for testing exactly once. Confusion matrices are used to compute accuracy, precision, and the other measures. The confusion matrix is shown in Figure 3.

conventional approaches. The x-axis represents the number of samples, while the y-axis depicts accuracy. The HGWCSO algorithm is used to choose the best features. The cuckoo search algorithm enhances the reliability and search performance of GWO and increases the accuracy rate. In contrast to existing approaches, the experimental findings show that the proposed system achieves higher accuracy.

TABLE III. NUMBER OF SAMPLES VS PRECISION
Number of samples   PSO+AdaBoost based SVM   BFS+Hybrid GACO   GFA+IRVM   HGWCSO+ETSVM
50                  76.58                    81.36             89.6       93.6
100                 79.65                    82.35             91.2       94.65
150                 83.65                    86.35             92.65      95.69
200                 85.69                    88.31             95.6       97.65
250                 89.65                    91.25             96.5       98.6
300                 91.26                    94.35             97.22      99.12
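
The 10-fold cross-validation protocol and the confusion-matrix measures described in this section can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the helper names (`k_fold_indices`, `confusion_counts`, `metrics`, `cross_validate`), the absence of shuffling, the label convention (1 = intrusion, 0 = normal), and the placeholder predictor are all assumptions.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return (train, validation) index lists for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Count (TP, FP, TN, FN), treating `positive` as the intrusion class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard measures derived from the confusion matrix."""
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


def cross_validate(labels: Sequence[int],
                   predict_fold: Callable[[List[int], List[int]], List[int]],
                   k: int = 10) -> Dict[str, float]:
    """Average each measure over the k validation folds."""
    totals: Dict[str, float] = {}
    for train, validation in k_fold_indices(len(labels), k):
        y_true = [labels[i] for i in validation]
        y_pred = predict_fold(train, validation)
        for name, value in metrics(*confusion_counts(y_true, y_pred)).items():
            totals[name] = totals.get(name, 0.0) + value / k
    return totals


# Hypothetical usage: 20 alternating labels and a placeholder predictor
# that flags every validation record as an intrusion.
labels = [1, 0] * 10
scores = cross_validate(labels, lambda train, validation: [1] * len(validation))
```

In practice `predict_fold` would train a classifier on the `train` indices and return its predictions for the `validation` indices; the placeholder here only exercises the bookkeeping.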
D. Estimated Ratio of F-Measure
In Table V and Figure 7, the HGWCSO with ETSVM scheme is compared to the other methods in terms of F-measure. The number of samples is represented on the x-axis, while the y-axis depicts the F-measure. In comparison to other approaches, the newly developed HGWCSO with ETSVM achieves a superior F-measure, as seen in the graph.

TABLE V. NUMBER OF SAMPLES VS F-MEASURE
Number of samples   PSO+AdaBoost based SVM   BFS+Hybrid GACO   GFA+IRVM   HGWCSO+ETSVM
50                  72.13                    81.346            89.06      95.95
100                 75.23                    83.065            91.02      96.86
150                 76.528                   84.5696           92.4565    96.95
200                 79.346                   86.035            95.63      98.12
250                 78.65                    89.65             96.50      98.5
300                 82.36                    91.23             97.22      99.12

VI. CONCLUSION
To improve intrusion detection accuracy, the HGWCSO approach is paired with the ETSVM algorithm. Preprocessing with min-max normalisation enhances attack detection accuracy. The HGWCSO technique is then used to identify the best features from the KDD dataset, retaining the features with the highest fitness values. The ETSVM classifier separates the dataset into intrusion and normal records. The performance measures examined include precision, recall, specificity, and accuracy. The computational complexity of the approach requires further consideration. Future research can look into real-time attacks.

ACKNOWLEDGMENT
The authors thank VISTAS for supporting this research.

REFERENCES
[1] Feng, Yanhong, et al. "A Novel Hybrid Cuckoo Search Algorithm with Global Harmony Search for 0-1 Knapsack Problems." Atlantis Press, doi.org, https://doi.org/10.1080/18756891.2016.1256577. Accessed 8 Apr. 2022.
[2] Gaied, I., Jemili, F. and Korbaa, O., "Intrusion Detection Based on Neuro-Fuzzy Classification", IEEE/ACS International Conference of Computer Systems and Applications (AICCSA), pp. 1-8.
[3] D. P. Gaikwad, et al. "Intrusion Detection System Using Bagging Ensemble Method of Machine Learning." ieeexplore.ieee.org, https://ieeexplore.ieee.org/document/7155853. Accessed 8 Apr. 2022.
[4] Mohan, Pilla Vaishno, et al. "Leveraging Computational Intelligence Techniques for Defensive Deception: A Review, Recent Advances, Open Problems and Future Directions." Sensors, MDPI, doi.org, 11 Mar. 2022, https://doi.org/10.3390/s22062194.
[5] Helmer, G., Wong, J. S. K., Honavar, V. and Miller, L., "Automated Discovery of Concise Predictive Rules for Intrusion Detection", Journal of Systems and Software, vol. 60, no. 3, pp. 165-175.
[6] Yuan, Y., Huo, L. and Hogrefe, D., "Two Layers Multi-Class Detection Method for Network Intrusion Detection System", IEEE Symposium on Computers and Communications (ISCC), pp. 767-772.
[7] Koli, M. S. and Chavan, M. K., "An Advanced Method for Detection of Botnet Traffic using Intrusion Detection System", IEEE International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 481-485.
[8] Chakir, E. M., Moughit, M. and Khamlichi, Y. I., "An Efficient Method for Evaluating Alerts of Intrusion Detection Systems", IEEE International Conference on Wireless Technologies, Embedded and Intelligent Systems (WITS), pp. 1-6.
[9] Santoso, B. I., Idrus, M. R. S. and Gunawan, I. P., "Designing Network Intrusion and Detection System using Signature-Based Method for Protecting OpenStack Private Cloud", IEEE International Annual Engineering Seminar (InAES), pp. 61-66.
[10] Gopalakrishnan, S., Ebenezer Abishek, B., Vijayalakshmi, A. and Rajendran, V., 2021. Analysis And Diagnosis Using Deep-Learning Algorithm On Erythemato-Squamous Disease. doi:10.14445/22315381/IJETT-V69I3P210.
[11] Gopalakrishnan, S., Ebenezer Abishek, B., Vijayalakshmi, A. and Rajendran, V., 2021. An MS-ROI based Detection and Segmentation of Erythemato-Squamous Disease. doi:10.14445/22315381/IJETT-V69I8P231.