
Journal of Physics: Conference Series

PAPER • OPEN ACCESS

To cite this article: V Kumar et al 2022 J. Phys.: Conf. Ser. 2312 012082

ICE4CT2021 IOP Publishing
Journal of Physics: Conference Series 2312 (2022) 012082 doi:10.1088/1742-6596/2312/1/012082

Boosting Algorithms to Identify Distributed Denial-of-Service Attacks
V Kumar¹, A Kumar¹, S Garg¹ and S R Payyavula²
¹ Birla Institute of Technology, Mesra, Ranchi
² Samsung Research India, Bangalore

Email: [email protected]

Abstract. In the current pandemic situation, much work became automated using Internet of
Things (IoT) devices. The security of IoT devices is a major issue because they can easily be
hacked by third parties. Attackers cause interruptions in vital ongoing operations through these
hacked devices. Thus, the demand for an efficient attack identification system has increased in
the last few years. The present research aims to identify modern distributed denial-of-service
(DDoS) attacks. To provide a solution to the problem of DDoS attacks, an openly available
dataset (CICDDoS 2019) has recently been introduced and is used here. The attacks present in the
dataset were identified using two machine learning methods, i.e. the light gradient boosting
method (LGBM) and extreme gradient boosting (XGBoost). These methods were selected because of
their superior prediction ability on high volumes of data in less time than other methods
require. The accuracies achieved by LGBM and XGBoost were 94.88% and 94.89% in 30 and 229
seconds, respectively.

1. Introduction
Distributed denial-of-service (DDoS) attacks have become an unavoidable security issue these days [1].
DDoS attacks obstruct devices involved in communication networks. The devices may be completely
blocked or partially stop working while under attack. The first DDoS attack, which immobilised the
oldest internet service provider, Panix, for several days, was discovered in 1996 [2]. Attacks became
common after a few years, and according to the Cisco Annual Internet Report, their number will increase
to up to 15 million by 2023 [3]. Thus, a systematic solution to DDoS attacks is highly recommended.
The execution of DDoS attacks is shown in Figure 1. The attacker converts different vulnerable devices
into bots. These bots send voluminous requests to the target server, which results in network congestion,
causing all machines connected to the server to stop responding.

Figure 1. DDoS Attack

DDoS attacks have been identified by several strategy-based methods, statistical methods, and machine
learning (ML) methods [4]. The first strategy-based method worked by blacklisting IP addresses [5].

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd

This approach requires the IP addresses to be updated regularly, and it fails under IP spoofing.
The second approach used virtual machine (VM) technology installed at each sensor, collecting the
sensor data and applying the Dempster-Shafer theory at the front end [6]. This approach produces
fewer false positives, but it cannot detect unknown attacks. According to Arbor Networks [7], DDoS
attack activity during the first quarter of 2015 shows that attack duration has shortened, but the
impact is very high, with a size of 1.25 Gbps. In 2015, the majority of these attacks leveraged
reflection amplification techniques that use the Simple Service Discovery Protocol (SSDP) with
26 Gbps and the Network Time Protocol (NTP) with 51 Gbps.
The statistical approaches that were applied to identify DDoS attacks were discussed in [8]. Parametric
[9] and non-parametric [10] methods were applied. Multivariate correlation analyses were used to model
traffic behaviour and spectral analysis was used to handle large data in the application of the parametric
statistical method. A Markov chain, regression analysis and time series analysis were done to predict
future attacks when applying the non-parametric statistical method.
The volume of DDoS attacks is growing very fast [8]; thus, computer modelling of these attacks has
been strongly recommended. Researchers started applying machine learning (ML) to model and identify
attack patterns. The advantage of ML models is that they learn from data and predict with high
accuracy [11]. Popular ML models used to identify DDoS attacks are support vector machines (SVM),
decision tree (DT), random forest (RF), K-nearest neighbour (KNN), Naïve Bayes (NB) and neural
network (NN) [12]. The ML methods applied in various research are shown in Table 1.

Table 1. ML models applied for DDoS in recent years

| Ref, year | Dataset | Methods | Attack type | Max accuracy |
|---|---|---|---|---|
| [12], 2019 | CAIDA UCSD Dataset 2008-11-21 | SVM, NB, ANN, DT | Fuzzers, Backdoor, DoS, DDoS, Exploits, Shellcode, Worms, Generic, Reconnaissance and Analysis | 94.43% |
| [13], 2018 | KDD Cup '99 | NN, SVM, DT, RF, NB and KNN | Denial of Service | 94% F1-Score |
| [14], 2020 | CICIDS2017 | Rough set, Convolution Neural Network | PortScan (98.08%), DDoS (87.66%) | 98.08% |
| [15], 2020 | Data collected from home IoT devices | LGBM, Deep Neural Network (DNN), SVM | TCP, UDP, ICMP | 0.992±0.014 area under the precision-recall curve (AUPRC) |
| [16], 2020 | Collected data (SDNTrafficsDS), KDD Cup '99 | SVM | DDoS attack | 94% F1-Score |
| [17], 2020 | CICIDS2017 | RF, Bayes Net (BN), NB and J48 classifier | DoS/DDoS | 99.87% |
| [18], 2020 | CIC-DDoS2019 | ResNet | Syn, TFTP, UDP Lag, DNS, LDAP, MSSQL, NetBIOS, NTP, SNMP, SSDP, UDP, Normal traffic | 87% |
| [19], 2021 | CIC-DDoS2019 | J48 Classifier | PortMap (99.95%), SYN (99.99%), NetBIOS (99.99%), MSSQL (99.99%), LDAP (99.99%) | 99.99% |
| [20], 2021 | CIC-DDoS2019* | AdaBoost (AB), Linear Discriminant Analysis (LDA), Logistic Regression (LR) and RF | Legitimate, Non-Legitimate (DNS, LDAP, MSSQL, NetBIOS, NTP, SNMP, SSDP, UDP, SYN, TFTP) | 98.78% |

Since CIC-DDoS2019* is the most recent dataset available on the Web, the current research is
conducted on it, using efficient ML methods, namely the light gradient boosting method (LGBM) and
extreme gradient boosting (XGBoost) [23]. As the volume of DDoS attack data is growing fast,
adaptive boosting is not considered a suitable solution to this problem because it is slow and
sensitive to noise [21]. LGBM utilises less memory and makes predictions with high accuracy.
XGBoost achieves high accuracy because it utilises both L1 and L2 regularisation, and it is fast
because it implements parallel processing. The workflow of the current research is shown in
Figure 2.

[Figure 2: flowchart with the stages Dataset (2019 CSV files), Clean data, Encoding of labels,
Train data (80%), Test data (20%), Cross-validation, Algorithm comparison and Predicted output]

Figure 2. Workflow of research

1.1 Our Contribution

1. All attacks present in the dataset have been included in attack identification, which is not
the case in most of the research available in the literature.
2. Multiclass classification was performed in this study.
3. The ML methods LGBM and XGBoost have been applied, with high accuracy, to minimise the time
needed for classification. Separate experiments have been conducted for reflection and
exploitation attacks.
4. ML models have been trained on the training-day and test-day data individually, and models
have also been trained with training-day data and tested with test-day data (adversarial case).
The flow of the research work is described in the following: section 2 provides a description of the
boosting algorithms applied; section 3 elaborates on the experiments conducted and discusses the
differences between this research and previous works; and section 4 presents the conclusion of this
research.

2. Boosting Algorithms
Boosting algorithms, introduced in the 1999 overview [21], were designed with the aim of improving
the accuracy of ML algorithms. The variants used here are tree-based ensemble algorithms that can
be applied to data that do not follow any particular distribution. They are designed to handle
mixed data types. Different gradient boosting algorithms are discussed in [22]. The XGBoost and
LGBM methods were used in this research because of their strong learning capability and fast
processing. Descriptions of the methods are presented in this section.
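
As an illustration of the boosting idea described above (each new model corrects the errors of the models before it), the following minimal Python sketch fits one-split regression stumps to the residuals of the running prediction. It is a toy under stated assumptions, not the implementation used in this work: the squared-error loss, the learning rate, the data and all function names are chosen here for illustration only.

```python
from statistics import fmean


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = fmean(left), fmean(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = fmean(ys)
    preds = [base] * len(xs)
    stumps, mse_history = [], []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of the loss
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
        mse_history.append(fmean([(y - p) ** 2 for y, p in zip(ys, preds)]))
    return base, stumps, mse_history


# Made-up one-dimensional data with a jump between x = 4 and x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
base, stumps, mse_history = boost(xs, ys)
```

The training error recorded in `mse_history` does not increase from one round to the next, which is the sequential error-minimising behaviour that gradient boosting relies on.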


2.1 XGBoost
XGBoost [23] finds the best split in trees using histograms. A histogram groups the values of a
feature into a number of bins. Thus, in histogram-based methods the splitting is done based on
bins rather than on individual feature values. The method executes faster because features are
binned before the construction of the tree. The evolution of the XGBoost method is shown in
Figure 3.

[Figure 3: three stages. Boosting: sequential models minimizing errors from the previous model.
Gradient boosting: gradient descent algorithm applied to minimize error in sequential models.
XGBoost (extreme gradient boosting): optimized gradient boosting using parallel processing,
handling of missing values, tree pruning and regularization.]

Figure 3. Evolution of XGBoost
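
The histogram-based split search described above can be sketched in a few lines of Python. This is a simplified illustration, not XGBoost's actual code: it assumes equal-width bins, a squared-error loss (so every hessian equals 1) and no regularisation term, and the function names are hypothetical.

```python
def build_histogram(values, gradients, n_bins):
    """Accumulate gradient sums and counts per equal-width bin of one feature."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        grad_sum[b] += g
        count[b] += 1
    edges = [lo + width * (i + 1) for i in range(n_bins)]
    return grad_sum, count, edges


def best_split(values, gradients, n_bins=4):
    """Scan the bin boundaries (not the raw values) for the best split."""
    grad_sum, count, edges = build_histogram(values, gradients, n_bins)
    total_g, total_n = sum(grad_sum), sum(count)
    best_gain, best_edge = 0.0, None
    left_g, left_n = 0.0, 0
    for b in range(n_bins - 1):  # the last boundary would leave the right side empty
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        # Gain with unit hessians: G_L^2/n_L + G_R^2/n_R - G^2/n
        gain = left_g**2 / left_n + right_g**2 / right_n - total_g**2 / total_n
        if gain > best_gain:
            best_gain, best_edge = gain, edges[b]
    return best_edge, best_gain


# Made-up feature values and gradients forming two clearly separated groups.
values = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]
gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
edge, gain = best_split(values, gradients)
```

Only `n_bins - 1` candidate thresholds are evaluated per feature, however many instances there are, which is where the speed-up from binning comes from.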

2.2 Light Gradient Boosting Method (LGBM)

LGBM [22] is also a gradient boosting variant based on decision tree algorithms. LGBM reduces
memory usage and improves efficiency. LGBM differs from other gradient boosting frameworks by
expanding vertically, i.e. it grows leaf-wise. The other algorithms, on the other hand, expand
horizontally, i.e. level by level. At each step, LGBM splits the leaf that gives the largest
reduction in error. The growth in the LGBM tree is shown in Figure 4.

Figure 4. Growth of LGBM
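
The difference between level-wise and leaf-wise growth can be illustrated with a toy example. In the sketch below the candidate splits and their loss reductions are made-up numbers, and the function names are hypothetical; it only shows the order in which nodes are split under each strategy, not how LGBM computes the gains.

```python
import heapq
from collections import deque

# Hypothetical candidate splits: node -> (loss reduction, child nodes).
# The gains are invented for illustration only.
CANDIDATES = {
    "root": (10.0, ["A", "B"]),
    "A": (1.0, ["A1", "A2"]),
    "B": (6.0, ["B1", "B2"]),
    "A1": (0.2, []),
    "A2": (0.1, []),
    "B1": (4.0, []),
    "B2": (0.5, []),
}


def level_wise(max_splits):
    """Split nodes in breadth-first (level-by-level) order."""
    order, queue = [], deque(["root"])
    while queue and len(order) < max_splits:
        node = queue.popleft()
        order.append(node)
        queue.extend(CANDIDATES[node][1])
    return order


def leaf_wise(max_splits):
    """Always split the current leaf with the largest loss reduction."""
    order = []
    heap = [(-CANDIDATES["root"][0], "root")]  # max-heap via negated gain
    while heap and len(order) < max_splits:
        _, node = heapq.heappop(heap)
        order.append(node)
        for child in CANDIDATES[node][1]:
            heapq.heappush(heap, (-CANDIDATES[child][0], child))
    return order


def total_gain(order):
    return sum(CANDIDATES[node][0] for node in order)


level_order = level_wise(3)  # ['root', 'A', 'B']
leaf_order = leaf_wise(3)    # ['root', 'B', 'B1']
```

With the same budget of three splits, the leaf-wise order collects a larger total loss reduction in this toy tree (20.0 against 17.0), at the cost of a deeper, less balanced tree.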

3. Experiments and Discussions

The experiments in the current article were conducted using the CIC-DDoS 2019 [24] dataset. All
experiments were conducted on the Windows 10 operating system, i7 -10750 [email protected] GHz with
16 GB of RAM. The programming language used was Python 3.7, and the code was executed in Jupyter
Notebook.

3.1 Dataset description

The CIC-DDoS 2019 dataset consists of the DDoS attacks shown in Table 2. The 12 DDoS attacks on
the training day are NTP, DNS, LDAP, MSSQL, NetBIOS, SNMP, SSDP, UDP, UDP-Lag, WebDDoS, SYN Flood
(SYN) and TFTP, and the seven attacks on the test day are PortMap, NetBIOS, LDAP, MSSQL, UDP,
UDP-Lag and SYN.


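Table 2. Types and timings of different attacks

| Days | Attacks | Attack times |
|---|---|---|
| Testing set | PortMap | 09:43 – 09:51 |
| Testing set | NetBIOS | 10:00 – 10:09 |
| Testing set | LDAP | 10:21 – 10:30 |
| Testing set | MSSQL | 10:33 – 10:42 |
| Testing set | UDP | 10:53 – 11:03 |
| Testing set | UDP-Lag | 11:14 – 11:24 |
| Testing set | SYN | 11:28 – 17:35 |
| Training set | NTP | 10:35 – 10:45 |
| Training set | DNS | 10:52 – 11:05 |
| Training set | LDAP | 11:22 – 11:32 |
| Training set | MSSQL | 11:36 – 11:45 |
| Training set | NetBIOS | 11:50 – 12:00 |
| Training set | SNMP | 12:12 – 12:23 |
| Training set | SSDP | 12:27 – 12:37 |
| Training set | UDP | 12:45 – 13:09 |
| Training set | UDP-Lag | 13:11 – 13:15 |
| Training set | WebDDoS (ARME) | 13:18 – 13:29 |
| Training set | SYN | 13:29 – 13:34 |
| Training set | TFTP | 13:35 – 17:15 |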

All CSV files present in both attack categories consist of 88 attributes collected by CICFlowMeter [25].

3.2 Pre-Processing Steps:

All CSV files presented in the dataset were pre-processed by the following steps:
1. Removal of attributes with object instances (e.g. source IP, destination IP, timestamp, etc.)
2. Removing instances with Label BENIGN
3. Encoding the Labels to integers
4. Reduction of file size to reach an executable number of instances
5. Removal of columns with standard deviation = 0 and Correlation = 0
6. Normalization of independent variables
x = (x - mean(x)) / std(x)
x: independent variable
7. Verification of standard deviation & correlation
8. Removal of unnamed column
9. Dividing dataset into train and test data (80:20 ratio)
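
A simplified version of several of the steps above (1, 2, 3, 5, 6 and 9) can be sketched in plain Python as follows. This is an illustration under assumptions rather than the code used in the experiments: the rows and column values are made up, labels are encoded in alphabetical order, the population standard deviation is used, the correlation check of step 5 is omitted, and the function name `preprocess` is hypothetical.

```python
import random
from statistics import fmean, pstdev


def preprocess(rows, label_key="Label", seed=0):
    """Apply a simplified subset of the pre-processing steps to a list of dict rows."""
    # Step 2: remove instances labelled BENIGN.
    rows = [r for r in rows if r[label_key] != "BENIGN"]
    # Step 1: keep numeric attributes only (drops IP addresses, timestamps, ...).
    numeric = [k for k, v in rows[0].items()
               if k != label_key and isinstance(v, (int, float))]
    # Step 3: encode labels as integers (assumed order: sorted label names).
    names = sorted({r[label_key] for r in rows})
    codes = {name: i + 1 for i, name in enumerate(names)}
    # Step 5 (standard deviation part only): drop constant columns.
    columns = {k: [r[k] for r in rows] for k in numeric}
    kept = [k for k in numeric if pstdev(columns[k]) > 0]
    # Step 6: z-score normalisation, x = (x - mean(x)) / std(x).
    stats = {k: (fmean(columns[k]), pstdev(columns[k])) for k in kept}
    features = [[(r[k] - stats[k][0]) / stats[k][1] for k in kept] for r in rows]
    labels = [codes[r[label_key]] for r in rows]
    # Step 9: shuffle and split into train and test data (80:20 ratio).
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = int(0.8 * len(order))
    train = [(features[i], labels[i]) for i in order[:cut]]
    test = [(features[i], labels[i]) for i in order[cut:]]
    return kept, codes, train, test


# Made-up rows standing in for CICFlowMeter output.
rows = [{"Source IP": "172.16.0.5", "Flow Duration": 100 + i,
         "Bwd PSH Flags": 0, "Label": "NTP" if i % 2 else "SYN"}
        for i in range(10)]
rows.append({"Source IP": "192.168.50.1", "Flow Duration": 5,
             "Bwd PSH Flags": 0, "Label": "BENIGN"})
kept, codes, train, test = preprocess(rows)
```

In this toy input the constant column is dropped, the single BENIGN row is removed, and the remaining ten instances are split into eight training and two test instances.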

The attributes selected and deleted after pre-processing are shown in Figure 5(a) and (b), respectively.


(a) Total 65 attributes selected after preprocessing

(b) List of deleted attributes

Figure 5. List of attributes after pre-processing
3.3 Experiment I:
Experiment I consists of the ML algorithms applied to the training-day attacks, which were broadly
categorised into reflection and exploitation attack classes. The reflection class consists of nine
attacks: NTP, DNS, LDAP, MSSQL, NetBIOS, SNMP, SSDP, WebDDoS and TFTP. (Note that WebDDoS was not
included in the accuracy computation, which leaves eight attacks, because it has only 409
instances, about 0.11% of the smallest CSV in the training data, UDP-Lag with 370,607 instances.)
The exploitation class consists of UDP, UDP-Lag and SYN. Each attack in each class has its own
comma-separated value (CSV) file. The sizes of the CSV files in terms of number of instances are
shown in Figure 6.


[Figure 6: chart of the number of instances in each CSV. TFTP: 1,048,574; SYN Flood: 1,582,681;
NTP: 1,217,007; UDP Flood: 1,470,742; UDP-Lag: 370,607; NetBIOS: 4,094,986; MSSQL: 1,844,905;
SSDP: 2,611,374; SNMP: 5,161,377; LDAP: 2,181,542; DNS: 5,074,413]

Figure 6. Attacks under training and their number of instances in each CSV

With the available hardware resources, the ML models could be executed on approximately 1,500,000
instances. Thus, a dataset was reconstructed by taking every 9th instance in MSSQL, 10th instance
in LDAP, 20th instance in NetBIOS, 10th instance in SSDP, 25th instance in DNS, 25th instance in
SNMP, 6th instance in NTP and 5th instance in TFTP for the reflection attacks; for the
exploitation attacks, every 3rd instance of SYN, every 3rd instance of UDP flood and UDP-Lag were
sampled out of the 500,000 exploitation attacks. These values were decided by checking the number
of instances present in the original files. The different attacks in this experiment were encoded
as follows: LDAP – 1, DNS – 2, MSSQL – 3, NetBIOS – 4, NTP – 5, SNMP – 6, SSDP – 7, TFTP – 10,
UDP – 8, SYN – 9 and UDP-Lag – 11.
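
The "every k-th instance" reduction and the label encoding can be sketched as below. The encoding is copied from the text; the helper name `take_every_kth` and the choice to keep the k-th, 2k-th, ... instance (rather than starting from the first one) are assumptions made for illustration.

```python
from itertools import islice

# Label encoding used in Experiment I, copied from the text.
ENCODING = {"LDAP": 1, "DNS": 2, "MSSQL": 3, "NetBIOS": 4, "NTP": 5, "SNMP": 6,
            "SSDP": 7, "UDP": 8, "SYN": 9, "TFTP": 10, "UDP-Lag": 11}


def take_every_kth(instances, k):
    """Keep the k-th, 2k-th, 3k-th, ... instance of an iterable."""
    return list(islice(instances, k - 1, None, k))


# Stand-in "file" of 27 numbered instances, reduced with k = 9.
sample = take_every_kth(range(1, 28), 9)  # keeps instances 9, 18 and 27

# A file with n instances keeps n // k of them, e.g. the MSSQL CSV of Figure 6.
mssql_kept = 1_844_905 // 9
```

Because the helper works on any iterable, a large CSV could be streamed row by row through it instead of being loaded into memory first.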

The reconstructed dataset was pre-processed using the steps mentioned in section 3.2. The files of
the different attacks under the reflection and exploitation classes were merged, and the attacks
in each class were identified by two ML algorithms, i.e. LGBM and XGBoost. XGBoost was found to be
expensive in terms of time. LGBM offered the best trade-off, with comparable accuracy in the least
time, as shown in Table 3.

Table 3. Accuracies obtained after applying ML methods for Experiment I

| Methods | Reflection attack (1,500,000 instances): Accuracy (%) | Reflection attack: Time (min) | Exploitation attack (1,500,000 instances) with balancing: Accuracy (%) | Exploitation attack: Time (min) |
|---|---|---|---|---|
| XGBoost | 85.76 | 9.56 | 73.08 | 7 |
| LGBM | 83.7 | 4:56 | 73.00 | 2 |

3.4 Experiment II:

This experiment was similar to Experiment I. On the test day, seven attacks were taken for
identification. The details of the CSV files are shown in Figure 7. Note that the UDP-Lag portion
is not visible in Figure 7 because of its very small number of instances (0.05%).


[Figure 7: chart of the number of instances in each CSV. UDP-Flood: 3,754,680; UDP-Lag: 1,873;
LDAP: 1,905,191; MSSQL: 5,763,061; SYN Flood: 4,284,751; PortMap: 186,960; NetBIOS: 3,454,578]

Figure 7. Attacks under test and their number of instances in each CSV

However, balancing of the attack types was required in this experiment because there were far
fewer instances of PortMap and UDP-Lag attacks than of the other attack types. This balancing was
accomplished by taking every 25th instance in MSSQL, 10th instance in LDAP, 20th instance in SYN,
15th instance in NetBIOS, 15th instance in UDP and all instances of PortMap. The number of UDP-Lag
instances was only 1,873; thus, it was oversampled to 200,000. The different attacks in this
experiment were encoded as follows: LDAP – 1, MSSQL – 2, NetBIOS – 3, PortMap – 4, SYN – 5,
UDP – 6, UDP-Lag – 7. The accuracies obtained in Experiment II are shown in Table 4.
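
The oversampling of UDP-Lag from 1,873 to 200,000 instances can be sketched as follows. The text does not state which oversampling technique was used, so random duplication with replacement is assumed here purely for illustration, and the function name `oversample` is hypothetical.

```python
import random


def oversample(instances, target_size, seed=0):
    """Randomly duplicate instances (with replacement) until target_size is reached."""
    instances = list(instances)
    if target_size <= len(instances):
        return instances
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    extra = rng.choices(instances, k=target_size - len(instances))
    return instances + extra


udp_lag = list(range(1873))  # stand-ins for the 1,873 UDP-Lag instances
balanced = oversample(udp_lag, 200_000)
```

Every original instance is kept once and the remainder is filled with random copies, so each original row appears about 107 times on average. Such heavy duplication adds no new information about the class, which is consistent with the weak UDP-Lag results reported later.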

Table 4. Accuracies obtained after applying ML methods for Experiment II

| Methods | Test dataset with balancing: Accuracy (%) | Time (min) |
|---|---|---|
| XGBoost | 87.8 | 47 |
| LGBM | 87.4 | 3 |

3.5 Experiment III:

In this experiment, the attacks common to both days (MSSQL, LDAP, SYN, NetBIOS, UDP and UDP-Lag)
were used: their training-day instances were taken as training data, and their test-day instances
were taken as test data. Since DNS, NTP, SNMP, SSDP, TFTP and PortMap are not common to the
training and test days, they were excluded. The different attacks in this experiment were, for
both training and test day, encoded as follows: LDAP – 1, MSSQL – 2, NetBIOS – 3, SYN – 5,
UDP – 6 and UDP-Lag – 7. The number of training and test day instances was reduced or increased
to achieve an 80:20 ratio, which is equal to the ratios in Experiment I and Experiment II. The
accuracies obtained in this experiment are shown in Table 5.

Table 5. Accuracies obtained after Experiment III (1,429,785 train instances and 314,579 test instances)

| Methods | Without cross-validation: Accuracy (%) | Without cross-validation: Time (min) | With cross-validation: Accuracy (%) | With cross-validation: Time (min) |
|---|---|---|---|---|
| XGBoost | 34 | 0.51 | 99.1 | 1:05 |
| LGBM | 55 | 6:24 | 99.2 | 43:49 |


It was discovered that accuracy was quite poor without cross-validation but very high with cross-
validation. Thus, a classification report for LGBM was generated, shown in Figure 8.

Figure 8. Classification report with UDP-Lag
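
The conclusions state that 10-fold cross-validation was applied. How k-fold cross-validation partitions the data can be sketched as follows; the function name is hypothetical, and contiguous, unshuffled folds are an assumption made to keep the example short (in practice the instances would normally be shuffled or stratified first).

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # The first n_samples % k folds get one extra instance each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(25, k=10))
first_train, first_validation = folds[0]
```

Each instance is used for validation exactly once and for training in the other nine folds, so the reported score is an average over ten train/validate rounds rather than a single split.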

According to the classification report, the precision, recall and F1 score for UDP-Lag were significantly
lower than for others. In addition, the number of UDP-Lag instances in the test dataset was just 1,873;
this is approximately 0.05% of the training data shown in Figure 9.
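
For reference, the per-class precision, recall and F1 score that make up a classification report can be computed as in the following sketch. The labels in the example are made up (1 and 7 follow the encoding LDAP – 1 and UDP-Lag – 7 used in this experiment); they are not results from the paper.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the instance was not p
            fn[t] += 1  # true class t was missed
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        both = precision + recall
        f1 = 2 * precision * recall / both if both else 0.0
        report[label] = (precision, recall, f1)
    return report


# Made-up labels: 1 = LDAP, 7 = UDP-Lag.
y_true = [1, 1, 1, 1, 7, 7]
y_pred = [1, 1, 1, 7, 1, 7]
report = classification_report(y_true, y_pred)
```

A class with few test instances, such as UDP-Lag, can have low precision, recall and F1 even when overall accuracy is high, because overall accuracy is dominated by the large classes.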

Even after up-sampling, the expected accuracies were not attained, due to the small amount of test data.
As a result, a new experiment was run without UDP-Lag; the accuracies obtained without UDP-Lag are
presented in Table 6, and the classification report is shown in Figure 9.

Table 6. Accuracies obtained in Experiment III without the UDP-Lag attack (1,429,785 train instances and 314,579 test instances)

| Methods | Without cross-validation: Accuracy (%) | Without cross-validation: Time (min) | With cross-validation: Accuracy (%) | With cross-validation: Time (min) |
|---|---|---|---|---|
| XGBoost | 94.89 | 3:49 | 99.1 | 31:40 |
| LGBM | 94.88 | 0:30 | 99.2 | 0:49 |

Figure 9. Classification report without UDP-Lag
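
The times in Table 6 are written as minutes:seconds. A small conversion, sketched below with a hypothetical helper name, shows that the values without cross-validation correspond to the 30 and 229 seconds quoted in the abstract.

```python
def to_seconds(min_sec):
    """Convert a 'minutes:seconds' string such as '3:49' to seconds."""
    minutes, seconds = min_sec.split(":")
    return int(minutes) * 60 + int(seconds)


lgbm_seconds = to_seconds("0:30")     # LGBM without cross-validation
xgboost_seconds = to_seconds("3:49")  # XGBoost without cross-validation
```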

The above experiments show the suitability of using LGBM and XGBoost methods for the CIC-DDoS
2019 dataset. Both methods show more than 80% accuracy in less time than traditional ML methods in
most of the experiments. LGBM executed the task in less than 5 minutes in most of the experiments,
whereas traditional ML methods take 45-75 minutes for the same size of data (shown in experiment II
in table 4 under Sec. 3.4). A comparison with existing research on the CIC-DDoS 2019 dataset is
presented in Table 7.


Table 7. Comparison of present work with existing methods

| Ref, year | Methods | Attacks | Train vs test | Accuracy | Time |
|---|---|---|---|---|---|
| [18], 2020 | ResNet | Syn, TFTP, UDP Lag, DNS, LDAP, MSSQL, NetBIOS, NTP, SNMP, SSDP, UDP, Normal traffic | No | 87% | - |
| [19], 2021 | J48 Classifier | PortMap, SYN, NetBIOS, MSSQL, LDAP | No | 99.99% | - |
| [20], 2021 | AdaBoost (AB), Linear Discriminant Analysis (LDA), Logistic Regression (LR) and RF | Binary classification | No | 98.87% | - |
| [26], 2021 | Deep Neural Network | Binary classification | No | 99.99% | - |
| [27], 2021 | Unsupervised Learning (Encoder Decoder Model) | | | | |
| Present work | LGBM, XGBoost | Multiclass classification | Yes | 99.2% | 0:49 (min:s) |

* "-" means that the time of execution is not provided in the article.

In Table 7 it is observed that all previous research focussed on accuracy; execution time was not
reported. In addition, no previous study examined training vs test cases. In the present study,
the models were trained with day 1 attacks in the dataset (presented as training data) and tested
with day 2 attacks (presented as test data in the dataset). The LGBM model presented in the
current research predicted the attacks with very high accuracy, 94.88% without cross-validation
and 99.2% with cross-validation, in 0:30 and 0:49 (min:s), respectively.

A ResNet pretrained network, which has a very complex neural network architecture particularly suited
to image processing applications, was used in [18]. ResNet generally takes a long time to converge due
to its complex architecture, and it requires special hardware to execute [28]. J48 Classifier was used in
[19] for the identification of individual attacks, providing very high accuracy. Only five attacks
were considered in that work. Binary classification was done in [20, 26], which produces higher
accuracy levels than multiclass classification [29]. Furthermore, AdaBoost and deep learning
techniques are generally slower than LGBM and XGBoost [30].
Thus, a fast and efficient solution to DDoS attacks is presented in the current research, with the
novelty that the models have been tested on unseen data.

4. Conclusions
With the increased volume of DDoS attacks, the problem of attack identification is getting more
complex by the day. The presented work proposes a time-efficient solution for reflection,
exploitation and test data. As a special case, five DDoS attacks (MSSQL, LDAP, SYN, NetBIOS and
UDP) present in the training data were recognised from the corresponding attacks in the test data
with accuracies of 94.88% and 94.89% by LGBM and XGBoost in 0:30 and 3:49 (min:s), respectively.
ML models are generally prone to overfitting. To handle that situation, 10-fold cross-validation
was applied, which improved the accuracy of LGBM to 99.2% in 0:49 (min:s).
The present work can be applied to real data collected from IoT devices. A limitation that was found in
the present work is that all instances present in the dataset cannot be processed, even with the use of
high-end machines.

Acknowledgement: Our sincere thanks to Samsung R&D, Bangalore and Birla Institute of Technology,
Mesra, Ranchi, for providing us with the opportunity to work on the SRIB prism project. We would like
to thank Mr Prem Abhishek and Mr Bimal Gupta, Samsung R&D, Bangalore for their valuable
suggestions and support, enabling us to carry out this research efficiently.

References:
[1] Dalmazo BL, Marques JA, Costa LR, Bonfim MS, Carvalho RN, da Silva AS, Fernandes S,
Bordim JL, Alchieri E, Schaeffer‐Filho A, Paschoal Gaspary L. A systematic review on
distributed denial of service attack defense mechanisms in programmable networks.
International Journal of Network Management. 2021 May 24:e2163.
[2] Wani S, Imthiyas M, Almohamedh H, M Alhamed K, Almotairi S, Gulzar Y. Distributed Denial
of Service (DDoS) Mitigation Using Blockchain—A Comprehensive Insight. Symmetry. 2021
Feb;13(2):227.
[3] Malathy B, Krieshaanthiny N, Chitra B. Cloud-Based Enhanced Storage System Using Android
Technology. INTI JOURNAL. 2021;2021(01).
[4] Chen YW, Sheu JP, Kuo YC, Van Cuong N. Design and implementation of IoT DDoS attacks
detection system based on machine learning. In 2020 European Conference on Networks and
Communications (EuCNC) 2020 Jun 15 (pp. 122-127). IEEE.
[5] Ramachandran A, Feamster N, Vempala S. Filtering spam with behavioral blacklisting. In
Proceedings of the 14th ACM conference on computer and communications security 2007 Oct
28 (pp. 342-351).
[6] Bakshi A, Dujodwala YB. Securing cloud from ddos attacks using intrusion detection system
in virtual machine. In 2010 Second International Conference on Communication Software and
Networks 2010 Feb 26 (pp. 260-264). IEEE.
[7] Arbor networks detects largest ever DDoS attack in Q1 2015 DDoS report. In: Arbor Networks
(2015). http://www.arbornetworks.com/arbor-networks-detects-largest-ever-ddosattack-in-q1-2015-ddos-report
[8] Khalaf BA, Mostafa SA, Mustapha A, Mohammed MA, Abduallah WM. Comprehensive
review of artificial intelligence and statistical approaches in distributed denial of service attack
and defence methods. IEEE Access. 2019 Apr 16;7:51691-713.
[9] Tan Z, Jamdagni A, He X, Nanda P, Liu RP. A system for denial-of-service attack detection
based on multivariate correlation analysis. IEEE transactions on parallel and distributed
systems. 2013 May 23;25(2):447-56.
[10] Saranya R, Kannan SS, Sundaram SM. Integrated quantum flow and hidden Markov chain
approach for resisting DDoS attack and C-Worm. Cluster Computing. 2019 Nov;22(6):14299-
310.
[11] Attaran M, Deb P. Machine learning: the new big thing for competitive advantage. International
Journal of Knowledge Engineering and Data Mining. 2018;5(4):277-305.


[12] Tuan TA, Long HV, Son LH, Kumar R, Priyadarshini I, Son NT. Performance evaluation of
Botnet DDoS attack detection using machine learning. Evolutionary Intelligence. 2020
Jun;13(2):283-94.
[13] Divekar A, Parekh M, Savla V, Mishra R, Shirole M. Benchmarking datasets for anomaly-based
network intrusion detection: KDD CUP 99 alternatives. In 2018 IEEE 3rd International
Conference on Computing, Communication and Security (ICCCS) 2018 Oct 25 (pp. 1-8). IEEE.
[14] Prasad M, Tripathi S, Dahal K. An efficient feature selection based Bayesian and Rough set
approach for intrusion detection. Applied Soft Computing. 2020 Feb 1;87:105980.
[15] Meidan Y, Sachidananda V, Peng H, Sagron R, Elovici Y, Shabtai A. A novel approach for
detecting vulnerable IoT devices connected behind a home NAT. Computers & Security. 2020
Oct 1;97:101968.
[16] Oo MM, Kamolphiwong S, Kamolphiwong T, Vasupongayya S. Analysis of Features Dataset
for DDoS Detection by using ASVM Method on Software Defined Networking. International
Journal of Networked and Distributed Computing. 2020 Apr;8(2):86-93.
[17] Stiawan D, Idris MY, Bamhdi AM, Budiarto R. CICIDS-2017 dataset feature analysis with
information gain for anomaly detection. IEEE Access. 2020 Jul 16;8:132911-21.
[18] Hussain F, Abbas SG, Husnain M, Fayyaz UU, Shahzad F, Shah GA. IoT DoS and DDoS Attack
Detection using ResNet. In 2020 IEEE 23rd International Multitopic Conference (INMIC) 2020
Nov 5 (pp. 1-6). IEEE.
[19] Kshirsagar D, Kumar S. A feature reduction based reflected and exploited DDoS attacks
detection system. Journal of Ambient Intelligence and Humanized Computing. 2021 Jan 28:1-
3.
[20] Maranhão JP, da Costa JP, Javidi E, de Andrade CA, de Sousa Jr RT. Tensor based framework
for Distributed Denial of Service attack detection. Journal of Network and Computer
Applications. 2021 Jan 15;174:102894.
[21] Schapire RE. A brief introduction to boosting. In IJCAI 1999 Jul 31 (Vol. 99, pp. 1401-1406).
[22] Bentéjac C, Csörgő A, Martínez-Muñoz G. A comparative analysis of gradient boosting
algorithms. Artificial Intelligence Review. 2021 Mar;54(3):1937-67.
[23] Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H. Xgboost: extreme gradient boosting.
R package version 0.4-2. 2015 Aug 1;1(4):1-4.
[24] Sharafaldin I, Lashkari AH, Hakak S, Ghorbani AA. Developing realistic distributed denial of
service (DDoS) attack dataset and taxonomy. In 2019 International Carnahan Conference on
Security Technology (ICCST) 2019 Oct 1 (pp. 1-8). IEEE.
[25] Lashkari AH, Zang Y, Owhuo G, Mamun MS, Gil GD. CICFlowMeter.
[26] Cil AE, Yildiz K, Buldu A. Detection of DDoS attacks with feed forward based deep neural
network model. Expert Systems with Applications. 2021 May 1;169:114520.
[27] Odumuyiwa V, Alabi R. DDOS Detection on Internet of Things Using Unsupervised
Algorithms. Journal of Cyber Security and Mobility. 2021 May 27:569-92.
[28] Sundar KS, Bonta LR, Baruah PK, Sankara SS. Evaluating training time of Inception-v3 and
Resnet-50,101 models using TensorFlow across CPU and GPU. In 2018 Second International
Conference on Electronics, Communication and Aerospace Technology (ICECA) 2018 Mar 29
(pp. 1964-1968). IEEE.
[29] Lorena AC, De Carvalho AC, Gama JM. A review on the combination of binary classifiers in
multiclass problems. Artificial Intelligence Review. 2008 Dec;30(1):19-37.
[30] Shahraki A, Abbasi M, Haugen Ø. Boosting algorithms for network intrusion detection: A
comparative evaluation of Real AdaBoost, Gentle AdaBoost and Modest AdaBoost.
Engineering Applications of Artificial Intelligence. 2020 Sep 1;94:103770.