ITU Challenge

Abstract—The transition from conventional architectures to Software Defined Networks (SDNs) has revolutionized network management and control in contemporary networking. Nonetheless, the centralization of network control within SDNs has introduced a significant security risk, necessitating the implementation of robust intrusion detection systems. This paper examines intrusion detection within SDN-enabled networks, concentrating on the development of multiclass classifiers capable of identifying an array of intrusion types. A comprehensive dataset combining actual user data from an SD-WAN environment with established datasets is provided to facilitate research. The sample set includes Normal flow data, DDoS flow data, Malware flow data, and web-based flow data, among other types. Using machine learning techniques, our research aims to facilitate the development of effective intrusion detection models, thereby contributing to the protection of SDN-based networks against a wide range of threats.

Index Terms—SDN, DDoS, Intrusion Detection, Machine Learning

I. INTRODUCTION

With the advent of Software Defined Networks (SDNs) in recent years, the networking landscape has undergone a dramatic transformation. This paradigm shift has ushered in unprecedented flexibility and efficiency in network management, made possible by the centralization of control through a single entity, the SDN controller. While this architectural change offers numerous benefits, it has also introduced a major security concern: the potential vulnerability of the centralized controller, which, if compromised, could have catastrophic repercussions for the entire network.

In the context of SDNs, the security of the central controller is crucial to the integrity of network operations. Any unauthorized access or malicious activity directed at the controller can have far-reaching effects on the network’s functionality and security. In response, the implementation of effective intrusion detection systems (IDS) has become essential.

Traditional intrusion detection methods frequently rely on binary classifications, labelling instances as normal or malicious. Nonetheless, the changing threat landscape necessitates a more nuanced approach. Intrusions can manifest as Distributed Denial-of-Service (DDoS) attacks, malware infiltrations, and web exploits, among others. Therefore, the development of multiclass classifiers capable of categorizing these diverse intrusion varieties is crucial.

In this paper, we conduct a thorough investigation of intrusion detection systems that are tailored for SDN-enabled networks. Our research capitalizes on a novel dataset that combines actual user data sourced from an SD-WAN environment with well-established datasets representing various intrusion scenarios. The dataset contains distinct sample types, including Normal flow data, DDoS flow data, Malware flow data, and web-based flow data. By leveraging machine learning techniques, we aim to create intrusion detection models that transcend binary categorizations, thereby contributing to the protection of SDN architectures against a vast array of security threats. Through our research, we hope to promote a deeper understanding of intrusion detection in the context of SDN and to provide practical insights into the development of multiclass classifiers that have the potential to bolster network security in this era of dynamic networking.

The remainder of the paper is structured as follows: in Section II, we examine previous research on IDS. Section III describes the methodology of our research, while Section IV evaluates the results. Section V concludes the paper and discusses prospective research.

II. RELATED WORK

In an effort to protect Software Defined Networks (SDNs) from intrusions, a vast amount of research has been devoted to the creation of effective intrusion detection systems (IDS). This section examines a selection of seminal research papers that advance intrusion detection techniques, particularly in SDN-enabled environments.

Maxime Labonne [1] explored anomaly-based network intrusion detection using machine learning techniques. The study highlighted the ability of machine learning algorithms to detect anomalies in network behaviour, providing insights into enhancing network security by recognizing deviations from the norm.

In SDN environments, Junhong Li [2] focused on the detection of Distributed Denial-of-Service (DDoS) attacks. Li proposed an innovative method for effectively identifying DDoS attacks that combines dense neural networks, autoencoders, and the Pearson correlation coefficient. The study demonstrated how neural network architectures and correlation metrics can be utilized to improve the accuracy of intrusion detection systems.

Naveen Bindra and Manu Sood [3] investigated the impact of feature selection techniques on the efficacy of machine learning models designed for the detection of DDoS attacks. Their research emphasized the importance of preprocessing stages in enhancing the precision of intrusion detection models. The research provided vital insights into the optimization of machine learning-based intrusion detection systems through the evaluation of a variety of feature selection methods.

K. Muthamil Sudar and P. Deepalakshmi [4] introduced a novel intrusion detection system based on flow analysis and customized for SDN environments. Their research utilized hybrid machine learning techniques to identify intrusions in SDN environments. This study highlighted the importance of employing flow-based analysis and hybrid machine learning models to address the unique security challenges that SDN architectures present.

Zhen Yang et al. [5] conducted a systematic literature review, surveying anomaly-based network intrusion detection methodologies and datasets. Their research, which was published in Computers and Security, Volume 116 (2022), offered a comprehensive overview of cutting-edge techniques and datasets in this field. This contribution is a valuable resource for researchers seeking a thorough understanding of the landscape and methodologies surrounding anomaly-based network intrusion detection.

Collectively, the cited works emphasize the importance of machine learning in developing resilient intrusion detection systems in SDN environments. Incorporating techniques such as anomaly detection, neural networks, and hybrid machine learning, researchers have made significant advances in enhancing the security of SDNs against a variety of intrusion scenarios. In conjunction with Zhen Yang et al.’s systematic literature review, these studies contribute to a greater understanding of the evolving threat landscape and the corresponding defences in the dynamic domain of SDN-based networking.

III. RESEARCH METHODOLOGY

A. Dataset

The provided dataset, which is central to a competition, contains 1.78 million rows, each of which is characterized by 77 distinct columns. A crucial component of the dataset is its single labelled column, which serves as the competition’s output variable. The structure of this dataset reflects the intent to use machine learning to predict or classify outcomes based on the interaction of the various independent features. This dataset is notable for its pronounced class disparity, a situation in which some classes in the labelled column are considerably underrepresented in comparison to others. This disparity can present difficulties for machine learning algorithms, potentially resulting in biased model performance in which the majority class dominates prediction accuracy. This imbalance necessitates the application of specialized techniques to ensure that all classes are treated fairly. Some classes within the labelled column are extremely sparse: three of them have fewer than 25 samples each, which adds to the complexity. This scarcity heightens the need for cautious model management, as the limited number of instances of these classes may hinder the algorithm’s ability to generalize accurately. Table I summarizes the class distribution.

TABLE I
DATASET SUMMARY

No.  Class                       Count
0    BENIGN                      1,432,050
1    DoS Hulk                    145,575
2    PortScan                    100,125
3    DDoS                        80,656
4    DoS GoldenEye               6,484
5    FTP-Patator                 5,000
6    SSH-Patator                 3,714
7    DoS slowloris               3,651
8    DoS Slowhttptest            3,464
9    Bot                         1,238
10   Web Attack Brute Force      949
11   Web Attack XSS              410
12   Infiltration                22
13   Web Attack Sql Injection    12
14   Heartbleed                  6
     Total                       1,783,356

B. Feature Selection

Feature selection was instrumental in refining the original 77 features of the dataset. Using a Random Forest (RF) classifier, we evaluated each attribute’s significance in relation to the target variable. Through this evaluation, we identified and retained a subset of 28 features with significant predictive power. To validate and strengthen our choices, we subjected these 28 features to Principal Component Analysis (PCA), a well-known dimensionality reduction technique. The results derived from RF and PCA were closely consistent, further supporting the effectiveness of the feature subset we chose. This consolidated subset not only improves the predictive ability of our models, but also preserves an essential level of interpretability for extracting insights from complex datasets. Table II lists the prominent features.

TABLE II
SIGNIFICANT FEATURE LIST

Features
Total Length of Fwd Packets     Flow Packets/s
Total Length of Bwd Packets     Fwd IAT Mean
Fwd Packet Length Mean          Max Packet Length
Bwd Packet Length Min           Packet Length Variance
Bwd Packet Length Std           Avg Fwd Segment Size
Fwd IAT Total                   Subflow Fwd Bytes
Fwd Header Length               Init_Win_bytes_backward
Packet Length Std               Flow IAT Max
Average Packet Size             Fwd IAT Std
Fwd Header Length.1             Packet Length Mean
Init_Win_bytes_forward          PSH Flag Count
Fwd Packet Length Max           Avg Bwd Segment Size
Bwd Packet Length Max           Subflow Bwd Bytes
Bwd Packet Length Mean          Fwd IAT Max

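The feature selection procedure of Section III-B can be illustrated with a minimal Python sketch. It assumes scikit-learn and uses a synthetic 77-feature table in place of the competition data; the variable names, the number of trees, and the 95% variance threshold are illustrative choices, not values reported in this paper. The PCA step shows one plausible way to inspect the retained subset, since the exact PCA check used in the study is not specified.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 77-feature flow table (assumption: the real
# competition data is not reproduced here).
X, y = make_classification(
    n_samples=2000, n_features=77, n_informative=20, n_redundant=10,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Step 1: rank every feature by Random Forest impurity-based importance.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(X, y)
top_k = 28  # size of the retained subset, as in Table II
selected = np.argsort(rf.feature_importances_)[::-1][:top_k]

# Step 2: run PCA on the standardized subset and check how many principal
# components are needed to explain 95% of its variance (illustrative threshold).
X_sel = StandardScaler().fit_transform(X[:, selected])
pca = PCA().fit(X_sel)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_components_95 = int(np.searchsorted(cumulative, 0.95) + 1)

print("selected feature indices:", sorted(selected.tolist()))
print("components for 95% variance:", n_components_95)
```

On real data, `selected` would be mapped back to column names such as those in Table II. A small number of components relative to 28 would indicate residual redundancy among the retained features.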
C. Data Preprocessing

We first selected the features expected to be most informative for the model. The data were then cleaned with a function that removes rows containing missing or infinite values and converts every column except the Label column to numeric values. This step is vital for accurate analysis and modeling, as it improves data reliability. The column names contained stray blank characters, which were stripped. We separated the classes into three sub-groups according to their sample counts. Benign, DoS Hulk, PortScan, and DDoS formed the High-Samples group, each with more than 80,000 rows. We then reduced the Benign class from 1,432,050 to 150,000 samples, as the extra data was not significantly contributing to model performance. The Mid-Samples group consisted of the DoS GoldenEye, FTP-Patator, SSH-Patator, DoS slowloris, and DoS Slowhttptest classes, each of which had at least 3,000 samples. The high- and mid-sample classes were less challenging to classify. The remaining classes had fewer than 1,000 samples each; in fact, three of them had fewer than 50 samples. We then augmented five of these classes using the synthetic minority over-sampling technique (SMOTE) [6]. SMOTE generates synthetic samples for a minority class by interpolating between existing instances and their nearest neighbors. This helps create a more balanced class distribution, preventing the model from being biased towards the majority class and improving its ability to accurately classify the minority class. Table III shows the number of samples for each of these classes before and after the augmentation.

TABLE III
LOW-SAMPLE CLASS DISTRIBUTION BEFORE AND AFTER AUGMENTATION

Attack Type                   Original Count   After SMOTE
Web Attack – Brute Force      949              949
Web Attack – XSS              410              1210
Infiltration                  22               272
Web Attack – Sql Injection    12               262
Heartbleed                    6                256

We then concatenated the three subsets into a single training file in which the three subsets contain a relatively similar number of samples. Next, the feature values were scaled to zero mean and unit variance, and the textual class labels were converted into numerical values. The training dataset was then split into a smaller training set and a validation set, with 10% of the data allocated for validation while preserving the class distribution in both sets. These preprocessing steps are essential to ensure consistent and well-prepared data for training, validation, and testing, ultimately leading to accurate and reliable model performance evaluation.

D. Model Architecture

In the model construction phase, an ensemble approach is employed. It combines multiple individual models to create a more accurate and robust predictive model [7]. This approach enhances accuracy by leveraging diverse models that capture different data patterns, reduces overfitting by promoting generalization, and handles complex relationships in the data effectively [8]. In ensemble learning, “training” involves developing individual models, each trained on distinct subsets of data or employing varied algorithms to capture diverse data patterns. These models collectively form the ensemble. The “meta model,” also called the ensemble model, integrates the predictions from the individual models to make a final prediction. This amalgamation harnesses the strengths of each model, yielding enhanced accuracy and robustness.

The ensemble methods implemented in our experiment are Random Forest, AdaBoost, and XGBoost. Multiple bagging and boosting algorithms can be combined to create a generic heterogeneous model architecture. In our study, each classifier is instantiated with algorithm-specific parameters, such as random state and objective, which shape the way each classifier constructs its decision boundaries and responds to optimization goals. During the training of the base models, the training set is partitioned into smaller training subsets. These subsets, comprising features and corresponding labels, drive the iterative fitting process. The base models learn from samples in the training data and adjust their parameters to make accurate predictions about labels for new validation data. These predictions are combined with the original features to create a new set of data. The meta-model then learns how best to use the combined predictions of the three base models, together with the original features, to make a final prediction for unseen data. This improves prediction accuracy by bringing together different models’ insights and making a more informed decision. In our experiment, the Random Forest classifier was used as the meta-model.

Random Forest is a robust meta-learning model due to its ability to combine the predictive power of multiple decision trees while addressing their limitations. It constructs an ensemble by training numerous decision trees on different data subsets and features, resulting in diverse models. Through majority voting, it aggregates these trees’ predictions, reducing overfitting and enhancing generalization. Moreover, the random feature selection for each tree adds an element of variability, mitigating the risk of a single tree dominating the ensemble. This approach leads to robust and accurate predictions, making Random Forest an effective choice for complex datasets like ours.

IV. RESULT ANALYSIS

In our study, we thoroughly examined how well our ensemble model performed in detecting different types of network traffic and security threats. We tested the model on two different sets of data: one to check its accuracy during training,
and the other to see how well it works on new, unseen data. The results were quite promising. The average F1-score for 5-fold cross-validation on the validation set was 97.77%. Our model was very good at correctly identifying common types of network traffic and well-known attacks. It also did a good job of maintaining its accuracy when dealing with new data, which is an important sign of a robust model.

However, we did notice some challenges. When it came to detecting rare security threats or specific types of attacks, the model struggled. This tells us that we might need more data or different techniques to handle these rare cases effectively. We addressed the class imbalance issue by augmenting the minority class data, but due to the very small number of original samples in the minority classes, the model was not able to classify them robustly in the test data. We also found that the model had some trouble with complex attack patterns. For instance, it was less accurate in identifying certain web-based attacks.

A common theme throughout our analysis was finding the right balance between precision and recall. While the model did well in most cases, there is still room for improvement in this balance. Figure 1 illustrates the confusion matrix of the ensemble model on the validation data, and Figure 2 depicts the result for the test data. In training, we used 5-fold stratified cross-validation to ensure model robustness.

Fig. 1. Validation Set Result

To sum up, our ensemble model showed promising results in detecting network traffic and security threats. By focusing on its strengths, addressing challenges with rare cases and complex attacks, and fine-tuning some aspects, we can make it even better at its job. This analysis gives us important insights to improve the model and make it more effective in real-world situations.

As per the requirement of the competition, we need to compare the performance of our approach with three existing methods. We found the approach mentioned in [8] interesting, as the authors propose a hybrid model that combines CNN and Random Forest for intrusion detection. We implemented the approach mentioned in [?]. The customized regularization function from [8] is also implemented. The results of this approach on the validation dataset are shown in Figure 3. The other two baselines are supervised Random Forest and XGBoost models. We selected these two models because they are commonly used for comparison in the literature. The confusion matrices of these two models are shown in Figures 4 and 5, respectively.

V. CONCLUSION

In terms of future work, several approaches for enhancing the performance and robustness of our ensemble model can be applied. First, the handling of rare classes, such as “Heartbleed,” “Infiltration,” and “Web Attack – Sql Injection,” demands specialized attention. Second, alternative feature engineering can be explored. By identifying and incorporating more salient aspects of network traffic data, we can enhance the model’s overall performance. For imbalanced data, rectifying
the skewed class distributions can empower the model to learn
more effectively from underrepresented classes, thus allowing
a more equitable learning process. Third, the optimization
of threshold values and hyperparameters stands as a key
consideration. Future research could delve into comprehen-
sive experiments to fine-tune threshold settings, achieving an
optimal balance between precision and recall across various
classes. Moreover, further refinement of the ensemble model
architecture by using different classifiers and incorporating
neural networks can be explored. Techniques like class-specific
weighting or the exploration of diverse ensemble configura-
tions could be used to address performance disparities among
specific classes. Separate models could also be trained per class-size group in the future to assess performance on different subsets of the data. For instance, the larger-sample classes could be trained with neural networks and the smaller-sample classes with tree-based models. Such architectures and voting
mechanisms can be further investigated.
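
The ensemble configurations discussed above can be explored with a small harness built on the pipeline of Sections III-C and III-D. The following Python sketch is one possible arrangement under stated assumptions, not the exact implementation used in this study: it relies on scikit-learn, uses synthetic data, substitutes GradientBoostingClassifier for XGBoost to stay within one library, and introduces a hypothetical helper, smote_like, that reproduces only the interpolation idea of SMOTE [6]. As a choice of this sketch, oversampling is applied to the training portion after the stratified 10% split.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def smote_like(X_min, n_new, k=5, seed=0):
    """Hypothetical helper: create n_new synthetic rows by interpolating
    between minority-class rows and one of their k nearest neighbours."""
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)
    # Querying the fitted rows returns each row itself in column 0.
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])


# Synthetic, imbalanced three-class data (assumption: stands in for flows).
X, y = make_classification(n_samples=1500, n_features=20, n_informative=8,
                           n_classes=3, n_clusters_per_class=1,
                           weights=[0.60, 0.37, 0.03], random_state=0)

# Stratified split: 10% held out for validation, class ratios preserved.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0)

# Oversample the rare class (label 2) in the training portion only.
rare = X_tr[y_tr == 2]
X_new = smote_like(rare, n_new=250)
X_tr = np.vstack([X_tr, X_new])
y_tr = np.concatenate([y_tr, np.full(len(X_new), 2)])

# Three base learners; the meta-model is a Random Forest that sees both the
# base predictions and the original features (passthrough=True).
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    passthrough=True, cv=5)

# Scaling to zero mean and unit variance is fitted on the training data.
model = make_pipeline(StandardScaler(), stack)
model.fit(X_tr, y_tr)

val_pred = model.predict(X_val)
macro_f1 = f1_score(y_val, val_pred, average="macro")
print(f"validation macro F1: {macro_f1:.3f}")
```

Swapping entries in base_models, or replacing the final estimator, gives a quick way to compare alternative ensemble configurations on the same split. Macro-averaged F1 is used here because it weights rare classes equally with common ones.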
REFERENCES

[1] Maxime Labonne. Anomaly-based network intrusion detection using machine learning. Cryptography and Security [cs.CR]. Institut Polytechnique de Paris, 2020. English. NNT: 2020IPPAS011. Tel 02988296.
[2] Junhong Li. Detection of DDoS Attacks Based on Dense Neural Networks, Autoencoders and Pearson Correlation Coefficient. Dalhousie University, April 2020.
[3] Naveen Bindra, Manu Sood. Evaluating the Impact of Feature Selection Methods on the Performance of the Machine Learning Models in Detecting DDoS Attacks. Romanian Journal of Information Science and Technology, Volume 23, Number 3, 2020, 250–261.
[4] K. Muthamil Sudar, P. Deepalakshmi. Flow Based Intrusion Detection System for Software Defined Networking using Hybrid Machine Learning Technique. International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, Volume-9 Issue-2S2, December 2019.
[5] Zhen Yang, Xiaodong Liu, Tong Li, Di Wu, Jinjiang Wang, Yunwei Zhao, Han Han. A systematic literature review of methods and datasets for anomaly-based network intrusion detection. Computers & Security, Volume 116, 2022.
[6] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
[7] B. N. Narayanan and V. S. P. Davuluru, “Ensemble malware classification system using deep neural networks,” Electronics, vol. 9, no. 5, p. 721, 2020.
[8] K. Abbas, M. Afaq, T. Ahmed Khan, A. Rafiq, and W.-C. Song, “Slicing the core network and radio access network domains through intent-based networking for 5G networks,” Electronics, vol. 9, no. 10, p. 1710, 2020.
[9] M. S. ElSayed, N.-A. Le-Khac, M. A. Albahar, and A. Jurcut, “A novel hybrid model for intrusion detection systems in SDNs based on CNN and a new regularization technique,” Journal of Network and Computer Applications, vol. 191, p. 103160, 2021.