Machine Learning in Intrusion Detection
International Journal of Computer Applications (0975 – 8887)
Volume 78 – No.16, September 2013
[Fig. 2: classification diagram of intrusion detection systems; only its labels survive extraction. Labels include: alert, architecture, environment, data processing, time of detection, data source, analysis strategy and detection method; wired and wireless environments, with and without infrastructure; centralized and distributed; anomaly detection and misuse detection; data mining, rule based, state transition and signature based; host based, network based, hybrid and NBA; distributed IDS and grid IDS.]
ANOMALY DETECTION
According to the type of processing related to the "behavioral" model of the target system, anomaly detection techniques can be classified into three main categories [4]: statistical-based, knowledge-based, and machine learning-based. In [18], the well-known intrusion detection approaches are reviewed and compared, together with their strengths and weaknesses.

2.1 Statistical anomaly-based IDS
A statistical anomaly-based IDS establishes what normal network activity looks like (what bandwidth is generally used, what protocols are used, and what ports and devices generally connect to each other) and alerts the administrator or user when anomalous (not normal) traffic is detected [5] [7]. Statistical approaches are further categorized into univariate, multivariate and time series models. In the univariate model, parameters are modeled as independent Gaussian random variables, thus defining an acceptable range of values for every variable. The multivariate model considers the correlation between two or more variables. The time series model uses an interval timer together with an event counter or resource measure, and takes into account the order and inter-arrival times of the observations as well as their values; an observation is labelled as an anomaly if its probability of occurrence is too low at a given time.

Pros: - 1) Prior knowledge about normal activity not required.
2) Accurate notification of malicious activities.

Cons: - 1) Susceptible to being trained by attackers.
2) Difficult setting of parameters and metrics.
3) Unrealistic quasi-stationary process assumption.

2.2 Knowledge-based IDS
[...] knowledge-based IDS schemes. Knowledge-based techniques are divided into the frame-based model, the rule-based model and the expert system. The rule-based model is a modified form of grammar-based production rules. The frame-based model localizes an entire body of expected knowledge and actions into a single structure. Expert systems are intended to classify the audit data according to a set of rules, involving three steps. First, different attributes and classes are identified from the training data. Second, a set of classification rules, parameters or procedures is deduced. Third, the audit data are classified accordingly [5], [6].

Pros: - 1) Robustness, flexibility and scalability.

Cons: - 1) Difficult and time-consuming availability of high-quality knowledge/data.

2.3 Machine learning-based IDS
Machine learning techniques are based on establishing an explicit or implicit model. A singular characteristic of these schemes is the need for labeled data to train the behavioral model, a procedure that places severe demands on resources. In many cases, the applicability of machine learning principles coincides with that of the statistical techniques, although the former is focused on building a model that improves its performance on the basis of previous results. Hence, machine learning for IDS has the ability to change its execution strategy as it acquires new information. This feature could make it desirable to use such schemes for all situations.

Pros: - 1) Flexibility and adaptability; capture of interdependencies.
[...] distributed information processing models that consist of: (a) a set of simple processing units (nodes, neurons), (b) a set of synapses (connection weights), (c) the network architecture (pattern of connectivity), and (d) a learning process used to train the network.

In [20], the Hybrid Neural Network Algorithm (HNNA) is presented, based on the advantages and disadvantages of the improved GA and the LM algorithm. First, the algorithm uses the strong global search capacity of the improved GA to locate the global optimum in the whole problem domain. Then, it uses the fast local search of the LM algorithm to refine the search near the global optimum. The paper uses the three algorithms in turn, namely the improved GA, the LM algorithm and HNNA, to adjust the input and output parameters of the ANN model, and adopts the theory of multi-classifier fusion to structure the intrusion detection system. Repeated experiments show, from the training results, that HNNA is better in stability and convergence precision than the LM algorithm and the improved GA. The testing results also show that the detection rate of the multiple-classifier intrusion detection system based on the HNNA learning algorithm, across all attack categories whether they have few or many training samples, is higher than that of the IDS using the LM and improved GA learning algorithms, and that the false negative rate is lower. The authors therefore conclude that HNNA is feasible in theory and practice.

In [21], the 41-dimensional input features of the neural-network-based multiple-classifier intrusion detection system are adjusted according to the differences between the attack categories. After repeated experiments, the authors find that every adjusted sub-classifier has better convergence precision and a shorter training time than the 41-feature sub-classifier; moreover, the whole intrusion detection system has a higher detection rate and a lower false negative rate than the 41-feature multiple-classifier intrusion detection system. The scheme of adjusting the input features is therefore able to optimize the neural-network-based multiple-classifier intrusion detection system, and is shown to be feasible in practice.

4.4 Fuzzy logic techniques
Fuzzy logic is derived from fuzzy set theory, under which reasoning is approximate rather than precisely deduced from classical predicate logic. Fuzzy techniques are thus used in the field of anomaly detection mainly because the features to be considered can be seen as fuzzy variables [10]. The application of fuzzy logic to computer security was first proposed in [23]. The Fuzzy Intrusion Recognition Engine (FIRE) for detecting intrusion activities is proposed in [24], where an anomaly-based IDS is implemented using data mining techniques and fuzzy logic. The fuzzy logic part of the system is responsible both for handling the large number of input parameters and for dealing with the inaccuracy of the input data. Three fuzzy characteristics are used in this work: COUNT, UNIQUENESS and VARIANCE. The implemented fuzzy inference engine uses five fuzzy sets for each data element (HIGH, MEDIUM-HIGH, MEDIUM, LOW and MEDIUM-LOW) and suitable fuzzy rules to detect the intrusion. In their report the authors have not specified how they derived their fuzzy sets. The fuzzy set is a very important issue for the fuzzy inference engine, and in some cases a genetic approach can be implemented to select the best combination. The proposed system is tested using data collected from the local area network in the College of Engineering at Iowa State University, and the results are reported in that paper. The reported results are descriptive and not numerical; therefore, it is difficult to evaluate the performance of the reported work.

4.5 Genetic algorithms
Genetic algorithms are classified as global search heuristics and are a form of evolutionary computation that uses techniques inspired by evolutionary biology, such as recombination, selection, inheritance and mutation. Thus, genetic algorithms represent another type of machine learning-based technique, capable of deriving classification rules [11] and/or selecting appropriate features or optimal parameters for the detection process [10].

In [25], a rule evolution approach based on Genetic Programming (GP) is proposed for detecting novel attacks on networks. In this framework, four genetic operators, namely reproduction, mutation, crossover and dropping condition operators, are used to evolve new rules. The new rules are used to detect novel or known network attacks. Experimental results show that rules generated by GP with part of the KDD Cup 1999 data set have a low false positive rate (FPR), a low false negative rate (FNR) and a high rate of detecting unknown attacks. However, an evaluation with the full KDD training and testing data is missing in the paper.

More efforts using GA for intrusion detection are made in [14, 4, 8]. One approach proposes a linear representation scheme for evolving fuzzy rules using the concept of complete binary tree structures. GA is used to generate genetic operators for producing useful and minimal structural modifications to the fuzzy expression tree represented by chromosomes. However, the training process in this approach is computationally very expensive and time consuming. Bridges and Vaughn employ GA to tune the fuzzy membership functions and to select an appropriate set of features in their intelligent intrusion detection system. GA, as an evolutionary algorithm, has been used successfully in different types of IDS. Using GA returned impressive results; the best fitness value was very close to the ideal fitness value. GA is a randomized search method often used for optimization problems. GA was able to generate a model with the desired characteristics of a high correct detection rate and a low false positive rate for IDS [15].

4.6 Clustering and outlier detection
Clustering techniques work by grouping the observed data into clusters, according to a given similarity or distance measure. The procedure most commonly used for this consists in selecting a representative point for each cluster. Clustering techniques make it possible to determine the occurrence of intrusion events only from the raw audit data, and so the effort required to tune
the IDS is reduced. One of the most popular and most widely used clustering algorithms is K-Means [19], which is a non-hierarchical, centroid-based approach.

4.7 Data Mining
Data mining is an information activity that discovers hidden facts contained in a database. These techniques are used to find patterns and intelligent relationships in data and to infer rules that allow the prediction of future results.

Association rule learning is one of many data mining techniques; it describes events that tend to occur together. Association rule discovery defines normal activity, which makes anomalies easy to discover. Classification assigns each audit record to one of the possible categories, normal or anomaly.

[Figure: Packets -> Packet Sniffing -> Packets -> Preprocessing -> Packets]

Off-line IDS: - In this phase, packets are captured from a dataset (e.g. the KDD dataset/NSL-KDD), which serves as the data source of the IDS.

5.2 Classification
In the classification phase, the data received from the previous phase are used to detect whether a packet is a normal packet or an attack packet. Depending on the feature values, the corresponding algorithms classify the packets into similar groups. The phase consists of two processes:

(a) Training data (b) Testing data

In the training phase, the answer class is provided along with the packet features, which helps to formulate the rules deciding the mapping domains. These rules may be changed or replaced depending on further training. Every algorithm has its own strategy of classification.

In the testing phase, untrained data are given to the system to check whether true answers are obtained or not. The system process is performed with packets provided as input, without specifying the answer class.

5.3 Post Processing
The result obtained in the preprocessing phase is evaluated against the answer class, and system performance is measured in combinations of correctness and false alarms, i.e. True Positive, True Negative, False Positive and False Negative.

5.4 Reducing False Alarms
If the system still gives some false alarms for all the algorithms, more training needs to be given. This is the machine learning mechanism, i.e. the system keeps on learning on its own without human interference, and hence no updating is required.

6. CONCLUSIONS
In this paper the authors have presented an overview of the machine learning technologies which are being utilized for the detection of attacks in IDS, and of the system design of an effective IDS. The security of information in computer-based systems is a major concern to researchers. The work of IDS and
methodologies has been a major focus of information security related research. Machine learning is a vast and advanced field, still relatively immature and definitely not optimized for IDS.

7. FUTURE DIRECTION
The challenges that lie ahead of us in intrusion detection systems are huge; they are listed as follows:

1. Inability to lessen the number of false positives, which reduces the efficiency of the IDS. A good IDS should perform with a high precision and a high recall, as well as a lower false positive rate and a lower false negative rate. How one can have confidence in the result is a major issue.

2. The time taken to process the huge amount of data for training is very large.

3. Improving classification accuracy is a major task in IDS. This calls for a focus on multi-classifier systems.

4. Because of inadequate computing resources and the tremendous increase in targeted attacks, a real-time intrusion detection system is a necessity. However, its implementation in a real-life environment is challenging.

5. Need for a standard evaluation dataset which simulates real-time IDS.

6. Feature reduction work: many studies use feature selection for data reduction, to decrease the computational complexity. More attention needs to be given to the task of data reduction.

7. Need to implement a combination technique for misuse detection and anomaly detection.

Machine learning could turn out to be a very good field for IDS by resolving these challenges.

8. REFERENCES
[1] S. Chebrolu, A. Abraham, and J. P. Thomas, "Feature deduction and ensemble design of intrusion detection systems," Comput. Secur., vol. 24, no. 4, pp. 295–307, Jun. 2005.
[2] W. Lee and S. J. Stolfo, "A framework for constructing features and models for intrusion detection systems," ACM Trans. Inf. Syst. Secur., vol. 3, no. 4, pp. 227–261, Nov. 2000.
[3] Denning D, "An Intrusion-Detection Model," IEEE Transactions on Software Engineering, Vol. SE-13, No 2, Feb 1987.
[4] Lazarevic A, Kumar V, Srivastava J. "Intrusion detection: A survey," Managing cyber threats: issues, approaches, and challenges, Springer Verlag; 2005. pp. 330.
[5] Denning DE, Neumann PG. "Requirements and model for IDES – a real-time intrusion detection system," Computer Science Laboratory, SRI International; 1985. Technical Report #83F83-01-00.
[6] Anderson D, Lunt TF, Javitz H, Tamaru A, Valdes A. "Detecting unusual program behavior using the statistical component of the next-generation intrusion detection expert system (NIDES)," Menlo Park, CA, USA: Computer Science Laboratory, SRI International; 1995. SRIO-CSL-95-06.
[7] Ye N, Emran SM, Chen Q, Vilbert S. "Multivariate statistical analysis of audit trails for host-based intrusion detection," IEEE Transactions on Computers 2002;51(7).
[8] Wenke Lee and Salvatore J. Stolfo, "A framework for constructing features and models for intrusion detection systems," 2000, ACM Trans. Inf. Syst. Secur., 3(4):227–261.
[9] Heckerman D. "A tutorial on learning with Bayesian networks," Microsoft Research; 1995. Technical Report MSRTR-95-06.
[10] Bridges, Vaughn, "Fuzzy Data mining and genetic algorithms applied to intrusion detection," In: Proceedings of the National Information Systems Security Conference; 2000. pp. 13–31.
[11] Li W. "Using genetic algorithm for network intrusion detection," C.S.G. Department of Energy; 2004. pp. 1–8.
[12] Y. Zhai, P. Ning, P. Iyer, D.S. Reeves, "Reasoning about complementary intrusion evidence," in: Proceedings of the 20th Annual Computer Security Applications Conference (ACSAC 04), December 2004.
[13] X.D. Hoang, J. Hu, P. Bertok, "A program-based anomaly intrusion detection scheme using multiple detection engines and fuzzy inference," Journal of Network and Computer Applications 32 (2009) 1219–1228.
[14] Kamra, Bertino, "Design and Implementation of an Intrusion Response System for Relational Databases," IEEE Transaction on Knowledge and Data Engineering, Volume: 23, Issue: 6, doi 10.1109/TKDE.2010.151, 2011, pp: 875 – 888.
[15] Suhail Owais, Václav Snášel, Pavel Krömer, Ajith Abraham, "Survey: Using Genetic Algorithm Approach in Intrusion Detection Systems Techniques," 978-0-7695-3184-7/08, DOI 10.1109/CISIM, 7th Computer Information Systems and Industrial Management Applications, 2008 IEEE.
[16] C. Xiang and S. M. Lim, "Design of multiple-level hybrid classifier for intrusion detection system," in Workshop on Machine Learning for Signal Processing, 2005, pp. 117–122.
[17] B. Daniel, C. Julia, J. Sushil, P. Leonard, N. N. Wu, "ADAM: Detecting intrusions by data mining," Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, West Point, NY, 2001.
[18] Murali A, Rao M, "A Survey on Intrusion Detection Approaches," Information and Communication Technologies, 2005. ICICT 2005. First International Conference on, DOI: 10.1109/ICICT.2005.1598592, Year: 2005, pp: 233 – 240.
[19] Mrutyunjaya Panda and Manas Ranjan Patra, "Network Intrusion Detection Using Naïve Bayes," IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.12, December 2007.
[20] Li Xiangmei, Qin Zhi, "The Application of Hybrid Neural Network Algorithms in Intrusion Detection System," 978-1-4244-8694-6/11 ©2011 IEEE.
[21] Xiangmei Li, "Optimization of the Neural-Network-Based Multiple Classifiers Intrusion Detection System," 978-1-4244-5143-2/10 ©2010 IEEE.
[22] Naeem Seliya, Taghi M. Khoshgoftaar, "Active Learning with Neural Networks for Intrusion Detection," IEEE IRI 2010, August 4-6, 2010, Las Vegas, Nevada, USA, 978-1-4244-8099-9/10.
[23] H.H. Hosmer, "Security is fuzzy!: applying the fuzzy logic paradigm to the multipolicy paradigm," Proceedings of the 1992-1993 workshop on New security paradigms, ACM New York, NY, USA, 1993, pp. 175-184.
[24] John E. Dickerson and Julie A. Dickerson, "Fuzzy network profiling for intrusion detection," Proceedings of NAFIPS 19th International Conference of the North American Fuzzy Information Processing Society (Atlanta, USA), July 2000, pp. 301-306.
[25] [Link] and [Link], "Unsupervised Anomaly Detection Using an Evolutionary Extension of K-means Algorithm," International Journal on Information and computer Science, Inderscience Publisher 2 (May, 2008), 107-139.
[26] Jiankun Hu, Xinghuo Yu, Qiu D, Hsiao-Hwa Chen; "A simple and efficient hidden Markov model scheme for host-based anomaly intrusion detection," IEEE Transaction on Network, Volume: 23, Issue: 1, DOI: 10.1109/MNET.2009.4804323, Year: 2009, Page(s): 42 – 47.