
Intrusion Detection Systems: A Survey on Current Methods, Challenges, and Opportunities

Preamble

A network intrusion is unauthorized access to a computer in an enterprise network by malicious parties with mala fide intent, and it may be passive or active. A passive intrusion is an attempt by malicious parties to steal information only, while in an active intrusion the malicious parties can destroy information as well as steal it. Intrusion is a challenging issue for business networks; nowadays a business network can be protected from intrusion using an Intrusion Detection System (IDS). IDSs have drawn the attention of industry and academia towards the application of Artificial Intelligence (AI) and Machine Learning (ML). AI has the ability to handle huge amounts of data with high prediction accuracy, and it is hosted in the organizational Cyber Security Operation Center (CSOC) as a defence tool to monitor and detect malicious network flows that may impact confidentiality, integrity, and availability. However, Deep Learning (DL) models are used as black boxes and do not provide a justification for their predictions. This creates a barrier for CSOC analysts, as they are unable to improve their decisions based on the model's predictions. This survey reviews the state of the art in intrusion detection and its current challenges, along with a comprehensive discussion of black box and white box approaches. We also present the trade-off between these approaches in terms of their performance and ability to produce explanations. We propose a generic architecture that considers the human-in-the-loop and can be used as a guideline when designing IDS. Research recommendations are given from three critical viewpoints: the need to define explainability for IDS, the need to create explanations tailored to various stakeholders, and the need to design metrics to evaluate explanations.

Keywords: IDS, AI, ML, DL, Black Box, White Box
I. Introduction

Industry and academia widely use AI and ML to solve intrusion problems; these techniques offer data-driven automation that can enable security systems to identify and respond to intrusion threats in real time. AI-based cyber defense systems are hosted in an organizational CSOC, which is operated by security analysts and acts as a cyber security information hub and a defense base [1-11].

These security systems include Security Information and Event Management (SIEM) systems, vulnerability assessment solutions, governance, risk and compliance systems, application and database scanners, Intrusion Detection Systems (IDS), user and entity behavior analytics, Endpoint Detection and Remediation (EDR), and so on. Security analysts maintain an "organizational state", keeping themselves one step ahead of the attackers to prevent potential intrusions [12].

James Anderson's seminal report introduced the term "intrusion detection" in the early 1980s [13]. Dorothy E. Denning proposed the first functional IDS, based on Anderson's work, in the mid-1980s to automate the process of monitoring and analyzing events occurring within a computer system or network for indications of potential security problems before they inflict widespread damage [14-16].

An intrusion is a breach of at least one of the principles of confidentiality, integrity, or availability. The objective of an IDS is to detect unauthorized access to a business network within an organization by malicious parties, and a great deal of research has been done by researchers and industry to improve the operational capacity of IDSs [17-19].

The development history of IDSs shows that numerous IDSs have been developed using a variety of techniques drawn from an array of disciplines, including statistical methods, ML techniques, and others [20]. Currently, researchers widely use ML and DL techniques to develop IDSs due to their ability to attain a high detection rate [21].

The surveys in [22-23] primarily focus on intrusion detection techniques that are based on deep learning. These preceding surveys are deficient in their ability to explain the inference processes and final results of IDS techniques, which are frequently treated as a black box by both developers and users [24].

Opaque, non-transparent black box models can achieve impressive prediction accuracies, but they lack justifications for their predictions because their nested, non-linear structure makes it difficult to identify the precise information in the data that influences their decision-making [25-26]. This black box nature creates problems for several domains in which AI, or components of AI, are integrated [27].

The predictions made by an IDS model should be understandable. When an IDS model is presented with zero-day attacks, it may misclassify the attacks as normal, resulting in a system breach [28]. Understanding why specific samples are misclassified is the first step toward debugging and diagnosing the system. It is critical to provide detailed explanations for such misclassifications so as to determine the appropriate course of action to prevent future attacks [29-30].

AI systems designed using DL techniques are difficult to interpret due to their inability to be decomposed into intuitive components [31]. Explainable AI (XAI) seeks to remedy this and other problems. According to the Defense Advanced Research Projects Agency (DARPA), XAI systems are able to explain their reasoning to a human user, characterize their strengths and weaknesses, and convey a sense of their future behavior [32-33].

An IDS model should go beyond merely detecting intrusions; it should also provide reasoning for the detected threat [34]. Explanations in the form of correlations among the various factors influencing the predicted outcome, i.e., time of intrusion, type, and suspicious network flow, can assist cyber security analysts in quickly analyzing tasks and making decisions [35].

The major contributions of the paper are as follows:

 We present the state of the art of the XAI approach, discuss the critical issues that surround it, and show how these issues relate to intrusion detection. We propose a taxonomy-based literature review to help lay the groundwork for formally defining explainability in intrusion detection.
 A comprehensive survey of the current landscape of X-IDS implementations is presented, with an emphasis on two major approaches: black box and white box. The distinction between the two approaches is discussed in detail, as is the rationale for why the black box approach with post-hoc explainability is more appropriate for intrusion detection tasks.

 We propose a generic explainable architecture with a user-centric approach for designing X-IDS that can accommodate a wide variety of scenarios and applications without adhering to a specific specification or technological solution.
 We discuss the challenges inherent in designing X-IDS and make research recommendations aimed at effectively mitigating these challenges for future researchers interested in developing X-IDS.

The rest of this paper is organized as follows. Section II briefly discusses explainable artificial intelligence (XAI). Section III summarizes our survey methodology and taxonomy. Section IV discusses the literature on black box and white box X-IDS approaches. Section V introduces a generic X-IDS architecture that future researchers can use as a guide. Section VI identifies research challenges and makes recommendations to future researchers. Finally, Section VII concludes this survey.

II. Explainable Artificial Intelligence (XAI)

Lent et al. [36] define XAI as a system that provides an easily understandable chain of reasoning from the user's order, through the system's knowledge and inference, to the resulting behaviour. The authors used the term XAI to describe their system's ability to explain the behaviour of AI-controlled entities. The definition given by DARPA [37] frames XAI as the intersection of different areas, including machine learning, the human-computer interface, and end-user explanation.

XAI offers significant benefits to a broad range of domains that rely on artificial intelligence systems. At present, XAI is being used in mission-critical systems and defense [38-39]. Researchers are proposing explanation systems to foster trust in AI systems in the transportation domain, and some works combine image processing with explainability. Transparency regarding decision-making processes is critical in the criminal justice system [40-41]; various explainable methods for judicial decision support systems have been proposed in [42-43].

Medical anomaly detection, healthcare risk prediction, genetics, and healthcare image processing are some of the areas moving towards the adoption of XAI [44-49]. The finance sector uses AI-based credit score decisions and counterfeit banknote detection [50-51]. In the entertainment industry, XAI for recommender systems is found in the works of [52-53].

Arrieta et al. [54] argued that one of the issues that hinders the establishment of common
ground for the meaning of the term ‘explainability’ in the context of AI is the interchangeable
misuse of ‘interpretability’ and ‘explainability’. Interpretability is the ability to explain or
convey meaning in human-comprehensible terms [55]. In this sense, if system users need an
explanation as a proxy system to understand the reasoning process, that explanation is
precisely represented by the XAI.

Fig. 1: Explainable Artificial Intelligence (XAI)

The green boxes in Fig. 1 represent model dependency, and the grey boxes represent the scope of the explanations. The light purple boxes represent the various types of stakeholders in the IDS ecosystem, and the pink boxes represent techniques to evaluate explanations.
A. Concept of Explainability

Several explanation approaches have been proposed by different authors in the pursuit of explaining AI systems. The authors in [56] conducted a survey of black-box-specific explainability methods and proposed a taxonomy for XAI systems based on four characteristics: (i) the nature of the problem, (ii) the type of explainer used, (iii) the type of black box model processed by the explainer, and (iv) the type of data supported by the black box.

The authors of [57] presented the concept of explainability in two clusters. The first cluster refers to attributes of explainability and contains the criteria and characteristics used by scholars in trying to define the construct of explainability. The second cluster refers to the theoretical approaches for structuring explanations.

The authors of [58] proposed a taxonomy for categorizing XAI techniques based on explanation scope, algorithmic methodology, and usage. Similarly, the authors in [59] surveyed over 180 articles related to explainability and categorized explainability using three criteria: the complexity of interpretability, the scope of interpretability, and model dependency.

Common categories found in the literature regarding the taxonomy of explainability are the scope of explainability and model dependency. The following subsections describe these categories in greater detail.

1. Local Explainability

Local explainability explains a single prediction or decision; it is used to generate a unique explanation or justification of the specific decision made by the model [60]. Some local explanation methods include Local Interpretable Model-Agnostic Explanations (LIME) [61], Anchors [62], and Leave One Covariate Out (LOCO) [63].

LIME was originally proposed by Ribeiro et al. [64], who used a surrogate model to approximate the predictions of the black box model. Rather than training a global surrogate model, LIME uses a local surrogate model to interpret individual predictions. The explanation produced by LIME is obtained using Equation 1.
ξ(x) = argmin_{g ∈ G} { L(f, g, w_x) + Ω(g) }        (1)

where g represents the explanation model for the instance x (e.g., linear regression); G denotes a class of potentially interpretable models, such as linear models and decision trees; L denotes the loss function (e.g., mean squared error), which is used to determine how close the explanation model's predictions are to the original model's predictions; f denotes the original model; w_x specifies the weighting factor between the sampled and original data; and Ω(g) captures the complexity of model g.
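
To make Equation 1 concrete, the following is a minimal sketch of LIME applied to a tabular classifier, assuming the open-source `lime` and `scikit-learn` packages; the synthetic data and the "benign"/"malicious" class names are illustrative stand-ins for preprocessed flow records, not part of the original work.

```python
# Minimal sketch: local explanation of one IDS-style prediction with LIME.
# Synthetic data stands in for network-flow features (assumption for brevity).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    training_data=X,
    feature_names=[f"f{i}" for i in range(X.shape[1])],
    class_names=["benign", "malicious"],
    mode="classification",
)

# LIME perturbs the instance, weights the perturbed samples by proximity (w_x),
# and fits a sparse linear surrogate g whose coefficients are the explanation.
exp = explainer.explain_instance(X[0], clf.predict_proba, num_features=5)
print(exp.as_list())  # top features with their local weights
```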

Lundberg et al. [65] proposed a game-theoretic optimal solution based on Shapley values for model explainability, referred to as SHapley Additive exPlanations (SHAP). SHAP calculates the significance of each feature for each prediction. The authors demonstrated the equivalence of this model to various local interpretable models, including LIME [66], Deep Learning Important FeaTures (DeepLIFT) [67], and Layer-wise Relevance Propagation (LRP) [68]. SHAP values can be computed for any model, not just simple linear models. SHAP specifies the explanation for an instance x as shown in Equation 2.

g(z′) = φ_0 + Σ_{j=1}^{M} φ_j z′_j        (2)
where g represents the explanation model; z′ ∈ {0, 1}^M is the coalition vector of the simplified features (a 1 in z′ indicates that the new data contains a feature identical to the original data, while a 0 indicates that the feature is distinct from the original data); M denotes the maximum coalition size; and φ_j ∈ R denotes the attribution for feature j in the instance x, known as the Shapley value. If φ_j is a large positive number, feature j has a significant positive effect on the model's prediction. φ_0 represents the model output with all simplified inputs toggled off.
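
As an illustration of Equation 2, the short sketch below computes SHAP values for a tree-ensemble classifier on synthetic data, assuming the open-source `shap` and `scikit-learn` packages; φ_0 corresponds to the explainer's expected value and the φ_j to the per-feature attributions.

```python
# Minimal sketch: SHAP values for an IDS-style classifier on synthetic data.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes phi_j for every feature j of every instance;
# expected_value plays the role of phi_0 (all simplified inputs toggled off).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:100])

print("base value phi_0:", explainer.expected_value)
print("phi_j for the first instance:", shap_values[0])
```
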
2. Global Explainability

Global explainability makes it easier to follow the reasoning behind all possible outcomes. It sheds light on the model's decision-making process as a whole, resulting in an understanding of the attributions for a variety of input data [69]. LIME was extended with a submodular pick algorithm in order to comprehend the model's global correlations [70]. Using the submodular pick algorithm, LIME provides a global understanding of the model from individual data instances by providing a non-redundant global decision boundary for the machine learning model [71].

Kim et al. proposed another global explainability method, namely Concept Activation Vectors (CAVs), which can interpret the internal states of a neural network in a human-friendly concept domain [72]. Yang et al. proposed a novel method known as Global Interpretation via Recursive Partitioning (GIRP) to construct a global interpretation tree based on local explanations for a variety of machine learning models [73]. Another approach is explanation by information extraction, in which the authors propose a lightly supervised information-extraction method that provides a global interpretation [74].

3. Model-specific Interpretability

Model-specific interpretability methods are restricted to a limited number of model classes. With this approach we are restricted to using only models that provide a specific type of interpretation, which can reduce our options for using more accurate and representative models [75].

4. Model-agnostic Interpretability

Model-agnostic methods are not tied to any specific type of ML model and are by definition modular, in the sense that the explanatory module is unrelated to the model for which it generates explanations [68]. Model-agnostic interpretations are used to interpret artificial neural networks and can be local or global. A significant amount of research in XAI is concentrated on model-agnostic post-hoc explainability algorithms due to their ease of integration and breadth of application [69-70]. Based on other reviewed papers, the techniques of model-agnostic interpretability can be broadly categorized into four types: visualization, knowledge extraction, influence methods, and example-based explanations [71-75].

B. Formalizing Explainability Tasks From User Perspectives

A machine learning model must be human-comprehensible and explainable. The interpretable elements serve as the foundation of an explanation, which is highly dependent on the question of who will receive the explanation. The authors of [76] identified three targets of explanation: the regular user, the expert user, and the external entity. An explanation should be specific to the user type; in a legal scenario, for example, it must be made to expert users rather than regular users. On the other hand, if explanations are geared towards regular users, the chance of developing trust and acceptance of XAI methods is higher [77].

Keeping humans in the loop determines the overall value of explainability; the authors of [78-81] emphasize the significance of the human-in-the-loop approach for explainable systems from two perspectives, i.e., human-like explanation and human-friendly explanation. The first aspect focuses on how to produce explanations that simulate the human cognitive process. The second aspect is concerned with developing explanations that are centered on humans [82-85]. Section V discusses the importance of human-centered design when developing X-IDS systems, and Section VI-B examines the explainability requirements imposed by various stakeholders in the IDS ecosystem.

C. Measure for Evaluating Explainability Techniques

Researchers have done a great deal of work on evaluating explanations and quantifying their relevance. The authors of [86] proposed three classes of evaluation methods for interpretability: application-grounded, human-grounded, and functionally-grounded methods.

Application-grounded evaluation is concerned with the impact of the interpretation process's results on the human, domain expert, or end user in terms of a well-defined task or application [87]. Human-grounded evaluation is a simplified form of application-grounded evaluation in which experiments are run with regular users rather than domain experts. Functionally-grounded evaluation does not require human subjects; it instead uses formal, well-defined mathematical definitions of interpretability to determine a method's quality [88-89].

The authors of [90-93] outlined three different evaluation criteria for explanations of deep networks: processing, representation, and explanation producing. The first criterion simulates data processing to generate insights about the relationships between a model's inputs and outputs. The second criterion describes how data is represented in networks and explains that representation. The third criterion states that explanation-producing systems can be evaluated according to how well they match user expectations. Section VI-C provides greater detail about this proposed model. In the next section, we describe our survey approach and develop a taxonomy for X-IDS grounded in the current literature.

III. Survey & Taxonomy

The term intrusion refers to unauthorized access to an enterprise's computer network by malicious parties. IDSs are a collection of tools, methods, and resources that assist CSOC analysts in identifying, assessing, and reporting intrusions. An IDS is a component of protection that surrounds a system, not a stand-alone protection measure [94-96]. IDSs are classified into two types according to their behaviour: host-based and network-based. Host-based IDSs monitor traffic originating from and coming to a specific host. Network-based IDSs are strategically positioned in a network to analyze incoming and outgoing communication between network nodes [97-99].

IDSs use three detection techniques: signature-based, anomaly-based, and hybrid. A signature-based IDS monitors network traffic and compares it with a database of known malicious threat signatures or attributes [100]. However, signature-based IDSs are not capable of detecting zero-day attacks, metamorphic threats, or polymorphic threats [101]. An anomaly-based IDS looks for patterns in data that do not conform to expected behavior, allowing it to recognize such threats [102]. However, these detection systems are susceptible to higher false positive rates because they may categorize previously unseen, yet legitimate, system behaviors as anomalies [103]. Hybrid IDSs integrate both signature-based and anomaly-based detection methods, which allows for an increased detection rate of known intrusions, the ability to detect unseen intrusions, and reduced false positives [104].

Prior works on IDSs have approached XAI through the lens of explainability, qualifying the definitions of the notion, its users, and its metrics. This survey follows that direction by creating a taxonomy surrounding current XAI techniques for IDS [105]. A summary of our taxonomy can be seen in Fig. 2. The two primary families of XAI techniques are white box models and black box models, and this distinction greatly shapes our survey taxonomy for approaches to X-IDS.

Fig. 2: Proposed taxonomy.

The proposed X-IDS techniques are categorized into white box and black box categories. The white box approaches encompass Regression, Rule-Based, Clustering, and Statistical & Probabilistic methods. The black box approaches encompass Feature, Perturbation, Decomposition, and Hybrid approaches. These approaches define the method of explainability used to interpret the model's decision process. The survey of existing systems, based on the taxonomy in Fig. 2, is available in Section IV. We describe the salient features of white box and black box models below.

A. Salient Features of White Box Techniques

White box models are regarded as explainable because their conditions are easy to understand and the results they produce can be understood easily [106]. These models cover a wide variety of techniques that fall into four distinct families: Regression, Rule-Based, Clustering, and Statistical & Probabilistic Methods. These approaches have a well-formed background of statistical support and maturity. Therefore, these models are most often employed in the early stages of modeling, in the pipelines of more complex models, and in domains where scrutiny and transparency are of paramount importance [107].

Regression models are highly computationally efficient, which allows for rapid construction as well as deployment into low-resource systems where detection time is critical, such as IoT edge devices [108]. Regression approaches can be split into parametric and non-parametric regression. Popular regression techniques include Linear Regression (LR), Logistic Regression (LoR), various non-linear models, Poisson Regression, and Kernel Regression (KR) [109].

Statistical & Probabilistic Methods are a broad category covering the numerous statistical models of reasoning that exist in the literature. Notably, many of these methods have seen a decline in use alongside the rise in popularity of various black box methods [110]. Clustering-based approaches use supervised or unsupervised learning to aggregate similar data objects, where similarity is defined by a similarity, or dissimilarity, measure. Traditionally these methods are defined on distance-based metrics such as Euclidean distance, Manhattan distance, the Cosine measure, the Pearson coefficient, and many others. Examples of popular clustering algorithms are K-Means, Self-Organizing Maps (SOMs), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Agglomerative Clustering, and Spectral Clustering [111-115].

B. Salient Features of Black Box Techniques

In black box models, the decision system is considered opaque; these models, which make up much of the state-of-the-art, are limited by the lack of model inspection and evaluation [116]. Currently, there exists no singular solution to the black box inspection problem. However, many candidate explanation methods have emerged, exploring and exploiting various aspects of the machine learning process to create explanations for black box models. These candidate explanations currently fall into four distinct families: Feature, Perturbation, Decomposition, and Hybrid [117-119].
Hybrid-based explanations encapsulate a type of model construction often demonstrated in
pipelined machine learning architectures [120]. These models can range from ensembles, a
blend of white box and black box approaches working in tandem, to carefully composed IDS
pipelines encapsulating many of the best state-of-the-art approaches. Therefore, hybrid
approaches present the most variability of explanations, with respect to methodology, location
of explanations, and application [121-122].

IV. Approaches to Explainable IDS (X-IDS)

Here we will describe in detail the black box and the white box approaches to XAI in
intrusion detection systems as per the survey overview presented in Section III, and the
taxonomy showcased in Fig. 2.

A. Black Box X-IDS Models

Guidotti et al. describe a black box predictor as an obscure data-mining or machine learning model [123] whose internals are either unknown to the observer or known but uninterpretable by humans. A black box model is not explainable by itself; therefore, to make a black box model explainable [124], we have to adopt several techniques to extract explanations from the inner logic or the outputs of the model [125]. We have further divided the literature into different categories of XAI black box models: feature-based, perturbation-based, decomposition-based, and hybrid approaches. These classifications are based upon how explanations are generated [126]. A detailed literature overview is also available in Table 1.

1. Feature-based Approaches

Feature-based approaches are a popular scheme for explanation that considers the influence of features [127]. Several solutions that currently exploit this assumption are Partial Dependence Plots (PDP), Accumulated Local Effects (ALE), the H-statistic, and SHAP. Wang et al. proposed a SHAP-based framework that uses both local and global explanations to increase the explainability of the IDS model. The IDS model consists of a binary Neural Network (NN) classifier and a multi-class NN classifier [128-129]. The models and their predictions are fed to the SHAP module to generate local and global explanations [130]. The local explanations are generated by choosing an attack and randomly selecting 100 of its occurrences. The authors evaluate explainability using a Neptune attack, where a flooding of SYN packets is observed. Researchers can make inferences about how the model might react during a related attack using the global explanation produced by the SHAP module [131].

Islam et al. built a domain-knowledge-infused explainable IDS framework. Their architecture is composed of two parts: a feature generalizer that uses the CIA principles and an evaluator that compares the black box models using different configurations. The feature generalizer first maps the top three ranked features to attack types, and then maps attack types to the CIA principles [132].

The small difference in performance between the full dataset and the domain-infused dataset shows that the authors can explain predictions without negatively impacting model performance [133]. The authors also create a CIA scoring formula that shows how much impact a CIA-mapped feature had on a sample's prediction. These C, I, and A scores can then be shown to an analyst to explain the prediction. To test their method against unknown attacks, the models are trained on all attacks in the dataset except one, and the classifier is then tested on a dataset that includes all of the attacks [134].

The results show that the domain-infused dataset performs similarly to the full dataset. In one case, the domain-infused dataset can be used to find an attack that the full dataset configuration could not. The authors have demonstrated that creating an explainable algorithm and dataset can be useful for both accuracy and explanations [135].

A novel method in [136] uses Auto-Encoders (AE) in combination with SHAP to explain anomalies. Anomalies are detected using the reconstruction score of the AE; samples that return a higher reconstruction score are considered anomalous. An explainer module is created with the goal of linking the input values of anomalies to their high reconstruction scores. Features are split into two sets: the first set contains features that push the reconstruction score higher, while the second set does the opposite [137-138].
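
The following sketch illustrates the reconstruction-score idea with a small scikit-learn autoencoder on synthetic data; per-feature squared error is used here as a simple stand-in for the SHAP attributions used in [136], so the split into score-raising and score-lowering features is only approximated.

```python
# Minimal sketch: reconstruction-score anomaly detection with an undercomplete
# autoencoder. Per-feature squared error approximates "which inputs raise the
# score" (a stand-in for the SHAP-based explainer, not the authors' method).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(2000, 10)))  # "benign" flows

# Train the network to reproduce its own input through a narrow hidden layer.
ae = MLPRegressor(hidden_layer_sizes=(4,), max_iter=500, random_state=0)
ae.fit(X, X)

def reconstruction_report(x):
    recon = ae.predict(x.reshape(1, -1))[0]
    per_feature = (x - recon) ** 2          # each feature's share of the error
    score = per_feature.sum()               # anomaly (reconstruction) score
    raising = np.argsort(per_feature)[::-1][:3]
    return score, raising

anomaly = X[0].copy()
anomaly[2] += 8.0                           # inject an abnormal value
score, top = reconstruction_report(anomaly)
print(f"reconstruction score={score:.2f}, features raising it most: {top}")
```
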
Feature-based IDS

• An Explainable Machine Learning Framework for Intrusion Detection Systems [11]
  Focus/Objective: Locally and globally explainable NN using SHAP for IDS.
  Contribution: A framework that creates both local and global explanations; first use of SHAP in the field of IDS; comparison between a one-vs-all classifier and a multi-class classifier.
  Limitation: More intrusion detection datasets should be tested; SHAP cannot work in real time; SHAP needs to be tested on more robust attacks.

• Domain Knowledge Aided Explainable Artificial Intelligence for Intrusion Detection and Response [95]
  Focus/Objective: Use CIA principles on data to improve both the generalizability and explainability of a model.
  Contribution: A method for the collection and use of domain knowledge in IDS; use of CIA principles to aid explainability; domain knowledge increases generalizability.
  Limitation: Domain knowledge is applied to a specific dataset, so new mappings may be needed on new datasets; more datasets need to be tested.

• An Explainable Machine Learning-based Network Intrusion Detection System for Enabling Generalisability in Securing IoT Networks [96]
  Focus/Objective: Explore explainability in IDS by comparing two different IDS feature sets.
  Contribution: Evaluation of two Network Intrusion Detection dataset formats, NetFlow and CICFlowMeter; creation of two new datasets in the CICFlowMeter format; an explainable analysis performed using SHAP.
  Limitation: Explanations are only done using SHAP; no analysis of the performance of the explainer.

• Explaining Anomalies Detected by Autoencoders Using SHAP [97]
  Focus/Objective: Use SHAP to create custom explanations for anomalies found with an autoencoder.
  Contribution: A method for explaining anomalies found by an autoencoder; a preliminary experiment with real-world data and domain experts; suggested methods for evaluating explanations.
  Limitation: The custom explanation lacks any form of visualization to aid the user.

Perturbation-based IDS

• A New Explainable Deep Learning Framework for Cyber Threat Discovery in Industrial IoT Networks [98]
  Focus/Objective: Explainable intrusion detection in the field of IoT.
  Contribution: A Conv-LSTM-based autoencoder for time-series attacks; detects zero-day attacks; a sliding window technique that increases the accuracy of the CNN and LSTM model; XAI concepts to improve trust.
  Limitation: Tested only on a single dataset; considers only univariate time-series data.

• An Adversarial Approach for Explainable AI in Intrusion Detection Systems [6]
  Focus/Objective: Explain models and predictions through an adversarial approach.
  Contribution: A methodology explaining incorrectly classified samples to help improve flaws in the model.
  Limitation: Only tested on DoS attacks from NSL-KDD.

• Feature-Oriented Design of Visual Analytics System for Interpretable Deep Learning Based Intrusion Detection [99]
  Focus/Objective: A suite of visual tools used to improve the explainability of CNNs.
  Contribution: An analysis of features and requirements to improve visual analysis for XAI; IDSBoard, a GUI for understanding deep learning intrusion detection; a demonstration of the effectiveness of visual analytics.
  Limitation: Only tested on a single dataset; scalability of the visual analytics system; the visual analytics system is only designed for CNNs.

• Explanation framework for Intrusion Detection [100]
  Focus/Objective: Explaining IDS classifications using a counterfactual technique.
  Contribution: Explaining classifications based on feature importance; advice on how to change a classification to its desired result; an outline of the decision process so that the user can simulate it themselves.
  Limitation: The analysis of the counterfactual technique was only run on one type of ML algorithm.

Decomposition/Gradient-based IDS

• Toward Explainable Deep Neural Network Based Anomaly Detection [101]
  Focus/Objective: Initial steps into XAI for DNN intrusion detection.
  Contribution: A framework for creating an explainable deep network; XAI concepts to improve trust.
  Limitation: Experiments are run using only DoS attacks from the NSL-KDD dataset.

• Towards explaining anomalies: A deep Taylor decomposition of one-class models [102]
  Focus/Objective: Explaining anomalies found by an SVM using Deep Taylor Decomposition.
  Contribution: A method for 'neuralizing' a one-class SVM so that it can be explained by Deep Taylor Decomposition.
  Limitation: Experiments are solely run using a one-class SVM; no comparison to 'real' neural networks.

Hybrid IDS

• Achieving explainability of intrusion detection system by hybrid oracle-explainer approach [103]
  Focus/Objective: Building a hybrid IDS based around the 'XAI Desiderata' that does not decrease performance or add vulnerability.
  Contribution: An explainer module modeled after the 'XAI Desiderata'; a hybrid oracle-explainer intrusion detection system.
  Limitation: Two models need to be effectively trained; would benefit from being tested on multiple datasets.

• Explainable deep few-shot anomaly detection with deviation networks [104]
  Focus/Objective: An anomaly detection system able to detect anomalies learned from few anomalous training samples.
  Contribution: A prior-driven anomaly detection framework; DevNet, an anomaly detection framework based on a Gaussian prior, Z-score-based deviation loss, and multiple instance learning; a theoretical and empirical analysis of few-shot anomaly detection.
  Limitation: Experiments are only run using image-based datasets with relatively small sample sizes.

Table 1: An overview of the existing literature on black-box approaches to intrusion detection systems, with a
focus on their scope, contribution, and limitations.
2. Perturbation-based Approaches

Perturbation-based approaches are model-agnostic: they can be applied to any kind of model. They are based on the inclusion, removal, or modification of features in a dataset, making minor modifications to input data to observe changes in output predictions. The authors of [99] created a CNN model along with a dashboard user interface to make the black box deep learning components more explainable. They gather feature requirements for their dashboard from the literature. These include: (i) it is important to know the role that individual neurons play in predictions; (ii) multiple models should be tested, and the best parameters should be selected to achieve the best accuracy; (iii) visualization should assist in finding interesting results; (iv) there should be an explanation as to how the model made a decision; (v) we should be able to see the data representation in each layer of the model. The model is able to achieve 80% accuracy, and the dashboard UI includes: a detailed view of each cluster of neurons and the associated feature class, a t-SNE scatter plot of the activation values, a feature map of the convolution kernel, a feature panel that explains how the model came to a prediction, a confusion matrix of predicted instances, and a graph for finding input data patterns.

Khan et al. [98] propose an explainable autoencoder-based detection framework using convolutional and recurrent networks to discover cyber threats in IoT networks. The model is capable of detecting both known and zero-day attacks. It leverages a two-step sliding window technique that transforms a 1-dimensional sample into smaller contiguous 2-dimensional samples. This 2D sample is then fed through a CNN, comprised of a 1D convolution layer and a 1D max-pooling layer, which extracts spatial features. The data is then fed into the autoencoder-based LSTM, which extracts temporal features. Finally, a DNN uses the extracted representation to make predictions. To make the model explainable, the authors use LIME, and they obtain 99.35% accuracy with their proposed model.
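
A minimal sketch of the sliding-window reshaping step described above is given below; the window size and stride are assumptions chosen for illustration, not the values used by Khan et al.

```python
# Sliding-window sketch: a 1-D feature vector is cut into overlapping windows,
# giving a contiguous 2-D sample that a 1-D convolution layer can consume.
# Window size and stride are illustrative assumptions.
import numpy as np

def sliding_window(sample_1d, window=8, stride=2):
    n = (len(sample_1d) - window) // stride + 1
    return np.stack([sample_1d[i * stride: i * stride + window]
                     for i in range(n)])

flow = np.arange(32, dtype=float)     # stand-in for one preprocessed sample
windows = sliding_window(flow)
print(windows.shape)                  # (13, 8): the 2-D representation
```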

In another impactful model [6], the authors argue that rather than explaining every prediction,
it is possible to create a model that explains misclassifications using a counterfactual
technique. The goal is to explain adversarial attacks, which aim to confuse models into
misclassifying input samples. Using this technique, the authors find weak points in their
model and develop strategies to overcome these limitations. When an input sample is
classified incorrectly, minimal changes are made to the sample until it is classified correctly.
The difference between the original, incorrectly labeled sample and the new, correctly
labeled sample are used to explain the occurrence of the misclassification.

A linear classifier and a multi-layer perceptron (MLP) are used during testing, and the authors achieve accuracies of 93% and 95%, respectively. The authors' technique for minimizing the difference between samples is effective, as the projections created by t-SNE are nearly identical. More insight can then be gathered from these projections, as they show which features caused the misclassification along with the magnitude of their impact. This method appears to be a good way to communicate why a classification occurred and allows the user to make the necessary inferences.
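
The sketch below illustrates the counterfactual idea on synthetic data with a simplified search: a misclassified sample is nudged toward the mean of its true class until the prediction flips, and the features that changed most serve as the explanation. This is a stand-in for the authors' adversarial procedure, not their exact method.

```python
# Simplified counterfactual sketch: move a misclassified sample toward the
# mean of its true class until the model's prediction flips; the features
# that changed most explain the misclassification (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=6, random_state=1)
clf = LogisticRegression(max_iter=1000).fit(X, y)

wrong = np.flatnonzero(clf.predict(X) != y)        # misclassified samples
x, target = X[wrong[0]].copy(), y[wrong[0]]
direction = X[y == target].mean(axis=0) - x        # toward the true-class mean

x_cf, step = x.copy(), 0.05
for _ in range(200):                               # bounded search
    if clf.predict(x_cf.reshape(1, -1))[0] == target:
        break
    x_cf += step * direction                       # minimal incremental change

delta = np.abs(x_cf - x)
print("features that had to change most:", np.argsort(delta)[::-1][:3])
```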

Burkart et al. [100] proposed a similar application of counterfactuals in an explainable IDS framework. The goal of the system is to answer the question: why did X happen and not Y? The authors aim to create explanations that are understandable and actionable. By understandable, they mean explaining an instance of classification, and by actionable they mean giving advice for changing the classification. The framework should also allow users to simulate these changes themselves.

3. Decomposition-based Approaches

The relevance score is an important aspect of any model; decomposition-based approaches decompose the output of a model to create a relevance score [105]. One such technique is Layer-wise Relevance Propagation (LRP), in which the scoring mechanism propagates backwards from the output node, highlighting activated neurons that impact predictions. These approaches can either decompose the output or decompose the gradient of the model.

The authors proposed an explainable DNN model using LRP whose objective is to give a confidence score for a prediction, a textual explanation of the prediction, and the reasons why the prediction was chosen. The authors argue that the explanation for detected anomalies is provided to reduce the opaqueness of the DNN model and enhance human trust in the algorithm. The authors created a partial implementation of their framework consisting of a feed-forward DNN with explanations created by LRP. NSL-KDD is used for their experimentation, and the model achieves up to 97% accuracy across the implementations.

Fig. 3: A visual depiction of Layer-wise Relevance Propagation.

The relevance scores (R_j, R_k) in Fig. 3 are calculated backwards from the output for each layer (j and k represent neurons). Scores from each previous layer are used to score the next set of neurons, with the final outcome being the importance of each input [106].
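
To illustrate the backward relevance flow in Fig. 3, the sketch below implements the basic LRP epsilon rule by hand for a tiny two-layer ReLU network with random weights; it is a didactic toy, not the framework from the cited work, and bias terms are ignored for simplicity.

```python
# Didactic LRP (epsilon rule) sketch on a tiny two-layer ReLU network.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))                     # input(4) -> hidden(3)
W2 = rng.normal(size=(3, 1))                     # hidden(3) -> output(1)

x = rng.normal(size=4)                           # one input sample
a1 = np.maximum(0, x @ W1)                       # hidden ReLU activations
out = a1 @ W2                                    # network output

def lrp_layer(a_prev, W, R_next, eps=1e-6):
    # Redistribute the relevance R_next of a layer onto its inputs in
    # proportion to each input's contribution z_ij = a_i * w_ij.
    z = a_prev[:, None] * W
    denom = z.sum(axis=0)
    denom = denom + eps * np.where(denom >= 0, 1.0, -1.0)   # stabilizer
    return (z * (R_next / denom)).sum(axis=1)

R_output = out                                   # relevance starts at the prediction
R_hidden = lrp_layer(a1, W2, R_output)           # propagate to the hidden layer
R_input = lrp_layer(x, W1, R_hidden)             # propagate to the inputs
print("input relevance scores:", R_input)        # importance of each input feature
```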

B. White Box X-IDS Models

Models that can provide an explanation to expert users without utilizing additional models are referred to as interpretable or white box models [85]. A white box model's internal logic and programming steps are completely transparent, resulting in an interpretable decision process [110]. However, when the model is to be explained to non-expert users, it may demand post-hoc explainability, such as visualizations [63]. This interpretability, on the other hand, usually comes at a price in terms of performance [111].

At the moment, a myriad of white box approaches are available for intrusion detection. This survey focuses on the approaches most commonly used in the literature, as per the overview presented in Section III and the taxonomy showcased in Fig. 2. Table 2 summarizes state-of-the-art research, challenges, and contributions with respect to white box approaches for intrusion detection systems.

1. Regression

Linear regression (LR) is a supervised ML technique that establishes a relationship between a dependent variable and independent variables by computing a best-fit line.

The linearity of the learned relationship places LR under the umbrella of interpretable models. Intrinsically interpretable models, such as LR and Logistic Regression (LoR), meet the characteristics of transparent models [29]. When the number of features in LR is small, the weights or coefficients of the equation can be used to explain predictions. The learned relationships are linear and can be expressed for a single instance i as given in Equation 3.

ŷ = β_0 + β_1 x_1 + … + β_n x_n + ε        (3)

where ŷ is the output or target (dependent) variable, x_1, …, x_n are the input (independent) variables, β_0, …, β_n are the coefficients, and ε is the error term.
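
As a simple illustration of how the coefficients in Equation 3 act as the explanation, the sketch below fits a logistic regression detector on synthetic flow-style features and prints each beta; the feature names are hypothetical placeholders, not a dataset schema from the paper.

```python
# Minimal sketch: a regression-style detector whose coefficients play the role
# of the beta weights in Equation 3 and can be read directly as feature
# influences. Data is synthetic and the feature names are hypothetical.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

feature_names = ["duration", "src_bytes", "dst_bytes", "syn_rate", "flag_count"]
X, y = make_classification(n_samples=3000, n_features=5, n_informative=3,
                           random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Each coefficient is the change in the log-odds of "malicious" per unit change
# of its feature, which is exactly the explanation a white box model offers.
for name, beta in sorted(zip(feature_names, model.coef_[0]),
                         key=lambda t: -abs(t[1])):
    print(f"{name:>12s}: beta = {beta:+.3f}")
print("intercept beta_0 =", round(float(model.intercept_[0]), 3))
```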

Various regression-based IDS models exist in the literature. The authors in [118] deployed anomaly-based intrusion detection systems using two different statistical methods, i.e., Linear Discriminant Analysis (LDA) and LoR. While LR models are desirable for intrusion detection purposes, their performance is susceptible to outliers [119]. To mitigate the impact of outliers, the authors in [120] proposed a robust regression method for anomaly detection.

Although the existing approaches render promising outcomes, none of them were designed with explainability in mind. The authors of [112] propose an explainable HPC-based Double Regression (HPCDR) ML framework to address the issue of explainability in the area of hardware performance counter (HPC)-based intrusion detection. The study examines two distinct types of attacks: microarchitectural attacks and malware. For the first type of attack, tests are conducted on five distinct datasets; for the second, two distinct datasets are considered, i.e., Bashlite and PNScan. To minimize computational overhead, the proposed study employs Ridge Regression (RR) rather than Shapley values to generate interpretable results. First, three ML models (RF, DT, and NN) are chosen to evaluate classification accuracy. Second, the output from these models is perturbed and passed to the first RR model, where HPCs are employed as features and weight coefficients are obtained.
2. Decision Tree and Rule Based
A Decision Tree (DT) is a graph-theory-based tree structure whose elements form a decision support system.

Two approaches, top-down and bottom-up, are used to construct decision trees. The top-down approach is based on the divide-and-conquer strategy [121]. A decision model built with the top-down approach begins with the root node and splits the root node into two disjoint subsets: a left child and a right child. This process is repeated recursively over the child nodes until a stop condition is met [122]. DTs possess three properties that make them interpretable: simulatability, decomposability, and algorithmic transparency [63]. Fig. 4 illustrates a simple DT that detects an attack on a remotely accessible computer system.

Mahbooba et al. [113] approach the task of developing an interpretable model to identify malicious nodes for IDS using a DT on the KDD dataset. The authors chose the Iterative Dichotomiser algorithm to ensure interpretability because it mimics a human decision strategy. They demonstrate that the algorithm can rank the relevance of features, provide explainable rules, and reach a level of accuracy comparable to the state of the art [125].

Sinclair et al. [127] extract rules using a DT and a Genetic Algorithm (GA) to improve the performance of the IDS model. The authors in [128] and [129] focus on optimizing the IDS model by extracting rules using a GA. To add transparency to the decision process, Dias et al. [114] proposed an interpretable and explainable hybrid intrusion detection system. The proposed system integrates expert-written rules and dynamic knowledge generated by a DT algorithm. The authors suggest that the model can achieve explainability through the justification of each diagnosis. The justification of a given prediction is provided in a tree-like format in the form of a suggested rule, which offers a more intuitive and straightforward understanding of the diagnosis.
Fig. 4: Simple decision tree for the detection of a user-to-root (U2R) attack on a computer system, with the attack classified as Yes or No [126].
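
The sketch below shows how the rules of a small decision-tree detector can be printed verbatim with scikit-learn, in the spirit of the rule-extraction work above; the data is synthetic and the feature names are hypothetical placeholders rather than the KDD schema.

```python
# Minimal sketch: a shallow decision-tree detector whose learned rules can be
# printed verbatim. Data is synthetic; feature names are placeholders.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=3000, n_features=4, random_state=0)
names = ["failed_logins", "root_shell", "num_file_creations", "duration"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The tree is simulatable: the printed if/else rules are the explanation.
print(export_text(tree, feature_names=names))
```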

3. Statistical and Probabilistic Methods

Statistical and probabilistic methods use the statistical properties of observed events to determine whether a given event is anomalous or not. An event is predicted to be anomalous if its statistical moments fall above or below a predefined interval [134]. This approach is further divided into the univariate model, the multivariate model, the time series model [135], parametric and non-parametric models [136-137], the operational model, Markov models, and statistical moments [132, 138].
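
A minimal sketch of a univariate statistical-moments detector follows: an interval is learned from the mean and standard deviation of benign traffic, and events outside it are flagged as anomalous. The packet-rate figures are invented for illustration.

```python
# Univariate statistical-moments sketch: learn an interval from benign traffic
# and flag events that fall outside it. Numbers are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
benign_pkts_per_sec = rng.normal(loc=120, scale=15, size=10_000)

mu, sigma, k = benign_pkts_per_sec.mean(), benign_pkts_per_sec.std(), 3.0
lower, upper = mu - k * sigma, mu + k * sigma     # predefined interval

def is_anomalous(rate):
    return rate < lower or rate > upper

print(is_anomalous(130))    # within the learned interval -> False
print(is_anomalous(450))    # e.g. a flood-like burst    -> True
```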

A different approach to intrinsically explainable statistical methods for network intrusion detection is proposed by Pontes et al. [115]. The authors introduce a novel Energy-based Flow Classifier (EFC) that utilizes inverse Potts models to infer anomaly scores based on labeled benign examples. This method is capable of accurately performing binary flow classification on DDoS attacks. They perform experiments on three different datasets: CIDDS-001, CIC-IDS2017, and CICDDoS19. Results indicate that the proposed model is more adaptable to different data distributions than classical ML-based classifiers. Additionally, they argue that their model is naturally interpretable and that individual parameter values can be analyzed in detail.

4. Clustering

Clustering is the most widely used strategy for unsupervised ML; it groups samples according to a similarity criterion. Clustering algorithms that can be explained have several advantages. The primary benefit of explainable clustering is that it summarizes the input behavior patterns within clusters, enabling users to comprehend the clusters' underlying commonalities [116].

The Self-Organizing Map (SOM) is an unsupervised clustering technique under the artificial neural network umbrella. It has two layers: an input layer that accepts a high-dimensional space and an output layer that generates a non-linear mapping of the high-dimensional space into reduced dimensions. It is trained to produce a low-dimensional representation of a large training dataset while preserving important topological and metric relationships of the input data [142]. A graphical illustration of a simple SOM model is depicted in Fig. 6.

Fig. 6: Clustering based on Self Organizing Maps.

In Fig. 6, the input layer consists of nodes X0, X1, ..., Xn; these nodes have specific weights illustrated by W00, W01, ..., Wn1. These weights determine the cluster classification, such as C0 and C1. An anomaly detection system using SOM techniques based on offline audit trail data is proposed in [143]. The major shortcoming of the proposed system is that it does not allow for real-time detection.

On the other hand, the authors in [144] propose Hierarchical SOMs (HSOM) for host-based
intrusion detection on computer networks that are capable of operating on real-time data
without requiring extensive offline training or expert knowledge. Another model based on
HSOM is proposed in [145]. The authors in [116] developed a novel model-specific
explainable technique for the SOM algorithm that generates both local and global
explanations for Cyber-Physical Systems (CPS) security. They used the SOMs training
approach (winner-take-all algorithm) together with visual data mining capabilities
(Histograms, t-SNE, Heat Maps, and U-Matrix) of SOMs to make the algorithm
explainable.
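
The sketch below clusters synthetic flow vectors with a SOM, assuming the third-party `minisom` package; the best matching unit gives the cluster assignment, and the U-Matrix returned by `distance_map()` is one of the visual explanation aids mentioned above.

```python
# Minimal sketch: clustering flow-style vectors with a Self-Organizing Map.
# Assumes the third-party `minisom` package; the data is a synthetic stand-in
# for preprocessed network-flow features.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(0)
flows = rng.normal(size=(1000, 8))               # stand-in for flow features

som = MiniSom(x=10, y=10, input_len=8, sigma=1.0, learning_rate=0.5,
              random_seed=0)
som.random_weights_init(flows)
som.train_random(flows, num_iteration=5000)      # winner-take-all training

# The winning node (best matching unit) of a sample is its cluster; the
# U-Matrix (distance_map) supports the visual explanations described above.
bmu = som.winner(flows[0])
print("best matching unit for the first flow:", bmu)
print("U-Matrix shape:", som.distance_map().shape)
```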

Regression-based IDS

• Explainable Machine Learning for Intrusion Detection via Hardware Performance Counters [112]
  Focus/Objective: To develop an explainable X-IDS technique based on the double RR technique, utilizing HPCs as features.
  Contribution: Proposes an explainable HPC-based Double Regression (HPCDR) framework for intrusion detection with human-interpretable results; HPCDR is evaluated against real-world malware to determine whether it provides transparent hardware-assisted malware detection and detects microarchitectural attacks with an indication of the malicious origin.
  Limitation: DL models were not chosen to evaluate the optimal ML model; only four HPC features were chosen for experimentation; other microarchitectural attacks (e.g. Prime+Probe) and malware (e.g. rootkits) are not considered in the study.

Decision Tree and Rule-based IDS

• XAI to Enhance Trust Management in Intrusion Detection Systems Using Decision Tree Model [113]
  Focus/Objective: Focused on interpretability in the widely used benchmark KDD datasets.
  Contribution: Addressed the XAI concept to enhance trust management in a way a human expert can understand; analyzed the importance of features based on the entropy measure for intrusion detection; interpreted the rules extracted from the DT approach for intrusion classification.
  Limitation: Information gain in decision trees is biased in favor of attributes with more levels; this behavior might impact prediction performance.

• A Hybrid Approach for an Interpretable and Explainable Intrusion Detection System [114]
  Focus/Objective: To design an interpretable and explainable hybrid intrusion detection system to achieve better and more long-lasting security.
  Contribution: Provides an IDS that stands out for its ML support in populating the knowledge base; focuses on interpretability and explainability, since it justifies the suggested rules and the diagnosis performed for each asset.
  Limitation: Only a DT is considered as the ML model for system design; the knowledge base is small.

Statistical and Probabilistic Models

• A New Method for Flow-Based Network Intrusion Detection Using the Inverse Potts Model [115]
  Focus/Objective: To develop a new method for flow-based network intrusion detection using an inverse statistical method.
  Contribution: Implementation of a naturally interpretable flow classifier based on the inverse Potts model to be employed in NIDS; performance comparison with other ML-based models using three datasets.
  Limitation: Only binary classification is considered in the approach; applicability of the proposed method to real-world data.

Clustering-based IDS

• Explainable unsupervised machine learning for cyber-physical systems [116]
  Focus/Objective: Propose a novel Explainable Unsupervised Machine Learning (XUnML) approach using the Self-Organizing Map (SOM) algorithm.
  Contribution: A brief overview of Supervised Machine Learning (SML), Unsupervised Machine Learning (UnML), and XAI; exploration of initial desiderata towards Explainable UnML (XUnML), defining XUnML terminology based on the terminology used for XAI, and exploring the necessity of XUnML for CPSs.
  Limitation: Only the clustering method is used.

• ANNaBell Island: A 3D Color Hexagonal SOM for Visual Intrusion Detection [117]
  Focus/Objective: Provide explanations of the outputs of SOM models using a color scheme and an island-landscape analogy for different network traffic.
  Contribution: Benign and malicious traffic is separated by color coding and zoning on the island; the color and zone categorization of network traffic provides the explanation of the output.
  Limitation: It is not clear whether the temporal map maintains the same basic landscape or changes over time; the proposed map seems to be specific to the tested network only.

Table 2: A summary of the existing literature on white-box approaches to intrusion detection systems, with an
emphasis on their scope, contribution, and limitations.

V. Designing of Explainable IDS (X-IDS)

The objective of an IDS is to continuously monitor a network for malicious activity or security violations, known as intrusions. A significant problem with AI-based IDS is their high false positive and false negative rates. Recently, many IDSs based on ML or DL techniques have been proposed to address this issue, such as DNNs [148], [149], RNNs [150], [151], and CNNs [152], [153]. However, the effective use of these approaches requires high-quality data as well as a considerable amount of computing resources [154]. Additionally, this modeling approach has typically suffered from model bias, a lack of decision process transparency, and a lack of user trust.

ML/DL techniques in IDS systems generate event logs in the form of 'benign' or 'malicious' classification reports, which can be further analyzed by CSoC analysts.

However, they do not showcase the connection between the inputs and the output. To be more precise, a cyber security specialist serves as the user who reviews IDS results [155]. This creates a larger problem for CSoC experts, as they are unable to optimize their decisions based on the model's decision process.

One promising technique for addressing this semantic gap is to design X-IDS with a human-in-the-loop approach. Typically, methods that are retraceable, explainable, and supported by visualizations amplify cyber security analysts' understanding in managing cyber security incidents in both proactive and reactive manners.
The X-IDS architecture proposed in this paper is based on the DARPA recommended
architecture for the design of XAI systems [24]. The layered architecture consists of three
phases: pre-modeling phase, modeling phase, and post-modeling explainability phase. The
different modules work in tandem to provide CSoC analysts with more accurate and
explainable output in each phase.

Fig. 7: Recommended architecture for the design of an X-IDS based on DARPA [24].

As can be observed in Fig. 7, the layered architecture is divided into three phases: pre-modelling, modelling, and post-modelling explainability. Each phase contributes to the development of an explanation for various stakeholders, thereby assisting in decision-making.

A. Pre-Modelling Phase

The pre-modelling phase is the first phase; its input is raw network flow data (the dataset) and its output is a high-quality dataset. We first describe the different benchmark datasets available for intrusion detection and then present common data preprocessing techniques used in the literature.

1. Datasets
Access to representative, labelled datasets for cyber-security-related AI tasks remains a challenge. A variety of widely accessible datasets can be used to train and benchmark X-IDS. Many of these datasets are generated in an emulated environment to address privacy concerns. NSL-KDD is relatively small compared to other datasets in the field. A more modern dataset is CICIDS2017 [158], which contains more up-to-date attacks and network flows. In addition, it includes 3 million samples, which allows scalability testing. Another noteworthy dataset is UGR [159], a multi-terabyte dataset collected over the course of 5 months. This dataset is built to test IDS for long-term trends; the authors state that it captures potential trends in daytime, nighttime, weekday, and weekend traffic. These publicly available datasets, though good for benchmarking, are not suitable for deployable systems. We recommend that CSoC users deploying X-IDS systems evaluate them on organizationally representative datasets.

2. Exploratory Data Analysis (EDA) and Data Visualization

Data pre-processing is essential for increasing the likelihood that ML models produce
accurate predictions. One can gain a general understanding of a dataset's key features and
characteristics using Exploratory Data Analysis (EDA). To comprehend the features,
visualization techniques such as heat maps, network diagrams, bar charts, and correlation
matrices may be used. Once an understanding of the feature space has been attained, the data is
forwarded to the feature engineering module for further processing.
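As a minimal illustration of this step, the sketch below loads a flow dataset and inspects class balance, feature scales, and a correlation heat map. The file name `flows.csv` and the `label` column are hypothetical placeholders.

```python
import pandas as pd
import matplotlib.pyplot as plt

# "flows.csv" is a placeholder for a tabular IDS dataset (e.g., one CICIDS2017 file).
df = pd.read_csv("flows.csv")

print(df["label"].value_counts())          # class balance: benign vs. attack types
print(df.describe().T[["mean", "std"]])    # quick check of feature scales

corr = df.select_dtypes("number").corr()   # correlation matrix of numeric features
plt.imshow(corr, cmap="coolwarm")          # render it as a heat map
plt.colorbar(label="Pearson correlation")
plt.title("Feature correlation heat map")
plt.show()
```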

3. Feature Engineering

When pre-processing IDS datasets, we normally normalize the numerical features and encode
the categorical features. The encoded feature space can be quite large, which makes models
computationally expensive. Two approaches are widely used to reduce the dimensionality:
feature selection and feature extraction.

Feature selection techniques reduce the feature space by selecting a subset of
features without transforming them. Three types of feature selection techniques are
popular in the IDS domain: filters, wrappers, and embedded/hybrid methods [160].
Apart from these, libraries such as Scikit-Learn [161] have also been used in published works
for feature selection.
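A hedged sketch of a filter-style selection step with Scikit-Learn follows; the feature matrix `X` and labels `y` are assumed to come from the preprocessing step above, and the choice of mutual information with k=20 is purely illustrative.

```python
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Filter method: score each feature by mutual information with the label and
# keep the top k, without transforming the retained features themselves.
selector = SelectKBest(score_func=mutual_info_classif, k=20)
X_reduced = selector.fit_transform(X, y)
kept = selector.get_support(indices=True)   # column indices of the selected features
print(f"kept {len(kept)} of {X.shape[1]} features")
```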

Another technique used in feature engineering is dimensionality reduction, which extracts
features. Feature extraction reduces the size of the feature space by transforming the
original features while retaining most of their defining attributes. The most commonly used
feature extraction technique in the literature is Principal Component Analysis (PCA)
[162]. PCA is an unsupervised method that does not require class knowledge to identify
features. It also facilitates the identification of correlations and relationships between the
features of a dataset.
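A corresponding feature-extraction sketch with PCA is shown below; it again assumes a scaled feature matrix `X`, and the 95% variance threshold is an illustrative choice rather than a recommendation.

```python
from sklearn.decomposition import PCA

# Unsupervised extraction: project the scaled features onto the principal
# components that jointly explain ~95% of the variance.
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X)
print(pca.n_components_, "components retained")
print(pca.explained_variance_ratio_[:5])   # variance carried by the leading components
```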

B. Modelling Phase

The second phase is the modelling phase. The input of this phase is the high-quality dataset
generated in the pre-processing phase and the output is the explanations generated by the
explainer module. First, the high-quality dataset is fed into the ML/DL model of choice.
Second, the predictions generated by the model in use are passed through an explainer
module. Third, these explanations are evaluated by an evaluation module. This process
enables users to understand the reason behind certain predictions, which in turn helps
CSoC analysts in their decision-making process.

1. AI Model
In Section IV, we discussed two different approaches currently employed by
different authors to create X-IDS: the black box and the white box. The AI modules in these
approaches generate predictions. However, there is a trade-off between accuracy and
interpretability with these approaches. White box approaches are popular for their
interpretability, while black box approaches are known for their prediction accuracy. In the
context of IDS, high prediction accuracy is required to prevent attacks.

Moreover, black box models can capture significant non-linearity and complex interactions
in data that white box models are unable to capture. For example, Recurrent Neural
Networks (RNN) can capture temporal dependencies between samples, while models such as
Support Vector Machines (SVM) and Deep Neural Networks (DNN) can create their own
representations of the data, which might help discover unknown attacks. For this reason,
we believe that future X-IDS should be built using black box models.

Our literature review found that authors use a variety of black box algorithms in their work,
such as SVM, CNN, RF, and MLP, which prove to be quite effective. Another
popular algorithm of choice in the intrusion detection domain is a variant of the RNN
referred to as Long Short-Term Memory (LSTM). Recently, Generative Adversarial
Networks (GAN) have also become relatively popular. Consequently, there is a multitude of
black box algorithms from which to select. Explainer modules then approximate the
predictions generated by the AI module using white box or black box algorithms.
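To illustrate the accuracy-interpretability trade-off discussed above, the sketch below trains an inherently interpretable decision tree alongside a black-box random forest on the same data. The reduced feature matrix `X_reduced`, the split ratio, and the hyperparameters are illustrative assumptions carried over from the earlier sketches.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

X_train, X_test, y_train, y_test = train_test_split(
    X_reduced, y, test_size=0.3, stratify=y, random_state=0)

# White box: a shallow tree whose decision path an analyst can read directly.
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

# Black box: an ensemble that usually predicts better but offers no direct rationale.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

for name, model in [("decision tree", tree), ("random forest", forest)]:
    pred = model.predict(X_test)
    print(name, accuracy_score(y_test, pred), f1_score(y_test, pred, average="macro"))
```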

2. Explainer Module and Evaluation

The predictions generated by the model of choice in the AI module are then fed to the
explainer module. Common explainers used in previous works include LIME, SHAP,
and LRP. These out-of-the-box modules allow for quick testing on different algorithms and
datasets. However, there are some problems with relying solely on these approaches in future
X-IDS work. First, methods such as SHAP do not run in real time; it may be
time-consuming to apply SHAP even to a simple Multi-Layer Perceptron classifier trained on
a dataset with a large feature space. In X-IDS, both predictions and explanations must be
produced as quickly as possible. Second, these approaches are not always designed with X-IDS
stakeholders in mind.
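A minimal post-hoc explanation sketch using SHAP on the random forest from the previous sketch is given below; TreeExplainer is chosen here because the model-agnostic KernelExplainer is typically far too slow for the large feature spaces noted above, and the 100-row sample is an assumption for exposition.

```python
import shap

# TreeExplainer exploits the tree structure of the forest, which keeps explanation
# time manageable compared with the model-agnostic (and slower) KernelExplainer.
explainer = shap.TreeExplainer(forest)
X_sample = X_test[:100]                        # explain a small batch of test flows
shap_values = explainer.shap_values(X_sample)

# Global view of which features push predictions towards each class on this batch.
shap.summary_plot(shap_values, X_sample)
```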

At present, there are no standard metrics to evaluate explanations, and several authors have
attempted to evaluate explanations in various ways. In Section II-C we described different
ways to evaluate explanations. The application-grounded, human-grounded, and
functionally-grounded evaluations proposed by Doshi-Velez et al. [80] can
be used as a baseline to evaluate the explanations generated by an X-IDS. A noteworthy method
to evaluate the effectiveness of explanations is proposed by the authors in [24]; Figure 9
illustrates their approach.
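In the absence of standard metrics, one simple functionally grounded proxy in the spirit of [80] is to measure how faithfully a transparent surrogate reproduces the black-box model's decisions. The fidelity score sketched below is an illustrative metric reusing the model and splits from the earlier sketches, not an established benchmark.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Fidelity: train a transparent surrogate on the black-box model's own predictions
# and report how often the two agree on held-out flows.
surrogate = LogisticRegression(max_iter=1000).fit(X_train, forest.predict(X_train))
fidelity = accuracy_score(forest.predict(X_test), surrogate.predict(X_test))
print(f"surrogate fidelity to the black-box model: {fidelity:.3f}")
```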

C. Post Modelling Explainability Phase

The post-modelling explainability phase is the third and final phase and has two major
components: the explanation interface and the users. The recommendation, decision, or action
generated by the AI module, explained by the explainer module, and evaluated by the
explanation evaluation module is rendered in a graphical user interface.

1. Explanation Interface

An excellent approach to building such an explanation interface is found in the work by [99]
and [95], where custom visual dashboards are created to help the user understand the X-IDS.
Engineers who design X-IDS can use this approach as guidance when creating their
explanation interface. Furthermore, we also recommend that future X-IDS developers build
custom explainers for specific stakeholders to help improve explainability.

2. Users

The stakeholders consist of developers, defense practitioners, and investors. Developers are
tasked with creating, modifying, and maintaining the X-IDS. Defense practitioners guard the
assets of the investors. Finally, investors make budgeting decisions for the benefit of the
X-IDS and other assets. These three audiences have distinct tasks and explainability
requirements that must be addressed differently by the X-IDS. An explanation interface
designed from the user's perspective can bridge this gap.

If an explanation is unclear or unhelpful, stakeholders will need a way to voice that concern,
in which case the developers can revise the explainer module. For the same reason, incorrect
predictions and explanations need to be corrected and updated; the developers or defense
practitioners will then need to introduce new data to the model. Moreover, a different method
of data preprocessing may be required to improve the efficacy of the model.

VI. Research Challenges & Recommendations

The sub-domain of explainable AI-based Intrusion Detection Systems is still in its infancy,
and researchers working on X-IDS must be made aware of the issues that hinder its
development. The issues described in Section II, such as finding the right notion of
explainability, generating explanations from a stakeholder's perspective, and the lack of
formal standard metrics to evaluate explanations, are prevalent in the X-IDS domain as well.
Existing X-IDS research is primarily focused on the goal of making algorithms explainable;
explanations are not being designed around stakeholders, and researchers have yet to quantify
useful evaluation metrics. Apart from these challenges, issues pertaining to IDS in general
may also pose a problem for X-IDS. There are many promising avenues of exploration; in this
section we detail some existing research challenges and give our recommendations.
A. Defining Explainability for Intrusion Detection

The first problem faced by researchers designing X-IDS is the lack of consensus on the
definition of explainability in IDS. The research community needs to agree on a common
definition of explainability for IDS. To find common ground, we can leverage the
foundational XAI definition proposed by DARPA [31]. However, an X-IDS definition needs
more security domain-specific elements. The inclusion of the CIA principles may be a good
start for cementing a definition that combines aspects of cyber security and XAI.

Questions relevant to the X-IDS that researchers need to answer include: “What is
explainability when used for intrusion detection?”, “How do we effectively create
explanations for IDS?”, and “Who are we creating explanations for?”. Other questions such
as “How can Confidentiality, Integrity, and Availability benefit from explanations?” and
“How do we categorize X-IDS algorithms?” should be reassessed by X-IDS researchers as
well. Current work is extremely narrow in its scope and limits its objective to explaining each
sample in an IDS dataset. These works also do not consider the type of audience when
building X-IDS.

B. Defining Tasks and Stakeholders

The second challenge is to define the tasks and the stakeholders of the X-IDS ecosystem.
After formalizing the definition of 'explainability' for X-IDS, we need to create explanations
tailored to the stakeholders. Fig. 8 presents a simple user and explanation taxonomy.
In this taxonomy, we consider three major stakeholders based on their roles: developers,
defense practitioners, and investors. Each stakeholder category necessitates a different
degree of explanation and visualization. Programmers and CSoC members are more familiar with
the field and may want more complex explanations. Investors, on the other hand, may be more
satisfied with summarized visualizations. Each user group also performs different tasks based
on the explanations: programmers will work to debug and increase the efficacy of the AI model,
CSoC members will be tasked with protecting investor assets, and investors will indirectly
need to make hiring or budgeting decisions. Future research is needed to determine the best
types of explanations for each user group.
Fig. 8: A simple taxonomy illustrating the importance of tailoring explanations to specific stakeholders based
on their roles in CSoCs.

C. Evaluation Metrics

Evaluating the explanations generated by the explainer module is the third challenge in
designing X-IDS. Finding the best explanation for each stakeholder category requires
customized evaluation metrics, and currently there is no consensus on metrics for
explanations. In Section V-B2, we recommended the evaluation metrics proposed by the authors
in [80] to evaluate explanations for X-IDS.

Another notable work that could serve as a baseline for evaluating explanations is the
psychological model of explanation created by the Florida Institute for Human and Machine
Cognition (IHMC) [24], illustrated in Fig. 9. The user receives an explanation from the XAI
model. This explanation can be tested for "goodness" and for the satisfaction of the
user/stakeholder. The user then revises their mental model of the XAI system, and their
understanding of the system can be tested. Tasks are performed based on the explanation. The
IHMC model thus merges the purpose of the XAI model with the task and mindset of the user.
Fig. 9: Different categories for assessing the effectiveness of explanations in the IHMC
psychological model with detailed explanation process [24].

D. Adversarial AI

Adversarial AI refers to the use of artificial intelligence for malicious purposes, including
attacks on other artificial intelligence systems to evade detection [163] or to poison data
[164]. Malicious actors can potentially attack the classifiers used to generate predictions
and cause misclassification. In the context of X-IDS, the explanations generated by the
explainer module may become a new point of attack: attackers may add, delete, or modify
explanations to evade detection [165], or attack the training datasets to alter the explainer's
behavior. The methods and effects of these attacks need to be explored, and defense techniques
must be created to correct attacked explanations [166–168].
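As a toy illustration of an evasion probe (reusing the random forest and test split from the earlier sketches, and ignoring the domain constraints a real attacker would face), the snippet below perturbs a single flow and checks whether the prediction, and hence its explanation, would change:

```python
import numpy as np

rng = np.random.default_rng(0)

# Take one test flow and nudge its features with small random noise; if the label
# flips, that flow sits near a fragile decision boundary an attacker could exploit.
x = np.asarray(X_test)[0]
original = forest.predict(x.reshape(1, -1))[0]

perturbed = x + rng.normal(scale=0.1, size=x.shape)
flipped = forest.predict(perturbed.reshape(1, -1))[0]

if flipped != original:
    print("a small perturbation changed the prediction -> potential evasion point")
```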

E. Misleading/Incorrect Explanations

An explanation does not have to be attacked to be misleading. The explanation itself may be
misleading, or the user may interpret the explanation incorrectly. This may lead to
circumstances where the model is correct and the user is the problem; the explainer will need
to be modified to prevent user error in such situations.
Explanations that are incorrect, whether because of an attack or because of poor-quality data,
can have a significant negative impact on CSoCs. CSoC security analysts should therefore
always critically analyze the reasoning behind a prediction. Moreover, methods for auditing
previously incorrect explanations should be created. Ideally, the X-IDS should be able to
audit itself and generate explanations for the audit.

F. Scalability and Performance

Performance is of utmost importance for an IDS, since CSoCs can incur losses for lost time.
Explanations should not needlessly slow down an IDS, so how do we optimize an X-IDS?
The explainer could generate explanations for every sample it sees, or it could strategically
choose which samples to explain. A comprehensive analysis of the CPU, RAM, and disk usage of
current and future explainers should also be carried out.
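One way to keep explanation cost under control, sketched below under the assumption that the forest model and SHAP explainer from the earlier sketches are available, is to explain only low-confidence flows rather than every sample; the 0.9 threshold is illustrative.

```python
import numpy as np

proba = forest.predict_proba(X_test)
confidence = proba.max(axis=1)

# Explain only uncertain flows instead of every sample; skip confident traffic.
to_explain = np.where(confidence < 0.9)[0]
print(f"explaining {len(to_explain)} of {len(X_test)} flows")
shap_values_subset = explainer.shap_values(np.asarray(X_test)[to_explain])
```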

VII. Conclusion

The exponential growth of cyber networks and the myriad applications that run on them
have made CSoCs, cyber-physical systems, and critical infrastructure vulnerable to
cyber-attacks. Securing these domains and their resources through the use of defense tools
such as IDS is critical to combating and resolving this issue [10], [11].

Recent AI-based IDS research has demonstrated unprecedented prediction accuracy,
which is helping to drive its widespread adoption across industry. CSoC analysts
rely largely on the results of these models to make their decisions. However, in most cases,
decision-making is impaired simply because these opaque models fail to justify their
predicted outcomes. A systematic review of current state-of-the-art research on 'XAI' or
'explainability' highlighted some key challenges in this domain, such as the lack of
consensus surrounding the definition of 'explainability', the need to formalize
explainability from the user's perspective, and the lack of metrics to evaluate explanations.
We propose a taxonomy to address this problem, with a focus on its relevance and applicability
to the domain of intrusion detection. Nevertheless, the field of IDS requires a high degree of
precision to prevent attacks and avoid false positives. Bearing this in mind, a black box
approach is recommended when developing an X-IDS solution.
In this paper, we presented in detail two distinct approaches found in the body of literature
which address the concern of 'explainability' in the IDS domain: the white box approach and
the black box approach. The white box approach makes the model in use inherently
interpretable, whereas the black box approach requires post-hoc explanation techniques to
make the predictions more interpretable. While the former may provide a more detailed
explanation to assist CSoC members in decision-making, its prediction performance is in
general outperformed by the latter.

We also propose a three-layered architecture for the design of an X-IDS based on the DARPA
recommended architecture [24] for the design of XAI systems. This architecture is
sufficiently generic to support a wide variety of scenarios and applications without being
bound to a particular specification or technological solution. Finally, we provide research
recommendations to researchers interested in developing X-IDS.

References

[1] Donghwoon Kwon, Hyunjoo Kim, Jinoh Kim, Sang C Suh, Ikkyun Kim, and Kuinam J Kim. A survey of
deep learning-based network anomaly detection. Cluster Computing, 22(1):949–961, 2019.
[2] Siddharth Sridhar, Adam Hahn, and Manimaran Govindarasu. Cyber– physical system security for the
electric power grid. Proceedings of the IEEE, 100(1):210–224, 2011.
[3] Ragunathan Rajkumar, Insup Lee, Lui Sha, and John Stankovic. Cyber- physical systems: the next
computing revolution. In Design automation conference, pages 731–736. IEEE, 2010.
[4] Alvaro Cardenas, Saurabh Amin, Bruno Sinopoli, Annarita Giani, Adrian Perrig, Shankar Sastry, et al.
Challenges for securing cyber physical systems. In Workshop on future directions in cyber-physical
systems security, volume 5. Citeseer, 2009.
[5] Mohammad Ahmadian and Dan C Marinescu. Information leakage in cloud data warehouses. IEEE
Transactions on Sustainable Computing, 5(2):192–203, 2018.
[6] Daniel L Marino, Chathurika S Wickramasinghe, and Milos Manic. An adversarial approach for
explainable ai in intrusion detection systems. In IECON 2018-44th Annual Conference of the IEEE
Industrial Electronics Society, pages 3237–3243. IEEE, 2018.
[7] Valeria Cardellini, Emiliano Casalicchio, Stefano Iannucci, Matteo Lu- cantonio, Sudip Mittal, Damodar
Panigrahi, and Andrea Silvi. An in- trusion response system utilizing deep q-networks and system
partitions. arXiv preprint arXiv:2202.08182, 2022.
[8] Ketki Sane, Karuna Pande Joshi, and Sudip Mittal. Semantically rich framework to automate cyber
insurance services. IEEE Transactions on Services Computing, 2021.
[9] Andrew McDole, Maanak Gupta, Mahmoud Abdelsalam, Sudip Mittal, and Mamoun Alazab. Deep
learning techniques for behavioural malware analysis in cloud iaas. In Malware Analysis using Artificial
Intelligence and Deep Learning. Springer, 2021.
[10] Syed Wali and Irfan Khan. Explainable ai and random forest based reliable intrusion detection system.
2021.
[11] Maonan Wang, Kangfeng Zheng, Yanqing Yang, and Xiujuan Wang. An explainable machine learning
framework for intrusion detection systems. IEEE Access, 8:73127–73141, 2020.
[12] Raytheon. Cyber security operations center (csoc), 2017.
[13] James P Anderson. Computer security threat monitoring and surveil- lance. Technical Report, James P.
Anderson Company, 1980.
[14] Dorothy E Denning. An intrusion detection model. IEEE Transactions on software engineering, (2):
222–232, 1987
[15] Rebecca Gurley Bace, Peter Mell, et al. Intrusion detection systems, 2001.
[16] Shelly Xiaonan Wu and Wolfgang Banzhaf. The use of computational intelligence in intrusion detection
systems: A review. Applied soft computing, 10(1):1–35, 2010.
[17] Biswanath Mukherjee, L Todd Heberlein, and Karl N Levitt. Network intrusion detection. IEEE
network, 8(3):26–41, 1994.
[18] Wenke Lee, Salvatore J Stolfo, and Kui W Mok. A data mining framework for building intrusion
detection models. In Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.
99CB36344), pages 120–132. IEEE, 1999.
[19] Anna L Buczak and Erhan Guven. A survey of data mining and machine learning methods for cyber
security intrusion detection. IEEE Communications surveys & tutorials, 18(2):1153–1176, 2015.
[20] Mustapha Belouch, Salah El Hadaj, and Mohamed Idhammad. Perfor- mance evaluation of intrusion
detection based on machine learning using apache spark. Procedia Computer Science, 127:1–6, 2018.
[21] Erza Aminanto and Kwangjo Kim. Deep learning in intrusion detection system: An overview. In 2016
International Research Conference on Engineering and Technology (2016 IRCET). Higher Education
Forum, 2016.
[22] Kwangjo Kim and Muhamad Erza Aminanto. Deep learning in intrusion detection perspective: Overview
and further challenges. In 2017 Interna- tional Workshop on Big Data and Information Security (IWBIS),
pages 5–10. IEEE, 2017.
[23] Feiyu Xu, Hans Uszkoreit, Yangzhou Du, Wei Fan, Dongyan Zhao, and Jun Zhu. Explainable ai: A brief
survey on history, research areas, approaches and challenges. In CCF international conference on natural
language processing and Chinese computing, pages 563–574. Springer, 2019.
[24] David Gunning and David Aha. Darpa’s explainable artificial intelligence (xai) program. AI Magazine,
40(2):44–58, 2019.
[25] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. Explain- able artificial intelligence:
Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296,
2017.
[26] Alaa Marshan. Artificial intelligence: Explainability, ethical issues and bias. Annals of Robotics and
Automation, 5(1):034–037, 2021.
[27] Richard A Berk and Justin Bleich. Statistical procedures for forecasting criminal behavior: A
comparative assessment. Criminology & Pub. Pol’y, 12:513, 2013.
[28] Quoc Phong Nguyen, Kar Wai Lim, Dinil Mon Divakaran, Kian Hsiang Low, and Mun Choon Chan.
Gee: A gradient-based explainable vari- ational autoencoder for network anomaly detection. 2019 IEEE
Con- ference on Communications and Network Security (CNS), pages 91–99, 2019.
[29] Zachary C Lipton. The mythos of model interpretability: In machine learning, the concept of
interpretability is both important and slippery. Queue, 16(3):31–57, 2018.
[30] Michael Van Lent, William Fisher, and Michael Mancuso. An explain- able artificial intelligence system
for small-unit tactical behavior. In Proceedings of the national conference on artificial intelligence, pages
900–907. Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999, 2004.
[31] DARPA. Broad agency announcement explainable artificial intelligence (xai). DARPA-BAA-16-53,
pages 7–8, 2016.
[32] Johanna D Moore and William R Swartout. Explanation in expert systemss: A survey. Technical report,
UNIVERSITY OF SOUTH- ERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES
INST, 1988.
[33] Amina Adadi and Mohammed Berrada. Peeking inside the black-box: a survey on explainable artificial
intelligence (xai). IEEE access, 6:52138– 52160, 2018.
[34] Jacob Haspiel, Na Du, Jill Meyerson, Lionel P Robert Jr, Dawn Tilbury, X Jessie Yang, and Anuj K
Pradhan. Explanations and expectations: Trust building in automated vehicles. In Companion of the
2018 ACM/IEEE international conference on human-robot interaction, pages 119–120, 2018.
[35] Maria Paz Sesmero Lorente, Elena Magán Lopez, Laura Alvarez Flo- rez, Agapito Ledezma Espino,
José Antonio Iglesias Martínez, and Araceli Sanchis de Miguel. Explaining deep learning-based driver
models. Applied Sciences, 11(8):3321, 2021.
[36] Yanfen Li, Hanxiang Wang, L Minh Dang, Tan N Nguyen, Dongil Han, Ahyun Lee, Insung Jang, and
Hyeonjoon Moon. A deep learning-based
[37] hybrid framework for object detection and recognition in autonomous driving. IEEE Access, 8:194228–
194239, 2020.
[38] Javier Martínez-Cebrián, Miguel-Ángel Fernández-Torres, and Fernando Díaz-De-María. Interpretable
global-local dynamics for the prediction of eye fixations in autonomous driving scenarios. IEEE Access,
8:217068– 217085, 2020.
[39] Thomas Ponn, Thomas Kröger, and Frank Diermeyer. Identification and explanation of challenging
conditions for camera-based object detection of automated vehicles. Sensors, 20(13):3699, 2020.
[40] Ashley Deeks. The judicial demand for explainable artificial intelligence. Columbia Law Review,
119(7):1829–1850, 2019.
[41] Octavio Loyola-González. Understanding the criminal behavior in mexico city through an explainable
artificial intelligence model. In Mexican International Conference on Artificial Intelligence, pages 136 –
149. Springer, 2019.
[42] Qiaoting Zhong, Xiuyi Fan, Xudong Luo, and Francesca Toni. An ex- plainable multi-attribute decision
model based on argumentation. Expert Systems with Applications, 117:42–61, 2019.
[43] Charlotte S Vlek, Henry Prakken, Silja Renooij, and Bart Verheij. A method for explaining bayesian
networks for legal evidence with scenar- ios. Artificial Intelligence and Law, 24(3):285–324, 2016.
[44] Andreas Holzinger, Chris Biemann, Constantinos S Pattichis, and Dou- glas B Kell. What do we need to
build explainable ai systems for the medical domain? arXiv preprint arXiv:1712.09923, 2017.
[45] Krishna Gade, Sahin Cem Geyik, Krishnaram Kenthapadi, Varun Mithal, and Ankur Taly. Explainable ai
in industry. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery
& data mining, pages 3203–3204, 2019.
[46] Sarah Itani, Fabian Lecron, and Philippe Fortemps. A one-class classi- fication decision tree based on
kernel density estimation. Applied Soft Computing, 91:106250, 2020.
[47] Leeanne Lindsay, Sonya Coleman, Dermot Kerr, Brian Taylor, and Anne Moorhead. Explainable
artificial intelligence for falls prediction. In International Conference on Advances in Computing and
Data Sciences, pages 76–84. Springer, 2020.
[48] Emmanuel Pintelas, Meletis Liaskos, Ioannis E Livieris, Sotiris Kot- siantis, and Panagiotis Pintelas.
Explainable machine learning framework for image classification problems: case study on glioma cancer
predic- tion. Journal of imaging, 6(6):37, 2020.
[49] Edi Prifti, Yann Chevaleyre, Blaise Hanczar, Eugeni Belda, Antoine Danchin, Karine Clément, and Jean-
Daniel Zucker. Interpretable and accurate prediction models for metagenomics data. GigaScience,
9(3):giaa010, 2020.
[50] Scott M Lundberg, Bala Nair, Monica S Vavilala, Mayumi Horibe, Michael J Eisses, Trevor Adams,
David E Liston, Daniel King-Wai Low, Shu-Fang Newman, Jerry Kim, et al. Explainable machine
learning predictions to help anesthesiologists prevent hypoxemia during surgery. bioRxiv, page 206540,
2017.
[51] Liang-Chin Huang, Wayland Yeung, Ye Wang, Huimin Cheng, Aarya Venkat, Sheng Li, Ping Ma,
Khaled Rasheed, and Natarajan Kan- nan. Quantitative structure–mutation–activity relationship tests
(qsmart) model for protein kinase inhibitor response prediction. BMC bioinfor- matics, 21(1):1–22, 2020.
[52] Augusto Anguita-Ruiz, Alberto Segura-Delgado, Rafael Alcalá, Concep- ción M Aguilera, and Jesús
Alcalá-Fdez. explainable artificial intelli- gence (xai) for the identification of biologically relevant gene
expression patterns in longitudinal human studies, insights from obesity research. PLoS computational
biology, 16(4):e1007792, 2020.
[53] Satya M Muddamsetty, Mohammad NS Jahromi, and Thomas B Moes- lund. Expert level evaluations for
explainable ai (xai) methods in the medical domain. In International Conference on Pattern Recognition,
pages 35–46. Springer, 2021.
[54] Mara Graziani, Vincent Andrearczyk, Stéphane Marchand-Maillet, and Henning Müller. Concept
attribution: Explaining cnn decisions to physi- cians. Computers in biology and medicine, 123:103865,
2020.
[55] Isabel Rio-Torto, Kelwin Fernandes, and Luís F Teixeira. Understanding the decisions of cnns: An in-
model approach. Pattern Recognition Letters, 133:373–380, 2020.
[56] Ye Eun Chun, Se Bin Kim, Ja Yun Lee, and Ji Hwan Woo. Study on credit rating model using
explainable ai. The Korean Data & Information Science Society, 32(2):283–295, 2021.
[57] Miseon Han and Jeongtae Kim. Joint banknote recognition and counterfeit detection using explainable
artificial intelligence. Sensors, 19(16):3607, 2019.
[58] Elvio Amparore, Alan Perotti, and Paolo Bajardi. To trust or not to trust an explanation: using leaf to
evaluate local linear xai methods. PeerJ Computer Science, 7:e479, 2021.
[59] Jasper van der Waa, Elisabeth Nieuwburg, Anita Cremers, and Mark Neerincx. Evaluating xai: A
comparison of rule-based and example- based explanations. Artificial Intelligence, 291:103404, 2021.
[60] Kacper Sokol and Peter Flach. Explainability fact sheets: a framework for systematic assessment of
explainable approaches. In Proceedings of the 2020 Conference on Fairness, Accountability, and
Transparency, pages 56–67, 2020.
[61] Tomasz Rutkowski, Krystian Łapa, and Radosław Nielek. On explainable fuzzy recommenders and
their performance evaluation. International Journal of Applied Mathematics and Computer Science,
29(3):595–610, 2019.
[62] Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, and Tat-Seng Chua.
Explainable reasoning over knowledge graphs for recommendation. In Proceedings of the AAAI
conference on artificial intelligence, volume 33, pages 5329–5336, 2019.
[63] Guoshuai Zhao, Hao Fu, Ruihua Song, Tetsuya Sakai, Zhongxia Chen, Xing Xie, and Xueming Qian.
Personalized reason generation for explainable song recommendation. ACM Transactions on Intelligent
Systems and Technology (TIST), 10(4):1–21, 2019.
[64] Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik,
Alberto Barbado, Salvador García, Ser- gio Gil-López, Daniel Molina, Richard Benjamins, et al.
Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and chal- lenges toward
responsible ai. Information Fusion, 58:82–115, 2020.
[65] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino
Pedreschi. A survey of methods for explaining black box models. ACM computing surveys (CSUR),
51(5):1–42, 2018.
[66] Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial
intelligence, 267:1–38, 2019.
[67] Krishna Gade, Sahin Geyik, Krishnaram Kenthapadi, Varun Mithal, and Ankur Taly. Explainable ai in
industry: Practical challenges and lessons learned. In Companion Proceedings of the Web Conference
2020, pages 303–304, 2020.
[68] Giulia Vilone and Luca Longo. Notions of explainability and evaluation approaches for explainable
artificial intelligence. Information Fusion, 76:89–106, 2021.
[69] Arun Das and Paul Rad. Opportunities and challenges in explainable artificial intelligence (xai): A
survey. arXiv preprint arXiv:2006.11371, 2020.
[70] Pantelis Linardatos, Vasilis Papastefanopoulos, and Sotiris Kotsiantis. Explainable ai: A review of
machine learning interpretability methods. Entropy, 23(1):18, 2021.
[71] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "why should i trust you?" explaining the
predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on
knowledge discovery and data mining, pages 1135–1144, 2016.
[72] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Anchors: High- precision model-agnostic
explanations. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018.
[73] Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J Tibshirani, and Larry Wasserman. Distribution-free
predictive inference for regression. Journal of the American Statistical Association, 113(523):1094–
1111, 2018.
[74] Scott M Lundberg and Su-In Lee. A unified approach to interpreting model predictions. Advances in
neural information processing systems, 30, 2017.
[75] Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. Learning important features through
propagating activation differences. In Interna- tional conference on machine learning, pages 3145–3153.
PMLR, 2017.
[76] Alexander Binder, Grégoire Montavon, Sebastian Lapuschkin, Klaus- Robert Müller, and Wojciech
Samek. Layer-wise relevance propagation for neural networks with local renormalization layers. In
International Conference on Artificial Neural Networks, pages 63–71. Springer, 2016.
[77] Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, et al.
Interpretability beyond feature attribution: Quan- titative testing with concept activation vectors (tcav).
In International conference on machine learning, pages 2668–2677. PMLR, 2018.
[78] Chengliang Yang, Anand Rangarajan, and Sanjay Ranka. Global model interpretation via recursive
partitioning. In 2018 IEEE 20th International Conference on High Performance Computing and
Communications; IEEE 16th International Conference on Smart City; IEEE 4th International
Conference on Data Science and Systems (HPCC/SmartCity/DSS), pages 1563–1570. IEEE, 2018.
[79] Marco A Valenzuela-Escárcega, Ajay Nagesh, and Mihai Surdeanu. Lightly-supervised representation
learning with global interpretability. arXiv preprint arXiv:1805.11545, 2018.
[80] Avi Rosenfeld and Ariella Richardson. Explainability in human–agent systems. Autonomous Agents and
Multi-Agent Systems, 33(6):673–705, 2019.
[81] Finale Doshi-Velez and Been Kim. Towards a rigorous science of interpretable machine learning. arXiv
preprint arXiv:1702.08608, 2017.
[82] Leilani H Gilpin, David Bau, Ben Z Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal.
Explaining explanations: An overview of interpretability of machine learning. In 2018 IEEE 5th
International Conference on data science and advanced analytics (DSAA), pages 80– 89. IEEE, 2018.
[83] Ismail Butun, Salvatore D Morgera, and Ravi Sankar. A survey of intrusion detection systems in wireless
sensor networks. IEEE commu- nications surveys & tutorials, 16(1):266–282, 2013.
[84] Ashu Sharma and Sanjay Kumar Sahay. Evolution and detection of polymorphic and metamorphic
malwares: A survey. arXiv preprint arXiv:1406.7061, 2014.
[85] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detec- tion: A survey. ACM computing
surveys (CSUR), 41(3):1–58, 2009.
[86] Octavio Loyola-Gonzalez. Black-box vs. white-box: Understanding their advantages and weaknesses
from a practical point of view. IEEE Access, 7:154096–154113, 2019.
[87] Jerome H Friedman. Greedy function approximation: a gradient boosting machine. Annals of statistics,
pages 1189–1232, 2001.
[88] Daniel W Apley and Jingyu Zhu. Visualizing the effects of predictor variables in black box supervised
learning models. Journal of the Royal Statistical Society: Series B (Statistical Methodology),
82(4):1059–1086, 2020.
[89] Alex Goldstein, Adam Kapelner, Justin Bleich, and Emil Pitkin. Peeking inside the black box:
Visualizing statistical learning with plots of indi- vidual conditional expectation. journal of
Computational and Graphical Statistics, 24(1):44–65, 2015.
[90] Jerome H Friedman and Bogdan E-Popescu. Predictive learning via rule ensembles. The annals of
applied statistics, 2(3):916–954, 2008.
[91] Vitali Petsiuk, Abir Das, and Kate Saenko. Rise: Randomized in- put sampling for explanation of
black-box models. arXiv preprint arXiv:1806.07421, 2018.
[92] Avanti Shrikumar, Peyton Greenside, Anna Shcherbina, and Anshul Kundaje. Not just a black box:
Learning important features through propagating activation differences. arXiv preprint
arXiv:1605.01713, 2016.
[93] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In
International conference on machine learning, pages 3319–3328. PMLR, 2017.
[94] Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakr- ishna Vedantam, Devi Parikh, and
Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In
Proceedings of the IEEE international conference on computer vision, pages 618–626, 2017.
[95] Grégoire Montavon, Sebastian Lapuschkin, Alexander Binder, Wojciech Samek, and Klaus-Robert
Müller. Explaining nonlinear classification decisions with deep taylor decomposition. Pattern
recognition, 65:211– 222, 2017.
[96] Sheikh Rabiul Islam, William Eberle, Sheikh K Ghafoor, Ambareen Siraj, and Mike Rogers. Domain
knowledge aided explainable artifi- cial intelligence for intrusion detection and response. arXiv preprint
arXiv:1911.09853, 2019.
[97] Mohanad Sarhan, Siamak Layeghy, and Marius Portmann. An explain- able machine learning-based
network intrusion detection system for en- abling generalisability in securing iot networks. ArXiv,
abs/2104.07183, 2021.
[98] Liat Antwarg, Bracha Shapira, and Lior Rokach. Explaining anomalies detected by autoencoders using
shap. ArXiv, abs/1903.02407, 2019.
[99] Izhar Ahmed Khan, Nour Moustafa, Dechang Pi, Karam M Sallam, Albert Y Zomaya, and Bentian Li. A
new explainable deep learning framework for cyber threat discovery in industrial iot networks. IEEE
Internet of Things Journal, 2021.
[100] Chunyuan Wu, Aijuan Qian, Xiaoju Dong, and Yanling Zhang. Feature- oriented design of visual
analytics system for interpretable deep learning based intrusion detection. In 2020 International
Symposium on Theoret- ical Aspects of Software Engineering (TASE), pages 73–80. IEEE, 2020.
[101] Nadia Burkart, Maximilian Franz, and Marco F Huber. Explanation framework for intrusion detection.
In Machine Learning for Cyber Physical Systems, pages 83–91. Springer Vieweg, Berlin, Heidelberg,
2021.
[102] Kasun Amarasinghe, Kevin Kenney, and Milos Manic. Toward ex- plainable deep neural network
based anomaly detection. In 2018 11th International Conference on Human System Interaction (HSI),
pages 311–317. IEEE, 2018.
[103] Jacob Kauffmann, Klaus-Robert Müller, and Grégoire Montavon. To- wards explaining anomalies: A
deep taylor decomposition of one-class models. ArXiv, abs/1805.06230, 2020.
[104] Mateusz Szczepan´ski, Michał Choras´, Marek Pawlicki, and Rafał Kozik. Achieving explainability of
intrusion detection system by hybrid oracle- explainer approach. In 2020 International Joint Conference
on Neural Networks (IJCNN), pages 1–8. IEEE, 2020.
[105] Guansong Pang, Choubo Ding, Chunhua Shen, and Anton van den Hengel. Explainable deep few-shot
anomaly detection with deviation networks. arXiv preprint arXiv:2108.00462, 2021.
[106] Leila Arras, Grégoire Montavon, Klaus-Robert Müller, and Wojciech Samek. Explaining recurrent
neural network predictions in sentiment analysis. arXiv preprint arXiv:1706.07206, 2017.
[107] Grégoire Montavon, Alexander Binder, Sebastian Lapuschkin, Wojciech Samek, and Klaus-Robert
Müller. Layer-wise relevance propagation: an overview. Explainable AI: interpreting, explaining and
visualizing deep learning, pages 193–209, 2019.
[108] Francesco Paolo Caforio, Giuseppina Andresini, Gennaro Vessio, Annal- isa Appice, and Donato
Malerba. Leveraging grad-cam to improve the accuracy of network intrusion detection systems. In DS,
2021.
[109] Jacob Kauffmann, Lukas Ruff, Grégoire Montavon, and Klaus-Robert Muller. The clever hans effect in
anomaly detection. ArXiv, abs/2006.10609, 2020.
[110] Lars Kai Hansen and Laura Rieger. Interpretability in intelligent systems–a new concept? In
Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pages 41–49. Springer, 2019.
[111] Emmanuel Pintelas, Ioannis E Livieris, and Panagiotis Pintelas. A grey-box ensemble model exploiting
black-box accuracy and white-box intrinsic interpretability. Algorithms, 13(1):17, 2020.
[112] Sheikh Rabiul Islam, William Eberle, Sheikh Khaled Ghafoor, and Mo- hiuddin Ahmed. Explainable
artificial intelligence approaches: A survey. arXiv preprint arXiv:2101.09429, 2021.
[113] Abraham Peedikayil Kuruvila, Xingyu Meng, Shamik Kundu, Gaurav Pandey, and Kanad Basu.
Explainable machine learning for intrusion detection via hardware performance counters. IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022.
[114] Basim Mahbooba, Mohan Timilsina, Radhya Sahal, and Martin Serrano. Explainable artificial
intelligence (xai) to enhance trust management in intrusion detection systems using decision tree model.
Complexity, 2021, 2021.
[115] Tiago Dias, Nuno Oliveira, Norberto Sousa, Isabel Praça, and Orlando Sousa. A hybrid approach for an
interpretable and explainable intrusion detection system. arXiv preprint arXiv:2111.10280, 2021.
[116] Camila FT Pontes, Manuela MC de Souza, João JC Gondim, Matt Bishop, and Marcelo Antonio
Marotta. A new method for flow-based network intrusion detection using the inverse potts model. IEEE
Trans- actions on Network and Service Management, 18(2):1125–1136, 2021.
[117] Chathurika S Wickramasinghe, Kasun Amarasinghe, Daniel L Marino, Craig Rieger, and Milos Manic.
Explainable unsupervised machine learning for cyber-physical systems. IEEE Access, 9:131824–
131843, 2021.
[118] Chet Langin, Michael Wainer, and Shahram Rahimi. Annabell island: a 3d color hexagonal som for
visual intrusion detection. Internation Journal of Computer Science and Information Security, 9(1):1–7,
2011.
[119] Basant Subba, Santosh Biswas, and Sushanta Karmakar. Intrusion de- tection systems using linear
discriminant analysis and logistic regression. In 2015 Annual IEEE India Conference (INDICON),
pages 1–6. IEEE, 2015.
[120] Ziyu Wang, Jiahai Yang, and Fuliang Li. A new anomaly detection method based on igte and igfe. In
International Conference on Security and Privacy in Communication Networks, pages 93–109. Springer,
2014.
[121] Ziyu Wang, Jiahai Yang, Zhang ShiZe, and Chenxi Li. Robust regression for anomaly detection. In
2017 IEEE International Conference on Communications (ICC), pages 1–6. IEEE, 2017
[122] J Ross Quinlan. C4. 5: programs for machine learning. Elsevier, 2014.
[123] Oded Z Maimon and Lior Rokach. Data mining with decision trees: theory and applications, volume
81. World scientific, 2014.
[124] Octavio Loyola-Gonzalez, Andres Eduardo Gutierrez-Rodríguez, Miguel Angel Medina-Pérez, Raul
Monroy, José Francisco Martínez- Trinidad, Jesús Ariel Carrasco-Ochoa, and Milton Garcia-Borroto.
An explainable artificial intelligence model for clustering numerical databases. IEEE Access, 8:52370–
52384, 2020.
[125] Nave Frost, Michal Moshkovitz, and Cyrus Rashtchian. Exkmc: Expand- ing explainable k-means
clustering. arXiv preprint arXiv:2006.02399, 2020.
[126] Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz, and Cyrus Rashtchian. Explainable k-means and k-
medians clustering. In Proceedings of the 37th International Conference on Machine Learning, Vienna,
Austria, pages 12–18, 2020.
[127] Ralf Colmar Staudemeyer. The importance of time: Modelling network intrusions with long short-term
memory recurrent neural networks. PhD thesis, UNIVERSITY OF THE WESTERN CAPE, 2012.
[128] Chris Sinclair, Lyn Pierce, and Sara Matzner. An application of machine learning to network intrusion
detection. In Proceedings 15th Annual Computer Security Applications Conference (ACSAC’99),
pages 371–377. IEEE, 1999.
[129] AA Ojugo, AO Eboka, OE Okonta, RE Yoro, and FO Aghware. Genetic algorithm rule-based intrusion
detection system (gaids). Journal of Emerging Trends in Computing and Information Sciences,
3(8):1182– 1194, 2012.
[130] Kriti Chadha and Sushma Jain. Hybrid genetic fuzzy rule based infer- ence engine to detect intrusion in
networks. In Intelligent Distributed Computing, pages 185–198. Springer, 2015.
[131] Martin Roesch et al. Snort: Lightweight intrusion detection for networks. In Lisa, volume 99, pages
229–238, 1999.
[132] Brian Caswell, Jay Beale, and Andrew Baker. Snort intrusion detection and prevention toolkit.
Syngress, 2007.
[133] A Qayyum, MH Islam, and M Jamil. Taxonomy of statistical based anomaly detection techniques for
intrusion detection. In Proceedings of the IEEE Symposium on Emerging Technologies, 2005., pages
270–276. IEEE, 2005.
[134] VVRPV Jyothsna, Rama Prasad, and K Munivara Prasad. A review of anomaly based intrusion
detection systems. International Journal of Computer Applications, 28(7):26–35, 2011.
[135] Elike Hodo, Xavier Bellekens, Andrew Hamilton, Christos Tachtatzis, and Robert Atkinson. Shallow
and deep networks intrusion detection system: A taxonomy and survey. arXiv preprint
arXiv:1701.02145, 2017.
[136] Ansam Khraisat, Iqbal Gondal, Peter Vamplew, and Joarder Kamruzza- man. Survey of intrusion
detection systems: techniques, datasets and challenges. Cybersecurity, 2(1):1–22, 2019.
[137] Monowar H Bhuyan, Dhruba Kumar Bhattacharyya, and Jugal K Kalita. Network anomaly detection:
methods, systems and tools. Ieee communi- cations surveys & tutorials, 16(1):303–336, 2013.
[138] An Trung Tran. Network anomaly detection. Future Internet (FI) and Innovative Internet Technologies
and Mobile Communication (IITM) Focal Topic: Advanced Persistent Threats, 55, 2017.
[139] Manasi Gyanchandani, JL Rana, and RN Yadav. Taxonomy of anomaly based intrusion detection
system: a review. International Journal of Scientific and Research Publications, 2(12):1–13, 2012.
[140] Ayesha Binte Ashfaq, Mobin Javed, Syed Ali Khayam, and Hayder Radha. An information-theoretic
combining method for multi-classifier anomaly detection systems. In 2010 IEEE International
Conference on Communications, pages 1–5. IEEE, 2010.
[141] Wenyao Sha, Yongxin Zhu, Tian Huang, Meikang Qiu, Yan Zhu, and Qiannan Zhang. A multi-order
markov chain based scheme for anomaly detection. In 2013 IEEE 37th Annual Computer Software and
Applica- tions Conference Workshops, pages 83–88. IEEE, 2013.
[142] Nong Ye et al. A markov chain model of temporal behavior for anomaly detection. In Proceedings of
the 2000 IEEE Systems, Man, and Cybernetics Information Assurance and Security Workshop, volume
166, page 169. Citeseer, 2000.
[143] Teuvo Kohonen, Erkki Oja, Olli Simula, Ari Visa, and Jari Kangas. Engineering applications of the
self-organizing map. Proceedings of the IEEE, 84(10):1358–1384, 1996.
[144] Albert J Hoglund, Kimmo Hatonen, and Antti S Sorvari. A computer host-based user anomaly
detection system using the self-organizing map. In Proceedings of the IEEE-INNS-ENNS International
Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and
Perspectives for the New Millennium, volume 5, pages 411–416. IEEE, 2000.
[145] Peter Lichodzijewski, A Nur Zincir-Heywood, and Malcolm I Heywood. Host-based intrusion
detection using self-organizing maps. In Proceed- ings of the 2002 International Joint Conference on
Neural Networks. IJCNN’02 (Cat. No. 02CH37290), volume 2, pages 1714–1719. IEEE, 2002.
[146] H Gunes Kayacik, A Nur Zincir-Heywood, and Malcolm I Heywood. A hierarchical som-based
intrusion detection system. Engineering applica- tions of artificial intelligence, 20(4):439–451, 2007.
[147] Chet Langin, Hongbo Zhou, and Shahram Rahimi. A model to use denied internet traffic to indirectly
discover internal network security problems. In 2008 IEEE International Performance, Computing and
Communications Conference, pages 486–490. IEEE, 2008.
[148] Chet Langin, Hongbo Zhou, Shahram Rahimi, Bidyut Gupta, Mehdi Zargham, and Mohammad R
Sayeh. A self-organizing map and its modeling for discovering malignant network traffic. In 2009 IEEE
symposium on computational intelligence in Cyber Security, pages 122–129. IEEE, 2009.
[149] Kasun Amarasinghe and Milos Manic. Improving user trust on deep neural networks based intrusion
detection systems. In IECON 2018- 44th Annual Conference of the IEEE Industrial Electronics
Society, pages 3262–3268. IEEE, 2018.
[150] Jin Kim, Nara Shin, Seung Yeon Jo, and Sang Hyun Kim. Method of in- trusion detection using deep
neural network. In 2017 IEEE international conference on big data and smart computing (BigComp),
pages 313–316. IEEE, 2017.
[151] Maximilian Sölch. Detecting anomalies in robot time series data using stochastic recurrent networks,
2015.
[152] Chuanlong Yin, Yuefei Zhu, Jinlong Fei, and Xinzheng He. A deep learning approach for intrusion
detection using recurrent neural networks. Ieee Access, 5:21954–21961, 2017.
[153] Meliboev Azizjon, Alikhanov Jumabek, and Wooseong Kim. 1d cnn based network intrusion
detection with normalization on imbalanced data. In 2020 International Conference on Artificial
Intelligence in Information and Communication (ICAIIC), pages 218–224. IEEE, 2020.
[154] R Vinayakumar, KP Soman, and Prabaharan Poornachandran. Applying convolutional neural network
for network intrusion detection. In 2017 International Conference on Advances in Computing,
Communications and Informatics (ICACCI), pages 1222–1228. IEEE, 2017.
[155] Andreas Holzinger. From machine learning to explainable ai. In 2018 world symposium on digital
intelligence for systems and machines (DISA), pages 55–66. IEEE, 2018.
[156] Hong Liu, Chen Zhong, Awny Alnusair, and Sheikh Rabiul Islam. Faixid: a framework for enhancing
ai explainability of intrusion detection results using data cleaning techniques. Journal of Network and
Systems Man- agement, 29(4):1–30, 2021.
[157] Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A Ghorbani. A detailed analysis of the kdd cup
99 data set. In 2009 IEEE symposium on computational intelligence for security and defense
applications, pages 1–6. Ieee, 2009.
[158] KDD Cup 1999 Data the uci kdd archive. http://kdd.ics.uci.edu/ databases/kddcup99/kddcup99.html,
1999. Accessed: 2022-04-09.
[159] Iman Sharafaldin, Arash Habibi Lashkari, and Ali A Ghorbani. Toward generating a new intrusion
detection dataset and intrusion traffic charac- terization. ICISSp, 1:108–116, 2018.
[160] Gabriel Maciá-Fernández, José Camacho, Roberto Magán-Carrión, Pe- dro García-Teodoro, and
Roberto Therón. Ugr ‘16: A new dataset for the evaluation of cyclostationarity-based network idss.
Computers & Security, 73:411–424, 2018.
[161] Samina Khalid, Tehmina Khalil, and Shamila Nasreen. A survey of feature selection and feature
extraction techniques in machine learning. In 2014 science and information conference, pages 372–378.
IEEE, 2014.
[162] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion,
[163] O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Van- derplas, A. Passos, D.
Cournapeau, M. Brucher, M. Perrot, and E. Duch- esnay. Scikit-learn: Machine learning in Python.
Journal of Machine Learning Research, 12:2825–2830, 2011.
[164] Karl Pearson. Liii. on lines and planes of closest fit to systems of points in space. The London,
Edinburgh, and Dublin philosophical magazine and journal of science, 2(11):559–572, 1901.
[165] Hyrum S Anderson, Jonathan Woodbridge, and Bobby Filar. Deepdga: Adversarially-tuned domain
generation and detection. In Proceedings of the 2016 ACM Workshop on Artificial Intelligence and
Security, pages 13–21, 2016.
[166] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. Targeted backdoor attacks on deep
learning systems using data poisoning. arXiv preprint arXiv:1712.05526, 2017.
[167] Nidhi Rastogi, Sara Rampazzi, Michael Clifford, Miriam Heller, Matthew Bishop, and Karl Levitt.
Explaining radar features for detecting spoofing attacks in connected autonomous vehicles. arXiv preprint
arXiv:2203.00150, 2022.
[168] Dongqi Han, Zhiliang Wang, Ying Zhong, Wenqi Chen, Jiahai Yang, Shuqiang Lu, Xingang Shi, and
Xia Yin. Evaluating and improving ad- versarial robustness of machine learning-based network intrusion
detec- tors. IEEE Journal on Selected Areas in Communications, 39(8):2632– 2647, 2021.
[169] Marek Pawlicki, Michał Choras´, and Rafał Kozik. Defending network intrusion detection systems
against adversarial evasion attacks. Future Generation Computer Systems, 110:148–154, 2020.
[170] Alexander Hartl, Maximilian Bachl, Joachim Fabini, and Tanja Zseby. Explainability and adversarial
robustness for rnns. In 2020 IEEE Sixth International Conference on Big Data Computing Service and
Applica- tions (BigDataService), pages 148–156. IEEE, 2020.
