Middle-Level Explanations in XAI
Knowledge-Based Systems
journal homepage: [Link]/locate/knosys
Article history: Received 24 February 2022; Received in revised form 6 August 2022; Accepted 15 August 2022; Available online 23 August 2022.

Keywords: XAI; Explainable AI; Hierarchical; Middle-level; Interpretable models

Abstract. A central issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI) is to provide methods to give explanations for the behaviours of non-interpretable Machine Learning (ML) models after training. Recently, it has become more and more evident that new directions to create better explanations should take into account what a good explanation is to a human user. This paper suggests taking advantage of developing an XAI framework that allows producing multiple explanations for the response of an image classification system in terms of potentially different middle-level input features. To this end, we propose an XAI framework able to construct explanations in terms of input features extracted by auto-encoders. We start from the hypothesis that some auto-encoders, relying on standard data representation approaches, could extract input properties that are more salient and understandable to a user than raw low-level features; we call these properties Middle-Level input Features (MLFs). Furthermore, by extracting different types of MLFs through different types of auto-encoders, different types of explanations for the same ML system behaviour can be returned. We experimentally tested our method on two different image datasets using three different types of MLFs. The results are encouraging. Although our novel approach was tested in the context of image classification, it can potentially be used on other data types to the extent that auto-encoders extracting humanly understandable representations can be applied.

© 2022 Elsevier B.V. All rights reserved.
A. Apicella, S. Giugliano, F. Isgrò et al. Knowledge-Based Systems 255 (2022) 109725
Fig. 1. Examples of Middle-Level input Features (MLFs). Each MLF represents a part of the input which is perceptually and cognitively salient to a human being, as for example the ears of a cat or the wings of an airplane. These features are intuitively more humanly interpretable than low-level features (such as raw, unrelated image pixels), so a decision explanation expressed in terms of MLF relevance can be easier for a human being to understand than an explanation expressed in terms of low-level features.
needs to identify the overall input properties perceptually and cognitively salient to him [7]. Thus, an XAI approach should alleviate this weakness of low-level approaches and overcome their limitations, allowing the possibility to construct explanations in terms of input features that represent input properties more salient and understandable for a user, which we call here Middle-Level input Features (MLFs) (see Fig. 1). Although there is a recent research line which attempts to give explanations in terms of visual human-friendly concepts [12–14] (we will discuss them in Section 2), we notice that the goal of learning data representations that are easily factorised in terms of meaningful features is, in general, pursued in the representation learning framework [15], and more recently in the feature disentanglement learning context [16]. These meaningful features may represent parts of the input, such as nose, ears and paw in the case of, for example, face recognition tasks (similarly to the outcome of a clustering algorithm), or more abstract input properties such as shape, viewpoint, thickness, and so on, leading to data representations perceptually and cognitively salient to the human being. Based on these considerations, in this paper we propose to develop an XAI approach able to give explanations for an image classification system in terms of features obtained by standard representation learning methods such as variational auto-encoders [17] and hierarchical image segmentation [18]. In particular, we exploit middle-level data representations obtained by auto-encoder methods [19] to provide explanations of image classification systems. In this context, in an earlier work [20] we proposed an initial experimental investigation of this type of explanation, exploiting the hierarchical organisation of the data in terms of more elementary factors. For example, natural images can be described in terms of the objects they show at various levels of granularity [21–23]. Similarly, in [24] a hierarchical prototype-based approach for classification is proposed; this method has a certain degree of intrinsic transparency, but it does not fall into the post-hoc explainability category.

To the best of our knowledge, however, there are relatively few approaches in the XAI literature that pursue this line of research. In [25], the authors proposed LIME, a successful XAI method which is based, in the case of image classification problems, on explanations expressed as sets of regions, clusters of the image, called superpixels, which are obtained by a clustering algorithm. These superpixels can be interpreted as MLFs. In [7,26] the explanations are formed of elements selected from a dictionary of MLFs, obtained by sparse dictionary learning methods [27]. In [28] the authors propose to exploit the latent representations learned through an adversarial auto-encoder for generating a synthetic neighbourhood of the image for which an explanation is required. However, these approaches propose specific solutions which cannot be generalised to different types of input properties. By contrast, in this paper we investigate the possibility of obtaining explanations using an approach that can be applied to different types of MLFs, which we call General MLF Explanations (GMLF). More precisely, we develop an XAI framework that can be applied whenever (a) the input of an ML system can be encoded and decoded based on MLFs, and (b) any Explanation method producing a Relevance Map (ERM method) can be applied on both the ML model and the decoder. In this sense, we propose a general framework insofar as it can be applied to several different computational definitions of MLFs and a large class of ML models. Consequently, we can provide multiple and different explanations based on different MLFs. In particular, in this work we tested our novel approach in the context of image classification using MLFs extracted by three different methods: (1) image segmentation by auto-encoders, (2) hierarchical image segmentation by auto-encoders, and (3) variational auto-encoders. Regarding points (1) and (2), a simple method to represent the output of a segmentation algorithm in terms of an encoder–decoder is reported. However, this approach can be used on a wide range of different data types, to the extent that encoder–decoder methods can be applied. Thus, the medium- or long-term objective of this research work is to develop a general XAI approach producing explanations for an ML system behaviour in terms of potentially different and user-selected input features, composed of input properties which the human user can select according to his background knowledge and goals. This aspect can play a key role in developing user-centred explanations. It is essential to note that, in making an explanation understandable for a user, it should be taken into account what information the user desires to receive [2,12,29,30]. Recently, it has become more and more evident that new directions to create better explanations should take into
account what a good explanation is for a human user, and consequently to develop XAI solutions able to provide user-centred explanations [2,12,14,31,32]. By contrast, many of the current XAI methods provide specific ways to build explanations that are based on the researchers' intuition of what constitutes a ''good'' explanation [31,33].

To summarise, in this paper the following novelties are presented:

1. an XAI framework where middle-level or high-level input properties can be built exploiting standard methods of data representation learning is proposed;
2. our framework can be applied to several different computational definitions of middle-level or high-level input properties and to a large class of ML models; consequently, multiple and different explanations based on different middle-level input properties can be provided for a given input–ML system response;
3. the middle-level or high-level input properties are computed independently from the ML classifier to be explained.

The paper is organised as follows: Section 3 describes the proposed approach in detail; in Section 2 we discuss differences and advantages of GMLF with respect to similar approaches presented in the literature; experiments and results are discussed in Section 5. In particular, we compared our approach with the LIME method and performed both qualitative and quantitative evaluations of the results; the concluding section summarises the main high-level features of the proposed explanation framework and outlines some future developments.

2. Related works

The importance of eXplainable Artificial Intelligence (XAI) is discussed in several papers [33–36]. Different strategies have been proposed to face the explainability problem, depending both on the AI system to explain and on the type of explanation proposed. Among all the XAI works proposed over the last years, an important distinction is between model-based and post-hoc explainability [37]: the former consists in AI systems explainable by design (e.g., decision trees), since their inner mechanisms are easily interpreted; the latter proposes explanations built for systems that are not easy to understand. In particular, several methods to explain Deep Neural Networks (DNNs) have been proposed in the literature, due to the high complexity of their inner structures. In a nutshell, DNNs are computational architectures organised as several consecutive layers of elementary computing units, called neurons. Each neuron i belonging to a layer l performs a two-step computation (see [38], chapter 4): the neuron input a_i^l is computed first, based on real values, called weights, associated with connections coming from neurons belonging to other layers, and on a bias value associated with the neuron i. Then, the neuron output is computed by an activation function f(·), i.e., f(a_i^l) (see [39] for a review). The flow of computation proceeds from the first hidden layer to the output layer in a forward-propagation fashion. A very common approach to explain the behaviours of a DNN consists in returning visual-based explanations in terms of input feature importance scores, as for example Activation Maximization (AM) [40], Layer-Wise Relevance Propagation (LRP) [6], Deep Taylor Decomposition [41,42], Class Activation Mapping (CAM) methods [43,44], Deconvolutional Networks [45] and Up-convolutional networks [46,47]. Although heatmaps seem to be a type of explanation that is easy for the user to understand, these methods assign relevances to the low-level input features (the single pixels), while the middle-level input properties which determined the answer of the classifier have to be located and interpreted by the user, leaving much of the interpretive work to human beings. On the other side, methods such as Local Interpretable Model-agnostic Explanations (LIME) [25] rely on feature partitions, such as superpixels in the image case. However, the explanations given by LIME (or its variants) are built through a new model that approximates the original one, thus risking losing the real reasons behind the behaviour of the original model [2].

Recently, a growing number of studies [12–14,48] have focused on providing explanations in the form of middle-level or high-level human ''concepts'', as we are addressing in this paper.

In particular, in [12] the authors introduce Concept Activation Vectors (CAVs) as a way of visually representing the neural network's inner states associated with a given class. CAVs should represent human-friendly concepts. The basic idea can be described as follows: firstly, the authors suppose the availability of an external labelled dataset XC where each label corresponds to a human-friendly concept. Then, given a pre-trained neural network classifier to be explained, say NC, they consider the functional mapping fl from the input to the l-th layer of NC. Based on fl, for each class c of the dataset XC, they build a classifier composed of fl followed by a linear classifier to distinguish the elements of XC belonging to the class c from randomly chosen images. The normal to the learned hyperplane is considered the CAV for the user-defined concept corresponding to the class c. Finally, given all the inputs belonging to a class K of the pre-trained classifier NC, the authors define a way to quantify how much a concept c, expressed by a CAV, influences the behaviour of the classifier, using directional derivatives to compute NC's conceptual sensitivity across the entire class K of inputs.

Building upon the paper discussed above, in [14] the authors provide explanations in terms of fault-lines [49]. Fault-lines should represent ''high-level semantic aspects of reality on which humans zoom in when imagining an alternative to it''. Each fault-line is represented by a minimal set of semantic xconcepts that need to be added to or deleted from the classifier's input to alter the class that the classifier outputs. Xconcepts are built following the method proposed in [12]. In a nutshell, given a pre-trained convolutional neural network CN whose behaviour is to be explained, xconcepts are defined in terms of superpixels (images or parts of images) related to the feature maps of the l-th convolutional layer of CN, usually the last convolutional layer before the fully-connected layers. In particular, these superpixels are collected when the input representations at the convolutional layer l are used to discriminate between a target class c and an alternate class calt, and they are computed based on the Grad-CAM algorithm [44]. In this way, one obtains xconcepts in terms of images related to the class c and able to distinguish it from the class calt. Thus, when the classifier CN responds that an input x belongs to a class c, the authors provide an explanation in terms of xconcepts, which should represent semantic aspects of why x belongs to c instead of to an alternate class calt.

In [13] the authors propose a method to provide explanations related to an entire class of a trained neural classifier. The method is based on the CAVs introduced in [12] and sketched above. However, in this case, the CAVs are automatically extracted without the need for an external labelled dataset expressing human-friendly concepts.

Many of the approaches discussed so far focus on global explanations, i.e., explanations related to an entire class of the trained neural network classifier (see [12,13]). Instead, in our approach, we are looking for local explanations, i.e., explanations for the response of the ML model to each single input. Some authors, see for example [12], provide methods to obtain local explanations, but in this case the explanations are expressed in terms of high-level visual concepts which do not necessarily belong to the input. Thus, again, human users are left with a significant interpretive load: starting from external high-level visual concepts, the human user needs to identify the input properties perceptually and
cognitively related to these concepts. On the contrary, in our approach the high-level input properties (MLFs) are expressed in terms of elements of the input itself.

Another critical point is that high-level or middle-level user-friendly concepts are computed on the basis of the neural network classifier to be explained. In this way, a short-circuit can be created in which the visual concepts used to explain the classifier are closely related to the classifier itself. By contrast, in our approach, MLFs are extracted independently from the classifier.

A crucial aspect that distinguishes our proposal from the above-discussed research line is grounded on the fact that we propose an XAI framework able to provide multiple explanations, each one composed of a specific type of middle-level input features (MLFs). Our methodology only requires that MLFs can be obtained using methods framed in data representation research and, in particular, any auto-encoder architecture for which an explanation method producing a relevance map can be applied on the decoder (see Section 3.1).

To summarise, our GMLF approach, although it shares with the above-described research works the idea of obtaining explanations based on middle-level or high-level human-friendly concepts, presents the following elements of novelty:

1. It is an XAI framework where middle-level or high-level input properties can be built on the basis of standard methods of data representation learning.
2. It outputs local explanations.
3. The middle-level or high-level input properties are computed independently from the ML classifier to be explained.

Regarding points (2) and (3), we notice that an XAI method that has significant similarity with our approach is LIME [25] or its variants (see, for example, [50]). LIME, especially in the context of images, is one of the predominant XAI methods discussed in the literature [50,51]. It can provide local explanations in terms of superpixels, which are regions or parts of the input that the classifier receives, as we have already discussed in Section 1. These superpixels can be interpreted as middle-level input properties, which can be more understandable for a human user than low-level features such as pixels. In this sense, we see a similarity in the output between our approach GMLF and LIME. The explanations built by LIME can be considered comparable with those of our proposed approach, but different in the construction process. While LIME builds explanations relying on a proxy model different from the model to be explained, the proposed approach relies only on the model to be explained, without needing any other model that approximates the original one. To highlight the difference between the produced explanations, a comparison between LIME and GMLF outputs is made in Section 4.

3. Approach

Our approach stems from the following observations. The development of data representations from raw low-level data usually aims to obtain distinctive explanatory features of the data, which are more conducive to subsequent data analysis and interpretation. This critical step has long been tackled using specific methods developed exploiting expert domain knowledge. However, this type of approach can lead to unsuccessful results and requires a lot of heuristic experience and complex manual design [52]. This aspect is similar to what commonly occurs in many XAI approaches, where the explanatory methods are based on the researchers' intuition of what constitutes a ''good'' explanation.

By contrast, representation learning successfully investigates ways to obtain middle/high-level abstract feature representations by automatic machine learning approaches. In particular, a large part of these approaches is based on Auto-Encoder (AE) architectures [19,52]. AEs are neural networks composed of at least one hidden layer and logically divided into two components, an encoder and a decoder. From a functional point of view, an AE can be seen as the composition of two functions E and D: E is an encoding function (the encoder) which maps the input space onto a feature space (or latent encoding space), and D is a decoding function (the decoder) which inversely maps the feature space onto the input space. A meaningful aspect is that, by AEs, one can obtain data representations in terms of latent encodings h, where each hi may represent an MLF ξi of the input, such as parts of the input (for example, nose, ears and paw) or more abstract features which can be more salient and understandable input properties for a user; see for example variational AEs [53–55] or image segmentation [56–59] (see Fig. 1). Furthermore, different AEs can extract different data representations which are not mutually exclusive.

Based on the previous considerations, we want to build upon the idea that the elements composing an explanation can be determined by an AE which extracts input features relevant for a human being, i.e., MLFs, and that one might change the type of MLFs by changing the type of auto-encoder, or obtain multiple and different explanations based on different MLFs.

3.1. General description

Given an ML classification model M which receives an input x ∈ R^d and outputs y ∈ R^c, our approach can be divided into two consecutive steps.

In the first step, we build an auto-encoder AE ≡ (E, D) such that each input x can be encoded by E in a latent encoding h ∈ R^m and decoded by D. As discussed above, to each value hi is associated an MLF ξi; thus, each input x is decomposed into a set of m MLFs ξ = {ξi}, i = 1, ..., m, where to each ξi is associated the value hi. Different choices of the auto-encoder can lead to MLFs ξi of different nature, so to highlight this dependence we re-formalise this first step as follows: we build an encoder Eξ : x ∈ R^d → h ∈ R^m and a decoder Dξ : h ∈ R^m → x ∈ R^d, where h encodes x in terms of the MLFs ξ.

In the second step of our approach, we use an ERM method (an explanation method producing a relevance map of the input) on both M and Dξ, i.e., we apply it on the model M and then use the obtained relevance values to apply the ERM method on Dξ, getting a relevance value for each middle-level feature. In other words, we stack Dξ on top of M, thus obtaining a new model DMξ which receives as input an encoding h and outputs y, and we use an ERM method on DMξ from y to h. In Fig. 2 we give a graphic description of our approach GMLF; in Algorithm 1 it is described in more detail considering a generic auto-encoder, while in Algorithms 3 and 4 our approach (GMLF) is described in the case of specific auto-encoders (see Sections 3.2 and 3.3).

Thus, we search for a relevance vector u ∈ R^m which informs the user how much each MLF of ξ has contributed to the ML model answer y. Note that GMLF can be generalised to any decoder Dξ to which an ERM method can be applied. In this way, one can build different explanations for an M's response in terms of different MLFs ξ.

In the remainder of this section, we will describe three alternative ways (segmentation, hierarchical segmentation and VAE) to obtain a decoder to which an ERM method can be applied, and so three ways of applying our approach GMLF. We experimentally tested our framework using all the methods.
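As an illustration of the two steps just described, the following minimal sketch uses linear stand-ins for the classifier M and the decoder Dξ, and plain gradient × input in place of the LRP-style ERM method used in the paper (the toy dimensions and all variable names are ours, not the authors'):

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, c = 4, 16, 3                 # n. of MLFs, input size, n. of classes
D = rng.normal(size=(d, m))        # linear stand-in for the decoder D_xi
W = rng.normal(size=(c, d))        # linear stand-in for the classifier M

# Step 1: encode the input in terms of MLFs (here h is given directly).
h = rng.normal(size=m)
x = D @ h                          # decoded input
y = W @ x                          # classifier response

# Step 2: stack D_xi under M (DM_xi: h -> y) and propagate relevance
# from the winning class back to the encoding via gradient * input.
k = int(np.argmax(y))
grad_h = (W @ D)[k]                # d y_k / d h of the stacked model DM_xi
u = grad_h * h                     # relevance vector u: one score per MLF
```

For this linear stack, the MLF relevances are conservative: `u.sum()` equals the explained class score `y[k]`, a property that the LRP rules employed by the authors are also designed to approximate.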
Fig. 3. A segmentation-based MLF framework. The MLF decoder is built as a neural network having as weights the segments returned by a hierarchical segmentation algorithm (see text for further details). The initial encoding is the ''1'' vector, since all the segments are used to compose the input image. The relevance backward algorithm returns the most relevant segments.
Fig. 4. A VAE-based MLF framework. The MLF decoder is built as a neural network composed of the VAE decoder module followed by a fully-connected layer containing the residual of the input (see text for further details). The initial input encoding is given by the VAE encoder module. The relevance backward algorithm returns the most relevant latent variables.
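The segmentation-based decoder of Fig. 3 can be sketched as follows (an illustrative reconstruction under our own naming, not the authors' code): each segment becomes one column of a one-layer linear decoder, so the all-ones encoding reconstructs the input and each latent value scales one segment/MLF.

```python
import numpy as np

def segmentation_decoder(image, labels):
    """Build a one-layer linear decoder whose weights are the image segments.

    image:  (H, W) array of pixel values.
    labels: (H, W) integer map assigning each pixel to a segment.
    Returns a decode function h -> image and the number of segments m.
    """
    segs = np.unique(labels)
    # Column i holds the pixels of segment i, zero elsewhere.
    Wdec = np.stack([np.where(labels == s, image, 0.0).ravel() for s in segs],
                    axis=1)
    return (lambda h: (Wdec @ h).reshape(image.shape)), len(segs)

image = np.arange(16, dtype=float).reshape(4, 4)
labels = np.array([[0, 0, 1, 1]] * 4)        # two vertical segments
decode, m = segmentation_decoder(image, labels)
recon = decode(np.ones(m))                   # the "1" vector rebuilds the input
```

Stacking such a decoder under the classifier and running a relevance-map method backwards then yields one relevance score per segment, as Fig. 3 depicts.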
resulting network R(E(h)) generates x as output, given its latent encoding h.

In Fig. 4 a pictorial description of the GMLF approach is shown for the case in which the auto-encoder is built based on a VAE; the algorithmic description is reported in Algorithm 4.

4. Experimental assessment

In this section, we describe the chosen experimental setup. The goal is to examine the applicability of our approach for different types of MLFs obtained by different encoders. As stated in Section 3.1, three different types of MLFs are evaluated: flat (non-hierarchical) segmentation, hierarchical segmentation and VAE latent coding. For the non-hierarchical/hierarchical MLF approaches, the segmentation algorithm proposed in [60] was used to build the MLFs, since its segmentation constraints respect the causality and location principles reported in Section 3.2. However, for the non-hierarchical method, any segmentation algorithm can be used (see for example [63]).

For the Variational Auto-Encoder (VAE) based GMLF approach, we used a β-VAE [62] as MLF builder, since it is particularly suitable for generating interpretable representations. In all the cases, we used as image classifier a VGG16 [64] network pre-trained on ImageNet. MLF relevances are computed with the LRP algorithm using the α–β rule [6].

In Section 5 we show a set of possible explanations of the classifier outputs on images sampled from the STL-10 dataset [65] and the Aberdeen dataset from the University of Stirling ([Link] [Link]). The STL-10 dataset is composed of images belonging to 10 different classes (airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck), and the Aberdeen database is composed of images belonging to 2 different classes (Male, Female). Only for the Aberdeen dataset was the classifier fine-tuned, using a subset of the whole dataset as training set.

4.1. Flat segmentation approach

For the flat (non-hierarchical) segmentation approach, images from the STL-10 and the Aberdeen datasets are used to generate the classifier outputs and corresponding explanations. For each test image, a set of segments (or superpixels) S is generated using the image segmentation algorithm proposed in [60], considering just one level. Then, a one-layer neural network decoder as described in Section 3.2 was constructed using the segmentation S. The resulting decoder is stacked on top of the VGG16 model
and fed with the ''1'' vector (see Fig. 3). The relevance of each superpixel/segment was then computed using the LRP algorithm.

4.2. Hierarchical image segmentation approach

Images from the Aberdeen dataset are used to construct an explanation based on the relevances of the VAE encoding latent variables. The VAE model was trained on an Aberdeen subset using the architecture suggested in [62] for the CelebA dataset. Then, an encoding of 10 latent variables is produced using the encoder network for each test image. The resulting encodings were fed to the decoder network stacked on top of the trained VGG16. Next, the LRP algorithm was applied on the decoder top layer to compute the relevance of each latent variable.

[Algorithm fragment: DMξ ← stackTogether(D, R, M); U ← RP(DMξ, h, y); return {u1, . . . , uK}]

In Fig. 5 we show some of the explanations produced for a set of images using the flat (non-hierarchical) segmentation-based experimental setup described in Section 3.2. The proposed explanations are reported considering the first two most relevant segments, according to the method described in Section 3.2. For each image, the real class and the assigned class are reported. From a qualitative visual inspection, one can observe that the selected segments seem to play a relevant role in distinguishing the classes.
Fig. 5. Explanations obtained by GMLF using the flat strategy (second columns), LIME (third columns) and LRP (fourth columns) for VGG16 network responses on images from the STL10 (a) and Aberdeen (b) datasets. In both (a) and (b), for each input (first columns), the explanations in terms of the most relevant segments are reported for the proposed flat approach (second columns) and LIME (third columns). For better clarity, we report a colourmap where only the first two most relevant segments are highlighted, both for GMLF and LIME.
Fig. 6. Examples of a two-layer hierarchical explanation on images classified as warplane, tabby, hartebeest, and dalmatian, respectively, by VGG16. (a) First column: segment heat map. Left to right: segments sorted in descending relevance order. Top-down: the coarsest (second row) and the finest (third row) hierarchical level. (b) LIME explanation: same input, same segmentation used in (a).
vector on the latent variable coding is computed. Then, a set of decoded images is generated by varying the two most relevant latent variables while fixing the other ones to the original encoding values.

One can observe that, by varying the most relevant latent variables, image properties relevant for the classifier decision, such as hair length and style, seem to be modified.

5.4. Multiple MLF explanations

For the same classifier input–output, we show the possibility of providing multiple and different MLF explanations based on the three types of previously mentioned MLFs. In Fig. 10, for each input, three different types of explanations are shown. In the first row, an explanation based on MLFs obtained by a flat image segmentation is reported. In the second row, an explanation based on MLFs obtained by a hierarchical segmentation is shown. In the last row, a VAE-based MLF explanation is shown. Notice that the three types of explanations, although based on different MLFs, seem coherent with each other.

5.5. Quantitative evaluation

A quantitative evaluation is performed adopting the MoRF (Most Relevant First) and AOPC (Area Over the Perturbation Curve) [6,66] curve analysis. In this work, the MoRF curve is computed following the region flipping approach, a generalisation of the pixel-flipping measure proposed in [6]. In a nutshell, given an image classification, image regions (in our case segments) are iteratively replaced by random noise and fed to the classifier, following the descending order with respect to the relevance values returned by the explanation method. In this manner, the more relevant the identified MLFs are for the classification output, the steeper the curve is. Instead, AOPC is computed as:

AOPC = (1 / (L + 1)) · Σ_{k=0}^{L} ⟨ f(x^(0)) − f(x^(k)) ⟩
Fig. 7. Examples of a two-layer hierarchical explanation on images classified as Female and Male by VGG16. (a) First column: segment heat map. Left to right:
segments sorted in descending relevance order. Top-down: the coarsest (second row) and the finest (third row) hierarchical level. (b) LIME explanation: same input,
same segmentation used in (a).
Fig. 8. Results obtained by the hierarchical MLF approach (described in Section 3.2) using the VGG16 network on STL10 images wrongly classified by the model. (a) A dog wrongly classified as a poodle, although it is evidently of a completely different breed. Inspecting the MLF explanations at different hierarchy scales, it can be seen that the classifier was probably misled by the wig (which probably led the classifier toward the poodle class). (b) A cat wrongly classified as a bow tie. Inspecting the MLF explanations at different hierarchy scales, it can be seen that the shape and the position of the cat's head near the neck of the shirt, with the rest of its body hidden at the same time, could be responsible for the wrong class.
Fig. 9. Results obtained by the VAE MLF approach (described in Section 3.3) using a VGG16 network on the Aberdeen image dataset. For each image, a VAE is constructed. For each input, the resulting relevance vector on the latent variables is computed. Then, decoded images are generated by varying the two most relevant latent variables while fixing the other ones to the original values.
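The latent traversal shown in Fig. 9 can be sketched as follows (a minimal sketch under our own assumptions: `decode` is a stand-in for any trained VAE decoder, here replaced by a linear stub, and the relevance vector is given):

```python
import numpy as np

rng = np.random.default_rng(1)
Dec = rng.normal(size=(64, 10))       # linear stub for a trained VAE decoder

def decode(h):
    return Dec @ h

h = rng.normal(size=10)               # encoding of a test image (10 latents)
u = rng.normal(size=10)               # relevance vector from the ERM step
top2 = np.argsort(np.abs(u))[-2:]     # indices of the two most relevant latents

variants = []
for delta in (-2.0, 0.0, 2.0):        # sweep only the most relevant latents
    h_mod = h.copy()
    h_mod[top2] += delta              # all other latents keep their values
    variants.append(decode(h_mod))
```

Inspecting how the decoded images change along the swept latents then suggests which image properties (e.g., hair length and style in Section 5.3) those relevant latent variables control.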
Fig. 10. For each input, three different types of explanations obtained by the GMLF approach are shown. In the first row, an explanation based on a flat image segmentation is reported. In the second row, an explanation based on a hierarchical segmentation. In the last row, a VAE-based MLF explanation is shown.
Fig. 11. A quantitative evaluation of the hierarchical GMLF approach on different input images. To evaluate the hierarchical GMLF approach with respect to the LIME approach, a most-relevant-segment analysis is performed using MoRF curves. MoRF curves computed with the proposed approach (red) and LIME (blue), using the last-layer MLFs as segmentation for both methods, are shown. At each iteration step, a perturbed input based on the returned explanation is fed to the classifier. On the y axis of each plot, the classification probability (in %) of the original class for each perturbed input; on the x axis, the perturbation steps. For each input image, the figures in the first and the second row show the perturbed inputs fed to the classifier at each perturbation step for the proposed explainer system and the LIME explainer, respectively. The more relevant the identified MLFs are for the classification output, the steeper the MoRF curve is.
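The perturbation loop behind such a MoRF curve can be sketched as follows. This is an illustrative reimplementation, not the authors' code: the segmentation, relevance scores, classifier callable and uniform-noise fill are all placeholder assumptions.

```python
import numpy as np

def morf_curve(image, segments, relevance, classify, rng=None):
    """Compute a MoRF curve: perturb segments from most to least
    relevant and record the classifier score after each step.

    image     : (H, W, C) float array
    segments  : (H, W) int array of segment labels (the MLFs)
    relevance : dict mapping segment label -> relevance score
    classify  : callable returning the probability of the original class
    """
    rng = rng or np.random.default_rng(0)
    order = sorted(relevance, key=relevance.get, reverse=True)  # most relevant first
    perturbed = image.copy()
    scores = [classify(perturbed)]            # step 0: unperturbed input
    for label in order:
        mask = segments == label
        # replace the segment with uniform noise (one common choice)
        perturbed[mask] = rng.uniform(0.0, 1.0, size=(mask.sum(), image.shape[2]))
        scores.append(classify(perturbed))
    return np.array(scores)
```

The steeper the resulting score sequence decreases, the more relevant the removed MLFs were for the classification output.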
where L is the total number of perturbation steps, f(·) is the classifier output score, x^(0) is the original input image, x^(k) is the input perturbed at step k, and ⟨·⟩ is the average operator over a set of input images. In this manner, the more relevant the identified MLFs are for the classification output, the greater the AOPC value is.

To evaluate the hierarchical approach with respect to the flat segmentation approach, at each step MLFs were removed from the inputs exploiting the hierarchy in a depth-first topological order based on descending relevance. Then, the MLFs of the finest hierarchical layer were considered. MoRF and AOPC curves are shown in Figs. 11 and 12. In Fig. 11, MoRF curves for some inputs are shown. It is evident that the MLFs selected by the proposed hierarchical approach are more relevant for the produced classification output. This result is confirmed by the average MoRF and average AOPC curves (Fig. 12), obtained by averaging the MoRF and AOPC curves over samples of 100 and 50 random images taken from STL10 and Aberdeen, respectively. To ease the comparison between the proposed methods and summarise the quantitative evaluations, the last-iteration AOPC values of the proposed methods and LIME are reported in Tables 1 and 2 for the STL10 and Aberdeen datasets, respectively.
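Under this definition, AOPC can be computed directly from per-step classifier scores; the sketch below is an illustrative implementation (not the authors' code) assuming the scores of each MoRF run are stacked row-wise in a (num_images, L + 1) array.

```python
import numpy as np

def aopc(scores):
    """AOPC = (1 / (L + 1)) * sum_{k=0}^{L} < f(x^(0)) - f(x^(k)) >,
    where the angle brackets average over the image set.

    scores : (num_images, L + 1) array; scores[i, k] is the classifier
             output f(x^(k)) for image i after k perturbation steps.
    """
    drops = scores[:, [0]] - scores        # f(x^(0)) - f(x^(k)) per image and step
    return drops.mean(axis=0).sum() / scores.shape[1]
```

The faster the scores drop under perturbation, the greater the resulting AOPC value.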
Fig. 12. Average MoRF (first row) and AOPC (second row) curves computed on samples of 100 and 50 random images from STL10 (first column) and Aberdeen (second column), respectively. The curves of the proposed hierarchical approach (red) and LIME (blue) are plotted, using as a baseline the removal of the Middle-Level Features from the input images in random order (green). The more relevant the identified MLFs are for the classification output, the steeper the MoRF curve and the greater the AOPC value.
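The hierarchical removal order used for these curves (a depth-first traversal of the segment hierarchy, visiting children in descending relevance) can be sketched as follows; the hierarchy encoding and node labels are placeholder assumptions, not the authors' implementation.

```python
def removal_order(children, relevance, root=0):
    """Depth-first traversal of a segment hierarchy, visiting children
    in descending relevance order; returns the MLF removal sequence.

    children  : dict mapping a node to the list of its child segments
    relevance : dict mapping a node to its relevance score
    """
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # push children in ascending relevance so that the most
        # relevant child is popped (and thus removed) first
        stack.extend(sorted(children.get(node, []), key=relevance.get))
    return order
```

For example, with two hierarchy levels the most relevant coarse segment is descended into before its less relevant siblings are considered.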
Table 1
Average AOPC of the proposed methods and LIME, obtained by averaging the last AOPC perturbation step values over a sample of 100 random images taken from the STL10 dataset. The flat and hierarchical proposals are compared with LIME, performing better in both cases. Since the LIME MLF structure is substantially different from the VAE MLFs (the former correspond to superpixels, the latter to latent variables), the AOPC reported for the VAE is not to be compared with the other results.

                          AOPC
LIME                      0.042
Flat (proposed)           0.598
Hierarchical (proposed)   0.732
VAE (proposed)            0.595

Table 2
Average AOPC of the proposed methods and LIME, obtained by averaging the last AOPC perturbation step values over a sample of 50 random images taken from the Aberdeen dataset.

                          AOPC
LIME                      0.014
Flat (proposed)           0.571
Hierarchical (proposed)   0.661

In Fig. 14, the same quantitative analysis using the VAE strategy is shown. Examples of MoRF curves using the VAE are shown in Fig. 13. As in the hierarchical approach, the latent features are sorted following the descending order returned by the relevance algorithm, and then noised in turn at each perturbation step. Due to the difference between LIME and VAE MLFs (the former correspond to superpixels, the latter to latent variables), no comparison with LIME is reported. To our knowledge, no other study reports explanations in terms of latent variables; therefore it is not easy to make a qualitative comparison with existing methods. Differently from perturbing the MLFs of a superpixel-based approach, where only an image part is substituted by noise, in a variational latent space perturbing a latent variable can lead to changes in the whole input image. Therefore, classifiers fed with decoded images generated by different MLF types could return non-comparable results, which may not be informative for comparisons between MoRF curves.

6. Conclusion

A framework to generate explanations in terms of middle-level features is proposed in this work. With the expression Middle-Level Features (MLFs) (see Section 1), we mean input features that represent more salient and understandable input properties for a user, such as parts of the input (for example, nose, ears and paws, in the case of images of humans) or more abstract input properties (for example, shape, viewpoint, thickness and so on). The use of middle-level features is motivated by the need to decrease the
Fig. 13. A quantitative evaluation of the VAE GMLF approach on different input images. MoRF curves computed with the proposed approach (red), perturbing the VAE latent variables in the order given by the explainer, are shown. At each iteration step, a perturbed input based on the returned explanation is fed to the classifier. On the y axis of each plot, the classification probability (in %) of the original class for each perturbed input; on the x axis, the perturbation steps. For each input image, the figures show the perturbed inputs fed to the classifier at each perturbation step for the proposed explainer system. The more relevant the identified MLFs are for the classification output, the steeper the MoRF curve is.
Fig. 14. Average MoRF (first column) and AOPC (second column) curves computed on a sample of 50 random images from the Aberdeen dataset. The curve of the proposed VAE approach (red) is plotted using as a baseline the removal of the Middle-Level Features from the input images in random order (green). The more relevant the identified MLFs are for the classification output, the steeper the MoRF curve and the greater the AOPC value.
human interpretative burden in artificial intelligence explanation systems.

Our approach can be considered a general framework to obtain humanly understandable explanations insofar as it can be applied to different types of middle-level features, as long as an encoder/decoder system is provided (for example, image segmentation or latent coding) and an explanation method producing heatmaps can be applied to both the decoder and the ML system whose decision is to be explained (see Section 3.1). Consequently, the proposed approach enables one to obtain different types of explanations in terms of different MLFs for the same input/decision pair of an ML system, which may allow developing XAI solutions able to provide user-centred explanations according to several research directions proposed in the literature [2,31].
We experimentally tested (see Sections 4 and 5) our approach using three different types of MLFs: flat (non-hierarchical) segmentation, hierarchical segmentation and VAE latent coding. Two different datasets were used: the STL-10 dataset and the Aberdeen dataset from the University of Stirling.

We evaluated our results from both a qualitative and a quantitative point of view. The quantitative evaluation was obtained using MoRF curves [66].

The results are encouraging, both from the qualitative point of view, giving easily human-interpretable explanations, and from the quantitative point of view, giving performances comparable to LIME. Furthermore, we show that a hierarchical approach can provide, in several cases, clear explanations about the reasons behind classification behaviours.

CRediT authorship contribution statement

Andrea Apicella: Performed the experiments, Analyzed the results, Writing & review original draft. Salvatore Giugliano: Performed the experiments, Analyzed the results, Writing & review original draft. Francesco Isgrò: Performed the experiments, Analyzed the results, Writing & review original draft. Roberto Prevete: Performed the experiments, Analyzed the results, Writing & review original draft.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work is supported by the European Union - FSE-REACT-EU, PON Research and Innovation 2014–2020 DM1062/2021 contract number 18-I-15350-2 and by the Ministry of University and Research, PRIN research project "BRIO – BIAS, RISK, OPACITY in AI: design, verification and development of Trustworthy AI", Project no. 2020SSKZ7R.

References

[1] A. Adadi, M. Berrada, Peeking inside the black-box: A survey on explainable artificial intelligence (XAI), IEEE Access 6 (2018) 52138–52160.
[2] M. Ribera, A. Lapedriza, Can we do better explanations? A proposal of user-centered explainable AI, in: IUI Workshops, Vol. 2327, 2019, p. 38.
[3] A.B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, et al., Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, Inf. Fusion 58 (2020) 82–115.
[4] D. Doran, S. Schulz, T.R. Besold, What does explainable AI really mean? A new conceptualization of perspectives, 2017, CoRR abs/1710.00794.
[5] A. Nguyen, J. Yosinski, J. Clune, Multifaceted feature visualization: Uncovering the different types of features learned by each neuron in deep neural networks, 2016, arXiv e-prints.
[6] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, W. Samek, On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation, PLoS One 10 (7) (2015) e0130140.
[7] A. Apicella, F. Isgrò, R. Prevete, G. Tamburrini, Middle-level features for the explanation of classification systems by sparse dictionary methods, Int. J. Neural Syst. 30 (8) (2020) 2050040.
[8] G. Montavon, W. Samek, K.-R. Müller, Methods for interpreting and understanding deep neural networks, Digit. Signal Process. 73 (2018) 1–15.
[9] Z.C. Lipton, The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery, Queue 16 (3) (2018) 31–57.
[10] Q. Zhang, S. Zhu, Visual interpretability for deep learning: a survey, Front. Inf. Technol. Electron. Eng. 19 (1) (2018) 27–39.
[11] K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising image classification models and saliency maps, in: 2nd International Conference on Learning Representations, Workshop Track Proceedings, Banff, Canada, 2014.
[12] B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, et al., Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV), in: International Conference on Machine Learning, PMLR, 2018, pp. 2668–2677.
[13] A. Ghorbani, J. Wexler, J. Zou, B. Kim, Towards automatic concept-based explanations, 2019, arXiv preprint arXiv:1902.03129.
[14] A. Akula, S. Wang, S.-C. Zhu, CoCoX: Generating conceptual and counterfactual explanations via fault-lines, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 2020, pp. 2594–2601.
[15] Y. Bengio, A. Courville, P. Vincent, Representation learning: A review and new perspectives, IEEE Trans. Pattern Anal. Mach. Intell. 35 (8) (2013) 1798–1828.
[16] F. Locatello, S. Bauer, M. Lucic, G. Raetsch, S. Gelly, B. Schölkopf, O. Bachem, Challenging common assumptions in the unsupervised learning of disentangled representations, in: International Conference on Machine Learning, PMLR, 2019, pp. 4114–4124.
[17] R.T. Chen, X. Li, R. Grosse, D. Duvenaud, Isolating sources of disentanglement in variational autoencoders, 2018, arXiv preprint arXiv:1802.04942.
[18] F.L. Galvão, S.J.F. Guimarães, A.X. Falcão, Image segmentation using dense and sparse hierarchies of superpixels, Pattern Recognit. 108 (2020) 107532.
[19] D. Charte, F. Charte, M.J. del Jesus, F. Herrera, An analysis on the use of autoencoders for representation learning: Fundamentals, learning task case studies, explainability and challenges, Neurocomputing 404 (2020) 93–107.
[20] A. Apicella, S. Giugliano, F. Isgrò, R. Prevete, Explanations in terms of hierarchically organised middle level features, in: [Link] - 2021 Italian Workshop on Explainable Artificial Intelligence, CEUR Workshop Proceedings, 2021.
[21] M. Tschannen, O. Bachem, M. Lucic, Recent advances in autoencoder-based representation learning, 2018, arXiv preprint arXiv:1812.05069.
[22] C.K. Sønderby, T. Raiko, L. Maaløe, S.K. Sønderby, O. Winther, Ladder variational autoencoders, in: Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016, pp. 3745–3753.
[23] S. Zhao, J. Song, S. Ermon, Learning hierarchical features from deep generative models, in: International Conference on Machine Learning, PMLR, 2017, pp. 4091–4099.
[24] X. Gu, W. Ding, A hierarchical prototype-based approach for classification, Inform. Sci. 505 (2019) 325–351.
[25] M.T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, ACM, New York, NY, USA, 2016, pp. 1135–1144.
[26] A. Apicella, F. Isgrò, R. Prevete, G. Tamburrini, Contrastive explanations to classification systems using sparse dictionaries, in: International Conference on Image Analysis and Processing, Springer, Cham, 2019, pp. 207–218.
[27] F. Donnarumma, R. Prevete, D. Maisto, S. Fuscone, E.M. Irvine, M.A. van der Meer, C. Kemere, G. Pezzulo, A framework to identify structured behavioral patterns within rodent spatial trajectories, Sci. Rep. 11 (1) (2021) 1–20.
[28] R. Guidotti, A. Monreale, S. Matwin, D. Pedreschi, Explaining image classifiers generating exemplars and counter-exemplars from latent representations, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 2020, pp. 13665–13668.
[29] A. Apicella, F. Isgrò, R. Prevete, A. Sorrentino, G. Tamburrini, Explaining classification systems using sparse dictionaries, in: Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Special Session on Societal Issues in Machine Learning: When Learning from Data Is Not Enough, Bruges, Belgium, 2019.
[30] A. Apicella, F. Isgrò, R. Prevete, G. Tamburrini, A. Vietri, Sparse dictionaries for the explanation of classification systems, in: PIE, Rome, Italy, 2019, p. 009.
[31] B.Y. Lim, Q. Yang, A.M. Abdul, D. Wang, Why these explanations? Selecting intelligibility types for explanation goals, in: IUI Workshops, 2019.
[32] S. Kim, T. Qin, T.-Y. Liu, H. Yu, Advertiser-centric approach to understand user click behavior in sponsored search, Inform. Sci. 276 (2014) 242–254.
[33] T. Miller, Explanation in artificial intelligence: Insights from the social sciences, Artificial Intelligence 267 (2019) 1–38.
[34] A. Weller, Transparency: Motivations and challenges, 2017, arXiv:1708.01870.
[35] W. Samek, K.-R. Müller, Towards explainable artificial intelligence, in: W. Samek, G. Montavon, A. Vedaldi, L. Hansen, K. Müller (Eds.), Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, Springer, 2019, pp. 5–22.
[36] A.B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, et al., Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, Inf. Fusion 58 (2020) 82–115.
[37] W.J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, B. Yu, Definitions, methods, and applications in interpretable machine learning, Proc. Natl. Acad. Sci. 116 (44) (2019) 22071–22080, [Link] 1900654116.
[38] C.M. Bishop, N.M. Nasrabadi, Pattern Recognition and Machine Learning, Vol. 4, Springer, 2006.
[39] A. Apicella, F. Donnarumma, F. Isgrò, R. Prevete, A survey on modern trainable activation functions, Neural Networks 138 (2021) 14–32.
[40] D. Erhan, Y. Bengio, A. Courville, P. Vincent, Visualizing higher-layer features of a deep network, Univ. Montreal 1341 (3) (2009) 1.
[41] A. Binder, G. Montavon, S. Lapuschkin, K.-R. Müller, W. Samek, Layer-wise relevance propagation for neural networks with local renormalization layers, in: International Conference on Artificial Neural Networks, Springer, Barcelona, Spain, 2016, pp. 63–71.
[42] G. Montavon, S. Lapuschkin, A. Binder, W. Samek, K.-R. Müller, Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognit. 65 (2017) 211–222.
[43] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba, Learning deep features for discriminative localization, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2921–2929.
[44] R.R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626.
[45] M.D. Zeiler, G.W. Taylor, R. Fergus, Adaptive deconvolutional networks for mid and high level feature learning, in: Computer Vision (ICCV), 2011 IEEE International Conference on, IEEE, Barcelona, Spain, 2011, pp. 2018–2025.
[46] M.D. Zeiler, R. Fergus, Visualizing and understanding convolutional networks, in: European Conference on Computer Vision, Springer, Zurich, Switzerland, 2014, pp. 818–833.
[47] A. Dosovitskiy, T. Brox, Inverting visual representations with convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016, pp. 4829–4837.
[48] B. Zhou, Y. Sun, D. Bau, A. Torralba, Interpretable basis decomposition for visual explanation, in: Proceedings of the European Conference on Computer Vision, ECCV, 2018, pp. 119–134.
[49] D. Kahneman, A. Tversky, The simulation heuristic, Technical Report, Stanford Univ. CA Dept. of Psychology, 1981.
[50] X. Zhao, X. Huang, V. Robu, D. Flynn, BayLIME: Bayesian local interpretable model-agnostic explanations, 2020, arXiv preprint arXiv:2012.03058.
[51] J. Dieber, S. Kirrane, Why model why? Assessing the strengths and limitations of LIME, 2020, arXiv preprint arXiv:2012.00093.
[52] B. Li, D. Pi, Network representation learning: a systematic literature review, Neural Comput. Appl. (2020) 1–33.
[53] D.P. Kingma, M. Welling, Auto-encoding variational Bayes, 2013, arXiv preprint arXiv:1312.6114.
[54] D. Rezende, S. Mohamed, Variational inference with normalizing flows, in: International Conference on Machine Learning, PMLR, 2015, pp. 1530–1538.
[55] Y. Li, Q. Pan, S. Wang, H. Peng, T. Yang, E. Cambria, Disentangled variational auto-encoder for semi-supervised learning, Inform. Sci. 482 (2019) 73–85.
[56] H. Gao, C.-M. Pun, S. Kwong, An efficient image segmentation method based on a hybrid particle swarm algorithm with learning strategy, Inform. Sci. 369 (2016) 500–521.
[57] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, H. Adam, Encoder-decoder with atrous separable convolution for semantic image segmentation, in: Proceedings of the European Conference on Computer Vision, ECCV, 2018, pp. 801–818.
[58] J. Yu, D. Huang, Z. Wei, Unsupervised image segmentation via stacked denoising auto-encoder and hierarchical patch indexing, Signal Process. 143 (2018) 346–353.
[59] X. Zhang, Y. Sun, H. Liu, Z. Hou, F. Zhao, C. Zhang, Improved clustering algorithms for image segmentation based on non-local information and back projection, Inform. Sci. 550 (2021) 129–144.
[60] S.J.F. Guimarães, J. Cousty, Y. Kenmochi, L. Najman, A hierarchical image segmentation algorithm based on an observation scale, in: Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), Springer, 2012, pp. 116–125.
[61] L. Guigues, J.P. Cocquerez, H. Le Men, Scale-sets image analysis, Int. J. Comput. Vis. 68 (3) (2006) 289–317.
[62] I. Higgins, L. Matthey, A. Pal, C.P. Burgess, X. Glorot, M. Botvinick, S. Mohamed, A. Lerchner, Beta-VAE: Learning basic visual concepts with a constrained variational framework, in: ICLR, 2017.
[63] A. Apicella, S. Giugliano, F. Isgrò, R. Prevete, A general approach to compute the relevance of middle-level input features, in: Pattern Recognition. ICPR International Workshops and Challenges, Springer International Publishing, Cham, 2021, pp. 189–203.
[64] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: International Conference on Learning Representations, 2015.
[65] A. Coates, A. Ng, H. Lee, An analysis of single-layer networks in unsupervised feature learning, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, 2011, pp. 215–223.
[66] W. Samek, A. Binder, G. Montavon, S. Lapuschkin, K.-R. Müller, Evaluating the visualization of what a deep neural network has learned, IEEE Trans. Neural Netw. Learn. Syst. 28 (11) (2016) 2660–2673.