Middle-Level Explanations in XAI

This paper presents a framework for eXplainable Artificial Intelligence (XAI) that generates multiple explanations for image classification systems using Middle-Level input Features (MLFs) extracted by auto-encoders. The proposed approach aims to provide more human-understandable explanations compared to traditional low-level feature methods, allowing for user-centered interpretations. Experimental results demonstrate the effectiveness of this framework across different image datasets and MLF extraction methods.


Knowledge-Based Systems 255 (2022) 109725

Contents lists available at ScienceDirect

Knowledge-Based Systems
journal homepage: [Link]/locate/knosys

Exploiting auto-encoders and segmentation methods for middle-level explanations of image classification systems

Andrea Apicella, Salvatore Giugliano, Francesco Isgrò, Roberto Prevete
Laboratory of Augmented Reality for Health Monitoring (ARHeMLab), Università degli Studi di Napoli Federico II, Italy
Laboratory of Artificial Intelligence, Privacy & Applications (AIPA Lab), Università degli Studi di Napoli Federico II, Italy
Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione, Università degli Studi di Napoli Federico II, Italy

Article info

Article history:
Received 24 February 2022
Received in revised form 6 August 2022
Accepted 15 August 2022
Available online 23 August 2022

Keywords: XAI; Explainable AI; Hierarchical; Middle-level; Interpretable models

Abstract

A central issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI) is to provide methods to give explanations for the behaviours of Machine Learning (ML) non-interpretable models after the training. Recently, it is becoming more and more evident that new directions to create better explanations should take into account what a good explanation is to a human user. This paper suggests taking advantage of developing an XAI framework that allows producing multiple explanations for the response of an image classification system in terms of potentially different middle-level input features. To this end, we propose an XAI framework able to construct explanations in terms of input features extracted by auto-encoders. We start from the hypothesis that some auto-encoders, relying on standard data representation approaches, could extract input properties that are more salient and understandable for a user, which we call here Middle-Level input Features (MLFs), with respect to raw low-level features. Furthermore, by extracting different types of MLFs through different types of auto-encoders, different types of explanations for the same ML system behaviour can be returned. We experimentally tested our method on two different image datasets and using three different types of MLFs. The results are encouraging. Although our novel approach was tested in the context of image classification, it can potentially be used on other data types to the extent that auto-encoders extracting humanly understandable representations can be applied.
© 2022 Elsevier B.V. All rights reserved.

1. Introduction

A large part of Machine Learning (ML) techniques – including Support Vector Machines (SVM) and Deep Neural Networks (DNN) – give rise to systems having behaviours often complex to interpret [1]. More precisely, although ML techniques with reasonably well interpretable mechanisms and outputs exist, as, for example, decision trees, the most significant part of ML techniques give responses whose relationships with the input are often difficult to understand. In this sense, they are commonly considered as black-box systems. In particular, as ML systems are being used in more and more domains and, so, by a more varied audience, there is the need for making them understandable and trustworthy to general users [2,3]. Hence, generating explanations for ML system behaviours that are understandable to human beings is a central scientific and technological issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI). Several definitions of interpretability/explainability for ML systems have been discussed in the XAI literature [3,4], and many approaches to the problem of overcoming their opaqueness are now pursued [5–7]. For example, in [8] a series of techniques for the interpretation of DNNs is discussed, and in [9] the authors examine and discuss the motivations underlying the interest in ML systems’ interpretability, discussing and refining this notion. In the literature, particular attention is given to post-hoc explainability [3], i.e., methods to provide explanations for the behaviours of non-interpretable models after the training. In the context of this multifaceted interpretability problem, we note that one of the most successful strategies in the literature is to provide explanations in terms of ‘‘visualisations’’ [2,10]. More specifically, explanations for image classification systems are given in terms of low-level input features, such as relevance or heat maps of the input built by model-agnostic (without disclosing the model’s internal mechanisms) or model-specific (accessing the model’s internal mechanisms) methods, like sensitivity analysis [11] or Layer-wise Relevance Propagation (LRP) [6]. For example, LRP associates a relevance value to each input element (pixels in the case of images) to explain the ML model answer. The main problem with such methods is that human users are left with a significant interpretive burden. Starting from each low-level feature’s relevance, the human user

∗ Corresponding author.
E-mail address: [Link]@[Link] (A. Apicella).

[Link]
0950-7051/© 2022 Elsevier B.V. All rights reserved.
A. Apicella, S. Giugliano, F. Isgrò et al. Knowledge-Based Systems 255 (2022) 109725

Fig. 1. Examples of Middle-Level input Features (MLFs). Each MLF represents a part of the input which is perceptually and cognitively salient to a human being, as for example the ears of a cat or the wings of an airplane. These features are intuitively more humanly interpretable with respect to low-level features (as, for example, raw unrelated image pixels), so a decision explanation expressed in terms of MLF relevance can be easier for a human being to understand than explanations expressed in terms of low-level features.

needs to identify the overall input properties perceptually and cognitively salient to him [7]. Thus, an XAI approach should alleviate this weakness of low-level approaches and overcome their limitations, allowing the possibility to construct explanations in terms of input features that represent more salient and understandable input properties for a user, which we call here Middle-Level input Features (MLFs) (see Fig. 1). Although there is a recent research line which attempts to give explanations in terms of visual human-friendly concepts [12–14] (we will discuss them in Section 2), we notice that the goal of learning data representations that are easily factorised in terms of meaningful features is, in general, pursued in the representation learning framework [15], and more recently in the feature disentanglement learning context [16]. These meaningful features may represent parts of the input, such as nose, ears and paw in the case of, for example, face recognition tasks (similarly to the outcome of a clustering algorithm), or more abstract input properties such as shape, viewpoint, thickness, and so on, leading to data representations perceptually and cognitively salient to the human being. Based on these considerations, in this paper we propose to develop an XAI approach able to give explanations for an image classification system in terms of features which are obtained by standard representation learning methods such as variational auto-encoders [17] and hierarchical image segmentation [18]. In particular, we exploit middle-level data representations obtained by auto-encoder methods [19] to provide explanations of image classification systems. In this context, in an earlier work [20] we proposed an initial experimental investigation on this type of explanations, exploiting the hierarchical organisation of the data in terms of more elementary factors. For example, natural images can be described in terms of the objects they show at various levels of granularity [21–23]. In [24], a hierarchical prototype-based approach for classification is proposed. This method has a certain degree of intrinsic transparency, but it does not fall into the post-hoc explainability category.

To the best of our knowledge, however, there are relatively few approaches in the XAI literature that pursue this line of research. In [25], the authors proposed LIME, a successful XAI method which is based, in the case of image classification problems, on explanations expressed as sets of regions, clusters of the image, called superpixels, which are obtained by a clustering algorithm. These superpixels can be interpreted as MLFs. In [7,26] the explanations are formed of elements selected from a dictionary of MLFs, obtained by sparse dictionary learning methods [27]. In [28] the authors propose to exploit the latent representations learned through an adversarial auto-encoder for generating a synthetic neighbourhood of the image for which an explanation is required. However, these approaches propose specific solutions which cannot be generalised to different types of input properties. By contrast, in this paper we investigate the possibility of obtaining explanations using an approach that can be applied to different types of MLFs, which we call General MLF Explanations (GMLF). More precisely, we develop an XAI framework that can be applied whenever (a) the input of an ML system can be encoded and decoded based on MLFs, and (b) any Explanation method producing a Relevance Map (ERM method) can be applied to both the ML model and the decoder. In this sense, we propose a general framework insofar as it can be applied to several different computational definitions of MLFs and a large class of ML models. Consequently, we can provide multiple and different explanations based on different MLFs. In particular, in this work we tested our novel approach in the context of image classification using MLFs extracted by three different methods: (1) image segmentation by auto-encoders, (2) hierarchical image segmentation by auto-encoders, and (3) variational auto-encoders. Regarding points (1) and (2), a simple method to represent the output of a segmentation algorithm in terms of an encoder–decoder is reported. This approach can be used on a wide range of different data types to the extent that encoder–decoder methods can be applied. Thus, the medium- or long-term objective of this research work is to develop a general XAI approach producing explanations for an ML system behaviour in terms of potentially different and user-selected input features, composed of input properties which the human user can select according to his background knowledge and goals. This aspect can play a key role in developing user-centred explanations. It is essential to note that, in making an explanation understandable for a user, it should be taken into account what information the user desires to receive [2,12,29,30]. Recently, it is becoming more and more evident that new directions to create better explanations should take into

account what a good explanation is for a human user, and consequently to develop XAI solutions able to provide user-centred explanations [2,12,14,31,32]. By contrast, much of the current XAI methods provide specific ways to build explanations that are based on the researchers’ intuition of what constitutes a ‘‘good’’ explanation [31,33].

To summarise, in this paper the following novelties are presented:

1. an XAI framework where middle-level or high-level input properties can be built exploiting standard methods of data representation learning is proposed;
2. our framework can be applied to several different computational definitions of middle-level or high-level input properties and a large class of ML models. Consequently, multiple and different explanations based on different middle-level input properties can possibly be provided for a given input–ML system response;
3. the middle-level or high-level input properties are computed independently from the ML classifier to be explained.

The paper is organised as follows: Section 3 describes in detail the proposed approach; in Section 2 we discuss differences and advantages of GMLF with respect to similar approaches presented in the literature; experiments and results are discussed in Section 5. In particular, we compared our approach with the LIME method and performed both qualitative and quantitative evaluations of the results; the concluding Section summarises the main high-level features of the proposed explanation framework and outlines some future developments.

2. Related works

The importance of eXplainable Artificial Intelligence (XAI) is discussed in several papers [33–36]. Different strategies have been proposed to face the explainability problem, depending both on the AI system to explain and the type of explanation proposed. Among all the XAI works proposed over the last years, an important distinction is between model-based and post-hoc explainability [37], the former consisting in AI systems explainable by design (e.g., decision trees), since their inner mechanisms are easily interpreted, the latter proposing explanations built for systems that are not easy to understand. In particular, several methods to explain Deep Neural Networks (DNNs) are proposed in the literature due to the high complexity of their inner structures. In a nutshell, DNNs are computational architectures organised as several consecutive layers of elementary computing units, called neurons. Each neuron i belonging to a layer l achieves a two-step computation (see [38], chapter 4): the neuron input a^l_i is computed first, based on real values, called weights, associated with connections coming from neurons belonging to other layers, and a bias value associated to the neuron i. Then, the neuron output is computed by an activation function f(·), i.e., f(a^l_i) (see [39] for a review). The flow of computation proceeds from the first hidden layer to the output layer in a forward-propagation fashion. A very common approach to explain the behaviours of a DNN consists in returning visual-based explanations in terms of input feature importance scores, as for example Activation Maximization (AM) [40], Layer-Wise Relevance Propagation (LRP) [6], Deep Taylor Decomposition [41,42], Class Activation Mapping (CAM) methods [43,44], Deconvolutional Networks [45] and Up-convolutional networks [46,47]. Although heatmaps seem to be a type of explanation that is easy to understand for the user, these methods build relevances on the low-level input features (the single pixels), while the input middle-level properties which determined the answer of the classifier have to be located and interpreted by the user, leaving much of the interpretive work to human beings. On the other side, methods such as Local Interpretable Model-agnostic Explanations (LIME) [25] rely on feature partitions, such as super-pixels in the image case. However, the explanations given by LIME (or its variants) are built through a new model that approximates the original one, thus risking losing the real reasons behind the behaviour of the original model [2].

Recently, a growing number of studies [12–14,48] have focused on providing explanations in the form of middle-level or high-level human ‘‘concepts’’, as we are addressing it in this paper.

In particular, in [12] the authors introduce Concept Activation Vectors (CAV) as a way of visually representing the neural network’s inner states associated with a given class. CAVs should represent human-friendly concepts. The basic ideas can be described as follows: firstly, the authors suppose the availability of an external labelled dataset X_C where each label corresponds to a human-friendly concept. Then, given a pre-trained neural network classifier to be explained, say N_C, they consider the functional mapping f_l from the input to the l-layer of N_C. Based on f_l, for each class c of the dataset X_C, they build a classifier composed of f_l followed by a linear classifier to distinguish the elements of X_C belonging to the class c from randomly chosen images. The normal to the learned hyperplane is considered the CAV for the user-defined concept corresponding to the class c. Finally, given all the inputs belonging to a class K of the pre-trained classifier N_C, the authors define a way to quantify how much a concept c, expressed by a CAV, influences the behaviour of the classifier, using directional derivatives to compute N_C’s conceptual sensitivity across the entire class K of inputs.

Building upon the paper discussed above, in [14] the authors provide explanations in terms of fault-lines [49]. Fault-lines should represent ‘‘high-level semantic aspects of reality on which humans zoom in when imagining an alternative to it’’. Each fault-line is represented by a minimal set of semantic xconcepts that need to be added to or deleted from the classifier’s input to alter the class that the classifier outputs. Xconcepts are built following the method proposed in [12]. In a nutshell, given a pre-trained convolutional neural network CN whose behaviour is to be explained, xconcepts are defined in terms of super-pixels (images or parts of images) related to the feature maps of the lth CN’s convolutional layer, usually the last convolutional layer before the fully-connected layer. In particular, these super-pixels are collected when the input representations at the convolutional layer l are used to discriminate between a target class c and an alternate class c_alt, and they are computed based on the Grad-CAM algorithm [44]. In this way, one obtains xconcepts in terms of images related to the class c and able to distinguish it from the class c_alt. Thus, when the classifier CN responds that an input x belongs to a class c, the authors provide an explanation in terms of xconcepts which should represent semantic aspects of why x belongs to c instead of an alternate class c_alt.

In [13] the authors propose a method to provide explanations related to an entire class of a trained neural classifier. The method is based on the CAVs introduced in [12] and sketched above. However, in this case, the CAVs are automatically extracted without the need of an external labelled dataset expressing human-friendly concepts.

Many of the approaches discussed so far focus on global explanations, i.e., explanations related to an entire class of the trained neural network classifier (see [12,13]). Instead, in our approach, we are looking for local explanations, i.e., explanations for the response of the ML model to each single input. Some authors, see for example [12], provide methods to obtain local explanations, but in this case the explanations are expressed in terms of high-level visual concepts which do not necessarily belong to the input. Thus, again human users are left with a significant interpretive load: starting from external high-level visual concepts, the human user needs to identify the input properties perceptually and

cognitively related to these concepts. On the contrary, in our approach the input high-level properties (MLFs) are expressed in terms of elements of the input itself.

Another critical point is that high-level or middle-level user-friendly concepts are computed on the basis of the neural network classifier to be explained. In this way, a short-circuit can be created in which the visual concepts used to explain the classifier are closely related to the classifier itself. By contrast, in our approach, MLFs are extracted independently from the classifier.

A crucial aspect that distinguishes our proposal from the above-discussed research line is grounded on the fact that we propose an XAI framework able to provide multiple explanations, each one composed of a specific type of middle-level input features (MLFs). Our methodology only requires that MLFs can be obtained using methods framed into data representation research, and, in particular, any auto-encoder architecture for which an explanation method producing a relevance map can be applied to the decoder (see Section 3.1).

To summarise, our GMLF approach, although it shares with the above-described research works the idea of obtaining explanations based on middle-level or high-level human-friendly concepts, presents the following elements of novelty:

1. It is an XAI framework where middle-level or high-level input properties can be built on the basis of standard methods of data representation learning.
2. It outputs local explanations.
3. The middle-level or high-level input properties are computed independently from the ML classifier to be explained.

Regarding points (2) and (3), we notice that an XAI method that has significant similarity with our approach is LIME [25] or its variants (see, for example, [50]). LIME, especially in the context of images, is one of the predominant XAI methods discussed in the literature [50,51]. It can provide local explanations in terms of superpixels, which are regions or parts of the input that the classifier receives, as we have already discussed in Section 1. These superpixels can be interpreted as middle-level input properties, which can be more understandable for a human user than low-level features such as pixels. In this sense, we view a similarity in the output between our approach GMLF and LIME. The explanations built by LIME can be considered comparable with our proposed approach but different in the construction process. While LIME builds explanations relying on a proxy model different from the model to explain, the proposed approach relies only on the model to explain, without needing any other model that approximates the original one. To highlight the difference between the produced explanations, in Section 4 a comparison between LIME and GMLF outputs is made.

3. Approach

Our approach stems from the following observations.

The development of data representations from raw low-level data usually aims to obtain distinctive explanatory features of the data, which are more conducive to subsequent data analysis and interpretation. This critical step has been tackled for a long time using specific methods developed exploiting expert domain knowledge. However, this type of approach can lead to unsuccessful results and requires a lot of heuristic experience and complex manual design [52]. This aspect is similar to what commonly occurs in many XAI approaches, where the explanatory methods are based on the researchers’ intuition of what constitutes a ‘‘good’’ explanation.

By contrast, representation learning successfully investigates ways to obtain middle/high-level abstract feature representations by automatic machine learning approaches. In particular, a large part of these approaches is based on Auto-Encoder (AE) architectures [19,52]. AEs correspond to neural networks composed of at least one hidden layer and logically divided into two components, an encoder and a decoder. From a functional point of view, an AE can be seen as the composition of two functions E and D: E is an encoding function (the encoder) which maps the input space onto a feature space (or latent encoding space); D is a decoding function (the decoder) which inversely maps the feature space onto the input space. A meaningful aspect is that, by AEs, one can obtain data representations in terms of latent encodings h, where each h_i may represent an MLF ξ_i of the input, such as parts of the input (for example, nose, ears and paw) or more abstract features which can be more salient and understandable input properties for a user. See for example variational AEs [53–55] or image segmentation [56–59] (see Fig. 1). Furthermore, different AEs can extract different data representations which are not mutually exclusive.

Based on the previous considerations, we want to build upon the idea that the elements composing an explanation can be determined by an AE which extracts input features relevant for a human being, i.e., MLFs, and that one might change the type of MLFs by changing the type of auto-encoder, or obtain multiple and different explanations based on different MLFs.

3.1. General description

Given an ML classification model M which receives an input x ∈ R^d and outputs y ∈ R^c, our approach can be divided into two consecutive steps.

In the first step, we build an auto-encoder AE ≡ (E, D) such that each input x can be encoded by E in a latent encoding h ∈ R^m and decoded by D. As discussed above, to each value h_i is associated an MLF ξ_i; thus each input x is decomposed in a set of m MLFs ξ = {ξ_i}_{i=1}^m, where to each ξ_i is associated the value h_i. Different choices of the auto-encoder can lead to MLFs ξ_i of different nature, so to highlight this dependence we re-formalise this first step as follows: we build an encoder E_ξ : x ∈ R^d → h ∈ R^m and a decoder D_ξ : h ∈ R^m → x ∈ R^d, where h encodes x in terms of the MLFs ξ.

In the second step of our approach, we use an ERM method (an explanation method producing a relevance map of the input) on both M and D_ξ, i.e., we apply it on the model M and then use the obtained relevance values to apply the ERM method on D_ξ, getting a relevance value for each middle-level feature. In other words, we stack D_ξ on the top of M, thus obtaining a new model DM_ξ which receives as input an encoding h and outputs y, and we use an ERM method on DM_ξ from y to h. In Fig. 2 we give a graphic description of our approach GMLF, and in Algorithm 1 it is described in more detail considering a generic auto-encoder, while in Algorithms 3 and 4 our approach (GMLF) is described in the case of specific auto-encoders (see Sections 3.2 and 3.3).

Thus, we search for a relevance vector u ∈ R^m which informs the user how much each MLF of ξ has contributed to the ML model answer y. Note that GMLF can be generalised to any decoder D_ξ to which an ERM method can be applied. In this way, one can build different explanations for an M’s response in terms of different MLFs ξ.

In the remainder of this section, we will describe three alternative ways (segmentation, hierarchical segmentation and VAE) to obtain a decoder such that an ERM method can be applied to it, and so three ways of applying our approach GMLF. We experimentally tested our framework using all the methods.
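As a toy illustration of the two-step procedure of Section 3.1, the following sketch uses entirely hypothetical stand-ins: a hand-built linear ‘‘classifier’’ M, a linear decoder D_ξ whose two latent units each reconstruct one half of a four-pixel input (two crude MLFs), and gradient × input standing in for a generic ERM method such as LRP. None of this is the paper’s actual implementation.

```python
# All names and numbers below are illustrative assumptions: a 4-pixel
# "image", a linear decoder D_xi with two latent units (two crude MLFs),
# and a linear scoring "classifier" M.

W_dec = [[1.0, 0.0],   # pixels 0-1 are reconstructed by latent unit 0
         [1.0, 0.0],
         [0.0, 1.0],   # pixels 2-3 are reconstructed by latent unit 1
         [0.0, 1.0]]
w_clf = [0.5, -0.2, 2.0, 1.0]          # M(x) = w_clf . x (a single score)

def encoder(x):
    # toy E_xi: each latent value is the mean of the pixels its unit covers
    return [(x[0] + x[1]) / 2.0, (x[2] + x[3]) / 2.0]

def explain(x):
    h = encoder(x)
    x_tilde = [sum(W_dec[i][j] * h[j] for j in range(2)) for i in range(4)]
    residual = [xi - xt for xi, xt in zip(x, x_tilde)]     # R(h) = x - D_xi(h)
    recon = [xt + r for xt, r in zip(x_tilde, residual)]   # D_xi(h) + R(h) == x
    y = sum(w * xi for w, xi in zip(w_clf, recon))         # DM_xi(h) = M(x)
    # gradient x input stands in for a generic ERM method (e.g. LRP):
    # dy/dh_j = sum_i w_clf[i] * W_dec[i][j];  u_j = h_j * dy/dh_j
    grad = [sum(w_clf[i] * W_dec[i][j] for i in range(4)) for j in range(2)]
    u = [hj * gj for hj, gj in zip(h, grad)]
    return y, u

y, u = explain([1.0, 1.0, 2.0, 2.0])
print(y)   # the same score M would give on the raw input
print(u)   # one relevance value per middle-level feature, not per pixel
```

Adding the residual back before applying the classifier makes the stacked model reproduce M(x) exactly even when the decoder reconstruction is imperfect, so the relevance vector u explains the actual model response rather than that of an approximation.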

Fig. 2. A general scheme of the proposed explanation framework. Given a middle-level feature encoder and the respective decoder, the latter is stacked on the top of the model to inspect. Next, the encoding of the input is fed to the decoder–model system. A backward relevance propagation algorithm is then applied.

Algorithm 1: Proposed method GMLF
Input: data point x, trained model M, an ERM method RP
Output: feature relevances U
1 y ← M(x);
2 build an auto-encoder AE ≡ (E_ξ, D_ξ);
3 h ← E_ξ(x);
4 define R : h ↦ x − D_ξ(h);
5 define DM_ξ : h ↦ M(D_ξ(h) + R(h));
6 U ← RP(DM_ξ, h, y);
7 return U;

3.2. MLFs from image segmentation

Here we describe the implementation of the GMLF approach in the case of an auto-encoder built on the basis of hierarchical segmentation. The approach is depicted in Fig. 3, while we give an algorithmic formalisation in Algorithms 2 and 3.

Given an image x ∈ R^d, a segmentation algorithm returns a partition of x composed of m regions {q_i}_{i=1}^m. Some of the existing segmentation algorithms can be considered hierarchical segmentation algorithms, since they return partitions hierarchically organised with increasingly finer levels of detail.

More precisely, following [60], we consider a segmentation algorithm to be hierarchical if it ensures both the causality principle of multi-scale analysis [61] (that is, if a contour is present at a given scale, this contour has to be present at any finer scale) and the location principle (that is, even when the number of regions decreases, contours are stable). These two principles ensure that the segmentation obtained at a coarser detail level can be obtained by merging regions obtained at finer segmentation levels.

In general, given an image, a possible set of MLFs can be the result of a segmentation algorithm. Given an image x ∈ R^d and a partition of x consisting of m regions {q_i}_{i=1}^m, each image region q_i can be represented by a vector v_i ∈ R^d defined as follows: v_ij = 0 if x_j ∉ q_i, otherwise v_ij = x_j; and ∑_{i=1}^m v_i = x. Henceforth, for simplicity and without loss of generality, we will use v_i instead of q_i since they represent the same entities. Consequently, x can be expressed as a linear combination of the v_i with all the coefficients equal to 1, which represents the encoding of the image x on the basis of the m regions. More in general, one can consider a set of K segmentations {S_k}_{k=1}^K of x, ordered from the coarsest S_1 to the finest S_K. It follows that, if the segmentations have a hierarchical relation, each coarser segmentation can be expressed in terms of the finer ones. More in detail, each region v^k_i of S_k can be expressed as a linear combination ∑_j α_j v^{k+1}_j, where α_j is 1 if all the pixels in v^{k+1}_j belong to v^k_i, 0 otherwise. We can apply the same reasoning going from S_K to the image x, considering it as a trivial partition S_{K+1} where each region represents a single image pixel, i.e., S_{K+1} = {v^{K+1}_1, v^{K+1}_2, …, v^{K+1}_d}, with v^{K+1}_ij = x_j if i = j, otherwise v^{K+1}_ij = 0.

It is straightforward to construct a feed-forward fully connected neural network of K + 1 layers representing an image x in terms of a set of K hierarchically organised segmentations {S_k}_{k=1}^K as follows (see Fig. 3): the kth network layer has |S_k| inputs and |S_{k+1}| outputs, the identity as activation functions, biases equal to 0, and each weight w^k_ij equal to 1 if the region v^{k+1}_j belongs to the region v^k_i, 0 otherwise. The last layer K + 1 has d outputs and weights equal to (v^{K+1}_p)_{p=1}^d. The resulting network can be viewed as a decoder that, fed with the all-ones vector, outputs the image x.

Note that if one considers K = 1, it is possible to use the same approach in order to obtain an x’s segmentation without a hierarchical organisation. In this case the corresponding decoder is a network composed of just one layer. We want to clarify that the segmentation module described in this section represents a way to build an auto-encoder which encodes latent variables that are associated to image segments. These image segments are candidate MLFs. Explanations are built by a selection of these candidate segments in the second computational step of our approach. We emphasise that the first step of our framework is to build an auto-encoder so that each input can be decomposed in a set of MLFs where each latent variable is associated to a specific MLF. These MLFs represent candidate input properties to be included into the final explanation, which is computed by the second computational step of our approach. In this second part, a number of candidate middle-level input features are selected by an explanation method producing a relevance map of the input, such as the Layer-wise Relevance Propagation method (LRP). However, different choices of the auto-encoder can lead to different MLFs of different nature.

3.3. MLF from variational auto-encoders

The concept of ‘‘entangled features’’ is strictly related to the concept of ‘‘interpretability’’. As stated in [62], a disentangled data representation is most likely more interpretable than a classical entangled data representation. This is due to the representation of the generative factors into separate latent variables representing single features of the data (for example, the size or the colour of the object represented in an image).

Using Variational Auto-Encoders (VAE) is one of the most affirmed neural network-based methods to generate disentangled encodings. In general, a VAE is composed of two parts. First, an encoder generates an encoding of a given data point (in our case, an image). Then a decoder generates an image from an encoding. Once trained with a set of data, the VAE output x̃ on a given input x can be obtained as the composition of two functions, an encoding function E(·) and a decoding function D(·), implemented as two stacked feed-forward neural networks.

The encoding function generates a data representation E(x) = h of an image x; the decoding function generates an approximate version D(h) = x̃ of x given the encoding h, with a residual r = x − x̃. So, it is possible to restore the original image data simply by adding the residual to x̃, that is x = x̃ + r. Consequently,
given a set of K different segmentations {S1 , S2 , . . . , SK } of the we stack the decoder neural networks with a further dense layer
same image sorted from the coarser to the finer detail level, it R(·) having d neurons with weights set to 0 and biases set to r. The
A. Apicella, S. Giugliano, F. Isgrò et al. Knowledge-Based Systems 255 (2022) 109725

Fig. 3. A segmentation-based MLF framework. The MLF decoder is built as a neural network having as weights the segments returned by a hierarchical segmentation algorithm (see text for further details). The initial encoding is the ‘‘1’’ vector, since all the segments are used to compose the input image. The relevance backward algorithm returns the most relevant segments.
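The segment vectors of Section 3.2 can be sketched in a few lines. The following NumPy fragment (function and variable names are ours, for illustration only) builds one segment vector per region of a label map and checks that feeding the ‘‘1’’ encoding to such a one-layer decoder restores the input:

```python
import numpy as np

def segment_decoder(x, labels):
    """Build the segment vectors v_i of a flat segmentation:
    v_i keeps the pixels of region q_i and is zero elsewhere."""
    d = x.size
    regions = np.unique(labels)
    # One column per segment: the decoder's weight matrix.
    V = np.zeros((d, len(regions)))
    for col, r in enumerate(regions):
        mask = (labels.ravel() == r)
        V[mask, col] = x.ravel()[mask]
    return V

# Toy 2x2 "image" with two segments.
x = np.array([[1.0, 2.0], [3.0, 4.0]])
labels = np.array([[0, 0], [1, 1]])
V = segment_decoder(x, labels)
e = np.ones(V.shape[1])        # the "1" encoding vector
x_rec = (V @ e).reshape(x.shape)
assert np.allclose(x_rec, x)   # the sum of all segments restores x
```

Each column of V plays the role of one latent variable of the segmentation-based decoder, so zeroing a column removes exactly one candidate MLF from the reconstruction.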

Fig. 4. A VAE-based MLF framework. The MLF decoder is built as a neural network composed of the VAE decoder module followed by a fully connected layer containing the residual of the input (see text for further details). The initial input encoding is given by the VAE encoder module. The relevance backward algorithm returns the most relevant latent variables.
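The residual layer sketched in Fig. 4 can be illustrated as follows. The encoder/decoder below are random linear stand-ins for a trained VAE (names and shapes are ours), but the mechanism is the same: a dense layer with zero weights and biases equal to r = x − x̃ makes the stacked decoder reproduce x exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
x = rng.random(d)

# Stand-ins for a trained VAE encoder/decoder (illustrative only:
# a real VAE would be two trained neural networks).
W_enc = rng.random((3, d))
W_dec = rng.random((d, 3))
encode = lambda v: W_enc @ v
decode = lambda h: W_dec @ h

h = encode(x)
x_tilde = decode(h)
r = x - x_tilde                      # residual of the reconstruction

# Dense layer with weights 0 and biases r: it outputs r for any input h.
R = lambda h: np.zeros((d, len(h))) @ h + r

# Decoder + residual layer restores the original input exactly.
assert np.allclose(decode(h) + R(h), x)
```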

resulting network generates x as output given the latent encoding h, since Dξ(h) + R(h) = x̃ + r = x.
In Fig. 4 a pictorial description of the GMLF approach is shown for the case in which the auto-encoder is a VAE; the algorithmic description is reported in Algorithm 4.

4. Experimental assessment

In this section, we describe the chosen experimental setup. The goal is to examine the applicability of our approach for different types of MLFs obtained by different encoders. As stated in Section 3.1, three different types of MLFs are evaluated: flat (non-hierarchical) segmentation, hierarchical segmentation and VAE latent coding. For the non-hierarchical/hierarchical MLF approaches, the segmentation algorithm proposed in [60] was used to make MLFs, since its segmentation constraints respect the causality and the location principles reported in Section 3.2. However, for the non-hierarchical method, any segmentation algorithm can be used (see for example [63]). For the Variational Auto-Encoder (VAE) based GMLF approach, we used a β-VAE [62] as MLF builder, since it is particularly suitable for generating interpretable representations. In all the cases, we used as image classifier a VGG16 [64] network pre-trained on ImageNet. MLF relevances are computed with the LRP algorithm using the α−β rule [6].

In Section 5 we show a set of possible explanations of the classifier outputs on images sampled from the STL-10 dataset [65] and the Aberdeen dataset from the University of Stirling ([Link]). The STL-10 dataset is composed of images belonging to 10 different classes (airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck), and the Aberdeen database is composed of images belonging to 2 different classes (Male, Female). Only for the Aberdeen dataset was the classifier fine-tuned, using a subset of the whole dataset as training set.

4.1. Flat segmentation approach

For the flat (non-hierarchical) segmentation approach, images from the STL-10 and the Aberdeen datasets are used to generate the classifier outputs and corresponding explanations. For each test image, a set of segments (or superpixels) S is generated using the image segmentation algorithm proposed in [60], considering just one level. Therefore, a one-layer neural network decoder as described in Section 3.2 was constructed using the segmentation S. The resulting decoder is stacked on the top of the VGG16 model

Algorithm 2: Hierarchical segmentation-based Encoder–Decoder Generator
Input: data point x, hierarchical segmentation procedure seg, hierarchical segmentation parameters λ = (λ1, λ2, . . . , λK)
Output: A Decoder Dξ, an Encoder Eξ
1 {S1, S2, . . . , SK} ← seg(x, λ);
2 SK+1 ← ∅;
3 for xj ∈ x do
4   let v^{K+1} ∈ {0}^d;
5   v^{K+1}_j ← xj;
6   SK+1 ← SK+1 ∪ {v^{K+1}};
7 end
8 for 1 ≤ k ≤ K do
9   let W^k ∈ {0}^{|Sk|×|Sk+1|};
10  let b^k ∈ {0}^{|Sk+1|};
11  for 1 ≤ i ≤ |Sk| do
12    for 1 ≤ j ≤ |Sk+1| do
13      if v^{k+1}_j belongs to v^k_i then
14        Wij ← 1;
15      end
16    end
17  end
18 end
19 define identity : a ↦→ a;
20 Dξ ← generateNeuralNetwork(weights = {W^1, . . . , W^K}, biases = {b^1, . . . , b^K}, activation function = identity);
21 define Eξ : x ↦→ e ∈ {1}^{|S1|};
22 return Dξ, Eξ;

Algorithm 3: GMLF approach in case of hierarchical segmentation-based auto-encoder
Input: a data point x ∈ R^d; a trainedNeuralNet returning the class scores given a data point; a hierarchical segmentation procedure seg; hierarchical segmentation parameters λ = (λ1, λ2, . . . , λK); a relevance propagation algorithm RP returning a relevance vector for each network layer given: i) a neural network, ii) an input and iii) its class probabilities; a generateNeuralNetwork function that returns a neural network with weights, biases and activation function given as parameters
Output: relevances for the first K layers {u1, . . . , uK}
1 y ← M(x):
   a) y ← TrainedNeuralNet(x);
2 build an auto-encoder AE ≡ (Eξ, Dξ):
   a) (Eξ, Dξ) ← buildAE(x, seg, λ); ▷ see Algorithm 2
3 h ← Eξ(x);
4 define R : h ↦→ x − Dξ(h):
   a) let Wres ∈ {0}^{d×d};
   b) r ← x − Dξ(h);
   c) bres ← r;
   d) define identity : a ↦→ a;
   e) R ← generateNeuralNetwork(weights = {Wres}, biases = {bres}, activation function = identity);
5 define DMξ : h ↦→ M(Dξ(h) + R(h)):
   a) DMξ ← stackTogether(Dξ, R, M);
6 U ← RP(DMξ, h, y);
7 return {u1, . . . , uK};

and fed with the ‘‘1’’ vector (see Fig. 3). The relevance of each superpixel/segment was then computed using the LRP algorithm.

4.2. Hierarchical image segmentation approach

As for the non-hierarchical segmentation approach, the segmentation algorithm proposed in [60] was used, but in this case three hierarchically organised levels were considered. Thus, for each test image, 3 different sets of segments (or superpixels) {S1, S2, S3}, related to each other in a hierarchical fashion, are generated, going from the coarsest (i = 1) to the finest (i = 3) segmentation level. Next, a hierarchical decoder is built as described in Section 3.2 and stacked on the classifier (see Fig. 3). As for the non-hierarchical case, the decoder is then fed with the ‘‘1’’s vector. Finally, LRP is used to obtain hierarchical explanations as follows: (1) first, at the coarsest level i = 1, the most relevant segment s^i_max is selected; (2) then, for each finer level i > 1, the segment s^i_max corresponding to the most relevant segment belonging to s^{i−1}_max is chosen.

4.3. Variational auto-encoders

Images from the Aberdeen dataset are used to construct an explanation based on the relevances of the VAE latent variables. The VAE model was trained on an Aberdeen subset using the architecture suggested in [62] for the CelebA dataset. Then, an encoding of 10 latent variables is made using the encoder network for each test image. The resulting encodings were fed to the decoder network stacked on top of the trained VGG16. Next, the LRP algorithm was applied on the decoder top layer to compute the relevance of each latent variable.

5. Results

In this section we report the evaluation assessment of the different realisations of the GMLF framework described in the previous section. For the evaluation we show both qualitative and quantitative results (see Section 5.5). In particular, in the first part of this section we report some examples of explanations obtained using flat and hierarchical segmentation-based MLFs, and VAE-based MLFs. Thereafter, we show an example of explanation using different types of MLFs. Finally, in Section 5.5 we report a quantitative evaluation of the obtained results.

5.1. Flat segmentation

In Fig. 5 we show some of the explanations produced for a set of images using the flat (non-hierarchical) segmentation-based experimental setup described in Section 3.2. The proposed explanations are reported considering the first two most relevant segments according to the method described in Section 3.2. For each image, the real class and the assigned class are reported. From a qualitative visual inspection, one can observe that the selected segments seem to play a relevant role in distinguishing the classes.
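The weight construction of Algorithm 2 (Wij = 1 exactly when a fine segment is contained in a coarse one) can be sketched as follows; the label maps and helper names are ours, for illustration only:

```python
import numpy as np

def level_weights(coarse_labels, fine_labels):
    """W[i, j] = 1 iff fine segment j is contained in coarse segment i
    (the containment test of Algorithm 2)."""
    coarse_ids = np.unique(coarse_labels)
    fine_ids = np.unique(fine_labels)
    W = np.zeros((len(coarse_ids), len(fine_ids)))
    for j, f in enumerate(fine_ids):
        # By the causality and location principles, each fine segment
        # lies entirely inside exactly one coarse segment.
        parent = coarse_labels[fine_labels == f][0]
        W[np.where(coarse_ids == parent)[0][0], j] = 1
    return W

coarse = np.array([0, 0, 0, 1])      # 2 coarse segments over 4 pixels
fine = np.array([0, 0, 1, 2])        # 3 fine segments over the same pixels
W = level_weights(coarse, fine)
assert W.tolist() == [[1, 1, 0], [0, 0, 1]]
```

Stacking one such matrix per pair of adjacent segmentation levels, with the identity activation, yields the hierarchical decoder of Algorithm 2.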

Fig. 5. Explanations obtained by GMLF using the flat strategy (second columns), LIME (third columns) and LRP (fourth columns) for VGG16 network responses using images from the STL10 (a) and Aberdeen (b) datasets. In both (a) and (b), for each input (first columns) the explanation in terms of the most relevant segments is reported for the proposed flat approach (second columns) and LIME (third columns). For better clarity, we report a colourmap where only the two most relevant segments are highlighted, both for GMLF and LIME.

Algorithm 4: GMLF approach in case of VAE auto-encoder
Input: a data point x ∈ R^d; a trainedNeuralNet returning the class scores given a data point; a getTrainedVAE procedure returning a trained VAE; a relevance propagation algorithm RP returning a relevance vector given: i) a neural network, ii) an input and iii) its class probabilities; a generateNeuralNetwork function returning a neural network with weights, biases and activation function given as parameters
Output: relevances u of each latent variable
1 y ← M(x):
   a) y ← TrainedNeuralNet(x);
2 build an auto-encoder AE ≡ (Eξ, Dξ):
   a) (Eξ, Dξ) ← getTrainedVAE();
3 h ← Eξ(x);
4 define R : h ↦→ x − Dξ(h):
   a) let Wres ∈ {0}^{d×d};
   b) r ← x − Dξ(h);
   c) bres ← r;
   d) define identity : a ↦→ a;
   e) R ← generateNeuralNetwork(weights = {Wres}, biases = {bres}, activation function = identity);
5 define DMξ : h ↦→ M(Dξ(h) + R(h)):
   a) DMξ ← stackTogether(Dξ, R, M);
6 u ← RP(DMξ, h, y);
7 return u;

5.2. Hierarchical image segmentation

In Figs. 6 and 7 we show a set of explanations using the hierarchical approach described in Section 3.2 on images of the STL10 and the Aberdeen datasets. In this case, we exploit the hierarchical segmentation organisation to provide MLF explanations. In particular, for each image, a three-layer decoder has been used, obtaining three different image segmentations S1, S2 and S3, from the coarsest to the finest one, which are hierarchically organised (see Section 3.2). For the coarsest segmentation (S1), the two most relevant segments s^1_1 and s^1_2 are highlighted in the central row. For the image segmentation S2, the most relevant segment s^2_1 belonging to s^1_1 and the most relevant segment s^2_2 belonging to s^1_2 are highlighted in the upper and the lower row (second column). The same process is made for the image segmentation S3, where the most relevant segment s^3_1 belonging to s^2_1 and the most relevant segment s^3_2 belonging to s^2_2 are shown in the third column. From a qualitative perspective, one can note that the proposed approach seems to select relevant segments for distinguishing the classes. Furthermore, the hierarchical organisation provides clearer insights about the parts of the input image contributing to the classifier decision.

The usefulness of a hierarchical method can also be seen in cases of wrong classifier responses. See, for example, Fig. 8, where the hierarchical segmentation MLF approach was applied to two wrongly classified images: (1) a dog wrongly classified as a poodle, although it is evidently of a completely different breed, and (2) a cat classified as a bow tie. Inspecting the MLF explanations at different hierarchy scales, it can be seen that, in the dog case, the classifier was misled by the wig (which probably led the classifier toward the poodle class), while, in the other case, the position of the cat's head near the neck of the shirt, while the remaining part of the body is hidden, could be responsible for the wrong classification.

5.3. VAE-based MLF explanations

In Fig. 9 a set of results using the VAE-based experimental setup described in Section 4 is shown. For each input, a relevance

Fig. 6. Examples of a two-layer hierarchical explanation on images classified as warplane, tabby, hartebeest and dalmatian, respectively, by VGG16. (a) First column: segment heat map. Left to right: segments sorted in descending relevance order. Top-down: the coarsest (second row) and the finest (third row) hierarchical level. (b) LIME explanation: same input, same segmentation used in (a).

vector on the latent variable coding is computed. Then, a set of decoded images is generated by varying the two most relevant latent variables while fixing the other ones to the original encoding values.
One can observe that, varying the most relevant latent variables, relevant image properties for the classifier decision are modified, such as hair length and style.

5.4. Multiple MLF explanations

For the same classifier input–output pair, we show the possibility to provide multiple and different MLF explanations based on the three types of previously mentioned MLFs. In Fig. 10, for each input, three different types of explanations are shown. In the first row, an explanation based on MLFs obtained by a flat image segmentation is reported. In the second row, an explanation based on MLFs obtained by a hierarchical segmentation. In the last row, a VAE-based MLF explanation is shown. Notice that the three types of explanations, although based on different MLFs, seem coherent with each other.

5.5. Quantitative evaluation

A quantitative evaluation is performed adopting the MoRF (Most Relevant First) and AOPC (Area Over Perturbation Curve) [6,66] curve analysis. In this work, the MoRF curve is computed following the region flipping approach, a generalisation of the pixel-flipping measure proposed in [6]. In a nutshell, given an image classification, image regions (in our case, segments) are iteratively replaced by random noise and fed to the classifier, following the descending order of the relevance values returned by the explanation method. In this manner, the more relevant for the classification output the identified MLFs are, the steeper the curve is. Instead, AOPC is computed as:

AOPC = 1/(L + 1) ⟨ Σ_{k=0}^{L} ( f(x^(0)) − f(x^(k)) ) ⟩


Fig. 7. Examples of a two-layer hierarchical explanation on images classified as Female and Male by VGG16. (a) First column: segment heat map. Left to right:
segments sorted in descending relevance order. Top-down: the coarsest (second row) and the finest (third row) hierarchical level. (b) LIME explanation: same input,
same segmentation used in (a).

Fig. 8. Results obtained by the hierarchical MLF approach (described in Section 3.2) using the VGG16 network on STL10 images wrongly classified by the model. (a) A dog wrongly classified as a poodle, although it is evidently of a completely different breed. Inspecting the MLF explanations at different hierarchy scales, it can be seen that the classifier was probably misled by the wig (which probably led the classifier toward the poodle class). (b) A cat wrongly classified as a bow tie. Inspecting the MLF explanations at different hierarchy scales, it can be seen that the shape and the position of the cat's head near the neck of the shirt, while the rest of its body is hidden, could be responsible for the wrong class.
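The top-down selection used for the hierarchical explanations (most relevant segment at the coarsest level, then the most relevant child at each finer level, as described in Section 4.2) can be sketched as follows; the data layout below is ours, for illustration only:

```python
def hierarchical_selection(levels, children, relevance):
    """Pick the most relevant segment at the coarsest level, then,
    at each finer level, the most relevant segment among the children
    of the previously chosen one.
    `levels[lvl]` lists the segments of level lvl;
    `children[(lvl, s)]` lists the segments of level lvl+1 inside s;
    `relevance[(lvl, s)]` is the relevance of segment s at level lvl."""
    s = max(levels[0], key=lambda seg: relevance[(0, seg)])
    path = [s]
    for lvl in range(len(levels) - 1):
        s = max(children[(lvl, s)], key=lambda seg: relevance[(lvl + 1, seg)])
        path.append(s)
    return path

levels = [['a', 'b'], ['a1', 'a2', 'b1'], ['a1x', 'a2x', 'b1x']]
children = {(0, 'a'): ['a1', 'a2'], (0, 'b'): ['b1'],
            (1, 'a1'): ['a1x'], (1, 'a2'): ['a2x'], (1, 'b1'): ['b1x']}
relevance = {(0, 'a'): 0.9, (0, 'b'): 0.1, (1, 'a1'): 0.2, (1, 'a2'): 0.7,
             (1, 'b1'): 0.5, (2, 'a1x'): 0.3, (2, 'a2x'): 0.6, (2, 'b1x'): 0.1}
assert hierarchical_selection(levels, children, relevance) == ['a', 'a2', 'a2x']
```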

Fig. 9. Results obtained by the VAE MLF approach (described in Section 3.3) using a VGG16 network on the Aberdeen image dataset. For each image, a VAE encoding is constructed. For each input, the resulting relevance vector on the latent variables is computed. Then, decoded images are generated varying the two most relevant latent variables while fixing the other ones to the original values.
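The latent-variable sweep of Fig. 9 can be sketched as follows; the decoder below is a toy linear stand-in for the trained VAE decoder (names are ours, for illustration only):

```python
import numpy as np

def latent_sweep(decode, h, idx, values):
    """Decode h while varying latent variable `idx` over `values`,
    keeping the other latent variables fixed (as in Fig. 9)."""
    outputs = []
    for v in values:
        h_mod = h.copy()
        h_mod[idx] = v
        outputs.append(decode(h_mod))
    return outputs

# Toy stand-in decoder: a fixed linear map (a real VAE decoder is a
# trained neural network producing an image).
rng = np.random.default_rng(1)
W = rng.random((16, 10))
decode = lambda h: W @ h

h = rng.random(10)                      # a 10-variable latent encoding
imgs = latent_sweep(decode, h, idx=3, values=np.linspace(-2, 2, 5))
assert len(imgs) == 5 and all(img.shape == (16,) for img in imgs)
```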


Fig. 10. For each input, three different types of explanations obtained by the GMLF approach are shown. In the first row, an explanation based on a flat image segmentation is reported; in the second row, an explanation based on a hierarchical segmentation; in the last row, a VAE-based MLF explanation is shown.

Fig. 11. A quantitative evaluation of the hierarchical GMLF approach on different input images. To evaluate the hierarchical GMLF approach with respect to the LIME approach, a most-relevant-segment analysis is made using MoRF curves. MoRF curves computed with the proposed approach (red) and LIME (blue), using the last-layer MLFs as segmentation for both methods, are shown. At each iteration step, a perturbed input based on the returned explanation is fed to the classifier. On the y axis of each plot, the classification probability (in %) of the original class for each perturbed input; on the x axis, the perturbation steps. For each input image, the figures in the first and the second row show the perturbed inputs fed to the classifier at each perturbation step for the proposed explainer system and the LIME explainer, respectively. The more relevant for the classification output the identified MLFs are, the steeper the MoRF curve is.
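The MoRF region-flipping procedure described in Section 5.5 can be sketched as follows; the classifier and segment masks below are toy stand-ins (names are ours, for illustration only):

```python
import numpy as np

def morf_curve(classify, x, masks, relevances, rng):
    """Most-Relevant-First curve: iteratively replace segments with
    random noise, most relevant segment first, recording the score."""
    order = np.argsort(relevances)[::-1]      # descending relevance
    x_pert = x.copy()
    curve = [classify(x_pert)]                # score of the unperturbed input
    for idx in order:
        noise = rng.random(x.shape)
        x_pert = np.where(masks[idx], noise, x_pert)
        curve.append(classify(x_pert))
    return curve

# Toy classifier: the score is the mean of the first two "pixels".
classify = lambda x: float(x[:2].mean())
x = np.array([1.0, 1.0, 0.0, 0.0])
masks = [np.array([1, 1, 0, 0], bool), np.array([0, 0, 1, 1], bool)]
rng = np.random.default_rng(0)
curve = morf_curve(classify, x, masks, [0.9, 0.1], rng)
assert len(curve) == 3 and curve[0] == 1.0
```

Since the first mask covers the pixels the toy classifier depends on, flipping it first makes the score drop immediately, which is exactly what a steep MoRF curve indicates.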

where L is the total number of perturbation steps, f(·) is the classifier output score, x^(0) is the original input image, x^(k) is the input perturbed at step k, and ⟨·⟩ is the average operator over a set of input images. In this manner, the more relevant for the classification output the identified MLFs are, the greater the AOPC value is.
To evaluate the hierarchical approach with respect to the flat segmentation approach, at each step, MLFs were removed from the inputs exploiting the hierarchy in a topological-sort depth-first search based on the descending order of the relevances. Therefore, the MLFs of the finest hierarchical layer were considered. MoRF and AOPC are shown in Figs. 11 and 12. In Fig. 11, MoRF curves for some inputs are shown. It is evident that the MLFs selected by the proposed hierarchical approach are more relevant for the produced classification output. This result is confirmed by the average MoRF and average AOPC curves (Fig. 12), obtained averaging over the MoRF and AOPC curves of a sample of 100 and 50 random images taken from STL10 and Aberdeen, respectively.
To make an easy comparison between the proposed methods and summarise the quantitative evaluations, the last-iteration AOPC values of the proposed methods and LIME are reported in Tables 1 and 2 for the STL10 and Aberdeen datasets, respectively.
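For a single image, the AOPC definition above amounts to a few lines; the score sequences below are made-up toy values, and the average ⟨·⟩ over a set of images is omitted for brevity:

```python
import numpy as np

def aopc(scores):
    """AOPC for one image: scores[k] is the classifier score f(x^(k))
    after perturbing the k most relevant MLFs (scores[0] = f(x^(0)))."""
    L = len(scores) - 1
    return np.sum(scores[0] - np.asarray(scores)) / (L + 1)

# A steeper score drop (relevant MLFs removed first) gives a larger AOPC.
steep = aopc([0.9, 0.2, 0.1, 0.1])    # = (0 + 0.7 + 0.8 + 0.8) / 4 = 0.575
shallow = aopc([0.9, 0.8, 0.7, 0.6])  # = (0 + 0.1 + 0.2 + 0.3) / 4 = 0.15
assert steep > shallow
```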

Fig. 12. Average MoRF (first row) and AOPC (second row) computed on a sample of 100 and 50 random images sampled from STL10 (first column) and Aberdeen (second column), respectively. Both the curves of the proposed hierarchical approach (red) and LIME (blue) are plotted, using as baseline the removal of the Middle-Level Features from the input images in a random order (green). The more relevant for the classification output the identified MLFs are, the steeper the MoRF curve and the greater the AOPC value.

Table 1
Average AOPC of the proposed methods and LIME, obtained averaging the last AOPC perturbation step values over a sample of 100 random images taken from the STL10 dataset. The flat and hierarchical proposals are compared with LIME, resulting better in both cases. Since the LIME MLF structure is largely different from the VAE MLFs (the former corresponds to superpixels, the latter to latent variables), the VAE AOPC reported is not to be compared with the other results.

                          AOPC
LIME                      0.042
Flat (proposed)           0.598
Hierarchical (proposed)   0.732
VAE (proposed)            0.595

Table 2
Average AOPC of the proposed methods and LIME, obtained averaging the last AOPC perturbation step values over a sample of 50 random images taken from the Aberdeen dataset.

                          AOPC
LIME                      0.014
Flat (proposed)           0.571
Hierarchical (proposed)   0.661

In Fig. 14, the same quantitative analysis using the VAE strategy is shown. Examples of MoRF curves using the VAE are shown in Fig. 13. As in the hierarchical approach, the latent features are sorted following the descending order returned by the relevance algorithm, and then noised in turn at each perturbation step. Due to the difference between LIME and VAE MLFs (the former corresponds to superpixels, the latter to latent variables), no comparison with LIME is reported. To our knowledge, no other study reports explanations in terms of latent variables; therefore it is not easy to make a qualitative comparison with the existing methods. Differently from perturbing the MLF of a superpixel-based approach, where only an image part is substituted by noise, in a variational latent space perturbing a latent variable can lead to changes in the whole input image. Therefore, classifiers fed with decoded images generated by different MLF types could return non-comparable results, which may not be informative for comparisons between MoRF curves.

6. Conclusion

A framework to generate explanations in terms of middle-level features is proposed in this work. With the expression Middle-Level Features (MLF) (see Section 1) we mean input features that represent more salient and understandable input properties for a user, such as parts of the input (for example, nose, ears and paw, in the case of images of humans) or more abstract input properties (for example, shape, viewpoint, thickness and so on). The use of middle-level features is motivated by the need to decrease the

Fig. 13. A quantitative evaluation of the VAE GMLF approach on different input images. MoRF curves computed with the proposed approach (red), perturbing the VAE latent variables in the order given by the explainer, are shown. At each iteration step, a perturbed input based on the returned explanation is fed to the classifier. On the y axis of each plot, the classification probability (in %) of the original class for each perturbed input; on the x axis, the perturbation steps. For each input image, the figures show the perturbed inputs fed to the classifier at each perturbation step for the proposed explainer system. The more relevant for the classification output the identified MLFs are, the steeper the MoRF curve is.

Fig. 14. Average MoRF (first column) and AOPC (second column) computed on a sample of 50 random images sampled from the Aberdeen dataset. The curve of the proposed VAE approach (red) is plotted using as baseline the removal of the Middle-Level Features from the input images in a random order (green). The more relevant for the classification output the identified MLFs are, the steeper the MoRF curve and the greater the AOPC value.

human interpretative burden in artificial intelligence explanation systems.
Our approach can be considered a general framework to obtain humanly understandable explanations insofar as it can be applied to different types of middle-level features, as long as an encoder/decoder system is provided (for example, image segmentation or latent coding) and an explanation method producing heatmaps can be applied on both the decoder and the ML system whose decision is to be explained (see Section 3.1). Consequently, the proposed approach enables one to obtain different types of explanations in terms of different MLFs for the same input/decision pair of an ML system, which may allow developing XAI solutions able to provide user-centred explanations, according to several research directions proposed in the literature [2,31].

We experimentally tested (see Sections 4 and 5) our approach [12] B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, et al., Inter-
using three different types of MLFs: flat (non hierarchical) seg- pretability beyond feature attribution: Quantitative testing with concept
activation vectors (tcav), in: International Conference on Machine Learning,
mentation, hierarchical segmentation and VAE latent coding. Two
PMLR, 2018, pp. 2668–2677.
different datasets were used: STL-10 dataset and the Aberdeen [13] A. Ghorbani, J. Wexler, J. Zou, B. Kim, Towards automatic concept-based
dataset from the University of Stirling. explanations, 2019, arXiv preprint arXiv:1902.03129.
We evaluated our results from both a qualitative and a quan- [14] A. Akula, S. Wang, S.-C. Zhu, Cocox: Generating conceptual and counter-
titative point of view. The quantitative evaluation was obtained factual explanations via fault-lines, in: Proceedings of the AAAI Conference
on Artificial Intelligence, Vol. 34, 2020, pp. 2594–2601.
using MoRF curves [66]. [15] Y. Bengio, A. Courville, P. Vincent, Representation learning: A review and
The results are encouraging, both under the qualitative point new perspectives, IEEE Trans. Pattern Anal. Mach. Intell. 35 (8) (2013)
of view, giving easily human interpretable explanations, and the 1798–1828.
quantitative point of view, giving comparable performances to [16] F. Locatello, S. Bauer, M. Lucic, G. Raetsch, S. Gelly, B. Schölkopf, O.
Bachem, Challenging common assumptions in the unsupervised learning
LIME. Furthermore, we show that a hierarchical approach can
of disentangled representations, in: International Conference on Machine
provide, in several cases, clear explanations about the reason Learning, PMLR, 2019, pp. 4114–4124.
behind classification behaviours. [17] R.T. Chen, X. Li, R. Grosse, D. Duvenaud, Isolating sources of disentan-
glement in variational autoencoders, 2018, arXiv preprint arXiv:1802.
CRediT authorship contribution statement 04942.
[18] F.L. Galvão, S.J.F. Guimarães, A.X. Falcão, Image segmentation using dense
and sparse hierarchies of superpixels, Pattern Recognit. 108 (2020) 107532.
Andrea Apicella: Performed the experiments, Analyzed the [19] D. Charte, F. Charte, M.J. del Jesus, F. Herrera, An analysis on the use of
results, Writing & review original draft. Salvatore Giugliano: Per- autoencoders for representation learning: Fundamentals, learning task case
formed the experiments, Analyzed the results, Writing & review studies, explainability and challenges, Neurocomputing 404 (2020) 93–107.
[20] A. Apicella, S. Giugliano, F. Isgrò, R. Prevete, Explanations in terms of
original draft. Francesco Isgrò: Performed the experiments, An-
hierarchically organised middle level features, in: [Link] - 2021 Ital-
alyzed the results, Writing & review original draft. Roberto Pre- ian Workshop on Explainable Artificial Intelligence, CEUR Workshop
vete: Performed the experiments, Analyzed the results, Writing & Proceedings, 2021.
review original draft. [21] M. Tschannen, O. Bachem, M. Lucic, Recent advances in autoencoder-based
representation learning, 2018, arXiv preprint arXiv:1812.05069.
[22] C.K. Sønderby, T. Raiko, L. Maaløe, S.K. Sønderby, O. Winther, Ladder vari-
Declaration of competing interest
ational autoencoders, in: Proceedings of the 30th International Conference
on Neural Information Processing Systems, 2016, pp. 3745–3753.
The authors declare that they have no known competing finan- [23] S. Zhao, J. Song, S. Ermon, Learning hierarchical features from deep
cial interests or personal relationships that could have appeared generative models, in: International Conference on Machine Learning,
to influence the work reported in this paper. PMLR, 2017, pp. 4091–4099.
[24] X. Gu, W. Ding, A hierarchical prototype-based approach for classification,
Inform. Sci. 505 (2019) 325–351.
Acknowledgement [25] M.T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining
the predictions of any classifier, in: Proceedings of the 22Nd ACM SIGKDD
This work is supported by the European Union - FSE-REACT- International Conference on Knowledge Discovery and Data Mining, KDD
’16, ACM, New York, NY, USA, 2016, pp. 1135–1144.
EU, PON Research and Innovation 2014–2020 DM1062/2021 con-
[26] A. Apicella, F. Isgrò, R. Prevete, G. Tamburrini, Contrastive explanations
tract number 18-I-15350-2 and by the Ministry of University to classification systems using sparse dictionaries, in: International Con-
and Research, PRIN research project ‘‘BRIO – BIAS, RISK, OPACITY ference on Image Analysis and Processing, Springer, Cham, 2019, pp.
in AI: design, verification and development of Trustworthy AI.", 207–218.
Project no. 2020SSKZ7R . [27] F. Donnarumma, R. Prevete, D. Maisto, S. Fuscone, E.M. Irvine, M.A. van der
Meer, C. Kemere, G. Pezzulo, A framework to identify structured behavioral
patterns within rodent spatial trajectories, Sci. Rep. 11 (1) (2021) 1–20.
