Artificial Intelligence
Graphical Abstract: Explainable AI.
1 INTRODUCTION AND MOTIVATION
Artificial intelligence (AI) is perhaps the oldest field of computer science and
very broad, dealing with all aspects of mimicking cognitive functions for real-
world problem solving and building systems that learn and think like people.
Therefore, it is often called machine intelligence (Poole, Mackworth, & Goebel,
1998) to contrast it with human intelligence (Russell & Norvig, 2010). The field revolves around the intersection of cognitive science and computer science (Tenenbaum, Kemp, Griffiths, & Goodman, 2011). AI now raises enormous interest due to the practical successes of machine learning (ML). AI has always had a strong linkage to explainability; an early example is the Advice Taker, proposed by McCarthy in 1958 as a “program with common sense” (McCarthy, 1960). It was probably the first proposal of common sense reasoning abilities as the key to AI. Recent research emphasizes more and more that AI systems should be able to
build causal models of the world that support explanation and understanding, rather
than merely solving pattern recognition problems (Lake, Ullman, Tenenbaum, &
Gershman, 2017).
ML is a very practical field of AI with the aim of developing software that can automatically learn from previous data to gain knowledge from experience and to gradually improve its learning behavior in order to make predictions based on new data (Michalski, Carbonell, & Mitchell, 1984). The grand challenges are in sense-making,
in context understanding, and in decision making under uncertainty (Holzinger,
2017). ML can be seen as the workhorse of AI and the adoption of data intensive ML
methods can meanwhile be found everywhere, throughout science, engineering and
business, leading to more evidence-based decision-making (Jordan & Mitchell, 2015).
The enormous progress in ML has been driven by the development of new statistical
learning algorithms along with the availability of large data sets and low-cost
computation (Abadi et al., 2016). One method that is nowadays extremely popular is deep learning (DL).
Explainability is at least as old as AI itself, or rather a problem caused by it. In the pioneering days of AI (Newell, Shaw, & Simon, 1958), reasoning
methods were logical and symbolic. These approaches were successful, but only in a
very limited domain space and with extremely limited practical applicability. A
typical example is MYCIN (Shortliffe & Buchanan, 1975), which was an expert system
developed in Lisp to identify bacteria causing severe infections and to recommend
antibiotics. MYCIN was never used in clinical routine, maybe because of its stand-
alone character and the high effort in maintaining its knowledge base. However,
these early AI systems reasoned by performing some form of logical inference on
human readable symbols, and were able to provide a trace of their inference steps.
This was the basis for explanation, and there is some early related work available,
for example, (Johnson, 1994; Lacave & Diez, 2002; Swartout, Paris, & Moore, 1991).
Here, we should mention that there are three types of explanations: (1) a peer-to-
peer explanation as it is carried out among physicians during medical reporting;
(2) an educational explanation as it is carried out between teachers and students;
(3) a scientific explanation in the strict sense of science theory (Popper, 1935).
We emphasize that in this article we mean the first type of explanation.
In medicine there is a growing demand for AI approaches that not only perform well, but are trustworthy, transparent, interpretable and explainable for a human expert, for example, through sentences of natural language (Hudec, Bednárová, & Holzinger, 2018). Methods and models are necessary to reenact the
machine decision-making process, to reproduce and to comprehend both the learning
and knowledge extraction process. This is important, because for decision support
it is necessary to understand the causality of learned representations (Gershman,
Horvitz, & Tenenbaum, 2015; Pearl, 2009; Peters, Janzing, & Schölkopf, 2017).
At the same time, it is interesting to know that while it is often assumed that
humans are always able to explain their decisions, this is often not the case!
Sometimes experts are not able to provide an explanation based on the various
heterogeneous and vast sources of information. Consequently, explainable AI calls for confidence, safety, security, privacy, ethics, fairness and trust
(Kieseberg, Weippl, & Holzinger, 2016), and brings usability (Holzinger, 2005) and
Human-AI Interaction into a new and important focus (Miller, Howe, & Sonenberg,
2017). All these aspects together are crucial for applicability in medicine
generally, and for future personalized medicine, in particular (Hamburg & Collins,
2010).
(i) Ground truth cannot always be well defined, especially when making a medical
diagnosis.
(ii) Human (scientific) models are often based on causality as an ultimate aim for
understanding underlying mechanisms, and while correlation is accepted as a basis
for decisions, it is viewed as an intermediate step. In contrast, today's successful ML algorithms are typically based on probabilistic models and provide only a crude basis for further establishing causal models. When discussing the explainability of a machine statement, we therefore propose to distinguish between:
Directly understandable, hence explainable for humans, are data, objects or any graphical representations in ≤ ℝ³, for example, images (arrays of pixels, glyphs, correlation functions, graphs, 2D/3D projections, etc.) or text (sequences of natural language). Humans are able to perceive data as images or words, process them as information in a physiological sense, cognitively interpret the extracted information with reference to their subjective previous knowledge (humans have a lot of prior knowledge), and integrate this new knowledge into their own cognitive knowledge space. Strictly speaking, a distinction must be made between understanding natural images (pictures), understanding text (symbols) and understanding spoken language.
Not directly understandable, thus not explainable for humans, are abstract vector spaces in > ℝ³ (e.g., word embeddings) or undocumented, that is, previously unknown input features (e.g., sequences of text with unknown symbols, such as Chinese for an English speaker). An example illustrates this: in so-called word embedding (Mikolov, Chen, Corrado, & Dean, 2013), words and/or phrases are assigned to vectors. Conceptually, this is a mathematical embedding of a space with one dimension per word into a continuous vector space of reduced dimension. Methods to generate such a “mapping” include, for example, deep neural nets and probabilistic models with an explicit representation in relation to the context in which the words appear.
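As a hedged illustration (not the skip-gram method of Mikolov et al.), the following minimal Python sketch assigns words of a toy vocabulary to 50-dimensional vectors and compares them by cosine similarity; the vocabulary, the random vectors and the dimensionality are purely illustrative assumptions. It only shows why such a representation is not directly understandable: humans cannot inspect a 50-dimensional vector, only derived quantities such as similarities or low-dimensional projections.

```python
import numpy as np

# Toy vocabulary; a real word embedding would be learned from a large corpus.
vocab = ["fibrosis", "cirrhosis", "hepatocyte", "bilirubin", "antibiotic"]
dim = 50  # illustrative embedding dimension, far beyond R^3

rng = np.random.default_rng(0)
# Placeholder for learned vectors: here simply random, hence meaningless,
# but structurally identical to a trained embedding matrix (|vocab| x dim).
embeddings = {word: rng.normal(size=dim) for word in vocab}

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Humans cannot directly inspect 50-dimensional vectors; similarity scores
# (or 2D/3D projections) are needed to relate them back to something interpretable.
print(cosine_similarity(embeddings["fibrosis"], embeddings["cirrhosis"]))
```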
For more details on the theory behind scientific explainability we refer to the
principles of abductive reasoning (Ma et al., 2010) and point to some current work
(Babiker & Goebel, 2017; Goebel et al., 2018).
Posthoc systems aim to provide local explanations for a specific decision and to make it reproducible on demand (instead of explaining the whole system's behavior). A representative example is local interpretable model-agnostic explanations (LIME) developed by Ribeiro, Singh, and Guestrin (2016b), which is a model-agnostic system, where x ∈ ℝ^d is the original representation of an instance being explained, and x′ ∈ ℝ^{d′} denotes a vector for its interpretable representation (e.g., x may be a feature vector containing word embeddings, with x′ being the bag of words). The goal is to identify an interpretable model over the interpretable representation that is locally faithful to the classifier. The explanation model is g : ℝ^{d′} → ℝ, g ∈ G, where G is a class of potentially interpretable models, such as linear models, decision trees, or rule lists; given a model g ∈ G, it can be visualized as an explanation for the human expert (for details please refer to Ribeiro, Singh, & Guestrin, 2016a). Another example of a posthoc system is black
box explanations through transparent approximations (BETA), a model-agnostic
framework for explaining the behavior of any black-box classifier by simultaneously
optimizing for fidelity to the original model and interpretability of the explanation, introduced by Lakkaraju, Kamar, Caruana, and Leskovec (2017).
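To make the idea of a local surrogate explanation more concrete, here is a minimal LIME-style sketch (a simplification, not the reference implementation of Ribeiro et al.): the black-box prediction function is probed with random binary perturbations of the interpretable representation, the perturbed samples are weighted by an exponential proximity kernel, and a weighted linear model g is fitted whose coefficients serve as the local explanation. The perturbation scheme, kernel width and the use of ridge regression are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(predict_proba, x, n_samples=500, kernel_width=0.75, seed=0):
    """Fit a weighted linear surrogate g around instance x (LIME-style sketch).

    predict_proba: black-box function mapping an (n, d) array to class-1 probabilities.
    x: 1D array, the instance to explain (interpretable representation = feature on/off).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Binary perturbation masks z' in {0, 1}^d around the instance.
    masks = rng.integers(0, 2, size=(n_samples, d))
    perturbed = masks * x  # switched-off features are set to zero (simplifying assumption)
    # Proximity kernel: samples closer to x (more features kept) get higher weight.
    distances = np.sqrt(((masks - 1) ** 2).sum(axis=1)) / np.sqrt(d)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    targets = predict_proba(perturbed)
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, targets, sample_weight=weights)
    # Coefficients of g approximate the local influence of each feature.
    return surrogate.coef_

# Usage with a hypothetical black box:
# coefs = explain_locally(my_model.predict_proba_class1, x_instance)
```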
Typically, deep neural networks are trained using supervised learning on large and
carefully annotated data sets. However, the need for such data sets restricts the
space of problems that can be addressed. On the one hand, this has led to a proliferation of deep learning results on the same tasks using the same well-known data sets (Rolnick, Veit, Belongie, & Shavit, 2017). On the other hand, it has led to the emerging relevance of weakly and unsupervised approaches that aim at reducing the need for annotations (Schlegl, Seeböck, Waldstein, Schmidt-Erfurth, & Langs, 2017; Seeböck et al., 2018).
Several approaches to probe and interpret deep neural networks exist (Kendall &
Gal, 2017). Uncertainty provides a measure of how small perturbations of training
data would change model parameters, the so-called model uncertainty or epistemic
uncertainty, or how input parameter changes would affect the prediction for one
particular example, the predictive uncertainty, or aleatoric variability (Gal,
2016). In a Bayesian Deep Learning approach, Pawlowski, Brock, Lee, Rajchl, and
Glocker (2017) approximate model parameters through variational methods, resulting
in uncertainty information of model weights, and a means to derive predictive
uncertainty from the model outputs. Providing uncertainty facilitates the appropriate use of model predictions in scenarios where different sources of information are combined, as is typically the case in medicine. We can further differentiate aleatoric uncertainty into homoscedastic uncertainty, which is independent of a particular input, and heteroscedastic uncertainty, which may change with different inputs to the system.
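As an illustration of how predictive uncertainty can be obtained in practice, the following sketch uses Monte Carlo dropout in the spirit of Gal (2016), not the variational approximation of Pawlowski et al.: dropout remains active at test time and the spread over repeated stochastic forward passes serves as an uncertainty estimate. The architecture, dropout rate and number of passes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative small classifier with dropout layers.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(64, 2),
)

def mc_dropout_predict(model, x, n_passes=50):
    """Mean prediction and per-class standard deviation over stochastic passes."""
    model.train()  # keep dropout active at inference time (MC dropout)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_passes)]
        )
    return probs.mean(dim=0), probs.std(dim=0)

x = torch.randn(1, 20)           # one example with 20 input features
mean, std = mc_dropout_predict(model, x)
print(mean, std)                 # a high std indicates high predictive uncertainty
```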
Methods for attribution seek to link a particular output of the deep neural network
to input variables. Sundararajan, Taly, and Yan (2017) analyze the gradients of the
output when changing individual input variables. In a sense this traces the
prediction uncertainty back to the components of a multivariate input. Zhou,
Khosla, Lapedriza, Oliva, and Torralba (2016) use activation maps to identify parts
of images relevant for a network prediction. Recently, attribution approaches for generative models have been introduced. Baumgartner, Koch, Tezcan, Ang, and Konukoglu (2017) demonstrate how image areas that are specific to the foreground class in Wasserstein Generative Adversarial Networks (WGAN) can be identified and highlighted in the data. Biffi et al. (2018) learn interpretable features for variational autoencoders (VAE) by learning gradients in the latent embedding space that are linked to the classification result.
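A minimal gradient-based attribution sketch follows; it implements plain input saliency rather than the integrated gradients of Sundararajan et al. or the activation maps of Zhou et al.: the gradient of a class score with respect to the input indicates the input variables to which the prediction is most sensitive. The model and input dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative classifier; in practice this would be a trained network.
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 3))
model.eval()

def saliency(model, x, target_class):
    """Gradient of the target-class score with respect to the input features."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    # The absolute gradient (optionally gradient * input) serves as a
    # per-feature relevance map linking the output back to input variables.
    return x.grad.abs().squeeze(0)

x = torch.randn(1, 20)
relevance = saliency(model, x, target_class=1)
print(relevance)
```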
Activation maximization (Montavon et al., 2017) identifies input patterns that lead to maximal activations relating to specific classes in the output layer (Berkes & Wiskott, 2006; Simonyan & Zisserman, 2014). This makes the visualization of prototypes of classes possible, and assesses which properties the model captures for classes (Erhan, Bengio, Courville, & Vincent, 2009). For a neural network classifier mapping data points x to a set of classes (ω_c)_c, the approach identifies highly probable regions in the input space that create high output probabilities for a particular class. These positions can be found by introducing a data density model in the standard objective function log p(ω_c | x) − λ‖x‖², which is maximized with respect to the input. Instead of the ℓ2-norm regularizer, which implements a preference for inputs that are close to the origin, the density model or “expert” (Montavon et al., 2017) results in the term log p(ω_c | x) + log p(x) that is to be maximized. Here, the prototype is encouraged to simultaneously produce a strong class response and to resemble the data. By application of Bayes' rule, the newly defined objective can be identified, up to modeling errors and a constant term, as the class-conditioned data density p(x | ω_c). The learned prototype thus corresponds to the most likely input x for the class ω_c (Figure 2).
The selection of the so-called expert p(x) plays an important role. Basically, there are four different cases, of which the two extremes are the following: if “the expert” is absent, the optimization problem reduces to the maximization of the class probability function p(ω_c | x); at the other extreme, if the expert is overfitted on some data distribution, the optimization problem becomes essentially the maximization of the expert p(x) itself.
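The following sketch illustrates activation maximization for the first extreme case above, that is, without a data-density expert: gradient ascent on the input maximizes log p(ω_c | x) − λ‖x‖², so the resulting prototype is only regularized towards the origin rather than towards the data distribution. The model, class index and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative classifier; in practice this would be a trained network.
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 3))
model.eval()

def activation_maximization(model, target_class, steps=200, lr=0.1, lam=0.01):
    """Find an input prototype maximizing log p(w_c | x) - lambda * ||x||^2."""
    x = torch.zeros(1, 20, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        log_probs = torch.log_softmax(model(x), dim=-1)
        # Negative objective because the optimizer minimizes.
        loss = -log_probs[0, target_class] + lam * (x ** 2).sum()
        loss.backward()
        optimizer.step()
    return x.detach()

prototype = activation_maximization(model, target_class=2)
print(prototype)
```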
–liver cell (hepatocyte) plates regular-one cell layer thick/several cell layers
thick
–canalicular bilirubinostasis
–fibrosis: portal/perisinusoidal/pericellular/perivenular/septal/porto-portal/
porto-central/centro-central/meshed wire fibrosis/incomplete cirrhosis/cirrhosis.
For a specific case, the values of all the above features contribute to the diagnosis with different weights and causal relations present in the human model of liver pathology, which an expert has acquired through training and experience.
4 FUTURE OUTLOOK
4.1 Weakly supervised learning
Supervised learning is very expensive in the medical domain because it is
cumbersome to get strong supervision information and fully ground-truth labels.
Particularly, labeling a histopathological image is not only time-consuming but
also a critical task for cancer diagnosis, as it is clinically important to segment
the cancer tissues and cluster them into various classes (Xu, Zhu, Chang, Lai, &
Tu, 2014). Digital pathological images generally have some issues to be considered, including the very large image size (and the problems this entails for DL), insufficiently labeled images (little available training data), the time required from the pathologist (expensive labeling), insufficient labels (region of interest), different levels of magnification (resulting in different levels of information), color variation and artifacts (tissue sliced and placed on glass slides), etc. (Komura & Ishikawa, 2018).
Weakly supervised learning (Xu et al., 2014) is an umbrella term for a variety of methods to construct predictive models by learning with weak supervision; weak because of either incomplete, inexact or inaccurate supervision. In a strong supervision task we want to learn f : X → Y from the training data set D = {(x_1, y_1), …, (x_m, y_m)}, where X is the feature space and the (x_i, y_i) are always assumed to be independent and identically distributed (which is not the case in real-world problems!).
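As a hedged illustration of incomplete supervision, one of the three forms of weak supervision mentioned above, the following sketch trains a classifier on a small labeled subset and then pseudo-labels the confident predictions on the unlabeled remainder before retraining. The synthetic data, classifier and confidence threshold are illustrative assumptions, not the method of Xu et al. (2014).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data: only about 5% of the examples carry a label (incomplete supervision).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
rng = np.random.default_rng(0)
labeled = rng.random(len(y)) < 0.05

clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])

# Pseudo-label the unlabeled examples the model is confident about, then retrain.
probs = clf.predict_proba(X[~labeled])
confident = probs.max(axis=1) > 0.9
X_aug = np.vstack([X[labeled], X[~labeled][confident]])
y_aug = np.concatenate([y[labeled], probs.argmax(axis=1)[confident]])
clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

# Evaluation against all labels, only possible here because the data are synthetic.
print(clf.score(X, y))
```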
Level 1: Association P(y | x), with the typical activity of “seeing” and questions including “How would seeing X change my belief in Y?”; in our use case above this was the question “what does a feature in a histology slide tell the pathologist about a disease?”
Level 2: Intervention P(y | do(x), z), with the typical activity of “doing” and questions including “What if I do X?”; in our use case above this was the question “what if the medical professional recommends treatment X—will the patient be cured?”
Level 3: Counterfactuals P(y_x | x′, y′), with the typical activity of “retrospection” and questions including “Was Y the cause for X?”; in our use case above this was the question “was it the treatment that cured the patient?”
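To make the difference between Level 1 and Level 2 concrete, the following toy simulation of a structural causal model mirrors Pearl's medicine example quoted in the conclusion: a confounder (wealth) raises both the probability of taking the medicine and the probability of recovering, so the observational association P(recovery | medicine) overstates the interventional effect under do(medicine). All structural parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Structural causal model (all parameters invented for illustration):
wealth = rng.random(n) < 0.5                            # confounder
medicine = rng.random(n) < np.where(wealth, 0.8, 0.2)   # wealthy patients take it more often
p_recover = 0.3 + 0.2 * medicine + 0.4 * wealth         # medicine helps, wealth helps more
recover = rng.random(n) < p_recover

# Level 1 (association): P(recovery | medicine) - P(recovery | no medicine)
assoc = recover[medicine].mean() - recover[~medicine].mean()

# Level 2 (intervention): force do(medicine) for everyone, keeping wealth as it is.
recover_do1 = rng.random(n) < (0.3 + 0.2 * 1 + 0.4 * wealth)
recover_do0 = rng.random(n) < (0.3 + 0.2 * 0 + 0.4 * wealth)
causal = recover_do1.mean() - recover_do0.mean()

print(f"associational difference: {assoc:.2f}")    # inflated by the confounder (about 0.44)
print(f"interventional difference: {causal:.2f}")  # the true causal effect (about 0.20)
```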
For each of these levels we have to develop methods to measure effectiveness (does
an explanation describe a statement with an adequate level of detail), efficiency
(is this done with a minimum of time and effort) and user satisfaction (how
satisfactory was the explanation for the decision-making process). Again we should mention that there are three types of explanations: (1) a peer-to-peer explanation as it is carried out among physicians during medical reporting; (2) an educational explanation as it is carried out between teachers and students; (3) a scientific explanation in the strict sense of science theory (Popper, 1935). We emphasize that in this article we always refer to the first type of explanation.
5 CONCLUSION
AI is already one of the key technologies in our economy. It will bring changes
similar to the introduction of the steam engine or electricity. However, concerns
about potential loss of control in the Human-AI relationship are growing. Issues such as autonomous driving and the unclear decision making of the vehicle, for example, in extreme cases shortly before a collision, have long been the subject of public debate. The same goes for the question of the extent to which AI can or should support medical decisions or even make them itself. In many cases it will be necessary to understand how a machine decision was made and to assess the quality of the explanation.
Today, DL algorithms are very useful in our daily lives: autonomous driving, face
recognition, speech understanding, recommendation systems, etc. already work very
well. However, it is very difficult for people to understand how these algorithms
come to a decision. Ultimately, these are so-called “black box” models. The problem
is that even if we understand the underlying mathematical principles and theories,
such models lack an explicit declarative representation of knowledge. Early AI solutions (at that time called expert systems) had, from the beginning, the goal of making solutions comprehensible, understandable and thus explainable, which was indeed possible within very narrowly defined problem domains. Of course, we should mention that many problems may not need explanations for everything at all times.
Here, the area of explainable AI is not only useful and necessary, but also represents a huge opportunity for AI solutions in general. The often-criticized opacity of AI can thus be reduced and the necessary trust built up. Precisely this can lastingly promote acceptance among future users.
The main problem with the most successful current ML systems, recently emphasized by Pearl (2018), is that they work in a statistical, or model-free, mode, which entails severe limitations on their performance. Such systems are not able to understand the context and hence cannot reason about interventions and retrospection. However, such approaches need the guidance of a human model similar to the ones used in causality research (Pearl, 2009; Pearl & Mackenzie, 2018) to answer the question “Why?”. The establishment of causability as a solid scientific field can help here.
“Data can tell you that the people who took a medicine recovered faster than those who did not take it, but they can't tell you why. Maybe those who took the medicine did so because they could afford it and would have recovered just as fast without it.”
Judea Pearl (2018), The Book of Why: The New Science of Cause and Effect