Interpretable Machine Learning - Definitions, Methods, and Applications
Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related and what common concepts can be used to evaluate them.
Fig. 2. Impact of interpretability methods on descriptive and predictive accuracies. Model-based interpretability (Sec 4) involves using a simpler model to fit the data, which can negatively affect predictive accuracy but yields higher descriptive accuracy. Post hoc interpretability (Sec 5) involves using methods to extract information from a trained model (with no effect on predictive accuracy). These correspond to the model and post hoc stages in Fig 1. Post hoc and model-based methods aim to increase descriptive accuracy, but only model-based affects the predictive accuracy. Not shown is relevancy, which determines what type of output is helpful for a particular problem and audience.
4. Model-based interpretability

We now discuss how interpretability considerations come into play in the modeling stage of the data science life cycle (see Fig 1). At this stage, the practitioner constructs an ML model from the collected data. We define model-based interpretability as the construction of models that readily provide insight into the relationships they have learned. Different model-based interpretability methods provide different ways of increasing descriptive accuracy by constructing models which are easier to understand, sometimes resulting in lower predictive accuracy. The main challenge of model-based interpretability is to come up with models that are simple enough to be easily understood by the audience, yet sophisticated enough to properly fit the underlying data.

In selecting a model to solve a domain problem, the practitioner must consider the entirety of the PDR framework. The first desideratum to consider is predictive accuracy. If the constructed model does not accurately represent the underlying problem, any subsequent analysis will be suspect (28, 29). Second, the main purpose of model-based interpretation methods is to increase descriptive accuracy. Finally, the relevancy of a model's output must be considered, and is determined by the context of the problem, data, and audience. We now discuss some widely useful types of model-based interpretability methods.
A. Sparsity. When the practitioner believes that the underlying relationship in question is based upon a sparse set of signals, they can impose sparsity on their model by limiting the number of non-zero parameters. In this section, we focus on linear models, but sparsity can be helpful more generally. When the number of non-zero parameters is sufficiently small, a practitioner can interpret the variables corresponding to those parameters as being meaningfully related to the outcome. Incorporating prior information in the form of sparsity into a sparse problem can help a model achieve higher predictive accuracy and yield more relevant insights. Note that incorporating sparsity can often be quite difficult, as it requires understanding the data-specific structure of the sparsity and how it can be modelled.

Methods for obtaining sparsity often utilize a penalty on a loss function, such as LASSO (31) and sparse coding (32), or on a model selection criterion such as AIC or BIC (33, 34). Many search-based methods have been developed to find sparse solutions. These methods search through the space of non-zero coefficients using classical subset-selection methods (e.g. orthogonal matching pursuit (35)). Model sparsity is often useful for high-dimensional problems, where the goal is to identify key features for further analysis. As a result, sparsity penalties have been incorporated into complex models such as random forests to identify a sparse subset of important features (36).
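To make these two routes to sparsity concrete, the sketch below fits a LASSO model (a penalty on the loss) and an orthogonal matching pursuit model (a subset search) to simulated data with a sparse ground truth and reads off the selected variables. It is an illustrative sketch in scikit-learn, not code from any work cited here; the simulated data, the penalty strength alpha, and the n_nonzero_coefs setting are assumptions made for the example.

```python
# Minimal sketch: recovering a sparse linear signal with a LASSO penalty
# (penalty on the loss) and orthogonal matching pursuit (subset search).
# The simulated data, alpha, and n_nonzero_coefs below are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso, OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n_samples, n_features, n_true = 100, 200, 5

X = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[:n_true] = rng.uniform(1.0, 3.0, size=n_true)  # sparse ground truth
y = X @ true_coef + 0.1 * rng.normal(size=n_samples)

# Penalty-based sparsity: LASSO shrinks most coefficients exactly to zero.
lasso = Lasso(alpha=0.1).fit(X, y)

# Search-based sparsity: OMP greedily selects a fixed number of coefficients.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_true).fit(X, y)

for name, model in [("lasso", lasso), ("omp", omp)]:
    selected = np.flatnonzero(model.coef_)
    print(name, "selected features:", selected)
    print(name, "coefficients:", np.round(model.coef_[selected], 2))
```

The selected variables, together with the signs and magnitudes of their coefficients, form the compact summary a practitioner would actually inspect.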
In the following example from genomics, sparsity is used to increase the relevancy of the produced interpretations by reducing the number of potential interactions to a manageable level.

Ex. Identifying interactions among regulatory factors or biomolecules is an important question in genomics. Typical genomic datasets include thousands or even millions of features, many of which are active in specific cellular or developmental contexts. The massive scale of such datasets makes interpretation a considerable challenge. Sparsity penalties are frequently used to make the data manageable for statisticians and their collaborating biologists to discuss and identify promising candidates for further experiments.

For instance, one recent study (23) uses a biclustering approach based on sparse canonical correlation analysis (SCCA) to identify interactions among genomic expression features in Drosophila melanogaster (fruit flies) and Caenorhabditis elegans (roundworms). Sparsity penalties enable key interactions among features to be summarized in heatmaps which contain few enough variables for a human to analyze. Moreover, this study performs stability analysis on their model, finding it to be robust to different initializations and perturbations to hyperparameters.
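The stability check described above can be illustrated with a much simpler model than SCCA. The sketch below refits a sparse linear model under bootstrap resampling and perturbed penalty values and reports how often each feature is selected; it mirrors only the general idea of the study's stability analysis, and the resampling scheme, penalty grid, and 0.9 threshold are assumptions.

```python
# Minimal sketch of a stability check in the spirit described above: refit a
# sparse model under data resampling and perturbed penalties, then count how
# often each feature is selected. The cited study applies an analogous check
# to its SCCA model; here a LASSO stands in for illustration.
import numpy as np
from sklearn.linear_model import Lasso

def selection_frequencies(X, y, alphas=(0.05, 0.1, 0.2), n_resamples=50, seed=0):
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    counts = np.zeros(n_features)
    n_fits = 0
    for alpha in alphas:                      # perturb the hyperparameter
        for _ in range(n_resamples):          # perturb the data (bootstrap)
            idx = rng.integers(0, n_samples, size=n_samples)
            coef = Lasso(alpha=alpha, max_iter=5000).fit(X[idx], y[idx]).coef_
            counts += coef != 0
            n_fits += 1
    return counts / n_fits  # fraction of fits in which each feature appears

# Features selected in nearly every perturbed fit are the stable candidates
# worth discussing with domain collaborators, e.g.:
# freqs = selection_frequencies(X, y)
# print(np.flatnonzero(freqs > 0.9))
```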
B. Simulatability. A model is said to be simulatable if a human (for whom the interpretation is intended) is able to internally simulate and reason about its entire decision-making process (i.e. how a trained model produces an output for an arbitrary input). This is a very strong constraint to place on a model, and can generally only be done when the number of features is low, and the underlying relationship is simple. Decision trees (37) are often cited as a simulatable model, due to their hierarchical decision-making process. Another example is lists of rules.
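As a small illustration of simulatability, the sketch below fits a depth-limited decision tree and prints its complete set of decision rules, which a reader can trace by hand for any input. The dataset and the max_depth=2 limit are illustrative choices, not taken from any work cited here.

```python
# Minimal sketch: a shallow decision tree is simulatable because its entire
# decision process can be printed and followed by hand.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Print the full set of if/else rules; a reader can trace any input through them.
print(export_text(tree, feature_names=list(iris.feature_names)))
```

Once a tree grows beyond a handful of splits, the printed rules quickly exceed what a person can hold in mind, which is why simulatability is such a strong constraint.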