Feature Attribution from First Principles

Taimeskhanov, Magamed; Garreau, Damien

arXiv:2505.24729 (cs)

[Submitted on 30 May 2025]

Abstract: Feature attribution methods are a popular approach to explain the behavior of machine learning models. They assign an importance score to each input feature, quantifying its influence on the model's prediction. However, evaluating these methods empirically remains a significant challenge. To bypass this shortcoming, several prior works have proposed axiomatic frameworks that any feature attribution method should satisfy. In this work, we argue that such axioms are often too restrictive, and propose in response a new feature attribution framework, built from the ground up. Rather than imposing axioms, we start by defining attributions for the simplest possible models, i.e., indicator functions, and use these as building blocks for more complex models. We then show that one recovers several existing attribution methods, depending on the choice of atomic attribution. Subsequently, we derive closed-form expressions for attribution of deep ReLU networks, and take a step toward the optimization of evaluation metrics with respect to feature attributions.

Comments: 30 pages, 3 figures
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2505.24729 [cs.LG]
(or arXiv:2505.24729v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2505.24729

Submission history

From: Magamed Taimeskhanov
[v1] Fri, 30 May 2025 15:53:11 UTC (41 KB)
