Latent SHAP: Toward Practical Human-Interpretable Explanations

Bitton, Ron; Malach, Alon; Meiseles, Amiel; Momiyama, Satoru; Araki, Toshinori; Furukawa, Jun; Elovici, Yuval; Shabtai, Asaf

arXiv:2211.14797 (cs)

[Submitted on 27 Nov 2022]

Abstract: Model-agnostic feature attribution algorithms (such as SHAP and LIME) are ubiquitous techniques for explaining the decisions of complex classification models, such as deep neural networks. However, since complex classification models produce superior performance when trained on low-level (or encoded) features, in many cases, the explanations generated by these algorithms are neither interpretable nor usable by humans. Methods proposed in recent studies that support the generation of human-interpretable explanations are impractical, because they require a fully invertible transformation function that maps the model's input features to the human-interpretable features. In this work, we introduce Latent SHAP, a black-box feature attribution framework that provides human-interpretable explanations, without the requirement for a fully invertible transformation function. We demonstrate Latent SHAP's effectiveness using (1) a controlled experiment where invertible transformation functions are available, which enables robust quantitative evaluation of our method, and (2) celebrity attractiveness classification (using the CelebA dataset) where invertible transformation functions are not available, which enables thorough qualitative evaluation of our method.
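
For readers unfamiliar with the attribution setting the abstract refers to, the following is a minimal sketch of exact Shapley value attribution, the quantity that SHAP approximates. It is not the Latent SHAP method, whose details are not given on this page. The names `shapley_values`, `toy_model`, and `baseline` are hypothetical and chosen for illustration; the sketch assumes that "removing" a feature means replacing it with a fixed baseline value, which is one common convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input.

    Assumption: a feature outside the coalition takes its baseline value.
    The cost is exponential in the number of features, so this is only
    practical for very small inputs.
    """
    n = len(x)

    def value(coalition):
        # Features in the coalition come from x, the rest from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical black-box model: one additive term, one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
# phi is approximately [2.0, 3.0, 3.0]; the interaction z[1] * z[2] is split evenly.
```

The attributions sum to `toy_model(x) - toy_model(baseline)`, which is the efficiency property of Shapley values. The attributions here are expressed over the model's own input features; the abstract's concern is the case where those features are low-level or encoded and therefore hard for a person to read.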

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2211.14797 [cs.LG]
	(or arXiv:2211.14797v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.14797

Submission history

From: Ron Bitton
[v1] Sun, 27 Nov 2022 11:33:26 UTC (1,145 KB)
