The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

Balagopalan, Aparna; Zhang, Haoran; Hamidieh, Kimia; Hartvigsen, Thomas; Rudzicz, Frank; Ghassemi, Marzyeh

doi:10.1145/3531146.3533179

arXiv:2205.03295 (cs)

[Submitted on 6 May 2022 (v1), last revised 2 Jun 2022 (this version, v2)]

Abstract: Machine learning models in safety-critical settings like healthcare are often blackboxes: they contain a large number of parameters which are not transparent to users. Post-hoc explainability methods where a simple, human-interpretable model imitates the behavior of these blackbox models are often proposed to help users trust model predictions. In this work, we audit the quality of such explanations for different protected subgroups using real data from four settings in finance, healthcare, college admissions, and the US justice system. Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups. We also demonstrate that pairing explainability methods with recent advances in robust machine learning can improve explanation fairness in some settings. However, we highlight the importance of communicating details of non-zero fidelity gaps to users, since a single solution might not exist across all settings. Finally, we discuss the implications of unfair explanation models as a challenging and understudied problem facing the machine learning community.
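
To make the notion of a fidelity gap concrete, the following is a minimal sketch, not the authors' implementation. It assumes fidelity is measured as the fraction of points on which the explanation (surrogate) model's prediction matches the blackbox prediction, and that the gap is the difference between the best- and worst-served subgroup; the paper's exact metrics may differ. The function names `fidelity_by_group` and `max_fidelity_gap`, and the toy data, are hypothetical.

```python
from collections import defaultdict


def fidelity_by_group(blackbox_preds, surrogate_preds, groups):
    """Per-subgroup agreement rate between surrogate and blackbox predictions.

    Assumption: fidelity = fraction of points where the two predictions match.
    """
    agree = defaultdict(int)
    total = defaultdict(int)
    for b, s, g in zip(blackbox_preds, surrogate_preds, groups):
        total[g] += 1
        agree[g] += int(b == s)
    return {g: agree[g] / total[g] for g in total}


def max_fidelity_gap(fidelity):
    """Difference between the highest and lowest subgroup fidelity."""
    values = list(fidelity.values())
    return max(values) - min(values)


# Toy data (illustrative only, not from the paper).
blackbox = [1, 0, 1, 1, 0, 1, 0, 0]
surrogate = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

fid = fidelity_by_group(blackbox, surrogate, groups)
gap = max_fidelity_gap(fid)
print(fid, gap)
```

In this toy case the surrogate reproduces every blackbox prediction for subgroup `a` but only half of them for subgroup `b`, so the explanation is less faithful for `b` even though its overall fidelity looks acceptable.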

Comments:	Published in FAccT 2022
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Cite as:	arXiv:2205.03295 [cs.LG]
	(or arXiv:2205.03295v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.03295
Related DOI:	https://doi.org/10.1145/3531146.3533179

Submission history

From: Aparna Balagopalan
[v1] Fri, 6 May 2022 15:23:32 UTC (2,388 KB)
[v2] Thu, 2 Jun 2022 17:01:15 UTC (1,525 KB)
