What will it take to generate fairness-preserving explanations?

Jessica Dai; Sohini Upadhyay; Stephen H. Bach; Himabindu Lakkaraju

arXiv:2106.13346 · cs.LG · June 28, 2021 · 6 cites

Open Access

TL;DR

This paper examines whether explanations of black-box models on tabular data preserve the fairness properties of those models, argues that they do not necessarily do so, and proposes future research directions for fairness-aware explanations.

Contribution

The paper suggests that current explanation methods may not preserve the fairness properties of the black-box models they explain, and it proposes directions for developing fairness-preserving explanations.

Findings

01

Explanations can obscure or ignore fairness properties.

02

Current explanation algorithms may give a misleading picture of a black-box model's fairness.

03

Future research should focus on fairness-aware explanation methods.

Abstract

In situations where explanations of black-box models may be useful, the fairness of the black-box is also often a relevant concern. However, the link between the fairness of the black-box model and the behavior of explanations for the black-box is unclear. We focus on explanations applied to tabular datasets, suggesting that explanations do not necessarily preserve the fairness properties of the black-box algorithm. In other words, explanation algorithms can ignore or obscure critical relevant properties, creating incorrect or misleading explanations. More broadly, we propose future research directions for evaluating and generating explanations such that they are informative and relevant from a fairness perspective.
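As a rough illustration of the abstract's central claim, the sketch below builds a toy case in which a simple global surrogate "explanation" hides a fairness gap that the black box has. Everything here is an assumption made for illustration and is not taken from the paper: the synthetic data, the hypothetical `black_box` rule that penalizes group 1, the choice of demographic parity gap as the fairness measure, and the one-feature threshold surrogate standing in for an explanation method.

```python
import random

def demographic_parity_gap(preds, groups):
    """Absolute difference in positive-prediction rate between group 1 and group 0."""
    rates = []
    for g in (0, 1):
        members = [p for p, a in zip(preds, groups) if a == g]
        rates.append(sum(members) / len(members))
    return abs(rates[1] - rates[0])

def black_box(income, group):
    # Hypothetical model: approves on income but applies a penalty to group 1.
    return 1 if income - 0.3 * group > 0.5 else 0

def fit_income_surrogate(incomes, labels):
    """Pick the income threshold (on a 0.01 grid) that best mimics the labels.

    The surrogate never sees the protected attribute, which is the
    illustrative assumption that lets it hide the black box's disparity.
    """
    best_t, best_fidelity = 0.0, -1.0
    for i in range(101):
        t = i / 100
        agree = sum((1 if x > t else 0) == y for x, y in zip(incomes, labels))
        fid = agree / len(labels)
        if fid > best_fidelity:
            best_t, best_fidelity = t, fid
    return best_t, best_fidelity

# Synthetic tabular data: income is independent of group membership.
rng = random.Random(0)
n = 2000
incomes = [rng.random() for _ in range(n)]
groups = [rng.randint(0, 1) for _ in range(n)]

# Black-box predictions and the surrogate fitted to imitate them.
bb_preds = [black_box(x, a) for x, a in zip(incomes, groups)]
threshold, fidelity = fit_income_surrogate(incomes, bb_preds)
surr_preds = [1 if x > threshold else 0 for x in incomes]

bb_gap = demographic_parity_gap(bb_preds, groups)
surr_gap = demographic_parity_gap(surr_preds, groups)

print(f"surrogate fidelity to black box: {fidelity:.3f}")
print(f"black-box demographic parity gap: {bb_gap:.3f}")
print(f"surrogate demographic parity gap: {surr_gap:.3f}")
```

In this toy setup the surrogate should agree with the black box on most rows while showing a much smaller gap between groups, because it only uses income. A reader who judged the model by the surrogate alone would miss the disparity, which is one concrete way an explanation can fail to preserve a fairness property.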

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning