Explaining Models: An Empirical Study of How Explanations Impact Fairness Judgment
Jonathan Dodge, Q. Vera Liao, Yunfeng Zhang, Rachel K. E. Bellamy, and Casey Dugan

TL;DR
This study empirically investigates how different types of explanations influence people's fairness judgments of machine learning systems, highlighting the importance of explanation style, individual differences, and the potential for personalized fairness assessments.
Contribution
It provides an empirical analysis of explanation impacts on fairness perception, revealing how explanation style and individual differences affect fairness judgments in ML systems.
Findings
Certain explanations are perceived as less fair.
Some explanations increase confidence in fairness.
Different explanations reveal different fairness issues.
Abstract
Ensuring fairness of machine learning systems is a human-in-the-loop process. It relies on developers, users, and the general public to identify fairness problems and make improvements. To facilitate the process we need effective, unbiased, and user-friendly explanations that people can confidently rely on. Towards that end, we conducted an empirical study with four types of programmatically generated explanations to understand how they impact people's fairness judgments of ML systems. With an experiment involving more than 160 Mechanical Turk workers, we show that: 1) Certain explanations are considered inherently less fair, while others can enhance people's confidence in the fairness of the algorithm; 2) Different fairness problems--such as model-wide fairness issues versus case-specific fairness discrepancies--may be more effectively exposed through different styles of explanation;…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning
