Revealing Unfair Models by Mining Interpretable Evidence

Mohit Bajaj; Lingyang Chu; Vittorio Romaniello; Gursimran Singh; Jian Pei; Zirui Zhou; Lanjun Wang; Yong Zhang

arXiv:2207.05811 · cs.LG · July 14, 2022 · 1 citation

TL;DR

This paper introduces RUMIE, a method to automatically reveal and explain unfairness in trained machine learning models by mining interpretable evidence in the form of discriminated data groups and key attributes.

Contribution

The paper presents a novel approach to uncover and interpret unfairness in models by identifying discriminated data groups and key decision attributes, improving scalability over existing methods.

Findings

01. Effectively reveals unfairness in real-world datasets

02. Provides highly interpretable evidence of model discrimination

03. Outperforms baseline methods in scalability

Abstract

The popularity of machine learning has increased the risk of unfair models being deployed in high-stakes applications, such as the justice system, drug/vaccination design, and medical diagnosis. Although there are effective methods to train fair models from scratch, automatically revealing and explaining the unfairness of a trained model remains a challenging task. Revealing the unfairness of machine learning models in an interpretable fashion is a critical step towards fair and trustworthy AI. In this paper, we systematically tackle the novel task of revealing unfair models by mining interpretable evidence (RUMIE). The key idea is to find solid evidence in the form of a group of data instances discriminated most by the model. To make the evidence interpretable, we also find a set of human-understandable key attributes and decision rules that characterize the discriminated data instances and…
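
The key idea stated above, finding the data instances a trained model discriminates against most, can be illustrated with a toy sketch. This is not the RUMIE algorithm: it swaps in a simple counterfactual flip of a binary sensitive attribute as the discrimination measure, and `toy_model`, `discrimination_gap`, `most_discriminated`, the attribute names, and the sample rows are all hypothetical, chosen only for illustration.

```python
import math


def toy_model(row):
    """Hypothetical scoring model returning a probability-like score in (0, 1)."""
    z = 0.00005 * row["income"] - 2.0
    # Deliberately biased for the demo: the sensitive attribute lowers the
    # score, but only for low-income rows.
    if row["income"] < 40000:
        z -= 1.5 * row["sensitive"]
    return 1.0 / (1.0 + math.exp(-z))


def discrimination_gap(model, row, sensitive_key="sensitive"):
    """Absolute score change when only the binary sensitive attribute is flipped."""
    flipped = dict(row)
    flipped[sensitive_key] = 1 - row[sensitive_key]
    return abs(model(row) - model(flipped))


def most_discriminated(model, rows, k):
    """Return the k rows with the largest gap, largest first."""
    ranked = sorted(rows, key=lambda r: discrimination_gap(model, r), reverse=True)
    return ranked[:k]


# Made-up data instances (assumption: one binary sensitive attribute).
rows = [
    {"id": 0, "income": 20000, "sensitive": 1},
    {"id": 1, "income": 30000, "sensitive": 0},
    {"id": 2, "income": 60000, "sensitive": 1},
    {"id": 3, "income": 90000, "sensitive": 0},
]

top = most_discriminated(toy_model, rows, k=2)
top_ids = [r["id"] for r in top]
print(top_ids)
```

In this toy setup the two low-income rows rank highest, and "income below 40000" plays the role of a human-readable rule characterizing the discriminated group. How RUMIE itself selects the group, the key attributes, and the decision rules is described in the paper, not here.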

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification