Iterative Orthogonal Feature Projection for Diagnosing Bias in Black-Box Models

Julius Adebayo; Lalana Kagal

arXiv:1611.04967·cs.LG·November 16, 2016·47 cites

TL;DR

This paper introduces an iterative orthogonal projection method to interpret black-box models, enabling quantification of input attribute importance to assess potential bias and discrimination.

Contribution

The paper proposes a novel iterative orthogonal projection technique for interpreting black-box models and quantifying input dependence to evaluate fairness.

Findings

01 Quantifies the relative dependence of a black-box model on each of its input attributes

02 Helps identify potential bias in models

03 Provides a tool for assessing model fairness

Abstract

Predictive models are increasingly deployed for the purpose of determining access to services such as credit, insurance, and employment. Despite potential gains in productivity and efficiency, several potential problems have yet to be addressed, particularly the potential for unintentional discrimination. We present an iterative procedure, based on orthogonal projection of input attributes, for enabling interpretability of black-box predictive models. Through our iterative procedure, one can quantify the relative dependence of a black-box model on its input attributes. The relative significance of the inputs to a predictive model can then be used to assess the fairness (or discriminatory extent) of such a model.
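
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `remove_linear_dependence` and `dependence_score`, the mean-absolute-change score, and the choice to zero out the attribute under test are all assumptions made for illustration. The sketch also assumes the columns of `X` are centered and that `predict` is any callable wrapping the black-box model.

```python
import numpy as np


def remove_linear_dependence(X, j):
    """Return a copy of X in which every column except j has had its
    component along column j removed (an orthogonal projection).
    Illustrative helper; assumes the columns of X are centered."""
    X = np.asarray(X, dtype=float)
    x_j = X[:, j]
    norm_sq = x_j @ x_j
    X_orth = X.copy()
    if norm_sq == 0.0:
        return X_orth  # column j carries no information to remove
    for k in range(X.shape[1]):
        if k != j:
            # x_k minus its projection onto x_j
            X_orth[:, k] = X[:, k] - (X[:, k] @ x_j) / norm_sq * x_j
    return X_orth


def dependence_score(predict, X, j):
    """Mean absolute change in predictions once attribute j and its
    linear traces in the other attributes are removed.
    The scoring rule is an assumption, not taken from the paper."""
    X = np.asarray(X, dtype=float)
    X_orth = remove_linear_dependence(X, j)
    X_orth[:, j] = 0.0  # drop the attribute itself (0 is the mean if centered)
    return float(np.mean(np.abs(predict(X) - predict(X_orth))))


def rank_attributes(predict, X):
    """Repeat the projection for each attribute in turn and sort the
    attributes from most to least influential under this score."""
    scores = [dependence_score(predict, X, j) for j in range(np.shape(X)[1])]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
```

A hypothetical use would be to pass a trained model's prediction function and a centered feature matrix to `rank_attributes`, then check where a protected attribute falls in the ranking. Because the projection only removes linear dependence, nonlinear proxies for an attribute would not be captured by this sketch.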

Code & Models

Repositories

yevgeni-integrate-ai/vfae

Taxonomy

Topics: Imbalanced Data Classification Techniques · Ethics and Social Impacts of AI · Law, Economics, and Judicial Systems

Methods: Interpretability