Discover the Unknown Biased Attribute of an Image Classifier

Zhiheng Li; Chenliang Xu

arXiv:2104.14556 · cs.CV · October 5, 2021

TL;DR

This paper introduces a novel method to automatically discover unknown biased attributes in image classifiers by optimizing hyperplanes in generative model latent spaces, reducing reliance on human conjecture.

Contribution

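The paper proposes a framework that represents each image attribute as a hyperplane in a generative model's latent space and optimizes that hyperplane with a total-variation loss and an orthogonalization penalty, identifying hidden biases in a classifier without prior knowledge of what those biases are.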

Findings

01. Successfully discovers unnoticeable biased attributes in various classifiers.

02. Achieves better disentanglement of target and biased attributes.

03. Demonstrates generalizability across diverse image domains.

Abstract

Recent works find that AI algorithms learn biases from data. Therefore, it is urgent and vital to identify biases in AI algorithms. However, the previous bias identification pipeline overly relies on human experts to conjecture potential biases (e.g., gender), which may neglect other underlying biases not realized by humans. To help human experts better find the AI algorithms' biases, we study a new problem in this work -- for a classifier that predicts a target attribute of the input image, discover its unknown biased attribute. To solve this challenging problem, we use a hyperplane in the generative model's latent space to represent an image attribute; thus, the original problem is transformed to optimizing the hyperplane's normal vector and offset. We propose a novel total-variation loss within this framework as the objective function and a new orthogonalization penalty as a…
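The hyperplane formulation in the abstract can be illustrated with a small sketch. The abstract is truncated here and does not give the exact loss definitions, so everything below is an assumption made for illustration: the helper names (`signed_distance`, `traverse`, `total_variation`, `orthogonality_penalty`), the squared-cosine form of the penalty, and `toy_classifier`, which stands in for a classifier applied to generated images. It is not the authors' implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(w):
    # Normalize a hyperplane normal vector to unit length.
    norm = math.sqrt(dot(w, w))
    return [a / norm for a in w]

def signed_distance(z, w, b):
    # Signed distance from latent code z to the hyperplane w.z + b = 0.
    return (dot(w, z) + b) / math.sqrt(dot(w, w))

def traverse(z, w, b, alphas):
    # Move z along the unit normal so its signed distance becomes each alpha.
    n = unit(w)
    d = signed_distance(z, w, b)
    return [[zi + (alpha - d) * ni for zi, ni in zip(z, n)] for alpha in alphas]

def total_variation(preds):
    # Sum of absolute changes between consecutive predictions (assumed form).
    return sum(abs(q - p) for p, q in zip(preds, preds[1:]))

def orthogonality_penalty(w_a, w_t):
    # Squared cosine between two normals; zero when they are orthogonal
    # (an assumed form of an orthogonalization penalty).
    return dot(unit(w_a), unit(w_t)) ** 2

def toy_classifier(z):
    # Hypothetical stand-in for classifier(generator(z)); depends only on z[0].
    return 1.0 / (1.0 + math.exp(-z[0]))

z = [0.5, -1.0]
alphas = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Candidate hyperplane whose normal aligns with what the toy classifier uses.
w_biased, b_biased = [1.0, 0.0], 0.0
# Candidate hyperplane whose normal the toy classifier ignores.
w_other, b_other = [0.0, 1.0], 0.0

tv_biased = total_variation(
    [toy_classifier(p) for p in traverse(z, w_biased, b_biased, alphas)])
tv_other = total_variation(
    [toy_classifier(p) for p in traverse(z, w_other, b_other, alphas)])
penalty = orthogonality_penalty(w_biased, w_other)

print(tv_biased, tv_other, penalty)
```

In this toy setting, the hyperplane whose traversal changes the classifier's output yields a large total variation, while the one the classifier ignores yields zero. A search that maximizes this quantity over the normal vector and offset would therefore be drawn toward attributes the classifier actually depends on.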

Code & Models

Repositories

zhihengli-UR/discover_unknown_biases
PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning and Data Classification