LUCID-GAN: Conditional Generative Models to Locate Unfairness
Andres Algaba, Carmen Mazijn, Carina Prunkl, Jan Danckaert, Vincent Ginis

TL;DR
LUCID-GAN introduces a conditional generative approach to identify and analyze unethical biases in black-box models by producing realistic, canonical inputs, overcoming limitations of gradient-based inverse design methods.
Contribution
It extends the LUCID framework by using a generative model to locate unfairness, applicable to non-differentiable models and capable of assessing complex discrimination.
Findings
Detects biases in black-box models without training data access
Generates realistic canonical inputs for bias analysis
Applicable to non-differentiable models
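As a rough illustration of the idea behind these findings, the sketch below queries a non-differentiable black-box model, fits a simple conditional generative model to the (input, prediction) pairs, and samples inputs conditioned on the preferred output. Everything here is an assumption made for illustration: the rule inside `black_box`, the three unit-scaled features, the choice of feature index 1 as a stand-in "protected" feature, and above all the class-conditional Gaussian, which replaces the conditional GAN of the paper with something small enough to read at a glance.

```python
import numpy as np

def black_box(x):
    # Hypothetical non-differentiable model: a hard decision rule that
    # (unfairly) depends on feature 1, our stand-in protected feature.
    return ((x[:, 0] > 0.5) & (x[:, 1] < 0.4)).astype(int)

def fit_conditional_gaussian(x, y, label):
    # Toy conditional generator: one Gaussian fitted to the inputs that the
    # black box maps to `label`. This is NOT a GAN, only a simple stand-in.
    selected = x[y == label]
    return selected.mean(axis=0), np.cov(selected, rowvar=False)

def sample_canonical(mean, cov, n, rng):
    # Draw inputs conditioned on the preferred output, clipped to [0, 1].
    return np.clip(rng.multivariate_normal(mean, cov, size=n), 0.0, 1.0)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(5000, 3))   # synthetic query inputs
y = black_box(x)                             # only predictions are used, no gradients

mean, cov = fit_conditional_gaussian(x, y, label=1)
canonical = sample_canonical(mean, cov, 1000, rng)

# Compare the protected feature in the canonical set with the query inputs.
gap = canonical[:, 1].mean() - x[:, 1].mean()
print(f"shift in protected feature mean: {gap:.3f}")
```

A clearly negative shift indicates that inputs leading to the preferred output carry lower values of the stand-in protected feature than the query population, which is the kind of signal a canonical set is meant to surface.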
Abstract
Most group fairness notions detect unethical biases by computing statistical parity metrics on a model's output. However, this approach suffers from several shortcomings, such as philosophical disagreement, mutual incompatibility, and lack of interpretability. These shortcomings have spurred the research on complementary bias detection methods that offer additional transparency into the sources of discrimination and are agnostic towards an a priori decision on the definition of fairness and choice of protected features. A recent proposal in this direction is LUCID (Locating Unfairness through Canonical Inverse Design), where canonical sets are generated by performing gradient descent on the input space, revealing a model's desired input given a preferred output. This information about the model's mechanisms, i.e., which feature values are essential to obtain specific outputs, allows…
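The gradient-based inverse design that the abstract attributes to the original LUCID can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the model is a hypothetical logistic regression with hand-picked weights `w` and bias `b`, features live in [0, 1], feature index 1 plays the role of a protected feature, and the loss is a plain squared error to the preferred output.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def canonical_inverse_design(w, b, n_samples=200, steps=300, lr=0.5,
                             target=1.0, seed=0):
    # Start from random inputs and move them by gradient descent on the
    # INPUT space (weights stay fixed) towards the preferred output.
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(n_samples, w.shape[0]))
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        # Gradient of 0.5 * (p - target)^2 with respect to x.
        grad = ((p - target) * p * (1.0 - p))[:, None] * w[None, :]
        x = np.clip(x - lr * grad, 0.0, 1.0)   # keep features in range
    return x

# Hypothetical model weights (assumption for illustration only).
w = np.array([2.0, -3.0, 0.5])
b = 0.0

canon = canonical_inverse_design(w, b)
print("mean feature values in the canonical set:", canon.mean(axis=0))
```

Inspecting the feature distributions of `canon` shows which values the model "wants" for the preferred output. This procedure needs gradients of the model with respect to its inputs, which is the limitation the generative approach is meant to remove.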
Taxonomy
Topics: Social Power and Status Dynamics · Social and Intergroup Psychology · Ethics and Social Impacts of AI
