TL;DR
AEQUITAS is an automated approach that discovers discriminatory inputs in machine-learning models to validate and improve their fairness, using probabilistic search strategies and test generation techniques.
Contribution
It introduces three novel probabilistic search strategies for fairness testing and a fully automated method to enhance model fairness using generated test inputs.
Findings
Effectively uncovers fairness violations in six classifiers.
Up to 70% of the inputs generated during testing are discriminatory.
Improves model fairness by up to 94% using generated inputs.
Abstract
Fairness is a critical trait in decision making. As machine-learning models are increasingly used in sensitive application domains (e.g., education and employment) for decision making, it is crucial that the decisions computed by such models are free of unintended bias. But how can we automatically validate the fairness of arbitrary machine-learning models? For a given machine-learning model and a set of sensitive input parameters, our AEQUITAS approach automatically discovers discriminatory inputs that highlight fairness violations. At the core of AEQUITAS are three novel strategies that employ probabilistic search over the input space with the objective of uncovering fairness violations. Our AEQUITAS approach leverages the inherent robustness property of common machine-learning models to design and implement scalable test generation methodologies. An appealing feature of our generated…
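To make the notion of a discriminatory input concrete, here is a minimal sketch. It assumes an input counts as discriminatory when changing only a sensitive attribute changes the model's decision. The search shown is plain uniform random sampling, not one of the three AEQUITAS strategies, which this summary does not spell out. The names `is_discriminatory`, `random_search`, `toy_model`, and `DOMAINS` are hypothetical and exist only for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Input = Tuple[int, ...]
Model = Callable[[Input], int]


def is_discriminatory(model: Model, x: Input, sensitive_idx: int,
                      sensitive_values: Sequence[int]) -> bool:
    """Return True if changing only the sensitive attribute of x
    changes the model's decision (assumed definition)."""
    base = model(x)
    for value in sensitive_values:
        if value == x[sensitive_idx]:
            continue
        variant = x[:sensitive_idx] + (value,) + x[sensitive_idx + 1:]
        if model(variant) != base:
            return True
    return False


def random_search(model: Model, domains: Sequence[Sequence[int]],
                  sensitive_idx: int, trials: int, seed: int = 0) -> List[Input]:
    """Sample inputs uniformly at random and keep the discriminatory ones.
    This is a stand-in for a probabilistic search, not the paper's method."""
    rng = random.Random(seed)
    found = []
    for _ in range(trials):
        x = tuple(rng.choice(list(d)) for d in domains)
        if is_discriminatory(model, x, sensitive_idx, domains[sensitive_idx]):
            found.append(x)
    return found


def toy_model(x: Input) -> int:
    """Hypothetical biased classifier over (age, gender, income):
    the approval threshold on income is stricter when gender == 1."""
    _age, gender, income = x
    threshold = 5 if gender == 0 else 7
    return int(income >= threshold)


# Hypothetical attribute domains: age 0-9, gender 0/1, income 0-9.
DOMAINS = [range(10), (0, 1), range(10)]
```

Calling `random_search(toy_model, DOMAINS, sensitive_idx=1, trials=1000)` returns the sampled inputs whose decision flips with gender. For this toy model those are exactly the inputs with income 5 or 6, which sit between the two thresholds.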
