Fairness in the Eyes of the Data: Certifying Machine-Learning Models
Shahar Segal, Yossi Adi, Benny Pinkas, Carsten Baum, Chaya Ganesh, Joseph Keshet

TL;DR
This paper introduces a privacy-preserving framework for certifying the fairness of machine learning models through interactive testing, applicable to any model architecture and training process, with theoretical guarantees and cryptographic methods.
Contribution
It proposes a novel, privacy-preserving certification framework for assessing model fairness across multiple definitions, regardless of model architecture or training details.
Findings
Framework provides statistical guarantees for fairness certification.
Requires only black-box access to the model while keeping the participants' sensitive data hidden.
Supports both private and public test data scenarios.
Abstract
We present a framework that allows one to certify the fairness degree of a model based on an interactive and privacy-preserving test. The framework verifies any trained model, regardless of its training process and architecture. Thus, it allows us to evaluate any deep learning model on multiple fairness definitions empirically. We tackle two scenarios, where either the test data is privately available only to the tester or is publicly known in advance, even to the model creator. We investigate the soundness of the proposed approach using theoretical analysis and present statistical guarantees for the interactive test. Finally, we provide a cryptographic technique to automate fairness testing and certified inference with only black-box access to the model at hand while hiding the participants' sensitive data.
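To make the idea of a statistical guarantee for black-box fairness testing concrete, here is a minimal sketch. It is not the paper's protocol: it assumes one specific fairness definition (demographic parity between two groups), i.i.d. test samples, and a plain Hoeffding bound, and it leaves out the cryptographic layer entirely. The names `hoeffding_radius`, `certify_demographic_parity`, `epsilon`, and `delta` are hypothetical and chosen for illustration only.

```python
import math
from typing import Callable, Hashable, Sequence, Tuple


def hoeffding_radius(n: int, delta: float) -> float:
    """Half-width t with P(|empirical mean - true mean| > t) <= delta
    for n i.i.d. samples taking values in [0, 1] (Hoeffding's inequality)."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def certify_demographic_parity(
    model: Callable[[Hashable], int],
    samples: Sequence[Tuple[Hashable, int]],
    epsilon: float,
    delta: float,
) -> Tuple[bool, float, float]:
    """Query `model` as a black box on (features, group) pairs, group in {0, 1}.

    Returns (certified, empirical_gap, upper_bound). Under the i.i.d.
    assumption, the true gap in positive-prediction rates is at most
    `upper_bound` with probability at least 1 - delta.
    """
    positives = [0, 0]
    counts = [0, 0]
    for features, group in samples:
        counts[group] += 1
        positives[group] += int(model(features) == 1)
    if min(counts) == 0:
        raise ValueError("each group needs at least one test sample")

    rates = [positives[g] / counts[g] for g in (0, 1)]
    empirical_gap = abs(rates[0] - rates[1])

    # Union bound: give each group's rate estimate a failure budget of delta / 2.
    slack = sum(hoeffding_radius(counts[g], delta / 2.0) for g in (0, 1))
    upper_bound = empirical_gap + slack
    return upper_bound <= epsilon, empirical_gap, upper_bound


# Usage: a toy model that always predicts 1, tested on 100 samples per group.
toy_samples = [(i, i % 2) for i in range(200)]
certified, gap, bound = certify_demographic_parity(
    lambda x: 1, toy_samples, epsilon=0.3, delta=0.05
)
```

The sketch only touches the model through `model(features)`, which mirrors the black-box setting described in the abstract. Tightening `epsilon` or `delta` requires more test samples, since the slack term shrinks at a rate of roughly one over the square root of the per-group sample size.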
