An Algorithmic Framework for Bias Bounties
Ira Globus-Harris, Michael Kearns, Aaron Roth

TL;DR
This paper introduces an algorithmic framework for bias bounties, in which external participants propose subgroup improvements to a trained model. The update algorithm provably converges to either the Bayes optimal model or a state in which participants can find no further improvements, with no tension between subgroup and overall accuracy.
Contribution
It presents a provably convergent algorithmic framework for bias bounties that accepts arbitrary subgroup improvements and incorporates them without trading off overall accuracy against subgroup accuracy, or one subgroup's accuracy against another's.
Findings
The framework provably converges to either the Bayes optimal model or a state in which participants can find no further improvements.
The authors provide an experimental evaluation of the framework.
The authors report findings from a preliminary bias bounty event.
Abstract
We propose and analyze an algorithmic framework for "bias bounties": events in which external participants are invited to propose improvements to a trained model, akin to bug bounty events in software and security. Our framework allows participants to submit arbitrary subgroup improvements, which are then algorithmically incorporated into an updated model. Our algorithm has the property that there is no tension between overall and subgroup accuracies, nor between different subgroup accuracies, and it enjoys provable convergence to either the Bayes optimal model or a state in which no further improvements can be found by the participants. We provide formal analyses of our framework, experimental evaluation, and findings from a preliminary bias bounty event.
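The incorporation step described in the abstract can be illustrated with a minimal sketch. This is a simplified illustration, not the paper's algorithm: it assumes a submission is a pair consisting of a subgroup indicator and a candidate predictor, that a submission is accepted only when the candidate has lower error than the current model on that subgroup over held-out data, and that the updated model defers to the candidate inside the subgroup. The names `group_error`, `try_update`, and `min_gain` are hypothetical. Any further machinery the paper needs to keep earlier subgroup improvements from being undone by later ones is not shown.

```python
from typing import Callable, List, Optional, Tuple

Example = Tuple[float, ...]
Predictor = Callable[[Example], int]
Group = Callable[[Example], bool]


def group_error(model: Predictor, group: Group,
                xs: List[Example], ys: List[int]) -> Optional[float]:
    """Misclassification rate of `model` on the examples that `group` selects."""
    members = [(x, y) for x, y in zip(xs, ys) if group(x)]
    if not members:
        return None  # the group is empty on this data set
    return sum(model(x) != y for x, y in members) / len(members)


def try_update(model: Predictor, group: Group, candidate: Predictor,
               xs: List[Example], ys: List[int],
               min_gain: float = 0.0) -> Tuple[Predictor, bool]:
    """Accept (group, candidate) only if it beats `model` on the group.

    Assumption: acceptance is judged on held-out data (xs, ys), and
    `min_gain` is a hypothetical threshold on the required improvement.
    """
    old = group_error(model, group, xs, ys)
    new = group_error(candidate, group, xs, ys)
    if old is None or new is None or old - new <= min_gain:
        return model, False  # rejected: no measurable improvement

    def updated(x: Example) -> int:
        # Defer to the candidate inside the group; keep the old model elsewhere.
        return candidate(x) if group(x) else model(x)

    return updated, True
```

In this sketch, predictions outside the subgroup are unchanged and error inside it strictly drops, so an accepted update lowers the overall error on the evaluation data as well. That is one way to see why subgroup and overall accuracy need not conflict.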
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software Testing and Debugging Techniques
