Data Acquisition for Improving Model Fairness using Reinforcement Learning
Jahid Hasan, Romila Pradhan

TL;DR
This paper introduces DataSift, a reinforcement learning-based framework for selectively acquiring data points to enhance fairness in machine learning models, demonstrating significant fairness improvements with minimal data acquisition.
Contribution
The paper presents a novel data valuation framework using multi-armed bandits and influence functions to efficiently acquire data that improves model fairness.
Findings
Significant fairness improvements with minimal data acquisition
Effective data partitioning enhances acquisition efficiency
Influence functions reduce computation in evaluating data batches
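To make the third finding concrete, here is a minimal sketch of how an influence function can score a candidate batch without retraining. It is an illustration under stated assumptions, not the paper's estimator: it assumes an L2-regularized logistic regression model, a soft statistical-parity gap as the fairness metric, and hypothetical helper names (`fit_logreg`, `parity_gap_grad`, `batch_influence`).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, l2=1e-2, iters=25):
    """Newton's method for L2-regularized logistic regression (mean loss)."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) / len(y) + l2 * theta
        H = (X.T * (p * (1 - p))) @ X / len(y) + l2 * np.eye(X.shape[1])
        theta -= np.linalg.solve(H, grad)
    return theta

def parity_gap_grad(X, group, theta):
    """Gradient w.r.t. theta of a soft parity gap:
    mean predicted score of group 1 minus mean predicted score of group 0."""
    p = sigmoid(X @ theta)
    w = p * (1 - p)  # derivative of the sigmoid at each point
    g1 = (X[group == 1] * w[group == 1][:, None]).mean(axis=0)
    g0 = (X[group == 0] * w[group == 0][:, None]).mean(axis=0)
    return g1 - g0

def batch_influence(X, y, group, theta, Xb, yb, l2=1e-2):
    """First-order estimate of the change in the signed parity gap
    if the batch (Xb, yb) were added to the training set."""
    n = len(y)
    p = sigmoid(X @ theta)
    H = (X.T * (p * (1 - p))) @ X / n + l2 * np.eye(X.shape[1])
    # Sum of per-point loss gradients over the candidate batch.
    grad_batch = Xb.T @ (sigmoid(Xb @ theta) - yb)
    # Upweighting each new point by 1/n shifts theta by about -H^{-1} grad / n.
    delta_theta = -np.linalg.solve(H, grad_batch) / n
    return parity_gap_grad(X, group, theta) @ delta_theta
```

The cost per candidate batch is one gradient and one linear solve against a Hessian computed on the current training set, rather than a full retraining run. A batch whose estimate moves the signed gap toward zero would be a candidate for acquisition under these assumptions.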
Abstract
Machine learning systems are increasingly used in critical decision-making domains such as healthcare, finance, and criminal justice. Concerns about their fairness have led to several bias mitigation techniques that emphasize the need for high-quality data to ensure fairer decisions. However, the role of earlier stages of the machine learning pipeline in mitigating model bias has not been well explored. In this paper, we focus on acquiring additional labeled data points for training the downstream machine learning model so as to rapidly improve its fairness. Since not all data points in a data pool are equally beneficial for fairness, we generate an ordering in which data points should be acquired. We present DataSift, a data acquisition framework based on the idea of data valuation that relies on partitioning and multi-armed bandits to determine the most valuable data…
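The combination of partitioning and multi-armed bandits described in the abstract can be sketched as follows. This is a hedged illustration only: it uses the standard UCB1 rule as a stand-in for whatever bandit strategy DataSift actually employs, treats each partition of the data pool as one arm, and assumes a hypothetical `reward_fn` that returns the fairness gain observed after acquiring a batch.

```python
import math

def ucb1_acquire(partitions, reward_fn, rounds, batch_size=1, c=2.0):
    """Choose one partition per round with UCB1 and acquire a batch from it.

    partitions: list of lists of candidate points (a hypothetical pool split)
    reward_fn:  callable(batch) -> float, e.g. fairness gain from the batch
    Returns the acquisition order, per-arm pull counts, and mean rewards.
    """
    pools = [list(p) for p in partitions]  # copy so callers keep their data
    counts = [0] * len(pools)
    means = [0.0] * len(pools)
    acquired = []
    for t in range(1, rounds + 1):
        live = [k for k, pool in enumerate(pools) if pool]
        if not live:
            break  # every partition is exhausted
        untried = [k for k in live if counts[k] == 0]
        if untried:
            arm = untried[0]  # pull each arm once before using the bound
        else:
            arm = max(
                live,
                key=lambda k: means[k] + math.sqrt(c * math.log(t) / counts[k]),
            )
        take = min(batch_size, len(pools[arm]))
        batch = [pools[arm].pop() for _ in range(take)]
        reward = reward_fn(batch)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
        acquired.append((arm, batch))
    return acquired, counts, means
```

In this sketch the order of `acquired` plays the role of the acquisition ordering: partitions whose batches have yielded larger rewards are sampled more often, while the exploration bonus keeps rarely tried partitions from being ignored entirely.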
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Statistical and Computational Modeling
