Data Acquisition for Improving Model Fairness using Reinforcement Learning

Jahid Hasan; Romila Pradhan

arXiv:2412.03009·cs.LG·December 5, 2024



TL;DR

This paper introduces DataSift, a reinforcement learning-based framework for selectively acquiring data points to enhance fairness in machine learning models, demonstrating significant fairness improvements with minimal data acquisition.

Contribution

The paper presents a novel data valuation framework using multi-armed bandits and influence functions to efficiently acquire data that improves model fairness.

Findings

01 Significant fairness improvements with minimal data acquisition

02 Effective data partitioning enhances acquisition efficiency

03 Influence functions reduce computation in evaluating data batches
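
To make "fairness improvement" concrete, here is a minimal sketch of one common group fairness measure, the statistical parity difference. This excerpt does not say which metric the paper optimizes, so the choice of metric, the function name `statistical_parity_difference`, and the toy data are all assumptions made for illustration.

```python
def statistical_parity_difference(predictions, groups, protected=1):
    """Gap in positive-prediction rates between the two groups.

    Illustrative assumption: binary predictions (0/1) and a single
    protected-group label. A value of 0 means both groups receive
    positive predictions at the same rate.
    """
    prot = [p for p, g in zip(predictions, groups) if g == protected]
    rest = [p for p, g in zip(predictions, groups) if g != protected]
    if not prot or not rest:
        raise ValueError("both groups must be non-empty")
    # Rate for the non-protected group minus rate for the protected group.
    return sum(rest) / len(rest) - sum(prot) / len(prot)


# Toy data (hypothetical): group 0 gets positives 3/4 of the time,
# group 1 gets positives 1/4 of the time.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
gap = statistical_parity_difference(preds, groups)
```

Under this reading, a data acquisition strategy "improves fairness" when retraining on the acquired points moves a gap like this one closer to zero.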

Abstract

Machine learning systems are increasingly being used in critical decision-making domains such as healthcare, finance, and criminal justice. Concerns around their fairness have resulted in several bias mitigation techniques that emphasize the need for high-quality data to ensure fairer decisions. However, the role of earlier stages of machine learning pipelines in mitigating model bias has not been well explored. In this paper, we focus on the task of acquiring additional labeled data points for training the downstream machine learning model to rapidly improve its fairness. Since not all data points in a data pool are equally beneficial to the task of fairness, we generate an ordering in which data points should be acquired. We present DataSift, a data acquisition framework based on the idea of data valuation that relies on partitioning and multi-armed bandits to determine the most valuable data…
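
The abstract describes treating data partitions as arms of a multi-armed bandit. The sketch below shows that idea with the standard UCB1 rule. It is not the DataSift algorithm: the choice of UCB1, the names `ucb1_select` and `run_acquisition`, and the simulated reward (a hidden per-partition "fairness gain" plus noise) are assumptions made for illustration. In a real pipeline the reward would come from measuring how much a newly acquired batch changes the model's fairness.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Pick a partition (arm) by the UCB1 rule; try every arm once first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + math.sqrt(2.0 * math.log(t) / n)
              for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_acquisition(true_gain, rounds, seed=0):
    """Simulate acquiring one batch per round from the chosen partition.

    true_gain[i] is a hypothetical average fairness gain for a batch
    drawn from partition i; it is hidden from the bandit.
    """
    rng = random.Random(seed)
    k = len(true_gain)
    counts = [0] * k      # batches acquired from each partition
    means = [0.0] * k     # running mean of the observed gain
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        # Simulated reward: hidden gain plus small Gaussian noise.
        reward = true_gain[arm] + rng.gauss(0.0, 0.05)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Three hypothetical partitions; the third is the most useful for fairness.
counts, means = run_acquisition([0.1, 0.2, 0.8], rounds=300)
```

After a few exploratory pulls, most of the acquisition budget goes to the partition with the highest estimated gain, which is the behavior that makes a bandit attractive when each acquisition is costly.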


Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Statistical and Computational Modeling
