Navigating Towards Fairness with Data Selection

Yixuan Zhang; Zhidong Li; Yang Wang; Fang Chen; Xuhui Fan; Feng Zhou

arXiv:2412.11072·cs.LG·December 17, 2024

TL;DR

This paper presents a flexible data selection method that uses a zero-shot predictor to mitigate label bias in machine learning, improving fairness without modifying model architecture or requiring extra holdout data.

Contribution

Introduces a novel, modality-agnostic data selection approach using peer predictions and zero-shot predictors to address label bias and fairness in large-scale datasets.

Findings

01. Effective in reducing label bias across diverse datasets

02. Eliminates the need for additional holdout sets

03. Maintains model architecture while improving fairness

Abstract

Machine learning algorithms often struggle to eliminate inherent data biases, particularly those arising from unreliable labels, which poses a significant challenge in ensuring fairness. Existing fairness techniques that address label bias typically involve modifying models and intervening in the training process, but these lack flexibility for large-scale datasets. To address this limitation, we introduce a data selection method designed to efficiently and flexibly mitigate label bias, tailored to more practical needs. Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set. This strategy, supported by peer predictions, ensures the fairness of the proxy model and eliminates the need for an additional holdout set, which is a common requirement in previous methods. Without altering the classifier's architecture, our modality-agnostic…
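To make the idea of proxy-based data selection concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not state the scoring rule, and the peer-prediction component is not modeled here. The sketch assumes a hypothetical zero-shot predictor has already produced class probabilities for every training sample, and it simply keeps the samples whose observed (possibly biased) label the proxy finds most plausible. The function name `select_by_proxy` and the parameter `keep_fraction` are illustrative assumptions.

```python
def select_by_proxy(proxy_probs, labels, keep_fraction=0.5):
    """Keep the samples whose observed label best agrees with a proxy model.

    proxy_probs[i][c]: probability a (hypothetical) zero-shot proxy assigns
                       to class c for sample i.
    labels[i]:         observed label for sample i, which may be biased.
    Returns the sorted indices of the selected samples.

    Assumption: agreement with the proxy is used as the selection score.
    The paper's actual criterion may differ.
    """
    n = len(labels)
    # Score each sample by the proxy probability of its observed label.
    scores = [proxy_probs[i][labels[i]] for i in range(n)]
    # Keep at least one sample, otherwise the requested fraction.
    n_keep = max(1, int(round(keep_fraction * n)))
    # Rank from most to least plausible label; ties keep original order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy example: four samples, two classes.
proxy_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 0, 1, 1]
print(select_by_proxy(proxy_probs, labels, keep_fraction=0.5))  # [0, 3]
```

The downstream classifier would then be trained only on the returned indices, which is why a selection step of this kind leaves the classifier's architecture untouched.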

Taxonomy

Topics: Qualitative Comparative Analysis Research