Understanding and Reducing the Class-Dependent Effects of Data Augmentation with A Two-Player Game Approach

Yunpeng Jiang; Yutong Ban; Paul Weng

arXiv:2407.03146 · cs.CY · July 1, 2025

TL;DR

This paper introduces CLAM, a two-player game approach to mitigate class-dependent effects of data augmentation, ensuring fairer class performance without significantly reducing overall accuracy.

Contribution

The paper formulates classifier training as a non-linear optimization problem that jointly maximizes and balances per-class performance, recasts that problem as an adversarial two-player game, and proposes a novel multiplicative-weights algorithm with a convergence proof.
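To make the multiplicative-weights idea concrete, here is a minimal sketch of a generic Hedge-style update over class weights. It is not the CLAM algorithm from the paper: the function names, the step size `eta`, and the per-class loss values are all assumptions chosen for illustration. It only shows the general mechanism by which an adversarial player can shift weight toward the classes that currently perform worst.

```python
import math


def update_class_weights(weights, class_losses, eta=0.5):
    """One generic multiplicative-weights (Hedge-style) step.

    Each class weight is multiplied by exp(eta * loss), so classes with
    higher loss gain weight, then the weights are renormalized to sum to 1.
    `eta` is a hypothetical step size, not a value taken from the paper.
    """
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, class_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def weighted_loss(weights, class_losses):
    """Class-weighted objective that a classifier player would minimize."""
    return sum(w * loss for w, loss in zip(weights, class_losses))


# Hypothetical per-class losses for a 3-class problem (illustrative only).
class_losses = [0.2, 0.9, 0.4]

# Start from uniform weights and let the adversary update repeatedly.
weights = [1.0 / 3.0] * 3
for _ in range(10):
    weights = update_class_weights(weights, class_losses)

objective = weighted_loss(weights, class_losses)
```

In a full training loop the losses would be re-estimated after each classifier update rather than held fixed as they are here; with fixed losses, the weights simply concentrate on the worst-performing class.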

Findings

01. More balanced class performance across datasets
02. Limited impact on average accuracy
03. General phenomenon beyond data augmentation

Abstract

Data augmentation is widely applied and has shown its benefits in different machine learning tasks. However, as recently observed, it may have an unfair effect in multi-class classification. While data augmentation generally improves the overall performance (and therefore is beneficial for many classes), it can actually be detrimental for other classes, which can be problematic in some application domains. In this paper, to counteract this phenomenon, we propose CLAM, a CLAss-dependent Multiplicative-weights method. To derive it, we first formulate the training of a classifier as a non-linear optimization problem that aims at simultaneously maximizing the individual class performances and balancing them. By rewriting this optimization problem as an adversarial two-player game, we propose a novel multiplicative weight algorithm, for which we prove the convergence. Interestingly, our…

Taxonomy

Topics: Imbalanced Data Classification Techniques