The Effects of Mixed Sample Data Augmentation are Class Dependent

Haeil Lee; Hansang Lee; Junmo Kim

arXiv:2307.09136·cs.CV·March 28, 2024

Open Access

TL;DR

This paper investigates how Mixed Sample Data Augmentation techniques affect different classes unequally, identifies causes of class dependency, and proposes a training method to mitigate negative impacts and improve overall accuracy.

Contribution

The paper reveals the class-dependent effects of MSDA and introduces a training algorithm that mixes MSDA and non-MSDA data to mitigate them.

Findings

01. MSDA causes class-dependent performance variations

02. Training with mixed MSDA and non-MSDA data mitigates negative effects

03. Overall accuracy is improved by the proposed method
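
The first finding can be checked by comparing per-class accuracy between a model trained without MSDA and one trained with it. The following is a minimal sketch under that assumption, not the authors' evaluation code; the function names `per_class_accuracy` and `class_dependent_change` are hypothetical, and the predictions are expected to come from two separately trained models.

```python
import numpy as np


def per_class_accuracy(y_true, y_pred, num_classes):
    """Return an array whose entry c is the accuracy on samples of class c.

    Classes with no samples in y_true are reported as NaN.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = np.mean(y_pred[mask] == c)
    return acc


def class_dependent_change(y_true, pred_baseline, pred_msda, num_classes):
    """Per-class accuracy of the MSDA model minus that of the baseline.

    Positive entries are classes that benefit from MSDA; negative entries
    are classes that are hurt by it.
    """
    return (per_class_accuracy(y_true, pred_msda, num_classes)
            - per_class_accuracy(y_true, pred_baseline, num_classes))
```

A vector of changes with mixed signs would indicate the class-dependent behavior the paper describes, even when the average accuracy goes up.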

Abstract

Mixed Sample Data Augmentation (MSDA) techniques, such as Mixup, CutMix, and PuzzleMix, have been widely acknowledged for enhancing performance in a variety of tasks. A previous study reported the class dependency of traditional data augmentation (DA), where certain classes benefit disproportionately compared to others. This paper reveals a class-dependent effect of MSDA, where some classes experience improved performance while others experience degraded performance. This research addresses the issue of class dependency in MSDA and proposes an algorithm to mitigate it. The approach involves training on a mixture of MSDA and non-MSDA data, which not only mitigates the negative impact on the affected classes but also improves overall accuracy. Furthermore, we provide an in-depth analysis and discussion of why MSDA introduces class dependencies and which classes are most likely to be affected.
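
The abstract does not spell out how MSDA and non-MSDA data are combined, so the following is only an illustrative sketch of one simple way to do it: each batch is either passed through Mixup or left unaugmented, chosen by a coin flip. The per-batch coin flip, the parameter `msda_prob`, and the helper names are assumptions made for this example and should not be read as the authors' algorithm.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Convert integer labels to one-hot rows."""
    return np.eye(num_classes, dtype=np.float32)[np.asarray(labels)]


def mixup_batch(x, y_onehot, alpha, rng):
    """Standard Mixup: blend each sample with a randomly permuted partner.

    A single mixing weight lam ~ Beta(alpha, alpha) is drawn per batch and
    applied to both the inputs and the one-hot labels.
    """
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed


def make_training_batch(x, labels, num_classes, msda_prob=0.5, alpha=1.0, rng=None):
    """Return (inputs, soft labels) for one training step.

    With probability msda_prob the batch is mixed with Mixup; otherwise the
    clean batch is returned with plain one-hot labels (assumed schedule).
    """
    rng = np.random.default_rng() if rng is None else rng
    y = one_hot(labels, num_classes)
    if rng.random() < msda_prob:
        return mixup_batch(x, y, alpha, rng)
    return x, y
```

Setting `msda_prob=1.0` recovers pure Mixup training and `msda_prob=0.0` recovers training without MSDA, so the parameter interpolates between the two regimes the abstract contrasts. CutMix or PuzzleMix could be substituted for `mixup_batch` without changing the surrounding logic.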

Taxonomy

Topics: Artificial Intelligence in Healthcare · Advanced Data Compression Techniques · Traffic Prediction and Management Techniques

Methods: Mixup · CutMix